Quantifying catastrophic forgetting in continual federated learning
The deployment of Federated Learning (FL) systems poses various challenges, such as data heterogeneity and communication efficiency. We focus on a practical FL setup that has recently drawn attention, in which the data distribution on each device is not static but evolves over time. This setup, referred to as Continual Federated Learning (CFL), suffers from catastrophic forgetting, i.e., the undesired forgetting of previous knowledge after learning on new data, an issue not encountered in vanilla FL. In this work, we formally quantify catastrophic forgetting in a CFL setup, establish links to training optimization, and evaluate different episodic replay approaches for CFL on a large-scale real-world NLP dataset. To the best of our knowledge, this is the first such study of episodic replay for CFL. We show that storing a small set of past data boosts performance and significantly reduces forgetting, providing evidence that carefully designed sampling strategies can lead to further improvements.
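To make the episodic replay idea concrete, the following is a minimal sketch of a fixed-capacity replay buffer maintained on each client, using reservoir sampling as one illustrative sampling strategy. This is a hypothetical implementation for exposition only; the class name `ReplayBuffer`, the capacity parameter, and the choice of reservoir sampling are assumptions, not the paper's actual method.

```python
import random


class ReplayBuffer:
    """Fixed-size episodic memory of past examples.

    Reservoir sampling keeps the buffer a uniform random sample of the
    whole example stream, which is one simple way a client could retain
    a small set of past data for replay during later training rounds.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0  # total number of examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Observe one example from the (possibly shifting) data stream."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Keep the new example with probability capacity / seen,
            # replacing a uniformly chosen stored example.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, batch_size):
        """Draw a replay batch to mix with the current task's batch."""
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(self.buffer, k)


# Usage sketch: a client streams 1,000 examples but stores only 16,
# then mixes a small replayed batch into each local training step.
buf = ReplayBuffer(capacity=16, seed=42)
for x in range(1000):
    buf.add(x)
replay_batch = buf.sample(4)
```

In a CFL training loop, each client would interleave such replayed batches with batches from its current data distribution, which is the mechanism by which a small stored set of past data can reduce forgetting.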