
Personalized federated learning for a better customer experience

Accounting for data heterogeneity across edge devices enables more useful model updates, both locally and globally.

December 5, 2022


Federated learning (FL) is a framework that allows edge devices (e.g., Alexa devices) to collaboratively train a global model while keeping customers’ data on-device. A standard FL system involves a cloud server and multiple clients (devices). Each device has its local data and a local copy of the machine learning (ML) model being served.

In each round of FL training, a cloud server sends the current global model to the clients; the clients train their local models using on-device data and send the models to the cloud; and the server aggregates the local models and updates the global model. FL also has a personalization branch, which aims to customize local models to improve their performance on local data.
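As a rough illustration of the round described above, here is a minimal sketch with a one-parameter linear model. The function names (`local_train`, `fl_round`), the toy data, the hyperparameters, and the unweighted mean used for aggregation are all assumptions made for the example, not details of any production FL system.

```python
import random

def local_train(w, data, lr=0.1, steps=20):
    """Run a few gradient-descent steps on one client's squared-error loss."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fl_round(global_w, clients):
    """One round: broadcast the global model, train locally, then aggregate."""
    local_models = [local_train(global_w, data) for data in clients]
    # Plain unweighted mean: an assumption made for this sketch.
    return sum(local_models) / len(local_models)

random.seed(0)
TRUE_W = 2.0  # hypothetical ground-truth slope shared by all clients
clients = []
for _ in range(5):
    xs = [random.uniform(-1.0, 1.0) for _ in range(50)]
    # Each client keeps its own (x, y) pairs; only model weights are exchanged.
    clients.append([(x, TRUE_W * x + random.gauss(0.0, 0.01)) for x in xs])

global_w = 0.0
for _ in range(10):
    global_w = fl_round(global_w, clients)
print(f"global weight after 10 rounds: {global_w:.3f}")
```

Note that the raw data never leaves the `clients` list entries; the server-side code in `fl_round` only sees the trained weights.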

Illustration of a federated-learning system. Each edge device has its own local model, and it periodically communicates with the cloud server to update a shared global model. A personalization branch aims to customize local models to improve their performance on local data.

In many real-world applications, the local datasets for different clients may have heterogeneous distributions. In a paper we presented at the 36th Conference on Neural Information Processing Systems (NeurIPS), we show that a training procedure that accounts for that heterogeneity improves the efficiency and accuracy of both the local and global models in federated learning.

Accounting for uncertainty

Our intuition was that when training a local model, selecting a proper initial model and an appropriate number of training steps is critical to minimizing the training loss and thus to achieving the desired personalization.

Optimization trajectories of a local model. The blue and red curves are the training curves obtained with two different initial models.

Our method, which we call Self-FL, is rooted in a theoretical analysis using Bayesian hierarchical models, in which the intra-client and inter-client uncertainties define different layers of the hierarchy. From the Bayesian analysis, we derive equations relating these two uncertainty measures to three local configuration factors: (1) the local initial model, which is used as the starting point of local model training; (2) the learning rate, which determines how dramatically a single training example can affect network weights; and (3) the early-stop rule, which determines when the training procedure should stop to prevent overfitting.
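To make the three local configuration factors concrete, the sketch below shows where each one enters an ordinary local training loop. It does not reproduce how Self-FL derives their values from the uncertainty measures. The function `train_local`, the patience-based stopping rule, the `min_delta` threshold, and the toy data are all hypothetical choices for illustration.

```python
def train_local(init_w, train, val, lr=0.1, max_steps=200, patience=5, min_delta=1e-6):
    """Gradient descent on a 1-D model y = w * x with validation-based early stopping."""
    def mse(w, data):
        return sum((w * x - y) ** 2 for x, y in data) / len(data)

    w = init_w                              # (1) local initial model
    best_w, best_val = init_w, mse(init_w, val)
    bad_steps, step = 0, 0
    for step in range(1, max_steps + 1):
        grad = sum(2.0 * (w * x - y) * x for x, y in train) / len(train)
        w -= lr * grad                      # (2) learning rate scales each update
        val_loss = mse(w, val)
        if val_loss < best_val - min_delta:
            best_w, best_val, bad_steps = w, val_loss, 0
        else:
            bad_steps += 1
            if bad_steps >= patience:       # (3) early-stop rule
                break
    return best_w, step

# Toy data with slope 3 (an assumption for the example).
train = [(0.5, 1.5), (1.0, 3.0), (1.5, 4.5), (2.0, 6.0)]
val = [(0.8, 2.4), (1.2, 3.6)]

w_far, steps_far = train_local(0.0, train, val)    # poor initial model
w_near, steps_near = train_local(2.9, train, val)  # good initial model
print(steps_far, steps_near)
```

In this toy setting, the run that starts closer to the optimum stops after fewer steps, which is the kind of interaction between the initial model and the number of training steps that the text describes.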

Adaptive FL aggregation rule

Existing FL algorithms typically update the global model using a weighted sum of local models, where the weight for each local model is proportional to the local dataset size. Our framework instead uses an adaptive aggregation rule to update the global model for better personalization. In particular, we derive the aggregation rule from Bayesian hierarchical modeling, where the global model parameters are considered the “root” of the statistical model.
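For reference, the conventional dataset-size-weighted aggregation mentioned above can be sketched in a few lines. The function name `fedavg` and the example numbers are illustrative assumptions. Models are represented as flat lists of parameters.

```python
def fedavg(local_models, num_examples):
    """Average parameter vectors, weighting each client by its dataset size."""
    total = sum(num_examples)
    dim = len(local_models[0])
    return [
        sum(n * model[i] for model, n in zip(local_models, num_examples)) / total
        for i in range(dim)
    ]

local_models = [[1.0, 0.0], [3.0, 4.0]]  # two clients, two parameters each
num_examples = [100, 300]                # the second client has three times the data
global_model = fedavg(local_models, num_examples)
print(global_model)  # [2.5, 3.0]
```

The weights here depend only on how much data each client holds, not on how reliable or how atypical its model is.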

The idea is, essentially, that the more a local model’s training data deviates from the global averages, the more responsive the model should be to training on that data. Conversely, the more uncertain the optimization of a local model’s parameters appears to be, the less weight that model should be given when updating the global model.
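Both halves of this idea already appear in the textbook two-level Gaussian model, which the sketch below uses as a stand-in. It is not the aggregation rule derived in the paper. It assumes each client's parameter is a single number drawn around a global mean with variance `between_var` (inter-client uncertainty), and that each client's estimate of its own parameter has variance `local_var` (intra-client uncertainty). All names and numbers are assumptions.

```python
def aggregate_global(local_means, local_vars, between_var):
    """Precision-weighted estimate of the global mean in a two-level Gaussian model."""
    weights = [1.0 / (between_var + v) for v in local_vars]
    return sum(w * m for w, m in zip(weights, local_means)) / sum(weights)

def personalize(local_mean, local_var, global_mean, between_var):
    """Posterior mean of one client's parameter: shrink the local estimate toward the global one."""
    trust_local = between_var / (between_var + local_var)
    return trust_local * local_mean + (1.0 - trust_local) * global_mean

local_means = [0.0, 10.0]
local_vars = [1.0, 9.0]  # the second client's estimate is much noisier

# The noisy client gets less say in the global estimate than in a plain average.
global_mean = aggregate_global(local_means, local_vars, between_var=1.0)
plain_mean = sum(local_means) / len(local_means)

# Larger inter-client variance makes the personalized model follow local data more closely.
similar_clients = personalize(10.0, 9.0, global_mean, between_var=1.0)
diverse_clients = personalize(10.0, 9.0, global_mean, between_var=100.0)
print(global_mean, plain_mean, similar_clients, diverse_clients)
```

When clients resemble one another (small `between_var`), a noisy local estimate is pulled strongly toward the global model. When clients differ widely, the personalized model stays close to what the local data says.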

The Self-FL framework for personalized federated learning.

Our method is designed to improve the accuracy of edge devices’ personalized models both directly, by better tailoring them to the types of data they’re likely to see, and indirectly, by making the global models distributed to all clients more accurate. Our empirical results indicate that relative to prior FL schemes, Self-FL improves performance for edge clients. As such, it promises to improve the experience of Amazon customers by making their devices more responsive to their particular needs.

About the Author

Huili Chen

Huili Chen is an applied scientist with Amazon Web Services.
