# (Draft) Optimizing Federated Learning on Non-IID Data with Reinforcement Learning

###### tags: `Paper`

## Abstract

- counterbalance the bias introduced by non-IID data and speed up convergence
- the connection between the distribution of training data on a device and the model weights trained on those data
- deep Q-learning
- evaluated on MNIST, FashionMNIST, and CIFAR-10

## Introduction

- privacy-sensitive data
- FEDAVG: reduces the number of communication rounds required
- FAVOR: a control framework that aims to improve the performance of federated learning through intelligent device selection
- Double DQN

## Background and Motivation

1. Federated Learning
    - trains a shared global model (see the FedAvg sketch in "Code Sketches" below)
    - slow and unstable network connections
2. The Challenges of Non-IID Data Distribution
    - when data distributions are non-IID, FEDAVG is unstable and may even diverge (see the label-skew sketch below)
    - selecting devices with a clustering algorithm can help even out the data distribution and speed up convergence
3. Deep Reinforcement Learning (DRL)

## DRL for Client Selection

1. The Agent Based on Deep Q-Network
    - DQN can be trained more efficiently and reuses data more effectively than policy gradient methods and actor-critic methods
2. Workflow (see the reward sketch below)
    ![](https://i.imgur.com/Q0X9XRQ.png)
    ![](https://i.imgur.com/Kks2uWJ.png)
3. Dimension Reduction
    - apply principal component analysis (PCA) to the model weights (see the PCA sketch below)
    - use the compressed model weights to represent states instead
4. Training the Agent with Double DQN (see the Double DQN sketch below)
    ![](https://i.imgur.com/tMOy8Fr.png)

## Evaluation

1. Training the DRL agent
    - AWS EC2 (K80 GPU)
2. Different Levels of Non-IID Data
3. Device Selection and Weight Updates
4. Increasing Parallelism

## Related Work

1. Communication efficiency
2. Sample efficiency

## Concluding Remarks

- FAVOR
- selects the best subset of devices in each round to achieve the stated objectives
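
## Code Sketches

To make the outline above concrete, here are a few minimal sketches of the mechanisms the notes mention. None of this is the authors' code; helper names, shapes, and constants are my assumptions. First, the FedAvg aggregation step from the Background section: the server averages the clients' weight vectors, weighted by local dataset size.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of flattened client weight vectors (FedAvg).

    client_weights: list of K arrays of shape (d,)
    client_sizes:   list of K local dataset sizes n_k
    """
    coeffs = np.asarray(client_sizes, dtype=float)
    coeffs /= coeffs.sum()                      # n_k / n
    return coeffs @ np.stack(client_weights)    # sum_k (n_k / n) * w_k

# Usage: three clients holding flattened models of dimension 4.
clients = [np.random.randn(4) for _ in range(3)]
global_w = fedavg(clients, client_sizes=[100, 50, 150])
```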
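The non-IID setting discussed in the Background and Evaluation sections can be simulated with a label-skew partition: sort the dataset by label, cut it into shards, and hand each simulated device a few shards so it only sees a couple of classes. This is a common construction for such experiments; the exact skew levels used in the paper's evaluation may differ.

```python
import numpy as np

def label_skew_partition(labels, n_devices, shards_per_device=2, seed=0):
    """Assign each device `shards_per_device` label-sorted shards,
    so every device sees only a few classes (a non-IID split)."""
    rng = np.random.default_rng(seed)
    order = np.argsort(labels)                   # sample indices sorted by label
    shards = np.array_split(order, n_devices * shards_per_device)
    pick = rng.permutation(len(shards))          # random shard-to-device assignment
    return [np.concatenate([shards[j] for j in pick[i::n_devices]])
            for i in range(n_devices)]

# Usage: 10 classes, 100 devices, 2 shards each -> roughly 2 classes per device.
labels = np.repeat(np.arange(10), 600)           # stand-in for MNIST labels
parts = label_skew_partition(labels, n_devices=100)
```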
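The Dimension Reduction step: flatten each device's model weights into one vector and project it onto a few principal components, so the DQN state stays low-dimensional. scikit-learn's `PCA` stands in here for whatever implementation the authors used, and the number of components is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA

def compress_states(weight_matrix, n_components=10):
    """weight_matrix: (num_devices, num_params) flattened model weights.
    Returns a (num_devices, n_components) low-dimensional state."""
    pca = PCA(n_components=n_components)         # n_components <= num_devices here
    return pca.fit_transform(weight_matrix)

# Usage: 20 devices, 5,000-parameter models compressed to 10 dimensions.
states = compress_states(np.random.randn(20, 5000), n_components=10)
```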
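The workflow rewards the agent for pushing the global model toward a target test accuracy quickly. As I read the paper, the per-round reward has the form r_t = Ξ^(ω_t − Ω) − 1, where ω_t is the test accuracy after round t and Ω is the target accuracy; the base Ξ (64 in my reading) should be treated as an assumption.

```python
def favor_reward(acc_t: float, target_acc: float, xi: float = 64.0) -> float:
    """Per-round reward r_t = xi**(acc_t - target_acc) - 1.

    Zero when the target accuracy is hit exactly, positive beyond it,
    and bounded below by -1, so slow progress is penalized every round.
    """
    return xi ** (acc_t - target_acc) - 1.0

# Usage: accuracy 0.90 against a 0.99 target gives a small negative reward.
print(favor_reward(0.90, 0.99))   # ~ -0.31
```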
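Finally, the Double DQN update used to train the agent: the online network picks the next action and the periodically synced target network evaluates it, which reduces the Q-value overestimation of vanilla DQN. A PyTorch sketch; the network shapes and the `done` mask are my additions.

```python
import torch

def double_dqn_target(online_net, target_net, reward, next_state, done, gamma=0.99):
    """Compute y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    with torch.no_grad():
        best_action = online_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + gamma * next_q * (1.0 - done)

# Usage with toy two-layer networks over a 10-dimensional state, 4 actions.
make_net = lambda: torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                                       torch.nn.Linear(32, 4))
online, target = make_net(), make_net()
y = double_dqn_target(online, target,
                      reward=torch.zeros(8), next_state=torch.randn(8, 10),
                      done=torch.zeros(8))
```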