# (Draft) Optimizing Federated Learning on Non-IID Data with Reinforcement Learning
###### tags: `Paper`
## Abstract
- non-IID data
- speed up convergence
- connection between the distribution of training data on a device and the model weights trained on those data
- deep Q-learning
- datasets: MNIST, FashionMNIST, CIFAR-10
## Introduction
- privacy-sensitive data
- FEDAVG: reduces the number of communication rounds required
- FAVOR: a control framework that aims to improve the performance of federated learning through intelligent device selection
- Double DQN
## Background and Motivation
1. Federated Learning
- trains a shared global model
- slow and unstable network connections
2. The Challenges of Non-IID Data Distribution
- when data distributions are non-IID, FEDAVG is unstable and may even diverge
    - Selecting devices with a clustering algorithm can help even out the data distribution and speed up convergence
3. Deep Reinforcement Learning (DRL)
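The FEDAVG aggregation mentioned above averages client models weighted by local dataset size; a minimal numpy sketch (function name and flattened-weight representation are illustrative assumptions):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: weighted average of client model weights.

    client_weights: list of 1-D numpy arrays (flattened model weights)
    client_sizes:   number of local training samples per client
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()            # weight each client by its data share
    stacked = np.stack(client_weights)      # (n_clients, n_params)
    return (coeffs[:, None] * stacked).sum(axis=0)

# toy usage: two clients; the client with 3x more data dominates the average
w_global = fedavg([np.array([0.0, 0.0]), np.array([1.0, 1.0])], [1, 3])
# -> array([0.75, 0.75])
```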
## DRL for Client Selection
1. The Agent based on Deep Q-Network
    - DQN can be trained more efficiently and can reuse data more effectively than policy gradient methods and actor-critic methods
2. Workflow


3. Dimension Reduction
    - apply principal component analysis (PCA) to the model weights
- use the compressed model weights to represent states instead
4. Training the Agent with Double DQN

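The dimension-reduction step (item 3) can be sketched with an SVD-based PCA over flattened model weights; the function name and the choice of compressing a stack of model-weight vectors are assumptions for illustration:

```python
import numpy as np

def pca_compress(weight_vectors, k):
    """Project flattened model weights onto their top-k principal components,
    giving a compact state representation for the DRL agent."""
    W = np.asarray(weight_vectors)          # (n_models, n_params)
    mean = W.mean(axis=0)
    # SVD of the centered matrix; rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(W - mean, full_matrices=False)
    return (W - mean) @ Vt[:k].T            # (n_models, k)

rng = np.random.default_rng(0)
weights = rng.normal(size=(10, 1000))       # e.g. 10 models, 1000 parameters each
state = pca_compress(weights, k=4)          # 10 compressed 4-dim representations
```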
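The Double DQN update (item 4) decouples action selection from action evaluation: the online network picks the best next action, and the target network scores it. A minimal sketch of the target computation, with hypothetical `q_online`/`q_target` callables standing in for the two networks:

```python
import numpy as np

def double_dqn_target(reward, next_state, q_online, q_target, gamma=0.95, done=False):
    """Double DQN target: select the next action with the online network,
    evaluate it with the target network (reduces Q-value overestimation)."""
    if done:
        return reward
    best_action = int(np.argmax(q_online(next_state)))          # selection
    return reward + gamma * q_target(next_state)[best_action]   # evaluation

# toy Q-functions for illustration
q_online = lambda s: np.array([1.0, 2.0])   # online net prefers action 1
q_target = lambda s: np.array([0.5, 0.1])   # target net evaluates action 1
y = double_dqn_target(reward=1.0, next_state=None,
                      q_online=q_online, q_target=q_target)
```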
## Evaluation
1. Training the DRL agent
- AWS EC2 (K80 GPU)
2. Different Levels of Non-IID Data
3. Device Selection and Weight Updates
4. Increasing Parallelism
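One common way to simulate "different levels of non-IID data" (item 2) is to give each device a fraction `sigma` of its samples from a single dominant class and draw the rest uniformly; the partition function below is an illustrative sketch, not the paper's exact procedure (sampling is with replacement for simplicity):

```python
import numpy as np

def non_iid_partition(labels, n_devices, sigma, rng=None):
    """Assign each device a fraction `sigma` of samples from one dominant
    class and the remainder uniformly at random.
    sigma=1.0 -> fully one-class devices; sigma~1/n_classes -> roughly IID."""
    rng = rng or np.random.default_rng(0)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    per_device = len(labels) // n_devices
    shards = []
    for d in range(n_devices):
        dominant = classes[d % len(classes)]
        n_dom = int(sigma * per_device)
        dom_pool = np.flatnonzero(labels == dominant)
        rest_pool = np.arange(len(labels))
        idx = np.concatenate([rng.choice(dom_pool, n_dom, replace=True),
                              rng.choice(rest_pool, per_device - n_dom, replace=True)])
        shards.append(idx)
    return shards

# toy usage: 10 classes, 100 samples each, 5 devices at sigma = 0.8
labels = np.repeat(np.arange(10), 100)
shards = non_iid_partition(labels, n_devices=5, sigma=0.8)
```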
## Related Work
1. Communication efficiency
2. Sample efficiency
## Concluding Remarks
- FAVOR
- select the best subset of devices to achieve our objectives