*Federated Learning on Non-IID Data Silos: An Experimental Study*
https://arxiv.org/abs/2102.02079
Non-IID data across clients degrades federated learning training results. This paper compares **FedAvg** with three algorithms derived from it to handle non-IID data — **FedProx**, **FedNova**, and **SCAFFOLD** — and proposes a richer set of non-IID partition types on which to compare them.
CONTRIBUTIONS:
1. **Proposes how to construct (partition) non-IID datasets**
2. Tests the algorithms above more comprehensively and proposes a benchmark
3. Provides the experimental datasets and experiment settings (may be useful later)
4. **Provides implementations of the aforementioned algorithms**
NON-IID TYPES AND HOW TO GENERATE THEM:
1. label distribution skew: the label distribution differs across parties
    1. Quantity-based label imbalance
       The dataset has L distinct labels (samples are grouped by label). Each party is randomly assigned a fixed number k of label types; the samples of each label are then divided randomly and evenly among the parties that hold that label (see the first sketch after this list).
    2. Distribution-based label imbalance
       Each party is allocated a proportion of the samples of each label according to a Dirichlet distribution (second sketch below).
2. feature distribution skew: the feature distribution differs across parties (but P(y_i|x_i) may be the same)
    1. Noise-based feature imbalance
       The dataset is split randomly and evenly across parties, and each party adds a different level of Gaussian noise to its local dataset to achieve different feature distributions (third sketch below).
    2. Synthetic feature imbalance
       The paper proposes a dataset, **FCUBE** (picture a 3-D cube evenly divided into eight sub-cubes; each pair of diagonally opposite sub-cubes is assigned to one party; fourth sketch below).
    3. Real-world feature imbalance
       Uses a real-world dataset, **FEMNIST** (handwritten characters from EMNIST, grouped by writer); each writer's data is assigned to exactly one party. Different writers render the same character (label) with different sizes, stroke widths, and slants (features), so feature imbalance holds.
3. same label but different features: P(x_i|y_i) is different among parties
    * arises in vertical federated learning; ~~not discussed in this paper~~
4. same features but different labels: P(y_i|x_i) is different among parties
    * not applicable in most FL studies, which assume there is common knowledge P(y|x) among the parties to learn; ~~not discussed in this paper~~
5. quantity skew: the data distribution may be the same across parties, but the number of samples per party differs
    * a **Dirichlet distribution** is used to allocate a different amount of data samples to each party (fifth sketch below)
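A minimal sketch of the quantity-based label imbalance partition (each party holds exactly k of the L label types), written independently of the paper's released code; the function name and arguments are illustrative:

```python
import numpy as np

def quantity_based_label_partition(labels, n_parties, k, seed=0):
    """Assign k random label types to each party, then split each label's
    samples evenly among the parties that hold that label."""
    rng = np.random.default_rng(seed)
    n_labels = int(labels.max()) + 1
    party_labels = [rng.choice(n_labels, size=k, replace=False)
                    for _ in range(n_parties)]
    party_idx = [[] for _ in range(n_parties)]
    for label in range(n_labels):
        owners = [p for p in range(n_parties) if label in party_labels[p]]
        if not owners:                      # no party drew this label
            continue
        idx = np.flatnonzero(labels == label)
        rng.shuffle(idx)
        for owner, chunk in zip(owners, np.array_split(idx, len(owners))):
            party_idx[owner].extend(chunk.tolist())
    return party_idx

# Example: 1000 samples over 10 classes, 10 parties, k = 2 label types each.
labels = np.random.default_rng(1).integers(0, 10, size=1000)
parts = quantity_based_label_partition(labels, n_parties=10, k=2)
```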
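The distribution-based label imbalance can be sketched the same way: for each label, party proportions are drawn from a symmetric Dirichlet(beta) distribution, where a smaller beta gives a more skewed split. Names are again illustrative:

```python
import numpy as np

def dirichlet_label_partition(labels, n_parties, beta=0.5, seed=0):
    """Split each label's samples across parties with Dirichlet proportions."""
    rng = np.random.default_rng(seed)
    party_idx = [[] for _ in range(n_parties)]
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        rng.shuffle(idx)
        proportions = rng.dirichlet([beta] * n_parties)
        # Turn cumulative proportions into cut points for np.split.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for party, chunk in enumerate(np.split(idx, cuts)):
            party_idx[party].extend(chunk.tolist())
    return party_idx
```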
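A sketch of the noise-based feature imbalance: a uniform random split, after which party i adds Gaussian noise to its local features. The exact noise schedule below (std growing linearly with the party index, party 0 staying clean) is an assumption for illustration:

```python
import numpy as np

def noise_based_partition(X, n_parties, sigma=0.1, seed=0):
    """Uniform split; party i perturbs its features with Gaussian noise."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    local_datasets = []
    for i, chunk in enumerate(np.array_split(idx, n_parties)):
        level = sigma * i / max(n_parties - 1, 1)   # assumed linear schedule
        noisy = X[chunk] + rng.normal(0.0, level, size=X[chunk].shape)
        local_datasets.append(noisy)
    return local_datasets
```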
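The FCUBE construction can be sketched as below. The cube range [-1, 1]^3 and the labeling rule (sign of the first coordinate) are assumptions on my part, chosen to illustrate the diagonal-octant assignment described above:

```python
import numpy as np

def fcube_style_partition(n_samples=4000, seed=0):
    """Sample points from a cube, split its 8 octants into 4 diagonal pairs."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 3))
    y = (X[:, 0] > 0).astype(int)                 # assumed labeling rule
    # Octant id (0..7) from the signs of the three coordinates.
    octant = ((X > 0) * np.array([4, 2, 1])).sum(axis=1)
    # Diagonally opposite octants have complementary ids o and 7 - o,
    # so min(o, 7 - o) maps each pair to one of 4 parties.
    party = np.minimum(octant, 7 - octant)
    return X, y, [np.flatnonzero(party == p) for p in range(4)]
```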
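Finally, a sketch of quantity skew: party sizes follow a symmetric Dirichlet(beta) over a shuffled dataset, so each party keeps roughly the same label distribution but a different amount of data. Again a sketch, not the paper's implementation:

```python
import numpy as np

def quantity_skew_partition(n_samples, n_parties, beta=0.5, seed=0):
    """Allocate different numbers of samples per party via Dirichlet shares."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    fractions = rng.dirichlet([beta] * n_parties)  # fraction of data per party
    cuts = (np.cumsum(fractions)[:-1] * n_samples).astype(int)
    return np.split(idx, cuts)
```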
Public image datasets:
* MNIST
* CIFAR-10
* FMNIST
* SVHN
* FCUBE (from this paper)
* FEMNIST
EXPERIMENTS AND FINDINGS: