# Introduction to Federated Learning (FL)
<br>
<br>
<small>Neil John D. Ortega</small><br>
<small>
<small>ML Engineer @ LINE Fukuoka</small><br>
<small>2021/06/04</small>
</small>
---
## Agenda
- What?
- Why?
- How does it work?
- How to use?
- Challenges
- Recap
---
## What is Federated Learning (FL)?
- ML setting where:<!-- .element: class="fragment" -->
- Multiple clients collaborate in solving an ML problem<!-- .element: class="fragment" -->
- Orchestrated by a central server<!-- .element: class="fragment" -->
- Clients' data remain local and are not exchanged/transferred<!-- .element: class="fragment" -->
- Clients' model updates are sent to central server for aggregation<!-- .element: class="fragment" -->
----
## What is Federated Learning (FL)?
- Can be **cross-device** or **cross-silo** <!-- .element: class="fragment" -->
  - The main difference is scope: cross-silo FL trains models on siloed (i.e. organization-level) data held by a handful of organizations, while cross-device FL trains across large fleets of edge devices<!-- .element: class="fragment" -->
- Highly interdisciplinary<!-- .element: class="fragment" -->
- ML, distributed computing and optimization, cryptography, security, differential privacy, fairness, information theory, etc. <!-- .element: class="fragment" -->
---
## Why is FL relevant?
- Allows training models without the need to centralize (usually private) data<!-- .element: class="fragment" -->
- Direct application of the following principles [1]:<!-- .element: class="fragment" -->
- **Focused collection**: consumers have a right to limit the amount of personal data companies collect AND retain<!-- .element: class="fragment" -->
- **Data minimization**: orgs should only collect personally identifiable info (PII) directly relevant to the task at hand AND retain it for only as long as necessary<!-- .element: class="fragment" -->
- **How can we protect the privacy of users but at the same time promote innovation?** <!-- .element: class="fragment" -->
---
## How does FL work? - Model Lifecycle

![The model lifecycle in an FL setting](https://i.imgur.com/zXWhHzX.png)
<small><strong>Fig. 1.</strong> The model lifecycle and the involved actors in an FL setting [2]. Accessed 3 Jun 2021.</small>
----
## How does FL work? - Model Lifecycle
1. **Problem identification**: ML engineer determines a problem where FL makes sense<!-- .element: class="fragment" -->
2. **Client instrumentation**: make sure that the client (e.g. edge device) has everything it needs to perform local training<!-- .element: class="fragment" -->
3. **Simulation prototyping (optional)**: ML engineer may experiment with model architectures, hyperparameters, etc. in an FL simulation on a proxy dataset (see the sketch after the lifecycle steps)<!-- .element: class="fragment" -->
----
## How does FL work? - Model Lifecycle
4. **Federated training**: federated training tasks are started to train the model<!-- .element: class="fragment" -->
5. **(Federated) model evaluation**: after sufficient training, the resulting models are evaluated against metrics in either a centralized or federated manner (i.e. akin to cross-validation, but on held-out devices)<!-- .element: class="fragment" -->
6. **Deployment**: after evaluation, the final model goes through the standard model deployment process<!-- .element: class="fragment" -->
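----
## How does FL work? - Simulation Prototyping Sketch
As a rough illustration of step 3, here is a minimal simulation sketch using TensorFlow Federated's circa-2021 `build_federated_averaging_process` API; the toy model and random proxy data are hypothetical stand-ins for a real prototype:
```python
import tensorflow as tf
import tensorflow_federated as tff

# Hypothetical proxy data: a few simulated clients with random examples.
def make_client_dataset():
    x = tf.random.normal([32, 784])
    y = tf.random.uniform([32], maxval=10, dtype=tf.int32)
    return tf.data.Dataset.from_tensor_slices((x, y)).batch(8)

client_datasets = [make_client_dataset() for _ in range(3)]

def model_fn():
    # Toy softmax classifier standing in for the model being prototyped.
    keras_model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
    ])
    return tff.learning.from_keras_model(
        keras_model,
        input_spec=client_datasets[0].element_spec,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    )

# Build the iterative FedAvg process and run a few simulated rounds.
process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
)
state = process.initialize()
for round_num in range(5):
    state, metrics = process.next(state, client_datasets)
    print(f"round {round_num}: {metrics}")
```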
----
## How does FL work? - Federated training

<small><strong>Fig. 2.</strong> The typical federated training process [4]. Accessed 3 Jun 2021.</small>
----
## How does FL work? - Federated training
1. **Client selection**: server samples from all eligible clients<!-- .element: class="fragment" -->
2. **Broadcast**: the sampled clients download (a) the current model weights, and (b) a training program from the server<!-- .element: class="fragment" -->
3. **Client computation**: each client computes a local update to the model by performing the training program<!-- .element: class="fragment" -->
4. **Aggregation**: the server collects the client updates<!-- .element: class="fragment" -->
5. **Model update**: the server updates the shared model based on the computed aggregate<!-- .element: class="fragment" -->
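
Concretely, writing $S_t$ for the clients sampled in round $t$, $n_k$ for the number of examples on client $k$, and $w_{t+1}^k$ for that client's locally trained weights, the server's update in step 5 under `FedAvg` is the data-weighted average [3]:
$$w_{t+1} = \sum_{k \in S_t} \frac{n_k}{n}\, w_{t+1}^k, \qquad n = \sum_{k \in S_t} n_k$$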
----
## How does FL work? - Federated training via `FedAvg` Algo

<small><strong>Fig. 3.</strong> The <code>FedAvg</code> algorithm - a concrete example of federated training [3][4]. Accessed 3 Jun 2021.</small>
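----
## How does FL work? - `FedAvg`, Minimally
A framework-free sketch of the round structure in Fig. 3; `local_train` and the per-client data below are hypothetical stand-ins (a real client would run local SGD on its own loss):
```python
import numpy as np

def local_train(weights, data, lr=0.1):
    # Hypothetical client update; a real client would run a few epochs of
    # SGD on its local data. Here: one fake gradient step, so the sketch runs.
    fake_grad = np.random.randn(*weights.shape) * 0.01
    return weights - lr * fake_grad

def fedavg_round(global_weights, client_data):
    """One FedAvg round over the (already selected) clients."""
    new_weights, sizes = [], []
    for data in client_data:
        # Broadcast: each client starts from the current global weights.
        w_k = local_train(global_weights.copy(), data)  # client computation
        new_weights.append(w_k)
        sizes.append(len(data))
    # Aggregation + model update: average weighted by local dataset size.
    total = sum(sizes)
    return sum((n / total) * w for n, w in zip(sizes, new_weights))

# Toy usage: 5 clients holding differently sized (fake) local datasets.
client_data = [np.zeros((n, 3)) for n in (10, 20, 30, 40, 50)]
w = np.zeros(8)                        # initial global model weights
for _ in range(3):                     # a few federated rounds
    w = fedavg_round(w, client_data)
```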
----
## How does FL work? - Federated training

<small><strong>Fig. 4.</strong> Typical order-of-magnitude sizes for cross-device FL applications [2]. Accessed 3 Jun 2021.</small>
---
## How to use FL? - Frameworks and Datasets
- Frameworks<!-- .element: class="fragment" -->
  - [TensorFlow Federated](https://github.com/tensorflow/federated) - mainly for FL; has abstractions for aggregation, broadcast, and serialization of TF computations, and can potentially be used in production<!-- .element: class="fragment" -->
- [PySyft](https://github.com/OpenMined/PySyft) - not specific to FL, also includes differential privacy and multi-party computation (MPC)<!-- .element: class="fragment" -->
- 🌸 [**Flower**](https://github.com/adap/flower) - geared towards production environments, under active development 🔥 <!-- .element: class="fragment" -->
- ... and many more!<!-- .element: class="fragment" -->
- Datasets (for benchmarks, experiments)<!-- .element: class="fragment" -->
- [LEAF](https://leaf.cmu.edu/) - compilation of FL-ready versions of well-known datasets such as MNIST (image classification), Shakespeare (next character prediction), etc.<!-- .element: class="fragment" -->
----
## How to use FL? - 🌸 [**Flower**](https://github.com/adap/flower) Demo
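For reference, a minimal sketch of a Flower client in the style of the quickstart examples, using the API as of mid-2021; the toy Keras model and random local data are hypothetical stand-ins for the demo's real ones:
```python
import flwr as fl
import numpy as np
import tensorflow as tf

# Hypothetical toy model and local data standing in for the demo's real ones.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
])
model.compile("sgd", "sparse_categorical_crossentropy", metrics=["accuracy"])
x, y = np.random.randn(64, 784), np.random.randint(0, 10, 64)

class DemoClient(fl.client.NumPyClient):
    def get_parameters(self):
        return model.get_weights()

    def fit(self, parameters, config):
        model.set_weights(parameters)            # broadcast weights arrive
        model.fit(x, y, epochs=1, verbose=0)     # local client computation
        return model.get_weights(), len(x), {}   # update sent for aggregation

    def evaluate(self, parameters, config):
        model.set_weights(parameters)
        loss, acc = model.evaluate(x, y, verbose=0)
        return loss, len(x), {"accuracy": acc}

# Server, in a separate process:
#   fl.server.start_server(config={"num_rounds": 3})
fl.client.start_numpy_client("localhost:8080", client=DemoClient())
```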
---
## Challenges
- Handling non-IID (not independent and identically distributed) data is still an open problem<!-- .element: class="fragment" -->
- Communication is a big bottleneck, prompting further research in communication efficiency and compression<!-- .element: class="fragment" -->
- Adapting techniques from the centralized setting (e.g. hyperparameter tuning, debugging, interpretability) to the FL setting is not straightforward<!-- .element: class="fragment" -->
- Expanding FL into other learning settings (e.g. semi-supervised, unsupervised, RL, etc.)<!-- .element: class="fragment" -->
----
## Challenges
- Integrating other privacy-preserving techniques (differential privacy, MPC, etc.) into the FL setting<!-- .element: class="fragment" -->
- Verifying that parties have faithfully executed the parts of a computation delegated to them (i.e. adversarial clients/servers)<!-- .element: class="fragment" -->
- Constant tension between improving robustness and privacy<!-- .element: class="fragment" -->
- Ensuring fairness despite the lack of access to data<!-- .element: class="fragment" -->
- Systems engineering of the entire model lifecycle<!-- .element: class="fragment" -->
- Support for on-device training is still lacking<!-- .element: class="fragment" -->
---
## Recap
<style>
.reveal h1 {font-size: 2.0em !important;}
.reveal h2 {font-size: 1.28em !important;}
.reveal ul {font-size: 32px !important;}
.reveal ol strong,
.reveal ul strong {
color: #E26A6A !important;
}
</style>
- Federated Learning (FL) is useful in settings where an ML-based solution is desired but the data cannot be centralized<!-- .element: class="fragment" -->
- FL can address the privacy-vs-innovation problem<!-- .element: class="fragment" -->
- FL introduces some major changes to the entire ML model lifecycle, specifically at the model training step <!-- .element: class="fragment" -->
- Lots of existing frameworks for FL, but mainly for simulation - [**Flower**](https://github.com/adap/flower) aims to make FL viable in production settings<!-- .element: class="fragment" -->
- Still lots of open problems in FL<!-- .element: class="fragment" -->
---
# Thank you! :nerd_face:
---
## References
<!-- .slide: data-id="references" -->
<style>
.reveal p {font-size: 20px !important;}
.reveal ul, .reveal ol {
display: block !important;
font-size: 30px !important;
}
.reveal li {line-height: 1.4 !important;}
section[data-id="references"] p {
text-align: center !important;
}
</style>
[1] The White House. “[Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy](https://journalprivacyconfidentiality.org/index.php/jpc/article/view/623).” Journal of Privacy and Confidentiality 4 (2) (2013). https://doi.org/10.29012/jpc.v4i2.623.
[2] Kairouz, P. et al. “[Advances and Open Problems in Federated Learning](https://arxiv.org/abs/1912.04977).” ArXiv abs/1912.04977 (2019): n. pag.
[3] McMahan, H. B. et al. “[Communication-Efficient Learning of Deep Networks from Decentralized Data](https://arxiv.org/abs/1602.05629).” AISTATS (2017).
[4] Visengeriyeva, L. et al. “[Three Levels of ML Software](https://ml-ops.org/content/three-levels-of-ml-software.html).” Web blog post. INNOQ. Web.
{"metaMigratedAt":"2023-06-17T05:03:17.346Z","metaMigratedFrom":"YAML","title":"Introduction to Federated Learning","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"spotlight\":{\"enabled\":false}}","contributors":"[]"}