# Introduction to Federated Learning (FL)

<br>
<br>
<small>Neil John D. Ortega</small><br>
<small> <small>ML Engineer @ LINE Fukuoka</small><br> <small>2021/06/04</small> </small>

---

## Agenda

- What?
- Why?
- How does it work?
- How to use?
- Challenges
- Recap

---

## What is Federated Learning (FL)?

- ML setting where:<!-- .element: class="fragment" -->
  - Multiple clients collaborate in solving an ML problem<!-- .element: class="fragment" -->
  - Training is orchestrated by a central server<!-- .element: class="fragment" -->
  - Clients' data remain local and are never exchanged/transferred<!-- .element: class="fragment" -->
  - Clients' model updates are sent to the central server for aggregation<!-- .element: class="fragment" -->

----

## What is Federated Learning (FL)?

- Can be **cross-device** or **cross-silo**<!-- .element: class="fragment" -->
  - The main difference is scope: "cross-silo" FL trains models on siloed (i.e. organization-level) data, while "cross-device" FL trains across many edge devices<!-- .element: class="fragment" -->
- Highly interdisciplinary<!-- .element: class="fragment" -->
  - ML, distributed computing and optimization, cryptography, security, differential privacy, fairness, information theory, etc.<!-- .element: class="fragment" -->

---

## Why is FL relevant?

- Allows training of models without the need to centralize (usually private) data<!-- .element: class="fragment" -->
- Direct application of the following principles [1]:<!-- .element: class="fragment" -->
  - **Focused collection**: consumers have a right to limit the amount of personal data companies collect AND retain<!-- .element: class="fragment" -->
  - **Data minimization**: organizations should only collect personally identifiable information (PII) directly relevant to the task at hand AND retain it only for as long as necessary<!-- .element: class="fragment" -->
- **How can we protect the privacy of users while at the same time promoting innovation?**<!-- .element: class="fragment" -->

---

## How does FL work? - Model Lifecycle

![Model lifecycle in a federated learning setting](https://i.imgur.com/PXgvRsH.png)
<small><strong>Fig. 1.</strong> The model lifecycle and the involved actors in an FL setting [2]. Accessed 3 Jun 2021.</small>

----

## How does FL work? - Model Lifecycle

1. **Problem identification**: ML engineer identifies a problem where FL makes sense<!-- .element: class="fragment" -->
2. **Client instrumentation**: make sure that the client (e.g. an edge device) has everything it needs to perform local training<!-- .element: class="fragment" -->
3. **Simulation prototyping (optional)**: ML engineer may experiment with model architectures, hyperparameters, etc. in an FL simulation on a proxy dataset<!-- .element: class="fragment" -->

----

## How does FL work? - Model Lifecycle

4. **Federated training**: federated training tasks are started to train the model<!-- .element: class="fragment" -->
5. **(Federated) model evaluation**: after successful/sufficient training, the resulting models are evaluated based on metrics in either a centralized or federated manner (i.e. cross-validation but on held-out devices)<!-- .element: class="fragment" -->
6. **Deployment**: after evaluation, the final model goes through the standard model deployment process<!-- .element: class="fragment" -->

----

## How does FL work? - Federated training

![The typical federated training process](https://i.imgur.com/OjyIxZH.png =680x433)
<small><strong>Fig. 2.</strong> The typical federated training process [4]. Accessed 3 Jun 2021.</small>

----

## How does FL work? - Federated training

1. **Client selection**: server samples from all eligible clients<!-- .element: class="fragment" -->
2. **Broadcast**: the sampled clients download (a) the current model weights and (b) a training program from the server<!-- .element: class="fragment" -->
3. **Client computation**: each client computes a local update to the model by executing the training program<!-- .element: class="fragment" -->
4. **Aggregation**: the server collects the client updates<!-- .element: class="fragment" -->
5. **Model update**: the server updates the shared model based on the computed aggregate (see the sketch on the next slide)<!-- .element: class="fragment" -->
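----

## How does FL work? - One round in code

To make the loop concrete, here is a minimal single-process sketch of one federated round in plain Python/NumPy. It follows the five steps above but is not any particular framework's API; `local_update`, `fedavg_round`, and the linear-model toy data are hypothetical stand-ins.

```python
import numpy as np

def local_update(weights, client_data, lr=0.1, epochs=1):
    """Client computation: a few epochs of SGD on local data (toy linear model)."""
    w = weights.copy()
    for _ in range(epochs):
        for x, y in client_data:
            grad = (w @ x - y) * x  # gradient of 0.5 * (w.x - y)^2
            w -= lr * grad
    return w

def fedavg_round(weights, clients, rng, num_sampled=2):
    # 1. Client selection: sample from all eligible clients
    sampled = rng.choice(len(clients), size=num_sampled, replace=False)
    updates, sizes = [], []
    for i in sampled:
        # 2. Broadcast: the client receives the current global weights
        # 3. Client computation: local training on the client's own data
        updates.append(local_update(weights, clients[i]))
        sizes.append(len(clients[i]))
    # 4.-5. Aggregation + model update: data-size-weighted average (FedAvg)
    total = sum(sizes)
    return sum((n / total) * w for n, w in zip(sizes, updates))

# Toy run: 4 clients, each holding a few (x, y) samples; data never leaves a client
rng = np.random.default_rng(0)
clients = [[(rng.normal(size=3), rng.normal()) for _ in range(5)] for _ in range(4)]
weights = np.zeros(3)
for _ in range(10):
    weights = fedavg_round(weights, clients, rng)
```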
----

## How does FL work? - Federated training via `FedAvg` Algo

![`FedAvg` algorithm explained](https://i.imgur.com/xHkLnWv.png =821x439)
<small><strong>Fig. 3.</strong> The <code>FedAvg</code> algorithm - a concrete example of federated training [3][4]. Accessed 3 Jun 2021.</small>

----

## How does FL work? - Federated training

![Typical cross-device FL orders of magnitude](https://i.imgur.com/LGMTCqH.png)
<small><strong>Fig. 4.</strong> Typical order-of-magnitude sizes for cross-device FL applications [2]. Accessed 3 Jun 2021.</small>

---

## How to use FL? - Frameworks and Datasets

- Frameworks<!-- .element: class="fragment" -->
  - [TensorFlow Federated](https://github.com/tensorflow/federated) - mainly FL; has abstractions for aggregation, broadcast, and serialization of TF computations, and can potentially be used in production<!-- .element: class="fragment" -->
  - [PySyft](https://github.com/OpenMined/PySyft) - not specific to FL; also includes differential privacy and multi-party computation (MPC)<!-- .element: class="fragment" -->
  - 🌸 [**Flower**](https://github.com/adap/flower) - geared towards production environments, under active development 🔥<!-- .element: class="fragment" -->
  - ... and many more!<!-- .element: class="fragment" -->
- Datasets (for benchmarks, experiments)<!-- .element: class="fragment" -->
  - [LEAF](https://leaf.cmu.edu/) - compilation of FL-ready versions of well-known datasets such as MNIST (image classification), Shakespeare (next-character prediction), etc.<!-- .element: class="fragment" -->

----

## How to use FL? - 🌸 [**Flower**](https://github.com/adap/flower) Demo
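For reference, a minimal Flower client along the lines of the demo, adapted from Flower's TensorFlow quickstart (API as of mid-2021; names may differ in newer releases). The Keras model and the MNIST data are illustrative stand-ins for a client's local model and data.

```python
import flwr as fl
import tensorflow as tf

# Illustrative local data/model; a real client holds its own (often non-IID) partition
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile("sgd", "sparse_categorical_crossentropy", metrics=["accuracy"])

class MnistClient(fl.client.NumPyClient):
    def get_parameters(self):
        return model.get_weights()  # current local weights, when the server asks

    def fit(self, parameters, config):
        model.set_weights(parameters)  # start from the broadcast global weights
        model.fit(x_train, y_train, epochs=1, batch_size=32, verbose=0)
        return model.get_weights(), len(x_train), {}

    def evaluate(self, parameters, config):
        model.set_weights(parameters)  # federated evaluation on local held-out data
        loss, acc = model.evaluate(x_test, y_test, verbose=0)
        return loss, len(x_test), {"accuracy": acc}

# Server side (separate process): fl.server.start_server("[::]:8080", config={"num_rounds": 3})
fl.client.start_numpy_client("[::]:8080", client=MnistClient())
```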
---

## Challenges

- Handling non-IID (independent and identically distributed) data is still an open problem<!-- .element: class="fragment" -->
- Communication is a big bottleneck, prompting further research in communication efficiency and compression<!-- .element: class="fragment" -->
- Adapting techniques from the centralized setting (e.g. hyperparameter tuning, debugging, interpretability) to the FL setting is not straightforward<!-- .element: class="fragment" -->
- Expanding FL into other learning settings (e.g. semi-supervised, unsupervised, RL, etc.)<!-- .element: class="fragment" -->

----

## Challenges

- Integrating other privacy-preserving techniques (differential privacy, MPC, etc.) into the FL setting<!-- .element: class="fragment" -->
- Verifying that parties have faithfully executed the parts of a computation delegated to them (i.e. adversarial clients/servers)<!-- .element: class="fragment" -->
- Constant tension between improving robustness and preserving privacy<!-- .element: class="fragment" -->
- Ensuring fairness despite the lack of access to data<!-- .element: class="fragment" -->
- Systems engineering of the entire model lifecycle<!-- .element: class="fragment" -->
- Support for on-device training is still lacking<!-- .element: class="fragment" -->

---

## Recap

<style>
.reveal h1 {font-size: 2.0em !important;}
.reveal h2 {font-size: 1.28em !important;}
.reveal ul {font-size: 32px !important;}
.reveal ol strong, .reveal ul strong { color: #E26A6A !important; }
</style>

- Federated Learning (FL) is useful in settings where an ML-based solution is desired but data cannot be centralized<!-- .element: class="fragment" -->
- FL can address the privacy-vs-innovation problem<!-- .element: class="fragment" -->
- FL introduces some major changes to the entire ML model lifecycle, specifically at the model training step<!-- .element: class="fragment" -->
- Lots of existing frameworks for FL, but mainly for simulation - [**Flower**](https://github.com/adap/flower) intends to make FL available for production settings<!-- .element: class="fragment" -->
- Still lots of open problems in FL<!-- .element: class="fragment" -->

---

# Thank you! :nerd_face:

---

## References

<!-- .slide: data-id="references" -->

<style>
.reveal p {font-size: 20px !important;}
.reveal ul, .reveal ol { display: block !important; font-size: 30px !important; }
.reveal li {line-height: 1.4 !important;}
section[data-id="references"] p { text-align: center !important; }
</style>

[1] The White House. "[Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy](https://journalprivacyconfidentiality.org/index.php/jpc/article/view/623)." Journal of Privacy and Confidentiality 4 (2) (2013). https://doi.org/10.29012/jpc.v4i2.623.

[2] Kairouz, P. et al. "[Advances and Open Problems in Federated Learning](https://arxiv.org/abs/1912.04977)." arXiv:1912.04977 (2019).

[3] McMahan, H. B. et al. "[Communication-Efficient Learning of Deep Networks from Decentralized Data](https://arxiv.org/abs/1602.05629)." AISTATS (2017).

[4] Visengeriyeva, L. et al. "[Three Levels of ML Software](https://ml-ops.org/content/three-levels-of-ml-software.html)." INNOQ blog.
{"metaMigratedAt":"2023-06-17T05:03:17.346Z","metaMigratedFrom":"YAML","title":"Introduction to Federated Learning","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"spotlight\":{\"enabled\":false}}","contributors":"[]"}
    191 views