a roadmap for
# Principled Deep Learning
### Ferenc Huszár
Computer Lab, Cambridge
Gatsby Computational Neuroscience Unit, UCL
---
### ML@CL

---
### early 2000s: some favourite papers

---
### early 2000s: some favourite papers

---
### early 2000s: some favourite papers

---
### early 2000s
shift from bottom-up to top-down innovation
* make connections between different methods
* cast methods in a common framework
* abstract out key driving principles
* justify new methods from first principles
---
### 2010s: deep learning

---
### 2010s: deep learning

---
## deep learning challenged our existing principles and assumptions
---
### generalization
*"it's a property of the model class"*
### representation learning
*"maximum likelihood is all you need"*
### probabilistic foundations
*"Bayesian learning is the best kind of learning"*
### causal inference
*"goal of ML is to predict one thing from another"*
---
## Main themes in my research
---
### generalization and optimization
### representation learning
### probabilistic foundations
### causal inference
---
## Generalization
---
## Generalization

---
## Generalization

---
## Generalization

---
## Generalization: deep nets

---
## Generalization: deep nets

---
## Generalization
* implicit regularization of the optimization method
* new tools:
  * neural tangent kernel [(Jacot et al, 2018)](https://arxiv.org/abs/1806.07572)
  * infinite-width neural networks
* new insights:
  * deep linear models [(e.g. Arora et al, 2019)](https://arxiv.org/abs/1905.13655)
* my particular focus:
  * natural gradient descent (toy sketch below)
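
A minimal toy sketch (not from the original slide) of the natural gradient update $\theta \leftarrow \theta - \eta F^{-1}\nabla L$, using logistic regression because its Fisher information matrix has the closed form $F = X^\top \mathrm{diag}(p(1-p))\,X$; the function name and damping term are illustrative choices.

```python
# natural gradient descent step for logistic regression (illustrative sketch)
import numpy as np

def natural_gradient_step(theta, X, y, lr=1.0, damping=1e-4):
    p = 1.0 / (1.0 + np.exp(-X @ theta))                    # predicted probabilities
    grad = X.T @ (p - y) / len(y)                           # gradient of the mean NLL
    fisher = X.T @ (X * (p * (1 - p))[:, None]) / len(y)    # closed-form Fisher matrix
    # precondition the gradient with the (damped) inverse Fisher
    step = np.linalg.solve(fisher + damping * np.eye(len(theta)), grad)
    return theta - lr * step
```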
---
## Representation learning
---
## Representation learning
* unsupervised learning
* learn an alternative representation of the data that is 'more useful'
* what does 'more useful' mean?
---
## goal 1: data-efficiency

---
## goal 2: linearity

---
## Latent variable modeling
* $x$: raw data
* $z$: latent representation/hidden variables
* $p_\theta(x, z)$: latent variable model
* $p_\theta(x) = \int p_\theta(x,z) dz$: marginal likelihood
* maximum likelihood: $\theta^\ast = \operatorname{argmax}_\theta \sum_{n=1}^N \log p_\theta(x_n)$
* $p_\theta(z \vert x) = \frac{p_\theta(x,z)}{p_\theta(x)}$: "inference" model
* $p_\theta(x,z) = p_\theta(z|x)p_\theta(x)$
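
For context (not on the original slide), these quantities combine into the evidence lower bound that variational methods maximise in place of the intractable marginal likelihood; $q_\phi(z\vert x)$ is an approximate posterior introduced here for illustration:

$$\log p_\theta(x) \;\geq\; \mathbb{E}_{q_\phi(z\vert x)}\!\left[\log p_\theta(x,z) - \log q_\phi(z\vert x)\right] \;=\; \log p_\theta(x) - \mathrm{KL}\!\left(q_\phi(z\vert x)\,\Vert\,p_\theta(z\vert x)\right)$$

The gap is exactly the KL divergence to the true inference model $p_\theta(z\vert x)$, which is what ties representation learning back to maximum likelihood.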
---
#### Representation learning vs max likelihood

---
#### Representation learning vs max likelihood

---
#### Representation learning vs max likelihood

---
#### Representation learning vs max likelihood

---
#### Representation learning vs max likelihood

---
#### Representation learning vs max likelihood

---
### Open questions
* under what assumptions is maximum likelihood a good criterion?
* do variational methods provide useful implicit regularization?
* can we find principled motivations for self-supervised criteria?
  * provable self-supervised learning [(Lee et al, 2020)](https://arxiv.org/abs/2008.01064)
  * analysis of contrastive schemes [(Arora et al, 2019)](https://arxiv.org/pdf/1902.09229.pdf) (sketch below)
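
An illustrative sketch (not from the original slide) of the kind of objective these analyses study: an InfoNCE-style contrastive loss in which each anchor embedding should be most similar to its own positive among all candidates in the batch. The function name and temperature value are assumptions.

```python
# InfoNCE-style contrastive loss (illustrative sketch)
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    # anchors, positives: (batch, dim) arrays of L2-normalised embeddings
    logits = anchors @ positives.T / temperature       # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # the matching (anchor, positive) pairs sit on the diagonal
    return -np.mean(np.diag(log_probs))
```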
---
## Probabilistic foundations
---
Bayes posterior: $p(\theta\vert \mathcal{D}) \propto p(\mathcal{D}\vert \theta) p(\theta)$
"cold" posterior: $p(\theta\vert \mathcal{D}) \propto p(\mathcal{D}\vert \theta)^T p(\theta)$

---

---
## deep learning challenged our existing principles and assumptions
---
## 2020s: like the 2000s
shift from bottom-up to top-down innovation
* make connections between different methods
* cast methods in a common framework
* abstract out key driving principles
* justify new methods from first principles
---
## Main themes in my research
---
### generalization and optimization
### representation learning
### probabilistic foundations
### causal inference
---
### Thank you!
---
info for prospective PhD students:
[inference.vc/phd](https://www.inference.vc/information-for-prospective-phd-students/)
{"metaMigratedAt":"2023-06-15T12:57:32.675Z","metaMigratedFrom":"YAML","title":"CUED - Roadmap for Principled Deep Learning","breaks":true,"description":"View the slide with \"Slide Mode\".","contributors":"[{\"id\":\"e558be3b-4a2d-4524-8a66-38ec9fea8715\",\"add\":4857,\"del\":1}]"}