Reading Group

Rota link

Next up

Past meetings

  1. Auto-encoding variational Bayes.
    📅 12 Oct 2020
    👤 Feri
    📄 paper and 📝 notes
  2. Auto-encoding variational Bayes. (cont'd)
    📅 26 Oct 2020
    👤 Feri
    📄 paper and 📝 notes
  3. β-VAE: learning basic visual concepts with a constrained variational framework
    📅 9 Nov 2020
    👤 Csabi
    📄 paper, 📝 notes, and colab notebook
  4. Learning Fair Representations
    📅 23 Nov 2020
    👤 Mina
    📄 paper, 📝 notes, and slides
  5. A Maximum-Likelihood Interpretation for Slow Feature Analysis
    📅 7 Dec 2020
    👤 Patrik
    📄 paper, 📝 notes, and slides
  6. The Kalman Filter
    📅 18 Jan 2021
    👤 Patrik
    📄 paper, 📝 notes and slides
  7. Equality of Opportunity in Supervised Learning
    📅 1 Feb 2021
    👤 Emese
    📄 paper and 📝 notes
  8. Deep Residual Learning for Image Recognition
    📅 15 Feb 2021
    👤 V Dóra
    📄 paper, 📝 notes
  9. Probabilistic PCA
    📅 1 Mar 2021
    👤 J Dóri
    📄 paper, 📝 notes
  10. Monte Carlo Gradient Estimation in Machine Learning
    📅 15 Mar 2021
    👤 Bea
    📄 paper, 📝 notes, slides
  11. Towards Principled Methods for Training Generative Adversarial Networks
    📅 22 Mar 2021
    👤 Martin Arjovsky (guest)
    📄 paper
  12. Wasserstein GAN
    📅 29 Mar 2021
    👤 Anna
    📄 paper, 📝 notes
  13. Policy Gradient Methods for Reinforcement Learning with Function Approximation
    📅 12 Apr 2021
    👤 S Attila
    📄 paper, 📝 notes
  14. Kernel-Predicting Convolutional Networks for Denoising Monte Carlo Renderings
    📅 26 Apr 2021
    👤 Enci
    📄 paper, 📝 notes
  15. Independent Component Analysis
    📅 10 May 2021
    👤 Patrik
    📄 notes
  16. Reformer: The Efficient Transformer
    📅 24 May 2021
    👤 Bence
    📄 paper and 📝 notes
  17. Explainable ML Overview
    📅 7 Jun 2021
    👤 Emese
    📄 paper and 📝 slides
  18. Guest Seminar: Vision Transformers and MLP mixer
    📅 21 Jun 2021
    👤 Neil Houlsby
    📄 ViT paper, MLP mixer paper
  19. Guest Seminar: Deterministic Policy Gradients, RL for Continuous Control
    📅 28 Jun 2021
    👤 Nicolas Heess
    📄 DPG paper, DDPG paper
  20. Understanding deep learning requires rethinking generalization
    📅 13 Sep 2021
    👤 Feri
    📄 arxiv
  21. Lottery Ticket Hypothesis
    📅 28 Jun 2021
    👤 Mina
    📄 arXiv
  22. Score Based Generative Modeling through Stochastic Differential Equations
    📅 28 Jun 2021
    👤 Máté
    📄 arXiv
  23. Representation Learning with Contrastive Predictive Coding
    📅 12 Nov 2021
    👤 Bea
    📄 paper and follow-up paper
  24. Guest seminar: Data-Efficient Representation Learning and Contrastive Losses
    📅 19 Nov 2021
    👤 Olivier Hénaff
    📄 Divide and Contrast paper
  25. SimCLR v1/v2 and Intriguing Properties of Contrastive Losses
    📅 26 Nov 2021
    👤 Ting Chen
    📄 SimCLR v1 paper, SimCLR v2 paper
  26. Contrastive Learning Inverts the Data Generating Process
    📅 10 Dec 2021
    👤 Eszter
    📄 paper
  27. Deep Q-learning
    📅 7 Jan 2022
    👤 Attila
    📄 paper
  28. TRPO
    📅 4 Feb 2022
    👤 Attila
    📄 paper
  29. The Q-manifesto
    📅 11 Mar 2022
    👤 Gergely Neu
    📄 logistic Q-learning paper
  30. Gauge Invariant Convolutional Networks.
    📅 18 Mar 2022
    👤 Szilvi
    📄 paper

Papers

Generative Models

VAE

  • Diederik P Kingma, Max Welling (2013) Auto-encoding variational Bayes. ICLR pdf
  • Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed and Alexander Lerchner (2017) β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework web openreview
  • Shakir Mohamed, Mihaela Rosca, Michael Figurnov, Andriy Mnih (2019) Monte Carlo Gradient Estimation in Machine Learning pdf
  • Milton Llera Montero, Casimir JH Ludwig, Rui Ponte Costa, Gaurav Malhotra, Jeffrey Bowers (2021) The Role of Disentanglement in Generalisation openreview
  • Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat (2021) VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models openreview

GANs

  • Martin Arjovsky and Léon Bottou (2017) Towards Principled Methods for Training Generative Adversarial Networks arXiv
  • Martin Arjovsky, Soumith Chintala and Léon Bottou (2018) Wasserstein GAN arXiv
  • Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. (2016) InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. NeurIPS pdf, inFERENCe
  • Tero Karras, Samuli Laine and Timo Aila (2019) A Style-Based Generator Architecture for Generative Adversarial Networks web

Maximum Likelihood, linear-Gaussian, ICA

  • Laurenz Wiskott and Terrence J. Sejnowski (2002) Slow Feature Analysis: Unsupervised Learning of Invariances pdf
  • Mike Tipping and Chris Bishop Probabilistic Principal Components Analysis pdf
  • James V. Stone Independent Component Analysis: A Tutorial Introduction pdf
  • Andrew Ng's video lecture on ICA video

Misc

  • Aapo Hyvarinen Estimation of Non-Normalized Statistical Models by Score Matching pdf
  • Geoffrey Hinton (2002) Training Products of Experts by Minimizing Contrastive Divergence web
  • Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole (2021) Score-Based Generative Modeling through Stochastic Differential Equations arXiv

Architectures

  • Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun (2016) Deep Residual Learning for Image Recognition pdf
  • Martin Arjovsky, Amar Shah, Yoshua Bengio (2015) Unitary Evolution Recurrent Neural Networks pdf

Generalization/theory

  • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals (2016) Understanding deep learning requires rethinking generalization arxiv

Fairness, privacy-preserving ML

  • Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, Cynthia Dwork (2013) Learning Fair Representations. ICML pdf
  • Moritz Hardt, Eric Price, and Nati Srebro (2016) Equality of opportunity in supervised learning. NeurIPS pdf
  • Luca Melis, Congzheng Song, Emiliano DeCristofaro, Vitaly Shmatikov (2018) Exploiting Unintended Feature Leakage in Collaborative Learning IEEE Symposium on Security and Privacy pdf

Reinforcement learning

  • Richard S. Sutton, David McAllester, Satinder Singh and Yishay Mansour (1999) Policy Gradient Methods for Reinforcement Learning with Function Approximation pdf
  • John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan and Pieter Abbeel (2015) Trust Region Policy Optimization arxiv
  • David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra and Martin Riedmiller (2014) Deterministic Policy Gradient Algorithms pdf
  • Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa and Tom Erez (2015) Learning Continuous Control Policies by Stochastic Value Gradients arXiv
  • John Schulman, Nicolas Heess, Theophane Weber and Pieter Abbeel (2015) Gradient Estimation Using Stochastic Computation Graphs arXiv
  • Théophane Weber, Nicolas Heess, Lars Buesing and David Silver (2019) Credit Assignment Techniques in Stochastic Computation Graphs arXiv

NLP

  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin (2017) Attention Is All You Need. arxiv
  • Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut (2019) ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arxiv
  • Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya (2020) Reformer: The Efficient Transformer. arxiv
  • Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu (2020) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arxiv

Suggested Content

Online lectures

  • Philipp Hennig: Probabilistic Machine Learning link

Miscellaneous (blogposts, visualizations, etc.)

  • Andrew Miller Monte Carlo Gradient Estimators and Variational Inference link
  • Yang Song Generative Modeling by Estimating Gradients of the Data Distribution link

Textbooks

  • Kevin Murphy: Machine Learning: a Probabilistic Perspective pdf (if the first link does not lead to a pdf file, try pdf-2)
  • David MacKay: Information Theory, Inference and Learning Algorithms pdf. See also David's MLSS lectures and his famous information theory lectures.
  • Marc Deisenroth, Aldo Faisal and Cheng Soon Ong. Mathematics for Machine Learning web, pdf
  • Markus Svensen, Chris Bishop: Pattern Recognition and Machine Learning pdf
  • Ian Goodfellow, Yoshua Bengio, Aaron Courville: The Deep Learning Book pdf
  • Francis Bach: Learning Theory from First Principles pdf

Concepts to learn

Notes on which concepts we should eventually learn about from the reading list.

  • independence, conditional independence, explaining away
  • matrix decompositions: PCA, matrix factorization, eigendimensions, slow feature analysis
  • variational bound, Jensen inequality, KL divergence, ELBO, EM algorithm
  • exponential family distributions, normalizing constants, conjugate priors
  • decision theory, Bayes-optimality, optimality under L2 vs L1 loss
  • stochastic gradient descent, convergence basics, generalisation properties
  • constrained optimisation: augmented Lagrangians, Lagrange multipliers
  • convex optimization: duality, Newton's method
  • natural gradients: trust region view, Fisher information matrix, K-FAC

Useful taxonomy/terms/expressions

  • Score(-function): gradient of the log-likelihood function w.r.t. the parameter vector. link
    $s(\theta) = \frac{\partial \log L(\theta)}{\partial \theta}$
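
As a small illustration of the score definition above, the sketch below computes the score of a Gaussian log-likelihood with respect to its mean and checks it against a finite-difference derivative. Everything in it is an assumption made for illustration: the choice of a Gaussian model with known variance, the function names, and the made-up data are not taken from the reading list.

```python
import math


def gaussian_log_likelihood(mu, data, sigma=1.0):
    """Log-likelihood of i.i.d. samples under N(mu, sigma^2)."""
    n = len(data)
    normaliser = -0.5 * n * math.log(2 * math.pi * sigma ** 2)
    quadratic = sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)
    return normaliser - quadratic


def gaussian_score(mu, data, sigma=1.0):
    """Score w.r.t. the mean: d/dmu log L(mu) = sum(x_i - mu) / sigma^2."""
    return sum(x - mu for x in data) / sigma ** 2


def finite_difference(f, theta, eps=1e-6):
    """Central-difference approximation of df/dtheta."""
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)


# Made-up data, purely for illustration.
data = [0.5, 1.5, 2.0, 4.0]
mu = 1.0

analytic = gaussian_score(mu, data)
numeric = finite_difference(lambda m: gaussian_log_likelihood(m, data), mu)

print(f"analytic score: {analytic:.6f}")
print(f"numeric score:  {numeric:.6f}")

# The score vanishes at the maximum-likelihood estimate (the sample mean).
sample_mean = sum(data) / len(data)
print(f"score at sample mean: {gaussian_score(sample_mean, data):.6f}")
```

The last line reflects a general property: at the maximum-likelihood estimate the score is zero, which is why setting the score to zero yields the likelihood equations.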