Reading Group
Rota link
Next up
Past meetings
- Auto-encoding variational Bayes.
📅 12 Oct 2020
👤 Feri
📄 paper and 📝 notes
- Auto-encoding variational Bayes. (cont'd)
📅 26 Oct 2020
👤 Feri
📄 paper and 📝 notes
- β-VAE: learning basic visual concepts with a constrained variational framework
📅 9 Nov 2020
👤 Csabi
📄 paper, 📝 notes, and colab notebook
- Learning Fair Representations
📅 23 Nov 2020
👤 Mina
📄 paper, 📝 notes, and slides
- A Maximum-Likelihood Interpretation for Slow Feature Analysis
📅 7 Dec 2020
👤 Patrik
📄 paper, 📝 notes, and slides
- The Kalman Filter
📅 18 Jan 2021
👤 Patrik
📄 paper, 📝 notes and slides
- Equality of Opportunity in Supervised Learning
📅 1 Feb 2021
👤 Emese
📄 paper and 📝 notes
- Deep Residual Learning for Image Recognition
📅 15 Feb 2021
👤 V Dóra
📄 paper, 📝 notes
- Probabilistic PCA
📅 1 Mar 2021
👤 J Dóri
📄 paper, 📝 notes
- Monte Carlo Gradient Estimation in Machine Learning
📅 15 Mar 2021
👤 Bea
📄 paper, 📝 notes, slides
- Towards Principled Methods for Training Generative Adversarial Networks
📅 22 Mar 2021
👤 Martin Arjovsky (guest)
📄 paper
- Wasserstein GAN
📅 29 Mar 2021
👤 Anna
📄 paper, 📝 notes
- Policy Gradient Methods for Reinforcement Learning with Function Approximation
📅 12 Apr 2021
👤 S Attila
📄 paper, 📝 notes
- Kernel-Predicting Convolutional Networks for Denoising Monte Carlo Renderings
📅 26 Apr 2021
👤 Enci
📄 paper, 📝 notes
- Independent Component Analysis
📅 10 May 2021
👤 Patrik
📄 notes
- Reformer: The Efficient Transformer
📅 24 May 2021
👤 Bence
📄 paper and 📝 notes
- Explainable ML Overview
📅 7 Jun 2021
👤 Emese
📄 paper and 📝 slides
- Guest Seminar: Vision Transformers and MLP mixer
📅 21 Jun 2021
👤 Neil Houlsby
📄 ViT paper, MLP mixer paper
- Guest Seminar: Deterministic Policy Gradients, RL for Continuous Control
📅 28 Jun 2021
👤 Nicolas Heess
📄 DPG paper, DDPG paper
- Understanding deep learning requires rethinking generalization
📅 13 Sep 2021
👤 Feri
📄 arXiv
- Lottery Ticket Hypothesis
📅 28 Jun 2021
👤 Mina
📄 arXiv
- Score Based Generative Modeling through Stochastic Differential Equations
📅 28 Jun 2021
👤 Máté
📄 arXiv
- Representation Learning with Contrastive Predictive Coding
📅 12 Nov 2021
👤 Bea
📄 paper and follow-up paper
- Guest seminar: Data-Efficient Representation Learning and Contrastive Losses
📅 19 Nov 2021
👤 Olivier Hénaff
📄 Divide and Contrast paper
- SimCLR v1/v2 and Intriguing Properties of Contrastive Losses
📅 26 Nov 2021
👤 Ting Chen
📄 SimCLR v1 paper, SimCLR v2 paper
- Contrastive Learning Inverts the Data Generating Process
📅 10 Dec 2021
👤 Eszter
📄 paper
- Deep Q-learning
📅 7 Jan 2022
👤 Attila
📄 paper
- TRPO
📅 4 Feb 2022
👤 Attila
📄 paper
- The Q-manifesto
📅 11 Mar 2022
👤 Gergely Neu
📄 logistic Q-learning paper
- Gauge Invariant Convolutional Networks.
📅 18 Mar 2022
👤 Szilvi
📄 paper
Papers
Generative Models
VAE
- Diederik P Kingma, Max Welling (2013) Auto-encoding variational Bayes. ICLR pdf
- Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed and Alexander Lerchner (2017) β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework web openreview
- Shakir Mohamed, Mihaela Rosca, Michael Figurnov, Andriy Mnih (2019) Monte Carlo Gradient Estimation in Machine Learning pdf
- Milton Llera Montero, Casimir JH Ludwig, Rui Ponte Costa, Gaurav Malhotra, Jeffrey Bowers (2021) The role of Disentanglement in Generalisation openreview
- Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat (2021) VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models openreview
GANs
- Martin Arjovsky and Léon Bottou (2017) Towards Principled Methods for Training Generative Adversarial Networks arXiv
- Martin Arjovsky, Soumith Chintala and Léon Bottou (2017) Wasserstein GAN arXiv
- Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. (2016) InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. NeurIPS pdf, inFERENCe
- Tero Karras, Samuli Laine and Timo Aila (2019) A Style-Based Generator Architecture for Generative Adversarial Networks web
Maximum Likelihood, linear-Gaussian, ICA
- Laurenz Wiskott and Terrence J. Sejnowski (2002) Slow Feature Analysis: Unsupervised Learning of Invariances pdf
- Mike Tipping and Chris Bishop Probabilistic Principal Components Analysis pdf
- James V. Stone Independent Component Analysis: A Tutorial Introduction pdf
- Andrew Ng's video lecture on ICA video
Misc
- Aapo Hyvarinen Estimation of Non-Normalized Statistical Models by Score Matching pdf
- Geoffrey Hinton (2002) Training Products of Experts by Minimizing Contrastive Divergence web
- Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole (2021) Score-Based Generative Modeling through Stochastic Differential Equations arXiv
Architectures
- Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun (2016) Deep Residual Learning for Image Recognition pdf
- Martin Arjovsky, Amar Shah, Yoshua Bengio (2015) Unitary Evolution Recurrent Neural Networks pdf
Generalization/theory
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals (2016) Understanding deep learning requires rethinking generalization arxiv
Fairness, privacy-preserving ML
- Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, Cynthia Dwork (2013) Learning Fair Representations. ICML pdf
- Moritz Hardt, Eric Price, and Nati Srebro (2016) Equality of opportunity in supervised learning. NeurIPS pdf
- Luca Melis, Congzheng Song, Emiliano DeCristofaro, Vitaly Shmatikov (2018) Exploiting Unintended Feature Leakage in Collaborative Learning IEEE Symposium on Security and Privacy pdf
Reinforcement learning
- Richard S. Sutton, David McAllester, Satinder Singh and Yishay Mansour (1999) Policy Gradient Methods for Reinforcement Learning with Function Approximation pdf
- John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan and Pieter Abbeel (2015) Trust Region Policy Optimization arxiv
- David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra and Martin Riedmiller (2014) Deterministic Policy Gradient Algorithms pdf
- Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa and Tom Erez (2015) Learning Continuous Control Policies by Stochastic Value Gradients arXiv
- John Schulman, Nicolas Heess, Theophane Weber and Pieter Abbeel (2015) Gradient Estimation Using Stochastic Computation Graphs arXiv
- Théophane Weber, Nicolas Heess, Lars Buesing and David Silver (2019) Credit Assignment Techniques in Stochastic Computation Graphs arXiv
NLP
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin (2017) Attention Is All You Need. arxiv
- Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut (2019) ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arxiv
- Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya (2020) Reformer: The Efficient Transformer. arxiv
- Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu (2020) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arxiv
Suggested Content
Online lectures
- Philipp Hennig: Probabilistic Machine Learning link
Miscellaneous (blogposts, visualizations, etc.)
- Andrew Miller Monte Carlo Gradient Estimators and Variational Inference link
- Yang Song Generative Modeling by Estimating Gradients of the Data Distribution link
Textbooks
- Kevin Murphy: Machine Learning: a Probabilistic Perspective pdf (if the first link does not lead to a PDF file, try pdf-2)
- David MacKay: Information Theory, Inference and Learning Algorithms pdf. See also David's MLSS lectures and his famous information theory lectures.
- Marc Deisenroth, Aldo Faisal and Cheng Soon Ong. Mathematics for Machine Learning web, pdf
- Markus Svensen, Chris Bishop: Pattern Recognition and Machine Learning pdf
- Ian Goodfellow, Yoshua Bengio, Aaron Courville: The Deep Learning Book pdf
- Francis Bach: Learning Theory from First Principles pdf
Concepts to learn
Notes on which concepts we should eventually learn about from the reading list.
- independence, conditional independence, explaining away
- matrix decompositions: PCA, matrix factorization, eigendimensions, slow feature analysis
- variational bound, Jensen inequality, KL divergence, ELBO, EM algorithm
- exponential family distributions, normalizing constants, conjugate priors
- decision theory, Bayes-optimality, optimality under L2 vs L1 loss
- stochastic gradient descent, convergence basics, generalisation properties
- constrained optimisation: augmented Lagrangians, Lagrange multipliers
- convex optimization: duality, Newton's method
- natural gradients: trust region view, Fisher information matrix, K-FAC
Useful taxonomy/terms/expressions
- Score(-function): gradient of the log-likelihood function w.r.t. the parameter vector. link
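
To make the score definition concrete, here is a minimal sketch for a univariate Gaussian with unknown mean `mu`. The function names and the choice of model are illustrative assumptions, not taken from any of the listed papers. The analytic score `(x - mu) / sigma^2` is compared against a finite-difference derivative of the log-likelihood. Note that the score-matching and score-based generative modelling papers on the list use "score" for the gradient of the log-density with respect to the data `x` rather than the parameters.

```python
import math


def gaussian_log_likelihood(x, mu, sigma=1.0):
    """Log-density of N(mu, sigma^2) evaluated at x (illustrative model)."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def gaussian_score(x, mu, sigma=1.0):
    """Analytic score w.r.t. the mean: d/dmu log p(x; mu, sigma) = (x - mu) / sigma^2."""
    return (x - mu) / sigma**2


def finite_difference_score(x, mu, sigma=1.0, eps=1e-5):
    """Central finite-difference approximation of the same derivative."""
    upper = gaussian_log_likelihood(x, mu + eps, sigma)
    lower = gaussian_log_likelihood(x, mu - eps, sigma)
    return (upper - lower) / (2 * eps)


if __name__ == "__main__":
    x, mu, sigma = 1.5, 0.5, 2.0
    print("analytic score:         ", gaussian_score(x, mu, sigma))
    print("finite-difference score:", finite_difference_score(x, mu, sigma))
```

The score is zero exactly when `mu` equals `x`, which for a single observation is the maximum-likelihood estimate of the mean.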