# CS 536: Machine Learning II (Deep Learning)

## News

- Mar. 18 - [Homework 2](https://hackmd.io/@Tn97A1U0QG6gBtFPXRh4oQ/S1VpZPjzL) is released
- Mar. 13 - Announcement on [Coronavirus: Course Plan Update](https://hackmd.io/@Tn97A1U0QG6gBtFPXRh4oQ/Bk517sKrI)
- Feb. 11 - [Homework 1](https://hackmd.io/@Tn97A1U0QG6gBtFPXRh4oQ/S1VpZPjzL) is released

## Overview

The principal purpose of this course is to introduce the student to the problems of machine learning through a comparative presentation of methodology and practical examples. The course particularly focuses on advanced topics including deep learning, generative models, and reinforcement learning. The course is intended for computer science students with an applied mathematics orientation and knowledge of basic machine learning, and also for students in other programs (computer and electrical engineering, statistics, mathematics, psychology) who are interested in this area of research.

## Topic Overview

The course is structured in two parts: (i) Deep Learning (DL) and (ii) advanced topics. For the DL part, the course will cover basic concepts like feed-forward neural networks, convolutional neural networks, and RNNs, as well as recent advances like attention, batch normalization, transformers, and style transfer. An important part of the course is that students will not only learn these concepts but will also build their own DL toolbox of these modules to understand them thoroughly. In the second part, we will cover advanced topics like reinforcement learning and generative models, including variational autoencoders and generative adversarial networks. The plan for the advanced topics may be adjusted depending on the progress of the first part.

## Pre-Requisites

This course is **not an introductory machine learning course** (although it will review some basic concepts during the first week). The course assumes that students already know basic machine learning concepts.

- The prerequisites: 16:198:530 or 16:198:520
- Requesting SPN [[link]](https://secure.sas.rutgers.edu/apps/special_permission/cs)
    - **If you're an undergraduate student requesting SPN**, the process is: (1) complete the online application above, (2) visit my office on Thursday at 10am to have the [graduate course request form](http://sasundergrad.rutgers.edu/images/forms/Graduate_Course_Request_Form.pdf) signed, (3) return the signed form to the CS office.

## Instructor & TA

- Instructor
    - Sungjin Ahn (sungjin.ahn@cs.rutgers.edu) at CBIM-07
- Teaching Assistant
    - Chang Chen (cc1547@scarletmail.rutgers.edu) at CBIM

## Time and Location

- When: Tuesday and Thursday, 5:00pm - 6:20pm
- Where: Pharmacy Building, room 115, Busch campus

## Office Hours

- Instructor office hour: 10:00~11:00am on Thursday (CBIM 9)
- TA office hour: 4-5pm on Friday

## Grading

- Homework (30%)
- Midterm Exam (30%)
- Final Project (40%)

## Textbooks

DL & ML General

1. [Dive into Deep Learning (DDL)](https://d2l.ai/)
2. Deep Learning (DL), Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2016
3. [Pattern Recognition and Machine Learning (PRML)](https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf), Christopher M. Bishop, Springer, 2006
4. [Natural Language Processing with Distributed Representations (NLP)](https://github.com/nyu-dl/NLP_DL_Lecture_Note/blob/master/lecture_note.pdf), Kyunghyun Cho

RL

1. [Reinforcement Learning](http://incompleteideas.net/book/RLbook2018.pdf), Richard S. Sutton and Andrew G. Barto
2. [An Introduction to Deep Reinforcement Learning](https://arxiv.org/abs/1811.12560)

Generative Models

1. [An Introduction to Variational Autoencoders](https://arxiv.org/pdf/1906.02691.pdf)
2. [A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications](https://arxiv.org/pdf/2001.06937v1.pdf)

<!--
## Other Resources

1. https://d2l.ai/ (Eng + MXNet)
2. https://github.com/dsgiitr/d2l-pytorch (Eng + PyTorch)
3. https://github.com/ShusenTang/Dive-into-DL-PyTorch (Chinese + PyTorch)
-->

## Homework

1. [HW1](https://hackmd.io/@Tn97A1U0QG6gBtFPXRh4oQ/S1VpZPjzL)
2. HW2
3. HW3

## Final Project

- [Description](https://hackmd.io/@Tn97A1U0QG6gBtFPXRh4oQ/rJ9qf0V-8)

## Lecture Slides

- See Canvas
<!-- - [Link](https://drive.google.com/drive/folders/1mvXGhnnmEHPbXaZYzisTfSqJXuZcv-_O?usp=sharing) -->

## Schedule

1. 1/21 - Course Overview, Basic Concepts for ML
2. 1/24 - Basic Concepts for ML
3. 1/28 - Linear Networks, Softmax, Multilayer Perceptrons, Regularization, Activation Functions
4. 1/30 - Backpropagation, Gradient Explosion/Vanishing, Dropout
5. 2/04 - Convolutional Networks
6. 2/06 - Modern Convolutional Networks (VGG, NiN, GoogLeNet, Batch Normalization, ResNet)
7. 2/11 - Modern Convolutional Networks, **HW1 release**
8. 2/13 - Advanced Convolutions (Deconv, Dilated Conv), RNNs
9. 2/18 - RNNs, Modern RNNs (LSTM, GRU)
10. 2/20 - Modern RNNs, Attention
    - Attention: See also [DDL](https://d2l.ai/) Chapter 10 and [NLP](https://github.com/nyu-dl/NLP_DL_Lecture_Note/blob/master/lecture_note.pdf) Chapter 6.3
11. 2/25 - Seq2Seq Attention
12. 2/27 - Multi-Head Self Attention, Transformer, GPT-2
    - [Transformer paper](https://arxiv.org/pdf/1706.03762.pdf), Blog: [Illustrated Transformer](http://jalammar.github.io/illustrated-transformer/)
13. 3/03 - Optimization for Deep Learning #1
14. 3/05 - Optimization for Deep Learning #2
15. 3/10 - Word Embedding
16. ~~3/12 ~ 3/19~~ - Spring Recess
17. 3/24 - BERT [Paper](https://arxiv.org/pdf/1810.04805.pdf)
18. 3/26 - Deep Generative Models - VAE #1
19. 3/31 - HW1 Review and HW2 Q&A
20. 4/02 - Deep Generative Models - VAE #2
21. 4/07 - Deep Generative Models - VAE #3
22. 4/09 - Deep Generative Models - VAE #4
23. 4/14 - Deep Generative Models - GAN #1
24. 4/16 - Deep Generative Models - GAN #2
25. 4/21 - Deep Generative Models - GAN #3
26. 4/23 - Deep Reinforcement Learning #1 (HW3 Release)
27. 4/28 - Deep Reinforcement Learning #2
28. 4/30 - Deep Reinforcement Learning #3
29. 5/05 - **Final Presentation I**
30. 5/07 - **Final Presentation II**

<!-- 33. (Remaining topics: Graph Neural Networks, Meta-Learning, Neural Turing Machine, DRL) -->

<!--
1. 1/21 - Logistics, Intro to DL
2. 1/24 -
3. 1/28 - DL Computation with PyTorch
4. 1/30 - Gradients, Chain Rule, Automatic Differentiation
5. 2/04 - Linear Regression, Basic Optimization
6. 2/06 - Liki
7. 2/11 - Model Selection, Weight Decay, Dropout
8. 2/13 - Numerical Stability (Gradient Explosion/Vanishing, Weight Initialization, Activation)
9. 2/18 - Convolutional Layers
10. 2/20 - LeNet, AlexNet, VGG, NiN
11. 2/25 - Inception, GoogLeNet, Batch Normalization, ResNet, DenseNet
12. 2/27 - Image Augmentation, Fine Tuning, Style Transfer
13. 3/03 - Object Detection I
14. 3/05 - Object Detection II
15. 3/10 - Sequence Models, Language Models
16. 3/12 - RNN Language Models

~~3/17~~ - Spring Recess
~~3/19~~ - Spring Recess

19. 3/24 - **Mid-Term Exam**
20. 3/26 - T-BPTT, GRU, LSTM, Bi-LSTM, Deep RNNs, Regularization in RNNs

MLP - Linear Networks, Activation Function, SoftMax
Regularization - Weight Decay, Dropout, Early Stopping, Parameter Sharing, Norm Penalty
CNNs - Convolution, Modern CNNs (VGG, NiN, GoogLeNet, Batch Normalization, ResNet)
RNNs - RNN, LSTM, GRU, BPTT, Seq2Seq, Gradient Vanishing / Explosion
Word Embedding - Word2Vec, SkipGram, Negative Sampling, GloVe, FastText, BERT
NLP - Language Modeling and Machine Translation
Attention - Seq2Seq Attention, Transformer, GPT-2
Optimization - Momentum, AdaGrad, RMSProp, Adam, Gumbel-Softmax
Computer Vision - Object Detection, Segmentation
To add - Neural Turing Machine, Gumbel-Softmax

21. 3/31 - Word2Vec, FastText, GloVe, Sentiment Analysis
22. 4/02 - Encoder-Decoder, Seq2seq, Machine Translation
23. 4/07 - Attention, Transformer, GPT-2, BERT, Graph Neural Networks
24. 4/09 - Optimization - Momentum, AdaGrad, RMSProp, Adam
25. 4/14 - VAE: VAE, VRNN, Semi-VAE, Normalizing Flow
26. 4/16 - GAN: InfoGAN, GAIL
27. 4/21 - Autoregressive: NADE, RNADE, MADE, PixelRNN, PixelCNN, WaveNet
28. 4/23 - Energy-Based Models
29. 4/28 - RL
30. 4/30 - **Final Presentation I**
31. 5/07 - **Final Presentation II**
32. Gumbel-Softmax,
-->