# CS 536: Machine Learning (Spring 2021)

---

### [Link to Canvas](https://rutgers.instructure.com/courses/111645)

### [Final Project](https://hackmd.io/eNKFsZeiRPmn00JbEs0fsA?both#Final-Project)

---

## Overview

The principal purpose of this course is to introduce students to the problems of machine learning through a comparative presentation of methodology and practical examples. In addition to the fundamental concepts of machine learning, this course places particular emphasis on the approach based on deep learning. The course is intended for computer science students with an applied mathematics orientation and basic knowledge of machine learning at the level acquired through the prerequisite courses. Students in other programs (computer and electrical engineering, statistics, mathematics, psychology) are also encouraged to take the course.

The course is structured in three parts: (i) Fundamentals of Machine Learning, (ii) Deep Learning (DL), and (iii) Advanced Topics. For the DL part, the course will cover basic concepts such as feed-forward neural networks, convolutional neural networks, and RNNs, as well as recent advances including attention mechanisms, batch normalization, and transformers. An important part of the course is that students will not only learn these concepts but will also build their own DL toolbox of these modules to understand them completely. For the advanced topics, we will cover deep reinforcement learning, deep generative models, and graph neural networks. The plan for the advanced topics can be adjusted depending on the progress of the course. This course requires strong programming skills in Python and NumPy.
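To give a flavor of what a "toolbox module" might look like, here is a minimal sketch of a feed-forward network in PyTorch. The class name, layer sizes, and structure are illustrative assumptions, not the course's actual assignment template:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """A minimal two-layer feed-forward network (illustrative only)."""
    def __init__(self, in_dim=784, hidden_dim=128, out_dim=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        # Flatten everything beyond the batch dimension, e.g. 28x28 images.
        return self.net(x.view(x.size(0), -1))

# Quick smoke test on a random batch.
model = MLP()
logits = model(torch.randn(32, 784))
print(logits.shape)  # torch.Size([32, 10])
```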
---

## Pre-Requisites

1. Students must be familiar with **Python** programming. The homework will be based on [**PyTorch**](https://pytorch.org/), a Python framework for deep learning. Some prior experience with PyTorch will therefore be helpful, although the course will include a short introductory lecture on using it. Students should also set up a PyTorch programming environment (see the sanity check right after this list).
   - [Getting Started With Pytorch In Google Collab With Free GPU](https://hackernoon.com/getting-started-with-pytorch-in-google-collab-with-free-gpu-61a5c70b86a)
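A minimal sanity check like the following (an illustrative sketch, not an official course script) can confirm that the installation works and whether a GPU is visible:

```python
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA GPU is visible (e.g., on Colab)

# A tiny tensor computation to confirm everything works end to end.
x = torch.randn(3, 3)
print(x @ x.t())
```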
---

## Instructor & TA

- Instructor
  - Sungjin Ahn (sungjin.ahn@cs.rutgers.edu) at CBIM-07
  - Emails are usually answered within two business days.
- TA
  - Fei Deng (fei.deng@rutgers.edu)

---

## Time and Location

- When: Monday and Wednesday, 5:00pm - 6:20pm
- The course is remotely instructed via Zoom. The link for the Zoom meeting is posted on the course Canvas site.

---

## Office Hours

- Instructor
  - Every Tuesday, 9 am - 10 am, via Zoom. The Zoom link is posted on the course Canvas site.
  - Because multiple students may be waiting during the office hour, an appointment is required (please email the instructor in advance to schedule a slot within the office hour).
- TA
  - Friday, 4pm - 5pm

---

## Technology Requirements

- A computer, mic/audio system, and Internet access to participate in the lectures via Zoom
- Please visit the Rutgers Student Tech Guide page for resources available to all students. If you do not have the appropriate technology for financial reasons, please email the Dean of Students (deanofstudents@echo.rutgers.edu) for assistance. If you are facing other financial hardships, please visit the Office of Financial Aid at https://financialaid.rutgers.edu/.

---

## Computing Resources

- Rutgers iLab Servers: https://report.cs.rutgers.edu/nagiosnotes/iLab-machines.html
- Google Colab: https://colab.research.google.com/

---

## Required Books and Materials

- There is no required book for this course. The following books may nevertheless be useful as references on relevant topics.
- Reference Books
  1. Dive into Deep Learning (https://d2l.ai/)
  1. Deep Learning (DL), Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2016, ISBN: 9780262035613
  1. Pattern Recognition and Machine Learning (PRML), Christopher M. Bishop, Springer, 2006, ISBN: 9780387310732
  1. Natural Language Processing with Distributed Representations (NLP), Kyunghyun Cho, https://arxiv.org/abs/1511.07916
- PyTorch Tutorial
  - [Deep Learning with PyTorch: A 60 Minute Blitz](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)
- Reinforcement Learning
  - [Reinforcement Learning: An Introduction (2nd ed.)](http://incompleteideas.net/book/RLbook2020.pdf)
  - [Lectures by David Silver](https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLqYmG7hTraZBiG_XpjnPrSNw-1XQaM_gB&index=2&t=0s)

---

## Course Structure and Requirements

#### Grading

- Assignments 1-4 (50%)
- Final Project (20%)
- Quizzes (30%)

**The grading proportion of the assignments may change.**

#### The final project

The final project (if we do it) is a 1-3 person team project. It consists of (1) a proposal, (2) a final presentation, and (3) a final report. More details will be explained in class; see also the [final project](https://hackmd.io/@Tn97A1U0QG6gBtFPXRh4oQ/rJ9qf0V-8) page.

---

## Assignments
1. Assignment 1: Due by Feb 3

---

## Lecture Slides

- Available in Canvas, along with the lecture recordings

---

## Tentative Schedule

#### Week 1
- 1/20 - Course Overview, Basic Concepts

#### Week 2
- 1/25 - Basic Concepts (HW1 Release)
- 1/27 - Basic Concepts

#### Week 3
- 2/1 - Basic Concepts
- 2/3 - Linear Models for Regression

#### Week 4 (HW2-1)
- 2/8 - Probabilistic Linear Regression and Linear Models for Classification & **Quiz 1**
- 2/10 - Multi-Layer Perceptrons (MLP) (HW2-1 Release)

#### Week 5 (HW2-2)
- 2/15 - MLP 2
- 2/17 - CNN (HW2-2 Release)

#### Week 6
- 2/22 - Modern CNN
- 2/24 - Modern CNN & RNN (Quiz 2)

#### Week 7
- 3/1 - RNN
- 3/3 - RNN

#### Week 8 (HW3-1)
- 3/8 - Gated RNN
- 3/10 - Attention

#### Week 9
- 3/15 - Spring Recess
- 3/17 - Spring Recess

#### Week 10
- 3/22 - Attention & Transformer
- 3/24 - Attention & Transformer

#### Week 11 (Quiz 3) (HW3-2)
- 3/29 - Optimization for Deep Learning
- 3/31 - (Quiz 3, HW3-2 Release)

#### Week 12
- 4/5 - Deep Reinforcement Learning
- 4/7 - Deep Reinforcement Learning

#### Week 13 (HW4)
- 4/12 - Deep Reinforcement Learning
- 4/14 - Deep Reinforcement Learning

#### Week 14
- 4/19 - Deep Generative Models
- 4/21 - Deep Generative Models

#### Week 15 (Quiz 4)
- 4/26 - Deep Generative Models
- 4/28 - TBD

#### Week 16
- 5/3 - Final Presentation