Machine Learning Resources
===

Here are some awesome (at least, to me) resources for machine learning and deep learning. They are all **super techy**, so you almost certainly don't want to go through them all. Pick whatever you are interested in, or ignore this thing altogether if you don't care about the dirty underlying mechanics and mathematics anyway.

Going through the exercises and programming assignments is not required ~~because we are not graded anyway~~, though definitely recommended. Some courses use MATLAB or Lua/Torch, so if you want to do their assignments you'll have to learn new programming language(s). Mostly you will be fine if you know what to Google/StackOverflow for, though.

If you plan to take *Deep Learning (DS-GA 1008)*, you will probably have to ramp things up a little bit, because... you know, you get the idea.

Also, I have NEVER watched those lectures: the coverage lists are all taken from the syllabi, and everything else is pretty much my first impression of these courses after looking at some of the slides, checking their reputation on the web, and hearing from my undergrad schoolmates and professors. Treat it as a reference only; it may not be accurate.

I heard that we may get a chance to watch lecture videos together at CDS, but I'm not sure whether that will actually happen...

Book materials should already be recommended in the syllabi.

Anyway, here it goes.

#### Not Necessarily Deep Learning

1. **[Machine Learning](https://www.coursera.org/learn/machine-learning)**

   *Lecturer*: Andrew Y. Ng, Stanford University, Baidu

   *Coverage*:
   * Linear Models
   * Neural Networks
   * K-Means
   * PCA
   * Anomaly Detection using Gaussian Distribution
   * Recommender Systems
   * Large Scale Machine Learning including Stochastic Gradient Descent
   * Best Practices
   * and more

   *Programming*: MATLAB or GNU Octave.

   Probably the most popular course on Coursera, I guess? The content looks close to what we've learned in Intro, but definitely with more math.
   No idea whether it duplicates *Machine Learning (DS-GA 1003)* or not.

2. **[Probabilistic](https://www.coursera.org/learn/probabilistic-graphical-models) [Graphical](https://www.coursera.org/learn/probabilistic-graphical-models-2-inference) [Models](https://www.coursera.org/learn/probabilistic-graphical-models-3-learning)**

   **Yes, there are three separate links in the title.**

   *Lecturer*: Daphne Koller, Stanford University

   *Coverage*:
   * Bayesian Networks (a.k.a. Directed Graphical Models)
   * Template Models
     * Temporal Models, including Hidden Markov Models
     * Plate Models, e.g. Latent Dirichlet Allocation
   * Markov Random Fields (a.k.a. Undirected Graphical Models)
   * Decision Theory
   * *Maximum A Posteriori* Inference
   * Variable Elimination
   * Belief Propagation
   * Sampling, in particular *Markov Chain Monte Carlo (MCMC)*
   * Parameter Estimation
   * Structure Learning

   *Programming*: MATLAB or GNU Octave.

   Not deep (I mean, *deep* as in *deep learning*), and probably useful in practice if you don't have access to beefy computational resources to play with neural networks. I fooled around with PGMs a little bit before, and I feel that they are more... business-oriented than neural nets (which seem too heavy for most business affairs)? I guess that's why we have Graphical Models as a required course rather than Deep Learning.

   There's actually [a book with the same name written by the same professor](http://pgm.stanford.edu/).

   No idea whether this duplicates *Inference and Representation (DS-GA 1005)*.

3. **[Convex Optimization](http://www.stat.cmu.edu/~ryantibs/convexopt/)**

   Note: the audio quality is quite horrible.

   *Lecturer*: Ryan Tibshirani, CMU

   *Coverage*: It is organized into four sections, plus one special topics section:
   * Basics
     * Convexity Analysis
   * First-order methods
     * Gradient Descent
     * Subgradients
     * Proximal Gradient Descent
   * Optimality and Duality
     * Duality, mainly Lagrangian Duality
     * Karush-Kuhn-Tucker (KKT) Conditions
   * Second-order methods, including
     * Newton's Method
     * Barrier Method
     * Quasi-Newton Methods, including L-BFGS
   * Other fancy stuff, including
     * Integer programming
     * Coordinate descent

   *Programming*: Not specific

   Maaaaaaaaaaaaaaaaaaaaaath. Hooooooooooooooooooomework. Good for mathy people and those who really want to understand optimization itself (e.g. how an SVM is *exactly* trained).

   One can also take a look at [EE364a](http://stanford.edu/class/ee364a) and [EE364b](http://stanford.edu/class/ee364b/videos.html), offered by Stephen Boyd at Stanford. They have way more material than this version. Also, the CMU version seems to be more closely related to machine learning.

#### Deep Learning

All these topics are more or less bleeding-edge.

1.
   **[Convolutional Neural Networks for Visual Recognition](https://www.youtube.com/watch?v=F-g0-6_RRUA&list=PLLvH2FwAQhnpj1WEB-jHmPuUeQ8mX-XXG&index=1)**

   *Syllabus*: http://cs231n.stanford.edu/syllabus.html

   *Lecturer*: Andrej Karpathy & Fei-Fei Li, Stanford University

   *Coverage*:
   * VERY brief k-NN and Linear Classification
   * Stochastic Gradient Descent
   * Image Features and High-level Representations
   * Neural Networks: Basics, including
     * Backpropagation
     * Activation functions
     * Initialization
     * Hyperparameter selection, including Grid Search and Random Search
   * Neural Networks: Advanced, including
     * Fancier Gradient Descents
     * Training Practices and Babysitting, including Early Stopping
     * Batch Normalization
     * Ensembles & Dropout
   * Convolutional Neural Networks: Architecture
   * Spatial Localization and Object Detection
   * Convolutional Neural Networks: Understanding, including
     * Visualization
     * Deep Dream
     * Artistic Style Transfer (like, painting *[The Scream](https://en.wikipedia.org/wiki/The_Scream)* in *[impressionist](https://en.wikipedia.org/wiki/Impressionism)* style?)
     * Adversarial Fooling, e.g. fooling the model into recognizing a panda as a gibbon ([true story, academic paper ahead](https://arxiv.org/pdf/1412.6572v3.pdf))
   * Recurrent Neural Networks, including LSTM
   * Language Modeling and Image Captioning
   * Convolutional Neural Networks: Practice
   * Attention mechanisms, including
     * Segmentation
     * Soft Attention Models
     * Spatial Transformer Networks
   * Videos
   * Unsupervised Learning

   *Programming*: Python + numpy. Also there will be an overview of:
   * Caffe (C++, but you usually don't need to actually *write* C++)
   * Theano (Python)
   * Torch (Lua)
   * TensorFlow (Python)

   For *deep learners*, especially those interested in computer vision, this course is a must-have. Very practical, and has a good reputation among my undergrad deep-learner schoolmates.
   This course takes you from the most basic stuff to nearly cutting-edge work (not "the most" cutting-edge, because deep learning is evolving too fast). Later topics in this course are significantly biased toward CNNs and computer vision.

2. **[Neural Networks](https://www.coursera.org/learn/neural-networks)**

   *Lecturer*: Geoffrey Hinton, University of Toronto

   *Coverage*:
   * Feed-forward Neural Networks and Backpropagation
   * Relationship Modeling, including Language Modeling
   * Convolutional Neural Networks
   * Optimization, including Stochastic Gradient Descent and Fancier Methods
   * Recurrent Neural Networks, including
     * Difficulty of Training RNNs: Exploding and Vanishing Gradients
     * LSTM
     * Hessian-free Optimizer
     * Echo State Networks
   * Regularization, including
     * Weight penalties (L1/L2), their Bayesian interpretations, and weight constraints
     * Adding Gaussian noise to inputs/weights
   * Ensemble methods, including
     * (Approximate) Full Bayesian Learning, i.e. computing the probability of every parameter configuration, or approximating it
     * Dropout
   * Energy Models
     * Hopfield Nets
     * Boltzmann Machines
     * Restricted Boltzmann Machines
     * Deep Belief Nets
   * Autoencoders

   *Programming*: Not sure whether it uses Python or MATLAB.

   Not a very bleeding-edge course (2012). Relatively "balanced" compared to CS231n above and CS224d below.

   Among the topics, *energy models* are not discussed in detail in the other courses I found ~~and the ones I have no experience with~~. In particular, RBMs and DBNs are quite versatile and can be applied to a variety of tasks. (It seems that Prof. LeCun's course also introduces some energy-based models: see below.)

   This course also explains different models and ideas from a cognitive science perspective (Prof. Hinton himself is also a cognitive scientist), and frequently draws references from biology, neuroscience, etc., so it may be hard to understand sometimes. Quite interesting, though.

3.
   **[Machine Learning](https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/)**

   *Lecturer*: Nando de Freitas, University of Oxford, DeepMind

   *Coverage*:
   * Maximum Likelihood
   * Linear Models
   * Regularization
   * Optimization, including fancier first-order gradient methods and second-order ones
   * Neural Networks
     * "Normal" Neural Networks
     * Convolutional Neural Networks
     * Siamese Networks
     * Recurrent Neural Networks including LSTM
   * Reinforcement Learning with
     * Policy Gradient (e.g. REINFORCE algorithms)
     * Action-value Approximation (e.g. Q-Learning)

   *Programming*: Lua + Torch.

   Although there's no "deep" in the course title, the content seems to be heavily biased toward deep learning. Pretty basic, though, and it gives you an understanding of how the building blocks of deep learning work. Not very cutting-edge but not terrible.

   Also involves [Reinforcement Learning](https://en.wikipedia.org/wiki/Reinforcement_learning), which is an interesting topic (imagine that you build a model which *learns how to play games*).

   The 2014-2015 lectures include two guest talks, both about excellent work.

4. **[Deep Learning for Natural Language Processing](https://www.youtube.com/watch?v=kZteabVD8sU&index=1&list=PLmImxx8Char9Ig0ZHSyTqGsdhb9weEGam)**

   **Please read the first comment in the first video for a better experience.**

   *Previous year*: https://www.youtube.com/watch?v=sU_Yu_USrNc

   *Syllabus*: http://cs224d.stanford.edu/syllabus.html

   *Lecturer*: Richard Socher, Stanford University, MetaMind

   *Coverage*:
   * Word Embeddings a.k.a. Word Vector Representations, including word2vec and GloVe.
     * This topic also covers Noise Contrastive Estimation (a.k.a.
       Negative Sampling)
   * Language Modeling, including Continuous Bag-of-Words (CBOW) and Skip-gram
   * Neural Networks and Backpropagation
   * Named Entity Recognition
   * Training practice, including:
     * Gradient checks
     * Preventing overfitting
     * Regularization
   * Recurrent Neural Networks, including
     * Difficulty in Training: Exploding and Vanishing Gradients
     * Bi-directional RNNs
     * LSTM and GRU
   * Machine Translation
   * Recursive Neural Networks
     * Parsing
     * Sentiment Analysis
   * Convolutional Neural Networks on Sentences

   *Programming*: Python + TensorFlow.

   This course seems to be... much harder than the courses above. I watched the first video, and the majority of the students had already taken CS231n, so I guess you should have some basic understanding of how neural networks work as well. CS231n, Prof. Hinton's course, or Prof. de Freitas' course are probably good starters.

   This course also offers a list of super cool papers, and the demos in the class are awesome to look at.

   Obviously this course is very biased toward natural language processing.

5. **[Deep Learning (Videos unavailable)](http://cilvr.cs.nyu.edu/doku.php?id=courses:deeplearning2015:start)**

   *Lecturer*: Yann LeCun, NYU, Facebook AI Research

   *Coverage*: Uhh, we are taking this course now.

6. **[Deep Learning](http://www.deeplearningbook.org/)**

   **This is actually a book. Full content available there.**

   *Authors*: Ian Goodfellow, Yoshua Bengio, Aaron Courville, Université de Montréal

   *Coverage*: You can see the table of contents there, but basically it is organized into three parts:
   * Basics (not very "deep")
   * Deep Learning Building Blocks and Practices
   * Bleeding-edge Research Directions (hard)

   Very concise, and pretty much covers everything about deep learning you need to know. Also the book is very well-organized. For those interested in more cutting-edge research, this book also provides an extensive bibliography which you can dive further into.
   The whole book explains deep learning from a mathematical perspective: for example, why L1 regularization gives sparse results, or why a particular activation function or loss function is chosen. Interesting to see how the flavor differs from, e.g., Prof. Hinton's course.

#### Reinforcement Learning

There are not many lecture videos about reinforcement learning.

1. **[Reinforcement Learning](https://www.youtube.com/watch?v=2pWv7GOvuf0)**

   *Syllabus*: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html

   *Lecturer*: David Silver, University College London, DeepMind

   *Coverage*:
   * Fundamentals
     * Markov Decision Process
     * Dynamic Programming: Solving a **known** MDP
   * Unknown MDP (Partially Observable MDP)
     * Model-free estimation
     * Model-free control
   * Value-Function Approximation, including
     * Q-Learning
     * SARSA
     * TD(λ)
     * Deep Q-Network (Yes, I'm talking about [this guy playing Atari games (Nature paper ahead)](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html))
   * Policy Gradient, including
     * REINFORCE Algorithms
   * Exploration vs Exploitation, including
     * Multi-armed Bandits

   *Programming*: Not specific

   These lectures are given by the main inventor of **AlphaGo**. I feel that the content is pretty solid. Also it's not necessarily deep, although contemporary research in reinforcement learning often borrows deep learning architectures. ~~Good for gamers.~~

   Side note: the introduction video showed some fantastic demos.

2. **[Deep Reinforcement Learning](http://rll.berkeley.edu/deeprlcourse/)**

   **NOTE**: This is a LIVE course. Stay tuned.

   *Lecturer*: Sergey Levine, John Schulman, Chelsea Finn, UC Berkeley

   *Coverage*: Basically split into 4 sections:
   * Imitation Learning
   * Basic RL
   * Value-Function Approximation (Q-Learning etc.)
   * Policy Gradient methods

   *Programming*: Not specific, but they recommend Python + TensorFlow/Theano

   I have not watched the videos yet, but the syllabus suggests that the content is much more advanced than David Silver's lectures (and indeed, it calls the latter *introductory*). If you are interested in reinforcement learning, it is probably a good idea to finish David Silver's lectures first.
