1. Vectors and Matrices: Understanding the concepts of vectors and matrices is fundamental. This includes operations such as addition, subtraction, scalar multiplication, dot product, cross product, matrix multiplication, and matrix inversion.
2. Vector Spaces: Study the properties and characteristics of vector spaces, including linear independence, basis and dimension, subspaces, span, and linear transformations.
3. Eigenvalues and Eigenvectors: Eigenvalues and eigenvectors play a crucial role in many machine learning algorithms, such as dimensionality reduction techniques like Principal Component Analysis (PCA) and spectral clustering. Learn how to compute them and where they are applied.
4. Orthogonality and Inner Products: Understand the concept of orthogonality, orthogonal bases, and orthogonal projections. Inner product spaces are essential in AI for techniques like support vector machines (SVMs) and kernel methods.
5. Singular Value Decomposition (SVD): SVD is a powerful matrix factorization technique used in various AI applications, including dimensionality reduction, image compression, collaborative filtering, and recommender systems.
6. Matrix Factorizations: Besides SVD, be familiar with other matrix factorizations, such as LU decomposition, QR decomposition, and Cholesky decomposition. These factorizations are used for solving systems of linear equations and optimizing matrix operations.
7. Matrix Calculus: Develop a working knowledge of matrix calculus, including differentiation and optimization of functions involving matrices. This is important for understanding and implementing various machine learning algorithms.
8. Linear Systems and Solutions: Study techniques for solving linear systems of equations, including Gaussian elimination, LU decomposition, and solving least squares problems.
9. Convex Optimization: Understand the basics of convex optimization, including convex sets, convex functions, and optimization algorithms like gradient descent. Convex optimization is fundamental in training machine learning models and solving optimization problems.
10. Applications to AI: Finally, explore how linear algebra concepts and techniques are applied in specific areas of AI, such as deep learning, computer vision, natural language processing, and recommendation systems. Understanding the linear algebra foundations behind these applications will help you develop a deeper intuition and enable you to design and implement more advanced algorithms.
11. Homeworks in LinAl: https://ocw.mit.edu/courses/18-06-linear-algebra-spring-2010/pages/assignments/ (do the first and last exercise from every section; download the textbook PDF [here](https://students.aiu.edu/submissions/profiles/resources/onlineBook/Y5B7M4_Introduction_to_Linear_Algebra-_Fourth_Edition.pdf))
12. Homeworks from here: https://ocw.mit.edu/courses/6-801-machine-vision-fall-2020/pages/calendar/
13. Computer Graphics:
1. Bezier Curves and Splines
2. Curves Properties and Conversion, Surface Representation
3. Coordinates and Transformations *
4. Hierarchical Modeling
5. Color *
6. Basics of Computer Animation—Skinning/Enveloping *
7. Particle Systems and ODEs
8. Implicit Integration, Collision Detection
9. Collision Detection and Response
10. Ray Casting and Rendering
11. Ray Tracing
12. Acceleration Structures for Ray Casting
13. Shading and Material Appearance
14. Texture Mapping and Shaders
15. Sampling, Aliasing, and Mipmaps
16. Global Illumination and Monte Carlo
17. Image-Based Rendering and Lighting
18. Output Devices
19. Graphics Pipeline and Rasterization
20. Real-time Shadows
14. Bare metal MLOps (I will give you a server, you will deploy)
15. https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops#courses
16. Neural Networks
1. Introduction:
- The New Connectionism (1988)
- On Alan Turing's Anticipation of Connectionism
- McCulloch and Pitts (1943) paper
- Rosenblatt: The perceptron
- Bain: Mind and Body
- Hebb: The Organization of Behavior
2. Neural Networks Basics:
- Neural Nets As Universal Approximators
- Shannon (1949)
- Boolean Circuits
- On the Bias-Variance Tradeoff
3. Learning and Empirical Risk Minimization:
- The problem of learning, Empirical Risk Minimization
- Widrow and Lehr (1992)
- Adaline and Madaline
- Convergence of perceptron algorithm
- Threshold Logic
- TC (Complexity)
- AC (Complexity)
4. Empirical Risk Minimization and Gradient Descent:
- Empirical risk minimization and gradient descent
- Training the network: Setting up the problem
- Werbos (1990)
- Rumelhart, Hinton, and Williams (1986)
5. Backpropagation:
- Backpropagation
- Calculus of Backpropagation
- Werbos (1990)
- Rumelhart, Hinton, and Williams (1986)
6. Convergence Issues:
- Convergence issues
- Loss Surfaces
- Momentum
- Backprop fails to separate where perceptrons succeed, Brady et al. (1989)
- Why Momentum Really Works
7. Optimization:
- Optimization
- Batch Size, SGD, Mini-batch, second-order methods
- Momentum, Polyak (1964), Nesterov (1983)
- Derivatives and Influences
8. Optimizers and Regularizers:
- Optimizers and Regularizers
- Choosing a divergence (loss) function
- Batch normalization
- Dropout
- Derivatives and Influence Diagrams
- AdaGrad: Duchi, Hazan, and Singer (2011); Adam: A Method for Stochastic Optimization, Kingma and Ba (2014)
9. Shift Invariance and Convolutional Neural Networks:
- Shift invariance and Convolutional Neural Networks
10. Models of Vision and Convolutional Neural Networks
11. Learning in Convolutional Neural Networks:
- Learning in Convolutional Neural Networks
- CNN Explainer
12. Learning in CNNs:
- Learning in CNNs
- Transpose Convolution
- CNN Stories
13. Time Series and Recurrent Networks:
- Time Series and Recurrent Networks
- Fahlman and Lebiere (1990)
- How to compute a derivative, extra help for HW3P1 (*.pptx)
14. Stability and Memory, LSTMs:
- Stability and Memory, LSTMs
- Bidirectional Recurrent Neural Networks
15. Sequence Prediction:
- Sequence Prediction
- Alignments and Decoding
- LSTM
16. Sequence Prediction:
- Sequence Prediction
- Connectionist Temporal Classification (CTC) - Blanks and Beam-search
17. Language Models and Sequence to Sequence Prediction:
- Language Models
- Sequence To Sequence Prediction
- Labelling Unsegmented Sequence Data with Recurrent Neural Networks
18. Sequence to Sequence Methods and Attention:
- Sequence To Sequence Methods
- Attention
- Attention Is All You Need
- The Annotated Transformer - Attention is All You Need paper, but annotated and coded in PyTorch!
19. Transformers and GNNs:
- Transformers and GNNs
- A Comprehensive Survey on Graph Neural Networks
20. Learning Representations and Autoencoders:
- Learning Representations
- Autoencoders
21. Variational Autoencoders:
- Variational Autoencoders
- Tutorial on VAEs (Doersch)
- Auto-Encoding Variational Bayes (Kingma and Welling)
22. Variational Autoencoders II:
- Variational Autoencoders II
23. Generative Adversarial Networks:
- Generative Adversarial Networks, 1
24. Generative Adversarial Networks:
- Generative Adversarial Networks, 2
25. Hopfield Nets and Autoassociators:
- Hopfield Nets and Autoassociators
26. Hopfield Nets and Boltzmann Machines:
- Hopfield Nets and Boltzmann Machines
*=(Optional - you already know)
**Project ideas**
1. NLP toolkit for Zig and Chapel, or Zig APIs that wrap Python's NLTK
2. Dataframe library for Zig
3. Physics Engine from scratch in C
4. Deploying a simple model to DigitalOcean using Flask
5. Implement a few research papers on collision detection, etc., and benchmark them against each other
6. Write a tool to easily download, train, and deploy your own Llama model to DigitalOcean, exposing it through an API
7. Stereo reconstruction from two photos into a 3D OBJ model
**Write articles on these**
1. https://arxiv.org/pdf/2307.09477.pdf
2. https://arxiv.org/pdf/2307.09426.pdf
3. https://arxiv.org/pdf/2307.08715.pdf
4. https://arxiv.org/pdf/2307.09042.pdf
5. https://arxiv.org/pdf/2307.09036.pdf
6. https://arxiv.org/pdf/2307.08849.pdf
7. https://arxiv.org/pdf/2307.08558
8. https://arxiv.org/pdf/2307.08810.pdf
9. Federated learning
**Getting better at coding**
If you want to implement deep learning functionality from scratch, without external libraries like PyTorch or TensorFlow, it will require a significant amount of work, as those libraries provide optimized implementations and abstract away many low-level details. However, it's still possible to implement some basic functionality. Here are a few key steps to get started (minimal illustrative sketches follow the list):
1. Tensor Operations:
- Create a `Tensor` class: Implement a class that represents a tensor and includes methods for basic tensor operations like element-wise addition, subtraction, multiplication, and division.
2. Neural Network Components:
- Implement a `Linear` layer: Create a class that represents a linear layer and includes methods for initializing weights, performing the forward pass, and updating weights during backpropagation.
- Implement activation functions: Define classes or functions for commonly used activation functions like ReLU, sigmoid, and tanh.
3. AutoGrad and Optimization:
- Implement automatic differentiation: Define a mechanism to compute gradients of functions with respect to their inputs, such as the chain rule for backpropagation.
- Implement optimization algorithms: Create classes or functions for optimization algorithms like stochastic gradient descent (SGD) or Adam. These should include methods for updating the model parameters based on computed gradients.
4. Training and Evaluation:
- Implement a training loop: Create a loop that performs forward and backward passes for each training sample, computes the loss, updates the parameters using the chosen optimizer, and tracks the training progress.
- Implement evaluation metrics: Define functions or classes to calculate evaluation metrics such as accuracy or mean squared error.
5. Data Handling:
- Load and preprocess data: Implement functions or classes to load data from files or other sources, and preprocess it (e.g., normalization, shuffling, or batching) before feeding it into the neural network.
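To make step 1 concrete, here is a minimal sketch of a `Tensor` class in pure Python. The class name and methods are purely illustrative (not any library's API); a real implementation would add shapes, broadcasting, and vectorized storage.

```python
# Step 1 sketch: a Tensor holding a flat list of floats with elementwise arithmetic.
class Tensor:
    def __init__(self, data):
        self.data = list(data)

    def _elementwise(self, other, op):
        # Support Tensor-Tensor and Tensor-scalar operations.
        if isinstance(other, Tensor):
            assert len(self.data) == len(other.data), "shape mismatch"
            return Tensor(op(a, b) for a, b in zip(self.data, other.data))
        return Tensor(op(a, other) for a in self.data)

    def __add__(self, other): return self._elementwise(other, lambda a, b: a + b)
    def __sub__(self, other): return self._elementwise(other, lambda a, b: a - b)
    def __mul__(self, other): return self._elementwise(other, lambda a, b: a * b)
    def __truediv__(self, other): return self._elementwise(other, lambda a, b: a / b)

    def __repr__(self):
        return f"Tensor({self.data})"

# Usage
a = Tensor([1.0, 2.0, 3.0])
b = Tensor([4.0, 5.0, 6.0])
print(a + b, a * 2.0)  # Tensor([5.0, 7.0, 9.0]) Tensor([2.0, 4.0, 6.0])
```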
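For steps 2 and 3, a sketch of a `Linear` layer and a `ReLU` activation with an explicit forward pass and a hand-derived backward pass (the chain rule written out by hand). For brevity, the SGD parameter update is folded into `backward`; a separate optimizer class would normally own it. Again, all names and signatures are assumptions made for illustration, not an established API.

```python
import random

class Linear:
    """A fully connected layer y = W x + b with a hand-written backward pass."""
    def __init__(self, in_features, out_features):
        # Small random weights and zero biases.
        self.W = [[random.uniform(-0.1, 0.1) for _ in range(in_features)]
                  for _ in range(out_features)]
        self.b = [0.0] * out_features

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return [sum(w * xj for w, xj in zip(row, x)) + bi
                for row, bi in zip(self.W, self.b)]

    def backward(self, grad_out, lr):
        # Chain rule: dL/dx_j = sum_i W[i][j] * dL/dy_i
        grad_in = [sum(self.W[i][j] * grad_out[i] for i in range(len(self.W)))
                   for j in range(len(self.x))]
        # dL/dW[i][j] = dL/dy_i * x_j and dL/db_i = dL/dy_i; plain SGD update in place.
        for i, g in enumerate(grad_out):
            self.b[i] -= lr * g
            for j, xj in enumerate(self.x):
                self.W[i][j] -= lr * g * xj
        return grad_in

class ReLU:
    """Elementwise max(0, x); the gradient passes through only where the input was positive."""
    def forward(self, x):
        self.x = x
        return [max(0.0, v) for v in x]

    def backward(self, grad_out, lr):
        # lr is unused here; kept so every layer shares the same backward signature.
        return [g if v > 0 else 0.0 for g, v in zip(grad_out, self.x)]
```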
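And for steps 4 and 5, a sketch of a mean-squared-error loss and a plain per-sample SGD training loop on a tiny synthetic regression task (learning a simple linear function of two inputs). It reuses the `Linear` and `ReLU` sketches above; the dataset, learning rate, and layer sizes are arbitrary choices made only for illustration.

```python
import random

def mse_and_grad(pred, target):
    # Mean squared error and its gradient with respect to the predictions.
    n = len(pred)
    loss = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    grad = [2.0 * (p - t) / n for p, t in zip(pred, target)]
    return loss, grad

random.seed(0)
# Synthetic regression data: 200 random points labelled by a fixed linear rule.
inputs = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
data = [(x, [2.0 * x[0] - x[1]]) for x in inputs]

model = [Linear(2, 8), ReLU(), Linear(8, 1)]
lr = 0.05
for epoch in range(20):
    total_loss = 0.0
    for x, y in data:
        out = x
        for layer in model:            # forward pass, layer by layer
            out = layer.forward(out)
        loss, grad = mse_and_grad(out, y)
        total_loss += loss
        for layer in reversed(model):  # backward pass in reverse order
            grad = layer.backward(grad, lr)
    if epoch % 5 == 0:
        print(f"epoch {epoch}: mean loss {total_loss / len(data):.4f}")
```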
Keep in mind that implementing all these functionalities from scratch can be a complex and time-consuming task. Additionally, it may not be as efficient or optimized as using established deep learning libraries. However, it can be a valuable learning experience to gain a deeper understanding of the inner workings of deep learning algorithms.
### Reading list
Git Re-Basin: https://arxiv.org/abs/2209.04836 (an extremely good paper)
Editing Models with Task Arithmetic: https://arxiv.org/pdf/2212.04089.pdf
Ologs (Spivak): https://math.mit.edu/~dspivak/informatics/olog.pdf
GLaM: https://arxiv.org/pdf/2112.06905.pdf
DEMix Layers: https://arxiv.org/pdf/2108.05036.pdf
Branch-Train-Merge: https://arxiv.org/pdf/2208.03306.pdf
Cluster-BTM: https://arxiv.org/abs/2303.14177
SMEAR: https://arxiv.org/pdf/2306.03745.pdf
TIES-Merging: https://arxiv.org/abs/2306.01708
AdapterFusion: https://arxiv.org/pdf/2005.00247.pdf
KNN Zero-Shot Inference: https://suchin.io/assets/knnprompt.pdf
Cross-Task Skills with Task-Level Mixture-of-Experts: https://arxiv.org/abs/2205.12701
Mixture-of-Supernets: https://arxiv.org/pdf/2306.04845.pdf
Sparse Upcycling: https://arxiv.org/abs/2212.05055
AdaMix: https://www.microsoft.com/en-us/research/uploads/prod/2022/05/Mixture_of_Adaptations_EMNLP_2022-2.pdf
(Thanks Yaccine for the list above!)