  1. Vectors and Matrices: Understanding the concepts of vectors and matrices is fundamental. This includes operations such as addition, subtraction, scalar multiplication, dot product, cross product, matrix multiplication, and inverse.

  2. Vector Spaces: Study the properties and characteristics of vector spaces, including linear independence, basis and dimension, subspaces, span, and linear transformations.

  3. Eigenvalues and Eigenvectors: Eigenvalues and eigenvectors play a crucial role in many machine learning algorithms, such as dimensionality reduction techniques like Principal Component Analysis (PCA) and spectral clustering. Learn how to compute eigenvalues and eigenvectors, and their applications.

  4. Orthogonality and Inner Products: Understand the concept of orthogonality, orthogonal bases, and orthogonal projections. Inner product spaces are essential in AI for techniques like support vector machines (SVMs) and kernel methods.

  5. Singular Value Decomposition (SVD): SVD is a powerful matrix factorization technique used in various AI applications, including dimensionality reduction, image compression, collaborative filtering, and recommender systems.

  6. Matrix Factorizations: Besides SVD, be familiar with other matrix factorizations, such as LU decomposition, QR decomposition, and Cholesky decomposition. These factorizations are used for solving systems of linear equations and optimizing matrix operations.

  7. Matrix Calculus: Develop a working knowledge of matrix calculus, including differentiation and optimization of functions involving matrices. This is important for understanding and implementing various machine learning algorithms.

  8. Linear Systems and Solutions: Study techniques for solving linear systems of equations, including Gaussian elimination, LU decomposition, and solving least squares problems (a short numpy sketch covering least squares, the SVD, and gradient descent appears after this outline).

  9. Convex Optimization: Understand the basics of convex optimization, including convex sets, convex functions, and optimization algorithms like gradient descent. Convex optimization is fundamental in training machine learning models and solving optimization problems.

  10. Applications to AI: Finally, explore how linear algebra concepts and techniques are applied in specific areas of AI, such as deep learning, computer vision, natural language processing, and recommendation systems. Understanding the linear algebra foundations behind these applications will help you develop a deeper intuition and enable you to design and implement more advanced algorithms.

  11. Linear algebra homework: https://ocw.mit.edu/courses/18-06-linear-algebra-spring-2010/pages/assignments/ (do the first and last problem from every section; the assignment PDFs can be downloaded there)

  12. Homeworks from here: https://ocw.mit.edu/courses/6-801-machine-vision-fall-2020/pages/calendar/

  13. Computer Graphics:

    1. Bezier Curves and Splines
    2. Curves Properties and Conversion, Surface Representation
    3. Coordinates and Transformations *
    4. Hierarchical Modeling
    5. Color *
    6. Basics of Computer Animation—Skinning/Enveloping *
    7. Particle Systems and ODEs
    8. Implicit Integration, Collision Detection
    9. Collision Detection and Response
    10. Ray Casting and Rendering
    11. Ray Tracing
    12. Acceleration Structures for Ray Casting
    13. Shading and Material Appearance
    14. Texture Mapping and Shaders
    15. Sampling, Aliasing, and Mipmaps
    16. Global Illumination and Monte Carlo
    17. Image-Based Rendering and Lighting
    18. Output Devices
    19. Graphics Pipeline and Rasterization
    20. Real-time Shadows
  14. Bare metal MLOps (I will give you a server, you will deploy)

  15. https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops#courses

  16. Neural Networks

    1. Introduction:

      • The New Connectionism (1988)
      • On Alan Turing's Anticipation of Connectionism
      • McCulloch and Pitts paper
      • Rosenblatt: The perceptron
      • Bain: Mind and body
      • Hebb: The Organization of Behavior
    2. Neural Networks Basics:

      • Neural Nets As Universal Approximators
      • Shannon (1949)
      • Boolean Circuits
      • On the Bias-Variance Tradeoff
    3. Learning and Empirical Risk Minimization:

      • The problem of learning, Empirical Risk Minimization
      • Widrow and Lehr (1992)
      • Adaline and Madaline
      • Convergence of perceptron algorithm
      • Threshold Logic
      • TC (Complexity)
      • AC (Complexity)
    4. Empirical Risk Minimization and Gradient Descent:

      • Empirical risk minimization and gradient descent
      • Training the network: Setting up the problem
      • Werbos (1990)
      • Rumelhart, Hinton, and Williams (1986)
    5. Backpropagation:

      • Backpropagation
      • Calculus of Backpropagation
      • Werbos (1990)
      • Rumelhart, Hinton, and Williams (1986)
    6. Convergence Issues:

      • Convergence issues
      • Loss Surfaces
      • Momentum
      • Backprop fails to separate where perceptrons succeed, Brady et al. (1989)
      • Why Momentum Really Works
    7. Optimization:

      • Optimization
      • Batch Size, SGD, Mini-batch, second-order methods
      • Momentum, Polyak (1964), Nesterov (1983)
      • Derivatives and Influences
    8. Optimizers and Regularizers:

      • Optimizers and Regularizers
      • Choosing a divergence (loss) function
      • Batch normalization
      • Dropout
      • Derivatives and Influence Diagrams
      • AdaGrad: Duchi, Hazan, and Singer (2011); Adam: A Method for Stochastic Optimization, Kingma and Ba (2014)
    9. Shift Invariance and Convolutional Neural Networks:

      • Shift invariance and Convolutional Neural Networks
    10. Models of Vision and Convolutional Neural Networks

    11. Learning in Convolutional Neural Networks:

      • Learning in Convolutional Neural Networks
      • CNN Explainer
    12. Learning in CNNs:

      • Learning in CNNs
      • Transpose Convolution
      • CNN Stories
    13. Time Series and Recurrent Networks:

      • Time Series and Recurrent Networks
      • Fahlman and Lebiere (1990)
      • How to compute a derivative, extra help for HW3P1 (*.pptx)
    14. Stability and Memory, LSTMs:

      • Stability and Memory, LSTMs
      • Bidirectional Recurrent Neural Networks
    15. Sequence Prediction:

      • Sequence Prediction
      • Alignments and Decoding
      • LSTM
    16. Sequence Prediction:

      • Sequence Prediction
      • Connectionist Temporal Classification (CTC) - Blanks and Beam-search
    17. Language Models and Sequence to Sequence Prediction:

      • Language Models
      • Sequence To Sequence Prediction
      • Labelling Unsegmented Sequence Data with Recurrent Neural Networks
    18. Sequence to Sequence Methods and Attention:

      • Sequence To Sequence Methods
      • Attention
      • Attention Is All You Need
      • The Annotated Transformer - Attention is All You Need paper, but annotated and coded in PyTorch!
    19. Transformers and GNNs:

      • Transformers and GNNs
      • A Comprehensive Survey on Graph Neural Networks
    20. Learning Representations and Autoencoders:

      • Learning Representations
      • Autoencoders
    21. Variational Autoencoders:

      • Variational Autoencoders
      • Tutorial on VAEs (Doersch)
      • Auto-Encoding Variational Bayes (Kingma)
    22. Variational Autoencoders II:

      • Variational Autoencoders II
    23. Generative Adversarial Networks:

      • Generative Adversarial Networks, 1
    24. Generative Adversarial Networks:

      • Generative Adversarial Networks, 2
    25. Hopfield Nets and Autoassociators:

      • Hopfield Nets and Autoassociators
    26. Hopfield Nets and Boltzmann Machines:

      • Hopfield Nets and Boltzmann Machines

* = optional (you already know these)
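
As referenced from items 5, 8, and 9 of the linear algebra outline, here is a minimal numpy sketch tying a few of those ideas together: the relation between the eigenvalues of A^T A and the singular values of A, and three ways of solving the same least squares problem (normal equations, SVD/pseudo-inverse, and plain gradient descent). The problem size, noise level, step size, and iteration count are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small overdetermined least squares problem: find x minimizing ||Ax - b||^2.
A = rng.normal(size=(50, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.01 * rng.normal(size=50)

# Eigenvalues of the symmetric matrix A^T A are the squared singular values of A.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
eigvals = np.linalg.eigvalsh(A.T @ A)
assert np.allclose(np.sort(eigvals), np.sort(S**2))

# 1. Normal equations: solve (A^T A) x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# 2. SVD / pseudo-inverse: A = U S V^T, so x = V diag(1/S) U^T b.
x_svd = Vt.T @ ((U.T @ b) / S)

# 3. Plain gradient descent on f(x) = (1/n) ||Ax - b||^2,
#    using the gradient (2/n) A^T (Ax - b).
x_gd = np.zeros(3)
lr = 0.1  # step size, chosen by hand for this problem
for _ in range(2000):
    x_gd -= lr * (2 / len(b)) * A.T @ (A @ x_gd - b)

print(x_true)
print(x_normal, x_svd, x_gd)  # all three should be close to x_true
```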

Project ideas

  1. Natural language toolkit for Ziglang and Chapel, or write Zig bindings to Python's NLTK
  2. Dataframe library for Zig
  3. Physics Engine from scratch in C
  4. Deploy a simple model to DigitalOcean using Flask (a minimal Flask sketch appears after this list)
  5. Implement a few research papers on collision detection and related topics, and benchmark them against each other
  6. Write a tool to easily download, train, and deploy your own Llama model to DigitalOcean, exposed through an API
  7. Stereo reconstruction from two photos into a 3D OBJ model
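
For project 4, a minimal sketch of the Flask side of the deployment, assuming a scikit-learn style model already pickled to model.pkl (a hypothetical filename); the DigitalOcean part is essentially running this app on a droplet behind a production server such as gunicorn (e.g. `gunicorn app:app`) rather than the built-in development server.

```python
# Minimal Flask app that serves predictions from a pre-trained model.
# Assumes a scikit-learn style model pickled to "model.pkl" (hypothetical filename).
import pickle

import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:
    model = pickle.load(f)


@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [[5.1, 3.5, 1.4, 0.2], ...]}
    features = np.array(request.get_json()["features"])
    preds = model.predict(features)
    return jsonify({"predictions": preds.tolist()})


if __name__ == "__main__":
    # For local testing only; use gunicorn/nginx in production.
    app.run(host="0.0.0.0", port=5000)
```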

Write articles on these

  1. https://arxiv.org/pdf/2307.09477.pdf
  2. https://arxiv.org/pdf/2307.09426.pdf
  3. https://arxiv.org/pdf/2307.08715.pdf
  4. https://arxiv.org/pdf/2307.09042.pdf
  5. https://arxiv.org/pdf/2307.09036.pdf
  6. https://arxiv.org/pdf/2307.08849.pdf
  7. https://arxiv.org/pdf/2307.08558
  8. https://arxiv.org/pdf/2307.08810.pdf
  9. Federated learning

Getting better at coding

If you want to implement deep learning functionality from scratch, without external libraries like PyTorch or TensorFlow, it will require a significant amount of work, as these libraries provide optimized implementations and abstract away many low-level details. However, it is still possible to implement some basic functionality yourself. Here are a few key steps to get started (a minimal end-to-end sketch follows the list):

  1. Tensor Operations:

    • Create a Tensor class: Implement a class that represents a tensor and includes methods for basic tensor operations like element-wise addition, subtraction, multiplication, and division.
  2. Neural Network Components:

    • Implement a Linear layer: Create a class that represents a linear layer and includes methods for initializing weights, performing the forward pass, and updating weights during backpropagation.
    • Implement activation functions: Define classes or functions for commonly used activation functions like ReLU, sigmoid, and tanh.
  3. AutoGrad and Optimization:

    • Implement automatic differentiation: Define a mechanism to compute gradients of functions with respect to their inputs, such as the chain rule for backpropagation.
    • Implement optimization algorithms: Create classes or functions for optimization algorithms like stochastic gradient descent (SGD) or Adam. These should include methods for updating the model parameters based on computed gradients.
  4. Training and Evaluation:

    • Implement a training loop: Create a loop that performs forward and backward passes for each training sample, computes the loss, updates the parameters using the chosen optimizer, and tracks the training progress.
    • Implement evaluation metrics: Define functions or classes to calculate evaluation metrics such as accuracy or mean squared error.
  5. Data Handling:

    • Load and preprocess data: Implement functions or classes to load data from files or other sources, and preprocess it (e.g., normalization, shuffling, or batching) before feeding it into the neural network.
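
To make the steps above concrete, here is a minimal end-to-end sketch of steps 2-4 in plain numpy: numpy arrays stand in for a hand-written Tensor class (step 1), backpropagation is written out layer by layer rather than through a general autograd engine, and the optimizer is plain full-batch gradient descent. The layer sizes, learning rate, and synthetic sine-curve target are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class Linear:
    """Fully connected layer: y = x @ W + b."""
    def __init__(self, n_in, n_out):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        self.x = x                       # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out):
        self.dW = self.x.T @ grad_out    # dLoss/dW
        self.db = grad_out.sum(axis=0)   # dLoss/db
        return grad_out @ self.W.T       # dLoss/dx, passed to the previous layer

class ReLU:
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask

    def backward(self, grad_out):
        return grad_out * self.mask

def mse(pred, target):
    """Mean squared error and its gradient with respect to pred."""
    return ((pred - target) ** 2).mean(), 2.0 * (pred - target) / pred.size

# Tiny synthetic regression problem: y = sin(3x) plus a little noise.
X = rng.uniform(-1, 1, size=(256, 1))
y = np.sin(3 * X) + 0.05 * rng.normal(size=X.shape)

layers = [Linear(1, 32), ReLU(), Linear(32, 1)]
lr = 0.1  # learning rate, chosen by hand

for step in range(2001):
    # Forward pass through the whole network.
    out = X
    for layer in layers:
        out = layer.forward(out)
    loss, grad = mse(out, y)

    # Backward pass: chain rule applied layer by layer.
    for layer in reversed(layers):
        grad = layer.backward(grad)

    # Gradient descent update on every parameterized layer.
    for layer in layers:
        if isinstance(layer, Linear):
            layer.W -= lr * layer.dW
            layer.b -= lr * layer.db

    if step % 500 == 0:
        print(f"step {step}: loss {loss:.4f}")
```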

Keep in mind that implementing all these functionalities from scratch can be a complex and time-consuming task. Additionally, it may not be as efficient or optimized as using established deep learning libraries. However, it can be a valuable learning experience to gain a deeper understanding of the inner workings of deep learning algorithms.

Reading list

Git Re-Basin: https://arxiv.org/abs/2209.04836 (an extremely good paper)
Editing Models with Task Arithmetic: https://arxiv.org/pdf/2212.04089.pdf
Ologs (Spivak): https://math.mit.edu/~dspivak/informatics/olog.pdf
GLaM: https://arxiv.org/pdf/2112.06905.pdf
DEMix Layers: https://arxiv.org/pdf/2108.05036.pdf
Branch-Train-Merge: https://arxiv.org/pdf/2208.03306.pdf
Cluster-BTM: https://arxiv.org/abs/2303.14177
SMEAR: https://arxiv.org/pdf/2306.03745.pdf
TIES-Merging: https://arxiv.org/abs/2306.01708
AdapterFusion: https://arxiv.org/pdf/2005.00247.pdf
KNN Zero-Shot Inference: https://suchin.io/assets/knnprompt.pdf
Cross-Task Skills with Task-Level Mixture-of-Experts: https://arxiv.org/abs/2205.12701
Mixture-of-Supernets: https://arxiv.org/pdf/2306.04845.pdf
Sparse Upcycling: https://arxiv.org/abs/2212.05055
AdaMix: https://www.microsoft.com/en-us/research/uploads/prod/2022/05/Mixture_of_Adaptations_EMNLP_2022-2.pdf
(Thanks Yaccine for the list above!)