# Koay Yeong Lin's log
Project title: Stochastic Gradient Descent Algorithm with Multiple Adaptive Learning Rate for Deep Learning
1. Learning rate
2. RPCA
3. Application in image processing
# References
- Markdown references
  - https://commonmark.org/help/
  - https://rreece.github.io/sw/markdown-memo/05-math.html
  - $e^{\pi i} + 1 = 0$
- Markdown for presentations
  - markdown -> beamer (LaTeX) -> pdf
  - remark.js
  - pandoc
    - convert markdown -> tex, pdf, doc, epub
# Meeting 9/2/2021
1. Example of ill-conditioning in Newton's method
2. Why the Hessian needs to be positive definite
3. Example problem with the Hessian matrix (high dimension)
4. How L-BFGS solves the high-dimension problem
5. (mistake) large learning rate -> stuck in a local minimum
6. Examples for AdaGrad, AdaDelta, RMSProp, Adam
7. Learning rate = step size
# Meeting 4/3/2021
1. PCA and SVD
# Meeting 11/3/2021
1. PCA and RPCA
# Meeting 19/3/2021
1. Augmented Lagrange Multiplier (ALM) Method for Robust PCA
# Meeting 2/4/2021
1. Spectral Proximal Method
# Meeting 13/4/2021
1. Convergence analysis (steepest descent convergence properties)
2. Non-smooth optimization ($f$ is non-differentiable) (other applications)
3. Average status of current research (adaptive learning rate, RPCA)
# Meeting 20/4/2021
1. Simple neural network program (random dataset)
2. Proposal draft (introduction, objective, problem statement, literature review, methodology)
# Meeting 7/5/2021
1. Proposal draft (expected outcome, Gantt chart)
2. Neural network
* replace $\alpha$ by $B^{-1}$
* add more neuron and layer
* compare multiple adaptive learning rate with fixed learning rate
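The first point above can be written out: the scalar learning rate $\alpha$ in $w \leftarrow w - \alpha \nabla f(w)$ is replaced by a matrix $B^{-1}$. A minimal NumPy sketch on a toy quadratic (the matrices here are illustrative, not the project's actual $B$):

```python
import numpy as np

# Toy quadratic loss f(w) = 0.5 * w^T A w, with gradient A @ w.
A = np.array([[3.0, 0.0], [0.0, 1.0]])
grad = lambda w: A @ w

w = np.array([4.0, 4.0])

# Fixed scalar learning rate: w <- w - alpha * grad(w)
alpha = 0.1
w_fixed = w - alpha * grad(w)

# Matrix step: replace alpha by B^{-1}; here B approximates the Hessian A.
B = A.copy()
w_matrix = w - np.linalg.solve(B, grad(w))  # B^{-1} grad, without forming the inverse

print(w_fixed)   # partial step toward the minimum
print(w_matrix)  # with B = Hessian this is the exact Newton step to [0, 0]
```

With $B$ equal to the Hessian the matrix step recovers Newton's method; the project's $B$ is an approximation updated during training.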
# Meeting 12/5/2021
1. Average loss (multiple random dataset)
2. Proposal defence slide
# Meeting 19/5/2021
1. Proposal defence slide
# Meeting 8/6/2021
1. Traditional NN
   1. download MNIST / CIFAR-10
   2. dataset with two different classes
   3. 28 x 28 pixels -> flatten
   4. try LeCun's website to find a baseline/benchmark (how many epochs)
   5. sklearn 'sgd', partial_fit -> how many epochs to achieve convergence
   6. multiple learning rate -> how many epochs?
2. Open questions
   * comparison with fixed learning rate and stochastic learning rate?
   * comparison with AdaGrad + RMSProp?
   * can multiple learning rate also improve convergence in CNN?
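Point 5 above asks how many epochs sklearn's SGD needs under `partial_fit`. A minimal sketch of that epoch-counting loop on a small synthetic two-class dataset (standing in for a two-class MNIST subset; the stopping rule here is illustrative):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic, well-separated two-class dataset as a stand-in for MNIST digits.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (200, 20)),
               rng.normal(+1.0, 1.0, (200, 20))])
y = np.array([0] * 200 + [1] * 200)

clf = SGDClassifier(learning_rate="constant", eta0=0.01, random_state=0)

# One partial_fit call per epoch; stop when training accuracy plateaus.
prev_acc, epochs, acc = -1.0, 0, 0.0
for epoch in range(1, 101):
    clf.partial_fit(X, y, classes=[0, 1])
    acc = clf.score(X, y)
    epochs = epoch
    if abs(acc - prev_acc) < 1e-3:
        break
    prev_acc = acc

print(f"stopped after {epochs} epochs, training accuracy {acc:.3f}")
```

The same loop, with the custom learning-rate rule swapped in, gives the epochs-to-convergence comparison the note asks for.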
# Meeting 16/6/2021
1. Amendment of proposal
2. Histogram
* Number of iterations
* Computational time
# Meeting 23/6/2021
1. A Modified Spectral Gradient Method for Solving Nonlinear Systems
2. AdaGrad algorithm
# Meeting 30/6/2021
1. CIFAR dataset
2. cProfile (computational time)
3. sklearn (MAdaGrad)
# Meeting 7/7/2021
1. Histogram (scaling)
2. Cross entropy
# Meeting 14/7/2021
1. CIFAR (2 classes and 3 classes)
2. Iris dataset
# Meeting 21/7/2021
1. Extended abstract for iFSC2021
# Meeting 28/7/2021
1. sklearn (MAdaGrad)
# Meeting 4/8/2021
1. sklearn (MAdaGrad)
2. Directed Reading Assignment
# Meeting 11/8/2021
1. sklearn (MAdaGrad)
2. Conference Slide
# Meeting 18/8/2021
1. MobaXterm
   - https://ubuntu.com/tutorials/command-line-for-beginners#3-opening-a-terminal
2. sklearn (plot loss)
# Meeting 1/9/2021
1. Initializing neural networks
   - https://www.deeplearning.ai/ai-notes/initialization/
2. MobaXterm
# Meeting 15/9/2021
1. Xavier initialization (MNIST dataset)
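Xavier (Glorot) initialization draws each weight from $U(-\ell, \ell)$ with $\ell = \sqrt{6/(n_{\text{in}} + n_{\text{out}})}$, keeping activation variance roughly stable across layers. A minimal sketch (the layer sizes are just an example for MNIST-shaped input):

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    # Glorot/Xavier uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W = xavier_uniform(784, 100, rng)   # e.g. flattened MNIST input -> 100 hidden units
print(W.shape)
```

This matches the scheme described in the deeplearning.ai notes linked above; sklearn's MLP uses the same Glorot-style scaling internally.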
# Meeting 24/9/2021
1. _test_madagrad calculation
2. _fit_stochastic_annotated.pdf
3. sklearn _forward_pass, _forward_pass_fast, _backprop
# Meeting 29/9/2021
1. Neural network hand calculation
* Case 1: 2 neurons
* Case 2: 2 features
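The hand calculation above can be cross-checked in code. A tiny network (1 input, 2 sigmoid hidden neurons, linear output, squared-error loss; all numbers are arbitrary illustration values), with the backprop gradients verified against finite differences:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x, target = 0.5, 1.0
W1 = np.array([0.1, -0.3])   # input -> 2 hidden neurons
W2 = np.array([0.4, 0.2])    # hidden -> output

def loss(W1, W2):
    h = sigmoid(W1 * x)      # hidden activations
    out = W2 @ h             # linear output
    return 0.5 * (out - target) ** 2

# Backprop "by hand":
h = sigmoid(W1 * x)
out = W2 @ h
delta = out - target                     # dL/dout
grad_W2 = delta * h                      # dL/dW2
grad_W1 = delta * W2 * h * (1 - h) * x   # chain rule through the sigmoid

# Check against central finite differences.
eps = 1e-6
fd_W2 = np.array([(loss(W1, W2 + eps * e) - loss(W1, W2 - eps * e)) / (2 * eps)
                  for e in np.eye(2)])
fd_W1 = np.array([(loss(W1 + eps * e, W2) - loss(W1 - eps * e, W2)) / (2 * eps)
                  for e in np.eye(2)])
print(np.allclose(grad_W2, fd_W2, atol=1e-6), np.allclose(grad_W1, fd_W1, atol=1e-6))
```

The finite-difference check is a cheap way to validate each hand-derived gradient before trusting it in sklearn's `_backprop`.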
# Meeting 7/10/2021
1. Amendment of paper (non linear)
2. safe_sparse_dot function
3. forward pass
# Meeting 14/10/2021
1. sklearn backpropagation
# Meeting 27/10/2021
1. Progress Report
2. Amendment of paper (non linear)
# Meeting 3/11/2021
1. Backpropagation derivation
# Meeting 17/11/2021
1. Backpropagation derivation
# Meeting 24/11/2021
1. sklearn (MAdaGrad)
# Meeting 1/12/2021
1. sklearn (MAdaGrad)
# Meeting 9/12/2021
1. sklearn (MAdaGrad)
# Meeting 16/12/2021
1. Histogram (3 histogram for 3 methods)
- Rescaling (B[i] = (s^T y / s^T s) * I)
- Restarting (B[i] = I)
- Restoring (B[i] unchanged)
2. script (replace B inverse by H)
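The three restart options in the histogram comparison can be written out in code; the rescaling choice applies the Barzilai-Borwein-type scalar $s^\top y / s^\top s$ to the identity. A small NumPy sketch (the function name and toy vectors are illustrative, not the project's script):

```python
import numpy as np

def reset_B(B, s, y, mode):
    """Matrix used at a restart, following the three options in the notes.
    s = step (w_new - w_old), y = gradient difference (g_new - g_old)."""
    n = B.shape[0]
    if mode == "rescaling":      # B = (s^T y / s^T s) * I
        return (s @ y) / (s @ s) * np.eye(n)
    if mode == "restarting":     # B = I
        return np.eye(n)
    if mode == "restoring":      # keep B unchanged
        return B
    raise ValueError(mode)

s = np.array([1.0, 0.0])
y = np.array([2.0, 0.0])
B = np.diag([5.0, 7.0])

print(reset_B(B, s, y, "rescaling"))   # here s^T y / s^T s = 2, so 2 * I
print(reset_B(B, s, y, "restarting"))  # identity
print(reset_B(B, s, y, "restoring"))   # original B
```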
# Meeting 6/1/2022
1. slide
2. script (replace B inverse by H)
# Meeting 13/1/2022
1. slide
# Meeting 19/1/2022
1. Amendment of paper (non linear)
# Meeting 27/1/2022
1. https://towardsdatascience.com/custom-optimizer-in-tensorflow-d5b41f75644a
# Meeting 8/2/2022
1. TensorFlow (MAdaGrad)
* https://www.tensorflow.org/api_docs/python/tf/math
* https://www.tensorflow.org/api_docs/cc/group/math-ops
* https://www.tensorflow.org/api_docs/python/tf/Tensor
# Meeting 15/2/2022
1. TensorFlow (MAdaGrad)
# Meeting 22/2/2022
1. TensorFlow (MAdaGrad)
# Meeting 1/3/2022
1. TensorFlow (MAdaGrad)
# Meeting 8/3/2022
1. MAdaGrad - get minimum and maximum (learning rate * B_inverse)
# Meeting 15/3/2022
1. Profiling Graph (MAdaGrad, Adam, SGD, SGD(max), SGD(min))
# Meeting 29/3/2022
1. sklearn(MAdaGrad_Adam)
2. Profiling Graph (MAdaGrad_Adam, Adam)
# Meeting 5/4/2022
1. MNIST Fashion (MobaXterm)
# Meeting 21/4/2022
1. MNIST Fashion (MobaXterm)
# Meeting 27/4/2022
1. Paper (Introduction and Literature Review)
# Meeting 10/5/2022
1. Latex Paper (Introduction and Literature Review)
# Meeting 17/5/2022
1. Paper (Methodology)
2. Count (if c1 > c2 ; else)
# Meeting 23/5/2022
1. Average loss value (MAdaGrad)
# Meeting 31/5/2022
1. 30 datasets, 500 iterations (loss value) (batch size = default)
- MAdaGrad, SGD (invscaling, constant, adaptive)
- MAdaGrad_Adam, Adam
2. 30 datasets, 500 iterations (loss value) (batch size = 50, 100, 200 (default), 500, 1000)
- MAdaGrad, SGD (invscaling, constant, adaptive)
- MAdaGrad_Adam, Adam
# Meeting 14/6/2022
1. Add (1/self.t) into updating formula
- learning rate * B_inverse * (1/t)
2. Profiling graph (loss value)
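The damped update in note 1, $w_{t+1} = w_t - \frac{\eta}{t}\, B^{-1} \nabla f(w_t)$, shrinks the step as the iteration count grows. A toy NumPy run (the quadratic loss and $B^{-1} = I$ are stand-ins for the real model and matrix):

```python
import numpy as np

# Toy quadratic f(w) = 0.5 * w^T A w, gradient A @ w.
A = np.diag([2.0, 0.5])
grad = lambda w: A @ w

eta = 0.5
B_inv = np.eye(2)            # stand-in for the method's B^{-1}
w = np.array([10.0, 10.0])

# Update with the extra 1/t factor: w <- w - eta * (1/t) * B_inv @ grad(w)
for t in range(1, 501):
    w = w - eta * (1.0 / t) * (B_inv @ grad(w))

print(w)   # iterates shrink toward the minimizer at the origin
```

Since $\sum_t 1/t$ diverges while the individual steps vanish, this is the classic diminishing-step-size schedule; the $1/\sqrt{t}$ variant tried later decays more slowly.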
# Meeting 23/6/2022
1. SGD (invscaling, adaptive, constant)
2. MNIST Fashion (SGD, MAdaGrad(1/t), MAdaGrad(1/sqrt(t)))
3. Paper
4. Work completion report
5. Thesis
# Meeting 27/6/2022
1. Profiling graph (SGD invscaling and SGD adaptive)
2. breast cancer dataset, diabetes dataset and wine dataset
3. Paper
# Meeting 4/7/2022
1. Paper
2. Work completion report
3. Datasets (Abalone, Dry Bean, Heart Disease, Raisin, Rice)
# Meeting 12/7/2022
1. Datasets (Abalone, Dry Bean, Heart Disease, Raisin, Rice, wine, breast cancer) - MAdaGrad and SGD
# Meeting 18/7/2022
1. Datasets (Abalone, Dry Bean, Heart Disease, Raisin, Rice, wine, breast cancer) - MAdaGrad and Adam
2. Profiling graph
- xlim (-0.2,8), ylim (0,1.05)
- remove SGD(constant) (overlapped with SGD(adaptive))
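The profiling graphs with `xlim(-0.2, 8)` look like Dolan-More performance profiles on a $\log_2$ ratio axis. A NumPy sketch of how such curves can be computed (the cost matrix below is made up for illustration, not the project's measurements):

```python
import numpy as np

def performance_profile(costs):
    """Dolan-More performance profile from a (problems x solvers) cost
    matrix (e.g. epochs or seconds; smaller is better). Returns log2
    performance ratios and, per solver, the fraction of problems solved
    within each ratio (a step function to plot)."""
    costs = np.asarray(costs, dtype=float)
    ratios = costs / costs.min(axis=1, keepdims=True)   # ratio to best solver
    taus = np.unique(ratios)
    # rho[s](tau) = fraction of problems with ratio <= tau for solver s
    rho = np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                    for s in range(costs.shape[1])])
    return np.log2(taus), rho

# Illustrative costs: 4 problems x 2 solvers.
costs = [[10, 20],
         [15, 15],
         [30, 10],
         [12, 24]]
x, rho = performance_profile(costs)
print(x)
print(rho)   # each row ends at 1.0 once tau covers that solver's worst ratio
```

Each `rho[s]` is drawn as a step curve (e.g. `plt.step(x, rho[s], where="post")`) with the axis limits from the notes; the height at $x = 0$ is the fraction of problems on which that solver is the fastest.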
# Meeting 25/7/2022
1. Paper
2. Combined profiling graph (Breast Cancer, Wine and Abalone datasets) - 5 graphs in total (6 hidden layer sizes)
# Meeting 1/8/2022
1. Paper
2. Combined profiling graph (MNIST, Breast Cancer, Wine and Abalone datasets) - 5 graphs in total (6 hidden layer sizes)
# Meeting 15/8/2022
1. Paper
# Meeting 29/8/2022
1. Paper
2. Thesis
# Meeting 5/9/2022
1. Paper
2. Thesis