# Koay Yeong Lin's log

Project title: Stochastic Gradient Descent Algorithm with Multiple Adaptive Learning Rate for Deep Learning

1. Learning rate
2. RPCA
3. Application in image processing

# References

- markdown references
  - https://commonmark.org/help/
  - https://rreece.github.io/sw/markdown-memo/05-math.html
  - $e^{\pi i} + 1 = 0$
- markdown for presentation
  - markdown -> beamer (LaTeX) -> pdf
  - remark.js
- pandoc
  - convert markdown -> tex, pdf, doc, epub

# Meeting 9/2/2021

1. Example of an ill-conditioned problem for Newton's method
2. Why the Hessian needs to be positive definite
3. Example of the problem with the Hessian matrix in high dimensions
4. How L-BFGS handles the high-dimension problem
5. (mistake) large learning rate -> stuck in a local minimum
6. Examples for AdaGrad, AdaDelta, RMSProp, Adam
7. Learning rate = step size

# Meeting 4/3/2021

1. PCA and SVD

# Meeting 11/3/2021

1. PCA and RPCA

# Meeting 19/3/2021

1. Augmented Lagrange Multiplier (ALM) Method for Robust PCA

# Meeting 2/4/2021

1. Spectral Proximal Method

# Meeting 13/4/2021

1. Convergence analysis (steepest descent convergence properties)
2. Non-smooth optimization ($f$ is non-differentiable) (other applications)
3. Status of current research (adaptive learning rate, RPCA)

# Meeting 20/4/2021

1. Simple neural network program (random dataset)
2. Proposal draft (introduction, objective, problem statement, literature review, methodology)

# Meeting 7/5/2021

1. Proposal draft (expected outcome, Gantt chart)
2. Neural network
   - replace $\alpha$ by $B^{-1}$
   - add more neurons and layers
   - compare multiple adaptive learning rates with a fixed learning rate

# Meeting 12/5/2021

1. Average loss (multiple random datasets)
2. Proposal defence slides

# Meeting 19/5/2021

1. Proposal defence slides

# Meeting 8/6/2021

1. Traditional NN
   1. download MNIST / CIFAR-10
   2. dataset with two different classes
   3. 28 x 28 pixels -> flatten
   4. check LeCun's website for a baseline/benchmark (how many epochs)
   5. sklearn 'sgd', partial_fit -> how many epochs to reach convergence
   6. multiple learning rates -> how many epochs?
2. Questions
   - comparison with fixed learning rate and stochastic learning rate?
   - comparison with AdaGrad + RMSProp?
   - can multiple learning rates also improve convergence in a CNN?

# Meeting 16/6/2021

1. Amendment of proposal
2. Histogram
   - number of iterations
   - computational time

# Meeting 23/6/2021

1. A Modified Spectral Gradient Method for Solving Nonlinear Systems
2. AdaGrad algorithm

# Meeting 30/6/2021

1. CIFAR dataset
2. cProfile (computational time)
3. sklearn (MAdaGrad)

# Meeting 7/7/2021

1. Histogram (scaling)
2. Cross entropy

# Meeting 14/7/2021

1. CIFAR (2 classes and 3 classes)
2. Iris dataset

# Meeting 21/7/2021

1. Extended abstract for iFSC2021

# Meeting 28/7/2021

1. sklearn (MAdaGrad)

# Meeting 4/8/2021

1. sklearn (MAdaGrad)
2. Directed Reading Assignment

# Meeting 11/8/2021

1. sklearn (MAdaGrad)
2. Conference slides

# Meeting 18/8/2021

1. MobaXterm
   - https://ubuntu.com/tutorials/command-line-for-beginners#3-opening-a-terminal
2. sklearn (plot loss)

# Meeting 1/9/2021

1. Initializing neural networks
   - https://www.deeplearning.ai/ai-notes/initialization/
2. MobaXterm

# Meeting 15/9/2021

1. Xavier initialization (MNIST dataset)

# Meeting 24/9/2021

1. _test_madagrad calculation
2. _fit_stochastic_annotated.pdf
3. sklearn _forward_pass, _forward_pass_fast, _backprop

# Meeting 29/9/2021

1. Neural network hand calculation
   - Case 1: 2 neurons
   - Case 2: 2 features

# Meeting 7/10/2021

1. Amendment of paper (nonlinear)
2. safe_sparse_dot function
3. forward pass

# Meeting 14/10/2021

1. sklearn backpropagation

# Meeting 27/10/2021

1. Progress report
2. Amendment of paper (nonlinear)

# Meeting 3/11/2021

1. Backpropagation derivation

# Meeting 17/11/2021

1. Backpropagation derivation

# Meeting 24/11/2021

1. sklearn (MAdaGrad)

# Meeting 1/12/2021

1. sklearn (MAdaGrad)

# Meeting 9/12/2021

1. sklearn (MAdaGrad)

# Meeting 16/12/2021
1. Histogram (3 histograms for 3 methods)
   - Rescaling (B[i] = (s^T y / s^T s) I)
   - Restarting (B[i] = I)
   - Restoring (B[i] = B[i], kept unchanged)
2. script (replace B inverse by H)

# Meeting 6/1/2022

1. slides
2. script (replace B inverse by H)

# Meeting 13/1/2022

1. slides

# Meeting 19/1/2022

1. Amendment of paper (nonlinear)

# Meeting 27/1/2022

1. https://towardsdatascience.com/custom-optimizer-in-tensorflow-d5b41f75644a

# Meeting 8/2/2022

1. TensorFlow (MAdaGrad)
   - https://www.tensorflow.org/api_docs/python/tf/math
   - https://www.tensorflow.org/api_docs/cc/group/math-ops
   - https://www.tensorflow.org/api_docs/python/tf/Tensor

# Meeting 15/2/2022

1. TensorFlow (MAdaGrad)

# Meeting 22/2/2022

1. TensorFlow (MAdaGrad)

# Meeting 1/3/2022

1. TensorFlow (MAdaGrad)

# Meeting 8/3/2022

1. MAdaGrad - get minimum and maximum of (learning rate * B_inverse)

# Meeting 15/3/2022

1. Profiling graph (MAdaGrad, Adam, SGD, SGD(max), SGD(min))

# Meeting 29/3/2022

1. sklearn (MAdaGrad_Adam)
2. Profiling graph (MAdaGrad_Adam, Adam)

# Meeting 5/4/2022

1. MNIST Fashion (MobaXterm)

# Meeting 21/4/2022

1. MNIST Fashion (MobaXterm)

# Meeting 27/4/2022

1. Paper (introduction and literature review)

# Meeting 10/5/2022

1. LaTeX paper (introduction and literature review)

# Meeting 17/5/2022

1. Paper (methodology)
2. Count (if c1 > c2; else)

# Meeting 23/5/2022

1. Average loss value (MAdaGrad)

# Meeting 31/5/2022

1. 30 datasets, 500 iterations (loss value) (batch size = default)
   - MAdaGrad, SGD (invscaling, constant, adaptive)
   - MAdaGrad_Adam, Adam
2. 30 datasets, 500 iterations (loss value) (batch size = 50, 100, 200 (default), 500, 1000)
   - MAdaGrad, SGD (invscaling, constant, adaptive)
   - MAdaGrad_Adam, Adam

# Meeting 14/6/2022

1. Add (1/self.t) to the updating formula
   - learning rate * B_inverse * (1/t)
2. Profiling graph (loss value)

# Meeting 23/6/2022

1. SGD (invscaling, adaptive, constant)
2. MNIST Fashion (SGD, MAdaGrad(1/t), MAdaGrad(1/sqrt(t)))
3. Paper
4. Work completion report
5. Thesis

# Meeting 27/6/2022

1. Profiling graph (SGD invscaling and SGD adaptive)
2. breast cancer, diabetes and wine datasets
3. Paper

# Meeting 4/7/2022

1. Paper
2. Work completion report
3. Datasets (Abalone, Dry Bean, Heart Disease, Raisin, Rice)

# Meeting 12/7/2022

1. Datasets (Abalone, Dry Bean, Heart Disease, Raisin, Rice, wine, breast cancer)
   - MAdaGrad and SGD

# Meeting 18/7/2022

1. Datasets (Abalone, Dry Bean, Heart Disease, Raisin, Rice, wine, breast cancer)
   - MAdaGrad and Adam
2. Profiling graph
   - xlim (-0.2, 8), ylim (0, 1.05)
   - remove SGD(constant) (overlaps with SGD(adaptive))

# Meeting 25/7/2022

1. Paper
2. Combined profiling graph (Breast Cancer, Wine and Abalone datasets)
   - 5 graphs in total (6 hidden layer sizes)

# Meeting 1/8/2022

1. Paper
2. Combined profiling graph (MNIST, Breast Cancer, Wine and Abalone datasets)
   - 5 graphs in total (6 hidden layer sizes)

# Meeting 15/8/2022

1. Paper

# Meeting 29/8/2022

1. Paper
2. Thesis

# Meeting 5/9/2022

1. Paper
2. Thesis
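The B-matrix strategies recorded in the 16/12/2021 meeting (Rescaling B[i] = (s^T y / s^T s) I, Restarting B[i] = I, Restoring) and the (1/t) damping added in the 14/6/2022 meeting can be sketched as below. This is a minimal toy illustration, not the project's actual sklearn patch; the function and variable names (`spectral_scale`, `update`, `b_scale`) are illustrative only.

```python
# Sketch: scalar spectral approximation B = (s^T y / s^T s) I ("Rescaling"),
# with "Restarting" corresponding to b_scale = 1.0 (B = I), and the update
# w <- w - learning_rate * B^{-1} * (1/t) * grad from the 14/6/2022 note.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def spectral_scale(s, y):
    """Rescaling: B = (s^T y / s^T s) I, so B^{-1} = (s^T s / s^T y) I."""
    return dot(s, y) / dot(s, s)

def update(w, grad, lr, b_scale, t):
    """One step: w <- w - lr * B^{-1} * (1/t) * grad."""
    b_inv = 1.0 / b_scale
    return [wi - lr * b_inv * (1.0 / t) * gi for wi, gi in zip(w, grad)]

# Toy quadratic f(w) = 0.5 * ||w||^2, so grad f(w) = w.
w_old = [4.0, 2.0]
g_old = list(w_old)
w = update(w_old, g_old, lr=0.5, b_scale=1.0, t=1)  # Restarting: B = I
g = list(w)                                          # gradient at new point

s = [a - b for a, b in zip(w, w_old)]                # s = w_t - w_{t-1}
y = [a - b for a, b in zip(g, g_old)]                # y = g_t - g_{t-1}
b_scale = spectral_scale(s, y)                       # Rescaling for next step
w = update(w, g, lr=0.5, b_scale=b_scale, t=2)
```

On this quadratic the Hessian is the identity, so the spectral scale comes out as exactly 1; on a general problem it rescales the step by the inverse of an average curvature along the last step.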
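The Xavier initialization studied in the 1/9/2021 and 15/9/2021 meetings draws each weight uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), which keeps activation variance roughly constant across layers. A minimal sketch, assuming the uniform Glorot variant; `xavier_uniform` and the layer sizes are illustrative, not sklearn's internal code:

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    """Glorot/Xavier uniform init: W[i][j] ~ U(-limit, limit),
    limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

# Example: first layer of an MNIST network, 784 flattened pixels -> 100 hidden units.
W = xavier_uniform(784, 100)
limit = math.sqrt(6.0 / (784 + 100))
assert all(-limit <= v <= limit for row in W for v in row)
```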
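The profiling graphs that recur from 15/3/2022 onward, with ylim (0, 1.05) suggesting a solved fraction on the y-axis, appear to be performance profiles in the Dolan-Moré sense. A hedged sketch of how such a profile can be computed, with made-up cost numbers; `performance_profile` and the solver names are illustrative:

```python
# Performance-profile sketch: for each solver s and problem p, compute the
# ratio r[p][s] = cost[p][s] / min over solvers of cost[p][s]; the profile
# value for solver s at tau is the fraction of problems with r[p][s] <= tau.

def performance_profile(costs, taus):
    """costs: {solver: [cost on problem 0, 1, ...]};
    returns {solver: [fraction of problems within factor tau of best]}."""
    solvers = list(costs)
    n = len(next(iter(costs.values())))
    best = [min(costs[s][p] for s in solvers) for p in range(n)]
    ratios = {s: [costs[s][p] / best[p] for p in range(n)] for s in solvers}
    return {s: [sum(r <= tau for r in ratios[s]) / n for tau in taus]
            for s in solvers}

# Made-up iteration counts on three problems.
costs = {"MAdaGrad": [1.0, 2.0, 4.0], "SGD": [2.0, 2.0, 2.0]}
profile = performance_profile(costs, taus=[1.0, 2.0])
# Both solvers are best (or tied) on two of the three problems, and both are
# within a factor of 2 of the best on all three.
```

In published profiles tau is often plotted on a log2 axis, which would match an xlim running from roughly 0 to 8.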