# Comparison of Modern Variants of Stochastic Gradient Descent (Application Mode)
#### Comparisons are given between Stochastic Gradient Descent (SGD), Heavy Ball (HB), Nesterov's Accelerated Gradient Descent (NAG), Accelerated Stochastic Gradient Descent (ASGD), and Adam.
This repository provides the source code for our term paper for the course EE609: Convex Optimization in Signal Processing and Communication. All of the provided Jupyter notebooks can be run on the Google Colab platform.
## Libraries Used
- numpy
- torch
- torchvision
- torchbearer
- copy
- matplotlib
- tqdm
- itertools
- sklearn
- random
- scipy
Note: copy, random, and itertools are part of Python's standard library and need no installation. To install any of the remaining libraries (sklearn is provided by the scikit-learn package on PyPI):
1. Use "pip install *library_name*" on your local system.
2. Use "!pip install *library_name*" when installing on Colab.

A quick check that the required imports are available is sketched below.
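The following minimal snippet (an illustrative check, not part of the repository) verifies that the listed libraries can be imported:

```python
# Minimal environment check: try importing each third-party library listed above.
# Note: the "sklearn" import name corresponds to the "scikit-learn" PyPI package.
import importlib

for module in ["numpy", "torch", "torchvision", "torchbearer",
               "matplotlib", "tqdm", "sklearn", "scipy"]:
    try:
        importlib.import_module(module)
        print(f"{module}: found")
    except ImportError:
        print(f"{module}: missing")
```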
## Hardware Used
The notebooks assume a GPU is available at cuda:0.
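A minimal snippet (not part of the repository's code) to verify this before running the notebooks:

```python
# Check that a CUDA GPU is visible at cuda:0, as the notebooks expect.
import torch

if torch.cuda.is_available():
    device = torch.device("cuda:0")
    print("GPU available:", torch.cuda.get_device_name(0))
else:
    device = torch.device("cpu")
    print("No GPU found; the notebooks expect cuda:0.")
```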
## Folder Description
### Linear Regression
"*LinearRegression.ipynb*" performs Linear Regression on Discrete and Gaussiand Distributions with Stochastic Gradient Descent (SGD), Heavy Ball (HB), Nesterov's Accelerated Gradient Descent (NAG) and Accerlerated Stochastic Gradient Descent (ASGD), and outputs the relevant plots. One can also load the pre-saved loss values to reproduce the plots provided in out report.
"*GridSearchLinear.ipynb*" performs grid search for SGD, HB, NAG and ASGD and outputs the optimal hyperparameter values for Discrete and Gaussian Distributions.
### Deep Autoencoders
"*Autoencoders.ipynb*" conists of code for training the autoencoder on MNIST dataset and then plotting the result of performance comparision of various optimizers used on minibatch size 1 and 8.
"*Grid_Search.ipynb*" performs grid search on SGD, HB, NAG, ASGD and Adam to obtains the various parameters where they perform optimally.
### Resnet
"*Resnet8.ipynb*" trains Resnet44 with Batch size 8 on CIFAR10 with SGD, HB, NAG, ASGD and Adam using a Decayed Hyperparameter Schedule. Their performances are compared using plots of suitable metrics. One can also load the pre-saved metrics to reproduce the plots provided in out report.
"*Resnet128.ipynb*" trains Resnet44 with Batch size 128 on CIFAR10 with SGD, HB, NAG, ASGD and Adam using both, a Decayed Hyperparameter Schedule and a Fixed Hyperparameter Schedule. Their performances are compared using plots of suitable metrics. One can also load the pre-saved metrics to reproduce the plots provided in out report.
## Possible Conflicts
1. https://github.com/HangZhongZH/Reproducible-report-of-On-the-insufficiency-of-existing-momentum-schemes-for-Stochastic-Optimization
2. https://github.com/COMP6248-Reproducability-Challenge/Insufficiency-momentum-schemes-for-Stochastic-Optimization
Note that most of the code available at the above-mentioned sources is hard-coded to reproduce results similar to those reported in the original paper, rather than using the optimal parameters obtained via a grid search. We have cited URLs in our code base wherever we have imported code from other sources (such as the code for the ASGD optimizer, the Resnet architecture, etc.).