# Machine Learning 101

# Part 1

[Thoracic Surgery Data Data Set](https://archive.ics.uci.edu/ml/datasets/Thoracic+Surgery+Data)

Reference dataset: [Iris dataset](https://www.kaggle.com/arshid/iris-flower-dataset)

## Linear Regression vs. Logistic Regression (numpy)

* Optimization methods: least squares, gradient descent, stochastic gradient descent
* Learning rate: try different rates, e.g. 1e-3, 1e-5, 1e-9, 1e-11, 1e-13
  * Plot a graph comparing the loss (objective/loss function) for each rate
* Epochs: how many epochs to train for
* Try adding an L1/L2 regularization parameter
* Explain the difference between Linear Regression and Logistic Regression, and why Logistic Regression is more suitable for classification

### Recommended Structure

* Class
  * Data Reader
    * read train
      * partition the train dataset, e.g. 80% train, 20% validation (to evaluate the performance of the model)
    * read test
    * Feature engineering, such as one-hot encoding and feature selection (determining the importance of each feature), in order to avoid overfitting
  * Model Class
    * init(hyperparameter)
    * fit(x, y)
    * predict(x)
    * save_weight(filename)
    * load_weight(filename)

# Part 2

Digit Recognizer [MNIST](https://www.kaggle.com/c/digit-recognizer/data)

## Logistic Regression vs. Convolutional Neural Network (pytorch/tensorflow/numpy)

* Compare different parameters/hyperparameters of the CNN:
  * filter size
  * number of filters
  * number of units in the last layer
  * learning rate
  * number of epochs

# Part 3

[CIFAR 10](https://www.cs.toronto.edu/~kriz/cifar.html)

## Convolutional Neural Network (pytorch/tensorflow/numpy)

* Compare different parameters/hyperparameters of the CNN:
  * filter size
  * number of filters
  * number of units in the last layer
  * learning rate
  * number of epochs
* Try using a pretrained model (optional)
  * e.g. [pytorch](https://pytorch.org/hub/pytorch_vision_inception_v3/)
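
As a starting point for the Part 1 Model Class (init, fit, predict, save and load weights), here is a minimal numpy sketch of logistic regression. It assumes binary labels in {0, 1}, full-batch gradient descent, and an optional L2 penalty. The names `LogisticRegression`, `predict_proba`, `l2`, and `train_validation_split`, as well as the synthetic two-cluster data, are illustrative assumptions and not part of the assignment; a real solution would read the Thoracic Surgery data through the Data Reader instead.

```python
import numpy as np


class LogisticRegression:
    """Binary logistic regression trained with full-batch gradient descent."""

    def __init__(self, learning_rate=0.1, epochs=500, l2=0.0):
        # Hyperparameters; l2 is the (assumed) L2 regularization strength.
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.l2 = l2
        self.weights = None
        self.bias = 0.0
        self.loss_history = []  # one loss value per epoch, useful for plotting

    @staticmethod
    def _sigmoid(z):
        # Clip so np.exp does not overflow for large |z|.
        return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

    def fit(self, x, y):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        n_samples, n_features = x.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        self.loss_history = []
        eps = 1e-12  # keeps log() away from zero
        for _ in range(self.epochs):
            p = self._sigmoid(x @ self.weights + self.bias)
            # Cross-entropy loss plus the L2 penalty on the weights.
            loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
            loss += 0.5 * self.l2 * np.sum(self.weights ** 2)
            self.loss_history.append(loss)
            # Gradient of the loss above with respect to weights and bias.
            error = p - y
            grad_w = x.T @ error / n_samples + self.l2 * self.weights
            grad_b = np.mean(error)
            self.weights -= self.learning_rate * grad_w
            self.bias -= self.learning_rate * grad_b
        return self

    def predict_proba(self, x):
        return self._sigmoid(np.asarray(x, dtype=float) @ self.weights + self.bias)

    def predict(self, x):
        # Threshold the probability at 0.5 to get a class label.
        return (self.predict_proba(x) >= 0.5).astype(int)

    def save_weight(self, filename):
        # filename should end in ".npz"; np.savez appends it otherwise.
        np.savez(filename, weights=self.weights, bias=self.bias)

    def load_weight(self, filename):
        data = np.load(filename)
        self.weights = data["weights"]
        self.bias = float(data["bias"])
        return self


def train_validation_split(x, y, train_fraction=0.8, seed=0):
    """Shuffle, then split into train and validation parts (80/20 by default)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    cut = int(train_fraction * len(x))
    return x[order[:cut]], y[order[:cut]], x[order[cut:]], y[order[cut:]]


# Synthetic stand-in data: two Gaussian clusters, one per class.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

x_train, y_train, x_val, y_val = train_validation_split(x, y)
model = LogisticRegression(learning_rate=0.1, epochs=500).fit(x_train, y_train)
val_accuracy = np.mean(model.predict(x_val) == y_val)
print(f"validation accuracy: {val_accuracy:.2f}")
```

Plotting `model.loss_history` for several values of `learning_rate` gives the loss comparison asked for in Part 1. Replacing the sigmoid with the identity and the cross-entropy with squared error would turn the same skeleton into linear regression, which is one way to see why the bounded sigmoid output suits classification better.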