# Week 8

## Day 1 - Image Augmentation

- Augmenting images --> more varied scenarios for training
- But for applications like medical imaging (e.g. cancer detection) --> it does not make sense to augment medical images
- The size of the training set stays the same --> do Image Generators throw away the original images?
- Dropout: temporarily remove some nodes and evaluate how the model works without them

## Day 2 - CNN and Transfer Learning

**Why don't we use only Dense/Fully Connected layers?**

- A small input image (64x64x3) is manageable
- However, a larger high-resolution image (1000x1000x3) = 3 million features --> the W1 weight matrix --> ~3 billion parameters --> cannot fit into memory and risks overfitting
- Are we stuck with small, tiny images? --> This is one reason classical Machine Learning techniques were more efficient for 30 years.

**What is a CNN?**

CNN stands for Convolutional Neural Network. It rose to prominence in 2012 and drew attention to deep learning. The concept is similar to the human brain: when analyzing an image, a CNN breaks the image into small parts.

A CNN uses a Kernel/Filter to produce a Feature Map, which helps reduce the number of features.

Formula: Image * Kernel = Feature Map. Here * is the convolution operation: it multiplies each section of the image with the Kernel.

**CNN Hyperparameters**

- Input Size: size of the image
- Padding: add zeros around the input to adjust the shape of the output
- Kernel Size: kernel/filter size
- Stride: how far the filter jumps each step. Larger stride --> fewer scans to make

Demo for hyperparameters: https://poloclub.github.io/cnn-explainer/

Example: Input Size: 7, Padding: 0, Kernel: 3, Stride: 2

**Multi-Channel CNN**

![](https://i.imgur.com/QPa2IdW.png)

The filter size can vary, but the number of channels of the filter and the input must be the same. For example, a 6x6x3 input is convolved with a 3x3x3 filter: the channel depth of both must be 3.

CNNs have less chance of overfitting and are very good at feature extraction. To keep the model simple, we can convert the input to a square shape.
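The hyperparameters above determine the feature-map size via the standard formula floor((n + 2p - k) / s) + 1. A minimal sketch (helper name `conv_output_size` is my own, not from the course) that reproduces the cnn-explainer demo numbers:

```python
def conv_output_size(n, k, p=0, s=1):
    """Output size of a convolution: floor((n + 2p - k) / s) + 1.

    n: input size, k: kernel size, p: padding, s: stride.
    """
    return (n + 2 * p - k) // s + 1

# Demo values from cnn-explainer: input 7, padding 0, kernel 3, stride 2
print(conv_output_size(7, 3, p=0, s=2))   # --> 3

# "Same" padding example: 28x28 input, 5x5 kernel, padding 2, stride 1
print(conv_output_size(28, 5, p=2, s=1))  # --> 28 (size preserved)
```

The second call shows why padding matters: with p = (k - 1) / 2 and stride 1, the output keeps the input size.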
**What happens in one layer of a CNN?**

- Input and filter --> apply the convolution operation
- Apply an activation function to introduce non-linearity
- After each layer, a CNN makes the data smaller but deeper, e.g. from 3 color channels --> 32 channels. Each channel of the final output can represent a feature.

**Pooling Layer**

Reduces the size of a layer and saves computation cost. The two main methods are Max Pooling and Average Pooling. Max Pooling works better in the first and middle layers of a model since it is very good at reducing noise in the data. Average Pooling is good for the last layers since it preserves information better. Global Pooling flattens the information and provides the final features that can be used for classification problems.

**CNN Models**

- LeNet-5: the first CNN model, with Average Pooling and the sigmoid activation function.
- AlexNet: the model that made CNNs popular again, with Max Pooling and the ReLU activation function.
- ResNet

**Localization**

![](https://i.imgur.com/P0kuMSf.png)

**Object Detection**

- Famous algorithm right now: YOLO
- Similar: R-CNN, but slower than YOLO

## Day 3 - CNN Exercises and Flask

Doing exercises on CNNs.

**Framework**

- Front-end: the part that users interact with, written in HTML/CSS
- Back-end: the part that consists of a server, an application, and a database
- Python framework: Flask

**Request Methods**

- GET: request information
- POST: update information

## Day 4 - TensorFlow Cheatsheet

Fine-tuning a model: unfreeze layers of the pretrained model if you have a lot of data.
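The difference between Max and Average Pooling is easy to see on a small feature map. A minimal pure-Python sketch (the `pool2d` helper is my own illustration, not course code):

```python
def pool2d(x, size=2, stride=2, mode="max"):
    """Slide a size x size window over a 2D list and pool each window."""
    h, w = len(x), len(x[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            window = [x[i + di][j + dj] for di in range(size) for dj in range(size)]
            # Max pooling keeps the strongest activation (good for denoising);
            # average pooling keeps the mean (preserves more information).
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out

feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [3, 4, 1, 8],
]
print(pool2d(feature_map, mode="max"))  # --> [[6, 4], [7, 9]]
print(pool2d(feature_map, mode="avg"))  # --> [[3.75, 2.25], [4.0, 4.5]]
```

Note how max pooling drops the small values entirely while average pooling blends every value into the output, matching the noise-reduction vs. information-preservation trade-off above.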
## Day 5 - Module Test

- Resampling data carries a risk of overfitting if we do not have much data
- CNN output size: ![](https://i.imgur.com/1PVQ7MH.png)
- How to deal with overfitting:
  - Augmentation
  - More data
  - Regularizers
  - Train less (Early Stopping)
  - Reduce the complexity of the model
  - Dropout
  - Weight initialization
- How to deal with underfitting:
  - Increase the complexity of the model
  - Train longer
  - Transfer Learning

**Batch Size**

- Batch Size = 1 --> quick, 1 element per batch
- Batch Size = m --> slow, memory intensive
- Mini-Batch --> in between

**Batch Normalization**

- Batch Normalization: 4 parameters per channel for a CNN output of shape (n, m, c)
  - Std of the whole dataset (running statistics) --> used for prediction
  - Std of the batch --> used for training
- Batch Normalization allows us to make sure that the std (and mean) of activations stay consistent during training.
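The training-time half of batch normalization (normalizing with the batch's own statistics) can be sketched in a few lines. This is a simplified illustration of the normalization step only, assuming scale = 1 and shift = 0 and omitting the running statistics used at prediction time:

```python
def batch_norm(batch, eps=1e-5):
    """Normalize a batch of values to roughly zero mean and unit std.

    Uses the batch's own mean/variance, as done during training;
    eps avoids division by zero for near-constant batches.
    """
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

activations = [2.0, 4.0, 6.0, 8.0]
normed = batch_norm(activations)
print(normed)  # mean ~0, std ~1
```

In a real layer, two learned parameters per channel (scale and shift) are applied after this step, and the running mean/std accumulated over training are what get used for prediction.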