# Deep Learning

###### tags: `python`, `cnn`, `ML`

* [SONY - Neural Network Console](https://dl.sony.com/)
* [ConvNetJS](https://github.com/karpathy/convnetjs)
* [Visualize filters](https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
* https://poloclub.github.io/cnn-explainer/

## Dentall AI references
* [Understanding your Convolution network with Visualizations](https://towardsdatascience.com/understanding-your-convolution-network-with-visualizations-a4883441533b)
* [Image labeling - Rectangle](https://github.com/tzutalin/labelImg)
* [Image labeling - Polygon](https://github.com/wkentaro/labelme)

## [Terms](https://machinelearningmastery.com/difference-between-a-batch-and-an-epoch/)

**Batch**: the number of samples to work through before updating the internal model parameters.

**Epoch**: the number of times that the learning algorithm works through the entire training dataset.

**Softmax**: "squashes" a K-dimensional vector ${\displaystyle \mathbf{z}}$ of arbitrary real values into another K-dimensional real vector ${\displaystyle \sigma(\mathbf{z})}$ in which every element lies in ${\displaystyle (0,1)}$ and all elements sum to 1.

**Sigmoid**: converts each score of the final node to a value between 0 and 1, independently of what the other scores are.

**Learning Rate**
![](https://i.imgur.com/8PiKvvk.png)

## Implementation

### Read Data

Method 1: `flow_from_directory`

```python=
train_batches = datagen.flow_from_directory(
    "AngleData/train",
    target_size=(224, 224),
    batch_size=20,
    classes=['A', 'B', 'C'],
    subset='training')

valid_batches = datagen.flow_from_directory(
    "AngleData/train",
    target_size=(224, 224),
    batch_size=20,
    classes=['A', 'B', 'C'],
    subset='validation')
```

Method 2: `flow_from_dataframe`

```python=
import pandas as pd

traindf = pd.read_csv("train.csv", header=None)
traindf = traindf.rename(columns={0: "name", 1: "class"})

train_batches = datagen.flow_from_dataframe(
    dataframe=traindf,
    directory="AngleData/train",
    x_col="name",
    y_col="class",
    target_size=(224, 224),
    batch_size=20,
    subset='training')
```
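Both read-data methods above assume a `datagen` generator already exists. A minimal sketch of how it could be created is below; the `rescale` value and the 80/20 `validation_split` are assumptions rather than values from this note, but `validation_split` is what makes `subset='training'` / `subset='validation'` work.

```python
# Hypothetical `datagen` used by the flow_from_* calls above (values are assumptions)
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1. / 255,       # scale pixel values to [0, 1]
    validation_split=0.2)   # hold out 20% of the data for subset='validation'
```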
## Build Network

### ResNet152

```python
# libraries
from keras.applications.resnet import ResNet152
from keras.layers import Activation, Dense, Dropout, Flatten
from keras.models import Model
from keras.optimizers import Adam
```

```python
# network
target_size = (224, 224)   # must match the generators above
class_num = 3              # classes 'A', 'B', 'C'

net = ResNet152(include_top=False,
                weights="imagenet",
                input_tensor=None,
                input_shape=(target_size[0], target_size[1], 3),  # 3 colour channels
                classes=class_num)
x = net.output
x = Flatten()(x)
x = Dropout(0.5)(x)
output_layer = Dense(class_num, activation='softmax', name='softmax')(x)
```

### InceptionV3

```python
# libraries
from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
from keras.layers import Activation, Dense, GlobalAveragePooling2D, Dropout
from keras.optimizers import Adam
```

```python
# network
net = InceptionV3(include_top=False, weights="imagenet")
x = net.output
x = GlobalAveragePooling2D()(x)
x = Dropout(0.5)(x)
output_layer = Dense(class_num, activation='softmax')(x)
```

## Train

```python
import matplotlib.pyplot as plt

batch_size = 20  # must match the generators' batch_size

# Set which layers are frozen and which are trained
FREEZE_LAYERS = 2
net_final = Model(inputs=net.input, outputs=output_layer)
for layer in net_final.layers[:FREEZE_LAYERS]:
    layer.trainable = False
for layer in net_final.layers[FREEZE_LAYERS:]:
    layer.trainable = True

# Fine-tune with the Adam optimizer and a low learning rate
net_final.compile(optimizer=Adam(lr=1e-5),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

# Train the model
history = net_final.fit(train_batches,
                        steps_per_epoch=train_batches.samples // batch_size,
                        validation_data=valid_batches,
                        validation_steps=valid_batches.samples // batch_size,
                        epochs=30)
net_final.save("models/your_model_name.h5")

# Evaluate on the validation set
STEP_SIZE_VALID = valid_batches.n // valid_batches.batch_size
result = net_final.evaluate_generator(generator=valid_batches,
                                      steps=STEP_SIZE_VALID,
                                      verbose=1)
print("result = ", result)

# plot metrics (save before show, otherwise the saved figure is blank)
plt.plot(history.history['accuracy'])
plt.savefig('accuracy.jpg')
plt.show()
```

### Bias and Variance

![](https://i.imgur.com/DZaSgLJ.png)

* Simple model: lower variance, higher bias
* Complex model: higher variance, lower bias

> A simpler model is less influenced by the sampled data.

:::success
* **Underfitting**: if the model cannot even fit the training data, the problem is likely a large bias.
> Redesign the model (e.g. add features, use a more complex structure...) [color=grey]
* **Overfitting**: if the model fits the training data well but not the testing data, the problem is likely a large variance.
> Add more data, or apply regularization [color=grey]
:::

## Tips for training DNN

[YouTube tutorial](https://www.youtube.com/watch?v=xki61j7z-30&list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49&index=16)

* We only apply `dropout` when the testing result is not good.
* `Dense`: fully connected layer
* `softmax`: the predicted value for each class is between 0 and 1, and the predicted values sum to 1.
* When the training result is not satisfying:
    * check the network structure and the activation function
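To tie the last few terms together, here is a minimal sketch (not from the original note; the layer sizes, input shape, and dropout rate are arbitrary assumptions) of a small classifier that combines `Dense`, `Dropout`, and a `softmax` output:

```python
# Minimal sketch only; all sizes and rates here are arbitrary assumptions
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),  # `Dense`: fully connected layer
    Dropout(0.5),                    # dropout, added only when the testing result is poor
    Dense(10, activation='softmax')  # softmax: 10 outputs in (0, 1) that sum to 1
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```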