---
tags: DeepLearning_HW2_CNN
title: Report
---

# Hand Posture Recognition with CNN

## Table of Contents
###### [TOC]

## Problem
#### Use a CNN to recognize 3 hand postures. We are given 5 datasets (from different people, captured under different conditions).
#### The datasets are stored in 9 separate folders.
#### Folders 0000-0002, 0003-0005, and 0006-0008 contain posture 1, posture 2, and posture 3, respectively.
#### Each folder holds 20 image files, and every image is a 32*32 grayscale image.

## Solution
### Build a CNN model with Keras, train it on the provided images, and examine the effect of Dropout and BN layers
#### This time we build the CNN on Colab.
#### Upload the .zip file and unzip it
```
!unzip /content/All_gray_1_32_32.zip
```
#### Import the libraries we need
```
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
import tensorflow.keras.losses as losses
import tensorflow.keras.optimizers as optimizers
from tensorflow.keras import backend as K
```
#### Use the sample code provided by the instructor to gather the data we need
```
def enumerate_files(sets, path='/content/All_gray_1_32_32/All_gray_1_32_32/', n_poses=3, n_samples=20):
    filenames, labels = [], []
    # iterate over the requested set names
    for name_set in sets:
        # iterate over the three poses
        for i_pose in range(n_poses):
            # each pose is spread across 3 of the 9 separate directories
            for j in range(3):
                dir_name = path + '/' + name_set + \
                    '/000' + str(i_pose * 3 + j) + '/'
            # iterate over the sample subdirectories
                for i_sample in range(n_samples):
                    d = dir_name + '%04d/' % i_sample
                    # list all files in the 'd' directory
                    for f in os.listdir(d):
                        # keep only files with the 'jpg' extension
                        if f.endswith('jpg'):
                            # append the full path to the filename list
                            filenames.append(d + f)
                            # append the label number to the labels list
                            labels.append(i_pose)
    return filenames, labels
```
#### Wrap this in simple helper functions so the main program can load the training and test data
```
def load_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), labels
```
```
def read_images(files):
    imgs = []
    for f in files:
        img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
        img = img / 255.0
        imgs.append(img)
    return imgs
```
#### Plot and save the training history
```
def show_train_history(train_history, str_test_acc, train_acc='acc', test_acc='val_acc'):
    plt.plot(train_history.history[train_acc])
    plt.plot(train_history.history[test_acc])
    plt.title('Train History')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.savefig(str_test_acc + '.png')
```

## Results
### <font size="6">__**Model With No Dropout & No BN Layers**__ </font>
##### batch_size = 360
##### num_classes = 3
##### epochs = 200
##### Tanh(32)->ReLU(32)->ReLU(28)->Flatten->ReLU(128)->Softmax
```
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])
```
![Imgur](https://i.imgur.com/AC1V7xV.png)
<font size="4">__**Accuracy = 0.838...**__ </font>
![Imgur](https://i.imgur.com/1CF47mT.png)
<font size="4">__**Accuracy = 0.886...**__ </font>
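#### The accuracy curves above (and in the later experiments) come from a training driver that is not shown in the report. The sketch below is a minimal guess at what it might look like, reusing the helpers defined earlier; the set names `'set1'`–`'set5'`, the 4-vs-1 train/test split, and the saved-figure name are placeholders, while the batch size, epoch count, and class count follow the values listed in each experiment.
```
# NOTE: 'set1'...'set5' and the 4-vs-1 split are placeholders; the actual
# folder names of the five sets in the assignment zip may differ.
train_sets = ['set1', 'set2', 'set3', 'set4']
test_sets = ['set5']

# load images and labels with the helpers defined above
x_train, y_train = load_datasets(train_sets)
x_test, y_test = load_datasets(test_sets)

# add the channel dimension expected by Conv2D: (N, 32, 32, 1)
x_train = x_train.reshape(-1, 32, 32, 1)
x_test = x_test.reshape(-1, 32, 32, 1)
input_shape = (32, 32, 1)

# one-hot encode the 3 pose labels
num_classes = 3
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

# train with the reported hyperparameters and plot/save the curves
batch_size = 360
epochs = 200
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test))

score = model.evaluate(x_test, y_test, verbose=0)
print('Test accuracy:', score[1])
show_train_history(history, 'acc_%.3f' % score[1])  # filename stem is illustrative
```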
### <font size="6">__**Model With No Dropout & With BN Layers**__ </font>
##### batch_size = 360
##### num_classes = 3
##### epochs = 200
#### Interestingly, adding a BN layer after each CNN block actually made the network fail to train.
#### Even so, the validation curves show that the BN layers do clearly speed up the early improvement.
```
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])
```
![Imgur](https://i.imgur.com/oXrUPWe.png)
<font size="4">__**Accuracy = 0.347222...**__ </font>
![Imgur](https://i.imgur.com/v8IhbJu.png)
<font size="4">__**Accuracy = 0.405555...**__ </font>

### <font size="6">__**(Best) Model With Dropout & No BN Layers**__ </font>
##### batch_size = 360
##### num_classes = 3
##### epochs = 200
##### Tanh(32)->ReLU(32)->ReLU(28)->Flatten->ReLU(128)->Softmax
```
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.35))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.4))
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.45))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])
```
![Imgur](https://i.imgur.com/f3c0Shx.png)
<font size="4">__**Accuracy = 0.96111...**__ </font>
![Imgur](https://i.imgur.com/7KtRdaQ.png)
<font size="4">__**Accuracy = 0.97222...**__ </font>

### <font size="6">__**Model With Dropout & With BN Layers**__ </font>
##### batch_size = 360
##### num_classes = 3
##### epochs = 200
#### Again, adding a BN layer after each CNN block made the network fail to train.
```
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.35))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.4))
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.45))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])
```
![Imgur](https://i.imgur.com/JUXdJnW.png)
<font size="4">__**Accuracy = 0.66666...**__ </font>
![Imgur](https://i.imgur.com/BTJgay0.png)
<font size="4">__**Accuracy = 0.34444...**__ </font>
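#### The failure to train when BN follows each pooling layer may be related to where BN sits relative to the activation and Dropout. A common alternative, not tested in this report, is to place BatchNormalization between the convolution and its activation. The sketch below is a hypothetical variant of the best model with that ordering; the layer sizes and dropout rates are kept from the report, and only the ordering changes, so any improvement is an untested assumption.
```
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                     Dense, Dropout, Flatten, MaxPooling2D)

# Hypothetical variant (not run for this report): BN is inserted between each
# convolution and its activation instead of after the pooling layer.
model_bn = Sequential()
model_bn.add(Conv2D(32, kernel_size=(3, 3), padding='same', input_shape=(32, 32, 1)))
model_bn.add(BatchNormalization())
model_bn.add(Activation('tanh'))
model_bn.add(MaxPooling2D(pool_size=(2, 2)))
model_bn.add(Dropout(0.35))

model_bn.add(Conv2D(32, kernel_size=(3, 3), padding='same'))
model_bn.add(BatchNormalization())
model_bn.add(Activation('relu'))
model_bn.add(MaxPooling2D(pool_size=(2, 2)))
model_bn.add(Dropout(0.4))

model_bn.add(Conv2D(28, kernel_size=(3, 3), padding='same'))
model_bn.add(BatchNormalization())
model_bn.add(Activation('relu'))
model_bn.add(MaxPooling2D(pool_size=(2, 2)))
model_bn.add(Dropout(0.45))

model_bn.add(Flatten())
model_bn.add(Dense(128, activation='relu'))
model_bn.add(Dropout(0.5))
model_bn.add(Dense(3, activation='softmax'))

model_bn.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
```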