# Assignment 2 of Introductory Deep Learning
###### Hand Posture Recognition with CNN
***
# Table of contents
[TOC]
# Deal with training data
>I have 5 data sets captured from different people under different photo shooting conditions.
>The samples in each data set are stored in 9 separate directories. The directories 0000~0002, 0003~0005, and 0006~0008 contain samples of Postures 1, 2, and 3, respectively. Each directory has 20 samples, and each sample is a gray-scale image of 32x32 pixels.
```python=
import os
import cv2
import numpy as np
from keras.utils import np_utils

def enumerate_files(dirs, path='C:/Users/JasmineLu/Desktop/HW2_CNN/All_gray_1_32_32/',
                    n_poses=3, n_samples=20):
    """Collect the file path and posture label of every sample image."""
    filenames, targets = [], []
    for p in dirs:
        for n in range(n_poses):
            for j in range(3):  # three directories per posture
                dir_name = path+p+'/000'+str(n*3+j)+'/'
                for s in range(n_samples):
                    d = dir_name+'%04d/'%s
                    for f in os.listdir(d):
                        if f.endswith('jpg'):
                            filenames += [d+f]
                            targets.append(n)
    return filenames, targets

# Read each image as a NumPy array by using cv2.imread
def read_images(files):
    imgs = []
    for f in files:
        img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
        imgs.append(img)
    return imgs

def read_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), labels

train_sets = ['Set1', 'Set2', 'Set3']
test_sets = ['Set4', 'Set5']
train_array, train_labels = read_datasets(train_sets)
test_array, test_labels = read_datasets(test_sets)

# Reshape to (n_samples, height, width, channels) and normalize to [0, 1]
train_array = train_array.reshape(train_array.shape[0],32,32,1)
test_array = test_array.reshape(test_array.shape[0],32,32,1)
train_array = train_array.astype('float32')/255
test_array = test_array.astype('float32')/255

# Label one-hot encoding
test_labels = np_utils.to_categorical(test_labels)
train_labels = np_utils.to_categorical(train_labels)

print(train_array.shape)   #(540, 32, 32, 1)
print(train_labels.shape)  #(540, 3)
```
# Generating Model
>This is the code I use to build the CNN model.
```python=
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten,Activation
from keras.layers.convolutional import Convolution2D,MaxPooling2D
from keras.layers.normalization import BatchNormalization

model = Sequential()
# Three convolution + max-pooling stages
model.add(Convolution2D(filters=16, kernel_size=(5,5), padding='same',
                        input_shape=(32,32,1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(rate=0.25))

# Fully connected classifier with batch normalization
model.add(Flatten())
model.add(Dense(units=100, kernel_initializer='random_uniform'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(rate=0.1))
model.add(Dense(units=50, kernel_initializer='random_uniform'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(units=20, kernel_initializer='random_uniform'))
model.add(BatchNormalization())
model.add(Activation('sigmoid'))
model.add(Dropout(rate=0.1))
model.add(Dense(units=3, kernel_initializer='random_uniform', activation='softmax'))

print(model.summary())
```
# Training Model
A learning curve is an X-Y plot showing the accuracy or loss value (Y-axis) obtained in each epoch (X-axis).
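Such a curve can be drawn from the `History` object that `model.fit` returns. Below is a minimal sketch, assuming matplotlib is installed; note that the metric key is `'acc'` in older Keras versions and `'accuracy'` in newer ones, so both are tried:
```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

def plot_learning_curve(history_dict, filename='learning_curve.png'):
    # history_dict is History.history, a dict of per-epoch metric lists
    key = 'acc' if 'acc' in history_dict else 'accuracy'
    epochs = range(1, len(history_dict[key]) + 1)
    plt.figure()
    plt.plot(epochs, history_dict[key], label='train')
    plt.plot(epochs, history_dict['val_' + key], label='test')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.legend()
    plt.savefig(filename)
    plt.close()
```
After training, `plot_learning_curve(history.history)` saves the curve as an image file.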
```python=
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
history = model.fit(train_array, train_labels,
                    validation_data=(test_array, test_labels),
                    epochs=40, batch_size=32, verbose=1)

# Evaluate the model on the training set and the testing set
score_train = model.evaluate(train_array, train_labels, verbose=0)
print('\nTrain loss:', score_train[0])
print('Train accuracy: %.2f%%' % (score_train[1]*100))
score_test = model.evaluate(test_array, test_labels, verbose=0)
print('Test loss:', score_test[0])
print('Test accuracy: %.2f%%' % (score_test[1]*100))
```
![](https://i.imgur.com/WD6jKR7.png)
# Compared performances
Comparing the accuracy and loss values of my training and testing results, adding batch normalization, which helps reduce the side effect of overfitting, makes the results completely different.
![](https://i.imgur.com/5ecvh8b.png)
# Outcome discussions
I found three possible causes of poor testing performance. The first is that the parameters and the numbers of units are not tuned well. The second is that the activation functions are not chosen well. The third is that the batch size and the number of epochs are not tuned well. The model can therefore be trained better by changing the activation functions or by adjusting the parameters and the numbers of units.
I think batch size is an important parameter in machine learning. In general, within a certain range, the larger the batch size, the more accurate the descent direction it determines.
# Conclusions
In this assignment, I designed the CNN architecture, applied batch normalization, and collected the accuracy and loss values obtained in all epochs. The accuracy of my outcome is close to 100% and its loss value is almost 0.
# Problems encountered
1. **Question**
Why does cv2.imread always return NoneType?
**Solution**
In my case the problem was Chinese characters in the path, because my username is 佳柔.
I changed my username following [this guide](https://kknews.cc/zh-tw/tech/986rev8.html), and it worked well ("佳柔" was changed into "JasmineLu").
![](https://i.imgur.com/9YFnlvT.png)
2. **Question**
Why do so many warnings appear?!
![](https://i.imgur.com/YSmpEs2.png)
**Solution**
Just follow the instructions the warnings recommend, or update the packages.
3. **Question**
Something is wrong with the input.
![](https://i.imgur.com/qWOZCyj.png)
**Solution**
Add a fourth dimension with a value of 1 to train_array (or 3 for color pictures). I referred to [this issue](https://github.com/keras-team/keras/issues/10053).
4. **Question**
Something is wrong with CUDA.
![](https://i.imgur.com/5zcjp4u.png)
**Solution**
Download the corresponding driver from [this website](https://www.nvidia.cn/Download/index.aspx?lang=cn).
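As a footnote to problem 3: the missing channel dimension can be added with plain NumPy. A minimal sketch, where the zero array is only a made-up stand-in for train_array:
```python
import numpy as np

# stand-in for train_array: 540 gray-scale images of 32x32 pixels
imgs = np.zeros((540, 32, 32))
print(imgs.shape)                    # (540, 32, 32) -- missing the channel axis

# add a channel axis of size 1 (it would be 3 for color pictures);
# np.expand_dims is equivalent to imgs.reshape(540, 32, 32, 1) here
imgs = np.expand_dims(imgs, axis=-1)
print(imgs.shape)                    # (540, 32, 32, 1) -- what Convolution2D expects
```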