# Introductory Deep Learning: Assignment 2
###### Hand Posture Recognition with CNN
***
# Table of contents
[TOC]
# Dealing with the training data
>I have 5 data sets captured from different people under different shooting conditions.
>The samples in each data set are stored in 9 separate directories.
>The directories 0000~0002, 0003~0005, and 0006~0008 contain samples of Posture 1, 2, and 3, respectively.
>Each directory holds 20 samples, and each sample is a grayscale image of 32x32 pixels.
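As a quick sanity check, the sample counts implied by this layout can be computed directly (a small sketch; the totals match the array shapes printed at the end of the loading code):

```python
# Each data set: 9 directories x 20 samples = 180 images across 3 postures.
n_dirs, n_samples_per_dir = 9, 20
samples_per_set = n_dirs * n_samples_per_dir   # 180 images per data set
train_total = 3 * samples_per_set              # 3 training sets -> 540
test_total = 2 * samples_per_set               # 2 testing sets  -> 360
print(train_total, test_total)  # 540 360
```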
```python=
import os
import cv2
import numpy as np
from keras.utils import np_utils

def enumerate_files(dirs, path='C:/Users/JasmineLu/Desktop/HW2_CNN/All_gray_1_32_32/', n_poses=3, n_samples=20):
    filenames, targets = [], []
    for p in dirs:
        for n in range(n_poses):
            for j in range(3):
                dir_name = path + p + '/000' + str(n*3+j) + '/'
                for s in range(n_samples):
                    d = dir_name + '%04d/' % s
                    for f in os.listdir(d):
                        if f.endswith('jpg'):
                            filenames.append(d + f)
                            targets.append(n)  # posture index 0..2 is the label

    return filenames, targets

# Read each file as a grayscale image and store it as a NumPy array via cv2.imread
def read_images(files):
    imgs = []
    for f in files:
        img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
        imgs.append(img)
    return imgs

def read_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), labels

train_sets = ['Set1', 'Set2', 'Set3']
test_sets = ['Set4', 'Set5']
train_array, train_labels = read_datasets(train_sets)
test_array, test_labels = read_datasets(test_sets)

# Reshape to (samples, height, width, channels) and scale pixels to [0, 1]
train_array = train_array.reshape(train_array.shape[0], 32, 32, 1)
test_array = test_array.reshape(test_array.shape[0], 32, 32, 1)
train_array = train_array.astype('float32') / 255
test_array = test_array.astype('float32') / 255

# One-hot encode the labels
test_labels = np_utils.to_categorical(test_labels)
train_labels = np_utils.to_categorical(train_labels)

print(train_array.shape)   # (540, 32, 32, 1)
print(train_labels.shape)  # (540, 3)
```
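`np_utils.to_categorical` simply maps each integer label to a one-hot row. The same transform can be written in plain NumPy, which is useful on newer Keras versions where `np_utils` no longer exists (a sketch with a hypothetical helper name, not the assignment's actual code):

```python
import numpy as np

def to_one_hot(labels, n_classes=3):
    """One-hot encode integer labels: label k -> a row with a 1 in column k."""
    return np.eye(n_classes, dtype='float32')[np.asarray(labels)]

print(to_one_hot([0, 2, 1]))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```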
# Generating the Model
>This is the code I use to build the CNN model.
```python=
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.normalization import BatchNormalization

model = Sequential()
# Three conv + 2x2 max-pool stages: feature maps go 32x32 -> 16x16 -> 8x8 -> 4x4
model.add(Convolution2D(filters=16, kernel_size=(5,5), padding='same', input_shape=(32,32,1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(rate=0.25))
model.add(Flatten())
# Fully connected classifier head with batch normalization
model.add(Dense(units=100, kernel_initializer='random_uniform'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(rate=0.1))
model.add(Dense(units=50, kernel_initializer='random_uniform'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(units=20, kernel_initializer='random_uniform'))
model.add(BatchNormalization())
model.add(Activation('sigmoid'))
model.add(Dropout(rate=0.1))
# Softmax output over the 3 posture classes
model.add(Dense(units=3, kernel_initializer='random_uniform', activation='softmax'))
print(model.summary())
```
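The three `same`-padded convolutions keep the spatial size, and each 2x2 max-pool halves it (32 → 16 → 8 → 4), so the `Flatten` layer sees 4·4·32 = 512 features. This can be checked by hand:

```python
# Trace the spatial size through the conv/pool stack.
size = 32
for _ in range(3):   # three conv ('same' keeps size) + 2x2 max-pool stages
    size = size // 2  # each max-pool halves height and width

flat_features = size * size * 32  # the last conv stage has 32 filters
print(size, flat_features)  # 4 512
```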
# Training Model
A learning curve is an X-Y plot of the accuracy or loss value (Y-axis) obtained at each epoch (X-axis).
```python=
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_array, train_labels, validation_data=(test_array, test_labels), epochs=40, batch_size=32, verbose=1)

# Evaluate the model on the training set and the testing set
score_train = model.evaluate(train_array, train_labels, verbose=0)
print('\nTrain loss:', score_train[0])
print('Train accuracy: %.2f%%' % (score_train[1]*100))
score_test = model.evaluate(test_array, test_labels, verbose=0)
print('Test loss:', score_test[0])
print('Test accuracy: %.2f%%' % (score_test[1]*100))
```
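Once `model.fit` returns, `history.history` is a plain dict of per-epoch lists, so the learning curve can be inspected without plotting. For example, a small helper (a hypothetical name; note that the metric key is `acc`/`val_acc` on older Keras versions and `accuracy`/`val_accuracy` on newer ones):

```python
def best_epoch(history_dict, metric='val_acc'):
    """Return the (1-based epoch, value) where the metric was highest."""
    values = history_dict[metric]
    best = max(range(len(values)), key=lambda i: values[i])
    return best + 1, values[best]

# Usage with a real Keras History object: best_epoch(history.history)
fake = {'val_acc': [0.70, 0.85, 0.97, 0.96]}
print(best_epoch(fake))  # (3, 0.97)
```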

# Performance comparison
Comparing the accuracy and loss values on the training and testing sets, adding batch normalization helps reduce overfitting, and the results with and without it are markedly different.
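Batch normalization standardizes each feature over the mini-batch and then rescales it with learned gamma/beta parameters. The core of the training-time forward pass is only a few lines of NumPy (a sketch of the idea, not Keras's implementation):

```python
import numpy as np

def batch_norm_forward(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each column (feature) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
out = batch_norm_forward(x)
# Each column of `out` now has (near-)zero mean and unit variance.
```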

# Outcome discussions
I found three likely causes of poor testing performance. One is that the parameters and the number of units are not tuned well. The second is that the activation functions are not chosen well. The third is that the batch size and the number of epochs are not set well.
The model can be trained better by changing the activation functions or by adjusting the parameters and the number of units.
I think batch size is an important hyperparameter: within a certain range, a larger batch size generally gives a more accurate estimate of the descent direction.
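The claim that a larger batch gives a more accurate update direction can be checked numerically: the spread of a mini-batch mean shrinks roughly like 1/sqrt(batch size). A small NumPy experiment with simulated per-sample "gradients":

```python
import numpy as np

rng = np.random.default_rng(0)
grads = rng.normal(loc=1.0, scale=5.0, size=100_000)  # noisy per-sample gradients

spread = {}
for batch in (4, 32, 256):
    # Spread of the batch-mean gradient over many random mini-batches
    means = [rng.choice(grads, size=batch).mean() for _ in range(2000)]
    spread[batch] = float(np.std(means))

print(spread)  # the spread shrinks roughly like 1/sqrt(batch)
```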
# Conclusions
In this assignment, I designed a CNN architecture, experimented with batch normalization, and collected the accuracy and loss values obtained over all epochs.
The final accuracy of my model is close to 100% and the final loss is almost 0.
# Problems encountered
1. **Question**
Why does `cv2.imread` always return `None`?
**Solution**
In my case the problem was Chinese characters in the path, because my username was 佳柔.
I changed my username following [this guide](https://kknews.cc/zh-tw/tech/986rev8.html), which worked well ("佳柔" became "JasmineLu").

2. **Question**
Why do so many warnings appear?

**Solution**
Follow the instructions given in the warning messages, or update the packages they mention.
3. **Question**
Keras reports an error about the input shape.

**Solution**
Add a fourth dimension of size 1 to train_array (or 3 for color images).
I referred to [this website](https://github.com/keras-team/keras/issues/10053).
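The fix from the linked issue amounts to adding a trailing channel axis so the array matches the `(batch, height, width, channels)` shape Keras expects; either `reshape` or `np.expand_dims` works:

```python
import numpy as np

x = np.zeros((540, 32, 32))            # grayscale images, no channel axis yet
x4 = x.reshape(x.shape[0], 32, 32, 1)  # add a channel axis of size 1
same = np.expand_dims(x, axis=-1)      # equivalent
print(x4.shape)  # (540, 32, 32, 1)
```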
4. **Question**
CUDA errors appear.

**Solution**
Download the matching driver from [this website](https://www.nvidia.cn/Download/index.aspx?lang=cn).