Hand Posture Recognition with CNN

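Problem

Use a CNN to recognize three hand postures. We are given five datasets, each captured from a different person under different shooting conditions.

Each dataset is stored in nine separate folders.

Folders 0000-0002, 0003-0005, and 0006-0008 contain posture 1, posture 2, and posture 3, respectively.

Each folder contains 20 image files. Every image is a 32×32 grayscale image.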
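
The folder layout above implies a simple rule: integer-dividing the folder index by 3 gives the posture label. A minimal sketch of that rule follows, where `folder_to_label` is a hypothetical helper name and labels are zero-based (posture 1 maps to label 0), matching the labels produced by the loading code in this note:

```python
def folder_to_label(folder_name: str, folders_per_pose: int = 3) -> int:
    """Map a folder name such as '0004' to a zero-based posture label."""
    return int(folder_name) // folders_per_pose


# Print the label for the first and last folder of each posture.
for name in ["0000", "0002", "0003", "0005", "0006", "0008"]:
    print(name, "-> label", folder_to_label(name))
```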

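Solution

Build a CNN model with Keras, train it on the provided images, and examine the effect of Dropout and Batch Normalization (BN) layers.

This time the CNN is built in Colab.

Upload the .zip file and extract it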

 !unzip /content/All_gray_1_32_32.zip

Import the libraries we need

import cv2
import numpy as np
import os
import matplotlib.pyplot as plt

from tensorflow import keras
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
import tensorflow.keras.losses as losses
import tensorflow.keras.optimizers as optimizers
from tensorflow.keras import backend as K

Use the example provided by the instructor to collect the data we need

def enumerate_files(sets, path='/content/All_gray_1_32_32/All_gray_1_32_32/', n_poses=3, n_samples=20):
    """Collect image paths and zero-based pose labels for the given dataset names."""
    filenames, labels = [], []
    for name_set in sets:
        for i_pose in range(n_poses):
            # Each pose spans three consecutive folders: 0000-0002, 0003-0005, 0006-0008.
            for j in range(3):
                dir_name = os.path.join(path, name_set, '%04d' % (i_pose * 3 + j))
                for i_sample in range(n_samples):
                    # Each sample lives in its own numbered subdirectory.
                    d = os.path.join(dir_name, '%04d' % i_sample)
                    for f in os.listdir(d):
                        # Keep only JPEG files.
                        if f.endswith('jpg'):
                            filenames.append(os.path.join(d, f))
                            labels.append(i_pose)
    return filenames, labels

Wrap the loading steps in simple functions so the main program can fetch the training and test data easily

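def load_datasets(datasets):
    """Return an array of images and a list of labels for the given dataset names."""
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), labels


def read_images(files):
    """Read each file as a grayscale image and scale pixel values to [0, 1]."""
    imgs = []
    for f in files:
        img = cv2.imread(f, cv2.IMREAD_GRAYSCALE)
        if img is None:
            # cv2.imread returns None instead of raising when a file cannot be read.
            raise FileNotFoundError('Could not read image: ' + f)
        imgs.append(img / 255.0)
    return imgs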
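
Keras `Conv2D` layers expect a channel axis, and `categorical_crossentropy` expects one-hot labels, so the arrays returned by `load_datasets` need one more preparation step before training. The sketch below shows one way to do that with NumPy alone. The helper name `prepare_for_keras` and the random stand-in images are assumptions for illustration; `to_categorical` from Keras would give an equivalent one-hot matrix.

```python
import numpy as np


def prepare_for_keras(images, labels, num_classes=3):
    """Add a channel axis to the images and one-hot encode the labels."""
    # (N, 32, 32) -> (N, 32, 32, 1)
    x = np.asarray(images, dtype="float32")[..., np.newaxis]
    # (N,) -> (N, num_classes)
    y = np.eye(num_classes, dtype="float32")[np.asarray(labels)]
    return x, y


# Random stand-in data shaped like six 32x32 grayscale images in [0, 1].
fake_images = np.random.rand(6, 32, 32)
fake_labels = [0, 0, 1, 1, 2, 2]

x, y = prepare_for_keras(fake_images, fake_labels)
input_shape = x.shape[1:]  # shape of a single sample, including the channel axis
print(x.shape, y.shape, input_shape)
```

The resulting `input_shape` is the kind of value the `Conv2D` layers in the models below take as their `input_shape` argument.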

Plot and save the training history

def show_train_history(train_history, str_test_acc, train_acc='acc', test_acc='val_acc'):
    plt.plot(train_history.history[train_acc])
    plt.plot(train_history.history[test_acc])
    plt.title('Train History')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['train', 'test'], loc='upper left')

    plt.savefig(str_test_acc + '.png')

Results

Model With No Drop Out & No BN Layers

batch_size = 360
num_classes = 3
epochs = 200
Tanh(32)->Relu(32)->Relu(28)->Flatten->Relu(128)->softmax
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
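
To get a feel for the size of this network, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes `input_shape = (32, 32, 1)`, which this note does not state explicitly but which matches the 32 by 32 grayscale images; the helper names are hypothetical.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel for a Conv2D layer."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a Dense layer."""
    return in_units * out_units + out_units


size, channels = 32, 1          # assumed input: 32x32 pixels, one gray channel
total = 0
for filters in (32, 32, 28):    # the three Conv2D layers
    total += conv_params(3, channels, filters)
    channels = filters
    size //= 2                  # 'same' padding keeps the size; 2x2 pooling halves it

flat = size * size * channels   # units after Flatten
total += dense_params(flat, 128) + dense_params(128, 3)

print("feature map after the last pooling:", size, "x", size, "x", channels)
print("flattened units:", flat)
print("trainable parameters:", total)
```

Under these assumptions the network has roughly 75 thousand trainable parameters, most of them in the first dense layer.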

Accuracy = 0.838

Accuracy = 0.886

Model With No Drop Out & With BN Layers

batch_size = 360
num_classes = 3
epochs = 200

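Interestingly, adding a BN layer after every CNN layer actually prevented the network from training properly.

Even so, the validation results show that using BN layers does give a noticeable speed-up.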

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Flatten())
model.add(Dense(128, activation='relu'))

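model.add(Dense(num_classes, activation='softmax'))
# Compile settings assumed to match the other models in this note.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])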
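
For reference, a BN layer in training mode normalizes each channel with the mean and variance of the current batch, then applies a learned scale and shift. The NumPy sketch below illustrates only that computation; it is not the Keras implementation, and the function name, the `eps` value, and the random input are assumptions.

```python
import numpy as np


def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Normalize x of shape (N, H, W, C) per channel using batch statistics."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


rng = np.random.default_rng(0)
# Stand-in activations with a non-zero mean and non-unit spread.
x = rng.normal(loc=5.0, scale=3.0, size=(8, 16, 16, 32))
out = batch_norm_train(x, gamma=np.ones(32), beta=np.zeros(32))

print(out.mean(axis=(0, 1, 2))[:3])  # per-channel means, close to 0
print(out.std(axis=(0, 1, 2))[:3])   # per-channel standard deviations, close to 1
```

At inference time Keras uses moving averages of these statistics rather than the batch values, so a BN model can behave differently on training and validation data.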

Accuracy = 0.347222

Accuracy = 0.405555
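
To put figures like these in context: with three equally represented classes, a classifier that always predicts the same class scores about 0.333. The sketch below computes top-1 accuracy from softmax outputs and compares it with that baseline. The prediction values are made up for illustration, and reading the reported numbers as plain top-1 accuracy is an assumption based on `metrics=['acc']` in the compile calls.

```python
import numpy as np


def accuracy(probabilities, labels):
    """Fraction of rows whose highest-probability class equals the label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(labels)))


# Made-up softmax outputs for six samples, two per class.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = [0, 0, 1, 1, 2, 2]

model_acc = accuracy(probs, labels)
# Baseline: always predict class 0 on the same balanced labels.
baseline_acc = float(np.mean(np.asarray(labels) == 0))

print("model accuracy:", model_acc)
print("constant-guess baseline:", baseline_acc)
```

Assuming the test set is balanced, as the folder layout suggests, accuracies of about 0.35 and 0.41 are close to what constant guessing would give.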

(Best) Model With Drop Out & No BN Layers

batch_size = 360
num_classes = 3
epochs = 200
Tanh(32)->Relu(32)->Relu(28)->Flatten->Relu(128)->softmax
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.35))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.4))
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.45))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
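
The `Dropout` layers above randomly zero a fraction of activations during training and pass everything through unchanged at inference. A minimal NumPy sketch of the common "inverted dropout" formulation follows; the function name and the all-ones input are assumptions, and this is an illustration rather than the Keras source.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: zero a `rate` fraction of units and rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return x * mask / keep_prob


rng = np.random.default_rng(0)
x = np.ones((1000, 128))

train_out = dropout(x, rate=0.5, training=True, rng=rng)
test_out = dropout(x, rate=0.5, training=False, rng=rng)

print("fraction zeroed in training:", np.mean(train_out == 0))
print("mean activation in training:", train_out.mean())
print("unchanged at inference:", np.array_equal(test_out, x))
```

Because the surviving units are rescaled during training, no extra scaling is needed when the layer is switched off at test time.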

Accuracy = 0.96111

Accuracy = 0.97222

Model With Drop Out & With BN Layers

batch_size = 360
num_classes = 3
epochs = 200

Interestingly, adding a BN layer after every CNN layer actually prevented the network from training properly.

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.35))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.4))
model.add(Conv2D(28, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.45))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
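model.add(Dense(num_classes, activation='softmax'))
# Compile settings assumed to match the other models in this note.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])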

Accuracy = 0.66666

Accuracy = 0.34444