---
title: Deep Learning_Recognizing 3 Hand Gestures with a CNN_410621225
tags: Templates, Talk
description: View the slide with "Slide Mode".
---
[TOC]
---
# Deep Learning: Recognizing 3 Hand Gestures with a CNN
410621225 葉映辰
---
### Deep Learning
Deep learning is a branch of machine learning that uses artificial neural networks to learn feature representations from data.
### Convolutional Neural Network (CNN)
A CNN is a feed-forward neural network whose artificial neurons process local regions of the input through sliding masks (kernels), which makes it perform particularly well on large-scale image processing.
---
### Code Flowchart
```graphviz
digraph {
compound=true
rankdir=RL
graph [ fontname="Source Sans Pro", fontsize=80 ];
node [ fontname="Source Sans Pro", fontsize=40];
edge [ fontname="Source Sans Pro", fontsize=40 ];
prepare [label="Prepare data"] [shape=box]
load [label="Load data"] [shape=box]
design [label="Design CNN architecture"] [shape=box]
train [label="Train model"] [shape=box]
test [label="Test model"] [shape=box]
plotable [label="Visualize results"] [shape=box]
prepare->load
load->design
design->train
train->test
test->plotable
}
```
---
## CNN Gesture Recognition: Code Walkthrough
### Step 1. Prepare the Data
Gesture 0: ![](https://i.imgur.com/gOvULLw.jpg)
Gesture 1: ![](https://i.imgur.com/KwruEPE.jpg)
Gesture 2: ![](https://i.imgur.com/QE7nXS4.jpg)
Images of the 3 gesture classes, each 32×32 pixels (1,024 pixels). Each set contains 180 images (3 poses × 3 folders × 20 samples, as the loader below assumes), and there are 5 sets in total: the 3 training sets supply 540 images and the 2 test sets supply 360.
#### Download link: http://web.csie.ndhu.edu.tw/ccchiang/Data/All_gray_1_32_32.rar
---
### Step 2. Load the Data
Deep learning relies heavily on its training database; once a large amount of data has been collected, the next step is to read it in batches.
:::spoiler Click to show more
```python=0
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt

### Enumerate the training file paths and put them in a uniform format
def enumerate_files(dirs, path='All_gray_1_32_32', n_poses=3, n_samples=20):
    filenames, targets = [], []
    for p in dirs:
        for n in range(n_poses):
            for j in range(3):
                dir_name = path+'/'+p+'/000'+str(n*3+j)+'/'
                for s in range(n_samples):
                    d = dir_name+'%04d/'%s
                    for f in os.listdir(d):
                        if f.endswith('jpg'):
                            filenames += [d+f]
                            targets.append(n)
    print(filenames, targets)
    return filenames, targets

### Read the files and shape the array to (540, 32, 32, 1)
def read_images(files):
    imgs = []
    for i in range(0, len(files)):
        img = cv2.imread(files[i], 0)  # 0 = read as grayscale
        imgs.extend(np.reshape(img, (1, len(img), len(img), 1)))
    return np.array(imgs)

### Read a dataset (image arrays plus gesture-class labels)
def read_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), np.array(labels)

train_sets = ['Set1', 'Set2', 'Set3']  # Set1-3 train the model
test_sets = ['Set4', 'Set5']           # Set4-5 test the result
trn_array, trn_labels = read_datasets(train_sets)
tst_array, tst_labels = read_datasets(test_sets)
```
(This code is adapted from: http://web.csie.ndhu.edu.tw/ccchiang/Data/fdl_assignment2.pdf)
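For reference, the directory layout that `enumerate_files` expects looks roughly like this (reconstructed from the path arithmetic above; file names are illustrative):
```
All_gray_1_32_32/
├── Set1/
│   ├── 0000/           # folders 0000-0008 = 3 poses × 3 folders each
│   │   ├── 0000/       # sample folders 0000-0019
│   │   │   └── xxx.jpg # one 32×32 grayscale image per folder
│   │   └── ...
│   └── ...
└── Set2/ ... Set5/
```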
:::
---
### Step 3. Design the CNN Architecture
Deep learning works in stages: the model makes (initially random) guesses at each gesture image's label, checks them against the answers we provide, and fine-tunes itself according to whether each guess was right or wrong, eventually producing a gesture-recognition model with a high hit rate. A toy sketch of this guess-and-adjust loop follows.
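A minimal sketch of that loop (illustrative only, not the gesture model): fit a single weight `w` so that `w * x` matches a known answer `y`, nudging `w` a little after every guess.
```python
import numpy as np

# One training example: we want w * x to equal y (the true w is 3).
x, y = 2.0, 6.0
w = np.random.randn()    # start from a random guess
lr = 0.05                # learning rate: how big each adjustment is
for step in range(100):
    guess = w * x
    error = guess - y    # positive if the guess was too high
    w -= lr * error * x  # nudge w in the direction that shrinks the error
print(w)                 # converges toward 3.0
```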
:::spoiler Click to show more
### Layer Types
- #### Convolution Layer (CNN-specific)
Applies the mask-convolution technique from image processing to enhance edges, and can be paired with the ReLU activation function so that only one-sided extreme responses survive (see the NumPy sketch after this list).
- #### Pooling Layer (CNN-specific)
Takes the maximum value inside the mask window as the corresponding entry of the new, smaller array; this also helps remove noise.
- #### Flatten Layer
A CNN treats the image as a multi-dimensional array so it can run mask operations such as convolution; the data must first be flattened to one dimension before the final classification (recognizing 3 gestures means 3 classes, i.e. the 1-D label array [0,1,2]).
- #### Fully Connected Layer
Condenses the preceding feature maps into the final output.
- #### Dropout Layer
Prevents overfitting by deliberately forgetting some information: after being bitten by a "black mongrel", you may forget the dog's "coat color", but you should still remember "what kind of animal" attacked you.
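A minimal NumPy sketch of the two CNN-specific layers above (a toy array, not part of the model): a 3×3 edge-enhancing mask, then ReLU, then a 2×2 max pool.
```python
import numpy as np

# Toy 4x4 "image": a bright square on a dark background
img = np.array([[0, 0, 0, 0],
                [0, 9, 9, 0],
                [0, 9, 9, 0],
                [0, 0, 0, 0]], dtype=float)

# 3x3 edge-enhancing mask (a Laplacian-like kernel)
kernel = np.array([[ 0, -1,  0],
                   [-1,  4, -1],
                   [ 0, -1,  0]], dtype=float)

# "Valid" convolution: slide the mask over every 3x3 window
conv = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        conv[i, j] = np.sum(img[i:i+3, j:j+3] * kernel)

relu = np.maximum(conv, 0)  # ReLU: keep only positive (one-sided) responses
pooled = relu.max()         # one 2x2 max-pool window -> the strongest response
print(conv, relu, pooled, sep='\n')
```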
```python=+
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.callbacks import ReduceLROnPlateau

print(trn_array.shape)  # check that the input shape is (540, 32, 32, 1)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=36, kernel_size=(5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))  # input size is inferred from Flatten
model.add(Dropout(0.05))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer=keras.optimizers.Adadelta(),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])
```
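To double-check the architecture, `model.summary()` prints each layer's output shape and parameter count; the comments below list the shapes the main layers above should produce (Dropout layers keep the same shape):
```python
model.summary()
# Conv2D(32, 3x3, valid)    -> (None, 30, 30, 32)
# MaxPooling2D(2x2)         -> (None, 15, 15, 32)
# Conv2D(36, 5x5, same)     -> (None, 15, 15, 36)
# MaxPooling2D(2x2)         -> (None, 7, 7, 36)
# Flatten                   -> (None, 1764)
# Dense(32, relu)           -> (None, 32)
# Dense(3, softmax)         -> (None, 3)
```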
:::
---
### Step 4. Train the Model
:::spoiler Click to show more
```python=+
# Reduce the learning rate when training loss stops improving;
# loss is minimized, so the callback should watch for a minimum (mode='min').
reduce_lr = ReduceLROnPlateau(monitor='loss', patience=5, mode='min')
train_history = model.fit(trn_array, trn_labels, epochs=10, batch_size=32, callbacks=[reduce_lr])
```
:::
---
### Step 5. Test the Model
:::spoiler Click to show more
```python=+
predictions = model.predict(tst_array)  # one softmax probability triple per test image
score = 0
for i in range(0, len(tst_array)):
    #print(i,": [",'%.0f'% predictions[i,0],"][",'%.0f'% predictions[i,1],"][",'%.0f'% predictions[i,2],"] ",predictions[i,tst_labels[i]])
    # Count a hit when the true class's probability rounds to 1 (i.e. >= 0.5)
    if round(predictions[i, tst_labels[i]]) == 1:
        score += 1
print("hit rate:", score, "/", len(tst_array), " [", round(score/len(tst_array) * 100, 1), "%]")
```
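Note that this scoring rule counts a hit only when the true class receives probability ≥ 0.5. The more common rule takes the most probable class instead; a small sketch using the same `predictions` and `tst_labels` from above:
```python
import numpy as np

pred_classes = np.argmax(predictions, axis=1)  # most probable class per image
accuracy = np.mean(pred_classes == np.asarray(tst_labels))
print("argmax accuracy:", round(accuracy * 100, 1), "%")
```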
:::
---
### Step 6. Visualize the Results
:::spoiler Click to show more
```python=+
plt.subplot(1, 2, 1)
plt.title("loss/epoch")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.plot(train_history.epoch, train_history.history['loss'], linestyle="--", marker='o')
# Annotate each point with its value, slightly above the marker
for a, b in zip(train_history.epoch, train_history.history['loss']):
    plt.text(float(a), float(b) + 0.1, round(b, 2), ha='center', va='bottom', fontsize=13)
plt.subplot(1, 2, 2)
plt.title("sparse_categorical_accuracy/epoch")
plt.xlabel("epoch")
plt.ylabel("sparse_categorical_accuracy")
plt.plot(train_history.epoch, train_history.history['sparse_categorical_accuracy'], linestyle="--", marker='o')
for a, b in zip(train_history.epoch, train_history.history['sparse_categorical_accuracy']):
    plt.text(float(a), float(b) + 0.003, round(b, 2), ha='center', va='bottom', fontsize=13)
plt.show()
```
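If the two panels crowd each other, widening the canvas before the first `subplot` call and tightening the layout before `show` helps (standard matplotlib calls, not part of the original run):
```python
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 5))  # widen the canvas so both panels fit side by side
# ... run the subplot/plot calls above ...
plt.tight_layout()           # pack titles and labels without overlap
plt.show()
```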
:::
---
### Test Results
![](https://i.imgur.com/B4LToOD.png)
sparse_categorical_accuracy: 0.9537
hit rate: 331 / 360 [ 91.9 %]
:::spoiler Training log
```python=0
32/540 [>.............................] - ETA: 2s - loss: 20.6139 - sparse_categorical_accuracy: 0.3750
96/540 [====>.........................] - ETA: 1s - loss: 15.8058 - sparse_categorical_accuracy: 0.3958
160/540 [=======>......................] - ETA: 0s - loss: 13.7545 - sparse_categorical_accuracy: 0.3750
224/540 [===========>..................] - ETA: 0s - loss: 10.5184 - sparse_categorical_accuracy: 0.3750
288/540 [===============>..............] - ETA: 0s - loss: 8.6669 - sparse_categorical_accuracy: 0.3854
352/540 [==================>...........] - ETA: 0s - loss: 7.4399 - sparse_categorical_accuracy: 0.3807
416/540 [======================>.......] - ETA: 0s - loss: 6.5333 - sparse_categorical_accuracy: 0.3870
480/540 [=========================>....] - ETA: 0s - loss: 5.8792 - sparse_categorical_accuracy: 0.3917
540/540 [==============================] - 1s 2ms/step - loss: 5.3469 - sparse_categorical_accuracy: 0.4037
Epoch 2/10
32/540 [>.............................] - ETA: 0s - loss: 0.7484 - sparse_categorical_accuracy: 0.6250
96/540 [====>.........................] - ETA: 0s - loss: 1.3329 - sparse_categorical_accuracy: 0.4688
160/540 [=======>......................] - ETA: 0s - loss: 1.3280 - sparse_categorical_accuracy: 0.4563
224/540 [===========>..................] - ETA: 0s - loss: 1.2546 - sparse_categorical_accuracy: 0.4732
256/540 [=============>................] - ETA: 0s - loss: 1.2180 - sparse_categorical_accuracy: 0.4766
320/540 [================>.............] - ETA: 0s - loss: 1.1545 - sparse_categorical_accuracy: 0.5031
384/540 [====================>.........] - ETA: 0s - loss: 1.0997 - sparse_categorical_accuracy: 0.5182
416/540 [======================>.......] - ETA: 0s - loss: 1.1175 - sparse_categorical_accuracy: 0.5168
448/540 [=======================>......] - ETA: 0s - loss: 1.1255 - sparse_categorical_accuracy: 0.5179
480/540 [=========================>....] - ETA: 0s - loss: 1.1042 - sparse_categorical_accuracy: 0.5229
540/540 [==============================] - 1s 2ms/step - loss: 1.0894 - sparse_categorical_accuracy: 0.5148
Epoch 3/10
32/540 [>.............................] - ETA: 0s - loss: 1.0511 - sparse_categorical_accuracy: 0.5312
96/540 [====>.........................] - ETA: 0s - loss: 0.9700 - sparse_categorical_accuracy: 0.5208
160/540 [=======>......................] - ETA: 0s - loss: 0.8853 - sparse_categorical_accuracy: 0.5688
192/540 [=========>....................] - ETA: 0s - loss: 0.8845 - sparse_categorical_accuracy: 0.5885
256/540 [=============>................] - ETA: 0s - loss: 0.8377 - sparse_categorical_accuracy: 0.6016
320/540 [================>.............] - ETA: 0s - loss: 0.8377 - sparse_categorical_accuracy: 0.6219
384/540 [====================>.........] - ETA: 0s - loss: 0.8343 - sparse_categorical_accuracy: 0.6302
448/540 [=======================>......] - ETA: 0s - loss: 0.8381 - sparse_categorical_accuracy: 0.6339
512/540 [===========================>..] - ETA: 0s - loss: 0.8353 - sparse_categorical_accuracy: 0.6328
540/540 [==============================] - 1s 1ms/step - loss: 0.8219 - sparse_categorical_accuracy: 0.6426
Epoch 4/10
32/540 [>.............................] - ETA: 0s - loss: 0.6801 - sparse_categorical_accuracy: 0.6562
96/540 [====>.........................] - ETA: 0s - loss: 0.6091 - sparse_categorical_accuracy: 0.7396
160/540 [=======>......................] - ETA: 0s - loss: 0.5165 - sparse_categorical_accuracy: 0.7625
192/540 [=========>....................] - ETA: 0s - loss: 0.4942 - sparse_categorical_accuracy: 0.7812
256/540 [=============>................] - ETA: 0s - loss: 0.4932 - sparse_categorical_accuracy: 0.7812
320/540 [================>.............] - ETA: 0s - loss: 0.5351 - sparse_categorical_accuracy: 0.7656
384/540 [====================>.........] - ETA: 0s - loss: 0.5128 - sparse_categorical_accuracy: 0.7786
448/540 [=======================>......] - ETA: 0s - loss: 0.5065 - sparse_categorical_accuracy: 0.7857
512/540 [===========================>..] - ETA: 0s - loss: 0.5072 - sparse_categorical_accuracy: 0.7832
540/540 [==============================] - 1s 1ms/step - loss: 0.5062 - sparse_categorical_accuracy: 0.7833
Epoch 5/10
32/540 [>.............................] - ETA: 0s - loss: 0.2141 - sparse_categorical_accuracy: 0.8750
96/540 [====>.........................] - ETA: 0s - loss: 0.3087 - sparse_categorical_accuracy: 0.8750
160/540 [=======>......................] - ETA: 0s - loss: 0.3582 - sparse_categorical_accuracy: 0.8750
224/540 [===========>..................] - ETA: 0s - loss: 0.3434 - sparse_categorical_accuracy: 0.8705
288/540 [===============>..............] - ETA: 0s - loss: 0.3428 - sparse_categorical_accuracy: 0.8715
352/540 [==================>...........] - ETA: 0s - loss: 0.3682 - sparse_categorical_accuracy: 0.8636
416/540 [======================>.......] - ETA: 0s - loss: 0.4145 - sparse_categorical_accuracy: 0.8438
480/540 [=========================>....] - ETA: 0s - loss: 0.4184 - sparse_categorical_accuracy: 0.8333
540/540 [==============================] - 1s 1ms/step - loss: 0.3950 - sparse_categorical_accuracy: 0.8463
Epoch 6/10
32/540 [>.............................] - ETA: 0s - loss: 0.2378 - sparse_categorical_accuracy: 0.9062
96/540 [====>.........................] - ETA: 0s - loss: 0.3724 - sparse_categorical_accuracy: 0.8646
160/540 [=======>......................] - ETA: 0s - loss: 0.3862 - sparse_categorical_accuracy: 0.8313
224/540 [===========>..................] - ETA: 0s - loss: 0.4477 - sparse_categorical_accuracy: 0.8125
288/540 [===============>..............] - ETA: 0s - loss: 0.5798 - sparse_categorical_accuracy: 0.7882
352/540 [==================>...........] - ETA: 0s - loss: 0.5076 - sparse_categorical_accuracy: 0.8182
416/540 [======================>.......] - ETA: 0s - loss: 0.4747 - sparse_categorical_accuracy: 0.8269
480/540 [=========================>....] - ETA: 0s - loss: 0.4445 - sparse_categorical_accuracy: 0.8354
540/540 [==============================] - 1s 1ms/step - loss: 0.4179 - sparse_categorical_accuracy: 0.8481
Epoch 7/10
32/540 [>.............................] - ETA: 0s - loss: 0.3136 - sparse_categorical_accuracy: 0.9062
96/540 [====>.........................] - ETA: 0s - loss: 0.2757 - sparse_categorical_accuracy: 0.8958
160/540 [=======>......................] - ETA: 0s - loss: 0.2453 - sparse_categorical_accuracy: 0.9062
192/540 [=========>....................] - ETA: 0s - loss: 0.2569 - sparse_categorical_accuracy: 0.8906
256/540 [=============>................] - ETA: 0s - loss: 0.2474 - sparse_categorical_accuracy: 0.8945
320/540 [================>.............] - ETA: 0s - loss: 0.2347 - sparse_categorical_accuracy: 0.8969
384/540 [====================>.........] - ETA: 0s - loss: 0.2347 - sparse_categorical_accuracy: 0.9010
448/540 [=======================>......] - ETA: 0s - loss: 0.2246 - sparse_categorical_accuracy: 0.9085
512/540 [===========================>..] - ETA: 0s - loss: 0.2176 - sparse_categorical_accuracy: 0.9121
540/540 [==============================] - 1s 1ms/step - loss: 0.2137 - sparse_categorical_accuracy: 0.9148
Epoch 8/10
32/540 [>.............................] - ETA: 0s - loss: 0.1827 - sparse_categorical_accuracy: 0.8750
96/540 [====>.........................] - ETA: 0s - loss: 0.1591 - sparse_categorical_accuracy: 0.9167
160/540 [=======>......................] - ETA: 0s - loss: 0.1790 - sparse_categorical_accuracy: 0.9187
192/540 [=========>....................] - ETA: 0s - loss: 0.1646 - sparse_categorical_accuracy: 0.9323
256/540 [=============>................] - ETA: 0s - loss: 0.1649 - sparse_categorical_accuracy: 0.9336
320/540 [================>.............] - ETA: 0s - loss: 0.1663 - sparse_categorical_accuracy: 0.9375
384/540 [====================>.........] - ETA: 0s - loss: 0.1748 - sparse_categorical_accuracy: 0.9323
448/540 [=======================>......] - ETA: 0s - loss: 0.1757 - sparse_categorical_accuracy: 0.9308
480/540 [=========================>....] - ETA: 0s - loss: 0.1742 - sparse_categorical_accuracy: 0.9333
540/540 [==============================] - 1s 1ms/step - loss: 0.1771 - sparse_categorical_accuracy: 0.9315
Epoch 9/10
32/540 [>.............................] - ETA: 0s - loss: 0.2457 - sparse_categorical_accuracy: 0.9375
96/540 [====>.........................] - ETA: 0s - loss: 0.1723 - sparse_categorical_accuracy: 0.9271
128/540 [======>.......................] - ETA: 0s - loss: 0.1568 - sparse_categorical_accuracy: 0.9297
160/540 [=======>......................] - ETA: 0s - loss: 0.1336 - sparse_categorical_accuracy: 0.9438
192/540 [=========>....................] - ETA: 0s - loss: 0.1445 - sparse_categorical_accuracy: 0.9375
224/540 [===========>..................] - ETA: 0s - loss: 0.1471 - sparse_categorical_accuracy: 0.9375
256/540 [=============>................] - ETA: 0s - loss: 0.1390 - sparse_categorical_accuracy: 0.9453
288/540 [===============>..............] - ETA: 0s - loss: 0.1366 - sparse_categorical_accuracy: 0.9479
320/540 [================>.............] - ETA: 0s - loss: 0.1435 - sparse_categorical_accuracy: 0.9469
352/540 [==================>...........] - ETA: 0s - loss: 0.1460 - sparse_categorical_accuracy: 0.9489
384/540 [====================>.........] - ETA: 0s - loss: 0.1380 - sparse_categorical_accuracy: 0.9531
448/540 [=======================>......] - ETA: 0s - loss: 0.1367 - sparse_categorical_accuracy: 0.9487
512/540 [===========================>..] - ETA: 0s - loss: 0.1498 - sparse_categorical_accuracy: 0.9434
540/540 [==============================] - 1s 2ms/step - loss: 0.1585 - sparse_categorical_accuracy: 0.9389
Epoch 10/10
32/540 [>.............................] - ETA: 0s - loss: 0.2048 - sparse_categorical_accuracy: 0.9688
96/540 [====>.........................] - ETA: 0s - loss: 0.2236 - sparse_categorical_accuracy: 0.9167
128/540 [======>.......................] - ETA: 0s - loss: 0.1914 - sparse_categorical_accuracy: 0.9375
160/540 [=======>......................] - ETA: 0s - loss: 0.1591 - sparse_categorical_accuracy: 0.9500
224/540 [===========>..................] - ETA: 0s - loss: 0.1438 - sparse_categorical_accuracy: 0.9554
256/540 [=============>................] - ETA: 0s - loss: 0.1465 - sparse_categorical_accuracy: 0.9531
320/540 [================>.............] - ETA: 0s - loss: 0.1325 - sparse_categorical_accuracy: 0.9594
384/540 [====================>.........] - ETA: 0s - loss: 0.1505 - sparse_categorical_accuracy: 0.9505
448/540 [=======================>......] - ETA: 0s - loss: 0.1480 - sparse_categorical_accuracy: 0.9531
512/540 [===========================>..] - ETA: 0s - loss: 0.1456 - sparse_categorical_accuracy: 0.9551
540/540 [==============================] - 1s 2ms/step - loss: 0.1459 - sparse_categorical_accuracy: 0.9537
hit rate: 331 / 360 [ 91.9 %]
```
:::
---
# Complete Code
```python=0
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt

def enumerate_files(dirs, path='All_gray_1_32_32', n_poses=3, n_samples=20):
    filenames, targets = [], []
    for p in dirs:
        for n in range(n_poses):
            for j in range(3):
                dir_name = path+'/'+p+'/000'+str(n*3+j)+'/'
                for s in range(n_samples):
                    d = dir_name+'%04d/'%s
                    for f in os.listdir(d):
                        if f.endswith('jpg'):
                            filenames += [d+f]
                            targets.append(n)
    print(filenames, targets)
    return filenames, targets

def read_images(files):
    imgs = []
    for i in range(0, len(files)):
        img = cv2.imread(files[i], 0)  # 0 = read as grayscale
        imgs.extend(np.reshape(img, (1, len(img), len(img), 1)))
    return np.array(imgs)

def read_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), np.array(labels)

train_sets = ['Set1', 'Set2', 'Set3']
test_sets = ['Set4', 'Set5']
trn_array, trn_labels = read_datasets(train_sets)
tst_array, tst_labels = read_datasets(test_sets)

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.callbacks import ReduceLROnPlateau

print(trn_array.shape)  # expect (540, 32, 32, 1)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=36, kernel_size=(5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.05))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer=keras.optimizers.Adadelta(),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

reduce_lr = ReduceLROnPlateau(monitor='loss', patience=5, mode='min')  # loss is minimized
train_history = model.fit(trn_array, trn_labels, epochs=10, batch_size=32, callbacks=[reduce_lr])

predictions = model.predict(tst_array)
score = 0
for i in range(0, len(tst_array)):
    #print(i,": [",'%.0f'% predictions[i,0],"][",'%.0f'% predictions[i,1],"][",'%.0f'% predictions[i,2],"] ",predictions[i,tst_labels[i]])
    if round(predictions[i, tst_labels[i]]) == 1:
        score += 1
print("hit rate:", score, "/", len(tst_array), " [", round(score/len(tst_array) * 100, 1), "%]")

plt.subplot(1, 2, 1)
plt.title("loss/epoch")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.plot(train_history.epoch, train_history.history['loss'], linestyle="--", marker='o')
for a, b in zip(train_history.epoch, train_history.history['loss']):
    plt.text(float(a), float(b) + 0.1, round(b, 2), ha='center', va='bottom', fontsize=13)
plt.subplot(1, 2, 2)
plt.title("sparse_categorical_accuracy/epoch")
plt.xlabel("epoch")
plt.ylabel("sparse_categorical_accuracy")
plt.plot(train_history.epoch, train_history.history['sparse_categorical_accuracy'], linestyle="--", marker='o')
for a, b in zip(train_history.epoch, train_history.history['sparse_categorical_accuracy']):
    plt.text(float(a), float(b) + 0.003, round(b, 2), ha='center', va='bottom', fontsize=13)
plt.show()
```
---
### Questions and Feedback Welcome
Contact:
- email me: 410621225@gms.ndhu.edu.tw