---
title: Deep Learning_Recognizing 3 Hand Gestures with CNN_410621225
tags: Templates, Talk
description: View the slide with "Slide Mode".
---

[TOC]

---

# Deep Learning: Recognizing 3 Hand Gestures with a CNN

410621225 葉映辰

---

### Deep learning

A branch of machine learning that uses artificial neural networks to learn feature representations directly from data.

### Convolutional neural network (CNN)

A feed-forward neural network whose artificial neurons process local regions of the input through sliding masks (kernels), which makes it particularly effective for large-scale image processing.

---

### Program flowchart

```graphviz
digraph {
  compound=true
  rankdir=RL
  graph [ fontname="Source Sans Pro", fontsize=80 ];
  node [ fontname="Source Sans Pro", fontsize=40 ];
  edge [ fontname="Source Sans Pro", fontsize=50 ];
  prepare [label="Prepare data"] [shape=box]
  load [label="Load data"] [shape=box]
  design [label="Design the CNN architecture"] [shape=box]
  train [label="Train the model"] [shape=box]
  test [label="Test the model"] [shape=box]
  plotable [label="Visualize results"] [shape=box]
  prepare->load
  load->design
  design->train
  train->test
  test->plotable
}
```

---

## Walkthrough of the CNN gesture-recognition code

### Step 1. Prepare the data

Gesture 0:
![](https://i.imgur.com/gOvULLw.jpg)
Gesture 1:
![](https://i.imgur.com/KwruEPE.jpg)
Gesture 2:
![](https://i.imgur.com/QE7nXS4.jpg)

The dataset covers 3 gesture classes. Each image is 32×32 pixels (1,024 values), each set contains 180 images, and there are 5 sets in total.

#### Download link:
http://web.csie.ndhu.edu.tw/ccchiang/Data/All_gray_1_32_32.rar

---

### Step 2. Load the data

Deep learning depends heavily on its training data, so once a large dataset is available the next step is to read it in batch.

:::spoiler Click to show more
```python=0
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt

# Enumerate the training image paths and collect their class labels
def enumerate_files(dirs, path='All_gray_1_32_32', n_poses=3, n_samples=20):
    filenames, targets = [], []
    for p in dirs:
        for n in range(n_poses):
            for j in range(3):
                dir_name = path+'/'+p+'/000'+str(n*3+j)+'/'
                for s in range(n_samples):
                    d = dir_name+'%04d/'%s
                    for f in os.listdir(d):
                        if f.endswith('jpg'):
                            filenames += [d+f]
                            targets.append(n)
    print(filenames, targets)
    return filenames, targets

# Read the image files and stack them as an (N, 32, 32, 1) array
# (N = 540 for the training sets)
def read_images(files):
    imgs = []
    for i in range(0, len(files)):
        img = cv2.imread(files[i], 0)  # 0 = load as grayscale
        imgs.extend(np.reshape(img, (1, len(img), len(img), 1)))
    return np.array(imgs)

# Read a dataset: the image array plus the gesture-class labels
def read_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), labels

train_sets = ['Set1', 'Set2', 'Set3']  # Set1-3 are used to train the model
test_sets = ['Set4', 'Set5']           # Set4-5 are used to test the result

trn_array, trn_labels = read_datasets(train_sets)
tst_array, tst_labels = read_datasets(test_sets)
```
(Code adapted from: http://web.csie.ndhu.edu.tw/ccchiang/Data/fdl_assignment2.pdf)
:::
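One optional refinement, not part of the original code: `read_images` returns raw grayscale intensities in the range 0–255, and scaling them to [0, 1] before training is a common way to keep the inputs in a small, consistent numeric range. A minimal sketch, assuming the `trn_array` and `tst_array` produced above:

```python
# Optional preprocessing sketch (an addition; the original pipeline feeds raw 0-255 values)
trn_array = trn_array.astype('float32') / 255.0  # scale training images to [0, 1]
tst_array = tst_array.astype('float32') / 255.0  # scale test images to [0, 1]
print(trn_array.shape, trn_array.min(), trn_array.max())  # expect (540, 32, 32, 1) 0.0 1.0
```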
---

### Step 3. Design the CNN architecture

Training proceeds in stages: the network starts by making essentially random guesses at each image's gesture code, checks those guesses against the labels we supply, and nudges its weights depending on whether each guess was right or wrong, eventually yielding a gesture-recognition model with a high hit rate.

:::spoiler Click to show more

### The layers used

- #### Convolution layer (CNN-specific)
  Applies the mask-convolution technique from image processing to "enhance edges"; pairing it with the "ReLU activation" clips negative responses so that edges are pushed to one extreme only.
- #### Pooling layer (CNN-specific)
  Takes the maximum value inside each mask window as the corresponding entry of the new array, which also helps remove noise.
- #### Flatten layer
  The CNN treats the image as a multi-dimensional array so that mask operations such as convolution can be applied; the result must be reduced to one dimension before the final classification (recognizing 3 gestures means 3 classes, i.e. the 1-D label array [0, 1, 2]).
- #### Fully connected layer
  Condenses the preceding features into the final output.
- #### Dropout layer
  Prevents overfitting by deliberately forgetting some information; much like after being bitten by a "black stray dog", you may forget the dog's "fur color" as long as you still remember "what kind of animal" attacked you.

```python=+
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.callbacks import ReduceLROnPlateau

print(trn_array.shape)  # check that the input shape is (540, 32, 32, 1)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=36, kernel_size=(5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))  # input size is inferred from the Flatten output
model.add(Dropout(0.05))
model.add(Dense(3, activation='softmax'))

model.compile(optimizer=keras.optimizers.Adadelta(),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])
```
:::

---

### Step 4. Train the model

:::spoiler Click to show more
```python=+
# Reduce the learning rate when the training loss stops improving
# (mode='min', since a lower loss is better)
reduce_lr = ReduceLROnPlateau(monitor='loss', patience=5, mode='min')
train_history = model.fit(trn_array, trn_labels,
                          epochs=10, batch_size=32, callbacks=[reduce_lr])
```
:::

---

### Step 5. Test the model

:::spoiler Click to show more
```python=+
predictions = model.predict(tst_array)
score = 0
for i in range(0, len(tst_array)):
    #print(i,": [",'%.0f'% predictions[i,0],"][",'%.0f'% predictions[i,1],"][",'%.0f'% predictions[i,2],"] ",predictions[i,tst_labels[i]])
    # count a hit when the softmax probability of the true class rounds to 1 (i.e. >= 0.5)
    if round(predictions[i, tst_labels[i]]) == 1:
        score += 1
print("hit rate:", score, "/", len(tst_array), " [", round(score/len(tst_array) * 100, 1), "%]")
```
:::

---

### Step 6. Visualize with charts

:::spoiler Click to show more
```python=+
plt.subplot(1, 2, 1)
plt.title("loss/epoch")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.plot(train_history.epoch, train_history.history['loss'], linestyle="--", marker='o')
# annotate each point with its loss value
for a, b in zip(train_history.epoch, train_history.history['loss']):
    plt.text(float(a), float(b) + 0.1, round(b, 2), ha='center', va='bottom', fontsize=13)

plt.subplot(1, 2, 2)
plt.title("sparse_categorical_accuracy/epoch")
plt.xlabel("epoch")
plt.ylabel("sparse_categorical_accuracy")
plt.plot(train_history.epoch, train_history.history['sparse_categorical_accuracy'], linestyle="--", marker='o')
# annotate each point with its accuracy value
for a, b in zip(train_history.epoch, train_history.history['sparse_categorical_accuracy']):
    plt.text(float(a), float(b) + 0.003, round(b, 2), ha='center', va='bottom', fontsize=13)
plt.show()
```
:::

---

### Test results

![](https://i.imgur.com/B4LToOD.png)

sparse_categorical_accuracy: 0.9537
hit rate: 331 / 360 [ 91.9 %]

:::spoiler Training log
```
 32/540 [>.............................] - ETA: 2s - loss: 20.6139 - sparse_categorical_accuracy: 0.3750
 96/540 [====>.........................] - ETA: 1s - loss: 15.8058 - sparse_categorical_accuracy: 0.3958
160/540 [=======>......................] - ETA: 0s - loss: 13.7545 - sparse_categorical_accuracy: 0.3750
224/540 [===========>..................] - ETA: 0s - loss: 10.5184 - sparse_categorical_accuracy: 0.3750
288/540 [===============>..............] - ETA: 0s - loss: 8.6669 - sparse_categorical_accuracy: 0.3854
352/540 [==================>...........] - ETA: 0s - loss: 7.4399 - sparse_categorical_accuracy: 0.3807
416/540 [======================>.......]
- ETA: 0s - loss: 6.5333 - sparse_categorical_accuracy: 0.3870 480/540 [=========================>....] - ETA: 0s - loss: 5.8792 - sparse_categorical_accuracy: 0.3917 540/540 [==============================] - 1s 2ms/step - loss: 5.3469 - sparse_categorical_accuracy: 0.4037 Epoch 2/10 32/540 [>.............................] - ETA: 0s - loss: 0.7484 - sparse_categorical_accuracy: 0.6250 96/540 [====>.........................] - ETA: 0s - loss: 1.3329 - sparse_categorical_accuracy: 0.4688 160/540 [=======>......................] - ETA: 0s - loss: 1.3280 - sparse_categorical_accuracy: 0.4563 224/540 [===========>..................] - ETA: 0s - loss: 1.2546 - sparse_categorical_accuracy: 0.4732 256/540 [=============>................] - ETA: 0s - loss: 1.2180 - sparse_categorical_accuracy: 0.4766 320/540 [================>.............] - ETA: 0s - loss: 1.1545 - sparse_categorical_accuracy: 0.5031 384/540 [====================>.........] - ETA: 0s - loss: 1.0997 - sparse_categorical_accuracy: 0.5182 416/540 [======================>.......] - ETA: 0s - loss: 1.1175 - sparse_categorical_accuracy: 0.5168 448/540 [=======================>......] - ETA: 0s - loss: 1.1255 - sparse_categorical_accuracy: 0.5179 480/540 [=========================>....] - ETA: 0s - loss: 1.1042 - sparse_categorical_accuracy: 0.5229 540/540 [==============================] - 1s 2ms/step - loss: 1.0894 - sparse_categorical_accuracy: 0.5148 Epoch 3/10 32/540 [>.............................] - ETA: 0s - loss: 1.0511 - sparse_categorical_accuracy: 0.5312 96/540 [====>.........................] - ETA: 0s - loss: 0.9700 - sparse_categorical_accuracy: 0.5208 160/540 [=======>......................] - ETA: 0s - loss: 0.8853 - sparse_categorical_accuracy: 0.5688 192/540 [=========>....................] - ETA: 0s - loss: 0.8845 - sparse_categorical_accuracy: 0.5885 256/540 [=============>................] - ETA: 0s - loss: 0.8377 - sparse_categorical_accuracy: 0.6016 320/540 [================>.............] - ETA: 0s - loss: 0.8377 - sparse_categorical_accuracy: 0.6219 384/540 [====================>.........] - ETA: 0s - loss: 0.8343 - sparse_categorical_accuracy: 0.6302 448/540 [=======================>......] - ETA: 0s - loss: 0.8381 - sparse_categorical_accuracy: 0.6339 512/540 [===========================>..] - ETA: 0s - loss: 0.8353 - sparse_categorical_accuracy: 0.6328 540/540 [==============================] - 1s 1ms/step - loss: 0.8219 - sparse_categorical_accuracy: 0.6426 Epoch 4/10 32/540 [>.............................] - ETA: 0s - loss: 0.6801 - sparse_categorical_accuracy: 0.6562 96/540 [====>.........................] - ETA: 0s - loss: 0.6091 - sparse_categorical_accuracy: 0.7396 160/540 [=======>......................] - ETA: 0s - loss: 0.5165 - sparse_categorical_accuracy: 0.7625 192/540 [=========>....................] - ETA: 0s - loss: 0.4942 - sparse_categorical_accuracy: 0.7812 256/540 [=============>................] - ETA: 0s - loss: 0.4932 - sparse_categorical_accuracy: 0.7812 320/540 [================>.............] - ETA: 0s - loss: 0.5351 - sparse_categorical_accuracy: 0.7656 384/540 [====================>.........] - ETA: 0s - loss: 0.5128 - sparse_categorical_accuracy: 0.7786 448/540 [=======================>......] - ETA: 0s - loss: 0.5065 - sparse_categorical_accuracy: 0.7857 512/540 [===========================>..] 
- ETA: 0s - loss: 0.5072 - sparse_categorical_accuracy: 0.7832 540/540 [==============================] - 1s 1ms/step - loss: 0.5062 - sparse_categorical_accuracy: 0.7833 Epoch 5/10 32/540 [>.............................] - ETA: 0s - loss: 0.2141 - sparse_categorical_accuracy: 0.8750 96/540 [====>.........................] - ETA: 0s - loss: 0.3087 - sparse_categorical_accuracy: 0.8750 160/540 [=======>......................] - ETA: 0s - loss: 0.3582 - sparse_categorical_accuracy: 0.8750 224/540 [===========>..................] - ETA: 0s - loss: 0.3434 - sparse_categorical_accuracy: 0.8705 288/540 [===============>..............] - ETA: 0s - loss: 0.3428 - sparse_categorical_accuracy: 0.8715 352/540 [==================>...........] - ETA: 0s - loss: 0.3682 - sparse_categorical_accuracy: 0.8636 416/540 [======================>.......] - ETA: 0s - loss: 0.4145 - sparse_categorical_accuracy: 0.8438 480/540 [=========================>....] - ETA: 0s - loss: 0.4184 - sparse_categorical_accuracy: 0.8333 540/540 [==============================] - 1s 1ms/step - loss: 0.3950 - sparse_categorical_accuracy: 0.8463 Epoch 6/10 32/540 [>.............................] - ETA: 0s - loss: 0.2378 - sparse_categorical_accuracy: 0.9062 96/540 [====>.........................] - ETA: 0s - loss: 0.3724 - sparse_categorical_accuracy: 0.8646 160/540 [=======>......................] - ETA: 0s - loss: 0.3862 - sparse_categorical_accuracy: 0.8313 224/540 [===========>..................] - ETA: 0s - loss: 0.4477 - sparse_categorical_accuracy: 0.8125 288/540 [===============>..............] - ETA: 0s - loss: 0.5798 - sparse_categorical_accuracy: 0.7882 352/540 [==================>...........] - ETA: 0s - loss: 0.5076 - sparse_categorical_accuracy: 0.8182 416/540 [======================>.......] - ETA: 0s - loss: 0.4747 - sparse_categorical_accuracy: 0.8269 480/540 [=========================>....] - ETA: 0s - loss: 0.4445 - sparse_categorical_accuracy: 0.8354 540/540 [==============================] - 1s 1ms/step - loss: 0.4179 - sparse_categorical_accuracy: 0.8481 Epoch 7/10 32/540 [>.............................] - ETA: 0s - loss: 0.3136 - sparse_categorical_accuracy: 0.9062 96/540 [====>.........................] - ETA: 0s - loss: 0.2757 - sparse_categorical_accuracy: 0.8958 160/540 [=======>......................] - ETA: 0s - loss: 0.2453 - sparse_categorical_accuracy: 0.9062 192/540 [=========>....................] - ETA: 0s - loss: 0.2569 - sparse_categorical_accuracy: 0.8906 256/540 [=============>................] - ETA: 0s - loss: 0.2474 - sparse_categorical_accuracy: 0.8945 320/540 [================>.............] - ETA: 0s - loss: 0.2347 - sparse_categorical_accuracy: 0.8969 384/540 [====================>.........] - ETA: 0s - loss: 0.2347 - sparse_categorical_accuracy: 0.9010 448/540 [=======================>......] - ETA: 0s - loss: 0.2246 - sparse_categorical_accuracy: 0.9085 512/540 [===========================>..] - ETA: 0s - loss: 0.2176 - sparse_categorical_accuracy: 0.9121 540/540 [==============================] - 1s 1ms/step - loss: 0.2137 - sparse_categorical_accuracy: 0.9148 Epoch 8/10 32/540 [>.............................] - ETA: 0s - loss: 0.1827 - sparse_categorical_accuracy: 0.8750 96/540 [====>.........................] - ETA: 0s - loss: 0.1591 - sparse_categorical_accuracy: 0.9167 160/540 [=======>......................] - ETA: 0s - loss: 0.1790 - sparse_categorical_accuracy: 0.9187 192/540 [=========>....................] 
- ETA: 0s - loss: 0.1646 - sparse_categorical_accuracy: 0.9323 256/540 [=============>................] - ETA: 0s - loss: 0.1649 - sparse_categorical_accuracy: 0.9336 320/540 [================>.............] - ETA: 0s - loss: 0.1663 - sparse_categorical_accuracy: 0.9375 384/540 [====================>.........] - ETA: 0s - loss: 0.1748 - sparse_categorical_accuracy: 0.9323 448/540 [=======================>......] - ETA: 0s - loss: 0.1757 - sparse_categorical_accuracy: 0.9308 480/540 [=========================>....] - ETA: 0s - loss: 0.1742 - sparse_categorical_accuracy: 0.9333 540/540 [==============================] - 1s 1ms/step - loss: 0.1771 - sparse_categorical_accuracy: 0.9315 Epoch 9/10 32/540 [>.............................] - ETA: 0s - loss: 0.2457 - sparse_categorical_accuracy: 0.9375 96/540 [====>.........................] - ETA: 0s - loss: 0.1723 - sparse_categorical_accuracy: 0.9271 128/540 [======>.......................] - ETA: 0s - loss: 0.1568 - sparse_categorical_accuracy: 0.9297 160/540 [=======>......................] - ETA: 0s - loss: 0.1336 - sparse_categorical_accuracy: 0.9438 192/540 [=========>....................] - ETA: 0s - loss: 0.1445 - sparse_categorical_accuracy: 0.9375 224/540 [===========>..................] - ETA: 0s - loss: 0.1471 - sparse_categorical_accuracy: 0.9375 256/540 [=============>................] - ETA: 0s - loss: 0.1390 - sparse_categorical_accuracy: 0.9453 288/540 [===============>..............] - ETA: 0s - loss: 0.1366 - sparse_categorical_accuracy: 0.9479 320/540 [================>.............] - ETA: 0s - loss: 0.1435 - sparse_categorical_accuracy: 0.9469 352/540 [==================>...........] - ETA: 0s - loss: 0.1460 - sparse_categorical_accuracy: 0.9489 384/540 [====================>.........] - ETA: 0s - loss: 0.1380 - sparse_categorical_accuracy: 0.9531 448/540 [=======================>......] - ETA: 0s - loss: 0.1367 - sparse_categorical_accuracy: 0.9487 512/540 [===========================>..] - ETA: 0s - loss: 0.1498 - sparse_categorical_accuracy: 0.9434 540/540 [==============================] - 1s 2ms/step - loss: 0.1585 - sparse_categorical_accuracy: 0.9389 Epoch 10/10 32/540 [>.............................] - ETA: 0s - loss: 0.2048 - sparse_categorical_accuracy: 0.9688 96/540 [====>.........................] - ETA: 0s - loss: 0.2236 - sparse_categorical_accuracy: 0.9167 128/540 [======>.......................] - ETA: 0s - loss: 0.1914 - sparse_categorical_accuracy: 0.9375 160/540 [=======>......................] - ETA: 0s - loss: 0.1591 - sparse_categorical_accuracy: 0.9500 224/540 [===========>..................] - ETA: 0s - loss: 0.1438 - sparse_categorical_accuracy: 0.9554 256/540 [=============>................] - ETA: 0s - loss: 0.1465 - sparse_categorical_accuracy: 0.9531 320/540 [================>.............] - ETA: 0s - loss: 0.1325 - sparse_categorical_accuracy: 0.9594 384/540 [====================>.........] - ETA: 0s - loss: 0.1505 - sparse_categorical_accuracy: 0.9505 448/540 [=======================>......] - ETA: 0s - loss: 0.1480 - sparse_categorical_accuracy: 0.9531 512/540 [===========================>..] 
- ETA: 0s - loss: 0.1456 - sparse_categorical_accuracy: 0.9551
540/540 [==============================] - 1s 2ms/step - loss: 0.1459 - sparse_categorical_accuracy: 0.9537
hit rate: 331 / 360 [ 91.9 %]
```
:::

---

# Full code

```python=0
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt

def enumerate_files(dirs, path='All_gray_1_32_32', n_poses=3, n_samples=20):
    filenames, targets = [], []
    for p in dirs:
        for n in range(n_poses):
            for j in range(3):
                dir_name = path+'/'+p+'/000'+str(n*3+j)+'/'
                for s in range(n_samples):
                    d = dir_name+'%04d/'%s
                    for f in os.listdir(d):
                        if f.endswith('jpg'):
                            filenames += [d+f]
                            targets.append(n)
    print(filenames, targets)
    return filenames, targets

def read_images(files):
    imgs = []
    for i in range(0, len(files)):
        img = cv2.imread(files[i], 0)
        imgs.extend(np.reshape(img, (1, len(img), len(img), 1)))
    return np.array(imgs)

def read_datasets(datasets):
    files, labels = enumerate_files(datasets)
    list_of_arrays = read_images(files)
    return np.array(list_of_arrays), labels

train_sets = ['Set1', 'Set2', 'Set3']
test_sets = ['Set4', 'Set5']

trn_array, trn_labels = read_datasets(train_sets)
tst_array, tst_labels = read_datasets(test_sets)

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.callbacks import ReduceLROnPlateau

print(trn_array.shape)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=36, kernel_size=(5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.05))
model.add(Dense(3, activation='softmax'))

model.compile(optimizer=keras.optimizers.Adadelta(),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

reduce_lr = ReduceLROnPlateau(monitor='loss', patience=5, mode='min')
train_history = model.fit(trn_array, trn_labels,
                          epochs=10, batch_size=32, callbacks=[reduce_lr])

predictions = model.predict(tst_array)
score = 0
for i in range(0, len(tst_array)):
    #print(i,": [",'%.0f'% predictions[i,0],"][",'%.0f'% predictions[i,1],"][",'%.0f'% predictions[i,2],"] ",predictions[i,tst_labels[i]])
    if round(predictions[i, tst_labels[i]]) == 1:
        score += 1
print("hit rate:", score, "/", len(tst_array), " [", round(score/len(tst_array) * 100, 1), "%]")

plt.subplot(1, 2, 1)
plt.title("loss/epoch")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.plot(train_history.epoch, train_history.history['loss'], linestyle="--", marker='o')
for a, b in zip(train_history.epoch, train_history.history['loss']):
    plt.text(float(a), float(b) + 0.1, round(b, 2), ha='center', va='bottom', fontsize=13)

plt.subplot(1, 2, 2)
plt.title("sparse_categorical_accuracy/epoch")
plt.xlabel("epoch")
plt.ylabel("sparse_categorical_accuracy")
plt.plot(train_history.epoch, train_history.history['sparse_categorical_accuracy'], linestyle="--", marker='o')
for a, b in zip(train_history.epoch, train_history.history['sparse_categorical_accuracy']):
    plt.text(float(a), float(b) + 0.003, round(b, 2), ha='center', va='bottom', fontsize=13)
plt.show()
```

---

### Questions and feedback welcome

Contact:

- email me: 410621225@gms.ndhu.edu.tw
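---

### Appendix: an alternative accuracy check

The hit-rate loop in Step 5 counts a prediction as correct only when the softmax probability of the true class rounds to 1 (i.e. is at least 0.5). A more conventional measure is argmax accuracy, which credits whichever class receives the highest probability. A minimal sketch (an addition, not part of the original assignment code), assuming the `predictions` and `tst_labels` from Step 5:

```python
import numpy as np

# Take the class with the highest softmax probability as the prediction,
# then compare it with the ground-truth labels.
pred_classes = np.argmax(predictions, axis=1)
accuracy = np.mean(pred_classes == np.array(tst_labels))
print("argmax accuracy:", round(float(accuracy) * 100, 1), "%")
```

The two numbers can differ: an image whose true-class probability is 0.45 but still the largest of the three counts as a hit under argmax accuracy, yet is a miss under the rounding rule above.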