
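# Deep Learning HW3_ Online Character Recognition by RNN(LSTM)

###### tags: `pytorch`, `Python筆記`, `RNN`

### :small_blue_diamond: 410823001, Electrical Engineering, 4th year, 許哲瑜

1. What problems did you encounter when doing this assignment?

I started by plotting the input data with the cv2 package, but the plots did not look right. I am not sure whether I placed the coordinate points incorrectly or whether the sampled points simply look like that. The figure below shows the result and the code.

![](https://hackmd.io/_uploads/rJrjJFEDh.png)

Also, when I first built the model, I fed all 16 values in at once instead of splitting each sample into (x, y) pairs with shape 8×2. This may be why my model's accuracy stalled at around 0.98.

2. How did you solve the problems?

For the plotting problem, I tried another plotting package (matplotlib). A for loop extracts the x and y coordinates of the eight points, and consecutive points are joined with line segments in input order. This is easier to inspect than the earlier approach. The plotting code is as follows:

```python=
import matplotlib.pyplot as plt

# Plot one sample to check that the data was stored correctly
num = 43
x = []  # list of x values
y = []  # list of y values
for i in range(8):
    x.append(train_data[num][i][0])  # train_data index order: sample num, point i, [0] is the x coordinate
    y.append(train_data[num][i][1])  # same as above, [1] is the y coordinate
plt.plot(x, y, color='r', linewidth=2, marker='o')  # pass the x, y coordinates and set color, line width, and marker shape
# Show the plot with its label as the title
plt.title(labels[num], size=25)
plt.show()
```

![](https://hackmd.io/_uploads/S14Ogo2P2.png)

For the input dimensions, I revised the original code: `reshape` turns the 16 input values of each sample into 8 points of (x, y) pairs, and the LSTM input size is set to 2. The code is shown below. This resolved the problem I ran into earlier. The final accuracy still did not improve, but this way of training is more reasonable.

```python=
import numpy as np

# N, train, and label are defined elsewhere (not shown here)
train_data = np.empty((N, 8, 2), dtype="uint8")  # empty array for the training data
labels = np.array([0] * N)                       # array for the labels
print(train_data.shape)
print(labels.shape)

# Read the training data
for i in range(N):
    train_data[i] = np.array(train.iloc[i]).reshape(8, 2)  # take row i with iloc and store it as 8 (x, y) points
    labels[i] = np.array(label.iloc[i])                     # take label i with iloc and store it in labels
```

```python=
import torch
import torch.nn as nn

input_size = 2      # each training batch has shape [64, 8, 2]
hidden_size = 1024  # 1024 hidden units per LSTM layer
num_layers = 2      # 2 stacked LSTM layers
num_classes = 10    # 10 output classes

class simpleLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(simpleLSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)  # input is (N, L, Hin) when batch_first=True
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # Zero initial hidden and cell states, created on the same device as the input
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, (h_n, c_n) = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])  # classify from the output at the last time step
        return out
```

3. Is there any innovative design you've made in this assignment?

As in HW2, before training I used `train_test_split` to split the data into a training set and a validation set. The `stratify` parameter is set to the label data so that each split keeps the same class proportions.

```python=
from sklearn.model_selection import train_test_split

# Split the data with train_test_split:
# stratify splits evenly according to the given labels,
# random_state makes the split identical on every run
x_train, x_test, y_train, y_test = train_test_split(images, labels, stratify=labels, test_size=0.2, random_state=47)
```

4. What have you learned in this assignment?

When I first built the model, I wanted to feed the input with shape (8, 2), but the code kept raising errors saying that `input_size` only accepts a single integer, so at first I could only feed all 16 values as one input. After reading a lot of material, I understood that once the input dimensions are arranged properly and `batch_first=True` is used, `input_size` can be set to 2, so the model receives the x, y values of one point at each step. I learned a lot from this.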
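
To make the shape point in item 4 concrete, here is a minimal sketch in plain Python of what the `reshape(8, 2)` step does to one sample. It assumes the 16 values are stored interleaved as x1, y1, x2, y2, and so on, which is what a row-major reshape implies. The helper name `to_points` and the sample values are hypothetical and not taken from the assignment data.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def to_points(row: Sequence[int], n_points: int = 8) -> List[Point]:
    """Group a flat [x1, y1, x2, y2, ...] row into (x, y) pairs.

    Mirrors what a row-major reshape(8, 2) does to one 16-value sample.
    """
    if len(row) != 2 * n_points:
        raise ValueError(f"expected {2 * n_points} values, got {len(row)}")
    return [(row[2 * i], row[2 * i + 1]) for i in range(n_points)]


# Hypothetical sample: 16 made-up coordinate values, not from the dataset.
sample = [47, 100, 27, 81, 57, 37, 26, 0, 0, 23, 56, 53, 100, 90, 40, 98]
points = to_points(sample)

# One sample is now 8 time steps with 2 features each, so a batch of B
# samples has shape (B, 8, 2), which is the layout that batch_first=True
# with input_size=2 expects.
seq_len, n_features = len(points), len(points[0])
print(points[0], seq_len, n_features)
```

Seen this way, `input_size` is the number of features per time step (2), while the sequence length (8) is carried by the data itself and is never passed to `nn.LSTM` as an argument.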