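# Deep Learning HW2: Music Scale Recognition by CNN
###### tags: `pytorch`, `Python筆記`, `CNN`

### :small_blue_diamond: 410823001, Electrical Engineering Year 4, 許哲瑜

1. What problems did you encounter when doing this assignment?

While working on this assignment, I found that no matter how I changed the model architecture or the hyperparameters, the training accuracy stayed stuck at 93% and would not go any higher. Only after reading through the code carefully did I find the cause: I was reading the file names in the image folder with os.listdir(), and the order it returns is not the same as the order of the rows in "train_truth.csv". As a result, many images were paired with the wrong label during training. No wonder no amount of model tuning gave better results.

2. How did you solve the problems?

To fix this, each file name only needs to be matched against the "filename" column of "train_truth.csv". I filter the rows with the pandas loc indexer and then take the value in the neighbouring "category" column. The code is as follows:

```python
import os

import numpy as np
import pandas as pd

label_file = 'train_truth.csv'                    # path to the label file
label = pd.read_csv(label_file)                   # read the CSV file with pandas
N = len(label)                                    # number of training samples
labels = np.zeros(N, dtype=int)                   # array that will hold the labels
path = os.path.join(os.getcwd(), 'music_train')   # folder containing the training images
class_filenames = os.listdir(path)                # image file names in the folder (a list, in arbitrary order)
for i, name in enumerate(class_filenames):
    num = label.loc[label["filename"] == name]    # row of train_truth.csv whose filename matches this image
    labels[i] = int(num['category'].iloc[0])      # store that row's label in the label array
```

3. Is there any innovative design you've made in this assignment?
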
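Before training the model, I used train_test_split to split the data into a training set and a validation set. The stratify parameter is set to the label array, so that both splits keep the same class proportions as the full dataset. I also normalized every image. The code is as follows:

```python
from sklearn.model_selection import train_test_split

# Split the data into a training set and a held-out set with train_test_split:
# stratify keeps the class proportions of the given labels in both splits,
# and random_state makes the split identical on every run.
x_train_image, x_test_image, y_train_label, y_test_label = train_test_split(
    images, labels, stratify=labels, test_size=0.2, random_state=47)
x_train = x_train_image.astype('float32') / 255   # normalize the training images to [0, 1]
```

* I had previously trained image classifiers with the Tensorflow framework, so this time I chose to try Pytorch. The biggest difference I found is in how the model architecture is built. For the same ResNet architecture, Tensorflow lets you use transfer learning and set the model up in just a few lines of code, whereas in Pytorch I built it step by step from the code the teacher showed in class. The process was more involved, but it gave me a better understanding of what each layer in the architecture does and how data flows through it. Each framework has its own strengths and weaknesses. The Tensorflow version of the code is as follows:

```python
# Build the ResNet model architecture with the Tensorflow framework.
# Use ImageNet weights for transfer learning and keep only the part before the
# final fully connected layers (include_top=False); input_shape sets the input size.
import tensorflow.keras.applications as keras_model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# H and W are the input image height and width, defined elsewhere.
# The original call also passed classes=88; Keras documents that argument as
# applying only when include_top=True, so it is omitted here.
base_model = keras_model.ResNet50(include_top=False, weights="imagenet",
                                  input_shape=(H, W, 3), pooling=None)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation="relu")(x)
x = Dense(512, activation="relu")(x)
output = Dense(88, activation="softmax")(x)       # 88 output classes
model = Model(inputs=base_model.input, outputs=output)
model.summary()
```

4. What have you learned in this assignment?

After this lesson with reading files and labels, I will be more careful when building training data in the future and make sure the dataset fed to the model is correct, so that I do not end up again in the situation where tuning every parameter has no effect.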
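
One way to make that check explicit is to look labels up by file name and fail loudly when something does not match. The following is only a minimal sketch, not the code used in the assignment: the helper name `align_labels` and the toy file names are hypothetical, and it assumes the label table has the same "filename" and "category" columns as "train_truth.csv".

```python
import pandas as pd


def align_labels(filenames, truth):
    """Return the labels in the same order as `filenames`.

    `truth` is a DataFrame with 'filename' and 'category' columns.
    Raises ValueError if a file has no label or a filename is duplicated.
    """
    if truth["filename"].duplicated().any():
        raise ValueError("duplicate filenames in the label table")
    lookup = truth.set_index("filename")["category"]
    missing = [f for f in filenames if f not in lookup.index]
    if missing:
        raise ValueError(f"no label for: {missing[:5]}")
    # reindex returns the categories in exactly the order of `filenames`
    return lookup.reindex(filenames).to_numpy()


# Toy data standing in for train_truth.csv and the os.listdir() result
truth = pd.DataFrame({"filename": ["a.png", "b.png", "c.png"],
                      "category": [10, 20, 30]})
listed = ["c.png", "a.png", "b.png"]   # folder order differs from CSV order
labels = align_labels(listed, truth)
```

Because the lookup is keyed on the file name, the order returned by the directory listing no longer matters, and a missing or duplicated entry stops the run instead of silently producing wrong labels.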