# H Project Discussion - Google Landmark

> Feel free to raise any questions you have.
> 6/24 discussion link: https://meet.google.com/ozw-ytwz-irv

### 1. Data selection and filing

```python=
import os
import shutil

import pandas as pd

train_csv = pd.read_csv('/kaggle/input/landmark-recognition-2020/train.csv')
train_csv.head(10)

# append .jpg to each file name
def add_txt(fn):
    return fn + '.jpg'

train_csv['id'] = train_csv['id'].apply(add_txt)

# choose labels with at least 200 images, and take the first 200 images of each label
# copy the files into one folder per split and label
%cd /kaggle/working
src_root = '/kaggle/input/landmark-recognition-2020/train/'
# per label: first 120 files for training, next 40 for validation, last 40 for testing
splits = {'training': (0, 120), 'validation': (120, 160), 'testing': (160, 200)}

label_list = train_csv['landmark_id'].unique()
cnt = 0
final_label_list = []
for label in list(label_list):
    file_list = list(train_csv['id'][train_csv['landmark_id'] == label])
    if len(file_list) >= 200:
        final_label_list.append(label)
        for split, (start, stop) in splits.items():
            dst_dir = '/kaggle/working/' + split + '/' + str(label)
            os.makedirs(dst_dir, exist_ok=True)
            for file in file_list[start:stop]:
                # source images are nested by the first three characters of the id
                src = src_root + file[0] + '/' + file[1] + '/' + file[2] + '/' + file
                dst = dst_dir + '/' + file
                if not os.path.exists(dst):
                    shutil.copyfile(src, dst)
        cnt += 1
    if cnt == 100:  # only need 100 labels
        break
# 20,000 files in total
```

To download the training data, [click here](https://reurl.cc/9rR46X).

```python=
from shutil import make_archive
make_archive('train_data', 'zip', '/kaggle/working')
%cd /kaggle/working
from IPython.display import FileLink
FileLink(r'train_data.zip')
```

Loading the data

```python=
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

train_dir = '/kaggle/working/training'
validation_dir = '/kaggle/working/validation'
test_dir = '/kaggle/working/testing'

train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(256, 256), batch_size=32,
    class_mode='categorical', seed=42)
validation_generator = test_datagen.flow_from_directory(
    validation_dir, target_size=(256, 256), batch_size=32,
    class_mode='categorical', seed=42)
test_generator = test_datagen.flow_from_directory(
    test_dir, target_size=(256, 256), batch_size=1,
    class_mode='categorical', seed=42)
```

## 許佳雯

<font color="coral">**Update 06/22**</font>

- Split into train / test / val
- Normalization + resize + augmentation
- Model: EfficientNetB0 + Pooling + Dropout
- After 30 epochs: `loss: 0.7207 / accuracy: 0.8307` `val_loss: 0.7517 / val_accuracy: 0.8087`
- Model evaluation: `loss: 0.7614 / accuracy: 0.8050`

### 1. Build and train the model with transfer learning + pooling + dropout

```python=
# Model + Transfer Learning-EfficientNetB0
from tensorflow.keras.applications import EfficientNetB0
from tensorflow import keras as K
from tensorflow.keras import layers

efficientNet = EfficientNetB0(
    weights='imagenet',
    include_top=False,
    input_shape=(256, 256, 3)
)
model = K.models.Sequential()
model.add(efficientNet)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dropout(0.01))  # drop 1% of the activations
model.add(layers.Dense(100, activation='softmax'))  # 100 output classes
Adam = K.optimizers.Adam(learning_rate=0.00001)  # the key to the improvement
model.compile(optimizer=Adam, loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```

![](https://i.imgur.com/gVVlrUj.png)

```python=
train_history = model.fit(train_generator,
                          validation_data=validation_generator,
                          epochs=30, verbose=2)
```

![](https://i.imgur.com/OCeFDkR.jpg)
![](https://i.imgur.com/6YHDkfT.png)

### 2. Evaluate accuracy

```python=
scores = model.evaluate(test_generator)
scores[1]
```

Output: loss: 0.7614 - accuracy: 0.8050; `scores[1]` is 0.8050000071525574.

## 甘元昊

#### 1. [MobileNetV2 (more lightweight)](https://reurl.cc/MA4Q2p)

- train : validation : testing = 6 : 2 : 2
- train_datagen applies distortion, rotation, and similar augmentations
- Model: (MobileNetV2 + GlobalAveragePooling2D)
- 100 epochs: `loss: 0.0068 / accuracy: 0.9982` `val_loss: 0.4093 / val_accuracy: 0.9133`
- Model evaluation: `loss: 0.4393 / accuracy: 0.9112`

Image augmentation on the training set

```python=
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
```

Building the model (MobileNetV2 + Pooling)

```python=
from tensorflow.keras import optimizers
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential

conv_base = MobileNetV2(include_top=False,
                        weights="imagenet",
                        input_shape=(256, 256, 3))
conv_base.trainable = True  # fine-tune the whole backbone

model = Sequential()
model.add(conv_base)
model.add(GlobalAveragePooling2D())
model.add(Dense(100, activation='softmax'))
model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```

![](https://i.imgur.com/DHJhcob.png)

Training results

![](https://i.imgur.com/eK8aRtd.png)
![](https://i.imgur.com/vIdJ4P4.png)

Plotting the ROC curve

```python=
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# y (true labels) and scores (model outputs) are assumed to be defined
# earlier in the notebook; they are not shown in these notes
auc = []
for i in range(100):
    fpr, tpr, _ = roc_curve(y, scores, pos_label=i)
    plt.plot(fpr, tpr, label=i)
    # binary target: 1 where the true label is i, 0 otherwise
    y_0_1 = (np.array(y) == i).astype(int)
    auc_of_the_label = roc_auc_score(y_0_1, scores)
    auc.append(auc_of_the_label)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.title('ROC')
plt.show()
```

![](https://i.imgur.com/bW7PJvm.png)

Highest AUC score: 0.72 (label 26028)
Lowest AUC score: 0.26 (label 14915)

#### 2. [EfficientNetB0 (Dropout)](https://reurl.cc/qg0g60)

PS: this design is slightly flawed; the dropout was accidentally placed after the output layer.

- Model: (EfficientNetB0 + GlobalAveragePooling2D + Dropout)
- 100 epochs: `loss: 1.6511 / accuracy: 0.8919` `val_loss: 1.0362 / val_accuracy: 0.7660`
- Model evaluation: `loss: 1.04 / accuracy: 0.76`

Model

```python=
from tensorflow.keras.applications import EfficientNetB0

conv_base = EfficientNetB0(include_top=False,
                           weights="imagenet",
                           input_shape=(256, 256, 3))
conv_base.trainable = True

model = Sequential()
model.add(conv_base)
model.add(GlobalAveragePooling2D())
model.add(Dense(100, activation='softmax'))
model.add(Dropout(0.1))  # the flaw noted above: dropout sits after the output layer
model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```

![](https://i.imgur.com/lgNhEab.png)
![](https://i.imgur.com/6ONXNJn.png)

## 鄭怡伶

Mine is still running... I am uploading the code first; it is the CNN from the teacher's handout.

![](https://i.imgur.com/OKgA1Ti.png)
![](https://i.imgur.com/HzT4TKU.png)
![](https://i.imgur.com/RNvHsih.png)
![](https://i.imgur.com/SPBG7K3.png)

The results are not great.

![](https://i.imgur.com/ny4LO8P.png)
![](https://i.imgur.com/w3CFrYK.png)
![](https://i.imgur.com/vutVbMg.png)

Second attempt, with lr=0.02

![](https://i.imgur.com/abdpUKD.png)
![](https://i.imgur.com/LML59IO.png)

Update: changed kernel_size to (3, 3), lr=0.0255, epoch=50

![](https://i.imgur.com/UnX4pjg.png)
![](https://i.imgur.com/rTOK1Ub.png)
![](https://i.imgur.com/dP41mnA.png)
![](https://i.imgur.com/k2u3kBO.png)

This is about as good as it gets.

## 蔡中瑋

NASNetMobile

![](https://i.imgur.com/7Iqd6Cv.jpg)
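![](https://i.imgur.com/nx5P7B3.jpg)

```python=
import tensorflow as tf
from tensorflow import keras as K
from tensorflow.keras import layers
from tensorflow.keras.layers import Dropout, GlobalAveragePooling2D

NASNetMobile = tf.keras.applications.NASNetMobile(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3)
)
model = K.models.Sequential()
model.add(NASNetMobile)
model.add(GlobalAveragePooling2D())
model.add(Dropout(0.2))
model.add(layers.Dense(100, activation='softmax'))
Adam = K.optimizers.Adam(learning_rate=3e-5)
model.compile(Adam, loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```

![](https://i.imgur.com/fsbOP1H.png)

```python=
# callbacks is assumed to be defined earlier in the notebook (not shown in these notes)
train_history = model.fit(train_generator,
                          validation_data=validation_generator,
                          epochs=100,
                          callbacks=callbacks,
                          verbose=1)
```

## 盧彥愷

Teacher, teacher!!!!!! If you are reading this: everything below is a very long speech template plus my own script, and reading all of it would be exhausting. Please click the heading 盧彥愷想請教的問題 (盧彥愷's questions) in the left sidebar and the page will jump straight to what I want to ask. Thank you very much!!

Below is a rough speech draft (for reference).

For this landmark project we chose the 2020 Google Landmark Recognition challenge, the third edition of the competition. Its theme is labeling famous (and not-so-famous) landmarks in images, and the task is to recognize the landmark in each picture. As the competition introduction puts it: have you ever looked through your vacation photos and asked yourself, what was the name of that temple I visited in China? With image recognition we can identify the landmark in front of us, which helps us recall past trips more quickly or plan the exciting adventures we dream of. If we cannot tell where a cherished or longed-for picture was taken, or which landmark it shows, how can we immerse ourselves in it or set out on a new journey? That appeal is why our team was excited to pick this topic.

Each of us took on the project as a heroic challenge of our own. We wanted to see whether this fierce fire, tended like a skilled craftsman's forge, could temper what we gained in a little over a month of the 產業新尖兵 program's AI and deep learning talent training course, and whether it could verify and strengthen how the training has changed us. These individual challenges were daunting. Honestly, figuring out a landmark recognition system on your own is like hunting mice in the dark: you hit walls everywhere, and it is no joke!!

Every trainee started at a different level. Some already had a programming foundation and had even worked in the IT industry several times, joining only to broaden their view of AI systems. Others were practically reborn from scratch, learning Python character by character as if it were an indecipherable scripture. Much of it can be sensed but not explained; only by typing code over and over can you really appreciate the subtleties of a programming language. As the saying goes, a camel that has starved to death is still bigger than a horse: how can those without a foundation compare with trainees who have spent years in IT? Feeling discouraged is hard to avoid. But another saying goes, a plucked phoenix is no match for a chicken. Who knows whether a rookie with zero experience might stage a comeback worthy of a TV drama? (big smile)

What is more, although each of us took on a separate challenge, we wound our individual threads together into a single rope. We were not fighting alone: each had a challenge of their own, yet we carried one another. Every group discussion deepened our understanding of the code, and in the end our individual results flowed together, like the ever-flowing 濁水溪 (Zhuoshui River), into what we are presenting now: the product of relentless mutual learning.

Still, this challenge was like gazing at the Himalayas: just looking was enough to make us shiver. There are 1,580,470 training images and 81,313 training labels (landmark_id values). How could we build a system that recognizes that many groups of data, identifies the landmarks, and also tells common ones from uncommon ones, all while processing such a mass of data? It suddenly felt like wishful thinking, haha. But is choosing the programmer's path not itself a matter of stubbornly refusing to give up until the very end?

Just as this project's theme suggests, who says writing code is not also a delightful yet terrifying adventure? (wry smile) Perhaps this journey will give us a glimpse of the breathtaking scenery behind the mountain, behind the summit we thought we could never cross, be it Mount Everest or 玉山 (Yushan). Without trying, how would we know what picturesque, soul-stirring feast lies hidden beyond the peak? Just thinking about it is exciting! (silly grin)

Now let us begin sharing each member's results, and we ask the panel to listen patiently to our hair-raising adventure.

Presenter: 盧彥愷

Hello everyone, I am 盧彥愷 from Group H. I chose a CNN (convolutional neural network) as my landmark recognition system. I will briefly cover how I approached this project:

- the data format
- feature extraction
- format conversion and normalization during preprocessing
- how the labels are identified and extracted, and how they are one-hot encoded
- how the CNN model is built
- which loss function and optimizer are used
- the training process
- visualizing the training results

and, overall, how I viewed and carried out the modeling workflow.

I started by looking at the data provided by the 2020 Google Landmark Recognition challenge. It is all landmark pictures, but interestingly the pictures within one class are not all alike: there are landscapes, buildings, and portraits. It is so much fun it makes you roll on the floor, haha.

The files are organized by the id column of train.csv, which is the file name of each landmark image. The image folders are nested by the first three characters of this id. For example, an image whose file name starts with 123 is stored in train, inside folder 1, inside folder 2, inside folder 3, unfolding layer by layer like Russian dolls. Such deeply nested folders make the data hard to read and hard to label.

However, train.csv also gives us a landmark_id column. The image ids are grouped and sorted by landmark_id, and one landmark_id covers a great many ids. So we can use landmark_id as the label for image recognition and re-sort all the image folders by it. Every image with the same landmark_id then lands in that landmark_id folder, which greatly simplifies label extraction and speeds up processing.

Next I thought about how to balance the images to make later processing easier, so I sampled the data as follows. I first took the 100 landmark_id values with the most images from train.csv and found that, wow, the largest class has 6272 images while the 100th has only 364. That gap is too large. By the way, the smallest class has only 2 images. Great: with a distribution this spread out, the training results would certainly be "interesting" (wry smile). In fact they would be terrible. To avoid that tragedy, I randomly sampled 100 images from each of the top 100 classes, giving 10,000 images and 100 labels as my training data. I also put the landmark_id labels in random order; I am not sure whether this reduces repetition in the image recognition.

As for why I chose a CNN: the N x N matrix of a convolutional layer works like a filter. By assigning different weights it scans the image patch by patch to detect shapes, and the detected outlines and features can then be used to judge the similarity of the next image. The pooling layer compresses the image while keeping the important features, which greatly reduces the load on the model.

This dataset is classified by landmark_id: every file has a landmark_id, and putting files with the same landmark_id into one folder means the images in that folder are highly similar. For example, folder 20120 is almost entirely pictures of the Metropolitan Cathedral of Saint Sebastian.

![](https://i.imgur.com/9VrhxU8.png)

A Google search shows it belongs to the Roman Catholic Archdiocese of Cochabamba in Bolivia. Folder 647292 is almost entirely the Victoria Memorial,

![](https://i.imgur.com/M8FmFBq.png)

which is in Kolkata, India, and so on. There are odd ones too: folder 286412 contains nothing but pictures of tanks.

![](https://i.imgur.com/QVkKtHz.jpg)

That is probably not a landmark?? Haha, so you can imagine the Google engineers have a sense of humor. A dataset like this is well suited to CNN-based image similarity recognition, and that is why I chose a CNN as my model for this project.

Now let me walk through my code.

```python=
import numpy as np
import pandas as pd
import os
```

Import the progress bar

```python=
from tqdm.autonotebook import tqdm
tqdm.pandas()
```

Set the input data path

```python=
base_dir = '/kaggle/input/landmark-recognition-2020'
```

Read train.csv and show the first 100 rows

```python=
train = pd.read_csv(os.path.join(base_dir, 'train.csv'))
train.head(100)
```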
![](https://i.imgur.com/z0tNW3o.png)

List the 100 labels (landmark_id) with the most image files

```python=
head100 = pd.DataFrame(train['landmark_id'].value_counts().head(100)).reset_index()
print(head100)
temp_data = head100
print(temp_data)
```

![](https://i.imgur.com/qDwauZG.png)

Rename the columns to 'landmark_id' and 'count'

```python=
temp_data.columns = ['landmark_id', 'count']
print(temp_data.columns)
print(temp_data)
```

![](https://i.imgur.com/96yxglE.png)

Check the train table against temp_data and keep the full rows of train whose landmark_id appears in temp_data, listed in ascending order

```python=
sample_train_data = train[train['landmark_id'].isin(temp_data['landmark_id'])].reset_index(drop=True)
sample_train_data
```

![](https://i.imgur.com/9GaUuuL.png)

```python=
import shutil
```

Extract the unique landmark_id values from sample_train_data into a list

```python=
landmark_id = list(sample_train_data.landmark_id.unique())
```

![](https://i.imgur.com/etM8fQs.png)

For each landmark_id in sample_train_data, randomly draw 100 rows into sample_train (that is, 100 random images for each of the 100 landmark_id values)

```python=
samples = []
for i in landmark_id:
    label_filter = (sample_train_data["landmark_id"] == i)
    # shuffle the rows of this label and keep the first 100
    df_shuffled = sample_train_data[label_filter].sample(frac=1).reset_index(drop=True)
    samples.append(df_shuffled.head(100))
# pd.concat replaces DataFrame.append, which newer pandas versions no longer provide
sample_train = pd.concat(samples, ignore_index=True)
```

![](https://i.imgur.com/DG3iaZr.png)

The code below defines helper functions and a target path. It creates one folder per landmark_id under the target path, then compares sample_train against the landmark_id list and pulls out the rows whose landmark_id matches. sample_train holds both the id (image file name) and the landmark_id, while the landmark_id list holds only landmark_id values. Each extracted id is passed to a function that joins it with the input path, as mode ('train') + id (image file name) + file extension, to produce the path of the original train image. Another function then copies that image into the folder created in advance for its landmark_id, which sorts the train images by label.

```python=
# Create the classification folders, one per landmark_id.
# If the mode folder already exists it is deleted and rebuilt.
def create_folder_structure(base_dir, landmark_id, mode='train'):
    """
    :param base_dir: root output directory
    :param landmark_id: list of landmark_id values, one folder each
    :param mode: sub-folder name, e.g. 'train'
    :return: path of the created mode folder
    """
    base_dir = base_dir + '/' + mode
    if os.path.exists(base_dir):
        shutil.rmtree(base_dir)
    os.makedirs(base_dir)
    for id_ in landmark_id:
        os.makedirs(base_dir + '/' + str(id_))
    print(base_dir)
    return base_dir

# target path
output_dir = '/kaggle/working/landmark-recognition-2020/classification'
# build the classification folders
train_dir = create_folder_structure(output_dir, landmark_id, mode='train')

# Build the path of an original train image from its id (image file name).
# The source folders are nested by the first three characters of the id.
def get_file_path(input_dir, path_id, mode='train'):
    prefix = path_id[:3]
    path = input_dir + "/" + mode + "/" + "{0}/{1}/{2}/".format(prefix[0], prefix[1], prefix[2])
    filename = path_id
    return path + filename + ".jpg"

# Copy every image listed in the dataframe into the folder of its landmark_id.
def copy(dataframe, output_dir, id_, mode='train'):
    destination = output_dir + "/" + mode + "/" + str(id_)
    for index, row in dataframe.iterrows():
        shutil.copy(row['file_path'], destination)
    print(output_dir)

# For each landmark_id: select its rows, compute the source path of every image
# with get_file_path, then copy the images with copy().
def copy_files(input_dir, output_dir, dataframe, landmark_id):
    for id_ in tqdm(landmark_id):
        print('Landmark with id: {}'.format(id_))
        train = dataframe[dataframe['landmark_id'] == id_].copy()
        train['file_path'] = train.apply(
            lambda x: get_file_path(input_dir, x['id'], mode='train'), axis=1)
        # Copy training files
        copy(train, output_dir, id_, mode='train')

# run the copy
copy_files(base_dir, output_dir, sample_train, landmark_id)
```

Use glob.glob to collect the paths of the images sorted by landmark_id into sample_train_list. The paths are used below to open an image and experiment with resizing and image augmentation.

```python=
import glob

sample_train_list = glob.glob(os.path.join(output_dir, 'train/*/*'))
```

```python=
from numpy import expand_dims
import matplotlib.pyplot as plt
import matplotlib.image as img
from keras.preprocessing.image import img_to_array, ImageDataGenerator

image = img.imread(sample_train_list[0])
plt.imshow(image)
plt.show()
```

![](https://i.imgur.com/ES61wfr.png)

```python=
import cv2

# resize the image read above (not the matplotlib.image module) to 128 x 128
imagt = cv2.resize(image, (128, 128), interpolation=cv2.INTER_AREA)
plt.imshow(imagt)
plt.show()
```

![](https://i.imgur.com/v3bS09j.png)

```python=
# convert the image into an image array, in short an array of floats
data = img_to_array(imagt)
# add a batch axis so the shape becomes (1, height, width, channels)
samples = expand_dims(data, 0)
# ImageDataGenerator that shifts the image horizontally by -50 or +50 pixels
train_datagen = ImageDataGenerator(width_shift_range=[-50, 50])
# feed the sample array into the augmentation generator
it = train_datagen.flow(samples, batch_size=1)
for i in range(9):
    plt.subplot(330 + 1 + i)  # 3 x 3 grid; 1 + i selects the panel
    batch = next(it)
    me_image = batch[0].astype('uint8')  # cast to integers so imshow displays the image correctly
    plt.imshow(me_image)
plt.show()
```

![](https://i.imgur.com/VpidZCo.png)

Print the number of training images collected

```python=
print('Number of training images: {}'.format(len(sample_train_list)))
```

![](https://i.imgur.com/PsqYrSe.png)

Define the x_train training data path

```python=
x_train_bir = '/kaggle/working/landmark-recognition-2020/classification'
```

```python=
import os
import glob
import numpy as np
from keras.preprocessing.image import img_to_array, load_img
from PIL import Image
```

## 盧彥愷資料前處理的索引 (盧彥愷: data preprocessing index)

Define a function that reads a directory, and create an empty list image_list1. glob.glob walks the folder paths built with os.path.join, and the outer loop passes each folder to the inner loop one at a time.

```python=
size = (128, 128)
image_list1 = []

def read_directory(directory_name):
    for folders in glob.glob(os.path.join(directory_name, 'train/*')):
        print(folders)
        # os.listdir lists every image file in the folder;
        # each image is loaded, resized to the preset (128, 128) size,
        # converted to an image array with img_to_array,
        # and appended to image_list1
        for filename in os.listdir(folders):
            img = load_img(os.path.join(folders, filename))
            img = img.resize(size, Image.BILINEAR)
            if img is not None:
                x = img_to_array(img)
                image_list1.append(x)
```

Pass the x_train path to read_directory to load all the images used for training (as arrays)

```python=
read_directory(x_train_bir)
```

Convert image_list1, together with all the image arrays it holds, into one multidimensional array

```python=
train_image_list = image_list1
train_image_array = np.array(train_image_list)
print("train_image_array.shape={}".format(train_image_array.shape))
```

Opened up, the two look the same.

train_image_list[0]

![](https://i.imgur.com/q13ul0H.png)
![](https://i.imgur.com/Kb9mTSb.png)

train_image_array[0]

![](https://i.imgur.com/gTxi2CT.png)
![](https://i.imgur.com/wK9aPEf.png)

Both hold floats. But checking the shape fails for the list:

![](https://i.imgur.com/6nXdFVb.png)

and succeeds for the array:

![](https://i.imgur.com/mvHbSRG.png)

So even when arrays are appended to an empty container, the container itself is still a list. That is why I named it image_list1, so I would not forget. After appending arrays to a list, always remember to convert the result into a multidimensional array again.

Reshape the training data (image arrays) into (train_image_array.shape[0] = number of images, 128, 128, 3 = color channels (1 = grayscale, 3 = RGB)), which gives a four-dimensional array of the form (x, x, x, x). astype('float32') then converts the data to floats once more. img_to_array already returns floats, so the cast is repeated here only as a precaution.

```python=
x_train4D = train_image_array.reshape(train_image_array.shape[0], 128, 128, 3).astype('float32')
```

![](https://i.imgur.com/ki8Zs3N.png)

Normalize the image data, that is, scale it to the range [0, 1]. Every value (pixel) in the image array lies between 0 and 255, so dividing everything by 255 normalizes the data.

```python=
x_Train4D_normalize = x_train4D / 255
```

![](https://i.imgur.com/tRKhdud.png)

Preprocess the labels with label encoding. The labels are the landmark_id values in sample_train. Earlier, 100 rows were randomly drawn for each of the top 100 labels in temp_data and added to sample_train, so sample_train holds 100 * 100 = 10,000 landmark_id values grouped into 100 classes. After encoding with LabelEncoder, the 10,000 landmark_id numbers become integers from 0 to 99. At this point the data is still a DataFrame.

```python=
from sklearn.preprocessing import LabelEncoder

labelencoder = LabelEncoder()
y_train_LabelEncoder = sample_train
y_train_LabelEncoder['landmark_id'] = labelencoder.fit_transform(y_train_LabelEncoder['landmark_id'])
y_train_LabelEncoder
```

![](https://i.imgur.com/6JkESP2.png)

Converting to a list and then to an array gives the NumPy array format

```python=
lable = list(y_train_LabelEncoder.landmark_id)
print(lable)
landmark_label = np.array(lable)
print(landmark_label)
```

![](https://i.imgur.com/HtpJb0o.png)

Then np.random.permutation randomly reorders the encoded labels

```python=
# note: this permutes the labels only; the images in x_train4D keep the order
# in which they were read from disk
landmark_label_random = np.random.permutation(landmark_label)
y_train_label = landmark_label_random
y_test_label = landmark_label_random
print(y_train_label)
print(y_test_label)
```

![](https://i.imgur.com/zYZjVyc.png)

Passing the reordered labels through to_categorical one-hot encodes them: each of the 10,000 integers from 0 to 99 becomes a vector of 100 zeros and ones such as [0, 1, 0, ...], so the shape changes from (10000,) to (10000, 100). The last layer of the model must therefore have 100 outputs to match the (x, 100) label format.

```python=
from keras.utils import to_categorical

y_TrainOneHot = to_categorical(y_train_label)
```

![](https://i.imgur.com/AAN9pzx.png)

Build a simple CNN model

```python=
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, SpatialDropout2D
from keras.optimizers import SGD, RMSprop, Adam
```

```python=
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(4, 4), strides=(1, 1),
                 input_shape=(128, 128, 3), padding='valid',
                 activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(4, 4), padding='valid',
                 activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(SpatialDropout2D(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(100, activation='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```

![](https://i.imgur.com/Gp2pyvq.png)

Train the model

```python=
train_history = model.fit(x=x_Train4D_normalize, y=y_TrainOneHot,
                          validation_split=0.2, shuffle=True,
                          epochs=20, batch_size=200, verbose=1)
```

![](https://i.imgur.com/O0gqF0c.png)
![](https://i.imgur.com/1LXZapb.png)
![](https://i.imgur.com/dNwVKJL.png)

Define the plotting function

```python=
import matplotlib.pyplot as plt

def show_train_history(train_acc, test_acc):
    # train_acc / test_acc are keys of train_history.history, e.g. 'accuracy' and 'val_accuracy'
    plt.plot(train_history.history[train_acc])
    plt.plot(train_history.history[test_acc])
    plt.title('Train History')
    plt.ylabel(train_acc)
    plt.xlabel('Epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()
```

Plot the training and validation accuracy

```python=
show_train_history('accuracy', 'val_accuracy')
```

![](https://i.imgur.com/Ncnv3Pe.png)

Plot the training and validation loss

```python=
show_train_history('loss', 'val_loss')
```

![](https://i.imgur.com/0Lh09c1.png)

Show the test accuracy

```python=
# evaluated on the training arrays themselves, so this is not a held-out test score
scores = model.evaluate(x_Train4D_normalize, y_TrainOneHot)
scores[1]
```

![](https://i.imgur.com/URHbZLS.png)

## 盧彥愷想請教的問題 (盧彥愷's questions)

Teacher, teacher, look here!!!!

My trouble right now is this: I believe my data processing has no problems at all, but sadly I may have made many mistakes in building the model, which is why my plots look this awful.

![](https://i.imgur.com/Ncnv3Pe.png)
![](https://i.imgur.com/0Lh09c1.png)
![](https://i.imgur.com/URHbZLS.png)

The accuracy is just as disappointing. Could you look at my model and training code and tell me whether something is wrong, and point me in a direction for improvement?

My data and labels are as follows. From train.csv (which has the id column (image names) and the landmark_id column (label IDs)) I took the 100 landmark_id values with the most images. The largest class has 6272 images while the 100th has only 364, which is too large a gap; I also checked, and the smallest class has only 2 images. So I randomly sampled 100 images from each of the top 100 classes and used 10,000 images with 100 labels as my training data. I also put the landmark_id labels in random order; I am not sure whether this reduces repetition in the image recognition.

Below is my entire modeling workflow. Thank you in advance!! If the problem might be in my data preprocessing, please click the heading 盧彥愷資料前處理的索引 in the left sidebar. Thanks again!!!

I forgot to mention another sad thing.

![](https://i.imgur.com/cqqgZNA.png)

I turned on the GPU on Kaggle, but there is not enough memory. After a single training run it is about to crash, so I cannot run a second one without restarting the whole notebook. So even when the images are read into a multidimensional array, 10,000 images seems to be the limit... How can I solve this?
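One thing that may be worth checking in the preprocessing above, alongside model capacity: `np.random.permutation(landmark_label)` reorders the labels on their own, while the image array keeps the order in which the files were read, so after the shuffle an image may no longer sit next to its own label. The sketch below is an assumption about one way to keep the pairs intact, not the approach used in this notebook. It draws a single permutation of indices and applies it to both arrays. The arrays are small stand-ins, and `shuffle_together` is a hypothetical helper name.

```python
import numpy as np

def shuffle_together(images, labels, seed=42):
    """Shuffle images and labels with one shared permutation (hypothetical helper)."""
    assert len(images) == len(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))  # one index order for both arrays
    return images[order], labels[order]

# Stand-in data: 6 tiny "images" whose pixel values all equal their label.
labels = np.array([0, 0, 1, 1, 2, 2])
images = np.stack([np.full((2, 2, 3), lab, dtype='float32') for lab in labels])

shuffled_images, shuffled_labels = shuffle_together(images, labels)

# Every image still carries its own label after shuffling.
pairs_intact = all(
    img[0, 0, 0] == lab for img, lab in zip(shuffled_images, shuffled_labels)
)
print(pairs_intact)  # True
```

In the notebook this would mean building the label array in the same order as the images are read (for example from each file's folder name) before shuffling. Note also that Keras `validation_split` takes the last fraction of the arrays before any shuffling, so data stored in class order still needs a paired shuffle like this one.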
The model, training, plotting, and evaluation code for this question is the listing at the end of the 盧彥愷資料前處理的索引 section above.

> Hi, this is Howie. OK, regarding 盧's case: training fails because your dataset is large and you store all of it in a list at once, so memory runs out. Like your teammates, use an image data loader (they use ImageDataGenerator) so the images are read in batches only when they are needed.
>
> As for the odd accuracy plot, that is actually simple too: the dataset is large but you train only a tiny model, so you get this underfit result (training accuracy rises, but on the test set... it stays stuck forever). Consider making the network deeper, or even, like your teammates, using a prebuilt network from Keras Applications XD

盧彥愷: OK, thank you, teacher. I have now trained with the image data loader, but the plots still look a bit strange.

![](https://i.imgur.com/mC9XKb2.png)
![](https://i.imgur.com/jt7W0hH.png)
![](https://i.imgur.com/g6HBSj7.png)
![](https://i.imgur.com/cw7I0YU.png)
![](https://i.imgur.com/Tq5TRQL.png)
![](https://i.imgur.com/TyU9qpt.png)
![](https://i.imgur.com/flD75dm.png)
![](https://i.imgur.com/QcBjnir.png)
![](https://i.imgur.com/ZHXOuA4.png)
![](https://i.imgur.com/3lEtSAm.png)
![](https://i.imgur.com/rRjOeFC.png)

Teacher, please take a look!!

![](https://i.imgur.com/kUcmGvW.png)
![](https://i.imgur.com/9q8oZn7.png)
![](https://i.imgur.com/OX2MGMK.png)
![](https://i.imgur.com/EDLpWna.png)

Update 6/23 10:48

Split the data into three folder paths

```python=
train_bir1 = '/kaggle/working/landmark-recognition-2020/classification/train'
validation_bir1 = '/kaggle/working/landmark-recognition-2020/classification/validation'
test_bir1 = '/kaggle/working/landmark-recognition-2020/classification/test'
```

![](https://i.imgur.com/8NEDmtq.png)

Image augmentation (rescaling only)

```python=
train_datagen = ImageDataGenerator(rescale=1./255, dtype='float')
validation_datagen = ImageDataGenerator(rescale=1./255, dtype='float')
test_datagen = ImageDataGenerator(rescale=1./255, dtype='float')
```

Generators (the data was already split 6:2:2 beforehand)

```python=
x_train_generator = train_datagen.flow_from_directory(
    train_bir1,
    target_size=(256, 256),
    class_mode='categorical',
    color_mode='rgb',
    batch_size=32,
    shuffle=True,
)
```

![](https://i.imgur.com/c4X0WIp.png)

```python=
x_validation_generator = validation_datagen.flow_from_directory(
    validation_bir1,
    target_size=(256, 256),
    class_mode='categorical',
    color_mode='rgb',
    batch_size=32,
    shuffle=True,
)
```

![](https://i.imgur.com/smbMo6j.png)

```python=
x_test_generator = test_datagen.flow_from_directory(
    test_bir1,
    target_size=(256, 256),
    class_mode='categorical',
    color_mode='rgb',
    batch_size=32,
    shuffle=True,
)
```

![](https://i.imgur.com/Q5U7aCz.png)

![](https://i.imgur.com/oqFeDEs.png)
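When training from a generator, `steps_per_epoch` and `validation_steps` are normally the number of batches needed to cover the data once. A minimal sketch of that arithmetic follows. The image counts are an assumption (100 labels with 120 training and 40 validation images each, as in the shared split at the top of these notes), the batch size of 32 comes from the generators above, and `steps_for` is a hypothetical helper name.

```python
import math

def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once (hypothetical helper)."""
    return math.ceil(num_images / batch_size)

batch_size = 32
train_images = 100 * 120       # assumed: 100 labels x 120 training images
validation_images = 100 * 40   # assumed: 100 labels x 40 validation images

steps_per_epoch = steps_for(train_images, batch_size)
validation_steps = steps_for(validation_images, batch_size)
print(steps_per_epoch, validation_steps)  # 375 125
```

With a Keras directory iterator the same numbers are available as `len(generator)`, so the counts do not have to be hard-coded.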
```python=
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, SpatialDropout2D
from keras.optimizers import SGD, RMSprop, Adam

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(4, 4), strides=(1, 1),
                 input_shape=(256, 256, 3), padding='valid',
                 activation='relu', kernel_initializer='uniform'))
model.add(Conv2D(filters=32, kernel_size=(4, 4), strides=(1, 1), padding='valid',
                 activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(SpatialDropout2D(0.25))
model.add(Conv2D(filters=64, kernel_size=(4, 4), strides=(1, 1), padding='valid',
                 activation='relu', kernel_initializer='uniform'))
model.add(Conv2D(filters=64, kernel_size=(4, 4), padding='valid',
                 activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(SpatialDropout2D(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(100, activation='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```

![](https://i.imgur.com/gSEDzvZ.png)
![](https://i.imgur.com/hAHU6CU.png)

```python=
# model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2
train_history = model.fit(x_train_generator,
                          validation_data=x_validation_generator,
                          validation_steps=125,
                          steps_per_epoch=375,
                          epochs=100, verbose=1)
```

![](https://i.imgur.com/ll9IAa1.png)
![](https://i.imgur.com/Fz5zbs2.png)
![](https://i.imgur.com/DFovMPZ.png)

Update 6/23 18:42: the three images attached here did not finish uploading.

## Presentation and Q&A

PS: remember to put cue words for animations and slide changes in the slides.

1. How was the project work divided?
    - The group leader is 甘元昊, who also handled the data preprocessing
    - Everyone trained their own model; these are the five models presented today
2. What were the difficulties?
    - Kaggle has quotas (30 GPU hours per week, at most nine hours per run), so models cannot be trained for long
    - ROC is defined for binary classification; we handled multiple classes by looping over the classes, treating one class at a time as positive
3. Why not enter the competition directly?
    - Because it has already ended XD
    - Kaggle crashes when running on all the images (not enough GPU, not enough disk space)
4. Did you try any other ... Net?
    - We mainly used the models provided by Keras
5. Why did you choose this topic?
    - It lets us practice building deep learning models, and the predictions are relatively easy to explain
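The multi-class ROC point in item 2 can be sketched as a one-vs-rest loop: for each class, treat that class as positive and all others as negative, and score the samples with the model's predicted probability for that class. This is a minimal sketch under assumptions, not the notebook's code. `y_true` and `probs` are small made-up arrays standing in for the true class indices and the `(n_samples, n_classes)` softmax output, and only scikit-learn's `roc_auc_score` is used.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Stand-in data (hypothetical): 6 samples, 3 classes.
y_true = np.array([0, 1, 2, 0, 1, 2])
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.4, 0.5, 0.1],
    [0.5, 0.3, 0.2],
])

auc_per_class = {}
for c in range(probs.shape[1]):
    is_c = (y_true == c).astype(int)  # 1 for class c, 0 for the rest
    # score each sample with its predicted probability for class c only
    auc_per_class[c] = roc_auc_score(is_c, probs[:, c])

best = max(auc_per_class, key=auc_per_class.get)
worst = min(auc_per_class, key=auc_per_class.get)
print(auc_per_class, best, worst)
```

A per-class AUC well below 0.5 can be a sign that the score passed in is not that class's probability, which may be worth checking before comparing classes.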
