---
title: triplet loss
tags: NCNU Research
---

# triplet loss overview

* Concept: map each image into a feature space to obtain its feature vector, then compare it against other feature vectors
* Samples come in groups of three: an anchor, one sample of the same class (positive), and one of a different class (negative)
* Training makes the anchor as similar as possible to the positive while pushing it away from the negative
* Commonly used for face recognition
* The main difficulty is choosing which samples to compare: the number of possible combinations is huge, so good triplets must be selected for training

![](https://i.imgur.com/8n8meo5.jpg)

* The aim is to bring the anchor closer to the positive and push it further from the negative

## How the loss is calculated

* a = feature vector of the anchor
* p = feature vector of the positive
* n = feature vector of the negative
* margin = the required gap between the two distances (it determines the extent of the semi-hard region)

:::success
loss = max(d(a, p) - d(a, n) + margin, 0)
:::

* That is: the Euclidean distance between a and p, minus the Euclidean distance between a and n, plus the margin; if the result is below 0, the loss is 0
* For example, with d(a, p) = 1.0, d(a, n) = 0.5 and margin = 0.2, the loss is max(1.0 - 0.5 + 0.2, 0) = 0.7
* Based on this definition of the loss, triplets fall into three types:
    * easy triplets: the loss is 0, which is the case we most want to see; these triplets are already easy to distinguish, i.e. d(a, p) + margin < d(a, n)
    * hard triplets: the negative is closer to the anchor than the positive, the worst case; these triplets sit in the confusable region, i.e. d(a, n) < d(a, p)
    * semi-hard triplets: the negative is further from the anchor than the positive, but by less than the margin, so the loss is still positive, i.e. d(a, p) < d(a, n) < d(a, p) + margin

![](https://i.imgur.com/LUlQP5w.jpg)

* The figure shows the loss associated with each region where the negative can fall

## Implementation (incomplete)

:::danger
This implementation does not mine triplets yet; the combinations are generated at random (positive and negative are picked randomly). It only demonstrates the concept. A sketch of what semi-hard mining could look like appears after the two functions below.
:::

* loss function

```
import tensorflow as tf

def triplet_loss(y_true, y_pred, alpha=0.2):
    # y_true is ignored; the labels passed to fit() are dummies.
    # y_pred is [anchor | positive | negative] concatenated; split it into thirds
    total_length = y_pred.shape.as_list()[-1]
    anchor = y_pred[:, :int(1/3 * total_length)]
    positive = y_pred[:, int(1/3 * total_length):int(2/3 * total_length)]
    negative = y_pred[:, int(2/3 * total_length):]

    # squared Euclidean distances to the positive and the negative
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)

    # max(d(a, p) - d(a, n) + margin, 0), summed over the batch
    basic_loss = pos_dist - neg_dist + alpha
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0.0))
    return loss
```

* Note: pos_dist and neg_dist are squared Euclidean distances, a common substitution for the d(a, p) and d(a, n) above (a quick numeric check appears after the triplet generator below)

* Generating triplets (random)

```
import numpy as np

def generate_triplets(x, y, num_same=4, num_diff=4):
    # start with empty stacks shaped like the input images
    anchor_images = np.array([]).reshape((-1,) + x.shape[1:])
    same_images = np.array([]).reshape((-1,) + x.shape[1:])
    diff_images = np.array([]).reshape((-1,) + x.shape[1:])

    for i in range(len(y)):
        point = y[i]
        anchor = x[i]
        # indices of the same class (excluding the anchor itself) and of other classes
        same_pairs = np.where(y == point)[0]
        same_pairs = np.delete(same_pairs, np.where(same_pairs == i)[0])
        diff_pairs = np.where(y != point)[0]

        # sample positives and negatives at random, with replacement
        # (assumes every class has at least two examples)
        same = x[np.random.choice(same_pairs, num_same)]
        diff = x[np.random.choice(diff_pairs, num_diff)]

        # each anchor is paired with every (positive, negative) combination
        anchor_images = np.concatenate(
            (anchor_images, np.tile(anchor, (num_same * num_diff, 1, 1, 1))), axis=0)
        for s in same:
            # repeat this positive once per negative, then append all the negatives
            same_images = np.concatenate((same_images, np.tile(s, (num_diff, 1, 1, 1))), axis=0)
            diff_images = np.concatenate((diff_images, diff), axis=0)
    return anchor_images, same_images, diff_images
```
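As a quick numeric check that triplet_loss behaves as described (this example is not part of the original note; the embedding values are made up), feed it hand-built concatenated embeddings and compare with the loss computed by hand:

```
# assumes the triplet_loss defined above is in scope
import tensorflow as tf

# each row is [anchor | positive | negative], three 2-D embeddings
# hard triplet: a=(0,0), p=(2,0), n=(0,1) -> d(a,p)=4, d(a,n)=1
#   loss = max(4 - 1 + 0.2, 0) = 3.2
hard = tf.constant([[0., 0., 2., 0., 0., 1.]])
# easy triplet: a=(0,0), p=(1,0), n=(0,3) -> d(a,p)=1, d(a,n)=9
#   loss = max(1 - 9 + 0.2, 0) = 0
easy = tf.constant([[0., 0., 1., 0., 0., 3.]])

print(triplet_loss(None, hard).numpy())  # 3.2
print(triplet_loss(None, easy).numpy())  # 0.0
```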
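As the warning box above says, this note picks triplets at random. Below is a minimal sketch of what offline semi-hard mining could look like, assuming embeddings have already been computed for the whole training set; the function name and arguments are hypothetical, not part of the note's code:

```
import numpy as np

def semi_hard_negatives(embeddings, labels, anchor_idx, pos_idx, margin=0.2):
    # hypothetical helper: for a fixed (anchor, positive) pair, return the
    # indices of all semi-hard negatives in the dataset
    a = embeddings[anchor_idx]
    # squared distance from the anchor to its positive, matching triplet_loss
    d_ap = np.sum((a - embeddings[pos_idx]) ** 2)
    # squared distances from the anchor to every sample
    d_an = np.sum((a - embeddings) ** 2, axis=1)
    # semi-hard region: d(a, p) < d(a, n) < d(a, p) + margin, other classes only
    mask = (labels != labels[anchor_idx]) & (d_an > d_ap) & (d_an < d_ap + margin)
    return np.where(mask)[0]
```

Triplets built from these negatives have a small but positive loss, which avoids both the zero-gradient easy triplets and the unstable hard ones described earlier.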
* main

```
import os
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder

def main():
    dir_train = "./train"

    # load the dataset (data_generator and classes_num are defined elsewhere)
    df = data_generator(dir_train, classes_num)
    print("Length:", len(df))

    # normalize the images and integer-encode the labels
    le = LabelEncoder()
    x_train = np.array(list(df.image.values)) / 255
    print(x_train.shape)
    le.fit(df["label"].values)
    y_train = le.transform(df["label"].values)

    # generate the triplet images
    anchor_images, same_images, diff_images = generate_triplets(
        x_train, y_train, num_same=12, num_diff=12)
    print(anchor_images.shape, same_images.shape, diff_images.shape)

    # CNN model for the triplets
    anchor_input = tf.keras.layers.Input((img_size, img_size, 3), name='anchor_input')
    positive_input = tf.keras.layers.Input((img_size, img_size, 3), name='positive_input')
    negative_input = tf.keras.layers.Input((img_size, img_size, 3), name='negative_input')

    # get_model() is any CNN you write yourself; anything works as long as
    # it outputs a feature vector, e.g. shape [128]
    shared_dnn = get_model()
    encoded_anchor = shared_dnn(anchor_input)
    encoded_positive = shared_dnn(positive_input)
    encoded_negative = shared_dnn(negative_input)

    # concatenate the three embeddings into one output; this model takes the
    # triplet images and emits their merged feature vectors
    merged_vector = tf.keras.layers.concatenate(
        [encoded_anchor, encoded_positive, encoded_negative],
        axis=-1, name='merged_layer')
    model = tf.keras.Model(inputs=[anchor_input, positive_input, negative_input],
                           outputs=merged_vector)
    model.summary()

    # train with the triplet loss
    model.compile(loss=triplet_loss, optimizer="adamax")

    # save the weights
    weight_dir = './weight'
    if not os.path.exists(weight_dir):
        os.mkdir(weight_dir)
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        filepath=weight_dir + '/checkpoint-{epoch:02d}.hdf5')

    # the labels are dummies; the loss depends only on the three images
    Y_dummy = np.empty((anchor_images.shape[0], 1))
    model.fit([anchor_images, same_images, diff_images], y=Y_dummy,
              batch_size=64, epochs=50, callbacks=[checkpoint])

    # at inference time only the anchor branch is needed to embed an image
    anchor_model = tf.keras.Model(inputs=anchor_input, outputs=encoded_anchor)
```

## reference

[使用tripleloss訓練臉部辨識模型](https://chtseng.wordpress.com/2020/04/29/%E4%BD%BF%E7%94%A8tripleloss%E8%A8%93%E7%B7%B4%E8%87%89%E9%83%A8%E8%BE%A8%E8%AD%98%E6%A8%A1%E5%9E%8B/)

[完全解析tripletloss](https://zhuanlan.zhihu.com/p/295512971)