EfficientNet Project

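# Predicting Hip Joint Keypoints in X-ray Images with EfficientNet

## Project Goals

- Predict 8 hip joint keypoints in each X-ray image and output the 16 corresponding coordinate values (x and y for each keypoint).
- Train and fine-tune EfficientNetB0 as the backbone, combined with data augmentation to improve generalization.

---

## Project Progress

### 1. Data Preprocessing

**Code:**

```python
import os

import cv2
import numpy as np
import pandas as pd

# labels_folder_path and images_path are defined elsewhere in the notebook.

def load_labels(image_name):
    """Read one image's label CSV and return a flat [x0, y0, x1, y1, ...] array."""
    label_path = os.path.join(labels_folder_path, image_name.replace('.jpg', '.csv'))
    labels_df = pd.read_csv(label_path, header=None)
    labels = labels_df.values.flatten()
    parsed_labels = []
    for label in labels:
        # Each cell holds a string of the form "(x, y)"
        label = label.strip().strip('()')
        x, y = map(float, label.split(','))
        parsed_labels.extend([x, y])
    return np.array(parsed_labels)

def load_image(image_name, target_size=(224, 224)):
    """Load a grayscale image, resize it, scale to [0, 1], and expand to 3 channels."""
    image_path = os.path.join(images_path, image_name)
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError(f"Could not read image: {image_path}")
    original_size = (image.shape[1], image.shape[0])  # (width, height)
    image = cv2.resize(image, target_size)
    image = image.astype(np.float32) / 255.0
    image = np.stack([image] * 3, axis=-1)  # expand to 3 channels
    return image, original_size

def preprocess_labels(labels, original_size, target_size=(224, 224)):
    """Scale keypoints from original pixel coordinates to the resized image."""
    x_ratio = target_size[0] / original_size[0]
    y_ratio = target_size[1] / original_size[1]
    return labels.astype(float) * np.array([x_ratio, y_ratio] * (len(labels) // 2))
```

- **Image processing:** Resize each grayscale image to 224x224, scale pixel values to [0, 1], and replicate the single channel into 3 channels.
- **Label processing:** Scale each image's `(x, y)` label coordinates into the 224x224 range.

---

### 2. Data Augmentation

**Code:**

```python
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),  # horizontal flip
    iaa.Affine(
        rotate=(-5, 5),  # small rotation range
        translate_percent={"x": (-0.01, 0.01), "y": (-0.01, 0.01)},  # small translation range
        scale=(0.98, 1.02)  # small scaling range
    ),
    iaa.Multiply((0.95, 1.05))  # slight brightness change
], random_order=True)
```

- Define the augmentation policy with `imgaug`: flipping, rotation, translation, scaling, and brightness adjustment.
- Apply each augmentation to the image and its keypoints together so they stay aligned.

---

### 3. Model Design

**Code:**

```python
import tensorflow as tf
import efficientnet.tfkeras as efn

resolution = 224  # matches the 224x224 target size used in preprocessing

base_model = efn.EfficientNetB0(input_shape=(resolution, resolution, 3),
                                include_top=False, weights='imagenet')

# Freeze every layer except the last 24
base_model.trainable = True
for layer in base_model.layers[:-24]:
    layer.trainable = False

global_avg_pool = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)
dense_1 = tf.keras.layers.Dense(256, activation='relu',
                                kernel_regularizer=tf.keras.regularizers.l2(0.02))(global_avg_pool)
dense_2 = tf.keras.layers.Dense(128, activation='relu')(dense_1)
dropout = tf.keras.layers.Dropout(0.5)(dense_2)
output_layer = tf.keras.layers.Dense(16, activation='linear')(dropout)  # 8 keypoints x (x, y)

model = tf.keras.models.Model(inputs=base_model.input, outputs=output_layer)
```

- Use EfficientNetB0 as the backbone and add custom fully connected layers for keypoint regression.
- Freeze all layers except the last 24 so that the pretrained weights are reused.

---

### 4. Loss Function Design

**Code:**

```python
def cosine_similarity_loss(y_true, y_pred):
    def get_vectors(keypoints):
        # Reshape to (batch, 8, 2) so the difference vectors are computed
        # within each sample and never span two samples in the batch
        keypoints = tf.reshape(keypoints, [-1, 8, 2])
        vectors = keypoints[:, :-1] - keypoints[:, 1:]
        return vectors

    y_true_vectors = get_vectors(y_true)
    y_pred_vectors = get_vectors(y_pred)
    dot_product = tf.reduce_sum(y_true_vectors * y_pred_vectors, axis=-1)
    norm_true = tf.norm(y_true_vectors, axis=-1)
    norm_pred = tf.norm(y_pred_vectors, axis=-1)
    cosine_similarity = dot_product / (norm_true * norm_pred + 1e-8)
    return tf.reduce_mean(1 - cosine_similarity)

def combined_loss(y_true, y_pred):
    mse_loss_value = tf.keras.losses.MeanSquaredError()(y_true, y_pred)
    cosine_loss_value = cosine_similarity_loss(y_true, y_pred)
    return mse_loss_value + 0.5 * cosine_loss_value
```

- The custom loss combines two terms:
  - MSE (mean squared error) on the coordinates.
  - A cosine similarity term on the vectors between consecutive keypoints, weighted by 0.5, to keep the relative positions of the keypoints consistent.

---

### 5. Model Training

**Code:**

```python
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss=combined_loss,
              metrics=['mae', 'mse'])

history_stage1 = model.fit(train_generator,
                           steps_per_epoch=steps_per_epoch,
                           validation_data=val_generator,
                           validation_steps=validation_steps,
                           epochs=30)
```

- **Training strategy:**
  - The initial stage trains with most layers frozen at a learning rate of `1e-4`.
  - The middle and late fine-tuning stages progressively unfreeze more EfficientNet layers.

---

### 6. Test Set Evaluation and Visualization

**Code:**

```python
import matplotlib.pyplot as plt

# evaluate() returns the loss followed by each compiled metric (mae, mse)
test_loss, test_mae, test_mse = model.evaluate(test_generator, steps=test_steps)
print(f"Test Loss: {test_loss}")
print(f"Test MAE: {test_mae}")

for i, (test_images, test_labels) in enumerate(test_generator):
    if i >= display_count:
        break
    predictions = model.predict(test_images)
    for j in range(len(test_images)):
        plt.imshow(test_images[j])
        # Even indices are x coordinates, odd indices are y coordinates
        plt.scatter(predictions[j][::2], predictions[j][1::2], color='r', label='Predicted')
        plt.scatter(test_labels[j][::2], test_labels[j][1::2], color='g', label='True')
        plt.legend()
        plt.show()
```

- Evaluate the loss and MAE on the test set.
- Visualize the model's predictions to check how well the keypoints are localized.

---

### 7. Training Curves

**Code:**

```python
# history_stage2 and history_stage3 come from the later fine-tuning stages
all_train_loss = (history_stage1.history['loss']
                  + history_stage2.history['loss']
                  + history_stage3.history['loss'])
all_val_loss = (history_stage1.history['val_loss']
                + history_stage2.history['val_loss']
                + history_stage3.history['val_loss'])

plt.plot(all_train_loss, label='Training Loss')
plt.plot(all_val_loss, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
```

- Plot the training and validation loss curves to observe how the model converges.

---

### Current Issues

![image](https://hackmd.io/_uploads/rkBLTWgByl.png)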
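---

A note on the label scaling from step 1: the model predicts keypoints in the 224x224 space, so predictions have to be scaled back before they can be compared with the original radiograph. The sketch below is one possible way to check that round trip with NumPy only. The helper `scale_keypoints` and the example image size and coordinates are hypothetical and are not part of the project code; it only assumes the same flat `[x0, y0, x1, y1, ...]` layout and `(width, height)` size convention used in `preprocess_labels`.

```python
import numpy as np

def scale_keypoints(labels, from_size, to_size):
    """Rescale flat [x0, y0, x1, y1, ...] keypoints between two (width, height) sizes.

    Hypothetical helper for illustration; it generalizes preprocess_labels
    so the same function can map in either direction.
    """
    labels = np.asarray(labels, dtype=float)
    x_ratio = to_size[0] / from_size[0]
    y_ratio = to_size[1] / from_size[1]
    return labels * np.array([x_ratio, y_ratio] * (len(labels) // 2))

# Made-up example: a 448x896 (width x height) image with two keypoints.
original_size = (448, 896)
target_size = (224, 224)
labels = np.array([100.0, 200.0, 448.0, 896.0])

scaled = scale_keypoints(labels, original_size, target_size)    # model input space
restored = scale_keypoints(scaled, target_size, original_size)  # back to original pixels

print(scaled)
print(restored)
```

Because x and y use different ratios whenever the original image is not square, an error measured in the 224x224 space does not correspond to a single pixel distance on the original image, which may be worth keeping in mind when reading the MAE.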