Background Remove for Moth Project

tags: `Moth project` `image segmentation` `Tool`

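Goal

Train a background-removal model for Lepidoptera specimens from scratch, with no labels at all

Challenges

No labeled data is available
Imaging practices (photography, lighting, etc.) are not standardized across specimen sources

Workflow

Use a pretrained YOLOv4 model to crop the main specimen region
Use an unsupervised method plus a commercial background-removal website (remove.bg) to obtain, by hand, a small number of background-removed images (mask/label)
Train the background-removal model iteratively with a supervised method (U-Net), gradually adding and updating labels to improve prediction accuracy
Handle the minority (<10%) of hard samples
- Modify the model's loss function
- Design the data sampling (stratified sampling by wing color pattern, data augmentation targeted at specific types)

code

moth_thermal_project/remove_bg
- Helper tools are in: Moth_thermal/remove_bg/tools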

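Reference workflow for Lepidoptera background removal:

Workflow developed by the Institute of Information Science (資訊所)
- github/colorful-moth
  - The background-removal model trained by the institute only works on specimen data from the Endemic Species Research Institute (特有生物中心)
  - It has to be retrained for this task

Data description:

Moth Thermal Project data code description
Moth Thermal Project Meta data
File naming rules
- Biodiversity Research Center (多樣性中心) specimen photos (TT): label and specimen are photographed separately
  - Naming rule: specimen photos use the extension .jpg and label photos use .JPG, so the two can be separated directly when reading
- Specimen photos archived by NCKU (成大) (RS): label and specimen are in the same photo
  - Run the YOLO model first to get the bounding boxes; for the files where detection fails, split the original image into left and right halves and run YOLO again
- Note: Chinese characters in the original file names may turn into garbled text
  - When compressing ZIP files with 7-Zip, be aware that Chinese file names may not decompress correctly
  - [分享]Ubuntu解壓縮zip檔亂碼問題快速解法 (quick fix for garbled file names when unzipping on Ubuntu)
```
# Specify the Traditional Chinese encoding (Big5) when unzipping
unzip -O big5 file
```
Visually screen out poor-quality samples
- Wings severely damaged, wings not fully spread, and other exceptions
- The file names can be written to a .csv or .txt file and excluded later in the batch crop-resize by Bounding Box step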
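
The naming rule and the exclusion list above can be combined in a few lines. This is only a sketch: the function name `select_specimen_files` and the example file names are made up for illustration. The only facts taken from these notes are that specimen photos end in `.jpg`, label photos end in `.JPG`, and rejected file names are kept in a list.

```python
from pathlib import PurePath


def select_specimen_files(filenames, excluded=()):
    """Keep lowercase-.jpg specimen photos, dropping label photos (.JPG)
    and any file name found in the exclusion list."""
    excluded = set(excluded)
    specimens = []
    for name in filenames:
        # The suffix comparison is case-sensitive on purpose:
        # ".jpg" is a specimen photo, ".JPG" is a label photo.
        if PurePath(name).suffix == ".jpg" and name not in excluded:
            specimens.append(name)
    return specimens


files = ["TT001.jpg", "TT001.JPG", "TT002.jpg", "TT003.jpg"]  # hypothetical names
bad = ["TT003.jpg"]  # e.g. names read from the .csv/.txt exclusion list
kept = select_specimen_files(files, bad)
```

On a case-insensitive file system the two extensions cannot coexist for the same stem, so this check is best done on the file listing as delivered.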

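I. Object detection: box the target location

Crop out the main subject (the moth specimen)
Use a YOLOv4 model trained for Lepidoptera
Read images with skimage.io
- Reading the image array directly avoids unknown errors caused by EXIF data (for example automatic rotation)
The final crop can be adjusted based on the bounding box coordinates output by YOLO
YOLO reads images with PIL by default; note that some images may have been rotated upright in the operating system, and with PIL the rotated EXIF orientation will be picked up

Data I/O principles and notes:

For intermediate images, avoid the JPG format, whose compression loses detail and causes distortion (use PNG instead)
Images going into unsupervised or supervised background removal are first padded to (256, 256) to avoid distorting the image
Read and write images with skimage directly as NumPy arrays, to avoid picking up deformed or rotated images through EXIF data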
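
The padding step can be sketched with NumPy alone. This is a minimal illustration, not the project's code: `pad_to_square` and its `fill` argument are hypothetical names, and the sketch only makes the image square. Resizing the square result to (256, 256) would follow as a separate step (for example with `skimage.transform.resize`), and the output would be saved as PNG.

```python
import numpy as np


def pad_to_square(img: np.ndarray, fill: int = 0) -> np.ndarray:
    """Pad an (H, W) or (H, W, C) array to a centered square without resizing."""
    h, w = img.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    pad = [(top, side - h - top), (left, side - w - left)]
    pad += [(0, 0)] * (img.ndim - 2)   # leave the channel axis untouched
    return np.pad(img, pad, mode="constant", constant_values=fill)


img = np.ones((100, 60, 3), dtype=np.uint8)   # a tall dummy image
square = pad_to_square(img)
```

Because the aspect ratio is preserved by padding rather than stretching, the later resize scales both axes by the same factor.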

Moth specimen detection and cropping workflow:

1.1 Run YOLOv4 once on the original data

Obtain the cropped images and the bounding box coordinates

1.2 Inspect the cropping results

NCKU (CARS) data
- Because the images are small, specimens tend to be cut off at the edges
  - After getting the bounding box from YOLO, enlarge the crop range manually
    - Inspect the bounding box crop range
    - Left: the original crop range is too small and cuts off the edge; right: crop range enlarged based on the bounding box
- Some specimens are too small to be detected
  - Split the image into left and right halves, then run YOLO to get the bounding box
    - The specimen in the original image is too small to be detected
    - Splitting the image in half before running YOLO helps detect the target
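
When detection is run on a half image, the resulting box is in the half's coordinates and has to be shifted back before cropping the original. The sketch below shows that bookkeeping. `split_halves` and `to_original_coords` are hypothetical helper names, and the detector call itself is left out (the box is a made-up value).

```python
import numpy as np


def split_halves(img: np.ndarray):
    """Split an image into left and right halves; return each half with its x offset."""
    mid = img.shape[1] // 2
    return [(img[:, :mid], 0), (img[:, mid:], mid)]


def to_original_coords(bbox, x_offset):
    """Shift a (left, top, right, bottom) box found in a half back to full-image coordinates."""
    left, top, right, bottom = bbox
    return (left + x_offset, top, right + x_offset, bottom)


img = np.zeros((400, 1000, 3), dtype=np.uint8)
halves = split_halves(img)
# Pretend the detector found this box in the right half.
bbox_in_right_half = (50, 100, 250, 300)
bbox_full = to_original_coords(bbox_in_right_half, halves[1][1])
```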
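Biodiversity Research Center (SJTT) data
- Labels and images are separate; they can be split with a regex during preprocessing
  - Based on the file name
- The images are less clean: side views, genitalia photos, and label photos mislabeled as specimen photos
  - Inspect manually, record the problematic file names, and exclude them
- Scale bars and color charts are more likely to end up in the crop

1.3 Countermeasures

1.3.1 After obtaining the bounding box coordinates, enlarge the selection directly from those coordinates and crop the image

The bounding box coordinates are, in order: left, top, right, bottom
Scale factor: scale = 1.1 - 1.2
- For example, moving the left boundary out by 0.05 × width and the right boundary out by 0.05 × width gives 1.1× the width in total
The scale range for height (top and bottom boundaries) can be somewhat larger than for left and right
Use the absolute center position (x, y) of the bounding box ± 0.5 × width (or height) × scale

Bounding box cropping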

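code

```python
import numpy as np
from PIL import Image

# `file` (data source name), `row` (one record of the YOLO results) and
# `img` (image array) come from the surrounding loop.
# Extra margin on each axis, as a fraction of the box width/height.
scale_w = 0.1  if file == "SJTT" else 0.1
scale_ht, scale_hb = (0.0, 0.0) if file == "SJTT" else (0.1, 0.1)

# YOLO output stored as "left,top,right,bottom"
bboxes_ = np.asarray([float(i) for i in row.bboxes.split(',')])

# Enlarge the original box around its center
w = bboxes_[2] - bboxes_[0]
h = bboxes_[3] - bboxes_[1]
x = bboxes_[0] + w/2
y = bboxes_[1] + h/2

bboxes = np.asarray([
    x - 0.5*(w * (1 + scale_w)),       # left
    y - 0.5*(h * (1 + scale_ht)),      # top
    x + 0.5*(w * (1 + scale_w)),       # right
    y + 0.5*(h * (1 + scale_hb))       # bottom
])

image = Image.fromarray(img)
# bboxes_reset = restrict_boundry(bboxes)
image_cropped = image.crop(tuple(bboxes))
```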
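
The commented-out `restrict_boundry` call is not shown in these notes. One plausible reading is that it clips the enlarged box to the image bounds, since an enlarged box can extend past the edge of a small image. The helper below is a guess under that assumption, with a hypothetical signature.

```python
import numpy as np


def restrict_boundary(bboxes, img_width, img_height):
    """Clip a (left, top, right, bottom) box so it stays inside the image.

    Hypothetical helper: the project's own implementation is not shown.
    """
    left, top, right, bottom = bboxes
    return np.asarray([
        max(left, 0.0),
        max(top, 0.0),
        min(right, float(img_width)),
        min(bottom, float(img_height)),
    ])


enlarged = np.asarray([-12.0, 30.5, 530.0, 410.0])
clipped = restrict_boundary(enlarged, img_width=512, img_height=400)
```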

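II. Background removal

Use manual background removal plus unsupervised background removal (Unsup_train.py) to produce the desired mask (label)
- Illustration of a background-removed image
  - Left: original image; middle: mask; right: background-removed image (background filled with blue)
- The labels here are pixel-level, for example:
  - Specimen subject: class 0
  - Background (to be removed): class 1
Train the supervised background-removal model (Sup_train.py)
- Use the mask (label) from the previous step as Y and the specimen photo as X
- Train the model until its background-removal output matches the input Y (mask)
Use the trained background-removal model (Sup_predict_rmbg.py) to remove backgrounds and obtain masks
Auxiliary post-processing tool (Postprocess.py)
- Starting from the mask produced in the previous step, refine the mask contour with traditional computer vision techniques such as Dense CRF and OpenCV's findContours (find_cntr)
  - (While the supervised model is not trained yet, the first batch of masks has to come from manual plus unsupervised background removal)
  - The find_cntr function in Postprocess.ipynb extracts the largest contour in the image and fills it (this removes stray specks automatically)
- Finally visualize the results and screen the masks by eye
- Tip: fill the removed background with a color that never appears on moth wings, which makes the contour easier to judge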
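
The tip about a contrasting background color amounts to one masked assignment. A minimal sketch, assuming the mask convention used in section 2.1 (subject 255, background 0) and an RGB image. `fill_background` is a hypothetical name, not a function from the project.

```python
import numpy as np


def fill_background(img: np.ndarray, mask: np.ndarray, color=(0, 0, 255)) -> np.ndarray:
    """Return a copy of `img` with every background pixel (mask == 0) set to `color`."""
    out = img.copy()
    out[mask == 0] = color
    return out


img = np.full((4, 4, 3), 120, dtype=np.uint8)   # dummy gray image
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255                            # subject in the center
preview = fill_background(img, mask)            # background becomes bright blue
```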
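~~Different handling by data characteristics~~
- Cluttered background
  - Make the mask by hand in Paint
- Simple background
  - Low-contrast patterns
    - First pick out the low-contrast images programmatically
    - Increase the contrast, then process them together with the other simple-background images

2.1 Unsupervised method (Unsupervised Segmentation)

Moth_thermal/remove_bg/Unsup_train.py

The goal is to let the unsupervised method separate background from subject and produce a mask for each Lepidoptera image
The result is a black-and-white mask with subject 255 (white) and background 0 (black)
Modified from Unsupervised Image Segmentation by Backpropagation by kanezaki

2.1.1 Order of operations for unsupervised + manual background removal:

a. Obtain some masks directly with the unsupervised method

Feed in the original image directly
For images that are hard to process
- Use an image whose RGB channel differences have been maximized with PCA
  - Add it to the original image as a fourth channel
  - Save it separately as a standalone image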
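
One possible reading of "maximizing RGB differences with PCA" is to project every pixel onto the first principal component of the RGB values and use that projection as an extra channel. The sketch below does this with a plain SVD. It is an assumption about the preprocessing, not the project's code, and `pca_first_component` is a hypothetical name.

```python
import numpy as np


def pca_first_component(img: np.ndarray) -> np.ndarray:
    """Project RGB pixels onto their first principal component, rescaled to 0-255."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]
    span = proj.max() - proj.min()
    if span == 0:                       # flat image: nothing to stretch
        return np.zeros((h, w), dtype=np.uint8)
    scaled = (proj - proj.min()) / span * 255.0
    return scaled.reshape(h, w).astype(np.uint8)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)   # dummy image
pc1 = pca_first_component(img)          # could be saved as a standalone image
rgb_pc = np.dstack([img, pc1])          # or appended as a fourth channel
```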

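b. Where the unsupervised method cannot produce a mask directly, use the RGB color-block image it outputs as an intermediate result

Fill the background with black using the fill tool in Paint
Keep the black background (0) and turn every color of the specimen subject to white (255)
Use Postprocess.py to extract the largest subject contour and fill it, which gives a better mask
- The core operation is:
  - From the input mask, find the largest contour in the image and fill it

c. Handle by hand whatever the previous two steps cannot process:

- Background removal on the remove.bg website

Paid website; each image costs about 4-6 NTD
In private browsing mode, more than 500 images could be processed in our tests
After getting the background-removed image, set the transparent region to 0 and the subject to 255 to obtain the mask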
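
Turning a background-removed cut-out into a mask only needs the alpha channel. A minimal sketch, assuming the download is an RGBA array. `alpha_to_mask` and the `threshold` argument are hypothetical, and a threshold above 0 may suit soft, semi-transparent edges better.

```python
import numpy as np


def alpha_to_mask(rgba: np.ndarray, threshold: int = 0) -> np.ndarray:
    """Binary mask from an RGBA cut-out: 255 where alpha > threshold, else 0."""
    alpha = rgba[..., 3]
    return np.where(alpha > threshold, 255, 0).astype(np.uint8)


rgba = np.zeros((3, 3, 4), dtype=np.uint8)
rgba[1, 1] = (200, 180, 90, 255)   # one opaque subject pixel
rgba[0, 1] = (200, 180, 90, 0)     # colored but fully transparent pixel
mask = alpha_to_mask(rgba)
```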

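- Manual background removal with Paint 3D

For the images remove.bg fails on
Use the Magic Select tool
Remove the selected subject and keep the background, then convert the result to a mask

Parameter settings for Unsupervised Image Segmentation by Backpropagation

Initial segmentation method and parameter choices

Comparison of segmentation and superpixel algorithms
SLIC:
- compactness 0.1-10. The smaller the value, the more readily similar color patches are grouped into one segment
- sigma: values close to 0.0 work better; up to 0.5 is usable. The larger the value, the smoother the lines and the less sensitive the result is to subtle differences between color patches
- lr=0.1
- It is hard to get a mask directly after filling the background
- But it is well suited to producing the RGB image that is later filled in Paint to get the background
FELZENSZWALB
- Starting at 1000, up to 3000
- sigma: start from 0.5; range 0.5-1
- lr=0.05
- For some samples where subject and background separate well, a mask can be obtained directly, but the success rate in trials was about 10%

For different image types

Wing outline and contrast are clear
Wings are transparent or white and blend into the background
- Preprocess the image first with PCA (maximizing the differences among the three RGB channels), contrast adjustment, etc., to strengthen the difference between the specimen outline and the background
- minlab can be set a little larger (10 or less is suggested) so the background is not too hard to fill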

2.2 Supervised method (Supervised Segmentation)

The core technique is U-Net

2.2.1 Training

Sup_train_pytorch.py
Use the unsupervised + post-processed masks as labels (per pixel)

2.2.1.1 Hyperparameter tuning and regularization

learning rate
- Add lr warmup
- Switch from RMSprop to Adam/AdamW
- Add an lr scheduler
Add early stopping
- Based on the Dice score
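
The warmup and early-stopping ideas can be illustrated without any framework. This is a sketch under assumptions: a linear warmup shape, a `patience` counter and a `min_delta` margin are common choices but are not stated in these notes, and `warmup_lr` and `EarlyStopping` are hypothetical names.

```python
def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linear warmup: ramp the learning rate up to base_lr over warmup_steps."""
    if step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


class EarlyStopping:
    """Stop when the validation Dice score has not improved for `patience` epochs."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, dice: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if dice > self.best + self.min_delta:
            self.best = dice
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
history = [0.80, 0.85, 0.84, 0.85, 0.90]   # made-up validation Dice scores
stopped_at = None
for epoch, dice in enumerate(history):
    if stopper.step(dice):
        stopped_at = epoch
        break
```

With `patience=2` the loop stops before the late improvement to 0.90 is seen, which is the usual trade-off when choosing the patience value.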

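2.2.1.2 Data Augmentation

Moth_thermal/remove_bg/utils/data_loading_moth.py
Augmentation methods:
- Geometric transforms: applied to x (image) and y (label) together
  - Deformation, translation, rotation
- Color adjustment and added noise: x (image) only
  - Weak:
    - Brightness (Multiply), linear contrast (LinearContrast)
    - Patches (CoarseDropout)
  - Strong:
    - Randomly and independently change the contrast and the absolute value distribution of each of the three channels
      - Histogram Equalization, Multiply (brightness), hue and saturation
        
        For images such as white or transparent wings, raising the brightness too much easily makes the boundary details disappear completely and can hurt training
    - Gaussian noise of various sizes, all-black and all-white patches (CoarseDropout, Cutout), JPEG compression (JpegCompression)
    - Blur, sharpen
- Illustration of the effect
- Results
  - ~~Stronger augmentation gives better prediction (generalization) on unseen data types but weakens edge detail~~
    - Training metrics (black: strong, blue: weak)
      - ~~Strong: loss and Dice score on the validation data are both worse, but actual predictions on data from unknown sources are better~~
    - Predicting data from a different source than the training data
      - Left (weak), right (stronger augmentation)
    - Predicting data from the same source as the training data
      - Left (weak), right (stronger augmentation). Note the edge details
  - When a new type of data is added to the dataset, stronger augmentation can be used to get initial labels (sacrificing a little detail). Once enough data has accumulated (covering all source types), a version with weaker augmentation can be trained to fit the known data range and recover better detail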
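
The rule "geometry on image and label together, color on the image only" is the part that is easy to get wrong. The NumPy sketch below illustrates just that rule with a flip and a brightness change. It is a stand-in and not the augmentation pipeline in data_loading_moth.py. `augment_pair` and its parameters are hypothetical.

```python
import numpy as np


def augment_pair(img, mask, rng, brightness=(0.8, 1.2)):
    """Flip image and mask together; change brightness on the image only."""
    if rng.random() < 0.5:
        # Geometric transform: must be identical for x and y
        img, mask = img[:, ::-1], mask[:, ::-1]
    # Photometric transform: image only, the label stays untouched
    factor = rng.uniform(*brightness)
    img = np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)
    return img, mask.copy()


rng = np.random.default_rng(42)
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, :4] = 100                          # "subject" on the left half
mask = np.zeros((8, 8), dtype=np.uint8)
mask[:, :4] = 1
aug_img, aug_mask = augment_pair(img, mask, rng)
```

Whatever the random flip decides, the bright region of `aug_img` still coincides with the foreground of `aug_mask`, and the mask values remain 0/1.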

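2.2.1.3 Loss function weighting

Weight the region along the moth outline
- Loss function:
  - CrossEntropy() + dice_loss()
    => CrossEntropy(pixel-wise loss * weight) + dice_loss(all area) + dice_loss(contour area) * weight
  - Metrics to Evaluate your Semantic Segmentation Model
- Effect of weighting
  - Left: original loss; right: weighted loss. The contour improves, but because the weight is too high, predictions outside the contour get worse in places (Postprocess.py can handle this simply)
  - Illustration of the contour weighting region
    - Left: original image; middle: mask; right: weighted region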
code

```python
import cv2
import numpy as np
import torch
import torch.nn.functional as F

# In the project this helper is imported with:
# from utils.utils import get_masks_contour


def get_masks_contour(masks: np.ndarray, iterations: int = 5, weighted: float = 2.0) -> torch.Tensor:
    """Return a 0/1 band along the mask outline: dilated mask minus eroded mask."""
    # `weighted` is kept for call compatibility but is not used here.
    if masks.ndim == 2:
        masks = masks[np.newaxis, ...]
    assert masks.ndim == 3, f'Shape of mask must be (h,w) or (batch, h, w), mask.shape : {masks.shape}'
    assert masks.dtype == np.uint8, f'dtype of mask need to be "uint8", masks.dtype : {masks.dtype}.\nuse .astype("uint8") convert dtype'

    kernel_ELLIPSE = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, ksize=(3, 3))

    masks_contour = []
    for mask_ in masks:
        # Dilation uses a fixed 6 iterations; only the erosion follows `iterations`.
        mask_dilate = cv2.dilate(mask_, kernel_ELLIPSE, iterations=6)
        mask_erode = cv2.erode(mask_, kernel_ELLIPSE, iterations=iterations)
        mask_contour = mask_dilate - mask_erode
        masks_contour.append(mask_contour)

    # Stack first: building a tensor from a list of arrays is slow.
    return torch.as_tensor(np.stack(masks_contour), dtype=torch.long)


# ----------------------------------------------------------------------
# Inside the training loop. `masks_pred`, `masks_true`, `net`, `device`,
# `criterion_3d`, `dice_loss` and `optimizer` come from the training script.
# Get mask_contour and weight it for loss calculation (optional)
# ----------------------------------------------------------------------
masks_contour = get_masks_contour(
    masks_true.cpu().numpy().astype(np.uint8)).to(device)  # (b, h, w), torch.int64, [0, 1]
weight_CrossEntropy = 5
y = torch.ones(1, dtype=torch.int64).to(device)
masks_weighted = torch.where(
    masks_contour == 1, weight_CrossEntropy*y, y)  # (b, h, w)

# Calculate the weighted loss based on the contour
masks_pred_one_hot = F.softmax(
    masks_pred, dim=1, dtype=torch.float32)                 # (b, class, h, w)
masks_true_one_hot = F.one_hot(
    masks_true, net.n_classes).float().permute(0, 3, 1, 2)  # (b, h, w) > (b, h, w, class) > (b, class, h, w)

# Keep only the contour band; stacking two copies assumes 2 classes
masks_contour_pred_one_hot = masks_pred_one_hot * \
    torch.stack((masks_contour, masks_contour), dim=1)
masks_contour_true_one_hot = masks_true_one_hot * \
    torch.stack((masks_contour, masks_contour), dim=1)

cross_entrophy_weighted = (criterion_3d(                                        # nn.CrossEntropyLoss(reduction='none') : (b, h, w)
    masks_pred, masks_true) * masks_weighted).mean()
dice_loss_allArea = dice_loss(
    masks_pred_one_hot, masks_true_one_hot, multiclass=True)
dice_loss_contour = dice_loss(
    masks_contour_pred_one_hot, masks_contour_true_one_hot, multiclass=True)

train_loss_contour_weighted = cross_entrophy_weighted + \
    dice_loss_allArea + dice_loss_contour*3

optimizer.zero_grad()
train_loss_contour_weighted.backward()
optimizer.step()
```

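Other image segmentation loss functions ~~to be tested~~
- They mainly address class imbalance; focal loss (which can be seen as a weighted version of CE) is the most representative
  - For moth specimens, background and foreground (specimen) pixel counts are balanced. The problem is poor performance along the outline (high FN and FP rates for some image types), which differs in nature from the class imbalance that most loss-function techniques aim to solve
  - The paper review so far shows no single best choice; combining a distribution-based loss (CE or weighted CE) with a region-based loss (IoU, Dice, or weighted versions) is recommended because it behaves more stably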
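2.2.1.4 Stratified Proportional Sampling of the training and validation sets

By source and taxon

Address class imbalance
- Tour of Data Sampling Methods for Imbalanced Classification
Overall Dice loss and validation loss improved, but some image types (white wings, transparent wings, wings with white spots along the edge) still did not improve much
Stratified Proportional Sampling
- keep consistent class proportions in training and test datasets

Sampling by color pattern (directly from HSV image features)

Moth_thermal/remove_bg/tools/get_imgs_label_byHSV.py
Sample systematically: take a 5x5 or 3x3 patch at fixed positions in every photo and compute the mean HSV value. With 12 points, for example, this gives 36 values; cluster on these values, then do stratified sampling by cluster
Illustration of the sampling point layout (3x3 patches on the wings; the mean h, s, v is computed at each sampling point)
After computing h, s, v at the sampling points, reduce to 3 dimensions and visualize the clusters found by Birch
Whitish and transparent wings were assigned to cluster1 and cluster6 respectively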
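
The feature extraction described above can be sketched as follows. This is not the code in get_imgs_label_byHSV.py: the grid positions, the `hsv_features` name, and the per-pixel use of the standard-library `colorsys` module are assumptions made to keep the example self-contained. Hue is averaged naively even though it is a circular quantity.

```python
import colorsys
import numpy as np


def hsv_features(img, points, patch=3):
    """Mean (h, s, v) of a patch x patch window around each (row, col) point."""
    half = patch // 2
    feats = []
    for r, c in points:
        window = img[r - half:r + half + 1, c - half:c + half + 1].reshape(-1, 3)
        hsv = [colorsys.rgb_to_hsv(*(px / 255.0)) for px in window]
        feats.extend(np.mean(hsv, axis=0))   # three values per sampling point
    return np.asarray(feats)


img = np.zeros((256, 256, 3), dtype=np.uint8)
img[..., 0] = 255                              # a pure red test image
# 12 fixed sampling points on a 3 x 4 grid (positions are illustrative)
points = [(r, c) for r in (64, 128, 192) for c in (50, 100, 150, 200)]
features = hsv_features(img, points)           # 12 points x 3 values = 36 values
```

The resulting 36-value vectors, one per image, are what would then be reduced in dimension and clustered before the cluster label is used for stratification.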
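Discussion of clustering results
- As the number of samples grows, if the data sources (imaging practices) differ systematically, the clusters will likely just reflect the data source
- The clustering is not stable; it varies considerably with the number and position of sampling points, the samples themselves (size, properties), and the parameters of the dimensionality reduction algorithm (n_components, n_neighbors)
- Place the sampling points on the wings as much as possible so that the clusters follow wing color pattern
  - Specimens are not always centered or straight in the frame, so some samples will be off
- The types that are hard for the background-removal model (white wings, transparent wings, etc.) are not necessarily small (or imbalanced) classes, so
  - the train/valid split can use the labels to balance the data distribution
  - but the dataloader should not sample according to the wing-color clusters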

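code

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# `args`, `img_paths`, `mask_paths` and `val_percent` come from the training script.
# The rows of the CSV are assumed to be in the same order as `img_paths`.
df = None
if args.stratify:
    try:
        df = pd.read_csv('../data/imgs_label_byHSV.csv', index_col=0)
    except FileNotFoundError as err:
        # Stop here instead of continuing with an undefined `df`
        raise FileNotFoundError(
            'You need to provide image labels at "../data/imgs_label_byHSV.csv".') from err
    assert len(df.Name) == len(
        img_paths), f'number of imgs: {len(img_paths)} and imgs_label_byHSV.csv: {len(df.label)} need to be equal'
    print(
        f'Stratified sampling by "imgs_label_byHSV.csv", clustering: {np.unique(df.label).size}')

X_train, X_valid, y_train, y_valid = train_test_split(
    img_paths, mask_paths, test_size=val_percent, random_state=1,
    stratify=df.label if args.stratify else None)
```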

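2.2.1.5 Batch-level data augmentation on sampled minority or specific classes

Rationale: if every image in a batch is of the same type, the parameter update is pushed further toward handling that type of image
For the specific image types to be reinforced
- Single-image batch augmentation (SingleImgBatch)
  - One image × batch copies (the batch contains only that single image), each with different augmentations
- Multi-image same-type batch augmentation (MultiImgBatch)
  - Sample batch images of a specific type into the same batch (the batch contains several images of one type), each with different augmentations
- Implementation:
  - Define a custom sampler and use it in the DataLoader
  - For the sampler code see remove_bg/utils/sampler_moth.py
  - Modify the DataLoader in the main script Sup_train_pytorch.py (pass batch_sampler=batchsampler)
- Discussion of results:
  - Both work better than random sampling (RandomBatch) on the specific image types; SingleImgBatch and MultiImgBatch can be mixed in one training run (see the Mix implementation in ImgBatchAugmentSampler in the code)
  - Left to right: true_label, RandomBatch, SingleImgBatch, MultiImgBatch
  - In practice, train a Random model and a Mix model separately, predict masks with both, and pick the suitable ones
    - ImgBatchAugmentSampler in mix mode, with the specific-type samples resampled 3 times within one epoch (sample_factor=3), gives a model that generalizes worse but handles the specific image types better
    - ImgBatchAugmentSampler in Random mode gives a model that generalizes better but handles the specific image types slightly worse.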
code

```python
from pathlib import Path
from torch.utils.data import DataLoader

# MothDataset and the samplers are project code
# (the samplers are in remove_bg/utils/sampler_moth.py).

# ------------------------------------------------------
# AddBatchAugmentation: extra images and masks of the types to reinforce.
# Both lists are sorted so that images and masks pair up by position.
dir_img_arg = Path('../data/data_for_Sup_train/imgs_batch_arg')
img_arg_paths = sorted(dir_img_arg.glob('**/*.png'))
dir_mask_arg = Path('../data/data_for_Sup_train/masks_batch_arg')
masks_arg_paths = sorted(dir_mask_arg.glob('**/*.png'))
# ------------------------------------------------------

# ------------------------------------------------------
# AddBatchAugmentation: append the extra samples after the regular training set
X_train_arg = X_train + img_arg_paths
y_train_arg = y_train + masks_arg_paths
size_X_train = len(X_train)

# Sampler : SingleImgBatchAugmentSampler, MultiImgBatchAugmentSampler, RandomImgBatchAugmentSampler
batchsampler = MultiImgBatchAugmentSampler(X_train_arg, size_X_train, batch_size)

train_set = MothDataset(
    X_train_arg, y_train_arg, input_size=input_size, output_size=output_size, img_aug=True)
train_loader = DataLoader(
    train_set, batch_sampler=batchsampler, num_workers=2, pin_memory=True)
```
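Other ideas to implement

Increase sampling of minority or specific classes
Simulate each other's image characteristics across sources (TT vs RS)

2.2.2 Prediction

Moth_thermal/remove_bg/Sup_predict_pytorch.py
Produce the mask and the background-removed image together so they can be checked side by side
- Fill the background of the background-removed image with a color that does not occur on moth wings in nature (bright blue)
Inspect the results
- If some sample types perform poorly, go back to step 1 of the workflow to obtain representative samples
Pick representative samples that are hard to process as a BENCHMARK and evaluate on them after every training run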

==========================================================

References

Segmentation

Evaluation metrics for segmentation tasks

A survey of loss functions for semantic segmentation
- Focal Tversky Loss outperformed all other loss functions, whereas specificity(True Negative Rate) remained consistent across all loss functions. We have also observed similar outcomes in our past research. Focal Tversky loss and Tversky loss generally gives optimal results with right parameter values.
图像分割模型调优技巧，loss函数大盘点 (tuning tips for image segmentation models and a roundup of loss functions)
Loss Functions for Medical Image Segmentation: A Taxonomy
（分割网络评价指标）dice系数和IOU之间的区别和联系 (segmentation metrics: differences and connections between the Dice coefficient and IoU)
UNET

https://github.com/milesial/Pytorch-UNet
