Machine Learning Note

tags: Machine Learning

Notes on some terms I am less familiar with.

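Positive and negative samples

Suppose we want to classify an image to decide whether it shows a car. During training:

  • Images of cars are positive samples
  • Anything that is not a car is a negative sample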

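Hard and easy samples

The training set can be divided into hard samples and easy samples.

  • Hard positives: positive samples that are misclassified as negative. They can also be defined as the positive samples with the highest loss during training.

  • Hard negatives: negative samples that are misclassified as positive. They can also be defined as the negative samples with the highest loss during training.

  • Easy positives: positive samples that are easy to classify correctly, i.e. the positive class receives the highest probability. They can also be defined as the positive samples with the lowest loss during training.

  • Easy negatives: negative samples that are easy to classify correctly, i.e. the negative class receives the highest probability. They can also be defined as the negative samples with the lowest loss during training.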
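
The four categories can be illustrated with a small sketch. It assumes a binary classifier that outputs the probability of the positive class, uses binary cross-entropy as the per-sample loss, and a 0.5 decision threshold; the helper names and the sample values are made up for illustration.

```python
import math

def bce_loss(p, y):
    """Binary cross-entropy for one sample: p = predicted P(positive), y = 0 or 1."""
    eps = 1e-12  # keep log() away from zero
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def label_difficulty(p, y, threshold=0.5):
    """Name the category of a sample: misclassified samples are 'hard', others 'easy'."""
    predicted = 1 if p >= threshold else 0
    difficulty = "easy" if predicted == y else "hard"
    kind = "positive" if y == 1 else "negative"
    return f"{difficulty} {kind}"

# (predicted probability of the positive class, true label): made-up values
samples = [(0.95, 1), (0.20, 1), (0.05, 0), (0.80, 0)]
for p, y in samples:
    print(label_difficulty(p, y), round(bce_loss(p, y), 3))
```

In this toy data the misclassified samples are also the ones with the highest loss, which is why the two definitions are often used interchangeably.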

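Multi-label, multi-class, and regression

  • Multi-label classification
    A multi-label problem can be converted into a set of binary classification problems, one per label.

    Example: decide whether each of the following appears in an image.

    | House | Sky | Cat |
    |-------|-----|-----|
    | Yes   | Yes | No  |

  • Multi-class classification
    In handwriting recognition there are many classes, but only one of them is the correct class for a given input.

  • Regression
    The answer is a number, such as a predicted temperature or a probability of rain.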
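
The difference between multi-label and multi-class outputs can be sketched as follows. This is a minimal illustration, assuming the model produces one raw score (logit) per label; the logit values are invented. Multi-label applies an independent sigmoid to each score, while multi-class applies a softmax and picks a single winner.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["House", "Sky", "Cat"]
logits = [2.0, 1.0, -3.0]  # made-up model outputs

# Multi-label: one yes/no decision per label, several can be "yes" at once.
multi_label = {name: sigmoid(z) >= 0.5 for name, z in zip(labels, logits)}
print(multi_label)

# Multi-class: probabilities sum to 1 and exactly one class is chosen.
probs = softmax(logits)
multi_class = labels[probs.index(max(probs))]
print(multi_class)
```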

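Activation function

The main purpose of an activation function is to introduce non-linearity.

In a neural network, the input to a layer is usually a linear combination (matrix multiplication) of the previous layer's outputs. Without an activation function, a stack of such layers collapses into a single linear transformation, so making the network deep would be pointless.
Common activation functions include ReLU, Sigmoid, and tanh; ReLU is the most widely used.

ReLU(z) = max(0, z)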
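
The argument about linear layers can be checked numerically. The sketch below uses two small, made-up weight matrices and a hand-written matrix product: two linear layers without an activation give the same result as a single layer with the product matrix, while inserting ReLU between them changes the result.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def relu(m):
    """Apply ReLU(z) = max(0, z) to every entry."""
    return [[max(0.0, v) for v in row] for row in m]

W1 = [[1.0, -2.0], [0.5, 1.0]]   # made-up weights of layer 1
W2 = [[2.0, 1.0], [-1.0, 3.0]]   # made-up weights of layer 2
x = [[1.0], [2.0]]               # input as a column vector

two_linear = matmul(W2, matmul(W1, x))       # layer 2 applied to layer 1
one_linear = matmul(matmul(W2, W1), x)       # a single equivalent layer
with_relu = matmul(W2, relu(matmul(W1, x)))  # non-linearity in between

print(two_linear, one_linear, with_relu)
```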

Regression

  • Linear regression: used to predict a continuous value
  • Logistic regression: used for classification
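
A minimal sketch of the two kinds of prediction, assuming already-trained parameters (the weights, bias, and input below are invented) and a 0.5 threshold for the class decision:

```python
import math

def linear_predict(w, b, x):
    """Linear regression: the output is the continuous value w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic_predict(w, b, x, threshold=0.5):
    """Logistic regression: squash w.x + b into a probability, then threshold it."""
    p = 1.0 / (1.0 + math.exp(-linear_predict(w, b, x)))
    return p, int(p >= threshold)

w, b = [2.0, -1.0], 0.5  # made-up parameters
x = [1.0, 0.5]

print(linear_predict(w, b, x))    # a continuous value
print(logistic_predict(w, b, x))  # (probability, class label)
```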

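One-stage and two-stage

Object detection methods fall into two main families: one-stage and two-stage.
The authors of focal loss set out to let a one-stage detector reach the accuracy of a two-stage detector without sacrificing its speed.

They attribute the lower accuracy of one-stage detectors to class imbalance among the samples: far too many candidate boxes are background negatives that contain no object.

  • Two-stage: first find region proposals (candidate regions), then classify them. This approach is usually called two-stage learning. It can reach very high accuracy but is slower.
    Examples: R-CNN, Faster R-CNN, Mask R-CNN

  • One-stage: object localization and classification are done in a single step, i.e. one neural network both detects and classifies objects. This approach is fast and computationally cheaper; its accuracy is lower than two-stage methods but still within an acceptable range.
    Examples: YOLO, Single Shot Detector (SSD)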

Hint: Region proposal

IoU (Intersection over Union)

IoU = area of the intersection of two regions / area of their union
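
For axis-aligned boxes the ratio can be computed directly from corner coordinates. The sketch below assumes boxes are given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2; this box format is a choice made for the example, not something fixed by the note.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (it may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
print(iou((0, 0, 2, 2), (5, 5, 6, 6)))  # no overlap
```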

Non-Maximum Suppression (NMS)

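In object detection, we usually first select object candidates and then decide whether each candidate really is an object, but the same object is often covered by several candidate boxes. Non-Maximum Suppression (NMS) is commonly used to keep only the best box, as shown in the figure below.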

[Figure not available]

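The selected box is called a bounding box or boundary box (BBox). Besides its center (x, y) and its size (h, w), almost every BBox also carries a confidence score in [0, 1] that expresses how confident the detector is that the box is foreground rather than background. In a two-stage detector, score = 1 means the box certainly contains an object, but the class of that object is not yet known.

Two-stage: the feature map of each selected BBox is rescaled (usually with ROI pooling) and then passed to a classifier.

One-stage: each BBox has a center (x, y), a size (h, w), a confidence score, and the corresponding class probabilities.

Example: the figure below contains two dogs.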

[Figure not available]

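  • Step 1: Pick the box with the highest score among the remaining boxes, here the red box (0.9), and add it to Selected objects.
  • Step 2: Compute the IoU between that box and every other remaining box. If the IoU exceeds a threshold, the overlap is too high and the other box is removed.
  • Step 3: Repeat steps 1 and 2 on the remaining boxes until none are left. Selected objects is the result.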
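
The three steps can be sketched as a greedy loop. This is a minimal illustration, assuming boxes in (x1, y1, x2, y2) format and an IoU threshold of 0.5; the five boxes and their scores are invented to mimic two objects, each covered by several candidates.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept by greedy non-maximum suppression."""
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    selected = []
    while remaining:
        best = remaining.pop(0)          # Step 1: highest remaining score
        selected.append(best)
        remaining = [i for i in remaining  # Step 2: drop boxes that overlap too much
                     if iou(boxes[best], boxes[i]) <= iou_threshold]
    return selected                      # Step 3: loop until nothing is left

boxes = [
    (10, 10, 110, 110), (15, 15, 115, 115), (12, 8, 108, 112),  # first object
    (200, 50, 300, 150), (205, 55, 305, 155),                   # second object
]
scores = [0.9, 0.8, 0.7, 0.85, 0.6]
selected = nms(boxes, scores)
print(selected)
```

One box survives per object: the 0.9 box for the first and the 0.85 box for the second.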

Hint

  • Open interval: (a, b) means a < x < b
  • Closed interval: [a, b] means a <= x <= b

Reference ➜ NMS

OHEM (Online Hard Example Mining)

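Select hard examples (those the current model handles worst) as training samples, so that the network parameters are trained more effectively.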
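
The core selection step can be sketched as "keep the k samples with the highest loss in the current batch". This is a simplified illustration with a hypothetical helper and invented loss values; a full OHEM training loop involves more than this ranking step.

```python
def select_hard_examples(losses, k):
    """Return the indices of the k samples with the highest loss."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]

# Made-up per-sample losses from one forward pass.
losses = [0.1, 2.3, 0.05, 1.7, 0.4]
hard = select_hard_examples(losses, k=2)
print(hard)  # only these samples would contribute to the backward pass
```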

SVM techniques for multi-class classification

An SVM is a binary classifier whose output is ±1. When there are T classes, the following two approaches are commonly used:

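(1) One-against-Rest (One-vs-All, OvA, OvR): one versus the rest

  • First SVM: data in class 1 is labeled (+1) and all other classes (-1); this SVM separates the two groups
  • Second SVM: data in class 2 is labeled (+1) and all other classes (-1); this SVM separates the two groups
  • And so on

T classes therefore require T SVMs.
When a sample is fed into these T SVMs, we get (v1, v2, v3, ..., vT), where each vj ∈ {+1, -1}. The sample belongs to the class whose SVM outputs +1. For example, (-1, -1, +1, ..., -1) means the sample belongs to class 3.

Advantage: it does not consume much execution time or memory.
Disadvantage: during training, all remaining classes are merged into a single class (-1), so the numbers of positive and negative samples differ greatly and training suffers from class imbalance.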
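
The prediction step can be sketched as follows. The three "SVMs" here are hypothetical linear classifiers with hand-picked weights standing in for trained models; only the way their ±1 outputs are combined is the point of the example.

```python
def make_linear_classifier(w, b):
    """Hypothetical stand-in for a trained binary SVM: returns +1 or -1."""
    def classify(x):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1 if score >= 0 else -1
    return classify

def predict_ovr(classifiers, x):
    """Run all T classifiers; return the output vector and the classes that said +1."""
    outputs = [clf(x) for clf in classifiers]
    positives = [j + 1 for j, v in enumerate(outputs) if v == 1]  # 1-based class ids
    return outputs, positives

classifiers = [
    make_linear_classifier([1.0, 0.0], -2.0),   # class 1 vs rest
    make_linear_classifier([0.0, 1.0], -2.0),   # class 2 vs rest
    make_linear_classifier([-1.0, -1.0], 1.0),  # class 3 vs rest
]
outputs, positives = predict_ovr(classifiers, [0.2, 0.3])
print(outputs, positives)
```

If no classifier, or more than one, outputs +1, the ±1 vector alone is ambiguous; a common fallback is to compare the raw decision scores instead, which this sketch leaves out.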

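(2) One-against-One (OvO): one versus one

Each SVM is trained on the data of just two classes, and one SVM is trained for every possible pair of classes.
This gives T(T-1)/2 SVM models in total.

To predict a sample, feed it to all the SVM models. Each model assigns the sample to one of its two classes, and that class gets one vote. The class with the most votes wins.

The pros and cons are the opposite of One-against-Rest: each SVM sees only two classes, so class imbalance is less of a problem, but the larger number of models costs more execution time and memory.

Ties in the vote can happen. In that case a knockout round can be used: the tied classes advance to a final and are compared again.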
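
The voting scheme can be sketched as below. The pairwise "SVMs" are hypothetical nearest-center classifiers on a 1-D feature, chosen only so the example runs without training; the class centers are invented.

```python
from collections import Counter
from itertools import combinations

def num_pairwise_models(num_classes):
    """T(T-1)/2 models are needed for T classes."""
    return num_classes * (num_classes - 1) // 2

def make_pair_classifier(a, b, centers):
    """Hypothetical stand-in for an SVM trained on classes a and b only."""
    def classify(x):
        return a if abs(x - centers[a]) <= abs(x - centers[b]) else b
    return classify

def predict_ovo(classifiers, x):
    """Each pairwise model casts one vote; return the top class(es) and the tally."""
    votes = Counter(clf(x) for clf in classifiers)
    top = max(votes.values())
    winners = sorted(c for c, n in votes.items() if n == top)
    return winners, dict(votes)

centers = {1: 0.0, 2: 5.0, 3: 10.0}  # made-up class centers
classifiers = [make_pair_classifier(a, b, centers)
               for a, b in combinations(sorted(centers), 2)]

winners, votes = predict_ovo(classifiers, 4.0)
print(len(classifiers), winners, votes)
```

When `winners` contains more than one class, the vote is tied and a tie-break such as the knockout round described above is needed.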

Region proposal

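  • Two-stage detectors, represented by Faster R-CNN, select region proposals from the CNN feature map.
  1. RPN: The input image passes through a CNN that extracts a feature map, which is then fed to the Region Proposal Network (RPN). The RPN slides a window over the feature map; the center of each window position is called an anchor point. At every anchor point, k predefined boxes of different sizes (anchor boxes) are scored for how likely they are to contain an object. Each anchor point therefore yields 2k scores (positive and negative for each anchor box) and 4k offsets (x, y, w, h for each anchor box), as shown in the figure below.

    With a threshold of 0.5, the regions for which the RPN outputs p >= 0.5 are called ROIs (Regions of Interest). After bounding box regression, the ROIs give a first, rough set of bounding boxes. The classification output at this stage is a single probability p ∈ [0, 1]: the probability that the region contains an object of any of the 80 classes in the COCO dataset, without yet saying which class. This completes the RPN's job.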

    [Figure not available]


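    In the end, the RPN simply places many candidate anchors on the feature map and uses a CNN to decide which are positive anchors (containing a target) and which are negative anchors (containing none). It is only a binary classifier.

  2. ROI pooling: this layer takes the feature maps and the proposals as input and passes the result to an ordinary classification network, which outputs the classification. Only here does the class correspond to a specific category. The earlier boxes/regions are also refined once more; this is the Bounding-box regression at the upper right of the figure.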

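A closer look at the RPN architecture:

  • Assume each feature map is 13 * 13
    • Each anchor point has k = 9 proposals (3 scales and 3 aspect ratios)
    • The sliding window is 3 * 3 (stride = 1, padding = 1)

The input to this network is then a 3 * 3 window (the sliding window size). A 3 * 3 convolution with 256 output channels turns it into a 1 * 1 * 256 vector, which feeds two branches:

  1. Classification: a 1 * 1 convolution with 18 output channels gives a 1 * 1 * 18 feature vector (18 = 9 proposals * 2).
  2. Regression: a 1 * 1 convolution with 36 output channels gives a 1 * 1 * 36 feature vector (36 = 9 proposals * 4 (x, y, w, h)).

The sliding window keeps moving and the steps above are repeated, so every position on the feature map ends up with a 256-dimensional vector.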
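
The counts above can be reproduced with a few lines of arithmetic. The sketch assumes specific scale and aspect-ratio values purely for illustration (the note only says there are 3 of each), and defines the aspect ratio as height / width.

```python
import math

def make_anchor_shapes(scales, ratios):
    """Return (width, height) for every scale/ratio pair; ratio = height / width."""
    shapes = []
    for s in scales:
        for r in ratios:
            # Width and height are chosen so the area stays s * s for every ratio.
            shapes.append((s / math.sqrt(r), s * math.sqrt(r)))
    return shapes

feature_h, feature_w = 13, 13
scales = [128, 256, 512]   # assumed values
ratios = [0.5, 1.0, 2.0]   # assumed values

k = len(make_anchor_shapes(scales, ratios))   # anchors per anchor point
cls_channels = 2 * k                          # positive / negative score per anchor
reg_channels = 4 * k                          # (x, y, w, h) offsets per anchor
total_anchors = feature_h * feature_w * k

# A 3x3 window with stride 1 and padding 1 keeps the spatial size unchanged.
out_size = (feature_h + 2 * 1 - 3) // 1 + 1

print(k, cls_channels, reg_channels, total_anchors, out_size)
```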


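The role of region proposals

The COCO dataset has 80 object classes. Without a region proposal step to discard the background first, the background would drag down classification performance and make classification difficult. Two-stage detectors can therefore mitigate class imbalance (a large gap between the numbers of foreground and background samples).

Reference ➜ Region Proposal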


Envelope (waves)

An envelope is a concept from geometry: a curve that is tangent to every member of a family of curves at at least one point.

Reference ➜ Envelope


Precision

Precision = TP / (TP + FP)

TP = True Positives (predicted as positive and was correct)
FP = False Positives (predicted as positive but was incorrect)


From the image, we get:

True Positives (TP) = 1

False Positives (FP) = 0
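
With the counts above, precision is the fraction of positive predictions that are correct. A minimal sketch; the zero-denominator convention of returning 0.0 is a choice made for this example, and the second call uses made-up counts.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

print(precision(1, 0))  # the counts from the example above
print(precision(3, 1))  # made-up counts for comparison
```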


Recall

Recall = TP / (TP + FN)

TP = True Positives (predicted as positive and was correct)

FN = False Negatives (failed to predict an object that was there)
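
Recall is the fraction of actual positives that the model found. The sketch below counts TP and FN from two label lists; the lists are invented, with 1 meaning "object present / predicted" and 0 meaning "absent / not predicted".

```python
def recall_from_labels(y_true, y_pred):
    """TP / (TP + FN), counted over paired ground-truth and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

y_true = [1, 1, 1, 0, 0]  # made-up ground truth
y_pred = [1, 0, 1, 0, 1]  # made-up predictions
print(recall_from_labels(y_true, y_pred))  # 2 of the 3 real positives were found
```

The false positive in the last position lowers precision but does not affect recall.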


AP and mAP

The general definition of Average Precision (AP) is the area under the precision-recall curve.

mAP (mean average precision) is the mean of the AP values over all classes.
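
One simple way to approximate that area is to sum the precision at every rank where a true positive appears, weighted by the recall gained at that rank. The sketch below uses this uninterpolated form; detection benchmarks often use interpolated variants, so treat it as an illustration rather than a benchmark-exact metric. The detection list is made up.

```python
def average_precision(is_true_positive, num_ground_truth):
    """AP for one class.

    is_true_positive: detections sorted by descending confidence,
                      True where the detection matches a ground-truth object.
    num_ground_truth: number of ground-truth objects of this class.
    """
    tp = 0
    ap = 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            tp += 1
            # Recall rises by 1 / num_ground_truth at each hit; precision is tp / rank.
            ap += (tp / rank) / num_ground_truth
    return ap

def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

ap = average_precision([True, False, True, True, False], num_ground_truth=4)
print(ap)
print(mean_average_precision([ap, 1.0]))
```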