Machine Learning Note
Notes on some terms I am less familiar with.
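Positive and negative samples
Suppose we want to classify an image to decide whether it shows a car. During training, images of cars are the positive samples and images of anything else are the negative samples.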
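Hard and easy samples
The training set can be divided into hard samples and easy samples.
- Hard positives: positive samples that are misclassified as negative. They can also be defined as the positive samples with the highest loss during training.
- Hard negatives: negative samples that are misclassified as positive. They can also be defined as the negative samples with the highest loss during training.
- Easy positives: positive samples that are easy to classify correctly, i.e. the correct class gets the highest probability. They can also be defined as the positive samples with the lowest loss during training.
- Easy negatives: negative samples that are easy to classify correctly, i.e. the correct class gets the highest probability. They can also be defined as the negative samples with the lowest loss during training.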
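Multi-label, multi-class, regression
- Multi-label: decide, for a single image, whether each of several things is present in it.
- Multi-class: as in handwriting recognition, there are several classes but only one of them is the correct answer.
- Regression: the answer is a number, such as a predicted temperature or probability of rain.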
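The difference between the three tasks shows up most clearly in the shape of the training target. The sketch below is only an illustration: the object vocabulary, the helper names `multi_hot` and `one_hot_digit`, and the temperature value are all made up for the example.

```python
# Hypothetical object vocabulary for the multi-label case (illustration only).
OBJECTS = ["car", "dog", "person"]

def multi_hot(present):
    """Multi-label target: one independent 0/1 flag per object."""
    return [1 if name in present else 0 for name in OBJECTS]

def one_hot_digit(digit):
    """Multi-class target: exactly one of the ten digit classes is correct."""
    return [1 if d == digit else 0 for d in range(10)]

multi_label_target = multi_hot({"car", "person"})  # several flags may be 1 at once
multi_class_target = one_hot_digit(7)              # exactly one flag is 1
regression_target = 23.5                           # a real number, e.g. a temperature
```

A multi-label target may contain any number of ones, a multi-class target contains exactly one, and a regression target is not a set of flags at all.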
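Activation function
The main purpose of an activation function is to introduce non-linearity.
In a neural network, the input of a layer is usually a linear combination (matrix multiplication) of the previous layer's output. Without an activation function, stacked layers collapse into a single linear map, so a deep neural network loses its purpose.
Common activation functions include ReLU, Sigmoid, and tanh; ReLU is the most widely used.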
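The claim that layers without an activation collapse into one linear map can be checked numerically. The following is a small sketch in plain Python; the weight matrices, the input vector, and the helper names `matmul` and `relu` are made-up values and names chosen for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Apply ReLU element-wise: negative entries become 0."""
    return [[max(0.0, v) for v in row] for row in m]

W1 = [[1.0, -2.0], [0.5, 1.0]]   # first layer weights (made-up values)
W2 = [[2.0, 1.0], [-1.0, 3.0]]   # second layer weights (made-up values)
x = [[1.0], [2.0]]               # input as a column vector

# Without an activation, two layers equal one layer with weights W2 * W1.
two_linear_layers = matmul(W2, matmul(W1, x))
one_merged_layer = matmul(matmul(W2, W1), x)

# With ReLU in between, the output differs from the merged linear layer.
with_relu = matmul(W2, relu(matmul(W1, x)))
```

Here `two_linear_layers` and `one_merged_layer` are identical, while `with_relu` is not, because ReLU zeroes the negative entry produced by the first layer.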
Regression
- Linear regression: used to predict a continuous value
- Logistic regression: used for classification
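The two models share the same linear score and differ only in what they do with it. A minimal one-feature sketch follows; the function names, the parameters `w` and `b`, and the 0.5 decision threshold are assumptions for illustration.

```python
import math

def linear_regression(x, w, b):
    """Return the linear score directly as a continuous prediction."""
    return w * x + b

def logistic_regression(x, w, b):
    """Squash the linear score into (0, 1) and read it as P(class = 1)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def classify(x, w, b, threshold=0.5):
    """Turn the probability into a class label (threshold is an assumed default)."""
    return 1 if logistic_regression(x, w, b) >= threshold else 0
```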
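One-stage and two-stage
Object detection methods fall into two main categories: one-stage and two-stage.
The authors of focal loss set out to let a one-stage detector reach the accuracy of a two-stage detector without giving up its original speed.
They attribute the lower accuracy of one-stage detectors to class imbalance among the samples (far too many background negative boxes that contain no target).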
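To make the idea concrete, here is a sketch of the binary focal loss for a single prediction, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), next to plain cross-entropy. The note itself does not give the formula; the defaults γ = 2 and α = 0.25 and the function names are assumptions for this example.

```python
import math

def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction.
    p: predicted probability of the positive class, y: label (1 or 0)."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma.
    gamma and alpha defaults are assumed values for illustration."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_negative = focal_loss(0.01, 0)  # background box scored 0.01: almost no loss
hard_negative = focal_loss(0.90, 0)  # background box scored 0.90: loss stays large
```

The factor (1 - p_t)^γ shrinks the loss of well-classified boxes, so the many easy background boxes contribute little and the hard ones dominate training.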
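- Two-stage: first find region proposals (candidate regions), then recognize them; this is usually called two-stage learning. These algorithms can reach very high accuracy but are slower.
Example: R-CNN, Faster R-CNN, Mask R-CNN
- One-stage: object localization and recognition are done in a single step, i.e. one neural network both detects and recognizes objects. This approach is fast and needs less computation. Its accuracy is lower than two-stage, but still within an acceptable range.
Example: YOLO, Single Shot Detector (SSD)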
Hint: Region proposal
IoU (Intersection over Union)
IoU = area of the intersection of two objects (boxes) / area of their union
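The ratio can be computed directly for axis-aligned boxes. This sketch assumes boxes are given as corner coordinates (x1, y1, x2, y2) with x1 < x2 and y1 < y2; that format and the function name `iou` are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) corner format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamped to 0 when the boxes do not touch.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```

Identical boxes give 1, disjoint boxes give 0, and partial overlap gives a value in between.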
Non-Maximum Suppression (NMS)
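In object detection, object candidates are usually selected first, and each candidate is then judged to be an object or not. The same object is often covered by several candidate boxes, so Non-Maximum Suppression is commonly used to find the best box, as in the figure below.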
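The selected box is called a Bounding Box / Boundary Box (BBox). Besides its center (x, y) and its height and width (h, w), almost every BBox carries a confidence score. The score expresses how confident the model is that the box is foreground rather than background, with score ∈ [0, 1]. A score of 1 means the box certainly contains an object, although the class of that object is not yet known (two-stage).
Two-stage: the feature map of each selected BBox is rescaled (usually with ROI pooling) and then classified by a classifier.
One-stage: each BBox has a center (x, y), a height and width (h, w), a confidence score, and the corresponding class probabilities.
Example: the figure below contains two dogs.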
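- Step 1: pick the box with the current highest score, here the red box (0.9), and put it into Selected objects.
- Step 2: compute the IoU between that box and every other box. If the IoU exceeds a given threshold, the overlap is too high and the other box is deleted.
- Step 3: repeat steps 1 and 2 until no boxes remain. Selected objects is the result.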
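The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the corner box format (x1, y1, x2, y2), the function names `iou` and `nms`, and the default threshold of 0.5 are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) corner format."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    selected = []
    while remaining:                       # Step 3: loop until nothing is left
        best = remaining.pop(0)            # Step 1: take the highest-scoring box
        selected.append(best)
        # Step 2: drop every box that overlaps the chosen one too much.
        remaining = [i for i in remaining
                     if iou(boxes[best], boxes[i]) <= iou_threshold]
    return selected

# Two made-up clusters of overlapping boxes, e.g. one cluster per dog.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (51, 51, 61, 61)]
scores = [0.9, 0.8, 0.7, 0.75]
kept = nms(boxes, scores)
```

With these made-up boxes, one box survives per cluster: the 0.9 box in the first cluster and the 0.75 box in the second.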
Hint
- Open interval: (a, b) means a < x < b
- Closed interval: [a, b] means a <= x <= b
Reference ➜ NMS
OHEM (Online Hard Example Mining)
Select hard examples as the training samples in order to improve the learned network parameters.
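One simple way to express the selection step is to rank the samples of a batch by their current loss and keep only the highest ones for the parameter update. The sketch below shows just that ranking; the function name `select_hard_examples`, the `keep` parameter, and the loss values are hypothetical.

```python
def select_hard_examples(losses, keep):
    """Return the indices of the `keep` samples with the highest loss, hardest first."""
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:keep]

# Made-up per-sample losses from one forward pass over a batch.
per_sample_loss = [0.05, 2.30, 0.10, 1.20, 0.02]

# Only these samples would contribute to the update in this sketch.
hard_indices = select_hard_examples(per_sample_loss, keep=2)
```

Because the ranking is recomputed from the current losses on every batch, the set of hard examples changes as the network improves, which is what "online" refers to.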
SVM techniques for multi-class classification
An SVM is a binary classifier whose output is ±1. When there are T classes, the following two approaches are commonly used:
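(1) One-against-Rest (One-vs-All, OvA, OvR): one versus the rest
- First SVM: data of class 1 is labeled (+1) and all other classes (-1); this SVM separates the two
- Second SVM: data of class 2 is labeled (+1) and all other classes (-1); this SVM separates the two
- And so on…
T classes therefore need T SVMs.
When a data point is fed to these T SVMs, we get (v1, v2, v3, …, vT), each being ±1. The data point belongs to the class whose SVM outputs +1. For example, (-1, -1, +1, …, -1) means it belongs to class 3.
The advantage is that run time and memory use stay modest.
The disadvantage is that, during training, treating all remaining classes as one class (-1) makes the numbers of positive and negative samples very different, so class imbalance appears.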
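The relabeling and the decision rule can be sketched without any SVM library. The helper names below are hypothetical, and returning `None` when zero or several SVMs output +1 is an assumption of this sketch, since the note does not say how to handle that case.

```python
def ovr_labels(labels, positive_class):
    """Relabel a data set for one SVM: +1 for its own class, -1 for the rest."""
    return [+1 if y == positive_class else -1 for y in labels]

def ovr_predict(outputs):
    """outputs: the T values (v1, ..., vT), each +1 or -1.
    Return the 1-based class whose SVM fired, or None if the vote is not unique."""
    fired = [i + 1 for i, v in enumerate(outputs) if v == +1]
    return fired[0] if len(fired) == 1 else None

# Made-up labels with 3 classes, 2 samples each.
labels = [1, 1, 2, 2, 3, 3]
labels_for_svm_2 = ovr_labels(labels, positive_class=2)  # 2 positives vs 4 negatives

predicted = ovr_predict([-1, -1, +1, -1])  # the third SVM fired
```

Even with perfectly balanced classes, each SVM sees one class of positives against T-1 classes of negatives, which is the imbalance described above.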
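(2) One-against-One (OvO): one versus one
Take the data of two classes at a time and train one SVM for every possible pair of classes.
This gives T(T-1)/2 SVM models in total.
To predict a data point, feed it to all the SVM models. Each model assigns the point to one of its two classes, and that class gets +1 (like voting). The class with the most votes wins.
The advantages and disadvantages are the reverse of One-against-Rest.
Ties in the vote count can occur. In that case a play-off can be used: the tied classes advance to a final round and are judged again.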
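The voting scheme can be sketched with stand-in classifiers. In the example below the pairwise "SVMs" are hypothetical nearest-center rules on a one-dimensional input, used only so the code runs without a library; `ovo_predict`, `centers`, and `pairwise` are names invented for the illustration.

```python
from collections import Counter
from itertools import combinations

def ovo_predict(x, classes, pairwise):
    """pairwise[(a, b)] is a binary classifier that returns either a or b.
    Return (winners, votes); more than one winner means a tie to be replayed."""
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(classes, 2))
    top = max(votes.values())
    winners = sorted(c for c, v in votes.items() if v == top)
    return winners, votes

# Stand-in for trained SVMs: pick whichever class center is nearer (made-up data).
centers = {1: 0.0, 2: 5.0, 3: 10.0}
classes = sorted(centers)
pairwise = {
    (a, b): (lambda x, a=a, b=b: a if abs(x - centers[a]) <= abs(x - centers[b]) else b)
    for a, b in combinations(classes, 2)
}

winners, votes = ovo_predict(4.0, classes, pairwise)
```

With T = 3 classes the dictionary holds T(T-1)/2 = 3 classifiers, and the input 4.0 collects two votes for class 2 and one for class 1.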
Region proposal
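- Two-stage detection methods, represented by Faster R-CNN, select region proposals from the CNN feature map.
- RPN: the input image passes through a CNN that extracts a feature map, which first goes to the Region Proposal Network (RPN). The RPN slides a window over the feature map, and the center of each sliding window is called an anchor point. At each anchor point, k pre-defined boxes of different sizes (called anchor boxes) are each scored for how likely they are to contain an object. Each anchor point therefore yields 2k scores (positive and negative) and 4k offsets (x, y, w, h), as in the figure below.
Assuming threshold = 0.5, the regions for which the RPN outputs p >= 0.5 are called ROIs (Regions of Interest). After bounding box regression, the ROIs give the first, rough bounding boxes. The classification output at this point is a binary probability p ∈ [0, 1]: the probability that the region contains an object of some class among the 80 classes of the COCO dataset, without yet knowing which class. This completes the work of the RPN.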
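In the end, the RPN simply places many candidate anchors on the feature map and uses a CNN to decide which are positive anchors (containing a target) and which are negative anchors (containing none). It is nothing more than a binary classification!
- ROI pooling: this layer takes the feature maps and the proposals as input and passes the result to an ordinary classification network, which outputs the classification. Only here does the class correspond to a specific category. The earlier box/region is also corrected once more, which is the bounding-box regression at the top right of the figure.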
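A closer look at the RPN architecture:
- Assume each feature map is 13 * 13
- Each anchor point has k = 9 proposals (3 scales and 3 aspect ratios)
- The sliding window is 3 * 3 (stride = 1, padding = 1)
The input to this network is then a 3 * 3 patch (the sliding window size), which passes through a (3 * 3) * 256 convolution to give a 1 * 1 * 256 vector. This vector feeds two branches:
- Classification: a 1 * 1 convolution with 18 output channels gives a 1 * 1 * 18 feature vector (18 = 9 proposals * 2)
- Regression: a 1 * 1 convolution with 36 output channels gives a 1 * 1 * 36 feature vector (36 = 9 proposals * 4 (x, y, w, h))
Keep moving the sliding window and repeat the steps above; afterwards every point of the feature map has a 256-dimensional vector.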
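The counts above follow from simple arithmetic, sketched here. The helper names are hypothetical, and the concrete scale values (128, 256, 512) and aspect ratios (0.5, 1, 2) are assumed for illustration; the note only states that there are 3 of each.

```python
import math

def anchor_shapes(scales, ratios):
    """Return (w, h) for every scale/ratio pair.
    Each box has area scale ** 2 and height / width equal to the ratio."""
    return [(s / math.sqrt(r), s * math.sqrt(r)) for s in scales for r in ratios]

def rpn_output_sizes(feat_h, feat_w, k):
    """Sizes implied by k anchor boxes at every point of the feature map."""
    return {
        "anchors": feat_h * feat_w * k,  # one set of k boxes per anchor point
        "cls_channels": 2 * k,           # positive / negative score per box
        "reg_channels": 4 * k,           # (x, y, w, h) offsets per box
    }

shapes = anchor_shapes(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0))  # assumed values
sizes = rpn_output_sizes(13, 13, k=len(shapes))
```

For the 13 * 13 feature map with k = 9 this gives 1521 anchor boxes in total, with 18 classification channels and 36 regression channels per position.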
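What region proposals are for
The COCO dataset has 80 object classes. Without a region proposal step that removes the background first, the background would overwhelm the classifier and make classification difficult. Two-stage methods therefore avoid class imbalance (a very large gap between the number of foreground and background boxes).
Reference ➜ Region Proposal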
Envelope (waves)
An envelope is a concept from geometry: a curve that is tangent to every member of a family of curves at one point at least.

Reference ➜ Envelope
Precision
TP = True Positives (predicted as positive and was correct)
FP = False Positives (predicted as positive but was incorrect)

From the image, we get:
True Positives (TP) = 1
False Positives (FP) = 0
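With these definitions, precision is the fraction of positive predictions that are correct. Written out with the standard formula and the counts read from the image:

```latex
\text{Precision} = \frac{TP}{TP + FP} = \frac{1}{1 + 0} = 1
```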

Recall
TP = True Positives (predicted as positive and was correct)
FN = False Negatives (Failed to predict an object that was there)
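With these definitions, recall is the fraction of the objects actually present that the model found. The standard formula is:

```latex
\text{Recall} = \frac{TP}{TP + FN}
```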
AP, mAP
The general definition of Average Precision (AP) is the area under the precision-recall curve above.
mAP (mean average precision) is the mean of the per-class AP values.
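One way to approximate that area is to walk down the detections in order of decreasing score and add a rectangle each time recall increases. The sketch below does this without interpolation, which is an assumption: benchmark tools often use interpolated variants that give slightly different numbers. The function names and the example detections are made up, and `num_ground_truth` is assumed to be greater than zero.

```python
def average_precision(detections, num_ground_truth):
    """detections: list of (score, is_correct) pairs for one class.
    Returns the area under the precision-recall curve (rectangle rule)."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for _, is_correct in ranked:
        if is_correct:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_ground_truth
        ap += (recall - prev_recall) * precision  # area of one rectangle
        prev_recall = recall
    return ap

def mean_average_precision(ap_per_class):
    """mAP: the mean of the AP values of all classes."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# Made-up detections for one class with 2 ground-truth objects.
dog_ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_ground_truth=2)
```

A false positive lowers the precision of every later detection, so it reduces the area contributed by the correct detections ranked below it.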