
Introduction

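In object detection, the usual approach is to first propose candidate boxes and then decide whether each one actually contains an object. A single object, however, is typically covered by many candidate boxes; the left side of the figure below shows an object selected by several candidate boxes. To remove the redundant boxes and keep the best one, most work uses Non-Maximum Suppression (NMS).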

[Figure not available]

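The candidate boxes in the figure above are called bounding boxes (BBox). Each BBox usually carries the following information:

  • Box center (x, y)
  • Box height and width (h, w)
  • Confidence score
  • Object class probabilities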

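NOTE: The confidence score expresses how confident the detector is that the box is foreground rather than background (it is not a class probability). Its value usually lies between 0 and 1; score = 1 means the box definitely contains an object.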

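Suppose we detect two kinds of object (cats and dogs). Each BBox then typically contains the center (x, y), the height and width (h, w), the confidence score, and the probabilities that the box is a cat or a dog: 5 + 2 = 7 values in total. For detection over 100 classes, each BBox has 5 + 100 = 105 outputs.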
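To make the 5 + C count concrete, here is a minimal sketch that splits one flat BBox output into its parts. The ordering [x, y, h, w, confidence, class probabilities...] and the function name `split_bbox_output` are assumptions made for illustration; real detectors may order their outputs differently.

```python
import numpy as np


def split_bbox_output(vec, num_classes):
    """Split one flat BBox output into named parts.

    Assumed (hypothetical) layout: [x, y, h, w, confidence, p_0, ..., p_{C-1}].
    """
    vec = np.asarray(vec, dtype=float)
    if vec.shape != (5 + num_classes,):
        raise ValueError(f"expected {5 + num_classes} values, got {vec.shape}")
    return {
        "center": (vec[0], vec[1]),   # (x, y)
        "size": (vec[2], vec[3]),     # (h, w)
        "confidence": vec[4],         # foreground vs. background
        "class_probs": vec[5:],       # one probability per class
    }


# Two classes (cat, dog): 5 + 2 = 7 values per box.
box = split_bbox_output([0.5, 0.5, 0.2, 0.3, 0.9, 0.1, 0.9], num_classes=2)
```

With 100 classes the same function expects 105 values and returns 100 class probabilities.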

Non-Maximum Suppression

The figure below shows the flowchart of the NMS algorithm, which is hard to follow on its own.

[Figure not available]

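The two examples below, taken from a Medium article, illustrate how NMS works:

  • The image contains two dogs; how does NMS find both dogs from the candidate boxes?
  • The image contains one dog and one cat; how does NMS find the dog and the cat from the candidate boxes?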

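NMS steps

  1. Sort the BBoxes by confidence, pick the one with the highest confidence, and add it to the "set of confirmed objects".
  2. Compute the IoU between every other BBox and the one just picked. If a BBox's IoU exceeds the chosen threshold, set its confidence to 0 and discard it.
  3. Repeat steps 1 and 2 until every BBox has been processed (no box outside the set has confidence greater than 0). The boxes in the "set of confirmed objects" are the final result.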
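The three steps above can be sketched in plain Python as follows. This is a minimal illustration, not the code of any particular detector: it assumes boxes are given as (x, y, h, w) with (x, y) the center, and the function names `to_corners`, `iou`, and `nms` as well as the sample boxes are hypothetical.

```python
def to_corners(box):
    """Convert a (x, y, h, w) center-format box to (x1, y1, x2, y2) corners."""
    x, y, h, w = box
    return x - w / 2, y - h / 2, x + w / 2, y + h / 2


def iou(box_a, box_b):
    """Intersection over Union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept by NMS, in the order they were picked."""
    scores = list(scores)  # work on a copy so the caller's scores are untouched
    confirmed = []         # the "set of confirmed objects"
    while True:
        # Step 1: among boxes that are neither suppressed nor already confirmed,
        # pick the one with the highest confidence.
        candidates = [i for i, s in enumerate(scores)
                      if s > 0 and i not in confirmed]
        if not candidates:
            break  # Step 3: nothing left to process
        best = max(candidates, key=lambda i: scores[i])
        confirmed.append(best)
        # Step 2: zero the confidence of every box that overlaps the pick too much.
        for i in candidates:
            if i != best and iou(boxes[best], boxes[i]) > iou_threshold:
                scores[i] = 0.0
    return confirmed


# Hypothetical boxes: two heavily overlapping boxes plus one far away.
boxes = [(50, 50, 40, 40), (52, 52, 40, 40), (150, 50, 40, 40)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # box 1 overlaps box 0 heavily and is suppressed
```

Here `kept` holds the indices of the first and third boxes. The loop is quadratic in the number of boxes, which is acceptable for a sketch but is one reason candidates are usually thinned out before NMS.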

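Example 1: the image contains two dogs; how does NMS find both dogs from the candidate boxes?

Since this first example is only about locating the dogs, only five values are used:

  • BBox center (x, y) and BBox height and width (h, w)
  • Confidence score

The actual procedure is shown in the figure below: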

[Figure not available]

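  1. "Set of confirmed objects" = {} (empty set)

  2. (Run 1) Sort the BBoxes by confidence. The BBox with the highest confidence (red) is added to the "set of confirmed objects", and its IoU with every other BBox is computed. Suppose the pink BBox has IoU = 0.6, which exceeds the threshold (0.5 here), so its confidence is set to 0.
    "Set of confirmed objects" = {red BBox}

  3. (Run 2) Ignore BBoxes whose confidence is 0 and those already in the "set of confirmed objects". From the remaining BBoxes, pick the one with the highest confidence (yellow) and add it to the set, then compute its IoU with the rest. Suppose all the remaining BBoxes have IoU greater than 0.5, so their confidence is set to 0.
    "Set of confirmed objects" = {red BBox, yellow BBox}

  4. No remaining BBox has confidence > 0, so NMS ends.
    "Set of confirmed objects" = {red BBox, yellow BBox}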

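NOTE: This raises a question: why is the IoU threshold set to 0.5? Could it be a little higher? The figure below repeats the example above with the IoU threshold changed to 0.7.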

[Figure not available]

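The result in the figure above shows that if the IoU threshold is too high, the same object may be detected more than once. A box like the pink one, with IoU = 0.6, would no longer be suppressed at a threshold of 0.7.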
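The effect of the threshold can be reproduced with a small sketch. The two boxes below are hypothetical and were chosen so that their IoU is exactly 0.6, matching the pink box in Example 1; boxes are assumed to be (x, y, h, w) with (x, y) the center and a positive area, and the compact `nms` here is an illustration only.

```python
def iou(a, b):
    """IoU of two boxes given as (x, y, h, w), with (x, y) the center."""
    ax, ay, ah, aw = a
    bx, by, bh, bw = b
    inter_w = max(0.0, min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2))
    inter_h = max(0.0, min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2))
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while order:
        best = order.pop(0)
        kept.append(best)
        # Keep only the boxes that do NOT overlap the pick above the threshold.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return kept


# Two same-sized boxes shifted by a quarter of their width: IoU = 0.6.
boxes = [(50, 50, 40, 40), (60, 50, 40, 40)]
scores = [0.9, 0.8]

kept_at_05 = nms(boxes, scores, iou_threshold=0.5)  # second box is suppressed
kept_at_07 = nms(boxes, scores, iou_threshold=0.7)  # second box survives
```

At 0.5 only the first box is kept; at 0.7 both boxes are kept, so the same object is reported twice.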

Example 2: the image contains one dog and one cat; how does NMS find the dog and the cat from the candidate boxes?

[Figure not available]

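  1. "Set of confirmed objects" = {} (empty set)

  2. (Run 1) Sort the BBoxes by confidence. The BBox with the highest confidence (red) is added to the "set of confirmed objects", and its IoU with every other BBox is computed. The blue BBox has an IoU above the threshold, so its confidence is set to 0.
    "Set of confirmed objects" = {red BBox}

  3. (Run 2) Ignore BBoxes whose confidence is 0 and those already in the "set of confirmed objects". From the remaining BBoxes, pick the one with the highest confidence (purple) and add it to the set, then compute its IoU with the rest. All the remaining BBoxes have IoU greater than 0.5, so their confidence is set to 0.
    "Set of confirmed objects" = {red BBox, purple BBox}

  4. No remaining BBox has confidence > 0, so NMS ends.
    "Set of confirmed objects" = {red BBox, purple BBox}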

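Combined with the class probabilities, the BBoxes selected by NMS can now be classified (as in the figure above, every BBox carries its own set of probabilities).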
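One simple way to do this final classification, shown here as a sketch, is to take the most probable class for each kept box. The class order in `CLASS_NAMES`, the probabilities, and the kept indices are all hypothetical values chosen for illustration.

```python
import numpy as np

CLASS_NAMES = ["dog", "cat"]  # hypothetical class order


def classify_kept_boxes(class_probs, kept_indices):
    """Map each kept box index to the name of its most probable class."""
    probs = np.asarray(class_probs, dtype=float)
    return {int(i): CLASS_NAMES[int(np.argmax(probs[i]))] for i in kept_indices}


# One row of class probabilities per candidate box (hypothetical values).
class_probs = [
    [0.9, 0.1],  # box 0
    [0.8, 0.2],  # box 1 (assumed suppressed by NMS)
    [0.2, 0.8],  # box 2
]
labels = classify_kept_boxes(class_probs, kept_indices=[0, 2])
```

With these values, box 0 is labeled as a dog and box 2 as a cat.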


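In practice, a threshold is first applied to discard some of the candidate BBoxes. If, say, ten thousand BBoxes are pre-selected from one image, running NMS on the CPU afterwards takes a great deal of time. So some BBoxes are first filtered out by confidence score, and only then is NMS run, as shown in the figure below.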

[Figure not available]
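The pre-filtering step described above could look like the following sketch. The function name `prefilter_by_confidence`, the threshold value, and the randomly generated boxes are assumptions for illustration; a suitable threshold depends on the detector.

```python
import numpy as np


def prefilter_by_confidence(boxes, scores, score_threshold):
    """Drop candidate boxes whose confidence is below score_threshold."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    keep_mask = scores >= score_threshold
    return boxes[keep_mask], scores[keep_mask]


# Hypothetical input: 10,000 random candidates with (x, y, h, w) and a score.
rng = np.random.default_rng(0)
boxes = rng.uniform(0, 100, size=(10_000, 4))
scores = rng.uniform(0, 1, size=10_000)

filtered_boxes, filtered_scores = prefilter_by_confidence(boxes, scores, 0.9)
# Only the surviving candidates would then be passed on to NMS.
```

Because the pairwise IoU work in NMS grows quickly with the number of candidates, removing low-confidence boxes first reduces the cost of the later step.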


Source Code Please visit

Reference

機器/深度學習: 物件偵測 Non-Maximum Suppression (NMS)
非极大值抑制(Non-Maximum Suppression)
深度学习目标检测Object Detection基本知识概念