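# [Paper Reading] State-of-the-art review and benchmarking of barcode localization methods

## Author and institution

Enrico Vezzali, Federico Bolelli, Stefano Santi, and Costantino Grana of the **AImageLab** laboratory at the University of Modena and Reggio Emilia (Unimore), Italy.

- [Paper](https://www.sciencedirect.com/science/article/pii/S0952197625002593)
- [Code](https://github.com/Henvezz95/BarBeR)

## Introduction

### Barcode

A barcode is a visual data representation that can carry a large amount of information in a pattern of bars or blocks meant to be read by a machine. This is why barcodes are widely used in retail, warehouse management, and supply chains.

A barcode can record details such as where and when an item was produced, which improves management efficiency and accuracy. Barcodes are everywhere in daily life: wherever goods are sold, such as convenience stores and shopping malls, there is a barcode.

The paper gives an overview of traditional image-processing algorithms and deep-learning algorithms, and compares their speed and accuracy experimentally.

### Barcode Type

Broadly, barcodes fall into two families: 1D and 2D. A 1D barcode encodes data in bars of varying width, so the amount of data it can hold is limited. 2D barcodes were developed to overcome this limit: they encode information along both the horizontal and the vertical direction and therefore carry more data. Both families have given rise to many different encoding standards.

### Process of Barcode

Barcode processing consists of two steps:

1. Localization: find the position of the barcode in the image, expressed as a (rotated) bounding box or a polygon
2. Decode: analyze the barcode content and decode it according to the encoding rules

Decoding is usually delegated to a third-party library such as **ZXing** or **ZBar**. The paper does not focus on decoding; it focuses on localization.

1D barcodes are characterized by high-contrast, parallel lines. Methods that exploit this usually combine **edge detection** with an **aggregation stage** that groups lines sharing the same orientation.

2D barcodes are characterized by two sets of mutually perpendicular lines, which can be found with the **Hough transform**.

### Challenge

1. Reproducibility: most studies do not release their test code. Those that do use different frameworks and programming languages, and the algorithms, datasets, and data differ from one study to the next
2. Metrics Consistency: the reported metrics are inconsistent. Some works use Jaccard's index, others the Dice Similarity Coefficient, and so on

### Key Contribution

1. A review of barcode localization algorithms
2. A public test dataset of 8,748 images
3. BarBeR (Barcode Benchmark Repository), which makes it possible to test different barcode detection algorithms, whether traditional or deep-learning based, and provides the corresponding evaluation metrics

## Algorithm History

The traditional image-processing algorithms each have their own strengths and weaknesses, and they do not all target the same task: they differ in whether they handle 1D or 2D barcodes, whether they can detect multiple barcodes, and whether they are robust to rotation.

Table 2 compares the traditional detection algorithms.

![image](https://hackmd.io/_uploads/S1c1Cugblx.png)

### Sörös, G., Flörkemeier, C., 2013.

From the paper *Blur-resistant joint 1D and 2D barcode localization for smartphones*. The method detects both 1D and 2D barcodes, but it cannot detect multiple barcodes or estimate a barcode's rotation angle.

#### Pipeline

1. Sobel Filter
    - $I_x$ : Horizontal derivatives of image
    - $I_y$ : Vertical derivatives of image
2. Compute the structure matrix

    $$
    M = \begin{bmatrix} C_{xx} & C_{xy} \\ C_{xy} & C_{yy} \end{bmatrix}
    $$

    - $C_{xx} = I_x^2$
    - $C_{xy} = I_xI_y$
    - $C_{yy} = I_y^2$
3. Generate heat maps

    $$
    m_1 = \frac{(C_{xx} - C_{yy})^2 + 4C_{xy}^2}{(C_{xx} + C_{yy})^2 + \epsilon}
    $$

    $$
    m_2 = \frac{4 \left( C_{xx} C_{yy} - C_{xy}^2 \right)}{(C_{xx} + C_{yy})^2 + \epsilon}
    $$

    - $m_1$ : Feature of Edge and Bars
    - $m_2$ : Feature of Corner
4. Patch image

    $$
    C_{ij} = \sum_{(x, y) \in D} w(x, y) \, m_{x,y}
    $$

## Evaluation metrics

### Metrics that do not require confidence

The metrics in this section do not require a predicted confidence score.

#### Intersection over union

IoU measures the overlap between the predicted box and the ground-truth box as the ratio of their intersection area to their union area. It ranges from $0$ to $1$; the closer it is to $1$, the larger the overlap between the two boxes.

![image](https://hackmd.io/_uploads/Hk8AhgBsxx.png)

#### Precision and Recall

* True Positive $(\text{TP})$ : A correct detection matching a ground-truth object;
* False Positive $(\text{FP})$ : An incorrect detection in an empty area or a misplaced detection of an existing object;
* False Negative $(\text{FN})$ : An undetected ground-truth object.

Given a dataset with $G$ ground-truth labels and $N$ predicted boxes, let $S$ be the number of correct predictions, so that $S \leq G$.

![image](https://hackmd.io/_uploads/H1DWAgrjlg.png)

#### F1 Score

## Table

1. Table 1 : summarizes the currently available barcode datasets

    ![image](https://hackmd.io/_uploads/HyLIDxBilg.png)
2. Table 12 : compares the execution time of traditional and deep-learning algorithms on different platforms

    ![image](https://hackmd.io/_uploads/BJZXsgSsll.png)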
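
To make the metrics from the "Evaluation metrics" section concrete, here is a minimal pure-Python sketch of IoU, precision, recall, and the standard F1 score $2PR/(P+R)$. Several things here are assumptions for illustration and are not taken from the paper or from BarBeR: boxes are axis-aligned `(x1, y1, x2, y2)` tuples, a prediction counts as correct when its IoU with a still-unmatched ground-truth box is at least `thr` (0.5 here), and matching is greedy in the order the predictions are given. The function names `iou` and `evaluate` are hypothetical.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(preds, gts, thr=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes.

    Assumption: a prediction is a true positive if its best IoU with an
    unmatched ground-truth box is >= thr. Returns a dict with TP, FP, FN,
    precision (S / N), recall (S / G), and F1.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(gts) - tp
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}
```

Because each ground-truth box can be claimed by at most one prediction, the number of correct predictions $S$ can never exceed $G$, which matches the constraint stated in the Precision and Recall section.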
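
The Sörös and Flörkemeier pipeline summarized under "Algorithm History" can also be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the authors' implementation: the structure-matrix entries are summed over a square window $D$ with uniform weights $w = 1$ *before* $m_1$ and $m_2$ are evaluated, and the names `sobel`, `window_sum`, and `heatmaps` as well as the window radius `r` are hypothetical. The windowing order matters: with raw per-pixel products, $C_{xx}C_{yy} = C_{xy}^2$, so $m_2$ would be $0$ and $m_1$ would be approximately $1$ wherever the gradient is non-zero.

```python
def sobel(img):
    """Return (Ix, Iy) 3x3 Sobel derivatives; border pixels stay at 0."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                        - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            iy[y][x] = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                        - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
    return ix, iy


def window_sum(a, b, y, x, r):
    """Sum a*b over the (2r+1)x(2r+1) window centred at (y, x), weights = 1."""
    return sum(a[j][i] * b[j][i]
               for j in range(y - r, y + r + 1)
               for i in range(x - r, x + r + 1))


def heatmaps(img, r=1, eps=1e-9):
    """Return (m1, m2): the edge/bar map and the corner map."""
    h, w = len(img), len(img[0])
    ix, iy = sobel(img)
    m1 = [[0.0] * w for _ in range(h)]
    m2 = [[0.0] * w for _ in range(h)]
    for y in range(r, h - r):
        for x in range(r, w - r):
            cxx = window_sum(ix, ix, y, x, r)
            cyy = window_sum(iy, iy, y, x, r)
            cxy = window_sum(ix, iy, y, x, r)
            den = (cxx + cyy) ** 2 + eps
            m1[y][x] = ((cxx - cyy) ** 2 + 4 * cxy ** 2) / den
            m2[y][x] = 4 * (cxx * cyy - cxy ** 2) / den
    return m1, m2
```

On a patch of vertical stripes, the gradient points in a single direction, so $m_1$ is close to $1$ and $m_2$ is close to $0$. This is the behaviour expected for the bars of a 1D barcode.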