
# Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

###### tags: `Paper reading`

## Paper Link

#### [Click here](https://arxiv.org/pdf/1506.01497.pdf)

---

## Abstract & Introduction

### Comparison with other detectors

![](https://i.imgur.com/DJv9U0R.png)

### Introduce a Region Proposal Network (RPN)

> Fast R-CNN has already reduced the running time of the detection network, which leaves region proposal computation as the main bottleneck.

* Shares full-image convolutional features with the detection network
  > As a result, region proposals cost almost no extra computation.
* Simultaneously predicts object bounds and objectness scores at each position
  > Trained end-to-end
* Merges RPN and Fast R-CNN into a single network
    * by sharing their convolutional features; in "attention" terms, the RPN tells the detection network where to look
* Introduces "anchor" boxes so that region proposals can cover a wide range of scales and aspect ratios
  > ![](https://i.imgur.com/wkCOkBV.png)

### Experiment Result

* Frame rate of 5 fps (including all steps) on a GPU with the VGG-16 model
* Achieves state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets
  > with only 300 proposals per image

---

## Method

### Architecture

![](https://i.imgur.com/Ny9iJGS.png)

### Region Proposal Networks

![](https://i.imgur.com/3pwpRxi.png)

* Input: an image (of any size)
* Output: a set of object proposals, each with an objectness score

#### Anchors

* `reg` layer: 4k outputs, the x, y, w, h of the bounding box for each of the k anchors
* `cls` layer: 2k outputs, the probability that each of the k proposals is an object or not an object
* Anchors are centered on the sliding window and use 3 scales (box areas of 128², 256², and 512² pixels) and 3 aspect ratios (1:1, 1:2, 2:1, written as 0.5, 1, 2 in code), so k = 9 at each position.
  > The anchor sizes are derived from the configured base size (= 16): multiplying it by the scale factors 8, 16, and 32 gives side lengths of 128, 256, and 512 pixels.
  > ![](https://i.imgur.com/oJ2nrqc.png)
  > ![](https://i.imgur.com/f5TN0VX.png)

## Loss Function

![](https://i.imgur.com/lEXW1WX.png)

---

## Reference

* [Faster R-CNN code walkthrough (in Chinese)](https://zhuanlan.zhihu.com/p/61221686)
* [Faster R-CNN paper explained with illustrations (video, in Chinese)](https://www.bilibili.com/video/BV13W411K7jM?p=6)
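
## Appendix: anchor generation sketch

The Anchors section above describes k = 9 anchors per sliding-window position, built from a base size of 16. The following is a minimal NumPy sketch of that idea, not the reference implementation. The function names `base_anchors` and `grid_anchors`, the feature stride of 16, the 38 x 50 feature map, and the choice to center each anchor on the middle of its feature-map cell are all assumptions made for illustration; the original code also rounds coordinates, which this sketch skips.

```python
import numpy as np


def base_anchors(base_size=16, ratios=(0.5, 1.0, 2.0), scales=(8, 16, 32)):
    """Return (len(ratios) * len(scales), 4) boxes as (x1, y1, x2, y2), centered on the origin."""
    anchors = []
    for ratio in ratios:              # ratio is interpreted as height / width
        for scale in scales:
            side = base_size * scale  # side of the square box with the same area
            w = side / np.sqrt(ratio)
            h = side * np.sqrt(ratio)
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)


def grid_anchors(feat_h, feat_w, stride=16, **kwargs):
    """Place the base anchors at every feature-map position (stride is an assumption)."""
    base = base_anchors(**kwargs)                    # (k, 4)
    xs = (np.arange(feat_w) + 0.5) * stride          # cell centers in image pixels
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)                     # each (feat_h, feat_w)
    shifts = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (shifts + base[None, :, :]).reshape(-1, 4)  # (feat_h * feat_w * k, 4)


k = base_anchors().shape[0]
anchors = grid_anchors(feat_h=38, feat_w=50)  # hypothetical feature-map size
print(k, anchors.shape)  # 9 (17100, 4)
```

With k = 9, the `cls` layer would emit 2k = 18 scores and the `reg` layer 4k = 36 coordinates at each position, which matches the output sizes listed in the Anchors section.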
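
## Appendix: RPN loss in LaTeX

The Loss Function section above shows the loss only as an image. Assuming that image is the multi-task RPN loss from the paper, it can be written out as follows, using the paper's notation.

$$
L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)
$$

Here $i$ indexes the anchors in a mini-batch, $p_i$ is the predicted probability that anchor $i$ is an object, and $p_i^*$ is its ground-truth label (1 for a positive anchor, 0 for a negative one). $t_i$ and $t_i^*$ are the predicted and ground-truth parameterized box coordinates. $L_{cls}$ is the log loss over the two classes (object vs. not object) and $L_{reg}$ is the smooth L1 loss. The factor $p_i^*$ means the regression term is active only for positive anchors, and $\lambda$ balances the two terms.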