# Object Detection Survey 2019-2021
###### tags: `Survey`

## [MULTI-VIEW FRUSTUM POINTNET FOR OBJECT DETECTION IN AUTONOMOUS DRIVING](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803572)
###### tags: `ICIP 2019`

### Abstract
* Object detection in autonomous driving.
* Proposes multi-view F-PointNet (MVFP) to improve the F-PointNet-based 3D object detection method.

### Method
![](https://i.imgur.com/SQJDx2k.png)
* F-PointNet for 3D object detection
* BEV object detection
    segmentation in BEV maps
    `segmentation in BEV maps is much more natural and easier than that in images`
* MVFP is proposed to inject missed detections back into the F-PointNet procedure

### Conclusion
* Based on adding bird’s eye view detection to F-PointNet

---

## [CONTEXT-ANCHORS FOR HYBRID RESOLUTION FACE DETECTION](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803548)
###### tags: `ICIP 2019`

### Abstract
* Face detection (small faces)
* The paper argues that the range from which anchors are chosen matters a great deal for detection
* Proposes a face detection model **CAHR** (context-anchors for hybrid-resolution model)

### Method
![](https://i.imgur.com/jygCB5c.png)
```
CAHR = Context-anchors + Hybrid Resolution
```
* Anchor box
* Hybrid Resolution Model
    multi-scale face detection model
* Context-anchors (define anchor)
    * Ratio of anchor
    * Training Label Clustering
        * 25 centers are chosen; anchor heights and widths are obtained by K-means on the training set
* The model can be applied to **multi-scale** and **multi-pose** face detection

### Conclusion
* The proposed method is highly practical
* It can be flexibly applied to other anchor-based object detection models

---

```
## [A CONVLSTM-COMBINED HIERARCHICAL ATTENTION NETWORK FOR SALIENCY DETECTION](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9190788)
###### tags: `ICIP 2020`
### Abstract
### Method
### Conclusion
```

---

## [ANOMALOUS MOTION DETECTION ON HIGHWAY USING DEEP LEARNING](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9190697)
###### tags: `ICIP 2020`

### Abstract
* Proposes a new dataset, Highway Traffic Anomaly (HTA)
* Addresses the problem of detecting anomalous traffic patterns on highways from dash cam videos

### Method
* Highway Traffic Anomaly Dataset
    * curated from the Berkeley DeepDrive dataset
    * 1280×720, 30 FPS dash cam videos
    * only highway driving videos
* conditional GAN (CGAN)
    ![](https://i.imgur.com/OJaY0yR.png)
    predicts the optical flow between a pair of sequential frames
* Predictive Coding Network (PredNet)
    PredNet can learn sequential data such as videos

---

```
## [Boxy Vehicle Detection in Large Images](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9022257)
###### tags: `ICCVW 2019`
### Abstract
* present the Boxy dataset for image-based vehicle detection.
`is one of the largest public vehicle detection datasets`
```

---

## [RRPN: RADAR REGION PROPOSAL NETWORK FOR OBJECT DETECTION IN AUTONOMOUS VEHICLES](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803392)
###### tags: `ICIP 2019`

### Abstract
* RRPN generates object proposals by mapping radar detections into the image coordinate system and generating predefined anchor boxes for each mapped radar detection point. These anchors are then transformed and scaled according to the object's distance from the vehicle, to give more accurate proposals for the detected objects.
    `region proposal is slow and not suited to real-time applications (e.g. autonomous vehicles)`

### Method
* The proposed RRPN steps:
    * perspective transformation
        * ROI generation starts by mapping the radar detections from the vehicle coordinates to the camera-view coordinates
    * anchor generation
* Faster RCNN

---

## [3D OBJECT DETECTION FOR AUTONOMOUS DRIVING USING TEMPORAL LIDAR DATA](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9191134)
###### tags: `ICIP 2020`

### Abstract
* 3D object detection is a fundamental problem in autonomous driving, and **pedestrians** are among the most important objects.
* Modifies PointPillars `an object detection method` into a recurrent network

### Method
![](https://i.imgur.com/vfDip3K.png)
* PointPillars
    generates a 2D pseudo-image of the 3D space around the ego vehicle and uses this representation to produce 3D bounding boxes for the objects in the scene.
* Recurrent PointPillars

---

## [Oil Tank Detection Based on Linear Clustering Saliency Analysis for Synthetic Aperture Radar Images](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803347)
###### tags: `ICIP 2019`

### Abstract
* Oil tank detection in SAR images
* Presents a linear clustering saliency analysis based detection model.
    1. First, a linear iterative clustering whose feature vector consists of three texture features and 2-D coordinates is used to over-segment the input image.
    2. At the same time, multi-intensity saliency maps constrain the shape of the superpixels using an adaptive balance weight.
    3. Second, feature vectors generated from the cluster centers are scattered as far as possible via Principal Component Analysis and sent to the MeanShift model to coarsely locate the candidate area.
    4. Finally, strong scattered points on the roof of tanks are utilized to locate the top of the targets.

    `**I could not follow this part**`

### Method
![](https://i.imgur.com/vJCWH3r.png)
* Pre-processing
    enhanced directional smoothing (EDS) model
* Saliency analysis based iterative clustering
    an image over-segmentation tool that helps reduce computational complexity
    * texture
        `because SAR images represent the radiative reflection of ground objects, the difficulty is the lack of color information`
        three texture saliency maps are chosen
        1. The first two maps are calculated using the **grayscale co-occurrence matrix** `a matrix that describes the spatial correlation of pixels`
        2. The third map is calculated pixel by pixel

### Conclusion
Not read to the end (not of interest).

---

## [Railcar Detection, Identification and Tracking for Rail Yard Management](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9190763)
###### tags: `ICIP 2020`

### Abstract
* Video analytics system combining
    1. railcar detection
    2. classification
    3. text identification
    4. logo detection
* Developed for automated yard inventory checks, so that the arrival, departure, and movement of individual railcars within the facility can be monitored and managed automatically.
* Dataset collected from a real-world locomotive yard in the US

### Method
![](https://i.imgur.com/xqaWugX.png)

### Conclusion
Presents a video analytics system combining railcar detection, type classification, FRA ID text identification, and logo detection into a system for locomotive transportation and yard management.

---

## [Fast and Efficient Model for Real-Time Tiger Detection In The Wild](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9022313)
###### tags: `ICCVW 2019`

### Abstract
* Presents **TigerNet** `FPN-based network architecture`
* Used for Amur tiger detection in the wild
* Introduces two-stage semi-supervised learning via pseudo-labelling
* Dataset : footage from time-synchronized surveillance cameras in multiple wild zoos, captured in unconstrained settings

### Method
* Architecture
    * FPN `for object detection`
    * FD-MobileNet `backbone network`
* Semi-Supervised Learning using Pseudo-Labels
    * Because the Amur tiger population is small, it is difficult to create a large-scale dataset covering varied conditions (time of day, weather, zones, scales, etc.)
    * The unlabeled data is pseudo-labelled

### Conclusion
This paper introduces a new efficient neural network architecture and training pipeline for object detection.

---

```
## [MULTI-DOMAIN ATTENTIVE DETECTION NETWORK](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803206)
###### tags: `ICIP 2019`
### Abstract
* present an object detection method called the **multi-domain attentive detection network** (MDADN)
* adds attention modules to each layer (channel-wise & spatial-wise)
### Method
### Conclusion
---
```

## [An Adaptive Fitting Approach for the Visual Detection and Counting of Small Circular Objects in Manufacturing Applications](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803361)
###### tags: `ICIP 2019`

### Abstract
* Proposes a two-stage **circle detection** method
* dataset : not stated

### Method
* two stages
    * Bottom-up circle detection (first stage) : the detector runs a multi-scale scan over the input image in sliding-window fashion
        * A pyramid structure
        * SIFT descriptor as the sketch feature descriptor `?`
        * POAG descriptor as the graph pattern descriptor `?`
    * Top-down circle fitting (second stage) : a hierarchical Bayesian model is used to obtain the precise location and scale of the small circular objects

---

```
LSTM, not very interested
## [Modeling Long- and Short-Term Temporal Context for Video Object Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8802920)
###### tags: `ICIP 2019`
### Abstract
### Method
### Conclusion
---
## [PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9156683)
###### tags: `2020 CVPR`
### Abstract
* propose a single-stage Human-Object Interaction (HOI) detection method
`HOI is defined as a point triplet < human point, interaction point, object point>`
* It is the first real-time HOI detection method
### Method
### Conclusion
---
```

---

## [Human Detection In Dense Scene Of Classrooms](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9191136)
###### tags: `ICIP 2020`

### Abstract
* Solves occlusion
* Proposes a method named Dense Occlusion Object Detection network

### Method
* Dense Anchor Generation Model (DAG)
    generates the anchor boxes for proposals
* Discriminative Part Selection Model (DPS)
    splits each proposal into several smaller parts and processes them separately, which alleviates the occlusion problem

---

## [Attentive Layer Separation for Object Classification and Object Localization in Object Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803439)
###### tags: `ICIP 2019`

### Abstract
* Current problem
    Object classification focuses on the most discriminative part of an object in order to predict the correct class. Object localization focuses on the whole object region so that it can draw a bounding box containing the entire object.
* The proposed deep learning-based network mainly consists of two parts (classification + localization):
    1. Attention network part where task-specific attention maps are generated
    2. Layer separation part where layers for estimating two tasks are separated.
* Data : COCO

### Method
`the paper does not spell out the method`
![](https://i.imgur.com/9AHAQNV.png)
* Attention network
* Layer separation `classification and localization are handled separately`

---

## [Infrared Target Detection Using Intensity Saliency And Self-Attention](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9191055)
###### tags: `ICIP 2020`

### Abstract
* Current problem
    IR images share common infrared characteristics such as poor texture information, low resolution, and high noise.
* Proposes a backbone network named **Deep-IRTarget**
* dataset : grey ImageNet 1000-class dataset

### Method
![](https://i.imgur.com/W3PWK1y.png)
* Infrared Intensity Saliency Map
    Gaussian filter
* Triple Self-Attention Module
    channel-wise attention
* Detection
    Faster RCNN
    Region Proposal Network (RPN)

---

## [Context-Aware Hierarchical Feature Attention Network For Multi-Scale Object Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9190896)
###### tags: `ICIP 2020`

### Abstract
* Current problem
    Current detectors merely fuse the pyramid features extracted from the conv layers, which neither fully exploits the useful features nor discards the redundant ones.
* Proposes the Context-Aware Hierarchical Feature Attention Network (CHFANet), which focuses on multi-scale feature extraction for object detection.
* dataset : Pascal VOC

### Method
![](https://i.imgur.com/01dwYfT.png)
* Context-Aware Feature Extraction Module
    * Context information is very important for multi-scale object detection. Common models learn it by using the last feature map or by simply concatenating multiple feature layers, which cannot extract context information effectively.
    * propose a CFE module
    * CFE module takes the feature maps conv4_3, conv5_3 in VGG-16 and conv6_2, conv7_2 respectively as input
* Hierarchical Feature Fusion Module
    * SSD does not make full use of the feature maps from different levels. `because features from deep layers typically contain rich semantic information, which are suitable for object classification.`
    * enrich features by fusing different level features

---

## [Attention-Enhanced And More Balanced R-CNN For Object Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9191309)
###### tags: `ICIP 2020`

### Abstract
* Current problem
    Many attention mechanisms cost too much computation
* The paper integrates a lightweight attention mechanism (one containing a residual module) into the object detection backbone
* The non-local attention module is replaced with a criss-cross attention module to reduce computation and improve performance.
* To solve the imbalance problem`?` at the region sample level, a cascade region proposal network (RPN) module is used to obtain anchors of higher quality
* dataset : MS COCO

### Method
![](https://i.imgur.com/C2Qw3Vj.png)
* Backbone
    R-CNN
* TAM and DCM
    TAM and DCM together form the residual module, used to enhance attention.
* Cascade RPN
    used to obtain high-quality positive samples, giving a more balanced set of samples.
* Criss-cross attention module
    used to reduce the computation and GPU memory usage of BFP.

---

```
## [On Generalizing Detection Models for Unconstrained Environments](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9022160)
###### tags: `ICCVW 2019`
### Abstract
### Method
### Conclusion
---
## [Matrix Nets: A New Deep Architecture for Object Detection ](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9022243)
###### tags: `ICCVW 2019`
### Abstract
### Method
### Conclusion
---
```

## [Noisy Localization Annotation Refinement For Object Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9190728)
###### tags: `ICIP 2020`

### Abstract
* Current problem
    * Crowdsourcing is commonly used to create datasets, so these datasets contain incorrect annotations, such as imprecise bounding box locations
    * Methods such as RCNN and YOLO need to be trained on datasets with precise annotations
    * Datasets with precise annotations `ex` MSCOCO and ImageNet
        `Noise : incorrect class labels, inappropriate localization, missing objects, and false positive annotations.`
* In this paper
    Highlights the problem of object detection with noisy bounding box annotations and shows that these noisy annotations harm the performance of deep neural networks.
* Dataset
    PASCAL VOC2007

### Method
- Noise is added to the training data to manually create several noisy datasets.
- framework : based on the joint optimization framework

### Conclusion
* Application scenarios:
    * night-time scenes
    * crowded scenes

---

## [Multi-Resolution Generative Adversarial Networks for Tiny-Scale Pedestrian Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803030)
###### tags: `ICIP 2019`

### Abstract
- Current problem
    - Existing CNNs lose the features of small-scale objects
    - Pedestrian detection
- Proposes a Multi-Resolution Generative Adversarial Network (MRGAN)
    `generating a high-resolution pedestrian image from low-resolution image.`
- The key idea : explore the relationship between high resolution and low resolution
- Dataset : KITTI

### Method
![](https://i.imgur.com/PzsUVDe.png)
- Classifier : VGG19

### Conclusion
* Application scenarios:
    * tiny object detection

---

## [BEYOND BOUNDING BOX: FINE-GRAINED VEHICLE DETECTION VIA SINGLE STAGE DETECTOR WITH HIERARCHICAL OUTPUT](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8803429)
###### tags: `ICIP 2019`

### Abstract
- Current problem
    - Traditional bbox approaches are too coarse to handle challenging cases such as vehicle pose variation
- The fine-grained output can distinguish each face of a vehicle and accurately localize the faces' boundaries
- Vehicle detection is a key module of camera-based forward collision warning, typically used to compute distance and time to collision
- Proposes a vehicle detector with a fine-grained output representation
- dataset : collected by the authors

### Method
![](https://i.imgur.com/bCVnaXr.png)
- Networks based on
    - SSD : output over nine vehicle poses plus the background
    - RefineDet
- Detection Network with Hierarchical Output
    - Level 1 outputs the vehicle confidence & bounding box
    - Level 2 predicts three face confidences
    - Level 3 outputs nine pose confidences and 3 control points

### Conclusion
- Application scenarios:

---

```
## [](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9190697)
###### tags: ``
### Abstract
### Method
### Conclusion
---
```
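
Note on the hierarchical output in the BEYOND BOUNDING BOX entry above: the three output levels can be pictured as one flat per-anchor vector that is split level by level. The sketch below is only an illustration under assumptions. The field order, the single scalar per control point, and the names `HierarchicalOutput` and `split_prediction` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Sizes follow the notes above (bbox, 3 faces, 9 poses, 3 control points).
# The ordering and the one-value-per-control-point encoding are assumptions.
N_BOX, N_FACES, N_POSES, N_CTRL = 4, 3, 9, 3
VECTOR_LEN = 1 + N_BOX + N_FACES + N_POSES + N_CTRL


@dataclass
class HierarchicalOutput:
    vehicle_conf: float          # level 1
    bbox: List[float]            # level 1
    face_confs: List[float]      # level 2
    pose_confs: List[float]      # level 3
    control_points: List[float]  # level 3

    def best_pose(self) -> int:
        """Return the index of the highest-scoring pose."""
        return max(range(len(self.pose_confs)), key=lambda k: self.pose_confs[k])


def split_prediction(vec: Sequence[float]) -> HierarchicalOutput:
    """Split one flat prediction vector into the three output levels."""
    values = list(vec)
    if len(values) != VECTOR_LEN:
        raise ValueError(f"expected {VECTOR_LEN} values, got {len(values)}")
    chunks, start = [], 0
    for size in (1, N_BOX, N_FACES, N_POSES, N_CTRL):
        chunks.append(values[start:start + size])
        start += size
    conf, bbox, faces, poses, ctrl = chunks
    return HierarchicalOutput(conf[0], bbox, faces, poses, ctrl)


if __name__ == "__main__":
    poses = [0.0] * N_POSES
    poses[4] = 0.8  # made-up scores, for demonstration only
    vec = [0.9, 10, 20, 110, 220, 0.1, 0.7, 0.2] + poses + [15, 60, 100]
    out = split_prediction(vec)
    print(out.vehicle_conf, out.best_pose(), out.control_points)
```

Keeping the levels as separate fields makes it easy to inspect the coarse result (vehicle or not, box) before looking at the finer face and pose predictions.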