# Scale Match for Tiny Person Detection
###### tags: `Paper reading` `2020`

## Paper Link
#### [Click here](https://arxiv.org/pdf/1912.10664.pdf)

---

## Abstract & Introduction
* Detection of tiny persons `less than 20 pixels` in large-scale images remains under-investigated
* Extremely small objects make feature representation difficult, while the massive and complex backgrounds aggravate the risk of false alarms
* To detect tiny persons, the authors propose a simple yet effective approach named Scale Match
* **Main contributions**
    * Propose the Scale Match approach
    * Improve detection performance over the state-of-the-art detector (FPN) by a significant margin (5%)

---

## Tiny Person Benchmark
### Absolute size and Relative size
![Absolute size and Relative size](https://i.imgur.com/UXrDkzu.png)
* $G_{ij} = (x_{ij}, y_{ij}, w_{ij}, h_{ij})$ = bounding box of the $j$-th object in the $i$-th image $I_i$ of the dataset
* $x_{ij}$, $y_{ij}$ = coordinates of the top-left point
* $w_{ij}$, $h_{ij}$ = width and height of the bounding box
* $W_i$, $H_i$ = width and height of $I_i$

![T1](https://i.imgur.com/aJUVLK8.png)

The authors compare TinyPerson with the COCO, WIDER FACE, and CityPersons datasets. The numbers are given in Table 1, which shows that the objects in TinyPerson are indeed very small in relative terms.

### Benchmark description
#### Dataset Collection
* Collected from the Internet
    1. Collect high-resolution videos from different websites
    2. Sample one image from the video every 50 frames
    3. Delete images with a certain degree of repetition
    4. Annotate 72651 objects with bounding boxes by hand

#### Dataset Properties
* The persons in TinyPerson are quite tiny compared with other representative datasets
* The aspect ratio of persons in TinyPerson has a large variance
    > The varied poses and viewpoints of persons in TinyPerson add diversity, which **makes detection more difficult**
* The dataset mainly focuses on persons around the seaside
* TinyPerson contains many images with dense objects (more than 200 persons per image)

#### Annotation rules
* sea person
    * Persons on a boat
    * Persons lying in the water
    * Persons with more than half of the body in water
* earth person
    * others
* ignore
    * Crowds `We can recognize them as persons, but the crowd is hard to separate one by one when labeled with standard rectangles`
    * Ambiguous regions `It is hard to clearly tell whether there is one person or more`
    * Reflections in Water `Some objects are hard to recognize as human beings; we directly label them as "uncertain"`

#### Evaluation
* Use both AP (average precision) and [MR (miss rate)](https://www.twblogs.net/a/5b7f5d252b717767c6af31c8) for performance evaluation

    $MR = FNR = \frac{FN}{P} = \frac{FN}{FN+TP} = 1 - TPR$
* Size range is divided into 3 intervals:
    * tiny[2,20]
        * tiny1[2,8]
        * tiny2[8,12]
        * tiny3[12,20]
    * small[20,32]
    * all[2,inf]
* IOU threshold = 0.5
* For ignore regions, IOD (intersection over detection) is used in evaluation `change IOU to IOD`

### Dataset Challenges
#### Tiny absolute size
To quantify the effect of absolute size reduction on performance:
1. Down-sample CityPersons by $4\times4$ to construct tiny CityPersons `objects' absolute size is the same as that of TinyPerson`
2. Train a detector on CityPersons and on tiny CityPersons, respectively

![T4](https://i.imgur.com/pcIanRH.png)
> Table 4 shows that tiny object size is a great challenge for detection: performance drops significantly as the objects become tiny.

#### Tiny relative size
The absolute size is the same, but TinyPerson images are long-range views, so the relative size in TinyPerson is smaller than in CityPersons.

To better quantify the effect of the tiny relative size:
1. Obtain two new datasets, $3\times3$ tiny CityPersons and $3\times3$ TinyPerson `by directly 3*3 up-sampling tiny CityPersons and TinyPerson, respectively`

![T3](https://i.imgur.com/55wTKau.png)

---

## Method
![Psize](https://i.imgur.com/95rE9El.png)
* $X$ = Dataset
* $s$ = Objects' size
* $P_{size}(s;X)$ = probability density function of objects' size $s$ in $X$
* $T$ = scale transform = Scale Match
* $E$ = extra dataset = MS COCO
* $D$ = targeted dataset = TinyPerson

### Scale Match
#### Architecture
![The framework of Scale Match for detection](https://i.imgur.com/L468cGQ.png)

#### Steps of Scale Match
Let $G_{ij}=(x_{ij},y_{ij},w_{ij},h_{ij})$ be the $j$-th object in the $i$-th image of dataset $E$.
1. Sample a size $\hat{s}$ from $P_{size}(s;D)$, the size distribution of the targeted dataset
2. Compute the scale ratio $c = \frac{\hat{s}}{AS(G_{ij})}$
3. Resize the object with scale ratio $c$: $\hat{G}_{ij} \leftarrow (x_{ij}\times c,\ y_{ij}\times c,\ w_{ij}\times c,\ h_{ij}\times c)$

#### Estimate $P_{size}(s;X)$

#### Algorithm
![Scale Match Algorithm](https://i.imgur.com/4KEQIFp.png =500x)
![Scale Match Algorithm_CH](https://i.imgur.com/diKvMWC.png =500x)

### Monotone Scale Match (MSM) for Detection

---

## Loss Function

---

## Experiments
![](https://i.imgur.com/K2HNW3P.png)
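
## Code sketch: steps of Scale Match

The three steps listed under "Steps of Scale Match" in the Method section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation:

* $AS(G_{ij})$ is taken to be $\sqrt{w_{ij} \cdot h_{ij}}$, as the paper defines it (this note shows the definition only as an image).
* $P_{size}$ is approximated with a plain normalised histogram of the target dataset's object sizes; the "Estimate $P_{size}$" section of this note is empty, so the bin count and sampling scheme are illustrative choices.
* The function names (`absolute_size`, `estimate_size_histogram`, `sample_size`, `scale_match_box`) are hypothetical.

```python
import numpy as np


def absolute_size(box):
    """AS(G) = sqrt(w * h) for a box (x, y, w, h). Assumed definition."""
    _, _, w, h = box
    return float(np.sqrt(w * h))


def estimate_size_histogram(target_sizes, bins=10):
    """Approximate P_size(s; D) by a normalised histogram (an assumption)."""
    counts, edges = np.histogram(target_sizes, bins=bins)
    probs = counts / counts.sum()
    return probs, edges


def sample_size(probs, edges, rng):
    """Step 1: pick a bin by its probability, then a uniform size inside it."""
    k = rng.choice(len(probs), p=probs)
    return float(rng.uniform(edges[k], edges[k + 1]))


def scale_match_box(box, probs, edges, rng):
    """Steps 2 and 3: compute c = s_hat / AS(G) and rescale the box by c."""
    s_hat = sample_size(probs, edges, rng)
    c = s_hat / absolute_size(box)
    x, y, w, h = box
    return (x * c, y * c, w * c, h * c), c


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Made-up object sizes standing in for the targeted dataset D.
    target_sizes = np.array([4.0, 6.0, 8.0, 10.0, 12.0, 16.0, 18.0])
    probs, edges = estimate_size_histogram(target_sizes, bins=4)
    # A made-up box standing in for one object of the extra dataset E.
    new_box, c = scale_match_box((100.0, 50.0, 80.0, 200.0), probs, edges, rng)
    print(new_box, c)
```

In a real pipeline the image would be resized by the same ratio $c$ so that pixels and boxes stay aligned; the sketch only rescales the box coordinates, as in step 3.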