---
tags: Human Face
---
# SSH (2017)
SSH: Single Stage Headless Face Detector
## Contribution
1. One-stage, anchor-based face detector
2. Scale-invariant network architecture (no image pyramid needed)
3. Headless network architecture (fully convolutional, fewer parameters)
## Network Architecture

1. Convs 1-1 to 5-3 are Blocks 1 to 5 of VGG16

2. Shape of the input tensor to each detection module
| Input image shape | M1 | M2 | M3 |
| -------- | -------- | -------- | -------- |
| (224, 224, 3) | (28, 28, 128) | (14, 14, 512) | (7, 7, 512) |
| (h, w, 3) | (h/8, w/8, 128) | (h/16, w/16, 512) | (h/32, w/32, 512) |
> The larger the spatial size of the input feature map, the smaller the faces that detection module targets.
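
The shapes in the table follow from the stride of each detection module's input (8, 16 and 32). A minimal sketch of that arithmetic, assuming the image height and width are multiples of 32; the function name `detection_input_shapes` and the `MODULES` dictionary are illustrative and not taken from the paper:

```python
# Stride and channel count of the tensor fed to each detection module,
# read off the table above.
MODULES = {"M1": (8, 128), "M2": (16, 512), "M3": (32, 512)}


def detection_input_shapes(h, w):
    """Return {module: (height, width, channels)} for an (h, w, 3) image.

    Assumes h and w are multiples of 32 so the integer division is exact.
    """
    return {
        name: (h // stride, w // stride, channels)
        for name, (stride, channels) in MODULES.items()
    }


print(detection_input_shapes(224, 224))
# {'M1': (28, 28, 128), 'M2': (14, 14, 512), 'M3': (7, 7, 512)}
```
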
3. The small-face branch uses FCN-style feature fusion to enrich its features
- The output feature map of one additional conv layer is used as extra input to the small-face detection branch
4. Detection and context module
- To enlarge the CNN's receptive field, 5x5 and 7x7 convs are added, similar to GoogLeNet
- To reduce memory consumption, two 3x3 convs replace the 5x5 conv and three 3x3 convs replace the 7x7 conv

- detection module

- context module

- mix

- GoogLeNet (Inception block)

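The 3x3 substitution in item 4 can be checked with simple arithmetic: stacked stride-1 3x3 convs cover the same receptive field as one larger kernel while using fewer weights. The sketch below assumes every conv keeps the same number of input and output channels and ignores biases; the channel width of 128 is a hypothetical value chosen only for illustration, and weight count is used here as a rough proxy for the memory saving.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stride-1 convs with the same kernel size."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    """Weight count (biases ignored) of stacked kernel x kernel convs
    that keep `channels` input and output channels."""
    return num_layers * kernel * kernel * channels * channels


channels = 128  # hypothetical channel width, for illustration only
for big_kernel, num_3x3 in [(5, 2), (7, 3)]:
    # Two 3x3 convs see a 5x5 window, three see a 7x7 window.
    assert stacked_receptive_field(num_3x3) == big_kernel
    single = conv_weights(big_kernel, channels)
    stacked = conv_weights(3, channels, num_3x3)
    print(f"{big_kernel}x{big_kernel} conv: {single} weights, "
          f"{num_3x3} stacked 3x3 convs: {stacked} weights")
```
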
5. Anchor design
- Only one anchor aspect ratio is used (1:1)
- Different anchor box sizes are used for different detection targets
| Detection module | Input feature map size | Anchor sizes |
| --------------- | ---------------------- | -------------- |
| M1 | (h/8, w/8, 128) | 16x16, 32x32 |
| M2 | (h/16, w/16, 512) | 64x64, 128x128 |
| M3 | (h/32, w/32, 512) | 256x256, 512x512 |
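
The anchor layout in the table can be sketched as follows: every feature-map cell gets one square anchor per size listed for its module. This is a minimal illustration, not the authors' code; the helper name `generate_anchors`, the `ANCHOR_CONFIG` dictionary, and the choice to centre anchors on the middle of each cell are assumptions.

```python
# Stride and square anchor sizes per detection module, from the table above.
ANCHOR_CONFIG = {
    "M1": (8, (16, 32)),
    "M2": (16, (64, 128)),
    "M3": (32, (256, 512)),
}


def generate_anchors(feat_h, feat_w, stride, sizes):
    """Return a list of (x1, y1, x2, y2) square anchors in image coordinates.

    One anchor per size is centred on every feature-map cell. Centring on
    the middle of the cell is an assumption of this sketch.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for size in sizes:
                half = size / 2.0
                anchors.append((cx - half, cy - half, cx + half, cy + half))
    return anchors


h, w = 224, 224
for name, (stride, sizes) in ANCHOR_CONFIG.items():
    anchors = generate_anchors(h // stride, w // stride, stride, sizes)
    print(name, len(anchors))
```

For a 224x224 input this yields two anchors per cell, so the finest module (M1) carries by far the most anchors.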
## Training
- positive and negative anchor setting
- IoU(gt, anchor) > 0.5 -> positive anchor
- IoU(gt, anchor) < 0.3 -> negative anchor
- otherwise -> ignored
- loss function
$$
\sum_{k}\frac{1}{N_k^c}\sum_{i\in A_{k}}l_{c}(p_{i}, g_{i})+\lambda\sum_{k}\frac{1}{N_k^r}\sum_{i\in A_{k}}I(g_{i}=1)l_{r}(b_{i}, t_{i})
$$
- $A_{k}$ represents the set of anchors defined in $M_{k}$
- $l_{c}$ is face classification loss (multinomial logistic loss)
- $l_{r}$ is bounding box regression loss (smooth L1 loss)
- $l_{r}$ is applied only to positive anchors
- OHEM (online hard example mining)
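
The anchor labelling rule and the loss of a single detection module $M_k$ can be sketched in plain Python as below. Several details are assumptions of this sketch rather than statements from the notes: $N_k^c$ is taken as the number of non-ignored anchors, $N_k^r$ as the number of positive anchors, the classification loss is written as a two-class cross-entropy on the predicted face probability, $\lambda$ defaults to 1, and OHEM is left out. All function names are illustrative.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thresh=0.5, neg_thresh=0.3):
    """Return 1 (positive), 0 (negative) or -1 (ignored) for one anchor."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return -1


def smooth_l1(pred, target):
    """Smooth L1 loss summed over the box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def module_loss(face_probs, labels, box_preds, box_targets, lam=1.0, eps=1e-12):
    """Classification + regression loss for the anchors of one module M_k.

    Ignored anchors (label -1) contribute to neither term; the regression
    term is computed on positive anchors only.
    """
    cls_idx = [i for i, g in enumerate(labels) if g != -1]
    pos_idx = [i for i, g in enumerate(labels) if g == 1]
    cls = 0.0
    for i in cls_idx:
        p = min(max(face_probs[i], eps), 1.0 - eps)  # avoid log(0)
        cls += -math.log(p if labels[i] == 1 else 1.0 - p)
    reg = sum(smooth_l1(box_preds[i], box_targets[i]) for i in pos_idx)
    cls_term = cls / len(cls_idx) if cls_idx else 0.0  # 1 / N_k^c (assumed)
    reg_term = reg / len(pos_idx) if pos_idx else 0.0  # 1 / N_k^r (assumed)
    return cls_term + lam * reg_term


gt = [(0.0, 0.0, 10.0, 10.0)]
anchors = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 25.0), (20.0, 20.0, 30.0, 30.0)]
labels = [label_anchor(a, gt) for a in anchors]
print(labels)  # [1, -1, 0]: IoU 1.0, IoU 0.4 (ignored), IoU 0.0
```

The total loss in the formula above would then be the sum of `module_loss` over the three detection modules.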
## Experiments
- Ablation studies
