# caffe SSD mobileNet people detection model - training
:::info
20190904
- install caffe ssd branch
- Generate training data (Pascal VOC --> LMDB)
- Generate prototxt
- Run training
:::
[toc]
---
This note uses people detection as the example.
The data has only one label class: "person".
## Install
github: [caffe-SSD](https://github.com/weiliu89/caffe/tree/ssd)
1. Clone the repository and switch to the ssd branch
```shell
$ git clone https://github.com/weiliu89/caffe.git
$ cd caffe
$ git checkout ssd
```
2. Build
Follow the steps on GitHub and run make.
Both `Makefile` and `Makefile.config` need some edits.
[[install issue1]](https://blog.csdn.net/u012841667/article/details/53316948) [[install issue2]](https://github.com/rbgirshick/fast-rcnn/issues/52)
[[make issue1]](https://www.cnblogs.com/houjun/p/9982968.html) [[make issue2]](https://blog.csdn.net/wuzuyu365/article/details/52430657)
- Note:
You need to build the newer [caffe-SSD](https://github.com/weiliu89/caffe/tree/ssd) for Caffe to recognize the SSD layers.
Otherwise you will hit errors about layers that cannot be found, for example:
`caffe.LayerParameter has no field named permute_param`
>If you just want to perform inference, I think adding about three new layers (prior_bbox_layer, permute_layer, detection_output_layer) to your caffe is enough.
## Generate LMDB dataset (from Pascal VOC format)
### ref.
[Train SSD on the Custom Dataset](https://github.com/Coldmooon/SSD-on-Custom-Dataset)
[Mobilenet-SSD的Caffe系列實現](https://blog.csdn.net/Jesse_Mx/article/details/78680055)
[chuanqi305/MobileNet-SSD](https://github.com/chuanqi305/MobileNet-SSD)
### Prepare the VOC-format dataset
Put the prepared Pascal VOC format data in `~/data/VOCdevkit` (otherwise you have to change the path in `create_list.sh`).
```shell
(caffe)wang@ai4:~/data/VOCdevkit/MYDATASET$ ls
Annotations ImageSets JPEGImages MYDATASET_person_64127.txt
```
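
Before generating the file lists, it may be worth checking that every image has a matching annotation. The sketch below is only an illustration: `find_unpaired` is a hypothetical helper (not part of caffe-SSD), and it assumes images end in `.jpg` and annotations in `.xml`.

```python
from pathlib import Path


def find_unpaired(dataset_dir):
    """Return (images without XML, XMLs without image) as sorted lists of file stems."""
    root = Path(dataset_dir)
    images = {p.stem for p in (root / "JPEGImages").glob("*.jpg")}
    annotations = {p.stem for p in (root / "Annotations").glob("*.xml")}
    return sorted(images - annotations), sorted(annotations - images)


# Assumed location, following the layout shown above; adjust to your own folder.
missing_xml, missing_jpg = find_unpaired(Path.home() / "data/VOCdevkit/MYDATASET")
print("images without annotation:", missing_xml)
print("annotations without image:", missing_jpg)
```

Two empty lists mean every image and annotation is paired.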
Copy the four related script and text files into the new folder (MYDATASET):
```shell
$ cd ~/caffe-SSD/data/
$ mkdir -p MYDATASET
$ cp VOC0712/* MYDATASET/.
```
The result looks like this:
```shell
(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ls
coco_voc_map.txt create_data.sh create_list.sh labelmap_voc.prototxt
```
### Modify `create_list.sh`
In the second loop, replace the keywords **VOC2007** and **VOC2012** with **"MYDATASET"** since we have only one dataset.
```shell
##for name in VOC2007 VOC2012
for name in MYDATASET
```
- Run the script to generate the file lists
```shell
(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ./create_list.sh
```
This generates `test_name_size.txt`, `test.txt`, and `trainval.txt` in `data/MYDATASET/`.
The result looks like this:
```shell
(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ls
coco_voc_map.txt create_list.sh test.txt trainval.txt
create_data.sh labelmap_voc.prototxt test_name_size.txt
```
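- Rename `labelmap_voc.prototxt` to `labelmap_MYDATASET.prototxt`, then edit it
```shell
$ mv data/MYDATASET/labelmap_voc.prototxt data/MYDATASET/labelmap_MYDATASET.prototxt
$ vim data/MYDATASET/labelmap_MYDATASET.prototxt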
```
In this file, the first block is the background class, so don't change it. For the remaining blocks, change the class names to match your dataset.
```python
### labelmap_MYDATASET.prototxt
item {
name: "none_of_the_above"
label: 0
display_name: "background"
}
item {
name: "person"
label: 1
display_name: "person"
}
```
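
For datasets with more classes, the labelmap could be generated instead of edited by hand. This is a minimal sketch that only reproduces the layout shown above; `make_labelmap` is a hypothetical helper, not part of caffe-SSD.

```python
def make_labelmap(class_names):
    """Build labelmap text: label 0 is the background, then one item per class."""
    entries = [("none_of_the_above", "background")]
    entries += [(name, name) for name in class_names]
    blocks = []
    for label, (name, display_name) in enumerate(entries):
        blocks.append(
            "item {\n"
            f'  name: "{name}"\n'
            f"  label: {label}\n"
            f'  display_name: "{display_name}"\n'
            "}\n"
        )
    return "".join(blocks)


# One foreground class, as in this note.
print(make_labelmap(["person"]))
```

The printed text can be saved as `labelmap_MYDATASET.prototxt`.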
### Modify `create_data.sh`
This script converts the dataset to LMDB databases.
Edit `~/caffe-SSD/data/MYDATASET/create_data.sh`:
```shell
# Change these two names
dataset_name="MYDATASET"
mapfile="$root_dir/data/$dataset_name/labelmap_MYDATASET.prototxt"
```
- Run `create_data.sh`
```shell
(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ./create_data.sh
```
This creates the LMDB databases under `~/data/VOCdevkit`
and makes soft links to them in `examples/MYDATASET/.`
--> On success, `~/data/VOCdevkit/MYDATASET` gains an `lmdb` folder,
and `~/caffe-SSD/examples/MYDATASET/` gains the soft links.
- issue: No module named caffe.proto
Reference: [SSD from caffe.proto import caffe_pb2 ImportError: No module named caffe.proto](https://blog.csdn.net/curious999/article/details/81225624)
Edit `~/caffe-ssd/scripts/create_annoset.py`
and add one line that puts the SSD `python` directory on the import path:
```python
import sys
sys.path.insert(0, "/home/xxx/caffe-ssd/python")  # added line: your caffe-SSD python directory
from caffe.proto import caffe_pb2
```
## Create lmdb symlinks
Create symlinks to the LMDB databases in the MobileNet-SSD directory.
```shell
$ cd ~/caffe-SSD/examples/MobileNet-SSD
$ ln -s /home/xxx/data/VOCdevkit/MYDATASET/lmdb/MYDATASET_trainval_lmdb trainval_lmdb
$ ln -s /home/xxx/data/VOCdevkit/MYDATASET/lmdb/MYDATASET_test_lmdb test_lmdb
```
## Generate training prototxt
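Reference: [chuanqi305/MobileNet-SSD](https://github.com/chuanqi305/MobileNet-SSD)
Work in `caffe-SSD/examples/MobileNet-SSD`.
The original VOC dataset has **21 classes** (20 + background), while ours has **2 classes** (1 + background).
So the training, testing, and deployment network files have to be regenerated.
Use `gen_model.sh` for this: it takes the templates in the `template` folder and generates the required files from the parameters we specify. Usage: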
```shell
$ cd ~/caffe-SSD/examples/MobileNet-SSD
$ ./gen_model.sh 2
## ./gen_model.sh [cls]
```
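After it runs, there is an `example` folder containing the generated prototxt files.
As the author set it up, the deploy file already has the BN layers merged, so it must be paired later with weights that have been merged the same way.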
```shell
$ ls ~/caffe-SSD/examples/MobileNet-SSD/example
MobileNetSSD_deploy.prototxt MobileNetSSD_test.prototxt MobileNetSSD_train.prototxt
```
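## Modify the solver prototxt settings
- Adjust the training and testing hyperparameters
In `~/caffe-SSD/examples/MobileNet-SSD`,
edit `solver_train.prototxt` and `solver_test.prototxt` to fit your setup.
`test_iter` = number of test images / batch size.
The initial learning rate should not be too high, or the pretrained base weights get damaged badly.
The optimizer is RMSProp, which may help convergence; don't switch to SGD, again to protect the weights.
Reference: [Mobilenet-SSD的Caffe系列實現](https://blog.csdn.net/Jesse_Mx/article/details/78680055)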
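
The `test_iter` rule above can be sketched in a few lines. The image count and batch size are made-up numbers, and rounding up (so a final partial batch is still evaluated) is an assumption; the note itself only gives the plain division.

```python
import math


def compute_test_iter(num_test_images, test_batch_size):
    """Number of test iterations needed to cover the whole test set once."""
    if num_test_images <= 0 or test_batch_size <= 0:
        raise ValueError("both arguments must be positive")
    # Round up so that a last, partially filled batch is not skipped (assumption).
    return math.ceil(num_test_images / test_batch_size)


# Hypothetical numbers: 1000 test images, test batch size 8.
print(compute_test_iter(1000, 8))
```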
## Train
Edit and run `train.sh`; parameters can be tuned along the way.
After training finishes, run `test.sh` to measure the network's accuracy.
>Download the training weights from the link above, and run train.sh, after about 30000 iterations, the loss should be 1.5 - 2.5.
```shell
## train and save log
$ ./train.sh 2>&1 | tee -a log/0905_ssd_mob.log
```
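
To follow the loss from the saved log, something like the sketch below could work. It assumes the log contains lines of the form `Iteration N, loss = X`; the sample lines are invented for illustration and are not taken from a real run.

```python
import re

# Assumed log line shape: "... Iteration 100, loss = 3.21"
LOSS_PATTERN = re.compile(r"Iteration (\d+), loss = ([-+0-9.eE]+)")


def parse_losses(lines):
    """Return (iteration, loss) pairs for every line that reports a loss."""
    results = []
    for line in lines:
        match = LOSS_PATTERN.search(line)
        if match:
            results.append((int(match.group(1)), float(match.group(2))))
    return results


# Invented sample lines; in practice read them from log/0905_ssd_mob.log.
sample = [
    "I0905 10:00:00.000000 1234 solver.cpp:243] Iteration 0, loss = 12.5",
    "I0905 10:00:00.000000 1234 sgd_solver.cpp:138] Iteration 0, lr = 0.0005",
    "I0905 10:03:00.000000 1234 solver.cpp:243] Iteration 10, loss = 9.75",
]
for iteration, loss in parse_losses(sample):
    print(iteration, loss)
```

Lines that report something other than a loss, such as the learning rate, are skipped.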
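## Appendix
### Training time is very long
10 iterations take about 3 minutes
1,000 iterations --> 300 min = 5 hr
10,000 iterations --> 3,000 min = 50 hr
batch size 24 with 300x300 input --> GPU memory during training: 5651MiB~7163MiB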
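
The estimate above is a straight linear extrapolation from the measured 3 minutes per 10 iterations. A small sketch of the arithmetic; the 30,000 case is added only because that iteration count appears in the quoted training advice.

```python
def estimate_training_hours(total_iterations, minutes_per_10_iterations=3.0):
    """Linear extrapolation of training time from a measured 10-iteration cost."""
    minutes = total_iterations / 10 * minutes_per_10_iterations
    return minutes / 60


for iterations in (1000, 10000, 30000):
    print(iterations, "iterations ->", estimate_training_hours(iterations), "hours")
```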
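### Possible improvements
- Adjust the input size used for training
Is 300x300 too large or too small? (Changing it means the pretrained model can no longer be used.)
- At deploy time, which input size and confidence threshold give better results?
- Small-object detection: would changing the feature map sizes let the model pick up finer detail?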
[Mininum size of the detected objects #297](https://github.com/weiliu89/caffe/issues/297)

>it does not do well for small objects.
A 50x50 object may only have 5-6 pixels on conv4_3 (i.e. 8x reduction in resolution).
To detect smaller objects better, besides what you mentioned, you could increase input image size or increase feature map size.
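
The figure in the quote can be checked with a quick calculation. The sketch assumes the object size on a feature map is simply its pixel size divided by the layer's stride; the 8x reduction is the value given in the quote, and the strides of the MobileNet-SSD layers may differ.

```python
def size_on_feature_map(object_pixels, stride):
    """Approximate side length of an object on a feature map with the given stride."""
    return object_pixels / stride


# 50x50 object at an 8x reduction, as in the quote.
print(size_on_feature_map(50, 8))
# Same object after doubling the input image size (the object becomes 100 px wide).
print(size_on_feature_map(100, 8))
```

This is why the quote suggests a larger input image or larger feature maps for small objects.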
## Model result
accuracy & loss
 
caffemodel test result:


--- END ---