caffe SSD mobileNet people detection model - training

20190904

  • install the caffe ssd branch
  • generate the training data (Pascal VOC > lmdb)
  • generate the prototxt files
  • run training


This post uses people detection as the example.
The data has only one label class: "person".

Install

github: caffe-SSD

  1. Download and switch to the ssd branch
$ git clone https://github.com/weiliu89/caffe.git
$ cd caffe
$ git checkout ssd
  2. Build
    Follow the steps on GitHub and run make.
    Both the Makefile and Makefile.config need some edits.
    [install issue1] [install issue2]
    [make issue1] [make issue2]
  • Note:
    Remember to build this newer caffe-SSD (ssd branch); otherwise the SSD layers will not be recognized.

Otherwise you will hit "layer not found" issues, e.g.:
issue: caffe.LayerParameter has no field named permute_param

If you just want to perform inference, I think adding about three new layers (prior_bbox_layer, permute_layer, detection_output_layer) to your caffe is enough.

Generate lmdb dataset (from Pascal VOC format)

ref.

Train SSD on the Custom Dataset
Mobilenet-SSD的Caffe系列實現
chuanqi305/MobileNet-SSD

Prepare the VOC-format dataset

Put the prepared Pascal VOC format data under /home/data/VOCdevkit (otherwise you will need to change the path inside create_list.sh).

(caffe)wang@ai4:~/data/VOCdevkit/MYDATASET$ ls
Annotations  ImageSets  JPEGImages  MYDATASET_person_64127.txt

Copy the four related files (sh/txt/prototxt) into a new folder (MYDATASET):

$ cd ~/caffe-SSD/data/
$ cp VOC0712/* MYDATASET/.

The result looks like this:

(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ls
coco_voc_map.txt  create_data.sh  create_list.sh  labelmap_voc.prototxt

modify the create_list.sh

In the second loop, replace the keywords VOC2007 and VOC2012 with "MYDATASET" since we have only one dataset.

  ##for name in VOC2007 VOC2012
  for name in MYDATASET 
  • Generate the file-name lists (test.txt / trainval.txt)
(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ./create_list.sh 

This generates test_name_size.txt, test.txt, and trainval.txt in data/MYDATASET/.

The result looks like this:

(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ls 
coco_voc_map.txt  create_list.sh         test.txt            trainval.txt
create_data.sh    labelmap_voc.prototxt  test_name_size.txt
  • Rename the labelmap_voc.prototxt
$ vim data/MYDATASET/labelmap_MYDATASET.prototxt

In this file, the first block is the background, so don't change it. For the remaining blocks, change the class names accordingly.

### labelmap_MYDATASET.prototxt
item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "person"
  label: 1
  display_name: "person"
}

modify the create_data.sh

Convert the dataset to an LMDB database.
Edit ~/caffe-SSD/data/MYDATASET/create_data.sh:

# change these two names
dataset_name="MYDATASET"
mapfile="$root_dir/data/$dataset_name/labelmap_MYDATASET.prototxt"
  • Run create_data.sh
(caffe)wang@ai4:~/caffe-SSD/data/MYDATASET$ ./create_data.sh 

This will create the LMDB database in ~/data/VOCdevkit
and make a soft link in examples/MYDATASET/.

> ~/data/VOCdevkit/persondataset now contains an "lmdb" folder (success!)
> ~/data/examples/MYDATASET/ now contains the soft links

(If the conversion script cannot find the caffe module, insert the SSD branch's python path before importing caffe_pb2:)

import sys
sys.path.insert(0, "/home/xxx/caffe-ssd/python")  # ++ add this line
from caffe.proto import caffe_pb2

Create symlinks in the current directory:

$ cd ~/MobileNet-SSD
$ ln -s /home/xxx/data/VOCdevkit/MYDATASET/lmdb/MYDATASET_trainval_lmdb trainval_lmdb
$ ln -s /home/xxx/data/VOCdevkit/MYDATASET/lmdb/MYDATASET_test_lmdb test_lmdb
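As an optional sanity check, the generated LMDB can be opened and one record decoded. A minimal Python sketch, assuming the lmdb python package is installed and the same pycaffe path as above:

# check_lmdb.py - a minimal sketch to verify the converted database
# run from ~/MobileNet-SSD, where the symlinks above were created
import sys
sys.path.insert(0, "/home/xxx/caffe-ssd/python")  # SSD branch's pycaffe
import lmdb
from caffe.proto import caffe_pb2

env = lmdb.open("trainval_lmdb", readonly=True)
print("number of records:", env.stat()["entries"])

with env.begin() as txn:
    key, value = next(iter(txn.cursor()))        # first record
    anno = caffe_pb2.AnnotatedDatum()            # SSD stores AnnotatedDatum, not plain Datum
    anno.ParseFromString(value)
    print(key, anno.datum.width, anno.datum.height,
          "annotation groups:", len(anno.annotation_group))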

Generate training prototxt

Reference: chuanqi305/MobileNet-SSD

caffe-SSD/examples/MobileNet-SSD
The original VOC dataset has 21 classes (20 + background), but here we have only 2 classes (1 + background),
so the train/test/deploy network files must be regenerated.
This is done with gen_model.sh, which fills in the templates in the template folder with the parameters we specify. Usage:

$ cd ~/caffe-SSD/examples/MobileNet-SSD
$ ./gen_model.sh 2
## ./gen_model.sh [cls]

After running it, you get an example folder with the prototxt files already generated!
According to the author's setup, the deploy file already has the BN layers merged, so it has to be used together with matching (merged) weights later.

$ ls ~/caffe-SSD/examples/MobileNet-SSD/example
MobileNetSSD_deploy.prototxt  MobileNetSSD_test.prototxt  MobileNetSSD_train.prototxt
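Optionally, you can confirm that the SSD branch's pycaffe parses the generated network (a stock caffe would fail here with the permute_param error mentioned above). A minimal sketch, reusing the path assumption from earlier:

# a minimal sketch; run from ~/caffe-SSD/examples/MobileNet-SSD
import sys
sys.path.insert(0, "/home/xxx/caffe-ssd/python")
import caffe

caffe.set_mode_cpu()
net = caffe.Net("example/MobileNetSSD_deploy.prototxt", caffe.TEST)  # structure only, no weights
print("layers:", len(net.layers))
print("output blobs:", net.outputs)  # the SSD detection output should appear here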

Modify solver.prototxt settings

  • Adjust the training and testing hyperparameters
    ~/caffe-SSD/examples/MobileNet-SSD
    Edit solver_train.prototxt and solver_test.prototxt according to your setup (see the sketch below).
    test_iter = number of test images / batch size
    Don't set the initial learning rate too high, otherwise the pretrained base weights get badly damaged;
    the optimizer is RMSProp, which may help convergence; don't change it to SGD, again to protect the weights.

[Reference] Mobilenet-SSD的Caffe系列實現
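For reference, these are the kinds of solver fields to check (placeholder values for illustration, not the repo defaults):

### solver fields to check (illustrative values only)
test_iter: 250            # = number of test images / test batch size, e.g. 2000 / 8
base_lr: 0.0005           # keep the initial learning rate low to protect the pretrained weights
type: "RMSProp"           # keep RMSProp; do not switch to SGD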

Train

Modify and run train.sh; you can keep tuning the parameters along the way.
After training finishes, run test.sh to measure the network's accuracy.

Download the training weights from the link above and run train.sh; after about 30000 iterations, the loss should be 1.5 - 2.5.

## train and save log
$ ./train.sh 2>&1 | tee -a log/0905_ssd_mob.log
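To watch convergence, the loss values can be pulled back out of the saved log. A minimal Python sketch, assuming the standard Caffe solver log lines ("Iteration N, loss = X"):

# parse_loss.py - a minimal sketch; assumes the default Caffe solver log format
import re

iters, losses = [], []
with open("log/0905_ssd_mob.log") as f:
    for line in f:
        m = re.search(r"Iteration (\d+).*?loss = ([\d.]+)", line)
        if m:
            iters.append(int(m.group(1)))
            losses.append(float(m.group(2)))

print("last 5 points:", list(zip(iters, losses))[-5:])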

Appendix

Training time is very long

10 iterations take about 3 minutes,
so 1000 iterations ≈ 300 min,
and 10000 iterations ≈ 3000 min = 50 hr.

batch size 24, 300x300 input → training GPU memory usage 5651MiB~7163MiB

Possible improvements

  • Tune the input size used for training
    Is 300x300 too big or too small? (But once it changes, the pretrained model can no longer be used.)
  • At deploy time, which input size gives better confidence scores?
  • On small-object detection: would enlarging the feature maps let the model see finer detail?
    Mininum size of the detected objects #297

it does not do well for small objects.
A 50x50 object may only have 5-6 pixels on conv4_3 (i.e. 8x reduction in resolution).
To detect smaller objects better, besides what you mentioned, you could increase input image size or increase feature map size.

Model result

accuracy & loss

caffemodel test result:

- END -