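# YOLO (darknet) Tutorial

- https://www.cnblogs.com/shouhuxianjian/p/10567201.html
    - deprecated
- https://github.com/chineseocr/chineseocr/tree/app/tools
    - test succeeded, but the md5 is not the same (a potential problem)

## Build requirements

- Windows or Linux
- **CMake >= 3.8** for modern CUDA support
- **CUDA 10.0**
    - 文圻 tested that CUDA 9.x also works
- **OpenCV >= 2.4**
- **cuDNN >= 7.0** (matching CUDA 10.0)
- **NVIDIA GPU with CC (compute capability) >= 3.0**: https://en.wikipedia.org/wiki/CUDA#GPUs_supported
- Compiler
    - Linux: **GCC** or **Clang**
    - Windows: **MSVC 2015/2017/2019**

## How to build and test

### Windows

- [Building YOLO on Windows 10](https://hackmd.io/w4Pj6Q2-TUCOihk8B5jJyg?view)

### Linux

After ```git clone```, just run ```make```. You can also set the following options in the Makefile or at build time: [link here](https://github.com/AlexeyAB/darknet/blob/9c1b9a2cf6363546c152251be578a21f3c3caec6/Makefile#L1)

1. ```GPU=1``` builds with CUDA for GPU acceleration (CUDA is searched for in ```/usr/local/cuda``` by default)
2. ```CUDNN=1``` builds with cuDNN to speed up GPU training (cuDNN is searched for in ```/usr/local/cudnn``` by default)
3. ```CUDNN_HALF=1``` enables Tensor Cores (requires specific GPU models)
    - I have not tested this option ><
4. ```OPENCV=1``` builds with OpenCV to support video or video streams from web-cams or network cameras
5. ```DEBUG=1``` builds in debug mode
6. ```OPENMP=1``` builds with OpenMP for multi-core support
7. ```LIBSO=1``` builds the library ```darknet.so``` and the binary executable ```uselib``` (```uselib``` depends on ```darknet.so```)
    - I have not tested this option either >///<
8. ```ZED_CAMERA=1``` builds with support for the ZED-3D-camera
    - To be filled in by someone who has actually used it

## How to train and view the training results

### AlexeyAB

:::info
Note: the instructions below are written from 履軒's memory of using pjreddie's repo and are **untested**.
Fish images and YOLOv3 are used as the running example.
:::

1. Dataset preparation and structure
    - From YOLOv2 onwards, the input image size can be set in the cfg file (416x416 by default, or larger such as 608x608; any size that is a multiple of 32)
    - So shoot at the highest resolution you can: more objects can be labeled, and it leaves more flexibility for later use
    - Pay attention to ==diversity== and ==variation==
2. Choosing between photos and video
    - Photos
        - Pros
            - Higher image size and resolution
            - Little time spent filtering; labeling can start right away
            - Lower cost
        - Cons
            - Angle, focus, and distance must be considered for each shot, so collecting data takes more time
            - If photos of some category are missing, you have to go back on site to shoot them
    - Video
        - Pros
            - The target object is recorded directly instead of shot one photo at a time, so collecting data is convenient and fast
            - A large number of images covering many angles and fields of view can be collected in a short time
            - If some angle or pose turns out to be under-represented during training, more frames can be pulled from the video and added
        - Cons
            - Lower resolution; when an object is too small, cropping a screenshot cannot usefully enlarge it
            - Frames have to be selected from the footage before labeling can start
            - Hardware for 1080P or 4K recording costs more than a still camera
3. Create the folders
    - Create the folders shown below, then put the training images from the previous step into the image folder
    ![](https://i.imgur.com/Daq1697.png)
4. 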
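    Start labeling the photos
    - The labeling tool used here is LabelImg (download: https://github.com/tzutalin/labelImg)
    :::info
    履軒 murmur: I hear installing LabelImg on Windows causes a lot of problems. If you know a better labeling tool, please let 履軒 know XD
    :::
    - Click "Open Dir" and "Change Save Dir" to select the image and labels folders created in the previous step. You can then pick the photo to label from the lower pane. Click "Create RectBox" to start labeling
    - There are also hotkeys that help, listed here: https://github.com/tzutalin/labelImg#hotkeys
    ![](https://i.imgur.com/Ym6PpxK.png)
    - When drawing a box, consider whether the object shows enough texture features. If even a human cannot identify an object with certainty, do not box it
    :::info
    履軒 suggests discussing this part with your teacher, so you do not end up with inconsistent results like 履軒 did XD
    :::
    - After boxing "all" the images, your folders should contain the same number of image files and XML label files
    ![](https://i.imgur.com/9ZyJsKI.png)
    :::info
    履軒 suggests exporting the annotations in Pascal VOC format, which makes it easier to support other object detection models
    :::
5. Create the YOLO folder and put the label files and images in it
    - YOLO uses its own home-grown plain-text (.txt) annotation format. Each image has one txt file with the same base name as the image, and the two must sit in the same folder
    - So create a YOLO folder here and put all images and txt files into it
    - During training, YOLO reads from this YOLO folder
    ![](https://i.imgur.com/GL6W4lm.png)
    :::info
    The txt annotation files are created in step 7, so hold on for now XD
    :::
6. Create the cfg folder for config files
    - Under the fish directory, create a cfg folder to hold the YOLO config files
    ![](https://i.imgur.com/IT08kI5.png)
    - ```obj.data```
        - Defines the number of classes, the paths of the other config files, and the directory where weights are saved; YOLO reads it for both training and prediction
        - For the fish dataset:
            ```
            # number of classes
            classes = 1
            # training images
            train = ./fish/cfg/train.txt
            # test images
            test = ./fish/cfg/test.txt
            # obj.names, described below
            names = ./fish/cfg/obj.names
            # weights saved during training
            backup = ./fish/cfg/weights/
            ```
    - ```obj.names```
        - This file lists the class names
        - For the fish dataset (only fish need to be detected):
            ```
            fish
            ```
    - ```train.txt```
        - YOLO reads this file in order to fetch the images used for training
        - Usually 80% of all images
        - Fill it with image paths picked at a fixed ratio, either by hand or with a script
    - ```test.txt```
        - YOLO reads this file in order to fetch the images used for validation
        - Usually 20% of all images
        - Fill it with image paths picked at a fixed ratio, either by hand or with a script
    - ```yolov3.cfg``` or ```yolov3-tiny.cfg```
        - The YOLO model config file
        - Find the YOLO cfg file you want (standard or tiny YOLO) in the cfg folder of the Darknet installation directory and copy it into this cfg folder
        - 履軒 also suggests giving the copy a different name to avoid confusion
        > case 1: to train YOLO, copy yolov3.cfg and change the following lines:
        > ```
        > line 3: set batch=64 # each training step uses 64 images
        > line 4: set subdivisions=16 -> the batch will be divided by 16
        > line 603: set filters=(classes + 5) * 3 -> here filters = (1 + 5) * 3 = 18
        > line 610: set classes=1 -> number of classes to detect, only one here
        > line 689: set filters=(classes + 5) * 3 -> here filters = (1 + 5) * 3 = 18
        > line 696: set classes=1 -> number of classes to detect, only one here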
        > line 776: set filters=(classes + 5) * 3 -> here filters = (1 + 5) * 3 = 18
        > line 783: set classes=1 -> number of classes to detect, only one here
        > ```
        > case 2: to train tiny-YOLO, copy yolov3-tiny.cfg and change the following lines:
        > ```
        > line 3: set batch=64 # each training step uses 64 images
        > line 4: set subdivisions=16 -> the batch will be divided by 16
        > line 127: set filters=(classes + 5) * 3 -> here filters = (1 + 5) * 3 = 18
        > line 135: set classes=1 -> number of classes to detect, only one here
        > line 171: set filters=(classes + 5) * 3 -> here filters = (1 + 5) * 3 = 18
        > line 177: set classes=1 -> number of classes to detect, only one here
        > ```
        >
        > - batch: how many images are used per training batch
        > - subdivisions: how many groups each batch is split into, so that GPU memory does not run out
        > - Standard YOLOv3 has three detectors, one for each of three feature-map scales, so three sets of filters and classes must be changed.
        > - Tiny-YOLO has only two detectors, so only two sets of filters and classes need to be changed.
        >
    :::info
    AlexeyAB also describes how to edit the .cfg file in [How to train (to detect your custom objects):](https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects), for those who want to read more
    :::
7. Convert the VOC labels to YOLO format
    - YOLO uses its own home-grown text file as the annotation file, in the following format:
    ![](https://i.imgur.com/HKvAYeV.png)
        - ```object-class```: class index from 0 to (classes - 1), following the order of the classes in ```obj.names```
        - ```<x_center> <y_center> <width> <height>```: floating-point bounding box coordinates, relative to the width and height of the image
            - These values lie in the interval ```(0.0 1.0]```
    - 履軒 found a handy format conversion tool, **convert2Yolo**, which can convert VOC to YOLO format
        - https://github.com/ssaru/convert2Yolo
        - To convert VOC to YOLO format, the command is: ```$ python example.py --datasets VOC --img_path ./fish/images/ --label ./fish/labels/ --convert_output_path ./fish/yolos/ --img_type ".jpg" --manipast_path ./fish/ --cls_list_file ./fish/cfg/obj.names```
        - The full list of options is in the [README.md](https://github.com/ssaru/convert2Yolo#required-parameters)
8. 
    Start training
    - Create a directory to store the weights files saved during training
        - Its path is the one given by the ```backup``` parameter in ```obj.data```
        ![](https://i.imgur.com/vpsaTE9.png)
    - Download pre-trained weights
        - This is a form of transfer learning
        - Download the darknet53 weights pre-trained on ImageNet from [https://pjreddie.com/media/files/darknet53.conv.74](https://pjreddie.com/media/files/darknet53.conv.74)
        - Alternatively, consider yolov3.weights (https://pjreddie.com/media/files/yolov3.weights) or yolov3-tiny.weights (https://pjreddie.com/media/files/yolov3-tiny.weights), which were trained on ImageNet + the COCO dataset
            - 履軒 has not tried either of these two
    - Run the following command in a terminal to train (remember to adapt it to your own setup):
        ```
        $ ./darknet detector train fish/cfg/obj.data fish/cfg/fish_train.cfg darknet53.conv.74
        ```
    - The full training commands are described at [https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects](https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects)
    - Training keeps printing various values
    > Region Avg IOU: 0.798363, Class: 0.893232, Obj: 0.700808, No Obj: 0.004567, Avg Recall: 1.000000, count: 8
    > Region Avg IOU: 0.800677, Class: 0.892181, Obj: 0.701590, No Obj: 0.004574, Avg Recall: 1.000000, count: 8
    > 9002: 0.211667, 0.60730 avg, 0.001000 rate, 3.868000 seconds, 576128 images Loaded: 0.000000 seconds
    >
    - In addition, every 100 iterations a weights checkpoint named like ```yolo-obj_last.weights``` is saved to the backup directory (the prefix follows the cfg file name)
    - When to stop training
        - AlexeyAB and C.H.Tseng give different advice XD
        - The links below explain the details
            - C.H.Tseng: https://chtseng.wordpress.com/2018/09/01/%E5%BB%BA%E7%AB%8B%E8%87%AA%E5%B7%B1%E7%9A%84yolo%E8%BE%A8%E8%AD%98%E6%A8%A1%E5%9E%8B-%E4%BB%A5%E6%9F%91%E6%A9%98%E8%BE%A8%E8%AD%98%E7%82%BA%E4%BE%8B/
            - AlexeyAB: https://github.com/AlexeyAB/darknet#when-should-i-stop-training
    - The AlexeyAB version can also draw a chart of mAP and loss
        - 履軒 has not used it; others are welcome to try it XD
        - https://github.com/AlexeyAB/darknet#when-should-i-stop-training
        - ```./darknet detector train fish/cfg/obj.data fish/cfg/fish_train.cfg darknet53.conv.74 -map```
9. 
    View the training results
    - ```./darknet detector test fish/cfg/obj.data fish/cfg/fish_test.cfg fish_train.backup <path to the test image>```
    - This runs detection on the test image and writes the result to predictions.jpg
    :::info
    When 履軒 ran testing with the pjreddie version, no objects were detected. After tracking the issue, it turned out that changing the .cfg file from:
    ```=
    [net]
    # Testing
    # batch=1
    # subdivisions=1
    # Training
    batch=64
    subdivisions=16
    ```
    to the following:
    ```=
    [net]
    # Testing
    batch=1
    subdivisions=1
    # Training
    # batch=64
    # subdivisions=16
    ```
    makes detection work normally. It is unclear whether the AlexeyAB version has this problem, but if you run into it, this fix is worth trying.
    Reference: https://github.com/pjreddie/darknet/issues/882
    :::

#### Reference

- https://chtseng.wordpress.com/2018/09/01/%E5%BB%BA%E7%AB%8B%E8%87%AA%E5%B7%B1%E7%9A%84yolo%E8%BE%A8%E8%AD%98%E6%A8%A1%E5%9E%8B-%E4%BB%A5%E6%9F%91%E6%A9%98%E8%BE%A8%E8%AD%98%E7%82%BA%E4%BE%8B/
- https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
- https://pjreddie.com/darknet/yolo/

#### How to Improve Object Detection

- https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

#### Fish images and sample annotations download

- https://drive.google.com/drive/folders/1voTEhR_xgV8AgOSCTf830Vs7akEWdihg?usp=sharing

## Converting weights between Darknet and Keras

### Darknet to Keras weights conversion demo

1. Download the Darknet/Keras weights conversion program prepared by 履軒 ([download here](https://github.com/Cuda-Chen/keras_darknet))
2. Enter the program folder from step 1 and download the two files needed for this demo into its root directory:
    * [fish_train.backup](https://drive.google.com/open?id=1ydIRUpJE6ZqrgthgmKQDUOiDdXD1ydkw), the weights file for fish detection
    * [yolov3.cfg](https://drive.google.com/open?id=1OPrkonUYQUBNRj4v3tJRAHPBNKWE68iF), the network architecture for fish detection
3. Run the following command to produce the Keras weights file used in this demo (named ```fish_temp.h5``` here)
    ```
    $ python darknet_to_keras.py -cfg_path yolov3.cfg -weights_path fish_train.backup -output_path fish_temp.h5
    ```
4. Go back to your home directory and download the YOLOv3 with Keras program provided by [qqwweee](https://github.com/qqwweee/keras-yolo3/tree/master) (remember to set up a virtual environment for it!)
5. Enter the root directory of the program downloaded in step 4 and copy the ```fish_temp.h5``` generated in step 3 here, together with ```obj.names```
    * Also download [the linked image](https://i.imgur.com/OWBDYrT.jpg) here (saved as `DSC_0061.JPG` in this demo)
    * And download [obj.names](https://drive.google.com/open?id=1CaYqeQIJT_sNURpA5o8Pj-0vpmdY_q3N), the class-name list for fish detection
6. 
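    Open ```yolo.py```, find the ```YOLO``` class, and change ```_defaults``` as follows:
    ```python=22
    _defaults = {
        "model_path": 'fish_temp.h5',  # weights file for this demo
        "anchors_path": 'model_data/yolo_anchors.txt',
        "classes_path": 'obj.names',  # class-name list for this demo
        "score" : 0.3,
        "iou" : 0.45,
        "model_image_size" : (608, 608),  # the weights in this demo expect input images resized to 608*608
        "gpu_num" : 0,  # this demo does not use the GPU
    }
    ```
    :::info
    履軒's murmur: the change above is needed because, even if you pass the paths of the weights file and the class-name list on the command line, qqwweee's repo still looks up the preset values in `_defaults`. That is why `yolo.py` has to be edited as shown to get the result 履軒 wants.
    :::
7. Run the following command
    ```
    $ python yolo_video.py --image
    ```
    Then enter the image file name (`DSC_0061.JPG`) to run fish detection. If all goes well, a new window shows the image below:
    ![](https://i.imgur.com/zg6BVcw.jpg)

### Keras to Darknet weights conversion demo

1. Download the Darknet/Keras weights conversion program prepared by 履軒 ([download here](https://github.com/Cuda-Chen/keras_darknet))
2. Enter the program folder from step 1 and download the two files needed for this demo into its root directory:
    * [fish_temp.h5](https://drive.google.com/open?id=1OtT8yMNqr4vJ9oTRSvxONm0vUamHFNW0), the weights file for fish detection
    * [yolov3.cfg](https://drive.google.com/open?id=1OPrkonUYQUBNRj4v3tJRAHPBNKWE68iF), the network architecture for fish detection
3. Run the following command to produce the Darknet weights file used in this demo (named ```fish_temp.weights``` here)
    ```
    $ python keras_to_darknet.py -cfg_path yolov3.cfg -weights_path fish_temp.h5 -output_path fish_temp.weights
    ```
4. Go back to your home directory and download the Darknet provided by [AlexeyAB](https://github.com/AlexeyAB/darknet) (remember to build it!)
5. Enter the root directory of the program downloaded in step 4 and copy the `fish_temp.weights` generated in step 3 here, together with `obj.names`
    * Download [the linked image](https://i.imgur.com/OWBDYrT.jpg) here (saved as `DSC_0061.JPG` in this demo)
    * Download [obj.names](https://drive.google.com/open?id=1CaYqeQIJT_sNURpA5o8Pj-0vpmdY_q3N), the class-name list for fish detection
    * Download [obj.data](https://drive.google.com/open?id=1VAOkUUgv_KxF_k6OfAtGKFeBLSCDSYHD)
6. Copy the `yolov3.cfg` downloaded in step 2 into the root directory of the program downloaded in step 4
7. Run the following command (depending on your operating system, you may need to replace `darknet` with `darknet.exe`)
    ```
    $ ./darknet detector test obj.data yolov3.cfg fish_temp.weights DSC_0061.JPG
    ```
    Afterwards, your `predictions.jpg` should look like this:
    ![](https://i.imgur.com/EbvYWS4.jpg)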
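
Going back to steps 6 and 7 of the training walkthrough, the arithmetic there (the `filters` value in the cfg, the normalized YOLO box coordinates, and the 80/20 split between `train.txt` and `test.txt`) can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The helper names `yolo_filters`, `voc_box_to_yolo`, and `split_train_test` are hypothetical and are not part of Darknet or convert2Yolo. VOC boxes are assumed to be pixel coordinates `(xmin, ymin, xmax, ymax)`, and the six-decimal output and the fixed shuffle seed are arbitrary choices.

```python
import random


def yolo_filters(num_classes, anchors_per_scale=3):
    """filters = (classes + 5) * 3, as set before each detector in the cfg."""
    return (num_classes + 5) * anchors_per_scale


def voc_box_to_yolo(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Turn one VOC pixel box into a YOLO annotation line.

    Output: "<object-class> <x_center> <y_center> <width> <height>",
    with the four box values relative to the image width and height.
    """
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def split_train_test(image_paths, train_ratio=0.8, seed=0):
    """Shuffle the image paths and split them into (train, test) lists."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)  # fixed seed keeps the split reproducible
    n_train = int(len(paths) * train_ratio)
    return paths[:n_train], paths[n_train:]


if __name__ == "__main__":
    print(yolo_filters(1))
    print(voc_box_to_yolo(0, 100, 50, 300, 250, 400, 300))
    # Hypothetical file names, used only to show the split sizes.
    train, test = split_train_test([f"fish/YOLO/img_{i}.jpg" for i in range(10)])
    print(len(train), len(test))
```

Writing each returned list to `train.txt` and `test.txt`, one path per line, gives the files that `obj.data` points at. For real datasets, convert2Yolo remains the tool the walkthrough relies on; the sketch only makes the numbers behind it explicit.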