# YOLOv4 Training Tutorial

[TOC]

## 0. First of all

This tutorial shows how to build YOLOv4 (Darknet) in the cloud with GPU support so that you can run object detection in milliseconds.

https://colab.research.google.com/drive/1_GdoqCJWXsChrOiY8sZMr_zbr_fH-0Fg?usp=sharing&fbclid=IwAR0DRwtH6-9D_gYbqxpCFdRIeBYJ1ozle0AiAU3hxk3gOjSePSMbc-JhBaY#scrollTo=q2Jjv0yRKLPe

## 1. Git Clone Darknet

```cmd
git clone https://github.com/AlexeyAB/darknet
```

## 2. Change the Makefile to enable GPU and OpenCV

```cmd
cd darknet
sed -i 's/OPENCV=0/OPENCV=1/' Makefile
sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
```

Verify CUDA (the leading `!` is only needed in a Colab notebook cell; omit it in a regular shell):

```cmd
!/usr/local/cuda/bin/nvcc --version
```

==Note: the CUDA version must be >= 10.2.==

> If you don't know how to set up CUDA and cuDNN on Ubuntu, please refer to https://medium.com/ching-i/ubuntu-%E5%AE%89%E8%A3%9D-gpu-driver-cuda-cudnn-%E6%95%99%E5%AD%B8-972b189a6d35

## 3. Make Darknet

Run `make` to build Darknet. This produces the `darknet` executable, which you then use to run or train object detectors.

```cmd
make
```

## 4. File Architecture

#### Archive

```
./darknet
| |-generate_test.py
| |-generate_train.py
| |-data
| | |-obj
| | | |-0000000.png
| | | |-0000000.txt
| | | |-...
| | |-test
| | | |-1000001.png
| | | |-1000001.txt
| | | |-...
| | |-obj.names
| | |-obj.data
| |-cfg
| | |-yolov4-obj.cfg
```

#### obj.data

```cmd
classes = 3
train = /home/chenging/Autonomous_Driving_Final/darknet/data/train.txt
valid = /home/chenging/Autonomous_Driving_Final/darknet/data/test.txt
names = /home/chenging/Autonomous_Driving_Final/darknet/data/obj.names
backup = /home/chenging/Autonomous_Driving_Final/darknet/backup
```

https://drive.google.com/file/d/1oVbwfLfsyttismgoRmRHM35WGqDUckCa/view?usp=sharing

#### obj.names

```cmd
Car
Pedestrian
Cyclist
```

https://drive.google.com/file/d/1mKFsJ3wdBU0I2fXZ_a4vrQeEIrqG3z9x/view?usp=sharing

#### yolov4-obj.cfg

```
[net]
# Testing
# After training, uncomment the next two lines and comment out the Training values.
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=16
# If you run out of GPU memory, increase subdivisions.
width=416
height=416
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 6000
policy=steps
steps=4800,5400
scales=.1,.1
```

https://drive.google.com/file/d/1H1uVhalKED9bQmVxSQQhEwBScM2zOjti/view?usp=sharing

You can find more details about yolov4-obj.cfg on the following page: https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/542552/

#### 0000000.png

![](https://i.imgur.com/8Owc9AY.jpg)

#### 0000000.txt

```cmd
0 0.570234375 0.389734375 0.11641145833333327 0.1913125
0 0.6023385416666667 0.4096484375 0.05925 0.09820312499999995
0 0.7507083333333333 0.5038671875 0.2666354166666667 0.1546796875
0 0.570234375 0.389734375 0.11641145833333327 0.1913125
0 0.6023385416666667 0.4096484375 0.05925 0.09820312499999995
0 0.7507083333333333 0.5038671875 0.2666354166666667 0.1546796875
```

* 0000000.txt must follow the YOLO format: one object per line, written as `<class_id> <x_center> <y_center> <width> <height>`, with all four box values normalized by the image width and height.

![](https://i.imgur.com/9T1NOGJ.png)

Source: https://chtseng.wordpress.com/2018/09/01/%E5%BB%BA%E7%AB%8B%E8%87%AA%E5%B7%B1%E7%9A%84yolo%E8%BE%A8%E8%AD%98%E6%A8%A1%E5%9E%8B-%E4%BB%A5%E6%9F%91%E6%A9%98%E8%BE%A8%E8%AD%98%E7%82%BA%E4%BE%8B/

![](https://i.imgur.com/EQMyDAx.png)
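
As a rough illustration of the YOLO label format, the sketch below converts a pixel-space box (left, top, width, height) into one label line. The function name `to_yolo_line` and the example box and image size are hypothetical and are not part of Darknet.

```python
def to_yolo_line(class_id, left, top, box_w, box_h, img_w, img_h):
    """Return one YOLO label line: class_id x_center y_center width height.

    The four box values are divided by the image size, so they fall in [0, 1].
    """
    x_center = (left + box_w / 2) / img_w
    y_center = (top + box_h / 2) / img_h
    norm_w = box_w / img_w
    norm_h = box_h / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {norm_w:.6f} {norm_h:.6f}"


if __name__ == "__main__":
    # Hypothetical example: a 192x108 px box whose top-left corner is at
    # (864, 486) in a 1920x1080 image, labeled as class 0 (Car in obj.names).
    print(to_yolo_line(0, 864, 486, 192, 108, 1920, 1080))
```

Writing one such line per object into a `.txt` file that shares the image's base name gives the layout shown for `0000000.png` / `0000000.txt`.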
![](https://i.imgur.com/EoNmsSG.png)

![](https://i.imgur.com/izwusMS.png)

![](https://i.imgur.com/sPbqxdT.png)

> What if my labels are in Waymo format rather than YOLO format?
> Please refer to
> 1. https://github.com/caizhongang/waymo_kitti_converter.git
> 2. https://github.com/EscVM/OIDv4_ToolKit

## 5. Pre-trained weights

```cmd
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
```

## 6. Train it and Demo!

Start training from the pre-trained convolutional weights:

```cmd
./darknet detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137 -dont_show -map
```

Or continue training from weights saved in `backup/`:

```cmd
./darknet detector train data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_best.weights -dont_show -map
```

If you have other pre-trained weights, replace backup/yolov4-obj_best.weights or yolov4.conv.137 with the path to your weights file.

![](https://i.imgur.com/72cvqwm.png)

If you add -map, you can see the training progress plot in ./darknet. mAP is computed every 100 iterations.

![](https://i.imgur.com/MnlqISp.png)

> What is mAP? https://chih-sheng-huang821.medium.com/%E6%B7%B1%E5%BA%A6%E5%AD%B8%E7%BF%92%E7%B3%BB%E5%88%97-%E4%BB%80%E9%BA%BC%E6%98%AFap-map-aaf089920848?fbclid=IwAR2TqOW3XAA7TTG6efarRTdVPuQgxn_c8VDzHeRtJ9RRPm__y8Y6nc19TH0

#### Test an image

```
./darknet detector test data/obj.data cfg/yolov4-obj.cfg ./backup/yolov4-obj_best.weights ./test.png -thresh 0.3 -out result.txt
```

![](https://i.imgur.com/YUZ02yK.jpg)

The detections are written to result.txt as JSON:

```
[
{
 "frame_id":1,
 "filename":"./test.png",
 "objects": [
  {"class_id":1, "name":"Pedestrian", "relative_coordinates":{"center_x":0.025277, "center_y":0.590687, "width":0.056367, "height":0.105150}, "confidence":0.974176},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.379112, "center_y":0.568891, "width":0.094814, "height":0.111459}, "confidence":0.993241},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.809745, "center_y":0.620679, "width":0.186627, "height":0.168321}, "confidence":0.989374},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.650084, "center_y":0.541073, "width":0.065840, "height":0.081862}, "confidence":0.984396},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.502944, "center_y":0.527315, "width":0.068330, "height":0.106554}, "confidence":0.982795},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.623770, "center_y":0.525190, "width":0.061993, "height":0.068749}, "confidence":0.968530},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.272871, "center_y":0.559579, "width":0.083208, "height":0.074861}, "confidence":0.967086},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.339403, "center_y":0.532364, "width":0.062299, "height":0.064379}, "confidence":0.933663},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.301758, "center_y":0.547045, "width":0.063640, "height":0.067330}, "confidence":0.930660},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.432346, "center_y":0.529242, "width":0.058491, "height":0.070736}, "confidence":0.923266},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.605002, "center_y":0.517186, "width":0.048999, "height":0.064098}, "confidence":0.906764},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.227240, "center_y":0.579499, "width":0.114512, "height":0.085782}, "confidence":0.887934},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.368436, "center_y":0.531378, "width":0.060252, "height":0.070806}, "confidence":0.776398},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.589318, "center_y":0.511667, "width":0.038385, "height":0.052332}, "confidence":0.403203},
  {"class_id":0, "name":"Car", "relative_coordinates":{"center_x":0.315539, "center_y":0.540429, "width":0.060694, "height":0.062630}, "confidence":0.303930}
 ]
}
]
```

#### Test images

Darknet reads the image paths from standard input (here Test/Test.txt, one path per line) and the console output is redirected to result.txt.

```
./darknet detector test data/obj.data cfg/yolov4-obj.cfg ./backup/yolov4-obj_best.weights -thresh 0.25 -ext_output < Test/Test.txt > result.txt
```

#### Test.txt

```
./Test/1001000.png
./Test/1001001.png
./Test/1001002.png
./Test/1001003.png
./Test/1001004.png
./Test/1001005.png
./Test/1001006.png
./Test/1001007.png
./Test/1001008.png
./Test/1001009.png
./Test/1001010.png
./Test/1001011.png
./Test/1001012.png
./Test/1001013.png
./Test/1001014.png
./Test/1001015.png
./Test/1001016.png
./Test/1001017.png
./Test/1001018.png
./Test/1001019.png
```

#### result.txt

```
CUDNN_HALF=1
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
Create CUDA-stream - 0
Create cudnn-handle 0
nms_kind: greedynms (1), beta = 0.600000
nms_kind: greedynms (1), beta = 0.600000
nms_kind: greedynms (1), beta = 0.600000
seen 64, trained: 131 K-images (2 Kilo-batches_64)
Enter Image Path: Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
./Test/1001000.png: Predicted in 133.273000 milli-seconds.
Car: 98% (left_x: 1 top_y: 741 width: 93 height: 150)
Car: 89% (left_x: 23 top_y: 733 width: 178 height: 100)
Car: 98% (left_x: 312 top_y: 724 width: 222 height: 138)
Car: 99% (left_x: 560 top_y: 708 width: 225 height: 198)
Car: 79% (left_x: 653 top_y: 703 width: 147 height: 115)
Car: 99% (left_x: 1500 top_y: 713 width: 422 height: 275)
Enter Image Path: Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
./Test/1001001.png: Predicted in 116.374000 milli-seconds.
Car: 97% (left_x: 1 top_y: 740 width: 91 height: 155)
Car: 90% (left_x: 22 top_y: 735 width: 179 height: 100)
Car: 97% (left_x: 310 top_y: 722 width: 225 height: 138)
Car: 99% (left_x: 564 top_y: 707 width: 217 height: 204)
Car: 84% (left_x: 651 top_y: 701 width: 146 height: 116)
Car: 99% (left_x: 1433 top_y: 715 width: 493 height: 276)
Enter Image Path: Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
./Test/1001002.png: Predicted in 117.112000 milli-seconds.
Car: 97% (left_x: 2 top_y: 738 width: 91 height: 158)
Car: 85% (left_x: 23 top_y: 734 width: 178 height: 101)
Car: 98% (left_x: 312 top_y: 720 width: 217 height: 143)
Car: 99% (left_x: 559 top_y: 708 width: 229 height: 193)
Car: 86% (left_x: 649 top_y: 698 width: 151 height: 115)
Car: 98% (left_x: 1369 top_y: 711 width: 558 height: 276)
...
```

#### Test video

```
./darknet detector demo data/obj.data cfg/yolov4-obj.cfg ./backup/yolov4-obj_best.weights -thresh 0.25 -dont_show -ext_output ./test.mp4 -out_filename result.avi
```

https://www.youtube.com/watch?v=DCLd-WqeBcY
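
The JSON written by `-out` in the single-image test stores each box as relative center coordinates, while the `-ext_output` log reports pixel `left_x` / `top_y` values. The sketch below shows one way to convert the JSON form into pixel boxes. It assumes the JSON layout shown in the single-image example; the helper name `to_pixel_boxes`, the `min_conf` parameter, and the sample values are hypothetical.

```python
import json


def to_pixel_boxes(result_json, img_w, img_h, min_conf=0.5):
    """Convert relative center boxes to (name, left, top, width, height, confidence)."""
    boxes = []
    for frame in json.loads(result_json):
        for obj in frame["objects"]:
            if obj["confidence"] < min_conf:
                continue  # drop low-confidence detections
            rc = obj["relative_coordinates"]
            box_w = rc["width"] * img_w
            box_h = rc["height"] * img_h
            # Move from the box center to its top-left corner.
            left = rc["center_x"] * img_w - box_w / 2
            top = rc["center_y"] * img_h - box_h / 2
            boxes.append((obj["name"], round(left), round(top),
                          round(box_w), round(box_h), obj["confidence"]))
    return boxes


if __name__ == "__main__":
    # Hypothetical sample with the same keys as the JSON output shown earlier.
    sample = """[{"frame_id": 1, "filename": "./test.png", "objects": [
      {"class_id": 0, "name": "Car",
       "relative_coordinates": {"center_x": 0.5, "center_y": 0.5,
                                "width": 0.25, "height": 0.5},
       "confidence": 0.9}
    ]}]"""
    for box in to_pixel_boxes(sample, img_w=1920, img_h=1080):
        print(box)
```

In practice you would read the file contents with `open("result.txt").read()` and pass the real image size instead of the sample values.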