# Report on 27/09/2019

## Compare running time between image size 300x300 and 600x600 on Jetson Nano

Images used for testing:

![](https://i.imgur.com/iec8X0c.jpg) ![](https://i.imgur.com/xj7BpjN.jpg)

Tested on a Jetson Nano with two images, one of size 300x300 and one of size 600x600, in four configurations:

* The TensorFlow MTCNN library (installed with `pip3 install mtcnn`) with the default minimum face size of 20 pixels (faces narrower than 20 pixels are ignored).
* The TensorFlow MTCNN library with a minimum face size of 60 pixels.
* The MTCNN converted to TensorRT with a minimum face size of 20 pixels.
* The MTCNN converted to TensorRT with a minimum face size of 60 pixels.

For each configuration, detection was run **1000 times** and the total time was recorded.

Results:

Total time for image size 300x300 (seconds)

| Method | min face size 20 px | min face size 60 px |
|:-:|:-:|:-:|
| TensorFlow MTCNN | 102.81897354125977 | 47.165743827819824 |
| TensorRT MTCNN | 79.857168 | 19.178281 |

Total time for image size 600x600 (seconds)

| Method | min face size 20 px | min face size 60 px |
|:-:|:-:|:-:|
| TensorFlow MTCNN | 243.88660168647766 | 108.84165668487549 |
| TensorRT MTCNN | 203.177823 | 39.289763 |

The input size for InsightFace is 112x112, so if a detected face is too small and has to be scaled up, it becomes blurry, which hurts the recognition accuracy. For the 300x300 image above, a minimum face size of 60 pixels returns no faces because the face is too small.

**Currently, the minimum face size in Jinjer is 60 pixels.**

## Compare Dlib HOG and MTCNN

The results below are taken from https://github.com/nodefluxio/face-detector-benchmark/blob/master/benchmark-result.txt

1. Dlib HOG

```
# Average IOU = 0.253
# mAP = 0.365
# Inferencing time (On CPU) : 0.239 s
#
### Resource Usage (On CPU):
#
# Memory Usage : 270.777 MiB
# CPU Utilization : 99-100%
```

2. Tensorflow MTCNN Face Detector

```
# Average IOU = 0.417
# mAP = 0.517
# Inferencing time (On GPU) : 0.699 s
# Inferencing time (On CPU) : 1.979 s
#
### Resource Usage (On GPU):
#
# Memory Usage : 2074.180 MiB
# GPU Memory Usage : 5004 MiB
# GPU Core Utilization : 10-40%
# CPU Utilization : 111-120%
#
### Resource Usage (On CPU):
#
# Memory Usage : 790.129 MiB
# CPU Utilization : 500-600%
```

3. Tensorflow Mobilenet SSD Face Detector

```
# Average IOU = 0.598
# mAP = 0.751
# Inferencing time (On GPU) : 0.0238 s
# Inferencing time (On CPU) : 0.1650 s
#
### Resource Usage (On GPU):
#
# Memory Usage : 1967.676 MiB
# GPU Memory Usage : 502 MiB
# GPU Core Utilization : 47-58%
# CPU Utilization : 140-150%
#
### Resource Usage (On CPU):
#
# Memory Usage : 536.270 MiB
# CPU Utilization : 670-700%
```

Based on the results above, SSD outperforms MTCNN, and MTCNN outperforms Dlib HOG. However, both SSD and Dlib HOG return only the bounding box of the face, while MTCNN also returns 5 landmark points (two eyes, the nose, and two corners of the mouth). We need these landmark points to align the face.
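As an illustration of why the landmarks matter, below is a minimal sketch of eye-based alignment using the keypoints returned by the pip `mtcnn` package (`detect_faces` returns a `keypoints` dict with `left_eye`, `right_eye`, `nose`, `mouth_left`, `mouth_right`). The 112x112 output size matches the InsightFace input mentioned above; the image path, the crop-size heuristic, and the rotate-and-crop strategy are assumptions for illustration, not the exact alignment used in Jinjer.

```python
import cv2
import numpy as np
from mtcnn import MTCNN

def align_face(image, keypoints, output_size=112):
    """Rotate the image so the eyes are horizontal, then crop around the eye center."""
    lx, ly = keypoints["left_eye"]
    rx, ry = keypoints["right_eye"]

    # Angle of the line between the eyes relative to the horizontal axis
    angle = float(np.degrees(np.arctan2(ry - ly, rx - lx)))
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)

    # Rotate around the midpoint of the eyes so the eyes become level
    rot = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, rot, (image.shape[1], image.shape[0]))

    # Crop a square around the eye center (heuristic size) and resize to the recognizer input
    half = int(1.5 * np.hypot(rx - lx, ry - ly))
    x, y = int(center[0]), int(center[1])
    crop = rotated[max(0, y - half):y + half, max(0, x - half):x + half]
    return cv2.resize(crop, (output_size, output_size))

detector = MTCNN(min_face_size=60)  # same threshold currently used in Jinjer
image = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)  # "face.jpg" is a placeholder path
for face in detector.detect_faces(image):
    aligned = align_face(image, face["keypoints"])
```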
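For reference, the timing experiment in the first section could be reproduced roughly as follows with the TensorFlow `mtcnn` package. The image path and the warm-up call are assumptions; the original measurement details on the Jetson Nano may differ.

```python
import time
import cv2
from mtcnn import MTCNN

image = cv2.cvtColor(cv2.imread("test_300x300.jpg"), cv2.COLOR_BGR2RGB)  # placeholder path

detector = MTCNN(min_face_size=60)  # or min_face_size=20 for the other column
detector.detect_faces(image)        # one warm-up call so model loading is not timed

start = time.time()
for _ in range(1000):               # same number of runs as in the tables above
    detector.detect_faces(image)
print("Total time for 1000 runs: %.3f s" % (time.time() - start))
```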