# Tuan Report
## Clothes detection:
- Model: Yolov3 Darknet-53 (https://github.com/pjreddie/darknet)
- Dataset: DeepFashion2 (https://github.com/switchablenorms/DeepFashion2)
- Test Data: DeepFashion2 validation set: 32,156 images [gdrive](https://drive.google.com/open?id=1O45YqhREBOoLudjA06HcTehcEebR0o9y)
- Test Result: mean average precision (mAP@0.50) = 84.20 %
```
Loading weights from ../darknet/backup/yolov3_cloth_130000.weights...
seen 64
Done! Loaded 107 layers from weights-file
calculation mAP (mean average precision)...
32156
detections_count = 175221, unique_truth_count = 52490
class_id = 0, name = short_sleeved_shirt, ap = 95.99% (TP = 11230, FP = 1318)
class_id = 1, name = long_sleeved_shirt, ap = 88.73% (TP = 5251, FP = 2342)
class_id = 2, name = short_sleeved_outwear, ap = 62.14% (TP = 80, FP = 43)
class_id = 3, name = long_sleeved_outwear, ap = 89.05% (TP = 1838, FP = 956)
class_id = 4, name = vest, ap = 87.25% (TP = 1695, FP = 410)
class_id = 5, name = sling, ap = 62.80% (TP = 239, FP = 213)
class_id = 6, name = shorts, ap = 94.20% (TP = 3570, FP = 350)
class_id = 7, name = trousers, ap = 96.08% (TP = 8735, FP = 1026)
class_id = 8, name = skirt, ap = 93.49% (TP = 5795, FP = 1097)
class_id = 9, name = short_sleeved_dress, ap = 84.53% (TP = 2426, FP = 889)
class_id = 10, name = long_sleeved_dress, ap = 68.34% (TP = 1088, FP = 1192)
class_id = 11, name = vest_dress, ap = 88.95% (TP = 3138, FP = 2683)
class_id = 12, name = sling_dress, ap = 83.06% (TP = 1037, FP = 1256)
for conf_thresh = 0.25, precision = 0.77, recall = 0.88, F1-score = 0.82
for conf_thresh = 0.25, TP = 46122, FP = 13775, FN = 6368, average IoU = 67.45 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.842010, or 84.20 %
Total Detection Time: 674.000000 Seconds
```
- Result: https://bitbucket.org/nldanang/attribute-analysis/src/master/clothes_detector/
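As a sanity check on the log above, mAP@0.50 is just the mean of the 13 per-class APs, and precision/recall/F1 follow from the reported TP/FP/FN at conf_thresh = 0.25. A minimal verification (all numbers copied from the log):

```python
# Per-class APs (percent) from the darknet mAP log above.
per_class_ap = [95.99, 88.73, 62.14, 89.05, 87.25, 62.80, 94.20,
                96.08, 93.49, 84.53, 68.34, 88.95, 83.06]

# mAP@0.50 is the unweighted mean over the 13 classes.
map_50 = sum(per_class_ap) / len(per_class_ap)

# Precision/recall/F1 at conf_thresh = 0.25, from the logged counts.
tp, fp, fn = 46122, 13775, 6368
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"mAP@0.50 = {map_50:.2f} %")   # 84.20 %
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1 = {f1:.2f}")
```

The computed values match the log: mAP 84.20 %, precision 0.77, recall 0.88, F1 0.82.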
## Facial expression:
- Model: [Resnet_50](https://drive.google.com/uc?id=17unekscjX6pExycRcA1VD0-hVpT6e354), following the paper [Fine-Grained Facial Expression Analysis Using Dimensional Emotion Model](https://arxiv.org/pdf/1805.01024.pdf)
- Dataset: 28,709 images from the FER2013 facial emotion recognition challenge (https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data)
- Test Data: 574 images from FER2013 (https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data)
- Test Result: using 80% of the 28,709 images for training and the remaining 20% of FER2013 for testing, accuracy on the test set: 84%
- Result: https://bitbucket.org/nldanang/attribute-analysis/src/master/emotion_detector/
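The 80/20 train/test split described above could be reproduced with a small helper like this. This is an illustrative sketch only: the seed, the helper name, and the plain-list representation of the dataset are assumptions, not the actual training pipeline.

```python
import random

def split_80_20(samples, seed=42):
    """Shuffle a dataset and split it 80/20 into train/test lists.

    `seed` fixes the shuffle so the split is reproducible; the value
    here is arbitrary, not taken from the original experiment.
    """
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    cut = int(0.8 * len(samples))
    train = [samples[i] for i in idx[:cut]]
    test = [samples[i] for i in idx[cut:]]
    return train, test

# With the 28,709 FER2013 images, an 80/20 split gives
# 22,967 training samples and 5,742 test samples.
train, test = split_80_20(list(range(28709)))
print(len(train), len(test))  # 22967 5742
```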
## Improve anti-spoofing for face recognition
- Staging Model: Trained on [NUAA](http://parnec.nuaa.edu.cn/xtan/data/NUAAImposterDB.html) dataset using MobileNet
- New Model:
* Face De-Spoofing: Anti-Spoofing via Noise Modeling [paper](https://arxiv.org/abs/1807.09968), [Trained model](https://github.com/yaojieliu/ECCV2018-FaceDeSpoofing/tree/master/lib)
- Why the new model is better:
	* It was trained on larger datasets
	* The CNN architecture can estimate the spoof noise in an image
	* We can combine multiple anti-spoofing models to improve accuracy
- Dataset: [Oulu-NPU](https://sites.google.com/site/oulunpudatabase/), [CASIA-MFSD](http://biometrics.cse.msu.edu/Publications/Databases/MSUMobileFaceSpoofing/index.htm) and [Replay-Attack](https://www.idiap.ch/dataset/replayattack)
- Test Data: [Oulu-NPU](https://sites.google.com/site/oulunpudatabase/), [CASIA-MFSD](http://biometrics.cse.msu.edu/Publications/Databases/MSUMobileFaceSpoofing/index.htm) and [Replay-Attack](https://www.idiap.ch/dataset/replayattack)
- Test Result:
To compare with previous methods, the paper uses the Attack Presentation Classification Error Rate (APCER), the Bona Fide Presentation Classification Error Rate (BPCER), and ACER = (APCER + BPCER)/2 for intra-dataset testing on Oulu-NPU, and the Half Total Error Rate (HTER), half the sum of FAR and FRR, for cross-dataset testing between CASIA-MFSD and Replay-Attack. The paper reports its results under these metrics.


- Result:
+ overview solutions: https://hackmd.io/Nmf1GqKpR7OeXlPLcCEDgQ
+ Demo face anti spoofing for face recognition: https://drive.google.com/file/d/1UcDB__DmtdW1b2WaSd4qpXz6JJ1K7jqR/view?usp=drivesdk
	+ I will merge the code into Jinjer Face, or create a new repository for this project if anti-spoofing is needed.
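For reference, the evaluation metrics named in the Test Result section above are straightforward to implement. A minimal sketch, assuming binary labels (1 = bona fide/live, 0 = attack/spoof) and binary predictions from the anti-spoofing model:

```python
def apcer(labels, preds):
    """Fraction of attack samples misclassified as bona fide."""
    attacks = [(l, p) for l, p in zip(labels, preds) if l == 0]
    return sum(1 for l, p in attacks if p == 1) / len(attacks)

def bpcer(labels, preds):
    """Fraction of bona fide samples misclassified as attacks."""
    bona = [(l, p) for l, p in zip(labels, preds) if l == 1]
    return sum(1 for l, p in bona if p == 0) / len(bona)

def acer(labels, preds):
    """Average Classification Error Rate: mean of APCER and BPCER."""
    return (apcer(labels, preds) + bpcer(labels, preds)) / 2

def hter(far, frr):
    """Half Total Error Rate, used for cross-dataset testing."""
    return (far + frr) / 2

# Toy example: 2 of 4 attacks accepted (APCER 0.5),
# 1 of 4 live faces rejected (BPCER 0.25) -> ACER 0.375.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
preds  = [1, 1, 0, 0, 1, 1, 1, 0]
print(acer(labels, preds))  # 0.375
```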
## Convert the age and gender model to TensorFlow integrated with TensorRT (TF-TRT)
- Improved performance from 3.3 FPS (frames per second) to 3.6 FPS
- Results: https://gitlab.com/heyml/neolab/demo_face_recognition/tree/jetson_dev_tftrt
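The 3.3 → 3.6 FPS comparison can be measured with a small timing helper like the one below. This is a hedged sketch: `infer_fn` and `frames` are placeholders for the actual model callable and input frames, and the warm-up count is an assumption (warm-up runs matter for TF-TRT, since engine building happens on the first inferences).

```python
import time

def measure_fps(infer_fn, frames, warmup=5):
    """Average frames-per-second of an inference callable.

    `infer_fn` is any callable taking one frame; `frames` is a list of
    inputs. The first `warmup` frames are run untimed so that one-time
    costs (e.g. TF-TRT engine building) do not skew the measurement.
    """
    for f in frames[:warmup]:
        infer_fn(f)
    start = time.perf_counter()
    for f in frames:
        infer_fn(f)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Usage sketch: run once with the original TensorFlow model and once
# with the TF-TRT converted model on the same frames, then compare.
fps = measure_fps(lambda frame: None, list(range(50)))
print(f"{fps:.1f} FPS")
```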
## Improve face recognition (in progress):
- Staging Model: tf-insightface pretrained [model](https://drive.google.com/open?id=1Iw2Ckz_BnHZUi78USlaFreZXylJj7hnP)
- New Model: Official InsightFace model: LResNet100E-IR (https://www.dropbox.com/s/tj96fsm6t6rq8ye/model-r100-arcface-ms1m-refine-v2.zip?dl=0)
- Why the new model is better:
	* LResNet100E-IR is the official pretrained InsightFace model; however, converting it to TensorRT is difficult
	* The LResNet100E-IR network, trained on the MS1M-ArcFace dataset with the ArcFace loss, is the state of the art in face recognition (https://github.com/deepinsight/insightface)
	* We can combine multiple face recognition models by ensembling the embedding feature vectors from multiple pretrained models to capture more facial detail. This [paper](https://dl.acm.org/citation.cfm?id=3302459) shows experimentally that an ensemble of CNN classifiers outperforms a single CNN classifier.
- Dataset: MS1M-Arcface
- Result:
	+ Overview of solutions: https://hackmd.io/2ZUuxqRyQcirupsD5iPGcw
	+ I finished converting the LResNet100E-IR model to a TensorFlow model, and I am now converting it to TensorFlow-TensorRT to improve performance on the Jetson Nano.
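The embedding-ensemble idea mentioned above can be sketched as follows: L2-normalize each model's embedding, concatenate them into one fused vector, re-normalize, and match faces by cosine similarity. This is an illustrative toy (2-D embeddings from two hypothetical models), not the actual InsightFace pipeline:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def ensemble_embedding(embeddings):
    """Fuse embeddings from several models: normalize each one,
    concatenate, then re-normalize the fused vector."""
    fused = [x for e in embeddings for x in l2_normalize(e)]
    return l2_normalize(fused)

def cosine_similarity(a, b):
    """Dot product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy example: the same face scored by two hypothetical models.
face_a = ensemble_embedding([[1.0, 0.0], [0.0, 2.0]])
face_b = ensemble_embedding([[1.0, 0.0], [0.0, 2.0]])
print(round(cosine_similarity(face_a, face_b), 6))  # 1.0
```

Concatenation keeps each model's contribution separate in the fused vector; averaging is an alternative when all models share the same embedding dimension.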