<style>
.change-line{ display: block; height: 10px; }
</style>

# **AIA Object Detection Course Final Project**

**Final Project - Person Re-ID (April, 2022)**
<div class="change-line"></div>

*This model will contain:*
* Object Detection Model ( getting the person )
* Classification Model ( classifying the ID )

Flow of the process:
![](https://i.imgur.com/33gN7HE.png)
<div class="change-line"></div>

## **Grading**
* <font color="lightgreen">Simple baseline: 0.4 mAP</font>
* <font color="blue">Medium baseline: 0.5 mAP</font>
* <font color="red">Strong baseline: 0.6 mAP</font>
<div class="change-line"></div>

## **Object Detection Model**
* YOLOR ( You Only Learn One Representation )
* Original data ( 723 images with their annotations )
* Additional data ( ~900 images with their annotations from the CrowdHuman dataset )
<div class="change-line"></div>

### Detected Sample:
![](https://i.imgur.com/zTteE0M.jpg)
<div class="change-line"></div>

## **Classification Model**
* ResNet50
* Original data ( 100 classes, 5 ~ 20 images for each class )
* Advanced: OSNet
<div class="change-line"></div>

**New method**
* We use ResNet50 with the softmax (classification) layer removed, so the model outputs a feature vector of length 2048.
* When the model receives a cropped image from YOLOR, it outputs a vector that is saved in a gallery. The gallery forms an N x 2048 matrix, where N is the number of IDs. If multiple cropped images belong to the same ID, that ID's row is updated with the mean of their vectors.
* In the inference phase, the cropped image (query) from YOLOR is sent to ResNet50, which outputs a feature vector. This feature vector is compared with the vectors in the gallery.
* After the comparison, the ID whose gallery vector is most similar to the query's feature vector is returned as the prediction. (A minimal sketch of this procedure follows below.)
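The gallery procedure above is straightforward to express in code. Below is a minimal sketch, assuming PyTorch/torchvision, an ImageNet-pretrained ResNet50 (in the actual project our fine-tuned weights would be loaded instead), and cosine similarity as the comparison metric; the names `embed`, `build_gallery`, and `identify` are illustrative, not the project's actual code.

```python
# Minimal sketch of the gallery-based Re-ID procedure described above.
# Assumptions (not from the original write-up): torchvision's ResNet50,
# cosine similarity for comparison, and a 256 x 128 person-crop size.
import torch
import torch.nn.functional as F
from torchvision import models, transforms

# ResNet50 with the final classification layer removed: replacing `fc`
# with Identity leaves the 2048-dim pooled feature as the model output.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((256, 128)),   # common Re-ID crop size (assumption)
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(crop):
    """PIL person crop (from YOLOR) -> L2-normalized 2048-dim vector."""
    feat = backbone(preprocess(crop).unsqueeze(0)).squeeze(0)
    return F.normalize(feat, dim=0)

def build_gallery(crops_by_id):
    """dict {id: [crop, ...]} -> (ids, N x 2048 gallery matrix).

    Each ID's row is the mean of that ID's crop feature vectors,
    re-normalized so that a dot product equals cosine similarity.
    """
    ids, rows = [], []
    for pid, crops in crops_by_id.items():
        feats = torch.stack([embed(c) for c in crops])
        ids.append(pid)
        rows.append(F.normalize(feats.mean(dim=0), dim=0))
    return ids, torch.stack(rows)

def identify(query_crop, ids, gallery):
    """Return the gallery ID whose vector is most similar to the query."""
    sims = gallery @ embed(query_crop)   # cosine similarities (unit vectors)
    return ids[int(sims.argmax())]
```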
<font color="red">**~~The first version didn't perform well due to the amount of the data, which implies we have try more way to do it such as Few-shot Learning~~.**</font> ![](https://i.imgur.com/9KZUS5m.png) <div class="change-line"></div> ![](https://i.imgur.com/uscSIlE.png) <div class="change-line"></div> ![](https://i.imgur.com/vXRH93q.png) <div class="change-line"></div> ## **Integration** ### Detected Sample: ![](https://i.imgur.com/SgJdV28.jpg) <div class="change-line"></div> ## **Result of submission version 1:** ![](https://i.imgur.com/caCg48E.png) <font color='red'>**The performance wasn't good, it may be caused by the small amount of the data for training classification model.**</font> ## **Result of submission version 2:** ![](https://i.imgur.com/EftlS4O.png) <font color='red'>**With slight progress...**</font> <div class="change-line"></div> ## **Result of submission version 3:** ![](https://i.imgur.com/aM5IYdT.png) <font color='red'>**With slight progress...**</font> <div class="change-line"></div> ## **Result of submission version 4:** ![](https://i.imgur.com/wXc3fns.png) <font color='red'>**Achieved and exceeded the Strong baseline!**</font> <div class="change-line"></div> ## **Result of submission version 5:** ![](https://i.imgur.com/bDUpk9M.png) <font color='red'>**Outpeform more accuracy than before!!!**</font> <div class="change-line"></div> ## **Progress** - [x] Object Detection Model - [x] Classification Model - [x] Integration - [x] Submission ## **Leader Board** https://docs.google.com/spreadsheets/d/10JYsHlLrqKtN4UwYUwsoUFsyjGdNQK5jI0y6RZvbbU0/edit#gid=0 **4/16/2022** <font color='red'>$\rightarrow$ 1st place</font> ![](https://i.imgur.com/dU9EgNQ.png) **4/15/2022** <font color='red'>$\rightarrow$ 1st place</font> ![](https://i.imgur.com/pQf35cn.png) **4/14/2022** <font color='red'>$\rightarrow$ 1st place</font> ![](https://i.imgur.com/NBcO5jL.png)