# Shopping Mall project

## Common

* Most of the current pretrained models (for face, gender, age, people re-id) are trained on nearly perfect datasets: clear pictures, good viewing angles, normal cameras (with a fisheye camera, shapes are distorted).
* At the beginning, we can try the pretrained models available on the Internet, but because we use fisheye cameras, we will eventually need to collect a dataset from them and train our own models.
* In a shopping mall environment, customers cross paths with each other and are also occluded by objects such as baskets and trolleys, so the camera often sees only part of the body.
* We will run multiple models at the same time (age, gender, faces, attributes, people re-id), so it will be heavy for an edge device; I think a Jetson Nano can't handle all of this.
* Also, because of the multiple-models issue, we have to make sure they can collaborate with each other. For example, the age/gender model needs to know which person and which history path its detected values belong to.

## Tracking with multiple cameras

### People Re-ID

Compare and consider [AlignedReID++](https://github.com/michuanhaohao/AlignedReID) and [Bag of Tricks](https://github.com/michuanhaohao/reid-strong-baseline). They come from the same researcher, and Bag of Tricks is the newer one. In my opinion, we should give AlignedReID++ a chance to see whether it performs better when part of the body is hidden or when the detected bounding box is not accurate, because it splits the box into smaller parts and gives a different weight (contribution) to each part when comparing two people.

![](https://i.imgur.com/ByF2Fdv.png)

![](https://i.imgur.com/qS2xXua.png)

![](https://i.imgur.com/spCcHDC.png)

### Merge the tracklets to get the right history path of each person (trajectory)

The embedding that the ReID algorithm generates from a picture of a person is only the input for the next step. We need to find and merge the tracklets that belong to the same person to get the right history path. Each tracklet contains many images, so we have many embeddings, and we need a good algorithm to compare two tracklets. I don't see this covered in the research results.

### Person attributes

As with person ReID, we should think about how to use this input to merge the tracklets together.

## My opinion on the way we merge tracklets into trajectories

* Calculate the distance matrix between the known tracklets and the unknown tracklets, then use the Hungarian algorithm to assign them to each other (see the sketch after this list).
* The distance matrix is based on some rules:
  * Calculate the average distance between all ReID embeddings of a pair of tracklets.
  * Velocity threshold: a person can't be at one place at one moment and at a faraway place the next second. The threshold can be calculated from the average walking speed => infinite distance if this rule is violated.
  * If two cameras don't overlap, a person can't appear in both of them at the same time => for two tracklets that belong to two non-overlapping cameras and overlap in time, the distance is infinite.
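Below is a minimal sketch of how these rules could be combined, assuming tracklets carry their ReID embeddings, timestamps, camera id, and positions in a shared floor-plan coordinate system. The `Tracklet` class, `MAX_WALKING_SPEED`, `CAMERA_OVERLAP`, and `accept_threshold` are hypothetical names and values for illustration, not taken from the repositories above.

```python
import numpy as np
from dataclasses import dataclass
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

INF_COST = 1e6                        # large finite stand-in for "impossible match"
MAX_WALKING_SPEED = 2.0               # m/s, assumed average walking-speed threshold
CAMERA_OVERLAP = {("cam1", "cam2")}   # hypothetical set of overlapping camera pairs

@dataclass
class Tracklet:
    cam_id: str
    embeddings: np.ndarray   # (n_images, emb_dim) ReID embeddings of the tracklet
    start_time: float        # seconds
    end_time: float
    start_pos: np.ndarray    # (x, y) on a common floor plan
    end_pos: np.ndarray

def tracklet_distance(a: Tracklet, b: Tracklet) -> float:
    """Average pairwise cosine distance between two tracklets' embeddings,
    with the velocity and camera-overlap rules applied on top."""
    # Rule 3: same time window on non-overlapping cameras => impossible match.
    same_time = a.start_time < b.end_time and b.start_time < a.end_time
    overlapping_cams = (a.cam_id == b.cam_id
                        or (a.cam_id, b.cam_id) in CAMERA_OVERLAP
                        or (b.cam_id, a.cam_id) in CAMERA_OVERLAP)
    if same_time and not overlapping_cams:
        return INF_COST

    # Rule 2: velocity threshold, applied when the tracklets are separated in time.
    if not same_time:
        first, second = (a, b) if a.end_time <= b.start_time else (b, a)
        gap = second.start_time - first.end_time
        speed = np.linalg.norm(second.start_pos - first.end_pos) / max(gap, 1e-3)
        if speed > MAX_WALKING_SPEED:
            return INF_COST

    # Rule 1: average distance over all embedding pairs of the two tracklets.
    return float(cdist(a.embeddings, b.embeddings, metric="cosine").mean())

def merge(known: list, unknown: list, accept_threshold: float = 0.3):
    """Assign unknown tracklets to known trajectories with the Hungarian algorithm."""
    cost = np.array([[tracklet_distance(k, u) for u in unknown] for k in known])
    rows, cols = linear_sum_assignment(cost)
    # Keep only assignments below the (assumed) acceptance threshold;
    # the remaining unknown tracklets would start new trajectories.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < accept_threshold]
```

Note that the sketch uses a large finite `INF_COST` instead of `np.inf` so the Hungarian solver always finds an assignment; the acceptance threshold then discards those forbidden pairs afterwards.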