# SoftTriple Loss: Deep Metric Learning Without Triplet Sampling

## Preliminaries

- Why not just use softmax loss? In tasks such as face recognition, we need to compare two faces of identities never seen during training, so the set of classes is open-ended rather than fixed. Softmax assumes a fixed set of classes, **but an embedding trained with triplet loss can be computed for classes outside the training set.**
- offline mining: at the beginning of each epoch, compute embeddings on the full training set, then mine triplets from them. **Getting B triplets costs 3B embeddings**, and a full pass over the dataset is required to generate the triplets.
- online mining (on-the-fly): given a batch of B images, compute B embeddings, which yield up to B^3 candidate triplets. Most of these triplets are not valid, since a valid triplet needs two distinct samples of the same class plus one sample of a different class; see the mask sketch below.

## Intro

- In this paper, the authors propose to optimize a "SoftTriple loss," which extends the softmax loss with **multiple centers for each class.**
- Learning the embeddings with the SoftTriple loss does not require a sampling phase: it suffices to **mildly increase the size of the last fully-connected layer** (the multiple centers of each class are encoded in that layer).
- Learning the embeddings with multiple centers per class helps capture the hidden distribution of the data by reducing the intra-class variance, which a single center per class cannot model; see the loss sketch at the end of these notes.
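To make the "most triplets are not valid" point concrete, here is a minimal PyTorch sketch of the validity check used in batch-all online mining. The function name `valid_triplet_mask` is mine, not from the paper; it only assumes batch labels arrive as a 1-D integer tensor.

```python
import torch

def valid_triplet_mask(labels: torch.Tensor) -> torch.Tensor:
    """Boolean mask of shape [B, B, B] where entry (a, p, n) is True iff
    (anchor, positive, negative) is a valid triplet:
    a != p, labels[a] == labels[p], and labels[a] != labels[n]."""
    B = labels.size(0)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # [B, B] same-label pairs
    idx = torch.arange(B, device=labels.device)
    distinct = idx.unsqueeze(0) != idx.unsqueeze(1)      # [B, B] a != p
    pos = (same & distinct).unsqueeze(2)                 # valid (a, p) pairs
    neg = (~same).unsqueeze(1)                           # valid (a, n) pairs
    return pos & neg                                     # broadcast to [B, B, B]

labels = torch.tensor([0, 0, 1, 1, 2])
mask = valid_triplet_mask(labels)
print(mask.sum().item(), "valid out of", labels.numel() ** 3, "candidates")
# -> 12 valid out of 125 candidates
```

Even in this tiny batch, roughly 90% of the B^3 candidate triplets are discarded, which is the inefficiency the SoftTriple loss is designed to avoid.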
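Below is a minimal PyTorch sketch of the SoftTriple idea as described above: K centers per class stored in a mildly enlarged final layer, a softmax-weighted (relaxed) max over each class's centers, and a margin on the ground-truth class. The hyperparameter names (`la`, `gamma`, `margin`, `centers_per_class`) mirror the paper's λ, γ, δ, and K, but the values are placeholders and the paper's regularizer over centers is omitted, so treat this as an illustration rather than the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftTripleLoss(nn.Module):
    """Sketch of a SoftTriple-style loss: a softmax loss where each class
    owns K centers in the last fully-connected layer."""

    def __init__(self, dim: int, num_classes: int, centers_per_class: int = 10,
                 la: float = 20.0, gamma: float = 0.1, margin: float = 0.01):
        super().__init__()
        self.la = la              # lambda: scaling applied to class logits
        self.gamma = gamma        # temperature of the softmax over centers
        self.margin = margin      # delta: margin on the ground-truth class
        self.num_classes = num_classes
        self.K = centers_per_class
        # One weight column per (class, center) pair: the "mildly larger" FC layer.
        self.centers = nn.Parameter(torch.randn(dim, num_classes * centers_per_class))

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Normalize embeddings and centers so dot products are cosine similarities.
        x = F.normalize(embeddings, dim=1)               # [B, dim]
        w = F.normalize(self.centers, dim=0)             # [dim, C*K]
        sim = (x @ w).view(-1, self.num_classes, self.K) # [B, C, K]
        # Relaxed max over the K centers of each class (softmax-weighted average).
        attn = F.softmax(sim / self.gamma, dim=2)
        class_sim = (attn * sim).sum(dim=2)              # [B, C]
        # Subtract the margin from the ground-truth class before scaling.
        delta = F.one_hot(labels, self.num_classes).float() * self.margin
        return F.cross_entropy(self.la * (class_sim - delta), labels)
```

Because the centers live in an ordinary `nn.Parameter`, the whole objective trains end-to-end with SGD exactly like a standard softmax classifier: there is no triplet sampling step at all.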