# Contrastive Embedding for Generalized Zero-Shot Learning

## Term explanations

:::spoiler
Semantic description
: Information that is in some sense meaningful for a system.

Zero-Shot Learning
: A learning method that is first trained on the provided datasets, then, relying on semantic descriptions such as **visual attributes or word vectors**, transfers knowledge to recognize unseen classes in a data-free manner.

Long-tail distribution
: Some categories have abundant samples while others have few or even no samples.
![](https://i.imgur.com/LptUS3j.png)

Class-wise supervision
: Recognizing the target with respect to classes, which means instances of the same class that are not fully consistent with the semantic description may be misrecognized.

Instance-wise supervision
: Recognizing the target with respect to instances, which tolerates differences between instances of the same class and gives a more general recognition.

Commonly-used semantic embedding model
: Embeds the features with a **ranking loss**, which requires the correct semantic descriptor to be ranked higher than the wrong descriptors. **Only utilizes class-wise supervision.**

Contrastive embedding model
: Learns to discriminate a positive sample from a large number of negative samples. Two inputs of the same class but different instances should remain similar after embedding.
:::

:::success
:notes: With a ranking loss, two different instances of the same class are compared as a negative pair, while a contrastive loss still regards them as similar (see the sketch at the end of this note).
:::

## Abstract

The GZSL problem aims to recognize classes that are unseen during training. Compared with the conventional ZSL problem, it is much harder, since GZSL suffers from a tighter constraint. Past solutions also suffer from problems that make them only suboptimal, as discussed in the **Introduction**.

The authors propose a brand-new framework that tries to solve this kind of problem. They also try to make the model more general by utilizing not only class-wise supervision but also instance-wise supervision.

## Introduction

In the conventional ZSL problem, we assume that **only unseen** data appear in the test set. When the problem is generalized, i.e., in the Generalized ZSL problem, **not only unseen but also seen** labels appear in the test set, which causes a severe bias problem when conventional ZSL methods are used.

Usually, people solve the GZSL problem with a feature generation-based model, merging **synthesized** unseen data with the provided data to form the whole training set.

:::danger
Since the synthesized unseen data are generated in the original feature space, which we conjecture is far from the semantic descriptions, the problem is still suboptimal and the features thus lack **discriminative ability**.
:::

To deal with this suboptimal dilemma, the authors propose a novel framework. By combining an embedding model with a feature generation-based model, it can solve the above-mentioned problem by mapping both the provided data and the synthesized unseen data into a new feature space. Also, by using a contrastive embedding model, the model can be more general, utilizing not only class-wise supervision but also instance-wise supervision.
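To make the ranking-loss vs. contrastive-loss distinction concrete, below is a minimal PyTorch sketch, my own illustration rather than the paper's exact formulation. The function names, `margin`, and `temperature` are assumptions; `embeddings` are assumed to be L2-normalized embeddings of a batch, and `class_embeds` the embedded semantic descriptors, one row per class. `ranking_loss` uses only class-wise supervision, while `instance_contrastive_loss` adds instance-wise supervision by treating other same-class instances in the batch as positives.

```python
import torch

def ranking_loss(embeddings, class_embeds, labels, margin=0.2):
    """Class-wise supervision only: the matching semantic descriptor must
    score higher than every mismatched descriptor by at least `margin`."""
    scores = embeddings @ class_embeds.t()               # (batch, num_classes)
    pos = scores.gather(1, labels.unsqueeze(1))          # score of the true class
    hinge = (margin + scores - pos).clamp(min=0)         # hinge over all classes
    hinge = hinge.scatter(1, labels.unsqueeze(1), 0.0)   # drop the true class
    return hinge.sum(dim=1).mean()

def instance_contrastive_loss(embeddings, labels, temperature=0.1):
    """Instance-wise supervision: other same-class instances in the batch are
    positives, every other instance is a negative. Assumes batch size > 1."""
    sim = embeddings @ embeddings.t() / temperature      # pairwise similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=sim.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    logits = sim.masked_fill(self_mask, float('-inf'))   # exclude self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # average log-likelihood over each sample's positive pairs
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    return -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1).div(pos_counts).mean()
```

In the paper's hybrid setup, a loss of the second kind would be computed in the new embedding space, over both real seen-class features and synthesized unseen-class features, so that class-wise and instance-wise supervision are combined.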