Domain adaption in One-shot Learning. ECML-PKDD 2018
===
[TOC]
# Abstract
- [code - official (TF)](https://github.com/leonndong/DAOSL)
- Target data seems to be required, **but from the code it is hard to tell whether the target labels are actually used or not**
- Given only one example of each new class, can we **transfer knowledge learned by one-shot learning from one domain to another**?
- propose a **domain adaption framework based on adversarial networks**.
- This framework is **generalized for situations where the source and target domain have different labels**.
- use a **policy network**, inspired by human learning behaviors, to effectively **select samples from the source domain in the training process**. This sampling strategy can further improve the domain adaption performance.
# Introduction
- **metric-based methods** can achieve state-of-the-art performance in one-shot classification tasks, but the accuracy is **easily degraded when the test data comes from a different distribution**
- Prototypical networks for few-shot learning. NIPS 2017
- Matching networks for one shot learning. NIPS 2016
- we propose an **adversarial framework for domain adaption in one-shot learning**
- we **train the one-shot classifier and auxiliary domain discriminator simultaneously**.
- We demonstrate that, in **one-shot learning**, the proposed method can **achieve better results than previous domain adaption models**.
- we propose to use a **policy gradient** method to **select the samples from the source domain in the training phase**, which is different from the traditional random sample selection
- Simple statistical gradient-following algorithms for connectionist reinforcement learning. 1992
- Reinforcement learning: An introduction. 1998
- Policy gradient methods for reinforcement learning with function approximation. 2000
- We also discuss how the proposed sampling strategy is linked to **distance metric learning (DML)** [30] and **curriculum learning** [2].
- [30] Distance metric learning with application to clustering with side-information. NIPS 2003
- [2] Curriculum learning. ICML 2009
- As shown in Figure 1, this work focuses on a difficult setting where the source domain and the target domain have no overlap in categories.
- ![](https://i.imgur.com/VwhYhgs.png)
# Related Work
- the most related recent work
- Few-shot adversarial domain adaptation. CVPR 2017
# Adversarial Domain Adaption with Reinforced Sample Selection
1. we formulate the domain adaption problem in metric-based one-shot learning.
2. we propose an **adversarial domain adaption framework without a stage-wise training scheme**.
3. we introduce the concept of **overgeneralization in domain adaption**.
4. we propose **reinforced sample selection** as a solution to overgeneralization.
- The complete pipeline is illustrated in Figure 2.
![](https://i.imgur.com/l0dimrs.png)
## Problem Definition
- Assume the number of classes $K_S>K_T$ and the number of samples $N_T \gg t$ (a minimal episode-sampling sketch follows this list)
- $N_S$: number of **all source** domain data
- $N_T$: number of **all target** domain data
- $t$: number of **labeled target** domain data
- $K_S$: number of **classes in source** domain
- $K_T$: number of **classes in target** domain
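Below is a minimal sketch of how a $K$-way one-shot episode could be sampled under this setting. The names (`source_data`, `sample_episode`) are illustrative assumptions, not the authors' data-loading code.

```python
# Minimal sketch (assumption, not the official DAOSL code): sampling a
# K-way 1-shot episode from the source-domain data described above.
import random

def sample_episode(source_data, K=5):
    """source_data: dict mapping each of the K_S source classes to a list of images
    (N_S images in total). Returns a 1-shot support set and one query per class."""
    classes = random.sample(list(source_data), K)                   # K of the K_S classes
    support = {c: random.choice(source_data[c]) for c in classes}   # one labeled example each
    queries = {c: random.choice(source_data[c]) for c in classes}   # query images to classify
    return support, queries
```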
## Adversarial Domain Adaptation
- State-of-the-art ADA methods use multi-stage training, but this paper **trains the one-shot classifier and the domain discriminator simultaneously**
![](https://i.imgur.com/bSag48S.png)
- embedding function $f_\theta$
- domain discriminator $g_\phi$
- ***domain discriminator loss (the formula in the paper is mistyped; corrected here)*** $J_\phi = -\dfrac{1}{B_S}\sum\limits_i \log(p_\phi(y=1|f_\theta(x_i))) - \dfrac{1}{B_T}\sum\limits_j \log(p_\phi(y=0|f_\theta(\bar x_j)))\tag 4$
- adversarial loss $J_{adv} = -\dfrac{1}{B_T}\sum\limits_j\log p_\phi(y=1|f_\theta(\bar x_j))$ (the minus sign is needed so that minimizing $J_\theta$ pushes target embeddings toward the source domain, as in ADDA's inverted-label loss; see the training-step sketch below)
- target domain sample $\bar x$
- total loss $J_\theta = J_{cls}+\lambda_{adv}J_{adv}$
### note for ADA
- Note, $\bar y_j$ is not used in the optimization for either $\theta$ or $\phi$, so **ADA in one-shot learning is unsupervised domain adaption**.
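A PyTorch-style sketch of this simultaneous update (the official code is TensorFlow). The function and module names (`f_theta`, `g_phi`, `episode_cls_loss`) are assumptions; note that only target images, never $\bar y_j$, are used.

```python
# Sketch (assumption): one simultaneous training step for the embedding f_theta
# and the domain discriminator g_phi, following Eq. (4) and the losses above.
import torch
import torch.nn.functional as F

def train_step(f_theta, g_phi, opt_theta, opt_phi,
               x_src, x_tgt, episode_cls_loss, lambda_adv=1.0):
    # --- discriminator update (Eq. 4): source labeled y=1, target labeled y=0 ---
    z_src = f_theta(x_src).detach()      # stop gradients into f_theta for this update
    z_tgt = f_theta(x_tgt).detach()
    logit_src, logit_tgt = g_phi(z_src), g_phi(z_tgt)
    J_phi = F.binary_cross_entropy_with_logits(logit_src, torch.ones_like(logit_src)) + \
            F.binary_cross_entropy_with_logits(logit_tgt, torch.zeros_like(logit_tgt))
    opt_phi.zero_grad(); J_phi.backward(); opt_phi.step()

    # --- embedding update: J_theta = J_cls + lambda_adv * J_adv ---
    # J_adv asks the discriminator to label target embeddings as source (y=1),
    # i.e. the inverted-label adversarial loss; only f_theta is updated here.
    logit_tgt = g_phi(f_theta(x_tgt))
    J_adv = F.binary_cross_entropy_with_logits(logit_tgt, torch.ones_like(logit_tgt))
    J_cls = episode_cls_loss(f_theta)    # metric-based one-shot loss on a source episode
    J_theta = J_cls + lambda_adv * J_adv
    opt_theta.zero_grad(); J_theta.backward(); opt_theta.step()
    return J_phi.item(), J_theta.item()
```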
## Overgeneralization
- There is no supervision for $T$, so the extracted features depend on $S$. With limited memory, the learner memorizes **more generalized features from $S$ but misses the features that are most discriminative for $T$**, especially when $K_S \gg K_T$. Previous work has shown that ADA performs well when $S$ and $T$ **share the same categories**.
## Reinforced Sample Selection
- In supervised learning, more examples usually help the learner to grasp more discriminative features. However, the **large sample size of $S$ may not help domain adaption in one-shot learning** because $S$ and $T$ can have totally different categories.
- I think this is a kind of **negative transfer**
- We propose to train the learner to **learn the sampling strategy through reinforcement learning**, which is in contrast to typical random sample selection ***(of few-shot learning??)*** . In the domain adaption process, the learning system actively **selects samples from $S$ when it sees an image from $T$**.
- The policy network $h_\psi:\mathbb R^D\rightarrow \mathbb R^G$ parameterized with $\psi$
- $G$ is the number of disjoint subsets of $S$
- **$D$ is number of input features**
- In **an episode** of a K-way one-shot learning task, we select the **subset of $S$** according to $\Omega_\psi(\bar x)$ before **sampling the support set and query image**.
- After $\theta$ and $\phi$ are updated as in Algorithm 1, if the one-shot classifier correctly predicts the class label for the query image, then we replace the query image with the target image.
- Note, the label of the query image is still the original label since we **do not have the label for the target image**.
- If the target query image can be correctly classified *(too good to be true?)*, the target image is "close" to the corresponding image in the projected feature space. The reward is defined as $R(\Omega_\psi(\bar x)) = \left\{\begin{matrix}1&\text{if correct,}\\-\gamma&\text{otherwise.}\end{matrix}\right.$
- **after the support set is sampled**, we **sample the query images for all $K$ classes** and **for each class, we replace the query image** with the target image to perform a one-shot classification.
- Reinforced Sample Selection (RSS) is actually a single-step Markov Decision Process (see the sketch after the figure below)
![](https://i.imgur.com/DqEOwRc.png)
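A sketch of RSS as a single-step MDP trained with REINFORCE. `PolicyNet`, `select_subset`, and `reinforce_update` are hypothetical names based on the description above, not the official implementation.

```python
# Sketch (assumption): Reinforced Sample Selection as a single-step MDP.
# The policy h_psi maps a target embedding (R^D) to a distribution over the
# G disjoint subsets of S; one subset is picked per episode, and the policy
# is updated with REINFORCE using the +1 / -gamma reward described above.
import torch
import torch.nn.functional as F

class PolicyNet(torch.nn.Module):
    """h_psi: R^D -> R^G."""
    def __init__(self, D, G):
        super().__init__()
        self.fc = torch.nn.Linear(D, G)

    def forward(self, z_tgt):
        return F.softmax(self.fc(z_tgt), dim=-1)   # Omega_psi(x_bar)

def select_subset(h_psi, z_tgt):
    """Pick the source subset to draw the support set from (the single action)."""
    dist = torch.distributions.Categorical(h_psi(z_tgt))
    action = dist.sample()
    return action.item(), dist.log_prob(action)

def reinforce_update(opt_psi, log_prob, correct, gamma=0.1):
    """REINFORCE step: reward +1 if the substituted target query was classified
    with the original query's label, -gamma otherwise."""
    reward = 1.0 if correct else -gamma
    loss = -reward * log_prob                      # gradient ascent on expected reward
    opt_psi.zero_grad(); loss.backward(); opt_psi.step()
    return reward
```

Per episode: embed the incoming target image with $f_\theta$, call `select_subset`, build the K-way one-shot episode from the chosen subset, replace each query with the target image, then call `reinforce_update` with the classification outcome.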
# Experiments
Mainly compares the proposed method against MatchingNet+ADDA and ProtoNet+ADDA.
## Basic Settings
- use **ADDA** as another baseline because FADA cannot be adapted to a one-shot classifier
- use **MatchingNet and ProtoNet as backbone**
- use **Omniglot as source domain, EMNIST as target domain**
## Adversarial Domain Adaption
## Reinforced Sample Selection
## Complex Settings
- We use **CIFAR100 as the source domain** and **ImageNet as the target domain**.
###### tags: `fewshot learning`