Domain adaption in One-shot Learning. ECML-PKDD 2018
===
[TOC]
# Abstract
- [code - official (TF)](https://github.com/leonndong/DAOSL)
- Target data seems to be required, **but from the code it is hard to tell whether the target labels are actually used or not**
- Given only one example of each new class, can we **transfer knowledge learned by one-shot learning from one domain to another**?
- propose a **domain adaption framework based on adversarial networks**.
- This framework is **generalized for situations where the source and target domain have different labels**.
- use a **policy network**, inspired by human learning behaviors, to effectively **select samples from the source domain in the training process**. This sampling strategy can further improve the domain adaption performance.
# Introduction
- **metric-based methods** can achieve state-of-the-art performance in one-shot classification tasks, but the accuracy is **easily degraded when the test data comes from a different distribution**
- Prototypical networks for few-shot learning. NIPS 2017
- Matching networks for one shot learning. NIPS 2016
- we propose an **adversarial framework for domain adaption in one-shot learning**
- we **train the one-shot classifier and auxiliary domain discriminator simultaneously**.
- We demonstrate that, in **one-shot learning**, the proposed method can **achieve better results than previous domain adaption models**.
- we propose to use a **policy gradient** method to **select the samples from the source domain in the training phase**, which is different from the traditional random sample selection
- Simple statistical gradient-following algorithms for connectionist reinforcement learning. 1992
- Reinforcement learning: An introduction. 1998
- Policy gradient methods for reinforcement learning with function approximation. 2000
- We also discuss how the proposed sampling strategy is linked to **distance metric learning (DML)** [30] and **curriculum learning** [2].
- [30] Distance metric learning with application to clustering with side-information. NIPS 2003
- [2] Curriculum learning. ICML 2009
- As shown in Figure 1, this work focuses on a difficult setting where the source domain and the target domain have no overlap in categories.
- ![](https://i.imgur.com/VwhYhgs.png)
# Related Work
- the most related recent work
- Few-shot adversarial domain adaptation. CVPR 2017
# Adversarial Domain Adaption with Reinforced Sample Selection
1. we formulate the domain adaption problem in metric-based one-shot learning.
2. we propose an **adversarial domain adaption framework without a stage-wise training scheme**.
3. we introduce the concept of **overgeneralization in domain adaption**.
4. we propose **reinforced sample selection** as a solution to overgeneralization.
- The complete pipeline is illustrated in Figure 2.
![](https://i.imgur.com/l0dimrs.png)
## Problem Definition
- Assume the number of classes $K_S>K_T$ and the number of samples $N_T \gg t$ (a minimal episode-sampling sketch follows this list)
- $N_S$: number of **all source** domain data
- $N_T$: number of **all target** domain data
- $t$: number of **labeled target** domain data
- $K_S$: number of **classes in source** domain
- $K_T$: number of **classes in target** domain
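Below is a minimal sketch of how a $K$-way one-shot episode could be sampled under this setting. The names (`source_data`, `sample_episode`) are illustrative assumptions, not the authors' data-loading code.

```python
# Minimal sketch (assumption, not the official DAOSL code): sampling a
# K-way 1-shot episode from the source-domain data described above.
import random

def sample_episode(source_data, K=5):
    """source_data: dict mapping each of the K_S source classes to a list of images
    (N_S images in total). Returns a 1-shot support set and one query per class."""
    classes = random.sample(list(source_data), K)                   # K of the K_S classes
    support = {c: random.choice(source_data[c]) for c in classes}   # one labeled example each
    queries = {c: random.choice(source_data[c]) for c in classes}   # query images to classify
    return support, queries
```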
## Adversarial Domain Adaptation
- State-of-the-art ADA methods use multi-stage training, but this paper **trains the one-shot classifier and the domain discriminator simultaneously**
![](https://i.imgur.com/bSag48S.png)
- embedding function $f_\theta$
- domain discriminator $g_\phi$
- ***domain discriminator loss (the formula in the paper is mistyped; corrected here)*** $J_\phi = -\dfrac{1}{B_S}\sum\limits_i \log(p_\phi(y=1|f_\theta(x_i))) - \dfrac{1}{B_T}\sum\limits_j \log(p_\phi(y=0|f_\theta(\bar x_j)))\tag 4$
- adversarial loss $J_{adv} = -\dfrac{1}{B_T}\sum\limits_j\log p_\phi(y=1|f_\theta(\bar x_j))$ (the minus sign is needed so that minimizing $J_\theta$ pushes target embeddings toward the source domain, as in ADDA's inverted-label loss; see the training-step sketch below)
- target domain sample $\bar x$
- total loss $J_\theta = J_{cls}+\lambda_{adv}J_{adv}$
### note for ADA
- Note, $\bar y_j$ is not used in the optimization for either $\theta$ or $\phi$, so **ADA in one-shot learning is unsupervised domain adaption**.
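A PyTorch-style sketch of this simultaneous update (the official code is TensorFlow). The function and module names (`f_theta`, `g_phi`, `episode_cls_loss`) are assumptions; note that only target images, never $\bar y_j$, are used.

```python
# Sketch (assumption): one simultaneous training step for the embedding f_theta
# and the domain discriminator g_phi, following Eq. (4) and the losses above.
import torch
import torch.nn.functional as F

def train_step(f_theta, g_phi, opt_theta, opt_phi,
               x_src, x_tgt, episode_cls_loss, lambda_adv=1.0):
    # --- discriminator update (Eq. 4): source labeled y=1, target labeled y=0 ---
    z_src = f_theta(x_src).detach()      # stop gradients into f_theta for this update
    z_tgt = f_theta(x_tgt).detach()
    logit_src, logit_tgt = g_phi(z_src), g_phi(z_tgt)
    J_phi = F.binary_cross_entropy_with_logits(logit_src, torch.ones_like(logit_src)) + \
            F.binary_cross_entropy_with_logits(logit_tgt, torch.zeros_like(logit_tgt))
    opt_phi.zero_grad(); J_phi.backward(); opt_phi.step()

    # --- embedding update: J_theta = J_cls + lambda_adv * J_adv ---
    # J_adv asks the discriminator to label target embeddings as source (y=1),
    # i.e. the inverted-label adversarial loss; only f_theta is updated here.
    logit_tgt = g_phi(f_theta(x_tgt))
    J_adv = F.binary_cross_entropy_with_logits(logit_tgt, torch.ones_like(logit_tgt))
    J_cls = episode_cls_loss(f_theta)    # metric-based one-shot loss on a source episode
    J_theta = J_cls + lambda_adv * J_adv
    opt_theta.zero_grad(); J_theta.backward(); opt_theta.step()
    return J_phi.item(), J_theta.item()
```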
## Overgeneralization
- There is no supervision for $T$, so the extracted features depend on $S$. With limited memory, the learner memorizes **more generalized features from $S$ but misses the features that are most discriminative for $T$**, especially when $K_S \gg K_T$. Previous work has shown that ADA performs well when $S$ and $T$ **share the same categories**.
## Reinforced Sample Selection
- In supervised learning, more examples usually help the learner to grasp more discriminative features. However, the **large sample size of $S$ may not help domain adaption in one-shot learning** because $S$ and $T$ can have totally different categories.
- I think this is a kind of **negative transfer**
- We propose to train the learner to **learn the sampling strategy through reinforcement learning**, which is in contrast to typical random sample selection ***(of few-shot learning??)*** . In the domain adaption process, the learning system actively **selects samples from $S$ when it sees an image from $T$**.
- The policy network $h_\psi:\mathbb R^D\rightarrow \mathbb R^G$ parameterized with $\psi$
- $G$ is the number of disjoint subsets of $S$
- **$D$ is number of input features**
- In **an episode** of a K-way one-shot learning task, we select the **subset of $S$** according to $\Omega_\psi(\bar x)$ before **sampling the support set and query image**.
- After $\theta$ and $\phi$ are updated as in Algorithm 1, if the one-shot classifier correctly predicts the class label for the query image, then we replace the query image with the target image.
- Note, the label of the query image is still the original label since we **do not have the label for the target image**.
- If the target query image can be correctly classified *(too good to be true?)*, the target image is "close" to the corresponding image in the projected feature space. The reward is defined as $R(\Omega_\psi(\bar x)) = \left\{\begin{matrix}1&\text{if correct,}\\-\gamma&\text{otherwise.}\end{matrix}\right.$
- **after the support set is sampled**, we **sample the query images for all $K$ classes** and **for each class, we replace the query image** with the target image to perform a one-shot classification.
- Reinforced Sample Selection (RSS) is actually a single-step Markov Decision Process (see the sketch after the figure below)
![](https://i.imgur.com/DqEOwRc.png)
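A sketch of RSS as a single-step MDP trained with REINFORCE. `PolicyNet`, `select_subset`, and `reinforce_update` are hypothetical names based on the description above, not the official implementation.

```python
# Sketch (assumption): Reinforced Sample Selection as a single-step MDP.
# The policy h_psi maps a target embedding (R^D) to a distribution over the
# G disjoint subsets of S; one subset is picked per episode, and the policy
# is updated with REINFORCE using the +1 / -gamma reward described above.
import torch
import torch.nn.functional as F

class PolicyNet(torch.nn.Module):
    """h_psi: R^D -> R^G."""
    def __init__(self, D, G):
        super().__init__()
        self.fc = torch.nn.Linear(D, G)

    def forward(self, z_tgt):
        return F.softmax(self.fc(z_tgt), dim=-1)   # Omega_psi(x_bar)

def select_subset(h_psi, z_tgt):
    """Pick the source subset to draw the support set from (the single action)."""
    dist = torch.distributions.Categorical(h_psi(z_tgt))
    action = dist.sample()
    return action.item(), dist.log_prob(action)

def reinforce_update(opt_psi, log_prob, correct, gamma=0.1):
    """REINFORCE step: reward +1 if the substituted target query was classified
    with the original query's label, -gamma otherwise."""
    reward = 1.0 if correct else -gamma
    loss = -reward * log_prob                      # gradient ascent on expected reward
    opt_psi.zero_grad(); loss.backward(); opt_psi.step()
    return reward
```

Per episode: embed the incoming target image with $f_\theta$, call `select_subset`, build the K-way one-shot episode from the chosen subset, replace each query with the target image, then call `reinforce_update` with the classification outcome.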
# Experiments
Mainly compares the proposed method against MatchingNet+ADDA and ProtoNet+ADDA.
## Basic Settings
- use **ADDA** as another baseline because FADA cannot be adapted to a one-shot classifier
- use **MatchingNet and ProtoNet as backbone**
- use **Omniglot as source domain, EMNIST as target domain**
## Adversarial Domain Adaption
## Reinforced Sample Selection
## Complex Settings
- We use **CIFAR100 as the source domain** and **ImageNet as the target domain**.
###### tags: `fewshot learning`