# Meta-Learning
###### tags: `Deep Learning for Computer Vision`
## What If Only a Limited Amount of Data Is Available?
* Naive transfer?
* Model finetuning:
e.g., train a model (e.g., a CNN) on large-scale data (base classes), followed by fine-tuning on **small-size data (novel classes)**.
- That is, freeze the feature backbone (learned from the base classes) and learn classifier weights for the novel classes.
* Possibly poor generalization, since the few novel-class samples are easy to overfit
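A minimal sketch of this freeze-and-finetune recipe, with NumPy stand-ins (the random-projection "backbone", the 64/16-d shapes, and the 5-way head are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen CNN backbone: a fixed random projection + ReLU.
W_backbone = rng.standard_normal((64, 16))   # frozen, never updated
def backbone(x):                             # x: (n, 64) raw inputs
    return np.maximum(x @ W_backbone, 0.0)   # (n, 16) features

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Trainable classifier head for the novel classes.
num_classes, lr = 5, 0.1
W_head = np.zeros((16, num_classes))

# One gradient step on a tiny batch of novel-class samples:
x = rng.standard_normal((8, 64))
y = rng.integers(0, num_classes, 8)
feats = backbone(x)                          # frozen features
probs = softmax(feats @ W_head)
grad = feats.T @ (probs - np.eye(num_classes)[y]) / len(y)
W_head -= lr * grad                          # only the head is updated
```

Only `W_head` receives updates; the backbone is treated as a fixed feature extractor.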

## Selected Applications of Few-Shot Learning in Computer Vision


## Meta Learning = Learning to Learn
* A powerful solution for learning from few-shot data
* Let’s consider the following **“2-way (category) 1-shot (image)”** learning scheme:
* Only predict **relative labels** (+/-) rather than absolute labels

### Objective Function

### Example


### Remarks

* Meta learning: learn an N-way K-shot learning mechanism, rather than fitting data/labels
* The conditions (i.e., N-way K-shot) of meta-training and meta-testing must match.
* Additional remarks on how N & K affect the learning performance:
* The task becomes more difficult as N grows (more classes to distinguish) and easier as K grows (more support examples per class)
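The episodic N-way K-shot setup above can be sketched with a hypothetical sampler (assuming a flat array of integer labels); note that classes are relabeled 0..N-1 per episode, matching the relative-label idea:

```python
import numpy as np

def sample_episode(labels, n_way, k_shot, k_query, rng):
    """Sample an N-way K-shot episode: pick N classes, then K support
    and k_query query examples per class; relabel classes as 0..N-1
    (relative labels, as in meta-learning)."""
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    s_idx, s_y, q_idx, q_y = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(labels == c))
        s_idx.extend(idx[:k_shot])
        s_y.extend([new_label] * k_shot)
        q_idx.extend(idx[k_shot:k_shot + k_query])
        q_y.extend([new_label] * k_query)
    return np.array(s_idx), np.array(s_y), np.array(q_idx), np.array(q_y)

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 20)   # toy dataset: 10 classes, 20 images each
s_idx, s_y, q_idx, q_y = sample_episode(labels, n_way=5, k_shot=1, k_query=3, rng=rng)
```

Meta-training repeatedly draws such episodes, so the model is optimized for the episode-solving mechanism itself.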
## A Closely Related Yet Different Task: Multi-Task Learning
## Approaches
### Approach #1: Optimization-Based Approach
#### Model-Agnostic Meta-Learning (MAML)
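A toy sketch of the MAML idea of learning an initialization $\theta^0$ that adapts quickly. This uses the first-order approximation (FOMAML-style, ignoring second-order terms) on hypothetical 1-D regression tasks $y = a\,x$; all hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 0.05, 0.01   # inner-loop and outer-loop learning rates
w = 0.0                    # the meta-initialization theta^0 (a single scalar)

def loss_and_grad(w, x, y):
    """Squared error and its gradient for the toy model f(x) = w * x."""
    err = w * x - y
    return np.mean(err ** 2), 2.0 * np.mean(x * err)

for step in range(500):
    a = rng.uniform(0.5, 2.0)                 # sample a task: y = a * x
    x_s, x_q = rng.standard_normal(5), rng.standard_normal(5)
    y_s, y_q = a * x_s, a * x_q

    # Inner loop: adapt to the sampled task using its support set.
    _, g_support = loss_and_grad(w, x_s, y_s)
    w_task = w - alpha * g_support

    # Outer loop (first-order approximation): update the initialization
    # with the adapted model's gradient on the query set.
    _, g_query = loss_and_grad(w_task, x_q, y_q)
    w -= beta * g_query
```

The outer loop optimizes post-adaptation query loss, so `w` drifts toward an initialization from which one gradient step fits any sampled task well; full MAML would also backpropagate through the inner update.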

### Approach #2: Non-Parametric Approach (learn to compare)
#### Siamese Network
* Learn a network to determine whether a pair of images are of the same category.

* Loss function
(Cf. MAML above: there, the aim is to find the best initialization $\theta^0$ rather than the best task-specific $\theta^*$.)
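A toy sketch of the Siamese idea: one shared embedding applied to both inputs, with a contrastive-style pair loss (the linear embedding and all shapes are illustrative stand-ins for a CNN, untrained):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared stand-in embedding: both images pass through the SAME weights,
# which is what makes the architecture "Siamese".
W = rng.standard_normal((64, 8)) / 8.0
def embed(x):                       # x: (n, 64) flattened images
    return x @ W                    # (n, 8) embeddings

def same_category_score(x1, x2):
    """Score in (0, 1]: higher when the pair's embeddings are close."""
    d = np.linalg.norm(embed(x1) - embed(x2), axis=1)
    return 1.0 / (1.0 + d)

def contrastive_loss(x1, x2, same, margin=1.0):
    """Pull same-class pairs together; push different-class pairs
    at least `margin` apart in the embedding space."""
    d = np.linalg.norm(embed(x1) - embed(x2), axis=1)
    return np.mean(same * d**2 + (1 - same) * np.maximum(margin - d, 0)**2)

x1 = rng.standard_normal((4, 64))
x2 = rng.standard_normal((4, 64))
same = np.array([1.0, 0.0, 1.0, 0.0])   # ground-truth pair labels
loss = contrastive_loss(x1, x2, same)
```

At test time, a novel-class query is matched against the support images pair by pair, so no classifier weights need retraining.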

#### Prototypical Networks
* Learn a model which properly describes data in terms of intra/inter-class info.
* It learns a **prototype** for each class, with **data similarity/separation** guarantees.
* In the DL version, the feature space is learned via a non-linear mapping $f_{\theta}$, and the representative (i.e., prototype) of each class is its mean feature vector $c_k$
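The prototype computation and nearest-prototype classification can be sketched as follows (the features are assumed to already be outputs of $f_\theta$; the 2-D toy numbers are illustrative):

```python
import numpy as np

def prototypes(support_feats, support_y, n_way):
    """c_k: mean feature vector of each class's support examples."""
    return np.stack([support_feats[support_y == k].mean(axis=0)
                     for k in range(n_way)])

def classify(query_feats, protos):
    """Assign each query to its nearest prototype (Euclidean distance)."""
    d = np.linalg.norm(query_feats[:, None, :] - protos[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Toy 2-way 3-shot episode in a 2-D feature space.
support = np.array([[0., 0.], [0., 1.], [1., 0.],     # class 0
                    [5., 5.], [5., 6.], [6., 5.]])    # class 1
sy = np.array([0, 0, 0, 1, 1, 1])
protos = prototypes(support, sy, n_way=2)
pred = classify(np.array([[0.5, 0.5], [5.5, 5.5]]), protos)
```

Training would apply a softmax over negative distances and backpropagate through $f_\theta$, which shapes the intra/inter-class structure of the feature space.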

#### Matching Networks

* If we have $g=f$, the model turns into a Siamese-network-like architecture
* Also similar to a prototypical network in the one-shot setting
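The matching idea can be sketched as attention over the support set: a softmax over cosine similarities weights the (one-hot) support labels into a query label distribution (toy embeddings are illustrative, and $g=f$ is assumed for brevity):

```python
import numpy as np

def matching_predict(query_f, support_f, support_y, n_way):
    """Attention-weighted label prediction over the support set."""
    qn = query_f / np.linalg.norm(query_f, axis=1, keepdims=True)
    sn = support_f / np.linalg.norm(support_f, axis=1, keepdims=True)
    sims = qn @ sn.T                            # cosine similarities (n_q, n_s)
    attn = np.exp(sims) / np.exp(sims).sum(axis=1, keepdims=True)
    return attn @ np.eye(n_way)[support_y]      # (n_q, n_way) label distribution

support_f = np.array([[1., 0.], [0., 1.]])      # 2-way 1-shot support embeddings
support_y = np.array([0, 1])
probs = matching_predict(np.array([[0.9, 0.1]]), support_f, support_y, 2)
```

With one shot per class, the attention-weighted sum degenerates to comparing the query against one example per class, which is why the behavior resembles a prototypical network.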

#### Relation Network
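In a Relation Network, the comparison metric itself is a small learned module rather than a fixed distance: query and support features are concatenated and mapped to a relation score. A tiny untrained stand-in (all weights and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in relation module: a small MLP mapping a concatenated
# (query, support) feature pair to a relation score in (0, 1).
# Unlike a fixed metric, these weights would be *learned*.
W1 = rng.standard_normal((8, 4)) / 2.0
w2 = rng.standard_normal(4) / 2.0

def relation_score(query_f, support_f):
    pair = np.concatenate([query_f, support_f], axis=-1)   # (8,)
    h = np.maximum(pair @ W1, 0.0)                         # hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2)))                 # sigmoid score

q = rng.standard_normal(4)   # query embedding
s = rng.standard_normal(4)   # support embedding
score = relation_score(q, s)
```

Training regresses the score toward 1 for matching pairs and 0 otherwise, so the network learns both the embedding and the similarity function.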

### Remark

## Data Hallucination
### Data Hallucination GAN


### Jointly Trained Hallucinator

## Semantic Segmentation
### Few-Shot Segmentation
* A large number of image categories come with pixel-wise ground-truth labels, while a small number of them have only limited annotations.
* A **shared backbone** produces feature maps for both **support** and **query** images.
* **Prototypes** for each class are obtained by **masked pooling** over the support feature maps.
* Query feature maps are then compared with the pooled prototypes **pixel-by-pixel**.
* Typically, **cosine similarity** is adopted for pixel-wise feature comparison.
#### 1-Way 1-Shot
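The masked-pooling and pixel-wise cosine comparison above, in a toy 1-way 1-shot setting (the 2-channel 2×2 feature maps and the mask are illustrative):

```python
import numpy as np

def masked_pool(feat, mask):
    """Class prototype via masked average pooling over a support
    feature map. feat: (C, H, W); mask: (H, W) binary object mask."""
    return (feat * mask).sum(axis=(1, 2)) / mask.sum()

def cosine_map(query_feat, proto, eps=1e-8):
    """Pixel-wise cosine similarity between a query feature map
    (C, H, W) and a class prototype (C,)."""
    qn = np.linalg.norm(query_feat, axis=0) + eps
    pn = np.linalg.norm(proto) + eps
    return (query_feat * proto[:, None, None]).sum(axis=0) / (qn * pn)

# Toy example: 2 channels, 2x2 feature maps.
support = np.zeros((2, 2, 2)); support[0, 0, :] = 1.0   # object activates channel 0
mask = np.array([[1., 1.], [0., 0.]])                   # object occupies the top row
proto = masked_pool(support, mask)
query = np.zeros((2, 2, 2)); query[0, :, 0] = 1.0       # left column matches the class
sim = cosine_map(query, proto)                          # high where the class appears
```

Thresholding (or softmaxing) the similarity map yields the predicted segmentation mask for the novel class.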

## Object Detection

### Few-Shot Object Detection with Attention-RPN & Multi-Relation Detector
* Possible solution: meta-learning + object detection
* Network architecture (applicable to the N-way K-shot setting)
* See the following 1-way 1-shot object detection for example:

### Frustratingly Simple Few-Shot Object Detection
Fine-tuning-based methods can also achieve good results on few-shot object detection, often outperforming many meta-learning-based methods.

## Domain Adaptation
### Strategy of Episodic Training

Step 1:

Step 2:

## Challenges & Opportunities in Small-Data Problems


### More Opportunities in Small-Data Problems
