Useful Papers
===
[TOC]
# Urgent
- Few Shot Network Compression via Cross Distillation
- [code-official (PyTorch)](https://github.com/haolibai/Cross-Distillation)
- reworks the layer-wise distillation loss
- into a correction loss and an imitation loss
- Adversarial Complementary Learning for Weakly Supervised Object Localization. CVPR 2018
- complementary classifiers
- Borrowing treasures from the wealthy: Deep transfer learning through selective joint fine-tuning. In CVPR, 2017.
- Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification. CVPR 2016
- [Chinese notes](https://blog.csdn.net/He_is_all/article/details/55522302)
- Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets. ICCV 2019
- define new **data augmentation** rules according to the image transformations that the **current model is most vulnerable to**, over iterations (in terms of **N-tuples of image transformations**)
- show that **random search** and, in particular, **evolution-based search** are effective approaches to this problem
- Understanding how feature structure transfers in transfer learning. 2017, IJCAI
- this paper helps with understanding how features transfer
- seems fairly theoretical; revisit when there is time
- justified that feature structure can be transferred, **independently of the change of $P_{y|x}$** over domains
- discussed how feature structure can be transferred in **domain adaptation** and **learning to learn** settings from a regularization perspective.
- implies that a **tuning parameter is necessary** to help transfer feature structure information from the source domain to the target domain.
- presumably this means fine-tuning is necessary?
- Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective. 2017
- https://arxiv.org/pdf/1705.04396.pdf
- Knowledge Transfer in Vision Recognition: A Survey
- https://hal.archives-ouvertes.fr/hal-02101005/document
- Characterizing and Avoiding Negative Transfer. CVPR 2019
- This paper proposes a formal definition of negative transfer and analyzes three important aspects thereof. Stemming from this analysis, a novel technique is proposed to circumvent negative transfer by **filtering out unrelated source data**. Based on **adversarial networks**, the technique is highly generic and can **be applied to a wide range of transfer learning algorithms**.
- Disjoint Label Space Transfer Learning with Common Factorised Space. AAAI'19
- Generalizing to unseen domains via adversarial data augmentation. NeurIPS 2018
- Instance Normalization
- appears to normalize all pixels within a single sample and a single channel (see the instance-norm sketch after this list)
- Adaptive Instance Normalization
- Adaptive batch normalization for practical domain adaptation. ICLR 2017 Workshop, *Pattern Recognition* 2018
- **AdaBN**
- same work as: Revisiting batch normalization for practical domain adaptation. arXiv 16
- [Chinese notes](https://zhuanlan.zhihu.com/p/56162416)
- [my note](https://hackmd.io/@johnnyasd12/ry7xlIA64)
- the **label related** knowledge is stored in the **weight matrix** of each layer, whereas **domain related** knowledge is represented by the **statistics of the Batch Normalization** (BN) layer (see the AdaBN sketch after this list)
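
A minimal sketch of the per-sample, per-channel normalization described in the Instance Normalization entry above (my own illustration, checked against PyTorch's built-in):

```python
import torch
import torch.nn.functional as F

def instance_norm(x, eps=1e-5):
    # x: (N, C, H, W); normalize every (sample, channel) slice independently,
    # unlike BatchNorm, which pools statistics across the batch dimension
    mean = x.mean(dim=(2, 3), keepdim=True)
    var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps)

x = torch.randn(8, 3, 32, 32)
assert torch.allclose(instance_norm(x), F.instance_norm(x), atol=1e-4)
```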
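
And a sketch of the AdaBN idea from the entry above: keep all learned weights, but recompute the BN running statistics on target-domain data. This is my reading of the paper, not the authors' code; `target_loader` is a hypothetical DataLoader yielding `(images, ...)` batches of unlabeled target data:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_statistics(model, target_loader, device="cuda"):
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.reset_running_stats()  # forget source-domain statistics
            m.momentum = None        # accumulate an exact running average
    model.train()  # BN layers only update running stats in train mode
    for x, *_ in target_loader:
        model(x.to(device))  # forward passes refresh each BN layer's mean/var
    model.eval()
```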
# Domain & Label Shift
- Decaf: A deep convolutional activation feature for generic visual recognition. ICML 2014
- Label Efficient Learning of Transferable Representations across Domains and Tasks. NIPS 2017 (Li Fei-Fei)
- Our method shows compelling results on **novel classes within a new domain** even when **only a few labeled examples per class** are available, outperforming the prevalent fine-tuning approach.
- see [Awesome Few-shot/Meta Learning](https://hackmd.io/2e6l5BmYS2ebhE__dwOcgQ?both#Label-Efficient-Learning-of-Transferable-Representations-across-Domains-and-Tasks-NIPS-2017-Li-Fei-Fei)
- Learning to cluster in order to transfer across domains and tasks. ICLR 2018
- **cross domain + label**
- This paper introduces a novel method to perform transfer learning across domains and tasks, formulating it as a problem of **learning to cluster**.
- The key insight is that, in addition to features, we can transfer **similarity information** and this is sufficient to learn a **similarity function** and **clustering network** to perform both domain adaptation and cross-task transfer learning.
- We begin by reducing categorical information to **pairwise** constraints, which only consider whether two instances **belong to the same class or not (similarity network)**; a sketch of one such pairwise loss appears after this list
- We then present two novel approaches for performing transfer learning **using this similarity function**
1. for unsupervised domain adaptation, we design a new loss function to **regularize classification** with a **constrained clustering loss**, hence learning a **clustering network** with the transferred similarity metric **generating the training targets** (as far as I can tell)
2. for cross-task learning (i.e., unsupervised clustering with unseen categories), we propose a framework to reconstruct and estimate the number of semantic clusters, **again** using the **clustering network**.
- Since the similarity network is noisy, the key is to use a **robust clustering** algorithm.
- Our results show that we can reconstruct semantic clusters with high accuracy.
- Our approach doesn’t explicitly deal with domain discrepancy. **If we combine with a domain adaptation loss, it shows further improvement**.
- Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift. CVPR 2018
- **cross domain + label**
- **multi-source domain**
- Split Batch Normalization: Improving Semi-Supervised Learning under Domain Shift. ICLR 2019 Workshop LLD
- Recent work has shown that using **unlabeled data** in semi-supervised learning is **not always beneficial** and can even hurt generalization, especially when there is a class mismatch between the unlabeled and labeled examples.
- Our main contribution is showing how to benefit from additional **unlabeled data that comes from a shifted distribution** in **batch-normalized** neural networks.
- We achieve this by simply using **separate batch normalization statistics for unlabeled examples** (see the Split-BN sketch after this list).
- Transfer Learning via Learning to Transfer. ICML 2018
- this one is a bit difficult
- Partial Adversarial Domain Adaptation. ECCV 2018
- worth consulting for how the source classes unrelated to the target are handled (see also the Partial Domain Adaptation section below)
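
For the "Learning to cluster" entry above, a sketch of one common pairwise-similarity objective: if $p_a$ and $p_b$ are softmax cluster assignments, $p_a \cdot p_b$ is the probability that the two samples land in the same cluster, trained with binary cross-entropy against the pairwise label. This is my illustration of the general idea, not necessarily the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def pairwise_cluster_loss(logits_a, logits_b, same_class, eps=1e-7):
    # same_class: 1 if the pair is (pseudo-)labeled as same class, else 0
    p_a = logits_a.softmax(dim=1)
    p_b = logits_b.softmax(dim=1)
    p_same = (p_a * p_b).sum(dim=1).clamp(eps, 1 - eps)
    return F.binary_cross_entropy(p_same, same_class.float())
```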
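
And a minimal sketch of the Split-BN entry: route labeled and unlabeled batches through separate BN statistics. In this simplified version each branch also keeps its own affine parameters; the paper may share them:

```python
import torch.nn as nn

class SplitBatchNorm2d(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.bn_labeled = nn.BatchNorm2d(num_features)
        self.bn_unlabeled = nn.BatchNorm2d(num_features)

    def forward(self, x, unlabeled=False):
        # unlabeled examples never contaminate the labeled statistics
        return self.bn_unlabeled(x) if unlabeled else self.bn_labeled(x)
```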
# Remove Bias in Dataset
- Unbiased look at dataset bias. CVPR 2011
- REPAIR: Removing Representation Bias by Dataset Resampling. CVPR'19
- [code - official(PyTorch)](https://github.com/JerryYLi/Dataset-REPAIR/)
# Useful
- Making Convolutional Networks Shift-Invariant Again. ICML 2019
- SpotTune: Transfer Learning through Adaptive Fine-tuning. CVPR'19
- Multi-class Classification without Multi-class Labels. ICLR'19
- pairwise similarity
- M-ADDA: Unsupervised Domain Adaptation with Deep Metric Learning. arXiv 1807
- Improving generalization via scalable neighborhood component analysis. ECCV 2018
- supervised domain adaptation
- Cross-entropy adversarial view adaptation for person re-identification. IEEE Transactions on Circuits and Systems for Video Technology
- Learning what you can do before doing anything. ICLR'19
- Learning to remember more with less memorization. ICLR'19
- Unsupervised Domain Adaptation for Distance Metric Learning. ICLR'19
- Divide and Conquer the Embedding Space for Metric Learning. CVPR'19
- AutoAugment: Learning Augmentation Strategies from Data. CVPR'19
- AutoDIAL: Automatic Domain Alignment Layers. ICCV 2017
# Conditional VAE
- Learning structured output representation using deep conditional generative models. NIPS 2015
# VAE-GAN
- Adversarial feature learning. 2016
- Autoencoding beyond pixels using a learned similarity metric. ICML 2016
- Adversarial variational bayes: Unifying variational autoencoders and generative adversarial networks. ICML 2017
- CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training. ICCV 2017
- conditional VAE-GAN
# Disentangled Feature Learning
- CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training. ICCV 2017
- InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. NIPS 2016
- Conditional image synthesis with auxiliary classifier GANs. ICML 2017
- AC-GAN
- Independently Controllable Factors. arXiv 1708
- Multi-Task Adversarial Network for Disentangled Feature Learning. CVPR 2018
- trains a network to extract only task-relevant features, thereby improving generalization and transferability for that task
- Multi-Level Variational Autoencoder: Learning Disentangled Representations from Grouped Observations. AAAI 2018
- Domain Agnostic Learning with Disentangled Representations. ICML'19
- single source & multi-target domain adaptation

- beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. ICLR 2017
- Isolating Sources of Disentanglement in Variational Autoencoders. NeurIPS 2018
- A two-step disentanglement method. CVPR 2018
- Towards open-set identity preserving face synthesis. CVPR 2018
- requires at least two inputs for training
- Neural face editing with intrinsic image disentangling. CVPR 2017
- Disentangling by Factorising. ICML 2018
- factor-VAE
- [code - official (PyTorch)](https://github.com/1Konny/FactorVAE)
- Disentangling Latent Space for VAE by Label Relevant/Irrelevant Dimensions. CVPR 2019
- [code (under-developed?) - official (TF)](https://github.com/ZhilZheng/Lr-LiVAE)
- different from cVAE, we present a method for disentangling the latent space into the **label relevant** and **irrelevant** dimensions $z_s$ and $z_u$
- $z_u$ represents the **common characteristics** of all inputs, hence it is constrained by the **standard Gaussian**
- $z_s$ is assumed to follow the **Gaussian mixture distribution** in which each component corresponds to a particular class (see the KL sketch after this list)
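
A sketch of the two KL regularizers implied by the Lr-LiVAE entry above: $z_u$ is pulled toward a standard Gaussian, while $z_s$ is pulled toward the Gaussian component of its class. This is a hypothetical form for diagonal Gaussians; the paper's exact objective may differ:

```python
import torch

def kl_standard_normal(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, I) ) -- the usual VAE term, used for z_u
    return 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1)

def kl_class_gaussian(mu, logvar, mu_c, logvar_c):
    # KL( N(mu, sigma^2) || N(mu_c, sigma_c^2) ) -- pull z_s toward the
    # mixture component (mu_c, sigma_c) of the sample's class
    return 0.5 * ((logvar_c - logvar)
                  + (logvar.exp() + (mu - mu_c).pow(2)) / logvar_c.exp()
                  - 1).sum(dim=1)
```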
# Open-set Domain Adaptation
- Universal Domain Adaptation. CVPR 2019
- Open Set Domain Adaptation. ICCV 2017
- [Chinese notes](https://zhuanlan.zhihu.com/p/31230331)
- [code - official (MatLab)](https://github.com/Heliot7/open-set-da)
- Learning Factorized Representations for Open-set Domain Adaptation. ICLR'19
# Partial Domain Adaptation / Partial Transfer Learning
- $Y_T\subset Y_S$
- Partial Adversarial Domain Adaptation. ECCV 2018
- based on DANN
- (1) Mitigate negative transfer by filtering out unrelated source labeled data belonging to the outlier label space $C_s \backslash C_t$ (i.e., down-weight source labeled data in the outlier label space, which reduces negative transfer)
- (2) Promote positive transfer by maximally matching the data distributions $p_{C_t}$ and $q$ in the shared label space $C_t$ (i.e., shrink the source-target distribution gap over the shared labels)
- feed target-domain data to the source classifier and take the mean of the output probabilities
- $\gamma = \dfrac{1}{n_t}\sum\limits_{i=1}^{n_t}\hat y_i$
- then re-weight source examples by $\gamma$ (a sketch appears after this list)
- original DANN: every source example contributes equally to the classification and domain-adversarial losses
- proposed PADA: source example $x_i$'s contribution to both losses is weighted by $\gamma_{y_i}$, where $y_i$ is the ground-truth label of source data $x_i$ and $\gamma_{y_i}$ is the corresponding class weight
- Learning to Transfer Examples for Partial Domain Adaptation. CVPR'19
- Partial transfer learning with selective adversarial networks. 2017
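
A sketch of the PADA class weighting described above: average the source classifier's softmax outputs over unlabeled target data to get $\gamma$, then down-weight each source example by the weight of its class. This is my reading of the paper; `target_loader` is a hypothetical DataLoader yielding `(images, ...)` batches of unlabeled target data:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def pada_class_weights(model, target_loader, num_classes, device="cuda"):
    gamma = torch.zeros(num_classes, device=device)
    n = 0
    for x, *_ in target_loader:
        probs = model(x.to(device)).softmax(dim=1)
        gamma += probs.sum(dim=0)  # outlier source classes receive tiny mass
        n += probs.size(0)
    gamma /= n
    return gamma / gamma.max()  # normalize so the largest class weight is 1

def weighted_source_loss(logits, y, gamma):
    # per-example cross-entropy, scaled by the weight of its ground-truth class
    return (gamma[y] * F.cross_entropy(logits, y, reduction="none")).mean()
```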
# Explainable / visualizing NN
# Domain Generalization
- Generalizing Across Domains via Cross-Gradient Training. ICLR 2018
- Episodic Training for Domain Generalization. ICCV'19 (oral)
- Metareg: Towards domain generalization using meta-regularization. NeurIPS 2018
- seems to use the same label space for source and target; focuses on the **domain generalization** problem
# Domain Adaptation
- Adversarial Dropout Regularization. ICLR 2018
- to read
- Bridging Theory and Algorithm for Domain Adaptation. ICML'19
- Boosting for transfer learning. ICML 2007. (the well-known TrAdaBoost)
- [Awesome domain adaptation](https://github.com/zhaoxin94/awsome-domain-adaptation)
- Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment. ICLR 2019 Workshop LLD
- **only cross domain & label "distribution"**
- seems not for label space shift
- **Recently-proposed domain-adversarial approaches** align source and target encodings; they can **break down under shifting label distributions**.
- contributions
- We propose **asymmetrically-relaxed distribution alignment**, a **relaxed distance** for aligning data across domains that can be minimized **without requiring latent-space distributions to match exactly**.
- We propose **several distances** that satisfy the desired properties and are **optimizable by adversarial training**.
- Regularized Learning for Domain Adaptation under Label Shifts. ICLR'19
- seems still **only cross domain + label "distribution"**
- Transferable meta learning across domains. UAI 2018
- seems to use unlabeled data from the target domain
- MAML + DANN?
- like the one above, this is not really few-shot; **source and target share the same label space**
- [my paper note](https://hackmd.io/yh6uPnEwQzOfuvYOynB06Q)
- Meta-learning algorithms require **sufficient tasks** for meta-model training, and the resulting model can **only solve new similar tasks**.
- to address these two problems, we propose a new **transferable meta learning (TML)** algorithm
- Bidirectional One-Shot Unsupervised Domain Mapping. ICCV 2019
- one encoder and one decoder for **each domain**
- domain $A$: single training sample (per class???)
- domain $B$: richer training set
- For example, we can transfer all MNIST images to the visual domain captured by a single SVHN image and transform the SVHN image to the domain of the MNIST images.
- seems a one-shot domain adaptation problem...
- A DIRT-T Approach to Unsupervised Domain Adaptation. ICLR 2018
- Virtual Adversarial Domain Adaptation (VADA)
- using Virtual Adversarial Training (VAT)
- Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation. CVPR 2019 Oral
- Consensus Adversarial Domain Adaptation. AAAI 19
- few-shot domain adaptation scheme (F-CADA)
- just **domain adaptation with few labeled data**
- gives freedom to both **target encoder** and **source encoder** to get domain-invariant features.
- **After obtaining** a **source encoder** and a **source classifier** as a good reference in the source domain, CADA **trains a target encoder** and also gives freedom to the **source encoder by fine-tuning** it through **adversarial learning**.
- my take
- not much novelty...
- the experiments are not compared to SOTA
- Few-shot adversarial domain adaptation. NIPS 2017
- [Chinese notes](https://blog.csdn.net/Adupanfei/article/details/85164925)
- [Chinese notes 2](https://www.twblogs.net/a/5c1f39d2bd9eee16b3da81c0)
- [code (PyTorch)](https://github.com/Coolnesss/fada-pytorch)
- [My paper note](https://hackmd.io/8H_J9XauQgWrGfkLz88dKQ?view)
- supervised domain adaptation
- does not really focus on few-shot learning
<!--
- First pre-train a variational auto-encoder (VAE) on the source task, then copy it to the target task. The two tasks share some layers; the target task can only update its task-specific layers, while the source task can update both the shared layers and its own task-specific layers.
-->
- Transferrable Prototypical Networks for Unsupervised Domain Adaptation. CVPR'19
- plain domain adaptation rather than few-shot; the method just borrows ProtoNet from few-shot learning
- Maximum classifier discrepancy for unsupervised domain adaptation. CVPR 2018
- bounds the domain discrepancy by the disagreement between two classifiers' predictions and progressively shrinks it (see the discrepancy sketch at the end of this section)
- [Chinese notes](https://zhuanlan.zhihu.com/p/57083034)
- [code - official (PyTorch)](https://github.com/mil-tokyo/MCD_DA)
- Transferable Attention for Domain Adaptation. AAAI 2019
- [Chinese notes](https://zhuanlan.zhihu.com/p/52591143)
- Deep coral: Correlation alignment for deep domain adaptation. ECCV 2016
- Simultaneous deep transfer across domains and tasks. ICCV 2015
- claims to transfer across tasks, but source and target actually share the same label space. **Whoever wrote that title has real journalistic potential.**
- the so-called transfer across tasks actually means: average the predicted probabilities of each source class to obtain inter-class information, then regularize the target domain's labeled data with a Knowledge Distillation soft-label loss (see the soft-label sketch at the end of this section)
- Improving the Generalization of Adversarial Training with Domain Adaptation. ICLR'19
- Learning to generalize: Meta-learning for domain generalization. AAAI 18
- d-SNE: Domain Adaptation using Stochastic Neighborhood Embedding. CVPR'19
- supervised domain adaptation with few labeled data
- Learning What and Where to Transfer. ICML'19
- [code - official (PyTorch)](https://github.com/alinlab/L2T-ww)
- [Chinese notes](https://zhuanlan.zhihu.com/p/66130006)
- When Samples Are Strategically Selected. ICML'19
- could not make sense of the abstract 0.0
- Depthwise Convolution is All You Need for Learning Multiple Visual Domains. AAAI'19
- Domain Agnostic Real-Valued Specificity Prediction. AAAI'19
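
For the "Maximum classifier discrepancy" entry above, a sketch of the discrepancy term: the L1 distance between the two classifiers' probability outputs on target data, which the classifiers maximize and the feature generator minimizes (my illustration based on the paper's description):

```python
import torch

def classifier_discrepancy(logits_1, logits_2):
    p1 = logits_1.softmax(dim=1)
    p2 = logits_2.softmax(dim=1)
    # mean absolute disagreement between the two classifiers' predictions
    return (p1 - p2).abs().mean()
```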
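
And for the "Simultaneous deep transfer" entry, a sketch of the soft-label regularizer as I read it: per-class averages of softened source predictions serve as soft targets for labeled target data. `class_soft_labels` is a hypothetical `(num_classes, num_classes)` tensor of those averages, and `T` is the softmax temperature:

```python
import torch
import torch.nn.functional as F

def soft_label_loss(target_logits, y, class_soft_labels, T=2.0):
    # cross-entropy between the softened target prediction and the average
    # softened source prediction of the sample's ground-truth class
    log_p = F.log_softmax(target_logits / T, dim=1)
    return -(class_soft_labels[y] * log_p).sum(dim=1).mean()
```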
# Zero-shot Learning
- [Awesome Zero-shot Learning](https://github.com/chichilicious/awesome-zero-shot-learning)
- Zero-Shot Deep Domain Adaptation. ECCV 2018
- To the best of our knowledge, ZDDA is the first domain adaptation and sensor fusion method which **requires no task-relevant target-domain data**.
- Preserving Semantic Relations for Zero-Shot Learning. CVPR 2018
- Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition. ECCV 2018
- Zero-Shot Task Transfer. CVPR 2019
# Datasets
- A Large-scale Attribute Dataset for Zero-shot Learning. 2018
- Meta-dataset: A dataset of datasets for learning to learn from few examples. arXiv 2019
- other datasets
1. Market-1501: 750 classes, 17.2 images per class on average
1. CUHK03: 1,367 classes, 9.6 images per class on average
1. DukeMTMC-reID: 702 classes, 23.5 images per class on average
1. CUBird: 200 classes, 29.97 images per class on average
# Interesting
- Rethinking feature distribution for loss functions in image classification. CVPR 2018
- proposed Gaussian Mixture Cross Entropy
- Do better imagenet models transfer better? CVPR 2019
- Understanding Deep Learning Requires Rethinking Generalization. arXiv 1612
- Google
- [如何评价 ICLR 2017 中关于 Rethinking Generalization 的那篇文章?](https://www.zhihu.com/question/56151007)
- self-supervised learning
- Adaptive Softmax
- Accurate, large minibatch sgd: Training imagenet in 1 hour
- Squeeze-and-Excitation Networks. CVPR 2018
- Attention is all you need (Transformer)
- [Chinese notes](https://zhuanlan.zhihu.com/p/47282410)
- Image Transformer. ICML 2018
- Hidden Technical Debt in Machine Learning Systems
- zero shot imitation learning. ICLR 2018
- spectral normalization
- spectral clustering
- Wide-ResidualNet
- How Important is a Neuron. ICLR'19
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer. ICLR'19
- All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification. CVPR'19
- SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks. CVPR'19
- (single-object tracking)
- FML: Face Model Learning from Videos. CVPR'19
- arithmetic in CNN
- bag of tricks for CNN
###### tags: `papers` `domain adaptation` `transfer learning`