
Notes on "Universal Domain Adaptation through Self-Supervision" [Paper]

tags: notes unsupervised domain-adaptation

Notes Author: Rohit Lal


Brief Outline

  • Unsupervised domain adaptation methods traditionally assume that all source categories are present in the target domain. In practice, little may be known about the category overlap between the two domains.
  • The authors propose a universally applicable domain adaptation framework that can handle arbitrary category shift, called Domain Adaptative Neighborhood Clustering via Entropy optimization (DANCE).

Introduction


  • DANCE combines two novel ideas:
    1. Since we cannot fully rely on source categories to learn features discriminative for the target, the authors propose a novel neighborhood clustering technique to learn the structure of the target domain in a self-supervised way.
    2. The authors use entropy-based feature alignment and rejection to align target features with the source, or to reject them as “unknown” categories, based on their entropy.
  • Rather than relying only on the supervision of source categories to learn a discriminative representation, DANCE harnesses the cluster structure of the target domain using self-supervision.

Methodology


  • The task is universal domain adaptation: given a labeled source domain $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{N_s}$ with “known” categories $L_s$, and an unlabeled target domain $\mathcal{D}_t = \{x_i^t\}_{i=1}^{N_t}$ which contains all or some of the “known” categories and possibly “unknown” categories.
  • The goal is to label each target sample with either one of the $L_s$ labels or the “unknown” label.


Network Architecture

Neighborhood Clustering (NC)

  • The authors propose to minimize the entropy of each target point’s similarity distribution to other target samples and to class prototypes. To minimize this entropy, the point moves closer to a nearby point (assuming a neighbor exists) or to a prototype; a minimal sketch of this loss is given below.
  • As a result, points from similar classes cluster together. See Fig. 1 in the paper.
  • $p$ denotes the output distribution of the network.

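A minimal PyTorch-style sketch of the neighborhood-clustering loss described above: the entropy of each target feature’s similarity distribution over a memory bank of target features and the class prototypes. The temperature `tau`, the use of classifier weights as prototypes, and the memory-bank handling are assumptions for illustration, not details quoted from the paper.

```python
import torch
import torch.nn.functional as F

def neighborhood_clustering_loss(feat_t, memory_bank, prototypes, tau=0.05):
    """Entropy of each target point's similarity distribution over all stored
    target features (memory bank) and the class prototypes.

    feat_t:      (B, D) features of the current target mini-batch
    memory_bank: (N_t, D) stored features of all target samples
    prototypes:  (K, D) class prototypes (e.g. classifier weight vectors)
    Self-similarity masking and memory-bank updates are omitted for brevity.
    """
    feat_t = F.normalize(feat_t, dim=1)
    candidates = F.normalize(torch.cat([memory_bank, prototypes], dim=0), dim=1)
    sims = feat_t @ candidates.t() / tau          # (B, N_t + K) scaled similarities
    p = F.softmax(sims, dim=1)                    # similarity distribution per point
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1)
    return entropy.mean()
```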

Entropy Separation loss (ES)

  • The neighborhood clustering loss encourages the target samples to become well-clustered, but we still need to align some of them with “known” source categories while keeping the “unknown” target samples far from the source.
  • “Unknown” target samples are likely to have larger entropy in the source classifier’s output than “known” target samples, because they do not share common features with the “known” source classes.
  • The distance between the entropy and a threshold boundary $\rho$ is defined as $|H(p) - \rho|$, where $p$ is the classification output for a target sample. By maximizing this distance, we push $H(p)$ away from $\rho$. Introducing a confidence threshold $m$ applies the separation loss only to confident samples. Final loss function (a reconstruction is sketched below):
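A hedged reconstruction of the per-sample separation term implied by the description above; the exact formulation and averaging used in the paper may differ slightly:

$$
L_{es}(p) =
\begin{cases}
-\,|H(p) - \rho| & \text{if } |H(p) - \rho| > m \\
0 & \text{otherwise,}
\end{cases}
$$

averaged over the target samples in the mini-batch.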

Training with Domain Specific Batch Normalization

  • The batch normalization layer whitens the feature activations, which contributes to a performance gain.
  • Simply splitting source and target samples into different mini-batches and forwarding them separately helps alignment.
  • Final objective:
    $L = L_{cls} + \lambda \,(L_{nc} + L_{es})$
    where $L_{cls}$ is the cross-entropy loss on source samples.
  • The loss on source and the loss on target are calculated in separate mini-batches to achieve domain-specific batch normalization; a training-step sketch follows below.
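A minimal PyTorch-style sketch of one training step with the objective above, forwarding source and target in separate mini-batches so that BatchNorm statistics stay domain-specific. The function and argument names, the stand-in `tgt_losses` callable, and `lam=0.05` are illustrative assumptions, not values or APIs taken from the paper.

```python
import torch
import torch.nn.functional as F

def train_step(model, classifier, src_x, src_y, tgt_x, tgt_losses, optimizer, lam=0.05):
    """One update with L = L_cls + lam * (L_nc + L_es).

    Source and target images go through the network in separate forward passes,
    so the BatchNorm layers compute statistics per domain.
    `tgt_losses(feat_t, logits_t)` is a stand-in for L_nc + L_es.
    """
    optimizer.zero_grad()

    # Source mini-batch: supervised cross-entropy (uses source BN statistics).
    feat_s = model(src_x)
    loss_cls = F.cross_entropy(classifier(feat_s), src_y)

    # Target mini-batch: self-supervised / entropy losses (uses target BN statistics).
    feat_t = model(tgt_x)
    loss_tgt = tgt_losses(feat_t, classifier(feat_t))

    loss = loss_cls + lam * loss_tgt
    loss.backward()
    optimizer.step()
    return loss.item()
```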