Akshay Kulkarni
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Versions and GitHub Sync Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
Invitee
Publish Note

Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

Your note will be visible on your profile and discoverable by anyone.
Your note is now live.
This note is visible on your profile and discoverable online.
Everyone on the web can find and read all notes of this public team.
See published notes
Unpublish note
Please check the box to agree to the Community Guidelines.
View profile
Engagement control
Commenting
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
  • Everyone
Suggest edit
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
Emoji Reply
Enable
Import from Dropbox Google Drive Gist Clipboard
   owned this note    owned this note      
Published Linked with GitHub
Subscribed
  • Any changes
    Be notified of any changes
  • Mention me
    Be notified of mention me
  • Unsubscribe
Subscribe
# Notes on "[Variational Adversarial Active Learning](https://arxiv.org/abs/1904.00370)" ###### tags: `notes` `adversarial` `variational` Author: [Akshay Kulkarni](https://akshayk07.weebly.com/) Note: For proper understanding, the knowledge of Variational AutoEncoders (VAE) is highly recommended. [This post](https://lilianweng.github.io/lil-log/2018/08/12/from-autoencoder-to-beta-vae.html) provides a very good and in-depth explanation. ## Brief Outline - Active learning algorithms attempt to incrementally select samples for annotation that result in high classification performance with low labelling cost. - This paper introduces a pool-based active learning strategy which learns a low dimensional latent space from labeled and unlabeled data using a VAE. ## Introduction - This method (VAAL), selects instances for labeling from the unlabeled pool that are sufficiently different in the latent space learned by the VAE, to maximize the performance of the representation learned on the newly labeled data. - Sample selection in VAAL is performed by an adversarial network which classifies which pool the instances belong to (labeled or unlabeled) and does not depend on the task or tasks for which are trying to collect labels. - The VAE and the discriminator are framed as a two-player mini-max game, similar to GANs, such that the VAE learns a feature space to trick the adversarial network into predicting that all datapoints, from both the labeled and unlabeled sets, are from the labeled pool while the discriminator network learns how to discriminate between them. - The intuition is that once the active learner is trained, the probability associated with discriminator’s prediction effectively estimates how representative that sample is from the pool that it has been deemed to be from. ## Related Work ### Active Learning - Current active learning techniques are broadly of 2 types: - *Query-acquiring (Pool-based)*: Use different sampling strategies to determine how to select the most informative samples. References to read - [Mahapatra et. al. 2018](https://arxiv.org/abs/1806.05473), [Mayer and Timofte, 2018](https://arxiv.org/abs/1808.06671), and [Zhu and Bento, 2018](https://arxiv.org/abs/1702.07956v5). - *Query-synthesizing*: Use generative models to generate informative samples. - Pool-based learning techniques are of 3 types: - *Uncertainty-based methods* - *Representation-based methods* - Combination of the two - A review on these is given by [Settle, 2012](https://www.morganclaypool.com/doi/abs/10.2200/S00429ED1V01Y201207AIM018). ### Variational AutoEncoder (VAE) and Adversarial Learning - A VAE ([Kingma and Welling, 2013](https://arxiv.org/abs/1312.6114)) is a latent variable model that follows an encoder decoder architecture which places a prior distribution on the feature space distribution and uses an Expected Lower Bound to optimize the learnt posterior. - Adversarial Autoencoders ([Makhzani et. al. 2015](https://arxiv.org/abs/1511.05644)) minimize the adversarial loss in the latent space between a sample from the prior and the posterior distribution. - The use of an adversarial network ([Goodfellow et. al. 2014](https://arxiv.org/abs/1406.2661)) enables training the model by solving a mini-max optimization problem. ## Methodology ![VAAL](https://i.imgur.com/lidR4TN.png) - $(x_L, y_L)$ = a sample pair from a pool of labeled data $(X_L, Y_L)$. - $x_U$ = a sample from a much larger pool of unlabeled data $X_U$. - The active learner iteratively queries a fixed sampling *budget*, $b$ number of the most informative samples from the unlabeled pool ($X_U$), using an aquisition function to be annotated by the oracle such that the expected loss is minimized. ### Transductive Representation Learning ($\beta$-VAE) - They use a $\beta$-VAE for representation learning. The encoder learns a low dimensional space for the underlying distribution using a Gaussian prior, while the decoder reconstructs the input. - To capture the features missing in the representation learned on the labeled pool, we can use the unlabeled data and perform *transductive learning*. - Note: Transductive learning is just another name for semi-supervised learning (learning some property from unlabeled data which will help you learn for your main task using labeled data). Similarly, supervised learning is called *inductive learning*. - The objective function of the $\beta$-VAE is minimizing the variational lower bound on the marginal likelihood of a given sample formulated as $$ \mathcal{L}_{VAE}^{trd} = \mathbb{E}[\log p_\theta(x_L|z_L)] - \beta \mathrm{D}_{KL}(q_\phi(z_L|x_L)||p(z)) $$ $$ +\mathbb{E}[\log p_\theta(x_U|z_U)] - \beta \mathrm{D}_{KL}(q_\phi(z_U|x_U)||p(z)) \tag{1} $$ - Here, $q_\phi$ and $p_\theta$ are the encoder and decoder parametrized by $\phi$ and $\theta$ respectively. $p(z)$ is the prior chosen as a unit Gaussian and $\beta$ is the Lagrangian parameter chosen for the optimization problem. - For a better understanding of this loss function (and why it works), refer to [the VAE post mentioned at the beginning](https://lilianweng.github.io/lil-log/2018/08/12/from-autoencoder-to-beta-vae.html). It is sufficient, but if you want, you can look at the original paper ([Kingma and Welling, 2013](https://arxiv.org/abs/1312.6114)). - The reparametrization trick ([Kingma and Welling, 2013](https://arxiv.org/abs/1312.6114) but also explained in [the blog post](https://lilianweng.github.io/lil-log/2018/08/12/from-autoencoder-to-beta-vae.html)) is used to make the VAE trainable. ### Adversarial Representation Learning - An ideal active learning agent is assumed to have a perfect sampling strategy that is capable of sending the most *informative* unlabeled data to the oracle. - Most of the sampling strategies rely on the model's uncertainty i.e. the more uncertain the prediction, the more informative that specific unlabeled sample must be. However, this introduces vulnerability to outliers. - In contrast, VAAL uses a discriminator (like in GANs) to map the latent representation of $z_L \cup z_U$ to a binary label (which is 1 if sample belongs to $X_L$ and is 0 otherwise). - The VAE and discriminator are learned together in an adversarial fashion. - While the VAE maps the labeled and unlabeled data into the same latent space with similar probability distribution $q_\phi(z_L|x_L)$ and $q_\phi(z_U|x_U)$, it fools the discriminator to classify all inputs as labeled. - On the other hand, the discriminator attempts to effectively estimate the probability that the data comes from the unlabeled set. - The objective function for the adversarial role of the VAE can be formulated as a Binary Cross-Entropy Loss as follows $$ \mathcal{L}_{VAE}^{adv} = -\mathbb{E}[\log (D(q_\phi(z_L|x_L)))] - \mathbb{E}[\log (D(q_\phi(z_U|x_U)))] \tag{2} $$ - The objective function to train the discriminator is given as $$ \mathcal{L}_D = -\mathbb{E}[\log (D(q_\phi(z_L|x_L)))] - \mathbb{E}[\log (1-D(q_\phi(z_U|x_U)))] \tag{3} $$ - By combining Eq. 1 and Eq. 2, we get the full objective function for the VAE for VAAL as $$ \mathcal{L}_{VAE} = \lambda_1 \mathcal{L}_{VAE}^{trd} + \lambda_2 \mathcal{L}_{VAE}^{adv} \tag{4} $$ - Here, $\lambda_1$ and $\lambda_2$ are hyperparameters that determine the effect of each component to learn an effective variational representation. ![VAAL Algorithm](https://i.imgur.com/mkF12wk.png) - The task module (T in the first figure) learns the task for which the active learner is being trained. T is trained separately from the active learner as they don't depend on each other. ### Noisy Oracles - Note: Oracles are just sources of labels (they maybe humans or already available information online, etc.). - The labels provided by the oracles might vary in how accurate they are, depending on the quality of available human resources. - They consider 2 types of oracles: - ideal oracle, which always provides correct labels for the active learner. - noisy oracle, which non-adversarially provides erroneous labels for some specific classes. - This noise might occur in practical cases due to similarities across some classes causing ambiguity for the labeler. So, for a realistic oracle, they apply a targeted noise on visually similar classes. - Note: The implementation of the noisy oracle is detailed in Section 5.2 of the paper. ### Sampling Strategy - Sampling strategy is shown below ![Sampling Strategy VAAL](https://i.imgur.com/ae8flh7.png) - The probability associated with the discriminator's predictions is used as a score to collect $b$ number of samples in every batch predicted as 'unlabeled' with the lowest confidence to be sent to the oracle. - Note that the closer the $D$ output probability is to zero, the more likely it is that it comes from the unlabeled pool. - The key idea in this approach is that instead of relying on the performance of the training algorithm on the main task (which may be unreliable at the beginning), samples are selected based on their representativeness w.r.t. other samples which the $D$ thinks belong to the unlabeled pool. ## Analysis of VAAL ### Ablation Study Note: Ablation study is basically taking your system apart, and analyzing what each part does. You'll get it more clearly when you see what these people have done. The variants of ablation considered are: - Eliminating VAE - This explores the role of the VAE as the representation learner by having only a discriminator trained (to discriminate between labeled and unlabeled pool). - This results in $D$ only memorizing the data and yields the lowest performance. - It reveals the key role of the VAE in not only learning a rich latent space but also playing an effective mini-max game with $D$ to avoid overfitting. - Frozen VAE with $D$ - In this, they add a frozen VAE (not trainable) to the previous setting. Thus, they explore the VAE's role as an autoencoder. - This performs better than having only the $D$ trained, but performs similar or worse than random sampling suggesting that the $D$ failed to learn the representativeness of the samples in the unlabeled pool. - Eliminating $D$ - This explores the role of $D$ by training only a VAE that uses a 2-Wasserstein distance from the cluster centroid of the labeled dataset as a heuristic to explicitly measure uncertainty. - For a multivariate isotropic Gaussian distribution, the closed form solution ([Givens and Shortt, 1984](https://scinapse.io/papers/2040104067) - worth going through?) of the 2-Wasserstein distance between 2 probability distributions can be written as: $$ W_{ij} = [||\mu_i - \mu_j||_2^2 + ||\Sigma_i^{\frac{1}{2}} - \Sigma_j^{\frac{1}{2}}||_\mathcal{F}^2] \tag{5} $$ - Here, $||.||_\mathcal{F}$ represents the Frobenius norm, $\mu_i$, $\Sigma_i$ denote the mean and variance predicted by the encoder, and $\mu_j$, $\Sigma_j$ are the mean and variance for the normal distribution over the labeled data from which the latent variable $z$ is generated. - In this, there is an improvement over random sampling which shows the effect of explicitly measuring the uncertainty in the learned latent space. - However, VAAL outperforms all these scenarios by implicitly learning the uncertainty over the adversarial game between the VAE and the $D$. ### Robustness of VAAL #### Effect of biased initial labels - Intuitively, bias can affect the training such that it causes the initially labeled samples to not be representative of the underlying data distribution by being inadequate to cover most of the regions in the latent space. - They model bias in the labeled pool by not providing labels for $m$ chosen classes at random, and compare it to the case where samples are randomly selected from all classes. - They report their method to better or similar to other methods in these experiments. #### Effect of budget size - They repeat their experiments for 2 budget sizes. - Experiments with lower budget size perform better because a larger budget size results in adding redundant samples instead of more informative ones. #### Noisy vs Ideal oracle - CIFAR100 has 100 classes which are grouped into 20 super-classes. So, each image has a fine label (of the 100 classes) and a coarse label (of the 20 super-classes). - For the noise, they randomly change the ground truth labels of a subset of the dataset, but within the same super-class (which is meaningful, as such a mistake maybe incurred by human labelers due to ambiguity). - Since this method (VAAL) does not depend on the main task, the relative performance is comparable to the ideal oracle. - Also, as percentage of noisy labels increases, all the active learning strategies converge to random sampling (which is intuitive because incorrect labels bring some randomness). #### Choice of architecture - They repeat experiments with ResNet18 (earlier done with VGG16) and report that performance gaps between VAAL and other methods remains similar. ## Conclusion - This paper gives a new task agnostic active learning algorithm, VAAL, that learns a latent representation on both labeled and unlabeled data in an adversarial game between a VAE and a discriminator. - It implicitly learns the uncertainty for the samples deemed to be from the unlabeled pool. - They claim SOTA results in terms of accuracy and sampling time for image classification and semantic segmentation tasks. - They also show that VAAL is robust to noisy oracles and biased initial data. It also performs consistently well across different budget sizes.

Import from clipboard

Paste your markdown or webpage here...

Advanced permission required

Your current role can only read. Ask the system administrator to acquire write and comment permission.

This team is disabled

Sorry, this team is disabled. You can't edit this note.

This note is locked

Sorry, only owner can edit this note.

Reach the limit

Sorry, you've reached the max length this note can be.
Please reduce the content or divide it to more notes, thank you!

Import from Gist

Import from Snippet

or

Export to Snippet

Are you sure?

Do you really want to delete this note?
All users will lose their connection.

Create a note from template

Create a note from template

Oops...
This template has been removed or transferred.
Upgrade
All
  • All
  • Team
No template.

Create a template

Upgrade

Delete template

Do you really want to delete this template?
Turn this template into a regular note and keep its content, versions, and comments.

This page need refresh

You have an incompatible client version.
Refresh to update.
New version available!
See releases notes here
Refresh to enjoy new features.
Your user state has changed.
Refresh to load new user state.

Sign in

Forgot password

or

By clicking below, you agree to our terms of service.

Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
Wallet ( )
Connect another wallet

New to HackMD? Sign up

Help

  • English
  • 中文
  • Français
  • Deutsch
  • 日本語
  • Español
  • Català
  • Ελληνικά
  • Português
  • italiano
  • Türkçe
  • Русский
  • Nederlands
  • hrvatski jezik
  • język polski
  • Українська
  • हिन्दी
  • svenska
  • Esperanto
  • dansk

Documents

Help & Tutorial

How to use Book mode

Slide Example

API Docs

Edit in VSCode

Install browser extension

Contacts

Feedback

Discord

Send us email

Resources

Releases

Pricing

Blog

Policy

Terms

Privacy

Cheatsheet

Syntax Example Reference
# Header Header 基本排版
- Unordered List
  • Unordered List
1. Ordered List
  1. Ordered List
- [ ] Todo List
  • Todo List
> Blockquote
Blockquote
**Bold font** Bold font
*Italics font* Italics font
~~Strikethrough~~ Strikethrough
19^th^ 19th
H~2~O H2O
++Inserted text++ Inserted text
==Marked text== Marked text
[link text](https:// "title") Link
![image alt](https:// "title") Image
`Code` Code 在筆記中貼入程式碼
```javascript
var i = 0;
```
var i = 0;
:smile: :smile: Emoji list
{%youtube youtube_id %} Externals
$L^aT_eX$ LaTeX
:::info
This is a alert area.
:::

This is a alert area.

Versions and GitHub Sync
Get Full History Access

  • Edit version name
  • Delete

revision author avatar     named on  

More Less

Note content is identical to the latest version.
Compare
    Choose a version
    No search result
    Version not found
Sign in to link this note to GitHub
Learn more
This note is not linked with GitHub
 

Feedback

Submission failed, please try again

Thanks for your support.

On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

Please give us some advice and help us improve HackMD.

 

Thanks for your feedback

Remove version name

Do you want to remove this version name and description?

Transfer ownership

Transfer to
    Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

      Link with GitHub

      Please authorize HackMD on GitHub
      • Please sign in to GitHub and install the HackMD app on your GitHub repo.
      • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
      Learn more  Sign in to GitHub

      Push the note to GitHub Push to GitHub Pull a file from GitHub

        Authorize again
       

      Choose which file to push to

      Select repo
      Refresh Authorize more repos
      Select branch
      Select file
      Select branch
      Choose version(s) to push
      • Save a new version and push
      • Choose from existing versions
      Include title and tags
      Available push count

      Pull from GitHub

       
      File from GitHub
      File from HackMD

      GitHub Link Settings

      File linked

      Linked by
      File path
      Last synced branch
      Available push count

      Danger Zone

      Unlink
      You will no longer receive notification when GitHub file changes after unlink.

      Syncing

      Push failed

      Push successfully