Domain Adaptation Neural Network

tags: Deep Learning for Computer Vision

In this task, I implemented DANN (Domain-Adversarial Neural Network) to perform unsupervised domain adaptation.


Scenario

  • Source domain: SVHN
  • Target domain: MNIST-M

Training on source domain only (lower bound)

Train on the source images and labels in the training folder; compute accuracy with the target images and labels in the testing folder.

Avg Accuracy = 0.4925

Training on source and target domain (domain adaptation)

Train on the source images and labels in the training folder plus the unlabeled target images in the training folder; compute accuracy with the target images and labels in the testing folder.

Avg Accuracy = 0.5683

Training on target domain only (upper bound)

Train on the target images and labels in the training folder; compute accuracy with the target images and labels in the testing folder.

Avg Accuracy = 0.9798
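
All three settings are evaluated the same way: run the trained classifier over the target testing set and report accuracy. Below is a minimal sketch of that evaluation, assuming the model exposes .feature and .classifier submodules as in the module printouts later in this note (target_accuracy is an illustrative helper, not the assignment's code):

import torch

@torch.no_grad()
def target_accuracy(model, test_loader, device="cpu"):
    # Classification accuracy on the target testing set.
    model.eval()
    correct, total = 0, 0
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        # Flatten the conv features before the fully connected classifier.
        logits = model.classifier(model.feature(images).flatten(1))
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.numel()
    return correct / total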

Result Matrix

                    MNIST-M → USPS   SVHN → MNIST-M   USPS → SVHN
Trained on source   0.7429           0.4925           0.1526
Adaptation          0.7693           0.5683           0.3126
Trained on target   0.9626           0.9788           0.9155

Visualization



Remark

  • In addition to the classification loss on the source domain, the loss function also adds the domain losses of both the source and the target (i.e., loss = source_loss_class + source_loss_domain + target_loss_domain); a training-step sketch follows this list.
  • During training, we expect the accuracy of the domain classifier to get closer and closer to 0.5, which means the features of the source and target domains have been mixed together.
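
Here is the first remark in code, as a sketch of one training step. The gradient reversal layer and the helper use illustrative names (GradReverse, dann_step), and the model is assumed to expose .feature, .classifier, and .discriminator as in the printouts later in this note:

import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Gradient reversal layer: identity on the forward pass,
    # multiplies the gradient by -lambda on the backward pass.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def dann_step(model, src_x, src_y, tgt_x, lambd=1.0):
    src_feat = model.feature(src_x).flatten(1)
    tgt_feat = model.feature(tgt_x).flatten(1)

    # Classification loss is computed on labeled source images only.
    source_loss_class = F.cross_entropy(model.classifier(src_feat), src_y)

    # Domain labels: 0 = source, 1 = target. The reversed gradient pushes
    # the feature extractor to make the two domains indistinguishable.
    src_dom = model.discriminator(GradReverse.apply(src_feat, lambd))
    tgt_dom = model.discriminator(GradReverse.apply(tgt_feat, lambd))
    src_dom_y = torch.zeros(len(src_x), dtype=torch.long, device=src_x.device)
    tgt_dom_y = torch.ones(len(tgt_x), dtype=torch.long, device=tgt_x.device)
    source_loss_domain = F.cross_entropy(src_dom, src_dom_y)
    target_loss_domain = F.cross_entropy(tgt_dom, tgt_dom_y)

    return source_loss_class + source_loss_domain + target_loss_domain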

Improved UDA model

Scenario 1

  • Source domain: SVHN
  • Target domain: MNIST-M

Original

# https://github.com/wogong/pytorch-dann
SVHNmodel(
  (feature): Sequential(
    (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (6): ReLU(inplace=True)
    (7): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
  )
  (classifier): Sequential(
    (0): Linear(in_features=128, out_features=1024, bias=True)
    (1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=1024, out_features=256, bias=True)
    (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=256, out_features=10, bias=True)
  )
  (discriminator): Sequential(
    (0): Linear(in_features=128, out_features=1024, bias=True)
    (1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=1024, out_features=256, bias=True)
    (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=256, out_features=2, bias=True)
  )
)

Improved

Comparing against the original printout, I added BatchNorm and Dropout layers to the feature extractor:

Sequential(
  (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
  (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU(inplace=True)
  (3): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (4): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
  (5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (6): Dropout2d(p=0.5, inplace=False)
  (7): ReLU(inplace=True)
  (8): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (9): ReLU(inplace=True)
  (10): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
)
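
Written out as a constructor, the improved extractor looks like this (a sketch mirroring the printout above; for 32x32 inputs the final Conv2d leaves a 128x1x1 map, matching the classifier's in_features=128):

import torch.nn as nn

feature = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=5),
    nn.BatchNorm2d(64),           # added
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(64, 64, kernel_size=5),
    nn.BatchNorm2d(64),           # added
    nn.Dropout2d(p=0.5),          # added
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, kernel_size=4),
)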

Scenario 2

  • Source domain: MNIST-M
  • Target domain: USPS

Original

# https://github.com/wogong/pytorch-dann
MNISTMmodel(
  (feature): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(32, 48, kernel_size=(5, 5), stride=(1, 1))
    (5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): Dropout2d(p=0.8, inplace=False)
    (7): ReLU(inplace=True)
    (8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=768, out_features=300, bias=True)
    (1): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=300, out_features=100, bias=True)
    (4): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=100, out_features=10, bias=True)
  )
  (discriminator): Sequential(
    (0): Linear(in_features=768, out_features=300, bias=True)
    (1): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=300, out_features=2, bias=True)
  )
)

Improved

I reduced the classifier and the discriminator to two linear layers each, and lowered the dropout rate in the feature extractor from 0.8 to 0.5:

MNISTMmodel(
  (feature): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(32, 48, kernel_size=(5, 5), stride=(1, 1))
    (5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): Dropout2d(p=0.5, inplace=False)
    (7): ReLU(inplace=True)
    (8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=768, out_features=100, bias=True)
    (1): Linear(in_features=100, out_features=10, bias=True)
  )
  (discriminator): Sequential(
    (0): Linear(in_features=768, out_features=100, bias=True)
    (1): Linear(in_features=100, out_features=2, bias=True)
  )
)
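
The same change expressed as in-place patches of the original MNISTMmodel (a sketch in the style of the scenario 3 snippet below; dann is assumed to be the instantiated model):

import torch.nn as nn

dann.feature[6] = nn.Dropout2d(p=0.5)  # lowered from p=0.8

# Two plain linear layers replace the deeper Linear/BatchNorm/ReLU stacks.
dann.classifier = nn.Sequential(
    nn.Linear(768, 100),
    nn.Linear(100, 10),
)
dann.discriminator = nn.Sequential(
    nn.Linear(768, 100),
    nn.Linear(100, 2),
)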

Scenario 3

  • Source domain: USPS
  • Target domain: SVHN

Original

# https://github.com/wogong/pytorch-dann
USPSMmodel(
  (feature): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(32, 24, kernel_size=(5, 5), stride=(1, 1))
    (5): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): Dropout2d(p=0.8, inplace=False)
    (7): ReLU(inplace=True)
    (8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=384, out_features=24, bias=True)
    (1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=24, out_features=10, bias=True)
  )
  (discriminator): Sequential(
    (0): Linear(in_features=384, out_features=24, bias=True)
    (1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Linear(in_features=24, out_features=2, bias=True)
  )
)

Improved

  1. Data preprocessing

transforms.Compose([
    transforms.CenterCrop((28, 20)),              # crop to 28x20 (height x width)
    transforms.Resize((28, 28)),                  # stretch back to 28x28
    transforms.Grayscale(num_output_channels=3),  # drop color, keep 3 channels
])
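
A hypothetical usage of this pipeline; the dataset path and the trailing ToTensor step are assumptions for illustration, not part of the original snippet:

from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.CenterCrop((28, 20)),
    transforms.Resize((28, 28)),
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor(),  # assumed: PIL image -> float tensor in [0, 1]
])

# "data/usps/train" is a placeholder path.
train_set = datasets.ImageFolder("data/usps/train", transform=preprocess)
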
  2. Model architecture
dann.feature[4] = nn.Conv2d(32, 24, kernel_size=(5, 5), stride=(1, 1))
dann.feature[5] = nn.BatchNorm2d(24)
dann.feature[6] = nn.Dropout2d(p=0.5)  # lowered from the original p=0.8

# 24 channels x 4 x 4 spatial positions = 384 flattened features
dann.classifier[0] = nn.Linear(in_features=24 * 4 * 4, out_features=24)
dann.classifier[1] = nn.BatchNorm1d(24)
dann.classifier[2] = nn.Sigmoid()  # replaces the original ReLU
dann.classifier[3] = nn.Linear(in_features=24, out_features=10)

dann.discriminator[0] = nn.Linear(in_features=24 * 4 * 4, out_features=24)
dann.discriminator[1] = nn.BatchNorm1d(24)
dann.discriminator[2] = nn.Sigmoid()  # replaces the original ReLU
dann.discriminator[3] = nn.Linear(in_features=24, out_features=2)

Improved Result

                 MNIST-M → USPS   SVHN → MNIST-M   USPS → SVHN
Original model   0.7693           0.5683           0.3126
Improved model   0.7972           0.5849           0.3726

Remark

  • During data preprocessing, we can transform the source-domain data to look more like the target domain.
  • An architecture with fewer layers and fewer neurons performs better when the source-domain data (USPS, 1 channel) carries less color information than the target-domain data (SVHN, 3 channels).