Deep Learning for Computer Vision
In this task, I applied DANN (Domain-Adversarial Neural Network) to perform unsupervised domain adaptation on the digit datasets (MNIST-M, USPS, SVHN).
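For reference, here is a minimal sketch of one DANN training step, assuming a model with `feature`, `classifier`, and `discriminator` modules like the ones printed later in this report. The gradient-reversal layer is the standard DANN construction; the helper name `dann_step` is illustrative, and the coefficient `alpha` is typically annealed from 0 to 1 over training as in the original DANN paper.

```python
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Gradient-reversal layer: identity in the forward pass,
    gradients multiplied by -alpha in the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.alpha, None


def dann_step(model, src_x, src_y, tgt_x, alpha, optimizer):
    """One adversarial update: classification loss on labeled source data,
    domain-classification loss on source + unlabeled target data."""
    criterion = nn.CrossEntropyLoss()
    optimizer.zero_grad()

    # Label prediction on the labeled source batch.
    src_feat = model.feature(src_x).flatten(1)
    class_loss = criterion(model.classifier(src_feat), src_y)

    # Domain prediction on both batches, through the gradient-reversal layer
    # (source = 0, target = 1).
    tgt_feat = model.feature(tgt_x).flatten(1)
    feats = torch.cat([src_feat, tgt_feat], dim=0)
    domain_y = torch.cat([torch.zeros(len(src_x), dtype=torch.long),
                          torch.ones(len(tgt_x), dtype=torch.long)]).to(src_x.device)
    domain_loss = criterion(model.discriminator(GradReverse.apply(feats, alpha)), domain_y)

    (class_loss + domain_loss).backward()
    optimizer.step()
    return class_loss.item(), domain_loss.item()
```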
Three training settings were compared; all of them are evaluated on the target images and labels in the testing folder (a minimal evaluation sketch follows this list):

- Trained on source: train on the source images and labels in the training folder. Avg accuracy = 0.4925
- Adaptation (DANN): train on the source images and labels plus the unlabeled target images in the training folder. Avg accuracy = 0.5683
- Trained on target: train on the target images and labels in the training folder. Avg accuracy = 0.9798
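All reported numbers are average classification accuracies over the target-domain testing folder; a minimal evaluation sketch, assuming any DataLoader that yields image/label batches for that folder:

```python
import torch


@torch.no_grad()
def evaluate(model, test_loader, device):
    """Average classification accuracy on the target test set."""
    model.eval()
    correct, total = 0, 0
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        feats = model.feature(images).flatten(1)
        preds = model.classifier(feats).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total
```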
|  | MNIST-M → USPS | SVHN → MNIST-M | USPS → SVHN |
| --- | --- | --- | --- |
| Trained on source | 0.7429 | 0.4925 | 0.1526 |
| Adaptation | 0.7693 | 0.5683 | 0.3126 |
| Trained on target | 0.9626 | 0.9788 | 0.9155 |
SVHNmodel, adapted from https://github.com/wogong/pytorch-dann:
SVHNmodel(
(feature): Sequential(
(0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(6): ReLU(inplace=True)
(7): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
)
(classifier): Sequential(
(0): Linear(in_features=128, out_features=1024, bias=True)
(1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=1024, out_features=256, bias=True)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=128, out_features=1024, bias=True)
(1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=1024, out_features=256, bias=True)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=2, bias=True)
)
)
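As a sanity check on `in_features=128` in the classifier and discriminator, the feature stack above maps a 32×32 RGB input to a 128×1×1 tensor before flattening; a purely illustrative shape check that rebuilds the same Sequential:

```python
import torch
import torch.nn as nn

# Rebuild the printed feature extractor and verify the flattened feature size.
feature = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=5),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(64, 64, kernel_size=5),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, kernel_size=4),
)

x = torch.randn(2, 3, 32, 32)   # assuming 32x32 RGB inputs
print(feature(x).shape)         # torch.Size([2, 128, 1, 1]) -> 128 features per image
```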
I added BatchNorm and Dropout layers to the feature extractor:
Sequential(
(0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.5, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(9): ReLU(inplace=True)
(10): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
)
MNISTMmodel, adapted from https://github.com/wogong/pytorch-dann:
MNISTMmodel(
(feature): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 48, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.8, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=768, out_features=300, bias=True)
(1): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=300, out_features=100, bias=True)
(4): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=100, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=768, out_features=300, bias=True)
(1): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=300, out_features=2, bias=True)
)
)
I reduced the number of layers in the classifier and discriminator, and lowered the feature extractor's dropout from p=0.8 to p=0.5 (a sketch of the replacement follows the printout below):
MNISTMmodel(
(feature): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 48, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.5, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=768, out_features=100, bias=True)
(1): Linear(in_features=100, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=768, out_features=100, bias=True)
(1): Linear(in_features=100, out_features=2, bias=True)
)
)
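A sketch of how these slimmer heads could be obtained from the baseline MNISTMmodel by direct attribute replacement, in the same spirit as the layer replacements shown later for the USPS model; the helper name `slim_heads` and the argument name `dann` are illustrative, not my exact code:

```python
import torch.nn as nn


def slim_heads(dann):
    """Replace the deep classifier/discriminator heads with two Linear layers
    each, matching the printout above (768 = 48 channels x 4 x 4 feature map)."""
    dann.classifier = nn.Sequential(nn.Linear(768, 100), nn.Linear(100, 10))
    dann.discriminator = nn.Sequential(nn.Linear(768, 100), nn.Linear(100, 2))
    dann.feature[6] = nn.Dropout2d(p=0.5)   # dropout lowered from 0.8 to 0.5
    return dann
```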
USPSMmodel, adapted from https://github.com/wogong/pytorch-dann:
USPSMmodel(
(feature): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 24, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.8, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=384, out_features=24, bias=True)
(1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=24, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=384, out_features=24, bias=True)
(1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=24, out_features=2, bias=True)
)
)
from torchvision import transforms

# Center-crop to 28x20, resize back to 28x28, and replicate the grayscale
# channel to 3 channels so the input matches the 3-channel feature extractor.
transforms.Compose([
    transforms.CenterCrop((28, 20)),
    transforms.Resize((28, 28)),
    transforms.Grayscale(num_output_channels=3),
])
import torch.nn as nn

# In-place modifications to the model instance `dann`: shrink the second conv
# block to 24 channels, set dropout to p=0.5, and adjust the classifier /
# discriminator heads to the 24 x 4 x 4 = 384-dim flattened features,
# with Sigmoid activations.
dann.feature[4] = nn.Conv2d(32, 24, kernel_size=(5, 5), stride=(1, 1))
dann.feature[5] = nn.BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
dann.feature[6] = nn.Dropout2d(p=0.5, inplace=False)
dann.classifier[0] = nn.Linear(in_features=24*4*4, out_features=24, bias=True)
dann.classifier[1] = nn.BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
dann.classifier[2] = nn.Sigmoid()
dann.classifier[3] = nn.Linear(in_features=24, out_features=10, bias=True)
dann.discriminator[0] = nn.Linear(in_features=24*4*4, out_features=24, bias=True)
dann.discriminator[1] = nn.BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
dann.discriminator[2] = nn.Sigmoid()
dann.discriminator[3] = nn.Linear(in_features=24, out_features=2, bias=True)
|  | MNIST-M → USPS | SVHN → MNIST-M | USPS → SVHN |
| --- | --- | --- | --- |
| Original model | 0.7693 | 0.5683 | 0.3126 |
| Improved model | 0.7972 | 0.5849 | 0.3726 |