Deep Learning for Computer Vision
In this task, I applied DANN (Domain-Adversarial Neural Network) to perform unsupervised domain adaptation on the digit datasets (MNIST-M, USPS, SVHN).
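For reference, here is a minimal sketch of one DANN training step, assuming a model with `feature`, `classifier`, and `discriminator` modules like the ones printed later in this report. The gradient-reversal layer is the standard DANN construction; the helper name `dann_step` is illustrative, and the coefficient `alpha` is typically annealed from 0 to 1 over training as in the original DANN paper.

```python
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Gradient-reversal layer: identity in the forward pass,
    gradients multiplied by -alpha in the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.alpha, None


def dann_step(model, src_x, src_y, tgt_x, alpha, optimizer):
    """One adversarial update: classification loss on labeled source data,
    domain-classification loss on source + unlabeled target data."""
    criterion = nn.CrossEntropyLoss()
    optimizer.zero_grad()

    # Label prediction on the labeled source batch.
    src_feat = model.feature(src_x).flatten(1)
    class_loss = criterion(model.classifier(src_feat), src_y)

    # Domain prediction on both batches, through the gradient-reversal layer
    # (source = 0, target = 1).
    tgt_feat = model.feature(tgt_x).flatten(1)
    feats = torch.cat([src_feat, tgt_feat], dim=0)
    domain_y = torch.cat([torch.zeros(len(src_x), dtype=torch.long),
                          torch.ones(len(tgt_x), dtype=torch.long)]).to(src_x.device)
    domain_loss = criterion(model.discriminator(GradReverse.apply(feats, alpha)), domain_y)

    (class_loss + domain_loss).backward()
    optimizer.step()
    return class_loss.item(), domain_loss.item()
```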
Three training settings were compared; all of them are evaluated on the target images and labels in the testing folder (a minimal evaluation sketch follows this list):

- Trained on source: train on the source images and labels in the training folder. Avg accuracy = 0.4925
- Adaptation (DANN): train on the source images and labels plus the unlabeled target images in the training folder. Avg accuracy = 0.5683
- Trained on target: train on the target images and labels in the training folder. Avg accuracy = 0.9798
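All reported numbers are average classification accuracies over the target-domain testing folder; a minimal evaluation sketch, assuming any DataLoader that yields image/label batches for that folder:

```python
import torch


@torch.no_grad()
def evaluate(model, test_loader, device):
    """Average classification accuracy on the target test set."""
    model.eval()
    correct, total = 0, 0
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        feats = model.feature(images).flatten(1)
        preds = model.classifier(feats).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total
```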
|  | MNIST-M → USPS | SVHN → MNIST-M | USPS → SVHN |
| --- | --- | --- | --- |
| Trained on source | 0.7429 | 0.4925 | 0.1526 |
| Adaptation | 0.7693 | 0.5683 | 0.3126 |
| Trained on target | 0.9626 | 0.9788 | 0.9155 |
SVHNmodel, adapted from https://github.com/wogong/pytorch-dann:
SVHNmodel(
(feature): Sequential(
(0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(6): ReLU(inplace=True)
(7): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
)
(classifier): Sequential(
(0): Linear(in_features=128, out_features=1024, bias=True)
(1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=1024, out_features=256, bias=True)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=128, out_features=1024, bias=True)
(1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=1024, out_features=256, bias=True)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=256, out_features=2, bias=True)
)
)
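As a sanity check on `in_features=128` in the classifier and discriminator, the feature stack above maps a 32×32 RGB input to a 128×1×1 tensor before flattening; a purely illustrative shape check that rebuilds the same Sequential:

```python
import torch
import torch.nn as nn

# Rebuild the printed feature extractor and verify the flattened feature size.
feature = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=5),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(64, 64, kernel_size=5),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, kernel_size=4),
)

x = torch.randn(2, 3, 32, 32)   # assuming 32x32 RGB inputs
print(feature(x).shape)         # torch.Size([2, 128, 1, 1]) -> 128 features per image
```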
I added BatchNorm and Dropout layers to the feature extractor:
Sequential(
(0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.5, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(9): ReLU(inplace=True)
(10): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
)
MNISTMmodel, adapted from https://github.com/wogong/pytorch-dann:
MNISTMmodel(
(feature): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 48, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.8, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=768, out_features=300, bias=True)
(1): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=300, out_features=100, bias=True)
(4): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=100, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=768, out_features=300, bias=True)
(1): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=300, out_features=2, bias=True)
)
)
I reduced the number of layers in the classifier and discriminator, and lowered the feature extractor's dropout from p=0.8 to p=0.5 (a sketch of the replacement follows the printout below):
MNISTMmodel(
(feature): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 48, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.5, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=768, out_features=100, bias=True)
(1): Linear(in_features=100, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=768, out_features=100, bias=True)
(1): Linear(in_features=100, out_features=2, bias=True)
)
)
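A sketch of how these slimmer heads could be obtained from the baseline MNISTMmodel by direct attribute replacement, in the same spirit as the layer replacements shown later for the USPS model; the helper name `slim_heads` and the argument name `dann` are illustrative, not my exact code:

```python
import torch.nn as nn


def slim_heads(dann):
    """Replace the deep classifier/discriminator heads with two Linear layers
    each, matching the printout above (768 = 48 channels x 4 x 4 feature map)."""
    dann.classifier = nn.Sequential(nn.Linear(768, 100), nn.Linear(100, 10))
    dann.discriminator = nn.Sequential(nn.Linear(768, 100), nn.Linear(100, 2))
    dann.feature[6] = nn.Dropout2d(p=0.5)   # dropout lowered from 0.8 to 0.5
    return dann
```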
USPSMmodel, adapted from https://github.com/wogong/pytorch-dann:
USPSMmodel(
(feature): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 24, kernel_size=(5, 5), stride=(1, 1))
(5): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Dropout2d(p=0.8, inplace=False)
(7): ReLU(inplace=True)
(8): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=384, out_features=24, bias=True)
(1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=24, out_features=10, bias=True)
)
(discriminator): Sequential(
(0): Linear(in_features=384, out_features=24, bias=True)
(1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=24, out_features=2, bias=True)
)
)
from torchvision import transforms

# Center-crop to 28x20, resize back to 28x28, and replicate the grayscale
# channel to 3 channels so the input matches the 3-channel feature extractor.
transforms.Compose([
    transforms.CenterCrop((28, 20)),
    transforms.Resize((28, 28)),
    transforms.Grayscale(num_output_channels=3),
])
import torch.nn as nn

# In-place modifications to the model instance `dann`: shrink the second conv
# block to 24 channels, set dropout to p=0.5, and adjust the classifier /
# discriminator heads to the 24 x 4 x 4 = 384-dim flattened features,
# with Sigmoid activations.
dann.feature[4] = nn.Conv2d(32, 24, kernel_size=(5, 5), stride=(1, 1))
dann.feature[5] = nn.BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
dann.feature[6] = nn.Dropout2d(p=0.5, inplace=False)
dann.classifier[0] = nn.Linear(in_features=24*4*4, out_features=24, bias=True)
dann.classifier[1] = nn.BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
dann.classifier[2] = nn.Sigmoid()
dann.classifier[3] = nn.Linear(in_features=24, out_features=10, bias=True)
dann.discriminator[0] = nn.Linear(in_features=24*4*4, out_features=24, bias=True)
dann.discriminator[1] = nn.BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
dann.discriminator[2] = nn.Sigmoid()
dann.discriminator[3] = nn.Linear(in_features=24, out_features=2, bias=True)
|  | MNIST-M → USPS | SVHN → MNIST-M | USPS → SVHN |
| --- | --- | --- | --- |
| Original model | 0.7693 | 0.5683 | 0.3126 |
| Improved model | 0.7972 | 0.5849 | 0.3726 |