---
title: Domain Adaptation Final Project
tags: Final Project, Introduction to ML, Innopolis
description: View the slide with "Slide Mode".
---

# Machine Learning Fall-19
# Innopolis University
# Final Project: Domain Adaptation for Digits

## Goals

The goal of this Final Project is to help the student gain experience in:

1. Building machine learning models with different components.
2. Experiencing generalization problems of machine learning models.
3. Applying his/her machine learning knowledge, especially on ANNs, AEs, CNNs, etc.
4. Reading research papers and implementing them.

## Domain Adaptation

Imagine you work at Airbnb (a platform for renting houses) and you are building a model that classifies houses into: `very bad, bad, average, good, very good`. You build the model using the best data you can find, with images taken by professional photographers, because you want your model to learn from high-quality data. You trained the model, tested it on the test data, and got 96% accuracy. The model was deployed, but the customers complained a lot about its performance. The problem was that your clients were using low-quality, cell-phone-generated images. This change in image quality, or, in general terms, change in data distribution (also referred to as concept drift or concept shift), dropped the model's performance. This is the problem that domain adaptation tries to fix.

More formally, let $D^s$ and $D^t$ be the source and target domains, respectively. Domain adaptation (DA), which is a sub-field of transductive transfer learning (TTL), aims to solve a problem in $D^t$, where data are hard to collect, using data from $D^s$. Both domains usually share the same task (i.e., $T^s = T^t$), but the marginal distributions of the inputs differ (i.e., $P(X^s) \neq P(X^t)$). Similar to other machine learning tasks, DA can be split into supervised, unsupervised, and semi-supervised settings, depending on how much labeled data are available from $D^t$. For supervised domain adaptation (SDA) and semi-supervised domain adaptation (SSDA), the data are completely or partially labeled, but they are not sufficient to train an accurate model for the target domain from scratch. In unsupervised domain adaptation (UDA), the target domain samples are completely unlabeled, which is useful in situations where data collection is easy but data labeling is time consuming. Most previous UDA approaches aim at two targets: (i) produce (or learn) feature vectors from the $D^s$ data that a classifier can use to predict class labels accurately, and (ii) make the features of $D^s$ and $D^t$ indistinguishable.

## Dataset

We will apply domain adaptation to digits, mainly because the images are small, so they do not require a lot of computation, and because most papers use them for evaluating their models. We have two datasets:

**MNIST:** MNIST is a handwritten-digit dataset, where all images are in greyscale and have digits in a fixed orientation.

![](https://i.imgur.com/cdcChyD.png)

**SVHN:** SVHN images are real-world RGB images of digits (in different orientations) taken from street house numbers.

![](https://i.imgur.com/MbzxYc3.jpg)

The domain gap results from such differences in style, brightness, and contrast, among others. Domain adaptation aims at closing the above-mentioned gap between different datasets, such that an image classification model trained on SVHN would perform well on MNIST images, too. For the purpose of this project we will consider:

* SVHN: source domain
* MNIST: target domain
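Both datasets ship with `torchvision`, so loading them is straightforward. Below is a minimal sketch, assuming PyTorch and `torchvision`; resizing MNIST to 32×32 and replicating its single channel so that both domains share SVHN's input shape is just one possible preprocessing choice, not a requirement.

```python
# Minimal sketch: loading SVHN (source) and MNIST (target) with torchvision.
# The resize/channel choices below are one possible preprocessing, not a requirement.
import torch
from torchvision import datasets, transforms

svhn_tf = transforms.Compose([
    transforms.ToTensor(),                        # SVHN is already 32x32 RGB
])
mnist_tf = transforms.Compose([
    transforms.Resize(32),                        # match SVHN's spatial size
    transforms.Grayscale(num_output_channels=3),  # replicate the single channel
    transforms.ToTensor(),
])

svhn_train = datasets.SVHN(root="data", split="train", download=True, transform=svhn_tf)
mnist_test = datasets.MNIST(root="data", train=False, download=True, transform=mnist_tf)

source_loader = torch.utils.data.DataLoader(svhn_train, batch_size=128, shuffle=True)
target_loader = torch.utils.data.DataLoader(mnist_test, batch_size=128, shuffle=False)
```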
## Task

The task is very flexible: all you need to do is build a classifier that gives good results on MNIST while respecting the constraints of domain adaptation.

**Constraints of Unsupervised Domain Adaptation**

Training:
1. Access to source domain labels.
2. No access to target domain labels.
3. Access to images from both domains.

In other words, you can use the source domain together with its labels, but you cannot use the target domain's labels for training.

Testing:
1. Accuracy on the target domain is the goal.
2. Accuracy on the source domain is not important.

We chose the accuracy on the MNIST (target domain) test data because it is the best measure of how good your model is and because it is widely used in research papers.

**General Guidelines**
1. Use any library, PyTorch or TensorFlow (at your own risk).
2. Build a baseline model by using and combining the code from your labs.
3. Use any GitHub (or other publicly available) code to get better results than your baseline model.
4. Implement any paper by yourself (better than 3).
5. Implement and mix techniques from different papers (better than 4).
6. Create your own new technique (better than 5).
7. Provide a reference to any code that you use or the paper(s) that you implement.
8. Use a seed so that your results are reproducible.

**Grading**
1. By the end of the first week, every student must implement and submit the baseline model (2). This will get you 50 points.
2. For every day after the deadline, we deduct 5 points if you did not submit the baseline.
3. Over the next two weeks, you can improve upon the baseline scores by doing any of (3-6):
   * For 3: +10 points
   * For 4: +20 points
   * For 5: +30 points
   * For 6: +40 points
4. Apart from the code (which should be properly structured and commented), you will also submit a report (10 points). The report should explain the following:
   * Details of the baseline model and its results (2 points)
   * Details of the improved method (one of 3-6) (5 points)
   * Discussion of your results and the lessons that you learned from them (3 points)

## Submission

**First Week Submission**

Code (a minimal baseline sketch is given after this list):
1. Python notebooks are not allowed.
2. A simple CNN.
3. Data loaders for both datasets exist in PyTorch by default (do not use SVHN `extra`).
4. Train on SVHN-train.
5. Test on MNIST-test.
6. Your code should output only one value: the accuracy on MNIST-test, in the range (0, 100).

Report:
1. What is the accuracy on SVHN-train?
2. What is the accuracy on SVHN-test?
3. What is the accuracy on MNIST-test?
4. Explain your baseline model: how many layers, etc.
5. Try different techniques and report the three accuracies mentioned above with and without them (batch norm, different activations, different optimizers, dropout, etc.) (BONUS TASK).
6. Discuss the results with regard to domain adaptation.

Constraint:
1. Accuracy on MNIST-test must be over 60.
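The following is a minimal baseline sketch in PyTorch, assuming the `source_loader` and `target_loader` defined in the Dataset section above; the architecture, epoch count, learning rate, and the `set_seed` helper are illustrative choices, not required ones. It seeds the run, builds a small CNN, applies Xavier initialization, trains on SVHN-train with Adam, and prints a single MNIST-test accuracy.

```python
# Minimal baseline sketch (illustrative, not the required solution):
# seed -> simple CNN -> Xavier init -> Adam -> train on SVHN-train -> report MNIST-test accuracy.
# Assumes source_loader / target_loader from the Dataset sketch above.
import random
import torch
import torch.nn as nn

def set_seed(seed=0):                 # illustrative helper for reproducibility
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

def xavier_init(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

@torch.no_grad()
def evaluate(model, loader, device):
    model.eval()
    correct = total = 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1).cpu()
        correct += (pred == y).sum().item()
        total += y.size(0)
    return 100.0 * correct / total

set_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SimpleCNN().to(device)
model.apply(xavier_init)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):               # source-only training on SVHN-train
    model.train()
    for x, y in source_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

print(evaluate(model, target_loader, device))  # the single required output: MNIST-test accuracy
```

For the improved submissions, the same skeleton can be extended with whatever adaptation technique you choose, trained additionally on the unlabeled MNIST-train images as allowed by the final-submission rules below.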
**Final Submission**

Code:
1. Colab links with the code already run (we should see your results in the output box).
2. Please follow the guidelines.
3. Data loaders for both datasets exist in PyTorch by default (do not use SVHN `extra`).
4. Train on SVHN-train.
5. You can use MNIST-train, but only the images.
6. Test on MNIST-test.
7. Use Xavier initialization and the Adam optimizer.
8. Your code should converge by the end (the last 10 accuracies should not differ much from each other).

Report:
1. Explain your model (use English, math, and optional figures).
2. How is your submission different from the first submission?
3. Is your solution taken from an existing paper (provide PDFs), or is it your own idea?
4. Explain how your solution implements domain adaptation.
5. List all important hyper-parameters. How did you tune them?
6. Accuracy (a) on SVHN-train, (b) on SVHN-test, (c) on MNIST-test. Each of these should be provided for the whole training period -- use graphs to illustrate that.
7. Provide plots of the latent space to show domain invariance and category informativeness (before training, before doing DA, and after DA).
8. Perform an ablation study on your model (remove some parts of the model and report the new accuracies) (BONUS TASK).
9. Personal thoughts: What did you learn from your results? Were you able to close the domain gap? What part of your model can be improved further to get better results, and why?

Grading:
* First submission: 50%
* Final submission code: 10%
* Final submission report: 40%
* Bonus is discussed above

## References

1. [Tutorial on Domain Adaptation](https://www.youtube.com/watch?v=F2OJ0fAK46Q)
2. [Unsupervised Pixel-Level Domain Adaptation With Generative Adversarial Networks](https://www.youtube.com/watch?v=VhsTrWPvjcA)
3. [Duplex Generative Adversarial Network for Unsupervised Domain Adaptation](http://openaccess.thecvf.com/content_cvpr_2018/papers/Hu_Duplex_Generative_Adversarial_CVPR_2018_paper.pdf)
4. [Triplet Loss Network for Unsupervised Domain Adaptation](https://www.mdpi.com/1999-4893/12/5/96)
5. [Domain Adaptation and Domain Generalisation for deployment ready models](https://www.youtube.com/watch?v=xK3LYRE6Xhw)
6. [Progressive Feature Alignment for Unsupervised Domain Adaptation](http://openaccess.thecvf.com/content_CVPR_2019/html/Chen_Progressive_Feature_Alignment_for_Unsupervised_Domain_Adaptation_CVPR_2019_paper.html)
7. [Domain Generalization by Solving Jigsaw Puzzles](http://openaccess.thecvf.com/content_CVPR_2019/html/Carlucci_Domain_Generalization_by_Solving_Jigsaw_Puzzles_CVPR_2019_paper.html)
8. [DLOW: Domain Flow for Adaptation and Generalization](http://openaccess.thecvf.com/content_CVPR_2019/html/Gong_DLOW_Domain_Flow_for_Adaptation_and_Generalization_CVPR_2019_paper.html)
9. [d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding](http://openaccess.thecvf.com/content_CVPR_2019/html/Xu_d-SNE_Domain_Adaptation_Using_Stochastic_Neighborhood_Embedding_CVPR_2019_paper.html)