# [Paper Reading] A survey on Image Data Augmentation for Deep Learning

https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0

Shorten, C., Khoshgoftaar, T.M. J Big Data 6, 60 (2019)

## Motivation for Data Augmentation

* Data augmentation mitigates overfitting by expanding the data set, allowing the model to learn from more varied examples.
* Image data is hard to acquire, especially in the medical field.

## Basic Image Manipulation

### Geometric Transformations

* note
  -- The *safety of a Data Augmentation* refers to its likelihood of preserving the label post-transformation. For example, flipping an image might change the digit 6 into 9.
  -- Geometric transformations address *positional biases* in the training data.
* cons:
  -- Some methods require the labels of the augmented data to be manually checked.
  -- In many application domains, such as medical image analysis, the biases distancing the training data from the testing data are more complex than positional and translational variances.

### Color Space Transformations

* note
  -- *Lighting biases* are among the most frequently occurring challenges in image recognition problems.
  -- Color transformations may discard important color information and thus are not always label-preserving.

### Geometric vs Photometric Transformations

[Taylor and Nitschke provide a study](https://arxiv.org/abs/1708.06020) showing that the cropping geometric transformation yields the most accurate classifier on the Caltech101 dataset.

### Mixing Images

Several ways to mix images -- averaging pixels, non-linearly pasting -- all help models perform better. In some cases, training time is reduced as well. This technique asks the model to learn to recognize an object from only part of it. However, I wonder which label the generated data should be given.

![](https://i.imgur.com/41SPrZj.png =400x) | ![](https://i.imgur.com/2Q2sAGU.png =400x)

### Random Erasing

Random erasing forces the model to learn more descriptive features about an image.
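The basic manipulations above are easy to sketch with NumPy. Below is a minimal, self-contained illustration of a horizontal flip and random erasing; the function names and the `erase_frac` parameter are my own choices for the sketch, not from the paper:

```python
import numpy as np

def horizontal_flip(img: np.ndarray) -> np.ndarray:
    """Mirror an HxWxC image left-to-right (a label-safe geometric
    transform for most natural images, but unsafe for digits like 6/9)."""
    return img[:, ::-1, :]

def random_erase(img: np.ndarray, erase_frac: float = 0.3, rng=None) -> np.ndarray:
    """Zero out a random rectangle covering roughly erase_frac of each side,
    forcing the model to rely on the remaining context."""
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = img.shape
    eh, ew = max(1, int(h * erase_frac)), max(1, int(w * erase_frac))
    top = rng.integers(0, h - eh + 1)
    left = rng.integers(0, w - ew + 1)
    out = img.copy()
    out[top:top + eh, left:left + ew, :] = 0
    return out

# Toy 4x4 RGB image with distinct pixel values.
img = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)
flipped = horizontal_flip(img)
erased = random_erase(img, erase_frac=0.5, rng=np.random.default_rng(0))
```

Both transforms preserve the image shape, so they can be chained freely before batching.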
## Data Augmentations based on Deep Learning

### Feature space augmentation

* A disadvantage of feature space augmentation is that the augmented vector data is very difficult to interpret.
* [Wong et al. find that](https://arxiv.org/abs/1609.08764) when it is possible to transform images in the data-space, data-space augmentation outperforms feature space augmentation.

### Adversarial training

Improves model performance under adversarial attack.

### GAN-based Data Augmentation

* The GAN framework can be extended to improve the quality of samples produced with *variational auto-encoders*.
* Progressively Growing GAN
  -- This architecture trains a series of networks with progressively increasing resolution complexity.
* CycleGAN
  -- The Cycle-Consistency loss function measures forward and backward translation consistency.
  -- CycleGAN learns to translate images from one domain to another, such as horses to zebras.
  -- It can be used as a method of intelligent oversampling for imbalanced data.
* Conditional GAN

### Neural Style Transfer

* Usually, the set of styles to transfer into is not obvious.

### Meta learning Data Augmentations

* Neural Augmentation
  -- Meta-learns a Neural Style Transfer strategy called Neural Augmentation.
  -- Uses Neural Augmentation to generate data.
* Smart Augmentation
  -- Uses an adaptive CNN to merge two images, similar to SamplePairing or mixed-example augmentation.
* AutoAugment
  -- A reinforcement learning algorithm that searches for an optimal augmentation policy.
  -- The policies learned on the ImageNet dataset transferred successfully to the Stanford Cars and FGVC Aircraft image recognition tasks.

## Comparing Augmentations

[Shijie et al. compared](https://ieeexplore.ieee.org/document/8243510) GANs, WGANs, flipping, cropping, shifting, PCA jittering, color jittering, adding noise, rotation, and some combinations on the CIFAR-10 and ImageNet datasets. The combinations of flipping+cropping and flipping+WGAN were the best overall.
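To make feature space augmentation concrete, one simple approach creates synthetic samples by linearly interpolating between feature vectors of the same class (a SMOTE-style sketch; the function and its parameters are my own illustration, not a method from the survey):

```python
import numpy as np

def interpolate_features(feats: np.ndarray, n_new: int, rng=None) -> np.ndarray:
    """Create n_new synthetic feature vectors by interpolating between
    random pairs of same-class feature vectors (SMOTE-style).

    feats: (n_samples, n_features) array of features from one class.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = feats.shape[0]
    i = rng.integers(0, n, size=n_new)   # first endpoint of each pair
    j = rng.integers(0, n, size=n_new)   # second endpoint of each pair
    lam = rng.uniform(0.0, 1.0, size=(n_new, 1))  # per-sample mixing weight
    return lam * feats[i] + (1.0 - lam) * feats[j]

# Toy feature vectors belonging to a single class.
feats = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
new = interpolate_features(feats, n_new=5, rng=np.random.default_rng(1))
```

The synthetic vectors stay inside the convex hull of the originals, which illustrates the interpretability drawback noted above: the new vectors have no corresponding image to inspect.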
## Additional Design Decisions

### Test-time Augmentation

Aggregate predictions across augmented versions of each test image.

![](https://i.imgur.com/KaZF0zX.png)

### Curriculum Learning

Decides when and how to use certain training data.

### Resolution Impact

Should we use GANs to generate high-resolution images to train a better model?

### Alleviating Class Imbalance with Data Augmentation

## Discussion

* A human-level understanding of convolutional network features could greatly help guide the augmentation process.
* How should the post-augmented dataset size be determined?

## My Thoughts

* This survey is very thorough and very enlightening.
* After thinking about data augmentation seriously, I appreciate the importance of GAN-based methods more. It is a learning framework that allows the machine to learn by itself.
* I should study GANs and neural style transfer further.
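As a small appendix, the test-time augmentation idea from the design-decision section above can be sketched as averaging a model's predictions over label-safe views of the same image (the toy model and names here are my own, purely for illustration):

```python
import numpy as np

def tta_predict(model, img: np.ndarray) -> np.ndarray:
    """Average class probabilities over the original image and its
    horizontal flip (test-time augmentation with one safe transform)."""
    views = [img, img[:, ::-1, :]]
    preds = np.stack([model(v) for v in views])
    return preds.mean(axis=0)

def toy_model(img: np.ndarray) -> np.ndarray:
    """Toy 'classifier': probability of class 0 equals the mean
    intensity of the left half of the image."""
    p = float(img[:, : img.shape[1] // 2, :].mean())
    return np.array([p, 1.0 - p])

img = np.zeros((2, 4, 1))
img[:, :2, :] = 1.0  # bright left half
pred = tta_predict(toy_model, img)  # flip moves brightness to the right half
```

Because the toy model is sensitive to left/right position, averaging over the flip cancels that positional bias out, which is exactly the effect TTA aims for.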