# Yes, we GAN: Applying Adversarial Techniques for Autonomous Driving (Shared Paper Notes)
---
[TOC]
---
## ABSTRACT
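- Discusses the application of GANs to autonomous driving
- Including advanced data augmentation
- Loss function learning
- Semi-supervised learning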
## INTRODUCTION
- Autonomous driving modules
- Sense
- Perceive & Localize
- Abstract
- Plan
- Control
- Approaches
- Independently designed modules (traditional)
- End-to-end
- Advantages of GANs for autonomous driving
- Discriminative models & Generative models
- Autonomous driving systems require training the model on all possible scenarios that can happen in real life.
## OVERVIEW OF GAN
- Background
- GANs were introduced in 2014 and were immediately recognized as a promising direction for upcoming deep learning research.
- unsupervised learning
- semi-supervised learning
- advanced data augmentation
- Problem
- stabilization of the complicated GAN learning
- Cause
- the generated data do not reflect the diversity of the underlying data distribution
- Result
- the discriminator is fooled into believing unrealistic samples
- degeneration of the generator
### Vanilla GAN
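- Method
- A generator (network), whose task is to generate samples that are as similar as possible to real data samples
- A discriminator (network), whose task is to distinguish real samples from generated ones
- At equilibrium (end of training), the discriminator should output a probability of 0.5
- Key point
- G must not be trained too much without updating D
- Advantages
- only back-propagation is needed to compute the gradients
- a wide variety of functions can be incorporated in the model
- Disadvantages
- absence of an explicit representation of $p_g(x)$
- the discriminator D must be kept well synchronized with the generator G during training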
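
The two-player objective summarized above can be sketched in a few lines. This is a minimal illustration, assuming the standard minimax formulation with binary cross-entropy; the function names and the toy probabilities are made up for this note and do not come from the paper.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Negative of the GAN value function, averaged over a batch.

    d_real: D's probabilities on real samples; d_fake: D's probabilities
    on generated samples. D wants d_real -> 1 and d_fake -> 0.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Minimax generator loss: G lowers it by pushing D's output on fakes toward 1."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# At equilibrium D outputs 0.5 everywhere, so the discriminator loss
# equals log(4) regardless of the batch contents.
equilibrium_loss = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

In a real training loop the two losses would be minimized alternately, which is where the synchronization issue noted above comes from.
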
### Prominent Derivatives of GAN
#### Conditional Generative Adversarial Nets (CGAN)
- Characteristics
- by conditioning the model on additional information, one can influence the data generation process
- Advantages
- the conditioning can be exploited to generate samples of a required label
- Validation data
- MNIST
- Similar GANs
- FC-GAN: fast convergence
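
One common way to condition a generator is to concatenate the label with the noise vector. The sketch below assumes that approach; `one_hot`, `conditional_generator_input`, and the dimensions are hypothetical names and values chosen for illustration.

```python
import random

def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditional_generator_input(label, num_classes=10, noise_dim=4, rng=None):
    """Concatenate a noise vector z with the one-hot condition y."""
    rng = rng or random.Random(0)
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return z + one_hot(label, num_classes)

# Request a sample of digit class 7 (MNIST has 10 classes).
g_input = conditional_generator_input(label=7)
```

The discriminator would receive the same one-hot label next to the image, so both networks see the condition.
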
#### Wasserstein GAN (WGAN)
- Characteristics
- WGAN focuses solely on the training of GANs.
- replaces the KL-divergence objective underlying maximum likelihood estimation (MLE) with the Wasserstein (earth-mover) distance
- Advantages
- the critic can be trained to optimality
- prevents mode collapse
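
As a rough sketch of what "training the critic" means here: the critic outputs unbounded scores rather than probabilities, and its loss is a difference of means. Weight clipping is not mentioned in these notes; it is included as an assumption based on the original WGAN formulation, and the clip value `c=0.01` is only a typical choice.

```python
def critic_loss(scores_real, scores_fake):
    """WGAN critic loss: the critic maximizes mean f(real) - mean f(fake),
    so we minimize the negative of that difference."""
    mean_real = sum(scores_real) / len(scores_real)
    mean_fake = sum(scores_fake) / len(scores_fake)
    return -(mean_real - mean_fake)

def clip_weights(weights, c=0.01):
    """Clamp every weight to [-c, c], a crude way to keep the critic Lipschitz."""
    return [max(-c, min(c, w)) for w in weights]

loss = critic_loss([2.0, 4.0], [1.0, 1.0])   # mean_real = 3, mean_fake = 1
clipped = clip_weights([0.5, -0.003, -2.0])
```
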
#### Improved WGAN
- Characteristics
- the authors propose to penalize the norm of the gradient of the critic with respect to its input
- Advantages
- fixes WGAN's flawed learning (where only poor-quality samples are generated)
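
The gradient penalty can be illustrated on a toy critic. This sketch assumes the usual form $\lambda(\lVert\nabla_x f(x)\rVert_2 - 1)^2$ with $\lambda = 10$; the linear critic is a stand-in chosen so the gradient is known in closed form, and a real critic would need automatic differentiation.

```python
import math

def gradient_penalty(grad, lam=10.0):
    """lam * (||grad||_2 - 1)^2 for one input point.

    grad is the gradient of the critic output with respect to its input.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    return lam * (norm - 1.0) ** 2

# Toy linear critic f(x) = w . x, whose input gradient is simply w.
w = [3.0, 4.0]
penalty = gradient_penalty(w)               # norm 5 -> 10 * (5 - 1)^2
no_penalty = gradient_penalty([0.6, 0.8])   # norm 1 -> (almost exactly) 0
```
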
#### Boundary-Seeking Generative Adversarial Networks (BGAN)
- Method
- training the generator to produce samples lying on the decision boundary of the current discriminator
- Advantages
- proved to be more stable against mode collapse
- defines a unified learning framework for both discrete and continuous variables
### GAN: Recent Advances
#### BigGAN
- Advantages
- achieved a new level of performance
- fine control over the trade-off between sample fidelity and variety
#### Self-Attention Generative Adversarial Network (SAGAN)
- Advantages
- allows attention-driven modeling for image generation where details are generated using cues from all feature locations
## GAN Applications for autonomous driving
### Advanced Data Augmentation
- GANs create realistic-looking images
- from a black-and-white image to a colored one
- aerial image to map
- edges to photo-realistic images of the sketched objects
- day to night
- summer to winter
- context-aware object placement
- However, the task is much more difficult because of the temporal information, which also has to remain consistent.
- CGAN, CycleGAN+UNIT, AC-GAN, ... (to be continued)
#### CGAN
- Function
- image-to-image translation, as one instance
- Method
- the adversarial loss is mixed with a traditional L1 loss
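
A minimal sketch of mixing an adversarial term with an L1 term follows. The weight `lam=100.0` and the non-saturating form of the adversarial loss are assumptions in the style of Pix2Pix, and images are treated as flat lists of pixel values for simplicity.

```python
import math

def l1_loss(generated, target):
    """Mean absolute pixel difference between two flattened images."""
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(target)

def generator_objective(d_fake_prob, generated, target, lam=100.0):
    """Adversarial term plus lam times the L1 reconstruction term."""
    adversarial = -math.log(d_fake_prob)   # non-saturating GAN loss
    return adversarial + lam * l1_loss(generated, target)

total = generator_objective(0.5, [0.0, 1.0, 0.5], [0.0, 0.5, 0.5])
```

The L1 term keeps the output close to the ground-truth image, while the adversarial term pushes it toward realistic detail.
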
#### CycleGAN
- Method
- Briefly, the authors learn a mapping $G : X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$. Because such a mapping is highly under-constrained, they couple it with an inverse mapping $F : Y \rightarrow X$ and introduce a cycle-consistency loss enforcing $F(G(X)) \approx X$, and vice versa.
- Advantages
- operates without specific supervision
- Disadvantages
- the generated images, after careful inspection, show the same artifacts as the previous work
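
The cycle-consistency idea from this section can be shown with scalar "images". This is only a sketch: the L1 form of the loss follows the usual CycleGAN setup, and the toy mappings `G`, `F_exact`, and `F_biased` are invented for illustration.

```python
def cycle_consistency_loss(xs, ys, G, F):
    """Mean L1 error of F(G(x)) vs x plus mean L1 error of G(F(y)) vs y."""
    forward = sum(abs(F(G(x)) - x) for x in xs) / len(xs)
    backward = sum(abs(G(F(y)) - y) for y in ys) / len(ys)
    return forward + backward

# Toy scalar "domains": G doubles, so the exact inverse halves.
G = lambda x: 2.0 * x
F_exact = lambda y: y / 2.0
F_biased = lambda y: y / 2.0 + 1.0   # an imperfect inverse

perfect = cycle_consistency_loss([1.0, 2.0], [4.0, 6.0], G, F_exact)
imperfect = cycle_consistency_loss([1.0, 2.0], [4.0, 6.0], G, F_biased)
```

The loss is zero only when the two mappings invert each other, which is the constraint that makes the unpaired problem less under-determined.
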
#### Synthesis
- visual perception
- fix noisy input
#### 2D Synthesis
- Image-to-image translation can be approached along two main axes: paired or unpaired; unimodal or multimodal
- unimodal paired image translation
- Characteristics
- the model learns to map images where the training data is organized in pairs of input and output samples
- In many cases, paired training data may not be available.
- Examples
- Pix2Pix
- SRGAN
- unimodal unpaired
- Characteristics
- the image translation is conducted on unpaired data from two domains, where it learns a mapping between the two domains without supervision
- Examples
- CycleGAN
- DiscoGAN
- StarGAN
- UNIT
- multimodal image translation
- Characteristics
- generates several images of different styles based on a single source image
- Examples
- Pix2PixHD
- BicycleGAN
- unpaired MUNIT
- Augmented GAN
#### 3D Synthesis
- Purpose
- LiDAR can perceive accurate depth and produce 3D point clouds
- Most GAN approaches are not applicable to 3D point clouds
- Examples
- Point Cloud GAN (PC-GAN)
- proposed a two-fold modification to the GAN algorithm for learning to generate point clouds
- 3D-GAN framework
- maps from a low-dimensional probabilistic space to the space of 3D objects
- PrGAN
- investigated the task of generating a distribution over 3D structures given 2D views of multiple objects taken from unknown viewpoints
#### Video Synthesis
- Purpose
- create new interactive 3D virtual worlds for different domains
- Examples
- Temporal GAN (TGAN)
- learns a semantic representation of unlabeled videos and generates videos, using a temporal generator and an image generator
- an early GAN for video used a spatio-temporal convolutional architecture
- the scene foreground is separated from the background, and short one-second videos are generated
#### Domain adaptation from simulation to real
- Purpose
- using simulated environments enables much easier data collection
- models trained in simulated environments often fail to generalize to real environments
- GraspGAN
- Function
- extended pixel-level domain adaptation to reduce the number of real-world samples needed by up to 50 times for a vision-based grasping system
- Reinforcement Learning
- Method
- Two image-to-image translation networks are used
- The first network translates virtual images to their segmentation; the second network translates segmented images into their realistic counterparts
#### Object Detection
- Purpose
- inferring occluded objects is essential for scene understanding and decision making
- SeGAN
- Function
- an approach for both segmentation and generation of the occluded parts of objects
- Method
- the proposed network has three parts: segmentor, generator, and discriminator
- Perceptual-GAN
- Function
- narrows the representation difference between small and large objects
- Method
- the generator learns to transfer the representations of small objects so that they resemble those of large ones
#### Super Resolution
- Purpose
- enable and enhance the systems that were trained on high resolution inputs
- SRGAN
- able to infer photo-realistic natural images for 4x upscaling factors
#### Inpainting
- Purpose
- sensors may read noisy data or may suffer from failures causing incomplete readings, and inpainting can provide a solution
### Semi-supervised/Unsupervised Learning
- $\alpha$-GAN
- Method
- combines VAE (Variational Autoencoders) and GAN
- VAE $\rightarrow$ one of the most popular approaches to unsupervised learning
- Purpose
- the best of both worlds is used, and the limitations of both methods are mitigated
- (the paper does not say which strengths and weaknesses complement each other)
- unsupervised pre-training is beneficial for deep learning in general
### Learned Loss Functions
- DAN
- Purpose
- semi-supervised learning and loss function learning
- Method
- uses two discriminators
- Predictor (P): receives a data point x as input and outputs a prediction p(x)
- Judge (J): receives a data point x together with a label y and produces a single scalar J(x, y) representing the probability that the pair (x, y) came from the labeled training data rather than being predicted by P
- Characteristics
- P does not make use of labels, so semi-supervised learning is straightforward within this framework
- concentrates on learning loss functions for discriminative models
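
The Predictor/Judge setup above can be read as a GAN-style game in which J plays the role of a learned loss for P. The sketch below assumes a standard binary cross-entropy form for both objectives; this is an assumption for illustration, not necessarily the exact objective used in the DAN paper, and all names are hypothetical.

```python
import math

def judge_loss(j_labeled, j_predicted):
    """J should score true (x, y) pairs near 1 and (x, P(x)) pairs near 0."""
    real = sum(math.log(p) for p in j_labeled) / len(j_labeled)
    fake = sum(math.log(1.0 - p) for p in j_predicted) / len(j_predicted)
    return -(real + fake)

def predictor_loss(j_predicted):
    """P is rewarded when J believes its predictions are true labels."""
    return -sum(math.log(p) for p in j_predicted) / len(j_predicted)

confident_judge = judge_loss([0.9, 0.9], [0.1, 0.1])
fooled_judge = judge_loss([0.5, 0.5], [0.5, 0.5])
```

Because `predictor_loss` only needs J's score on (x, P(x)), it can be evaluated on unlabeled inputs, which is what makes the semi-supervised case straightforward.
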
### Adversarial training/testing
- Characteristics
- adversarial attacks weaken the performance of CNNs by adding noise
- can also be interpreted as loss function learning
- adversarial loss can be used to improve the robustness of the final classifier
- EL-GAN
- Since there are very stringent safety requirements in autonomous driving (AD), adversarial example generation might be used as a tool for testing corner cases and robustness
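
To make "attacks by addition of noise" concrete, here is a small sketch of the fast gradient sign method (FGSM). FGSM is not named in these notes and is swapped in as a well-known example; the logistic-regression classifier, its weights, and `eps` are toy assumptions standing in for a CNN.

```python
import math

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression classifier."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    """Shift x by eps in the sign of the loss gradient with respect to x.

    For cross-entropy loss the input gradient is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1                 # correctly classified as class 1
x_adv = fgsm(w, b, x, y, eps=0.6)    # perturbed copy that flips the decision
```

The same perturbed inputs could serve as corner-case tests or as extra training data for robustness.
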
## Our Results
- Problem
- Image deterioration by soiling and adverse weather is caused by the presence of various “soiling categories”, so the image quality has to be enhanced
- obtaining the relevant data is both very problematic and expensive
- Method
- CycleGAN
- sorted our images into two categories: clean and soiled
- recognize which parts of the image are soiled
- desoiling
- “desoiling” generator
- learned to introduce shadow of the car body to the image
- the vast majority of images in the “clean” category contained shadow of the car body
- “soiling” generator
- learned that the weather was usually cloudy on our images from the “soiled” category.
- MUNIT
- ability to split content from the style, which would help our intention to possess the control over generated images and therefore ease the further classifiers training.
- Fail
## Discussion
- discuss the main challenges of GAN
### Quantitative Evaluation
- Approach
- the evaluation of generative models is based on the model likelihood
- Methods
- Inception Score (IS)
- the conditional label distribution of samples containing meaningful objects should have low entropy, and the variability of the samples should be high
- Advantages
- well correlated with scores from human annotators
- Disadvantages
- IS is found to be insensitive to the prior distribution over labels
- Fréchet Inception Distance (FID)
- Method
1. the samples are embedded into a feature space given by a specific layer of the Inception Net
2. the embeddings are modeled as a continuous multivariate Gaussian distribution
3. the mean and covariance are estimated for both the generated and the real data, and the Fréchet distance between the two Gaussians is evaluated
- Advantages
- FID has been shown to be consistent with human judgment
- FID can detect intra-class mode dropping
- Comparison
- IS mainly captures precision
- FID captures both precision and recall
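
Both scores can be sketched on toy inputs. The Inception Score below uses hand-written class probabilities in place of Inception Net outputs, and the Fréchet distance is shown only for the 1-D Gaussian case; the full FID uses mean vectors and covariance matrices of Inception features. All inputs are illustrative assumptions.

```python
import math

def inception_score(cond_probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y).

    cond_probs: one class-probability list per generated sample.
    """
    n, k = len(cond_probs), len(cond_probs[0])
    marginal = [sum(p[j] for p in cond_probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in cond_probs:
        kl_sum += sum(pj * math.log(pj / marginal[j])
                      for j, pj in enumerate(p) if pj > 0)
    return math.exp(kl_sum / n)

def frechet_distance_1d(mu1, var1, mu2, var2):
    """Fréchet distance between two 1-D Gaussians (scalar case of FID)."""
    return (mu1 - mu2) ** 2 + var1 + var2 - 2.0 * math.sqrt(var1 * var2)

sharp_and_diverse = inception_score([[1.0, 0.0], [0.0, 1.0]])
uninformative = inception_score([[0.5, 0.5], [0.5, 0.5]])
same_stats = frechet_distance_1d(0.0, 1.0, 0.0, 1.0)
shifted = frechet_distance_1d(0.0, 1.0, 3.0, 4.0)
```

A higher IS is better (confident and varied predictions), while a lower FID is better (generated statistics close to the real ones).
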
### Adversarial examples and Safety
- Problem
- adversarial examples are inputs to machine learning models that have been intentionally modified to fool the model
- Defensive Distillation mechanism
- Method
- trains a model whose surface is smoothed in the directions an attacker will typically try to exploit
- Purpose
- making it difficult to discover adversarial examples
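
Defensive distillation relies on a softmax with a temperature parameter to produce softened class probabilities. The sketch below shows only that building block, not the full two-stage training; the logits and the temperature `T=20.0` are arbitrary illustrative values.

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Softmax over logits / T; a larger T gives a flatter distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [8.0, 2.0, 0.0]
hard = softmax_with_temperature(logits, T=1.0)    # nearly one-hot
soft = softmax_with_temperature(logits, T=20.0)   # much flatter
```

Training a second model on such softened outputs is what smooths the decision surface.
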
### Optimization Stability
- gradient-based training is designed to minimize a loss function, whereas GAN training seeks an equilibrium between two networks
- Methods
- minibatch discrimination
- identifying the problem with Kullback-Leibler (KL) divergence minimization: for distributions supported by low-dimensional manifolds, KL is not defined or is simply infinite, so a different distance function is proposed (the earth-mover, or Wasserstein, distance)
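
The KL problem described above can be seen on a tiny discrete example. This sketch assumes distributions on a unit-spaced 1-D grid, where the earth-mover distance reduces to a sum of absolute CDF differences; the grid and the point masses are made up for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; infinite if q misses mass of p."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def wasserstein_1d(p, q):
    """Earth-mover distance on a unit-spaced 1-D grid: sum of |CDF_p - CDF_q|."""
    dist, cdf_gap = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        dist += abs(cdf_gap)
    return dist

# Point masses on a 5-bin grid with disjoint support.
p = [1.0, 0.0, 0.0, 0.0, 0.0]
q_far = [0.0, 0.0, 0.0, 0.0, 1.0]
q_near = [0.0, 1.0, 0.0, 0.0, 0.0]
```

KL is infinite for both `q_far` and `q_near`, so it gives no signal about which one is closer to `p`, while the Wasserstein distance shrinks as the mass moves closer.
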
## Conclusions
- GANs have the potential for high impact in autonomous driving applications
- the paper discusses the main challenges and open problems that have to be resolved for GANs to be used more practically
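Shall we go through it section by section and discuss how to present it?? Mm-hm
Which presentation software should we use? office google hackmd icloud?
Good question
I'm actually not sure whether the teacher wants us to learn something from the paper with this presentation <- secondary
or wants us to practice presenting <- I think it's this one
or both~~
If it's for practicing presenting, making slides seems a bit better. google. Mm-hm, sure
If it's for learning from the paper, just present directly from hackmd <-- otherwise hackmd also works <---- just make a key-points summary version by trimming this note
That works too; last time it was faster than office-type tools, because there is no need to prepare background images and such
OK
I'll open a template first, then we can see whether to use it
office google hackmd icloud: these are all presentation tools or have a presentation feature
Here: https://hackmd.io/@NtutShare/SJ34R9EPw/edit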