# Recommended Papers
## 1. Background and Foundations of Adversarial Robustness
### 1.1 Intriguing properties of neural networks
## 2. Adversarial Attack and Training
### 2.1 Explaining and Harnessing Adversarial Examples (ICLR, 2015)
#### Key Insights
* Adversarial examples can be explained as a property of high-dimensional dot products. They are a result of models being too **linear**, rather than too nonlinear.
* The generalization of adversarial examples across different models can be explained as a result of adversarial perturbations being highly aligned with the **weight vectors** of a model, and different models learning similar functions when trained to perform the same task.
* The **direction of perturbation**, rather than the specific point in space, matters most. Space is not full of pockets of adversarial examples that finely tile the reals like the rational numbers.
* Because it is the direction that matters most, adversarial perturbations generalize across different clean examples.
* The paper introduces a family of fast methods for generating adversarial examples.
* Adversarial training can provide regularization, even more regularization than dropout.
* Control experiments failed to reproduce this regularization effect with simpler but less efficient regularizers, including L1 weight decay and adding noise.
* Models that are easy to optimize are easy to perturb.
* Linear models lack the capacity to resist adversarial perturbation; only structures with a hidden layer (where the **universal approximator theorem** applies) should be trained to resist adversarial perturbation.
* RBF networks are resistant to adversarial examples.
* Models trained to model the input distribution are not resistant to adversarial examples.
* Ensembles are not resistant to adversarial examples.
#### Methods
- Generating adversarial examples with the Fast Gradient Sign Method (FGSM); a code sketch follows this list:
$$\eta = \epsilon \cdot \operatorname{sign}\big(\nabla_x J(\theta, x, y)\big)$$
- Adversarial training against adversarial examples generated by FGSM; a training-step sketch also follows this list:
$$\widetilde{J}(\theta, x, y) = \alpha\, J(\theta, x, y) + (1-\alpha)\, J\big(\theta,\; x + \epsilon \cdot \operatorname{sign}(\nabla_x J(\theta, x, y)),\; y\big)$$
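
A minimal PyTorch sketch of FGSM, assuming a differentiable classifier `model` and cross-entropy as the loss $J$; the function name `fgsm_perturbation` and the sample $\epsilon$ are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_perturbation(model, x, y, epsilon):
    """Return eta = epsilon * sign(grad_x J(theta, x, y)), with J taken as cross-entropy."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)        # J(theta, x, y)
    grad_x = torch.autograd.grad(loss, x)[0]   # grad_x J(theta, x, y)
    return (epsilon * grad_x.sign()).detach()

# Illustrative usage: x_adv = x + fgsm_perturbation(model, x, y, epsilon=0.25)
```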
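
A corresponding training-step sketch for the mixed objective $\widetilde{J}$, again assuming a PyTorch `model` and `optimizer`; $\alpha = 0.5$ follows the paper's default, while $\epsilon = 0.25$ is just a placeholder value.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.25, alpha=0.5):
    # Gradient of the clean loss w.r.t. the input, used to build the FGSM example.
    x_req = x.clone().detach().requires_grad_(True)
    clean_loss = F.cross_entropy(model(x_req), y)       # J(theta, x, y)
    grad_x = torch.autograd.grad(clean_loss, x_req)[0]
    x_adv = (x + epsilon * grad_x.sign()).detach()      # x + eps * sign(grad_x J)

    # Mixed objective: alpha * J(x, y) + (1 - alpha) * J(x_adv, y).
    optimizer.zero_grad()
    loss = alpha * F.cross_entropy(model(x), y) \
        + (1 - alpha) * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```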
#### Questions Left
- Whether it is better to perturb the input, the hidden layers, or both
- Trade-off: ease of optimization has come at the cost of models that are easily misled by adversarial examples
**Possible direction: developing optimization procedures that can train models whose behavior is more locally stable.**
### 2.2 Towards Evaluating the Robustness of Neural Networks
### 2.3 Towards Deep Learning Models Resistant to Adversarial Attacks
### 2.4 Theoretically Principled Trade-off between Robustness and Accuracy
## 3. Theoretical Understanding of Adversarial Examples
### 3.1 Robustness May Be at Odds with Accuracy
### 3.2 Adversarially Robust Generalization Requires More Data
### 3.3 Adversarial Examples Are Not Bugs, They Are Features
## 4. Randomized Smoothing
### 4.1 Certified Adversarial Robustness via Randomized Smoothing
### 4.2 Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
## 5. Lipschitz Networks
### 5.1 Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons
### 5.2 Boosting the Certified Robustness of L-infinity Distance Nets
## 6. Other Approaches in Certified Robustness
### 6.1 Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope