---
header-includes:
  - \usepackage{algorithm2e}
---

# Practical Adversarial Attacks on Spatiotemporal Traffic Forecasting Models. NeurIPS 2022.

## Abstract

* Existing traffic forecasting models assume a reliable and unbiased forecasting environment, which is not always guaranteed in practice.
* Investigate the vulnerability of spatiotemporal traffic forecasting models and propose a practical adversarial spatiotemporal attack framework.
* Extensive experiments show that the proposed framework achieves up to 67.8% performance degradation on baselines.

## Introduction

Injecting slight adversarial perturbations on a few randomly selected nodes can significantly degrade the traffic forecasting accuracy of the whole system.

![](https://i.imgur.com/XW6YULO.png)
<figcaption>Figure 1: An illustration of adversarial attacks against spatiotemporal forecasting models on the Bay Area traffic network in California; the data range from January 2017 to May 2017. (a) Adversarial attack on geo-distributed data. The malicious attacker may inject adversarial examples into a few randomly selected geo-distributed data sources (e.g., roadway sensors) to mislead the prediction of the whole traffic forecasting system. (b) Accuracy drop of victim nodes. By adding less than 50% traffic speed perturbations to 10% of the nodes, we observe a 60.4% accuracy drop of victim nodes in the morning peak hour. (c) Accuracy drop of neighboring nodes. Due to the information diffusion of spatiotemporal forecasting models, the adversarial attack also leads to up to a 47.23% accuracy drop for neighboring nodes.</figcaption>
<br></br>

Adversarial attacks have been extensively studied in various application domains, but two major challenges prevent applying existing adversarial attack strategies to spatiotemporal traffic forecasting.

### Limitations

* It is expensive and impractical to manipulate all data sources (hundreds of sensors and thousands of GPS devices). The attacker must identify a subset of salient victim nodes under a limited attack budget to maximize the attack impact.
* Most existing adversarial attack strategies focus on time-invariant label classification, whereas an adversarial attack against traffic forecasting aims to mislead the target model into making biased predictions of continuous "values".

### Solution

* Proposing a practical adversarial spatiotemporal attack framework that can disrupt the forecasting models.
* Devising an iterative gradient-guided method to estimate node saliency, which helps to identify a small time-dependent set of victim nodes.
* Proposing a spatiotemporal gradient descent scheme to guide the attack direction and generate real-valued adversarial traffic states.
* Covering various attack settings, i.e., white-box attack, grey-box attack, and black-box attack.
* Experimental studies on two real-world traffic datasets show that attacking 10% of the nodes in the traffic system can degrade the *MAE* from 1.975 to 6.132.
* Incorporating the generated adversarial examples into adversarial training can significantly improve the robustness of spatiotemporal traffic forecasting models.

## Background

### Traffic forecasting

$\mathcal{G}_{t} = (\mathcal{V}, \mathcal{E})$ denotes a traffic network at time step $t$, where $\mathcal{V}$ is a set of $n$ nodes and $\mathcal{E}$ is a set of edges. $\mathbf{X}_{t} = (\mathbf{x}_{1,t}, \mathbf{x}_{2,t}, \ldots , \mathbf{x}_{n,t})$ denotes the spatiotemporal features associated with $\mathcal{G}_{t}$, where $\mathbf{x}_{i,t} \in \mathbb{R}^{c}$ represents the $c$-dimensional time-varying traffic conditions of node $v_{i} \in \mathcal{V}$ at $t$.

$$
\tag{1} \hat{\mathbf{Y}}_{t+1:t+\tau} = f_{\theta}(\mathbf{H}_{t-T+1:t})
$$

where $\mathbf{H}_{t-T+1:t} = \{(\mathbf{X}_{t-T+1}, \mathcal{G}_{t-T+1}), \ldots, (\mathbf{X}_{t}, \mathcal{G}_{t})\}$ denotes the input traffic states of the previous $T$ time steps, and $f_{\theta}(\cdot)$ is the spatiotemporal traffic forecasting model parameterized by $\theta$. $\hat{\mathbf{Y}}_{t+1:t+\tau} = \{\hat{\mathbf{Y}}_{t+1}, \hat{\mathbf{Y}}_{t+2}, \ldots, \hat{\mathbf{Y}}_{t+\tau}\}$ are the predictions and $\mathbf{Y}_{t+1:t+\tau} = \{\mathbf{Y}_{t+1}, \mathbf{Y}_{t+2}, \ldots, \mathbf{Y}_{t+\tau}\}$ are the corresponding ground truths of $\mathbf{H}_{t-T+1:t}$.
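For concreteness, below is a minimal PyTorch sketch of the input/output contract in Eq. (1). The shapes and the stand-in linear model are illustrative assumptions, not the paper's implementation (the paper targets GraphWaveNet).

```python
import torch
import torch.nn as nn

# Illustrative shapes (assumptions, not the paper's exact configuration):
# T = 12 history steps, tau = 12 horizon steps, n = 325 sensors, c = 1 channel.
T, tau, n, c = 12, 12, 325, 1

# Stand-in for the forecasting model f_theta; any spatiotemporal model
# (e.g., GraphWaveNet) with the same input/output contract fits Eq. (1).
f_theta = nn.Sequential(nn.Flatten(1), nn.Linear(T * n * c, tau * n * c))

H = torch.randn(8, T, n, c)            # a batch of inputs H_{t-T+1:t}
Y_hat = f_theta(H).view(8, tau, n, c)  # predictions Y_hat_{t+1:t+tau}, Eq. (1)
```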
### Adversarial attack

An adversarial attack aims to mislead the model into deriving biased predictions by generating the optimal adversarial example

$$
\tag{2} x^{*} \in \operatorname{argmax}_{x'} \mathcal{L}(x', y ; \theta) \\
\text{s.t., } ||x' - x||_{p} \leq \varepsilon
$$

where $x'$ is the adversarial example with maximum perturbation bound $\varepsilon$ under the $L_{p}$ norm to guarantee that the perturbation is imperceptible to humans, and $y$ is the ground truth of the clean example $x$. For instance, the adversarial example in FGSM is crafted as

$$
x' = x + \varepsilon \operatorname{sign}(\nabla_{x} \mathcal{L}_{\text{CE}} (x, y; \theta))
$$

where $\operatorname{sign}(\cdot)$ is the signum function and $\mathcal{L}_{\text{CE}}(\cdot)$ is the cross-entropy loss.

Adversarial attacks can be categorized into three classes:

* White-box attack. The attacker can fully access the target model, including the model architecture, the model parameters, gradients, model outputs, the input traffic states, and the corresponding labels.
* Grey-box attack. The attacker can partially access the system, including the target model and the input traffic states, but not the labels.
* Black-box attack. The attacker can only access the input traffic states, query the outputs of the target model, or leverage a surrogate model to craft the adversarial examples.

### Adversarial attack against spatiotemporal traffic forecasting

The adversarial traffic state is defined as

$$
\tag{3} \mathbf{H}'_t=\left\{\left(\mathbf{X}'_t, \mathcal{G}_t\right):\left\|S_t\right\|_0 \leq \eta,\left\|\left(\mathbf{X}'_t-\mathbf{X}_t\right) \cdot S_t\right\|_p \leq \varepsilon\right\}
$$

where $S_{t} \in \{0,1\}^{n \times n}$ is a diagonal matrix whose $i$-th diagonal element indicates whether node $i$ is a victim node, and $\mathbf{X}'_{t}$ is the perturbed spatiotemporal feature, named the adversarial spatiotemporal feature. The attack objective is

$$
\tag{4} \max_{\substack{\mathbf{H}^{\prime}_{t-T+1:t},\\ t \in \mathcal{T}_{\text{test}}}} \sum_{t \in \mathcal{T}_{\text{test}}} \mathcal{L}\left(f_{\theta^{*}}\left(\mathbf{H}^{\prime}_{t-T+1:t}\right), \mathbf{Y}_{t+1:t+\tau}\right) \\
\text{s.t.,} \quad \theta^{*}=\arg \min_{\theta} \sum_{t \in \mathcal{T}_{\text{train}}} \mathcal{L}\left(f_{\theta}\left(\mathbf{H}_{t-T+1:t}\right), \mathbf{Y}_{t+1:t+\tau}\right)
$$

Since the ground truth (i.e., future traffic states) is unavailable at run time, the practical adversarial spatiotemporal attack primarily falls into the grey-box attack setting.
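As a concrete illustration of the FGSM update above, here is a minimal, hedged PyTorch sketch. The function name and arguments are mine; the loss is passed in so the same step works for cross-entropy (classification) or a regression loss such as MAE (forecasting).

```python
import torch

def fgsm_step(model, x, y, loss_fn, eps):
    """One FGSM step: x' = x + eps * sign(grad_x L(x, y; theta))."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)           # L(x, y; theta)
    grad, = torch.autograd.grad(loss, x_adv)  # grad_x L
    return (x_adv + eps * grad.sign()).detach()
```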
## Methodology

### Identify time-dependent victim nodes

One unique characteristic that distinguishes attacking spatiotemporal forecasting from conventional classification tasks is the inaccessibility of ground truth at the test phase. $\Rightarrow$ A surrogate label is used to guide the attack direction:

$$
\tag{5} \tilde{\mathbf{Y}}_{t+1: t+\tau}=g_{\phi}\left(\mathbf{H}_{t-T+1: t}\right)+\delta_{t+1: t+\tau}
$$

where $g_{\phi}(\cdot)$ is a generalized function (e.g., $\tanh(\cdot)$, $\sin(\cdot)$, $f_{\theta}(\cdot)$), and $\delta_{t+1: t+\tau}$ are random variables sampled from a probability distribution $\pi(\delta_{t+1: t+\tau})$ to increase the diversity of the attack direction. The function parameters $\phi$ are based on the pre-trained forecasting model parameters $\theta^{*}$, and $\delta_{t+1: t+\tau} \sim U(-\varepsilon/10, \varepsilon/10)$. Since the clean traffic states may not be instantly available at attack time, they are estimated as

$$
\tilde{\mathbf{H}}_{t} = g_{\varphi}(\mathbf{H}_{t-1})
$$

where $g_{\varphi}(\cdot)$ is the estimation function parameterized by $\varphi$. For simplicity, $\varphi$ is derived from the pre-trained traffic forecasting model $f_{\theta^{*}}(\cdot)$. With the surrogate traffic state label $\tilde{\mathbf{Y}}_{t+1: t+\tau}$, the time-dependent node saliency (**TDNS**) is derived for each node:

$$
\tag{6} \mathcal{M}_{t}=\left\|\sigma\left(\frac{\partial \mathcal{L}\left(f_{\theta}\left(\tilde{\mathbf{H}}_{t-T+1: t}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)}{\partial \tilde{\mathbf{X}}_{t-T+1: t}}\right)\right\|_{p}
$$

where $\mathcal{L}\left(f_{\theta}\left(\tilde{\mathbf{H}}_{t-T+1: t}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)$ is the loss function and $\sigma$ is the activation function. $\mathcal{M}_{t}$ reveals the node-wise loss impact under the same degree of perturbation. Note that $\mathcal{M}_{t}$ may vary with the time step $t$. *A similar idea has been adopted to identify static pixel saliency for image classification.*

To estimate the saliency, the adversarial traffic state is iteratively updated with the **gradient-based adversarial method** of [Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR 2018]:

$$
\tag{7} \mathbf{X}_{t-T+1: t}^{\prime(i)}=\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1: t}, \varepsilon}\left(\mathbf{X}_{t-T+1: t}^{\prime(i-1)}+\alpha \operatorname{sign}\left(\nabla \mathcal{L}\left(f_{\theta^*}\left(\mathbf{H}_{t-T+1: t}^{\prime(i-1)}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)\right)\right)
$$

where $\mathbf{H}'^{(i)}_{t-T+1:t}$ is the adversarial traffic state at the $i$-th iteration, $\alpha$ is the step size, and $\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1:t}, \varepsilon}(\cdot)$ is the projection operation which clips the spatiotemporal feature to the maximum perturbation bound $\varepsilon$. Note $\mathbf{H}'^{(0)}_{t-T+1:t} = \tilde{\mathbf{H}}_{t-T+1:t}$.

For each batch of data $\left\{\left(\tilde{\mathbf{H}}_{t-T+1: t}, \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)_{(j)}\right\}_{j=1}^\gamma$, the time-dependent node saliency gradient is derived by

$$
\tag{8} \mathbf{g}_{t}=\frac{1}{\gamma} \sum_j\left\{\frac{\partial \mathcal{L}\left(f_{\theta^*}\left(\tilde{\mathbf{H}}_{t-T+1: t}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)}{\partial \mathbf{X}_{t-T+1: t}^{\prime}}\right\}_{(j)}
$$

where $\gamma$ is the batch size. $\operatorname{ReLU}$ is used as the activation function to compute a non-negative saliency score for each time step:

$$
\tag{9} \mathcal{M}_{t} = ||\operatorname{ReLU}(\mathbf{g}_{t})||_{2}.
$$

The set of victim nodes $S_{t}$ is then selected based on $\mathcal{M}_{t}$:

$$
\tag{10} s_{(i, i), t}= \begin{cases}1 & \text { if } v_i \in \operatorname{Top}\left(\mathcal{M}_t, k\right) \\ 0 & \text { otherwise }\end{cases}
$$
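Below is a hedged PyTorch sketch of Eqs. (7)-(10): a few PGD-style steps against the surrogate labels, batch-averaged gradients (Eq. 8), the ReLU-and-norm saliency (Eq. 9), and top-$k$ victim selection (Eq. 10). All names and shapes are my assumptions, not the authors' code.

```python
import torch

def tdns_victim_mask(model, H_tilde, Y_tilde, loss_fn, eps, alpha, steps, k):
    """Estimate time-dependent node saliency (TDNS) and pick top-k victims.

    H_tilde: (batch, T, n, c) estimated traffic states H~_{t-T+1:t}
    Y_tilde: surrogate labels from Eq. (5)
    Returns a boolean mask of shape (n,) -- the diagonal of S_t in Eq. (10).
    """
    X_adv = H_tilde.clone().detach()           # X'^(0) = X~ (cf. note on Eq. 7)
    for _ in range(steps):                     # Eq. (7): iterative update
        X_adv.requires_grad_(True)
        loss = loss_fn(model(X_adv), Y_tilde)
        grad, = torch.autograd.grad(loss, X_adv)
        X_adv = X_adv.detach() + alpha * grad.sign()
        X_adv = H_tilde + (X_adv - H_tilde).clamp(-eps, eps)  # clip to eps-ball

    X_adv = X_adv.detach().requires_grad_(True)
    loss = loss_fn(model(X_adv), Y_tilde)
    grad, = torch.autograd.grad(loss, X_adv)   # d L / d X'
    g_t = grad.mean(dim=0)                     # Eq. (8): average over the batch
    M_t = torch.linalg.vector_norm(            # Eq. (9): per-node saliency
        torch.relu(g_t), ord=2, dim=(0, 2))
    mask = torch.zeros(M_t.shape[0], dtype=torch.bool)
    mask[M_t.topk(k).indices] = True           # Eq. (10): top-k victim nodes
    return mask
```

In the full attack described next (Eq. 11), the same gradient sign is additionally masked by $S_t$, so that only the selected victim nodes are perturbed.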
### Attack with adversarial traffic state

Based on the time-dependent victim node set, the adversarial attack on spatiotemporal traffic forecasting models is conducted via Spatiotemporal Projected Gradient Descent (**STPGD**):

$$
\tag{11} \mathbf{X}_{t-T+1: t}^{\prime(i)}=\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1: t}, \varepsilon}\left(\mathbf{X}_{t-T+1: t}^{\prime(i-1)}+\alpha \operatorname{sign}\left(\nabla \mathcal{L}\left(f_{\theta^*}\left(\mathbf{H}_{t-T+1: t}^{\prime(i-1)}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right) \cdot S_t\right)\right)
$$

where $\mathbf{H}^{\prime(i-1)}_{t-T+1:t}$ is the adversarial traffic state at the $(i-1)$-th iteration of the iterative gradient descent, $\alpha$ is the step size, and $\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1:t}, \varepsilon}(\cdot)$ is the operation that bounds the adversarial features in an $\varepsilon$-ball. Note $\mathbf{X}^{\prime(0)}_{t}=\tilde{\mathbf{X}}_{t}$.

In the testing phase, the adversarial traffic states are injected:

$$
\mathbf{H}'_{t-T+1:t} = \mathbf{H}_{t-T+1:t} + \Delta \mathbf{H}'_{t-T+1:t}
$$

where

$$
\mathbf{H}_{t} + \Delta \mathbf{H}'_{t} = \left\{\left(\left(\mathbf{X}'_{t}-\mathbf{X}_{t}\right) \cdot S_{t}+\mathbf{X}_{t}, \mathcal{G}_{t}\right)\right\} \in \mathbf{H}'_{t-T+1:t}
$$

and

$$
\Delta \mathbf{H}'_{t} = \left\{\left(\left(\mathbf{X}'_{t}-\mathbf{X}_{t}\right) \cdot S_{t}, 0\right): ||S_{t}||_{0} \leq \eta, \left|\left|\left(\mathbf{X}'_{t}-\mathbf{X}_{t}\right) \cdot S_{t}\right|\right|_{p} \leq \varepsilon \right\} \in \Delta\mathbf{H}'_{t-T+1:t}
$$

**White-box attack**. Since the adversaries can fully access the data and labels under the white-box setting, the real ground-truth traffic states are directly used to guide the generation of adversarial traffic states.

**Black-box attack**. The most restrictive black-box setting assumes limited accessibility to the target model and labels. Therefore, a surrogate model is first built, which can be learned from the training data. Then, adversarial traffic states generated from the surrogate model are used to attack the target traffic forecasting model.

![](https://i.imgur.com/nncmjaW.png)
![](https://i.imgur.com/HAi2aQm.png)

## Experiments

### Experimental setup

#### Datasets

1. PEMS-BAY
2. METR-LA

The first 70% of each dataset is used for training, the following 10% for validation, and the remaining 20% for testing.

#### Target model

* GraphWaveNet

#### Evaluation metrics

Global and local effects:

$$
\mathbb{E}_{t \in {\mathcal{T}}_{\text{test}}}\mathcal{L}\left(f_{\theta}\left(\mathbf{H}'_{t-T+1:t}\right), \mathbf{Y}_{t+1:t+\tau}\right),\\
\mathbb{E}_{t \in {\mathcal{T}}_{\text{test}}}\mathcal{L}\left(f_{\theta}\left(\mathbf{H}'_{t-T+1:t}\right), f_{\theta}\left(\mathbf{H}_{t-T+1:t}\right)\right),
$$

where $\mathcal{L}(\cdot)$ is a user-defined loss function:

* Mean Absolute Error (*MAE*)
* Root Mean Square Error (*RMSE*)

### Overall attack performance

The attack achieves (67.79%, 62.31%) and (19.88%, 14.55%) global performance degradation compared with the original forecasting results on the PEMS-BAY and METR-LA datasets, respectively.
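For completeness, here is a minimal sketch of the two evaluation views from the "Evaluation metrics" subsection above, with MAE as the user-defined loss $\mathcal{L}$; the function and variable names are illustrative assumptions.

```python
import torch

def attack_effect_mae(model, H_adv, H_clean, Y):
    """MAE instances of the two metrics above.

    First:  L(f(H'), Y)    -- error of perturbed predictions vs. ground truth.
    Second: L(f(H'), f(H)) -- deviation of perturbed vs. clean predictions.
    RMSE is analogous, via ((.) ** 2).mean().sqrt().
    """
    with torch.no_grad():
        Y_adv, Y_clean = model(H_adv), model(H_clean)
        return (Y_adv - Y).abs().mean(), (Y_adv - Y_clean).abs().mean()
```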