---
header-includes:
  - \usepackage{algorithm2e}
---

# Practical Adversarial Attacks on Spatiotemporal Traffic Forecasting Models. NeurIPS 2022.

## Abstract

* Existing traffic forecasting models assume a reliable and unbiased forecasting environment, which is not always guaranteed in practice.
* Investigate the vulnerability of spatiotemporal traffic forecasting models and propose a practical adversarial spatiotemporal attack framework.
* Extensive experiments show that the proposed framework achieves up to 67.8% performance degradation on baselines.

## Introduction

Injecting slight adversarial perturbations on a few randomly selected nodes can significantly degrade the traffic forecasting accuracy of the whole system.

![](https://i.imgur.com/XW6YULO.png)
<figcaption>Figure 1: An illustration of adversarial attacks against spatiotemporal forecasting models on the Bay Area traffic network in California; the data range from January 2017 to May 2017. (a) Adversarial attack on geo-distributed data. The malicious attacker may inject adversarial examples into a few randomly selected geo-distributed data sources (e.g., roadway sensors) to mislead the prediction of the whole traffic forecasting system. (b) Accuracy drop of victim nodes. By adding less than 50% traffic speed perturbations to 10% of the nodes, we observe a 60.4% accuracy drop of victim nodes in the morning peak hour. (c) Accuracy drop of neighboring nodes. Due to the information diffusion of spatiotemporal forecasting models, the adversarial attack also leads to up to a 47.23% accuracy drop for neighboring nodes.</figcaption>
<br></br>

Adversarial attacks have been extensively studied in various application domains, but two major challenges prevent applying existing adversarial attack strategies to spatiotemporal traffic forecasting.

### Limitations

* It is expensive and impractical to manipulate all data sources (hundreds of sensors and thousands of GPS devices). The attacker must identify a subset of salient victim nodes under a limited attack budget to maximize the attack impact.
* Most existing adversarial attack strategies focus on time-invariant label classification, whereas an adversarial attack against traffic forecasting aims to mislead the target model into making biased predictions of continuous "values".

### Solution

* Proposing a practical adversarial spatiotemporal attack framework that can disrupt the forecasting models.
* Devising an iterative gradient-guided method to estimate node saliency, which helps to identify a small time-dependent set of victim nodes.
* Proposing a spatiotemporal gradient descent scheme to guide the attack direction and generate real-valued adversarial traffic states.
* Covering various attack settings, i.e., white-box attack, grey-box attack, and black-box attack.
* Experimental studies on two real-world traffic datasets show that attacking 10% of the nodes in the traffic system can degrade the *MAE* from 1.975 to 6.132.
* Incorporating the generated adversarial examples into adversarial training can significantly improve the robustness of spatiotemporal traffic forecasting models.

## Background

### Traffic forecasting

$\mathcal{G}_{t} = (\mathcal{V}, \mathcal{E})$ denotes a traffic network at time step $t$, where $\mathcal{V}$ is a set of $n$ nodes and $\mathcal{E}$ is a set of edges. $\mathbf{X}_{t} = (\mathbf{x}_{1,t}, \mathbf{x}_{2,t}, \ldots , \mathbf{x}_{n,t})$ denotes the spatiotemporal features associated with $\mathcal{G}_{t}$, where $\mathbf{x}_{i,t} \in \mathbb{R}^{c}$ represents the $c$-dimensional time-varying traffic conditions of node $v_{i} \in \mathcal{V}$ at $t$.

$$
\tag{1} \hat{\mathbf{Y}}_{t+1:t+\tau} = f_{\theta}(\mathbf{H}_{t-T+1:t})
$$

where $\mathbf{H}_{t-T+1:t} = \{(\mathbf{X}_{t-T+1}, \mathcal{G}_{t-T+1}), \ldots, (\mathbf{X}_{t}, \mathcal{G}_{t})\}$ denotes the input traffic states of the previous $T$ time steps, and $f_{\theta}(\cdot)$ is the spatiotemporal traffic forecasting model parameterized by $\theta$. $\hat{\mathbf{Y}}_{t+1:t+\tau} = \{\hat{\mathbf{Y}}_{t+1}, \hat{\mathbf{Y}}_{t+2}, \ldots, \hat{\mathbf{Y}}_{t+\tau}\}$ are the predictions and $\mathbf{Y}_{t+1:t+\tau} = \{\mathbf{Y}_{t+1}, \mathbf{Y}_{t+2}, \ldots, \mathbf{Y}_{t+\tau}\}$ are the corresponding ground truths of $\mathbf{H}_{t-T+1:t}$.
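For concreteness, below is a minimal PyTorch sketch of the input/output contract in Eq. (1). The shapes and the stand-in linear model are illustrative assumptions, not the paper's implementation (the paper targets GraphWaveNet).

```python
import torch
import torch.nn as nn

# Illustrative shapes (assumptions, not the paper's exact configuration):
# T = 12 history steps, tau = 12 horizon steps, n = 325 sensors, c = 1 channel.
T, tau, n, c = 12, 12, 325, 1

# Stand-in for the forecasting model f_theta; any spatiotemporal model
# (e.g., GraphWaveNet) with the same input/output contract fits Eq. (1).
f_theta = nn.Sequential(nn.Flatten(1), nn.Linear(T * n * c, tau * n * c))

H = torch.randn(8, T, n, c)            # a batch of inputs H_{t-T+1:t}
Y_hat = f_theta(H).view(8, tau, n, c)  # predictions Y_hat_{t+1:t+tau}, Eq. (1)
```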
### Adversarial attack

An adversarial attack aims to mislead the model into deriving biased predictions by generating the optimal adversarial example

$$
\tag{2} x^{*} \in \operatorname{argmax}_{x'} \mathcal{L}(x', y ; \theta) \\
\text{s.t., } ||x' - x||_{p} \leq \varepsilon
$$

where $x'$ is the adversarial example with maximum perturbation bound $\varepsilon$ under the $L_{p}$ norm to guarantee that the perturbation is imperceptible to humans, and $y$ is the ground truth of the clean example $x$. For instance, the adversarial example in FGSM is crafted as

$$
x' = x + \varepsilon \operatorname{sign}(\nabla_{x} \mathcal{L}_{\text{CE}} (x, y; \theta))
$$

where $\operatorname{sign}(\cdot)$ is the signum function and $\mathcal{L}_{\text{CE}}(\cdot)$ is the cross-entropy loss.

Adversarial attacks can be categorized into three classes:

* White-box attack. The attacker can fully access the target model, including the model architecture, the model parameters, gradients, model outputs, the input traffic states, and the corresponding labels.
* Grey-box attack. The attacker can partially access the system, including the target model and the input traffic states, but not the labels.
* Black-box attack. The attacker can only access the input traffic states, query the outputs of the target model, or leverage a surrogate model to craft the adversarial examples.

### Adversarial attack against spatiotemporal traffic forecasting

The adversarial traffic state is defined as

$$
\tag{3} \mathbf{H}'_t=\left\{\left(\mathbf{X}'_t, \mathcal{G}_t\right):\left\|S_t\right\|_0 \leq \eta,\left\|\left(\mathbf{X}'_t-\mathbf{X}_t\right) \cdot S_t\right\|_p \leq \varepsilon\right\}
$$

where $S_{t} \in \{0,1\}^{n \times n}$ is a diagonal matrix whose $i$-th diagonal element indicates whether node $i$ is a victim node, and $\mathbf{X}'_{t}$ is the perturbed spatiotemporal feature, named the adversarial spatiotemporal feature. The attack objective is

$$
\tag{4} \max_{\substack{\mathbf{H}^{\prime}_{t-T+1:t},\\ t \in \mathcal{T}_{\text{test}}}} \sum_{t \in \mathcal{T}_{\text{test}}} \mathcal{L}\left(f_{\theta^{*}}\left(\mathbf{H}^{\prime}_{t-T+1:t}\right), \mathbf{Y}_{t+1:t+\tau}\right) \\
\text{s.t.,} \quad \theta^{*}=\arg \min_{\theta} \sum_{t \in \mathcal{T}_{\text{train}}} \mathcal{L}\left(f_{\theta}\left(\mathbf{H}_{t-T+1:t}\right), \mathbf{Y}_{t+1:t+\tau}\right)
$$

Since the ground truth (i.e., future traffic states) is unavailable at run time, the practical adversarial spatiotemporal attack primarily falls into the grey-box attack setting.
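As a concrete illustration of the FGSM update above, here is a minimal, hedged PyTorch sketch. The function name and arguments are mine; the loss is passed in so the same step works for cross-entropy (classification) or a regression loss such as MAE (forecasting).

```python
import torch

def fgsm_step(model, x, y, loss_fn, eps):
    """One FGSM step: x' = x + eps * sign(grad_x L(x, y; theta))."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)           # L(x, y; theta)
    grad, = torch.autograd.grad(loss, x_adv)  # grad_x L
    return (x_adv + eps * grad.sign()).detach()
```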
## Methodology

### Identify time-dependent victim nodes

One unique characteristic that distinguishes attacking spatiotemporal forecasting from conventional classification tasks is the inaccessibility of ground truth at the test phase. $\Rightarrow$ A surrogate label is used to guide the attack direction:

$$
\tag{5} \tilde{\mathbf{Y}}_{t+1: t+\tau}=g_{\phi}\left(\mathbf{H}_{t-T+1: t}\right)+\delta_{t+1: t+\tau}
$$

where $g_{\phi}(\cdot)$ is a generalized function (e.g., $\tanh(\cdot)$, $\sin(\cdot)$, $f_{\theta}(\cdot)$), and $\delta_{t+1: t+\tau}$ are random variables sampled from a probability distribution $\pi(\delta_{t+1: t+\tau})$ to increase the diversity of the attack direction. The function parameters $\phi$ are based on the pre-trained forecasting model parameters $\theta^{*}$, and $\delta_{t+1: t+\tau} \sim U(-\varepsilon/10, \varepsilon/10)$. Since the clean traffic states may not be instantly available at attack time, they are estimated as

$$
\tilde{\mathbf{H}}_{t} = g_{\varphi}(\mathbf{H}_{t-1})
$$

where $g_{\varphi}(\cdot)$ is the estimation function parameterized by $\varphi$. For simplicity, $\varphi$ is derived from the pre-trained traffic forecasting model $f_{\theta^{*}}(\cdot)$. With the surrogate traffic state label $\tilde{\mathbf{Y}}_{t+1: t+\tau}$, the time-dependent node saliency (**TDNS**) is derived for each node:

$$
\tag{6} \mathcal{M}_{t}=\left\|\sigma\left(\frac{\partial \mathcal{L}\left(f_{\theta}\left(\tilde{\mathbf{H}}_{t-T+1: t}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)}{\partial \tilde{\mathbf{X}}_{t-T+1: t}}\right)\right\|_{p}
$$

where $\mathcal{L}\left(f_{\theta}\left(\tilde{\mathbf{H}}_{t-T+1: t}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)$ is the loss function and $\sigma$ is the activation function. $\mathcal{M}_{t}$ reveals the node-wise loss impact under the same degree of perturbation. Note that $\mathcal{M}_{t}$ may vary with the time step $t$. *A similar idea has been adopted to identify static pixel saliency for image classification.*

To estimate the saliency, the adversarial traffic state is iteratively updated with the **gradient-based adversarial method** of [Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR 2018]:

$$
\tag{7} \mathbf{X}_{t-T+1: t}^{\prime(i)}=\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1: t}, \varepsilon}\left(\mathbf{X}_{t-T+1: t}^{\prime(i-1)}+\alpha \operatorname{sign}\left(\nabla \mathcal{L}\left(f_{\theta^*}\left(\mathbf{H}_{t-T+1: t}^{\prime(i-1)}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)\right)\right)
$$

where $\mathbf{H}'^{(i)}_{t-T+1:t}$ is the adversarial traffic state at the $i$-th iteration, $\alpha$ is the step size, and $\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1:t}, \varepsilon}(\cdot)$ is the projection operation which clips the spatiotemporal feature to the maximum perturbation bound $\varepsilon$. Note $\mathbf{H}'^{(0)}_{t-T+1:t} = \tilde{\mathbf{H}}_{t-T+1:t}$.

For each batch of data $\left\{\left(\tilde{\mathbf{H}}_{t-T+1: t}, \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)_{(j)}\right\}_{j=1}^\gamma$, the time-dependent node saliency gradient is derived by

$$
\tag{8} \mathbf{g}_{t}=\frac{1}{\gamma} \sum_j\left\{\frac{\partial \mathcal{L}\left(f_{\theta^*}\left(\tilde{\mathbf{H}}_{t-T+1: t}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right)}{\partial \mathbf{X}_{t-T+1: t}^{\prime}}\right\}_{(j)}
$$

where $\gamma$ is the batch size. $\operatorname{ReLU}$ is used as the activation function to compute a non-negative saliency score for each time step:

$$
\tag{9} \mathcal{M}_{t} = ||\operatorname{ReLU}(\mathbf{g}_{t})||_{2}.
$$

The set of victim nodes $S_{t}$ is then selected based on $\mathcal{M}_{t}$:

$$
\tag{10} s_{(i, i), t}= \begin{cases}1 & \text { if } v_i \in \operatorname{Top}\left(\mathcal{M}_t, k\right) \\ 0 & \text { otherwise }\end{cases}
$$
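Below is a hedged PyTorch sketch of Eqs. (7)-(10): a few PGD-style steps against the surrogate labels, batch-averaged gradients (Eq. 8), the ReLU-and-norm saliency (Eq. 9), and top-$k$ victim selection (Eq. 10). All names and shapes are my assumptions, not the authors' code.

```python
import torch

def tdns_victim_mask(model, H_tilde, Y_tilde, loss_fn, eps, alpha, steps, k):
    """Estimate time-dependent node saliency (TDNS) and pick top-k victims.

    H_tilde: (batch, T, n, c) estimated traffic states H~_{t-T+1:t}
    Y_tilde: surrogate labels from Eq. (5)
    Returns a boolean mask of shape (n,) -- the diagonal of S_t in Eq. (10).
    """
    X_adv = H_tilde.clone().detach()           # X'^(0) = X~ (cf. note on Eq. 7)
    for _ in range(steps):                     # Eq. (7): iterative update
        X_adv.requires_grad_(True)
        loss = loss_fn(model(X_adv), Y_tilde)
        grad, = torch.autograd.grad(loss, X_adv)
        X_adv = X_adv.detach() + alpha * grad.sign()
        X_adv = H_tilde + (X_adv - H_tilde).clamp(-eps, eps)  # clip to eps-ball

    X_adv = X_adv.detach().requires_grad_(True)
    loss = loss_fn(model(X_adv), Y_tilde)
    grad, = torch.autograd.grad(loss, X_adv)   # d L / d X'
    g_t = grad.mean(dim=0)                     # Eq. (8): average over the batch
    M_t = torch.linalg.vector_norm(            # Eq. (9): per-node saliency
        torch.relu(g_t), ord=2, dim=(0, 2))
    mask = torch.zeros(M_t.shape[0], dtype=torch.bool)
    mask[M_t.topk(k).indices] = True           # Eq. (10): top-k victim nodes
    return mask
```

In the full attack described next (Eq. 11), the same gradient sign is additionally masked by $S_t$, so that only the selected victim nodes are perturbed.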
### Attack with adversarial traffic state

Based on the time-dependent victim node set, the adversarial attack on spatiotemporal traffic forecasting models is conducted via Spatiotemporal Projected Gradient Descent (**STPGD**):

$$
\tag{11} \mathbf{X}_{t-T+1: t}^{\prime(i)}=\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1: t}, \varepsilon}\left(\mathbf{X}_{t-T+1: t}^{\prime(i-1)}+\alpha \operatorname{sign}\left(\nabla \mathcal{L}\left(f_{\theta^*}\left(\mathbf{H}_{t-T+1: t}^{\prime(i-1)}\right), \tilde{\mathbf{Y}}_{t+1: t+\tau}\right) \cdot S_t\right)\right)
$$

where $\mathbf{H}^{\prime(i-1)}_{t-T+1:t}$ is the adversarial traffic state at the $(i-1)$-th iteration of the iterative gradient descent, $\alpha$ is the step size, and $\operatorname{clip}_{\mathbf{X}^{\prime}_{t-T+1:t}, \varepsilon}(\cdot)$ is the operation that bounds the adversarial features in an $\varepsilon$-ball. Note $\mathbf{X}^{\prime(0)}_{t}=\tilde{\mathbf{X}}_{t}$.

In the testing phase, the adversarial traffic states are injected:

$$
\mathbf{H}'_{t-T+1:t} = \mathbf{H}_{t-T+1:t} + \Delta \mathbf{H}'_{t-T+1:t}
$$

where

$$
\mathbf{H}_{t} + \Delta \mathbf{H}'_{t} = \left\{\left(\left(\mathbf{X}'_{t}-\mathbf{X}_{t}\right) \cdot S_{t}+\mathbf{X}_{t}, \mathcal{G}_{t}\right)\right\} \in \mathbf{H}'_{t-T+1:t}
$$

and

$$
\Delta \mathbf{H}'_{t} = \left\{\left(\left(\mathbf{X}'_{t}-\mathbf{X}_{t}\right) \cdot S_{t}, 0\right): ||S_{t}||_{0} \leq \eta, \left|\left|\left(\mathbf{X}'_{t}-\mathbf{X}_{t}\right) \cdot S_{t}\right|\right|_{p} \leq \varepsilon \right\} \in \Delta\mathbf{H}'_{t-T+1:t}
$$

**White-box attack**. Since the adversaries can fully access the data and labels under the white-box setting, the real ground-truth traffic states are directly used to guide the generation of adversarial traffic states.

**Black-box attack**. The most restrictive black-box setting assumes limited accessibility to the target model and labels. Therefore, a surrogate model is first built, which can be learned from the training data. Then, adversarial traffic states generated from the surrogate model are used to attack the target traffic forecasting model.

![](https://i.imgur.com/nncmjaW.png)
![](https://i.imgur.com/HAi2aQm.png)

## Experiments

### Experimental setup

#### Datasets

1. PEMS-BAY
2. METR-LA

The first 70% of each dataset is used for training, the following 10% for validation, and the remaining 20% for testing.

#### Target model

* GraphWaveNet

#### Evaluation metrics

Global and local effects:

$$
\mathbb{E}_{t \in {\mathcal{T}}_{\text{test}}}\mathcal{L}\left(f_{\theta}\left(\mathbf{H}'_{t-T+1:t}\right), \mathbf{Y}_{t+1:t+\tau}\right),\\
\mathbb{E}_{t \in {\mathcal{T}}_{\text{test}}}\mathcal{L}\left(f_{\theta}\left(\mathbf{H}'_{t-T+1:t}\right), f_{\theta}\left(\mathbf{H}_{t-T+1:t}\right)\right),
$$

where $\mathcal{L}(\cdot)$ is a user-defined loss function:

* Mean Absolute Error (*MAE*)
* Root Mean Square Error (*RMSE*)

### Overall attack performance

The attack achieves (67.79%, 62.31%) and (19.88%, 14.55%) global performance degradation compared with the original forecasting results on the PEMS-BAY and METR-LA datasets, respectively.
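For completeness, here is a minimal sketch of the two evaluation views from the "Evaluation metrics" subsection above, with MAE as the user-defined loss $\mathcal{L}$; the function and variable names are illustrative assumptions.

```python
import torch

def attack_effect_mae(model, H_adv, H_clean, Y):
    """MAE instances of the two metrics above.

    First:  L(f(H'), Y)    -- error of perturbed predictions vs. ground truth.
    Second: L(f(H'), f(H)) -- deviation of perturbed vs. clean predictions.
    RMSE is analogous, via ((.) ** 2).mean().sqrt().
    """
    with torch.no_grad():
        Y_adv, Y_clean = model(H_adv), model(H_clean)
        return (Y_adv - Y).abs().mean(), (Y_adv - Y_clean).abs().mean()
```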