# Verification Attacker
### Setup
Suppose $D^m = \{(x_i, y_i)\}_{i=1}^m$ is a set of $m$ instances sampled from a distribution $\mathcal{D}$, i.e. $(x_i, y_i) \sim \mathcal{D}$. Let $f: \mathbb{R}^d \rightarrow \mathbb{R}^c$ be the function held by the centre device (the offloading device, acting as verifier), and $f':\mathbb{R}^d \rightarrow \mathbb{R}^c$ be the function held by the edge device under threat.
#### Part I - Adversary
Consider an edge device that owns a function $f'$, which takes an input $x$ and computes a target output $y$; $f'$ is alleged to be a copy of a function $f$ from a central device.
#### Part II - Verifier
**Definition ($(p, \delta)$-Verifier)** For an instance $x_i$, a central device $f: \mathbb{R}^d \rightarrow \mathbb{R}^c$, an edge device $f': \mathbb{R}^d \rightarrow \mathbb{R}^c$, any $\delta \geq 0$, any $p \in (0, 1]$, and any convex function $d: \mathbb{R}^c \times \mathbb{R}^c \rightarrow \mathbb{R}$, a verifier is a function $v(x_i) \rightarrow \{0, 1\}$ such that: with probability $p$ it checks whether $d(f(x_i), f'(x_i)) \leq \delta$. If the check is performed and fails, $v(x_i) = 0$ (the edge device is caught); in all other cases (the check passes, or no check is performed), $v(x_i) = 1$.
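The definition can be sketched in code. The names below (`make_verifier`, `verify_set`, `dist`) are illustrative, not from the text, and we follow the convention used in the proof: the aggregate verdict is $1$ iff no instance is caught.

```python
import random

def make_verifier(f, f_edge, p, delta, dist):
    """A (p, delta)-verifier sketch: with probability p it compares the
    edge output against the centre output under the distance `dist`.
    Convention (matching the proof): v(x) = 1 means "pass / not checked",
    v(x) = 0 means "caught"."""
    def v(x):
        if random.random() > p:          # check skipped with prob. 1 - p
            return 1
        return 1 if dist(f(x), f_edge(x)) <= delta else 0
    return v

def verify_set(v, xs):
    # v(D^m) = min over the sample: 1 iff every instance passes
    return min(v(x) for x in xs)
```

With `p=1` and `delta=0` this is the $(p, 0)$-verifier used in the theorem below: an exact copy always passes, while any detectable deviation is caught whenever the check fires.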
**Define**
$$v(D^m) = \min_{x_i \in D^m} v(x_i)$$
$$v(\mathcal{D}) = \min_{x \in \mathrm{supp}(\mathcal{D})} v(x) $$
**Definition (c-Soundness of Verification)**
A $(p, \delta)$-verifier $v$ is $c$-sound if
$$Pr(v(\mathcal{D}) = 1 | v(D^m) = 1) \geq c$$
**Definition (($\gamma,\theta$)-Attacker)** Define $\gamma$ as the data-independent prior probability that a device is at risk, i.e. that an attacker exists. If an attacker exists, then for each instance $x_i$ it attacks that input with probability $\theta(x_i)$.
**Definition (Failure Probability)**: For a system with a centre worker, an edge worker, an attacker and a verifier, the failure probability $\psi$ is the data-dependent prior probability that, when the attacker makes their move, their prediction differs enough from the centre worker's to be caught by the verifier.
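Per instance, a catch requires three independent events: the attacker moves ($\theta$), the attacked output is detectably different ($\psi$), and the verifier checks that instance ($p$), so the per-instance catch probability is $\psi p\theta$. A Monte Carlo sketch with illustrative parameter values (none of the numbers are from the text):

```python
import random

def catch_probability(psi, p, theta, trials=200_000, seed=0):
    """Monte Carlo estimate of the per-instance catch probability:
    attacker moves (theta) AND the deviation is detectable (psi)
    AND the verifier checks this instance (p)."""
    rng = random.Random(seed)
    caught = 0
    for _ in range(trials):
        attacks = rng.random() < theta
        detectable = rng.random() < psi
        checked = rng.random() < p
        caught += attacks and detectable and checked
    return caught / trials
```

For example, `catch_probability(0.8, 0.5, 0.3)` should land close to $0.8 \times 0.5 \times 0.3 = 0.12$.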
**Informal Theorem 1** For any ($\gamma,\theta$)-attacker, any confidence level $c \in (0, 1)$ and any $(p, 0)$-verifier $v$, suppose $\psi$ is the failure probability of the whole system. Then $v$ is $c$-sound if
$$\prod^m_i(1-\psi p\theta(x_i)) \leq\frac{(1-c)(1-\gamma)}{c\gamma} $$
Moreover, if $\forall i \in [m],\ \theta(x_i) = \theta$, this condition is equivalent to
$$m \geq \frac{\log(1-c) + \log(1-\gamma) - \log(c\gamma)}{\log(1-\psi p\theta)} $$
Proof:
The conditional probability that the verifier catches nothing on $D^m$, given an adversary is at large, is
$$Pr(v(D^m)=1|v(\mathcal{D})=0) = \prod^m_i(1-\psi p\theta(x_i))$$
Therefore, by the law of total probability, the overall probability of this observation is
$$Pr(v(D^m)=1) = \sum_{v(\mathcal{D})} Pr(v(D^m)=1 | v(\mathcal{D}))Pr(v(\mathcal{D})) = (1-\gamma) + \gamma\prod^m_i(1-\psi p\theta(x_i)) $$
using $Pr(v(\mathcal{D})=1) = 1-\gamma$, $Pr(v(\mathcal{D})=0) = \gamma$, and $Pr(v(D^m)=1 | v(\mathcal{D})=1) = 1$ (an untampered edge device always passes a $(p, 0)$-verifier).
By Bayes' rule, we have
\begin{align}
Pr(v(\mathcal{D})=1 | v(D^m)=1) &= \frac{Pr(v(D^m)=1 | v(\mathcal{D})=1)Pr(v(\mathcal{D})=1)}{Pr(v(D^m)=1)}\\
&=\frac{1\cdot Pr(v(\mathcal{D})=1)}{Pr(v(D^m)=1)}\\
&=\frac{1-\gamma}{(1-\gamma) + \gamma \prod^m_i(1-\psi p\theta(x_i))}
\end{align}
Because $v$ is $c$-sound, we know $Pr(v(\mathcal{D})=1 | v(D^m)=1) \geq c$, which means
\begin{align}
\frac{1-\gamma}{(1-\gamma) + \gamma \prod^m_i(1-\psi p\theta(x_i))} &\geq c\\
\prod^m_i(1-\psi p\theta(x_i)) &\leq \frac{(1-c)(1-\gamma)}{c\gamma}
\end{align}
If $\forall i \in [m], \theta(x_i) = \theta$, we know
$$\prod^m_i(1-\psi p\theta(x_i)) = (1-\psi p\theta)^m$$
and, since $0 < \psi p\theta < 1$ means $\log(1-\psi p\theta) < 0$, dividing by it in the final step reverses the inequality:
\begin{align}
(1-\psi p\theta)^m&\leq \frac{(1-c)(1-\gamma)}{c\gamma}\\
m\log(1-\psi p\theta) &\leq \log\left[\frac{(1-c)(1-\gamma)}{c\gamma}\right]\\
m &\geq \frac{\log(1-c) + \log(1-\gamma) - \log(c\gamma)}{\log(1-\psi p\theta)}
\end{align}
### Table of Variables
| Variable | Description |
|---|---|
| $\gamma$ | The data-independent prior probability that an attacker exists |
| $\theta(x_i)$ | The probability the attacker attacks instance $x_i$ |
| $p$ | The probability the verifier checks a given instance |
| $\delta$ | The tolerance of difference the verifier accepts between the actual output and the expected output |
| $\psi$ | The data-dependent probability that, when the attacker makes their move, the deviation is large enough for the verifier to catch |
| $c$ | The soundness level of the verifier |
### Number of Samples To Use
Another useful way to use these probabilities is to compute a *recommended* number of examples $m^*$ to use in each trial so that we catch the attacker at least $t$ times over $T$ trials. This is equivalent to empirically estimating
$$Pr(v(D^m)=0|v(\mathcal{D})=0) = 1- \prod^m_i(1-\psi p\theta(x_i))$$
with
$$\tau = \frac{t}{T}$$
Namely, we expect
$$1- \prod^m_i(1-\psi p\theta(x_i)) = \tau$$
which, in the constant-$\theta$ case, gives us
$$m = \frac{\log(1-\tau)}{\log(1-\psi p \theta)}$$
and we round up to an integer to let $m^*$ be
$$m^* = \left\lfloor\frac{\log(1-\tau)}{\log(1-\psi p \theta)}\right\rfloor + 1$$
Notice that in order for $\tau$ to be a reliable empirical estimator, $T$ should not be too small.
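The rounding above can be wrapped in a small helper; the parameter values below are illustrative, not from the text:

```python
import math

def recommended_samples(tau, psi, p, theta):
    """m* = floor(log(1-tau) / log(1-psi*p*theta)) + 1: the smallest
    sample size whose per-trial catch probability reaches tau.
    Both logs are negative, so the ratio is positive."""
    return math.floor(math.log(1 - tau) / math.log(1 - psi * p * theta)) + 1

# e.g. to catch the attacker in ~90% of trials with made-up rates
# (per-instance catch probability psi*p*theta = 0.12):
m_star = recommended_samples(tau=0.9, psi=0.8, p=0.5, theta=0.3)
```

At the returned $m^*$ the per-trial catch probability $1-(1-\psi p\theta)^{m^*}$ meets $\tau$, while at $m^*-1$ it falls short.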