# Integrated Gradients
[Integrated gradients](https://arxiv.org/abs/1703.01365) is an attribution method that works by approximating the following integral, where $F : \mathbb{R}^n \rightarrow \mathbb{R}$ is the network, $x$ is the input, $x'$ is the baseline, and $i$ indexes the feature being attributed.
$$
\mathrm{IntegratedGradients}_i(x) = (x_i - x_i') \times \int_{\alpha=0}^1 \frac{\partial F(x' + \alpha (x - x'))}{\partial x_i} d\alpha
$$
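In practice the integral is approximated numerically, usually with a Riemann sum: evaluate the gradient at several points along the straight-line path from $x'$ to $x$ and multiply the average gradient by $(x_i - x_i')$. Below is a minimal sketch in JAX; the name `integrated_gradients`, the `num_steps` parameter, and the toy quadratic network are illustrative assumptions, not from the original paper.

```python
import jax
import jax.numpy as jnp

def integrated_gradients(f, x, x_baseline, num_steps=50):
    """Riemann-sum (midpoint rule) approximation of the IG integral."""
    # Midpoints alpha_k = (k + 0.5) / num_steps partition [0, 1].
    alphas = (jnp.arange(num_steps) + 0.5) / num_steps
    grad_f = jax.grad(f)
    # Gradient of f at each point x' + alpha * (x - x') on the path.
    grads = jax.vmap(lambda a: grad_f(x_baseline + a * (x - x_baseline)))(alphas)
    # (x_i - x_i') times the average gradient along the path.
    return (x - x_baseline) * grads.mean(axis=0)

# Toy "network" (illustrative only) so attributions can be checked by hand:
# for f(x) = sum(x**2) and a zero baseline, IG_i works out to x_i**2.
f = lambda x: jnp.sum(x ** 2)
x = jnp.array([1.0, 2.0])
x_baseline = jnp.zeros(2)

ig = integrated_gradients(f, x, x_baseline)
print(ig)                               # ~ [1., 4.]
print(ig.sum(), f(x) - f(x_baseline))   # both ~ 5.0; see completeness below
```

The original paper reports that somewhere between 20 and 300 steps typically suffices in practice; for this toy quadratic the midpoint rule is exact, since the gradient along the path is linear in $\alpha$.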
[Intuitively](https://www.tensorflow.org/tutorials/interpretability/integrated_gradients), we are accumulating all of the change in output as the input moves from $x'$ to $x$ along the straight-line path $x' + \alpha (x - x')$, which runs from the baseline at $\alpha = 0$ to the input at $\alpha = 1$. For a concrete example in words, let $x_i$ be the weight of a heavy exotic frog, let $x_i'$ be the average weight over all frogs, and let $F(x)$ be the estimated jump height of a frog with feature vector $x$ (one of whose features is weight). Given this, the right-hand side of the equation above can be interpreted as
$$
\begin{aligned}
\underbrace{\text{Weight gain of exotic frog}}_{(x_i-x_i')} \times \underbrace{\text{Average rate jump height changes per unit weight gain}}_{\int_{\alpha=0}^1 \frac{\partial F(x' + \alpha (x - x'))}{\partial x_i} d\alpha} \\
= \text{Jump height change of exotic frog}\;\textbf{due to weight gain}
\end{aligned}
$$
In other words, the integrated gradients vector assigns to each input feature the portion of the change in output **due** to the change in that feature.
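This reading is backed by the completeness property proved in the original paper: for $F$ differentiable along the path, the per-feature attributions sum to the total change in output,
$$
\sum_{i=1}^n \mathrm{IntegratedGradients}_i(x) = F(x) - F(x'),
$$
so the integrated gradients vector exactly partitions $F(x) - F(x')$ across the input features. (This is also what the `ig.sum()` check in the sketch above verifies.)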