Ulisse mini

@ulirocks

Joined on Jul 12, 2022

  • Integrated gradients is an attribution method that works by approximating the following integral, where $F : \mathbb{R}^n \rightarrow \mathbb{R}$ is the network, $x$ is the input, $x'$ is the baseline, and $i$ is the feature. $$ \mathrm{IntegratedGradients}_i(x) = (x_i - x_i') \times \int_{\alpha=0}^1 \frac{\partial F(x' + \alpha (x - x'))}{\partial x_i}\, d\alpha $$ Intuitively, we're accumulating all the change in output as the input moves from $x'$ to $x$ along the straight-line path $x' + \alpha (x - x')$. For a concrete example in words, let $x_i$ be the weight of a heavy exotic frog, $x_i'$ be the average weight over all frogs, and $F(x)$ be the estimated jump height of a frog with features $x$ (one of which is weight). Given this, the right-hand side of the equation above can be interpreted as …
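A minimal sketch of the approximation above, using JAX for the gradient: the straight-line path from $x'$ to $x$ is discretized into a fixed number of points and the integral becomes an average of gradients along that path. The function names (`integrated_gradients`, the toy `f`) and the step count are illustrative assumptions, not taken from the note.

```python
# Sketch of integrated gradients via a Riemann-sum approximation of the
# integral over alpha in [0, 1]. Assumes f : R^n -> R (scalar output).
import jax
import jax.numpy as jnp

def integrated_gradients(f, x, x_baseline, steps=50):
    """Approximate IntegratedGradients_i(x) for every feature i."""
    grad_f = jax.grad(f)
    # Midpoints of `steps` equal subintervals of [0, 1].
    alphas = (jnp.arange(steps) + 0.5) / steps
    # Gradient of f at each point on the path x' + alpha * (x - x').
    path_grads = jnp.stack(
        [grad_f(x_baseline + a * (x - x_baseline)) for a in alphas]
    )
    avg_grad = path_grads.mean(axis=0)   # approximates the integral
    return (x - x_baseline) * avg_grad   # scale by (x_i - x_i')

# Toy usage with a made-up "network" and baseline.
if __name__ == "__main__":
    f = lambda v: jnp.tanh(v).sum()
    x = jnp.array([1.0, -2.0, 0.5])
    baseline = jnp.zeros(3)
    print(integrated_gradients(f, x, baseline))
```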
  • Here's some math: $$ \int_a^b f(x)\,dx = F(b) - F(a) $$ You can do inline math with single dollars: $f(x) = 2x^2 + \frac{3}{2}$