# Approximating the Probability of an Overfull Chunk
What's the probability that a given `Chunk` contains $t = 4096$ or more indices?
There are 256 batches that can target any particular `Chunk`. Every batch consists of 8 items. Every item inserts 45 indices *somewhere* in the active window, which is $2^{20}$-wide and divided into 256 `Chunk`s of width $2^{12}$.
So in summary, there are $N = 256 \cdot 8 \cdot 45$ indices to consider and the probability that one of them falls in the `Chunk` of interest is $p = 2^{-8}$.
The probability of having $k$ elements in a given `Chunk` is [binomial](https://en.wikipedia.org/wiki/Binomial_distribution) and so the probability of having an overfull `Chunk` is given exactly by
$$ {\rm Pr}[\textnormal{overfull}] = \sum_{k=4096}^{N} \binom{N}{k} p^k (1-p)^{N-k} .$$
Unfortunately, this number is rather difficulty to compute after plugging in $N = 256 \cdot 8 \cdot 45 = 92160$. We *think* it is unfathomably small and want to upper bound it.
## Markov's Inequality
[Markov's Inequality](https://en.wikipedia.org/wiki/Markov%27s_inequality) states:
$$ {\rm Pr}[X \geq a] \leq \frac{\mathbb{E}[X]}{a} . $$
In our case, $a = 4096$ and $\mathbb{E}[X] = N \cdot p = 256 \cdot 8 \cdot 45 \cdot \frac{1}{256} = 360,$ so $\mathrm{Pr}[X \geq 4096] \leq \frac{360}{4096} < 2^{-4} .$
So $2^{-4}$ is something but not great. Let's see if we can do better.
## Chebyshev's Bound
[Chebyshev's Inequality](https://en.wikipedia.org/wiki/Chebyshev%27s_inequality) uses the variance to bound the deviation of a random variable from its mean. It states
$$ {\rm Pr}[|X - \mathbb{E}[X]| \geq a] \leq \frac{{\rm Var}[X]}{a^2} . $$
In our case, ${\rm Var}[X] = N \cdot p \cdot (1-p) = 256 \cdot 8 \cdot 45 \cdot \frac{1}{256} \cdot \left( 1 - \frac{1}{256}\right) = 358.59375.$ Furthermore, due to the symmetry of the distribution,
$$ \begin{eqnarray}
& {\rm Pr}[|X - \mathbb{E}[X]| \geq a] \\
=& {\rm Pr}[(X - \mathbb{E}[X] \geq a) \, \vee \, (X - \mathbb{E}[X] \leq -a)] \\
=& 2 \cdot {\rm Pr}[X - \mathbb{E}[X] \geq a] = 2 \cdot {\rm Pr}[X \geq a + \mathbb{E}[X]] .
\end{eqnarray} $$
So in our case $a = 4096 - 360 = 3736$ gives us the bound on $X$ that we are after. Plugging in:
$$ {\rm Pr}[X \geq 4096] \leq \frac{358.59375}{2 \cdot 3736^2} \approx 0.00001284573578619279 \approx 2^{-16} .$$
So $2^{-16}$ is getting somewhere, but still not good enough. What else can we do?
## Chernoff's Bound
[Chernoff's Bound](https://en.wikipedia.org/wiki/Chernoff_bound) bounds tail probabilities in terms of the distribution's [moment-generating function](https://en.wikipedia.org/wiki/Moment-generating_function) $M(t)$. In particular, it states
$$ {\rm Pr}[X \geq a] \leq M(t) \cdot e^{-ta} , $$
for all $t > 0$. In our case, $a = 4096$ again and the moment-generating function for $B(N, p)$ is $M(t) = (1 - p + p\cdot e^{t})^N$.
The expression is easier to manage after taking the natural logarithm of left and right hand sides.
$$ \ln {\rm Pr}[X \geq 4096] \leq N \cdot \ln (1 - p + p \cdot e^t) -ta $$
This expression achieves its minimum where the derivative is zero. A quick plot (thanks, [Wolfram Alpha](https://www.wolframalpha.com/input?i=min+256+*+45+*+8+*+ln%281+%2B+%28exp%28t%29-1%29+%2F+256%29+-+t+*+4096)!) shows that it has one global minimum.

$$
\begin{flalign}
0 &= \frac{{\sf d} \, N \cdot \ln (1 - p + p \cdot e^t) - ta}{{\sf d} t} \\
&= N \cdot \left| \frac{1}{1-p + p \cdot e^t} \right| \cdot p \cdot e^t - a \\
&= N \cdot \frac{p \cdot e^t}{1 - p + p \cdot e^t} - a \\
&= N \cdot \frac{e^t}{255+e^t} - a \\
&= \frac{e^t}{255+e^t} - \frac{a}{N}
\end{flalign}
$$
Rearranging and plugging in the values $N = 256 \cdot 45 \cdot 8$ and $a = 4096$ gives:
$$
\begin{flalign}
\frac{e^t}{255 + e^t} &= \frac{2}{45} \\
e^t &= \frac{2}{45} \cdot (255 + e^t) \\
\left(1 - \frac{2}{45}\right) \cdot e^t = \frac{43}{45} \cdot e^t &= 255 \cdot \frac{2}{45} \\
e^t &= \frac{510}{43}
\end{flalign}
$$
The value at this minimum is approximately $-6301.1632251835945$. This number translates to a probability bound expressible in terms of bits:
$$ \begin{flalign}
\ln {\rm Pr}[X \geq 4096] &\leq -6301.1632251835945 \\
\log_2 {\rm Pr}[X \geq 4096] &\leq -6301.1632251835945 \cdot \ln(2) \\
&\approx -6301.1632251835945 \cdot 0.6931471805599453 \\
&\approx -4367.63352378402 .
\end{flalign} $$
*Muuch* better `\o/`.