# Approximating the Probability of an Overfull Chunk What's the probability that a given `Chunk` contains $t = 4096$ or more indices? There are 256 batches that can target any particular `Chunk`. Every batch consists of 8 items. Every item inserts 45 indices *somewhere* in the active window, which is $2^{20}$-wide and divided into 256 `Chunk`s of width $2^{12}$. So in summary, there are $N = 256 \cdot 8 \cdot 45$ indices to consider and the probability that one of them falls in the `Chunk` of interest is $p = 2^{-8}$. The probability of having $k$ elements in a given `Chunk` is [binomial](https://en.wikipedia.org/wiki/Binomial_distribution) and so the probability of having an overfull `Chunk` is given exactly by $$ {\rm Pr}[\textnormal{overfull}] = \sum_{k=4096}^{N} \binom{N}{k} p^k (1-p)^{N-k} .$$ Unfortunately, this number is rather difficulty to compute after plugging in $N = 256 \cdot 8 \cdot 45 = 92160$. We *think* it is unfathomably small and want to upper bound it. ## Markov's Inequality [Markov's Inequality](https://en.wikipedia.org/wiki/Markov%27s_inequality) states: $$ {\rm Pr}[X \geq a] \leq \frac{\mathbb{E}[X]}{a} . $$ In our case, $a = 4096$ and $\mathbb{E}[X] = N \cdot p = 256 \cdot 8 \cdot 45 \cdot \frac{1}{256} = 360,$ so $\mathrm{Pr}[X \geq 4096] \leq \frac{360}{4096} < 2^{-4} .$ So $2^{-4}$ is something but not great. Let's see if we can do better. ## Chebyshev's Bound [Chebyshev's Inequality](https://en.wikipedia.org/wiki/Chebyshev%27s_inequality) uses the variance to bound the deviation of a random variable from its mean. It states $$ {\rm Pr}[|X - \mathbb{E}[X]| \geq a] \leq \frac{{\rm Var}[X]}{a^2} . $$ In our case, ${\rm Var}[X] = N \cdot p \cdot (1-p) = 256 \cdot 8 \cdot 45 \cdot \frac{1}{256} \cdot \left( 1 - \frac{1}{256}\right) = 358.59375.$ Furthermore, due to the symmetry of the distribution, $$ \begin{eqnarray} & {\rm Pr}[|X - \mathbb{E}[X]| \geq a] \\ =& {\rm Pr}[(X - \mathbb{E}[X] \geq a) \, \vee \, (X - \mathbb{E}[X] \leq -a)] \\ =& 2 \cdot {\rm Pr}[X - \mathbb{E}[X] \geq a] = 2 \cdot {\rm Pr}[X \geq a + \mathbb{E}[X]] . \end{eqnarray} $$ So in our case $a = 4096 - 360 = 3736$ gives us the bound on $X$ that we are after. Plugging in: $$ {\rm Pr}[X \geq 4096] \leq \frac{358.59375}{2 \cdot 3736^2} \approx 0.00001284573578619279 \approx 2^{-16} .$$ So $2^{-16}$ is getting somewhere, but still not good enough. What else can we do? ## Chernoff's Bound [Chernoff's Bound](https://en.wikipedia.org/wiki/Chernoff_bound) bounds tail probabilities in terms of the distribution's [moment-generating function](https://en.wikipedia.org/wiki/Moment-generating_function) $M(t)$. In particular, it states $$ {\rm Pr}[X \geq a] \leq M(t) \cdot e^{-ta} , $$ for all $t > 0$. In our case, $a = 4096$ again and the moment-generating function for $B(N, p)$ is $M(t) = (1 - p + p\cdot e^{t})^N$. The expression is easier to manage after taking the natural logarithm of left and right hand sides. $$ \ln {\rm Pr}[X \geq 4096] \leq N \cdot \ln (1 - p + p \cdot e^t) -ta $$ This expression achieves its minimum where the derivative is zero. A quick plot (thanks, [Wolfram Alpha](https://www.wolframalpha.com/input?i=min+256+*+45+*+8+*+ln%281+%2B+%28exp%28t%29-1%29+%2F+256%29+-+t+*+4096)!) shows that it has one global minimum. ![image](https://hackmd.io/_uploads/S1Ehix-Zxe.png) $$ \begin{flalign} 0 &= \frac{{\sf d} \, N \cdot \ln (1 - p + p \cdot e^t) - ta}{{\sf d} t} \\ &= N \cdot \left| \frac{1}{1-p + p \cdot e^t} \right| \cdot p \cdot e^t - a \\ &= N \cdot \frac{p \cdot e^t}{1 - p + p \cdot e^t} - a \\ &= N \cdot \frac{e^t}{255+e^t} - a \\ &= \frac{e^t}{255+e^t} - \frac{a}{N} \end{flalign} $$ Rearranging and plugging in the values $N = 256 \cdot 45 \cdot 8$ and $a = 4096$ gives: $$ \begin{flalign} \frac{e^t}{255 + e^t} &= \frac{2}{45} \\ e^t &= \frac{2}{45} \cdot (255 + e^t) \\ \left(1 - \frac{2}{45}\right) \cdot e^t = \frac{43}{45} \cdot e^t &= 255 \cdot \frac{2}{45} \\ e^t &= \frac{510}{43} \end{flalign} $$ The value at this minimum is approximately $-6301.1632251835945$. This number translates to a probability bound expressible in terms of bits: $$ \begin{flalign} \ln {\rm Pr}[X \geq 4096] &\leq -6301.1632251835945 \\ \log_2 {\rm Pr}[X \geq 4096] &\leq -6301.1632251835945 \cdot \ln(2) \\ &\approx -6301.1632251835945 \cdot 0.6931471805599453 \\ &\approx -4367.63352378402 . \end{flalign} $$ *Muuch* better `\o/`.