# [Public] Formal proof of Lemma 2.1(I) from the [LDD notes](https://hackmd.io/@U0nm1XUhREKPYLt1eUmE6g/Sycpovkiq)

**See the appendix of the [arXiv version (v4)](https://arxiv.org/abs/2203.03456) for an updated version adjusted for the directed case.**

---

**Definitions:**

* Imagine that we order the vertices as $s_1, s_2, \ldots, s_n$. (The order can be arbitrary.)
* We grow a ball of radius $R_i := R_{s_i}$ from $s_i$ if $s_i$ has not been removed already. ($R_i$ is a random variable.)
* The only **source of randomness** in the analysis below is $R_1, R_2, \ldots$
* Let $G_i := G_i(R_1, \ldots, R_{i-1})$ be a random variable denoting the graph remaining after we grow balls from $s_1, s_2, \ldots, s_{i-1}$.
    * $G_i$ is a random variable whose value depends on $R_1, \ldots, R_{i-1}$.
* $B_i := Ball_{G_i}(s_i, R_i)$. $B_i$ is a random variable whose value depends on $R_1, \ldots, R_i$.
* $I_i$ := the event that $u, v \notin V_1, \ldots, V_{i-1}$ and at least one of $u$ or $v$ is *included* in $B_i$. ($I$ = "included"; here $V_j$ is the set of vertices removed in step $j$, i.e. the vertices of $B_j$.)
* $X_i$ := the event that at least one of $u$ and $v$ is *excluded* from $B_i$.

**Proof:** Fix any edge $(u, v)$.

1. Note that $\sum_{i\geq 1} Pr[I_i] = 1$ since the events $I_1, I_2, \ldots$ form a partition of the whole probability space.
    * Detail: If $I_i$ occurs, then at least one of $u$ or $v$ is in $B_i$ and thus in $V_i$, so $I_{i+j}$ cannot occur for any $j\geq 1$, since $I_{i+j}$ requires $u, v \notin V_1, \ldots, V_{i+j-1}$. Hence at most one of the events occurs; moreover, at least one must occur, since every vertex is eventually removed.
2. $Pr[(u,v)\in E_{rem}] = \sum_{i\geq 1} Pr_{R_1, \ldots, R_i}[X_i \wedge I_i]$, since $(u,v)$ is cut exactly when, at the step $i$ where $I_i$ occurs, one endpoint lands in $B_i$ while the other is excluded.

To bound $Pr[X_i \wedge I_i]$, we use the following claim, which fixes the first $i-1$ random variables and only treats $R_i$ as random.

**Claim B:** Fix any instantiation of the first $i-1$ random variables $R_1 = r_1, \ldots, R_{i-1} = r_{i-1}$ (for some $r_1, \ldots, r_{i-1}$) such that the resulting graph $G_i = G_i(r_1, \ldots, r_{i-1})$ contains both $u$ and $v$. Then, either $Pr_{R_i}[I_i] = 0$ or $Pr_{R_i}[X_i \mid I_i] \leq p\, w(u,v)$. (Note that since $R_1, \ldots, R_{i-1}$ are fixed, the randomness is only over $R_i$.
We write $Pr_{R_i}[\ldots]$ to emphasize this.)

**Lemma A: $\sum_{i\geq 1} Pr_{R_1, \ldots, R_i}[X_i \wedge I_i] \leq p\, w(u,v)$**

Proof:

$\sum_{i\geq 1} Pr_{R_1, \ldots, R_i}[X_i \wedge I_i]$
$= \sum_{i\geq 1}\sum_{r_1, \ldots, r_{i-1}} Pr_{R_i}[X_i \wedge I_i \mid R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}] \cdot Pr[R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}]$
$= \sum_{i\geq 1}\sum_{r_1, \ldots, r_{i-1}} Pr_{R_i}[X_i \mid I_i \wedge R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}] \cdot Pr[I_i \mid R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}] \cdot Pr[R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}]$

Now, fix any values $R_1 = r_1, \ldots, R_{i-1} = r_{i-1}$. If given these values we have $Pr_{R_i}[I_i] = 0$, then the entire term inside the summation becomes zero. If $Pr_{R_i}[I_i] \neq 0$, then by definition of $I_i$ this means that $G_i = G_i(r_1, \ldots, r_{i-1})$ contains both $u$ and $v$. We can thus apply Claim B. This allows us to bound our main summation as

main sum $\leq p\, w(u,v) \sum_{i\geq 1} \sum_{r_1, \ldots, r_{i-1}} Pr[I_i \mid R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}] \cdot Pr[R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}]$
$= p\, w(u,v) \sum_{i\geq 1} Pr[I_i] = p\, w(u,v)$,

where the last equality uses step 1.

**Alternative proof of Lemma A (assuming Claim B):** For each $i$, summing over all possible values of $r_1, \ldots, r_{i-1}$, we have

$Pr_{R_1, \ldots, R_i}[X_i \mid I_i] = \sum_{r_1, \ldots, r_{i-1}} Pr_{R_i}[X_i \mid I_i \wedge (R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1})] \cdot Pr[(R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1}) \mid I_i].$

Now consider any values $R_1 = r_1, \ldots, R_{i-1} = r_{i-1}$. If the resulting $G_i = G_i(r_1, \ldots, r_{i-1})$ does not contain both $u$ and $v$, then $I_i$ is false by definition given these values, so the corresponding term has weight $Pr[(R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1}) \mid I_i] = 0$ and contributes nothing. Else, if $G_i$ contains both $u$ and $v$, then by Claim B, $Pr_{R_i}[X_i \mid I_i \wedge (R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1})] \leq p\, w(u,v)$. We thus have

$Pr_{R_1, \ldots, R_i}[X_i \mid I_i] \leq p \cdot w(u,v) \cdot \sum_{r_1, \ldots, r_{i-1}} Pr[(R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1}) \mid I_i] = p \cdot w(u,v).$

Multiplying by $Pr[I_i]$ and summing over $i$ (using $\sum_{i\geq 1} Pr[I_i] = 1$ from step 1) again gives Lemma A.

---
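Before proving Claim B, it may help to see the carving process from the definitions in code. Below is a minimal sketch, not the notes' actual implementation: the adjacency-list representation and the helper names `geometric`, `dijkstra`, and `carve` are all made up for illustration, and we assume integer edge weights and $R_i$ geometric with parameter `p`.

```python
import heapq
import math
import random

def geometric(p):
    # Number of Bernoulli(p) trials until the first success (support {1, 2, ...}).
    k = 1
    while random.random() >= p:
        k += 1
    return k

def dijkstra(adj, src, alive):
    # Distances from src in the subgraph induced by the not-yet-removed vertices.
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, math.inf):
            continue
        for v, w in adj[u]:
            if v in alive and d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def carve(adj, p):
    # Visit s_1, s_2, ..., s_n in a fixed (arbitrary) order; if s_i is still
    # alive, carve the ball B_i of radius R_i ~ geometric(p) around it.
    # An edge joins E_rem when exactly one endpoint lands in the carved ball.
    alive = set(adj)
    clusters, e_rem = [], set()
    for s in list(adj):
        if s not in alive:
            continue
        r = geometric(p)                                 # R_i
        dist = dijkstra(adj, s, alive)
        ball = {v for v, d in dist.items() if d <= r}    # B_i (vertex set V_i)
        for u in ball:
            for v, w in adj[u]:
                if v in alive and v not in ball:
                    e_rem.add((min(u, v), max(u, v)))    # cut edge
        alive -= ball
        clusters.append(ball)
    return clusters, e_rem
```

On any input, the returned clusters partition the vertex set, and every edge in `e_rem` crosses two different clusters; the analysis above bounds the probability that a fixed edge $(u,v)$ ends up in `e_rem`.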
**Proof of Claim B:**

From now on, we fix $R_1 = r_1, \ldots, R_{i-1} = r_{i-1}$ as in the statement of Claim B. This implies that $G_i$ is also a fixed graph $G_i(r_1, \ldots, r_{i-1})$. To ease notation, we write $Pr[E]$ instead of $Pr[E \mid (R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1})]$ (e.g. $Pr_{R_i}[X_i \mid I_i]$ below is in fact $Pr_{R_i}[X_i \mid I_i \wedge (R_1 = r_1) \wedge \ldots \wedge (R_{i-1} = r_{i-1})]$). Recall that our goal is to prove
$$Pr_{R_i}[X_i \mid I_i] \leq p \cdot w(u,v).$$

6. Define
    * $d_{min}(r_1 \ldots r_{i-1}) := \min(dist_{G_i}(s_i, u), dist_{G_i}(s_i, v)).$
    * $d_{max}(r_1 \ldots r_{i-1}) := \max(dist_{G_i}(s_i, u), dist_{G_i}(s_i, v)).$
    * Note: Since we fix $R_1 = r_1, \ldots, R_{i-1} = r_{i-1}$, $d_{min}(r_1 \ldots r_{i-1})$ and $d_{max}(r_1 \ldots r_{i-1})$ are *not* random variables. To ease notation, we will simply write $d_{min}$ and $d_{max}$.
7. The event $I_i$ is equivalent to $J_i :=$ "$R_i \geq d_{min}$".
    * (i) If $R_i < d_{min}$, then $I_i$ does not occur, as $R_i$ is too small for $B_i$ to include $u$ or $v$.
    * (ii) If $R_i \geq d_{min}$, then $I_i$ always happens, since the closer of $u$ and $v$ is in $B_i$.
8. Continuing from the goal stated above, we have $Pr_{R_i}[X_i \mid I_i] = Pr_{R_i}[X_i \mid (R_i \geq d_{min}(r_1 \ldots r_{i-1})) \wedge R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}].$
9. Conditioned on the event "$(R_i \geq d_{min}(r_1 \ldots r_{i-1})) \wedge R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}$", event $X_i$ is equivalent to "$R_i < d_{max}(r_1 \ldots r_{i-1})$". So,
$Pr_{R_i}[X_i \mid (R_i \geq d_{min}(r_1 \ldots r_{i-1})) \wedge R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}]$
$= Pr_{R_i}[R_i < d_{max}(r_1 \ldots r_{i-1}) \mid (R_i \geq d_{min}(r_1 \ldots r_{i-1})) \wedge R_1 = r_1 \wedge \ldots \wedge R_{i-1} = r_{i-1}]$
$= Pr_{R_i}[R_i < d_{max}(r_1 \ldots r_{i-1}) \mid R_i \geq d_{min}(r_1 \ldots r_{i-1})].$
    * The last equality is because $R_i$ is independent from $R_1, \ldots, R_{i-1}$.
11. Now we use the **memoryless** property of the geometric random variable, i.e.
for any nonnegative integers $a$ and $b$, $Pr[R_i \leq a + b \mid R_i > a] = Pr[R_i \leq b]$. So,
$Pr_{R_i}[R_i < d_{max}(r_1 \ldots r_{i-1}) \mid R_i \geq d_{min}(r_1 \ldots r_{i-1})]$
$= Pr_{R_i}[R_i \leq d_{max}(r_1 \ldots r_{i-1}) - 1 \mid R_i > d_{min}(r_1 \ldots r_{i-1}) - 1]$
$= Pr_{R_i}[R_i \leq d_{max}(r_1 \ldots r_{i-1}) - d_{min}(r_1 \ldots r_{i-1})]$
$\leq Pr_{R_i}[R_i \leq w(u,v)]$
    * The last inequality is because $d_{max}(r_1 \ldots r_{i-1}) - d_{min}(r_1 \ldots r_{i-1}) \leq w(u,v)$ for undirected graphs, by the triangle inequality. (Note that this step needs to be changed for directed graphs, but the change is quite obvious; i.e. we have to change $d_{min}$ to $dist_{G_i}(s_i, u)$ and $d_{max}$ to $dist_{G_i}(s_i, v)$.)
12. By the union bound, $Pr_{R_i}[R_i \leq w(u,v)] \leq p \cdot w(u,v)$, as desired: each of the $w(u,v)$ values $k \in \{1, \ldots, w(u,v)\}$ has $Pr[R_i = k] = p(1-p)^{k-1} \leq p$.
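The two facts used in the last two steps can be checked exactly. The snippet below is a standalone sanity check (not part of the proof), assuming $R_i$ is geometric on $\{1, 2, \ldots\}$ with success probability $p$; the value `p = 0.1` is an arbitrary choice for illustration.

```python
# Exact check of the two facts about a geometric random variable R on {1, 2, ...}
# with parameter p, i.e. Pr[R = k] = p * (1 - p)**(k - 1).
p = 0.1  # arbitrary illustrative value

def cdf(k):
    # Pr[R <= k] for integer k >= 0 (equals 0 when k = 0).
    return 1 - (1 - p) ** k

# Memorylessness: Pr[R <= a + b | R > a] = Pr[R <= b] for integers a, b >= 0.
for a in range(15):
    for b in range(15):
        lhs = (cdf(a + b) - cdf(a)) / (1 - cdf(a))  # Pr[a < R <= a+b] / Pr[R > a]
        assert abs(lhs - cdf(b)) < 1e-12

# Union bound: Pr[R <= w] <= p * w, since each of the w values in {1, ..., w}
# has probability Pr[R = k] = p * (1 - p)**(k - 1) <= p.
for w in range(1, 200):
    assert cdf(w) <= p * w

print("geometric-distribution checks passed")
```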