# Lecture 1 - Introduction to Approximate Distance Algorithms and Oracles ###### tags: `Algorithm` ## Shortest Path Problem + Given a weighted graph $G=(V,E,w)$ + $V$: sets of vertices, $n=|V|$ + $E$: sets of edges, that is, $E\subseteq\{(u,v)|u,v\in V\}$, $m=|E|$ + $w$: $E\to \Bbb{R}^+$ is called weight/length on $G$ + **Unweighted graph**: $w(e)=1$, $\forall e\in E$ + Degree of $u$: number of edges associated with $u$ + Denoted by $deg(u)$ + Also have in-degree and out-degree for directed graph + Undirected graph: $(u,v)\in E\iff (v,u)\in E$ ![](https://i.imgur.com/ZTJK7PM.png) + Directed graph: $(u,v)\in E\not\Rightarrow (v,u)\in E$ ![](https://i.imgur.com/rdXhtkl.png) ### Single Source Shortest Path (SSSP) + Given $s$ and $t$, find the path of minimum length from $s$ to $t$. ![](https://i.imgur.com/Iks2oPx.png) + The length of the shortest path is called the **distance** from $s$ to $t$, denoted by $\delta(s,t)$ + Triangle inequality: $\delta(u,w)\leq\delta(u,v)+\delta(v,w)$ for any $u,v,w$ ![](https://i.imgur.com/2zN9jRl.png) ### All Pairs Shortest Path (APSP) Given a graph, find the distance between every two vertices ![](https://i.imgur.com/GIQhlCj.png) ## Classic Algorithm ### *Dijkstra*'s Algorithm + Each node has three status: $unreached$, $active$, $scanned$ + Label $d(v)$ for every vertex $v$ ```python Set all node v!=s unreached and d(v)=inf Set s active and d(s)=0 while there is an active node: Select an active node u with minimum d(u) mark u scanned for every outgoing edge (u,v): d(v) = min{d(v), d(u)+w(u,v)} if v is unscanned: mark v active ``` + For single-source all-destination shortest path problem + We can use a **priority queue** to maintain the $d(s)$ for $active$ $s$ + By **Fibonacci heap** to realize the priority queue, the running time is $O(m+n\log n)$ + APSP: run *Dijkstra*'s algorithm for every vertex $v$ + Running Time: $O(mn+n^2\log n)$ + When $m=O(n^2)$, running time is $O(n^3)$ ### *Floyd-Warshall*'s Algorithm + Initially + $dist[u][u]=0$, $\forall u\in V$ + $dist[u][v]=w(u,v)$, $\forall(u,v)\in E$ + $dist[u][v]=\infty$, otherwise ```cpp for(int k=0 ; k<n ; ++k) for(int i=0 ; i<n ; ++i) for(int j=0 ; j<n ; ++j) if(dist[i][j] > dist[i][k]+dist[k][j]) dist[i][j] = dist[i][k]+dist[k][j]; ``` + Running time $O(n^3)$ ### APSP conjecture + Unweighted undirected graphs: $O(n^{\omega})\approx O(n^{2.38})$(*Seidel 95*) + $O(n^{\omega})$ is the time for fast matrix multiplication + Unweighted directed graphs: $O(n^{\mu})\approx O(n^{2.58})$(*Zwick 98*) + $\mu$ is related to rectangular matrix multiplication + Real edge weights: $O(n^3/2^{\Omega(\sqrt{\log n})})$(*Williams 14*) + **Conjecture**: No APSP algorithm in $O(n^{3-\delta})$ time for constant $\delta>0$ when the edge weights are $[1...n]$. ## All Pairs Almost Shortest Path (APASP) Denote the distance between $u$ and $v$ to be $\delta(u,v)$ An estimated distance $\delta ’(u,v)$ is of **stretch $k$** $\iff\delta(u,v)\leq\delta'(u,v)\leq k\delta(u,v)$ An estimated distance $\delta ’(u,v)$ is of **surplus $t$** $\iff\delta(u,v)\leq\delta'(u,v)\leq \delta(u,v)+t$ ### Unweighted Undirected Graph + $G=(V,E)$ is an unweighted undirected graph + SSSP: BFS in $O(m)$ time ![](https://i.imgur.com/2SV06cq.png) + We pick every vertex independently at random with prob. $1/k$ to form **a set** $D$ + $E[|D|]=n/k$ + By Chernoff bound, $Pr[|D|\geq(1+\gamma)(n/k)]\leq e^{-\Omega(\gamma(n/k))}$ + If a vertex has degree $d$, what’s the probability that it is adjacent to a vertex in $D$? + $1-(1-1/k)^d$ + When $d=ck\log n$, $1-(1-1/k)^{ck\log n}\approx 1-1/n^c$ #### Theorem: Given a degree bound $d$, we can find a set $D$ with $\tilde O(n/d)$ vertices such that all vertices with degree $≥d$ are adjacent to a vertex in $D$ w.h.p. :::info $\tilde O(·)$ hides constant number of logarithmic factors e.g. $\tilde O(n^2)$ means $O(n^2\log^cn)$ ::: #### Basic Algorithm + Let $V_1$ be the set with degree $\geq n^{1/2}$ + Let $D_1$ be the sampled set with $\tilde O(n/n^{1/2})=\tilde O(n^{1/2})$ vertices + such that all vertices in $V_1$ is adjacent to some vertex in $D_1$ ![](https://i.imgur.com/iFBkAUI.png) + Let $E^*$ be the set of edges connecting every $w$ vertex in $V_1$ to a vertex $w’$ in $D_1$ + $|E^*|=O(n)$ ![](https://i.imgur.com/3iTBz8s.png) + Let $E_2$ be the edges $(u,v)$ such that $u∉V_1$ or $v∉V_1$ + $deg(u)$ or $deg(v)$ is $<n^{1/2}$ + $|E_2|\leq n\times n^{1/2}=n^{3/2}$ ![](https://i.imgur.com/tDhTm3e.png) + Run BFS from every $u\in D_1$ + We can get $\delta(u,v)$ for $u∈D_1$, $v∈V$, and denote the table by $\delta(D_1\times V)$ + Running time: $O(m|D_1|)= \tilde O(n^{5/2})$, when $m=O(n^2)$ ![](https://i.imgur.com/Sxz6ETw.png) + Run *Dijkstra* from every $u\notin D_1$ in the graph $G’=(V, E_2\cup\delta(D_1\times V))$ + $G’$ has $|E_2|+n|D_1|= \tilde O(n^{3/2})$ edges + Total running time: $\tilde O(n^{5/2})$ ![](https://i.imgur.com/kDTtBaI.png) #### Analysis of Basic Algorithm + Consider a shortest path from $u$ to $v$ + If $u∈D_1$ or $v∈D_1$, $\delta(u,v)$ can be found in the first step + If all vertices in the shortest path between $u$ and $v$ have degree $<n^{1/2}$, all edge in the path is in $E_2$, so can be found in the second step + If a vertex $w$ in the shortest path have degree $≥n^{1/2}$ + $w$ is adjacent to a vertex $w’∈D_1$ + $\delta(u,w')$ and $\delta(w',v)$ are found in first step + So in the second step, the distance $\delta'(u,v)$ we found will have $\delta(u,v) ≤ \delta(u,w’)+\delta(w’,v) ≤ \delta(u,w)+ \delta(w,v) + 2 = \delta(u,v) +2$ ![](https://i.imgur.com/Ai57ZpJ.png) + **Running Time**: $\tilde O(n^{5/2})$ #### Improved Algorithm + Let $V_1$ be the vertices with degree $≥n^{2/3}$ + Let $V_2$ be the vertices with degree $≥n^{1/3}$ + Let $D_1$ be the sampled set with $\tilde O(n/n^{2/3})=\tilde O(n^{1/3})$ vertices such that all vertices in $V_1$ is adjacent to some vertex in $D_1$ + Let $D_2$ be the sampled set with $\tilde O(n/n^{1/3})=\tilde O(n^{2/3})$ vertices such that all vertices in $V_2$ is adjacent to some vertex in $D_2$. + $E^*$ be the edges connecting every $w∈V_2$ to a $w’∈D_2$ + Let $E_2$ be the edges $(u,v)$ such that $u∉V_1$ or $v∉V_1$ + $|E_2|=O(n^{5/3})$ + Let $E_3$ be the edges $(u,v)$ such that $u∉V_2$ or $v∉V_2$ + $|E_3|=O(n^{4/3})$ + Run BFS from every $u∈D_1$ + Run BFS from every $u∈D_2$ in $E_2$ + We get $\delta’(D_2\times V)$ on the graph induced by $E_2$ + Run *Dijkstra* from every $u∈V$ on $G”=(V, E_3∪\delta(D_1\times V)∪\delta’(\{u\}\times D_2)∪E^*)$ + **Running time**: $\tilde O(n^{7/3})$ #### Analysis of Improved Algorithm + Consider a shortest path from $u$ to $v$ + If all vertices on the shortest path have degree $<n^{1/3}$ + the path is in $E_3$ + If there is a vertex $w$ with degree $≥n^{2/3}$ on the shortest path + $w$ is adjacent to a vertex $w’∈D_1$ + $\delta(u,w’)$ and $\delta(w’,v)$ are found in the first step, $\delta(u,w’)≤\delta(u,w)+1$ and $\delta(w’,v)≤\delta(w, v)+1$ + So the distance $\delta''(u,v)$ we found have $\delta''(u,v)\leq\delta(u,w')+\delta(w',v)\leq\delta(u,w)+\delta(w,v)+2=\delta(u,v)+2$ ![](https://i.imgur.com/giN71ra.png) + If there are no vertices with degree $≥n^{2/3}$ but there is a vertex $w$ with degree $≥n^{1/3}$ on the shortest path + Let $w$ be the last such vertex on the path, and $w$ is adjacent to a vertex $w’∈D_2$ and $(w,w’)∈E^*$ + In second step, $\delta'(u,w')\leq\delta(u,w)+1$ + In the final step, $\delta'(u,w')$ is in $\delta’(\{u\}\times D_2)$, $\delta(w',w)$ is in $E^*$, all edges from $w$ to $v$ are in $E_3$, so $\delta''(u,v)≤\delta'(u,w’)+\delta(w’,w)+\delta(w,v)≤\delta(u,v)+2$ ![](https://i.imgur.com/VQUBop3.png) #### Summary |Surplus|Time*|Reference| |---|----|----| |$2$|$n^{5/2}$|*Aingworth-ChekuriIndyk-Motwani ’96*| |$2(k-1)$|$kn^{2-1/k}m^{1/k}$, $kn^{2+1/{(3k-4)}}$|*Dor-Halperin-Zwick ’96*| \*Ignoring polylogarithmic factors + When $k=\log n$, we can get a $\tilde O(n^2)$-time +$O(\log n)$-APASP algorithm ##### Undirected Graph + Undirected is crucial + There is still no such algorithm for directed graphs ##### Unweighted Graph + For additive error approximation, unweighted graph is convenient + For weighted graph, adjacency to a vertex doesn’t mean close to it + We can try **multiplicative approximation** ### Weighted Undirected Graph #### How the find the $b$ closest vertices to $s$? ![](https://i.imgur.com/BGkpa6N.png) + In *Dijkstra* it takes $\tilde O(nb)$ time to find the first $s$ marked vertices + **Truncated *Dijkstra*** in $\tilde O(b^2)$ time ```python Set all node v!=s unreached and d(v)=inf Set s active and d(s)=0 while there is an active node: Select an active node u with minimum d(u) mark u scanned for b smallest outgoing edge (u,v) from u: d(v) = min{d(v), d(u)+w(u,v)} if v is unscanned: mark v active if b vertices are marked: break ``` + For other edges $(u,v)$ (not $b$ smallest), if the path $s\to ...\to u\to v$ is the shortest path from $s$ to $v$, then $v$ is not among the $b$ closest to $s$ ![](https://i.imgur.com/E5TAYGI.png) #### Algorithm + For every vertex $v$, let $ball(v)$ be the set of $n^{2/3}$ closest vertices to $v$. Use truncated-*Dijkstra* to find the distance from $v$ to all vertices in $ball(v)$ + Running time: $\tilde O(n\times n^{4/3}) =\tilde O(n^{7/3})$ + Let $D$ be the randomly sampled $\tilde O(n^{1/3})$ vertices such that every $ball(v)\cap D\neq\emptyset$ for every $v$ ![](https://i.imgur.com/S4IS2Pw.png) + Run *Dijkstra* from all vertices in $D$ + Running time: $\tilde O(n^{1/3}\times n^2)=\tilde O(n^{7/3})$ + For every pair $u$, $v$ of vertices + If $u\in D$ or $v\in D$ or $v\in ball(u)$, real distance have been found + Otherwise, $v\notin ball(u)$ but there is a $w\in ball(u)\cap D$, return $\delta(u,w)+\delta(w,v)$ + $\delta(u,w)\leq\delta(u,v)$ + $\delta(w,v)\leq\delta(w,u)+\delta(u,v)\leq2\delta(u,v)$ $\Rightarrow\delta(u,w)+\delta(w,v)\leq3\delta(u,v)$ ![](https://i.imgur.com/UdNdPKw.png) + Running time: $O(1)$ per pair + **Running time**: $\tilde O(n^{7/3})$ + **3-stretch APASP** #### Summary |Stretch|Time*| |---|---| |$2$|$n^{3/2}m^{1/2}$| |$7/3$|$n^{7/3}$| |$3$|$n^2$| Reference: *Cohen-Zwick ‘97* \*Ignoring polylogarithmic factors ## Distance Oracle ### Basic Idea :::info + For every vertex $v$, let $ball(v)$ be the set of $n^{2/3}$ closest vertices to $v$. Use truncated-*Dijkstra* to find the distance from $v$ to all vertices in $ball(v)$ + Let $D$ be the randomly sampled $\tilde O(n^{1/3})$ vertices such that every $ball(v)∩D\neq\emptyset$ for every $v$ + Run *Dijkstra* from all vertices in $D$ ::: $\Rightarrow$ We obtain $\delta(u,v)$ for $u\in D$ or $v\in D$ or $v\in ball(u)$ :::info + For every pair $u$, $v$ of vertices + If $u\in D$ or $v\in D$ or $v\in ball(u)$, real distance have been found + Otherwise, $v\notin ball(u)$ but there is a $w\in ball(u)\cap D$, return $\delta(u,w)+\delta(w,v)$ ::: $\Rightarrow$ For every pair $u$, $v$, we can obtain the approximate distance + **Data Structure** with $O(n^{5/3})$ space and $O(1)$ query time ### Improve Spcace Complexity :::info + For every vertex $v$, let $ball(v)$ be the set of $n^{1/2}$ closest vertices to $v$. Use truncated-Dijkstra to find the distance from $v$ to all vertices in $ball(v)$ + Let $D$ be the randomly sampled $\tilde O(n^{1/2})$ vertices such that every $ball(v)\cap D\neq\emptyset$ for every $v$ + Run *Dijkstra* from all vertices in $D$ + For every pair $u$, $v$ of vertices + If $u\in D$ or $v\in D$ or $v\in ball(u)$, real distance have been found + Otherwise, $v\notin ball(u)$ but there is a $w\in ball(u)\cap D$, return $\delta(u,w)+\delta(w,v)$ ::: + Data structure with $O(n^{3/2})$ space and $O(1)$ query time ### Summary + APSP algorithms + Have to output an $n\times n$ table + Distance Oracle (*Thorup, Zwick ‘01*) + Space: $O(kn^{1+1/k})$ + Query time: $O(k)$ + Return $(2k-1)$-stretch distance