# Lecture 1 - Introduction to Approximate Distance Algorithms and Oracles
###### tags: `Algorithm`
## Shortest Path Problem
+ Given a weighted graph $G=(V,E,w)$
+ $V$: sets of vertices, $n=|V|$
+ $E$: sets of edges, that is, $E\subseteq\{(u,v)|u,v\in V\}$, $m=|E|$
+ $w$: $E\to \Bbb{R}^+$ is called weight/length on $G$
+ **Unweighted graph**: $w(e)=1$, $\forall e\in E$
+ Degree of $u$: number of edges associated with $u$
+ Denoted by $deg(u)$
+ Also have in-degree and out-degree for directed graph
+ Undirected graph: $(u,v)\in E\iff (v,u)\in E$

+ Directed graph: $(u,v)\in E\not\Rightarrow (v,u)\in E$

### Single Source Shortest Path (SSSP)
+ Given $s$ and $t$, find the path of minimum length from $s$ to $t$.

+ The length of the shortest path is called the **distance** from $s$ to $t$, denoted by $\delta(s,t)$
+ Triangle inequality: $\delta(u,w)\leq\delta(u,v)+\delta(v,w)$ for any $u,v,w$

### All Pairs Shortest Path (APSP)
Given a graph, find the distance between every two vertices

## Classic Algorithm
### *Dijkstra*'s Algorithm
+ Each node has three status: $unreached$, $active$, $scanned$
+ Label $d(v)$ for every vertex $v$
```python
Set all node v!=s unreached and d(v)=inf
Set s active and d(s)=0
while there is an active node:
Select an active node u with minimum d(u)
mark u scanned
for every outgoing edge (u,v):
d(v) = min{d(v), d(u)+w(u,v)}
if v is unscanned:
mark v active
```
+ For single-source all-destination shortest path problem
+ We can use a **priority queue** to maintain the $d(s)$ for $active$ $s$
+ By **Fibonacci heap** to realize the priority queue, the running time is $O(m+n\log n)$
+ APSP: run *Dijkstra*'s algorithm for every vertex $v$
+ Running Time: $O(mn+n^2\log n)$
+ When $m=O(n^2)$, running time is $O(n^3)$
### *Floyd-Warshall*'s Algorithm
+ Initially
+ $dist[u][u]=0$, $\forall u\in V$
+ $dist[u][v]=w(u,v)$, $\forall(u,v)\in E$
+ $dist[u][v]=\infty$, otherwise
```cpp
for(int k=0 ; k<n ; ++k)
for(int i=0 ; i<n ; ++i)
for(int j=0 ; j<n ; ++j)
if(dist[i][j] > dist[i][k]+dist[k][j])
dist[i][j] = dist[i][k]+dist[k][j];
```
+ Running time $O(n^3)$
### APSP conjecture
+ Unweighted undirected graphs: $O(n^{\omega})\approx O(n^{2.38})$(*Seidel 95*)
+ $O(n^{\omega})$ is the time for fast matrix multiplication
+ Unweighted directed graphs: $O(n^{\mu})\approx O(n^{2.58})$(*Zwick 98*)
+ $\mu$ is related to rectangular matrix multiplication
+ Real edge weights: $O(n^3/2^{\Omega(\sqrt{\log n})})$(*Williams 14*)
+ **Conjecture**: No APSP algorithm in $O(n^{3-\delta})$ time for constant $\delta>0$ when the edge weights are $[1...n]$.
## All Pairs Almost Shortest Path (APASP)
Denote the distance between $u$ and $v$ to be $\delta(u,v)$
An estimated distance $\delta ’(u,v)$ is of **stretch $k$** $\iff\delta(u,v)\leq\delta'(u,v)\leq k\delta(u,v)$
An estimated distance $\delta ’(u,v)$ is of **surplus $t$** $\iff\delta(u,v)\leq\delta'(u,v)\leq \delta(u,v)+t$
### Unweighted Undirected Graph
+ $G=(V,E)$ is an unweighted undirected graph
+ SSSP: BFS in $O(m)$ time

+ We pick every vertex independently at random with prob. $1/k$ to form **a set** $D$
+ $E[|D|]=n/k$
+ By Chernoff bound, $Pr[|D|\geq(1+\gamma)(n/k)]\leq e^{-\Omega(\gamma(n/k))}$
+ If a vertex has degree $d$, what’s the probability that it is adjacent to a vertex in $D$?
+ $1-(1-1/k)^d$
+ When $d=ck\log n$, $1-(1-1/k)^{ck\log n}\approx 1-1/n^c$
#### Theorem: Given a degree bound $d$, we can find a set $D$ with $\tilde O(n/d)$ vertices such that all vertices with degree $≥d$ are adjacent to a vertex in $D$ w.h.p.
:::info
$\tilde O(·)$ hides constant number of logarithmic factors
e.g. $\tilde O(n^2)$ means $O(n^2\log^cn)$
:::
#### Basic Algorithm
+ Let $V_1$ be the set with degree $\geq n^{1/2}$
+ Let $D_1$ be the sampled set with $\tilde O(n/n^{1/2})=\tilde O(n^{1/2})$ vertices
+ such that all vertices in $V_1$ is adjacent to some vertex in $D_1$

+ Let $E^*$ be the set of edges connecting every $w$ vertex in $V_1$ to a vertex $w’$ in $D_1$
+ $|E^*|=O(n)$

+ Let $E_2$ be the edges $(u,v)$ such that $u∉V_1$ or $v∉V_1$
+ $deg(u)$ or $deg(v)$ is $<n^{1/2}$
+ $|E_2|\leq n\times n^{1/2}=n^{3/2}$

+ Run BFS from every $u\in D_1$
+ We can get $\delta(u,v)$ for $u∈D_1$, $v∈V$, and denote the table by $\delta(D_1\times V)$
+ Running time: $O(m|D_1|)= \tilde O(n^{5/2})$, when $m=O(n^2)$

+ Run *Dijkstra* from every $u\notin D_1$ in the graph $G’=(V, E_2\cup\delta(D_1\times V))$
+ $G’$ has $|E_2|+n|D_1|= \tilde O(n^{3/2})$ edges
+ Total running time: $\tilde O(n^{5/2})$

#### Analysis of Basic Algorithm
+ Consider a shortest path from $u$ to $v$
+ If $u∈D_1$ or $v∈D_1$, $\delta(u,v)$ can be found in the first step
+ If all vertices in the shortest path between $u$ and $v$ have degree $<n^{1/2}$, all edge in the path is in $E_2$, so can be found in the second step
+ If a vertex $w$ in the shortest path have degree $≥n^{1/2}$
+ $w$ is adjacent to a vertex $w’∈D_1$
+ $\delta(u,w')$ and $\delta(w',v)$ are found in first step
+ So in the second step, the distance $\delta'(u,v)$ we found will have $\delta(u,v) ≤ \delta(u,w’)+\delta(w’,v) ≤ \delta(u,w)+ \delta(w,v) + 2 = \delta(u,v) +2$

+ **Running Time**: $\tilde O(n^{5/2})$
#### Improved Algorithm
+ Let $V_1$ be the vertices with degree $≥n^{2/3}$
+ Let $V_2$ be the vertices with degree $≥n^{1/3}$
+ Let $D_1$ be the sampled set with $\tilde O(n/n^{2/3})=\tilde O(n^{1/3})$ vertices such that all vertices in $V_1$ is adjacent to some vertex in $D_1$
+ Let $D_2$ be the sampled set with $\tilde O(n/n^{1/3})=\tilde O(n^{2/3})$ vertices such that all vertices in $V_2$ is adjacent to some vertex in $D_2$.
+ $E^*$ be the edges connecting every $w∈V_2$ to a $w’∈D_2$
+ Let $E_2$ be the edges $(u,v)$ such that $u∉V_1$ or $v∉V_1$
+ $|E_2|=O(n^{5/3})$
+ Let $E_3$ be the edges $(u,v)$ such that $u∉V_2$ or $v∉V_2$
+ $|E_3|=O(n^{4/3})$
+ Run BFS from every $u∈D_1$
+ Run BFS from every $u∈D_2$ in $E_2$
+ We get $\delta’(D_2\times V)$ on the graph induced by $E_2$
+ Run *Dijkstra* from every $u∈V$ on $G”=(V, E_3∪\delta(D_1\times V)∪\delta’(\{u\}\times D_2)∪E^*)$
+ **Running time**: $\tilde O(n^{7/3})$
#### Analysis of Improved Algorithm
+ Consider a shortest path from $u$ to $v$
+ If all vertices on the shortest path have degree $<n^{1/3}$
+ the path is in $E_3$
+ If there is a vertex $w$ with degree $≥n^{2/3}$ on the shortest path
+ $w$ is adjacent to a vertex $w’∈D_1$
+ $\delta(u,w’)$ and $\delta(w’,v)$ are found in the first step, $\delta(u,w’)≤\delta(u,w)+1$ and $\delta(w’,v)≤\delta(w, v)+1$
+ So the distance $\delta''(u,v)$ we found have $\delta''(u,v)\leq\delta(u,w')+\delta(w',v)\leq\delta(u,w)+\delta(w,v)+2=\delta(u,v)+2$

+ If there are no vertices with degree $≥n^{2/3}$ but there is a vertex $w$ with degree $≥n^{1/3}$ on the shortest path
+ Let $w$ be the last such vertex on the path, and $w$ is adjacent to a vertex $w’∈D_2$ and $(w,w’)∈E^*$
+ In second step, $\delta'(u,w')\leq\delta(u,w)+1$
+ In the final step, $\delta'(u,w')$ is in $\delta’(\{u\}\times D_2)$, $\delta(w',w)$ is in $E^*$, all edges from $w$ to $v$ are in $E_3$, so $\delta''(u,v)≤\delta'(u,w’)+\delta(w’,w)+\delta(w,v)≤\delta(u,v)+2$

#### Summary
|Surplus|Time*|Reference|
|---|----|----|
|$2$|$n^{5/2}$|*Aingworth-ChekuriIndyk-Motwani ’96*|
|$2(k-1)$|$kn^{2-1/k}m^{1/k}$, $kn^{2+1/{(3k-4)}}$|*Dor-Halperin-Zwick ’96*|
\*Ignoring polylogarithmic factors
+ When $k=\log n$, we can get a $\tilde O(n^2)$-time +$O(\log n)$-APASP algorithm
##### Undirected Graph
+ Undirected is crucial
+ There is still no such algorithm for directed graphs
##### Unweighted Graph
+ For additive error approximation, unweighted graph is convenient
+ For weighted graph, adjacency to a vertex doesn’t mean close to it
+ We can try **multiplicative approximation**
### Weighted Undirected Graph
#### How the find the $b$ closest vertices to $s$?

+ In *Dijkstra* it takes $\tilde O(nb)$ time to find the first $s$ marked vertices
+ **Truncated *Dijkstra*** in $\tilde O(b^2)$ time
```python
Set all node v!=s unreached and d(v)=inf
Set s active and d(s)=0
while there is an active node:
Select an active node u with minimum d(u)
mark u scanned
for b smallest outgoing edge (u,v) from u:
d(v) = min{d(v), d(u)+w(u,v)}
if v is unscanned:
mark v active
if b vertices are marked:
break
```
+ For other edges $(u,v)$ (not $b$ smallest), if the path $s\to ...\to u\to v$ is the shortest path from $s$ to $v$, then $v$ is not among the $b$ closest to $s$

#### Algorithm
+ For every vertex $v$, let $ball(v)$ be the set of $n^{2/3}$ closest vertices to $v$. Use truncated-*Dijkstra* to find the distance from $v$ to all vertices in $ball(v)$
+ Running time: $\tilde O(n\times n^{4/3}) =\tilde O(n^{7/3})$
+ Let $D$ be the randomly sampled $\tilde O(n^{1/3})$ vertices such that every $ball(v)\cap D\neq\emptyset$ for every $v$

+ Run *Dijkstra* from all vertices in $D$
+ Running time: $\tilde O(n^{1/3}\times n^2)=\tilde O(n^{7/3})$
+ For every pair $u$, $v$ of vertices
+ If $u\in D$ or $v\in D$ or $v\in ball(u)$, real distance have been found
+ Otherwise, $v\notin ball(u)$ but there is a $w\in ball(u)\cap D$, return $\delta(u,w)+\delta(w,v)$
+ $\delta(u,w)\leq\delta(u,v)$
+ $\delta(w,v)\leq\delta(w,u)+\delta(u,v)\leq2\delta(u,v)$
$\Rightarrow\delta(u,w)+\delta(w,v)\leq3\delta(u,v)$

+ Running time: $O(1)$ per pair
+ **Running time**: $\tilde O(n^{7/3})$
+ **3-stretch APASP**
#### Summary
|Stretch|Time*|
|---|---|
|$2$|$n^{3/2}m^{1/2}$|
|$7/3$|$n^{7/3}$|
|$3$|$n^2$|
Reference: *Cohen-Zwick ‘97*
\*Ignoring polylogarithmic factors
## Distance Oracle
### Basic Idea
:::info
+ For every vertex $v$, let $ball(v)$ be the set of $n^{2/3}$ closest vertices to $v$. Use truncated-*Dijkstra* to find the distance from $v$ to all vertices in $ball(v)$
+ Let $D$ be the randomly sampled $\tilde O(n^{1/3})$ vertices such that every $ball(v)∩D\neq\emptyset$ for every $v$
+ Run *Dijkstra* from all vertices in $D$
:::
$\Rightarrow$ We obtain $\delta(u,v)$ for $u\in D$ or $v\in D$ or $v\in ball(u)$
:::info
+ For every pair $u$, $v$ of vertices
+ If $u\in D$ or $v\in D$ or $v\in ball(u)$, real distance have been found
+ Otherwise, $v\notin ball(u)$ but there is a $w\in ball(u)\cap D$, return $\delta(u,w)+\delta(w,v)$
:::
$\Rightarrow$ For every pair $u$, $v$, we can obtain the approximate distance
+ **Data Structure** with $O(n^{5/3})$ space and $O(1)$ query time
### Improve Spcace Complexity
:::info
+ For every vertex $v$, let $ball(v)$ be the set of $n^{1/2}$ closest vertices to $v$. Use truncated-Dijkstra to find the distance from $v$ to all vertices in $ball(v)$
+ Let $D$ be the randomly sampled $\tilde O(n^{1/2})$ vertices such that every $ball(v)\cap D\neq\emptyset$ for every $v$
+ Run *Dijkstra* from all vertices in $D$
+ For every pair $u$, $v$ of vertices
+ If $u\in D$ or $v\in D$ or $v\in ball(u)$, real distance have been found
+ Otherwise, $v\notin ball(u)$ but there is a $w\in ball(u)\cap D$, return $\delta(u,w)+\delta(w,v)$
:::
+ Data structure with $O(n^{3/2})$ space and $O(1)$ query time
### Summary
+ APSP algorithms
+ Have to output an $n\times n$ table
+ Distance Oracle (*Thorup, Zwick ‘01*)
+ Space: $O(kn^{1+1/k})$
+ Query time: $O(k)$
+ Return $(2k-1)$-stretch distance