# Group Assignment 2.1-2.3
## 2.1 (B-question)
*Design and analyze an algorithm that, given an unweighted undirected graph $G$, outputs whether there are two distinct vertices $u$ and $v$ in $G$ such that $N(u) = N(v)$ where, for any vertex $x$, $N(x)$ is the set of neighbors of $x$, i.e., the vertices that are adjacent to $x$.
Your algorithm should:
Take $O((n + m)(\log n)^c)$ time for some constant $c$, where $n$ is the number of vertices and $m$ is the number of edges in the graph.
Output the correct answer with probability at least $0.99$.
Hints:
Try thinking of $N(v)$ as an $n$-bit string in $\{0,1\}^n$.
It might be good to use some hash function.
A key challenge is to avoid running in $n^2$ time: note that we do not actually have time to write down an n-bit string for each vertex v in the graph (since that would be $n^2$ bits to write down so would take $Ω(n^2)$ time).*
---
Hash function:
$f: [n] \to [k]$
We use a fingerprinting algorithm so that the algorithm detects whether $N(u) = N(v)$ for some pair of distinct vertices, and answers correctly with probability at least $0.99$; the only error events are hash collisions between distinct neighbor sets.
Each set of neighbors $N(u)$ can be considered an $n$-bit string in $\{0,1\}^n$, where each bit $i$ corresponds to a vertex $i$, and a $1$ means that vertex $i$ is in $N(u)$, while a $0$ means that vertex $i$ is not in $N(u)$.
For two distinct neighbor sets $N(x)$ and $N(y)$, a hash function drawn from a universal family (for a prime $p$, pick $a \neq 0$ and $b$ uniformly at random and map $z \mapsto ((az+b) \bmod p) \bmod n$) collides with probability $P(h(N(x))=h(N(y))) \leq \frac{p(p-1)/n}{p(p-1)} = \frac{1}{n}$, since at most $p(p-1)/n$ of the $p(p-1)$ parameter choices $(a,b)$ cause a collision.
$N(v)$ can be encoded as the number $\sum_k 2^k$, where $k$ ranges over the indices of all neighbors of $v$.
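One way to put these pieces together in $O(n+m)$ time is the following sketch (our own illustration, not necessarily the intended hash family): assign every vertex an independent random weight modulo a large prime $p$, and fingerprint $N(u)$ as the sum of its neighbors' weights. Equal neighborhoods always produce equal fingerprints, and two distinct neighborhoods collide with probability $1/p$, so a union bound over all $\binom{n}{2}$ pairs keeps the total error far below $0.01$. The function name and edge-list input format are our own choices:

```python
import random

def has_twin_vertices(n, edges):
    """Return True if some pair of distinct vertices u, v has N(u) == N(v).

    Runs in O(n + m) time: each neighbor set is hashed to a single number
    instead of being written out as an n-bit string.
    """
    p = (1 << 61) - 1                                 # a large prime modulus
    weight = [random.randrange(p) for _ in range(n)]  # random vertex weights

    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = {}                          # fingerprint -> vertex
    for u in range(n):
        # Fingerprint of N(u): sum of the neighbors' weights mod p.
        fp = sum(weight[v] for v in adj[u]) % p
        if fp in seen:                 # almost surely N(u) == N(seen[fp])
            return True
        seen[fp] = u
    return False
```

False negatives are impossible (equal sets always hash equally), so the $0.99$ guarantee only needs the collision bound $\binom{n}{2}/p \ll 0.01$, which this choice of $p$ easily satisfies.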
## 2.2 (E-question)
Note that in some graphs the minimum cut is not unique, i.e., there can be more than one minimum cut. Design and analyze a polynomial time algorithm that finds all minimum cuts in a graph. The algorithm should output a correct answer with probability p for some constant (independent of the size of the graph) $p > 0$.
Hint: look closely at the analysis of Karger’s random contraction algorithm. What is the probability that this algorithm outputs any particular minimum cut?
----
**First attempted solution (does not work)**
The idea behind the algorithm is simple: run the randomized min-cut algorithm $r$ times and return the resulting $r$ cuts. We want the probability that these cuts include all minimum cuts of the graph to be at least some constant $p$.
Randomized 2-cut algorithm:
```
function RandomizedMinCut(G):
    while G has more than 2 vertices:
        Pick a uniformly random edge of G and contract it
    return the resulting cut
```
Complete algorithm:
```
function ProbablyAllMinCuts(G):
    Let S be an empty set
    do r times:
        Let mincut be the result of RandomizedMinCut(G)
        Add mincut to S
    return S
```
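The two procedures above can be rendered as runnable Python (a sketch under our own conventions: vertices are $0,\dots,n-1$, the graph is a connected multigraph given as an edge list, and contractions are tracked with a union-find structure; self-loops are discarded lazily, so each draw stays uniform over the surviving edges):

```python
import random

def randomized_min_cut(n, edges):
    """One run of the random contraction algorithm.

    Returns (side, cut_size), where `side` is the set of vertices merged
    with vertex 0 and `cut_size` the number of crossing edges.
    """
    parent = list(range(n))

    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pool = list(edges)                 # surviving (possibly parallel) edges
    alive = n
    while alive > 2:
        i = random.randrange(len(pool))
        u, v = pool[i]
        ru, rv = find(u), find(v)
        if ru == rv:                   # became a self-loop: drop and redraw
            pool[i] = pool[-1]
            pool.pop()
            continue
        parent[ru] = rv                # contract the chosen edge
        alive -= 1

    side = frozenset(v for v in range(n) if find(v) == find(0))
    cut_size = sum(1 for u, v in edges if (u in side) != (v in side))
    return side, cut_size

def probably_all_min_cuts(n, edges, r):
    """Repeat the contraction r times; return the distinct smallest cuts found."""
    found = {}                         # side containing vertex 0 -> cut size
    for _ in range(r):
        side, size = randomized_min_cut(n, edges)
        found[side] = size
    best = min(found.values())
    return {side for side, size in found.items() if size == best}
```

On, e.g., two triangles joined by a single bridge, the bridge is the unique minimum cut, and a few hundred repetitions find it with overwhelming probability.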
To get all minimum cuts with a probability $p$ independent of the size of the graph, we run Karger's random contraction algorithm $r$ times, where $r$ is to be determined.
The chance that a single run of Karger's random contraction algorithm outputs a particular minimum cut is $p_i \geq \frac{2}{n^2}$, as was proven in the lecture slides.
Therefore, the chance that exactly $l$ of the $r$ executions output a minimum cut (where we will want $l \geq k$, with $k$ the number of distinct minimum cuts) is:
$$
P(\text{l minimum cuts}) = \left(\frac{2}{n^2} \right)^l \cdot {r \choose l} \cdot \left(1-\frac{2}{n^2}\right)^{r-l} \geq \left(\frac{2}{n^2} \right)^l \cdot \frac{r!}{l!(r-l)!} \cdot \left(e^{-r\cdot 2 /n^2} \right)^{r-l}
$$
The chance that each of these is distinct is:
$$
P(\text{k unique found}) \geq \frac{k}{k} \cdot \frac{k-1}{k} \cdot ... \cdot \frac{1}{k} = \frac{k!}{k^k}
$$
Therefore, the chance that we find $l$ minimum cuts of which $k$ are distinct is:
$$
P = P(\text{k minimum cuts}) \cdot P(\text{k unique}) \geq \left(\frac{2}{n^2} \right)^l \cdot \frac{r!}{l!(r-l)!} \cdot \left(e^{-r\cdot 2 /n^2} \right)^{r-l} \frac{k!}{k^k}
$$
$$
= \frac{r!k!}{l!(r-l)!} e^{-\frac{2}{n^2} r(r-k)} \left(\frac{2}{k \cdot n^2}\right)^k
$$
If we assume that $r\geq k \cdot n^2$ and choose $l$ such that $\frac{r!}{(r-l)!} \geq r^k$, then:
$$
\geq \frac{k!}{l!} e^{-\frac{2}{n^2} r(r-k)} \geq p
$$
Which leads to:
$$
-\frac{2}{n^2}r(r-k) \geq \ln \left( \frac{l!}{k!} \cdot p \right)
$$
$$
r(r-k) \leq - \frac{n^2}{2} \cdot \ln \left( \frac{l!}{k!} \cdot p \right)
$$
This can hold only if $\frac{l!}{k!} \cdot p < 1$. Since $p$ is supposed to be a constant, while this forces $p$ to depend on $l$, which in turn depends on the size of the graph, the probability cannot be made constant with this method.
---
$$
\frac{r!}{(r-l)!} e^{-\frac{2}{n^2} r(r-k)} \geq \frac{l!}{k!} \left( \frac{k\cdot n^2}{2} \right)^k \cdot p
$$
---
$$
e^{-\frac{2}{n^2} r(r-k)} \geq p \cdot \left( \frac{k\cdot n^2}{2}\right)^k \geq p
$$
$$
-\frac{2\cdot r(r-k)}{n^2} \geq k \ln \left(\frac{k\cdot n^2}{2} \right) + \ln p
$$
$$
r(r-k) \leq r^2 \leq -\frac{n^2}{2} \cdot \left( k \ln \left(\frac{k\cdot n^2}{2} \right) + \ln p \right)
$$
For large enough $n$ the right-hand side is negative, while $r(r-k) \geq 0$ for any valid number of repetitions $r \geq k$. Hence no valid $r$ exists: this method does not yield a constant success probability, and we abandon it.
----
**Real solution**
Run Karger's random contraction algorithm $r=\frac{n^2}{2}$ times; then the probability that a fixed minimum cut is found among these runs is constant:
$$P(\text{minimum cut}) \geq 1-\left(1-\frac{2}{n^2}\right)^r \geq 1-e^{-r \cdot 2 / n^2} = 1- e^{-1}$$
The probability of finding a minimum cut in each of $k$ batches of $\frac{n^2}{2}$ runs (i.e. $\frac{kn^2}{2}$ runs in total, the cuts not necessarily distinct) is
$$P(\text{$k$ minimum cuts}) \ge (1 - e^{-1})^k$$
The probability of finding $k$ distinct minimum cuts after $\frac{kn^2}{2}$ runs is
$$P(\text{$k$ distinct minimum cuts}) \ge (1-e^{-1}) \cdot ((1-e^{-1}) \cdot \frac{k-1}{k}) \cdot ... \cdot ((1-e^{-1}) \cdot \frac{1}{k})$$
$$= (1-e^{-1})^k\left(\frac{k-1}{k} \cdot \ldots \cdot \frac{1}{k}\right) = (1-e^{-r\cdot 2 / n^2})^k \left( \frac{k!}{k^k} \right) \geq p$$
$$
1- e^{-r\cdot 2 /n^2} \geq \left( p \cdot \frac{k^k}{k!} \right)^{1/k} \geq p^{1/k}
$$
$$
-r \cdot 2 / n^2 \leq \ln(1 - p^{1/k})
$$
$$
r \geq -\frac{n^2}{2} \ln(1-p^{1/k})
$$
Since $k \leq \frac{2m}{n}$ we have $p^{1/k} \leq p^{n/2m}$, so $-\frac{n^2}{2} \ln(1-p^{1/k}) \leq -\frac{n^2}{2} \ln(1-p^{n/2m})$. It therefore suffices to take $r = -\frac{n^2}{2} \ln(1-p^{n/2m})$, which can be computed without knowing $k$.
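For concreteness, the bound above can be evaluated numerically; this helper (our own illustration, with `p` the target constant success probability) computes the per-cut repetition count from $n$ and $m$ alone:

```python
import math

def repetitions(n, m, p=0.5):
    """Per-cut repetitions suggested by r = -(n^2/2) * ln(1 - p^(n/(2m)))."""
    return math.ceil(-(n * n / 2) * math.log(1 - p ** (n / (2 * m))))
```
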
Thus, for the polynomial number of repetitions $r$ of the contraction algorithm computed above, there is a constant probability $0<p<1$ that we find all minimum cuts.
----
The probability that Karger's random contraction algorithm outputs any particular minimum cut in one run is at least $\frac{2}{n(n-1)} \geq \frac{2}{n^2}$.
We want to bound $P(\text{found mincut } 1 \wedge \text{found mincut } 2 \wedge \ldots \wedge \text{found mincut } C)$.
Let $k$ be the size of a minimum cut. Each minimum cut is a set of $k$ edges, so the number of minimum cuts satisfies $C \leq {m \choose k}$, with $k \leq \frac{2m}{n}$ since a minimum cut is at most the minimum degree and the total degree is $2m$.
Let the number of minimum cut solutions be $C$.
Assume that Karger's random contraction algorithm outputs each of these $C$ solutions with equal probability.
If we repeat Karger's random contraction algorithm $x$ times, the probability of getting a minimum cut exactly $t$ times is $p = (\frac{2}{n(n-1)})^t \cdot {x \choose t} \cdot (1-\frac{2}{n(n-1)})^{x-t}$.
$t$ is supposed to be at least $C$. We therefore let $t=C$, as that would mean precisely all cuts are found, assuming all cuts found are distinct. The likelihood that the cuts found are all distinct is $1 \cdot \frac{C-1}{C} \cdot \frac{C-2}{C} \cdot ... \cdot \frac{C-t + 1}{C}$, which for $t=C$ equals $\frac{C-1}{C} \cdot \frac{C-2}{C} \cdot ... \cdot \frac{1}{C}$
$$=\frac{(C-1)!}{C^{C-1}} = \frac{C!}{C^C}$$
The likelihood then becomes
$$p = (\frac{2}{n(n-1)})^t \cdot {x \choose t} \cdot (1-\frac{2}{n(n-1)})^{x-t} \cdot \frac{C!}{C^C}$$
## 2.3 (D-question)
Given an undirected unweighted graph $G$, the t-multicut problem is to find the minimum number of edges whose removal separates $G$ into at least $t$ connected components. Thus, the mincut problem we studied in the lecture is the same as the 2-multicut problem.
Design and analyze an algorithm for the 3-multicut problem that takes polynomial time and outputs a correct answer with probability at least 0.99.
Hint: Let Opt be the size of the minimum 3-multicut and $δ_1$ and $δ_2$ be the degrees of the two nodes with smallest degrees in the graph. It might be helpful to prove that $Opt ≤ δ_1 + δ_2$ and that the total degree over all nodes is $Ω((δ_1 + δ_2)n)$.
## Solution
Let the graph be $G=(V,E)$ and $n=|V|$.
In order to find an optimal solution with probability at least 0.99, we repeat the following algorithm $r=0.77n^3$ times and return the output with the smallest number of edges.
### Algorithm:
Taking inspiration from lecture 8, we use the same contraction algorithm with the slight modification that it stops when three vertices remain (instead of two, as for the 2-multicut).
That is, we repeatedly choose a uniformly random edge and merge the two vertices it connects, until only three vertices remain.
**Pseudocode:**
Input: A graph $G=(V,E)$
Output: Three vertex subsets $A$, $B$ and $C$; removing the edges running between different subsets separates $G$ into at least three components.
```
def Randomized3-multicut(G):
    while G has more than 3 vertices:
        Pick a uniformly random edge of G and contract it
    return the resulting cut
```
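A runnable sketch of this pseudocode (our own Python rendering, under the same assumptions as a standard contraction implementation: vertices $0,\dots,n-1$, a connected input graph given as an edge list, and union-find to track merged super-vertices; the wrapper that repeats the algorithm and keeps the smallest cut found is included):

```python
import random

def randomized_3_multicut(n, edges):
    """One run: contract random edges until three super-vertices remain.

    Returns (parts, cut_size): the three vertex sets A, B, C and the
    number of edges crossing between them.
    """
    parent = list(range(n))

    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pool = list(edges)                 # surviving (possibly parallel) edges
    alive = n
    while alive > 3:
        i = random.randrange(len(pool))
        u, v = pool[i]
        ru, rv = find(u), find(v)
        if ru == rv:                   # became a self-loop: drop and redraw
            pool[i] = pool[-1]
            pool.pop()
            continue
        parent[ru] = rv                # contract the chosen edge
        alive -= 1

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), set()).add(v)
    parts = list(groups.values())      # the three subsets A, B, C
    cut_size = sum(1 for u, v in edges if find(u) != find(v))
    return parts, cut_size

def best_3_multicut(n, edges, r):
    """Repeat r times and keep the smallest 3-multicut found."""
    return min((randomized_3_multicut(n, edges) for _ in range(r)),
               key=lambda pc: pc[1])
```

For example, on two triangles joined by a bridge plus one pendant vertex, the optimal 3-multicut removes the bridge and the pendant edge (2 edges), and enough repetitions find it with high probability.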
**Lemma 1:** $Opt ≤ δ_1 + δ_2$, where $δ_1$ and $δ_2$ are the degrees of the two nodes with smallest degrees in the graph.
**Proof:** We can always cut out two nodes, generating two subsets with one node each and one subset with the remaining $n-2$ nodes. The cheapest cut of this form is obtained by cutting out the two nodes with the smallest degrees $\delta_1$ and $\delta_2$, which removes at most $\delta_1 + \delta_2$ edges.
Therefore $Opt \le \delta_1 + \delta_2$.
**Lemma 2:** The total degree over all nodes is at least $\frac{1}{2}(\delta_1 + \delta_2)n$.
**Proof:** Every vertex has degree at least $\delta_1$, and every vertex except one has degree at least $\delta_2$. Hence $\sum_{i \in V}{\delta(i)} \ge \delta_1 + (n-1)\delta_2 \ge \frac{1}{2}(\delta_1 + \delta_2)n$, where the last step uses $\delta_2 \geq \delta_1$ and $n \geq 2$.
**Lemma 3:** The probability that the algorithm returns an optimal solution to the minimum 3-multicut problem is at least $\frac{6}{n^3}$.
**Proof:** For each contraction, the contraction is bad if we pick an edge from the optimal cut. For the first contraction there are $Opt$ bad edges to choose from, and $|E|$ total edges. The probabililty of a bad choice is $\frac{Opt}{|E|}$. We know from Lemma 1 that $Opt \le δ_1+δ_2$. We also know from Lemma 2 that $2|E| = \sum_{i \in V}{\delta(i)} \ge \frac{1}{2}(\delta_1 + \delta_2)n \implies |E| \ge \frac{1}{4}(\delta_1 + \delta_2)n$.
$P(\text{bad choice}) = \frac{Opt}{|E|} \le 4\frac{(δ_1+δ_2)}{(\delta_1 + \delta_2)n} = \frac{4}{n}$
In the i:th step there are $n-i+1$ vertices left, and the probability of selecting a good edge is at least:
$1-\frac{4}{n-i+1} = \frac{n-i-3}{n-i+1}$
Each per-step bound holds conditioned on the earlier contractions being good, so the bounds multiply. The total probability of selecting a good edge in all $n-3$ contractions is therefore at least:
$$\frac{n-3}{n} \cdot \frac{n-4}{n-1} \cdot\cdot\cdot \frac{2}{5} \cdot \frac{1}{4} = \frac{6}{n(n-1)(n-2)} \ge \frac{6}{n^3}$$
**Lemma 4:** Repeating the algorithm at least $r=0.77n^3$ times gives an optimal 3-multicut with probability at least $0.99$.
**Proof:** If we repeat $r$ times, the probability of at least one success is:
$$1-(1-\frac{6}{n^3})^r \ge 1-e^{-r\frac{6}{n^3}}$$ since $1-x \leq e^{-x}$ for all $x$.
Solving for the wanted success probability yields:
$$1-e^{-r\frac{6}{n^3}} = 0.99 \implies -r\frac{6}{n^3} = \ln 0.01 \implies r = -\frac{n^3}{6}\ln 0.01 \approx 0.77n^3$$
**Lemma 5:** Repeating the algorithm $0.77n^3$ times can be done in polynomial time.
**Proof:** Repeating the algorithm a polynomial number of times takes polynomial time provided that a single run takes polynomial time, which we now show.
The algorithm does two things $n-3$ times (where $n$ is the number of vertices in the graph): it picks a uniformly random edge and contracts it.
Picking a uniformly random edge can be done in $O(n)$ time, as shown in lecture 8, through book-keeping: pick a random vertex $v$ with probability proportional to its degree (stored in an array $D$), then pick a random edge incident on $v$.
Contracting an edge can be done in $O(n)$ time (so $O(n^2)$ over all contractions), as also shown in lecture 8, using the following pseudocode:
```
def Contract(u,v):
    foreach vertex w except u,v:
        E[u,w] <- E[u,w] + E[v,w]
        E[v,w] <- 0
    D[u] <- D[u] + D[v] - 2E[u,v]
    E[u,v] <- 0
    D[v] <- 0
```
**Final statement:** By Lemmas 4 and 5, the algorithm runs in polynomial time and outputs a correct answer with probability at least 0.99.