# Lecture 8 - Some Primal-Dual Relaxation Problem
###### tags: `Algorithm`
## Linear Programming (LP) Duality
| Primal | Dual |
| ----------- | ----------- |
| $\min c^Tx$ | $\max b^Ty$ |
| s.t. $$Ax\geq b\\x\geq 0$$ | s.t. $$A^Ty\leq c\\y\geq 0$$ |
+ For every feasible $x$ and $y$, $c^Tx\geq b^Ty$
+ Thus, $Opt(primal)\geq Opt(dual)$
+ The dual of the dual is the primal
### Meaning of Duality
| $\min$ | $c_1x_1+c_2x_2+...+c_nx_n$ |
| ----------- | ----------------------------------------------------------------------------------- |
| $y_1\geq 0$ | $a_{11}x_1+ a_{12}x_2+...+a_{1n}x_n\geq b_1$ |
| $...$ | $...$ |
| $y_m\geq 0$ | $a_{m1}x_1+ a_{m2}x_2+...+a_{mn}x_n\geq b_m$ |
| | $(\sum_{i=1}^m{y_ia_{i1}})x_1+...+(\sum_{i=1}^m{y_ia_{in}})x_n\geq\sum_{i=1}^m{y_ib_i}$ |
+ If $\sum_{i=1}^ma_{ij}y_i\leq c_j$for all $j$, then $c_1x_1+...+c_nx_n\geq b_1y_1+...+b_my_m$
+ That is, we gives a lower bound on the objective function so we want to maximize $b_1y_1+...+b_my_m$
### LP Duality Theorem: $Opt(primal)$ is finite $\iff$ $Opt(dual)$ is finite, and if $x^*$ and $y^*$ are optimal then $c^Tx^*=b^Ty^*$
### Complementary Slackness Conditions
+ If $x$ and $y$ are optimal
+ Primal conditions: $\forall j$, either $x_j=0$ or $\sum_{i=1}^ma_{ij}y_i=c_j$
+ Dual conditions: $\forall i$, either $y_i=0$ or $\sum_{j=1}^na_{ij}x_j=b_i$
#### Proof
| $\min$ | $c_1x_1+c_2x_2+...+c_nx_n$ |
| ----------- | ----------------------------------------------------------------------------------- |
| $y_1\geq 0$ | $a_{11}x_1+ a_{12}x_2+...+a_{1n}x_n= b_1$ |
| $...$ | $...$ |
| $y_m= 0$ | $a_{m1}x_1+ a_{m2}x_2+...+a_{mn}x_n\geq b_m$ |
| | $(\sum_{i=1}^m{y_ia_i1})x_1+...+(\sum_{i=1}^m{y_ia_in})x_n\geq\sum_{i=1}^m{y_ib_i}$ |
+ If $c_1x_1+...+c_nx_n=b_1y_1+...+b_my_m$, $\forall i$, either $y_i=0$ or $\sum_{j=1}^na_{ij}x_j=b_i$
## Set Cover
### Problem Definition
+ Given
+ Universe $U$ of $n$ elements
+ $S=\{S_1, S_2,..., S_k\}$ is a collection of subsets of $U$
+ Cost function $c:S\to \mathbb Q^+$
+ Goal: Find min cost subcollection of $S$ covering all of $U$

+ The **frequency** of an element is the number of sets it is in
+ $f$ is frequency of most frequent element
+ **NP-hard problem**
+ **Vertex Cover is Set Cover with $f=2$**
+ Elements: edges
+ Sets: vertices
+ Each edge associated with $2$ vertices
#### Set Cover as Integer Programming (IP)
+ Indicator variable $x_i\in \{0,1\}$ for each set $S_i$ in $S$
+ The constraint is that we choose at least one of the sets containing each element
| $\min\sum_{i=1}^k{c(S_i)x_i}$ |
| -------- |
| s.t. $$\sum_{i:e\in S_i}x_i\geq 1, e\in U\\x_i\in \{0,1\}, i=1,2,...,k$$ |
#### LP relaxation of IP
+ The constraint is that we choose at least one of the sets containing each element
| $\min\sum_{i=1}^k{c(S_i)x_i}$ |
| -------- |
| s.t. $$\sum_{i:e\in S_i}x_i\geq 1, e\in U\\x_i\geq 0,i=1,2,...,k$$|
### Method 1 - Rounding
+ Solve the LP to get OPT solution $x^*$
+ Include $S_j$ in the integer **solution $I$** if $x_j^*\geq\frac{1}{f}$
#### Theorem: The solution $I$ is indeed a set cover
+ For every $e$, by $\sum_{i|e\in S_i}x_i\geq1$ and $|\{i|e\in S_i\}|\leq f$
+ There exists $S_i$ for every $e$ such that $e\in S_i$ and $x_i\geq\frac{1}{f}$
+ So $S_i$ is in solution $I$
#### Theorem(*Hochbaum’ 82*): Rounding is an $f$-approximate algorithm
+ $\sum_{j\in I}c(S_j)\leq\sum_{j\in I}c(S_j)x_j^*f\leq\sum_jc(S_j)x_j^*f=f\sum_jc(S_j)x_j^*\leq f\times OPT$
### Method 2 - Primal-Dual
| Primal | Dual |
| ---------------------------- | --------------------------------------- |
|$\min\sum_{i=1}^k{c(S_i)x_i}$ | $\max\sum_{e\in U}y_e$ |
|s.t. $$\sum_{i:e\in S_i}x_i\geq 1,e\in U\\x_i\geq 0,i=1,2,...,k$$ |s.t. $$\sum_{e:e\in S_j}y_e\leq c_j, j=1,2,...,k\\y_e\geq 0, e\in U$$|
+ min-LP is a **covering** and the max-LP is a **packing**
+ Recall: optimal solutions to linear LPs satisfy complementary slackness conditions
+ Iterative method
+ Begin with initial solutions to primal and dual
+ Iteratively start satisfying complementary slackness conditions
+ Modify the primal integrally
+ Once all conditions are met, the solutions must be optimal
+ What if optimal solution is not integral?
+ Relax the conditions
+ Complementary Slackness
+ $x_i=0$ or $\sum_{e\in S_i}y_e=c(S_i)$
+ So $S_i$ is included in $I\Rightarrow x_i=1\Rightarrow\sum_{e\in S_i}y_e=c(S_i)$
#### Basic Algorithm
:::info
+ $y\leftarrow 0$
+ $I\leftarrow\emptyset$
+ **while** there exists $e\notin \cup_{S_j\in I}S_j$ **do**
+ Increase the dual variable $y_e$ until $\sum_{e\in S_k}y_e=c(S_k)$ for some $k$
+ $I\leftarrow I\cup\{S_k\}$
+ return $I$
:::

+ At the beginning the primal conditions are not satisfied, the dual conditions are all satisfied
+ We want some dual conditions to be tight, then the corresponding set is put into $I$
#### Analysis
+ Finally the primal and dual relaxed conditions are all satisfied
+ The final $I$ we get are all tight: $\sum_{e\in S_k}y_e=c(S_k)$ for all $S_k\in I$
+ Because after include $S_k$ in $I$, the $y_e$ in it will not change any more
+ The algorithm is an $f$-approximation algorithm
+ $\sum_{S_j\in I}c(S_j)=\sum_{S_j\in I}\sum_{e\in S_j}y_e=\sum_{e\in U}y_e|\{S_j\in I|e\in S_j\}|\leq f\sum_{e\in U}y_e\leq f\times OPT$
#### LP Relaxation
+ For a minimization LP problem

+ So $\sum_{e\in U}y_e\leq OPT$
+ In this example, we didn’t get a better approximation than simply rounding the LP
+ But this algorithm is fully combinatorial, and easy to analyze complexity
+ No need to use really solve the LP
## Network Design
### Problem Definition
+ Input
+ Undirected graph $G=(V,E)$
+ Cost function: $c:E\to\Bbb R^+$
+ Connectivity requirements
+ Output
+ Minimum cost subgraph satisfying connectivity requirements
+ Example
+ MST: minimum cost subgraph connecting all vertices
### Steiner Tree
+ Input
+ Set of terminals $T=\{t_1,..., t_k\}\subseteq V$

+ Output
+ Minimum cost subgraph connecting $T$
+ **NP-complete**

#### Algorithm
+ Define complete graph $H$ on $T$
+ $c_G(t_i,t_j)$ is the shortest path in $G$ from $t_i$ to $t_j$
+ Find a MST $F$ in $H$
+ Replace each edge in $F$ by the corresponding path in $G$ (deleting cycle)
#### Analysis
+ $F^*$ is the optimal Steiner tree in $G$
+ Duplicate each edge in $F^*$: **all degrees are even**
+ Open up $F^*$ into a **Euler** cycle $Z$
+ $L(Z)=2L(F^*)$, where $L(S)$ is the total length of all edges in $S$

+ $Z$ can be transformed into an **Euler tour** in $H$
+ For any $t,t'\in T$, $c_Z(t,t')\geq c_H(t,t')$
+ The Euler tour in $H$ is itself a Steiner tree in $H$
+ So $F$ we get is smaller than $Z$

+ So the approximation ratio is $2$
### Constrained Steiner Forest Problem
+ Input
+ Undirected graph $G=(V,E)$
+ Cost function: $c:E\to\Bbb R^+$
+ Disjoint sets $T_1,...,T_k\subseteq V$
+ Output
+ Minimum cost subgraph $H$ such that each set $T_i$ belongs to the same connected component of $H$
#### Integer Programming Formulation
+ Requirement function $f:S\to\{0,1\}$, $\forall S\subseteq V$
+ $f(S)=1$ if there is a set $T_i$ split between $S$ and $V\setminus S$
+ **the forest $H$ must cross the cut $S$**
+ $f(S)=0$ otherwise
+ **no requirement, but maybe still a cut**
+ Note: $f(S)=f(V\setminus S)$, $f$ is a function on cuts

| $\min\sum_{e\in E}c_ex_e$ |
|---|
|s.t. $$\sum_{e\in\delta(S)}x_e\geq f(S), S\subseteq V$$ $$x_e\in\{0,1\}, e\in E$$|
+ $\delta(S)$ is the edges that cross the cut $S$

#### Linear Programming Relaxation and Dual Problem
| $\min\sum_{e\in E}c_ex_e$ | $\max\sum_{S\subseteq V}f(S)y_S$ |
|---|---|
|s.t. $$\sum_{e\in\delta(S)}x_e\geq f(S), S\subseteq V\\x_e\geq 0, e\in E$$|s.t. $$\sum_{S:e\in\delta(S)}y_s\leq c_e, e\in E\\y_S\geq 0, S\subseteq V$$|
+ Note the we may have exponential number of $y$-values, but primal-dual algorithm still works since only a polynomial number of them are non-zero
#### Primal-Dual Algorithm
+ Primal complementary slackness
+ For all $e\in E$, $x_e=0$ or $\sum_{S|e\in\delta(S)}y_S=c_e$
+ So $x_e=1\Rightarrow\sum_{S|e\in\delta(S)}y_S=c_e$, that is, **every picked edge must be tight**
+ **No strict dual conditions (only an approximate algorithm)**
+ **Because most $y_S$ are $0$**
##### Definition
+ Picked edge: those $e$ such that $x_e=1$
+ A set $S$ is **active** during the algorithm iff
+ It is a **connected component** in the current picked forest
+ $f(S)=1$ (to be expanded)
+ So we raise the $y_S$ for active set $S$ until new tight edges emerge
+ Tight edge $e$: $\sum_{S|e\in\delta(S)}y_S=c_e$

##### Algorithm
:::info
+ $F\leftarrow\emptyset,y_S\leftarrow 0$ for all $S\subseteq V$
+ **while** there is an active set **do**
+ Simultaneously raise $y_S$ for all active sets $S$, until some $e$ goes tight
+ $F\leftarrow F\cup\{e\}$
+ Prune **redundant** edges in $F$
:::

+ Example:
+ We want to connect $\{s,t\}$ and $\{u,v\}$





+ Finally

:::info
Prune **redundant** edges in $F$
+ For all edges $e$ in $F$ **in the reverse order they added to $F$** do
+ If **$F\setminus\{e\}$ is a feasible solution** then
+ Remove $e$ from $F$
:::
##### Analysis
+ No cut $S$ with $f(S)=0$ is raised during the algorithm, so $f(S)=0\Rightarrow y_S=0$
+ So $\sum_{S\subseteq V}y_S=\sum_{S\subseteq V}f(S)y_S$
+ We want to show $\sum_{e\in F}c_e\leq 2\sum_{S\subseteq V}y_S$, **so it’s a $2$-approximation** (WHY?)
+ Since it’s a relaxation, so the solution of the original LP $\geq$ the solution of the relaxed LP, so if $\sum_{e\in F}c_e$ is a $2$-approx. for this LP, it a $2$-approx. of the original IP
+ Let $deg_F(S)$ be the number of edges of $F$ crossing the cut $S$, then $\sum_{e\in F}c_e=\sum_{S\subseteq V}deg_F(S)y_S$
+ Since $\sum_{e\in F}c_e=\sum_{e\in F}(\sum_{S:e\in\delta(S)}y_S)=\sum_{S\subseteq V}(\sum_{e\in\delta(S)\cap F}y_S)$

+ When we raise all active sets by $\Delta$ ($\sum_{e\in F}c_e\leq 2\sum_{S\subseteq V}y_S$)
+ LHS increases by $2\Delta\times$(#active sets)
+ RHS increases by $\Delta\times(\sum_{S\ active}deg_F(S))$
+ Need to show $\sum_{S\ active}deg_F(S)/$(#active sets)$\leq 2$
+ Key observation: current inactive sets cannot be leaves! Otherwise the edge in $F$ connecting to it will be deleted in the pruning step
+ So average degree of active sets is at most $2$

#### Examples of Requirement Functions
+ Steiner tree: for any cut $S$ separating terminals, $f(S)=1$
+ Shortest path between $s$ and $t$: for any cut $S$ separating $s$ and $t$, $f(S)=1$
+ Minimum weight perfect matching: $f(S)=1$ iff $|S|$ is odd (At least one vertex is matched outside $S$)
## Feedback Vertex Set
### Problem Definition
+ Input: A directed graph $G=(V,E)$ and nonnegative weights $w_i>0$ for $i\in V$
+ Problem: Find a set $S\subseteq V$ of **minimum weight** such that $G[V\setminus S]$ is a forest

### LP Formulation
+ We simply write the constraints based on all cycles $C\in\mathcal C$ have a node in the feedback set
|$\min \sum_{i\in V}w_ix_i$|
|---|
|s.t. $$\sum_{i\in C}x_i\geq 1, \forall C\in\mathcal C\\x_i\in\{0,1\},\forall i\in V$$|
+ LP relaxation and dual problem
|Primal|Dual|
|---|---|
|$\min \sum_{i\in V}w_ix_i$|$\max\sum_{C\in\mathcal C}y_C$
|s.t. $$\sum_{i\in C}x_i\geq 1, \forall C\in\mathcal C\\x_i\geq 0,\forall i\in V$$|s.t. $$\sum_{C\in\mathcal C:i\in C}y_C<w_i,\forall i\in V\\y_C\geq 0,\forall C\in\mathcal C$$|
#### Algorithm
:::info
+ $y\leftarrow 0$
+ $S\leftarrow\emptyset$
+ **while** there exists a cycle $C$ in $G$ **do**
+ Increase $y_C$ until there is some $\ell\in V$ such that $\sum_{C'\in\mathcal C:i\in C}y_C'=w_\ell$
+ $S\leftarrow S\cup \{\ell\}$
+ Remove $\ell$ from $G$
+ Repeatedly remove vertices of degree one from $G$
+ return $S$
:::
#### Analysis
+ The final **vertices** in $S$ we get are all tight $\sum_{C:i\in C}y_C=w_i,\forall i\in S$
+ Because after include $i$ in $S$, the $y_C$ for cycles containing it will not change any more
+ Approximation ratio
+ Weight: $\sum_{i\in S}w_i=\sum_{i\in S}\sum_{C:i\in C}y_C=\sum_Cy_C|S\cap C|$
+ **But still no bound on $|S\cap C|$ (for those $y_C>0$)**
### Improvement
+ Observation: For any path $P$ of vertices of degree two in graph $G$, our algorithm will choose at most one vertex from $P$
+ Because they must be in one cycle
+ When we choose one vertex in $P$, others will be deleted one by one because they have degree one

#### Theorem: In any graph $G$ that has no vertices of degree one, there is a cycle with at most $2\lfloor\log_2 n\rfloor$ vertices of degree three or more, and it can be found in linear time
+ Run a BFS in $G$, the worst-case is

#### Algorithm
:::info
+ $y\leftarrow 0$
+ $S\leftarrow\emptyset$
+ Repeatedly remove vertices of degree one from $G$
+ **while** there exists a cycle $C$ in $G$ **do**
+ Find cycle $C$ with at most $2\lfloor\log_2 n\rfloor$ vertices of degree three or more
+ Increase $y_C$ until there is some $\ell\in V$ such that $\sum_{C'\in\mathcal C:i\in C}y_C'=w_\ell$
+ $S\leftarrow S\cup \{\ell\}$
+ Remove $\ell$ from $G$
+ Repeatedly remove vertices of degree one from $G$
+ return $S$
:::
#### Analysis
+ Approximation ratio
+ Weight: $\sum_{i\in S}w_i=\sum_{i\in S}\sum_{C:i\in C}y_C=\sum_Cy_C|S\cap C|$
+ Bound on $|S\cap C|\leq 4\lfloor\log_2 n\rfloor$
+ When we increase the $y_C$ for some cycle $C$, since there are at most $2\lfloor\log_2 n\rfloor$ vertices in it of degree $\geq 3$
+ There are at most $2\lfloor\log_2 n\rfloor$ subpaths of degree-$2$ vertices joining them, we only need to include one per subpath in $S$
+ In total: $4\lfloor\log_2 n\rfloor$