# CS170 MT2 Notes
1. MT1 Notes can be found [here](https://hackmd.io/@matthewtang/B1js2XPrS)
2. MT2 Notes can be found [here](https://hackmd.io/@matthewtang/B1NCuHWdB)
3. Final Notes can be found [here](https://hackmd.io/@matthewtang/r1PACERcH)
# Lecture 13: 9/27/19
## Horn SAT
* SAT formula = satisfiability formula
* A formula made of clauses, all of which must be satisfied
* A Horn SAT consists of implications and negative clauses
* Implications
* An AND of positive literals implies a single positive literal
* $x \land y \implies z$
* Negative clauses
* OR of negative literals
* $\overline{u} \lor \overline{v}$
* Greedy solution: Only set a variable to true if an implication forces you to
* Deal with negative clauses by ignoring them until the end: once all implications are satisfied, check that each negative clause still holds (see the sketch below)
* General SAT is NP-hard (no known polynomial-time algorithm)
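A minimal sketch of this greedy algorithm, assuming implications come as `(body, head)` pairs (with `body` a set of variables) and negative clauses as lists of variables:
```python=
def horn_sat(implications, negative_clauses):
    # implications: list of (body, head); body is a set of variables,
    # head a single variable. negative_clauses: list of variable lists.
    true_vars = set()
    changed = True
    while changed:  # keep firing implications whose bodies are all true
        changed = False
        for body, head in implications:
            if body <= true_vars and head not in true_vars:
                true_vars.add(head)
                changed = True
    # a negative clause is satisfied iff some variable in it stayed false
    if all(any(v not in true_vars for v in clause)
           for clause in negative_clauses):
        return true_vars  # the set of variables assigned true
    return None  # unsatisfiable
```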
## Set Cover
* Input: $B = \{1, 2, \ldots n\}$, $S = S_1, S_2, \ldots S_m \subseteq B$
* Goal: Find the fewest sets whose union is $B$
* Greedy Algorithm (attempt)
* Choose the set $S_i$ that contains the most uncovered elements
* Remove the elements of $S_i$ from all other sets
* Repeat
* Number of iterations = number of sets chosen
* Property: Suppose the best solution uses $k$ sets
* Then some set covers at least a $\frac{1}{k}$ fraction of the remaining elements (the $k$ optimal sets together cover everything that remains)
* In iteration $t$, $n_{t+1} \leq n_t (1 - \frac{1}{k})$
* Induction: $n_t \leq n_0 (1- \frac{1}{k})^t $
* Stop when $n_t < 1$, $t = k \ln n$
* $(1-x) \leq e^{-x}$
* Greedy may use $k \ln n$ sets, worse than the assumed optimal $k$
* Greedy is not optimal, but solving Set Cover exactly is NP-hard; a $\ln n$-approximation is what we settle for (see the sketch below)
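A sketch of the greedy heuristic, assuming elements are hashable and the subsets are given as Python sets:
```python=
def greedy_set_cover(B, sets):
    uncovered = set(B)
    chosen = []
    while uncovered:
        # pick the set covering the most still-uncovered elements
        best = max(sets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            return None  # the given sets cannot cover B
        chosen.append(best)
        uncovered -= best
    return chosen
```
If the optimum uses $k$ sets, this returns at most $k \ln n$ of them, matching the analysis above.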
## Disjoint Set Path Compression
* During a find, point every node on the find path directly at the root
* A single find is still $O(\log n)$ in the worst case
* But future finds get faster (see the sketch below)
* $m$ finds takes $O(\log^* n)$ time on average (amortized)
* $\log^* \triangleq$ number of times you can take the log before reaching 1
* How high is the stack of 2s
* At most $O(n/2^r)$ internal nodes with rank $r$
* n per group (geometric sum), $\log^*$ groups. Total: $n \log^* n$
* Each path is $O(\log^* n)$, m finds is $O(m\log^* n)$
* Total cost: $O((m+n) \log^* n)$
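A standard union-by-rank implementation with path compression (a sketch; the recursive `find` is fine for moderate $n$):
```python=
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n              # rank only grows at the roots

    def find(self, x):
        if self.parent[x] != x:
            # path compression: point x directly at the root
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:    # union by rank
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
```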
# Lecture 14: 10/2/19
* Dynamic programming and linear programming are more broadly applicable than divide and conquer
## Longest Path in a DAG
* Input: an unweighted DAG
* Goal: Find length of longest path
* Assume $1, 2, \ldots n$ is a topological sort
* Define subproblems:
* Let $L[i]$ be length of longest path ending at vertex i
* One subproblem per vertex, $n$ total
* Longest path in DAG = $\max \{ L[1], L[2], \ldots L[n]\}$
* Write a recurrence relation for subproblems
* Ex: $L[7] = \max$ over in-neighbors $j \rightarrow 7$ of $L[j] + 1$ (the $+1$ counts the edge)
* Compute $L[i]$ in topological/linearized order using the recurrence relation (see the sketch below)
* Algorithm
* Initialize $L[i] = 0, \forall i = 1, 2, \ldots n$
* for $i = 1, 2 \ldots n$
* $L[i] = \max\limits_{j \rightarrow i} \{L[j] + 1\}$
* Return $\max\{L[1], L[2], \ldots L[n]\}$
* Runtime: $O(V+E)$
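A sketch of the algorithm above, under the notes' assumption that the vertices (0-indexed here) are already numbered in topological order:
```python=
def longest_path_dag(n, edges):
    in_nbrs = [[] for _ in range(n)]
    for u, v in edges:               # edge u -> v, topologically ordered
        in_nbrs[v].append(u)
    L = [0] * n                      # L[i] = longest path ending at i
    for i in range(n):               # topological order
        for j in in_nbrs[i]:
            L[i] = max(L[i], L[j] + 1)
    return max(L)
```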
## Longest Increasing Subsequence (LIS)
* Input: array $A$ of $n$ integers
* Goal: Find LIS
* Idea: Create a DAG where each number is a vertex
* Add an edge from $i \rightarrow j$ if $A[i] < A[j], i < j$
* Longest path in this DAG = LIS
* Runtime: $O(n^2)$ (see the sketch below)
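The same idea without materializing the DAG, as a sketch:
```python=
def lis_length(A):
    n = len(A)
    L = [1] * n                      # L[i] = length of LIS ending at A[i]
    for i in range(n):
        for j in range(i):
            if A[j] < A[i]:          # implicit edge j -> i
                L[i] = max(L[i], L[j] + 1)
    return max(L, default=0)
```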
# Lecture 15: 10/4/19
## Knapsack Problem
* Input: Set of $n$ items with weights $w_i$ and values $v_i$, and a capacity $W$
* Goal: Find a subset of items that maximizes total value with total weight $\leq W$
* If repetition is allowed, same as "longest path in a DAG"
* Attempt 1: If item $A$ is in the knapsack, recurse to find the optimal knapsack of capacity $W-w_A$
* But this doesn't prevent repetitions
* Attempt 2: $K[w, F]$ = optimal knapsack with total weight $\leq w$ where F is forbidden elements.
* Too many subproblems b/c too many sets of forbidden elements
## Knapsack Solution
* $K[w,i] = $ optimal knapsack with total weight $\leq w$ that uses elements from $\{1, \ldots i\}$
* $K[w,i] = \max(v_i + K[w-w_i, i-1], K[w, i-1])$, where the first option is allowed only if $w_i \leq w$
* Max of knapsack that contains i and does not contain i
* $K[w,n]$ is answer
* Have a 2 dimensional DAG of subproblems
* Calculate subproblems in a double for loop
### Pseudocode
```python=
def knapsack(wt, v, W):
    n = len(wt) - 1                  # wt[1..n], v[1..n]; index 0 unused
    # K[w][i] = best value with capacity w using items 1..i;
    # row w = 0 and column i = 0 are the all-zero base cases
    K = [[0] * (n + 1) for _ in range(W + 1)]
    # Note: you could switch the order of the loops
    for i in range(1, n + 1):
        for w in range(1, W + 1):
            K[w][i] = K[w][i - 1]            # knapsack that skips item i
            if wt[i] <= w:                   # item i fits: try including it
                K[w][i] = max(K[w][i], v[i] + K[w - wt[i]][i - 1])
    return K[W][n]
```
# Lecture 16: 10/7/19
## Min Edit Distance
* Input: 2 strings $x[1 \ldots n]$, $y[1 \ldots m]$
* Goal: Min num of operations/keystrokes to edit x into y
* Operations: Delete, insert, substitute
* Subproblem: $E[i,j] =$ edit distance between $x[1 \ldots i]$ and $y[1 \ldots j]$
* Recurrence relation: $E[i,j] = \min
\begin{cases}
1 + E[i, j-1] & & \text{Insertion}\\
1 + E[i-1, j] & & \text{Deletion}\\
1 + E[i-1, j-1] & x[i] \neq y[j] & \text{Substitution}\\
E[i-1, j-1] & x[i] = y[j] & \text{Carry over}
\end{cases}$
* Answer: $E[n,m]$
* Runtime: $O(nm)$
### Pseudocode
```python=
def edit_distance(x, y):
    n, m = len(x), len(y)                    # x[1..n], y[1..m]
    E = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        E[i][0] = i                          # delete all of x[1..i]
    for j in range(m + 1):
        E[0][j] = j                          # insert all of y[1..j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if x[i - 1] == y[j - 1] else 1
            E[i][j] = min(1 + E[i][j - 1],            # insertion
                          1 + E[i - 1][j],            # deletion
                          diff + E[i - 1][j - 1])     # substitution / carry over
    return E[n][m]
```
## Gambling
* Play $n=500$ games in a casino
* Game A: $\begin{cases}
1/2 & \text{Earn 2}\\
1/2 & \text{Lose 2}
\end{cases}$
* Game B: $\begin{cases}
2/3 & \text{Earn }5\\
1/3 & \text{Lose }5
\end{cases}$
* Goal: Succeed if you win exactly $170 after n games
* Given $m$ money and $l$ games left, compute optimal strategy
* $P[m, l] = $ Probability of success for optimal strategy starting with m money and l games left
* Base case: $P[m,0] = \begin{cases}
1 & m = 170\\
0 & \text{otherwise}
\end{cases}$
* $P[m,l] = \max \begin{cases}
\frac{1}{2}P[m+2, l-1] + \frac{1}{2}P[m-2, l-1] & \text{Game A}\\
\frac{2}{3}P[m+5, l-1] + \frac{1}{3}P[m-5, l-1] & \text{Game B}
\end{cases}$
* Answer: Return $P[0, n]$
* Solve subproblems in increasing order of $l$ (see the sketch below)
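A memoized sketch of this DP; like the recurrence, it lets money go negative (no bankruptcy constraint is assumed):
```python=
from functools import lru_cache

def success_probability(n=500, target=170):
    @lru_cache(maxsize=None)
    def P(m, l):
        if l == 0:
            return 1.0 if m == target else 0.0
        game_a = 0.5 * P(m + 2, l - 1) + 0.5 * P(m - 2, l - 1)
        game_b = (2 / 3) * P(m + 5, l - 1) + (1 / 3) * P(m - 5, l - 1)
        return max(game_a, game_b)   # play whichever game is better
    return P(0, n)
```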
# Lecture 17: 10/14/19
## Min Triangulation
* Input: $n$ points that form a convex polygon
* Goal: Minimize cost triangulation (total perimeter of all triangles)
* Subproblems:
* Naive: one subproblem for each subset of points gives exponentially many subproblems, which we want to avoid
* If we look at an edge $(v_i, v_{i+1})$, it must be in some triangle
* Ex. $(1,2)$ can be a part of triangle $(1,2,n)$
* Creates 2 other polygons
* There are $O(n^2)$ subproblems
* $T[i,j] =$ min cost of triangulating the convex polygon formed by $(v_i, v_{i+1}, \ldots v_j)$
* Answer: $T[1, n]$
* Recurrence relation: $T[i, j] = \min\limits_{i < k < j} (T[i, k] + T[k, j] + \text{perimeter}(i, k, j))$, with base case $T[i, i+1] = 0$ (see the sketch below)
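A sketch of this interval DP (0-indexed vertices; `dist` is Euclidean distance):
```python=
from math import dist    # Euclidean distance, Python 3.8+

def min_triangulation(pts):
    # pts: vertices of a convex polygon, in order around the boundary
    n = len(pts)
    def perim(i, k, j):
        return dist(pts[i], pts[k]) + dist(pts[k], pts[j]) + dist(pts[i], pts[j])
    T = [[0.0] * n for _ in range(n)]    # T[i][j]: min cost for pts[i..j]
    for length in range(2, n):           # short intervals first
        for i in range(n - length):
            j = i + length
            T[i][j] = min(T[i][k] + T[k][j] + perim(i, k, j)
                          for k in range(i + 1, j))
    return T[0][n - 1]
```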
## Formulating subproblems
* Strings:
* Prefix, suffix
* 2 strings:
* (Prefix of x, Prefix of y)
* (Prefix of x, Suffix of y)
* Triangulation
* Intervals: $O(n^2)$ intervals $i \rightarrow j$
* Trees
* Rooted subtrees
* TSP
* All possible subsets
# Lecture 18: 10/16/19
## All-Pair Shortest Paths (DP)
* Input: Graph G with edge weights
* Goal: Compute shortest paths between all pairs of vertices
* Subproblem: $D[i,j,k]=$ length of shortest path from $i$ to $j$ that uses only vertices in $\{1, \ldots, k\}$ as intermediate vertices
* Shortest path from i to j: $D[i,j,n]$
* Recurrence relation: either vertex $k$ is on the shortest path or not (if not, every intermediate vertex comes from $\{1, \ldots, k-1\}$); see the sketch below
* $D[i,j,k] = \min \begin{cases}
D[i,j, k-1] & \text{Does not use vertex k} \\
D[i, k, k-1] + D[k, j, k-1] & \text{Uses vertex k}
\end{cases}$
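Computing the recurrence bottom-up over increasing $k$ gives Floyd-Warshall; a sketch, with edges assumed as a dict from vertex pairs to weights:
```python=
def floyd_warshall(n, w):
    # w: dict mapping (i, j) to edge weight, vertices numbered 1..n
    INF = float('inf')
    D = [[INF] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][i] = 0
    for (i, j), wt in w.items():
        D[i][j] = min(D[i][j], wt)
    for k in range(1, n + 1):            # now allow k as an intermediate
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D                             # D[i][j] = shortest i-to-j distance
```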
## Linear Programming
* Family of optimization problems
* Can be solved efficiently
* Variables: $x_1 \ldots x_n \in \mathbb{R}$
* Max/min linear function subject to linear constraints
* Feasible region: range of variables that satisfy constraints
* Fact 1: this is a convex polygon/polyhedron
* Fact 2: An optimum of a linear program (if it is feasible and bounded) occurs at a vertex of the polygon/polyhedron
* Fact 3: Corner/vertex of a polygon is given by the intersection of constraints
* Convex: A subset of points $S \subseteq \mathbb{R}^d$ is convex if $\forall a,b \in S$ and $\lambda \in [0,1]$, $\lambda a + (1-\lambda)b \in S$ (the segment between $a$ and $b$ stays in $S$)
## Coffee Shop
* 1 cappuccino = 3 coffee, 1 milk
* 1 latte = 2 coffee, 4 milk
* Available: 6 coffee, 8 milk
* Goal: Maximize the number of drinks (nonnegative real numbers) $= x + y$
* $x$ = number of cappuccinos
* $y$ = number of lattes
* Constraints: $3x+2y \leq 6$, $1x + 4y \leq 8$, $x, y \geq 0$ (solved below)
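A quick check with `scipy.optimize.linprog`, which minimizes, so the objective is negated:
```python=
from scipy.optimize import linprog

# maximize x + y  <=>  minimize -(x + y)
c = [-1, -1]
A_ub = [[3, 2],    # coffee: 3x + 2y <= 6
        [1, 4]]    # milk:   1x + 4y <= 8
b_ub = [6, 8]
res = linprog(c, A_ub=A_ub, b_ub=b_ub)   # x, y >= 0 by default
print(res.x, -res.fun)   # optimum: x = 0.8, y = 1.8 -> 2.6 drinks
```
The optimum sits at the vertex where both constraints are tight, matching Facts 2 and 3 above.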
# Lecture 19: 10/18/19
## Convex Optimization
* Draw the convex body of the feasible region (the points satisfying all the inequalities)
* **Isocline**: hyperplane of points that all have the same objective value
* Optimal at an intersection of boundaries
* If there are $n$ variables and $m$ constraints, there are at most $\binom{m}{n}$ possible vertices ($n$ tight constraints define a point)
* The number of vertices is finite, but can be exponential
## Simplex
* Start at vertex, move to better neighboring vertex
* To certify optimality, combine some constraints (with nonnegative multipliers) into an upper bound on the objective function
* Any feasible point whose objective value meets such an upper bound is optimal
* There is always a combination that proves optimality
* Figuring out which constraints to combine, and with what multipliers, is duality
* Simplex runtime is not polynomial but fast in practice
## Carpet Production
* Demands: $d_1, d_2, \ldots d_{12}$, range: $440-920$
* 30 employees to start; each makes 20 carpets/month at a salary of 2000/month
* Overtime: pay is 80% extra (2000/20 = 100 per carpet, so 180 per overtime carpet); each employee can do at most 30% overtime
* Hiring/firing cost: 320/400 per worker
* Storage: 8/carpet per month, and no carpets may be stored at the end of the year
* Variables:
* $w_i$ - workers in month $i$ ($w_0$ = 30)
* $x_i$ - carpets made in month $i$
* $o_i$ - overtime carpets in month $i$
* $h_i, f_i$ - hired/fired in month $i$
* $s_i$ - stored at end of month $i$ ($s_{12}=0$)
* Constraints:
* Nonnegative: $w_i, x_i, o_i, h_i, f_i, s_i \geq 0$
* Production: $x_i = 20w_i + o_i$
* Employment: $w_i = w_{i-1} + h_i - f_i$
* Inventory: $s_i = s_{i-1} + x_i - d_i$
* Regulations: $o_i \leq 6w_i$
* Objective: $\min 2000\sum_i w_i + 320 \sum_i h_i + 400 \sum_i f_i + 8 \sum_i s_i + 180 \sum_i o_i$
## Bandwidth
* A network of users A,B,C
* Variables:
* $X_{AB}$ - flow along path AabB
## Reduction to Standard Form
* Maximization to minimization?
* Multiply objective function by $-1$
* Less than into greater than?
* Multiply by $-1$
* Inequalities and equalities
* $\sum_i a_i x_i \leq b$ into equality
* $\sum_i a_i x_i + s = b$, $s \geq 0$ ($s$ is a slack variable)
* $\sum_i a_i x_i = b$ into inequalities
* $\sum_i a_i x_i \leq b$ and $\sum_i a_i x_i \geq b$
* Simulate an unrestricted variable $x$ with two nonnegative variables: $x = x^+ - x^-$ where $x^+, x^- \geq 0$
* Can also write in matrix form
# Lecture 20: 10/21/19
## Duality
* Primal LP:
* $\max c^T x$
* $Ax \leq b, x \geq 0$
* Dual LP:
* $\min y^T b$
* $y^T A \geq c$
* $y \geq 0$
* Strong duality: if a feasible linear program has a bounded optimal value, its dual is feasible with the same optimal value
* Given $A,b,c$ and feasible solutions x,y
* Solutions $x$ and $y$ are both optimal iff $x_i \left( c_i - (y^TA)_i \right) = 0$ for all $i$ and $y_j \left( b_j - (Ax)_j \right) = 0$ for all $j$ (complementary slackness)
## Simplex Algorithm
* Start at origin
* If there is a positive $c_i$, increase $x_i$ until you hit a new constraint
* Try all constraints, see which stops you first
* If all $c_i \leq 0$, increasing any $x_i$ can only decrease the objective, so the current vertex is optimal
* Once you leave the origin, the new coordinates are the slacks (distances) to the newly tight constraints
* For constraint $i$: $y_i = b_i - a_i x$; the current vertex becomes the origin of the new coordinate system
* Rewrite the LP in the coordinate system of the tight constraints
* Move along an improving direction until a new constraint becomes tight
* If all objective coefficients are negative, the vertex is optimal
## Origin not feasible
* Introduce nonnegative variables $z_i$, one per inequality $i$
* Constraints: $a_ix - z_i \leq b_i$, $z_i \geq 0$
* Solve $\max \sum_i -z_i$; if the optimum is $0$, all the $z_i$ drop out and we have a feasible vertex to start from
* Degenerate vertices: intersection of more constraints than necessary (more than 2 in 2D)
* Bland's rule avoids cycling
* Alternatively, just move the lines a little bit (perturb the constraints)
* Unboundedness: detected when an improving direction never hits a new constraint
* Runtime:
* Optimality - $O(n)$
* Find tight constraints: $O(m)$ total
* New coordinate systems (rewriting LP)
* $y_i = b_i - a_ix$; solve for $x$ in terms of $y$
* Only 1 new constraint to update
* $O(mn)$ time to update LP
* The number of simplex steps can be exponential, but simplex is fast in practice
### Dual Solution
* Read off as the negation of the coefficients of the final rewritten objective function
# Lecture 21: 10/23/19
## Max Flow
* Input: Flow network G, source s, sink $t \in V$, capacities $c_e > 0$
* Find flow $f_e$
* Capacity constraints: $0 \leq f_e \leq c_e$
* If $u \neq s, t \implies \sum_{(w,u)} f_{wu} =\sum_{(u,w)} f_{uw}$
* Maximize size$(f) = \sum_{(s,u)} f_{su} $
* S-T cut: Capacity is total capacity of edges from S to T
* Repeat: find an $s$-$t$ path with remaining capacity, add flow along the path, and update the remaining capacities
* Exists an integer solution
* Residual capacity
* Forward edge: $r_e = c_e - f_e$; backward edge: $r_{uv} = f_{vu}$
# Lecture 22: 10/25/19
## Optimality of Max Flow
* $S-T$ cut: splits S and T
* An integer optimal solution exists (when the capacities are integers)
* Find an $s$-$t$ path with residual capacity
* Add flow along the path: increase flow on forward arcs, cancel flow on backward arcs
* Update the residual capacities
* Optimality
* Capacity of any S-T cut is an upper bound on the flow
* Flow out of S is flow out of s
* Flow into T is flow into t
* Flow of S into T is sum of flows across edges of S to T
* At termination, no $s$-$t$ path has residual capacity
* The set $S$ of vertices that a DFS from $s$ reaches in the residual graph (which misses $t$) gives a min cut
* Max flow = capacity of the min $S$-$T$ cut (max-flow min-cut theorem)
## Edmonds-Karp
* Augment along shortest path
* BFS with $O(VE)$ augmentations
* Flow values do not matter in runtime
* Each augmentation saturates (removes) at least one edge: $O(VE)$ augmentations at $O(E)$ per BFS, $O(VE^2)$ total time (see the sketch below)
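A compact sketch, with the graph assumed as a dict of dicts of capacities:
```python=
from collections import deque

def edmonds_karp(cap, s, t):
    # residual capacities; make sure every arc has a reverse arc
    r = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u in list(cap):
        for v in cap[u]:
            r.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS: shortest augmenting path in the residual graph
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in r[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow              # no augmenting path left: maximum
        path, v = [], t              # recover the path s -> t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(r[u][v] for u, v in path)   # bottleneck capacity
        for u, v in path:
            r[u][v] -= aug           # push flow forward
            r[v][u] += aug           # ... and open the reverse arc
        flow += aug
```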
## Maximum Matching
* Input: Bipartite graph $G(L, R, E)$
* Goal: Find largest subset of 1-1 matches
* Reduces to Max-Flow
* Add a source on left (S), sink on right (T)
* Run Max-Flow where each flow capacity is 1
* Directly solve with augmenting alternating paths:
* Start at an unmatched node on the left
* Alternate: follow an unmatched edge to the right, then a matched edge back to the left
* Repeat until reaching an unmatched node on the right, then flip matched/unmatched along the path to grow the matching by one
# Lecture 23: 10/28/19
## Strategic Games
* N players, each player has a strategy set $\{S_1, \ldots S_N\}$
* Vector-valued payoff function: $u(s_1, \ldots s_n)$
* Nash Equilibrium: no player has an incentive to unilaterally change strategies
## Zero Sum Games
* $m \times n$ payoff matrix $A$, where the row player plays $i$ and the column player plays $j$
* Row mixed strategy: $x = (x_1, \ldots x_m)$
* Column mixed strategy: $y = (y_1, \ldots y_n)$
* Payoff: $p(x,y) = x^TAy = \sum_i \sum_j x_ia_{i,j}y_j$
* Row maximizes, column minimizes
* Equilibrium pair: $(x^*, y^*)$
* No better row strat against column player and vice versa
* $x^{*T}Ay^* = \max\limits_i A^{(i)} \cdot y^*$, where $A^{(i)}$ is row $i$: no pure row strategy does better against $y^*$
* $R = \min\limits_y \max\limits_x(x^TAy)$
* The row player responds to the column strategy, so the row player has the advantage
* $C = \max\limits_x \min\limits_y(x^TAy)$
* The column player responds to the row strategy, so the column player has the advantage
* Strong duality (minimax theorem): $R = C$, so it doesn't matter who plays first; this common value is the equilibrium point
* Mixed strategy: Each player plays over distribution of strategies
* Pure strategy: Each player plays a single strategy
* Both only play optimal strategies (like complementary slackness)
* Use a probability distribution to limit the opponent's payoff
## Zero Sum Games LP
* Row player (the maximizer) computes $C$
* $C = \max z$
* $\forall j$ (every column), $(x^TA)_j \geq z$
* $\sum\limits_i x_i = 1$
* $x_i \geq 0$
* Column player (the minimizer) computes $R$
* $R = \min z$
* $\forall i$ (every row), $(Ay)_i \leq z$
* $\sum\limits_i y_i = 1$
* $y_i \geq 0$
* The two LPs are duals of each other, so $R = C$ (see the sketch below)
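A sketch of the row player's LP with `scipy.optimize.linprog`, using variables $(x_1, \ldots, x_m, z)$ and minimizing $-z$:
```python=
import numpy as np
from scipy.optimize import linprog

def row_player_strategy(A):
    # A: m x n payoff matrix (row player maximizes); returns (x, C)
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1                                       # minimize -z
    A_ub = np.hstack([-A.T, np.ones((n, 1))])        # z - (x^T A)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.append(np.ones(m), 0).reshape(1, -1)   # sum_i x_i = 1
    bounds = [(0, None)] * m + [(None, None)]        # x >= 0, z free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1],
                  bounds=bounds)
    return res.x[:m], res.x[-1]
```
For rock-paper-scissors, `row_player_strategy([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])` returns the uniform mixed strategy with value $0$.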
## Circuit Evaluation
* Input: DAG of boolean gates: AND, OR, NOT
* $0 \leq x_g \leq 1$
* Create inequalities that force the gates to work correctly (see below)
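For instance, the standard gate constraints look like this (with input gates fixed to their 0/1 values):
* NOT gate $g$ with input $h$: $x_g = 1 - x_h$
* OR gate $g$ with inputs $h_1, h_2$: $x_g \geq x_{h_1}$, $x_g \geq x_{h_2}$, $x_g \leq x_{h_1} + x_{h_2}$
* AND gate $g$ with inputs $h_1, h_2$: $x_g \leq x_{h_1}$, $x_g \leq x_{h_2}$, $x_g \geq x_{h_1} + x_{h_2} - 1$
* Together with $0 \leq x_g \leq 1$, these pin every $x_g$ to the gate's actual output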
# Lecture 24: 10/30/19
## Expert Predictions Problem
* $n$ experts, each offers a prediction each day
* Alg 1: Follow one currently-perfect expert; when it errs, switch to another (assumes a perfect expert exists)
* Mistake bound: $n-1$
* Alg 2: Go with majority of previously correct experts
* We don't need to find the perfect expert
* Mistake bound: $\log_2 n$
* The number of still-perfect experts drops by at least a factor of 2 each time the algorithm makes a mistake
* Don't always have an infallible expert
## Multiplicative Weights Algorithm
* Penalize inaccurate experts
* 1) $w_i = 1$
* 2) Predict with weighted majority of experts
* 3) $w_i \rightarrow w_i/2$ if wrong
* Best expert makes $m$ mistakes
* Potential function: $W = \sum_i w_i$, initially $n$
* For best expert b, $w_b \geq \frac{1}{2^m}$
* When the algorithm errs, a weighted majority of the experts was wrong, and each of their weights is multiplied by $\frac{1}{2}$
* So at least half of the total weight is halved
* So the total weight decreases by $\geq \frac{1}{4}$
* Mistake $\rightarrow$ potential function decreases to $\leq \frac{3}{4}$ of its value
* $$ \frac{1}{2^m} \leq \sum_i w_i \leq n\left( \frac{3}{4}\right)^M$$
* $m$ - best expert mistakes, $M$ - algorithm mistakes
* Take log of both sides, solve for M
* $M \leq (m + \log n)/ \log(4/3) \leq 2.4(m + \log n)$
* Using $(1-\epsilon)$ penalty
* $M \leq 2(1 + \epsilon)m + \frac{2 \ln n}{\epsilon}$
* Approaches a factor of 2 of the best expert's performance as $m \rightarrow \infty$ (see the sketch below)
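A sketch of the halving algorithm on binary predictions (input format assumed):
```python=
def weighted_majority(expert_preds, outcomes):
    # expert_preds[t][i]: 0/1 prediction of expert i on day t
    # outcomes[t]: the true 0/1 answer on day t
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, truth in zip(expert_preds, outcomes):
        # predict with the weighted majority of the experts
        vote_for_1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        guess = 1 if vote_for_1 >= sum(w) / 2 else 0
        mistakes += (guess != truth)
        # halve the weight of every expert that was wrong
        w = [wi / 2 if p != truth else wi for wi, p in zip(w, preds)]
    return mistakes
```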
## Randomization
* Ex: expert $A$ is correct on odd days, expert $B$ on even days
* The deterministic algorithm can be made to err every day, so it never converges to the best expert's performance; randomize instead
* Choose expert $i$ with probability $\propto w_i$
* For $\epsilon \leq 1, x \in [0,1]$
* $(1-\epsilon)^x \leq (1-\epsilon x)$
* For $\epsilon \in [0, \frac{1}{2}]$
* Expert i loses $l_i^t \in [0,1]$ in round $t$
* Algorithm:
* 1) $w_i = 1$ for expert $i$
* 2) Choose expert $i$ with probability $\frac{w_i}{W}, W = \sum_i w_i$
* 3) $w_i \leftarrow w_i(1-\epsilon)^{l_i^t}$ (see the sketch below)
* $W(0) = n$
* Best expert b loses $L^*$ total. $W(t) \geq w_b \geq (1-\epsilon)^{L^*}$
* $L_t = \sum_i \frac{w_i l_i^t}{W}$ is expected loss of alg in time $t$
* For $\epsilon \leq \frac{1}{2}$, $W(t+1) \leq W(t)(1-\epsilon L_t)$
* Loss causes weight loss
* The bound no longer loses a factor of 2 in front of the best expert's loss
* Can use positive weights or scaled weights $>1$
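A sketch of the randomized version with losses in $[0,1]$:
```python=
import random

def mw_randomized(loss_rounds, eps=0.1):
    # loss_rounds[t][i]: loss in [0, 1] of expert i at round t
    n = len(loss_rounds[0])
    w = [1.0] * n
    total_loss = 0.0
    for losses in loss_rounds:
        # follow expert i with probability w_i / W
        i = random.choices(range(n), weights=w)[0]
        total_loss += losses[i]
        # multiplicative penalty: bigger loss, bigger penalty
        w = [wi * (1 - eps) ** l for wi, l in zip(w, losses)]
    return total_loss
```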
## Other Models
* What if you only observe the advice of the expert you pick?
* Bandit problem (exploration vs exploitation)
* Reinforcement Learning
## Optimization
* Minimize a convex function $f(x)$ over $x \in \mathbb{R}^n$
* $x^*$ denotes the minimizer of $f(x)$
* Gradient descent
# Lecture 25: 11/1/19
* Multiplicative weights finds an $\epsilon$-approximate equilibrium of a zero-sum game for any $\epsilon > 0$
* Complexity: $T = \frac{\ln n}{\epsilon^2}$ rounds $\rightarrow O(nm\frac{\log n}{\epsilon^2})$ total work
* Basically linear in the size of the game!
* Solving the LP exactly would cost much more, e.g. $O(n^3m)$
* Boosting: Use a weak learner to produce a strong learner using multiplicative weights
* Experts are the points, adversary is the weak learner