# Weighted min-cut lower bound

## APSP in BCC

We talked a bit about APSP in the broadcast congested clique (BCC); it seems quite easy to show that spanners give the best tradeoff we can hope for there: a $(2k-1)$-approximation in about $n^{1/k}$ time by [Baswana, Sen](https://onlinelibrary.wiley.com/doi/abs/10.1002/rsa.20130). It is quite easy to show a matching lower bound for weighted graphs. In the unweighted case there are slightly different tradeoffs.

* Lower bound construction: the idea is to take a matching of zero-weight edges between Alice and Bob, and on each side to have potential edges that correspond to a graph $H$ with a certain girth. The number of edges in this graph determines the lower bound, and gives some tradeoff between time and approximation. If all the edges of $H$ appear either on Alice's or on Bob's side, we have a path of weight 1 between the relevant pairs; otherwise, there is at least one pair which requires a long path.
* Results for weighted graphs: for example, we can get a 3-approximation in about $\sqrt n$ time, and getting a $(3-\epsilon)$-approximation requires linear time for weighted graphs. We can continue and show that to get a $(5-\epsilon)$-approximation you need $\sqrt n$ time for weighted graphs (showing that the 3-approximation is tight), but a 5-approximation can be obtained in $n^{1/3}$ time, and so on.
* Unweighted graphs: for unweighted graphs the same linear lower bound holds for a $(2-\epsilon)$-approximation, and indeed you can get a 2-approximation in $\sqrt n$ time for the unweighted case, with the following algorithm (which follows ideas from the algorithm in the congested clique): compute a hitting set $A$ for the high-degree vertices, i.e., such that each vertex with degree at least $\sqrt n$ has at least one of its neighbors in $A$. Then, compute distances from all vertices to the vertices in $A$ (I think that in the paper of Holzer and Pinsker they explain how to get $(1+\epsilon)$-approximations in BCC for this problem), and for each pair of vertices $u,v$, compute $\min_{w \in A} d(u,w)+d(w,v)$. This guarantees that we get a good approximation for all shortest paths that have at least one high-degree vertex. In the remaining graph the maximum degree is $\sqrt n$, which allows all vertices to learn the whole graph in $\sqrt n$ time and compute shortest paths there (see the sketch below). Actually it seems that this approach even gives something like a +2 approximation in $\sqrt n$ time, not just a multiplicative 2-approximation.
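A centralized Python sketch of this two-case algorithm (not the BCC protocol itself; the names and the greedy hitting-set construction are ours, and an actual BCC implementation would build $A$ by sampling):

```python
import math
from collections import deque

def bfs(adj, src, allowed=None):
    """Unweighted single-source distances, optionally restricted to `allowed`."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if (allowed is None or v in allowed) and v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def apsp_2_approx(adj):
    """adj: dict vertex -> list of neighbors. Returns distance estimates."""
    n = len(adj)
    thr = math.isqrt(n)
    # Hitting set A: every vertex of degree >= thr ends up with a neighbor in A.
    A = set()
    for u in adj:
        if len(adj[u]) >= thr and not any(v in A for v in adj[u]):
            A.add(adj[u][0])
    dist_A = {w: bfs(adj, w) for w in A}          # case 1: paths through A
    low = {u for u in adj if len(adj[u]) < thr}   # case 2: low-degree subgraph
    dist_low = {u: bfs(adj, u, allowed=low) for u in low}
    est = {}
    for u in adj:
        for v in adj:
            cands = [dist_A[w][u] + dist_A[w][v]
                     for w in A if u in dist_A[w] and v in dist_A[w]]
            if u in low and v in dist_low[u]:
                cands.append(dist_low[u][v])
            if u == v:
                cands.append(0)
            if cands:
                est[(u, v)] = min(cands)
    return est
```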
## Vertex connectivity

We tried to see whether the approaches used for vertex connectivity in the sequential setting can be relevant for distributed algorithms, since intuitively the algorithm is very local. However, currently it seems problematic, as this approach requires running the same algorithm from many vertices at the same time (in the worst case from all vertices), which incurs a lot of congestion, and we couldn't find a way to overcome this.

## Lower bounds for matching

It seems that the static cut approach is not useful, since there are algorithms with very small congestion, and it seems that the hardness comes from computing augmenting paths, which is more related to dilation. Also, there are connections between weighted and unweighted matchings using a scaling technique, so even for weighted graphs it's not clear how to show lower bounds.

## Lower bounds for weighted min cut

The weighted min-cut problem can be reduced to the following *interval problem* in communication complexity, and we are currently trying to figure out whether this problem is indeed hard. It is related to the following graph problem: we have a cycle of edges of weight $M$, where $M$ is very large, and additional potential edges of weight 1. We need to remove at least 2 heavy edges to get a cut, and to find the minimum cut we need to find the minimum cut that 2-respects the cycle.

* We were able to show that the problem is easy in some special cases: if the min cut has size $k$, we can get an $O(nk)$ algorithm (assuming there are no parallel edges, and all edges corresponding to intervals have the same weight); a similar thing holds if the min cut has size $n-k$.
* One direction we tried is to look at Karger's algorithm. There, to find the min cut we compute for each $v$ the value $(*)\ \min_w c(w)+g_w(v)$, where $w$ is a vertex above $v$ in the tree, $c(w)$ is the number of edges with exactly one endpoint in the subtree rooted at $w$, and $g_w(v)$ is the number of edges with one endpoint in the subtree rooted at $v$ and one endpoint in the subtree rooted at $w$. This allows us to find the minimum cut that contains the edge between $v$ and its parent as one of the heavy edges, where the other heavy edge in the cut is between $w$ and its parent, for the $w$ that minimizes $(*)$. If we fix $v$, then $g_w(v)$ is monotonically increasing, and for computing the minimum value of $(*)$ to be interesting we want $c(w)$ to be monotonically decreasing. If the edges of $g_w(v)$ are split between Alice and Bob, this is related to the following problem: one of Alice and Bob has a monotonically increasing sequence and one has a monotonically decreasing sequence, and they want to find the minimum value in the sum of the 2 sequences. If the sequences can be arbitrary monotonic sequences, it's possible to show a linear lower bound; the problem is that in our setting there are many dependencies (for example, $c(w)$ depends on other values), which makes it difficult to analyze.

Two other points about min cut:

* For unweighted graphs there are only $O(n)$ edges in approximate min cuts (Section 2.3 [here](https://arxiv.org/pdf/1711.03165.pdf)). Can it give any intuition for a construction for weighted graphs?
* In the interval problem: if we know the total cost of all the cuts that 2-respect the cycle, this allows us to learn all the edges in the graph, so there is a quadratic lower bound for this case. However, in our problem we are interested only in the minimum 2-respecting cuts.
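To fix notation, here is a minimal sketch of the cost function the interval problem asks about (the encoding is ours: each weight-1 edge is an interval $(a, b, w)$ of cycle positions). Removing the two heavy cycle edges at positions $i$ and $j$ cuts exactly the intervals containing one of $i, j$ but not the other:

```python
from itertools import combinations

def cost(intervals, i, j):
    """Total weight of intervals separating i and j, i.e. intervals (a, b, w)
    containing exactly one of the two cut positions."""
    return sum(w for (a, b, w) in intervals
               if (a <= i <= b) != (a <= j <= b))

def min_2_respecting_cut(intervals, n):
    """Brute force over all pairs of heavy cycle edges. The open question is
    whether Alice and Bob, each holding part of `intervals`, can do better
    than communicating enough to simulate this."""
    return min(cost(intervals, i, j) for i, j in combinations(range(n), 2))
```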
---

:point_down: :point_down: This part is wrong! Kept for archival purposes. :point_down: :point_down:

## Lower bound for interval problem

### Attempt 1

We first construct a set of intervals such that the following two properties hold:

* For any pair $(i, j)$ where $i, j$ are multiples of $k$, $cost(i,j) = t$.
* For any other pair $(i,j)$, $cost(i,j) > t$.

To this end, we consider intervals of size at most $k/2$.

* First, cover each $i$ which is a multiple of $k$ with $t/2$ many intervals of size $k/2 -1$. This makes sure that any interval which covers an $i$ cannot cover another $i'$ where both $i$ and $i'$ are multiples of $k$. At this point, we have satisfied the first condition.
* Now, consider two consecutive multiples of $k$: $\ell k$ and $(\ell +1)k$. ==Define block== We will put intervals between them such that the first condition is not violated, and we also satisfy the second condition. To this end, put all intervals of size at most $k/2$ which are completely contained between $\ell k +1$ and $(\ell +1)k -1$.

At this point, set $k = n^{1/3}$ and $t = n^{1/3}/100$. First, we bound the cost of $(i,j)$ where $i$ and $j$ are multiples of $k$.

**Claim.** For all $\ell < p$, $cost(\ell k, p k) = t$.

*Proof.* First, note that the cover set of $\ell k$ is disjoint from the cover set of $pk$. Hence $cost(\ell k, p k) = cov(\ell k) + cov(pk)$. By our construction, $cov(\ell k) = cov(pk) = t/2$. :black_small_square:

**Claim.** For any $i$ which is not a multiple of $k$, $cov(i) > k/2 -1$.

*Proof.* Consider $i = \ell k +1$. The number of intervals that cover it is at least $k/2 -1$, as at least one interval of every length covers it. :black_small_square:

**Corollary.** For any $(i,j)$ where $i$ and $j$ belong to different blocks, $cost(i,j) > t$. This follows from the fact that the cover sets of such $i$ and $j$ are disjoint.

Now, for an $(i,j)$ pair which is in the same block, we have the following claim:

**Claim.** For any $(i,j)$ pair inside a block, $cost(i,j) > t$.

*Proof.* Inside a block, all intervals of size at most $k/2 -1$ are present. The number of intervals which contain $i$ and do not contain $j$ is $j - (\ell k +1)$. Similarly, the number of intervals which contain $j$ and do not contain $i$ is $(\ell + 1)k -1 -i$. Hence $cost(i,j) = j - (\ell k +1) + (\ell + 1)k -1 -i \approx (\ell + 1)k - \ell k + (j - i) > k > t$. :black_small_square:

Now we argue that the total number of intervals considered is high enough.

**Claim.** The number of intervals thus considered is $O(n^{4/3})$.

*Proof.* Inside a block, the number of intervals considered is $\approx k^2$. So the total number of intervals among all blocks is $n/k \cdot k^2 = nk = n^{4/3}$. The number of intervals used for covering $\ell k$ for all $\ell$ is $n/k \cdot t/2 \approx O(n)$. Hence the total number of intervals is $O(n^{4/3})$.

Let's call this set of intervals $S$. Consider the fooling set $F$ where Alice gets half of $S$ and Bob gets the rest. The number of input pairs in $F$ is ${|S| \choose |S|/2} \approx 2^{O(n^{4/3})}$. Consider two such input pairs $(A,B)$ and $(A',B')$ such that $A \cup B = A' \cup B' = S$. Of course, the minimum cost in $A \cup B$ and in $A' \cup B'$ is $t$ by our construction. We claim the following:

**Claim.** At least one of the pairs $(A, B')$ and $(A', B)$ has minimum cost less than $t$.

*Proof.* Let's fix an $(i,j)$ pair where $i$ and $j$ are multiples of $k$. The set of intervals which cover $i$ and $j$ is $S_{ij}$. Note that $|S_{ij}| = t$ and every such interval contributes to the min-cost. Let $A$ have $t_A$ many intervals from $S_{ij}$ and $A'$ have $t_{A'}$ many intervals. We know $t_A + t_B = t_{A'} + t_{B'} = t$. This means either $t_A + t_{B'}$ or $t_{A'} + t_B$ is less than $t$. ==This is wrong!==

**The problem with this argument:** Here the cover set of $\ell k$ is disjoint from the cover set of $pk$. Alice can just send the cover value of all points of the form $\ell k$ to Bob, and Bob can answer the min-value.

:point_right: **Thing to keep in mind.** A fooling set gives a nondeterministic (NP) lower bound. So there should not be a short certificate for the pairs in the fooling set.

:point_up: :point_up: End of wrong part. :point_up: :point_up:

---

### Attempt 2

* Consider a set of random intervals, where each interval is chosen by choosing a random pair of numbers in $[n]$.
* The probability that a pair $(i,j)$ is separated by a random interval $I$ is $\frac d n \times \frac {n-d} n$, where $d$ is the distance between $i$ and $j$.
  * If $d$ is constant, the probability is small, $\sim \frac 1 n$.
  * If $d$ is large ($\sim n/100$), the probability is large, $\sim O(1)$.
* This means that, for a random set of intervals, the min-cost will be witnessed by an $(i,j)$ whose distance is small. Clearly, Alice can afford to send the cost of all small-distance $(i,j)$ to Bob, and they can find out the minimum cost cheaply.
* Hence, for a hard instance, the input cannot look like a random set of intervals.

*A random set of intervals will have almost equal numbers of intervals of each size: this is bad.*

*On the other hand, a set of intervals which are of small size can be communicated across easily: this is also bad.*
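A quick brute-force experiment (parameters ours) backing this up: for random intervals, the minimizing pair sits at small distance.

```python
import random

def random_intervals(n, m):
    return [tuple(sorted(random.sample(range(n), 2))) for _ in range(m)]

def min_cost_pair_distance(intervals, n):
    """Return (min cost, distance j - i of a minimizing pair)."""
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            c = sum(1 for (a, b) in intervals
                    if (a <= i <= b) != (a <= j <= b))
            if best is None or c < best[0]:
                best = (c, j - i)
    return best

random.seed(0)
print(min_cost_pair_distance(random_intervals(60, 400), 60))
# the minimizing pair is typically adjacent or nearly so
```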
**Conjecture.** Given any $\Omega(n^2)$ many intervals, the min-cost is achieved by an $(i,j)$ such that $i$ is at $O(1)$ distance from $j$.

**Observation.** If all input intervals are of length at least $t$, then the minimum cost is achieved by $(i,j)$ where either $j - i = 1$ or $j - i > t$. This follows from a simple observation: consider any $(i,j)$ which are distance $t$ apart. Now look at any subinterval $(i',j')$ of $(i,j)$. Any interval which separates $(i',j')$ must also separate $(i,j)$ (because the intervals are of length $\geq t$). Hence $cost(i,j) \geq cost (i',j')$. So the minimum cost subinterval of $(i,j)$ will be of length 1.

* Minimum cost intervals cannot be disjoint.

---

### Approximation

**Claim.** The interval problem can be approximated easily. Check! The idea is to put weight $2 \cdot OPT$ on the edges of the cycle, and weight 1 on the interval edges.

---

### Attempt 3 (Bipartition interval problem)

Let's look at a simpler problem (bipartition interval).

![](https://i.imgur.com/aI4ubHd.png)

* The path is divided at the midpoint.
* There are two types of intervals:
  * Cover intervals are contained in one of the sides.
  * Cross intervals cross the midpoint.
* We number the points away from the midpoint, in both the left and the right direction.

*Goal.* To find the minimum cost pair $(i,j)$ where $i$ belongs to the left and $j$ belongs to the right.

:point_right: Note that the cost of any such $(i,j)$ is $cost_{cross}(i,j) + Cov(i) + Cov(j)$, where $Cov(i)$ is the number of cover intervals covering $i$. Alice and Bob can exchange the $Cov(i)$ values cheaply. For the time being, let's assume that there are no cover intervals.

**Claim.** Any efficient algorithm for this problem will give an efficient algorithm for the interval problem.

*Proof sketch.* The idea is to break $[n]$ into $n^{2/3}$ blocks, each of size $n^{1/3}$. (I guess we can also do a $(\sqrt n, \sqrt n)$ blocking; the calculation below uses that version.)

* First, note that the cost of any pair inside a block can be computed efficiently. Alice can just send the intervals contained in a block (also breaking long intervals into block-sized intervals) to Bob. For each block, Alice has to send $n$ bits, and there are $\sqrt n$ many blocks. So the total communication is $n^{3/2}$.
* For $(i,j)$ across different blocks, the problem boils down to the previous problem.

**Note that we are not finding the costs of all possible pairs, because that is provably hard.** There are $\sim n$ many pairs. If we can solve the previous problem with $o(n)$ complexity, we get an $o(n^2)$ algorithm.

So can we prove that this problem is hard?

* Consider the cost matrix $M$ of a set of intervals, where the rows are indexed by the left side, the columns are indexed by the right side, and the ordering is from the midpoint to the end.
* Write $M = M_L + M_R$, where $M_L(i,j)$ is the number of intervals that cover $i$ and do not cover $j$. Similarly define $M_R$.

**Observation (Prop 1).** $M_L$ is row-increasing and column-decreasing. $M_R$ is row-decreasing and column-increasing.
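Prop 1 is easy to sanity-check in code. The encoding is ours: points are numbered outward from the midpoint, so a cross interval $(a, b)$ covers left points $1..a$ and right points $1..b$; cover intervals are dropped, as assumed above.

```python
import random

def left_matrix(cross, nL, nR):
    """M_L[i][j] = # cross intervals covering left point i but not right
    point j, i.e. intervals (a, b) with a >= i and b < j."""
    return [[sum(1 for (a, b) in cross if a >= i and b < j)
             for j in range(1, nR + 1)]
            for i in range(1, nL + 1)]

random.seed(1)
cross = [(random.randint(1, 8), random.randint(1, 8)) for _ in range(20)]
M = left_matrix(cross, 8, 8)
# Prop 1: every row increases (in j), every column decreases (in i).
assert all(M[i][j] <= M[i][j + 1] for i in range(8) for j in range(7))
assert all(M[i][j] >= M[i + 1][j] for i in range(7) for j in range(8))
```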
Consider $M_L$:

![](https://i.imgur.com/J5MKEmx.png) ![](https://i.imgur.com/W2C9A9k.png)

**Observation (Prop 2).** The difference vector between two rows in $M_L$ is monotonically increasing. For $M_R$, the difference vector between two columns is monotonically increasing.

**Claim.** Any $M$ with Prop. 1 and 2 can be written as an interval matrix $M_L$.

*Proof.* Consider the following set of intervals:

* From $i \in L$, put $a_{ij}$ many intervals from $i$ to $j < n$ ($j$ starts from 0).

Now the matrix $M_L$ looks as follows:

$$\begin{bmatrix} \sum_{1 \leq i}a_{i0} & \sum_{1 \leq i}\sum_{j< 2} a_{ij} & \cdots & \sum_{1 \leq i}\sum_{j< n} a_{ij}\\ \sum_{2 \leq i}a_{i0} & \sum_{2 \leq i}\sum_{j< 2} a_{ij} & \cdots & \sum_{2 \leq i}\sum_{j< n} a_{ij}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n0} & \sum_{j< 2} a_{nj} & \cdots & \sum_{j< n} a_{nj} \end{bmatrix}$$

Given any matrix $M$, all variables $a_{ij}$ can be solved for uniquely. :black_small_square:

Now, consider the problem of finding the minimum entry of $A + B$ in the bipartite interval problem. We can rewrite it as finding the minimum of $A_L + A_R + B_L + B_R$.

**Claim.** Finding the minimum entry in $A_L + B_R$ is hard. So is finding the minimum entry in $A_R + B_L$.

*Proof sketch.* The idea is to reduce from AND $\circ$ GT. Fix a monochromatic rectangle $t$. The fooling set is all pairs $(A_L, B_R)$ such that $A_L + B_R = t$. Such pairs are available and plenty in number thanks to the previous observation. :black_small_square:

==:point_up: :point_up: **This is false.** This is because:==

==$A_L$ is a matrix with Prop 1 and 2. Hence, $B_R = t - A_L$ is a matrix which is==

* ==row-decreasing and column-increasing,==
* ==whose difference vector between two columns is monotonically decreasing.==
* ==This means that this matrix cannot be formed as a right-cost matrix of a set of intervals.==

~~**Question.** Can we make both $A_L + B_R$ and $A_R + B_L$ hard at the same time?~~

~~**Question.** Can this fooling set (if it exists) be used as a fooling set for $A + B$?~~

The $M_R$ matrix corresponding to $M_L$ in the Claim looks as follows:

$$\begin{bmatrix} 0 & \sum_{j \leq n-1}\sum_{i \leq 1}a_{ij} & \cdots & \sum_{j \leq n-1}\sum_{i \leq n-1}a_{ij}\\ \vdots & \vdots & \ddots & \vdots\\ 0 & \sum_{i \leq 1}a_{i,n-1} & \cdots & \sum_{i \leq n-1}a_{i,n-1}\\ 0 & 0 & \cdots & 0 \end{bmatrix}$$
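As a sanity check on the Claim above, the $a_{ij}$ can be recovered from $M_L$ by double finite differences. A sketch with 0-based indices; `build_ML` is our encoding of the displayed matrix:

```python
def build_ML(a):
    """M_L[r][c] = sum of a over rows >= r and columns <= c."""
    n, m = len(a), len(a[0])
    M = [[0] * m for _ in range(n)]
    for r in range(n - 1, -1, -1):
        for c in range(m):
            M[r][c] = (a[r][c]
                       + (M[r + 1][c] if r + 1 < n else 0)
                       + (M[r][c - 1] if c >= 1 else 0)
                       - (M[r + 1][c - 1] if r + 1 < n and c >= 1 else 0))
    return M

def recover_a(M):
    """Invert build_ML by double finite differences: this is the unique
    solvability of the a_{ij} in the Claim above."""
    n, m = len(M), len(M[0])
    get = lambda r, c: M[r][c] if r < n and c >= 0 else 0
    return [[get(r, c) - get(r + 1, c) - get(r, c - 1) + get(r + 1, c - 1)
             for c in range(m)] for r in range(n)]

a = [[1, 0, 2], [0, 3, 1], [4, 0, 0]]
assert recover_a(build_ML(a)) == a
```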
#### Attempt 3(a) (Fooling set construction for Bipartite interval)

The idea is to construct a fooling set for the $A + B$ problem. A few things to keep in mind:

* Alice and Bob can do a sanity check: if for a 'proper' pairing (i.e., a pairing of $(A,B)$ as in the fooling set) the min value appears in a few places, the
* The fooling set should be of large size; otherwise Alice can
* Consider a boolean matrix where position $(i,j)$ is 1 iff it is a possible min-cost pair. We want this boolean matrix to be full rank.
* A possible candidate is the inner-product (mod 2) matrix.

How does the cost matrix change when an interval is introduced?

* For a cross interval which starts at $i \in L$ and ends at $j \in R$, the following change happens. Let's call this operation the **cross operation**. ![](https://i.imgur.com/mJKRjva.png)
* For a cover interval starting at $i \in L$ and ending at $i' \in L$, the following change happens. Let's call this operation the **row operation** (or **column operation**). ![](https://i.imgur.com/Rqlg7jp.png)

Given these, the following variations are impossible to achieve by intervals:

* $\begin{bmatrix} >t & t\\ t & >t \end{bmatrix}$
  * This is because any cross interval from $(i,i')$ to $(j,j')$ can only increase $cost(i,j')$ and $cost(i',j)$. So we can ignore the cross intervals.
  * For cover intervals, it can be argued that $cost(i,j) + cost(i',j') = cost(i,j') + cost(i',j)$.
* $\begin{bmatrix} >t & t\\ t & t \end{bmatrix}$
  * Clearly, it is not possible to achieve this by using just the cross intervals.
  * Consider all cover intervals that are used. They have changed the values such that $cost(i,j) + cost(i',j') = cost(i,j') + cost(i',j)$. As $cost(i',j') = cost(i,j') = cost(i',j) = t$, we know that, at this point, $cost(i,j) = t$. No cross interval can increase this value without increasing anything else.
* $\begin{bmatrix} t & t\\ t & >t \end{bmatrix}$
  * Similar argument.

So the claim is the following:

==**Structural claim 1.** For any $i < i'$ and $j < j'$, $cost(i,j) + cost(i',j') \leq cost(i,j') + cost(i',j)$.==

This means the following: consider the positions in $M$ which achieve the minimum. Informally, these positions should look like this:

* Denote, for row $i$, the positions where the minimum occurs as $\Lambda_i$, and say these positions are between columns $i_\ell$ and $i_r$.
* For any $i' > i$, $\Lambda_{i'}$ restricted to the columns between $i_\ell$ and $i_r$ is a subset of $\Lambda_i$.

Another way to analyze $M$ is the following: consider only column operations and cross operations.

* Consider a $2 \times 2$ matrix as a toy example.
* Apply $C$ amount of cross operation at $(1,1)$ and $T$ amount of column operation at the first column.
* The matrix looks like the following: $$\begin{bmatrix} T & C\\ T+C & 0 \end{bmatrix}$$

Let's look at the column deltas in the two rows: we have
\begin{align} T - C = \delta_1, \quad T + C = \delta_2, \end{align}
which implies
\begin{align} T = \frac{\delta_1 + \delta_2} 2, \quad C = \frac{\delta_2 - \delta_1} 2. \end{align}
Now, both $T$ and $C$ need to be nonnegative (integers), hence we have $\delta_2 \geq \delta_1 \geq -\delta_2$.

We can generalize this. Denote $\delta_{ij} = M_{ij} - M_{i,j+1}$.

==**Structural claim 2.** Any matrix (with large enough entries) $M$ with the following $\delta$-spectrum can be generated by row, column and cross operations: for every $j$,==
$$\delta_{nj} \geq \cdots \geq \delta_{1j} \geq - \delta_{nj},$$
==and for every $i$,==
$$\delta_{in} \geq \cdots \geq \delta_{i1} \geq - \delta_{in}.$$

Now, let us change the problem a bit, and try to see if we can prove a lower bound for a harder problem.

> Suppose Alice and Bob want to find the minimum of every column of $M$.

Let's take a toy example of just two columns. First let us try to prove a query lower bound: we have query access to the deltas.

$$\begin{bmatrix} M_{11} & (- \delta_1) & M_{21}\\ M_{12} & (- \delta_2) & M_{22}\\ & \vdots & \\ M_{1n} & (- \delta_n) & M_{2n} \end{bmatrix}$$

Also, let's assume that the first column is known to the querier.

**Adversarial strategy:** Answer the $\delta$s to pretend that the second column has identical entries, except the last one. This is possible if the entries in the first column are not adjacent.
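A small simulation of (a variant of) this adversary. The conventions here are our reconstruction: the second column is $M_{2i} = M_{1i} - \delta_i$ with the $\delta_i$ increasing in $i$, and 'not adjacent' means consecutive gaps of at least 2 in the first column. Any querier that skips an index can be fooled, so $n$ queries are needed:

```python
def adversary(first_col, queried):
    """first_col: strictly increasing, consecutive gaps >= 2.
    queried: row indices whose delta was queried; those were answered as
    first_col[i] - c, i.e. 'column two equals c there'. Returns a full
    second column consistent with every answer whose minimum hides at an
    unqueried row, whenever one exists."""
    n = len(first_col)
    c = max(first_col) + 2
    col2 = [c] * n
    unqueried = [i for i in range(n) if i not in queried]
    if unqueried:
        i = unqueried[0]
        col2[i] = c - 1  # delta_i grows by 1; gaps >= 2 keep deltas increasing
    deltas = [first_col[i] - col2[i] for i in range(n)]
    assert all(deltas[i] <= deltas[i + 1] for i in range(n - 1))
    return col2

print(adversary([0, 2, 4, 6, 8], {0, 2, 4}))  # the minimum hides at row 1
```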
For the *communication lower bound*, let's assume that:

* Alice and Bob both have an identical first column.
* Alice has $\delta^A$'s and Bob has $\delta^B$'s.

For the fooling set, take any arbitrary increasing (large enough) $\delta$. For any partition of $\delta = \delta^A + \delta^B$ where both $\delta^A$ and $\delta^B$ are increasing, the fooling set contains $(\delta^A, \delta^B)$. Now, consider $(\delta^A, \delta^B)$ and $(\delta^{A'}, \delta^{B'})$. There is an $i$ such that

* either $\delta^A_i > \delta^{A'}_i$: then $(\delta^{A'}, \delta^B)$ has a different minimum,
* or $\delta^B_i > \delta^{B'}_i$: then $(\delta^A, \delta^{B'})$ has a different minimum.

One way to generate $\delta^A$ and $\delta^B$ from $\delta$ is to take an appropriate $\delta$ and consider all $\delta^A = \delta/2 \pm 1$ (choosing the sign independently per coordinate). There are $2^n$ possibilities, and hence a lower bound of $\Omega(n)$ is immediate.

**Question.** Can we generalize this lower bound to super-constantly many columns? The Achilles heel is the following claim:

==**Structural claim 3.** The min of column $i$ can occur only on or below the row where the min occurs in column $i-1$.==

**Question.** What is a super-constant query lower bound for multiple columns? Here is a major problem in generalizing the previous lower bound:

* Suppose, after $n$ queries, we know the last column. If we want to prove a lower bound of $\Omega(n)$ for this step, the adversarial strategy is to fix the last column to have identical entries.
* Once the last column is fixed in this way, every other column has its min at the top. This is because, for column $n-1$,
$$M_{n-1,j} - \delta_{n-1,j} = M_{n-1,1}-\delta_{n-1,1}.$$
Given that $\delta_{n-1,j} \geq \delta_{n-1,1}$, we have $M_{n-1,j} \geq M_{n-1,1}$ for any $j >1$.

**Even more**: if any column $i$ is a column with identical entries, then all columns $< i$ will have their min at the top, and all columns $> i$ will have their min at the bottom.

**If we assume that every column has a unique minimum,** then there is a binary-search-type algorithm:

> * Query the middle column. Let the position of the minimum in the middle column be $j$.
> * Recurse within $M[1,1:(n/2),j]$ and $M[(n/2),j:n,n]$ (i.e., columns $1$ to $n/2$ restricted to rows $1$ to $j$, and columns $n/2$ to $n$ restricted to rows $j$ to $n$).

:point_right: Actually this also works when we remove the unique-minimum condition.
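A sketch of this divide-and-conquer (the formal version is Algorithm 1 below). It only uses the monotonicity of the argmin positions from Structural claim 3, so the unique-minimum assumption is not needed; `query(r, c)` stands for whatever access we have to the entries:

```python
def all_column_minima(query, n_rows, n_cols):
    """Minimum of every column, assuming the rows achieving column minima
    move weakly downward as the column index grows (Structural claim 3).
    Uses O((n_rows + n_cols) log n_cols) queries."""
    mins = [None] * n_cols

    def solve(c_lo, c_hi, r_lo, r_hi):
        if c_lo > c_hi:
            return
        mid = (c_lo + c_hi) // 2
        best_val, best_row = min((query(r, mid), r)
                                 for r in range(r_lo, r_hi + 1))
        mins[mid] = best_val
        solve(c_lo, mid - 1, r_lo, best_row)   # left half: rows above best_row
        solve(mid + 1, c_hi, best_row, r_hi)   # right half: rows below best_row

    solve(0, n_cols - 1, 0, n_rows - 1)
    return mins

# toy usage on an explicit matrix whose column minima move downward
M = [[3, 4, 9], [1, 2, 7], [5, 0, 1]]
print(all_column_minima(lambda r, c: M[r][c], 3, 3))  # [1, 0, 1]
```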
This actually gives the following two theorems:

**Theorem (bipartite).** The **bipartite interval problem** can be solved with $\tilde O(n)$ bits of communication.

**Theorem (interval).** The **interval problem** can be solved with $\tilde O(n)$ bits of communication.

#### Proof of Theorem (bipartite)

* First, we claim that the cost matrix has the *increasing delta* property.
* Then we apply the previous algorithm to find the minimum of every column. This gives us the global minimum.

**Claim.** Any bipartite interval cost matrix $M$ has the property $\delta_{nj} \geq \cdots \geq \delta_{1j}$.

*Proof sketch.* Let's start with an empty matrix. The condition is trivially satisfied.

* Cover intervals do not affect the condition:
  * A cover interval on the left side increases a set of rows by a fixed amount (the weight of the interval). This does not change any delta.
  * A cover interval on the right side increases a set of columns by a fixed amount. This changes all deltas of a pair of columns by the same amount.
* A cross interval can meaningfully affect the deltas between a pair of columns iff the cross interval occurs between these two columns. Any such cross interval will increase the deltas below it, and decrease the deltas above it. Thus the relationship between the deltas is maintained. :black_medium_square:

**Claim.** Given query access to the deltas of any such matrix, the minimum of every column can be found within $\tilde O(n)$ queries.

*Proof sketch.* We will use the following claim.

**Claim.** For any column $i$, let's assume that the first occurrence of the minimum is at row $j^s$ and the last is at row $j^t$. Then

* for any column $i' < i$, the minimum will occur in rows $\leq j^s$,
* for any column $i'' > i$, the minimum will occur in rows $\geq j^t$.

==TODO: Proof of this.==

Given this claim, the algorithm is as follows:

**Algorithm 1.**

* Query the middle column. Let the positions of the minimum in the middle column be between $j^s$ and $j^t$.
* Recurse within $M[1,1:(n/2),j^s]$ and $M[(n/2),j^t:n,n]$.

The analysis is as follows: let $T(a,b)$ be the number of queries needed to find the minimum of every column of a matrix of dimension $a \times b$. Then
\begin{align} T(n,n) = n + T(j^s, n/2) + T(n-j^t, n/2), \end{align}
with the stop condition $T(a, 1) \leq a$. This recursion tree is of depth $\log n$, and in each layer the total cost is $O(n)$. Hence the total number of queries required is $\tilde O(n)$. :black_medium_square:

Now Theorem (bipartite) follows easily by converting this query algorithm into a communication protocol: each query is simulated by $O(\log n)$ bits of communication between Alice and Bob, and hence the total communication needed is $\tilde O(n)$.

#### Proof of Theorem (interval)

This follows by using the algorithm of Theorem (bipartite) as a subroutine.

* Divide $[1, \cdots, n]$ into blocks of size $k$: there are $n/k$ such blocks.
* For any $(i,j)$ separated between two blocks:
  * Alice and Bob collapse the blocks between these two blocks, and change the intervals accordingly.
  * Alice and Bob apply Algorithm 1 (the communication version) between these two blocks (which are now adjacent).
* For any $(i,j)$ inside the same block: brute force.

Total communication needed:
\begin{align} T(n) = \underbrace{k^2 \cdot \frac n k}_\text{same block} + \underbrace{\left(\frac n k\right)^2\cdot k}_\text{diff block} = n \sqrt n, \end{align}
by setting $k = \sqrt n$.

This can be improved even further. For computing the cost of $(i,j)$ in the same block, we can

* divide the block into two equal parts and apply Algorithm 1 (to compute the cost of the $(i,j)$ separated by the divider);
* recurse into the two parts.

This process goes on for $O(\log k)$ steps. At each depth of the recursion, Alice and Bob communicate at most $k$ bits per block. So we have
\begin{align} T(n) = \underbrace{k \log k \cdot \frac n k}_\text{same block} + \underbrace{\left(\frac n k\right)^2\cdot k}_\text{diff block} = O(n \log n), \end{align}
by setting $k = O(n)$. :black_medium_square:

## Star graph

Now we look at the other extreme: the star graph. The notes are [here](https://hackmd.io/@sagnik/rJwYzZUYB). There is a root $r$, $n$ vertices $\{1, \cdots, n\}$, and $n$ tree edges $\{(r,i)\}_i$. The cost matrix is the following: let $deg_i$ be the weighted degree of vertex $i$, and $w_{ij}$ the weight of the edge $(i,j)$. Then
$$cost(i,j) = deg_i + deg_j - 2w_{ij}.$$

Now, any matrix with the following constraint can be made into a cost matrix:

* For any row (or column)

**Claim.** Finding an exact 2-respecting min-cut on a star graph requires $\Omega(n^2)$ bits of communication.

**Claim.** Finding a 2-respecting min-cut on a star graph (where a 1-respecting min-cut is also allowed) can be done with $\tilde O(n)$ bits of communication.

*Proof.* For a 2-respecting cut to be a min-cut, we need $deg_i + deg_j - 2w_{ij} < \min(deg_i, deg_j)$, which implies $w_{ij} > \max(deg_i, deg_j)/2$. WLOG assume that $\max(deg_i, deg_j) = deg_i$. Then, because $\sum_{j}w_{ij} = deg_i$, there is at most one $j$ such that $w_{ij} > deg_i/2$. For Alice and Bob, $w_{ij} > deg_i/2$ means either $w^A_{ij} > deg^A_i/2$ or $w^B_{ij} > deg^B_i/2$. So, for each $i$, Alice and Bob have to check at most two $j$s to see if the 2-respecting cut is indeed the min-cut. This can be done with $\tilde O(n)$ bits of communication. :black_medium_small_square:
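A centralized brute-force sketch of this claim's structure (the encoding is ours: index 0 plays the root $r$, and `w` is the full symmetric weight matrix of the underlying graph, whose spanning tree is the star):

```python
def star_min_cut(w):
    """w: symmetric (n+1) x (n+1) weights, vertex 0 = center r.
    1-respecting cuts are single leaves i (value deg_i); a 2-respecting cut
    {i, j} can only win if 2 * w[i][j] > max(deg_i, deg_j), which holds for
    at most one j per i. That pruning is what the O~(n) protocol exploits."""
    n = len(w) - 1
    deg = [sum(w[i]) for i in range(n + 1)]
    best = min((deg[i], (i,)) for i in range(1, n + 1))  # 1-respecting cuts
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if 2 * w[i][j] > max(deg[i], deg[j]):        # only candidate pairs
                best = min(best, (deg[i] + deg[j] - 2 * w[i][j], (i, j)))
    return best  # (cut value, leaves cut off)
```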
## Spider graph

Here we generalize the star graph. A spider graph consists of $t$ many legs $b_1, \cdots, b_t$, which are disjoint except at the center, where they connect via the vertex $r$.

**Claim.** Fix a vertex $u$ and consider a vertex $v$ on a different leg. The value $w_{>u,>v}$ is maximized for the vertex $v$ closest to the root.

*Proof.* Let $v'$ be the parent of $v$. Then $w_{>u,>v} \leq w_{>u,>v'}$. Hence the maximum is attained near the root. :black_medium_small_square:

Now, for every vertex $u$, let's count how many vertices $v$ there are such that the 2-respecting cut has the possibility of being a min-cut. For this, the following needs to happen:
$$deg_{>u} + deg_{>v} - 2w_{>u,>v} < \min(deg_{>u}, deg_{>v}).$$
This means $w_{>u,>v} > \max(deg_{>u}, deg_{>v})/2$. This also means that if we collapse the branch containing $v$ towards the root into a single vertex $v$, then $w_{>u,v} > deg_{v}/2$; note also that $w_{>u,v} \geq w_{>u,>v} > deg_{>u}/2$. As $deg_{>u} = \sum_v w_{>u,v}$, there is at most one $v$ where this latter inequality holds.

Say that for $u$, $v_u$ is the collapsed leg for which $w_{>u,v_u} > deg_{v_u}/2$. This means that there is a $v'$ in the branch containing $v_u$ where: (1) $w_{>u,>v'} > deg_{>v'}/2$, and (2) $deg_{>u} + deg_{>v'} - 2w_{>u,>v'}$ is minimized.

The question is: how do we find the $v'$s such that (1) holds? To this end, note that, from the point of view of such a $v'$, the branch containing $u$ is also the only interesting branch. So, with a reasonable definition of 'interesting', we make the following claim:

**Claim.** Given a leg $\ell$, partition the vertices of $\ell$ into $S^\ell_1, \cdots, S^\ell_k$, where $S^\ell_i$ is the set of vertices of $\ell$ for which leg $i$ is interesting. Then, given two legs $\ell$ and $\ell'$, the interesting 2-respecting cuts will respect edges in the set $S^\ell_{\ell'} \cup S^{\ell'}_\ell$. Also, any vertex will take part in only constantly many edge pairs.

So the algorithm is the following:

1. For each leg $\ell$, figure out the sets $S^\ell_1, \cdots, S^\ell_k$. This can be done in $O(n)$ total communication.
2. For each pair of legs $(\ell, \ell')$:
   * Collapse all vertices except $S^\ell_{\ell'} \cup S^{\ell'}_\ell$.
   * Collapse all other legs.
   * Run the 2-respecting path algorithm on $S^\ell_{\ell'} \cup S^{\ell'}_\ell$.

As each vertex takes part in only constantly many pairs, the total communication is $O(n)$.
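A centralized sketch of step 1, computing which leg is 'interesting' for each vertex (our encoding: legs listed root-to-tip, `w` a symmetric weight matrix over vertex ids, vertex 0 the center; by the counting argument above, at most one leg passes the $deg_{>u}/2$ test):

```python
def interesting_legs(legs, w):
    """legs: lists of vertex ids ordered from the root outward (root 0 excluded).
    Returns {u: leg index b} where leg b is interesting for u, i.e.
    2 * w(subtree(u), leg b) > deg(subtree(u))."""
    all_vs = {v for leg in legs for v in leg} | {0}
    out = {}
    for li, leg in enumerate(legs):
        for pos, u in enumerate(leg):
            sub = set(leg[pos:])            # subtree of u = tail of its leg
            deg_sub = sum(w[x][y] for x in sub for y in all_vs - sub)
            for b, other in enumerate(legs):
                if b == li:
                    continue
                w_to_leg = sum(w[x][y] for x in sub for y in other)
                if 2 * w_to_leg > deg_sub:
                    out[u] = b              # unique by the sum argument above
    return out
```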
---

~~Consider the case where there are $\sqrt n$ legs, each of length $\sqrt n$. Two approaches:~~

* ~~Consider all pairs of branches: there are $O(n)$ such pairs. For each pair $(b_i, b_j)$, collapse all other branches to $r$ and treat this as a 2-respecting path problem: this has complexity $\tilde O(\sqrt n)$. So the total complexity is $\tilde O(n \sqrt n)$.~~
* ~~Consider any vertex $i$. Note that, in this case, $cost(i,j)= deg_{>i} + deg_{>j} - 2 w_{>i,>j}$, where $>i$ denotes all vertices from $i$ to the end of the leg. As before, the interesting case is when $w_{>i,>j} > deg_{>i}/2$.~~
  * ~~Note that, if we collapse every other leg (the legs which do not contain $i$) to that of a star, then $w_{>i,>j}$ can only increase: this means that if for some $j$, $w_{>i,>j} > deg_{>i}/2$, then it also holds when we collapse the leg containing $j$.~~
  * ~~After collapsing the other legs:~~
    * ~~Short-short: collapse and use star.~~
    * ~~Short-long: collapse and use star. See from the pov of the vertices in long.~~
    * ~~Long-long: do path.~~

---

## Any tree

We do a heavy-light path decomposition of the tree $T$. After the decomposition, we have the following property: any vertex $v$ can be reached from the root by crossing at most $\log n$ many paths. This means that for any $u$, there can be at most $\log n$ many paths which are interesting.

The algorithm is as follows:

1. For every vertex $v$, enumerate the set of vertices, $S_v$, which are interesting for $v$.
2. For every pair of paths:
   a. Given two paths $p$ and $p'$, we keep the vertices $v$ in $p$ for which $p'$ is interesting (i.e., $S_v \cap p' \neq \emptyset$), and the vertices in $p'$ for which $p$ is interesting. We collapse everything else.
   b. We apply the 2-respecting path algorithm on the collapsed graph.
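A minimal sketch of heavy-light decomposition (the standard technique; names are ours), together with the check that a root-to-$v$ walk crosses $O(\log n)$ paths:

```python
def heavy_light(adj, root=0):
    """adj: dict vertex -> list of neighbors of a tree.
    Returns (head, parent): head[v] is the topmost vertex of v's path; a path
    continues through each vertex's heaviest (largest-subtree) child."""
    parent, size, order = {root: None}, {}, []
    stack = [root]
    while stack:                      # DFS to fix an order and parents
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)
    for u in reversed(order):         # subtree sizes, leaves first
        size[u] = 1 + sum(size[v] for v in adj[u] if v != parent[u])
    head = {root: root}
    for u in order:
        kids = [v for v in adj[u] if v != parent[u]]
        heavy = max(kids, key=lambda v: size[v], default=None)
        for v in kids:
            head[v] = head[u] if v == heavy else v
    return head, parent

def paths_crossed(v, head, parent):
    """Number of heavy paths on the root-to-v walk; always O(log n)."""
    cnt = 0
    while v is not None:
        cnt += 1
        v = parent[head[v]]
    return cnt
```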