BOI 2015 Network

# Network ## Insight 1: we always want to connect leaves This insight can be found by drawing. Take this example: adding the red edge will "cover" all purple edges. ![image](https://hackmd.io/_uploads/SkbajO9tee.png) However, we might as well connect the leaves instead, covering a superset of edges: ![image](https://hackmd.io/_uploads/BJDg2O9Yxl.png) So in general, in an optimal solution, we will only connect leaves. ## Insight 2: Lower bound What do we do with insight 1? We obviously want to only connect leaves, but how many might be needed? An obvious lower bound is $\lceil \frac{\text{number of leaves}}{2} \rceil$: connect two new leaves with every edge, perhaps only 1 for the last edge if $N$ is odd. It turns out that this lower bound is always attainable. Now, how do we ensure that connecting two leaves actually makes progress? For example, connecting these two is quite useless: ![image](https://hackmd.io/_uploads/BkhxCu9Ygx.png) One way is to use a tree DP, doing a DFS. However, I will explain another solution. We are guaranteed a solution if we can find a "root node", and are able to always match leaves from different subtrees. In the following image, the green node is the root. Notice that doing things this ways guarantees a solution. ![image](https://hackmd.io/_uploads/S1LDCOqKge.png) Now, how do we find a suitable root node? We require that our root partitions the subtrees' leaves into sets such that no set is too large. For example, if the following root was chosen instead, it's impossible: ![image](https://hackmd.io/_uploads/BJ8ZJKctxe.png) The problem we have is basically "given $k$ sets with $N$ items in total, match pairs of items. Two items can be paired iff they are from different sets". Untuitively, the most difficult group to match is the largest one. Therefore, a reasonable greedy is to maintain all group sizes in a heap, and match elements from the currently two largest groups. It turns out that this algorithm always finds a solution if one exists. The necessary and sufficient condition is that no group has size $s$ such that $2 \cdot s > N$. Now, does such a node even always exist? Yes! We are basically searching for a "leaf centroid"- a node where no subtree has more than $\frac{\text{leaves}}{2}$ leaves. A proof can be found in the bottom. How do we find this? Similarly as a normal centroid: root the tree at $0$, and then compute for every node the number of leaves in its subtree. To find the leaf centroid, iterate over every node check the largest subtree. To find the number of leaves in the parent, simply take the number of leaves minus the sum of leaves in the subtrees. Then, simply check if the largest subtree is sufficiently small- if it is, we're done, otherwise continue. So the algorithm in full is: - Do a DFS, computing number of leaves in the subtree - For every node, check if it's the leaf centroid - Root the tree at the leaf centroid. Do a DFS to collect groups of leaves from the subtrees - Put all of these groups into a priority queue and greedily repeatedly match the two largest groups. - Special handle the case where $N$ is even Note that you might need to special handle $N=2$. ## Proof that leaf centroid always exists Assume for sake of contradiction that no leaf centroid exists. This implies that, by definition, for every node, there is at least one edge where the number of leaves on the other "side" is $> L/2$ (otherwise, it would be a leaf centroid). Direct said edge to that side (note that the edge will never be asked to direct itself to both sides simultaneously: it's impossible for both sides to have $> L/2$ leaves). In turn, every node has outdegree $1$. However, there are only $N-1$ edges, so there must be at least one node with outdegree $0$, a contradiction. Thus, a leaf centroid always exists.