--- tags: fall 2018 cs61b --- # Minimum Spanning Trees [TOC] ## Introduction Let's analyze what a **minimum spanning tree (MST)** is by looking at each individual word that forms the acronym MST. 1. **Minimum:** implies that nodes must be connected with the smallest possible edge weights 2. **Spanning:** all nodes in the graph must be connected 3. **Tree:** an acyclic graph From the following definitions, we can conclude that a **minimum spanning tree** is a tree that connects all nodes in a graph with the smallest possible edge weight. On another note, a **spanning tree** is a tree that connects all nodes in a graph. ![](https://i.imgur.com/WJkrksU.png) ### Cut Property If all the nodes in a graph are divided up into two sets, the minimum crossing edge between the two sets is part of an MST. ### Common Questions **Does adding some constant L to all the weights in graph change the MST?** No. The ordering stays the same. **Does squaring the weights in graph change the MST? (assume positive edge weights)?** No. The ordering stays the same. ### Motivation In real life, we need MSTs for the following: * electricity company in a neighbor trying to connect every house * Computer networks and routing ## Prim's Algorithm ### Steps 1. Start from some arbitary start node. 2. Add the shortest edge that has one node inside the MST under construction and does not create a cycle. 3. Repeat until the MST is complete (has all $V$ nodes and $V-1$ edges). ### Worst Case Runtime The worst case runtime is $Θ(V \textrm{ log }V + E \textrm{ log }V)$. Prim's Algorithm is very similar to Dijkstra's Algorithm. Dijkstra's Algorithm looks at the distance from the start node while Prim's Algorithm looks at the distance from the nodes that already appear in the MST under construction. If $E > V$, then the runtime can be further simplified to $Θ(E \textrm{ log }V)$. **More Detailed Breakdown:** Once again, we're maintaining a priority queue (PQ) of nodes. We insert nodes with a priority based on the edge weight of the edge connecting a node inside our MST and the node we're enqueuing. We then pop off the node with the highest priority (lowest edge weight). **Insertion:** $\Theta(V\textrm{ log } V)$ In the worst case scenario, we perform $V$ insertions total for each node. Since a PQ is implemented using a heap (specifically a min heap in this case), it'll cost $O(\textrm{log } V)$ each time to insert a node. **Remove-min:** $\Theta(V\textrm{ log } V)$ In the worst case scenario, we'll have to pop off all the nodes from the PQ, so the number of operations will be $V$. Removing the minimum node costs $O(\textrm{ log }V)$. **Update priority:** $\Theta(E \textrm{ log } V)$ In the worst case scenario, we have to update the priority of nodes already in the PQ every time we consider an edge. Therefore, the largest number of operations would be $E$. Updating priority in a PQ means bubbling nodes up or down, which will cost $O(\textrm{log } V)$ time. Adding all the runtimes together and dropping constants, we get $Θ(V \textrm{ log }V + E \textrm{ log }V)$. **From [Hug's Spring 2018 Slides](https://docs.google.com/presentation/d/1I8GSEL0CgT09_JjSUF7MfoRMJkyzPjo8lKRd8XdOaRA/edit#slide=id.g36aa282842_234_0):** ![](https://i.imgur.com/JwbSF99.png) ## Kruskal's Algorithm ### Steps 1. Sort all the edges in order of increasing weight. 2. Add the smallest edge that has not been processed and does not create a cycle to the MST under construction. 3. Repeat until $|V| − 1$ edges total. ### Worst Case Runtime There are assumptions we can make about the edges this will affect the runtime. **No assumptions:** We use a PQ (backed by a min heap) of edges with the priority being the edge weight. First, we consider what we're familiar with: $O(E \textrm{ log } E)$ for adding all the edges to the PQ and $O(E \textrm{ log } E)$ for popping off the smallest edge every time (removing the minimum node in the heap). To make sure that we don't create a cycle when adding an edge, we use **union find** to keep track of the nodes in our MST under construction. Every time we consider an edge in the PQ, we have to check if the two nodes that the edge connects are already in the MST. If they are, we'll create a cycle if we add the edge. We use the `isConnected` method from union find, which has a runtime of $O(\textrm{log* } V)$. Don't worry too much about the asterisk. It indicates an [iterated logarithm](https://stackoverflow.com/questions/2387656/what-is-olog-n). The proof for the runtime is [here](https://www.wikiwand.com/en/Proof_of_O(log*n)_time_complexity_of_union%E2%80%93find). We do this `isConnected` operation $O(E)$ times if we end up considering all the edges. Therefore, the total runtime of using `isConnected` would be $E \textrm{ log* } V$. If the edge doesn't create a cycle (meaning one of the nodes is not in the MST) and we add it to the MST, we have to union the new node with the tree containing the nodes inside the MST under construction by using the union find `union` method. We eventually have to union all the nodes to get a complete MST, so we run this operation $O(V)$ times. Similar to `isConnected`, the runtime of `union` is $O(\textrm{log* } V)$. Therefore, the total runtime of using `union` would be $\Theta(V \textrm{ log* } V)$. Without any assumptions made, the worst case runtime of Kruskal's algorithm would be $\Theta(E \textrm{ log } E + E \textrm{ log* } V + V \textrm{ log* } V)$. If $E > V$, then $\Theta(E \textrm{ log } E)$. **From [Hug's Spring 2018 Slides](https://docs.google.com/presentation/d/1I8GSEL0CgT09_JjSUF7MfoRMJkyzPjo8lKRd8XdOaRA/edit#slide=id.g772f8a8e2_0_110):** ![](https://i.imgur.com/1f4z8C0.png) **Assume the edges are already sorted in a list:** If we're given a list of pre-sorted edges in a list, inserting and deleting edges from the list both take **constant** time. In the worst case, both operations are run at most $E$ times, so insertion and deletion both have $\Theta(E)$ runtime. The other operations are not affected. The worst case runtime will now be $\Theta(E \textrm{ log* }V)$. ## Summary of Graph Algorithm Runtimes ![](https://i.imgur.com/o9mJxIy.png)