---
tags: summer 2018 cs61bl
---
# CS 61BL Week 6 Notes
Hi! Every week, I like to compile notes based on confusing concepts and good questions that I've been asked! If you’re already familiar with all the topics, check out the **Resources** section for practice problems or insights.
Also, if there’s anything that doesn’t make sense or seems confusing, feel free to shoot me an [email](mailto:tinazhao@berkeley.edu).
## Table of Contents
[TOC]
## Dijkstra's Runtime
### Context
While DFS and BFS rely on a stack and a queue respectively, Dijkstra's algorithm relies on a **priority queue** to return nodes in order of the **shortest path possible** from the start. We know from lab that **priority queues are implemented using heaps** and that the priority queue for Dijkstra's algorithm will specifically rely on a **min heap**.
### Breaking Down Runtime
Dijkstra's algorithm can be broken down into three steps when you look at each node in the graph (sketched in code after the note below):
1. **Insert the Node:** Insert the node into the priority queue (insert into a min heap).
2. **Remove minimum node:** Pop off the node with the highest priority from the priority queue (remove root node in the min heap and re-heapify).
3. **Update priority:** Consider the edge weights of the neighboring nodes of the node you just popped off and update the priorities in the priority queue accordingly. (Bubble the nodes in the min heap according to priority.)
:::info
**Important Note:**
Think of the runtime from a higher, more conceptual perspective instead of from the perspective of your lab code. It's easy and quite common to get bogged down in the details of the lab implementation and become skeptical of this analysis.
This may look like a different implementation than the algorithm you wrote in lab. Most notably, we are looking at the edge weights of neighboring nodes and **updating the priorities** in the heap **instead of inserting a new node to our heap with a new priority**.
:::
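Here's a minimal sketch of the algorithm in Java. Since `java.util.PriorityQueue` has no change-priority operation, this version re-inserts vertices with updated distances and skips stale entries, like the lab version; the analysis below assumes an update-priority heap instead, but the asymptotics match since $O(\textrm{log } E) = O(\textrm{log } V)$. The `Edge` class and method names here are mine, not from the lab skeleton.
```java
import java.util.*;

public class Dijkstra {
    /** Directed edge to vertex `to` with weight `weight`. */
    static class Edge {
        int to; double weight;
        Edge(int to, double weight) { this.to = to; this.weight = weight; }
    }

    /** Returns the shortest distance from `source` to every vertex. */
    static double[] shortestPaths(List<List<Edge>> adj, int source) {
        int V = adj.size();
        double[] dist = new double[V];
        boolean[] visited = new boolean[V];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;

        // Entries are {vertex, distance}, ordered by distance (a min heap).
        PriorityQueue<double[]> pq =
                new PriorityQueue<>(Comparator.comparingDouble((double[] a) -> a[1]));
        pq.add(new double[]{source, 0.0});             // step 1: insert

        while (!pq.isEmpty()) {
            int u = (int) pq.poll()[0];                // step 2: remove min
            if (visited[u]) continue;                  // skip stale entries
            visited[u] = true;
            for (Edge e : adj.get(u)) {                // step 3: update priority
                if (dist[u] + e.weight < dist[e.to]) {
                    dist[e.to] = dist[u] + e.weight;
                    pq.add(new double[]{e.to, dist[e.to]});
                }
            }
        }
        return dist;
    }
}
```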
### Worst Case Runtime Analysis
Let's denote the number of vertices as $V$ and the number of edges as $E$.
1. **Insertion: Θ(V log V)**
We will end up inserting all of our vertices into our heap, so we will make **V function calls** for insertion. An insertion into our heap will take **O(log V)** because we need to bubble up the node to the correct spot. Since we do a **heap insertion for every vertex**, the overall runtime of insertion is **Θ(V log V)**.
2. **Remove Minimum Node: Θ(V logV)**
Since in the worst case all of our vertices end up in the priority queue, we know we will make **V function calls** to remove the minimum node from the heap. Removing the minimum node in the heap will take **O(log V)** because we need to replace the root node with a node from the end and bubble it down. Since we are **removing the root node for every vertex**, the overall runtime of removing the minimum node is **Θ(V log V)**.
3. **Update Priority: Θ(E logV)**
We update the priorities based on the edge weights from the current vertex we popped off to its neighboring vertices. Therefore, we will make **E function calls**. Updating the priority of a vertex in the priority queue is equivalent to bubbling a node in a heap to the correct spot, so the runtime of updating the priority of one vertex is **O(log V)**. Since we update a priority every time we look at an edge, the overall runtime of updating priorities is **Θ(E log V)**.
Since the steps run *in sequence* (they happen one after another instead of at the same time), we can add up the runtimes of every step.
**Runtime of Dijkstra's Algorithm: Θ(V log V + E log V)**.
If **E > V** (as in dense graphs; a complete graph has $E = \frac{V(V-1)}{2}$), then the runtime can be further simplified to **Θ(E log V)**.
## Minimum Spanning Trees
Let's analyze what a **minimum spanning tree (MST)** is by looking at each word in the name.
1. **Minimum:** the total weight of the edges used must be as small as possible
2. **Spanning:** all nodes in the graph must be connected
3. **Tree:** a connected, acyclic graph
From these definitions, we can conclude that a **minimum spanning tree** is a tree that connects all nodes in a graph with the smallest possible total edge weight. On another note, a **spanning tree** is simply a tree that connects all nodes in a graph.

### Common Questions
**Does adding some constant L to all the weights in a graph change the MST?** No. Every spanning tree has exactly $|V| - 1$ edges, so adding $L$ to each weight adds $(|V| - 1)L$ to every spanning tree's total, leaving their relative order unchanged.
**Does squaring the weights in a graph change the MST (assume positive edge weights)?** No. Squaring is monotonically increasing on positive numbers, so the relative order of the edge weights stays the same.
### Motivation
In real life, we need MSTs for problems like the following:
* an electricity company in a neighborhood trying to connect every house
* computer networks and routing
## MST Algorithms
Suppose we're trying to create an MST from a graph denoted by $G$.
### Prim's Algorithm
**Quoted from CSM MST Crib Sheet:**
> Main Idea: Start from some arbitrary start node. Add shortest edge that has one node inside the MST under construction. Repeat until $|V| - 1$ edges total.
**Runtime: Θ(V log V + E log V)**
Prim's Algorithm is nearly identical to Dijkstra's Algorithm (the lab implementations are very similar). Dijkstra's Algorithm looks at the distance from the starting vertex, while Prim's Algorithm looks at the distance from the vertices already in the MST. As before, if **E > V**, the runtime can be further simplified to **Θ(E log V)**.
**More Detailed Breakdown:**
**Insertion:** V insertions, each costing O(log V) time.
**Delete-min:** V deletions, each costing O(log V) time.
**Decrease priority:** O(E) updates, each costing O(log V) time.
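Here's a minimal Java sketch under the same assumptions as the Dijkstra code above (lazy re-insertion with `java.util.PriorityQueue` instead of a decrease-priority operation; the `Edge` class and names are mine):
```java
import java.util.*;

public class Prim {
    /** Undirected edge stored as `from -> to`; each edge appears in both
     *  endpoints' adjacency lists. */
    static class Edge {
        int from, to; double weight;
        Edge(int from, int to, double weight) {
            this.from = from; this.to = to; this.weight = weight;
        }
    }

    /** Returns the MST edges of a connected, undirected graph. */
    static List<Edge> mst(List<List<Edge>> adj) {
        int V = adj.size();
        boolean[] inMST = new boolean[V];
        List<Edge> tree = new ArrayList<>();
        PriorityQueue<Edge> pq =
                new PriorityQueue<>(Comparator.comparingDouble((Edge e) -> e.weight));

        inMST[0] = true;                      // arbitrary start node
        pq.addAll(adj.get(0));
        while (!pq.isEmpty() && tree.size() < V - 1) {
            Edge e = pq.poll();               // shortest candidate edge
            if (inMST[e.to]) continue;        // both endpoints already in MST
            inMST[e.to] = true;
            tree.add(e);
            pq.addAll(adj.get(e.to));         // new edges leaving the MST
        }
        return tree;
    }
}
```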

### Kruskal's Algorithm
**Quoted from CSM MST Crib Sheet:**
> Main Idea: Sort all edges in order of increasing weight. Add smallest edge to MST unless this creates a cycle. Repeat until $|V| - 1$ edges total.
**Runtime: $Θ(E \textrm{ log }E)$** if we assume $E > V$
The runtime for Kruskal's Algorithm comes mainly from sorting all the edges in the graph. In the worst case, the fastest comparison sort on **N** elements takes **O(N log N)** *(see the sorting sections below)*, so sorting the edges takes **Θ(E log E)**. (Since $E \le V^2$, $\textrm{log } E = O(\textrm{log } V)$, so this is equivalent to **Θ(E log V)**.) The weighted quick union operations cost **O(log V)** per edge considered, for at most **O(E log V)** overall, which the sort dominates.
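Here's a Java sketch combining the sort with a weighted quick union, as described above (class and method names are mine):
```java
import java.util.*;

public class Kruskal {
    static class Edge {
        int u, v; double weight;
        Edge(int u, int v, double weight) { this.u = u; this.v = v; this.weight = weight; }
    }

    /** Weighted quick union: the smaller tree always hangs below the larger. */
    static class UnionFind {
        int[] parent, size;
        UnionFind(int n) {
            parent = new int[n]; size = new int[n];
            for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
        }
        int find(int x) {
            while (x != parent[x]) x = parent[x];  // walk up to the root
            return x;
        }
        boolean union(int a, int b) {
            int ra = find(a), rb = find(b);
            if (ra == rb) return false;            // already connected: cycle
            if (size[ra] < size[rb]) { int t = ra; ra = rb; rb = t; }
            parent[rb] = ra;
            size[ra] += size[rb];
            return true;
        }
    }

    /** Returns the MST edges of a connected graph with V vertices. */
    static List<Edge> mst(int V, List<Edge> edges) {
        edges.sort(Comparator.comparingDouble((Edge e) -> e.weight)); // Θ(E log E): dominates
        UnionFind uf = new UnionFind(V);
        List<Edge> tree = new ArrayList<>();
        for (Edge e : edges) {
            if (tree.size() == V - 1) break;       // MST complete
            if (uf.union(e.u, e.v)) tree.add(e);   // skip cycle-creating edges
        }
        return tree;
    }
}
```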
### Summary of Runtime for Graph Algorithms

| Algorithm | Worst-Case Runtime |
| --------- | ------------------ |
| Dijkstra's | Θ(V log V + E log V) |
| Prim's | Θ(V log V + E log V) |
| Kruskal's | Θ(E log E) |

## Insertion Sort
For every item, swap it leftward until it is in the correct position relative to the items before it. This is the way an adult would normally sort a hand of cards: iterate through the array, swapping each element leftward.
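A short sketch in Java (array version; the method name is mine):
```java
/** Insertion sort: swap each element leftward until it is in place. */
static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        // Invariant: a[0..i-1] is already sorted.
        for (int j = i; j > 0 && a[j] < a[j - 1]; j--) {
            int tmp = a[j];                  // swap a[j] with a[j - 1]
            a[j] = a[j - 1];
            a[j - 1] = tmp;
        }
    }
}
```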

### Runtime
**Best Case:** **Θ($N$)**
In the best case scenario, we're given a **sorted** array such as [1, 2, 3, 4, 5]. If we run insertion sort on this array, each element needs only one comparison and no swaps leftward, so the runtime is linear.
**Worst Case:** **Θ($N^2$)**
In the worst case, we're given an array that is in **reverse sorted order**, such as [5, 4, 3, 2, 1]. We need to make up to **N comparisons and swaps** for each element as we iterate through the array, for roughly $N^2/2$ operations total.
**Stable? Yes**
Consider an array [3A, 2, 3B, 1], where the 3s have been labeled to differentiate between them. We would get the following steps: [2, 3A, 3B, 1], then [1, 2, 3A, 3B]. In general, this algorithm is stable, because given a 3A and 3B, we would never swap them with each other.
## Selection Sort
We loop through the array to find the smallest element. Next, we swap the element at index-0 with the smallest element. Next, we repeat this procedure, but only looking at the array starting at index-1.
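A sketch of the swapping version in Java (the method name is mine):
```java
/** Selection sort: repeatedly swap the minimum of a[i..] into position i. */
static void selectionSort(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        int min = i;
        for (int j = i + 1; j < a.length; j++) {   // scan for the smallest remaining
            if (a[j] < a[min]) {
                min = j;
            }
        }
        int tmp = a[i];                            // swap it into position i
        a[i] = a[min];
        a[min] = tmp;
    }
}
```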

### Runtime
**Best & Worst Case: Θ($N^2$)**
Since it takes **O(N)** time to loop through the array, and we loop through the array **N times**, this algorithm has a runtime of **Θ($N^2$)**. Note that **even if the array is already sorted**, we need to iterate through it to find the minimum, and then iterate through it again, and again, **N times**.
**Stable? Depends**
Consider an array [3A, 2, 3B, 1 ], where the 3s have been labeled to differentiate between them. The algorithm will find 1 to be the smallest, and will swap it with 3A, pushing 3A after 3B, making it not stable. However, it is also possible to make it stable if we implement Selection Sort in a different way, which involves creating a new array instead of swapping the minimum elements.
## Merge Sort
Given an array, divide it into two equal halves, and call mergesort recursively on each half. Take the recursive leap of faith and assume that each half is now sorted. Merge the two sorted halves. Merging takes a single iteration through both arrays, and takes **O(N)** time. The base case is if the input list is just 1 element long, in which case, we return the list itself.
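A sketch in Java (out of place, returning new arrays; the names are mine):
```java
import java.util.Arrays;

/** Merge sort: split in half, recursively sort each half, then merge. */
static int[] mergeSort(int[] a) {
    if (a.length <= 1) return a;                   // base case: already sorted
    int mid = a.length / 2;
    int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
    int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
    return merge(left, right);
}

/** A single O(N) pass through both sorted halves. */
static int[] merge(int[] left, int[] right) {
    int[] out = new int[left.length + right.length];
    int i = 0, j = 0;
    for (int k = 0; k < out.length; k++) {
        // Stability: on ties, <= takes from the left half first.
        if (j >= right.length || (i < left.length && left[i] <= right[j])) {
            out[k] = left[i++];
        } else {
            out[k] = right[j++];
        }
    }
    return out;
}
```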

### Runtime
**Best case & Worst Case:** $Θ(N \textrm{log } N)$
The algorithm halves the array at every level, so there are $\textrm{log } N$ levels of recursion, and merging does **O(N)** total work per level, giving $Θ(N \textrm{ log } N)$ no matter what the input looks like.
**Stable? Yes**
Merge sort is made stable by being careful during the merging step of the algorithm. If deciding between 2 elements that are the same, one in the left half and one in the right half, pick the one from the left half first.
## In-place Heap Sort
Heap sort places all the elements of the array into a heap. It then removes the elements one by one from the heap and places each into its final position in the array.
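A sketch in Java. To sort ascending in place, it uses a **max** heap over a prefix of the array, so each removed maximum lands in its final spot at the end (the names are mine):
```java
/** In-place heap sort: build a max heap, then repeatedly move the max to the end. */
static void heapSort(int[] a) {
    int n = a.length;
    for (int i = n / 2 - 1; i >= 0; i--) {           // bottom-up heap construction
        bubbleDown(a, i, n);
    }
    for (int end = n - 1; end > 0; end--) {
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp; // max goes to its final spot
        bubbleDown(a, 0, end);                       // re-heapify a[0..end)
    }
}

/** Bubble a[i] down within the heap a[0..n). */
static void bubbleDown(int[] a, int i, int n) {
    while (2 * i + 1 < n) {
        int child = 2 * i + 1;                       // left child
        if (child + 1 < n && a[child + 1] > a[child]) {
            child++;                                 // pick the larger child
        }
        if (a[i] >= a[child]) return;                // heap property holds
        int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
        i = child;
    }
}
```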

### Runtime
**Best Case:** $Θ(N)$
Say that all the elements in the input array are equal. In this case, creating the heap only takes **O(N)** time, since there is no bubbling-down to be done. Also, removing from the heap takes constant time for the same reason. Since we remove **N** elements, and creating the heap takes **O(N)** time, the overall runtime is **O(N)**.
**Worst Case:** $Θ(N \textrm{log } N)$
For a general array, we need to bubble nodes while building the heap and again while removing each of the **N** elements; each bubbling operation costs **O(log N)**, so the overall runtime is **Θ(N log N)**.
**Stable? Depends, but difficult to make stable**
Our implementation of heapsort is usually not stable (If that's hard to believe, it helps to run through a few examples).
## Quick Sort
The key idea here is partitioning. We start by picking a pivot, which is an element of our list. Then we group together the elements smaller than the pivot, equal to the pivot, and greater than the pivot. After this partitioning is done, the elements equal to the pivot are in the right place. Then, quicksort the "less than" and "greater than" subarrays.
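A sketch in Java of this three-way partitioning version, using extra lists rather than sorting in place, with the first element as the pivot (the names and pivot choice are mine):
```java
import java.util.ArrayList;
import java.util.List;

/** Quicksort with three-way partitioning (not in place). */
static List<Integer> quickSort(List<Integer> items) {
    if (items.size() <= 1) return items;           // base case
    int pivot = items.get(0);                      // pivot choice: first element
    List<Integer> less = new ArrayList<>();
    List<Integer> equal = new ArrayList<>();
    List<Integer> greater = new ArrayList<>();
    for (int x : items) {                          // one Θ(N) partitioning pass
        if (x < pivot) less.add(x);
        else if (x > pivot) greater.add(x);
        else equal.add(x);
    }
    List<Integer> out = new ArrayList<>(quickSort(less));
    out.addAll(equal);                             // pivots are in their final place
    out.addAll(quickSort(greater));
    return out;
}
```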

### Runtime
**Best Case:** $Θ(N)$
In the best case, we have an array with the same elements such as [1, 1, 1, 1, 1]. When we choose a pivot, compare every element in the array to the pivot, and partition them into "greater than" and "less than" arrays, the two arrays will end up being empty. There will be nothing to recurse on. Therefore, the runtime mainly comes from comparing the pivot to every element in the array, which is **Θ(N)**.
**Worst Case:** $Θ(N^2)$
In the worst case, we choose a bad pivot that completely fills either the "greater than" or "less than" array and leaves the other one empty.
An example is an array that is already sorted when we pick the leftmost element as the pivot every time. Consider the array [1, 2, 3, 4, 5]. If we choose the leftmost element as the pivot every time, the "less than" array will always be empty and the "greater than" array will hold all the remaining elements. Therefore, we would be iterating through approximately **N elements** in each of the **N recursive calls**.
**Average Runtime:** $Θ(N \textrm{log } N)$
If we have a reasonably good pivot that splits the elements into roughly equal subarrays (and the array is full of distinct elements), we would have **log N levels of recursive calls** and iterate through all **N** elements for comparison at each level. This is the same recurrence as merge sort!
## Resources
### Practice Problems
[Week 6 Resources](https://drive.google.com/open?id=13H_u9SJAFAgDkyd25ObnXyxYsY3CS335): relevant practice problems compiled by Tina
[Spring 2018 Final Solutions](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/hzd9m3irk83yj/jh8mo7ebxuce/final_sp18_sol.pdf) & [Video Walkthrough](https://www.youtube.com/watch?v=XJuPjSbexWM): Unfortunately, the blank exam was not provided. You'll have to edit the pdf yourself and cover the solutions to try the problems.
### Sorting Visualizer
[VisuAlgo Sorting Visualizer](https://visualgo.net/en/sorting)
### Relevant CSM Crib Sheets
[Graph Traversal Crib Sheet](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p7xcn6876ht/jgu00wqngg4d/graph_traversals_crib_sheet.pdf): covers DFS, BFS, and topological sort
[Graph Search Crib Sheet](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p7xcn6876ht/jgu00bofvnzp/graph_search_crib_sheet.pdf): covers Dijkstra’s Algorithm and A* search
[MST Crib Sheet](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p7xcn6876ht/jgu38oepdual/mst_crib_sheet.pdf): covers minimum spanning trees, Prim's Algorithm, and Kruskal's algorithm
[Sorting Crib Sheet](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p7xcn6876ht/jgu01m3qenkc/sorting_crib_sheet.pdf): covers sorts learned in class
### Final Review Slides
[Spring 2018 Final Review Slides I](https://docs.google.com/presentation/d/1Ix3qTmIiEwV7BrJq_Jw6yTMs4EGgWUl4MpRC2nrFB14/edit#slide=id.p) and [Recording](https://www.youtube.com/watch?v=O8OnpdWLk-o&feature=youtu.be): covers Inheritance, Dynamic Method Selection, Casting, Asymptotics, Overriding/Overloading, Interfaces, Abstract Classes, Exceptions, Hashing, Autoboxing, Binary Search Trees, 2-3 Trees, and Left Leaning Red-Black Trees
[Spring 2018 CSM Final Review Slides II](https://docs.google.com/presentation/d/14Y8SKeI7KsXhK6eYn0yhHbb2w_0Ptt4Ui3ePpwL9vM4/edit#slide=id.p) and [Recording](https://www.youtube.com/watch?v=mDujYFkPgFc&feature=youtu.be): covers Stacks, Queues, Graphs, Shortest Paths, Minimum Spanning Trees, Sorting, and Tries
[Spring 2018 HKN Final Review Slides](https://docs.google.com/presentation/d/1Uq8vBInTcMWmD1kF2S64EP6SiHqjS-c-CnWi210dHQo/edit#slide=id.p8): covers Asymptotics, Sorting, Balanced Search Trees, Heaps, Graphs, Dijkstra’s / A*, Minimum Spanning Trees