# Answer Sheet II
## Week 9 - Binary Trees

### Activity 1: Maintaining order in Arrays and Linked Lists - reprise
What is the worst-case time complexity of inserting a new value, removing an existing value, and testing membership of a value, for the array and the linked list data structure, when each insertion and removal is done in such a way that the data is kept sorted? Provide your answers in “Big O” notation, e.g., O(n), O(n²), O(log n), etc.
Note that insertion and removal of values both involve two steps:
**First** finding the insert position or the position of the value that is to be removed, and **subsequently** inserting / removing the value.
Make sure to take both of these steps into account when filling in the table listed below.

* Answer: All entries in this table are O(n), except for membership testing in a sorted array, which is O(log n) thanks to binary search. Even though the insert/remove position in a sorted array can also be found in O(log n), the subsequent shifting of elements costs O(n); in a linked list, even finding the position already requires an O(n) traversal.
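The array case can be sketched with `std::vector`: the position is found with binary search in O(log n), but the insertion itself shifts all later elements, which is O(n). (A minimal standalone sketch, not part of the course code; `sorted_insert` is a hypothetical helper.)

```cpp
#include <algorithm>
#include <vector>

// Insert a value into a sorted vector, keeping it sorted.
void sorted_insert(std::vector<int>& values, int value) {
    // O(log n): binary search for the insert position
    auto pos = std::lower_bound(values.begin(), values.end(), value);
    // O(n): all elements at and after pos shift one place to the right
    values.insert(pos, value);
}
```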

---
---
---






### Activity 2: Recognizing BSTs
Below, three binary trees are shown. Only one of these three trees is a binary search tree - which one is it?
For the two trees that are not binary search trees, indicate which values violate the order invariant.

* Answer: Tree C is the only tree that is a binary search tree
---
---
---

### Activity 3: Building a BST
The provided source code (file bintree_node.h) contains a definition of a binary tree node, which is partially given below:
```cpp
class bintree_node {
public:
    bintree_node(int value, bintree_node* parent = nullptr); // constructor
    const bintree_node* parent() const; // the node's parent
    const bintree_node* left() const;   // the node's left subtree
    const bintree_node* right() const;  // the node's right subtree
    int value() const;                  // the node's value
};
```
Trees can be composed by constructing a root bintree_node, passing left and right subtrees as (nested) arguments to the constructor. The class assumes that subtrees passed to the constructor were dynamically allocated, as the root node will release this memory on destruction. The following code, for example, constructs the depicted tree with three levels, containing the values −2, 0, 1, and 2:
```cpp
bintree_node root{0,
    new bintree_node{-2},
    new bintree_node{2,
        new bintree_node{1},
        nullptr}};
bintree_writer::write("example.dot", root);
```

Use the bintree_node class to build three different binary search trees (i.e., the order invariant must hold), each of which contains the first five prime numbers: 2, 3, 5, 7, 11. You must build three different trees:
• A tree that has precisely three levels,
• a tree that has four levels,
• a tree that has five levels.
Use the bintree_tester::validate function to determine whether your tree is valid. Furthermore, you can use the bintree_writer::write function (see line 6 of the example code) to write your tree to a file.
---
* Answer:
```cpp=
bintree_node root_3{5,
    new bintree_node{3,
        new bintree_node{2},
        nullptr},
    new bintree_node{7,
        nullptr,
        new bintree_node{11}}};

bintree_node root_4{3,
    new bintree_node{2},
    new bintree_node{5,
        nullptr,
        new bintree_node{7,
            nullptr,
            new bintree_node{11}}}};

bintree_node root_5{2,
    nullptr,
    new bintree_node{3,
        nullptr,
        new bintree_node{5,
            nullptr,
            new bintree_node{7,
                nullptr,
                new bintree_node{11}}}}};
```
---
---
---



### Activity 4: Searching for values
Implement the function bintree_node::find, which returns a pointer to the node containing a given value, or nullptr if the value is not present. The algorithm sketched below provides a recursive approach, but you are free to implement this iteratively.

The main function of the project contains code to test your implementation.
---
* Answer:
```cpp=
bintree_node* bintree_node::find(int value) {
    if (value == this->value()) {
        return this;
    }
    else if (value < this->value() && m_left != nullptr) {
        return m_left->find(value);
    }
    else if (value > this->value() && m_right != nullptr) {
        return m_right->find(value);
    }
    return nullptr;
}
```
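For reference, the same search can also be written iteratively, as the activity allows. A standalone sketch below uses a minimal hypothetical node struct (not the course's bintree_node class):

```cpp
struct node {
    int value;
    node* left;
    node* right;
};

// Iterative BST search: walk down the tree, going left for smaller values
// and right for greater ones, until the value is found or a null child is
// reached.
node* find(node* current, int value) {
    while (current != nullptr && current->value != value)
        current = (value < current->value) ? current->left : current->right;
    return current;
}
```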
---
---
---

### Activity 5: Inserting values
The function bintree::insert allows for the insertion of a value into the tree. If the tree is empty, the function will dynamically allocate a root node. If the tree is not empty, then the function delegates the insertion to the root node:
```cpp
bool bintree::insert(int value) {
    if (m_root == nullptr) {
        m_root = new bintree_node(value);
        return true;
    }
    else {
        return m_root->insert(value);
    }
}
```
Implement the function bintree_node::insert.
Test your function by recreating the same trees of Activity 3. Note that the order in which you insert values determines the final structure of the tree.
You can visualize the structure of your tree by pasting the code produced by the bintree_writer::write_dot function into the left pane at the graphviz.it website.
---
* Answer:
```cpp=
bintree_node* bintree_node::insert(int value) {
    if (value == m_value) {
        return nullptr; // duplicate: nothing inserted
    }
    else if (value < m_value) {
        if (m_left == nullptr) {
            set_left(new bintree_node(value, this));
            return m_left;
        }
        return m_left->insert(value);
    }
    else {
        if (m_right == nullptr) {
            set_right(new bintree_node(value, this));
            return m_right;
        }
        return m_right->insert(value);
    }
}
```
---
---
---

### Activity 6: Properties of the minimum value
Discuss and answer the following three questions about binary search trees:
* Can the minimum value in a node’s left subtree be greater than the minimum value in a node’s right subtree? Why (not)?
* Can the node containing the minimum value have a left subtree? Why (not)?
* Which traversal strategy will lead to the node containing the minimum value of a tree?
---
* Answer: The minimum value in a node’s left subtree can’t be greater than the minimum value in a node’s right subtree, because the order invariant requires every value in the left subtree to be at most the node’s value, and every value in the right subtree to be at least it. The node containing the minimum value can’t have a left subtree, because that subtree would contain values smaller than that “minimum value”, contradicting its minimality. Traversal strategy: repeatedly follow the left child; the first node without a left subtree holds the minimum value.
---
---
---

### Activity 7: Finding the minimum value
The class bintree_node defines a (private) function to find the node containing the minimum value. This function currently returns a reference to the current instance, which is incorrect:
```cpp
bintree_node& bintree_node::minimum() {
    return *this;
}
```
Re-implement this function so that it correctly returns a reference to the node containing the minimum value. Implement the strategy that you have designed in the previous activity.
---
* Answer:
```cpp=
bintree_node& bintree_node::minimum() {
    if (m_left == nullptr)
        return *this;
    else
        return m_left->minimum();
}
```
---
---
---

### Activity 8: Removing leafs
In case the node is a leaf node, then removal is straightforward: its parent node should no longer refer to it as its (left or right) child.
Implement the first case of the bintree_node::remove function, by using the replace_child function of the bintree_node class. Don’t forget that the function must eventually return the address of the node for which memory must be released.
The main function contains code to test your implementation.
---
* Answer:
```cpp=
if (m_left == nullptr && m_right == nullptr) {
    if (m_value < m_parent->m_value)
        m_parent->set_left(nullptr);
    else
        m_parent->set_right(nullptr);
    return this;
}
```
---
---
---

### Activity 9: Removing nodes that have one child
Implement the second and third case treated in the function bintree_node::remove().
These two cases apply to nodes that have either only a left child, or only a right child. Use the bintree_node::set_left and bintree_node::set_right functions to update the left or right subtree of a node. These functions take care of also updating the parent of the new child node.
The main function contains code to test your implementation. Be sure to visually verify the correctness of the resulting trees by pasting the contents of the generated ’.dot’ files that were generated into the graphviz.it website.
---
* Answer:
```cpp=
else if (m_left == nullptr) {
    // only a right child: let the parent adopt it
    // (set_left/set_right also update the child's parent pointer)
    if (m_value < m_parent->m_value)
        m_parent->set_left(m_right);
    else
        m_parent->set_right(m_right);
    return this;
}
else if (m_right == nullptr) {
    // only a left child: let the parent adopt it
    if (m_value < m_parent->m_value)
        m_parent->set_left(m_left);
    else
        m_parent->set_right(m_left);
    return this;
}
```
---
---
---

### Activity 10: Moving values around
In the two binary search trees shown below, the root node does not contain a value. Also, each node is labeled with a variable, instead of a number. But even though we do not know the actual values stored in each of the nodes, we can still make statements about their ordering. For example, in the left tree, we know that E ≤ D, because E is in the left subtree of D.

If we allow for duplicates, then which two values can safely be copied into the root node of the left tree, without violating the order invariant of the tree?
Discuss and answer this question for both the **left** tree (choose two of the labels A . . . G), and the **right** tree (choose two of the labels A . . . J). In your answer, include the necessary inequalities (such as B ≥ A and D ≥ A in the left tree) of the trees.
---
* Answer: Left tree: value B or value F can safely be copied into the root node, because A ≤ B ≤ C ≤ F ≤ E ≤ D ≤ G. For the right tree, value E or G can safely be copied into the root.
---
---
---

### Activity 11: Removing full nodes
Complete your implementation of the bintree_node::remove() function, by taking care of the final case, in which the node has two child nodes. As outlined above, the function should perform the following four steps:
1. In the right subtree, find the node M that contains the minimum value.
2. Copy the value from M into the full node.
3. Remove node M from the tree.
4. Return the address of M to the caller.
---
* Answer:
```cpp=
else {
    bintree_node& m = m_right->minimum(); // 1. node M: minimum of the right subtree
    m_value = m.m_value;                  // 2. copy its value into this node
    return m.remove();                    // 3.+4. detach M and return its address
}
```
---
---
---

### Activity 12: Performance analysis
In our current implementation, insertion and removal can result in extremely “unbalanced” trees. What are the worst-case time complexities of the three operations, insertion, membership testing, and removal, in our current implementation? Give your answers in Big-O notation.
Suppose that we can always maintain a tree that has as few levels as possible, with a constant time complexity O(1). This means that after inserting or removing a value, we can simply repair any “unbalance” without any real overhead.
What would be the worst-case time complexities of the three operations in this case?
* Answer: In the current implementation, the worst-case time complexities are O(n) for all three operations. The worst case is when each node only has a single child. In the case that we can maintain as few levels as possible, all three worst-case time complexities would be O(log n).
---
---
---

### Activity 13: Restoring balance (optional)
Implement the function bintree_node::from_vector, partially listed below. The main function contains code to test your implementation.
```cpp=
bintree_node* bintree_node::from_vector(const std::vector<int>& vector, int min_idx, int max_idx) {
    // 1. compute index of middle element
    auto idx_mid = ...
    // 2. recursively call the function to create a node from the lower indices of the vector
    auto left = from_vector(vector, ...);
    // 3. recursively call the function to create a node from the higher indices of the vector
    auto right = from_vector(vector, ...);
    // 4. return a new node
    return new bintree_node{vector[idx_mid], left, right};
}
```
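A possible standalone sketch of the divide-and-conquer idea behind from_vector, using a hypothetical minimal node struct rather than the course's bintree_node class: the middle element becomes the root, and the halves on either side become the left and right subtrees.

```cpp
#include <vector>

struct node {
    int value;
    node* left;
    node* right;
};

// Build a height-balanced BST from a sorted vector: pick the middle element
// as the root and recurse on the two halves.
node* from_vector(const std::vector<int>& v, int min_idx, int max_idx) {
    if (min_idx > max_idx)
        return nullptr;                    // empty range: no node
    int idx_mid = (min_idx + max_idx) / 2; // middle element becomes the root
    node* left = from_vector(v, min_idx, idx_mid - 1);
    node* right = from_vector(v, idx_mid + 1, max_idx);
    return new node{v[idx_mid], left, right};
}
```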
## Week 10 - Self-Balancing Trees

### Activity 1: Time complexity
Assume that n values are inserted into a binary search tree, in such a way that the tree has as few levels as possible.
* How many levels does the tree have? Give a formula for the minimum number of levels when the tree contains n values.
* What is the worst-case time complexity of removal and membership testing in a tree where the number of levels is kept as low as possible?
---
* Answer: Minimum number of levels = ⌊log₂ n⌋ + 1. The worst-case time complexity of both operations is O(log n).
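The formula can be checked with a short loop (a standalone sketch; min_levels is a hypothetical helper, not part of the course code):

```cpp
// Minimum number of levels needed to store n values: floor(log2(n)) + 1.
// Each extra level doubles the number of nodes that fit, so we count how
// often n can be halved before reaching zero.
int min_levels(int n) {
    int levels = 0;
    while (n > 0) {
        n /= 2;
        ++levels;
    }
    return levels;
}
```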
---
---
---



### Activity 2: A naive approach
Suppose the height function given above is run, passing the root of a tree of n nodes as an argument.
* Answer the following two questions:
* How many nodes does the function visit?
* What is the time complexity of the height function given above? Give your answer in Big O notation.
* Answer: The function visits all n nodes, so the worst-case time complexity is O(n).
---
---
---

### Activity 3: Remembering tree heights
The bintree_node class has a member variable m_height for keeping track of the height of a node. This variable must be set to its correct value by the function bintree_node::update, which is not implemented yet.
The function bintree_node::update is called by the bintree class whenever a node is inserted into or removed from the tree (that’s why the bintree_node insert and remove functions return pointers to the corresponding nodes). The function is called in a particular order: update is called on a node before its parent node, and every ancestor of the node is updated next, until the root of the tree is updated. This means that the function can safely assume that the heights of the node’s children are correct, so they don’t need to be recomputed.
Implement the function update in the bintree_node class. The function must set the m_height member variable to the node’s height.
The main function contains code to test your implementation. Visualize the produced .dot files on the graphviz.it website to verify the heights are correct.
---
* Answer:
```cpp=
void bintree_node::update_height() {
    // a missing child counts as height -1, so a leaf gets height 0
    int left_height = (m_left != nullptr) ? m_left->m_height : -1;
    int right_height = (m_right != nullptr) ? m_right->m_height : -1;
    m_height = std::max(left_height, right_height) + 1;
}
```
---
---
---


### Activity 4: Do it yourself: Balance factors
Study the two trees shown in Figure 3.
Fill out the two incomplete tables shown below, which list the heights and balance factors of each of the nodes in the two trees (Hint: first fill in the heights of the nodes before filling in the balance factors).

To find out if you have filled out the tables correctly, compute for every node the product of its height and balance factor. If correct, the sum of these 15 products is equal to -4, and the sum of all node heights is equal to 13.
Which one of the two trees is height-balanced?
---
* Answer: The left tree is height-balanced, the right tree is not.
### Activity 5: Computing the balance factor
Implement the function bintree_node::balance, which must compute and return the node’s balance factor, assuming that the heights of the left and right child are correctly set.
The main function contains code to test your implementation.
---
* Answer:
```cpp=
int bintree_node::balance() const {
    // assumes the children's heights are up to date; a missing child has height -1
    int left_height = (m_left != nullptr) ? m_left->m_height : -1;
    int right_height = (m_right != nullptr) ? m_right->m_height : -1;
    return right_height - left_height;
}
```
---
---
---

### Activity 6: Balancing acts
Study the binary search tree given in Figure 4a. The letters in the tree represent numbers. Without knowing which numbers these letters represent, we can still infer relationships because we know that the order invariant must hold. For example, we can infer that B ≤ A, and A ≤ D ≤ C.
Each of the three trees of Figures 4b, 4c, and 4d show a restructuring of the tree that moves the balance of the tree to the left side. Not all of these three trees are guaranteed to be valid binary search trees, though.
Which of the three trees is correct in the sense that it preserves the order invariant of a binary search tree? And, more importantly, why is that restructuring correct?
---
* Answer: The order invariant of the initial tree gives B ≤ A ≤ D ≤ C ≤ E. The only tree that preserves this invariant is tree 4c.
---
---
---

### Activity 7: Implement left rotation
Implement the function bintree_node::rotate_left, by reordering the following statements in such a way that the subtree rooted in the node is correctly rotated leftwards
```cpp
auto r = m_right;
m_right->set_left(this);
auto rl = m_right->m_left;
set_right(rl);
auto parent = m_parent;
parent->replace_child(this, r);
r->update_height();
update_height();
```
The main function contains code to test your implementation on an example tree
---
* Answer:
```cpp=
void bintree_node::rotate_left() {
    auto r = m_right;          // right child becomes the new subtree root
    auto rl = m_right->m_left; // its left subtree moves over
    auto parent = m_parent;
    m_right->set_left(this);   // this node becomes r's left child
    set_right(rl);
    parent->replace_child(this, r);
    update_height();           // update this node first: it is now r's child
    r->update_height();
}
```
### Activity 8: Implement right rotation
A right-rotation of a subtree mirrors a left-rotation. Implement the function bintree_node::rotate_right, by mirroring (left becomes right, and vice versa), the left rotation. The main function contains code to test your implementation on an example tree.
---
* Answer:
```cpp=
void bintree_node::rotate_right() {
    auto l = m_left;           // left child becomes the new subtree root
    auto lr = m_left->m_right; // its right subtree moves over
    auto parent = m_parent;
    m_left->set_right(this);   // this node becomes l's right child
    set_left(lr);
    parent->replace_child(this, l);
    update_height();           // update this node first: it is now l's child
    l->update_height();
}
```
---
---
---


### Activity 9: Left rotations and balance
Depending on the balance factor of the right subtree, a left-rotated right-heavy tree will have a different balance factor.
Repeat the same analysis as was done in the previous paragraph, in case the right subtree has a balance factor of -1, +1 or +2. Log the results in the following table:

If you have filled in this table correctly, then exactly one of the rows will have a value of -2 in the rightmost column.
---
* Answer:
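The table entries can be derived with plain height arithmetic instead of drawing trees. A standalone sketch (using the convention balance = right height minus left height; the helper below is hypothetical, with the height of the root's left subtree fixed at 0 as a reference):

```cpp
#include <algorithm>

// Given the balance factor r of the right child of a node with balance +2,
// compute the node's balance factor after a left rotation.
int balance_after_left_rotation(int r) {
    int a = 0;                              // height of the root's left subtree
    // the root's balance is +2, so the right child has height a + 2,
    // i.e. max(b_l, b_r) = a + 1
    int b_l = (r >= 0) ? a + 1 - r : a + 1; // right child's left height
    int b_r = (r >= 0) ? a + 1 : a + 1 + r; // right child's right height
    int new_left = std::max(a, b_l) + 1;    // old root becomes the left child
    return b_r - new_left;                  // balance of the new subtree root
}
```

For r = -1 this yields -2 (the single rotation does not fix the imbalance, which is why that case needs a rotation combo), while r = +1 and r = +2 both yield 0, matching the hint that exactly one row contains -2.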

---
---
---


### Activity 10: Right rotations and balance
Suppose you would right-rotate a tree that has a balance factor of -1. Depending on the balance factor of its left subtree, the rotated tree will have a different balance factor. In the paragraph above, it was shown that in case the left subtree has a balance factor of -1, the tree’s balance factor will be +1 after the rotation.
Repeat the same analysis in case the left subtree has a balance factor of 0 or +1. Log the results in the following table:

If you have correctly filled in this table, then the sum of the rightmost column will be equal to 4.
---
* Answer:
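This table can be derived the same way as in the previous activity, by mirroring the height arithmetic. A standalone sketch (balance = right height minus left height; the helper is hypothetical, with the height of the root's right subtree fixed at 0 as a reference):

```cpp
#include <algorithm>

// Given the balance factor l of the left child of a node with balance -1,
// compute the node's balance factor after a right rotation.
int balance_after_right_rotation(int l) {
    int a = 0;                           // height of the root's right subtree
    // the root's balance is -1, so the left child has height a + 1,
    // i.e. max(l_l, l_r) = a
    int l_l = (l >= 0) ? a - l : a;      // left child's left height
    int l_r = (l >= 0) ? a : a + l;      // left child's right height
    int new_right = std::max(l_r, a) + 1; // old root becomes the right child
    return new_right - l_l;              // balance of the new subtree root
}
```

This reproduces the worked example from the text (l = -1 gives +1), and the rightmost column sums to 4, as the hint requires.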

---
---
---

### Activity 11: Fixing right-heavy trees using rotation combos
In a right-heavy unbalanced (sub)tree, the (root) node has a balance factor of +2. There are three cases to consider: the balance factor of the node’s right child can be either -1, 0, or +1.
* For which of the three cases do we need to perform a right-left rotation?
* For the case where an RL rotation is applied, what are the possible balance factors of the tree after the RL rotation?
* For the cases where only a left rotation is applied, what are the possible balance factors of the tree after the left rotation?
---
* Answer:
* In case the right child has a balance factor of -1, then (see activity 9) left-rotating the tree would result in an unbalanced tree with a balance factor of -2. So a right-left rotation must be performed only if the right child has a balance factor of -1.
* After right-rotating the right child, the balance factor of the right child will be +1 or +2 (see activity 10). If we then left-rotate the tree, the resulting balance factor will be 0 (see activity 9).
* If only a left rotation is applied, then the balance factor of the right child is 0 or +1, and the resulting balance factor is either -1 or 0 (again, see activity 9).
---
---
---

### Activity 12: Fixing left-heavy trees
For a left-heavy (the height of the root’s left node is greater than the height of the right node) tree, we must mirror the approach taken in the previous activity. The balance factor of the node’s left child can be either -1, 0, or +1, again giving three cases to consider.
* For which of the three cases do we need to perform a left-right rotation?
* For the case where an LR rotation is applied, what are the possible balance factors of the tree after that rotation combo?
* For the cases where only a right rotation is applied, what are the possible balance factors of the tree after the rotation?
---
* Answer:
* This mirrors the previous activity: in case the left child has a balance factor of +1, then right-rotating the tree would result in an unbalanced tree with a balance factor of +2. So a left-right rotation must be performed only if the left child has a balance factor of +1.
* After left-rotating the left child, the balance factor of the left child will be -1 or -2. If we then right-rotate the tree, the resulting balance factor will be 0.
* If only a right rotation is applied, then the balance factor of the left child is 0 or -1, and the resulting balance factor is either +1 or 0.
---
---
---

### Activity 13: Rebalancing the tree
The function bintree_node::rebalance checks whether the node is the root of a left-heavy or right-heavy tree. If this is the case, control is passed to the function fix_left_heavy or fix_right_heavy, which must perform the correct rotations to resolve the imbalance. Implement the fix_left_heavy and fix_right_heavy functions in the bintree_node class. This means that you must apply your observations of the previous two activities: based on the balance factor of a child node, these functions must perform either one or two rotations. The main function contains code to test your implementation.
```cpp=
void bintree_node::fix_left_heavy() {
    switch (m_left->balance()) {
    case 0:
    case -1:
        rotate_right();        // a single right rotation suffices
        break;
    case +1:
        m_left->rotate_left(); // left-right rotation combo
        rotate_right();
        break;
    }
}
void bintree_node::fix_right_heavy() {
    switch (m_right->balance()) {
    case 0:
    case +1:
        rotate_left();           // a single left rotation suffices
        break;
    case -1:
        m_right->rotate_right(); // right-left rotation combo
        rotate_left();
        break;
    }
}
```
## Week 11 - Heaps and Priority Queues

### Activity 1: Comparing different data structures
Suppose that you would consider the following three data structures for storing a list of prioritized tasks:
* A sorted array
* A sorted linked list
* A balanced binary search tree
What would be the worst-case time complexities for the three required operations, for each of the three data structures?

Which data structure offers the best performance, and why?
---
---
---

### Activity 2: Storing the minimum value in the root node
Describe how the implementation of a balanced binary search tree can be modified in such a way that the minimum value is always stored inside the root node.
*Hint: it’s impossible to do this while keeping the complete tree balanced. In particular, the tree’s root node will be extremely unbalanced, but the remainder of the tree can still be kept balanced.*
---
* Answer: When inserting a value that is smaller than the root value, create a new root node from the inserted value, and make the old root the right child of the new root. When deleting the value that is contained in the root node, replace the root value with the smallest value in the right subtree. These two adaptations will make sure that the smallest value is always kept in the root node.
---
---
---


### Activity 3: Identify valid heaps
Below three binary trees are shown. Two of the three are valid heaps.

Which trees are heaps? For each heap, indicate if it’s a min-heap or a max-heap. For the tree that is not a heap, describe why it is not a heap.
---
* Answer: The leftmost and rightmost binary trees are both valid min-heaps.
---
---
---


### Activity 4: Do it yourself
In the heap of Figure 2d, suppose that the value 4 is added as a right child of 13. How many, and which specific swaps are needed to restore the heap invariant?
---
* Answer: There are two swaps: first, value 4 is swapped with value 13, and then value 4 is swapped with value 5. After these two swaps, the path from the root to the newly inserted node contains the values 1, 4, 5, and 13.
---
---
---


### Activity 5: Worst-case time complexities
What are the worst-case time complexities of:
* Finding the minimum value
* Removing the minimum value
* Inserting a new value
* Finding an arbitrary value
for a binary heap and for a balanced binary search tree? Give your answer in Big O notation.

---
---
---

### Activity 6: Moving to arrays
Suppose that the value 53 is added to the binary heap shown in Figure 4.
* At how many positions can you add this new value as a leaf node to the tree without breaking the heap invariant?
* At which index would you need to store the new value if you don’t want any gaps in the array’s data? With which position in the tree does your chosen index correspond?
---
* Answer: At eight positions: as either a left or right child of the values 29, 23, 47, or 17. If no gaps are permitted, then the new value is stored at index 11, which corresponds to the left child of 87.
---
---
---

### Activity 7: Index computations
The private functions left_child_index and right_child_index of the minheap class return the index of the left or right child of any node given by its index. Similarly, the function parent_index returns the index of a node’s parent. These functions are not yet implemented - implement them so that:
* left_child_index returns the index of the left child of the node with the given index,
* right_child_index returns the index of the right child of the node with the given index,
* parent_index returns the index of the parent of the node with the given index,
* the functions return -1 if the node with the given index does not have the requested child/ parent.
The main function provides code to test your implementation
---
* Answer:
```cpp=
int minheap::parent_index(int index) const {
    return index > 0 ? (index - 1) / 2 : -1;
}
int minheap::left_child_index(int index) const {
    return (index * 2) + 1 < size() ? (index * 2) + 1 : -1;
}
int minheap::right_child_index(int index) const {
    return (index * 2) + 2 < size() ? (index * 2) + 2 : -1;
}
```
---
---
---


### Activity 8: Implement the find-min operation
The find-min operation of our binary heap is represented by the minimum function. Implement the minimum function.
The main function provides code to test your implementation.
---
* Answer:
```cpp=
const task& minheap::minimum() const {
    return m_values[0]; // the root of a min-heap holds the minimum value
}
```
---
---
---

### Activity 9: Bubbling up
Our binary heap’s insert function is already given. It appends the new value to the array, and then repairs the binary heap by “bubbling up”.
```cpp
void minheap::insert(const task& value) {
    ensure_capacity(m_size + 1);
    m_values[m_size++] = value;
    bubble_up(m_size - 1);
}
```
Implement the bubble_up function, which must repair any heap violation caused by inserting the value: if the value is smaller than its parent value, then these two values are swapped. This is repeated until the heap invariant is restored.
Make sure to use the private swap(int, int) function of the minheap class, which swaps the two tasks stored at the given indices.
The main function provides code to test your implementation.
---
* Answer:
```cpp=
void minheap::bubble_up(int index) {
    while (index > 0 && m_values[index] < m_values[parent_index(index)]) {
        swap(index, parent_index(index));
        index = parent_index(index);
    }
}
```
---
---
---

### Activity 10: Bubbling down
The minheap class already has a function delete_min, which deletes the minimum value from the heap by replacing it with the last value in the array. The function calls the bubble_down function to repair the possible violation of the heap invariant this has caused.
```cpp
void minheap::delete_min() {
    m_values[0] = m_values[--m_size];
    bubble_down(0);
}
```
Implement the bubble_down function. This function must move the new root value downwards, by swapping it with the smaller of its child values (but only if it is greater than that child value!).
The main function provides code to test your implementation. Again, make sure to use the swap(int, int) function.
---
* Answer:
```cpp=
void minheap::bubble_down(int index) {
    while (true) {
        int left = left_child_index(index);
        int right = right_child_index(index);
        // find the smallest of the node and its (existing) children
        int smallest = index;
        if (left != -1 && m_values[left].priority < m_values[smallest].priority)
            smallest = left;
        if (right != -1 && m_values[right].priority < m_values[smallest].priority)
            smallest = right;
        if (smallest == index)
            return; // heap invariant restored
        swap(index, smallest);
        index = smallest;
    }
}
```
---
---
---


### Activity 11: Bottom up - do it yourself
The previous paragraphs explain the top-down approach to building a heap from a list of values. The bottom-up approach visits child nodes before their parent nodes, bubbling values down when necessary. This means that the array is iterated from end to begin. The left figure below shows a heap that is initialized from a list of seven values. The right figure shows the heap after the heap invariant is restored.

Applying the bottom-up approach to the left heap means that value 6 (at index 2) is the first value that is bubbled down, then 5 (at index 1), then 7. Fill in the table below with the contents of the array after each swap, and indicate per step which values were swapped. How many swaps are required to restore the heap invariant?
---
* Answer:

---
---
---

### Activity 12: Turning vectors into heaps
The main.cpp file contains the function generate_tasks. This function generates a vector of a given number of tasks, ordered in such a way that for both the top-down and bottom-up approach the worst-case number of swaps are required.
Carry out the following experiment: use the generate_tasks function to generate vectors of different lengths, and run the heapify operation in both the bottom-up and top-down approach. You can control which approach is followed, through the static member variable heapify_top_down of the minheap class.
Put your results into the table below. From the table, derive the worst-case time complexity of each of the two methods - choose from O(log n), O(n), O(n²), or O(n log n).
Answer the following questions:
* What is the worst-case time complexity when bubbling values up, from top to bottom?
* What is the worst-case time complexity when following the bottom-up approach, bubbling values downwards?
* Which of the two strategies does the function std::make_heap in the C++ standard library use?
---
* Answer: The worst-case time complexity when bubbling values up is O(n log n). The worst-case time complexity when bubbling values down is O(n). The C++ standard library uses the faster of the two - bubbling values down.

---
---
---

### Activity 13: Heaps and sorting
Use the generate_vector function to build a heap consisting of 10 values. Next, in a loop, keep on deleting the minimum value from the heap, until the heap is empty.
Inspect the contents of the values() of the heap after all values have been deleted. What noteworthy observation can you make (hint: look at the priorities of the tasks)? Can you explain this observation?
---
* Answer: If you look at the contents of the values of the heap after all values have been deleted, you’ll notice that it contains the values sorted from large to small. This is because, iteratively, the smallest element has been deleted from the heap, which means it was swapped with the element at index n − 1. Next, at index n − 2, the second smallest element is stored, etc.
---
---
---


### Activity 14: In-place heap sort (optional)
Write a function (e.g., in the main.cpp file) that takes a vector and sorts it in ascending order (lower values precede higher values), without using the given min_heap class.
```cpp
void heap_sort(std::vector<int>& values);
```
What is the worst-case time complexity of heap_sort (use n to denote the number of elements in the vector)? Would this way of sorting be an efficient approach to sorting a linked list? Why (not)?
## Week 12 - Graphs - Representations


### Activity 1: Applications of graphs
There are many real-world phenomena that can be represented by graphs. Search the internet for at least five things that are typically represented by a graph data structure. Search for both undirected and directed graphs. You may choose at most three graphs of each kind (directed and undirected).
For each of the graph representations that you find, answer the following questions:
* What does the graph represent?
* What do the vertices and edges represent?
* Are the edges directed or undirected? Why?
---
* Answer:
* Navigation Application
    * The graph represents the area the map covers (e.g., a town, city, country, or the world).
    * The nodes represent intersections and the edges represent streets. The structure of the graph allows a traversal to find the fastest route from A to B.
    * The edges are **undirected** since roads/edges can be traversed both ways.
* Search Engine
    * The graph represents a collection of webpages and their connections.
    * The nodes represent web pages and the edges represent links.
    * The edges are **directed** since links/edges go one way.
* Social Network
    * The graph represents a collection of users and their connections.
    * The nodes represent users and the edges represent friendships.
    * The edges are **undirected** since friendships/edges are considered bidirectional.
* Cryptocurrency (DAG)
    * The graph represents the transaction history.
    * The nodes represent transactions and the edges connect transactions, adding validity to them.
    * The edges are **directed** since the edges point forward in time.
* Product Recommendation
    * The graph represents connections between goods sold by a business.
    * The nodes represent the goods and the edges represent related goods bought by other users.
    * The edges are **undirected** since the relation between goods/edges goes both ways.
---
---
---

### Activity 2: Representing graphs
Study the C++ code provided for this week’s activities. The class graph allows one to build either a directed or undirected graph.
Search the internet to find out what the difference is between an adjacency matrix and an adjacency list. Describe how the graph class stores its vertices and edges - does it use the concept of an adjacency list, or of an adjacency matrix?
---
* Answer:
* Adjacency Matrix
An adjacency matrix is a two-dimensional array with all vertices on both axes; entry (i, j) indicates whether an edge runs from vertex i to vertex j.
* Adjacency List
An adjacency list stores, for every vertex, a list of the vertices adjacent to it, so only the edges that actually exist take up space.
In this case the given code stores every vertex in a vector, while a map stores each vertex's connections, making it an adjacency list.
---
---
---

### Activity 3: Graph construction
Use the graph class to create an:
* Undirected graph that matches the graph of Figure 1.
* Directed graph that matches the graph of Figure 2.
Use the to_dot function to write each graph to a .dot file, and visualize it online to see if the graph matches the figure.
---
* Answer:
* **Undirected:**
```cpp
graph undirected{false};
for (auto& v : {"a", "b", "c", "d", "e", "f", "g"}) {
    undirected.add_vertex(v);
}
undirected.add_edge("a", "c");
undirected.add_edge("b", "c");
undirected.add_edge("c", "d");
undirected.add_edge("b", "d");
undirected.add_edge("c", "f");
undirected.add_edge("d", "e");
undirected.add_edge("f", "g");
```

* **Directed:**
```cpp
graph directed{true};
for (auto& v : {"a", "b", "c", "d", "e", "f", "g"}) {
    directed.add_vertex(v);
}
directed.add_edge("a", "b");
directed.add_edge("c", "a");
directed.add_edge("b", "d");
directed.add_edge("d", "c");
directed.add_edge("d", "e");
directed.add_edge("e", "f");
directed.add_edge("f", "g");
directed.add_edge("g", "e");
```

---
---
---

### Activity 4: A first traversal
Apply the recursive traverse_rec function to the two example (directed) graphs shown below.

On one of the graphs, calling the function will result in a problem. For this graph:
* Use the debugger to go through the code step by step, and observe what happens.
* Explain why your program results in an error.
On the other graph, the function will run without problems. For this graph, answer these two questions:
* How would you describe the order in which the vertices are visited?
* How many times is vertex d visited?
---
* Answer: On the rightmost graph, the recursive function never terminates because the graph contains a cycle: the function keeps revisiting the same vertices until the call stack overflows. On the left graph, the vertices are visited depth-first. Vertex d is visited three times.
---
---
---


### Activity 5: Find all reachable vertices
Change the definition of the traverse_rec function given above, by adding a parameter:
```cpp
// use a typedef to avoid using the long type name
typedef std::unordered_set<std::string> lookup;
void traverse_rec(const graph::vertex& v, lookup& visited);
```
Next, change the behaviour of the function traverse_rec by making use of the set visited that is passed to it:
* Use the function visited.find(...) to check whether a vertex (label) is already visited, and take appropriate action if it is.
* Use the function visited.insert(...) to add a vertex (label) to the visited vertices.
There are multiple ways to solve this, but each of the two functions find and insert needs to be called only once.
If done correctly, then the following code will change the colour of all the vertices reachable from the given vertex to red:
```cpp
lookup reachable{};
traverse_rec(graph["a"], reachable);
for (auto& label : reachable)
    graph.colour_vertex(label, colour::red);
graph.to_dot("reachable.dot");
```
Apply your function to the two graphs of the previous activity, using different vertices to start the traversal. Generate .dot files, and visualize them to verify the result (include the .dot files in your logbook).
---
* Answer:
```cpp
void traverse_rec(const graph::vertex& v, lookup& visited) {
    std::cout << "Visiting " << v.label() << std::endl;
    visited.insert(v.label());
    for (const auto& edge : v.edges()) {
        if (visited.find(edge.target().label()) == visited.end()) {
            traverse_rec(edge.target(), visited);
        }
    }
}
```
```dot
digraph undirected {
    rankdir = LR; node[shape=oval style=filled];
    a[label="a", fillcolor="red"];
    b[label="b", fillcolor="red"];
    c[label="c", fillcolor="red"];
    d[label="d", fillcolor="red"];
    e[label="e", fillcolor="red"];
    f[label="f", fillcolor="red"];
    g[label="g", fillcolor="red"];
    edge[dir = none];
    a -> c;
    b -> c;
    b -> d;
    c -> d;
    c -> f;
    d -> e;
    f -> g;
}
```
```dot
digraph directed {
    rankdir = LR; node[shape=oval style=filled];
    a[label="a", fillcolor="red"];
    b[label="b", fillcolor="red"];
    c[label="c", fillcolor="red"];
    d[label="d", fillcolor="red"];
    e[label="e", fillcolor="red"];
    f[label="f", fillcolor="red"];
    g[label="g", fillcolor="red"];
    edge[dir = forward];
    a -> b;
    b -> d;
    c -> a;
    d -> c;
    d -> e;
    e -> f;
    f -> g;
    g -> e;
}
```
---
---
---

### Activity 6: Recursion overhead
Use the following code to create increasingly larger (directed) graphs:
```cpp
graph g{true};
const int size = 100;
for (int i = 0; i < size; i++)
    g.add_vertex("a" + std::to_string(i));
for (int i = 1; i < size; i++)
    g.add_edge("a" + std::to_string(i - 1), "a" + std::to_string(i));
```
Test the traverse function given above on increasingly larger (change the size variable) graphs. What is the maximum recursion depth that your program can handle?
---
* Answer: The recursive function can handle graphs up to a size of about 16,136; beyond that, the call stack overflows.

---
---
---


### Activity 7: Iterative traversal
Add the provided traverse function given above to your program. In your main function, call the iterative function instead of the recursive one.
Try to run the traverse function on the large graphs on which the recursive function traverse_rec failed in the previous activity. Does the iterative version successfully process the graphs?
```cpp
void traverse_depth(const graph::vertex& start) {
    std::deque<graph::vertex> queue = {start};
    std::unordered_set<std::string> visited;
    while (!queue.empty()) {
        auto v = queue.back();
        queue.pop_back();
        if (visited.find(v.label()) == visited.end()) {
            std::cout << "Visiting " << v.label() << std::endl;
            visited.insert(v.label());
            for (auto& edge : v.edges()) {
                queue.push_back(edge.target());
            }
        }
    }
}
```
```
---
---
---

### Activity 8: Breadth-first search
Change line 15 of the traverse function listed earlier, by calling the function push_front instead of push_back. This will add adjacent vertices to the front of the queue, instead of the back.
Add a line that causes the function to print each vertex it visits to the console.
Describe the impact this relatively small change has on the order in which the vertices are visited by the function.
---
* Answer: Using a FIFO ordering results in the vertices being visited breadth-first: all vertices one edge away from the start are visited before any vertex two edges away, and so on.
```cpp
void traverse_breadth(const graph::vertex& start) {
    std::deque<graph::vertex> queue = {start};
    std::unordered_set<std::string> visited;
    while (!queue.empty()) {
        auto v = queue.back();
        queue.pop_back();
        if (visited.find(v.label()) == visited.end()) {
            std::cout << "Visiting " << v.label() << std::endl;
            visited.insert(v.label());
            for (auto& edge : v.edges()) {
                queue.push_front(edge.target());
            }
        }
    }
}
```

### Activity 9: Comparison
Create a function find_goal, which is based on the traverse function of the previous activity. Add a parameter that allows you to specify a "goal" vertex to find, and change its return type from void to bool, as shown in the listing below:
```cpp
bool find_goal(const graph::vertex& v, const std::string& goal);
```
Add the necessary code to the function so that it returns true in case the goal vertex was found, and false otherwise.
Apply the find_goal function to the two graphs of Figures 1 and 2, using both depth-first and breadth-first search (by changing how adjacent vertices are added to the queue). Which of the two approaches (depth-first or breadth-first) will find the shortest (in terms of number of edges) path to the goal state? Why?
---
* Answer: Breadth-first search is the approach that is guaranteed to find the shortest path in terms of number of edges, because it visits vertices in order of increasing distance from the start: the first time the goal is reached, it has been reached over a minimal number of edges. Depth-first search can still happen to reach a goal with fewer visits - when looking for vertex g in the right graph, depth-first finds the goal quicker than breadth-first, while for vertex d breadth-first beats depth-first by one move. The same can be observed in the left graph: for vertex h depth-first wins, but for vertex c breadth-first wins by a big margin. Which strategy performs fewer visits thus depends on where the goal lies, but only breadth-first guarantees that the path found is shortest.
```cpp
bool find_goal_d(const graph::vertex& start, const std::string& goal) {
    std::deque<graph::vertex> queue = {start};
    std::unordered_set<std::string> visited;
    while (!queue.empty()) {
        auto v = queue.back();
        queue.pop_back();
        if (v.label() == goal) {
            std::cout << "Found " << v.label() << std::endl;
            return true;
        }
        if (visited.find(v.label()) == visited.end()) {
            std::cout << "Visiting " << v.label() << std::endl;
            visited.insert(v.label());
            for (auto& edge : v.edges()) {
                queue.push_back(edge.target());
            }
        }
    }
    return false;
}
// you need this function in activities 7, 8, 11, and 12
bool find_goal_b(const graph::vertex& start, const std::string& goal) {
    std::deque<graph::vertex> queue = {start};
    std::unordered_set<std::string> visited;
    while (!queue.empty()) {
        auto v = queue.back();
        queue.pop_back();
        if (v.label() == goal) {
            std::cout << "Found " << v.label() << std::endl;
            return true;
        }
        if (visited.find(v.label()) == visited.end()) {
            std::cout << "Visiting " << v.label() << std::endl;
            visited.insert(v.label());
            for (auto& edge : v.edges()) {
                queue.push_front(edge.target());
            }
        }
    }
    return false;
}
```
---
---
---

### Activity 10: Cycles and visited nodes
The function find_cycle listed below is a first attempt at detecting a cycle. It is a slight modification of the traverse function shown earlier - it returns true whenever a vertex is found that was already visited (lines ...). The function will successfully detect a cycle in the graph of Figure 3a, but it will also falsely report a cycle in the graph shown in Figure 3b.
```cpp
bool find_cycle(const graph::vertex& start) {
    std::unordered_set<std::string> visited{};
    std::deque<graph::vertex> queue = { start };
    while (!queue.empty()) {
        auto v = queue.back();   // v is next vertex to visit
        queue.pop_back();        // remove v from queue
        // check if v was visited before
        if (visited.find(v.label()) == visited.end()) {
            visited.insert(v.label());   // not visited before, mark v as visited
            // add all vertices adjacent to v to the queue
            for (auto& edge : v.edges())
                queue.push_back(edge.target());
        }
        else
            return true;   // indicate that a cycle was found
    }
    return false;   // indicate that no cycle was found
}
```
---
* Answer: In this function, a cycle is reported whenever a vertex that was already visited is reached again. That check is too weak: in the graph of Figure 3b a vertex can be reached via two different paths (a "diamond"), so it is reached a second time even though the graph contains no cycle.
---
---
---

### Activity 11: Understanding the recursion
Study the recursive traverse_rec function provided above, and use the debugger to step through its code as it traverses a (small) directed graph. Explain the order in which the two different lines (printing "Visiting .." and "Backtracking from ...") are printed to the console.
---
* Answer: The line "Visiting ..." is printed when a vertex is first reached, before any of the vertices reachable from it are visited; the line "Backtracking from ..." is printed only after all vertices reachable from it have been visited and backtracked from.


---
---
---


### Activity 12: Understanding the iteration
Study the iterative traverse function listed above, and use the debugger to step through its code as it traverses a (small) directed graph. When everyone in the group understands how the function works, continue with the following three tasks:
1. Add two lines of code to the function so that, whenever a vertex is visited or backtracked from, this is written to the console, similar to what the traverse_rec function does. Carefully think about where these two lines must be added to have the desired effect.
2. Explain the order in which nodes are visited by the traverse function. Is it the same order as the order produced by the recursive version? Describe the differences, if any.
3. What information does the content of the queue contain?
---
* Answer:
2. The order is slightly different, but the same principle applies - vertices are "visited" before the vertices reachable from them, and "backtracked from" only after all reachable vertices have been backtracked from.
3. The queue contains the current path, from the start vertex to the vertex currently being visited.
```cpp
void traverse(const graph::vertex& start) {
    std::unordered_set<std::string> visited;
    std::deque<graph::vertex> queue = { start };
    while (!queue.empty()) {
        auto& v = queue.back();   // v is next vertex to visit
        if (visited.find(v.label()) == visited.end()) {   // was v visited before?
            std::cout << "Visiting " << v.label() << std::endl;
            visited.insert(v.label());   // no, mark v as visited
        }
        else if (v.edges().empty()) {
            std::cout << "Backtracking from " << v.label() << std::endl;
            queue.pop_back();   // remove v from queue
        }
        else {
            // get next adjacent vertex, visit it if not already visited
            auto edge = v.edges().back();
            v.edges().pop_back();
            if (visited.find(edge.target().label()) == visited.end())
                queue.push_back(edge.target());
        }
    }
}
```

---
---
---

### Activity 13: Cycle detection
If you can travel to a node that is on the current path, then that means that you’ve found a cycle.
Change the function traverse so that it detects whether a directed graph contains a cycle. Let the function change the color of all the nodes that are on the cycle to blue.

---
* Answer:
```cpp
std::unordered_set<std::string> traverse(const graph::vertex& start) {
    std::unordered_set<std::string> visited;
    std::unordered_set<std::string> cycle;
    std::deque<graph::vertex> queue = { start };
    while (!queue.empty()) {
        auto& v = queue.back();   // v is next vertex to visit
        if (visited.find(v.label()) == visited.end()) {   // was v visited before?
            std::cout << "Visiting " << v.label() << std::endl;
            visited.insert(v.label());   // no, mark v as visited
        }
        else if (v.edges().empty()) {
            std::cout << "Backtracking from " << v.label() << std::endl;
            queue.pop_back();   // remove v from queue
        }
        else {
            // get next adjacent vertex, visit it if not already visited
            auto edge = v.edges().back();
            v.edges().pop_back();
            if (visited.find(edge.target().label()) == visited.end())
                queue.push_back(edge.target());
            else {
                // the edge leads back onto the current path: collect the cycle
                cycle.insert(v.label());
                auto temp_v = edge.target();
                while (temp_v.label() != v.label()) {
                    cycle.insert(temp_v.label());
                    temp_v = temp_v.edges().back().target();
                }
            }
        }
    }
    return cycle;
}
```

## Week 13 - Graphs - Shortest Path I



### Activity 1: Example graph: find the shortest distance
In the example graph shown below, there are 15 possible ways to get from vertex "a" to vertex "q". Which of these 15 paths has the shortest distance? Do you need to go over all the possible paths to find the shortest one?

---
* Answer: A quick way of coming up with a good route is to look at each intersection with multiple outgoing edges and choose the one with the least weight. This gives **a**-d-g-k-l-p-**q**, with a total weight of 16. However, some more playing around reveals that the path **a**-b-c-f-j-m-**q** also totals 16, showing that greedily choosing the lightest edge is not purely reliable. There could also be a situation where the least-weight edge at an intersection leads onto a path that is more expensive overall, so in general the alternative paths cannot simply be ignored.
---
---
---






### Activity 2: Time to relax
The class shortest_paths contains a function that relaxes a vertex by relaxing each of its outgoing edges, by calling the function relax(const graph::vertex&, const graph::edge&). However, the latter has not been implemented yet.
Implement the relax function (the one that has a parameter for both a vertex and an edge). Study the vertex-relaxing relax function (i.e., the one that takes one parameter) to see how the function is called. The function must return a bool indicating whether the distance to the vertex reachable through the edge was decreased.
The main function tests the implementation of the relax functions. Make sure that your implementation passes the test.
---
* Answer:
```cpp
bool shortest_paths::relax(const graph::vertex& vertex, const graph::edge& edge) {
    if (get_data(vertex)->distance() == 0) {   // distance 0: vertex is the source
        get_data(edge.target())->set_distance(&edge, edge.weight());
        return true;
    }
    if (get_data(vertex)->distance() + edge.weight() < get_data(edge.target())->distance()) {
        get_data(edge.target())->set_distance(&edge, get_data(vertex)->distance() + edge.weight());
        return true;
    }
    return false;
}
```
---
---
---

### Activity 3: Depth-first traversal
The class dfs_traversal performs a depth-first traversal through a graph, starting at some vertex. The code listed below shows how the class can be used.
```cpp
graph g = example_graph();
auto traversal = dfs_traversal(g["a"]);
traversal.run();
```
Update the compute function of the shortest_paths class so that it performs a depth-first traversal after it has prepared its internal bookkeeping.
---
* Answer: After preparing the internal bookkeeping, compute constructs a dfs_traversal for the source vertex and calls its run() function, just as in the usage example above; the relax functions themselves are unchanged from the previous activity.
```
---
---
---

### Activity 4: Topological sort
The relax_all function of the shortest_paths class relaxes all vertices passed to it in a vector. The function takes two parameters - a vector with vertex labels, and a boolean indicating whether the vector is to be processed in reverse order or not.
Implement the relax_all function, and call the function from the compute function. Hint: you can go over the elements of a vector in reverse by calling the vector’s back() and pop_back() functions.
Then, pass the pre_order and post_order vectors filled by the traversal, and explore these four possible relaxation orders:
* Pre order
* Pre order, reversed
* Post order
* Post order, reversed
You can verify the computed "shortest" paths by calling the to_dot function of the shortest_paths instance, and visualizing the resulting .dot file online.
Which of the four relaxation orders solves the shortest paths problem for acyclic graphs? Describe why that particular ordering works.
---
* Answer: The relaxation order that works is the post order, reversed. This works because a vertex is appended to the post ("backtrack") order after all reachable vertices have been visited and backtracked from. Visiting vertices in the reversed post order therefore means that a vertex is visited after all vertices it is reachable from are visited - all paths to a vertex are explored first.
```cpp
void shortest_paths::relax_all(std::vector<std::string> vector, bool reversed) {
    if (!reversed) {
        for (auto& label : vector) {
            relax(m_graph[label]);
        }
    }
    else {
        // go over the vector in reverse by popping from the back
        while (!vector.empty()) {
            relax(m_graph[vector.back()]);
            vector.pop_back();
        }
    }
}
```
---
---
---

### Activity 5: Cyclic graphs
The two graphs returned by the cyclic_1() and cyclic_2() functions in example_graphs.h have exactly the same structure and edge weights. The only difference is in their internal adjacency lists - some are ordered slightly differently, meaning that the depth-first traversal will also be different.
Try visiting the vertices in the same order for these two similar but different cyclic graphs. Does this give the correct shortest paths? Why (not)?
---
* Answer: No, because we may find the correct distance for a vertex only after it has been visited. This means that all paths leading from that vertex must be updated again, which requires a second round of relaxations.




---
---
---

### Activity 6: Is it possible to find a shorter path?
Take a look at the weighted digraph shown below. In the graph, solid edges denote edges that have been relaxed, dashed ones have not (yet) been relaxed. All edges in the graph have weights, but the weights associated with dashed edges are not shown - all that we know is that they are non-negative.

Suppose that we’ve been able to find (it does not matter how - this is simply our assumption) the shortest possible distances from vertex "a" to vertices "c" (8) and "d" (12). The other vertices (except for "g") have a distance, too (with a question mark), but it’s unsure whether these distances are shortest distances - more work is required.
Answer the following questions:
* Is it possible that the shortest path from vertex "a" to vertex "f" has a distance smaller than 16 ? Why (not)?
* Is it possible that the shortest path from vertex "a" to vertex "e" has a distance smaller than 14 ? Why (not)?
* Suppose that the unknown edge weights can be negative as well. Is it now possible that the shortest path from vertex "a" to vertex "e" has a distance smaller than 14 ? Why (not)?
---
* Answer: The shortest path from vertex "a" to vertex "f" may well have a distance smaller than 16, because the (unknown) weight of edge e-f can be smaller than 2. The path from vertex "a" to vertex "e" cannot be shorter than 14, because the weight of edge f-e is non-negative. If the edge weights can be negative, then the weight of edge f-e can be −3 or smaller, which means that the distance to "e" can end up smaller than 14.
---
---
---


### Activity 7: Implementing array-based Dijkstra
Implement the function array_based_dijkstra, which must solve the single-source shortest paths problem, in the following way:
First, implement the find_next_dijkstra function. This function must search the m_data vector for an unvisited element, and return the one that has the smallest distance. If all elements are marked as visited, the function must return nullptr.
Next,
1. Find the unvisited shortest_path_data element that has the smallest distance found so far, and retrieve the associated vertex. If no such element exists, then we’re done.
2. Relax the vertex.
3. Mark the shortest_path_data element as visited.
4. Go back to step 1.
Call the function from the compute function, in case the graph is cyclic (in case the graph is acyclic, it must call the relax_all function, as you’ve done in an earlier activity).
*Hint: the function cycle_found() of the dfs_traversal class can be used to find out if a cycle was detected during the traversal.*
---
* Answer:
```cpp
const graph::vertex* shortest_paths::find_next_dijkstra() {
    const graph::vertex* smallest = nullptr;
    int distance = std::numeric_limits<int>::max();   // requires <limits>
    for (auto& var : m_data) {
        if (!var.visited() && var.distance() < distance) {
            smallest = var.vertex();
            distance = var.distance();
        }
    }
    return smallest;
}

void shortest_paths::array_based_dijkstra(const graph::vertex& source) {
    const graph::vertex* next = &source;
    while (next != nullptr) {
        relax(*next);
        get_data(*next)->visit();
        next = find_next_dijkstra();
    }
}
```
---
---
---


### Activity 8: Time complexity of Dijkstra - array-based
To obtain the worst-case time complexity of the array-based implementation, answer the following questions:
* What is the worst-case time complexity of finding the next vertex to visit?
* What is the worst-case time complexity of updating the distance of a vertex?
* What is the worst-case time complexity of the array-based implementation of Dijkstra?
---
* Answer: Time complexities: Finding the next vertex to visit: O(n). Updating the distance of a vertex (assuming that we can retrieve it by index): O(1). Array-based implementation: O(n · n + m · 1) = O(n^2 + m).
---
---
---


### Activity 9: Using a binary heap
Suppose we would use a binary heap to store the heap_data elements, in such a way that the element with the shortest distance is the root element (stored at index 0 of the vector). For Dijkstra’s algorithm, this would mean that finding the next vertex to visit has a worst-case time complexity of O(1), and deleting that vertex is O(log n).
Relaxing a vertex may cause the distances of other vertices, the data of which is kept in the heap, to decrease.
Answer the following questions:
* What is the worst-case time complexity of finding an element with a given (distance) value in the heap, if the heap contains n elements?
* What effect does decreasing the distances stored in the heap have on the heap invariant?
---
* Answer: Finding an element with a given distance in the heap is O(n) (it would be O(log n) if it was a balanced BST). Changing the distance may break the heap invariant.
---
---
---

### Activity 10: A lazy heap
Implement the heap-based implementation of Dijkstra’s algorithm, using a lazy heap, in the function heap_based_dijkstra. This means that anytime an edge relaxation causes the distance of a vertex to be decreased, we insert a new heap_data instance into the heap, using its static insert_into_heap function.
The heap-based implementation starts with creating a heap of heap_data elements, with a single element - the source vertex:
```cpp
std::vector<heap_data> heap = { get_data(source) };
```
The algorithm then consists of the following steps:
1. Find the minimum element of the heap, and delete it from the heap.
2. If the corresponding vertex was not yet visited, relax all its outgoing edges. If this causes a vertex distance to be decreased, then insert a new heap_data instance into the heap for that vertex.
3. If the heap is not empty, go back to step 1.
Call the heap_based_dijkstra function from the compute function (instead of calling the array_based_dijkstra function). The main function contains code to test your implementation.
---
* Answer:
```cpp
void shortest_paths::heap_based_dijkstra(const graph::vertex& source) {
    std::vector<heap_data> heap = { get_data(source) };
    while (!heap.empty()) {
        auto var = heap.back().delete_min(heap);
        if (!var.visited()) {
            for (const graph::edge& edge : var.vertex()->edges()) {
                if (relax(*var.vertex(), edge)) {
                    heap.emplace_back(get_data(edge.target()));
                }
            }
            var.visit();
        }
    }
}
```
---
---
---

### Activity 11: Heap-based Dijkstra - time complexity
If we apply the heap-based implementation of Dijkstra’s algorithm to a graph, then the time of inserting and removing elements from the heap depends on the number of elements in the heap. Let’s say that the heap contains h elements.
**Question 1** What is the worst-case time complexity of the heap-based implementation of Dijkstra, expressed in the number of vertices, n, the number of edges, m, and the number of elements in the heap, h?
---
* Answer: Retrieval of the next vertex to visit is O(1), and deleting the relaxed vertex is O(log h). Updating a distance after relaxing an edge is O(log h), so the total complexity is O(n log h + m log h). Now, how many elements can the heap contain in the worst case? In the worst case, relaxing a vertex causes the distances of all other (non-relaxed) vertices to be updated. This means that the number of elements inserted into the heap is n + (n − 1) + (n − 2) + … + 1 = ½n(n + 1), and for sufficiently large n, this is roughly n².
---
---
**Question 2** What is the worst-case time complexity of the heap-based implementation, expressed only in n and m?
---
* Answer: If h ∼ n², then the complexity is O(n log n² + m log n²), which is the same as O(2n log n + 2m log n). Since constant factors don’t affect scaling, they can be dropped, giving a complexity of O(n log n + m log n), or O((n + m) log n).
---
---
---


### Activity 12: Dijkstra’s worst-case time complexity - a closer look
The maximum number of edges in a directed graph is equal to n × (n − 1).
If we assume a graph with n² edges, i.e., set m = n², then which of the two versions that we have implemented is the best choice in terms of worst-case time complexity?
---
* Answer: If m ∼ n², then the complexity of the array-based (or list-based) implementation is O(n² + n²) = O(n²). The heap-based implementation then has a complexity of O((n² + n) log n), or O(n² log n), which scales worse in n. The array-based implementation thus is the best choice for these dense graphs.
## Week 14 - Graphs - Shortest Path II

### Activity 1: Applications of negative edges
When modeling real-world phenomena with weighted graphs, a negative edge weight can mean different things. Search the internet for at least two applications of weighted graphs in which negative edge weights can occur. Describe clearly the meaning of a negative edge weight in each application that you find.
---
---
---

### Activity 2: Dijkstra on graphs with negative edges
The cyclic graph below contains edges that have negative weights.

When using Dijkstra’s algorithm to compute the shortest paths from vertex a, each vertex is visited only once. The order in which vertices are visited is determined by their current known smallest distance. Answer the following two questions:
* In which order does Dijkstra’s algorithm relax the six vertices of this graph?
* Did Dijkstra’s algorithm solve the shortest paths problem for this graph? Why (not)?
---
* Answer: Dijkstra relaxes the six vertices in the order A, C, B, E, and then F and D, which sets the distance from "a" to "e" to 4. This does not solve the shortest paths problem, because the shortest distance from "a" to "e" is 2, not 4.
---
---
---
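This failure mode can be reproduced in code. The sketch below does not use the graph from the figure but a small hypothetical one (a→b with weight 2, a→c with weight 3, c→b with weight −2), together with a Dijkstra variant that settles each vertex exactly once:

```cpp
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// Textbook Dijkstra: each vertex is settled exactly once, in order of its
// current smallest distance (linear scan for the minimum). With a negative
// edge the settled-once invariant breaks.
std::vector<int> dijkstra_once(int n,
        const std::vector<std::vector<std::pair<int,int>>>& adj, int src) {
    std::vector<int> dist(n, INT_MAX);
    std::vector<bool> settled(n, false);
    dist[src] = 0;
    for (int i = 0; i < n; ++i) {
        int u = -1;
        for (int v = 0; v < n; ++v)                    // unsettled minimum
            if (!settled[v] && (u == -1 || dist[v] < dist[u])) u = v;
        if (dist[u] == INT_MAX) break;                 // rest is unreachable
        settled[u] = true;                             // visited only once
        for (auto [v, w] : adj[u])
            if (!settled[v] && dist[u] + w < dist[v])
                dist[v] = dist[u] + w;
    }
    return dist;
}
```

Vertex b is settled at distance 2 before the negative edge c→b is ever considered, so the true shortest distance of 1 is never found.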



### Activity 3: Shortest path trees
Consider the graph shown below. At some point, all vertices have been relaxed, some perhaps even multiple times. The graph shows the current shortest paths found (solid and thicker lines).

Answer the following questions:
* Are these paths the shortest paths possible?
* If shorter paths exist, the distances to which vertices can be further decreased?
* If shorter paths exist, which vertices need to be relaxed to update the shortest paths, and in which order?
---
* Answer: These paths are not the shortest paths possible: "d" is currently at a distance of 6 (via the path abced), but a shorter path exists, abd, with a distance of 4. The distance to "f" is not the shortest either. Relaxing vertex b sets the correct distance to d; if d is subsequently relaxed as well, the shortest distance to f is also updated to its correct value of 3.
---
---
---

### Activity 4: Do it yourself
In the graph below, the shortest path from a to all other vertices is trivial to find. An algorithm, however, is a fixed sequence of steps, and thus has no clue about what the graph looks like. In other words, even though the graph looks simple, it still makes an interesting case study.
Initially, source vertex a is assigned a distance of 0, and all other vertices a distance of "very far".

Suppose that we would go over this graph in passes. In each pass, we relax the vertices in the following (reversed topological) order: f,e,d,c,b, and finally a.
* How many passes are necessary to compute the correct shortest distances for each vertex?
* How many vertex relaxations are performed in total (note that vertices with a distance of "very far" are relaxed as well)?
---
* Answer: During each pass, one more distance is set to its correct value. Vertices b, c, d, e, and f all need to be updated, so five passes are necessary. During a single pass each of the six vertices is relaxed, so a total of 5 × 6 = 30 vertex relaxations are performed.
---
---
---
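The pass count can be checked with a small sketch. It assumes the figure shows a chain a→b→c→d→e→f with unit weights (an assumption, since the figure is not reproduced here), and it does not count passes that change no distance:

```cpp
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// Pass-based relaxation: each pass relaxes the vertices in the reversed
// topological order f, e, d, c, b, a. Relaxing a vertex means relaxing its
// outgoing edges; a vertex still at "very far" (INT_MAX) has no effect.
int count_passes(int n,
        const std::vector<std::vector<std::pair<int,int>>>& adj,
        std::vector<int>& dist) {
    int passes = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int u = n - 1; u >= 0; --u) {             // f, e, d, c, b, a
            if (dist[u] == INT_MAX) continue;          // "very far": skip
            for (auto [v, w] : adj[u])
                if (dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    changed = true;
                }
        }
        if (changed) ++passes;                         // productive passes only
    }
    return passes;
}
```

Because the relaxation order runs against the direction of the chain, each pass pushes the correct distance only one vertex further along it.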

### Activity 5: An easy graph
For this activity, see the graph shown below:

Again, go over the graph in passes. In a single pass, the seven vertices are relaxed in the same order.
In this activity, your task is to find the worst-case relaxation order. That is: in which order must the vertices be relaxed so that the maximum number of passes is needed to compute the shortest paths?
*Hint: there may be multiple answers.*
---
* Answer: Any relaxation order has the same effect: once vertex a is relaxed, all vertices have their distance set to the smallest possible value, so a single pass always suffices.
---
---
---


### Activity 6: Worst-case number of passes
Suppose that we have an undirected graph of n vertices, in which each vertex is reachable from every other vertex.
We set the distance of one (source) vertex to 0, and the distance of all other vertices to "very far".
We then run an algorithm that, similar to the previous activities, relaxes vertices until no distance is updated. This is done in passes: during a single pass, each vertex is relaxed once.
* What is the worst-case number of passes needed by the algorithm?
* What can you say about the shortest paths tree if this maximum number of passes is needed?
---
* Answer: The worst-case number of passes is n − 1. If this maximum number of passes is needed, the shortest paths tree must be a single chain of vertices (i.e., no forks in the shortest paths).
---
---
---

### Activity 7: Bellman-Ford
Implement the function relax_all (you can find this function in the shortest_paths class), which performs a single pass of the Bellman-Ford algorithm. The function must return true if one or more vertex distances have changed due to the relaxations (and false otherwise).
In the compute function, repeatedly call the relax_all function until it returns false. This completes the implementation of the Bellman-Ford algorithm.
The main function contains tests to verify your implementation.
---
---
---
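Since the shortest_paths class itself is not shown here, the sketch below approximates relax_all and compute over a plain edge list; the edge representation and function signatures are assumptions, not the course’s interface:

```cpp
#include <cassert>
#include <climits>
#include <vector>

struct edge { int from, to, weight; };                 // assumed representation

// One Bellman-Ford pass: relax every edge once and report whether any
// distance changed. Vertices still at "very far" (INT_MAX) are skipped
// to avoid overflow.
bool relax_all(const std::vector<edge>& edges, std::vector<int>& dist) {
    bool changed = false;
    for (const auto& e : edges)
        if (dist[e.from] != INT_MAX && dist[e.from] + e.weight < dist[e.to]) {
            dist[e.to] = dist[e.from] + e.weight;
            changed = true;
        }
    return changed;
}

// Repeatedly call relax_all until it returns false.
std::vector<int> compute(int n, const std::vector<edge>& edges, int src) {
    std::vector<int> dist(n, INT_MAX);
    dist[src] = 0;
    while (relax_all(edges, dist)) {}
    return dist;
}
```

Note that this compute loops forever on a graph with a negative cycle, which is exactly what Activity 9 asks you to observe.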


### Activity 8: Negative cycles
Search the internet for at least two real-world phenomena modeled by graphs, in which negative cycles play a role. What does a negative cycle indicate in each of these phenomena?
---
---
---

### Activity 9: Negative cycles and shortest paths
Use your implementation of the Bellman-Ford algorithm to compute the shortest paths for the two graphs undirected_one and undirected_two (see the example_graphs class).
How many passes (i.e., calls to the relax_all function) does the algorithm need to find the shortest paths for these graphs?
*Hint: Use the debugger to find this out - set a breakpoint inside the relax_all function to count the number of times it is called, or introduce a variable in the compute function to keep track of the number of passes.*
---
* Answer: For one of the graphs, the algorithm will never stop: there is always a vertex whose distance is still updated, so relax_all keeps returning true.
---
---
---

### Activity 10: Bellman-Ford with negative cycle detection
Extend your implementation of the Bellman-Ford algorithm so that, after all distances have been found (assuming no negative cycles exist), it does one more iteration to detect the presence of a negative cycle. In case a negative cycle is found, set the m_cycle_found member to true, otherwise set it to false.
Apply the updated implementation to the same graphs of the previous activity to see if the algorithm can now successfully deal with negative cycles.
Use the shortest_paths_tester class to test the correctness of your implementation.
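A sketch of the extended algorithm, again over a plain edge list rather than the course’s shortest_paths class; the m_cycle_found member is modeled here as a boolean return value:

```cpp
#include <cassert>
#include <climits>
#include <vector>

struct edge { int from, to, weight; };                 // assumed representation

// One Bellman-Ford pass (same as before).
bool relax_all(const std::vector<edge>& edges, std::vector<int>& dist) {
    bool changed = false;
    for (const auto& e : edges)
        if (dist[e.from] != INT_MAX && dist[e.from] + e.weight < dist[e.to]) {
            dist[e.to] = dist[e.from] + e.weight;
            changed = true;
        }
    return changed;
}

// Run at most n - 1 passes, then one extra pass: if that pass still changes
// a distance, a negative cycle must exist. The return value plays the role
// of the m_cycle_found member.
bool has_negative_cycle(int n, const std::vector<edge>& edges, int src,
                        std::vector<int>& dist) {
    dist.assign(n, INT_MAX);
    dist[src] = 0;
    for (int pass = 0; pass < n - 1; ++pass)
        if (!relax_all(edges, dist)) return false;     // converged early
    return relax_all(edges, dist);                     // detection pass
}
```

Bounding the loop at n − 1 passes is what turns the non-terminating compute of Activity 7 into an algorithm that always halts.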
### Activity 11: Worst-case time complexity of Bellman-Ford
What is the worst-case time complexity of the Bellman-Ford algorithm? You can use the internet to answer your question, but provide a clear explanation in your logbook of the complexity you find!
---
* Answer: The worst-case time complexity is O(n · m): each edge is relaxed n − 1 + 1 = n times (n − 1 passes plus one final pass that detects that nothing changes), and there are m edges in the graph.
---
---
---

### Activity 12: Comparing algorithms
Which algorithm should be applied when computing shortest paths? Fill in the table below for the combinations of cyclic vs. acyclic graphs and negative vs. positive edge weights. In the first column, "dense" refers to graphs in which the number of edges m is in the order of the number of vertices squared (n²), and "sparse" refers to graphs in which m is in the order of n log n.

In the "Algorithm" column, choose from "Topological order", "List-based Dijkstra", "Heap-based Dijkstra", and "Bellman-Ford".