# Homework 5 The goal of this assignment is to practice working with the Binary Search Tree data structure and with sorting algorithms. In particular, this assignment will reinforce the intertwined nature of algorithms with data structures by having you implement a sorting algorithm with a tree traversal. You will also get some practice with random testing, and set up for a discussion after November break. ## Setup and Handin ### Setup We are providing an implementation of a Binary Search Tree below. Create a new project in VSCode and create `bst.py` and `tree_sort.py` in that project. (Please use this implementation, as it will ease our grading and reduce other potential issues.) ### Staff Assistance Your friendly staff are here to help with this assignment! You can find our hours schedule [here](http://cs.brown.edu/courses/csci0112/fall-2021/hours.html). ### Handin Hand in the following files to Gradescope when submitting the assignment: - `bst.py` - `tree_sort.py` - `test_tree_sort.py` - `README.txt` Remember to include a README.txt when submitting your assignment! We provide a specification for your README file below. ## The Assignment ### Binary Search Trees Here is a lightly cleaned up version of the Binary Search Tree implementation we ended up with in class, which you should copy into a file called `bst.py` in your new project. Note that this implementation is more robust in two ways: - it handles _duplicate keys_ (e.g., the integer `3` might be stored multiple times in the BST); and - it returns the value found (if any) rather than just a boolean (to better support the extension we talked about in class, where the BST stores records, rather than just raw values). ```python class BSTNode: def __init__(self, data): self.data = data self.left = None # duplicate goes left (always) self.right = None class BST: def __init__(self): self.root = None def contains_at(self, node: BSTNode, data) -> bool: if not node: return False if node.data == data: return node.data if data < node.data: return self.contains_at(node.left, data) else: return self.contains_at(node.right, data) def contains(self, data) -> bool: return self.contains_at(self.root, data) def count_at(self, node) -> int: if not node: return 0 return 1 + self.count_at(node.left) + self.count_at(node.right) def count(self) -> int: return self.count_at(self.root) def insert_to(self, node: BSTNode, data): if data <= node.data: # duplicate goes left (always) if node.left: self.insert_to(node.left, data) else: node.left = BSTNode(data) else: if node.right: self.insert_to(node.right, data) else: node.right = BSTNode(data) def insert(self, data): if not self.root: # child was None self.root = BSTNode(data) else: self.insert_to(self.root, data) ``` ### Tree Sort Here is a stencil - copy this into a file called `tree_sort.py`. ```python from bst import * def convert_to_tree(ls: list) -> BST: """ Tree Sort: Takes in a (possibly unsorted) list and converts it to a binary search tree """ pass def traversal(node: BSTNode) -> list: """ Accept a tree node and perform an in-order depth-first traversal of the tree to get (what should be) a sorted list. """ pass def tree_sort(ls: list) -> list: "Combine the above 2 functions to sort the input list" pass ``` In `tree_sort.py`, fill in the `tree_sort` function which takes in a list and returns a sorted list. Your function must use the tree sort algorithm in two stages: - `convert_to_tree`: Create an empty BST and then iterate through the input list, inserting each element one-by-one into the new BST. - `traversal`: Perform an _in-order_ traversal of the BST (following the algorithm below), inserting every element of the BST one-by-one into a new list. - Return the new list. Do not use any pre-defined sorting functions. Do not worry about _balancing_ the tree; it may very well be that our implementation will produce an unbalanced tree; that isn't the focus of this assignment. To convert your binary search tree into a sorted list, you should perform an _in-order_ traversal of it. This means that when you traverse a node, you first recursively traverse its left child, then visit the node itself, then recursively traverse its right child. Here is pseudocode to follow, based on what we covered in class: - Create a new empty list. - If there is a left child, recursively call `traversal` on the left child, and append its result to the list. - If the current node is non-empty, append its value to the list. - If there is a right child, recursively call `traversal` on it, and append its result to the list. - Return the list. ### Unit testing You should write tests for your `tree_sort` function in `test_tree_sort.py` as you have for other assignments in this class. ### Random Testing Inspired by the code from Wednesday's lecture, create a random testing function for your treesort implementation. Use the code of any sorting algorithm as a baseline for correctness (in class, we used merge sort, but e.g. insertion sort would also be OK). Invoke this random tester from a separate function called `test_tree_sort_randomly` in `test_tree_sort.py`. ### README In your `README.txt`, please include answers to the following questions. __Note that there is a third question below, specific to this assignment!__ - Did you discuss this assignment with any students? Please list their cs logins. - How many late days are you using on this assignment? - Suppose that you were sorting records (i.e., instances of a dataclass with multiple fields, only one of which is the key to be sorted over), rather than just integer keys. What issues might arise in your random testing approach? For the third question, think carefully about what new complexities records, rather than just values, add. Suppose you were sorting (say) a list of records about people by their ages in ascending order. Does this affect your random testing in any way? (Be prepared to follow up with a discussion after break.)