[Work in Pyret, putting your work in a file called bsts-code.arr
]
In class, we saw ancestor trees, in which we linked horses to their parents. The datatype we had for this was:
Tree-shaped data (data in which multiple "next" pieces of data branch off a single datum) arises in various contexts in CS. Think of the layers of menus of options you select through when trying to call customer service: first you indicate the general nature of your call, then you get a more refined set of questions, and so on. Each question is a data point that branches off to other questions, each of which has their own set of questions, and so on.
Ancestor trees and phone-call menus are examples where our data is inherently tree-shaped: the relationships we want to capture branch off between data points. But in CS we also use trees in a more artificial manner: we impose a branching structure on data to allow us to work with that data more efficiently.
This assignment has you work with binary search trees (henceforth abbreviated as BST), a way of organizing a collection of data with order to help us find data items faster than searching in lists.
Imagine that you had a set of numbers such as
8, 5, 7, 1, 3, 12, 6, 10, 2, 4
and you wanted to organize it in a way that showed the ordering relationship among the numbers. You could make a sorted list, but as we have seen, checking whether a specific number is in a list can take time linear in the size of the list.
A binary search tree (BST) is a binary tree (meaning a tree with at most two branches from each node/point) in which every node has the following property:
(this statement assumes no duplicates, but that is okay since we are modeling sets. If you want duplicates, change less-than to less-than-or-equal in the definition).
Here is one example BST on our sample set:
Practice (not for turn in): Draw two more BSTs on the same collection of numbers. Start with a different number at the top (called the root) and see what you come up with. Can you make distinctly different shapes?
Practice (not for turn in): Which of the following are BSTs on the numbers from 1 through 5?
Generalizing from our ancestor tree, here's a general datatype for BSTs on numbers:
This has the same shape as the AncTree
definition, but it is not specific to relationships on people. Instead, it allows any kind of data (through the type Any
). It also uses the term leaf
, instead of unknown
, for the empty tree.
Imagine you were given a BST representing a set of numbers, and were asked whether a specific number is in the set. How would you proceed?
Practice (not for turn in): Using the same tree we provided, work through a couple of example searches, some of numbers in the tree and some not in the tree.
Task (to turn in): When you understand the algorithm, implement the function. Don't forget to include test cases.
Note: even though the definition of BST is more general than just numbers, you can assume we are working with trees of numbers when writing your code.
How do we build-up a BST, however? For lists, we know how to use link
to add an element. How do we add an element to a BST?
Assume you are given a BST and a number. You add an element to the BST by looking through the tree for the leaf where the number should be, then putting the number in that spot. For example, assume you wanted to insert the number 4 into the following tree:
We'd start at the root, and ask whether 4 is less than or greater than 8. Less than, so we should insert 4 on the left (so that we maintain the requirement on a BST). Now we compare 4 to 5. Less than, so we insert on the left. Now compare 4 to 3. Larger than, so we insert on the right. But the right is a leaf, so that is where we put the 4. Here's the resulting BST:
Notice that in this algorithm, new elements are always inserted at the leaves, not in the middle of the tree. This disrupts the tree as little as possible, while still maintaining the BST requirement.
Practice (not for turn in): By hand, follow the above algorithm to create BSTs for each of the following setups on the same numbers:
start with a leaf, then add 3, 1, 2, 4
in that order
start with a leaf, then add 4, 3, 2, 1
in that order
start with a leaf, then add 4, 2, 1, 3
in that order
Task: Implement add-elt
(our description explained the algorithm, which inserts at leaves). Don't forget to write test cases (and writing them first could really help you here, especially in the leaf case).
Also, follow the template code skeleton that we discussed on trees in lecture. It is easy to overthink this problem and get your code in a mess. The code should follow fairly naturally from the examples and template if you use them carefully.
You thought you were done with sorting in the previous assignment, but it turns out there are some sorting algorithms that use secondary data structures to help sort the list. One such approach is "tree sort." Given a list, tree sort works in two steps:
Task: Write a function with the definition
that implements tree sort. You can assume that the elements in num-list
will be unique.
Hint: the two steps in tree sort lend themselves well to two helper functions, one that takes in a List<Number>
and outputs a BST
and one that takes in a BST
and outputs a List<Number>
. Since the two functions run indepedently, code and test each of them in turn before combining them. For the second function in particular, it might help to draw some diagrams of what the following five BSTs should look like, turned into Lists (again, you do not have to turn the diagrams in)
In a comment, describe any questions you have (if any) based on these exercises.
Turn in a file bsts-code.arr
with your code.
Information about the Autograder
After you submit your work, Gradescope will run our autograder tests and display the results. It is your responsibility to make sure that all autograder tests pass.
bsts-code.arr
and that the function names and inputs are exactly as asked for in the handout. Specifically for this assignment, make sure that your BST
data declaration is exactly the same as in this handout (with a leaf
and a node
with the relevant fields)The autograder runs our tests on your add-elt
, has-elt
, and tree-sort
implementations. Getting a full autograder score does not guarantee that you passed the assignment – Milda will manually grade your code to make sure that the algorithms were implemented as described in the handout.