IB asks the students in the exam to traverse in different ways data trees and be able to discuss its propierties. The students don't have to implement binary trees in pseudocode.
https://turnoff.us/geek/binary-tree/
We have already covered linked list. A couple of concepts that are important to the concept of linked list is that we don't have just the naked data anymore, but we have nodes, elements that have data (the payload) and one (or more) links.
In a single linked list would look like this:
credit Y.C.
The concept of tree (or data tree) comes when we ask ourselves the question, can we make this nodes connect in a different way? Can we connect one node to 2, 3, 5 different nodes? The answer was yes and the results are data trees.
Here we have a data tree:
(credit Y.C.)
Usually data trees, unlike linked list are represented with the entry point (the root node) on top instead that on the left. In some places the also put the entry point (the root node) on the bottom.
data tree graph in Grasshopper
The basics are kind of specifics versions of each other
A data tree is when we have nodes with any number of children.
Binary tree is the subset of data trees where we specify that any node can have only up to 2 childs.(we call them left and right)
Binary search tree is the subset of binary data trees with a condition that any left child has a value that is less than its parent and the right child is more than the parent.
Since binary search trees have the assumption of nodes having to be bigger or smaller, they cannot hold the same values. (data trees or binary trees don't have this restriction)
Here you have a Venn diagram to clarify
And a UML diagram using inheritance for people doing OOPç
When we have a binary tree (such as this binary search tree)
If we want to tell all the elements that we have in the tree we can have different orders. For example (not meaningful orders) we can go
34, 32, 39, 20, 33, 35 , 41, 18, 22, 7 ,19, 21, 24
Or
7, 19, 21, 24, 35, 41, 18, 22, 39, 20, 33, 32, 39, 34
To make a simplification we have a subtree of this tree.
We need to understand
We visit the root node
Then we visit the left node (or left subtree)
Finally we visit the right node (or right subtree)
So the subtree inorder would be
39, 35, 41
The whole tree would be:
34, 32, 20, 18, 7, 19, 22, 21, 24, 33, 39, 35, 41
One way to do it is creating this cloud around the binary tree
This shape should be as fit as you can. And from the start on the left.
Then we visit the left node (or left subtree)
We visit the root node
Finally we visit the right node (or right subtree)
Then we visit the left node (or left subtree)
Finally we visit the right node (or right subtree)
We visit the root node
Traverse this tree in preorder, inorder, postorder
Traverse this tree in preorder, inorder, postorder
credit Y.C.
Solution in the spoiler
Preorder: Potato, Lettuce, Banana, Cucumber, Carrot, Tomato, Lemon, Apple, Orange, Onion
Inorder: Banana, Lettuce, Cucumber, Potato, Lemon, Tomato, Carrot, Orange, Apple, Onion.
Postorder: Banana, Cucumber, Lettuce, Lemon, Tomato, Orange, Onion, Apple, Carrot, Potato
Look at the three ways of ordering.
Preorder: Root, left, right
Inorder: Left, root, right
Postorder: Left, right, root
This can be done in simple draft in the exam to remember (it's not in the pseudocode approbed notation)
Reference:
https://stackoverflow.com/questions/2130416/what-are-the-applications-of-binary-trees
The reason that binary trees are used more often than n-ary trees for searching is that n-ary trees are more complex, but usually provide no real speed advantage.
In a (balanced) binary tree with m nodes, moving from one level to the next requires one comparison, and there are levels, for a total of comparisons.
In contrast, an n-ary tree will require comparisons (using a binary search) to move to the next level. Since there are total levels, the search will require comparisons total. So, though n-ary trees are more complex, they provide no advantage in terms of total comparisons necessary.
(However, n-ary trees are still useful in niche-situations. The examples that come immediately to mind are quad-trees and other space-partitioning trees, where divisioning space using only two nodes per level would make the logic unnecessarily complex; and B-trees used in many databases, where the limiting factor is not how many comparisons are done at each level but how many nodes can be loaded from the hard-drive at once)
Other reference (not quoted here)
https://www.studysmarter.co.uk/explanations/computer-science/data-structures/binary-tree/
https://www.youtube.com/watch?v=oSWTXtMglKE
https://www.youtube.com/watch?v=K1a2Bk8NrYQ&ab_channel=SpanningTree
https://www.youtube.com/watch?v=t0Cq6tVNRBA
In this video deeps dive on the topic and goes with also high priority queues at the end
https://www.youtube.com/watch?v=HqPJF2L5h9U
I like this example of why using heaps from Stack overflow
https://stackoverflow.com/questions/749199/when-would-i-want-to-use-a-heap
Reverse Polish notation is something that happened in some (actually, one (1) exam) exams and you can see in the book. It's kind of abstract and when I talk to the students about it I see some very different faces trying to undertand it's use. Here you have some reference:
https://mathworld.wolfram.com/ReversePolishNotation.html
Something I was thinking it's interesting is thinking about names and surnames as payloads and pointers of ABT.
Let's see different ways of having surnames in different parts of the world
I'm going to start with the one that I'm familiar with. The Spanish (also seen in Latin America) one. If your name is "Laura Fernández González" you can consider that your surnames are pointers to other nodes.
Let's draw this as a node:
Fernández refers to the surname of Laura's father and González to the surname of Laura's mother (this can be changed and there are some nuances but this is the default)
So let's represent this:
I'm going to start all these diagrams as being the last (the youngest) person on the family the root and from there I'm going to go up as in regular family trees. Remember that the notation in CS of DT is with the root on top
This can go on
Also it's interesting because, in Spain even if the registered surnames are only two, you can tell the rest of the surnames following the procedure of writing the next surnames of your fathers. Following the example of Laura these would be their surnames:
(Of course, for this purpose, I'm going to omit contractions like Garcia-Quintero or the Catalan version Puig i Vidal)
I like to think of these as pointers to the grandparents.
You can go on and write all your eight (if you know your grandparents 2 surnames) or as much as you can find.
This is why there is a comedy called "ocho apellidos vascos" (8 basque surnames) and a sequel called "ocho apellidos catalanes" (8 catalan surnames)
If we think of this as traversing a data tree we can consider that this system kind of omits the middle nodes and focus on the leaves (the farther parent you know) and it's going to omit the middle parts.
In the case of the anglo-saxon system (where ladies lose their maiden surname) the same Surname refers to both parents. The mother may (still) have a maiden name that it's her link to their parents.
This for the purposes of Abstract Data Structures is a mess since we have a pointer pointing to more than one thing place to the same place! (we can consither that it's pointing to the family)
If we continue doing the family tree we would get something like this:
//to-do
//to-do