---
tags: summer 2018 cs61bl
---
# CS 61BL Week 5 Notes
Hi! Every week, I like to compile notes based off confusing concepts about and good questions that I've been asked! If you’re already familiar with all the topics, check out the **Resources** section for practice problems or insights.
Also, if there’s anything that doesn’t make sense or seems confusing, feel free to shoot me an [email](mailto:tinazhao@berkeley.edu).
**[Note]: Fixed Broken Crib Sheet Links**
## Table of Contents
[TOC]
## 2-3/2-3-4 Trees
In last Monday's [lab](https://cs61bl.org/su18/labs/lab17/), you dealt with 2-3-4 trees (also called 2-4 trees). The tree structure below is an example.

### Terminology
* **Node:** refers to the struture containing the numbers (ex. 15 35 45)
* **Element:** refers to the numbers inside the node (ex. 15)
* **Children::** refers to the nodes stemming out from one arbitrary node
### Properties
* Should still follow the BST property (less on the left, greater on the right)
* For a 2-3-4 tree, each node contains 1-3 elements with 2-4 children nodes
* For a 2-3 Tree, each node contains 1-2 elements with 2-3 children nodes
### Common Insertion Misconceptions
**Where to Insert**
Suppose we want to insert 13 into the 2-3 tree below.

**Issue:**
A common misconception is to insert the 13 into the node with the closest number. It seems to make sense because 10 is still smaller than 13 and 15 and 16 are still greater than 14.

**Correct Way:**
However, this is incorrect! The correct resulting 2-3 tree should look like the one below.

**Key Takeaway:** **Always insert into a leaf node.**
#### Pushing up Nodes
Suppose we want to insert 17 into the 2-3 tree below.

**Issue:**
Since we know the 15-16 node will be full with the insertion of 17, we must push the 16 up. However, a common misconception is to keep 16 at the same level and create a new level for its children nodes.

**Correct Way:**
Instead, the correct way is to push the 16 up to the previous level with 14! Intuitvely, we can realize that the previous 2-3 tree becomes even more unbalanced if we create a new level while the one below maintains its previous height.

## Red Black Trees
Red Black Trees are binary trees with nodes that are colored red and black. There may be some discreprencies with red-black trees on some exams because for some semesters, they taught with red or black *links* instead of *nodes*.

### Purpose
Red Black Trees are considered the best of both worlds. They are balanced trees that allow us to search for nodes in **θ(log N)** time. However, they are also in the shape of a binary search tree, allowing us to run some BST operations (not all of them!).
### Constraints
1. Each link is (conceptually) colored red or black.
2. Root link is black.
3. Every leaf node contains no data (null) and is black. (These are the black links at the end of the Red Black Tree above!)
4. Every leaf has same number of black ancestors.
5. Every internal node has two children.
6. Every red node has two black children.
### Insertion Methods
In last Monday's lab, you might have learned about color flips and rotation when it comes to insertion. The cases are quite complicated! If you're really interested in understanding how to insert numbers directly into a red black tree, I recommend reading these [slides](https://docs.google.com/presentation/d/1j7kiMfMzmhdi6AoKSuYdZr65iH7ckeWMKGQAyI4M61c/edit#slide=id.g1ee126c17d_0_0) (start at Slide 43) and watching this short [video](https://www.youtube.com/watch?v=A3JZinzkMpk).
However, last's week [lecture](https://docs.google.com/presentation/d/1J7fPC_BYohgDUUDxTk4PiE6XyBHDUAatvOqX_vOhIWw/edit#slide=id.g3d073bd9ed_7_248) (Slide 32) specified that there was no need consider rotations and color flips for the insertion process. Instead, you can consider the **1-1 correspondance between 2-3 Trees and Red Black Trees**.
#### Summary
There are two methods for inserting into a Red Black Tree:
1. Insert and perform a series of color flips and rotations (difficult, but if you're interested, watch this [video](https://www.youtube.com/watch?v=A3JZinzkMpk))
2. Convert to a 2-3 Tree. Insert everything one-by one (order matters!). Convert back to a Red Black Tree. (recommended!)
### How to Check for a Valid Red-Black Tree
If you're ever unsure if the red-black tree you created is valid, try following this checklist:
- [ ] Maintains BST properties (left child smaller than root, right child greater than root)
- [ ] Every leaf has the same number of black ancestors (meaning every path from the root node to the leaf node should contain the same number of black links/nodes)
- [ ] The root node/link is black

**Let's try the checklist on the Red Black Tree pictured above.**
- [x] Maintains BST properties (e.g. 5 < 15 < 35)
- [x] Every leaf has the same number of black ancestors
- [x] The root node is black
**Conclusion:** The Red Black Tree above is valid!
## Heaps
### Properties
Heaps are just binary trees, but they have additional properties that make them pretty nice:
1) They are always as **complete** as possible. What this means is, the only 'empty' spaces in the tree should be at the very bottom with the leaf nodes. The tree is also 'filled in' from left to right - so when you add in nodes, the left most empty spot is filled in first.
2) They have a 'heap invariant' either for min or max heap known as **max-heap property** or **min-heap property**. With a min heap, this means every parent node is smaller than its children. With a max heap, this means every parent node is larger than its children.

### Underlying Structure
This idea of completeness allows us to actually represent a heap with an array because it guarantees that the tree fills up from the left! If we want to represent a heap with n nodes, we can create an array of size n + 1. We skip the 0th element in the array, and put the root value at index 1.
**Max-Heap** [X 8 7 4 3 1 2], **Min-Heap** [X 0 1 2 3 7 4 8]
If a node was represented by index **x**:
* Parent of the node is **x/2**
* Left child of the node is **x*2**
* Right child of the node is **x*2+1**
### Heapification
We can build a heap from an arbitrary, unsorted array, in a process called **heapification**.
There are 4 ways to heapify, but only 2 ways actually work:
* **Bottom-Up Heapification (Reverse Level-Order Bubbling Down)**
* **Top-Down Heapification (Level-Order Bubbling Up)**
The broad intuition for why this is the case is because only these two methods preserve the heap invariant (i.e. that every node is larger/smaller than all of its children).
### Bottom-Up Heapification
Let’s specifically look at the example of heapifying a **min heap**. Bottom-Up Heapification means we swap each node downwards with its smaller child if the childern nodes are smaller than its parent *(bubbling down)*. Another way to refer to the process of bubbling down is **sinking**. Notice that because we start at the bottom, and bubble downwards from each level onward, at each successive level we have to re-evaluate all the children below each node.

**Why is this a viable method?**
If the true minimum is at the bottom-most level of the tree, as it gets swapped upwards towards the top, we know for sure that each parent it swaps positions with is larger than it, which means as the node moves up we preserve the invariant that each children nodes is smaller than its children.
**Runtime Explanation**
Recall that sinking takes **O(log N)** in worst case because the root node needs to sink **log N** (the height of the tree) times. However, not all nodes do that. In fact, approximately half of the nodes don't even need to sink - the most bottom nodes. They have no children, which means they are a valid heap structure by themselves. *(Steps 1-5 demonstrate in the diagram above demonstrate this!)*
Now, the parents of these bottom nodes do need to sink, but once max. The grandparents, twice max. etc. And finally the root, **log N** max (from the height of the tree).
For the analysis now on, the most bottom level level will be 0, the level above level 1, etc. There are at most $\frac{N}{2^{h+1}}$ nodes at level $h$. The amount of sinking required for each level is $\frac{hN}{2^{h+1}}$. The total is $\sum^{logN}_{h=0}\frac{hN}{2^{h+1}}$.
Now, let's manipulate the expression:
$$
\sum^{log N}_{h=0}\frac{hN}{2^{h+1}} = \frac{N}{2}\sum^{log N}_{h=0}\frac{h}{2^h} < \frac{N}{2}\sum^{\infty}_{h=0}\frac{h}{2^h} = \frac{N}{2}(2) = N
$$
Although the calculation is not shown, the step from the summation to 2 relies on an infinite sum formula. Finally, we can conclude that bottom-up heapification has a runtime of **O(N)**.
### Top-Down Heapification
Top-Down Heapification means we start at the top of our tree, and swap each node upwards with its parent if it is smaller than its parent *(bubbling upwards)*. Bubbling nodes upward can also be referred to as **swimming**. Notice that because we start at the top, and bubble upwards from each level, at each successive level we have to re-evaluate all the parents above each node.

**Why is this a viable method?**
The reason is the same as it is for bottom-up heapfication. If the true minimum is at the bottom-most level of the tree, as we swap it upwards towards the top, we know for sure that each parent it swaps positions with is larger than it, which means as the node moves up we preserve the invariant that each children nodes is smaller than its children.
**Runtime Explanation**
For our implementation of PriorityQueue in Monday's lab, we had to insert new nodes at the bottom and bubble them up to the correct spot. This is the same thing that top-down heapification is doing! Therefore, we can think of top-down heapification of a heap with $N$ nodes as making a new Priority Queue and inserting $N$ nodes. Each insertion will take **O(log N)** and there will be $N$ insertions. Therefore, the runtime should be **O(N log N)**.
### Why the Other Methods Don't Work
The other methods are **level order bubbling down** and **reverse level order bubbling up**.Now let's consider for contrast **level order bubbling down** in particular. The reason it doesn’t work/doesn’t preserve the heap invariant is because we start at the top and bubble 'downwards', so nodes that we've already seen/swapped don't get re-evaluated.
Try to run through the same thought process with **reverse level order bubbling up** (why it doesn’t work). I would also practice the mechanics of heapifying a tree into both a min heap/max heap!
## Resources
### Practice Problems and Notes
* [Week 4 Resources](https://drive.google.com/open?id=113zG4_9jo655emP3Re3bP6voMN3-xQkf): Practice problems compiled by Tina for Week 4
* [Week 5 Resources](https://drive.google.com/open?id=16qbiOzEZo99XIoRPJOKmgzeqvQIlf25M): Practice problems compiled by Tina for Week 5
* [Week 4 Notes](https://hackmd.io/s/BkeKrctXQ): written by Tina on data structure analysis, trees, and tree traversals
### Review Slides
* [CSM MT2 Review Slides](https://docs.google.com/presentation/d/1QjR7Fki-0gjVlOH2aTWX6-AkPj_aHqIi0PUIQdKfnLw/edit#slide=id.g345868e7db_0_125): covers Asymptotics, Trees/BSTs/Self Balancing BST’s, Disjoint Sets, Heaps, Hashing, and ADT Selection Problems
* [HKN MT2 Review Slides](https://docs.google.com/presentation/d/1bJhYwdF6H_AYe7Fs_bCMaNhwDL5BdXOHii3roC8WzvI/edit#slide=id.p7): covers Generics, Iterators and Iterables, Exceptions, Heaps, Asymptotic Analysis, Disjoint Sets, Hashing, Binary Search Tree, B-tree, Red black tree
### Relevant CSM Crib Sheets
Useful for a quick conceptual review and practice
* [Asymptotics](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p8ixhpm47pr/jexhuf2m5jeq/asymptotics_edited.pdf)
* [Disjoint Sets](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p8ixhpm47pr/jexi79yool1a/disjoint_sets_walkthrough_updated.pdf)
* [Disjoint Sets Worksheet Solutions](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p8ixhpm47pr/jexi6pw04bi2/disjoint_sets_walkthrough_sol_updated.pdf)
* [Hashing](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p8ixhpm47pr/jevwvg7hte9b/hashing_crib_sheet.pdf)
* [Heaps](https://d1b10bmlvqabco.cloudfront.net/attach/j9j0udrxjjp758/is6p8ixhpm47pr/jevwvypfier2/heaps_crib_sheet.pdf)
### Visualizers
* [Balanced Trees](https://www.cs.usfca.edu/~galles/visualization/BTree.html): use "Max. Degree = 3" option to visualize 2-3 tree
* [Min Heaps](https://www.cs.usfca.edu/~galles/visualization/Heap.html)
* [Max Heaps](https://visualgo.net/en/heap)
### Video Walkthroughs
* [Spring 2017 Midterm 2](https://www.youtube.com/watch?v=_u9GyVG27wY&index=1&list=PLv24fkNrbcYh4GP5vkLsw0lpb6w6PpDfN)
* [Spring 2016 Midterm 2](https://www.youtube.com/watch?v=rciRgoiJVGY)
### Red-Black Tree Rotation and Color Flip Videos
* [Examples](https://www.youtube.com/watch?v=A3JZinzkMpk)
* [Strategies](https://www.youtube.com/watch?v=5IBxA-bZZH8)