# Lecture 5 - Splay Tree, Link/Cut Tree and ET-Tree ###### tags: `Algorithm` ## Dynamic Data Structure + Supporting following operations + $Search$ + Priority queue only support finding minimum + $Minimum/Maximum$ + $Predecessor$/$Successor$ + $Predecessor(i)$: maximum element smaller than $i$ + $Successor(i)$: minimum element greater than $i$ + $Insert$ + $Delete$ ### Binary Search Tree (BST) #### Definition + A rooted binary tree with a key stored at each node + The key in each node must be greater than all keys stored in the left sub-tree, and smaller than all keys in the right sub-tree ![](https://i.imgur.com/DodqCmb.png) #### Property + Binary tree in **sorted** order + Maintain ordering property for **all** sub-trees ![](https://i.imgur.com/THyXgtO.png) + List all elements in sorted order ```cpp //In-order traversal void list(x){ if (x!=NULL){ list(x->left); print(x->key); list(x->right); } } ``` #### Operation + $Search$, $Maximum$, $Minimum$ operations are trivial + $Successor$ + If $x$ has a right child, then return the minimum of its right child ![](https://i.imgur.com/dobEjBC.png) + When $x$ does not have a right subtree + If $x$ is the left child of its parent, then successor is its parent ![](https://i.imgur.com/0Z0UkNG.png) + If $x$ is the right child of its parent + $x$ is the largest in the subtree rooted at it + We want to find the successor of $x$’s parent in the tree without the subtree rooted at $x$ ![](https://i.imgur.com/KAvSigH.png) ```cpp Node* Successor(x){ if (x->right!=NULL) return min(x->right); Node* y=x->parent; while (y!=NULL && x==y->right){ x=y; y=x->parent; } return y; } ``` + $Insert$ + Trivial insertion ![](https://i.imgur.com/ifqPDpo.png) + However, the BST built in this way may be not balanced + $Delete$ + If $z$ has no child, simply delete it + If $z$ has one child $y$, simple replace $z$ by $y$ ![](https://i.imgur.com/H5aHTig.png) + If $z$ has two children, then we can find the $z$’s successor $y$ + $y$ must be in $z$’s right subtree + Replace $z$ by $y$, and the rest of $z$’s right subtree as its right subtree ![](https://i.imgur.com/eDvCJ0s.png) #### Time Analysis + All operations are $O(h)$, where $h$ is the height of the binary search tree + Self-balancing binary search tree + **Definition**: A data structure keeping the height of the binary search tree small + **BST is balanced if $h=O(\log n)$** + $n$: number of elements in BST + **$Insert$ and $Delete$ to keep BST balanced** ![](https://i.imgur.com/6AaGbQe.png) + For example: AVL-tree, Red/Black-tree, B-tree ### Splay Tree (*Sleator-Tarjan* 1983) #### Basic introduction + Self-adjusting BST + Most frequently accessed items are close to root + Tree automatically reorganizes itself after each operation + No balance information is explicitly maintained + Tree remains nicely balanced, but height can potentially be $n -1$ + Sequence of $m$ ops involving $n$ inserts takes $O(m \log n)$ time ##### Theorem (*Sleator-Tarjan*, 1983a): Splay trees are as efficient (in amortized sense) as static optimal BST ##### Theorem (*Sleator-Tarjan*, 1983b): Shortest augmenting path algorithm for max-flow can be implemented in $O(mn\log n)$ time + Sequence of $mn$ augmentations takes $O(mn\log n)$ time + Splay trees used to implement dynamic trees (link/cut tree) #### **Operation** + $ZIG-ZAG$ + $ZIG$: left rotation + $ZAG$: right rotation ![](https://i.imgur.com/VqTIA4M.png) + $Splay(x, S)$: Reorganize splay tree $S$ so that **element $x$ is at the root** if $x\in S$; otherwise the new root is $max\{k\in S|k<x\}$ + Do following operations until $x$ is root + $ZIG$: If $x$ has a parent but no grandparent, then rotate $x$ ![](https://i.imgur.com/IJ4wkWq.png) + $ZIG-ZIG$: If $x$ has a parent $y$ and a grandparent, and if both $x$ and $y$ are either both left children or both right children ![](https://i.imgur.com/hrpa5ug.png) + $ZIG-ZAG$: If $x$ has a parent $y$ and a grandparent, and if one of $x$, $y$ is a left child and the other is a right child ![](https://i.imgur.com/18b0zIe.png) + For example + $Splay(1, S)$ ![](https://i.imgur.com/EDm7o8H.png) + $Splay(2, S)$ ![](https://i.imgur.com/ibR9C60.png) + $Find(x, S)$: determine whether element $x$ is in splay tree $S$ + Call $Splay(x, S)$ + If $x$ is root, then return $x$; otherwise return NO + $Join(S, S')$: join $S$ and $S'$ into a single splay tree, assuming that $x < y$ for all $x ∈S$, and $y ∈S'$ + Call $Splay(\infty, S)$ so that largest element of $S$ is at root and all other elements are in left subtree + Make $S'$ the right subtreeof the root of $S$ + $Delete(x, S)$: delete $x$ from $S$ if it is there + Call $Splay(x, S)$ to bring $x$ to the root if it is there + Remove $x$, let $S'$ and $S''$ be the resulting subtrees + Call $Join(S', S'')$ + $Insert(x, S)$: insert $x$ into $S$ if it is not already there + Call $Splay(x, S)$ and break tree at root to form $S$ and $S''$ + Call $Join(Join(S', \{x\}), S'')$ + $Split(x, S)$: partitions $S$ into two trees, $S_1=\{y\in S| key(y)≤x\}$ and $S_2=\{y\in S| key(y)>x\}$ + Call $Splay(x, S)$ + $S_1$: left subtree of root + $S_2$: right subtree of root #### Analysis ##### Definition + Let $S(x)$ denote subtree of $S$ rooted at $x$ + $|S|$: number of nodes in tree $S$ + $\mu(S) = rank = \lfloor \log |S|\rfloor$ + $\mu(x) = \mu(S(x))$ ![](https://i.imgur.com/jZewxWv.png) ##### Splay Invariant Node $x$ always has at least $\mu(x)$ credits on deposit, which means the potential function $\phi(S)=Σ_{x∈S}\mu(x)$ ##### Splay Lemma: Each $Splay(x, S)$ operation requires at most $3(\mu(S)-\mu(x))+1$ credits (amortized time) to perform the splay operation and maintain the invariant + Let $\mu(x)$ and $\mu'(x)$ be rank before and after single $ZIG$, $ZIG-ZIG$, or $ZIG-ZAG$ operation on tree $S$ + We show invariant is maintained (after paying for low-level operations) using at most + $3(\mu(S) -\mu(x)) + 1$ credits for each $ZIG$ operation + $3(\mu'(x) -\mu(x))$ credits for each $ZIG-ZIG$ operation + $3(\mu'(x) -\mu(x))$ credits for each $ZIG-ZAG$ operation + Thus, if a sequence of of these are done to move $x$ up the tree, we get a telescoping sum $⇒$ total credits $≤ 3(\mu(S) -\mu(x)) + 1$ ###### $ZIG$ ![](https://i.imgur.com/SsBtJ0M.png) + Amortized cost: $1+\mu'(x)+\mu'(y)-\mu(x)-\mu(y)\\=1+\mu'(y)-\mu(x)\\\leq 1+\mu'(x)-\mu(x)\\\leq 1+3(\mu'(x)-\mu(x))\\=1+3(\mu(S)-\mu(x))$ + Use extra credit to pay for low-level operations ###### $ZIG-ZIG$ ![](https://i.imgur.com/x0qLSth.png) + Amortized cost: $1+\mu'(x)+\mu'(y)+\mu'(z)-\mu(x)-\mu(y)-\mu(z)\\=1+\mu'(y)+\mu'(z)-\mu(x)-\mu(y)\\= 1+(\mu'(y)-\mu(x))+(\mu'(z)-\mu(y))\\\leq 1+(\mu'(x)-\mu(x))+(\mu'(x)-\mu(x))\\=1+2(\mu'(x)-\mu(x))$ + If $\mu'(x)>\mu(x)$, then can afford to pay for constant number of low-level operations and maintain invariant using $≤ 3(\mu'(x) -\mu(x))$ credits ###### Nasty case: $\mu(x)=\mu'(x)$ + We show in this case $\mu'(x) +\mu'(y) +\mu'(z) < \mu(x) +\mu(y) +\mu(z)$ + Don't need any credit to pay for invariant + $1$ credit left to pay for low-level operations + For contradiction, suppose $\mu'(x) +\mu'(y) +\mu'(z)\geq \mu(x) +\mu(y) +\mu(z)$ + Since $\mu(x)=\mu'(x)=\mu(z)$, by monotonicity $\mu(x)=\mu(y)=\mu(z)$ + After some algebra, it follows that $\mu(x)=\mu'(z)=\mu(z)$ + Let $a=1+|A|+|B|$, $b=1+|C|+|D|$, then $\lfloor \log a\rfloor=\lfloor\log b\rfloor=\lfloor \log(a+b+1)\rfloor$ + WLOG, assume $b\geq a$ + $\lfloor\log(a+b+1)\rfloor\geq\lfloor \log(2a)\rfloor\geq 1+\lfloor\log a\rfloor>\lfloor \log a\rfloor$ ###### $ZIG-ZAG$ ![](https://i.imgur.com/aBmT014.png) + Argument similar to $ZIG-ZIG$ ##### Theorem: A sequence of $m$ operations involving ninserts takes $O(m\log n)$ time + $\mu(x)≤\lfloor\log n\rfloor\to$ at most $3\lfloor\log n\rfloor+1$ credits are needed for each splay operation + $Find$, $Insert$, $Delete$, $Join$, $Split$ all take constant number of splays plus low-level operations (pointer manipulations, comparisons) #### Optimality of Splay Tree + Static splay tree: no insertion or deletion, only accessing, so the tree will becomes more balanced + Let $1,…,n$ be the elements in the splay tree + A sequence of $m$ accesses, in which item $i$ appears $q(i)$ times ##### Theorem: If $q(i)>0$ for $1≤i≤n$, then the total access time is $O(m+\sum_{i=1}^n q(i)\log(\frac{m}{q(i)}))$ + In the amortized analysis, **assign weights $w(x)$ to each node $w$** + $S(x)$ is the subtree rooted at $x$, $\mu(x) = \mu(S(x))= \lfloor\log(\sum_{y\in S(x)}w(y))\rfloor$ + Total potential $\phi(S)=\sum_{x∈S}\mu(x)$ + You can check all steps in the previous proof work :::info Splay Lemma: Each $Splay(x, S)$ operation requires at most $3(\mu(S)-\mu(x))+1$ credits (amortized time) to perform the splay operation and maintain the invariant ::: + **So amortized time for accessing $i$ is $≤3(\log m-\log q(i))+1$** + The total potential drop is $\leq\sum\log(m/q(i))$ + This bound reaches *Shannon* entropy, thus optimal (to a constant factor) #### Dynamic Optimality Conjecture + $s$: any sequence of dictionary operations starting with any $n$-node BST + $OPT$: optimum off-line algorithm on $s$ + $OPT$ carries out each operation by starting from the root and accessing nodes on the corresponding search path, and possibly some additional nodes, at a cost of one per accessed node, and that between each operation performs an arbitrary number of rotations on any accessed node in the tree, at a cost of one per rotation + Then, $C_{SPLAY}(s) = O(n+C_{OPT}(s))$ ## Dynamic Trees Structure Goal: maintain a forest of (rooted) trees + Operation + $MakeTree(v)$: creates and returns singleton tree with vertex $v$ + $Link(v,w)$: add an edge between $v$ and $w$ (**$v$ and $w$ are not in the same tree**) + $Cut(v,w)$: delete **tree edge** $(v,w)$ + $FindRoot(v)$: returns the root of the tree containing $v$ (**to check whether two nodes are in the same tree**) + Application + Network flows + Static and dynamic graph algorithms + Computational geometry ### Link/Cut Tree #### Definition + We build a structure to represent this tree T + Every node has $0$ or $1$ **preferred child** + **Preferred edge: the edge between a preferred child and its parent** + The original tree $T$ can be partitioned into **preferred path** ![](https://i.imgur.com/AruKFHw.png) ##### Preferred Path + Each preferred path is an ancestor-descendent path + We use a splay tree to represent a preferred path + Keyed by its depth + Left subtree: higher nodes + Right subtree: lower nodes ![](https://i.imgur.com/vYYgrNJ.png) + Every splay tree (preferred path) has a pointer to its parent + From the root of the splay tree ![](https://i.imgur.com/aVYhFu0.png) #### Operation + $Access(v)$ + All operations are implemented by the $Access(v)$ subroutine + When calling $Access(v)$, the path from $v$ to the root becomes **preferred path** + Previous preferred edge associated with nodes on this path becomes unpreferred ![](https://i.imgur.com/9Ep8rmN.png) ```python access(v) splay(v) in the splay tree containing v remove v’ preferred child (split the splay tree and change the pointer) while v is not the in the preferred path containing the root let w be the nodes v points to splay(w) in the splay tree containing w switch w’s preferred child to v v ← w splay(v) in the splay tree containing v ``` + For example: $Access(N)$ ![](https://i.imgur.com/b8fnEzY.png) + Make $N$’s preferred child unpreferred ![](https://i.imgur.com/Nzq7Wof.png) + Find $N$’s preferred path’s parent by $N$’s pointer ![](https://i.imgur.com/7BnJp4n.png) + $Splay(I)$ inside the splay tree ![](https://i.imgur.com/HgeMBdN.png) + Switch $I$’s preferred child and union the two splay trees ![](https://i.imgur.com/qPfM1di.png) + $I$ points to $H$, so call $Splay(H)$ ![](https://i.imgur.com/6EzMWtM.png) + Switch $H$’s preferred child to $I$, change the splay tree accordingly ![](https://i.imgur.com/dA3Mldd.png) ![](https://i.imgur.com/UbZorf1.png) + Call $Splay(N)$ ![](https://i.imgur.com/TiIBL3H.png) + $FindRoot(v)$ + $Access(v)$ + Find the smallest element in the splay tree containing $v$ + $Cut(v, parent(v))$ + $Access(v)$ + Split v from the splay tree + For example, $Cut(N,L)$ ![](https://i.imgur.com/VbtXDue.png) + $Link(v,w)$ + $Access(v)$ + $Access(w)$ + Let $w$ be $v$’s right child + Since after calling $Access(v)$, $v$ doesn’t have left child + So the preferred paths are linked together #### Complexity Analysis + Real running time of $Access(v)$ depends on the number of preferred edge changes + Not bounded in one operation + what about amortized time? + $size(v)$: number of nodes in $v$’s subtree (original tree) + **Heavy edge**: an edge $(v, parent(v))$ is heavy if $size(v)>\frac{1}{2}size(parent(v))$ + Otherwise, it is a **light** edge ![](https://i.imgur.com/t9feHc8.png) ##### Preferred Edge Changed + Consider the preferred edges changed on the path + light-to-light or heavy-to-light: $\leq\log n$ times + Since each time the size of the subtree at least doubles + What about **light-to-heavy**? + Amortized $O(\log n)$ times + No heavy-to-heavy edges + In a long operation sequence, number of switches bounded by $O(\log n)$ per operation ![](https://i.imgur.com/SJkxPkw.png) + In each preferred edge change we need to update the splay trees, so amortized $O(\log^2n)$ time + We can prove an $O(\log n)$ time bound by given the potentials in the splay tree: $\phi(S)=Σ_{x∈S}\lfloor \log(s(x))\rfloor$, where $s(v)$ equals number of nodes in $v$’s subtree, including the splay tree and all its descendent in the **represented tree** ![](https://i.imgur.com/gWL8j50.png) + Then the cost for one splay operation is $3(\log(s(u))-\log(s(v)))+1$, where $u$ is the root of $v$ in the splay tree + If $w$ is the parent of the root of auxiliary tree containing $v$, then we have that $s(v) ≤ s(u) ≤ s(w)$ + So sum over all splay trees changes: amortized $O(\log n)$ time + Thus, $O(\log n)$ time for both splay tree and preferred edge changes operations #### Conclusion + It is a dynamic trees + $O(\log n)$ amortized time per operation + But hard to traverse all nodes in a tree, since no pointer to child splay trees ### Euler-Tour-Tree (ET-Tree) #### Technique + Tree do not have Euler tour ![](https://i.imgur.com/z7v8kbn.png) + Replace each $\{u,v\}$ into $(u,v)$ and $(v,u)$ + Resulting graph has an Euler tour ![](https://i.imgur.com/34h4Ft9.png) + It can use a sequence to represent Euler tour (not unique) ![](https://i.imgur.com/r7LbEHK.png) + It solves the dynamic connectivity problem in forests + **High-level idea**: Instead of storing the trees in the forest, store their Euler tours + Each edge insertion or deletion translates into a set of manipulations on the Euler tours of the trees in the forest + Checking whether two nodes are connected can be done by checking if they're in the same Euler tour. #### Property + The sequence of nodes visited in an Euler tour of a tree is closely connected to the structure of the tree ![](https://i.imgur.com/mnDVltL.png) + Begin by directing all edges away from the first node in the tour + **Claim**: The sequences of nodes visited between the first and last instance of a node $v$ gives an Euler tour of the subtree rooted at $v$ ![](https://i.imgur.com/RLdxncq.png) ![](https://i.imgur.com/gmW7q7V.png) #### Operation + $Reroot$ + The subtrees defined by ranges in Euler tours on trees depend on the root + In some cases, we will need to change the root of the tree ![](https://i.imgur.com/RB2L5rA.png) + To reroot at $r$ + Pick any occurrence of the new root $r$ ![](https://i.imgur.com/Z6SnOSJ.png) + Split the tour into $A$ and $B$, where $A$ is the part of the tour before $r$ + Delete the first node of $A$ ![](https://i.imgur.com/c25TN1H.png) + Concatenate $B$ and $A$ ![](https://i.imgur.com/UaL9KJo.png) + Append $r$ ![](https://i.imgur.com/Y2ehlnS.png) + $Link$ + Given two trees $T_1$ and $T_2$, where $u\in T_1$ and $v\in T_2$, executing $Link(u,v)$ links the trees together by adding edge $\{u,v\}$ ![](https://i.imgur.com/xpP2GOQ.png) + To link $T_1$ and $T_2$ by adding $\{u,v\}$ + Let $E_1$ and $E_2$ be the Euler tours of $T_1$ and $T_2$ respectively ![](https://i.imgur.com/cOrdAxL.png) + Rotate $E_1$ to root the tour at $u$ + Rotate $E_2$ to root the tour at $v$ ![](https://i.imgur.com/AYr3w45.png) + Concatenate $E_1$, $E_2$ and $\{u\}$ ![](https://i.imgur.com/1YlPQF8.png) + $Cut$ + Given a tree $T$, excuting $Cut(u,v)$ cuts the edge $\{u,v\}$ cuts the edge $\{u,v\}$ from the tree (assuming it exists) ![](https://i.imgur.com/vA4N70m.png) + To cut $T$ into $T_1$ and $T_2$ by cutting $\{u,v\}$ + Let $E$ be an Euler tour for $T$ ![](https://i.imgur.com/AbpO7wf.png) + Spilt $E$ at $(u,v)$ and $(v,u)$ to get $J$, $K$ and $L$, in that order ![](https://i.imgur.com/zf22kw9.png) + Delete the last entry of $J$ + $E_1=K$ + $E_2=J,L$ ![](https://i.imgur.com/uGtHJIQ.png) #### Implement + **Goal**: Implement $Link$, $Cut$ and $FindRoot$ as efficienty as possible + By representing trees via their Euler tours, can implement $Link$ and $Cut$ so that only $O(1)$ joins and splits are necessary per operation but $FindRoot$ is $O(n)$ + Use splay tree to maintain the Euler tours then all operations are $O(\log n)$ + Balanced binary tree ordered by ET-list ![](https://i.imgur.com/8hDCAhj.png) + For each vertex in graph, store its first and last appearance in the ET-list ![](https://i.imgur.com/AAoiDif.png) + Then its predecessor and successor is its parent in the original tree ![](https://i.imgur.com/pQhqM67.png) ### Summary + All in $O(\log n)$ time + $FindRoot(v)$ + $Link(v,w)$ + $Cut(v,w)$ + Link/Cut tree are good at aggregation over path + The sum of edge/node values on the path to the root + Minimum value on the path to the root + ET-Tree are good at aggregation over subtrees