## Lecture Lecture 19. Range queries - Static array queries - Binary indexed tree - Segment tree - Range updates --- ### Range queries - **Range query**: to calculate a value based on a subarray of an array (a range). - Typical range queries are: - $sum_q(a,b)$: sum of values in range $[a,b]$ - $min_q(a,b)$: minimum of values in range $[a,b]$ - $max_q(a,b)$: maximum of values in range $[a,b]$ --- #### Range queries <img src="https://cdn.ucode.vn/uploads/1/upload/kdWEQRyQ.png" class="element-left content-img" /> - Consider the range $[3, 6]$ in the array above: $sum_q(3, 6) = 14$, $min_q(3, 6) = 1$, and $max_q(3, 6) = 6$. --- #### Range queries - A simple way to process range queries is to use **a loop** that goes through **all array values** in the range $\rightarrow O(n)$. - Thus, we can process q queries in $O(n \times q)$ time - However, if both $n$ and $q$ are large, this approach is slow. $\rightarrow$ need more efficient ways. --- ### Static Range queries - Static array: the array values are never updated between the queries. - Sum queries: 1D and 2D arrays - Min/max queries --- #### Static Range: Sum Queries - Constructing a prefix sum array: the value at position $k$ is $s(k) = sum_q(0,k)$ - Initial array: <img src="https://cdn.ucode.vn/uploads/1/upload/SlEZPdrA.png" class="element-left content-img" /> - Prefix sum array: <img src="https://cdn.ucode.vn/uploads/1/upload/pebEVBPO.png" class="element-left content-img" /> --- #### Static Range: Sum Queries - We can calculate any value of $sum_q(a,b)$ in $O(1)$ time as follows: - $$sum_q(a, b) = s(b) - s(a-1)$$ - For example, consider the range $[3,6]$: $sum_q(3,6) = s(6) - s(2) = 27 - 8 = 19$ <img src="https://cdn.ucode.vn/uploads/1/upload/GEcYdWmR.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/OGCMbygt.png" class="element-left content-img" /> --- #### Static Range: Sum Queries - This idea can be generalized to **higher dimensions**. - A two-dimensional prefix sum array that can be used to calculate the sum of any **rectangular subarray** in $O(1)$ time. - Each sum in prefix sum array corresponds to a subarray that **begins at the upper-left corner** of the array. --- #### Static Range: Sum Queries <img src="https://cdn.ucode.vn/uploads/1/upload/durPNmBu.png" class="element-left content-img" /> The sum of the **gray subarray**? --- #### Static Range: Sum Queries <img src="https://cdn.ucode.vn/uploads/1/upload/durPNmBu.png" class="element-left content-img" /> $$S(A) - S(B) - S(C) + S(D)$$ --- ### Static Range: Min Queries - More difficult than sum queries: **Sparse table** method. - One time $O(nlogn)$ time preprocessing method after which we can answer any minimum query in $O(1)$ time. <img src="https://cdn.ucode.vn/uploads/1/upload/JXYrgWmf.png" class="element-left content-img" /> --- ### Static Range: Min Queries - The idea is to precalculate all values of $min_q(a,b)$ where (b - a + 1) (the length of the range) is a **power of two**. <img src="https://cdn.ucode.vn/uploads/1/upload/JXYrgWmf.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/IpKrvdyF.png" class="element-left content-img" /> --- ### Static Range: Min Queries - The number of precalculated values is $O(nlogn)$, because there are $O(logn)$ values of range lengths that are powers of two. - The values can be calculated efficiently using the recursive formula: $$min_q(a,b) = min(min_q(a,a + w - 1), min_q(a + w, b))$$ --- ### Static Range: Min Queries - After this, any value of $min_q(a,b)$ can be calculated in $O(1)$ time as a minimum of **two precalculated values**. - Let $k$ be the largest power of two that does not exceed the length $b - a + 1$. We can calculate the value of $min_q(a,b)$ using the formula: $$min_q(a,b) = min(min_q(a,a - k + 1), min_q(b - k + 1,b))$$ --- ### Static Range: Min Queries - Consider the range $[1,6]$: <img src="https://cdn.ucode.vn/uploads/1/upload/LMhKBHvi.png" class="element-left content-img" /> - The length of the range is $6$, thus $k = 4$. The range $[1,6]$ is the union of the ranges $[1,4]$ and $[3,6]$: <img src="https://cdn.ucode.vn/uploads/1/upload/iheYxWoK.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/ALKcTbAG.png" class="element-left content-img" /> --- ### Binary indexed tree: revise - A **binary indexed tree** (BIT) or a **Fenwick** tree can be seen as a dynamic variant of a prefix sum array. - It supports two $O(logn)$ time operations on an array: processing a range sum query and updating a value. - What is the time complexity of prefix sum? --- #### Binary indexed tree: revise - It is usually represented as an array. - Let $p(k)$ denote the largest power of two that **divides** $k$. We store a binary indexed tree as an array tree such that: $$tree[k] = sumq(k - p(k) + 1,k)$$ $$p(k) = k \& -k$$ Note: "divides k" means it is a divisor of $k$ --- #### Binary indexed tree: revise - Original array: <img src="https://cdn.ucode.vn/uploads/1/upload/CuYIGqze.png" class="element-left content-img" /> - Binary index tree: <img src="https://cdn.ucode.vn/uploads/1/upload/YdOdCnZP.png" class="element-left content-img" /> --- #### Binary indexed tree: revise - Binary index tree: <br/> <img src="https://cdn.ucode.vn/uploads/1/upload/COEvnbmv.png" class="element-left content-img" /> --- #### Binary indexed tree: revise - $sum_q(1, k)$ in $O(logn)$ - For example $s(1, 7) = s(1, 4) + s(5, 6) + s(7,7)$ - $sum_q(a,b) = sum_q(1, b) - sum_q(1, a-1)$ <img src="https://cdn.ucode.vn/uploads/1/upload/aBqhwyTI.png" class="element-left content-img" /> --- #### Binary indexed tree: revise - Updating a value in $O(logn)$: each value belongs to $O(logn)$ ranges in the BIT. - For example update the value at position $3$. <img src="https://cdn.ucode.vn/uploads/1/upload/wHxXghAm.png" class="element-left content-img" /> --- ### Segment tree tree: revise - Supports two operations: processing a range query and updating an array value, both in $O(logn)$ - Can support sum queries, minimum and maximum queries and many other queries (gcd, lcm, bitwise operations...) - Compared to BIT, it requires more memory and is a bit more difficult to implement --- #### Segment tree tree: revise - A segment tree is a **binary tree** such that the array elements are storedat the **bottom level** of the tree, and the other nodes contain information of the ranges. - In this section, we assume that the **size** of the array is a **power of two** and zero-based indexing is used. --- #### Segment tree tree: revise - Consider the sum querries: <img src="https://cdn.ucode.vn/uploads/1/upload/ZrWudWMF.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/AYeBrbQd.png" class="element-left content-img" /> --- #### Segment tree tree: revise - Queries: any range $[a,b]$ can be divided into $O(logn)$ ranges whose values are stored in tree nodes. <img src="https://cdn.ucode.vn/uploads/1/upload/TAtnXhRk.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/SDPEzxqH.png" class="element-left content-img" /> --- #### Segment tree tree: revise - Update: update all nodes whose value depends on the updated value (along the path to root node). <img src="https://cdn.ucode.vn/uploads/1/upload/ZZmRImQg.png" class="element-left content-img" /> --- ### Range updates - So far, we have implemented data structures that support **range queries** and **updates of single values**. - An **opposite situation**: when we need **update ranges** and retrieve **single** values - $\rightarrow$ **difference array**. --- #### Range updates - Original array: <img src="https://cdn.ucode.vn/uploads/1/upload/GYsISoLc.png" class="element-left content-img" /> - Difference array: whose values indicate the **differences between consecutive values** in the original array <img src="https://cdn.ucode.vn/uploads/1/upload/kELNUycd.png" class="element-left content-img" /> --- #### Range updates <img src="https://cdn.ucode.vn/uploads/1/upload/GYsISoLc.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/kELNUycd.png" class="element-left content-img" /> - The value at position $k$ in the original array can be calculated as $sum_q(0, k)$ in the difference array. --- #### Range updates - We can **update a range** in the original array by **changing just two elements** in the difference array. - For example, if we want to increase the original array values between positions $1$ and $4$ by $5$: just increase the difference array value at position $1$ by $5$ and decrease the value at position $5$ by $5$. <img src="https://cdn.ucode.vn/uploads/1/upload/kELNUycd.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/tNrXiojE.png" class="element-left content-img" /> --- #### Range updates - Extended problem: update all the values in range $(l, r)$ of the array $a$, such that $a_l$ increases by $1$, $a_{l+1}$ increases by $2$... $a_r$ increases by $r-l+1$. --- #### Range updates - Extended problem: $a_l$ increases by $1$, $a_{l+1}$ increases by $2$... $a_r$ increases by $r-l+1$ - For each $i$ in range $(l, r)$: $a_i += i-(l-1)$ - We can separate the process into 2 different update queries: - Update #1: Descrease a fixed value $l-1$ for every element from $l$ to $r$ $\rightarrow$ normal range update queries --- #### Range updates - Update #2: Increase $a_i$ by $i$ for every $i$ in range $(l,r)$ $\rightarrow$ we can count the number of this update for each element of the array in a separate array $i\_count[i]$, then at the end of the process, we update the final value $a_i += i \times i\_count[i]$ ```python= def update_queries(l, r): arr[l] -= (l - 1) arr[r + 1] += (l - 1) i_count[l] += 1 i_count[r + 1] -= 1 ``` --- #### Range updates ```python= def update_queries(l, r, decrease, i_count): decrease[l] -= (l - 1) decrease[r + 1] += (l - 1) i_count[l] += 1 i_count[r + 1] -= 1 if __name__ == '__main__': n, q = map(int, input.split()) decrease = [0] * (n + 2) i_count = [0] * (n + 2) for _ in range(q): l, r = map(int, input().split()) update_queries(l, r, decrease, i_count) a = [0] * (n + 1) for i in range(1, n + 1): decrease[i] += decrease[i - 1] i_count[i] += i_count[i - 1] a[i] = decrease[i] + i * i_count[i] print(a[i], end=' ') ``` --- #### Range updates: 2-d array - Repeated operations: Increase all every of a rectange $(i_1, j_1, i_2, j_2)$ by $val$. - We can apply difference array technique on every row or every column $\rightarrow$ $O(Q \times n)$ - Common problem: $Q$ is much bigger than $n$ $\rightarrow$ we need a better approach. --- #### Range updates: 2-d array - Let $a$ is the original array. We will maintain another array $b$ to capture the change of values. - For each update query $(i_1, j_1, i_2, j_2)$ by $val$: - $b[i_2][j_2]$ += $val$ - $b[i_1-1][j_2]$ -= $val$ - $b[i_2][j_1-1]$ -= $val$ - $b[i_1-1][j_1-1]$ += $val$ --- #### Range updates: 2-d array After all update queries, we will recalculate the $b$ as following: $$b[i][j] = b[i + 1][j] + b[i][j + 1] - b[i + 1][j + 1] + b[i][j]$$ The final value: $$a[i][j] + b[i][j]$$ --- ## The end ---
{"metaMigratedAt":"2023-06-17T19:46:18.318Z","metaMigratedFrom":"YAML","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"theme\":\"white\"}","title":"Thuc Nguyen's ADSA Course - Lecture 19. Range queries","contributors":"[{\"id\":\"476f9158-9a39-4a2d-bb09-ce08dd7eb978\",\"add\":12230,\"del\":427}]"}
    186 views