## Lecture Lecture 19. Range queries
- Static array queries
- Binary indexed tree
- Segment tree
- Range updates
---
### Range queries
- **Range query**: to calculate a value based on a subarray of an array (a range).
- Typical range queries are:
- $sum_q(a,b)$: sum of values in range $[a,b]$
- $min_q(a,b)$: minimum of values in range $[a,b]$
- $max_q(a,b)$: maximum of values in range $[a,b]$
---
#### Range queries
<img src="https://cdn.ucode.vn/uploads/1/upload/kdWEQRyQ.png" class="element-left content-img" />
- Consider the range $[3, 6]$ in the array above: $sum_q(3, 6) = 14$, $min_q(3, 6) = 1$, and $max_q(3, 6) = 6$.
---
#### Range queries
- A simple way to process range queries is to use **a loop** that goes through **all array values** in the range $\rightarrow O(n)$.
- Thus, we can process q queries in $O(n \times q)$ time
- However, if both $n$ and $q$ are large, this approach is slow. $\rightarrow$ need more efficient ways.
---
### Static Range queries
- Static array: the array values are never updated between the queries.
- Sum queries: 1D and 2D arrays
- Min/max queries
---
#### Static Range: Sum Queries
- Constructing a prefix sum array: the value at position $k$ is $s(k) = sum_q(0,k)$
- Initial array: <img src="https://cdn.ucode.vn/uploads/1/upload/SlEZPdrA.png" class="element-left content-img" />
- Prefix sum array: <img src="https://cdn.ucode.vn/uploads/1/upload/pebEVBPO.png" class="element-left content-img" />
---
#### Static Range: Sum Queries
- We can calculate any value of $sum_q(a,b)$ in $O(1)$ time as follows:
- $$sum_q(a, b) = s(b) - s(a-1)$$
- For example, consider the range $[3,6]$: $sum_q(3,6) = s(6) - s(2) = 27 - 8 = 19$
<img src="https://cdn.ucode.vn/uploads/1/upload/GEcYdWmR.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/OGCMbygt.png" class="element-left content-img" />
---
#### Static Range: Sum Queries
- This idea can be generalized to **higher dimensions**.
- A two-dimensional prefix sum array that can be used to calculate the sum of any **rectangular subarray** in $O(1)$ time.
- Each sum in prefix sum array corresponds to a subarray that **begins at the upper-left corner** of the array.
---
#### Static Range: Sum Queries
<img src="https://cdn.ucode.vn/uploads/1/upload/durPNmBu.png" class="element-left content-img" />
The sum of the **gray subarray**?
---
#### Static Range: Sum Queries
<img src="https://cdn.ucode.vn/uploads/1/upload/durPNmBu.png" class="element-left content-img" />
$$S(A) - S(B) - S(C) + S(D)$$
---
### Static Range: Min Queries
- More difficult than sum queries: **Sparse table** method.
- One time $O(nlogn)$ time preprocessing method after which we can answer any minimum query in $O(1)$ time.
<img src="https://cdn.ucode.vn/uploads/1/upload/JXYrgWmf.png" class="element-left content-img" />
---
### Static Range: Min Queries
- The idea is to precalculate all values of $min_q(a,b)$ where (b - a + 1) (the length of the range) is a **power of two**.
<img src="https://cdn.ucode.vn/uploads/1/upload/JXYrgWmf.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/IpKrvdyF.png" class="element-left content-img" />
---
### Static Range: Min Queries
- The number of precalculated values is $O(nlogn)$, because there are $O(logn)$ values of range lengths that are powers of two.
- The values can be calculated efficiently using the recursive formula:
$$min_q(a,b) = min(min_q(a,a + w - 1), min_q(a + w, b))$$
---
### Static Range: Min Queries
- After this, any value of $min_q(a,b)$ can be calculated in $O(1)$ time as a minimum of **two precalculated values**.
- Let $k$ be the largest power of two that does not exceed
the length $b - a + 1$. We can calculate the value of $min_q(a,b)$ using the formula:
$$min_q(a,b) = min(min_q(a,a - k + 1), min_q(b - k + 1,b))$$
---
### Static Range: Min Queries
- Consider the range $[1,6]$:
<img src="https://cdn.ucode.vn/uploads/1/upload/LMhKBHvi.png" class="element-left content-img" />
- The length of the range is $6$, thus $k = 4$. The range $[1,6]$ is the union of the ranges $[1,4]$ and $[3,6]$:
<img src="https://cdn.ucode.vn/uploads/1/upload/iheYxWoK.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/ALKcTbAG.png" class="element-left content-img" />
---
### Binary indexed tree: revise
- A **binary indexed tree** (BIT) or a **Fenwick** tree can be seen as a dynamic variant of a prefix sum array.
- It supports two $O(logn)$ time operations on an array: processing a range sum query and updating a value.
- What is the time complexity of prefix sum?
---
#### Binary indexed tree: revise
- It is usually represented as an array.
- Let $p(k)$ denote the largest power of two that **divides** $k$. We store a binary indexed tree as an array tree such that:
$$tree[k] = sumq(k - p(k) + 1,k)$$
$$p(k) = k \& -k$$
Note:
"divides k" means it is a divisor of $k$
---
#### Binary indexed tree: revise
- Original array:
<img src="https://cdn.ucode.vn/uploads/1/upload/CuYIGqze.png" class="element-left content-img" />
- Binary index tree:
<img src="https://cdn.ucode.vn/uploads/1/upload/YdOdCnZP.png" class="element-left content-img" />
---
#### Binary indexed tree: revise
- Binary index tree:
<br/>
<img src="https://cdn.ucode.vn/uploads/1/upload/COEvnbmv.png" class="element-left content-img" />
---
#### Binary indexed tree: revise
- $sum_q(1, k)$ in $O(logn)$
- For example $s(1, 7) = s(1, 4) + s(5, 6) + s(7,7)$
- $sum_q(a,b) = sum_q(1, b) - sum_q(1, a-1)$
<img src="https://cdn.ucode.vn/uploads/1/upload/aBqhwyTI.png" class="element-left content-img" />
---
#### Binary indexed tree: revise
- Updating a value in $O(logn)$: each value belongs to $O(logn)$ ranges in the BIT.
- For example update the value at position $3$.
<img src="https://cdn.ucode.vn/uploads/1/upload/wHxXghAm.png" class="element-left content-img" />
---
### Segment tree tree: revise
- Supports two operations: processing
a range query and updating an array value, both in $O(logn)$
- Can support sum queries, minimum and maximum queries and many other queries (gcd, lcm, bitwise operations...)
- Compared to BIT, it requires more memory and is a bit more difficult to implement
---
#### Segment tree tree: revise
- A segment tree is a **binary tree** such that the array elements are storedat the **bottom level** of the tree, and the other nodes contain information of the ranges.
- In this section, we assume that the **size** of the array is a **power of two** and zero-based indexing is used.
---
#### Segment tree tree: revise
- Consider the sum querries:
<img src="https://cdn.ucode.vn/uploads/1/upload/ZrWudWMF.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/AYeBrbQd.png" class="element-left content-img" />
---
#### Segment tree tree: revise
- Queries: any range $[a,b]$ can be divided into $O(logn)$ ranges whose values are stored in tree nodes.
<img src="https://cdn.ucode.vn/uploads/1/upload/TAtnXhRk.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/SDPEzxqH.png" class="element-left content-img" />
---
#### Segment tree tree: revise
- Update: update all nodes whose value depends on the updated value (along the path to root node).
<img src="https://cdn.ucode.vn/uploads/1/upload/ZZmRImQg.png" class="element-left content-img" />
---
### Range updates
- So far, we have implemented data structures that support **range queries** and **updates of single values**.
- An **opposite situation**: when we need **update ranges** and retrieve **single** values
- $\rightarrow$ **difference array**.
---
#### Range updates
- Original array: <img src="https://cdn.ucode.vn/uploads/1/upload/GYsISoLc.png" class="element-left content-img" />
- Difference array: whose values indicate the
**differences between consecutive values** in the original array
<img src="https://cdn.ucode.vn/uploads/1/upload/kELNUycd.png" class="element-left content-img" />
---
#### Range updates
<img src="https://cdn.ucode.vn/uploads/1/upload/GYsISoLc.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/kELNUycd.png" class="element-left content-img" />
- The value at position $k$ in the original array can be calculated as $sum_q(0, k)$ in the difference array.
---
#### Range updates
- We can **update a range** in the original array by **changing just two elements** in the difference array.
- For example, if we want to increase the original array values between positions $1$ and $4$ by $5$: just increase the difference array value at position $1$ by $5$ and decrease the value at position $5$ by $5$.
<img src="https://cdn.ucode.vn/uploads/1/upload/kELNUycd.png" class="element-left content-img" />
<img src="https://cdn.ucode.vn/uploads/1/upload/tNrXiojE.png" class="element-left content-img" />
---
#### Range updates
- Extended problem: update all the values in range $(l, r)$ of the array $a$, such that $a_l$ increases by $1$, $a_{l+1}$ increases by $2$... $a_r$ increases by $r-l+1$.
---
#### Range updates
- Extended problem: $a_l$ increases by $1$, $a_{l+1}$ increases by $2$... $a_r$ increases by $r-l+1$
- For each $i$ in range $(l, r)$: $a_i += i-(l-1)$
- We can separate the process into 2 different update queries:
- Update #1: Descrease a fixed value $l-1$ for every element from $l$ to $r$ $\rightarrow$ normal range update queries
---
#### Range updates
- Update #2: Increase $a_i$ by $i$ for every $i$ in range $(l,r)$ $\rightarrow$ we can count the number of this update for each element of the array in a separate array $i\_count[i]$, then at the end of the process, we update the final value $a_i += i \times i\_count[i]$
```python=
def update_queries(l, r):
arr[l] -= (l - 1)
arr[r + 1] += (l - 1)
i_count[l] += 1
i_count[r + 1] -= 1
```
---
#### Range updates
```python=
def update_queries(l, r, decrease, i_count):
decrease[l] -= (l - 1)
decrease[r + 1] += (l - 1)
i_count[l] += 1
i_count[r + 1] -= 1
if __name__ == '__main__':
n, q = map(int, input.split())
decrease = [0] * (n + 2)
i_count = [0] * (n + 2)
for _ in range(q):
l, r = map(int, input().split())
update_queries(l, r, decrease, i_count)
a = [0] * (n + 1)
for i in range(1, n + 1):
decrease[i] += decrease[i - 1]
i_count[i] += i_count[i - 1]
a[i] = decrease[i] + i * i_count[i]
print(a[i], end=' ')
```
---
#### Range updates: 2-d array
- Repeated operations: Increase all every of a rectange $(i_1, j_1, i_2, j_2)$ by $val$.
- We can apply difference array technique on every row or every column $\rightarrow$ $O(Q \times n)$
- Common problem: $Q$ is much bigger than $n$ $\rightarrow$ we need a better approach.
---
#### Range updates: 2-d array
- Let $a$ is the original array. We will maintain another array $b$ to capture the change of values.
- For each update query $(i_1, j_1, i_2, j_2)$ by $val$:
- $b[i_2][j_2]$ += $val$
- $b[i_1-1][j_2]$ -= $val$
- $b[i_2][j_1-1]$ -= $val$
- $b[i_1-1][j_1-1]$ += $val$
---
#### Range updates: 2-d array
After all update queries, we will recalculate the $b$ as following:
$$b[i][j] = b[i + 1][j] + b[i][j + 1] - b[i + 1][j + 1] + b[i][j]$$
The final value:
$$a[i][j] + b[i][j]$$
---
## The end
---
{"metaMigratedAt":"2023-06-17T19:46:18.318Z","metaMigratedFrom":"YAML","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"theme\":\"white\"}","title":"Thuc Nguyen's ADSA Course - Lecture 19. Range queries","contributors":"[{\"id\":\"476f9158-9a39-4a2d-bb09-ce08dd7eb978\",\"add\":12230,\"del\":427}]"}