# 18 - Indicator random variables
**From last lecture**
## Linearity of expectation
For <u>any</u> two random variables $X$ and $Y$
$E(E + Y) = E(X) + E(Y)$
For <u>any</u> sequence of random variables $X_1,...,X_n$
$E(X_1, ..., X_n) = E(X_1) + ... + E(X_n)$
## Indicator random variables
**Lemma:** If $X$ is an indicator random variables the $E(X) = Pr(X=1)$
**Proof**: $E(X) = 0 \times Pr(X=0) + 1 \times Pr(X=1) = Pr(X=1)$
**Ex: Runs in bitstrings**
- $r_1,...,r_n$ is a random bitstring. $S = \{0,1\}^n$ All bitrstrings of length $n$, $Pr(\omega ) = \frac{1}{2^n}$
- A run of length $k$ starting at $i$ occurs when $r_i = r_{i+1} = r_{r+2} = ... = r_{i+j-1}$
- $X = \text{"the number of runs of length } k \text{ in that bitstring"}$
- Create what's known as an indicator random variables $X: S \rightarrow \{0,1\}$
- For each $i \in \{1,2,...,n-k+1\}$ Define $X_i$
- $X_i = 1$ if there's a run of length $k$ that begins at position $i$
- $X_i = 0$ otherwise
- Therefore, $X = \sum_{i=1}^{n-k+1}X_i$
- $E(X) = E(\sum_{i=1}^{n-k+1}X_i) = \sum_{i=1}^{n-k+1}E(X_i)$
- $Pr(X_i=1) = \frac{1}{2^{k-1}}$ (that $k$ bits are all the same value $\frac{1}{2^n}$ times 2 possible values for the bits $\frac{1}{2^n} \times 2 = \frac{1}{2^{n-1}}$)
- $E(X_i) = \frac{1}{2^{n-1}}$
- $\sum_{i=1}^{n-k+1}E(X_i) = \sum_{i=1}^{n-k+1} \frac{1}{2^{k-1}} = (n-k+1) \times \frac{1}{2^{k-1}}$
## Birthday Pairs
- $n$ people, $d$ days per year
- $S = \{(d_1, d_2, ... d_n) \in \{1,...,d\}\}, Pr(\omega ) = \frac{1}{|S|} = \frac{1}{d^n}$
- Define $X = \text{"The number of pairs of people who share a birthday"}$
- $E(X) = ?$
- For each $i,j \space 1 \leq i < j \leq n$
- $X_{i,j} = 1$ if $d_i=d_j$
- $X_{i,j} = 0$ Otherwise
- $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^n X_{i,j}$
- $= \sum_{i=1}^{n-1} \sum_{j=1+1}^n E(X_{i,j})$
- $E(X_{i,j}) = Pr(X_{i,j}=1) = \frac{1}{d}$
- $= \sum_{i=1}^{n-1} \sum_{j=1+1}^n \frac{1}{d}$
- Number of pairs of people $\binom{n}{2}$
- $= \binom{n}{2} \times \frac{1}{d}$
This gives some intuition about the birthday paradox, the probability that two people share a birthday is $\frac{1}{365}$ and there are $\binom{n}{2}$ pairs of people. The larger this number gets, the more likely to find a match.
## Insertion Sort
```
for i = 1 -> n:
j = i
while j >=1 and a_j < a_j-1:
swap a_j and a_j-1
j = j - 1
```
We know that the algorithm runs at least $n$ times for each element, each swap is another operation. Worst case is an array sorted in reverse order. For each index $n$ we make $n$ swaps.
$\frac{n(n-1)}{2}$ swaps at most = $\binom{n}{2}$ because each pair is out of order.
Expected case calculation:
- Define $X = \text{# of times a swap is executed}$
- Define the list as $a_1,...,a_n$ is a random permutation of $\{1,...n\}$
- $|S| = n! \ Pr(\omega) = \frac{1}{n!}$
Random variable procedure:
- For each $1 \leq x < y \leq n$
- $X_x,y = 1$ if a swap is required
- $X_x,y = 0$ otherwise
- Another way to think about it
- $X_x,y = 1$ if $y$ appears before $x$ in $a_1,...a_n$
- $X_x,y = 0$ otherwise
- $E(X) = E(\sum_{x=1}^{n-1} \sum_{y=x+1}^n X_{x,y})$
- $= \sum_{x=1}^{n-1} \sum_{y=x+1}^n(E(X_{x,y}))$
- $E(X_{x,y}) = \frac{1}{2}$ Because either $x$ appears before $y$ or $y$ appears before $x$, 50% chance of this happening
- $= \sum_{x=1}^{n-1} \sum_{y=x+1}^n \frac{1}{2}$ For every pair. $\frac{\binom{n}{2}}{2}$
## Finding the maximum value in an array
```python
m = -inf
for i = 1 -> n
if a_i > m:
m = a_i
return m
```
We want to analyze the assignment operation under a random permutation of numbers from $1,...,n$
- Define $X = \text{"# of times assignment operator is called"}$
- $X_i = 1$ if a swap is made on iteration $i$ / if $a_i > m$ / $a_i = max\{a_1,...a_i\}$
- $X_i = 0$ otherwise
- $E(X_i) = Pr(X_i = 1) = \frac{1}{i}$
- $E(X) = E(\sum_{i=1}^n X_i) = \sum_{i=1}^n E(X_i)$
- $=\sum_{i=1}^n \frac{1}{i}$
### Harmonic number
<u>Def:</u> $H_n = \sum_{i=1}^n \frac{1}{i}$

$\ln n \leq H_n \leq 1+ \ln n$
###### tags: `COMP2804` `Probability`