# 18 - Indicator random variables **From last lecture** ## Linearity of expectation For <u>any</u> two random variables $X$ and $Y$ $E(E + Y) = E(X) + E(Y)$ For <u>any</u> sequence of random variables $X_1,...,X_n$ $E(X_1, ..., X_n) = E(X_1) + ... + E(X_n)$ ## Indicator random variables **Lemma:** If $X$ is an indicator random variables the $E(X) = Pr(X=1)$ **Proof**: $E(X) = 0 \times Pr(X=0) + 1 \times Pr(X=1) = Pr(X=1)$ **Ex: Runs in bitstrings** - $r_1,...,r_n$ is a random bitstring. $S = \{0,1\}^n$ All bitrstrings of length $n$, $Pr(\omega ) = \frac{1}{2^n}$ - A run of length $k$ starting at $i$ occurs when $r_i = r_{i+1} = r_{r+2} = ... = r_{i+j-1}$ - $X = \text{"the number of runs of length } k \text{ in that bitstring"}$ - Create what's known as an indicator random variables $X: S \rightarrow \{0,1\}$ - For each $i \in \{1,2,...,n-k+1\}$ Define $X_i$ - $X_i = 1$ if there's a run of length $k$ that begins at position $i$ - $X_i = 0$ otherwise - Therefore, $X = \sum_{i=1}^{n-k+1}X_i$ - $E(X) = E(\sum_{i=1}^{n-k+1}X_i) = \sum_{i=1}^{n-k+1}E(X_i)$ - $Pr(X_i=1) = \frac{1}{2^{k-1}}$ (that $k$ bits are all the same value $\frac{1}{2^n}$ times 2 possible values for the bits $\frac{1}{2^n} \times 2 = \frac{1}{2^{n-1}}$) - $E(X_i) = \frac{1}{2^{n-1}}$ - $\sum_{i=1}^{n-k+1}E(X_i) = \sum_{i=1}^{n-k+1} \frac{1}{2^{k-1}} = (n-k+1) \times \frac{1}{2^{k-1}}$ ## Birthday Pairs - $n$ people, $d$ days per year - $S = \{(d_1, d_2, ... d_n) \in \{1,...,d\}\}, Pr(\omega ) = \frac{1}{|S|} = \frac{1}{d^n}$ - Define $X = \text{"The number of pairs of people who share a birthday"}$ - $E(X) = ?$ - For each $i,j \space 1 \leq i < j \leq n$ - $X_{i,j} = 1$ if $d_i=d_j$ - $X_{i,j} = 0$ Otherwise - $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^n X_{i,j}$ - $= \sum_{i=1}^{n-1} \sum_{j=1+1}^n E(X_{i,j})$ - $E(X_{i,j}) = Pr(X_{i,j}=1) = \frac{1}{d}$ - $= \sum_{i=1}^{n-1} \sum_{j=1+1}^n \frac{1}{d}$ - Number of pairs of people $\binom{n}{2}$ - $= \binom{n}{2} \times \frac{1}{d}$ This gives some intuition about the birthday paradox, the probability that two people share a birthday is $\frac{1}{365}$ and there are $\binom{n}{2}$ pairs of people. The larger this number gets, the more likely to find a match. ## Insertion Sort ``` for i = 1 -> n: j = i while j >=1 and a_j < a_j-1: swap a_j and a_j-1 j = j - 1 ``` We know that the algorithm runs at least $n$ times for each element, each swap is another operation. Worst case is an array sorted in reverse order. For each index $n$ we make $n$ swaps. $\frac{n(n-1)}{2}$ swaps at most = $\binom{n}{2}$ because each pair is out of order. Expected case calculation: - Define $X = \text{# of times a swap is executed}$ - Define the list as $a_1,...,a_n$ is a random permutation of $\{1,...n\}$ - $|S| = n! \ Pr(\omega) = \frac{1}{n!}$ Random variable procedure: - For each $1 \leq x < y \leq n$ - $X_x,y = 1$ if a swap is required - $X_x,y = 0$ otherwise - Another way to think about it - $X_x,y = 1$ if $y$ appears before $x$ in $a_1,...a_n$ - $X_x,y = 0$ otherwise - $E(X) = E(\sum_{x=1}^{n-1} \sum_{y=x+1}^n X_{x,y})$ - $= \sum_{x=1}^{n-1} \sum_{y=x+1}^n(E(X_{x,y}))$ - $E(X_{x,y}) = \frac{1}{2}$ Because either $x$ appears before $y$ or $y$ appears before $x$, 50% chance of this happening - $= \sum_{x=1}^{n-1} \sum_{y=x+1}^n \frac{1}{2}$ For every pair. $\frac{\binom{n}{2}}{2}$ ## Finding the maximum value in an array ```python m = -inf for i = 1 -> n if a_i > m: m = a_i return m ``` We want to analyze the assignment operation under a random permutation of numbers from $1,...,n$ - Define $X = \text{"# of times assignment operator is called"}$ - $X_i = 1$ if a swap is made on iteration $i$ / if $a_i > m$ / $a_i = max\{a_1,...a_i\}$ - $X_i = 0$ otherwise - $E(X_i) = Pr(X_i = 1) = \frac{1}{i}$ - $E(X) = E(\sum_{i=1}^n X_i) = \sum_{i=1}^n E(X_i)$ - $=\sum_{i=1}^n \frac{1}{i}$ ### Harmonic number <u>Def:</u> $H_n = \sum_{i=1}^n \frac{1}{i}$ ![](https://i.imgur.com/WePBnEt.png) $\ln n \leq H_n \leq 1+ \ln n$ ###### tags: `COMP2804` `Probability`