# Probability - 2 (Expectation ~ Continuous R.V.)
### Expectation
* ex.
* Suppose $X\sim\text{Poisson}(\lambda=15,T=1)$. What is the PMF of $X$? How to define the expected value of $X$?
* sol
* $p_{_X}(x)=\begin{cases}
\dfrac{e^{-\lambda T}(\lambda T)^x}{x!},\ x=0,1,\dots \\
0, \text{otherwise}
\end{cases}$
$E(X)=\displaystyle\sum_{x=0}^{\infty}x\dfrac{e^{-\lambda T}(\lambda T)^x}{x!}=\displaystyle\sum_{x=1}^{\infty}\dfrac{e^{-\lambda T}(\lambda T)^x}{(x-1)!}=\lambda T\displaystyle\sum_{x=1}^{\infty}\dfrac{e^{-\lambda T}(\lambda T)^{x-1}}{(x-1)!}=\lambda T$
$\displaystyle\sum_{x=1}^{\infty}\dfrac{e^{-\lambda T}(\lambda T)^{x-1}}{(x-1)!}=1$ since it is the sum of the PMF of another Poisson R.V., and probabilities sum to 1
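The derivation above can be sanity-checked numerically (a sketch: the infinite sum is truncated at $x=100$, where the Poisson tail is negligible):

```python
import math

def poisson_pmf(x, lam, T=1.0):
    # p_X(x) = e^{-lambda T} (lambda T)^x / x!
    mu = lam * T
    return math.exp(-mu) * mu**x / math.factorial(x)

lam, T = 15, 1.0
# E[X] = sum_x x * p_X(x); truncated at x = 100 (the tail beyond is negligible)
expectation = sum(x * poisson_pmf(x, lam, T) for x in range(101))
print(expectation)  # ~ 15.0 = lambda * T
```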
* **Def** (Expected Value / Mean / Expectation) $E[X]:=\displaystyle\sum_{x\in S}x\cdot p_{_X}(x)$
* $S$: the set of possible values
* another notation: $\mu_{_X}=E[X]$
* **Def** (Expectation using CDF) Let the possible values be $x_1<x_2<\dots$. Then $E[X]=\displaystyle\sum_{i=1}^{\infty}(x_i-x_{i-1})\cdot (1-F_X(x_{i-1}))$
* Denote $F_X(x_0)=0$, i.e. $x_0$ is a point below the smallest possible value
* ex.
* Suppose $X$ is a discrete R.V. with the set of possible values $A=\{2,4,\dots\}$ and CDF $F_{X}(t)=1-\dfrac{1}{t^2}$. What is $E[X]$?
* sol
* $E[X]=\displaystyle\sum_{i=1}^{\infty}(x_i-x_{i-1})(1-F_X(x_{i-1}))=2\cdot 1+\displaystyle\sum_{i=2}^{\infty}2\cdot\dfrac{1}{(2(i-1))^2}=2+\dfrac{1}{2}\displaystyle\sum_{k=1}^{\infty}\dfrac{1}{k^2}=2+\dfrac{\pi^2}{12}$
(here $x_i=2i$, $x_0=0$, and $1-F_X(x_{i-1})=\dfrac{1}{x_{i-1}^2}$ for $i\ge 2$)
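A numerical check of the CDF-based sum (a sketch: taking $x_i=2i$, $x_0=0$, and truncating at a large index):

```python
import math

F = lambda t: 1 - 1 / t**2   # the given CDF, evaluated at the support points

# E[X] = sum_i (x_i - x_{i-1}) * (1 - F(x_{i-1})) with x_i = 2i and F(x_0) := 0
total = 2 * 1.0  # i = 1 term: (x_1 - x_0) * (1 - F(x_0)) = 2 * 1
total += sum(2 * (1 - F(2 * (i - 1))) for i in range(2, 200_000))
print(total)  # converges to 2 + pi^2 / 12 ≈ 2.8225
```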
* **Thm** (Expectation of a Function of R.V.) Let $X$ be a discrete R.V. with the set of possible values $S$ and PMF $p_{_X}(x)$, and let $g(\cdot)$ be a real-valued function. Then $E[g(X)]=\displaystyle\sum_{x\in S}g(x)p_{_X}(x)$
* Also called Law of the unconscious statistician (LOTUS)
> **pf**
> Define a R.V. $Y:=g(X)$
> $E[g(X)]=E[Y]=\displaystyle\sum_{\substack{y:y=g(x) \\ x\in S}}y\cdot p_{_Y}(y)=\sum_{\substack{y:y=g(x) \\ x\in S}}y\sum_{x:g(x)=y}p_{_X}(x)$
> $=\displaystyle\sum_{\substack{(x,y):g(x)=y\\x\in S}}yp_{_X}(x)=\sum_{x\in S}g(x)p_{_X}(x)$
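The two sides of LOTUS can be checked on a small example (a sketch with a hypothetical PMF): computing $E[g(X)]$ directly from $p_{_X}$ agrees with first building the PMF of $Y=g(X)$ and then taking $E[Y]$.

```python
from collections import defaultdict

p_X = {-1: 0.2, 0: 0.3, 1: 0.5}   # hypothetical PMF
g = lambda x: x**2                 # g(-1) = g(1), so Y's support collapses

# LOTUS: E[g(X)] = sum_x g(x) p_X(x)
lotus = sum(g(x) * p for x, p in p_X.items())

# Direct: build p_Y by summing p_X over {x : g(x) = y}, then E[Y] = sum_y y p_Y(y)
p_Y = defaultdict(float)
for x, p in p_X.items():
    p_Y[g(x)] += p
direct = sum(y * p for y, p in p_Y.items())

print(lotus, direct)  # both 0.7
```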
* **Prop** $E[g(X)+h(X)]=E[g(X)]+E[h(X)]$
> **pf**
> $E[g(X)+h(X)]=\sum(g(x)+h(x))p_{_X}(x)=\sum g(x)p_{_X}(x)+\sum h(x)p_{_X}(x)$
> $=E[g(X)]+E[h(X)]$
### Moment and Variance
* moments
| function | expectation |
| --- | --- |
| $g(X)=X^2$ | 2-nd moment |
| $g(X)=X^n$ | n-th moment |
| $g(X)=(X-\mu_{_X})^2$ | 2-nd central moment |
| $g(X)=(X-\mu_{_X})^n$ | n-th central moment |
| $g(X)=e^{tX}$ | moment generating func. |
* **Def** (Variance) $\text{Var}[X]:=E[(X-\mu_{_X})^2]$
* also denote as $\sigma_{_X}^2$
* ex
* Suppose we are given a random variable $X$. We need to output a prediction of $X$ (denoted by $z$). The penalty of prediction is $(X-z)^2$. What is the minimum expected penalty?
* sol
* $g(z):=E[(X-z)^2]=E[X^2-2zX+z^2]=E[X^2]-2zE[X]+z^2$
$=(z-E[X])^2+E[X^2]-(E[X])^2\ge E[X^2]-(E[X])^2=\text{Var}(X)$
* Variance = minimum expected quadratic penalty
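A numeric illustration of this fact (a sketch with a hypothetical PMF): scanning predictions $z$ on a grid shows the expected quadratic penalty is minimized at $z=E[X]$, where it equals $\text{Var}(X)$.

```python
p_X = {0: 0.25, 1: 0.5, 2: 0.25}  # hypothetical PMF
mean = sum(x * p for x, p in p_X.items())              # E[X] = 1.0
var = sum((x - mean)**2 * p for x, p in p_X.items())   # Var[X] = 0.5

# expected penalty g(z) = E[(X - z)^2], evaluated on a grid over [0, 2]
penalty = lambda z: sum((x - z)**2 * p for x, p in p_X.items())
zs = [i / 100 for i in range(201)]
best = min(zs, key=penalty)
print(best, penalty(best))  # 1.0 0.5
```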
* **Thm** $\text{Var}[X]=E[X^2]-(E[X])^2$
> **pf**
> $\text{Var}[X]=E[(X-\mu_{_X})^2]=E[X^2]-2\mu_{_X}E[X]+\mu_{_{X}}^2$
> $=E[X^2]-2(E[X])^2+(E[X])^2=E[X^2]-(E[X])^2$
* **Prop**
* $\text{Var}[X+c]=\text{Var}[X]$
* $\text{Var}[aX]=a^2\text{Var}[X]$
* $E[X^2]\ge (E[X])^2$
> **pf** $\text{Var}[X]=E[X^2]-(E[X])^2\ge 0$
* **Def** (Existence) Let $X$ be a random variable. Then, the n-th moment of $X$ (i.e. $E[X^n]$) is said to exist if $E[|X|^n]<\infty$
* ex.
* Let $z_n=(-1)^n\sqrt{n}$, $n=1,2,\dots$. Let $Z$ be a random variable with the set of possible values $\{z_n:n=1,2,\dots\}$ and the PMF $p_{_Z}(z)$ as $p_{_Z}(z_n)=\dfrac{6}{(\pi n)^2}$. What is $\text{Var}[Z]$?
* $\text{Var}[Z]=E[Z^2]-(E[Z])^2$
$E[Z^2]=\displaystyle\sum_{n=1}^{\infty}(\sqrt{n})^2\dfrac{6}{(\pi n)^2}=\sum_{n=1}^{\infty}\dfrac{6}{\pi^2 n}=\infty$
So $\text{Var}[Z]$ does not exist.
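The divergence can be seen numerically (a sketch): the terms of $E[Z^2]$ form a harmonic series, so the partial sums grow without bound.

```python
import math

# Terms of E[Z^2]: z_n^2 * p_Z(z_n) = n * 6/(pi n)^2 = 6/(pi^2 n), a harmonic series
partial = lambda N: sum(6 / (math.pi**2 * n) for n in range(1, N + 1))
print(partial(10**3), partial(10**6))  # keeps growing like (6/pi^2) * ln N
```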
* **Thm** If $E[|X|^{n+1}]<\infty$, then $E[|X|^n]<\infty$
* pf
* $E[|X|^n]=\displaystyle\sum_{x\in S}|x|^np_{_X}(x)=\sum_{\substack{x:|x|\le 1\\x\in S}}|x|^np_{_X}(x)+\sum_{\substack{x:|x|>1\\x\in S}}|x|^np_{_X}(x)$
$\le 1+\displaystyle\sum_{\substack{x:|x|>1\\x\in S}}|x|^{n+1}p_{_X}(x)<\infty$
### $E[X]$ and $\text{Var}[X]$ of Special Discrete R.V.
* Bernoulli R.V.: $X\sim$ Bernoulli$(p)$
* $E[X]=p$
* $\text{Var}[X]=p(1-p)$
* Binomial R.V.: $X\sim$ Binomial$(n,p)$
* $E[X]=np$
* $E[X]=\displaystyle\sum_{k=0}^{n}k\cdot\dbinom{n}{k}p^{k}(1-p)^{n-k}=\sum_{k=0}^{n}k\cdot\dfrac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k}$
$=np\displaystyle\sum_{k=1}^{n}\dfrac{(n-1)!}{(k-1)!(n-k)!}p^{k-1}(1-p)^{n-k}=np$
* $\dfrac{(n-1)!}{(k-1)!(n-k)!}p^{k-1}(1-p)^{n-k}$ is the PMF of Binomial$(n-1,p)$ evaluated at $k-1$, so the sum is 1
* $\text{Var}[X]=np(1-p)$
* $E[X^2]=\displaystyle\sum_{k=0}^{n}k^2\cdot\dfrac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k}$ (using $k^2=k(k-1)+k$)
$=\displaystyle\sum_{k=1}^{n}(k-1)\cdot\dfrac{n!}{(k-1)!(n-k)!}p^{k}(1-p)^{n-k}$
$+\displaystyle\sum_{k=1}^{n}\dfrac{n!}{(k-1)!(n-k)!}p^{k}(1-p)^{n-k}$
$=n(n-1)p^2\displaystyle\sum_{k=2}^{n}\dbinom{n-2}{k-2}p^{k-2}(1-p)^{n-k}+np=n(n-1)p^2+np$
$\text{Var}[X]=E[X^2]-(E[X])^2=n(n-1)p^2+np-(np)^2=np(1-p)$
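These closed forms can be verified by exact summation over the binomial PMF (a sketch with hypothetical parameters $n=10$, $p=0.3$):

```python
import math

n, p = 10, 0.3
pmf = lambda k: math.comb(n, k) * p**k * (1 - p)**(n - k)

mean = sum(k * pmf(k) for k in range(n + 1))          # should equal np
second = sum(k * k * pmf(k) for k in range(n + 1))    # E[X^2]
var = second - mean**2                                # should equal np(1-p)
print(mean, var)  # ~ 3.0 and ~ 2.1
```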
### Continuous Random Variables
#### Probability Density Function (PDF)
* **Def** (PDF) Let $X$ be a random variable. Then, $f_{_X}(x)$ is the PDF of $X$ if for every subset $B$ of the real line, we have $P(X\in B)=\displaystyle\int_Bf_{_X}(x)dx$
* $P(X\in\mathbb{R})=1$
* $P(X\le t)=\displaystyle\int_{-\infty}^tf_{_X}(x)dx$
* $P(a\le X\le b)=\displaystyle\int_a^bf_{_X}(x)dx$
* $P(a\le X<b)=\displaystyle\int_a^bf_{_X}(x)dx-P(X=b)=\displaystyle\int_a^bf_{_X}(x)dx$, since $P(X=b)=0$ for a continuous R.V.
* Check valid (3 axioms of probability)
* $P(X\in\mathbb{R})=1\Rightarrow \displaystyle\int_{-\infty}^{\infty}f_{_X}(x)dx=1$
* $P(X\in A)\ge 0\Rightarrow\displaystyle\int_Af_{_X}(x)dx\ge 0$
* $P(X\in\displaystyle\bigcup_{i\ge 1}A_i)=\sum_{i\ge 1}P(X\in A_i)\Rightarrow\int_{\cup A_i}f_{_X}(x)dx=\sum_{i\ge 1}\int_{A_i}f_{_X}(x)dx$ (for disjoint $A_i$)
* CDF & PDF
* Let $X$ be a random variable with a CDF $F_X(\cdot)$ and a PDF $f_{_X}(\cdot)$
* PDF -> CDF: $F_X(t)=P(X\le t)=\displaystyle\int_{-\infty}^tf_{_X}(x)dx$
* CDF -> PDF: $f_{_X}(x_0)=F_X'(x_0)$ when $f_{_X}(\cdot)$ is continuous at $x_0$
* like Fundamental Thm. of Calculus
* Given CDF, the derived PDF is not unique
#### Uniform Random Variables
* **Def** A random variable $X$ is uniform with parameters $a,b\ (a<b)$ if its PDF is $f_{_X}(x)=\begin{cases}
\frac{1}{b-a},\ a<x<b \\
0,\ \text{otherwise}
\end{cases}$
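A quick check that this PDF assigns the expected probabilities (a sketch using a simple midpoint Riemann sum, with hypothetical parameters $a=2$, $b=5$):

```python
a, b = 2.0, 5.0
f = lambda x: 1 / (b - a) if a < x < b else 0.0   # Uniform(a, b) PDF

def integrate(lo, hi, steps=100_000):
    # midpoint Riemann sum of f over [lo, hi]
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

print(integrate(a, b))          # total mass: ~ 1.0
print(integrate(a, (a + b) / 2))  # P(a <= X <= midpoint): ~ 0.5
```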
* ex
* Let $X$ be a random variable with CDF $F(t)$. Define another random variable $Y=F(X)$. What type of random variable is $Y$?
* sol
* $F_Y(t)=P(Y\le t)=P(F(X)\le t)$
Assuming $F$ is continuous and strictly increasing, $P(F(X)\le t)=P(X\le F^{-1}(t))=F(F^{-1}(t))=t$ for $0<t<1$
So $P(F(X)\le t)=\begin{cases}
1,\ t\ge 1 \\
t,\ 0<t<1 \\
0,\ t\le 0 \end{cases}$
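A Monte Carlo sketch of this fact, with an exponential CDF assumed for $F$: the transformed samples $F(X)$ behave like Unif(0,1).

```python
import math, random

random.seed(0)
lam = 2.0
F = lambda x: 1 - math.exp(-lam * x)   # CDF of Exp(lam)

# sample X ~ Exp(lam) and transform: Y = F(X) should be ~ Unif(0, 1)
ys = [F(random.expovariate(lam)) for _ in range(100_000)]
print(sum(ys) / len(ys))                          # ~ 0.5, the Unif(0,1) mean
print(sum(1 for y in ys if y <= 0.25) / len(ys))  # ~ 0.25 = P(Y <= 0.25)
```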
* Goal: given a distribution with a known CDF, how do we sample from it? (Inverse Transform Sampling, ITS)
* pf
* Let $U\sim\text{Unif}(0,1)$ and define $X:=F^{-1}(U)$. The CDF of the generated sample is $P(X\le t)=P(F^{-1}(U)\le t)=P(F(F^{-1}(U))\le F(t))$
$=P(U\le F(t))=F(t)$
* Suppose $X$ is a continuous random variable with CDF $F_{X}$. Then the random variable $Y=F_{X}(X)\sim \text{Unif}(0,1)$. ITS runs this process in reverse: first draw a number $u$ uniformly at random from $(0,1)$; then, since $F_{X}^{-1}(Y)$ has the same distribution as $X$, $x=F_{X}^{-1}(u)$ can be viewed as a random sample generated from the distribution $F_{X}$. [Ref](https://zh.wikipedia.org/zh-tw/%E9%80%86%E5%8F%98%E6%8D%A2%E9%87%87%E6%A0%B7)
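A minimal sketch of ITS, assuming the target distribution is Exp($\lambda$) so that the inverse CDF has the closed form $F^{-1}(u)=-\ln(1-u)/\lambda$:

```python
import math, random

random.seed(1)
lam = 2.0
F_inv = lambda u: -math.log(1 - u) / lam   # inverse of F(x) = 1 - e^{-lam x}

# draw u ~ Unif(0,1) and return x = F^{-1}(u): a sample from Exp(lam)
samples = [F_inv(random.random()) for _ in range(100_000)]
print(sum(samples) / len(samples))  # ~ 1/lam = 0.5, the Exp(lam) mean
```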