# Probability - 2 (Expectation ~ Conti R.V.)

### Expectation

* ex.
    * Suppose $X\sim\text{Poisson}(\lambda=15,T=1)$. What is the PMF of $X$? How do we define the expected value of $X$?
    * sol
        * $p_{_X}(x)=\begin{cases} \dfrac{e^{-\lambda T}(\lambda T)^x}{x!},\ x=0,1,\dots \\ 0,\ \text{otherwise} \end{cases}$
          $E(X)=\displaystyle\sum_{x=0}^{\infty}x\dfrac{e^{-\lambda T}(\lambda T)^x}{x!}=\displaystyle\sum_{x=1}^{\infty}\dfrac{e^{-\lambda T}(\lambda T)^x}{(x-1)!}=\lambda T\displaystyle\sum_{x=1}^{\infty}\dfrac{e^{-\lambda T}(\lambda T)^{x-1}}{(x-1)!}=\lambda T$
          $\displaystyle\sum_{x=1}^{\infty}\dfrac{e^{-\lambda T}(\lambda T)^{x-1}}{(x-1)!}=1$ since it is the sum of the PMF of a Poisson R.V., and probabilities sum to 1
* **Def** (Expected Value / Mean / Expectation)
  $E[X]:=\displaystyle\sum_{x\in S}x\cdot p_{_X}(x)$
    * $S$: the set of possible values
    * another notation: $\mu_{_X}=E[X]$
* **Def** (Expectation using CDF)
  $E[X]=\displaystyle\sum_{i=1}^{\infty}(x_i-x_{i-1})\cdot (1-F_X(x_{i-1}))$
    * Denote $F_X(x_0)=0$
    * ![](https://hackmd.io/_uploads/BkvetvTGp.png)
* ex.
    * Suppose $X$ is a discrete R.V. with the set of possible values $A=\{2,4,\dots\}$. The CDF is $F_{X}(t)=1-\dfrac{1}{t^2}$. What is $E[X]$?
    * sol
        * With $x_i=2i$ (and $x_0=0$), $1-F_X(x_{i-1})=\dfrac{1}{(2(i-1))^2}$ for $i\ge 2$, so
          $E[X]=\displaystyle\sum_{i=1}^{\infty}(x_i-x_{i-1})(1-F_X(x_{i-1}))=2\cdot 1+\displaystyle\sum_{i=2}^{\infty}2\cdot\dfrac{1}{4(i-1)^2}=2+\dfrac{\pi^2}{12}$
* **Thm** (Expectation of a Function of a R.V.)
  Let $X$ be a discrete R.V. with the set of possible values $S$ and PMF $p_{_X}(x)$, and let $g(\cdot)$ be a real-valued func.
  Then $E[g(X)]=\displaystyle\sum_{x\in S}g(x)p_{_X}(x)$
    * Also called the Law of the unconscious statistician (LOTUS)
    > **pf**
    > Define a R.V. $Y:=g(X)$
    > $E[g(X)]=E[Y]=\displaystyle\sum_{\substack{y:y=g(x) \\ x\in S}}y\cdot p_{_Y}(y)=\sum_{\substack{y:y=g(x) \\ x\in S}}y\sum_{x:g(x)=y}p_{_X}(x)$
    > $=\displaystyle\sum_{\substack{(x,y):g(x)=y\\x\in S}}yp_{_X}(x)=\sum_{x\in S}g(x)p_{_X}(x)$
* **Prop** $E[g(X)+h(X)]=E[g(X)]+E[h(X)]$
    > **pf**
    > $E[g(X)+h(X)]=\sum(g(x)+h(x))p_{_X}(x)=\sum g(x)p_{_X}(x)+\sum h(x)p_{_X}(x)$
    > $=E[g(X)]+E[h(X)]$

### Moment and Variance

* moments

    | function | expectation |
    | --- | --- |
    | $g(X)=X^2$ | 2-nd moment |
    | $g(X)=X^n$ | n-th moment |
    | $g(X)=(X-\mu_{_X})^2$ | 2-nd central moment |
    | $g(X)=(X-\mu_{_X})^n$ | n-th central moment |
    | $g(X)=e^{tX}$ | moment generating func. |
* **Def** (Variance)
  $\text{Var}[X]:=E[(X-\mu_{_X})^2]$
    * also denoted as $\sigma_{_X}^2$
* ex
    * Suppose we are given a random variable $X$. We need to output a prediction of $X$ (denoted by $z$). The penalty of the prediction is $(X-z)^2$. What is the minimum expected penalty?
    * sol
        * $g(z):=E[(X-z)^2]=E[X^2-2zX+z^2]=E[X^2]-2zE[X]+z^2$
          $=(z-E[X])^2+E[X^2]-(E[X])^2\ge E[X^2]-(E[X])^2=\text{Var}(X)$
        * Variance = minimum expected quadratic penalty (attained at $z=E[X]$)
* **Thm** $\text{Var}[X]=E[X^2]-(E[X])^2$
    > **pf**
    > $\text{Var}[X]=E[(X-\mu_{_X})^2]=E[X^2]-2\mu_{_X}E[X]+\mu_{_{X}}^2$
    > $=E[X^2]-2(E[X])^2+(E[X])^2=E[X^2]-(E[X])^2$
* **Prop**
    * $\text{Var}[X+c]=\text{Var}[X]$
    * $\text{Var}[aX]=a^2\text{Var}[X]$
    * $E[X^2]\ge (E[X])^2$
    > **pf** $\text{Var}[X]=E[X^2]-(E[X])^2\ge 0$
* **Def** (Existence)
  Let $X$ be a random variable. Then, the n-th moment of $X$ (i.e. $E[X^n]$) is said to exist if $E[|X|^n]<\infty$
* ex.
    * Let $z_n=(-1)^n\sqrt{n}$, $n=1,2,\dots$. Let $Z$ be a random variable with the set of possible values $\{z_n:n=1,2,\dots\}$ and the PMF $p_{_Z}(z)$ given by $p_{_Z}(z_n)=\dfrac{6}{(\pi n)^2}$. What is $\text{Var}[Z]$?
    * sol
        * $\text{Var}[Z]=E[Z^2]-(E[Z])^2$
          $E[Z^2]=\displaystyle\sum_{n=1}^{\infty}(\sqrt{n})^2\dfrac{6}{(\pi n)^2}=\sum_{n=1}^{\infty}\dfrac{6}{\pi^2 n}=\infty$
          So $\text{Var}[Z]$ DNE.
* **Thm** If $E[|X|^{n+1}]<\infty$, then $E[|X|^n]<\infty$
    * pf
        * $E[|X|^n]=\displaystyle\sum_{x\in S}|x|^np_{_X}(x)=\sum_{\substack{x:|x|\le 1\\x\in S}}|x|^np_{_X}(x)+\sum_{\substack{x:|x|>1\\x\in S}}|x|^np_{_X}(x)$
          $\le 1+\displaystyle\sum_{\substack{x:|x|>1\\x\in S}}|x|^{n+1}p_{_X}(x)<\infty$

### $E[X]$ and $\text{Var}[X]$ of Special Discrete R.V.

* Bernoulli R.V.: $X\sim$ Bernoulli$(p)$
    * $E[X]=p$
    * $\text{Var}[X]=p(1-p)$
* Binomial R.V.: $X\sim$ Binomial$(n,p)$
    * $E[X]=np$
        * $E[X]=\displaystyle\sum_{k=0}^{n}k\cdot\dbinom{n}{k}p^{k}(1-p)^{n-k}=\sum_{k=0}^{n}k\cdot\dfrac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k}$
          $=np\displaystyle\sum_{k=1}^{n}\dfrac{(n-1)!}{(k-1)!(n-k)!}p^{k-1}(1-p)^{n-k}=np$
        * $\dfrac{(n-1)!}{(k-1)!(n-k)!}p^{k-1}(1-p)^{n-k}$ is the PMF of a Binomial$(n-1,p)$ R.V., so the sum is 1
    * $\text{Var}[X]=np(1-p)$
        * $E[X^2]=\displaystyle\sum_{k=0}^{n}k^2\cdot\dfrac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k}$
          $=\displaystyle\sum_{k=1}^{n}(k-1)\cdot\dfrac{n!}{(k-1)!(n-k)!}p^{k}(1-p)^{n-k}+\displaystyle\sum_{k=1}^{n}\dfrac{n!}{(k-1)!(n-k)!}p^{k}(1-p)^{n-k}$
          $=n(n-1)p^2\displaystyle\sum_{k=2}^{n}\dbinom{n-2}{k-2}p^{k-2}(1-p)^{n-k}+np=n(n-1)p^2+np$
          $\text{Var}[X]=E[X^2]-(E[X])^2=n(n-1)p^2+np-(np)^2=np(1-p)$

### Continuous Random Variables

#### Probability Density Function (PDF)

* **Def** (PDF)
  Let $X$ be a random variable.
  Then, $f_{_X}(x)$ is the PDF of $X$ if for every subset $B$ of the real line, we have $P(X\in B)=\displaystyle\int_Bf_{_X}(x)dx$
    * $P(X\in\mathbb{R})=1$
    * $P(X\le t)=\displaystyle\int_{-\infty}^tf_{_X}(x)dx$
    * $P(a\le X\le b)=\displaystyle\int_a^bf_{_X}(x)dx$
    * $P(a\le X<b)=\displaystyle\int_a^bf_{_X}(x)dx-P(X=b)=\displaystyle\int_a^bf_{_X}(x)dx$ since $P(X=b)=0$ for a continuous R.V.
* Check validity (3 axioms of probability)
    * $P(X\in\mathbb{R})=1\Rightarrow \displaystyle\int_{-\infty}^{\infty}f_{_X}(x)dx=1$
    * $P(X\in A)\ge 0\Rightarrow\displaystyle\int_Af_{_X}(x)dx\ge 0$
    * For disjoint $A_i$: $P(X\in\displaystyle\bigcup_{i\ge 1}A_i)=\sum_{i\ge 1}P(X\in A_i)\Rightarrow\int_{\cup A_i}f_{_X}(x)dx=\sum_{i\ge 1}\int_{A_i}f_{_X}(x)dx$
* CDF & PDF
    * Let $X$ be a random variable with a CDF $F_X(\cdot)$ and a PDF $f_{_X}(\cdot)$
    * PDF -> CDF: $F_X(t)=P(X\le t)=\displaystyle\int_{-\infty}^tf_{_X}(x)dx$
    * CDF -> PDF: $f_{_X}(x_0)=F_X'(x_0)$ when $f_{_X}(\cdot)$ is continuous at $x_0$
        * like the Fundamental Thm. of Calculus
    * Given a CDF, the derived PDF is not unique (changing a PDF at finitely many points does not change any probability)
    * ![](https://hackmd.io/_uploads/S1qQFJFMa.png)

#### Uniform Random Variables

* **Def**
  A random variable $X$ is uniform with parameters $a,b\ (a<b)$ if its PDF is
  $f_{_X}(x)=\begin{cases} \frac{1}{b-a},\ a<x<b \\ 0,\ \text{otherwise} \end{cases}$
* ex
    * Let $X$ be a random variable with a continuous, strictly increasing CDF $F(t)$. Define another random variable $Y=F(X)$. What type of random variable is $Y$?
    * sol
        * $F_Y(t)=P(Y\le t)=P(F(X)\le t)$
          To simplify, let $t=0.5$.
          Then we are looking for the probability that $F(X)\le 0.5$, i.e. that $X$ lies at or below its median, which is just $0.5$
          So $P(F(X)\le t)=\begin{cases} 1,\ t\ge 1 \\ t,\ 0<t<1 \\ 0,\ t\le 0 \end{cases}$, i.e. $Y\sim\text{Unif}(0,1)$
    * ![](https://hackmd.io/_uploads/Hky7JlFGa.png)
        * Goal: given a distribution with a known CDF, how do we sample from it?
        * pf
            * The CDF we generate is $P(X\le t)=P(F^{-1}(U)\le t)=P(F(F^{-1}(U))\le F(t))$
              $=P(U\le F(t))=F(t)$
        * Suppose $X$ is a continuous random variable with CDF $F_{X}$. Then the random variable $Y=F_{X}(X)\sim \text{Unif}(0,1)$. Inverse transform sampling (ITS) runs this process in reverse: first, for the random variable $Y$, draw a number $u$ uniformly at random from $(0,1)$. Then, since the random variable $F_{X}^{-1}(Y)$ has the same distribution as $X$, $x=F_{X}^{-1}(u)$ can be regarded as a random sample generated from the distribution $F_{X}$. [Ref](https://zh.wikipedia.org/zh-tw/%E9%80%86%E5%8F%98%E6%8D%A2%E9%87%87%E6%A0%B7)
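The Poisson expectation from the opening example ($E[X]=\lambda T$) can be sanity-checked by summing $x\cdot p_{_X}(x)$ directly; a minimal sketch, where truncating the infinite sum at $x=100$ is an assumption that leaves a negligible tail for $\lambda T=15$:

```python
import math

lam, T = 15.0, 1.0  # parameters from the example

def poisson_pmf(x: int) -> float:
    # p_X(x) = e^{-lambda T} (lambda T)^x / x!
    return math.exp(-lam * T) * (lam * T) ** x / math.factorial(x)

# E[X] = sum_x x * p_X(x); the tail beyond x = 100 is negligible here.
expectation = sum(x * poisson_pmf(x) for x in range(100))
assert abs(expectation - lam * T) < 1e-9  # matches E[X] = lambda * T
```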
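The "variance = minimum expected quadratic penalty" argument can be made concrete with a small discrete R.V.; a sketch using a fair six-sided die (the die is an assumed illustration, not from the notes):

```python
# A fair six-sided die as the (assumed) discrete R.V.
values = [1, 2, 3, 4, 5, 6]
pmf = {x: 1 / 6 for x in values}

mean = sum(x * p for x, p in pmf.items())              # E[X] = 3.5
second_moment = sum(x**2 * p for x, p in pmf.items())  # E[X^2]
variance = second_moment - mean**2                     # Var[X] = E[X^2] - (E[X])^2

def expected_penalty(z: float) -> float:
    """g(z) = E[(X - z)^2], the expected penalty of predicting z."""
    return sum((x - z) ** 2 * p for x, p in pmf.items())

# The minimum over z is attained at z = E[X], with minimum value Var[X].
candidates = [z / 10 for z in range(0, 71)]  # grid over [0, 7]
best = min(candidates, key=expected_penalty)
assert abs(best - mean) < 1e-9
assert abs(expected_penalty(mean) - variance) < 1e-9
```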
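The closed forms $E[X]=np$ and $\text{Var}[X]=np(1-p)$ for the Binomial R.V. can be checked by evaluating the sums in the derivation directly; $n=20$, $p=0.3$ are assumed illustration values:

```python
from math import comb

n, p = 20, 0.3  # assumed parameters for illustration

def binom_pmf(k: int) -> float:
    # C(n, k) p^k (1-p)^{n-k}
    return comb(n, k) * p**k * (1 - p) ** (n - k)

mean = sum(k * binom_pmf(k) for k in range(n + 1))            # E[X]
second_moment = sum(k**2 * binom_pmf(k) for k in range(n + 1))  # E[X^2]
variance = second_moment - mean**2

# Compare against the closed forms derived in the notes.
assert abs(mean - n * p) < 1e-9
assert abs(variance - n * p * (1 - p)) < 1e-9
```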
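The PDF -> CDF relation $F_X(t)=\int_{-\infty}^{t}f_{_X}(x)dx$ can be demonstrated by numerically integrating the Uniform PDF; a sketch with assumed parameters $a=2$, $b=5$ and a midpoint Riemann sum:

```python
a, b = 2.0, 5.0  # assumed Unif(a, b) parameters

def pdf(x: float) -> float:
    # f_X(x) = 1/(b-a) on (a, b), 0 otherwise
    return 1 / (b - a) if a < x < b else 0.0

def cdf(t: float, steps: int = 100_000) -> float:
    """Midpoint Riemann-sum approximation of the integral of the PDF up to t."""
    if t <= a:  # the PDF vanishes below a, so the integral is 0 there
        return 0.0
    dx = (t - a) / steps
    return sum(pdf(a + (i + 0.5) * dx) for i in range(steps)) * dx

# For Unif(a, b) the exact CDF is (t - a) / (b - a) on (a, b), and 1 at t = b.
assert abs(cdf(3.5) - (3.5 - a) / (b - a)) < 1e-6
assert abs(cdf(b) - 1.0) < 1e-6
```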
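The inverse transform sampling recipe above can be sketched end to end for a distribution whose inverse CDF has a closed form. Assuming $X\sim\text{Exp}(\text{rate})$ (an illustration choice, not from the notes), $F(t)=1-e^{-\text{rate}\cdot t}$ and $F^{-1}(u)=-\ln(1-u)/\text{rate}$:

```python
import math
import random

def sample_exponential(rate: float) -> float:
    """Draw one sample from Exp(rate) via inverse transform sampling."""
    u = random.random()             # step 1: u ~ Unif(0, 1)
    return -math.log(1 - u) / rate  # step 2: x = F^{-1}(u)

random.seed(0)  # fixed seed so the check is reproducible
rate = 2.0
samples = [sample_exponential(rate) for _ in range(200_000)]

# Sanity checks: E[X] = 1/rate, and the empirical CDF should match F(t).
mean = sum(samples) / len(samples)
assert abs(mean - 1 / rate) < 0.01
t = 1.0
empirical = sum(x <= t for x in samples) / len(samples)
assert abs(empirical - (1 - math.exp(-rate * t))) < 0.01
```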