
# A.2.5 Expectation

## Definition of expectation

The expectation, also called the expected value or the mean, is the same expected value taught in secondary school. The expectation of $X$ is written ==$E[X]$== and is ++the average value of $X$ over a very large number of repeated experiments++.

![image](https://hackmd.io/_uploads/Bkp7LrNER.png)

> As with several earlier definitions, use an integral if $X$ is continuous and a sum $\sum$ if $X$ is discrete.
> - Expectation is a weighted average in which the weights are the probabilities of occurrence, i.e. $P(x_i)$ and $p(x)$ in the formula above. For example, a fair six-sided die has $E[X] = (1+2+\dots+6)/6 = 3.5$.

To compute the expectation of $g(\cdot)$, where $g(\cdot)$ is a real-valued function:

![image](https://hackmd.io/_uploads/r1MjcvV4A.png)

> This simply uses a function to map $x$ to another value first, then does exactly the same thing with the mapped result.

## Properties of expectation

![image](https://hackmd.io/_uploads/S1W1OBVVA.png)

> That is, the expected value is linear.

:::success
If $X=c$, where $c\in \mathbb{R}$, then $E[X] = c$
:::

> Going one step further, if the expected value of a random variable $X$ is well-defined (that is, $E[X]$ equals some constant), then:
>
> $E[E[X]] = E[X]$
>> This is intuitive if you think of the expected value as an average: if the average of a random variable is well-defined, it must be some constant, and taking the expectation of a constant (averaging it once more) still gives that constant.

:::success
expectation is ++linear++ $\Rightarrow$ the $\sum$ can be pulled outside
:::

![image](https://hackmd.io/_uploads/ryxqm_SV0.png)

> To see why, expand the sum into $E[a_1X_1+\dots+a_NX_N]$.
>
> By linearity of expectation, this splits into $E[a_1X_1]+\dots+E[a_NX_N]$.
>
> Again by linearity, pull out each $a_i$ and collect the terms back into a $\sum$ to get the result above.

:::success
If $X,Y$ are random variables and $X,Y$ are ++independent++, then $E[XY] = E[X]E[Y]$
:::

## Moment

Next, we define a special function:

:::info
$g(x) = x^n$
:::

The expectation of this function, $E[g(X)] = E[X^n]$, is called the nth moment of $X$. The full definition is:

:::info
If $n \in \mathbb{Z}^+$ and
\begin{equation}
E(X^n) = \sum_{x \in S}x^nf(x)
\end{equation}
is finite, then it is called the ++nth moment++ of the distribution ++about the origin++
:::

If the distribution is centered not at the origin but at some value $b$, we can define, analogously:

:::info
If $n \in \mathbb{Z}^+$ and
\begin{equation}
E[(X-b)^n] = \sum_{x \in S}(x-b)^nf(x)
\end{equation}
is finite, then it is called the ++nth moment++ of the distribution ++about $b$++
:::

The expectation is then computed (depending on whether the random variable is discrete or continuous) by the formulas below:

![image](https://hackmd.io/_uploads/rJr9swEEA.png)

- If the function being described is a probability distribution, its first moment about the origin ($n=1$) is the mean ==$\mu$==, that is, $\mu \equiv E(X)$.
- Likewise there are 2nd moments, 3rd moments, and so on. The 2nd moment about the mean ($b=\mu$) is the variance, covered in the next subsection.

> The different moments of a function are obtained by raising the variable to different powers, possibly after subtracting some central value (as in the variance of the next subsection), and weighting by the function. Each moment describes the graph of the function from a different angle.
>
> For a probability distribution supported on a bounded interval, the collection of all moments (of every order, from $0$ to $\infty$) uniquely determines the distribution.

## LOTUS (law of the unconscious statistician)

Above, we used the special function $g(x) = x^n$ to define the nth moment. In the more general case, ++$g(x)$ can be any real-valued function++ (i.e. $g:\mathbb{R} \rightarrow \mathbb{R}$). For a discrete or a continuous random variable $X$, suppose the corresponding condition below holds:

- ==discrete==:

:::info
If $X$ has pmf $P(X=x)$ and the support of $X$ is $\{x_1,...,x_n\}$,
$g(x)$ must be defined and summable on $\{x_1,...,x_n\}$
:::

> review:
> - pmf: probability mass function
> - support: the values that $X$ maps outcomes to (all values in the range whose probability is nonzero).

- ==continuous==:

:::info
If $X$ has pdf $f(x)$ and the support of $X$ is $[a,b]$,
$g(x)$ must be defined and integrable on $[a,b]$
:::

Then:

- ==discrete==:

:::success
\begin{equation}
E[g(X)] = \sum_{x:P(X=x)>0}g(x)P(X=x)
\end{equation}
:::

- ==continuous==:

:::success
\begin{equation}
E[g(X)] = \int_{-\infty}^{\infty}g(x)f(x)dx
\end{equation}
:::

Note: these results also generalize to multiple random variables by replacing the distribution with the joint distribution. For details, see the LOTUS wiki page listed under References.

# References

- wiki: [Moment (mathematics)](https://en.wikipedia.org/wiki/Moment_(mathematics)#Significance_of_the_moments)
- wiki: [Law of the unconscious statistician](https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician)
- Hogg, Tanis, Zimmerman, *Probability and Statistical Inference*, 9th ed. (2015), pp. 59-60
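
# Supplementary sketch: checking the formulas numerically

The discrete formulas in this section (expectation as a probability-weighted sum, LOTUS, moments about the origin and about $b$, linearity, and the product rule for independent variables) can be checked on a small example. The sketch below is only an illustration and is not part of the original notes: the fair six-sided die and the helper names `expectation` and `moment` are assumptions introduced here. Exact fractions are used so the equalities hold without rounding error.

```python
from fractions import Fraction

# Hypothetical example distribution (an assumption, not from the notes):
# a fair six-sided die, stored as a pmf {value: probability}.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete X via LOTUS: sum of g(x) * P(X = x) over the support."""
    return sum(g(x) * p for x, p in pmf.items() if p > 0)

def moment(pmf, n, b=0):
    """n-th moment about b, i.e. E[(X - b)^n]; b = 0 gives the moment about the origin."""
    return expectation(pmf, lambda x: (x - b) ** n)

mean = expectation(pmf)             # first moment about the origin: 7/2
second = moment(pmf, 2)             # second moment about the origin: 91/6
variance = moment(pmf, 2, b=mean)   # second moment about the mean: 35/12

# Linearity: E[aX + c] should equal a * E[X] + c.
a, c = 3, 2
lhs = expectation(pmf, lambda x: a * x + c)
rhs = a * mean + c

# Independence: for two independent dice the joint pmf factorizes,
# and E[XY] should equal E[X] * E[Y].
joint = {(x, y): px * py for x, px in pmf.items() for y, py in pmf.items()}
e_xy = sum(x * y * p for (x, y), p in joint.items())

print(mean, second, variance)   # 7/2 91/6 35/12
print(lhs == rhs)               # True
print(e_xy == mean * mean)      # True
```

Passing `g` as an argument mirrors the LOTUS statement: the pmf of $g(X)$ is never computed, only the pmf of $X$ is used.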