楊志璿
Statistics Lab Notes
===

# Variables

$\hat p$ = sample proportion
$\mu$ = population mean = measure of central tendency
$\sigma$ = population standard deviation = measure of dispersion
$p$ = population proportion

---

1. nominal scale (categorical scale)
    * e.g. names
2. ordinal scale (rank scale)
    * e.g. rankings: 1, 2, 3
3. interval scale (equal-interval scale; the ratio scale is a separate level with a true zero)
    * e.g. temperature: 10 -> 20 -> 30...

https://www.myclass-lin.org/wordpress/archives/615

---

1. Qualitative Data: non-numerical (categorical) data
2. Quantitative Data: numerical data
    * discrete random variables
    * continuous random variables

---

## Random Variable

Given a sample space $(S,{\mathbb {F}})$, a real-valued function $X:S \to {\mathbb {R}}$ on it that is $\mathbb{F}$-measurable is called a (real-valued) random variable.

A random variable is a **measurable function** ${\displaystyle X\colon \Omega \to E}$ from a set of possible outcomes $\Omega$ to a measurable space $E$.

## Variance

* Algebraic property

$\sigma^2={1 \over N}\sum_{i=1}^N(X_i-\mu)^2$
Expanding the square and rearranging gives
$\sum X_i^2=N\sigma^2+N\mu^2$
This can also be stated as "$\sigma^2$ = expectation of the square minus the square of the expectation":
$\sigma^2=\sum x^2 \cdot f(x)-\mu^2$
The sample variance behaves the same way:
$\sum x_i^2=(n-1)s^2+n \cdot \bar x^2$

* Translation invariance
    * Shifting every value by a constant leaves the variance unchanged.
    * The derivation is short; work it out yourself.
* Quadratic scaling
    * Scaling every value by $a$ multiplies the variance by $a^2$.
    * Original data: $X_1,X_2,X_3,\dots,X_N$
    * Let $Y_i=aX_i$
    * Then $\mu_Y=a\mu_X$
    * Substitute $aX_i$ into the standard deviation formula for $Y$ and factor out $a$
    * This gives $\sigma_Y=|a|\sigma_X$, so $\sigma_Y^2=a^2\sigma_X^2$

## Covariance

$\sigma_{X,Y}=Cov(X,Y)$
$=\sum_y\sum_x(x-\mu_X)(y-\mu_Y)f(x,y)$
$=E((X-\mu_X)(Y-\mu_Y))$  (definition)
$=E(XY-\mu_X \cdot Y-\mu_Y \cdot X+\mu_X\mu_Y)$
$=E(XY)-\mu_X \cdot E(Y) - \mu_Y \cdot E(X)+E(\mu_X \mu_Y)$
$=E(XY)-\mu_X \mu_Y$
$=E(XY)-E(X)E(Y)$  (computational formula)

**To prove on your own:**
$Var(aX+bY)=a^2Var(X)+b^2Var(Y)+2ab \cdot Cov(X,Y)$

* Properties:
    * $Cov(X,a)=0$ for any constant $a$
    * $Cov(X,Y)=Cov(Y,X)$
    * $Cov(X,X)=Var(X)=\sigma^2_X$
    * $Cov(X+d,Y+c)=Cov(X,Y)$
    * $Cov(aX,bY)=a \cdot b\cdot Cov(X,Y)$
    * E.g. $Cov(-2X-5,3Y-7)=(-2)(3)Cov(X,Y)=-6Cov(X,Y)$

## Correlation Coefficient

https://zh.wikipedia.org/wiki/%E7%9A%AE%E5%B0%94%E9%80%8A%E7%A7%AF%E7%9F%A9%E7%9B%B8%E5%85%B3%E7%B3%BB%E6%95%B0

$\rho_{X,Y}={\mathrm{cov}(X,Y) \over \sigma_X \sigma_Y} ={E[(X-\mu_X)(Y-\mu_Y)] \over \sigma_X\sigma_Y}$

Perfect positive correlation: $\rho_{X,Y}=1$
Positive correlation: covariance > 0
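
As a quick numerical check of the covariance identities and the correlation formula above, here is a minimal Python sketch. The data values and the helper names (`mean`, `covariance`, `correlation`) are made up for illustration and do not come from the notes; the helpers use the population (divide by $N$) formulas.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance E[(X - mu_X)(Y - mu_Y)] with equal weights 1/N."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Population correlation rho = Cov(X, Y) / (sigma_X * sigma_Y)."""
    sigma_x = math.sqrt(covariance(xs, xs))  # Cov(X, X) = Var(X)
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Hypothetical data, invented for this example only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = covariance(xs, ys)

# Computational formula: Cov(X, Y) = E(XY) - E(X)E(Y)
shortcut = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Shift/scale property: Cov(-2X - 5, 3Y - 7) = -6 Cov(X, Y)
scaled = covariance([-2 * x - 5 for x in xs], [3 * y - 7 for y in ys])

rho = correlation(xs, ys)
print(cov_xy, shortcut, scaled, rho)
```

Both ways of computing the covariance agree, the shifted and scaled pair gives $-6$ times the original covariance, and the positive covariance yields a positive $\rho$.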
Reference for negative correlation: [Phillips curve](https://wiki.mbalib.com/zh-tw/%E8%8F%B2%E5%88%A9%E6%B5%A6%E6%96%AF%E6%9B%B2%E7%BA%BF)

Population correlation coefficient: $\rho_{X,Y}=Corr(X,Y)$
Population covariance: $\sigma_{X,Y}=Cov(X,Y)$
Sample covariance: $\hat{S}_{x,y}={1 \over {n-1}}\sum_{i=1}^n(x_i-\bar x)(y_i-\bar y)$
Sample correlation coefficient: $\hat{r}_{x,y}$
The goal is to infer the population quantities from the sample.

$S_{xy}=\sum_{i=1}^n x_iy_i- n\bar x\bar y$
$S_{xx}=\sum_{i=1}^n x_i^2- n(\bar x)^2$, i.e. $\sum(x_i- \bar x)^2$
$S_{yy}=\sum_{i=1}^n y_i^2- n(\bar y)^2$, i.e. $\sum(y_i- \bar y)^2$
$\hat{r}_{x,y}={S_{xy} \over \sqrt{S_{xx} S_{yy}}}$

![Reference](https://i.imgur.com/DQDxbLN.jpg)

Sample standard deviation: $s_x=\sqrt{S_{xx} \over n-1}$

## Chebyshev's Theorem

https://zh.wikipedia.org/wiki/%E5%88%87%E6%AF%94%E9%9B%AA%E5%A4%AB%E4%B8%8D%E7%AD%89%E5%BC%8F

$P(|X- \mu| \lt z \sigma) \ge 1 - {1 \over z^2}$

### Proof

By Markov's inequality, for a nonnegative random variable $Y$ and $a > 0$ we have $P(Y \ge a) \le {E(Y) \over a}$.
Take $Y = (X-\mu)^2$ with threshold $a^2$:
$\Rightarrow P((X-\mu)^2 \ge a^2) \le {E((X-\mu)^2) \over a^2}$
$\Rightarrow P(|X-\mu| \ge a) \le {Var(X) \over a^2} = {\sigma^2 \over a^2}$
Replacing $a$ by $a\sigma$:
$\Rightarrow P( |X- \mu| \ge a \sigma) \le {1 \over a^2}$
That is Chebyshev's Theorem!

## Probability Review

e.g.

||NTU|NSYSU|NCCU|(count)|
|---|---|---|---|---|
|Male|30|66|234|330|
|Female|18|42|210|270|
||48|108|444|600|

Contingency table

||NTU|NSYSU|NCCU|Probability|
|---|---|---|---|---|
|Male|0.05|0.11|0.39|0.55|
|Female|0.03|0.07|0.35|0.45|
|Probability|0.08|0.18|0.74|1|

Marginal probability: in a sample space with two or more events, the probability of one event occurring on its own, ignoring the others, is called a marginal probability.
In the table these are the rightmost column and the bottom row.

Independent events: review on your own.

$P(A|M)$: read as "the probability of $A$ conditional on $M$"

:::info
For discrete distributions, pay attention to whether the inequality includes the equal sign when computing probabilities.
:::

Axioms:
* $\int_x f(x)\,dx=1$ (a sum in the discrete case)
* $0\le P(A) \le 1$, $\forall A \subset \Omega$
* $P(\Omega)=1$
* Let $A_{1},A_{2},\dots$ be events in the sample space $\Omega$ with $A_{i}\cap A_{j} = \emptyset$ for all $i \ne j$; then $P(\cup_{i=1}^{\infty}A_{i})=\sum_{i=1}^{\infty}P(A_{i})$.

Bayes' theorem:
Let $A_1,A_2,\dots,A_n$ be a partition of $\Omega$ and let $B$ be any event on $\Omega$; then $P(A_i|B)=\frac{P(B|A_i)P(A_i)}{\sum_{i=1}^{n}P(B|A_i)P(A_i)}$

## Expectation

$E(X)=\mu$
$Var(X)=\sigma^2 = E[(X-\mu)^2]$

## Distributions

r.v.
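$X$, $X \sim B(n,p)$
$\sim$: "is distributed as" (follows)
$f_X(x)=\begin{cases}C^{n}_{x}p^x(1-p)^{n-x}, & x \in \{0,1,\dots,n\} \\ 0, & \text{otherwise}\end{cases}$
$p$: probability of success
Binomial distribution: when $n = 1$ it is the Bernoulli distribution.

## Probability Function

Let $X$ be a discrete r.v.; then $f_X(x)=\begin{cases}P(X=x), & x\in R_X \\ 0, & x \notin R_X\end{cases}$ where $R_X$ is the range of $X$
* $R_X=\{X(\omega) \mid \omega\in \Omega\}$
* $X:\Omega\to\mathbb{R}$

$f_{XY}(x,y)=\begin{cases}P(X=x,Y=y), & (x,y)\in R_{XY} \\ 0, & (x,y)\notin R_{XY}\end{cases}$

The instructor prefers this convention: whatever goes inside $P(\cdot)$ must describe a complete event, so write things like $P(Z<z)$ or $f(x)$.
* class P(Event);
* class f(var);

$f(z) \ne P(Z<z)$
$f(z)$ is the probability density at a single point
$P(Z<z)$ is the probability of an event

# Distribution

:::danger
Of the distributions covered here, the Poisson and normal are the ones singled out for closure: a sum of independent Poisson (or normal) variables is again Poisson (or normal).
:::

## Discrete

### Bernoulli distribution

$$P(x) = p^xq^{1-x}$$

1. Run a single success/failure trial and let $x$ be the number of successes
2. $R_x = \{0,1\}$
3. Parameter: $0 \le p \le 1$
4. $X \sim Ber(p)$

### Binomial distribution

:::info
iid: independent and identically distributed
:::

:::success
**Definition**
The discrete probability distribution of the number of successes in $n$ independent yes/no trials, each with success probability $p$, is the binomial distribution.
:::

$$P(x) = {n\choose x} p^x q^{n-x}$$

1. Repeat a Bernoulli trial $n$ times
2. $R_x = \{0, 1, \dots, n\}$
3. Additivity of Bernoulli variables
    * if $x, y \sim^{iid} Ber(p)$, then $x + y \sim B(2, p)$
4. For large $n$ the binomial behaves like a discrete version of the normal distribution
5. $E(x) = np$
6. $Var(x) = npq$

### Poisson distribution (closed under addition)

$$P(x) ={e^{- \lambda } \lambda^x \over x!}$$

:::success
**Definition**
A discrete random variable X is said to have a Poisson distribution with parameter λ > 0 if, for x = 0, 1, 2, ..., the probability mass function of X is given by:
$$P(x) ={e^{- \lambda } \lambda^x \over x!}$$
:::

1. Events occurring continuously over a unit of time, length, area, or volume form a Poisson process
    * A Poisson process must be homogeneous and independent
2. $R_x = \mathbb{N} \cup \{0\}$
3. $\lambda$ is the expected number of occurrences of the event
4. $\lambda = E(X) = Var(X)$

### Hypergeometric

:::success
**Definition**
1. The result of each draw (the elements of the population being sampled) can be classified into one of two mutually exclusive categories.
2. The probability of a success changes on each draw, as each draw decreases the population.
:::

$$P(x) = {{k \choose x}{N-k \choose n-x} \over {N \choose n}}$$

1. $E(x)=n{k \over N}$
2. Draw $n$ items without replacement from a population of $N$ that contains $k$ successes
3. $Var(x)=n({k \over N})(1-{k \over N})({N-n \over N-1})$
4. Correction factor: ${N-n \over N-1}$
   Because the population is finite, each draw affects the next (the variance shrinks); this is called the finite population correction factor.
5. $R_x = \{0, 1, 2, \dots, n\}$

---

## Continuous

### Normal (closed under addition)

$$f(x) = {1 \over \sqrt{2 \pi} \sigma} e^{-{1 \over 2}({x- \mu \over \sigma})^2}$$

:::success
**Definition**
The distribution of a continuous variable, drawn as a curve of how likely each observed value is, with the following properties:
It is a symmetric, unimodal, bell-shaped curve centered on the mean.
Observed values range from negative infinity to positive infinity.
:::

1. $X \sim N(\mu, \sigma^2)$
2. The density is hard to integrate, so look probabilities up in a table
    * Every normal distribution has its own $\mu$ and $\sigma$, so how can one table serve them all?
    * Fix a standard normal distribution: $Z \sim N(0,1)$
    * Standard Normal Probability Distribution
    * $f(z) = {1 \over \sqrt{2 \pi}} e^{-{1 \over 2}z^2}$
3. Computing Probabilities for Any Normal Probability Distribution
    * Standardize
    * If $X \sim N(\mu, \sigma^2)$, let $Z = {X-\mu \over \sigma} \sim N(0,1)$
4. A linear transformation of a normal variable is still normal
    * For $aX+b$: the mean becomes $a\mu+b$, the standard deviation is multiplied by $|a|$, and the variance by $a^2$
    * $E(\bar x) = \mu$
    * $Var(\bar x) = {\sigma^2 \over n}$
5. De-standardize
    * $Z \sim N(0,1)$
    * Let $X = \sigma Z + \mu$

### Normal Approximation of Binomial Probabilities

* Yates continuity correction:
  widen each boundary by ±0.5 so that the interval encloses the discrete values

## Exponential probability distribution

$$f(x) = {1 \over \mu} e^{-x \over \mu}$$

https://zh.wikipedia.org/zh-tw/%E6%8C%87%E6%95%B0%E5%88%86%E5%B8%83

:::success
Let $\tau$ be a random variable whose probability density satisfies
$f_\tau(t):=\lambda e^{-\lambda t}, \text{ if } t \ge 0;$
$f_\tau(t):=0, \text{ if } t \lt 0$
where $\lambda>0$ is a constant. Then we say $\tau$ follows an exponential distribution, or that $\tau$ is an exponential random variable.
:::

$E(x) = {\int}^{\infty}_0 x{1 \over \mu} e^{-x \over \mu} dx =\mu$
$Var(x) = \mu^2$
Both follow from integration by parts.

Formula: $P(x>x_0)=e^{-x_0 \over \mu}$
Proof: $P(x>x_0)=\int_{x_0}^{\infty}{1 \over \mu}e^{-x \over \mu}dx=e^{-x_0 \over \mu}$

A counting process is a Poisson process $\iff$ its inter-arrival times follow an exponential distribution.
The exponential $\mu$ and the Poisson $\lambda$ are reciprocals of each other ($\mu = 1/\lambda$).
Mind the units; sticking to one standard unit makes mistakes less likely.

:::info
e.g.
Poisson: ${e^{- \lambda } \lambda^x \over x!}$ $\iff$ Exponential: ${\lambda} e^{-\lambda y}$
:::

# Sampling and Sampling Distributions

## Definition

The distribution of a sample statistic is called a sampling distribution.

## Sampling

* Finite population
    * hypergeometric, sampling w/o replacement, dependent
    * draws are not returned to the population
* Infinite population
    * binomial, sampling w/ replacement, independent

## Statistical Inference

* Estimation
* Testing

We mainly want to estimate three things: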
Mean, standard deviation, and proportion.
These are called the statistical parameters.

e.g.
$X_1, X_2, \dots, X_n$
$\bar{x} = {1 \over n} \sum X_i$
$Var(\bar x) = Var( {1 \over n} \sum X_i) = {\sigma^2 \over n}$

## Point Estimation

Key point: ==$\bar x$ is very useful==
$x_1, x_2, \dots, x_n \sim^{iid} f_{x_i}(x_i, \theta)$
Use $\hat \theta$ to infer the population parameter $\theta$.
An estimate and an estimator are different things: the estimator is the rule (a random variable), the estimate is the number it produces, and there are infinitely many possible estimators.
A hat marks an estimator.

### Unbiasedness

$Bias(\hat \theta) = E(\hat \theta) - \theta = 0$
* Overestimating estimator: $Bias(\hat\theta)>0 \iff E(\hat\theta)>\theta$
* Unbiased estimator: $Bias(\hat\theta)=0 \iff E(\hat\theta)=\theta$
* Underestimating estimator: $Bias(\hat\theta)<0 \iff E(\hat\theta)<\theta$

Proof that $E(s^2)-\sigma^2 =0$:
$E(s^2) = E({1 \over n-1} (\sum x^2_i - n{\bar x}^2))$
$= {1 \over n-1} (\sum E(x^2_i)-nE({\bar x}^2))$
$= {1 \over n-1} (\sum(Var(x_i)+E^2(x_i))-nE({\bar x}^2))$
$= {1 \over n-1} (\sum(\sigma^2+\mu^2)-nE({\bar x}^2))$
$= {1 \over n-1} (\sum(\sigma^2+\mu^2)-n({\sigma^2 \over n}+\mu^2))$
$= {1 \over n-1} (n\sigma^2+n\mu^2-\sigma^2-n\mu^2)$
$= {1 \over n-1} (n\sigma^2-\sigma^2)$
$=\sigma^2$
Writing the steps in reverse order also works.

### Efficiency

Efficiency is measured by the estimator's mean squared error; the smaller it is, the more efficient the estimator.
==sum of least squares==
[Wiki](https://en.wikipedia.org/wiki/Least_squares)

### Consistency

As the sample size grows, the estimate converges to the true value of the population parameter.
A **consistent estimator** is one for which, when the estimate is considered as a random variable indexed by the number n of items in the data set, **as n increases** the estimates **converge** in probability to the value that the estimator is designed to estimate.

## Interval Estimation

Confidence interval (C.I.)
$[L,U]$ estimates $\theta$ at the $(1-\alpha)100\%$ confidence level.
A higher confidence level $(1-\alpha)100\%$ means a wider interval $[L, U]$, which is more likely to contain the true population parameter $\theta$.
==$(1-\alpha)$ is the area in the middle==
$1-\alpha = P(L \lt \theta \lt U)$

### Pivotal Quantity

A pivotal quantity involves
1. random variables
2. the unknown parameter to be estimated

https://en.wikipedia.org/wiki/Pivotal_quantity

>[name=wiki]A pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters.
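
To illustrate why such a quantity is called a pivot, the sketch below builds two samples from the same standard normal draws but with different population parameters, then evaluates $T = (\bar x - \mu)/(s/\sqrt n)$ for each. The parameter values, the seed, and the helper name `t_pivot` are assumptions chosen for this example, not part of the notes.

```python
import math
import random

def t_pivot(sample, mu):
    """Pivot T = (x_bar - mu) / (s / sqrt(n)) for a sample and the true mean mu."""
    n = len(sample)
    x_bar = sum(sample) / n
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # sample variance
    return (x_bar - mu) / math.sqrt(s2 / n)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
z = [rng.gauss(0.0, 1.0) for _ in range(10)]  # standard normal draws

# Two hypothetical populations built from the same underlying draws.
sample_a = [5.0 + 2.0 * zi for zi in z]     # N(5, 2^2)
sample_b = [-30.0 + 0.5 * zi for zi in z]   # N(-30, 0.5^2)

t_a = t_pivot(sample_a, mu=5.0)
t_b = t_pivot(sample_b, mu=-30.0)
t_z = t_pivot(z, mu=0.0)

# All three agree up to floating-point rounding.
print(t_a, t_b, t_z)
```

Because shifting and rescaling the data cancels out inside $T$, its distribution cannot depend on $\mu$ or $\sigma$, which is the property the definition above asks for.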
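:::success
Usually the t or z distribution of the point estimator
:::

A function of $x_1, x_2, \dots, x_n$ and $\theta$,
written $Q(\hat \theta; \theta)$, whose probability distribution does not depend on any unknown parameter (that is, it is fully known).
$g(\hat \theta ,\theta) = \sqrt{n}\frac{\hat \theta - \theta}{s}$

#### Finding the $(1-\alpha)100\%$ confidence interval for $\theta$

1. Choose a suitable estimator
2. Find a suitable pivotal quantity and its probability distribution
    * the distribution of the point estimator
3. $1-\alpha = P(L \lt \theta \lt U)$
    * $1-\alpha = P(-k \lt g(\hat \theta ,\theta) \lt k) = P({\hat \theta}-k{s \over \sqrt n} \lt \theta \lt {\hat \theta}+k{s \over \sqrt n})$
    * $k$ comes from a table
    * Margin of error: $E = {\sigma \over \sqrt n}{z_{\alpha \over 2}}$

Why does the t distribution have $n-1$ degrees of freedom?
> Because the t pivot has only one unknown parameter to estimate ($\mu$),
> and estimating it with $\bar x$ when computing $s$ uses up one degree of freedom, the degrees of freedom are $n-1$.
> When $\sigma$ is known, the pivotal quantity is z.

When the degrees of freedom are large, the z table can be used as an approximation to the t table.

### Interval Estimation of the Variance

http://mail.tku.edu.tw/yinghaur/lee/stat-new/%E7%AC%AC%E5%8D%81%E7%AB%A0%E8%A3%9C%E5%85%85--%E7%B5%B1%E8%A8%88%E4%BC%B0%E8%A8%88(%E6%AF%8D%E9%AB%94%E8%AE%8A%E7%95%B0%E6%95%B8%E4%B9%8B%E5%8D%80%E9%96%93%E4%BC%B0%E8%A8%88).pdf

#### Meaning of a Confidence Interval

Repeat the experiment $k$ times; on average about $(1-\alpha)k$ of the resulting intervals contain the unknown parameter.
* Notation:
    * $0.95 = P({\bar x}-{\sigma \over \sqrt n}z_{\alpha \over 2} \le \mu \le {\bar x}+{\sigma \over \sqrt n}z_{\alpha \over 2})$

### Confidence Interval for a Sample Proportion

#### Interval estimation of a single population proportion

$X_1, X_2, \dots, X_n \sim^{iid} Ber(p)$
1. Point estimate: $\hat p \Rightarrow p$
2. $\hat p \Rightarrow^a_{CLT} N(p, \sigma^2_{\hat p})$
    * $z = {{\hat p - p} \over \sqrt{\hat p (1- \hat p) \over n}}$
    * $a$ stands for asymptotic
    * asymptotically normal by the central limit theorem
3. $1-\alpha = P(|\hat p - p| \lt z_{\alpha \over 2}SE(\hat p))$
    * SE = standard error

Margin of error = $z_{\alpha \over 2}\sqrt{\hat p(1- \hat p) \over n}$

## Hypothesis Testing

* Let the sample data speak
* Power: the power of a test measures how effective the test is
    * e.g.
        * the left figure has high power, the right figure has low power
        * ![img](https://i.imgur.com/UVX5oJE.jpg)

||Presumption of guilt|Presumption of innocence|
|----|---|---|
|H0|Guilty|Not guilty|
|Ha|Not guilty (bears the burden of proof)|Guilty|

||H0 true|H0 false|
|----|---|---|
|Reject|$\alpha$ (type I error)|$1-\beta$|
|Do not reject|$1-\alpha$|$\beta$ (type II error)|

If the problem does not specify $\alpha$, use 0.05 by convention.

### p-value

**The tail probability of the observed sample value**
A p-value is a probability that provides a measure of the evidence against the null hypothesis provided by the sample. Smaller p-values indicate more evidence against $H_0$.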
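
The tail probability described here can be sketched for a z test statistic as follows. This is a minimal illustration that assumes the statistic is standard normal under $H_0$; the function names `standard_normal_cdf` and `p_value` and the example statistic are hypothetical, and the CDF is computed with `math.erf` instead of a printed table.

```python
import math

def standard_normal_cdf(z):
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_value(z, tail="two-sided"):
    """Tail probability of an observed z statistic under H0."""
    if tail == "right":
        return 1.0 - standard_normal_cdf(z)
    if tail == "left":
        return standard_normal_cdf(z)
    if tail == "two-sided":
        # P(|Z| >= |z|): the upper tail beyond |z| plus the lower tail beyond -|z|
        return 2.0 * (1.0 - standard_normal_cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two-sided'")

alpha = 0.05
z_observed = 2.3  # hypothetical observed test statistic
p = p_value(z_observed)
reject_h0 = p <= alpha
print(p, reject_h0)
```

For a two-tailed test the two tail areas are equal by symmetry, so the function doubles one of them.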
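>[name=魏丞偉]Remove the absolute value from the test statistic: if the test statistic is $x$, then $|x| > a \Rightarrow x > a$ or $x < -a$. Look up the tail probabilities above $a$ and below $-a$ in the table; their sum is the p-value.

### Three Equivalent Approaches to Hypothesis Testing

1. Critical value approach
    * Test statistic
2. p-value approach
    * the tail probability of the observed sample value
    * for a two-tailed test, add the probabilities of both tails
3. Interval estimation approach
    * start from $\bar x$ and compute the confidence interval

> All three must reach the same conclusion

## Population Variance Unknown

The sample variance has to be computed from the data, so use the t distribution.
* Assume the population is normal

1. State $H_0$
2. Choose $\alpha$
3. Test statistic
    * $T = {{\bar x - \mu_0} \over {s \over {\sqrt{n}}}} \sim t(n-1)$

## Definition of Student's t Distribution

$T_\nu = {Z \over \sqrt{\chi^2_\nu / \nu}} \sim t(\nu)$
$Z$ is a standard normal random variable
$\nu$ is the degrees of freedom
$\chi^2_\nu$ is a chi-square random variable with $\nu$ degrees of freedom, independent of $Z$

## Required Sample Size

One-tailed test
$\mu_0-{\sigma \over \sqrt{n}}z_\alpha = \mu_a+{\sigma \over \sqrt{n}}z_\beta$
> The left-tailed and right-tailed cases are interchangeable, so the left-tailed test is shown; the calculation is the same.
> Therefore $n={\sigma^2(z_\alpha+z_\beta)^2 \over (\mu_0 - \mu_a)^2}$

Note that $\alpha$ may need to be divided by 2 here for a two-tailed test.
Intuition: the cutoff computed from $\alpha$ must coincide with the cutoff computed from $\beta$, and **that cutoff determines what $\alpha$ has to be**.

# Tests for Two Independent Populations

## Case I: normal populations, $\sigma_1^2 , \sigma_2^2$ both known

:::info
Recall:
$\bar x_1 - \bar x_2 \to \mu_1 - \mu_2$
$a\bar x_1 - b\bar x_2 \sim N(a\mu_1 - b\mu_2, {(a\sigma_1)^2\over n_1} + {(b\sigma_2)^2\over n_2})$
Likewise
$Var(aX+bY)=a^2Var(X)+b^2Var(Y)+2ab \cdot Cov(X,Y)$
:::

Then follow the same pattern and plug the variables in:
$\sigma = \sqrt{{(a\sigma_1)^2\over n_1} + {(b\sigma_2)^2\over n_2}}$
I personally call this coSigma.

### Tip

In a hypothesis test a constant needs to sit on the right-hand side (wording to be improved), so move the variables to the left-hand side before testing.

:::success
$H_0: \mu_0 > \mu_1$
$\to \mu_0 - \mu_1 > 0$
:::

### Power

$power = 1- \beta$

## Case II: normal populations, variances both unknown

**==Use the t distribution==**

### Equal variances (homogeneous)

Homogeneous variance assumption: $\sigma_1 = \sigma_2$
$S_p^2 = \hat\sigma^2 = {{(n_1 - 1)S_1^2+(n_2 - 1)S_2^2} \over {n_1 + n_2 - 2}}$
Substituting this in gives the test statistic
$TS = {{(\bar x_1 - \bar x_2)-(\mu_1 - \mu_2)} \over \sqrt{S^2_p({1 \over n_1}+{1 \over n_2})}}$
Degrees of freedom: $n_1 + n_2 - 2$

### Unequal variances

Test statistic
$TS = {{(\bar x_1 - \bar x_2)-(\mu_1 - \mu_2)} \over \sqrt{{{s_1^2}\over n_1}+{{s_2^2}\over n_2}}}$
Degrees of freedom (rounded down to an integer):
$df = {({s_1^2 \over n_1}+{s_2^2 \over n_2})^2 \over {1 \over n_1-1}({s_1^2 \over n_1})^2+{1 \over n_2-1}({s_2^2 \over n_2})^2}$

# Tests for Two Dependent Normal Populations

(Paired samples) dependent populations
**Sample matched,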
pair!**
e.g. treatment group vs. control group
Sample size: $n$
$d_k = {x_1}_k - {x_2}_k$
$\bar D = {\sum d_k \over n}$
$S_D^2 = {\sum(d_k- \bar D)^2 \over n-1}$
$H_0: \mu_D = C$

### Follows the t distribution

$T = {{\bar D - \mu_D} \over {S_D \over \sqrt{n}}} \sim t(n-1)$

# Tests for Two Independent Population Proportions

$\bar p_1 - \bar p_2 \sim N(p_1-p_2, {p_1q_1 \over n_1}+{p_2q_2 \over n_2})$
Since $p_1$ and $p_2$ are unknown, the variance uses ${\bar p_1}$ and ${\bar p_2}$ in their place.
If $H_0: p_1 = p_2 = p$
> $p = {{n_1 \bar p_1 + n_2 \bar p_2} \over {n_1 + n_2}}$
>
> $\sigma = \sqrt{pq({1 \over n_1}+{1 \over n_2})}$

# Tests for a Population Variance

Chi-square
Symbol: ${\chi}^2$
Derivation:
$s^2 = {1 \over n-1}\sum(x_i- \bar x)^2$
$\Rightarrow (n-1) s^2 = \sum(x_i- \bar x)^2$
$\Rightarrow {(n-1) s^2 \over \sigma^2} = {\sum(x_i- \bar x)^2 \over \sigma^2} \sim {\chi}^2_{(n-1)}$, which is distributed as a sum $Z^2_1+Z^2_2+ \dots +Z^2_{n-1}$ of $n-1$ independent squared standard normals

:::warning
The chi-square family is not closed under scaling!
$c \cdot {\chi}^2 \notin {\chi}^2$ for any constant $c \in R$, $c \ne 1$

---

$E(\chi^2) = df$: the expectation of a chi-square variable equals its degrees of freedom
$Var(\chi^2) = 2df$: the variance of a chi-square variable equals twice its degrees of freedom
:::

Test statistic:
$TS = {(n-1)s^2 \over \sigma^2_0} \sim {\chi}^2_{(n-1)}$

Because
$\chi^2_{1-{\alpha \over 2}} \le {(n-1)s^2 \over \sigma^2} \le \chi^2_{\alpha \over 2}$
$\Rightarrow {(n-1)s^2 \over \chi^2_{\alpha \over 2}} \le \sigma^2 \le {(n-1)s^2 \over \chi^2_{1-{\alpha \over 2}}}$ (just rearranging)
we can say this interval contains $\sigma^2$ with $(1-\alpha)100\%$ confidence.

## Test for the Variances of Two Independent Populations

==**F-distribution**==

:::danger
Required conditions:
1. independent
2. two Normal populations
3. equal variances (under $H_0$)
:::

### F distribution

$X \sim F({df}_1, {df}_2)$
${df}_1 = n_1 - 1$
${df}_2 = n_2 - 1$
An F-distributed random variable is the ratio of two chi-square variables, each divided by its degrees of freedom:
${U_1/d_1 \over U_2/d_2} = {U_1/U_2 \over d_1/d_2}$
where $U_1 \sim \chi^2_{d_1}$ and $U_2 \sim \chi^2_{d_2}$ are independent, with degrees of freedom $d_1, d_2$.

Test statistic:
$TS = {s^2_1 \over s^2_2}$
Put the larger sample variance in the numerator;
this guarantees the test statistic falls in the right tail.

# Comparing Multiple Population Proportions

## Test for Equality of Multiple Population Proportions

Chi-square distribution
Test statistic:
$\chi^2 = \sum_i\sum_j{(f_{ij}-e_{ij})^2 \over e_{ij}} \sim \chi^2_{(r-1)(c-1)}$
$f_{ij}$ = observed frequency
$e_{ij}$ = expected frequency under $H_0$, ==$\forall e_{ij} \ge 5$==
$r$ = number of rows
$c$ = number of columns

#### Rejection rule

1. p-value approach: Reject $H_0$ if p-value $\le \alpha$
2. Critical value approach: Reject $H_0$ if $\chi^2 \ge \chi^2_\alpha$

### Critical values for the Marascuilo pairwise comparison procedure for k population proportions

$CV_{ij} = \sqrt{\chi^2_{\alpha}}\sqrt{{\bar p_i \bar q_i \over n_i}+{\bar p_j \bar q_j \over n_j}}$
where
$\chi^2_\alpha$ is the chi-square critical value at significance level $\alpha$ with $k - 1$ degrees of freedom
$\bar p_i$ and $\bar p_j$ are the sample proportions for populations $i$, $j$
$n_i$ and $n_j$ are the sample sizes for populations $i$, $j$

:::info
Reject (the difference is significant) if:
$|{\bar p_i - \bar p_j}| \gt CV_{ij}$
:::

### Test of independence

Use the previous formula to judge whether the $\chi^2$ statistic is significant.
$H_0$: Assumes that there is no association between the two variables.
$H_a$: Assumes that there is an **association** between the two variables.

### Goodness of Fit test

Test statistic:
$\chi^2_{(k-1)} = \sum^k_{i=1}{(f_i - e_i)^2 \over e_i}$
$f_i$ is the observed frequency
$e_i$ is the expected frequency, $\forall e_i \ge 5$
$k$ is the number of categories

### Testing for a Normal distribution

Use the **goodness of fit test** to check whether the data follow a normal distribution.

:::success
Divide by 5: split the $n$ observations into ${\lfloor}{n \over 5}{\rfloor}$ slices.
:::

The expected count of each slice is its $e_i$.
![Imgur](https://imgur.com/C4xPn9v.png)
Then test against $\chi^2_{({\lfloor}{n \over 5}{\rfloor} -3)}$.

:::info
Why -3?

---

Because the degrees of freedom are $k - p -1$,
where $p$ is the **number of parameters** of the **distribution** estimated from the sample.
The **Normal** distribution has 2 parameters.
Hence $k-p-1 = k-3$
:::
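
The goodness-of-fit statistic and the $k - p - 1$ degrees-of-freedom rule above can be sketched in a few lines of Python. The observed counts are invented for illustration, and the helper names `chi_square_statistic` and `goodness_of_fit_df` are hypothetical; looking up the critical value $\chi^2_\alpha$ is left to a table.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_i - e_i)^2 / e_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e < 5 for e in expected):
        # Mirrors the rule in the notes that every expected count is at least 5.
        raise ValueError("every expected count should be at least 5")
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

def goodness_of_fit_df(k, estimated_params=0):
    """Degrees of freedom k - p - 1, where p parameters are estimated from the sample."""
    return k - estimated_params - 1

# Hypothetical counts: 100 observations across 5 categories, uniform under H0.
observed = [18, 22, 20, 25, 15]
expected = [20.0, 20.0, 20.0, 20.0, 20.0]

statistic = chi_square_statistic(observed, expected)
df_uniform = goodness_of_fit_df(len(observed))                     # no parameters estimated
df_normal = goodness_of_fit_df(len(observed), estimated_params=2)  # mu and sigma estimated

print(statistic, df_uniform, df_normal)
```

With the two normal parameters estimated from the sample, the same five categories leave $5 - 2 - 1 = 2$ degrees of freedom, matching the $k - 3$ rule.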
