---
title: Stat09
tags: Stat
---

[Home](https://hackmd.io/y_1O1ws1TQe8VRI7bxQbeg)

# Chapter 9 Large-sample Tests of Hypotheses

## 9.3 Two population means

### Key ingredients (B1)

- Model: $X_{1i}\stackrel{i.i.d.}{\sim} F(\mu_1,\sigma_1^2)$ for $i=1,\ldots,n_1$ and $X_{2j}\stackrel{i.i.d.}{\sim} F(\mu_2,\sigma_2^2)$ for $j=1,\ldots,n_2$.
- We are interested in understanding $(\mu_1-\mu_2)$.
- We use the difference between the two sample means $(\bar{X}_1-\bar{X}_2)$ to estimate $(\mu_1-\mu_2)$, where $\bar{X}_1=\frac{1}{n_1}\sum_{i=1}^{n_1}X_{1i}$ and $\bar{X}_2=\frac{1}{n_2}\sum_{j=1}^{n_2}X_{2j}$.
- By the CLT, we have
  $$\frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{ \frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}\stackrel{d}{\rightarrow}Z,$$
  where $Z\sim N(0,1)$.
- When $\sigma_1$ and $\sigma_2$ are unknown, we use the sample variances,
  $$s_1^2 = \frac{1}{n_1-1}\sum_{i=1}^{n_1}(X_{1i}-\bar{X}_1)^2, \quad s_2^2 = \frac{1}{n_2-1}\sum_{j=1}^{n_2}(X_{2j}-\bar{X}_2)^2.$$
- By the advanced CLT (Slutsky's theorem), we have
  $$\frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{ \frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\stackrel{d}{\rightarrow}Z.$$

### $H_0: \mu_1-\mu_2 = D_0$

Under $H_0$, the test statistic and its large-sample distribution are
$$Z_{STAT} = \frac{(\bar{X}_1-\bar{X}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\stackrel{d}{\rightarrow}Z.$$

### Example: $H_0$: $\mu_1-\mu_2=0$ vs $H_a$: $\mu_1-\mu_2\neq 0$.

![](https://i.imgur.com/MCxJg12.png)

![](https://i.imgur.com/KPu0hbe.png)

## 9.4 Population proportion

### Key ingredients (A2)

- Model: $X \sim \text{Binomial}(n,p)$.
- We use the sample proportion $\hat{p} = \frac{X}{n}$ to estimate the population proportion $p$.
- By the CLT, we have
  $$\frac{\hat{p}-p}{\sqrt{p(1-p)/n}}\stackrel{d}{\rightarrow}Z.$$

### $H_0: p = p_0$

For $H_0: p = p_0$, the test statistic and its large-sample distribution are
$$Z_{STAT}=\frac{\hat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}\stackrel{d}{\rightarrow}Z.$$

### The two-sided test

1. $H_0: p = p_0$ versus $H_a: p \neq p_0$.
2. Set the significance level $\alpha$.
3. Use the test statistic $Z_{STAT} = \frac{\hat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}\stackrel{d}{\rightarrow}Z$.
4. Calculate the realized statistic $Z^*$ from the data.
5. Find either (a) the rejection region $\{z: z<-z_{\alpha/2} \text{ or } z>z_{\alpha/2}\}$ or (b) the $p$-value $= 2P(Z>|Z^*|)$.
6. Conclude.

### The left-sided test

1. One of the following:
    - $H_0: p = p_0$ versus $H_a: p < p_0$
    - $H_0: p \geq p_0$ versus $H_a: p < p_0$
    - $H_0: p > p_0$ versus $H_a: p \leq p_0$
2. Set the significance level $\alpha$.
3. Use the test statistic $Z_{STAT} = \frac{\hat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}\stackrel{d}{\rightarrow}Z$.
4. Calculate the realized statistic $Z^*$ from the data.
5. Find either (a) the rejection region $\{z: z<-z_{\alpha}\}$ or (b) the $p$-value $= P(Z<Z^*)$.
6. Conclude.

### The right-sided test

1. One of the following:
    - $H_0: p = p_0$ versus $H_a: p > p_0$
    - $H_0: p \leq p_0$ versus $H_a: p > p_0$
    - $H_0: p < p_0$ versus $H_a: p \geq p_0$
2. Set the significance level $\alpha$.
3. Use the test statistic $Z_{STAT} = \frac{\hat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}\stackrel{d}{\rightarrow}Z$.
4. Calculate the realized statistic $Z^*$ from the data.
5. Find either (a) the rejection region $\{z: z>z_{\alpha}\}$ or (b) the $p$-value $= P(Z>Z^*)$.
6. Conclude.

### Example: $H_0$: $p=0.2$ vs $H_a$: $p< 0.2$.

Bernoulli($p=0.20$)

![](https://i.imgur.com/4J2OfwE.png)

![](https://i.imgur.com/N2JtaHc.png)

## 9.5 Two population proportions

### Key ingredients (B2)

- Model: $X_{1}\sim \text{Binomial}(n_1,p_1)$ and $X_{2}\sim \text{Binomial}(n_2, p_2)$.
- We are interested in understanding $(p_1-p_2)$.
- Define
  $$\hat{p}_1=\frac{X_1}{n_1}, \quad \hat{p}_2=\frac{X_2}{n_2}.$$
  We use $(\hat{p}_1-\hat{p}_2)$ to estimate $(p_1-p_2)$.
- By the CLT, we have
  $$\frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{\sqrt{ \frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}}\stackrel{d}{\rightarrow}Z.$$
- When $p_1$ and $p_2$ are unknown, assuming $H_0$ is true, i.e., $p_1=p_2=p$, we define
  $$\hat{p}=\frac{X_1+X_2}{n_1+n_2}$$
  as a pooled estimate of $p$.
- Applying the advanced CLT (Slutsky's theorem), under $H_0$ we have
  $$\frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{\sqrt{\hat{p}(1-\hat{p})\left( \frac{1}{n_1}+\frac{1}{n_2}\right)}}\stackrel{d}{\rightarrow}Z.$$

### $H_0: p_1-p_2 = 0$

For $H_0: p_1-p_2 = 0$, the test statistic and its large-sample distribution are
$$Z_{STAT}=\frac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}(1-\hat{p})\left( \frac{1}{n_1}+\frac{1}{n_2}\right)}}\stackrel{d}{\rightarrow}Z,$$
where $\hat{p}_1=\frac{X_1}{n_1}$, $\hat{p}_2=\frac{X_2}{n_2}$, and $\hat{p}=\frac{X_1+X_2}{n_1+n_2}$.

### Example: $H_0$: $p_1-p_2=0$ vs $H_a$: $p_1-p_2\neq 0$.

![](https://i.imgur.com/1jlfDZs.png)

![](https://i.imgur.com/45MFHqf.png)
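As a rough illustration of the Section 9.3 statistic $Z_{STAT}$ for $H_0: \mu_1-\mu_2 = D_0$, the following minimal Python sketch computes the realized statistic and a two-sided $p$-value. The function name `two_sample_z_test` and the sample values are hypothetical and are not taken from the examples in these notes. It assumes Python 3.8+ for `statistics.NormalDist`.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_z_test(x1, x2, d0=0.0):
    """Large-sample z test of H0: mu1 - mu2 = d0 against a two-sided alternative.

    Hypothetical helper for illustration; returns (z, p_value).
    """
    n1, n2 = len(x1), len(x2)
    # statistics.variance uses the (n - 1) divisor, matching s_1^2 and s_2^2.
    s1_sq, s2_sq = variance(x1), variance(x2)
    # Estimated standard error of (X1_bar - X2_bar).
    se = math.sqrt(s1_sq / n1 + s2_sq / n2)
    z = (mean(x1) - mean(x2) - d0) / se
    # Two-sided p-value: 2 * P(Z > |z|) for Z ~ N(0, 1).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up toy data, far too small for the CLT to be trusted in practice.
z, p = two_sample_z_test([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

The normal approximation is only reasonable when both $n_1$ and $n_2$ are large, so the toy input above shows the arithmetic only.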
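The six-step procedures of Section 9.4 (two-sided, left-sided, right-sided) can be sketched in one small Python function. This is an illustrative sketch, not the code used to produce the figures. The names `one_proportion_z_test` and `rejects`, and the labels `"two-sided"`, `"less"`, `"greater"`, are assumptions introduced here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Z ~ N(0, 1)


def one_proportion_z_test(x, n, p0, alternative="two-sided"):
    """Large-sample z test of H0: p = p0; returns (z, p_value)."""
    p_hat = x / n
    # The standard error uses p0 (not p_hat), as in Z_STAT.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    if alternative == "two-sided":
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))  # 2 P(Z > |Z*|)
    elif alternative == "less":
        p_value = STD_NORMAL.cdf(z)                 # P(Z < Z*)
    elif alternative == "greater":
        p_value = 1 - STD_NORMAL.cdf(z)             # P(Z > Z*)
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    return z, p_value


def rejects(z, alpha, alternative="two-sided"):
    """Rejection-region version of the decision (step 5a)."""
    if alternative == "two-sided":
        return abs(z) > STD_NORMAL.inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    if alternative == "less":
        return z < -STD_NORMAL.inv_cdf(1 - alpha)          # z < -z_alpha
    if alternative == "greater":
        return z > STD_NORMAL.inv_cdf(1 - alpha)           # z > z_alpha
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")


# Hypothetical counts: 10 successes in 100 trials, testing H0: p = 0.2 vs Ha: p < 0.2.
z, p = one_proportion_z_test(10, 100, 0.2, alternative="less")
print(f"z = {z:.3f}, p-value = {p:.4f}, reject at 5%: {rejects(z, 0.05, 'less')}")
```

The rejection-region rule and the $p$-value rule always lead to the same decision at a given $\alpha$. The sketch exposes both only to mirror options (a) and (b) in step 5.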
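For the pooled statistic of Section 9.5, a minimal Python sketch along the same lines might look as follows. The function name `two_proportion_z_test` and the counts in the usage line are hypothetical and do not come from the example figures.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Large-sample z test of H0: p1 - p2 = 0 (two-sided); returns (z, p_value)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled estimate of the common p under H0: p1 = p2 = p.
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided p-value: 2 * P(Z > |z|) for Z ~ N(0, 1).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 60 successes out of 100 versus 40 out of 100.
z, p = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

Pooling is appropriate here only because the null value is $0$. It would not apply to a null hypothesis of the form $p_1-p_2=D_0$ with $D_0\neq 0$, which these notes do not cover.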