---
title: MS6
tags: teach:MS
---

# 6. Distributions derived from the normal distribution

:::info
### Links to Statistics 101
Suppose the population has a distribution $F(\theta)$. We learn how to estimate the parameter $\theta$, and how good the estimate is, using the following two approaches:
1. Confidence interval
2. Hypothesis testing
    - $Z$-test: means, population proportion
    - $t$-test: means
    - $\chi^2$-test: variance
    - $F$-test: variance, ANOVA

### The $Z$-test for the population mean
Suppose the population has a distribution $F(\mu, \sigma^2)$, where $F$ can be any distribution.
- $H_0:\mu=\mu_0$ versus $H_1:\mu \neq \mu_0$.
- The $Z$ statistic is $\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}\stackrel{d}{\rightarrow} Z$ as $n\rightarrow\infty$, by the central limit theorem.
- Set $\alpha=0.05$.
- Realized $Z$-statistic = $Z^*$.
- The p-value is $2P(Z > |Z^*|)$.
- If p-value $<\alpha$, we reject $H_0$ and conclude that $\mu\neq \mu_0$. Otherwise, we fail to reject $H_0$.

### The $t$-test for the population mean
Suppose the population has a normal distribution.
- $H_0:\mu=\mu_0$ versus $H_1:\mu \neq \mu_0$.
- The $t$ statistic is $\frac{\bar{X}-\mu_0}{S/\sqrt{n}}\sim t_{n-1}$.
- Set $\alpha=0.05$.
- Realized $t$-statistic = $t^*$.
- The p-value is $2P(t_{n-1} > |t^*|)$.
- If p-value $<\alpha$, we reject $H_0$ and conclude that $\mu\neq \mu_0$. Otherwise, we fail to reject $H_0$.

### The $F$-test for population variances
Suppose both populations have normal distributions.
- $H_0:\sigma_1^2/\sigma_2^2=1$ versus $H_1: \sigma_1^2 /\sigma_2^2\neq 1$. (Assume that $\sigma_1^2 \geq \sigma_2^2$.)
- The $F$ statistic is $F_{STAT}=\frac{S_1^2}{S_2^2}\sim F_{n_1-1,n_2-1}$ under $H_0$.
- Set $\alpha=0.05$.
- Realized $F$-statistic = $F^*$.
- The p-value is $P(F_{n_1-1,n_2-1} > F^*)$.
- If p-value $<\alpha$, we reject $H_0$ and conclude that $\sigma_1^2\neq \sigma_2^2$. Otherwise, we fail to reject $H_0$.
:::

## 6.2 The $\chi^2$, $t$, and $F$ distributions

### The chi-squared distribution
If $Z\sim N(0,1)$.
Then the distribution of $U = Z^2$ is called the chi-squared distribution (also chi-square distribution, or $\chi^2$ distribution) with 1 degree of freedom, denoted as $$U\sim \chi_1^2.$$

If $U_1,U_2,\ldots,U_n$ are independent chi-squared random variables with 1 degree of freedom, the distribution of $V = U_1+U_2+\cdots+U_n$ is called the chi-squared distribution with $n$ degrees of freedom and is denoted by $\chi^2_n$. We write $$V\sim \chi^2_n.$$

#### Other properties
- $\chi^2_n \stackrel{d}{=} Gamma(\alpha=n/2, \lambda = 1/2)$.
- Its density is $$f(v)=\frac{1}{2^{n/2}\Gamma(n/2)}v^{(n/2)-1}e^{-v/2},\;v\geq 0.$$
- Its moment generating function is $M(t)=(1-2t)^{-n/2}$ for $t<1/2$.
- $E(V) = n$ and $Var(V)= 2n$.
- If $U$ and $V$ are independent with $U\sim \chi^2_n$ and $V\sim \chi^2_m$, then $U+V\sim \chi^2_{m+n}$.

### The $t$ distribution
If $Z\sim N(0,1)$, $U\sim \chi^2_n$, and $Z$ and $U$ are independent, then the distribution of $$\frac{Z}{\sqrt{U/n}}$$ is called the $t$ distribution with $n$ degrees of freedom. We write $$\frac{Z}{\sqrt{U/n}}\sim t_{n}.$$

The density function of $t_n$ is $$f(t)=\frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\,\Gamma(n/2)}\left(1+\frac{t^2}{n}\right)^{-(n+1)/2}.$$

As $n\rightarrow \infty$, $t_n\stackrel{d}{\rightarrow} N(0,1)$.

### The $F$ distribution
Let $U$ and $V$ be independent chi-squared random variables with $m$ and $n$ degrees of freedom, respectively. The distribution of $$W =\frac{U/m}{V/n}$$ is called the $F$ distribution with $m$ and $n$ degrees of freedom and is denoted by $F_{m,n}$. (Check its density.) We write $$W =\frac{U/m}{V/n}\sim F_{m,n}.$$

## 6.3 The sample mean and the sample variance

### The sample mean and sample variance
- Let $X_1,\ldots,X_n$ be independent $N(\mu,\sigma^2)$ random variables; we sometimes refer to them as a sample from a normal distribution.
The sample mean is $$\bar{X}=\frac{1}{n}\sum_{i=1}^n X_i.$$ The sample variance is $$S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2.$$

### Theorem A
The random variable $\bar{X}$ and the vector of random variables $(X_1-\bar{X}, X_2-\bar{X},\ldots,X_n-\bar{X})$ are independent. (Readings: proof on p. 196.)

### Corollary A
$\bar{X}$ and $S^2$ are independently distributed. (Readings: proof on p. 197.)

### Theorem B
The distribution of $\frac{(n-1)S^2}{\sigma^2}$ is the chi-squared distribution with $n-1$ degrees of freedom. (Readings: proof on p. 197.)

#### Sketch of the proof
Notice that $$\frac{(n-1)S^2}{\sigma^2} = \frac{(n-1)\left[\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2\right]}{\sigma^2}=\frac{\sum_{i=1}^n(X_i-\bar{X})^2}{\sigma^2}=\sum_{i=1}^n\left(\frac{X_i-\bar{X}}{\sigma}\right)^2.$$ This is a sum of $n$ squared standardized deviations, but the deviations satisfy the linear constraint $\sum_{i=1}^n(X_i-\bar{X})=0$, so one degree of freedom is lost, giving $\chi^2_{n-1}$.

### Corollary B
Let $\bar{X}$ and $S^2$ be as given. Then, $$\frac{\bar{X}-\mu}{S/\sqrt{n}}\sim t_{n-1}.$$

#### Proof
It is known that:
1. Corollary A: $\bar{X}$ and $S^2$ are independent.
2. Theorem B: $\frac{(n-1)S^2}{\sigma^2}\sim \chi^2_{n-1}$.

Moreover, we have $$\frac{\bar{X}-\mu}{S/\sqrt{n}}= \frac{\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\frac{S^2}{\sigma^2}}} = \frac{\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\frac{(n-1)S^2}{\sigma^2}\times \frac{1}{n-1}}}\stackrel{d}{=}\frac{Z}{\sqrt{{\chi^2_{n-1}}/{(n-1)}}}\sim t_{n-1}.$$

### Example
Suppose both populations have normal distributions. Consider $H_0:\sigma_1^2/\sigma_2^2=1$ versus $H_1: \sigma_1^2 /\sigma_2^2\neq 1$ (assume that $\sigma_1^2 \geq \sigma_2^2$). Show that the $F$ statistic satisfies $F_{STAT}=\frac{S_1^2}{S_2^2}\sim F_{(n_1-1),(n_2-1)}$ under $H_0$.

#### Sol.
Note that if $H_0$ is true, we have $\sigma_1^2=\sigma_2^2$.
$$\frac{S_1^2}{S_2^2} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}=\frac{\frac{(n_1-1)S_1^2}{\sigma_1^2}\cdot\frac{1}{n_1-1}}{\frac{(n_2-1)S_2^2}{\sigma_2^2}\cdot\frac{1}{n_2-1}}\stackrel{d}{=}\frac{\chi^2_{(n_1-1)}/(n_1-1)}{\chi^2_{(n_2-1)}/(n_2-1)}\sim F_{(n_1-1),(n_2-1)},$$
where the first equality uses $\sigma_1^2=\sigma_2^2$, and the $\stackrel{d}{=}$ step uses Theorem B together with the independence of the two samples.
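Theorem B and Corollary B are easy to check numerically. The following sketch (not part of the original notes; it assumes `numpy` is available, and the sample size and parameters are illustrative) simulates many normal samples and compares the moments of $\frac{(n-1)S^2}{\sigma^2}$ and the tail behavior of the studentized mean with the $\chi^2_{n-1}$ and $t_{n-1}$ predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 5, 2.0, 3.0   # illustrative sample size and parameters
reps = 200_000

# Draw many samples of size n from N(mu, sigma^2).
X = rng.normal(mu, sigma, size=(reps, n))
xbar = X.mean(axis=1)
S2 = X.var(axis=1, ddof=1)   # sample variance with the 1/(n-1) factor

# Theorem B: (n-1)S^2/sigma^2 ~ chi^2_{n-1}, whose mean is n-1 = 4
# and whose variance is 2(n-1) = 8.
V = (n - 1) * S2 / sigma**2
print(round(V.mean(), 2), round(V.var(), 2))

# Corollary B: (xbar - mu)/(S/sqrt(n)) ~ t_{n-1}.  The 97.5% quantile
# of t_4 is about 2.776, so roughly 5% of |T| values should exceed it.
T = (xbar - mu) / np.sqrt(S2 / n)
print(round(np.mean(np.abs(T) > 2.776), 3))
```

The simulated mean and variance land close to $4$ and $8$, and the two-sided exceedance rate lands close to $0.05$, as the theorems predict.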
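The claim in the example can also be checked by simulation. The sketch below (assuming `numpy` and `scipy` are available; the sample sizes are illustrative) draws two independent normal samples with equal variances, so that $H_0$ holds, and compares the simulated tail probability of $S_1^2/S_2^2$ against the exact $F_{n_1-1,\,n_2-1}$ quantile from `scipy.stats.f`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n1, n2, sigma = 8, 10, 2.0   # illustrative sample sizes, equal variances
reps = 100_000

# Two independent normal samples; under H0 both have variance sigma^2.
S1 = rng.normal(0.0, sigma, size=(reps, n1)).var(axis=1, ddof=1)
S2 = rng.normal(5.0, sigma, size=(reps, n2)).var(axis=1, ddof=1)
F = S1 / S2   # the F statistic from the example

# Under H0, F ~ F_{n1-1, n2-1}: about 5% of the simulated statistics
# should exceed the exact 95% quantile of F_{7,9}.
crit = stats.f.ppf(0.95, n1 - 1, n2 - 1)
print(round(np.mean(F > crit), 3))

# p-value of one realized statistic F*: P(F_{7,9} > F*).
print(round(stats.f.sf(F[0], n1 - 1, n2 - 1), 3))
```

Note that the means of the two populations differ, yet the distribution of $S_1^2/S_2^2$ is unaffected; only the variances matter, which is exactly what the derivation above shows.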