# Statistics and Sampling Distribution
## Statistics
- A function of observable random variables,
$$
T=l(X_1,\dots,X_n),
$$
which does not depend on any unknown parameters, is called a **statistic**.
- Let $X_1,\dots,X_n$ represent a random sample from a population with pdf $f(x)$. The **sample mean**,
$$
\bar{X} = \frac{\sum_{i=1}^nX_i}{n}
$$
is an example of a statistic.
- When a random sample is observed, the value of $\bar{X}$ is usually denoted by the lower-case $\bar{x}$.
- [**Ex 86.**](/nnlMWeXcRsmoRkOz6QaL2Q) Prove that if $E(X_i)=\mu$ and $\text{Var}(X_i)=\sigma^2$ for $i=1,\dots,n$, then $$E(\bar{X})=\mu \text{ and } \text{Var}(\bar{X})=\frac{\sigma^2}{n}.$$
- The property $E(\bar{X})=\mu$ shows that $\bar{X}$ is an **unbiased estimator** of the parameter $\mu$.
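The two identities in Ex 86 can be checked numerically. The following is a quick Monte Carlo sketch (an illustration, not a proof); the values of $\mu$, $\sigma$, and $n$ are chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 200_000  # arbitrary illustrative values

# Draw `reps` independent samples of size n and compute each sample mean.
samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)

# E(X-bar) should be close to mu, and Var(X-bar) close to sigma^2 / n.
print(xbar.mean())  # close to 5
print(xbar.var())   # close to 4/10 = 0.4
```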
- **Ex 87.**
Prove that the **sample variance** given by
$$
S^2=\frac{\sum_{i=1}^n(X_i-\bar{X})^2}{n-1}
$$
is an unbiased estimator of $\sigma^2$, i.e.,
$$
E(S^2)=\sigma^2.
$$
- Hint.
- Note that
\begin{align}
\sum_{i=1}^n(X_i-\bar{X})^2&=\sum_{i=1}^n(X_i^2-2X_i\bar{X}+\bar{X}^2)\\
&=\sum_{i=1}^nX_i^2-2n\bar{X}^2+n\bar{X}^2=\sum_{i=1}^nX_i^2-n\bar{X}^2.
\end{align}
- $\displaystyle E(X_i^2)=\mu^2+\sigma^2,E(\bar{X}^2)=\mu^2+\frac{\sigma^2}{n}$.
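A numerical sanity check for Ex 87 (an illustration under arbitrarily chosen parameters): dividing by $n-1$ gives an unbiased estimate of $\sigma^2$, while dividing by $n$ does not.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 3.0, 5, 200_000  # arbitrary illustrative values

samples = rng.normal(mu, sigma, size=(reps, n))
s2_unbiased = samples.var(axis=1, ddof=1)  # divide by n-1: the sample variance S^2
s2_biased = samples.var(axis=1, ddof=0)    # divide by n instead

print(s2_unbiased.mean())  # close to sigma^2 = 9
print(s2_biased.mean())    # close to (n-1)/n * sigma^2 = 7.2
```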
## Linear combinations of normal variables
- Let $Y=\sum_{i=1}^na_iX_i$, where $X_i \sim N(\mu_i,\sigma_i^2)$ for $i=1,\dots,n,$ denote independent normal variables. Recall that
$$
Y\sim N\left(\sum_{i=1}^n a_i\mu_i,\sum_{i=1}^na_i^2\sigma_i^2\right).
$$
- **Ex 88.**
Prove that if $X_1,\dots,X_n$ is a random sample from $N(\mu,\sigma^2)$, then $\bar{X}\sim N(\mu,\sigma^2/n)$.
- [**Ex 89.**](/fYZIZhSXR0uHr9olyL9liA)
Consider two independent random samples $X_1,\dots,X_m$ and $Y_1,\dots,Y_n$ from $N(\mu_1,\sigma_1^2)$ and $N(\mu_2,\sigma_2^2)$, respectively. Prove that $$\bar{X}-\bar{Y}\sim N(\mu_1-\mu_2, \sigma_1^2/m+\sigma_2^2/n).$$
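The conclusion of Ex 89 can be checked by simulation. This is a sketch with arbitrarily chosen parameters, comparing the empirical mean and variance of $\bar{X}-\bar{Y}$ with the claimed ones.

```python
import numpy as np

rng = np.random.default_rng(2)
mu1, sigma1, m = 10.0, 2.0, 8   # parameters of the X sample (illustrative)
mu2, sigma2, n = 4.0, 3.0, 12   # parameters of the Y sample (illustrative)
reps = 200_000

xbar = rng.normal(mu1, sigma1, size=(reps, m)).mean(axis=1)
ybar = rng.normal(mu2, sigma2, size=(reps, n)).mean(axis=1)
diff = xbar - ybar

print(diff.mean())  # close to mu1 - mu2 = 6
print(diff.var())   # close to sigma1^2/m + sigma2^2/n = 0.5 + 0.75 = 1.25
```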
## [Chi-square distribution](/TTvNssnVSf-QzPLPaaDi4w)
- A random variable $X$ has the **gamma distribution** if its pdf is of the form
$$
f(x)=\frac{\lambda^{\alpha}}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x}, \quad x>0,
$$
and recall that
$$
\begin{align}
M_X(t)&=\left(\frac{\lambda}{\lambda-t}\right)^{\alpha}, \quad t<\lambda,\\
E(X)&=\frac{\alpha}{\lambda},\\
E(X^2)&=\frac{\alpha(\alpha+1)}{\lambda^2},\\
\text{Var}(X)&=\frac{\alpha}{\lambda^2}.
\end{align}
$$
- In case $\displaystyle\alpha = \frac{\nu}{2}$ and $\lambda=1/2$, we say that $X$ has a **chi-square distribution** with $\nu$ **degrees of freedom**, and we write $X\sim \chi^2(\nu)$:
- The pdf of $X\sim \chi^2(\nu)$ is
$$
f(x)=\frac{1}{\Gamma(\nu/2)2^{\nu/2}}x^{\nu/2-1}e^{-x/2}.
$$
- $E(X)=\nu$.
- $\text{Var}(X)=2\nu$.
- $\displaystyle E(X^r)=2^r\frac{\Gamma(\nu/2+r)}{\Gamma(\nu/2)}$
- $M_X(t)=(1-2t)^{-\nu/2}$.
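The listed facts can be verified numerically. A quick sketch (illustrative degrees of freedom), checking the mean, the variance, and the $r=2$ moment formula $E(X^2)=2^2\,\Gamma(\nu/2+2)/\Gamma(\nu/2)=\nu(\nu+2)$:

```python
import numpy as np

rng = np.random.default_rng(3)
nu, reps = 7, 200_000  # illustrative degrees of freedom

x = rng.chisquare(nu, size=reps)

print(x.mean())       # close to nu = 7
print(x.var())        # close to 2*nu = 14
print((x**2).mean())  # close to E(X^2) = nu*(nu+2) = 63
```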
- [**Ex 90.**](/sM57DAORT8WZv6g0sY_IsA)
Prove that if $Z\sim N(0,1)$, then $Z^2\sim \chi^2(1)$.
- Hint.
- Recall that the pdf of $X\sim N(\mu,\sigma^2)$ is of the form
$$
f(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-(x-\mu)^2/(2\sigma^2)}.
$$
- Verify that
$$M_{Z^2}(t)=E(e^{tZ^2})=\int_{-\infty}^{\infty}e^{tz^2}\cdot\frac{1}{\sqrt{2\pi}}e^{-z^2/2}dz=\cdots=(1-2t)^{-1/2}.$$
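A simulation sketch of Ex 90 (not a proof): squaring standard normal draws should reproduce the $\chi^2(1)$ mean and variance.

```python
import numpy as np

rng = np.random.default_rng(4)
reps = 200_000

z2 = rng.standard_normal(reps) ** 2  # Z^2 with Z ~ N(0,1)

# chi^2(1) has mean 1 and variance 2.
print(z2.mean())  # close to 1
print(z2.var())   # close to 2
```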
- **Ex 91.**
Prove that if $X_1,\dots,X_n$ are independent chi-square variables with $X_i\sim \chi^2(\nu_i)$ for $i=1,\dots,n$, then
$$
Y:=\sum_{i=1}^nX_i\sim\chi^2\left(\sum_{i=1}^n\nu_i\right).
$$
- Hint. Verify that
$$
M_Y(t)=M_{X_1}(t)\cdots M_{X_n}(t)=\cdots=(1-2t)^{-\sum_{i=1}^n\nu_i/2}.
$$
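The additivity in Ex 91 can be checked empirically; a sketch with arbitrary degrees of freedom, comparing the mean and variance of the sum with those of $\chi^2\left(\sum\nu_i\right)$:

```python
import numpy as np

rng = np.random.default_rng(5)
nus = [2, 3, 5]  # illustrative degrees of freedom; their sum is 10
reps = 200_000

# Sum of independent chi-square variables.
y = sum(rng.chisquare(nu, size=reps) for nu in nus)

print(y.mean())  # close to 2+3+5 = 10
print(y.var())   # close to 2*10 = 20
```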
- **Ex 92.**
Let $X_1,\dots,X_n$ denote a random sample from $N(\mu,\sigma^2)$. Prove that
1. $\displaystyle \sum_{i=1}^n\frac{(X_i-\mu)^2}{\sigma^2}\sim\chi^2(n)$.
2. $\displaystyle \frac{n(\bar{X}-\mu)^2}{\sigma^2}\sim\chi^2(1)$.
- Let $X_1,\dots,X_n$ denote a random sample from $N(\mu,\sigma^2)$.
- **Ex 93.**
Prove the following equation:
$$\sum_{i=1}^n\frac{(X_i-\mu)^2}{\sigma^2}=\sum_{i=1}^n\frac{(X_i-\bar{X})^2}{\sigma^2}+\frac{n(\bar{X}-\mu)^2}{\sigma^2}.
$$
- [**Ex 94.**]()
Prove that $\bar{X}$ and the terms $X_i-\bar{X}$, $i=1,\dots,n$ are independent, and consequently, $\bar{X}$ and $S^2$ are independent.
- Hint.
1. Note that the joint pdf of $X_1,\dots,X_n$ is
$$
\begin{align}
f(x_1,\dots,x_n)&=\frac{1}{(2\pi)^{n/2}\sigma^n}\exp\left[\frac{-1}{2}\sum_{i=1}^n\left(\frac{x_i-\mu}{\sigma}\right)^2\right]\\
&=\frac{1}{(2\pi)^{n/2}\sigma^n}\exp\left[\frac{-1}{2\sigma^2}\left(\sum_{i=1}^n(x_i-\bar{x})^2
+n(\bar{x}-\mu)^2\right) \right]
\end{align}
$$
2. Let $Y_1=\bar{X}$ and $Y_i=X_i-\bar{X}$ for $i=2,\dots,n$. Since
$\sum_{i=1}^n(X_i-\bar{X})=0$, it follows that $$X_1-\bar{X}=-\sum_{i=2}^n(X_i-\bar{X})=-\sum_{i=2}^nY_i$$
Then the joint pdf of $Y_1,\dots,Y_n$ is
$$
g(y_1,\dots,y_n)=f(x_1,\dots,x_n)\biggr\vert\frac{\partial(x_1,\dots,x_n)}{\partial(y_1,\dots,y_n)}\biggr\vert,
$$
where
$$
\begin{align}
f(x_1,\dots,x_n)&=\frac{1}{(2\pi)^{n/2}\sigma^n}\exp\left[\frac{-1}{2\sigma^2}\left[\left(-\sum_{i=2}^ny_i\right)^2+\sum_{i=2}^ny_i^2
+n(y_1-\mu)^2\right] \right]\\
\biggr\vert\frac{\partial(x_1,\dots,x_n)}{\partial(y_1,\dots,y_n)}\biggr\vert&=n\hspace{.5cm}\text{(Why?)}
\end{align}
$$
This implies that for $i=2,\dots,n$, $Y_1$ and $Y_i$ are independent (Why?). Moreover, since $X_1-\bar{X}=-\sum_{i=2}^nY_i$, it follows that $Y_1=\bar{X}$ and $X_1-\bar{X}$ are independent.
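The conclusion of Ex 94 can be probed by simulation. Zero correlation is only a necessary condition for independence, but it is a quick consistency check; parameters below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, n, reps = 1.0, 2.0, 6, 200_000  # arbitrary illustrative values

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)

# Independence of X-bar and S^2 implies zero correlation between them.
corr = np.corrcoef(xbar, s2)[0, 1]
print(corr)  # close to 0
```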
- **Ex 95.** Prove that $(n-1)S^2/\sigma^2\sim\chi^2(n-1)$.
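A numerical sketch for Ex 95 (illustrative parameters): the statistic $(n-1)S^2/\sigma^2$ should reproduce the $\chi^2(n-1)$ mean and variance.

```python
import numpy as np

rng = np.random.default_rng(7)
sigma, n, reps = 3.0, 8, 200_000  # arbitrary illustrative values

samples = rng.normal(0.0, sigma, size=(reps, n))
w = (n - 1) * samples.var(axis=1, ddof=1) / sigma**2

# chi^2(n-1) has mean n-1 and variance 2(n-1).
print(w.mean())  # close to 7
print(w.var())   # close to 14
```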
- [**Ex 96.**]()
Prove that if $Y_{\nu}\sim\chi^2(\nu)$, then
$$
Z_{\nu}=\frac{Y_{\nu}-\nu}{\sqrt{2\nu}}\overset{d}{\to}Z\sim N(0,1)
$$
as $\nu\to\infty$.
- **Hint.** By CLT,
$$
\frac{\sum_{i=1}^nX_i-n\mu}{\sqrt{n}\sigma}\overset{d}{\to}Z\sim N(0,1) \text{ as } n\to\infty.
$$
Assume $X_i\sim\chi^2(1)$, so that $E(X_i)=1$ and $\text{Var}(X_i)=2$.
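The normal approximation in Ex 96 can be illustrated numerically: for large $\nu$, the standardized $\chi^2(\nu)$ variable behaves like $N(0,1)$. The choice $\nu=500$ is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(8)
nu, reps = 500, 200_000  # large nu so the normal approximation is good

y = rng.chisquare(nu, size=reps)
z = (y - nu) / np.sqrt(2 * nu)

print(z.mean())           # close to 0
print(z.var())            # close to 1
print(np.mean(z < 1.96))  # close to the standard normal probability ~0.975
```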
## Student's $t$ distribution
- [**Ex 97.**](/N3uqxDv-TSS4xGi940x6kQ)
Let $Z\sim N(0,1)$ and $V\sim \chi^2(\nu)$ be two independent random variables. The distribution of $$T=\frac{Z}{\sqrt{V/\nu}}$$ is referred to as **Student's $t$ distribution** with $\nu$ degrees of freedom, denoted by $T\sim t(\nu)$. Prove that the pdf of $T$ is given, for $-\infty<t<\infty$, by
$$
f(t;\nu)=\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)}\frac{1}{\sqrt{\nu\pi}}\left(1+\frac{t^2}{\nu}\right)^{-(\nu+1)/2}.
$$
- **Hint.** The joint pdf of $Z$ and $V$ is given by
$$
\begin{align}
f_{Z,V}(z,x)&=f_Z(z)f_V(x)\\
&=\frac{1}{\sqrt{2\pi}}e^{-z^2/2}\cdot\frac{1}{\Gamma\left(\nu/2\right)2^{\nu/2}}x^{\nu/2-1}e^{-x/2}.
\end{align}
$$
Let $T=Z/\sqrt{V/\nu}$ and let $Y=V$. Consider
$$
z=t\sqrt{y/\nu} \text{ and } x=y,
$$
then
$$
f_{T,Y}(t,y)=\frac{1}{\sqrt{2\pi}}e^{-t^2y/(2\nu)}\cdot\frac{1}{\Gamma\left(\nu/2\right)2^{\nu/2}}y^{\nu/2-1}e^{-y/2}\biggr\vert\frac{\partial(z,x)}{\partial(t,y)}\biggr\vert,
$$
where $-\infty<t<\infty$ and $0<y<\infty$. Then
$$
f(t;\nu)=\int_0^{\infty}f_{T,Y}(t,y)dy.
$$
- [**Ex 98.**]()
Let $T\sim t(\nu)$. Prove that for $\nu>2r,r=1,2,\dots$,
$$
\begin{align}
E(T^{2r})&=\frac{\Gamma((2r+1)/2)\Gamma((\nu-2r)/2)}{\Gamma(1/2)\Gamma(\nu/2)}\nu^r,\\
E(T^{2r-1})&=0,\\
\text{Var}(T)&=\frac{\nu}{\nu-2}, \text{ where } \nu>2.
\end{align}
$$
- Hint. $E(T^{2r})=E(Z^{2r})E((V/\nu)^{-r})$
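The moment formulas in Ex 98 can be checked by simulating $T=Z/\sqrt{V/\nu}$ directly (a sketch with an arbitrary $\nu$):

```python
import numpy as np

rng = np.random.default_rng(9)
nu, reps = 10, 200_000  # illustrative degrees of freedom

z = rng.standard_normal(reps)
v = rng.chisquare(nu, size=reps)
t = z / np.sqrt(v / nu)  # T ~ t(nu) by Ex 97

print(t.mean())  # close to 0 (odd moments vanish)
print(t.var())   # close to nu/(nu-2) = 1.25
```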
- [**Ex 99.**](/XRB5rStsQWqRpnSoj2JJrg)
Assume that $\{X_1,\dots,X_n\}$ is a random sample from $N(\mu,\sigma^2)$. Prove that
$$
\frac{\bar{X}-\mu}{S/\sqrt{n}}\sim t(n-1).
$$
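A simulation sketch of Ex 99 with arbitrary parameters: the studentized mean should match the moments of $t(n-1)$.

```python
import numpy as np

rng = np.random.default_rng(10)
mu, sigma, n, reps = 3.0, 5.0, 6, 200_000  # arbitrary illustrative values

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)
t = (xbar - mu) / (s / np.sqrt(n))

# t(n-1) = t(5) has mean 0 and variance (n-1)/(n-3) = 5/3.
print(t.mean())  # close to 0
print(t.var())   # close to 5/3
```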
## F distribution
- [**Ex 100.**]()
Let $$X=\frac{V_1/\nu_1}{V_2/\nu_2},$$ where $V_1\sim\chi^2(\nu_1)$ and $V_2\sim\chi^2(\nu_2)$ are independent. Then $X$ is said to have the **F distribution** with $\nu_1$ and $\nu_2$ degrees of freedom, denoted by $X\sim F(\nu_1,\nu_2)$. Prove that the pdf of $X$ is given, for $x>0$, by
$$
f(x;\nu_1,\nu_2)=\frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)}\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2}x^{(\nu_1/2)-1}\left(1+\frac{\nu_1}{\nu_2}x\right)^{-(\nu_1+\nu_2)/2}
$$
- Hint. Let $Y=V_2$ and find the joint pdf of $X$ and $Y$.
- [**Ex 101.**]()
Assume that $X\sim F(\nu_1,\nu_2)$. Prove that
$$
\begin{align}
E(X^r)&=\frac{\left(\frac{\nu_2}{\nu_1}\right)^r\Gamma\left(\frac{\nu_1}{2}+r\right)\Gamma\left(\frac{\nu_2}{2}-r\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)},\text{ for } \nu_2>2r\\
E(X)&=\frac{\nu_2}{\nu_2-2},\text{ for } \nu_2>2\\
\text{Var}(X)&=\frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}, \text{ for } \nu_2>4.
\end{align}
$$
- Hint. $$E(X^r)=\left(\frac{\nu_2}{\nu_1}\right)^rE(V_1^r)E(V_2^{-r}).$$
- Let $X_1,\dots,X_m$ and $Y_1,\dots,Y_n$ be independent random samples with respective distributions $X_i\sim N(\mu_1,\sigma^2_1)$ and $Y_j\sim N(\mu_2,\sigma_2^2)$, and let $S_1^2$ and $S_2^2$ denote the two sample variances. Then
$$
\frac{(m-1)S_1^2}{\sigma_1^2}\sim \chi^2(m-1) \text{ and } \frac{(n-1)S_2^2}{\sigma_2^2}\sim \chi^2(n-1),
$$
so that
$$
\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}\sim F(m-1,n-1).
$$
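The variance-ratio fact above can be checked numerically; a sketch with arbitrary parameters, comparing the empirical mean of the ratio with $E(F)=\nu_2/(\nu_2-2)$ from Ex 101:

```python
import numpy as np

rng = np.random.default_rng(11)
sigma1, m = 2.0, 9    # illustrative parameters of the X sample
sigma2, n = 4.0, 13   # illustrative parameters of the Y sample
reps = 200_000

x = rng.normal(0.0, sigma1, size=(reps, m))
y = rng.normal(0.0, sigma2, size=(reps, n))
f = (x.var(axis=1, ddof=1) / sigma1**2) / (y.var(axis=1, ddof=1) / sigma2**2)

# F(m-1, n-1) = F(8, 12) has mean nu2/(nu2-2) = 12/10.
print(f.mean())  # close to 1.2
```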
## Exercises
- Let $X_1,\dots,X_n$ be a random sample of size $n$ from a normal distribution, $X_i\sim N(\mu,\sigma^2)$, and define $U=\sum_{i=1}^nX_i$ and $W=\sum_{i=1}^nX_i^2$.
- [**Ex 102.**]() Find a statistic that is a function of $U$ and $W$ and unbiased for the parameter $\theta = 2\mu-5\sigma^2$.
- [**Ex 103.**]() Find a statistic that is unbiased for $\sigma^2+\mu^2$.
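One candidate answer to Ex 102 and Ex 103 (a sketch, since the exercises ask the reader to find such statistics) is to note that $\bar{X}=U/n$ and $S^2=(W-U^2/n)/(n-1)$ are functions of $U$ and $W$, so $2\bar{X}-5S^2$ is unbiased for $2\mu-5\sigma^2$, and $W/n$ is unbiased for $\sigma^2+\mu^2$. A Monte Carlo check with arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(12)
mu, sigma, n, reps = 2.0, 1.5, 10, 200_000  # arbitrary illustrative values

x = rng.normal(mu, sigma, size=(reps, n))
u = x.sum(axis=1)        # U = sum of X_i
w = (x**2).sum(axis=1)   # W = sum of X_i^2

xbar = u / n
s2 = (w - u**2 / n) / (n - 1)  # S^2 expressed through U and W

print((2 * xbar - 5 * s2).mean())  # close to 2*mu - 5*sigma^2 = -7.25
print((w / n).mean())              # close to sigma^2 + mu^2 = 6.25
```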
- [**Ex 104.**]() Assume that $X_1$ and $X_2$ are independent normal random variables, $X_i\sim N(\mu,\sigma^2)$, and let $Y_1=X_1+X_2$ and $Y_2=X_1-X_2$. Show that $Y_1$ and $Y_2$ are independent and normally distributed.