---
tags: stat340, learning-targets
---

# Stat 340 Learning Target Quiz 3 Study Guide

Learning Target Quiz #3 will include questions on learning targets 1, 7, and 8. In addition, you can reattempt targets that appeared for the first time on Quiz 2.

#### 9. Given a model specification, explain the steps of the Metropolis algorithm.

- Clearly outline the key steps of the algorithm
- Understand how the proposal and acceptance probability work
- Know what acceptance rate to target
- Identify situations where MCMC is needed

<ins>Example question:</ins>

Suppose that you have a random sample, $x_1, \ldots, x_n$, from a Galenshore distribution with PDF

$$f(x_i | \theta) = \frac{2}{\Gamma(a)} \theta^{2a} x_i^{2a-1}e^{-\theta^2 x_i^2}$$

where $x_i, \theta >0$ and $a$ is a known constant. Further, suppose that you put a Gamma$(3, 1)$ prior on $\theta$.

1. Derive the unnormalized posterior distribution for $\theta$.
2. Describe a method for obtaining draws, $\theta^{(1)}, \ldots, \theta^{(m)}$, from the posterior distribution. If helpful, you may use R function names, but you also need to describe the process.

#### 10. Given a model specification, explain the steps of the Gibbs sampler.

- Derive full conditional posterior distributions
- Clearly outline the key steps of the algorithm
- Identify situations where MCMC is needed

<ins>Example question:</ins>

Suppose that you have data, $y_1, \ldots, y_n$, that you wish to model and have already derived an expression for the joint posterior distribution using Bayes' rule:

$$
\pi(\alpha, \beta|y_1, \ldots, y_n) = (\alpha \beta)^n \exp \left( -\alpha \beta \sum_{i=1}^n y_i \right) \exp\left(-\alpha-\beta \right); \quad y_i>0; \quad \alpha, \beta >0.
$$

1. Write an expression for the conditional posterior distribution for $\alpha$. If it is a known density, then clearly identify that density and its parameters.
2. Write an expression for the conditional posterior distribution for $\beta$.
If it is a known density, then clearly identify that density and its parameters.
3. Describe how you would implement a Gibbs sampler to draw a sample from the posterior distribution.

#### 11. Given MCMC draws, check whether they have converged to the posterior.

- State the appropriate diagnostic tools for assessing whether a chain has converged to the posterior
- Given diagnostics for assessing MCMC convergence, comment on whether it is reasonable to assume the chain has converged to the posterior
- Describe what it means for a sample from a Markov chain to converge to the approximate posterior
- If your chain has not converged, know what steps you can take to address the issue

<ins>Example question:</ins>

Suppose that you run MCMC in JAGS to sample from the posterior distribution of $\theta$.

1. Draw a trace plot for a single chain that obviously has not converged to the posterior distribution.
2. Draw a trace plot for a single chain with high autocorrelations.

#### 12. Given draws from the (approximate) posterior distribution, draw inferences about the appropriate parameters in the context of the research question.

- Calculate a point estimate for a parameter or function of parameters
- Calculate and interpret a credible interval for a parameter or function of parameters
- Perform and interpret a Bayesian hypothesis test

<ins>Example question:</ins>

Two Carleton students collected data on the price of hardcover textbooks from two disciplinary areas: mathematics and the natural sciences, and the social sciences. Let $Y_{1,1}, \ldots, Y_{1,27}$ denote the $n=27$ book prices from mathematics and the natural sciences, and let $Y_{2,1}, \ldots, Y_{2,17}$ denote the book prices from the social sciences.
The two Carleton students fit the following Bayesian model:

$$
Y_{1,i} \overset{{\rm iid}}{\sim} \mathcal{N}(\mu, \sigma), \quad Y_{2,j} \overset{{\rm iid}}{\sim} \mathcal{N}(\mu + \delta, \sigma)
$$

with joint prior $\pi(\mu, \delta, \sigma^2) \propto \dfrac{1}{(\sigma^2)^2}$. The students fit their model in JAGS and obtained the following posterior summaries:

```
1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

        Mean     SD Naive SE Time-series SE
mu    154.17 10.624  0.15025        0.22235
delta -54.09 16.735  0.23667        0.35641
sigma  55.02  6.051  0.08558        0.09301

2. Quantiles for each variable:

        2.5%    25%    50%    75%  97.5%
mu    133.16 147.11 154.07 161.22 175.26
delta -86.93 -65.29 -54.17 -42.81 -21.68
sigma  44.55  50.86  54.52  58.60  68.54
```

Give an interpretation of the 95% credible interval for $\delta$ (`delta`) in the context of the problem.

#### 13. Given your prior belief, specify an appropriate prior distribution for a hierarchical model.

- Define complete pooling, no pooling, and partial pooling
- Understand the role of prior distributions and hyperprior distributions in specifying the complete pooling, no pooling, and partial pooling (hierarchical) models
- Understand how induced priors can be used

<ins>Example question:</ins>

In this problem, you'll consider a hierarchical model describing SAT verbal scores across eight schools. Let $Y_{ij}$ denote the SAT verbal score for student $i$ in school $j$.

\begin{align*}
Y_{ij} &\overset{\rm iid}{\sim} \mathcal{N}( \mu_j, \sigma )\\
\mu_j | \mu, \tau &\sim \mathcal{N}(\mu, \tau)
\end{align*}

Suppose you don't have much prior information other than the fact that SAT verbal scores can be between 200 and 800. Specify weakly informative priors that you would use for this hierarchical model.

#### 14. Given a research question and your prior belief, write out a hierarchical model using two-stage priors in statistical notation.
- Know what each component is: sampling model, stage 1 prior, and stage 2 prior
- Be able to read the description of a problem and set up a model

<ins>Example question:</ins>

In a pig breeding study, two offspring from each of ten litters were measured for average daily weight gain (ADWG). The individual pig measurements can be thought of as having two pieces, pig gain = common mean + litter effect + individual pig effect. Write the model equations specifying a hierarchical model for weight gain, where each litter has its own mean and variance. Use a normal distribution for the sampling model, and propose reasonable prior distributions for any necessary parameters.

#### 15. Given draws from the (approximate) posterior distribution of your hierarchical model, draw inferences about the appropriate parameters in the context of the research question.

- Calculate and interpret a point estimate for a parameter or function of parameters
- Calculate and interpret a credible interval for a parameter or function of parameters
- Perform and interpret a Bayesian hypothesis test

<ins>Example question:</ins>

Let $Y_{ij}$ denote the weekly hours spent on homework for student $i$ in school $j$.

\begin{align*}
Y_{ij} &\overset{\rm iid}{\sim} \mathcal{N}( \mu_j, \sigma )\\
\mu_j | \mu, \tau &\sim \mathcal{N}(\mu, \tau)
\end{align*}

Below is part of the summary of 5000 posterior draws obtained from JAGS. (Note: the full model specification was not given, but you have enough information from the above.)
```
        Lower95 Median Upper95   Mean    SD Mode
mu        5.993  7.623   9.471  7.628 0.871   NA
tau       1.219  2.166   3.533  2.279 0.652   NA
mu_j[1]   7.814  9.256  10.614  9.257 0.726   NA
mu_j[2]   5.627  7.098   8.512  7.100 0.741   NA
mu_j[3]   6.395  7.898   9.435  7.905 0.790   NA
mu_j[4]   5.032  6.423   7.924  6.406 0.739   NA
mu_j[5]   8.989 10.381  11.900 10.390 0.752   NA
mu_j[6]   4.966  6.398   7.965  6.403 0.772   NA
mu_j[7]   4.850  6.326   7.845  6.326 0.754   NA
mu_j[8]   5.931  7.423   9.049  7.427 0.791   NA
sigma     3.402  3.793   4.201  3.802 0.205   NA
R         0.098  0.245   0.472  0.263 0.103   NA
```

1. Give an interpretation of the 90% credible interval for `mu_j[1]` in the context of the problem.
2. What does the 90% credible interval for $R = \dfrac{\tau^2}{\sigma^2 + \tau^2}$ tell you about the between-school variation?
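
For practice with this target, it can help to see how a point estimate, a credible interval, and a posterior probability are computed from raw draws. The sketch below is a minimal illustration in Python using only the standard library. The `tau_draws` and `sigma_draws` lists are simulated placeholders, not output from the homework model, and the helper names `quantile` and `equal_tailed_interval` are hypothetical. It computes an equal-tailed interval, which may differ from the interval type a particular JAGS summary function reports.

```python
import random
import statistics


def quantile(sorted_draws, p):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = p * (len(sorted_draws) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_draws) - 1)
    frac = pos - lower
    return sorted_draws[lower] * (1 - frac) + sorted_draws[upper] * frac


def equal_tailed_interval(draws, level=0.90):
    """Equal-tailed credible interval: cut (1 - level) / 2 from each tail."""
    s = sorted(draws)
    tail = (1 - level) / 2
    return quantile(s, tail), quantile(s, 1 - tail)


# ASSUMPTION: simulated placeholder draws standing in for real MCMC output.
random.seed(340)
m = 5000
tau_draws = [abs(random.gauss(2.2, 0.6)) for _ in range(m)]
sigma_draws = [abs(random.gauss(3.8, 0.2)) for _ in range(m)]

# A function of parameters is summarized by transforming each draw.
r_draws = [t**2 / (s**2 + t**2) for t, s in zip(tau_draws, sigma_draws)]

point_estimate = statistics.median(r_draws)
lower, upper = equal_tailed_interval(r_draws, level=0.90)

# Posterior probability of a hypothesis = proportion of draws satisfying it.
# The 0.1 threshold is an arbitrary illustrative choice.
prob_r_above = sum(r > 0.1 for r in r_draws) / m

print(f"median R: {point_estimate:.3f}")
print(f"90% interval: ({lower:.3f}, {upper:.3f})")
print(f"P(R > 0.1 | data): {prob_r_above:.3f}")
```

With real output, the simulated lists would be replaced by the saved draws for each parameter. The same three steps (transform each draw, take quantiles, take a proportion) apply to any function of the parameters.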