---
tags: stat340, learning-targets
---
# Stat 340 Learning Target Quiz 1 Study Guide
Learning Target Quiz #1 will focus on univariate models. It will include questions on learning targets 2-6.
#### 2. Given the prior distribution and data, derive the posterior distribution for a univariate model.
- Understand how the three steps of inference (prior, likelihood, and posterior) fit together
- Be able to derive the likelihood function
- Identify the likelihood through the story of a distribution (i.e., choose a logical distribution to model the data)
- "The posterior is proportional to prior times likelihood."
- Derive the posterior distribution using Bayes' rule up to the normalizing constant
- Given a sampling distribution (likelihood), show that a prior is conjugate
- For conjugate priors, identify and fully parameterize the posterior distribution
- For discrete priors, calculate the posterior table.
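
As a minimal sketch of the posterior-table calculation for a discrete prior, the Python snippet below uses made-up values: three candidate values of $\theta$, a uniform prior, and a hypothetical binomial result of 7 successes in 10 trials. None of these numbers come from the course; they only illustrate the "prior times likelihood, then normalize" recipe.

```python
from math import comb

# Hypothetical setup (not from the course): three candidate values of theta
# with a uniform prior, and y = 7 successes observed in n = 10 trials.
thetas = [0.2, 0.5, 0.8]
prior = [1 / 3, 1 / 3, 1 / 3]
n, y = 10, 7

# Binomial likelihood of the data at each candidate theta.
likelihood = [comb(n, y) * t**y * (1 - t) ** (n - y) for t in thetas]

# Unnormalized posterior = prior * likelihood; dividing by the sum normalizes.
product = [p * l for p, l in zip(prior, likelihood)]
posterior = [v / sum(product) for v in product]

# Print the posterior table, one row per candidate theta.
print("theta  prior  likelihood  product  posterior")
for row in zip(thetas, prior, likelihood, product, posterior):
    print("  ".join(f"{v:.4f}" for v in row))
```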
<ins>Example question:</ins>
The Weibull distribution is often used as a model for survival times in biomedical, demographic, and engineering analyses. A random variable $Y$ has a Weibull distribution if its pdf is
\begin{eqnarray*}
f(y \mid \alpha, \lambda) = \lambda \alpha y^{\alpha -1}
\exp(-\lambda y^\alpha) \qquad \text{for } y > 0.
\end{eqnarray*}
Here, $\alpha>0$ and $\lambda>0$ are parameters of the distribution. For this problem, assume that $\alpha = \alpha_0$ is known, but $\lambda$ is not known.
1. Assuming the improper prior distribution $\pi(\lambda \mid \alpha = \alpha_0) \propto 1$, and that $Y_1, \ldots, Y_n$ are i.i.d. Weibull random variables, derive the unstandardized posterior distribution for $\lambda$.
2. Write the name of the posterior distribution you derived in part 1 and expressions for its parameter values.
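
One way to check your work on this question is sketched below. It multiplies the flat prior by the product of the $n$ Weibull densities and drops every factor that does not involve $\lambda$. The shape-rate reading of the Gamma distribution at the end is an assumption about the intended parameterization.
\begin{align*}
\pi(\lambda \mid y_1, \ldots, y_n, \alpha_0) &\propto 1 \cdot \prod_{i=1}^n \lambda \alpha_0 y_i^{\alpha_0 - 1} \exp(-\lambda y_i^{\alpha_0})\\
&\propto \lambda^{n} \exp\left(-\lambda \sum_{i=1}^n y_i^{\alpha_0}\right)
\end{align*}
This matches the kernel of a Gamma distribution with shape $n + 1$ and rate $\sum_{i=1}^n y_i^{\alpha_0}$.
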
---
#### 3. Given the posterior distribution and a research question, estimate the parameter (or function of parameters) of interest and interpret the results in context.
- Given a posterior distribution, calculate the posterior mean, median, MAP estimate, variance, standard deviation, equal-tailed credible interval (i.e., percentile interval), and $P(\theta > k)$
- Interpret a credible interval for a parameter
- Perform and interpret a Bayesian hypothesis test
- Be able to use either the theoretical posterior or draws (simulations) from the posterior distribution
<ins>Example question:</ins>
Accidents and other incidents involving commercial aircraft are recorded by the Bureau of Aircraft Accidents Archive (B3A), an international organization located in Geneva, Switzerland. Let $Y_i$ denote the number of crashes observed in year $i$, and let $n = 10$ denote the number of years observed. You observe $\sum_{i=1}^{10} Y_i = 185$. The following model is used to analyze these data:
\begin{align*}
Y_1,\ldots,Y_{10}| \lambda & \overset{{\rm iid}}{\sim} {\rm Poisson}(\lambda)\\
\lambda &\sim {\rm Gamma}(10, 0.5)\\
\lambda | Y_1,\ldots,Y_{10} &\sim {\rm Gamma}(195, 10.5)
\end{align*}
1. Explain how you would compute a 90% equal-tail credible interval for $\lambda$.
2. The 90% credible interval for $\lambda$ is 16.54 to 20.73. Interpret this interval in the context of the problem.
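
A simulation-based approach to the interval can be sketched in Python as follows. This is only an illustration, not the course's prescribed method. It reads the Gamma distribution in shape-rate form, which is consistent with the update $10 + 185 = 195$ and $0.5 + 10 = 10.5$; the seed, the number of draws, and the cutoff $k = 20$ are arbitrary choices made for this sketch. The printed endpoints are Monte Carlo approximations.

```python
import random

random.seed(340)  # arbitrary seed, only for reproducibility

# Posterior from the model above: Gamma(shape = 195, rate = 10.5).
shape, rate = 195, 10.5

# random.gammavariate takes (shape, scale), so scale = 1 / rate.
n_draws = 100_000
draws = sorted(random.gammavariate(shape, 1 / rate) for _ in range(n_draws))

# Equal-tailed 90% interval: cut off 5% of the draws in each tail.
lower = draws[int(0.05 * n_draws)]
upper = draws[int(0.95 * n_draws)]

# Other posterior summaries come from the same draws.
post_mean = sum(draws) / n_draws
prob_above_20 = sum(d > 20 for d in draws) / n_draws  # k = 20 is hypothetical

print(f"90% interval: ({lower:.2f}, {upper:.2f})")
print(f"posterior mean: {post_mean:.2f}, P(lambda > 20 | y): {prob_above_20:.3f}")
```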
---
#### 4. Given the posterior distribution and a research question, conduct a Bayesian hypothesis test and interpret the results in context.
- Given a research question, set up hypotheses of interest
- Given a posterior distribution, calculate the probability of the hypotheses
- Interpret this posterior probability and use it to draw a logical conclusion for the hypothesis test in context
<ins>Example question:</ins>
Denote the probability that a part is defective as $\theta$. The industry standard is that no more than 0.1\% of parts can be defective, i.e., $\theta \le 0.001$. Your company has purchased a new machine, generated 10,000 parts, and tested each to determine if it is defective. You are now tasked with testing the null hypothesis that $\theta \le 0.001$ versus the alternative hypothesis that $\theta > 0.001$.
You decide to use the following model for $Y$, the number of defective parts in the sample of size $n = 10{,}000$:
\begin{align*}
Y| \theta &\sim {\rm Binomial}(n, \theta)\\
\theta &\sim {\rm Beta}(0.5, 0.5)\\
\theta | Y &\sim {\rm Beta}(Y+0.5, n-Y+0.5)
\end{align*}
1. Explain how you would compute $P(\theta \le 0.001 \mid Y)$, the posterior probability of the null hypothesis.
2. Provide an interpretation of $P(\theta \le 0.001 \mid Y)$ in the context of the problem.
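
The posterior probability of the null hypothesis can be approximated from posterior draws, as in the Python sketch below. The question does not report the observed count $Y$, so the value `y_obs = 8` is purely hypothetical, as are the seed and the number of draws. The result is the share of posterior draws at or below the 0.001 standard.

```python
import random

random.seed(340)  # arbitrary seed, only for reproducibility

n = 10_000
y_obs = 8  # HYPOTHETICAL count of defective parts; the question does not give Y

# Posterior from the model above: Beta(Y + 0.5, n - Y + 0.5).
a, b = y_obs + 0.5, n - y_obs + 0.5

n_draws = 50_000
draws = [random.betavariate(a, b) for _ in range(n_draws)]

# Posterior probability of the null hypothesis theta <= 0.001:
# the fraction of posterior draws that satisfy it.
prob_null = sum(d <= 0.001 for d in draws) / n_draws
print(f"P(theta <= 0.001 | Y = {y_obs}) is approximately {prob_null:.3f}")
```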
---
#### 5. Given a univariate Bayesian model, derive the posterior predictive distribution and use it to make predictions about future observations.
- Write down the integral expression for the posterior predictive distribution
- Describe how you would generate samples from the posterior predictive distribution
- Once you have the posterior predictive distribution, use it to calculate predictions and prediction intervals
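
For reference, the integral expression has the general form below. The symbols $\tilde{y}$ (a future observation) and $\theta$ (a generic parameter) are notation chosen for this note, and the factorization assumes that $\tilde{y}$ is conditionally independent of the observed data given $\theta$.
\begin{align*}
f(\tilde{y} \mid y_1, \ldots, y_n) = \int f(\tilde{y} \mid \theta)\, \pi(\theta \mid y_1, \ldots, y_n)\, d\theta
\end{align*}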
<ins>Example question:</ins>
You just bought stock in FancyTech. Let $\mu$ denote the average dollar amount that your FancyTech stock goes up or down in a one-day period. Suppose that it's reasonable to assume that the daily changes in FancyTech stock value are Normally distributed with a known standard deviation of $\sigma = 2$ dollars. On a random sample of 4 days, you observe changes in stock value of $-0.7$, 1.2, 4.5, and $-4$ dollars.
You decide to use the following model for the daily returns:
\begin{align*}
Y_1, Y_2, Y_3, Y_4| \mu &\overset{{\rm iid}}{\sim} \mathcal{N}(\mu, 2)\\
\mu &\sim \mathcal{N}(7.2, 2.6)\\
\mu | Y_1, Y_2, Y_3, Y_4 &\sim \mathcal{N}(1.15, 0.87)
\end{align*}
Let $\mu^{(1)}, \mu^{(2)}, \ldots, \mu^{(n)}$ be $n$ draws from the posterior distribution of $\mu$. Outline the steps required to calculate an 89\% prediction interval for tomorrow's change in the FancyTech stock price. (Please give a numbered list. No code is required.)
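
Although no code is required, the simulation steps can be sketched in Python to make them concrete. The sketch assumes that the second argument of $\mathcal{N}(\cdot, \cdot)$ is a standard deviation, matching the stated $\sigma = 2$; if it is a variance, take square roots first. The posterior draws are simulated here rather than supplied, and the seed and number of draws are arbitrary.

```python
import random

random.seed(340)  # arbitrary seed, only for reproducibility

# Values from the model above. ASSUMPTION: the second argument of N(., .)
# is a standard deviation, matching the stated sigma = 2.
post_mean, post_sd = 1.15, 0.87
sigma = 2.0

# Step 1: obtain draws from the posterior of mu (simulated here; the
# question supplies them as mu^(1), ..., mu^(n)).
n_draws = 50_000
mu_draws = [random.gauss(post_mean, post_sd) for _ in range(n_draws)]

# Step 2: for each mu draw, simulate one future observation from N(mu, sigma).
y_new = sorted(random.gauss(mu, sigma) for mu in mu_draws)

# Step 3: the 5.5th and 94.5th percentiles bound the middle 89%.
lower = y_new[int(0.055 * n_draws)]
upper = y_new[int(0.945 * n_draws)]
print(f"89% prediction interval: ({lower:.2f}, {upper:.2f})")
```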
---
#### 6. Assess the adequacy of a univariate Bayesian model.
- Describe how you would generate the posterior predictive distribution for model checking
- Be able to critique/check a model using the posterior predictive distribution
<ins>Example question:</ins>
Beary (my golden retriever) likes to wake me up early, after which I stumble downstairs and make coffee. Suppose that I collected a representative sample of $n=20$ measurements of the time it took me to make coffee in the morning (the time from waking up to pressing the on button). After updating my prior belief, I found that the posterior of $\mu$, the average time to make coffee, is $\mathcal{N}(10, 1)$. Looking at my observed data, I see an unusually large value, $y = 22$.
Below is a histogram of the maximum time to make coffee in a sample of 20 days for 1000 predictive samples. The observed maximum is displayed as a vertical line.

What does this plot reveal about the model's adequacy?
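
A check of this kind could be generated along the lines of the Python sketch below. The question does not state the sampling model, so the Normal likelihood and the sampling standard deviation `sigma = 3.0` are hypothetical choices made only for illustration; the seed is arbitrary.

```python
import random

random.seed(340)  # arbitrary seed, only for reproducibility

n_obs = 20          # sample size from the question
n_reps = 1000       # number of predictive samples, as in the histogram
observed_max = 22   # the unusually large observed value
sigma = 3.0         # HYPOTHETICAL sampling sd; the question does not give one

rep_maxima = []
for _ in range(n_reps):
    # Draw mu from the N(10, 1) posterior, then a replicated data set of
    # 20 times from the (assumed) Normal sampling model, and keep its maximum.
    mu = random.gauss(10, 1)
    y_rep = [random.gauss(mu, sigma) for _ in range(n_obs)]
    rep_maxima.append(max(y_rep))

# Share of replicated maxima at least as large as the observed maximum.
p_value = sum(m >= observed_max for m in rep_maxima) / n_reps
print(f"P(max(y_rep) >= {observed_max}) is approximately {p_value:.3f}")
```

If the observed maximum sits far in the tail of the replicated maxima (a share near 0 or 1), the model rarely reproduces a value like the one observed, which points to a poor fit for extreme times.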