---
title: "Tutorial 8: How to Extract Values from Welch Output and Make into a Lovely Table"
author: "Elizabeth A. Albright, PhD"
output: html_notebook
---
# Tutorial 8: Extract values from Welch Output and Make into a Lovely Table
This tutorial shows how to extract test statistics from the Welch's t-test results that R gives you. We will use the New Jersey January and July data from airquality.csv.
```{r}
library(dplyr)
library(readr)
library(lubridate) # provides mdy(), used below to parse `Date Local`
```
We will use the airquality.csv data again. Last time, I hope.
```{r}
airquality.df <- read_csv("~/🪅Master/04_Study/Fall 2023/ENV 710 TA/R for stats/input/airquality.csv")
airquality.df <- airquality.df %>% # making airquality data frame
  mutate(date = mdy(`Date Local`)) %>% # making date variable
  glimpse() # looking at data
```
Let's compare January and July ozone levels in New Jersey. I'll hypothesize that July levels will be greater than January levels (this is my alternative hypothesis). We can use format(date, "%B") to pull the month from the date variable we just created from `Date Local`. The %B tells R to return the name of the month (in words).
```{r}
ozone.nj.df <- airquality.df %>%
  filter(`State Name` == "New Jersey") %>%
  mutate(month = format(date, "%B")) %>%
  filter(month == "January" | month == "July") %>%
  glimpse()
```
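As a quick check of what format() returns, here is a minimal sketch on two made-up dates (example.dates and example.months are names I invented for this illustration, not part of the airquality data). Keep in mind that the month names follow your computer's locale, so on a non-English system the filter above would need the local month names.
```{r}
# two made-up dates, just for illustration
example.dates <- as.Date(c("2023-01-15", "2023-07-04"))
# %B returns the full month name (in the current locale)
example.months <- format(example.dates, "%B")
example.months
# %m returns the two-digit month number as text instead
format(example.dates, "%m")
```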
Let's make a table of the number of observations in July compared to January.
```{r}
obs.tbl <- ozone.nj.df %>%
  group_by(month) %>%
  count()
obs.tbl
```
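I want to make sure that R sets the subtraction as Xbar(July) - Xbar(January). t.test() subtracts the mean of the second factor level from the mean of the first, so I will use the factor() function to set the level order with July first.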
```{r}
ozone.nj.df <- ozone.nj.df %>%
  mutate(month = factor(month, levels = c("July", "January")))
ozone.nj.df
```
```{r}
nj.t.test<-t.test(ozone~month, ozone.nj.df, alternative="greater")
nj.t.test
```
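To see that the level order really controls the direction of the subtraction, here is a small sketch with simulated data (toy.df, value and group are made-up names, not from the airquality file). The estimates come back in level order, and the t statistic is based on the first level minus the second.
```{r}
set.seed(710) # so the simulated numbers are reproducible
toy.df <- data.frame(
  value = c(rnorm(20, mean = 10), rnorm(20, mean = 12)),
  group = rep(c("a", "b"), each = 20)
)
# put "b" first so the difference is Xbar(b) - Xbar(a)
toy.df$group <- factor(toy.df$group, levels = c("b", "a"))
toy.t.test <- t.test(value ~ group, data = toy.df, alternative = "greater")
names(toy.t.test$estimate) # group b is listed first
toy.t.test$statistic # positive here, because group b was simulated with the larger mean
```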
Type ?t.test in the console. In the Value section of the help page, you will see a list of the components that you can extract from the t-test results, using the form t.test(ozone~month, ozone.nj.df)$p.value. You place the name of the component you want after the dollar sign.
This chunk will give us the p-value. I'm assigning this value to the object p_value. You will notice that this is much smaller than the p-value listed above. That is because the printed t-test output reports any p-value below 2.2e-16 simply as "< 2.2e-16", while the extracted value is the actual number.
```{r}
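# include alternative="greater" so this is the same one-sided test as nj.t.test
p_value<-t.test(ozone~month, ozone.nj.df, alternative="greater")$p.value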
p_value
```
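If you would rather not scroll through the help page, names() lists everything stored in the result. The sketch below uses R's built-in sleep data as a stand-in (sleep.t.test is a made-up object name); the same calls should work on nj.t.test. It also prints .Machine$double.eps, which is the 2.2e-16 cutoff that format.pval() uses by default when p-values are printed.
```{r}
# Welch's t-test on the built-in sleep data (a stand-in for the ozone data)
sleep.t.test <- t.test(extra ~ group, data = sleep)
names(sleep.t.test) # the components you can pull out with $
sleep.t.test$p.value # the unrounded p-value
.Machine$double.eps # about 2.2e-16
```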
We could also extract the t statistic from the result.
```{r}
t_stat<-t.test(ozone~month, ozone.nj.df)$statistic
t_stat
```
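Here I am extracting several of the values from nj.t.test and combining them into a named vector called t.test.results. Because this is a one-sided test (alternative="greater"), the upper confidence limit will be Inf.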
```{r}
t.test.results <- c(
  nj.t.test$estimate[1],
  nj.t.test$estimate[2],
  ci.lower = nj.t.test$conf.int[1],
  ci.upper = nj.t.test$conf.int[2],
  nj.t.test$statistic,
  nj.t.test$parameter,
  p.value = nj.t.test$p.value)
t.test.results
```
And we can make these results into a data frame. The names of the vector become the row names.
```{r}
t.test.results.df <- data.frame(t.test.results)
t.test.results.df
```
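data.frame() on a named vector gives one column, with the names as row labels. If you would rather have one row per test, with each statistic in its own column, one option is to transpose the vector first. This is only a sketch on the built-in sleep data (all of the sleep.* object names are made up for the example), and it is not the only way to do it.
```{r}
# Welch's t-test on the built-in sleep data (a stand-in for the ozone data)
sleep.t.test <- t.test(extra ~ group, data = sleep)
# same idea as t.test.results: a named vector of the pieces we want
sleep.results <- c(
  sleep.t.test$estimate,
  ci.lower = sleep.t.test$conf.int[1],
  ci.upper = sleep.t.test$conf.int[2],
  sleep.t.test$statistic,
  sleep.t.test$parameter,
  p.value = sleep.t.test$p.value)
# t() turns the vector into a 1-row matrix, so each value becomes a column
sleep.results.wide.df <- as.data.frame(t(sleep.results))
sleep.results.wide.df
```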
### Using the broom package with the function tidy()
You could also use the broom package to tidy up the Welch's t-test results. The output is pretty straightforward and clean, and it is VERY QUICK to produce.
```{r}
library(broom)
```
The object nj.t.test is our t.test results from above.
```{r}
tidy(nj.t.test, conf.int=TRUE)
```
We could then combine this with the tidied results of other t-tests into one table using the function bind_rows() or rbind().
https://dplyr.tidyverse.org/reference/bind.html
https://www.statology.org/rbind-in-r/
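
As a rough sketch of that idea, the chunk below tidies two unrelated t-tests on built-in datasets (sleep and ToothGrowth, standing in for other ozone comparisons) and stacks them into one table. The list names and the .id column name "comparison" are my own choices for the illustration.
```{r}
library(dplyr)
library(broom)
# tidy each test into a one-row tibble
sleep.tidy <- tidy(t.test(extra ~ group, data = sleep))
tooth.tidy <- tidy(t.test(len ~ supp, data = ToothGrowth))
# stack the rows; .id adds a column saying which test each row came from
combined.tbl <- bind_rows(
  list(sleep = sleep.tidy, toothgrowth = tooth.tidy),
  .id = "comparison"
)
combined.tbl
```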
**Now we should round these results, change the column and/or row labels, etc. We could use gt() from the gt package to do that.**
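
Here is a minimal sketch of what that could look like, assuming the gt package is installed. It uses the built-in sleep data, and the object names, column names and labels are all made up for the example, so adjust them to match your own results.
```{r}
library(dplyr) # for the %>% pipe
library(gt)
sleep.t.test <- t.test(extra ~ group, data = sleep)
# one-row data frame of the values we want to report
sleep.results.df <- data.frame(
  t.stat = unname(sleep.t.test$statistic),
  deg.freedom = unname(sleep.t.test$parameter),
  p.value = sleep.t.test$p.value
)
sleep.gt <- gt(sleep.results.df) %>%
  tab_header(title = "Welch's t-test: sleep data") %>%
  fmt_number(columns = c(t.stat, deg.freedom), decimals = 2) %>% # round to 2 decimals
  fmt_number(columns = p.value, decimals = 3) %>% # round to 3 decimals
  cols_label(
    t.stat = "t statistic",
    deg.freedom = "Degrees of freedom",
    p.value = "p-value"
  )
sleep.gt
```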