Tutorial 8: Extract values from Welch Output and Make into a Lovely Table

---
title: "Tutorial 8: How to Extract Values from Welch Output and Make into a Lovely Table"
author: "Elizabeth A. Albright, PhD"
output: html_notebook
---

# Tutorial 8: Extract values from Welch Output and Make into a Lovely Table

This tutorial goes over how to extract test statistics from the Welch's t-test results that R gives you. We will use the airquality.csv New Jersey January and July data.

```{r}
library(dplyr)
library(readr)
library(lubridate) # provides mdy(), used below to parse the dates
```

We will use the airquality.csv data again. Last time, I hope.

```{r}
airquality.df <- read_csv("~/🪅Master/04_Study/Fall 2023/ENV 710 TA/R for stats/input/airquality.csv")

airquality.df <- airquality.df %>% #making airquality data frame
  mutate(date = mdy(`Date Local`)) %>% #making date variable
  glimpse() #looking at data
```

Let's compare January and July ozone levels in New Jersey. I'll hypothesize that July levels will be greater than January levels (this is my alternative hypothesis). We can use `format(date, "%B")` to pull the month from the `date` variable we created from `Date Local`. The `%B` tells R to return the full name of the month (in words).

```{r}
ozone.nj.df <- airquality.df %>%
  filter(`State Name` == "New Jersey") %>%
  mutate(month = format(date, "%B")) %>%
  filter(month == "January" | month == "July") %>%
  glimpse()
```

Let's make a table of the number of observations in July compared to January.

```{r}
obs.tbl <- ozone.nj.df %>%
  group_by(month) %>%
  count()
obs.tbl
```

I want to make sure that R sets the subtraction as Xbar(July) - Xbar(January), so I will use the factor() function to set the level order.

```{r}
ozone.nj.df <- ozone.nj.df %>%
  mutate(month = factor(month, levels = c("July", "January")))
ozone.nj.df
```

```{r}
nj.t.test <- t.test(ozone ~ month, data = ozone.nj.df, alternative = "greater")
nj.t.test
```

Type ?t.test in the console. In the Value section of the help page, you will see a list of components that you can extract from the t-test results, using the form `t.test(ozone~month, ozone.nj.df)$p.value`.
You place the name of the value you want after the dollar sign. This chunk will give us the p-value. I'm assigning this value to the object p_value. You will notice that this is much smaller than the p-value listed above. The printed output shows any p-value below 2.2e-16 as `< 2.2e-16`, but the stored value is not cut off.

```{r}
# same one-sided test as nj.t.test, so the p-value matches the output above
p_value <- t.test(ozone ~ month, ozone.nj.df, alternative = "greater")$p.value
p_value
```

We could also extract the t statistic from the result.

```{r}
t_stat <- t.test(ozone ~ month, ozone.nj.df, alternative = "greater")$statistic
t_stat
```

Here I am extracting several of the values and combining them into a vector called t.test.results.

```{r}
t.test.results <- c(
  nj.t.test$estimate[1],              # mean in the first group (July)
  nj.t.test$estimate[2],              # mean in the second group (January)
  ci.lower = nj.t.test$conf.int[1],
  ci.upper = nj.t.test$conf.int[2],
  nj.t.test$statistic,                # t statistic
  nj.t.test$parameter,                # degrees of freedom
  p.value = nj.t.test$p.value)
t.test.results
```

And we can make these results into a data frame.

```{r}
t.test.results.df <- data.frame(t.test.results)
t.test.results.df
```

### Using the broom package with the function tidy()

You could also use the broom package to tidy up the Welch's t-test results. The output is pretty straightforward and clean and VERY QUICK.

```{r}
library(broom)
```

The object nj.t.test holds our t.test results from above.

```{r}
tidy(nj.t.test, conf.int=TRUE)
```

We could then combine these results with those of other t-tests into a table using the function bind_rows() or rbind().

https://dplyr.tidyverse.org/reference/bind.html

https://www.statology.org/rbind-in-r/

**Now we should round these results, change the column and/or row labels, etc. We could use gt() to do that.**
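
As a rough sketch of that last step, the chunk below stacks two `tidy()` results with `bind_rows()` and passes them to `gt()` for rounding and relabeling. It does not use airquality.csv: it assumes R's built-in `airquality` data set (New York, May to September 1973) as a stand-in, with August and July each compared against May. The helper `welch_vs_may()`, the object names, and the column labels are made up for illustration, and the gt package needs to be installed.

```{r}
library(dplyr)
library(broom)
library(gt)

# Hypothetical helper: one-sided Welch test of a summer month against May,
# using the built-in airquality data (a stand-in, not airquality.csv).
welch_vs_may <- function(summer_month) {
  dat <- datasets::airquality %>%
    filter(month.name[Month] %in% c(summer_month, "May")) %>%
    # summer month first, so the difference is Xbar(summer) - Xbar(May)
    mutate(month = factor(month.name[Month], levels = c(summer_month, "May")))
  t.test(Ozone ~ month, data = dat, alternative = "greater")
}

# Stack the tidy() output: one row per test, labeled by the list names.
results.tbl <- bind_rows(
  list(August = tidy(welch_vs_may("August")),
       July   = tidy(welch_vs_may("July"))),
  .id = "comparison"
)

# Keep the columns to report, then round and relabel them with gt.
results.gt <- results.tbl %>%
  select(comparison, estimate1, estimate2, statistic, parameter, p.value) %>%
  gt() %>%
  fmt_number(columns = c(estimate1, estimate2, statistic, parameter),
             decimals = 2) %>%
  fmt_scientific(columns = p.value, decimals = 2) %>%
  cols_label(
    comparison = "Month compared with May",
    estimate1  = "Mean ozone (month)",
    estimate2  = "Mean ozone (May)",
    statistic  = "t",
    parameter  = "df",
    p.value    = "p-value"
  )
results.gt
```

The same pattern should carry over to `nj.t.test`: put each `tidy()` result in the named list, and the names become the row labels.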