---
tags: math615
robots: noindex, nofollow
---
# Collaborative R Notes
This set of notes is for students at Chico State to help build a reference for themselves and future users on how to do tasks in R.
# General resources
* https://norcalbiostat.github.io/MATH130/faq.html
* https://datacarpentry.org/R-genomics/index.html
# Script Files
* A script file with a `.R` file type is mainly R code; all discussion and comments need to be written as `# comments`.
* An **Rmarkdown** (`.Rmd`) file is like a lab notebook where you can write R code alongside regular text.
* See [Math 130 Week 1 lesson 03](https://norcalbiostat.github.io/MATH130/wk1.html) or the [RStudio tutorial page](https://rmarkdown.rstudio.com/lesson-1.html) for an overview.
* A [Quarto](https://quarto.org/) (`.qmd`) file is the new generation of Rmarkdown files. :grinning_face_with_star_eyes: See the [Hello Quarto](https://quarto.org/docs/get-started/hello/rstudio.html) tutorial to get started.
# Packages / Libraries
A _package_ is a collection of functions that can be used to do things like create graphics or run advanced models. To get access to these functions you have to _load_ the package by typing
```{r}
library(packagename)
```
where `packagename` is the name of the package you are trying to load. Examples include `tidyverse`, `here` and `janitor`.
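A package has to be installed once with `install.packages("packagename")` before `library()` can load it. As a minimal sketch, the example below uses `stats`, which ships with R, so it runs without installing anything. It also shows the `package::function()` form, which calls a function by naming its package explicitly.

```r
# stats ships with R, so no install step is needed for this illustration
library(stats)

x <- c(2, 4, 9)

# After library(), functions can be called by name
loaded_call <- median(x)

# package::function() names the package explicitly
namespaced_call <- stats::median(x)

# Both calls use the same function, so the results match
identical(loaded_call, namespaced_call)
```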
# Import and preprocessing
* External data files are often saved as text files (`.txt`), comma separated values (`.csv`) or as Excel files (`.xlsx`)
## Import Data into R
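If my data set is named **AddHealth_Wave_IV.csv** and it is located in my `data` folder, then it can be read into R using the `read_csv` function (from `readr`, which is loaded with `tidyverse`). The `here()` function from the `here` package builds the file path starting from the project folder, so both packages need to be loaded first.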
```{r}
raw <- read_csv(here("data", "AddHealth_Wave_IV.csv"))
```
Here I am assigning the data set to the name `raw`.
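After importing, it helps to confirm that the data arrived as expected. The sketch below is self-contained: it writes a tiny made-up data set to a temporary file and uses base R's `read.csv` instead of `read_csv`, so it runs without extra packages. The names `toy`, `csv_path` and `raw_toy` are made up for illustration.

```r
# Write a tiny made-up data set to a temporary file so there is something to read
csv_path <- tempfile(fileext = ".csv")
toy <- data.frame(id = 1:3, sex = c("F", "M", "F"))
write.csv(toy, csv_path, row.names = FALSE)

# Read it back in and assign it to a name
raw_toy <- read.csv(csv_path)

# Quick checks that the import worked
dim(raw_toy)    # number of rows, then number of columns
names(raw_toy)  # column names
head(raw_toy)   # first few rows
```

The same three checks (`dim`, `names`, `head`) work on a data set read in with `read_csv`.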
## Keep only selected variables
```{r}
mydata <- raw %>%
  select(BIO_SEX, H4TO1, H4TO2)
```
## Check the data type
```{r}
class(mydata$H4TO1)
```
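To see what `class()` reports for the common data types, here is a small sketch using a made-up data frame (the column names are hypothetical, not from the AddHealth data).

```r
# A small made-up data frame with one column of each common type
toy <- data.frame(
  age    = c(21.5, 34, 29),
  visits = c(1L, 4L, 2L),
  name   = c("a", "b", "c"),
  smoker = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
toy$group <- factor(c("x", "y", "x"))

class(toy$age)     # "numeric"
class(toy$visits)  # "integer"
class(toy$name)    # "character"
class(toy$smoker)  # "logical"
class(toy$group)   # "factor"

# Check every column at once
col_classes <- sapply(toy, class)
col_classes
```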
## Export R formatted data
```{r}
# The file location must be passed through the `file` argument
save(mydata, file = here("data", "addhealth_clean.Rdata"))
```
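A file written by `save()` is read back with `load()`, which restores the object under its original name. The sketch below shows the round trip with a made-up data frame and a temporary file; `toy` and `rdata_path` are hypothetical names.

```r
toy <- data.frame(id = 1:3, score = c(90, 85, 77))  # made-up data
rdata_path <- tempfile(fileext = ".Rdata")

# save() takes the object(s) first and the location via `file`
save(toy, file = rdata_path)

# Remove the object, then restore it under its original name with load()
rm(toy)
load(rdata_path)

# The data frame is back in the workspace
exists("toy")
```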
## Nice settings for your rendered document
At the top of your `qmd` file, in the YAML header (between the `---` lines), add the following:
```md
title: "Describing Relationships between variables"
date: "2022-09-19"
author: "Robin Donatello"
format: pdf
execute:
  echo: true
```
## Hide warning messages
#### For a single code chunk:
Inside the code chunk, at the top, use the hash-pipe (`#|`) to set code chunk options.
```r
#| warning: false
#| message: false
table(mpg$class)
```
#### For the entire document:
At the top of your `qmd` file, in the YAML header under `execute`, add lines to disable `warning` and `message`.
```md
execute:
  echo: true
  warning: false
  message: false
```
## Applying labels to the levels of a categorical variable
```{r}
depress$marital <- factor(depress$marital,
                          labels = c("Never Married", "Married",
                                     "Divorced", "Separated", "Widowed"))
```
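Labels are matched to levels by position, so it is worth checking the result. The sketch below assumes, hypothetically, that marital status is stored as the numeric codes 1 through 5 in the order shown; the vector `marital_code` is made up for illustration.

```r
# Hypothetical numeric codes for marital status (1 through 5)
marital_code <- c(1, 2, 2, 5, 3)

# Stating `levels` explicitly makes the code-to-label mapping visible
marital <- factor(marital_code,
                  levels = 1:5,
                  labels = c("Never Married", "Married",
                             "Divorced", "Separated", "Widowed"))

# Cross-tabulate codes against labels to confirm the mapping
table(marital_code, marital)
```

Each code should land in exactly one label column. A code that shows up under the wrong label means the `labels` vector is in the wrong order.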
## Force a page break
Put `\newpage` on a line by itself. This is a LaTeX command, so it applies to PDF output.
----
# Data Exploration and Visualization
## Side by side plots
This uses div tags. Copy it exactly, then insert your own code chunks.
```verbatim=
:::: {.columns}
::: {.column width="50%"}
code chunk to create plot 1
:::
::: {.column width="50%"}
code chunk to create plot 2
:::
::::
```
## Expand the y axis so that your numbers can be seen
Expand the y-axis to a number a little larger than your max value using `ylim(lower, upper)`. The example uses `plot_frq` from the `sjPlot` package and the `mpg` data set from `ggplot2`.
```{r}
plot_frq(mpg$class) + ylim(0, 100)
```
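Rather than guessing the upper limit, it can be computed from the data. This is a minimal sketch using base R only, with a made-up vector of categories; the 10% padding is an arbitrary choice.

```r
# Made-up categorical values standing in for a real variable
classes <- c("compact", "suv", "suv", "midsize", "suv", "compact")

# Frequency of each category; the tallest bar is the max count
counts <- table(classes)

# Pad the max by 10% and round up to a whole number
upper <- ceiling(max(counts) * 1.1)

y_limits <- c(0, upper)
y_limits
```

The computed value could then be used in the plot as `+ ylim(0, upper)`.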