
GPS Skills Course Series - R (2023Winter)

Course Details
Dates: January 30th, 2023 - February 17th, 2023
Days: Tuesday/Thursday
Time: 2-3:20pm

Course Agenda:
(Github page)

Software Installation:
*Refer to Canvas course software download files.

Lesson Data (download)
*Refer to Canvas course data download files.

NOTES:

A copy of the instructor's live session notes will be uploaded to Canvas at the end of each session.

(Include only for Python course)
Jupyterlab will be used for the lessons
[m] Markdown cell = notes
[#]also works in code cell for notes
[b] = add cell below [a] is above
[r]Raw cells cannot have text edits

https://www.markdownguide.org/getting-started/
https://www.markdownguide.org/basic-syntax/
End

Session 1

Name (Last, First) / Email

  1. Doe, John / jdoe@ucsd.edu (example)
  2. Kang, Eastern / dkangsim@health.ucsd.edu (instructor)
  3. Romstadt, Lisa / lromstadt@ucsd.edu
  4. Kazden, Matthew/mkazden@ucsd.edu
  5. Yamada, Sohei / syamada@ucsd.edu
  6. Cho, Harrison / hhc002@ucsd.edu
  7. Weitong Li / wel056@ucsd.edu
  8. Paula Jaramillo/ pjaramillo@ucsd.edu
  9. Nakamura, Masaki / m1nakamura@ucsd.edu
  10. Phrasavath, Bonaly / bphrasav@ucsd.edu
  11. Ilaria, Simmen/ ismmen@ucsd.edu
  12. Hernandez, I
  13. Lascano, Dayra / dlascano@ucsd.edu
  14. Orosz, Stephanie / sorosz@ucsd.edu
  15. Faiaz, Muhtadi / mfaiaz@ucsd.edu
  16. Pykurel,Himanshu/ Hpykurel@ucsd.edu,
  17. Zhuohan, Fang / zhf003@ucsd.edu
  18. Kato, Hiroaki / h
  19. Nogueda,Lesley/ lnogueda@ucsd.edu
  20. Edwards, Lauryn/ ltedward@ucsd.edu
  21. Berman-Schneider, Ray/ raybscneid@gmail.com
  22. Ye Yuan/ yeyuan@ucsd.edu
  23. Junyi Hua/juhua@ucsd.edu
  24. Cuevas, Christopher/ccuevas@ucsd.edu
  25. Tianning, Lan / tilan@ucsd.edu
  26. Coyle, Jackson / jacoyle@ucsd.edu
  27. Yamakawa, Daisuke / dayamakawa@ucsd.edu
  28. Groper, Danielle / dgroper@ucsd.edu
  29. Tsuru, Yutaro / ytsuru@ucsd.edu
  30. Jordan Chu / j1chu@ucsd.edu
  31. Bo,Yutong/yubo@ucsd.edu
  32. Yamashita, Daichi / dyamashita@ucsd.edu
  33. Yamashita, Aoi / aoyamashita@ucsd.edu
  34. Tan,Jingyi/ j3tan@ucsd.edu
  35. Artan, Warsan / wartan@ucsd.edu
  36. Xuran, Wang / xuw028@ucsd.edu
  37. Ito, Yoshi / yoito@ucsd.edu
  38. Chen, Wenxin / wec027@ucsd.edu
  39. Zhiyuan,Chi /z1chi@ucsd.edu
  40. Yuxin Guo / yug041@ucsd.edu
  41. Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
  42. Ahluwalia, Gurkriti / gahluwalia@ucsd.edu

Notes

You can use R as a calculator

Variables and assignments

use assignment operator <- to assign values to variables

x <- 1/40
x_2 <- x*2 # try to use unique variable names
df <- x*2
rm(x) # delete variable
?rm # use ? followed by a function name to get help in RStudio
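
As a small sketch of both ideas above, the lines below use R as a calculator and then store a result with `<-`. The variable names `radius` and `area` are made up for illustration and are not from the session.

```r
# R as a calculator: expressions are evaluated and printed
1 + 100
3 + 5 * 2        # multiplication happens before addition
(3 + 5) * 2      # parentheses change the order

# Assignment: store a value, then reuse it
radius <- 2
area <- pi * radius^2
area

# rm() deletes a variable; exists() checks whether a name is still defined
rm(radius)
exists("radius")
```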

Session 1 Questions:

Please enter any questions not answered during live session here:
1.

End Session 1

Session 2

Name (Last, First) / Email

  1. Doe, John / jdoe@ucsd.edu (example)
  2. Phrasavath, Bonaly / bphrasav@ucsd.edu
  3. Hernandez, Isai / i4hernandez@ucsd.edu
  4. Tianning, Lan / tilan@ucsd.edu
  5. Weitong Li / wel056@ucsd.edu
  6. Romstadt,Lisa / lromstadt@ucsd.edu
  7. Lascano, Dayra / dlascano@ucsd.edu
  8. Yamada, Sohei / syamada@ucsd.edu
  9. Tsuru, Yutaro / ytsuru@ucs.edu
  10. ilaria simmen / isimmen@oxy.edu
  11. Orosz, Stephanie/ sorosz@ucsd.edu
  12. Lesley Nogueda / lnogueda@ucsd.edu
  13. Himanshu Pykurel/ hpykurel@ucsd.edu
  14. Xiangning Wu / xiw138@ucsd.edu
  15. Nakamura, Masaki / m1nakamura@ucsd.edu
  16. Cuevas, Christopher/ccuevas@ucsd.edu
  17. Yamashita, Daichi / dyamashita@ucsd.edu
  18. Tan,Jingyi /j3tan@ucsd.edu
  19. Groper, Danielle / dgroper@ucsd.edu
  20. Xuran, Wang / xuw028@ucsd.edu
  21. Coyle, Jackson / jacoyle@ucsd.edu
  22. Artan, Warsan / wartan@ucsd.edu
  23. Yamakawa, Daisuke / dayamakawa@ucsd.edu
  24. Ito, Yoshi / yoito@ucsd.edu
  25. Chen, Wenxin / wec027@ucsd.edu
  26. Tsuru, Yutaro / ytsuru@ucsd.edu
  27. Chu, Jordan / j1chu@ucsd.edu
  28. Zhiyuan, Chi / z1chi@ucsd.edu
  29. Yuxin Guo / yug041@ucsd.edu
  30. Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
  31. Ahluwalia, Gurkriti / gahluwalia@ucsd.edu

Notes:

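coat <- c("calico", "black", "tabby")
weight <- c(2.1, 5.0, 3.2)
likes_string <- c(1, 0, 1)
# combine the three vectors into the data frame used below
cats <- data.frame(coat, weight, likes_string)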

creating a csv and saving in a data folder

dir.create("data") # creating a directory
# creating a .csv file
write.csv(cats, file = "data/feline_data.csv", row.names = FALSE)
# reading feline_data.csv back in and assigning it to the variable cats
cats <- read.csv(file = "data/feline_data.csv")
cats$weight
cats$coat
cats$weight + 2
cats$weight_plus_2 <- cats$weight + 2
paste("My cat is", cats$coat)
View(cats)
cats$weight + cats$coat # fails: coat is not numeric
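
The write-then-read round trip can be sketched as follows. This is a minimal example that writes to a temporary file rather than `data/feline_data.csv` (an assumption made so it runs anywhere); `csv_path` and `cats_from_file` are illustrative names.

```r
# Build the small example data frame from the notes
cats <- data.frame(
  coat = c("calico", "black", "tabby"),
  weight = c(2.1, 5.0, 3.2),
  likes_string = c(1, 0, 1)
)

# Write to a temporary file (assumption: stands in for data/feline_data.csv)
csv_path <- tempfile(fileext = ".csv")
write.csv(cats, file = csv_path, row.names = FALSE)

# write.csv() only saves the file; read.csv() is what loads it back
cats_from_file <- read.csv(csv_path)

# Column arithmetic works on numeric columns
cats_from_file$weight_plus_2 <- cats_from_file$weight + 2
cats_from_file
```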

5 main data types in R

  • double
  • integer
  • complex
  • logical
  • character
# use typeof() to find data type
typeof(cats$weight)
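
A quick sketch of `typeof()` on one value of each of the five types listed above. The literal values are arbitrary examples. The last two lines show why the column should not be quoted: a quoted name is just a string.

```r
typeof(3.14)     # "double"
typeof(1L)       # "integer" (the L suffix makes an integer)
typeof(1 + 2i)   # "complex"
typeof(TRUE)     # "logical"
typeof("tabby")  # "character"

# Quoting a variable name turns it into a plain string
weight <- c(2.1, 5.0, 3.2)
type_of_vector <- typeof(weight)    # "double"
type_of_string <- typeof("weight")  # "character"
```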

##############################

Quick clarification for converting variable types.

Suppose you have a mixed variable type recorded as presented in the class:

weight<- c(2.1,5.0, 3.2, "a")

R is going to recognize that as character type because of "a".
If you import a csv file containing this kind of mixed column and use the option stringsAsFactors = TRUE, the character column is read in as a factor. A factor stores each distinct value as an integer code that points to a level.

We can convert weight to numeric type using the as.numeric() function, but because R treats the weight variable as a factor, the output is the integer codes of the levels rather than the original numbers. The levels are sorted as text, so for our example 2.1 = 1, 3.2 = 2, 5.0 = 3, and "a" = 4.

However, if you select the stringsAsFactors = FALSE option, R reads the weight variable as character type, and converting it to numeric produces NA for any entry that cannot be read as a number.

Please see table for a brief summary:

| | Original data | Converted using as.numeric() |
|---|---|---|
| when string is factor | (2.1, 5.0, 3.2, "a") | (1, 3, 2, 4) |
| when string is character | (2.1, 5.0, 3.2, "a") | (2.1, 5.0, 3.2, NA) |
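
The two rows of the summary can be reproduced without a csv file. This is a sketch that builds the factor directly with `factor()` instead of importing with stringsAsFactors = TRUE; the names `weight_factor`, `codes`, `values`, and `recovered` are illustrative.

```r
weight <- c(2.1, 5.0, 3.2, "a")
typeof(weight)                      # "character": every element was coerced to text

# Case 1: the column is a factor (what stringsAsFactors = TRUE produces)
weight_factor <- factor(weight)
levels(weight_factor)               # levels are sorted as text
codes <- as.numeric(weight_factor)  # integer level codes, not the original numbers

# Case 2: the column is character (stringsAsFactors = FALSE)
# as.numeric() warns about the entry it cannot parse and returns NA for it
values <- suppressWarnings(as.numeric(weight))

# Recovering the numbers from a factor: convert to character first
recovered <- suppressWarnings(as.numeric(as.character(weight_factor)))
```

Going through `as.character()` first is the usual way to get the original numbers back out of a factor.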

Session 2 Questions:

Please enter any questions not answered during live session here:
1.

End Session 2


Session 3

Name (Last, First) / Email

Doe, John / jdoe@ucsd.edu (example)
Nakamura, Masaki / m1nakamura@ucsd.edu
Yamada, Sohei / syamada@ucsd.edu
Tianning, Lan / tilan@ucsd.edu
Phrasavath, Bonaly / bphrasav@ucsd.edu
Orosz, Stephanie / sorosz@ucsd.edu
Nogueda, Lesley/lnogueda@ucsd.edu
Paula Jaramillo/pjaramillo@ucsd.edu
Pykurel, Himanshu/ hpykurel@ucsd.edu
Cho, Harrison / hhc002@ucsd.edu
Artan, Warsan / wartan@ucsd.edu
ilaria simmen
Lascano, Dayra / dlascano@ucsd.edu
Chen, Wenxin / wec027@ucsd.edu
Groper, Danielle / dgroper@ucsd.edu
Coyle, Jackson / jacoyle@ucsd.edu
Romstadt, Lisa / lromstadt@ucsd.edu
Tsuru, Yutaro / ytsuru@ucsd.edu
Cuevas, Christopher/ccuevas@ucsd.edu
Xuran, Wang / xuw028@ucsd.edu
Chu, Jordan / j1chu@ucsd.edu
Yamakawa, Daisuke / dayamakawa@ucsd.edu
Zhiyuan, Chi / z1chi@ucsd.edu
Tan,Jingyi / j3tan@ucsd.edu
Yuxin Guo / yug041@ucsd.edu
Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
Ahluwalia, Gurkriti / gahluwalia@ucsd.edu


Notes

importing data

# download data directly from a link (uncomment to run)
# make sure to have the "data" folder in your RStudio working directory
# download.file("https://raw.githubusercontent.com/datacarpentry/r-intro-geospatial/master/_episodes_rmd/data/gapminder_data.csv", destfile = "data/gapminder_data.csv")
# read the file into the gapminder variable used below
gapminder <- read.csv("data/gapminder_data.csv")

exploring data

str(gapminder)
class(gapminder$year)
nrow(gapminder)
ncol(gapminder)
dim(gapminder)
colnames(gapminder)
head(gapminder)
head(gapminder, n=10)

Challenge 1
answer:

# check the last few lines
tail(gapminder)
tail(gapminder, n=15)
# random rows
gapminder[sample(nrow(gapminder), 5),]

challenge 2
answer:

str(gapminder)

#selecting data by cutoff

below_average <- gapminder$lifeExp < 70.5
head(gapminder)

adding wrong length vector

below_average <- c(TRUE, TRUE, TRUE, TRUE)
head(cbind(gapminder, below_average))
below_average <- c(TRUE, FALSE, FALSE, TRUE)
head(cbind(gapminder, below_average))
# overwrite content with new data
below_average <- as.logical(gapminder$lifeExp < 70.5)
below_average
gapminder <- cbind(gapminder, below_average)
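
What happens with a vector of the wrong length depends on whether its length divides the number of rows. A minimal sketch on a made-up four-row data frame (`toy`, `flag`, and `is_large` are hypothetical names, not part of gapminder):

```r
toy <- data.frame(id = 1:4)

# Length 2 divides 4 rows evenly, so the vector is recycled silently
flag_recycled <- cbind(toy, flag = c(TRUE, FALSE))

# Length 3 does not divide 4 rows, so cbind() stops with an error
result <- tryCatch(
  cbind(toy, flag = c(TRUE, FALSE, TRUE)),
  error = function(e) "error: differing number of rows"
)

# A vector computed from a column always has the right length
toy$is_large <- toy$id > 2
```

Silent recycling is the riskier case, because the new column looks valid even though most of its values were never supplied.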

add new row

new_row <- list('Norway', 2016, 500000, 'nordic', 80.3, 49400.0, FALSE)
new_row
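
A list like `new_row` is typically appended with `rbind()`. The sketch below assumes that is the intended next step and uses a small made-up data frame (`toy`, with placeholder values) so it runs without the gapminder file.

```r
# Toy data frame with placeholder values (not real gapminder data)
toy <- data.frame(
  country = c("A", "B"),
  year = c(2007, 2007),
  lifeExp = c(70.0, 75.0),
  stringsAsFactors = FALSE
)

# The list must supply one value per column, in column order
new_row <- list("Norway", 2016, 80.3)
toy <- rbind(toy, new_row)
nrow(toy)
```

If `country` were a factor, a value that is not already one of its levels would come through as NA with a warning, which is one reason to convert such columns to character first.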

factors

# makes continent a character
gapminder$continent <- as.character(gapminder$continent)
str(gapminder)

dplyr

# installing packages:
install.packages("dplyr")
library(dplyr)
year_country_gdp <- select(gapminder, year, country, gdpPercap)
head(year_country_gdp)
year_country_gdp <- gapminder %>% select(year, country, gdpPercap)
year_country_gdp_euro <- gapminder %>%
  filter(continent == "Europe") %>%
  select(year, country, gdpPercap)
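
To see what the pipe does, the sketch below runs the same filter-then-select steps twice on a made-up data frame: once as nested function calls and once as a pipeline. The data frame `toy` and its values are hypothetical; only the column names mirror gapminder.

```r
library(dplyr)

# Toy data (made-up values) with the same column names as gapminder
toy <- data.frame(
  country = c("A", "B", "C"),
  continent = c("Europe", "Asia", "Europe"),
  year = c(2007, 2007, 2007),
  gdpPercap = c(100, 200, 300)
)

# Nested calls: filter rows first, then keep three columns
nested <- select(filter(toy, continent == "Europe"), year, country, gdpPercap)

# Same steps written as a pipeline, read top to bottom
piped <- toy %>%
  filter(continent == "Europe") %>%
  select(year, country, gdpPercap)

# Compare the two results
all.equal(nested, piped)
```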

Session 3 Questions:

Please enter any questions not answered during live session here:
1.

End Session 3

Session 4

Name (Last, First) / Email

  1. Doe, John / jdoe@ucsd.edu (example)
  2. Pykurel, Himanshu/ hpykurel@ucsd.edu
  3. Ilaria Simmen/ isimmen@ucsd.edu
  4. Phrasavath, Bonaly / bphrasav@ucsd.edu
  5. Lascano, Dayra / dlascano@ucsd.edu
  6. Nogueda, Lesley / lnogueda@ucsd.edu
  7. Tsuru, Yutaro / ytsuru@ucsd.edu
  8. Nakamura, Masaki / m1nakamura@ucsd.edu
  9. Cuevas, Christopher/ccuevas@ucsd.edu
  10. Groper, Danielle / dgroper@ucsd.edu
  11. Artan, Warsan / wartan@ucsd.edu
  12. Coyle, Jackson / jacoyle@ucsd.edu
  13. Xuran, Wang / xuw028@ucsd.edu
  14. Chu, Jordan / j1chu@ucsd.edu
  15. Zhiyuan, Chi / z1chi@ucsd.edu
  16. Chen, Wenxin / wec027@ucsd.edu
  17. Tan, Jingyi / j3tan@ucsd.edu
  18. Yuxin Guo / yug041@ucsd.edu
  19. Yamakawa, Daisuke / dayamakawa@ucsd.edu
  20. Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
  21. Ahluwalia, Gurkriti / gahluwalia@ucsd.edu

NOTES

# download data directly from a link (uncomment to run)
# make sure to have the "data" folder in your RStudio working directory
# download.file("https://raw.githubusercontent.com/datacarpentry/r-intro-geospatial/master/_episodes_rmd/data/gapminder_data.csv", destfile = "data/gapminder_data.csv")
# download `gapminder_wide` data from a link
download.file("https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_wide.csv", destfile = "data/gapminder_wide.csv")

loading libraries

install.packages("tidyverse", dependencies = TRUE)
library(tidyr)
library(dplyr)
library(tidyverse)

loading data

# assign data to gapminder variable
gapminder <- read.csv("data/gapminder_data.csv", stringsAsFactors = FALSE)
gap_wide <- read.csv("data/gapminder_wide.csv", stringsAsFactors = FALSE)

converting wide to long format

gap_long <- gap_wide %>%
  pivot_longer(
    cols = c(starts_with('pop'), starts_with('lifeExp'), starts_with('gdpPercap')),
    names_to = "obstype_year",
    values_to = "obs_values"
  )
View(gap_long) # to view the data
# split obstype_year into two columns; year only exists after this step
gap_long <- gap_long %>% separate(obstype_year, into = c('obs_type', 'year'), sep = "_")
gap_long$year <- as.integer(gap_long$year)
View(gap_long)
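
The same wide-to-long steps can be tried on a tiny made-up table, which makes it easier to see what each step produces. `toy_wide` and its values are hypothetical; the column naming pattern (`type_year`) follows the gapminder_wide file.

```r
library(dplyr)
library(tidyr)

# Two countries, two observation types, two years (made-up values)
toy_wide <- data.frame(
  country = c("A", "B"),
  pop_1952 = c(10, 20),
  pop_1957 = c(11, 21),
  lifeExp_1952 = c(50, 60),
  lifeExp_1957 = c(51, 61)
)

toy_long <- toy_wide %>%
  # stack the four measurement columns into name/value pairs
  pivot_longer(
    cols = c(starts_with("pop"), starts_with("lifeExp")),
    names_to = "obstype_year",
    values_to = "obs_values"
  ) %>%
  # split "pop_1952" into "pop" and "1952"
  separate(obstype_year, into = c("obs_type", "year"), sep = "_")

# separate() leaves year as text, so convert it afterwards
toy_long$year <- as.integer(toy_long$year)
toy_long
```

Each country contributes one row per measurement column, so the two-row wide table becomes eight rows in long format.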

Session 4 Questions:

Please enter any questions not answered during live session here:
1.

End Session 4

Session 5

Name (Last, First) / Email

Doe, John / jdoe@ucsd.edu (example)
Nakamura, Masaki / m1nakamura@ucsd.edu
Phrasavath, Bonaly / bphrasav@ucsd.edu
Lascano, Dayra / dlascano@ucsd.edu
Ye, Yuan / yeyuan@ucsd.edu
Tianning , Lan / tilan@ucsd.edu
Tsuru, Yutaro / ytsuru@ucsd.edu
Chen, Wenxin / wec027@ucsd.edu
Coyle, Jackson / jacoyle@ucsd.edu
Tan, Jingyi/ j3tan@ucsd.edu
Groper, Danielle / dgroper@ucsd.edu
Artan, Warsan / wartan@ucsd.edu
Cuevas, Christopher/ccuevas@ucsd.edu
Xuran, Wang / xuw028@ucsd.edu
Yuxin Guo / yug041@ucsd.edu
Yamakawa, Daisuke / dayamakawa@ucsd.edu
Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
Chu, Jordan / j1chu@ucsd.edu
Ahluwalia, Gurkriti / gahluwalia@ucsd.edu

notes

using ggplot

library("ggplot2")
ggplot(data= gapminder, aes(x= gdpPercap, y=lifeExp)) + geom_point()
### challenge answer
ggplot(data= gapminder, aes(x= gdpPercap, y=lifeExp, color=continent)) + geom_point()

layers

ggplot(gapminder, aes(x = year, y=lifeExp, color=continent)) + geom_line()
ggplot(gapminder, aes(x = year, y=lifeExp, color=continent, group = country)) + geom_line()
ggplot(gapminder, aes(x = year, y=lifeExp, color=continent, group = country)) + geom_line() + geom_point()
ggplot(gapminder, aes(x = year, y=lifeExp, group = country)) + geom_line(aes(color=continent)) + geom_point()
# challenge: switch geom_line() and geom_point()
# challenge answer
ggplot(gapminder, aes(x = year, y=lifeExp, group = country)) + geom_point() + geom_line(aes(color=continent))

transformation and stats

ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp)) + geom_point(alpha=0.5) + scale_x_log10()

changing alpha

# mapping alpha to a variable: you will see a warning, but the plot still works
ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp)) + geom_point(aes(alpha=continent)) + scale_x_log10()
# fixed values such as alpha = 0.5 go outside aes()
ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp)) + geom_point(alpha=0.5) + scale_x_log10() + geom_smooth(method = "lm", size = 1.5)
# challenge answer
ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp)) + geom_point(alpha=0.5, color="orange") + scale_x_log10() + geom_smooth(method = "lm", size = 1.5)
# challenge answer
ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp, color=continent)) + geom_point(aes(shape=continent)) + scale_x_log10() + geom_smooth(method = "lm", size = 1.5)

#multi-panel figures using facet_wrap

americas <- gapminder[gapminder$continent == "Americas",]
ggplot(americas, aes(x=year, y=lifeExp)) +
  geom_line() +
  facet_wrap(~country) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(x="year", y="life Expectancy", title="figure 1", color= "continent")

useful ggplot cheatsheet:

https://statsandr.com/blog/files/ggplot2-cheatsheet.pdf

## challenge
ggplot(gapminder, aes(x=continent, y=lifeExp, fill=continent)) +
  geom_boxplot() +
  facet_wrap(~year) +
  ylab("life Expectancy") +
  theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank())

Session 5 Questions:

Please enter any questions not answered during live session here:
1.

End Session 5

Session 6

Name (Last, First) / Email

Doe, John / jdoe@ucsd.edu (example)
Pykurel, Himanshu / hpykurel@ucsd.edu
Lascano, Dayra / dlascano@ucsd.edu
Nogueda, Lesley / lnogueda@ucsd.edu
Nakamura, Masaki / m1nakamura@ucsd.edu
Phrasavath, Bonaly / bphrasav@ucsd.edu
Tsuru, Yutaro / ytsuru@ucsd.edu
Artan, Warsan / wartan@ucsd.edu
Groper, Danielle / dgroper@ucsd.edu
Cuevas, Christopher/ccuevas@ucsd.edu
Xuran, Wang / xuw028@ucsd.edu
Yuxin Guo / yug041@ucsd.edu
Coyle, Jackson / jacoyle@ucsd.edu
Chen, Wenxin / wec027@ucsd.edu
Yamakawa, Daisuke / dayamakawa@ucsd.edu
Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
Chu, Jordan / j1chu@ucsd.edu
Ahluwalia, Gurkriti / gahluwalia@ucsd.edu

# make sure to have the "data" folder in your RStudio working directory
download.file("https://raw.githubusercontent.com/datacarpentry/r-intro-geospatial/master/_episodes_rmd/data/gapminder_data.csv", destfile = "data/gapminder_data.csv")
https://drive.google.com/file/d/15ei1LmUURIvbJIjS9J-Y6qlyRx1jrpFB/view?usp=share_link

Notes

library(tidyverse)
# read in gapminder data
# source("filename.R") runs a group of functions stored in another .R file
source("programming-with-r.R")
# useful for importing data sets and creating an R pipeline
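
A minimal sketch of what `source()` does, using a throwaway script written to a temporary file so the example is self-contained. The helper `double_it` and the variable `script_path` are made up for illustration and are not the contents of `programming-with-r.R`.

```r
# Write a one-function script to a temporary .R file
script_path <- tempfile(fileext = ".R")
writeLines(
  c(
    "# helper defined in a separate script",
    "double_it <- function(x) x * 2"
  ),
  script_path
)

# source() runs every line of the file, so double_it() is now defined
source(script_path)
double_it(21)
```

Keeping data-loading and helper functions in a separate script and sourcing it at the top of each analysis file keeps every analysis using the same definitions.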

using .rmd rnotebook

Create an R Notebook file:
File menu > New File > R Notebook

#filter the country to plot

gap_to_plot <- gapminder
est <- read_csv("data/countries_estimated.csv")

Session 6 Questions:

Please enter any questions not answered during live session here:
1.

End Session 6