Course Details
Dates: January 30th, 2023 - February 17th, 2023
Days: Tuesday/Thursday
Time: 2-3:20pm
Course Agenda:
(Github page)
Software Installation:
*Refer to Canvas course software download files.
Lesson Data (download)
*Refer to Canvas course data download files.
A copy of the instructor's live session notes will be uploaded to Canvas at the end of each session.
(Include only for Python course)
JupyterLab will be used for the lessons
[m] = Markdown cell, for notes
[#] also works in a code cell for notes
[b] = add cell below, [a] = add cell above
[r] = Raw cell (raw cells cannot have text edits)
https://www.markdownguide.org/getting-started/
https://www.markdownguide.org/basic-syntax/
End
use assignment operator <-
to assign values to variables
x <- 1/40
x_2 <- x*2 # try to use unique variables names
df <- x*2
rm(x) # delete variable
?rm() # use ? and function to get help in RStudio
Please enter any questions not answered during live session here:
1.
coat <- c("calico", "black", "tabby")
weight <- c(2.1, 5.0, 3.2)
likes_string <- c(1, 0, 1)
# combine the vectors into a data frame
cats <- data.frame(coat, weight, likes_string)
creating a csv and saving it in a data folder
dir.create("data") # create a directory named "data"
# write the cats data frame to a .csv file
write.csv(cats,
          file = "data/feline_data.csv",
          row.names = FALSE)
# read feline_data.csv back in and assign it to the variable cats
cats <- read.csv(file = "data/feline_data.csv")
cats$weight
cats$coat
cats$weight + 2
cats$weight_plus_2 <- cats$weight + 2
paste("My cat is", cats$coat)
View(cats)
cats$weight + cats$coat # does not work: coat is not numeric
# use typeof() to find data type
typeof(cats$weight)
##############################
Suppose you have a mixed variable recorded as presented in class:
weight <- c(2.1, 5.0, 3.2, "a")
R will treat it as character type because of "a".
If you import a csv file containing this kind of mixed column with the option stringsAsFactors = TRUE, the character column is read as a factor, which R stores internally as integer codes.
We can convert weight back to numeric type using the as.numeric() function, but because R treats weight as a factor, the converted output is the integer codes 1, 2, 3, ... assigned to the sorted levels, not the original values. For our example, 2.1 = 1, 3.2 = 2, 5.0 = 3, and "a" = 4.
However, if you select the stringsAsFactors = FALSE option, R reads weight as character type, and converting it to numeric produces NA for any value that cannot be parsed as a number ("a" in our example).
Please see the table for a brief summary:
| | Original data | Converted using as.numeric() |
|---|---|---|
| when string is factor | (2.1, 5.0, 3.2, "a") | (1, 3, 2, 4) |
| when string is character | (2.1, 5.0, 3.2, "a") | (2.1, 5.0, 3.2, NA) |
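The two rows of the table can be reproduced with a short sketch. The vector names below (weight_chr, weight_fct, codes, values, recovered) are made up for illustration, and the column is built by hand rather than read from a csv file. The last step shows one common way to get the real numbers back from a factor: convert to character first.

```r
# Hypothetical mixed column: three numbers and one stray letter
weight_chr <- c("2.1", "5.0", "3.2", "a")

# Case 1: the column was read as a factor
weight_fct <- factor(weight_chr)
levels(weight_fct)               # the levels are stored in sorted order
codes <- as.numeric(weight_fct)  # returns the level codes, not the values

# Case 2: the column was read as character
# "a" cannot be parsed, so it becomes NA (R also emits a coercion warning,
# silenced here with suppressWarnings())
values <- suppressWarnings(as.numeric(weight_chr))

# Recovering the real numbers from a factor: go through character first
recovered <- suppressWarnings(as.numeric(as.character(weight_fct)))

codes
values
recovered
```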
Please enter any questions not answered during live session here:
1.
Doe, John / jdoe@ucsd.edu (example)
Nakamura, Masaki / m1nakamura@ucsd.edu
Yamada, Sohei / syamada@ucsd.edu
Tianning, Lan / tilan@ucsd.edu
Phrasavath, Bonaly / bphrasav@ucsd.edu
Orosz, Stephanie / sorosz@ucsd.edu
Nogueda, Lesley/lnogueda@ucsd.edu
Paula Jaramillo/pjaramillo@ucsd.edu
Pykurel, Himanshu/ hpykurel@ucsd.edu
Cho, Harrison / hhc002@ucsd.edu
Artan, Warsan / wartan@ucsd.edu
ilaria simmen
Lascano, Dayra / dlascano@ucsd.edu
Chen, Wenxin / wec027@ucsd.edu
Groper, Danielle / dgroper@ucsd.edu
Coyle, Jackson / jacoyle@ucsd.edu
Romstadt, Lisa / lromstadt@ucsd.edu
Tsuru, Yutaro / ytsuru@ucsd.edu
Cuevas, Christopher/ccuevas@ucsd.edu
Xuran, Wang / xuw028@ucsd.edu
Chu, Jordan / j1chu@ucsd.edu
Yamakawa, Daisuke / dayamakawa@ucsd.edu
Zhiyuan, Chi / z1chi@ucsd.edu
Tan,Jingyi / j3tan@ucsd.edu
Yuxin Guo / yug041@ucsd.edu
Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
Ahluwalia, Gurkriti / gahluwalia@ucsd.edu
# download data directly from a link.
# make sure to have the "data" folder in your RStudio working directory
download.file("https://raw.githubusercontent.com/datacarpentry/r-intro-geospatial/master/_episodes_rmd/data/gapminder_data.csv",
              destfile = "data/gapminder_data.csv")
# read the downloaded file into the gapminder variable
gapminder <- read.csv("data/gapminder_data.csv")
str(gapminder)
class(gapminder$year)
nrow(gapminder)
ncol(gapminder)
dim(gapminder)
colnames(gapminder)
head(gapminder)
head(gapminder, n=10)
Challenge 1
answer:
# check the last few lines
tail(gapminder)
tail(gapminder, n=15)
#random rows
gapminder[sample(nrow(gapminder), 5),]
challenge 2
answer:
str(gapminder)
#selecting data by cutoff
below_average <- gapminder$lifeExp < 70.5
head(gapminder)
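# adding a vector of the wrong length: a short vector is recycled only if
# its length divides the number of rows evenly; otherwise cbind() gives an error
below_average <- c(TRUE, TRUE, TRUE, TRUE)
head(cbind(gapminder, below_average))
below_average <- c(TRUE, FALSE, FALSE, TRUE)
head(cbind(gapminder, below_average))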
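A small sketch of what happens when the new column does not have one value per row. The 4-row data frame df and its values are made up to stand in for gapminder; the behaviour shown is base R's recycling rule for data frames.

```r
# Hypothetical 4-row data frame standing in for gapminder
df <- data.frame(id = 1:4, lifeExp = c(65.2, 71.0, 80.3, 58.9))

# Length 2 divides 4 evenly, so the two flags are repeated to fill all rows
recycled <- cbind(df, flag = c(TRUE, FALSE))

# Length 3 does not divide 4, so cbind() stops with an error;
# tryCatch() turns that error into the string "error" so the script keeps running
result <- tryCatch(
  cbind(df, flag = c(TRUE, FALSE, TRUE)),
  error = function(e) "error"
)

# Safer: build the flag from the data so it always has one value per row
df$below_average <- df$lifeExp < 70.5

recycled
result
df
```

Silent recycling is easy to miss, which is why computing the column from the data itself is usually the safer habit.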
#overwrite content with new data
below_average <- as.logical(gapminder$lifeExp < 70.5)
below_average
gapminder <- cbind(gapminder, below_average)
new_row <- list('Norway', 2016, 500000, 'nordic', 80.3, 49400.0, FALSE)
new_row
# makes continent a character
gapminder$continent <- as.character(gapminder$continent)
str(gapminder)
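The new_row list above is built but not yet attached to the data. A minimal sketch of that step, assuming a tiny made-up data frame (df, with invented values) whose text columns are already character, which is the reason for the as.character() conversion above:

```r
# Hypothetical miniature data frame with character (not factor) text columns
df <- data.frame(
  country = c("Chile", "Japan"),
  year = c(2007, 2007),
  lifeExp = c(78.6, 82.6),
  stringsAsFactors = FALSE
)

# A row is a list because it mixes types; elements are matched by position
new_row <- list("Norway", 2016, 80.3)

# rbind() appends the list as a new row
df <- rbind(df, new_row)
df
```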
#installing packages:
install.packages("dplyr")
library(dplyr)
year_country_gdp <- select(gapminder, year, country, gdpPercap)
head(year_country_gdp)
year_country_gdp <- gapminder %>%
select(year, country, gdpPercap)
year_country_gdp_euro <- gapminder %>%
filter(continent == "Europe") %>%
select(year, country, gdpPercap)
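To see what the pipe is doing, here is a sketch comparing a nested call with the piped version on a tiny made-up table (mini and its values are invented for illustration). Both forms should give the same result; the pipe only changes how the code reads.

```r
library(dplyr)

# Hypothetical miniature of the gapminder table (values are made up)
mini <- data.frame(
  country = c("France", "Kenya", "Spain"),
  continent = c("Europe", "Africa", "Europe"),
  year = c(2007, 2007, 2007),
  gdpPercap = c(30000, 1500, 28000)
)

# Nested call: has to be read from the inside out
nested <- select(filter(mini, continent == "Europe"), year, country, gdpPercap)

# Piped call: reads top to bottom; %>% passes the left side in as the first argument
piped <- mini %>%
  filter(continent == "Europe") %>%
  select(year, country, gdpPercap)

nested
piped
```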
Please enter any questions not answered during live session here:
1.
# download data directly from a link.
#
download.file("https://raw.githubusercontent.com/datacarpentry/r-intro-geospatial/master/_episodes_rmd/data/gapminder_data.csv",
destfile = "data/gapminder_data.csv")
# make sure to have the "data" folder in your RStudio working directory
# download `gapminder_wide` data from a link.
download.file("https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_wide.csv",
destfile = "data/gapminder_wide.csv")
install.packages("tidyverse", dependencies = TRUE)
library(tidyr)
library(dplyr)
library(tidyverse)
#assign data to gapminder variable
gapminder <- read.csv("data/gapminder_data.csv",
stringsAsFactors = FALSE)
gap_wide <- read.csv("data/gapminder_wide.csv", stringsAsFactors = FALSE)
gap_long <- gap_wide %>%
pivot_longer(
cols = c(starts_with('pop'),
starts_with('lifeExp'),
starts_with('gdpPercap')),
names_to = "obstype_year",
values_to = "obs_values"
)
View(gap_long) #to view the data
# split obstype_year into obs_type and year first, then convert year to integer
gap_long <- gap_long %>%
  separate(obstype_year, into = c('obs_type', 'year'), sep = "_")
gap_long$year <- as.integer(gap_long$year)
View(gap_long)
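The same wide-to-long steps on a data frame small enough to check by eye. This is only a sketch: the table wide and its values are made up, and it uses just two variables (pop and lifeExp) and two years.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide table: one row per country, one column per variable_year
wide <- data.frame(
  country = c("A", "B"),
  pop_1952 = c(100, 200),
  pop_2007 = c(150, 260),
  lifeExp_1952 = c(40.1, 55.3),
  lifeExp_2007 = c(60.2, 70.4)
)

long <- wide %>%
  # stack the four measurement columns into name/value pairs
  pivot_longer(
    cols = c(starts_with("pop"), starts_with("lifeExp")),
    names_to = "obstype_year",
    values_to = "obs_values"
  ) %>%
  # split names such as "pop_1952" into "pop" and "1952"
  separate(obstype_year, into = c("obs_type", "year"), sep = "_")

# separate() leaves year as text, so convert it afterwards
long$year <- as.integer(long$year)
long
```

Each country contributes one row per original measurement column, so the long table has 2 x 4 = 8 rows.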
Please enter any questions not answered during live session here:
1.
Doe, John / jdoe@ucsd.edu (example)
Nakamura, Masaki / m1nakamura@ucsd.edu
Phrasavath, Bonaly / bphrasav@ucsd.edu
Lascano, Dayra / dlascano@ucsd.edu
Ye, Yuan / yeyuan@ucsd.edu
Tianning , Lan / tilan@ucsd.edu
Tsuru, Yutaro / ytsuru@ucsd.edu
Chen, Wenxin / wec027@ucsd.edu
Coyle, Jackson / jacoyle@ucsd.edu
Tan, Jingyi/ j3tan@ucsd.edu
Groper, Danielle / dgroper@ucsd.edu
Artan, Warsan / wartan@ucsd.edu
Cuevas, Christopher/ccuevas@ucsd.edu
Xuran, Wang / xuw028@ucsd.edu
Yuxin Guo / yug041@ucsd.edu
Yamakawa, Daisuke / dayamakawa@ucsd.edu
Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
Chu, Jordan / j1chu@ucsd.edu
Ahluwalia, Gurkriti / gahluwalia@ucsd.edu
library("ggplot2")
ggplot(data= gapminder, aes(x= gdpPercap, y=lifeExp)) +
geom_point()
### challenge answer
ggplot(data= gapminder, aes(x= gdpPercap, y=lifeExp, color=continent)) +
geom_point()
ggplot(gapminder, aes(x = year, y=lifeExp, color=continent)) +
geom_line()
ggplot(gapminder, aes(x = year, y=lifeExp, color=continent, group = country)) +
geom_line()
ggplot(gapminder, aes(x = year, y=lifeExp, color=continent, group = country)) +
geom_line() + geom_point()
ggplot(gapminder, aes(x = year, y=lifeExp, group = country)) +
geom_line(aes(color=continent)) +
geom_point()
# challenge: switch geom_line() and geom_point()
# challenge answer
ggplot(gapminder, aes(x = year, y=lifeExp, group = country)) + geom_point() + geom_line(aes(color=continent))
ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp)) +
geom_point(alpha=0.5) +
scale_x_log10()
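ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(alpha = continent)) +
  scale_x_log10()
# you will see a warning (alpha mapped to a discrete variable is not advised), but the plot still works
# note: in ggplot2 3.4.0 and later, linewidth replaces size for lines
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) + # a fixed alpha goes outside aes()
  scale_x_log10() +
  geom_smooth(method = "lm", size = 1.5)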
# challenge answer
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5, color = "orange") + # fixed values go outside aes()
  scale_x_log10() +
  geom_smooth(method = "lm", size = 1.5)
# challenge answer
ggplot(data = gapminder, aes(x= gdpPercap, y=lifeExp, color=continent)) +
geom_point(aes(shape=continent)) +
scale_x_log10() +
geom_smooth(method = "lm", size = 1.5)
# multi-panel figures using facet_wrap
americas <- gapminder[gapminder$continent == "Americas",]
ggplot(americas, aes(x = year, y = lifeExp)) +
  geom_line() +
  facet_wrap(~country) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(x = "year", y = "life expectancy", title = "figure 1", color = "continent")
useful ggplot cheatsheet:
https://statsandr.com/blog/files/ggplot2-cheatsheet.pdf
## challenge
ggplot(gapminder, aes(x=continent, y=lifeExp, fill=continent)) + geom_boxplot() + facet_wrap(~year) +
ylab("life expectancy") +
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank())
Please enter any questions not answered during live session here:
1.
Doe, John / jdoe@ucsd.edu (example)
Pykurel, Himanshu / hpykurel@ucsd.edu
Lascano, Dayra / dlascano@ucsd.edu
Nogueda, Lesley / lnogueda@ucsd.edu
Nakamura, Masaki / m1nakamura@ucsd.edu
Phrasavath, Bonaly / bphrasav@ucsd.edu
Tsuru, Yutaro / ytsuru@ucsd.edu
Artan, Warsan / wartan@ucsd.edu
Groper, Danielle / dgroper@ucsd.edu
Cuevas, Christopher/ccuevas@ucsd.edu
Xuran, Wang / xuw028@ucsd.edu
Yuxin Guo / yug041@ucsd.edu
Coyle, Jackson / jacoyle@ucsd.edu
Chen, Wenxin / wec027@ucsd.edu
Yamakawa, Daisuke / dayamakawa@ucsd.edu
Leung, Taysia/ t3leung@ucsd.edu (Asynchronous Video)
Chu, Jordan / j1chu@ucsd.edu
Ahluwalia, Gurkriti / gahluwalia@ucsd.edu
download.file("https://raw.githubusercontent.com/datacarpentry/r-intro-geospatial/master/_episodes_rmd/data/gapminder_data.csv",
destfile = "data/gapminder_data.csv")
# make sure to have the "data" folder in your RStudio working directory
https://drive.google.com/file/d/15ei1LmUURIvbJIjS9J-Y6qlyRx1jrpFB/view?usp=share_link
library(tidyverse)
# read in gapminder data
#source("filename.R")
# running a group of functions stored in another .r file
source("programming-with-r.R")
# useful for importing data sets and creating a R pipeline
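A self-contained sketch of how source() behaves. It writes a throwaway script to a temporary file and then runs it. The script contents (c_to_f, boiling_f) are made up for illustration and are not the contents of programming-with-r.R.

```r
# Write a small helper script to a temporary file (hypothetical contents)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "c_to_f <- function(temp_c) temp_c * 9 / 5 + 32",
  "boiling_f <- c_to_f(100)"
), script_path)

# source() runs every line of the script in the current session,
# so the function and the variable it defines become available here
source(script_path)

c_to_f(0)
boiling_f
```

Keeping functions in a separate .R file and loading them with source() lets several notebooks or scripts share the same code.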
Create an R Notebook file:
File menu > New File > R Notebook
# filter the country to plot
gap_to_plot <- gapminder
est <- read_csv("data/countries_estimated.csv")
Please enter any questions not answered during live session here:
1.