kkt008
---
tags: ucsd-carpentries-archived
---

# UCSD Carpentries Bootcamp - R and Git (June 2022)

**Workshop Details**
Dates: June 13th - 16th, 2022
Days: Monday - Thursday
Time: 9am - 12pm

**Workshop Agenda:** https://kthoma2484.github.io/2022-06-13-UCSD/

**Software Installation:**
R Studio downloads: https://www.r-project.org/ - download the free version
Online/Cloud R Studio interface: https://rstudio.cloud/ *This is an online interface that can be used when unable to download R Studio*
Git software: https://git-scm.com/downloads

**Lesson Data (download)**
[Gapminder data](https://kthoma2484.github.io/2022-06-13-UCSD/data/gapminder-FiveYearData.csv) and [Feline-data](https://kthoma2484.github.io/2022-06-13-UCSD/data/feline-data.csv)

## NOTES:
A copy of the instructor live session notes will be made available to participants upon request at the end of the workshop.

## Questions after the workshop about working with R?
You can email UC San Diego Data Science Librarian [Stephanie Labou](mailto:slabou@ucsd.edu) or schedule a [Zoom consultation](https://calendly.com/slabou).

## Workshop Day 1 (10 attendees)
### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ----- | --------------- |
| Chris Day | UCSD | Bio | cdday@ucsd.edu |
| Skyler Zheng | UCSD | CogSci | x3zheng@ucsd.edu |
| Andrew Muroyama | UCSD | Biological Sciences | amuroyama@ucsd.edu |
| Anne Marie Berry | UCSD | Biomedical Sciences | amberry@ucsd.edu |
| Ariel Flores | UCSD | Chem E | a6flores@ucsd.edu |
| Rio Aguina-Kang | UCSD | Psych | raguinakang@ucsd.edu |
| Christopher Taylor | UCSD | Environmental Systems: EBE | cdtaylor@ucsd.edu |
| Kya Barounis | UCSD | Psychiatry | kfawleyking@health.ucsd.edu |
| Peter Huang | UCSD | Bio | phuang@ucsd.edu |
| Dina Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |

## Day 1 Questions:
Please enter any questions not answered during live session here:
1.

## Day 1 Live Notes
Intro to RStudio: RStudio IDE overview

### Running statements in Console
When a statement is not complete, the console will prompt you to finish it. For example, if you type:
```r=
1+100+
```
The console will wait for you to complete the statement. You can either finish the statement or hit the "esc" key to cancel it.

### Boolean operators
- `!=` -> Not equal
- `==` -> Equal (Note: not a single, but a DOUBLE equal sign!)
- `<` -> Less than
- `>` -> Greater than
- `>=` -> Greater than or equal to
- `<=` -> Less than or equal to

### Object assignment
In R it is preferred to use the assignment operator `<-` to link variable names to objects (you can still use `=` for assignment). For example:
```r=
a <- 1/40
```
Now the value `1/40` can be referred to as `a` from this point on. Assigning your output to a variable is helpful because you can then manipulate the variable and do further analysis.
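
A small sketch that combines assignment with the comparison operators listed above. The variable `b` is made up for this illustration and is not part of the live session:

```r
a <- 1/40      # assign the value 0.025 to the name a
b <- a * 2     # reuse a in a further calculation; b is 0.05

a == b         # FALSE: double equal sign tests for equality
a != b         # TRUE: a and b hold different values
b >= a         # TRUE: b is greater than or equal to a
```

Each comparison returns a logical value, which is what functions such as `filter()` rely on later in the workshop.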

### Naming convention
When naming your variables, there are a couple of rules:
- Cannot start with `_`
- Cannot start with numbers
- Can start with `.`; a variable whose name starts with `.` is hidden in the current environment, but you can still access it by typing its name
- Normally, variable names start with letters

### Vector
```r=
1:5
# this will return 1 2 3 4 5
```
We can do vectorized operations as well:
```r=
2^(1:5)
# returns 2 4 8 16 32
```

### Variable Management
To explore what objects are in your current environment, use the `ls()` function. If you want to see *all* variables, including the hidden ones, use `ls(all.names=T)`.

To remove variables, use the `rm()` function:
```r=
rm(a) # remove the object with the name "a"
rm(list = ls()) # remove all non-hidden variables
```

### Boolean Values
There are two boolean values in R: `TRUE` (or `T`) and `FALSE` (or `F`). Note that all letters are capitalized.

### Project Management
Go to File -> New Project -> Existing Directory / New Directory

Great resource on workflow and organizing folders for scientific computing written by the Carpentries: [Good enough practices in scientific computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510)

### Dataframe
File -> New Script -> R Script | Open a new R script

#### Create New Dataframe
Create a new dataframe from scratch:
```r=
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1))
```
This creates a dataframe that looks like:

| coat | weight | likes_string |
| -------- | -------- | -------- |
| calico | 2.1 | 1 |
| black | 5.0 | 0 |
| tabby | 3.2 | 1 |

#### View Dataframe
If you want to examine the data, go to the top right **environment** pane and click on the data to view the dataframe. Alternatively, you can call `View(cats)` to view the `cats` dataframe.
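
Besides `View()`, a dataframe can also be inspected from the console. The sketch below is an addition to the session notes; it rebuilds the `cats` dataframe so it runs on its own and uses only base R functions:

```r
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1))

str(cats)     # compact summary: one line per column with its type
nrow(cats)    # 3 rows
ncol(cats)    # 3 columns
names(cats)   # "coat" "weight" "likes_string"
```

These calls are handy in scripts, where clicking through the environment pane is not an option.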

#### Output Dataframe to Other Formats (csv, tsv, etc)
To save the `cats` dataframe to a CSV file:
```r=
write.csv(cats,                          # the object to output
          "data_output/feline-data.csv", # the system path to write to
          row.names = F)
```

**Tab-completion**: when you are typing a long system path to an existing file, pressing tab prompts the computer to complete the path for you. Tab completion works for saved variables as well.

#### Import data files
```r=
cats <- read.csv("data/feline-data.csv", stringsAsFactors = T)
```

If you are unfamiliar with an option in a function, you can put `?` before the function name to read its documentation. For instance, if you don't know what `row.names = F` does in the function `write.csv()`, simply run:
```r=
?write.csv # note here we do NOT include parentheses
```
The **help pane** in the bottom right corner will show you the documentation of the function.

Anything written after a `#` will not be executed by the computer. This is useful for commenting.

#### Column Access
Use `$` to access or modify components of an object. Specifically, for a dataframe, `$` is used to access columns.
```r=
cats$weight # Access column "weight" from "cats"
```
Modify all elements in a column and save the result to a new column in the dataframe:
```r=
cats$weight_minus2 <- cats$weight - 2
```
To coerce an object into another type, for example converting an object of type character to a numeric type:
```r=
char_vct <- c('0', '2', '4')
num_vct <- as.numeric(char_vct) # returns 0 2 4, as numbers

char_vct <- c('0', 'abc', '4')
num_vct <- as.numeric(char_vct) # returns 0 NA 4; 'abc' becomes NA (with a warning)
```

#### Rename columns
Rename the second column of the dataframe to 'weight_kg':

    names(cats)[2] <- "weight_kg"

#### Subset Columns
Select the columns coat and weight:
```r=
# if you renamed the column above, use "weight_kg" instead of "weight"
cats_subset <- cats[c("coat", "weight")]
```
Note that you must use the `c()` function, because the brackets expect a single object.
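
The column steps above can be run in sequence as sketched below. This is an illustrative addition that rebuilds `cats` from scratch so it is self-contained. Note that after the rename, the subset has to refer to `weight_kg` rather than `weight`:

```r
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1))

cats$weight_minus2 <- cats$weight - 2        # new column derived from an existing one
names(cats)[2] <- "weight_kg"                # rename the second column
cats_subset <- cats[c("coat", "weight_kg")]  # keep two columns, selected by name

names(cats_subset)                           # "coat" "weight_kg"
```

Selecting columns by name rather than by position keeps the code working even if the column order changes.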

### Object Types

| Type | Example |
| -------- | -------- |
| double | 3.14 |
| integer | 3L |
| complex | |
| logical | TRUE or FALSE |
| character | "cats" |

To ask for the object type, use `typeof()`. For example, `typeof(cats$weight)` asks R for the object type of the column `weight`.

**Exercise**
Start by making a vector with the numbers 1 through 26, multiply the vector by 2, and give the resulting vector the names A through Z.
```r=
x <- 1:26
x <- x * 2          # multiply every element by 2
names(x) <- LETTERS # name the elements A through Z
```

A **factor** in R is a categorical variable. R assigns an integer to each unique string and stores those integers in memory.

### End Day 1

## Workshop Day 2 (11 attendees)
### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ----- | --------------- |
| Rio Aguina-Kang | UCSD | Psychology | raguinakang@ucsd.edu |
| Skyler Zheng | UCSD | CogSci | x3zheng@ucsd.edu |
| Sina Ghaffarnejad | UCSD | Biology | sighaffa@ucsd.edu |
| Ariel Flores | UCSD | Chem E | a6flores@ucsd.edu |
| Andrew Muroyama | UCSD | Biology | amuroyama@ucsd.edu |
| Anne Marie Berry | UCSD | Biology | amberry@ucsd.edu |
| Chris Day | UCSD | Bio | cdday@ucsd.edu |
| Christopher Taylor | UCSD | ESYS:EBE | cdtaylor@ucsd.edu |
| Dina Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |
| Peter Huang | UCSD | Bio | phuang@ucsd.edu |
| Kya Barounis | UCSD | Psych | |

## Day 2 Questions:
Please enter any questions not answered during live session here:
1.

**Lesson Data (download)**
[Gapminder data](https://kthoma2484.github.io/2022-06-13-UCSD/data/gapminder-FiveYearData.csv) and [Feline-data](https://kthoma2484.github.io/2022-06-13-UCSD/data/feline-data.csv)

## Day 2 Live Notes
dplyr and tidyverse:

### Introduction to tidyverse
[tidyverse homepage](https://www.tidyverse.org)

Install tidyverse if you have not already done so:
```r=
install.packages("tidyverse")
```
Load the tidyverse library once it is installed:
```r=
library(tidyverse)
```
We will focus on *dplyr* and *tidyr*.

## *dplyr*
```r=
# install needs to be done only once
install.packages("tidyverse", dependencies = TRUE)
library(tidyverse)
# the output lists 'Attaching packages' and 'Conflicts' found when loading the libraries;
# some functions are masked, but this should be okay in general - it's primarily an FYI

gapminder <- read_csv("data/gapminder_data.csv")
rm(cats)

# Some common tidyverse functions are select(), filter(), group_by(), summarize(), and mutate()
# we will also look at %>% (pipe): it passes the object on its left to the function on its right
```

### select()
```r=
# select() lets you subset data by column (variable) name
smallr_gapminder_data <- gapminder %>%
  dplyr::select(year, country, gdpPercap)

# without the pipe, the data frame is the first argument
test1 <- dplyr::select(gapminder, year, country)
rm(test1)
```

### filter()
```r=
# filter() lets you keep specific rows based on certain criteria
# note the DOUBLE equal sign for the comparison
gapminder_europe <- gapminder %>%
  filter(continent == "Europe") %>%
  select(year, country, gdpPercap)

gapminder_europe2 <- gapminder %>%
  filter(continent == "Europe") %>%
  select(year, country, gdpPercap) %>%
  rename(gdp = gdpPercap)

# You can use ftable(gapminder$continent) directly in the console to get a view of the continent variable
```

### Challenge:
Write a single command (which can span multiple lines and includes pipes) that will produce a data frame that has the African values for lifeExp, country and year, but not for other continents. How many rows does your data frame have and why?
```r=
year_country_lifeExp_Africa <- gapminder %>%
  filter(continent == "Africa") %>%
  select(year, country, lifeExp)
```

### group_by()
```r=
# %in% keeps the rows whose continent matches any value in the vector
test2 <- gapminder %>%
  filter(continent %in% c("Africa", "Europe")) %>%
  select(year, continent, country, lifeExp)
```

### group_by() + summarize()
```r=
gdp_bycontinent <- gapminder %>%
  group_by(continent) %>%
  summarize(mean_gdp = mean(gdpPercap),
            sd_gdp = sd(gdpPercap),
            se_gdp = sd(gdpPercap) / sqrt(n()),
            count = n())
```

### Challenge
Calculate the average life expectancy per country. Which has the longest average life expectancy and which has the shortest average life expectancy?
```r=
# mean life expectancy per country (this step was not captured in the notes)
mean_lifeExp_bycountry <- gapminder %>%
  group_by(country) %>%
  summarize(mean_life_exp = mean(lifeExp))

# longest and shortest average life expectancy
challenge2r <- mean_lifeExp_bycountry %>%
  filter(mean_life_exp == max(mean_life_exp))
challenge1r <- mean_lifeExp_bycountry %>%
  filter(mean_life_exp == min(mean_life_exp))
```

### mutate()
```r=
gdp_per_billion <- gapminder %>%
  mutate(gdp_per_billion = gdpPercap * pop / 10^9)

# removing columns (variable names)
remove_pop_year <- gapminder %>%
  select(-c(pop, year))
```

## Introduction to *tidyr*
[tidyr homepage](https://tidyr.tidyverse.org)
[guide](https://swcarpentry.github.io/r-novice-gapminder/14-tidyr/index.html)
[link to gapminder data](https://drive.google.com/drive/folders/1Xz6CUK71n88UEbqn3OFCHUMEG-w7vudd?usp=sharing)

*tidyr* supersedes *reshape2* and *reshape*. *tidyr* is designed specifically for tidying data, not general reshaping (reshape2), or the general aggregation (reshape).

The goal of tidyr is to help you create tidy data. Tidy data are data where:
1. Every column is a variable.
2. Every row is an observation.
3. Every cell is a single value.

***tidyr*** functions fall into five main categories - we will focus on pivoting:
- "Pivoting" converts between long and wide forms. See pivot_longer() and pivot_wider(), and the vignette("pivot") for more details.
- "Rectangling" turns deeply nested lists (as from JSON) into tidy tibbles. See unnest_longer(), unnest_wider(), hoist(), and the vignette("rectangle") for more details.
- "Nesting" converts grouped data to a form where each group becomes a single row containing a nested data frame, and "unnesting" does the opposite. See nest(), unnest(), and the vignette("nest") for more details.
- Splitting and combining character columns. Use separate() and extract() to pull a single character column into multiple columns; use unite() to combine multiple columns into a single character column.
- Make implicit missing values explicit with complete(); make explicit missing values implicit with drop_na(); replace missing values with next/previous value with fill(), or a known value with replace_na().

### Linking *dplyr* to *ggplot2*
```r=
library(tidyr)
americas <- gapminder[gapminder$continent == "Americas", ]
ftable(americas$country)
levels(as.factor(americas$country))

# Make the plot
ggplot(data = americas, mapping = aes(x = year, y = lifeExp)) +
  geom_line() +
  facet_wrap(~country) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))
```

### Import wide-form gapminder data
#### Note: You will first need to obtain this file at the link below and add it to your project data folder:
[wide-form gapminder data](https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_wide.csv)
```r=
gap_wide <- read.csv("data/gapminder_wide.csv", stringsAsFactors = FALSE)
```

### Wide-form to long-form
```r=
## Note that this uses the piping notation, and similar to select() we use starts_with() to grab more than one column simultaneously
gap_long <- gap_wide %>%
  pivot_longer(
    cols = c(starts_with('pop'), starts_with('lifeExp'), starts_with('gdpPercap')),
    names_to = "obstype_year",
    values_to = "obs_values"
  )

# Check structure
str(gap_long)

# You can also use the '-' notation to exclude variables
## Note that this generates the same long-form data as the code above
gap_long <- gap_wide %>%
  pivot_longer(
    cols = c(-continent, -country),
    names_to = "obstype_year",
    values_to = "obs_values"
  )
str(gap_long)

# You can separate values in a column by a separator
## Note that in the example above obstype_year has two pieces of information in it
gap_long <- gap_long %>%
  separate(obstype_year, into = c('obs_type', 'year'), sep = "_")

# Convert year to an integer
gap_long$year <- as.integer(gap_long$year)
```

### Challenge 2
Using gap_long, calculate the mean life expectancy, population, and gdpPercap for each continent. Hint: use the group_by() and summarize() functions we learned in the dplyr lesson.
```r=
# challenge 2 answer
gap_long %>%
  group_by(continent, obs_type) %>%
  summarize(means = mean(obs_values))
```

### Go from long-form to the intermediate form of the raw data
```r=
gap_normal <- gap_long %>%
  pivot_wider(names_from = obs_type, values_from = obs_values)

# Check dimensions
dim(gap_normal)
dim(gapminder)

# Check column names, and their order
names(gap_normal)
names(gapminder)

# Re-order columns in new data to match original data
gap_normal <- gap_normal[, names(gapminder)]

# Check for similarity between two datasets
all.equal(gap_normal, gapminder)

## There are differences... let's see why...
head(gap_normal)
head(gapminder)

## Ah, there are differences in how the rows are sorted. We can fix this...
gap_normal <- gap_normal %>%
  arrange(country, year)

# Check again...
all.equal(gap_normal, gapminder)
## All good! Any remaining differences are due to tibble vs. data frame (I think)
```

### Going back to wide-format
```r=
# You can unite variables to make it easier to go to wide-form
gap_temp <- gap_long %>%
  unite(var_ID, continent, country, sep = "_")
str(gap_temp)

# You can use the pipe to unite more than one group of variables at a time
gap_temp <- gap_long %>%
  unite(ID_var, continent, country, sep = "_") %>%
  unite(var_names, obs_type, year, sep = "_")
str(gap_temp)

# You can now pipe to pivot_wider
gap_wide_new <- gap_long %>%
  unite(ID_var, continent, country, sep = "_") %>%
  unite(var_names, obs_type, year, sep = "_") %>%
  pivot_wider(names_from = var_names, values_from = obs_values)
str(gap_wide_new)

# Split ID_var into 2 columns
gap_wide_betterID <- gap_long %>%
  unite(ID_var, continent, country, sep = "_") %>%
  unite(var_names, obs_type, year, sep = "_") %>%
  pivot_wider(names_from = var_names, values_from = obs_values) %>%
  # separate() splits a column based on a separator; sep is its own argument
  separate(ID_var, c("continent", "country"), sep = "_")
str(gap_wide_betterID)

# Check against original data
all.equal(gap_wide, gap_wide_betterID)
```

### End Day 2

## Workshop Day 3
### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ----- | --------------- |
| Rio Aguina-Kang | UCSD | Psychology | raguinakang@ucsd.edu |
| Skyler Zheng | UCSD | CogSci | x3zheng@ucsd.edu |
| Peter Huang | UCSD | Bioinformatics | phuang@ucsd.edu |
| Dina Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |
| Christopher Taylor | UCSD | ESYS:EBE | cdtaylor@ucsd.edu |
| Andrew Muroyama | UCSD | Biology | amuroyama@ucsd.edu |
| Anne Marie Berry | UCSD | Bio | amberry@ucsd.edu |
| Chris Day | UCSD | Bio | cdday@ucsd.edu |
| Sina Ghaffarnejad | UCSD | Bio | sighaffa@ucsd.edu |

## Day 3 Notes
[Plotting with ggplot2](https://swcarpentry.github.io/r-novice-gapminder/08-plot-ggplot2/index.html)
[R graph gallery - ggplot2 examples](https://r-graph-gallery.com/ggplot2-package.html)
```r=
# Call library
library(tidyverse)
# could also load ggplot2 by itself using library(ggplot2)

# Simple scatter plot
# specify data to use, x variable, y variable
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point() # specify type of plot (here, scatterplot with points)

# Let's try a transformation and adjust the point transparency, for clarity
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) +
  # alpha is opacity - lower number = more transparent
  # can change size of points
  geom_point(alpha = 0.5, size = 0.8) +
  # transform x axis, log transform
  scale_x_log10() +
  # remove default gray background by using a pre-built theme
  theme_bw()

# Let's adjust labels
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5, size = 0.8) +
  scale_x_log10() +
  theme_bw() +
  # change size of text
  theme(axis.text.x = element_text(size = 5))
# Note that later theme settings override earlier ones
# so if a pre-built theme sets text size 10 but you want size 5, put theme(axis.text.x = ...) AFTER theme_bw()

# Add a trend line
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5, size = 0.8) +
  scale_x_log10() +
  # add linear trend line, set size of line
  # if you don't specify method, uses default method
  # can also set color
  geom_smooth(method = "lm", size = 0.1) +
  theme_bw() +
  theme(axis.text.x = element_text(size = 5))
```

### Challenge
Modify the size and color of the points in the previous example.
Hint: do not use the aes() function
```r=
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.3, size = 1, color = "red") +
  scale_x_log10() +
  geom_smooth(method = "lm", size = 0.2, color = "blue") +
  theme_bw() +
  theme(axis.text.x = element_text(size = 5))
```

## More on modifying plots
```r=
# Set color based on a variable, in this case continent
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.75, size = 0.5) +
  geom_smooth(method = "lm", color = "blue") +
  scale_x_log10() +
  theme_bw()

# Existing color palettes, example ColorBrewer
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.75, size = 0.5) +
  geom_smooth(method = "lm", color = "blue") +
  scale_x_log10() +
  theme_bw() +
  # set color palette
  # note: works with discrete variables
  scale_color_brewer(palette = "Dark2")

# If you want to manually set colors
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.75, size = 0.5) +
  geom_smooth(method = "lm", color = "blue") +
  scale_x_log10() +
  theme_bw() +
  scale_color_manual(values = c("red", "orange", "yellow", "green", "blue"))

# Multi-panel figures
americas <- gapminder[gapminder$continent == "Americas", ]
ggplot(data = americas, mapping = aes(x = year, y = lifeExp)) +
  # make a line plot
  geom_line() +
  # have a separate panel for each country
  facet_wrap(~country) +
  # rotate axis labels
  theme(axis.text.x = element_text(angle = 30))

# Adjust labels
ggplot(data = americas, mapping = aes(x = year, y = lifeExp)) +
  geom_line() +
  facet_wrap(~country) +
  # adjust labels for x axis, y axis, title of figure
  labs(x = "Year", y = "Life Expectancy", title = "Figure 1: Americas") +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        # adjust alignment of plot title
        plot.title = element_text(hjust = 0.5))
```

## Exporting plots
You can manually click 'save plot as image' and set the file name, format, and size.
Alternatively, assign the plot to an object and use ggsave():
```r=
americas_plot <- ggplot(data = americas, mapping = aes(x = year, y = lifeExp)) +
  geom_line() +
  facet_wrap(~country) +
  labs(x = "Year", y = "Life Expectancy", title = "Figure 1: Americas") +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        plot.title = element_text(hjust = 0.5))

# specify extension, size, units
ggsave(filename = "results/americas_panels.png", plot = americas_plot,
       width = 12, height = 10, dpi = 300, units = "cm")
```

## Long vs wide data in plotting
What if we wanted to compare a single variable between two years among all countries? Which format (wide vs long) would be easiest to use for a scatterplot?
```r=
# See levels for country
levels(as.factor(gapminder$country))

# use wide format data
ggplot(data = gap_wide, mapping = aes(x = pop_2007, y = pop_1952, color = country)) +
  geom_point() +
  geom_smooth(method = "lm") +
  # get rid of legend because it's very large
  # (has each country as own color)
  theme(legend.position = "none")
```
What if we wanted to compare the average of a single variable among continents?
```r=
# use long data, combine dplyr functions with ggplot functions
gap_long %>%
  filter(obs_type == "gdpPercap" & year == '2007') %>%
  ggplot(mapping = aes(x = continent, y = obs_values)) +
  # make the outlier points from the boxplots invisible
  geom_boxplot(outlier.alpha = 0) +
  # want scatterplot of points on top of boxplot to better see distribution
  # use jitter to offset the points slightly
  # can specify size, opacity, color, and width and height for offset
  geom_jitter(height = 0, width = 0.1, size = 0.5, alpha = 0.5, color = "blue") +
  # add a point denoting mean
  stat_summary(fun = mean, geom = "point", shape = 22, size = 2, color = "red", fill = "red") +
  # update naming for y axis label
  ylab("GDP per capita")
```
How could we compare the average of all variables among continents?
```r=
gap_long %>%
  filter(year == '2007') %>%
  ggplot(mapping = aes(x = continent, y = obs_values)) +
  geom_boxplot() +
  facet_wrap(~obs_type, scales = "free_y")
# Faceting defaults to giving every panel the same axis ranges
# Can change this using the scales option to specify free or just free_x or free_y

# To facet by two variables, use facet_grid() rather than facet_wrap()
gap_long %>%
  filter(year == '2007') %>%
  ggplot(mapping = aes(x = continent, y = obs_values)) +
  geom_boxplot() +
  facet_grid(continent ~ obs_type, scales = "free")
# The plot doesn't look very pretty, but you can see how to facet by two variables
```
How could we compare population over time for each country?
```r=
gap_long %>%
  filter(obs_type == "pop") %>%
  ggplot(mapping = aes(x = year, y = obs_values, group = country)) +
  geom_line(aes(color = country)) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        legend.position = "none") +
  scale_y_log10() +
  ylab("Population") +
  xlab("Year")
```

## Example: adding model outputs to plots
```r=
# the formula interface with data = is equivalent to using gapminder$lifeExp ~ gapminder$pop
mod1 <- lm(lifeExp ~ pop, data = gapminder)
summary(mod1)
mod1$coefficients
```

## More R plotting resources
Make plots interactive with [plotly](https://plotly.com/r/)
Make web app plots with [shiny](https://shiny.rstudio.com/)

## Day 3 Questions:
Please enter any questions not answered during live session here:
1.

## Pre-Day 4 Git Installation
* Install Git Bash via this website: https://git-scm.com/downloads
* Set up a GitHub account via https://github.com/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F&source=header-home (follow the interface instructions; we recommend using your personal email rather than your UCSD email)

### End Day 3

## Workshop Day 4
GitHub Git Cheat Sheet - https://education.github.com/git-cheat-sheet-education.pdf
### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ----- | --------------- |
| Andrew Muroyama | UCSD | Biology | amuroyama@ucsd.edu |
| Rio Aguina-Kang | UCSD | Psychology | raguinakang@ucsd.edu |
| Peter Huang | UCSD | Bio | phuang@ucsd.edu |
| Christopher Taylor | UCSD | ESYS:EBE | cdtaylor@ucsd.edu |
| Skyler Zheng | UCSD | CogSci | x3zheng@ucsd.edu |
| Anne Marie Berry | UCSD | Bio | amberry@ucsd.edu |
| Sina Ghaffarnejad | UCSD | Bio | sighaffa@ucsd.edu |

## Day 4 Questions:
Please enter any questions not answered during live session here:
1.

### End Day 4
