INBO CODING CLUB

# INBO CODING CLUB

24 January 2019

Welcome! Yes, very welcome.

## CHALLENGE 1

List the issues you encountered during tidying the data:

* A lot of "trash" before and after the data
* Missing data in columns "Sex" and "Weight", rows 3, 6, 16, 18
* "#####" in column "Weight", row 6
* "note" below the data (colors)

> [Ine: Hi, you need to take care with the species called "NA" and make sure that R does not interpret it as a missing value. I think you can do this by passing `na = ""` to `read_delim()`: the only real missing values are then the empty cells, not the cells containing the text "NA". This works.]

## CHALLENGE 2

List the issues you encountered during tidying the data:

* 2 tables in the same Excel sheet
* plot 1 has 4 variables, plot 2 only 3
* the weight column includes the unit in plot 2
* "species" and "sex" are merged into 1 column in plot 2
* Dates in the first table are read in as a factor variable
* Dates in plot 2 could be either M/D/Y or D/M/Y?
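One way to look into the last question (M/D/Y versus D/M/Y) is to parse the dates with both formats and count how many fail under each. The sketch below uses made-up date strings, not values from the survey file, and the names `dates`, `as_mdy` and `as_dmy` are purely illustrative.

```r
# Hypothetical date strings (NOT taken from the survey spreadsheet)
dates <- c("1/13/2014", "2/3/2014", "12/25/2014")

# Parse the same strings under both candidate formats
as_mdy <- as.Date(dates, format = "%m/%d/%Y")
as_dmy <- as.Date(dates, format = "%d/%m/%Y")

# A format that yields NA for some strings cannot be the right one for them
sum(is.na(as_mdy))  # 0: every string is a valid M/D/Y date
sum(is.na(as_dmy))  # 2: "1/13/2014" and "12/25/2014" have no month 13 or 25
```

Strings such as "2/3/2014" parse under both formats, so this check only rules a format out when at least one day value is larger than 12.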
Sander's code suggestion:

```r
library(tidyverse)
library(readxl)  # read_excel() is not attached by library(tidyverse)

X20190124_survey_part2 <- read_excel(
  "~/R_coding_club/data/20190124_survey_part2.xlsx",
  skip = 1
)

# Plot 1: keep its four columns, drop empty rows, parse the dates
plot1 <- X20190124_survey_part2 %>%
  select(`Date collected`, Species, Sex, Weight_in_gr = `Weight (g)`) %>%
  filter(!is.na(`Date collected`)) %>%
  mutate(plot_id = 1) %>%
  mutate(Date = as.Date(`Date collected`, format = "%m/%d/%Y")) %>%
  select(-`Date collected`)

# Plot 2: strip the unit from the weight and split species and sex
plot2 <- X20190124_survey_part2 %>%
  select(Date = `Date collected__1`, species_sex, wgt) %>%
  mutate(Weight_in_gr = as.numeric(gsub(pattern = "g", replacement = "", x = wgt))) %>%
  mutate(Species = substr(x = species_sex, 1, 2)) %>%
  mutate(Sex = substr(x = species_sex, 4, 4)) %>%
  mutate(plot_id = 2) %>%
  select(-wgt, -species_sex)

X20190124_survey_part2_tidy <- rbind(plot1, plot2)
```

## Intermezzo

You can try reading and working with the tidy version of the data yourself:

```r
library(readr)
library(tidyverse)

survey <- read_delim(
  "../data/20190124_survey_data_spreadsheet_tidy.csv",
  delim = ";"
)

# Median and mean weight per sex and species
test <- survey %>%
  group_by(sex, species) %>%
  summarise(median_crap = median(weight_in_g, na.rm = TRUE),
            mean_crap = mean(weight_in_g, na.rm = TRUE)) %>%
  ungroup()
```

## CHALLENGE 3

### Share your code snippet

If you want to share your code snippet, copy and paste it within a section of three backticks (```):

As an **example**:

```r
library(tidyverse)
```

(*you can copy and paste this example and add your code further down, but do not fill in your code in this section*)

Sander's code:

```r
library(tidyverse)

# Gather the three OD columns (columns 4 to 6), then strip "OD_" and "h"
main_experiment_tidy <- main_experiment %>%
  gather(key = "Experiment", value = "Optical_density", 4:6) %>%
  mutate(Experiment = gsub(pattern = "OD_", replacement = "", x = Experiment)) %>%
  mutate(Experiment = gsub(pattern = "h", replacement = "", x = Experiment))
```

Stien and Marijke:

```r
gather(X20190124_dryad_arias_hall_v3, "OT", "OD", 4:6)
```

Joost:

```r
library(tidyverse)

dataset <- read_delim("../data/20190124_dryad_arias_hall_v3.csv", delim = ",")

# !!! add a row ID to each input row
# combinations of AB_r, bacterial_genotype & phage_t are not unique
# applying gather will lose the information of row identity
# if the row ID is not added explicitly
dataset <- dataset %>%
  mutate(ID = 1:nrow(.))

# gather, also taking into account that survival and phage_r
# only have data for hour = 72
clean_data <- dataset %>%
  select(-Survival_72h, -PhageR_72h) %>%
  gather("hour", "OD", OD_0h, OD_20h, OD_72h) %>%
  mutate(hour = str_replace(hour, "OD_", "")) %>%
  left_join(dataset %>%
              select(-OD_0h, -OD_20h, -OD_72h, -PhageR_72h) %>%
              gather("hour", "Survival", Survival_72h) %>%
              mutate(hour = str_replace(hour, "Survival_", ""))) %>%
  left_join(dataset %>%
              select(-OD_0h, -OD_20h, -OD_72h, -Survival_72h) %>%
              gather("hour", "PhageR", PhageR_72h) %>%
              mutate(hour = str_replace(hour, "PhageR_", "")))
```

Jeroen's little cleanup into a numeric variable:

```r
d2$OD <- d2$OD %>%
  str_replace("OD_", "") %>%
  str_replace("h", "") %>%
  as.numeric()
```

Remark from Frank VDM: explain the structure of projects, why to use projects instead of loose files, how to structure the parts of a project, etc.

Figure illustrating gather: https://datacarpentry.org/R-ecology-lesson/img/gather_data_R.png

A solution to further clean the column `experiment_time_h` of the data.frame `main_experiment_tidy`:

```r
# remove "OD_" before hour and "h" after
main_experiment_tidy_cleaned <- main_experiment_tidy %>%
  mutate(experiment_time_h = str_remove(experiment_time_h, pattern = "OD_")) %>%
  mutate(experiment_time_h = str_remove(experiment_time_h, pattern = "h"))

# convert hour from character to integer
main_experiment_tidy_cleaned$experiment_time_h <-
  as.integer(main_experiment_tidy_cleaned$experiment_time_h)

# check
distinct(main_experiment_tidy_cleaned, experiment_time_h)
```
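The steps used in the snippets above (gather the OD columns, strip the `OD_` prefix and the `h` suffix, convert to a number) can also be tried on a small toy table without the challenge file. This is only a sketch: the values are made up, `toy` and `toy_tidy` are hypothetical names, and only the column names `OD_0h`, `OD_20h` and `OD_72h` come from the challenge data.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy data with made-up values; only the OD column names mimic the real file
toy <- data.frame(
  ID = 1:2,
  OD_0h = c(0.10, 0.12),
  OD_20h = c(0.45, 0.50),
  OD_72h = c(0.80, 0.85)
)

toy_tidy <- toy %>%
  # wide to long: one row per ID and time point
  gather(key = "experiment_time_h", value = "optical_density",
         OD_0h, OD_20h, OD_72h) %>%
  # drop "OD_" and "h" in one step, then convert the text to integer
  mutate(experiment_time_h = as.integer(str_remove_all(experiment_time_h, "OD_|h")))

# check the distinct time points
distinct(toy_tidy, experiment_time_h)
```

Keeping the `ID` column in the toy table follows Joost's remark that row identity is lost after gathering unless an ID is added first.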
