owned this note
owned this note
Published
Linked with GitHub
# INBO CODING CLUB
30 March 2021
Welcome!
## Share your code snippet
If you want to share your code snippet, copy paste your snippet within a section of three backticks (```):
As an **example**:
```
library(tidyverse)
```
(*you can copy paste this example and add your code further down*)
## Yellow sticky notes
No yellow sticky notes online. Put your name + " | " and add a "*" each time you solve a challenge (see below).
## Participants
Name | Challenges
--- | ---
Damiano Oldoni |
Luc DB | **
Lynn Pallemaerts |***
Dirk Maes | **
Emma Cartuyvels|**
Anja Leyman |***
Adriaan Seynaeve |**
Hans Van Calster |***
Joost Vanoverbeke |
Tom De Dobbelaer |**
Amber Mertens|**
Lisse Goris|**
Leen Govaere |***
Maxime Sweetlove | **
Raïsa carmen | ***
Matthieu Chastel |
## Challenge 1
### Damiano' solution:
1. plot 1 and 2 are apart
2. ...
### Tom's solution:
```
1. create equal headers (date, species, sex, weight)
2. move units to header
3. combine two plots by adding "plot" column
4. add additional column for calibration comment
5. choose one value for NA data
```
### Lisse's solution:
```
1. Modify column: split species_sex column of second data frame into species and sex column.
2. Add new column 'plot' with values "1" and "2"
3. Merge dataframes.
4. Not sure what to do with NA rows.
```
### Lynn's solution:
1. merge 2 tables to 1 with extra column for plot number
2. standardise date format
3. split species & sex from plot 2
4. remove unit from weights in plot 2
5. add comments column for the calibration problem
6. fill empty weight cells with NA
### Amber's solution:
Issues: which unit of date
1. Columns: OID, date, date unit, species, sex, weight, weight unit, calibration, plot
2. Add NaN for empty weight cells
3. split species, sex for second plot
4. date and weight unit in metadata?
### Luc's solution
Tell the person to start over again and type in the data correctly :)
## Challenge 2
### Emma
The manta data should be seperate from the sample, evironmental and diver data since this has to be repeated for each individual manta observed. Connect by using sample_ID in manta table as well.
Remove /
### Lynn
1. add metadata sheet for all the comments in the headers
2. separate table to sheets for different groups of variables
3. replace Y/N with T/F
4. avoid using uncommon abbreviations? M/F is fine, but the J/M/U and location is less obvious. Add it as a factor, all typed out, makes it easier for people who've never seen this data before to interpret it correctly immediately
### Lisse
1. remove "/"
2. you can make multiple seperate datasets (one for environmentals, one for manta data, etc.), first column is always Sample_ID
### Anja
- opsplitsen in meerdere tabellen
- MDY opsplitsen
- /' verwijderen (leeg veld)
- infotips in metadata
- ev.met lookuplijsten werken voor oa. Tide ( maar 4 vaste waarden mogelijk)
- Focal_N: '/' vervangen door '0': niet opgevolgd?
- Seks en Maturity: kan ook U zijn, maar komt niet voor => '/' vervangen door 'U' (als dat de betekenis van / is)
- naamgeving kolommen: unit erin verwerken
### Amber
Separate tables:
- Observations/samples: Sample_ID, DATE_YMD, Location, Movie, Manta_ID, Comments
- Environmentals: DateYMD, Location, Cloud_cover, Sea_state, Temperature, Tide
- Divers: Date_YMD, Location, Proximity, Divers_NT, Divers_N
- Mantas: Manta_ID, Focal_N, Sex, Maturity, Pregnant, Injury, Males_T
Observations/samples can be linked to evironmental and diver data through comination of date_location, and linked to manta data through Manta_ID
Metadata sheet for explanation of the columns
Remove slashes
## INTERMEZZO
## Challenge 3
Emma:
```
camtrap_data <- read_csv("./data/20210330/20210330_camtrap_data.csv",
col_names = paste0("V",seq_len(21)))
camtrap_metadata <- read.csv("./data/20210330/20210330_camtrap_metadata.csv")
camtrap_data_deployment <- camtrap_data %>%
filter(V1 == "deployment") %>%
select(-V1)
names <- camtrap_metadata %>%
filter(record_type == "deployment") %>%
select(column_name) %>%
pull()
colnames(camtrap_data_deployment) <- as.character(names)
```
Hans:
````
extract_tidydata <- function(which_record) {
cols <- camtrap_meta %>%
filter(record_type == which_record) %>%
pull(column_name)
tidy_data <- camtrap_data %>%
filter(V1 == which_record) %>%
select(2:(length(cols) + 1))
colnames(tidy_data) <- cols
return(tidy_data)
}
types <- unique(camtrap_meta$record_type)
extract_tidydata(types[1])
extract_tidydata(types[2])
extract_tidydata(types[3])
````
Anja
````
# starten met er drie tabellen van te maken
deployment <- camtrap_data %>%
filter(V1 == "deployment")
multimedia <- camtrap_data %>%
filter(V1 == "multimedia")
observation <- camtrap_data %>%
filter(V1 == "observation")
# metadata
deployment_metadata <- camtrap_metadata %>%
filter(record_type == "deployment")
ncol_deployment <- nrow(deployment_metadata)
colnames_D <- deployment_metadata$column_name
multimedia_metadata <- camtrap_metadata %>%
filter(record_type == "multimedia")
ncol_multimedia <- nrow(multimedia_metadata)
colnames_M <- multimedia_metadata$column_name
observation_metadata <- camtrap_metadata %>%
filter(record_type == "observation")
ncol_observation <- nrow(observation_metadata)
colnames_O <- observation_metadata$column_name
# metadata op data toepassen
# rename_at(vars(oldnames), ~ newnames)
oldnames_M <- colnames(multimedia)
ncol_multimedia
multimedia2 <- multimedia %>%
rename_at(vars(oldnames_M), ~ colnames_M) %>%
select(-starts_with("V"))
````
#Without reading metadata
```
data <- as.data.frame(read_csv("data/20210330_camtrap_data.csv", col_names = F))
data_deployment <- data[data$X1 =='deployment',2:17] %>%
setNames(c("deployment_id","location_id","location_name","longitude","latitude","start","end","setup_by","camera_id","camera_model","camera_interval","camera_height","bait_use","feature_type","tags","comments"))%>%
mutate(longitude = as.numeric(longitude),
latitude = as.numeric(latitude),
start = as.POSIXlt(str_remove_all(start,"[Tz]")),
end = as.POSIXlt(str_remove_all(end,"[Tz]")),
camera_height = as.numeric(camera_height))
data_multimedia <- data[data$X1 =='multimedia',2:9] %>%
setNames(c("multimedia_id","deployment_id","sequence_id","timestamp","file_path","file_name","file_mediatype","comments")) %>%
mutate(
multimedia_id = as.factor(multimedia_id),
deployment_id= as.factor(deployment_id),
sequence_id = as.factor(sequence_id),
timestamp = as.POSIXlt(str_remove_all(timestamp,"[Tz]")),
file_mediatype = as.factor(file_mediatype)
)
data_observation <- as.data.frame(read_csv("data/20210330_camtrap_data.csv", col_names = F, skip = 21)) %>%
setNames(c("type","observation_id","deployment_id","sequence_id","multimedia_id","timestamp","observation_type","sensor_method","camera_setup","scientific_name","vernacular_name","count","age","sex","behaviour","individual_id","classification_method","classified_by","classification_timestamp","classification_confidence","comments"))%>%
mutate(
observation_id = as.factor(observation_id),
multimedia_id = as.factor(multimedia_id),
deployment_id= as.factor(deployment_id),
sequence_id = as.factor(sequence_id),
timestamp = as.POSIXlt(str_remove_all(timestamp,"[Tz]")),
observation_type = as.factor(observation_type),
sensor_method = as.factor(sensor_method),
sex = as.factor(sex),
classification_method = as.factor(classification_method),
classification_timestamp = as.POSIXlt(str_remove_all(classification_timestamp,"[Tz]")),
comments= as.character(comments)
)
```
joost:
camtrap_metadata_multimedia <-
camtrap_metadata %>%
filter(record_type == "multimedia")
camtrap_multimedia <-
camtrap_data %>%
filter(V1 == "multimedia") %>%
select(-V1)
camtrap_multimedia <-
camtrap_multimedia %>%
select(-((nrow(camtrap_metadata_multimedia)+1):ncol(camtrap_multimedia)))
names(camtrap_multimedia) <-
camtrap_metadata_multimedia$column_name
Lynn
```
deployment <- filter(camtrap_data, V1=="deployment")
deployment <- deployment %>%
select(-V1,-V18,-V19,-V20,-V21)
colnames(deployment) <- c("deployment_id","location_id","location_name","longitude","latitude","start","end","setup_by","camera_id","camera_model","camera_interval","camera_height","bait_use","feature_type","tags","comments")
multimedia <- filter(camtrap_data, V1=="multimedia")
multimedia <- multimedia %>%
select(-V1,-V10,-V11,-V12,-V13,-V14,-V15,-V16,-V17,-V18,-V19,-V20,-V21)
colnames(multimedia) <- c("multimedia_id","deployment_id","sequence_id","timestamp","file_path","file_name","file_mediatype","comments")
observations <- filter(camtrap_data, V1=="observation")
observations <- observations %>%
select(-V1)
colnames(observations) <- c("observation_id","deployment_id","sequence_id","multimedia_id","timestamp","observation_type","sensor_method","camera_setup","scientific_name","vernacular_name","count","age","sex","behaviour","individual_id","classification_method","classified_by","classification_timestamp","classification_confidence","comments")
```
## Bonus challenge
Hans:
```
rings %>%
# add bird id
mutate(bird_id = row_number()) %>%
# pivot longer
select(-inscription) %>%
pivot_longer(cols = c(first_inscription, last_inscription),
names_to = "inscription_order",
values_to = "inscription") %>%
# turn into number
mutate(inscription_order = ifelse(inscription_order == "first_inscription",
1,
2)) %>%
#remove rows where last inscription is equal to first
distinct(euring_code, bird_id, inscription)
```
Lynn
```
rings_p <- rings %>%
rename(bird_ID = inscription) %>%
mutate(bird_ID = c(1:50)) %>%
pivot_longer(cols=c("first_inscription", "last_inscription")) %>%
mutate(name = case_when(name == "first_inscription" ~ 1,
name == "last_inscription" ~ 2)) %>%
rename(n_ring = name) %>%
rename(ring_ID = value)
```