# INBO CODING CLUB
26 April, 2018
Welcome
## Intro: What have I done?
```r
library(wateRinfo)
library(dplyr)
library(ggplot2)

# stations measuring air pressure whose station number contains "03"
stations <- get_stations("air_pressure") %>%
  filter(stringr::str_detect(station_no, "03"))
# download the last day of data for each time series
air_pressure <- stations %>%
  group_by(ts_id) %>%
  do(get_timeseries_tsid(.$ts_id, period = "P1D",
                         to = lubridate::today())) %>%
  ungroup() %>%
  left_join(stations, by = "ts_id")
# one panel per station and parameter
air_pressure %>%
  ggplot(aes(x = Timestamp, y = Value)) +
  geom_point() + xlab(format(lubridate::today() - 1, format = "%B %d %Y")) +
  facet_wrap(c("station_name", "stationparameter_name")) +
  scale_x_datetime(date_labels = "%H:%M",
                   date_breaks = "6 hours")
```
This retrieves yesterday's data from Waterinfo.be with the wateRinfo package (https://inbo.github.io/wateRinfo/).
## Share your code snippet
If you want to share your code snippet, copy paste your snippet within a section of three backticks (```):
As an **example**:
```r
library(tidyverse)
...
```
(*you can copy paste this example and add your code further down, but do not fill in your code in this section*)
Your snippets:
### c(...,...) and | do the same? NOT!
```r
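visdata <- read.csv(file = "Copy of 20180426_visdata_cleaned.csv", sep = ",")
# a pattern vector is compared element-wise with the rows, not as "any of these"
soorten1 <- visdata %>%
  filter(str_detect(soort, c("garnaal", "krab", "kreeft")))
# a single regular expression with | matches any of the alternatives
soorten2 <- visdata %>%
  filter(str_detect(soort, "garnaal|krab|kreeft"))
# rows of soorten2 that are missing from soorten1
anti_join(soorten2, soorten1)
# rows of soorten1 that are missing from soorten2
anti_join(soorten1, soorten2)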
```
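`anti_join()` is a handy way to check the difference between the two results: it returns the rows of the first table that have no match in the second.
Building the pattern with `str_c(c("garnaal", "krab", "kreeft"), collapse = "|")` gives the same result as writing the `|` expression by hand: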
```r
soorten1 <- visdata %>%
  filter(str_detect(soort, str_c(c("garnaal", "krab", "kreeft"), collapse = "|")))
soorten2 <- visdata %>%
  filter(str_detect(soort, "garnaal|krab|kreeft"))
anti_join(soorten2, soorten1)
anti_join(soorten1, soorten2)
```
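Why the `c(...)` version differs: `str_detect()` is vectorised over both the string and the pattern, so a pattern vector is compared pairwise with the strings instead of being read as "any of these". A minimal sketch with made-up species names (the vector `soort` below is illustrative and not taken from `visdata`). It is kept at the same length as the pattern vector, because recent stringr versions refuse to recycle a pattern vector of a different length:

```r
library(stringr)

# hypothetical toy data: three species names, same length as the pattern vector
soort <- c("krab", "garnaal", "kreeft")
patterns <- c("garnaal", "krab", "kreeft")

# pairwise: "krab" vs "garnaal", "garnaal" vs "krab", "kreeft" vs "kreeft"
pairwise <- str_detect(soort, patterns)

# one regular expression: every name is checked against all alternatives
any_of <- str_detect(soort, str_c(patterns, collapse = "|"))

print(pairwise)
print(any_of)
```

Only the last name matches in the pairwise version, while the `|` expression matches all three.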
### the glue::glue() usage
The default date print format:
```r
soorten2 <- visdata %>%
  mutate(meetpuntomschrijving = str_to_lower(meetpuntomschrijving)) %>%
  filter(str_detect(soort, "garnaal|krab|kreeft")) %>%
  mutate(description =
           glue::glue("{soort} bij {meetpuntomschrijving} op {datum}"))
```
Defining a custom date print format (see e.g. https://www.statmethods.net/input/dates.html for the meaning of the `%x` symbols):
```r
soorten2 <- visdata %>%
  mutate(meetpuntomschrijving = str_to_lower(meetpuntomschrijving)) %>%
  filter(str_detect(soort, "garnaal|krab|kreeft")) %>%
  mutate(description =
           glue::glue("{soort} bij {meetpuntomschrijving} op {format(datum, '%A, %B %d, %Y')}"))
```
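To see the two date formats side by side without the fish data, here is a small sketch on a single fixed date. The variable names and the sentence text are made up for illustration. It uses only numeric format codes, since `%A` and `%B` print day and month names in the language of the current locale:

```r
library(glue)

# a fixed example date (the date of this coding club session)
datum <- as.Date("2018-04-26")

# default print format of a Date: year-month-day
default_text <- glue("caught on {datum}")

# custom format: %d = day, %m = month number, %Y = four-digit year
custom_text <- glue("caught on {format(datum, '%d/%m/%Y')}")

print(default_text)
print(custom_text)
```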
### which day of the week? solution
Setting `label = TRUE` returns the name of the day instead of a number, and `locale` sets the language of that name:
```r
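library(lubridate)
my_date <- "August 2nd, 2018 14:00"
# locale names depend on the operating system: "English" works on Windows,
# on Linux or macOS use e.g. "en_US.UTF-8"
wday(mdy_hm(my_date), label = TRUE,
     locale = "English")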
```
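The number returned without `label` depends on which day the week starts. A short sketch, assuming a lubridate version that supports the `week_start` argument and using an ISO date string to stay independent of the locale:

```r
library(lubridate)

# August 2nd, 2018 was a Thursday
my_date <- ymd("2018-08-02")

# week starting on Sunday (Sunday = 1), so Thursday = 5
day_sunday_first <- wday(my_date, week_start = 7)

# week starting on Monday (Monday = 1), so Thursday = 4
day_monday_first <- wday(my_date, week_start = 1)

# label = TRUE returns an ordered factor with the day name instead of a number
day_label <- wday(my_date, label = TRUE)
```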
### check your conflicts in the namespace
When several attached packages export a function with the same name, the package attached last masks the others, which can lead to confusing errors. To list the names that clash:
```r
conflicts()
```
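A well-known clash is `filter()`, which exists in both stats and dplyr. The sketch below checks for it and shows the usual remedy, an explicit `package::function` prefix. It uses the built-in `mtcars` data set purely as an example:

```r
library(dplyr)

# names that exist in more than one attached package or environment
masked <- conflicts()

# dplyr::filter() masks stats::filter(), so "filter" is in the list
filter_is_masked <- "filter" %in% masked

# an explicit namespace prefix removes any ambiguity
four_cyl <- dplyr::filter(mtcars, cyl == 4)
```

`conflicts(detail = TRUE)` additionally reports in which package or environment each clashing name was found.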
## Read surveys file and add Date field; solution
Read the data with `surveys <- read_csv("data/20180222_surveys.csv")`, then compare the base R assignment:
```r
surveys$date <- dmy(str_c(surveys$day, surveys$month, surveys$year, sep = "-"))
```
versus:
```r
surveys %>%
  mutate(date = dmy(str_c(day, month, year, sep = "-")))
```
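The same idea on a small made-up table, so it can be run without the surveys file (`surveys_toy` and its values are hypothetical). It also shows `lubridate::make_date()` as an alternative that skips the intermediate string:

```r
library(dplyr)
library(stringr)
library(lubridate)

# hypothetical stand-in for the surveys table: one row per observation
surveys_toy <- tibble(day = c(16, 1), month = c(7, 12), year = c(1977, 2002))

surveys_toy <- surveys_toy %>%
  mutate(
    # paste the parts into e.g. "16-7-1977" and let dmy() parse the string
    date = dmy(str_c(day, month, year, sep = "-")),
    # alternative: build the date directly from the numeric parts
    date_alt = make_date(year, month, day)
  )

print(surveys_toy)
```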
### Remark: `separate` is a tidyr function
```r
fish %>%
  separate(meetpuntomschrijving, into = c("place_1", "place_2"), sep = " ")
```
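When a description contains more than one space, `separate()` by default drops the extra pieces with a warning. A small sketch with made-up location names (`fish_toy` is hypothetical) that uses `extra = "merge"` to keep the remainder in the last column:

```r
library(dplyr)
library(tidyr)

# hypothetical descriptions: one with a single space, one with two spaces
fish_toy <- tibble(meetpuntomschrijving = c("Schelde Antwerpen", "Zeeschelde bij Doel"))

fish_split <- fish_toy %>%
  separate(meetpuntomschrijving, into = c("place_1", "place_2"),
           sep = " ", extra = "merge")

print(fish_split)
```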
### About `lubridate::pretty_dates` and `ggplot`
```r
ggplot(daily_counts, aes(x = day, y = n, group = 1)) +
  geom_line(stat = "identity") +
  scale_x_datetime(breaks = lubridate::pretty_dates(daily_counts$day, n = 5)) +
  ylab("visitors") +
  xlab("")
```
### User stats grofwild; solution
```r
grofwild <- read_delim(file = "../data/20180316_grofwild_logs.csv", delim = " ")
grofwild %>%
  filter(type == "AppStart") %>%
  mutate(hours = hour(time)) %>%
  count(hours) %>%
  # add the hours without any app start, with a count of 0
  complete(hours = 0:23, fill = list(n = 0)) %>%
  ggplot() +
  geom_bar(aes(x = hours, y = n), stat = "identity") +
  scale_x_continuous("hour of the day", breaks = seq(0, 23, 2)) +
  ylab("number of visitors")
```
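The `complete()` step is what keeps hours without any app start in the plot. A minimal sketch on a made-up log (`starts` is hypothetical) shows the effect on the counts:

```r
library(dplyr)
library(tidyr)

# hypothetical log: app starts at 8h (twice) and at 10h
starts <- tibble(hours = c(8L, 8L, 10L))

hourly <- starts %>%
  count(hours) %>%
  # one row per hour of the day; hours without a start get n = 0
  complete(hours = 0:23, fill = list(n = 0L))

print(hourly)
```

Without `complete()` the table would hold only two rows, and the bar chart would silently skip the empty hours.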
## Reminder about data import:
1. old-school base R (better avoided)
```r
gent <- read.csv(...)
```
2. comma-separated values
```r
gent <- readr::read_csv("../data/20180222_survey_data_spreadsheet_tidy.csv")
```
3. semicolon-separated values (with a comma as decimal mark)
```r
gent <- readr::read_csv2("../data/20180123_gent_groeiperwijk.csv")
```
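To see the difference between the two readr functions without the course files, the sketch below writes two tiny temporary files (the column names and values are made up) and reads each with the matching function:

```r
library(readr)

# hypothetical two-line files written to temporary locations
comma_file <- tempfile(fileext = ".csv")
writeLines(c("wijk,groei", "Centrum,1.5"), comma_file)

semicolon_file <- tempfile(fileext = ".csv")
writeLines(c("wijk;groei", "Centrum;1,5"), semicolon_file)

# read_csv(): comma as separator, point as decimal mark
from_comma <- read_csv(comma_file)

# read_csv2(): semicolon as separator, comma as decimal mark
from_semicolon <- read_csv2(semicolon_file)
```

Both calls return the same numeric value for `groei`, even though the decimal mark differs between the files.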