---
# System prepended metadata

title: INBO CODING CLUB

---

# INBO CODING CLUB

31 March 2026


Welcome!

## Share your code snippet

If you want to share a code snippet, paste it between two lines of three backticks (```):

As an **example**:

```
library(tidyverse)
```
(*you can copy paste this example and add your code further down*)

## Yellow sticky notes

No yellow sticky notes online. Put your name + " | " and add a "*" each time you solve a challenge (see below).

## Participants

Name | Challenges
--- | ---
Damiano Oldoni | ***
Pieter Huybrechts | **
Rhea Maesele | 
Hans Van Calster |
Emma Cartuyvels |
Peter Desmet |
Charlotte Van Driessche |**
Sanne Govaert | **
Falk Mielke | **
Mieke Verbeeck |
Larissa Bonifacio |
Sebastiaan Verbesselt |


### general tips

*Peter's tips:*

- Start headings at h2 (`## Setup`)
- Avoid hard line breaks in paragraphs
- Use base pipes `|>`
- Use `file.path()` to create links:
    ```markdown
    [`{r} unique(abv_cube$family)`](`{r} file.path("https://www.gbif.org/species", unique(abv_cube$familyKey))`)
    ```
- Use `#| output: false` to hide output:
    ```r
    #| output: false
    vl_grid <- st_read(
      dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
    )
    ```

*Falk's additions:*

- you can use many different programming languages and even mix with shared objects
    - e.g. mixing ` {python}` and ` {r}`...
    - but also ` {sql}` if you pass a database connection with `--| connection: con_flights`
    - Note that Quarto control parameters such as `#| eval: false` use the language-specific comment symbol.
- as mentioned on chat, you can render to a single self-contained HTML file (images etc. are embedded, for easy sharing; [*cf.* docs](https://quarto.org/docs/output-formats/html-basics.html#self-contained)) by setting `embed-resources: true` under `format: html:`
- There is the option to [render to Word documents](https://quarto.org/docs/reference/formats/docx.html) (e.g. if that is a requested sub, but in 
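
Written out as a document header, the `embed-resources` tip might look like the sketch below. The title is a placeholder (an assumption, not from the session); only the `embed-resources: true` setting comes from the tip itself.

```yaml
---
title: "My report"          # hypothetical title, for illustration only
format:
  html:
    embed-resources: true   # bundle images, CSS and JS into the single HTML file
---
```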

## Challenge 1


### Damiano's solution (example)

Copy paste this section to show your solutions.

```r
# dummy code
print("This is how to insert code.")
```

### Emma's solution
````{markdown}
---
title: "Read and visualize ABV occurrence data"
format: 
  html:
    df-print: paged
editor: source
date: "`r Sys.Date()`"
date-format: "D MMMM, YYYY"
---

# Setup

Load libraries:

```{r}
#| message: false
#| warning: false

library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps
```

# Introduction

In this document we will:

1.  read occurrence cube data
2.  explore data
3.  preprocess data
4.  visualize data

# Read data

Read **ABV** data from the occurrence cube file `20260331_abv_cube.csv`:
```{r}
#| warning: false
abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)
```

Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:
```{r}
#| warning: false
vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```

# Explore data

This dataset contains data from `r min(abv_cube$year)` to `r max(abv_cube$year)` related to `r length(unique(abv_cube$specieskey))` species from the [`r unique(abv_cube$family)`](`r paste0("https://www.gbif.org/species/", unique(abv_cube$familyKey))`) family and their distribution in Flanders based on a grid of 1 km x 1 km.

Preview of the first 30 rows of the dataset:
```{r}
head(abv_cube, n = 30)
```

## Taxonomic information

Species present in the dataset:
```{r}
abv_cube %>% distinct(specieskey, species)
```

## Temporal information

The data are temporally defined at year level. Years present:
```{r}
abv_cube %>% dplyr::distinct(year) %>% arrange(year)
```

## Geographical information

The geographical information is represented by the `mgrscode` column, which contains the identifiers of the grid cells containing at least one occurrence of the species.

The dataset contains `r length(unique(abv_cube$mgrscode))` unique grid cells.

# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which contains the identifiers of the grid cells containing at least one occurrence of the species.

```{r}
cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")
```

# Final (spatial) dataset:

```{r}
sf_abv_cube %>% head(n = 30)
```
````

### Sebastiaan's solution

````
# CHALLENGE 1

Convert the code below to a Quarto (qmd) document called `visualize_n_occs.qmd` and make an html version of it.

# Setup
```{r}
#| output: false
#| label: Load libraries
library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps
```

# Introduction

In this document we will:

 1. read occurrence cube data
 2. explore data
 3. preprocess data
 4. visualize data


# Read data

Read **ABV** data from the occurrence cube file `20260331_abv_cube.csv`:
```{r}
#| label: read data
abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)

# Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:
vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```

# Explore data

This dataset contains data from `{r} min(abv_cube$year)` to `{r} max(abv_cube$year)` related to `{r} length(unique(abv_cube$specieskey))` species from the `{r} unique(abv_cube$family)` family and their distribution in Flanders based on a grid of 1 km x 1 km.

Preview of the first 30 rows of the dataset:
```{r}
#| label: data exploration
head(abv_cube, n = 30)
```

## Taxonomic information

Species present in the dataset:
```{r}
abv_cube %>% distinct(specieskey, species)
```

## Temporal information

The data are temporally defined at year level. Years present:
```{r}
abv_cube %>% dplyr::distinct(year) %>% arrange(year)
```

## Geographical information

The geographical information is represented by the `mgrscode` column, which contains the identifiers of the grid cells containing at least one occurrence of the species.

The dataset contains `{r} length(unique(abv_cube$mgrscode))` unique grid cells.


# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which contains the identifiers of the grid cells containing at least one occurrence of the species.

```{r}
#| label: preprocessing
cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")

```

Final (spatial) dataset:
```{r}
sf_abv_cube %>% head(n = 30)
```


````


### Hans

````
---
title: "Read and visualize ABV occurrence data"
date: "`r Sys.Date()`"
format:
  html:
    df-print: paged
execute:
  echo: true
  warning: false
  eval: true
---

```{r}
#| label: Setup
#| message: false
#| warning: false

# Load libraries:
library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps

```




# Introduction

In this document we will:


1. read occurrence cube data
2. explore data
3. preprocess data
4. visualize data

# Read data


Read **ABV** data from the occurrence cube file `20260331_abv_cube.csv`:

```{r}
#| label: read-abv-data

abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)
```


Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:

```{r}
#| label: read-flemish-grid
vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```



# Explore data

This dataset contains data from `r min(abv_cube$year)` to `r max(abv_cube$year)`
related to `r length(unique(abv_cube$specieskey))` species from the
`r sprintf("[%s](https://www.gbif.org/species/%s)", unique(abv_cube$family), unique(abv_cube$familyKey))` family and their distribution in Flanders
based on a grid of 1 km x 1 km.


Preview of the first 30 rows of the dataset:

```{r}
#| label: preview-data
head(abv_cube, n = 30)
```


## Taxonomic information

Species present in the dataset:

```{r}
#| label: species-list
abv_cube %>% distinct(specieskey, species)
```


## Temporal information

The data are temporally defined at year level.
Years present:


```{r}
#| label: years-present
abv_cube %>% dplyr::distinct(year) %>% arrange(year)
```


## Geographical information

The geographical information is represented by the `mgrscode` column,
which contains the identifiers of the grid cells containing at least one
occurrence of the species.

The dataset contains `r length(unique(abv_cube$mgrscode))` unique grid cells.


# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which
contains the identifiers of the grid cells containing at least one occurrence
of the species.

```{r}
#| label: add-geo-info
cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")
```


Final (spatial) dataset:

```{r}
#| label: preview-spatial-data
sf_abv_cube %>% head(n = 30)
```

````


## Challenge 2

### Falk's quarto notebook

- [complete files here...](https://drive.google.com/drive/folders/1jvBD0s0cGUzv0wfNBYg2Ttr-bS3AxaJy?usp=drive_link)

highlights below.

final yaml header: 
```{markdown}
date: "`r Sys.Date()`"
format:
  html:
    toc: true
    toc-location: left
    number-sections: true
    html-math-method: katex
    df-print: paged
    other-links:
      - text: Algemene Broedvogel Monitoring report
        href: https://inbo.github.io/abv-rapport/2023/index.html
      - text: GBIF species occurrence cube
        href: https://doi.org/10.15468/dl.b38nw5
    code-links:
      - text: Dataset
        icon: table
        href: data/20260331_abv_cube.csv
```

callout box:
```{markdown}
:::{.callout-warning title="Coding Club!"}
This report is not intended to be a scientific report but rather a demonstration of how to write a report in Quarto.
:::

```


panel tabset; note the "group" argument so that other tabsets with similar options are coupled:

```{markdown}

## higher header

::: {.panel-tabset group="species_year"}

### per species (1st tab)
[...]

### other headers
[...]

:::

```
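
A sketch of how the coupling plays out, assuming two tabsets with hypothetical tab titles: because both share `group="species_year"`, selecting a tab in one tabset should switch the other to the tab with the same title.

```{markdown}
::: {.panel-tabset group="species_year"}

### per species
[...]

### per year
[...]

:::

::: {.panel-tabset group="species_year"}

### per species
[...]

### per year
[...]

:::
```

The tab titles need to match across the tabsets for the selection to stay in sync.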

code folding:
````{r}
```{r}
#| code-fold: true
#| code-summary: "Show the code"

[...]
```
````
### Sebastiaan's solution
````
---
title: "Read and visualize ABV occurrence data"
author: "Sebastiaan Verbesselt"
format: 
  html:
    df-print: paged
    toc: true
    toc-depth: 2
    toc-location: left
    number-sections: true
    number-depth: 3
    other-links:
      - text: Algemene broedvogels report
        href: https://inbo.github.io/abv-rapport/2023/index.html
      - text: GBIF link to dataset (occurrence cube)
        href: https://www.gbif.org/occurrence/download/0013459-251009101135966
editor: visual
date: "`{r} Sys.Date()`"
date-format: "D MMMM, YYYY"
---

# CHALLENGE 1

Convert the code below to a Quarto (qmd) document called `visualize_n_occs.qmd` and make an html version of it.

# Setup

```{r}
#| output: false
#| label: Load libraries
library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps
```

# Introduction

In this document we will:

1.  read occurrence cube data
2.  explore data
3.  preprocess data
4.  visualize data

# Read data

Read **ABV** data from the occurrence cube file `20260331_abv_cube.csv`:

```{r}
#| label: read data
#| results: hide
abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)

# Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:
vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```

# Explore data

This dataset contains data from `{r} min(abv_cube$year)` to `{r} max(abv_cube$year)` related to `{r} length(unique(abv_cube$specieskey))` species from the `{r} unique(abv_cube$family)` family and their distribution in Flanders based on a grid of 1 km x 1 km.

Preview of the first 30 rows of the dataset:

```{r}
#| label: data exploration
head(abv_cube, n = 30)
```

## Taxonomic information

Species present in the dataset:

```{r}
abv_cube %>% distinct(specieskey, species)
```

## Temporal information

The data are temporally defined at year level. Years present:

```{r}
abv_cube %>% dplyr::distinct(year) %>% arrange(year)
```

## Geographical information

The geographical information is represented by the `mgrscode` column, which contains the identifiers of the grid cells containing at least one occurrence of the species.

The dataset contains `{r} length(unique(abv_cube$mgrscode))` unique grid cells.

# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which contains the identifiers of the grid cells containing at least one occurrence of the species.

```{r}
#| label: preprocessing
cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")

```

Final (spatial) dataset:

```{r}
sf_abv_cube %>% head(n = 30)
```


# CHALLENGE 2 ####
::: {.callout-caution }
"This report is not intended to be a scientific report but rather a demonstration of how to write a report in Quarto"
:::

# Visualize data

In this section we will show how the number of occurrences and the number of
occupied grid cells vary by year and species. Both static plots and dynamic maps are generated.

## Static plots
Show number of occurrences and number of occupied grid cells (make a tabbed section out of it)


::: {.panel-tabset}
### per species (1st tab)
```{r}
#| code-fold: true
#| code-summary: "Show the code"
n_per_species <- sf_abv_cube %>%
  dplyr::group_by(species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_species, aes(x = species, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```



### per year (2nd tab)
```{r}
#| code-fold: true
#| code-summary: "Show the code"
n_per_year <- sf_abv_cube %>%
  dplyr::group_by(year) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_year,aes(x = year, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```

### per year and species (3rd tab)
```{r}
#| code-fold: true
#| code-summary: "Show the code"
n_occs_per_year_species <-
  sf_abv_cube %>%
  dplyr::group_by(year, species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_occs_per_year_species,
       aes(x = year, y = n, fill = species)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable) +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```
:::


## Dynamic plots
### Leaflet dynamic map
We show a map with the distribution of buntings in Flanders. We show the
total number of occurrences per grid cell. The color of the grid cells is
based on the number of occurrences. The legend shows the color scale and the
number of occurrences per grid cell.

```{r}
#| code-fold: true
#| code-summary: "Show the code"
n_occs_per_cell <- sf_abv_cube %>%
  dplyr::group_by(mgrscode) %>%
  dplyr::summarize(
    occurrences = sum(n),
    min_coordinateuncertaintyinmeters = min(mincoordinateuncertaintyinmeters),
    .groups = "drop"
  )

map_abv <- mapview(
  n_occs_per_cell,
  zcol = "occurrences",
  legend = TRUE
)

map_abv
```




### Plotly yearly abundance
We show a graph with the yearly abundances per species.
```{r}
#| code-fold: true
#| code-summary: "Show the code"
n_occs_per_year <- n_occs_per_year_species |>
  dplyr::filter(variable == "occurrences") |>
  st_drop_geometry()

fig <- plot_ly(
  n_occs_per_year,
  x = ~year,
  y = ~n,
  split = ~species,
  stroke = ~species,
  type = "scatter",
  mode = "lines+markers"
)

fig
```



````
### Emma's solution
````
---
title: "Read and visualize ABV occurrence data"
format: 
  html:
    df-print: paged
    toc: true
    toc-location: left
    number-sections: true
    code-summary: "Show the code"
    other-links:
      - text: Algemene Broedvogel Monitoring
        href: https://inbo.github.io/abv-rapport/2023/index.html
      - text: GBIF species occurrence cube
        href: https://www.gbif.org/occurrence/download/0013459-251009101135966
editor: source
date: "`r Sys.Date()`"
date-format: "D MMMM, YYYY"
---

::: {.callout-caution}
This report is not intended to be a scientific report but rather a demonstration of how to write a report in Quarto.
:::

# Setup

Load libraries:

```{r}
#| message: false
#| warning: false

library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps
```

# Introduction

In this document we will:

1.  read occurrence cube data
2.  explore data
3.  preprocess data
4.  visualize data

# Read data

Read **ABV** data from the occurrence cube file `20260331_abv_cube.csv`:
```{r}
#| warning: false
abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)
```

Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:
```{r}
#| warning: false
vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```

# Explore data

This dataset contains data from `r min(abv_cube$year)` to `r max(abv_cube$year)` related to `r length(unique(abv_cube$specieskey))` species from the [`r unique(abv_cube$family)`](`r paste0("https://www.gbif.org/species/",unique(abv_cube$familyKey))`) family and their distribution in Flanders based on a grid of 1 km x 1 km.

Preview of the first 30 rows of the dataset:
```{r}
head(abv_cube, n = 30)
```

## Taxonomic information

Species present in the dataset:
```{r}
abv_cube %>% distinct(specieskey, species)
```

## Temporal information

The data are temporally defined at year level. Years present:
```{r}
abv_cube %>% dplyr::distinct(year) %>% arrange(year)
```

## Geographical information

The geographical information is represented by the `mgrscode` column, which contains the identifiers of the grid cells containing at least one occurrence of the species.

The dataset contains `r length(unique(abv_cube$mgrscode))` unique grid cells.

# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which contains the identifiers of the grid cells containing at least one occurrence of the species.

```{r}
cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")
```

# Final (spatial) dataset:

```{r}
sf_abv_cube %>% head(n = 30)
```


# Visualize data

In this section we will show how the number of occurrences and the number of
occupied grid cells vary by year and species. Both static plots and dynamic maps are generated.

## Static plots

Number of occurrences and number of occupied grid cells:

::: {.panel-tabset}
### per species

```{r}
#| code-fold: true

n_per_species <- sf_abv_cube %>%
  dplyr::group_by(species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_species, aes(x = species, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```

### per year

```{r}
#| code-fold: true

n_per_year <- sf_abv_cube %>%
  dplyr::group_by(year) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_year,aes(x = year, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```


### per year and species

```{r}
#| code-fold: true

n_occs_per_year_species <-
  sf_abv_cube %>%
  dplyr::group_by(year, species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_occs_per_year_species,
       aes(x = year, y = n, fill = species)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable) +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```
:::
## Dynamic plots
### Leaflet dynamic map
We show a map with the distribution of buntings in Flanders. We show the
total number of occurrences per grid cell. The color of the grid cells is
based on the number of occurrences. The legend shows the color scale and the
number of occurrences per grid cell.
```{r}
#| code-fold: true

n_occs_per_cell <- sf_abv_cube %>%
  dplyr::group_by(mgrscode) %>%
  dplyr::summarize(
    occurrences = sum(n),
    min_coordinateuncertaintyinmeters = min(mincoordinateuncertaintyinmeters),
    .groups = "drop"
  )

map_abv <- mapview(
  n_occs_per_cell,
  zcol = "occurrences",
  legend = TRUE
)

map_abv
```


### Plotly yearly abundance

We show a graph with the yearly abundances per species.
```{r}
#| code-fold: true

n_occs_per_year <- n_occs_per_year_species |>
  dplyr::filter(variable == "occurrences") |>
  st_drop_geometry()

fig <- plot_ly(
  n_occs_per_year,
  x = ~year,
  y = ~n,
  split = ~species,
  stroke = ~species,
  type = "scatter",
  mode = "lines+markers"
)

fig
```
````

### Hans

````
---
title: "Read and visualize ABV occurrence data"
date: "`r Sys.Date()`"
format:
  html:
    df-print: paged
    toc: true
    toc-location: left
    number-sections: true
    code-fold: true
    other-links:
      - text: Algemene Broedvogel Monitoring
        href: https://inbo.github.io/abv-rapport/2023/index.html
      - text: GBIF species occurrence cube
        href: https://doi.org/10.15468/dl.b38nw5
execute:
  echo: true
  warning: false
  eval: true
editor:
  markdown:
    wrap: sentence
---

```{r}
#| label: Setup
#| message: false
#| warning: false

# Load libraries:
library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps

```




# Introduction

::: {.callout-warning}
This report is not intended to be a scientific report but rather a demonstration of how to write a report in Quarto
:::

In this document we will:


1. read occurrence cube data
2. explore data
3. preprocess data
4. visualize data

# Read data


Read **ABV** data from the occurrence cube file `20260331_abv_cube.csv`:

```{r}
#| label: read-abv-data
#| message: false

abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)
```


Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:

```{r}
#| label: read-flemish-grid
#| message: false
vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```



# Explore data

This dataset contains data from `r min(abv_cube$year)` to `r max(abv_cube$year)`
related to `r length(unique(abv_cube$specieskey))` species from the
`r sprintf("[%s](https://www.gbif.org/species/%s)", unique(abv_cube$family), unique(abv_cube$familyKey))` family and their distribution in Flanders
based on a grid of 1 km x 1 km.


Preview of the first 30 rows of the dataset:

```{r}
#| label: preview-data
head(abv_cube, n = 30)
```


## Taxonomic information

Species present in the dataset:

```{r}
#| label: species-list
abv_cube %>% distinct(specieskey, species)
```


## Temporal information

The data are temporally defined at year level.
Years present:


```{r}
#| label: years-present
abv_cube %>% dplyr::distinct(year) %>% arrange(year)
```


## Geographical information

The geographical information is represented by the `mgrscode` column,
which contains the identifiers of the grid cells containing at least one
occurrence of the species.

The dataset contains `r length(unique(abv_cube$mgrscode))` unique grid cells.


# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which
contains the identifiers of the grid cells containing at least one occurrence
of the species.

```{r}
#| label: add-geo-info
cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")
```


Final (spatial) dataset:

```{r}
#| label: preview-spatial-data
sf_abv_cube %>% head(n = 30)
```



# Visualize data

In this section we will show how the number of occurrences and the number of
occupied grid cells vary by year and species. Both static plots and dynamic maps are generated.

## Static plots

Show number of occurrences and number of occupied grid cells (make a tabbed section out of it)

::: {.panel-tabset}

### per species

```{r}
n_per_species <- sf_abv_cube %>%
  dplyr::group_by(species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_species, aes(x = species, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))

```

### per year

```{r}
n_per_year <- sf_abv_cube %>%
  dplyr::group_by(year) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_year,aes(x = year, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```



### per year and species

```{r}
n_occs_per_year_species <-
  sf_abv_cube %>%
  dplyr::group_by(year, species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_occs_per_year_species,
       aes(x = year, y = n, fill = species)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable) +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
```

:::

## Dynamic plots

### Leaflet dynamic map

We show a map with the distribution of buntings in Flanders.
We show the total number of occurrences per grid cell.
The color of the grid cells is based on the number of occurrences.
The legend shows the color scale and the number of occurrences per grid cell.

```{r}
n_occs_per_cell <- sf_abv_cube %>%
  dplyr::group_by(mgrscode) %>%
  dplyr::summarize(
    occurrences = sum(n),
    min_coordinateuncertaintyinmeters = min(mincoordinateuncertaintyinmeters),
    .groups = "drop"
  )

map_abv <- mapview(
  n_occs_per_cell,
  zcol = "occurrences",
  legend = TRUE
)

map_abv

```


### Plotly yearly abundance

We show a graph with the yearly abundances per species.

```{r}
n_occs_per_year <- n_occs_per_year_species |>
  dplyr::filter(variable == "occurrences") |>
  st_drop_geometry()

fig <- plot_ly(
  n_occs_per_year,
  x = ~year,
  y = ~n,
  split = ~species,
  stroke = ~species,
  type = "scatter",
  mode = "lines+markers"
)

fig
```
````
### Larissa

````
---
title: "Read and visualize ABV occurrence data"
format: 
    html:
      df-print: paged
      code-fold: true
      code-summary: "Show me the code"
      number-sections: true
      toc: true
      toc-location: left
      other-links:
      - text: Algemene Broedvogel Monitoring
        href: https://inbo.github.io/abv-rapport/2023/index.html
      - text: GBIF species occurrence cube
        href: https://doi.org/10.15468/dl.b38nw5
editor: visual
date: "`r Sys.Date()`"
date-format: "D MMMM YYYY"

---

::: {.callout-caution}

This report is not intended to be a scientific report but rather a demonstration of how to write a report in Quarto

:::

```{r}
#| label: library-setup
#| output: false

library(tidyverse)    # to do datascience
library(here)         # to build file paths in a project
library(INBOtheme)    # to apply INBO style to graphs
library(sf)           # to work with geospatial vector data
library(plotly)       # to make dynamic plots
library(mapview)      # to make maps
library(leaflet)      # to make dynamic maps
```

## Introduction

In this document we will:

1.  read occurrence cube data
2.  explore data
3.  preprocess data
4.  visualize data

### Read ABV data from the occurrence cube file `20260331_abv_cube.csv`:

```{r}
#| label: Read-ABV-data
#| warning: false

abv_cube <- read_csv(
  file = here::here("data", "20260331", "20260331_abv_cube.csv")
)

```

### Read the Flemish grid from the geopackage file `20260331_utm_grid.gpkg`:

```{r}
#| label: Read-Flemish-gpkg
#| warning: false

vl_grid <- st_read(
  dsn = here("data", "20260331", "20260331_utm_grid.gpkg")
)
```

# Explore data

This dataset contains data from `r min(abv_cube$year)` to `r max(abv_cube$year)`
related to `r length(unique(abv_cube$specieskey))` species from the
`r unique(abv_cube$family)` family and their distribution in Flanders
based on a grid of 1 km x 1 km.

### Preview of the first 30 rows of the dataset:

```{r}
#| warning: false

head(abv_cube, n = 30)
```

# Taxonomic information

### Species present in the dataset:

```{r}
#| warning: false

abv_cube %>% distinct(specieskey, species)
```

# Temporal information

The data are temporally defined at year level. Years present:

```{r}
#| warning: false

abv_cube %>% dplyr::distinct(year) %>% arrange(year)

```

# Geographical information

The geographical information is represented by the `mgrscode` column, which contains the identifiers of the grid cells containing at least one occurrence of the species.
The dataset contains `r length(unique(abv_cube$mgrscode))` unique grid cells.
 
# Preprocess data

Add geometrical information to the occurrence cube via `mgrscode`, which
contains the identifiers of the grid cells containing at least one occurrence
of the species.

```{r}
#| warning: false

cells_in_cube <- vl_grid %>%
  dplyr::filter(mgrscode %in% unique(abv_cube$mgrscode)) %>%
  dplyr::select(-c(TAG, Shape_Leng, Shape_Area))
sf_abv_cube <- cells_in_cube %>%
  dplyr::left_join(abv_cube, by = "mgrscode")
```

# Final (spatial) dataset:
```{r}
#| warning: false

sf_abv_cube %>% head(n = 30)
```


# (CHALLENGE 2 START) Visualize data
In this section we will show how the number of occurrences and the number of occupied grid cells vary by year and species. Both static plots and dynamic maps are generated.

## Static plots

Show number of occurrences and number of occupied grid cells (make a tabbed section out of it)

::: {.panel-tabset}

## Per species

```{r}
n_per_species <- sf_abv_cube %>%
  dplyr::group_by(species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

plot1 <- ggplot(n_per_species, aes(x = species, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
  
plot1
```

## Per year
```{r}
n_per_year <- sf_abv_cube %>%
  dplyr::group_by(year) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_per_year,aes(x = year, y = n)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable, scales = "free_y") +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))
  

```


## Per year and species
```{r}
n_occs_per_year_species <-
  sf_abv_cube %>%
  dplyr::group_by(year, species) %>%
  dplyr::summarize(occurrences = sum(n),
                   grid_cells = n_distinct(mgrscode),
                   .groups = "drop") %>%
  tidyr::pivot_longer(cols = c(occurrences, grid_cells),
                      names_to = "variable",
                      values_to = "n")

ggplot(n_occs_per_year_species,
       aes(x = year, y = n, fill = species)) +
  geom_bar(stat = 'identity') +
  facet_grid(.~variable) +
  ggplot2::theme(axis.text.x = element_text(angle = 60, hjust = 1))

```
:::

````


## Challenge 3





