# NICEST2 hackathon on FAIR climate data
This document is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.
Attendees are expected to follow our code of conduct: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html
All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/
### Logistics
- [Zoom mechanics and controls](https://coderefinery.github.io/manuals/zoom-mechanics/)
- [HackMD mechanics and controls](https://coderefinery.github.io/manuals/hackmd-mechanics/)
- **Presentations were recorded** (see https://nordicesmhub.github.io/nicest2-fair-hackathon/)
_______________________________________________________________________________________________________
## Notes from day-1
- https://hackmd.io/@nicest2/HJQnn31XO
## Icebreaker
*After day-1, do you still have some hopes for this hackathon? What are they?*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
## Welcome
### Code of Conduct
- https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html
#### How to report CoC issues to the CoC committee?
- Email/contact: Naoe Tatara (naoe.tatara@ub.uio.no)
## Presentation
Please sign in so we can record your attendance.
*Name, Institution, relevant project, Email & Twitter (optional)*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
### EOSC-Nordic Jupyterhub
- https://eosc-nordic.uiogeo-apps.sigma2.no/
### Planning for practical sessions
Day 2 (03/16, 13:00-17:00 CET) and day 3 (03/17, 13:00-17:00 CET) are scheduled for practical work. We hope this gives those who are interested an opportunity to dive deeper, in breakout/working groups, into some of the questions raised today.
Some ideas/suggestions for breakout themes (but we want to hear your ideas also!):
- **ESGF (possibly incl. ESMVal) use case**: Create a notebook/recipe describing the practical steps for downloading data, querying errata, ... to capture all metadata required for scientific publication. Maybe take a "reverse" approach by looking at existing papers, e.g.:
- "Causes of Higher Climate Sensitivity in CMIP6 Models", https://doi.org/10.1029/2019GL085782
- Bias in CMIP6 models as compared to observed regional dimming and brightening, https://acp.copernicus.org/articles/20/16023/2020/
- Equilibrium climate sensitivity above 5 °C plausible due to state-dependent cloud feedback, https://www.nature.com/articles/s41561-020-00649-1
and detail how the (CMIP) metadata from one of these papers can be determined for someone trying to reproduce the research. (Prashanth)
- **Develop a FAIR cookbook for climate data, workflow/DMP integration** (https://fairplus.github.io/the-fair-cookbook/content/home.html). Take a PhD/early career researcher as the use case to develop a FAIR climate data cookbook. How, in practice, can DMPs be integrated with researchers' workflows to ensure information (metadata) is propagated, and to make DMPs relevant & useful for researchers? (Tyge, Hamish, Adil, Abdelkader, Oskar, Joakim, Aiden)
- **Tools & Metadata**: Take an existing paper (from Aiden, Sara or Dominic?) to analyze what is missing regarding FAIR (tools & metadata). ReproHack style analysis.
- Humanitarian need drives multilateral disaster aid, https://www.pnas.org/content/118/4/e2018293118
- Equilibrium climate sensitivity above 5 °C plausible due to state-dependent cloud feedback, https://www.nature.com/articles/s41561-020-00649-1
- CMIP6 data request: Migrate CMIP request data into database. (Anne, Naoe, Jean, Klaus, Kirsten, Yanchun)
- Current CMIP6 repository: https://cmip6dr.github.io/Data_Request_Home/
- Subversion repository of the Data Request: http://proj.badc.rl.ac.uk/svn/exarch/CMIP6dreq/tags/01.00.33/
- Documentation for restful api: https://esgf.github.io/esg-search/ESGF_Search_RESTful_API.html
HackMD for Tools and metadata: https://hackmd.io/@nicest2/H1ibGVAXd
## Thoughts & inputs about hackathon
- Matus: Apply to climate and use knowledge from bio-informatics (inputs as an "outsider" from climate), interested in options / efforts towards motivating & lowering barrier. I'll be jumping between the different breakout rooms 🦘
- Abdelkader: interested in working on lowering the barrier for scientists. There have been various developments in recent years in data formats and tools (e.g. NetCDF, CF-compliant conventions) to make sure climate data is standardized and well documented. Would it be possible to come up with automated FAIRness checks that extend the metadata, so as to increase our confidence in (re)using these datasets?
- Oskar: motivate why we "should" share data. Regional modelling needs to catch up with the global. Consider publishing research climate data to ESGF to benefit from existing work.
- Jean: see on a practical example how to "make something FAIR" +1+1
- Adil: DMP integration with other services (for instance in Norway with Sigma2 MAS), ES-DOC. also interested in working on making data FAIR (practical example).
- Klaus: identify existing projects/tools to extend/reuse. https://c6dreq.dkrz.de/, understand what kind of grids are used (CMIP6, etc.), make a registry of the different grids (can be reused to re-grid)
ES-DOC ocean description metadata for CMIP6: https://specializations.es-doc.org/static/index.html?target=cmip6.ocean&client=esdoc-url-rewrite
- Aiden: see what tools could be reused (in our own research group), from own data, consider using Aiden *et al.*'s data as a practical example
- Joakim: librarian, DMP (machine actionable) as a tool to evaluate FAIRness of the data. ES-DOC and E
## Day-2 presentation from Joakim (DMP)
Ask questions below:
- Tool to get a score for the FAIRness https://f-uji.net/
- Is there a way to directly get a score for the DMP?
- It is not possible in DMPOnline, but possibly in the Data Stewardship Wizard
- Is the F-UJI tool usable for DMPs only?
- No, F-UJI is usable only for datasets. If you have a dataset with an identifier, you can use the tool to check its FAIRness. All the tool does is access the page and try to find links to information. It doesn't check the quality of the metadata.
- How does the F-UJI score compare with the DMP score?
- Are DMPs compulsory in Sweden?
- Depends on the funder, but in most cases they are.
- Is the API "accessible" to end-users?
- Probably not (one needs to be admin)
- But can be downloaded (in different formats)
# Day 3
## Working groups
### 1. FAIR climate cookbook:
Participants: Abdelkader, Oskar, Aiden, Hamish, Adil, Jean
- Create repo in https://github.com/NordicESMhub/
- Create basic book structure
- Start hacking content
### 2. Idealized citation
Participants: Tyge, Naoe, Kirsten, Prashanth, Anne, Klaus
- Decide on a paper(s) to analyze citation
- data
- software
- Push recommendations to cookbook
- Consider a small article for EOSC-Nordic?
- [nature paper](https://www.nature.com/articles/s41561-020-00649-1)
CMIP6 data citation; add handles for CMIP6 (Identifier, PID) for the entire dataset (not per file as a dataset can be split per time):
- id = CMIP6.CMIP.NCAR.CESM2-FV2.historical.r1i1p1f1.Amon.tas.gn[.v20191120], http://hdl.handle.net/hdl:21.14100/8b4957ac-0912-3664-b0c2-67a8874d8974 (no need to add the version because we have the PID of the dataset)
- id = CMIP6.CMIP.NCAR.CESM2-FV2.historical.r1i1p1f1.day.tas.gn, http://hdl.handle.net/hdl:21.14100/95abfd7b-5c01-39fd-8dd4-9c155d020143
- id = CMIP6.CMIP.NCAR.CESM2-FV2.historical.r2i1p1f1.day.tas.gn, http://hdl.handle.net/hdl:21.14100/ce68c95e-d0e9-32bf-ab99-19e2596c59aa
- id = CMIP6.CMIP.NCAR.CESM2-FV2.historical.r3i1p1f1.day.tas.gn, http://hdl.handle.net/hdl:21.14100/7554abc7-6fe2-3ebb-8a42-9e0def9fe0db
- id = CMIP6.CMIP.NCAR.CESM2-FV2.historical.r2i1p1f1.Amon.tas.gn, http://hdl.handle.net/hdl:21.14100/c402c2d9-cac8-3df8-9d69-7f06327dfc00
- id = CMIP6.CMIP.NCAR.CESM2-FV2.historical.r3i1p1f1.Amon.tas.gn, http://hdl.handle.net/hdl:21.14100/a587294e-5a2b-36cf-9f64-0ed1168bb862
But we could give the PID of the entire dataset.
--> the best would be to get a PID from a downloaded dataset. (to add in the cookbook); provide a simple tool to return PID from data stored on disk.
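The identifiers listed above follow the CMIP6 dataset naming pattern (`mip_era.activity_id.institution_id.source_id.experiment_id.member_id.table_id.variable_id.grid_label`, optionally followed by a version). As a possible starting point for the cookbook, here is a minimal sketch that splits such an identifier into its facets so they can be recorded next to the PID. The names `FACETS` and `parse_dataset_id` are illustrative and not part of any existing tool.

```python
# Facet names in the order they appear in a CMIP6 dataset identifier.
FACETS = (
    "mip_era", "activity_id", "institution_id", "source_id",
    "experiment_id", "member_id", "table_id", "variable_id", "grid_label",
)


def parse_dataset_id(dataset_id):
    """Split a CMIP6 dataset id into a dict of facets.

    The trailing version (e.g. 'v20191120') is optional; it is stored
    under 'version', or None when the id does not carry one.
    """
    parts = dataset_id.split(".")
    if len(parts) not in (len(FACETS), len(FACETS) + 1):
        raise ValueError(f"unexpected number of facets in {dataset_id!r}")
    facets = dict(zip(FACETS, parts))
    facets["version"] = parts[len(FACETS)] if len(parts) > len(FACETS) else None
    return facets


if __name__ == "__main__":
    example = "CMIP6.CMIP.NCAR.CESM2-FV2.historical.r1i1p1f1.Amon.tas.gn"
    for name, value in parse_dataset_id(example).items():
        print(f"{name}: {value}")
```

Keeping the facets as a plain dict makes it easy to dump them to JSON or a table for a data-ography section.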
CMIP5 REST API example: https://esgf-node.llnl.gov/esg-search/search?type=File&tracking_id=826ee5e9-3cc9-40a6-a42b-d84c6b4aad97
CMIP6 REST API example: https://esgf-node.llnl.gov/esg-search/search?type=File&tracking_id=hdl:21.14100/a97ec9f6-f29f-4c79-b809-285926068043
Resolve hdl handle: https://hdl.handle.net/
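The two REST examples above only differ in the `tracking_id` value, so a "PID from data on disk" helper could start from the `tracking_id` global attribute stored in each CMIP netCDF file. The sketch below only builds the URLs and does not read files or contact any server. The function names are hypothetical, and it assumes the handle resolver expects the bare handle without the `hdl:` prefix.

```python
from urllib.parse import urlencode

# Endpoints taken from the examples in these notes.
ESGF_SEARCH = "https://esgf-node.llnl.gov/esg-search/search"
HANDLE_RESOLVER = "https://hdl.handle.net/"


def file_search_url(tracking_id, base=ESGF_SEARCH):
    """Build an ESGF file search URL for a given tracking_id.

    ':' and '/' are kept unescaped so CMIP6 ids such as
    'hdl:21.14100/<uuid>' stay readable in the URL.
    """
    query = urlencode({"type": "File", "tracking_id": tracking_id}, safe=":/")
    return f"{base}?{query}"


def handle_url(pid):
    """Build a resolver URL for a handle, dropping a leading 'hdl:' (assumption)."""
    handle = pid[len("hdl:"):] if pid.startswith("hdl:") else pid
    return HANDLE_RESOLVER + handle


if __name__ == "__main__":
    tid = "hdl:21.14100/a97ec9f6-f29f-4c79-b809-285926068043"
    print(file_search_url(tid))
    print(handle_url(tid))
```

Reading the `tracking_id` attribute itself would need a netCDF library and is left out here to keep the sketch dependency-free.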
#### Data-ography section
- CMIP6: list of handles and identifier (see above)
- DOI of the code or tool to compute anomalies (version, PID). If it is your own tool on GitHub, make a release and get a DOI with Zenodo.
- GISTEMP v4 : no PID so download dataset and archive it to get a PID?
- Compute annual mean (DOI of code; in that case it may be too simple and unnecessary)
- Code to make Figure 1a (this code could be added with the paper or on GitHub)
- Data used for plotting, saved so it is easy to reproduce the plot
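To illustrate why the annual-mean step may be too simple to deserve its own DOI, here is a minimal sketch in plain Python. It assumes monthly values given as `(year, month, value)` tuples and uses an unweighted mean over the 12 months (no weighting by month length), which may differ from what a given paper does. `annual_means` is an illustrative name.

```python
from collections import defaultdict
from statistics import fmean


def annual_means(monthly):
    """Return {year: mean of the 12 monthly values}.

    monthly is an iterable of (year, month, value) tuples.
    Years that do not have exactly 12 values are skipped.
    """
    by_year = defaultdict(list)
    for year, _month, value in monthly:
        by_year[year].append(value)
    return {
        year: fmean(values)
        for year, values in sorted(by_year.items())
        if len(values) == 12
    }


if __name__ == "__main__":
    # Synthetic values for illustration only, not real observations.
    data = [(2000, m, float(m)) for m in range(1, 13)]
    print(annual_means(data))
```

Even for a step this small, recording the exact code version alongside the plotted data helps someone reproduce the figure.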
### 3. Ontologies
Participants: Oscar from 15:00, Matus (jumping between groups), Klaus
- White paper/roadmap
- Outline
- Where will it be hosted?
- How to contribute?
- Important Ontologies or Ontology-like information
- CF Conventions: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html
- CF Standard Names: http://cfconventions.org/standard-names.html
- CMIP6 Data Request: http://clipc-services.ceda.ac.uk/dreq/index.html
- Implementation repository (svn): http://proj.badc.rl.ac.uk/svn/exarch/CMIP6dreq/tags/01.00.33/dreqPy/docs
- ES-DOC cim schema: https://github.com/ES-DOC/esdoc-cim-v2-schema
- IS-ENES3 Milestones: https://is.enes.org/documents/milestones
- Other communities:
- https://bioportal.bioontology.org
- http://agroportal.lirmm.fr/ontologies
**Sustainability** Please think about how your activity can be sustained after the end of the hackathon