# Meeting about CFF and citable lessons

Meeting participants: Toby Hodges, Samantha Wittke, Radovan Bast

This doc: https://hackmd.io/@coderefinery/citable-lessons-2024

CodeRefinery:
- One blog post on citable lessons in '21: https://coderefinery.org/blog/2021/11/21/towards-citable-lessons/
- Manuals page describing the idea: https://coderefinery.github.io/manuals/lesson-credits/
- Notes from previous meetings on this topic:
  - https://hackmd.io/@coderefinery/citable-lessons
  - https://hackmd.io/@yonglei/citable-lesson-coderefinery-2024feb
- Missing part: Use Zenodo API to upload to Zenodo automatically at each release
- Difficult part: For each lesson, finding contributors, reaching out to them, and collecting consent. This has been done through issue/PR. Example: https://github.com/coderefinery/documentation/pull/270
- Unclear: Distinguish authors and editors and contributors?
  - https://github.com/citation-file-format/citation-file-format/issues/112
- Goal: Release one or two versions per year where release version is the date
- Example cff: https://github.com/coderefinery/documentation/blob/main/CITATION.cff
- Example CodeRefinery lesson on Zenodo: https://zenodo.org/records/8280235
- What about the author list, does it grow endlessly with every new lesson release?
  - initial thought: it grows unless someone wants to be removed
  - even if the original contribution may be gone, the lesson still often builds on previous versions
- alphabetical author ordering issue:
  - nothing indicates importance or scale of contribution
  - other ordering: raises more questions than it answers
  - maybe distinguishing editors/maintainers and authors/contributors could help?

Carpentries:
- AMY collects consent

Goal of not missing contributions:
- Commit history is not the only thing we look at
- Threshold to be added should be low
- Better to credit too many than too few
- Provide a mechanism to add contributors later
  - Advantage: Metadata can be modified without changing DOI. But we need a mechanism for it.


https://github.com/carpentries/lesson-development-training/

Comparing CFFs:
- https://github.com/carpentries/lesson-development-training/blob/main/CITATION.cff
- https://github.com/coderefinery/documentation/blob/main/CITATION.cff
- differences:
  - type: data vs dataset (dataset is the correct type: https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#type ; now also fixed in CR lesson)
  - authors: in CR we should add affiliation (can be complicated) and ORCID
  - abstract: in CR we should break lines
  - doi vs identifiers/type,doi,description
  - keywords: let's add them in CR
  - repository-code: missing in CR

Converting CFF to a webpage: https://github.com/University-of-Potsdam-MM/cff2pages

```
---
layout: page
authors: ["Radovan Bast", "Toby Hodges", "Samantha Wittke"]
teaser: "Discussing the current state and potential future applications of Citation File Format for lessons"
title: "Citation Information for Open Source Lessons"
date: 2024-XX-YY
time: "09:00:00"
tags: ["Curriculum", "Publishing"]
---
```

Goals for blog post:
- context: what is CFF, history of support in our projects up to now

## What is CFF?

https://github.com/citation-file-format/citation-file-format:
- human and machine readable citation information for software and datasets
- standardized fields
- enables easy citing, since the information needed for a citation has to be provided for a cff file to be valid
- Since XX: official GitHub integration (https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files)
- cff validator can be integrated in GitHub Actions: https://github.com/marketplace/actions/cff-validator
- Also usable by Zenodo and Zotero (= tools that people may already use)
- CFF is easily convertible to other formats using cffconvert (https://github.com/citation-file-format/cffconvert)
- While lessons are neither software nor a dataset, the metadata collected in the cff file covers everything needed to cite lesson materials as well
- Creating a cff file is easy with cff-init (https://citation-file-format.github.io/cff-initializer-javascript/#/), but fields need to be filled with care

## What information should be captured in a CFF for a lesson?

- [name=Toby]: suggest to move this section further down in the content.
- describe what information we recommend should be captured in a CFF for an open source lesson
  - authorship criteria and order
  - type: dataset
  - references
    - other lessons/resources your material was inspired by, is based on, adapts, etc
  - license
  - identifiers/DOI
  - cff-version
  - title
  - abstract
    - give a brief description of the lesson and its learning outcomes/objectives
- making sure CFF is and remains valid across PRs to it
  - https://github.com/coderefinery/documentation/blob/main/.github/workflows/validate-cff.yml
- CFF was conceived to describe software and data, and it is sometimes not obvious how it should translate to "creative" outputs like lessons.
- Open Source lessons like those created by the CodeRefinery and Carpentries communities are nevertheless similar to software projects in many of the ways that matter for CFF: they have a commit history, an open license, multiple versions, etc.
- Also similar to software and data projects, it is often not clear how lessons should be cited by those who have used and benefitted from them.

Based on previous experience and discussions at various conferences and events, members of the CodeRefinery and Carpentries teams developed the following list of information that should be captured in the CFF of an open source lesson:

### Authorship information

Lessons are usually the product of numerous and diverse contributions from a group of people. The list of authors should aim to include everyone who has contributed to the project, _whether or not they are represented in the commit history_. Contributions can be made in a wide variety of different ways: most directly by writing and committing content to the default branch of the project, but also by providing feedback on

## Motivation on why we want to make lessons citable

The effort of integrating CFF into lesson metadata and making lessons citable is first and foremost meant to give the many contributors, editors, and lesson maintainers credit for their work, and possibly more visibility. Lessons can then be cited, and contributors can point to them on their CVs to highlight their work and its reach.

The second motivation is that by assigning a persistent identifier to lessons we have a chance to make the material more persistent and findable. Many projects are limited in time and we want to avoid lessons simply disappearing when a project website is discontinued.

## Towards FAIR metadata for lessons

We are currently thinking about working towards a yearly release cycle where we check the CFF and create a new release with the version tag in the form `YYYY-MM-DD`.
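
As a minimal sketch of what such a date-based release could look like in the metadata, the fragment below shows only the two `CITATION.cff` keys that would change at release time. The date is an illustrative placeholder, not an actual release of any lesson.

```yaml
# Fragment of a CITATION.cff touched at release time (illustrative values only)
version: "2024-05-01"       # release tag in YYYY-MM-DD form; quoted so it is read as a string
date-released: 2024-05-01   # date on which this version was released
```

Keeping `version` identical to the git tag would make it straightforward to check, during review or in a workflow, that the two have not drifted apart.
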
Ideally, the CFF file is continuously updated through the pull requests (merge requests) that bring in lesson changes, as part of code/lesson review, and not only when we prepare the next release.

A successful adoption of the CFF metadata in lessons could bring us one step closer to well-defined FAIR metadata for lessons by reusing some of the information captured in the CFF metadata. For this, we will need to compare metadata specifications of related efforts (**LIST THEM**) to find and define a common overlap (however, we might explore this in more detail in a future blog post).

## Next steps

- describe next steps for our lessons
- recommendations for the github
- zenodo automation: assuming we have a CFF, we would like a GitHub workflow to be able to upload stuff to Zenodo at each release
  - including a PDF or the HTML+CSS etc of the built site
  - if/how to do this with GitHub-Zenodo integration
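
As a reference point for these next steps, the fields recommended in these notes could be combined into a `CITATION.cff` along the following lines. This is a hedged sketch assuming CFF schema version 1.2.0, not the file of any existing lesson: all names, the ORCID, the DOI, the repository URL, and the referenced lesson are hypothetical placeholders, and the license is only an example.

```yaml
# Sketch of a CITATION.cff for a lesson; every value below is a placeholder.
cff-version: 1.2.0
message: "If you use this lesson, please cite it using the metadata below."
title: "Example Lesson Title"
type: dataset               # CFF 1.2.0 only allows "software" or "dataset"
abstract: >-
  Brief description of the lesson and its
  learning outcomes/objectives.
authors:
  - given-names: "Ada"
    family-names: "Example"
    orcid: "https://orcid.org/0000-0000-0000-0000"   # placeholder ORCID
    affiliation: "Example University"
  - given-names: "Bo"
    family-names: "Sample"
keywords:
  - lesson
  - teaching
license: CC-BY-4.0          # example only; use the lesson's actual SPDX identifier
repository-code: "https://github.com/example-org/example-lesson"   # hypothetical URL
identifiers:
  - type: doi
    value: "10.5281/zenodo.0000000"                  # placeholder DOI
    description: "DOI for this release"
references:
  # other lessons/resources the material was inspired by, is based on, or adapts
  - type: generic
    title: "Another Lesson This Material Builds On"
    authors:
      - name: "Example Lesson Community"
```

The `abstract` uses a folded block scalar so that long descriptions can be broken across lines, and the author list is where the open questions about ordering and about editors versus contributors would have to be settled.
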