Reproducibility - Cross-Atlantic GIScience Meeting - June - 2021
===

###### tags: `turingway` `Workshop` `GIScience` `reproducibility`

:::info
Lead: Fernando Benitez-Paez

:books: **Resources and useful links:**
- **Cuckoo:** Useful for keeping time in meetings or in your own work: https://cuckoo.team/collabcafe
- **HackMD:** https://hackmd.io/ Collaborative Markdown tool for creating shared guides
- **Markdown cheat-sheet:** https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
- **Emoji-cheat-sheet:** https://github.com/ikatyang/emoji-cheat-sheet#table-of-contents
- **Turing Way:** https://the-turing-way.netlify.app/
- **Learn Git:** https://learngitbranching.js.org/
:::

:dart: Agenda
---
| Time | Activity |
| -------- | -------- |
| 5 Min | Intro - Description of the Tools |
| 25 Min | Brief Presentation |
| 15 Min | Activity in HackMD |
| 10 Min | Discussion and close-out |

:wave: IceBreaker (3-5 mins)
---
*Please add your name on a new line below, and answer the icebreaker question:*

**:question: What is your top media recommendation of 2021 so far? (movie, TV, music, video game, book, podcast, whatever you like!)**

* Fernando Benitez | Thinking, Fast and Slow by Daniel Kahneman
* Jed | Not the hockey game I watched last night :(
* Solene Marion: Book The Invention of Nature: The Adventures of Alexander von Humboldt, the Lost Hero of Science
* Jack: Dune
* Vanessa | The Bold Type (Netflix Series)
* Urska: British Sewing Bee
* Mary: No suggestions!!!!
* Rhiannon: Starting Over - Chris Stapleton (Album)
* Aranya: The Ezra Klein Show
* Corallie Hunt | Podcast - BBC Desert Island Discs with Helen McCrory

:desktop_computer: Definition of Open and Reproducibility (4 min)
---
*Please write your name and your definition of "Open" & "Reproducibility" as it relates to data science in the sections below.*

**What is your definition of Open:**
* Solene: Free
* Jack: plain language summaries :fire:
* Urska: free and available, plus findable and usable :mount_fuji:
* Vanessa: Available, free and accessible
* Mary: Openly available for the public to consult.
* Corallie: downloadable, open-source, free, maintained
* Jed: freely downloadable to anyone via the WWW
* Aranya: something that is easily findable and accessible globally
* Rhiannon: Something with no barriers to access (paywalls, institutional login etc)

**What is your definition of Reproducibility:**
* Urska: if someone takes my data and code they will get the same thing as I did. This is different from replicability, which is when someone runs the same experiment on another population and gets the same result :smiley_cat:
* Jed: the combination of data, methodology, and sequencing of computer commands necessary to completely recreate a given piece of analysis.
* Vanessa: enough clear instructions and data so that someone can fully reproduce your research from beginning to results
* Mary: can be done again
* Aranya: Something that is clear enough to be done again
* Rhiannon: Ability to follow methods and reproduce the same result
    * Same data, same calculations, same result
* Jack: standardized tools and formats

:warning: What are the Barriers to Reproducibility (4 min)
---
*Please write your top barrier to reproducibility as it relates to **spatial data science** on the line below.*

**Main Barriers to Reproducibility**
* Jed: Sensitive location data, dynamic code/tool bases, it is not 'easy' enough
* Beate: not sharing data, limited computing power
* Corallie: opaque methods; data entry errors
* Urska: some data cannot be shared, e.g. health data tracking - anything that includes personal data from which people can be identified. How to ensure reproducibility in research that uses such data? :question:
    * Include fake or made-up data of the same format/characteristics
* Rhiannon: Bad code, unclear workflow, missing steps, not included with the paper.

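
One of the suggestions above, sharing fake data with the same format and characteristics as data that cannot be released, could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the field names (`id`, `lon`, `lat`, `value`), the sample records, and the helper `make_synthetic` are all hypothetical, and drawing values uniformly within the observed ranges is a deliberately crude stand-in for a proper synthetic-data or anonymisation method.

```python
import random

# Hypothetical stand-in for sensitive records that cannot be shared.
# Field names and values are assumptions made for this illustration.
real_records = [
    {"id": "p01", "lon": -2.80, "lat": 56.34, "value": 3.2},
    {"id": "p02", "lon": -2.79, "lat": 56.33, "value": 4.1},
    {"id": "p03", "lon": -2.82, "lat": 56.35, "value": 2.7},
]


def make_synthetic(records, n, seed=42):
    """Return n fake records with the same fields and numeric ranges."""
    rng = random.Random(seed)  # fixed seed, so the fake data is reproducible too
    numeric_fields = [k for k, v in records[0].items() if isinstance(v, float)]
    # Observed min/max per numeric field; fake values are drawn inside these.
    bounds = {
        k: (min(r[k] for r in records), max(r[k] for r in records))
        for k in numeric_fields
    }
    synthetic = []
    for i in range(n):
        row = {"id": f"synthetic_{i:03d}"}  # never reuse a real identifier
        for k in numeric_fields:
            low, high = bounds[k]
            row[k] = round(rng.uniform(low, high), 4)
        synthetic.append(row)
    return synthetic


synthetic_records = make_synthetic(real_records, n=5)
for row in synthetic_records:
    print(row)
```

Anyone can run shared code against `synthetic_records` to check that the workflow executes end to end, even though the numbers will not match the published results. Note that revealing exact minimum and maximum values can itself leak information, so real projects may need coarser bounds.
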

:handshake: Let's evaluate the reproducibility of our work (15 min)
---
*Please include your name and one of your projects/repos/code on the line below, and let others help you identify how it can be more reproducible.*

* Fernando Benitez | **MagGeo** | https://github.com/MagGeo/MagGeo-Annotation-Program
* Jed Long | wildlifeDI R Package | https://github.com/jedalong/wildlifeDI
* Urska: Mouse-Eye tracking paper in PLOS ONE, has all data + code as supp info, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0181818, code is also here: https://github.com/udemsar/EyeMouseInteraction
* Vanessa: Cycling stratification from crowdsourced data paper, code is available for replicability: https://findingspress.org/article/10828-where-to-put-bike-counters-stratifying-bicycling-patterns-in-the-city-using-crowdsourced-data
*

:closed_book: Close out, pluses and deltas (5 mins)
---
Before you leave, please add a key takeaway, something you liked (a plus), and something you’d change (a delta) below.

### Pluses
*
*

### Deltas
*
*
*