_Remote ReproHack Hackpad_
===
## Part of the N8 Northern Tour Series
### **#ReproHackRemote**
###### tags: `Reprohack` `hackpad`
:::info
- :earth_africa: **Remote**
- :calendar: **14th May 2020**
- :watch: **10:00 - 17:00 UTC+1**
- :spiral_note_pad: **Paper List:** https://sheffield-university.shinyapps.io/n8cir-reprohacks/
- :writing_hand: **Feedback form:** https://forms.gle/wcVn9UF2zX1g5XdJ8
- :arrow_forward: **Slides:** https://annakrystalli.me/n8cir-reprohacks/slides/#1
- :purple_heart: **Code of Conduct:** https://github.com/reprohack/reprohack-hq/blob/master/CODE_OF_CONDUCT.md
- :left_speech_bubble: Chat to us on Slack: https://reprohack-autoinvite.herokuapp.com/
# Agenda
**10:00 - Welcome and Intro to Blackboard Collaborate
10:10 - Ice breaker session in groups
10:20 - TALK: Daniel Nüst - Research compendia enable code review during peer review ([slides](https://codecheckers.github.io/slides/2020-05_ReproHack.html))
10:40 - Tips and Tricks for Reproducing and Reviewing.
11:00 - Select Papers, Chat and :coffee:
11:15 - Round I of ReproHacking (break-out rooms)
12:15 - Re-group and sharing of experiences
12:30 - LUNCH :pizza: :stew: :strawberry:
13:30 - TALK: Daniel Piqué - How I discovered a missing data point in a paper with 8000+ citations
13:45 - Round II of ReproHacking (break-out rooms)
14:45 :coffee:
15:00 - Round III of ReproHacking (break-out rooms) - Complete Feedback form
16:00 - Re-group and sharing of experiences
16:30 - TALK: Sarah Gibson - Sharing Reproducible Computational Environments with Binder ([slides](https://doi.org/10.5281/zenodo.3826152))
16:45 - Feedback and Closing**
### **Participants:**
***Please sign in (Affiliation / Twitter / GitHub)***
#### If you have a twitter handle, please add it!
* Anna Krystalli (University of Sheffield / @annakrystalli / @annakrystalli)
* Linda Nab (Leiden University Medical Center / @lindanab1 / @LindaNab)
* Esther Plomp (Delft University of Technology / [@PhDtoothFAIRy](https://twitter.com/PhDToothFAIRy) / [EstherPlomp](https://github.com/EstherPlomp))
* Daniela Gawehns (Leiden University / @dgawehns / DanielaGawehns)
* David Wilby (University of Sheffield / @DrDavidWilby / @davidwilby)
* Sarah Gibson (The Alan Turing Institute / @drsarahlgibson / @sgibson91)
* Raphaela Heil (Uppsala University / @RaphaelaHeil / @RaphaelaHeil)
* Peter Crowther (University of Manchester / / @merrygoat)
* Florencia D'Andrea (National Institute of Agricultural Technology - Argentina - @cantoflor_87)
* Alessandro Gasparini (Karolinska Institutet - Stockholm, Sweden / [@ellessenne](https://twitter.com/ellessenne) / [ellessenne](https://github.com/ellessenne/))
* Susannah Cowtan (University of Sheffield / @SuusJC/ sjcowtan)
* Lennert Schepers (Flanders Marine Institute/ [@SchepersLennert](https://twitter.com/SchepersLennert) / LennertSchepers)
* Anna Lohmann(Leiden University Medical Center/ @annloh/ https://github.com/annloh/)
* Daniel Nüst (Institute for Geoinformatics, University of Münster, Germany, [@nordholmen](), [@nuest]())
* Cliff Addison (University of Liverpool)
* Manhui Wang (University of Liverpool)
* Elio Campitelli (University of Buenos Aires / | @d_olivaw)
* Aleks Nenadic (Software Sustainability Institute, University of Manchester / @aleks_nenadic / @anenadic)
* Megan Stodel (BBC / @MeganStodel / https://github.com/MeganStodel)
* Jeremy Leipzig (Drexel University | @jermdemo | https://github.com/leipzig)
* Jenicca Poongavanan (University of Florida | @Jenicca_11)
* Cassio Amorim (SciGen.Report | @Scigen.Report | https://scigen.report)
* Salvador Fernández (Flanders Marine Institute | [@salvafern](https://twitter.com/salvafern))
:::
# :recycle: ReproHacking - Plan of Action
:computer: Form teams
---
Feel free to tackle papers individually or as teams.
:dart: Select papers
---
- Choose paper from list of proposed papers:
- :spiral_note_pad: **Paper List:** https://sheffield-university.shinyapps.io/n8cir-reprohacks/
- Register the paper selected and the participants reproducing below. You can copy, paste and edit the following template:
```
### **Paper:** <Title of the paper reproduced>
**Reviewers:** Reviewer 1, Reviewer 2 etc.
```
:books: Reproduce
---
- Attempt to reproduce papers from available materials and documentation
- Make notes about your experiences, in particular with respect to how easy it is to:
- :earth_africa: navigate the materials
- :repeat: reproduce the analysis
- :recycle: reuse the materials
:memo: Feedback to authors
---
* Fill in the author feedback form, documenting your experiences reproducing your chosen paper
- :writing_hand: **Feedback form:** https://forms.gle/wcVn9UF2zX1g5XdJ8
---
# Icebreaker Room Names
- **Group 1:** La Casa de Papel
- **Group 2:** Banff
- **Group 3:** Melbourne
- **Group 4:** Beach 🏖 🐋
- **Group 5:** Buenos Aires
- **Group 6:** Chai Latte
- **Group 7:**
- **Group 8:**
- **Group 9:**
- **Group 10:**
***
# Docker resources
Nüst, D., Sochat, V., Marwick, B., Eglen, S., Head, T., Hirst, T., & Evans, B. (2020, April 17). Ten Simple Rules for Writing Dockerfiles for Reproducible Data Science. https://doi.org/10.31219/osf.io/fsd7t
# Questions for Speakers
## Daniel Nüst:
_Link to slides:_ https://github.com/codecheckers/slides (PDF, HTML, Rmd source with notes)
- Do you see fields of research that are "ahead of the curve" regarding the use of research compendia?
- DN: IMHO: bioinformatics/computational biology (look for Peng's work, e.g., on reproducibility continuum)
- How do research compendia adapt to the first, mostly exploratory, steps in research, in which things (including code and function behaviour) are changing all the time?
- DN: The structure of a research compendium and the tools, such as rrtools, will help you through that process! You can capture changing state in version control, to be able to go back if something breaks. So, they adapt well and will evolve as the research advances.
- Do you know of any journals that require research compendia prior to publication / the review process?
- PC: Not specifically research compendia but JOSS (https://joss.theoj.org/) has a similar reviewing process for publication. It's more focussed on software rather than data though.
- DN: no, not as a requirement. A small (growing) number of journals are starting to increase the requirements around data and software, but none is using the term research compendium AFAIK.
For a list of journals that comply with [OSF TOP Guidelines](https://www.cos.io/blog/the-landscape-of-open-data-policies) at level 3, which effectively requires something like research compendia to check all boxes, see: https://osf.io/kgnva/wiki/home/
- What about checking style and code structure? That would be something that I check as a normal reviewer.
- (copying from the chat) Peter Crowther: _I think that in academic publishing we have more of a narrowly defined "correct" style. I think in code what is "correct" is a lot more broad._
- DN: Agree with Peter. Also, CODECHECK is not discouraging that kind of feedback, and you as a codechecker would probably still give feedback on that. Unlike with research manuscripts, we are just building up a good/common practice of what a code review in science should entail; _Do journals you review for provide good guidelines on things that you should check?_
## Daniel Piqué:
- Have you thought about contacting the original authors with your findings? Should the journal publish a "correction" with the "correct" figure (w/o the missing datapoint)?
## Sarah Gibson:
_Link to slides:_ https://doi.org/10.5281/zenodo.3826152
Have a look at the holepunch package for making R projects Binder-ready: github.com/karthik/holepunch
- Who pays for mybinder? - NumFOCUS, credits from cloud providers
- How about long-term reproducibility?
- SG: Good question! Who knows how long Docker, MRAN, etc, will be well-used tools for
- PC: You/your institution can host its own BinderHub and then you are to some extent more in control. Contact your local research IT people perhaps?
- Is there a way to "sync" the locally run versions of packages and the versions run in binder? (e.g. packrat, renv)? I would love to hear the wider community's thoughts on this.
- SG: I know (as a Pythonista) it's possible to use `pip freeze` to create a local copy of your `requirements.txt` file (I don't know what the R equivalent is). mybinder.org rebuilds the image with every push to GitHub, so as long as you keep your local and remote repos up-to-date with package versions, Binder will find those versions.
- Is this the way to find out how much of my code breaks under R 4.0 before I install it locally?
- SG: Maybe. In theory, yes. You could just include a `runtime.txt` file with R 4.0 in your GitHub repo. But Binder may also break when R 4.0 is released and then we'll have to do some work to accommodate those changes.
- Is there a way to use tools like this when you can't share the data by law (e.g. clinical trials, registry data, etc.)?
- SG: You can't use mybinder.org with private code or data at all. However, you can deploy a [BinderHub](https://binderhub.readthedocs.io) and configure it with the appropriate authenticators and access to private data/code. We have a [Data Safe Haven](https://www.turing.ac.uk/research/research-projects/data-safe-havens-cloud) project at the Turing for working with sensitive data and it would be incredibly awesome to deploy a BinderHub within that infrastructure!
- general question: 'work always in containers' -> when/how should we start working in containers (in which phase of the research)? What is the easiest way to start working in containers?
- SG: I think the earlier you start to think about making your environment easily reproducible (container or otherwise), the better. It'll be much harder to reproduce a mature, complex environment. Build your container alongside your research so they mature together.
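
On the question above about keeping local and Binder package versions in sync: the usual route is simply `python -m pip freeze > requirements.txt`. The same idea can be sketched in plain Python, under a few assumptions: the function names `pinned_requirements` and `write_requirements` are made up for this note, and the sketch only emits `name==version` lines, so unlike `pip freeze` it does not handle editable or VCS installs.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions.

    Hypothetical helper: a rough approximation of `pip freeze`, not a
    replacement for it.
    """
    pins = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip distributions whose metadata is missing or has no name.
        name = meta.get("Name") if meta is not None else None
        if name:
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the pinned list to `path`; return the number of lines written."""
    lines = pinned_requirements()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("".join(line + "\n" for line in lines))
    return len(lines)


if __name__ == "__main__":
    count = write_requirements()
    print(f"Pinned {count} packages in requirements.txt")
```

Committing the resulting `requirements.txt` and pushing it means the next Binder build installs the same versions you have locally, as long as you regenerate the file whenever your local environment changes.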
# Paper Registration
_Use the following template (also provided in "Plan of Action") and register your review below._
### **Paper:** <Title of the paper reproduced>
**Reviewers:** Reviewer 1, Reviewer 2 etc.
***
### **Paper 27:** Use of significance test logic by scientists in a novel reasoning task
**Reviewers:** Elio Campitelli
**Results:** https://github.com/richarddmorey/Morey_Hoekstra_StatCognition/issues/1
### **Paper 14:** pyKNEEr: An image analysis workflow for open and reproducible research on femoral knee cartilage
**Reviewers:** Raphaela Heil, Javier Moldon
### **Paper 3:** Bivariate spatial point patterns in the retina: a reproducible review
**Reviewers:** David Wilby
### **Paper 19:** A novel approach to modelling transcriptional heterogeneity identifies the oncogene candidate CBX2 in invasive breast carcinoma
**Reviewers:** Tuur Muyldermans, Jeremy Leipzig
### **Paper 2:** Spatial modelling of rice yield losses in Tanzania due to bacterial leaf blight and leaf blast in a changing climate
**Reviewers:** Salva Fernández, Sarah Gibson, Susannah Cowtan
### **Paper 8:** Resolving the Measurement Uncertainty Paradox in Ecological Management
**Reviewers:** Lennert Schepers, Daniel Nüst
https://github.com/boettiger-lab/pomdp-intro/issues/4
https://github.com/boettiger-lab/pomdp-intro/pull/5
### **Paper 4:** Bayesian determination of the effect of a deep eutectic solvent on the structure of lipid monolayers
**Reviewers:** Peter Crowther
### **Paper 21:** Supercurrent-induced Majorana bound states in a planar geometry
**Reviewers:** Aleks Nenadic, Cliff Addison, Cassio Amorim
### **Paper:** Mental Health and Social Contact During the COVID-19 Pandemic: An Ecological Momentary Assessment Study
**Reviewers:** Alessandro Gasparini, Linda Nab, Anna Lohmann, Esther Plomp
### **Paper 10:** Spectral measure of color variation of black-orange-black (BOB) pattern in small parasitoid wasps (Hymenoptera: Scelionidae), a statistical approach
**Reviewers:** Florencia D'Andrea and also David Wilby
### **Paper 11:** Comparisons of Citizen Science Data-Gathering Approaches to Evaluate Urban Butterfly Diversity
**Reviewers:** Megan Stodel
### **Paper 29:** Hyperparameter importance Across Datasets
**Reviewers:** Juan Bascur
***
# Regroup notes
<!-- Any other notes you'd like to add. -->
## Lunch Regroup
## Afternoon (final) Regroup
# Feedback
## :green_book: One thing you enjoyed
- I loved the talks! +3
- Blackboard colab worked great! Breakout rooms and chat were really good.
- The time is probably the only one that reasonably covers every timezone (morning in Americas, afternoon in Europe, night in Asia/Oceania).
- The basic collaboration framework was good and the talks were super.
- Really good day! I learnt several things not to do with my own code :wink:
- Loved the talks, the break out rooms were nice.
- The talks were awesome. I also think that the platform was very good for working in small groups.
- Loved the talks and the break out rooms, a lot of papers to choose from!
- Great that remote people could participate (this approach is also easily scalable to more people?) - thanks!
- Whale-watching on Docker beach :whale: :beach_with_umbrella:
- Great to already have a 'score' on the papers, so you can have a look at 'great examples' as well as 'challenges'
- timely reminders to drink enough coffee :coffee: :+1:
- this HackMD form, great way to learn *markdown* `syntax`
- Thank you to the hosts! It was a well run event. Something like this can become quite unproductive/unfocussed without structure and someone organising.
## :red_circle: One thing that can be improved
- I think technical issues are inevitable when online (I had a few), so perhaps the schedule should already build in delays for bad connections.
- The numbers of the submitted papers changed over the course of the past days, that was slightly confusing
- Chat functionality was rather suboptimal in this framework; maybe have another framework with more options in the background for that.
- Would be awesome to have a remote option even post-pandemic for people who are nowhere near such great events or are home-bound for other reasons.
- I had some connectivity issues with Blackboard Collaborate
- If there were more time, an intro on how to start using containers would be useful.
- I miss the more natural networking of an in-person event.
- .
- No biscuits.. (maybe that's just me) :cookie: