Boost your research reproducibility with binder

tags: turing-way Workshop External
  • Event: Boost your research reproducibility with binder
  • Date: 11 June, 2020 13:00 - 17:00 (GMT)
  • Instructors: Kirstie Whitaker, Sarah Gibson, Malvika Sharan
  • Contact: msharan@turing.ac.uk

Shared notes:

https://hackmd.io/@malvikasharan/BinderJune2020

Zoom

Non-verbal communication using Zoom buttons:

  • Options that you see:

  • Click on "Participants"

Agenda

Time Activity
13:00 - 13:10 Introductions
13:10 - 13:20 Introduction to the workshop and The Turing Way
13:20 - 14:30 Why you need a reproducible computing environment and how Binder can help
14:30 - 15:00 Break
15:00 - 16:00 Zero to Binder, a guided tour of building a Binder resource
16:00 - 16:30 Build your own Binder
16:30 - 17:00 Feedback, demo and closing

Introductions

Roll call:

Name / Pronouns / Affiliation / GitHub:

  • Malvika Sharan / she/her / The Alan Turing Institute / malvikasharan
  • Sarah Gibson / she/her / The Alan Turing Institute / sgibson91
  • Kirstie Whitaker / she/her / The Alan Turing Institute / KirstieJane
  • Nabila Rahman / - / Cardiff University / NabilaRahman
  • Ali Seyhun Saral / he/him / Max Planck Institute for Research on Coll. Goods / seyhunsaral
  • Catherine Sutherland / she/her / University of Edinburgh / catsutherland
  • Owen Dando / he/him / University of Edinburgh / lweasel
  • Zrinko Kozic / he/him / University of Edinburgh / zkozic
  • Fiona Grimm / she/her / The Health Foundation / fiona-grimm
  • Alex Handy / he/him / King's College London / AlexHandy1
  • Festus Nyasimi / he/him / ICIPE / Fnyasimi
  • Sarah Marzi / she/her / Imperial College London / SarahMarzi
  • Katie Emelianova / she/her / University of Edinburgh / katieemelianova
  • Delwen Franzen / she/her / QUEST center (BIH) Charité Universitätsmedizin Berlin / delwen
  • Andrea Pierré / he/him / Brown University / kir0ul
  • Jobin John / he/him / Chalmers University / jobindj
  • Xin He / he/him / University of Edinburgh / hxin
  • Dmitrijs Celinskis / he/him / Brown University / dcelinsk
  • Kristina Salontaji / she/her / Imperial College London / KristinaSalontaji
  • Dervis Salih / he/him / UCL / DSalih20
  • Nathan Skene / - / Imperial / nathanskene

Icebreaker:

Name / One fun app/software you have been using a lot during the lockdown (Zoom is not the right answer!)

  • Malvika / slack & netflix
  • Sarah / Elevate brain training
  • Kirstie / Signal and Whatsapp - I'm in better contact in lockdown with my friends than ever before!
  • Owen / https://en.boardgamearena.com
  • Nabila / edx.org & amazon prime & Witcher 1 (game)
  • Alex / Splitwise for shared house meals!
  • Festus / edx.org & Codewars
  • Sarah M / garageband
  • Ali / TypeRacer
  • Delwen / learning R!
  • Xin / BBC iplayer kid
  • Jobin / Zulip & Youtube+netflix
  • Dervis / Twitter
  • Emilia / Animal Crossing

Introduction to The Turing Way

GitHub, MarkDown - HackMD

Introduction to the tools and methods for this workshop

Talk by Kirstie Whitaker

Small Group Exercises:

https://github.com/alan-turing-institute/the-turing-way/blob/master/workshops/boost-research-reproducibility-binder/paired_examples.md

Take shared notes here:

  • Ex1: Binder does not take the environment file as default for running python.
  • Ex2: Different matplotlib versions in the two branches. Exactly the same code, but different environment files.
  • Ex3: Different sklearn requirements (requirements.txt) => a big difference in end result.
  • Ex4:
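
Ex2 and Ex3 both come down to unpinned or differing package versions. As a minimal sketch (the helper name `pinned_requirements` is hypothetical and not part of the workshop material), the versions installed in the current Python environment could be recorded in `requirements.txt` style like this:

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return one 'name==version' line per installed package.

    Packages that are not installed are reported as a comment line
    instead of raising, so the output stays a valid requirements file.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(f"# {name} is not installed in this environment")
    return lines


if __name__ == "__main__":
    # The package names are taken from the exercises; adjust to your project.
    print("\n".join(pinned_requirements(["matplotlib", "scikit-learn"])))
```

Saving the printed lines as `requirements.txt` in the repository is one way to make sure both branches (and Binder) install the same versions.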

Q&A section:

Minutes from the first half

Up

  • The examples were very insightful (+2)
  • EEEE
  • Friendly, supportive learning environment! (+1)
  • :)
  • The runtime environment in R example was amaaazing!!!
  • Clear explanation of the Turing Way - and lots to think about! Much more to reproducible science than I'd thought about
  • Have got a much better idea what the purpose and uses of Binder are now.
  • I really liked how the groupwork is handled using rooms
  • Good! (I'm here instead of another seminar, because it's been interesting). Looking forward to knowing more.
  • Very valuable to see a 'good practice' example of what a good, reproducible project is
  • I actually didn't care much about the package versions that I have been using as long as it works and now I understood how important it is to report them. +1000
  • The sk-learn example was very enlightening
  • Really great instruction and insights, great to have so much experience to draw from and great for answering all the different questions - definitely feel like I've got a really great overview of binder :)

Down

  • Zoom buttons make me feel dizzy :D it's always confusing
  • Maybe more time for those four examples. (I also agree here) +3
  • It would be good to have a real-world example showing the usage of Binder, such as the publication example Kirstie showed during the coffee break +1

Talk by Sarah Gibson

Please write down your name under the programming language the content of your GitHub repo contains or you are interested in:

  • Python: Delwen, Festus, Andrea, Dmitrijs, Jobin
  • R: Nabila Rahman, Fiona Grimm, Ali Seyhun Saral, Kristina Salontaji, Zrinko Kozic, Sarah Marzi, Nathan, Catherine Sutherland, Dervis
  • Julia:
  • Unix scripts: Katie Emelianova, Xin He, Owen Dando
  • No specific language:
  • Other:

Break out Groups:
R Group 1: Ali Seyhun, Nabila, Nathan, Fiona
R Group 2: Catherine, Kristina, Sarah, Dervis
UNIX Group: Katie, Owen, Xin, Zrinko
Python Group: Andrea, Delwen, Dmitrijs, Festus

Take shared notes here:

  • What are we supposed to do now (R group 2 asking)? (Also R group 1 asking)
    • Please try to binderise your code now :)
    • Kirstie Sorry folks! You're hopefully experimenting together with your own code!
    • Kirstie I'm in the main area so shout if you'd like me to come to your breakout room!

Q&A section:

  • Is 2048 MB a memory limit on mybinder.org?
    • Hard limit is 2 GB, better performance for 1 GB
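
To get a rough idea of whether an analysis fits within these limits before pushing it to mybinder.org, peak memory can be measured locally. The sketch below is only an approximation: it uses the standard-library `tracemalloc`, which counts allocations made through Python's allocator, and it assumes "1 GB" and "2 GB" mean 1 GiB and 2 GiB. The function names are hypothetical.

```python
import tracemalloc

GIB = 1024 ** 3


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return the peak number of bytes traced by tracemalloc."""
    tracemalloc.start()
    try:
        func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def classify(peak, soft=1 * GIB, hard=2 * GIB):
    """Compare a peak value against the soft (1 GB) and hard (2 GB) thresholds."""
    if peak > hard:
        return "over the hard limit"
    if peak > soft:
        return "above 1 GB, may perform poorly"
    return "ok"


if __name__ == "__main__":
    # Stand-in workload: replace with a call into your own analysis.
    peak = peak_memory_bytes(lambda: bytearray(50_000_000))
    print(f"peak: {peak / GIB:.2f} GiB -> {classify(peak)}")
```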

Build your own Binder: Breakout discussion and hands-on session

Take shared notes here:

Q&A section:

Report out: Shared insights

Final structuring and writing

Post links to your Binder-ised GitHub repositories here:
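
A mybinder.org link for a GitHub repository follows the pattern `https://mybinder.org/v2/gh/<user>/<repo>/<ref>`, and adding `?urlpath=rstudio` opens RStudio instead of the default notebook interface. The helper below is a small sketch for composing such links; the function name `binder_url` and its parameters are illustrative only.

```python
from urllib.parse import quote, urlencode


def binder_url(user, repo, ref, urlpath=None, host="https://mybinder.org"):
    """Build a Binder launch link for a GitHub repository.

    ref is a branch, tag or commit; urlpath (e.g. "rstudio") selects the
    interface that opens after launch.
    """
    url = f"{host}/v2/gh/{quote(user)}/{quote(repo)}/{quote(ref, safe='')}"
    if urlpath:
        url += "?" + urlencode({"urlpath": urlpath})
    return url


if __name__ == "__main__":
    # The ref "HEAD" is an assumption; use the branch or tag you want to launch.
    print(binder_url("binder-examples", "r-conda", "HEAD", urlpath="rstudio"))
```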

Q&A section:

  • Building RStudio takes a long time. Can I set the build going from the command line (e.g. from a remote cluster), so I can go away and shut down my PC in the meantime?

    • Not from the command line, no
      Using conda to build R takes a bit less time because conda installs pre-built binaries, so you don't have to build packages from source: https://github.com/binder-examples/r-conda
  • Still struggling to get RStudio running on Binder (tried to create the URL) -> tried the URL path and doesn't work :(

  • I have been working with RMarkdown for reproducible analysis tutorials - how does that integrate with Binder?

    • I think they should run in RStudio, but Binder wouldn't be able to generate the PDF in a pop-out window as it's serverless
  • A little off topic - but any suggestions for intro to git/github tutorials to get started with version control?

  • Can postBuild be used to get a number of data files in different formats from a Zenodo link?

    • I think so? I don't think it's been tried before, so please tell us what you find!
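
As a starting point for trying this out, here is an untested-on-Binder sketch of a `postBuild` script written in Python (a `postBuild` file can be any script with a shebang line). It assumes the files are reachable as direct download links; `DATA_URLS` is a placeholder to fill in with the links from your own Zenodo record, and `fetch_files` is a hypothetical helper.

```python
#!/usr/bin/env python3
"""Sketch of a postBuild script that downloads data files listed by URL."""
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Placeholder: add the direct download links for your data files here.
DATA_URLS = []


def fetch_files(urls, dest_dir="data"):
    """Download each URL into dest_dir, named after the last URL path segment."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / Path(urlparse(url).path).name
        urlretrieve(url, str(target))  # the file format does not matter here
        saved.append(target)
    return saved


if __name__ == "__main__":
    for path in fetch_files(DATA_URLS):
        print(f"saved {path}")
```

Because the files are copied byte for byte, mixed formats (CSV, images, archives) are handled the same way; large downloads will lengthen the build and count towards the image size.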

Report out: Shared insights

Feedback, demo and closing

Key take away

  • runtime.txt (for R) is needed to load packages from CRAN as they were available on THAT specific day. A limitation of R on Binder is that it can't load different versions of packages; it gets a snapshot from CRAN.
  • Binder-izing my project asap
  • That Binder is all about communication of analysis of results, rather than encapsulating the whole of an extended analysis.
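
For the first takeaway: the R runtime file for Binder is a one-line `runtime.txt` of the form `r-<YYYY>-<MM>-<DD>`, where the date selects the CRAN snapshot. A small sketch for generating that line (the helper name `runtime_txt_line` is hypothetical):

```python
from datetime import date


def runtime_txt_line(snapshot: date) -> str:
    """Return the single line of a runtime.txt pinning R to a CRAN snapshot date."""
    return f"r-{snapshot.isoformat()}"


if __name__ == "__main__":
    # Example: pin to the day of this workshop.
    print(runtime_txt_line(date(2020, 6, 11)))  # prints r-2020-06-11
```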

Pluses

  • Fantastic tutorial, really engaging sessions. Loved the first examples in the breakout rooms. Soooo much material!
  • Really helpful, everything clearly explained. Examples of "failures" of reproducibility very eye-opening
  • Really impressed by how you managed to help participants that were stuck by quickly channeling them into breakout rooms! The exercises at the beginning really drive home the point on why we need to learn about these tools.
  • I enjoyed this tutorial. Lots to take away. I will be sharing what I learned today with my lab
  • Useful tutorial, I liked the practical elements. Great tutors.
  • Thought it was awesome. It can be daunting to be introduced to this first time and you made that a really nice experience. I'm excited to apply it to my work!

Deltas

  • Maybe have a more complex shared example to work through or suggest to people in advance to bring code that they want to "binderise"/maybe go through more complex scenarios like using conda envs
  • Have a more complete R example, using the NAMESPACE/DEPENDENCY structure which is standard for R. I'm a bit unclear still on what the actual required files are? Is there a document that spells out that it's OK to use environment.yml, runtime.txt etc?
  • I would have benefited from examples drawing on more complex datasets (not necessarily huge files but a variety of different types of files)
  • Was a little lost with the first exercise. A little more time to get acquainted with people, understand the instructions and start working would help.
  • Would actually be great if the session were a bit longer, with a little more time for each section.
  • The hands-on tutorial was awesome and well explained, but I found it difficult to properly listen and implement it at the same time (maybe a quick run through first followed by implementation, if there is sufficient time?)
  • It would be great to see examples of code/data relevant to us biologists.

Next steps:

  • check with Jobin for feedback

Connect with us!

We love hearing about how you're using The Turing Way.
Stay in touch through one of the many different pathways below!