# resBaz^2^ Unconference—empowering lab groups to adopt reproducible research practices
Unconference schedule/sign-up: https://docs.google.com/spreadsheets/d/1t4QC5fVFAnncnYVsLWVI6hrgW2L974URplWAVqLTxuc/edit#gid=0
<!--
## Tasks
- [x] Eric emails past fall workshop attendees
- [ ] Eric puts together list of examples of reproducible research practices
- [ ] Lia emails other data science ambassadors
-->
## Goals
- Brainstorm barriers and how to lower them (what has worked for you?)
- Brainstorm how to get lab buy-in
- How to enable that one reproducibility enthusiast trainee to "convert" their lab?
## Attendees
Please add your name, email address, and GitHub username (if you're interested in getting updates on this topic)
- Eric Scott (ericrscott@arizona.edu) (Aariq)
- Lia Ossanna
- Andrew Antaya
- Devin Bayly
- Brandon Jernigan (brandonjernigan@arizona.edu)
- Waldo Guzman (waldogb@arizona.edu)
## Agenda
1. What are reproducible research practices?
- Using code instead of point-and-click software
- Version control (e.g. with git)
- Collaborative coding (e.g. with GitHub)
- Code review
- Reproducible reports and manuscripts (e.g. with Quarto)
- Archiving data and code (e.g. with Zenodo)
- [Semantic commits](https://www.conventionalcommits.org/en/v1.0.0/) (including "tags" in commit messages)
2. Introductions
- Name
- Current role / job (student, faculty, staff, industry, etc.)
- What's a reproducible research practice you wish your group engaged in more?
3. Discussion
1. What are barriers to *lab groups* adopting reproducible research practices?
2. How can we (reproducibility enthusiasts, educators, etc.) help lower those barriers?
3. Best ways to get "buy-in" from entire lab groups
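
To make the semantic commits item above concrete: the Conventional Commits format puts a short "tag" (a type such as `fix` or `docs`, optionally followed by a scope) at the start of each commit message. The following is a minimal Python sketch, not an official tool, that checks whether a commit header follows the `type(scope): description` pattern. The function name `parse_commit_header` and the regular expression are illustrative assumptions; restricting types to lowercase letters is a simplification, and the sketch ignores message bodies and footers.

```python
import re

# Simplified pattern for a Conventional Commits header line:
#   type, optional (scope), optional "!" for breaking changes, ": ", description
HEADER = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_commit_header(line):
    """Return the parts of a commit header, or None if it does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    return {
        "type": match["type"],
        "scope": match["scope"],  # None when no scope is given
        "breaking": match["breaking"] is not None,
        "description": match["description"],
    }


if __name__ == "__main__":
    for header in ["fix(plots): correct axis labels", "updated stuff"]:
        print(header, "->", parse_commit_header(header))
```

A lab could run a check like this by hand or in a Git hook, but simply agreeing on a handful of tags (for example `fix`, `feat`, `docs`) is probably enough to get started.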
## Barriers
What prevents labs from adopting reproducible research practices?
- Not everyone in the lab has a programming background (e.g. Box is easier than git)
- Wanting drag and drop convenience
- And not knowing that GitHub offers drag and drop file uploads
- Lack of awareness
- Lack of incentive structure for learning reproducible research practices
## Ways forward
How can we enable lab groups to adopt reproducible research practices?
- Maybe don't even call it "reproducibility" because there might be push-back
- Point out that code availability and other reproducible research practices are increasingly required by publishers
- **Case studies**—concrete, real-world examples of how reproducible research practices have saved time and money for labs. These could be good conversation starters and convince PIs to get involved.
- Remind PIs that these are important transferable skills
- e.g. git, creating websites, project portfolios, documentation for lab
GitHub specifically:
- Possibly get sponsorship or collaboration from GitHub to create a "GitHub Corps" on campus: individuals who train lab groups specifically on using Git and GitHub.
- Teach the "goat path" (i.e. just enough to get by)
- Show how PIs and collaborators who don't code can still use GitHub
- Use GitHub for project management
- GitHub wiki
- Issues and Discussions
Examples to look to:
- The National Library of Medicine has a reproducibility training program that Waldo and his PI are both *required* to participate in.
- NASA Openscapes is doing something like this as well—a reproducibility bootcamp for lab groups
## Possible training offerings ("lab guides")
1. Make a template research compendium as a lab group
2. Lab website (e.g. with GitHub pages and/or Quarto)
3. Lab handbook/wiki
4. Project management with GitHub
5. Establishing a practice of peer code review
6. Run a lab ["reprohack"](https://www.reprohack.org/)
7. Project log / lab notebook
- Researchers in a lab keep a running log of project updates using markdown
- These notes are centralized (e.g. simple: pushed to the same GitHub repo; fancy: rendered to a Quarto blog)
- PI can just check the log to get updates without having to deal with emails and meetings
8. GitHub for non-coders
- Ways that GitHub can be useful even when collaborators are non-coders
- Issues / Discussions and setting up email notifications
- Project boards
- GitHub pages for an auto-updating report that collaborators can see
- Uploading / downloading files
- Library Carpentry may have some lessons for this
9. Code style guides (`styler`, `lintr`)
- How to choose, customize, and implement a style guide
10. Literate programming (RMarkdown, Quarto, etc.)
11. Archiving code and data
- E.g. with UA ReDATA or GitHub + Zenodo
12. Docker / Binder
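
As an illustration of the project log idea (item 7 above), here is a minimal Python sketch of how a lab member might append a dated entry to a shared markdown log before committing and pushing it to the lab's repository. The file name `project-log.md`, the heading layout, and the function names are all hypothetical choices for this sketch, not something the group agreed on.

```python
from datetime import date
from pathlib import Path


def format_entry(author, text, day):
    """Build one markdown log entry with a dated heading."""
    return f"\n## {day.isoformat()} ({author})\n\n{text.strip()}\n"


def append_log_entry(log_path, author, text, day=None):
    """Append an entry to the log file, creating the file if needed."""
    day = day or date.today()
    path = Path(log_path)
    if not path.exists():
        # Start a new log with a top-level title (hypothetical layout).
        path.write_text("# Project log\n", encoding="utf-8")
    entry = format_entry(author, text, day)
    with path.open("a", encoding="utf-8") as log_file:
        log_file.write(entry)
    return entry


if __name__ == "__main__":
    append_log_entry(
        "project-log.md",
        "Trainee name",
        "Re-ran the cleaning script on the updated field data.",
    )
```

Because the log is plain markdown, a PI can read it directly on GitHub without running any code, and the same file could later be rendered with Quarto if the lab wants the "fancy" version.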
## GitHub features that labs should know about
Much of our discussion centered on GitHub features, so here is a partial list of features that lab groups may find useful:
- Version control
- GitHub pages
- Project management
- Issues
- Discussions
- Kanban
- Wiki
- Pull request reviews
- VS Code / Codespaces
- Drag & drop upload
- Interacting with other projects
- GitHub organizations
- Template repositories
- LFS
- Private repositories
## Needs
"User stories" describing the needs of different lab members
*Numbers refer to the offerings listed in the "Possible training offerings" section above*
Principal Investigators:
- Lab internet presence: 2, 4
- Project continuity: 3, 6, 4, 1
- Continue receiving grants: 11
- Tap into larger network of code projects: 5, 7, 8
- Time savings: 1, 3, 4, 8, 9, 11
Lab trainees:
- Develop skills to demonstrate to future employers
- Structure in a lab (for own work, for lab's knowledge, tasks from PI): 2, 3, 4, 6, 10
- Capability to recover from errors: 11
- Freedom to experiment: 11
- Tap into larger network of existing code: 5
- Time savings: 11
Research staff:
(we didn't get time to work on this user story)
## Phases
- "Reality check" with trusted partners
- Contact PIs and trainees we've worked with and ask them to prioritize the potential offerings in some way.
- Collect examples for "case studies" described above
- Student summer learning logs with Read the Docs
- Lab websites
- Project log (I think Andrew mentioned this, would love to see an example)
- Collect personal testimonials from users
- Mature incarnations
- A website with materials that could be used by lab members to run a lab meeting on a topic.
- Carpentries lessons
- TikTok examples highlighting needs and offerings
## How to structure "curriculum"(??)
- Goals: Why are we doing this? What is the objective?
- Motivates and helps select training materials
- Successful examples
Repo: https://github.com/cct-datascience/lab-trainings