# Foundational Open Science Skills (FOSS) Lesson 6: Reproducibility I
:::info
**Date**: `2024-02-29`
**Today's Lead Instructor:** Michele, Jeff
**Today's Helpers:** Tina
**Course Website:** https://foss.cyverse.org/06_reproducibility_i
**Hack(pad)-of-Hack(pads):** https://hackmd.io/y9XyyinFToOJS3XPJ2TtGg
**Instant Feedback:** (please complete before you leave class) [Complete Form](https://docs.google.com/forms/d/e/1FAIpQLSeVyEB8sU99Mn4IuzQ561Crp7v_wDl-yEcD2iutBxXRfrHo-Q/viewform?usp=sf_link)
:::
## :stopwatch: Agenda
### Warm-up:
#### Questions & Comments about Open Science left over from last week?
---
## CyVerse User Accounts
- jgillan
- cosimichele
- ctirambulo
- maryahern
- taolorunnisola
- keherder
- cyversefan
- valeriemilici
- cbilinski
- sebritton
- nbbarba
- owanca9hosi
- syates
- victorandreev
- rjramos
## Discussion Q/A
### Where do you stand on the Reproducibility spectrum?
### Discussion Set 2
**Have you ever had any hurdles to reproducing your work?**
- Have you ever run into a problem that prevented you from generating the same results, figures, analyses as before?
Yes, software versions upgraded (R packages)
I forgot which buttons in which order I clicked
initial developer passed away, with archaic code/language that took a while to figure out
+/-, figured it out eventually
- Have you ever lost time trying to figure out how you (or a collaborator) got a particular result?
-
Yes. A collaborator created a messy spreadsheet only and then left the project. No metadata, no code.
- A collaborator had color codes in the Excel file! And no definitions of what the colors meant.
Yep, involved a lot of emailing back and forth to make sense of the mess
Collaborator used proprietary software I did not have access to. Lots of time with SAS and R open simultaneously.
- What were the issues you ran into, and how might you have solved them?
Kept a version of a hard drive disconnected from the internet, locked in a certain state, to be able to replicate the results reliably: a primitive version of Docker.
I tried to replicate a regression model that was done in SAS, but I was using Stata. I couldn't get it to work.
I make notes underneath my final (best fitting) models that describe the models that did not work well so that I don't try them again.
### Discussion Set 3
**What are some tasks you have automated or want to automate?**
- Have you ever successfully automated a task?
I clean my data with code, not manually, and I can make multi-panel figures with code (not ppt).
I have used functions in python to automate tasks
Planning on doing it, as I now have too many reports to run.
No but I would like to! Especially downloading/naming/organizing files
I miss using Ghost Mouse to automate activities on my computer, mostly for fun when I was a kid.
Instead of building separate datasets for the various groups of data I have to analyze, I created scripts that generate them. This ended up saving a bunch of time in the long run.
- Found a way to make something scale or take less time?
- What was the task, and how did you do it?
- Are there any things you wish you could automate?
I'd like to automate my analyses
- What are some barriers to automating them?
Usually cross-platform or image interaction barriers, so it is hard to get it to transition or interact with different environments
The upfront time costs of figuring out how to do it
Time
Time to learn: finding resources to teach yourself
Have to have buy-in from the whole team
---
## Tutorial
:::info
During the second section of the class, we are going to cover a tutorial on reproducibility, using Conda and Nextflow as a proof of concept and an example of how to approach reproducible science.
:::
---
## Homework
If you have been following the homework since week 1, congratulations! This is the last homework assignment. Since today's lesson is about replicating code, we invite you to reflect on your code and your GitHub Pages-hosted website. The goal for this week is to make your work, or part of your work, replicable.
In your GitHub repository, you should already have "Home", "Governance & Operations", and "Data Management Plan" tabs, with their corresponding files (e.g., "Home" renders from the `index.md` file).
1. Add a new tab called "Codebase":
- go to the `nav` section in `mkdocs.yml` and add "Codebase"
- Instead of having a single page for "Codebase", add 2 subpages, "Installation and Requirements" and "User Manual", as follows:
```
nav:
  - Home: index.md
  - Codebase:
      - Installation and Requirements: installation.md
      - User Manual: manual.md
```
- This creates 2 subsections in the website. Create the 2 matching files, `installation.md` and `manual.md`, in `docs/`:
- `installation.md` should explain how to install the software
- `manual.md` should explain how to run the software
2. In the GitHub repo, create a folder named `code/` (outside of `docs/`). Add your code/work here, along with any example files you may want. Use the [FOSS Reference Hub](https://github.com/CyVerse-learning-materials/foss-reference-hub) as an example of what files and structure are needed!
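
If you would rather script the file and folder creation from the steps above than do it by hand, here is a minimal Python sketch. It assumes you run it from the root of your repository. The function name `scaffold` and the placeholder page text are hypothetical, not part of the course materials, and the script does not edit `mkdocs.yml` for you.

```python
from pathlib import Path

# Hypothetical placeholder text; replace it with your own documentation.
PAGES = {
    "installation.md": "# Installation and Requirements\n\nDescribe how to install the software.\n",
    "manual.md": "# User Manual\n\nDescribe how to run the software.\n",
}


def scaffold(repo_root="."):
    """Create docs/ pages and a code/ folder; never overwrite existing files."""
    root = Path(repo_root)
    docs = root / "docs"
    code = root / "code"
    docs.mkdir(parents=True, exist_ok=True)
    code.mkdir(parents=True, exist_ok=True)

    created = []
    for name, text in PAGES.items():
        page = docs / name
        if not page.exists():  # keep anything you have already written
            page.write_text(text, encoding="utf-8")
            created.append(page)
    return created


if __name__ == "__main__":
    for path in scaffold():
        print(f"created {path}")
```

Note that Git does not track empty directories, so `code/` will only appear on GitHub once you add at least one file to it.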
---
::: success
**Instant Feedback:** (please complete before you leave class) [Complete Form](https://docs.google.com/forms/d/e/1FAIpQLSexFuhcsrTR4VGrXUB4yGLI1QCi7shIC9nQOIlqEZSxWizgzQ/viewform?usp=sf_link)
:::