# Reproducible workflows in R
(A new draft started on 4.5.2023 after [the old one in HedgeDoc disappeared](https://siili.rahtiapp.fi/renv_reprod_tutorial).)
### Things that can affect reproducibility of R workflows:
* data management
* R scripts
* data storage and accessibility
* R version
* R package versions
* tip: define the R version when loading the r-env module on Puhti: r-env/432 etc.
* operating system, its version and other underlying parts
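
Several of the items above (R version, package versions, operating system) can be written down automatically at the start of an analysis. The following is a minimal sketch using only base R; the function name `record_environment` and the file name `session_record.txt` are hypothetical choices, not an established convention.

```r
# Write the R version, platform, OS and loaded package versions to a text file.
# `record_environment` and the default file name are hypothetical.
record_environment <- function(path = "session_record.txt") {
  sys <- Sys.info()
  info <- c(
    paste("R version:", R.version.string),
    paste("Platform:", R.version$platform),
    paste("OS:", sys[["sysname"]], sys[["release"]]),
    "",
    # sessionInfo() lists attached and loaded packages with their versions
    capture.output(sessionInfo())
  )
  writeLines(info, path)
  invisible(info)
}

# Usage: tempdir() is used here only so the sketch does not touch a real project
record_path <- file.path(tempdir(), "session_record.txt")
record <- record_environment(record_path)
```

Storing this file next to the results gives a plain-text record that can be compared later if the output changes.
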
### Topics to cover:
* R Markdown and Quarto
* projects in R
* help organize multiple projects
* make version control easier
* [https://r4ds.had.co.nz/workflow-projects.html](https://r4ds.had.co.nz/workflow-projects.html)
* [https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects](https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects)
* don't save workspace (save objects and scripts instead)
* keep original data untouched - modified data should be a separate copy
* R script reproducibility
* commenting
* file paths
* general readability
* functions for repeating sections of code
* set.seed()
* aim for scripts that can produce the output again at any time (instead of relying on storage of output)
* version control
* Git
* Linking RStudio with GitHub
* [https://happygitwithr.com/](https://happygitwithr.com/)
* renv
* the ultimate tool for reproducibility
* BUT be careful when using on Puhti
* containers
* package versions
* packages on Puhti tied to a specific date
* sessionInfo()
* Posit Public Package Manager snapshots
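
As a sketch of the snapshot idea from the package versions item: Posit Public Package Manager serves dated CRAN snapshots, so the repository used by `install.packages()` can be pinned to a date. The date below is only an illustrative assumption, and whether this fits together with the date-tied packages on Puhti is not covered here.

```r
# Pin the CRAN repository to a dated Posit Public Package Manager snapshot.
# Assumption: the date is illustrative; use the date that matches your analysis.
snapshot_date <- "2023-05-04"
snapshot_repo <- paste0("https://packagemanager.posit.co/cran/", snapshot_date)

# Packages installed after this call come from the snapshot, not from current CRAN
options(repos = c(CRAN = snapshot_repo))

# Check which repository is active
getOption("repos")
```

Setting this at the top of a script (or in a project-level `.Rprofile`) keeps the choice visible and under version control together with the code.
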
### Structure draft
* R versions on Puhti
* R package versions on Puhti
* minimum information to record for reproducibility
* light-weight tools and tips for R reproducibility
* stand-alone script principle
* version control
* general script tips
* heavy-weight tools for R reproducibility
* renv
* containers
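
The light-weight tips (stand-alone script, `set.seed()`, file paths, functions for repeated code) could be combined roughly as follows. This is a sketch under assumptions: the directory layout, file names, seed value and the function `summarise_by_group` are all hypothetical, and `tempdir()` stands in for a real project root so the example runs anywhere.

```r
# Stand-alone script sketch: fixed seed, relative paths built with file.path(),
# a reusable function, and output that is regenerated on every run.
set.seed(42)  # hypothetical seed; any fixed value makes random steps repeatable

# Hypothetical project layout; tempdir() stands in for the project root
project_dir <- tempdir()
raw_dir <- file.path(project_dir, "data", "raw")
out_dir <- file.path(project_dir, "output")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Simulated input so the sketch is self-contained; in a real project the
# original data already exists and is only read, never overwritten
raw_file <- file.path(raw_dir, "measurements.csv")
write.csv(
  data.frame(group = rep(c("a", "b"), each = 5), value = rnorm(10)),
  raw_file,
  row.names = FALSE
)

# A function replaces copy-pasted sections of code
summarise_by_group <- function(data) {
  aggregate(value ~ group, data = data, FUN = mean)
}

raw <- read.csv(raw_file)
result <- summarise_by_group(raw)

# Derived output goes to a separate directory and can be recreated at any time
out_file <- file.path(out_dir, "group_means.csv")
write.csv(result, out_file, row.names = FALSE)
```

Because the script creates everything it needs from the raw data, the output directory can be deleted and rebuilt by rerunning the script.
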
### Useful links
* https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html
* https://agstats.io/post/reproducible-r/
* https://raps-with-r.dev/