# Reproducible workflows in R

(A new draft started on 4.5.2023 after [the old one in Hedgedoc disappeared](https://siili.rahtiapp.fi/renv_reprod_tutorial).)

### Things that can affect reproducibility of R workflows:

* data management
* R scripts
* data storage and accessibility
* R version
* R package versions
    * tip: define the R version when loading the r-env module on Puhti: r-env/432 etc.
* operating system, its version and other underlying parts

### Topics to cover:

* R Markdown and Quarto
* projects in R
    * help organize multiple projects
    * make version control easier
    * [https://r4ds.had.co.nz/workflow-projects.html](https://r4ds.had.co.nz/workflow-projects.html)
    * [https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects](https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects)
* don't save the workspace (save objects and scripts instead)
* keep the original data untouched; modified data should be a separate copy
* R script reproducibility
    * commenting
    * file paths
    * general readability
    * functions for repeating sections of code
    * `set.seed()`
    * aim for scripts that can produce the output again at any time (instead of relying on storage of output)
* version control
    * Git
    * linking RStudio with GitHub
    * [https://happygitwithr.com/](https://happygitwithr.com/)
* renv
    * the ultimate tool for reproducibility
    * BUT be careful when using it on Puhti
* containers
* package versions
    * packages on Puhti are tied to a specific date
    * `sessionInfo()`
    * Posit Public Package Manager snapshots

### Structure draft

* R versions on Puhti
* R package versions on Puhti
* minimum information to record for reproducibility
* light-weight tools and tips for R reproducibility
    * stand-alone script principle
    * version control
    * general script tips
* heavy-weight tools for R reproducibility
    * renv
    * containers

### Useful links

* https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html
* https://agstats.io/post/reproducible-r/
* https://raps-with-r.dev/
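
### Example sketch: a stand-alone script

Several of the script-level tips in the outline (relative file paths, functions for repeated code, `set.seed()`, saving objects instead of the workspace, recording `sessionInfo()`) could be combined roughly as follows. This is only a minimal sketch using base R: the directory name `output`, the seed value, the file names and the helper `summarise_sample()` are all hypothetical and chosen for illustration.

```r
# Fix the random seed so that steps using random numbers give the same
# result on every run. The value 42 is arbitrary.
set.seed(42)

# Build paths relative to the project root instead of hard-coding
# absolute paths. "output" is a hypothetical directory name.
output_dir <- file.path("output")
dir.create(output_dir, showWarnings = FALSE, recursive = TRUE)

# Wrap a repeated step in a function instead of copy-pasting code.
# summarise_sample() is a hypothetical helper, not part of any package.
summarise_sample <- function(n) {
  x <- rnorm(n)
  c(mean = mean(x), sd = sd(x))
}

result <- summarise_sample(100)

# Save the object explicitly rather than relying on a saved workspace.
saveRDS(result, file.path(output_dir, "result.rds"))

# Record the R version, platform and loaded package versions next to
# the output, so the environment can be checked later.
writeLines(
  capture.output(sessionInfo()),
  file.path(output_dir, "session_info.txt")
)
```

Because the script sets its own seed and writes both the result and the session information, rerunning it from the project root should regenerate the same output, provided the R and package versions match those recorded in `session_info.txt`.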