# reproducible workflows/collaboration tools brain dump
I created [this document](https://rpubs.com/bbolker/3153) a few years ago about some of the connections and ideas I found interesting among topics such as *literate programming*, *workflow tools*, and *collaborative tools*. I still think the ideas are interesting, but a fair number of the tools are out of date. Here I will just list some categories I think are useful and some tools that fall into those categories.
It's a little alarming how rapid the turnover is (dead links, discontinued projects, etc.).
# Categories
## Tools for reproducible reports
(I'm using this term rather than *literate programming*, which I will save for the old Knuth-style (cweb/noweb) concept.)
- Sweave/Rnw
- Rmarkdown and its extensions (bookdown, pagedown, blogdown ...)
- [Quarto](https://quarto.org/) (a new-ish more language-agnostic tool)
- [R `brew` package](https://cran.r-project.org/web/packages/brew/index.html) (templated report generation)
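As a rough illustration of what these tools have in common, here is a minimal sketch of templated report generation: the numbers in the text are computed by code rather than typed by hand, so the report updates whenever the data change. It uses only Python's standard library; the template text and the function name `render_report` are made up for this example and do not correspond to any of the tools listed above.

```python
from statistics import mean
from string import Template

# Hypothetical report text; $n and $avg are filled in from the data.
REPORT = Template("We analysed $n observations; the mean was $avg.")


def render_report(values):
    """Fill the template with statistics computed from `values`."""
    return REPORT.substitute(n=len(values), avg=f"{mean(values):.2f}")


if __name__ == "__main__":
    print(render_report([1, 2, 3, 4]))
```

Real tools (Sweave, Rmarkdown, Quarto, `brew`) do much more, such as embedding figures and tables and caching results, but the core idea is the same substitution of computed values into prose.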
## Notebooks
A different flavour of reproducible analysis/reporting, generally more interactive and a little less document-like
- Rmarkdown in notebook mode
- Jupyter notebooks
- [BeakerX](https://github.com/twosigma/beakerx) notebooks (Jupyter nb/lab extensions)
- Emacs [org-mode-babel](https://orgmode.org/worg/org-contrib/babel/)
## Collaborative authoring
What is the target document format (LaTeX, markdown, ...)? Is there a locking mechanism, or does the system allow live/simultaneous updating?
- hackmd.io (e.g. this document)
- Overleaf (LaTeX-focused, can also sort of do markdown)
- [manuscripts.io](https://www.manuscripts.io/about/) - like Overleaf, but even less LaTeX-y
- Google docs
- via version control (Git etc.) or shared spaces (Dropbox/Google Drive/etc.)
## Workflow tools
Dependency management between workflow steps (*not* between packages/libraries), build targets, ...; capabilities for branching workflows, remote execution, ...?
- [`make`](https://www.gnu.org/s/make/manual/make.html) (the grandparent)
- (Data/Software carpentries lessons)
- [`shellpipes`](https://github.com/dushoff/shellpipes) a package by JD that helps R interact with `make`.
- `targets`: an R-specific framework
- `snakemake`: Python-based (of course)
- [Sumatra](https://pythonhosted.org/Sumatra/) (mostly Python)
- heavier tools like Kepler
- [DrWatson](https://juliadynamics.github.io/DrWatson.jl/dev/) (Julia)
- Galaxy (bioinformatics)
- [Apache Airflow](https://airflow.apache.org/)
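The dependency idea behind `make` and its descendants can be sketched in a few lines: a target is rebuilt only if it is missing or older than one of the files it depends on. The following is a minimal sketch of that rule in Python, assuming file modification times are a good enough signal; the function names `is_stale` and `build` are hypothetical and are not the API of any tool listed above.

```python
import os


def is_stale(target, dependencies):
    """Return True if `target` is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def build(target, dependencies, recipe):
    """Run `recipe` (a zero-argument callable) only when `target` is stale.

    Returns True if the recipe ran, False if the target was up to date.
    """
    if is_stale(target, dependencies):
        recipe()
        return True
    return False
```

A call such as `build("report.txt", ["data.csv"], make_report)` would then re-run the hypothetical `make_report` function only after `data.csv` has changed. Some tools (e.g. `targets`) track content hashes rather than timestamps, which avoids spurious rebuilds when a file is touched but not changed.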
## Markup/authoring languages
[presentational, procedural, or semantic?](https://www.cs.mcgill.ca/~rwest/wikispeedia/wpcd/wp/m/Markup_language.htm); availability of macros? fine-grained output control? what output formats (HTML, HTML5, PDF, docx, ... ?)
- markdown (and all of its extensions)
- LaTeX (and variant flavours, e.g. XeLaTeX, LuaTeX ...)
- [DocBook, reStructuredText?](https://opensource.com/life/15/8/markup-lowdown)
## Development tools
More for software engineering than scientific development
- [Apache Maven](https://maven.apache.org)
## Continuous integration/build tools
Test your code/automatically re-run required dependencies on a remote platform
- Travis CI
- GitHub Actions
- Docker/Rocker (containers for software environments)
- `packrat`, `renv` (R packages)
## Revision control
- Git, Mercurial, Subversion
- providers: GitHub, GitLab, Bitbucket
## Documentation frameworks
- doxygen (language-agnostic? C/C++)
- Roxygen
- Python (docstrings)
- Julia ? @doc?
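For the Python case, a docstring is just a string literal placed at the top of a function, class, or module; tools such as `help()` read it at run time. The sketch below (the function `celsius_to_fahrenheit` is a made-up example) also shows how the standard-library `doctest` module can run the examples embedded in a docstring, so the documentation doubles as a small test.

```python
import doctest


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit.

    >>> celsius_to_fahrenheit(100)
    212.0
    >>> celsius_to_fahrenheit(-40)
    -40.0
    """
    return celsius * 9 / 5 + 32


if __name__ == "__main__":
    # Run the examples embedded in the docstrings as tests.
    doctest.testmod()
```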
## Storage/archiving
- Zenodo
- OSF
- Figshare
- Amazon S3 (commercial)
## Development environments/editors
Should provide tools like *syntax highlighting*, *bracket matching*, *tab completion*, hotkeys for compilation/running, ...; possibly source-level debugging; also plugins for revision control systems, etc. For text/collaborative editing, we also want some ability to *track changes* (integration with revision control tools?), add notes, live views? ...
- vim
- emacs ([Free Software Foundation](https://www.fsf.org/))
    - [Aquamacs](http://aquamacs.org/) for macOS
- VS Code (Microsoft)
- RStudio (Posit, formerly RStudio)
- PyCharm/IntelliJ (JetBrains)
- [Atom](https://atom.io/) (discontinued; GitHub archived the project in December 2022)
- Sublime Text
- Notepad++ (open source; Windows only)
## Terminal window tools
- `screen`
- `tmux`
- `terminator`
# things to consider when choosing a tool
- cross-platform/web support
- offline capability
- remote/cloud computation
- markup language (markdown/LaTeX/etc)
- programming language support/agnosticism
- business model/pricing
- longevity/stability/maturity/development stage
- lock-in
# miscellaneous (link dump)
- the [reproducible research task view](https://cran.r-project.org/web/views/ReproducibleResearch.html) for R