bbolker - HackMD

managing R on HPC systems
In general, R scripts can be run just like any other kind of program on an HPC (high-performance computing) system such as the Compute Canada systems. However, R has a few peculiarities that are useful to know about. This document compiles some helpful practices; it is aimed at people who are familiar with R but unfamiliar with HPC, or vice versa. Some of these instructions will be specific to Compute Canada ca. 2024, and particularly to the Graham cluster. I assume that you're somewhat familiar with high-performance computing (i.e., you've taken the Compute Canada orientation session and know how to use sbatch/squeue/etc. to work with the batch scheduler).

Below, "batch mode" means running R code from an R script rather than starting R and typing commands at the prompt (i.e., "interactive mode"); "on a worker" means running a batch-mode script via the SLURM scheduler (i.e., using sbatch) rather than in a terminal session on the head node. Commands to be run within R will use an R> prompt; those to be run in the shell will use $.

Getting started
Compute Canada reminders
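As a rough illustration of what "batch mode on a worker" means in the introduction above, here is a minimal sketch of a SLURM job script that runs an R script non-interactively. The file names (`run_analysis.sh`, `analysis.R`), the resource limits, and the bare `module load r` line are all assumptions made for illustration; the module name and version available on a given cluster may differ, and some systems also require an `--account` option.

```shell
# Write a minimal SLURM job script (file names and limits are hypothetical).
cat > run_analysis.sh <<'EOF'
#!/bin/bash
#SBATCH --time=00:10:00      # wall-clock limit (assumed value)
#SBATCH --mem-per-cpu=1G     # memory per core (assumed value)
#SBATCH --job-name=r_batch   # hypothetical job name
# Module name is an assumption; check what your cluster actually provides.
module load r
# Batch mode: run the R script from start to finish without a prompt.
Rscript analysis.R
EOF

# Check the job script for shell syntax errors without executing it.
bash -n run_analysis.sh && echo "job script OK"

# On a cluster login node, the script would then be handed to the scheduler:
#   sbatch run_analysis.sh
# and its status checked with:
#   squeue -u "$USER"
```

The `sbatch` and `squeue` lines are left as comments so that the block can be run safely on a machine without SLURM installed.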
bbolker changed 4 months ago
Brain dump on MCMC for epidemics
We would like to be able to estimate parameter values for dynamic models with all of the following characteristics:
- discrete states: we usually want to count at the level of individuals. Especially for the beginnings/ends of epidemics, and for outbreaks in small populations, finite-size effects (increased sampling noise at low prevalence and fadeout/extinction processes) are important
- continuous time: epidemics 'really' run in continuous time; even though time scales of epidemic processes are usually longer than a day, some processes can be close to this time scale, and discreteness can cause annoying dynamical instabilities [Ref Mollison and Ud Din?]
- both observation and process error
  - note that 'process error' can occur at two weakly separable scales, i.e. 'sampling-level' (demographic noise, either 'simple' [Poisson noise/Poisson-process branching events] or overdispersed [Hooke processes, Gamma-white noise processes [Ionides and King], negative binomial/beta-binomial epidemic sampling]) or stochastic time-varying parameter values, especially transmission rates
- 'plug-and-play' analysis of complex epidemic models
- convenient inference, especially Bayesian
bbolker changed 3 years ago
Git(Hub) best? practices for scientific research projects
what to include in the repository
Start here and here
Tips for managing large repos
Definitely include
Code
Definitely exclude
bbolker changed 3 years ago
reproducible workflows/collaboration tools brain dump
I created this document a few years ago about some of the connections and ideas I found interesting among topics such as literate programming, workflow tools, collaborative tools, etc. I still think the ideas are interesting, but a fair number of the tools are out of date. Here I will just list some categories I think are useful and some tools that fall in those categories. It's a little alarming how rapid the turnover is (dead links, discontinued projects, etc.).

Categories
Tools for reproducible reports (I'm using this term rather than literate programming, which I will save for the old Knuth-style (cweb/noweb) concept)
- Sweave/Rnw
- Rmarkdown and its extensions (bookdown, pagedown, blogdown ...)
bbolker changed 3 years ago