# 2024 notes
## Application form/applicant selection
- We ended up with a few learners who had very little or no R experience.
- Drop the git questions, since we are not assuming any knowledge of git or github?
- Possibly add more specific questions like "Do you know how to..."
  - install an R package
  - use functions from an R package
  - write R code to read in data from a .csv file
  - create a vector
  - learn to use the arguments in an R function you've never used before
- We could ask for a code sample or a small coding exercise. This would require more of a time commitment from applicants but would give us a chance to directly gauge their level. It also might not be a bad thing to require a little bit of investment up-front, because the workshop series itself is more of an investment of time and energy than your average workshop.
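If we go the coding-exercise route, a task covering the skills listed above might look something like this sketch. The file contents, column names, and object names are made up for illustration, and it only uses base R so applicants would not need to install anything.

```r
# Hypothetical screening exercise: everything here is illustrative.
# Write a tiny CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(plot = c("p1", "p2", "p3"), biomass = c(2.5, NA, 4.1)),
  csv_path,
  row.names = FALSE
)

# 1. Read data from a .csv file
plots <- read.csv(csv_path)

# 2. Create a vector
treatment <- c("control", "drought", "control")

# 3. Use an argument of a function (na.rm) to handle the missing value
mean_biomass <- mean(plots$biomass, na.rm = TRUE)
```

Asking applicants to explain what `na.rm = TRUE` does (or to find it in `?mean`) would also touch on the "learn to use the arguments" question.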
## First lesson
- Have both co-instructors each present some topics in the first lesson?
- Do exercises about naming things in breakout groups (see [reproducibility.rocks](https://reproducibility.rocks/materials/day1/02-projects/))
- More practice organizing files in folders and using RStudio projects. Got to the last session and a few learners still had confusing setups where they had project git repos inside of other git repos or were opening files from different projects in RStudio rather than just switching projects.
## Quarto
- Practice Quarto more throughout the workshop
- Useful to show other formats for rendering?
- I like a workflow pushing things to gfm because then it's easy to share results via the GH URL.
## shell
How much shell should we do? possibly none next year??
- git/github can be done without touching the shell
- `quarto preview` *could* be done with `quarto::quarto_preview()`
- shell is definitely useful, but mostly for `ssh` stuff (e.g. jetstream, HPC)
- and dedicating a whole lesson to it seems like a lot of cognitive load and general confusion that doesn't help too much later.
- It probably is good to spend a little bit of time talking about file systems, but that could easily dovetail with project management.
- (Should we also teach `here` explicitly? I've avoided it so far this year)
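For reference, the shell-free options mentioned above could look roughly like this from the R console. This is a sketch only: `report.qmd` and the `data/measurements.csv` path are hypothetical names, and `quarto::quarto_preview()` needs the Quarto CLI installed.

```r
# Start a live preview from R instead of running `quarto preview` in a terminal.
# Guarded with interactive() because it launches a server and a browser.
if (interactive()) {
  quarto::quarto_preview("report.qmd")
}

# Build a file path relative to the project root instead of the current
# working directory, which helps when learners open files from odd places.
data_path <- here::here("data", "measurements.csv")
```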
## git
- if you use `git init`, create `.gitignore` before doing a commit. We had several learners putting .Rproj.user files up on GitHub. Not a huge deal, but adding .gitignore before a first commit is good practice anyway. An alternative would be to do `usethis::git_vaccinate()` if we end up not teaching shell stuff in the future. +1
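A possible shell-free version of that setup, sketched with `usethis`. The list of ignored files is just an example, and these calls are meant to be run interactively inside the learner's project.

```r
# Run once per computer: adds common R files (.Rproj.user, .Rhistory, ...)
# to the user's *global* gitignore.
usethis::git_vaccinate()

# Run inside the project, before the first commit: adds entries to the
# project's own .gitignore (created if it does not exist yet).
usethis::use_git_ignore(c(".Rproj.user", ".Rhistory", ".RData"))

# Initialize the repo; this prompts interactively about the first commit.
usethis::use_git()
```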
## github
- should we use `usethis` to create a branch/PR for consistency instead of using RStudio to create a new branch? Might be better not to teach so many ways to do things?
- similarly, should we use `usethis::create_from_github()` instead of the new project wizard?
- I agree on showing just one way to do each of the main things (create a project, add remote, branch, push, PR)
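If we settle on `usethis` as the one way to do each step, the main workflow might look like the sketch below. `OWNER/REPO` and the branch name are placeholders, and every call here is interactive and needs a working GitHub connection.

```r
# Clone (or fork and clone) an existing GitHub repo as a new RStudio project.
# "OWNER/REPO" is a placeholder, not a real repository.
usethis::create_from_github("OWNER/REPO")

# Or, for a project that started locally: put it on GitHub and add the remote.
usethis::use_github()

# Create a branch for a pull request ("fix-typos" is a hypothetical name).
usethis::pr_init("fix-typos")

# ...edit files and commit (e.g. in the RStudio git pane)...

# Push the branch and open the PR page in the browser.
usethis::pr_push()

# After the PR is merged: switch back to the default branch and clean up.
usethis::pr_finish()
```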
## Data manipulation in R
- Should we use .qmd for going through the code since in the updated syllabus they already have learned Quarto?
- for functions, maybe start with an example script with lots of repetition as motivation for creating a function. Then move function to separate script and source it.
  - That could then feed in nicely to iteration, so you could just `map()` the function over inputs
  - Motivating case study could even start with re-factoring to rename variables & objects, move `library()` calls to top of script, etc.
- Sharing the scripts with people during the lesson led to a lot of people just selecting-all and running it. This doesn't expose whether or not they're actually understanding it, and it was really confusing for some folks because they were out of sync with the lesson.
- *possibly* spread R stuff out over 4 lessons instead of 3
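One way the repetition, then function, then `map()` progression suggested above might look, as a minimal sketch. The data frame is made up, and `rescale01()` is just an illustrative helper in the style of the R4DS functions chapter, not something from our current materials.

```r
library(purrr)

# Hypothetical data standing in for a learner's dataset
df <- data.frame(a = c(1, 5, 10), b = c(2, 4, 6), c = c(0, 50, 100))

# Before: the same calculation copy-pasted for each column
a_scaled <- (df$a - min(df$a)) / (max(df$a) - min(df$a))
b_scaled <- (df$b - min(df$b)) / (max(df$b) - min(df$b))
c_scaled <- (df$c - min(df$c)) / (max(df$c) - min(df$c))

# After: one function. In the lesson this could move to its own script
# and be loaded with source().
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Iteration: apply the function to every column; returns a named list
scaled <- map(df, rescale01)
```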
## Functions and iteration
Should we follow R4DS more closely and just spend more time on `across()` and `group_by()` to do iteration and just skip for loops entirely?
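For comparison, here is roughly what loop-free iteration with `group_by()` and `across()` could look like. The `measurements` tibble and its columns are invented for the example.

```r
library(dplyr)

# Hypothetical measurements from two sites
measurements <- tibble(
  site   = c("A", "A", "B", "B"),
  height = c(1, 3, 10, 20),
  width  = c(2, 4, 6, 8)
)

# One mean per site for each selected column, with no explicit for loop:
# group_by() iterates over groups, across() iterates over columns.
site_means <- measurements |>
  group_by(site) |>
  summarize(across(c(height, width), mean))
```

The result has one row per site, with `height` and `width` replaced by their per-site means.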
## Getting credit
- Remember that the first time you log into Zenodo or sandbox.zenodo, it will ask for permission to read your repos on GitHub. Tell learners to allow it.
# 2023
## Changes from 2022
### Things to remove
- `udunits2` package (in lesson 8, package deprecated)
- SSH setup for GitHub (will still mention it)
### Things to modify
- Start with project management as lesson 1? (don't need shell, git, or R for this, right?)
  - reason for starting with shell was to get hardest thing over first and to set up git to continually practice
- More git in R (RStudio git pane, `usethis`), less in shell?
- Quarto instead of RMarkdown (but mention RMarkdown still)
- `purrr::map()` instead of `lapply()`. Thinking of using the R4DS iteration chapter instead of the Carpentries lesson
- native pipe `|>` instead of `%>%` (but mention `magrittr`)
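A quick side-by-side of the `lapply()` to `map()` and `%>%` to `|>` swaps, as a sketch with a made-up list. The native pipe needs R 4.1 or later.

```r
library(purrr)
library(magrittr)  # only needed here to show %>%

# Hypothetical input: a named list of integer vectors
sizes <- list(a = 1:3, b = 4:6)

# Base R and purrr return the same named list
totals_base  <- lapply(sizes, sum)
totals_purrr <- map(sizes, sum)

# purrr's typed variants return an atomic vector of a known type
totals_dbl <- map_dbl(sizes, sum)

# Native pipe vs. magrittr pipe: same result for simple cases like this
piped_native   <- sizes |> map_dbl(sum)
piped_magrittr <- sizes %>% map_dbl(sum)
```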
### Things to add
- Start with "the whole picture" in some way
- `renv` at the end (and concept of docker, but not actually using docker)
- `usethis` for `git_sitrep()` and setting up GitHub connection, but maybe also `use_git()` and `use_github()`
- Getting AI help—how-to and cautions against it
- `reprex` package early on combined with GH Issues & Discussions
- A reprohack assignment?
  - Cherry pick papers or use one of their own, students fill out rubric as HW and discuss as a wrap-up activity
- Getting credit: LICENSE, CITATION.cff, & ways to get a DOI.
- Course website with contact info, syllabus with links to slides, other resources, recordings
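For the `usethis`, `reprex`, and `renv` items in the list above, the calls learners would meet are roughly the following. This is only a sketch of the interactive workflow, not a script to run top to bottom.

```r
# Check the git/GitHub setup (user name, email, credentials, remotes)
usethis::git_sitrep()

# Turn the current project into a git repo, then put it on GitHub
usethis::use_git()
usethis::use_github()

# Make a reproducible example to paste into a GitHub Issue or Discussion
reprex::reprex({
  x <- c(1, 2, NA)
  mean(x)
})

# Record package versions for the project, and restore them elsewhere
renv::init()      # set up a project-specific library and lockfile
renv::snapshot()  # update renv.lock after installing or updating packages
renv::restore()   # reinstall the versions recorded in renv.lock
```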
## Questions
I had a few questions as I was going through last year's notes & materials
1. Did anyone actually have to install git on macOS? Should be there by default
3. How was the unit conversion lesson received? Now that `udunits2` is deprecated, is it worth teaching with `units` instead?
   - Might be good to mention `units`, but don't need to teach
5. Did the project management lesson really take 2 hours?
6. Are we sure about getting rid of `ggplot2`? (it's fine with me)
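On the unit conversion question above, if we only mention `units`, a one-slide example might be enough. A minimal sketch, with a made-up distance:

```r
library(units)

# Attach a unit to a number (mode = "standard" lets us pass the unit as a string)
distance <- set_units(1500, "m", mode = "standard")

# Convert to another compatible unit
distance_km <- set_units(distance, "km", mode = "standard")

# Drop the unit again to get a plain number
distance_km_value <- as.numeric(distance_km)
```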
## Syllabus
| Lesson | Theme | Topic | Notes |
|-------------|---------------------|--------------------|-------------------------|
| 1 | Share & collaborate | Shell scripting | Intro, install git (Windows), basic shell|
| 2 | Share & collaborate | Version control with git & GitHub| Basic git with shell, then into RStudio to connect to GitHub (e.g. with `usethis::use_github()`) |
| 3 | Share & collaborate | Collaborating with GitHub | branches & PRs & forks maybe|
| 4 | Continued learning | Getting Help with Code | GH Issues & discussions. `reprex` package & concept (this is not 2 hrs)|
| 5 | Manage & organize | Project management and coding best practices | (is this 2 hours?)|
| 6 (modular) | Repeat & reproduce | Intermediate R programming I | |
| 7 (modular) | Repeat & reproduce | Intermediate R programming II| |
| 8 (modular) | Clean & plot | Data manipulation | |
| 9 (modular) | Clean & plot | Data visualization | |
| 10 | Document & publish | Documentation | |
## Notes
- In first lesson, make sure everyone has 2 monitors and help them get a second monitor if they need one
- Tell people there is a showcase at the end where they can show off a project and how it's been improved
- Remember to look through previous iteration notes again! There's a link to **jeopardy**
- Applications: https://docs.google.com/forms/d/18VkmxPoX2QUeu9vB3hxizMhxcqWxDIJrtvnSVjxVGQ4/edit#responses
  - Filter out:
    - Not in UA
    - Don't know any R
  - Prioritize:
    - ALVSCE
    - Answer to "how" question matches workshop description
  - Shoot for around 20