# Remote Computing Workshop Series
[toc]
Questions:
* landing page URL?
* e-mail URL for questions? maybe use a special Jira address?
* should we swap workshop 6 (project org) and 7 (shell scripting)?
TODO:
* start generating pre- and during assessment questions
* ask for MCs, co-lead instructors
* decide how to create materials, start with template
* think about platforms. discord? slack? nothing at all?
## high-level blurb:
This series of 11 two-hour workshops introduces researchers to the technical skills needed to work with data sets that are too large to analyze conveniently on laptop or desktop computers. Through demonstrations and hands-on practice, we will show researchers how to automate large analyses, connect to and use remote computers (including HPC clusters and cloud computers), and manage and execute long-running analysis workflows on these remote computers.
## what/when/where/who
We will run 11 workshops during the month of August 2021. Each workshop will start at 9am PDT, and may be up to 2.5 hours long.
The workshops will be hands-on, with a combination of lecture, follow-along work, exercises, and Q&A.
All workshops will be offered via Zoom. We plan to record them and post the recordings publicly.
Any and all attendees are welcome, including researchers not affiliated with UC Davis.
Some prior experience with a scripting language (R, Python, etc.) is expected, but proficiency in a specific language is not required. You will need a computer with a web browser, an Internet connection, and the latest version of Zoom installed. Ideally, you should have admin permissions to install software on your machine.
This is a joint training series offered by [the UC Davis DataLab](https://datalab.ucdavis.edu/) and [the NIH Common Fund Data Ecosystem](http://nih-cfde.org/).
Space is limited and registration is required. To reserve your spot [register here]. The non-refundable $5 fee signs you up for the entire workshop series; [please inquire](mailto:datalab-training@ucdavis.edu) in cases of financial hardship. Registrants can attend any or all of the workshops, and must complete a short pre- and post-assessment for each workshop they attend.
Contact [datalab-training@ucdavis.edu](mailto:datalab-training@ucdavis.edu) for more information, or monitor <this website>.
### Registration timing
The workshop series will be announced to DataLab as soon as we are ready to go public.
There will be space for approximately 50 attendees.
The CFDE will advertise that registration is open on Monday, July 26th, if any places remain.
## Introductory skills
### Workshop 1: Introduction to the UNIX Command Line - Tues Aug 3
This two-hour workshop will introduce attendees to the UNIX command line, which is the main way to interact with remote computers. We will cover computing concepts, file systems and directory structure, and some of the most important commands for working with remote computers.
### Workshop 2: Creating and modifying text files on remote computers - Wed Aug 4
This two-hour workshop will introduce attendees to the concepts and skills needed to create, modify, and search text files on remote computers. We will discuss files and content types, and cover the most common ways to work with remote text files.
## Intermediate skills
### Workshop 3: Connecting to remote computers with ssh - Tues Aug 10
This two-hour workshop will show attendees how to connect to remote computers using ssh, which is the most common way to do so. We will discuss usernames and passwords, introduce ssh software clients, and work through the most common challenges attendees will face in connecting to remote computers.
### Workshop 4: Running programs on remote computers and retrieving the results - Thurs Aug 12
This two-hour workshop will show attendees how to use remote computers to run their analyses, work with the output files, and copy the results back to their laptop or desktop computers. We will discuss input and output formats, where files are usually read from and written to, and how to use the ssh software to copy files to and from remote computers.
### Workshop 5: Installing software on remote computers with conda - Fri Aug 13
This two-hour workshop will show attendees how to install and manage software using the conda installation system. We will give examples of installing Python and R software and of managing conda environments on remote systems.
### Workshop 6: Structuring your projects for current and future you - Tues Aug 17
In this two-hour workshop, we will discuss folder structures for organizing your projects so that you can track inputs, outputs, and processing scripts over time, and keep yourself organized as your projects evolve.
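As a rough illustration of the kind of layout this workshop has in mind, the sketch below creates a project folder with separate places for inputs, outputs, and processing scripts. The folder names and the `make_project_skeleton` helper are hypothetical, chosen only for this example; the workshop itself may recommend a different structure.

```python
from pathlib import Path
import tempfile

# Hypothetical folder names, chosen only for illustration.
PROJECT_DIRS = ("inputs", "outputs", "scripts")


def make_project_skeleton(root, subdirs=PROJECT_DIRS):
    """Create `root` with one subfolder per entry in `subdirs`.

    Returns the list of created (or already existing) folder paths.
    """
    root = Path(root)
    created = []
    for name in subdirs:
        path = root / name
        # parents=True also creates `root`; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_project_skeleton(Path(tmp) / "my-project"):
            print(path.relative_to(tmp))
```

Keeping raw inputs, generated outputs, and scripts in separate folders makes it easier to tell which files can be regenerated and which must be preserved.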
### Workshop 7: Automating your analyses and executing long-running analyses on remote computers - Thurs Aug 19
This two-hour workshop will show attendees how to automate their analyses using shell scripts, as well as run and manage software that takes minutes, hours, or days to execute. We will also show attendees how to disconnect from and resume running processes using the 'screen' command.
(CTB: maybe switch this with workshop 6?)
### Workshop 8: Keeping track of your files with version control - Tues Aug 24
This two-hour workshop will show attendees how to use the git version control system to track changes to their files on the remote system, as well as back up their project files to GitHub and transfer them to a laptop or desktop. We will demonstrate file sharing via GitHub, and discuss ways to collaborate with a team.
## Advanced skills
### Workshop 9: Automating your analyses with the snakemake workflow system - Wed Aug 25
This two-hour workshop will introduce attendees to the snakemake workflow system for executing large-scale automated analyses.
### Workshop 10: Executing large analyses on HPC clusters with slurm - Thurs Aug 26
This two-hour workshop will introduce attendees to the slurm system for queuing, scheduling, and running analyses on high performance compute clusters. We will also cover cluster computing concepts and talk about how to estimate the compute resources you need and measure how much you've used.
### Workshop 11: Making use of on-demand "cloud" computers from Amazon Web Services - Tues Aug 31
This two-hour workshop will introduce attendees to AWS computer "instances" that let you rent compute time on large or specialized computers. We will also talk about how to estimate the compute resources you need and measure how much you're using.
## possible dates
### week 1:
August 3, Tues - workshop 1
August 4, workshop 2
August 5
August 6
### week 2:
August 10, Tues (Titus and Amanda need to leave at 11am PDT) - workshop 3
August 11 (Titus and Amanda cannot make)
August 12, workshop 4 (Titus cannot make beginning)
August 13, workshop 5
### week 3:
August 17, Tues (Pamela has to leave by 10am), workshop 6
August 18 (Pamela cannot make)
August 19 (Pamela and Nick cannot make), workshop 7
August 20 (Pamela cannot make)
### week 4:
August 24, Tues, workshop 8
August 25 (Titus cannot lead; Pamela cannot make it), workshop 9
August 26, workshop 10
### week 5
August 31, Tues (Amanda cannot make), workshop 11
Sep 1 (Amanda cannot make)
Sep 2
Sep 3