# Snakemake Hackathon

:::info
## Event info

**Hackathon date:** 22.05, 10-16
**Location:** online
**Event page and registration:** https://ssl.eventilla.com/event/snakemake_hack (published on 21.3)
**Costs:** free of charge
**Max participants:** currently 25, to be updated if needed
**Collaborative document**: https://siili.rahtiapp.fi/snakemake_hack (draft)
:::

**Organizer:** CSC (Geoportti project)

**Collaborators:**

- projects: CodeRefinery, EuroCC2
- organizations: CSC, Aalto, ENCCS, PDC, UPPMAX ...?
- people:
    - CSC: Antoni Golos, Laxmana Yetukuri, Samantha Wittke
    - Aalto: Teemu Ruokolainen
    - ENCCS: Yonglei Wang
    - PDC (NAISS): Johan Hellsvik
    - UPPMAX (NAISS): Diana Iusan
- **Registration** for all via CSC event page system

[TOC]

## TODO

- LY: Update own slides
- LY: CSC tutorial; add installation
- AG: Snakemake intro materials
    - localrules?
    - group directive?
- SW/LY: update toy example
- DI: prepare UPPMAX intro material
- DI: test toy example on UPPMAX cluster
- SW: general Snakemake on HPC materials
    - mixing modules?
- SW: fix index page
    - license
    - funding
    - expectation (everyone still learning)
- SW: check:
    - https://r3.pages.uni.lu/school/snakemake-tutorial/HPC
    - https://hpc-docs.cubi.bihealth.org/slurm/snakemake/
    - https://github.com/SchlossLab/snakemake_cluster_tutorial
    - https://hpc.nih.gov/apps/snakemake.html
    - https://zqfang.github.io/2020-08-19-hpc-snakemake/
    - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8114187/
    - https://coderefinery.github.io/TTT4HPC_parallel_workflows/parallelization/parallelize_using_workflow_manager/

## More detailed day plan (times EEST)

10-10.30 Everyone joins and we take questions regarding the HPC refresher materials; if there are no questions, we can discuss among "specialists" and see if some questions come up

10.30-11.30 Antoni provides lecture with demos on Snakemake basics

## Materials

- rendered: https://coderefinery.github.io/snakemake_hackathon/
- source: https://github.com/coderefinery/snakemake_hackathon

## Schedule

Latest schedule: https://siili.rahtiapp.fi/snakemake_hack?view#%F0%9F%93%85-Agenda

## Participant stats

> Current counts:

### Home organization nationality:

- UK: 2
- Fin: 19
- Swe: 12
- TR: 2

### HPC experience:

- no: 3
- beginner: 22
- advanced: 12

### HPC access:

- Aalto Triton: 0
- CSC Mahti: 1
- CSC Puhti: 17
- LUMI: 3
- UPPMAX: 9 (bianca, rackham/snowy)
- Other: 5
    - UK: Oxford Arc: 2
    - NAISS C3SE: 1
    - VTT internal: 2
    - other (person from Lund, 1 from Turkey)

### Snakefile:

- No: 21
- WIP: 7
- ready: 7
- Don't know: 1

## Planning meeting 13.05.24

- 4 people from waiting list accepted
- Participant email check
- UPPMAX Snakemake usage presentation
    - snakemake module
    - conda?
        - users can do own
    - hyperqueue?
        - no
        - could be provided
    - interactive session
- CSC
    - plain module installation
        - good for use with other modules
        - plain pip installations: pygment, snakemake, snakemake slurm and generic plugin
    - container wrapper for conda based installations (tykky)
    - container based installation
    - launching jobs
        - native snakemake cluster integration: one job per rule may be overkill
        - hyperqueue
- General Snakemake on cluster
    - general portability recommendations
    - general best practices
    - one job interactive
    - slurm integration
- General room?
    - Other specialists
- Dummy example:
    - Hyperqueue available on other clusters?
    - Diana to run on Uppmax as test

## CSC Planning meeting 08.05.24

- HPC refresher materials: https://coderefinery.github.io/TTT4HPC_resource_management/#
    - Does this cover everything needed?
    - Send HPC refresher week before the course, have first hour Q&A, hangout
- LUMI environment set up :+1:
- Mahti environment not set up, ask participant to switch to Puhti
- High Throughput example with Hyperqueue and Snakemake (for people without own workflow), modified from CSC docs: https://github.com/csc-training/csc-env-eff/blob/11-workflows-dev/part-2/workflows/snakemake-ht.md
- (WIP) CSC Snakemake intro, slides: https://a3s.fi/CSC_training/snakemake_hackathon.html

## Participant e-mail draft

> To be sent 14.5

Dear all,

thank you for your interest in the Snakemake hackathon.

Joining the event

This event will be fully online in Zoom.

Join Zoom Meeting: https://cscfi.zoom.us/j/61757837253
Meeting ID: 617 5783 7253

Agenda

All times in EEST (Helsinki time):

10-10.30 Supercomputing concept refresher Q&A
10.30-11.30 Intro to Snakemake
11.30-12 Snakemake on HPC (first general, then cluster specific)
12-13 Lunch break
13-15 Hackathon (cluster specific breakout rooms for CSC Puhti + LUMI, UPPMAX and one general room)

You can join any session of interest to you.
Please note that, based on the feedback given during registration, we will not have a lecture on supercomputing concepts. Instead, we will have a Q&A/discussion session in the morning about the materials provided below.

The idea for the hackathon part is that you work on your own workflow and ask the specialists for help where needed. You can also check out our toy example.

Requirements

1. Please read this material as an HPC concept refresher: https://coderefinery.github.io/TTT4HPC_resource_management/#
    - We have reserved 30 min at the beginning of the course for questions related to this material.
2. For the hands-on hackathon part, you need to be able to run jobs on a cluster:
    - For CSC clusters this means that you need an account and a project with billing units.
        - If you do not have a project, please send us your CSC username by 20.5 by replying to this email.
    - For UPPMAX clusters this means that you need an account.
        - You will be added to a course project at the beginning of the hackathon session.

Note on support

For participants not using Aalto Triton, CSC Puhti/Mahti, LUMI, ENCCS or UPPMAX clusters, we unfortunately cannot provide cluster specific support. You may still try it out on your own cluster or laptop and ask Snakemake related questions though :)

Interaction during the event

We will use a collaborative document to ask, answer and collect all questions and feedback during the event: https://siili.rahtiapp.fi/snakemake_hack?view

Further information

Please refer to the event page for general information about the event: https://ssl.eventilla.com/snakemake_hack

In case of any questions, please reply to this email.

Looking forward to meeting you virtually on 22.5!

Kind regards,
Samantha and the Snakemake hackathon team

---

## Planning meeting 30.4.24

- HPC refresher as materials beforehand, offer Q&A
- Antoni Snakemake intro
    - What is snakemake
    - When does it make sense to use Snakemake, when not
    - Implementation of workflow with Snakemake
- Snakemake on HPC (Laxmana)
- cluster specific short presentations in breakout room
    - CSC: Mahti, Puhti, LUMI
        - Puhti: LY has project, can invite
- Week before:
    - need Puhti project?

Next catch up 8.5

- LY: check Snakemake on LUMI, install as module, Snakemake on Puhti/Mahti/LUMI presentation
- SW: update eventilla with HPC refresher as materials, link to materials
- AG: Snakemake intro

## Planning meeting 4.4.24

- Idea and plan for the Hackathon
    - morning: lectures, afternoon: hands-on
- HPC concepts; difference between clusters? (Dardel similar to LUMI)
    - SLURM? :heavy_check_mark:
    - "fair share"? job priorities
        - about same
    - conda
        - some allow, some prefer containerized, CSC: tykky
    - container
        - singularity/apptainer
    - local scratch
- Snakemake on your cluster?
    - Do we want breakout rooms already here?
    - CSC:
        - doc: https://docs.csc.fi/apps/snakemake/
        - tutorial: https://docs.csc.fi/support/tutorials/snakemake-puhti/
        - own installation via "container wrapper": https://docs.csc.fi/computing/containers/tykky/
    - LUMI:
        - similar to CSC setup
        - ...?
    - Dardel
        - same computer model as LUMI. Python and/or Snakemake configurations might differ
    - UPPMAX
    - Triton
    - ENCCS

    Have short intro for each cluster in beginning of hackathon session in breakout rooms
- Example case for hackathon (people without own project)? Close to "real life" but not Science specific.
    - fall back solution: word count from HPC carpentries
    - add your example here:
        - ..
        - ..
- Involvement interest (all, add your initials)
    - lectures
        - LY
        - AG
    - hackathon
        - DI
        - YW
        - JH
    - advertising
        - all
- Advertisement:
    - CSC April training newsletter (SW) :heavy_check_mark:
    - CSChpc twitter (SW) :heavy_check_mark:
    - Geoportti (SW)
    - EuroCC (YW)
    - CodeRefinery (SW)
    - TTT4HPC Day4 course (TR)
    - Aalto SciComp (TR)
    - NAISS :heavy_check_mark:
    - ENCCS

## TODO

- ASAP:
    - advertise (use list above)
    - start material preparation (SW to coordinate CSC internal first, before opening for comments/additions from others)
- Beginning of May:
    - prepare environment / module on each system
    - add cluster specific tabs to materials (all)
    - prepare short cluster specific intros (all)
- Mid May:
    - Course project (on each system)
        - invite participants
    - course resource reservation (on each system)

## Other existing related materials

- [CSC bio workflows with Nextflow course materials](https://a3s.fi/swift/v1/AUTH_53f5b0ae8e724b439a4cd16d1237015f/csc-training/workflows_workshop.html)
- [CSC workflows on HPC considerations - slides](https://a3s.fi/2001659-workflow-workshop/workflows.html#/)
- [CSC High Throughput Computing docs page](https://docs.csc.fi/computing/running/throughput/)
- [Uppsala snakemake BYOC](https://github.com/NBISweden/workshop-snakemake-byoc)
- [Snakemake tutorial](https://slides.com/johanneskoester/snakemake-tutorial)
- [CodeRefinery lesson](https://coderefinery.github.io/reproducible-research/workflow-management/)
- [Carpentries lesson](https://carpentries-incubator.github.io/workflows-snakemake/)
- [Snakemake workflows catalog](https://snakemake.github.io/snakemake-workflow-catalog/)
- [Snakemake video](https://www.youtube.com/watch?v=_dG9b3a9zkk&t=400s)
- [CSC Snakemake tutorial](https://docs.csc.fi/support/tutorials/snakemake-puhti/)
- [MultiXScale Workflows material](https://ocaisa.github.io/hpc-workflows/)
- [Universe HPC Snakemake lesson](https://github.com/UNIVERSE-HPC/course-material/tree/main/technology_and_tooling%2Fsnakemake)