# A rapid introduction to snakemake (in 5 examples)
See github repo: [ctb/2026-rapid-snakemake-intro](https://github.com/ctb/2026-rapid-snakemake-intro)
---
## 1. Compress a file
```
rule compress_file:
    shell: """
        gzip -c example.txt > example.txt.gz
    """
```
Put this in `Snakefile`;
run with `snakemake -j 1`.
---
## 2. Compress a file _once_
```
rule compress_file:
    output: "example.txt.gz"
    shell: """
        gzip -c example.txt > example.txt.gz
    """
```
Run with `snakemake -j 1`
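
What declaring `output:` buys you can be sketched in plain Python: do the work only when the output file is missing. This is a simplified illustration, not snakemake's actual logic (snakemake also looks at things like file modification times); the function name `compress_once` is hypothetical.

```python
import gzip
import os
import shutil


def compress_once(src, dst):
    """Gzip src into dst unless dst already exists.

    Returns True if the file was compressed, False if the work was skipped.
    Hypothetical helper for illustration only.
    """
    if os.path.exists(dst):
        # Output is already there: nothing to do.
        return False
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return True
```

Calling `compress_once("example.txt", "example.txt.gz")` twice does the compression the first time and skips it the second.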
---
## 3. Compress _any_ .txt file
```
rule compress_file:
    input: '{name}.txt'
    output: '{name}.txt.gz'
    shell: 'gzip -c {input} > {output}'
```
Run:
```
snakemake -j 1 example.txt.gz # NOTE: filename req'd!
```
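
Why the filename is needed: snakemake has to match the requested file against the `output:` pattern to learn what `{name}` is, and then fills that value into `input:`. A rough Python sketch of that matching, assuming each wildcard matches one or more of any character (the helper `match_output` is hypothetical and is not snakemake's implementation):

```python
import re


def match_output(pattern, target):
    """Match target against a pattern such as '{name}.txt.gz'.

    Returns a dict of wildcard values, or None if the target does not match.
    Hypothetical helper; assumes each wildcard name appears once in the pattern.
    """
    # re.split with a capturing group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)        # literal text, e.g. '.txt.gz'
        else:
            regex += f"(?P<{part}>.+)"      # wildcard, e.g. {name}
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None


wildcards = match_output("{name}.txt.gz", "example.txt.gz")
input_file = "{name}.txt".format(**wildcards)
```

Here `wildcards` is `{'name': 'example'}` and `input_file` is `example.txt`, so the shell command becomes `gzip -c example.txt > example.txt.gz`.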
---
## 4. Compress _all_ .txt files (including in subdirs)
```
MATCHES = glob_wildcards('{name}.txt')

rule all:
    input:
        expand('{name}.txt.gz', name=MATCHES.name)

rule compress_file:
    input: '{name}.txt'
    output: '{name}.txt.gz'
    shell: 'gzip -c {input} > {output}'
```
Run:
```
snakemake -j 4 ## NOTE: uses 4 CPUs; no filename req'd
```
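
The two helpers used here, `glob_wildcards` and `expand`, can be approximated in plain Python to show what they compute: first collect every `{name}` for which `{name}.txt` exists, then build the list of `.txt.gz` targets from those names. This is a sketch under simplifying assumptions, not snakemake's code; `find_names` and `expand_targets` are hypothetical names.

```python
import os
import re


def find_names(root="."):
    """Collect the {name} part of every '{name}.txt' under root, subdirectories included."""
    names = []
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            # Path relative to root, e.g. 'sub/b.txt'
            path = os.path.relpath(os.path.join(dirpath, filename), root)
            m = re.fullmatch(r"(.+)\.txt", path)
            if m:
                names.append(m.group(1))
    return sorted(names)


def expand_targets(template, names):
    """Fill each name into a template such as '{name}.txt.gz'."""
    return [template.format(name=n) for n in names]
```

With `a.txt` and `sub/b.txt` on disk, `expand_targets('{name}.txt.gz', find_names())` gives `a.txt.gz` and `sub/b.txt.gz`, which is the list of files that `rule all` asks for.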
---
## 5. Multistage: compress & do something else
```
MATCHES = glob_wildcards('{name}.txt')

rule all:
    input: expand('{name}.info', name=MATCHES.name)

rule compress_file:
    input: '{x}.txt'
    output: '{x}.txt.gz'
    shell: 'gzip -c {input} > {output}'

rule info:
    input: '{y}.txt.gz'
    output: '{y}.info'
    shell: 'gzip -lv {input} > {output}'
```
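
One way to picture how the two rules chain together: start from the requested `.info` file and work backwards until you reach a file that already exists. The Python sketch below does this for the two rules above, using plain suffix matching instead of real wildcards; `RULES` and `plan` are hypothetical names, and snakemake's real resolution is more general than this.

```python
# (rule name, input suffix, output suffix), mirroring the two rules in the Snakefile
RULES = [
    ("compress_file", ".txt", ".txt.gz"),
    ("info", ".txt.gz", ".info"),
]


def plan(target, existing):
    """Return the ordered list of (rule, input, output) jobs needed to make target."""
    if target in existing:
        return []  # already on disk, nothing to run
    for rule, in_suffix, out_suffix in RULES:
        if target.endswith(out_suffix):
            stem = target[: -len(out_suffix)]
            source = stem + in_suffix
            # First make the input (recursively), then run this rule.
            return plan(source, existing) + [(rule, source, target)]
    raise ValueError(f"no rule produces {target!r}")


jobs = plan("example.info", existing={"example.txt"})
```

`jobs` lists `compress_file` first and `info` second, even though `example.info` was the only file requested.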
---
Points to make:
* snakemake basically just builds shell commands and runs them in a particular order.
* the first rule is the default rule, but you can give a rule name to run, or a filename to make, and it will run that instead.
* it works hand in hand with consistent naming schemes...
* it tries to avoid rerunning jobs unnecessarily.
* it's really good at parallelizing jobs!