#10: Making a new institutional profile

James Fellows Yates / @jfy133
LMU München / MPI-EVA


Overview

  • Benefits of nf-core institutional profiles
  • Information to gather before writing
  • Step-by-step example of a profile
  • How to test a draft profile

Today

🏫 Global institutional profile: a Nextflow configuration file usable by all nf-core pipelines to work efficiently on institutional-level cluster(s). Stored on nf-core/configs.

  • Pipeline institutional profiles: future bytesize

Why use?

  • 🚀 efficiency: computing resource, time
  • 🛄 portable: assists reproducibility
  • ⌚ saves time: write it once, everyone benefits

Recap

General info about nf-core & configs: bytesize #2

📄 Nextflow configuration file: simple text file containing a set of properties (parameters, etc.)

  • Different levels!
    • ./nextflow.config
    • $HOME/.nextflow/config
    • -c <custom.config> (on the command line)
    • nf-core profiles: nf-core/configs

Example:

nextflow run nf-core/eager -profile uppmax
nextflow run nf-core/eager -profile shh,sdag
nextflow run nf-core/mag -profile shh,sdag

⚠️ Order matters: furthest right has precedence!


Preparation!


Topics to cover

  1. 📛 names
  2. 🛑 resource limits
  3. 📆 scheduling systems
  4. 📦 containers

Names

  • 📛 Do you have a recognisable and descriptive name?
    • Short is good (abbreviations OK!)
    • Precise but compact

Resource limits

  • To use nf-core's automatic retries effectively ✨:
    • 💾 Largest node's memory

    • 💻 Largest node's CPU

    • ⌚ Longest queue's walltime

    • 📁 Scratch usage?
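
Why the limits matter, as a rough sketch: nf-core pipelines typically request resources that grow with each retry and cap them at the profile's max_* values. The fragment below approximates what a pipeline's own conf/base.config looked like at the time of this talk. The numbers are illustrative, and check_max is a helper defined in the pipeline's nextflow.config, not by Nextflow itself, so this fragment is not meant to be used on its own.

```groovy
// Approximation of a pipeline's conf/base.config (illustrative values only).
// check_max() is assumed to be provided by the pipeline's nextflow.config;
// it caps each request at params.max_cpus / max_memory / max_time.
process {
  cpus   = { check_max( 1 * task.attempt, 'cpus' ) }
  memory = { check_max( 7.GB * task.attempt, 'memory' ) }
  time   = { check_max( 4.h * task.attempt, 'time' ) }

  // Retry on exit codes that usually indicate a resource problem
  errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'finish' }
  maxRetries    = 1
}
```

If the max_* values in your profile are higher than what the largest node or longest queue can offer, a retried job may request resources the scheduler can never grant.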


Scheduling systems

  • For Nextflow to submit for you:
    • 📅 What scheduler (check Nextflow docs!)?
    • 🐍 Queues/partitions?
    • 🛑 Submission limits?
    • 🛠️ Additional configurations (e.g. module load)?

Containers

  • For robust reproducibility:
    • 📦 Which container engine (check nf-core docs!)?
    • 📁 Common cache locations?
    • 🛠️ Additional parameters?

Writing!


To start

  • ✍🏽 Create two new files:
    • conf/<your_cluster>.config
    • docs/<your_cluster>.md
  • 📛 Add profile name to:
    • nfcore_custom.config
    • README.md
    • .github/workflows/main.yml
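
A sketch of the nfcore_custom.config entry, assuming it follows the same pattern as the existing entries in that file (one includeConfig line per profile inside the profiles block). <your_cluster> is the same placeholder as above. Compare with the neighbouring entries in the repository before committing.

```groovy
// In nfcore_custom.config (sketch; mirror the existing entries in the file)
profiles {
  // <...> existing profiles stay as they are

  // New entry: the profile name users will pass to -profile.
  // Existing files under conf/ in nf-core/configs use the .config extension.
  <your_cluster> { includeConfig "${params.custom_config_base}/conf/<your_cluster>.config" }
}
```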

Scope: params

params {
  config_profile_description = '<cluster_name> cluster profile provided by nf-core/configs.'
  config_profile_contact = '<your_name> (<your_github_handle>)'
  config_profile_url = 'https://<institutional_url>.com'
  max_memory = 2.TB
  max_cpus = 128
  max_time = 720.h
  igenomes_base = '/<path>/<to>/igenomes/' // optional!
}

In conf/<your_cluster>.config


Scope: process

Simple example

process {
  executor = 'slurm'
  maxRetries = 2
}

Complex example

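process {
  executor = 'sge'
  // Pick a queue from the walltime each task requests
  queue = { task.time <= 2.h ? 'short' : task.time <= 24.h ? 'medium' : 'long' }
  maxRetries = 2
  // Double quotes are needed so that ${...} is interpolated for each task
  clusterOptions = { "-l h_vmem=${task.memory.toGiga()}G" }
}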

In conf/<your_cluster>.config


Scope: executor

executor {
  queueSize = 8
  submitRateLimit = '10 sec'
}

In conf/<your_cluster>.config


Scope: container

singularity {
  enabled = true
  autoMounts = true
  cacheDir = '/<path>/<to>/<your>/<image_cache>'
}

In conf/<your_cluster>.config


Scope: profiles

<...>

profiles {
  red {
    params {
      config_profile_description = "<your_institution_name> 'red' cluster profile provided by nf-core/configs."
      max_memory = 2.TB
      max_cpus = 128
      max_time = 720.h
    }
  }

  blue {
    params {
      config_profile_description = "<your_institution_name> 'blue' cluster profile provided by nf-core/configs."
      max_memory = 256.GB
      max_cpus = 64
      max_time = 24.h
    }
  }
}

(Or use hostnames)
In conf/<your_cluster>.config
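
A hostname-based alternative to the sub-profiles might look like the sketch below. It assumes that node names on the two hypothetical clusters start with 'red' and 'blue', and that the config is parsed with Nextflow's classic config parser, which allows plain Groovy statements. Newer strict config syntax may reject this, so treat it as an illustration rather than a recommended pattern.

```groovy
// Hypothetical sketch: choose limits from the name of the node that
// launches Nextflow. The 'red'/'blue' prefixes are placeholders.
def hostname = 'hostname'.execute().text.trim()

if (hostname.startsWith('red')) {
  params.max_memory = 2.TB
  params.max_cpus   = 128
  params.max_time   = 720.h
} else if (hostname.startsWith('blue')) {
  params.max_memory = 256.GB
  params.max_cpus   = 64
  params.max_time   = 24.h
}
```

With this approach users pass only the main profile name, at the cost of the limits depending on where the run is launched from.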


Documentation!


What to include?

  • 🗺️ Where the cluster is based
  • 🛑 Summary of parameters
    • e.g. resource limits, queues
  • 👨‍💻 Instructions for user-level configuration
    • e.g. cache directories
  • 🖧 Available 'sub'-profiles

In docs/<your_cluster>.md


Testing and submission!


Test profile from your fork

nextflow run nf-core/<fav_pipeline> \
-profile <your_cluster_name>,test \
--custom_config_base 'https://raw.githubusercontent.com/<your_github_user>/configs/<your_branch>'

⚠️ Expect trial and error!


Submit to nf-core/configs!

  • 📑 Make a PR to nf-core/configs

  • 📢 On slack: #request-review

  • 🥳 Once approved, merge, and publicise!

  • 👩🏼‍💻 From now on:

    nextflow run nf-core/<fav_pipeline> \
    -profile <your_cluster_name> \
    <...>

Need help?

Repository: nf-core/configs
Tutorial: https://nf-co.re/usage/usage_tutorials
Chat: https://nf-co.re/join #configs

Next Bytesize

Development environments & workflows

May 4th 2021, 13:00 CEST

Follow nf-core on
