Conda Tutorial

What exactly is Conda?

  • A package and environment manager
    • Like apt/yum, but much more flexible
    • Environments are isolated from each other
  • User-contributed package recipes
    • Different "channels", can create your own
    • Updated constantly
  • Prebuilt binaries
    • Linked to libraries in the same environment

Conda packages

  • Specific versions
  • Various sources ("channels")
  • Defined requirements
    • Usually satisfied from the same channel or from other configured channels

Conda channels

  • conda-forge: Most dependencies (numpy, scipy, zlib, CRAN packages, etc.)

  • bioconda: Most bioinformatics packages (salmon, STAR, samtools, DESeq2, etc.)

  • defaults: Packages built by Anaconda Inc.

  • Order matters! Use this one:

$ conda config --show channels
channels:
  - conda-forge
  - bioconda
  - defaults

$ conda config --prepend channels bioconda
$ conda config --prepend channels conda-forge
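
Since the order determines priority, it can be handy to verify it in a script before installing anything. The following is only a minimal sketch, assuming the output format of `conda config --show channels` shown above; `check_channel_order` is a hypothetical helper, not a conda command.

```bash
# Hypothetical helper (not part of conda): reads the text printed by
# `conda config --show channels` on stdin and checks the priority order.
check_channel_order() {
    local expected="conda-forge bioconda defaults"
    local actual
    # Keep only the "- name" entries and join them with single spaces.
    actual=$(sed -n 's/^[[:space:]]*-[[:space:]]*//p' | paste -sd ' ' -)
    if [ "$actual" = "$expected" ]; then
        echo "channel order OK"
    else
        echo "unexpected channel order: $actual"
        return 1
    fi
}

# Usage with a real installation:
#   conda config --show channels | check_channel_order
```

If the check fails, the two `--prepend` commands above restore the intended order.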

Finding Packages

$ conda search pysam
Loading channels: done
# Name    Version     Build             Channel
[...]
pysam     0.16.0      py36h71d3148_1    bioconda
pysam     0.16.0      py36h873a209_0    bioconda
pysam     0.16.0      py37ha9a96c6_0    bioconda
pysam     0.16.0      py37hc501bad_1    bioconda
pysam     0.16.0.1    py27ha863e18_1    bioconda
pysam     0.16.0.1    py36h4c34d4e_1    bioconda
pysam     0.16.0.1    py36h71d3148_0    bioconda
pysam     0.16.0.1    py37hc334e0b_1    bioconda
pysam     0.16.0.1    py37hc501bad_0    bioconda
pysam     0.16.0.1    py38hbdc2ae9_1    bioconda
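
The listing gets long quickly, so a small filter can pull out just the newest version. This is a rough sketch, assuming the column layout shown above and a `sort` that supports `-V` (GNU coreutils); `latest_version` is a hypothetical helper, and `sort -V` only approximates conda's own version ordering.

```bash
# Hypothetical helper: reads `conda search` output on stdin and prints the
# highest version listed for the given package name.
latest_version() {
    local pkg="$1"
    # Column 1 is the package name, column 2 the version. Header lines are
    # skipped because their first field never equals the package name.
    awk -v pkg="$pkg" '$1 == pkg { print $2 }' | sort -u -V | tail -n 1
}

# Usage with a real installation:
#   conda search pysam | latest_version pysam
```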

Conda environments

  • A (mostly) self-contained directory with a set of compatible packages
  • Uses links to reduce disk space when possible
  • No more conflicting dependencies between versions!

Common commands

  • conda info --envs (or conda env list)

    • Lists available environments

      ```bash
      $ conda info --envs
      ```
      
    • You start in base

    • The * indicates the active environment

  • conda create/conda env remove

    • Create/remove environments

      conda create --name=myenv python=3.8 numpy 'pysam>=0.16'
      conda env remove --name=myenv

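    • Versions can be specified per package (python=3.8)

    • Min/max versions can be given as constraints ('pysam>=0.16'); quote them so the shell does not treat > or < as a redirection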

  • conda activate/conda deactivate

    • Activates/deactivates an environment

      command -v nanoplot
      conda activate nanoplot
      command -v nanoplot
      conda deactivate
      command -v nanoplot

  • conda install/conda remove

    • Installs/removes packages in the active environment

  • conda list

    • Lists the packages installed in the active environment

    $ conda activate myenv
    $ conda install snakemake
    ... a lot of status output ...
    $ conda list
    ... many packages ...
    $ conda remove snakemake

    • Keep your base env clean! (only the package manager + its deps)
    • Generously create/remove environments for different tools/workflows!
  • conda env export/conda env create

    • Exports or creates an environment from a YAML file

      $ conda env export --no-builds > env.yaml
      $ conda env create --name=more-map-and-call --file=env.yaml
      $ head env.yaml
      name: map-and-call
      channels:
        - conda-forge
        - bioconda
        - defaults
      dependencies:
        - _libgcc_mutex=0.1
        - _openmp_mutex=4.5
        - bcftools=1.10.2
        - blis=0.7.0

Introducing mamba

  • Newer package manager called mamba
  • A reimplementation of conda
  • Compatible with conda
    • Installed beside conda in the base environment
  • Much faster than conda
conda activate base
conda install mamba

From now on: every time you see conda somewhere, replace it with mamba!

Sometimes it will complain; if so, you can always fall back to conda :)
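
For scripts that should run with or without mamba installed, the choice can be made once up front. A minimal sketch, assuming both tools accept the same subcommands as described above; `pick_package_manager` and the variable `pm` are hypothetical names.

```bash
# Hypothetical helper: prefer mamba when it is on PATH, otherwise use conda.
pick_package_manager() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

pm=$(pick_package_manager)
echo "using: $pm"

# Example use in a script:
#   "$pm" create --name=myenv python=3.8 numpy
```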

Common pitfalls

  • Wrong channel order
  • Installing packages in your base env
  • Manually manipulating $PYTHONPATH
  • Manually installed packages (i.e., not installed via conda/mamba)
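
Two of these pitfalls can be detected from the shell before installing anything. The sketch below is one possible pre-flight check, relying on `CONDA_DEFAULT_ENV`, which conda sets to the name of the active environment; `check_conda_pitfalls` is a hypothetical helper.

```bash
# Hypothetical pre-flight check for two of the pitfalls above.
check_conda_pitfalls() {
    local status=0
    # conda sets CONDA_DEFAULT_ENV to the name of the active environment.
    if [ "${CONDA_DEFAULT_ENV:-}" = "base" ]; then
        echo "warning: base is active; install into a dedicated environment instead"
        status=1
    fi
    # A non-empty PYTHONPATH can make Python import modules from outside the environment.
    if [ -n "${PYTHONPATH:-}" ]; then
        echo "warning: PYTHONPATH is set to '$PYTHONPATH'"
        status=1
    fi
    return "$status"
}

# Usage: check_conda_pitfalls && conda install snakemake
```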

Practicals

Practical 1

What are the most recent versions of samtools and Snakemake?

Practical 2

  • Let's create some new environments!
  • But first make sure that
    • conda is up to date

      ```bash
      $ conda activate base
      $ conda update --all
      $ conda --version
      ```

    • The correct channel order is in place

      ```bash
      $ conda config --show channels
      channels:
          - conda-forge
          - bioconda
          - defaults
      $ conda info
      ```
      

Practical 2.1

  • Create a new environment named "mapping" with bwa-mem2 and pysamstats
  • What versions of numpy and python got installed in it?

Practical 2.2

  • Create a new environment for Snakemake with mamba

Q&A

  • Questions?