
arcasHLA at UPPMAX

https://github.com/RabadanLab/arcasHLA

Version: 2022.08.18

Singularity container

The recipe is adapted from the original Dockerfile.

Singularity definition file
Bootstrap: docker                                                                                                                                    
From: ubuntu:18.04
                 
%labels          
  Author pmitev@gmail.com
                 
%environment     
  export LC_ALL=C
                 
%post            
  export DEBIAN_FRONTEND=noninteractive
  export LANG=C.UTF-8  
  export LC_ALL=C.UTF-8
                 
  export kallisto_version=0.44.0
  export samtools_version=1.9
  export bedtools_version=2.29.2
  export biopython_version=1.77
                 
  mkdir -p /tmp/apt    
  echo 'Dir::Cache "/tmp/apt";' > /etc/apt/apt.conf.d/singularity-cache.conf
                 
  apt-get update && \
  apt-get -y --no-install-recommends install \
    build-essential \
    cmake \
    automake \
    zlib1g-dev \
    libhdf5-dev \
    libnss-sss \
    curl \
    autoconf \
    bzip2 \
    python3-dev \
    python3-pip \
    python \
    pigz \
    git \
    libncurses5-dev \
    libncursesw5-dev \
    libbz2-dev \
    liblzma-dev \
    unzip
                 
  python3 -m pip install --upgrade pip setuptools 
  python3 -m pip install --upgrade numpy scipy pandas biopython==${biopython_version}
                 
  # install kallisto   
  mkdir -p /usr/bin/kallisto \
    && curl -SL https://github.com/pachterlab/kallisto/archive/v${kallisto_version}.tar.gz \
    | tar -zxvC /usr/bin/kallisto
                 
  mkdir -p /usr/bin/kallisto/kallisto-${kallisto_version}/build
  cd /usr/bin/kallisto/kallisto-${kallisto_version}/build && cmake ..
  cd /usr/bin/kallisto/kallisto-${kallisto_version}/ext/htslib && autoreconf
  cd /usr/bin/kallisto/kallisto-${kallisto_version}/build && make -j4
  cd /usr/bin/kallisto/kallisto-${kallisto_version}/build && make install
  
  # install samtools
  cd /usr/bin/     
  curl -SL https://github.com/samtools/samtools/releases/download/${samtools_version}/samtools-${samtools_version}.tar.bz2  > samtools-${samtools_version}.tar.bz2
  tar -xjvf samtools-${samtools_version}.tar.bz2 &&   cd /usr/bin/samtools-${samtools_version} && ./configure && make -j4 && make install
                   
  # install bedtools
  cd  /usr/bin         
  curl -SL https://github.com/arq5x/bedtools2/releases/download/v${bedtools_version}/bedtools-${bedtools_version}.tar.gz > bedtools-${bedtools_version}.tar.gz
  tar -xzvf bedtools-${bedtools_version}.tar.gz && cd /usr/bin/bedtools2 && make -j4 && ln -s /usr/bin/bedtools2/bin/bedtools /usr/bin/bedtools
           
           
  # git lfs            
  curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
  apt-get install -y git-lfs 
  git lfs install --system --skip-repo
          
          
  cd /opt              
  git clone --recursive https://github.com/RabadanLab/arcasHLA.git arcasHLA-master
                       
  rm /etc/apt/apt.conf.d/singularity-cache.conf
          
%runscript             
  if command -v "$SINGULARITY_NAME" > /dev/null 2>&1; then
    exec "$SINGULARITY_NAME" "$@"
  else
    echo "# ERROR !!! Command $SINGULARITY_NAME not found in the container" >&2
    exit 1
  fi

To build the container, you need a computer with Singularity installed and root access.

$ sudo singularity build arcasHLA.sif Singularity.arcasHLA |&  tee build.log

Alternatively, arrange to obtain an already built container file.
If you have the container, you can always check how it was built by running:

$ singularity inspect -d arcasHLA.sif

Running the tool - test

https://github.com/RabadanLab/arcasHLA#test

  • Bring the Singularity container to the computer you are going to use.
  • Keep in mind that Bianca does not have Internet access, so you cannot download databases with the tool, run pip install, etc.
  • The tool appears to rely on the directory structure of its GitHub repository, and on Bianca you cannot run git clone https://github.com/RabadanLab/arcasHLA.git. For convenience, the repository is already cloned inside the container, so you can instead copy it to your project folder as follows:
$ singularity exec arcasHLA.sif cp -r /opt/arcasHLA-master /project/folder/

Fetch IMGT/HLA database version 3.24.0:

# step into the folder you just copied
cd /project/folder/arcasHLA-master

# This step cannot be done on Bianca: run it on a computer with Singularity and Internet access, then bring the data to Bianca
$ singularity exec ../arcasHLA.sif ./arcasHLA reference --version 3.24.0
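
One way to bring the data to Bianca is to bundle the whole folder into a single archive together with a checksum, so the copy can be verified after the transfer. This is only a minimal sketch: it assumes the fetched reference ends up inside the arcasHLA-master folder, that GNU tar and sha256sum are available, and pack_for_transfer and the file names are hypothetical. The transfer itself is left out.

```shell
#!/usr/bin/env bash
# pack_for_transfer is a hypothetical helper, not part of arcasHLA or UPPMAX tooling.
# It bundles a prepared folder into a tarball and writes a checksum next to it.
pack_for_transfer() {
  local src_dir="$1"    # folder to bundle, e.g. /project/folder/arcasHLA-master
  local archive="$2"    # tarball to create, e.g. arcasHLA-master.tar.gz

  if [ ! -d "$src_dir" ]; then
    echo "ERROR: $src_dir is not a directory" >&2
    return 1
  fi

  # Archive relative to the parent so the tarball unpacks into a single folder.
  tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1

  # Store the checksum with a bare file name so it still verifies after the move.
  ( cd "$(dirname "$archive")" \
      && sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Example (paths are placeholders):
#   pack_for_transfer /project/folder/arcasHLA-master arcasHLA-master.tar.gz
#   ... transfer both files to Bianca, then on Bianca:
#   sha256sum -c arcasHLA-master.tar.gz.sha256 && tar -xzf arcasHLA-master.tar.gz
```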

Extract reads

$ singularity exec ../arcasHLA.sif ./arcasHLA extract test/test.bam -o test/output -t 8 -v

--------------------------------------------------------------------------------
[log] Date: 2022-08-18
[log] Sample: test
[log] Input file: test/test.bam
[log] Read type: paired-end
--------------------------------------------------------------------------------
[extract] Extracting reads from test/test.bam
[extract] indexing bam: 

        samtools index test/test.bam

[extract] Extracting chromosome 6: 

        samtools view -H -@8 test/test.bam -o /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.sam

[extract] Extracting chromosome 6: 

        samtools view -@8 -f 2 test/test.bam 6 >> /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.sam

[extract] Converting SAM to BAM: 

        samtools view -Sb -@8 /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.sam > /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.bam

[extract] Sorting bam: 

        samtools sort -n -@8 /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.bam -o /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.sorted.bam

        [bam_sort_core] merging from 0 files and 8 in-memory blocks...

[extract] Converting bam to fastq: 

        bedtools bamtofastq -i /tmp/arcas_0ab88919-13ad-4cb6-8959-f1f382fe1d1c/test.hla.sorted.bam -fq test/output/test.extracted.1.fq -fq2 test/output/test.extracted.2.fq

--------------------------------------------------------------------------------

Genotyping (no partial alleles)

$ singularity exec ../arcasHLA.sif ./arcasHLA genotype test/output/test.extracted.1.fq.gz test/output/test.extracted.2.fq.gz -g A,B,C,DPB1,DQB1,DQA1,DRB1 -o test/output -t 8 -v

# expected output in test/output/test.genotype.json
$ cat test/output/test.genotype.json
{"A": ["A*01:01:01", "A*03:01:01"], "B": ["B*39:01:01", "B*07:02:01"], "C": ["C*08:01:01", "C*01:02:01"], "DPB1": ["DPB1*14:01:01", "DPB1*02:01:02"], "DQA1": ["DQA1*02:01:01", "DQA1*05:03"], "DQB1": ["DQB1*06:09:01", "DQB1*02:02:01"], "DRB1": ["DRB1*10:01:01", "DRB1*14:02:01"]}
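
A quick sanity check on the result is to confirm that every gene passed with -g shows up as a key in the JSON file. The sketch below does this with grep only; check_genotype_genes is a hypothetical helper, and it assumes the keys are written as "GENE": exactly as in the output above. It does not validate the allele calls themselves.

```shell
#!/usr/bin/env bash
# check_genotype_genes is a hypothetical helper for illustration only.
# It checks that each requested gene appears as a key in the genotype JSON.
check_genotype_genes() {
  local json_file="$1"    # e.g. test/output/test.genotype.json
  local genes="$2"        # comma-separated list, same form as the -g option
  local gene missing=0

  if [ ! -r "$json_file" ]; then
    echo "ERROR: cannot read $json_file" >&2
    return 2
  fi

  # Split the comma-separated list into words and test each gene in turn.
  for gene in ${genes//,/ }; do
    if ! grep -q "\"$gene\":" "$json_file"; then
      echo "missing gene: $gene" >&2
      missing=1
    fi
  done

  # 0 if all genes were found, 1 if at least one is missing.
  return "$missing"
}

# Example:
#   check_genotype_genes test/output/test.genotype.json A,B,C,DPB1,DQB1,DQA1,DRB1
```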

And so on

Remember that Bianca has no Internet access, so the steps that fetch or update data must be done on a machine with Singularity and Internet access (no root required).
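
Once the reference data is in place, the extract and genotype commands shown above can be chained per sample. The wrapper below is a rough sketch, not an official workflow: run_sample and ARCAS_CMD are hypothetical names, the gene list is the one used in the test, and it assumes that extract names its output <sample>.extracted.1.fq.gz and <sample>.extracted.2.fq.gz, as it does for test.bam.

```shell
#!/usr/bin/env bash
# ARCAS_CMD is a hypothetical override point; by default it calls the tool
# through the container exactly as in the commands above.
# For a dry run that only prints the commands: ARCAS_CMD="echo arcasHLA"
ARCAS_CMD="${ARCAS_CMD:-singularity exec ../arcasHLA.sif ./arcasHLA}"

# run_sample is a hypothetical wrapper around the extract and genotype steps.
run_sample() {
  local bam="$1"              # input BAM, e.g. test/test.bam
  local outdir="$2"           # output folder, e.g. test/output
  local threads="${3:-8}"     # number of threads, defaults to 8
  local genes="A,B,C,DPB1,DQB1,DQA1,DRB1"
  local sample
  sample="$(basename "$bam" .bam)"

  # Step 1: extract reads (ARCAS_CMD is left unquoted on purpose so it splits into words).
  $ARCAS_CMD extract "$bam" -o "$outdir" -t "$threads" -v || return 1

  # Step 2: genotype, assuming the output naming seen in the test run.
  $ARCAS_CMD genotype \
    "$outdir/$sample.extracted.1.fq.gz" \
    "$outdir/$sample.extracted.2.fq.gz" \
    -g "$genes" -o "$outdir" -t "$threads" -v
}

# Example, run from inside the arcasHLA-master folder:
#   run_sample test/test.bam test/output 8
```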

Contacts:


tags: UPPMAX, SNIC