# Create a new version of the HLA reference for Kourami

###### tags: `c4lab`

Kourami https://github.com/Kingsford-Group/kourami

Lee, H., & Kingsford, C. Kourami: graph-guided assembly for novel human leukocyte antigen allele discovery. Genome Biology 19(16), 2018

I follow this tutorial: https://github.com/Kingsford-Group/kourami/blob/master/preprocessing.md#2-creating-kourami-hla-panel-and-merged-msas-from-another-version-release-of-imgthla-db-or-a-custom-version

## Prerequisites

You can already run Kourami successfully for HLA typing, i.e. the command below should work:

``` bash
java -jar kourami-0.9.6/build/Kourami.jar \
    -d ${kourami_index} ${sample}_on_KouramiPanel.bam -o ${sample}.kourami
```

Where

* `kourami-0.9.6/build/Kourami.jar` is downloaded from https://github.com/Kingsford-Group/kourami/releases/tag/v0.9.6
* `${sample}_on_KouramiPanel.bam` is created by following https://github.com/Kingsford-Group/kourami/blob/master/preprocessing.md
* `kourami_index` is the index path for Kourami. The default is `kourami/db` if you use `download_panel.sh`.

## Download HLA

### Latest version

You can download the data from the IMGT/HLA FTP site: http://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/

Find the alignment data `Alignments_Rel_3450.zip` for the latest version (e.g. 3.45.0 as of 2021/08/12) and `wmda/hla_nom_g.txt` (this file must be from the same version as the alignment data).

```
wget ftp://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/Alignments_Rel_3450.zip
unzip Alignments_Rel_3450.zip
wget ftp://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/wmda/hla_nom_g.txt
```

### Previous version

You can find older versions in the GitHub repo https://github.com/ANHIG/IMGTHLA by searching its branches.
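Judging only from the two releases named in this note (3.45.0 with `3450`, 3.42.0 with `3420`), the branch and archive names look like the release number with the dots removed. The sketch below builds the names from a release number under that assumption; `imgt_release_tag` and `imgt_alignment_url` are hypothetical helper names, not part of Kourami or IMGT/HLA, and the pattern may not hold for every release.

```bash
# Hypothetical helper: drop the dots from a release number (3.42.0 -> 3420).
# Assumption: branch/archive names follow this pattern, as in the two
# releases mentioned in this note.
imgt_release_tag() {
    printf '%s\n' "$1" | tr -d '.'
}

# Hypothetical helper: build the GitHub download URL for the alignment archive,
# following the URL layout used for the 3420 branch in this note.
imgt_alignment_url() {
    tag=$(imgt_release_tag "$1")
    printf 'https://github.com/ANHIG/IMGTHLA/raw/%s/Alignments_Rel_%s.zip\n' "$tag" "$tag"
}
```

For example, `wget "$(imgt_alignment_url 3.42.0)"` would fetch the archive for release 3.42.0 if the branch exists.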
In this example, I want to use `3.42.0`, so I choose the `3420` branch (https://github.com/ANHIG/IMGTHLA/tree/3420), then find and download the alignment data and `hla_nom_g.txt`.

```
wget https://github.com/ANHIG/IMGTHLA/raw/3420/Alignments_Rel_3420.zip
unzip Alignments_Rel_3420.zip
wget https://github.com/ANHIG/IMGTHLA/raw/3420/wmda/hla_nom_g.txt
```

## Build Index

(I use 3.42.0 as the example.)

```
# Convert the IMGT/HLA alignments; the result is written to tmp/v3.42.0
java -cp kourami-0.9.6/build/Kourami.jar FormatIMGT Alignments_Rel_3420 v3.42.0 tmp
mv tmp/v3.42.0 mydb
# Merge the per-gene sequences with the decoys into one reference
cat mydb/*.merged.fa resources/HLA_decoys.fa | gzip > mydb/All_FINAL_with_Decoy.fa.gz
# The G-group file must match the alignment version
cp hla_nom_g.txt mydb
bwa index mydb/All_FINAL_with_Decoy.fa.gz
```

If Kourami throws this error, it is a [known bug](https://github.com/Kingsford-Group/kourami/issues/19):

```
REF SEQ names differs : (nuc):Y*01:01 (gen):Y
java.lang.NullPointerException: Cannot invoke "LogHandler.appendln(String)" because "HLA.log" is null
	at Sequence.processBlock(Sequence.java:554)
	at Sequence.<init>(Sequence.java:498)
	at MergeMSFs.mergeAndAdd(MergeMSFs.java:383)
	at MergeMSFs.mergeAndAdd(MergeMSFs.java:372)
	at MergeMSFs.merge(MergeMSFs.java:298)
	at FormatIMGT.processGene(FormatIMGT.java:199)
	at FormatIMGT.main(FormatIMGT.java:100)
```

You can modify `Alignments_Rel_3420/Y_gen.txt` to remove the leading deletions **if Y is NOT your target HLA**.

```
 gDNA              0
                   |
 Y*01:01           | ATGGCGGTC
 Y*02:01           | ---------
 Y*03:01           | ---------
```

## Try it

``` bash
kourami_index='mydb'
java -jar kourami-0.9.6/build/Kourami.jar \
    -d ${kourami_index} ${sample}_on_KouramiPanel.bam -o ${sample}.kourami
```
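Before running Kourami against the new index, it can help to confirm that the build steps left all the expected files in place. The following is a minimal sketch, assuming the file names used in this note plus the five files that `bwa index` writes next to its input (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`). `check_kourami_index` is a hypothetical helper, not part of Kourami, and it only checks that the files exist, not that their contents are valid.

```bash
# Hypothetical helper: report files missing from a Kourami index directory
# built with the steps in this note. Returns 0 if everything is present.
check_kourami_index() {
    db="$1"
    missing=0

    # Files created directly by the build steps
    for f in hla_nom_g.txt All_FINAL_with_Decoy.fa.gz; do
        if [ ! -e "$db/$f" ]; then
            echo "missing: $db/$f" >&2
            missing=1
        fi
    done

    # Files written by `bwa index` next to the FASTA
    for ext in amb ann bwt pac sa; do
        if [ ! -e "$db/All_FINAL_with_Decoy.fa.gz.$ext" ]; then
            echo "missing: $db/All_FINAL_with_Decoy.fa.gz.$ext" >&2
            missing=1
        fi
    done

    # At least one merged per-gene FASTA from FormatIMGT
    if ! ls "$db"/*.merged.fa >/dev/null 2>&1; then
        echo "missing: $db/*.merged.fa" >&2
        missing=1
    fi

    return "$missing"
}
```

Usage: `check_kourami_index mydb && echo "index looks complete"`.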