---
tags: anvio
title: Anvi'o pangenomics with GenBank annotations
---
# Anvi'o pangenomics with GenBank annotations
[toc]
## Conda envs
### bit
We use [bit](https://github.com/AstrobioMike/bioinf_tools#bioinformatics-tools-bit) to download things from NCBI and make sure the LOCUS names in any input GenBank files won't cause problems:
```bash
mamba create -y -n bit -c conda-forge -c bioconda -c defaults -c astrobiomike bit=1.8.47
```
### Anvi'o
Anvi'o 7.1 was installed as described here: https://osf.io/8sy2a/wiki/3.%20Pangenomics/
## Getting GenBank references
Starting from a file holding the NCBI assembly accessions we want, e.g.:
```bash
printf "GCF_000013425.1\nGCA_006094375.1\n" > target-refs.txt
cat target-refs.txt
```
```
GCF_000013425.1
GCA_006094375.1
```
Downloading their GenBank files:
```bash
conda activate bit
bit-dl-ncbi-assemblies -w target-refs.txt -f genbank -j 2
gunzip *.gb.gz
```
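Before moving on, it can help to confirm that every accession in the list actually produced a non-empty GenBank file. The following is a minimal sketch, assuming the downloaded files are named `<accession>.gb` (as they are in the cleaning step of this page). `check_downloads` is a hypothetical helper written for this page, not part of bit.

```bash
# Report any accession in the list that has no non-empty <accession>.gb file
# in the current directory. Returns non-zero if anything is missing.
# NOTE: check_downloads is a hypothetical helper for this page, not a bit command.
check_downloads() {
    list="$1"
    missing=0
    while IFS= read -r acc; do
        [ -n "$acc" ] || continue              # skip blank lines
        if [ ! -s "${acc}.gb" ]; then          # -s: file exists and is not empty
            echo "missing: ${acc}.gb"
            missing=$((missing + 1))
        fi
    done < "$list"
    [ "$missing" -eq 0 ]
}

# usage: check_downloads target-refs.txt
```

If the function prints nothing and returns 0, every accession in `target-refs.txt` has a matching file.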
## Putting all GenBank files (ours and refs) into one place
```bash
mkdir genbank-files
mv *.gb genbank-files
# the cleaning commands below are run from inside this directory
cd genbank-files
```
## Cleaning LOCUS names just to be sure they don't cause problems later
```bash
bit-genbank-locus-clean-slate -i GCF_000013425.1.gb -w GCF_000013425.1 -o GCF_000013425.1-clean.gb
# renaming back to original so easier to work with
mv GCF_000013425.1-clean.gb GCF_000013425.1.gb
bit-genbank-locus-clean-slate -i GCA_006094375.1.gb -w GCA_006094375.1 -o GCA_006094375.1-clean.gb
mv GCA_006094375.1-clean.gb GCA_006094375.1.gb
```
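With more than a couple of genomes, the same clean-and-rename step can be written as a loop. This is a sketch only: it reuses the `-i`, `-w`, and `-o` flags exactly as shown above and assumes each file is named `<accession>.gb` in the current directory. `clean_all_loci` and its optional dry-run argument are hypothetical conveniences, not part of bit.

```bash
# Clean the LOCUS names of every GenBank file in the current directory,
# then rename each cleaned file back to its original name.
# NOTE: clean_all_loci is a hypothetical helper for this page, not a bit command.
# Pass "echo" as the first argument for a dry run that only prints the commands.
clean_all_loci() {
    dry_run="${1:-}"
    for gb in *.gb; do
        [ -e "$gb" ] || continue               # glob matched nothing
        id="${gb%.gb}"                         # e.g. GCF_000013425.1
        $dry_run bit-genbank-locus-clean-slate -i "$gb" -w "$id" -o "${id}-clean.gb" \
            && $dry_run mv "${id}-clean.gb" "$gb"
    done
}

clean_all_loci echo   # dry run; call with no argument to actually run the commands
```

The dry run is a cheap way to check the generated file names before anything is overwritten.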
Done with bit now:
```bash
conda deactivate
```
## Anvi'o
Installed as described here: https://osf.io/8sy2a/wiki/3.%20Pangenomics/
> **NOTE**
> This page is the template for what will be a new anvi'o tutorial. It is not done yet; following along from [here](https://hackmd.io/@astrobiomike/ISS-Staph-paper-pangenomics-notes#Processing-each-genome-into-contigs-and-profile-dbs) should help for now.
```bash
conda activate anvio
```
### Making input files for anvi'o from the GenBank files
```bash
cd ../
# ignoring that the annotation version will differ between genomes annotated with JCVI's PGAP and NCBI's PGAP; not important here
mkdir input-files-for-anvio
anvi-script-process-genbank -i genbank-files/OBSA1.gb -O input-files-for-anvio/OBSA1
anvi-script-process-genbank -i genbank-files/OBSA2.gb -O input-files-for-anvio/OBSA2
```
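To process every genome, references included, the per-file command can be looped over the whole directory. The sketch below assumes the GenBank files sit in `genbank-files/` with a `.gb` extension and uses only the `-i` and `-O` flags shown on this page. `process_all_genbank` and its dry-run argument are hypothetical conveniences, not part of anvi'o.

```bash
# Run anvi-script-process-genbank on every .gb file in a directory,
# writing outputs with the prefix <out_dir>/<file name without .gb>.
# NOTE: process_all_genbank is a hypothetical helper for this page, not an anvi'o program.
# Pass "echo" as the third argument for a dry run that only prints the commands.
process_all_genbank() {
    in_dir="$1"
    out_dir="$2"
    dry_run="${3:-}"
    $dry_run mkdir -p "$out_dir"
    for gb in "$in_dir"/*.gb; do
        [ -e "$gb" ] || continue               # glob matched nothing
        id="$(basename "$gb" .gb)"             # e.g. OBSA1 or GCF_000013425.1
        $dry_run anvi-script-process-genbank -i "$gb" -O "$out_dir/$id"
    done
}

process_all_genbank genbank-files input-files-for-anvio echo   # dry run; drop "echo" to run
```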