---
tags: 'Practicum SARS-CoV-2 Data Science, FU 2022'
---

Alignment_and_Phylogeny
===

###### tags: `bioinformatics` `sequencing` `workshop` `nanopore` `SARS-CoV-2`

Martin Hölzer, Robert Koch Institute, MF1 Bioinformatics

[toc]

## Basic setup

In this part, we will calculate a multiple sequence alignment (MSA) and a phylogenetic tree. We will do this on a Linux system using the command line interface, and use an online tool for tree visualization. The tools we will use are listed here and will be discussed in detail below:

* `MAFFT`
* `IQTree`
* `president`

__Install all tools__

```bash=
# configure some channels, this might already be done.
# basically helps to not explicitly type the channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

# now we create a new environment and install all tools
conda create -n tree president mafft iqtree jalview

# activate the env
conda activate tree
```

## Example data

```bash=
# get example data, a collection of different SARS-CoV-2 lineages, full genomes
wget --no-check-certificate https://osf.io/yzn5e/download -O example-data.tar.gz

# extract the archive
tar zxvf example-data.tar.gz
```

## Multiple sequence alignment

* `MAFFT`

```bash=
# first we need a multi-FASTA file, which we can for example generate
# by 'cat'ing together single FASTA files, like the ones in the example-data folder
cat *.fasta > all.fasta

# now we can calculate the alignment
mafft --thread 4 all.fasta > alignment.fasta
```

__Task:__ Check the `alignment.fasta`: do you see mismatches? Gaps? You can for example use `jalview` to look at the alignment! The tool is also installed in your conda env.

## Phylogenetic reconstruction

* `IQTree`

```bash=
# simple usage (there are many parameters though!)
iqtree -nt 4 -s alignment.fasta --prefix phylo

# first look at the output, scroll a bit to see a tree in ASCII format
cat phylo.iqtree

# the actual tree is stored in the so-called newick format:
cat phylo.treefile
```

## Tree visualization

* `IROKI`

Go to [https://www.iroki.net/](https://www.iroki.net/) and upload a tree file in `newick` format, e.g. `phylo.treefile`.
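
__Optional sanity check:__ Before uploading the tree, it can help to confirm that the alignment and the tree describe the same set of genomes. The following is a minimal sketch, not part of the original workflow: the helper functions `count_seqs`, `aln_lengths` and `count_tips` are hypothetical names introduced here for illustration. It assumes a plain newick file without commas inside quoted labels, so that the number of tips equals the number of commas plus one.

```bash
# count_seqs FILE: number of FASTA records (header lines start with '>')
count_seqs() {
    grep -c '^>' "$1"
}

# aln_lengths FILE: print the distinct sequence lengths in an aligned FASTA.
# In a valid alignment all sequences have the same length, so exactly one
# value should be printed.
aln_lengths() {
    awk '/^>/ { if (seq != "") print length(seq); seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print length(seq) }' "$1" | sort -un
}

# count_tips FILE: number of leaves in a newick tree.
# Assumption: no commas inside labels, so tips = commas + 1.
count_tips() {
    echo $(( $(tr -cd ',' < "$1" | wc -c) + 1 ))
}

# run the checks only if the files from the steps above exist
if [ -f alignment.fasta ] && [ -f phylo.treefile ]; then
    echo "sequences in alignment: $(count_seqs alignment.fasta)"
    echo "alignment length(s):    $(aln_lengths alignment.fasta | tr '\n' ' ')"
    echo "tips in tree:           $(count_tips phylo.treefile)"
fi
```

If the sequence count and the tip count differ, or more than one alignment length is reported, it is worth re-checking the input FASTA files before interpreting the tree.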