---
tags: Craig cyanos
---

# Blindly starting to feel out the landscape of looking at things this way

[toc]

## Semi-randomly looking at CbiJ, EVE, and phylogenomic trees

### Phylogenomic tree

[Interactive version here](https://itol.embl.de/tree/7184244121410631600459049)

<a href="https://i.imgur.com/jIaM5vR.png"><img src="https://i.imgur.com/jIaM5vR.png"></a>

### CbiJ gene

**CbiJ in tree ordination-space relative to phylogenomic tree**

CbiJ is involved in B12 biosynthesis.

![](https://i.imgur.com/Uk9t8Yn.png)

**CbiJ tree** [(interactive version)](https://itol.embl.de/tree/13822922232332501612563890#)

<a href="https://i.imgur.com/uSDUJqs.png"><img src="https://i.imgur.com/uSDUJqs.png"></a>

### EVE gene

**EVE in tree ordination-space relative to phylogenomic tree**

It's less clear what EVE does functionally, but it is likely involved in RNA binding in some capacity.

![](https://i.imgur.com/pO81KoG.png)

**EVE tree** [(interactive version)](https://itol.embl.de/tree/13822922232332841612563909#)

<a href="https://i.imgur.com/wqCf9X5.png"><img src="https://i.imgur.com/wqCf9X5.png"></a>

### Some alignment and tree metrics

|Metric|CbiJ|EVE|Phylogenomic|
|:-----|----|---|------------|
|Alignment length|210|147|91,999|
|Alignment length (no gaps)|142|138|47,922|
|Total tree length|13.9411|6.9057|6.2097|
|Tree length / align. len. ¯\\\_(ツ)\_/¯ | 0.0664 | 0.0470 |0.0000674|
|Tree length / ungapped-align. len. ¯\\\_(ツ)\_/¯ | 0.0982 | 0.0500 | 0.000130 |
|Median Shannon uncertainty* (of align. columns) |0.3522|0.1520|0.1305|

\* calculated with this function: http://scikit-bio.org/docs/0.5.3/generated/skbio.alignment.TabularMSA.conservation.html

**Density plots of Shannon uncertainties (of align. columns)**

<a href="https://i.imgur.com/HQcvhZc.png"><img src="https://i.imgur.com/HQcvhZc.png"></a>

## Some thoughts

> * Topology-wise, the CbiJ tree seems to track the phylogenomic tree more closely than the EVE tree does, keeping a monophyletic clade of Prochlorococcus (red) and having Prochlorococcus diverge in a sub-clade with respect to Syn RCC307 (which is believed to be the case).
>
> * Since the tree ordination-space currently includes both distance and topology, the greater distances in the CbiJ tree (total tree length 13.94 vs. 6.91 for EVE) would be expected to push it relatively further away in the ordination plot. And this is indeed the case.
>
> * Sarah asked at some point if I thought branch length should be incorporated, and I did think that, and maybe still do. But maybe this is suggesting that branch length shouldn't be included, or that it should be normalized for alignment length somehow? Though normalizing for alignment length the simplistic way I did above maybe isn't sufficient.

---

## Programs/code used

```bash
# for shannon uncertainty of multiple-sequence alignments
mamba install -n bit -c conda-forge -c bioconda -c defaults -c astrobiomike bit=1.8.23
conda activate bit

bit-calc-variation-in-msa -h
# usage: bit-calc-variation-in-msa [-h] -i INPUT_ALIGNMENT_FASTA
#                                  [-g {nan,ignore,error,include}]
#                                  [-t {DNA,Protein}] [-o OUTPUT_TSV]

# This script takes an alignment in fasta format as input and returns the
# Shannon uncertainty values for each column using:
# http://scikit-bio.org/docs/0.5.3/generated/skbio.alignment.TabularMSA.conservation.html.
# In output "variation" column: 0 is same character in all sequences for that
# position (highest conservation); 1 is equal probability of any character
# (greatest variability). "Conservation" column is inverse. As written, any
# ambiguous bases or residues are converted to gap characters. For version info,
# run `bit-version`.

bit-calc-variation-in-msa -i run_files/individual_alignments/EVE_aln.faa -o EVE-aln-variation.tsv
bit-calc-variation-in-msa -i run_files/individual_alignments/CbiJ_aln.faa -o CbiJ-aln-variation.tsv
bit-calc-variation-in-msa -i Aligned_SCGs_mod_names.faa -o phylogenomic-aln-variation.tsv

conda deactivate

# for alignment lengths and total tree lengths
mamba create -n phykit -c jlsteenwyk phykit=0.1.1
conda activate phykit

phykit alng run_files/individual_alignments/EVE_aln.faa
phykit alng run_files/individual_alignments/CbiJ_aln.faa
phykit alng Aligned_SCGs_mod_names.faa

phykit tree_len individual_trees/EVE.tre
phykit tree_len individual_trees/CbiJ.tre
phykit tree_len target-picos-gtotree-picocyano-HMM-1.5.45.tre
```
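
The two "Tree length / align. len." rows in the metrics table are plain ratios of the `phykit tree_len` and `phykit alng` values. A minimal sketch of that arithmetic is below. The `norm_tree_len` helper is a hypothetical name introduced here for illustration (it is not part of phykit or bit), and the numbers are copied from the table rather than parsed from program output.

```bash
# Hypothetical helper (not part of phykit or bit): divide a total tree length
# by an alignment length and print the ratio to four decimal places.
norm_tree_len() {
    awk -v tree_len="$1" -v aln_len="$2" 'BEGIN { printf "%.4f\n", tree_len / aln_len }'
}

# CbiJ: total tree length over full and ungapped alignment lengths (table values)
norm_tree_len 13.9411 210
norm_tree_len 13.9411 142

# EVE: same two ratios
norm_tree_len 6.9057 147
norm_tree_len 6.9057 138
```

Four decimal places is too coarse for the phylogenomic ratios, so the `printf` format would need more digits there. This is only the simplistic per-column normalization questioned in the thoughts above, not a suggestion that it is the right one.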