I am trying to make fuller use of the annotations that Bakta produces. Among them are GO annotations, which could be used to build an overview profile of a bacterium. The challenge is that GO annotations are organized as a hierarchy (strictly, a directed acyclic graph), so the terms do not fall into a flat set of comparable categories.

![](https://hackmd.io/_uploads/rkZs9Bxh3.png)

Inspired by the [GO ribbons](http://geneontology.org/docs/ribbon), I decided to use the same high-level categories to profile bacteria.

```{R}
library(tibble)

# High-level GO ribbon categories: GO ID (without the "GO:" prefix) and label.
# The two vectors are parallel and must stay in the same order.
GOHLMeta <- tibble(
  GO = c(
    # Molecular function
    "0003674", "0003824", "0030234", "0038023", "0005102", "0005215",
    "0005198", "0008092", "0003677", "0003723", "0003700", "0008134",
    "0140110", "0036094", "0046872", "0030246", "0097367", "0008289",
    "0003674_Other",
    # Biological process
    "0008150", "0007049", "0016043", "0051234", "0008283", "0030154",
    "0008219", "0032502", "0000003", "0002376", "0050877", "0050896",
    "0023052", "0006259", "0016070", "0019538", "0005975", "1901135",
    "0006629", "0042592", "0009056", "0007610", "0008150_Other",
    # Cellular component
    "0005575", "0005576", "0005886", "0045202", "0005911", "0042995",
    "0031410", "0005768", "0005773", "0005794", "0005783", "0005829",
    "0005739", "0005634", "0005694", "0005856", "0032991", "0005575_Other"
  ),
  GOname = c(
    # Molecular function
    "all molecular function", "catalytic activity",
    "enzyme regulator activity", "signaling receptor activity",
    "signaling receptor binding", "transporter activity",
    "structural molecule activity", "cytoskeletal protein binding",
    "dna binding", "rna binding",
    "dna-binding transcription factor activity",
    "transcription factor binding", "transcription regulator activity",
    "small molecule binding", "metal ion binding", "carbohydrate binding",
    "carbohydrate derivative binding", "lipid binding",
    "other molecular function",
    # Biological process
    "all biological process", "cell cycle",
    "cellular component organization", "establishment of localization",
    "cell population proliferation", "cell differentiation", "cell death",
    "developmental process", "reproduction", "immune system process",
    "nervous system process", "response to stimulus", "signaling",
    "dna metabolic process", "rna metabolic process",
    "protein metabolic process", "carbohydrate metabolic process",
    "carbohydrate derivative metabolic process", "lipid metabolic process",
    "homeostatic process", "catabolic process", "behavior",
    "other biological process",
    # Cellular component
    "all cellular component", "extracellular region", "plasma membrane",
    "synapse", "cell junction", "cell projection", "cytoplasmic vesicle",
    "endosome", "vacuole", "golgi apparatus", "endoplasmic reticulum",
    "cytosol", "mitochondrion", "nucleus", "chromosome", "cytoskeleton",
    "protein-containing complex", "other cellular component"
  )
)
```

`0003674_Other` is a catch-all category for GO terms that fall under the top-level term `0003674` (molecular function) but match none of the other molecular function categories. `0008150_Other` and `0005575_Other` play the same role for biological process and cellular component.
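The catch-all rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the method used in the rest of the analysis: `assign_ribbon` is a hypothetical helper, the category vector is a small subset of the molecular function IDs, and the ancestor lists are hand-written toy data (the `toy_term_*` entries are placeholders, not real GO IDs). In practice the ancestors of each term would have to come from the ontology itself.

```{R}
# Toy subset of the molecular function ribbon categories.
mf_root <- "0003674"
mf_categories <- c("0003824", "0005215", "0003677")

# Hypothetical helper. `ancestors` holds the ID of a GO term together with
# the IDs of all its ancestors. Returns every ribbon category found among
# them, "<root>_Other" if only the root matches, or an empty vector if the
# term belongs to a different aspect.
assign_ribbon <- function(ancestors, categories, root) {
  hits <- intersect(categories, ancestors)
  if (length(hits) > 0) {
    return(hits)
  }
  if (root %in% ancestors) {
    return(paste0(root, "_Other"))
  }
  character(0)
}

# Hand-written toy ancestor lists (assumptions, not real lookups).
catalytic_term    <- c("toy_term_1", "0003824", "0003674")
unclassified_term <- c("toy_term_2", "0003674")
process_term      <- c("toy_term_3", "0008150")

assign_ribbon(catalytic_term, mf_categories, mf_root)     # "0003824"
assign_ribbon(unclassified_term, mf_categories, mf_root)  # "0003674_Other"
assign_ribbon(process_term, mf_categories, mf_root)       # character(0)
```

Because GO terms can have several parents, a single term may match more than one category, which is why the helper returns all hits rather than only the first.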