---
tags: prions-fo-life
title: Parsing plaac-positive KO annotations based on specific GO term
---

# Parsing plaac-positive KO annotations based on specific GO term

Working with data from 1-Jun-2021, UniProt "Standard" proteomes only. The files used below are from [this drive](https://drive.google.com/drive/u/0/folders/1AO0Q_nHx4iNA1loP8LO8CGmSUloJ9lVH).

---

> **NOTE**
> The code below assumes the directory structure in [the google drive](https://drive.google.com/drive/u/0/folders/1AO0Q_nHx4iNA1loP8LO8CGmSUloJ9lVH), with commands being run from within the sub-directory "[parsing-KO-annotations-for-specific-GO-terms](https://drive.google.com/drive/u/0/folders/12Y7yPOrgXDSUZGCYpC2_O-kOuCoetXzj)". This program relies on several helper scripts which are located at that drive. **The files required to run what's below for any GO term will be downloaded and put in the appropriate places automatically the first time it is run (if they aren't present there already).** They are not large, roughly 20 MB in total.

---

[toc]

---

## Purpose

We give the program a specific GO term, and we get back info and summaries of KO annotations for all the plaac-positive proteins that were associated with the given GO term.

E.g., we ask for GO:0009055, we get some more detailed files, and also a summary table like this "GO_0009055/GO_0009055-Combined-KO-annots-summary.tsv":

<a href="https://i.imgur.com/FgQGzJx.png"><img src="https://i.imgur.com/FgQGzJx.png"></a>

This shows counts of the KO annotations that were assigned to all the plaac-positive proteins that were associated with the GO term GO:0009055.

---

## Environment setup for Tom

**You only need to do this part once, Tom. After doing this once, you can start at the [Parsing KO annotations based on GO terms section](https://hackmd.io/@astrobiomike/parsing-plaac-positive-KO-annotations-based-on-GO-terms#Parsing-KO-annotations-based-on-GO-terms) below.**

My `bit` package is already installed on your computer, so you just need to run this line the first time you do this (won't hurt anything if you do it again):

```bash
conda install -y -c conda-forge gdown
```

Then we're going to make a directory for this work and for holding some of the other things (so it's in the same setup as we have on the google drive). Copy and paste these into your terminal (these also only need to be done once):

```bash
mkdir -p ~/3-domain-work-June-2021/parsing-KO-annotations-for-specific-GO-terms/
cd ~/3-domain-work-June-2021/parsing-KO-annotations-for-specific-GO-terms/
```

And we have to download the main program script with the following (also only once):

```bash
gdown 'https://drive.google.com/uc?id=19CDtjzSSXGD_RuGNwO0TqHQrCUZBpX1R'
```

---

## Environment setup for those of you not fortunate enough to be Tom

The only difference here is that we are making a new conda environment. Be sure to activate it before running anything below.

```bash
conda install -c conda-forge mamba
```

```bash
mamba create -n bit -c conda-forge -c bioconda -c defaults -c astrobiomike bit=1.8.42 gdown
conda activate bit
```

Now downloading the starting script:

```bash
gdown 'https://drive.google.com/uc?id=19CDtjzSSXGD_RuGNwO0TqHQrCUZBpX1R'
```

---

## Parsing KO annotations based on GO terms

There are many helper scripts wrapped into this one program. All needed files and helper scripts will be downloaded the first time it's run (they are all hosted at the [google drive](https://drive.google.com/drive/u/0/folders/12Y7yPOrgXDSUZGCYpC2_O-kOuCoetXzj) noted above).

This example is done with GO:0009055, but would be the same for any GO term.

```bash
# making sure we are in the right place (may be different if you are not Tom!)
cd ~/3-domain-work-June-2021/parsing-KO-annotations-for-specific-GO-terms/

# running the program for GO:0009055
bash get-KO-info-for-GO-term.sh GO:0009055
```

The first time, it will need to set some things up and will ask for confirmation. Type yes and hit enter unless there is some reason you don't want to move forward.

When it's done, there will be a new directory called "GO_0009055" that holds some working files with more details, and a main summary file "GO_0009055/GO_0009055-Combined-KO-annots-summary.tsv":

<a href="https://i.imgur.com/FgQGzJx.png"><img src="https://i.imgur.com/FgQGzJx.png"></a>

---
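If you want a quick check on whether a summary table is already there, something like the sketch below might help. It assumes the output naming seen in the GO:0009055 example (the colon swapped for an underscore) holds for other GO terms too, which is only inferred from that one example. The helper function `summary_path_for` is a made-up name for illustration and is not part of the downloaded program.

```bash
# Build the expected summary-table path for a GO term.
# ASSUMPTION: naming follows the GO:0009055 example, i.e. "GO:0009055" -> "GO_0009055".
summary_path_for() {
    dir_name="$(printf '%s' "$1" | tr ':' '_')"
    printf '%s/%s-Combined-KO-annots-summary.tsv\n' "$dir_name" "$dir_name"
}

# Add any other GO terms of interest to this list.
for go_term in GO:0009055; do
    summary="$(summary_path_for "$go_term")"
    if [ -s "$summary" ]; then
        # Summary exists and is not empty, so preview its first few lines.
        echo "$go_term: found $summary"
        head -n 5 "$summary"
    else
        echo "$go_term: no summary yet; run: bash get-KO-info-for-GO-term.sh $go_term"
    fi
done
```

Run it from within the "parsing-KO-annotations-for-specific-GO-terms" directory, since the paths it builds are relative to wherever you are. It only looks for existing output and never runs the main program itself, so it won't trigger the first-run setup prompt.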