---
tags: cov-irt
---

# Combining UniProt counts

[toc]

## Overview
This script takes tables that look like this:

```
1_SRR10903401    uniprot
11500            -
3403             P0C6V9
1244             P0C6W2
1191             E0XIZ2
...

1_SRR10903402    uniprot
22007            -
13805            P0C6V9
4961             E0XIZ2
4676             P0C6W2
...
```

And combines them into one like this:

```
uniprot          1_SRR10903401    1_SRR10903402
Not annotated    11500            22007
P0C6V9           3403             13805
P0C6W2           1244             4676
E0XIZ2           1191             4961
...
```

## Installing [covirt-micro](https://github.com/AstrobioMike/CoV-IRT-Micro) package in a conda environment

```bash
conda create -y -n covirt-micro -c conda-forge -c bioconda -c defaults -c astrobiomike covirt-micro
conda activate covirt-micro
```

## Getting example data

```bash
curl -L -o 01_SRR10903401_uniprot_id_counts.tsv https://osf.io/hmnyk/download
curl -L -o 01_SRR10903402_uniprot_id_counts.tsv https://osf.io/629fs/download
```

## Combining
The script takes the input tables as a space-delimited list, so it can be called like so:

```bash
cov-combine-uniprot-ID-counts -i 01_SRR10903401_uniprot_id_counts.tsv 01_SRR10903402_uniprot_id_counts.tsv \
                              -o combined-UniProt-counts.tsv
```

Or with wildcards like so:

```bash
cov-combine-uniprot-ID-counts -i *.tsv -o combined-UniProt-counts.tsv
```

See `cov-combine-uniprot-ID-counts -h` or the top of the script file itself for more info.

<br>

> **NOTE**
> As currently written, the script retains the sample identifier found in the first column header of each input file. If we want to be able to manipulate this, let me know 🙂
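
## Sketch of the merge logic
To make the transformation shown in the Overview concrete, here is a rough `awk` sketch of the same kind of merge. It is only an illustration and is not the implementation of `cov-combine-uniprot-ID-counts`. It assumes tab-delimited input with the count in the first column and the UniProt ID in the second, and it makes two choices the text above doesn't confirm: IDs are kept in first-seen order, and an ID missing from a sample is filled with `0`. The file names (`sketch_sample1.tsv`, etc.) and the `combine_counts` function are hypothetical, and the data are just the rows shown in the Overview.

```shell
# Illustrative sketch only: NOT the actual cov-combine-uniprot-ID-counts code.

# Build two tiny tab-delimited inputs from the rows shown in the Overview.
printf '1_SRR10903401\tuniprot\n11500\t-\n3403\tP0C6V9\n1244\tP0C6W2\n1191\tE0XIZ2\n' > sketch_sample1.tsv
printf '1_SRR10903402\tuniprot\n22007\t-\n13805\tP0C6V9\n4961\tE0XIZ2\n4676\tP0C6W2\n' > sketch_sample2.tsv

# Hypothetical helper: merge any number of per-sample tables into one wide table.
combine_counts() {
    awk -F'\t' -v OFS='\t' '
        FNR == 1 {                      # first line of each file holds the sample name
            samples[++n] = $1
            next
        }
        {
            # "-" marks reads without a UniProt annotation
            id = ($2 == "-") ? "Not annotated" : $2
            if (!(id in seen)) {        # remember IDs in first-seen order
                seen[id] = 1
                order[++m] = id
            }
            counts[id, n] = $1          # count for this ID in the current sample
        }
        END {
            line = "uniprot"
            for (i = 1; i <= n; i++) line = line OFS samples[i]
            print line
            for (j = 1; j <= m; j++) {
                line = order[j]
                for (i = 1; i <= n; i++) {
                    key = order[j] SUBSEP i
                    # assumption: an ID absent from a sample is reported as 0
                    line = line OFS ((key in counts) ? counts[key] : 0)
                }
                print line
            }
        }
    ' "$@"
}

combine_counts sketch_sample1.tsv sketch_sample2.tsv > sketch_combined.tsv
cat sketch_combined.tsv
```

Running this on the two small inputs should reproduce the combined table from the Overview (minus the elided rows). How the real script orders rows or handles IDs missing from a sample may differ, so check `cov-combine-uniprot-ID-counts -h` rather than relying on this sketch.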