---
tags: cov-irt
---
# Combining UniProt counts
[toc]
## Overview
The `cov-combine-uniprot-ID-counts` script takes per-sample UniProt ID count tables that look like this:
```
1_SRR10903401 uniprot
11500 -
3403 P0C6V9
1244 P0C6W2
1191 E0XIZ2
...
1_SRR10903402 uniprot
22007 -
13805 P0C6V9
4961 E0XIZ2
4676 P0C6W2
...
```
And combines them into a single table with one column per sample, where the `-` row is relabeled `Not annotated`, like this:
```
uniprot 1_SRR10903401 1_SRR10903402
Not annotated 11500 22007
P0C6V9 3403 13805
P0C6W2 1244 4676
E0XIZ2 1191 4961
...
```
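To make the reshaping concrete, here is a minimal Python sketch of the same kind of merge. It is not the script's actual code. It assumes the inputs are tab-delimited, that `-` marks unannotated reads, that rows are kept in first-seen order, and that an ID missing from a sample is reported as 0. The function names `read_counts` and `combine` are made up for illustration.

```python
import csv
import io


def read_counts(handle):
    """Read one per-sample table; return (sample_name, {uniprot_id: count})."""
    reader = csv.reader(handle, delimiter="\t")
    sample_name = next(reader)[0]  # first column header is the sample identifier
    counts = {}
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        count, uniprot_id = row[0], row[1]
        # Assumption: "-" marks reads with no UniProt annotation
        if uniprot_id == "-":
            uniprot_id = "Not annotated"
        counts[uniprot_id] = int(count)
    return sample_name, counts


def combine(tables):
    """Merge (sample_name, counts) pairs into a header plus rows, one column per sample."""
    ids = {}  # dict used as an ordered set: IDs in first-seen order
    for _, counts in tables:
        for uniprot_id in counts:
            ids.setdefault(uniprot_id, None)
    header = ["uniprot"] + [name for name, _ in tables]
    # Assumption: an ID absent from a sample is reported as 0
    rows = [[uid] + [counts.get(uid, 0) for _, counts in tables] for uid in ids]
    return header, rows


# Tiny in-memory stand-ins for two input files (values taken from the tables above)
sample_a = io.StringIO(
    "1_SRR10903401\tuniprot\n11500\t-\n3403\tP0C6V9\n1244\tP0C6W2\n1191\tE0XIZ2\n"
)
sample_b = io.StringIO(
    "1_SRR10903402\tuniprot\n22007\t-\n13805\tP0C6V9\n4961\tE0XIZ2\n4676\tP0C6W2\n"
)

header, rows = combine([read_counts(sample_a), read_counts(sample_b)])
print("\t".join(header))
for row in rows:
    print("\t".join(str(value) for value in row))
```

Keeping each sample as a dictionary keyed by UniProt ID means the inputs do not need to list IDs in the same order, which matters here because the two example tables rank `P0C6W2` and `E0XIZ2` differently.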
## Installing the [covirt-micro](https://github.com/AstrobioMike/CoV-IRT-Micro) package in a conda environment
```bash
conda create -y -n covirt-micro -c conda-forge -c bioconda -c defaults -c astrobiomike covirt-micro
conda activate covirt-micro
```
## Getting example data
```bash
curl -L -o 01_SRR10903401_uniprot_id_counts.tsv https://osf.io/hmnyk/download
curl -L -o 01_SRR10903402_uniprot_id_counts.tsv https://osf.io/629fs/download
```
## Combining
The script takes the input tables as a space-delimited list after `-i`, so it can be called like so:
```bash
cov-combine-uniprot-ID-counts -i 01_SRR10903401_uniprot_id_counts.tsv 01_SRR10903402_uniprot_id_counts.tsv \
-o combined-UniProt-counts.tsv
```
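Or with a wildcard, like so (the pattern is narrowed to the input file names so that a `combined-UniProt-counts.tsv` already sitting in the same directory is not picked up as an input):
```bash
cov-combine-uniprot-ID-counts -i *_uniprot_id_counts.tsv -o combined-UniProt-counts.tsv
```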
See `cov-combine-uniprot-ID-counts -h` or the top of the script file itself for more info.
<br>
> **NOTE**
> As currently written, the script uses the first column header of each input file as that sample's identifier in the combined table. If we want to be able to change this, let me know 🙂
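
Until that is an option, one workaround is to rewrite the first column header of an input file before combining. The Python sketch below shows one way to do that. It assumes a tab-delimited file with the sample identifier as the first field of the first line. The function `rename_sample`, the file names, and the new name `patient-A` are all hypothetical, and none of this is part of the covirt-micro package.

```python
import tempfile
from pathlib import Path


def rename_sample(in_path, out_path, new_name):
    """Copy a count table, replacing only the first column header with new_name."""
    lines = Path(in_path).read_text().splitlines(keepends=True)
    header_fields = lines[0].rstrip("\r\n").split("\t")
    header_fields[0] = new_name  # the sample identifier is the first header field
    lines[0] = "\t".join(header_fields) + "\n"
    Path(out_path).write_text("".join(lines))


# Demonstration on a throwaway file; the file names and "patient-A" are hypothetical
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "example_counts.tsv"
    renamed = Path(tmp) / "example_counts_renamed.tsv"
    original.write_text("1_SRR10903401\tuniprot\n11500\t-\n3403\tP0C6V9\n")
    rename_sample(original, renamed, "patient-A")
    result = renamed.read_text()

print(result)
```

Writing to a new file rather than editing in place leaves the original table untouched, so the renamed copy can be passed to `-i` instead.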