# Multiple-Hit capacity in SARS-CoV-2
Last update 3/4/2021
> **Data in:**
> /home/aglucaci/FMM_SARS2/2021/SARS-CoV-2
> /home/aglucaci/FMM_SARS2/2021/SARS-CoV-2/data/fasta/1
>
## Data curation

Data were run through a modified version of the SARS2 pipeline:
https://github.com/veg/SARS-CoV-2/tree/master
* submit_job.sh
  * Modified data directory
  * fdate=1
  * nprocs=2
* extract_genes.sh
  * Stopped before selection analyses
  * Creates rapidnj tree

Scripts for MH and other selection analyses:
> /home/aglucaci/FMM_SARS2/2021/SARS-CoV-2/scripts/run_FMM_v2.sh
(This script runs FMM and SLAC.)
> /home/aglucaci/FMM_SARS2/2021/SARS-CoV-2/scripts/run_BS-MH.sh
(This script runs BUSTEDS, BUSTEDS-MH, aBSREL, and aBSREL-MH.)
## Results
Top sites of interest
* Spike site 70, physicochemical change (Y->L->T->V (bat, nonsyn)->V (bat to human, syn))
* Spike site 325, serine exchange
* Spike site 450, G->N (nonsyn)->N (syn)
* Nucleocapsid site 267, biological DH (1+2) (Q in {pangolin, bat} to A in human); review whether this site is misaligned.
* nsp4 site 184, serine island conversion (TH) in pangolins
* nsp4 site 81, serine island conversion (DH) in pangolins
Comments on other sites
* Spike site 860, Q is probably a sequencing error.
* Spike site 861, abiological DH (1+3), preserving L, K is probably a sequencing error.
* Spike site 1093, A is probably a sequencing error.
* Spike site 1145, abiological DH (1+3), preserving L.
* Envelope site 51, abiological DH (1+3), preserving L.
* Leader site 31, abiological DH (1+3), F->V in human.
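
The labels used in the lists above (DH, TH, the changed positions such as 1+3, and whether the amino acid is preserved) can be checked directly from a codon pair. Below is a minimal sketch, assuming the standard genetic code; the function name `classify_substitution`, the `SH` label for single hits, and the dictionary layout are illustrative choices and not part of the pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# "SH" is a hypothetical label for single hits; DH and TH follow the notes above.
HIT_LABELS = {0: "none", 1: "SH", 2: "DH", 3: "TH"}


def classify_substitution(from_codon, to_codon):
    """Report which codon positions differ and whether the amino acid changes."""
    positions = tuple(
        i + 1 for i, (a, b) in enumerate(zip(from_codon, to_codon)) if a != b
    )
    aa_from, aa_to = CODON_TABLE[from_codon], CODON_TABLE[to_codon]
    return {
        "hit": HIT_LABELS[len(positions)],
        "positions": positions,            # 1-based positions within the codon
        "amino_acids": (aa_from, aa_to),
        "synonymous": aa_from == aa_to,
    }


# Codon pairs taken from the Site Substitutions tables in this document.
examples = [
    ("Spike", 1145, "TTA", "CTT"),
    ("Nucleocapsid", 267, "GCA", "CAA"),
    ("nsp4", 184, "AGT", "TCA"),
]
for gene, site, src, dst in examples:
    print(gene, site, classify_substitution(src, dst))
```

For example, Spike site 1145 (TTA to CTT) comes out as a DH at positions 1 and 3 that preserves L, and nsp4 site 184 (AGT to TCA) as a TH that preserves S, in line with the comments above.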
### Spike
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_S.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:---------------------------|
| file name | sequences.S.compressed.fas |
| number of sequences | 30 |
| number of sites | 1273 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|---------:|------------:|
| Double-hit vs single-hit | 189.862 | 0 |
| Triple-hit vs Triple-hit-island | 35.1287 | 3.08625e-09 |
| Triple-hit vs double-hit | 37.867 | 5.98817e-09 |
| Triple-hit vs single-hit | 227.729 | 0 |
| Triple-hit-island vs double-hit | 2.7383 | 0.0979689 |
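
The p-values in these Test Results tables are consistent with a chi-squared reference distribution for each LRT. The sketch below reproduces them using only the standard library. The degrees of freedom in `ASSUMED_DF` are not stated in these notes; they are an assumption inferred from the reported LRT and p-value pairs. Slightly negative LRT values (optimizer noise) are clamped to zero, which gives p = 1 as reported in the tables.

```python
import math

# Assumed degrees of freedom per test, inferred from the reported p-values.
ASSUMED_DF = {
    "Double-hit vs single-hit": 1,
    "Triple-hit vs Triple-hit-island": 1,
    "Triple-hit vs double-hit": 2,
    "Triple-hit vs single-hit": 3,
    "Triple-hit-island vs double-hit": 1,
}


def chi2_sf(x, df):
    """Upper-tail chi-squared probability, using closed forms for df = 1, 2, 3."""
    x = max(x, 0.0)  # clamp small negative LRTs caused by numerical noise
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("only df = 1, 2, 3 are implemented in this sketch")


def lrt_p_value(test_name, lrt):
    """p-value for a named test under the assumed degrees of freedom."""
    return chi2_sf(lrt, ASSUMED_DF[test_name])


# LRT values copied from the Spike table.
for name, lrt in [
    ("Triple-hit vs Triple-hit-island", 35.1287),
    ("Triple-hit vs double-hit", 37.867),
    ("Triple-hit-island vs double-hit", 2.7383),
]:
    print(f"{name}: LRT={lrt}, p={lrt_p_value(name, lrt):.6g}")
```

If a recomputed p-value disagrees with a table entry, the assumed degrees of freedom for that test are the first thing to revisit.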
**{Site Substitutions}**
| Site\From Codon | GGT | TTA | AGT | TAC | TCT | AAT | GTC | GTT | TAT | GGA | GGG | CTG | TTG |
|------------------:|:--------|:--------|:--------|:----------------------|:--------|:---------------|:--------|:---------------|:--------|:--------|:---------------|:--------|:---------------|
| 13 | . | . | ['TCA'] | . | . | . | . | . | . | . | . | . | . |
| 28 | . | . | . | ['GAT', 'TAT', 'TTC'] | . | . | . | . | . | . | . | . | . |
| 70 | . | . | . | . | . | . | ['GTT'] | ['ACA', 'TAT'] | ['CTA'] | . | . | . | . |
| 72 | . | . | . | . | . | . | . | . | . | ['TAT'] | ['ACT', 'GGA'] | . | . |
| 162 | . | . | ['TCT'] | . | . | . | . | . | . | . | . | . | . |
| 325 | . | . | . | . | ['AGT'] | . | . | . | . | . | . | . | . |
| 450 | ['AAT'] | . | . | . | . | ['AAC', 'GGT'] | . | . | . | . | . | . | . |
| 860 | . | . | . | . | . | . | . | ['CAA'] | . | . | . | . | . |
| 861 | . | . | . | . | . | . | . | . | . | . | . | ['CTA'] | ['AAG', 'CTG'] |
| 1093 | ['GCC'] | . | . | . | . | . | . | . | . | . | . | . | . |
| 1145 | . | ['CTT'] | . | . | . | . | . | . | . | . | . | . | . |
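
Site Substitutions tables in this layout (rows are sites, columns are source codons, cells list destination codons) can be flattened into one record per substitution for downstream checks. This is a minimal sketch that assumes the Markdown layout used in this document; `parse_substitution_table` is a hypothetical helper, and the sample text is the nsp4 table from this file.

```python
import ast


def parse_substitution_table(markdown):
    """Return (site, from_codon, to_codon) tuples from a Site Substitutions table."""
    lines = [ln.strip() for ln in markdown.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    from_codons = header[1:]              # first column holds the site number
    events = []
    for row in lines[2:]:                 # skip the header and alignment rows
        cells = [cell.strip() for cell in row.strip("|").split("|")]
        site = int(cells[0])
        for from_codon, cell in zip(from_codons, cells[1:]):
            if cell == ".":               # "." marks no substitution from this codon
                continue
            for to_codon in ast.literal_eval(cell):
                events.append((site, from_codon, to_codon))
    return events


# Raw string so the backslash in the header is kept literally.
NSP4_TABLE = r"""
| Site\From Codon | TTG | AGT |
|------------------:|:---------------|:--------|
| 81 | . | ['TCT'] |
| 106 | ['CTT', 'TTA'] | . |
| 184 | . | ['TCA'] |
"""

for event in parse_substitution_table(NSP4_TABLE):
    print(event)
```

Because each row is split on `|` after stripping the outer pipes, a row that is missing its trailing pipe is still read correctly.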
**{SLAC}**











### Envelope
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_E.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:---------------------------|
| file name | sequences.E.compressed.fas |
| number of sequences | 6 |
| number of sites | 75 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|-------------:|----------:|
| Double-hit vs single-hit | 4.37694 | 0.0364284 |
| Triple-hit vs Triple-hit-island | -0.000331066 | 1 |
| Triple-hit vs double-hit | -0.000276162 | 1 |
| Triple-hit vs single-hit | 4.37666 | 0.223559 |
| Triple-hit-island vs double-hit | 5.49044e-05 | 0.994088 |
**{Site Substitutions}**
| Site\From Codon | CTT |
|------------------:|:--------|
| 51 | ['TTA'] |
**{SLAC}**

### Nucleocapsid
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_N.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:---------------------------|
| file name | sequences.N.compressed.fas |
| number of sequences | 24 |
| number of sites | 419 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|-------------:|------------:|
| Double-hit vs single-hit | 14.1241 | 0.000171134 |
| Triple-hit vs Triple-hit-island | 0.0787973 | 0.778934 |
| Triple-hit vs double-hit | 0.0788147 | 0.961359 |
| Triple-hit vs single-hit | 14.2029 | 0.00264152 |
| Triple-hit-island vs double-hit | 1.74043e-05 | 0.996671 |
**{Site Substitutions}**
| Site\From Codon | GCA |
|------------------:|:--------|
| 267 | ['CAA'] |
**{SLAC}**

### Leader
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_leader.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:--------------------------------|
| file name | sequences.leader.compressed.fas |
| number of sequences | 12 |
| number of sites | 180 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|-------------:|------------:|
| Double-hit vs single-hit | 19.871 | 8.28476e-06 |
| Triple-hit vs Triple-hit-island | 0.246531 | 0.619529 |
| Triple-hit vs double-hit | 0.246481 | 0.884051 |
| Triple-hit vs single-hit | 20.1175 | 0.000160487 |
| Triple-hit-island vs double-hit | -5.00444e-05 | 1 |
**{Site Substitutions}**
| Site\From Codon | TTT |
|------------------:|:--------|
| 31 | ['GTA'] |
**{SLAC}**

### nsp3
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_nsp3.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:------------------------------|
| file name | sequences.nsp3.compressed.fas |
| number of sequences | 34 |
| number of sites | 1950 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|---------:|------------:|
| Double-hit vs single-hit | 49.0844 | 2.45193e-12 |
| Triple-hit vs Triple-hit-island | 1.11283 | 0.291467 |
| Triple-hit vs double-hit | 3.4303 | 0.179937 |
| Triple-hit vs single-hit | 52.5147 | 2.32656e-11 |
| Triple-hit-island vs double-hit | 2.31746 | 0.127929 |
**{Site Substitutions}**
| Site\From Codon | GGT | GTT | ACT |
|------------------:|:--------|:--------|:--------|
| 255 | . | ['ATC'] | . |
| 674 | ['GTA'] | . | . |
| 677 | . | . | ['TTT'] |
| 1428 | ['CGA'] | . | . |
**{SLAC}**




### nsp4
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_nsp4.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:------------------------------|
| file name | sequences.nsp4.compressed.fas |
| number of sequences | 13 |
| number of sites | 500 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|-------------:|------------:|
| Double-hit vs single-hit | 28.6135 | 8.83656e-08 |
| Triple-hit vs Triple-hit-island | -9.56697e-05 | 1 |
| Triple-hit vs double-hit | 4.61787 | 0.0993671 |
| Triple-hit vs single-hit | 33.2313 | 2.87837e-07 |
| Triple-hit-island vs double-hit | 4.61796 | 0.0316388 |
**{Site Substitutions}**
| Site\From Codon | TTG | AGT |
|------------------:|:---------------|:--------|
| 81 | . | ['TCT'] |
| 106 | ['CTT', 'TTA'] | . |
| 184 | . | ['TCA'] |
**{SLAC}**



### nsp8
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_nsp8.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:------------------------------|
| file name | sequences.nsp8.compressed.fas |
| number of sequences | 9 |
| number of sites | 198 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|-------------:|-----------:|
| Double-hit vs single-hit | 8.63568 | 0.00329643 |
| Triple-hit vs Triple-hit-island | -5.11222e-05 | 1 |
| Triple-hit vs double-hit | 4.88553e-05 | 0.999976 |
| Triple-hit vs single-hit | 8.63573 | 0.0345474 |
| Triple-hit-island vs double-hit | 9.99775e-05 | 0.992022 |
**{Site Substitutions}**
| Site\From Codon | AGT |
|------------------:|:--------|
| 85 | ['ATG'] |
**{SLAC}**

### nsp9
**{HTML}**
<iframe src="https://data.hyphy.org/web/MH_SARS2/compressed_ER_nsp9.html" width="1000" height="600" frameBorder="0"> </iframe>
**{Input}**
| Field | Value |
|:--------------------|:------------------------------|
| file name | sequences.nsp9.compressed.fas |
| number of sequences | 4 |
| number of sites | 113 |
| partition count | 1 |
**{Test Results}**
| Test | LRT | p-value |
|:--------------------------------|-------------:|------------:|
| Double-hit vs single-hit | 13.3441 | 0.000259234 |
| Triple-hit vs Triple-hit-island | -0.000424736 | 1 |
| Triple-hit vs double-hit | 0.288714 | 0.865579 |
| Triple-hit vs single-hit | 13.6328 | 0.00345001 |
| Triple-hit-island vs double-hit | 0.289139 | 0.590773 |
**{Site Substitutions}**
| Site\From Codon | AGT |
|------------------:|:--------|
| 59 | ['TCT'] |
**{SLAC}**

## RSCU
### Spike