Multiple-Hit capacity in SARS-CoV-2

Last update 3/4/2021

Data in:
/home/aglucaci/FMM_SARS2/2021/SARS-CoV-2
/home/aglucaci/FMM_SARS2/2021/SARS-CoV-2/data/fasta/1

Data curation

The data were run through a modified version of the SARS-CoV-2 pipeline:

https://github.com/veg/SARS-CoV-2/tree/master

  • submit_job.sh
    • Modified data directory
    • fdate=1
    • nprocs=2
  • extract_genes.sh
    • Stopped before selection analyses
    • Creates rapidnj tree

Scripts for multiple-hit (MH) and other selection analyses:

/home/aglucaci/FMM_SARS2/2021/SARS-CoV-2/scripts/run_FMM_v2.sh
(runs FMM and SLAC)
/home/aglucaci/FMM_SARS2/2021/SARS-CoV-2/scripts/run_BS-MH.sh
(runs BUSTEDS, BUSTEDS-MH, aBSREL, and aBSREL-MH)

Results

Top sites of interest

  • Spike site 70, physicochemical change (Y->L->T->V(bat, non-syn)->V(bat to human, syn))
  • Spike site 325, serine exchange
  • Spike site 450, G->N(non-syn)->N(syn)
  • Nucleocapsid site 267, biological DH (1+2) (Q {pangolin, bat} to A in human); check whether this site is misaligned.
  • nsp4 site 184, serine island conversion (TH) in pangolins
  • nsp4 site 81, serine island conversion (DH) in pangolins

Comments on other sites

  • Spike site 860, Q is probably a sequencing error.
  • Spike site 861, abiological DH (1+3), preserving L, K is probably a sequencing error.
  • Spike site 1093, A is probably a sequencing error.
  • Spike site 1145, abiological DH (1+3), preserving L.
  • Envelope site 51, abiological DH (1+3), preserving L.
  • Leader site 31, abiological DH (1+3), F->V in human.
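
The DH/TH labels and the position notation used in the two lists above (for example "DH (1+3)") can be reproduced from a pair of codons. The sketch below is only an illustration, not part of the pipeline: the helper `classify` and its output fields are hypothetical names, and it assumes the standard genetic code. The example codon pairs are taken from the Site Substitutions tables in this note.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

HIT_LABELS = {0: "none", 1: "SH", 2: "DH", 3: "TH"}  # single-, double-, triple-hit


def classify(from_codon, to_codon):
    """Describe a codon change: how many positions differ, which ones, and the amino-acid effect."""
    positions = [i + 1 for i, (a, b) in enumerate(zip(from_codon, to_codon)) if a != b]
    aa_from, aa_to = CODON_TABLE[from_codon], CODON_TABLE[to_codon]
    return {
        "hit": HIT_LABELS[len(positions)],
        "positions": "+".join(str(p) for p in positions),
        "change": f"{aa_from}->{aa_to}",
        "synonymous": aa_from == aa_to,
    }


# (gene, site, from codon, to codon) as listed in the Site Substitutions tables
examples = [
    ("Spike", 1145, "TTA", "CTT"),
    ("Envelope", 51, "CTT", "TTA"),
    ("Leader", 31, "TTT", "GTA"),
    ("nsp4", 81, "AGT", "TCT"),
    ("nsp4", 184, "AGT", "TCA"),
]
for gene, site, src, dst in examples:
    r = classify(src, dst)
    kind = "syn" if r["synonymous"] else "non-syn"
    print(f"{gene} {site}: {src}->{dst} {r['hit']} ({r['positions']}) {r['change']} {kind}")
```

Under these assumptions the serine changes at nsp4 sites 81 and 184 come out as synonymous DH and TH events, and the leucine-preserving changes come out as DH (1+3).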

Spike

{HTML}

{Input}

0
file name sequences.S.compressed.fas
number of sequences 30
number of sites 1273
partition count 1
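
As a sanity check on an {Input} block, the sequence and site counts can be recomputed from the FASTA file. This is a minimal sketch, assuming that "number of sites" counts codons (the Spike value of 1273 matches the length of the spike protein in amino acids) and that the file is an in-frame alignment. The function name `alignment_summary` and the toy alignment are made up for illustration.

```python
import io


def alignment_summary(handle):
    """Count sequences and codon sites in a FASTA alignment read from a file-like object."""
    lengths = []
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    if len(set(lengths)) > 1:
        raise ValueError("sequences differ in length; not an alignment")
    n_nt = lengths[0] if lengths else 0
    if n_nt % 3 != 0:
        raise ValueError("alignment length is not a multiple of 3")
    return {"number of sequences": len(lengths), "number of sites": n_nt // 3}


# Toy two-sequence alignment of 9 nucleotides (3 codons); not real data.
toy = io.StringIO(">a\nATGTTT\nGTT\n>b\nATGTTCGTA\n")
print(alignment_summary(toy))
```

For a real file, pass `open("sequences.S.compressed.fas")` instead of the toy handle.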

{Test Results}

Test                               LRT       p-value
Double-hit vs single-hit           189.862   0
Triple-hit vs Triple-hit-island    35.1287   3.08625e-09
Triple-hit vs double-hit           37.867    5.98817e-09
Triple-hit vs single-hit           227.729   0
Triple-hit-island vs double-hit    2.7383    0.0979689
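
The p-values in these tables are consistent with a chi-squared test on the LRT statistic. The sketch below recomputes them with the standard library only. The degrees of freedom per comparison (1, 1, 2, 3, 1) are an assumption inferred from the reported numbers, not taken from the FMM output, and `chi2_sf` is a hypothetical helper that covers only df = 1, 2, 3 in closed form.

```python
import math

# Assumed degrees of freedom for each nested comparison (inferred from the reported p-values).
DF = {
    "Double-hit vs single-hit": 1,
    "Triple-hit vs Triple-hit-island": 1,
    "Triple-hit vs double-hit": 2,
    "Triple-hit vs single-hit": 3,
    "Triple-hit-island vs double-hit": 1,
}


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, for df in {1, 2, 3}."""
    x = max(x, 0.0)  # slightly negative LRTs (optimizer noise) are treated as 0, giving p = 1
    if x == 0.0:
        return 1.0
    tail_df1 = math.erfc(math.sqrt(x / 2.0))
    if df == 1:
        return tail_df1
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return tail_df1 + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    raise ValueError("only df = 1, 2, 3 are handled in this sketch")


# LRT statistics from the Spike table
spike_lrt = {
    "Double-hit vs single-hit": 189.862,
    "Triple-hit vs Triple-hit-island": 35.1287,
    "Triple-hit vs double-hit": 37.867,
    "Triple-hit vs single-hit": 227.729,
    "Triple-hit-island vs double-hit": 2.7383,
}
for test, lrt in spike_lrt.items():
    print(f"{test}: LRT={lrt}, p={chi2_sf(lrt, DF[test]):.6g}")
```

The two comparisons against the single-hit model give p-values far below double-precision display limits, which is presumably why the table shows them as 0.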

{Site Substitutions}

Site   From codon   To codon(s)
13     AGT          TCA
28     TAC          GAT, TAT, TTC
70     GTC          GTT
70     GTT          ACA, TAT
70     TAT          CTA
72     GGA          TAT
72     GGG          ACT, GGA
162    AGT          TCT
325    TCT          AGT
450    GGT          AAT
450    AAT          AAC, GGT
860    GTT          CAA
861    CTG          CTA
861    TTG          AAG, CTG
1093   GGT          GCC
1145   TTA          CTT

{SLAC}

Envelope

{HTML}

{Input}

0
file name sequences.E.compressed.fas
number of sequences 6
number of sites 75
partition count 1

{Test Results}

Test                               LRT            p-value
Double-hit vs single-hit           4.37694        0.0364284
Triple-hit vs Triple-hit-island    -0.000331066   1
Triple-hit vs double-hit           -0.000276162   1
Triple-hit vs single-hit           4.37666        0.223559
Triple-hit-island vs double-hit    5.49044e-05    0.994088

{Site Substitutions}

Site   From codon   To codon(s)
51     CTT          TTA

{SLAC}

Nucleocapsid

{HTML}

{Test Results}

Test                               LRT           p-value
Double-hit vs single-hit           14.1241       0.000171134
Triple-hit vs Triple-hit-island    0.0787973     0.778934
Triple-hit vs double-hit           0.0788147     0.961359
Triple-hit vs single-hit           14.2029       0.00264152
Triple-hit-island vs double-hit    1.74043e-05   0.996671

{Site Substitutions}

Site   From codon   To codon(s)
267    GCA          CAA

{SLAC}

Leader

{HTML}

{Input}

0
file name sequences.leader.compressed.fas
number of sequences 12
number of sites 180
partition count 1

{Test Results}

Test                               LRT            p-value
Double-hit vs single-hit           19.871         8.28476e-06
Triple-hit vs Triple-hit-island    0.246531       0.619529
Triple-hit vs double-hit           0.246481       0.884051
Triple-hit vs single-hit           20.1175        0.000160487
Triple-hit-island vs double-hit    -5.00444e-05   1

{Site Substitutions}

Site   From codon   To codon(s)
31     TTT          GTA

{SLAC}

nsp3

{HTML}

{Input}

0
file name sequences.nsp3.compressed.fas
number of sequences 34
number of sites 1950
partition count 1

{Test Results}

Test                               LRT       p-value
Double-hit vs single-hit           49.0844   2.45193e-12
Triple-hit vs Triple-hit-island    1.11283   0.291467
Triple-hit vs double-hit           3.4303    0.179937
Triple-hit vs single-hit           52.5147   2.32656e-11
Triple-hit-island vs double-hit    2.31746   0.127929

{Site Substitutions}

Site   From codon   To codon(s)
255    GTT          ATC
674    GGT          GTA
677    ACT          TTT
1428   GGT          CGA

{SLAC}




nsp4

{HTML}

{Input}

0
file name sequences.nsp4.compressed.fas
number of sequences 13
number of sites 500
partition count 1

{Test Results}

Test                               LRT            p-value
Double-hit vs single-hit           28.6135        8.83656e-08
Triple-hit vs Triple-hit-island    -9.56697e-05   1
Triple-hit vs double-hit           4.61787        0.0993671
Triple-hit vs single-hit           33.2313        2.87837e-07
Triple-hit-island vs double-hit    4.61796        0.0316388

{Site Substitutions}

Site   From codon   To codon(s)
81     AGT          TCT
106    TTG          CTT, TTA
184    AGT          TCA

{SLAC}



nsp8

{HTML}

{Input}

0
file name sequences.nsp8.compressed.fas
number of sequences 9
number of sites 198
partition count 1

{Test Results}

Test                               LRT            p-value
Double-hit vs single-hit           8.63568        0.00329643
Triple-hit vs Triple-hit-island    -5.11222e-05   1
Triple-hit vs double-hit           4.88553e-05    0.999976
Triple-hit vs single-hit           8.63573        0.0345474
Triple-hit-island vs double-hit    9.99775e-05    0.992022

{Site Substitutions}

Site   From codon   To codon(s)
85     AGT          ATG

{SLAC}

nsp9

{HTML}

{Input}

0
file name sequences.nsp9.compressed.fas
number of sequences 4
number of sites 113
partition count 1

{Test Results}

Test                               LRT            p-value
Double-hit vs single-hit           13.3441        0.000259234
Triple-hit vs Triple-hit-island    -0.000424736   1
Triple-hit vs double-hit           0.288714       0.865579
Triple-hit vs single-hit           13.6328        0.00345001
Triple-hit-island vs double-hit    0.289139       0.590773
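
To compare genes at a glance, the "Double-hit vs single-hit" and "Triple-hit vs double-hit" p-values reported in the {Test Results} tables of this note can be collected in one place. The sketch below is a simplistic reading of the nested tests, not the procedure used by FMM: the 0.05 threshold, the absence of multiple-testing correction, and the helper `supported_model` are all assumptions made for illustration.

```python
ALPHA = 0.05  # assumed significance threshold, uncorrected

# gene: (p for double-hit vs single-hit, p for triple-hit vs double-hit), copied from the tables
P_VALUES = {
    "Spike": (0.0, 5.98817e-09),
    "Envelope": (0.0364284, 1.0),
    "Nucleocapsid": (0.000171134, 0.961359),
    "Leader": (8.28476e-06, 0.884051),
    "nsp3": (2.45193e-12, 0.179937),
    "nsp4": (8.83656e-08, 0.0993671),
    "nsp8": (0.00329643, 0.999976),
    "nsp9": (0.000259234, 0.865579),
}


def supported_model(p_dh_vs_sh, p_th_vs_dh, alpha=ALPHA):
    """Pick the most complex model whose extra rate parameters are supported at the given alpha."""
    if p_dh_vs_sh >= alpha:
        return "single-hit"
    if p_th_vs_dh >= alpha:
        return "double-hit"
    return "triple-hit"


summary = {gene: supported_model(*p) for gene, p in P_VALUES.items()}
for gene, model in summary.items():
    print(f"{gene}: {model}")
```

With these inputs every gene favors double-hit over single-hit, and only Spike additionally favors triple-hit over double-hit.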

{Site Substitutions}

Site   From codon   To codon(s)
59     AGT          TCT

{SLAC}

RSCU

Spike