###### tags: `draft`

# SELEXTIAL: A SARS-CoV-2 selection analysis framework

- BW Indexed in: https://hackmd.io/@hannahkimincompbio/Sk9T_TIBY
- Writeup leader: Steven Weaver
- Google doc draft 1: https://docs.google.com/document/d/1ERRQVBIyBt_98uRQ7f4EvkgT1L2xJ9pNT-pmHBd1MzA/edit?usp=sharing
- Google doc draft 2: https://docs.google.com/document/d/1rnGZZZrcIzI6YtZlFgXri3j-mknFHXHonn_kK8h8U_g/edit?usp=sharing
- Project board: https://github.com/users/stevenweaver/projects/2/views/1
- Authors: Sergei Pond, Steven Weaver, Jordan D Zehr, Alexander Lucaci, Hannah Kim, Avery Selberg
- Institutions: iGEM
- Potential delivery date: November 19th, 2021 (earliest)

---

[ToC]

---

## Abstract

Write last

---

## Introduction

### Prior Buildups (03/2021)

<details>
<summary> Darren's buildup </summary>

* So much genomic data available in databases (GISAID, ENA, GenBank).
* Endemic viruses (Influenza A virus, HIV-1, Hepatitis C virus) and outbreak-causing viruses (MERS, Zika and Ebola) were previously studied with genomic data.
* HOWEVER, the current pandemic is different because there is data on a massive scale; it calls for rapid processing of data.
* Other computational tools currently addressing the Big Data problem (FASTTREE, USHER, IQ-TREE, NEXTSTRAIN, BEAST, PANGOLIN)
* HK: Buildup is there. What can SELEXTIAL do here then?

</details>

<details>
<summary> Anonymous's buildup </summary>

* COVID as the first real time pandemic.
* Internationally coordinated viral pathogen genomic surveillance and genomic data generation effort (ARTIC protocol)
* Data sharing and nucleotide sequence analysis infrastructures
* Viruses (Influenza A virus, HIV-1, Hepatitis C virus) and outbreaks (MERS, Zika and Ebola)
* SARS-CoV-2 genomic sequence repositories (GISAID, GenBank, ENA)
* Superlinear rate of data deposition and development of computational tools to deal with big data (FASTTREE, IQ-TREE, NEXTSTRAIN, BEAST, PANGOLIN)
* Fortunately, web technology has developed as well, and we have powerful web applications. We built a web application pipeline thanks to all these advancements. And the name of the powerful application is: **SELEXTIAL**.
* What is SELEXTIAL?
    * A highly optimized set of computational tools based on the HyPhy software package that enables near-real-time detection, visualization and exploration of natural selection patterns within large, and continually expanding, genomic sequence datasets
* Brief description of SELEXTIAL pipeline:
    1. Download GISAID data
    2. Sequences processed into 26 genes with bealign
    3. Alignment using MAFFT
    4. Phylogenetic tree inferred with RAPIDNJ
    5. Run HyPhy analyses
        * SLAC
        * FEL
        * MEME
    6. Annotations for the output
        * Whether these sites fall on CTL or immune epitopes
        * Display intrahost variants
        * Noticeably different from other betacoronaviruses using Contrast-FEL
        * Detect sites with changing or conserved amino acid properties with PRIME
    7. Output in JSON format
    8. Visualizations in Observable notebooks
* Conclusion: SELEXTIAL has the accessibility and power to analyze big-data genome sequences for evidence of natural selection.
</details>

---

### Sergei's guide (11/2021)

#### COVID is bad and will continue to be bad

* Bad:
    * Medical
        * COVID-19 cases, deaths, and vaccinations from [JHU Coronavirus resource center](https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series), [The JHU Dashboard](https://coronavirus.jhu.edu/map.html) & [The JHU Dashboard paper](https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30120-1/fulltext) **Date: 2020-01-22~2021-12-22**
            * 277 million cases (N=277,153,704)
            * 5.3 million deaths (N=5,376,763)
            * 8.8 billion vaccine doses administered (N=8,798,895,872)
            * 280 countries and regions worldwide
    * Economic
        * [IEA report, Economic impacts of COVID-19 in year 2021](https://www.iea.org/reports/global-energy-review-2021/economic-impacts-of-covid-19) **Date: 2021**
            * World-wide 3.4% GDP drop in 2020
            * World-wide 6% GDP growth projection in 2021
        * [Economic Consequences of the COVID-19 Outbreak: the Need for Epidemic Preparedness](https://www.frontiersin.org/articles/10.3389/fpubh.2020.00241/full) 2020-05-29
    * Psychological
        * [Symptoms of Anxiety or Depressive Disorder and Use of Mental Health Care Among Adults During the COVID-19 Pandemic — United States, August 2020–February 2021](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022876/). **Date: 2021-04-02**
            * <details> <summary> Details </summary> During August 19, 2020–February 1, 2021, the percentage of adults with symptoms of an anxiety or a depressive disorder during the past 7 days increased significantly (from 36.4% to 41.5%), as did the percentage reporting that they needed but did not receive mental health counseling or therapy during the past 4 weeks (from 9.2% to 11.7%). ...
              During August 19–31, 2020, through December 9–21, 2020, significant increases were observed in the percentages of adults who reported experiencing symptoms of an anxiety disorder (from 31.4% to 36.9%), depressive disorder (from 24.5% to 30.2%), and at least one of these disorders (from 36.4% to 42.4%) (Figure 1). Estimates for all three mental health indicators through January 2021 were similar to those in December 2020. </details>
    * Vaccine authorization
        * [FDA In Brief: FDA Issues Guidance on Emergency Use Authorization for COVID-19 Vaccines](https://www.fda.gov/news-events/fda-brief/fda-brief-fda-issues-guidance-emergency-use-authorization-covid-19-vaccines) Date: 2020-10-06
    * Vaccinated, but is there a vaccine equity problem?
        * [Global COVID-19 vaccine inequity: The scope, the impact, and the challenges](https://www.sciencedirect.com/science/article/pii/S1931312821002857?casa_token=QBAXKE_5pNIAAAAA:Iqh-EiATcQPxWve10ZFQWssaDXpvap6A-1pH82JD0dsiwF6j2cI1Y4lwD5393PISXFAwlUmAvA#bib10) **Date: 2021-07-14**
            * <details> <summary> Details </summary> </details> Only 5% of the world has received one dose of the vaccine, and the inequities are even more profound in areas such as the continent of Africa where most countries have administered doses to less than 1% of their population.
    * If they cannot be vaccinated, how are the less fortunate countries doing? Can their economic toll be separated?
        * [The Effects of COVID-19 Vaccines on Economic Activity](https://www.imf.org/en/Publications/WP/Issues/2021/10/19/The-Effects-of-COVID-19-Vaccines-on-Economic-Activity-494714) 2021-10-19
        * [Vaccine inequity undermining global economic recovery](https://www.who.int/news/item/22-07-2021-vaccine-inequity-undermining-global-economic-recovery) 2021-07-22
        * [Global COVID-19 vaccine inequity](https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(21)00344-3/fulltext) 2021-07
* Will continue to be bad / potentially be bad:
    * More variants, how many so-called lineage variants or mutations appeared rapidly in the past year
        * [PANGO Lineages](https://cov-lineages.org/) **Date: 2021-11-10**
            * Currently, there are 1511 pango lineages.
            * Researchers have been putting together [constellations](https://github.com/cov-lineages/constellations), collections of mutations that are functionally meaningful that may arise independently a number of times, since July 2021. There are 21 of those.
    * Evolving, the number of mutations per site in comparison to other well-known viruses
        * [One year into the pandemic: Short-term evolution of SARS-CoV-2 and emergence of new lineages](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8074502/) **Date: 2021-04-26**
            * **Influenza A viruses** (negative sense ssRNA): $1.8 \times 10^{-3}$ substitutions per site per year (s/s/y) (Jenkins et al., 2002)
            * **Human enterovirus 71** (positive sense ssRNA): $3.4 \times 10^{-3}$ s/s/y (Jenkins et al., 2002).
            * **SARS-CoV-2**: $1.29 \times 10^{-3}$ s/s/y (95% HPD $5.35 \times 10^{-4}$, $2.15 \times 10^{-3}$) and $1.23 \times 10^{-4}$ s/s/y (95% HPD $5.63 \times 10^{-4}$, $1.98 \times 10^{-3}$) for relaxed and strict clock models, respectively (Duchene et al., 2020).
        * [Extremely High Mutation Rate of HIV-1 In Vivo](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002251) **Date: 2015-09-16**
            * **HIV-1**: $(4.1 \pm 1.7) \times 10^{-3}$ per base per cell, the highest reported for any biological entity
    * Delta peak; suddenly more infections
        * ???
    * Breakthrough infections.
      Numbers? Hopefully a relatively small number, but is it significant?
        * Original vaccine [BNT162b vaccines protect rhesus macaques from SARS-CoV-2](https://www.nature.com/articles/s41586-021-03275-y) **Date: 2021-02-01**
            * <details> <summary> Details </summary> Antigens encoded by BNT162b vaccine candidates were designed on a background of S sequences from SARS-CoV-2 isolate Wuhan-Hu-1 (GenBank MN908947.3). </details>
        * The other variants that the vaccine can neutralize [BNT162b2-elicited neutralization of B.1.617 and other SARS-CoV-2 variants](https://www.nature.com/articles/s41586-021-03693-y) **Date: 2021-06-10**
            * <details> <summary> Details </summary> Here we show that serum samples taken from twenty human volunteers, two or four weeks after their second dose of the BNT162b2 vaccine, **neutralize** engineered SARS-CoV-2 with a USA-WA1/2020 genetic background (a virus strain isolated in January 2020) and spike glycoproteins from the recently identified B.1.617.1, B.1.617.2, B.1.618 (all of which were first identified in India) or B.1.525 (first identified in Nigeria) lineages. </details>
        * Vaccine breakthrough cases with lineage info 1 [COVID-19 Vaccine Breakthrough Infections Reported to CDC — United States, January 1–April 30, 2021](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8158893/) **Date: 2021-05-28**
            * <details> <summary> Details </summary> A total of **10,262 SARS-CoV-2 vaccine breakthrough infections** had been reported from 46 U.S. states and territories as of April 30, 2021. ... Sequence data were available from 555 (5%) reported cases, **356 (64%) of which were identified as SARS-CoV-2 variants of concern, including B.1.1.7 (199; 56%), B.1.429 (88; 25%), B.1.427 (28; 8%), P.1 (28; 8%), and B.1.351 (13; 4%).** ... As of April 30, 2021, approximately **101 million persons** in the United States had been fully vaccinated against COVID-19.
              However, during the surveillance period, SARS-CoV-2 transmission continued at high levels in many parts of the country, with approximately 355,000 COVID-19 cases reported nationally during the week of April 24–30, 2021. </details>
        * Vaccine breakthrough cases with lineage info 2 [Dominance of Alpha and Iota variants in SARS-CoV-2 vaccine breakthrough infections in New York City](https://www.jci.org/articles/view/152702) **Date: 2021-08-10**
            * <details> <summary> Details </summary> Most breakthrough infections (57/76) occurred with **B.1.1.7 (Alpha) or B.1.526 (Iota)**. ... One of these variants is B.1.526 (Iota), which arose in New York City in late December 2020 (26, 27). </details>
        * Vaccine breakthrough cases with lineage info 3 [Association Between mRNA Vaccination and COVID-19 Hospitalization and Disease Severity](https://jamanetwork.com/journals/jama/fullarticle/2786039) **Date: 2021-11-04**
            * <details> <summary> Details </summary> 1983 were case patients with COVID-19 and 2530 were controls without COVID-19. Unvaccinated patients accounted for 84.2% (1669/1983) of COVID-19 hospitalizations. Hospitalization for COVID-19 was significantly associated with decreased likelihood of vaccination (cases, 15.8%; controls, 54.8%; adjusted OR, 0.15; 95% CI, 0.13-0.18), including for **sequenced SARS-CoV-2 Alpha (8.7% vs 51.7%; aOR, 0.10; 95% CI, 0.06-0.16) and Delta variants (21.9% vs 61.8%; aOR, 0.14; 95% CI, 0.10-0.21).** </details>
        * Decreasing vaccine immunity over time and vaccine breakthrough cases during the Delta surge [Correlation of SARS-CoV-2-breakthrough infections to time-from-vaccine](https://www.nature.com/articles/s41467-021-26672-3) **Date: 2021-11-04**
            * <details> <summary> Details </summary> After controlling for potential confounders as age and comorbidities, we found a significant 1.51 fold (95% CI, 1.38–1.66) increased risk for infection for early vaccinees compared to those vaccinated later that was similar across all age groups.
              The increased risk reached **2.26-fold (95% CI, 1.80–3.01)** when comparing those who were vaccinated in January to those vaccinated in April. This preliminary finding of vaccine waning as a factor of time from vaccine should **prompt further investigations into long-term protection against different strains**. </details>
            * N=1,352,444 vaccinated with the BioNTech/Pfizer mRNA BNT162b2 vaccine in a two-dose regimen
            * Early vaccination: January to February 2021
            * Late vaccination: March to April 2021
            * Follow-up period: Between June 1 and July 27, 2021
            * <details> <summary> Details </summary> Our study has several important limitations. First, as the **Delta** variant was the dominant strain in Israel during the study period, the observed decrease in long-term protection afforded by the vaccine against other strains cannot be inferred. ... Taken together, the study suggests a possible relative decrease in the long-term protection of the BNT162b2 vaccine against the Delta variant of SARS-CoV-2. </details>
        * Varying vaccine efficacy of BNT162b2 for different variants [Effectiveness of the BNT162b2 Covid-19 Vaccine against the B.1.1.7 and B.1.351 Variants](https://www.nejm.org/doi/full/10.1056/nejmc2104974) **Date: 2021-05-05**
            * <details> <summary> Details </summary>

                * The messenger RNA vaccine BNT162b2 (Pfizer–BioNTech) has 95% efficacy against coronavirus disease 2019 (Covid-19).
                * Nearly all cases in Qatar in which virus was sequenced after March 7 were caused by either B.1.351 or B.1.1.7.
                * The estimated effectiveness of the vaccine against any documented infection with the **B.1.1.7 variant was 89.5%** (95% confidence interval [CI], 85.9 to 92.3) at 14 or more days after the second dose (Table 1 and Table S2). The effectiveness against any documented infection with the **B.1.351 variant was 75.0%** (95% CI, 70.5 to 78.9).
                * However, vaccine effectiveness against the B.1.351 variant was approximately 20 percentage points lower than the effectiveness (>90%) reported in the clinical trial and in real-world conditions in Israel and the United States.

              </details>
    * Omicron
        * [Selection analysis identifies significant mutational changes in Omicron that are likely to influence both antibody neutralization and Spike function (Part 1 of 2)](https://virological.org/t/selection-analysis-identifies-significant-mutational-changes-in-omicron-that-are-likely-to-influence-both-antibody-neutralization-and-spike-function-part-1-of-2/771) 2021-12-05
        * [Geographic and Genomic Distribution of SARS-CoV-2 Mutations](https://www.frontiersin.org/articles/10.3389/fmicb.2020.01800/full) 2020-07-22
        * [How bad is Omicron? What scientists know so far](https://www.icpcovid.com/sites/default/files/2021-12/Ep%20197-2%20How%20bad%20is%20Omicron_%20What%20scientists%20know%20so%20far.pdf) 2021-12-02
    * There is a potential that the pandemic will continue to be bad
        * [Coevolution of hosts and parasites](https://www.cambridge.org/core/journals/parasitology/article/abs/coevolution-of-hosts-and-parasites/AD3B9037962266A448DF14786AB1D6F8) 1982-10
        * [Will the Coronavirus Evolve to Be Less Deadly?](https://www.smithsonianmag.com/science-nature/will-coronavirus-evolve-be-less-deadly-180976288/) 2020-11-16
        * [Virulence evolution and the trade-off hypothesis: history, current state of affairs and the future](https://onlinelibrary.wiley.com/doi/10.1111/j.1420-9101.2008.01658.x) 2009-01-19

#### A lot of sequence data

* Unprecedented deposition of data
    * [GISAID](http://weekly.chinacdc.cn/en/article/doi/10.46234/ccdcw2021.255) 2021
    * ENA
    * GenBank
* One cliche is to show a bar plot of growth, but do we want to do such a plot for our purpose? [Plot](https://youtu.be/PG-LtsptUhs?t=272) like that from Sergei's CST talk from 02/2021.
* Some big data approaches in the current field
    * FASTTREE, USHER, IQ-TREE, NEXTSTRAIN, BEAST, PANGOLIN as mentioned in previous build-ups

#### Interest in evolutionary analysis (brief lit review ~10 refs), especially for selection

* [Evolutionary analysis of the dynamics of viral infectious disease](https://www.nature.com/articles/nrg2583) **Date: 2009-08**
    * > Emerging insights
    * > Host immune selection
    * > Strong natural selection is clearly the dominant force determining HIV evolutionary dynamics in hosts: HIV phylogenies display a high turnover of short-lived lineages that is driven by host immune selection, analogous to the pattern observed for influenza A virus at the global scale (Box 2).
    * > Therefore, a clear goal for the future is to further develop analytic methods that combine genetic and epidemiological data to reconstruct epidemic history and to predict future trends...
* [Natural selection in the evolution of SARS-CoV-2 in bats created a generalist virus and highly capable human pathogen](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001115)
* The bat origin of SARS-CoV-2
    * [Exploring the natural origins of SARS-CoV-2 in the light of recombination](https://www.biorxiv.org/content/10.1101/2021.01.22.427830v3)

#### Why is knowing selection patterns useful for SC-2

* More information for future vaccine development and treatment methods
* Negatively selected sites as drug targets.
    * Often overlooked in the clinical setting, but important.
    * Overlooked: for example, there are 0 publications in outbreak.info and Stanford CoVDB for S:I68T, a non-silent mutation with conserved amino acid physico-chemical properties.
    * Traditionally, conserved sites were used as the potential target for drug development instead of negatively selected sites.
    * The major difference between conserved and negatively selected sites is whether there are enough changes at the site to assess the evolutionary relevance of its conservation. Thus, negatively selected sites may be better targets.
    * [Architecture and Conservation of the Bacterial DNA Replication Machinery, an Underexploited Drug Target](https://www-ingentaconnect-com.libproxy.temple.edu/content/ben/cdt/2012/00000013/00000003/art00006#) **Date: 2012**
        * > Bacterial DNA synthesis thus appears to be an underexploited drug target. However, before this system can be targeted for drug design, it is important to understand which parts are conserved and which are not, as this will have implications for the spectrum of activity of any new inhibitors against bacterial species, as well as the potential for development of drug resistance.
* Positively selected sites to re-assess the efficacy of the current vaccine

#### How do you find *meaningful* variation when there's so much data?

* List previous efforts on finding meaningful variation?
* What are good previous pipelines? Is this where the prior endemic viruses and outbreak-causing viruses that Darren mentioned come in?
* (MACHINE LEARNING??)

#### What did we do: data management, QC, analysis, interpretation, data analysis challenges (data compression)

* Data management: periodic download thanks to GISAID, which leads to a sliding window approach. Data stored in MongoDB.
* QC, multiple steps of our own.
    * QC for the ambiguity threshold on the proportion of `N` (the ambiguous nucleotide) in the sequence. Part of `pre-msa.bf`.
    * QC for the identity parameter, generally known as `E`. This looks at how similar the sequence is to the reference sequence. Part of `pre-msa.bf`.
    * QC for unique amino acid mutations. If more than 0.05% of the amino acid mutations in the current sequence are not seen in any other sequence, we filter it out. After alignment, in the compression step.
    * QC for singletons. If a singleton is observed (in the alignment, only one sequence carries a given nucleotide at a position), it can be masked with a gap. After alignment, in the compression step.
* HyPhy selection methods (one sentence each)
    * SLAC counting method
    * FEL fixed effects likelihood method
    * MEME method that detects positive selection
* Interpretation
    * Visualization with Observable notebooks
    * Stanford CoVDB & outbreak.info
* Data analysis challenges
    * Data compression
        * Millions of sequences already and many sequences are almost identical.
        * Instead, pick out representative sequences from each cluster using TN93 distances.

### Introduction draft

The COVID-19 pandemic is an ongoing global emergency. It caused 277 million cases and more than 5.3 million deaths in 280 countries and regions worldwide from January 22nd, 2020 to December 22nd, 2021. The pandemic has incurred a huge medical, economic, and psychological cost worldwide. It was through the fast distribution of vaccines and the blood, sweat, and tears of healthcare workers that the US authorities were able to relax rules and regulations in 2021. However, the current state comes with caveats. Because all current vaccines are based on the original Wuhan sequence (GenBank NC_045512), they are not equally effective against all SARS-CoV-2 lineage variants, and it is only a matter of time before a variant completely escapes the current vaccine immunity. The rise of the most recent variant of concern, B.1.1.529 (also called BA.1 or Omicron), is the materialization of such concerns. B.1.1.529 has 53 mutations compared to the original SARS-CoV-2, whereas all previous variants had around 20; it is 3-6 times more transmissible and is less than 20% neutralized by the available vaccines, although most patients infected with the variant are showing mild symptoms.
The domination of B.1.1.529 should be considered time bought for research and strategy rather than a premature promise that the end of the pandemic is near. The transmission-virulence trade-off hypothesis has long been challenged for its simplicity and limitations, and the complex nature of evolution may not necessarily drive the disease phenotype in the way desirable for humans. Thus, research on SARS-CoV-2 evolution and function must continue.

Due to its urgent nature, research on SARS-CoV-2 has become one of the most active fields of research in recent years. Thanks to global surveillance efforts, researchers can access an unprecedented amount of sequence data deposited in GISAID, ENA, and GenBank. In GISAID alone, there are 6 million SARS-CoV-2 consensus sequences available to public users, and this number is expected to keep growing. Thus, viral evolution research has now become a "big data" problem. Some prior methods in the field (FASTTREE, USHER, IQ-TREE, NEXTSTRAIN, BEAST, and PANGOLIN) have already addressed the need to accommodate growing data. However, for selection analyses, a systematic method of finding representative sequences is still desired.

Generally, big data can be tackled in several ways. One can: (1) massively increase the computational power; (2) optimize existing methods; (3) reduce features by dimensionality reduction methods; or (4) subsample the data. There are problem-specific traits that we may be able to take advantage of. The phylogenetic tree generation process is insensitive to duplicates, because only the difference from the most recent common ancestor (TMRCA) is taken into consideration. In addition, because phylogenetic tree building relies on distances between sequences, we can project that sequences clustered by the distance measure would later end up in a monophyletic group with similarly short distance from TMRCA.
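
The idea of keeping one representative per group of near-identical sequences can be sketched as a greedy pass over an alignment. This is a minimal illustration under stated assumptions, not the SELEXTIAL implementation: it uses a plain p-distance (fraction of differing sites) as a stand-in for the TN93 distance named in the notes, and the function names `p_distance` and `pick_representatives`, the toy sequences, and the 0.15 threshold are all hypothetical.

```python
from typing import List


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Stand-in for TN93 (assumption). Positions where either sequence
    has a gap or an ambiguous N are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def pick_representatives(seqs: List[str], threshold: float) -> List[int]:
    """Greedy subsampling: keep a sequence only if it is farther than
    `threshold` from every representative kept so far."""
    reps: List[int] = []
    for i, seq in enumerate(seqs):
        if all(p_distance(seq, seqs[r]) > threshold for r in reps):
            reps.append(i)
    return reps


# Toy alignment: a duplicate, a near-duplicate, and a divergent sequence.
aligned = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "TTGTACGAAC"]
representatives = pick_representatives(aligned, threshold=0.15)
print(representatives)
```

In this toy case the duplicate and the near-duplicate collapse onto the first sequence, while the divergent sequence is kept as its own representative. A greedy pass depends on input order; a real pipeline would likely need a more careful clustering step.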
Using this observation, we propose a method in which we subsample sequences from large sets of data to find meaningful variation.

Selection analyses are useful in understanding SARS-CoV-2. Historically, selection has its roots in Darwinian selection, classically summarized as the survival of the fittest. These days, selection can be estimated as part of phylogenetic inference using dN/dS, the rate of replacement fixation divided by the rate of silent fixation. There has been interest in estimating this value at the resolution of genes, branches, and even sites. [examples] Especially in the case of the ongoing COVID-19 pandemic, selection analysis can be used as a gateway to detecting structural and functional changes in SARS-CoV-2 early. Using this information, researchers can identify conserved loci that can be reliably targeted in drug development, monitor actively changing loci that may cause the next sweep in the disease landscape, and even directly correlate evolutionary pressure with the way in which nature escapes it.

---

## Results

* The pipeline itself -- what do you get?
    1. Selection analysis results
    2. Sites under selection
    3. Temporal trends
    4. Normalized "selective force"
* Interpretation
    1. Current "dashboard" of selection
    2. Examples of how selection analyses could pick VOI/VOC sites prior to their "emergence"
        * For each important clade, report when each site was first found. (Delta, Alpha, etc)
    3. Negative selection -- nobody talks about this, but people should -- sites that vary but are constrained in AA space
    4. Looking forward -- tracking sites that are currently interesting
* Software deliverables
    * Do we want people to run it?
      Can they?
    * https://observablehq.com/@spond/selection-profile
    * https://observablehq.com/@spond/sars-cov-2-selected-sites

### Result Draft

SELEXTIAL takes a large number of SARS-CoV-2 consensus genomes as input, compresses them to a computationally manageable size using TN93 clustering, and finds sites under selection using the HyPhy software. Thanks to its ability to compress data, the pipeline can be used to analyze the rapidly increasing amount of data from GISAID at each temporal sliding window. To summarize selection analysis results from the pipeline, we introduce the normalized selective force, which we define as the number of positively and negatively selected sites normalized by kilobase of gene length and the internal tree length (sites/[substitutions across the tree x gene length]).

We present two additional visualization tools to improve the interpretation of our results. The first tool provides the current dashboard of selection using summary tiles, a genome viewer, and a plot that shows the temporal trend of normalized selective force per gene (See Figures 1-2). The second tool provides site-specific selection information based on FEL (See Figure 3). The tools can be accessed here: (1) https://observablehq.com/@spond/selection-profile and (2) https://observablehq.com/@spond/sars-cov-2-selected-sites. In the next paragraphs, we describe a use case of the pipeline and the visualizations.

The entire pipeline is applied to globally diverse SARS-CoV-2 consensus genomes from GISAID, which are available from the GISAID organization and were sequenced by clinical laboratories as part of their analysis workflows using varied sample preparations, sequencing methodologies, assembly, and quality control measures. Our full dataset (at the time of writing) contains XXX SARS-CoV-2 whole genome sequences. The whole genomes are split across the 23 genes for our analysis pipeline.
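
The normalized selective force defined in the Result Draft (sites/[substitutions across the tree x gene length]) can be written out as a small helper. This is a sketch under assumptions: the function name `normalized_selective_force`, its parameter names, and the example numbers are hypothetical, and the internal tree length is taken to be in expected substitutions per site.

```python
def normalized_selective_force(
    n_selected_sites: int,
    gene_length_nt: int,
    internal_tree_length: float,
) -> float:
    """Selected sites per kilobase of gene per unit of internal tree length.

    n_selected_sites: count of sites called positively or negatively selected
    gene_length_nt: gene length in nucleotides
    internal_tree_length: summed length of internal branches (assumed units:
        expected substitutions per site)
    """
    if gene_length_nt <= 0 or internal_tree_length <= 0:
        raise ValueError("gene length and tree length must be positive")
    gene_length_kb = gene_length_nt / 1000.0
    return n_selected_sites / (internal_tree_length * gene_length_kb)


# Illustrative numbers only: 10 selected sites in a 2,000 nt gene
# on a tree with internal length 0.5.
force = normalized_selective_force(10, 2000, 0.5)
print(force)
```

Computing the value separately for positively and negatively selected sites, per gene and per time window, would give the kind of series that a temporal trend plot needs.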
We also perform temporal analysis across X time points, which are binned into X months of submitted sequence information. The number of genomes per time point ranges from X to X in our analysis.

SELEXTIAL was run directly on SARS-CoV-2 whole genome sequences, and the results are summarized here. The full interactive results of our most recent run are available to view at https://observablehq.com/@spond/selection-profile. Figure 1 shows the genomic overview of natural selection pressures on SARS-CoV-2 genomes with overlapping temporal intervals. {Describe Figure 1 a bit more}

Figure 2 shows the temporal evolution of selective force across every gene in the SARS-CoV-2 genome. Here, selective force is described as the number of positively and negatively selected sites normalized by kilobase of gene length and the internal tree length. We observe varying levels of correlation between positively and negatively selected sites (quantify this), with "early" selective force dominated by structural genes such as the Spike gene (check this) and recent upticks in positively selected sites across non-structural genes (verify). Mutations centered around the Spike gene are of functional and clinical interest. By April 2020… (say something about this date and selective force), by April 2021,… (say something about this date and selective force). This observation is consistent with early pandemic expectations and the dynamics present in early host-pathogen interactions (state the observation from time windows early in the pandemic and selective force). {Describe Figure 2 a bit more here}

Figure 3 shows an overview of selective pressure on signature mutation sites across the current VOCs and VOIs (Alpha to Delta, Mu). As shown, motif trajectories provide useful specificity for the type and direction of selective pressures at key signature sites. We observe several adaptive and conserved sites, the former being associated with escape variants.
A broad genomic overview is provided for non-Spike mutations, which are also key to SC2 function (ref). Temporal trends in signature sites can be observed, which allows early detection of key signature sites. Trends for these mutations show that most sequences also contain X mutation from month X onwards (Describe any mutations we pick up early, that become dominant later). {Describe Figure 3 a bit more here}

- $-\log_{10}(p)$ values show which sites were significantly selected in FEL. The colors show

---

## Methods

* Provide example of data flow
* Flow chart drawn by Jordan Zehr: https://app.diagrams.net/#G1ui5hTz6rHgOClDMd0-Jr5VsnK_bpENIx

![flow chart drawn by Jordan Zehr](https://i.imgur.com/adAkzyp.png)

* Flow chart improved by Avery Selberg: (note labels may need to change)

![](https://i.imgur.com/Ef7CjLa.png)

## Methods Draft

Data sources can be any collection of whole-genome (at least 28 kbp) SARS-CoV-2 consensus sequences.

### Preprocessing Steps

Steps 1-5 are achieved with the pre-msa.bf script, part of a suite of HyPhy batch scripts found at https://github.com/veg/hyphy-analyses/. Step 6 is achieved with bealign, a mapper found within the BioExt package located at https://github.com/veg/BioExt. The reference sequence is GenBank accession NC_045512. Coding region coordinates correspond to those listed in NC_045512. Preprocessing is applied to new samples that have not yet undergone the following steps, and the results are stored for later use.

1. **Filter by sample host and minimum length.** Sequences shorter than 28000 nucleotides are discarded. Incomplete genomes comprise less than 2% of the GISAID database, are equally distributed across collection dates, and many are only partial coding sequences. Due to these characteristics, minimal resolution is lost, while the chance of mapping a sequence to the wrong place in the genome is reduced.
The sample host is limited to human, because parts of the analysis compare SARS-CoV-2 selection inference against a backdrop of other betacoronaviruses, which closely related SARS-CoV-2 samples from bats would contaminate. Additionally, the analysis aims to detect selection occurring in humans, so mutations specific to non-human hosts could cloud inference.

2. **Partition each sample into 23 of 29 distinct coding regions.** Each whole-genome sample is split into 23 distinct coding regions, as outlined in _Table N_. The coordinates of each coding region follow those in GenBank accession NC_045512.2. Sequences are corrected for frameshifts, and stop codons are removed at this step.

   | Region | Type | Analyzed | % Ambiguity threshold |
   | -------- | -------- | -------- | -------- |
   | leader/nsp1 | NSP | Yes | 0.001 |
   | nsp2 | NSP | Yes | 0.001 |
   | nsp3 | NSP | Yes | 0.001 |
   | nsp4 | NSP | Yes | 0.001 |
   | 3C/nsp5 | NSP | Yes | 0.001 |
   | nsp6 | NSP | Yes | 0.001 |
   | nsp7 | NSP | Yes | 0.001 |
   | nsp8 | NSP | Yes | 0.001 |
   | nsp9 | NSP | Yes | 0.001 |
   | nsp10 | NSP | Yes | 0.001 |
   | nsp11 | NSP | No | N/A |
   | RdRp/nsp12 | NSP | Yes | 0.001 |
   | helicase/nsp13 | NSP | Yes | 0.001 |
   | exonuclease/nsp14 | NSP | Yes | 0.001 |
   | endoRNAse/nsp15 | NSP | Yes | 0.001 |
   | methyltransferase/nsp16 | NSP | Yes | 0.001 |
   | ORF3a | Accessory Factor | Yes | 0.01 |
   | ORF3b | Accessory Factor | No | N/A |
   | ORF6 | Accessory Factor | Yes | 0.01 |
   | ORF7a | Accessory Factor | Yes | 0.01 |
   | ORF7b | Accessory Factor | No | N/A |
   | ORF8 | Accessory Factor | Yes | 0.01 |
   | ORF10 | Accessory Factor | No | N/A |
   | S | Structural | Yes | 0.005 |
   | N | Structural | Yes | 0.01 |
   | E | Structural | Yes | 0.01 |
   | M | Structural | Yes | 0.01 |

3. **Filter by similarity score to reference sequence.** After partitioning the sample into its respective gene regions, each region is mapped to a reference sequence. A similarity score is used to determine whether the sample is close enough to align to the in-frame reference.

5.
   **Remove stop codons.** Each partitioned sequence is inspected for the presence of stop codons. If any are present, the partitioned sequence is discarded.

6. **Filter by fraction of ambiguous nucleotides.** After the sample has been partitioned, each partitioned sample is filtered based on the fraction of ambiguous base calls relative to the length of the region. See the *% Ambiguity threshold* column of _Table N_ for the threshold assigned to each region.

7. **Map to reference.** Each partitioned sample that passes quality assessment is aligned to a single reference sequence using a codon-aware extension of the Smith-Waterman dynamic programming algorithm (cite HIV-TRACE). This approach not only scales linearly, but also allows storage and reuse of previously aligned sequences.

### Analysis Pipeline

Once samples have been preprocessed and have passed quality screening, selection analyses can be started based on a set of predefined queries to stored data. The analyses are conducted at regular intervals defined by a configuration supplied to a workflow management platform, and individual tasks are executed by an asynchronous task queue on a high-performance computing cluster.

1. **Export sequences.** Pre-aligned samples stored in MongoDB are exported based on a predefined set of options, such as collection date ranges or lineage assignment. A FASTA file containing sequence data and a metadata JSON file are exported.

3. **Collapse duplicates.** Duplicates are filtered from the exported FASTA file and logged for downstream processing, such as haplotype frequency reporting. For all methods in the workflow, duplicate entries provide no information for inference (log-likelihoods will be exactly the same with or without them) but slow down execution.

4.
   **Mask sites that are low frequency variants.** Since the majority of variants are rare (< 1% frequency) and carry a strong possibility of sequencing error, low-frequency variants are masked by replacing the variant with a gap character. A detailed description of the algorithm to mask low-frequency variants is supplied in Supplementary Materials.

5. **Filter sequences with many low frequency variants.** The number of masked low-frequency variants in each sequence is logged during the masking step; if that number exceeds the mean by more than 5 standard deviations, the sequence is removed from the analysis altogether. A detailed description of the algorithm to remove sequences with excessive low-frequency variants is supplied in Supplementary Materials.

6. **Subsample.** If the number of remaining samples exceeds 10000, the dataset is subsampled to near or below 10000 sequences. Given an MSA of N sequences, the workflow computes all $N(N-1)/2$ pairwise genetic distances under the Tamura-Nei 93 (TN93) model, and then clusters the sequences using a distance threshold that is determined separately to reduce the dataset by the desired amount. A representative sample is then randomly selected from each cluster and written to a new FASTA file for downstream processing.

7. **Infer Phylogeny.** A phylogeny is inferred from each of the filtered alignments using rapidnj (cite), a software implementation of the neighbor-joining algorithm (cite).

8. **FEL.** Sites under pervasive selection are inferred using FEL (Fixed Effects Likelihood) (cite). The method is implemented in the HyPhy software package.

9. **MEME.** MEME (Mixed Effects Model of Evolution) (cite) uses a maximum likelihood methodology and performs a likelihood ratio test for positive selection at each site, comparing models that allow or disallow positive diversifying selection at a subset of branches.
The method enables detection of positive selection on a proportion of branches, as opposed to FEL, which only detects pervasive positive or negative selection. The method is implemented in the HyPhy software package.

10. **SLAC.** SLAC (Single-Likelihood Ancestor Counting) (Kosakovsky Pond and Frost, 2005) is used to count both non-synonymous and synonymous substitutions across the inferred tree by inferring the ancestral state at each site.

11. **Summary.** Results from each analysis are collated into a summary report. The report includes amino-acid variant composition per CDS (non-synonymous and synonymous reported separately), the minor-to-major allele frequency ratio across the entire CDS of interest, the total internal branch length under a generalized extension of the MG94 model (cite), and global dN/dS rates. For each site in each CDS, the report also includes frequency trends, whether the site was inferred to be under significant positive or negative selection, whether the variant is unexpected in the context of other betacoronaviruses, and whether the site belongs to a predicted CTL epitope.

The repository associated with the analyses described in this document is located at https://github.com/veg/SARS-CoV-2.

Supplementary HyPhy scripts that offer functions such as

* Frameshift correction
* Ambiguous character filtering
* Homology screening
* Amino-acid translation
* Duplicate reduction

can be found at https://github.com/veg/hyphy-analyses. In particular, the scripts in the directory https://github.com/veg/hyphy-analyses/tree/master/codon-msa are used for initial processing of incoming sequences.

The results of these analyses are provided to the user in an interactive web application.

### Additional Analyses

1. **Predicted CTL epitope** Used to report whether the site belongs to a predicted CTL epitope. The list of epitopes comes from two different sources, and an epitope must be of length 9 or greater to be considered.
The first source is Campbell et al.'s (cite) predicted epitopes with binding affinities less than 500 nM (nanomolar). The second source is currently being added to the dashboard, and will shortly include experimentally validated epitopes from Nelde et al. (cite).

2. **Unexpected residues** When a site belongs to this category, at least one amino acid present in more than one sequence was not predicted to occur in other betacoronaviruses. This might indicate evolutionary events that have not been observed in similar viruses before. This is determined by likelihoods inferred by the PRIME (cite) method on a set of 1114 whole-genome sequences obtained from GenBank. (See Supplementary Materials)

### Web Application

<!-- Will write once I review what I have in the Svelte application -->

### Interactive Notebooks

A suite of interactive notebooks has been assembled to visualize results.

#### Examining natural selection history on global SARS-CoV-2 genomes

[Examining natural selection history on global SARS-CoV-2 genomes](https://observablehq.com/@spond/selection-profile) shows the history of natural selection pressures on the SARS-CoV-2 genome through analyses of overlapping three-month intervals dating back to the beginning of the SARS-CoV-2 pandemic. The earliest interval ends in February 2020, and the notebook is continuously updated. In addition to a full export via CSV, the visualization reports the number of positively and negatively selected sites in the most recent sliding window, the number of selected sites gained and lost since the previous sliding window, the median number of time periods a site has been under selection, and the total number of selected sites across all time points. A filter can be applied to report only on CDS of interest, along with a p-value threshold.

##### Selective Force

To allow direct comparison between CDS across time points, a metric has been developed to assess the amount of selective 'force' on a CDS.
Selective force is defined as the number of positively and negatively selected sites normalized by kilobase of gene length and the internal tree length.

`Selective Force = Total Selected Sites/(Substitutions across the tree x CDS length)`

Genes are sorted by maximal 'force' of positive selection over all time within the visualization.

#### Evidence of natural selection history operating on SARS-CoV-2 genomes

To view how natural selection may have operated on individual sites over time, [Evidence of natural selection history operating on SARS-CoV-2 genomes](https://observablehq.com/@spond/sars-cov-2-selected-sites) shows when each variant of interest was first inferred to be under selection. The size of each circle increases as the p-value of the FEL test decreases, and its color indicates whether the site was under positive (red) or negative (blue) selection. Filters may be applied to view variant constellations associated with VOCs/VOIs, or all sites inferred to be under positive or negative selection over the past three intervals.
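
As a rough illustration of the selective force metric defined under *Selective Force* above, the following Python sketch computes it for a single CDS and time window. It assumes that "substitutions across the tree" is supplied as the total internal branch length and that the CDS length is given in nucleotides and converted to kilobases. The function name, argument names, and example numbers are hypothetical and are not taken from the pipeline.

```python
def selective_force(n_selected_sites, internal_tree_length, cds_length_nt):
    """Selected sites per (internal tree length x kilobase of CDS).

    n_selected_sites: positively plus negatively selected sites in the window.
    internal_tree_length: total internal branch length (assumed input).
    cds_length_nt: CDS length in nucleotides (assumed unit).
    """
    if internal_tree_length <= 0 or cds_length_nt <= 0:
        raise ValueError("tree length and CDS length must be positive")
    cds_length_kb = cds_length_nt / 1000.0
    return n_selected_sites / (internal_tree_length * cds_length_kb)


# Hypothetical example: 12 selected sites, internal tree length 0.5,
# and a CDS of 3822 nucleotides.
print(round(selective_force(12, 0.5, 3822), 3))
```

Dividing by both terms is what makes values comparable between short and long genes and between sparsely and densely sampled time windows.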
#### Evolutionary prioritization of codons and mutations in SARS-CoV-2 genomes

[Evolutionary prioritization of codons and mutations in SARS-CoV-2 genomes](https://observablehq.com/@spond/sc2-prioritization) is a simplified view that assigns each site a significance score based on the following criteria:

##### Evolutionary prediction

Evolutionary predictions for codons that are expected to occur in SARS-CoV-2 are based on the evolution of `nCOV` bat/pangolin viruses, using the alignment from [Lytras et al](https://www.biorxiv.org/content/10.1101/2021.01.22.427830v3). This JSON file contains records like

```
"274": [
  "CTT",
  {
    "CTA": 0.0009359560817146797,
    "CTC": 0.003176408584479247,
    "CTG": 0.0005105015949308743,
    "CTT": 0.995375735314138
  }
]
```

This means that at site `274` (nucleotide coordinates in the SARS-CoV-2 genome), the genomic reference for the virus (Wuhan-1 strain) has codon `CTT`, and the four codons predicted to occur at this site are `CT*`, with the corresponding imputed probabilities (evolutionary "likelihood") shown. When a codon is invariable in `nCOV` sequences, the second entry will be "null". Sites in SARS-CoV-2 that lack homologous sites in `nCOV` sequences (e.g. insertions) will not be present.

##### Inferred selection results based on 3-month sliding windows analyses.

Site-level selection data is computed with FEL (3-month windows), and sites are reported if they have been found under selection at least once (p ≤ 0.05). Attributes for each significant site are

1. `gene` : gene/ORF
2. `site` : site in gene
3. `seqs` : number of representative sequences used for analysis
4. `from` : start of the time interval
5. `to`: end of the time interval
6. `p`: p-value for non-neutrality (positive if alpha < beta, negative otherwise)
7. `alpha` : site MLE estimate for the synonymous rate
8. `beta` : site MLE estimate for the non-synonymous rate
9. `T_int`: total length of internal branches in the gene tree used for inference (subs/site)
10.
    `T_total`: total length of all branches in the gene tree used for inference (subs/site)

##### Adaptive and Conserved Scores

The **adaptive score** is calculated as:

`Recently Selected + Type + (Number of Periods With Significant Results/Total Analyses Conducted) + NotPredicted + (1-HRank)^10`

where

* Recently Selected is 1 if the variant has only been reported as positively selected within the past three time periods, 0 otherwise.
* Type is 1 if 'Alternating', 0 otherwise.
* NotPredicted is based on the evolutionary prediction outlined earlier. It is 1 if the probability of observing the mutation is lower than 1e-6, 0 otherwise.
* HRank is the frequency at which the haplotype occurs across all haplotypes.

<!-- TODO: Validate -->

The **conserved score** is calculated as:

`Recently Selected - Type*0.5 + (Number of Periods With Significant Results/Total Analyses Conducted) + Predicted + (1-HRank)^10`

where

* Recently Selected is 1 if the variant has only been reported as negatively selected within the past three time periods, 0 otherwise.
* Type is 1 if 'Alternating', 0 otherwise.
* Predicted is based on the evolutionary prediction outlined earlier. It is 1 if the probability of observing the mutation is greater than 1e-4, 0 otherwise.
* HRank is the frequency at which the haplotype occurs across all haplotypes.

<!-- TODO: Validate -->

'Alternating' is reported if both positive and negative selection were detected for the respective site at some point across all sliding windows.

#### Natural selection analysis of global SARS-CoV-2/COVID-19

<!-- TODO -- Do we want to resurrect these observables?
-->

* Natural selection analysis of global SARS-CoV-2/COVID-19 https://observablehq.com/@spond/revised-sars-cov-2-analytics-page
* Evolutionary annotation of global SARS-CoV-2/COVID-19 genomes https://observablehq.com/@spond/evolutionary-annotation-of-sars-cov-2-covid-19-genomes-enab?collection=@spond/sars-cov-2
* Combined view of sites which may be experiencing convergence in N501Y P1 and P2 clades https://observablehq.com/@spond/combined-view-of-sites-which-may-be-experiencing-convergen/2

Source code supplying scripts and workflows can be found at https://github.com/veg/SARS-CoV-2.

-----

## Discussion

* Summarize what you get.
* Explain again how this is useful. Examples

> The power of these analyses to detect evidence of selection acting on individual codon sites progressively increased over time with rising numbers of sampled genome sequences and sequence diversification. -- Martin et al.

* Why it is not enough to just look at variants that are increasing in frequency (the obvious stuff)
* Limitations
  * Selection signals may be false positives due to sequencing error, incorrect phylogenetic inference, or recombinants.
  * Low power due to relatively low divergence.
  * Temporal and spatial bias
  * Epistasis
* How do **we** think it should be used.
  * DMS (Expansion from single mutations to combinations of interest.)
  * Epidemiological (Nextstrain, alternative approaches to classifying ancestry as mutations converge, reduce classification promiscuity by introducing quantitative metrics, as opposed to qualitative observation.)
  * Predictive ()
* Future directions
  * Subsampling accounting for temporal and spatial bias.

### Discussion Draft

> The past several months have witnessed the emergence of four SARS-CoV-2 variants of concern (Alpha, Beta, Gamma and Delta) associated with increased transmissibility, increased risk of reinfection and/or reduced vaccine efficacy, with many VOCs sharing mutations being identified.
>
> The power of these analyses to detect evidence of selection acting on individual codon sites progressively increased over time with rising numbers of sampled genome sequences and sequence diversification. -- Martin et al.

<!-- TODO Papers to cite:

The biological and clinical significance of emerging SARS-CoV-2 variants - https://www.nature.com/articles/s41576-021-00408-x

> Although it had been previously assumed that waning immunity explained the observation that people are commonly reinfected with endemic common-cold coronaviruses, recent studies suggest that antigenic drift also contributes to the lack of long-lasting protection following coronavirus infections.

> Within 1 month, two additional rapidly growing lineages with large numbers of genetic changes were reported from South Africa and Brazil.

> https://www.nature.com/articles/s41586-021-03807-6 -- Bloom

-->

---

## Acknowledgements

---

We gratefully acknowledge all of the authors from the originating laboratories responsible for obtaining the specimens and the submitting laboratories where genetic sequence data were generated and shared via the GISAID Initiative, on which this research is based. The Wellcome Trust funded D.P.M. (222574/Z/21/Z). The US National Institutes of Health funded S.L.K.P. (R01AI134384 and AI140970).

## Supplementary Material

https://hackmd.io/@stevenweaver/rk3S4E6qF

---

## Figures

![](https://i.imgur.com/v6gG02f.png)
![](https://i.imgur.com/KlTeIiC.png)
![](https://i.imgur.com/0Aod0GI.png)

**Figure 1.** Genomic overview of natural selection pressures on the SARS-CoV-2 genome, through analyses of 21 three-month overlapping intervals going back to the beginning of the pandemic. The earliest intervals end in February 2020 and the latest in October 2021. Figures like this can be generated from https://observablehq.com/@spond/selection-profile.
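
The overlapping intervals mentioned in the Figure 1 caption can be enumerated with a short Python sketch. It assumes that each window spans three whole calendar months and that consecutive windows advance by one month; the source does not state the step size or exact boundaries, so both are assumptions, and the function names are hypothetical.

```python
import calendar
from datetime import date


def month_shift(year, month, delta):
    """Return (year, month) shifted by delta months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def sliding_windows(first_end, last_end, span_months=3):
    """List (start, end) dates for windows ending in each month
    from first_end to last_end, both given as (year, month)."""
    windows = []
    year, month = first_end
    while (year, month) <= last_end:
        start_year, start_month = month_shift(year, month, -(span_months - 1))
        start = date(start_year, start_month, 1)
        end = date(year, month, calendar.monthrange(year, month)[1])
        windows.append((start, end))
        year, month = month_shift(year, month, 1)
    return windows


# Windows ending February 2020 through October 2021 (assumed monthly step).
windows = sliding_windows((2020, 2), (2021, 10))
print(len(windows), windows[0], windows[-1])
```

Under these assumptions the sketch yields 21 windows, which agrees with the count given in the caption.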
![](https://i.imgur.com/d9aex4E.png)

**Figure 2.** Temporal evolution of selective force, defined as the number of positively and negatively selected sites normalized by kilobase of gene length and the internal tree length (sites / [substitutions across the tree x gene length]); this quantity should be directly comparable between genes and time points. Genes are sorted by maximal 'force' of positive selection over all time. Figures like this can be generated from https://observablehq.com/@spond/selection-profile.

**ALPHA**
![](https://i.imgur.com/UiST91C.png)

**BETA**
![](https://i.imgur.com/iAn7Xqu.png)

**GAMMA**
![](https://i.imgur.com/IVGcbeT.png)

**DELTA**
![](https://i.imgur.com/etjXds1.png)

**MU**
![](https://i.imgur.com/voDdfVr.png)

**Figure 3.** Evolutionary trajectories of XX selected sites. If a site was found to be positively (negatively) selected during a specific time period, a bubble is drawn at the corresponding point on the plot. The area of the bubble is scaled as $-\log_{10}(p)$, where p is the p-value of the FEL likelihood ratio test. Larger bubbles correspond to smaller p-values; p-values are not directly comparable between different time windows and different genes due to differences in sample sizes and other factors. The x-axis shows the endpoint of the time window; e.g., Mar 30th 2021 corresponds to the analysis performed with the data from Jan 01 2021 to Mar 30th 2021.
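
The bubble scaling described in the Figure 3 caption can be sketched in Python as follows. The scale factor and the lower bound on p (used so that p = 0 does not give an infinite area) are illustrative assumptions, and the function names are hypothetical rather than taken from the notebook.

```python
import math


def bubble_area(p_value, scale=10.0, min_p=1e-16):
    """Bubble area proportional to -log10(p).

    scale and min_p are illustrative choices, not values from the notebook.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    clipped = max(p_value, min_p)  # avoid log10(0)
    return scale * -math.log10(clipped)


def bubble_radius(p_value, scale=10.0, min_p=1e-16):
    """Radius of a circle whose area equals bubble_area(p_value)."""
    return math.sqrt(bubble_area(p_value, scale, min_p) / math.pi)


# Smaller p-values give larger bubbles.
for p in (0.05, 0.01, 0.001):
    print(p, round(bubble_area(p), 2), round(bubble_radius(p), 2))
```

Because the area rather than the radius is proportional to $-\log_{10}(p)$, the radius grows with the square root of that quantity, which keeps very small p-values from dominating the plot.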
