# Christien and Miriam - ACE2
---
## Live Chat
Accession numbers:
* Human NM_001371415
* Gibbon (Nomascus leucogenys - northern white-cheeked gibbon) XM_003261084.3 → XP_003261132.2
* Pig XM_021079374.1 → XP_020935033.1
* Duck XM_013094461.4 → XP_012949915.3
Example paper:
https://github.com/MarieKoehler/Bioinformatik/blob/main/WS2021%20Bioinformatik/BioinformatikMaster_GruppeJ_MarieK%C3%B6hler.pdf
---
## Presentation information
ACE2 (angiotensin-converting enzyme 2):
* The main physiological function of ACE2 is as an important component of the renin-angiotensin-aldosterone system (RAAS)
* The RAAS regulates the body's fluid and electrolyte balance and thus influences blood pressure
* ACE2 is localized in the heart and kidneys, and also in the lung epithelium
* In the lungs, it is used by the SARS-CoV-2 virus as a receptor to enter the host cell
* Detailed research on the receptor function of ACE2 is still lacking (= gap)
* To advance drug research against the coronavirus, the goal of the paper is to perform a precise sequence analysis (also using gene models), to analyze mutations, and to highlight possibly relevant homologies
Gene analysis:
* 1. Gene: 40028 bp
* 2. Pre-mRNA: 39928 nt
* 3. Mature RNA: 3339 nt
* 4. ORF for ACE2: 2418 nt
* 5. Protein: 805 aa
| ORF start | Minimum length [nt] | ORFs found |
| -------- | -------- | -------- |
| ATG only | 75 | 37 |
| ATG only | 150 | 8 |
| ATG and alternatives | 75 | 52 |
| ATG and alternatives | 150 | 15 |
→ the presence of many possible ORFs in the sequence could have implications for the folding of the protein and thus for the ability of the virus to bind to the receptor
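
The lengths in the gene analysis list can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; it assumes that the ORF length includes the stop codon and that the mature transcript consists only of the ORF plus untranslated regions. The variable names are illustrative.

```python
# Lengths as reported in the notes above.
gene_bp = 40028
pre_mrna_nt = 39928
mature_mrna_nt = 3339
orf_nt = 2418
protein_aa = 805

# An ORF is read in triplets; the final codon is the stop codon and is not translated.
codons, remainder = divmod(orf_nt, 3)
assert remainder == 0, "ORF length should be a multiple of 3"
translated_aa = codons - 1

# Assumption: everything removed between pre-mRNA and mature RNA is intronic.
intron_nt = pre_mrna_nt - mature_mrna_nt
# Assumption: the rest of the mature transcript is 5' + 3' UTR.
utr_nt = mature_mrna_nt - orf_nt

print("protein length consistent:", translated_aa == protein_aa)
print("intronic nt:", intron_nt, "| UTR nt:", utr_nt)
```

Under these assumptions, 2418 nt correspond to 806 codons, that is 805 amino acids plus one stop codon, which agrees with the reported protein length.
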
Mutations? (in intron or exon regions? in the receptor-binding region?)
Homologies via alignment and phylogenetic tree analysis (research in other animal species on whether SARS-CoV-2 binds and, if not, what differs in the sequence → use in drug research)
* ACE2 variants of gibbon and human are by far the most similar → gibbon is the first choice for research purposes
* Pig ACE2 has only 62.8 % similarity in the local alignment of the transcripts (but still 91.2 % similarity at the protein level)
* In the global alignment, the porcine ACE2 protein is only 46.5 % identical
* Duck ACE2 differs too much in the alignment and should not be used for research
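
The local and global percentages differ because a local alignment scores only the best-matching region, while a global alignment spans both sequences end to end, gaps included. The sketch below shows one common way to compute percent identity from two already aligned, gapped sequences. The toy sequences are hypothetical and not real ACE2 data, and alignment tools may use a different denominator or report similarity instead of identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns divided by alignment length; gap columns count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: the full-length (global) view ...
global_a = "MSSSSWLLLS--LVAV"
global_b = "MS-GSWFLLSQALVA-"
# ... and only the best-matching region (local view).
local_a = "SWLLLS"
local_b = "SWFLLS"

print("global identity:", percent_identity(global_a, global_b))
print("local identity: ", round(percent_identity(local_a, local_b), 1))
```

The local value comes out higher than the global one for the same pair, which is the same pattern as in the pig comparison.
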
---
## Big Picture Suggestions
* In 2019, angiotensin-converting enzyme 2 (ACE2) suddenly became a major player in one of the biggest pandemics of our time, and attention shifted away from its original role as an actor in the renin-angiotensin-aldosterone system.
* With nearly 260 million infections and over 5 million deaths worldwide since the beginning of the COVID-19 pandemic (as of November 2021; www.covid19.who.int), research on angiotensin-converting enzyme 2 (ACE2) has abruptly shifted toward a detailed description of how SARS-CoV-2 binds to ACE2.
* The corona pandemic affects the life of every individual, which gives ACE2 new significance in science as a route to possible solutions for everyone.
* Angiotensin-converting enzyme 2 (ACE2), a true multi-talent in the world of proteins, proves its right to exist through its important roles in the renin-angiotensin-aldosterone system (RAAS), in amino acid transport in the intestinal tract and, most notably at present, as the major receptor for the SARS-CoV-2 virus.
* “The coronavirus is currently changing life in our country dramatically. Our idea of normality, of public life, of social togetherness - all of this is being put to the test like never before. It's serious. Take it seriously too.” - Angela Merkel, Chancellor, Germany.
* In the near future, it may be possible to treat all corona patients successfully, with the ACE2 receptor as a key factor.
* **Since 2019, 5.4 million corona cases have been observed in Germany alone and over 100,000 people have lost their lives.**
---
## General Info
* https://downloads.hindawi.com/archive/2012/256294.pdf
* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7188049/
* https://www.uniprot.org/uniprot/Q9BYF1
* EC-Number 3.4.17.23
* ACE2 gene: ENSG00000130234
* Transcript 1: ACE2-201 ENST00000252519.8 → Exons: 18, Coding exons: 18, Transcript length: 3,339 bp, Translation length: 805 residues
* Current coronavirus case numbers: https://de.statista.com/statistik/daten/studie/1102667/umfrage/erkrankungs-und-todesfaelle-aufgrund-des-coronavirus-in-deutschland/
* https://covid19.who.int/
#### Tasks of ACE2 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7362321/):
1. Protease in the renin-angiotensin-aldosterone system (angiotensin II is converted to angiotensin 1-7, angiotensin I to angiotensin 1-9)
2. Function in the kidney
3. Amino acid transport → takes over the role of collectrin in the intestine (stabilizes the transporter B0AT1 in the membrane)
4. Coronavirus receptor → the virus preferentially uses ACE2 as an entry portal into the host cell
---
## Database
In humans:
| Gene name | Protein name | Alternative protein names |
| -------- | -------- | -------- |
| ACE2 (UNQ868/PRO1885) | Angiotensin-converting enzyme 2 | Angiotensin-converting enzyme-related carboxypeptidase, Angiotensin-converting enzyme homolog, Metalloprotease |
In mouse, rat, cat and cattle:
| Gene name | Protein name | Alternative protein names |
| -------- | -------- | -------- |
| ACE2 | Angiotensin-converting enzyme 2 | ACE-related carboxypeptidase |
---
## SARS-CoV-2-specific info
* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7938645/

(Source: DOI: 10.1161/CIRCRESAHA.120.317015)
---
## Introduction
Since 2019, 260 million corona cases have been observed and over 5 million people have lost their lives worldwide (https://covid19.who.int/). These numbers are dramatic and show how urgent research on coronavirus infection is. Angiotensin-converting enzyme 2 (ACE2) is primarily one of the major players in the renin-angiotensin-aldosterone system (RAAS) (Tikellis & Thomas, 2012), but during the pandemic it became known as the cellular receptor for the SARS-CoV-2 virus.
The ACE2 gene is located on the X chromosome, in contrast to the ACE gene, which is located on chromosome 17. It contains 18 exons, many of which are similar to those of the ACE gene. ACE2 has only one catalytic domain (Gheblawi et al., 2020), a metallopeptidase domain that shares 42% sequence identity and 61% sequence similarity with the catalytic domain of ACE (Tikellis & Thomas, 2012).

Figure 1: Function of ACE2 in the RAAS and its key role in SARS-CoV-2 infection (Pathangey et al., 2021).
ACE2 is expressed in almost all tissues, where it acts as a type I integral membrane glycoprotein. Expression is highest in the kidney, in the endothelium and, notably, in the lung (Tikellis & Thomas, 2012), where it is easily accessible to SARS-CoV-2.

Figure 2: Role of ACE2 in the course of infection with the coronavirus. Endocytosis of the enzyme together with SARS-CoV-2 weakens the protective shield of the immune system (Gheblawi et al., 2020).
---
# Unit 2
Key findings
* 1. Gene: 40028 bp
* 2. Pre-mRNA: 39928 nt
* 3. Mature RNA: 3339 nt
* 4. ORF for ACE2: 2418 nt
* 5. Protein: 805 aa
| ORF start | Minimum length [nt] | ORFs found |
| -------- | -------- | -------- |
| ATG only | 75 | 37 |
| ATG only | 150 | 8 |
| ATG and alternatives | 75 | 52 |
| ATG and alternatives | 150 | 15 |
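
How the ORF count depends on the allowed start codons and the minimum length can be illustrated with a small scanner. This is a sketch under stated assumptions, not the tool that produced the table: it scans all six reading frames, requires a stop codon, reports one ORF per stop (from the first start codon after the previous stop), and takes CTG and TTG as the "alternative" start codons. `TOY` is a hypothetical sequence, so the counts will not match the table.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ATG_ONLY = {"ATG"}
# Assumption: alternative initiation codons of the standard genetic code.
ATG_AND_ALTERNATIVES = {"ATG", "CTG", "TTG"}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, starts=ATG_ONLY, min_len=75):
    """Return (strand, frame, start, end) tuples; length includes the stop codon."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon in starts:
                    orf_start = i          # first start codon since the last stop
                elif orf_start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if end - orf_start >= min_len:
                        orfs.append((strand, frame, orf_start, end))
                    orf_start = None       # the stop closes the current ORF
    return orfs


# Hypothetical toy sequence: one ATG-initiated and one TTG-initiated ORF of 96 nt each.
TOY = "CC" + "ATG" + "GCT" * 30 + "TAA" + "TTG" + "GCA" * 30 + "TAG"

for label, starts in (("ATG only", ATG_ONLY), ("ATG and alternatives", ATG_AND_ALTERNATIVES)):
    for min_len in (75, 150):
        print(label, min_len, len(find_orfs(TOY, starts, min_len)))
```

As in the table, allowing alternative start codons increases the number of ORFs found, and raising the minimum length reduces it.
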
* For APE, full gene view: https://www.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000130234;r=X:15492313-15609489;t=ENST00000252519
Images created with SnapGene


---
# Unit 3
## 1. Homologous Sequences
### 1.1 Homologies with short random sequences
* I did not get a match with ACE2 for randomized sequences of any length. Luckily, I was at least able to BLAST part of the ACE2 sequence against itself and it was found again; otherwise I would have thought I was doing something wrong :D
* In the end, the point of the experiment is that the probability of a match between the target gene and a randomized sequence decreases the longer the compared sequence is
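
Why longer random sequences match less often can be estimated with a back-of-the-envelope calculation. The sketch below assumes exact matches on one strand and independent, equally frequent bases; BLAST itself uses a different statistical model for its E-values, so this only illustrates the trend. The target length of 3339 nt is the mature ACE2 transcript from the notes, and the function name is illustrative.

```python
def expected_exact_hits(target_len: int, query_len: int) -> float:
    """Expected number of exact occurrences of one random query in a random target."""
    positions = max(target_len - query_len + 1, 0)   # possible start positions
    return positions * 0.25 ** query_len             # each position matches with probability 4^-k


TARGET_LEN = 3339  # mature ACE2 transcript length in nt

for k in (5, 10, 15, 20):
    print(f"query length {k:2d}: expected hits = {expected_exact_hits(TARGET_LEN, k):.3g}")
```

Each additional base divides the expected number of chance hits by roughly four, so already at 10 nt a random query is expected to hit far less than once.
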
### 1.2 Finding homologous sequences in database
Comparison of
* Human
* Gibbon
* Pig
* Duck
FASTA sequences of the gene, cDNA and protein are saved for each species:
* https://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000130234;r=X:15557039-15580438;t=ENST00000677282
* https://www.ensembl.org/Nomascus_leucogenys/Gene/Summary?g=ENSNLEG00000009423;r=X:13571909-13616710;t=ENSNLET00000012094
* https://www.ensembl.org/Sus_scrofa/Gene/Summary?g=ENSSSCG00000012138;r=X:12099853-12151275;t=ENSSSCT00000034032
* https://www.ensembl.org/Anas_platyrhynchos_platyrhynchos/Gene/Summary?g=ENSAPLG00000014477;r=1:126720102-126736364;t=ENSAPLT00000015165#
### 1.3 Comparing Homologous Sequences by Alignment
### 1.4 Find more similar sequences in databases: BLAST, "blast"
### 1.5 UniProt Align
UniProt identifiers (entry names):
* Human: ACE2_HUMAN
* Gibbon: G1RE79_NOMLE
* Pig: K7GLM4_PIG
* Duck: A0A6J3E419_AYTFU
Alignment Links:
https://www.uniprot.org/align/A202112095BF3C56A578D7D6DFD1FC81EE5DA773000EE10K
Selected criteria:
* Similarity
* Glycosylation (a feature of the human protein)
* Mutagenesis
## 2. Phylogenies
### 2.1 Compare/align all homologous sequences
https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=clustalo-I20211209-145658-0926-12437511-p2m&analysis=phylotree
(
(
XM_013094461.4:0.26859,
XM_021079374.1:0.09459)
:0.07484,
NM_001371415.1:0.00799,
XM_003261084.3:0.00938);
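
The branch lengths in the Newick string above can be turned into pairwise (patristic) distances by adding up the branches on the path between two leaves. The following is a minimal sketch with a hand-written parser for simple Newick strings such as this one; it is not a general Newick implementation, and the species labels are taken from the accession list at the top of these notes.

```python
from itertools import combinations

NEWICK = ("((XM_013094461.4:0.26859,XM_021079374.1:0.09459):0.07484,"
          "NM_001371415.1:0.00799,XM_003261084.3:0.00938);")
SPECIES = {"XM_013094461.4": "duck", "XM_021079374.1": "pig",
           "NM_001371415.1": "human", "XM_003261084.3": "gibbon"}


def leaf_paths(newick):
    """Map each leaf to its list of (node_id, branch_length) edges up to the root."""
    text = "".join(newick.split()).rstrip(";")
    pos, next_id, paths = 0, 0, {}

    def parse_node():
        nonlocal pos, next_id
        node_id, next_id = next_id, next_id + 1
        leaves = []
        if text[pos] == "(":                      # internal node: parse children
            pos += 1
            while True:
                leaves.extend(parse_node())
                if text[pos] != ",":
                    break
                pos += 1
            pos += 1                              # skip ")"
        start = pos                               # optional node name
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name, length = text[start:pos], 0.0
        if pos < len(text) and text[pos] == ":":  # optional branch length
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        if not leaves:                            # no children: this is a leaf
            paths[name] = []
            leaves = [name]
        for leaf in leaves:
            paths[leaf].append((node_id, length))
        return leaves

    parse_node()
    return paths


def patristic_distance(paths, a, b):
    """Sum the branches that lie on exactly one of the two leaf-to-root paths."""
    edges_a, edges_b = dict(paths[a]), dict(paths[b])
    return sum(length for node, length in {**edges_a, **edges_b}.items()
               if (node in edges_a) != (node in edges_b))


paths = leaf_paths(NEWICK)
for a, b in combinations(SPECIES, 2):
    print(f"{SPECIES[a]:>6} - {SPECIES[b]:<6} {patristic_distance(paths, a, b):.5f}")
```

For this tree, the human and gibbon transcripts are separated by 0.00799 + 0.00938 = 0.01737, while the path from human to duck adds up to 0.35142, which is consistent with the alignment results summarized earlier.
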

### 2.2 Phylogenetic tree based on protein sequence
---
# Unit 4
When is your gene expressed during development?
In which tissues is the gene expressed?
* perfect summary: https://www.proteinatlas.org/ENSG00000130234-ACE2/tissue
* Membranous expression in proximal renal tubules, intestinal tract, seminal vesicle, epididymis, exocrine pancreas and gallbladder
* Expressed in Sertoli and Leydig cells, and trophoblasts
* Membranous expression in ciliated cells in nasal mucosa, bronchus, and fallopian tube
* Expressed in endothelial cells and pericytes in many tissues
How strongly is it expressed? Compare to other genes, e.g. rp49,
When/where is the protein expressed? Does it correlate?
How does expression change in disease?
* https://www.proteinatlas.org/ENSG00000130234-ACE2/pathology
* prognostic marker in renal cancer and liver cancer
## 1. Gene expression DATABASES - Let’s try out one database and find out what kind of gene expression data we can find there and what we can do with it.
Go to the PPT file “ArrayExpress” (www.ebi.ac.uk/arrayexpress/) and follow the instructions step by step to learn how to navigate one example database.
Then use ArrayExpress to find out:
1. How many experiments were done involving your gene?
• 12 experiments
2. Are there also studies with homologues of your gene in other species?
• Only one study with Homo sapiens (with 11 arrays)
• The other experiments use, for example, yeast, mouse or rainbow trout
3. How many human samples are from cancerous samples, or a specific kind of cancer?
• HEK cells (HEK293 embryonic kidney cells) were used for the experiment
4. You already know what kind of disease your gene is involved in; do you find datasets reflecting the presumed role of your gene?
• Unfortunately not
5. How many experiments are microarray? How many experiments use RNA sequencing technologies?
• The human experiment is array-based
• The other 11 experiments with other species use array, RNA-seq and ChIP array
Summarize these findings in your final report and refer to the Accession number of the Experiment with Hyperlink
* https://www.ebi.ac.uk/arrayexpress/search.html?query=ace2
• The data for ACE2 is very limited here (especially for human ACE2)
• Only 12 experiments were found, which is very few compared to e.g. p53 with 963 experiments
• So there is a great need to start more experiments in human cells (especially lung cells) to improve the data situation, especially with regard to corona
## 2. Gene Expression DATA
Do the exercises described in the “Introduction ExpressionAtlas” document (PPT as PDF). Then try it out for your own gene:
* Search for your gene in Humans
* In the list of all experiments, go to ‘32 Uhlen’s Lab’
* Investigate your gene’s expression in the dataset, switch to Boxplot view and save it as a figure for your report

(the image is rather blurry; hopefully it can be displayed better in the report)
# Unit 5
Gap/research question?
Go into even more detail in basic research to understand ACE2 as the receptor binding site for SARS-CoV-2.
Task: Finding R packages relevant to our gap
* To make the large amount of data collected so far easier to grasp, the R package alignfigR, which is built on ggplot2, would be useful. It creates extensible figures of multiple sequence alignments, so the differences between isoforms or species can be seen at a glance. (https://cran.r-project.org/web/packages/alignfigR/index.html)
* To better understand ACE2 visually, the package "Autoplotprotein" would also be useful. A plot of the amino acid changes at the protein level, together with functional elements such as domains and mutation sites, can help to better understand the protein and its function as a receptor. (https://cran.r-project.org/web/packages/Autoplotprotein/index.html)
Statistics with R:
* Histogram with R:

* t-test with R:
Two Sample t-Test
data: ACE2_v1 and ACE2_v2
t = 1.0399, df = 28, p-value = 0.3073
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.78428 14.65095
sample estimates:
mean of x mean of y
33.80000 28.86667
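* I am not yet sure what the t-test is telling me; this needs another look. Reading the output: p = 0.3073 is above 0.05 and the 95 % confidence interval (-4.78 to 14.65) includes 0, so the difference between the two means is not significant at the 5 % level.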
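
To see how the numbers in the t-test output fit together, the confidence interval can be reconstructed from the reported means and t statistic. This is a minimal sketch in Python rather than R, because the raw data are not part of these notes; the critical value is an assumption taken from a standard t table for df = 28 (in R it would be `qt(0.975, 28)`).

```python
# Values copied from the R output above.
t_stat = 1.0399
mean_x, mean_y = 33.80000, 28.86667
p_value = 0.3073

# Assumption: two-sided critical value of Student's t for df = 28, alpha = 0.05.
T_CRIT_DF28 = 2.048407

diff = mean_x - mean_y            # observed difference in means
std_err = diff / t_stat           # t = diff / SE, so SE = diff / t
margin = T_CRIT_DF28 * std_err
ci_low, ci_high = diff - margin, diff + margin

significant = p_value < 0.05      # conventional 5 % threshold
print(f"difference = {diff:.3f}, SE = {std_err:.3f}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f}), significant: {significant}")
```

The reconstructed interval agrees with the one printed by R up to rounding. Because it contains 0 and the p-value is above 0.05, the test gives no evidence that ACE2_v1 and ACE2_v2 differ in their means.
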
# Phylogenetic Tree with R
