# Christopher Hartl & Katarina Markovic - p53

## 24.11.2021 - Unit 2

**1. Which part of the gene is translated? Find ORFs!**

![](https://i.imgur.com/A0m8r9i.png)

- Note: mRNA in FASTA document -> run in *Open reading frame viewer*
- 14 ORFs

## Task 1 - PubMed

Looking at the number of publications per year, we can see that p53 has appeared in the literature from 1970 until today. The number of publications has increased over the years, peaking in the last year at nearly 6,000. With approximately 110,000 publications in total and the upward trend of the graph, we expect continued interest and possible breakthroughs in this field.

## Task 2

Gene Name = TP53

Protein Name = P53

Other Names for TP53: (P53; BCC7; LFS1; BMFS5; TRP53)

| Gene Name | Alternative Gene Name | Protein Name | Alternative Protein Name | Gene Name (Species X) | Protein Name |
| --------- | --------------------- | ------------ | ------------------------ | --------------------- | ------------ |
| Text      | Text                  | Text         |                          |                       |              |

# Task 3

1. So far, 108533 papers dealing with p53 have been published. The first one was published in 1970. There was no real "hype", but the number of published papers started to rise in 1990 and has increased continuously since then.
2. "A **research article** is a primary source...that is, it reports the methods and results of an original study performed by the authors. The kind of study may vary (it could have been an experiment, survey, interview, etc.), but in all cases, raw data have been collected and analyzed by the authors, and conclusions drawn from the results of that analysis. A **review article** is a secondary source...it is written about other articles, and does not report original research of its own.
Review articles are very important, as they draw upon the articles that they review to suggest new research directions, to strengthen support for existing theories and/or identify patterns among existing research studies. For student researchers, review articles provide a great overview of the existing literature on a topic. If you find a literature review that fits your topic, take a look at its references/works cited list for leads on other relevant articles and books!"

Notes:

- Most proteins were found in the 80s
- Tasks 1-3 are just starting points for writing an introduction, but we don't need to put all the answers into the introduction
- Big picture: we could write about cancer as still the biggest threat to human health

## Homework for 24.11. (Chris)

1. Ask an open question: -> What impact does cancer have on today's healthcare system, and how crucial is the role of the p53 gene?
2. Tell a story that's relevant to the point you are trying to make: -> Today, about 600,000 people die of cancer each year in America, and a mutation in the p53 gene has been found in about half of those affected.
3. Make a bold statement: -> Cancer is the second most common cause of death in Germany after cardiovascular disease.
4. Tell your audience to imagine something: -> Imagine that a mutation in the p53 gene is found in 50% of all cancers, and what the impact on the healthcare system would be if this mutation could be reduced in a controlled manner.
5. Lead with a quote or a jarring/shocking fact ->

## Homework for 24.11. (Katarina)

1. Asking an open question: How much do scientists know about cancer?
2. A relevant story from someone's life: Rosalind Franklin, David Bowie and George Harrison all died from cancer.
3. A statement: According to WHO statistics from 2012, cancer is the second leading cause of death, with nearly 9 million deaths.
4. Tell audience to imagine something: Imagine what it would mean, not only for the healthcare system but also for patients, to be able to cure every cancer and its cause.
5. Quote: "Since I had cancer, I've realized that every day is a bonus," said Geoffrey Boycott, a former professional cricket player who won his battle with cancer of the tongue.

# Links

-> TP53 complete gene https://www.ncbi.nlm.nih.gov/gene/7157

-> P53 Protein Isoform a https://www.ncbi.nlm.nih.gov/nuccore/NM_000546.6

## Exercise 2:

1) How many ORFs are predicted? Play around with the minimum ORF length – do the predicted ORFs change? Experiment with the parameters (which ORF length is likely? Which genetic code do you use? Which start codons are there and which do you want to use here? You can also get information by pressing "?").

![](https://i.imgur.com/b8TneiA.png)

-> Searched for >NM_000546.6 with a minimum ORF length of 75 nt

![](https://i.imgur.com/bj8wt5l.png)

-> Searched for >NM_000546.6 with a minimum ORF length of 600 nt

# Exercise 3

##### Exons, Introns, 3'UTR & 5'UTR

http://www.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000141510;r=17:7661779-7687538;t=ENST00000269305

![](https://i.imgur.com/8Ej1HDN.png)
![](https://i.imgur.com/lF01U18.png)
![](https://i.imgur.com/RLMtpak.png)
![](https://i.imgur.com/ByTfT7I.png)
![](https://i.imgur.com/ZjJnk15.png)

# Exercise 4

Create a gene model: Based on the collected data you create an overview by sketching a line with annotations in roughly the right length of: gene, pre-mRNA, mRNA, transcript, cDNA, ORF, 5'UTR, 3'UTR, protein, possible isoforms (either on paper or in PPT). You may also check out the visual representations of NCBI/Graphics or Ensembl. Next, use a sequence manager, e.g. ApE Plasmid Editor, to create one isoform of your gene fully annotated with the relevant information (5'UTR, 3'UTR, exons). You have to choose LINEAR representation (why?
The default circular setting comes from the tool being used for cloning genes etc., and this is done in circular bacterial plasmids!)

Full gene:

![](https://i.imgur.com/EurbNtl.png)

mRNA:

![](https://i.imgur.com/db0Jb3x.png)

## Exercise 5

1) Build the sequence of ONE of the isoforms of your gene: Exon 1 - Intron 1 - Exon 2 - Intron 2 ... etc. I.e. the sequence should contain 5'UTR, CDS and 3'UTR (but not promoter, introns). You may already have this from above! Go to https://www.bioinformatics.org/sms2/mutate_dna.html and create new sequences by randomly inserting 2, 20 or 50 mutations one after the other. For point mutations with just 2 nucleotide changes, please run the "experiment" three times. After this you should have 5 new sequences of your transcript. (Name them clearly so you know what is what.) Analyse: what changed and how do you evaluate the mutations? Where are the mutations located?

Discuss in the paper:

- If the mutations are found e.g. in a patient, are the mutations silent or could they cause a biological malfunction? Speculate how the mutation may influence disease progression.
- If the mutations occur in a PCR experiment (look at the error rate of the Taq polymerase!!!): Would you trust the sequence when it is used for in vitro translation experiments (for instance if a vaccine should be made from it)? For this, check whether the mutations are in the 5' or 3'UTR, or in wobble bases.

a) Two mutations introduced into each sequence

![](https://i.imgur.com/3GY0UwD.png)
![](https://i.imgur.com/dqNVa6f.png)
![](https://i.imgur.com/w91lVE6.png)

The pictures above show the results of running the program three times with two mutations introduced into each sequence. The results were then aligned against the original sequence, the human p53 mRNA, using BLAST. The alignments show a 99% match. According to the results, only four mutations occurred. The first mutation occurred at the 351^st^ nucleotide, which is part of an exon sequence.
Using the ApE tool, we found that the 351^st^ nucleotide in the original mRNA sequence belongs to a codon for the amino acid cysteine. After the mutation, the same codon codes for a different amino acid, tyrosine. Since p53 is a tumor-suppressor protein, this mutation may allow cancer cells to grow. The second mutation occurred at the 2000^th^ nucleotide, whose codon originally codes for proline but after the mutation codes for leucine. The third mutation was at the 2502^nd^ nucleotide, resulting in a codon for lysine instead of asparagine. Both mutations occurred in the exon sequence and could result in cancer cell growth.

b) 20 mutations introduced into each sequence

![](https://i.imgur.com/4NjZLhT.png)
![](https://i.imgur.com/MFxLaDm.png)
![](https://i.imgur.com/D6HJywc.png)

- 17 mutations --> 99% match
- Position 233 (original: CGT-->Arg; mutated: TGT-->Cys)
- Position 257 (original: AGC-->Ser; mutated: AAC-->Asn)
- Position 1885 (original: TTG-->Leu; mutated: TGC-->Cys)

c) 50 mutations introduced into each sequence

![](https://i.imgur.com/361u4kd.png)

- 34 mutations

## Intro

Cancer is a disease in which body cells grow without control and gain the ability to spread all over the body. Many causes of cancer are directly linked to mutations in the tumor suppressor gene p53. (1) The most frequent mutations in p53, also known as hot spots, can be divided into two categories. Conformational mutations lead to structural changes in the binding domain, while contact mutations change the protein's ability to bind the DNA molecule. (3) Although some facts are well known and studied, there is still a lot to research in order to prevent new cases and give the best care to patients. The aim of this study is to give an overview of p53 in terms of the given tasks. p53 is a transcription factor that suppresses tumor growth through cell cycle regulation; the gene codes for a 53-kDa phosphoprotein (1). In healthy cells p53 levels are kept low. (4) Hypoxia, nucleotide deprivation and different kinds of DNA damage can activate p53, resulting in DNA repair, growth arrest or apoptosis. (5) When the mutation occurs in all exons, p53 loses its ability to bind DNA. Loss of a G1 checkpoint increases the mutation rate, while the loss of apoptosis would accelerate tumor expansion and perhaps promote metastasis. (6) *Unfortunately, mutation of p53 can cause resistance to chemotherapy, initiating treatment failure, relapse and death.* [6] In this article we are going to investigate the molecular biology of p53 and its role in cancer development by using different tools.

Cancer is a disease in which body cells grow uncontrollably and can spread throughout the body, causing dramatic damage. Many causes of cancer are directly related to mutations in the tumor suppressor gene p53. (1) The most common mutations in p53, also known as hot spots, fall into two categories. Conformational mutations lead to structural changes in the binding domain, whereas contact mutations alter the protein's ability to bind DNA molecules. (3) Although some facts are well known and researched, there is still much to be explored to prevent new cases and provide the best possible treatment to patients. The aim of this study is to provide an overview of p53 in light of the tasks ahead. p53 is a transcription factor that suppresses tumor growth by regulating the cell cycle; the gene encodes a 53-kDa phosphoprotein. [1] In healthy cells, p53 is kept at low levels. (4) Hypoxia, nucleotide deficiency, and various kinds of DNA damage can activate p53 and lead to DNA repair, growth arrest, or apoptosis. (5) When the mutation occurs in all exons, p53 loses its ability to bind DNA. Loss of a G1 checkpoint increases the mutation rate, while loss of apoptosis would accelerate tumor expansion and possibly promote metastasis. (6) Unfortunately, mutation of p53 can lead to resistance to chemotherapy, which in turn leads to treatment failure, relapse, and death.
[6] In this article, we will review the molecular biology of p53 and its role in cancer development using various tools.

## Unit 3

1.3

![](https://i.imgur.com/Jrnz5AZ.png)

1.4

UWG6MRT2013-Alignment - attach to report

Which is the first protein hit? What is the E value and the % identity?

Homo sapiens tumor protein p53 (TP53), transcript variant 1, mRNA (NM_000546.6). E value = 0.0, % identity = 100%

Go to the "Taxonomy" representation of your results and choose the most distant sequence. From which organism is it, what is the E value and the % identity? Save the sequence in the FASTA format.

Use the "Filter Results" to limit your results to human. How many human proteins did you get?

52.

Repeat the BLASTP search twice but limit your search with "Organism". First choose Caenorhabditis elegans (taxid:6239). Second, choose Escherichia coli (taxid:562). What is the top hit, what is the E value and the % identity?

![](https://i.imgur.com/vWLBBBy.png)
![](https://i.imgur.com/RvfUBzR.png)

Save the sequence of the C. elegans and E. coli top hits in the FASTA format. :)

1.5

Now select 3 combinations with one criterion each from the annotation list and one criterion from the amino acid properties list, e.g. beta and serine/threonine. Copy the 3 colored alignments into your protocol and explain the relationships.
![](https://i.imgur.com/5Zxf2HK.png)
![](https://i.imgur.com/H9SNta0.png)
![](https://i.imgur.com/Hh3BkoG.png)

- [ ] Katarina: explain this

# 2

https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=clustalo-I20211207-155255-0055-78028090-p1m&analysis=alignments

https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=clustalo-I20211207-163626-0423-26542850-p1m&showColors=true&tool=clustalo - with colours

Guide tree:

![](https://i.imgur.com/eXubDtg.png)
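
As a cross-check for the mutation analysis further up (2, 20 and 50 random mutations), the codon lookups done by hand in ApE can also be sketched in Python. This is only a minimal sketch under assumptions: the function name `classify_substitution` and the toy sequence are made up for illustration, the standard genetic code (NCBI translation table 1) is assumed, and the real CDS start and end positions would have to be taken from the annotation of NM_000546.6, since they are not stated in these notes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in the usual TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(mrna, position, new_base, cds_start, cds_end):
    """Classify a single-base substitution in an mRNA/cDNA sequence.

    position, cds_start and cds_end are 1-based; cds_end is the last base
    of the stop codon. Hypothetical helper, not part of any tool used above.
    """
    seq = mrna.upper().replace("U", "T")
    if position < cds_start:
        return "5'UTR"
    if position > cds_end:
        return "3'UTR"
    offset = position - cds_start            # 0-based offset into the CDS
    frame = offset % 3                       # 0, 1 or 2 (2 = wobble base)
    codon_start = (cds_start - 1) + offset - frame
    old_codon = seq[codon_start:codon_start + 3]
    new_codon = old_codon[:frame] + new_base.upper() + old_codon[frame + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if old_aa == new_aa:
        kind = "silent"
    elif new_aa == "*":
        kind = "nonsense"
    elif old_aa == "*":
        kind = "stop-lost"
    else:
        kind = "missense"
    return f"{kind}: {old_codon}->{new_codon} ({old_aa}->{new_aa})"


# Toy transcript (invented): 5'UTR "GG", CDS ATG CGT AGC TGT TAA, 3'UTR "CC".
toy = "GG" + "ATGCGTAGCTGTTAA" + "CC"
print(classify_substitution(toy, 6, "T", 3, 17))   # missense: CGT->TGT (R->C)
print(classify_substitution(toy, 8, "A", 3, 17))   # silent: CGT->CGA (R->R)
print(classify_substitution(toy, 18, "A", 3, 17))  # 3'UTR
```

A mutation reported as `5'UTR`, `3'UTR` or `silent` leaves the protein sequence unchanged, which is the check asked for in the PCR question. Only `missense`, `nonsense` and `stop-lost` results change the protein.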