# Nucleosome Research Project

Some resources
---

:::info
- [x] PDB files for the 'news' systems
    - [**file link**](https://www.icloud.com/iclouddrive/091dHjzuzOBdpxD2H_GgmQlYw#Cases-octa)
- [x] Some structural fitting and analysis methods
    - [AAMD paper, check their **Analysis Method**](https://www.icloud.com/iclouddrive/0f6eHHXLn5-mnfvb3HLZBRHKA#nature-comm-2021-aamd-paper)
    - [**do_x3dna**](https://do-x3dna.readthedocs.io/en/latest/index.html); [paper link](https://europepmc.org/article/med/25838463)
- [x] Some papers about coarse-grained modeling of nucleosomes
    - [**a nice review paper**](https://www.icloud.com/iclouddrive/030VVX4GqKlRlNfOtnRSe9uSw#review-2020)
- [ ] [**Nucleosome Structure Analysis paper**](https://www.nature.com/articles/s41598-018-19875-0)
    - GitHub repository: **https://github.com/xinmeng2020/nucleosome-analysis.git**

MD protocol
---

- Fix missing residues using Modeller: https://salilab.org/modeller/wiki/Missing_residues
:::

Meetings
---

:::warning
- [ ] week 35, 30.08.2022 at 2pm, Michele and Manuel
:::

Simulations
---

:::info
- Files
    - [ ] EM with DNA extension
:::

Analysis
---

:::info
- How to...?
    - do principal component analysis (considering only the C_alpha atoms of the histones)
    - opening mode
    - analyze helical parameters
:::

## Raw Protocols

### prepare the initial pdb and fasta

- system: **raw** protein + **full** DNA
- illustration: H3; Mira can finish the CA case

Set up the working directory:

    $ mkdir handmake-h3-rawProtein-fullDNA
    $ cd handmake-h3-rawProtein-fullDNA
    $ cp ../h3fixDNA/H3octasome-with465-renamechain-fullDNA.pdb .
    $ cp ../h3fixDNA/h3.fasta .

The PDB file is now correct for our purpose, but the FASTA file contains the full sequence of the protein chains. We therefore have to 'chop' the FASTA file deliberately so that it does not include the protein tails. The first thing to figure out is which sequence is contained in the PDB file; then we can compare it with the FASTA file and throw away the part of the FASTA file that is not needed.

:question: How to quickly find out the sequence contained in a PDB file?

**Solution:** use a service or script that converts PDB to FASTA sequences, e.g. https://zhanggroup.org/pdb2fasta/

:question: In the PDB file, start copying from the ATOM lines. Cross-check the results.

Results:

    >pdb:A
    PHRYRPGTVALREIRRYQKSTELLIRKLPFQRLVREIAQDFKTDLRFQSAAIGALQEASEAYLVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGER
    >pdb:B
    NIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG
    >pdb:C
    PHRYRPGTVALREIRRYQKSTELLIRKLPFQRLVREIAQDFKTDLRFQSAAIGALQEASEAYLVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA
    >pdb:D
    RKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG
    >pdb:E
    PHRYRPGTVALREIRRYQKSTELLIRKLPFQRLVREIAQDFKTDLRFQSAAIGALQEASEAYLVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGER
    >pdb:F
    NIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG
    >pdb:G
    PHRYRPGTVALREIRRYQKSTELLIRKLPFQRLVREIAQDFKTDLRFQSAAIGALQEASEAYLVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA
    >pdb:H
    RKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG

Now we can search for the obtained sequence (the sequence converted from PDB to FASTA) in the full FASTA sequence: find the match and remove the rest.

***Here, we may make the A, C, E, G chains all the same, and thus keep the last A residue, which is missing in chains A and E.***

Chains B and F are the same; chains D and H have additional residues. The common chains B, D, F, H must therefore be split into two chains. For chain A, this means keeping ALA 135 in the PDB file.

Create a new PDB file and remove the REMARK 465 residues that were also deleted from the FASTA file.

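The PDB-to-FASTA conversion and the search for the match in the full sequence described above can also be scripted. The following is a minimal Python sketch, assuming standard fixed-column PDB `ATOM` records and only the 20 standard amino acids. The function names `pdb_to_sequences` and `locate_in_full` are hypothetical helpers for illustration; they are not part of pdb2fasta or any other tool mentioned here.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def pdb_to_sequences(pdb_lines):
    """Return {chain_id: sequence} built from the ATOM records.

    One residue is counted per (chain, residue number, insertion code);
    non-protein residues such as DNA bases are skipped.
    """
    sequences = {}
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()   # PDB columns 18-20
        chain = line[21]                 # PDB column 22
        res_id = (chain, line[22:27])    # residue number + insertion code
        if res_name not in THREE_TO_ONE or res_id in seen:
            continue
        seen.add(res_id)
        sequences[chain] = sequences.get(chain, "") + THREE_TO_ONE[res_name]
    return sequences


def locate_in_full(pdb_seq, full_seq):
    """Return (n_leading, n_trailing): residues of full_seq outside the match.

    Raises ValueError if pdb_seq does not occur in full_seq.
    """
    start = full_seq.find(pdb_seq)
    if start < 0:
        raise ValueError("PDB sequence not found in the full FASTA sequence")
    return start, len(full_seq) - start - len(pdb_seq)
```

Calling `locate_in_full` with a chain's PDB-derived sequence and the matching full-length FASTA record gives the number of residues to cut at each end, which can then be compared against the REMARK 465 entries by hand.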
The new PDB file and the FASTA file then contain the same residues (and the same number of residues).

Deleted residues per chain:

- Chain A: deleted res 0-37, keep res ALA 135
- Chain B: deleted res 0-24
- Chain C: deleted res 0-37
- Chain D: deleted res 0-18
- Chain E: deleted res 0-37, keep res ALA 135
- Chain F: deleted res 0-24
- Chain G: deleted res 0-37
- Chain H: deleted res 0-18

Then we can save the FASTA file.

:::warning
The DNA is not processed here (that is, the entire DNA extension is kept), but we also have a protocol to deal with DNA. It is just skipped for now.
:::

So once we have the PDB and FASTA files, we can proceed with our simulation protocols.

For CA:

    $ mkdir handmake-ca-rawProtein-fullDNA
    $ cd handmake-ca-rawProtein-fullDNA/
    $ cp ../cafixDNA/CA-octasome-with465-renamechain-fullDNA.pdb .
    $ cp ../cafixDNA/ca-chainrename.fasta .

Results:

    >pdb:A
    RRRQGWLKEIRKLQKSTHLLIRKLPFSRLAREICVKFTRGVDFNWQAQALLALQEAAEAFLVHLFEDAYLLTLHAGRVTLFPKDVQLARRIRGLEEGLG
    >pdb:B
    RDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG
    >pdb:C
    RRRQGWLKEIRKLQKSTHLLIRKLPFSRLAREICVKFTRGVDFNWQAQALLALQEAAEAFLVHLFEDAYLLTLHAGRVTLFPKDVQLARRIRGLEEGLG
    >pdb:D
    RDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG
    >pdb:E
    RRRQGWLKEIRKLQKSTHLLIRKLPFSRLAREICVKFTRGVDFNWQAQALLALQEAAEAFLVHLFEDAYLLTLHAGRVTLFPKDVQLARRIRGLEEGLG
    >pdb:F
    RDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG
    >pdb:G
    RRRQGWLKEIRKLQKSTHLLIRKLPFSRLAREICVKFTRGVDFNWQAQALLALQEAAEAFLVHLFEDAYLLTLHAGRVTLFPKDVQLARRIRGLEEGLG
    >pdb:H
    RDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG

Chains A, C, E, G are all equal, as are chains B, D, F, H.

Create a new PDB file and remove the REMARK 465 residues that were also deleted from the FASTA file. The new PDB file and the FASTA file then contain the same residues (and the same number of residues).

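Whether the trimmed FASTA file and the new PDB file really contain the same residues can be checked automatically. The following is a minimal Python sketch, assuming FASTA headers of the form `>pdb:A` as shown above and a dictionary of PDB-derived sequences per chain (from any PDB-to-FASTA conversion). The names `parse_fasta`, `find_mismatches` and `group_identical` are hypothetical helpers introduced only for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into {chain_id: sequence}.

    Assumes headers like '>pdb:A'; the chain ID is taken as the text
    after the last ':' in the header line.
    """
    records = {}
    chain = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            chain = line[1:].split(":")[-1].strip()
            records[chain] = ""
        elif chain is not None:
            records[chain] += line
    return records


def find_mismatches(fasta_seqs, pdb_seqs):
    """Return sorted chain IDs whose FASTA and PDB sequences differ.

    A chain present in only one of the two inputs counts as a mismatch.
    """
    chains = set(fasta_seqs) | set(pdb_seqs)
    return sorted(c for c in chains if fasta_seqs.get(c) != pdb_seqs.get(c))


def group_identical(seqs):
    """Group chain IDs that share exactly the same sequence."""
    groups = {}
    for chain in sorted(seqs):
        groups.setdefault(seqs[chain], []).append(chain)
    return sorted(groups.values())
```

An empty list from `find_mismatches` means that both files agree chain by chain, including the number of residues. `group_identical` shows at a glance which chains are equal, for example whether A, C, E, G share one sequence.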
Deleted residues per chain:

- Chain A: deleted res 1-41
- Chain B: deleted res 0-22
- Chain C: deleted res 1-41
- Chain D: deleted res 0-22
- Chain E: deleted res 1-42
- Chain F: deleted res 0-22
- Chain G: deleted res 1-41
- Chain H: deleted res 0-22

:warning: Make sure to delete all additional REMARK 465 entries if there are no missing residues at all.

## Rename Chains

Pre-processing of the PDB file was done in Sublime:

* Alphabetically order chains
* Rename chains from A to J

For caRawProteinRawDNA:

* Chain E to C, delete 817 atoms
* Chain F to D
* Chain K to E, delete 817 atoms
* Chain L to F, delete 639 atoms
* Chain O to G, 817
* Chain P to H, 639

For h3RawProteinRawDNA:

* Chain E to C, delete 804 atoms
* Chain F to D, 674
* Chain K to E, 798
* Chain L to F, 620
* Chain O to G, 804
* Chain P to H, 674

### Task

- [ ] get the consistent PDB and FASTA files for the H3 and CA cases

### Reference

**Fasta sequence**
