---
type: slide
---
# Contact analysis
We want a [tool](https://gitlab.com/pbuslaev/ppi) to analyse how individual residue-residue interactions contribute to the protein-protein interface. The tool should help to:
- better understand and describe the contacts
- track the pathways in the pulling simulations
- set up pulling with transformation coordinates
---
<!-- .slide: style="font-size: 36px;" -->
## Extracting contacts
We first want to extract contacts from the structure file. This can be done in two ways:
1. The user provides a PDB ID
2. The user provides a structure file
From these structures we determine the residues that form contacts. We use 8 Å as the cutoff for defining the interface (the choice is based on [COCOMAPS](https://doi.org/10.1093/bioinformatics/btr484)). By default the code selects chains A and B for the contact analysis, but the user can specify a different choice using the `MDAnalysis` [selection algebra](https://docs.mdanalysis.org/stable/documentation_pages/selections.html).
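
The cutoff test can be sketched in plain Python. This is only an illustration, not the tool's actual code: the function name `interface_pairs`, the dictionary layout, and the toy coordinates are hypothetical, and applying the cutoff to the closest pair of atoms is an assumption.

```python
import math

CUTOFF = 8.0  # angstroms, the interface cutoff quoted above

def interface_pairs(chain_a, chain_b, cutoff=CUTOFF):
    """Return sorted (resid_a, resid_b) pairs whose closest atoms lie within cutoff.

    chain_a and chain_b map a residue id to a list of (x, y, z) atom coordinates.
    """
    pairs = []
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            # Assumption: a contact is judged by the minimal atom-atom distance
            d_min = min(math.dist(p, q) for p in atoms_a for q in atoms_b)
            if d_min <= cutoff:
                pairs.append((res_a, res_b))
    return sorted(pairs)

# Toy coordinates (hypothetical, not taken from a real structure)
chain_a = {35: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], 58: [(20.0, 0.0, 0.0)]}
chain_b = {43: [(6.0, 0.0, 0.0)], 76: [(40.0, 0.0, 0.0)]}
print(interface_pairs(chain_a, chain_b))  # [(35, 43)]
```

In the toy data only residues 35 and 43 come closer than 8 Å (4.5 Å between their nearest atoms), so they form the only interface pair.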
---
```graphviz
digraph chargesetter {
pad= .1
splines= ortho
nodesep= 0.20
ranksep= 0.5
fontname= "Sans-Serif"
fontsize= 13
fontcolor= "#2D3436"
node [shape = box
// style = rounded
fixedsize = true
width = 1.4
height = 0.7
labelloc = c
fontname = "Sans-Serif"
fontsize = 13
fontcolor = "#2D4436"]
pdbFile [color = black, label="PDB File"]
pdbCode [color = black, label="PDB ID"]
getContacts [color = red, label="PDB contact\ngetter"]
contactMap [color = blue, label="contact\nmap"]
contactPlot [color = blue, label="contact\nPlot"]
contactCluster [color = red, label="Cluster\ncontacts"]
residuePairs [color = blue, label="List of \nresidues\nin contact"]
pdbFile->getContacts
pdbCode->getContacts
getContacts->contactMap
getContacts->contactPlot
contactMap->contactCluster
contactCluster->residuePairs
}
```
---
## `getPPI.py` script
The `getPPI.py` script calculates all the contacts. To run the script, one simply needs to specify the PDB ID of interest:
```
python getPPI.py -pdb 1brs
```
If one also wants to specify the chains used in the analysis (this is especially important if chains A and B are not in direct contact), this information can be provided with the `-s1` and `-s2` flags:
```
python getPPI.py -pdb 1brs -s1 "protein and segid A" \
-s2 "protein and segid D"
```
---
## Contact map calculation
As a first step `getPPI.py` computes the contact map:

---
## Contact map calculation
And the contacts based on the cutoff:

---
## Finding clusters
Next, we use K-Means to cluster the contacts and find the optimal number of clusters.
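
A minimal sketch of choosing the number of clusters follows. The slides name K-Means but not the selection criterion or the features being clustered, so several things here are assumptions: contacts are treated as 2D points (residue index in each chain), the silhouette score is used as the criterion, and the seeding is a simple deterministic farthest-point scheme rather than whatever the tool uses. The data are made up.

```python
import math

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic farthest-point seeding (a simplification for this sketch)
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(n_iter):
        # Assign every point to its nearest center
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better separated clusters."""
    scores = []
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)  # singleton cluster scores 0 by convention
            continue
        a = sum(same) / len(same)  # mean distance within own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical contacts as (residue in chain 1, residue in chain 2)
points = [(30, 80), (31, 81), (32, 79), (58, 35), (59, 36), (57, 34),
          (100, 33), (103, 34), (101, 35)]
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(best_k)  # 3
```

The toy contacts form three well-separated groups, so the silhouette score peaks at three clusters.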

---
## Finding clusters

---
## Output
<!-- .slide: style="font-size: 36px;" -->
As a result, we get the contact maps, the clustering results for different numbers of clusters in the `clustering` folder, and a list of contacting residues (`residue_contacts.dat`):
```
58 35
103 34
35 43
83 36
59 76
31 81
```
If the optimal number of clusters appears poorly defined, the user has access to the residue maps for every `n_clusters` in the `clustering` folder and can supply one of those maps to the subsequent analysis instead.
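
For downstream scripting, a file in this two-column layout can be read with a few lines of Python. This is a sketch: `parse_contacts` is a hypothetical helper, and which column belongs to which chain is not stated here, so the pairs are kept in file order without interpretation.

```python
def parse_contacts(text):
    """Parse whitespace-separated residue-id pairs, one pair per line."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"expected two residue ids, got: {line!r}")
        pairs.append((int(fields[0]), int(fields[1])))
    return pairs

# First three lines of the example listing above
example = """58 35
103 34
35 43
"""
pairs = parse_contacts(example)
print(pairs)  # [(58, 35), (103, 34), (35, 43)]
```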
---
## Analysis of contact pathways
The list of residues that form contacts is used to analyse the pulling trajectory and to plot a multidimensional pathway in terms of contacts. This makes it possible to compare pulling pathways and, if needed, the sampling of unbinding.
```graphviz
digraph chargesetter {
pad= .1
splines= ortho
nodesep= 0.20
ranksep= 0.5
fontname= "Sans-Serif"
fontsize= 13
fontcolor= "#2D3436"
node [shape = box
// style = rounded
fixedsize = true
width = 1.4
height = 0.7
labelloc = c
fontname = "Sans-Serif"
fontsize = 13
fontcolor = "#2D4436"]
residuePairs [color = black, label="List of \nresidues\nin contact"]
traj1 [color = black, label="Trajectory 1"]
traj2 [color = orange, label="Trajectory 2"]
analyzeContacts [color = red, label="contact\nanalysis"]
pathway [color = blue, label="Pathway\ntrajectory"]
pathwayComp [color = blue, label="Pathway\ncomparison"]
pathwayAnalysis [color = blue, label="Pathway\nanalysis"]
residuePairs->analyzeContacts
traj1->analyzeContacts
traj2->analyzeContacts
analyzeContacts->pathway
analyzeContacts->pathwayComp
analyzeContacts->pathwayAnalysis
}
```
---
## `checkPPI.py` script
<!-- .slide: style="font-size: 36px;" -->
The script takes `residue_contacts.dat` as input and requires a structure file and a trajectory file:
```
python checkPPI.py -r input.pdb -t input.xtc
```
By default, `checkPPI.py` calculates contacts between chains A and B of the provided input structure for the residue pairs listed in `residue_contacts.dat`. If the chain information is lost (e.g. when a `.gro` file is provided as input), the user can specify the selections explicitly with the `-s1` and `-s2` flags, following the MDAnalysis [selection algebra](https://docs.mdanalysis.org/stable/documentation_pages/selections.html). The distance between two residues is computed as the minimal distance between their heavy atoms, but the user can set the `-com` option to `True` to compute distances between residue centres of mass (COMs) instead.
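
The two distance definitions can differ noticeably. The sketch below illustrates both on made-up atoms; the function names and the atom dictionaries are hypothetical, and whether the tool includes hydrogens in the COM is an assumption.

```python
import math

def min_heavy_atom_distance(res_a, res_b):
    """Smallest distance between non-hydrogen atoms of two residues."""
    heavy_a = [a for a in res_a if a["element"] != "H"]
    heavy_b = [b for b in res_b if b["element"] != "H"]
    return min(math.dist(a["xyz"], b["xyz"]) for a in heavy_a for b in heavy_b)

def center_of_mass(res):
    """Mass-weighted mean position (assumption: all atoms, hydrogens included)."""
    total = sum(a["mass"] for a in res)
    return tuple(sum(a["mass"] * a["xyz"][i] for a in res) / total for i in range(3))

def com_distance(res_a, res_b):
    """Distance between the centres of mass of two residues."""
    return math.dist(center_of_mass(res_a), center_of_mass(res_b))

# Toy residues (hypothetical coordinates in angstroms)
res_a = [
    {"element": "C", "mass": 12.0, "xyz": (0.0, 0.0, 0.0)},
    {"element": "C", "mass": 12.0, "xyz": (2.0, 0.0, 0.0)},
]
res_b = [
    {"element": "N", "mass": 14.0, "xyz": (5.0, 0.0, 0.0)},
    {"element": "H", "mass": 1.0, "xyz": (3.0, 0.0, 0.0)},
]
print(min_heavy_atom_distance(res_a, res_b))  # 3.0
print(round(com_distance(res_a, res_b), 3))   # 3.867
```

The hydrogen at 1 Å from the second carbon is ignored by the heavy-atom criterion, so the minimal distance is 3 Å, while the COM distance is larger because it averages over each residue.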
---
## Analysis of single trajectory
As a result, `checkPPI.py` plots the trajectories of the contacts:

---
## Analysis of single trajectory
Also, `checkPPI.py` computes the correlation map between all contacts:
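
To make the idea of a correlation map concrete, here is a small sketch that correlates per-contact distance time series. The Pearson coefficient is an assumption (the slides do not say which correlation measure is used), and the distances are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long series (undefined for constant series)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_map(distances):
    """Square matrix of pairwise correlations between contact distance series."""
    return [[pearson(a, b) for b in distances] for a in distances]

# One row per contact, one column per frame (hypothetical values in angstroms)
distances = [
    [3.0, 4.0, 5.0, 6.0],    # contact 0 opens steadily
    [6.0, 8.0, 10.0, 12.0],  # contact 1 opens in step with contact 0
    [9.0, 7.0, 5.0, 3.0],    # contact 2 closes while the others open
]
cmap = correlation_map(distances)
for row in cmap:
    print([round(v, 2) for v in row])
```

Contacts that break together show a correlation near 1, while a contact that forms as another breaks shows a value near -1.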

---
## Comparison of two trajectories
The user can also provide two trajectories for the analysis using the `-r2` and `-t2` flags. In this case, `checkPPI.py` analyses the two trajectories separately and then compares the contact behavior between them. The selections for chains A and B may differ between the two trajectories; the user can account for this by providing the appropriate selections for the second trajectory with the `-s12` and `-s22` flags. By default, the chain selections for the second trajectory are identical to those for the first one.
---
## Comparison of contact pathways
As a result, distance plots are made for both trajectories, and the correlations between the contacts in the two trajectories are computed and saved to `cmp_correlations.dat` in the order of the contacts in `residue_contacts.dat`.
---
## Comparison of contact pathways
