---
tags: grn-control
---
# Control Gene Regulatory Network
## Contacts
- Pierre-Luc Bacon [pierre-luc.bacon@mila.quebec]()
- Payel Das [daspa@us.ibm.com]()
- Youssef Mroueh [mroueh@us.ibm.com]()
- David Joel Sylvain Kanaa [kanaadjs@mila.quebec]()
- Manuel Del Verme [delvermm@mila.quebec]()
- Simon Dufort-Labbé [simon.dufort-labbe@mila.quebec]()
- Ionelia Buzatu [ionelia.buzatu@mila.quebec]()
## Links
- [Our Overleaf](https://www.overleaf.com/project/624b841e2b2437493c19ffef)
- [Meeting notes](https://hackmd.io/SZpw9vxfTVC7xrNEXmjcCQ) <i class="fa fa-users"></i>
- [Dataset links](https://hackmd.io/B1FNohmEQTufUfhz2TKOyw) <i class="fa fa-rna"></i>
- [Literature list](https://hackmd.io/Mea9-ZjJS66QWR_4hZoguQ) <i class="fa fa-book"></i>
- [GitHub repo](https://github.com/ioneliabuzatu/grn-control) <i class="fa fa-github"></i>
- [assumptions/simplifications]()
## Motivation and Goal
1. Steering one cell condition to another
2. Identifying condition-specific properties/dynamics
The first is our high-level goal; along the way, it will allow us to identify specific genes or markers that characterize a condition, e.g. a healthy or an unhealthy cell.
Brief description:
- It’s model based
- There are two main states, initial and final.
- Actions: $A = \{g_i\}_{i \in [1...N]}$, where $N$ is the number of genes.
- The input is scRNA-seq data encoded as a GRN (gene regulatory network); see the data-structure sketch below.
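A minimal sketch of the data structures this setup implies, assuming the GRN is stored as a signed interaction-strength matrix; the class and field names are placeholders, not the repo's actual API:
```python
# Hypothetical data-structure sketch; names are placeholders, not the repo's API.
import numpy as np
from dataclasses import dataclass

@dataclass
class GRNControlProblem:
    grn: np.ndarray        # (G, G) signed interaction strengths inferred from scRNA-seq
    x_initial: np.ndarray  # (G,) expression of the initial condition X^(1)
    x_final: np.ndarray    # (G,) expression of the target/final condition X^(2)

    @property
    def actions(self) -> np.ndarray:
        """A = {g_i}, i in 1..N: one intervention handle per gene."""
        return np.arange(self.grn.shape[0])
```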
## How to proceed?
- start with the rewritten simulator
- think about the control part
- make the evaluation both a metric and a visualization
## PL's notes
There are multiple levels of simulation. At the highest level, SERGIO simulates a differentiation trajectory. In their terminology, this trajectory hops over cell types (confusing terminology), where each cell type represents a steady state of the GRN. More precisely, they say:
> We define a cell type (or cell state) by the average concentration of master regulators.
A path over steady states leads to what the authors call a "differentiation graph", which represents how a given GRN has evolved from steady state to steady state.
What's unclear to me is how you reach that steady state, or detect it. Do you just let your SDE run long enough? Or is there a more principled way of detecting that your system has entered a steady state?
> [name=Pierre-Luc Bacon] Answer is in the code. They perform some sort of statistical test. Nothing mentioned in the paper. To investigate.
Implementation note: prior to running the SDE, SERGIO "bootstraps" the dynamical system using a guess of the expected gene concentrations at steady state. The approximation step comes from swapping the expectation operator with the term inside it, which probably provides some sort of upper bound via Jensen's inequality. This step is purely computational and may not be necessary in the first implementation phases.
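A minimal sketch (not SERGIO's actual test, which remains to be dug out of the code) of one way to detect a steady state: compare per-gene mean expression over two consecutive windows of the trajectory and stop once the relative change is below a tolerance.
```python
# Hypothetical convergence check, not SERGIO's actual statistical test.
import numpy as np

def reached_steady_state(trajectory: np.ndarray, window: int = 200, tol: float = 1e-2) -> bool:
    """trajectory: (time, genes) array of simulated expression levels."""
    if trajectory.shape[0] < 2 * window:
        return False                                   # not enough samples yet
    recent = trajectory[-window:].mean(axis=0)         # per-gene mean, last window
    previous = trajectory[-2 * window:-window].mean(axis=0)
    rel_change = np.abs(recent - previous) / (np.abs(previous) + 1e-8)
    return bool(np.all(rel_change < tol))              # "steady" if every gene has settled
```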
### rna velocity and control
$$
v(t)=\frac{ds(t)}{dt}=\beta u(t) - \gamma s(t) \;\;\;\;\;\; (1)
$$
In the literature, some works ([Bergen et al. (2020)](https://www.nature.com/articles/s41587-020-0591-3)) estimate the parameters $\alpha, \beta, \gamma$ of the reaction scheme $\emptyset \xrightarrow{\alpha^k}u(t)\xrightarrow{\beta}s(t)\xrightarrow{\gamma}\emptyset$ for the simulator; time $t$ is treated as a latent variable.
$$
\frac{du(t)}{dt} = \alpha^k(t) - \beta u(t) \;\;\;\;\;\; (2)
$$
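A minimal sketch of integrating eqs. (1)-(2) with explicit Euler steps, keeping $\alpha$ fixed within one transcriptional state $k$; all rate values are illustrative only:
```python
# Explicit-Euler sketch of eqs. (1)-(2); alpha is held fixed within one state k,
# and all rate values below are illustrative only.
import numpy as np

def simulate_kinetics(alpha, beta, gamma, u0=0.0, s0=0.0, dt=0.01, n_steps=1000):
    """Integrate du/dt = alpha - beta*u and ds/dt = beta*u - gamma*s."""
    u, s = u0, s0
    traj = np.empty((n_steps, 3))          # columns: u(t), s(t), velocity v(t)
    for t in range(n_steps):
        v = beta * u - gamma * s           # eq. (1): RNA velocity
        u = u + dt * (alpha - beta * u)    # eq. (2): unspliced kinetics
        s = s + dt * v                     # spliced kinetics
        traj[t] = (u, s, v)
    return traj

# e.g. an induction phase: simulate_kinetics(alpha=5.0, beta=0.3, gamma=0.1)
```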
This is the **cost function** $C$; the simulator $S$ is optimized w.r.t. $C$.
$$
\text{C}_t(X_t \mid A) = - \underbrace{\log(P(\text{c}=1 \mid X_T^{(2)}))}_{\text{entropy given condition 1}} - \underbrace{\log(P(\text{c}=0 \mid X_T^{(1)}))}_{\text{entropy given condition 0}} \;\;\;\;\;\; (3)
$$
where $X_t = S\left(X_{t-1}, A \right)$. In summary, what we need is the derivative of a function of the simulator output with respect to its input:
$$
\left[\frac{\partial f(S(X))}{\partial X}\right] \;\;\;\;\;\; (4)
$$
S is the simulator **dynamics** described by (1) and (2).
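A minimal sketch of evaluating a simplified, single-condition version of the cost (3) and obtaining the gradient (4) by backpropagating through an unrolled simulator. Here `step` (one application of $S$) and `oracle` (a classifier giving $P(c \mid X)$) are placeholder names, not the repo's actual API, and torch is used purely to illustrate the autodiff:
```python
# Hypothetical sketch: `step` (one application of S) and `oracle` (P(c | X)) are
# placeholders for a differentiable simulator and a trained classifier.
import torch

def control_cost(x0, actions, step, oracle, n_steps=50):
    """Unroll X_t = S(X_{t-1}, A) and score the endpoint with the oracle
    (a simplified, single-condition version of eq. (3))."""
    x = x0
    for _ in range(n_steps):
        x = step(x, actions)              # one simulator step S(X_{t-1}, A)
    return -torch.log(oracle(x))          # -log P(c = target | X_T)

# Eq. (4): differentiate through the unrolled simulator, e.g. w.r.t. the input state:
# x0 = x0.clone().requires_grad_(True)
# control_cost(x0, actions, step, oracle).backward()
# x0.grad   # holds d f(S(X)) / d X; the same works w.r.t. `actions`
```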
$$
\frac{\partial u}{\partial t} = \left(P(t) - (\lambda + \mu) \odot u(t) + q^u \odot \left(\sqrt{P(t)}\alpha + \sqrt{(\lambda+\mu) \odot u}\beta\right)\right) + a^u \;\;\;\;\;\;\; (5)
$$
$$
\frac{\partial s}{\partial t} = \left(\mu \odot u(t) - \gamma \odot s(t) + q^s \odot \underbrace{\left({\sqrt{\mu \odot u(t)} \phi + \sqrt{\gamma \odot s(t)} \omega}\right)}_{\text{gaussian noise}}\right)+ a^s \;\;\;\;\;\; (6)
$$
where:
- $X=[X^{(1)}, X^{(2)}]$: the initial and final boundary values of the simulation state; each $X^{(i)}$ is of size $G$, the number of genes. $\bf{X^{(1)}}$ is the $\bf{initial}$ condition, $\bf{X^{(2)}}$ is the $\bf{final}$ condition. $X_t$ evolves for $T$ timesteps, with boundary values $X_0=X^{(1)}$ and $X_T=X^{(2)}$.
- $A=[a^u, a^s]$: vector-valued actions for the unspliced and spliced kinetic reactions.
**Note:** Eq. (6) is similar to eq. (1); what differs here is the noise term (SERGIO, 2020). The same goes for eqs. (2) and (5): the noise is crucial.
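A minimal Euler-Maruyama sketch of eqs. (5)-(6), treating the production rate $P(t)$ as an input and adding the control actions $a^u, a^s$ to the drift; parameter names mirror the equations, everything else is a placeholder:
```python
# Euler-Maruyama sketch of eqs. (5)-(6); P(t) is passed in as `production`,
# all arrays are per-gene of shape (G,), and every name here is a placeholder.
import numpy as np

def em_step(u, s, production, lam, mu, gamma, q_u, q_s, a_u, a_s, dt, rng):
    """One stochastic step for the unspliced (u) and spliced (s) counts."""
    drift_u = production - (lam + mu) * u                      # deterministic part of eq. (5)
    drift_s = mu * u - gamma * s                               # deterministic part of eq. (6)
    noise_u = q_u * (np.sqrt(production) * rng.standard_normal(u.shape)
                     + np.sqrt((lam + mu) * u) * rng.standard_normal(u.shape))
    noise_s = q_s * (np.sqrt(mu * u) * rng.standard_normal(s.shape)
                     + np.sqrt(gamma * s) * rng.standard_normal(s.shape))
    u_next = u + dt * (drift_u + a_u) + np.sqrt(dt) * noise_u  # actions enter the drift
    s_next = s + dt * (drift_s + a_s) + np.sqrt(dt) * noise_s
    return np.maximum(u_next, 0.0), np.maximum(s_next, 0.0)    # keep counts non-negative
```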
---
**hill function** (Chu et al. 2009)
$$
p_{ij} = K_{ij} \frac{x^{n_{ij}}_j}{h^{n_{ij}}_{ij} + x^{n_{ij}}_j} \; \text{if regulator j is an activator of gene i}
$$
Here $P(x)$ (i.e. $p_{ij}$) is the probability that all operator sites are occupied. In Chu et al.'s notation, $K$ is the TF concentration at which $P(x) = 1/2$ (the role played by $h_{ij}$ above), and the higher the Hill coefficient (here $n_{ij}$), the more switch-like the function $P(x)$: the sigmoidicity of the Hill function increases with $n$.

$K$ denotes the maximum interaction strength (see equation 6 in the manuscript). For activating interactions use a positive $K$, and for repressive ones use negative values.
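A minimal sketch of the activating Hill term above; argument names mirror the equation ($x_j$ regulator expression, $k_{ij}$ maximum interaction strength, $h_{ij}$ half-response, $n_{ij}$ Hill coefficient), and the values in the usage line are illustrative only:
```python
# Hill-term sketch; argument names mirror the equation, values are illustrative.
import numpy as np

def hill_activation(x_j, k_ij, h_ij, n_ij):
    """p_ij for an activating regulator j of gene i (the equation above)."""
    xn = np.power(x_j, n_ij)
    return k_ij * xn / (np.power(h_ij, n_ij) + xn)

# e.g. at x_j == h_ij the occupancy term is 1/2:
# hill_activation(x_j=2.0, k_ij=1.0, h_ij=2.0, n_ij=2)  -> 0.5
```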
---
### doubts, questions
- something to think about: originally the dynamical simulation starts from stem cells, which are state 0 and have no identity yet, whereas the unhealthy state lies somewhere further along the trajectory.
- should the cost function include both the unspliced and spliced reactions, or only one, and if so which one is best?
### workflow
1. start with raw data, clean it.
2. do GRN inference.
3. train the oracle.
4. simulate control states.
5. evaluate the simulated state against the real state.
   * evaluation metrics: mean-std, or a likelihood (see the sketch below).
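A minimal sketch of the mean-std evaluation idea, assuming simulated and real cells are expression matrices of shape (cells, genes); the function name is a placeholder:
```python
# Hypothetical evaluation sketch; `simulated` and `real` are (cells, genes) matrices.
import numpy as np

def mean_std_distance(simulated: np.ndarray, real: np.ndarray) -> float:
    """Average per-gene gap in mean and standard deviation (lower is better)."""
    mean_gap = np.abs(simulated.mean(axis=0) - real.mean(axis=0))
    std_gap = np.abs(simulated.std(axis=0) - real.std(axis=0))
    return float(mean_gap.mean() + std_gap.mean())
```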