# Causal Inference Group Meeting Notes
## Mediation
#### Definition (CDE)
For any three variables $X$, $Y$ and $Z$, where $Z$ is a mediator between $X$ and $Y$, the *controlled direct effect* (CDE) on $Y$ of changing the value of $X$ from $x$ to $x'$ is defined as:
$$CDE = P(Y=y|do(X=x), do(Z=z)) - P(Y=y|do(X=x'), do(Z=z)).$$
So the CDE is the difference between the probabilities of $Y=y$ under two distinct interventions on $X$, while also intervening on the mediator $Z$ to hold it fixed at the same value $z$.
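To make the definition concrete, here is a minimal sketch that computes the CDE exactly for a toy binary model. The structural equation for $Y$, the noise variable `U_Y` and its uniform distribution are all hypothetical choices made for illustration; none of them come from the book or the notes.

```python
from fractions import Fraction

# Hypothetical structural equation (an assumption for illustration):
#   Y := 1 if X + 2*Z + X*Z > U_Y else 0, with U_Y uniform on {0, 1, 2, 3}.
U_Y_VALUES = [0, 1, 2, 3]


def f_y(x, z, u):
    """Structural equation for Y given its parents X, Z and the noise U_Y."""
    return 1 if x + 2 * z + x * z > u else 0


def p_y_do(y, x, z):
    """P(Y=y | do(X=x), do(Z=z)): fix X and Z by hand, average over U_Y."""
    hits = sum(1 for u in U_Y_VALUES if f_y(x, z, u) == y)
    return Fraction(hits, len(U_Y_VALUES))


def cde(y, x, x_prime, z):
    """CDE = P(Y=y | do(X=x), do(Z=z)) - P(Y=y | do(X=x'), do(Z=z))."""
    return p_y_do(y, x, z) - p_y_do(y, x_prime, z)


print(cde(y=1, x=1, x_prime=0, z=0))  # mediator held at 0
print(cde(y=1, x=1, x_prime=0, z=1))  # mediator held at 1
```

Because this toy equation contains an interaction term `X*Z`, the two calls give different values, which shows that the CDE depends on the value $z$ at which the mediator is held.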
**Question:** When is the CDE identifiable? (Recall that a causal effect is identifiable if it can be uniquely determined from the causal structure on the basis of observations only.)
In general, the CDE of $X$ on $Y$, mediated by $Z$, is identifiable if the following two properties hold:
1. There exists a set $S_1$ of variables that blocks all backdoor paths from $Z$ to $Y$;
2. There exists a set $S_2$ of variables that blocks all backdoor paths from $X$ to $Y$, after deleting all arrows entering $Z$.
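As a sanity check on the two conditions, consider the simplest mediation graph, with only the arrows $X \to Z$, $Z \to Y$ and $X \to Y$ and no further confounders. This graph is an assumption for illustration, not a claim about any particular figure. The only backdoor path from $Z$ to $Y$ is $Z \leftarrow X \to Y$, which is blocked by $S_1 = \{X\}$. After deleting the arrow entering $Z$ there is no backdoor path from $X$ to $Y$, so $S_2 = \emptyset$. Both $do$ operators then reduce to ordinary conditioning:

$$CDE = P(Y=y|X=x, Z=z) - P(Y=y|X=x', Z=z).$$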
**Question:** What is the total effect and what is the CDE in Figure 3.1.2?
**Note:** In linear systems, the indirect effect satisfies IDE = TE - CDE; this relation does not hold in general in non-linear systems.
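A small numeric sketch of this note, under assumptions that are mine rather than the book's: a hypothetical model $Z := aX + U_Z$, $Y := cX + bZ + dXZ + U_Y$ with zero-mean noise, effects measured as differences in expected $Y$ when contrasting $X=1$ against $X=0$, and arbitrarily chosen coefficients. Setting `d = 0` gives a linear system; `d != 0` adds an interaction and makes it non-linear.

```python
def expected_y(x, z, b, c, d=0.0):
    """E[Y | do(X=x), do(Z=z)] for Y := c*X + b*Z + d*X*Z + U_Y, E[U_Y] = 0."""
    return c * x + b * z + d * x * z


def cde(z, b, c, d=0.0):
    """Controlled direct effect of X=1 versus X=0 with Z held at z."""
    return expected_y(1, z, b, c, d) - expected_y(0, z, b, c, d)


def total_effect(a, b, c, d=0.0):
    """Effect of X=1 versus X=0 when Z follows Z := a*X + U_Z, E[U_Z] = 0.

    For fixed x the equation for Y is linear in Z, so E[Z | do(X=x)] = a*x
    can be substituted for Z when taking the expectation.
    """
    return expected_y(1, a * 1, b, c, d) - expected_y(0, a * 0, b, c, d)


# Hypothetical coefficients, chosen only for illustration.
a, b, c = 2.0, 0.5, 3.0

te = total_effect(a, b, c)       # equals c + a*b in the linear case
direct = cde(z=0.0, b=b, c=c)    # equals c, whatever z is
indirect = te - direct           # equals a*b, the product along X -> Z -> Y
print(te, direct, indirect)

# With an interaction term the CDE depends on z, so TE - CDE is no longer
# a single well-defined number.
print(cde(0.0, b, c, d=1.0), cde(1.0, b, c, d=1.0))
```

In the linear case the difference TE - CDE equals the product of the coefficients along the mediated path, which is what one would expect the indirect effect to be.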
## 3.8 Causal Inference in Linear Systems
* The causal methods introduced in the book work regardless of the type of equations that make up the model in question. **d-separation** and the backdoor criterion (BDC) make no assumptions whatsoever about the form of the relationship between two variables - only that the relationship exists.
However, things become simpler in the linear setting and it is in this setting that we are now situated.
See here for a refresher: https://almostsuremath.com/2021/02/24/multivariate-normal-distributions/
### What is the difference between structural and regression coefficients?
#### Regression equations
$$y=r_1x + r_2z + \epsilon$$
* Descriptive; makes no assumption about causation.
* The error term ($\epsilon$) in the equation denotes the residual errors in observation that remain after fitting the equation $y=r_1x + r_2z$ to the data. These are human-made and arise due to imperfect fitting.
#### Structural equation
$$Y=\alpha X + \beta Z + U$$
* Makes an assumption about causation
* The "error terms" ($U$) represent latent factors (aka disturbances or omitted variables) that influence $Y$ and are not themselves affected by $X$. These are nature-made.
**Question:** Is it a matter of interpretation then? A different way of looking at things? A different perspective?
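One way to probe the difference is a small simulation. The sketch below assumes a simplified hypothetical model (no $Z$; a latent factor `W` that feeds both $X$ and the disturbance $U$) with coefficients picked arbitrarily. It compares the regression slope of $Y$ on $X$ in observational data with the slope obtained when $X$ is assigned at random, which mimics intervening on $X$.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n, alpha, beta, gamma, intervene, rng):
    """Draw n samples of (X, Y) from a hypothetical model with latent W.

    W := U_W,  X := gamma*W + U_X,  Y := alpha*X + U,  U := beta*W + U_Y.
    With intervene=True, X is assigned at random, cutting the W -> X arrow.
    """
    xs, ys = [], []
    for _ in range(n):
        w = rng.gauss(0, 1)
        x = rng.gauss(0, 1) if intervene else gamma * w + rng.gauss(0, 1)
        u = beta * w + rng.gauss(0, 1)  # disturbance: not affected by X
        xs.append(x)
        ys.append(alpha * x + u)
    return xs, ys


rng = random.Random(0)
alpha, beta, gamma = 1.0, 2.0, 1.0  # structural coefficient alpha = 1

obs_slope = ols_slope(*simulate(20000, alpha, beta, gamma, False, rng))
do_slope = ols_slope(*simulate(20000, alpha, beta, gamma, True, rng))
print(round(obs_slope, 2), round(do_slope, 2))
```

In this model the population regression slope is $\alpha + \beta\gamma/(\gamma^2+1)$, which is 2 for the values above, while the structural coefficient is $\alpha = 1$. The observational slope should land near the former and the randomized slope near the latter, which suggests the distinction is more than a matter of interpretation: the two coefficients answer different questions and can differ numerically.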
* In a linear system, the direct effect of a variable on another corresponds to the structural coefficient.
* In a linear system, the total effect of $X$ on $Y$ is simply the sum of the products of the coefficients of the edges on every nonbackdoor path from $X$ to $Y$. Once we know the form of the linear system, which we can deduce from our graph, we can easily calculate the total effect (TE) and the direct effect (DE).
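The path rule above can be sketched in a few lines. The sketch reads "nonbackdoor path" as a directed path from $X$ to $Y$, which is an interpretive assumption, and it uses a hypothetical graph with made-up coefficients stored as `edges[parent][child]`. It also assumes the graph is acyclic.

```python
# Hypothetical linear model as a weighted DAG (all coefficients invented):
# edges[parent][child] = structural coefficient on the arrow parent -> child.
edges = {
    "X": {"Z": 2.0, "Y": 3.0},
    "Z": {"Y": 0.5},
    "W": {"X": 1.5, "Y": 4.0},  # X <- W -> Y is a backdoor path from X
}


def total_effect(edges, source, target):
    """Sum over directed paths source -> ... -> target of the product of
    the edge coefficients along each path (assumes an acyclic graph)."""
    if source == target:
        return 1.0  # empty path: product of no coefficients
    return sum(
        coef * total_effect(edges, child, target)
        for child, coef in edges.get(source, {}).items()
    )


def direct_effect(edges, source, target):
    """Structural coefficient on the arrow source -> target, 0 if absent."""
    return edges.get(source, {}).get(target, 0.0)


print(total_effect(edges, "X", "Y"))   # X -> Y plus X -> Z -> Y
print(direct_effect(edges, "X", "Y"))  # coefficient on X -> Y only
```

Only arrows leaving `X` are followed, so the backdoor route through `W` does not contribute to the total effect of `X` on `Y`.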
## Meeting Notes
22/11/22