## [Why Most Published Research Findings Are False](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/)
:taurus:
Calling Bullshit Reading Group
Lorenzo Gentile
---
#### About me
- PhD Student at ITU :male_elf:
- Cryptographic protocols / Secure multi-party computation (MPC) / Privacy-preserving blockchain applications :mag:

---
#### Why Most Published Research Findings Are False
- Written in 2005 by [John Ioannidis](https://profiles.stanford.edu/john-ioannidis), a professor at the Stanford School of Medicine.
- Foundational to the field of metascience.
- Claims that the majority of published medical research papers contain results that cannot be replicated.

---
#### Framework for False Positive Findings: the idea
- The probability that a research claim is true depends on the study's accepted **false positive** and **false negative** rates, on **bias**, on the **number of other studies on the same question**, and on the **ratio of true to false relationships** among the relationships probed in each scientific field.
Note:
$PPV$, $\alpha$, $\beta$, $u$, $R$
---
#### The problems to highlight
- The high rate of nonreplication (lack of confirmation) of research discoveries is a consequence of claiming research findings solely on the basis of a single study assessed by formal statistical significance ($p$-value).
- "Negative" research may also be useful:
  - e.g., the FDA registers all research, regardless of whether it is positive and regardless of whether it is published (publication bias).
---
#### Framework for False Positive Findings
- Let $R = R_{True}/R_{False}$ be the ratio of the number of “true relationships” to “false relationships” among those tested in a field (e.g., $X$ cures disease $Y$).
- $R$ is characteristic of the field and can vary a lot depending on whether the field targets **highly likely relationships** ($R_{True} \gg R_{False}$) or searches for only **a few true relationships among many hypotheses that may be postulated** ($R_{True} \ll R_{False}$).
---
- $R_{True}/(R_{True} + R_{False}) = R/(R+1)$ is the fraction of true relationships among those tested in a field.
- Let $c$ be the number of relationships tested in a field.
---
#### Error types
- Let $\alpha$ be the probability that given a false relationship the research finding is positive $\implies$ false positive (e.g., $X$ cures disease $Y$ is false but the research says it is true).
- Let $\beta$ be the probability that given a true relationship the research finding is negative $\implies$ false negative (e.g., $X$ cures disease $Y$ is true but the research says it is false).
---
#### Error types
| Research finding \ True relationship | Yes | No |
|--------------------------------|----------------|-----------------|
| Yes | (TP) $1-\beta$ | (FP) $\alpha$ |
| No | (FN) $\beta$ | (TN) $1-\alpha$ |
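
The table gives conditional probabilities; multiplying them by the number of true and false relationships among the $c$ tested gives expected counts per cell. A minimal sketch, assuming a hypothetical field with $c = 1100$ tested relationships and illustrative values $R = 0.1$, $\alpha = 0.05$, $\beta = 0.5$ (the function name `expected_counts` is ours, not from the paper):

```python
def expected_counts(c, R, alpha, beta):
    """Expected number of findings in each cell of the 2x2 table."""
    n_true = c * R / (R + 1)   # relationships that are actually true
    n_false = c / (R + 1)      # relationships that are actually false
    return {
        "TP": (1 - beta) * n_true,    # true relationship, positive finding
        "FN": beta * n_true,          # true relationship, negative finding
        "FP": alpha * n_false,        # false relationship, positive finding
        "TN": (1 - alpha) * n_false,  # false relationship, negative finding
    }


# Hypothetical field: 1100 tested relationships, 1 true for every 10 false.
counts = expected_counts(c=1100, R=0.1, alpha=0.05, beta=0.5)
for cell, value in counts.items():
    print(f"{cell}: {value:.1f}")
```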
---
#### Research Findings and True Relationships

---
### Positive Predictive Value (PPV)
$PPV = \frac{TP}{TP+FP} = \frac{(1-\beta)R}{(1-\beta)R+\alpha}$
For example, if in a certain field $\alpha = 0.05$, $\beta = 0.5$ and $R = 0.1$, then $PPV = 0.5$, *i.e.*, 50% of positive research findings are true positives and the rest are false positives.
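
The formula can be checked numerically. A minimal sketch (the function name `ppv` and the extra values of $R$ in the loop are illustrative assumptions); note that the formula implies $PPV > 0.5$ exactly when $(1-\beta)R > \alpha$:

```python
def ppv(R, alpha, beta):
    """Positive predictive value without bias: TP / (TP + FP)."""
    true_positive = (1 - beta) * R
    false_positive = alpha
    return true_positive / (true_positive + false_positive)


# The example from the slide: alpha = 0.05, beta = 0.5, R = 0.1.
example = ppv(R=0.1, alpha=0.05, beta=0.5)
print(f"PPV = {example:.3f}")

# Illustrative sweep: how PPV changes with the pre-study odds R.
for R in (0.01, 0.1, 1.0, 10.0):
    print(f"R = {R:>5}: PPV = {ppv(R, alpha=0.05, beta=0.5):.3f}")
```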
---
#### Introducing bias
- Let $u$ be the proportion of negative research findings that are turned into positive research findings, for example because:
  - Researchers want positive research findings, since they are easier to publish (publication bias);
  - Researchers have a financial incentive (e.g., sponsored research) to find a certain result;
- ...
---
#### Research Findings and True Relationships in the Presence of Bias

Both true positives and false positives increase; however...
---
### Positive Predictive Value (PPV) with bias
$PPV = \frac{(1 - \beta)R + u\beta R}{(1-\beta)R + \alpha + u(1 - \alpha + \beta R)}$
$PPV$ decreases with increasing $u$, unless $1 - \beta \leq \alpha$.
For example, if again $\alpha = 0.05$, $\beta = 0.5$ and $R = 0.1$, then:
- If $u = 0$ then again $PPV = 0.5$
- If $u = 0.1$ then $PPV = 0.275$
- If $u = 0.2$ then $PPV = 0.2$
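
The three values above can be reproduced with a few lines of code. A minimal sketch of the formula with bias (the function name `ppv_with_bias` is an illustrative choice):

```python
def ppv_with_bias(R, alpha, beta, u):
    """PPV when a proportion u of would-be negative findings is reported as positive."""
    numerator = (1 - beta) * R + u * beta * R
    denominator = (1 - beta) * R + alpha + u * (1 - alpha + beta * R)
    return numerator / denominator


# Same field as before: alpha = 0.05, beta = 0.5, R = 0.1.
biased = {u: ppv_with_bias(R=0.1, alpha=0.05, beta=0.5, u=u) for u in (0.0, 0.1, 0.2)}
for u, value in biased.items():
    print(f"u = {u:.1f}: PPV = {value:.3f}")
```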
---
### Positive Predictive Value (PPV) with testing by $n$ independent teams
In a similar way to the previous cases, we obtain:
$PPV = \frac{R(1 - \beta^n)}{R + 1 - (1 - \alpha)^n - R\beta^n}$
As the number of independent studies $n$ increases, $PPV$ tends to decrease, unless $1 - \beta < \alpha$.
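
To see the trend, the formula can be evaluated for a growing number of teams. A minimal sketch, reusing the illustrative values $\alpha = 0.05$, $\beta = 0.5$, $R = 0.1$ from the earlier examples (the function name `ppv_n_teams` and the chosen values of $n$ are assumptions):

```python
def ppv_n_teams(R, alpha, beta, n):
    """PPV when n independent teams test the same relationship and
    a positive result from at least one of them counts as a finding."""
    numerator = R * (1 - beta**n)
    denominator = R + 1 - (1 - alpha)**n - R * beta**n
    return numerator / denominator


team_ppv = [ppv_n_teams(R=0.1, alpha=0.05, beta=0.5, n=n) for n in range(1, 11)]
for n, value in enumerate(team_ppv, start=1):
    print(f"n = {n:2d}: PPV = {value:.3f}")
```

For $n = 1$ the expression reduces to the single-study formula, which is a quick sanity check on the implementation.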
---
#### Corollaries
- The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
- The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
- ...
Note:
Bias and p-hacking
---
#### Time for discussion!
:taurus:
Note:
lorg@itu.dk