Why Most Published Research Findings Are False

## [Why Most Published Research Findings Are False](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/)

:taurus: Calling Bullshit Reading Group

Lorenzo Gentile

---

#### About me

- PhD Student at ITU :male_elf:
- Cryptographic protocols / Secure multi-party computation (MPC) / Privacy-preserving blockchain applications :mag:

![](https://i.imgur.com/5ogwJbY.jpg =200x)

---

#### Why Most Published Research Findings Are False

- Written in 2005 by [John Ioannidis](https://profiles.stanford.edu/john-ioannidis), a professor at the Stanford School of Medicine.
- Foundational to the field of metascience.
- Claims that the majority of published medical research papers contain results that cannot be replicated.

![](https://i.imgur.com/T1AZ13V.jpg)

---

#### Framework for False Positive Findings: the idea

- The probability that a research claim is true may depend on the study's accepted **false positive** and **false negative** rates, on **bias**, on the **number of other studies on the same question**, and on the **ratio of true to false relationships** among the relationships probed in each scientific field.

Note: $PPV$, $\alpha$, $\beta$, $u$, $R$

---

#### The problems to highlight

- The high rate of nonreplication (lack of confirmation) of research discoveries is a consequence of claiming research findings solely on the basis of a single study assessed by formal statistical significance ($p$-value).
- "Negative" research may also be useful:
    - e.g., the FDA registers all research, regardless of whether it is positive or not and whether it is published or not (publication bias).

---

#### Framework for False Positive Findings

- Let $R = R_{True}/R_{False}$ be the ratio of the number of “true relationships” to “false relationships” among those tested in a field (e.g., $X$ cures disease $Y$).
- $R$ is characteristic of the field and can vary a lot depending on whether the field targets **highly likely relationships** ($R_{True} \gg R_{False}$) or searches for only **a few true relationships among many hypotheses that may be postulated** ($R_{True} \ll R_{False}$).

---

- $R_{True}/(R_{True} + R_{False}) = R/(R+1)$ is the fraction of true relationships among those tested in a field.
- Let $c$ be the number of relationships tested in a field.

---

#### Error types

- Let $\alpha$ be the probability that, given a false relationship, the research finding is positive $\implies$ false positive (e.g., "$X$ cures disease $Y$" is false but the research says it is true).
- Let $\beta$ be the probability that, given a true relationship, the research finding is negative $\implies$ false negative (e.g., "$X$ cures disease $Y$" is true but the research says it is false).

---

#### Error types

| Research finding \ True relationship | Yes            | No              |
|--------------------------------------|----------------|-----------------|
| Yes                                  | (TP) $1-\beta$ | (FP) $\alpha$   |
| No                                   | (FN) $\beta$   | (TN) $1-\alpha$ |

---

#### Research Findings and True Relationships

![](https://i.imgur.com/CkFp4vJ.jpg)

---

### Positive Predictive Value (PPV)

$PPV = \frac{TP}{TP+FP} = \frac{(1-\beta)R}{(1-\beta)R+\alpha}$

For example, if in a certain field $\alpha = 0.05$, $\beta = 0.5$ and $R = 0.1$, then $PPV = 0.5$, *i.e.*, 50% of positive research findings are true positives and the rest are false positives.

---

#### Introducing bias

- Let $u$ be the proportion of negative research findings that are turned into positive research findings, for example because:
    - Researchers want positive research findings, since they are easier to publish (publication bias);
    - Researchers have a financial incentive (e.g., sponsored research) to find a certain result;
    - ...

---

#### Research Findings and True Relationships in the Presence of Bias

![](https://i.imgur.com/Y7kriZg.jpg)

Both true positives and false positives increase, however...
---

### Positive Predictive Value (PPV) with bias

$PPV = \frac{(1 - \beta)R + u\beta R}{(1-\beta)R + \alpha + u(1 - \alpha + \beta R)}$

$PPV$ decreases with increasing $u$, unless $1 - \beta \le \alpha$.

For example, if again $\alpha = 0.05$, $\beta = 0.5$ and $R = 0.1$, then:

- If $u = 0$ then again $PPV = 0.5$
- If $u = 0.1$ then $PPV = 0.275$
- If $u = 0.2$ then $PPV = 0.2$

---

### Positive Predictive Value (PPV) with testing by $n$ independent teams

In a similar way to the previous cases, we obtain:

$PPV = \frac{R(1 - \beta^n)}{R + 1 - (1 - \alpha)^n - R\beta^n}$

With an increasing number of independent studies $n$, $PPV$ tends to decrease, unless $1 - \beta < \alpha$.

---

#### Corollaries

- The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
- The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
- ...

Note: Bias and p-hacking

---

#### Time for discussion! :taurus:

Note: lorg@itu.dk
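
---

#### Backup: reproducing the PPV examples

The three PPV formulas from the slides can be checked numerically. Below is a minimal Python sketch, not code from the paper: the function names `ppv`, `ppv_with_bias` and `ppv_n_teams` are illustrative choices, and the parameters follow the slide notation ($R$, $\alpha$, $\beta$, $u$, $n$).

```python
"""Illustrative helpers for the PPV formulas on the slides.

All function names are hypothetical; only the formulas come from the slides.
"""


def ppv(R: float, alpha: float, beta: float) -> float:
    """PPV without bias: (1 - beta) R / ((1 - beta) R + alpha)."""
    true_positive = (1 - beta) * R
    return true_positive / (true_positive + alpha)


def ppv_with_bias(R: float, alpha: float, beta: float, u: float) -> float:
    """PPV when a proportion u of negative findings is reported as positive."""
    numerator = (1 - beta) * R + u * beta * R
    denominator = (1 - beta) * R + alpha + u * (1 - alpha + beta * R)
    return numerator / denominator


def ppv_n_teams(R: float, alpha: float, beta: float, n: int) -> float:
    """PPV when n independent teams test the same relationship."""
    numerator = R * (1 - beta ** n)
    denominator = R + 1 - (1 - alpha) ** n - R * beta ** n
    return numerator / denominator


if __name__ == "__main__":
    # Parameter values taken from the examples on the slides.
    R, alpha, beta = 0.1, 0.05, 0.5

    print(f"no bias: PPV = {ppv(R, alpha, beta):.3f}")
    for u in (0.0, 0.1, 0.2):
        print(f"u = {u}: PPV = {ppv_with_bias(R, alpha, beta, u):.3f}")
    for n in (1, 2, 5, 10):
        print(f"n = {n}: PPV = {ppv_n_teams(R, alpha, beta, n):.3f}")
```

With $\alpha = 0.05$, $\beta = 0.5$ and $R = 0.1$, the bias loop should reproduce the values on the bias slide (0.5, 0.275, 0.2), and the $n = 1$ case of `ppv_n_teams` reduces to the unbiased `ppv`.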