We want the probability that a person actually has the disease, given that they test positive. Writing $D$ for "has the disease" ($D^c$ for "doesn't have the disease") and $+$ / $-$ for a positive / negative test result, this is $P(D \mid +)$.
Which we can compute, using the given information, via Bayes' rule (where the second line is how I like to write it out: a bit more cluttered, but it makes clear how to compute the denominator):

$$
\begin{aligned}
P(D \mid +) &= \frac{P(+ \mid D)\,P(D)}{P(+)} \\
&= \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid D^c)\,P(D^c)}
\end{aligned}
$$
Numerically, with a 1-in-10,000 base rate ($P(D) = 0.0001$) and a 99%-accurate test ($P(+ \mid D) = 0.99$, $P(+ \mid D^c) = 0.01$):

$$
P(D \mid +) = \frac{0.99 \times 0.0001}{0.99 \times 0.0001 + 0.01 \times 0.9999} = \frac{0.000099}{0.010098} \approx 0.0098
$$
(Less than 1%)
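If it helps to see the arithmetic laid out, here is a minimal Python sketch of the same Bayes' rule calculation. The specific inputs (a 1-in-10,000 base rate and a test that is 99% accurate in both directions) are assumptions chosen to reproduce the figures quoted in this example:

```python
# Assumed inputs: 1-in-10,000 prevalence, 99%-accurate test in both directions.
p_disease = 0.0001          # P(D): base rate of the disease
p_pos_given_disease = 0.99  # P(+ | D): probability of a positive test if diseased
p_pos_given_healthy = 0.01  # P(+ | D^c): probability of a positive test if healthy

# Bayes' rule: P(D | +) = P(+ | D) P(D) / P(+),
# expanding P(+) over the two ways a positive test can happen.
p_healthy = 1 - p_disease
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * p_healthy
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(D | +) = {p_disease_given_pos:.4f}")  # ~0.0098, i.e. less than 1%
```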
But now let's think about what's behind this… It's a dangerous, highly contagious disease, meaning that (societally) false negatives are much, much worse than false positives:

- A false negative is someone who has the disease but whose test comes back negative: they go undetected, free to spread the disease.
- A false positive is someone who doesn't have the disease but whose test comes back positive: they panic unnecessarily, but no one is endangered.
So, let's focus on the disastrous first case: what's the probability of a false negative? First, we can compute the conditional probability of a negative test, conditional on someone having the disease. Here we just use our complement rule of probability: that for any event $A$, $P(A^c) = 1 - P(A)$ (which also holds when everything is conditioned on the same event):

$$
P(- \mid D) = 1 - P(+ \mid D) = 1 - 0.99 = 0.01
$$
Now that we know this, let's incorporate the base rate information, that is, the information we have about the likelihood of having the disease (the thing we conditioned on above):

$$
P(D) = 0.0001
$$
So, given these two pieces of information, we can compute the probability of a person in the population being a false negative case: having the disease, but not being detected by the test:

$$
P(\text{false negative}) = P(- \mid D)\,P(D) = 0.01 \times 0.0001 = 0.000001
$$
i.e., one in a million.
Now let's turn to the second, bad but not catastrophic, case: the probability of a false positive. As before, we start by computing the conditional probability of a positive test result for someone who in fact does not have the disease:

$$
P(+ \mid D^c) = 1 - P(- \mid D^c) = 1 - 0.99 = 0.01
$$
This time, however, we'll see that the base rate makes a big difference. The base rate in this case, the probability of someone not having the disease, is:

$$
P(D^c) = 1 - P(D) = 1 - 0.0001 = 0.9999
$$
So, incorporating these two pieces of information, we can compute the likelihood of a false positive case: someone in the population who doesn't have the disease but does test positive:

$$
P(\text{false positive}) = P(+ \mid D^c)\,P(D^c) = 0.01 \times 0.9999 = 0.009999
$$
In words: for every million people in the population, 9999 of them will have a false positive panic: they won't have the disease, but they will think they have the disease because of their positive test.
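Here is a small Python sketch that turns those two per-person probabilities into expected counts per million people; as above, the 1-in-10,000 base rate and 99% test accuracy are the assumed inputs:

```python
# Assumed inputs, as above.
p_disease = 0.0001           # P(D)
p_neg_given_disease = 0.01   # P(- | D): the test misses a true case
p_pos_given_healthy = 0.01   # P(+ | D^c): the test flags a healthy person

population = 1_000_000

# Joint probabilities: conditional probability times base rate.
p_false_negative = p_neg_given_disease * p_disease
p_false_positive = p_pos_given_healthy * (1 - p_disease)

print(f"Expected false negatives per million: {p_false_negative * population:.0f}")  # ~1
print(f"Expected false positives per million: {p_false_positive * population:.0f}")  # ~9999
```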
At first glance, this example seems depressing: "Oh no, that's terrible! We're forcing thousands of people to panic, thinking that they have the disease, when they really don't!"
But, walking through it with this false negative / false positive framing, we see the real takeaway: there is always a tradeoff between false positives and false negatives. In this case, from a public health perspective, for example, it's actually a reasonably good situation: at the "cost" of having several thousand people panic unnecessarily, we get the benefit of making it very, very unlikely (one in a million, literally) that someone with this dangerous, contagious disease goes undetected in the population.
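To make the tradeoff concrete, here is an illustrative Python sketch comparing a few hypothetical test operating points (the alternative accuracy numbers are made up for illustration; only the middle one matches this example). Pushing a test toward fewer false alarms tends to cost more missed cases, and vice versa:

```python
# Hypothetical operating points: (sensitivity, specificity).
# Only (0.99, 0.99) corresponds to the example; the others are illustrative.
operating_points = [
    (0.999, 0.90),   # very sensitive, less specific: few misses, many false alarms
    (0.99,  0.99),   # the example's test
    (0.90,  0.999),  # very specific, less sensitive: few false alarms, more misses
]

p_disease = 0.0001
population = 1_000_000

for sensitivity, specificity in operating_points:
    false_negatives = (1 - sensitivity) * p_disease * population
    false_positives = (1 - specificity) * (1 - p_disease) * population
    print(f"sens={sensitivity:.3f}, spec={specificity:.3f}: "
          f"~{false_negatives:.1f} false negatives, "
          f"~{false_positives:.0f} false positives per million")
```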
|            |   | True State of the World |                |
|---|---|---|---|
|            |   | 0                        | 1              |
| Prediction | 0 | True Negative            | False Negative |
|            | 1 | False Positive           | True Positive  |
True Positive: the prediction is 1 and the true state is 1; the person has the disease, and the test correctly detects it.
True Negative: the prediction is 0 and the true state is 0; the person doesn't have the disease, and the test correctly comes back negative.
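To tie the table back to the numbers above, this Python sketch fills in the expected count for each of the four cells per million people, under the same assumed base rate and test accuracy:

```python
# Assumed inputs, as before.
p_disease = 0.0001   # P(true state = 1)
sensitivity = 0.99   # P(prediction = 1 | true state = 1)
specificity = 0.99   # P(prediction = 0 | true state = 0)

population = 1_000_000

# Expected count in each cell of the confusion matrix, per million people.
true_positives  = sensitivity * p_disease * population
false_negatives = (1 - sensitivity) * p_disease * population
false_positives = (1 - specificity) * (1 - p_disease) * population
true_negatives  = specificity * (1 - p_disease) * population

print(f"True positives:  {true_positives:.0f}")    # ~99
print(f"False negatives: {false_negatives:.0f}")   # ~1
print(f"False positives: {false_positives:.0f}")   # ~9999
print(f"True negatives:  {true_negatives:.0f}")    # ~989901
```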