### How many people in California are infected with coronavirus? (Feb 28)
I am not a doctor and about a week ago knew nothing about epidemiology. So, you know, buy some salt...
1. The woman in Solano County was admitted to UC Davis on Feb 19. [She first went to the emergency department on Feb 15](https://www.mercurynews.com/2020/02/28/coronavirus-government-defends-disputed-uc-davis-test-expands-surveillance/).
2. The woman from Santa Clara was *also* hospitalized [more than a week ago](https://www.latimes.com/california/story/2020-02-28/intense-search-in-california-for-others-exposed-to-coronavirus-patient); the CDC only recently agreed to test her.
3. [The WHO](https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200219-sitrep-30-covid-19.pdf?sfvrsn=3346b04f_2) estimates a mean incubation time of 5-6 days.
4. So, accounting for incubation, we can guess that the woman from Solano County, who first went to the hospital on Feb 15, actually contracted COVID-19 around Feb 10.
| Date | Event |
| --- | --- |
| Feb 10 (estimate) | Solano County case is infected |
| Feb 15 | She enters the hospital |
| Feb 28 | Today, 18 days after the first infection |
5. [80% of cases are mild](https://www.worldometers.info/coronavirus/coronavirus-symptoms/#mild). I haven't looked very hard but I haven't found a primary source for this. [NPR seems to believe it](https://www.npr.org/2020/02/17/806729340/new-world-health-organization-data-confirms-around-80-of-cases-are-mild).
6. We know of 2 cases which were bad enough to be hospitalized, so we can guess there are 8 more mild cases which were not hospitalized. Pessimistically, let's say that as of Feb 10 there were 20 cases in California.
7. The serial interval is the time between when a first case becomes symptomatic and when the next case becomes symptomatic. I'm going to use it to mean the time between when someone is infected and when they infect the next person; on average this should be the same. The WHO (same source as in (3)) estimates the serial interval at 4.4 - 7.5 days.
8. The R0 is the number of people each infected person infects, on average. [This pre-print](https://www.ijidonline.com/article/S1201-9712(20)30091-6/fulltext) claims that R0 on the Diamond Princess was 2.28. That's a very high number! It's likely to be lower in real life, where you're not all stuck on the same ship and breathing the same air.
9. It's been 18 days since Feb 10. If we pessimistically use the low end of the serial interval (4.4 days), then $18 / 4.4 \approx 4.09$ serial intervals have passed. If we assume the R0 is 2.28, then every 4.4 days we can expect the number of cases to more than double. Real epidemiologists do not just multiply these numbers by each other, but I'm trying to come up with a worst-case scenario, not actually predict the number of infected people. Given all that: $20 \times 2.28^{4.09} \approx 580$. It seems reasonable to think that fewer than 580 people are currently infected.
10. This estimate might be low, if you believe there are many more hospitalizations which have gone undiagnosed.
11. This model also assumes California is a closed system, with no infected people entering or leaving. However, the nature of exponentials is that any chain of infections begun *after* Feb 10 hasn't had enough time to infect many people. If the number of infected on Feb 10 were 21 rather than 20, the upper bound on infections would be 611 people. You might disbelieve the 580 number if you believed there were many asymptomatic people entering the state.
12. Just for fun, let's assume all these Californians are in San Francisco. Those 580 people would account for about 0.07% of the 884k who live here. If R0 is 2.28 (it's lower), if the serial interval is 4.4 days (it's probably higher), and if everybody infected lived in San Francisco (they don't), then 14 days from now roughly 1% of the population would be infected. Depending on how many people you interact with each day, that could be a worrying level.
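The arithmetic in steps 4 through 12 can be reproduced with a few lines of Python. This is only a sketch of the back-of-the-envelope model above (cases multiply by R0 once per serial interval), not a real epidemiological model; the function name `projected_cases` and the variable names are my own, and every input is one of the guesses already stated in the list.

```python
from datetime import date

# Inputs taken from the numbered steps above; all are rough guesses.
first_infection = date(2020, 2, 10)  # estimated infection date (step 4)
today = date(2020, 2, 28)
initial_cases = 20                   # pessimistic guess for Feb 10 (step 6)
r0 = 2.28                            # Diamond Princess estimate (step 8)
serial_interval_days = 4.4           # low end of the WHO range (step 7)
sf_population = 884_000              # San Francisco population (step 12)


def projected_cases(initial, r0, serial_interval, days):
    """Multiply the case count by r0 once per elapsed serial interval."""
    return initial * r0 ** (days / serial_interval)


days_elapsed = (today - first_infection).days
cases_today = projected_cases(initial_cases, r0, serial_interval_days, days_elapsed)

# Sensitivity check from step 11: one extra case on Feb 10.
cases_if_21 = projected_cases(21, r0, serial_interval_days, days_elapsed)

# Step 12: pretend everyone is in San Francisco and look 14 days ahead.
cases_in_14_days = projected_cases(cases_today, r0, serial_interval_days, 14)
fraction_today = cases_today / sf_population
fraction_in_14_days = cases_in_14_days / sf_population

print(f"days since first infection: {days_elapsed}")
print(f"worst-case cases today:     {cases_today:.0f}")
print(f"...starting from 21 cases:  {cases_if_21:.0f}")
print(f"share of SF today:          {fraction_today:.2%}")
print(f"share of SF in 14 days:     {fraction_in_14_days:.2%}")
```

Changing `r0` or `serial_interval_days` to less pessimistic values shows how quickly the upper bound drops, which is the main reason to treat it as a ceiling rather than a forecast.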
---
Bonus:
13. The Diamond Princess evacuees landed [on Feb 17](https://www.cnn.com/2020/02/17/health/evacuated-passengers-test-positive-coronavirus/index.html), two days after the Solano County case first went to the hospital. So, I'm inclined to believe [there's no connection](https://twitter.com/trapperbyrne/status/1233105856079597568) between her case and Travis Air Force Base.
More resources:
- [UpToDate](https://www.uptodate.com/contents/coronavirus-disease-2019-covid-19), always a good source of information
- [How modeling actually works](https://www.statnews.com/2020/02/14/disease-modelers-see-future-of-covid-19/)
- [A pre-print from Los Alamos](https://www.medrxiv.org/content/10.1101/2020.02.07.20021154v1.full.pdf)
- [Early Transmission Dynamics in Wuhan](https://www.nejm.org/doi/full/10.1056/NEJMoa2001316) claims:
  - The mean incubation period is 5.2 days (95% CI of 4.1-7)
  - The doubling time was about 7.4 days
  - The mean serial interval was 7.5 days (5.3-19)
  - The R0 was 2.2 (1.4 to 3.9)
- [This pre-print](https://www.medrxiv.org/content/10.1101/2020.02.03.20019497v2) found a serial interval of 4 days (3.1 to 4.9)
- [This article](https://www.thelancet.com/pdfs/journals/lancet/PIIS0140-6736(20)30260-9.pdf) estimates a doubling time of 6.4 days (5.8 - 7.1). It also claims that the serial interval of SARS-CoV (the first one) was 8.4 days.
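
As a sanity check on how pessimistic the estimate is, here is the doubling time implied by the same naive "multiply by R0 every serial interval" assumption, which can be set against the doubling times quoted in the resources above. This is a rough sketch: the helper `doubling_time` is my own, and the papers derive their doubling times from case data, not from this formula.

```python
import math


def doubling_time(r0, serial_interval_days):
    """Days for cases to double if they grow by a factor of r0 per serial interval."""
    return serial_interval_days * math.log(2) / math.log(r0)


# Worst-case parameters used in the estimate: R0 = 2.28, serial interval = 4.4 days.
worst_case = doubling_time(2.28, 4.4)

# Point estimates from the Wuhan paper listed above: R0 = 2.2, serial interval = 7.5 days.
wuhan_inputs = doubling_time(2.2, 7.5)

print(f"doubling time, worst-case inputs: {worst_case:.1f} days")
print(f"doubling time, Wuhan paper inputs: {wuhan_inputs:.1f} days")
```

If the worst-case doubling time comes out well below the 6.4 and 7.4 days reported above, that supports reading 580 as an upper bound rather than a prediction.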