---
title: "How deadly actually is Covid-19? An impact analysis of the pandemic"
output:
  html_document: default
  word_document: default
  pdf_document: default
---
## How deadly actually is Covid-19? An impact analysis of the pandemic
Over the past 12 months, the SARS-CoV-2 virus has run riot through our societies. Not only has it led to considerable loss of life in the short term, but the economic and societal impacts caused by the pandemic and its resulting restrictions have also been devastating. The challenges faced by policymakers in recent times have been unprecedented. Among a multitude of different approaches to keeping the virus at bay, many governments have chosen to plunge their populations into prolonged lockdowns and social distancing restrictions.
But how deadly actually is the virus? This blog aims to answer this by looking at:
1.) Mortality rate analysis of the virus in the UK
2.) The indirect impact of the virus
3.) International differences
I also hope to provide some clarity amid the otherwise confusing and misleading media coverage of the pandemic in countries such as the UK.
### 1.) Mortality rate analysis of the virus in the UK
There's no denying it. Covid-19 has produced a large number of total and excess deaths in the world's population, putting substantial pressure on health care systems worldwide.
However, with daily Covid-19 death figures now reported in the news as routinely as the chance of rain, poor initial public understanding of this data and a lack of accompanying explanation caused hysteria throughout the early stages of the pandemic. Although the virus is undeniably deadly, what exactly are the risks of dying from it? How do they vary? And how do they compare to other viruses?
To analyze this I calculated the mortality rate for Covid-19 for the whole of 2020 by combining death and population data for England and Wales by age group. I did the same for Influenza & Pneumonia deaths (Flu) in 2018, before the pandemic even hit.
(Mortality rate = deaths from the virus / population * 100, a good estimate of the risk of dying from a given virus)
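As a minimal sketch of the calculation described above, the snippet below merges two age bands so that deaths and population share the same categories, then applies the mortality rate formula. The counts are made-up illustrative values, not ONS figures, and the names `toy_deaths`, `toy_population` and `toy_rates` are hypothetical. The population is assumed to be recorded in thousands, as in the projection table used later.

```r
library(dplyr)

# Hypothetical death counts by age band (illustrative only, not ONS data)
toy_deaths <- data.frame(
  Ages   = c("<1", "1-4", "5-9", "90+"),
  deaths = c(2, 1, 1, 16000),
  stringsAsFactors = FALSE
)

# Hypothetical population, assumed to be recorded in thousands
toy_population <- data.frame(
  Ages          = c("0-4", "5-9", "90+"),
  pop_thousands = c(3400, 3600, 500),
  stringsAsFactors = FALSE
)

toy_rates <- toy_deaths %>%
  # merge "<1" and "1-4" so the bands match the population table
  mutate(Ages = if_else(Ages %in% c("<1", "1-4"), "0-4", Ages)) %>%
  group_by(Ages) %>%
  summarise(deaths = sum(deaths)) %>%
  inner_join(toy_population, by = "Ages") %>%
  # mortality rate = deaths / population * 100
  mutate(mortality_rate_pct = deaths / (pop_thousands * 1000) * 100)

print(toy_rates)
```

The join only works once both tables use identical age labels, which is why the relabelling step comes before the grouping.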
```{r fig.height=6, fig.width=10, message=FALSE, warning=FALSE}
library(tinytex)
library(data.table)
library(dplyr)
library(ggplot2)
library(readxl)
library(ggthemes)
library(extrafont)
library(tidyr)
library(zoo)
covid_deaths_by_age_group_normal <- read_excel("p4_mini_project_data/publishedweek532020 (1).xlsx", # loading Covid-19 deaths data
sheet = 'Covid-19 - Weekly occurrences',
skip = 5,
n_max = 91) %>%
slice(6:25) %>%
select(...2,`1 to 53`) %>%
rename("Ages" = '...2') %>%
mutate(Ages = case_when(
Ages %in% c("1-4","<1") ~ "0-4", # merging age groups so that the death and population data have the same
TRUE ~ Ages # categories and can be joined
)) %>% group_by(Ages) %>%
summarise(total_covid_deaths = sum(`1 to 53`)) # grouping by age and summing weeks to get deaths for whole of 2020
influenza_pneumonia_deaths <- read_excel("p4_mini_project_data/b.xlsx", # loading Influenza deaths data
skip =8) %>%
slice(2:21) %>%
mutate(Age = gsub("Aged ", "", Age), # using gsub to reformat the age groupings so that the deaths
Age = gsub("and over", "+", Age), # and population data use the same labels
Age = gsub("to", "-", Age),
Age = gsub("1 - 4", "1-4", Age),
Age = gsub("5 - 9", "5-9",Age),
Age = case_when(
Age %in% c("under 1", "1-4") ~ "0-4", # merging age groups so that the deaths and population data have the same
TRUE ~ Age # categories and can be joined
),
Age = case_when(
Age %in% c("90 +") ~ "90+", # changing formatting of age labels to match population data
TRUE ~ Age
)) %>%
rename("Ages" = Age) %>%
group_by(Ages) %>%
summarise(total_influenza_pnue_deaths = sum(Total))
pop_age_group <- read_excel("p4_mini_project_data/ewpppsumpop18.xls", # loading population data
sheet = 'PERSONS',
skip = 5) %>%
slice(2:22) %>%
select(Ages,'2018','2020') %>%
mutate(Ages = case_when(
Ages %in% c("90-94","95-99","100 & over") ~ "90+", # merging age groups
TRUE ~ Ages),
`2018` = as.numeric(`2018`),
`2020` = as.numeric(`2020`)) %>%
group_by(Ages) %>%
summarise(total_pop_18 = sum(`2018`),
total_pop_20 = sum(`2020`))
mortality_rates <- full_join(covid_deaths_by_age_group_normal,pop_age_group, by ='Ages') %>% # joining Covid-19 deaths and population data
full_join(influenza_pneumonia_deaths, by = "Ages") %>% # joining influenza death data
mutate(total_pop_18 = total_pop_18 * 1000,
total_pop_20 = total_pop_20 * 1000,
'Covid-19 mortality rate (2020)' = total_covid_deaths / total_pop_20, # Key step, calculating the mortality rate for covid-19 in 2020
"Influenza & Pneumonia mortality rate (2018)" = total_influenza_pnue_deaths / total_pop_18,
"Comparison" = `Covid-19 mortality rate (2020)` / `Influenza & Pneumonia mortality rate (2018)`,
correct_order = as.numeric(sub("[-+].*", "", Ages))) %>% # lower bound of each age band, used for ordering ("90+" becomes 90 rather than NA)
arrange(correct_order)
mortality_rates_long_1 <- mortality_rates %>% # Turning the data into long format for the mortality rate comparison plot
select(Ages,'Covid-19 mortality rate (2020)',"Influenza & Pneumonia mortality rate (2018)", correct_order) %>%
gather(Type, Total, "Influenza & Pneumonia mortality rate (2018)", 'Covid-19 mortality rate (2020)')
mortality_rates_long_2 <- mortality_rates %>% # Turning the data into long format for the total deaths comparison plot
select(Ages,"total_covid_deaths",'total_influenza_pnue_deaths', correct_order) %>%
rename("Total Covid-19 deaths (2020)" = total_covid_deaths,
"Total Influenza & Pneumonia deaths (2018)" = total_influenza_pnue_deaths) %>%
gather(Type, Total, "Total Covid-19 deaths (2020)", 'Total Influenza & Pneumonia deaths (2018)')
covid_mortality_rates_plot <- ggplot(mortality_rates, aes(x=reorder(Ages, correct_order), y= `Covid-19 mortality rate (2020)`)) + # plotting mortality rates per age group for Covid-19
geom_bar(stat = "identity", fill = "#23203F") +
theme_economist() +
theme(axis.title.x = element_text(size = 10, margin = margin(15,0,0,0)),
axis.title.y = element_text(size = 10, margin = margin(0,15,0,0)),
legend.title = element_blank(),
legend.text = element_text(size=8),
legend.position = "bottom") +
labs(title = "Covid-19 mortality rates by age group,
England & Wales",
x = "Age Group",
y = "Mortality Rate") +
scale_y_continuous(label=scales::percent)
mortality_rate_comparison_plot <- ggplot(mortality_rates_long_1, aes(x=reorder(Ages,correct_order),y=Total, fill = Type)) + # plotting mortality rates per age group for Covid-19 and Influenza
geom_bar(stat = "identity", position = "dodge") +
theme_economist() +
theme(axis.title.x = element_text(size = 10, margin = margin(15,0,0,0)),
axis.title.y = element_text(size = 10, margin = margin(0,15,0,0)),
legend.title = element_blank(),
legend.text = element_text(size=8),
legend.position = "bottom") +
labs(title = "Covid-19 vs Influenza & Pneumonia mortality rates by age group,
England & Wales",
x = "Age Group",
y = "Mortality Rate") +
scale_fill_manual(values =c("#23203F","#C18DBE")) +
scale_y_continuous(label=scales::percent)
total_deaths_comparison_plot <- ggplot(mortality_rates_long_2, aes(x=reorder(Ages, correct_order), y=Total, fill = Type)) + # plotting total deaths per age group for Covid-19 and Influenza
geom_bar(stat = "identity", position = "dodge") +
theme_economist() +
theme(axis.title.x = element_text(size = 10, margin = margin(15,0,0,0)),
axis.title.y = element_text(size = 10, margin = margin(0,15,0,0)),
legend.title = element_blank(),
legend.text = element_text(size=8),
legend.position = "bottom") +
labs(title = "Covid-19 vs Influenza & Pneumonia total deaths by age group,
England & Wales",
x = "Age Group",
y = "Total Deaths") +
scale_fill_manual(values =c("#23203F","#C18DBE")) +
geom_text(aes(label=Total),position=position_dodge(width=1), vjust=-0.25, size = 2.5)
```
```{r fig.height=6, fig.width=10}
plot(covid_mortality_rates_plot)
```

Here we confirm the already well-established relationship that the risk a person faces of dying from Covid-19 increases drastically with age. The mortality rate peaks in the 90+ group, where 3.28% of people died with Covid-19 in 2020, an extremely high figure for a single cause of death. Although older people are more likely to die anyway, the risk of dying from Covid-19 is indisputably high in the oldest three age groups. However, when looking at the youngest 16 groups the story is very different. The compelling finding from this visualization is the true extent of the variation in the mortality rate. The risk of death among young and even middle-aged people is almost negligible when compared to the oldest in the population: the 90+ group had a mortality rate 3416 times greater than the 20-24 group.
(Caveat: The mortality rate is a good general indication of the risk of dying from Covid-19 among the population and is adequate for this analysis. However, it is limited because not everyone in the population is exposed to the same risk of infection. The infection fatality ratio, which represents the proportion of deaths among infected individuals, is a better measure, but it is difficult to calculate accurately and requires detailed data on infected persons that is not readily available.)
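To make the distinction in the caveat concrete, here is a small sketch with entirely hypothetical numbers. It shows how the same death count gives a much larger figure when divided by the number of infected people rather than by the whole population. The number of infections is exactly the quantity that is hard to obtain in practice, so the value used here is an assumption for illustration only.

```r
# All inputs are hypothetical, chosen only to show how the two measures differ
population <- 1e6     # everyone in the group
infected   <- 80000   # assumed number of people infected during the year
deaths     <- 800     # deaths attributed to the virus

# Mortality rate: risk averaged over the whole population
mortality_rate <- deaths / population

# Infection fatality ratio: risk among those who were actually infected
infection_fatality_ratio <- deaths / infected

cat("Mortality rate:          ", mortality_rate * 100, "%\n")
cat("Infection fatality ratio:", infection_fatality_ratio * 100, "%\n")
```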
```{r fig.height=6, fig.width=10}
plot(mortality_rate_comparison_plot)
```

When looking at Covid-19 in comparison to Influenza (Seasonal Flu) we see it is deadlier in the majority of age groups too. The 55-59 age group experienced a mortality rate from Covid-19 that was 5.3 times its mortality rate from Influenza. However, one must bear in mind that vaccines have been available for the Influenza virus since the 1940s, along with a host of effective drug treatments. In contrast, almost nobody received a Covid-19 vaccine in 2020. Had comparable vaccines and treatments been available for Covid-19, it is not unrealistic to think we would have seen a much smaller difference in mortality between the two. Whilst it may still be difficult to label Covid-19 'a bad version of the Flu', some sensible comparison does seem to alleviate some of the fears surrounding Covid-19 and provides some hope for the future.
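The ratio quoted above is simply one mortality rate divided by the other, with each rate using the population of its own year. The sketch below shows that step on invented numbers (the data frame `toy` and its values are hypothetical and do not reproduce the ONS figures).

```r
# Hypothetical counts; populations are assumed to be in thousands
toy <- data.frame(
  Ages               = c("20-24", "55-59", "90+"),
  covid_deaths_2020  = c(30, 1500, 16000),
  flu_deaths_2018    = c(20, 300, 9000),
  pop_2020_thousands = c(3600, 3900, 500),
  pop_2018_thousands = c(3700, 3800, 480),
  stringsAsFactors = FALSE
)

# Each rate uses the population of the year the deaths occurred in
toy$covid_rate <- toy$covid_deaths_2020 / (toy$pop_2020_thousands * 1000)
toy$flu_rate   <- toy$flu_deaths_2018 / (toy$pop_2018_thousands * 1000)

# Values above 1 mean Covid-19 was the deadlier of the two in that age band
toy$rate_ratio <- toy$covid_rate / toy$flu_rate

print(toy[, c("Ages", "covid_rate", "flu_rate", "rate_ratio")])
```

Dividing rates rather than raw death counts matters because the population of an age band can change between the two years being compared.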
```{r fig.height=6, fig.width=10}
plot(total_deaths_comparison_plot)
```

Looking at the absolute numbers of deaths from the two viruses also helps to dispel some of the fears many people have about their risk of dying. In the whole of 2020, only 35 people died with Covid-19 in the 20-24 category, where the majority of the student population lies, a relatively low number.
### 2.) Indirect impact on mortality
Not only has the virus been deadly through the direct deaths it has caused, as seen in section 1, but the indirect impacts of lockdown and social distancing restrictions have also been considerable. Whilst these measures were initially implemented to protect lives by stopping the spread of the virus, they also made it increasingly difficult for doctors to see and treat patients for other conditions. The hysteria described earlier and the unknown nature of a pandemic also discouraged people from going to hospital. These indirect effects could prove even more deadly than the direct deaths the virus has caused so far. To analyze this I took the number of new referrals to treatment made by doctors in England between January and May 2020 and compared them to the same months in 2019, before the pandemic struck.
```{r fig.height=6, fig.width=10, message=FALSE}
library(data.table)
library(dplyr)
library(ggplot2)
library(readxl)
library(ggthemes)
library(extrafont)
library(tidyr)
library(zoo)
new_rtt_period_20 <- read_excel("p4_mini_project_data/RTT-Overview-Timeseries-May20-XLS-102K-43296-2.xls", # loading referrals data for 2020
skip = 9,
n_max = 168) %>%
select("...2", "No. of new RTT periods") %>%
filter(...2 >= as.POSIXct("2020-01-01") & ...2 <= as.POSIXct("2020-06-01")) %>% # cleaning/filtering data
mutate(ID = row_number()) %>%
rename('2020' = 'No. of new RTT periods',
Date = ...2)
new_rtt_period_19 <- read_excel("p4_mini_project_data/RTT-Overview-Timeseries-May20-XLS-102K-43296-2.xls", # loading referrals data for 2019
skip = 9,
n_max = 168) %>%
select("...2", "No. of new RTT periods") %>%
filter(...2 >= as.POSIXct("2019-01-01") & ...2 <= as.POSIXct("2019-06-01")) %>% # cleaning/filtering data
mutate(ID = row_number()) %>%
select("ID", "No. of new RTT periods") %>%
rename('2019' = "No. of new RTT periods")
plot_data_normal <- full_join(new_rtt_period_19, new_rtt_period_20, by = "ID") %>% # joining 2019/2020 referrals in same data frame
select('Date', '2019', '2020')
plot_data_normal$max = as.numeric(plot_data_normal$`2019`) # getting max and min values for referrals for geom_ribbon
plot_data_normal$min = as.numeric(plot_data_normal$`2020`)
plot_data <- plot_data_normal %>% # turning data into long format for plot
gather(Year, Total, '2019','2020') %>%
mutate(Date = as.Date(Date),
Total = as.numeric(Total))
new_rtt_period_plot <- ggplot(plot_data, aes(x=Date, y=Total, colour = Year, group= Year)) + # creating new_rtt_period_plot
geom_line(size = 0.8) +
theme_economist() +
scale_color_manual(values=c("#F28F64", "#23203F")) +
theme(legend.title = element_blank(),
legend.position = "bottom",
axis.title.x = element_text(size = 10, margin = margin(15,0,0,0)),
axis.title.y = element_text(size = 10, margin = margin(0,15,0,0)),
axis.title=element_text(size=12)) +
labs(title = 'New Referrals to Treatment (RTT), England
2019 vs 2020') +
scale_x_date(date_breaks = "1 month", date_minor_breaks = "1 week",
date_labels = "%B") +
geom_ribbon(data = plot_data%>%filter(Year == "2019" & Date >= as.Date("2020-01-01")),
aes(x=Date,ymin=min, ymax=max), inherit.aes = FALSE, fill = "#C18DBE", alpha = 0.5) +
annotate("text", x = as.Date("2020-04-8"), y = 1200000, label = "2,903,137 fewer
referrals to treatment", size = 4)
```
```{r fig.height=6, fig.width=10, message=FALSE}
plot(new_rtt_period_plot)
```

The results of the pandemic are shocking to see. Over the five months from January to May, 2,903,137 fewer new referrals to treatment were made by doctors in 2020 than in the same period of 2019. This means there is a cohort of people who have not been seen for a range of conditions that, if not treated quickly enough, could result in death in the future. A rough back-of-the-envelope calculation shows the impact this may have. If just 5% of these missed referrals (roughly 145,000 people) die as a result of not being treated quickly enough, this already eclipses the total death toll directly from the virus (127,000). A scary statistic.
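The back-of-the-envelope calculation can be written out in a few lines. The referral shortfall and the direct death toll are the figures quoted in the text; the shares of missed referrals that end in death are purely hypothetical scenarios, since the true share is unknown.

```r
referral_shortfall  <- 2903137  # fewer new referrals, Jan-May 2020 vs 2019 (from the text)
direct_covid_deaths <- 127000   # direct death toll quoted in the text

# Hypothetical shares of missed referrals that result in a death
assumed_death_share <- c(0.01, 0.02, 0.05, 0.10)

scenarios <- data.frame(
  assumed_death_share = assumed_death_share,
  implied_deaths      = round(referral_shortfall * assumed_death_share)
)
scenarios$exceeds_direct_toll <- scenarios$implied_deaths > direct_covid_deaths

# Share at which indirect deaths would equal the direct death toll
break_even_share <- direct_covid_deaths / referral_shortfall

print(scenarios)
cat("Break-even share:", round(break_even_share * 100, 2), "%\n")
```

Laying the scenarios side by side makes it clear how sensitive the conclusion is to the assumed share, which is the weakest input in the calculation.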
This is when medical ethics comes into the discussion. Is a death from Covid-19 any different from a death from cancer or any other treatable condition that requires a referral? Policymakers have faced a challenging balancing act between mitigating direct deaths from Covid-19 and the other impacts the preventative measures are having on the health of the population. Now that vaccination programs are in full swing, working through the backlog of patients that are still waiting to be treated should be a top priority for the Government. Countries that get this right whilst still maintaining low direct deaths from the virus will minimize the impact of the pandemic in the long run.
### 3.) International differences
Having looked at the deadliness of the virus in the UK, we move on to the final part: comparing its impact on an international scale.
Perhaps the crudest measure of the deadliness of a virus would be to look at the total deaths. Throughout the pandemic, this metric has regularly been thrown around by the world's media outlets as a means to quantify the impact Covid-19 has had on societies. The measure doesn't account for population size or for normal death levels within the population, so by itself it can easily misrepresent how deadly the virus is. Despite this, once scaled by population it does act as a good yardstick for this comparison.
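The comparison that follows uses deaths per million people rather than raw totals. As a small sketch of why that scaling matters, the example below uses two invented countries (`AAA` and `BBB`, with made-up deaths and populations) and shows that the country with fewer total deaths can still rank worse once population is taken into account.

```r
# Hypothetical countries and figures, for illustration only
toy_countries <- data.frame(
  iso_code     = c("AAA", "BBB"),
  total_deaths = c(50000, 120000),
  population   = c(25e6, 330e6),
  stringsAsFactors = FALSE
)

# Deaths per million = total deaths / population * 1,000,000
toy_countries$deaths_per_million <-
  toy_countries$total_deaths / toy_countries$population * 1e6

# Rank from highest to lowest deaths per million
ranked <- toy_countries[order(-toy_countries$deaths_per_million), ]
print(ranked)
```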
```{r fig.height=6, fig.width=10, message=FALSE}
library(data.table)
library(dplyr)
library(ggplot2)
library(readxl)
library(ggthemes)
library(extrafont)
library(tidyr)
library(zoo)
owid_data <- read.csv('p4_mini_project_data/owid-covid-data.csv') %>% # loading owid data
mutate(date = as.Date(date))
total_deaths_g7_data <- owid_data %>%
select('iso_code','date', 'total_deaths_per_million') %>%
filter(iso_code %in% c('USA','JPN','ITA','CAN','FRA', 'DEU','GBR'), # cleaning/filtering data for G7 countries
date == as.Date('2021-03-23'))
total_deaths_plot_G7 <- ggplot(total_deaths_g7_data, aes(x=reorder(iso_code, total_deaths_per_million), y=total_deaths_per_million)) + # creating total deaths plot for G7
geom_bar(stat = 'identity', position = 'dodge', fill = "#23203F") +
coord_flip() +
theme_economist() +
geom_text(aes(label=total_deaths_per_million),position=position_dodge(width=1), hjust =-0.25 ,vjust=-0.25, size = 2.5) +
theme(axis.title.x = element_text(size = 10, margin = margin(15,0,0,0)),
axis.title.y = element_text(size = 10, margin = margin(0,15,0,0)),
legend.title = element_blank(),
legend.text = element_text(size=8),
legend.position = "bottom") +
labs(title = "Total deaths per million as of 2021-03-23, G7 countries",
x = "Country",
y = "Total Deaths Per Million") +
scale_fill_manual(values ="#C18DBE")
```
```{r fig.height=6, fig.width=10, message=FALSE}
plot(total_deaths_plot_G7)
```

When looking at total deaths per million for the G7, figures vary markedly, although the majority of countries have racked up very high death tolls. After accounting for population size the UK comes out worst of all, with an astonishing 1863.7 recorded deaths per million. This probably means that the UK sits right at the top of the countries in which Covid-19 has been most deadly. This notion is horrifying in one sense for us as UK residents because, for whatever reason (a whole other blog in itself), we are seemingly more likely to die from Covid-19 than people in similar countries. However, it could also be viewed as strangely comforting: if the figures we found in 1.) and 2.) are the worst of the worst, then perhaps the virus isn't as deadly as first thought.
### Conclusion
The Covid-19 outbreak has without a doubt been one of the deadliest in recent history, due to high mortality rates among old people and catastrophic indirect impacts that are still relatively unknown. However, after seeing how the risk of death varies with factors such as age, we have been able to see that the virus poses a very low threat to the working population and perhaps didn't deserve the mass hysteria it received early on. With the continued roll-out of vaccines, more effective drug therapy, development of universal best practices, and the possibility of seasonal convergence of the virus, we can expect to see Covid-19 deaths drop considerably in the future, providing hope of finally returning to normality in 2021.
### Data
The public data sources I used for this analysis can be found here:
1.)
Covid-19 Deaths (publishedweek532020 (1).xlsx):
https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/weeklyprovisionalfiguresondeathsregisteredinenglandandwales
Influenza Deaths (b.xlsx):
https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/causesofdeath/datalist
Population (ewpppsumpop18.xls):
https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationprojections/datasets/tablea23principalprojectionenglandandwalespopulationinagegroups
2.)
New Referral to Treatment periods (RTT-Overview-Timeseries-May20-XLS-102K-43296-2.xls):
https://www.england.nhs.uk/statistics/statistical-work-areas/rtt-waiting-times/rtt-data-2019-20/#May19
3.)
Covid-19 data from Our World in Data (owid-covid-data.csv)