Introduction to Epidemiology: a tour
Epidemiology is the study of the distribution and determinants of diseases in populations, and the use of this knowledge to improve public health.
Epidemiology in triads
Epidemiology triad
Distribution of diseases
Determinants of diseases
Use of the information
Distribution of diseases by person
Distribution of diseases by place
Distribution of diseases over time
Measures of disease distribution
Prevalence (proportion)
Incidence (rate)
Ratio (standardised mortality ratio)
Concept of prevalence
total cases of a disease in the population OVER
total number of people there
MULTIPLIED by a factor of 100 (percent), 1,000 or 10,000
At a FIXED POINT in TIME
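The definition above can be sketched in a few lines of Python. The function name `prevalence` and the survey numbers are assumptions made up for illustration; they do not come from any real data set.

```python
def prevalence(cases: int, population: int, per: int = 100) -> float:
    """Existing cases divided by the total population, scaled by `per`."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= cases <= population:
        raise ValueError("cases must be between 0 and population")
    return cases * per / population


# Hypothetical survey: 150 existing cases among 5,000 residents on the survey date.
print(prevalence(150, 5000))        # 3.0 (percent)
print(prevalence(150, 5000, 1000))  # 30.0 (per 1,000 people)
```

Changing `per` only changes the scale of the result, not the underlying proportion.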
Example of prevalence
Advantage of prevalence
Simple snapshot of a health condition in the population
For a single population, at one point in time
You can use it to compare two or more populations
Limitation of prevalence
Does not provide any information about how the disease spreads in the community
Is the disease increasing?
Is it getting worse?
Is it getting better?
Question for the class - what data do we need?
What additional data do we need for that?
(Question for the class, what do you think?)
Concept of Incidence
How many NEW cases of the disease in the community
How many people were AT RISK?
Over WHAT period of time?
Concept of Person-time
If you follow 1 person for 1 year,
You get 1 person-year
If you follow 100 people for 1 year,
You get 100 person-years
If you follow 100 people for 10 years,
You get 1000 person-years
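Person-time is simply the sum of each person's follow-up time, which also works when people are followed for different lengths of time. A minimal sketch, where the function name `person_years` and the follow-up times are illustrative assumptions:

```python
def person_years(follow_up_years):
    """Total person-time: add up each person's years of follow-up."""
    return sum(follow_up_years)


print(person_years([1] * 100))        # 100 people x 1 year   -> 100
print(person_years([10] * 100))       # 100 people x 10 years -> 1000
print(person_years([2.0, 0.5, 3.0]))  # unequal follow-up     -> 5.5
```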
Question for the class
Let's say we follow 1000 people for 5 years, which of the following is correct?
(X) We have 5000 person-years
(Y) We have 1000 person-years
(Z) We have 5 person-years
Concept of incidence rate
Incidence is a rate because it has TIME in the DENOMINATOR
Number of NEW CASES OVER Person-years
REQUIRES FOLLOW UP of people
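As a rough sketch of the calculation, assuming a hypothetical cohort (the function name `incidence_rate` and the counts are invented for illustration):

```python
def incidence_rate(new_cases: int, person_years: float, per: int = 1000) -> float:
    """New cases divided by person-years at risk, scaled by `per`."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * per / person_years


# Hypothetical cohort: 12 new cases observed over 4,000 person-years of follow-up.
print(incidence_rate(12, 4000))  # 3.0 new cases per 1,000 person-years
```

Only people who are at risk (free of the disease at the start of follow-up) should contribute person-years to the denominator.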
Example of incidence rate
Question for the class 2
Based on the data presented below, is the incidence
(A) increasing or
(B) Falling off
Incidence of the disease in the data
Advantage of Incidence rates
Used for charting epidemics
Helps you to understand whether an epidemic is getting better or worse
Helps to chart data in real time
Example of incidence rate in real life
Question for the class - 3
How can we use the information on the COVID Epidemic Curve?
(1) Test whether the infection is rising or falling
(2) Get an idea when to introduce lockdown or other containment measures
(3) Find out whether lockdown or containment measures are working or not
(4) All of the above
Limitation of Incidence
We need longitudinal data
Without follow-up data we cannot estimate incidence
Crude incidence is too simple if we want to compare populations with different age structures
Question for class - which population has higher incidence?
Concept of age-standardised rates
Question for class - which population has NOW higher incidence?
Summary for distribution of diseases
Three measures: prevalence, incidence, and standardised ratios
Prevalence describes a single point in time
Incidence uses person-years as the denominator
Incidence is used for measuring how disease increases or decreases over time
Standardised ratios are used to compare two different populations
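One common standardised ratio is the standardised mortality ratio (SMR): the observed deaths in a study population divided by the deaths expected if each age group had experienced the death rates of a reference population. The sketch below assumes three hypothetical age groups; every rate and count is invented for illustration.

```python
def standardised_mortality_ratio(observed_deaths, reference_rates, study_person_years):
    """SMR = observed deaths / expected deaths.

    Expected deaths are obtained by applying the reference population's
    age-specific rates (per person-year) to the study population's person-years.
    """
    expected = sum(rate * py for rate, py in zip(reference_rates, study_person_years))
    return observed_deaths / expected


# Hypothetical age groups: young, middle-aged, old.
reference_rates = [0.001, 0.005, 0.02]   # deaths per person-year (assumed)
study_person_years = [2000, 1500, 500]   # follow-up in the study population (assumed)
observed = 26                            # deaths actually seen (assumed)

smr = standardised_mortality_ratio(observed, reference_rates, study_person_years)
print(round(smr, 2))  # above 1 means more deaths than expected from the reference rates
```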
What does determinants of diseases in populations mean?
This means: what causes diseases in populations?
Examples of some questions:
Does cigarette smoking cause lung cancer? How do we know?
Does long term sitting and sedentary lifestyle lead to heart diseases?
Do particular food items such as chicken salad lead to gastroenteritis?
How do we know?
How do we know that X causes Y?
Observe facts -> Frame theories -> Test with new facts
What is meant by exposure and outcome
Consider a disease, any disease, call it O
We call O an OUTCOME (health outcome)
Examples: Diabetes, High blood pressure, lung cancer, so on
Consider something to which people are exposed
Example: air pollution, smoking, drinking, gambling, reckless driving
Each of these is called an EXPOSURE; give it the name E
What is association?
If high levels of E lead to high levels of O, then
We call that a positive association, or RISK
That is, those with high levels of E will have high incidence of O
Example: people who smoke lots of cigarettes are more likely to develop lung cancer
We say cigarette smoking is associated with lung cancer
Or, we say Smoking is a RISK for Lung Cancer
How do we measure associations?
Risk Ratio (RR) Or
Odds Ratio (OR)
Risk Ratio = Risk of Disease among Exposed OVER Risk of Disease among non-Exposed
Odds Ratio = Odds of Exposure among Diseased OVER Odds of Exposure among those without the disease
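Both measures can be computed from a 2x2 table. In the sketch below, the cell labels `a`, `b`, `c`, `d` and the counts are assumptions chosen for illustration: `a` and `b` are exposed people with and without the disease, `c` and `d` are non-exposed people with and without the disease.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk of disease among exposed over risk of disease among non-exposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of exposure among diseased over odds of exposure among non-diseased."""
    odds_exposure_diseased = a / c
    odds_exposure_not_diseased = b / d
    return odds_exposure_diseased / odds_exposure_not_diseased


# Hypothetical study: 50 of 1,000 exposed and 20 of 1,000 non-exposed develop the disease.
a, b, c, d = 50, 950, 20, 980
print(round(risk_ratio(a, b, c, d), 2))  # 2.5
print(round(odds_ratio(a, b, c, d), 2))  # 2.58
```

When the disease is rare, as in this made-up table, the odds ratio is close to the risk ratio.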
Decisions about RRs and ORs
If RR > 1, the risk is higher among the exposed, that is, the association is positive
If RR = 1, there is no association between E and O
If RR < 1, the exposure is associated with LOWER risk (a protective, or BENEFICIAL, effect)
Question for the Class - Is this high risk?
You conducted a study on cigarette smoking and risk of lung cancer, and found RR = 2.50; what would you say?
(1) Cigarette smoking increases the risk of lung cancer
(2) Cigarette smoking has no effect on lung cancer
Establishment of cause and effect
If we have to show exposure E is a cause of disease O, then
Show that E has TRUE or REAL association with O
And show that,
that Association is one of CAUSE and EFFECT
Five things to consider
Valid Association: did not occur due to chance (rule out chance)
Observed association could not be due to biases (eliminate biases)
Observed association cannot be due to a third factor (confounding)
Examine causal factors using Hill's criteria
Examine counterfactual theories of causation
Rule out the play of chance in the exposure-disease relationship
Use hypothesis testing
Null hypothesis: the chance of disease is the same with and without exposure
Alternative hypothesis: risk of disease is higher with exposure
Always test against the null hypothesis
Report p-values and 95% confidence intervals
What is meant by null hypothesis? (Example: smoking and lung cancer)
Exposure and disease outcomes are unrelated
People who smoke and who are non-smokers get lung cancer at the same rate
Rate of lung cancer among smokers = Rate of lung cancer among non-smokers
All epidemiological research is about disproving the null hypothesis
Question for the class - what is the correct null hypothesis
Imagine you are investigating whether smoking causes lung cancer. What is the correct null hypothesis?
(X) Cigarette smoking DOES NOT cause Lung Cancer
(Y) There is NO ASSOCIATION between cigarette smoking and lung cancer
(Z) The RISK of Lung Cancer is same for Smokers and Non-smokers
p-value and 95% Confidence Interval
You completed a study of an Exposure E and outcome O
Suppose you were to repeat the study 100 times
You found a measurement of RISK (RR) of 2.5
You found a p-value of 0.02
You found 95% Confidence Interval of 2.1 - 4.2
What do these things mean?
Question for the class - what does RR of 2.5 mean?
(A) E is high risk for O
(B) E has no association with O
Concept of p-value
Assuming that the null hypothesis is TRUE,
and suppose you ran this study 100 times,
Then in only about 2 out of those 100 studies
would you get an RR as high as (or higher than) the one you got
That is below the conventional cut-off of 5 in 100 (p < 0.05), so you can reject the null hypothesis
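One common way to get a p-value for a risk ratio is a Wald test on the log scale, sketched below with the standard library only. This is an illustrative approach, not necessarily the test used for the numbers on the slides; the function name, the 2x2 counts, and the choice of a two-sided test are all assumptions.

```python
import math
from statistics import NormalDist


def risk_ratio_p_value(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Wald p-value for the null hypothesis RR = 1.

    a, b = exposed with / without disease; c, d = non-exposed with / without disease.
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Approximate standard error of ln(RR) for large samples.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = math.log(rr) / se
    # Probability of a result at least this extreme if the null hypothesis is true.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical study: 50 of 1,000 exposed and 20 of 1,000 non-exposed develop the disease.
p = risk_ratio_p_value(50, 950, 20, 980)
print(p < 0.05)  # True here: a result this extreme would be rare if RR were really 1
```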
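Concept of 95% Confidence Interval
If you repeated this study 100 times and calculated a 95% confidence interval each time,
In about 95 out of those 100 times,
The interval would contain the true RR
In this study the interval runs from 2.1 to 4.2, with 2.5 as the single best estimate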
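A 95% confidence interval for a risk ratio is often calculated on the log scale and then converted back. The sketch below uses that large-sample method with made-up counts; the function name and the numbers are assumptions, and the result is not meant to reproduce the 2.1 to 4.2 interval quoted on the slide.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Return (RR, lower, upper) using the log-scale large-sample interval.

    a, b = exposed with / without disease; c, d = non-exposed with / without disease.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical study: 50 of 1,000 exposed and 20 of 1,000 non-exposed develop the disease.
rr, lower, upper = risk_ratio_ci(50, 950, 20, 980)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

If the whole interval lies above 1, as it does for these counts, the data are not compatible with "no association" at the 5% level.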
Question for the class - What does RR of 2.1 mean?
(A) E is high risk for O
(B) E has no association with O
How do we rule out the play of chance?
Select a large enough sample suitable for the effect you want to study
Perform sample size estimation and power calculation ahead of the study
Deal with it during the planning of the study
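For a study comparing two proportions, a widely used approximate formula for the sample size per group is n = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The sketch below implements it under stated assumptions: a two-sided test, equal group sizes, and expected risks `p1` and `p2` that are invented for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of people needed in EACH group to detect p1 vs p2."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical planning values: 5% risk among exposed, 2% among non-exposed.
n = sample_size_per_group(0.05, 0.02)
print(n)  # people required per group, before allowing for drop-outs
```

Asking for more power, or trying to detect a smaller difference, increases the required sample size.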
What is bias?
Bias = Systematic errors in observation or conduct of the study
Selection Bias: When the groups are not comparable because of the way they were selected
Response Bias: When the participants of the study provide erroneous information
How can you eliminate biases in the study design?
In the study design phase,
The investigator should be careful about selection of the sample
In experimental studies, use randomisation and blinding
Train the data collectors in the study
Tests of association: control for confounding variable
What is a confounding variable?
Example of a confounding variable
Explanation why Age is a confounding variable
In the study on the association between smoking and lung cancer,
Age is a confounding variable
Old people tend to smoke more than younger people, AND
Old age is also a risk factor for ANY cancer!
Smoking CANNOT CAUSE Aging!
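The mechanism can be seen by comparing a crude risk ratio with risk ratios calculated separately within each age group (stratification, an additional technique used here only to make the point visible). The counts below are deliberately constructed, for a generic exposure, so that age explains the whole crude association; they are not real smoking data.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk among exposed divided by risk among non-exposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical strata: (exposed cases, exposed total, non-exposed cases, non-exposed total).
strata = {
    "old":   (80, 800, 20, 200),  # most older people are exposed; risk is 10% either way
    "young": (2, 200, 8, 800),    # few younger people are exposed; risk is 1% either way
}

# Crude analysis: ignore age and pool everyone together.
crude = risk_ratio(
    sum(s[0] for s in strata.values()), sum(s[1] for s in strata.values()),
    sum(s[2] for s in strata.values()), sum(s[3] for s in strata.values()),
)
print(f"crude RR = {crude:.2f}")  # well above 1

# Stratified analysis: compare like with like within each age group.
for age_group, counts in strata.items():
    print(f"{age_group}: RR = {risk_ratio(*counts):.2f}")  # 1.00 in both groups
```

In this invented example the exposure looks harmful only because exposed people are older on average.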
How do we control for a confounding variable?
Matching
Multivariable analysis
Allocation to groups being compared using Random Numbers Table
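Random allocation is usually done by computer rather than with a printed table. A minimal sketch, assuming a hypothetical list of participant IDs and a fixed seed so that the allocation can be reproduced:

```python
import random


def randomise(participants, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs P01 ... P20.
ids = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomise(ids, seed=42)
print(len(treatment), len(control))  # 10 10
```

Because chance alone decides who goes where, known and unknown confounders tend to be balanced between the groups.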
How do we find causal linkage using the Bradford Hill criteria?
Break for a couple of minutes
Putting these ideas together: study designs
Epidemiological study designs
Single Case studies
Case series (used in Epidemic Surveillance)
Cross-sectional surveys
Case control studies (most widely used study design)
Cohort studies
Case control studies
Cohort studies
Rounding up everything we learned with Snow's Cholera investigation
The ghost map
Snow's table