Internal validity: chance, bias, confounding


What is internal validity

  • Chance
  • Bias
  • Confounding

Chance

  • Is the observed association explained by chance alone?
  • Example: a study finds that people with high concentrations of arsenic in their water have more skin disease
  • Is it possible that this finding could arise by chance?
  • Chance is always POSSIBLE, so
  • we need to rule out the play of chance

Null hypothesis

  • To rule out the play of chance
  • Use Null Hypothesis
  • The Null Hypothesis states that there is NO DIFFERENCE between the groups compared

An example of Null Hypothesis

  • Suppose we want to test whether exposure to inorganic arsenic in drinking water causes skin disease
  • Null Hypothesis: the risk of skin disease is equal between those with and without high arsenic exposure
  • The Null Hypothesis can be TRUE or FALSE
  • Based on the study, we either reject or fail to reject the Null Hypothesis

Alpha and beta errors

Study decision       H0 TRUE                 H0 FALSE
Reject H0            Type I error (alpha)    Correct decision
Fail to reject H0    Correct decision        Type II error (beta)

Before planning the study

  • Set a value for the Type I error (alpha error)
  • Usually the Type I error is set at 5%
  • Set a value for the Type II error (beta error)
  • Usually the Type II error is set at 20% (that is, 80% power)
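
To see what a 5% Type I error means in practice, the following minimal sketch simulates many studies in which H0 is true by construction (both groups share the same risk) and counts how often a two-proportion z-test still rejects H0. The risk, group size, number of studies and seed are illustrative assumptions, not values from the lecture.

```python
import math
import random

def rejects_h0(cases_a, cases_b, n, z_crit=1.96):
    """Two-proportion z-test; True if H0 (equal risk) is rejected at the 5% level."""
    pooled = (cases_a + cases_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return False  # no variation in outcome, nothing to test
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = (cases_a / n - cases_b / n) / se
    return abs(z) > z_crit

def simulate_type_i_error(risk=0.10, n=200, studies=2000, seed=1):
    """Share of simulated studies that reject H0 although H0 is true."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    rejections = 0
    for _ in range(studies):
        # Both groups are drawn with the SAME risk, so H0 holds
        cases_a = sum(rng.random() < risk for _ in range(n))
        cases_b = sum(rng.random() < risk for _ in range(n))
        rejections += rejects_h0(cases_a, cases_b, n)
    return rejections / studies

false_positive_rate = simulate_type_i_error()
print(f"Share of true-H0 studies that rejected H0: {false_positive_rate:.3f}")
```

The share should land close to 0.05: each of these rejections is a Type I error.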

After completion of study

  • If the Null Hypothesis (H0) were true,
  • what is the probability of findings at least as extreme as those observed?
  • That probability is the "p-value"
  • If the p-value is LOW (below the preset alpha),
  • reject the null hypothesis
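
As an illustration, a p-value for the arsenic example can be sketched with a two-proportion z-test (normal approximation). The counts are hypothetical, and the function name `two_proportion_p_value` is introduced here only for illustration.

```python
import math

def two_proportion_p_value(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Two-sided p-value for H0: equal risk in both groups (normal approximation)."""
    p1 = cases_exposed / n_exposed
    p0 = cases_unexposed / n_unexposed
    # Under H0 both groups share one risk, so pool them to estimate it
    pooled = (cases_exposed + cases_unexposed) / (n_exposed + n_unexposed)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_unexposed))
    z = (p1 - p0) / se
    # Two-sided tail area of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: skin disease in 30 of 200 people with high arsenic
# exposure and in 15 of 200 people with low exposure
z, p = two_proportion_p_value(30, 200, 15, 200)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these made-up counts the p-value falls below 0.05, so H0 would be rejected at the usual 5% level.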

Interpretation of p-value

  • If H0 were true:
  • out of 100 repetitions of the study,
  • we would expect findings at least this extreme about 100 × p times (5 times if p = 0.05)

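How do we reject the null

  • If p is very low (below the preset alpha, usually 0.05),
  • findings like these would be unlikely if H0 were true,
  • so we reject H0 and rule out chance as the likely explanation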

Alternative approach

  • Construct a 95% confidence interval around the estimate
  • If the study were to be conducted 100 times,
  • about 95 of the 100 intervals
  • would contain the true value
  • If the interval excludes the null value (no difference), chance is an unlikely explanation
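
A minimal sketch of the confidence interval approach, assuming the effect measure is a risk difference and using the normal (Wald) approximation. The counts are hypothetical and `risk_difference_ci` is an illustrative name.

```python
import math

def risk_difference_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed,
                       z_crit=1.96):
    """Risk difference with an approximate 95% (Wald) confidence interval."""
    p1 = cases_exposed / n_exposed
    p0 = cases_unexposed / n_unexposed
    rd = p1 - p0
    # Unpooled standard error: each group contributes its own variance
    se = math.sqrt(p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_unexposed)
    return rd, rd - z_crit * se, rd + z_crit * se

# Hypothetical data: 30 of 200 exposed and 15 of 200 unexposed have skin disease
rd, lower, upper = risk_difference_ci(30, 200, 15, 200)
print(f"Risk difference = {rd:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Here the interval lies entirely above 0, the null value for a risk difference, which points to the same conclusion as a p-value below 0.05.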

You rule out the play of chance

  • Before the study you set the values for Type I and Type II error
  • Decide on the effect size you want to see as "significant"
  • Estimate sample size
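
The three steps above can be sketched for a comparison of two risks with equal group sizes, using the standard normal-approximation formula n = (z_alpha + z_beta)^2 × [p1(1 − p1) + p0(1 − p0)] / (p1 − p0)^2. The risks used are hypothetical, and a dedicated sample size calculator may apply a slightly different formula.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p0, alpha=0.05, beta=0.20):
    """Approximate sample size per group to detect a difference between p1 and p0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    variance = p1 * (1 - p1) + p0 * (1 - p0)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p0) ** 2
    return math.ceil(n)  # round up to whole participants

# Hypothetical effect size: 15% risk in the exposed versus 7.5% in the unexposed
n_per_group = sample_size_two_proportions(0.15, 0.075)
print(f"Required sample size per group: {n_per_group}")
```

A smaller effect size, or smaller alpha or beta, increases the required sample size.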

Hands-on practice with sample size calculator


Example


Bias

  • Systematic error
  • The compared groups differ systematically in how they are selected or measured
  • These differences distort the observed association

Selection Bias

  • You want to study effect of X on Y
  • You select participants with different values of X in a way
  • that will favour your conclusions

Example of selection bias

  • Suppose you want to study association between indoor smoking and respiratory illnesses
  • You know that indoor smoking is common in poor households
  • You also know that many elderly people in poor households suffer from respiratory illnesses
  • For your case control study you select
  • cases only from poor neighbourhoods, without selecting controls in the same way
  • This will stack the results in your favour

Response Bias

  • When the quality of the information collected
  • differs between groups in a way that
  • distorts the direction or size of the association

Example of Response Bias

  • You want to study the association between indoor smoking and respiratory illnesses
  • In your case control study,
  • cases, if they know the purpose of the study, could provide
  • more accurate information about smoking than controls
  • This can DISTORT the relationship between smoking and respiratory illness
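
The distortion can be sketched numerically. Assume, purely for illustration, that 40% of cases and 25% of controls are truly exposed to indoor smoking, that cases report 95% of true exposure while controls report only 70%, and that nobody reports exposure falsely. All of these figures are hypothetical.

```python
def odds_ratio(exposed_share_cases, exposed_share_controls):
    """Odds ratio from the share of exposed people among cases and controls."""
    odds_cases = exposed_share_cases / (1 - exposed_share_cases)
    odds_controls = exposed_share_controls / (1 - exposed_share_controls)
    return odds_cases / odds_controls

# Hypothetical true exposure prevalence
true_cases, true_controls = 0.40, 0.25

# Hypothetical reporting accuracy: cases recall exposure better than controls
recall_cases, recall_controls = 0.95, 0.70

reported_cases = true_cases * recall_cases
reported_controls = true_controls * recall_controls

true_or = odds_ratio(true_cases, true_controls)
observed_or = odds_ratio(reported_cases, reported_controls)
print(f"True OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

Because controls under-report exposure more than cases do, the observed odds ratio is larger than the true one in this sketch.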

Steps to eliminate bias

  • Objective measurement of exposure and outcome
  • If using subjective tools such as interviews,
  • Train interviewers and use checks and balances
  • Blinding and concealment of information from all parties
  • Do everything at the design stage of the study

Confounding

  • Associated with Exposure
  • Associated with Outcome
  • Does not lie on the causal pathway connecting the two

Illustration of confounding


Example of confounding

  • You want to study the association between indoor smoking exposure and heart disease
  • Male spouses of smokers are at increased risk of exposure
  • Males are also more likely to suffer from heart disease
  • Yet maleness DOES NOT lie on the causal pathway between exposure and heart disease
  • Hence "gender of the spouse" is a confounder

Control for confounding

  • Randomisation (works for randomised controlled trial)
  • Matching for observational studies
  • Stratified analysis
  • Multivariate modelling and analysis
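
Stratified analysis can be illustrated with a Mantel-Haenszel odds ratio. In the hypothetical counts below, men are both more often exposed and more often diseased, yet within each sex the exposure has no effect. The counts are made up to show the mechanism and do not come from any study.

```python
# Each stratum: (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = {
    "men": (80, 320, 20, 80),    # mostly exposed, 20% risk of heart disease
    "women": (5, 95, 20, 380),   # mostly unexposed, 5% risk of heart disease
}

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Crude table: add the strata together, ignoring sex
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata.values())

for name, table in strata.items():
    print(f"OR among {name}: {odds_ratio(*table):.2f}")
print(f"Crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {adjusted_or:.2f}")
```

The crude odds ratio suggests an association, while the stratum-specific and Mantel-Haenszel odds ratios equal 1: in this sketch the crude association is produced entirely by confounding by sex.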

Summary

  • Chance, bias and confounding are three important factors
  • Chance can be ruled out with adequate sample size estimation
  • Bias can be eliminated with design
  • Confounding can be controlled with several strategies
  • Next up: Causal inference