Etiologic research Study of the causes of disease Siti Setiati
Dec 30, 2015
Major Types of Clinical Epidemiologic Research
Type of research question (descriptive or causal): aim
• Diagnostic research (descriptive): predict the probability of presence of the target disease from the clinical and non-clinical profile
• Prognostic research (descriptive): predict the course of disease from the clinical and non-clinical profile
• Etiologic research (causal): causally explain the occurrence of the target disease from a determinant
• Intervention research (causal and descriptive):
– (1) causally explain the course of disease as influenced by treatment
– (2) predict the course of disease given treatment (options) and the clinical and non-clinical profile
Etiologic research
The research question: • Is there a relation between a determinant
(risk factor) and a disease-outcome?
Research question for causal relation!
Etiologic research: Characteristics
• To demonstrate causality (cause-effect)
• Cause comes before effect
– Exposure or determinant occurs before the disease-outcome occurs
• Determinant-outcome relation is not explained by other factors
• Explanatory research – versus descriptive research
Hill’s Criteria
• Temporal relationship, where the cause precedes the outcome
• Strong association (OR, RR)
• Dose-response relationship
• Biological plausibility
Etiologic research: What study design?
• Experimental
– Exposure or determinant assigned by investigator
versus
• Observational
– Exposure or determinant not assigned by investigator
This lecture: observational research
Etiologic research: What study design?
Two observational study designs that can distinguish between cause and effect:
1. Cohort study
2. Case-control study
Cohort study
• Also called follow-up study
• Definition
– Study in which persons, based on their exposure or determinant, and free of the disease outcome at the start of the study, are followed in time to assess the occurrence of the disease outcome.
Cohort study
[Figure: at the start of the study, a cohort without the disease outcome is divided into determinant + and determinant - groups; each group is followed over time and classified as disease + or disease -.]
Framingham Heart Study
• 1948 – Framingham, MA
• 5200 persons 30-62 years old
• Aim: identification of risk factors for cardiovascular diseases
• Remeasured every 2 years
Example of a research question:
Is hypertension a risk factor for MI?
Framingham Heart Study
[Figure: a cohort without myocardial infarction in 1948 is divided into hypertension + and hypertension - groups; each group is followed until 1998 and classified as MI + or MI -.]
Cohort study: determinant-outcome relation
                   MI +   MI -
hypertension +      a      b
hypertension -      c      d
Incidence + = a/(a+b) = probability of MI for hypertension +
Incidence - = c/(c+d) = probability of MI for hypertension -
Relative risk = Incidence + / Incidence -
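The calculation above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the counts are hypothetical and are not Framingham results.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a cohort 2x2 table.

    a: determinant +, outcome +    b: determinant +, outcome -
    c: determinant -, outcome +    d: determinant -, outcome -
    """
    incidence_exposed = a / (a + b)      # Incidence +
    incidence_unexposed = c / (c + d)    # Incidence -
    return incidence_exposed / incidence_unexposed


# Hypothetical counts chosen for illustration (not real study data)
rr = relative_risk(a=30, b=70, c=10, d=90)
print(f"Relative risk = {rr:.2f}")
```

A relative risk above 1 means the outcome occurred more often in the group with the determinant; it does not by itself rule out bias or confounding.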
Cohort study
How do you get a cohort?
• Geographical data (Framingham Heart Study)
• Birth cohort (British 1946 birth cohort)
• Occupational cohort (Whitehall study)
Cohort study
How do you follow the cohort?
How do you find the disease-outcome?
• After a certain time interval, send out a questionnaire or invite for interview or medical examination
• Record disease outcomes via medical files or registrations
Cohort study: summary
[Figure: from determinant to disease-outcome]
Case-control study
• Also called patient-control study
• Definition
– Study in which patients with the disease-outcome and a control group without the disease-outcome are selected and in which it is determined how many people in both groups have been exposed to the determinant
Case-control study
[Figure: at the start of the study, patients (disease +) and controls (disease –) are selected; looking back in time, each group is classified as determinant + or determinant -.]
Creutzfeldt-Jakob’s Disease
• Fast, progressive form of dementia
• In the 90s a new variant of Creutzfeldt-Jakob was discovered in Europe after an epidemic of mad-cow disease
• Caused by eating beef?
What research question?
Why case control?
Creutzfeldt-Jakob’s Disease
[Figure: at the start of the study, patients with CJD and controls from hospital are selected; looking back in time, each group is classified as beef + or beef -.]
Case-control study: determinant-outcome relation
            CJD +   CJD -
beef +       a       b
beef -       c       d
a/c = odds of beef + in cases
b/d = odds of beef + in controls
Odds ratio = (a/c) / (b/d) = (a x d) / (b x c)
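As a minimal sketch, the odds ratio can be computed directly from the four cells. The function name and the counts are hypothetical; they are not data from any CJD study.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a case-control 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    odds_cases = a / c       # odds of exposure in cases
    odds_controls = b / d    # odds of exposure in controls
    return odds_cases / odds_controls


# Hypothetical counts chosen for illustration (not real study data)
print(f"Odds ratio = {odds_ratio(a=40, b=20, c=60, d=80):.2f}")
```

Dividing the two odds is algebraically the same as the cross-product (a x d) / (b x c).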
Case-control study
How do you find patients?
• GP; hospital; cancer registration
How to select a control group?
• GP; hospital; general population
Patients and controls have to come from the same ‘source’ population.
Selection of Cases
· Ideally, investigator identifies & enrolls all incident cases in a defined population in a specified time period
· Select cases from registries or hospitals, clinics
· When all incident cases in a population are included, the study is representative; otherwise there is potential for bias (e.g. referral bias)
· Use of prevalent vs incident cases
Essence of case-control studies
1. Detection of cases
2. Sampling of controls
3. Assess exposure in cases and controls
4. Calculate measure of association (usually, etiology: odds ratio with 95% CI)
NOTE
Study of cases and controls instead of census
(census: entire population, as in cohort studies and RCT)
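Step 4 mentions an odds ratio with a 95% CI. The slides do not say how the interval is obtained; the sketch below assumes the common Woolf (log odds ratio) approximation, and the counts are hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf / logit method, an assumption).

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    All four cells must be non-zero for this approximation.
    """
    estimate = (a * d) / (b * c)
    # Standard error of ln(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical counts chosen for illustration (not real study data)
estimate, lower, upper = odds_ratio_with_ci(a=40, b=20, c=60, d=80)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests the association is unlikely to be due to chance alone; it says nothing about selection bias, information bias, or confounding.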
Case-control study
How do you assess exposure to determinant?
• Interview with participant
• Interview with proxy
• Medical file
Case-control study: summary
[Figure: from disease-outcome back to determinant]
Validity and bias
• Validity:
– absence of systematic errors (free from bias) in design, conduct or data-analysis of the research
• Bias:
– degree of disruption of the determinant–outcome relation caused by systematic errors
– leads to reduced validity
• 3 types of bias in etiologic research:
– selection bias, information bias, confounding
What is Bias?
Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001)
A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988)
Systematic error in design or conduct of a study (Szklo et al, 2000)
1. Selection bias: definition
• Distortion of the determinant-outcome relation caused by systematic errors in the selection of study participants (cases and/or controls)
Selection Bias
Selective differences between comparison groups that affect the relationship between exposure and outcome
Usually results from comparison groups not coming from the same study base and not being representative of the populations they come from
Selection bias: example 1
Oral contraception and probability of DVT?
Patients: women with DVT admitted to hospital.
Controls: healthy women between 25-45 years old.
Patients turned out to use oral contraception more often. The apparent conclusion: oral contraception causes DVT.
How could selection bias play a role here?
Selection bias: example 1
• Medical circuit: 'oral contraception could lead to DVT’
• Women with DVT complaints who use oral contraception will be referred more often than those who do not
• Because of this selective referral, oral contraception users have a higher probability of entering the study as a case, and the effect of oral contraception on DVT will be overestimated
Selection bias: example 2
• Patients from hospital – control group from hospital:
– In the hospital co-morbidity and unhealthy lifestyles occur more often than in the population
– Relation between smoking and cancer can be underestimated due to over-representation of controls who smoke
2. Information bias: definition
• Distortion of the determinant-outcome relation caused by systematic errors in the measurement of the determinant and/or outcome.
• Who knows an example?
Information / Measurement / Misclassification Bias
Sources of information bias:
• Subject variation
• Observer variation
• Deficiency of tools
• Technical errors in measurement
Information bias: examples
• Misclassification of determinant
– Self reporting more accurate for cases than controls (or the other way around)
• Misclassification of outcome
– Disease better diagnosed in people with determinant
• In what cases can this play a role?
• Can this also play a role in cohort research?
Information / Measurement / Misclassification Bias
Reporting bias: Individuals with severe disease tend to have more complete records, and therefore more complete information about exposures, so a stronger association is found
Hawthorne effect: Individuals who are aware of being participants in a study behave differently
Controlling for Information Bias
- Blinding: prevents investigators and interviewers from knowing the case/control or exposed/non-exposed status of a given participant
- Form of survey: mail may impose less “white coat tension” than a phone or face-to-face interview
- Questionnaire: use of multiple questions that ask for the same information acts as a built-in double-check
- Accuracy: multiple checks in medical records; gathering diagnosis data from multiple sources
3. Confounding: definition
• Determinant – disease outcome relation is disturbed by the effect of another factor (the confounder) (“mixing of effects”)
• Can you think of an example?
Confounding: example
• Children with a higher birth order more often have Down’s syndrome
What could be a confounder?
Confounding
[Diagram: determinant (birth order), disease outcome (Down syndrome), confounder (maternal age)]
1. Confounder is a determinant of the disease outcome
2. Confounder is associated with the determinant
3. Confounder is not a factor in the causal chain
Confounding
Maternal age is correlated with birth order and a risk factor even if birth order
is low
Confounding
[Diagram: determinant, disease outcome, confounder]
Think of another example of confounding
[Diagram: coffee, CHD, smoking]
Confounding
Smoking is correlated with coffee drinking and a risk factor even for those
who do not drink coffee
Confounding?
[Diagram: is coffee a confounder of the smoking-CHD relation?]
Coffee drinking may be correlated with smoking but is not a risk factor in non-smokers
Alcohol Lung Cancer
Smoking
Confounding
Smoking is correlated with alcohol consumption and a risk factor even for
those who do not drink alcohol
Diet CHD
Cholesterol
Confounding?
No: cholesterol is on the causal pathway between diet and CHD
Confounding
• A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two
• Confounder not a result of the exposure
– e.g., association between child’s birth rank (exposure) and Down syndrome (outcome); mother’s age a confounder?
– e.g., association between mother’s age (exposure) and Down syndrome (outcome); birth rank a confounder?
Confounding
Imagine you have repeated a positive finding of birth order association in Down syndrome or association of coffee drinking with CHD in another sample. Would you be able to replicate it? If not why?
Imagine you have included only non-smokers in a study and examined association of alcohol with lung cancer. Would you find an association?
Imagine you have stratified your dataset for smoking status in the alcohol - lung cancer association study. Would the odds ratios differ in the two strata?
Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?
Confounding
Imagine you have repeated a positive finding of birth order association in Down syndrome or association of coffee drinking with CHD in another sample. Would you be able to replicate it? If not why?
You would not necessarily be able to replicate the original finding because it was a spurious association due to confounding.
In another sample where all mothers are below 30 yr, there would be no association with birth order.
In another sample in which there are few smokers, the coffee association with CHD would not be replicated.
Confounding
Imagine you have included only non-smokers in a study and examined association of alcohol with lung cancer. Would you find an association?
No, because the first study was confounded. The association with alcohol was actually due to smoking. By restricting the study to non-smokers, we have found the truth. Restriction is one way of preventing confounding at the time of study design.
Confounding
Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?
If smoking is included in the statistical model, the alcohol association would lose its statistical significance. Adjustment by multivariable modelling is another method to identify confounders at the time of data analysis.
Confounding
For confounding to occur, the confounders should be differentially represented in the comparison groups.
Randomisation is an attempt to evenly distribute potential (unknown) confounders in study groups. It does not guarantee control of confounding.
Matching is another way of achieving the same. It ensures equal representation of subjects with known confounders in study groups. It has to be coupled with matched analysis.
Restriction for potential confounders in design also prevents confounding but causes loss of statistical power (instead stratified analysis may be tried).
Controlling confounding
In the design
• Restriction of the study
• Matching
In the analysis
• Restriction of the analysis
• Stratification
• Multivariable methods
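Stratification in the analysis can be sketched as follows, using the alcohol, smoking and lung cancer example. The counts are invented so that the effect is easy to see, and the Mantel-Haenszel pooled odds ratio is one common summary across strata; it is an assumption here, since the slides do not name a specific method.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each given as an (a, b, c, d) tuple."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (exposure = alcohol, outcome = lung cancer)
smokers = (80, 40, 20, 10)
non_smokers = (10, 30, 30, 90)
crude = tuple(x + y for x, y in zip(smokers, non_smokers))

print(f"Crude OR        = {odds_ratio(*crude):.2f}")
print(f"OR, smokers     = {odds_ratio(*smokers):.2f}")
print(f"OR, non-smokers = {odds_ratio(*non_smokers):.2f}")
print(f"Mantel-Haenszel = {mantel_haenszel_or([smokers, non_smokers]):.2f}")
```

In this constructed example the crude odds ratio is well above 1 while both stratum-specific odds ratios equal 1, which is the pattern expected when smoking fully explains the alcohol association.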
How to prevent bias?
• Confounding – cannot be prevented
– Measure and adjust in data analysis
• Information bias - prevent during design
– Disease status blind for determinant status
– Medical files instead of self-reporting
– Same way of reporting for cases and controls
• Selection bias - prevent during design
– Control selection independent of determinant status
– Good definition of source population
Cohort study: Advantages and disadvantages
• What are the advantages of a cohort study?
• What are the disadvantages of a cohort study?
Cohort study
• Advantages
– Cause is measured before effect
– Not very sensitive to selection and information bias
– Appropriate for rare determinant
– Can study several outcomes
• Disadvantages
– Selective withdrawal / loss to follow-up
– Expensive and time consuming
– Not appropriate for rare outcome
Case-control study: Advantages and disadvantages
• What are the advantages of a case-control study?
• What are the disadvantages of a case-control study?
Case-control study
• Advantages
– Efficient and relatively cheap
– Appropriate for rare outcome
– Can study several determinants
• Disadvantages
– Cause is measured after effect
– Very sensitive to selection and information bias
– Not appropriate to study several outcomes
Effect modification
• Definition: The association between exposure and disease differs between strata of the population
– Example: Tetracycline discolours teeth in children, but not in adults
– Example: Measles vaccine protects in children > 15 months, but not in children < 15 months
• Rare occurrence
Selection Bias Examples
Selective survival (Neyman's) bias
Selection Bias Examples
Case-control study: Controls have less potential for exposure than cases
• Outcome = brain tumour; exposure = overhead high voltage power lines
• Cases chosen from province-wide cancer registry
• Controls chosen from rural areas
• Systematic differences between cases and controls
Case-Control Studies: Potential Bias
Schulz & Grimes, 2002
Selection Bias Examples
Cohort study: Differential loss to follow-up
• Especially problematic in cohort studies
• Subjects in follow-up study of multiple sclerosis may differentially drop out due to disease severity
• Differential attrition leads to selection bias
Selection Bias Examples
Self-selection bias:
- You want to determine the prevalence of HIV infection
- You ask for volunteers for testing
- You find no HIV
- Is it correct to conclude that there is no HIV in this location?
Selection Bias Examples
Healthy worker effect: Another form of self-selection bias
“Self-screening” process – people who are unhealthy “screen” themselves out of active worker population
Example:
- Course of recovery from low back injuries in 25-45 year olds
- Data captured on worker’s compensation records
- But prior to identifying subjects for study, self-selection has already taken place
Information / Measurement / Misclassification Bias
Method of gathering information is inappropriate and yields systematic errors in measurement of exposures or outcomes
If misclassification of exposure (or disease) is unrelated to disease (or exposure) then the misclassification is non-differential
If misclassification of exposure (or disease) is related to disease (or exposure) then the misclassification is differential
Distorts the true strength of association
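A minimal sketch of how non-differential misclassification distorts an association. It applies the same sensitivity and specificity of exposure measurement to cases and controls and compares the expected observed odds ratio with the true one. All numbers are hypothetical, and the calculation works with expected counts rather than a random simulation.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts when exposure is measured imperfectly."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true counts: (exposed, unexposed)
true_cases = (40, 60)
true_controls = (20, 80)

# Same measurement error in both groups = non-differential
sens, spec = 0.8, 0.9
obs_cases = misclassify_exposure(*true_cases, sens, spec)
obs_controls = misclassify_exposure(*true_controls, sens, spec)

true_or = odds_ratio(true_cases[0], true_controls[0], true_cases[1], true_controls[1])
observed_or = odds_ratio(obs_cases[0], obs_controls[0], obs_cases[1], obs_controls[1])
print(f"True OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

With these assumed values the observed odds ratio lies between 1 and the true odds ratio: for a binary exposure, non-differential misclassification is generally expected to weaken the association. Differential misclassification, where the error rates differ between cases and controls, can bias the estimate in either direction.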
Information / Measurement / Misclassification Bias
Recall bias: Those exposed have a greater sensitivity for recalling exposure (reduced specificity)
- specifically important in case-control studies
- when exposure history is obtained retrospectively, cases may more closely scrutinize their past history looking for ways to explain their illness
- controls, not feeling a burden of disease, may less closely examine their past history
Those who develop a cold are more likely to identify the exposure than those who do not – differential misclassification
- Case: Yes, I was sneezed on
- Control: No, can’t remember any sneezing
[Diagram: exposure, outcome, third variable]
To be a confounding factor, two conditions must be met:
Be associated with exposure - without being the consequence of exposure
Be associated with outcome - independently of exposure (not an intermediary)
Confounding?
[Diagram: birth order, Down syndrome, maternal age]
Birth order is correlated with maternal age but not a risk factor in younger mothers
Effect of randomisation on outcome of trials in acute pain
Bandolier Bias Guide
Obesity Mastitis
Age
Confounding
In cows, older ones are heavier and older age increases the risk for mastitis. This association may appear as an obesity association
Confounding
If each case is matched with a same-age control, there will be no association (OR for old age = 2.6, P = 0.0001)
Confounding or Effect Modification
[Diagram: birth weight, leukaemia, sex]
Can sex be responsible for the birth weight association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth weight is low?
- Is sex distribution uneven in comparison groups?
Confounding or Effect Modification
Does birth weight association differ in strength according to sex?
[Figure: birth weight-leukaemia association shown overall and stratified by sex (boys, girls); odds ratios shown in the figure: 1.8, 0.9, 1.5]
Effect Modification
In an association study, if the strength of the association varies over different categories of a third variable, this is called effect modification. The third
variable is changing the effect of the exposure.
The effect modifier may be sex, age, an environmental exposure or a genetic effect.
Effect modification is similar to interaction in statistics.
There is no adjustment for effect modification. Once it is detected, stratified analysis can be used to obtain
stratum-specific odds ratios.
Effect modifier
• Belongs to nature
• Different effects in different strata
• Simple
• Useful
• Increases knowledge of biological mechanism
• Allows targeting of public health action
Confounding factor
• Belongs to study
• Adjusted OR/RR different from crude OR/RR
• Distortion of effect
• Creates confusion in data
• Prevent (design)
• Control (analysis)
Modification-1
• Present when the measure of association between a given determinant and outcome is not constant across the levels of a subject characteristic
• Descriptive modification may easily occur due to differences in prevalence of the disease across populations or population subgroups
• The presence or absence of modification has a bearing on the domain and the generalizability of research findings
• Modifiers point to subdomains, which implies that generalizing results from a study should be different for populations with or without the (particular level of the) modifier
Modification-2
• In etiologic research, analysis of modifiers may help the investigator to understand the complexity of multicausality and causally explain why a particular disease may be more common in certain individuals despite an apparent similar exposure to determinant
Statistical Interaction
• Definition
– when the magnitude of a measure of association (between exposure and disease) meaningfully differs according to the value of some third variable
• Synonyms
– Effect modification
– Effect-measure modification
– Heterogeneity of effect
• Proper terminology
– e.g. Smoking, caffeine use, and delayed conception
• Caffeine use modifies the effect of smoking on the risk for delayed conception.
• There is interaction between caffeine use and smoking in the risk for delayed conception.
• Caffeine is an effect modifier in the relationship between smoking and delayed conception.
No Multiplicative Interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, one line for third variable present and one for third variable absent. Risks 0.05 to 0.15 (RR = 3.0) and 0.15 to 0.45 (RR = 3.0).]
Multiplicative Interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, one line for third variable present and one for third variable absent. Risks 0.05 to 0.15 (RR = 3.0) and 0.08 to 0.9 (RR = 11.2).]
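Qualitative Interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, one line for third variable present and one for third variable absent. Risks 0.18 to 0.13 (RR = 0.72) and 0.08 to 0.2 (RR = 2.5); the direction of the effect differs between the strata.]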
Interaction is likely everywhere
• Susceptibility to infectious diseases
– e.g.,
  • exposure: sexual activity
  • disease: HIV infection
  • effect modifier: chemokine receptor phenotype
• Susceptibility to non-infectious diseases
– e.g.,
  • exposure: smoking
  • disease: lung cancer
  • effect modifier: genetic susceptibility to smoke
• Susceptibility to drugs (efficacy and side effects)
  • effect modifier: genetic susceptibility to drug
• But in practice to date, difficult to document
– Genomics may change this
Additive vs Multiplicative Interaction
• Assessment of whether interaction is present depends upon the measure of association
– ratio measure (multiplicative interaction) or difference measure (additive interaction)
– Hence, the term effect-measure modification
• Absence of multiplicative interaction typically implies presence of additive interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, two strata. Risks 0.05 to 0.15 (RR = 3.0, RD = 0.1) and 0.15 to 0.45 (RR = 3.0, RD = 0.3). Additive interaction present; multiplicative interaction absent.]
Additive vs Multiplicative Interaction
• Absence of additive interaction typically implies presence of multiplicative interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, two strata. Risks 0.05 to 0.15 (RR = 3.0, RD = 0.1) and 0.15 to 0.25 (RR = 1.7, RD = 0.1). Multiplicative interaction present; additive interaction absent.]
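The dependence on the chosen scale can be checked with a short Python sketch that uses the risks shown in the two figures above. The function names and the numerical tolerance are illustrative assumptions; a real analysis would test for interaction statistically rather than compare point estimates.

```python
def rr_and_rd(risk_unexposed, risk_exposed):
    """Risk ratio and risk difference for one stratum."""
    return risk_exposed / risk_unexposed, risk_exposed - risk_unexposed


def interaction_by_scale(stratum_1, stratum_2, tol=1e-9):
    """Compare two strata, each given as (risk_unexposed, risk_exposed).

    Returns whether the risk ratios differ (multiplicative scale)
    and whether the risk differences differ (additive scale).
    """
    rr1, rd1 = rr_and_rd(*stratum_1)
    rr2, rd2 = rr_and_rd(*stratum_2)
    return {
        "multiplicative": abs(rr1 - rr2) > tol,
        "additive": abs(rd1 - rd2) > tol,
    }


# Risks read from the figures: (unexposed, exposed) per stratum
same_ratio = interaction_by_scale((0.05, 0.15), (0.15, 0.45))
same_difference = interaction_by_scale((0.05, 0.15), (0.15, 0.25))

print("RR 3.0 in both strata:", same_ratio)
print("RD 0.1 in both strata:", same_difference)
```

When the unexposed risks differ between strata, equal risk ratios force unequal risk differences and vice versa, which is why the slides describe the absence of interaction on one scale as typically implying its presence on the other.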
Additive vs Multiplicative Interaction
• Presence of multiplicative interaction may or may not be accompanied by additive interaction
[Figure, two panels: risk of disease (log scale) for unexposed vs exposed, two strata per panel.
Panel 1: risks 0.1 to 0.2 (RR = 2.0, RD = 0.1) and 0.2 to 0.6 (RR = 3.0, RD = 0.4). Additive interaction present.
Panel 2: risks 0.1 to 0.2 (RR = 2.0, RD = 0.1) and 0.05 to 0.15 (RR = 3.0, RD = 0.1). No additive interaction.]
Additive vs Multiplicative Interaction
• Presence of additive interaction may or may not be accompanied by multiplicative interaction
[Figure, two panels: risk of disease (log scale) for unexposed vs exposed, two strata per panel.
Panel 1: risks 0.1 to 0.2 (RR = 2.0, RD = 0.1) and 0.2 to 0.6 (RR = 3.0, RD = 0.4). Multiplicative interaction present.
Panel 2: risks 0.1 to 0.3 (RR = 3.0, RD = 0.2) and 0.05 to 0.15 (RR = 3.0, RD = 0.1). Multiplicative interaction absent.]
Additive vs Multiplicative Interaction
• Presence of qualitative multiplicative interaction is always accompanied by qualitative additive interaction
Qualitative Interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, one line for third variable present and one for third variable absent. Risks 0.18 to 0.13 and 0.08 to 0.2. Multiplicative and additive interaction both present.]
Additive vs Multiplicative Scales
• Additive measures (e.g., risk difference):
– readily translated into impact of an exposure (or intervention) in terms of number of outcomes prevented
  • e.g. 1/risk difference = no. needed to treat to prevent (or avert) one case of disease
– or no. of exposed persons one needs to take the exposure away from to avert one case of disease
– gives “public health impact” of the exposure
• Multiplicative measures (e.g., risk ratio)
– favored measure when looking for causal association (etiologic research)
Additive vs Multiplicative Scales
• Causally related but minor public health importance
            Disease   No Disease
Exposed        10       99990
Unexposed       5       99995
– Risk ratio = 2
– Risk difference = 0.0001 - 0.00005 = 0.00005
– Need to eliminate exposure in 20,000 persons to avert one case of disease
• Causally related and major public health importance
            Disease   No Disease
Exposed        20         80
Unexposed      10         90
– RR = 2
– RD = 0.2 - 0.1 = 0.1
– Need to eliminate exposure in 10 persons to avert one case of disease
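The two tables above can be reproduced with a short Python sketch. The function names are illustrative; the counts are the ones given in the slides.

```python
def risks(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk in the exposed and in the unexposed from a 2x2 table."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed, risk_unexposed


def summarise(table):
    """Risk ratio, risk difference, and 1/RD (persons per case averted)."""
    risk_exposed, risk_unexposed = risks(*table)
    risk_ratio = risk_exposed / risk_unexposed
    risk_difference = risk_exposed - risk_unexposed
    return risk_ratio, risk_difference, 1 / risk_difference


rare_disease = (10, 99990, 5, 99995)
common_disease = (20, 80, 10, 90)

for name, table in [("rare", rare_disease), ("common", common_disease)]:
    rr, rd, n = summarise(table)
    print(f"{name}: RR = {rr:.1f}, RD = {rd:.5f}, persons per case averted = {n:.0f}")
```

Both tables give the same risk ratio, so only the additive scale shows the large difference in public health impact.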
Smoking, Family History and Cancer: Additive vs Multiplicative Interaction
Crude
              Cancer   No Cancer
Smoking         50       150
No Smoking      25       175
Stratified
Family History Absent
              Cancer   No Cancer
Smoking         10        90
No Smoking       5        95
Risk ratio (no family history) = 2.0
RD (no family history) = 0.05
Family History Present
              Cancer   No Cancer
Smoking         40        60
No Smoking      20        80
Risk ratio (family history) = 2.0
RD (family history) = 0.20
• No multiplicative interaction but presence of additive interaction
• If etiology is goal, risk ratios may be sufficient
• If goal is to define sub-groups of persons to target:
Rather than ignoring, it is worth reporting that only 5 persons with a family history have to be prevented from smoking to avert one case of cancer
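As a worked check of the smoking and family history example, the sketch below recomputes the risk ratio, the risk difference and 1/RD for the crude and stratified tables. The counts are taken from the slides; the function and variable names are illustrative.

```python
def risk_ratio_and_difference(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Risk ratio and risk difference from a 2x2 table (exposure = smoking)."""
    risk_exposed = exp_cases / (exp_cases + exp_noncases)
    risk_unexposed = unexp_cases / (unexp_cases + unexp_noncases)
    return risk_exposed / risk_unexposed, risk_exposed - risk_unexposed


# (smokers with cancer, smokers without, non-smokers with cancer, non-smokers without)
tables = {
    "crude": (50, 150, 25, 175),
    "family history absent": (10, 90, 5, 95),
    "family history present": (40, 60, 20, 80),
}

results = {name: risk_ratio_and_difference(*counts) for name, counts in tables.items()}

for name, (rr, rd) in results.items():
    print(f"{name}: RR = {rr:.1f}, RD = {rd:.3f}, persons per case averted = {1 / rd:.0f}")
```

The risk ratio is 2.0 in every table, while the risk difference is four times larger when a family history is present, which is the additive interaction the slide points to.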
Confounding vs Interaction
• Confounding
– An extraneous or nuisance pathway that an investigator hopes to prevent or rule out
• Interaction
– A more detailed description of the relationship between the exposure and disease
– A richer description of the biologic or behavioral system under study
– A finding to be reported, not a bias to be eliminated