4 Threats to validity from confounding bias and effect modification

Apr 15, 2017

Page 1: 4 Threats to validity from confounding bias and effect modification

Threats to Validity from Confounding and Effect Modification

•  Overview: Random vs. systematic error
•  Confounding
•  Effect Modification
•  Logistic regression (time permitting)
•  Special thanks for some of the materials in this lecture:
–  Professor Jen Ahern (UCB)
–  Professor Madhu Pai (McGill, a former 250b GSI)

2014 Page 1

Page 2: 4 Threats to validity from confounding bias and effect modification

1

The cardinal rule of epidemiology

• Remember that all results based on epidemiology studies are likely to be …

2014 Page 2

Page 3: 4 Threats to validity from confounding bias and effect modification

The cardinal rule of epidemiology (continued)

• WRONG…
– unless proper care has been taken to eliminate all sources of error in the estimate (…and sometimes even then the results will be wrong because of unknown sources of error)

2014 Page 3

Page 4: 4 Threats to validity from confounding bias and effect modification

Example: Confounding

• A colleague with outside funding believes that cigarette smoke is not a “cause” (in any sense) of lung cancer but that exposure to matches (yes, matches) is the cause. This colleague has conducted a large case-control study to test the null hypothesis:

H0: “Matches are not associated with lung cancer”.

• What’s the rationale (in the Popperian sense) for stating the null hypothesis rather than the alternative:

HA: “Matches are associated with lung cancer”.

• What does the colleague hope to do (in terms of hypothesis testing)?

• What do you think of the term “associated”? Would it be better to write “a cause of”?

2014 Page 4

Page 5: 4 Threats to validity from confounding bias and effect modification

• “We can never finally prove our scientific theories, we can merely (provisionally) confirm or (conclusively) refute them.”
– Karl Popper

Sir Karl Raimund Popper CH FBA FRS (28 July 1902 – 17 September 1994) was an Austrian-British philosopher and professor at the London School of Economics. He is generally regarded as one of the greatest philosophers of science of the 20th century. Popper is known for his rejection of the classical inductivist views on the scientific method, in favour of empirical falsification. (wikipedia.com)

2014 Page 5

Page 6: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• Your colleague has located 1000 cases of lung cancer, of whom 820 carry matches.
• Among 1000 reference patients (selected randomly from a population with recently taken normal chest x-rays), 340 carry matches.
• Strengths of the reference selection process? Weaknesses?
• Describe the relationship between matches and lung cancer in your colleague’s data.
• Would you like to analyze the data in any other fashion?

2014 Page 6

Page 7: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

               Cancer   No cancer
Matches          820       340
No matches       180       660

• Odds ratio = (820 * 660) / (180 * 340)
• OR = 8.8
• 95% CI (7.2, 10.9)
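The arithmetic on this slide can be checked with a short script. This is a minimal sketch, assuming the confidence interval comes from the usual log-based (Woolf) approximation; the slides do not say which method was used, and the function name `odds_ratio_with_ci` is hypothetical.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    The interval uses the log-based (Woolf) standard error, which is an
    assumption here, not something stated in the slides.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Matches vs. lung cancer, counts taken from the table above.
or_matches, ci_low, ci_high = odds_ratio_with_ci(820, 340, 180, 660)
print(f"OR = {or_matches:.1f}, 95% CI ({ci_low:.1f}, {ci_high:.1f})")
```

With these counts the sketch agrees with the OR and interval quoted on the slide to one decimal place.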

2014 Page 7

Page 8: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• You decide to look at the relationship between matches and lung cancer in the smokers separately from the non-smokers.
• You find that among the 1000 cases, 900 are smokers and 810 (of the 900) carry matches
• Among the 1000 reference patients, 300 are smokers and 270 (of the 300) carry matches
• Calculate the relevant measure(s) of effect.
• What should your colleague do about future funding?

2014 Page 8

Page 9: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• ORpooled = 8.84 (7.2, 10.9)
• ORsmokers = 1.0 (0.6, 1.5)
• ORnonsmokers = 1.0 (0.5, 2.0)

Pooled         Cancer   No cancer
Matches          820       340
No matches       180       660

Smokers        Cancer   No cancer
Matches          810       270
No matches        90        30

Non-smokers    Cancer   No cancer
Matches           10        70
No matches        90       630
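The pooled and stratum-specific odds ratios above can be reproduced as follows. This is an illustrative sketch using only the counts from the slide; the names `tables` and `odds_ratio` are hypothetical. The last lines show one reason the pooled estimate is inflated: among reference patients, carrying matches is far more common in smokers.

```python
def odds_ratio(a, b, c, d):
    """OR = ad / bc, with a, b = exposed cases/controls and c, d = unexposed."""
    return (a * d) / (b * c)

# (exposed cases, exposed controls, unexposed cases, unexposed controls),
# where "exposed" means carrying matches. Counts are from the slide.
tables = {
    "pooled": (820, 340, 180, 660),
    "smokers": (810, 270, 90, 30),
    "non-smokers": (10, 70, 90, 630),
}

ors = {name: odds_ratio(*cells) for name, cells in tables.items()}
for name, value in ors.items():
    print(f"OR ({name}) = {value:.2f}")

# Proportion of reference patients (no cancer) who carry matches, by smoking.
matches_in_smoking_controls = 270 / (270 + 30)
matches_in_nonsmoking_controls = 70 / (70 + 630)
print(f"Carry matches: {matches_in_smoking_controls:.0%} of smoking controls, "
      f"{matches_in_nonsmoking_controls:.0%} of non-smoking controls")
```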

2014 Page 9

Page 10: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• To be complete, you also decide to examine the relationship between smoking and lung cancer.

• What tables should you construct to do this?

2014 Page 10

Page 11: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• ORpooled = 21.0 (16.3, 27.1)
• ORmatches = 21.0 (10.5, 46.2)
• ORno matches = 21.0 (12.9, 34.7)
• Discuss your intuitions about the 95% CIs

Pooled         Cancer   No cancer
Smoking          900       300
No smoking       100       700

Matches        Cancer   No cancer
Smoking          810       270
No smoking        10        70

No matches     Cancer   No cancer
Smoking           90        30
No smoking        90       630

2014 Page 11

Page 12: 4 Threats to validity from confounding bias and effect modification

Confounder?

[Diagram: Exposure and Disease, with a possible Confounder; unadjusted RR (?) vs. adjusted RR (?)]

2014 Page 12

Page 13: 4 Threats to validity from confounding bias and effect modification

Why is confounding so important in epidemiology?

● BMJ Editorial: “The scandal of poor epidemiological research” [16 October 2004]
● “Confounding, the situation in which an apparent effect of an exposure on risk is explained by its association with other factors, is probably the most important cause of spurious associations in observational epidemiology.”

BMJ 2004;329:868-869 (16 October)

2014 Page 13

Page 14: 4 Threats to validity from confounding bias and effect modification

Overview

● Causality is the central concern of epidemiology
● Confounding is the central concern with establishing causality
● Confounding can be understood using multiple different approaches
● A strong understanding of various approaches to confounding and its control is essential for all those who engage in health research

2014 Page 14

Page 15: 4 Threats to validity from confounding bias and effect modification

Confounding is one of the key biases in identifying causal effects

[Diagram: layers between the causal effect (RRcausal, “truth”) and the observed association (RRassociation)]
● Random error
● Confounding
● Information bias (misclassification)
● Selection bias
● Bias in inference
● Reporting & publication bias
● Bias in knowledge use

Adapted from: Maclure M, Schneeweiss S. Epidemiology 2001;12:114-122.

2014 Page 15

Page 16: 4 Threats to validity from confounding bias and effect modification

Confounding: 4 ways to understand it!

1. “Mixing of effects”
2. “Classical” approach based on a priori criteria
3. Collapsibility and data-based criteria
4. “Counterfactual” and non-comparability approaches

2014 Page 16

Page 17: 4 Threats to validity from confounding bias and effect modification

First approach: Confounding: mixing of effects

● “Confounding is confusion, or mixing, of effects; the effect of the exposure is mixed together with the effect of another variable, leading to bias” - Rothman, 2002

Latin: “confundere” is to mix together

Rothman KJ. Epidemiology. An introduction. Oxford: Oxford University Press, 2002

2014 Page 17

Page 18: 4 Threats to validity from confounding bias and effect modification

Example: Association between birth order and Down syndrome

Data from Stark and Mantel (1966). Source: Rothman 2002

2014 Page 18

Page 19: 4 Threats to validity from confounding bias and effect modification

Association between maternal age and Down syndrome

Data from Stark and Mantel (1966). Source: Rothman 2002

2014 Page 19

Page 20: 4 Threats to validity from confounding bias and effect modification

Association between maternal age and Down syndrome, stratified by birth order

Data from Stark and Mantel (1966). Source: Rothman 2002

2014 Page 20

Page 21: 4 Threats to validity from confounding bias and effect modification

Mixing of Effects: the water pipes analogy

[Diagram: Exposure, Confounder, Outcome]

Mixing of effects – cannot separate the effect of exposure from that of confounder

Exposure and disease share a common cause (‘parent’)

Adapted from: Jewell NP. Statistics for Epidemiology. Chapman & Hall, 2003

2014 Page 21

Page 22: 4 Threats to validity from confounding bias and effect modification

Mixing of Effects: “control” of the confounder

[Diagram: Exposure, Confounder, Outcome]

Successful “control” of confounding (adjustment)

If the common cause (‘parent’) is blocked, then the exposure – disease association becomes clearer

Adapted from: Jewell NP. Statistics for Epidemiology. Chapman & Hall, 2003

2014 Page 22

Page 23: 4 Threats to validity from confounding bias and effect modification

Second approach: “Classical” approach based on a priori criteria

“Bias of the estimated effect of an exposure on an outcome due to the presence of a common cause of the exposure and the outcome” – Porta 2008

● A factor is a confounder if 3 criteria are met:
● a) a confounder must be causally or noncausally associated with the exposure in the source population (study base) being studied;
● b) a confounder must be a causal risk factor (or a surrogate measure of a cause) for the disease in the unexposed cohort; and
● c) a confounder must not be an intermediate cause (in other words, a confounder must not be an intermediate step in the causal pathway between the exposure and the disease)

2014 Page 23

Page 24: 4 Threats to validity from confounding bias and effect modification

Confounding Schematic

[Diagram: Exposure (E), Disease (outcome) (D), Confounder (C)]

Szklo M, Nieto JF. Epidemiology: Beyond the basics. Aspen Publishers, Inc., 2000. Gordis L. Epidemiology. Philadelphia: WB Saunders, 4th Edition.

2014 Page 24

Page 25: 4 Threats to validity from confounding bias and effect modification

[Diagram: Exposure (E), Confounder (C), Disease (D); C shown as an intermediate cause]

2014 Page 25

Page 26: 4 Threats to validity from confounding bias and effect modification

[Diagram: Exposure (E), Confounder (C), Disease (D)]

General idea: a confounder could be a ‘parent’ of the exposure, but should not be a ‘daughter’ of the exposure

2014 Page 26

Page 27: 4 Threats to validity from confounding bias and effect modification

Example of schematic (from Gordis)

22

2014 Page 27

Page 28: 4 Threats to validity from confounding bias and effect modification

Confounding Schematic

[Diagram: Exposure (E) = Birth Order; Disease (D) = Down Syndrome; Confounding factor (C) = Maternal Age]

2014 Page 28

Page 29: 4 Threats to validity from confounding bias and effect modification

HRT use Heart disease

Association between HRT and heart disease

Confounding factor: SES

24

Are confounding criteria met?

2014 Page 29

Page 30: 4 Threats to validity from confounding bias and effect modification

BRCA1 gene Breast cancer

Confounding factor: Age

x

Are confounding criteria met? Should we adjust for age, when evaluating the association between a genetic factor and risk of breast cancer?

No!

2014 Page 30

Page 31: 4 Threats to validity from confounding bias and effect modification

Sex with multiple partners Cervical cancer

Confounding factor: HPV

Are confounding criteria met?

26

2014 Page 31

Page 32: 4 Threats to validity from confounding bias and effect modification

Sex with multiple partners

HPV Cervical cancer

27

What if this was the underlying causal mechanism?

2014 Page 32

Page 33: 4 Threats to validity from confounding bias and effect modification

Obesity Mortality

Are confounding criteria met?

Confounding factor: Hypertension

28

2014 Page 33

Page 34: 4 Threats to validity from confounding bias and effect modification

Obesity Hypertension Mortality

29

What if this was the underlying causal mechanism?

2014 Page 34

Page 35: 4 Threats to validity from confounding bias and effect modification

Direct vs indirect effects

[Diagram: Obesity → Hypertension → Mortality (indirect effect); Obesity → Mortality (direct effect)]

Direct effect is the portion of the total effect that does not act via an intermediate cause

2014 Page 35

Page 36: 4 Threats to validity from confounding bias and effect modification

Hernan MA, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 2002;155(2):176-84.

Simple causal graphs

[Diagram: C, E, D]

Maternal age (C) can confound the association between multivitamin use (E) and the risk of certain birth defects (D)

2014 Page 36

Page 37: 4 Threats to validity from confounding bias and effect modification

34

Complex causal graphs

Hernan MA, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 2002;155(2):176-84.

E DC

U

History of birth defects (C) may increase the chance of periconceptional vitamin intake (E). A genetic factor (U) could have been the cause of previous birth defects in the family, and could again cause birth defects in the current pregnancy

2014 Page 37

Page 38: 4 Threats to validity from confounding bias and effect modification

35

More complicated causal graphs!

[Diagram nodes: Smoking (A), Physical Activity (B), BMI (C), Bone fractures (D), Calcium supplementation (E), U]

Source: Hertz-Picciotto

2014 Page 38

Page 39: 4 Threats to validity from confounding bias and effect modification

The ultimate complex causal graph!

A PowerPoint diagram meant to portray the complexity of American strategy in Afghanistan!

2014 Page 39

Page 40: 4 Threats to validity from confounding bias and effect modification

Third approach: Collapsibility and data-based approaches

● According to this definition, a factor is a confounding variable if
● a) the effect measure is homogeneous across the strata defined by the confounder and
● b) the crude and common stratum-specific (adjusted) effect measures are unequal (this is called “lack of collapsibility”)
● Usually evaluated using 2x2 tables, and simple stratified analyses to compare crude effects with adjusted effects

“Collapsibility is equality of stratum-specific measures of effect with the crude (collapsed), unstratified measure” Porta, 2008, Dictionary

2014 Page 40

Page 41: 4 Threats to validity from confounding bias and effect modification

Crude vs. Adjusted Effects

● Crude: does not take into account the effect of the confounding variable
● Adjusted: accounts for the confounding variable(s) (what we get by pooling stratum-specific effect estimates)
● Generated using methods such as the Mantel-Haenszel estimator
● Also generated using multivariate analyses (e.g. logistic regression)
● Confounding is likely when:
● RRcrude =/= RRadjusted
● ORcrude =/= ORadjusted

2014 Page 41

Page 42: 4 Threats to validity from confounding bias and effect modification

Stratified Analysis

1. Crude 2 x 2 table: calculate the crude OR (or RR), ORCrude
2. Stratify by confounder (Stratum 1, Stratum 2)
3. Calculate ORs for each stratum (OR1, OR2)
4. If stratum-specific ORs are similar, calculate the adjusted OR (e.g. MH)
5. If Crude OR =/= Adjusted OR, confounding is likely. If Crude OR = Adjusted OR, confounding is unlikely

JC: introduce “test of homogeneity”

2014 Page 42

Page 43: 4 Threats to validity from confounding bias and effect modification

Examples: crude vs adjusted RR

Study   Crude RR   Stratum 1 RR   Stratum 2 RR   Adjusted RR   Confounding?
1         6.00         3.20           3.50          3.30
2         2.00         1.02           1.10          1.08
3         1.10         2.00           2.00          2.00
4         0.56         0.50           0.60          0.54
5         4.20         4.00           4.10          4.04
6         1.70         0.03           3.50
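One way to fill in the empty “Confounding?” column is to look at how far each crude RR sits from its adjusted RR. The sketch below only computes that percent difference from the table values; it deliberately applies no cut-off, since the slides treat “meaningful” differences as a matter of judgment. The helper name `percent_change` is hypothetical, and the treatment of study 6 (no adjusted RR, presumably because its stratum-specific RRs differ so widely) is an assumption.

```python
# (study, crude RR, adjusted RR) copied from the table above.
# Study 6 has no adjusted RR in the table, so it is stored as None.
studies = [
    (1, 6.00, 3.30),
    (2, 2.00, 1.08),
    (3, 1.10, 2.00),
    (4, 0.56, 0.54),
    (5, 4.20, 4.04),
    (6, 1.70, None),
]

def percent_change(crude, adjusted):
    """Percent difference of the crude RR relative to the adjusted RR."""
    return 100.0 * (crude - adjusted) / adjusted

changes = {}
for study, crude, adjusted in studies:
    if adjusted is None:
        print(f"Study {study}: no summary estimate to compare")
        continue
    changes[study] = percent_change(crude, adjusted)
    print(f"Study {study}: crude {crude:.2f} vs adjusted {adjusted:.2f} "
          f"({changes[study]:+.0f}%)")
```

Large differences in either direction (studies 1, 2, and 3) point toward confounding; small ones (studies 4 and 5) do not.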

2014 Page 43

Page 44: 4 Threats to validity from confounding bias and effect modification

Fourth approach: Causality: counterfactual model

● Ideal “causal contrast” between exposed and unexposed groups:
● “A causal contrast compares disease frequency under two exposure distributions, but in one target population during one etiologic time period”
● If the ideal causal contrast is met, the observed effect is the “causal effect”

Maldonado & Greenland, Int J Epi 2002;31:422-29

2014 Page 44

Page 45: 4 Threats to validity from confounding bias and effect modification

What happens actually?

IDEAL: $RR_{causal} = I_{exp} / I_{unexp}$

ACTUAL: $RR_{assoc} = I_{exp} / I_{substitute}$

2014 Page 45

Page 46: 4 Threats to validity from confounding bias and effect modification

Ideal counterfactual comparison to determine causal effects

[Diagram: Exposed cohort ($I_{exp}$) compared with the counterfactual, unexposed cohort ($I_{unexp}$)]

$RR_{causal} = I_{exp} / I_{unexp}$

“A causal contrast compares disease frequency under two exposure distributions, but in one target population during one etiologic time period”

“Initial conditions” are identical in the exposed and unexposed groups, because they are the same population!

Maldonado & Greenland, Int J Epi 2002;31:422-29

2014 Page 46

Page 47: 4 Threats to validity from confounding bias and effect modification

What happens actually?

[Diagram: Exposed cohort ($I_{exp}$); Counterfactual, unexposed cohort ($I_{unexp}$); Substitute, unexposed cohort ($I_{substitute}$)]

The counterfactual state is not observed

A substitute will usually be a population other than the target population during the etiologic time period - INITIAL CONDITIONS MAY BE DIFFERENT

2014 Page 47

Page 48: 4 Threats to validity from confounding bias and effect modification

Counterfactual definition of confounding

● “Confounding is present if the substitute population imperfectly represents what the target would have been like under the counterfactual condition”
● “An association measure is confounded (or biased due to confounding) for a causal contrast if it does not equal that causal contrast because of such an imperfect substitution”

$RR_{causal} \neq RR_{assoc}$

Maldonado & Greenland, Int J Epi 2002;31:422-29

2014 Page 48

Page 49: 4 Threats to validity from confounding bias and effect modification

Residual confounding

• Confounding can persist, even after adjustment
• Why?
– Not all confounders were adjusted for (unmeasured confounding)
– Some variables were actually not confounders!
– Confounders were measured with error (misclassification of confounders)
– Categories of the confounding variable are improperly defined (e.g. age categories were too broad)

2014 Page 49

Page 50: 4 Threats to validity from confounding bias and effect modification

Simulating the counter-factual comparison: Experimental Studies: RCT

[Diagram: Eligible patients → Randomization → Treatment → Outcomes; Randomization → Placebo → Outcomes]

Randomization helps to make the groups “comparable” (i.e. similar initial conditions) with respect to known and unknown confounders

Therefore confounding is unlikely at randomization - time t0

2014 Page 50

Page 51: 4 Threats to validity from confounding bias and effect modification

Confounding: Methods to control or reduce confounding

• Methods used in study design to reduce confounding
– Randomization
– Restriction
– Matching
• Methods used in study analysis to reduce confounding
– Stratified analysis
– Multivariate analysis

2014 Page 51

Page 52: 4 Threats to validity from confounding bias and effect modification

Confounding: The use of randomization to reduce confounding

• Randomization
– Useful only for intervention studies
– Definition: random assignment of study subjects to exposure categories
– The special strength of randomization is its ability to control/reduce the effect of confounding variables of which the investigator is unaware
– If there is maldistribution of potentially confounding variables after randomization (the reason for the classic “Table I: Baseline characteristics” in the randomized trial) then other confounding control options (see below) are applied

2014 Page 52

Page 53: 4 Threats to validity from confounding bias and effect modification

Substitute, unexposed cohort

Maldonado & Greenland, Int J Epi 2002;31:422-29

Counterfactual, unexposed cohort

Exposed cohort

“Confounding is present if the substitute population imperfectly represents what the target would have been like under the counterfactual condition”

2014 Page 53

Page 54: 4 Threats to validity from confounding bias and effect modification

Confounding: The use of restriction to reduce confounding

• Confounding cannot occur if the distribution of the potential confounding factors does not vary across exposure or disease categories
– Implication of this is that an investigator may restrict study subjects to only those falling within specific level(s) of a confounding variable
• Extreme example: an investigator only selects subjects of exactly the same age.
• Advantages of restriction
– straightforward, convenient, inexpensive

2014 Page 54

Page 55: 4 Threats to validity from confounding bias and effect modification

Confounding: The use of restriction to reduce confounding (cont.)

• Disadvantages
– May limit number of eligible subjects
– Residual confounding may persist if restriction categories not sufficiently narrow (e.g. “decade of age” might be too broad)
– Not possible to evaluate the relationship of interest at different levels of the confounder
• Question: How does restriction differ from matching?

2014 Page 55

Page 56: 4 Threats to validity from confounding bias and effect modification

Confounding: The use of matching to reduce confounding

• Subjects with all levels of a potential confounder are admitted into the study BUT the control/reference subjects (either with respect to exposure in a cohort or disease in a case-reference study) are chosen to have the same distribution of the potential confounder

• The use of matching (may) also require special analysis techniques (matched analyses and conditional logistic regression)

2014 Page 56

Page 57: 4 Threats to validity from confounding bias and effect modification

• Disadvantages of matching
– Finding appropriate control/reference subjects may be difficult and expensive and limit sample size
– Matching is most often used in case-reference (i.e. case-control) studies because in a large cohort study the cost of matching may be prohibitive
• Thus, in cohort studies it’s often cheaper to just enroll available controls and use analytic methods (below) to control confounding; this doesn’t apply to computerized “free” data

2014 Page 57

Page 58: 4 Threats to validity from confounding bias and effect modification

Confounding: The use of matching to reduce confounding (cont.)

• Disadvantages of matching (cont.)
– Confounding factor used to match subjects cannot be itself evaluated with respect to the outcome/disease
– Obviously, matching does not control for confounding by factors other than that used to match
– The use of matching makes the use of stratified analysis (for the control of other potential but non-matched factors) very difficult
• One way around this problem is the use of conditional logistic regression but there is a large reduction in “effective” sample size because only discordant pairs are used.

2014 Page 58

Page 59: 4 Threats to validity from confounding bias and effect modification

• Advantages of matching
– Matching may be the only way to obtain sufficient numbers of control/reference subjects with relevant levels of the confounding factor(s)
– Example: controlling for “neighborhood” (and all that it implies) by any approach other than matching is very difficult

2014 Page 59

Page 60: 4 Threats to validity from confounding bias and effect modification

• Advantages of matching (cont.)
– Useful in very small studies in which chance differences in confounding factors are likely to exist between the study groups and other forms of control for the confounders (such as stratification or multivariate adjustment) are not possible (because of the limited sample size)
– The full benefit of matching (in terms of the reduction of confounding) is obtained only if the proper form of matched analysis is used (to be reviewed later in the course)

2014 Page 60

Page 61: 4 Threats to validity from confounding bias and effect modification

• Basic goal of stratification is to evaluate the relationship between the predictor (“cause”) and outcome (“effect”) variable in strata homogeneous with respect to potentially confounding variables

2014 Page 61

Page 62: 4 Threats to validity from confounding bias and effect modification

Confounding: The use of stratification to reduce confounding

• For example, to examine the relationship between smoking and lung cancer while controlling for the potentially confounding effect of gender:
– Create a 2x2 table (smoking vs. lung cancer) for men and women separately
– To control for multiple confounders simultaneously, stratify by pairs (or triplets or higher) of confounding factors. For example, to control for gender and race/ethnicity determine the OR for smoking vs. lung cancer in multiple strata: white women, black women, Hispanic women, white men, black men, Hispanic men, etc.

2014 Page 62

Page 63: 4 Threats to validity from confounding bias and effect modification

• (From the earlier example): Goal: create a summary or “adjusted” estimate for the relationship between matches and lung cancer while adjusting for the two levels of smoking (the potential confounder)

• This process is analogous to the standardization of rates earlier in the course. In those examples the purpose of adjustment was to remove the confounding effect of age on the relationship between populations (A vs. B etc.) and rates of disease or death.

• In the present example the goal is to remove the confounding effect of smoking on the relationship between matches and lung cancer.

2014 Page 63

Page 64: 4 Threats to validity from confounding bias and effect modification

Confounding: Types of summary estimators to determine uniform effect over strata

• Mantel-Haenszel
– We will use this estimator in the present course
– Resistant to the effects of small strata or cells with a value of “0”
– Computationally a piece of cake
• Directly pooled estimators (e.g. Woolf)
– Sensitive to small strata and cells with value “0”
– Computationally messy but doable
• Maximum likelihood
– The most “appropriate” estimator
– Resistant to the effects of small strata or cells with a value of “0”
– Computationally challenging

2014 Page 64

Page 65: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• ORpooled = 8.84 (7.2, 10.9)
• ORsmokers = 1.0 (0.6, 1.5)
• ORnonsmokers = 1.0 (0.5, 2.0)

Pooled         Cancer   No cancer
Matches          820       340
No matches       180       660

Smokers        Cancer   No cancer
Matches          810       270
No matches        90        30

Non-smokers    Cancer   No cancer
Matches           10        70
No matches        90       630

2014 Page 65

Page 66: 4 Threats to validity from confounding bias and effect modification

An aside: Terminology

• Pooled = combined = collapsed = unadjusted
• Adjusted = summary = weighted, etc.
– All of these reflect some adjustment process such as Mantel-Haenszel or Woolf or maximum likelihood estimation to weight the strata and develop confidence intervals about the estimate.

2014 Page 66

Page 67: 4 Threats to validity from confounding bias and effect modification

Confounding: Notation used in Mantel-Haenszel estimators of relative risk

• Notation for case-control or cohort studies with count data

              Cases    Controls   Total
Exposed         a         b       a + b
Nonexposed      c         d       c + d
Total         a + c     b + d     a + b + c + d = T

Case-control: $RR = OR = \frac{ad}{bc}$

Cohort: $RR = \frac{I_e}{I_0} = \frac{a / (a + b)}{c / (c + d)}$

2014 Page 67

Page 68: 4 Threats to validity from confounding bias and effect modification

Confounding: Notation used in Mantel-Haenszel estimators of relative risk (cont.)

• Notation for cohort studies with person-time data

              Cases    Controls   Person-years
Exposed         a       ------        PY1
Nonexposed      c       ------        PY0
Total         a + c     ------         T

$RR = \frac{I_e}{I_0} = \frac{a / PY_1}{c / PY_0}$

2014 Page 68

Page 69: 4 Threats to validity from confounding bias and effect modification

Confounding: Mantel-Haenszel estimators of relative risk for stratified data

Case-Control Study:

$RR_{MH} = \frac{\sum_i (ad / T)_i}{\sum_i (bc / T)_i}$

Cohort Study with Count Denominators:

$RR_{MH} = \frac{\sum_i \{a(c + d) / T\}_i}{\sum_i \{c(a + b) / T\}_i}$

Cohort Study with Person-years Denominators:

$RR_{MH} = \frac{\sum_i \{a(PY_0) / T\}_i}{\sum_i \{c(PY_1) / T\}_i}$

2014 Page 69

Page 70: 4 Threats to validity from confounding bias and effect modification

Confounding: smoking, matches, and lung cancer

• ORpooled = 8.84 (7.2, 10.9)
• ORsmokers = 1.0 (0.6, 1.5)
• ORnonsmokers = 1.0 (0.5, 2.0)

Pooled         Cancer   No cancer
Matches          820       340
No matches       180       660

Smokers        Cancer   No cancer
Matches          810       270
No matches        90        30

Non-smokers    Cancer   No cancer
Matches           10        70
No matches        90       630

2014 Page 70

Page 71: 4 Threats to validity from confounding bias and effect modification

Confounding: Mantel-Haenszel estimators of relative risk for stratified data (smoking, matches, lung cancer)

$RR_{MH} = \sum_i (ad / T)_i / \sum_i (bc / T)_i$

Numerator of MH estimator:
• For smokers: (ad/T) = (810*30)/1200 = 20.25
• For nonsmokers: (ad/T) = (10*630)/800 = 7.88
• Add these together: 20.25 + 7.88 = 28.13 (numerator)

Denominator of MH estimator:
• For smokers: (bc/T) = (270*90)/1200 = 20.25
• For nonsmokers: (bc/T) = (70*90)/800 = 7.88
• Add these together: 20.25 + 7.88 = 28.13 (denominator)

• ORMH = 28.13 / 28.13 = 1.0 (as expected since both stratified ORs were = 1.0)

• Be sure to try this on stratified data in which the two strata are not exactly equal to each other (but also not so different as to suggest that effect modification is present)
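The worked example above can be written as a small function. This is a sketch of the case-control Mantel-Haenszel estimator from the slides, with each stratum given as (a, b, c, d) in the slide notation. The second data set, `hypothetical_strata`, is invented purely for illustration (it is not from the lecture) to show the case where the stratum-specific ORs are similar but not identical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary OR: sum(ad/T) / sum(bc/T) over strata.

    Each stratum is (a, b, c, d): exposed cases, exposed controls,
    unexposed cases, unexposed controls, with T = a + b + c + d.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Matches vs. lung cancer, stratified by smoking (counts from the slides).
matches_strata = [
    (810, 270, 90, 30),   # smokers, T = 1200
    (10, 70, 90, 630),    # non-smokers, T = 800
]
or_mh_matches = mantel_haenszel_or(matches_strata)
print(f"OR_MH (matches) = {or_mh_matches:.2f}")

# Hypothetical strata (NOT from the lecture): stratum ORs of 4.0 and 3.0.
hypothetical_strata = [
    (20, 10, 10, 20),
    (30, 20, 15, 30),
]
or_mh_hypothetical = mantel_haenszel_or(hypothetical_strata)
print(f"OR_MH (hypothetical) = {or_mh_hypothetical:.2f}")
```

In the hypothetical case the summary estimate falls between the two stratum-specific ORs, which is what a weighted average should do.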

2014 Page 71

Page 72: 4 Threats to validity from confounding bias and effect modification

Confounding: Interpretation of ORMH

• If ORMH (=1.0 in this example) “differs meaningfully” from ORunadjusted (=8.8 in this example) then confounding is present
• What does “differs meaningfully” mean?
– This is a matter of judgment based on biologic/clinical sense rather than on a statistical test
– Even if they “differ” only slightly, generally the ORMH rather than the ORcombined is reported as the summary effect estimate
• But what is one disadvantage of reporting ORMH?
– Although there do exist statistical tests of confounding they are not widely recommended (these tests evaluate H0: ORMH = ORunadjusted)

2014 Page 72

Page 73: 4 Threats to validity from confounding bias and effect modification

67

JC: test of homogeneity

2014 Page 73

Page 74: 4 Threats to validity from confounding bias and effect modification

Hennekens, 1987, p305

54

2014 Page 74

Page 75: 4 Threats to validity from confounding bias and effect modification

55

2014 Page 75

Page 76: 4 Threats to validity from confounding bias and effect modification

56

2014 Page 76

Page 77: 4 Threats to validity from confounding bias and effect modification

Review what the X^2 means in this context.

58

2014 Page 77

Page 78: 4 Threats to validity from confounding bias and effect modification

59

2014 Page 78

Page 79: 4 Threats to validity from confounding bias and effect modification

Direction of Confounding Bias

• Confounding “pulls” the observed association away from the true association
– It can either exaggerate/over-estimate the true association (positive confounding)
• Example
– RRcausal = 1.0
– RRobserved = 3.0
or
– It can hide/under-estimate the true association (negative confounding)
• Example
– RRcausal = 3.0
– RRobserved = 1.0

2014 Page 79

Page 80: 4 Threats to validity from confounding bias and effect modification

Confounding: Summary of steps to evaluate confounding

Table 12-10. Steps for the control of confounding and the evaluation of effect modification through stratified analysis
1. Stratify by levels of the potential confounding factor.
2. Compute stratum-specific unconfounded relative risk estimates.
3. Evaluate similarity of the stratum-specific estimates by either eyeballing or performing a test of statistical significance. (More on this step later.)
4. If the effect is thought to be uniform, calculate a pooled unconfounded summary estimate using RRMH. If the effect is not uniform (i.e., effect modification is present), skip to step 6.
5. Perform hypothesis testing on the unconfounded estimate, using the Mantel-Haenszel chi-square, and compute a confidence interval.
6. If the effect is not thought to be uniform (i.e., if effect modification is present):
   a. Report stratum-specific estimates, results of hypothesis testing, and confidence intervals for each estimate.
   b. If desired, calculate a summary unconfounded estimate using a standardized formula.
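
Steps 1, 2, and 4 of the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`odds_ratio`, `mantel_haenszel_or`) are hypothetical, the odds ratio stands in for the relative risk because the lecture's running example is a case-control study, and the counts are the smoking/lung cancer tables stratified by matches use from this lecture.

```python
# Each stratum is (a, b, c, d):
#   a = exposed cases,   b = exposed non-cases,
#   c = unexposed cases, d = unexposed non-cases.
strata = {
    "matches": (810, 270, 10, 70),
    "no matches": (90, 30, 90, 630),
}


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Step 2: stratum-specific estimates.
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Crude (collapsed) estimate: add the strata cell by cell.
crude_or = odds_ratio(*[sum(column) for column in zip(*strata.values())])

# Step 4: pooled estimate, appropriate only if the strata look uniform.
or_mh = mantel_haenszel_or(strata.values())
```

Comparing `crude_or` with `or_mh` addresses confounding, while comparing the entries of `stratum_ors` with each other addresses effect modification (step 3).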


Effect modification (Interaction)

• Goals of stratification of data
  – Evaluate and reduce/remove confounding
  – Evaluate and describe effect modification
• Description of effect modification
  – A change in the magnitude of an effect measure (between exposure and disease) according to the level of some third variable
  – What two “classes” of effect measures have we used so far in the course?

Effect modification: example #1

• Disease incidence by exposure and age
  – Does the relationship between exposure and disease change over the value of the potential confounder (age)? How?

Effect modification: example #2

• Disease incidence by exposure and age
  – Does the relationship between exposure and disease change over the value of the potential confounder (age)? How?

Rothman ’86 (p 178)

Effect modification: contrast with confounding

• Confounding
  – A bias that an investigator hopes to remove
  – A nuisance that may or may not be present in a given study design
• Properties of a confounding variable (Rothman, p123):
  – a) be a risk factor for disease among the non-exposed;
  – b) be associated with the exposure variable; and
  – c) not be an intermediate step in the “causal pathway”

Effect modification: contrast with confounding

• Effect modification
  – A more detailed description of the “true” relationship between the exposure and the outcome
  – Effect modification is a finding to be reported (even celebrated), not a bias to be eliminated
  – Effect modification is a “natural phenomenon” that exists independently of the study design
  – The presence and interpretation of effect modification depends upon the choice of effect measure (ratio vs. difference)

Some lingo

• Covariate
  – Confounder, potential confounder
  – Effect modification, interaction
  – Intermediate variable

Effect modification: contrast with confounding

• Note that for any association under study, a given factor may be:
  – Both a confounder and an effect modifier, or
  – A confounder but not an effect modifier, or
  – An effect modifier but not a confounder, or
  – Neither

Examples of confounding/effect modification

Level 1 | Level 2 | Crude/collapsed/combined (“unadjusted”) | Uniform estimate (ORMH) (“adjusted”) | Confounding present | Interaction present
4.0     | 4.0     | 4.0                                     | 4.0                                  | NO                  | NO
4.0     | 0.25    | 1.0                                     | 1.0                                  | NO                  | YES
1.0     | 1.0     | 8.4                                     | 1.0                                  | YES                 | NO
4.0     | 0.25    | 1.0                                     | 2.0                                  | YES (?relevance)    | YES

Effect modification: test of homogeneity

• Null hypothesis: The individual stratified estimates of the effect do not differ from some uniform estimate of effect (such as a Mantel-Haenszel estimator)
• Notation:
  – X²(N-1) is chi-square with (N-1) degrees of freedom;
  – N is the number of strata (N=2 in our smoking/matches example);
  – ln(^Ri) is the natural logarithm of the estimated (hence the “^”) effect measure for each stratum (ORi in our example);
  – ln(^R) is the natural logarithm of the uniform effect estimate (e.g. ORMH in our example, written R_MH in the formula; the computer will use the maximum likelihood estimate)
• One formula to test homogeneity:

$$\chi^2_{(N-1)} = \sum_{i=1}^{N} \frac{\left[\ln(\hat{R}_i) - \ln(\hat{R}_{MH})\right]^2}{\mathrm{Var}\left[\ln(\hat{R}_i)\right]}$$

JC: Comment on choice of significance level for test of homogeneity

Paradox

• If effect modification is present, a uniform estimator of effect (such as ORMH) cannot (or at least should not) be reported.

• However, in order to determine if effect modification is present, it is necessary to calculate the value of a uniform estimator of effect (such as ORMH) because it is needed in the calculation of the test of homogeneity.


Effect modification: test of homogeneity (or is it heterogeneity?)

• Comments
  – If the test of homogeneity is “significant” (= “reject homogeneity”), this is evidence that there is heterogeneity (i.e. no homogeneity) and that effect modification may be present.
    • (Null hypothesis: The individual stratified estimates of the effect do not differ from some uniform estimate of effect)
  – The choice of a significance level (e.g. p < 0.05) is somewhat open to interpretation.
    • One “conservative” approach, because of inherent limitations in the power of the test of homogeneity, is to treat the data as if interaction is present when p < 0.20.
    • In other words, one would rather err on the side of assuming that interaction is present (and reporting the stratified estimates of effect) than report a uniform estimate that may not be true across strata.
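
The test of homogeneity and the p < 0.20 rule of thumb can be sketched as follows. This is a minimal illustration, not the course's software: the function names are hypothetical, the variance of ln(OR) is taken to be the usual large-sample (Woolf) approximation 1/a + 1/b + 1/c + 1/d, which the slides do not specify, and the p-value shortcut holds only for one degree of freedom (two strata).

```python
import math


def log_or_and_variance(a, b, c, d):
    """ln(OR) and its large-sample (Woolf) variance for one 2x2 table."""
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def homogeneity_chi_square(tables, uniform_or):
    """Sum over strata of [ln(OR_i) - ln(uniform OR)]^2 / Var[ln(OR_i)]."""
    total = 0.0
    for a, b, c, d in tables:
        ln_or, variance = log_or_and_variance(a, b, c, d)
        total += (ln_or - math.log(uniform_or)) ** 2 / variance
    return total


def p_value_one_df(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Smoking vs. lung cancer, stratified by matches: (a, b, c, d) per stratum.
tables = [(810, 270, 10, 70), (90, 30, 90, 630)]
x2 = homogeneity_chi_square(tables, uniform_or=21.0)
p = p_value_one_df(x2)

# "Conservative" rule from the slide: treat interaction as present if p < 0.20.
report_stratified = p < 0.20
```

Because both stratum-specific odds ratios equal the uniform estimate in this example, the statistic is zero and the rule points toward reporting the uniform estimate.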

Additive versus multiplicative scale effect modification

● Notation: RXZ
● No additive interaction if (R11 – R01) = (R10 – R00)
  ○ Rewrite as: (R11-R01)-(R10-R00)=0
● In words: Difference in risk for (X=1 vs. X=0) when Z=1 is equal to difference in risk for (X=1 vs. X=0) when Z=0
● Note: the values R11, R10, etc. are risks (not counts)

Additive versus multiplicative scale effect modification

● Notation: RXZ
● No multiplicative interaction if (R11/R01)=(R10/R00)
  ○ Rewrite as: (R11/R01)/(R10/R00)=1
● In words: Ratio of risks/rates when X=1 vs. X=0 when Z=1 is equal to ratio of risks/rates when X=1 vs. X=0 when Z=0

Effect modification is scale-dependent

• Evidence for effect modification/statistical interaction if the RR or the AR differs between two groups
• However, effect modification/statistical interaction is scale-dependent
  – If you do not have interaction on the additive scale (AR is homogeneous) then you will have interaction on the multiplicative scale (RR must be heterogeneous), provided both factors have some effect on risk
  – If you do not have interaction on the multiplicative scale (RR is homogeneous) then you will have interaction on the additive scale (AR must be heterogeneous), under the same proviso
  – Note: It is common to have evidence of interaction on both scales.

Example

● No additive scale interaction if (R11-R01)-(R10-R00)=0
● No relative scale interaction if (R11/R01)/(R10/R00)=1

    | Z=1 | Z=0
X=1 | 60  | 50
X=0 | 20  | 10

● Additive scale: (60-20) - (50-10) = 0
  ○ Interaction not present on the additive scale
● Relative scale: (60/20) / (50/10) = 0.6
  ○ Interaction present on the relative scale

Example

● No additive scale interaction if (R11-R01)-(R10-R00)=0
● No relative scale interaction if (R11/R01)/(R10/R00)=1

    | Z=1 | Z=0
X=1 | 60  | 30
X=0 | 20  | 10

● Additive scale: (60-20) - (30-10) = 20
  ○ Interaction present on the additive scale
● Relative scale: (60/20) / (30/10) = 1
  ○ Interaction not present on the relative scale
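
Both examples can be checked with one small function. This is a sketch only: `interaction_contrasts` is a hypothetical name, and its arguments follow the RXZ notation of the slides (first index X, second index Z).

```python
def interaction_contrasts(r11, r10, r01, r00):
    """Return (additive, multiplicative) interaction contrasts.

    Rxz is the risk when X=x and Z=z.
    No additive interaction       <=> additive contrast == 0.
    No multiplicative interaction <=> multiplicative contrast == 1.
    """
    additive = (r11 - r01) - (r10 - r00)
    multiplicative = (r11 / r01) / (r10 / r00)
    return additive, multiplicative


# First example table: X=1 row is (60, 50), X=0 row is (20, 10).
example_1 = interaction_contrasts(r11=60, r10=50, r01=20, r00=10)

# Second example table: X=1 row is (60, 30), X=0 row is (20, 10).
example_2 = interaction_contrasts(r11=60, r10=30, r01=20, r00=10)
```

The first table is homogeneous on the additive scale only and the second on the multiplicative scale only, which is the scale dependence described above.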

Logistic Regression (time permitting)

Confounding: smoking, matches, and lung cancer

Pooled     | Cancer | No cancer
Smoking    | 900    | 300
No Smoking | 100    | 700

Matches    | Cancer | No cancer
Smoking    | 810    | 270
No Smoking | 10     | 70

No matches | Cancer | No cancer
Smoking    | 90     | 30
No Smoking | 90     | 630

• ORpooled = 21.0 (16.3, 27.1)
• ORmatches = 21.0 (10.5, 46.2)
• ORno matches = 21.0 (12.9, 34.7)
• Discuss your intuitions about the 95% CIs
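
One way to build intuition about the interval widths is to compute approximate intervals directly from the cell counts. The sketch below assumes the large-sample (Woolf) interval on the log-odds-ratio scale; the slides do not say which method produced their intervals, so these limits are close to, but not identical with, the ones shown. `woolf_ci` is a hypothetical helper name.

```python
import math


def woolf_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI based on SE[ln(OR)]."""
    or_hat = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_hat, or_hat * math.exp(-z * se), or_hat * math.exp(z * se)


# (a, b, c, d) = (smoking & cancer, smoking & no cancer,
#                 no smoking & cancer, no smoking & no cancer)
tables = {
    "pooled": (900, 300, 100, 700),
    "matches": (810, 270, 10, 70),
    "no matches": (90, 30, 90, 630),
}

results = {name: woolf_ci(*cells) for name, cells in tables.items()}
widths = {name: upper - lower for name, (_, lower, upper) in results.items()}
```

The standard error is dominated by the smallest cell, so the matches stratum (with only 10 non-smoking cases) gets the widest interval even though it has more subjects than the no-matches stratum.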

A brief introduction to logistic regression

Let X1 = smoking (1=yes; 0=no)
Let X2 = matches (1=yes; 0=no)
Let Cancer = cancer (1=yes; 0=no)

Recall earlier tables:

Collapsed | Cancer=1 | Cancer=0
X1=1      | 900      | 300
X1=0      | 100      | 700
OR=21.0

X2=1 | Cancer=1 | Cancer=0
X1=1 | 810      | 270
X1=0 | 10       | 70
OR=21.0

X2=0 | Cancer=1 | Cancer=0
X1=1 | 90       | 30
X1=0 | 90       | 630
OR=21.0

Conclusions: No confounding by matches of the relationship between smoking and lung cancer; no effect modification by matches of the relationship between smoking and lung cancer

Data structure for computer analysis

• Most computer programs would want to see the data for the individual subjects in the study in the following form:

Subject ID | X1 | X2 | Cancer | How many?
A          | 1  | 1  | 1      |
B          | 1  | 1  | 0      |
C          | 0  | 1  | 1      |
D          | 0  | 1  | 0      |
E          | 1  | 0  | 1      |
F          | 1  | 0  | 0      |
G          | 0  | 0  | 1      |
H          | 0  | 0  | 0      |

Data structure for computer analysis

• Most computer programs would want to see the data for the individual subjects in the study in the following form:

Subject ID | X1 | X2 | Cancer | How many?
A          | 1  | 1  | 1      | 810 of these
B          | 1  | 1  | 0      | 270 of these
C          | 0  | 1  | 1      | 10 of these
D          | 0  | 1  | 0      | 70 of these
E          | 1  | 0  | 1      | 90 of these
F          | 1  | 0  | 0      | 30 of these
G          | 0  | 0  | 1      | 90 of these
H          | 0  | 0  | 0      | 630 of these
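
The move from eight covariate patterns to one row per subject can be sketched as below. The names `patterns` and `records` and the dictionary keys are hypothetical; any row-per-subject layout would serve.

```python
# (X1 smoking, X2 matches, cancer, number of subjects with that pattern)
patterns = [
    (1, 1, 1, 810),
    (1, 1, 0, 270),
    (0, 1, 1, 10),
    (0, 1, 0, 70),
    (1, 0, 1, 90),
    (1, 0, 0, 30),
    (0, 0, 1, 90),
    (0, 0, 0, 630),
]

# One dictionary per subject: repeat each pattern "count" times.
records = [
    {"x1": x1, "x2": x2, "cancer": cancer}
    for x1, x2, cancer, count in patterns
    for _ in range(count)
]

n_subjects = len(records)
n_cases = sum(r["cancer"] for r in records)
```

The expanded list has one entry for each of the 2,000 subjects, half of whom are cases.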

The basic logistic equation for this problem

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• ln (odds of disease) = a + b1(smoking) + b2(matches) + b3(smoking)(matches)

Solving a logistic equation

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 0 and X2 = 0, solve for “a”
• ln (odds) = a = ln ( ) =
• a =
• So now: ln (odds) =


Solving a logistic equation

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 0 and X2 = 0, solve for “a”
• ln (odds) = a = ln (90/630) = -1.946
• a = -1.946
• So now: ln (odds) = -1.946 + b1X1 + b2X2 + b3X1X2

Solving a logistic equation (cont.)

• When X1 = 1 and X2 = 0, solve for b1
• ln (odds) =
• b1 =
• So now: ln (odds) =


Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 1 and X2 = 0, solve for b1
• ln (odds) = ln (90/30) = 1.099 = -1.946 + b1
• b1 = 3.045
• So now: ln (odds) = -1.946 + 3.045X1 + b2X2 + b3X1X2

Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 0 and X2 = 1, solve for b2:
• ln (odds) = ln ( ) =
• b2 =
• So now: ln (odds) =


Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 0 and X2 = 1, solve for b2:
• ln (odds) = ln (10/70) = -1.946 = -1.946 + 0 + b2 + 0
• b2 = 0
• So now: ln (odds) = -1.946 + 3.045X1 + 0 + b3X1X2

Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 1 and X2 = 1 then:
• ln (odds) =
• ln (odds) =
• Solve for b3
• ln (odds) =
• b3 =
• So now: ln (odds) =


Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• When X1 = 1 and X2 = 1 then:
• ln (odds) = -1.946 + b1 + b2 + b3
• ln (odds) = -1.946 + 3.045 + 0 + b3
• Solve for b3
• ln (odds) = ln (810/270) = 1.099 = -1.946 + 3.045 + b3
• b3 = 0
• So now: ln (odds) = -1.946 + 3.045X1 + 0 + 0
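
The hand calculation of a, b1, b2, and b3 can be reproduced directly from the four cells of the stratified tables. This is a sketch of the arithmetic only, not a maximum likelihood fit; with four covariate patterns and four coefficients the model is saturated, so the two coincide here. The `cells` dictionary and `ln_odds` helper are hypothetical names.

```python
import math

# cells[(x1, x2)] = (number with cancer, number without cancer)
cells = {
    (0, 0): (90, 630),
    (1, 0): (90, 30),
    (0, 1): (10, 70),
    (1, 1): (810, 270),
}


def ln_odds(x1, x2):
    """Observed log odds of cancer for one covariate pattern."""
    cases, non_cases = cells[(x1, x2)]
    return math.log(cases / non_cases)


a = ln_odds(0, 0)                   # intercept: non-smoker, no matches
b1 = ln_odds(1, 0) - a              # smoking effect when matches = 0
b2 = ln_odds(0, 1) - a              # matches effect when smoking = 0
b3 = ln_odds(1, 1) - a - b1 - b2    # interaction term

or_smoking = math.exp(b1)
```

Rounded to three decimals, these reproduce the slide values, and exponentiating b1 recovers the odds ratio of 21.0.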

Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• This simplifies (earlier calculations) to: ln (odds) = -1.946 + 3.045X1 + 0 + 0
• One can now use the logistic equation to efficiently describe relationships in the table
• Calculate the ln(odds) for a smoker who uses matches: ln (odds) =
• Calculate the ln(odds) for a smoker who doesn’t use matches: ln (odds) =
• Now calculate the odds ratio for (smokers vs. non-smokers // matches+)
• At home, calculate the odds ratio for (smokers vs. non-smokers // matches-)

Solving a logistic equation (cont.)

• ln (odds of disease) = a + b1X1 + b2X2 + b3X1X2
• This simplifies (earlier calculations) to: ln (odds) = -1.946 + 3.045(X1) + 0(X2) + 0(X1X2)
• One can now use the logistic equation to efficiently describe relationships in the table
• Calculate the ln(odds) for a smoker who uses matches (X1 = 1 and X2 = 1): ln (odds) = -1.946 + 3.045 = 1.099
• Calculate the ln(odds) for a smoker who doesn’t use matches (X1 = 1 and X2 = 0): ln (odds) = -1.946 + 3.045 = 1.099
• Now calculate the odds ratio for (smokers vs. non-smokers // matches)
• At home, calculate the odds ratio for (smokers vs. non-smokers // no matches)

Logistic Regression

Using the logistic model developed in class for the matches-smoking-lung cancer data (stratified by matches), evaluate the risk of lung cancer for:

1. (in-class) A smoker who uses matches vs. a non-smoker who uses matches.
2. (at home) A smoker who uses matches vs. a non-smoker who does not use matches.

SEPARATE ASSIGNMENT

Develop a logistic model for the matches-smoking-lung cancer data (stratified by smoking status). Use this model to evaluate the risk of lung cancer for:

1. (at home) A user of matches who smokes vs. a non-user of matches who smokes.
2. (at home) A smoker who uses matches vs. a non-smoker who uses matches. Is this result consistent with the one you arrived at in the in-class example above?

Find OR for smokers (who use matches) vs. non-smokers (who use matches)

For smokers who use matches: X1 = 1, X2 = 1
For non-smokers who use matches: X1 = 0, X2 = 1
From prior slides we determined that: ln (odds) = -1.946 + 3.045 (X1)

For smokers who use matches (X1 = 1; X2 = 1): ln (odds) = -1.946 + 3.045 (1) = 1.099
For non-smokers who use matches (X1 = 0; X2 = 1): ln (odds) = -1.946 + 0 + 0 + 0 = -1.946

We want to solve: ln OR = 1.099 – (-1.946) = 3.045
e^(ln OR) = OR = e^3.045 = 21.0

Therefore, the odds ratio (determined using logistic regression) comparing smokers using matches to non-smokers using matches is 21.0. This agrees with the stratified data presented earlier.
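
The same comparison can be written as a small function of the fitted equation, which makes it easy to repeat for other pairs of covariate patterns. This is a sketch under the assumption that the rounded slide coefficients are used as given; `ln_odds` and `odds_ratio` are hypothetical helper names.

```python
import math

# Coefficients as reported on the slides (rounded to three decimals).
A, B1, B2, B3 = -1.946, 3.045, 0.0, 0.0


def ln_odds(x1, x2):
    """ln(odds of disease) = a + b1*X1 + b2*X2 + b3*X1*X2."""
    return A + B1 * x1 + B2 * x2 + B3 * x1 * x2


def odds_ratio(group, reference):
    """Odds ratio comparing two (x1, x2) covariate patterns."""
    return math.exp(ln_odds(*group) - ln_odds(*reference))


# In-class comparison: smoker using matches vs. non-smoker using matches.
or_in_class = odds_ratio(group=(1, 1), reference=(0, 1))
```

The at-home comparisons follow by changing the `group` and `reference` arguments.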

Confounding: smoking, matches, and lung cancer

Pooled     | Cancer | No cancer
Smoking    | 900    | 300
No Smoking | 100    | 700

Matches    | Cancer | No cancer
Smoking    | 810    | 270
No Smoking | 10     | 70

No matches | Cancer | No cancer
Smoking    | 90     | 30
No Smoking | 90     | 630

• ORpooled = 21.0 (16.3, 27.1)
• ORmatches = 21.0 (10.5, 46.2)
• ORno matches = 21.0 (12.9, 34.7)
• Discuss your intuitions about the 95% CIs

Some concluding comments on logistic regression

• Interpretations of the final logistic equation for these data:
  ln (odds of disease) = a + b1(smoking) + b2(matches) + b3(smoking)(matches)
  ln (odds) = -1.946 + 3.045(smoking) + 0(matches) + 0(matches)(smoking)
• This equation describes the data whether stratified by matches or by smoking.
• Multiple variables may be adjusted for simultaneously by the logistic equation.
• The estimates of the coefficients for the equation are derived through maximum likelihood techniques.
• This technique is very widely used in epidemiologic (and other) applications when the outcome variable of interest is dichotomous.

Some concluding comments on logistic regression

• Comments
  – Having multiple strata (how this technique makes this possible)
  – Test of homogeneity (b3)
  – Maximum likelihood estimation for coefficient estimation
• Modifications of logistic regression exist for coping with
  – Outcome variables with multiple levels = polytomous logistic regression
  – Studies in which matching was used = conditional logistic regression

. use http://www.stata-press.com/data/r8/lbw

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              int    %8.0g                  identification code
low             byte   %8.0g                  birth weight<2500g
age             byte   %8.0g                  age of mother
lwt             int    %8.0g                  weight at last menstrual period
race            byte   %8.0g                  race
smoke           byte   %8.0g                  smoked during pregnancy
ptl             byte   %8.0g                  premature labor history (count)
ht              byte   %8.0g                  has history of hypertension
ui              byte   %8.0g                  presence, uterine irritability
ftv             byte   %8.0g                  number of visits to physician during 1st trimester
bwt             int    %8.0g                  birth weight (grams)

Special (and very useful) STATA command “xi” (= “interaction expansion”)

• xi: logistic low age lwt i.race smoke ptl ht ui
• In this example, a variable named “race” has three levels (e.g. white/hispanic/black) that might be coded as “0=white”; “1=hispanic”; “2=black”
• The combined use of xi and i.race directs STATA to analyze all levels of race (and compare them to the first, lowest-coded level); this can be a HUGE time-saver (avoids the user having to manually recode such variables)!
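
The manual recoding that xi and i.race spare the user amounts to building one 0/1 indicator per non-reference level. The Python sketch below only illustrates that idea; it is not how Stata implements xi, and the function name, the five sample codes, and the choice of 0 as the reference level are all assumptions made for the example.

```python
def indicator_columns(values, reference):
    """Expand a categorical variable into 0/1 indicators.

    One indicator list is produced for every level except the reference
    level, which becomes the comparison group.
    """
    levels = sorted(set(values))
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
        if level != reference
    }


# Hypothetical codes for five subjects: 0=white, 1=hispanic, 2=black.
race = [0, 1, 2, 1, 0]
dummies = indicator_columns(race, reference=0)
```

Each resulting indicator would enter the logistic model with its own coefficient, interpreted relative to the omitted reference level.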

Assignments

• Write the logistic model describing these data (next slide).
• What is the risk of low birth weight (LBW) for a smoker, adjusted for all other variables?
• How can the 95% CI be determined?
• What is the risk of LBW for a Hispanic baby (compared to a white baby)?
• What is the risk of LBW for a black baby (compared to a Hispanic baby)?

Discuss intercept