1 Precision and Validity: Selection Bias Dr. Jørn Olsen Epi 200B January 26 and 28, 2010

Jan 13, 2016


April Bennett
Transcript
Page 1

Precision and Validity: Selection Bias

Dr. Jørn Olsen
Epi 200B

January 26 and 28, 2010

Page 2

Bias and confounding (Last, Dictionary)

Bias: Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth.

Page 3

Bias and confounding (Last, Dictionary)

Confounding: A situation in which the effects of two processes are not separated.

Confounder, confounding factor, confounding variable: poor terms, because confounding is study specific. No variable is always a confounder.

Page 4

Bias and confounding (Last, Dictionary)

Selection bias: caused by the way subjects are selected into the study or because there are selective losses of subjects prior to data analyses.

In a cohort study the first type of selection bias can often be described as selection leading to more or less confounding.

Page 5

Selection Bias

Selection as a design problem: healthy worker selection, Berkson bias

Most problematic: non-responders in case-control studies, loss to follow-up

Page 6

Survey

                 N      %     % of responders   % if all non-responders smoke   % if no non-responder smokes
Non-responders   400    40    -                 -                               -
Smokers          200    20    33.3              60                              20
Non-smokers      400    40    66.6              40                              80
All              1000   100   100               100                             100
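
The last two percentage columns are consistent with assigning every non-responder to one category in turn. A minimal Python sketch of that bounding arithmetic, using the counts from the survey table (the function name `prevalence_bounds` is a hypothetical helper, not part of the lecture):

```python
def prevalence_bounds(non_responders, smokers, non_smokers):
    """Return (observed, lower, upper) smoking prevalence in percent."""
    total = non_responders + smokers + non_smokers
    responders = smokers + non_smokers
    observed = 100 * smokers / responders              # among responders only
    lower = 100 * smokers / total                      # no non-responder smokes
    upper = 100 * (smokers + non_responders) / total   # every non-responder smokes
    return observed, lower, upper


observed, lower, upper = prevalence_bounds(400, 200, 400)
print(f"observed {observed:.1f}%, bounds {lower:.0f}% to {upper:.0f}%")
```

With 40% non-response the survey alone cannot narrow the smoking prevalence beyond this interval.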

Page 7

Follow-up study - Complete Cohort

E    N      D
+    1000   200
-    1000   100

RR = 2.0, RD = 0.10

Page 8

50% refuse to take part in the study

E    N     D
+    500   100
-    500   50

RR = 2.0, RD = 0.10

[Diagram: E, D, S]

Page 9

E    N      D
+    1000   200
-    500    50

PR = 2.0, RD = 0.10

[Diagrams: E, D, S] is unlikely at baseline since they do not know D, but [Diagram: E, D, S, C] is more likely. C could be SES.
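
The three cohort tables above share one point: when participation does not depend on the (still unknown) disease outcome, the relative and absolute measures are unchanged. A small Python sketch using the slide counts; the scenario labels are paraphrases and the helper name is hypothetical:

```python
def risk_ratio_and_difference(cases_exp, n_exp, cases_unexp, n_unexp):
    """Return (RR, RD) from cases and denominators in each exposure group."""
    risk_exp = cases_exp / n_exp
    risk_unexp = cases_unexp / n_unexp
    return risk_exp / risk_unexp, risk_exp - risk_unexp


# (exposed cases, exposed N, unexposed cases, unexposed N) as on the slides
scenarios = {
    "complete cohort": (200, 1000, 100, 1000),
    "50% refuse in both groups": (100, 500, 50, 500),
    "1000 exposed, 500 unexposed take part": (200, 1000, 50, 500),
}
results = {name: risk_ratio_and_difference(*counts)
           for name, counts in scenarios.items()}

for name, (rr, rd) in results.items():
    print(f"{name}: ratio = {rr:.1f}, RD = {rd:.2f}")
```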

Page 10

Most likely: [Diagram: S, E, D, C]

In cohort studies, selection may cause confounding, or perhaps more likely reduce confounding. Poor health and poor social conditions may correlate with selection.

Conditioning on S would open an E-C path and induce confounding that was not present before.

Page 11

Large cohorts seldom recruit more than 50%.

DNBC: about 30%. Half of the GPs participated, and 60% of those invited accepted the invitation.

Selection bias? Yes, if used as a survey. But when making internal comparisons?

Page 12

Table 2. RORs Based on Adjusted* ORs in the Source Population and Among Participants

Ref: Nohr et al. Epidemiology 2006;17:413-8

Page 13

Internal comparison, counterfactual guidelines

RR = 2 for this cohort

External validity, generalization:
For the source population?
For all in the future?
For other ethnic groups, etc.

Page 14

Selection bias in a cohort study is mainly related to loss to follow-up.

Reason to expect selection bias? Will “intention to treat” solve the problem? Not when estimating effect size, but it may be OK when testing H0.

RCT: a painkiller, randomization

Drug, N = 100: 40 lost to follow-up
Placebo, N = 100: 5 lost to follow-up

Page 15

Follow-up study

E    D     not D   All
+    150   9850    10,000
-    50    9950    10,000

RR = 3.0

Page 16

Now 20% loss to follow-up among exposed and 10% among not exposed

E    D     not D   All
+    120   7880    8,000
-    45    8955    9,000

RR = 3.0

Page 17

Suppose we got:

E    D     not D   All
+    140   7860    8,000
-    40    8960    9,000

RR = 3.9

How could this happen?

When is it likely?
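
The contrast between the three follow-up tables can be checked directly. The sketch below, in Python with a hypothetical helper name, recomputes the risk ratio for the full cohort, for loss that depends only on exposure, and for loss that also depends on disease status within each exposure group:

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk among the exposed divided by risk among the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Full cohort: 10,000 in each exposure group
rr_full = risk_ratio(150, 10_000, 50, 10_000)

# 20% of the exposed and 10% of the unexposed lost, unrelated to disease
rr_loss_by_exposure = risk_ratio(120, 8_000, 45, 9_000)

# Same totals remain, but the losses differ by disease status
rr_loss_by_exposure_and_disease = risk_ratio(140, 8_000, 40, 9_000)

print(round(rr_full, 2), round(rr_loss_by_exposure, 2),
      round(rr_loss_by_exposure_and_disease, 2))
```

In the last table 140 of 150 exposed cases remain but only 40 of 50 unexposed cases, so retention is tied to both exposure and outcome; that is the pattern that moves the estimate.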

Page 18

Source population

E    D    not D   Total
+    A    B       N1
-    C    D       N0

Study population

E    D    not D   All
+    a    b       n1
-    c    d       n0

Selection bias if (A/N1) / (C/N0) ≠ (a/n1) / (c/n0)

Page 19

Does condom use protect against STDs?

What is the source population for such a study?

Page 20

A case-control study samples cases from an STD clinic and controls from the catchment area of the clinic. Any problems with that?

Results could be like this:

Males with infected partners

Condom use   cases   controls
Yes          100     200
No           600     600

OR = 0.5

No requirement for infected partners

Condom use   cases   controls
Yes          100     100
No           600     600

OR = 1.0

Page 21

E     D     not D
+     20    10
-     80    90
All   100   100

OR = (20/80) / (10/90) = 2.25

E     D    not D
+     20   5
-     40   45
All   60   50

OR = (20/40) / (5/45) = 4.50

[Diagram: E, D, S]

Selection bias is often a problem in a case-control study

Page 22

Response rates

E    D      not D
+    100%   50%
-    50%    50%

Page 23

Response rates

OR(responders) = OR(true) x OR(response rates)

4.50 = 2.25 x (100/50) / (50/50)

When would we expect this pattern?

When would we expect the opposite?
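
The multiplicative relation between the true odds ratio and the odds ratio of the response rates can be verified with the counts from the preceding tables. This is a minimal Python sketch; the cell labels a, b, c, d and the helper name are assumptions introduced for the illustration:

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    return (a * d) / (b * c)


source = {"a": 20, "b": 10, "c": 80, "d": 90}        # everyone approached
response = {"a": 1.0, "b": 0.5, "c": 0.5, "d": 0.5}  # response rate per cell
responders = {cell: source[cell] * response[cell] for cell in source}

or_true = odds_ratio(**source)
or_response = odds_ratio(**response)
or_responders = odds_ratio(**responders)

print(or_true, or_response, or_responders)
```

The bias factor equals 1 whenever the four response rates have a cross-product ratio of 1, which is a weaker requirement than equal response rates in every cell.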

Page 24

Selections of relevance for designs

Berkson’s bias

Diseases may be correlated in hospital patients but not in the population

Population: 100,000
30% asthma: 30,000
10% bronchitis: 10,000
0.3 x 0.1 = 0.03: 3,000 have both diseases

Page 25

[Venn diagram: population 100,000; asthma 30,000; bronchitis 10,000; both 3,000]

Page 26

Selections of relevance for designs

In the hospital, let’s assume 40% of asthma patients get hospitalized, and 60% of patients with bronchitis.

27,000 asthma only: 10,800 in hospital
7,000 bronchitis only: 4,200 in hospital
3,000 with both diseases: 2,280 in hospital (0.4 + 0.6 - 0.4 x 0.6 = 0.76)

Those with both diseases are thus overrepresented in hospital data. The two diseases will look as if they are associated, but they are not; those with both diseases just have a higher probability of being hospitalized.

A “Berkson’s like” bias could be seen for other factors that influence hospitalization rates or diagnostic probabilities.
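
The Berkson arithmetic can be reproduced in a few lines. The Python sketch below uses only the prevalences and hospitalization probabilities stated on the slides and assumes, as the slide formula does, that the two diseases lead to hospitalization independently; variable names are illustrative:

```python
population = 100_000
p_asthma, p_bronchitis = 0.30, 0.10
h_asthma, h_bronchitis = 0.40, 0.60   # hospitalization probabilities

# Independent diseases in the population
both = population * p_asthma * p_bronchitis
asthma_only = population * p_asthma - both
bronchitis_only = population * p_bronchitis - both

# Two independent routes into hospital for people with both diseases
h_both = h_asthma + h_bronchitis - h_asthma * h_bronchitis

hosp_asthma_only = asthma_only * h_asthma
hosp_bronchitis_only = bronchitis_only * h_bronchitis
hosp_both = both * h_both

# Share of asthma patients who also have bronchitis
share_population = both / (population * p_asthma)
share_hospital = hosp_both / (hosp_asthma_only + hosp_both)

print(f"population: {share_population:.1%}, hospital: {share_hospital:.1%}")
```

Among asthma patients, bronchitis is more common in the hospital series than in the population, although the diseases were constructed to be independent.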

Page 27

Selections of relevance for designs

[Venn diagrams. Population: asthma 30,000; bronchitis 10,000; both 3,000. Hospital: asthma 13,080; bronchitis 6,480; both 2,280]

Page 28

Smoking   HBP      CVD +   CVD -   CVD risk
+ (100)   + (20)   6       14      30%
+ (100)   - (80)   8       72      10%
- (100)   + (20)   2       18      10%
- (100)   - (80)   4       76      5%

[Diagram: Smoking ? HBP, CVD]

CVD risk is highest for those with high blood pressure and for smokers.

Estimates of the association between smoking and HBP before or after exclusion of patients with CVD.

OR: smoking exposure odds ratios for HBP

Page 29

[Smoking, HBP, CVD table as on page 28]

No exclusion of CVD

OR = (20/20) / (80/80) = 1

Page 30

Be careful when excluding diseases from the study if they are in the causal pathway, or if they are causally linked to the end point of your study.

Page 31

[Smoking, HBP, CVD table as on page 28]

Use CVD as controls and exclude them from the case group

OR = (14/18) / (8/4) = 0.39

Page 32

[Smoking, HBP, CVD table as on page 28]

Use CVD as controls and include them in the case group

OR = (20/20) / (8/4) = 0.50

Page 33

[Smoking, HBP, CVD table as on page 28]

Exclude CVD patients from the control group but not from the case group

OR = (20/20) / (72/76) = 1.06

Page 34

[Smoking, HBP, CVD table as on page 28]

Exclude them from both groups

OR = (14/18) / (72/76) = 0.82
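
The five exclusion strategies on pages 29 to 34 all draw on the same eight counts, so they can be compared side by side. The following Python sketch is one way to do that; the dictionary layout, helper names and design labels are assumptions made for the illustration. With unrounded arithmetic the last design comes out at about 0.82.

```python
# Counts by (smoking, HBP, CVD) from the slide data
counts = {
    ("smoker", "HBP", "CVD"): 6, ("smoker", "HBP", "no CVD"): 14,
    ("smoker", "no HBP", "CVD"): 8, ("smoker", "no HBP", "no CVD"): 72,
    ("non-smoker", "HBP", "CVD"): 2, ("non-smoker", "HBP", "no CVD"): 18,
    ("non-smoker", "no HBP", "CVD"): 4, ("non-smoker", "no HBP", "no CVD"): 76,
}


def n(smoking, hbp, cvd_levels):
    """Number of people with given smoking and HBP status, over the CVD levels kept."""
    return sum(counts[(smoking, hbp, cvd)] for cvd in cvd_levels)


def exposure_or(case_cvd, control_cvd):
    """Smoking exposure odds ratio: HBP cases versus non-HBP controls."""
    case_odds = n("smoker", "HBP", case_cvd) / n("non-smoker", "HBP", case_cvd)
    control_odds = (n("smoker", "no HBP", control_cvd)
                    / n("non-smoker", "no HBP", control_cvd))
    return case_odds / control_odds


ALL, NO_CVD, CVD = ("CVD", "no CVD"), ("no CVD",), ("CVD",)
designs = {
    "no exclusion": exposure_or(ALL, ALL),
    "CVD controls, CVD excluded from cases": exposure_or(NO_CVD, CVD),
    "CVD controls, CVD kept among cases": exposure_or(ALL, CVD),
    "CVD excluded from controls only": exposure_or(ALL, NO_CVD),
    "CVD excluded from both groups": exposure_or(NO_CVD, NO_CVD),
}

for label, value in designs.items():
    print(f"{label}: OR = {value:.2f}")
```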

Page 35

Using hospital controls to replace population controls is bias prone (this example is extreme, though). Controls should provide the exposure distribution in the population that gave rise to the cases.

Do not take into consideration diseases that follow this pattern: [Diagram: Smoking, HBP, CVD]. Only: [Diagram: smoking, HBP, CVD], and only if smoking is not causing CVD.

Exclusion of persons with an exposure-related condition from one group but not from the other introduces a threat to validity (although one of these estimates was close to 1).

Exclusion of such cases from both groups can cause bias (unless the selection criteria are confounders).

Page 36

Healthy worker selection

Is a conceptual problem when designing the study, a violation of the counterfactual ideal.

Indicates that SMR values for workers who perform physically demanding jobs tend to be less than 100. The reason is that the comparisons we make are biased. The population at large includes people with chronic diseases (and high mortality) who cannot perform a physically demanding job: “the sick population effect” or “the stupid investigator effect”.

Page 37

[Figure: mortality rate (MR) by age for the population and for the exposed; SMR = 80]

Page 38

Selection operates into the workforce at recruitment and out of the workforce over time.

Unemployment is associated with suicide risk: causal or bias?

How can this be studied?

Page 39

Selection Bias-Publication Bias

Decision making depends upon the combined evidence (e.g. Cochrane reviews), not just one study.

But is the source population for meta-analyses biased?

Page 40

Selection Bias-Publication Bias

Researchers may decide not to submit based on results
Editors may decide to review or reject based on results
Reviewers may decide to recommend publication based on results
Editors may make final conclusions based on results
All of this leads to a biased source population for reviews and meta-analyses

Page 41

Selection Bias-Publication Bias

Example: Panayiotis et al. JNCI 2005;97:1043-1055.

Association between TP53 (tumor suppressor protein) and risk of death in patients with head and neck cancers

Page 42

Selection Bias-Publication Bias

Fig. 1

Page 43

Selection Bias-Publication Bias

Fig. 2

Page 44

Selection Bias-Publication Bias

Fig. 3

Page 45

External validity?

In an etiologic study the aim is to formulate abstract hypotheses in relation to the factors under study.

The hypotheses are abstract in the sense that they are not tied to a specific population but aim to formulate a general scientific theory.

Internal validity

External validity

Page 46

Estrogen exposure (more than 0.3 mg estrogen/d for at least 6 months) and cancer of the endometrium (N Engl J Med 1978; 299: 1089-94).

Cases: All post-menopausal gynaecological cancer patients at Yale-New Haven Medical Center 1974-1976.

Controls: Mainly patients with cancer of the cervix (60) or the ovary (43), matched for age and race.

Page 47

E     Cases   Controls
+     35      4
-     84      115
All   119     119

OR = 12.0 (95% c.l. 4.1-35.0); χ² = 29.5

Page 48

Including all postmenopausal women with bleedings.

Cases: Same cancer patients.

Controls: Women with bleedings, but no cancer of the endometrium, matched for age and race.

Page 49

E     Cases   Controls
+     44      23
-     105     126
All   149     149

OR = 2.3 (95% c.l. 1.3-4.1); χ² = 8.46

Page 50

Horwitz et al. continued the discussion and presented new data in Lancet 1981;2:66-8.

In the abstract they state (shortened and modified)

“In this study, to determine the frequency with which endometrial cancer escapes detection, all necropsies on 8998 eligible women showed previously unsuspected endometrial cancer in 24 of them. The estimated rate of undetected cancer 27/10,000 is two to five times higher than the detection rate of 5/10,000 noted by the Connecticut State Tumor Registry.”

Comments?

Page 51

Two types of endometrial cancer: A (diagnosed) and B (undetected)

A woman of 45 years of age would have a lifetime risk (until 80) of type A cancer of

5/10,000 x 35 = 175/10,000

or, better,

1 - e^(-5/10,000 x 35) = 174/10,000

The proportion of type B cases would be 27/(27 + 174) = 13.4%
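
The difference between the linear and the exponential calculation is small at this rate, which a short Python sketch makes visible. It assumes, as the slide arithmetic implies, that 5/10,000 is an annual rate held constant over 35 years; variable names are illustrative. Unrounded, the cumulative risk is about 173.5 per 10,000 and the undetected share about 13.5%, close to the rounded slide figures.

```python
import math

detection_rate = 5 / 10_000   # assumed per woman-year, registry figure from the slide
years = 80 - 45               # follow-up from age 45 to age 80
undetected = 27 / 10_000      # necropsy estimate quoted on the slide

linear_risk = detection_rate * years
cumulative_risk = 1 - math.exp(-detection_rate * years)
share_undetected = undetected / (undetected + cumulative_risk)

print(f"linear: {linear_risk * 10_000:.1f} per 10,000")
print(f"exponential: {cumulative_risk * 10_000:.1f} per 10,000")
print(f"share of type B: {share_undetected:.1%}")
```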

Page 52

The most frequent and serious problem of selection bias in case-control studies is non-responders.

And an equal proportion of non-responding cases and controls is NOT a guarantee against selection bias.

The question is whether there is an equal selection of exposed cases and exposed controls.

Page 53

The most serious selection problem in a follow-up study is loss to follow-up.

“If in doubt, stay out”

Page 54

Sensitivity Analysis

Page 55

Cohort - 10 years of follow-up

Smoking   N      Loss to follow-up   Lung cancer at end of follow-up
+         1000   200                 80
-         1000   100                 10

RR = (80/800) / (10/900) = 9.0

Page 56

Sensitivity approach: lung cancer risk among those lost to follow-up

Smokers          Non-smokers   Comments                                            RR
1/10             1/90          As for followed-up                                  9.0
1/10             2/90          Underestimate risk for non-smokers                  8.2
0                1/90          Overestimate risk among smokers                     7.3
0                2/90          Underestimate risk for non-smokers                  6.6
0 (worst case)   1.0           All non-smokers lost to follow-up get lung cancer   0.7
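
The sensitivity approach amounts to filling in an assumed lung cancer risk for those lost to follow-up and recomputing the RR for the full cohort of 1000 smokers and 1000 non-smokers. A minimal Python sketch under that reading (the function name is hypothetical); with unrounded arithmetic the middle scenarios can differ in the last digit from the rounded figures in the table:

```python
def rr_with_assumed_risk(risk_lost_smokers, risk_lost_nonsmokers):
    """RR for the full cohort after imputing cases among those lost to follow-up."""
    smokers_n, nonsmokers_n = 1000, 1000
    smokers_lost, nonsmokers_lost = 200, 100
    smokers_cases, nonsmokers_cases = 80, 10   # observed among those followed up

    risk_smokers = (smokers_cases + smokers_lost * risk_lost_smokers) / smokers_n
    risk_nonsmokers = (nonsmokers_cases
                       + nonsmokers_lost * risk_lost_nonsmokers) / nonsmokers_n
    return risk_smokers / risk_nonsmokers


# (assumed risk among lost smokers, assumed risk among lost non-smokers)
scenarios = [(1 / 10, 1 / 90), (1 / 10, 2 / 90), (0, 1 / 90), (0, 2 / 90), (0, 1.0)]
rrs = [rr_with_assumed_risk(s, ns) for s, ns in scenarios]

for (s, ns), rr in zip(scenarios, rrs):
    print(f"lost smokers {s:.3f}, lost non-smokers {ns:.3f}: RR = {rr:.2f}")
```

Only the extreme last scenario reverses the direction of the association, which is the practical message of the table.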

Page 57

Selection Bias Main Points

Selection of people into the study produces bias under the following conditions, and more.

A. Selection bias in the design

1. Cross-sectional study: The sampling strategy does not produce a representative sample of the target population.

Page 58

Selection Bias Main Points, cont.

2. Cohort study/case-control study: The not exposed are too far away from the counterfactual ideal. The not exposed do not provide the disease occurrence expected had the exposed not been exposed, and stratification or statistical control will not be sufficient to produce unbiased estimates of effects.

Examples: healthy worker selection and many other poorly designed studies.

Page 59

Selection Bias Main Points, cont.

B. Selection bias in the conduct of the study: non-responders, loss to follow-up.

1. The cross-sectional study: the response rate may correlate with what you want to estimate, which would lead to a biased estimate of its prevalence.

Risk of selection bias is high.

Page 60

Selection Bias Main Points, cont.

2. The cohort study: non-response at baseline will usually not correlate directly with both the exposure and the (unknown) endpoint, but selection at baseline will often change the confounder structure (it will correlate with exposure). Loss to follow-up may correlate with both the exposure and the endpoint and lead to bias.

Give higher priority to compliance with follow-up than to recruitment at baseline. Loss to follow-up will often cause bias in the randomized trial (intention to treat analysis).

Page 61

Selection Bias Main Points, cont.

3. The case-control study: non-response may well correlate with both the exposure and the endpoint, since both are known at recruitment to the study. Keeping response rates high should be given high priority, and the specific aim of the study should not be disclosed (the IRB may not accept this procedure).

Page 62

Selection Bias Main Points, cont.

Selection bias is a serious problem and should be avoided if possible. Often it is not possible and its magnitude and possible impact should be investigated.

Page 63

Steps to avoid bias related to non-responders

Keep non-response as low as possible, especially in surveys and case-control studies

Try to get some information on non-responders: at best for E and D, but also on confounders

Analyse data according to the time of responding

Do sensitivity analyses

Do follow-up studies (incl RCTs)

Page 64

So, the first concern in an etiologic study is that of VALIDITY (FREEDOM FROM BIAS, at least known bias).

Internal validity: validity of inference drawn in relation to the members of the study population.

External validity: validity of the inferences as they extend outside the population.