Evaluating Screening Programs Dr. Jørn Olsen Epi 200B January 19, 2010.


[Figure: natural history of disease as a timeline: causal field, biological onset, detectable by testing, clinical symptoms, diagnosis, disability. Primary, secondary and tertiary prevention are marked along the timeline.]

Definition of Screening

Last: The presumptive identification of unrecognized disease or defect by the application of tests, examinations or other procedures which can be applied rapidly. Screening tests sort out apparently well persons who probably have a disease from those who probably do not. A screening test is not intended to be diagnostic. Persons with positive or suspicious findings must be referred to their physicians for diagnosis and necessary treatment.

Screening is not about diagnosing patients.

The aim is to identify people at high risk of having the disease.

The screening test is not a diagnostic test.

A positive screening test is a cue for using a diagnostic test, not a cue for treatment.

Which conditions speak in favor of a screening program?

A health problem that can be treated

An acceptable test with no or few side effects

Justification for screening: early treatment improves prognosis at reasonable cost

Screening requires a screening test.

We talk about a test’s sensitivity, specificity and predictive value.

What characterizes a good screening test?

Safe

Acceptable

Inexpensive

Good predictive values

Test     D      not D    Total
+        a      b        M3
-        c      d        M4
Total    M1     M2       N

sens = a/M1    spec = d/M2    predictive value of a positive test = a/M3

Parameters: sens = P(test+ | D); spec = P(test- | not D)

Predictive value of a positive test = P(D | test+)

Predictive value of a negative test = P(not D | test-)

These are conditional probabilities
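As a minimal sketch of these definitions, the four measures can be computed directly from the cell counts. The class name `TwoByTwo` and the example counts are hypothetical and chosen only to illustrate the arithmetic; the cell labels a, b, c, d follow the slide notation (a = test+ and diseased, b = test+ and not diseased, c = test- and diseased, d = test- and not diseased).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a screening 2x2 table (hypothetical helper)."""
    a: int  # test positive, diseased (true positives)
    b: int  # test positive, not diseased (false positives)
    c: int  # test negative, diseased (false negatives)
    d: int  # test negative, not diseased (true negatives)

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)  # a / M1

    def specificity(self) -> float:
        return self.d / (self.b + self.d)  # d / M2

    def ppv(self) -> float:
        return self.a / (self.a + self.b)  # a / M3

    def npv(self) -> float:
        return self.d / (self.c + self.d)  # d / M4


# Hypothetical counts, not taken from any study.
table = TwoByTwo(a=80, b=90, c=20, d=810)
print(f"sens = {table.sensitivity():.3f}")  # sens = 0.800
print(f"spec = {table.specificity():.3f}")  # spec = 0.900
print(f"PPV  = {table.ppv():.3f}")          # PPV  = 0.471
print(f"NPV  = {table.npv():.3f}")          # NPV  = 0.976
```

Sensitivity and specificity condition on the columns (true disease status), while the predictive values condition on the rows (test result).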

A population

Test     D                 not D
+        PP x sens         (1-PP) x (1-spec)
-        PP x (1-sens)     (1-PP) x spec
Total    PP                1-PP

Predictive value of a positive test

P(D | test+) = (PP x sens) / [PP x sens + (1-PP) x (1-spec)]

Predictive value of a negative test

P(not D | test-) = [(1-PP) x spec] / [PP x (1-sens) + (1-PP) x spec]

Bayes’ formula: the predictive value depends upon sens, spec and PP, the prevalence proportion. It takes us from the prior probability, PP, to the a posteriori probability P(D | test+).

In 1763 Richard Price presented a paper by Thomas Bayes, “An Essay towards solving a Problem in the Doctrine of Chances”.

Is epoxy carcinogenic?

Ames test    Carcinogenic compounds    Non-carcinogenic compounds
Positive     157                       14
Negative     18                        94
Total        175                       108

Sens. = 157/(157+18) = 0.90    Spec. = 94/108 = 0.87

Epoxy had a positive Ames test

90% probability that it is carcinogenic?

Depends upon the prior probability. If 1% of all chemicals screened are carcinogenic:

Test     C        not C     Total
+        900      12870     13770
-        100      86130     86230
Total    1000     99000     100,000

Predictive value of a positive test = 900/13770 = 6.5%: from 1% to 6.5%

Predictive value of a negative test = 86130/86230 = 99.9%: from 99% to 99.9%
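The Ames test arithmetic can be reproduced with a short sketch that builds the expected 2x2 table from sensitivity, specificity, prior probability and the number of compounds screened. The function name `expected_table` and its dictionary keys are illustrative assumptions; the inputs (0.90, 0.87, 1%, 100,000) are the values used in this example.

```python
def expected_table(sens: float, spec: float, pp: float, n: int) -> dict:
    """Expected cell counts when n units with prior probability pp are tested."""
    with_condition = n * pp
    without_condition = n - with_condition
    true_pos = with_condition * sens
    false_neg = with_condition - true_pos
    true_neg = without_condition * spec
    false_pos = without_condition - true_neg
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        # Predictive values are row proportions of the expected table.
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (false_neg + true_neg),
    }


ames = expected_table(sens=0.90, spec=0.87, pp=0.01, n=100_000)
print(round(ames["true_pos"]), round(ames["false_pos"]))  # 900 12870
print(round(ames["false_neg"]), round(ames["true_neg"]))  # 100 86130
print(f"PPV = {ames['ppv']:.3f}")  # PPV = 0.065
print(f"NPV = {ames['npv']:.3f}")  # NPV = 0.999
```

A positive test moves the probability that the compound is carcinogenic from 1% to about 6.5%, far from the 90% that the sensitivity alone might suggest.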

Benefits and side effects of screening

a: True positives: detected at screening; would benefit if detected before the critical point

c: False negatives: diseased but not detected at screening. Screening may delay their diagnosis

b: False positives: called in for diagnostic work-up; are worried, and diagnostic tests may carry risks

d: True negatives: are happy and like the program

Main design issue: screening may have positive as well as negative effects. The sensitivity and specificity of the tests are key parameters together with the nature of the test, the disease and its treatment.

Test     D      not D
+        a      b
-        c      d

Test values for HEME Select (colorectal cancer): population data

Test     D      not D    Total
+        22     418      440
-        10     7043     7053
Total    32     7461     7493

Sens = 22/32 = 0.688    Spec = 7043/7461 = 0.944

Predictive value of a positive test = 22/440 = 0.050

In a clinical setting the data could be

Test     D       not D    Total
+        688     56       744
-        312     944      1256
Total    1000    1000     2000

Predictive value of a positive test = 688/744 = 0.925

Sensitivity will often depend on the stage of the disease and may well be lower for early stages. The predictive value of the test is closely dependent on the prevalence proportion of the disease. For this test, the predictive value of a positive test is 0.11 if colon cancer has a prevalence proportion of 0.01, and 0.01 if PP is 0.001.
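The dependence on prevalence can be checked with a few lines of code. This is a sketch using the rounded HEME Select values (sens = 0.688, spec = 0.944); the function name `ppv_from_prevalence` is an assumption made for illustration.

```python
def ppv_from_prevalence(sens: float, spec: float, pp: float) -> float:
    """Bayes' formula for the predictive value of a positive test."""
    true_pos = pp * sens
    false_pos = (1 - pp) * (1 - spec)
    return true_pos / (true_pos + false_pos)


SENS, SPEC = 0.688, 0.944  # rounded HEME Select values

# 0.5 corresponds to the clinical-setting table (1000 diseased, 1000 not).
for pp in (0.5, 0.01, 0.001):
    print(f"PP = {pp}: PPV = {ppv_from_prevalence(SENS, SPEC, pp):.2f}")
# PP = 0.5: PPV = 0.92
# PP = 0.01: PPV = 0.11
# PP = 0.001: PPV = 0.01
```

The same test that looks convincing in a clinic, where half of those tested have the disease, yields mostly false positives when applied to a low-prevalence population.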

Likelihood ratios (LR)

LR+ = sens / (1-spec) = P(test+ | D) / P(test+ | not D)

LR- = (1-sens) / spec = P(test- | D) / P(test- | not D)

An easy way to use Bayes’ theorem

Prior odds = prior probability / (1 - prior probability)

Posterior odds = prior odds x LR

Posterior probability = posterior odds / (1 + posterior odds)

Screening for alcoholism: test sens = 0.90, spec = 0.60. Prior probability of alcoholism 0.30, then

Prior odds = 0.30/0.70 = 0.43

LR+ = 0.90/0.40 = 2.25

Posterior odds = 0.43 x 2.25 = 0.97

Posterior probability = 0.97/(1 + 0.97) = 0.49

You have increased your probability from 0.30 to 0.49 given the test was positive.
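The odds-based update can be sketched as a small function. The name `posterior_probability` and the `test_positive` flag are assumptions for illustration; the inputs are those of the alcoholism example (sens = 0.90, spec = 0.60, prior probability 0.30).

```python
def posterior_probability(prior: float, sens: float, spec: float,
                          test_positive: bool = True) -> float:
    """Update a prior probability using the likelihood ratio of the test result."""
    if test_positive:
        lr = sens / (1 - spec)      # LR+
    else:
        lr = (1 - sens) / spec      # LR-
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)


after_pos = posterior_probability(0.30, sens=0.90, spec=0.60)
after_neg = posterior_probability(0.30, sens=0.90, spec=0.60, test_positive=False)
print(f"after a positive test: {after_pos:.2f}")  # after a positive test: 0.49
print(f"after a negative test: {after_neg:.2f}")  # after a negative test: 0.07
```

The negative-test result (LR- = 0.10/0.60) is not worked out on the slide; it is included here only to show that the same three steps apply.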

Screening may have negative as well as positive effects; a screening program should therefore be evaluated. It is not enough to show that those who were detected in a screening program had a longer survival than those not screened.

For this patient, the clinical survival time is td - tc and the screening survival time is td - ts, which is tc - ts longer. This time interval produces “lead time bias”.

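[Figure: timeline from healthy to dead, with ts, tc and td marked on the time axis]

[Figure: IR over time, after screening compared with without screening, with the time of screening marked]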
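A minimal numeric sketch of lead time bias, assuming ts is the time of detection by screening, tc the time of clinical diagnosis and td the time of death. The function name and the patient's times are hypothetical.

```python
def survival_times(ts: float, tc: float, td: float) -> tuple:
    """Return (clinical survival, screening survival, lead time)."""
    clinical = td - tc            # survival counted from clinical diagnosis
    screening = td - ts           # survival counted from screen detection
    lead_time = screening - clinical  # equals tc - ts
    return clinical, screening, lead_time


# Hypothetical patient: screen-detected at year 2, would have been
# diagnosed clinically at year 5, dies at year 8 in both scenarios.
clinical, screening, lead = survival_times(ts=2, tc=5, td=8)
print(clinical, screening, lead)  # 3 6 3
```

Survival after diagnosis doubles from 3 to 6 years although the time of death is unchanged, which is why longer survival among screen-detected cases is not evidence of benefit on its own.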

All classical designs have been used in evaluating screening programs. Main concerns:

Since screening programs usually have both positive and negative effects, the case-control design may not be the best choice. Why is that?

RCT: needs to be large, may be out of date when finished, unbiased cause-specific mortality may be difficult to obtain, difficult to randomize at the individual level. Does not address normal practice. No “confounding by indication” argument for doing an RCT.

Follow-up: who complies with the program, high-risk or low-risk people?

Case-control: no possibility to include all effects of interest

Ecological: ecological fallacy, but may be the best evidence after all

Additional design issues

Screening may address an early pre-disease lesion (adenoma) or cancer at an early stage. In the first situation, screening may reduce incidence but may have little impact on case fatality. In the second situation, screening should reduce incidence (and case fatality?). In both situations, cause-specific mortality should be reduced (and total mortality?).

Additional design issues

A case-control study addressing the first issue includes incident cases. For the second issue, cases are cause-specific deaths.

The source population consists of those who are invited to be screened and belong to the population at risk.

Additional design issues

Incidence density sampling of controls is usually the only option.

Exposure is being screened in a given time interval up to case selection.
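One way to picture incidence density sampling with this exposure definition is the sketch below. It is an illustration under stated assumptions, not a prescribed analysis: the record keys (`id`, `exit`, `is_case`, `screened_at`), the function name, the exposure window and the small cohort are all hypothetical.

```python
import random


def sample_risk_sets(cohort, window, n_controls=2, seed=0):
    """Incidence density (risk-set) sampling for a screening case-control study.

    cohort: list of dicts with hypothetical keys
        id          person identifier
        exit        time the person became a case or left follow-up
        is_case     True if the person became a case at `exit`
        screened_at list of times at which the person was screened
    window: length of the exposure interval before case selection
    """
    rng = random.Random(seed)

    def exposed(person, t):
        # Exposure = screened in the interval [t - window, t).
        return any(t - window <= s < t for s in person["screened_at"])

    sets = []
    for case in (p for p in cohort if p["is_case"]):
        t = case["exit"]
        # Everyone still under follow-up after t is eligible as a control,
        # including people who become cases later.
        at_risk = [p for p in cohort if p["exit"] > t]
        controls = rng.sample(at_risk, min(n_controls, len(at_risk)))
        sets.append({
            "time": t,
            "case": (case["id"], exposed(case, t)),
            "controls": [(c["id"], exposed(c, t)) for c in controls],
        })
    return sets


# Hypothetical cohort, times in years since invitation.
cohort = [
    {"id": 1, "exit": 4.0, "is_case": True, "screened_at": []},
    {"id": 2, "exit": 9.0, "is_case": True, "screened_at": [3.0, 7.5]},
    {"id": 3, "exit": 10.0, "is_case": False, "screened_at": [3.5]},
    {"id": 4, "exit": 6.0, "is_case": False, "screened_at": []},
    {"id": 5, "exit": 10.0, "is_case": False, "screened_at": [8.0]},
]
for risk_set in sample_risk_sets(cohort, window=2.0):
    print(risk_set)
```

Exposure for each control is assessed at the selection time of its case, so cases and controls are compared over the same calendar interval. Person 2 can serve as a control for the case at year 4 and still become a case at year 9.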


Gotzsche et al. Is screening for breast cancer with mammography justifiable? Lancet. 2000;355:129-34.

France

In France, breast cancer (BC) incidence is increasing, cause-specific mortality rates are stable, and BC screening is increasing. How could this be explained?

Randomized trials include 500,000 women but results differ and no conclusion has been reached.

Review of trial according to:

quality of randomization

blinding of outcome assessment

exclusion after randomization

New York trial

Pairs of women were matched and the pairs were randomized, but there was imbalance on previous lump in the breast, menopause and education

Edinburgh trial

Cluster randomization of GPs; difference in social conditions

Canadian trial

Individual randomization

Stockholm trial

Allocated according to date of birth. Women born on the 11th to 20th of any month were allocated to the control group

inconsistency in numbers

Gothenburg trial

Date of birth and individual randomization; difference in age

Other Swedish trials

Randomization of counties

Differences in age

Peter C Gotzsche, et al. Public Health


Other quality criteria?

Conclusion:

Screening for breast cancer with mammography is unjustified?
