Lecture 2: Screening and diagnostic tests • Normal and abnormal • Validity: “gold” or criterion standard • Sensitivity, specificity, predictive value • Likelihood ratio • ROC curves • Bias: spectrum, verification, information

Dec 27, 2015


Lecture 2: Screening and diagnostic tests

• Normal and abnormal

• Validity: “gold” or criterion standard

• Sensitivity, specificity, predictive value

• Likelihood ratio

• ROC curves

• Bias: spectrum, verification, information


Clinical/public health applications

• screening: for asymptomatic disease (e.g., Pap test, mammography)

• case-finding: testing of patients for diseases unrelated to their complaint

• diagnostic: to help make diagnosis in symptomatic disease or to follow-up on screening test


Evaluation of screening and diagnostic tests

• Performance characteristics

– test alone

• Effectiveness (on outcomes of disease):

– test + intervention


Criteria for test selection

• Reproducibility

• Validity

• Feasibility

• Simplicity

• Cost

• Acceptability


Sources of variation: Biological or true variation

• between individuals

• within individuals (e.g., diurnal variation in BP)

– “controlled” by standardizing time of measurement


Sources of variation: Measurement error

• random error vs systematic error (bias)

• method (measuring instrument)

• observer


Quality of measurements

• Validity (accuracy)

– Does it measure what it is intended to?

– Lack of bias

• Reproducibility (reliability, precision, consistency) of measurements


Examples of types of reproducibility

• Between and within observer (inter- and intra-observer variation)

– May be random or systematic

• Regression toward the mean

– Systematic error when subjects have extreme values (more likely to be in error than typical values)


Validity (accuracy)

• Criterion validity

– concurrent

– predictive

• Face validity, content validity: judgement of the appropriateness of content of measurement

• Construct validity: validity of underlying entity or theoretical construct


Normal vs abnormal

• Statistical definition

– “Gaussian” or “normal” distribution

• Clinical definition

– using criterion


Selection of criterion

• Concurrent

– salivary screening test for HIV

– history of cough more than 2 weeks (for TB)

• Predictive

– APACHE (Acute Physiology and Chronic Health Evaluation) instrument for ICU patients

– blood lipid level

– maternal height


"True" Disease Status

Screening test result    Present                   Absent

Positive                 A ("true positives")      B ("false positives")

Negative                 C ("false negatives")     D ("true negatives")

Sensitivity of screening test = A / (A + C)

Specificity of screening test = D / (B + D)

Predictive value of positive test = A / (A + B)

Predictive value of negative test = D / (C + D)
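The four ratios defined from the 2 x 2 table can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `screening_metrics` and the cell counts are hypothetical, chosen so that prevalence is 25%, sensitivity 80%, and specificity 90%.

```python
def screening_metrics(a, b, c, d):
    """Return the four test characteristics from the 2 x 2 table cells.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # diseased people correctly classified
        "specificity": d / (b + d),  # non-diseased people correctly classified
        "ppv": a / (a + b),          # predictive value of a positive test
        "npv": d / (c + d),          # predictive value of a negative test
    }


# Hypothetical counts for illustration only: 100 diseased, 300 non-diseased.
result = screening_metrics(a=80, b=30, c=20, d=270)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed within the columns of the table (true disease status), whereas the predictive values are computed within the rows (test result).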


Sensitivity and specificity

Assess correct classification of:

• People with the disease (sensitivity)

• People without the disease (specificity)


Predictive value

• More relevant to clinicians and patients

• Affected by prevalence
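The dependence of predictive value on prevalence can be illustrated with a short sketch. It assumes a test with fixed sensitivity (80%) and specificity (90%); the function name and the list of prevalences are hypothetical choices for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Predictive value of a positive test for a given prevalence."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Same test applied to populations with different (hypothetical) prevalences.
prevalences = (0.01, 0.10, 0.25, 0.50)
ppv_by_prevalence = {
    p: positive_predictive_value(0.80, 0.90, p) for p in prevalences
}
for p, ppv in ppv_by_prevalence.items():
    print(f"prevalence {p:.0%}: predictive value of positive test {ppv:.1%}")
```

With sensitivity and specificity held constant, the predictive value of a positive test rises as prevalence rises, which is why the same test performs differently in screening and in diagnostic settings.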


Choice of cut-point

If a higher score increases the probability of disease:

• Lower cut-point:

– increases sensitivity, reduces specificity

• Higher cut-point:

– reduces sensitivity, increases specificity
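The trade-off between sensitivity and specificity when the cut-point moves can be sketched as follows. The scores are invented for illustration, and the sketch assumes that a score at or above the cut-point counts as a positive test.

```python
def sens_spec_at_cutpoint(diseased_scores, healthy_scores, cutpoint):
    """Sensitivity and specificity when 'score >= cutpoint' is test-positive."""
    sensitivity = sum(s >= cutpoint for s in diseased_scores) / len(diseased_scores)
    specificity = sum(s < cutpoint for s in healthy_scores) / len(healthy_scores)
    return sensitivity, specificity


# Hypothetical test scores; higher scores are more common among the diseased.
diseased = [2, 3, 3, 4, 5, 5, 6]
healthy = [0, 0, 1, 1, 2, 2, 3, 4]

low = sens_spec_at_cutpoint(diseased, healthy, cutpoint=2)
high = sens_spec_at_cutpoint(diseased, healthy, cutpoint=4)
print("cut-point 2: sensitivity %.2f, specificity %.2f" % low)
print("cut-point 4: sensitivity %.2f, specificity %.2f" % high)
```

Raising the cut-point from 2 to 4 in this toy data set lowers sensitivity and raises specificity, as the slide states.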


Considerations in selection of cut-point

Implications of false positive results

• burden on follow-up services

• labelling effect

Implications of false negative results

• Failure to intervene


Likelihood ratio

• Likelihood ratio (LR) of a positive test = sensitivity / (1 − specificity)

• Used to compute post-test odds of disease from pre-test odds:

post-test odds = pre-test odds x LR

• pre-test odds derived from prevalence

• post-test odds can be converted to predictive value of positive test


Example of LR

• prevalence of disease in a population is 25%

• sensitivity is 80%

• specificity is 90%

• pre-test odds = 0.25 / (1 − 0.25) = 1/3

• likelihood ratio = 0.80 / (1 − 0.90) = 8


Example of LR

• If prevalence of disease in a population is 25%

• pre-test odds = 0.25 / (1 − 0.25) = 1/3

• post-test odds = 1/3 × 8 = 8/3

• predictive value of positive result = post-test odds / (1 + post-test odds) = (8/3) / (1 + 8/3) = 8/11 = 73%
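The worked example can be reproduced step by step in Python. This is a sketch using the lecture's numbers (prevalence 25%, sensitivity 80%, specificity 90%); the function name is a hypothetical label.

```python
def post_test_probability(prevalence, sensitivity, specificity):
    """Convert prevalence to the probability of disease after a positive test."""
    pre_test_odds = prevalence / (1 - prevalence)
    likelihood_ratio = sensitivity / (1 - specificity)  # LR of a positive test
    post_test_odds = pre_test_odds * likelihood_ratio
    # Odds are converted back to a probability: odds / (1 + odds).
    return post_test_odds / (1 + post_test_odds)


ppv = post_test_probability(prevalence=0.25, sensitivity=0.80, specificity=0.90)
print(f"predictive value of positive result: {ppv:.0%}")
```

The result agrees with the value obtained directly from a 2 x 2 table with the same prevalence, sensitivity, and specificity.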


Receiver operating characteristic (ROC) curve

• Evaluates test over range of cut-points

• Plot of sensitivity against 1-specificity

• Area under curve (AUC) summarizes performance:

– AUC of 0.5 = no better than chance
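A minimal sketch of how an empirical ROC curve and its AUC can be computed from raw scores is given below. It assumes that higher scores indicate disease and uses the trapezoidal rule for the area; the function names and the toy scores are hypothetical.

```python
def roc_points(diseased_scores, healthy_scores):
    """(1 - specificity, sensitivity) pairs, one per cut-point, from (0, 0) to (1, 1)."""
    cutpoints = sorted(set(diseased_scores) | set(healthy_scores), reverse=True)
    points = [(0.0, 0.0)]  # cut-point above every score: nobody tests positive
    for cut in cutpoints:
        sensitivity = sum(s >= cut for s in diseased_scores) / len(diseased_scores)
        false_positive_rate = sum(s >= cut for s in healthy_scores) / len(healthy_scores)
        points.append((false_positive_rate, sensitivity))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores: perfect separation versus no separation at all.
auc_perfect = area_under_curve(roc_points([5, 6], [1, 2]))
auc_chance = area_under_curve(roc_points([1, 2], [1, 2]))
print(auc_perfect, auc_chance)
```

A test that separates the two groups completely gives an AUC of 1.0, while identical score distributions give 0.5, the "no better than chance" value.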


Spectrum bias

• Study population should be representative of population in which test will be used

• Is range of subjects tested adequate?

– In population with low risk of outcome, sensitivity will be lower, specificity higher

– In population with high risk of outcome, sensitivity will be higher, specificity lower

• Comorbidity may affect sensitivity and specificity


Verification bias

• results of test affect intensity of subsequent investigation

• increasing probability of detection of outcome in those with positive test result


Information bias

• Diagnosis is not blind to test result

• Improves test performance


Example: Screening seniors in the emergency department (ED) for risk of functional decline

• High risk group

• Many not adequately evaluated or referred for appropriate services

• Development and validation of a brief screening tool to identify those at increased risk of functional decline and other adverse outcomes


Two multi-site studies in Montreal EDs

• Study 1: development of ISAR

– Prospective observational cohort study

– JAGS (1999) 47: 1226-1237.

• Study 2: evaluation of 2-step intervention

– randomized controlled trial

– JAGS (2001) 49: 1272-1281.


Common features of 2 studies

• 4 Montreal hospitals (2 participated in both studies)

• Patients aged 65+, community dwelling, English or French-speaking

• Exclusions:

– cognitively impaired or severe illness with no proxy informant

– language barrier (no English or French)


Differences between 2 studies: Study design

• Study 1

– Observational study

– Follow-up at 3 and 6 months after ED visit

• Study 2

– Randomized controlled trial: 2-step intervention vs usual care

– Randomization by day of visit

– Follow-up at 1 and 4 months after ED visit


RESULTS: ISAR development

Adverse health outcome defined as any of following during 6 months after ED visit

• >10% ADL decline

• Death

• Institutionalization


Scale development

• Selection of items that predicted all adverse health events

• Multiple logistic regression - “best subsets” analysis

• Review of candidate scales with clinicians to select clinically relevant scale


Identification of Seniors At Risk (ISAR)

1. Before the illness or injury that brought you to the Emergency, did you need someone to help you on a regular basis? (yes)

2. Since the illness or injury that brought you to the Emergency, have you needed more help than usual to take care of yourself? (yes)

3. Have you been hospitalized for one or more nights during the past 6 months (excluding a stay in the Emergency Department)? (yes)

4. In general, do you see well? (no)

5. In general, do you have serious problems with your memory? (yes)

6. Do you take more than three different medications every day? (yes)

Scoring: 0 - 6 (positive score shown in parentheses)
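The scoring rule can be sketched as a small function: one point for each item answered with the response shown in parentheses, giving a total of 0 to 6. The item keys below are hypothetical labels introduced for this sketch and are not part of the published instrument.

```python
# Response that scores one point for each of the six items listed above.
# The dictionary keys are hypothetical short labels, not official item names.
ISAR_POSITIVE_ANSWERS = {
    "premorbid_help": "yes",               # item 1
    "needs_more_help": "yes",              # item 2
    "recent_hospitalization": "yes",       # item 3
    "sees_well": "no",                     # item 4 (scored when the answer is no)
    "memory_problems": "yes",              # item 5
    "more_than_three_medications": "yes",  # item 6
}


def isar_score(answers):
    """Count the items answered with the scoring response (total 0 to 6)."""
    missing = set(ISAR_POSITIVE_ANSWERS) - set(answers)
    if missing:
        raise ValueError(f"unanswered items: {sorted(missing)}")
    return sum(
        answers[item] == positive
        for item, positive in ISAR_POSITIVE_ANSWERS.items()
    )


# Hypothetical respondent, for illustration only.
example_answers = {
    "premorbid_help": "no",
    "needs_more_help": "yes",
    "recent_hospitalization": "no",
    "sees_well": "no",
    "memory_problems": "no",
    "more_than_three_medications": "yes",
}
example_score = isar_score(example_answers)
print(example_score)
```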


[Figure: Any adverse outcome (%) by ISAR score (0, 1, 2, 3, 4, 5-6) and disposition (discharged vs admitted)]


Other Outcomes Related to ISAR

Source: Dendukuri et al, JAGS, in press

• Does ISAR score identify patients with current functional problems?

– Self-reported premorbid function (OARS)

– Function at home visit assessed by nurse 1-2 weeks after ED visit (SMAF)


Area under the curve (AUC) for concurrent validity criteria

[Figure: AUC with 95% confidence interval (horizontal axis 0.5 to 1.0). Rows: detection of depression at baseline (Study 2); severe functional impairment (OARS: Study 1; OARS: Study 2; SMAF: Study 1)]


Other Outcomes Related to ISAR

• Does ISAR predict adverse outcomes (other than functional decline) during the subsequent 5 or 6 months?

– High hospital utilization (11+ days/5 months)

– Frequent ED visits

– Frequent community health center visits

– Increase in depressive symptoms


Area under the curve (AUC) for predictive validation criteria among patients discharged from ED

[Figure: AUC with 95% confidence interval (horizontal axis 0.5 to 1.0). Rows: increase in depressive symptoms (Study 2); 10+ community health center visits/5 months (Study 2); 11+ hospital days/5 months (Study 1, Study 2); 2+ ED visits/5 months (Study 1, Study 2); adverse health outcome (Study 1)]


Summary of data on performance

• Very good detection of patients with current functional problems and depression (AUC values 0.8 - 0.9)

• Moderate ability to predict future adverse health events (functional decline) and health center utilization (AUC values around 0.7)

• Fair ability to predict future hospital and ED utilization (AUC values 0.6 - 0.7)


Comparison with other screening tools for patients admitted to hospital

Source: McCusker et al, J Gerontol 2002; 57A: M569-577

• Systematic literature review

• Predictors of functional decline (including nursing home admission) among hospitalized seniors

• Investigated individual risk factors and predictive indices


Predictive indices

• Inouye (1993): FD and NH at 3 mo

– 4 factors: decubitus ulcer, cognitive impairment, premorbid functional impairment, low social activity

• Mateev (1998): D/NH at 3 mo

– clinical targeting criteria


Predictive indices (cont)

• McCusker (1999): FD/NH/D at 6 mo

– Identification of Seniors At Risk (ISAR): 6-item self-report questionnaire

• Narain (1988): NH at 6 mo

– hand-developed algorithm based on residence, mental status, diagnosis


Predictive indices (cont)

• Rubenstein (1984): FD and NH at 12 mo

– expected discharge location and diagnosis

• Sager (1996): FD at 3 mo

– Hospital Admission Risk Profile (HARP) (age, MMSE and IADL)

• Zureik (1997): NH at discharge

– 6-item index


Performance of 7 predictive indices for functional decline

[Figure: ROC plot of sensitivity (%) against 1-specificity (%), both axes 0 to 100, with points labelled A to G for the indices listed below]

A: Inouye (1998)

B: Mateev (1998)

C: McCusker (1999)

D: Narain (1988)

E: Rubenstein (1986)

F: Sager (1996)

G: Zureik (1997)


Performance of predictive indices

• Moderate performance (AUC 0.65 - 0.66)