Intermediate Methods in Observational Epidemiology (2008). Instructor: Moyses Szklo. Measures of Disease Frequency
Dec 19, 2015
MEASURES OF RISK
• Absolute measures of event (including disease) frequency:
– Incidence and Incidence Odds
– Prevalence and Prevalence Odds
What is “incidence”? Two major ways to define incidence:
• Cumulative incidence (probability): SURVIVAL ANALYSIS (Unit of analysis: individual)
• Rate or Density: ANALYSIS BASED ON PERSON-TIME (Unit of analysis: time)
• OBJECTIVE OF SURVIVAL ANALYSIS: To compare the “cumulative incidence” of an event (or the proportion surviving event-free) in exposed and unexposed (characteristic present or absent) while adjusting for time to event (follow-up time)
• BASIS FOR THE ANALYSIS
– NUMBER of EVENTS
– TIME of occurrence
[Figure: survival curve; x-axis: Time; y-axis: Survival, starting at 1.0.]
Need to precisely define:
• “EVENT” (failure):
– Death
– Disease (diagnosis, start of symptoms, relapse)
– Quit smoking
– Menopause
• “TIME”:
– Time from recruitment into the study
– Time from employment
– Time from diagnosis (prognostic studies)
– Time from infection
– Calendar time
– Age
Example:
• Follow up of 6 patients (2 yrs)
– 3 deaths
– 2 censored (lost) before 2 years
– 1 survived 2 years
Question: What is the Cumulative Incidence (or the Cumulative Survival) up to 2 years?
[Figure: follow-up of the 6 patients in calendar time, Jan 1999 to Jan 2001. Legend: death; censored observation (lost to follow-up, withdrawal); number in parentheses = months of follow-up.]
Person ID    Months of follow-up
1            (24)
2            (6)
3            (18)
4            (15)
5            (13)
6            (3)
Crude survival: 3/6 = 50% (counts the 2 censored patients as survivors, ignoring that they were lost before 2 years)
Change time scale to “follow-up” time:
[Figure: the same 6 patients by Person ID, all starting at time 0; x-axis: Follow-up time (years), 0 to 2. Months of follow-up: ID 1 (24), 2 (6), 3 (18), 4 (15), 5 (13), 6 (3).]
One solution:
• Actuarial life table
Assume that censored observations occur uniformly throughout the follow-up interval, so that each person censored during the interval contributes one-half person to the number at risk in the denominator.
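[Figure: follow-up of the 6 patients by person ID; x-axis: Follow-up time (years), 0 to 2.]
Over the whole 2-year period (3 deaths and 2 censored observations among 6 persons):
$$q(2\ \text{yrs}) = \frac{3}{6 - \tfrac{1}{2}(2)} = \frac{3}{5} = 0.60$$
$$S(2\ \text{yrs}) = 1 - q(2\ \text{yrs}) = 0.40$$
It can also be calculated for years 1 and 2 separately:
$$S(Y_1) = 1 - \frac{1}{6 - \tfrac{1}{2}(1)} = 0.82$$
$$S(Y_2) = 1 - \frac{2}{4 - \tfrac{1}{2}(1)} = 0.43$$
$$S(2\ \text{yrs}) = S(Y_1) \times S(Y_2) = 0.82 \times 0.43 = 0.35$$
[Figure: Cumulative Survival (%) by Follow-up time, Year 1 and Year 2; values shown: 100, 82, 43.]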
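The actuarial calculation can be reproduced with a short Python sketch. The function name `actuarial_survival` and the tuple layout (number at the start of the interval, deaths, censored) are illustrative choices, not part of the original slides; the counts are those of the 6-patient example.

```python
def actuarial_survival(intervals):
    """Cumulative survival from a list of (n_at_start, deaths, censored) tuples."""
    cumulative = 1.0
    for n_at_start, deaths, censored in intervals:
        # Each censored person counts as half a person at risk in the interval.
        effective_n = n_at_start - 0.5 * censored
        q = deaths / effective_n          # interval probability of the event
        cumulative *= 1.0 - q             # multiply interval survival probabilities
    return cumulative


# Whole 2-year period as one interval: 6 at risk, 3 deaths, 2 censored.
s_single = actuarial_survival([(6, 3, 2)])
# Year 1 (6 at risk, 1 death, 1 censored), then year 2 (4 at risk, 2 deaths, 1 censored).
s_yearly = actuarial_survival([(6, 1, 1), (4, 2, 1)])

print(f"S(2 yrs), one interval:     {s_single:.2f}")
print(f"S(2 yrs), yearly intervals: {s_yearly:.2f}")
```

The two answers differ (0.40 vs. 0.35) because the uniform-censoring assumption is applied over intervals of different width.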
KAPLAN-MEIER METHOD
E.L. Kaplan and P. Meier, 1958*
Calculate the cumulative probability of event (and survival) based on conditional probabilities at each event time
Step 1: Sort the survival times from shortest to longest
*Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457-81.
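[Figure: follow-up of the 6 patients by Person ID, before sorting; x-axis: Follow-up time (years), 0 to 2. Months of follow-up: ID 1 (24), 2 (6), 3 (18), 4 (15), 5 (13), 6 (3).]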
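[Figure: the same 6 patients after sorting; x-axis: Follow-up time (years), 0 to 2. Follow-up times from shortest to longest: 3, 6, 13, 15, 18 and 24 months (IDs 6, 2, 5, 4, 3 and 1).]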
Step 2: For each time of occurrence of an event, compute the conditional survival
When the first event occurs (3 months after beginning of follow-up), there are 6 persons at risk. One dies at that point; 5 of the 6 survive beyond that point. Thus:
• Incidence of event at exact time 3 months: 1/6
• Probability of survival beyond 3 months: 5/6
When the second event occurs (13 months), there are 4 persons at risk. One of them dies at that point; 3 of the 4 survive beyond that point. Thus:
• Incidence of event at exact time 13 months: 1/4
• Probability of survival beyond 13 months: ¾
When the third event occurs (18 months), there are 2 persons at risk. One of them dies at that point; 1 of the 2 survives beyond that point. Thus:
• Incidence of event at exact time 18 months: 1/2
• Probability of survival beyond 18 months: 1/2
Step 3: For each time of occurrence of an event, compute the cumulative survival (survival function), multiplying conditional probabilities of survival.
3 months: S(3) = 5/6 = 0.833
13 months: S(13) = 5/6 × 3/4 = 0.625
18 months: S(18) = 5/6 × 3/4 × 1/2 = 0.3125
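Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration, not a replacement for a statistical package: the function name `kaplan_meier` and the list-based inputs are assumptions made for the example, and the event indicators are those of the 6-patient example (deaths at 3, 13 and 18 months; the remaining follow-up times are censored or event-free).

```python
def kaplan_meier(times, events):
    """Return [(time, cumulative_survival)] at each distinct event time.

    times  -- follow-up time for each person
    events -- True if the person had the event at that time, False if censored
    """
    survival = 1.0
    curve = []
    # Step 1: go through the distinct event times from shortest to longest.
    for t in sorted({t for t, e in zip(times, events) if e}):
        # Step 2: conditional survival among those still at risk at time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        # Step 3: multiply the conditional survival probabilities.
        survival *= (at_risk - deaths) / at_risk
        curve.append((t, survival))
    return curve


months = [24, 6, 18, 15, 13, 3]                     # persons 1 to 6
died = [False, False, True, False, True, True]      # persons 3, 5 and 6 die

km_curve = kaplan_meier(months, died)
for t, s in km_curve:
    print(f"S({t}) = {s:.4f}")
```

Censored persons never appear in the numerator, but they stay in the number at risk up to the time they are censored, which is why the denominators are 6, 4 and 2.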
CONDITIONAL PROBABILITY OF AN EVENT (or of survival)
The probability of an event (or of survival) at time t (for the individuals at risk at time t), that is, conditioned on being at risk at exact time t.
Time (mo)    Si
3            0.833
13           0.625
18           0.3125
Plotting the survival function:
[Figure: Kaplan-Meier step plot; x-axis: Month of follow-up (0 to 25); y-axis: Survival (0.20 to 1.00).]
The cumulative incidence (up to 24 months): 1-0.3125 = 0.6875 (or 69%)
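Because the estimated survival function is a step function, its value at any month is the value at the most recent event time (or 1.0 before the first event). A small sketch, assuming the curve is stored as a list of (event time, survival) pairs; the helper name `survival_at` is hypothetical.

```python
def survival_at(curve, t):
    """Evaluate a step survival function given as [(event_time, S)] at time t."""
    s = 1.0                      # everyone is event-free at time 0
    for event_time, value in curve:
        if event_time <= t:
            s = value            # the curve drops at each event time
        else:
            break                # later events do not affect S(t)
    return s


km_curve = [(3, 0.833), (13, 0.625), (18, 0.3125)]

for month in (0, 3, 12, 24):
    print(month, survival_at(km_curve, month))

cumulative_incidence_24 = 1 - survival_at(km_curve, 24)
print("Cumulative incidence up to 24 months:", cumulative_incidence_24)
```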
[Figure: Cumulative Hazards for Coronary Heart Disease and Stroke in the Women’s Health Initiative Randomized Controlled Trial; curves: CEE vs. Placebo. EXPERIMENTAL STUDY. (The WHI Steering Committee. JAMA 2004;291:1701-1712)]
[Figure: the same survival function (event times 3, 13 and 18 months) plotted with two vertical scales: Cumulative Survival (1.00 at the start, then 0.8, 0.6, 0.3) and Cumulative Hazard (0.2, 0.4, 0.7); x-axis: Month of follow-up (0 to 25).]
The cumulative incidence (labeled “cumulative hazard” on the plot) at the end of 24 months: 1 − 0.3 = 0.7 (or 70%); with the unrounded survival, 1 − 0.3125 = 0.6875.
ACTUARIAL LIFE TABLE VS KAPLAN-MEIER
If N is large and/or if life-table intervals are small, results are similar
• Example: survival after diagnosis of Ewing’s sarcoma [figure not reproduced]
ASSUMPTIONS IN KAPLAN-MEIER SURVIVAL ESTIMATES
• No secular trends (if individuals are recruited over a long period of time)
[Figure: the same follow-up shown on two time scales, Calendar time and Follow-up time.]
ASSUMPTIONS IN SURVIVAL ESTIMATES (Cont’d)
• Censoring is independent of survival (uninformative censoring): Those censored at time t have the same prognosis as those remaining.
Types of censoring:
• Lost to follow-up
– Migration
– Refusal
• Death (from another cause)
• Administrative withdrawal (study finished)
Calculation of incidence: Strategy #2
ANALYSIS BASED ON PERSON-TIME
CALCULATION OF PERSON-TIME AND INCIDENCE RATES (Unit of analysis: time)
Example 1
Observe 1st graders for a total of 500 hours
Observe 12 accidents
Accident rate:
$$R = \frac{12}{500} = 0.024\ \text{per person-hour}$$
IT IS NOT KNOWN WHETHER 500 CHILDREN WERE OBSERVED FOR 1 HOUR, OR 250 CHILDREN OBSERVED FOR 2 HOURS, OR 100 CHILDREN OBSERVED FOR 5 HOURS… ETC.
CALCULATION OF PERSON-TIME AND INCIDENCE RATES
Example 2
Step 1: Calculate the denominator, i.e., the units of time (years) contributed by each individual, and the total:
             No. of person-years in
Person ID    1st FU year    2nd FU year    Total FU
6            3/12=0.25      0              0.25
2            6/12=0.50      0              0.50
5            12/12=1.00     1/12=0.08      1.08
4            12/12=1.00     3/12=0.25      1.25
3            12/12=1.00     6/12=0.50      1.50
1            12/12=1.00     12/12=1.00     2.00
Total        4.75           1.83           6.58
Step 2: Calculate the rate per person-year for the total follow-up period:
$$R = \frac{3}{6.58} = 0.46\ \text{per person-year}$$
It is also possible to calculate the incidence rates per person-year separately for shorter periods during the follow-up:
For year 1:
$$R = \frac{1}{4.75} = 0.21\ \text{per person-year}$$
For year 2:
$$R = \frac{2}{1.83} = 1.09\ \text{per person-year}$$
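The person-time bookkeeping of Example 2 can be sketched in Python. The helper name `person_years_by_year` is hypothetical; the follow-up times and death times (in months) are those of the 6-patient example, and a death at month m is assigned to the follow-up year in which it occurs.

```python
def person_years_by_year(months, n_years=2):
    """Split each follow-up time (in months) into person-years per follow-up year."""
    totals = [0.0] * n_years
    for m in months:
        for year in range(n_years):
            # Months this person spent inside follow-up year `year` (0-based).
            in_year = min(max(m - 12 * year, 0), 12)
            totals[year] += in_year / 12
    return totals


months = [24, 6, 18, 15, 13, 3]     # follow-up of persons 1 to 6
death_months = [18, 13, 3]          # persons 3, 5 and 6

py = person_years_by_year(months)
# A death at month m falls in follow-up year (m - 1) // 12 (0-based).
deaths = [sum(1 for m in death_months if (m - 1) // 12 == y) for y in range(2)]

rates = [d / p for d, p in zip(deaths, py)]
overall = sum(deaths) / sum(py)

print("Person-years per year:", [round(p, 2) for p in py])
print("Rates per person-year:", [round(r, 2) for r in rates])
print("Overall rate:", round(overall, 2))
```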
Notes:
• Rates have units (time⁻¹).
• Proportions (e.g., cumulative incidence) are unitless.
• Like velocity, a rate is an instantaneous concept. The choice of time unit used to express it is totally arbitrary. E.g.:
0.024 per person-hour = 0.576 per person-day = 210.2 per person-year
0.46 per person-year = 4.6 per person-decade
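The unit conversions in these notes are plain multiplications, which a few lines of Python can confirm. The constants assume 24 hours per day and 365 days per year, which is what the figures in the note imply.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365    # assumption: non-leap year
YEARS_PER_DECADE = 10

# Example 1: 12 accidents in 500 person-hours.
rate_per_person_hour = 12 / 500
rate_per_person_day = rate_per_person_hour * HOURS_PER_DAY
rate_per_person_year = rate_per_person_day * DAYS_PER_YEAR

# Example 2: 0.46 per person-year.
rate_per_person_decade = 0.46 * YEARS_PER_DECADE

print(f"{rate_per_person_hour:.3f} per person-hour")
print(f"{rate_per_person_day:.3f} per person-day")
print(f"{rate_per_person_year:.1f} per person-year")
print(f"{rate_per_person_decade:.1f} per person-decade")
```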
Person No. Year 1 Year 2 Total
1 1/12= 0.08 (D) 0 0.08
2 2/12= 0.17 (C) 0 0.17
3 3/12= 0.25 (C) 0 0.25
4 4/12= 0.33 (C) 0 0.33
5 5/12= 0.42 (C) 0 0.42
6 6/12= 0.50 (D) 0 0.50
7 7/12= 0.58 (C) 0 0.58
8 8/12= 0.67 (C) 0 0.67
9 9/12= 0.75 (C) 0 0.75
10 10/12= 0.83 (C) 0 0.83
11 11/12= 0.92 (C) 0 0.92
12 12/12= 1.00 (D) 0 1.00
13 12/12= 1.00 (C) 1/12= 0.08 (C) 1.08
14 12/12 = 1.00 (C) 2/12= 0.17 (C) 1.17
15 12/12 = 1.00 (C) 3/12= 0.25 (D) 1.25
16 12/12 = 1.00 4/12= 0.33 (C) 1.33
17 12/12 = 1.00 5/12= 0.42 (C) 1.42
18 12/12 = 1.00 6/12= 0.50 (C) 1.50
19 12/12 = 1.00 7/12= 0.58 (C) 1.58
20 12/12 = 1.00 8/12= 0.67 (C) 1.67
21 12/12 = 1.00 9/12= 0.75 (D) 1.75
22 12/12 = 1.00 10/12= 0.83 (C) 1.83
23 12/12 = 1.00 11/12= 0.92 (C) 1.92
24 12/12 = 1.00 12/12= 1.00 (C) 2.0
Total 18.5 6.5 25.0
Death rate per person-time (person-year): 5 deaths / 25.0 person-years = 0.20 per person-year, or 20 deaths per 100 person-years
Death rate per average population, estimated at the mid-point of follow-up:
Mid-point (median) population (as used when calculating yearly rates in vital statistics) = 12.5
Death rate = 5/12.5 per 2 years = 0.40
Average annual death rate = 0.40/2 = 0.20, or 20 per 100 population
Table entries: no. of person-years of follow-up. D, death; C, censored.
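$$\text{Rate} = \frac{\text{Events } (D)}{\text{No. of person-time}} = \frac{\text{Events } (D)}{\text{Population } (N) \times \text{Time } (t)} = \frac{\text{Events } (D) / \text{Population } (N)}{\text{Time } (t)}$$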
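The agreement between the person-time rate and the rate based on the mid-point population can be checked numerically for the 24-person example. In this sketch the follow-up times (1 to 24 months) and the deaths (persons 1, 6, 12, 15 and 21) are read from the table; the mid-point population of 12.5 is taken as given in the text rather than derived.

```python
# Person k (k = 1..24) is followed for k months.
follow_up_months = list(range(1, 25))
death_months = {1, 6, 12, 15, 21}      # persons marked (D) in the table

person_years = sum(follow_up_months) / 12
deaths = len(death_months)

# Rate with a person-time denominator.
rate_person_time = deaths / person_years

# Rate with the mid-point population as denominator, averaged per year.
midpoint_population = 12.5             # value stated in the text
period_years = 2
rate_midpoint = deaths / midpoint_population / period_years

print(f"Person-years: {person_years:.1f}")
print(f"Rate per person-year: {rate_person_time:.2f}")
print(f"Average annual rate (mid-point population): {rate_midpoint:.2f}")
```

Both approaches give 0.20 per year here because withdrawals and deaths are spread evenly over the 2 years, so population times time (12.5 × 2) equals the total person-time (25.0).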
Notes: Rates have an undesirable statistical property
• Rates can be more than 1.0 (100%):
– 1 person dies exactly after 6 months:
• No. of person-years: 1 × 0.5 years = 0.5 person-years
$$\text{Rate} = \frac{1}{0.5} = 2.0\ \text{per PY} = 200\ \text{per 100 PYs}$$
Use of person-time to account for changes in exposure status (Time-dependent exposures)
Example: Adjusting for age, are women after menopause at a higher risk for myocardial infarction?
[Figure: follow-up of 6 women (ID 1 to 6) over years of follow-up 1 to 10, showing pre- and post-menopausal time. Legend: event marker = myocardial infarction; C = censored observation.]
Number of PY in each group:
ID       No. PY PRE meno    No. PY POST meno
1        3                  4
2        0                  5
3        6                  0
4        0                  1
5        5                  5
6        3                  3
Total    17                 18
Rates per person-year:
Pre-menopausal = 1/17 = 0.06 (6 per 100 py)
Post-menopausal = 2/18 = 0.11 (11 per 100 py)
Rate ratio = (2/18) ÷ (1/17) = 1.89
Note: Event is assigned to exposure status when it occurs
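The allocation of person-time and events to a time-dependent exposure can be sketched as below. The person-years per woman are those given in the example, but the slides' figure does not survive in this text, so which women had the myocardial infarction is NOT known: the `mi` flags are hypothetical, chosen only so that the totals match the stated 1 pre-menopausal and 2 post-menopausal events. The `Person` class and `rates_by_exposure` function are likewise illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Person:
    py_pre: float    # person-years contributed before menopause
    py_post: float   # person-years contributed after menopause
    mi: bool         # True if follow-up ended with a myocardial infarction


def rates_by_exposure(people):
    """Rates per person-year by exposure status (exposure changes once: pre -> post)."""
    py = {"pre": 0.0, "post": 0.0}
    events = {"pre": 0, "post": 0}
    for p in people:
        py["pre"] += p.py_pre
        py["post"] += p.py_post
        if p.mi:
            # The event is assigned to the exposure status at the time it occurs:
            # post-menopausal if the woman contributed any post-menopausal time.
            status = "post" if p.py_post > 0 else "pre"
            events[status] += 1
    rates = {k: events[k] / py[k] for k in py}
    return rates, events, py


cohort = [
    Person(3, 4, True),    # ID 1 (event flag hypothetical)
    Person(0, 5, True),    # ID 2 (event flag hypothetical)
    Person(6, 0, True),    # ID 3 (event flag hypothetical)
    Person(0, 1, False),   # ID 4
    Person(5, 5, False),   # ID 5
    Person(3, 3, False),   # ID 6
]

rates, events, py = rates_by_exposure(cohort)
rate_ratio = rates["post"] / rates["pre"]
print(py, events)
print(f"Rate ratio (post vs. pre): {rate_ratio:.2f}")
```

A woman who passes through menopause during follow-up contributes person-time to both denominators, but her event is counted only once, in the group she belonged to when it happened.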
ASSUMPTIONS IN PERSON-TIME ESTIMATES
• Risk is constant within each interval for which person-time units are estimated (no cumulative effect):
– N individuals followed for t time = t individuals followed for N time
– However, are 10 smokers followed for 1 year comparable to 1 smoker followed for 10 years (both: 10 person-years)?
• No secular trends (if individuals are recruited over a relatively long time interval)
• Losses are independent of survival
Rate for 1st Year= 0.21/PY
Rate for 2nd Year= 1.09/ PY
Total for 2 years = 0.46/PY
SUMMARY OF ESTIMATES
Method                          Estimate                Value
Life-table                      q (2 years)             0.60
Life-table                      1 − [S(Y1) × S(Y2)]     0.65
Kaplan-Meier                    q (2 years)             0.69
Person-year                     Rate (yearly)           0.46/py
Midpoint (median) population    Rate (yearly)           0.43 per year
POINT PREVALENCE
Point Prevalence: “The number of affected persons present in the population at a specific time divided by the number of persons in the population at that time” (Gordis, 2000, p. 33)
Relation with incidence. Usual formula:
Point Prevalence = Incidence × Duration*, i.e., P = I × D
* Average duration (survival) after disease onset.
True formula:
$$\frac{\text{Prevalence}}{1 - \text{Prevalence}} = \text{Incidence} \times \text{Duration}$$
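Solving the true formula for prevalence gives P = (I × D) / (1 + I × D), which shows why the usual formula P = I × D is only adequate when prevalence is low. The sketch below compares the two; the function names and the incidence and duration values are hypothetical illustrations, not data from the slides, and the relation is usually taken to hold under steady-state conditions.

```python
def prevalence_exact(incidence, duration):
    """Prevalence implied by Prevalence / (1 - Prevalence) = Incidence x Duration."""
    prevalence_odds = incidence * duration
    return prevalence_odds / (1 + prevalence_odds)


def prevalence_approx(incidence, duration):
    """Usual approximation: Prevalence = Incidence x Duration."""
    return incidence * duration


# Hypothetical rare disease: 1 per 100 person-years, average duration 5 years.
rare = (prevalence_approx(0.01, 5), prevalence_exact(0.01, 5))
# Hypothetical common disease: 10 per 100 person-years, average duration 10 years.
common = (prevalence_approx(0.10, 10), prevalence_exact(0.10, 10))

print(f"Rare:   approx {rare[0]:.3f}, exact {rare[1]:.3f}")
print(f"Common: approx {common[0]:.3f}, exact {common[1]:.3f}")
```

For the rare disease the two formulas nearly agree; for the common one the approximation returns a prevalence of 1.0, while the exact relation gives 0.5.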
ODDS
Odds: The ratio of the probability of an event to that of the non-event.
$$\text{Odds} = \frac{\text{Prob}}{1 - \text{Prob}}$$
Example: The probability of an event (e.g., death, disease, recovery, etc.) is 0.20, and thus the odds is:
$$\text{Odds} = \frac{0.20}{1 - 0.20} = \frac{0.20}{0.80} = 1{:}4\ (\text{or } 0.25)$$
That is, for every person with the event, there are 4 persons without the event.
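The conversion between probability and odds, in both directions, can be written as two one-line Python functions. The function names are illustrative; the value 0.20 is the one used in the example.

```python
def odds_from_probability(p):
    """Odds = Prob / (1 - Prob); p must be strictly below 1."""
    return p / (1 - p)


def probability_from_odds(odds):
    """Inverse relation: Prob = Odds / (1 + Odds)."""
    return odds / (1 + odds)


odds = odds_from_probability(0.20)
back = probability_from_odds(odds)

print(f"Odds for p = 0.20: {odds:.2f}")          # 1 with the event per 4 without
print(f"Probability recovered from odds: {back:.2f}")
```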