Methods for Interval-Censored Failure Time Data and Beyond II
(Tony) Jianguo Sun
Department of Statistics, University of Missouri
September 19, 2008
OUTLINE
• I. Analysis of Bivariate Interval-Censored Data
  I.1. An Example: AIDS Clinical Trial
  I.2. Nonparametric Maximum Likelihood Estimation
  I.3. Estimation of the Association Parameter
  I.4. Regression Analysis
• II. Analysis of Doubly Censored Data
  II.1. An Example: AIDS Cohort Study
  II.2. Nonparametric Estimation of Survival Functions
  II.3. Nonparametric Comparison of Survival Functions
  II.4. Regression Analysis
• III. Other Topics and Future Research
  III.1. Analysis with Informative Interval Censoring
  III.2. Bayesian Analysis of Interval-Censored Data
  III.3. Some Future Research Directions
I. Analysis of Bivariate Interval-Censored Data
I.1. An Example — AIDS Clinical Trial
• Subjects: 204 HIV-infected individuals
• Study: a substudy of a comparative clinical trial of three
anti-pneumocystis drugs w.r.t. the opportunistic
infection cytomegalovirus (CMV)
• Variables of interest: times to the presence of CMV
in blood and urine
• Observations: blood and urine samples were collected
and tested every 4 or 12 weeks
• Bivariate interval-censored data
• Goggins and Finkelstein (2000).
Table 1: Observed intervals in weeks for blood and urine shedding times, along with the baseline CD4 status, from ACTG 181

 ID  LB  RB  LU  RU  CD4  |  ID  LB  RB  LU  RU  CD4
  1  11   -  11   -    1  |  45   6  10   0   2    1
  2  11   -  11   -    1  |  46   2   -   2   -    1
  3  11   -  11   -    0  |  47  13   -  13   -    0
  4  11   -   8  10    0  |  48  15   -   0   3    0
  5   7   -   6   8    1  |  49   8   -   0   1    0
  6  11   -  12   -    0  |  50  16   -   6   9    0
  7   8  12   8  10    1  |  51   5   -   0   1    1
  8  10   -  10   -    0  |  52   2   -   0   1    1
  9   6   -   6   -    1  |  53  13   -   0   1    0
 10   2   9   9  11    0  |  54  13   -  13   -    1
 ...                      |  ...
I.2. Nonparametric Maximum Likelihood Estimation
• Consider a survival study giving bivariate interval-censored data
$\{ U_i = (L_{1i}, R_{1i}] \times (L_{2i}, R_{2i}], \; i = 1, \ldots, n \}$.
• Let $F(t_1, t_2) = P(T_{1i} \le t_1, T_{2i} \le t_2)$ denote the cdf and
$H = \{ H_j = (r_{1j}, s_{1j}] \times (r_{2j}, s_{2j}], \; j = 1, \ldots, m \}$
the disjoint rectangles that constitute the regions of
possible support of the NMLE of $F$.
• Define $p_j = F(H_j)$ and $\alpha_{ij} = I(H_j \subseteq U_i)$.
• Then the likelihood function has the form
$$L(p) = \prod_{i=1}^{n} \sum_{j=1}^{m} \alpha_{ij} \, p_j$$
and the NMLE of $F$ can be obtained by maximizing $L(p)$
subject to $\sum_{j=1}^{m} p_j = 1$ and $p_j \ge 0$ for all $j$.
• How to determine H:
Betensky and Finkelstein (1999),
Gentleman and Vandal (2001, 2002),
Bogaerts and Lesaffre (2004).
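Once the rectangles H_j and the indicators α_ij are available, the constrained maximization of L(p) can be carried out with a self-consistency (EM) iteration. The following is a minimal sketch under that assumption; the function name `npmle_em`, the uniform starting value, and the stopping rule are illustrative choices and are not taken from the slides.

```python
def npmle_em(alpha, tol=1e-10, max_iter=10000):
    """Self-consistency (EM) iteration for maximizing
    L(p) = prod_i sum_j alpha[i][j] * p[j]
    subject to sum_j p[j] = 1 and p[j] >= 0.

    alpha is an n x m list of 0/1 indicators, alpha[i][j] = I(H_j inside U_i).
    Every row is assumed to contain at least one 1.
    """
    n, m = len(alpha), len(alpha[0])
    p = [1.0 / m] * m  # uniform start (an arbitrary, illustrative choice)
    for _ in range(max_iter):
        # Likelihood contribution of each subject under the current p.
        denom = [sum(a * pj for a, pj in zip(row, p)) for row in alpha]
        # E-step and M-step combined: average posterior membership of H_j.
        new_p = [
            sum(alpha[i][j] * p[j] / denom[i] for i in range(n)) / n
            for j in range(m)
        ]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


# Toy example: four subjects, two candidate rectangles.
# Here L(p) = p1 * 1 * p2 * p2, whose maximizer is p = (1/3, 2/3).
toy_alpha = [[1, 0], [1, 1], [0, 1], [0, 1]]
toy_p = npmle_em(toy_alpha)
```

The EM update keeps the masses nonnegative and summing to one at every step, so no explicit constraint handling is needed; faster alternatives exist but are beyond this sketch.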
I.3. Estimation of the Association Parameter
— A copula model approach
• Consider a survival study giving bivariate interval-censored data
for $(T_1, T_2)$ whose joint survival function is given by
$$S(t_1, t_2) = C_\alpha(S_1(t_1), S_2(t_2)),$$
where $\alpha$ is a global association parameter.
• Let $l(\alpha, S_1, S_2)$ denote the log likelihood function and
$\hat S_1$ and $\hat S_2$ the marginal MLEs of $S_1$ and $S_2$. Then one can
estimate $\alpha$ by solving the equation
$$\frac{\partial l(\alpha, \hat S_1, \hat S_2)}{\partial \alpha} = 0.$$
• Wang and Ding (2000), Sun, Wang and Sun (2006).
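A minimal sketch of this two-stage idea, under assumptions that go beyond the slides: the copula is taken to be the Clayton family (the slides do not fix a family), the marginal estimates are assumed to be already evaluated at each subject's interval endpoints, and the score equation is replaced by a direct one-dimensional maximization of the log likelihood in α, which has the same solution when the maximizer is interior. All function names are hypothetical.

```python
import math


def clayton(u, v, alpha):
    """Clayton copula C_alpha(u, v) for alpha > 0 (assumed family)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -alpha + v ** -alpha - 1.0) ** (-1.0 / alpha)


def rect_prob(s1l, s1r, s2l, s2r, alpha):
    """P(L1 < T1 <= R1, L2 < T2 <= R2) from the joint survival function,
    given the marginal survival values at the four endpoints."""
    return (clayton(s1l, s2l, alpha) - clayton(s1r, s2l, alpha)
            - clayton(s1l, s2r, alpha) + clayton(s1r, s2r, alpha))


def pseudo_loglik(alpha, data):
    """Log likelihood in alpha with the marginal estimates plugged in.
    data holds tuples (S1_hat(L1), S1_hat(R1), S2_hat(L2), S2_hat(R2));
    use 0.0 for a right-censored (infinite) right endpoint."""
    total = 0.0
    for s1l, s1r, s2l, s2r in data:
        prob = rect_prob(s1l, s1r, s2l, s2r, alpha)
        if prob <= 0.0:
            return float("-inf")
        total += math.log(prob)
    return total


def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of f on [lo, hi];
    assumes f is unimodal there."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) > f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


def estimate_alpha(data, lo=0.05, hi=10.0):
    """Second-stage estimate of alpha over an assumed search range."""
    return golden_max(lambda a: pseudo_loglik(a, data), lo, hi)
```

The search range for α is arbitrary; a maximizer that lands on either bound suggests widening the range or reconsidering the copula family.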
— An imputation approach
• Instead of the copula model approach, one can directly estimate Kendall’s τ:
• Step 1: Estimate the joint survival function of (T1, T2).
• Step 2: Impute the exact failure times M times.
• Step 3: Calculate the empirical Kendall’s τ for each of M sets
of the imputed data.
• Step 4: Estimate the Kendall’s τ by the average of the empirical
estimates.
• Betensky and Finkelstein (1999).
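Steps 2 to 4 can be sketched as follows. This is an illustration only: it assumes that Step 1 has already produced a discrete joint estimate stored as a list of support points with masses (a hypothetical representation), and it imputes each subject's pair of failure times from that estimate restricted to the subject's observed rectangle. The function names and the default M are assumptions.

```python
import random


def kendall_tau(pairs):
    """Empirical Kendall's tau (tau-a) for a list of (t1, t2) pairs."""
    n = len(pairs)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


def impute_once(rects, support, rng):
    """Step 2: draw one exact (t1, t2) per subject from the estimated joint
    distribution conditional on the subject's rectangle (L1, R1] x (L2, R2].
    Each rectangle is assumed to contain at least one support point."""
    imputed = []
    for l1, r1, l2, r2 in rects:
        inside = [(t1, t2, w) for t1, t2, w in support
                  if l1 < t1 <= r1 and l2 < t2 <= r2 and w > 0]
        points = [(t1, t2) for t1, t2, _ in inside]
        weights = [w for _, _, w in inside]
        imputed.append(rng.choices(points, weights=weights, k=1)[0])
    return imputed


def tau_by_imputation(rects, support, M=50, seed=0):
    """Steps 3 and 4: average the empirical tau over M imputed data sets."""
    rng = random.Random(seed)
    taus = [kendall_tau(impute_once(rects, support, rng)) for _ in range(M)]
    return sum(taus) / M
```

A fixed seed is used so that repeated runs give the same estimate; in practice M should be large enough that the average is stable across seeds.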
I.4. Regression Analysis
— Observed data and models
• Consider a survival study involving $K$ possibly correlated
failure times $(T_1, \ldots, T_K)$ and $n$ independent subjects.
• Assume that for each $T_k$, only an interval $(L_k, R_k]$ is observed,
giving $T_k \in (L_k, R_k]$. So the observed data have the form
$\{ (L_{1i}, R_{1i}], \ldots, (L_{Ki}, R_{Ki}], Z_i ; \; i = 1, \ldots, n \}$.
• Let $\lambda_k(t; Z)$ and $S_k(t; Z)$ denote the marginal hazard and
survival functions of $T_k$ given covariates $Z$, respectively.
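• The PH model:
$$\lambda_k(t; Z) = \lambda_{k0}(t) \exp(Z'\beta).$$
• The PO model:
$$\frac{S_k(t; Z)}{1 - S_k(t; Z)} = e^{-Z'\beta} \, \frac{S_k(t; Z = 0)}{1 - S_k(t; Z = 0)},$$
that is, with $S_{k0}(t) = S_k(t; Z = 0)$,
$$\mathrm{logit}[S_k(t; Z)] = \mathrm{logit}[S_{k0}(t)] - Z'\beta.$$
• The AH model:
$$\lambda_k(t; Z) = \lambda_{k0}(t) + Z'\beta.$$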
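To make the three models concrete, the sketch below converts a baseline survival probability into the survival probability for a subject with linear predictor Z'β, assuming time-independent covariates. The function names are hypothetical; the formulas follow from integrating the hazards above (PH and AH) and from solving the odds relation (PO).

```python
import math


def surv_ph(s0, lin):
    """PH model: S(t; Z) = S0(t) ** exp(Z'beta)."""
    return s0 ** math.exp(lin)


def surv_po(s0, lin):
    """PO model: survival odds are scaled by exp(-Z'beta).
    Requires 0 <= s0 < 1."""
    odds = math.exp(-lin) * s0 / (1.0 - s0)
    return odds / (1.0 + odds)


def surv_ah(s0, lin, t):
    """AH model: the cumulative hazard gains the term Z'beta * t,
    so S(t; Z) = S0(t) * exp(-Z'beta * t)."""
    return s0 * math.exp(-lin * t)


# Illustrative values only: baseline survival 0.5 at t = 2, Z'beta = 0.3.
example = {
    "PH": surv_ph(0.5, 0.3),
    "PO": surv_po(0.5, 0.3),
    "AH": surv_ah(0.5, 0.3, 2.0),
}
```

With Z'β = 0 all three functions return the baseline value, which is a quick sanity check on the sign conventions.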
— A marginal inference procedure
• Assume that the $T_k$'s are discrete variables. For the analysis,
note that if $T_1, \ldots, T_K$ are independent, the log-likelihood
is proportional to
$$l(\beta, A_1, \ldots, A_K) = \sum_{k=1}^{K} \sum_{i=1}^{n} \log \{ L_{ik}(\beta, A_k) \},$$
where $L_{ik}$ denotes the marginal likelihood on $T_k$ from subject $i$,
$A_k(t) = S_{k0}(t) / \{1 - S_{k0}(t)\}$ for the PO model, or
$A_k(t) = \int_0^t \lambda_{k0}(s) \, ds$ for the PH or AH model.
• Thus one can estimate $\beta$ and the $A_k$'s by maximizing
$l(\beta, A_1, \ldots, A_K)$.
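As an illustration of the working-independence log likelihood, the sketch below evaluates it under the PH model, where the marginal contribution of subject i for failure time k is S_k(L; Z) − S_k(R; Z) with S_k(t; Z) = exp(−A_k(t) exp(Z'β)). It is a simplification: the cumulative baseline hazards A_k are passed in as known Python functions, whereas in the procedure above they are unknown and estimated together with β. The function name and data layout are assumptions.

```python
import math


def working_loglik(beta, cum_hazards, data):
    """Sum over subjects i and failure times k of log L_ik under the PH model.

    beta        : list of regression coefficients
    cum_hazards : list of K callables, cum_hazards[k](t) = A_k(t)
    data        : list of (z, intervals), where z is the covariate list and
                  intervals[k] = (L, R]; use float("inf") for R when the
                  observation is right-censored.
    """
    total = 0.0
    for z, intervals in data:
        risk = math.exp(sum(b * x for b, x in zip(beta, z)))
        for cum_haz, (left, right) in zip(cum_hazards, intervals):
            s_left = math.exp(-cum_haz(left) * risk)
            if math.isinf(right):
                s_right = 0.0
            else:
                s_right = math.exp(-cum_haz(right) * risk)
            # P(L < T_k <= R | Z) = S_k(L; Z) - S_k(R; Z)
            total += math.log(s_left - s_right)
    return total


# Illustrative call with K = 2 linear cumulative hazards (made-up values).
toy_hazards = [lambda t: 0.1 * t, lambda t: 0.2 * t]
toy_data = [([1.0], [(2.0, 5.0), (0.0, 3.0)]),
            ([0.0], [(4.0, float("inf")), (1.0, 2.0)])]
toy_value = working_loglik([0.5], toy_hazards, toy_data)
```

A numerical optimizer applied to this function over β (and over a finite-dimensional representation of the A_k) would give the marginal estimates; that step is omitted here.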
• Let $\hat\beta$ denote the estimate of $\beta$ defined above. Then
under certain conditions, $\hat\beta$ is consistent and, for large $n$, one can
approximate its distribution by the normal distribution
with the covariance matrix consistently estimated by
$$I^{-1}(\hat\beta, \hat A_k) \, D(\hat\beta, \hat A_k) \, I^{-1}(\hat\beta, \hat A_k).$$
• Goggins and Finkelstein (2000), Kim and Xue (2002),
Chen, Tong and Sun (2007),
Tong, Chen and Sun (2008).
• Bogaerts, Leroy, Lesaffre and Declerck (2002),
He and Lawless (2003).
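The covariance estimate above has the usual sandwich form. The slides do not define I and D, so the sketch below rests on a common convention that should be read as an assumption: I is the information matrix of the working-independence log likelihood for β, and D is the sum over subjects of the outer products of subject-level score vectors (each summed over the K failure times). Helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_inv(a):
    """Inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def sandwich(info, subject_scores):
    """Robust covariance I^{-1} D I^{-1}.

    info           : p x p information matrix I (assumed already evaluated
                     at the estimates)
    subject_scores : list of length-p score vectors, one per subject, each
                     summed over that subject's K failure times
    """
    p = len(info)
    # D = sum_i s_i s_i' (assumed definition, see the lead-in).
    d = [[sum(s[a] * s[b] for s in subject_scores) for b in range(p)]
         for a in range(p)]
    inv = mat_inv(info)
    return mat_mul(mat_mul(inv, d), inv)
```

Summing the scores within a subject before forming the outer product is what lets the estimate account for correlation among that subject's K failure times.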
— An efficient estimation procedure
• Assume that $K = 2$ and the joint survival function of $(T_1, T_2)$ is specified by a copula model as
$$S(s, t) = C_\alpha(S_1(s), S_2(t)).$$
• Also assume that the marginal survival functions $S_1$ and $S_2$
follow the PH model with
$$S_k(t) = \exp(-\Lambda_{0k}(t) \exp(\beta' X)), \quad k = 1, 2.$$
• Let $l(\beta, \alpha, \Lambda_{01}, \Lambda_{02})$ denote the log likelihood function. For estimation of $\theta = (\beta, \alpha)$, one can derive the efficient score
function $l^*_\theta$ for $\theta$ and solve $l^*_\theta(\theta, \Lambda_{01}, \Lambda_{02}) = 0$.
• Wang, Sun and Tong (2008) investigated this for bivariate case I interval-censored data and showed that the resulting estimates are consistent and efficient.
II. Analysis of Doubly Censored Data
II.1. An Example — AIDS Cohort Study
• Subjects: 257 individuals with hemophilia who were treated with
HIV-contaminated blood between 1978 and August 1988
• Groups: heavily treated group, if the patient received at least
1000 µg/kg of the blood for at least ... between 1982 and 1985,
and lightly treated group, if the patient received less
• Variable of interest: AIDS latency time, from HIV infection to
AIDS diagnosis
• Interval-censored HIV infection times, right-censored AIDS
diagnosis time
• De Gruttola and Lagakos (1989), Kim, De Gruttola and
Lagakos (1993)
Table 2: Observed intervals on a 6-month scale, given by (L, R], for HIV infection time and observations (denoted by T, with starred numbers being right-censored times) for AIDS diagnosis time for some of the 188 HIV-infected patients (the numbers in parentheses are multiplicities)