Jul 11, 2018
Survival Analysis: Logrank Test
Lu Tian and Richard Olshen
Stanford University
Two-sample Comparison
Objective: to compare survival functions from two groups.
Requirement: the method should be nonparametric and handle right censoring.
Two-sample comparisons
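KM estimators: $\hat S_1(\cdot)$ and $\hat S_0(\cdot)$
Possible test statistics:
$\sup_{t \in [0, \tau]} |\hat S_1(t) - \hat S_0(t)|$
$\int_0^\tau |\hat S_1(t) - \hat S_0(t)|\,dt$
$\int_0^\tau \{\hat S_1(t) - \hat S_0(t)\}\,dt$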
The null distributions are complex.
Logrank Test
The most popular method is the logrank test.
1. Adapted from the stratified test for 2 by 2 contingency tables (Mantel, 1966)
2. Has a nice relationship with the proportional hazards model
3. Targets the hazard function (not the survival function).
Logrank test
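$\tau_1 < \tau_2 < \cdots < \tau_k$ are the distinct failure times
$Y_i(\tau_j)$ = # of persons in group $i$ at risk at $\tau_j$
$Y(\tau_j) = Y_0(\tau_j) + Y_1(\tau_j)$, the total # of subjects at risk at $\tau_j$
$d_{ij}$ = # of failures in group $i$ at $\tau_j$
$d_j = d_{0j} + d_{1j}$, the total # of failures at $\tau_j$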
Two by two table
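At time $\tau_j$:

           observed to fail at $\tau_j$    not failing at $\tau_j$     at risk at $\tau_j$
group 0    $d_{0j}$                        $Y_0(\tau_j) - d_{0j}$      $Y_0(\tau_j)$
group 1    $d_{1j}$                        $Y_1(\tau_j) - d_{1j}$      $Y_1(\tau_j)$
total      $d_j$                           $Y(\tau_j) - d_j$           $Y(\tau_j)$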
Logrank test
Under the null hypothesis $H_0: S_1(t) = S_0(t)$, $0 < t < \infty$, $d_{1j}$ has the hypergeometric distribution conditional on the margins $\{Y_0(\tau_j), Y_1(\tau_j), d_j, Y(\tau_j) - d_j\}$:
$$\mathrm{pr}(d_{1j} = d) = \binom{d_j}{d} \binom{Y(\tau_j) - d_j}{Y_1(\tau_j) - d} \bigg/ \binom{Y(\tau_j)}{Y_1(\tau_j)}$$
The hypergeometric distribution is a discrete probability distribution that describes the probability of $d_1$ successes in $Y_1$ draws without replacement from a finite population of size $Y$ containing exactly $d$ successes. This is in contrast to the binomial distribution, which describes the probability of $d_1$ successes in $Y_1$ draws with replacement.
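The conditional probability above can be evaluated exactly with binomial coefficients. The following is a minimal sketch in plain Python; the function name `hypergeom_pmf` and the margins in the usage lines (9 at risk, 5 of them in group 1, 3 failures) are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(d, d_j, y1, y):
    """pr(d_1j = d) given d_j failures among y at risk, y1 of whom are in group 1."""
    # Outside the support the probability is zero (math.comb rejects negative arguments).
    if d < 0 or d > d_j or y1 - d < 0 or y1 - d > y - d_j:
        return Fraction(0)
    return Fraction(comb(d_j, d) * comb(y - d_j, y1 - d), comb(y, y1))

# Hypothetical margins: Y = 9, Y1 = 5, d_j = 3.
pmf = [hypergeom_pmf(d, d_j=3, y1=5, y=9) for d in range(4)]
print(pmf)
```

Exact fractions are used so that the probabilities can be checked to sum to one without rounding error.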
Logrank test
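$$E(d_{1j} \mid \text{marginals}) = \left(\frac{Y_1(\tau_j)}{Y(\tau_j)}\right) d_j$$
$$\mathrm{Var}(d_{1j} \mid \text{marginals}) = \frac{Y(\tau_j) - Y_1(\tau_j)}{Y(\tau_j) - 1}\, Y_1(\tau_j) \left(\frac{d_j}{Y(\tau_j)}\right)\left(1 - \frac{d_j}{Y(\tau_j)}\right) = \frac{Y_0(\tau_j) Y_1(\tau_j) d_j \{Y(\tau_j) - d_j\}}{Y(\tau_j)^2 \{Y(\tau_j) - 1\}}$$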
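As a sanity check, the closed-form mean and variance can be compared with moments obtained by enumerating the hypergeometric distribution. This is a sketch under assumed margins (9 at risk, 5 in group 1, 3 failures); the function names are hypothetical, and at least two subjects at risk are assumed so that the variance denominator is nonzero.

```python
from fractions import Fraction
from math import comb

def moments_by_enumeration(d_j, y1, y):
    """Mean and variance of d_1j computed directly from the hypergeometric pmf."""
    lo, hi = max(0, y1 - (y - d_j)), min(d_j, y1)
    probs = {d: Fraction(comb(d_j, d) * comb(y - d_j, y1 - d), comb(y, y1))
             for d in range(lo, hi + 1)}
    mean = sum(d * p for d, p in probs.items())
    var = sum((d - mean) ** 2 * p for d, p in probs.items())
    return mean, var

def moments_closed_form(d_j, y1, y):
    """Mean and variance from the formulas on this slide (requires y >= 2)."""
    y0 = y - y1
    mean = Fraction(d_j * y1, y)
    var = Fraction(y0 * y1 * d_j * (y - d_j), y * y * (y - 1))
    return mean, var

# Hypothetical margins: Y = 9, Y1 = 5, d_j = 3.
enumerated = moments_by_enumeration(3, 5, 9)
closed = moments_closed_form(3, 5, 9)
print(enumerated, closed)
```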
Logrank Test
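$O_j = d_{1j}$: observed number of failures in group 1
$E_j = d_j Y_1(\tau_j) / Y(\tau_j)$: expected number of failures
$V_j = \dfrac{Y_0(\tau_j) Y_1(\tau_j) d_j (Y(\tau_j) - d_j)}{Y(\tau_j)^2 (Y(\tau_j) - 1)}$: variance of the observed number of failures
The logrank test statistic:
$$Z = \frac{\sum_{j=1}^k (O_j - E_j)}{\sqrt{\sum_{j=1}^k V_j}} \sim N(0, 1) \text{ under } H_0$$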
Example
Data ("+" marks a right-censored observation):
Group 0: 3.1, 6.8+, 9, 9, 11.3+, 16.2
Group 1: 8.7, 9, 10.1+, 12.1+, 18.7, 23.1+
$k = 5$ and $\tau_1, \ldots, \tau_5 = 3.1, 8.7, 9, 16.2, 18.7$
Example
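Two by two tables (columns: failed at $\tau_j$, not failing at $\tau_j$, at risk at $\tau_j$):

$\tau_1 = 3.1$
group 0    1    5    6
group 1    0    6    6
total      1   11   12

$\tau_2 = 8.7$
group 0    0    4    4
group 1    1    5    6
total      1    9   10

$\tau_3 = 9$
group 0    2    2    4
group 1    1    4    5
total      3    6    9

$\tau_4 = 16.2$
group 0    1    0    1
group 1    0    2    2
total      1    2    3

$\tau_5 = 18.7$
group 0    0    0    0
group 1    1    1    2
total      1    1    2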
$O_j$:        0, 1, 1, 0, 1
$E_j$:        1/2, 6/10, 5/3, 2/3, 1
$O_j - E_j$:  -1/2, 4/10, -2/3, -2/3, 0
$V_j$:        1/4, 6/25, 5/9, 2/9, 0
$\sum_j (O_j - E_j) = -43/30$ and $\sum_j V_j = 1141/900$, so
$Z = -1.27$ (2-sided $P = .20$)
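The two by two tables and the statistic for this example can be recomputed directly from the raw data. The sketch below is one way to do it in plain Python; the function names are illustrative, censoring is encoded as an event indicator of 0, and the normal tail probability is obtained from `math.erf`.

```python
from fractions import Fraction
from math import erf, sqrt

# (time, event) pairs; event = 1 for an observed failure, 0 for a "+" (censored) time.
group0 = [(3.1, 1), (6.8, 0), (9, 1), (9, 1), (11.3, 0), (16.2, 1)]
group1 = [(8.7, 1), (9, 1), (10.1, 0), (12.1, 0), (18.7, 1), (23.1, 0)]

def logrank_tables(g0, g1):
    """Return one row (tau, d0, d1, y0, y1) per distinct failure time."""
    taus = sorted({t for t, e in g0 + g1 if e == 1})
    rows = []
    for tau in taus:
        d0 = sum(1 for t, e in g0 if e == 1 and t == tau)
        d1 = sum(1 for t, e in g1 if e == 1 and t == tau)
        y0 = sum(1 for t, _ in g0 if t >= tau)  # at risk in group 0
        y1 = sum(1 for t, _ in g1 if t >= tau)  # at risk in group 1
        rows.append((tau, d0, d1, y0, y1))
    return rows

def logrank_z(g0, g1):
    """Z = sum(O_j - E_j) / sqrt(sum V_j), with group 1 as the 'observed' group."""
    num, var = Fraction(0), Fraction(0)
    for _, d0, d1, y0, y1 in logrank_tables(g0, g1):
        d, y = d0 + d1, y0 + y1
        num += d1 - Fraction(d * y1, y)
        if y > 1:  # V_j is zero (and its formula undefined) when only one subject is at risk
            var += Fraction(y0 * y1 * d * (y - d), y * y * (y - 1))
    return float(num) / sqrt(float(var))

z = logrank_z(group0, group1)
p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
print(round(z, 2), round(p_two_sided, 2))
```

Exact fractions are kept until the final division so that the per-table values of $E_j$ and $V_j$ can be compared with hand calculations.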
Logrank test
Symmetric in the two groups
Only the ranks of the observations matter
The k two by two tables are treated as independent.
If the number of subjects at risk becomes zero in one group, the additional two by two tables do not contribute to the logrank test.
Logrank test
The power of the logrank test depends on the number of observed failures rather than the sample sizes.
The logrank test is most powerful for detecting the alternatives
$$H_1: S_1(t) = S_0(t)^{\exp(\theta)} \iff h_1(t) = h_0(t)e^{\theta}, \quad \theta \neq 0$$
The power of the logrank test under the alternative $h_1(t) = h_0(t)e^{\theta}$ is approximately
$$\Phi\left(|\theta| \sqrt{D \pi_0 (1 - \pi_0)} - 1.96\right),$$
where $D$ is the expected number of failures and $\pi_0$ is the proportion of patients in group 0. The constant 1.96 corresponds to a two-sided test at the 0.05 level.
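The power approximation is easy to evaluate numerically. Here is a small sketch; the function name `logrank_power` and the design in the usage lines (hazard ratio 0.5, equal allocation, 66 expected failures) are hypothetical and chosen only for illustration.

```python
from math import erf, log, sqrt

def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def logrank_power(theta, n_events, pi0, z_crit=1.96):
    """Approximate power Phi(|theta| * sqrt(D * pi0 * (1 - pi0)) - z_crit).

    theta    : log hazard ratio under the alternative
    n_events : D, the expected number of failures
    pi0      : proportion of patients in group 0
    z_crit   : critical value (1.96 for a two-sided 0.05-level test)
    """
    return normal_cdf(abs(theta) * sqrt(n_events * pi0 * (1 - pi0)) - z_crit)

# Hypothetical design: hazard ratio 0.5, 1:1 allocation, 66 expected failures.
power = logrank_power(theta=log(0.5), n_events=66, pi0=0.5)
print(round(power, 2))
```

Because only $D$ enters the formula, enrolling more patients without observing more failures does not raise the approximate power.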
Targeting the hazard function
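$$\sum_{j=1}^k (O_j - E_j) = \sum_{j=1}^k \left(d_{1j} - d_j \frac{Y_1(\tau_j)}{Y(\tau_j)}\right)$$
$$= \sum_{j=1}^k \frac{d_{1j}\{Y_1(\tau_j) + Y_0(\tau_j)\} - (d_{0j} + d_{1j}) Y_1(\tau_j)}{Y(\tau_j)}$$
$$= \sum_{j=1}^k \frac{Y_0(\tau_j) Y_1(\tau_j)}{Y(\tau_j)} \left(\frac{d_{1j}}{Y_1(\tau_j)} - \frac{d_{0j}}{Y_0(\tau_j)}\right)$$
$$= \int_0^\infty \frac{Y_0(s) Y_1(s)}{Y_0(s) + Y_1(s)}\, d\{\hat H_1(s) - \hat H_0(s)\}$$
where $\hat H_i(s)$ is the estimated cumulative hazard in group $i$, which jumps by $d_{ij}/Y_i(\tau_j)$ at $\tau_j$.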
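The identity between the sum of observed-minus-expected counts and the weighted difference in hazard increments can be checked numerically. The sketch below uses the two by two tables of the worked example, entered as rows $(d_{0j}, d_{1j}, Y_0(\tau_j), Y_1(\tau_j))$; the function names are hypothetical, and tables in which one group has nobody at risk are skipped in the second form because their weight $Y_0 Y_1 / Y$ is zero.

```python
from fractions import Fraction

# Rows (d0j, d1j, Y0, Y1) at each tau_j, copied from the worked example's tables.
tables = [(1, 0, 6, 6), (0, 1, 4, 6), (2, 1, 4, 5), (1, 0, 1, 2), (0, 1, 0, 2)]

def sum_observed_minus_expected(rows):
    """sum_j (O_j - E_j) with O_j = d1j and E_j = d_j * Y1 / Y."""
    total = Fraction(0)
    for d0, d1, y0, y1 in rows:
        total += d1 - Fraction((d0 + d1) * y1, y0 + y1)
    return total

def weighted_hazard_difference(rows):
    """sum_j (Y0 * Y1 / Y) * (d1j / Y1 - d0j / Y0)."""
    total = Fraction(0)
    for d0, d1, y0, y1 in rows:
        if y0 == 0 or y1 == 0:
            continue  # the weight Y0 * Y1 / Y vanishes, so the table contributes nothing
        weight = Fraction(y0 * y1, y0 + y1)
        total += weight * (Fraction(d1, y1) - Fraction(d0, y0))
    return total

lhs = sum_observed_minus_expected(tables)
rhs = weighted_hazard_difference(tables)
print(lhs, rhs)
```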
Weighted Logrank test
Attach weight $w_j$ to the two by two table at time $\tau_j$:
$$Z = \frac{\sum_j w_j (O_j - E_j)}{\sqrt{\sum_j w_j^2 V_j}}$$
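A weighted statistic of this form can be sketched as a function that takes the per-time tables and a weight rule. In the example below the rows $(d_{0j}, d_{1j}, Y_0(\tau_j), Y_1(\tau_j))$ are those of the worked example; the function name and the convention of passing the weight as a function of the at-risk counts are assumptions made for illustration.

```python
from fractions import Fraction
from math import sqrt

# Rows (d0j, d1j, Y0, Y1) at each tau_j, copied from the worked example's tables.
tables = [(1, 0, 6, 6), (0, 1, 4, 6), (2, 1, 4, 5), (1, 0, 1, 2), (0, 1, 0, 2)]

def weighted_logrank_z(rows, weight):
    """Z = sum_j w_j (O_j - E_j) / sqrt(sum_j w_j^2 V_j), with w_j = weight(Y0, Y1)."""
    num, var = Fraction(0), Fraction(0)
    for d0, d1, y0, y1 in rows:
        d, y = d0 + d1, y0 + y1
        if y < 2:
            continue  # with one subject at risk, O_j - E_j and V_j are both zero
        w = weight(y0, y1)
        num += w * (d1 - Fraction(d * y1, y))
        var += w * w * Fraction(y0 * y1 * d * (y - d), y * y * (y - 1))
    return float(num) / sqrt(float(var))

# w_j = 1 gives the ordinary logrank statistic.
z_logrank = weighted_logrank_z(tables, lambda y0, y1: 1)
# w_j = Y(tau_j) weights early failure times more heavily.
z_at_risk = weighted_logrank_z(tables, lambda y0, y1: y0 + y1)
print(round(z_logrank, 3), round(z_at_risk, 3))
```

Passing the weight as a function keeps the table loop unchanged when other weight choices are tried.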
Generalized Wilcoxon test
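Set $w_j = Y(\tau_j)$:
$$\sum_{j=1}^k w_j (O_j - E_j) = \sum_{j=1}^k \{d_{1j} Y_0(\tau_j) - d_{0j} Y_1(\tau_j)\} = \sum_{i,j} \{I(U_{i0} \ge U_{j1})\delta_{j1} - I(U_{j1} \ge U_{i0})\delta_{i0}\}$$
Here $U_{i0}$ and $\delta_{i0}$ are the observed time and failure indicator of subject $i$ in group 0, and $U_{j1}$, $\delta_{j1}$ those of subject $j$ in group 1.
The commonly used Wilcoxon test statistic without censoring is
$$\sum_{i,j} \{I(U_{i0} > U_{j1}) - 1/2\} = \frac{1}{2} \sum_{i,j} \{I(U_{i0} \ge U_{j1}) - I(U_{j1} \ge U_{i0})\}$$
Sensitive to early differences between the two hazard functions.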
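The equality between the table form and the pairwise form of the numerator can be verified on the example data. This is a minimal sketch with hypothetical function names; the data are the two samples of the worked example, with an event indicator of 0 for censored times.

```python
# (time, event) pairs; event = 1 for an observed failure, 0 for a censored time.
group0 = [(3.1, 1), (6.8, 0), (9, 1), (9, 1), (11.3, 0), (16.2, 1)]
group1 = [(8.7, 1), (9, 1), (10.1, 0), (12.1, 0), (18.7, 1), (23.1, 0)]

def numerator_from_tables(g0, g1):
    """sum_j {d1j * Y0(tau_j) - d0j * Y1(tau_j)} over the distinct failure times."""
    taus = sorted({t for t, e in g0 + g1 if e == 1})
    total = 0
    for tau in taus:
        d0 = sum(1 for t, e in g0 if e == 1 and t == tau)
        d1 = sum(1 for t, e in g1 if e == 1 and t == tau)
        y0 = sum(1 for t, _ in g0 if t >= tau)
        y1 = sum(1 for t, _ in g1 if t >= tau)
        total += d1 * y0 - d0 * y1
    return total

def numerator_from_pairs(g0, g1):
    """sum_{i,j} {I(U_i0 >= U_j1) * delta_j1 - I(U_j1 >= U_i0) * delta_i0}."""
    total = 0
    for u0, delta0 in g0:
        for u1, delta1 in g1:
            total += (u0 >= u1) * delta1 - (u1 >= u0) * delta0
    return total

from_tables = numerator_from_tables(group0, group1)
from_pairs = numerator_from_pairs(group0, group1)
print(from_tables, from_pairs)
```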
The Generalized Logrank test
In general, the test statistic has the form
$$Z_w = \frac{\int_0^\infty w(s)\, d\{\hat H_1(s) - \hat H_0(s)\}}{\hat\sigma_w}$$
The choice of the weight affects the power.
One may construct a test based on several different sets of weights, e.g.,
$$T = \max\{|Z_{w_1}|, |Z_{w_2}|, \cdots, |Z_{w_L}|\}.$$
More than two groups
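$H_0: S_0(\cdot) = S_1(\cdot) = \cdots = S_p(\cdot)$
At $\tau_j$:

           fail at $\tau_j$    not failing at $\tau_j$     at risk at $\tau_j$
Group 0    $d_{0j}$            $Y_0(\tau_j) - d_{0j}$      $Y_0(\tau_j)$
Group 1    $d_{1j}$            $Y_1(\tau_j) - d_{1j}$      $Y_1(\tau_j)$
Group 2    $d_{2j}$            $Y_2(\tau_j) - d_{2j}$      $Y_2(\tau_j)$
...        ...                 ...                         ...
Group p    $d_{pj}$            $Y_p(\tau_j) - d_{pj}$      $Y_p(\tau_j)$
total      $d_j$               $Y(\tau_j) - d_j$           $Y(\tau_j)$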
More than two groups
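$O_j = (d_{1j}, d_{2j}, \cdots, d_{pj})'$
$E_j = (E_{1j}, E_{2j}, \cdots, E_{pj})'$, where $E_{ij} = d_j Y_i(\tau_j) / Y(\tau_j)$.
$V_j = (V^{(j)}_{kl})_{p \times p}$, where
$$V^{(j)}_{kl} = \begin{cases} -\dfrac{d_j Y_k(\tau_j) Y_l(\tau_j) (Y(\tau_j) - d_j)}{Y(\tau_j)^2 (Y(\tau_j) - 1)} & \text{if } k \neq l \\[2ex] \dfrac{d_j Y_k(\tau_j) (Y(\tau_j) - d_j)(Y(\tau_j) - Y_k(\tau_j))}{Y(\tau_j)^2 (Y(\tau_j) - 1)} & \text{if } k = l \end{cases}$$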
More than two groups
The test statistic:
$$\left\{\sum_j (O_j - E_j)\right\}' \left\{\sum_j V_j\right\}^{-1} \left\{\sum_j (O_j - E_j)\right\} \sim \chi^2_p$$
under the null hypothesis.
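The quadratic form can be computed with a small exact linear solve. The following is a sketch only: groups 0 and 1 are the samples of the two-group worked example, while `group2` is hypothetical data invented here to supply a third sample; the helper `solve` is a plain Gauss-Jordan elimination and assumes the summed covariance matrix is nonsingular.

```python
from fractions import Fraction

# (time, event) pairs; event = 1 for a failure, 0 for a censored time.
group0 = [(3.1, 1), (6.8, 0), (9, 1), (9, 1), (11.3, 0), (16.2, 1)]
group1 = [(8.7, 1), (9, 1), (10.1, 0), (12.1, 0), (18.7, 1), (23.1, 0)]
group2 = [(4.2, 1), (7.5, 1), (9.6, 0), (13.0, 1), (20.4, 0)]  # hypothetical third group

def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a assumed nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * piv for x, piv in zip(m[r], m[col])]
    return [row[n] for row in m]

def k_sample_logrank(groups):
    """Chi-square statistic on p = len(groups) - 1 degrees of freedom; groups[0] is the reference."""
    p = len(groups) - 1
    taus = sorted({t for g in groups for t, e in g if e == 1})
    u = [Fraction(0)] * p                      # sum_j (O_j - E_j)
    v = [[Fraction(0)] * p for _ in range(p)]  # sum_j V_j
    for tau in taus:
        y_i = [sum(1 for t, _ in g if t >= tau) for g in groups]
        d_i = [sum(1 for t, e in g if e == 1 and t == tau) for g in groups]
        y, d = sum(y_i), sum(d_i)
        if y < 2:
            continue  # one subject at risk: O_j - E_j and V_j are both zero
        c = Fraction(d * (y - d), y * y * (y - 1))
        for k in range(1, p + 1):
            u[k - 1] += d_i[k] - Fraction(d * y_i[k], y)
            for l in range(1, p + 1):
                if k == l:
                    v[k - 1][l - 1] += c * y_i[k] * (y - y_i[k])
                else:
                    v[k - 1][l - 1] -= c * y_i[k] * y_i[l]
    x = solve(v, u)
    return float(sum(ui * xi for ui, xi in zip(u, x)))

chi2 = k_sample_logrank([group0, group1, group2])
print(round(chi2, 3))
```

With only two groups the statistic reduces to the square of the two-sample logrank $Z$, and the value does not depend on which group is treated as the reference.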
More than two groups
Trend test
$$\frac{\sum_j c'(O_j - E_j)}{\sqrt{\sum_j c' V_j c}} \sim N(0, 1)$$
under the null hypothesis. How to choose the vector $c$ to increase the power?