Examining the tests behind survival curves
Presented by Hans-Dieter Spies at PhUSE, London 2014
© 2014 inVentiv Health. All rights reserved.
Survival Curves: Agenda
• Introduction
• Kaplan-Meier Method
  › Kaplan-Meier Estimator
  › Example
• Log-Rank Test
  › Log-Rank Statistic
  › Example
• Some Further Aspects
• Conclusion
Survival Curves: Introduction
Time to event
group A | group B
3       | 15
6 *     | 22
10      | 28
15      | 36
32 *    | 40
Example: 2 groups, each with 5 subjects.
An * (asterisk) after a number marks a subject who was under observation for that long without an event. These subjects are called censored.
Survival Curves: SAS Procedure PROC LIFETEST
The main theme of PhUSE will focus on 'Data Transparency'.
Here we look at what is behind PROC LIFETEST, and especially at how to calculate the log-rank test.
Survival Curves: Kaplan-Meier Estimator
The Kaplan–Meier estimator Ŝ(t) is defined as a product:
\[ \hat{S}(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i} \]
where Ŝ(0) = 1,
d_i = number of events that occur at t(i),
n_i = number of subjects at risk at t(i).
PROC LIFETEST uses the Kaplan–Meier Estimator.
Survival Curves: Calculation Kaplan-Meier Estimator
Group | Time to event | Subjects whose time period ends without an event | At risk n_i | Events d_i | KM estimator Ŝ(t)
A     | 3             | 0                                                | 5           | 1          | (5-1)/5 = 0.8000
A     | 6             | 1                                                | 4           | 0          | Ŝ(3) * (4-0)/4 = 0.8000
A     | 10            | 0                                                | 3           | 1          | Ŝ(6) * (3-1)/3 = 0.5333
A     | 15            | 0                                                | 2           | 1          | Ŝ(10) * (2-1)/2 = 0.2667
A     | 32            | 1                                                | 1           | 0          | Ŝ(15) * (1-0)/1 = 0.2667
\[ \hat{S}(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i} \]
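The product formula can be checked by hand with a few lines of code. The following is a minimal sketch, not what PROC LIFETEST does internally; the function name `kaplan_meier` and the 0/1 event coding are assumptions made for illustration. It uses the group A data from the example, where 6 and 32 are censored.

```python
def kaplan_meier(times, events):
    """Return [(time, n_at_risk, n_events, survival)] for each distinct time.

    times  -- observed times (event time or end of observation)
    events -- 1 if the subject had an event at that time, 0 if censored
    (hypothetical helper, written only to illustrate the product formula)
    """
    rows = []
    survival = 1.0  # S(0) = 1
    for t in sorted(set(times)):
        # subjects still under observation just before t
        n_i = sum(1 for u in times if u >= t)
        # events that occur exactly at t
        d_i = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= (n_i - d_i) / n_i
        rows.append((t, n_i, d_i, survival))
    return rows


# Group A from the example: 6 and 32 are censored observations.
times_a = [3, 6, 10, 15, 32]
events_a = [1, 0, 1, 1, 0]

km_a = kaplan_meier(times_a, events_a)
for t, n_i, d_i, s in km_a:
    print(f"t={t:>2}  n_i={n_i}  d_i={d_i}  S(t)={s:.4f}")
```

Censored times leave the estimate unchanged (the factor is 1) but reduce the number at risk for all later times.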
Survival Curves: SAS output
Let’s examine the output window of SAS:
Survival is the KM estimator for group A.
Survival Curves: Log-Rank Test
The log-rank test is the most commonly used statistical test for comparing the survival distributions of two or more groups (e.g. different treatment groups in a clinical trial).
The log-rank test is a non-parametric test.
\[ \text{Log-rank statistic} = \frac{(\text{Observed}_{\text{Group1}} - \text{Expected}_{\text{Group1}})^2}{\mathrm{Var}(\text{Observed}_{\text{Group1}} - \text{Expected}_{\text{Group1}})} \]
where (Observed – Expected) for each group is
\[ \sum_{\text{all events}} (\text{number of events} - \text{expected events}) \]
Survival Curves: Log-Rank Test
Test construction: Denote the distinct times of events as t1 < t2 < · · · < tk, and define
• dij = number of subjects in group i with an event (uncensored) at tj (i = 1, 2; j = 1, 2, ..., k)
• dj = d1j + d2j = total number of events at tj
• ni(tj) = number of subjects in group i who are at risk at tj (i = 1, 2; j = 1, 2, ..., k)
• n(tj) = n1(tj) + n2(tj) = total number of persons in both groups who are at risk at tj
The information at time tj can be summarized in the following 2x2 table:
        | observed event at tj | no event at tj | at risk at tj
group 1 | d1j                  | n1(tj) - d1j   | n1(tj)
group 2 | d2j                  | n2(tj) - d2j   | n2(tj)
This leads to the log-rank statistic for two groups, which is χ² distributed:
\[ \frac{\left( \sum_{j=1}^{k} \left( d_{ij} - \frac{n_i(t_j)\, d_j}{n(t_j)} \right) \right)^2}{\mathrm{Var}\left( \sum_{j=1}^{k} \left( d_{ij} - \frac{n_i(t_j)\, d_j}{n(t_j)} \right) \right)} \sim \chi^2 \]
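For reference, the expected value and variance behind the statistic can be written out with the symbols defined on this slide. This is a sketch under the usual assumption that, given the margins of the 2x2 table, d1j follows a hypergeometric distribution when both groups have the same survival distribution; it agrees with the column formulas used in the worked example.

```latex
\[
  e_{1j} = \mathrm{E}(d_{1j}) = \frac{d_j \, n_1(t_j)}{n(t_j)},
  \qquad
  \mathrm{Var}(d_{1j}) =
  \frac{d_j \,\bigl(n(t_j) - d_j\bigr)\, n_1(t_j)\, n_2(t_j)}
       {n(t_j)^2 \,\bigl(n(t_j) - 1\bigr)}
\]
\[
  \mathrm{Var}\Bigl(\sum_{j=1}^{k} (d_{1j} - e_{1j})\Bigr)
  = \sum_{j=1}^{k} \mathrm{Var}(d_{1j})
\]
```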
Survival Curves: Log-Rank Test
We have to consider all time points tj at which events take place. These are listed in the first column.
Time tj | Events at tj: A (d1j) | B (d2j) | Both groups (dj = d1j+d2j) | Subjects at risk shortly before tj: A (n1j) | B (n2j) | Both groups (nj = n1j+n2j)
3       | 1 | 0 | 1 | 5 | 5 | 10
10      | 1 | 0 | 1 | 3 | 5 | 8
15      | 1 | 1 | 2 | 2 | 5 | 7
22      | 0 | 1 | 1 | 1 | 4 | 5
28      | 0 | 1 | 1 | 1 | 3 | 4
36      | 0 | 1 | 1 | 0 | 2 | 2
40      | 0 | 1 | 1 | 0 | 1 | 1
Then we have to list all events at each of the time points tj for each group.
When we count the subjects at risk we have to consider the censored cases. Consider tj = 10. In group A, 1 subject has an event at tj = 3 and for another subject the observation period ends at tj = 6. So 2 subjects are no longer at risk at tj = 10, which gives n1 = 5-2 = 3. In group B all subjects are still at risk at tj = 10: n2 = 5.
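The counting rules above can be sketched in code. This is an illustrative sketch only; the helper names `events_at` and `at_risk` and the `(time, event flag)` coding are assumptions, with flag 1 for an event and 0 for a censored observation.

```python
# (time, event flag): 1 = event, 0 = censored (observation ended without event)
group_a = [(3, 1), (6, 0), (10, 1), (15, 1), (32, 0)]
group_b = [(15, 1), (22, 1), (28, 1), (36, 1), (40, 1)]


def events_at(group, t):
    """Number of events in the group that occur exactly at time t."""
    return sum(1 for u, e in group if u == t and e == 1)


def at_risk(group, t):
    """Number of subjects still under observation shortly before t."""
    return sum(1 for u, _ in group if u >= t)


# Only time points with at least one event enter the table.
event_times = sorted({t for t, e in group_a + group_b if e == 1})

table = []
for t in event_times:
    d1, d2 = events_at(group_a, t), events_at(group_b, t)
    n1, n2 = at_risk(group_a, t), at_risk(group_b, t)
    table.append((t, d1, d2, d1 + d2, n1, n2, n1 + n2))

for row in table:
    print(row)
```

The censored times 6 and 32 never appear as rows, but they lower the group A risk set for every later event time.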
Survival Curves: Log-Rank Test
To calculate the expected numbers, we distribute the events observed at tj across the groups in proportion to the number of subjects at risk at that time. For group A: e1j = dj * n1j/nj. The expected numbers of both groups sum up to the number of events of both groups.
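Column groups: events at tj (d1j, d2j, dj), subjects at risk (n1j, n2j, nj), expected numbers (e1j, e2j), observed - expected, variance.
Variance = (dj*(nj-dj)*n1j*n2j) / (nj*nj*(nj-1))

Time tj | d1j (A) | d2j (B) | dj (Both) | n1j (A) | n2j (B) | nj (Both) | e1j = dj*n1j/nj | e2j = dj*n2j/nj | d1j-e1j | d2j-e2j | Variance (Both)
3       | 1 | 0 | 1 | 5 | 5 | 10 | 0.5000 | 0.5000 | 0.5000  | -0.5000 | 0.2500
10      | 1 | 0 | 1 | 3 | 5 | 8  | 0.3750 | 0.6250 | 0.6250  | -0.6250 | 0.2344
15      | 1 | 1 | 2 | 2 | 5 | 7  | 0.5714 | 1.4286 | 0.4286  | -0.4286 | 0.3401
22      | 0 | 1 | 1 | 1 | 4 | 5  | 0.2000 | 0.8000 | -0.2000 | 0.2000  | 0.1600
28      | 0 | 1 | 1 | 1 | 3 | 4  | 0.2500 | 0.7500 | -0.2500 | 0.2500  | 0.1875
36      | 0 | 1 | 1 | 0 | 2 | 2  | 0      | 1      | 0       | 0       | 0
40      | 0 | 1 | 1 | 0 | 1 | 1  | 0      | 1      | 0       | 0       | 0
Sum     |   |   |   |   |   |    | 1.8964 | 6.1036 | 1.1036  | -1.1036 | 1.1720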
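The column formulas of this table can be reproduced with a short script. This is a minimal sketch, assuming the events and at-risk counts are typed in from the table; the function name `logrank_terms` is hypothetical. Setting the variance to 0 when only one subject is at risk is an assumption made to match the last row, since the formula would otherwise divide by zero.

```python
# (tj, d1j, d2j, n1j, n2j) taken from the table above
rows = [
    (3, 1, 0, 5, 5),
    (10, 1, 0, 3, 5),
    (15, 1, 1, 2, 5),
    (22, 0, 1, 1, 4),
    (28, 0, 1, 1, 3),
    (36, 0, 1, 0, 2),
    (40, 0, 1, 0, 1),
]


def logrank_terms(d1, d2, n1, n2):
    """Expected events per group and variance at one event time."""
    d, n = d1 + d2, n1 + n2
    e1 = d * n1 / n
    e2 = d * n2 / n
    # Assumption: with a single subject at risk the variance is taken as 0.
    var = d * (n - d) * n1 * n2 / (n * n * (n - 1)) if n > 1 else 0.0
    return e1, e2, var


sum_e1 = sum_e2 = sum_var = 0.0
sum_d1 = sum_d2 = 0
for _, d1, d2, n1, n2 in rows:
    e1, e2, var = logrank_terms(d1, d2, n1, n2)
    sum_e1 += e1
    sum_e2 += e2
    sum_var += var
    sum_d1 += d1
    sum_d2 += d2

print(f"expected A = {sum_e1:.4f}, expected B = {sum_e2:.4f}")
print(f"observed - expected (A) = {sum_d1 - sum_e1:.4f}")
print(f"variance = {sum_var:.4f}")
```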
Survival Curves: Log-Rank Test
Now we have to build the log-rank statistic:
        | Sum of events dg | Sum of expected events eg | Chi-square (dg-eg)^2 / var(dg-eg)
Group A | 3                | 1.8964                    | 1.0391
Group B | 5                | 6.1036                    | 1.0391
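The p-value can be calculated with the SAS function PROBCHI(), which returns the cumulative probability of the chi-square distribution up to the given value.
Note that when we calculate the p-value, we therefore set “p=1-probchi(1.0391,1)”, where the second argument is the number of degrees of freedom.
The result is p = 0.3080308805.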
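The same upper-tail probability can be cross-checked outside SAS. The sketch below is an illustration, not a replacement for PROBCHI(): it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so the upper tail equals erfc(sqrt(x/2)). The function name `chi2_upper_tail_1df` is hypothetical and the shortcut only holds for 1 degree of freedom.

```python
import math


def chi2_upper_tail_1df(x):
    """P(X > x) for a chi-square distribution with 1 degree of freedom.

    Uses X = Z**2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    Valid for 1 degree of freedom only.
    """
    return math.erfc(math.sqrt(x / 2.0))


chi_square = 1.0391  # log-rank statistic from the table above
p_value = chi2_upper_tail_1df(chi_square)
print(f"p = {p_value:.4f}")
```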
Survival Curves: SAS output Log-Rank Statistics
This shape shows the sum of the observed minus expected values.
This shape shows the variance.
This shape shows the value of the chi-square test statistic.
This shape shows the p-value (1 minus the chi-square probability of the test statistic).
Survival Curves: Some Further Aspects
• Log-rank test: frequently used. The test statistic follows the chi-square distribution.
• Wilcoxon test: similar to the log-rank test, but places more weight on the beginning of the curves. The test statistic also approximately follows the chi-square distribution.
• To analyse more than two groups, the separate SAS procedure PROC PHREG (Proportional Hazards Regression) can be used to perform a Cox regression to differentiate between different model approaches.
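To illustrate how the two tests differ, both can be written as one weighted statistic. The sketch below assumes Gehan's version of the Wilcoxon test, which weights each event time by the total number of subjects at risk, while the log-rank test uses weight 1 everywhere. The function name `weighted_statistic` is hypothetical, and the data are the events and at-risk counts from the worked example.

```python
# (tj, d1j, d2j, n1j, n2j) from the worked log-rank example
rows = [
    (3, 1, 0, 5, 5),
    (10, 1, 0, 3, 5),
    (15, 1, 1, 2, 5),
    (22, 0, 1, 1, 4),
    (28, 0, 1, 1, 3),
    (36, 0, 1, 0, 2),
    (40, 0, 1, 0, 1),
]


def weighted_statistic(rows, weight):
    """Weighted (observed - expected)^2 / variance for group 1.

    weight -- function of the total number at risk nj at an event time
    """
    u = 0.0
    var = 0.0
    for _, d1, d2, n1, n2 in rows:
        d, n = d1 + d2, n1 + n2
        if n < 2:
            # One subject at risk: observed equals expected, nothing to add.
            continue
        w = weight(n)
        u += w * (d1 - d * n1 / n)
        var += w * w * d * (n - d) * n1 * n2 / (n * n * (n - 1))
    return u * u / var


logrank = weighted_statistic(rows, lambda n: 1.0)        # all times equal
wilcoxon = weighted_statistic(rows, lambda n: float(n))  # early times count more

print(f"log-rank chi-square = {logrank:.4f}")
print(f"Wilcoxon chi-square = {wilcoxon:.4f}")
```

Because the number at risk is largest at the start of follow-up, the Wilcoxon weights emphasise early differences between the curves.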
Survival Curves: Conclusion
• Kaplan-Meier estimator Ŝ(t) as a product
• Log-rank statistic ~ χ² (chi-square distribution)
• Finally the procedure PROC LIFETEST was found to be very useful in obtaining all of these values.
Thank you!