“SURVIVAL ANALYSIS ON KIDNEY FAILURE FOR KIDNEY TRANSPLANT PATIENTS”
Summary
The accelerated failure time (AFT) model offers an easily interpreted way to describe survival regression data. It differs from the widely used and well-described Cox proportional hazards (PH) model in that it assumes the covariates act additively on the log failure time (that is, multiplicatively on the failure time itself) rather than multiplicatively on the hazard function. In this report, we present a semiparametric method (the Cox PH model) and parametric methods (AFT models) for analyzing survival data. The data set covers 469 patients with kidney transplants and records graft survival and time to graft failure. Eight covariates that might be related to the survival experience are considered.
Introduction
Survival analysis is a family of statistical methods for data in which the outcome variable of interest is the time until an event occurs. It is therefore also called "time-to-event analysis" and is used in many applied fields such as medicine, public health, social science, and engineering.
The Cox proportional hazards (PH) model is now the most widely used model for analyzing survival data in the presence of covariates or prognostic factors. It is popular because it is simple and does not require any assumption about the distribution of the survival times. The model assumes that the covariates act multiplicatively on an underlying baseline hazard, so that the hazards for different covariate values stay proportional over time, but it makes no assumption about the nature or shape of the baseline hazard function.
The accelerated failure time (AFT) model is an alternative method for the analysis of survival data. The AFT model assumes a specific parametric distribution for the failure times and a multiplicative effect of the covariates on the failure time. Its appeal lies in the ease of interpreting the results, because it models the effect of the covariates directly on the survival time instead of through the hazard function.
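In symbols, the two model families described above are usually written as follows. This is the standard textbook formulation, not notation taken from the report: $x$ denotes the covariate vector, $h_0(t)$ the unspecified baseline hazard, $\sigma$ a scale parameter, and $\varepsilon$ an error term whose distribution determines the parametric family (for example, extreme value for the Weibull and normal for the lognormal).

```latex
% Cox PH: covariates scale the hazard
h(t \mid x) = h_0(t)\,\exp(\beta^\top x)

% AFT: covariates shift log time, i.e. rescale time itself
\log T = \mu + \gamma^\top x + \sigma\,\varepsilon
\quad\Longleftrightarrow\quad
T = \exp(\mu + \gamma^\top x)\,T_0, \qquad T_0 = e^{\sigma\varepsilon}
```

Under this form, $\exp(\gamma_k)$ is the factor by which a one-unit increase in covariate $k$ multiplies the survival time, while $\exp(\beta_k)$ is the corresponding hazard ratio in the Cox model.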
Purpose and Methods
The purpose of this report is to analyze the data using both Cox models and AFT models. The methods are illustrated on a real, randomized data set of 469 patients with kidney transplants.
We start with the AFT models, comparing the AIC and BIC (information criteria used to compare model fit) of each candidate distribution with all covariates included. We then briefly discuss the possible types of response and prognostic variables. Covariates are selected by the backward selection procedure, and the final model is chosen based on the result.
The second part of the report covers Cox's PH model. We fit the Cox PH model using different methods for handling ties and compare the results with the exponential and Weibull parametric models. We then select the significant covariates and choose the final model.
The analysis was carried out in SAS, using proc lifereg for the AFT models and proc phreg for the Cox PH model.
Conclusion
We apply these methods to a randomized data set of 469 kidney transplant patients. Our conclusion is that Age, Diabetes (DBT) and ALG (an immune suppression drug) are the most significant covariates and are therefore the variables most likely to be predictive of the outcome under study. A major goal of this report is also to support the case for considering the AFT model as an alternative to the PH model in the analysis of some survival data, using this real data set.
In conclusion, although the Cox proportional hazards model tends to be more popular in the literature, the AFT model should also be considered when planning a survival analysis. It should go without saying that the choice should be driven by the desired outcome or the fit to the data, and never by which gives a significant P value for the predictor of interest. The choice should be dictated only by the research hypothesis and by which assumptions of the model are valid for the data being analyzed.
Analysis and procedure, with computation, output and interpretation
Data set
The data below are for 469 patients with kidney transplants. The primary interest was graft survival, and the time to graft failure was recorded in months (subject to right censoring). The study included measurements of many covariates that may be related to the survival experience. Both Cox's PH model and the accelerated failure time model are used to analyze the data.
The 10 variables included are (eight covariates, followed by the follow-up time MONTH and the failure indicator FAIL):
AGE: Age at transplant in years
SEX: 1=female, 0=male
DIALY: Duration of hemodialysis prior to transplant in days
DBT: Diabetes; 1=yes, 0=no
PTX: Number of prior transplants
BLOOD: Amount of blood transfusion, in blood units
MIS: Mismatch score
ALG: Use of ALG, an immune suppression drug; 1=yes, 0=no
MONTH: Duration of follow-up starting from transplant, in months
FAIL: Status of the new kidney; 1=new kidney failed, 0=functioning
A. AFT Models - Under AFT models we measure the direct effect of the explanatory variables on the survival time rather than on the hazard, as we do in the PH model. This allows for an easier interpretation of the results, because each parameter measures the effect of the corresponding covariate on the survival time. Currently, the AFT model is not commonly used for the analysis of clinical trial data, although it is fairly common in the field of manufacturing. Like the PH model, the AFT model describes the relationship between survival probabilities and a set of covariates.
Procedure-
Using proc lifereg in SAS, we fit each distribution (exponential, Weibull, lognormal, gamma and log-logistic) with all the covariates included and compare the resulting fit statistics.
SAS output-
Fit Statistics (for AIC, AICC and BIC, smaller is better)
Distribution    -2 Log Likelihood    AIC         AICC        BIC
Exponential     1266.509             1284.509    1284.901    1321.864
Weibull         1189.103             1209.103    1209.583    1250.609
Lognormal       1171.515             1191.515    1191.996    1233.021
Gamma           1166.625             1188.625    1189.202    1234.281
Loglogistic     1182.578             1202.578    1203.059    1244.085
Interpretation-
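From the above tables, the gamma distribution has the lowest AIC (1188.625) of the five candidates and is therefore the chosen distribution. The lognormal distribution has a slightly lower BIC (1233.021 versus 1234.281), so the two criteria do not fully agree.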
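The comparison above can be reproduced from the -2 log-likelihood values alone. The following is a minimal sketch, not part of the original SAS analysis. The parameter counts k are an assumption: they are inferred from the gap between the reported AIC and -2 log-likelihood, since the report does not state them.

```python
import math

N_OBS = 469  # number of patients, from the report

# -2 log-likelihood values as reported in the SAS output above.
# The parameter counts k are an assumption, inferred from
# (AIC - (-2 log L)) / 2 in the reported output.
FITS = {
    "Exponential": (1266.509, 9),
    "Weibull": (1189.103, 10),
    "Lognormal": (1171.515, 10),
    "Gamma": (1166.625, 11),
    "Loglogistic": (1182.578, 10),
}


def information_criteria(neg2loglik, k, n):
    """Return (AIC, AICC, BIC) for a model with k parameters and n observations."""
    aic = neg2loglik + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = neg2loglik + k * math.log(n)
    return aic, aicc, bic


CRITERIA = {
    name: information_criteria(neg2ll, k, N_OBS)
    for name, (neg2ll, k) in FITS.items()
}

# Smaller is better for every criterion.
best_by_aic = min(CRITERIA, key=lambda name: CRITERIA[name][0])
best_by_bic = min(CRITERIA, key=lambda name: CRITERIA[name][2])

if __name__ == "__main__":
    for name, (aic, aicc, bic) in CRITERIA.items():
        print(f"{name:12s} AIC={aic:.3f} AICC={aicc:.3f} BIC={bic:.3f}")
    print("lowest AIC:", best_by_aic)
    print("lowest BIC:", best_by_bic)
```

With these inputs the recomputed criteria agree with the reported SAS values to within rounding, which supports the inferred parameter counts.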
Choice of Covariates-
In this section we select the most significant of the eight covariates in the data set, using the backward selection procedure.
The backward selection procedure is an elimination process in which all the covariates are included in the model at the beginning and are removed one by one according to a significance criterion. The parameters that define the parametric model and the coefficients of all the covariates are estimated first. Then the Wald test is used to examine each covariate.
At each step we delete the predictor with the highest p-value and refit the model, repeating until every remaining predictor satisfies the significance criterion. The predictors that remain are our significant variables.
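The elimination loop described above can be sketched as follows. This is an illustration of the general procedure, not the SAS implementation used in the report. The callback `fit_pvalues` is hypothetical: it stands for whatever routine refits the model on the current covariates and returns their Wald p-values. The default threshold of 0.10 is an assumption for the AFT models, borrowed from the p<.10 criterion the report states for the final Cox model.

```python
from typing import Callable, Dict, List


def backward_select(
    covariates: List[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.10,
) -> List[str]:
    """Backward elimination driven by per-covariate Wald p-values.

    fit_pvalues is a hypothetical callback: given the covariates currently
    in the model, it refits the model and returns {covariate: p-value}.
    """
    kept = list(covariates)
    while kept:
        pvalues = fit_pvalues(kept)
        # Find the least significant covariate in the current model.
        worst = max(kept, key=lambda name: pvalues[name])
        if pvalues[worst] < alpha:
            break  # every remaining covariate meets the criterion
        kept.remove(worst)  # drop it and refit on the next pass
    return kept
```

The model is refitted after every deletion because removing one covariate changes the p-values of the others.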
To be thorough, we carried out the backward selection procedure for each distribution.
Age, Diabetes (DBT) and ALG (an immune suppression drug) are the three significant covariates.
The following tables show the selection process in detail for each distribution.
SAS Tables-
Exponential distribution-
Analysis of Maximum Likelihood Parameter Estimates
B. Cox PH Model - Non-parametric methods do not control for covariates and require categorical predictors. When we have several prognostic variables, we must use multivariable approaches. However, we cannot use multiple linear regression or logistic regression, because they cannot deal with censored observations. We need another method to model survival data in the presence of censoring. One very popular model for survival data is the Cox proportional hazards model.
Procedure-
We use proc phreg to estimate the regression coefficients (parameter estimates) under different methods for handling ties in the data. We then compare those results with two parametric models, the exponential and the Weibull.
Conclusion- In the Cox model, the regression coefficients of Age, duration of hemodialysis (DIALY), Diabetes (DBT) and BLOOD are all positive, indicating a higher hazard of graft failure as these covariates increase. The coefficients of SEX, PTX (number of prior transplants) and ALG (an immune suppression drug) are all negative, indicating a lower hazard.
Note that the exponential and Weibull fits give coefficients with the opposite signs. This does not contradict the Cox results: the parametric fits model the log survival time, so a negative coefficient there means a shorter survival time, which corresponds to a higher hazard. Read on that scale, Age, DIALY, DBT, BLOOD and MIS shorten graft survival, whereas SEX, PTX and ALG prolong it.
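The sign difference between the Cox and the parametric output can be seen directly in the Weibull case, the one family that is both a PH model and an AFT model. The identity below is a standard result, not taken from the report: $\gamma_k$ and $\sigma$ denote the AFT coefficient and scale in the log-time form $\log T = \mu + \gamma^\top x + \sigma\varepsilon$, and $\beta_k$ the corresponding log hazard ratio.

```latex
\beta_k = -\frac{\gamma_k}{\sigma}, \qquad \sigma > 0
```

Because $\sigma$ is positive, $\beta_k$ and $\gamma_k$ always have opposite signs. For the exponential model $\sigma = 1$, so $\beta_k = -\gamma_k$.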
The following SAS tables show these results in detail.
As in the AFT section, we now select only the significant covariates, using proc phreg with Breslow's method for ties (the default) and the backward selection method.
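For reference, the Breslow method handles tied failure times by approximating the partial likelihood with the expression below. This is the standard textbook form, not reproduced from the report: $t_1 < \dots < t_J$ are the distinct failure times, $D_j$ is the set of $d_j$ patients whose grafts fail at $t_j$, and $R_j$ is the set of patients still at risk just before $t_j$.

```latex
L_{\text{Breslow}}(\beta) = \prod_{j=1}^{J}
  \frac{\exp\!\big(\beta^\top \sum_{i \in D_j} x_i\big)}
       {\big[\sum_{l \in R_j} \exp(\beta^\top x_l)\big]^{d_j}}
```

When there are no ties, every $d_j$ equals 1 and this reduces to the usual Cox partial likelihood.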
Conclusion- AGE, DBT and ALG are the three covariates that are significant in our test results.
The following SAS tables give the detailed process of the backward selection procedure.
Using the Breslow approximation for ties and backward selection-
From the above table we obtain the significant covariates and can construct our final model.
The final model with significant (p<.10) covariates is-
log[h(t)/h0(t)] = 0.01827 AGE + 0.32304 DBT - 0.58475 ALG
Interpretation- The positive coefficients of AGE and DBT imply a higher hazard of graft failure with increasing age and with diabetes, whereas the negative coefficient of ALG implies a lower hazard for patients who received the drug.
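The coefficients of the final model are easier to read as hazard ratios, obtained by exponentiating them. The following is a minimal sketch using the three coefficients reported above. The helper name `hazard_ratio` and the 10-year age contrast are illustrative choices, not part of the original analysis.

```python
import math

# Coefficients of the final Cox PH model as reported above.
COEFFICIENTS = {"AGE": 0.01827, "DBT": 0.32304, "ALG": -0.58475}


def hazard_ratio(coef, delta=1.0):
    """Hazard ratio for a change of `delta` units in a covariate."""
    return math.exp(coef * delta)


# One-unit hazard ratios: per year of age, diabetes yes vs no, ALG yes vs no.
HAZARD_RATIOS = {name: hazard_ratio(b) for name, b in COEFFICIENTS.items()}

# Illustrative contrast: hazard ratio for a 10-year difference in age.
AGE_10_YEARS = hazard_ratio(COEFFICIENTS["AGE"], delta=10)

if __name__ == "__main__":
    for name, hr in HAZARD_RATIOS.items():
        print(f"{name}: hazard ratio = {hr:.3f}")
    print(f"AGE (+10 years): hazard ratio = {AGE_10_YEARS:.3f}")
```

This gives hazard ratios of roughly 1.02 per additional year of age, 1.38 for diabetes and 0.56 for ALG use, holding the other two covariates fixed.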