Biostat Methods STAT 5820/6910 – Handout #5: Logistic Regression
(with Overdispersion, Separation of Points, and Inverse Interval Estimation)
Example 1: 102 patients with acute myelogenous leukemia (AML) in remission were
enrolled in a study of a new anti-relapse treatment (ACT). Patients were randomly
assigned to receive a 10-day infusion of ACT or a placebo (PBO), and effects were
followed for 90 days. Of interest was whether or not a patient suffered a major
'relapse' event during the 90 days, defined as relapse, death, or major intervention such as bone
marrow transplant. The time in remission since diagnosis or prior relapse ('x', in months)
at study enrollment was considered an important covariate in predicting relapse. Is there
any evidence that ACT leads to a decreased relapse rate compared to PBO?
                         Relapse (y)
Treatment (trt)      No (0)    Yes (1)
PBO (0)                20         30
ACT (1)                29         23
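/* Define options */
ods html image_dpi=300 style=journal;
/* Read the data; the trailing @@ allows several patients per input line */
data aml;
 input group $ x relapse $ @@;
 trt = (group='ACT');   /* 1 = ACT, 0 = PBO */
 y = (relapse='Y');     /* 1 = relapse, 0 = no relapse */
 label x = 'Months in Remission';
cards;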
ACT 3 N ACT 3 Y ACT 3 Y ACT 6 Y ACT 15 N ACT 6 Y
ACT 6 Y ACT 6 Y ACT 15 N ACT 15 N ACT 12 N ACT 18 N
ACT 6 Y ACT 15 N ACT 6 Y ACT 15 N ACT 12 Y ACT 9 N
ACT 6 Y ACT 6 N ACT 6 N ACT 6 N ACT 3 Y ACT 18 N
ACT 9 N ACT 12 Y ACT 6 N ACT 9 Y ACT 9 Y ACT 3 N
ACT 9 Y ACT 12 N ACT 12 N ACT 3 N ACT 12 N ACT 12 N
ACT 12 N ACT 9 Y ACT 6 Y ACT 12 N ACT 6 N ACT 15 Y
ACT 9 N ACT 3 Y ACT 9 N ACT 9 N ACT 9 N ACT 9 N
ACT 9 Y ACT 12 Y ACT 3 Y ACT 6 Y PBO 9 Y PBO 3 N
PBO 12 Y PBO 3 Y PBO 3 Y PBO 15 Y PBO 9 Y PBO 12 Y
PBO 3 Y PBO 9 Y PBO 15 Y PBO 9 Y PBO 6 Y PBO 9 Y
PBO 6 Y PBO 12 N PBO 9 N PBO 15 N PBO 15 Y PBO 9 N
PBO 9 N PBO 12 Y PBO 3 Y PBO 6 Y PBO 6 Y PBO 12 N
PBO 12 N PBO 12 Y PBO 3 Y PBO 12 Y PBO 3 Y PBO 12 Y
PBO 6 Y PBO 6 Y PBO 9 Y PBO 15 N PBO 15 N PBO 12 N
PBO 9 N PBO 12 N PBO 15 N PBO 18 Y PBO 12 N PBO 15 Y
PBO 15 N PBO 15 N PBO 18 N PBO 18 Y PBO 18 N PBO 18 N
;
/* Run usual chi-square test */
proc freq data=aml;
tables trt*y / chisq nopercent nocol;
title1 'Chi-square test of association';
title2 '(ignoring covariate)';
run;
Chi-square test of association
(ignoring covariate)
The FREQ Procedure
Table of trt by y
(cell contents: Frequency, Row Pct)
trt          y=0        y=1      Total
0             20         30         50
           40.00      60.00
1             29         23         52
           55.77      44.23
Total         49         53        102
Statistics for Table of trt by y
Statistic                      DF      Value      Prob
Chi-Square                      1     2.5394    0.1110
Likelihood Ratio Chi-Square     1     2.5505    0.1103
Continuity Adj. Chi-Square      1     1.9469    0.1629
Mantel-Haenszel Chi-Square      1     2.5145    0.1128
Phi Coefficient                      -0.1578
Contingency Coefficient               0.1559
Cramer's V                           -0.1578
Fisher's Exact Test
Cell (1,1) Frequency (F) 20
Left-sided Pr <= F 0.0813
Right-sided Pr >= F 0.9637
Table Probability (P) 0.0450
Two-sided Pr <= P 0.1189
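The Pearson chi-square reported above can be reproduced by hand from the four cell counts of the 2x2 table. The DATA step below is a minimal sketch of that check; the variable names (a, b, c, d, p_pbo, p_act) are illustrative and are not part of the handout's own code.

```sas
/* Hand check of the Pearson chi-square for the 2x2 table */
data _null_;
  /* cell counts: rows = trt (PBO, ACT), columns = y (No, Yes) */
  a = 20; b = 30;              /* PBO: no relapse, relapse */
  c = 29; d = 23;              /* ACT: no relapse, relapse */
  n = a + b + c + d;           /* total sample size        */
  p_pbo = b / (a + b);         /* observed relapse proportion, PBO */
  p_act = d / (c + d);         /* observed relapse proportion, ACT */
  /* shortcut formula for a 2x2 table: n(ad - bc)^2 / (product of margins) */
  chisq = n * (a*d - b*c)**2 / ((a + b) * (c + d) * (a + c) * (b + d));
  pval  = 1 - probchi(chisq, 1);   /* upper tail, 1 df */
  put p_pbo= 6.4 p_act= 6.4 chisq= 8.4 pval= 6.4;
run;
```

The values written to the log should agree with the row percentages (60.00 and 44.23) and with the Chi-Square line (2.5394, p = 0.1110) in the PROC FREQ output.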
/* Do equivalent test in logistic regression */
proc logistic data=aml;
model y(event='1') = trt;
title1 'Logistic regression';
title2 '(ignoring covariate)';
run;
Logistic regression
(ignoring covariate)
Response Profile
Ordered Value    y    Total Frequency
            1    0                 49
            2    1                 53
Probability modeled is y=1.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC                 143.245                     142.695
SC                  145.870                     147.945
-2 Log L            141.245                     138.695
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 2.5505 1 0.1103
Score 2.5394 1 0.1110
Wald 2.5178 1 0.1126
Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1      0.4055            0.2887             1.9728        0.1602
trt           1     -0.6373            0.4016             2.5178        0.1126
Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
trt                0.529             0.241      1.162
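With a single binary predictor, the logistic regression estimates have closed forms in the 2x2 cell counts, which is why this fit is equivalent to the chi-square analysis (the Score test reproduces the Pearson chi-square, 2.5394). The DATA step below is a sketch of the hand calculation; the variable names are illustrative only.

```sas
/* Closed-form logistic regression estimates from the 2x2 table */
data _null_;
  a = 20; b = 30;                      /* PBO: no relapse, relapse */
  c = 29; d = 23;                      /* ACT: no relapse, relapse */
  b0    = log(b / a);                  /* intercept: log odds of relapse on PBO */
  b1    = log((d / c) / (b / a));      /* trt slope: log odds ratio, ACT vs PBO */
  se_b1 = sqrt(1/a + 1/b + 1/c + 1/d); /* large-sample SE of the log odds ratio */
  wald  = (b1 / se_b1)**2;             /* Wald chi-square for trt */
  z     = probit(0.975);               /* standard normal 97.5th percentile */
  oddsratio = exp(b1);
  lcl   = exp(b1 - z*se_b1);           /* 95% Wald limits on the odds ratio scale */
  ucl   = exp(b1 + z*se_b1);
  put b0= 8.4 b1= 8.4 se_b1= 8.4 wald= 8.4;
  put oddsratio= 6.3 lcl= 6.3 ucl= 6.3;
run;
```

The log values should match the Analysis of Maximum Likelihood Estimates and Odds Ratio Estimates tables above (0.4055, -0.6373, 0.4016, 2.5178, and 0.529 with limits 0.241 and 1.162).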
/* Fit logistic regression model with covariate */
proc logistic data=aml plots(only)=roc;
model y(event='1') = trt x ;
title1 'Logistic regression';
title2 '(accounting for covariate)';
run;
Logistic regression
(accounting for covariate)
Response Profile
Ordered Value    y    Total Frequency
            1    0                 49
            2    1                 53
Probability modeled is y=1.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC                 143.245                     129.376
SC                  145.870                     137.251
-2 Log L            141.245                     123.376
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 17.8687 2 0.0001
Score 16.4848 2 0.0003
Wald 14.0612 2 0.0009
Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1      2.6135            0.7149            13.3662        0.0003
trt           1     -1.1191            0.4669             5.7446        0.0165
x             1     -0.1998            0.0560            12.7187        0.0004
Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
trt                0.327             0.131      0.815
x                  0.819             0.734      0.914
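To see what the fitted coefficients imply on the probability scale, the estimated relapse probability can be computed from the linear predictor 2.6135 - 1.1191*trt - 0.1998*x. The DATA step below is a sketch that evaluates it over the observed x-values (3 to 18 months); the data set name pred and the variable names eta and phat are illustrative assumptions.

```sas
/* Predicted relapse probabilities from the printed estimates */
data pred;
  b0 = 2.6135; b_trt = -1.1191; b_x = -0.1998;  /* estimates from the table above */
  do trt = 0 to 1;                 /* 0 = PBO, 1 = ACT */
    do x = 3 to 18 by 3;           /* months in remission */
      eta  = b0 + b_trt*trt + b_x*x;    /* linear predictor (log odds) */
      phat = 1 / (1 + exp(-eta));       /* inverse logit */
      output;
    end;
  end;
  keep trt x eta phat;
run;

proc print data=pred noobs;
  format eta phat 6.3;
  title1 'Predicted relapse probabilities (hand calculation)';
run;
```

At x = 9 months, for example, this gives a predicted relapse probability of roughly 0.69 for PBO versus 0.42 for ACT. Because the estimates are rounded to four decimals, the results may differ slightly from predictions produced directly by PROC LOGISTIC.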
Association of Predicted Probabilities and Observed Responses
Percent Concordant    68.5    Somers' D    0.454
Percent Discordant    23.1    Gamma        0.496
Percent Tied           8.4    Tau-a        0.229
Pairs                 2597    c            0.727
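The four rank-association measures in this table are all simple functions of the concordant, discordant, and tied percentages, and the number of pairs is the product of the two response-group sizes (49 x 53). A sketch of the arithmetic follows; the variable names are illustrative.

```sas
/* Rank-association measures from the concordance percentages */
data _null_;
  n0 = 49; n1 = 53;                /* non-relapse and relapse counts */
  n  = n0 + n1;
  pairs = n0 * n1;                 /* pairs with different responses */
  pc = 0.685; pd = 0.231; pt = 0.084;   /* concordant, discordant, tied */
  somers_d   = pc - pd;
  gamma_stat = (pc - pd) / (pc + pd);
  tau_a      = (pc - pd) * pairs / (n*(n - 1)/2);  /* relative to all pairs */
  c_stat     = pc + pt/2;          /* area under the ROC curve */
  put pairs= somers_d= 6.3 gamma_stat= 6.3 tau_a= 6.3 c_stat= 6.3;
run;
```

The results should agree with the table up to rounding of the printed percentages.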
/* Fit equivalent logistic regression model,
and look at 'dose-response' curves
for each level of group variable */
proc logistic data=aml;
class group;
model y(event='1') = group x ;
effectplot fit(plotby=group x=x);
title1 'Logistic regression';
title2 '(with dose-response curve)';
run;
Logistic regression
(with dose-response curve)
Analysis of Maximum Likelihood Estimates
Parameter     DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept      1      2.0539            0.5967            11.8477        0.0006
group ACT      1     -0.5595            0.2335             5.7446        0.0165
x              1     -0.1998            0.0560            12.7187        0.0004
Odds Ratio Estimates
Effect              Point Estimate    95% Wald Confidence Limits
group ACT vs PBO             0.327             0.131      0.815
x                            0.819             0.734      0.914
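The intercept and group coefficient differ from the earlier fit with the 0/1 variable trt because the CLASS statement in PROC LOGISTIC uses effect coding by default (here ACT = +1 and PBO = -1), while the Wald test, the x slope, and the odds ratios are unchanged. The sketch below converts the effect-coded estimates back to the 0/1 parameterization; the variable names are illustrative.

```sas
/* Convert effect-coded estimates to the 0/1 (reference) parameterization */
data _null_;
  b0_eff = 2.0539; b_act_eff = -0.5595;   /* estimates from the table above */
  logit_act = b0_eff + b_act_eff;    /* log odds at x = 0 for ACT (coded +1) */
  logit_pbo = b0_eff - b_act_eff;    /* log odds at x = 0 for PBO (coded -1) */
  b_trt     = logit_act - logit_pbo; /* equals 2*b_act_eff */
  oddsratio = exp(b_trt);            /* ACT vs PBO */
  put logit_pbo= 8.4 b_trt= 8.4 oddsratio= 6.3;
run;
```

Up to rounding, logit_pbo and b_trt recover the intercept (2.6135) and trt slope (-1.1191) of the earlier model, and the standard error is likewise halved (0.2335 versus 0.4669). Adding the PARAM=REF option to the CLASS statement (class group / param=ref;) should produce the reference-coded estimates directly, with PBO as the reference level.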
/***************************************/
/* Look at inverse interval estimation */
/***************************************/
/* First get 'weighted' version of data */
proc sort data=aml; by trt x;
proc means data=aml sum n noprint;
by trt x;
var y;
output out=out1 n=total sum=resp;
proc print data=out1;
title1 'Weighted version of AML data';
run;
Weighted version of AML data
Obs trt x _TYPE_ _FREQ_ total resp
1 0 3 0 7 7 6
2 0 6 0 6 6 6
3 0 9 0 10 10 6
4 0 12 0 12 12 6
5 0 15 0 10 10 4
6 0 18 0 5 5 2
7 1 3 0 8 8 5
8 1 6 0 14 14 9
9 1 9 0 12 12 5
10 1 12 0 10 10 3
11 1 15 0 6 6 1
12 1 18 0 2 2 0
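The grouped data carry the same information about the model as the 102 individual records: the resp column sums to 30 relapses for trt=0 and 23 for trt=1, and the total column sums to 50 and 52. As a sketch of how the grouped form can be used, the step below re-enters the 12 rows and fits the same model with the events/trials syntax. The data set name aml_grouped is an illustrative assumption (the handout itself stores these rows in out1).

```sas
/* Grouped (events/trials) version of the AML data */
data aml_grouped;
  input trt x total resp;
  datalines;
0  3  7 6
0  6  6 6
0  9 10 6
0 12 12 6
0 15 10 4
0 18  5 2
1  3  8 5
1  6 14 9
1  9 12 5
1 12 10 3
1 15  6 1
1 18  2 0
;
run;

/* resp = number of relapses, total = number of patients in the cell */
proc logistic data=aml_grouped;
  model resp/total = trt x;
  title1 'Logistic regression on grouped (events/trials) data';
run;
```

The parameter estimates and standard errors should match those of the covariate-adjusted fit on the individual records, since both forms of the data lead to the same maximum likelihood estimates.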
/* Get 'weighted' data in order with trt=1 first
-- that way the ORDER=DATA option in PROC PROBIT
will make trt=1 be the indicated factor level
since it will occur first in the data set. */
proc sort data=out1; by descending trt;
run;
/* Get (and plot) inverse intervals for response probabilities
when trt=0 [need to give an x-level (6 here), but not used] */
NOTE: Output here is the same as for the trt=0 case, except for the “95% Fiducial
Limits” table and corresponding figure:
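For intuition about where inverse intervals are centered, the point estimate of the x-value that gives a target relapse probability p can be found by solving the fitted linear predictor for x. The sketch below does this with the PROC LOGISTIC estimates from the covariate-adjusted model. This is an assumption made for illustration: it yields point estimates only (no fiducial limits), and a PROC PROBIT fit with a different link or parameterization would give different numbers. The data set and variable names (inverse, x_p) are hypothetical.

```sas
/* Inverse (point) estimates: x such that P(relapse) = p */
data inverse;
  b0 = 2.6135; b_trt = -1.1191; b_x = -0.1998;  /* PROC LOGISTIC estimates */
  do trt = 0 to 1;                   /* 0 = PBO, 1 = ACT */
    do p = 0.25, 0.50, 0.75;         /* target relapse probabilities */
      /* solve logit(p) = b0 + b_trt*trt + b_x*x for x */
      x_p = (log(p / (1 - p)) - b0 - b_trt*trt) / b_x;
      output;
    end;
  end;
  keep trt p x_p;
run;

proc print data=inverse noobs;
  format x_p 6.2;
  title1 'Months in remission giving relapse probability p';
run;
```

For p = 0.50 this gives roughly 13.1 months for PBO and 7.5 months for ACT. Values of x_p outside the observed range of 3 to 18 months are extrapolations and should be read with caution.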
Example 2: Erectile Dysfunction Data
48 male subjects in an anti-impotence study had experienced erectile dysfunction
following prostate surgery. Subjects were randomly assigned to receive a new drug
(trt=1) or placebo (trt=0) and kept a diary for one month, recording the number of
attempts at sexual intercourse after taking the medication and the number of
attempts that were successful. Subject age was also recorded.
Does the new drug have a higher success rate than the placebo?
data ED; input trt age successes attempts @@;
ID = _n_;
cards;
0 41 3 6 1 57 3 8
0 44 5 15 1 54 10 12
0 62 0 4 1 65 0 0
0 44 1 2 1 51 5 8
0 70 3 8 1 53 8 10
0 35 4 8 1 44 17 22
0 72 1 6 1 66 2 3
0 34 5 15 1 55 9 11
0 61 1 7 1 37 6 8
0 35 5 5 1 40 2 4
0 52 6 8 1 44 9 16
0 66 1 7 1 64 5 9
0 35 4 10 1 78 1 3
0 61 4 8 1 51 6 12
0 55 2 5 1 67 5 11
0 41 7 9 1 44 3 3
0 53 2 4 1 65 7 18
0 72 4 6 1 69 0 2
0 58 0 0 1 53 4 14
0 56 12 17 1 49 5 8
0 53 8 15 1 74 10 15
0 45 3 4 1 39 4 9
0 40 14 20 1 35 8 10
1 47 4 5
1 46 6 7
;
proc logistic data=ED;
model successes/attempts = trt age;
title1 'ED Data Analysis';
run;
ED Data Analysis
Number of Observations Read 48
Number of Observations Used 46
Sum of Frequencies Read 417
Sum of Frequencies Used 417
Response Profile
Ordered Value    Binary Outcome    Total Frequency
            1    Event                         234
            2    Nonevent                      183
Note: 2 observations with invalid response values have been deleted. Either the number of trials was less than or equal to zero or less than the number of events, or the number of events was negative.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1      1.2111            0.4597             6.9410        0.0084
trt           1      0.5265            0.2041             6.6574        0.0099
age           1     -0.0243           0.00881             7.5816        0.0059
Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
trt                1.693             1.135      2.526
age                0.976             0.959      0.993
/* Check whether the subject strata (and associated
dependence of observations) have caused overdispersion */
proc logistic data=ED;
model successes/attempts = trt age / scale=pearson;
output out=out1 p=phat;
title1 'ED Data Analysis';
title2 '(Also Check for Overdispersion)';
run;
ED Data Analysis
(Also Check for Overdispersion)
Note: 2 observations with invalid response values have been deleted. Either the number of trials was less than or equal to zero or less than the number of events, or the number of events was negative.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Deviance and Pearson Goodness-of-Fit Statistics
Criterion Value DF Value/DF Pr > ChiSq
Deviance 70.3355 43 1.6357 0.0053
Pearson 63.7235 43 1.4819 0.0216
Number of events/trials observations: 46
Note: The covariance matrix has been multiplied by the heterogeneity factor (Pearson Chi-Square / DF) 1.48194.
Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1      1.2111            0.5596             4.6837        0.0305
trt           1      0.5265            0.2484             4.4923        0.0340
age           1     -0.0243            0.0107             5.1160        0.0237
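The SCALE=PEARSON adjustment leaves the estimates unchanged and multiplies the covariance matrix by the heterogeneity factor (Pearson chi-square / DF = 63.7235 / 43), so each standard error is inflated by the square root of that factor and each Wald chi-square is divided by it. The sketch below applies the adjustment by hand to the unscaled estimates from the first ED fit; the data set and variable names (scaled, phi, se_scaled) are illustrative.

```sas
/* Hand application of the Pearson overdispersion adjustment */
data scaled;
  length parameter $ 9;
  phi = 63.7235 / 43;                 /* heterogeneity factor, about 1.48 */
  input parameter $ estimate se;      /* unscaled estimates and SEs */
  se_scaled   = se * sqrt(phi);       /* inflated standard error */
  wald_scaled = (estimate / se_scaled)**2;
  p_scaled    = 1 - probchi(wald_scaled, 1);   /* 1-df chi-square p-value */
  datalines;
Intercept  1.2111 0.4597
trt        0.5265 0.2041
age       -0.0243 0.00881
;
run;

proc print data=scaled noobs;
  format phi 8.5 se_scaled 8.4 wald_scaled p_scaled 8.4;
  title1 'Overdispersion-adjusted standard errors (hand calculation)';
run;
```

The scaled standard errors should reproduce 0.5596, 0.2484, and 0.0107; the Wald statistics and p-values agree with the table above only up to rounding, because they are recomputed here from the printed (rounded) estimates.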
Example 3: Menopause Data
The age and menopause status (menopause=1 for post-menopausal, 0 otherwise) of 370
female patients were recorded. Age is categorized into a variable agecat: 1 for age<50, 2 for 50 ≤
age < 60, 3 for 60 ≤ age < 70, and 4 for 70 ≤ age. How does menopause rate depend on