Bias corrected estimates for logistic regression models for complex surveys with application to the United States’ Nationwide Inpatient Sample

Kevin A. Rader
Harvard School of Public Health, Boston, MA, U.S.A.
Stuart R. Lipsitz
Brigham and Women’s Hospital, Boston, MA, U.S.A.
Garrett M. Fitzmaurice
Harvard Medical School, Boston, MA, U.S.A.
David P. Harrington
Harvard School of Public Health, Boston, MA, U.S.A.
Michael Parzen
Statistics Department, Harvard University, Cambridge, MA, U.S.A.
Debajyoti Sinha
Florida State University, Tallahassee, FL, U.S.A.

September 18, 2014
where x1i is the surgery type (x1i = 1 for robot-assisted and 0 for standard open radical
cystectomy), x2i is the age of the patient, in years, and x3i is the sex of the ith patient
(x3i = 1 for females and 0 for males).
Table 2 gives the estimates of β obtained using the two bias-corrected approaches we
propose, in addition to the Clogg, et al. approach with
$q_{hij} = k / \left( \sum_{h=1}^{H} \sum_{i=1}^{n_h} m_{hi} \right)$ and the
standard WEE estimates (the latter were obtained using R svyglm)14. Of note, there were
no convergence problems with the three bias-corrected methods. However, because there
were no complications in the robotic arm, the coefficient for β1 was converging to −∞ for
WEE; the results for WEE reported in Table 2 are the estimates at the 25th iteration (the
default maximum number of iterations in R’s svyglm function in the package survey).
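The drift of the WEE estimate toward −∞ can be illustrated in a stripped-down setting. The sketch below is an assumption-laden simplification, not the fitted model: it ignores age, sex, and the survey weights, and runs plain Newton-Raphson on the log-odds of a single group. The group sizes 343 and 42 come from Table 1; the count of 30 events in the open-surgery group is hypothetical and used only for illustration.

```python
import math

def newton_log_odds(events, n, iterations=25):
    """Newton-Raphson iterates for the log-odds of one binomial group.

    Returns the value of the log-odds after each iteration, starting from 0.
    """
    eta = 0.0
    path = []
    for _ in range(iterations):
        p = 1.0 / (1.0 + math.exp(-eta))
        score = events - n * p            # derivative of the log-likelihood
        information = n * p * (1.0 - p)   # negative second derivative
        eta += score / information
        path.append(eta)
    return path

# Group with events (hypothetical count): the iterates settle at log(30/313).
with_events = newton_log_odds(events=30, n=343)
print(round(with_events[-1], 3))

# Group with no events: every step moves the log-odds further down, so the
# value at iteration 25 is simply where the algorithm was stopped.
no_events = newton_log_odds(events=0, n=42)
print([round(v, 2) for v in no_events[:5]], round(no_events[-1], 2))
```

With zero events each update equals −(1 + e^η), so the iterates decrease by at least one unit per step and never stabilise, which is why a reported value at the iteration cap should not be read as an estimate.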
Using the independence variance when calculating the adjustment term, the estimated
odds ratio (OR) for surgery type, controlling for age and sex, is e^{−2.774} = 0.062; when the
small-sample bias-corrected variance is used the OR is estimated to be e^{−2.917} = 0.054. For
the Clogg estimator, the OR is estimated to be e^{−2.443} = 0.087. For all of the methods,
the standard errors reported in Table 2 are based on the variance estimator of Morel,
et al13. When comparing the estimates of β to their standard errors, we see that the
standard WEE (the 25th iteration in R svyglm) produces a much more significant result
than the bias-corrected approaches. The three bias-corrected approaches produce very
similar estimates, and all lead to the same conclusion if a test of the null hypothesis were
conducted at the α = 0.05 level.
For the other covariates in the model, age and sex, the estimates of their effects on the
probability of wound infection are relatively stable across methods. The estimated odds ratio is between
e^{−10(0.0325)} = 0.72 and e^{−10(0.0193)} = 0.82 for every 10-year increase in age, and the OR is
estimated to be between e^{0.662} = 1.94 and e^{0.808} = 2.24 when comparing females to males.
Whereas age but not sex is significant using the standard WEE and the Clogg method,
neither is significant using the two proposed bias-corrected approaches.
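The odds ratios quoted in this section follow directly from the Table 2 coefficients by exponentiation. A minimal sketch of that arithmetic is given below; the helper name `odds_ratio` and its `delta` argument (the size of the covariate change) are introduced here for illustration only.

```python
import math

def odds_ratio(beta, delta=1.0):
    """Odds ratio for a change of `delta` units in a covariate with coefficient `beta`."""
    return math.exp(beta * delta)

# Coefficients copied from Table 2.
surgery = {"Independent Var.": -2.774, "Morel Var.": -2.917, "Clogg": -2.443}
for approach, beta in surgery.items():
    print(f"Robot vs. open, {approach}: OR = {odds_ratio(beta):.3f}")

# Range of the age effect per 10-year increase (largest and smallest coefficients).
print(f"Age, per 10 years: {odds_ratio(-0.0325, 10):.2f} to {odds_ratio(-0.0193, 10):.2f}")

# Range of the sex effect (females vs. males).
print(f"Female vs. male: {odds_ratio(0.662):.2f} to {odds_ratio(0.808):.2f}")
```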
In summary, the results of analyses of the bladder cancer data highlight how the
standard WEE approach and the bias-corrected methods can produce somewhat different
estimates of effects. To examine the finite sample bias of these approaches, we conducted
a simulation study; the results of the simulation study are reported in the next section.
4 Simulation Study
In this section, we study the empirical relative bias in estimating β using standard logistic
regression models incorporating the complex survey structure (WEE) and three bias-reduced
approaches that differ in how the multiplicative weighting factor, qhij, is calculated:
our approach using the variance under independence (Independence), our approach using the
bias-corrected sandwich estimator of Morel, et al. (Morel), and the Clogg, et al.
approach (Clogg)13,14. For simplicity, in the simulation study, we used a cluster design
without stratification and weighting, where sampling of clusters was performed without
replacement from a finite population of clusters.
For the simulations, the true marginal logistic model for any subject in the population
is

$$\operatorname{logit}(P[Y_{ij} = 1 \mid x_{ij}]) = \beta_0 + \sum_{k=1}^{10} \beta_k x_{ijk}, \qquad (6)$$

where the ten xijk’s are independent Bern(px) variables. The intercept β0 was chosen so
that the average P [Yij = 1] equals 0.20. This marginal model is similar to that used in a
simulation study performed by Heinze and Schemper11. For simplicity, we set all ten βk
equal to the same value,

$$\operatorname{logit}(P[Y_{ij} = 1 \mid x_{ij}]) = \beta_0 + \beta \sum_{k=1}^{10} x_{ijk}.$$

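One way the intercept could be calibrated so that the average P[Yij = 1] equals 0.20 is sketched below. This is an illustration under stated assumptions rather than the procedure used for the simulations: the value `p_x = 0.5` is hypothetical (the value of px is not given in this section), and bisection is simply one convenient root finder. Because the ten covariates are independent Bernoulli(px) variables with a common coefficient, their sum is Binomial(10, px) and the average probability is a finite sum.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_event_probability(beta0, beta, p_x, n_covariates=10):
    """Average P[Y = 1] over the Binomial(n_covariates, p_x) covariate sum."""
    total = 0.0
    for s in range(n_covariates + 1):
        pmf = math.comb(n_covariates, s) * p_x**s * (1.0 - p_x) ** (n_covariates - s)
        total += pmf * expit(beta0 + beta * s)
    return total

def solve_intercept(beta, p_x, target=0.20, lo=-60.0, hi=60.0, tol=1e-10):
    """Bisection on beta0; the average probability is increasing in beta0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_event_probability(mid, beta, p_x) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta = math.log(2)   # one of the effect sizes used in the simulations
p_x = 0.5            # hypothetical covariate prevalence (assumption)
beta0 = solve_intercept(beta, p_x)
print(beta0, mean_event_probability(beta0, beta, p_x))
```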
To simulate the clustered data, we use the random intercept logistic regression model
proposed by Wang and Louis and further developed by Parzen, et al.16,17. In particular,
the conditional subject-specific logistic regression model is

$$\operatorname{logit}(P[Y_{ij} = 1 \mid x_{ij}, b_i]) = b_i + \left( \beta_0 + \beta \sum_{k=1}^{10} x_{ijk} \right) / \phi, \qquad (7)$$

where, given the subject-specific random effect bi, the Yij’s from the same cluster are
independent Bernoulli random variables. When bi follows a ‘bridge’ distribution, the
marginal logistic regression equals that given in (6)16. The bridge random variable has
mean 0 and φ is the rescaling parameter. In particular,

$$\operatorname{Var}(b_i) = \frac{\pi^2}{3} \left( \frac{1}{\phi^2} - 1 \right),$$

so that the larger the value of φ, (0 < φ < 1), the smaller the variance (and the lower the
correlation between pairs of random variables in the same cluster).
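The behaviour of the bridge random intercept can be checked numerically. The sketch below draws bridge variates by inverting the distribution function, using the closed-form quantile (1/φ) log[sin(φπu) / sin(φπ(1 − u))] as recalled from the bridge-distribution literature; that formula, the seed, and the linear predictor value `eta` are assumptions of this illustration and are not stated in this section. It then compares the sample variance with the variance formula above and verifies that averaging the conditional model (7) over bi returns the marginal logistic probability.

```python
import math
import random

def sample_bridge(phi, rng):
    """Draw one bridge random intercept by inverse-CDF sampling (0 < phi < 1)."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, which would give log(0)
        u = rng.random()
    ratio = math.sin(phi * math.pi * u) / math.sin(phi * math.pi * (1.0 - u))
    return math.log(ratio) / phi

def bridge_variance(phi):
    """Theoretical variance: (pi^2 / 3) * (1 / phi^2 - 1)."""
    return (math.pi**2 / 3.0) * (1.0 / phi**2 - 1.0)

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(2014)
phi = 0.7
draws = [sample_bridge(phi, rng) for _ in range(200_000)]

mean_b = sum(draws) / len(draws)
var_b = sum((b - mean_b) ** 2 for b in draws) / (len(draws) - 1)
print(mean_b, var_b, bridge_variance(phi))

# Marginalising the conditional model over b_i should recover expit(eta).
eta = -1.0  # hypothetical value of the marginal linear predictor
marginal = sum(expit(b + eta / phi) for b in draws) / len(draws)
print(marginal, expit(eta))
```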
We denote the population number of clusters by N , which we set to N = 400, the
number of sampled clusters by n, and the cluster size by mi (we assume all clusters are
of the same size, and all members of the cluster are sampled).
We conducted 24 simulation configurations varying the following conditions: the
effect of the covariates, βk = β ∈ {ln(2), ln(4), ln(16)} (recall, we set all ten βk to the
same value); cluster sizes, mi ∈ {5, 10}; the bridge distribution’s scaling parameter,
φ ∈ {0.7, 0.9}; and the number of clusters sampled, n ∈ {40, 80}. For each
simulation configuration, 2000 simulation replications were performed. The convergence
criterion for WEE is that the relative change in the log-likelihood between successive
iterations is less than 0.000001; we report the percentage of simulation replications in
which this convergence criterion was not met. When the standard WEE failed to converge,
we used the estimates from the 25th iteration (the default maximum number of iterations
in R’s svyglm function in the package survey).
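The factorial design and the stopping rule described above can be written down compactly. The sketch below enumerates the 3 × 2 × 2 × 2 = 24 configurations and encodes one plausible reading of the relative-change criterion; the exact form of the criterion (in particular, dividing by the previous log-likelihood) is an assumption, and the dictionary keys are names chosen for this illustration.

```python
import itertools
import math

betas = [math.log(2), math.log(4), math.log(16)]   # common covariate effect
cluster_sizes = [5, 10]                            # m_i
phis = [0.7, 0.9]                                  # bridge scaling parameter
sampled_clusters = [40, 80]                        # n, out of N = 400 clusters

configurations = [
    {"beta": beta, "m": m, "phi": phi, "n": n}
    for beta, m, phi, n in itertools.product(betas, cluster_sizes, phis, sampled_clusters)
]
print(len(configurations))

REPLICATIONS = 2000
MAX_ITERATIONS = 25

def has_converged(loglik_old, loglik_new, tol=1e-6):
    """Relative change between successive log-likelihood values is below tol."""
    return abs(loglik_new - loglik_old) < tol * abs(loglik_old)
```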
Tables 3, 4, and 5 present the results for β = ln(2), ln(4), and ln(16), respectively: the
relative bias for β2, defined as 100(β̂2 − β2)/β2, the mean square error of the estimates,
and the empirical coverage probabilities of 95% Wald confidence intervals for each
simulation configuration. Without loss
of generality we report results for β2 only; any of the βk could have been selected for bias
reporting since the model is symmetric across covariates (all covariates are independent
and have the same Bernoulli distribution with all ten βk = β). The results indicate that
the relative bias is greatly reduced, by an order of magnitude, when using any of the three
bias-reduced approaches in comparison to standard WEE. The standard WEE approach
gave average estimated values for β close to zero, suggesting no effect when there truly
was an effect of at least β = 0.69. As a result, the average relative bias for the standard
WEE method is very close to −100% in all simulation configurations.
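The three performance measures reported in the tables can be computed from the per-replication estimates and standard errors as sketched below. The function name `summarize` and the toy inputs are hypothetical; the relative bias follows the definition 100(β̂2 − β2)/β2 averaged over replications, and coverage uses 95% Wald intervals.

```python
import math

def summarize(estimates, standard_errors, true_beta, z=1.959964):
    """Average relative bias (%), mean square error, and Wald-interval coverage."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    relative_bias = 100.0 * (mean_estimate - true_beta) / true_beta
    mse = sum((b - true_beta) ** 2 for b in estimates) / n
    covered = sum(
        1
        for b, se in zip(estimates, standard_errors)
        if b - z * se <= true_beta <= b + z * se
    )
    return {"relative_bias": relative_bias, "mse": mse, "coverage": covered / n}

# Toy inputs (not simulation output): four replications with true beta = 1.
toy = summarize([0.8, 1.0, 1.2, 1.4], [0.2, 0.2, 0.05, 0.1], true_beta=1.0)
print(toy)

# Estimates that average to zero give a relative bias of -100%.
print(summarize([0.0, 0.0], [0.1, 0.1], true_beta=math.log(2))["relative_bias"])
```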
In previous simulations for logistic regression with independent observations, it was
found that the Clogg et al. approach was typically less biased than standard logistic
regression using maximum likelihood, but more biased than the Box and Firth
bias-correction approach2,3,11. We have found analogous results for the Clogg-type estimator
for complex survey data: the estimator is more biased than the proposed bias-corrected
approaches (generalizations of the Firth estimator), but much less biased than WEE
(a generalization of maximum likelihood). Overall, these results suggest that the
bias-reduced approach using either the independence variance estimate or the Morel variance
estimate to calculate qhij is the preferred method for performing the analysis.
Although Wald confidence intervals are known to be conservative with large β’s, we
found in nearly all sets of simulations with β2 = 2.77, that the coverage probabilities agree
with the nominal 95% level provided the sample size is large11,18,19. However, we caution
against generalizing this result based on the results of a single simulation study.
5 Discussion
In this paper we have described a simple implementation of bias correction in the logistic
regression model for complex surveys. By incorporating an adjustment term to the
weighted estimating equations, we derived a bias correction based on univariate Bernoulli
distributions. This bias correction splits each of the original observations into two new
observations: the original response, yhij, with the original sampling weight, whij, times a
multiplicative factor, 1 + qhij/2, and a pseudo-response, 1 − yhij, with the original weight
times the multiplicative factor minus one, qhij/2. Since both the response and
pseudo-response have weights that are guaranteed to be positive, the problematic issue of
separation is eliminated12. These pseudo-responses and weights are relatively easy to calculate,
and this approach leads to an iterative algorithm that is straightforward to implement.
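The splitting step described above can be sketched as follows. The factor q is taken here as a given positive number, since how qhij is calculated is not part of this section; the function names are introduced for illustration. The second function shows why the split works: summing weight × (response − p) over the two pseudo-observations gives w[(y − p) + q(1/2 − p)], which is the ordinary weighted score contribution plus an adjustment that pulls the fitted probability toward 1/2.

```python
def split_observation(y, weight, q):
    """Return the two pseudo-observations as (response, weight) pairs."""
    return [
        (y, weight * (1.0 + q / 2.0)),   # original response, inflated weight
        (1 - y, weight * (q / 2.0)),     # pseudo-response, small positive weight
    ]

def score_contribution(y, p, weight, q):
    """Sum of weight * (response - p) over the two pseudo-observations."""
    return sum(w * (r - p) for r, w in split_observation(y, weight, q))

# Illustrative values only: an event-free observation with weight 2.5, q = 0.4.
y, p, weight, q = 0, 0.3, 2.5, 0.4
print(split_observation(y, weight, q))
print(score_contribution(y, p, weight, q), weight * ((y - p) + q * (0.5 - p)))
```

Because every observation contributes both a 0 and a 1 with positive weight, no covariate pattern can be perfectly separated in the augmented data.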
Because WEE is the most widely used estimation approach for logistic regression models
in complex surveys, the approach to correct for bias described in this paper should be
useful in applications where there are rare outcomes, many interaction terms, or the focus
of analysis is on particular subgroups of interest.
Although not specifically discussed in this paper, the proposed method can also be
used for any regression model for binary outcomes in complex surveys, including models
that adopt non-canonical link functions, such as probit or complementary log-log links.
Kosmidis described bias-corrected estimating equations for non-canonical links for binary
data with independent observations, where the original observations are split into yhij and
1 − yhij with weights for each that are a function of ahij in (2)20. The proposed approach
could be used to extend these results by incorporating the complex survey sampling
weights into the weights for yhij and 1 − yhij. Further, the approach can be extended to other
generalized linear models using weighted estimating equations for complex surveys. The
bias-corrected approach for complex surveys would be similar to that given in Kosmidis
and Firth for other generalized linear models based on specific link functions21. This can
be implemented by creating a pseudo-response (a function of the outcome and ahij) to
correct for the first-order bias.
Finally, the results of the simulations demonstrate that the proposed method can
greatly reduce the finite sample bias of WEE for estimating logistic regression parameters
for binary data in the complex survey setting. WEE estimates can be biased due to the
issue of separation or quasi-separation, a problem that can occur in large complex surveys
when the outcomes are rare or subgroup analyses are performed. The bias-corrected
methods perform discernibly better than the standard WEE approach for binary data,
suggesting that they could be adopted as the method of choice in regression analyses of
binary outcomes in complex surveys. The standard bias-corrected logistic regression
approach of Firth, where qhij is estimated under the naive assumption of independence,
is computationally simple and appears to greatly reduce the bias. This approach may
be somewhat easier to implement using standard statistical software for logistic regression
with complex survey data3.
Acknowledgments
We thank Edward Giovannucci and Caprice Greenberg for advice on the analysis of the
NIS bladder cancer data. We are grateful for the support provided by grant CA 160679
from the U.S. National Institutes of Health.
Table 1: Baseline characteristics (means/percentages and 95% confidence intervals) of bladder cancer patients treated with radical cystectomy in the National Inpatient Sample (NIS).

                                Open Radical Cystectomy    Robot-assisted Radical
                                (ORC), n = 343             Cystectomy (RARC), n = 42
Age, years                      68.6 (67.6, 69.6)          67.2 (63.3, 71.1)
Female, %                       15.2 (12.6, 18.1)          11.9 (5.8, 22.9)
One or more comorbidities, %    22.9 (19.2, 27.0)          21.7 (12.1, 36.0)

Note: Results are reported as population estimates using survey weights, strata, and cluster variables.
Table 2: Comparison of WEE logistic regression parameter estimates for the bladder cancer data from the National Inpatient Sample (NIS), n = 343.

Effect      Approach                          Estimate   SE       Z-statistic   P-value
Intercept   Standard WEE                      -1.548     0.868    -1.784        0.079
            Bias-Reduced: Independent Var.    -1.106     1.404    10.817        0.414
            Bias-Reduced: Morel Var.          -1.458     0.909    -1.603        0.109
            Bias-Reduced: Clogg Var.          -1.096     0.849    -1.290        0.201
Robot       Standard WEE                      -15.61     0.527    -29.61        <0.001
            Bias-Reduced: Independent Var.    -2.774     1.309    -2.119        0.034
            Bias-Reduced: Morel Var.          -2.917     1.168    -2.499        0.013
            Bias-Reduced: Clogg Var.          -2.443     0.993    -2.460        0.015
Age         Standard WEE                      -0.0325    0.0160   -2.037        0.045
            Bias-Reduced: Independent Var.    -0.0321    0.0191   -1.552        0.121
            Bias-Reduced: Morel Var.          -0.0193    0.0177   -1.092        0.278
            Bias-Reduced: Clogg Var.          -0.0280    0.0122   -2.275        0.026
Female      Standard WEE                      0.689      0.430    1.601         0.113
            Bias-Reduced: Independent Var.    0.697      0.564    1.160         0.247
            Bias-Reduced: Morel Var.          0.808      0.925    0.874         0.385
            Bias-Reduced: Clogg Var.          0.662      0.430    1.538         0.128
Table 3: Average relative bias and mean square error of β̂2 and empirical coverage probabilities of confidence intervals for each simulation specification where the true parameter values are β2 = ln(2) ≈ 0.69.

Configuration    Method    Average Relative Bias    Mean Square Error    Empirical Coverage Prob.

Based on 2000 replications for each simulation for varying levels of the number of observations in each cluster mi, levels of φ for the bridge distribution, and number of clusters sampled, n.
Table 4: Average relative bias and mean square error of β̂2 and empirical coverage probabilities of confidence intervals for each simulation specification where the true parameter values are β2 = ln(4) ≈ 1.39.

Configuration    Method    Average Relative Bias    Mean Square Error    Empirical Coverage Prob.

Based on 2000 replications for each simulation for varying levels of the number of observations in each cluster mi, levels of φ for the bridge distribution, and number of clusters sampled, n.
Table 5: Average relative bias and mean square error of β̂2 and empirical coverage probabilities of confidence intervals for each simulation specification where the true parameter values are β2 = ln(16) ≈ 2.77.

Configuration    Method    Average Relative Bias    Mean Square Error    Empirical Coverage Prob.