1-s2.0-S0304407602002233-main

7/30/2019 1-s2.0-S0304407602002233-main


Journal of Econometrics 114 (2003) 197–220

www.elsevier.com/locate/econbase

Bayesian analysis of a self-selection model with multiple outcomes using simulation-based estimation: an application to the demand for healthcare

Murat K. Munkin^a, Pravin K. Trivedi^{b,*}

^a Department of Economics, 531 Stokely Management Center, University of Tennessee, Knoxville, TN 37919, USA

^b Department of Economics, Wylie Hall, Indiana University, 100 South Woodlawn, Bloomington, IN 47405, USA

Received 25 April 2001; received in revised form 15 August 2002; accepted 14 September 2002

Abstract

This paper studies a self-selection model with discrete and continuous outcomes and a treatment variable. The treatment variable is endogenous to the two outcome variables. The approach of the paper is fully parametric and Bayesian. The Bayes factor is calculated with the Savage–Dickey density ratio and used for model selection. The model is applied to two different micro data sets, the 1987–1988 National Medical Expenditure Survey and the 1996 Medical Expenditure Panel Survey. The paper studies the effect of managed care and fee-for-service type of private insurance on the demand for healthcare. It also compares the effects of private insurance and Medicaid in covering health care expenses of elderly Americans.

© 2002 Elsevier Science B.V. All rights reserved.

JEL classification: I11; C11; C31; C35

Keywords: Health insurance; Self-selection; Managed care; Medicare; Medicaid; Markov chain Monte Carlo

1. Introduction

This paper has two major foci, one methodological and the other empirical. The

methodological component deals with estimation of a three-equation self-selection model

with two correlated outcomes, one of which is a count and the second is a contin-

uous variable. We are interested in the impact of selection on the conditional mean

* Corresponding author. Tel.: +1-812-855-3567; fax: +1-812-855-3736.

E-mail address: [email protected] (P.K. Trivedi).

0304-4076/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.

PII: S0304-4076(02)00223-3


198 M.K. Munkin, P.K. Trivedi / Journal of Econometrics 114 (2003) 197 220

of the outcomes, allowing for endogenous selection. The econometric methodology used is Bayesian and parametric; it is strongly motivated by the difficult computational problems that arose when the same model was estimated in a simulated maximum likelihood framework. The empirical component of the paper investigates the impact of public and private health insurance, which is our treatment variable, on the demand for healthcare and expenditure, which constitute our outcome measures. The motivation behind the empirical component is a long-standing and inconclusive debate in the health economics literature on the impact of the choice between the traditional fee-for-service (FFS) types of private insurance and health maintenance organization (HMO) plans on the demand for healthcare consumption for individuals of all ages.

In the first part of the paper we consider simulation-based estimation of an econometric model with $y_1$ and $y_2$, which are two jointly dependent discrete and continuous random outcome variables, respectively. A third variable in our model is denoted $d$, which will be referred to as the selection, or treatment, variable. For simplicity let $d = 1$ refer to the treated state and $d = 0$ refer to the untreated state. Suppose that our main interest is in the average value of a partial derivative such as $\partial y_1/\partial d$ or $\partial y_2/\partial d$. If $(y_1, y_2, d)$ are jointly dependent, then it is known that ignoring endogeneity of $d$ results in self-selection bias, i.e. the causal effect of $d$ on the outcome variable is not identified. Bias arises because the treatment (self-selection) variable $d$ reflects choices or decisions of an individual, and hence is endogenous.

Lee (2000) provides a recent survey of the large literature on self-selection. The most widely used version of the selection model is a two-equation model in which the outcome variable is continuous and the outcome equation is linear. Heckman (1976) proposes a two-stage estimation method for this type of model. In contrast, the model considered here is nonlinear with both discrete (count) and continuous outcomes. This model is an extension of Terza (1998), Crepon and Duguet (1997), Greene (1997) and Winkelmann (1998), who have proposed both two-step moment-based and simulation-based full-information estimation methods to estimate a selection model in which the outcome variable is a count (also see van Ophem, 2000). Other approaches to the same problem follow the traditional selection model format, e.g. Dowd et al. (1991), in so far as they ignore discreteness, nonnormality, and heteroskedasticity that are inherent in the data. However, such moment-based procedures are in general inefficient, even though computationally they are easier to implement. Further, the method of moments does not allow one to estimate the full set of parameters for models with correlated multiple outcomes. A second possibility is to use a weighted nonlinear instrumental variable approach. Such an approach in other contexts has not been very successful, in part because of the difficulty of estimating the weights consistently. Another approach is the simulated maximum likelihood method, which requires a sufficient number of simulations for consistency, but we often do not know the operational meaning of "sufficient".¹ In addition, the above-mentioned moment-based procedures are difficult to generalize to the case of multiple outcomes.

1 The authors have found that the SML estimator for the present problem converged very slowly as the

number of outcomes increases. Besides, convergence can be impossible for some parameterizations and

parameter values due to unbounded gradient vectors.


We approach the estimation problem in a Bayesian setting, assuming specific marginal distributions for the dependent variables. We consider the case of selection on unobserved heterogeneity with factor structure (for greater generality). Prior distributions are assigned to the parameters of the model. We apply the Markov chain Monte Carlo (MCMC) approach to estimation, building on the recent research of Albert and Chib (1993), Chib et al. (1998), Chib and Hamilton (2000), Koop and Poirier (1997), Li (1998), McCulloch et al. (2000), as well as several earlier studies.

The existence of possible selection bias has long been an issue of contention in empirical health economics based on observational data in which insurance status is a choice variable and not exogenously assigned (Maddala, 1985). The standard treatment of the selection effect in a linear model needs significant extensions in dealing with a typical healthcare application where outcomes are often count as well as continuous variables and the selection decision may be multinomial. The computational difficulty of simultaneously dealing with nonlinearity, discreteness, and sample selection has discouraged a full treatment of this topic. However, the issue is clearly important. Public insurance programs, such as Medicaid and Medicare, target only particular groups of individuals eligible for them, such as the low-income and the elderly. A subset of these individuals also purchase private (Medigap) insurance that covers out-of-pocket expenses that constitute the gap between the provider charges and the public insurance benefits. Therefore, although the public insurance status can be viewed as predetermined, the choice of private gap insurance plans may be endogenously determined jointly with the level of the demand for healthcare. The issue of endogeneity is relevant to comparisons of access to, utilization of, and evaluation of quality of care between groups of healthcare users classified by their health insurance status. If one can validly assume exogeneity of insurance status, such comparisons are econometrically easier to implement because insurance choice and healthcare use can be modeled separately. If not, then the modeling exercise is more computationally complex, especially if one wants to estimate all parameters efficiently using a full information system estimator. We pursue this objective in order to facilitate more precise comparisons between utilization patterns of different categories of insurees.

The literature on the selection models for healthcare utilization does not present a full consensus on the importance of endogeneity of the insurance decision. In Section 6 we will compare our empirical findings on endogeneity with those from several recent studies. Here we briefly mention findings from a few studies to reflect the mixed empirical evidence on the endogeneity issue. For example, Miller and Luft (1994, 1997) survey several studies on HMOs and their impact on utilization. These studies produce mixed results regarding the bias due to neglect of endogeneity.² Dowd et al. (1991) find negligible evidence of selection bias. Reschovsky (2000) argues that self-selectivity of the HMO variable is due to observed characteristics of the individuals and insurance markets and the threat of selection bias arises from not measuring and

² Half of these studies show higher rates of physician doctor visits for HMO enrollees and the other half reach the opposite conclusion. Most of the studies ignore the self-selecting behavior of the individuals. Some of them argue that for particular types of healthcare, such as physician doctor visits, the insurance status is exogenous. Others acknowledge the problem but avoid the issue of endogeneity for computational simplicity.



including these factors. Although unmeasured, they are observable, and an especially rich set of explanatory variables could control for selectivity. Tu et al. (2000) and Kemper et al. (2000) study healthcare utilization of HMO enrollees and generally argue against the importance of the endogeneity issue and register reservations about some instrumental variable type approaches for handling it. Goldman (1995), on the other hand, emphasizes the role and importance of endogeneity. Because of differences in data and methods used in different investigations a full consensus is hard to achieve. In this study we reconsider the modeling issues using a methodology that addresses several neglected issues and apply our methodology using 1996 MEPS data as well as the 1987 NMES data.

The rest of the paper is organized as follows. Section 2 specifies the model. MCMC estimation of the model is presented in Section 3. Section 4 considers issues of Bayesian model selection. Section 5 considers an example with artificially generated data and Section 6 deals with two empirical applications and concludes the paper.

2. Model specification

We observe $N$ ($i = 1, \ldots, N$) independent observations and it is assumed that: the count (outcome) variable $y_{1i}$ is Poisson distributed conditional on exogenous covariates $x_{1i}$, endogenous variable $d_i$ and unobserved heterogeneity $\varepsilon_{1i}$; the continuous nonnegative (outcome) variable $y_{2i}$ is exponentially distributed conditional on exogenous covariates $x_{2i}$, endogenous variable $d_i$ and unobserved heterogeneity $\varepsilon_{2i}$. The presence of unobserved heterogeneity in this structure will permit us to model counts and continuous variables that display overdispersion. Specifically,

$$y_{1i} \mid x_{1i}, d_i, \varepsilon_{1i} \overset{ind}{\sim} \mathcal{P}[\mu_i], \quad y_{1i} = 0, 1, 2, \ldots, \tag{2.1}$$

$$y_{2i} \mid x_{2i}, d_i, \varepsilon_{2i} \overset{ind}{\sim} \exp[\lambda_i], \quad y_{2i} \ge 0, \tag{2.2}$$

where $\mathcal{P}$ and $\exp$ stand for Poisson and exponential distributions with mean $\mu_i$ and $1/\lambda_i$, respectively. Variables $y_{1i}$ and $y_{2i}$ are assumed to be independent conditional on the unobserved heterogeneity. Their respective marginal distributions, obtained by integrating out $\varepsilon_1$ and $\varepsilon_2$, will be more flexible. The specification of the conditional means is

$$\log \mu_i = x_{1i}'\beta_1 + \gamma_1 d_i + \varepsilon_{1i}, \tag{2.3}$$

$$\log(1/\lambda_i) = x_{2i}'\beta_2 + \gamma_2 d_i + \varepsilon_{2i}. \tag{2.4}$$

The third (selection) equation in the model defines a latent variable $z_i^*$ such that

$$z_i^* = x_{3i}'\alpha + u_i, \tag{2.5}$$

$$d_i = \begin{cases} 1 & \text{if } z_i^* \ge 0, \\ 0 & \text{if } z_i^* < 0, \end{cases}$$

where $d_i$ is the treatment variable, $x_{3i}$ is a vector of exogenous explanatory variables, and $z_i^*$ is a latent variable related to $d_i$. More specifically, it could be the propensity to purchase private insurance or the propensity of being in an HMO.



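Endogeneity of $d_i$ is modeled through correlation between the unobserved variables $\varepsilon_{1i}$, $\varepsilon_{2i}$ and $u_i$, which are assumed to be jointly normally distributed:

$$(\varepsilon_{1i}, \varepsilon_{2i}, u_i) \sim N[(0, 0, 0), \Sigma]. \tag{2.6}$$

Thus, the outcome variables depend upon the treatment propensity through correlations between $\varepsilon_{1i}$, $\varepsilon_{2i}$ and $u_i$. Since $z_i^*$ in the selection equation (2.5) is unobservable, only the ratio $\alpha/\sqrt{\sigma_{uu}}$ is identified. We restrict the variance of the latent variable, $\sigma_{uu}$, to unity so that the covariance matrix is

$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{1u} \\ \sigma_{12} & \sigma_{22} & \sigma_{2u} \\ \sigma_{1u} & \sigma_{2u} & 1 \end{pmatrix}. \tag{2.7}$$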

The next section assigns prior distributions to the parameters of the model and

describes the estimation procedure, which uses the MCMC algorithms to construct

ergodic Markov chains converging to the posterior distributions of the parameters.

3. MCMC estimation

In our model the set of parameters to estimate is $\beta_1$, $\gamma_1$, $\beta_2$, $\gamma_2$, $\alpha$ and five elements of matrix $\Sigma$ with $\sigma_{uu} = 1$ as the identification restriction. If the sign of one coefficient of $\alpha$ were known a priori then one could fix it to $1$ or $-1$ as an alternative identification restriction. That would allow $\sigma_{uu}$ to vary and simplify the MCMC algorithm, leading us to a convenient form of the Wishart distribution for the parameters of matrix $\Sigma$. However, such information is not generally available. McCulloch and Rossi (1994) in their analysis of the multinomial probit model propose to specify proper priors for the full set of parameters $(\alpha, \Sigma)$ without any identification restrictions, derive the posterior distribution for both $\alpha$ and $\Sigma$ and report the marginal posterior of the identified parameters $(\alpha/\sqrt{\sigma_{uu}}, \Sigma \mid \sigma_{uu} = 1)$. Nobile (1998) proposes a hybrid sampler that improves convergence and autocorrelation properties of the algorithm. However, with this approach it is impossible to assign improper priors on the parameters. Nobile (2000) and Linardakis and Dellaportas (1999) propose algorithms to draw directly from the Wishart distribution conditional on one of the diagonal elements.

We use the data augmentation approach (Tanner and Wong, 1987) and include the unobservable variables $z_i^*$, $\varepsilon_{1i}$ and $\varepsilon_{2i}$ in the algorithm, drawing them at each iteration and for all observations. Denote $\varepsilon_i = (\varepsilon_{1i}, \varepsilon_{2i})'$, $\Sigma_{21} = E(\varepsilon_i u_i)$ and $\Sigma_{22} = E(\varepsilon_i \varepsilon_i')$. We follow the approach of Koop and Poirier (1997), Li (1998) and McCulloch et al. (2000) and write the joint distribution of $v_i = (\varepsilon_{1i}, \varepsilon_{2i}, u_i)'$ as the product of the marginal distribution of $u_i$ and the conditional distribution of $\varepsilon_i \mid u_i$. This allows us to use the Gibbs sampling algorithm (Geman and Geman, 1984) to draw values from the posterior distributions of the elements of matrix $\Sigma$. The distribution of $u_i$ is standard normal and the conditional distribution is $\varepsilon_i \mid u_i \sim N(\Sigma_{12}' u_i, \Omega)$, where $\Sigma_{12} = \Sigma_{21}'$ and $\Omega = \Sigma_{22} - \Sigma_{21}\Sigma_{12}$. Since there is a one-to-one



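correspondence between $\Sigma$ and $(\Sigma_{12}, \Omega)$,

$$\Sigma = \begin{pmatrix} \Omega + \Sigma_{21}\Sigma_{12} & \Sigma_{21} \\ \Sigma_{12} & 1 \end{pmatrix},$$

the MCMC procedure can be organized by blocking $\Sigma$ as $\Sigma_{12}$ and $\Omega$ and including them in the MCMC algorithm. Block the rest of the parameters as $(\beta_1, \gamma_1)$, $(\beta_2, \gamma_2)$ and $\alpha$. For brevity, denote: $\beta_1 = (\beta_1', \gamma_1)'$, $\beta_2 = (\beta_2', \gamma_2)'$, $\beta = (\beta_1, \beta_2)$ and $x_1 = (x_1', d)'$, $x_2 = (x_2', d)'$. Assume the following prior distributions: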

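$$\beta_1 \sim N(\beta_{01}, B_{01}^{-1}), \quad \beta_2 \sim N(\beta_{02}, B_{02}^{-1}), \quad \alpha \sim N(\alpha_0, A_0^{-1}),$$

$$\Omega^{-1} \sim \mathrm{Wish}(n_0, D_0), \quad \Sigma_{12} \sim N(\varpi_{12}, \Gamma_{12}^{-1}), \tag{3.1}$$

where $\beta_{01}$, $B_{01}^{-1}$, $\beta_{02}$, $B_{02}^{-1}$, $n_0$, $D_0$, $\varpi_{12}$, $\Gamma_{12}^{-1}$, $\alpha_0$, $A_0$ are known parameters, $N(\beta_0, B_0^{-1})$ denotes the multivariate normal distribution with mean vector $\beta_0$ and covariance matrix $B_0^{-1}$, and $\mathrm{Wish}(n_0, D_0)$ is the Wishart distribution with $n_0$ degrees of freedom and scale matrix $D_0$. Denote $\theta = (\beta_1, \beta_2, \alpha, \Omega^{-1}, \Sigma_{12})$. Then the joint posterior density of the parameters and unobservables $\varepsilon_i$ and $z_i^*$ given the data is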

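$$\pi(\theta, \varepsilon, z^* \mid y) = C\,\pi(\beta_1)\pi(\beta_2)\pi(\alpha)\pi(\Omega^{-1})\pi(\Sigma_{12}) \prod_{i=1}^{N} \big[I\{d_i = 1\}I\{z_i^* \ge 0\} + I\{d_i = 0\}I\{z_i^* < 0\}\big] \frac{\exp(-\mu_i)\mu_i^{y_{1i}}}{y_{1i}!}\, \lambda_i \exp(-\lambda_i y_{2i})\, \phi(\varepsilon_i \mid \Sigma_{12}' u_i, \Omega)\, \phi(u_i \mid 0, 1), \tag{3.2}$$

where $I\{\cdot\}$ is the indicator function, $\phi(\cdot \mid \mu, \sigma^2)$ is the p.d.f. of the $N[\mu, \sigma^2]$ distribution, $C$ is a proportionality constant, $\varepsilon = (\varepsilon_1\ \varepsilon_2)$, $z^*$, $y = (y_1\ y_2\ d)$ are matrices of $N$ observations and $u_i = z_i^* - x_{3i}'\alpha$.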
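The one-to-one correspondence between $\Sigma$ and $(\Sigma_{12}, \Omega)$ used above can be checked numerically. The following is a minimal sketch in plain Python, assuming the ordering $(\varepsilon_{1i}, \varepsilon_{2i}, u_i)$ and $\sigma_{uu} = 1$; the function names are hypothetical and are not taken from the paper.

```python
def sigma_from_blocks(sigma21, omega):
    """Rebuild the 3x3 covariance of (eps1, eps2, u) from Sigma21 and Omega.

    Assumes var(u) = 1, so that Sigma22 = Omega + Sigma21 * Sigma21'.
    """
    s1, s2 = sigma21
    return [
        [omega[0][0] + s1 * s1, omega[0][1] + s1 * s2, s1],
        [omega[1][0] + s2 * s1, omega[1][1] + s2 * s2, s2],
        [s1, s2, 1.0],
    ]


def blocks_from_sigma(sigma):
    """Inverse map: recover Sigma21 and Omega = Sigma22 - Sigma21 * Sigma21'."""
    s1, s2 = sigma[0][2], sigma[1][2]
    omega = [
        [sigma[0][0] - s1 * s1, sigma[0][1] - s1 * s2],
        [sigma[1][0] - s2 * s1, sigma[1][1] - s2 * s2],
    ]
    return [s1, s2], omega
```

Because $\Omega$ is unrestricted apart from positive definiteness, drawing $\Sigma_{12}$ and $\Omega$ separately and recombining them in this way always yields a matrix with unit variance for $u_i$.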

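We construct our Markov chain blocking the parameters as $\varepsilon_i = (\varepsilon_{1i}\ \varepsilon_{2i})$, $\beta_1$, $\beta_2$, $z_i^*$, $\alpha$, $\Sigma_{12}$ and $\Omega$ with the full conditional distributions

$$[\varepsilon_1, \varepsilon_2 \mid y, \Sigma_{12}, \Omega, \beta, u];\ [\beta_1 \mid y_1, \varepsilon_1];\ [\beta_2 \mid y_2, \varepsilon_2];\ [z^* \mid \varepsilon_1, \varepsilon_2, \alpha, \Sigma_{12}, \Omega];$$

$$[\alpha \mid \beta, z^*, \varepsilon_1, \varepsilon_2, \Sigma_{12}, \Omega];\ [\Sigma_{12} \mid \Omega, \varepsilon_1, \varepsilon_2, u] \text{ and } [\Omega^{-1} \mid \Sigma_{12}, \varepsilon_1, \varepsilon_2, u].$$

Notice that given $z^*$ and $\alpha$, $u$ is known with certainty, and when we condition on $u$, formally we condition on $z^*$ and $\alpha$, or $(z^* - x_3'\alpha)$. The following steps summarize our algorithm.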

3.1. Sampling $\varepsilon_1$ and $\varepsilon_2$

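The full conditional density for $\varepsilon_1$ and $\varepsilon_2$ is

$$\pi(\varepsilon_1, \varepsilon_2 \mid y, \Sigma_{12}, \Omega, \beta, u) = \prod_{i=1}^{N} \pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \beta, u_i),$$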



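the product of $N$ independent terms. We utilize the Metropolis–Hastings algorithm (Metropolis et al., 1953; Hastings, 1970) to sample $(\varepsilon_{1i}, \varepsilon_{2i})$ for each observation $i$ from the density

$$\pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \beta, u_i) = c_i \exp(-\exp(x_{1i}'\beta_1 + \varepsilon_{1i}))\,(\exp(\varepsilon_{1i}))^{y_{1i}} \exp(-\varepsilon_{2i}) \exp(-\exp(-x_{2i}'\beta_2 - \varepsilon_{2i})\,y_{2i})\,\phi(\varepsilon_i \mid \Sigma_{12}' u_i, \Omega),$$

where $c_i$ is a proportionality constant, and choose a t-distribution centered at the modal value of the full conditional density as the proposal density. Let

$$\hat\varepsilon_i = (\hat\varepsilon_{1i}, \hat\varepsilon_{2i}) = \arg\max \log \pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \beta, u_i)$$

and let $V_{\hat\varepsilon_i} = -(H_{\hat\varepsilon_i})^{-1}$ be the negative inverse of the Hessian of $\log \pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \beta, u_i)$ evaluated at the mode $\hat\varepsilon_i$. The gradient vector and the Hessian are derived in the Computational Appendix and used for a few steps of the Newton–Raphson algorithm to find the modal value, and the Hessian formula is used to calculate the covariance matrix of the proposal distribution $q(\varepsilon_i \mid y_i, \Sigma_{12}, \Omega, \beta, u_i) = f_T(\varepsilon_i \mid \hat\varepsilon_i, V_{\hat\varepsilon_i}, \nu)$, a bivariate t-distribution with $\nu$ degrees of freedom, a tuning parameter selected to obtain reasonable acceptance rates. When a proposal value $\varepsilon_i^* = (\varepsilon_{1i}^*, \varepsilon_{2i}^*)$ is drawn the chain moves to the proposal value with probability

$$\alpha(\varepsilon_i, \varepsilon_i^*) = \min\left\{ \frac{\pi(\varepsilon_i^* \mid y_i, \Sigma_{12}, \Omega, \beta, u_i)\, q(\varepsilon_i \mid y_i, \Sigma_{12}, \Omega, \beta, u_i)}{\pi(\varepsilon_i \mid y_i, \Sigma_{12}, \Omega, \beta, u_i)\, q(\varepsilon_i^* \mid y_i, \Sigma_{12}, \Omega, \beta, u_i)},\ 1 \right\}.$$

If the proposal value is rejected then the next state of the chain is at the current value $\varepsilon_i = (\varepsilon_{1i}, \varepsilon_{2i})$.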
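The step for a single observation can be sketched as follows. This is an illustrative plain-Python sketch rather than the implementation used in the paper: the gradient and Hessian below are derived directly from the log of the density above (the Computational Appendix is not reproduced here), the step-halving safeguard in the Newton–Raphson loop is an added assumption, and all function names are hypothetical. The inputs are `a` $= x_{1i}'\beta_1$, `b` $= x_{2i}'\beta_2$, `m` $= \Sigma_{12}'u_i$ and `P` $= \Omega^{-1}$.

```python
import math
import random


def _exp(x):
    return math.exp(min(x, 700.0))  # guard against floating-point overflow


def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def log_target(e, y1, y2, a, b, m, P):
    """Log full conditional of (eps1, eps2), up to an additive constant."""
    d0, d1 = e[0] - m[0], e[1] - m[1]
    quad = P[0][0] * d0 * d0 + 2.0 * P[0][1] * d0 * d1 + P[1][1] * d1 * d1
    return -_exp(a + e[0]) + y1 * e[0] - e[1] - y2 * _exp(-b - e[1]) - 0.5 * quad


def grad_hess(e, y1, y2, a, b, m, P):
    d0, d1 = e[0] - m[0], e[1] - m[1]
    mu, w = _exp(a + e[0]), y2 * _exp(-b - e[1])
    g = [-mu + y1 - (P[0][0] * d0 + P[0][1] * d1),
         -1.0 + w - (P[0][1] * d0 + P[1][1] * d1)]
    H = [[-mu - P[0][0], -P[0][1]], [-P[0][1], -w - P[1][1]]]
    return g, H


def find_mode(args, start, steps=25):
    """Newton-Raphson with step halving; returns the mode and V = -H^{-1}."""
    e = list(start)
    for _ in range(steps):
        g, H = grad_hess(e, *args)
        Hi = inv2(H)
        step = [-(Hi[0][0] * g[0] + Hi[0][1] * g[1]),
                -(Hi[1][0] * g[0] + Hi[1][1] * g[1])]
        t, base = 1.0, log_target(e, *args)
        while t > 1e-8 and log_target([e[0] + t * step[0], e[1] + t * step[1]], *args) < base:
            t *= 0.5
        e = [e[0] + t * step[0], e[1] + t * step[1]]
    _, H = grad_hess(e, *args)
    return e, inv2([[-H[0][0], -H[0][1]], [-H[1][0], -H[1][1]]])


def log_t_kernel(x, loc, Vinv, nu):
    d0, d1 = x[0] - loc[0], x[1] - loc[1]
    q = Vinv[0][0] * d0 * d0 + 2.0 * Vinv[0][1] * d0 * d1 + Vinv[1][1] * d1 * d1
    return -0.5 * (nu + 2.0) * math.log1p(q / nu)


def draw_t(loc, V, nu, rng):
    """Bivariate t draw: normal draw scaled by sqrt(nu / chi-square(nu))."""
    l00 = math.sqrt(V[0][0])
    l10 = V[1][0] / l00
    l11 = math.sqrt(V[1][1] - l10 * l10)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    s = math.sqrt(nu / rng.gammavariate(nu / 2.0, 2.0))
    return [loc[0] + s * l00 * z0, loc[1] + s * (l10 * z0 + l11 * z1)]


def mh_step(e, args, nu, rng):
    """One independence-chain M-H update of (eps1, eps2) for one observation."""
    mode, V = find_mode(args, start=args[4])
    Vinv = inv2(V)
    prop = draw_t(mode, V, nu, rng)
    log_ratio = (log_target(prop, *args) + log_t_kernel(e, mode, Vinv, nu)
                 - log_target(e, *args) - log_t_kernel(prop, mode, Vinv, nu))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return prop, True
    return e, False
```

Since the log density is strictly concave in $(\varepsilon_{1i}, \varepsilon_{2i})$, the negative Hessian is positive definite and the proposal covariance is always well defined; normalizing constants of the target and of the t-density cancel in the acceptance ratio, so only kernels are evaluated.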

3.2. Sampling $\beta_1$ and $\beta_2$

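The full conditional densities for $\beta_1$ and $\beta_2$ are

$$\pi(\beta_1 \mid y_1, \varepsilon_1) = C_1\, \phi(\beta_1 \mid \beta_{01}, B_{01}^{-1}) \prod_{i=1}^{N} \exp(-\exp(x_{1i}'\beta_1 + \varepsilon_{1i}))\,(\exp(x_{1i}'\beta_1 + \varepsilon_{1i}))^{y_{1i}},$$

$$\pi(\beta_2 \mid y_2, \varepsilon_2) = C_2\, \phi(\beta_2 \mid \beta_{02}, B_{02}^{-1}) \prod_{i=1}^{N} \exp(-x_{2i}'\beta_2 - \varepsilon_{2i}) \exp(-y_{2i} \exp(-x_{2i}'\beta_2 - \varepsilon_{2i})),$$

where $C_1$ and $C_2$ are proportionality constants. The Metropolis–Hastings algorithm is used again to draw samples from these densities. Denote by $\beta_j$ ($j = 1, 2$) the current state of the Markov chain, by $\hat\beta_j$ the mode of the full conditional density and by $\beta_j^*$ the candidate for the new value of the chain. Following Chib et al. (1998) the proposal density $q(\beta_j, \beta_j^*)$ is selected to be a multivariate t-distribution with $k$ degrees of freedom, $f_T(\beta_j^* \mid \hat\beta_j - (\beta_j - \hat\beta_j), \tau_j V_{\hat\beta_j})$, centered at $\hat\beta_j - (\beta_j - \hat\beta_j)$. This proposal density is symmetric in $\beta_j$ and $\beta_j^*$. The covariance matrix of the multivariate t-density is set to be $V_{\hat\beta_j} = -H_{\hat\beta_j}^{-1}$, the negative inverse of the Hessian of $\log \pi(\beta_j \mid y_j, \varepsilon_j)$ evaluated at the modal value $\hat\beta_j$, and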



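$\tau_j$ is an adjustable constant. The candidate $\beta_j^*$, drawn from the proposal distribution, is accepted with probability

$$\Pr(\beta_j, \beta_j^* \mid y_j, \varepsilon_j) = \min\left\{ 1,\ \frac{\pi(\beta_j^* \mid y_j, \varepsilon_j)}{\pi(\beta_j \mid y_j, \varepsilon_j)} \right\}.$$

If the candidate is not accepted then the chain does not change its value.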

3.3. Sampling $z^*$

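Variables $z_i^*$ are included in the MCMC algorithm (Albert and Chib, 1993). Sample $N$ independent random variables $z_i^*$ such that $z_i^* \mid \varepsilon_{1i}, \varepsilon_{2i}, \alpha, \Sigma_{12}, \Omega$ is normally distributed with mean $x_{3i}'\alpha + \Sigma_{12}\Sigma_{22}^{-1}(\varepsilon_{1i}\ \varepsilon_{2i})'$ and variance $1 - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ (where $\Sigma_{22} = \Omega + \Sigma_{21}\Sigma_{12}$), truncated at zero on the left if $d_i = 1$ and on the right if $d_i = 0$. To sample from the truncated normal we follow Geweke (1991). See also Devroye (1986, p. 380).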
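A minimal sketch of this step in plain Python is given below. Note that it draws from the truncated normal by inverting the normal c.d.f., which is a simpler substitute for the Geweke (1991) sampler cited above and can be inaccurate when the truncation point lies far in the tail; the function names are hypothetical.

```python
import random
from statistics import NormalDist


def conditional_moments(x3_alpha, sigma21, omega, eps):
    """Mean and variance of z* given (eps1, eps2), with var(u) = 1.

    x3_alpha is the scalar x3'alpha; Sigma22 = Omega + Sigma21 * Sigma21'.
    """
    s1, s2 = sigma21
    a = omega[0][0] + s1 * s1
    b = omega[0][1] + s1 * s2
    c = omega[1][1] + s2 * s2
    det = a * c - b * b
    # w = Sigma22^{-1} Sigma21, so that Sigma12 Sigma22^{-1} eps = w'eps
    w1 = (c * s1 - b * s2) / det
    w2 = (a * s2 - b * s1) / det
    mean = x3_alpha + w1 * eps[0] + w2 * eps[1]
    var = 1.0 - (s1 * w1 + s2 * w2)
    return mean, var


def draw_latent(mean, var, d, rng):
    """Draw z* ~ N(mean, var) restricted to z* >= 0 if d == 1, z* <= 0 if d == 0."""
    std = NormalDist()
    sd = var ** 0.5
    p0 = std.cdf((0.0 - mean) / sd)            # probability mass below zero
    lo, hi = (p0, 1.0) if d == 1 else (0.0, p0)
    p = lo + (hi - lo) * rng.random()
    p = min(max(p, 1e-12), 1.0 - 1e-12)        # keep inv_cdf away from 0 and 1
    z = mean + sd * std.inv_cdf(p)
    return max(z, 0.0) if d == 1 else min(z, 0.0)
```

With the covariance matrix used in the artificial data example of Section 5, the conditional variance evaluates to $2/3$.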

3.4. Sampling $\alpha$, $\Sigma_{12}$ and $\Omega$

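From Eqs. (2.3), (2.4) and given $\beta_1$, $\beta_2$ and $\varepsilon_1$, $\varepsilon_2$, variables $\log \mu$ and $\log 1/\lambda$ are known with certainty. Since variables $\log \mu$, $\log 1/\lambda$ and $z^*$ are multivariate normal, $\varepsilon_1$, $\varepsilon_2$ and $\alpha$ are jointly normal and the conditional distribution of $\alpha$ given $\varepsilon_1$, $\varepsilon_2$ is also normal. Denote the parameters of this conditional distribution as $N(\bar\alpha_{\alpha|\varepsilon}, \Lambda_{\alpha|\varepsilon}^{-1})$ (they are derived in the Computational Appendix). Thus, given $\beta_1$, $\beta_2$, $\varepsilon_1$, $\varepsilon_2$, $z^*$, $\Sigma_{12}$, $\Omega$ and the prior $N(\alpha_0, A_0^{-1})$, the posterior distribution of $\alpha$ is normal with mean $[A_0 + \Lambda_{\alpha|\varepsilon}]^{-1}[A_0\alpha_0 + \Lambda_{\alpha|\varepsilon}\bar\alpha_{\alpha|\varepsilon}]$ and variance $[A_0 + \Lambda_{\alpha|\varepsilon}]^{-1}$. It is straightforward to sample from this distribution.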

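Conditional on $\Omega^{-1}$, $\varepsilon_1$, $\varepsilon_2$ and $u$, and given the prior $\Sigma_{12} \sim N(\varpi_{12}, \Gamma_{12}^{-1})$, the posterior distribution of $\Sigma_{12}$ is normal with mean $(\Gamma_{12} + \Omega^{-1}u'u)^{-1}(\Gamma_{12}\varpi_{12} + \Omega^{-1}\varepsilon'u)$ and variance $(\Gamma_{12} + \Omega^{-1}u'u)^{-1}$.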

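Given $\Sigma_{12}$, $\varepsilon_1$, $\varepsilon_2$, $u$ and the prior $\Omega^{-1} \sim \mathrm{Wish}(n_0, D_0)$, the posterior distribution for $\Omega^{-1}$ is

$$\pi(\Omega^{-1} \mid v) \sim \mathrm{Wish}\left(n_0 + N,\ \Big[D_0^{-1} + \sum_{i=1}^{N} (\varepsilon_i - \Sigma_{12}' u_i)(\varepsilon_i - \Sigma_{12}' u_i)'\Big]^{-1}\right);$$

see Zellner (1971, p. 389) and Johnson (1987, pp. 203–204) for the details of the Wishart density and the algorithm that draws values from it.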
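The conjugate update for the covariance vector between $\varepsilon_i$ and $u_i$ described in this subsection amounts to a normal linear regression of $\varepsilon_i$ on $u_i$ with known error covariance $\Omega$. A minimal plain-Python sketch of the posterior moments follows; it computes the mean and variance only (a draw would additionally require a Cholesky factor of the variance), and the function and argument names are hypothetical.

```python
def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def sigma12_posterior(eps, u, omega, prior_mean, prior_prec):
    """Posterior mean and variance of the 2-vector of covariances (sigma_1u, sigma_2u).

    eps: list of (eps1_i, eps2_i); u: list of u_i; omega: 2x2 conditional covariance;
    prior_mean, prior_prec: mean vector and precision matrix of the normal prior.
    """
    P = inv2(omega)                                   # Omega^{-1}
    uu = sum(ui * ui for ui in u)                     # u'u
    eu = [sum(e[0] * ui for e, ui in zip(eps, u)),    # eps'u
          sum(e[1] * ui for e, ui in zip(eps, u))]
    prec = [[prior_prec[r][c] + P[r][c] * uu for c in range(2)] for r in range(2)]
    V = inv2(prec)
    rhs = [prior_prec[r][0] * prior_mean[0] + prior_prec[r][1] * prior_mean[1]
           + P[r][0] * eu[0] + P[r][1] * eu[1] for r in range(2)]
    mean = [V[r][0] * rhs[0] + V[r][1] * rhs[1] for r in range(2)]
    return mean, V
```

With a nearly flat prior the posterior mean reduces to the least squares slope of each $\varepsilon_{ji}$ on $u_i$.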

4. Model comparison

This section discusses the issues of Bayesian model comparisons. The model we consider in this paper permits endogeneity of the treatment variable. We would like to develop a decision rule that would allow one to select between two models: model $M_0$, with constraints $\Sigma_{12} = (0\ 0)$, and $M_1$ that leaves $\Sigma_{12}$ unconstrained. The Bayes factor



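for the null hypothesis $H_0: \Sigma_{12} = (0\ 0)$ is defined as

$$B_{0,1} = \frac{m(y \mid M_0)}{m(y \mid M_1)},$$

where $m(y \mid M_j)$ is the marginal likelihood of the model specification $M_j$. Since these two models are nested we take the Savage–Dickey density ratio approach (Verdinelli and Wasserman, 1995) to calculate the Bayes factor as

$$B_{0,1} = \frac{\pi(\Sigma_{12}^* \mid y)}{\pi(\Sigma_{12}^*)},$$

where $\pi(\Sigma_{12} \mid y)$ is the posterior density and $\pi(\Sigma_{12})$ is the prior density of parameter $\Sigma_{12}$, calculated at the point $\Sigma_{12}^* = (0\ 0)$. To estimate $\pi(\Sigma_{12}^* \mid y)$ we approximate it by averaging the full conditional density of $\Sigma_{12}$, $\pi(\Sigma_{12} \mid \Omega, \varepsilon_1, \varepsilon_2, u)$, with respect to a posterior sample $\Omega^s$, $\varepsilon_1^s$, $\varepsilon_2^s$ and $u^s$, $s = 1, \ldots, S$. Let $V_{12}^s = (\Gamma_{12} + \Omega_s^{-1}u^{s\prime}u^s)^{-1}$ and $\bar\Sigma_{12}^s = V_{12}^s(\Gamma_{12}\varpi_{12} + \Omega_s^{-1}\varepsilon^{s\prime}u^s)$. Then

$$\hat\pi(\Sigma_{12}^* \mid y) = \frac{1}{S}\sum_{s=1}^{S} \pi(\Sigma_{12}^* \mid \Omega^s, \varepsilon_1^s, \varepsilon_2^s, u^s),$$

where $\pi(\Sigma_{12}^* \mid \Omega^s, \varepsilon_1^s, \varepsilon_2^s, u^s)$ is the bivariate normal density with mean $\bar\Sigma_{12}^s$ and variance $V_{12}^s$ evaluated at $\Sigma_{12}^*$. One also has to evaluate the prior density at $\Sigma_{12}^*$. In general, less informative priors would favor the null hypothesis, so that improper priors are not applicable for testing. We will choose informative priors for $\Sigma_{12}$ without a large spread. The specification of the prior distributions is discussed in more detail in the next sections.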
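The Savage–Dickey estimate described in this section can be sketched in a few lines of plain Python. The sketch assumes that the conditional posterior mean and variance of the covariance vector have already been computed for each retained draw and are passed in as a list of pairs; the function names are hypothetical.

```python
import math


def bvn_pdf(x, mean, cov):
    """Density of a bivariate normal distribution at the point x."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    q = (cov[1][1] * d0 * d0 - 2.0 * cov[0][1] * d0 * d1 + cov[0][0] * d1 * d1) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def savage_dickey_bf(cond_moments, prior_mean, prior_cov, point=(0.0, 0.0)):
    """Estimate B_{0,1} as (average conditional posterior density) / (prior density).

    cond_moments: list of (mean, cov) pairs, one per posterior draw s = 1, ..., S.
    """
    post = sum(bvn_pdf(point, m, V) for m, V in cond_moments) / len(cond_moments)
    return post / bvn_pdf(point, prior_mean, prior_cov)
```

Values of the ratio far below one indicate evidence against the restriction, since the posterior then places much less density at the restricted point than the prior does.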

5. Example based on artificial data

As an example we generate artificial data according to the selection model and estimate it by the MCMC algorithm. The motivation of this exercise is to see how the method performs when the model is correctly specified and to investigate the existence of a self-selection bias when endogeneity of the treatment variable is ignored. To do that one can estimate the model under the following restriction: $\Sigma_{12} = (0\ 0)$. The MCMC algorithm is easily implemented under this restriction by fixing the draws of $\Sigma_{21}$ to the zero vector. The data set, consisting of 1000 ($i = 1, \ldots, 1000$) observations, is generated as follows.

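1. $x_{1i} = x_{2i} = (1, \xi_i)'$ and $\xi_i \sim N(0, 1)$, $x_{3i} = (1, w_i)'$ and $w_i \sim N(0, 1)$, $\alpha = (1, 1)$, $\beta_1 = (1, 1, 0.5)$, and $\beta_2 = (1, 1, 0.5)$.

2. $(u_i, \varepsilon_{1i}, \varepsilon_{2i}) \sim N[(0, 0, 0), \Sigma]$, where

$$\Sigma = \begin{pmatrix} 1 & 0.5 & 0.5 \\ 0.5 & 1 & 0.5 \\ 0.5 & 0.5 & 1 \end{pmatrix}.$$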



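3. Generate $z_i^* = x_{3i}'\alpha + u_i$ and $d_i$ such that

$$d_i = 1 \text{ iff } z_i^* \ge 0; \qquad d_i = 0 \text{ iff } z_i^* < 0.$$

The values of $\alpha$ and $x_3$ are chosen in such a way that about 75% of all generated observations have $d = 1$ and for the rest $d = 0$. Set $x_{1i} = (x_{1i}', d_i)'$ and $x_{2i} = (x_{2i}', d_i)'$. Denote by $k_1$, $k_2$ and $k_3$ the number of variables in $x_1$, $x_2$ and $x_3$, respectively.

4. Finally generate $y_{1i}$ and $y_{2i}$ as

$$y_{1i} \sim \mathcal{P}[\exp(x_{1i}'\beta_1 + \varepsilon_{1i})],$$

$$y_{2i} \sim \exp[1/\exp(x_{2i}'\beta_2 + \varepsilon_{2i})].$$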
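Steps 1 to 4 can be sketched in plain Python as follows. This is an illustrative sketch, not the code used for the paper: the parameter values are passed in by the caller, the Poisson draw counts unit-rate exponential arrivals (exact, but slow for very large means), and all function names are hypothetical.

```python
import math
import random


def cholesky(S):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = S[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            L[r][c] = math.sqrt(s) if r == c else s / L[c][c]
    return L


def poisson(mean, rng):
    """Poisson draw: number of unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate(n, alpha, beta1, beta2, sigma, seed=0):
    """Generate (y1, y2, d, xi, w) tuples; sigma is the covariance of (u, eps1, eps2)."""
    rng = random.Random(seed)
    L = cholesky(sigma)
    data = []
    for _ in range(n):
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        u, e1, e2 = (sum(L[r][c] * g[c] for c in range(r + 1)) for r in range(3))
        xi, w = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = alpha[0] + alpha[1] * w + u                  # latent selection index
        d = 1 if z >= 0 else 0
        eta1 = beta1[0] + beta1[1] * xi + beta1[2] * d + e1
        eta2 = beta2[0] + beta2[1] * xi + beta2[2] * d + e2
        y1 = poisson(math.exp(eta1), rng)                # count outcome
        y2 = rng.expovariate(math.exp(-eta2))            # rate 1/exp(eta2), mean exp(eta2)
        data.append((y1, y2, d, xi, w))
    return data
```

With $\alpha = (1, 1)$ the latent index has mean 1 and variance 2, so the probability of $d = 1$ is $\Phi(1/\sqrt{2}) \approx 0.76$, in line with the share of about 75% mentioned in step 3.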

We center priors for parameters $\alpha$, $\beta_1$ and $\beta_2$ at zero and choose them to be

$$\beta_1 \sim N(0_{k_1}, 10 I_{k_1}), \quad \beta_2 \sim N(0_{k_2}, 10 I_{k_2}), \quad \alpha \sim N(0_{k_3}, 10 I_{k_3}) \tag{5.1}$$

to reflect weak prior information on these parameters. Select a proper prior for $\Sigma$ and center it at the identity matrix (since in general there is no information on the covariance parameters $\sigma_{1u}$, $\sigma_{2u}$ and $\sigma_{12}$ we center their priors at zero). McCulloch et al. (2000), analyzing the multinomial probit model, point out that in order to choose such a prior for $\Sigma$ one can select the following priors for $\Sigma_{12}$ and $\Omega^{-1}$:

$$\Sigma_{12} \sim N(0_2, \tau I_2), \quad \Omega^{-1} \sim \mathrm{Wish}(n_0, (n_0 - 2)(1 - \tau) I_2) \tag{5.2}$$

and specify only two scalars, $\tau$ and $n_0$. In our example we choose $\tau = 1/8$ and $n_0 = 5$.

Table 1³ gives estimates of the posterior means, standard deviations and the autocorrelation function of the coefficients at lag 20. When endogeneity of the treatment variable is ignored it results in a self-selection bias and coefficients $\beta_1$ and $\beta_2$ are inconsistently estimated. We use the Savage–Dickey density ratio to calculate the Bayes factor to test $H_0: \Sigma_{12} = (0\ 0)$. The calculated value of $B_{0,1}$ is of order $10^{-6}$, which provides decisive evidence against $H_0$. This suggests potentially serious consequences of ignoring the self-selection problem. The influence of the self-selection bias on the predicted utilization and expenditure is examined in more detail in the next section.

In addition, we generate 3000 observations according to the selection model and estimate the model under the $M_1$ model specification. The results are given in Table 1.
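The lag-20 autocorrelation reported for each chain can be computed from the retained draws with the usual sample autocorrelation estimator. A minimal sketch in plain Python, with a hypothetical function name:

```python
def autocorr(chain, lag):
    """Sample autocorrelation of a sequence of MCMC draws at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var
```

For example, `autocorr(draws, 20)` applied to the stored draws of one coefficient gives the kind of quantity listed in the ACF(20) columns; values near zero indicate that draws 20 iterations apart are nearly uncorrelated.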

Fig. 1 displays the prior and posterior distributions (1000 and 3000 observations) for one of the parameters, $\sigma_{2u}$. The histograms are based on 20 000 iterations. The solid lines, $\sigma_{2u} = 0.5$, are drawn at the true value of the data generating process. The

³ We run 40 000 replications following first a burn-in phase of 1000 replications. During the burn-in phase the Markov chains converge to the stationary distributions. The posterior means and posterior standard deviations are calculated based on the 40 000 draws. Values for tuning parameters, $\tau_1 = 0.7$, $\tau_2 = 0.9$ and $k = \nu = 15$, are obtained in short preliminary runs by choosing reasonable acceptance rates close on average to 0.3 and examining the serial correlations of the Markov chains.



Table 1
MCMC estimation for generated data (posterior means, with posterior standard deviations in parentheses)

Coefficient      True value   Unrestricted, N = 1000   Restricted, N = 1000   Unrestricted, N = 3000
                              Mean       ACF(20)       Mean       ACF(20)     Mean
Const1           1            1.111      0.560         1.523      0.148       0.955
                              (0.125)                  (0.072)                (0.069)
x1               1            0.990      0.174         0.966      0.373       1.022
                              (0.036)                  (0.039)                (0.022)
d                0.5          0.450      0.586         0.132      0.134       0.529
                              (0.154)                  (0.083)                (0.084)
Const2           1            1.043      0.402         0.665      0.018       1.005
                              (0.187)                  (0.096)                (0.091)
x2               1            1.029      0.024         0.923      0.036       1.003
                              (0.048)                  (0.048)                (0.028)
d                0.5          0.522      0.488         0.017      0.012       0.495
                              (0.251)                  (0.109)                (0.114)
Const3           1            1.080      0.071         0.986      0.003       1.062
                              (0.059)                  (0.059)                (0.034)
x3               1            0.992      0.299         1.048      0.004       1.018
                              (0.074)                  (0.069)                (0.039)
$\sigma_{1u}$    0.5          0.512      0.453         -          -           0.474
                              (0.099)                                         (0.060)
$\sigma_{2u}$    0.5          0.551      0.504         -          -           0.500
                              (0.169)                                         (0.079)
$\sigma_{11}$    1            0.954      0.189         0.979      0.031       0.981
                              (0.066)                  (0.060)                (0.039)
$\sigma_{12}$    0.5          0.520      0.236         0.507      0.102       0.475
                              (0.069)                  (0.055)                (0.036)
$\sigma_{22}$    1            1.089      0.081         0.979      0.052       0.996
                              (0.112)                  (0.098)                (0.065)

results indicate that the impact of the priors diminishes as the number of observations increases, the posterior means move closer to the true values of the parameters and the standard deviations of the posterior means become smaller. Overall the estimation algorithm performs well: it produces Markov chains with reasonable properties, and the estimated results are consistent with the true parameters.

6. Empirical application

In this section we investigate the self-selectivity of private insurance considering two different situations. The model is applied to two different data sets from two household-based medical expenditure surveys sponsored by the Agency for Healthcare Research and Quality (AHRQ). These are nationally representative surveys of healthcare use, expenditure, source of payment and insurance coverage for the US civilian



[Figure: three histogram panels titled "Prior distribution", "Posterior distribution (N=1000)" and "Posterior distribution (N=3000)"; the vertical lines in the posterior panels mark $\sigma_{2u} = 0.5$.]

Fig. 1. Prior and posterior distributions for $\sigma_{2u}$ in the artificial data example.

noninstitutionalized population. The first sample pertains to the U.S. elderly population and the second to the U.S. nonelderly population. Both data sets are based on publicly available data at AHRQ and contain only individuals with positive healthcare expenditure. Definitions and summary statistics for the variables from the data sets used in this paper are given in Table 2.

6.1. Private insurance

First, we analyze the impact of private insurance on the number of physician doctor visits and associated expenditures by elderly Americans. A sample of 3690 observations is obtained from the National Medical Expenditure Survey conducted in 1987 and 1988 (NMES, 1987).



Table 2
Variable definition and summary statistics

                                                                                     MEPS (N = 2893)      NMES (N = 3690)
Variable     Definition                                                              Mean     St. Dev.    Mean     St. Dev.
DOCVIS       Number of physician office visits                                       4.74     6.30        6.88     6.85
DVEXP        Expenditure on physician office visits                                  481.8    972.0       424.2    788.5
EXCHLTH      Equals 1 if self perceived health is excellent                          0.32     0.47        0.07     0.25
POORHLTH     Equals 1 if self perceived health is poor                               0.02     0.13        0.13     0.34
NUMCHRON     Number of chronic conditions                                            0.75     1.11        1.66     1.35
ADLDIFF      Equals 1 if the person has a condition which limits activities          -        -           0.21     0.41
             of daily living
INJURY       Number of injuries which limit activities of daily living during 1996   0.42     0.83        -        -
NOREAST      Equals 1 if the person lives in northeastern U.S.                       0.20     0.40        0.19     0.39
MIDWEST      Equals 1 if the person lives in midwestern U.S.                         0.25     0.44        0.26     0.44
WEST         Equals 1 if the person lives in western U.S.                            0.21     0.41        0.19     0.39
AGE          Age in years (divided by 10)                                            4.03     1.29        7.41     0.62
BLACK        Equals 1 if the person is African American                              0.10     0.30        0.10     0.31
FEMALE       Equals 1 if the person is female                                        0.58     0.49        0.61     0.49
MARRIED      Equals 1 if the person is married                                       0.65     0.48        0.55     0.50
SCHOOL       Number of years of education                                            13.28    2.58        10.5     3.7
FAMINC       Family income in $1,000                                                 58.84    38.63       25.6     29.9
EMPLOYED     Equals 1 if the person is employed                                      0.82     0.38        0.10     0.30
PRIVATE      Equals 1 if the person is covered by private health insurance           -        -           0.80     0.40
INSURANCE    Equals 0 if the person is in an HMO, 1 if the person has a FFS plan     0.51     0.50        -        -
MEDICAID     Equals 1 if the person is covered by Medicaid                           -        -           0.09     0.29
SELFEMP      Equals 1 if the person is self-employed                                 0.09     0.28        -        -
SIZE         The size of the company where the person works                          124.0    176.1       -        -
LOCATION     Equals 1 if the company has multiple locations                          0.53     0.50        -        -
GOVT         Equals 1 if the company is governmental                                 0.18     0.39        -        -

Deb and Trivedi (1997) and Munkin and Trivedi (2000) treated private insurance as an exogenous variable for two reasons. First, individuals have strong incentives to purchase private insurance before they are 65 years old because its price rises sharply after that age. Second, individuals older than 66 are covered by Medicare, a generous public insurance program that offers substantial protection against healthcare costs. Medicare covers expenses of mostly acute healthcare needs including those associated with the type of utilization that we consider, physician doctor visits, but not the costs of long-term healthcare. Hence some will choose to purchase additional insurance such as private insurance to cover out-of-pocket expenses, justifying the treatment of private insurance as an endogenous variable.

We also compare the relative impact of private insurance and Medicaid on the level of healthcare utilization and expenditure. Medicaid provides health insurance to low-income individuals at public expense by covering the cost difference between the

7/30/2019 1-s2.0-S0304407602002233-main

14/24


cost of health service and the Medicare coverage. Those ineligible for Medicaid may

purchase private insurance with coverage similar to that of Medicaid.

The three-equation joint model analyzes the number of doctor visits (DOCVIS), doctor visit expenditure (DVEXP) and private insurance (d). We choose the components of vectors x1, x2 and x3 as follows. The determinants of healthcare consumption and expenditure, vectors x1 and x2, contain the same set of variables: the self-perceived health status variables EXCLHLTH and POORHLTH; the measures of chronic diseases and disability status NUMCHRON and ADLDIFF; the geographical variables NOREAST, MIDWEST and WEST; the demographic variables BLACK, MALE, MARRIED, SCHOOL and AGE; the economic variable EMPLOYED; and the insurance variables MEDICAID and PRIVINS (d). In general, the health status of an individual is unobservable and difficult to measure. However, self-perceived health variables, together with the evaluation of chronic conditions and disability status, have proven to be a good measure of health status. The geographical variables are included to capture differences in the local insurance and healthcare markets. Vector x3 comprises factors that influence the decision to purchase private insurance, including EXCLHLTH, POORHLTH, NUMCHRON, ADLDIFF, NOREAST, MIDWEST, WEST, BLACK, MALE, MARRIED, SCHOOL, EMPLOYED, AGE and FAMINC. MEDICAID, which targets low-income individuals, is excluded from the insurance equation.

In a nonlinear simultaneous model identification can in principle be secured by nonlinearity of the functional forms (McManus, 1992). However, genuine exclusion restrictions, if available, ensure more robust identification of causal parameters (Heckman, 2000). Thus motivated, we assume that the variable FAMINC influences the decision to purchase private (Medigap) insurance, but does not affect utilization. This restriction is empirically supported in this study.

Our choice of priors is similar to that in the numerical example of the previous

section and is given by (5.1) and (5.2).

The posterior means and posterior standard deviations are presented in Table 3,⁴ which also presents estimation results for the restricted model with $(\sigma_{1u}, \sigma_{2u}) = (0, 0)$. The posterior mean estimates of the coefficient of PRIVINS differ substantially between the restricted model (Table 3, columns 5 and 6) and the unrestricted model (columns 2 and 3). The restricted coefficients are positive and relatively precisely estimated, whereas the unrestricted coefficients, although also positive, are quite imprecisely estimated; i.e., they become statistically insignificant under endogeneity assumptions. This result is consistent with the presence of selection bias in the following sense. If, after accounting for correlation between the insurance and utilization decisions, PRIVINS no longer has a significant coefficient, then the presence of selection bias is confirmed. However, for this argument to be plausible, our estimates of the covariance parameters $\sigma_{1u}$ and $\sigma_{2u}$ must be significantly different from zero. Although both are estimated to have positive posterior means, their standard deviations are too large to permit reliable inference. Consequently, the result on selection bias is inconclusive.

4 The MCMC estimation is based on 40 000 replications. Values for tuning parameters 1 = 2 = 0:1 and

k = v = 15 are selected.


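Table 3

MCMC estimates of the restricted and unrestricted models for private insurance. M1 is the unrestricted model and M0 the restricted model; the posterior standard deviation appears on the line below each posterior mean.

| Variable | M1 Insurance | M1 Docvis | M1 Dvexp | M0 Insurance | M0 Docvis | M0 Dvexp |
|---|---|---|---|---|---|---|
| CONST | 0.172 | 1.402 | 4.925 | 0.179 | 1.336 | 4.889 |
| | 0.334 | 0.172 | 0.190 | 0.350 | 0.151 | 0.136 |
| EXCLHLTH | 0.098 | 0.278 | 0.245 | 0.106 | 0.281 | 0.247 |
| | 0.110 | 0.055 | 0.082 | 0.111 | 0.055 | 0.082 |
| POORHLTH | 0.232 | 0.266 | 0.291 | 0.235 | 0.274 | 0.300 |
| | 0.073 | 0.045 | 0.077 | 0.076 | 0.043 | 0.072 |
| NUMCHRON | 0.014 | 0.134 | 0.143 | 0.014 | 0.133 | 0.141 |
| | 0.020 | 0.011 | 0.018 | 0.020 | 0.011 | 0.019 |
| ADLDIFF | 0.216 | 0.076 | 0.107 | 0.219 | 0.081 | 0.110 |
| | 0.064 | 0.040 | 0.069 | 0.067 | 0.039 | 0.065 |
| MEDICAID | | 0.204 | 0.254 | | 0.206 | 0.253 |
| | | 0.056 | 0.082 | | 0.056 | 0.079 |
| PRIVINS | | 0.057 | 0.227 | | 0.198 | 0.337 |
| | | 0.192 | 0.285 | | 0.041 | 0.064 |
| NOREAST | 0.136 | 0.078 | 0.188 | 0.135 | 0.076 | 0.186 |
| | 0.071 | 0.038 | 0.065 | 0.074 | 0.039 | 0.064 |
| MIDWEST | 0.306 | 0.0003 | 0.049 | 0.307 | 0.005 | 0.051 |
| | 0.067 | 0.039 | 0.064 | 0.070 | 0.036 | 0.059 |
| WEST | 0.144 | 0.101 | 0.325 | 0.141 | 0.102 | 0.327 |
| | 0.071 | 0.039 | 0.064 | 0.075 | 0.038 | 0.062 |
| BLACK | 0.881 | 0.038 | 0.047 | 0.874 | 0.022 | 0.039 |
| | 0.074 | 0.080 | 0.119 | 0.076 | 0.051 | 0.078 |
| MALE | 0.018 | 0.030 | 0.004 | 0.015 | 0.030 | 0.005 |
| | 0.059 | 0.032 | 0.054 | 0.059 | 0.032 | 0.052 |
| MARRIED | 0.266 | 0.041 | 0.008 | 0.266 | 0.045 | 0.011 |
| | 0.059 | 0.035 | 0.059 | 0.060 | 0.032 | 0.051 |
| SCHOOL | 0.100 | 0.017 | 0.026 | 0.100 | 0.016 | 0.024 |
| | 0.007 | 0.007 | 0.010 | 0.008 | 0.004 | 0.007 |
| EMPLOYED | 0.062 | 0.001 | 0.020 | 0.061 | 0.001 | 0.020 |
| | 0.096 | 0.052 | 0.079 | 0.099 | 0.049 | 0.077 |
| AGE | 0.007 | 0.038 | 0.022 | 0.006 | 0.039 | 0.023 |
| | 0.042 | 0.018 | 0.020 | 0.044 | 0.019 | 0.020 |
| FAMINC | 0.061 | | | 0.062 | | |
| | 0.014 | | | 0.014 | | |
| $\sigma_{1u}$, $\sigma_{2u}$ | | 0.076 | 0.059 | | | |
| | | 0.106 | 0.154 | | | |
| $\sigma_{11}$, $\sigma_{22}$ | | 0.503 | 0.777 | | 0.496 | 0.762 |
| | | 0.021 | 0.043 | | 0.016 | 0.036 |
| $\sigma_{12}$ | | 0.553 | | | 0.550 | |
| | | 0.027 | | | 0.020 | |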

Table 4

Predicted values for doctor visits and expenditure. Private insurance/Medicaid. Pooled model.

| | Private insurance, no Medicaid: Excellent | Private insurance, no Medicaid: Poor | Medicaid, no private insurance: Excellent | Medicaid, no private insurance: Poor | No Medicaid, no private insurance: Excellent | No Medicaid, no private insurance: Poor |
|---|---|---|---|---|---|---|
| Number of visits | 4.55 | 9.72 | 5.13 | 10.87 | 4.23 | 9.03 |
| | 0.19 | 0.30 | 0.30 | 0.46 | 0.18 | 0.29 |
| Expenditure | 296.03 | 606.18 | 350.66 | 670.11 | 274.22 | 535.96 |
| | 18.69 | 33.87 | 31.07 | 47.42 | 18.01 | 30.18 |

We calculate the Bayes factor using the Savage–Dickey density ratio. M1 is the unrestricted specification that allows endogeneity of the treatment variable and M0 is the one that ignores it. The calculated Bayes factor value is $B_{0,1} = 2.95468$ (4.27171). Kass and Raftery (1995) indicate that if $B_{0,1}$ does not exceed 10 there is no strong evidence against M1, and if it exceeds 100 the evidence is decisive. In our case there is no strong evidence in favor of either model, and neither can be ignored in the Bayesian inferential framework. Given the value of the Bayes factor, and assuming that the prior model probabilities are equal, $P(M_0) = P(M_1) = 1/2$, the posterior model probabilities are $P(M_0|y) = 0.74713$ and $P(M_1|y) = 0.25287$. Following Draper (1995) we form our predictive distribution (pooled model) by averaging the posterior densities obtained under model specifications M0 and M1, using the posterior model probabilities as weights. For example, the weighted coefficients of PRIVINS in the utilization and expenditure equations are 0.162 (0.079) and 0.309 (0.120), respectively. These effects are significantly positive, but slightly weaker than those predicted by the restricted model. We do not fully report the moments of the pooled posterior density, but we use it as the predictive distribution in the following calculations.
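The step from the Bayes factor to the posterior model probabilities, and from there to the pooled PRIVINS coefficients, can be retraced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the inputs are the PRIVINS entries of Table 3, and applying the same weights to the posterior standard deviations is an observation about the reported numbers rather than a statement of how the pooled moments were actually computed.

```python
def posterior_model_probs(b01, prior_m0=0.5):
    """Return (P(M0|y), P(M1|y)) from the Bayes factor B01 and the prior P(M0)."""
    prior_odds = prior_m0 / (1.0 - prior_m0)
    post_odds = b01 * prior_odds  # posterior odds of M0 against M1
    p0 = post_odds / (1.0 + post_odds)
    return p0, 1.0 - p0


def pooled(value_m0, value_m1, p0):
    """Probability-weighted average of a quantity reported under M0 and M1."""
    return p0 * value_m0 + (1.0 - p0) * value_m1


p0, p1 = posterior_model_probs(2.95468)

# PRIVINS posterior means and standard deviations as listed in Table 3
# (restricted model M0 first, unrestricted model M1 second).
docvis_mean = pooled(0.198, 0.057, p0)
dvexp_mean = pooled(0.337, 0.227, p0)
docvis_sd = pooled(0.041, 0.192, p0)
dvexp_sd = pooled(0.064, 0.285, p0)

print(f"P(M0|y) = {p0:.4f}, P(M1|y) = {p1:.4f}")
print(f"DOCVIS: {docvis_mean:.3f} ({docvis_sd:.3f})")
print(f"DVEXP:  {dvexp_mean:.3f} ({dvexp_sd:.3f})")
```

With equal prior probabilities the weights come out at roughly 0.747 and 0.253, and the weighted values round to 0.162 (0.079) and 0.309 (0.120), which agrees with the pooled coefficients quoted in the text.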

In Table 4 we compare the levels of healthcare use and expenditure for three groups defined according to their insurance status: those individuals who have private insurance and no Medicaid; those who have Medicaid and no private insurance; and those who have neither private insurance nor Medicaid. The individuals in the last group are covered only by Medicare. We divide these three categories further according to self-perceived health status (excellent or poor health). The mean function for $y_j$ ($j = 1, 2$), after integrating out unobserved heterogeneity, has a closed form,

$$E[y_j \mid \theta_j] = \Phi(x_3'\alpha + \sigma_{ju}) \exp\left(x_j'\beta_j + \rho_j + \frac{\sigma_{jj}}{2}\right) + \left(1 - \Phi(x_3'\alpha + \sigma_{ju})\right) \exp\left(x_j'\beta_j + \frac{\sigma_{jj}}{2}\right),$$

where $\theta_j = (\beta_j, \alpha, \sigma_{ju}, \sigma_{jj})$.
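A minimal sketch of this mean function may help when checking predicted values by hand. It assumes the reading in which the first term carries the treatment coefficient and a normal distribution function weights the two regimes; the argument names (`xb` for the outcome index, `x3a` for the insurance index, `rho` for the treatment coefficient, `sigma_ju` and `sigma_jj` for the covariance and variance terms) are hypothetical labels, not the paper's notation.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_outcome(xb, x3a, rho, sigma_ju, sigma_jj):
    """Closed-form mean of outcome j after integrating out heterogeneity.

    xb       : outcome-equation index, excluding the treatment term
    x3a      : insurance-equation index
    rho      : coefficient on the treatment indicator (assumed name)
    sigma_ju : covariance between the outcome error and the insurance error
    sigma_jj : variance of the outcome error
    """
    p_treated = std_normal_cdf(x3a + sigma_ju)
    base = math.exp(xb + 0.5 * sigma_jj)
    # Treated regime is scaled by exp(rho); untreated regime is the base mean.
    return p_treated * base * math.exp(rho) + (1.0 - p_treated) * base


# Illustrative inputs only, not estimates from the paper.
example = expected_outcome(xb=1.0, x3a=0.3, rho=0.2, sigma_ju=0.1, sigma_jj=0.5)
print(example)
```

When `rho` is zero the two regimes coincide and the function collapses to `exp(xb + sigma_jj / 2)`, which is a quick sanity check on any implementation.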

We calculate posterior moments of the dependent variables for each group, evaluated at the group's mean value of the regressors, $(\bar{x}_j, \bar{x}_3)$, and taken with respect to the pooled posterior distribution of the parameters $\theta_j$. The posterior moments are approximated as

$$E[y_j \mid \bar{x}_j, \bar{x}_3] = \frac{1}{S} \sum_{i=1}^{S} E[y_j \mid \bar{x}_j, \bar{x}_3, \theta_{ji}], \quad j = 1, 2,$$

where S is the posterior sample size. The results are presented in Table 4. Based on the estimation results one could conclude that, for the poor health group, utilization is higher for Medicaid patients than for those with private insurance, and higher for those with private insurance than for those with Medicare only, by about one visit a year in each case. However, for the excellent health group the differences are not as large. These results suggest that the additional impact of Medigap insurance on average utilization levels for the Medicare elderly is relatively small, albeit larger for those in poor health. The impact of Medicaid is slightly larger on average, of the order of two visits for those in poor health and about one visit for those in excellent health.
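The averaging over posterior draws described above can be sketched as follows. Everything here is an assumption made for illustration: the dictionary keys, the helper names and the two parameter draws are hypothetical, and in practice the draws would come from the MCMC output of M0 and M1 mixed in proportion to the posterior model probabilities.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def mean_given_draw(xbar_j, xbar_3, draw):
    """Conditional mean of the outcome for one parameter draw."""
    p = std_normal_cdf(dot(xbar_3, draw["alpha"]) + draw["sigma_ju"])
    base = math.exp(dot(xbar_j, draw["beta"]) + 0.5 * draw["sigma_jj"])
    return p * base * math.exp(draw["rho"]) + (1.0 - p) * base


def posterior_mean_and_sd(xbar_j, xbar_3, draws):
    """Average the conditional mean over S draws; also return its spread."""
    values = [mean_given_draw(xbar_j, xbar_3, d) for d in draws]
    s = len(values)
    mean = sum(values) / s
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / s)
    return mean, sd


# Hypothetical group means of the regressors and two hypothetical draws.
xbar_j = [1.0, 0.4]
xbar_3 = [1.0, 0.4, 2.5]
draws = [
    {"beta": [1.2, 0.1], "alpha": [0.1, 0.0, 0.05], "rho": 0.15,
     "sigma_ju": 0.05, "sigma_jj": 0.5},
    {"beta": [1.3, 0.1], "alpha": [0.2, 0.0, 0.05], "rho": 0.10,
     "sigma_ju": 0.00, "sigma_jj": 0.5},
]
pred_mean, pred_sd = posterior_mean_and_sd(xbar_j, xbar_3, draws)
print(pred_mean, pred_sd)
```

Holding the regressors at a group mean and letting only the parameters vary across draws is what makes the reported spread a measure of parameter uncertainty rather than of individual heterogeneity.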

6.2. HMO versus FFS

In this application we study the choice of a specific type of private insurance by individuals aged between 16 and 65 years. The individuals choose between two types of private insurance: FFS options and HMO plans. The HMO plan serves as a proxy for managed care organizations, which often control costs and access through features such as provider networks, gatekeeping, provider payment mechanisms and so forth. The literature has emphasized that HMOs may increase the utilization of certain types of care, e.g. preventive care, while reducing that of other, more expensive types of care, e.g. hospital nights.

Favorable selection into HMO plans means that those who expect to be low users of services will tend to enrol in these plans, while those who expect to be heavy users will enrol in the indemnity (FFS) plans. If expected future usage can be adequately proxied by observed variables, then such selection can be controlled for by introducing appropriate proxy variables in the insurance and utilization equations (Reschovsky, 2000). Under this scenario, estimation is considerably simplified. Ignoring the endogeneity issue, several studies claim that HMO and FFS plans are similar in meeting individuals' needs for healthcare services and in covering the associated costs.

An issue that is relevant in discussing the endogeneity of insurance plans is that individuals may have no choice, or only very limited choice, of insurance plan. An overwhelming majority (80%) of our sample are employed. A high proportion of these are thought to have very limited choice of plans. This factor softens the impact of the endogeneity issue even though it is not equivalent to exogenous assignment of insurance plans. An example of a factor subsumed under unobserved heterogeneity is attitude towards health risk. For example, a risk-averse individual may choose a health plan conservatively and may see a doctor more often than an individual who behaves like a risk lover. Attitude towards health risk is not directly observed in our sample.

We use data from the 1996 Medical Expenditure Panel Survey (MEPS). These are collected from each household in a series of five rounds of data collection over a period of 2.5 years. The first round of the data consists of 10 639 households with more than 23 000 individuals. Our MEPS sample size is 2893 and it consists of privately insured individuals aged from 16 to 65 years whose healthcare expenditure is positive. About 50% have FFS-type insurance and the other half purchased their insurance through an HMO. The categorical variable INSURANCE takes the value 1 for the FFS category and 0 for the HMO category. Neither Medicare nor Medicaid is relevant for this nonelderly sample.

As before, the selection model is estimated for the number of doctor visits (DOCVIS), doctor visit expenditure (DVEXP) and insurance status (d). Vectors x1 and x2 include EXCLHLTH, POORHLTH, NUMCHRON, INJURY, BLACK, FEMALE, MARRIED, SCHOOL, EMPLOYED, AGE, NOREAST, MIDWEST, WEST and INSURANCE (d).

As mentioned before, limited insurance choices affect the selection process. More than 80% of the individuals in the data set are employed and some employers provide only limited insurance options. If data were available, this problem of constraints on selection could be solved by restricting the sample to only those individuals who had an actual choice between FFS and HMO when selecting their insurance plans. However, this study takes a different approach. Including variables that control for the type of company and for employment status, such as size, existence of multiple locations, being self-employed and belonging to a governmental organization, could capture the effect of the employer's constraint on selection. Vector x3 consists of EXCLHLTH, POORHLTH, NUMCHRON, INJURY, NOREAST, MIDWEST, WEST, BLACK, FEMALE, MARRIED, SCHOOL, EMPLOYED, AGE, SIZE, GOVT, LOCATION, SELFEMP and FAMINC. The geographical variables are included to control for inequalities in HMO penetration and differences in local prices.

The prior distributions of the parameters are the same as those in the previous model.

The posterior means and posterior standard deviations of the parameters are given in Table 5.⁵ For this sample neither the restricted nor the unrestricted estimates suggest that the FFS plan has a significant positive impact on doctor visits or expenditures relative to the HMOs. Once again the covariances $\sigma_{1u}$ and $\sigma_{2u}$ are quite imprecisely estimated, so the evidence in support of the endogeneity hypothesis remains weak.

The Bayes factor value is $B_{0,1} = 1.77255$ (2.93544) and the posterior model probabilities are $P(M_0|y) = 0.63932$ and $P(M_1|y) = 0.36068$. According to these results there is again no strong evidence in favor of either model. We use the posterior model probabilities to calculate the predictive distribution (pooled model) as the weighted average. The effect of INSURANCE on utilization and expenditure in the pooled model is 0.084 (0.146) and 0.125 (0.167), respectively.

Calculations similar to those in the previous section are made for four different groups, defined by whether the individual belongs to an HMO or an FFS plan and by health status (excellent or poor health). The expected utilization and expenditure for all groups are presented in Table 6; the results are based on the posterior distribution of the pooled model. The results are not surprising given that the posterior mean estimates of $\sigma_{1u}$ and $\sigma_{2u}$, as well as the coefficients of the INSURANCE variable, are not significantly different from zero. The average number of visits is almost the same for HMO and FFS enrollees in the excellent health group. However, in the poor health group HMO patients have a slightly higher utilization level.
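The posterior model probabilities quoted for this application follow from the Bayes factor by Bayes' rule. As a worked check, assuming equal prior model probabilities as in the previous application (an assumption carried over here, since this subsection does not restate the prior):

```latex
P(M_0 \mid y)
  = \frac{B_{0,1}\, P(M_0)}{B_{0,1}\, P(M_0) + P(M_1)}
  = \frac{1.77255}{1 + 1.77255}
  \approx 0.63932,
\qquad
P(M_1 \mid y) = 1 - P(M_0 \mid y) \approx 0.36068 .
```

Both values agree with the reported probabilities to five decimals, so the weights used for the pooled model are consistent with the stated Bayes factor.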

5 The results are based on 40 000 replications. The following values of tuning parameters are selected:

1 = 2 = 0:1 and k = v = 15.


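Table 5

MCMC estimates of the HMO/FFS model. M1 is the unrestricted model and M0 the restricted model; the posterior standard deviation appears on the line below each posterior mean.

| Variable | M1 Insurance | M1 Docvis | M1 Dvexp | M0 Insurance | M0 Docvis | M0 Dvexp |
|---|---|---|---|---|---|---|
| CONST | 0.094 | 0.584 | 4.650 | 0.093 | 0.425 | 4.508 |
| | 0.159 | 0.208 | 0.246 | 0.159 | 0.110 | 0.156 |
| EXCLHLTH | 0.098 | 0.169 | 0.079 | 0.098 | 0.181 | 0.083 |
| | 0.051 | 0.041 | 0.061 | 0.053 | 0.039 | 0.056 |
| POORHLTH | 0.115 | 0.402 | 0.589 | 0.103 | 0.390 | 0.579 |
| | 0.179 | 0.126 | 0.176 | 0.187 | 0.116 | 0.169 |
| NUMCHRON | 0.004 | 0.214 | 0.235 | 0.003 | 0.214 | 0.235 |
| | 0.023 | 0.016 | 0.026 | 0.024 | 0.016 | 0.026 |
| INJURY | 0.019 | 0.152 | 0.180 | 0.019 | 0.151 | 0.179 |
| | 0.028 | 0.019 | 0.032 | 0.029 | 0.019 | 0.032 |
| INSURANCE | | 0.285 | 0.220 | | 0.030 | 0.071 |
| | | 0.347 | 0.371 | | 0.033 | 0.052 |
| NOREAST | 0.072 | 0.117 | 0.076 | 0.075 | 0.121 | 0.079 |
| | 0.065 | 0.049 | 0.076 | 0.067 | 0.048 | 0.074 |
| MIDWEST | 0.225 | 0.014 | 0.119 | 0.226 | 0.019 | 0.125 |
| | 0.062 | 0.049 | 0.077 | 0.063 | 0.046 | 0.071 |
| WEST | 0.426 | 0.026 | 0.163 | 0.430 | 0.035 | 0.169 |
| | 0.066 | 0.064 | 0.095 | 0.067 | 0.049 | 0.074 |
| BLACK | 0.162 | 0.137 | 0.201 | 0.163 | 0.132 | 0.195 |
| | 0.078 | 0.061 | 0.095 | 0.083 | 0.060 | 0.092 |
| FEMALE | 0.137 | 0.282 | 0.325 | 0.138 | 0.289 | 0.329 |
| | 0.047 | 0.039 | 0.057 | 0.049 | 0.036 | 0.053 |
| MARRIED | 0.031 | 0.006 | 0.077 | 0.033 | 0.004 | 0.077 |
| | 0.052 | 0.038 | 0.059 | 0.055 | 0.037 | 0.058 |
| SCHOOL | 0.001 | 0.026 | 0.034 | 0.0004 | 0.025 | 0.034 |
| | 0.009 | 0.007 | 0.010 | 0.010 | 0.007 | 0.010 |
| EMPLOYED | 0.092 | 0.111 | 0.088 | 0.090 | 0.107 | 0.088 |
| | 0.074 | 0.049 | 0.072 | 0.077 | 0.046 | 0.071 |
| AGE | 0.052 | 0.038 | 0.089 | 0.053 | 0.037 | 0.088 |
| | 0.020 | 0.016 | 0.025 | 0.021 | 0.015 | 0.022 |
| FAMINC | 0.0008 | | | 0.0009 | | |
| | 0.0006 | | | 0.0007 | | |
| SELFEMP | 0.202 | | | 0.206 | | |
| | 0.093 | | | 0.100 | | |
| GOVT | 0.107 | | | 0.112 | | |
| | 0.063 | | | 0.066 | | |
| SIZE | 0.0006 | | | 0.0006 | | |
| | 0.0001 | | | 0.0002 | | |
| LOCATION | 0.064 | | | 0.063 | | |
| | 0.059 | | | 0.063 | | |
| $\sigma_{1u}$, $\sigma_{2u}$ | | 0.195 | 0.181 | | | |
| | | 0.215 | 0.228 | | | |
| $\sigma_{11}$, $\sigma_{22}$ | | 0.561 | 0.828 | | 0.513 | 0.778 |
| | | 0.065 | 0.076 | | 0.020 | 0.041 |
| $\sigma_{12}$ | | 0.582 | | | 0.552 | |
| | | 0.055 | | | 0.023 | |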


Table 6

Predicted values for doctor visits and expenditure, FFS/HMO. Pooled model.

| | FFS: Excellent | FFS: Poor | HMO: Excellent | HMO: Poor |
|---|---|---|---|---|
| Number of visits | 3.53 | 8.79 | 3.49 | 9.50 |
| | 0.12 | 0.76 | 0.12 | 0.81 |
| Expenditure | 363.92 | 1089.42 | 368.80 | 1181.61 |
| | 17.15 | 141.32 | 18.02 | 153.33 |

Thus, we conclude that the type of insurance does not significantly affect the level of healthcare use.

6.3. Discussion and concluding remarks

How do our results compare with previous estimates? Dowd et al. (1991) modelled physician visits and inpatient hospital days using 1984 survey data from 20 Twin Cities firms that offered their employees a choice from at least one HMO plan and one FFS plan. This study found no statistically significant evidence of selection bias. However, the study is subject to an important qualification. The authors estimated a linear selection model after restricting the sample to those with positive levels of utilization. They used log(physician visits) or log(hospital days) as their outcome variable and did not account fully for the intrinsically discrete and heteroskedastic nature of the response variable. By contrast, our formulation takes both these features into account. Yet Dowd et al. (1991) did not find a significant difference between HMO and FFS insurees in the average number of doctor visits. (They did find that HMO insurees had fewer inpatient days on average.) In studies of the impact of HMOs on healthcare utilization based on the Community Tracking Study Household Survey 1996–1997 (Reschovsky, 2000; Reschovsky and Kemper, 2000), the issue of selection bias was discussed, albeit not dealt with in a comprehensive econometric framework. Reschovsky and Kemper (2000, p. 385) argue that there is little evidence of selection on observables, and mention but do not pursue the possibility of selection on unobservables via an econometric model. After some tests they conclude that the risk that estimates of the impact of HMOs on healthcare use are affected by selection bias was small in their study. Tu et al. (2000) analyze the same data as Reschovsky and Kemper (2000), using similar econometric methodology, and find no significant differences between HMO and non-HMO enrollees in the use of hospital, surgery, and emergency room services. Mello et al. (2002), in their study of the Medicare population based on data from 1993–1996, report tests of endogeneity of insurance choice. Based on empirical models with discrete factor structures, they do not find evidence that supports endogeneity of the HMO variable in their utilization equations, but they do find evidence of favorable selection into HMOs (healthier individuals self-select into cheaper health plans) and reduced utilization of hospital services by HMO enrollees. An important qualification to the above results is that incentives for controlling healthcare costs may now also be present in FFS plans, and hence the marginal impact of HMOs on utilization may be smaller and harder to detect. A second qualification is that studies that model disaggregated measures of utilization, such as specific preventive services (blood pressure checks, mammograms, etc.) and curative services (surgery or hospital nights), may provide sharper tests of the endogeneity hypothesis and improved estimates of the differential impact of HMO and non-HMO plans on the use of such services. This remains a topic for future research. The final qualification concerns the definition of an HMO plan. The definition used in this study may be too broad, and finer distinctions based on the attributes of various managed care plans may provide improved tests of the endogeneity hypothesis.

Embracing computational complications inherent in the problem, we have developed a flexible approach to modeling self-selectivity of the treatment variable in a model with multiple outcomes. In our analysis of two separate data sets we find mixed or weak evidence of self-selectivity. However, the Bayes factor values suggest that the results for both unrestricted and restricted (no endogeneity) models should be used in a Bayesian inferential framework because neither model dominates the other.

Acknowledgements

We thank John Geweke, Co-Editor Arnold Zellner, an Associate Editor and three anonymous referees for their helpful comments on earlier versions of this paper. We have also benefited from presentation of an earlier version at the 2000 Mid-West Econometric Group Meeting in Chicago, and at Purdue University, the University of Tennessee and Tulane University. However, we retain responsibility for any errors.

Appendix A. Computational

A.1. Sampling 1 and 2

The gradient vector has the following two components:

g1i = i + y1i (i 12ui)1

1

0

and

g2i = 1 + y2ii (i 12ui)1

0

1

;

where i = exp(x1i1 + 1i) and 1=i = exp(x2i2 + 2i) and the Hessian matrix is

Hei =

i 00 y2ii

1:


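A.2. Sampling $\beta_1$ and $\beta_2$

The gradient vectors and the Hessian matrices are

$$g_{\beta_1} = -B_{01}^{-1}(\beta_1 - \beta_{01}) + \sum_{i=1}^{N} \left(y_{1i} - \exp(x_{1i}'\beta_1 + \varepsilon_{1i})\right) x_{1i},$$

$$g_{\beta_2} = -B_{02}^{-1}(\beta_2 - \beta_{02}) + \sum_{i=1}^{N} \left(-1 + y_{2i} \exp(-x_{2i}'\beta_2 - \varepsilon_{2i})\right) x_{2i}$$

and

$$H_{\beta_1} = -B_{01}^{-1} - \sum_{i=1}^{N} \exp(x_{1i}'\beta_1 + \varepsilon_{1i})\, x_{1i} x_{1i}',$$

$$H_{\beta_2} = -B_{02}^{-1} - \sum_{i=1}^{N} y_{2i} \exp(-x_{2i}'\beta_2 - \varepsilon_{2i})\, x_{2i} x_{2i}'.$$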
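The first gradient and Hessian can be read as derivatives of a log-posterior kernel that combines a Poisson likelihood, with mean given by the exponential of the linear index plus a latent term, and a normal prior on the coefficient vector. The sketch below implements that reading for the count equation only. The function names, the prior precision matrix `B0_inv` and the small data set are hypothetical, and the signs follow the standard derivation rather than any further detail of the sampler.

```python
import math


def log_kernel_beta1(beta, X, y, eps, beta0, B0_inv):
    """Log-posterior kernel: normal prior plus Poisson log-likelihood terms."""
    K = len(beta)
    diff = [beta[k] - beta0[k] for k in range(K)]
    quad = sum(diff[k] * B0_inv[k][l] * diff[l] for k in range(K) for l in range(K))
    total = -0.5 * quad
    for xi, yi, ei in zip(X, y, eps):
        index = sum(b * x for b, x in zip(beta, xi)) + ei
        total += yi * index - math.exp(index)
    return total


def grad_hess_beta1(beta, X, y, eps, beta0, B0_inv):
    """Gradient vector and Hessian matrix of the kernel with respect to beta."""
    K = len(beta)
    diff = [beta[k] - beta0[k] for k in range(K)]
    g = [-sum(B0_inv[k][l] * diff[l] for l in range(K)) for k in range(K)]
    H = [[-B0_inv[k][l] for l in range(K)] for k in range(K)]
    for xi, yi, ei in zip(X, y, eps):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)) + ei)
        for k in range(K):
            g[k] += (yi - mu) * xi[k]
            for l in range(K):
                H[k][l] -= mu * xi[k] * xi[l]
    return g, H


# Hypothetical data: three observations, an intercept and one regressor.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [2, 0, 5]
eps = [0.1, -0.2, 0.0]          # current values of the latent terms
beta = [0.3, 0.4]
beta0 = [0.0, 0.0]              # prior mean
B0_inv = [[0.1, 0.0], [0.0, 0.1]]  # prior precision

g, H = grad_hess_beta1(beta, X, y, eps, beta0, B0_inv)
print(g)
print(H)
```

A gradient and Hessian of this kind are typically used to build a tailored proposal (for example a Newton step with the negative inverse Hessian as scale) inside a Metropolis–Hastings update; whether the sampler here uses them in exactly that way is not shown in this appendix.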

A.3. Sampling ; 12 and

Denote = (1; 2;

), X = diag(x1; x2; x3), Z = (log log1=z

). Then from

Eqs. (2.3), (2.4) and (2.5) has multivariate normal distribution N( ; ) where =

[X(1 IN)X]1 and = [X(1 IN)Z]. Partition and with respect to = (1; 2) and as = ( ) and

=

:

The conditional distribution of given 1 and 2 is normal with mean | = +

1 ( ) and variance 1| = 1 .

References

Albert, J.H., Chib, S., 1993. Bayesian analysis of binary and polychotomous response data. Journal of American Statistical Association 88, 669–679.

Chib, S., Hamilton, B.H., 2000. Bayesian analysis of cross-section and clustered data treatment models. Journal of Econometrics 97, 25–50.

Chib, S., Greenberg, E., Winkelmann, R., 1998. Posterior simulation and Bayes factor in panel count data models. Journal of Econometrics 86, 33–54.

Crepon, B., Duguet, E., 1997. Research and development, competition and innovation: pseudo-maximum likelihood and simulated maximum likelihood methods applied to count data models with heterogeneity.

Deb, P., Trivedi, P.K., 1997. Demand for medical care by the elderly: a finite mixture approach. Journal of Applied Econometrics 12, 313–336.

Devroye, L., 1986. Non-Uniform Random Variate Generation. Springer, New York.

Dowd, B., Feldman, R., Cassou, S., Finch, M., 1991. Health plan choice and utilization of health care services. Review of Economics and Statistics 73, 85–93.

Draper, D., 1995. Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society, Series B 57, 45–97.


Geman, S., Geman, D., 1984. Stochastic relaxation, Gibbs distribution and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 609–628.

Geweke, J., 1991. Efficient simulation from the multivariate normal and Student-t distributions subject to linear constraints. In: Keramidas, E.M. (Ed.), Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, pp. 571–578.

Goldman, D.P., 1995. Managed care as a public cost-containment mechanism. Rand Journal of Economics 26, 277–295.

Greene, W.H., 1997. FIML estimation of sample selection models for count data. Discussion Paper EC-97-02, Department of Economics, Stern School of Business, New York University.

Hastings, W.K., 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109.

Heckman, J.J., 1976. The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement 5, 475–492.

Heckman, J.J., 2000. Causal parameters and policy analysis in economics: a twentieth century retrospective. Quarterly Journal of Economics 115 (1), 45–97.

Johnson, M., 1987. Multivariate Statistical Simulation. Wiley, New York.

Kass, R.E., Raftery, A.E., 1995. Bayes factors. Journal of American Statistical Association 90, 773–795.

Kemper, P., Reschovsky, J.D., Tu, H.T., 2000. Do HMOs make a difference? Summary and implications. Inquiry 36, 419–425.

Koop, G., Poirier, D.J., 1997. Learning about the across-regime correlation in switching regression models.

Lee, L.-F., 2000. Self-selection. In: Baltagi, B.H. (Ed.), A Companion to Theoretical Econometrics. Blackwell, Oxford (Chapter 18).

Li, K., 1998. Bayesian inference in a simultaneous equation model with limited dependent variables. Journal of Econometrics 85, 387–400.

Linardakis, M., Dellaportas, P., 1999. Bayesian analysis of latent utilities for transportation services via extensions of the multinomial probit model. Working paper, Athens University of Economics and Business.

Maddala, G.S., 1985. A survey of the literature on selectivity bias as it pertains to health care markets. Health Economics and Health Services Research 6, 3–18.

McCulloch, R.E., Rossi, P.E., 1994. An exact likelihood analysis of the multinomial probit model. Journal of Econometrics 64, 207–240.

McCulloch, R.E., Polson, N.G., Rossi, P.E., 2000. A Bayesian analysis of the multinomial probit model with fully identified parameters. Journal of Econometrics 99, 173–193.

McManus, D.A., 1992. How common is identification in parametric models? Journal of Econometrics 53 (1–3), 5–23.

Mello, M.M., Stearns, S.C., Norton, E.C., 2002. Do Medicare HMOs still reduce health service use after controlling for selection bias? Health Economics 11, 323–340.

Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E., 1953. Equations of state calculations by fast computing machines. Journal of Chemical Physics 21, 1087–1092.

Miller, R.H., Luft, H.S., 1994. Managed care plan performance since 1980. Journal of American Medical Association 271, 1512–1519.

Miller, R.H., Luft, H.S., 1997. Does managed care lead to better or worse quality of care? Health Affairs 16, 7–25.

Munkin, M.K., Trivedi, P.K., 2000. Analysis of patterns of healthcare utilization among the elderly using mixed discrete-continuous models with unobserved heterogeneity. Working paper.

Nobile, A., 1998. A hybrid Markov chain for the Bayesian analysis of the multinomial probit model. Statistics and Computing 8, 229–242.

Nobile, A., 2000. Comment: Bayesian multinomial probit models with a normalization constraint. Journal of Econometrics 99, 335–345.

Reschovsky, J.D., 2000. Do HMOs make a difference? Data and methods. Inquiry 36, 378–389.

Reschovsky, J.D., Kemper, P., 2000. Do HMOs make a difference? Introduction. Inquiry 36, 374–377.

Tanner, M.A., Wong, W.H., 1987. The calculation of posterior distribution by data augmentation. Journal of American Statistical Association 82, 528–540.


Terza, J.V., 1998. Estimating count data models with endogenous switching: sample selection and endogenous treatment effects. Journal of Econometrics 84, 129–154.

Tu, H.T., Kemper, P., Wong, H.J., 2000. Do HMOs make a difference? Use of health services. Inquiry 36, 401–410.

van Ophem, H., 2000. Modeling selectivity in count data models. Journal of Business and Economic Statistics 18, 503–510.

Verdinelli, I., Wasserman, L., 1995. Computing Bayes factors using a generalization of the Savage–Dickey density ratio. Journal of American Statistical Association 90, 614–618.

Winkelmann, R., 1998. Count data models with selectivity. Econometric Reviews 17, 339–359.

Zellner, A., 1971. An Introduction to Bayesian Inference in Econometrics. Wiley, New York.
