
    Journal of Econometrics 114 (2003) 197–220

    www.elsevier.com/locate/econbase

    Bayesian analysis of a self-selection model with multiple outcomes using simulation-based estimation: an application to the demand for healthcare

    Murat K. Munkin (a), Pravin K. Trivedi (b,*)

    (a) Department of Economics, 531 Stokely Management Center, University of Tennessee, Knoxville, TN 37919, USA
    (b) Department of Economics, Wylie Hall, Indiana University, 100 South Woodlawn, Bloomington, IN 47405, USA

    Received 25 April 2001; received in revised form 15 August 2002; accepted 14 September 2002

    Abstract

    This paper studies a self-selection model with discrete and continuous outcomes and a treatment variable. The treatment variable is endogenous to the two outcome variables. The approach of the paper is fully parametric and Bayesian. The Bayes factor is calculated with the Savage–Dickey density ratio and used for model selection. The model is applied to two different micro data sets, the 1987–1988 National Medical Expenditure Survey and the 1996 Medical Expenditure Panel Survey. The paper studies the effect of managed care and fee-for-service type of private insurance on the demand for healthcare. It also compares the effects of private insurance and Medicaid in covering health care expenses of elderly Americans.

    © 2002 Elsevier Science B.V. All rights reserved.

    JEL classification: I11; C11; C31; C35

    Keywords: Health insurance; Self-selection; Managed care; Medicare; Medicaid; Markov chain Monte Carlo

    1. Introduction

    This paper has two major foci, one methodological and the other empirical. The methodological component deals with estimation of a three-equation self-selection model with two correlated outcomes, one of which is a count and the second is a continuous variable. We are interested in the impact of selection on the conditional mean of the outcomes, allowing for endogenous selection. The econometric methodology used is Bayesian and parametric; it is strongly motivated by the difficult computational problems that arose when the same model was estimated in a simulated maximum likelihood framework. The empirical component of the paper investigates the impact of public and private health insurance, which is our treatment variable, on the demand for healthcare and expenditure, which constitute our outcome measures. The motivation behind the empirical component is a long-standing and inconclusive debate in the health economics literature on the impact of the choice between the traditional fee-for-service (FFS) types of private insurance and health maintenance organization (HMO) plans on the demand for healthcare consumption for individuals of all ages.

    * Corresponding author. Tel.: +1-812-855-3567; fax: +1-812-855-3736.
    E-mail address: [email protected] (P.K. Trivedi).
    0304-4076/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
    PII: S0304-4076(02)00223-3

    In the first part of the paper we consider simulation-based estimation of an econometric model with $y_1$ and $y_2$, which are two jointly dependent discrete and continuous random outcome variables, respectively. A third variable in our model, denoted $d$, will be referred to as the selection, or treatment, variable. For simplicity let $d = 1$ refer to the treated state and $d = 0$ refer to the untreated state. Suppose that our main interest is in the average value of a partial derivative such as $\partial y_1/\partial d$ or $\partial y_2/\partial d$. If $(y_1, y_2, d)$ are jointly dependent, then it is known that ignoring endogeneity of $d$ results in self-selection bias, i.e. the causal effect of $d$ on the outcome variable is not identified. Bias arises because the treatment (self-selection) variable $d$ reflects choices or decisions of an individual, and hence is endogenous.

    Lee (2000) provides a recent survey of the large literature on self-selection. The most widely used version of the selection model is a two-equation model in which the outcome variable is continuous and the outcome equation is linear. Heckman (1976) proposes a two-stage estimation method for this type of model. In contrast, the model considered here is nonlinear with both discrete (count) and continuous outcomes. This model is an extension of Terza (1998), Crepon and Duguet (1997), Greene (1997) and Winkelmann (1998), who have proposed both two-step moment-based and simulation-based full-information estimation methods to estimate a selection model in which the outcome variable is a count (also see van Ophem, 2000). Other approaches to the same problem follow the traditional selection model format, e.g. Dowd et al. (1991), in so far as they ignore discreteness, nonnormality, and heteroskedasticity that are inherent in the data. However, such moment-based procedures are in general inefficient, even though computationally they are easier to implement. Further, the method of moments does not allow one to estimate the full set of parameters for models with correlated multiple outcomes. A second possibility is to use a weighted nonlinear instrumental variable approach. Such an approach in other contexts has not been very successful, in part because of the difficulty of estimating the weights consistently. Another approach is the simulated maximum likelihood method, which requires a sufficient number of simulations for consistency, but we often do not know the operational meaning of "sufficient".¹ In addition, the above-mentioned moment-based procedures are difficult to generalize to the case of multiple outcomes.

    ¹ The authors have found that the SML estimator for the present problem converges very slowly as the number of outcomes increases. Besides, convergence can be impossible for some parameterizations and parameter values due to unbounded gradient vectors.


    We approach the estimation problem in a Bayesian setting, assuming specific marginal distributions for the dependent variables. We consider the case of selection on unobserved heterogeneity with factor structure (for greater generality). Prior distributions are assigned to the parameters of the model. We apply the Markov chain Monte Carlo (MCMC) approach to estimation, building on the recent research of Albert and Chib (1993), Chib et al. (1998), Chib and Hamilton (2000), Koop and Poirier (1997), Li (1998), McCulloch et al. (2000), as well as several earlier studies.

    The existence of possible selection bias has long been an issue of contention in empirical health economics based on observational data in which insurance status is a choice variable and not exogenously assigned (Maddala, 1985). The standard treatment of the selection effect in a linear model needs significant extensions in dealing with a typical healthcare application where outcomes are often count as well as continuous variables and the selection decision may be multinomial. The computational difficulty of simultaneously dealing with nonlinearity, discreteness, and sample selection has discouraged a full treatment of this topic. However, the issue is clearly important. Public insurance programs, such as Medicaid and Medicare, target only particular groups of eligible individuals, such as the low-income and the elderly. A subset of these individuals also purchase private (Medigap) insurance that covers out-of-pocket expenses that constitute the gap between the provider charges and the public insurance benefits. Therefore, although the public insurance status can be viewed as predetermined, the choice of private gap insurance plans may be endogenously determined jointly with the level of the demand for healthcare. The issue of endogeneity is relevant to comparisons of access to, utilization of, and evaluation of quality of care between groups of healthcare users classified by their health insurance status. If one can validly assume exogeneity of insurance status, such comparisons are econometrically easier to implement because insurance choice and healthcare use can be modeled separately. If not, then the modeling exercise is more computationally complex, especially if one wants to efficiently estimate all parameters using a full information system estimator. We pursue this objective in order to facilitate more precise comparisons between utilization patterns of different categories of insurees.

    The literature on the selection models for healthcare utilization does not present a full consensus on the importance of endogeneity of the insurance decision. In Section 6 we will compare our empirical findings on endogeneity with those from several recent studies. Here we briefly mention findings from a few studies to reflect the mixed empirical evidence on the endogeneity issue. For example, Miller and Luft (1994, 1997) survey several studies on HMOs and their impact on utilization. These studies produce mixed results regarding the bias due to neglect of endogeneity.² Dowd et al. (1991) find negligible evidence of selection bias. Reschovsky (2000) argues that self-selectivity of the HMO variable is due to observed characteristics of the individuals and insurance markets and the threat of selection bias arises from not measuring and including these factors. Although unmeasured they are observable, and an especially rich set of explanatory variables could control for selectivity. Tu et al. (2000) and Kemper et al. (2000) study healthcare utilization of HMO enrollees and generally argue against the importance of the endogeneity issue and register reservations about some instrumental variable type approaches for handling it. Goldman (1995), on the other hand, emphasizes the role and importance of endogeneity. Because of differences in data and methods used in different investigations a full consensus is hard to achieve. In this study we reconsider the modeling issues using a methodology that addresses several neglected issues and apply our methodology using 1996 MEPS data as well as the 1987 NMES data.

    ² Half of these studies show higher rates of physician doctor visits for HMO enrollees and the other half reach the opposite conclusion. Most of the studies ignore the self-selecting behavior of the individuals. Some of them argue that for particular types of healthcare, such as physician doctor visits, the insurance status is exogenous. Others acknowledge the problem but avoid the issue of endogeneity for computational simplicity.

    The rest of the paper is organized as follows. Section 2 specifies the model. MCMC estimation of the model is presented in Section 3. Section 4 considers issues of Bayesian model selection. Section 5 considers an example with artificially generated data and Section 6 deals with two empirical applications and concludes the paper.

    2. Model specification

    We observe $N$ independent observations ($i = 1, \ldots, N$) and it is assumed that: the count (outcome) variable $y_{1i}$ is Poisson distributed conditional on exogenous covariates $x_{1i}$, endogenous variable $d_i$ and unobserved heterogeneity $\varepsilon_{1i}$; the continuous nonnegative (outcome) variable $y_{2i}$ is exponentially distributed conditional on exogenous covariates $x_{2i}$, endogenous variable $d_i$ and unobserved heterogeneity $\varepsilon_{2i}$. The presence of unobserved heterogeneity in this structure will permit us to model counts and continuous variables that display overdispersion. Specifically,

    $$y_{1i} \mid x_{1i}, d_i, \varepsilon_{1i} \overset{ind}{\sim} P[\mu_i], \quad y_{1i} = 0, 1, 2, \ldots, \tag{2.1}$$

    $$y_{2i} \mid x_{2i}, d_i, \varepsilon_{2i} \overset{ind}{\sim} \exp[\lambda_i], \quad y_{2i} \geq 0, \tag{2.2}$$

    where $P$ and $\exp$ stand for Poisson and exponential distributions with mean $\mu_i$ and $1/\lambda_i$, respectively. Variables $y_{1i}$ and $y_{2i}$ are assumed to be independent conditional on the unobserved heterogeneity. Their respective marginal distributions, obtained by integrating out $\varepsilon_1$ and $\varepsilon_2$, will be more flexible. The specification of the conditional means is

    $$\log \mu_i = x_{1i}'\beta_1 + \gamma_1 d_i + \varepsilon_{1i}, \tag{2.3}$$

    $$\log(1/\lambda_i) = x_{2i}'\beta_2 + \gamma_2 d_i + \varepsilon_{2i}. \tag{2.4}$$
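To see why the heterogeneity term produces overdispersion in the count outcome, the following worked step may help. It is only a sketch under two assumptions that go beyond this point of the text: $\varepsilon_{1i}$ is taken to be normal with mean zero and variance $\sigma_{11}$ (the normality assumption the paper introduces in (2.6)), and the treatment is treated as exogenous for the moment ($\sigma_{1u} = 0$), so that conditioning on $d_i$ does not change the distribution of $\varepsilon_{1i}$. The shorthand $m_i$ is introduced here for convenience and is not the paper's notation.

```latex
% Shorthand (assumption of this sketch): m_i = \exp(x_{1i}'\beta_1 + \gamma_1 d_i),
% so that \mu_i = m_i e^{\varepsilon_{1i}} with \varepsilon_{1i} \sim N(0, \sigma_{11}).
\begin{align*}
E[y_{1i} \mid x_{1i}, d_i]
  &= E[\mu_i] = m_i \, e^{\sigma_{11}/2}, \\
\operatorname{Var}[y_{1i} \mid x_{1i}, d_i]
  &= E[\mu_i] + \operatorname{Var}[\mu_i] \\
  &= E[y_{1i} \mid x_{1i}, d_i]
     + \left(e^{\sigma_{11}} - 1\right) E[y_{1i} \mid x_{1i}, d_i]^2 .
\end{align*}
```

The second term is strictly positive whenever $\sigma_{11} > 0$, so the marginal variance exceeds the marginal mean, unlike in the pure Poisson case.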

    The third (selection) equation in the model defines a latent variable $z_i$ such that

    $$z_i = x_{3i}'\alpha + u_i, \tag{2.5}$$

    $$d_i = \begin{cases} 1 & \text{if } z_i \geq 0, \\ 0 & \text{if } z_i < 0, \end{cases}$$

    where $d_i$ is the treatment variable, $x_{3i}$ is a vector of exogenous explanatory variables, and $z_i$ is a latent variable related to $d_i$. More specifically, it could be the propensity to purchase private insurance or the propensity of being in an HMO.


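    Endogeneity of $d_i$ is modeled through correlation between the unobserved variables $\varepsilon_{1i}$, $\varepsilon_{2i}$ and $u_i$, which are assumed jointly normally distributed:

    $$(\varepsilon_{1i}, \varepsilon_{2i}, u_i) \sim N[(0, 0, 0), \Sigma]. \tag{2.6}$$

    Thus, the outcome variables depend upon the treatment propensity through correlations between $\varepsilon_{1i}$, $\varepsilon_{2i}$ and $u_i$. Since $z_i$ in the selection equation (2.5) is unobservable, only the ratio $\alpha/\sqrt{\sigma_{uu}}$ is identified. We restrict the variance $\sigma_{uu}$ of the latent variable to unity so that the covariance matrix is

    $$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{1u} \\ \sigma_{12} & \sigma_{22} & \sigma_{2u} \\ \sigma_{1u} & \sigma_{2u} & 1 \end{pmatrix}. \tag{2.7}$$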

    The next section assigns prior distributions to the parameters of the model and

    describes the estimation procedure, which uses the MCMC algorithms to construct

    ergodic Markov chains converging to the posterior distributions of the parameters.

    3. MCMC estimation

    In our model the set of parameters to estimate is $\beta_1$, $\gamma_1$, $\beta_2$, $\gamma_2$, $\alpha$ and five elements of the matrix $\Sigma$, with $\sigma_{uu} = 1$ as the identification restriction. If the sign of one coefficient of $\alpha$ were known a priori then one could fix it to $1$ or $-1$ as an alternative identification restriction. That would allow $\sigma_{uu}$ to vary and simplify the MCMC algorithm, leading us to a convenient form of the Wishart distribution for the parameters of the matrix $\Sigma$. However, such information is not generally available. McCulloch and Rossi (1994), in their analysis of the multinomial probit model, propose to specify proper priors for the full set of parameters $(\alpha, \Sigma)$ without any identification restrictions, derive the posterior distribution for both $\alpha$ and $\Sigma$, and report the marginal posterior of the identified parameters $(\alpha/\sqrt{\sigma_{uu}}, \Sigma \mid \sigma_{uu} = 1)$. Nobile (1998) proposes a hybrid sampler that improves convergence and autocorrelation properties of the algorithm. However, with this approach it is impossible to assign improper priors on the parameters. Nobile (2000) and Linardakis and Dellaportas (1999) propose algorithms to draw directly from the Wishart distribution conditional on one of the diagonal elements.

    We use the data augmentation approach (Tanner and Wong, 1987) and include the unobservable variables $z_i$, $\varepsilon_{1i}$ and $\varepsilon_{2i}$ in the algorithm, drawing them at each iteration and for all observations. Denote $\varepsilon_i = (\varepsilon_{1i}, \varepsilon_{2i})'$, $\Sigma_{21} = E(\varepsilon_i u_i)$ and $\Sigma_{22} = E(\varepsilon_i \varepsilon_i')$. We follow the approach of Koop and Poirier (1997), Li (1998) and McCulloch et al. (2000) and write the joint distribution of $v_i = (\varepsilon_{1i}, \varepsilon_{2i}, u_i)$ as the product of the marginal distribution of $u_i$ and the conditional distribution of $\varepsilon_i \mid u_i$. This allows us to use the Gibbs sampling algorithm (Geman and Geman, 1984) to draw values from the posterior distributions of the elements of the matrix $\Sigma$. The distribution of $u_i$ is standard normal and the conditional distribution is $\varepsilon_i \mid u_i \sim N(\Sigma_{12}' u_i, \Omega)$, where $\Omega = \Sigma_{22} - \Sigma_{21}\Sigma_{12}$ and $\Sigma_{12} = \Sigma_{21}'$. Since there is a one-to-one correspondence between $\Sigma$ and $(\Sigma_{12}, \Omega)$,

    $$\Sigma = \begin{pmatrix} \Omega + \Sigma_{21}\Sigma_{12} & \Sigma_{21} \\ \Sigma_{12} & 1 \end{pmatrix},$$

    the MCMC procedure can be organized by blocking $\Sigma$ as $\Sigma_{12}$ and $\Omega$ and including them in the MCMC algorithm. Block the rest of the parameters as $(\beta_1, \gamma_1)$, $(\beta_2, \gamma_2)$ and $\alpha$. For brevity, denote $\theta_1 = (\beta_1', \gamma_1)'$, $\theta_2 = (\beta_2', \gamma_2)'$ and $\theta = (\theta_1, \theta_2)$, and from here on let $x_1 = (x_1', d)$ and $x_2 = (x_2', d)$ include the treatment indicator. Assume the following prior distributions:

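    $$\theta_1 \sim N(\theta_{01}, B_{01}^{-1}), \quad \theta_2 \sim N(\theta_{02}, B_{02}^{-1}), \quad \alpha \sim N(\alpha_0, A_0^{-1}),$$

    $$\Omega^{-1} \sim \mathrm{Wish}(n_0, D_0), \quad \Sigma_{12} \sim N(\varpi_{12}, \Psi_{12}^{-1}), \tag{3.1}$$

    where $\theta_{01}$, $B_{01}^{-1}$, $\theta_{02}$, $B_{02}^{-1}$, $n_0$, $D_0$, $\varpi_{12}$, $\Psi_{12}^{-1}$, $\alpha_0$, $A_0$ are known parameters, $N(\theta_0, B_0^{-1})$ denotes the multivariate normal distribution with mean vector $\theta_0$ and covariance matrix $B_0^{-1}$, and $\mathrm{Wish}(n_0, D_0)$ is the Wishart distribution with $n_0$ degrees of freedom and scale matrix $D_0$. Denote $\Delta = (\theta_1, \theta_2, \alpha, \Omega^{-1}, \Sigma_{12})$. Then the joint posterior density of the parameters and the unobservables $\varepsilon_i$ and $z_i$ given the data is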

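    $$\pi(\Delta, \varepsilon, z \mid y) = C\, \pi(\theta_1)\pi(\theta_2)\pi(\alpha)\pi(\Omega^{-1})\pi(\Sigma_{12}) \prod_{i=1}^{N} \left[ I\{d_i = 1\} I\{z_i \geq 0\} + I\{d_i = 0\} I\{z_i < 0\} \right] \frac{\exp(-\mu_i)\mu_i^{y_{1i}}}{y_{1i}!}\, \lambda_i \exp(-\lambda_i y_{2i})\, \phi(\varepsilon_i \mid \Sigma_{12}' u_i, \Omega)\, \phi(u_i \mid 0, 1), \tag{3.2}$$

    where $I\{\cdot\}$ is the indicator function, $\phi(\cdot \mid \mu, \sigma^2)$ is the p.d.f. of the $N[\mu, \sigma^2]$ distribution, $C$ is a proportionality constant, $\varepsilon = (\varepsilon_1\ \varepsilon_2)$, $z$ and $y = (y_1\ y_2\ d)$ are matrices of $N$ observations, and $u_i = z_i - x_{3i}'\alpha$.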
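The one-to-one correspondence between $\Sigma$ and $(\Sigma_{12}, \Omega)$ used above can be illustrated with a small sketch. The function names `sigma_from_blocks` and `blocks_from_sigma` are hypothetical helpers introduced here, not part of the paper; the ordering of the variables is assumed to be $(\varepsilon_1, \varepsilon_2, u)$ as in (2.7), and plain nested lists are used so that no external library is needed.

```python
def sigma_from_blocks(sigma12, omega):
    """Build the 3x3 covariance of (eps1, eps2, u) from Sigma_12 and Omega.

    sigma12: pair (sigma_1u, sigma_2u); omega: 2x2 nested list.
    Uses Sigma_22 = Omega + Sigma_21 Sigma_12 and var(u) = 1.
    """
    s1, s2 = sigma12
    s22 = [[omega[0][0] + s1 * s1, omega[0][1] + s1 * s2],
           [omega[1][0] + s2 * s1, omega[1][1] + s2 * s2]]
    return [[s22[0][0], s22[0][1], s1],
            [s22[1][0], s22[1][1], s2],
            [s1, s2, 1.0]]


def blocks_from_sigma(sigma):
    """Inverse map: recover (Sigma_12, Omega) from the 3x3 covariance."""
    s1, s2 = sigma[0][2], sigma[1][2]
    omega = [[sigma[0][0] - s1 * s1, sigma[0][1] - s1 * s2],
             [sigma[1][0] - s2 * s1, sigma[1][1] - s2 * s2]]
    return (s1, s2), omega


# Usage: covariances of 0.5 with u and a conditional covariance Omega.
sigma = sigma_from_blocks((0.5, 0.5), [[0.75, 0.25], [0.25, 0.75]])
```

Working with $(\Sigma_{12}, \Omega)$ rather than $\Sigma$ directly keeps the restriction $\sigma_{uu} = 1$ satisfied by construction, which is the reason for the blocking described in the text.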

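    We construct our Markov chain by blocking the parameters as $\varepsilon_i = (\varepsilon_{1i}\ \varepsilon_{2i})$, $\theta_1$, $\theta_2$, $z_i$, $\alpha$, $\Sigma_{12}$ and $\Omega^{-1}$, with the full conditional distributions

    $$[\varepsilon_1, \varepsilon_2 \mid y, \Sigma_{12}, \Omega, \theta, u],\quad [\theta_1 \mid y_1, \varepsilon_1],\quad [\theta_2 \mid y_2, \varepsilon_2],\quad [z \mid \varepsilon_1, \varepsilon_2, \alpha, \Sigma_{12}, \Omega],$$

    $$[\alpha \mid \theta, z, \varepsilon_1, \varepsilon_2, \Sigma_{12}, \Omega],\quad [\Sigma_{12} \mid \Omega, \varepsilon_1, \varepsilon_2, u] \quad \text{and} \quad [\Omega^{-1} \mid \Sigma_{12}, \varepsilon_1, \varepsilon_2, u].$$

    Notice that given $z$ and $\alpha$, $u$ is known with certainty, so when we condition on $u$ we formally condition on $z$ and $\alpha$, or $(z - x_3\alpha)$. The following steps summarize our algorithm.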

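    3.1. Sampling $\varepsilon_1$ and $\varepsilon_2$

    The full conditional density for $\varepsilon_1$ and $\varepsilon_2$ is

    $$\pi(\varepsilon_1, \varepsilon_2 \mid y, \Sigma_{12}, \Omega, \theta, u) = \prod_{i=1}^{N} \pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \theta, u_i),$$

    the product of $N$ independent terms. We utilize the Metropolis–Hastings algorithm (Metropolis et al., 1953; Hastings, 1970) to sample $(\varepsilon_{1i}, \varepsilon_{2i})$ for each observation $i$ from the density

    $$\pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \theta, u_i) = c_i \exp(-\exp(x_{1i}'\theta_1 + \varepsilon_{1i}))\, (\exp(\varepsilon_{1i}))^{y_{1i}} \exp(-\varepsilon_{2i}) \exp(-\exp(-x_{2i}'\theta_2 - \varepsilon_{2i})\, y_{2i})\, \phi(\varepsilon_i \mid \Sigma_{12}' u_i, \Omega),$$

    where $c_i$ is a proportionality constant, and choose a t-distribution centered at the modal value of the full conditional density as the proposal density. Let

    $$\hat\varepsilon_i = (\hat\varepsilon_{1i}, \hat\varepsilon_{2i}) = \arg\max \log \pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \theta, u_i)$$

    and let $V_{\hat\varepsilon_i} = -(H_{\hat\varepsilon_i})^{-1}$ be the negative inverse of the Hessian of $\log \pi(\varepsilon_{1i}, \varepsilon_{2i} \mid y_i, \Sigma_{12}, \Omega, \theta, u_i)$ evaluated at the mode $\hat\varepsilon_i$. The gradient vector and the Hessian are derived in the Computational Appendix and used for a few steps of the Newton–Raphson algorithm to find the modal value, and the Hessian formula is used to calculate the covariance matrix of the proposal distribution $q(\varepsilon_i \mid y_i, \Sigma_{12}, \Omega, \theta, u_i) = f_T(\varepsilon_i \mid \hat\varepsilon_i, V_{\hat\varepsilon_i}, \nu)$, a bivariate t-distribution with $\nu$ degrees of freedom, a tuning parameter selected to obtain reasonable acceptance rates. When a proposal value $\varepsilon_i^* = (\varepsilon_{1i}^*, \varepsilon_{2i}^*)$ is drawn the chain moves to the proposal value with probability

    $$\rho(\varepsilon_i, \varepsilon_i^*) = \min\left\{ \frac{\pi(\varepsilon_i^* \mid y_i, \Sigma_{12}, \Omega, \theta, u_i)\, q(\varepsilon_i \mid y_i, \Sigma_{12}, \Omega, \theta, u_i)}{\pi(\varepsilon_i \mid y_i, \Sigma_{12}, \Omega, \theta, u_i)\, q(\varepsilon_i^* \mid y_i, \Sigma_{12}, \Omega, \theta, u_i)},\ 1 \right\}.$$

    If the proposal value is rejected then the next state of the chain is at the current value $\varepsilon_i = (\varepsilon_{1i}, \varepsilon_{2i})$.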
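The step for a single observation can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the gradient and Hessian are derived here directly from the density above (the paper's Computational Appendix is not reproduced), the cap on the Newton step is a crude safeguard added for this sketch, and all function names are hypothetical. The arguments `a1` and `a2` stand for the linear indices $x_{1i}'\theta_1$ and $x_{2i}'\theta_2$, `m` for the conditional mean $\Sigma_{12}'u_i$, and `w` for $\Omega^{-1}$.

```python
import math
import random


def log_target(e, y1, y2, a1, a2, m, w):
    """Log full conditional of e = (eps1, eps2), up to an additive constant."""
    r1, r2 = e[0] - m[0], e[1] - m[1]
    quad = w[0][0] * r1 * r1 + 2.0 * w[0][1] * r1 * r2 + w[1][1] * r2 * r2
    return (-math.exp(a1 + e[0]) + y1 * e[0] - e[1]
            - y2 * math.exp(-a2 - e[1]) - 0.5 * quad)


def neg_hessian(e, y2, a1, a2, w):
    """Minus the Hessian of log_target as (h11, h12, h22); positive definite."""
    return (math.exp(a1 + e[0]) + w[0][0], w[0][1],
            y2 * math.exp(-a2 - e[1]) + w[1][1])


def find_mode(y1, y2, a1, a2, m, w, steps=50):
    """Newton-Raphson iterations started at the conditional mean m."""
    e1, e2 = m
    for _ in range(steps):
        r1, r2 = e1 - m[0], e2 - m[1]
        g1 = y1 - math.exp(a1 + e1) - (w[0][0] * r1 + w[0][1] * r2)
        g2 = y2 * math.exp(-a2 - e2) - 1.0 - (w[0][1] * r1 + w[1][1] * r2)
        h11, h12, h22 = neg_hessian((e1, e2), y2, a1, a2, w)
        det = h11 * h22 - h12 * h12
        d1 = (h22 * g1 - h12 * g2) / det
        d2 = (h11 * g2 - h12 * g1) / det
        e1 += max(-1.0, min(1.0, d1))  # step cap: assumption of this sketch
        e2 += max(-1.0, min(1.0, d2))
        if abs(d1) + abs(d2) < 1e-12:
            break
    return (e1, e2)


def t_logpdf(e, mode, h, nu):
    """Log kernel of a bivariate t with location `mode` and precision h."""
    r1, r2 = e[0] - mode[0], e[1] - mode[1]
    q = h[0] * r1 * r1 + 2.0 * h[1] * r1 * r2 + h[2] * r2 * r2
    return -0.5 * (nu + 2.0) * math.log1p(q / nu)


def draw_t(mode, h, nu, rng):
    """Draw from the bivariate t whose scale matrix is the inverse of h."""
    det = h[0] * h[2] - h[1] * h[1]
    v11, v12, v22 = h[2] / det, -h[1] / det, h[0] / det
    l11 = math.sqrt(v11)
    l21 = v12 / l11
    l22 = math.sqrt(v22 - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    s = math.sqrt(nu / rng.gammavariate(nu / 2.0, 2.0))  # chi-square(nu) draw
    return (mode[0] + s * l11 * z1, mode[1] + s * (l21 * z1 + l22 * z2))


def mh_step(e_cur, y1, y2, a1, a2, m, w, nu, rng):
    """One Metropolis-Hastings update; returns (new state, accepted flag)."""
    mode = find_mode(y1, y2, a1, a2, m, w)
    h = neg_hessian(mode, y2, a1, a2, w)
    e_new = draw_t(mode, h, nu, rng)
    log_ratio = (log_target(e_new, y1, y2, a1, a2, m, w)
                 - log_target(e_cur, y1, y2, a1, a2, m, w)
                 + t_logpdf(e_cur, mode, h, nu) - t_logpdf(e_new, mode, h, nu))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return e_new, True
    return e_cur, False
```

Because the proposal is centered at the mode and does not depend on the current state, the normalizing constants of the t density cancel in the ratio, which is why only the kernel is evaluated.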

    3.2. Sampling $\theta_1$ and $\theta_2$

    The full conditional densities for $\theta_1$ and $\theta_2$ are

    $$\pi(\theta_1 \mid y_1, \varepsilon_1) = C_1\, \phi(\theta_1 \mid \theta_{01}, B_{01}^{-1}) \prod_{i=1}^{N} \exp(-\exp(x_{1i}'\theta_1 + \varepsilon_{1i}))\, (\exp(x_{1i}'\theta_1 + \varepsilon_{1i}))^{y_{1i}},$$

    $$\pi(\theta_2 \mid y_2, \varepsilon_2) = C_2\, \phi(\theta_2 \mid \theta_{02}, B_{02}^{-1}) \prod_{i=1}^{N} \exp(-x_{2i}'\theta_2 - \varepsilon_{2i}) \exp(-y_{2i} \exp(-x_{2i}'\theta_2 - \varepsilon_{2i})),$$

    where $C_1$ and $C_2$ are proportionality constants. The Metropolis–Hastings algorithm is used again to draw samples from these densities. Denote by $\theta_j$ ($j = 1, 2$) the current state of the Markov chain, by $\hat\theta_j$ the mode of the full conditional density and by $\theta_j^*$ the candidate for the new value of the chain. Following Chib et al. (1998), the proposal density $q(\theta_j, \theta_j^*)$ is selected to be a multivariate t-distribution with $k$ degrees of freedom, $f_T(\theta_j^* \mid \hat\theta_j - (\theta_j - \hat\theta_j), \tau_j V_j)$, centered at $\hat\theta_j - (\theta_j - \hat\theta_j)$. This proposal density is symmetric in $\theta_j$ and $\theta_j^*$. The covariance matrix of the multivariate t-density is set to be $V_j = -H_{\hat\theta_j}^{-1}$, the negative inverse of the Hessian of $\log \pi(\theta_j \mid y_j, \varepsilon_j)$ evaluated at the modal value $\hat\theta_j$, and $\tau_j$ is an adjustable constant. The candidate $\theta_j^*$, drawn from the proposal distribution, is accepted with probability

    $$\Pr(\theta_j, \theta_j^* \mid y_j, \varepsilon_j) = \min\left\{ 1,\ \frac{\pi(\theta_j^* \mid y_j, \varepsilon_j)}{\pi(\theta_j \mid y_j, \varepsilon_j)} \right\}.$$

    If the candidate is not accepted then the chain does not change its value.
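The reflection idea behind this proposal is easiest to see in one dimension. The sketch below is a deliberately simplified stand-in, not the paper's sampler: it uses a Poisson model with a single intercept `theta`, no heterogeneity term and a scalar normal prior, and the names `log_post`, `mode_and_var` and `reflected_t_step` are hypothetical. The candidate is drawn from a t-distribution centered at the mirror image of the current value around the mode, so the proposal is symmetric and the acceptance probability reduces to the ratio of target densities.

```python
import math
import random


def log_post(theta, ys, prior_mean, prior_prec):
    """Log posterior (up to a constant) of a Poisson model with mean exp(theta)."""
    return (theta * sum(ys) - len(ys) * math.exp(theta)
            - 0.5 * prior_prec * (theta - prior_mean) ** 2)


def mode_and_var(ys, prior_mean, prior_prec, steps=50):
    """Newton-Raphson for the mode; the variance is minus the inverse Hessian."""
    theta = math.log(max(sum(ys) / len(ys), 1e-8))
    for _ in range(steps):
        grad = (sum(ys) - len(ys) * math.exp(theta)
                - prior_prec * (theta - prior_mean))
        theta += grad / (len(ys) * math.exp(theta) + prior_prec)
    return theta, 1.0 / (len(ys) * math.exp(theta) + prior_prec)


def reflected_t_step(theta, mode, var, tau, k, log_density, rng):
    """One update with a t proposal centered at mode - (theta - mode)."""
    center = mode - (theta - mode)
    t = rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(k / 2.0, 2.0) / k)
    cand = center + math.sqrt(tau * var) * t
    # Symmetric proposal: only the target ratio enters the acceptance rule.
    if rng.random() < math.exp(min(0.0, log_density(cand) - log_density(theta))):
        return cand
    return theta


# Usage on a small illustrative sample of counts (made-up numbers).
ys = [2, 3, 1, 4, 2, 3, 2, 1, 3, 4]
mode, var = mode_and_var(ys, 0.0, 0.1)
rng = random.Random(1)
theta, draws = mode, []
for _ in range(3000):
    theta = reflected_t_step(theta, mode, var, 0.9, 15.0,
                             lambda t: log_post(t, ys, 0.0, 0.1), rng)
    draws.append(theta)
```

Reflecting around the mode tends to make successive draws negatively correlated, which can lower the autocorrelation of the chain compared with a random-walk proposal.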

    3.3. Sampling z

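    Variables $z_i$ are included in the MCMC algorithm (Albert and Chib, 1993). Sample $N$ independent random variables $z_i$ such that $z_i \mid \varepsilon_{1i}, \varepsilon_{2i}, \alpha, \Sigma_{12}, \Omega$ is normally distributed with mean $x_{3i}'\alpha + \Sigma_{12}\Sigma_{22}^{-1}(\varepsilon_{1i}\ \varepsilon_{2i})'$ and variance $1 - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ (where $\Sigma_{22} = \Omega + \Sigma_{21}\Sigma_{12}$), truncated at zero from the left if $d_i = 1$ and from the right if $d_i = 0$. To sample from the truncated normal we follow Geweke (1991). See also Devroye (1986, p. 380).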
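The draw of $z_i$ can be sketched as below. Note that this sketch swaps in a plain inverse-CDF method for the truncated normal instead of the Geweke (1991) algorithm the paper follows; the inverse-CDF approach is simple but loses accuracy when the truncation point lies far in the tail. The function names are hypothetical, and `x3_alpha` stands for the linear index $x_{3i}'\alpha$.

```python
import math
import random
from statistics import NormalDist


def z_conditional_moments(x3_alpha, eps, sigma12, sigma22):
    """Mean and variance of z_i given eps_i = (eps1, eps2).

    sigma12: pair (sigma_1u, sigma_2u); sigma22: 2x2 covariance of eps_i.
    """
    a, b, c, d = sigma22[0][0], sigma22[0][1], sigma22[1][0], sigma22[1][1]
    det = a * d - b * c
    # Row vector Sigma_12 * Sigma_22^{-1}
    k1 = (sigma12[0] * d - sigma12[1] * c) / det
    k2 = (-sigma12[0] * b + sigma12[1] * a) / det
    mean = x3_alpha + k1 * eps[0] + k2 * eps[1]
    var = 1.0 - (k1 * sigma12[0] + k2 * sigma12[1])
    return mean, var


def draw_truncated(mean, var, d, rng):
    """Draw z ~ N(mean, var) restricted to z >= 0 if d == 1, else z < 0."""
    sd = math.sqrt(var)
    std = NormalDist()
    p0 = std.cdf(-mean / sd)               # P(z < 0)
    u = rng.random()
    p = p0 + u * (1.0 - p0) if d == 1 else u * p0
    p = min(max(p, 1e-12), 1.0 - 1e-12)    # keep inv_cdf inside (0, 1)
    z = mean + sd * std.inv_cdf(p)
    # Guard against round-off at the truncation point.
    return max(z, 0.0) if d == 1 else min(z, 0.0)
```

With all covariances equal to 0.5 and unit variances, for example, the conditional variance of $z_i$ is $1 - 1/3 = 2/3$.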

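    3.4. Sampling $\alpha$, $\Sigma_{12}$ and $\Omega$

    From Eqs. (2.3), (2.4) and given $\theta_1$, $\theta_2$ and $\varepsilon_1$, $\varepsilon_2$, the variables $\log \mu$ and $\log 1/\lambda$ are known with certainty. Since the variables $\log \mu$, $\log 1/\lambda$ and $z$ are multivariate normal, $\varepsilon_1$, $\varepsilon_2$ and $\alpha$ are jointly normal and the conditional distribution of $\alpha$ given $\varepsilon_1$, $\varepsilon_2$ is also normal. Denote the parameters of this conditional distribution as $N(\mu_{\alpha|\varepsilon}, \Psi_{\alpha|\varepsilon}^{-1})$ (they are derived in the Computational Appendix). Thus, given $\theta_1$, $\theta_2$, $\varepsilon_1$, $\varepsilon_2$, $z$, $\Sigma_{12}$, $\Omega$ and the prior $\alpha \sim N(\alpha_0, A_0^{-1})$, the posterior distribution of $\alpha$ is normal with mean $[A_0 + \Psi_{\alpha|\varepsilon}]^{-1}[A_0\alpha_0 + \Psi_{\alpha|\varepsilon}\mu_{\alpha|\varepsilon}]$ and variance $[A_0 + \Psi_{\alpha|\varepsilon}]^{-1}$. It is straightforward to sample from this distribution.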

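    Conditional on $\Omega^{-1}$, $\varepsilon_1$, $\varepsilon_2$ and $u$, and given the prior $\Sigma_{12} \sim N(\varpi_{12}, \Psi_{12}^{-1})$, the posterior distribution of $\Sigma_{12}$ is normal with mean $(\Psi_{12} + \Omega^{-1}u'u)^{-1}(\Psi_{12}\varpi_{12} + \Omega^{-1}\varepsilon'u)$ and variance $(\Psi_{12} + \Omega^{-1}u'u)^{-1}$.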
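The posterior moments of $\Sigma_{12}$ follow from treating $\varepsilon_i = \Sigma_{12}'u_i + \text{noise}$ as a bivariate regression on $u_i$ with known error covariance $\Omega$. A minimal sketch, assuming plain nested lists for the 2x2 matrices and using hypothetical function names, is:

```python
import math
import random


def inv2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def sigma12_posterior(eps, u, omega_inv, prior_mean, prior_prec):
    """Posterior mean and covariance of Sigma_12.

    eps: list of (eps1, eps2) pairs; u: list of floats of the same length.
    """
    uu = sum(ui * ui for ui in u)                                  # u'u
    eu = [sum(e[k] * ui for e, ui in zip(eps, u)) for k in range(2)]  # eps'u
    prec = [[prior_prec[r][c] + omega_inv[r][c] * uu for c in range(2)]
            for r in range(2)]
    cov = inv2(prec)
    rhs = [sum(prior_prec[r][c] * prior_mean[c] + omega_inv[r][c] * eu[c]
               for c in range(2)) for r in range(2)]
    mean = [cov[r][0] * rhs[0] + cov[r][1] * rhs[1] for r in range(2)]
    return mean, cov


def draw_normal2(mean, cov, rng):
    """Draw from a bivariate normal through the Cholesky factor of cov."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2]
```

When the prior precision is negligible, the posterior mean reduces to the least-squares slope of $\varepsilon$ on $u$, which gives a quick sanity check of an implementation.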

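    Given $\Sigma_{12}$, $\varepsilon_1$, $\varepsilon_2$, $u$ and the prior $\Omega^{-1} \sim \mathrm{Wish}(n_0, D_0)$, the posterior distribution for $\Omega^{-1}$ is

    $$(\Omega^{-1} \mid v) \sim \mathrm{Wish}\left( n_0 + N,\ \left[ D_0^{-1} + \sum_{i=1}^{N} (\varepsilon_i - \Sigma_{12}'u_i)(\varepsilon_i - \Sigma_{12}'u_i)' \right]^{-1} \right);$$

    see Zellner (1971, p. 389) and Johnson (1987, pp. 203–204) for the details of the Wishart density and the algorithm that draws values from it.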
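A draw of $\Omega^{-1}$ can be sketched as follows. This is not the algorithm of Johnson (1987) cited above; it uses the elementary construction of a Wishart variate as a sum of outer products of normal vectors, which is valid only for integer degrees of freedom and is slower than specialised methods. The function names are hypothetical.

```python
import math
import random


def inv2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def wishart_draw(df, scale, rng):
    """Wishart(df, scale) draw as a sum of df outer products of N(0, scale)."""
    l11 = math.sqrt(scale[0][0])
    l21 = scale[1][0] / l11
    l22 = math.sqrt(scale[1][1] - l21 * l21)
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(df):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = (l11 * z1, l21 * z1 + l22 * z2)
        for r in range(2):
            for c in range(2):
                w[r][c] += x[r] * x[c]
    return w


def draw_omega_inv(eps, u, sigma12, n0, d0, rng):
    """Draw Omega^{-1} from its Wishart full conditional."""
    s = inv2(d0)                      # start from D0^{-1}
    for e, ui in zip(eps, u):
        res = (e[0] - sigma12[0] * ui, e[1] - sigma12[1] * ui)
        for r in range(2):
            for c in range(2):
                s[r][c] += res[r] * res[c]
    return wishart_draw(n0 + len(u), inv2(s), rng)
```

Under this parameterization the mean of a $\mathrm{Wish}(n, D)$ variate is $nD$, which matches the form of the posterior scale matrix in the text.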

    4. Model comparison

    This section discusses the issues of Bayesian model comparison. The model we consider in this paper permits endogeneity of the treatment variable. We would like to develop a decision rule that would allow one to select between two models: model $M_0$, with the constraint $\Sigma_{12} = (0\ 0)$, and $M_1$, which leaves $\Sigma_{12}$ unconstrained. The Bayes factor for the null hypothesis $H_0: \Sigma_{12} = (0\ 0)$ is defined as

    $$B_{0,1} = \frac{m(y \mid M_0)}{m(y \mid M_1)},$$

    where $m(y \mid M_j)$ is the marginal likelihood of the model specification $M_j$. Since these two models are nested we take the Savage–Dickey density ratio approach (Verdinelli and Wasserman, 1995) to calculate the Bayes factor as

    $$B_{0,1} = \frac{\pi(\Sigma_{12} \mid y)}{\pi(\Sigma_{12})},$$

    where $\pi(\Sigma_{12} \mid y)$ is the posterior density and $\pi(\Sigma_{12})$ is the prior density of the parameter $\Sigma_{12}$, both calculated at the point $\Sigma_{12} = (0\ 0)$. To estimate $\pi(\Sigma_{12} \mid y)$ we approximate it by averaging the full conditional density of $\Sigma_{12}$, $\pi(\Sigma_{12} \mid \Omega, \varepsilon_1, \varepsilon_2, u)$, with respect to a posterior sample $\Omega^s$, $\varepsilon_1^s$, $\varepsilon_2^s$ and $u^s$, $s = 1, \ldots, S$. Let $V_{12}^s = (\Psi_{12} + \Omega_s^{-1}u_s'u_s)^{-1}$ and $\bar\Sigma_{12}^s = V_{12}^s(\Psi_{12}\varpi_{12} + \Omega_s^{-1}\varepsilon_s'u_s)$. Then

    $$\hat\pi(\Sigma_{12} \mid y) = \frac{1}{S}\sum_{s=1}^{S} \pi(\Sigma_{12} \mid \Omega^s, \varepsilon_1^s, \varepsilon_2^s, u^s),$$

    where $\pi(\Sigma_{12} \mid \Omega^s, \varepsilon_1^s, \varepsilon_2^s, u^s)$ is the bivariate normal density with mean $\bar\Sigma_{12}^s$ and variance $V_{12}^s$ evaluated at $\Sigma_{12} = (0\ 0)$. One also has to evaluate the prior density at the same point. In general, less informative priors would favor the null hypothesis, so improper priors are not applicable for testing. We will choose informative priors for $\Sigma_{12}$ without a large spread. The specification of the prior distributions is discussed in more detail in the next sections.
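The averaging step in the Savage–Dickey estimate can be sketched as follows. The inputs are assumed to be the conditional means and covariances of $\Sigma_{12}$ already computed at each retained MCMC draw; the function names `bvn_pdf` and `savage_dickey_bf` are hypothetical, and the covariance matrices are assumed symmetric.

```python
import math


def bvn_pdf(x, mean, cov):
    """Bivariate normal density at x for a symmetric 2x2 covariance."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    r1, r2 = x[0] - mean[0], x[1] - mean[1]
    q = (cov[1][1] * r1 * r1 - 2.0 * cov[0][1] * r1 * r2
         + cov[0][0] * r2 * r2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def savage_dickey_bf(cond_means, cond_covs, prior_mean, prior_cov,
                     point=(0.0, 0.0)):
    """Estimate B_{0,1}: averaged full conditional density over prior density."""
    post = sum(bvn_pdf(point, m, v)
               for m, v in zip(cond_means, cond_covs)) / len(cond_means)
    return post / bvn_pdf(point, prior_mean, prior_cov)
```

A value well below one indicates that the posterior puts much less mass near $\Sigma_{12} = (0\ 0)$ than the prior does, that is, evidence against the restricted model.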

    5. Example based on artificial data

    As an example we generate artificial data according to the selection model and estimate it by the MCMC algorithm. The motivation for this exercise is to see how the method performs when the model is correctly specified and to investigate the existence of a self-selection bias when endogeneity of the treatment variable is ignored. To do that one can estimate the model under the following restriction: $\Sigma_{12} = (0\ 0)$. The MCMC algorithm is easily implemented under this restriction by fixing the draws of $\Sigma_{21}$ to the zero vector. The data set, consisting of 1000 observations ($i = 1, \ldots, 1000$), is generated as follows.

    1. $x_{1i} = x_{2i} = (1, \xi_i)$ and $\xi_i \sim N(0, 1)$, $x_{3i} = (1, w_i)$ and $w_i \sim N(0, 1)$, $\alpha = (1, 1)$, $\theta_1 = (1, 1, 0.5)$, and $\theta_2 = (1, 1, 0.5)$.

    2. $(u_i, \varepsilon_{1i}, \varepsilon_{2i}) \sim N[(0, 0, 0), \Sigma]$, where

    $$\Sigma = \begin{pmatrix} 1 & 0.5 & 0.5 \\ 0.5 & 1 & 0.5 \\ 0.5 & 0.5 & 1 \end{pmatrix}.$$

    3. Generate $z_i = x_{3i}'\alpha + u_i$ and $d_i$ such that

    $$d_i = 1 \text{ iff } z_i \geq 0, \qquad d_i = 0 \text{ iff } z_i < 0.$$

    The values of $\alpha$ and $x_3$ are chosen in such a way that about 75% of all generated observations have $d = 1$ and for the rest $d = 0$. Set $x_{1i} = (x_{1i}', d_i)$ and $x_{2i} = (x_{2i}', d_i)$. Let $k_1$, $k_2$ and $k_3$ denote the number of variables in $x_1$, $x_2$ and $x_3$, respectively.

    4. Finally generate $y_{1i}$ and $y_{2i}$ as

    $$y_{1i} \sim P[\exp(x_{1i}'\theta_1 + \varepsilon_{1i})],$$

    $$y_{2i} \sim \exp[1/\exp(x_{2i}'\theta_2 + \varepsilon_{2i})].$$
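The four steps above can be sketched as a small simulator. The default parameter values are taken exactly as printed in this text, so any sign lost in typesetting is not restored, and they can be overridden through the arguments. The function names, the chunked Poisson sampler and the tuple layout of each simulated row are choices made for this sketch only.

```python
import math
import random


def cholesky3(a):
    """Lower-triangular Cholesky factor of a 3x3 positive definite matrix."""
    l = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def draw_poisson(mu, rng):
    """Poisson draw by multiplying uniforms, in chunks of mean <= 500.

    Chunking relies on the additivity of independent Poisson variates and
    avoids underflow of exp(-mu) for large means.
    """
    total = 0
    while mu > 0:
        step = min(mu, 500.0)
        limit, k, p = math.exp(-step), 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        total += k
        mu -= step
    return total


def simulate(n, alpha=(1.0, 1.0), theta1=(1.0, 1.0, 0.5),
             theta2=(1.0, 1.0, 0.5),
             sigma=((1.0, 0.5, 0.5), (0.5, 1.0, 0.5), (0.5, 0.5, 1.0)),
             seed=0):
    """Return a list of rows (y1, y2, d, xi, w); sigma is ordered (u, eps1, eps2)."""
    rng = random.Random(seed)
    l = cholesky3(sigma)
    rows = []
    for _ in range(n):
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        u, e1, e2 = (sum(l[i][k] * g[k] for k in range(i + 1)) for i in range(3))
        xi, w = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = alpha[0] + alpha[1] * w + u                 # selection equation
        d = 1 if z >= 0 else 0
        mu = math.exp(theta1[0] + theta1[1] * xi + theta1[2] * d + e1)
        mean2 = math.exp(theta2[0] + theta2[1] * xi + theta2[2] * d + e2)
        y1 = draw_poisson(mu, rng)
        y2 = rng.expovariate(1.0 / mean2)               # rate is 1 / mean
        rows.append((y1, y2, d, xi, w))
    return rows


# Usage: share of treated observations in a simulated sample.
data = simulate(1000, seed=1)
share_treated = sum(row[2] for row in data) / len(data)
```

With $\alpha = (1, 1)$ the latent index $z_i$ is $N(1, 2)$, so the probability of $d_i = 1$ is $\Phi(1/\sqrt{2}) \approx 0.76$, in line with the 75% share mentioned in step 3.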

    We center the priors for the parameters $\alpha$, $\theta_1$ and $\theta_2$ at zero and choose them to be

    $$\theta_1 \sim N(0_{k_1}, 10 I_{k_1}), \quad \theta_2 \sim N(0_{k_2}, 10 I_{k_2}), \quad \alpha \sim N(0_{k_3}, 10 I_{k_3}) \tag{5.1}$$

    to reflect weak prior information on these parameters. Select a proper prior for $\Sigma$ and center it at the identity matrix (since in general there is no information on the covariance parameters $\sigma_{1u}$, $\sigma_{2u}$ and $\sigma_{12}$ we center their priors at zero). McCulloch et al. (2000), analyzing the multinomial probit model, point out that in order to choose such a prior for $\Sigma$ one can select the following priors for $\Sigma_{12}$ and $\Omega^{-1}$:

    $$\Sigma_{12} \sim N(0_2, \kappa I_2), \quad \Omega^{-1} \sim \mathrm{Wish}(n_0, (n_0 - 2)(1 - \kappa) I_2) \tag{5.2}$$

    and specify only two scalars, $\kappa$ and $n_0$. In our example we choose $\kappa = 1/8$ and $n_0 = 5$.

    Table 1³ gives estimates of the posterior means, standard deviations and the autocorrelation function of the coefficients at lag 20. When endogeneity of the treatment variable is ignored it results in a self-selection bias and the coefficients $\gamma_1$ and $\gamma_2$ are inconsistently estimated. We use the Savage–Dickey density ratio to calculate the Bayes factor to test $H_0: \Sigma_{12} = (0\ 0)$. The calculated value of $B_{0,1}$ is of order $10^{-6}$, which provides decisive evidence against $H_0$. This suggests potentially serious consequences of ignoring the self-selection problem. The influence of the self-selection bias on the predicted utilization and expenditure is examined in more detail in the next section. In addition, we generate 3000 observations according to the selection model and estimate the model under the $M_1$ model specification. The results are given in Table 1.

    Table 1
    MCMC estimation for generated data (posterior means, with posterior standard deviations on the line below)

    Coefficient      True value   Unrestricted   ACF(20)   Restricted   ACF(20)   Unrestricted
                                  N = 1000                 N = 1000               N = 3000
    Const1           1            1.111          0.560     1.523        0.148     0.955
                                  0.125                    0.072                  0.069
    x1               1            0.990          0.174     0.966        0.373     1.022
                                  0.036                    0.039                  0.022
    d                0.5          0.450          0.586     0.132        0.134     0.529
                                  0.154                    0.083                  0.084
    Const2           1            1.043          0.402     0.665        0.018     1.005
                                  0.187                    0.096                  0.091
    x2               1            1.029          0.024     0.923        0.036     1.003
                                  0.048                    0.048                  0.028
    d                0.5          0.522          0.488     0.017        0.012     0.495
                                  0.251                    0.109                  0.114
    Const3           1            1.080          0.071     0.986        0.003     1.062
                                  0.059                    0.059                  0.034
    x3               1            0.992          0.299     1.048        0.004     1.018
                                  0.074                    0.069                  0.039
    $\sigma_{1u}$    0.5          0.512          0.453                            0.474
                                  0.099                                           0.060
    $\sigma_{2u}$    0.5          0.551          0.504                            0.500
                                  0.169                                           0.079
    $\sigma_{11}$    1            0.954          0.189     0.979        0.031     0.981
                                  0.066                    0.060                  0.039
    $\sigma_{12}$    0.5          0.520          0.236     0.507        0.102     0.475
                                  0.069                    0.055                  0.036
    $\sigma_{22}$    1            1.089          0.081     0.979        0.052     0.996
                                  0.112                    0.098                  0.065

    Fig. 1 displays the prior and posterior distributions (1000 and 3000 observations) for one of the parameters, $\sigma_{2u}$. The histograms are based on 20 000 iterations. The solid lines, $\sigma_{2u} = 0.5$, are drawn at the true value of the data generating process. The results indicate that the impact of the priors diminishes as the number of observations increases, the posterior means move closer to the true values of the parameters and the posterior standard deviations become smaller. Overall the estimation algorithm performs well: it produces Markov chains with reasonable properties and the estimated results are consistent with the true parameters.

    ³ We run 40 000 replications following a burn-in phase of 1000 replications. During the burn-in phase the Markov chains converge to the stationary distributions. The posterior means and posterior standard deviations are calculated based on the 40 000 draws. Values for the tuning parameters, $\tau_1 = 0.7$, $\tau_2 = 0.9$ and $k = \nu = 15$, are obtained in short preliminary runs by choosing reasonable acceptance rates close on average to 0.3 and examining the serial correlations of the Markov chains.
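The lag-20 autocorrelations reported in Table 1 summarize how strongly successive draws of a chain depend on each other. A minimal sketch of the sample autocorrelation at a given lag, using a hypothetical helper `autocorr` and the usual estimator normalized by the lag-0 sum of squares, is:

```python
def autocorr(draws, lag):
    """Sample autocorrelation of a sequence of MCMC draws at the given lag."""
    n = len(draws)
    mean = sum(draws) / n
    c0 = sum((x - mean) ** 2 for x in draws)
    ck = sum((draws[t] - mean) * (draws[t + lag] - mean)
             for t in range(n - lag))
    return ck / c0


# Usage: a perfectly alternating chain is strongly negatively correlated
# at lag 1 and strongly positively correlated at lag 2.
chain = [1.0, -1.0] * 50
lag1, lag2 = autocorr(chain, 1), autocorr(chain, 2)
```

Values close to zero at moderate lags indicate that the chain mixes well, which is the sense in which the text speaks of Markov chains with reasonable properties.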

    6. Empirical application

    In this section we investigate the self-selectivity of private insurance considering two different situations. The model is applied to two different data sets from two household-based medical expenditure surveys sponsored by the Agency for Healthcare Research and Quality (AHRQ). These are nationally representative surveys of healthcare use, expenditure, source of payment and insurance coverage for the US civilian noninstitutionalized population. The first sample pertains to the U.S. elderly population and the second to the U.S. nonelderly population. Both data sets are based on publicly available data at AHRQ and contain only individuals with positive healthcare expenditure. Definitions and summary statistics for the variables from the data sets used in this paper are given in Table 2.

    Fig. 1. Prior and posterior distributions for $\sigma_{2u}$ in the artificial data example. Panels: prior distribution; posterior distribution (N = 1000); posterior distribution (N = 3000). The posterior panels mark $\sigma_{2u} = 0.5$.

    6.1. Private insurance

    First, we analyze the impact of private insurance on the number of physician
    visits and associated expenditures by elderly Americans. A sample of 3690 observations
    is obtained from the National Medical Expenditure Survey conducted in 1987 and 1988
    (NMES, 1987).


    Table 2
    Variable definitions and summary statistics

                                                                         MEPS              NMES
    Variable   Definition                                               Mean St. Dev.     Mean St. Dev.
    Number of observations                                              2893              3690

    DOCVIS     Number of physician office visits                        4.74     6.30     6.88     6.85
    DVEXP      Expenditure on physician office visits                  481.8    972.0    424.2    788.5
    EXCLHLTH   Equals 1 if self perceived health is excellent           0.32     0.47     0.07     0.25
    POORHLTH   Equals 1 if self perceived health is poor                0.02     0.13     0.13     0.34
    NUMCHRON   Number of chronic conditions                             0.75     1.11     1.66     1.35
    ADLDIFF    Equals 1 if the person has a condition which                               0.21     0.41
               limits activities of daily living
    INJURY     Number of injuries which limit activities                0.42     0.83
               of daily living during 1996
    NOREAST    Equals 1 if the person lives in northeastern U.S.        0.20     0.40     0.19     0.39
    MIDWEST    Equals 1 if the person lives in midwestern U.S.          0.25     0.44     0.26     0.44
    WEST       Equals 1 if the person lives in western U.S.             0.21     0.41     0.19     0.39
    AGE        Age in years (divided by 10)                             4.03     1.29     7.41     0.62
    BLACK      Equals 1 if the person is African American               0.10     0.30     0.10     0.31
    FEMALE     Equals 1 if the person is female                         0.58     0.49     0.61     0.49
    MARRIED    Equals 1 if the person is married                        0.65     0.48     0.55     0.50
    SCHOOL     Number of years of education                            13.28     2.58     10.5      3.7
    FAMINC     Family income in $1,000                                 58.84    38.63     25.6     29.9
    EMPLOYED   Equals 1 if the person is employed                       0.82     0.38     0.10     0.30
    PRIVATE    Equals 1 if the person is covered by                                       0.80     0.40
               private health insurance
    INSURANCE  Equals 0 if the person is in an HMO,                     0.51     0.50
               1 if the person has a FFS plan
    MEDICAID   Equals 1 if the person is covered by Medicaid                              0.09     0.29
    SELFEMP    Equals 1 if the person is self-employed                  0.09     0.28
    SIZE       The size of the company where the person works          124.0    176.1
    LOCATION   Equals 1 if the company has multiple locations           0.53     0.50
    GOVT       Equals 1 if the company is governmental                  0.18     0.39

    Deb and Trivedi (1997) and Munkin and Trivedi (2000) treated private insurance
    as an exogenous variable for two reasons. First, individuals have strong incentives to
    purchase private insurance before they are 65 years old because its price rises sharply
    after that age. Second, individuals older than 65 are covered by Medicare, a generous
    public insurance program that offers substantial protection against healthcare costs.
    However, Medicare covers expenses of mostly acute healthcare needs, including those associated
    with the type of utilization that we consider, physician visits, but not the costs
    of long-term healthcare. Hence some will choose to purchase additional insurance, such
    as private insurance, to cover out-of-pocket expenses, which justifies the treatment of private
    insurance as an endogenous variable.

    We also compare the relative impact of private insurance and Medicaid on the
    level of healthcare utilization and expenditure. Medicaid provides health insurance to
    low-income individuals at public expense by covering the difference between the


    cost of health service and the Medicare coverage. Those ineligible for Medicaid may

    purchase private insurance with coverage similar to that of Medicaid.

    The three-equation joint model analyzes the number of doctor visits (DOCVIS),
    doctor visit expenditure (DVEXP) and private insurance (d). We choose the components
    of vectors x1, x2 and x3 as follows. The determinants of healthcare consumption
    and expenditure, vectors x1 and x2, have the same set of variables, which
    consists of the self-perceived health status variables EXCLHLTH and POORHLTH,
    measures of chronic diseases and disability status NUMCHRON and ADLDIFF,
    geographical variables NOREAST, MIDWEST and WEST, demographic variables
    BLACK, MALE, MARRIED, SCHOOL, AGE, the economic variable EMPLOYED
    and the insurance variables MEDICAID and PRIVINS (d). In general, the health status of an
    individual is unobservable and difficult to measure. However, self-perceived health variables,
    together with the evaluation of chronic conditions and disability status, have
    proven to be a good measure of health status. The geographical variables are included
    to capture differences in the local insurance and healthcare markets. Vector x3 comprises
    factors that influence the decision to purchase private insurance, including
    EXCLHLTH, POORHLTH, NUMCHRON, ADLDIFF, NOREAST, MIDWEST,
    WEST, BLACK, MALE, MARRIED, SCHOOL, EMPLOYED, AGE and FAMINC.
    MEDICAID, which targets low-income individuals, is excluded from the insurance
    equation.

    In a nonlinear simultaneous model, identification can in principle be secured by the
    nonlinearity of the functional forms (McManus, 1992). However, genuine exclusion restrictions,
    if available, ensure more robust identification of causal parameters (Heckman,
    2000). Thus motivated, we assume that the variable FAMINC influences the decision
    to purchase private (Medigap) insurance, but does not affect utilization. This
    restriction is empirically supported in this study.

    Our choice of priors is similar to that in the numerical example of the previous

    section and is given by (5.1) and (5.2).

    The posterior means and posterior standard deviations are presented in Table 3 (see footnote 4),
    which also presents estimation results for the restricted model with $\Sigma_{12} = (0\ 0)$. The
    posterior mean estimates for the coefficient of PRIVINS are substantially different for
    the restricted model (Table 3, columns 5 and 6) and the unrestricted model (columns 2
    and 3). The restricted coefficients are positive and relatively precisely estimated, whereas the
    unrestricted coefficients are positive but quite imprecisely estimated; i.e., they become
    statistically insignificant once endogeneity is allowed for. This result is consistent with
    the presence of selection bias in the following sense: if, after accounting for correlation
    between the insurance and utilization decisions, PRIVINS no longer has a significant
    coefficient, then the presence of selection bias is confirmed. However, for this argument
    to be plausible, our estimates of the covariance parameters $\sigma_{1u}$ and $\sigma_{2u}$ must be
    significantly different from zero. Although both are estimated to have positive posterior
    means, their standard deviations are too large to permit reliable inference. Consequently,
    the result on selection bias is inconclusive.

    4 The MCMC estimation is based on 40,000 replications. Values for tuning parameters 1 = 2 = 0.1 and
    k = v = 15 are selected.


    Table 3
    MCMC estimates of the restricted and unrestricted models for private insurance
    (posterior means, with posterior standard deviations on the following line)

                                            M1 (Unrestricted)              M0 (Restricted)
                                   Insurance    Docvis     Dvexp Insurance    Docvis     Dvexp
    CONST                              0.172     1.402     4.925     0.179     1.336     4.889
                                       0.334     0.172     0.190     0.350     0.151     0.136
    EXCLHLTH                           0.098     0.278     0.245     0.106     0.281     0.247
                                       0.110     0.055     0.082     0.111     0.055     0.082
    POORHLTH                           0.232     0.266     0.291     0.235     0.274     0.300
                                       0.073     0.045     0.077     0.076     0.043     0.072
    NUMCHRON                           0.014     0.134     0.143     0.014     0.133     0.141
                                       0.020     0.011     0.018     0.020     0.011     0.019
    ADLDIFF                            0.216     0.076     0.107     0.219     0.081     0.110
                                       0.064     0.040     0.069     0.067     0.039     0.065
    MEDICAID                                     0.204     0.254               0.206     0.253
                                                 0.056     0.082               0.056     0.079
    PRIVINS                                      0.057     0.227               0.198     0.337
                                                 0.192     0.285               0.041     0.064
    NOREAST                            0.136     0.078     0.188     0.135     0.076     0.186
                                       0.071     0.038     0.065     0.074     0.039     0.064
    MIDWEST                            0.306    0.0003     0.049     0.307     0.005     0.051
                                       0.067     0.039     0.064     0.070     0.036     0.059
    WEST                               0.144     0.101     0.325     0.141     0.102     0.327
                                       0.071     0.039     0.064     0.075     0.038     0.062
    BLACK                              0.881     0.038     0.047     0.874     0.022     0.039
                                       0.074     0.080     0.119     0.076     0.051     0.078
    MALE                               0.018     0.030     0.004     0.015     0.030     0.005
                                       0.059     0.032     0.054     0.059     0.032     0.052
    MARRIED                            0.266     0.041     0.008     0.266     0.045     0.011
                                       0.059     0.035     0.059     0.060     0.032     0.051
    SCHOOL                             0.100     0.017     0.026     0.100     0.016     0.024
                                       0.007     0.007     0.010     0.008     0.004     0.007
    EMPLOYED                           0.062     0.001     0.020     0.061     0.001     0.020
                                       0.096     0.052     0.079     0.099     0.049     0.077
    AGE                                0.007     0.038     0.022     0.006     0.039     0.023
                                       0.042     0.018     0.020     0.044     0.019     0.020
    FAMINC                             0.061                         0.062
                                       0.014                         0.014
    $\sigma_{1u}$, $\sigma_{2u}$                 0.076     0.059
                                                 0.106     0.154
    $\sigma_{11}$, $\sigma_{22}$                 0.503     0.777               0.496     0.762
                                                 0.021     0.043               0.016     0.036
    $\sigma_{12}$                                0.553                         0.550
                                                 0.027                         0.020

    We calculate the Bayes factor using the Savage–Dickey density ratio. M1 is the
    unrestricted specification that allows endogeneity of the treatment variable and M0 is
    the one that ignores it. The calculated Bayes factor value is $B_{0,1} = 2.95468$ (4.27171).
    Kass and Raftery (1995) indicate that if $B_{0,1}$ does not exceed 10 there is no strong


    Table 4
    Predicted values for doctor visits and expenditure: private insurance/Medicaid (pooled model)

    Insurance status       Private insurance,      Medicaid,               No Medicaid,
                           no Medicaid             no private insurance    no private insurance
    Health status          Excellent      Poor     Excellent      Poor     Excellent      Poor
    Number of visits            4.55      9.72          5.13     10.87          4.23      9.03
                                0.19      0.30          0.30      0.46          0.18      0.29
    Expenditure               296.03    606.18        350.66    670.11        274.22    535.96
                               18.69     33.87         31.07     47.42         18.01     30.18

    evidence against H1 and if it exceeds 100 then the evidence is decisive. In our case
    there is no strong evidence in favor of either model, so neither can be
    ignored in the Bayesian inferential framework. Given the value of the Bayes factor,
    and assuming that the prior model probabilities are equal, $P(M_0) = P(M_1) = 1/2$, the
    posterior model probabilities are $P(M_0|y) = 0.74713$ and $P(M_1|y) = 0.25287$. Following
    Draper (1995) we form our predictive distribution (pooled model) by averaging the posterior
    densities obtained under model specifications M0 and M1, using the posterior
    model probabilities as weights. For example, the weighted coefficients of PRIVINS in the
    utilization and expenditure equations are 0.162 (0.079) and 0.309 (0.120), respectively.
    These effects are significantly positive, but slightly weaker than those predicted by the
    restricted model. We do not fully report the moments of the pooled posterior density
    but use it as the predictive distribution in the following calculations.
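
As a rough check on the arithmetic in this paragraph, the following Python sketch converts a Bayes factor into posterior model probabilities under equal prior odds and then forms probability-weighted averages of the Table 3 entries for PRIVINS. The function names `posterior_model_probs` and `pooled` are illustrative and not taken from the paper, and weighting the reported standard deviations in the same way as the means is an assumption made here because it appears to reproduce the figures quoted in parentheses.

```python
def posterior_model_probs(bf01, prior_m0=0.5):
    """Posterior probabilities of M0 and M1 given the Bayes factor B_{0,1}."""
    prior_odds = prior_m0 / (1.0 - prior_m0)
    post_odds = bf01 * prior_odds          # posterior odds of M0 against M1
    p_m0 = post_odds / (1.0 + post_odds)
    return p_m0, 1.0 - p_m0


def pooled(value_m0, value_m1, p_m0):
    """Probability-weighted average of a quantity reported under M0 and M1."""
    return p_m0 * value_m0 + (1.0 - p_m0) * value_m1


# Bayes factor and PRIVINS entries as reported in the text and in Table 3.
p_m0, p_m1 = posterior_model_probs(2.95468)

docvis_mean = pooled(0.198, 0.057, p_m0)   # restricted, unrestricted means
docvis_sd = pooled(0.041, 0.192, p_m0)     # assumption: s.d. weighted likewise
dvexp_mean = pooled(0.337, 0.227, p_m0)
dvexp_sd = pooled(0.064, 0.285, p_m0)

print(round(p_m0, 4), round(p_m1, 4))
print(round(docvis_mean, 3), round(docvis_sd, 3))
print(round(dvexp_mean, 3), round(dvexp_sd, 3))
```

Applied to the values quoted above, the sketch gives posterior probabilities of roughly 0.747 and 0.253 and pooled PRIVINS coefficients of about 0.162 and 0.309.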

    In Table 4 we compare the levels of healthcare use and expenditure for three groups
    defined according to their insurance status: those individuals who have private insurance
    and no Medicaid; those who have Medicaid and no private insurance; those
    who have neither private insurance nor Medicaid. The individuals from the last group
    are covered only by Medicare. We divide these three categories further according
    to the self-perceived health status (excellent or poor health). The mean function for
    $y_j$ ($j = 1, 2$) after integrating out unobserved heterogeneity has a closed form,

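    $$E[y_j \mid \theta_j] = \Phi(x_3'\alpha + \sigma_{ju}) \exp\left(x_j'\beta_j + \gamma_j + \frac{\sigma_{jj}}{2}\right) + \left(1 - \Phi(x_3'\alpha + \sigma_{ju})\right) \exp\left(x_j'\beta_j + \frac{\sigma_{jj}}{2}\right),$$

    where $\theta_j = (\beta_j, \alpha, \sigma_{ju}, \sigma_{jj})$.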

    We calculate posterior moments of the dependent variables for each group, evaluated
    at the group's mean value of the regressors, $(\bar{x}_j, \bar{x}_3)$, and taken with respect
    to the pooled posterior distribution of the parameters $\theta_j$. The posterior moments are
    approximated as

    $$\hat{E}[y_j \mid \bar{x}_j, \bar{x}_3] = \frac{1}{S} \sum_{i=1}^{S} E[y_j \mid \bar{x}_j, \bar{x}_3, \theta_j^{(i)}], \quad j = 1, 2,$$


    where S is the posterior sample size. The results are presented in Table 4. Based on
    the estimation results one could conclude that, for the poor health group, utilization
    is higher for Medicaid patients than for those with private insurance, and higher for
    those with private insurance than for those with Medicare only, by about one visit a year
    in each case. However, for the excellent health group the differences are not as large.
    These results suggest that the additional
    impact of Medigap insurance on average utilization levels for the Medicare elderly
    is relatively small, albeit larger for those in poor health. The impact of Medicaid is
    slightly larger on average, of the order of two visits for those in poor health and about
    one visit for those in excellent health.
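
The closed-form mean and its posterior average can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes that the selection error has unit variance, writes `index3` for the selection index x3'α evaluated at the group means, `xb` for xj'βj, `gamma` for the coefficient on the insurance indicator, and `sigma_ju`, `sigma_jj` for the covariance and variance terms. The function names and the example draws are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_outcome(xb, gamma, index3, sigma_ju, sigma_jj):
    """Mean of outcome j after integrating out the unobserved heterogeneity."""
    p = std_normal_cdf(index3 + sigma_ju)
    insured = math.exp(xb + gamma + 0.5 * sigma_jj)
    uninsured = math.exp(xb + 0.5 * sigma_jj)
    return p * insured + (1.0 - p) * uninsured


def posterior_mean(draws):
    """Average the mean function over posterior draws; also return the s.d."""
    values = [expected_outcome(**draw) for draw in draws]
    s = len(values)
    mean = sum(values) / s
    var = sum((v - mean) ** 2 for v in values) / (s - 1) if s > 1 else 0.0
    return mean, math.sqrt(var)


# Hypothetical posterior draws (made-up numbers, for illustration only).
example_draws = [
    {"xb": 1.40, "gamma": 0.05, "index3": 0.80, "sigma_ju": 0.08, "sigma_jj": 0.50},
    {"xb": 1.35, "gamma": 0.20, "index3": 0.75, "sigma_ju": 0.02, "sigma_jj": 0.52},
    {"xb": 1.45, "gamma": 0.10, "index3": 0.85, "sigma_ju": 0.10, "sigma_jj": 0.48},
]

mean_visits, sd_visits = posterior_mean(example_draws)
print(mean_visits, sd_visits)
```

In practice the list of draws would hold the retained MCMC output, with the regressors fixed at the mean values of the group of interest.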

    6.2. HMO versus FFS

    In this application we study the choice of a specific type of private insurance by
    individuals aged between 16 and 65 years. The individuals choose between two types
    of private insurance: FFS options and HMO plans. The HMO plan serves as a proxy
    for managed-care-type organizations, which often control costs and access through
    features such as provider networks, gatekeeping, provider payment mechanisms and
    so forth. The literature has emphasized that HMOs may increase the utilization of certain
    types of care, e.g. preventive care, while reducing that of other more expensive types
    of care, e.g. hospital nights.

    Favorable selection into HMO plans means that those who expect to be low users of
    services will tend to enrol into these plans, while those who expect to be heavy users
    will enrol into the indemnity plans (FFS). If expected future usage can be adequately
    proxied by observed variables, then such selection can be controlled by introducing appropriate
    proxy variables in the insurance and utilization equations (Reschovsky, 2000).
    Under this scenario, estimation is considerably simplified. Ignoring the endogeneity
    issue, several studies claim that HMO and FFS plans are similar in meeting individuals'
    needs of healthcare services and in covering the associated costs.

    An issue that is relevant in discussing the endogeneity of insurance plans is that
    individuals may have no choice or only very limited choice in selecting insurance
    plans. An overwhelming majority (80%) of our sample are employed. A high proportion
    of these are thought to have very limited choice of plans. This factor softens the
    impact of the endogeneity issue even though it is not equivalent to exogenous assignment
    of insurance plans. An example of a factor subsumed under unobserved heterogeneity
    is attitude towards health risk. For example, a risk-averse individual may choose a
    health plan conservatively and may see a doctor more often than an individual who
    behaves like a risk lover. Attitude towards health risk is not directly observed in our
    sample.

    We use data from the 1996 Medical Expenditure Panel Survey (MEPS). These are
    collected from each household in a series of five rounds of data collection over a
    period of 2.5 years. The first round of the data consists of 10,639 households with more
    than 23,000 individuals. Our MEPS sample size is 2893 and it consists of privately
    insured individuals aged from 16 to 65 years whose healthcare expenditure is positive.
    About 50% have a FFS type insurance and the other half purchased their insurance
    through an HMO. The categorical variable INSURANCE takes the value 1 for the


    FFS category and 0 for the HMO category. Neither Medicare nor Medicaid is relevant
    for this nonelderly sample.

    As before, the selection model is estimated for the number of doctor visits (DOCVIS),
    doctor visit expenditure (DVEXP) and insurance status (d). Vectors x1 and x2 include
    EXCLHLTH, POORHLTH, NUMCHRON, INJURY, BLACK, FEMALE,
    MARRIED, SCHOOL, EMPLOYED, AGE, NOREAST, MIDWEST, WEST and
    INSURANCE (d).

    As mentioned before, there is a problem of limited insurance choices affecting the
    selection process. More than 80% of the individuals in the data set are employed and
    some employers provide only limited insurance options. If data were available, this
    problem of constraints on the selection could be solved by restricting the sample to
    only those individuals who had an actual choice between FFS and HMO when selecting
    their insurance plans. However, this study takes a different approach. Including
    variables controlling for the type of the company and for the employment status, such
    as size, existence of multiple locations, being self-employed and belonging to a governmental
    organization, could capture the effect of the employer's constraint on the selection.
    Vector x3 consists of EXCLHLTH, POORHLTH, NUMCHRON, INJURY,
    NOREAST, MIDWEST, WEST, BLACK, FEMALE, MARRIED, SCHOOL,
    EMPLOYED, AGE, SIZE, GOVT, LOCATION, SELFEMP and FAMINC. The
    geographical variables are included to control for the inequalities in HMO penetration
    and differences in local prices.

    The prior distributions of the parameters are the same as those in the previous model.
    The posterior means and posterior standard deviations of the parameters are given in Table 5 (see footnote 5).
    For this sample neither the restricted nor the unrestricted estimates suggest that the
    FFS plan has a significant positive impact on doctor visits or expenditures relative to
    the HMOs. Once again the covariances $\sigma_{1u}$ and $\sigma_{2u}$ are quite imprecisely estimated, so
    the evidence in support of the endogeneity hypothesis remains weak.

    The Bayes factor value is $B_{0,1} = 1.77255$ (2.93544) and the posterior model probabilities
    are $P(M_0|y) = 0.63932$ and $P(M_1|y) = 0.36068$. According to these results, again there
    is no strong evidence in favor of either of the models. We use the posterior model
    probabilities to calculate the predictive distribution (pooled model) as the weighted
    average. The effect of INSURANCE on utilization and expenditure in the pooled
    model is 0.084 (0.146) and 0.125 (0.167), respectively.

    Calculations similar to those in the previous section are made for four different groups
    based on whether the individual belongs to an HMO or FFS and according to the health
    status, excellent health or poor health. The expected utilization and expenditure for all
    groups are presented in Table 6 and the results are based on the posterior distribution
    of the pooled model. The results are not surprising given that the posterior mean
    estimates for $\sigma_{1u}$ and $\sigma_{2u}$, as well as the coefficients for the INSURANCE variable, are
    not significantly different from zero. The average number of visits is at almost the same
    level for HMO and FFS enrollees in the excellent health group. However, for the
    poor health group HMO patients have a slightly higher utilization level.

    5 The results are based on 40,000 replications. The following values of tuning parameters are selected:
    1 = 2 = 0.1 and k = v = 15.


    Table 5
    MCMC estimates of the HMO/FFS model
    (posterior means, with posterior standard deviations on the following line)

                                            M1 (Unrestricted)              M0 (Restricted)
                                   Insurance    Docvis     Dvexp Insurance    Docvis     Dvexp
    CONST                              0.094     0.584     4.650     0.093     0.425     4.508
                                       0.159     0.208     0.246     0.159     0.110     0.156
    EXCLHLTH                           0.098     0.169     0.079     0.098     0.181     0.083
                                       0.051     0.041     0.061     0.053     0.039     0.056
    POORHLTH                           0.115     0.402     0.589     0.103     0.390     0.579
                                       0.179     0.126     0.176     0.187     0.116     0.169
    NUMCHRON                           0.004     0.214     0.235     0.003     0.214     0.235
                                       0.023     0.016     0.026     0.024     0.016     0.026
    INJURY                             0.019     0.152     0.180     0.019     0.151     0.179
                                       0.028     0.019     0.032     0.029     0.019     0.032
    INSURANCE                                    0.285     0.220               0.030     0.071
                                                 0.347     0.371               0.033     0.052
    NOREAST                            0.072     0.117     0.076     0.075     0.121     0.079
                                       0.065     0.049     0.076     0.067     0.048     0.074
    MIDWEST                            0.225     0.014     0.119     0.226     0.019     0.125
                                       0.062     0.049     0.077     0.063     0.046     0.071
    WEST                               0.426     0.026     0.163     0.430     0.035     0.169
                                       0.066     0.064     0.095     0.067     0.049     0.074
    BLACK                              0.162     0.137     0.201     0.163     0.132     0.195
                                       0.078     0.061     0.095     0.083     0.060     0.092
    FEMALE                             0.137     0.282     0.325     0.138     0.289     0.329
                                       0.047     0.039     0.057     0.049     0.036     0.053
    MARRIED                            0.031     0.006     0.077     0.033     0.004     0.077
                                       0.052     0.038     0.059     0.055     0.037     0.058
    SCHOOL                             0.001     0.026     0.034    0.0004     0.025     0.034
                                       0.009     0.007     0.010     0.010     0.007     0.010
    EMPLOYED                           0.092     0.111     0.088     0.090     0.107     0.088
                                       0.074     0.049     0.072     0.077     0.046     0.071
    AGE                                0.052     0.038     0.089     0.053     0.037     0.088
                                       0.020     0.016     0.025     0.021     0.015     0.022
    FAMINC                            0.0008                        0.0009
                                      0.0006                        0.0007
    SELFEMP                            0.202                         0.206
                                       0.093                         0.100
    GOVT                               0.107                         0.112
                                       0.063                         0.066
    SIZE                              0.0006                        0.0006
                                      0.0001                        0.0002
    LOCATION                           0.064                         0.063
                                       0.059                         0.063
    $\sigma_{1u}$, $\sigma_{2u}$                 0.195     0.181
                                                 0.215     0.228
    $\sigma_{11}$, $\sigma_{22}$                 0.561     0.828               0.513     0.778
                                                 0.065     0.076               0.020     0.041
    $\sigma_{12}$                                0.582                         0.552
                                                 0.055                         0.023


    Table 6
    Predicted values for doctor visits and expenditure, FFS/HMO (pooled model)

    Insurance status              FFS                      HMO
    Health status         Excellent      Poor      Excellent      Poor
    Number of visits           3.53      8.79           3.49      9.50
                               0.12      0.76           0.12      0.81
    Expenditure              363.92   1089.42         368.80   1181.61
                              17.15    141.32          18.02    153.33

    Thus, we conclude that the type of insurance does not significantly affect the level of
    healthcare use.

    6.3. Discussion and concluding remarks

    How do our results compare with previous estimates? Dowd et al. (1991) modelled
    physician visits and inpatient hospital days using 1984 survey data from 20 Twin Cities
    firms that offered their employees a choice from at least one HMO plan and one FFS
    plan. This study found no statistically significant evidence for selection bias. However,
    the study is subject to an important qualification. The authors estimated a linear selection
    model after restricting the sample to those with positive levels of utilization. They
    used log(physician visits) or log(hospital days) as their outcome variable and did not
    account fully for the intrinsically discrete and heteroskedastic nature of the response
    variable. By contrast, our formulation takes into account both these features. Yet Dowd
    et al. (1991) did not find a significant difference between HMO and FFS insurees in the
    average number of doctor visits. (They did find that HMO insurees had fewer inpatient
    days on average.) In studies of the impact of HMOs on healthcare utilization, based
    on the Community Tracking Study Household Survey 1996–1997 (Reschovsky, 2000;
    Reschovsky and Kemper, 2000), the issue of selection bias was discussed, albeit not
    dealt with in a comprehensive econometric framework. Reschovsky and Kemper (2000,
    p. 385) argue that there is little evidence of selection on observables, and mention but
    do not pursue the possibility of selection on unobservables via an econometric model.
    After some tests they conclude that the risk of estimates of the impact of HMOs on healthcare
    use being affected by selection bias was small in their study. Tu et al. (2000)
    analyze the same data as Reschovsky and Kemper (2000), using similar econometric
    methodology, and find no significant differences between HMO and non-HMO
    enrollees in the use of hospital, surgery, and emergency room services. Mello et al.
    (2002), in their study of the Medicare population based on data from 1993–1996, report
    tests of endogeneity of insurance choice. Based on empirical models with discrete
    factor structures, they do not find evidence that supports endogeneity of the HMO variable
    in their utilization equations, but they do find evidence of favorable selection into
    HMOs (healthier individuals self-select into cheaper health plans) and reduced utilization
    of hospital services by HMO enrollees. An important qualification to the above


    results is that incentives for controlling healthcare costs may now also be present in
    FFS plans, and hence the marginal impact of HMOs on utilization may be smaller and
    harder to detect. A second qualification is that studies that model disaggregated measures
    of utilization, such as specific preventive (blood pressure checks, mammograms,
    etc.) and curative services (surgery or hospital nights), may provide sharper tests of
    the endogeneity hypothesis and improved estimates of the differential impact of HMO
    and non-HMO plans on use of such services. This remains a topic for future research.
    The final qualification concerns the definition of an HMO plan. The definition used in
    this study may be too broad, and finer distinctions based on the attributes of various
    managed care plans may provide improved tests of the endogeneity hypothesis.

    Embracing computational complications inherent in the problem, we have developed
    a flexible approach to modeling self-selectivity of the treatment variable in a model
    with multiple outcomes. In our analysis of two separate data sets we find mixed or
    weak evidence of self-selectivity. However, the Bayes factor values suggest that the
    results for both unrestricted and restricted (no endogeneity) models should be used in
    a Bayesian inferential framework because neither model dominates the other.

    Acknowledgements

    We thank John Geweke, Co-Editor Arnold Zellner, an Associate Editor and three
    anonymous referees for their helpful comments on earlier versions of this paper. We
    have also benefited from presentation of an earlier version at the 2000 Mid-West Econometric
    Group Meeting in Chicago and at Purdue University, the University of Tennessee and Tulane
    University. However, we retain responsibility for any errors.

    Appendix A. Computational

    A.1. Sampling 1 and 2

    The gradient vector has the following two components:

    g1i = i + y1i (i 12ui)1

    1

    0

    and

    g2i = 1 + y2ii (i 12ui)1

    0

    1

    ;

    where i = exp(x1i1 + 1i) and 1=i = exp(x2i2 + 2i) and the Hessian matrix is

    Hei =

    i 00 y2ii

    1:


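    A.2. Sampling $\beta_1$ and $\beta_2$

    The gradient vectors and the Hessian matrices are

    $$g_{\beta_1} = -B_{01}^{-1}(\beta_1 - \beta_{01}) + \sum_{i=1}^{N} \left(y_{1i} - \exp(x_{1i}'\beta_1 + \varepsilon_{1i})\right) x_{1i},$$

    $$g_{\beta_2} = -B_{02}^{-1}(\beta_2 - \beta_{02}) + \sum_{i=1}^{N} \left(-1 + y_{2i} \exp(-x_{2i}'\beta_2 - \varepsilon_{2i})\right) x_{2i}$$

    and

    $$H_{\beta_1} = -B_{01}^{-1} - \sum_{i=1}^{N} \exp(x_{1i}'\beta_1 + \varepsilon_{1i})\, x_{1i} x_{1i}',$$

    $$H_{\beta_2} = -B_{02}^{-1} - \sum_{i=1}^{N} y_{2i} \exp(-x_{2i}'\beta_2 - \varepsilon_{2i})\, x_{2i} x_{2i}'.$$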
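
A gradient and Hessian of this kind can be checked numerically. The plain-Python sketch below is an illustration under assumptions, not the authors' implementation: it assumes that the first outcome is a Poisson count with conditional mean exp(x1i'β1 + ε1i) and that β1 has a normal prior with mean β01 and covariance B01, with `prec` standing for the prior precision B01^{-1}. The function names and all data values are made up.

```python
import math


def linear_index(x_row, beta, e):
    """x_i' beta + epsilon_i for one observation."""
    return sum(x * b for x, b in zip(x_row, beta)) + e


def log_kernel(beta, beta0, prec, X, y, eps):
    """Log prior plus Poisson log-likelihood, up to additive constants."""
    k = len(beta)
    d = [beta[r] - beta0[r] for r in range(k)]
    quad = sum(d[r] * prec[r][c] * d[c] for r in range(k) for c in range(k))
    loglik = 0.0
    for x_row, y_i, e in zip(X, y, eps):
        eta = linear_index(x_row, beta, e)
        loglik += y_i * eta - math.exp(eta)
    return -0.5 * quad + loglik


def gradient(beta, beta0, prec, X, y, eps):
    """-prec (beta - beta0) + sum_i (y_i - mu_i) x_i."""
    k = len(beta)
    d = [beta[r] - beta0[r] for r in range(k)]
    g = [-sum(prec[r][c] * d[c] for c in range(k)) for r in range(k)]
    for x_row, y_i, e in zip(X, y, eps):
        mu = math.exp(linear_index(x_row, beta, e))
        for r in range(k):
            g[r] += (y_i - mu) * x_row[r]
    return g


def hessian(beta, prec, X, eps):
    """-prec - sum_i mu_i x_i x_i'."""
    k = len(beta)
    H = [[-prec[r][c] for c in range(k)] for r in range(k)]
    for x_row, e in zip(X, eps):
        mu = math.exp(linear_index(x_row, beta, e))
        for r in range(k):
            for c in range(k):
                H[r][c] -= mu * x_row[r] * x_row[c]
    return H


# Made-up inputs: three observations, intercept plus one regressor.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [2, 0, 5]
eps = [0.1, -0.2, 0.0]           # current values of the latent errors
beta = [0.3, 0.4]
beta0 = [0.0, 0.0]
prec = [[0.1, 0.0], [0.0, 0.1]]  # prior variance 10 on each coefficient

g = gradient(beta, beta0, prec, X, y, eps)
H = hessian(beta, prec, X, eps)
print(g)
print(H)
```

Because the Hessian is negative definite for any value of the coefficients, the log kernel is concave, which is what makes a Newton-type step a natural way to locate a proposal for this block.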

    A.3. Sampling ; 12 and

    Denote = (1; 2;

    ), X = diag(x1; x2; x3), Z = (log log1=z

    ). Then from

    Eqs. (2.3), (2.4) and (2.5) has multivariate normal distribution N( ; ) where =

    [X(1 IN)X]1 and = [X(1 IN)Z]. Partition and with respect to = (1; 2) and as = ( ) and

    =

    :

    The conditional distribution of given 1 and 2 is normal with mean | = +

    1 ( ) and variance 1| = 1 .

    References

    Albert, J.H., Chib, S., 1993. Bayesian analysis of binary and polychotomous response data. Journal of American Statistical Association 88, 669–679.
    Chib, S., Hamilton, B.H., 2000. Bayesian analysis of cross-section and clustered data treatment models. Journal of Econometrics 97, 25–50.
    Chib, S., Greenberg, E., Winkelmann, R., 1998. Posterior simulation and Bayes factor in panel count data models. Journal of Econometrics 86, 33–54.
    Crepon, B., Duguet, E., 1997. Research and development, competition and innovation: pseudo-maximum likelihood and simulated maximum likelihood methods applied to count data models with heterogeneity. Journal of Econometrics 79, 355–378.
    Deb, P., Trivedi, P.K., 1997. Demand for medical care by the elderly: a finite mixture approach. Journal of Applied Econometrics 12, 313–336.
    Devroye, L., 1986. Non-Uniform Random Variate Generation. Springer, New York.
    Dowd, B., Feldman, R., Cassou, S., Finch, M., 1991. Health plan choice and utilization of health care services. Review of Economics and Statistics 73, 85–93.
    Draper, D., 1995. Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society, Series B 57, 45–97.


    Geman, S., Geman, D., 1984. Stochastic relaxation, Gibbs distribution and the Bayesian restoration of images.

    IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 609628.

    Geweke, J., 1991. Ecient simulation from the multivariate normal and Student-t distributions subject to

    linear constraints. In: Keramidas, E.M. (Ed.), Computing Science and Statistics: Proceedings of the 23rdSymposium on the Interface, pp. 571578.

    Goldman, D.P., 1995. Managed care as a public cost-containment mechanism. Rand Journal of Economics

    26, 277295.

    Greene, W.H., 1997. FIML estimation of sample selection models for count data. Discussion Paper EC-97-02,

    Department of Economics, Stern School of Business, New York University.

    Hastings, W.K., 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika

    57, 97109.

    Heckman, J.J., 1976. The common structure of statistical models of truncation, sample selection and limited

    dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement

    5, 475492.

    Heckman, J.J., 2000. Causal parameters and policy analysis in economics: a twentieth century retrospective.

    Quarterly Journal of Economics 115 (1), 4597.Johnson, M., 1987. Multivariate Statistical Simulation. Wiley, New York.

    Kass, R.E., Raftery, A.E., 1995. Bayes factors. Journal of American Statistical Association 90, 773795.

    Kemper, P., Reschovsky, J.D., Tu, H.T., 2000. Do HMOs make a dierence? Summary and implications.

    Inquiry 36, 419425.

    Koop, G., Poirier, D.J., 1997. Learning about the across-regime correlation in switching regression models.

    Journal of Econometrics 78, 217227.

    Lee, L.-F., 2000. Self-selection. In: Baltagi, B.H. (Ed.), A Companion to Theoretical Econometrics.

    Blackwell, Oxford (Chapter 18).

    Li, K., 1998. Bayesian inference in a simultaneous equation model with limited dependent variables. Journal

    of Econometrics 85, 387400.

    Linardakis, M., Dellaportas, P., 1999. Bayesian analysis of latent utilities for transportation services via

    extensions of the multinomial probit model. Working paper, Athens University of Economics and Business.

    Maddala, G.S., 1985. A survey of the literature on selectivity bias as it pertains to health care markets.

    Health Economics and Health Services Research 6, 318.

    McCulloch, R.E., Rossi, P.E., 1994. An exact likelihood analysis of the multinomial probit model. Journal

    of Econometrics 64, 207240.

    McCulloch, R.E., Polson, N.G., Rossi, P.E., 2000. A Bayesian analysis of the multinomial probit model with

    fully identied parameters. Journal of Econometrics 99, 173193.

    McManus, D.A., 1992. How common is identication in parametric models? Journal of Econometrics 53

    (13), 523.

    Mello, M.M., Stearns, S.C., Norton, E.C., 2002. Do medicare HMOs still reduce health service use after

    controlling for selection bias? Health Economics 11, 323340.

    Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E., 1953. Equations of state

    calculations by fast computing machines. Journal of Chemical Physics 21, 1087-1092.

    Miller, R.H., Luft, H.S., 1994. Managed care plan performance since 1980. Journal of American Medical

    Association 271, 1512-1519.

    Miller, R.H., Luft, H.S., 1997. Does managed care lead to better or worse quality of care? Health Affairs

    16, 7-25.

    Munkin, M.K., Trivedi, P.K., 2000. Analysis of patterns of healthcare utilization among the elderly using

    mixed discrete-continuous models with unobserved heterogeneity. Working paper.

    Nobile, A., 1998. A hybrid Markov chain for the Bayesian analysis of the multinomial probit model. Statistics

    and Computing 8, 229-242.

    Nobile, A., 2000. Comment: Bayesian multinomial probit models with a normalization constraint. Journal of

    Econometrics 99, 335-345.

    Reschovsky, J.D., 2000. Do HMOs make a difference? Data and methods. Inquiry 36, 378-389.

    Reschovsky, J.D., Kemper, P., 2000. Do HMOs make a difference? Introduction. Inquiry 36, 374-377.

    Tanner, M.A., Wong, W.H., 1987. The calculation of posterior distribution by data augmentation. Journal of

    American Statistical Association 82, 528-540.

    Terza, J.V., 1998. Estimating count data models with endogenous switching: sample selection and endogenous

    treatment effects. Journal of Econometrics 84, 129-154.

    Tu, H.T., Kemper, P., Wong, H.J., 2000. Do HMOs make a difference? Use of health services. Inquiry 36,

    401-410.

    van Ophem, H., 2000. Modeling selectivity in count data models. Journal of Business and Economic Statistics

    18, 503-510.

    Verdinelli, I., Wasserman, L., 1995. Computing Bayes factors using a generalization of the Savage-Dickey

    density ratio. Journal of American Statistical Association 90, 614-618.

    Winkelmann, R., 1998. Count data models with selectivity. Econometric Reviews 17, 339-359.

    Zellner, A., 1971. An Introduction to Bayesian Inference in Econometrics. Wiley, New York.