CCP Estimation of Dynamic Discrete Choice

Models with Unobserved Heterogeneity∗

Peter Arcidiacono Robert Miller

Duke University Carnegie Mellon

September 29, 2006

Abstract

Standard methods for solving dynamic discrete choice models involve calculating the value function either through backwards recursion (finite-time) or through the use of a fixed point algorithm (infinite-time). Conditional choice probability (CCP) estimators provide a computationally cheaper alternative but are perceived to be limited both by distributional assumptions and by being unable to incorporate unobserved heterogeneity via finite mixture distributions. We extend the classes of CCP estimators that need only a small number of CCP's for estimation. We also show that not only can finite mixture distributions be used in conjunction with CCP estimation, but, because of the computational simplicity of the estimator, an individual's location in unobserved space can transition over time. Monte Carlo results suggest that the algorithms developed are computationally cheap with little loss in precision.

Keywords: dynamic discrete choice, unobserved heterogeneity

∗We thank Paul Ellickson and seminar participants at Duke University for valuable comments. Josh Kinsler provided excellent research assistance.


1 Introduction

Standard methods for solving dynamic discrete choice models involve calculating the value function either through backwards recursion (finite-time) or through the use of a fixed point algorithm (infinite-time). Conditional choice probability (CCP) estimators provide an alternative to these techniques which involves mapping the value functions into the probabilities of making particular decisions. While CCP estimators are much easier to compute than estimators based on obtaining the full solution and have experienced a resurgence in the literature on dynamic games,1 there are at least two reasons why researchers have been reticent to employ them in practice. First, it is perceived that the mapping between CCP's and value functions is simple only in specialized cases. Second, it is believed that CCP estimators cannot be adapted to handle unobserved heterogeneity.2 This latter criticism is particularly damning as one of the fundamental issues in labor economics, and indeed one of the main purposes of structural microeconomics, is the explicit modelling of selection.

We show that, for a wide class of Generalized Extreme Value (GEV) distributions of the error structure, the value function depends only on the one period ahead CCP's and, in single-agent problems, often depends upon only the one period ahead CCP's for a single choice. The class of problems we discuss is quite large and includes dynamic games where one of the decisions is whether to exit. Further, unobserved heterogeneity via finite mixture distributions is not only easily incorporated into the algorithm, but the finite mixture distributions can transition over time as well. Previous work on incorporating unobserved heterogeneity has been restricted to cases of permanent unobserved heterogeneity, in large part because of the computational burdens associated with allowing for persistence, but not permanence, in the unobserved heterogeneity distribution. Using insights from the macroeconomics literature on regime switching and the computational simplicity of CCP estimation, we show that incorporating persistent but time-varying heterogeneity comes at very little computational cost.

We adapt the EM algorithm, and in particular its application to sequential likelihood developed in Arcidiacono and Jones (2003), to two classes of CCP estimators that between them cover a wide class of dynamic optimization problems and sequential games with incomplete information, based on representations developed in Hotz et al (1994) and Altug and Miller (1998). Our techniques can also be readily applied to models with discrete and continuous choices by exploiting the Euler equation representation given in Altug and Miller (1998).

1See Aguirregabiria and Mira (2006), Bajari, Benkard, and Levin (2006), Pakes, Ostrovsky, and Berry (2004), and Pesendorfer and Schmidt-Dengler (2003).

2A third reason is that to perform policy experiments it is often necessary to solve the full model. While this is true, using CCP estimators would only involve solving the full model once for each policy simulation, as opposed to multiple times in a maximization algorithm.

The algorithm begins by making a guess as to the CCP's for each unobserved type, conditional on the observables. Given this initial guess, we iterate on two steps. First, given the type-specific CCP's, maximize the pseudo-likelihood function with the unobserved heterogeneity integrated out. Second, update the type-specific CCP's using the parameter estimates. These updates can take two forms. In stationary models, the updates can come from the likelihoods themselves. In models with finite time horizons, while the updates can come from the likelihoods themselves, a second method may be computationally cheaper. Namely, similar to the EM algorithm, we calculate the probability that an individual is a particular type. These conditional type probabilities are then used as weights in forming the updated CCP's from the data.
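The two objects involved in this second updating method, the conditional type probabilities and the weighted CCP update, can be sketched as follows. The likelihoods, type shares, and choice data below are made-up placeholders rather than the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 1000, 2, 3        # individuals, unobserved types, choices

pi = np.array([0.6, 0.4])   # current estimates of the type shares
# lik[i, m]: likelihood of individual i's choices if i is of type m
lik = rng.uniform(0.1, 1.0, size=(N, M))

# Conditional probability that individual i is of type m (Bayes' rule)
post = pi * lik
post /= post.sum(axis=1, keepdims=True)

# Weighted CCP update: the type-m CCP for choice k is the frequency of
# choice k in the data, weighted by each individual's probability of
# being type m (one observed choice per individual in this sketch)
d = rng.integers(0, K, size=N)          # observed choices
ccp = np.zeros((M, K))
for k in range(K):
    ccp[:, k] = post[d == k].sum(axis=0) / post.sum(axis=0)

print(ccp.sum(axis=1))  # each type's CCPs sum to one: [1. 1.]
```

In a full implementation these weighted CCPs would feed back into the pseudo-likelihood maximization in the next iteration.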

We illustrate the small sample properties of our estimator using two sets of Monte Carlos designed to highlight the two methods of updating the type-specific CCP's. The first is a finite horizon model of teen drug use and schooling decisions. Students in each period decide whether to stay in school and, if the choice is to stay, whether to use drugs. Before using drugs, individuals only have a prior as to how much they will enjoy the experience. However, upon using drugs, students discover their drug 'type' and this is used in informing their future decisions. Here we illustrate both ways of updating the CCP's, using the likelihoods or the conditional probabilities of being a particular type as weights. Results of the Monte Carlo show that both methods of updating the CCP's yield estimates similar to those of full information maximum likelihood with little loss in precision.

The second is a dynamic entry/exit example with unobserved heterogeneity in the demand levels for particular markets, which in turn affects the values of entry and exit. Here the unobserved heterogeneity is allowed to transition over time and the example explicitly incorporates dynamic selection. The type-specific CCP's are updated using the likelihoods evaluated at the current parameter estimates and the current type-specific CCP's. The results suggest that incorporating time-varying unobserved heterogeneity is not only feasible but computationally simple and yields precise estimates even of the transitions on the unobserved state variables.


Our work is most closely related to the nested pseudo-likelihood estimators developed by Aguirregabiria and Mira (2006), and Buchinsky, Hahn and Hotz (2005). Both papers seek to incorporate a fixed effect, drawn from a finite mixture, within their CCP estimation framework. Aguirregabiria and Mira (2006) show how to incorporate unobserved characteristics of markets in dynamic games, where the unobserved heterogeneity only affects the utility function itself. In contrast, our analysis demonstrates how to incorporate unobserved heterogeneity into both the utility functions and the transition functions, and thereby accounts for the role of unobserved heterogeneity in dynamic selection. Buchinsky, Hahn and Hotz (2005) use the tools of cluster analysis, seeking conditions on the model structure that allow them to identify the unobserved type of each agent, whereas we only identify the distribution of unobserved heterogeneity across agents. Thus their approach seems most applicable in models where there are relatively small numbers of long-lived agents which may or may not be comparable to each other, whereas our approach is applicable to large populations where the focus is on the unobserved proportions that partition it.

2 The Framework

We consider a dynamic programming problem in which an individual makes a sequence of discrete choices d_t over his lifetime t ∈ {1, . . . , T} for some T ≤ ∞. The choice set has the same cardinality K at each date t, so we define d_t by the multiple indicator function d_t = (d_{1t}, . . . , d_{Kt}), where d_{kt} ∈ {0, 1} for each k ∈ {1, . . . , K} and ∑_{k=1}^K d_{kt} = 1.

A vector of characteristics (z_t, ε_t) fully describes the individual at each time t, where z_t is a vector of time-varying characteristics, and ε_t ≡ (ε_{1t}, . . . , ε_{Kt}) is independently and identically distributed over time, having continuous support with distribution function G(ε_t). The vector z_t evolves as a Markov process, depending stochastically on the choices of the individual. We model the transition from z_t to z_{t+1} conditional on the choice k ∈ {1, . . . , K} with the probability distribution function F_k(z_{t+1} | z_t). We assume the current utility an individual with characteristics (z_t, ε_t) gets from choosing alternative k, by setting d_{kt} = 1, is additively separable in z_t and ε_t, and can be expressed as

u_k(z_t) + ε_{kt}


The individual sequentially observes (z_t, ε_t) and maximizes the expected discounted sum of utilities

E{ ∑_{t=1}^T ∑_{k=1}^K β^t d_{kt} [u_k(z_t) + ε_{kt}] }

where β ∈ (0, 1) denotes the fixed geometric discount factor.

Let d_t^o ≡ (d_{1t}^o, . . . , d_{Kt}^o) denote the optimal decision rule for this problem, where d_{kt}^o ≡ d_{kt}(z_t, ε_t) for each k ∈ {1, . . . , K}, define the conditional valuation functions by

v_k(z_t) = E{ ∑_{t=1}^T ∑_{k=1}^K β^t d_{kt}^o [u_k(z_t) + ε_{kt}] }

and the conditional choice probabilities as

p_k(z_t) = ∫ d_{kt}(z_t, ε_t) dG(ε_t)

The representation theorem of Hotz and Miller (1993) implies there is a mapping from the conditional choice probabilities to the conditional valuation functions, which we now denote as

q[p(z_t)] = ( q_2[p(z_t)], . . . , q_K[p(z_t)] )′ = ( v_2(z_t) − v_1(z_t), . . . , v_K(z_t) − v_1(z_t) )′
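In the familiar logit special case this mapping has the closed form q_k[p(z_t)] = ln p_k(z_t) − ln p_1(z_t). A short numerical check, using arbitrary made-up conditional values:

```python
import numpy as np

# Hotz-Miller inversion in the logit special case: value differences are
# recovered from choice probabilities via q_k = ln(p_k) - ln(p_1).
v = np.array([0.0, 0.7, -0.4])          # arbitrary conditional values
p = np.exp(v) / np.exp(v).sum()         # implied logit CCPs
q = np.log(p[1:]) - np.log(p[0])        # inverted value differences
print(np.round(q, 6))                   # equals v[1:] - v[0]
```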

The expected contribution of the ε_{kt} disturbance to current utility, conditional on the state z_t, is found by integrating over the region in which the kth action is taken, so appealing to the representation theorem, we may express it as

∫ [d_{kt}(z_t, ε_t) ε_{kt}] dG(ε_t)
= ∫ 1{ ε_{kt} − ε_{jt} ≥ v_j(z_t) − v_k(z_t) for all j ∈ {1, . . . , K} } [d_{kt}(z_t, ε_t) ε_{kt}] dG(ε_t)
= ∫ 1{ ε_{kt} − ε_{jt} ≥ q_j[p(z_t)] − q_k[p(z_t)] for all j ∈ {1, . . . , K} } [d_{kt}(z_t, ε_t) ε_{kt}] dG(ε_t)
≡ w_k[p(z_t)]

It now follows that the conditional valuation functions can be expressed as a mapping of the conditional choice probabilities for states that might be visited in the future

v_k(z_t) = u_k(z_t) + E_t{ ∑_{t′=t+1}^T ∑_{k′=1}^K β^{t′−t} p_{k′}(z_{t′}) [u_{k′}(z_{t′}) + w_{k′}[p(z_{t′})]] }   (1)

Similarly, we can define the expected value function as:

v(z_0) = ∑_{k=1}^K p_k(z_0) [u_k(z_0) + w_k(z_0)]   (2)

v then measures the individual's expected utility, conditional on behaving optimally once the ε's are realized.
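For i.i.d. type I extreme value errors, the conditional expectation w_k has the well-known closed form γ − ln p_k, where γ is Euler's constant; this is the form exploited repeatedly in section 4. A quick Monte Carlo sketch, with arbitrary made-up values for the v_k's, illustrates it:

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5772156649015329          # Euler's constant
v = np.array([0.0, 0.5, -0.3])      # arbitrary conditional values
n = 1_000_000
eps = rng.gumbel(size=(n, 3))       # iid type I extreme value draws
choice = np.argmax(v + eps, axis=1)

p = np.bincount(choice, minlength=3) / n   # simulated CCPs
# average disturbance among the draws where alternative k is chosen
w_sim = np.array([eps[choice == k, k].mean() for k in range(3)])
w_formula = gamma - np.log(p)
print(np.abs(w_sim - w_formula).max())     # small Monte Carlo error
```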


3 Finite Dependence

The power of the Hotz and Miller inversion is particularly strong in the finite dependence case. In order to see this, we first proceed with a simple illustration that is a special case of Altug and Miller (1998) and then proceed to generalize their result. We begin by rewriting (1) as:

v_k(z_t) = u_k(z_t) + βE_t{ ∑_{k′=1}^K p_{k′}(z_{t+1}) [v_{k′}(z_{t+1}) + w_{k′}[p(z_{t+1})]] | z_t, d_t = k }   (3)

Now suppose there is an action l that can be taken at time t + 1 such that v_l does not depend upon what action was taken at time t. That is, there may be state variables affected by the action at t, but v_l does not depend upon these variables. We can rewrite (3) as:

v_k(z_t) = u_k(z_t) + βE_t{ v_l(z_{t+1}) + ∑_{k′=1}^K p_{k′}(z_{t+1}) [v_{k′}(z_{t+1}) − v_l(z_{t+1}) + w_{k′}[p(z_{t+1})]] | z_t, d_t = k }   (4)

The Hotz and Miller inversion then allows us to write (4) as:

v_k(z_t) = u_k(z_t) + βE_t{ v_l(z_{t+1}) + ∑_{k′=1}^K p_{k′}(z_{t+1}) [q_{k′l}[p(z_{t+1})] + w_{k′}[p(z_{t+1})]] | z_t, d_t = k }   (5)

where q_{kl}[p(z_t)] ≡ q_k[p(z_t)] − q_l[p(z_t)]. Note that the summation can be viewed as the compensation an individual would need to receive to make him indifferent between committing a priori to action l or waiting and choosing optimally in the next period. This will be important when we generalize the finite dependence case.

For estimation purposes, we must difference v_k with respect to one of the alternatives, say v_1. Since v_l does not depend upon the choice at time t, the v_l terms difference out, yielding:

v_k(z_t) − v_1(z_t) = u_k(z_t) − u_1(z_t)
+ βE_t{ ∑_{k′=1}^K p_{k′}(z_{t+1}) [q_{k′l}[p(z_{t+1})] + w_{k′}[p(z_{t+1})]] | z_t, d_t = k }
− βE_t{ ∑_{k′=1}^K p_{k′}(z_{t+1}) [q_{k′l}[p(z_{t+1})] + w_{k′}[p(z_{t+1})]] | z_t, d_t = 1 }   (6)

Hence the future value terms only depend upon the one period ahead conditional choice probabilities.

We now generalize the model along two dimensions. First, similar to Altug and Miller (1998), the number of periods before a decision can be made such that the v for that choice does not depend upon the choice in the previous period can now be any finite number. Second, and more importantly, the renewal action may still allow for the carryover of past choices. In this case finite dependence is a bit of a misnomer because only a particular class of variables need have the finite dependence. Formally, the assumption of finite dependence is as follows. Given a value of the initial state, z_0 ∈ Z, there exists a finite integer ρ(z_0), a value of the state variable z_ρ ∈ Z, and a (not necessarily unique) sequence of choices over the next ρ periods that ensures the state will be z_ρ at date ρ(z_0), denoted by d_t^{(ρ)}(z_t) ≡ (d_{1t}^{(ρ)}(z_t), . . . , d_{Kt}^{(ρ)}(z_t)), that may depend on z_t but not on the transitory disturbance ε_t. This assumption of induced finite dependence implies the expected valuation function can be expressed as

the transitory disturbance εt. This assumption of induced finite dependence implies the expected

valuation function can be expressed as

v (z0) = E(ρ)0

∑ρ

t=1

∑K

k=1βtd

(ρ)kt (zt) [uk (zt) + wk (zt)]

+E(ρ)0

∑ρ

t=1

∑K

k=1βt

{[pk (zt)− d

(ρ)kt (zt)

][uk (zt) + wk (zt)]

+[∑K

l=1βtpk (zt) d

(ρ)lt (zt) qkl [p (zt)]

]}+βρ+1v (zρ)

where the expectations of the transition probability for z_t are taken conditional on the d_t^{(ρ)}(z_t) choices, denoted by E_0^{(ρ)}[·], and q_{kl}[p(z_t)] ≡ q_k[p(z_t)] − q_l[p(z_t)]. The first of the three expressions on the right side is the expected utility received from the next period to period ρ(z_0) from following the prescribed choices d_t^{(ρ)}(z_t); the second is the expected loss in utility incurred in those periods from not choosing optimally; the last is the expected utility from period ρ + 1 onwards when the state is z_ρ, which by the assumption of finite dependence does not depend on the initial state.

The set of cases where finite dependence holds is quite large. To see this, note that we can partition the state variables into three sets: a set that evolves deterministically and may or may not depend upon the choices an individual makes, a set that evolves stochastically but is unaffected by the choices, and a set that evolves stochastically and depends upon the choices. Finite dependence puts restrictions only on this third set of variables, where the choice at ρ(z_0) breaks any connection between these variables and past choices. For any z_0 ∈ Z and l ∈ {1, . . . , K}, the expected valuation function v(z_0) can be expressed in terms of any conditional valuation function l ∈ {1, . . . , K}, as well as the current utilities obtained from all possible choices and a correction term q_{kl}[p(z_0)] that only depends on the current conditional choice probabilities. Namely,

v(z_0) ≡ ∑_{k=1}^K p_k(z_0) [u_k(z_0) + w_k(z_0) + βv_k(z_0)]
= u_l(z_0) + w_l(z_0) + βv_l(z_0) + ∑_{k=1}^K p_k(z_0) { u_k(z_0) − u_l(z_0) + w_k(z_0) − w_l(z_0) + β[v_k(z_0) − v_l(z_0)] }
= u_l(z_0) + w_l(z_0) + βv_l(z_0) + ∑_{k=1}^K p_k(z_0) { u_k(z_0) − u_l(z_0) + w_k(z_0) − w_l(z_0) + βq_{kl}[p(z_0)] }

The expected valuation function is the expected current utility from taking the lth action, u_l(z_0) + w_l(z_0), a compensation in current utility from not behaving optimally, ∑_{k=1}^K p_k(z_0) [u_k(z_0) − u_l(z_0) + w_k(z_0) − w_l(z_0)], plus a compensation in future utility from not behaving optimally, namely ∑_{k=1}^K p_k(z_0) βq_{kl}[p(z_0)].

In equilibrium the conditional choice probabilities satisfy the identifying restrictions

p_k(z_t) = ∫ d_{kt}(z_t, ε_t) dG(ε_t)
= Pr{ k = argmax_{l∈{1,...,K}} [u_l(z_t) + v_l(z_t) + ε_{lt}] | z_t }
= Pr{ ε_{lt} − ε_{kt} < u_k(z_t) − u_l(z_t) + ∫_{z′∈Z} v(z′) { dF_k[z′ | z_t] − dF_l[z′ | z_t] } for all l ∈ {1, . . . , K} }

If the finite dependence assumption is satisfied, then it is straightforward to check that only terms pertaining to the next ρ(z_0) periods enter the integrand.

4 Correlation Structures and Finite Time Dependence

4.1 One Period Time Dependence and GEV Errors

We first focus on a simple two choice model. Given the framework described above, Rust (1987) shows that if we make the additional assumption that the ε_t's are i.i.d. extreme value, we can write the v's as:

v_0(z_t) = u_0(z_t) + β( ∫ ln( ∑_{k=0}^1 exp(v_k(z_{t+1})) ) f(z_{t+1}|z_t, d_t = 0) dz_{t+1} + γ )

v_1(z_t) = u_1(z_t) + β( ∫ ln( ∑_{k=0}^1 exp(v_k(z_{t+1})) ) f(z_{t+1}|z_t, d_t = 1) dz_{t+1} + γ )


where γ is Euler's constant.

Note that what is inside the log is the denominator for the probability of choosing any of the alternatives. This will hold true for any representation where the ε's come from a GEV distribution. This implies that the v_k's can be rewritten as:

v_0(z_t) = u_0(z_t) + β( ∫ [v_0(z_{t+1}) − ln(p_0(z_{t+1}))] f(z_{t+1}|z_t, d_t = 0) dz_{t+1} + γ )

v_1(z_t) = u_1(z_t) + β( ∫ [v_0(z_{t+1}) − ln(p_0(z_{t+1}))] f(z_{t+1}|z_t, d_t = 1) dz_{t+1} + γ )

There are some cases when the v_0's are particularly easy to calculate. For example, if d_t = 0 is an absorbing state, v_0 is often specified as something that is linear in the parameters. If it is possible to consistently estimate the transitions f(z_{t+1}|z_t, d_t) and the one period ahead p_0's, then obtaining estimates of the u_1(z_t)'s simplifies to estimating a logit that is linear in the parameters. Note that whether T is finite or infinite has no bearing on the problem.
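To make the absorbing-state case concrete, here is a minimal sketch under made-up primitives (a linear flow utility, a deterministic state transition, and v_0 normalized to zero for the absorbing choice): the future value term becomes an offset built from the one period ahead p_0's, so the structural parameters are recovered from a logit that is linear in the parameters.

```python
import numpy as np

# Made-up primitives: choice 0 (exit) is absorbing with v0 normalized to
# zero; the flow utility of staying is u1(z) = th0 + th1*z; staying moves
# z to min(z+1, S-1); errors are iid type I extreme value.
S, beta, gamma = 10, 0.8, 0.5772156649015329
th_true = np.array([-2.0, 0.1])
z = np.arange(S)
znext = np.minimum(z + 1, S - 1)

# Solve the model once (fixed-point iteration) to generate population CCPs:
# v1(z) = u1(z) + beta*(gamma - ln p0(z')),  p0(z) = 1/(1 + exp(v1(z)))
p0 = np.full(S, 0.5)
for _ in range(500):
    v1 = th_true[0] + th_true[1] * z + beta * (gamma - np.log(p0[znext]))
    p0 = 1.0 / (1.0 + np.exp(v1))

# CCP estimation: the future value term is an offset built from the one
# period ahead p0's, so the pseudo-likelihood is a logit that is linear
# in (th0, th1). Exact choice shares stand in for data in this sketch.
offset = beta * (gamma - np.log(p0[znext]))
X = np.column_stack([np.ones(S), z])
share1 = 1.0 - p0
w = np.full(S, 1.0 / S)          # state distribution (an assumption)

th = np.zeros(2)
for _ in range(50):              # Newton steps on the weighted logit
    p1 = 1.0 / (1.0 + np.exp(-(X @ th + offset)))
    grad = X.T @ (w * (share1 - p1))
    hess = -(X.T * (w * p1 * (1.0 - p1))) @ X
    th = th - np.linalg.solve(hess, grad)

print(th)  # recovers th_true
```

With sampled rather than exact choice shares, the same logit would recover the parameters up to sampling error.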

The condition that the probability of choosing an alternative depends only on current utility and on a one period ahead choice probability of a single choice is broader than just the cases satisfying the terminal state property. To see this, note that the probability of choosing an alternative is always relative to one of the choices:

v_1(z_t) − v_0(z_t) = u_1(z_t) − u_0(z_t)
+ β ∫ ln(p_0(z_{t+1})) [f(z_{t+1}|z_t, d_t = 0) − f(z_{t+1}|z_t, d_t = 1)] dz_{t+1}
+ β ∫ v_0(z_{t+1}) [f(z_{t+1}|z_t, d_t = 1) − f(z_{t+1}|z_t, d_t = 0)] dz_{t+1}

If v_0 does not depend on the lagged choice except through the one period ahead flow utility, then the future component effectively cancels out, leaving the v(z_t)'s once again linear in the utility parameters. This is a special case of Altug and Miller (1998), which establishes how to write CCP's when there is finite time dependence. Note that this does not rule out transitions of the state variables, provided the transitions do not depend upon the choice. Many problems of interest naturally exhibit this property. Rust's bus engine problem is one, along with virtually every game that involves an exit decision.

This renewal property, where the difference in the value functions depends only on the linear flow utility, the one-period ahead transitions on the state variables, and the one-period ahead choice probabilities of a single choice, is not restricted solely to logit errors. Consider the case where the errors follow a nested logit specification and the 'outside option' depends upon the current state only through the one period ahead flow utility, that is, it satisfies one period time dependence. The conditional value function is then:

v_j(z_t) = u_j(z_t) + β( ∫ ln( ( ∑_{k=1}^K exp(v_k(z_{t+1})/σ) )^σ + exp(v_0(z_{t+1})) ) f(z_{t+1}|z_t, d_t = j) dz_{t+1} + γ )

Writing the future component in terms of the probability of choosing the outside good yields:

v_j(z_t) = u_j(z_t) + β( ∫ [v_0(z_{t+1}) − ln(p_0(z_{t+1}))] f(z_{t+1}|z_t, d_t = j) dz_{t+1} + γ )

Note that this expression is identical to the simple logit example above. Differencing with respect to v_0 then yields the same cancellations as before.
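The identity invoked here, that the log of the choice-probability denominator equals v_0 − ln p_0 whenever the outside option enters the denominator additively, can be checked numerically. The values below are arbitrary:

```python
import numpy as np

# For a nested logit where the outside option 0 enters additively, the
# denominator is (sum_k exp(v_k/sigma))^sigma + exp(v0), and
# p0 = exp(v0)/denominator, so ln(denominator) = v0 - ln(p0).
v0, sigma = 0.2, 0.6
v = np.array([1.0, -0.5, 0.3])              # arbitrary inside values
denom = np.exp(v / sigma).sum() ** sigma + np.exp(v0)
p0 = np.exp(v0) / denom
print(np.isclose(np.log(denom), v0 - np.log(p0)))  # True
```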

Conditional on one of the options having one period time dependence, the expression above will be the same for a broad class of correlation structures. In particular, consider a mapping G : R_+^{K+1} → R that satisfies the restrictions given in McFadden (1978) such that:

F(ε_0, ε_1, ..., ε_K) = exp( −G(e^{−ε_0}, e^{−ε_1}, ..., e^{−ε_K}) )

is the cdf of a multivariate extreme value distribution. Whenever G(·) can be written as:

G(e^{v_0(z_t)}, e^{v_1(z_t)}, ..., e^{v_K(z_t)}) = G̃(e^{v_1(z_t)}, ..., e^{v_K(z_t)}) + e^{v_0(z_t)}

then the difference between the kth conditional value function and v_0 can be written as:

v_k(z_t) − v_0(z_t) = u_k(z_t) − u_0(z_t)
+ β ∫ ln(p_0(z_{t+1})) [f(z_{t+1}|z_t, d_t = 0) − f(z_{t+1}|z_t, d_t = k)] dz_{t+1}
+ β ∫ v_0(z_{t+1}) [f(z_{t+1}|z_t, d_t = k) − f(z_{t+1}|z_t, d_t = 0)] dz_{t+1}

Note that this structure can accommodate quite complex error structures. For example, Bresnahan, Stern, and Trajtenberg (1997) allow errors to be correlated across multiple nests.3 Again, as long as the outside option satisfies the properties listed above, the difference in the conditional value functions only depends upon the one-period ahead probabilities of choosing the outside option.

3Arcidiacono (2005) incorporates the BST framework into a dynamic discrete choice model.


Moreover, by using the techniques developed below to incorporate unobserved heterogeneity, the GEV assumption is quite weak. In particular, we can approximate correlation among the ε's with a finite mixture distribution and keep the additive separability conditional on the draw from the mixture distribution. Now let the error distribution of the ε's mix over M extreme value distributions with different intercepts, and let φ_m be the probability of the draw coming from the mth distribution. The value of choice j is then:

v_j(z_t, m) = u_j(z_t, m) + β( ∑_{m′=1}^M φ_{m′} [ ∫ ln( ∑_{k=0}^1 exp(v_k(z_{t+1}, m′)) ) f(z_{t+1}|z_t, d_t = j) dz_{t+1} + γ ] )

Now, if any of the options exhibits one period time dependence, then once again the future utility component depends only on the one-period ahead CCP's for one of the options and the transitions on the state variables. To see this, suppose the kth alternative exhibits one-period time dependence. We can rewrite the expression above as:

v_j(z_t, m) = u_j(z_t, m) + β( ∑_{m′=1}^M φ_{m′} [ ∫ (v_k(z_{t+1}, m′) − ln(p_k(z_{t+1}, m′))) f(z_{t+1}|z_t, d_t = j) dz_{t+1} + γ ] )

Differencing with respect to v_k then leads the v_k terms to cancel out. Estimation is then simple once we have a means of calculating the CCP's for each of the m distributions.

4.2 Application: Female Labor Supply

While one period time dependence makes estimation quite simple, there are cases when it is unlikely to hold. Here we cover an example that has finite dependence, but not one period time dependence. In particular, consider the case of female labor supply. This example, of human capital accumulation through work experience, draws on finite dependence that goes beyond one period. Suppose human capital z_t ≡ (z_{1t}, z_{2t}) has two dimensions, recent working experience z_{1t} and total accumulated skill z_{2t}. Each period t a woman chooses between three activities, represented by d_t = (d_{1t}, d_{2t}, d_{3t}), where following our notational convention d_{jt} ∈ {0, 1} for each j ∈ {1, 2, 3} and ∑_{j=1}^3 d_{jt} = 1. Choosing d_{1t} = 1, full time work, increases her skills (and thus future earnings) by one or two units of human capital, with probability λ and 1 − λ respectively. Part time work increases her human capital by one unit, while staying out of the labor force in period t eliminates recent working experience. Thus recent work experience follows the law of motion

z_{1,t+1} = d_{1t}r_t + d_{2t}

where r_t is independently distributed on {1, 2} with probability λ on 1, while total accumulated experience is measured by

z_{2,t+1} = z_{2t} + d_{1t}r_t + d_{2t}
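A short simulation of this law of motion (with an arbitrary choice sequence and a made-up value of λ) shows how recent experience z_1 resets while total experience z_2 accumulates:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, T = 0.7, 6          # made-up probability and horizon
z1, z2 = 0, 0            # recent experience, total experience
path = []
for t in range(T):
    d = rng.integers(1, 4)              # 1 full time, 2 part time, 3 out
    r = 1 if rng.random() < lam else 2  # full-time skill gain
    gain = r if d == 1 else (1 if d == 2 else 0)
    z1 = gain                           # z_{1,t+1} = d1*r + d2
    z2 = z2 + gain                      # z_{2,t+1} = z2 + d1*r + d2
    path.append((d, z1, z2))
print(path)
```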

In this example we assume the female maximizes her expected lifetime earnings adjusted for the compensating benefits of not working, and let u_j(z_t) + ε_{jt} denote her period t adjusted income from choosing j ∈ {1, 2, 3}. To simplify the notation we analyze stationary policies in an infinite horizon setting, which implies her expected adjusted lifetime wealth is

∑_{t=1}^∞ ∑_{j=1}^3 β^t p_j(z_t) [u_j(z_t) + w_j(z_t)]

where as before p_j(z_t) is the conditional choice probability of making choice j ∈ {1, 2, 3} and w_j(z_t) is the expected value of the idiosyncratic income conditional on the jth choice being optimal.

To apply the finite dependence assumption in this example, we remark that since the woman might accumulate up to two units of human capital if she works full time in the current period, we must prescribe choices for the next three periods to neutralize the effects of different choices in the current period on human capital. This motivates why we set ρ = 3 with target state (0, h + 2). To achieve this vector of capital, regardless of her choice in period t, the woman should not work in period t + 3, thus guaranteeing that her recent experience is then zero. Therefore d_{3,t+3}^{(ρ)} = 1. The prescribed choices in the intervening periods naturally depend on the period t choice. For example, suppose the woman has total experience of h at period t but no recent experience. That is, z_t = (0, h).

If the woman does not work in the current period, that is, setting d_{3t} = 1, she should work part time in the following two periods. Thus d_{2,t+1}^{(ρ)}(0, h) = 1 and d_{2,t+2}^{(ρ)}(1, h + 1) = 1. Discounted back to period t, her expected future adjusted lifetime wealth, conditional on not working in period t, is

v_3(0, h) = u_3(0, h) + ∑_{j=1}^3 β p_j(0, h) [u_j(0, h) + w_j(0, h) + q_{2j}(0, h)]
+ ∑_{j=1}^3 β^2 p_j(1, h + 1) [u_j(1, h + 1) + w_j(1, h + 1) + q_{2j}(1, h + 1)]
+ ∑_{j=1}^3 β^3 p_j(1, h + 2) [u_j(1, h + 2) + w_j(1, h + 2) + q_{3j}(1, h + 2)]
+ β^4 v(0, h + 2)


If she works part time in period t, setting d_{2t} = 1, then she should work part time in one of the following two periods but not both. Thus we could set d_{2,t+1}^{(ρ)}(1, h + 1) = 1 and d_{3,t+2}^{(ρ)}(1, h + 2) = 1. The conditional valuation function for part time work in period t can be expressed as

v_2(0, h) = u_2(0, h) + ∑_{j=1}^3 β p_j(1, h + 1) [u_j(1, h + 1) + w_j(1, h + 1) + q_{2j}(1, h + 1)]
+ ∑_{j=1}^3 β^2 p_j(1, h + 2) [u_j(1, h + 2) + w_j(1, h + 2) + q_{3j}(1, h + 2)]
+ ∑_{j=1}^3 β^3 p_j(0, h + 2) [u_j(0, h + 2) + w_j(0, h + 2) + q_{3j}(0, h + 2)]
+ β^4 v(0, h + 2)

If the woman works full time this period, then to achieve long term capital of h + 2 three periods later, her choices depend on whether she accumulates one unit of capital in period t, in which case she should work part time in period t + 1 or period t + 2, or two, in which case she should not work for the next three periods. In terms of the notation we have developed, if z_{1,t+1} = 1 then d_{2,t+1}^{(ρ)}(1, h + 1) = 1 and d_{3,t+2}^{(ρ)}(1, h + 2) = 1, but if z_{1,t+1} = 2 then d_{3,t+1}^{(ρ)}(2, h + 2) = 1 and d_{3,t+2}^{(ρ)}(0, h + 2) = 1. Using the expression we derived for v_2(0, h), and noting that in periods t + 2 and t + 3 her state variables and prescribed actions are the same if she accumulates two units of capital in the current period, the expected utility accruing over the next three periods, discounted back to period t, conditional on working full time in the current period, can now be expressed as

v_1(0, h) = u_1(0, h) + λ[ v_2(0, h) − u_2(0, h) − β^4 v(0, h + 2) ]
+ (1 − λ) ∑_{j=1}^3 β p_j(2, h + 2) [u_j(2, h + 2) + w_j(2, h + 2) + q_{3j}(2, h + 2)]
+ (1 − λ) ∑_{j=1}^3 (β^2 + β^3) p_j(0, h + 2) [u_j(0, h + 2) + w_j(0, h + 2) + q_{3j}(0, h + 2)]
+ β^4 v(0, h + 2)

4.3 Application: Dynamic Games

To further show the simplicity of the estimator, we describe how to apply the estimator to a

dynamic entry/exit game that will form the basis for one of the Monte Carlos discussed in section

6. Assume that all players in the game are playing strategies consistent with a Markov-Perfect

Equilibrium. Let there be M markets with one potential entrant arriving in each market in each

period. Potential entrants choose whether or not to enter the market. Should the firm choose to

enter, the firm continues to make choices as to whether to stay in the market. Should the firm


leave the market, the firm dies; it also dies if it chooses not to enter. As we will see, whether the firm dies or re-enters the pool of potential entrants to be assigned to a market is not relevant to the problem for large M. There can be at most two firms in a market.

Firms are identical but subject to profit shocks that are private information and distributed i.i.d. Type I extreme value. These shocks only affect the cost of staying in the market. Since we are focusing on a simple entry/exit example, we assume that firm identity is not important except for whether the firm is an incumbent or a potential entrant: only the numbers of incumbent and potential entering firms matter. The firms in the market simultaneously make decisions regarding entry and exit. Expected lifetime profits for firm i depend upon:

1. Whether there is another player, Ej ∈ {0, 1}

2. Whether firm i is an incumbent, Ii ∈ {0, 1}

3. If there is another player, whether that player is an incumbent, Ij ∈ {0, 1}

4. A state variable that is observed by the firms but not by the econometrician, s ∈ {0, 1}

The realized flow profits for i, π, then depend upon whether, after the entry/exit decisions are

made, the firm is a monopolist or a duopolist, D ∈ {0, 1}, whether the firm started out as an

incumbent, and the state. We can then express expected lifetime profits from entering as:

v_i(E_j, I_i, I_j, s) = Σ_{k=0}^{1} p_k(1, I_j, I_i, s) (π(k, I_i, s) + β Σ_{s′=1}^{2} V(1, 1, k, s′) f(s′|s)) + ε_i

where p_0 and p_1 are the probabilities of the other firm staying out of the market and staying in the market, respectively. With the profits from exiting left as a deterministic scrap value, SV, the i.i.d. Type I extreme value ε's imply that we can write the previous expression as:

v_i(E_j, I_i, I_j, s) = Σ_{k=0}^{1} p_k(1, I_j, I_i, s) (π(k, I_i, s) + βγ + βSV − β Σ_{s′=1}^{2} ln[p_0(1, 1, k, s′)] f(s′|s)) + ε_i
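A small sketch of this expression can help fix ideas. All inputs below (the rival's stay-in probabilities, flow profits, transition probabilities, exit CCPs) are placeholders rather than the paper's specification; the sketch uses the fact that, with exit as a terminal action yielding SV and Type I extreme value shocks, next period's value collapses to SV + γ − ln p_0(·).

```python
# Sketch of the entry value function above, gross of the firm's own shock eps_i.
# p_rival[k]: probability the rival ends up out (k=0) or in (k=1) the market;
# pi_flow[k]: flow profits by rival presence; f_trans: transition of the
# unobserved state s; p0_next[k][s2]: next period's exit CCP. All illustrative.
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, from the T1EV shocks

def entry_value(p_rival, pi_flow, f_trans, p0_next, SV, beta, s):
    """Expected lifetime profits from entering, summing over rival outcomes."""
    v = 0.0
    for k in (0, 1):
        # Continuation value: scrap value plus the CCP correction term,
        # integrated over next period's unobserved state s2.
        cont = sum(f_trans[s][s2] * (SV + EULER_GAMMA - math.log(p0_next[k][s2]))
                   for s2 in (0, 1))
        v += p_rival[k] * (pi_flow[k] + beta * cont)
    return v

# Illustrative numbers: a persistent two-state process and made-up profits.
v = entry_value(p_rival=[0.4, 0.6], pi_flow=[1.0, 0.5],
                f_trans=[[0.8, 0.2], [0.3, 0.7]],
                p0_next=[[0.5, 0.4], [0.7, 0.6]],
                SV=0.2, beta=0.9, s=0)
```

When the exit probabilities p_0 are all one, the logarithmic correction vanishes and the value reduces to expected flow profits plus the discounted scrap value and Euler term, which makes the role of the CCPs easy to see.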


5 CCP’s and Unobserved Heterogeneity

In this section we present our algorithm for solving problems with unobserved heterogeneity where

the unobserved heterogeneity is allowed to transition over time. We focus first on a method that

does not use CCP’s and then show how the algorithm easily adapts to using CCP’s.

5.1 Regime-switching and Dynamic Discrete Choice

Consider a panel data set of N individuals, where we observe T choices for each individual n ∈ {1, . . . , N}. Observations are independent across individuals. We partition the z_t's from the base framework into what is observed and unobserved. With some abuse of notation, z_t now refers to the variables that are observed by the econometrician. We assume that the unobserved variables lead to a finite number of 'types' of individuals. An individual's type may affect both the utility function and the transition functions of the observed variables, and may also transition over time. Let there be S types, with the initial probability of being assigned to type i given by π_i. Types follow a Markov process, with p_jk giving the probability of transitioning from type j to type k.4 Let L_nst be the likelihood of observing the data for individual n at time t conditional on being in unobserved state s. Integrating out with respect to the unobservable types and their transitions

yields the following log likelihood for the sample:5

Σ_{n=1}^{N} log [ Σ_{s(1)=1}^{S} Σ_{s(2)=1}^{S} ⋯ Σ_{s(T)=1}^{S} π_{s(1)} L_{ns(1)1} Π_{t=2}^{T} p_{s(t−1),s(t)} L_{ns(t)t} ]    (7)

Neglecting the complications in evaluating the individual likelihoods, the unobserved hetero-

geneity portion is actually simpler than the counterparts used in the macroeconomics literature on

regime-switching. While the log likelihood above would be extremely costly to evaluate, Hamil-

ton (1990) shows that the EM algorithm simplifies the problem substantially. The EM algorithm

4 Note that this nests both the case where unobserved heterogeneity is permanent (the transition matrix would be an S×S identity matrix) and the case where type is completely transitory (the transition matrix would have the same values in row j as in row k for all j and k).

5 Note that when unobserved heterogeneity is permanent the log likelihood for the sample is given by:

Σ_{n=1}^{N} log ( Σ_{s=1}^{S} π_s Π_{t=1}^{T} L_{nst} )


involves first calculating the conditional probabilities that an individual is each of the S types at

time t given the values for the parameters and the data. Let qnst be the conditional probability

individual n is type s at time t given the parameters and the data. We can calculate qnst through

the repeated application of Bayes’ rule:

q_nst = [ L_nst ( Σ_{s(1)} ⋯ Σ_{s(t−1)} π_{s(1)} L_{ns(1)1} Π_{t′=2}^{t−1} p_{s(t′−1),s(t′)} L_{ns(t′)t′} · p_{s(t−1),s} ) ( Σ_{s(t+1)} ⋯ Σ_{s(T)} Π_{t′=t+1}^{T} p_{s(t′−1),s(t′)} L_{ns(t′)t′} ) ]
    / [ Σ_{s̃=1}^{S} L_{ns̃t} ( Σ_{s(1)} ⋯ Σ_{s(t−1)} π_{s(1)} L_{ns(1)1} Π_{t′=2}^{t−1} p_{s(t′−1),s(t′)} L_{ns(t′)t′} · p_{s(t−1),s̃} ) ( Σ_{s(t+1)} ⋯ Σ_{s(T)} Π_{t′=t+1}^{T} p_{s(t′−1),s(t′)} L_{ns(t′)t′} ) ]    (8)

where, inside the backward sums, s(t) is set to s (to s̃ in the denominator).
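Equation (8) is the smoothing step of a Hamilton-style forward-backward recursion, which avoids summing over all S^T type sequences. A minimal sketch for a single individual, with illustrative inputs (none of the variable names come from the paper):

```python
# Forward-backward smoothing for the type probabilities q_{nst} of equation (8),
# for one individual. like[t][s] plays the role of L_{nst}, pi[s] the initial
# type probabilities, and p[j][k] the type transition probabilities.

def smoothed_type_probs(like, pi, p):
    T, S = len(like), len(pi)
    # Forward pass: a[t][s] = joint probability of the data up to t and type s at t.
    a = [[pi[s] * like[0][s] for s in range(S)]]
    for t in range(1, T):
        a.append([like[t][k] * sum(a[t-1][j] * p[j][k] for j in range(S))
                  for k in range(S)])
    # Backward pass: b[t][s] = probability of the data after t given type s at t.
    b = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        b[t] = [sum(p[s][k] * like[t+1][k] * b[t+1][k] for k in range(S))
                for s in range(S)]
    # Combine and normalize: q[t][s] = Pr(type s at t | all data).
    q = []
    for t in range(T):
        num = [a[t][s] * b[t][s] for s in range(S)]
        tot = sum(num)
        q.append([x / tot for x in num])
    return q

# Two types, three periods, with likelihoods favoring type 0 early on.
q = smoothed_type_probs(
    like=[[0.9, 0.2], [0.8, 0.3], [0.5, 0.5]],
    pi=[0.6, 0.4],
    p=[[0.9, 0.1], [0.2, 0.8]],
)
```

The cost of computing all the q's this way is linear in T rather than exponential, which is what makes the E-step cheap.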

Rather than maximizing (7), the EM algorithm instead maximizes the expected log likelihood:

Σ_{n=1}^{N} Σ_{s=1}^{S} Σ_{t=1}^{T} q_nst L_nst    (9)

taking the q_nst's as given and where the L's now refer to the type-specific log likelihoods.

For the standard EM algorithm, this maximization has the same first order conditions as max-

imizing (7). Estimation then proceeds iteratively as follows. Given initial values for the utility

function parameters, the transition parameters for the observable state variables, the initial con-

ditions on the probabilities of being a particular type (the πi’s), and the transition matrix of the

unobservables (the pjk’s), calculate the probability of being each of the types for each of the time

periods. Given these conditional probabilities, maximize the expected log likelihood function given

in (9) with respect to the parameters of the utility function and the parameters governing the tran-

sitions on the unobservables. With these parameter estimates in hand, update 1) the conditional

probabilities of being each of the types for each individual at each time period, 2) the πi’s, and

3) the pjk’s. Iterate until convergence. The one piece of the method left to describe is then how

to update the πi’s and the pjk’s. These updates follow directly from Hamilton (1990) with m + 1

estimates of πi and pjk given by:

π_i^{m+1} = ( Σ_{n=1}^{N} q_{ni1}^{m} ) / N    (10)

p_jk^{m+1} = ( Σ_{n=1}^{N} Σ_{t=2}^{T} q_{nkt|j}^{m} q_{nj,t−1}^{m} ) / ( Σ_{n=1}^{N} Σ_{t=2}^{T} q_{nj,t−1}^{m} )    (11)

where q_{nkt|j} gives the conditional probability of individual n being type k at time t conditional on being type j at time t−1:

q_{nkt|j} = [ p_jk L_nkt ( Σ_{s(t+1)} ⋯ Σ_{s(T)} Π_{t′=t+1}^{T} p_{s(t′−1),s(t′)} L_{ns(t′)t′} ) ] / [ Σ_{k′=1}^{S} p_jk′ L_nk′t ( Σ_{s(t+1)} ⋯ Σ_{s(T)} Π_{t′=t+1}^{T} p_{s(t′−1),s(t′)} L_{ns(t′)t′} ) ]

with s(t) = k in the numerator (k′ in the denominator) inside the backward sums.


The updating of the π_i's is simply the average of the conditional probabilities of being type i in the first time period. The updating of the p_jk's takes the probability of being in state k conditional on the data and on being in state j in the previous period, for each individual in each time period, and weights it by the probability of being in state j in the previous time period.
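Both updates are simple weighted averages once the smoothed probabilities are in hand. A sketch with illustrative nested-list inputs (the names and data layout are not the paper's):

```python
# Sketch of updates (10) and (11). q[n][t][s] are the smoothed type
# probabilities from the E-step; qpair[n][t][j][k] = Pr(type j at t, type k at
# t+1 | data), i.e. the product q_{nk,t+1|j} * q_{njt} appearing in (11).

def update_pi(q):
    """Equation (10): average the period-1 type probabilities across people."""
    N, S = len(q), len(q[0][0])
    return [sum(q[n][0][s] for n in range(N)) / N for s in range(S)]

def update_p(qpair):
    """Equation (11): expected j -> k transition mass over expected visits to j."""
    N, S = len(qpair), len(qpair[0][0])
    p = [[0.0] * S for _ in range(S)]
    for j in range(S):
        denom = sum(qpair[n][t][j][k]
                    for n in range(N) for t in range(len(qpair[n])) for k in range(S))
        for k in range(S):
            p[j][k] = sum(qpair[n][t][j][k]
                          for n in range(N) for t in range(len(qpair[n]))) / denom
    return p
```

Because each row of the numerator sums to its own denominator, the updated transition matrix is row-stochastic by construction.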

Before introducing CCP’s into the estimator, there are two other important issues related to the

estimator above. The first concerns initial conditions. It may be the case that the first observation

in the data set is not at time 1 but at some later date implying that some dynamic selection has

already taken place. In this case, we can allow the πi’s to depend upon what is observed in the

first period. For example, in a dynamic entry/exit game the πi’s could depend upon the number of

firms in the market. With enough observations and a small number of discrete characteristics, the updated π_i's can then be found by averaging the conditional probabilities from period 1 for each set of characteristics rather than for the population as a whole. If the number of initial observed states is too large for this, then a flexible multinomial logit can be used.

The second issue relates to other advantages of using the EM algorithm for solving dynamic

discrete choice models which also hold here in the presence of regime-switching. In the framework we

described in section 2.1 there were two sets of parameters: those that governed the utility function

and those that governed the transitions of the observed state variables. Let L_ntc(θ, α, z_nt, s) be the likelihood of the nth individual making the actual choice c at time t conditional on the structural parameters, θ and α, his observed characteristics, z_nt, and on being in unobserved state s at time t.6 Similarly, define L_ntz(α, z_nt−1, c_nt−1, s) as the likelihood of observing state z at time t conditional on α, the subset of the parameters that affect the transitions of the state space; z_nt−1, the state at time t−1; being in unobserved state s at time t−1; and c_nt−1, the choice made at t−1. Substituting these likelihoods into (9) yields:

Σ_{n=1}^{N} Σ_{s=1}^{S} Σ_{t=1}^{T} q_nst ( L_ntc(θ, α, z_nt, s) + L_ntz(α, z_nt−1, c_nt−1, s) )    (12)

Note that the additive separability present in the case without unobserved heterogeneity is reintroduced at the maximization step of the EM algorithm. Arcidiacono and Jones (2003) show that consistent estimates of the α's can then be obtained from maximizing only the second term in the expected log likelihood function given in (12). The estimates of the θ's are obtained from maximizing the first term, taking the estimates of α from maximizing the second term as given.

6 Note that the dependence of this likelihood on α comes through the transitions affecting the expected future utility.

5.2 Incorporating CCP’s

We now extend the algorithm to the case where CCP’s are used in the value function. This adds

one additional step that updates the CCP’s. There are two ways to update the CCP’s, the first

of which uses the likelihoods themselves. This method of updating is particularly convenient for

stationary problems so for the moment we suppress the time subscripts. Let Lk(θ, P, z, s) denote the

conditional likelihood of observing choice k ∈ {1, . . . , K}. For each (z, s, k), the rule for updating

P is defined as:

P_k^{(m+1)}(z, s) = L_k(θ^{(m)}, P^{(m)}, z, s)    (13)

Denoting the true value of the parameter vector (θ, P) by (θ*, P*), the updating rule follows from the definitions of conditional probability and the conditional likelihood, which directly imply

P_k*(z, s) ≡ Pr{k | z, s} = L_k(θ*, P*, z, s)

The second way involves using the conditional probabilities of being particular types as weights.

This method is useful when the problem is not stationary, as using the likelihoods would require solving the full backwards recursion problem, albeit only once at each iteration rather than multiple times within a maximization routine. For example, consider a finite horizon model where decisions are made until time T but data are only available until time T − b. Using the likelihoods to update would require evaluating CCP's far past when the data are available. Using this second method we only need the T − b CCP's.7 Here, we use the individual probabilities of being particular types as weights in updating the type-specific CCP's. Namely, the m+1 estimate of P_kt(z_t, s_t) is given by:

P_k^{(m+1)}(z_t, s_t) = ( Σ_{n=1}^{N} q_nst^{(m+1)} d_nkt h(z_t, z_nt) ) / ( Σ_{n=1}^{N} q_nst^{(m+1)} h(z_t, z_nt) )    (14)

where h(z_t, z_nt) smoothly approaches the indicator function I(z_t = z_nt) as N goes to infinity, and d_nkt indicates whether individual n chose option k at time t. Note that N going to infinity does not lead to consistent estimates of the q_nst's. However, it does lead to consistent estimates of the P_k(z_t, s_t)'s, as in the population we obtain the correct distribution of the type-specific CCP's.
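For a discrete observed state, the kernel h in (14) can be taken to be an exact-match indicator, and the update becomes a q-weighted frequency. A sketch with illustrative arrays (the names and shapes are not from the paper):

```python
# Sketch of the data-based update (14) for discrete observed states, where
# h(z, z_nt) reduces to the indicator 1{z = z_nt}. z_obs[n][t] is the observed
# state, d_obs[n][t] the chosen alternative, and q[n][t][s] the smoothed type
# probability from (8).

def update_ccp(z_obs, d_obs, q, z, s, k):
    """P_k(z, s): q-weighted frequency of choice k among observations at state z."""
    num = den = 0.0
    for n in range(len(z_obs)):
        for t in range(len(z_obs[n])):
            if z_obs[n][t] == z:               # h collapses to an indicator
                w = q[n][t][s]
                den += w
                num += w * (1.0 if d_obs[n][t] == k else 0.0)
    return num / den

# Three observations at state z = 0; the type-0 CCP for choice 1 weights the
# two choice-1 observations by their type-0 probabilities.
P = update_ccp(z_obs=[[0, 0], [0, 1]], d_obs=[[1, 0], [1, 1]],
               q=[[[0.5, 0.5], [0.25, 0.75]], [[0.8, 0.2], [0.5, 0.5]]],
               z=0, s=0, k=1)
```

No value function evaluation is needed here, which is exactly why this update works when the data end before the horizon does.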

7 Note that the two updating methods may be used in conjunction. The likelihood approach could be used for all periods before T − b, with the data approach used only for period T − b.


We now have all the pieces to fully describe the algorithm. The algorithm is initialized with a guess at the CCP's, P, the structural parameters, θ and α, and the parameters governing the unobserved states, π and p, and then follows a sequence of iterations. The algorithm iterates on four steps, where the m+1 iteration follows:

Step 1 For each of the S types, calculate the conditional probability that each individual is of type s at each time period t, q_nst^{(m+1)}, using (8).

Step 2 Given the q_nst^{(m+1)}'s, the π_i^{(m)}'s, and the p_ij^{(m)}'s, obtain the π_i^{(m+1)}'s and the p_ij^{(m+1)}'s using (10) and (11).

Step 3 Maximize the expected log likelihood function to obtain estimates of θ^{(m+1)} and α^{(m+1)}, taking the q_nst^{(m+1)}'s, the p_jk^{(m+1)}'s, and the P^{(m)}'s as given. Maximization may proceed sequentially by first obtaining α^{(m+1)} solely from the transitions on the observables and then taking α^{(m+1)} as given when estimating θ^{(m+1)}.8

Step 4 Update the type-specific CCP's using either (13) or (14) and obtain P^{(m+1)}.
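The iteration structure can be seen end to end in the special case of permanent two-type heterogeneity and an intercept-only binary logit, where the M-step has a closed form and no CCP update is needed (so Steps 1-3 collapse into a plain EM loop). This toy is only meant to show the shape of the iteration, not the paper's estimator; all names and parameter values are illustrative.

```python
# Minimal EM loop for a two-type mixture of intercept-only binary logits
# (the footnote-5 likelihood with permanent types). theta[s] is the
# type-specific intercept; pi is Pr(type 1).
import math
import random

def logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def em_binary_mixture(choices, n_iter=200):
    theta, pi = [-1.0, 1.0], 0.5   # asymmetric start to break label symmetry
    for _ in range(n_iter):
        # Step 1 (E-step): posterior type probabilities, eq. (8) collapsed
        # because types do not transition.
        q = []
        for d in choices:
            lik = [1.0, 1.0]
            for s in (0, 1):
                p1 = logit(theta[s])
                for dt in d:
                    lik[s] *= p1 if dt else (1.0 - p1)
            q.append(pi * lik[1] / (pi * lik[1] + (1 - pi) * lik[0]))
        # Step 2: pi is the average posterior, as in eq. (10).
        pi = sum(q) / len(q)
        # Step 3 (M-step): the weighted logit MLE for an intercept-only model
        # is the log-odds of the q-weighted mean choice.
        for s in (0, 1):
            w = [qi if s == 1 else 1.0 - qi for qi in q]
            mean = sum(wi * sum(d) / len(d) for wi, d in zip(w, choices)) / sum(w)
            theta[s] = math.log(mean / (1.0 - mean))
    return theta, pi

# Simulate 500 individuals with 10 choices each and recover the parameters.
random.seed(0)
true_theta, true_pi = [-1.5, 1.5], 0.6
data = []
for _ in range(500):
    s = 1 if random.random() < true_pi else 0
    data.append([1 if random.random() < logit(true_theta[s]) else 0
                 for _ in range(10)])
theta_hat, pi_hat = em_binary_mixture(data)
```

In the full algorithm, Step 3 would be a likelihood maximization with the CCP-based future value terms held fixed, and Step 4 would refresh those CCPs before the next E-step.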

6 Small Sample Performance

We perform two Monte Carlo simulations to assess the performance of our estimator for finite

samples. Both simulations are simpler versions of proposed empirical projects discussed below.

The first simulation is of a finite horizon model and shows that both methods of updating the

CCP’s yield parameter estimates similar to full backwards recursion with little loss in efficiency.

The second simulation is an infinite horizon game and shows that CCP estimators can handle

regime switching in dynamic discrete choice models even when few time periods are observed in

the data.

6.1 Finite Horizon Monte Carlo

The first simulation is a model of teenage behavior where in each period t the teen decides among

three alternatives: drop out (d0t = 1), stay in school and do drugs (d1t = 1), or stay in school but

8Note that if the unobserved heterogeneity does not affect the transitions on the observables then these can be

estimated in a first stage outside of the algorithm described above.


do not do drugs (d2t = 1). In this model there are two types of agents, where an individual’s type

affects their utility of using drugs. The probability of being type L is given by π. An individual

learns their type through experimentation: if at any point an individual has tried drugs their

type is immediately revealed to them. Hence, the individual’s information about their type, st,

comes from the set {U,L,H} where U indicates that the individual does not know their type. The

econometrician and the individual then have the same priors as to the individual being a particular

type before the individual has tried drugs. However, once an individual has tried drugs their type

is immediately revealed to them but not to the econometrician.

The individual also faces a withdrawal cost if he uses drugs at time t−1 but does not use at time t. Hence, the other relevant state variable, z_t, is taken from {N, Y}, which indicates whether the individual used drugs in the previous period. There are then five possible information sets {s_t, z_t}: {U,N}, {L,N}, {L,Y}, {H,N}, and {H,Y}. With dropping out being a terminal state, the evolution of the state space conditional on the two stay-in-school options is given by the transition matrix in Table 1.

Table 1: Evolution of the State Space†

                    {U,N}   {L,N}   {L,Y}   {H,N}   {H,Y}
{U,N}   d_2t = 1      1       0       0       0       0
        d_1t = 1      0       0       π       0      1−π
{L,N}   d_2t = 1      0       1       0       0       0
        d_1t = 1      0       0       1       0       0
{L,Y}   d_2t = 1      0       1       0       0       0
        d_1t = 1      0       0       1       0       0
{H,N}   d_2t = 1      0       0       0       1       0
        d_1t = 1      0       0       0       0       1
{H,Y}   d_2t = 1      0       0       0       1       0
        d_1t = 1      0       0       0       0       1

†Rows give the state at time t, columns the state at time t+1.
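Table 1 can be encoded directly as a pair of row-stochastic matrices, one per stay-in-school choice. A quick sketch (the value of π used below is illustrative only):

```python
# The Table 1 transition matrices, with states ordered
# {U,N}, {L,N}, {L,Y}, {H,N}, {H,Y}. pi is the prior probability of type L.

def transition_matrices(pi):
    # No drugs (d_2t = 1): type information is unchanged, z resets to N.
    no_drugs = [[1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0],
                [0, 1, 0, 0, 0],
                [0, 0, 0, 1, 0],
                [0, 0, 0, 1, 0]]
    # Drugs (d_1t = 1): an unknown type is revealed as L with probability pi,
    # H with probability 1 - pi; z moves to Y.
    drugs = [[0, 0, pi, 0, 1 - pi],
             [0, 0, 1, 0, 0],
             [0, 0, 1, 0, 0],
             [0, 0, 0, 0, 1],
             [0, 0, 0, 0, 1]]
    return no_drugs, drugs

no_drugs, drugs = transition_matrices(0.3)  # 0.3 is a placeholder value
```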

With dropping out being a terminal state, we normalize utility with respect to this option. The

20

Page 21: CCP Estimation of Dynamic Discrete Choice Models with … · 2006. 10. 2. · Peter Arcidiacono Robert Miller Duke University Carnegie Mellon September 29, 2006 Abstract Standard

flow utilities net of the ε's when an individual knows their type are given by:

u_1(s_t ∈ {L, H}, z_t) = α_0 + α_1 + α_2 I(s_t = H)
u_2(s_t ∈ {L, H}, z_t) = α_0 + α_3 I(z_t = Y)

where α_0 is the baseline utility of attending school, α_1 is the baseline utility of using drugs, α_2 is the additional utility from being type H and using drugs, and α_3 is the withdrawal cost. Note that if the individual chooses to use drugs, no withdrawal cost is paid, so z_t is not relevant. On the other hand, if the individual chooses not to use drugs, the individual's type becomes irrelevant, as type only affects the utility of using drugs. We can then write down the corresponding expressions for those who do not know their type:

u_1(U, z_t) = α_0 + α_1 + (1 − π)α_2
u_2(U, z_t) = α_0

where now the utility of using drugs is probabilistic and, since the individual has not used drugs in the past, no withdrawal cost needs to be paid.

With the flow utilities in hand, we now turn our attention to the value functions themselves. Here it is important to note that choosing to drop out leads to a terminal state. Combining this with having the ε's distributed Type I extreme value means that the future value components are functions of the transitions on the state space and the one-period-ahead conditional choice probabilities of dropping out. The expressions for the v_k's when an individual knows their type then follow:

v_1(s_t ∈ {L, H}, z_t) = u_1(s_t ∈ {L, H}, z_t) − β ln[p_0(s_t ∈ {L, H}, Y)] + βγ
v_2(s_t ∈ {L, H}, z_t) = u_2(s_t ∈ {L, H}, z_t) − β ln[p_0(s_t ∈ {L, H}, N)] + βγ

with the corresponding expressions when an individual does not know their type following:

v_1(U, z_t) = u_1(U, z_t) + βγ − β ( π ln[p_0(L, Y)] + (1 − π) ln[p_0(H, Y)] )
v_2(U, z_t) = u_2(U, z_t) − β ln[p_0(U, N)] + βγ
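The four expressions above translate line for line into code. A sketch with placeholder utilities and CCPs (γ is Euler's constant; the dictionaries and parameter values are illustrative, not the Monte Carlo's):

```python
# Direct transcription of the drug-model value functions. u1, u2 map
# (type, z) to flow utilities; p0 maps (type, z) to dropout CCPs.
import math

EULER_GAMMA = 0.5772156649015329

def values_known(u1, u2, p0, s, z, beta):
    """v1, v2 for an individual who knows their type s in {'L','H'}."""
    v1 = u1[(s, z)] - beta * math.log(p0[(s, "Y")]) + beta * EULER_GAMMA
    v2 = u2[(s, z)] - beta * math.log(p0[(s, "N")]) + beta * EULER_GAMMA
    return v1, v2

def values_unknown(u1, u2, p0, pi, beta):
    """v1, v2 for an unrevealed type (so z = 'N'); pi = Pr(type L)."""
    v1 = (u1[("U", "N")] + beta * EULER_GAMMA
          - beta * (pi * math.log(p0[("L", "Y")])
                    + (1 - pi) * math.log(p0[("H", "Y")])))
    v2 = u2[("U", "N")] - beta * math.log(p0[("U", "N")]) + beta * EULER_GAMMA
    return v1, v2
```

Only the one-period-ahead dropout probabilities p_0 enter, which is the sense in which the terminal-state structure makes the CCP representation especially cheap here.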

For each simulation we create 5000 simulated individuals with 5 periods of data. There are fewer observations for those who drop out, as no further decisions occur once the simulated individual leaves school. We estimate the model using three different methods of calculating the expected future utility, all three of which are asymptotically equivalent. The first calculates the expected future utility via backwards recursion, while the second and third use CCP's, with the CCP's updated using the likelihoods or the weighted data, respectively. The parameter values were chosen to

approximate the increase in usage as students age in the NLSY97 data, the drop out rate, and the

persistence in drug usage. Each simulation was performed 500 times. Table 2 shows that both

CCP estimators performed nearly as well as the more efficient model. Updating the CCP’s via the

likelihoods yielded smaller standard errors than using the data but the differences were small. Note

that it is particularly surprising how well the CCP estimators performed given that we are only

using data from a discrete outcome. In more standard cases unobserved heterogeneity will affect

both transitions (which may be on continuous variables) and choices. Using the variation from the

transitions is likely to further reduce the differences across the estimators.

Table 2: School and Drug Choice Monte Carlo†

                       School/No Drug   No Withdrawal   Low Drug   High Drug   Discount   Prob. of
                       Intercept        Benefit         Type       Type        Factor     High Type
True Parameters        0.2              0.4             -15        2           0.9        0.7
Efficient Estimates    0.197            0.403           -15.011    2.003       0.900      0.700
Standard Error         0.069            0.078           0.755      0.074       0.017      0.014
CCP Estimates 1‡       0.194            0.397           -14.835    1.99        0.891      0.701
Standard Error         0.085            0.092           0.800      0.087       0.017      0.014
CCP Estimates 2        0.2003           0.3934          -14.833    2.000       0.884      0.7005
Standard Error         0.097            0.105           0.817      0.099       0.026      0.014

†Listed values are the means and standard errors of the parameter estimates over the 500 simulations.
‡CCP Estimates 1 refers to updating the CCP's via the likelihoods, while CCP Estimates 2 updates the CCP's using the data directly.


6.2 Monte Carlo 2: An Infinite Horizon Entry/Exit Game

Our second Monte Carlo examines an entry/exit game along the lines of the game described in

section 4.3. We assume that the econometrician observes prices as well, though these prices do not

affect the firm’s expected profits once we control for the relevant state variables. We specify the

price equation as:

Yjt = α0 + α1(1 Firm in Market j) + α2(2 Firms in Market j) + α3sjt + ζjt (15)

where j indexes the market. Price then depends upon how many firms are in the market; an unobserved state variable, s_jt, that transitions over time and takes on one of two values, H or L; and a normally distributed error, ζ_jt. Firms know the current value of this unobserved state

variable but only have expectations regarding its transitions. The econometrician has the same

information as the firms regarding the probability of transitioning from state to state but does not

know the current value of the state. Profits are assumed to be linear in whether the firm has a

competitor and the state of the market. Each of our Monte Carlos has 3000 markets9 observed for

5 periods each. The rest of the specification of the Monte Carlo follows directly from the entry/exit

game discussed in section 4.3.

Results for 100 simulations are presented in Table 3. The CCP estimator yields estimates that are quite close to the truth, with small standard errors. The noisiest parameters are those associated with the persistence of the states and with the initial conditions. Of particular interest are the coefficients on monopoly and duopoly in the price equation. If we ignore unobserved heterogeneity and estimate the price equation by OLS, the coefficients are biased upward, at −0.18 and −0.41, compared to the true values of −0.3 and −0.7. Controlling for dynamic selection shows a much stronger effect of adding firms to the market, which is consistent with the actual data generating process.
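The direction of this bias is easy to reproduce: if the high state both raises price and attracts entrants, regressing price on firm counts alone attenuates the competition coefficients. A self-contained simulation with illustrative numbers (not the paper's Monte Carlo design):

```python
# Omitted-heterogeneity bias in the price equation: the unobserved state s
# raises price (a3 > 0) and makes entry more likely, so OLS on firm-count
# dummies alone is biased upward. All parameter values are placeholders.
import random

random.seed(1)
a0, a1, a2, a3 = 7.0, -0.3, -0.7, 1.0
prices, firms = [], []
for _ in range(20000):
    s = 1 if random.random() < 0.5 else 0
    # Entry is more likely in the high state, creating the selection problem.
    probs = [0.5, 0.3, 0.2] if s == 0 else [0.1, 0.3, 0.6]
    u = random.random()
    nfirms = 0 if u < probs[0] else (1 if u < probs[0] + probs[1] else 2)
    y = (a0 + a1 * (nfirms == 1) + a2 * (nfirms == 2)
         + a3 * s + random.gauss(0.0, 0.3))
    prices.append(y)
    firms.append(nfirms)

def group_mean(k):
    vals = [y for y, n in zip(prices, firms) if n == k]
    return sum(vals) / len(vals)

# With only an intercept and two dummies, the OLS coefficients are differences
# in group means relative to the zero-firm group.
b1 = group_mean(1) - group_mean(0)
b2 = group_mean(2) - group_mean(0)
```

Markets with two firms are disproportionately high-state markets, so the estimated duopoly effect b2 understates (in magnitude) the true a2, mirroring the −0.41 versus −0.7 comparison in the text.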

7 Conclusion

Estimation of dynamic discrete choice models is computationally costly, particularly when controls

for unobserved heterogeneity are implemented. CCP estimation provides a computationally cheap

way of estimating dynamic discrete choice problems. In this paper we have broadened the class

9This is roughly the number of U.S. counties.


Table 3: Dynamic Entry/Exit Monte Carlo†

                                  True Values   Estimates   Std. Error
Price           Intercept L       7.000         6.999       0.042
Equation        Intercept H       8.000         8.005       0.066
                1 Firm            -0.300        -0.299      0.035
                2 Firms           -0.700        -0.702      0.053
Profit          Flow Profit L     0.000         -0.002      0.032
Function        Flow Profit H     0.500         0.516       0.103
                Duopoly Cost      -1.000        -1.009      0.043
                Entry Cost        -1.500        -1.495      0.027
Unobserved      p_LL‡             0.800         0.799       0.028
Heterogeneity   p_HH              0.700         0.702       0.040
                π_L††             0.800         0.799       0.031

† 100 simulations of 3000 markets for 5 periods. β set at 0.9.
‡ The probability of a market being in the low state in period t conditional on being in the low state at t−1.
†† Initial probability of a market being assigned the low state.

of CCP estimators that rely on a small number of CCP's for estimation and have shown how to incorporate unobserved heterogeneity that transitions over time. The algorithm itself borrowed from the macroeconomics literature on regime switching, and in particular from the insights gained from the EM algorithm, to form an estimator that iterated on 1) updating the conditional probabilities of being a particular type, 2) updating the CCP's, 3) forming the expected future utility as functions of the CCP's, and 4) maximizing a likelihood function in which the expected future utility is taken as given. The algorithm was shown to be computationally simple, with little loss of precision relative to full information maximum likelihood.

References

[1] Altug, S. and R.A. Miller (1998): "The Effect of Work Experience on Female Wages and Labour Supply," Review of Economic Studies, 65, 45-85.


[2] Aguirregabiria, V. and P. Mira (2002): "Swapping the Nested Fixed Point Algorithm: A Class of Estimators for Discrete Markov Decision Models," Econometrica, 70, 1519-1543.

[3] Aguirregabiria, V. and P. Mira (2006): "Sequential Estimation of Dynamic Discrete Games," forthcoming in Econometrica.

[4] Arcidiacono, P. (2005): "Affirmative Action in Higher Education: How Do Admission and Financial Aid Rules Affect Future Earnings?" Econometrica, 73, 1477-1524.

[5] Arcidiacono, P. and J.B. Jones (2003): "Finite Mixture Distributions, Sequential Likelihood, and the EM Algorithm," Econometrica, 71, 933-946.

[6] Bajari, P., L. Benkard, and J. Levin (2006): "Estimating Dynamic Models of Imperfect Competition," forthcoming in Econometrica.

[7] Becker, G. and K.M. Murphy (1988): "A Theory of Rational Addiction," Journal of Political Economy, 96, 675-700.

[8] Bresnahan, T.F., S. Stern, and M. Trajtenberg (1997): "Market Segmentation and the Sources of Rents from Innovation: Personal Computers in the Late 1980s," RAND Journal of Economics, 28, 17-44.

[9] Eckstein, Z. and K.I. Wolpin (1999): "Why Youths Drop Out of High School: The Impact of Preferences, Opportunities, and Abilities," Econometrica, 67, 1295-1339.

[10] Ericson, R. and A. Pakes (1995): "Markov-Perfect Industry Dynamics: A Framework for Empirical Work," Review of Economic Studies, 62, 53-82.

[11] Hamilton, J.D. (1990): "Analysis of Time Series Subject to Changes in Regime," Journal of Econometrics, 45, 39-70.

[12] Hotz, V.J. and R.A. Miller (1993): "Conditional Choice Probabilities and the Estimation of Dynamic Models," Review of Economic Studies, 60, 497-529.

[13] Pakes, A., M. Ostrovsky, and S. Berry (2004): "Simple Estimators for the Parameters of Discrete Dynamic Games (with Entry/Exit Examples)," working paper, Harvard University.


[14] Pesendorfer, M. and P. Schmidt-Dengler (2003): "Identification and Estimation of Dynamic Games," NBER Working Paper 9726.

[15] Rust, J. (1987): "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, 55, 999-1033.
