Handbook 76 - Chen

Large Sample Sieve Estimation of S-NP Models

Xiaohong Chen; Handbook of Econometrics Chapter 76

Will Matcham


Will Matcham (LSE) Chapter 76: Chen October 2015 1 / 27

Introduction

Sieve Estimation: Examples, Definitions and Sieves
• Empirical Examples of S-NP Models
• Definition of Sieve Extremum Estimation
• Typical Function Spaces and Sieve Spaces
• Small Monte Carlo Study
• Some Sieve Applications in Econometrics

Large Sample Properties of Sieve Estimation of Unknown Functions
• Consistency of Sieve Estimators
• Convergence Rates of M-Estimators
• Convergence Rates of Series Estimators
• Pointwise AN of Series LS Estimators

Large Sample Properties of Sieve Estimation of P Parts in S-NP Models
• SP Two-Step Estimators
• Sieve Simultaneous M-Estimation
• Sieve Simultaneous MD Estimation

Conclusion


Abstract

• Parametric (P) models often restrictive and sensitive to deviations from parametric specifications

• Semi-nonparametric (S-NP) models are more flexible and robust, but introduce other complications: potentially non-compact ∞-dimensional parameter spaces, which lead to ill-posed optimisation problems

• Method of sieves (MoS) provides a way to tackle such difficulties

• Optimise an empirical criterion over a sequence of approximating parameter spaces called sieves

• Sieves are dense in the original space and less complex; optimisation will become well posed.

• Advantage: MoS very flexible for complex models with or without heterogeneity and endogeneity.


Abstract

• Advantage: MoS can incorporate constraints and information from theory: shape restrictions.

• Advantage: MoS can simultaneously estimate parametric and nonparametric (NP) parts with optimal convergence for both.

• Disadvantage: General theory for MoS not complete.

• Chapter describes estimation of S-NP models via MoS

• Will present general results for large sample properties: consistency, convergence rates, pointwise normality, and some √n-asymptotic normality


Introduction

• mention S-NP, NP and P, and notation used.


Intro to Section 2

• MoS consists of two key ingredients

1. Criterion Function: population criterion Q : Θ → R (a function); empirical criterion Q̂_n (a random function)

2. Sieve Parameter Spaces: sequence of approximating spaces {Θ_n}, n ∈ N

• Both can be extremely flexible, as we shall see. Almost all criterion functions in the Newey and McFadden chapter can be used in MoS

• Hence, the main new ingredient is the choice of sieve parameter space.


Empirical Examples of S-NP Models

• Impossible to list all existing S-NP models and their empirical applications. The section presents three; I present the first here.

• Example 2.1 (Single spell duration models with unobserved heterogeneity)

• Typical single spell models suggest a functional form for the structural duration distribution conditional upon individual heterogeneity. Let G(τ|u, x) be the distribution function of duration T conditional upon unobserved and observed heterogeneity U = u and X = x respectively.

• Then, modelling U as a random factor with distribution function h(u), obtain

\[ F(\tau \mid x) = \int G(\tau \mid u, x)\, dh(u) \]

• An iid sample {T_i, X_i}, i = 1, …, n, identifies F.


Empirical Examples of S-NP Models

• Theoretical models provide parametric functional forms of G up to a finite-dimensional parameter vector β.

• g(·|β, u, x) is the density counterpart to G(·|β, u, x)

• The MLE method assumes h_γ is known up to a finite-dimensional γ

• Then the likelihood is

\[ \prod_{i=1}^{n} \int g(T_i \mid \beta, u, X_i)\, dh_\gamma(u) \]

• Thus the scaled log likelihood is

\[ L(\beta, \gamma) = \frac{1}{n} \sum_{i=1}^{n} \log\left\{ \int g(T_i \mid \beta, u, X_i)\, dh_\gamma(u) \right\} \]

• And

\[ (\hat\beta_{MLE}, \hat\gamma_{MLE})' = \arg\max_{\beta, \gamma} L(\beta, \gamma) \]


Empirical Examples of S-NP Models

• Heckman and Singer (1984) observe that parametric MLE estimates of β are inconsistent if the distribution of unobserved heterogeneity h is misspecified.

• They suggest the S-NP single spell model

\[ F(\tau \mid \beta, h, x) = \int G(\tau \mid \beta, u, x)\, dh(u) \]

• h is left unspecified. (β′, h) is identified and a sieve MLE method gives a consistent estimator for β and h jointly.

• Classic example of an S-NP model specifying the conditional distribution of observed economic variables semi-nonparametrically, with the specific semi-nonparametric form derived from independence of errors and regressors.
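A minimal sketch of the scaled log likelihood when h is restricted to a discrete distribution with finitely many mass points, which is one possible sieve for h. The exponential form of g, the two mass points, the scalar covariate and every name below are assumptions made for illustration; they are not taken from the chapter.

```python
import numpy as np

def scaled_loglik(beta, mass_points, probs, T, X):
    """(1/n) * sum_i log( sum_j p_j * g(T_i | beta, u_j, X_i) ).

    Assumes g is an exponential density with rate u * exp(beta * x),
    so the integral against a discrete h becomes a finite sum.
    """
    rate = np.exp(beta * X)[:, None] * mass_points[None, :]  # n x J rates
    dens = rate * np.exp(-rate * T[:, None])                 # g(T_i | beta, u_j, X_i)
    return np.mean(np.log(dens @ probs))

# Simulated data (hypothetical design): two heterogeneity types.
rng = np.random.default_rng(0)
n, beta0 = 5000, 0.5
u_true = np.array([0.5, 2.0])
p_true = np.array([0.5, 0.5])
X = rng.normal(size=n)
U = rng.choice(u_true, size=n, p=p_true)
T = rng.exponential(1.0 / (U * np.exp(beta0 * X)))

ll_true = scaled_loglik(beta0, u_true, p_true, T, X)
ll_wrong = scaled_loglik(beta0 + 1.0, u_true, p_true, T, X)
# Likelihood that ignores heterogeneity: h is a point mass at u = 1.
ll_nohet = scaled_loglik(beta0, np.array([1.0]), np.array([1.0]), T, X)
```

In a sieve MLE along these lines, one would maximise `scaled_loglik` jointly over β, the mass points and their probabilities, letting the number of mass points grow slowly with n.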


S-NP Conditional Moment Models

• Many economic models imply semi-nonparametric conditional moment restrictions of the form

\[ E[\rho(Z_t; \theta_0) \mid X_t] = 0, \qquad \theta_0 = (\beta_0', h_0')' \]

1. ρ is a column vector of residual functions with functional forms known up to θ

2. {Z_t} = {(Y_t′, X_t′)′}, t = 1, …, n, is the data, where Y_t is endogenous and X_t is exogenous.

3. Worth noting that E[ρ(Z_t, θ)|X_t] denotes the conditional expectation of ρ(Z_t, θ) given X_t. The true conditional distribution of Y_t given X_t is left unspecified.

• The parameter of interest θ0 = (β0′, h0′)′ is split into a vector of finite-dimensional unknown parameters β0 and a vector of ∞-dimensional functions h0(·) = (h01(·), …, h0q(·))′, which can depend on anything in the model.


S-NP Conditional Moment Models

• Hansen (1982) studied the conditional moment restriction for stationary ergodic time series without h0, i.e. E[ρ(Z_t; β0) | X_t] = 0

• Ai and Chen (2003) and others studied, for iid data, the general case E[ρ(Z_t; β0, h0) | X_t] = 0

• Partition S-NP conditional moment restriction models into two subclasses:

1. Models without endogeneity: ρ(Z_t, θ) − ρ(Z_t, θ0) doesn't depend on Y_t. In such a case, θ0 is the unique maximiser of

\[ Q(\theta) = -E\left( \rho(Z_t, \theta)'\, \Sigma(X_t)^{-1} \rho(Z_t, \theta) \right) \]

where Σ(X_t) is a positive definite weight matrix.

2. Models with endogeneity: the negation of the above. Then θ0 is identified as the unique maximiser of

\[ Q(\theta) = -E\left( m(X_t, \theta)'\, \Sigma(X_t)^{-1} m(X_t, \theta) \right) \]

where m(X_t, θ) = E[ρ(Z_t, θ) | X_t].
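A short sketch of why θ0 maximises each criterion, assuming the relevant second moments are finite. The shorthand ρ0 and Δ is introduced here for illustration only; uniqueness of the maximiser still relies on identification of the model.

```latex
% Class 1: \Delta depends on X_t only, and E[\rho_0 \mid X_t] = 0.
\rho_0 := \rho(Z_t, \theta_0), \qquad
\Delta(X_t, \theta) := \rho(Z_t, \theta) - \rho(Z_t, \theta_0)

\begin{aligned}
-Q(\theta)
 &= E\left[\rho_0' \Sigma(X_t)^{-1} \rho_0\right]
  + 2\, E\left[\Delta(X_t,\theta)' \Sigma(X_t)^{-1} E[\rho_0 \mid X_t]\right]
  + E\left[\Delta(X_t,\theta)' \Sigma(X_t)^{-1} \Delta(X_t,\theta)\right] \\
 &= -Q(\theta_0)
  + E\left[\Delta(X_t,\theta)' \Sigma(X_t)^{-1} \Delta(X_t,\theta)\right]
  \;\ge\; -Q(\theta_0)

\end{aligned}

% Class 2: m(X_t, \theta_0) = 0, so
Q(\theta_0) = 0 \;\ge\; -E\left[ m(X_t,\theta)' \Sigma(X_t)^{-1} m(X_t,\theta) \right] = Q(\theta)
```

The cross term in the first display vanishes by the law of iterated expectations, and both inequalities use positive definiteness of Σ(X_t).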


S-NP Conditional Moment Models

• Although the second class includes the first class as a special case (trivially), when θ contains unknown functions, asymptotic properties for various nonparametric estimators of θ are easier to derive in the first case.

• The first class contains many well studied special cases, such as the partially linear regression model of Robinson: E[Y_i − X_{1i}′β0 − h0(X_{2i}) | X_{1i}, X_{2i}] = 0

• The leading, yet difficult, example of the second class is the purely nonparametric instrumental variables regression: E[Y_{1i} − h0(Y_{2i}) | X_i] = 0.

• Even less trivial is the NP IV quantile regression: E[1{Y_{1i} ≤ h0(Y_{2i})} − γ | X_i] = 0, where γ ∈ (0, 1) is the quantile level.
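A minimal sketch of sieve least squares for the partially linear model above, with the criterion of the first class and Σ(X) taken to be the identity. The polynomial sieve for h0, the simulated design and all names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta0 = 2000, 1.5

# Hypothetical design: X1 is correlated with X2, h0 is a smooth function of X2.
X2 = rng.uniform(0.0, 1.0, size=n)
X1 = X2 + rng.normal(size=n)
h0 = np.sin(2.0 * np.pi * X2)
Y = beta0 * X1 + h0 + rng.normal(scale=0.5, size=n)

# Sieve for h: polynomials 1, x, ..., x^(k-1) in X2.
k = 6
basis = np.vander(X2, k, increasing=True)

# Least squares over (beta, sieve coefficients) estimates both parts jointly.
design = np.column_stack([X1, basis])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
beta_hat = coef[0]
h_hat = basis @ coef[1:]
```

Here `beta_hat` estimates the parametric part and `h_hat` the nonparametric part at the sample points; in practice k would grow with n.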


General Setup

• Let Θ be an infinite-dimensional parameter space, endowed with a pseudo-metric d.¹

• A typical S-NP econometric model specifies a population criterion Q : Θ → R uniquely maximised at θ0 ∈ Θ. θ0 is the “true” parameter value.

• The choice of Q and the existence of θ0 are suggested by identification of the econometric model.

• The true θ0 ∈ Θ is unknown but related to the joint probability measure P0(z_1, …, z_n) from which the sample {Z_t}, t = 1, …, n, is available.

• Q̂_n : Θ → R is the empirical criterion. For all θ ∈ Θ, Q̂_n(θ) is a measurable function of the data. Q̂_n is a random function.

• Q̂_n converges to Q in some sense as n → ∞.

¹ Pseudo-metric space (X, d): d : X × X → R satisfies symmetry and the triangle inequality, but only d(x, y) ≥ 0 and d(x, x) = 0; d(x, y) = 0 need not imply x = y.


General Setup

• Generally estimate θ0 by maximising Q̂_n over Θ. Assuming it exists, the maximiser arg sup_{θ∈Θ} Q̂_n(θ) is called the extremum estimate.

• When Θ is infinite-dimensional and not compact with respect to d, maximising Q̂_n over Θ may not be well defined, or, even if a maximiser exists, it may be difficult to compute and have undesirable large sample properties.

• Difficulties arise intuitively because the problem of optimisation over an infinite-dimensional noncompact space is not well posed.

• In an ∞-dimensional metric space (H, d), a compact set is d-closed and totally bounded. A set is totally bounded if ∀ε > 0, there exist finitely many open balls of radius ε covering the set.


Ill-Posed and Well-Posed Problems

• Optimisation problem is well posed if, for every sequence {θ_k} in Θ such that Q(θ0) − Q(θ_k) → 0, we have d(θ0, θ_k) → 0.

• Naturally, then, the problem is ill posed if ∃{θ_k} in Θ where Q(θ0) − Q(θ_k) → 0 yet d(θ0, θ_k) ↛ 0.

• For a given S-NP model, suppose that Q and Θ are such that Q is uniquely maximised at θ0 ∈ Θ. Then posedness of the problem depends on the choice of d. Different metrics on an ∞-dimensional Θ may not be equivalent, whereas in a finite-dimensional space all norms are equivalent.

• In particular, it is likely that a standard (strong) norm ‖θ0 − θ‖_s on Θ is not continuous in Q(θ0) − Q(θ), so the problem is ill-posed under ‖·‖_s. Nevertheless, there is typically a weaker norm ‖θ0 − θ‖_w that is continuous in Q(θ0) − Q(θ), hence the problem is well-posed under this norm.
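A small numerical sketch of ill-posedness under a strong norm. It assumes, purely for illustration, Θ ⊂ L²[0, 1], θ0 = 0, and the criterion Q(θ) = −‖Tθ − Tθ0‖², where T is the integration operator (Tθ)(x) = ∫₀ˣ θ(t) dt; none of these choices comes from the chapter. Along θ_k = θ0 + √2 sin(2πkx), the criterion gap vanishes while the L² distance to θ0 stays at 1.

```python
import numpy as np

m = 20000
x = (np.arange(m) + 0.5) / m   # midpoint grid on [0, 1]
dx = 1.0 / m

def l2_norm(f):
    """L2[0,1] norm via the midpoint rule."""
    return np.sqrt(np.mean(f ** 2))

def T(f):
    """Crude approximation of the integration operator (T f)(x) = int_0^x f."""
    return np.cumsum(f) * dx

theta0 = np.zeros(m)
strong, gap = [], []
for k in [1, 4, 16, 64]:
    theta_k = theta0 + np.sqrt(2.0) * np.sin(2.0 * np.pi * k * x)
    strong.append(l2_norm(theta_k - theta0))            # strong-norm distance
    gap.append(l2_norm(T(theta_k) - T(theta0)) ** 2)    # Q(theta0) - Q(theta_k)
```

Under the weaker norm ‖θ‖_w = ‖Tθ‖ the same problem is well posed by construction, since the criterion gap equals the squared weak distance.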


Ill-Posed and Well-Posed Problems

• Whether the problem is ill- or well-posed, the method of sieves provides a general approach to resolve the difficulties with maximising Q̂_n over an ∞-dimensional parameter space: maximise Q̂_n over a sequence of approximating spaces Θ_n called sieves, which are less complex than Θ but dense in it.

• Sieves are typically compact, nondecreasing (Θ_n ⊆ Θ_{n+1} ⊆ Θ) and such that ∀θ ∈ Θ, ∃π_n(θ) ∈ Θ_n such that d(θ, π_n(θ)) → 0 as n → ∞. Think of π_n as a projection mapping from Θ to Θ_n.

• The approximate sieve extremum estimate θ̂_n is defined as an approximate maximiser of Q̂_n over Θ_n, i.e.

\[ \hat Q_n(\hat\theta_n) \ge \sup_{\theta \in \Theta_n} \hat Q_n(\theta) - O_p(\eta_n), \qquad \eta_n = o(1). \]

• When η_n = 0, we have the exact sieve extremum estimator

\[ \hat\theta_n = \arg\max_{\theta \in \Theta_n} \hat Q_n(\theta) \]
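A minimal sketch of an exact sieve extremum estimator for nonparametric regression, with Q̂_n(θ) = −(1/n) Σ (Y_i − θ(X_i))² and Θ_n the span of the first k_n cosine basis functions. The cosine sieve, the rule for k_n, the target function and all names are assumptions for illustration only.

```python
import numpy as np

def cosine_basis(x, k):
    """First k cosine basis functions on [0, 1]: 1, sqrt(2) cos(j pi x)."""
    j = np.arange(k)
    B = np.sqrt(2.0) * np.cos(np.pi * x[:, None] * j[None, :])
    B[:, 0] = 1.0
    return B

def sieve_ls(x, y, k):
    """Exact maximiser of Q_n over the k-dimensional sieve (least squares)."""
    coef, *_ = np.linalg.lstsq(cosine_basis(x, k), y, rcond=None)
    return coef

def true_theta(x):
    return np.exp(x)   # hypothetical theta_0

rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 201)
errors = {}
for n in [100, 1000, 10000]:
    x = rng.uniform(size=n)
    y = true_theta(x) + rng.normal(scale=0.3, size=n)
    k_n = int(np.ceil(2 * n ** 0.2))   # sieve dimension grows slowly with n
    coef = sieve_ls(x, y, k_n)
    fit = cosine_basis(grid, k_n) @ coef
    errors[n] = np.sqrt(np.mean((fit - true_theta(grid)) ** 2))
```

The dictionary `errors` records the root mean squared distance between the sieve estimate and the assumed θ0 on a grid for each sample size.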
