
Efficient Inference in Econometric Models When Identification Can Be Weak∗

Vadim Marmer† Zhengfei Yu†

May 20, 2013

Abstract

We consider econometric models that fit into the minimum distance estimation (MDE) framework and can suffer from weak identification. The main contribution of the paper is to extend the notion of efficient weak-identification-robust tests from linear IV regressions to the MDE framework. When identification is strong, we find that Kleibergen’s LM (KLM) and Moreira’s CLR tests, which previously have been found to be efficient in the linear IV regression context, remain asymptotically efficient in the MDE framework. However, when identification is weak, there is a test that can uniformly dominate KLM and CLR in more general MDE models.

JEL Classification: C14; C21; C24; C25; C26

Keywords: weak instruments; weak identification; minimum distance estimation; censored variables; quantile regression

1 Introduction

This paper is concerned with weak-identification-robust inference for certain econometric models that fit into the Minimum Distance Estimation (MDE) framework. The covered models include, for example, a model with a transformed dependent variable (Horowitz, 1996), censored linear regression with endogenous regressors (Amemiya, 1984), and DSGE models (see, for example, Hnatkovska et al. (2012)). The main objective is to characterize efficient tests in that context. While the efficiency of weak-identification-robust tests is well understood for the linear instrumental variables model, in other frameworks those tests are used in an ad hoc fashion, and we seek to fill this gap.

∗ PRELIMINARY AND INCOMPLETE. We thank Mehmet Caner, Yanqin Fan, Mototsugu Shintani, and Kevin Song for helpful comments. Vadim Marmer gratefully acknowledges the financial support of the SSHRC under grants 410-2010-1394 and 435-2013-0331.

† Department of Economics, University of British Columbia, 997-1873 East Mall, Vancouver, BC, V6T 1Z1, Canada. Emails: [email protected] (Marmer), [email protected] (Yu).

Weak instrumental variables have received a lot of attention in econometrics, in particular following Staiger and Stock (1997), who developed an analytical framework for analyzing the effect of weak instruments and constructing weak-identification-robust methods of inference, and Dufour (1997), who showed that the usual bounded confidence sets cannot be valid in the case of weak identification. Typically, the econometrician is interested in testing a hypothesis about the coefficients on endogenous regressors. A standard approach to the construction of weak-identification-robust tests is to use null-restricted residuals, which are obtained by imposing the value specified under a null hypothesis for the coefficients of endogenous regressors. The hypothesis can be tested by considering the sample covariance between null-restricted residuals and instrumental variables. The approach is robust to weak identification problems because, under a null hypothesis, the distribution of the covariance term does not depend asymptotically on the strength of the correlation between endogenous regressors and instruments. This idea is behind the Anderson-Rubin (AR) statistic; see Anderson and Rubin (1949) and Staiger and Stock (1997).

Tests based on the AR statistic are efficient if a model is just identified. However, the power of AR-type tests is inferior to that of the usual t- and Wald tests when a model is overidentified and instruments are strong. This is because, when a model is overidentified, the AR approach tests more restrictions than the dimension of the parameter of interest. To address that issue, several papers suggested alternatives to the Anderson-Rubin statistic. Kleibergen (2002, 2007) and Moreira (2001, 2003) proposed Lagrange Multiplier (LM) and Conditional Likelihood Ratio (CLR) type statistics. Kleibergen’s LM (KLM) tests can be used with the usual χ² critical values. CLR tests require simulations to generate critical values; however, they have been demonstrated to have better power properties in Monte Carlo simulations than KLM or AR tests. Andrews et al. (2006) show that, for the normal linear instrumental variables regression model with homoskedastic errors, KLM and CLR statistics are efficient among weak-identification-robust statistics when instrumental variables are strong. They also show through numerical calculations that the CLR statistic is nearly efficient in terms of average power when instruments are weak. Cattaneo et al. (2012) extend the results of Andrews et al. (2006) to non-normal errors by using an asymptotic framework of Gaussian experiments. Andrews et al. (2006) and Cattaneo et al. (2012) focus on the case of a single endogenous regressor.

In the case of structural econometric models with latent dependent variables, the usual approach based on null-restricted residuals cannot be used, because dependent variables are unobservable and, therefore, null-restricted residuals cannot be constructed. Magnusson (2010) proposed an alternative approach based on testing identifying restrictions that relate reduced-form coefficients with structural coefficients. (This approach was also used in Marmer et al. (2012) in the context of Regression Discontinuity Design.) Magnusson (2010) described Anderson-Rubin, KLM, and CLR tests in the context of models with latent dependent variables but offered no analytical power analysis, and the question of efficiency remained unresolved. The same approach can be used for more general MDE models.

Following Cattaneo et al. (2012), we use the framework of asymptotic experiments of Choi et al. (1996) to define the notion of efficiency. The framework substantially simplifies the analysis by reducing a complex inference problem to one based on a single normally distributed observation. It allows one to derive an efficiency bound in the presence of nuisance parameters. The KLM and CLR tests remain efficient in the MDE framework when identification is strong. However, the near-efficiency of the CLR test no longer holds in more general models. Specifically, we demonstrate that another test (which we call a Conditional LM (CLM) test) can uniformly dominate the CLR (and KLM) tests. We attribute this to a non-Kronecker covariance structure in our asymptotic experiments.

To be completed...

2 Framework

We consider a minimum distance estimation (MDE) framework (see, for example, Newey and McFadden (1994) for a treatment of classical minimum distance estimation). Let $h : \mathbb{R}^l \times \mathbb{R}^m \to \mathbb{R}^k$ be a known function. The framework is defined by

$$h(\pi, \gamma) = 0,$$

where $\gamma \in \mathbb{R}^m$ is a vector of structural parameters, and $\pi \in \mathbb{R}^l$ is a vector of reduced-form parameters. We assume that there is a consistent and asymptotically normal estimator of $\pi$:

$$\sqrt{n}(\hat{\pi}_n - \pi) \to_d N(0, \Omega_\gamma). \tag{1}$$

We use the notation $\Omega_\gamma$ to emphasize that the asymptotic variance of the estimator of reduced-form parameters may depend on the true value of the structural parameter. At the same time, we assume that $\Omega_\gamma$ can be estimated without knowing $\gamma$ or having a consistent estimator of $\gamma$: there is an estimator $\hat{\Omega}_n$ such that

$$\hat{\Omega}_n \to_p \Omega_\gamma.$$

Linear simultaneous equations are a classical example where this is true. We will give more examples below.

In the MDE framework, the structural parameter $\gamma$ is typically estimated by minimizing a weighted distance of $h$ from zero:

$$\hat{\gamma}_n = \arg\min_{x \in \mathbb{R}^m} h(\hat{\pi}_n, x)' A(\hat{\pi}_n, x) h(\hat{\pi}_n, x),$$

where $A(\cdot, \cdot)$ is a positive definite matrix that may depend on reduced-form parameters as well as candidate values for the structural parameter. The estimator $\hat{\gamma}_n$ is called the MDE estimator, and its distribution is typically approximated asymptotically by

$$\sqrt{n}(\hat{\gamma}_n - \gamma) = \left( \frac{\partial h(\hat{\pi}_n, \hat{\gamma}_n)'}{\partial \gamma} A(\hat{\pi}_n, \hat{\gamma}_n) \frac{\partial h(\hat{\pi}_n, \hat{\gamma}_n)}{\partial \gamma'} \right)^{-1} \frac{\partial h(\hat{\pi}_n, \hat{\gamma}_n)}{\partial \pi} \sqrt{n}(\hat{\pi}_n - \pi) + o_p(1). \tag{2}$$

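One can expect that the quality of the asymptotic approximation will be poor if the rank condition

$$\operatorname{rank}\left( \frac{\partial h(\pi, \gamma)'}{\partial \gamma} A(\pi, \gamma) \frac{\partial h(\pi, \gamma)}{\partial \gamma'} \right) = m$$

fails locally, i.e., if the smallest eigenvalue of the above matrix is local-to-zero.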
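As an illustration of the criterion and the rank condition, the following is a minimal sketch under strong simplifying assumptions that are not taken from the paper: the restriction is linear, $h = \pi_2 - \Pi_2'\gamma$, the weight matrix $A$ is fixed, and the function name `mde_linear` is hypothetical. In that special case the minimizer has a closed form, and the smallest eigenvalue of the matrix in the rank condition can be inspected directly.

```python
import numpy as np

def mde_linear(pi2_hat, Pi2_hat, A):
    """Minimize h'Ah for the (assumed) linear restriction h = pi2_hat - Pi2_hat' x.

    pi2_hat: (k,) reduced-form estimate; Pi2_hat: (m, k) first-stage estimate;
    A: (k, k) positive definite weight matrix.
    Returns the minimizer and the smallest eigenvalue of G'AG,
    where G = dh/dgamma' = -Pi2_hat'.
    """
    G = -Pi2_hat.T                       # (k, m) Jacobian of h with respect to gamma
    GAG = G.T @ A @ G                    # (m, m) matrix appearing in the rank condition
    gamma_hat = np.linalg.solve(GAG, -G.T @ A @ pi2_hat)
    return gamma_hat, np.linalg.eigvalsh(GAG).min()

# Illustrative values: the same design with a strong and a nearly irrelevant instrument set.
Pi2 = np.array([[1.0, 0.5, -0.5]])
gamma = np.array([2.0])
g_strong, eig_strong = mde_linear(Pi2.T @ gamma, Pi2, np.eye(3))
g_weak, eig_weak = mde_linear((Pi2 / 100).T @ gamma, Pi2 / 100, np.eye(3))
```

With exact restrictions both calls recover $\gamma$, but the eigenvalue in the weak design is four orders of magnitude smaller, so any estimation noise in `pi2_hat` would be amplified accordingly.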

Example. (Transformation model with endogenous regressors) Horowitz (1996) considers a model with a transformed dependent variable:

$$\Lambda(Y_{1i}) = \theta Y_{2i} + X_i'\beta + U_i,$$

where $\Lambda(\cdot)$ is an unknown function. Suppose that $Y_{2i}$ is an endogenous variable, and $X_i$ is an $l$-vector of exogenous covariates. Consider in addition a first-stage equation:

$$Y_{2i} = \Pi_1' X_i + \Pi_2' Z_i + V_i,$$

where $Z_i$ is a vector of instruments, $\Pi_2 \in \mathbb{R}^k$. Note that the AR-type null-restricted residuals approach cannot be used in this model. Since $\Lambda(\cdot)$ is an unknown function, null-restricted residuals cannot be constructed even when the value of $\theta$ is specified by the null hypothesis. In this model, the marginal structural effect of $Y_{2i}$ on the dependent variable $Y_{1i}$ is given by

$$\frac{\partial Y_{1i}}{\partial Y_{2i}} = \frac{\theta}{\Lambda'(Y_{1i})}.$$

Since the marginal effect depends on the value of $Y_{1i}$, fix some $y$ in the support of the distribution of $Y_{1i}$, and define

$$\gamma = \frac{\theta}{\Lambda'(y)}.$$

Following the results of Horowitz (1996), $\gamma$ is identified by

$$\gamma \Pi_2 = -\frac{\partial \Pr(Y_{1i} \le y \mid Z_i = z)/\partial z}{\partial \Pr(Y_{1i} \le y \mid Z_i = z)/\partial y}.$$

Since the above equation depends on a realization of the $k$-vector $Z_i$, consider a weight function $w : \mathbb{R}^k \to [0, 1]$ such that $\int w(z)\,dz = 1$, and define a $k$-vector

$$\pi_2 = -\int \frac{\partial \Pr(Y_{1i} \le y \mid Z_i = z)/\partial z}{\partial \Pr(Y_{1i} \le y \mid Z_i = z)/\partial y}\, w(z)\,dz. \tag{3}$$

The transformation model fits in the MDE framework with

$$h\left((\pi_2', \Pi_2')', \gamma\right) = \gamma \Pi_2 - \pi_2.$$

Since $\partial h/\partial \gamma = \Pi_2$, we have a local identification failure if $\Pi_2$ is local to zero (weak IVs). In this example, $\Pi_2$ can be estimated by OLS from the first-stage equation, and $\pi_2$ can be estimated from (3) using nonparametric estimators for the derivative $\partial \Pr(Y_{1i} \le y \mid Z_i = z)/\partial z$ and the conditional density of $Y_{1i}$ given $Z_i$. Note that $\pi_2$ will be estimated at a slower rate than $\Pi_2$, and therefore (1) will be satisfied with different rates of convergence for different components.

Example. (Censored regression with endogenous regressors) Consider a model with a censored dependent variable:

$$y_i = \max\{y_i^*, 0\},$$

where $y_i^*$ is a latent dependent variable. We assume that the latent dependent variable $y_i^*$ is determined by a linear structural model:

$$y_i^* = \gamma' Y_i + \beta' X_i + u_i,$$

$$Y_i = \Pi_1 X_i + \Pi_{2,n} Z_i + V_i, \tag{4}$$

where $Y_i$ is the $m$-vector of endogenous regressors, $X_i$ is the $l$-vector of exogenous regressors, and $Z_i$ is the $k$-vector of IVs. In the above equations, $\gamma$ and $\beta$ denote the $m$- and $l$-vectors of structural parameters respectively, and $\Pi_1$ and $\Pi_{2,n}$ are the $m \times l$ and $m \times k$ matrices of (first-stage) reduced-form parameters. Due to the linearity of the model, the reduced-form equation for $y_i$ is a censored regression:

$$y_i = \max\{\pi_1' X_i + \pi_{2,n}' Z_i + v_i, 0\}, \tag{5}$$

where

$$\pi_1 = \Pi_1 \gamma + \beta, \tag{6}$$

$$\pi_{2,n} = \Pi_{2,n} \gamma, \tag{7}$$

$$v_i = \gamma' V_i + u_i.$$

In this example,

$$h\left((\pi_1', \pi_2', \operatorname{vec}(\Pi_1)', \operatorname{vec}(\Pi_{2,n})')', (\gamma', \beta')'\right) = \begin{pmatrix} \pi_1 - \Pi_1 \gamma - \beta \\ \pi_{2,n} - \Pi_{2,n} \gamma \end{pmatrix}.$$

We have a local identification failure if $\partial h/\partial \gamma' = \Pi_{2,n}$ is local-to-zero.¹ Again, note that the Anderson-Rubin approach cannot be used directly, since null-restricted residuals cannot be constructed because $y_i^*$ is unobservable. In this case, a part of the reduced-form parameters ($\Pi_1$ and $\Pi_{2,n}$) can be estimated from the first-stage equation (4) by OLS. The other parameters, $\pi_1$ and $\pi_{2,n}$, can be estimated from the reduced-form censored regression using a variety of methods, depending on the assumptions one is willing to make. For example, if one is willing to make distributional assumptions, $\pi_1$ and $\pi_{2,n}$ can be estimated by MLE. Alternatively, one can use the semiparametric quantile regression estimator of Powell (1984, 1986). We provide some of the details of the second approach below. Note that in either case, the distribution of the estimators of $\pi_1$ and $\pi_{2,n}$ depends on the structural parameter $\gamma$ through $v_i$.

3 Inference on the structural parameters: The failure of a standard approach

A standard approach to inference on $\gamma$ relies on the MDE estimator and the asymptotically normal approximation in (2). We illustrate below the problem with the standard approach using the censored regression example.

Example. (Censored regression continued) Suppose that for some $((m+1)l + (m+1)k) \times ((m+1)l + (m+1)k)$ symmetric and positive definite matrix $\Omega_\gamma$, we have that

$$n^{1/2} \begin{pmatrix} \hat{\pi}_{1,n} - \pi_1 \\ \hat{\pi}_{2,n} - \pi_{2,n} \\ \operatorname{vec}(\hat{\Pi}_{1,n} - \Pi_1) \\ \operatorname{vec}(\hat{\Pi}_{2,n} - \Pi_{2,n}) \end{pmatrix} \to_d N\left(0_{((m+1)l+(m+1)k) \times 1}, \Omega_\gamma\right). \tag{8}$$

¹ We therefore index it by $n$ to allow for drifting sequences of parameter values. Note that to balance (7), the reduced-form coefficient on $Z_i$ in (5) must also be indexed by $n$ to allow for drifting sequences of values.

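Estimators for the structural parameters can be constructed by following the approach of Amemiya (1978), which is equivalent to MDE in this example. Using (6) and (7), write

$$\hat{\pi}_{1,n} = \hat{\Pi}_{1,n}' \gamma + \beta + (\hat{\pi}_{1,n} - \pi_1) - (\hat{\Pi}_{1,n} - \Pi_1)' \gamma,$$

$$\hat{\pi}_{2,n} = \hat{\Pi}_{2,n}' \gamma + (\hat{\pi}_{2,n} - \pi_{2,n}) - (\hat{\Pi}_{2,n} - \Pi_{2,n})' \gamma, \tag{9}$$

or

$$\hat{\pi}_n = \hat{H}_n \begin{bmatrix} \gamma \\ \beta \end{bmatrix} + \begin{bmatrix} \eta_{1,n} \\ \eta_{2,n} \end{bmatrix}, \text{ where}$$

$$\hat{\pi}_n = \begin{bmatrix} \hat{\pi}_{1,n} \\ \hat{\pi}_{2,n} \end{bmatrix}, \quad \hat{H}_n = \begin{bmatrix} \hat{\Pi}_{1,n}' & I_l \\ \hat{\Pi}_{2,n}' & 0_{k \times l} \end{bmatrix},$$

$$\eta_{1,n} = \begin{bmatrix} I_l & -(I_l \otimes \gamma') \end{bmatrix} \begin{bmatrix} \hat{\pi}_{1,n} - \pi_1 \\ \operatorname{vec}(\hat{\Pi}_{1,n} - \Pi_1) \end{bmatrix},$$

$$\eta_{2,n} = \begin{bmatrix} I_k & -(I_k \otimes \gamma') \end{bmatrix} \begin{bmatrix} \hat{\pi}_{2,n} - \pi_{2,n} \\ \operatorname{vec}(\hat{\Pi}_{2,n} - \Pi_{2,n}) \end{bmatrix}.$$

The last two equalities follow from the properties of the vec operator and the Kronecker product.²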

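Partition $\Omega_\gamma$ conveniently as

$$\Omega_\gamma = \begin{pmatrix} \Omega_{11,\gamma} & \Omega_{12,\gamma} & \Omega_{13,\gamma} & \Omega_{14,\gamma} \\ \cdots & \Omega_{22,\gamma} & \Omega_{23,\gamma} & \Omega_{24,\gamma} \\ \cdots & \cdots & \Omega_{33} & \Omega_{34} \\ \cdots & \cdots & \cdots & \Omega_{44} \end{pmatrix}.$$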

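Let $\Sigma_\gamma$ denote the asymptotic variance-covariance matrix of $(\eta_{1,n}', \eta_{2,n}')'$. We have:

$$\Sigma_\gamma = \begin{bmatrix} \Sigma_{11,\gamma} & \Sigma_{12,\gamma} \\ \Sigma_{12,\gamma}' & \Sigma_{22,\gamma} \end{bmatrix}, \text{ where} \tag{10}$$

$$\Sigma_{11,\gamma} = \begin{bmatrix} I_l & -(I_l \otimes \gamma') \end{bmatrix} \begin{bmatrix} \Omega_{11,\gamma} & \Omega_{13,\gamma} \\ \Omega_{13,\gamma}' & \Omega_{33} \end{bmatrix} \begin{bmatrix} I_l & -(I_l \otimes \gamma') \end{bmatrix}', \tag{11}$$

$$\Sigma_{22,\gamma} = \begin{bmatrix} I_k & -(I_k \otimes \gamma') \end{bmatrix} \begin{bmatrix} \Omega_{22,\gamma} & \Omega_{24,\gamma} \\ \Omega_{24,\gamma}' & \Omega_{44} \end{bmatrix} \begin{bmatrix} I_k & -(I_k \otimes \gamma') \end{bmatrix}', \tag{12}$$

$$\Sigma_{12,\gamma} = \begin{bmatrix} I_l & -(I_l \otimes \gamma') \end{bmatrix} \begin{bmatrix} \Omega_{12,\gamma} & \Omega_{14,\gamma} \\ \Omega_{23,\gamma}' & \Omega_{34} \end{bmatrix} \begin{bmatrix} I_k & -(I_k \otimes \gamma') \end{bmatrix}'. \tag{13}$$

² Given a matrix $B$ and a vector $a$, $Ba = (I_h \otimes a') \operatorname{vec}(B')$, where $h$ denotes the number of rows in $B$; see Exercise 10.19(b) in Abadir and Magnus (2005).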
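The vec/Kronecker identity behind these sandwich formulas can be checked numerically. The sketch below is illustrative only: the helper names `vec`, `selector`, and `sigma_22` are hypothetical, and it assumes that vec stacks columns and that the first-stage coefficient matrix is stored as an $m \times k$ array.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector (the vec operator)."""
    return M.reshape(-1, order="F")

def selector(dim, gamma):
    """Build the block row [I_dim, -(I_dim kron gamma')]."""
    return np.hstack([np.eye(dim), -np.kron(np.eye(dim), gamma[None, :])])

def sigma_22(Omega22, Omega24, Omega44, gamma):
    """Sandwich formula (12) for the asymptotic variance of eta_2."""
    k = Omega22.shape[0]
    R = selector(k, gamma)
    Omega = np.block([[Omega22, Omega24], [Omega24.T, Omega44]])
    return R @ Omega @ R.T

# Numerical check of Pi' gamma = (I_k kron gamma') vec(Pi) on random inputs.
rng = np.random.default_rng(0)
m, k = 2, 3
Pi = rng.normal(size=(m, k))
gamma = rng.normal(size=m)
lhs = Pi.T @ gamma
rhs = np.kron(np.eye(k), gamma[None, :]) @ vec(Pi)
```

Here `lhs` and `rhs` agree up to floating-point error, which is the identity that lets $\eta_{2,n}$ be written as a fixed linear map of the stacked estimation errors.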

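Amemiya’s GLS-type estimator of the structural parameters is given by

$$\begin{bmatrix} \hat{\gamma}_{AGLS,n} \\ \hat{\beta}_{AGLS,n} \end{bmatrix} = \left( \hat{H}_n' \Sigma_\gamma^{-1} \hat{H}_n \right)^{-1} \hat{H}_n' \Sigma_\gamma^{-1} \hat{\pi}_n.$$

When IVs are strong ($\Pi_{2,n} = \Pi_2$ and $\pi_{2,n} = \pi_2$ for all $n$) and the rank identification condition holds ($\operatorname{rank}(\Pi_2) = m$), the GLS estimator is consistent and asymptotically normal with the asymptotic variance-covariance matrix equal to

$$\left( H' \Sigma_\gamma^{-1} H \right)^{-1}.$$

It is also efficient in the MDE sense.

Amemiya’s feasible GLS estimator (AFGLS) can be constructed by exploiting the structure of $\Sigma_\gamma$ in (10)-(13). It requires a consistent estimator of $\Omega$, as well as some preliminary estimator of $\gamma$, for example, Amemiya’s OLS-type estimator:

$$\begin{bmatrix} \hat{\gamma}_{AOLS,n} \\ \hat{\beta}_{AOLS,n} \end{bmatrix} = \left( \hat{H}_n' \hat{H}_n \right)^{-1} \hat{H}_n' \hat{\pi}_n.$$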

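Let $\hat{\Sigma}_{AOLS,n}$ denote a plug-in type estimator of $\Sigma_\gamma$ based on (10)-(13), which replaces the $\Omega$’s with their consistent estimators and the unknown $\gamma$ with $\hat{\gamma}_{AOLS,n}$:

$$\hat{\Sigma}_{AOLS,n} = \begin{bmatrix} \hat{\Sigma}_{11,AOLS,n} & \hat{\Sigma}_{12,AOLS,n} \\ \hat{\Sigma}_{12,AOLS,n}' & \hat{\Sigma}_{22,AOLS,n} \end{bmatrix}, \text{ where}$$

$$\hat{\Sigma}_{11,AOLS,n} = \begin{bmatrix} I_l & -(I_l \otimes \hat{\gamma}_{AOLS,n}') \end{bmatrix} \begin{bmatrix} \hat{\Omega}_{11,n} & \hat{\Omega}_{13,n} \\ \hat{\Omega}_{13,n}' & \hat{\Omega}_{33,n} \end{bmatrix} \begin{bmatrix} I_l & -(I_l \otimes \hat{\gamma}_{AOLS,n}') \end{bmatrix}',$$

$$\hat{\Sigma}_{22,AOLS,n} = \begin{bmatrix} I_k & -(I_k \otimes \hat{\gamma}_{AOLS,n}') \end{bmatrix} \begin{bmatrix} \hat{\Omega}_{22,n} & \hat{\Omega}_{24,n} \\ \hat{\Omega}_{24,n}' & \hat{\Omega}_{44,n} \end{bmatrix} \begin{bmatrix} I_k & -(I_k \otimes \hat{\gamma}_{AOLS,n}') \end{bmatrix}',$$

$$\hat{\Sigma}_{12,AOLS,n} = \begin{bmatrix} I_l & -(I_l \otimes \hat{\gamma}_{AOLS,n}') \end{bmatrix} \begin{bmatrix} \hat{\Omega}_{12,n} & \hat{\Omega}_{14,n} \\ \hat{\Omega}_{23,n}' & \hat{\Omega}_{34,n} \end{bmatrix} \begin{bmatrix} I_k & -(I_k \otimes \hat{\gamma}_{AOLS,n}') \end{bmatrix}'.$$

In the above equations, the $\hat{\Omega}$’s denote consistent estimators of the corresponding blocks in $\Omega_\gamma$. A feasible GLS (FGLS) estimator, which is asymptotically equivalent to the GLS estimator when IVs are strong and the rank condition holds, can be constructed as

$$\begin{bmatrix} \hat{\gamma}_{AFGLS,n} \\ \hat{\beta}_{AFGLS,n} \end{bmatrix} = \left( \hat{H}_n' \hat{\Sigma}_{AOLS,n}^{-1} \hat{H}_n \right)^{-1} \hat{H}_n' \hat{\Sigma}_{AOLS,n}^{-1} \hat{\pi}_n.$$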
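The OLS-type and GLS-type formulas above are ordinary least-squares algebra on the stacked system $\hat{\pi}_n = \hat{H}_n(\gamma', \beta')' + \eta_n$. The following minimal sketch makes that concrete; the function names are hypothetical, and the weight matrix `Sigma` is simply passed in (for the feasible version it would be the plug-in estimate built from the OLS-type first step).

```python
import numpy as np

def build_H(Pi1_hat, Pi2_hat):
    """Stack [[Pi1', I_l], [Pi2', 0]] for Pi1 (m x l) and Pi2 (m x k)."""
    m, l = Pi1_hat.shape
    k = Pi2_hat.shape[1]
    return np.block([[Pi1_hat.T, np.eye(l)], [Pi2_hat.T, np.zeros((k, l))]])

def amemiya_ols(H_hat, pi_hat):
    """OLS-type estimator (H'H)^{-1} H' pi of the stacked (gamma, beta)."""
    return np.linalg.solve(H_hat.T @ H_hat, H_hat.T @ pi_hat)

def amemiya_gls(H_hat, pi_hat, Sigma):
    """GLS-type estimator (H' Sigma^{-1} H)^{-1} H' Sigma^{-1} pi."""
    W = np.linalg.solve(Sigma, H_hat)            # Sigma^{-1} H (Sigma is symmetric)
    return np.linalg.solve(W.T @ H_hat, W.T @ pi_hat)

# Illustrative, noise-free example: both estimators recover (gamma, beta).
rng = np.random.default_rng(0)
m, l, k = 1, 2, 3
H = build_H(rng.normal(size=(m, l)), rng.normal(size=(m, k)))
theta = np.array([0.5, -1.0, 2.0])               # stacked (gamma, beta)
pi = H @ theta
B = rng.normal(size=(l + k, l + k))
theta_ols = amemiya_ols(H, pi)
theta_gls = amemiya_gls(H, pi, B @ B.T)
```

The first $m$ entries of each result are the estimate of $\gamma$ and the remaining $l$ entries the estimate of $\beta$.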

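Assume that the instruments are only weakly related to the endogenous regressors:

$$\Pi_{2,n} = C_2/n^{1/2},$$

where $C_2$ is an $m \times k$ matrix of unknown constants. In what follows, we focus on the estimators for $\gamma$. It turns out that the GLS estimator of $\gamma$ depends only on $\hat{\Pi}_{2,n}$, $\hat{\pi}_{2,n}$, and $\Sigma_{22,\gamma}$:³

$$\hat{\gamma}_{AGLS,n} = \left( \hat{\Pi}_{2,n} \Sigma_{22,\gamma}^{-1} \hat{\Pi}_{2,n}' \right)^{-1} \hat{\Pi}_{2,n} \Sigma_{22,\gamma}^{-1} \hat{\pi}_{2,n}. \tag{14}$$

Define $\psi_2$ to be a $k$-vector of random variables and $\Psi_2$ to be an $m \times k$ matrix of random variables such that

$$\begin{bmatrix} \psi_2 \\ \operatorname{vec}(\Psi_2) \end{bmatrix} \sim N\left( 0_{(m+1)k \times 1}, \begin{bmatrix} \Omega_{22,\gamma} & \Omega_{24,\gamma} \\ \cdots & \Omega_{44} \end{bmatrix} \right).$$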

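We have:

$$\begin{aligned} \hat{\gamma}_{AGLS,n} - \gamma = {} & \left( \left( n^{1/2}(\hat{\Pi}_{2,n} - \Pi_{2,n}) + C_2 \right) \Sigma_{22,\gamma}^{-1} \left( n^{1/2}(\hat{\Pi}_{2,n} - \Pi_{2,n}) + C_2 \right)' \right)^{-1} \\ & \times \left( n^{1/2}(\hat{\Pi}_{2,n} - \Pi_{2,n}) + C_2 \right) \Sigma_{22,\gamma}^{-1} \left( n^{1/2}(\hat{\pi}_{2,n} - \pi_{2,n}) - n^{1/2}(\hat{\Pi}_{2,n} - \Pi_{2,n})' \gamma \right) \\ \to_d {} & \left( (\Psi_2 + C_2) \Sigma_{22,\gamma}^{-1} (\Psi_2 + C_2)' \right)^{-1} (\Psi_2 + C_2) \Sigma_{22,\gamma}^{-1} \left( \psi_2 - \Psi_2' \gamma \right). \end{aligned}$$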

Hence, Amemiya’s GLS estimator is inconsistent when IVs are weak. To describe the asymptotic behavior of Amemiya’s FGLS estimator, define $\xi_{AOLS}$ as the random limit of $\hat{\gamma}_{AOLS,n} - \gamma$:

$$\xi_{AOLS} = \left( (\Psi_2 + C_2)(\Psi_2 + C_2)' \right)^{-1} (\Psi_2 + C_2) \left( \psi_2 - \Psi_2' \gamma \right),$$

and let $S_{22,AOLS}$ denote a random matrix defined similarly to $\Sigma_{22,\gamma}$ in (12), but with $\gamma$ replaced by $\gamma + \xi_{AOLS}$:

$$S_{22,AOLS} = \begin{bmatrix} I_k & -(I_k \otimes (\gamma + \xi_{AOLS})') \end{bmatrix} \begin{bmatrix} \Omega_{22,\gamma} & \Omega_{24,\gamma} \\ \Omega_{24,\gamma}' & \Omega_{44} \end{bmatrix} \begin{bmatrix} I_k & -(I_k \otimes (\gamma + \xi_{AOLS})') \end{bmatrix}'.$$

The following result describes the asymptotic behavior of the AFGLS estimator:

$$\hat{\gamma}_{AFGLS,n} - \gamma \to_d \left( (\Psi_2 + C_2) S_{22,AOLS}^{-1} (\Psi_2 + C_2)' \right)^{-1} (\Psi_2 + C_2) S_{22,AOLS}^{-1} \left( \psi_2 - \Psi_2' \gamma \right).$$

³ The proof is given in the appendix.

Similarly to $\hat{\gamma}_{AGLS,n}$, the estimator $\hat{\gamma}_{AFGLS,n}$ is inconsistent. However, due to the inconsistency of the preliminary estimator of $\gamma$, the asymptotic equivalence between GLS and FGLS no longer holds when identification is weak. Lastly, consider testing a hypothesis about $\gamma$. In the case of strong IVs and when the rank condition holds, a hypothesis $H_0 : \gamma = \gamma_0$ can be tested using a Wald statistic based on the AFGLS estimator:

$$Wald_{\gamma_0,n} = n (\hat{\gamma}_{AFGLS,n} - \gamma_0)' \left( \hat{\Pi}_{2,n} \hat{\Sigma}_{22,AOLS,n}^{-1} \hat{\Pi}_{2,n}' \right) (\hat{\gamma}_{AFGLS,n} - \gamma_0),$$

where $\hat{\Sigma}_{22,AOLS,n}$ denotes a plug-in estimator of $\Sigma_{22,\gamma}$, which is constructed similarly to (12), but with the $\Omega$’s replaced by their consistent estimators and $\gamma$ replaced with its Amemiya’s OLS estimator. It follows from the previous results that the null asymptotic distribution of the Wald statistic is

$$\left\| \left( (\Psi_2 + C_2) S_{22,AOLS}^{-1} (\Psi_2 + C_2)' \right)^{-1/2} (\Psi_2 + C_2) S_{22,AOLS}^{-1} \left( \psi_2 - \Psi_2' \gamma \right) \right\|^2,$$

where $\|v\|$ denotes the Euclidean norm of a vector $v$. This null distribution is nonstandard, and as we illustrate in Section 6 using the example of the semiparametric censored structural regression, one can expect substantial size distortions.

4 Efficient identification-failure-robust inference for the coefficients on endogenous regressors: the censored regression example

We are interested in testing a hypothesis about the vector of structural parameters $\gamma$: for a specified value $\gamma_0 \in \mathbb{R}^m$,

$$H_0 : \gamma = \gamma_0 \quad \text{vs.} \quad H_1 : \gamma \ne \gamma_0.$$

In this section, we consider the censored regression model as a prototypical example. Results for a general MDE framework are given in Section 5.

A class of AR-type tests is proposed in Magnusson (2010). Instead of using the null-restricted residuals, his tests are based on the identifying restriction that relates the reduced-form and structural parameters. Specifically, for some specified value $\gamma_0 \in \mathbb{R}^m$, he tests

$$H_0 : \pi_{2,n} = \Pi_{2,n} \gamma_0 \quad \text{vs.} \quad H_1 : \pi_{2,n} \ne \Pi_{2,n} \gamma_0.$$

Magnusson (2010) describes the basic AR-type statistic, as well as Moreira’s (2003) CLR-type statistic and Kleibergen’s (2002) LM-type statistic (KLM). He gives their null distributions and compares their power properties in a Monte Carlo experiment. However, his paper does not study their analytical power or efficiency.

4.1 Asymptotic experiment

To characterize efficient tests in this framework, following Cattaneo et al. (2012), we use the framework of asymptotic experiments of Choi et al. (1996). It substantially simplifies the problem by reducing it to an experiment with a single normal observation.

Assuming asymptotic normality of the estimators of the reduced-form and first-stage parameters as in (8), the data can now be summarized as follows:

$$\begin{bmatrix} \hat{\pi}_{2,n} \\ \operatorname{vec}(\hat{\Pi}_{2,n}) \end{bmatrix} \overset{a}{\sim} N\left( \begin{bmatrix} \Pi_{2,n} \gamma_n \\ \operatorname{vec}(\Pi_{2,n}) \end{bmatrix}, n^{-1} \begin{pmatrix} \Omega_{22,\gamma_n} & \Omega_{24,\gamma_n} \\ \Omega_{42,\gamma_n} & \Omega_{44} \end{pmatrix} \right),$$

where “$\overset{a}{\sim}$” denotes “approximately distributed as” in large samples. In the above equation, we used the identifying restriction $\pi_{2,n} = \Pi_{2,n} \gamma_n$. Hereafter, the values of the structural parameter of interest are also indexed by $n$:

$$H_0 : \gamma_n = \gamma_0 \quad \text{vs.} \quad H_1 : \gamma_n \ne \gamma_0.$$

This is done to allow for local deviations from the null hypothesis. In what follows, we assume that the $\Omega$’s in the above equation are known (asymptotically), as they can typically be estimated consistently. Thus, we have a single normal observation. However, there is a $k$-vector of nuisance parameters ($\Pi_{2,n}$) that may or may not be consistently estimated.

Write

$$\gamma_n = \gamma_0 + \delta_n.$$

We consider two possible scenarios: i) strong IVs with local alternatives, and ii) weak IVs with fixed alternatives.

Assumption. (Strong IVs) $\Pi_{2,n} = \Pi_2$ and $\delta_n = \delta/\sqrt{n}$ for all $n$.

Assumption. (Weak IVs) $\Pi_{2,n} = C_2/\sqrt{n}$ and $\delta_n = \delta$ for all $n$.

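Next, define

$$S_{n,\gamma_0} = \hat{\pi}_{2,n} - \hat{\Pi}_{2,n} \gamma_0.$$

The statistic $S_{n,\gamma_0}$ measures violations of $H_0$. The information on the nuisance parameter is contained in $\hat{\Pi}_{2,n}$. However, $\hat{\Pi}_{2,n}$ also contains information on violations of $H_0$, since it is correlated with $S_{n,\gamma_0}$. Following Kleibergen (2002), it is therefore convenient to introduce an estimator for $\Pi_{2,n}$ that is asymptotically independent of $S_{n,\gamma_0}$. Define $T_{n,\gamma_0}$ so that

$$\operatorname{vec}(T_{n,\gamma_0}) = \operatorname{vec}(\hat{\Pi}_{2,n}) - \Lambda_{\gamma_n} (\hat{\pi}_{2,n} - \hat{\Pi}_{2,n} \gamma_0),$$

where

$$\Lambda_{\gamma_n} = \Sigma_{\Pi S,\gamma_n} \Sigma_{SS,\gamma_n}^{-1},$$

and

$$\Sigma_{SS,\gamma_n} = \begin{bmatrix} I_k & -(I_k \otimes \gamma_0') \end{bmatrix} \begin{bmatrix} \Omega_{22,\gamma_n} & \Omega_{24,\gamma_n} \\ \Omega_{24,\gamma_n}' & \Omega_{44} \end{bmatrix} \begin{bmatrix} I_k & -(I_k \otimes \gamma_0') \end{bmatrix}',$$

$$\Sigma_{\Pi S,\gamma_n} = \Omega_{24,\gamma_n}' - \Omega_{44} (I_k \otimes \gamma_0).$$

Note that $\Sigma_{SS,\gamma_n}$ is similar to $\Sigma_{22,\gamma}$ in (12); however, it is constructed using the null-restricted value of the structural parameter. Nevertheless, $\Sigma_{SS,\gamma_n}$ depends on the true value of the structural parameter through the $\Omega$’s. This fact is important for analyzing the weak IVs/fixed alternatives case. Lastly, the asymptotic variance of $T_{n,\gamma_0}$ is given by

$$\Sigma_{TT,\gamma_n} = \Omega_{44} - \Sigma_{\Pi S,\gamma_n} \Sigma_{SS,\gamma_n}^{-1} \Sigma_{\Pi S,\gamma_n}'.$$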
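The orthogonalization step can be sketched numerically as follows. This is an illustration under stated assumptions rather than the paper's own code: the function name `orthogonalize` is hypothetical, vec stacks columns, the first-stage estimate is stored as an $m \times k$ array (so that $S$ is computed as `pi2_hat - Pi2_hat.T @ gamma0`), and the $\Omega$ blocks are treated as known.

```python
import numpy as np

def orthogonalize(pi2_hat, Pi2_hat, Omega22, Omega24, Omega44, gamma0):
    """Return S, vec(T), Lambda, Sigma_SS and Sigma_TT for the null value gamma0."""
    k = pi2_hat.shape[0]
    K = np.kron(np.eye(k), gamma0[None, :])          # I_k kron gamma0', shape (k, k*m)
    R = np.hstack([np.eye(k), -K])                   # block row [I_k, -(I_k kron gamma0')]
    Omega = np.block([[Omega22, Omega24], [Omega24.T, Omega44]])
    Sigma_SS = R @ Omega @ R.T
    Sigma_PiS = Omega24.T - Omega44 @ K.T            # Omega24' - Omega44 (I_k kron gamma0)
    Lam = Sigma_PiS @ np.linalg.inv(Sigma_SS)
    S = pi2_hat - Pi2_hat.T @ gamma0
    vecT = Pi2_hat.reshape(-1, order="F") - Lam @ S  # vec(Pi2_hat) - Lambda S
    Sigma_TT = Omega44 - Lam @ Sigma_PiS.T
    return S, vecT, Lam, Sigma_SS, Sigma_TT

# Illustrative random inputs with m = 2 endogenous regressors and k = 3 instruments.
rng = np.random.default_rng(1)
m, k = 2, 3
B = rng.normal(size=(k + k * m, k + k * m))
Om = B @ B.T
g0 = rng.normal(size=m)
S, vecT, Lam, Sigma_SS, Sigma_TT = orthogonalize(
    rng.normal(size=k), rng.normal(size=(m, k)),
    Om[:k, :k], Om[:k, k:], Om[k:, k:], g0)
```

Because both $S$ and $\operatorname{vec}(T)$ are linear in the stacked estimator, their covariance is a matrix product that can be checked to be zero, which is the asymptotic independence the construction is designed to deliver.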

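Thus, the asymptotic experiment can be alternatively stated as

$$\begin{bmatrix} S_{n,\gamma_0} \\ \operatorname{vec}(T_{n,\gamma_0}) \end{bmatrix} \overset{a}{\sim} N\left( \begin{bmatrix} \Pi_{2,n} \delta_n \\ \operatorname{vec}(\Pi_{2,n}) - \Lambda_{\gamma_0+\delta_n} \Pi_{2,n} \delta_n \end{bmatrix}, n^{-1} \begin{pmatrix} \Sigma_{SS,\gamma_0+\delta_n} & 0 \\ 0 & \Sigma_{TT,\gamma_0+\delta_n} \end{pmatrix} \right). \tag{15}$$

Note that $S_{n,\gamma_0}$ and $T_{n,\gamma_0}$ are $k$- and $km$-vectors respectively, while $\delta_n$ is an $m$-vector. Hereafter, we assume that the model is overidentified:

$$m < k.$$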

Asymptotic versions (assuming that the $\Sigma$’s in (15) are known, since the $\Omega$’s are treated as asymptotically known) of the AR, Kleibergen’s KLM, and Moreira’s CLR statistics are given respectively by

$$AR_{n,\gamma_0} = n S_{n,\gamma_0}' \Sigma_{SS,\gamma_0+\delta_n}^{-1} S_{n,\gamma_0},$$

$$KLM_{n,\gamma_0} = n S_{n,\gamma_0}' \left( \Sigma_{SS,\gamma_0+\delta_n}^{-1} T_{n,\gamma_0} \right) \left( T_{n,\gamma_0}' \Sigma_{SS,\gamma_0+\delta_n}^{-1} T_{n,\gamma_0} \right)^{-1} \left( T_{n,\gamma_0}' \Sigma_{SS,\gamma_0+\delta_n}^{-1} \right) S_{n,\gamma_0},$$

$$CLR_{n,\gamma_0} = n \bar{S}_{n,\gamma_0}' \bar{S}_{n,\gamma_0} - n \lambda_{\min} \begin{pmatrix} \bar{S}_{n,\gamma_0}' \bar{S}_{n,\gamma_0} & \bar{S}_{n,\gamma_0}' \bar{T}_{n,\gamma_0} \\ \bar{T}_{n,\gamma_0}' \bar{S}_{n,\gamma_0} & \bar{T}_{n,\gamma_0}' \bar{T}_{n,\gamma_0} \end{pmatrix},$$

where $\lambda_{\min}(A)$ denotes the smallest eigenvalue of a matrix $A$, and

$$\bar{S}_{n,\gamma_0} = \Sigma_{SS,\gamma_0+\delta_n}^{-1/2} S_{n,\gamma_0},$$

$$\operatorname{vec}(\bar{T}_{n,\gamma_0}) = \Sigma_{TT,\gamma_0+\delta_n}^{-1/2} \operatorname{vec}(T_{n,\gamma_0}).$$

The null asymptotic distribution of $AR_{n,\gamma_0}$ is $\chi^2_k$ under strong or weak IVs. At the same time, as shown in Kleibergen (2002), the asymptotic null distribution of $KLM_{n,\gamma_0}$ is $\chi^2_m$ under strong or weak IVs. Hence, since $m < k$, there is a waste of degrees of freedom (and a loss of efficiency) for the AR statistic. The critical values for the CLR statistic must be simulated conditional on $T_{n,\gamma_0}$ under the assumption $\delta = 0$ ($H_0$).
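For a single endogenous regressor ($m = 1$), $S_{n,\gamma_0}$ and $T_{n,\gamma_0}$ are both $k$-vectors and the three statistics reduce to a few lines of linear algebra. The sketch below is a hedged illustration of the formulas as stated above, with hypothetical function names and with the $\Sigma$ matrices passed in as known; it does not simulate the conditional critical values that the CLR test requires.

```python
import numpy as np

def inv_sqrt(Sigma):
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(Sigma)
    return (V / np.sqrt(w)) @ V.T

def ar_klm_clr(n, S, T, Sigma_SS, Sigma_TT):
    """AR, KLM and CLR statistics for m = 1, with S and T as k-vectors."""
    SiS = np.linalg.solve(Sigma_SS, S)               # Sigma_SS^{-1} S
    SiT = np.linalg.solve(Sigma_SS, T)               # Sigma_SS^{-1} T
    ar = n * S @ SiS
    klm = n * (T @ SiS) ** 2 / (T @ SiT)             # quadratic form in the direction of T
    Sb = inv_sqrt(Sigma_SS) @ S                      # standardized S
    Tb = inv_sqrt(Sigma_TT) @ T                      # standardized T
    Q = np.array([[Sb @ Sb, Sb @ Tb], [Tb @ Sb, Tb @ Tb]])
    clr = n * (Sb @ Sb) - n * np.linalg.eigvalsh(Q).min()
    return float(ar), float(klm), float(clr)

# Illustrative inputs only.
rng = np.random.default_rng(2)
k = 4
A = rng.normal(size=(k, k))
B = rng.normal(size=(k, k))
ar, klm, clr = ar_klm_clr(200, rng.normal(size=k) / 10, rng.normal(size=k),
                          A @ A.T + np.eye(k), B @ B.T + np.eye(k))
```

By construction the KLM statistic is the part of the AR quadratic form that lies along $T_{n,\gamma_0}$, so it can never exceed the AR statistic; the CLR statistic as defined here likewise lies between zero and the AR statistic.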

4.2 Strong IVs

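When IVs are strong, we can ignore asymptotically negligible terms in (15). For example, $\Sigma_{SS,\gamma_0+\delta_n} = \Sigma_{SS,\gamma_0+\delta/\sqrt{n}}$ can be treated asymptotically as $\Sigma_{SS,\gamma_0}$. Hence, the asymptotic experiment takes the following form:

$$\begin{bmatrix} \sqrt{n} S_{n,\gamma_0} \\ \sqrt{n} \operatorname{vec}(T_{n,\gamma_0} - \Pi_2) \end{bmatrix} \overset{a}{\sim} N\left( \begin{bmatrix} \Pi_2 \delta \\ -\Lambda_{\gamma_0} \Pi_2 \delta \end{bmatrix}, \begin{pmatrix} \Sigma_{22,\gamma_0} & 0 \\ 0 & \Sigma_{44,\gamma_0} \end{pmatrix} \right).$$

In this case, we have non-central $\chi^2$ asymptotic distributions for the AR and KLM statistics:

$$AR_{n,\gamma_0} \to_d \chi^2_k\left( \delta' \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 \delta \right), \tag{16}$$

$$KLM_{n,\gamma_0} \to_d \chi^2_m\left( \delta' \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 \delta \right). \tag{17}$$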
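Since (16) and (17) share the same non-centrality parameter and differ only in degrees of freedom, the power gap between the two tests can be illustrated by simulation. The sketch below is a rough Monte Carlo illustration, not a result from the paper: the function names, the design values, and the convention that the first-stage coefficients are entered as a $k \times m$ array are all assumptions, and the critical value is simulated rather than taken from tables.

```python
import numpy as np

def noncentrality(delta, Pi2, Sigma_SS):
    """delta' Pi2' Sigma_SS^{-1} Pi2 delta, with Pi2 entered as a (k, m) array."""
    v = Pi2 @ delta
    return float(v @ np.linalg.solve(Sigma_SS, v))

def simulated_power(df, nonc, alpha=0.05, reps=200_000, seed=0):
    """Monte Carlo rejection rate of a chi-square(df) test under a
    non-central chi-square(df, nonc) alternative."""
    rng = np.random.default_rng(seed)
    crit = np.quantile(rng.chisquare(df, size=reps), 1 - alpha)  # simulated critical value
    draws = rng.noncentral_chisquare(df, nonc, size=reps)
    return float(np.mean(draws > crit))

# Illustrative design: k = 5 instruments, m = 1 endogenous regressor.
k, m = 5, 1
nc = noncentrality(np.array([1.0]), np.ones((k, m)), np.eye(k))
power_ar = simulated_power(k, nc)    # AR test uses k degrees of freedom
power_klm = simulated_power(m, nc)   # KLM test uses m degrees of freedom
```

In this design the simulated KLM power is clearly higher than the AR power, which is the degrees-of-freedom loss described above.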

4.2.1 Single endogenous regressor (m = 1)

Let’s assume for now that there is a single endogenous regressor and, therefore, $m = 1$. To find an efficient test, assume for a moment that $\Pi_2$ is known. Then, an asymptotically uniformly most powerful unbiased (AUMPU) test must be based on the asymptotic likelihood ratio (LR):

$$LR^*_{n,\gamma_0} = \delta \left( \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \sqrt{n} S_{n,\gamma_0} - \Pi_2' \Lambda_{\gamma_0}' \Sigma_{TT,\gamma_0}^{-1} \sqrt{n} (T_{n,\gamma_0} - \Pi_2) \right), \tag{18}$$

and $H_0$ should be rejected when $LR^{*2}_{n,\gamma_0} > \text{const}$. It is easy to see that

$$LR^*_{n,\gamma_0} \to_d N\left( AsyVar(LR^*_{n,\gamma_0}), AsyVar(LR^*_{n,\gamma_0}) \right),$$

where

$$AsyVar(LR^*_{n,\gamma_0}) = \delta^2 \left( \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 + \Pi_2' \Lambda_{\gamma_0}' \Sigma_{TT,\gamma_0}^{-1} \Lambda_{\gamma_0} \Pi_2 \right).$$

Hence,

$$LR^{*2}_{n,\gamma_0} / AsyVar(LR^*_{n,\gamma_0}) \to_d \chi^2_1\left( \delta^2 \left( \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 + \Pi_2' \Lambda_{\gamma_0}' \Sigma_{TT,\gamma_0}^{-1} \Lambda_{\gamma_0} \Pi_2 \right) \right).$$

Therefore, when $\Pi_2$ is known, the power of an AUMPU test is described by a non-central $\chi^2_1$ distribution with the non-centrality parameter

$$\delta^2 \left( \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 + \Pi_2' \Lambda_{\gamma_0}' \Sigma_{TT,\gamma_0}^{-1} \Lambda_{\gamma_0} \Pi_2 \right).$$

Note that the second component in the above expression, $\delta^2 \Pi_2' \Lambda_{\gamma_0}' \Sigma_{TT,\gamma_0}^{-1} \Lambda_{\gamma_0} \Pi_2$, is due to the following part of $LR^*_{n,\gamma_0}$:

$$\delta \Pi_2' \Lambda_{\gamma_0}' \Sigma_{TT,\gamma_0}^{-1} \sqrt{n} (T_{n,\gamma_0} - \Pi_2).$$

This part is infeasible in practice, because $T_{n,\gamma_0}$ must be re-centered using the unknown nuisance parameter $\Pi_2$. Hence, one can expect that the effective upper bound on the power (when $\Pi_2$ is unknown) corresponds to the non-centrality parameter

$$\delta^2 \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2, \tag{19}$$

which is due to the first component in the LR statistic in (18).

This idea can be formalized using the approach of Choi et al. (1996). Consider a local perturbation $\Pi_{2,n} = \Pi_2 + \tau/\sqrt{n}$, and a joint test of

$$H_0 : \delta = 0 \text{ and } \tau = 0,$$

against $H_1 : \delta \ne 0$ and $\tau \ne 0$. The asymptotic experiment now takes the following form (after ignoring the negligible terms):

$$\begin{bmatrix} \sqrt{n} S_{n,\gamma_0} \\ \sqrt{n} (T_{n,\gamma_0} - \Pi_2) \end{bmatrix} \overset{a}{\sim} N\left( \begin{bmatrix} \Pi_2 \delta \\ \tau - \Lambda_{\gamma_0} \Pi_2 \delta \end{bmatrix}, \begin{pmatrix} \Sigma_{22,\gamma_0} & 0 \\ 0 & \Sigma_{44,\gamma_0} \end{pmatrix} \right).$$

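In this case, the LR statistic is

$$LR_{n,\gamma_0} = \delta \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \sqrt{n} S_{n,\gamma_0} + (\tau - \delta \Lambda_{\gamma_0} \Pi_2)' \Sigma_{TT,\gamma_0}^{-1} \sqrt{n} (T_{n,\gamma_0} - \Pi_2)$$

$$\to_d N\left( AsyVar(LR_{n,\gamma_0}), AsyVar(LR_{n,\gamma_0}) \right), \text{ where}$$

$$AsyVar(LR_{n,\gamma_0}) = \delta^2 \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 + (\tau - \delta \Lambda_{\gamma_0} \Pi_2)' \Sigma_{TT,\gamma_0}^{-1} (\tau - \delta \Lambda_{\gamma_0} \Pi_2).$$

Thus, given $\delta$ and $\tau$, an asymptotically most powerful unbiased (AMPU) level $\alpha$ test is

$$\text{Reject } H_0 \text{ when } LR^2_{n,\gamma_0} / AsyVar(LR_{n,\gamma_0}) > \chi^2_{1,1-\alpha},$$

where $\chi^2_{df,1-\alpha}$ is the $(1 - \alpha)$ quantile of the $\chi^2_{df}$ distribution. The power of an AMPU test is captured by the non-centrality parameter equal to $AsyVar(LR_{n,\gamma_0})$, since

$$LR^2_{n,\gamma_0} / AsyVar(LR_{n,\gamma_0}) \to_d \chi^2_1\left( AsyVar(LR_{n,\gamma_0}) \right).$$

The power-minimizing direction (in terms of disturbances to $\Pi_2$) is $\tau = \delta \Lambda_{\gamma_0} \Pi_2$, which results in the same effective upper bound as in (19).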

By comparing these results with (16) and (17), we conclude that, when the model is overidentified and IVs are strong, the KLM test is AUMPU, since its power attains the effective bound for the most powerful test, while the AR test is inefficient. Note that the concept of efficiency here has a semiparametric flavor, as we define efficiency in an asymptotic framework given the chosen estimators of reduced-form parameters.

4.2.2 Multiple endogenous regressors (m > 1)

The results for $m = 1$ can be easily extended to the case of $m > 1$. For given $\delta$ and $\tau$, an AMPU test is based on the following LR statistic:

$$LR_{n,\gamma_0} = \delta' \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \sqrt{n} S_{n,\gamma_0} + (\operatorname{vec}(\tau) - \Lambda_{\gamma_0} \Pi_2 \delta)' \Sigma_{TT,\gamma_0}^{-1} \sqrt{n} \operatorname{vec}(T_{n,\gamma_0} - \Pi_2)$$

$$\to_d N\left( AsyVar(LR_{n,\gamma_0}), AsyVar(LR_{n,\gamma_0}) \right), \text{ where}$$

$$AsyVar(LR_{n,\gamma_0}) = \delta' \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 \delta + (\operatorname{vec}(\tau) - \Lambda_{\gamma_0} \Pi_2 \delta)' \Sigma_{TT,\gamma_0}^{-1} (\operatorname{vec}(\tau) - \Lambda_{\gamma_0} \Pi_2 \delta).$$

Now, given $\delta$ and $\tau$, an AMPU level $\alpha$ test is

$$\text{Reject } H_0 \text{ when } LR^2_{n,\gamma_0} / AsyVar(LR_{n,\gamma_0}) > \chi^2_{m,1-\alpha}.$$

As before, the power of an AMPU test is captured by the non-centrality parameter equal to $AsyVar(LR_{n,\gamma_0})$, and the effective bound is $\delta' \Pi_2' \Sigma_{SS,\gamma_0}^{-1} \Pi_2 \delta$. When IVs are strong, the effective bound is attained by the KLM test.

4.3 Weak IVs

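In the case of weak IVs, the asymptotic experiment takes the following form:

$$\begin{bmatrix} \sqrt{n}\,S_{n,\gamma_0} \\ \sqrt{n}\,\mathrm{vec}(T_{n,\gamma_0}) \end{bmatrix} \overset{a}{\sim} N\left( \begin{bmatrix} C_2\delta \\ \mathrm{vec}(C_2) - \Lambda_{\gamma_0+\delta}C_2\delta \end{bmatrix}, \begin{pmatrix} \Sigma_{SS,\gamma_0+\delta} & 0 \\ 0 & \Sigma_{TT,\gamma_0+\delta} \end{pmatrix} \right).$$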

Note that now the asymptotic covariances depend on the deviations from H0, since we consider fixed alternatives. As a result, there is no AUMPU test even when C2 is known and m = 1. In this case, the asymptotically most powerful test is δ-specific, since the LR statistic depends on

$$2\delta C_2'\Sigma^{-1}_{22,\gamma_0}\sqrt{n}\,S_{n,\gamma_0} - \left(\sqrt{n}\,S_{n,\gamma_0} - \delta C_2\right)'\left(\Sigma^{-1}_{SS,\gamma_0+\delta} - \Sigma^{-1}_{SS,\gamma_0}\right)\left(\sqrt{n}\,S_{n,\gamma_0} - \delta C_2\right).$$

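It is also infeasible, since it depends on C2, which cannot be consistently estimated.

The KLM test has some undesirable properties in the weak IVs case. For simplicity, suppose that m = 1. In this case, the KLM statistic has a non-central $\chi^2_1$ distribution with the non-centrality parameter

$$\frac{\delta^2\left(\left(N_{\gamma_0+\delta} + (I_k - \delta\Lambda_{\gamma_0+\delta})C_2\right)'\Sigma^{-1}_{SS,\gamma_0+\delta}C_2\right)^2}{\left(N_{\gamma_0+\delta} + (I_k - \delta\Lambda_{\gamma_0+\delta})C_2\right)'\Sigma^{-1}_{SS,\gamma_0+\delta}\left(N_{\gamma_0+\delta} + (I_k - \delta\Lambda_{\gamma_0+\delta})C_2\right)},$$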

where $N_\gamma \overset{a}{\sim} N(0, \Sigma_{TT,\gamma})$. The non-centrality parameter is zero when δ = 0 (under H0); however, it can also be zero for δ ≠ 0 (under the alternative). This occurs for δ's satisfying the following equality:

$$\delta = \frac{(N_{\gamma_0+\delta} + C_2)'\Sigma^{-1}_{SS,\gamma_0+\delta}C_2}{C_2'\Sigma^{-1}_{22,\gamma_0+\delta}\Lambda_{\gamma_0+\delta}C_2}.$$

As a result, the KLM test will have only trivial power for such deviations from H0. The problem is illustrated in Figure 1.
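The loss of power can be checked numerically in a stylized special case. The sketch below evaluates the KLM non-centrality parameter for m = 1 conditionally on N = 0, under simplifying assumptions that are not in the paper: the covariance Σ_SS is the identity, Λ is diagonal, and neither varies with δ. The vectors `c2` and `lam_diag` are hypothetical values chosen for illustration.

```python
def klm_noncentrality(delta, c2, lam_diag):
    """KLM non-centrality for m = 1, conditional on N = 0, assuming
    Sigma_SS = I and a fixed diagonal Lambda (illustrative assumptions)."""
    # w = (I_k - delta * Lambda) C2, computed element by element
    w = [(1.0 - delta * l) * c for l, c in zip(lam_diag, c2)]
    numerator = delta ** 2 * sum(wi * ci for wi, ci in zip(w, c2)) ** 2
    denominator = sum(wi * wi for wi in w)
    return numerator / denominator if denominator > 0 else 0.0


def zero_power_delta(c2, lam_diag):
    """Non-zero delta at which the non-centrality vanishes when N = 0:
    delta = C2'C2 / (C2' Lambda C2) under the same assumptions."""
    return sum(c * c for c in c2) / sum(l * c * c for l, c in zip(lam_diag, c2))


c2 = [1.0, 2.0]          # hypothetical C2 with k = 2
lam_diag = [0.5, 0.25]   # hypothetical diagonal of Lambda
d0 = zero_power_delta(c2, lam_diag)
print(d0, klm_noncentrality(d0, c2, lam_diag), klm_noncentrality(1.0, c2, lam_diag))
```

In this example the non-centrality parameter is positive at δ = 1 but collapses to zero at δ = 10/3, so a test based on this statistic would reject no more often than under H0 at that alternative.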

4.4 Conditional LM (CLM) test

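The unattractive behavior of the KLM statistic in the case of weak IVs is due to the transformation that produces an estimator of $\Pi_{2,n}$ orthogonal to $S_{n,\gamma_0}$. Consider an alternative LM-type statistic that uses $\hat\Pi_{2,n}$ instead of $T_{n,\gamma_0}$:

$$CLM_{n,\gamma_0} = n\left(\hat\Pi_{2,n}'\Sigma^{-1}_{SS,\gamma_n}S_{n,\gamma_0}\right)'\left(\hat\Pi_{2,n}'\Sigma^{-1}_{SS,\gamma_n}\hat\Pi_{2,n}\right)^{-1}\left(\hat\Pi_{2,n}'\Sigma^{-1}_{SS,\gamma_n}S_{n,\gamma_0}\right).$$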

Figure 1: An example of the non-centrality parameter for the KLM test computed conditionally on $N_{\gamma_0+\delta} = 0$. [Plot omitted: horizontal axis δ, from −5 to 5; vertical axis from 0 to 10.]

Critical values for the CLM statistic have to be simulated conditional on $T_{n,\gamma_0}$ following the approach of Moreira (2003).

To simulate critical values, first generate $S^*_{n,r} \sim N(0, \Sigma^{-1}_{SS,\gamma_n}/n)$ for r = 1, …, R. Next, given $T_{n,\gamma_0}$, construct $\Pi^*_{2,n,r}$ according to

$$\mathrm{vec}(\Pi^*_{2,n,r}) = \mathrm{vec}(T_{n,\gamma_0}) + \Sigma_{\Pi S,\gamma_n}\Sigma^{-1}_{SS,\gamma_n}S^*_{n,r}$$

for each r = 1, …, R, and compute

$$CLM^*_{n,r} = n\left((\Pi^*_{2,n,r})'\Sigma^{-1}_{SS,\gamma_n}S^*_{n,r}\right)'\left((\Pi^*_{2,n,r})'\Sigma^{-1}_{SS,\gamma_n}\Pi^*_{2,n,r}\right)^{-1}\left((\Pi^*_{2,n,r})'\Sigma^{-1}_{SS,\gamma_n}S^*_{n,r}\right).$$

The critical value for the CLM test is given by the (1 − α)-th empirical quantile of $\{CLM^*_{n,r} : r = 1, \ldots, R\}$. Note that in the case of strong IVs,

$$CLM_{n,\gamma_0} \to_d \chi^2_m\left(\delta'\Pi_2'\Sigma^{-1}_{SS,\gamma_0}\Pi_2\delta\right),$$

and, since $\gamma_n = \gamma_0 + \delta/\sqrt{n}$,

$$CLM^*_{n,r} \to_d \chi^2_m(0).$$

Hence, the CLM test preserves the efficiency of the KLM test under strong IVs. Moreover, it has attractive power properties in the case of weak IVs, as illustrated in Section 6. In particular, under some circumstances, it can uniformly dominate the KLM and CLR tests.
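The simulation steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the matrices are taken as given inputs, and the covariance used to draw $S^*_{n,r}$ is left as an explicit argument (`s_cov`) so that the caller supplies the matrix prescribed by the recipe. The sketch assumes NumPy is available and that Π2 is stored as a k × m array.

```python
import numpy as np


def clm_stat(pi2, s, sigma_ss_inv, n):
    """n * (Pi' Sinv S)' (Pi' Sinv Pi)^{-1} (Pi' Sinv S) for Pi (k, m), S (k,)."""
    score = pi2.T @ sigma_ss_inv @ s        # m-vector
    weight = pi2.T @ sigma_ss_inv @ pi2     # m x m matrix
    return float(n * score @ np.linalg.solve(weight, score))


def simulate_clm_critical_value(t_hat, sigma_ss, sigma_pi_s, s_cov, n,
                                alpha=0.05, reps=2000, seed=0):
    """Empirical (1 - alpha) quantile of the simulated CLM* statistics.

    t_hat: (k, m) value of T to condition on; sigma_ss: (k, k);
    sigma_pi_s: (k*m, k); s_cov: (k, k) covariance of the S* draws
    (supplied by the caller; an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    k, m = t_hat.shape
    sigma_ss_inv = np.linalg.inv(sigma_ss)
    adjust = sigma_pi_s @ sigma_ss_inv       # maps S* into vec(Pi*) - vec(T)
    vec_t = t_hat.reshape(-1, order="F")     # column-stacking vec operator
    draws = rng.multivariate_normal(np.zeros(k), s_cov, size=reps)
    stats = np.empty(reps)
    for r, s_star in enumerate(draws):
        pi_star = (vec_t + adjust @ s_star).reshape((k, m), order="F")
        stats[r] = clm_stat(pi_star, s_star, sigma_ss_inv, n)
    return float(np.quantile(stats, 1.0 - alpha))


# Illustrative call with hypothetical inputs (k = 2, m = 1, identity covariance)
cv = simulate_clm_critical_value(
    t_hat=np.array([[1.0], [1.0]]), sigma_ss=np.eye(2),
    sigma_pi_s=np.zeros((2, 2)), s_cov=np.eye(2) / 1000, n=1000,
    alpha=0.05, reps=20000, seed=0)
print(cv)
```

In the illustrative call the cross-covariance is set to zero and the covariance is the identity, so the simulated statistic is exactly χ² with one degree of freedom and the simulated critical value should be close to the usual 5% value of about 3.84.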


5 Efficient identification-failure-robust inference in MDE models

The results of the previous section can be easily extended to general MDE models. Suppose that one is interested in tests that remain valid when identification is weak, are efficient when identification is strong, and have good power properties when identification is weak. Suppose that identification is strong in the sense that

$$\frac{\partial h(\pi, \gamma_n)}{\partial\gamma'}$$

is fixed and has rank m. Consider local alternatives of the form

$$\gamma_n = \gamma_0 + \delta/\sqrt{n}.$$

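Recall that $\sqrt{n}(\pi_n - \pi) \to_d N(0, \Omega)$, and define

$$\sqrt{n}\,S_{n,\gamma_0} = \sqrt{n}\,h(\pi_n, \gamma_0) \overset{a}{\sim} N\left(\frac{\partial h(\pi, \gamma_0)}{\partial\gamma'}\delta,\ \Sigma_{SS,\gamma_0}\right),$$

where

$$\Sigma_{SS,\gamma_0} = \frac{\partial h(\pi, \gamma_0)}{\partial\pi'}\,\Omega_{\gamma_0}\,\frac{\partial h(\pi, \gamma_0)'}{\partial\pi}.$$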

From the results of the previous section, when IVs are strong, the power of an AMPU test of H0 : δ = 0 against H1 : δ ≠ 0 is described by a non-central $\chi^2_m$ distribution with a non-centrality parameter

$$\delta'\,\frac{\partial h(\pi, \gamma_0)}{\partial\gamma'}\,\Sigma^{-1}_{SS,\gamma_0}\,\frac{\partial h(\pi, \gamma_0)'}{\partial\gamma}\,\delta.$$

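Define

$$H_{\gamma_n} = \frac{\partial}{\partial\pi'}\mathrm{vec}\left(\frac{\partial h(\pi, \gamma_n)}{\partial\gamma'}\right)\Omega_{\gamma_n}\frac{\partial h(\pi, \gamma_n)'}{\partial\pi},$$

and an estimator $T_{n,\gamma_0}$ of $\partial h(\pi, \gamma_0)/\partial\gamma'$ which is asymptotically independent of $\sqrt{n}\,S_{n,\gamma_0}$:

$$\mathrm{vec}(T_{n,\gamma_0}) = \mathrm{vec}\left(\frac{\partial h(\pi_n, \gamma_0)}{\partial\gamma'}\right) - H_n\Sigma^{-1}_{SS,n}\,h(\pi_n, \gamma_0),$$

where

$$H_n = \frac{\partial}{\partial\pi'}\mathrm{vec}\left(\frac{\partial h(\pi_n, \gamma_0)}{\partial\gamma'}\right)\Omega_n\frac{\partial h(\pi_n, \gamma_0)'}{\partial\pi}, \qquad \Sigma_{SS,n} = \frac{\partial h(\pi_n, \gamma_0)}{\partial\pi'}\,\Omega_n\,\frac{\partial h(\pi_n, \gamma_0)'}{\partial\pi}.$$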


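With these definitions, the KLM and CLM statistics are given by

$$KLM_{n,\gamma_0} = n\,S_{n,\gamma_0}'\left(\Sigma^{-1}_{SS,n}T_{n,\gamma_0}\right)\left(T_{n,\gamma_0}'\Sigma^{-1}_{SS,n}T_{n,\gamma_0}\right)^{-1}\left(T_{n,\gamma_0}'\Sigma^{-1}_{SS,n}\right)S_{n,\gamma_0},$$

and

$$CLM_{n,\gamma_0} = n\left(\frac{\partial h(\pi_n, \gamma_0)}{\partial\gamma'}\Sigma^{-1}_{SS,n}S_{n,\gamma_0}\right)'\left(\frac{\partial h(\pi_n, \gamma_0)}{\partial\gamma'}\Sigma^{-1}_{SS,n}\frac{\partial h(\pi_n, \gamma_0)'}{\partial\gamma}\right)^{-1}\left(\frac{\partial h(\pi_n, \gamma_0)}{\partial\gamma'}\Sigma^{-1}_{SS,n}S_{n,\gamma_0}\right).$$

One can use χ² critical values for the KLM test. For the CLM test, critical values have to be simulated similarly to those described in Section 4.4.

Both tests (KLM and CLM) attain the efficiency bound under strong IVs; however, we expect better power properties for the CLM test when IVs are weak.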

6 Monte Carlo simulations

In our Monte Carlo experiments, we generated n = 1,000 observations from the following version of the censored structural regression model:

$$y_i = \max\{0, y_i^*\},$$

$$y_i^* = \beta_0 + (\gamma_0 + \delta)Y_i + U_i,$$

$$Y_i = \Pi_2 Z_i + V_i,$$

i.e., Yi is a single endogenous regressor (m = 1), and the simulated model contains no exogenous regressors. The errors Ui and Vi are simulated from a bivariate normal distribution:

$$\begin{bmatrix} U_i \\ V_i \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}\right),$$

where the correlation coefficient ρ measures the degree of endogeneity. The number of IVs is set to k = 2, so the model is overidentified. The vector of instruments Zi was generated independently of the errors from a bivariate standard normal distribution.
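A minimal sketch of this data-generating process is given below, assuming NumPy is available. The function name and the default values of `beta0` and `gamma0` are placeholders chosen for illustration; the values used in the actual experiments are not stated in this section.

```python
import numpy as np


def simulate_censored_iv(n, pi2, rho, gamma0, delta=0.0, beta0=0.0, seed=0):
    """Draw (y, Y, Z) from the censored structural model with k = 2 IVs.

    gamma0 and beta0 are hypothetical placeholders, not the paper's values.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 2))                       # instruments Z_i
    errors = rng.multivariate_normal(
        [0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)     # (U_i, V_i)
    u, v = errors[:, 0], errors[:, 1]
    y_endog = z @ np.asarray(pi2, dtype=float) + v        # Y_i = Pi2 Z_i + V_i
    y_star = beta0 + (gamma0 + delta) * y_endog + u       # latent outcome
    y = np.maximum(0.0, y_star)                           # censoring at zero
    return y, y_endog, z


# Strong-IV design from the text: Pi2 = [1, 1]; gamma0 = 1.0 is a placeholder
y, y_endog, z = simulate_censored_iv(1000, [1.0, 1.0], rho=-0.5, gamma0=1.0)
print(y.shape, float((y == 0).mean()))
```

The share of censored observations depends on the placeholder values of `beta0` and `gamma0`, so it should be checked against the intended design before the draws are used.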

6.1 Size distortions of the standard approach

We used 2,000 Monte Carlo replications. For each replication, using the quantile value τ = 0.75, we computed Amemiya's FGLS estimator of γ as described in Section 3 and Appendix A (Amemiya's OLS estimator was used as a preliminary estimator of γ). Given those estimates and their standard errors, we tested the null hypothesis using a usual Wald test that rejects H0 when $Wald_{\gamma_0,n} > \chi^2_{1,1-\alpha}$. We recorded the simulated rejection frequencies for three levels of significance: α ∈ {0.01, 0.05, 0.10}. For each replication, we also tested the same null hypothesis using the AR test.

Table 1: Simulated null rejection probabilities of a standard Wald test for different levels of significance (α), strength of identification (Π2), and endogeneity (ρ). Strong IVs: Π2 = [1, 1]; weak IVs: Π2 = [0.01, 0.01].

| ρ | Strong IVs, α = 0.01 | Strong IVs, α = 0.05 | Strong IVs, α = 0.10 | Weak IVs, α = 0.01 | Weak IVs, α = 0.05 | Weak IVs, α = 0.10 |
|---|---|---|---|---|---|---|
| −0.50 | 0.0105 | 0.0360 | 0.0605 | 0.0065 | 0.0290 | 0.0625 |
| −0.75 | 0.0105 | 0.0325 | 0.0545 | 0.0435 | 0.1165 | 0.1775 |
| −0.95 | 0.0085 | 0.0290 | 0.0510 | 0.3095 | 0.4380 | 0.5300 |

The results for the usual Wald test are reported in Table 1. Note that those are rejection probabilities under H0. When identification is strong, the Wald test performs well, with no over-rejection, for all considered values of the endogeneity parameter. However, as the last three columns of the table indicate, the test does have substantial size distortions when identification is weak and endogeneity is strong. For example, with ρ = −0.95, the asymptotic 5% test has a null rejection probability of 43.8%.

Table 2: Simulated null rejection probabilities of the AR test for different levels of significance (α), strength of identification (Π2), and endogeneity (ρ). Strong IVs: Π2 = [1, 1]; weak IVs: Π2 = [0.01, 0.01].

| ρ | Strong IVs, α = 0.01 | Strong IVs, α = 0.05 | Strong IVs, α = 0.10 | Weak IVs, α = 0.01 | Weak IVs, α = 0.05 | Weak IVs, α = 0.10 |
|---|---|---|---|---|---|---|
| −0.50 | 0.0170 | 0.0655 | 0.1100 | 0.0180 | 0.0670 | 0.1125 |
| −0.75 | 0.0140 | 0.0560 | 0.1025 | 0.0100 | 0.0510 | 0.1005 |
| −0.95 | 0.0115 | 0.0450 | 0.0960 | 0.0115 | 0.0500 | 0.0975 |

Table 2 reports the simulated null rejection probabilities of the AR test. Unlike the usual Wald test, the AR test does not show any significant size distortions under either the strong or the weak IVs design for all considered values of the endogeneity parameter.
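When reading Tables 1 and 2, it can help to keep the simulation noise in mind. The sketch below computes the binomial standard error of a simulated rejection frequency under the nominal level, and the resulting standardized distance for the Wald entry quoted above (0.4380 at the 5% level with 2,000 replications). The function names are illustrative, and the normal approximation to the binomial is an assumption of this sketch.

```python
def rejection_rate_se(nominal: float, replications: int) -> float:
    """Standard error of a simulated rejection frequency when the true
    rejection probability equals the nominal level."""
    return (nominal * (1.0 - nominal) / replications) ** 0.5


def size_distortion_z(simulated: float, nominal: float, replications: int) -> float:
    """Standardized distance between simulated and nominal rejection rates."""
    return (simulated - nominal) / rejection_rate_se(nominal, replications)


se_5pct = rejection_rate_se(0.05, 2000)
z_wald = size_distortion_z(0.4380, 0.05, 2000)
print(round(se_5pct, 4), round(z_wald, 1))
```

With 2,000 replications the standard error at the 5% level is roughly half a percentage point, so the weak-IV Wald rejection rate of 43.8% lies far outside what simulation noise could explain.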

Figure 2: Power curves under strong IVs and high degree of endogeneity: λ = 22.5 and ρ = 0.9. [Plot omitted: power of tests (vertical axis, 0 to 1) against delta (horizontal axis, −2 to 2) for the KLM, CLR, and CLM tests.]

6.2 Power comparisons for the KLM, CLR, and CLM tests

Next, we consider the power properties of the KLM, CLR, and CLM tests. The power of the tests depends on the non-centrality parameter, which in our Monte Carlo design is given by

$$\lambda = n\,\Pi_2'\Pi_2.$$

We therefore compare the power of the tests for different values of λ, ρ, and δ. Figure 2 shows the power curves for the three tests in the case of strong IVs with λ = 22.5 and ρ = 0.9. The three power curves are very close to each other, which reflects the fact that the three tests are efficient when IVs are strong.

In the case of weak IVs (λ = 4.9), the simulated results are drastically different. Figure 3 shows that the CLM test can uniformly dominate the CLR and KLM tests. Note that according to the results in Andrews et al. (2006), in the linear IV regression model with homoskedastic errors, the CLR test is numerically very close to the infeasible most powerful tests in terms of average power. We conclude that the numerical near-efficiency of the CLR test does not hold in more general models.

A Appendix: Semiparametric structural equations with censored dependent variables

In this section, we provide details on Powell's semiparametric censored regression approach when it is applied to a model with endogenous regressors.

Figure 3: Power curves under weak IVs and low degree of endogeneity: λ = 4.9 and ρ = 0.4. [Plot omitted: power of tests (vertical axis, 0 to 0.7) against delta (horizontal axis, −2 to 2) for the KLM, CLR, and CLM tests.]

Consider first a censored regression model with only exogenous regressors:

$$y_i = \max\{0, \mu + X_i'\beta + u_i\}, \qquad (20)$$

where μ ∈ R and β ∈ R^k are unknown parameters, Xi and yi are the observed k-vector of regressors and the dependent variable, respectively, and ui is the unobserved error term. Assume that conditional on Xi, the error ui is distributed according to the CDF F(·|X). When F is assumed to belong to some known family of distributions, e.g. normal, μ and β can be estimated using parametric methods such as maximum likelihood; see, for example, Amemiya (1984). Unfortunately, such an estimator will be inconsistent if the error's distribution is incorrectly specified. Powell (1984, 1986) suggested an alternative distribution-free approach based on quantile regression techniques (see, for example, Koenker (2005)).

Let $F^{-1}(\tau|X_i)$ denote the τ-th conditional quantile of ui given Xi, and assume that it is independent of Xi:

$$P_X\left(u_i \le F^{-1}(\tau|X_i)\right) = P_X(u_i \le q_\tau) = \tau, \qquad (21)$$

almost surely for some constant qτ, where $P_X$ denotes the conditional probability given Xi. A sufficient condition for (21) is independence between ui and Xi. Assume that F is continuously differentiable with positive density in the neighborhood of qτ. Let Q(τ|Xi) denote the τ-th conditional quantile of yi given Xi. Then,

$$Q(\tau|X_i) = \max\{0, \mu + q_\tau + X_i'\beta\}. \qquad (22)$$


According to equation (22), knowledge of the errors' distribution is not required in order to identify the effect of Xi on yi. The coefficient β is identified by Q(τ|Xi) provided that $P(\mu + q_\tau + X_i'\beta > 0) > 0$ and the quantile independence condition (21) holds. At the same time, the conditional expectation of yi given Xi does not identify β even when ui and Xi are independent.

Define

$$\mu_\tau = \mu + q_\tau.$$

The parameters μτ and β can also be defined as a solution to the following minimization problem:

$$\min_{a \in R,\ b \in R^k} E\left[\rho_\tau\left(y_i - \max\{0, a + X_i'b\}\right) \mid X_i\right],$$

where $\rho_\tau(x) = x(\tau - 1(x < 0))$ is the so-called check function (see Koenker (2005)). Given the data $\{(y_i, X_i') : i = 1, \ldots, n\}$ generated according to (20) with $F^{-1}(\tau|X_i) = q_\tau$ and $P(\mu_\tau + X_i'\beta > 0) > 0$, the estimators of μτ and β, $\mu_{\tau,n}$ and $\beta_n$, are obtained by solving

$$\min_{a \in R,\ b \in R^k} n^{-1}\sum_{i=1}^n \rho_\tau\left(y_i - \max\{0, a + X_i'b\}\right).$$
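The sample objective above is simple to evaluate directly. The following minimal sketch implements the check function and the censored quantile regression objective for a candidate (a, b). The function names and the toy data are illustrative assumptions, and no minimization routine is included, since the objective is non-convex and the choice of optimizer is outside the scope of this section.

```python
def check_function(x: float, tau: float) -> float:
    """rho_tau(x) = x * (tau - 1(x < 0))."""
    return x * (tau - (1.0 if x < 0 else 0.0))


def censored_quantile_objective(a, b, y, x_rows, tau):
    """Average check loss of y_i - max{0, a + X_i'b} over the sample."""
    total = 0.0
    for y_i, x_i in zip(y, x_rows):
        index = a + sum(b_j * x_ij for b_j, x_ij in zip(b, x_i))
        fitted = max(0.0, index)          # censoring of the fitted quantile
        total += check_function(y_i - fitted, tau)
    return total / len(y)


# Toy data (hypothetical): one regressor, censoring binds for the first row
y = [0.0, 1.0, 3.0]
x_rows = [[-1.0], [1.0], [3.0]]
print(censored_quantile_objective(0.0, [1.0], y, x_rows, tau=0.75))
print(censored_quantile_objective(0.0, [0.5], y, x_rows, tau=0.75))
```

Note that observations with a non-positive fitted index contribute a loss that does not vary locally with (a, b), which is why the estimator is informative only when the index is positive for enough observations.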

The quantile-based estimator $\beta_n$ will be informative about β provided that $\mu_\tau + X_i'\beta > 0$ for a sufficiently large number of observations, i.e., $P(\mu_\tau + X'\beta > 0)$ is large enough.

Structural equations with limited dependent variables and normal errors have been studied in Amemiya (1978, 1979).⁴ Here, we adopt Powell's approach, allowing for distribution-free quantile-based estimation of some of the reduced-form parameters. Those estimates will be used for identification-failure-robust inference on structural parameters.

The model is now:

$$y_i = \max\{0, y_i^*\}, \qquad (23)$$

$$y_i^* = \mu + \gamma'Y_i + \beta'X_i + u_i, \qquad (24)$$

$$Y_i = \theta + \Pi_1 X_i + \Pi_{2,n} Z_i + V_i, \qquad (25)$$

where yi is the observed censored dependent variable, $y_i^*$ is the latent dependent variable, and we separated the constant from the vector of other exogenous regressors Xi. The reduced-form equation for the latent dependent variable $y_i^*$ is

$$y_i^* = \lambda + \pi_1'X_i + \pi_{2,n}'Z_i + v_i, \quad \text{where} \quad v_i = u_i + \gamma'V_i,$$

and the reduced-form equation for yi is given by

$$y_i = \max\{0, \lambda + \pi_1'X_i + \pi_{2,n}'Z_i + v_i\},$$

where $\lambda = \mu + \gamma'\theta$.

⁴ See also Section 9.4 in Amemiya (1984).

Next, we discuss how to construct the estimators of the reduced-form parameters and Ωγ. We apply the Powell (1986) quantile-based method to estimate the coefficients of the reduced-form equation for yi. Let f(v|Xi, Zi) and F(v|Xi, Zi) denote the conditional PDF and CDF, respectively, of vi given Xi and Zi. We assume that the τ-th conditional quantile of vi is independent of Xi and Zi:

$$F^{-1}(\tau|X_i, Z_i) = q_\tau \quad \text{almost surely for all } i.$$

Using this assumption, consistent and asymptotically normal estimators of $\pi_1$ and $\pi_{2,n}$ are given by

$$\left(\hat\lambda_{\tau,n}, \hat\pi_{1,n}, \hat\pi_{2,n}\right) = \arg\min_{c \in R,\ b_1 \in R^l,\ b_2 \in R^k} n^{-1}\sum_{i=1}^n \rho_\tau\left(y_i - \max\{0, c + b_1'X_i + b_2'Z_i\}\right).$$

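The coefficients of the reduced-form equation for Yi can be estimated by OLS:

$$\begin{bmatrix} \hat\Pi_{1,n} & \hat\Pi_{2,n} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^n \tilde Y_i\tilde X_i' & \sum_{i=1}^n \tilde Y_i\tilde Z_i' \end{bmatrix} \begin{bmatrix} \sum_{i=1}^n \tilde X_i\tilde X_i' & \sum_{i=1}^n \tilde X_i\tilde Z_i' \\ \sum_{i=1}^n \tilde Z_i\tilde X_i' & \sum_{i=1}^n \tilde Z_i\tilde Z_i' \end{bmatrix}^{-1}, \qquad (26)$$

where $\tilde X_i = X_i - n^{-1}\sum_{j=1}^n X_j$ and $\tilde Y_i = Y_i - n^{-1}\sum_{j=1}^n Y_j$ (with $\tilde Z_i$ defined analogously).

The variance-covariance matrix of the reduced-form estimators, Ωγ, can be derived as follows. Define $\lambda_\tau = \lambda + q_\tau$. By the asymptotic linearity result for censored quantile regression estimators (Powell, 1986, equation (3.5)),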

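$$n^{1/2}\begin{bmatrix} \hat\pi_1 - \pi_1 \\ \hat\pi_2 - \pi_{2,n} \end{bmatrix} = M_1^{-1}\, n^{-1/2}\sum_{i=1}^n d_{\tau,i} + o_p(1), \quad \text{where}$$

$$M_1 = E\left( 1\left(\lambda_\tau + \pi_1'X_i + \pi_{2,n}'Z_i > 0\right) f(q_\tau|X_i, Z_i) \begin{bmatrix} X_iX_i' & X_iZ_i' \\ Z_iX_i' & Z_iZ_i' \end{bmatrix} \right),$$

$$d_{\tau,i} = 1\left(\lambda_\tau + \pi_1'X_i + \pi_{2,n}'Z_i > 0\right)\left(\tau - 1(v_i < q_\tau)\right)\begin{bmatrix} X_i \\ Z_i \end{bmatrix}.$$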


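Using the properties of the vec operator and Kronecker product,⁵ we can write

$$n^{1/2}\begin{bmatrix} \mathrm{vec}(\hat\Pi_1 - \Pi_1) \\ \mathrm{vec}(\hat\Pi_2 - \Pi_2) \end{bmatrix} = \left(M_2^{-1} \otimes I_m\right) n^{-1/2}\sum_{i=1}^n D_i + o_p(1),$$

$$D_i = \begin{bmatrix} \mathrm{vec}\left(V_i(X_i - EX_i)'\right) \\ \mathrm{vec}\left(V_i(Z_i - EZ_i)'\right) \end{bmatrix},$$

$$M_2 = E\begin{bmatrix} (X_i - EX_i)(X_i - EX_i)' & (X_i - EX_i)(Z_i - EZ_i)' \\ (Z_i - EZ_i)(X_i - EX_i)' & (Z_i - EZ_i)(Z_i - EZ_i)' \end{bmatrix}. \qquad (27)$$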

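We have now

$$\Omega_\gamma = \begin{bmatrix} M_1^{-1} & 0 \\ 0 & M_2^{-1} \otimes I_m \end{bmatrix} \begin{bmatrix} E[d_{\tau,i}d_{\tau,i}'] & E[d_{\tau,i}D_i'] \\ E[D_id_{\tau,i}'] & E[D_iD_i'] \end{bmatrix} \begin{bmatrix} M_1^{-1} & 0 \\ 0 & M_2^{-1} \otimes I_m \end{bmatrix}. \qquad (28)$$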

Note that Ωγ depends on γ through $d_{\tau,i}$, which in turn depends on γ through vi.

A consistent estimator of Ωγ can be constructed by replacing each of the components in (28) with its consistent estimator. Define $\hat y^*_{\tau,i} = \hat\lambda_{\tau,n} + \hat\pi_{1,n}'X_i + \hat\pi_{2,n}'Z_i$ and $\hat v_{\tau,i} = y_i - \hat y^*_{\tau,i}$. Define further $\hat d_{\tau,i} = 1(\hat y^*_{\tau,i} > 0)(\tau - 1(\hat v_{\tau,i} < 0))[X_i'\ Z_i']'$. Lastly, let $\hat V_i$ denote the vector of OLS residuals of the reduced-form equation for Yi, and let $\hat D_i$ be defined similarly to $D_i$ but with $V_i$, $X_i - EX_i$, and $Z_i - EZ_i$ replaced by $\hat V_i$, $\tilde X_i$, and $\tilde Z_i$, respectively. With those definitions, $M_1$ can be estimated by

$$(nc_n)^{-1}\sum_{i=1}^n 1(\hat y^*_{\tau,i} > 0)\,1(0 \le \hat v_{\tau,i} \le c_n)\begin{bmatrix} X_iX_i' & X_iZ_i' \\ Z_iX_i' & Z_iZ_i' \end{bmatrix},$$

where $c_n$ is such that $c_n \to 0$ and $n^{1/2}c_n \to \infty$ as n → ∞ (see Powell (1986) and Chapter 3.4.2 in Koenker (2005)). One can choose, for example, $c_n \sim n^{-1/5}$. The matrix $M_2$ can be estimated using its definition in (27) by replacing the expectation with the sample average, and $X_i - EX_i$ and $Z_i - EZ_i$ with $\tilde X_i$ and $\tilde Z_i$, respectively. The matrix $E[d_{\tau,i}d_{\tau,i}']$ can be estimated by

$$\tau(1 - \tau)\,n^{-1}\sum_{i=1}^n 1(\hat y^*_{\tau,i} > 0)\begin{bmatrix} X_iX_i' & X_iZ_i' \\ Z_iX_i' & Z_iZ_i' \end{bmatrix}$$

(see, for example, Powell (1986) and Chapter 3.2.3 in Koenker (2005)). Lastly, we estimate $E[d_{\tau,i}D_i']$ and $E[D_iD_i']$ by $n^{-1}\sum_{i=1}^n \hat d_{\tau,i}\hat D_i'$ and $n^{-1}\sum_{i=1}^n \hat D_i\hat D_i'$, respectively.
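The two sample analogues for the quantile-regression block can be sketched as follows, assuming NumPy is available. The function name and argument layout are hypothetical: the rows of `w` stack the regressors and instruments for each observation, and the censored quantile regression estimates (`intercept`, `slopes`) are taken as given rather than computed here.

```python
import numpy as np


def estimate_m1_and_dd(y, w, intercept, slopes, tau, c_n):
    """Sample analogues of M1 and E[d d'] from the formulas in the text.

    y: (n,) censored outcomes; w: (n, p) rows stacking (X_i', Z_i');
    intercept, slopes: reduced-form quantile estimates (given, not estimated).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.shape[0]
    y_star_hat = intercept + w @ np.asarray(slopes, dtype=float)
    v_hat = y - y_star_hat
    uncensored = y_star_hat > 0                    # 1(y*_hat > 0)
    in_band = (v_hat >= 0) & (v_hat <= c_n)        # 1(0 <= v_hat <= c_n)
    w_band = w[uncensored & in_band]
    w_unc = w[uncensored]
    m1_hat = (w_band.T @ w_band) / (n * c_n)
    dd_hat = tau * (1.0 - tau) * (w_unc.T @ w_unc) / n
    return m1_hat, dd_hat


# Toy inputs (hypothetical), with a single regressor and c_n = 0.1
m1_hat, dd_hat = estimate_m1_and_dd(
    y=[1.05, 2.5, 3.0, 0.0], w=[[1.0], [2.0], [3.0], [-1.0]],
    intercept=0.0, slopes=[1.0], tau=0.5, c_n=0.1)
print(m1_hat, dd_hat)
```

In applications the bandwidth would be tied to the sample size, for instance proportional to n ** (-1 / 5) as suggested in the text, with the constant of proportionality left to the user.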

Given the above definitions and estimators, one can construct estimators of the structural parameters following Amemiya's approach described in Section 3. Alternatively, one can perform one of the identification-failure-robust tests.

⁵ Given two matrices A and B, vec(BA) = (A′ ⊗ I_h) vec(B), where h is the number of rows in B; see Exercise 10.19(a) in Abadir and Magnus (2005).
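The vec and Kronecker product identity used in the derivation can be verified numerically. The short sketch below, which assumes NumPy is available, checks vec(BA) = (A′ ⊗ I_h) vec(B) on randomly generated matrices, with vec stacking columns. The matrix dimensions are arbitrary illustrative choices.

```python
import numpy as np


def vec(matrix):
    """Stack the columns of a matrix into a single vector."""
    return matrix.reshape(-1, order="F")


rng = np.random.default_rng(0)
B = rng.standard_normal((3, 4))   # h = 3 rows
A = rng.standard_normal((4, 2))

lhs = vec(B @ A)
rhs = np.kron(A.T, np.eye(B.shape[0])) @ vec(B)
print(np.allclose(lhs, rhs))
```

The column-major reshape matters here: with row-major stacking the identity would take a different form.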

B Appendix: Derivation of the GLS estimator of γ in equation (14)

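Define

$$\Sigma_{1\cdot2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'.$$

By the formula for the inverse of a partitioned matrix,

$$H'\Sigma^{-1}H = \begin{bmatrix} \Pi_1 & \Pi_2 \\ I_l & 0_{l\times k} \end{bmatrix} \begin{bmatrix} \Sigma_{1\cdot2}^{-1} & -\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} \\ -\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{1\cdot2}^{-1} & \Sigma_{22}^{-1}\left(I + \Sigma_{12}'\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\right) \end{bmatrix} \begin{bmatrix} \Pi_1' & I_l \\ \Pi_2' & 0_{k\times l} \end{bmatrix}$$

$$= \begin{bmatrix} Q_{11} & \left(\Pi_1 - \Pi_2\Sigma_{22}^{-1}\Sigma_{12}'\right)\Sigma_{1\cdot2}^{-1} \\ \Sigma_{1\cdot2}^{-1}\left(\Pi_1' - \Sigma_{12}\Sigma_{22}^{-1}\Pi_2'\right) & \Sigma_{1\cdot2}^{-1} \end{bmatrix},$$

where

$$Q_{11} = \Pi_1\Sigma_{1\cdot2}^{-1}\Pi_1' - \Pi_2\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{1\cdot2}^{-1}\Pi_1' - \Pi_1\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Pi_2' + \Pi_2\Sigma_{22}^{-1}\left(I + \Sigma_{12}'\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\right)\Pi_2'.$$

Next,

$$\left(H'\Sigma^{-1}H\right)^{-1} = \begin{bmatrix} \left(\Pi_2\Sigma_{22}^{-1}\Pi_2'\right)^{-1} & -\left(\Pi_2\Sigma_{22}^{-1}\Pi_2'\right)^{-1}\left(\Pi_1 - \Pi_2\Sigma_{22}^{-1}\Sigma_{12}'\right) \\ -\left(\Pi_1' - \Sigma_{12}\Sigma_{22}^{-1}\Pi_2'\right)\left(\Pi_2\Sigma_{22}^{-1}\Pi_2'\right)^{-1} & Q_{22} \end{bmatrix}, \qquad (29)$$

where

$$Q_{22} = \Sigma_{1\cdot2} + \left(\Pi_1' - \Sigma_{12}\Sigma_{22}^{-1}\Pi_2'\right)\left(\Pi_2\Sigma_{22}^{-1}\Pi_2'\right)^{-1}\left(\Pi_1 - \Pi_2\Sigma_{22}^{-1}\Sigma_{12}'\right).$$

Similarly,

$$H'\Sigma^{-1} = \begin{bmatrix} \Pi_1\Sigma_{1\cdot2}^{-1} - \Pi_2\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{1\cdot2}^{-1} & -\Pi_1\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} + \Pi_2\Sigma_{22}^{-1}\left(I + \Sigma_{12}'\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\right) \\ \Sigma_{1\cdot2}^{-1} & -\Sigma_{1\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} \end{bmatrix}. \qquad (30)$$

The result follows from (29) and (30).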
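The block expressions in this appendix can be checked numerically. The sketch below, which assumes NumPy is available, draws hypothetical matrices Π1 (m × l), Π2 (m × k), and a positive definite Σ, forms H′Σ⁻¹H directly, and compares selected blocks of that matrix and of its inverse with the closed-form expressions. The dimensions and random inputs are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, l, k = 1, 2, 3                          # hypothetical dimensions
pi1 = rng.standard_normal((m, l))
pi2 = rng.standard_normal((m, k))
root = rng.standard_normal((l + k, l + k))
sigma = root @ root.T + (l + k) * np.eye(l + k)   # positive definite Sigma

s11, s12, s22 = sigma[:l, :l], sigma[:l, l:], sigma[l:, l:]
s22_inv = np.linalg.inv(s22)
s1_2 = s11 - s12 @ s22_inv @ s12.T         # Sigma_{1.2}
s1_2_inv = np.linalg.inv(s1_2)

# H' = [[Pi1, Pi2], [I_l, 0]] as in the appendix
h_t = np.block([[pi1, pi2], [np.eye(l), np.zeros((l, k))]])
middle = h_t @ np.linalg.inv(sigma) @ h_t.T       # H' Sigma^{-1} H
inverse = np.linalg.inv(middle)

core = np.linalg.inv(pi2 @ s22_inv @ pi2.T)       # (Pi2 S22^{-1} Pi2')^{-1}
gap = pi1 - pi2 @ s22_inv @ s12.T                 # Pi1 - Pi2 S22^{-1} S12'
q22 = s1_2 + gap.T @ core @ gap

print(np.allclose(middle[m:, m:], s1_2_inv),
      np.allclose(middle[:m, m:], gap @ s1_2_inv),
      np.allclose(inverse[:m, :m], core),
      np.allclose(inverse[m:, m:], q22))
```

A check of this kind does not replace the algebra, but it is a quick way to catch transcription slips in the partitioned expressions.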


References

Abadir, K. M., Magnus, J. R., 2005. Matrix Algebra. Vol. 1 of Econometric Exercises. Cambridge University Press, New York.

Amemiya, T., 1978. The estimation of a simultaneous equation generalized probit model. Econometrica 46 (5), 1193–1205.

Amemiya, T., 1979. The estimation of a simultaneous-equation Tobit model. International Economic Review 20 (1), 169–181.

Amemiya, T., 1984. Tobit models: A survey. Journal of Econometrics 24 (1-2), 3–61.

Anderson, T. W., Rubin, H., 1949. Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20 (1), 46–63.

Andrews, D. W. K., Moreira, M. J., Stock, J. H., 2006. Optimal invariant similar tests for instrumental variables regression. Econometrica 74 (3), 715–752.

Cattaneo, M. D., Crump, R. K., Jansson, M., 2012. Optimal inference for instrumental variables regression with non-Gaussian errors. Journal of Econometrics 167 (1), 1–15.

Choi, S., Hall, W. J., Schick, A., 1996. Asymptotically uniformly most powerful tests in parametric and semiparametric models. Annals of Statistics 24 (2), 841–861.

Dufour, J.-M., 1997. Some impossibility theorems in econometrics with applications to structural and dynamic models. Econometrica 65 (6), 1365–1387.

Hnatkovska, V., Marmer, V., Tang, Y., 2012. Comparison of misspecified calibrated models: The minimum distance approach. Journal of Econometrics 169 (1), 131–138.

Horowitz, J. L., 1996. Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica 64 (1), 103–137.

Kleibergen, F., 2002. Pivotal statistics for testing structural parameters in instrumental variables regression. Econometrica 70 (5), 1781–1803.

Kleibergen, F., 2007. Generalizing weak instrument robust IV statistics towards multiple parameters, unrestricted covariance matrices and identification statistics. Journal of Econometrics 139 (1), 181–216.

Koenker, R., 2005. Quantile Regression. Cambridge University Press, New York.

Magnusson, L., 2010. Inference in limited dependent variable models robust to weak identification. Econometrics Journal 13 (3), S56–S79.


Marmer, V., Feir, D., Lemieux, T., 2012. Weak identification in fuzzy regression discontinuity designs, UBC Working paper.

Moreira, M. J., 2001. Tests with correct size when instruments can be arbitrarily weak, unpublished manuscript, Department of Economics, University of California, Berkeley.

Moreira, M. J., 2003. A conditional likelihood ratio test for structural models. Econometrica 71 (4), 1027–1048.

Newey, W. K., McFadden, D. L., 1994. Large sample estimation and hypothesis testing. In: Engle, R. F., McFadden, D. L. (Eds.), Handbook of Econometrics. Vol. IV. Elsevier, Amsterdam, Ch. 36, pp. 2111–2245.

Powell, J. L., 1984. Least absolute deviations estimation for the censored regression model. Journal of Econometrics 25 (3), 303–325.

Powell, J. L., 1986. Censored regression quantiles. Journal of Econometrics 32 (1), 143–155.

Staiger, D., Stock, J. H., 1997. Instrumental variables regression with weak instruments. Econometrica 65 (3), 557–586.
