Chinese Journal of Applied Probability and Statistics
Oct., 2017, Vol. 33, No. 5, pp. 475-486
doi: 10.3969/j.issn.1001-4268.2017.05.004

Instrumental Variable Estimation in Linear Quantile
Regression Models with Measurement Error ∗

GUAN Jing
(School of Mathematics, Tianjin University, Tianjin 300350, China)

WANG LiQun †
(Department of Statistics, University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2)

Abstract: We extend the instrumental variable method for the mean regression models to linear
quantile regression models with errors-in-variables. The proposed estimator is consistent and
asymptotically normally distributed under some fairly general conditions. Moreover, this approach
is practical and easy to implement. Simulation studies show that the finite sample performance of
the estimator is satisfactory. The method is applied to a real data study of education and wages.

Keywords: errors in variables; instrumental variables; least absolute deviation; measurement
error; quantile regression

2010 Mathematics Subject Classification: 62F10; 62F12; 62J99

Citation: Guan J, Wang L Q. Instrumental variable estimation in linear quantile regression
models with measurement error [J]. Chinese J. Appl. Probab. Statist., 2017, 33(5): 475–486.

§1. Introduction

Quantile regression has drawn much attention in the literature. It is a useful tool for
estimating conditional quantiles of a response variable given a set of predictors [1]. While
the mean regression describes the effect of predictors on the average response, the quantile
regression describes the effect on its entire distribution. It has been applied in many areas
such as biology, ecology, economics, finance and environmental science.

In practice, often some predictors are not directly observable or are measured with
substantial errors. It is known that simple substitution of the surrogate data for the latent
variables will result in attenuated and inconsistent estimators in either mean or quantile
regression models [2–4].

∗ The project was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
† Corresponding author, E-mail: [email protected].
Received December 11, 2015. Revised July 8, 2016.
Quantile regression with mismeasured covariates has been investigated before; however,
publications in this area are sparse because the problem is difficult.
Brown [5] examined a median regression model and described the difficulty of the associated
parameter estimation. He and Liang [6] studied orthogonal distance quantile regression
in a linear measurement error model where the measurement and regression errors have
a joint spherically symmetric distribution. Wei and Carroll [4] proposed consistent estimators
for a linear quantile regression model using a modified score function and replicate data.
Ma and Yin [7] used a composite weighting estimation method to deal with censored linear
quantile regression with spherically distributed measurement error. On the other hand,
Hu and Schennach [8] and Schennach [9] used the instrumental variable method to establish
identifiability and to propose consistent estimators for a nonparametric quantile regression
model.
While the instrumental variables (IV) method has been widely used in mean regression
with errors in variables (e.g., [2, 3, 10–15]), its use in parametric
quantile regression is rare. This paper explores this possibility. Generally
speaking, the measurement error problem is difficult to deal with in the quantile regression
setup, mainly because quantiles are not additive. For example,
unlike the mean, the quantile of the sum of two random variables does not in general equal
the sum of the two quantiles. This prevents the usual manipulations of additive
measurement errors in mean regression setups from carrying over. Because of these
difficulties, in this paper
we consider a simple linear quantile regression model and demonstrate that the usual IV
procedure in the mean regression setup can be carried over to certain quantile regression
models.
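The non-additivity of quantiles mentioned above can be checked numerically. The following is a minimal sketch, assuming two independent standard normal variables (an illustrative choice, not taken from the paper), so that their sum is N(0, 2); the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative assumption: X and Y are independent N(0, 1), so X + Y ~ N(0, 2).
tau = 0.9
q_single = NormalDist(0.0, 1.0).inv_cdf(tau)      # tau-quantile of X (and of Y)
q_sum = NormalDist(0.0, sqrt(2.0)).inv_cdf(tau)   # tau-quantile of X + Y
sum_of_quantiles = 2.0 * q_single                 # Q_tau(X) + Q_tau(Y)

# At the median the two coincide for these symmetric distributions.
median_of_sum = NormalDist(0.0, sqrt(2.0)).inv_cdf(0.5)

print(q_sum, sum_of_quantiles, median_of_sum)
```

In this example the quantile of the sum is sqrt(2) times the single quantile rather than twice it, while at tau = 0.5 both sides are zero, which is consistent with the median case being the more tractable one.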
To illustrate the IV method in the mean regression setup, let us consider the linear
model
$$y = \alpha_1 + \alpha_2' x + \varepsilon, \tag{1}$$
where $y \in \mathbb{R}$ is the response variable, $x \in \mathbb{R}^k$ is the vector of predictor variables, $\varepsilon$ is a
random error with $E(\varepsilon \mid x) = 0$, and $\alpha = (\alpha_1, \alpha_2')'$ is an unknown parameter vector. Here
it is assumed that $x$ is unobservable and the observed predictor is
$$w = x + u, \tag{2}$$
where $u$ is the random measurement error satisfying $E(u \mid x) = 0$. The variance-covariance
matrix of $u$, $\Sigma_u$, is allowed to be singular so that some components of $x$ may be measured
without error. It is well known that simply substituting $w$ for $x$ in equation (1) gives
an inconsistent estimator of $\alpha$ because $w$ and $u$ are dependent. One possible way to overcome
this problem is to use instrumental variables which, by definition, are correlated with
$x$ but uncorrelated with $\varepsilon$ and $u$ [2]. Following [12, 13], we assume that there exists a vector
of instrumental variables $z \in \mathbb{R}^\ell$ that is related to $x$ through
$$x = \beta_1 + \beta_2' z + \delta, \tag{3}$$
where the random error $\delta$ is uncorrelated with $u$ and $\varepsilon$ and satisfies $E(\delta \mid z) = 0$.
The IV method for the mean regression consists of the following steps. First, substituting
(3) into (1) yields a usual mean regression equation
$$y = \gamma_1 + \gamma_2' z + \nu, \tag{4}$$
where $\nu = \alpha_2'\delta + \varepsilon$ and
$$\gamma_1 = \alpha_1 + \alpha_2'\beta_1, \tag{5}$$
$$\gamma_2 = \beta_2\alpha_2. \tag{6}$$
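The substitution behind (4)–(6) can be written out step by step. The block below only expands the algebra already stated, using $\alpha_2'\beta_2' z = (\beta_2\alpha_2)' z$; it introduces no additional assumptions.

```latex
\begin{aligned}
y &= \alpha_1 + \alpha_2'(\beta_1 + \beta_2' z + \delta) + \varepsilon \\
  &= (\alpha_1 + \alpha_2'\beta_1) + (\beta_2\alpha_2)' z + (\alpha_2'\delta + \varepsilon) \\
  &= \gamma_1 + \gamma_2' z + \nu .
\end{aligned}
```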
Since $z$ is uncorrelated with $\delta$ and $\varepsilon$, $\gamma$ can be consistently estimated by least squares
fitting of $y$ on $z$. Further, substituting (3) into (2) yields
$$w = \beta_1 + \beta_2' z + (\delta + u), \tag{7}$$
which can be used to obtain consistent estimators for $\beta_1$ and $\beta_2$. Finally, consistent
estimators for $\alpha$ are obtained by solving equations (5)–(6) with these estimates plugged in,
which yields $\hat\alpha_2 = (\hat\beta_2'\hat\beta_2)^{-1}\hat\beta_2'\hat\gamma_2$
and $\hat\alpha_1 = \hat\gamma_1 - \hat\alpha_2'\hat\beta_1$. In subsequent sections, we adapt this procedure to least absolute
deviation (LAD) estimation and more general quantile regression setups.
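The three steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function name `iv_mean_regression`, the simulated design (one latent predictor, two instruments, standard normal errors) and all parameter values are assumptions made here for demonstration. A naive regression of y on w is included to show the attenuation.

```python
import numpy as np

def iv_mean_regression(y, w, z):
    """Two least squares fits on z, then solve (5)-(6) for alpha (hypothetical helper)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])                 # n x (1 + l), intercept first
    gamma_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)    # fit of equation (4)
    beta_hat, *_ = np.linalg.lstsq(Z, w, rcond=None)     # fit of equation (7), (1 + l) x k
    gamma1, gamma2 = gamma_hat[0], gamma_hat[1:]
    beta1, beta2 = beta_hat[0], beta_hat[1:]             # beta1: (k,), beta2: (l, k)
    # alpha2 = (beta2' beta2)^{-1} beta2' gamma2, computed as a least squares solve
    alpha2, *_ = np.linalg.lstsq(beta2, gamma2, rcond=None)
    alpha1 = gamma1 - alpha2 @ beta1
    return alpha1, alpha2

# Simulated data under assumed true values alpha = (1, 2), beta1 = 0.5, beta2 = (1, -0.5)'.
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=(n, 2))
x = 0.5 + z @ np.array([[1.0], [-0.5]]) + rng.normal(size=(n, 1))   # equation (3)
w = x + rng.normal(size=(n, 1))                                      # equation (2)
y = 1.0 + x @ np.array([2.0]) + rng.normal(size=n)                   # equation (1)

alpha1_hat, alpha2_hat = iv_mean_regression(y, w, z)

# Naive fit that substitutes w for x; its slope is attenuated toward zero.
naive, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), w]), y, rcond=None)
naive_slope = naive[1]

print(alpha1_hat, alpha2_hat, naive_slope)
```

In this simulated design the IV estimates should land near the assumed true values, while the naive slope falls clearly below the true slope of 2.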
The paper is organized as follows. Section 2 introduces the IV method to LAD
estimation in the linear median regression model with errors in variables. Section 3 extends
this method to more general quantile regression models. Simulation studies are presented
in Section 4, while an empirical example is given in Section 5. Finally, conclusions and
discussion are given in Section 6.
§2. LAD Estimation
In this section, we extend the IV method described in Section 1 to the LAD estimation
of model (1)–(3). To this end we make the following assumptions.
Assumption 1 $\varepsilon$, $\delta$ and $u$ are symmetrically distributed about zero. Further, $\nu$ has
a continuous density $f_\nu$ satisfying $f_\nu(0) > 0$.
Assumption 2 The data $(y_i, w_i, z_i)$, $i = 1, 2, \ldots, n$, are independent and identically
distributed.
Assumption 3 $M_z = E(zz')$ is positive definite and $\max_{1 \le i \le n} \|z_i\| = o_p(\sqrt{n})$.
Then under Assumption 1, $\nu = \alpha_2'\delta + \varepsilon$ in equation (4) is symmetric about zero.
Therefore the LAD estimator $\hat\gamma$ of $\gamma = (\gamma_1, \gamma_2')'$ can be obtained by minimizing the objective
function
$$\sum_{i=1}^{n} |y_i - \gamma_1 - \gamma_2' z_i|. \tag{8}$$
According to [16], Assumptions 1 and 3 ensure that $\sqrt{n}(\hat\gamma - \gamma) \to N(0, \Sigma_\gamma)$, where $\Sigma_\gamma =
M_z^{-1}/[4 f_\nu^2(0)]$. Similarly, from equation (7) the LAD estimator $\hat\beta$ of $\beta = (\beta_1, \beta_2')'$ can be
obtained by minimizing the objective function
$$\sum_{i=1}^{n} |w_i - \beta_1 - \beta_2' z_i|. \tag{9}$$
Now, write (5) and (6) jointly as
$$\gamma = B\alpha, \tag{10}$$
where $\alpha = (\alpha_1, \alpha_2')'$ and
$$B = \begin{pmatrix} 1 & \beta_1' \\ 0 & \beta_2 \end{pmatrix}.$$
Then a consistent estimator of $\alpha$ can be obtained by solving (10), provided consistent
estimators of $\gamma$ and $\beta = (\beta_1, \beta_2')'$ are available. Specifically, given the consistent estimators $\hat\gamma$
and $\hat B$, a consistent estimator for $\alpha$ can be obtained by minimizing $(\hat\gamma - \hat B\alpha)' A_n (\hat\gamma - \hat B\alpha)$,
where $A_n$ is a non-negative definite weight matrix which may depend on the data. The
resulting minimum distance estimator is given by
$$\hat\alpha = (\hat B' A_n \hat B)^{-1} \hat B' A_n \hat\gamma. \tag{11}$$
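Further, by the delta method we have
$$\sqrt{n}(\hat\alpha - \alpha) \to N\big(0, (B'AB)^{-1} B'A\Sigma_\gamma A B (B'AB)^{-1}\big), \tag{12}$$
where $A = \mathrm{plim}(A_n/n)$. The above asymptotic variance-covariance matrix has a lower
bound $(B'\Sigma_\gamma^{-1}B)^{-1}$, which is attained when $A = \Sigma_\gamma^{-1}$. Therefore the most efficient choice
of weight is $A_n = \hat\Sigma_\gamma^{-1}$, where $\hat\Sigma_\gamma$ is a consistent estimator for $\Sigma_\gamma$. Another less efficient but
practical choice is $A_n = Z'Z$, which gives $\hat\alpha = (W'W)^{-1} W'Z\hat\gamma$, where $W = Z\hat B$ and
$$Z' = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ z_1 & z_2 & \cdots & z_n \end{pmatrix}.$$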
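The LAD step and the minimum distance step with the practical weight $A_n = Z'Z$ can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: LAD is solved here as a linear program with `scipy.optimize.linprog`, objective (9) is applied to each component of w separately, and the helper names `lad_fit` and `lad_iv`, the simulated design and the parameter values are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(Z, y):
    """Minimize sum_i |y_i - Z_i c| as a linear program (hypothetical helper)."""
    n, p = Z.shape
    # Variables: c (free), u_plus >= 0, u_minus >= 0 with Z c + u_plus - u_minus = y.
    cost = np.concatenate([np.zeros(p), np.ones(2 * n)])
    A_eq = np.hstack([Z, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

def lad_iv(y, w, z):
    """LAD fits of (8) and (9), then the minimum distance estimator with A_n = Z'Z."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    gamma_hat = lad_fit(Z, y)                                   # objective (8)
    # Objective (9), fitted one component of w at a time; result is (1 + l) x k.
    beta_hat = np.column_stack([lad_fit(Z, w[:, j]) for j in range(w.shape[1])])
    e1 = np.zeros(Z.shape[1])
    e1[0] = 1.0
    B_hat = np.column_stack([e1, beta_hat])                     # rows: (1, beta1'), (0, beta2)
    W = Z @ B_hat
    # alpha_hat = (W'W)^{-1} W' Z gamma_hat, computed as a least squares solve
    alpha_hat, *_ = np.linalg.lstsq(W, Z @ gamma_hat, rcond=None)
    return alpha_hat

# Simulated data with symmetric errors; assumed true alpha = (1, 2).
rng = np.random.default_rng(0)
n = 600
z = rng.normal(size=(n, 2))
x = 0.5 + z @ np.array([[1.0], [-0.5]]) + 0.5 * rng.normal(size=(n, 1))
w = x + 0.5 * rng.normal(size=(n, 1))
y = 1.0 + 2.0 * x[:, 0] + rng.laplace(scale=0.5, size=n)

alpha_hat = lad_iv(y, w, z)
print(alpha_hat)
```

The dense equality matrix keeps the sketch short; for larger n a sparse constraint matrix or a dedicated quantile regression routine would be a more economical choice.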
§3. General Quantile Regression
In this section we consider the more general quantile regression model
$$Q_\tau(y \mid x) = \alpha_1 + \alpha_2' x + F_\varepsilon^{-1}(\tau), \quad \tau \in (0, 1), \tag{13}$$
where $F_\varepsilon$ is the distribution function of $\varepsilon$. If $x$ is observed, then the quantile regression estimator
of $\alpha(\tau) = (\alpha_1 + F_\varepsilon^{-1}(\tau), \alpha_2')'$ can be obtained by minimizing the objective function
$$\sum_{i=1}^{n} \rho_\tau(y_i - \alpha_1 - \alpha_2' x_i), \tag{14}$$
where $\rho_\tau(u) = u(\tau - I(u < 0))$ and $I(\cdot)$ is the indicator function. For any given $\tau$,
the above quantile regression produces a consistent estimator for $\alpha(\tau)$. In particular, the
estimator corresponding to $\tau = 0.5$ is the least absolute deviation (LAD) estimator.
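To make the check function $\rho_\tau$ concrete, the sketch below evaluates it and minimizes the intercept-only version of objective (14), which should recover a sample quantile. The function names and the simulated sample are assumptions for illustration; searching over the data points relies on the objective being convex and piecewise linear with kinks at the observations.

```python
import numpy as np

def check_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0)), applied elementwise."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def quantile_by_check_loss(y, tau):
    """Minimize sum_i rho_tau(y_i - q) over q, searching the observed values."""
    y = np.asarray(y, dtype=float)
    totals = [check_loss(y - q, tau).sum() for q in y]
    return y[int(np.argmin(totals))]

rng = np.random.default_rng(1)
sample = rng.normal(size=101)
q25 = quantile_by_check_loss(sample, 0.25)

print(q25, np.quantile(sample, 0.25))
```

Positive residuals are weighted by $\tau$ and negative ones by $1 - \tau$, which is what tilts the minimizer away from the median when $\tau \neq 0.5$.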
In the case of unobserved $x$, the instrumental variable $z$ can be used instead. Unfortunately,
the above IV procedure for mean regression cannot be directly applied here
because the quantiles $F_\varepsilon^{-1}(\tau)$ and $F_\nu^{-1}(\tau)$ are not the same. In order to derive consistent
quantile regression estimators, we modify the procedure as follows. First, rewrite (4) as
$$y = \alpha_1 + \alpha_2' t + \nu, \tag{15}$$
where $t = \beta_1 + \beta_2' z$. For the sake of clarity, we first assume that the true value of $\beta$ is
known. The case of estimated $\beta$ will be discussed later. Then the corresponding quantile