The Bootstrap’s Finite Sample Distribution: An Analytical Approach
Lawrence C. Marsh
Department of Economics and Econometrics, University of Notre Dame
Midwest Econometrics Group (MEG)
October 15 – 16, 2004
Northwestern University
This is the first of three papers:
(1.) Bootstrap’s Finite Sample Distribution ( today !!! )
(2.) Bootstrapped Asymptotically Pivotal Statistics
(3.) Bootstrap Hypothesis Testing and Confidence Intervals
Traditional approach in econometrics: from an analytical problem to an empirical process, via the analogy principle (Manski) and GMM (Hansen).
Approach used in this paper: from an empirical process to an analytical solution, via the bootstrap’s finite sample distribution.
Bootstrap procedure
Start with a sample of size n: {Xi : i = 1,…,n}
Bootstrap sample of size m: {Xj* : j = 1,…,m}
m < n or m = n or m > n
Define Mi as the frequency of drawing each Xi.
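The procedure above can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the sample values, the seed, and the helper name `bootstrap_draw` are hypothetical and do not come from the slides. It draws m values with replacement, records the frequencies Mi, and shows that the bootstrap mean can be computed either from the draws or from the frequencies.

```python
import random
from collections import Counter

def bootstrap_draw(x, m, rng):
    """Draw m values from x with replacement; return the draws and the frequencies M_i."""
    n = len(x)
    idx = [rng.randrange(n) for _ in range(m)]   # index of each of the m draws
    counts = Counter(idx)
    M = [counts.get(i, 0) for i in range(n)]     # M_i = number of times X_i was drawn
    x_star = [x[i] for i in idx]                 # the bootstrap sample X_1*, ..., X_m*
    return x_star, M

rng = random.Random(0)        # fixed seed so the sketch is reproducible
x = [2.0, 5.0, 7.0, 11.0]     # hypothetical original sample, n = 4
m = 6                         # bootstrap sample size; m > n is allowed
x_star, M = bootstrap_draw(x, m, rng)

mean_from_draws = sum(x_star) / m
mean_from_counts = sum(Mi * xi for Mi, xi in zip(M, x)) / m
print(M, mean_from_draws, mean_from_counts)
```

The frequencies always sum to m, which is what makes (M1,…,Mn) a multinomial vector.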
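$$\frac{1}{m}\sum_{j=1}^{m} X_j^* = \frac{1}{m}\sum_{i=1}^{n} M_i X_i$$

$$E_M\left[\frac{1}{m}\sum_{j=1}^{m} X_j^*\right] = E_M\left[\frac{1}{m}\sum_{i=1}^{n} M_i X_i\right]$$

$$\operatorname{Var}_M\left[\frac{1}{m}\sum_{j=1}^{m} X_j^*\right] = \operatorname{Var}_M\left[\frac{1}{m}\sum_{i=1}^{n} M_i X_i\right]$$

...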
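$$E_M[M_i] = \frac{m}{n}, \qquad E_M[M_i^2] = \frac{m(m+n-1)}{n^2}, \qquad E_M[M_i M_k] = \frac{m(m-1)}{n^2} \ \text{ for } i \neq k$$

$$E_M\left[\left(\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right)^2\right] = \sum_{M_1+\cdots+M_n=m} \frac{m!}{M_1!\cdots M_n!}\,\frac{1}{n^m}\left(\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right)^2$$

$$= \sum_{M_1+\cdots+M_n=m} \frac{m!}{M_1!\cdots M_n!}\,\frac{1}{n^m}\left(\frac{1}{m}\sum_{i=1}^{n} M_i f(X_i)\right)^2$$

$$= \sum_{M_1+\cdots+M_n=m} \frac{m!}{M_1!\cdots M_n!}\,\frac{1}{n^m}\,\frac{1}{m^2}\left[\sum_{i=1}^{n} M_i^2 f(X_i)^2 + 2\sum_{i<k} M_i M_k f(X_i) f(X_k)\right]$$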
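The multinomial moments used here can be checked exactly for small n and m. The sketch below is a brute-force check, not the method of the slides: it enumerates all n^m equally likely draw sequences and computes the moments with exact fractions. The function name `multinomial_moments` and the choice n = 3, m = 4 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def multinomial_moments(n, m):
    """Exact E[M_1], E[M_1^2], E[M_1 M_2] by enumerating all n**m equally likely draws."""
    total = n ** m
    s1 = s2 = s12 = 0
    for draw in product(range(n), repeat=m):
        M1 = draw.count(0)       # frequency of the first observation
        M2 = draw.count(1)       # frequency of the second observation
        s1 += M1
        s2 += M1 * M1
        s12 += M1 * M2
    return Fraction(s1, total), Fraction(s2, total), Fraction(s12, total)

n, m = 3, 4                      # small illustrative sizes (81 draw sequences)
e1, e2, e12 = multinomial_moments(n, m)
print(e1, e2, e12)
print(Fraction(m, n), Fraction(m * (m + n - 1), n * n), Fraction(m * (m - 1), n * n))
```

Both printed lines should agree, since the second line evaluates m/n, m(m+n−1)/n² and m(m−1)/n² directly.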
$$\operatorname{Var}_M\left[\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right] = \frac{n-1}{m n^2}\sum_{i=1}^{n} f(X_i)^2 - \frac{2}{m n^2}\sum_{i<k} f(X_i) f(X_k)$$
Applied Econometrician:
The bootstrap treats the original sample as if it were the population and induces multinomial distributed randomness.
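As a check on the conditional variance, the closed form (n−1)/(mn²) Σ f(Xi)² − 2/(mn²) Σ_{i<k} f(Xi) f(Xk) can be compared with a brute-force enumeration of every possible bootstrap sample. The sketch below assumes a tiny hypothetical set of f(Xi) values; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def var_formula(f_vals, m):
    """Closed-form Var_M of the bootstrap mean of f, conditional on the sample."""
    n = len(f_vals)
    sq = sum(f * f for f in f_vals)
    cross = sum(f_vals[i] * f_vals[k] for i in range(n) for k in range(i + 1, n))
    return Fraction(n - 1, m * n * n) * sq - Fraction(2, m * n * n) * cross

def var_enumerated(f_vals, m):
    """Same variance, computed over all n**m equally likely bootstrap samples."""
    n = len(f_vals)
    means = [Fraction(sum(f_vals[i] for i in draw), m)
             for draw in product(range(n), repeat=m)]
    mu = sum(means) / len(means)
    return sum((v - mu) ** 2 for v in means) / len(means)

f_vals = [1, 4, 9]    # hypothetical values f(X_1), f(X_2), f(X_3)
m = 2
exact = var_formula(f_vals, m)
brute = var_enumerated(f_vals, m)
print(exact, brute)
```

The closed form also equals 1/m times the population-style variance of the f(Xi) values, which is one way to read the statement that the bootstrap treats the sample as the population.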
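Econometric theorist: what does this buy you?
Find the variance under the joint distribution of the bootstrap-induced randomness and the randomness implied by the original sample data:

$$\operatorname{Var}_{M,X}\left[\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right] = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}_X[f(X_i)] + \frac{n-1}{m n^2}\sum_{i=1}^{n} E_X[f(X_i)^2]$$

$$\qquad + \frac{2}{n^2}\sum_{i<k} \operatorname{Cov}_X[f(X_i), f(X_k)] - \frac{2}{m n^2}\sum_{i<k} E_X[f(X_i) f(X_k)]$$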
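For example, $f(X_j^*) = (X_j^* - \bar{X})^2$.

Econometric theorist:

$$\operatorname{Var}_{M,X}\left[\frac{1}{m}\sum_{j=1}^{m} (X_j^* - \bar{X})^2\right] = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}_X[(X_i - \bar{X})^2] + \frac{n-1}{m n^2}\sum_{i=1}^{n} E_X[(X_i - \bar{X})^4]$$

$$\qquad + \frac{2}{n^2}\sum_{i<k} \operatorname{Cov}_X[(X_i - \bar{X})^2, (X_k - \bar{X})^2] - \frac{2}{m n^2}\sum_{i<k} E_X[(X_i - \bar{X})^2 (X_k - \bar{X})^2]$$

Applied Econometrician:

$$\operatorname{Var}_M\left[\frac{1}{m}\sum_{j=1}^{m} (X_j^* - \bar{X})^2\right] = \frac{n-1}{m n^2}\sum_{i=1}^{n} (X_i - \bar{X})^4 - \frac{2}{m n^2}\sum_{i<k} (X_i - \bar{X})^2 (X_k - \bar{X})^2$$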
The Wild Bootstrap
Multiply each bootstrapped value by plus one or minus one, each with probability one-half (Rademacher distribution).
Use the binomial distribution to impose the Rademacher distribution:

$$P(W_i \mid M_i) = \binom{M_i}{W_i}\, 0.5^{W_i} (1 - 0.5)^{M_i - W_i}$$

Wi = number of positive ones out of Mi which, in turn, is the number of Xi’s drawn in m multinomial draws.

$$E_M E_{W|M}\left[\left(\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right)^2\right] = E_M E_{W|M}\left[\left(\frac{1}{m}\sum_{i=1}^{n} \big(W_i - (M_i - W_i)\big) f(X_i)\right)^2\right]$$
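The two-stage randomness described above (multinomial frequencies Mi, then a binomial count Wi of plus signs among the Mi copies) can be sketched as follows. This is a minimal simulation under assumed inputs: the f(Xi) values, the seed, and the function name `wild_bootstrap_mean` are hypothetical.

```python
import random

def wild_bootstrap_mean(f_vals, m, rng):
    """One wild-bootstrap draw: multinomial frequencies, then binomial sign counts."""
    n = len(f_vals)
    M = [0] * n
    for _ in range(m):                    # m multinomial draws with equal probability 1/n
        M[rng.randrange(n)] += 1
    # W_i ~ Binomial(M_i, 0.5): number of +1 signs among the M_i copies of X_i
    W = [sum(rng.random() < 0.5 for _ in range(Mi)) for Mi in M]
    # Net sign count for X_i is W_i - (M_i - W_i) = 2 W_i - M_i
    stat = sum((2 * Wi - Mi) * fi for Wi, Mi, fi in zip(W, M, f_vals)) / m
    return M, W, stat

rng = random.Random(1)                    # fixed seed for reproducibility
f_vals = [1.0, 4.0, 9.0]                  # hypothetical values f(X_i)
m = 5
M, W, stat = wild_bootstrap_mean(f_vals, m, rng)
print(M, W, stat)
```

Drawing Wi from a binomial distribution given Mi is equivalent to attaching an independent Rademacher sign to each of the m draws.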
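The Wild Bootstrap

Applied Econometrician:

$$\operatorname{Var}_{W,M}\left[\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right] = \frac{1}{nm}\sum_{i=1}^{n} f(X_i)^2$$

Econometric Theorist:

$$\operatorname{Var}_{W,M,X}\left[\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right] = \frac{1}{nm}\sum_{i=1}^{n} \operatorname{Var}_X[f(X_i)]$$

under the zero mean assumption.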
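The conditional wild-bootstrap variance 1/(nm) Σ f(Xi)² can be verified exactly for a small case by enumerating every (index, sign) outcome. The sketch below is a brute-force check with hypothetical f(Xi) values, not a derivation; the function name is an assumption.

```python
from fractions import Fraction
from itertools import product

def wild_var_enumerated(f_vals, m):
    """Exact mean and variance of the wild-bootstrap mean over all (2n)**m outcomes."""
    n = len(f_vals)
    outcomes = [(i, s) for i in range(n) for s in (1, -1)]   # each has probability 1/(2n)
    stats = [Fraction(sum(s * f_vals[i] for i, s in draw), m)
             for draw in product(outcomes, repeat=m)]
    mean = sum(stats) / len(stats)
    var = sum((v - mean) ** 2 for v in stats) / len(stats)
    return mean, var

f_vals = [1, 4, 9]       # hypothetical values f(X_i), n = 3
m = 3
mean, var = wild_var_enumerated(f_vals, m)
closed_form = Fraction(sum(f * f for f in f_vals), len(f_vals) * m)
print(mean, var, closed_form)
```

The enumerated mean is zero because the random signs are symmetric, so the variance coincides with the second moment.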
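$$q\left(\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right) = q\left(\frac{1}{m}\sum_{i=1}^{n} M_i f(X_i)\right)$$

$$E_M\left[q\left(\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right)\right] = E_M\left[q\left(\frac{1}{m}\sum_{i=1}^{n} M_i f(X_i)\right)\right]$$

$$\operatorname{Var}_M\left[q\left(\frac{1}{m}\sum_{j=1}^{m} f(X_j^*)\right)\right] = \operatorname{Var}_M\left[q\left(\frac{1}{m}\sum_{i=1}^{n} M_i f(X_i)\right)\right]$$

...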
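For a general function q, the expectation over M can be computed as a finite sum over all frequency vectors (M1,…,Mn) that sum to m, each weighted by its multinomial probability m!/(M1!⋯Mn!) n^(−m). The sketch below illustrates this under assumed inputs; the helper names `compositions`, `multinomial_prob` and `expect_q`, and the example values, are hypothetical.

```python
from fractions import Fraction
from math import factorial

def compositions(m, n):
    """Yield all frequency vectors (M_1, ..., M_n) of nonnegative integers summing to m."""
    if n == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, n - 1):
            yield (first,) + rest

def multinomial_prob(M, n):
    """Probability of frequency vector M under m equally likely draws from n values."""
    m = sum(M)
    coef = factorial(m)
    for Mi in M:
        coef //= factorial(Mi)          # stays an integer at every step
    return Fraction(coef, n ** m)

def expect_q(q, f_vals, m):
    """Exact E_M[q((1/m) sum_i M_i f(X_i))] as a weighted sum over frequency vectors."""
    n = len(f_vals)
    return sum(multinomial_prob(M, n)
               * q(Fraction(sum(Mi * fi for Mi, fi in zip(M, f_vals)), m))
               for M in compositions(m, n))

f_vals = [1, 4, 9]                               # hypothetical values f(X_i)
m = 2
first = expect_q(lambda v: v, f_vals, m)         # q is the identity
second = expect_q(lambda v: v * v, f_vals, m)    # q is the square
print(first, second, second - first ** 2)
```

Summing over frequency vectors rather than over draw sequences keeps the number of terms at C(m+n−1, n−1) instead of n^m.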
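$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ from $\{X_i : i = 1,\ldots,n\}$ and $\bar{X}^* = \frac{1}{m}\sum_{j=1}^{m} X_j^*$ from $\{X_j^* : j = 1,\ldots,m\}$, where X is a p x 1 vector.

Horowitz (2001) approximates the bias of $\theta_n = g(\bar{X})$, a smooth nonlinear function of $\bar{X}$, as an estimator of $\theta_o = g(\mu)$, where $\mu = E(X)$:

$$B_n^* = E_M\left[\tfrac{1}{2}(\bar{X}^* - \bar{X})'\, G_2(\bar{X})\, (\bar{X}^* - \bar{X})\right] + O(n^{-2})$$

almost surely, where $G_2(\bar{X})$ is the matrix of second partial derivatives of g.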
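$$E_M\left[\bar{X}'\, G_2(\bar{X})\, \bar{X}^*\right] = E_M\left[\bar{X}'\, G_2(\bar{X})\, \frac{1}{m}\sum_{j=1}^{m} X_j^*\right] = E_M\left[\bar{X}'\, G_2(\bar{X})\, \frac{1}{m}\sum_{i=1}^{n} M_i X_i\right]$$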
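$$B_n^* = E_M\left[\tfrac{1}{2}(\bar{X}^* - \bar{X})'\, G_2(\bar{X})\, (\bar{X}^* - \bar{X})\right] + O(n^{-2})$$

Horowitz (2001) uses bootstrap simulations to approximate the first term on the right hand side.

Exact finite sample solution:

$$B_n^* = \frac{m+n-1}{2 m n^2}\sum_{i=1}^{n} X_i'\, G_2(\bar{X})\, X_i + \frac{m-1}{m n^2}\sum_{i<k} X_i'\, G_2(\bar{X})\, X_k - \frac{1}{2}\bar{X}'\, G_2(\bar{X})\, \bar{X} + O(n^{-2})$$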
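The leading term of the bias can be evaluated without simulation. The sketch below treats only the scalar case p = 1, with the second derivative G2 evaluated at the sample mean passed in as a number `g2`; it compares the closed form (m+n−1)/(2mn²) Σ Xi² g2 + (m−1)/(mn²) Σ_{i<k} Xi Xk g2 − ½ X̄² g2, which follows from the multinomial moments, with a brute-force average over every bootstrap sample. The data, the choice g(x) = x² (so g2 = 2), and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def bias_term_formula(x, m, g2):
    """Closed-form E_M[(1/2)(Xbar* - Xbar)^2 * g2] for scalar data (p = 1)."""
    n = len(x)
    xbar = Fraction(sum(x), n)
    diag = sum(xi * xi for xi in x) * g2
    cross = sum(x[i] * x[k] for i in range(n) for k in range(i + 1, n)) * g2
    return (Fraction(m + n - 1, 2 * m * n * n) * diag
            + Fraction(m - 1, m * n * n) * cross
            - Fraction(1, 2) * xbar * xbar * g2)

def bias_term_enumerated(x, m, g2):
    """Same expectation, averaged over all n**m equally likely bootstrap samples."""
    n = len(x)
    xbar = Fraction(sum(x), n)
    vals = [Fraction(1, 2) * (Fraction(sum(x[i] for i in draw), m) - xbar) ** 2 * g2
            for draw in product(range(n), repeat=m)]
    return sum(vals) / len(vals)

x = [1, 2, 6]      # hypothetical scalar sample
m = 2
g2 = 2             # second derivative of the assumed g(x) = x**2
formula = bias_term_formula(x, m, g2)
brute = bias_term_enumerated(x, m, g2)
print(formula, brute)
```

For this choice of g the term reduces to the conditional variance of the bootstrap mean, which gives an independent way to sanity-check the result.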
Separability Condition
Definition: Any bootstrap statistic, $\theta_n^*$, that is a function of the elements of the set $\{f(X_j^*) : j = 1,\ldots,m\}$ and satisfies the separability condition

$$\theta_n^*\left(\{f(X_j^*) : j = 1,\ldots,m\}\right) = \sum_{i=1}^{n} g(M_i)\, h(f(X_i))$$

where $g(M_i)$ and $h(f(X_i))$ are independent functions and where the expected value $E_M[g(M_i)]$ exists, is a “directly analyzable” bootstrap statistic.
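One way to see why separability helps: the expectation over M of Σ g(Mi) h(f(Xi)) is Σ E_M[g(Mi)] h(f(Xi)), and each Mi is marginally Binomial(m, 1/n), so E_M[g(Mi)] is a short finite sum. The sketch below illustrates this with the assumed choices g(M) = M² and hypothetical values for h(f(Xi)); the function names are not from the slides.

```python
from fractions import Fraction
from itertools import product
from math import comb

def expected_g(g, n, m):
    """E_M[g(M_i)] using the Binomial(m, 1/n) marginal distribution of M_i."""
    p = Fraction(1, n)
    return sum(comb(m, k) * p ** k * (1 - p) ** (m - k) * g(k) for k in range(m + 1))

def expected_stat(g, h_vals, m):
    """E_M of sum_i g(M_i) h_i; E_M[g(M_i)] is the same for every i."""
    return expected_g(g, len(h_vals), m) * sum(h_vals)

def expected_stat_enumerated(g, h_vals, m):
    """Same expectation by enumerating all n**m equally likely draw sequences."""
    n = len(h_vals)
    total = 0
    for draw in product(range(n), repeat=m):
        total += sum(g(draw.count(i)) * h_vals[i] for i in range(n))
    return Fraction(total, n ** m)

def g(M):
    return M * M         # assumed example of g

h_vals = [1, 4, 9]       # hypothetical values h(f(X_i))
m = 3
analytic = expected_stat(g, h_vals, m)
brute = expected_stat_enumerated(g, h_vals, m)
print(analytic, brute)
```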
X is an n x 1 vector of original sample values.
X* is an m x 1 vector of bootstrapped sample values.
X* = HX, where the rows of H are all zeros except for a one in the position corresponding to the element of X that was randomly drawn.
$E_H[H] = (1/n)\,1_m 1_n'$, where $1_m$ and $1_n$ are column vectors of ones.

$\theta_m^* = g(X^*) = g(HX)$

Taylor series expansion:

$$\theta_m^* = g(X_o^*) + [G_1(X_o^*)]'(X^* - X_o^*) + \tfrac{1}{2}(X^* - X_o^*)'[G_2(X_o^*)](X^* - X_o^*) + R^*$$

Setup for empirical process: $X_o^* = H_o X$
Setup for analytical solution:
X* = HX, where the rows of H are all zeros except for a one in the position corresponding to the element of X that was randomly drawn.
$X_o^* = H_o X$, with $H_o = E_H[H] = (1/n)\,1_m 1_n'$

$\theta_m^* = g(X^*) = g(HX)$

Taylor series:

$$\theta_m^* = g\big((1/n)1_m 1_n' X\big) + \big[G_1\big((1/n)1_m 1_n' X\big)\big]'\big(H - (1/n)1_m 1_n'\big) X$$

$$\qquad + \tfrac{1}{2}\, X'\big(H - (1/n)1_m 1_n'\big)'\big[G_2\big((1/n)1_m 1_n' X\big)\big]\big(H - (1/n)1_m 1_n'\big) X + R^*$$

Now ready to determine exact finite-sample moments, et cetera.
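The selection-matrix representation can be made concrete with plain lists. The sketch below builds H from a vector of drawn indices, applies it to X, and confirms by enumeration that the average of H over all equally likely draws has every entry equal to 1/n. The helper names and example values are hypothetical.

```python
from fractions import Fraction
from itertools import product

def selection_matrix(idx, n):
    """m x n matrix H: row r has a one in column idx[r] and zeros elsewhere."""
    return [[1 if c == i else 0 for c in range(n)] for i in idx]

def matvec(H, x):
    """Matrix-vector product H x using plain Python lists."""
    return [sum(h * v for h, v in zip(row, x)) for row in H]

def expected_H(n, m):
    """Exact E_H[H] by averaging H over all n**m equally likely index vectors."""
    total = [[0] * n for _ in range(m)]
    count = 0
    for idx in product(range(n), repeat=m):
        H = selection_matrix(idx, n)
        for r in range(m):
            for c in range(n):
                total[r][c] += H[r][c]
        count += 1
    return [[Fraction(v, count) for v in row] for row in total]

X = [10, 20, 30]                              # hypothetical original sample, n = 3
x_star = matvec(selection_matrix((2, 0, 2), 3), X)
EH = expected_H(3, 2)                         # m = 2 rows, n = 3 columns
print(x_star, EH)
```

Because every entry of E_H[H] is 1/n, the expansion point Ho X is an m x 1 vector whose elements all equal the sample mean.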
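$\hat{\beta} = (X'X)^{-1}X'Y$
$e = Y - X\hat{\beta}$, so that $e = (I_n - X(X'X)^{-1}X')\,\varepsilon$, with elements $\{e_1, e_2, \ldots, e_n\}$
$Y^* = X\hat{\beta} + A e^*$
$\hat{\beta}^* = (X'X)^{-1}X'Y^*$
$e^* = He$, with elements $\{e_1^*, e_2^*, \ldots, e_n^*\}$
$E_H[H] = (1/n)\,1_n 1_n'$
$A = I_n - (1/n)\,1_n 1_n'$
No restrictions on covariance matrix for errors.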
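Applied Econometrician:

$$\operatorname{Cov}_H(\hat{\beta}^* \mid \varepsilon) = (X'X)^{-1}X'A\left\{\frac{1}{n}(e'e)\, I_n + \frac{1}{n^2}\big[(1_n' \otimes 1_n')\operatorname{vec}(ee')\big]\big(1_n 1_n' - I_n\big)\right\} A X (X'X)^{-1}$$

$$\qquad - \frac{1}{n^2}(X'X)^{-1}X'A\, 1_n 1_n'\, e e'\, 1_n 1_n'\, A X (X'X)^{-1}$$

where $A = I_n$ or $A = I_n - (1/n)\,1_n 1_n'$, so that in the latter case $A 1_n 1_n' = 0$ and $1_n 1_n' A = 0$.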
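The core of the conditional covariance is the covariance of the resampled residual vector e* = He. Because the rows of H are drawn independently, that covariance is [(1/n) e'e − (1/n²)(1n'e)²] In, which then sits inside the sandwich (X'X)⁻¹X'A[·]AX(X'X)⁻¹. The sketch below checks this inner matrix by enumerating all n^n possible H for a small hypothetical residual vector; it is an illustration, not the slides' derivation, and the function name is an assumption.

```python
from fractions import Fraction
from itertools import product

def cov_resampled(e):
    """Exact covariance matrix of e* = He over all n**n equally likely H."""
    n = len(e)
    draws = [[e[i] for i in idx] for idx in product(range(n), repeat=n)]
    N = len(draws)
    mean = [Fraction(sum(d[j] for d in draws), N) for j in range(n)]
    return [[sum((d[j] - mean[j]) * (d[l] - mean[l]) for d in draws) / N
             for l in range(n)] for j in range(n)]

e = [1, -2, 4]                   # hypothetical residuals, n = 3
n = len(e)
cov = cov_resampled(e)
# (1/n) e'e - (1/n^2) (1'e)^2
scale = Fraction(sum(v * v for v in e), n) - Fraction(sum(e) ** 2, n * n)
print(cov, scale)
```

The off-diagonal entries are zero because distinct elements of e* are independent draws, and the common diagonal entry is the population-style variance of the residuals.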
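Econometric theorist:

$$\operatorname{Cov}_{H,\varepsilon}(\hat{\beta}^*) = \operatorname{Cov}_{\varepsilon}(\hat{\beta}) + \left[\frac{1}{n}\operatorname{tr}(\Psi) - \frac{1}{n^2}(1_n' \otimes 1_n')\operatorname{vec}(\Psi)\right](X'X)^{-1}X'AX(X'X)^{-1}$$

where $\Psi = (I_n - X(X'X)^{-1}X')\, E[\varepsilon\varepsilon']\, (I_n - X(X'X)^{-1}X')$ and $A = I_n - (1/n)\,1_n 1_n'$.
No restrictions on $E[\varepsilon\varepsilon']$.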
This is the first of three papers:
(1.) Bootstrap’s Finite Sample Distribution (today)
(2.) Bootstrapped Asymptotically Pivotal Statistics (basically done)
(3.) Bootstrap Hypothesis Testing and Confidence Intervals (almost done)
Thank you!