Simultaneous confidence bands in spectral density estimation

Michael H. Neumann
Friedrich-Schiller-Universität Jena
Institut für Stochastik
Ernst-Abbe-Platz 2
D – 07743 Jena, Germany
E-mail: [email protected]

Efstathios Paparoditis
University of Cyprus
Department of Mathematics and Statistics
P.O. Box 20537
CY – 1678 Nicosia, Cyprus
E-mail: [email protected]

Abstract

We propose a method for the construction of simultaneous confidence bands for (a smoothed version of) the spectral density of a Gaussian process based on nonparametric kernel estimators obtained by smoothing the periodogram. A studentized statistic is used to determine the width of the band at each frequency and a frequency domain bootstrap approach is employed in order to estimate the distribution of the supremum of this statistic over all frequencies. We prove by means of strong approximations that the bootstrap estimates consistently the distribution of the supremum deviation of interest and, consequently, that the proposed confidence bands achieve asymptotically the desired simultaneous coverage probability. The behavior of our method in finite sample situations is investigated by simulations and a real-life data example demonstrates its applicability in time series analysis.

2000 Mathematics Subject Classification. Primary 62G15; secondary 62M15.
Keywords and Phrases. Bootstrap, confidence bands, Gaussian processes, spectral density, strong approximation.
Short title. Confidence bands for the spectral density.
1. Introduction
Estimating the spectral density of a stochastic process is an important step in
the statistical analysis of its second order characteristics. Different parametric and
non-parametric procedures have been proposed for this purpose and are now well
investigated in the literature. As in any estimation problem, apart from the con-
struction of point estimators with desirable statistical properties, the construction of
interval estimators that simultaneously contain the unknown spectral density with a
pre-specified probability is also important. Such bands are useful in many situations.
For instance, simultaneous confidence bands can be used to decide if particular fea-
tures of the estimated spectral density are due to the covariance structure of the
underlying process or to the randomness of the spectral estimator used. Confidence
bands are also useful in checking the fit of parametric models. Such checks can be
done by examining if the spectral density of the fitted parametric model lies over all
frequencies within the nonparametrically obtained simultaneous confidence bands
for the spectral density of the process generating the observed time series.
In contrast to point estimators, however, the construction of simultaneous confidence
bands for the spectral density has received less attention in the statistical
literature, and only a few studies address it. They mainly focus on the
parametric case of a finite order autoregressive process. In particular, for Gaussian
autoregressive processes, Newton and Pagano (1984) proposed a method for the
construction of simultaneous confidence bands based on properties of the reciprocal
spectral density and Scheffé's projections. Tomasek (1987) derived simultaneous
confidence bands for the autoregressive spectral density using asymptotic properties
of parametric spectral density estimators and Šidák's inequality. For the vector
autoregressive case, Sakai and Sakaguchi (1990), using a method proposed by Koslov
and Jones (1985), and Hrafnkelsson and Newton (2000), extending the method proposed
by Tomasek (1987), developed different procedures for the construction of
simultaneous confidence bands for the components of the spectral density matrix
or of particular functions thereof. Although the assumption of a finite order
autoregressive structure allows the implementation of (efficient) parametric spectral
density estimators for the construction of confidence bands, it largely restricts the
applicability of the methods proposed.
This paper proposes a nonparametric method to construct simultaneous confi-
dence bands for (a smoothed version of) the spectral density of Gaussian processes.
The method does not rely on parametric structural assumptions on the underlying
stochastic process. Whenever one constructs nonparametric pointwise confidence
intervals or simultaneous confidence bands one faces a notorious bias problem. It
results from the fact that nonparametric curve estimation in the supremum norm
is an ill-posed inverse statistical problem. Problems at the practical level, even un-
der smoothness conditions on fX , emerge as follows. If the bandwidth is chosen of
mean-square-error (MSE) optimal order, then bias and standard deviation will be
of the same order of magnitude. The stochastic term can be taken into account by
asymptotic theory (the limiting process is a certain Gaussian process) or eventually
even better by some bootstrap technique. There is, however, no really satisfactory
approach to deal with the bias term. One can try to estimate it explicitly, however,
consistency of this estimator requires that some degrees of smoothness of fX are not
used by the initial estimator. Alternatively, one can choose the bandwidth of smaller
than MSE-optimal order to keep the bias negligible. This seems to be not really
practicable since a well-motivated rule for choosing an undersmoothing bandwidth
is not available, especially for any finite n. These problems can also be seen from a
different angle. Both remedies against the bias problem necessarily require that the
underlying estimator is not asymptotically optimal in the mean square sense. To
circumvent these problems, we urge the reader to rethink the initial goal
of setting up a confidence band for $f_X$ and suggest constructing the confidence band
for a kernel-smoothed version of $f_X$, which turns the problem into a well-posed one.
We define a convolution operator $K_h(\cdot)$ as
\[ K_h(f_X)(\lambda) = \int K_h(\lambda - \omega)\, f_X(\omega)\, d\omega, \]
where $K_h(\cdot) = h^{-1}K(\cdot/h)$, and $K$ and $h = h_n$ are the smoothing kernel and
the smoothing bandwidth, respectively. Our aim is to construct a confidence band for $K_h(f_X)$.
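To make the target $K_h(f_X)$ concrete, the following is a minimal numerical sketch, not part of the paper. It assumes an AR(1) spectral density as the example $f_X$, an Epanechnikov-type kernel rescaled to $[-\pi, \pi]$, and a midpoint rule for the integral; the function names are hypothetical.

```python
import math

def ar1_spectral_density(lam, phi=0.5, sigma2=1.0):
    """Spectral density of X_t = phi * X_{t-1} + eps_t (illustrative example only)."""
    return sigma2 / (2.0 * math.pi * (1.0 - 2.0 * phi * math.cos(lam) + phi * phi))

def kernel(u):
    """Epanechnikov-type kernel on [-pi, pi]; nonnegative, symmetric, integrates to one."""
    if abs(u) > math.pi:
        return 0.0
    return 3.0 / (4.0 * math.pi) * (1.0 - (u / math.pi) ** 2)

def smoothed_density(f, lam, h, grid=2000):
    """Midpoint-rule approximation of K_h(f)(lam) = int K_h(lam - w) f(w) dw."""
    lo, hi = lam - h * math.pi, lam + h * math.pi  # support of K_h(lam - .)
    step = (hi - lo) / grid
    total = 0.0
    for i in range(grid):
        w = lo + (i + 0.5) * step
        total += kernel((lam - w) / h) / h * f(w)
    return total * step

# Smoothing flattens the peak of the AR(1) density at frequency zero.
peak = ar1_spectral_density(0.0)
smoothed_peak = smoothed_density(ar1_spectral_density, 0.0, h=0.2)
```

The band constructed in the paper is centred on an estimate of the smoothed curve, so near sharp peaks it targets `smoothed_peak` rather than `peak`; the gap between the two is the bias that the smoothed target avoids having to control.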
The method proposed uses, as a starting point, a nonparametric kernel-type
estimator of the spectral density obtained by smoothing the sample spectral density
(periodogram). To determine appropriately the width of the confidence band at
each frequency, the distribution of the supremum deviation over all frequencies of a
studentized version of the nonparametric estimator applied is used. The width of
the confidence band varies then according to the changing variability associated with
estimating the underlying spectral density at different frequencies. The distribution
of the supremum deviation of the studentized statistic involved in our construction
is estimated using a frequency domain bootstrap procedure which exploits the fact
that periodogram ordinates of a Gaussian noise process at the Fourier frequencies
are independent. This allows the approximation of the random behavior of sums
of weakly dependent random variables by that of independent ones. Asymptotic
validity of the bootstrap procedure proposed to approximate the desired supremum
distribution is then established by means of strong approximations. Using this
basic result we prove that the confidence bands obtained achieve asymptotically the
desired simultaneous coverage probability.
The paper is organized as follows. After stating the main assumptions imposed
in Section 2, we introduce the nonparametric spectral density estimator and the
basic studentized statistic used in our approach. The bootstrap method proposed
to approximate the supremum deviations is presented and its asymptotic validity
is established. We conclude this section by stating the main result of the paper re-
garding the asymptotic behavior of the coverage probability of the confidence bands
proposed. Section 3 presents some numerical examples illustrating the behavior of
our method in finite sample situations and a real-life data example demonstrates
its applicability in time series analysis. Finally, proofs of all results are deferred to
Section 4.
2. Confidence bands for the Spectral density
2.1. Preliminaries. We consider real-valued random variables $X_1, X_2, \ldots, X_n$
observed from a stochastic process $(X_t)_{t\in\mathbb{Z}}$ satisfying the following assumption.
Assumption 1: $(X_t)_{t\in\mathbb{Z}}$ is a zero mean, stationary Gaussian process satisfying
\[ \sum_{k=0}^{\infty} k\,|c_k| < \infty, \tag{2.1} \]
where $c_k = \mathrm{cov}(X_t, X_{t+k})$ is the autocovariance at lag $k \in \mathbb{Z}$. Furthermore, we
assume that the spectral density $f_X$ of $(X_t)_{t\in\mathbb{Z}}$ is everywhere positive. Notice that
by (2.1), $f_X$ exists, is Lipschitz continuous and is given by
\[ f_X(\lambda) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} c_k \cos(\lambda k), \qquad \lambda \in [-\pi, \pi]. \]
[Author note left in the draft, translated from German: "Stathis, or would you rather have $e^{-i\lambda k}$ in the sum?"]
Moreover, we assume that $f_X$ is bounded away from zero, that is
\[ \inf_{\lambda\in[-\pi,\pi]} f_X(\lambda) > 0. \tag{2.2} \]
Our aim is to devise simultaneous confidence bands for fX or for some smoothed
version thereof, cf. Section 2.2, with an asymptotic coverage probability of 1−α, for
some given α ∈ (0, 1). Toward this goal we first consider a class of nonparametric
estimators of fX . A common starting point for many nonparametric estimators
proposed in the literature is the periodogram
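\[ I_{n,X}(\lambda) = \frac{1}{2\pi}\,|J_{n,X}(\lambda)|^2 = \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \cos(\lambda k) \Bigl( \frac{1}{n} \sum_{t=1}^{n-|k|} X_t X_{t+|k|} \Bigr), \qquad \lambda \in [-\pi, \pi], \]
where $J_{n,X}(\lambda) = n^{-1/2} \sum_{t=1}^{n} X_t e^{-i\lambda t}$ is the finite Fourier transform of $X_1, X_2, \ldots, X_n$.
Commonly the periodogram is calculated at the Fourier frequencies $\lambda_k = 2\pi k/n$,
$k \in K_n = \{-[(n-1)/2], \ldots, [n/2]\}$.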
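The two expressions for the periodogram can be checked against each other numerically. The sketch below is an illustration only: it uses a direct $O(n^2)$ sum from the Python standard library rather than an FFT, and the function names are hypothetical.

```python
import cmath
import math

def fourier_indices(n):
    """Index set K_n = {-[(n-1)/2], ..., [n/2]} of the Fourier frequencies."""
    return list(range(-((n - 1) // 2), n // 2 + 1))

def periodogram(x, lam):
    """I(lam) = |J(lam)|^2 / (2*pi), with J(lam) = n^{-1/2} * sum_t X_t * exp(-i*lam*t)."""
    n = len(x)
    j = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(j) ** 2 / (2.0 * math.pi * n)

def periodogram_via_autocov(x, lam):
    """The same quantity written as a cosine series in the sample autocovariances."""
    n = len(x)
    total = 0.0
    for k in range(-(n - 1), n):
        a = abs(k)
        c_hat = sum(x[t] * x[t + a] for t in range(n - a)) / n
        total += math.cos(lam * k) * c_hat
    return total / (2.0 * math.pi)
```

For any finite series the two functions agree up to rounding error, and the $n$ ordinates at the Fourier frequencies satisfy the Parseval-type identity $(2\pi/n)\sum_{k\in K_n} I_{n,X}(\lambda_k) = n^{-1}\sum_t X_t^2$.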
The periodogram is not a consistent estimator of fX(λ) and a class of consis-
tent estimators is obtained by smoothing In,X(λ) over different frequencies, i.e., by
considering
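\[ \hat f_{n,X}(\lambda) = \sum_{k\in\mathbb{Z}} w_{n,k}(\lambda)\, I_{n,X}(\lambda_k). \tag{2.3} \]
In the following we derive our results for commonly used kernel estimators of $f_X$
by setting either
\[ w_{n,k}(\lambda) = \frac{2\pi}{n} K_h(\lambda - \lambda_k), \tag{2.4} \]
cf. Priestley (1981), or
\[ w_{n,k}(\lambda) = \int_{\lambda_k - \pi/n}^{\lambda_k + \pi/n} K_h(\lambda - \omega)\, d\omega, \tag{2.5} \]
cf. Müller and Prewitt (1992).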
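A minimal sketch of the estimator (2.3) with the weights (2.4) might look as follows. The Epanechnikov-type kernel and all function names are assumptions made for the illustration; the paper only requires a kernel satisfying its Assumption 2.

```python
import cmath
import math

def periodogram(x, lam):
    """I(lam) = |sum_t X_t * exp(-i*lam*t)|^2 / (2*pi*n)."""
    n = len(x)
    j = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(j) ** 2 / (2.0 * math.pi * n)

def kernel(u):
    """Epanechnikov-type kernel on [-pi, pi] (illustrative choice)."""
    if abs(u) > math.pi:
        return 0.0
    return 3.0 / (4.0 * math.pi) * (1.0 - (u / math.pi) ** 2)

def kernel_weights(lam, n, h):
    """Nonzero weights (2.4): pairs (k, (2*pi/n) * K_h(lam - lam_k))."""
    # K_h vanishes outside [-h*pi, h*pi], so only these indices contribute.
    # They may fall outside K_n; the periodogram is 2*pi-periodic and symmetric.
    k_lo = math.ceil((lam - h * math.pi) * n / (2.0 * math.pi))
    k_hi = math.floor((lam + h * math.pi) * n / (2.0 * math.pi))
    out = []
    for k in range(k_lo, k_hi + 1):
        lam_k = 2.0 * math.pi * k / n
        out.append((k, 2.0 * math.pi / n * kernel((lam - lam_k) / h) / h))
    return out

def kernel_estimate(x, lam, h):
    """Smoothed periodogram: sum_k w_{n,k}(lam) * I(lam_k)."""
    n = len(x)
    return sum(w * periodogram(x, 2.0 * math.pi * k / n)
               for k, w in kernel_weights(lam, n, h))
```

The weights form a Riemann sum of the kernel, so they add up to approximately one whenever $nh$ is not too small. A unit impulse series has the constant periodogram $1/(2\pi n)$, which `kernel_estimate` reproduces up to that Riemann-sum error.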
Notice that it may happen that we include in (2.3) some λk outside the interval
[−π, π] since we do not use one-sided kernels for estimation near the ends of [−π, π].
Notice further that in defining the periodogram, we could equally well use the mean-
corrected observations,
\[ I_{n,X-\bar X_n}(\lambda) = \frac{1}{2\pi n} \Bigl| \sum_{t=1}^{n} (X_t - \bar X_n)\, e^{-i\lambda t} \Bigr|^2, \]
where $\bar X_n = n^{-1} \sum_{t=1}^{n} X_t$ is the mean of the observed series. This would allow us to drop
the assumption that the process has zero mean. The asymptotic theory
developed in this paper carries over to this case as well, since $I_{n,X-\bar X_n}(\lambda_k) = I_{n,X}(\lambda_k)$
for $k \in \mathbb{Z}$ with $k \bmod n \neq 0$, which implies that the difference of the corresponding
kernel estimators is of negligible size.
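The invariance under mean correction holds because $\sum_{t=1}^{n} e^{-i\lambda_k t} = 0$ whenever $k$ is not a multiple of $n$. A small numerical check, with hypothetical helper names and an arbitrary made-up series, is sketched here:

```python
import cmath
import math

def periodogram(x, lam):
    """I(lam) = |sum_t X_t * exp(-i*lam*t)|^2 / (2*pi*n)."""
    n = len(x)
    j = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(j) ** 2 / (2.0 * math.pi * n)

def center(x):
    """Subtract the sample mean from every observation."""
    mean = sum(x) / len(x)
    return [xt - mean for xt in x]

def max_gap_at_nonzero_fourier_frequencies(x):
    """Largest |I_X(lam_k) - I_{X - mean}(lam_k)| over k = 1, ..., n-1."""
    n = len(x)
    xc = center(x)
    return max(abs(periodogram(x, 2.0 * math.pi * k / n)
                   - periodogram(xc, 2.0 * math.pi * k / n))
               for k in range(1, n))

# A series with a clearly nonzero mean (values chosen arbitrarily).
series = [3.0, 4.5, 2.0, 5.5, 4.0, 3.5, 6.0]
gap = max_gap_at_nonzero_fourier_frequencies(series)
```

Only the ordinate at frequency zero changes: it equals $n\bar X_n^2/(2\pi)$ for the raw series and vanishes after centering.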
We will assume the following.
Assumption 2: $K : \mathbb{R} \to \mathbb{R}$ is a nonnegative and symmetric kernel with
bounded total variation and support $[-\pi, \pi]$. Furthermore, $\int_{-\pi}^{\pi} K(x)\,dx = 1$.
Assumption 3: The smoothing bandwidth $h = h_n$ depends on $n$ and the
sequence $(h_n)_{n\in\mathbb{N}}$ fulfills $h_n \sim n^{-\eta}$ for some $\eta \in (0, 1)$.
Instead of the class of estimators (2.3) based on a weighted average of the
periodogram over the Fourier frequencies, we may also consider estimators of $f_X(\lambda)$
which are based on a convolution of the periodogram with a kernel function, i.e.,
estimators given by
\[ \tilde f_{n,X}(\lambda) = \int_{-\pi}^{\pi} K_h(\lambda - \omega)\, I_{n,X}(\omega)\, d\omega. \tag{2.6} \]
Approximating the above integral by the corresponding Riemann sum gives
\[ \frac{2\pi}{n} \sum_{k} K_h(\lambda - \lambda_k)\, I_{n,X}(\lambda_k). \]
By Theorem 5.9.1 of Brillinger (1981, p. 162), we have that if $K$ has a bounded
derivative, then
\[ \Bigl| \tilde f_{n,X}(\lambda) - \frac{2\pi}{n} \sum_{k} K_h(\lambda - \lambda_k)\, I_{n,X}(\lambda_k) \Bigr| = O_P\bigl(n^{-1}h^{-2} + \log(n)(nh)^{-1}\bigr), \]
where the $O_P$ term does not depend on $\lambda$. Thus if the kernel $K$ satisfies the
aforementioned smoothness condition and if $h_n \sim n^{-\eta}$ for some $\eta \in (0, 1/3)$, then the
asymptotic behavior of the estimators $\hat f_{n,X}(\lambda)$ and $\tilde f_{n,X}(\lambda)$ is identical. This
suggests that properties established for the confidence bands based on the estimator
(2.3) will carry over to those using estimator (2.6).
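2.2. Simultaneous confidence bands. We begin our construction of a confidence
band for $K_h(f_X)$ by considering the studentized statistic
\[ D_n(\lambda) = \frac{\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)}{\hat\sigma(\hat f_{n,X}(\lambda))}, \qquad \lambda \in [-\pi, \pi], \tag{2.7} \]
where $\hat\sigma(\hat f_{n,X}(\lambda))$ is an estimator of the standard deviation of the kernel estimator
$\hat f_{n,X}(\lambda)$, i.e., of $\sigma(\hat f_{n,X}(\lambda)) = \sqrt{\mathrm{var}(\hat f_{n,X}(\lambda))}$. We have that
$\mathrm{var}(I_{n,X}(\lambda_k)) = (1 + \delta_k) f_X^2(\lambda_k) + O(n^{-1/2})$ and
$\mathrm{cov}(I_{n,X}(\lambda_{k_1}), I_{n,X}(\lambda_{k_2})) = O(n^{-1})$ for $k_1 \neq k_2$, where
$\delta_k = 1$ if $\lambda_k$ is a multiple of $\pi$ (including $\lambda_k = 0$) and $\delta_k = 0$ otherwise; see Brockwell and Davis
(1991), Th. 10.3.2. This implies, in conjunction with $\sup_\lambda \{\sum_k |w_{n,k}(\lambda)|\} = O(1)$
and $\sup_\lambda \{\sum_k w_{n,k}^2(\lambda)\} = O(n^{-1}h^{-1})$, that
\[ \sigma^2(\hat f_{n,X}(\lambda)) = \sum_{k} w_{n,k}^2(\lambda)(1 + \delta_k) f_X^2(\lambda_k) + O(n^{-3/2}h^{-1} + n^{-1}). \tag{2.8} \]
This suggests the estimator
\[ \hat\sigma^2(\hat f_{n,X}(\lambda)) = \sum_{k} w_{n,k}^2(\lambda)(1 + \delta_k)\, \hat f_{n,X}^2(\lambda_k) \]
of $\sigma^2(\hat f_{n,X}(\lambda))$ which is used in (2.7).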
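The plug-in variance estimator can be sketched as below. This is an illustration under stated assumptions: the weights are those of (2.4) with an Epanechnikov-type kernel, `f_hat` is any callable returning the estimated spectral density, and all names are hypothetical.

```python
import math

def kernel(u):
    """Epanechnikov-type kernel on [-pi, pi] (illustrative choice)."""
    if abs(u) > math.pi:
        return 0.0
    return 3.0 / (4.0 * math.pi) * (1.0 - (u / math.pi) ** 2)

def kernel_weights(lam, n, h):
    """Nonzero weights (2.4): pairs (k, (2*pi/n) * K_h(lam - lam_k))."""
    k_lo = math.ceil((lam - h * math.pi) * n / (2.0 * math.pi))
    k_hi = math.floor((lam + h * math.pi) * n / (2.0 * math.pi))
    return [(k, 2.0 * math.pi / n * kernel((lam - 2.0 * math.pi * k / n) / h) / h)
            for k in range(k_lo, k_hi + 1)]

def delta(k, n):
    """delta_k = 1 if lam_k = 2*pi*k/n is a multiple of pi, and 0 otherwise."""
    return 1 if (2 * k) % n == 0 else 0

def sigma_hat(lam, n, h, f_hat):
    """Square root of sum_k w_{n,k}(lam)^2 * (1 + delta_k) * f_hat(lam_k)^2."""
    s2 = 0.0
    for k, w in kernel_weights(lam, n, h):
        lam_k = 2.0 * math.pi * k / n
        s2 += w * w * (1 + delta(k, n)) * f_hat(lam_k) ** 2
    return math.sqrt(s2)
```

For a flat density $f \equiv c$ and a frequency away from $0$ and $\pm\pi$, the sum is a Riemann approximation of $c^2 (2\pi/(nh)) \int K^2$, which for this particular kernel equals $1.2\,c^2/(nh)$. This matches the order $O(n^{-1}h^{-1})$ stated above. At $\lambda = 0$ the factor $1 + \delta_0 = 2$ inflates the estimate slightly.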
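Based on (2.7) a $(1 - \alpha)100\%$ simultaneous confidence band for $K_h(f_X)$ is obtained by
\[ \bigl[\, \hat f_{n,X}(\lambda) - t_{n,\alpha}\, \hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t_{n,\alpha}\, \hat\sigma(\hat f_{n,X}(\lambda)) \,\bigr], \qquad \lambda \in [-\pi, \pi], \tag{2.9} \]
where $t_{n,\alpha}$ denotes the upper $\alpha$-percentage point of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$.
Observe that the width of the interval (2.9) is proportional to $\hat\sigma(\hat f_{n,X}(\lambda))$ which
reflects the varying difficulty in estimating the unknown spectral density $f_X(\lambda)$ at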
different frequencies λ. Implementation of the above confidence band requires knowl-
edge of the distribution of supλ∈[−π,π] |Dn(λ)|. To approximate this distribution we
propose in the following a frequency domain bootstrap procedure which imitates the
distribution of a tractable approximation of the studentized statistic (2.7).
To elaborate on the approximation of Dn(λ) used, recall first the basic fact that
every non-deterministic stationary Gaussian process can be written as a causal linear
Gaussian process (see Proposition 2.1 of Fan and Yao (2003, p. 33)), that is, there
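exists a sequence of independent innovations $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$ such that
\[ X_t = \sum_{k=0}^{\infty} \psi_k\, \varepsilon_{t-k} \tag{2.10} \]
and the coefficients $\{\psi_k,\ k \in \mathbb{N} \cup \{0\}\}$ satisfy $\sum_{k=0}^{\infty} \psi_k^2 < \infty$.
[Author note left in the draft, translated from German: "Stathis, is a different result also possible here? After all, we have now even assumed $\sum_k |c_k|\,k < \infty$, see the new (2.1)..."]
Let $\psi(\omega) = \sum_{k=0}^{\infty} \psi_k \omega^k$ and denote by
\[ I_{n,\varepsilon}(\lambda) = \frac{1}{2\pi}\,|J_{n,\varepsilon}(\lambda)|^2 \]
the periodogram of the Gaussian noise series $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$, i.e., $J_{n,\varepsilon}(\lambda)$ is given by
$J_{n,\varepsilon}(\lambda) = n^{-1/2} \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t}$. By Theorem 10.3.1 of Brockwell and Davis (1991,
p. 347) we can express the periodogram as
\[ I_{n,X}(\lambda) = |\psi(e^{-i\lambda})|^2\, I_{n,\varepsilon}(\lambda) + (2\pi)^{-1} R_n(\lambda), \tag{2.11} \]
where $R_n(\lambda) = \psi(e^{-i\lambda}) J_{n,\varepsilon}(\lambda) Y_n(-\lambda) + \psi(e^{i\lambda}) J_{n,\varepsilon}(-\lambda) Y_n(\lambda) + |Y_n(\lambda)|^2$ and
\[ Y_n(\lambda) = n^{-1/2} \sum_{k=0}^{\infty} \psi_k e^{-i\lambda k} \Bigl( \sum_{t=1-k}^{n-k} \varepsilon_t e^{-i\lambda t} - \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t} \Bigr). \]
The random variable $J_{n,\varepsilon}(\lambda)$ is
complex normal distributed with mean zero and variance $\sigma_\varepsilon^2$ while $Y_n(\lambda)$ is complex
normal with mean zero and variance of order $O(n^{-1})$.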
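Using (2.11) and $|\psi(e^{-i\lambda})|^2 = f_X(\lambda)/f_\varepsilon(\lambda) = 2\pi f_X(\lambda)/\sigma_\varepsilon^2$ we can decompose
$D_n(\lambda)$ as follows:
\[
\begin{aligned}
D_n(\lambda) ={}& \sum_{k} w_{n,k}(\lambda) f_X(\lambda_k) \bigl( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \bigr) \big/ \hat\sigma(\hat f_{n,X}(\lambda)) \\
&+ (2\pi)^{-1} \sum_{k} w_{n,k}(\lambda) R_n(\lambda_k) \big/ \hat\sigma(\hat f_{n,X}(\lambda)) \\
&+ \Bigl( \sum_{k} w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda) \Bigr) \big/ \hat\sigma(\hat f_{n,X}(\lambda)).
\end{aligned}
\tag{2.12}
\]
We argue in the following that instead of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ it suffices to consider
$\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$, where
\[ \tilde D_n(\lambda) = \sum_{k} w_{n,k}(\lambda) f_X(\lambda_k) \bigl( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \bigr) \big/ \hat\sigma(\hat f_{n,X}(\lambda)), \tag{2.13} \]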
that is, the contributions of the second and of the third term on the right hand side of
(2.12) to the distribution of the supremum of interest are asymptotically negligible.
Notice that the study of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$ is simpler than that of
$\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ because $\sum_k w_{n,k}(\lambda) f_X(\lambda_k)(2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1)$ is a weighted sum
of independent random variables due to the fact that the $I_{n,\varepsilon}(\lambda_k)$'s are periodogram
ordinates of a Gaussian white noise series at the Fourier frequencies.
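To see why the distribution of the supremum of $|\tilde D_n(\lambda)|$ approximates correctly
the corresponding distribution of $|D_n(\lambda)|$, notice first that because of
$(\inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)))^{-1} = O_P((nh)^{1/2})$ we get by the properties of the kernel $K$
and of the spectral density $f_X$ that
\[
\begin{aligned}
\sup_{\lambda\in[-\pi,\pi]} \Bigl| \frac{\sum_k w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda)}{\hat\sigma(\hat f_{n,X}(\lambda))} \Bigr|
&\le \Bigl( \inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)) \Bigr)^{-1} \sup_\lambda \Bigl| \sum_k w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda) \Bigr| \\
&= O_P((nh)^{1/2})\, O((nh)^{-1}) = O_P((nh)^{-1/2}).
\end{aligned}
\tag{2.14}
\]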
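Furthermore, using the bound $P(|Y| > x) \le \sqrt{2/\pi}\,(1/x)\,e^{-x^2/2}$, for $Y \sim N(0, 1)$, we
obtain that for all $\gamma < \infty$ there exists a $C_\gamma < \infty$ such that
\[ \max_k \Bigl\{ P\Bigl( |R_n(\lambda_k)| > C_\gamma \frac{\log n}{\sqrt{n}} \Bigr) \Bigr\} = O(n^{-\gamma}). \tag{2.15} \]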
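The Gaussian tail bound used here can be checked numerically. The short sketch below, with hypothetical function names, compares the exact two-sided tail with the bound and shows how a threshold of order $\sqrt{\log n}$ yields a polynomially small probability, which is the mechanism behind rates of the form $O(n^{-\gamma})$.

```python
import math

def normal_two_sided_tail(x):
    """Exact P(|Y| > x) for Y ~ N(0, 1), via the complementary error function."""
    return math.erfc(x / math.sqrt(2.0))

def tail_bound(x):
    """Upper bound sqrt(2/pi) * (1/x) * exp(-x^2 / 2), valid for x > 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-x * x / 2.0) / x

def threshold(n, gamma):
    """x = sqrt(2 * gamma * log n), so that exp(-x^2 / 2) equals n^(-gamma)."""
    return math.sqrt(2.0 * gamma * math.log(n))

# With n = 1000 and gamma = 2 the bound already falls below n^(-gamma).
n_example, gamma_example = 1000, 2.0
bound_at_threshold = tail_bound(threshold(n_example, gamma_example))
```

At $x = \sqrt{2\gamma \log n}$ the bound equals $\sqrt{2/\pi}\, n^{-\gamma}/x$. The products of Gaussian terms that make up $R_n(\lambda_k)$ are handled with the same kind of estimate, which is where the $\log n$ factor in (2.15) comes from.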
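It now follows from (2.15) and (2.2) that for all $\gamma < \infty$ there exists $C_\gamma < \infty$ such
that
\[
\begin{aligned}
& P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \Bigl\{ \Bigl| \sum_k w_{n,k}(\lambda) R_n(\lambda_k) \Bigr| \Big/ \hat\sigma(\hat f_{n,X}(\lambda)) \Bigr\} > C_\gamma h^{1/2} \log n \Bigr) \\
&\quad \le P\Bigl( \Bigl( \inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)) \Bigr)^{-1} \sup_\lambda \Bigl\{ \sum_k |w_{n,k}(\lambda)| \Bigr\} \times \max_k \{ |R_n(\lambda_k)| \} > C_\gamma h^{1/2} \log n \Bigr) \\
&\quad = O(n^{-\gamma}).
\end{aligned}
\tag{2.16}
\]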
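Using (2.14) and (2.16) we finally obtain that
\[
\Bigl| \sup_\lambda |D_n(\lambda)| - \sup_\lambda |\tilde D_n(\lambda)| \Bigr| \le \sup_\lambda \bigl| D_n(\lambda) - \tilde D_n(\lambda) \bigr|
= O_P((nh)^{-1/2}) + O_P(h^{1/2} \log n),
\tag{2.17}
\]
that is, the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ can be well approximated by the
distribution of $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$.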
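2.3. Bootstrap Approximations. In view of (2.17) it is clear that in order to
evaluate the distribution of $\sup_\lambda |D_n(\lambda)|$ appropriately, it suffices to mimic the behavior
of the random variables
\[ \xi_k = f_X(\lambda_k) \bigl( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \bigr) \]
by the bootstrap. Since the innovations $\varepsilon_t$ are independent with $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$, the
random variables $\xi_0, \ldots, \xi_{[n/2]}$ are independent with
\[
\xi_k \sim
\begin{cases}
f_X(\lambda_k)(\chi_2^2/2 - 1), & \text{if } 1 \le k < n/2, \\
f_X(\lambda_k)(\chi_1^2 - 1), & \text{if } k \in \{0, n/2\}.
\end{cases}
\]
Here $\chi_m^2$ denotes the $\chi^2$-distribution with $m$ degrees of freedom. Thus to mimic $\xi_k$
it is natural to generate independent random variables $\gamma_0^*, \ldots, \gamma_{[n/2]}^*$, which are also
independent of the original sample $X_1, \ldots, X_n$, with
\[
\gamma_k^* \sim
\begin{cases}
\chi_2^2/2 - 1, & \text{if } 1 \le k < n/2, \\
\chi_1^2 - 1, & \text{if } k \in \{0, n/2\}.
\end{cases}
\]
The bootstrap counterparts of the $\xi_k$ are then defined as
\[ \xi_k^* = \hat f_{n,X}(\lambda_k)\, \gamma_k^*, \qquad k = 0, \ldots, [n/2]. \]
According to the $2\pi$-periodicity and the symmetry of the periodogram we define
further $\xi_{[n/2]+k}^* = \xi_{n-[n/2]-k}^*$ $(k = 1, \ldots, n - [n/2])$ and $\xi_{-k}^* = \xi_k^*$ $(k = 1, \ldots, n)$. The
bootstrap counterpart of $\tilde D_n(\lambda) = \sum_k w_{n,k}(\lambda)\xi_k / \hat\sigma(\hat f_{n,X}(\lambda))$ is then given by
\[ D_n^*(\lambda) = \sum_{k} w_{n,k}(\lambda)\, \xi_k^* \big/ \hat\sigma(\hat f_{n,X}(\lambda)), \qquad \lambda \in [-\pi, \pi]. \]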
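Based on this bootstrap approximation, the $(1-\alpha)100\%$ simultaneous confidence
band is given by
\[ \bigl[\, \hat f_{n,X}(\lambda) - t_{n,\alpha}^*\, \hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t_{n,\alpha}^*\, \hat\sigma(\hat f_{n,X}(\lambda)) \,\bigr], \qquad \lambda \in [-\pi, \pi], \]
where $t_{n,\alpha}^*$ denotes the upper $\alpha$-percentage point of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n^*(\lambda)|$.
Note that this distribution can be evaluated by Monte Carlo simulation.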
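The Monte Carlo evaluation can be sketched as follows. This is an illustrative reading of the procedure described above, not the authors' code. It assumes the weights (2.4) with an Epanechnikov-type kernel, approximates the supremum over $\lambda$ by a maximum over the Fourier frequencies in $[0, \pi]$, and uses hypothetical names and a small default number of replications.

```python
import cmath
import math
import random

def kernel(u):
    """Epanechnikov-type kernel on [-pi, pi] (illustrative choice)."""
    return 3.0 / (4.0 * math.pi) * (1.0 - (u / math.pi) ** 2) if abs(u) <= math.pi else 0.0

def fold(j, n):
    """Map any index to {0, ..., [n/2]} using 2*pi-periodicity and symmetry."""
    m = j % n
    return m if m <= n // 2 else n - m

def kernel_weights(lam, n, h):
    """Nonzero weights (2.4): pairs (k, (2*pi/n) * K_h(lam - lam_k))."""
    k_lo = math.ceil((lam - h * math.pi) * n / (2.0 * math.pi))
    k_hi = math.floor((lam + h * math.pi) * n / (2.0 * math.pi))
    return [(k, 2.0 * math.pi / n * kernel((lam - 2.0 * math.pi * k / n) / h) / h)
            for k in range(k_lo, k_hi + 1)]

def bootstrap_band(x, h, alpha=0.05, reps=200, rng=None):
    """Return (grid, f_hat, lower, upper, t_star) on the Fourier frequencies in [0, pi]."""
    rng = rng or random.Random(0)
    n, half = len(x), len(x) // 2
    # Periodogram at lam_k, k = 0, ..., [n/2] (direct sum; an FFT would be faster).
    pg = [abs(sum(xt * cmath.exp(-2j * math.pi * k * t / n)
                  for t, xt in enumerate(x, start=1))) ** 2 / (2.0 * math.pi * n)
          for k in range(half + 1)]
    grid = [2.0 * math.pi * k / n for k in range(half + 1)]
    w = [kernel_weights(lam, n, h) for lam in grid]
    f_hat = [sum(wk * pg[fold(k, n)] for k, wk in wl) for wl in w]
    # Plug-in standard deviation: sum_k w^2 * (1 + delta_k) * f_hat(lam_k)^2.
    sig = [math.sqrt(sum(wk * wk * (2 if (2 * k) % n == 0 else 1) * f_hat[fold(k, n)] ** 2
                         for k, wk in wl)) for wl in w]
    sups = []
    for _ in range(reps):
        # gamma*_k is chi^2_1 - 1 at k = 0 (and k = n/2 for even n); otherwise
        # chi^2_2 / 2 - 1, i.e. a standard exponential variable minus one.
        gam = [rng.gauss(0.0, 1.0) ** 2 - 1.0 if (2 * k) % n == 0
               else rng.expovariate(1.0) - 1.0 for k in range(half + 1)]
        xi = [f_hat[k] * gam[k] for k in range(half + 1)]
        sups.append(max(abs(sum(wk * xi[fold(k, n)] for k, wk in wl)) / s
                        for wl, s in zip(w, sig)))
    sups.sort()
    # Empirical upper alpha-percentage point of the simulated suprema.
    t_star = sups[min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)]
    lower = [f - t_star * s for f, s in zip(f_hat, sig)]
    upper = [f + t_star * s for f, s in zip(f_hat, sig)]
    return grid, f_hat, lower, upper, t_star
```

A call such as `bootstrap_band(x, h=0.3)` on a list of observations returns the band on the grid of Fourier frequencies. In practice one would use more replications and a finer frequency grid; both choices here are kept small only to keep the sketch fast.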
The following proposition establishes asymptotic validity of the bootstrap
procedure proposed because it shows that $(\tilde D_n(\lambda))_{\lambda\in[-\pi,\pi]}$ is consistently mimicked by
its bootstrap analogue $(D_n^*(\lambda))_{\lambda\in[-\pi,\pi]}$.
Proposition 2.1. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$
are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then, there
exists a coupling of the random variables $\xi_{-n}, \ldots, \xi_n$ and $\xi_{-n}^*, \ldots, \xi_n^*$ (the latter
having a distribution conditioned on $X_1, \ldots, X_n$) on an appropriate joint probability
space such that
\[ P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \bigl| \tilde D_n(\lambda) - D_n^*(\lambda) \bigr| > n^{\delta} \bigl( (nh)^{-1/4} + h \bigr) \Bigr) = O(n^{-\gamma}) \]
holds for arbitrary $\delta > 0$.
Notice that the bootstrap procedure used to generate replicates of the $\xi_k$'s is not
new. It was proposed, in a context different from that considered here, by Hurvich
and Zeger (1987). Franke and Härdle (1992) investigated asymptotic properties of a
version of this procedure based on i.i.d. resampling of estimated frequency domain
residuals $I_{n,X}(\lambda_k)/\hat f_{n,X}(\lambda_k)$ instead of the $\chi^2$-distributed random variables $\gamma_k^*$. See
also Dahlhaus and Janas (1996) for the asymptotic properties of this procedure for
different classes of periodogram based statistics.
It is worth mentioning here that, as a careful inspection of the proof of
Proposition 2.1 shows, in order for the bootstrap to estimate consistently the random
behavior of $(\tilde D_n(\lambda))_{\lambda\in[-\pi,\pi]}$, the random variables used to mimic the behavior of the
$\xi_k$'s can alternatively be defined as
\[ \xi_k^+ = I_{n,X}(\lambda_k)\, \gamma_k^*, \qquad k = 0, \ldots, [n/2]. \]
That is, the periodogram $I_{n,X}(\lambda_k)$ can be used in place of the estimated spectral
density $\hat f_{n,X}(\lambda_k)$ and the distribution of $\sum_k w_{n,k}(\lambda)\xi_k^* / \hat\sigma(\hat f_{n,X}(\lambda))$ can be imitated
by that of $\sum_k w_{n,k}(\lambda)\xi_k^+ / \hat\sigma(\hat f_{n,X}(\lambda))$. Our simulation findings suggest, however, that
using the estimated spectral density $\hat f_{n,X}(\lambda_k)$ leads to better results in finite sample
situations.
2.4. Main Results. We first give a lemma which provides a concentration inequal-
ity for the supremum deviation and which implies that the strong approximation
result stated in Proposition 2.1 is good enough for proving consistency of the boot-
strap method.
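Lemma 2.2. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are
given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then
\[
P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \bigl\{ \bigl| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \bigr| \bigr\} \in [c, d] \Bigr)
= O\bigl( (d - c)\sqrt{nh \log n} + h(\log n)^{3/2} + n^{\delta}(nh)^{-1/2} \bigr)
\]
holds for arbitrary $\delta > 0$.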
The following theorem is the main result of this paper. It states that the proposed
bootstrap confidence band achieves asymptotically the desired coverage probability.
Theorem 2.3. Suppose that for every λ ∈ [−π, π] the weights {wn,k(λ), k ∈ Z} are
given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then
The coupling of $\eta$ and $\eta^*$ will be defined by expressing both vectors by increments
of the same Wiener process. This Wiener process serves as an appropriate tool
to connect the $\eta_{j,k}$ with the $\eta_{j,k}^*$ in such a way that partial sums of these with
slowly changing weights are close to each other. By interpolation with independent
Brownian bridges we build a Wiener process $(W(t))_{t\in[0,\Delta]}$ such that
\[ \eta_{j,k} = V_j^{1/2} \bigl( W(s_{j,k}) - W(s_{j,k-1}) \bigr). \]
Now we define, conditioned on $X_1, \ldots, X_n$, independent random variables $\eta_{j,k}^* \sim N(0, v_{j,k}^*)$ as
\[ \eta_{j,k}^* = (V_j^*)^{1/2} \bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr). \]
Moreover, the remaining $\eta_{j,k}$ and $\eta_{j,k}^*$ are again defined according to the properties
of $2\pi$-periodicity and symmetry of the periodogram, that is, $\eta_{[n/2]+k} = \eta_{n-[n/2]-k}$,
$\eta_{[n/2]+k}^* = \eta_{n-[n/2]-k}^*$ $(1 \le k \le n - [n/2])$, and $\eta_{-k} = \eta_k$, $\eta_{-k}^* = \eta_k^*$ $(1 \le k \le n)$.
We decompose $\sum_l w_{n,l}(\lambda)(\eta_l - \eta_l^*) = \sum_j \sum_k w_{n,j,k}(\lambda)(\eta_{j,k} - \eta_{j,k}^*)$ into a "coarse
structure" term,
\[ \Delta_1(\lambda) = \sum_j \bigl( V_j^{1/2} - (V_j^*)^{1/2} \bigr) \sum_k w_{n,j,k}(\lambda) \bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr), \]
and a "fine structure" term,
\[ \Delta_2(\lambda) = \sum_j V_j^{1/2} \sum_k w_{n,j,k}(\lambda) \bigl[ \bigl( W(s_{j,k}) - W(s_{j,k-1}) \bigr) - \bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr) \bigr]. \]
We obtain from (4.3) that
\[
\begin{aligned}
\max_{j,k} \{ |t_{j,k} - t_{j,k}^*| \}
&= \max_{j,k} \Bigl| \sum_{l=1}^{k} \bigl( \hat f_{n,X}^2(\lambda_{j,l}) - f_X^2(\lambda_{j,l}) \bigr) \bigl( 1 + I(\lambda_{j,l} \bmod \pi = 0) \bigr) \Bigr| \\
&= O\bigl( n^{\delta}(nh)^{1/2},\ n^{-\gamma} \bigr) + O(nh^3).
\end{aligned}
\tag{4.10}
\]
This yields by $V_j \asymp V_j^* \asymp nh$ (for $V_j^*$, with a probability exceeding $1 - O(n^{-\gamma})$) that
\[ \max_j \bigl\{ | V_j^{1/2} - (V_j^*)^{1/2} | \bigr\} = \max_j \frac{|V_j - V_j^*|}{V_j^{1/2} + (V_j^*)^{1/2}} = O\bigl( n^{\delta} + n^{1/2}h^{5/2},\ n^{-\gamma} \bigr). \]
Therefore, and since $\sum_k w_{n,j,k}(\lambda)\bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr)$ is normally distributed with
zero mean and a variance of order $O((nh)^{-2})$, we get immediately that
\[ |\Delta_1(\lambda)| = O\bigl( n^{\delta}(nh)^{-1} + (nh)^{-1/2}h^2,\ n^{-\gamma} \bigr). \]
Proving this on an appropriate sequence of increasingly fine grids we also obtain
that
\[ \sup_{\lambda\in[-\pi,\pi]} \{ |\Delta_1(\lambda)| \} = O\bigl( n^{\delta}[(nh)^{-1} + (nh)^{-1/2}h^2],\ n^{-\gamma} \bigr). \tag{4.11} \]
To estimate $\Delta_2(\lambda)$, we rewrite it as
\[
\begin{aligned}
\Delta_2(\lambda) &= \sum_j V_j^{1/2} \sum_k w_{n,j,k}(\lambda) \Bigl[ \int_{s_{j,k-1}}^{s_{j,k}} dW(t) - \int_{s_{j,k-1}^*}^{s_{j,k}^*} dW(t) \Bigr] \\
&= \sum_j V_j^{1/2} \int_{j-1}^{j} [ w_t(\lambda) - w_t^*(\lambda) ]\, dW(t),
\end{aligned}
\]
where $w_t(\lambda) = w_{n,j,k}(\lambda)$, if $t \in (s_{j,k-1}, s_{j,k}]$, and $w_t^*(\lambda) = w_{n,j,k}(\lambda)$, if $t \in (s_{j,k-1}^*, s_{j,k}^*]$.
(Note that the integrands in the integrals above are piecewise constant which means
that the integrals can be computed as weighted sums of increments of $W$; the more
general concept of stochastic integrals is not needed here.) We conclude from (4.10)
that
\[ |s_{j,k} - s_{j,k}^*| \le \frac{|t_{j,k} - t_{j,k}^*|}{V_j} + \frac{t_{j,k}^*}{V_j^*} \cdot \frac{|V_j^* - V_j|}{V_j} = O\bigl( n^{\delta}(nh)^{-1/2} + h^2,\ n^{-\gamma} \bigr). \]
Moreover, since $K$ has finite total variation we obtain from this relation that
\[ \int ( w_t(\lambda) - w_t^*(\lambda) )^2\, dt = O\bigl( n^{\delta}(nh)^{-5/2} + n^{-2},\ n^{-\gamma} \bigr), \tag{4.12} \]
which yields that
\[ \Delta_2(\lambda) = O\bigl( n^{\delta}[(nh)^{-3/4} + (nh)^{-1/2}h],\ n^{-\gamma} \bigr). \]
Proving this again on an appropriate sequence of increasingly fine grids we conclude
that
\[ \sup_{\lambda\in[-\pi,\pi]} \{ |\Delta_2(\lambda)| \} = O\bigl( n^{\delta}[(nh)^{-3/4} + (nh)^{-1/2}h],\ n^{-\gamma} \bigr). \tag{4.13} \]
The assertion now follows from (4.7), (4.9), (4.11) and (4.13). □
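Proof of Lemma 2.2. As in the proof of Proposition 2.1, (4.4) yields that we can
again ignore the effect of estimating the unknown standard deviation of $\hat f_{n,X}(\lambda)$ and
consider $(\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda))/\sigma(\lambda)$ instead of $(\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda))/\hat\sigma(\lambda)$. Recall
from (2.16) and (4.7) that
\[
\sup_{\lambda\in[-\pi,\pi]} \bigl\{ \bigl| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \bigr| \bigr\}
= \sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\eta_j \Bigr| + O\Bigl( \frac{\log n}{\sqrt{n}} + n^{\delta}(nh)^{-1},\ n^{-\gamma} \Bigr),
\tag{4.14}
\]
where the $\eta_j$ are independent normal random variables. We approximate the latter
supremum by the maximum over the Fourier frequencies. Since
$\sum_j |w_{n,j}(\lambda) - w_{n,j}(\omega)| = O(|\lambda - \omega|/h)$ we obtain that
\[
\begin{aligned}
& \Bigl| \sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\eta_j \Bigr| - \max_{-[n/2] \le k \le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \Bigr| \\
&\quad \le \max_{-[n/2] \le k \le [n/2]}\ \sup_{\lambda \in [\lambda_k - \pi/n,\, \lambda_k + \pi/n] \cap [-\pi,\pi]}\ \sum_j |w_{n,j}(\lambda) - w_{n,j}(\lambda_k)| \times \max_j \{ |\eta_j| \} \\
&\quad = O\Bigl( \frac{\sqrt{\log n}}{nh},\ n^{-\gamma} \Bigr).
\end{aligned}
\tag{4.15}
\]
The random variables $\sum_j w_{n,j}(\lambda_k)\eta_j$ $(k = -[n/2], \ldots, [n/2])$ are jointly normal
distributed and it follows from (i) of Lemma 3.1 in Neumann (2001) that
$\max_{-[n/2] \le k \le [n/2]} \{ |\sum_j w_{n,j}(\lambda_k)\eta_j| \}$ has a density $p_n^*$ with
\[ \sup_t \{ p_n^*(t) \} = O\bigl( \sqrt{nh \log n} \bigr). \]
This implies that
\[ P\Bigl( \max_{-[n/2] \le k \le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \in [c, d] \Bigr) = O\bigl( (d - c)\sqrt{nh \log n} \bigr). \tag{4.16} \]
By (4.14), (4.15) and (4.16) we obtain, with
$c_{n,\gamma} = C_\gamma \bigl( \frac{\log n}{\sqrt{n}} + n^{\delta}(nh)^{-1} + \frac{\sqrt{\log n}}{nh} \bigr)$, that
\[
\begin{aligned}
& P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \bigl\{ \bigl| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \bigr| \bigr\} \in [c, d] \Bigr) \\
&\quad \le P\Bigl( \max_{-[n/2] \le k \le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \in [c - c_{n,\gamma},\ d + c_{n,\gamma}] \Bigr) + O(n^{-\gamma}) \\
&\quad = O\bigl( (d - c)\sqrt{nh \log n} + h(\log n)^{3/2} + n^{\delta} \log n\, (nh)^{-1/2} \bigr).
\end{aligned}
\]
□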
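Proof of Theorem 2.3. It follows from Proposition 2.1 and Lemma 2.2 that
\[
\sup_t \Bigl| P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \frac{ \bigl| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \bigr| }{ \hat\sigma(\hat f_{n,X}(\lambda)) } \le t \Bigr)
- P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \frac{ \bigl| \sum_j w_{n,j}(\lambda)\xi_j^* \bigr| }{ \hat\sigma(\hat f_{n,X}(\lambda)) } \le t \Bigr) \Bigr| = o_P(1).
\]
This implies
\[
P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \frac{ \bigl| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \bigr| }{ \hat\sigma(\hat f_{n,X}(\lambda)) } \le t \Bigr) \Big|_{t = t_{n,\alpha}^*} = 1 - \alpha + o_P(1),
\]
which yields that
\[
P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \frac{ \bigl| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \bigr| }{ \hat\sigma(\hat f_{n,X}(\lambda)) } \le t_{n,\alpha}^* \Bigr) = 1 - \alpha + o(1).
\]
□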
Acknowledgment. We thank two referees for many helpful comments.
References
Beltrão, K. L. and Bloomfield, P. (1987). Determining the bandwidth of a kernel spectrum estimate. Journal of Time Series Analysis 8, 21–38.
Brillinger, D. R. (1981). Time Series. Data Analysis and Theory. New York: McGraw-Hill.
Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, 2nd edition. New York: Springer.
Dahlhaus, R. and Janas, D. (1996). A frequency domain bootstrap for ratio statistics in time series analysis. Annals of Statistics 24, 1934–1963.
Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer.
Franke, J. and Härdle, W. (1992). On bootstrapping kernel spectral estimates. Annals of Statistics 20, 121–145.
Hrafnkelsson, B. and Newton, J. H. (2000). Asymptotic simultaneous confidence bands for vector autoregressive spectra. Biometrika 87, 173–182.
Hurvich, C. M. (1985). Data-driven choice of spectrum estimation: extending the applicability of cross-validation methods. Journal of the American Statistical Association 80, 933–940.
Hurvich, C. M. and Zeger, S. L. (1987). Frequency domain bootstrap methods for time series. Technical Report 87-115. Graduate School of Business Administration. New York University.
Koslov, J. W. and Jones, R. H. (1985). A unified approach to confidence bounds for the autoregressive spectral estimator. Journal of Time Series Analysis 6, 141–151.
Müller, H. G. and Prewitt, K. (1992). Weak convergence and adaptive peak estimation for spectral densities. Annals of Statistics 20, 1329–1349.
Neumann, M. H. (2001). On robustness of model-based bootstrap schemes in nonparametric time series analysis. Statistics 35, 1–40.
Newton, J. H. and Pagano, M. (1984). Simultaneous confidence bands for autoregressive spectra. Biometrika 71, 197–202.
Priestley, M. B. (1981). Spectral Analysis and Time Series. New York: Academic Press.
Sakai, H. and Sakaguchi, F. (1990). Simultaneous confidence bands for the spectral estimate of two-channel autoregressive processes. Journal of Time Series Analysis 11, 49–56.
Sakhanenko, A. I. (1991). On the accuracy of normal approximation in the invariance principle. Siberian Advances in Mathematics 1, 58–91.
Tomasek, L. (1987). Asymptotic simultaneous confidence bands for autoregressive spectral density. Journal of Time Series Analysis 8, 469–491.
            n=256             n=512             n=1024
Process     h   90%   95%     h   90%   95%     h   90%   95%
Table 1: Empirical coverage probabilities of 90% and 95% confidence bands for different sample sizes n and smoothing bandwidths h. CV refers to the results obtained using a cross-validation criterion to select the bandwidth.
[Figure 1: two panels (top and bottom); horizontal axis: Frequency, vertical axis: Power.]
Figure 1. Simultaneous 95% confidence bands for the spectral density
of the ARMA(2,1) model: The solid line in both figures is the
estimated spectral density, the dashed lines refer to the estimated
exact confidence bands and the dotted lines to the bootstrap confidence
bands. The top figure presents the results for n = 256 and the bottom
figure for n = 1024.
[Figure 2: two panels (top and bottom); horizontal axis: Frequency, vertical axis: Power.]
Figure 2. Simultaneous 95% confidence bands for the spectral density
of the AR(7) model: The solid line in both figures is the estimated
spectral density, the dashed lines refer to the estimated exact
confidence bands and the dotted lines to the bootstrap confidence bands.
The top figure presents the results for n = 256 and the bottom figure
for n = 1024.
[Figure 3: two panels (top and bottom); horizontal axis: Frequency, vertical axis: Power.]
Figure 3. Estimated spectral density (solid line) of the differenced
German egg-price data together with 95% confidence bands (dotted
lines). The dashed line in the top graph is the smoothed spectral
density of the fitted ARMA(1,2) model and in the bottom graph of