
Simultaneous confidence bands in spectral density estimation

Michael H. Neumann
Friedrich-Schiller-Universität Jena
Institut für Stochastik
Ernst-Abbe-Platz 2
D – 07743 Jena, Germany
E-mail: [email protected]

Efstathios Paparoditis
University of Cyprus
Department of Mathematics and Statistics
P.O. Box 20537
CY – 1678 Nicosia, Cyprus
E-mail: [email protected]

Abstract

We propose a method for the construction of simultaneous confidence bands for (a smoothed version of) the spectral density of a Gaussian process based on nonparametric kernel estimators obtained by smoothing the periodogram. A studentized statistic is used to determine the width of the band at each frequency and a frequency domain bootstrap approach is employed in order to estimate the distribution of the supremum of this statistic over all frequencies. We prove by means of strong approximations that the bootstrap estimates consistently the distribution of the supremum deviation of interest and, consequently, that the proposed confidence bands achieve asymptotically the desired simultaneous coverage probability. The behavior of our method in finite sample situations is investigated by simulations and a real-life data example demonstrates its applicability in time series analysis.

2000 Mathematics Subject Classification. Primary 62G15; secondary 62M15.
Keywords and Phrases. Bootstrap, confidence bands, Gaussian processes, spectral density, strong approximation.
Short title. Confidence bands for the spectral density.


1. Introduction

Estimating the spectral density of a stochastic process is an important step in the statistical analysis of its second order characteristics. Different parametric and nonparametric procedures have been proposed for this purpose and are now well investigated in the literature. As in any estimation problem, apart from the construction of point estimators with desirable statistical properties, the construction of interval estimators that simultaneously contain the unknown spectral density with a pre-specified probability is also important. Such bands are useful in many situations. For instance, simultaneous confidence bands can be used to decide whether particular features of the estimated spectral density are due to the covariance structure of the underlying process or to the randomness of the spectral estimator used. Confidence bands are also useful in checking the fit of parametric models. Such checks can be done by examining whether the spectral density of the fitted parametric model lies, over all frequencies, within the nonparametrically obtained simultaneous confidence bands for the spectral density of the process generating the observed time series.

In contrast to point estimators, however, the construction of simultaneous confidence bands for the spectral density has received less attention in the statistical literature and only a few studies exist for this purpose. They mainly focus on the parametric case of a finite order autoregressive process. In particular, for Gaussian autoregressive processes, Newton and Pagano (1984) proposed a method for the construction of simultaneous confidence bands based on properties of the reciprocal spectral density and Scheffé's projections. Tomasek (1987) derived simultaneous confidence bands for the autoregressive spectral density using asymptotic properties of parametric spectral density estimators and Šidák's inequality. For the vector autoregressive case, Sakai and Sakaguchi (1990), using a method proposed by Koslov and Jones (1985), and Hrafnkelsson and Newton (2000), extending the method proposed by Tomasek (1987), developed different procedures for the construction of simultaneous confidence bands for the components of the spectral density matrix or of particular functions thereof. Although the assumption of a finite order autoregressive structure allows the implementation of (efficient) parametric spectral density estimators for the construction of confidence bands, it largely restricts the applicability of the methods proposed.

This paper proposes a nonparametric method to construct simultaneous confidence bands for (a smoothed version of) the spectral density of Gaussian processes. The method does not rely on parametric structural assumptions on the underlying stochastic process. Whenever one constructs nonparametric pointwise confidence intervals or simultaneous confidence bands, one faces a notorious bias problem. It results from the fact that nonparametric curve estimation in the supremum norm is an ill-posed inverse statistical problem. Problems at the practical level, even under smoothness conditions on $f_X$, emerge as follows. If the bandwidth is chosen of mean-square-error (MSE) optimal order, then bias and standard deviation will be of the same order of magnitude. The stochastic term can be taken into account by asymptotic theory (the limiting process is a certain Gaussian process) or, possibly even better, by some bootstrap technique. There is, however, no really satisfactory approach to deal with the bias term. One can try to estimate it explicitly; however, consistency of this estimator requires that some degrees of smoothness of $f_X$ are not used by the initial estimator. Alternatively, one can choose the bandwidth of smaller than MSE-optimal order to keep the bias negligible. This does not seem really practicable since a well-motivated rule for choosing an undersmoothing bandwidth is not available, especially for any finite $n$. These problems can also be seen from a different angle. Both remedies against the bias problem necessarily require that the underlying estimator is not asymptotically optimal in the mean square sense. To circumvent these problems, we urge the reader to rethink the possible initial goal of setting up a confidence band for $f_X$ and suggest constructing the confidence band for a kernel-smoothed version of $f_X$, which turns the problem into a well-posed one.

We define a convolution operator $K_h(\cdot)$ as
\[
K_h(f_X)(\lambda) = \int K_h(\lambda - \omega)\, f_X(\omega)\, d\omega,
\]
where $K_h(\cdot) = h^{-1} K(\cdot/h)$, and $K$ and $h = h_n$ are the smoothing kernel and the smoothing bandwidth, respectively. Our aim is to construct a confidence band for $K_h(f_X)$.
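To make the target of the band concrete, the following minimal sketch evaluates the smoothed density $K_h(f_X)(\lambda)$ numerically for a known density. It is an illustration under stated assumptions, not the authors' code: the function names `bp_kernel` and `smooth_density`, the trapezoidal quadrature, and the example density `f_example` are hypothetical choices; the kernel is the Bartlett-Priestley kernel used later in the simulations.

```python
import numpy as np

def bp_kernel(u):
    """Bartlett-Priestley kernel supported on [-pi, pi]; it integrates to one."""
    u = np.asarray(u, dtype=float)
    inside = np.abs(u) <= np.pi
    return np.where(inside, 3.0 * (1.0 - (u / np.pi) ** 2) / (4.0 * np.pi), 0.0)

def smooth_density(f, lam, h, num=4001):
    """Approximate K_h(f)(lam) = int K_h(lam - w) f(w) dw by the trapezoidal rule.

    K_h(.) = K(./h)/h vanishes outside [lam - pi*h, lam + pi*h], so the
    integration grid only needs to cover that window.
    """
    omega = np.linspace(lam - np.pi * h, lam + np.pi * h, num)
    vals = bp_kernel((lam - omega) / h) / h * f(omega)
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(omega)) / 2.0)

# Hypothetical example density (2*pi-periodic, peak at frequency 0).
f_example = lambda w: (1.0 + np.cos(w)) / (2.0 * np.pi)
smoothed_at_peak = smooth_density(f_example, 0.0, h=0.12)
```

At a peak the smoothed value lies slightly below the unsmoothed one, which is the smoothing bias that a band for $K_h(f_X)$ does not need to correct.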

The method proposed uses, as a starting point, a nonparametric kernel-type estimator of the spectral density obtained by smoothing the sample spectral density (periodogram). To determine appropriately the width of the confidence band at each frequency, the distribution of the supremum deviation over all frequencies of a studentized version of the nonparametric estimator applied is used. The width of the confidence band then varies according to the changing variability associated with estimating the underlying spectral density at different frequencies. The distribution of the supremum deviation of the studentized statistic involved in our construction is estimated using a frequency domain bootstrap procedure which exploits the fact that periodogram ordinates of a Gaussian noise process at the Fourier frequencies are independent. This allows the approximation of the random behavior of sums of weakly dependent random variables by that of independent ones. Asymptotic validity of the bootstrap procedure proposed to approximate the desired supremum distribution is then established by means of strong approximations. Using this basic result we prove that the confidence bands obtained achieve asymptotically the desired simultaneous coverage probability.


The paper is organized as follows. After stating the main assumptions imposed in Section 2, we introduce the nonparametric spectral density estimator and the basic studentized statistic used in our approach. The bootstrap method proposed to approximate the supremum deviations is presented and its asymptotic validity is established. We conclude this section by stating the main result of the paper regarding the asymptotic behavior of the coverage probability of the confidence bands proposed. Section 3 presents some numerical examples illustrating the behavior of our method in finite sample situations and a real-life data example demonstrates its applicability in time series analysis. Finally, proofs of all results are deferred to Section 4.

2. Confidence bands for the spectral density

2.1. Preliminaries. We consider real-valued random variables $X_1, X_2, \ldots, X_n$ observed from a stochastic process $(X_t)_{t\in\mathbb{Z}}$ satisfying the following assumption.

Assumption 1: $(X_t)_{t\in\mathbb{Z}}$ is a zero mean, stationary Gaussian process satisfying
\[
\sum_{k=0}^{\infty} k\,|c_k| < \infty, \tag{2.1}
\]
where $c_k = \mathrm{cov}(X_t, X_{t+k})$ is the autocovariance at lag $k \in \mathbb{Z}$. Furthermore, we assume that the spectral density $f_X$ of $(X_t)_{t\in\mathbb{Z}}$ is everywhere positive. Notice that by (2.1), $f_X$ exists, is Lipschitz continuous and is given by
\[
f_X(\lambda) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} c_k \cos(\lambda k), \qquad \lambda \in [-\pi, \pi].
\]


Moreover, we assume that $f_X$ is bounded away from zero, that is,
\[
\inf_{\lambda\in[-\pi,\pi]} f_X(\lambda) > 0. \tag{2.2}
\]

Our aim is to devise simultaneous confidence bands for $f_X$ or for some smoothed version thereof, cf. Section 2.2, with an asymptotic coverage probability of $1-\alpha$, for some given $\alpha \in (0, 1)$. Toward this goal we first consider a class of nonparametric estimators of $f_X$. A common starting point for many nonparametric estimators proposed in the literature is the periodogram
\[
I_{n,X}(\lambda) = \frac{1}{2\pi}\,|J_{n,X}(\lambda)|^2
= \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \cos(\lambda k) \left( \frac{1}{n} \sum_{t=1}^{n-|k|} X_t X_{t+|k|} \right), \qquad \lambda \in [-\pi, \pi],
\]
where $J_{n,X}(\lambda) = n^{-1/2} \sum_{t=1}^{n} X_t e^{-i\lambda t}$ is the finite Fourier transform of $X_1, X_2, \ldots, X_n$. Commonly the periodogram is calculated at the Fourier frequencies $\lambda_k = 2\pi k/n$, $k \in K_n = \{-[(n-1)/2], \ldots, [n/2]\}$.
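As an illustration of the definition above, the periodogram at the Fourier frequencies can be obtained from a fast Fourier transform. This is a minimal sketch; the function name `periodogram` is a hypothetical choice, and the only assumption is that the observations are stored as a one-dimensional array.

```python
import numpy as np

def periodogram(x):
    """Return (lambda_k, I_n(lambda_k)) for k in K_n = {-[(n-1)/2], ..., [n/2]}."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.arange(-((n - 1) // 2), n // 2 + 1)
    freqs = 2.0 * np.pi * k / n
    # np.fft.fft sums over t = 0, ..., n-1 instead of t = 1, ..., n; the shift
    # only changes the phase of J_n, not its modulus.
    dft = np.fft.fft(x)
    # k % n maps negative indices to their periodic counterparts in 0..n-1.
    I = np.abs(dft[k % n]) ** 2 / (2.0 * np.pi * n)
    return freqs, I

# Usage on simulated white noise (illustrative data only).
x_demo = np.random.default_rng(0).normal(size=8)
freqs_demo, I_demo = periodogram(x_demo)
```

By Parseval's identity, $(2\pi/n)\sum_{k\in K_n} I_{n,X}(\lambda_k)$ equals the average of the squared observations, which gives a quick sanity check of the normalisation.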


The periodogram is not a consistent estimator of $f_X(\lambda)$, and a class of consistent estimators is obtained by smoothing $I_{n,X}(\lambda)$ over different frequencies, i.e., by considering
\[
\hat f_{n,X}(\lambda) = \sum_{k\in\mathbb{Z}} w_{n,k}(\lambda)\, I_{n,X}(\lambda_k). \tag{2.3}
\]
In the following we derive our results for commonly used kernel estimators of $f_X$ by setting either
\[
w_{n,k}(\lambda) = \frac{2\pi}{n}\, K_h(\lambda - \lambda_k); \tag{2.4}
\]
cf. Priestley (1981), or
\[
w_{n,k}(\lambda) = \int_{\lambda_k - \pi/n}^{\lambda_k + \pi/n} K_h(\lambda - \omega)\, d\omega; \tag{2.5}
\]
cf. Müller and Prewitt (1992).

Notice that it may happen that we include in (2.3) some $\lambda_k$ outside the interval $[-\pi, \pi]$ since we do not use one-sided kernels for estimation near the ends of $[-\pi, \pi]$. Notice further that in defining the periodogram, we could equally well use the mean-corrected observations,
\[
I_{n,X-\bar X_n}(\lambda) = \frac{1}{2\pi n} \left| \sum_{t=1}^{n} (X_t - \bar X_n)\, e^{-i\lambda t} \right|^2,
\]
where $\bar X_n = n^{-1} \sum_{t=1}^{n} X_t$ is the mean of the observed series. This would allow us to drop the assumption that the process has zero mean. The asymptotic theory developed in this paper carries over to this case as well, since $I_{n,X-\bar X_n}(\lambda_k) = I_{n,X}(\lambda_k)$ for $k \in \mathbb{Z}$ with $k \bmod n \neq 0$, which implies that the difference of the corresponding kernel estimators is of negligible size.
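The estimator (2.3) with weights (2.4) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the Bartlett-Priestley kernel from Section 3.1 is assumed, and the periodogram is extended $2\pi$-periodically so that Fourier frequencies outside $[-\pi, \pi]$ can enter the sum.

```python
import numpy as np

def bartlett_priestley(u):
    """K(u) = 3 (1 - (u/pi)^2) / (4 pi) on [-pi, pi], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= np.pi,
                    3.0 * (1.0 - (u / np.pi) ** 2) / (4.0 * np.pi), 0.0)

def kernel_weights(lam, n, h):
    """Weights (2.4): w_{n,k}(lam) = (2 pi / n) K_h(lam - lambda_k), k = -n..n."""
    k = np.arange(-n, n + 1)          # enough indices as long as h < 1
    lam_k = 2.0 * np.pi * k / n
    w = (2.0 * np.pi / n) * bartlett_priestley((lam - lam_k) / h) / h
    return k, w

def smoothed_periodogram(x, lams, h):
    """Evaluate the kernel estimator (2.3) at each frequency in lams."""
    x = np.asarray(x, dtype=float)
    n = x.size
    I_full = np.abs(np.fft.fft(x)) ** 2 / (2.0 * np.pi * n)  # I at k = 0..n-1
    out = np.empty(len(lams))
    for i, lam in enumerate(lams):
        k, w = kernel_weights(lam, n, h)
        out[i] = np.sum(w * I_full[k % n])   # periodic extension of I
    return out

# Illustrative data with a non-zero mean.
x_demo = np.random.default_rng(1).normal(size=256) + 5.0
f_raw = smoothed_periodogram(x_demo, [2.0], h=0.12)
f_centered = smoothed_periodogram(x_demo - x_demo.mean(), [2.0], h=0.12)
```

Away from frequency zero the two estimates coincide, in line with the remark that mean correction only changes the periodogram ordinate at $k \bmod n = 0$.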

We will assume the following.

Assumption 2: $K : \mathbb{R} \to \mathbb{R}$ is a nonnegative and symmetric kernel with bounded total variation and support $[-\pi, \pi]$. Furthermore, $\int_{-\pi}^{\pi} K(x)\, dx = 1$.

Assumption 3: The smoothing bandwidth $h = h_n$ depends on $n$ and the sequence $(h_n)_{n\in\mathbb{N}}$ fulfills $h_n \sim n^{-\eta}$ for some $\eta \in (0, 1)$.

Instead of the class of estimators (2.3) based on a weighted average of the periodogram over the Fourier frequencies, we may also consider estimators of $f_X(\lambda)$ which are based on a convolution of the periodogram with a kernel function, i.e., estimators given by
\[
\tilde f_{n,X}(\lambda) = \int_{-\pi}^{\pi} K_h(\lambda - \omega)\, I_{n,X}(\omega)\, d\omega. \tag{2.6}
\]
Approximating the above integral by the corresponding Riemann sum gives
\[
\frac{2\pi}{n} \sum_{k} K_h(\lambda - \lambda_k)\, I_{n,X}(\lambda_k).
\]
By Theorem 5.9.1 of Brillinger (1981, p. 162), we have that if $K$ has a bounded derivative, then
\[
\left| \tilde f_{n,X}(\lambda) - \frac{2\pi}{n} \sum_{k} K_h(\lambda - \lambda_k)\, I_{n,X}(\lambda_k) \right| = O_P\big(n^{-1}h^{-2} + \log(n)\,(nh)^{-1}\big),
\]
where the $O_P$ term does not depend on $\lambda$. Thus if the kernel $K$ satisfies the aforementioned smoothness condition and if $h_n \sim n^{-\eta}$ for some $\eta \in (0, 1/3)$, then the asymptotic behavior of the estimators $\hat f_{n,X}(\lambda)$ and $\tilde f_{n,X}(\lambda)$ is identical. This suggests that properties established for the confidence bands based on the estimator (2.3) will carry over to those using estimator (2.6).

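2.2. Simultaneous confidence bands. We begin our construction of a confidence band for $K_h(f_X)$ by considering the studentized statistic
\[
D_n(\lambda) = \frac{\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)}{\hat\sigma(\hat f_{n,X}(\lambda))}, \qquad \lambda \in [-\pi, \pi], \tag{2.7}
\]
where $\hat\sigma(\hat f_{n,X}(\lambda))$ is an estimator of the standard deviation of the kernel estimator $\hat f_{n,X}(\lambda)$, i.e., of $\sigma(\hat f_{n,X}(\lambda)) = \sqrt{\mathrm{var}(\hat f_{n,X}(\lambda))}$. We have that $\mathrm{var}(I_{n,X}(\lambda_k)) = (1 + \delta_k) f_X^2(\lambda_k) + O(n^{-1/2})$ and $\mathrm{cov}(I_{n,X}(\lambda_{k_1}), I_{n,X}(\lambda_{k_2})) = O(n^{-1})$ for $k_1 \neq k_2$, where $\delta_k = 1$ if $\lambda_k = 0$ or $\lambda_k$ is a multiple of $\pm\pi$, and $\delta_k = 0$ otherwise; see Brockwell and Davis (1991), Th. 10.3.2. This implies, in conjunction with $\sup_\lambda \{\sum_k |w_{n,k}(\lambda)|\} = O(1)$ and $\sup_\lambda \{\sum_k w_{n,k}^2(\lambda)\} = O(n^{-1}h^{-1})$, that
\[
\sigma^2(\hat f_{n,X}(\lambda)) = \sum_{k} w_{n,k}^2(\lambda)\,(1 + \delta_k)\, f_X^2(\lambda_k) + O(n^{-3/2}h^{-1} + n^{-1}). \tag{2.8}
\]
This suggests the estimator
\[
\hat\sigma^2(\hat f_{n,X}(\lambda)) = \sum_{k} w_{n,k}^2(\lambda)\,(1 + \delta_k)\, \hat f_{n,X}^2(\lambda_k)
\]
of $\sigma^2(\hat f_{n,X}(\lambda))$, which is used in (2.7).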
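The plug-in variance estimator is a weighted sum of squared spectral estimates and is straightforward to compute once a weight matrix is available. The sketch below is illustrative only: the names `delta_indicator` and `sigma_hat`, and the layout of the weights as a matrix with one row per evaluation frequency, are assumptions made for this example.

```python
import numpy as np

def delta_indicator(k, n):
    """delta_k = 1 if lambda_k = 2*pi*k/n is zero or a multiple of pi, else 0."""
    k = np.asarray(k)
    return ((2 * k) % n == 0).astype(float)

def sigma_hat(W, f_hat_at_fourier, delta):
    """Row-wise sqrt( sum_k W[i, k]^2 (1 + delta_k) f_hat(lambda_k)^2 ).

    W[i, k] holds the weight w_{n,k}(lambda_i); f_hat_at_fourier[k] holds the
    spectral estimate at the Fourier frequency lambda_k.
    """
    W = np.asarray(W, dtype=float)
    f_hat_at_fourier = np.asarray(f_hat_at_fourier, dtype=float)
    return np.sqrt((W ** 2) @ ((1.0 + delta) * f_hat_at_fourier ** 2))

# Toy input: two frequencies (k = 0 and k = 1 for n = 8) with equal weights.
W_toy = np.array([[0.5, 0.5]])
s_toy = sigma_hat(W_toy, [2.0, 2.0], delta_indicator([0, 1], n=8))
```

In the toy input the ordinate at $k = 0$ counts twice because $\delta_0 = 1$, so the result is $\sqrt{0.25 \cdot 2 \cdot 4 + 0.25 \cdot 4} = \sqrt{3}$.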

Based on (2.7) a $(1-\alpha)100\%$ simultaneous confidence band for $K_h(f_X)$ is obtained as
\[
\big[\, \hat f_{n,X}(\lambda) - t_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)) \,\big], \tag{2.9}
\]
where $t_{n,\alpha}$ denotes the upper $\alpha$-percentage point of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$. Observe that the width of the interval (2.9) is proportional to $\hat\sigma(\hat f_{n,X}(\lambda))$, which reflects the varying difficulty in estimating the unknown spectral density $f_X(\lambda)$ at different frequencies $\lambda$. Implementation of the above confidence band requires knowledge of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$. To approximate this distribution we propose in the following a frequency domain bootstrap procedure which imitates the distribution of a tractable approximation of the studentized statistic (2.7).

To elaborate on the approximation of $D_n(\lambda)$ used, recall first the basic fact that every non-deterministic stationary Gaussian process can be written as a causal linear Gaussian process (see Proposition 2.1 of Fan and Yao (2003, p. 33)), that is, there exists a sequence of independent innovations $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$ such that
\[
X_t = \sum_{k=0}^{\infty} \psi_k\, \varepsilon_{t-k} \tag{2.10}
\]
and the coefficients $\{\psi_k,\ k \in \mathbb{N} \cup \{0\}\}$ satisfy $\sum_{k=0}^{\infty} \psi_k^2 < \infty$. Let $\psi(\omega) = \sum_{k=0}^{\infty} \psi_k\, \omega^k$ and denote by
\[
I_{n,\varepsilon}(\lambda) = \frac{1}{2\pi}\,|J_{n,\varepsilon}(\lambda)|^2
\]
the periodogram of the Gaussian noise series $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$, i.e., $J_{n,\varepsilon}(\lambda)$ is given by $J_{n,\varepsilon}(\lambda) = n^{-1/2} \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t}$. By Theorem 10.3.1 of Brockwell and Davis (1991, p. 347) we can express the periodogram as
\[
I_{n,X}(\lambda) = |\psi(e^{-i\lambda})|^2\, I_{n,\varepsilon}(\lambda) + (2\pi)^{-1} R_n(\lambda), \tag{2.11}
\]
where $R_n(\lambda) = \psi(e^{-i\lambda}) J_{n,\varepsilon}(\lambda) Y_n(-\lambda) + \psi(e^{i\lambda}) J_{n,\varepsilon}(-\lambda) Y_n(\lambda) + |Y_n(\lambda)|^2$ and
\[
Y_n(\lambda) = n^{-1/2} \sum_{k=0}^{\infty} \psi_k\, e^{-i\lambda k} \left( \sum_{t=1-k}^{n-k} \varepsilon_t e^{-i\lambda t} - \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t} \right).
\]
The random variable $J_{n,\varepsilon}(\lambda)$ is complex normally distributed with mean zero and variance $\sigma_\varepsilon^2$, while $Y_n(\lambda)$ is complex normal with mean zero and variance of order $O(n^{-1})$.

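Using (2.11) and $|\psi(e^{-i\lambda})|^2 = f_X(\lambda)/f_\varepsilon(\lambda) = 2\pi f_X(\lambda)/\sigma_\varepsilon^2$ we can decompose $D_n(\lambda)$ as follows:
\begin{align*}
D_n(\lambda) ={}& \sum_{k} w_{n,k}(\lambda)\, f_X(\lambda_k) \big( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \big) \big/ \hat\sigma(\hat f_{n,X}(\lambda)) \\
&+ (2\pi)^{-1} \sum_{k} w_{n,k}(\lambda)\, R_n(\lambda_k) \big/ \hat\sigma(\hat f_{n,X}(\lambda)) \\
&+ \Big( \sum_{k} w_{n,k}(\lambda)\, f_X(\lambda_k) - K_h(f_X)(\lambda) \Big) \big/ \hat\sigma(\hat f_{n,X}(\lambda)). \tag{2.12}
\end{align*}
We argue in the following that instead of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ it suffices to consider $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$, where
\[
\tilde D_n(\lambda) = \sum_{k} w_{n,k}(\lambda)\, f_X(\lambda_k) \big( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \big) \big/ \hat\sigma(\hat f_{n,X}(\lambda)), \tag{2.13}
\]
that is, the contributions of the second and of the third term on the right hand side of (2.12) to the distribution of the supremum of interest are asymptotically negligible. Notice that the study of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$ is simpler than that of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ because $\sum_k w_{n,k}(\lambda) f_X(\lambda_k) (2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1)$ is a weighted sum of independent random variables due to the fact that the $I_{n,\varepsilon}(\lambda_k)$'s are periodogram ordinates of a Gaussian white noise series at the Fourier frequencies.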
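The distributional fact used here can be checked by a small Monte Carlo experiment. The sketch below is an illustration, not part of the paper's simulations; the sample size, the number of replications, the seed and the innovation variance are arbitrary assumptions. It verifies that $2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2$ has mean and variance close to one at an interior Fourier frequency, variance close to two at frequency zero, and that ordinates at distinct frequencies are essentially uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(12345)
n, reps, sigma2 = 16, 20000, 2.0           # arbitrary illustrative settings

# Each row is one Gaussian white noise series eps_1, ..., eps_n.
eps = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
I_eps = np.abs(np.fft.fft(eps, axis=1)) ** 2 / (2.0 * np.pi * n)
ratio = 2.0 * np.pi * I_eps / sigma2       # 2*pi*I_{n,eps}(lambda_k)/sigma_eps^2

mean_interior = ratio[:, 1].mean()         # chi^2_2 / 2 has mean 1
var_interior = ratio[:, 1].var()           # chi^2_2 / 2 has variance 1
var_zero = ratio[:, 0].var()               # chi^2_1 has variance 2
corr_12 = np.corrcoef(ratio[:, 1], ratio[:, 2])[0, 1]
```

The same moments reappear in the variance formula (2.8) through the factor $1 + \delta_k$.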

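To see why the distribution of the supremum of $|\tilde D_n(\lambda)|$ approximates correctly the corresponding distribution of $|D_n(\lambda)|$, notice first that because of $(\inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)))^{-1} = O_P((nh)^{1/2})$ we get by the properties of the kernel $K$ and of the spectral density $f_X$ that
\begin{align*}
\sup_{\lambda\in[-\pi,\pi]} \left| \frac{\sum_k w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda)}{\hat\sigma(\hat f_{n,X}(\lambda))} \right|
&\le \Big( \inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)) \Big)^{-1} \sup_\lambda \Big| \sum_k w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda) \Big| \\
&= O_P((nh)^{1/2})\; O((nh)^{-1}) = O_P((nh)^{-1/2}). \tag{2.14}
\end{align*}
Furthermore, using the bound $P(|Y| > x) \le \sqrt{2/\pi}\,(1/x)\,e^{-x^2/2}$ for $Y \sim N(0, 1)$, we obtain that for all $\gamma < \infty$ there exists a $C_\gamma < \infty$ such that
\[
\max_k \left\{ P\left( |R_n(\lambda_k)| > C_\gamma \frac{\log n}{\sqrt{n}} \right) \right\} = O(n^{-\gamma}). \tag{2.15}
\]
It now follows from (2.15) and (2.2) that for all $\gamma < \infty$ there exists $C_\gamma < \infty$ such that
\begin{align*}
& P\left( \sup_{\lambda\in[-\pi,\pi]} \left\{ \Big| \sum_k w_{n,k}(\lambda) R_n(\lambda_k) \Big| \Big/ \hat\sigma(\hat f_{n,X}(\lambda)) \right\} > C_\gamma h^{1/2} \log n \right) \\
&\quad \le P\left( \Big( \inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)) \Big)^{-1} \sup_\lambda \Big\{ \sum_k |w_{n,k}(\lambda)| \Big\} \times \max_k \{ |R_n(\lambda_k)| \} > C_\gamma h^{1/2} \log n \right) \\
&\quad = O(n^{-\gamma}). \tag{2.16}
\end{align*}
Using (2.14) and (2.16) we finally obtain that
\begin{align*}
\Big| \sup_\lambda |D_n(\lambda)| - \sup_\lambda |\tilde D_n(\lambda)| \Big|
&\le \sup_\lambda \big| D_n(\lambda) - \tilde D_n(\lambda) \big| \\
&= O_P((nh)^{-1/2}) + O_P(h^{1/2} \log n), \tag{2.17}
\end{align*}
that is, the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ can be well approximated by the distribution of $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$.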


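2.3. Bootstrap Approximations. In view of (2.17) it is clear that in order to evaluate the distribution of $\sup_\lambda |D_n(\lambda)|$ appropriately, it suffices to mimic the behavior of the random variables
\[
\xi_k = f_X(\lambda_k) \big( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \big)
\]
by the bootstrap. Since the innovations $\varepsilon_t$ are independent with $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$, the random variables $\xi_0, \ldots, \xi_{[n/2]}$ are independent with
\[
\xi_k \sim
\begin{cases}
f_X(\lambda_k)\,(\chi_2^2/2 - 1), & \text{if } 1 \le k < n/2, \\
f_X(\lambda_k)\,(\chi_1^2 - 1), & \text{if } k \in \{0, n/2\}.
\end{cases}
\]
Here $\chi_m^2$ denotes the $\chi^2$-distribution with $m$ degrees of freedom. Thus to mimic $\xi_k$ it is natural to generate independent random variables $\gamma_0^*, \ldots, \gamma_{[n/2]}^*$, which are also independent of the original sample $X_1, \ldots, X_n$, with
\[
\gamma_k^* \sim
\begin{cases}
\chi_2^2/2 - 1, & \text{if } 1 \le k < n/2, \\
\chi_1^2 - 1, & \text{if } k \in \{0, n/2\}.
\end{cases}
\]
The bootstrap counterparts of the $\xi_k$ are then defined as
\[
\xi_k^* = \hat f_{n,X}(\lambda_k)\, \gamma_k^*, \qquad k = 0, \ldots, [n/2].
\]
According to the $2\pi$-periodicity and the symmetry of the periodogram we define further $\xi^*_{[n/2]+k} = \xi^*_{n-[n/2]-k}$ ($k = 1, \ldots, n - [n/2]$) and $\xi^*_{-k} = \xi^*_k$ ($k = 1, \ldots, n$). The bootstrap counterpart of $\tilde D_n(\lambda) = \sum_k w_{n,k}(\lambda)\, \xi_k / \hat\sigma(\hat f_{n,X}(\lambda))$ is then given by
\[
D_n^*(\lambda) = \sum_{k} w_{n,k}(\lambda)\, \xi_k^* \big/ \hat\sigma(\hat f_{n,X}(\lambda)), \qquad \lambda \in [-\pi, \pi].
\]

Based on this bootstrap approximation, the $(1-\alpha)100\%$ simultaneous confidence band for $K_h(f_X)$ we propose is given by
\[
\big[\, \hat f_{n,X}(\lambda) - t^*_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t^*_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)) \,\big],
\]
where $t^*_{n,\alpha}$ denotes the upper $\alpha$-percentage point of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n^*(\lambda)|$. Note that this distribution can be evaluated by Monte Carlo simulation.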
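The Monte Carlo evaluation of the bootstrap band can be sketched as follows. This is a minimal illustration under stated assumptions and should not be read as the authors' implementation: the function name `bootstrap_band` and its parameters are hypothetical, the weights are those of (2.4) with the Bartlett-Priestley kernel, and the supremum over $[-\pi, \pi]$ is approximated by the maximum over the Fourier frequencies in $[0, \pi]$ (which suffices by symmetry only up to the discretisation of the frequency grid).

```python
import numpy as np

def bp_kernel(u):
    """Bartlett-Priestley kernel on [-pi, pi]."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= np.pi,
                    3.0 * (1.0 - (u / np.pi) ** 2) / (4.0 * np.pi), 0.0)

def bootstrap_band(x, h, alpha=0.05, B=1000, seed=0):
    """Return (grid, f_hat, lower, upper, t_star) for the bootstrap band."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    m = n // 2
    k_ext = np.arange(-n, n + 1)               # extended index range
    fold = k_ext % n
    fold = np.minimum(fold, n - fold)          # periodicity + symmetry -> {0..m}
    lam_ext = 2.0 * np.pi * k_ext / n
    grid = 2.0 * np.pi * np.arange(m + 1) / n  # evaluation frequencies in [0, pi]

    # W[i, j] = w_{n,k_j}(grid_i) as in (2.4).
    W = (2.0 * np.pi / n) * bp_kernel((grid[:, None] - lam_ext[None, :]) / h) / h
    I = np.abs(np.fft.fft(x)[: m + 1]) ** 2 / (2.0 * np.pi * n)
    f_hat = W @ I[fold]                        # estimator (2.3) on the grid
    delta = ((2 * fold) % n == 0).astype(float)
    sigma = np.sqrt((W ** 2) @ ((1.0 + delta) * f_hat[fold] ** 2))

    one_df = (2 * np.arange(m + 1)) % n == 0   # k in {0, n/2}
    sup_stats = np.empty(B)
    for b in range(B):
        gamma = rng.chisquare(2, size=m + 1) / 2.0 - 1.0
        gamma[one_df] = rng.chisquare(1, size=int(one_df.sum())) - 1.0
        xi_star = f_hat * gamma                # xi*_k = f_hat(lambda_k) gamma*_k
        D_star = (W @ xi_star[fold]) / sigma
        sup_stats[b] = np.max(np.abs(D_star))
    t_star = float(np.quantile(sup_stats, 1.0 - alpha))
    return grid, f_hat, f_hat - t_star * sigma, f_hat + t_star * sigma, t_star

# Usage on simulated white noise (illustrative data only).
x_demo = np.random.default_rng(1).normal(size=128)
grid, f_hat, lower, upper, t_star = bootstrap_band(x_demo, h=0.3, B=200)
```

Because only the multipliers $\gamma_k^*$ are redrawn, the weight matrix, the spectral estimate and the standard deviation estimate are computed once and reused across all bootstrap replications.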

The following proposition establishes asymptotic validity of the bootstrap procedure proposed because it shows that $(\tilde D_n(\lambda))_{\lambda\in[-\pi,\pi]}$ is consistently mimicked by its bootstrap analogue $(D_n^*(\lambda))_{\lambda\in[-\pi,\pi]}$.

Proposition 2.1. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then, there exists a coupling of the random variables $\xi_{-n}, \ldots, \xi_n$ and $\xi^*_{-n}, \ldots, \xi^*_n$ (the latter having a distribution conditioned on $X_1, \ldots, X_n$) on an appropriate joint probability space such that
\[
P\left( \sup_{\lambda\in[-\pi,\pi]} \big| \tilde D_n(\lambda) - D_n^*(\lambda) \big| > n^{\delta} \big( (nh)^{-1/4} + h \big) \right) = O(n^{-\gamma})
\]
holds for arbitrary $\delta > 0$.

Notice that the bootstrap procedure used to generate replicates of the $\xi_k$'s is not new. It has been proposed in a context different from that considered here by Hurvich and Zeger (1987). Franke and Härdle (1992) investigated asymptotic properties of a version of this procedure based on i.i.d. resampling of estimated frequency domain residuals $I_{n,X}(\lambda_k)/\hat f_{n,X}(\lambda_k)$ instead of the $\chi^2$-distributed random variables $\gamma_k^*$. See also Dahlhaus and Janas (1996) for the asymptotic properties of this procedure for different classes of periodogram based statistics.

It is worth mentioning here that, as a careful inspection of the proof of Proposition 2.1 shows, in order for the bootstrap to estimate consistently the random behavior of $(\tilde D_n(\lambda))_{\lambda\in[-\pi,\pi]}$, the random variables used to mimic the behavior of the $\xi_k$'s can be alternatively defined as
\[
\xi_k^+ = I_{n,X}(\lambda_k)\, \gamma_k^*, \qquad k = 0, \ldots, [n/2].
\]
That is, the periodogram $I_{n,X}(\lambda_k)$ can be used in place of the estimated spectral density $\hat f_{n,X}(\lambda_k)$ and the distribution of $\sum_k w_{n,k}(\lambda)\, \xi_k^* / \hat\sigma(\hat f_{n,X}(\lambda))$ can be imitated by that of $\sum_k w_{n,k}(\lambda)\, \xi_k^+ / \hat\sigma(\hat f_{n,X}(\lambda))$. Our simulation findings suggest, however, that using the estimated spectral density $\hat f_{n,X}(\lambda_k)$ leads to better results in finite sample situations.

2.4. Main Results. We first give a lemma which provides a concentration inequality for the supremum deviation and which implies that the strong approximation result stated in Proposition 2.1 is good enough for proving consistency of the bootstrap method.

Lemma 2.2. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then
\[
P\left( \sup_{\lambda\in[-\pi,\pi]} \Big\{ \big| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \big| \Big\} \in [c, d] \right)
= O\left( (d - c) \sqrt{nh \log n} + h (\log n)^{3/2} + n^{\delta} (nh)^{-1/2} \right)
\]
holds for arbitrary $\delta > 0$.


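The following theorem is the main result of this paper. It states that the proposed bootstrap confidence band achieves asymptotically the desired coverage probability.

Theorem 2.3. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then
\[
P\Big( K_h(f_X)(\lambda) \in \big[\, \hat f_{n,X}(\lambda) - t^*_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t^*_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)) \,\big] \text{ for all } \lambda \in [-\pi, \pi] \Big) \underset{n\to\infty}{\longrightarrow} 1 - \alpha.
\]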

Remark 1. Nonparametric confidence intervals or bands directly for the function of interest still dominate the literature. As argued in the Introduction, we decided to deviate from this common practice and devised confidence bands for a smoothed version $K_h(f_X)$ of the spectral density $f_X$. Nevertheless, the approximation results derived here also allow us to establish simultaneous confidence bands for $f_X$, provided that the maximum bias, $\sup_\lambda |f_X(\lambda) - K_h(f_X)(\lambda)|$, is of negligible order $o_P((nh \log n)^{-1/2})$. This can be achieved either by choosing $h = h_n$ of smaller than mean-square-error optimal order or by an explicit subsequent bias correction. However, as discussed in the Introduction, we do not see a well-motivated rule for choosing an undersmoothing bandwidth $h$ for a given sample size $n$. The alternative of using a subsequent bias correction seems to be less problematic at first glance. However, this bias correction can only be successful if some degrees of smoothness of $f_X$ are not used by the initial estimator and are hence left for the correction step. Besides these technical difficulties, we think that these approaches are also awkward from the conceptual point of view. Both approaches require the assumption of a sufficient degree of smoothness of $f_X$, a condition that can hardly be checked by any test. (This just reflects the fact that nonparametric curve estimation is an ill-posed statistical inverse problem.) In contrast to that, confidence bands for $K_h(f_X)$ do not suffer at all from these difficulties.

3. Numerical Examples

3.1. Simulations. To investigate the finite sample performance of our procedure a small simulation study has been conducted using the following two linear processes:

1. $X_t = 0.276X_{t-1} - 0.084X_{t-2} + 0.048X_{t-3} - 0.039X_{t-4} + 0.043X_{t-5} + 0.09X_{t-6} + 0.21X_{t-7} + \varepsilon_t$,

and

2. $X_t = X_{t-1} - 0.4X_{t-2} - 0.9\varepsilon_{t-1} + \varepsilon_t$.

In both generating equations $(\varepsilon_t)_{t\in\mathbb{Z}}$ is an i.i.d. process with standard Gaussian distributed random variables. The first, high order autoregressive (AR) process has been used by Tomasek (1987). The second autoregressive moving-average (ARMA) process has been chosen such that the large parameter of its moving average part makes it difficult to approximate its spectral density by that of a low order autoregressive process.
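For reference, the theoretical spectral densities of the two processes follow from the standard ARMA formula $f_X(\lambda) = \frac{\sigma_\varepsilon^2}{2\pi}\,|\theta(e^{-i\lambda})|^2 / |\phi(e^{-i\lambda})|^2$. The sketch below is illustrative; the function name `arma_spectral_density` and its argument convention (coefficients listed by increasing lag, with signs as they appear in the generating equations) are assumptions made for this example.

```python
import numpy as np

def arma_spectral_density(lam, ar=(), ma=(), sigma2=1.0):
    """Spectral density of X_t = sum_j ar[j] X_{t-1-j} + eps_t + sum_j ma[j] eps_{t-1-j}."""
    lam = np.asarray(lam, dtype=float)
    z = np.exp(-1j * lam)
    # phi(z) = 1 - ar[0] z - ar[1] z^2 - ...,  theta(z) = 1 + ma[0] z + ...
    phi = np.ones_like(z) - sum(a * z ** (j + 1) for j, a in enumerate(ar))
    theta = np.ones_like(z) + sum(b * z ** (j + 1) for j, b in enumerate(ma))
    return sigma2 / (2.0 * np.pi) * np.abs(theta) ** 2 / np.abs(phi) ** 2

lam_grid = np.linspace(0.0, np.pi, 201)
# Process 1: AR(7) with the coefficients given above.
f_ar7 = arma_spectral_density(
    lam_grid, ar=(0.276, -0.084, 0.048, -0.039, 0.043, 0.09, 0.21))
# Process 2: ARMA(2,1) with AR part (1, -0.4) and MA coefficient -0.9.
f_arma = arma_spectral_density(lam_grid, ar=(1.0, -0.4), ma=(-0.9,))
```

The moving average coefficient $-0.9$ places a root of $\theta$ near $z = 1$, which pulls the ARMA density toward zero at low frequencies, the feature that a low order autoregression approximates poorly.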

We first investigate how well the method proposed estimates the exact confidence bands. For this, realizations of length $n = 256$ and $n = 1024$ of both processes have been considered. The estimator $\hat f_{n,X}(\lambda)$ has been obtained using the kernel weights $w_{n,k}(\lambda) = 2\pi K_h(\lambda - \lambda_k)/n$ with $K(\cdot)$ the Bartlett-Priestley kernel $K(x) = \mathbf{1}_{[-\pi,\pi]}(x)\, 3(1 - (x/\pi)^2)/(4\pi)$ and for the values $h = 0.12$ for $n = 256$ and $h = 0.07$ for $n = 1024$; see the discussion below for these particular choices of the smoothing bandwidth $h$. For each process and sample size we have calculated the exact confidence bands (2.9) by using 1000 replications to get estimators of the exact percentage points $t_{n,\alpha}$ of the distribution of $\sup_\lambda |D_n(\lambda)|$ and of the standard deviation $\sigma(\hat f_{n,X}(\lambda))$.

The estimated exact confidence bands have then been compared with the confidence bands obtained by using the method proposed in this paper. To get a typical series as a basis for this comparison, we generated 51 independent realizations of each process and of each sample size considered, and for each realization we calculated the estimation error $n^{-1} \sum_{k=-[(n-1)/2]}^{[n/2]} (\hat f_{n,X}(\lambda_k) - f_X(\lambda_k))^2$. We then selected for our comparison the series with the median value of this error. For the series so selected, the percentage points $t^*_{n,\alpha}$ of the bootstrap confidence bands have been estimated using 1000 bootstrap replications of $\sup_\lambda |D_n^*(\lambda)|$ and the standard deviation $\hat\sigma(\hat f_{n,X}(\lambda))$ has been calculated as the square root of
\[
\hat\sigma^2(\hat f_{n,X}(\lambda)) = n^{-2} \sum_{k} K_h^2(\lambda - \lambda_k)\,(1 + \delta_k)\, \hat f_{n,X}^2(\lambda_k).
\]

The results obtained are shown in Figure 1 for the AR-process and in Figure 2 for the ARMA-process, respectively.

Please insert Figure 1 and Figure 2 about here

We next investigate how well the bootstrap based confidence bands achieve the desired nominal coverage probability. Here we include in our simulation study also the moving-average process

3. $X_t = \varepsilon_t + 0.276\varepsilon_{t-1} - 0.084\varepsilon_{t-2} + 0.048\varepsilon_{t-3} - 0.039\varepsilon_{t-4} + 0.043\varepsilon_{t-5} + 0.09\varepsilon_{t-6} + 0.21\varepsilon_{t-7}$,

which has the same parameters as the autoregressive process 1). For this, the empirical coverage probabilities of the estimated bootstrap confidence bands have been calculated for different sample sizes and different choices of the smoothing bandwidth $h$. Nominal coverage probabilities of 90% and 95% have been considered. Notice that since we estimate the spectral density nonparametrically, the choice of $h$ is crucial for our analysis. To deal with this problem we calculated the empirical coverage probabilities for three fixed values of $h$ and for a choice of $h$ based on a cross-validation criterion like the one proposed by Beltrão and Bloomfield (1987); cf. also Hurvich (1985). The three fixed values of $h$ chosen correspond approximately to the mean value of $h$ as well as to the values obtained by taking plus or minus two times the standard deviation of the bandwidth selected using the aforementioned cross-validation method. The empirical coverage probabilities obtained over 200 trials and 1000 bootstrap replications are summarized in Table 1.

Please insert Table 1 about here

According to the results obtained, our method for constructing confidence bands works very satisfactorily in estimating accurately the exact confidence bands of interest and leads to empirical coverage probabilities that are close to the desired nominal probabilities.

3.2. A real-life data example. We apply the proposed method for constructing confidence bands to the egg-price data set analyzed in Fan and Yao (2003). In particular, we demonstrate how the simultaneous confidence bands obtained using the procedure proposed in this paper can be used to evaluate the fit of parametric models. The data set considered consists of $n = 1201$ weekly egg prices at a German agricultural market between April 1967 and May 1990. Since the data exhibit clear nonstationarity, Fan and Yao (2003, Chapter 3.6) considered the first-order differences of the series. Using the first 300 observations, Fan and Yao (2003) proposed two different models as appropriate for this data set, an ARMA(1,2) and an MA(7) model. We re-estimated these models using the whole series of 1200 observations and evaluated their fit using the estimated simultaneous confidence bands for the spectral density of the observed series.

In particular, Figure 3 shows the estimated spectral density (solid line) of the differenced egg-price series together with a 95% bootstrap confidence band (dotted lines) obtained using $B = 1000$ bootstrap replications. Displayed in the same plots are also the smoothed spectral densities of the two fitted parametric models (dashed lines). The nonparametric spectral density estimator has been obtained using the Bartlett-Priestley kernel and a bandwidth of $h = 0.12$ selected by cross-validation. The same bandwidth and kernel have been used to smooth the theoretical spectral densities of the two fitted parametric models. An inspection of these plots reveals that the ARMA(1,2) model provides a better fit to the egg-price data than the MA(7) model. Moreover, the latter model should be rejected as not appropriate because it fails to describe satisfactorily the low frequency behavior of the egg-price differences.

Please Insert Figure 3 about here

4. Proofs

Before we begin with the proofs of the assertions we introduce some notation. We will generally use $\gamma$ to denote an arbitrarily large and $\delta$ to denote an arbitrarily small positive constant. For any sequence of random variables $(Y_n)_{n\in\mathbb{N}}$ and any sequences of nonnegative constants $(\alpha_n)_{n\in\mathbb{N}}$ and $(\beta_n)_{n\in\mathbb{N}}$, we write
\[
Y_n = O(\alpha_n, \beta_n),
\]
if there exists some $C < \infty$ such that
\[
P(|Y_n| > C\alpha_n) \le C\beta_n.
\]
This notion is obviously stronger than the commonly used $O_P$. It is quite an effective shorthand in our context, where we have to derive several times results of the type that a large number of random variables is simultaneously below corresponding threshold values with high probability.

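Proof of Proposition 2.1. Abbreviate $\hat\sigma(\hat f_{n,X}(\lambda))$ and $\sigma(\hat f_{n,X}(\lambda))$ by $\hat\sigma(\lambda)$ and $\sigma(\lambda)$, respectively. In view of (2.17), it suffices to construct a coupling of the underlying random variables such that the bootstrap deviation process $(\sum_k w_{n,k}(\lambda)\, \xi_k^* / \hat\sigma(\lambda))_{\lambda\in[-\pi,\pi]}$ is close to the process $(\sum_k w_{n,k}(\lambda)\, \xi_k / \hat\sigma(\lambda))_{\lambda\in[-\pi,\pi]}$ with a high probability. We have, similarly to (2.17), that
\begin{align*}
\big| \mathrm{E} \hat f_{n,X}(\lambda) - f_X(\lambda) \big|
&\le \big| \mathrm{E} \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \big| + |K_h(f_X)(\lambda) - f_X(\lambda)| \\
&= O\big( (nh)^{-1} + n^{-1/2} \log n \big) + O(h^2). \tag{4.1}
\end{align*}
Note that we obtain from (2.11) that
\[
\hat f_{n,X}(\lambda) - \mathrm{E} \hat f_{n,X}(\lambda) = \sum_{k} w_{n,k}(\lambda)\, (\xi_k + R_n(\lambda_k)).
\]
Now it follows from Rosenthal's inequality that, for all $p \ge 2$,
\[
\mathrm{E} \Big| \sum_{k} w_{n,k}(\lambda)\, \xi_k \Big|^p = O((nh)^{-p/2}),
\]
which implies in conjunction with (2.15) by Markov's inequality that
\[
\hat f_{n,X}(\lambda) - \mathrm{E} \hat f_{n,X}(\lambda) = O(n^{\delta} (nh)^{-1/2}, n^{-\gamma}). \tag{4.2}
\]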


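(4.1) and (4.2) imply that
\[
\max_k \big| \hat f_{n,X}^2(\lambda_k) - f_X^2(\lambda_k) \big| = O(n^{\delta} (nh)^{-1/2}, n^{-\gamma}) + O(h^2). \tag{4.3}
\]
Therefore, we obtain by (2.2) that
\begin{align*}
\sup_{\lambda\in[-\pi,\pi]} |\hat\sigma(\lambda) - \sigma(\lambda)|
&\le \sup_{\lambda\in[-\pi,\pi]} \left\{ \frac{|\hat\sigma^2(\lambda) - \sigma^2(\lambda)|}{\sigma(\lambda)} \right\} \\
&= O((nh)^{1/2}) \left\{ \Big( \sum_{k} w_{n,k}^2(\lambda)\, \big| \hat f_{n,X}^2(\lambda_k) - f_X^2(\lambda_k) \big| \Big) + O(n^{-3/2}h^{-1} + n^{-1}) \right\} \\
&= O(n^{\delta} (nh)^{-1}, n^{-\gamma}) + O\big( (nh)^{-1/2} h^2 + n^{-1} h^{-1/2} + n^{-1/2} h^{1/2} \big). \tag{4.4}
\end{align*}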

Hence, we can ignore the effect of estimating the unknown standard deviation, that is, it suffices to construct such a coupling for the linear statistics
\[
\sum_{j} w_{n,j}(\lambda)\, \xi_j^* / \sigma(\lambda) \quad \text{and} \quad \sum_{j} w_{n,j}(\lambda)\, \xi_j / \sigma(\lambda).
\]
We do this in three steps. First, we replace the $\xi_j$ by normal random variables $\eta_j \sim N(0, \mathrm{var}(\xi_j))$ such that $\sup_\lambda \{ |\sum_j w_{n,j}(\lambda) \xi_j - \sum_j w_{n,j}(\lambda) \eta_j| \}$ is small. Then we replace in complete analogy the $\xi_j^*$ by normal random variables $\eta_j^* \sim N(0, \mathrm{var}(\xi_j^*))$ such that $\sup_\lambda \{ |\sum_j w_{n,j}(\lambda) \xi_j^* - \sum_j w_{n,j}(\lambda) \eta_j^*| \}$ is small. And finally, we construct a coupling of the $\eta_j$ with the $\eta_j^*$ such that $\sup_\lambda \{ |\sum_j w_{n,j}(\lambda) \eta_j - \sum_j w_{n,j}(\lambda) \eta_j^*| \}$ is small. Gluing these three couplings together we obtain the desired result.

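We begin with the first coupling. Recall that the random variables $\xi_0, \ldots, \xi_{[n/2]}$ are independent with
\[
\xi_j \sim
\begin{cases}
f_X(\lambda_j)\,(\chi^2_2/2 - 1), & \text{if } 1 \le j < n/2, \\
f_X(\lambda_j)\,(\chi^2_1 - 1), & \text{if } j \in \{0, n/2\}.
\end{cases}
\]
Define
\[
v_j := \mathrm{var}(\xi_j) =
\begin{cases}
f^2_X(\lambda_j), & \text{if } 1 \le j < n/2, \\
2 f^2_X(\lambda_j), & \text{if } j \in \{0, n/2\}.
\end{cases}
\]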
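As a quick numerical illustration of these two definitions, the following sketch draws the $\xi_j$ for a given vector of spectral density values and compares empirical with stated variances. The function names `simulate_xi` and `xi_variances` and the input array `f_vals` (values of $f_X$ at $\lambda_j = 2\pi j/n$, $j = 0, \ldots, [n/2]$) are hypothetical and chosen here only for illustration.

```python
import numpy as np

def _endpoint_mask(n):
    """True at j = 0 and, for even n, at j = n/2; indices run over 0..floor(n/2)."""
    j = np.arange(n // 2 + 1)
    return (j == 0) | (2 * j == n)

def simulate_xi(f_vals, n, rng, reps=1):
    """Draw reps independent copies of (xi_0, ..., xi_[n/2])."""
    m = n // 2
    chi1 = rng.chisquare(1, size=(reps, m + 1))         # used at the endpoints
    chi2 = rng.chisquare(2, size=(reps, m + 1)) / 2.0   # used at interior frequencies
    chi = np.where(_endpoint_mask(n), chi1, chi2)
    return f_vals * (chi - 1.0)

def xi_variances(f_vals, n):
    """v_j = f^2 at interior frequencies and 2 f^2 at the endpoints."""
    return np.where(_endpoint_mask(n), 2.0, 1.0) * f_vals ** 2
```

For odd $n$ only $j = 0$ is treated as an endpoint in this sketch, since $n/2$ is then not an integer index.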

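According to Corollary 4 in Sakhanenko (1991, p. 76), there exists a coupling of $\xi_0, \ldots, \xi_{[n/2]}$ with independent random variables $\eta_0, \ldots, \eta_{[n/2]}$, $\eta_j \sim N(0, v_j)$, such that, with $S_k = \sum_{0\le j\le k} \xi_j$ and $\widetilde S_k = \sum_{0\le j\le k} \eta_j$ ($0 \le k \le [n/2]$), the following inequality holds for some $C < \infty$ and arbitrary $\alpha \ge 2$:
\[
P\Bigl( \max_{0\le k\le [n/2]} \{|S_k - \widetilde S_k|\} > C\alpha x \Bigr)
\le \sum_{k=0}^{[n/2]} E|\xi_k|^{\alpha} / x^{\alpha} + P\Bigl( \max_{0\le k\le [n/2]} \{|\xi_k|\} > x \Bigr).
\]
Since we can majorize the right-hand side by $2\bigl(\sum_{k=0}^{[n/2]} E|\xi_k|^{\alpha}\bigr)/x^{\alpha}$ and since all moments of the $\xi_k$ are bounded, we obtain with the choice of $x = n^{\delta}$ and $\alpha = (\gamma + 1)/\delta$ that
\[
P\Bigl( \max_{0\le k\le [n/2]} \{|S_k - \widetilde S_k|\} > C\alpha n^{\delta} \Bigr)
\le 2\,([n/2] + 1) \max_{0\le k\le [n/2]} \{E|\xi_k|^{\alpha}\}\, n^{-\alpha\delta} = O(n^{-\gamma}). \tag{4.5}
\]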
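The exponent bookkeeping in this last step can be spelled out as follows. This only restates the argument above; the abbreviation $M_\alpha$ for the maximal $\alpha$-th absolute moment is introduced here for brevity and is not notation from the paper.

```latex
\[
  M_\alpha := \max_{0 \le k \le [n/2]} E|\xi_k|^{\alpha} < \infty,
  \qquad
  \alpha\delta = \frac{\gamma+1}{\delta}\,\delta = \gamma + 1,
\]
\[
  2\,([n/2]+1)\, M_\alpha\, n^{-\alpha\delta}
  \;\le\; 2\,(n+1)\, M_\alpha\, n^{-(\gamma+1)}
  \;\le\; 4\, M_\alpha\, n^{-\gamma}
  \;=\; O(n^{-\gamma}).
\]
```

The requirement $\alpha \ge 2$ amounts to $\delta \le (\gamma+1)/2$, which is no restriction when $\delta$ is taken small.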

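Recall that, according to the $2\pi$-periodicity and symmetry of the periodogram, $\xi_{[n/2]+k} = \xi_{n-[n/2]-k}$ ($k = 1, \ldots, n - [n/2]$) and $\xi_{-k} = \xi_k$ ($k = 1, \ldots, n$). Accordingly we set $\eta_{[n/2]+k} = \eta_{n-[n/2]-k}$ ($k = 1, \ldots, n - [n/2]$) and $\eta_{-k} = \eta_k$ ($k = 1, \ldots, n$). Now we extend the above definition of $S_k$ and $\widetilde S_k$ by setting, this time for $-n \le k \le n$,
\[
S_k = \sum_{j=-n}^{k} \xi_j - \sum_{j=-n}^{-1} \xi_j
\quad\text{and}\quad
\widetilde S_k = \sum_{j=-n}^{k} \eta_j - \sum_{j=-n}^{-1} \eta_j.
\]
Then we have, for $-n < j \le n$, that $\xi_j$ and $\eta_j$ can be recovered from these partial sum processes as $\xi_j = S_j - S_{j-1}$ and $\eta_j = \widetilde S_j - \widetilde S_{j-1}$. Note that we have, for $k = 1, \ldots, n - [n/2]$, that
\[
S_{[n/2]+k} = S_{[n/2]} + \xi_{n-[n/2]-1} + \cdots + \xi_{n-[n/2]-k} = S_{[n/2]} + S_{n-[n/2]-1} - S_{n-[n/2]-k}
\]
and, analogously,
\[
\widetilde S_{[n/2]+k} = \widetilde S_{[n/2]} + \widetilde S_{n-[n/2]-1} - \widetilde S_{n-[n/2]-k}.
\]
Furthermore, it follows, for $k = -n, \ldots, -1$, that $S_k = -\sum_{j=k+1}^{-1} \xi_j = -\sum_{j=1}^{-k-1} \xi_j = S_0 - S_{-k-1}$ and $\widetilde S_k = \widetilde S_0 - \widetilde S_{-k-1}$. Therefore, we obtain from (4.5) that
\[
\max_{-n\le k\le n} \bigl\{ |S_k - \widetilde S_k| \bigr\} = \widetilde O\bigl(n^{\delta},\, n^{-\gamma}\bigr). \tag{4.6}
\]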
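The symmetric extension and the resulting partial-sum identities are easy to check numerically. The sketch below builds the extended sequence on the index range $-n, \ldots, n$ from the values at $0, \ldots, [n/2]$ and forms the recentred partial sums; the helper names `extend_symmetric` and `partial_sums` and the storage convention (index $j$ stored at array position $j + n$) are assumptions made for this illustration only.

```python
import numpy as np

def extend_symmetric(base, n):
    """Extend values at indices 0..[n/2] to indices -n..n (index j stored at j + n)."""
    m = n // 2
    x = np.empty(2 * n + 1)
    x[n:n + m + 1] = base
    # x_{[n/2]+k} = x_{n-[n/2]-k}, k = 1, ..., n - [n/2]
    for k in range(1, n - m + 1):
        x[n + m + k] = x[n + (n - m - k)]
    # x_{-k} = x_k, k = 1, ..., n
    for k in range(1, n + 1):
        x[n - k] = x[n + k]
    return x

def partial_sums(x, n):
    """S_k = sum_{j=-n}^{k} x_j - sum_{j=-n}^{-1} x_j for k = -n..n (S_k stored at k + n)."""
    csum = np.cumsum(x)
    return csum - csum[n - 1]
```

With this convention $S_{-1} = 0$ and $S_0 = x_0$, so that for negative $k$ the value $S_k$ is a difference of two partial sums with non-negative index, which is what allows a bound on the range $0, \ldots, [n/2]$ to be carried over to the full range up to a constant factor.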

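It follows from the bounded total variation of the kernel $K$ that the sequence of weights $(w_{n,j})_j$ satisfies $\sup_\lambda\{\sum_j |w_{n,j}(\lambda) - w_{n,j+1}(\lambda)|\} = O((nh)^{-1})$. Therefore, we obtain from (4.6) that
\begin{align*}
\sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\xi_j - \sum_j w_{n,j}(\lambda)\eta_j \Bigr|
&\le \sup_{\lambda} \sum_j |w_{n,j}(\lambda) - w_{n,j+1}(\lambda)|\, |S_j - \widetilde S_j| \\
&= \widetilde O\bigl(n^{\delta}(nh)^{-1},\, n^{-\gamma}\bigr). \tag{4.7}
\end{align*}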
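The inequality in (4.7) is an instance of summation by parts. A minimal sketch of the mechanism, for a generic weight vector `w` and a generic difference sequence `d` (standing in for $\xi_j - \eta_j$), is given below; the function name `abel_bound` is hypothetical, and the boundary term $w_N D_N$ is kept explicitly because the sketch does not assume that the weights vanish at the end of the index range.

```python
import numpy as np

def abel_bound(w, d):
    """Return (lhs, rhs) with lhs = |sum_j w_j d_j| and rhs the summation-by-parts bound.

    With D_j = d_0 + ... + d_j one has
        sum_{j=0}^{N} w_j d_j = sum_{j=0}^{N-1} (w_j - w_{j+1}) D_j + w_N D_N,
    so lhs <= sum_j |w_j - w_{j+1}| |D_j| + |w_N D_N| = rhs.
    """
    D = np.cumsum(d)
    lhs = abs(np.dot(w, d))
    rhs = np.sum(np.abs(w[:-1] - w[1:]) * np.abs(D[:-1])) + abs(w[-1] * D[-1])
    return lhs, rhs
```

Since `rhs` is at most the total variation of the weights (plus the last weight) times the maximal partial sum, a uniform bound on the partial sums as in (4.6) transfers directly to the weighted sums.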

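On the bootstrap side, we proceed similarly. Let
\[
v_j^* := \mathrm{var}(\xi_j^*) =
\begin{cases}
\hat f^2_{n,X}(\lambda_j), & \text{if } 1 \le j < n/2, \\
2 \hat f^2_{n,X}(\lambda_j), & \text{if } j \in \{0, n/2\}.
\end{cases}
\]
Note that the $v_j^*$ can be conveniently bounded by a constant, that is, for any $\gamma < \infty$ there exists a $C_\gamma < \infty$ such that
\[
P\Bigl( \max_{0\le j\le [n/2]} \{v_j^*\} > C_\gamma \Bigr) = O(n^{-\gamma}).
\]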

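Conditionally on the event that $\max_j\{v_j^*\} \le C_\gamma$, we can again apply Corollary 4 in Sakhanenko (1991) to show that there exist independent random variables $\eta_0^*, \ldots, \eta_{[n/2]}^*$, $\eta_j^* \sim N(0, v_j^*)$, such that, with $S_k^* = \sum_{j=0}^{k} \xi_j^*$ and $\widetilde S_k^* = \sum_{j=0}^{k} \eta_j^*$,
\[
\max_{0\le k\le [n/2]} \bigl\{ |S_k^* - \widetilde S_k^*| \bigr\} = \widetilde O\bigl(n^{\delta},\, n^{-\gamma}\bigr)
\]
holds. Defining $\eta^*_{[n/2]+k} = \eta^*_{n-[n/2]-k}$ ($k = 1, \ldots, n - [n/2]$) and $\eta^*_{-k} = \eta^*_k$ ($k = 1, \ldots, n$) and extending the previous definition of $S_k^*$ and $\widetilde S_k^*$ by setting $S_k^* = \sum_{j=-n}^{k} \xi_j^* - \sum_{j=-n}^{-1} \xi_j^*$ and $\widetilde S_k^* = \sum_{j=-n}^{k} \eta_j^* - \sum_{j=-n}^{-1} \eta_j^*$, respectively, we obtain as in (4.6) that
\[
\max_{-n\le k\le n} \bigl\{ |S_k^* - \widetilde S_k^*| \bigr\} = \widetilde O\bigl(n^{\delta},\, n^{-\gamma}\bigr). \tag{4.8}
\]
This implies, analogously to (4.7), that
\[
\sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\xi_j^* - \sum_j w_{n,j}(\lambda)\eta_j^* \Bigr| = \widetilde O\bigl(n^{\delta}(nh)^{-1},\, n^{-\gamma}\bigr). \tag{4.9}
\]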

Finally, it remains to construct a coupling of the $\eta_j$ and the $\eta_j^*$ such that $\sup_\lambda\{|\sum_j w_{n,j}(\lambda)\eta_j - \sum_j w_{n,j}(\lambda)\eta_j^*|\}$ is small with a high probability. In contrast to the pairs of random variables $(\xi_j, \eta_j)$ and $(\xi_j^*, \eta_j^*)$, which have different distributions but matching variances, the sequences $(\eta_j)_j$ and $(\eta_j^*)_j$ consist of random variables from a convolution-invariant family but with different variances. On the other hand, because of the bounded total variation of $K$, the weights $w_{n,j}(\lambda)$ are relatively smooth in $j$. Hence, the following coupling of $\eta = (\eta_0, \ldots, \eta_{[n/2]})'$ and $\eta^* = (\eta_0^*, \ldots, \eta_{[n/2]}^*)'$ will prove to be appropriate. We decompose $\eta$ and $\eta^*$ into $\Delta \asymp h^{-1}$ packages of respective lengths $d_j \asymp nh$, that is,
\begin{align*}
\eta &= (\eta_{1,1}, \ldots, \eta_{1,d_1}, \ldots, \eta_{\Delta,1}, \ldots, \eta_{\Delta,d_\Delta})', \\
\eta^* &= (\eta^*_{1,1}, \ldots, \eta^*_{1,d_1}, \ldots, \eta^*_{\Delta,1}, \ldots, \eta^*_{\Delta,d_\Delta})'.
\end{align*}
Let $v_{j,k} = E\eta_{j,k}^2$, $v^*_{j,k} = E{\eta^*_{j,k}}^2$, and $w_{n,j,k}(\lambda) = w_{n,l}(\lambda)$, if $l$ corresponds to $(j,k)$. Furthermore, let $V_j = \sum_{k=1}^{d_j} v_{j,k}$ and $V_j^* = \sum_{k=1}^{d_j} v^*_{j,k}$ ($j = 1, \ldots, \Delta$). We define
\begin{align*}
t_{j,k} &= \sum_{l=1}^{k} v_{j,l}, & t^*_{j,k} &= \sum_{l=1}^{k} v^*_{j,l}, \\
s_{j,k} &= (j-1) + t_{j,k}/V_j, & s^*_{j,k} &= (j-1) + t^*_{j,k}/V_j^*.
\end{align*}


The coupling of $\eta$ and $\eta^*$ will be defined by expressing both vectors by increments of the same Wiener process. This Wiener process serves as an appropriate tool to connect the $\eta_{j,k}$ with the $\eta^*_{j,k}$ in such a way that partial sums of these with slowly changing weights are close to each other. By interpolation with independent Brownian bridges we build a Wiener process $(W(t))_{t\in[0,\Delta]}$ such that
\[
\eta_{j,k} = V_j^{1/2}\,\bigl(W(s_{j,k}) - W(s_{j,k-1})\bigr).
\]
Now we define, conditioned on $X_1, \ldots, X_n$, independent random variables $\eta^*_{j,k} \sim N(0, v^*_{j,k})$ as
\[
\eta^*_{j,k} = {V_j^*}^{1/2}\,\bigl(W(s^*_{j,k}) - W(s^*_{j,k-1})\bigr).
\]
Moreover, the remaining $\eta_{j,k}$ and $\eta^*_{j,k}$ are again defined according to the properties of $2\pi$-periodicity and symmetry of the periodogram, that is, $\eta_{[n/2]+k} = \eta_{n-[n/2]-k}$, $\eta^*_{[n/2]+k} = \eta^*_{n-[n/2]-k}$ ($1 \le k \le n - [n/2]$), and $\eta_{-k} = \eta_k$, $\eta^*_{-k} = \eta^*_k$ ($1 \le k \le n$).
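The construction within a single package can be illustrated numerically. The sketch below simulates one Brownian motion on the unit interval (the package interval $[j-1, j]$ shifted to $[0, 1]$) on the merged grid of both sets of time points and reads off both vectors as scaled increments. It simulates $W$ first and then defines both vectors from it, rather than interpolating a given $\eta$ with Brownian bridges as in the proof; the joint distribution of the two vectors is the same. The function name `couple_block` and its arguments are hypothetical.

```python
import numpy as np

def couple_block(v, v_star, rng):
    """Couple eta_k ~ N(0, v_k) and eta*_k ~ N(0, v*_k) through one Brownian motion on [0, 1].

    v and v_star are arrays of strictly positive variances of equal length.
    """
    V, V_star = v.sum(), v_star.sum()
    s = np.concatenate(([0.0], np.cumsum(v) / V))                  # s_{j,k} - (j - 1)
    s_star = np.concatenate(([0.0], np.cumsum(v_star) / V_star))   # s*_{j,k} - (j - 1)
    s[-1] = s_star[-1] = 1.0                                       # guard against rounding
    grid = np.union1d(s, s_star)                                   # sorted, unique time points
    incr = rng.normal(scale=np.sqrt(np.diff(grid)))                # independent W-increments
    W = np.concatenate(([0.0], np.cumsum(incr)))                   # W on the merged grid
    W_s = W[np.searchsorted(grid, s)]
    W_s_star = W[np.searchsorted(grid, s_star)]
    eta = np.sqrt(V) * np.diff(W_s)
    eta_star = np.sqrt(V_star) * np.diff(W_s_star)
    return eta, eta_star
```

Each $\eta_k$ has variance $V \cdot (s_k - s_{k-1}) = v_k$, and likewise for $\eta^*_k$; when the two variance profiles coincide the two vectors are identical, and when they are proportional the vectors differ only by the factor $(V^*/V)^{1/2}$.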

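We decompose $\sum_l w_{n,l}(\lambda)(\eta_l - \eta_l^*) = \sum_j \sum_k w_{n,j,k}(\lambda)(\eta_{j,k} - \eta^*_{j,k})$ into a "coarse structure" term,
\[
\Delta_1(\lambda) = \sum_j \bigl(V_j^{1/2} - {V_j^*}^{1/2}\bigr) \sum_k w_{n,j,k}(\lambda)\,\bigl(W(s^*_{j,k}) - W(s^*_{j,k-1})\bigr),
\]
and a "fine structure" term,
\[
\Delta_2(\lambda) = \sum_j V_j^{1/2} \sum_k w_{n,j,k}(\lambda)\,\Bigl[ \bigl(W(s_{j,k}) - W(s_{j,k-1})\bigr) - \bigl(W(s^*_{j,k}) - W(s^*_{j,k-1})\bigr) \Bigr].
\]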
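That the two terms add up to the original sum can be checked directly from the representation of $\eta_{j,k}$ and $\eta^*_{j,k}$ as scaled increments of $W$. The abbreviations $\delta_{j,k}$ and $\delta^*_{j,k}$ for these increments are introduced here only for this check.

```latex
\[
  \delta_{j,k} := W(s_{j,k}) - W(s_{j,k-1}), \qquad
  \delta^*_{j,k} := W(s^*_{j,k}) - W(s^*_{j,k-1}),
\]
\begin{align*}
  \Delta_1(\lambda) + \Delta_2(\lambda)
  &= \sum_j \sum_k w_{n,j,k}(\lambda)
     \Bigl[ \bigl(V_j^{1/2} - {V_j^*}^{1/2}\bigr)\,\delta^*_{j,k}
          + V_j^{1/2}\,\bigl(\delta_{j,k} - \delta^*_{j,k}\bigr) \Bigr] \\
  &= \sum_j \sum_k w_{n,j,k}(\lambda)
     \Bigl[ V_j^{1/2}\,\delta_{j,k} - {V_j^*}^{1/2}\,\delta^*_{j,k} \Bigr]
   = \sum_j \sum_k w_{n,j,k}(\lambda)\,\bigl(\eta_{j,k} - \eta^*_{j,k}\bigr).
\end{align*}
```

The first term isolates the mismatch of the package scales $V_j^{1/2}$ and ${V_j^*}^{1/2}$, the second the mismatch of the time points within a package.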

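We obtain from (4.3) that
\begin{align*}
\max_{j,k} \{|t_{j,k} - t^*_{j,k}|\}
&= \max_{j,k} \Bigl| \sum_{l=1}^{k} \bigl( \hat f^2_{n,X}(\lambda_{j,l}) - f^2_X(\lambda_{j,l}) \bigr)\,\bigl(1 + I(\lambda_{j,l} \bmod \pi = 0)\bigr) \Bigr| \\
&= \widetilde O\bigl(n^{\delta}(nh)^{1/2},\, n^{-\gamma}\bigr) + O(nh^3). \tag{4.10}
\end{align*}
This yields by $V_j \asymp V_j^* \asymp nh$ (for $V_j^*$, with a probability exceeding $1 - O(n^{-\gamma})$) that
\[
\max_j \bigl\{ |V_j^{1/2} - {V_j^*}^{1/2}| \bigr\}
= \max_j \frac{|V_j - V_j^*|}{V_j^{1/2} + {V_j^*}^{1/2}}
= \widetilde O\bigl(n^{\delta} + n^{1/2}h^{5/2},\, n^{-\gamma}\bigr).
\]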

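Therefore, and since $\sum_k w_{n,j,k}(\lambda)\bigl(W(s^*_{j,k}) - W(s^*_{j,k-1})\bigr)$ is normally distributed with zero mean and a variance of order $O((nh)^{-2})$, we get immediately that
\[
|\Delta_1(\lambda)| = \widetilde O\bigl(n^{\delta}(nh)^{-1} + (nh)^{-1/2}h^2,\, n^{-\gamma}\bigr).
\]
Proving this on an appropriate sequence of increasingly fine grids, we also obtain that
\[
\sup_{\lambda\in[-\pi,\pi]} \{|\Delta_1(\lambda)|\} = \widetilde O\bigl(n^{\delta}[(nh)^{-1} + (nh)^{-1/2}h^2],\, n^{-\gamma}\bigr). \tag{4.11}
\]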


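To estimate $\Delta_2(\lambda)$, we rewrite it as
\begin{align*}
\Delta_2(\lambda)
&= \sum_j V_j^{1/2} \sum_k w_{n,j,k}(\lambda) \Bigl[ \int_{s_{j,k-1}}^{s_{j,k}} dW(t) - \int_{s^*_{j,k-1}}^{s^*_{j,k}} dW(t) \Bigr] \\
&= \sum_j V_j^{1/2} \int_{j-1}^{j} [w_t(\lambda) - w_t^*(\lambda)]\, dW(t),
\end{align*}
where $w_t(\lambda) = w_{n,j,k}(\lambda)$, if $t \in (s_{j,k-1}, s_{j,k}]$, and $w_t^*(\lambda) = w_{n,j,k}(\lambda)$, if $t \in (s^*_{j,k-1}, s^*_{j,k}]$. (Note that the integrands in the integrals above are piecewise constant, which means that the integrals can be computed as weighted sums of increments of $W$; the more general concept of stochastic integrals is not needed here.) We conclude from (4.10) that
\[
|s_{j,k} - s^*_{j,k}| \le \frac{|t_{j,k} - t^*_{j,k}|}{V_j} + \frac{t^*_{j,k}}{V_j^*}\,\frac{|V_j^* - V_j|}{V_j}
= \widetilde O\bigl(n^{\delta}(nh)^{-1/2} + h^2,\, n^{-\gamma}\bigr).
\]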

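Moreover, since $K$ has finite total variation, we obtain from this relation that
\[
\int (w_t(\lambda) - w_t^*(\lambda))^2\, dt = \widetilde O\bigl(n^{\delta}(nh)^{-5/2} + n^{-2},\, n^{-\gamma}\bigr), \tag{4.12}
\]
which yields that
\[
\Delta_2(\lambda) = \widetilde O\bigl(n^{\delta}[(nh)^{-3/4} + (nh)^{-1/2}h],\, n^{-\gamma}\bigr).
\]
Proving this again on an appropriate sequence of increasingly fine grids, we conclude that
\[
\sup_{\lambda\in[-\pi,\pi]} \{|\Delta_2(\lambda)|\} = \widetilde O\bigl(n^{\delta}[(nh)^{-3/4} + (nh)^{-1/2}h],\, n^{-\gamma}\bigr). \tag{4.13}
\]
The assertion now follows from (4.7), (4.9), (4.11) and (4.13). □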

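Proof of Lemma 2.2. As in the proof of Proposition 2.1, (4.4) yields that we can again ignore the effect of estimating the unknown standard deviation of $\hat f_{n,X}(\lambda)$ and consider $(\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda))/\sigma(\lambda)$ instead of $(\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda))/\hat\sigma(\lambda)$. Recall from (2.16) and (4.7) that
\[
\sup_{\lambda\in[-\pi,\pi]} \bigl\{ |\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)| \bigr\}
= \sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\eta_j \Bigr|
+ \widetilde O\Bigl( \frac{\log n}{\sqrt{n}} + n^{\delta}(nh)^{-1},\, n^{-\gamma} \Bigr), \tag{4.14}
\]
where the $\eta_j$ are independent normal random variables. We approximate the latter supremum by the maximum over the Fourier frequencies. Since $\sum_j |w_{n,j}(\lambda) - w_{n,j}(\omega)| = O(|\lambda - \omega|/h)$, we obtain that
\begin{align*}
\Bigl| \sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\eta_j \Bigr| - \max_{-[n/2]\le k\le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \Bigr|
&\le \max_{-[n/2]\le k\le [n/2]} \; \sup_{\lambda\in[\lambda_k - \pi/n,\, \lambda_k + \pi/n]\cap[-\pi,\pi]} \sum_j |w_{n,j}(\lambda) - w_{n,j}(\lambda_k)| \times \max_j \{|\eta_j|\} \\
&= \widetilde O\Bigl( \frac{\sqrt{\log n}}{nh},\, n^{-\gamma} \Bigr). \tag{4.15}
\end{align*}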

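The random variables $\sum_j w_{n,j}(\lambda_k)\eta_j$ ($k = -[n/2], \ldots, [n/2]$) are jointly normally distributed, and it follows from (i) of Lemma 3.1 in Neumann (2001) that $\max_{-[n/2]\le k\le [n/2]}\{|\sum_j w_{n,j}(\lambda_k)\eta_j|\}$ has a density $p_n^*$ with
\[
\sup_t \{p_n^*(t)\} = O\bigl(\sqrt{nh\log n}\bigr).
\]
This implies that
\[
P\Bigl( \max_{-[n/2]\le k\le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \in [c, d] \Bigr) = O\bigl((d - c)\sqrt{nh\log n}\bigr). \tag{4.16}
\]
By (4.14), (4.15) and (4.16) we obtain, with $c_{n,\gamma} = C_\gamma\bigl(\frac{\log n}{\sqrt{n}} + n^{\delta}(nh)^{-1} + \frac{\sqrt{\log n}}{nh}\bigr)$, that
\begin{align*}
P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \bigl\{ |\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)| \bigr\} \in [c, d] \Bigr)
&\le P\Bigl( \max_{-[n/2]\le k\le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \in [c - c_{n,\gamma},\, d + c_{n,\gamma}] \Bigr) + O(n^{-\gamma}) \\
&= O\bigl( (d - c)\sqrt{nh\log n} + h(\log n)^{3/2} + n^{\delta}\log n\,(nh)^{-1/2} \bigr).
\end{align*}
□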

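Proof of Theorem 2.3. It follows from Proposition 2.1 and Lemma 2.2 that
\[
\sup_t \Biggl| P\Biggl( \sup_{\lambda\in[-\pi,\pi]} \frac{|\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)|}{\hat\sigma(\hat f_{n,X}(\lambda))} \le t \Biggr)
- P\Biggl( \sup_{\lambda\in[-\pi,\pi]} \frac{|\sum_j w_{n,j}(\lambda)\xi_j^*|}{\hat\sigma(\hat f_{n,X}(\lambda))} \le t \Biggr) \Biggr| = o_P(1).
\]
This implies
\[
P\Biggl( \sup_{\lambda\in[-\pi,\pi]} \frac{|\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)|}{\hat\sigma(\hat f_{n,X}(\lambda))} \le t \Biggr) \Biggr|_{t = t^*_{n,\alpha}} = 1 - \alpha + o_P(1),
\]
which yields that
\[
P\Biggl( \sup_{\lambda\in[-\pi,\pi]} \frac{|\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)|}{\hat\sigma(\hat f_{n,X}(\lambda))} \le t^*_{n,\alpha} \Biggr) = 1 - \alpha + o(1).
\]
□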
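To make the role of the bootstrap quantile $t^*_{n,\alpha}$ concrete, the following is a minimal sketch of how a band of the form "estimate $\pm$ quantile $\times$ standard deviation" could be computed. It is not the authors' implementation and rests on several assumptions that are not taken from the text: an Epanechnikov kernel, weights normalised to sum to one over the Fourier frequencies in $(-\pi, \pi]$ without periodic wrap-around, a zero-mean series, bootstrap variables $\xi_k^* = \hat f(\lambda_k)(\chi^2 - 1)$ with the chi-square scaling suggested by the variance formula for $v_j^*$, a variance estimate $\sum_k u_k^2(\lambda) v_k^*$ based on the folded weights $u_k = w_k + w_{-k}$, and the supremum over $\lambda$ replaced by a maximum over the Fourier frequencies in $[0, \pi]$. All function names are hypothetical.

```python
import numpy as np

def epanechnikov(x):
    return np.where(np.abs(x) <= 1.0, 0.75 * (1.0 - x ** 2), 0.0)

def folded_weights(n, h):
    """Row i holds u_k(lambda_i), k = 0..[n/2]: kernel weights with w_{-k} folded onto w_k."""
    m = n // 2
    ks = np.arange(-((n - 1) // 2), m + 1)        # each Fourier frequency in (-pi, pi] once
    lam = 2.0 * np.pi * ks / n
    lam_eval = 2.0 * np.pi * np.arange(m + 1) / n
    W = epanechnikov((lam_eval[:, None] - lam[None, :]) / h)
    W /= W.sum(axis=1, keepdims=True)             # weights sum to one for every lambda_i
    U = np.zeros((m + 1, m + 1))
    for col, k in enumerate(np.abs(ks)):
        U[:, k] += W[:, col]
    return U

def bootstrap_band(x, h, alpha=0.05, n_boot=500, rng=None):
    """Return (f_hat, lower, upper, t_star) at the Fourier frequencies in [0, pi]."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    m = n // 2
    I = np.abs(np.fft.rfft(x)) ** 2 / (2.0 * np.pi * n)   # periodogram at lambda_0..lambda_m
    U = folded_weights(n, h)
    f_hat = U @ I                                          # kernel-smoothed periodogram
    k = np.arange(m + 1)
    endpoint = (k == 0) | (2 * k == n)
    v_star = np.where(endpoint, 2.0, 1.0) * f_hat ** 2     # bootstrap variances v*_k
    sigma_hat = np.sqrt((U ** 2) @ v_star)                 # sd of sum_k u_k(lambda) xi*_k
    chi = np.where(endpoint,
                   rng.chisquare(1, size=(n_boot, m + 1)),
                   rng.chisquare(2, size=(n_boot, m + 1)) / 2.0)
    xi_star = f_hat * (chi - 1.0)
    stat = np.max(np.abs(xi_star @ U.T) / sigma_hat, axis=1)   # studentized sup statistic
    t_star = np.quantile(stat, 1.0 - alpha)
    return f_hat, f_hat - t_star * sigma_hat, f_hat + t_star * sigma_hat, t_star
```

A call such as `bootstrap_band(x, h=0.3, alpha=0.05)` returns the estimate together with the lower and upper band limits; by construction the band covers the smoothed spectral density $K_h(f_X)$ rather than $f_X$ itself, in line with the statement proved above.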

Acknowledgment. We thank two referees for many helpful comments.

References

Beltrao, K. L. and Bloomfield, P. (1987). Determining the bandwidth of a kernel spectrum estimate. Journal of Time Series Analysis 8, 21–38.

Brillinger, D. R. (1981). Time Series. Data Analysis and Theory. New York: McGraw-Hill.

Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, 2nd edition. New York: Springer.

Dahlhaus, R. and Janas, D. (1996). A frequency domain bootstrap for ratio statistics in time series analysis. Annals of Statistics 24, 1934–1963.

Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer.

Franke, J. and Härdle, W. (1992). On bootstrapping kernel spectral estimates. Annals of Statistics 20, 121–145.

Hrafnkelsson, B. and Newton, J. H. (2000). Asymptotic simultaneous confidence bands for vector autoregressive spectra. Biometrika 87, 173–182.

Hurvich, C. M. (1985). Data-driven choice of spectrum estimation: extending the applicability of cross-validation methods. Journal of the American Statistical Association 80, 933–940.

Hurvich, C. M. and Zeger, S. L. (1987). Frequency domain bootstrap methods for time series. Technical Report 87-115. Graduate School of Business Administration. New York University.

Koslov, J. W. and Jones, R. H. (1985). A unified approach to confidence bounds for the autoregressive spectral estimator. Journal of Time Series Analysis 6, 141–151.

Müller, H. G. and Prewitt, K. (1992). Weak convergence and adaptive peak estimation for spectral densities. Annals of Statistics 20, 1329–1349.

Neumann, M. H. (2001). On robustness of model-based bootstrap schemes in nonparametric time series analysis. Statistics 35, 1–40.

Newton, J. H. and Pagano, M. (1984). Simultaneous confidence bands for autoregressive spectra. Biometrika 71, 197–202.

Priestley, M. B. (1981). Spectral Analysis and Time Series. New York: Academic Press.

Sakai, H. and Sakaguchi, F. (1990). Simultaneous confidence bands for the spectral estimate of two-channel autoregressive processes. Journal of Time Series Analysis 11, 49–56.

Sakhanenko, A. I. (1991). On the accuracy of normal approximation in the invariance principle. Siberian Advances in Mathematics 1, 58–91.

Tomasek, L. (1987). Asymptotic simultaneous confidence bands for autoregressive spectral density. Journal of Time Series Analysis 8, 469–491.

                    n = 256                n = 512                n = 1024
Process       h     90%    95%       h     90%    95%       h     90%    95%

AR(7)        0.14  0.840  0.920     0.10  0.885  0.940     0.09  0.920  0.955
             0.12  0.845  0.915     0.08  0.875  0.935     0.07  0.910  0.950
             0.10  0.860  0.900     0.06  0.840  0.925     0.05  0.850  0.925
             CV    0.820  0.905     CV    0.845  0.935     CV    0.885  0.930

ARMA(2,1)    0.14  0.875  0.930     0.10  0.890  0.920     0.09  0.875  0.955
             0.12  0.860  0.915     0.08  0.870  0.910     0.07  0.885  0.945
             0.10  0.860  0.905     0.06  0.835  0.895     0.05  0.840  0.920
             CV    0.885  0.925     CV    0.850  0.905     CV    0.860  0.935

MA(7)        0.14  0.885  0.955     0.10  0.865  0.900     0.09  0.865  0.930
             0.12  0.835  0.930     0.08  0.840  0.900     0.07  0.850  0.910
             0.10  0.840  0.900     0.06  0.810  0.860     0.05  0.820  0.910
             CV    0.865  0.920     CV    0.835  0.910     CV    0.860  0.925

Table 1: Empirical coverage probabilities of 90% and 95% confidence bands for different sample sizes n and smoothing bandwidths h. CV refers to the results obtained using a cross-validation criterion to select the bandwidth.

[Figure 1: two panels, horizontal axis "Frequency" (0.0 to 3.0), vertical axis "Power".]

Figure 1. Simultaneous 95% confidence bands for the spectral density of the ARMA(2,1) model: The solid line in both figures is the estimated spectral density, the dashed lines refer to the estimated exact confidence bands and the dotted lines to the bootstrap confidence bands. The top figure presents the results for n = 256 and the bottom figure for n = 1024.

[Figure 2: two panels, horizontal axis "Frequency" (0.0 to 3.0), vertical axis "Power".]

Figure 2. Simultaneous 95% confidence bands for the spectral density of the AR(7) model: The solid line in both figures is the estimated spectral density, the dashed lines refer to the estimated exact confidence bands and the dotted lines to the bootstrap confidence bands. The top figure presents the results for n = 256 and the bottom figure for n = 1024.

[Figure 3: two panels, horizontal axis "Frequency" (0.0 to 3.0), vertical axis "Power".]

Figure 3. Estimated spectral density (solid line) of the differenced German egg-price data together with 95% confidence bands (dotted lines). The dashed line in the top graph is the smoothed spectral density of the fitted ARMA(1,2) model and in the bottom graph of the fitted MA(7) model.
