
Statistics & Probability Letters 50 (2000) 137–147

On bootstrapping L2-type statistics in density testing

Michael H. Neumann^a,∗, Efstathios Paparoditis^b

a Sonderforschungsbereich 373, Humboldt-Universität zu Berlin, Spandauer Straße 1, D-10178 Berlin, Germany

b Department of Mathematics and Statistics, University of Cyprus, P.O. Box 537, Nicosia, Cyprus

Received October 1999; received in revised form February 2000

Abstract

We consider non-parametric tests for checking parametric hypotheses about the stationary density of weakly dependent observations. The test statistic is based on the L2-distance between a non-parametric estimate and a smoothed version of a parametric estimate of the stationary density. Since this statistic behaves asymptotically as in the case of independent observations, an i.i.d.-type bootstrap to determine the critical value for the test is proposed. © 2000 Elsevier Science B.V. All rights reserved.

MSC: primary 62M07; secondary 62G09; 62G10

Keywords: Bootstrap; Stationary density; Test; Weak dependence

1. Introduction

Especially in the context of data from time series, statisticians very often fit certain parametric or semiparametric models. Parametric restrictions can be imposed for the dependence mechanism between subsequent observations and/or their marginal distribution. For example, people often assume normality – either directly for the observed random variables or for the unobserved innovations in structural time series models. The adequacy of such strong assumptions is almost always debatable and some guidelines for assessing their appropriateness are of interest. In the present paper we consider tests which can be used to check certain parametric or semiparametric assumptions on the marginal distribution.

There already exists a lot of theory for tests in the context of independent, identically distributed observations. Classical approaches are based on a comparison of the assumed cumulative distribution function with its empirical counterpart and include well-known tests such as the Kolmogorov–Smirnov and the Cramér–von Mises tests. More recently, people have also developed tests based on a comparison of the assumed density with a non-parametric estimate. In the context of i.i.d. observations, Bickel and Rosenblatt (1973) proposed a test based on the L2-distance between a non-parametric density estimate and a parametric fit. Although methods based on the cumulative distribution function such as the Kolmogorov–Smirnov and the Cramér–von Mises tests mentioned above are perhaps more popular among applied statisticians, both approaches have their relative

∗Corresponding author.



advantages and disadvantages. The relative merits of smoothing-based tests based on local characteristics like

densities versus non-smoothing tests based on cumulative characteristics are discussed by Rosenblatt (1975)

and Ghosh and Huang (1991).

In the context of dependent data, the development of practicable tests usually becomes more difficult than in the independent case, since even the limit distribution of a potential test statistic depends on the dependence mechanism within the observations. In this respect, smoothing-based methods have another, perhaps unexpected advantage, since it turns out that certain test statistics have the same limit distribution as in the case of i.i.d. observations. Whereas this effect is well known for the pointwise behavior of non-parametric estimators (see, e.g., Robinson, 1983), it seems to be much less known for statistics that depend through some non-parametric estimator on the whole sample. Takahata and Yoshihara (1987) showed for the special case of m-dependent observations that the integrated squared error of a non-parametric estimate of the stationary density has the same limit distribution as in the case of i.i.d. data. We provide a central limit theorem for this L2-distance under the assumption of absolute regularity, which is satisfied by several of the processes considered in the time-series literature. Inspired by the work of Härdle and Mammen (1993), we will focus on the L2-distance between a non-parametric estimate and a smoothed version of a parametric estimate rather than the parametric estimate itself. Similar tests have been considered independently by Fan and Ullah (1999). Moreover, we will also relax the assumptions of Takahata and Yoshihara (1987), which in particular allows us to include the interesting case of testing the joint distribution of (X_i, X_{i−l_1}, …, X_{i−l_{d−1}})′. There exists some related work on non-parametric tests which is also based on the possibility to neglect weak dependence. Theory for L2-tests is developed in Paparoditis (2000) for the spectral density and in Kreiss et al. (1998) for the autoregression function. The case of supremum-type statistics that are needed for the construction of simultaneous confidence bands and L∞-tests is investigated in Neumann and Kreiss (1998) in the context of non-parametric autoregressive models, and in Neumann (1998, 1997) in the more general framework of weakly dependent processes without any additional structural assumptions.

Although one could choose the critical value according to the limit distribution of the test statistic, we propose to use the bootstrap for its determination. In accordance with our asymptotic theory, we employ Efron's (1979) bootstrap, which was originally designed for i.i.d. observations. Some experience in related cases (e.g. the simulations reported in Härdle and Mammen, 1993) leads us to expect that a suitable bootstrap method improves the accuracy of approximation provided by the limiting normal distribution. Some simulations reported in Section 3 of this paper seem to corroborate this conjecture. It seems to be possible to obtain an even better approximation to the distribution of the test statistic by bootstrap methods that explicitly take into account the dependence structure of the observations. On the other hand, the approach proposed here is easier to implement. Moreover, our results may also be viewed under the aspect of robustness. If one did incorrectly assume that the underlying data were generated by a sequence of independent random variables, then our theory reassures the statistician that the i.i.d.-type approach was asymptotically justified.

2. Test statistics and their limit distributions

Throughout the paper we assume that we have observations from a stationary process {X_i, −∞ < i < ∞}. We do not impose any kind of structural conditions on the dependence mechanism such as, for example, some finite-order autoregressive structure. All we need is some appropriate kind of mixing condition and some assumption on the joint densities. We impose in particular the following conditions.

Assumption 1. Let, for j ≤ k, F_j^k = σ(X_j, X_{j+1}, …, X_k). The coefficient of absolute regularity (β-mixing coefficient) is defined as

    β(k) = E[ sup_{V ∈ F_{i+k}^∞} {|P(V | F_{−∞}^i) − P(V)|} ].


We suppose that the β(k) decay at an exponential rate, that is,

    β(k) ≤ C exp(−Ck).

Let f be the stationary density of the process and f_{X_{i_1},…,X_{i_m}} be the joint density of (X_{i_1}, …, X_{i_m}).

Assumption 2. (i) f is continuous,
(ii) sup_{x_1,…,x_m} { f_{X_{i_1},…,X_{i_m}}(x_1, …, x_m) } < ∞ for all m and i_1 < ··· < i_m.

We study either the case of d-dimensional random variables X_i or the case of one-dimensional random variables X_i where we are interested in testing hypotheses on the joint density of (X_i, X_{i−l_1}, …, X_{i−l_{d−1}})′. To unify our notation, we introduce random variables Y_i, where Y_i = X_i in the former and Y_i = (X_i, X_{i−l_1}, …, X_{i−l_{d−1}})′ in the latter case.

Tests for parametric or semiparametric hypotheses can be derived at different levels concerning the cardinality of the null hypothesis. All essential mathematical features can easily be studied in the simplest case of a single null hypothesis, which is the object of the following subsection.

2.1. Testing of single hypotheses

In order to present the essential mathematical ideas in a manner as clear as possible, we consider first the basic case of testing a single hypothesis, that is, of

    H0: f = f_0.

Let

    f̂_n(x) = (1/(n h^d)) Σ_{i=1}^n K((x − Y_i)/h)   (2.1)

be a usual kernel estimator of f, where h = h(n) denotes a bandwidth tending to 0 as n → ∞. Our test statistic relates f̂_n with the hypothetical density f_0. To avoid any kind of bias problems, we compare f̂_n with a smoothed version of f_0. This leads to

    T_n = n h^{d/2} ∫ [f̂_n(x) − (K_h ∗ f_0)(x)]² dx,   (2.2)

where the smoothing operator K_h is defined by

    (K_h ∗ g)(·) = ∫ h^{−d} K((· − z)/h) g(z) dz.   (2.3)

Before we state a theorem about the limit distribution of T_n, we introduce two more assumptions.

Assumption 3. K is bounded and compactly supported.

Assumption 4. (i) h = o([log(n)]^{−3}),
(ii) h^{−d} = o(n).

The asymptotic behaviour of statistics similar to T_n was already investigated by Takahata and Yoshihara (1987). They found in the special case of m-dependent observations that T_n − ET_n converges to a normal distribution with the same variance as if the Y_i were independent. The following theorem provides a similar result under a different set of assumptions.


Theorem 2.1. Suppose that Assumptions 1–4 are fulfilled. Then, under H0,

    T_n − h^{−d/2} ∫ K²(u) du →d N(0, σ²),

where

    σ² = 2 ∫ f²(x) dx × ∫ [ ∫ K(u)K(u + v) du ]² dv.
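The normalizing constants of Theorem 2.1 can be evaluated numerically. The sketch below does so for the box kernel K(x) = (2√3)^{−1} I(|x| ≤ √3) used later in Section 3 and, purely for illustration, a standard normal stationary density f; the grids are our own choices.

```python
import numpy as np

s = np.sqrt(3.0)
K = lambda u: np.where(np.abs(u) <= s, 1.0 / (2.0 * s), 0.0)

# \int (\int K(u) K(u+v) du)^2 dv  (equals sqrt(3)/9 for this kernel)
u = np.linspace(-2 * s, 2 * s, 4001)
du = u[1] - u[0]
v = np.linspace(-2 * s, 2 * s, 801)
inner = np.array([np.sum(K(u) * K(u + vi)) * du for vi in v])
int_KK2 = np.sum(inner ** 2) * (v[1] - v[0])

# \int f^2(x) dx for a standard normal f (equals 1/(2*sqrt(pi)))
x = np.linspace(-8.0, 8.0, 4001)
f = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
int_f2 = np.sum(f ** 2) * (x[1] - x[0])

sigma2 = 2.0 * int_f2 * int_KK2
print(sigma2)   # approx 0.109
```

Such a value of σ² could be used to calibrate the test from the limiting normal distribution, although Section 3 argues that the bootstrap approximates the finite-sample distribution better.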

In order to provide a self-contained version of this paper, and since our technical assumptions are different from those in Takahata and Yoshihara (1987), we sketch the proof in Section 4.

2.2. Testing of composite hypotheses

In this subsection we consider the perhaps more important case of testing composite hypotheses. Instead of a single null hypothesis, f = f_0, we have now

    H0: f ∈ F,

where F is some parametric or even semiparametric class of density functions. It will turn out that, under suitable regularity conditions on the class F, the problem can be reduced to the case of a single hypothesis investigated in the previous subsection. Practitioners are probably most interested in testing (finite-dimensional) parametric hypotheses, that is, F = F_Θ = {f_θ : θ ∈ Θ}, where Θ ⊆ R^p. We will study this case in some detail, and will discuss the semiparametric problem of testing independence of certain components of Y_i briefly at the end of this section.

In the case of f ∈ F_Θ, let θ_0 ∈ Θ be such that f_{θ_0} = f. Our test will be based on the L2-distance between our non-parametric estimate f̂_n and a smoothed version of a parametric fit, f_{θ̂}, namely

    T_{n,θ̂} = n h^{d/2} ∫ [f̂_n(x) − (K_h ∗ f_{θ̂})(x)]² dx,   (2.4)

where K_h is the smoothing operator defined by (2.3). By looking at

    T_{n,θ̂} − T_n = 2 n h^{d/2} ∫ [f̂_n(x) − (K_h ∗ f_{θ_0})(x)][(K_h ∗ f_{θ_0})(x) − (K_h ∗ f_{θ̂})(x)] dx
                     + n h^{d/2} ∫ [(K_h ∗ (f_{θ̂} − f_{θ_0}))(x)]² dx,

it is easy to find sufficient conditions for the asymptotic equivalence of T_{n,θ̂} and T_n. To formulate such a set of conditions, we write f_θ in the form

    f_θ(x) = f_{θ_0}(x) + (θ − θ_0)′ f′_{θ_0}(x) + R(θ, θ_0, x).

In the following, we will assume:

Assumption 5. (i) ∫ [(K_h ∗ f_{θ̂})(x) − (K_h ∗ f_{θ_0})(x)]² dx = o_P(n^{−1} h^{−d/2}),
(ii) θ̂ − θ_0 = o_P(n^{−1/2} h^{−d/2}),
(iii) sup_x {|f′_{θ_0}(x)|} < ∞,
(iv) ∫ R²(θ̂, θ_0, x) dx = o_P(n^{−1}).


It is easy to see that Assumptions 1 and 3 and (ii)–(iv) of Assumption 5 imply that

    | ∫ [f̂_n(x) − (K_h ∗ f_{θ_0})(x)][(K_h ∗ f_{θ_0})(x) − (K_h ∗ f_{θ̂})(x)] dx |

    = O_P( (1/(n h^d)) ‖θ̂ − θ_0‖ √( var( Σ_{i=1}^n ∫ [K((x − Y_i)/h) − EK((x − Y_1)/h)] (K_h ∗ f′_{θ_0})(x) dx ) ) )

      + O( √( ∫ [f̂_n(x) − (K_h ∗ f_{θ_0})(x)]² dx ) √( ∫ R²(θ̂, θ_0, x) dx ) )

    = o_P(n^{−1} h^{−d/2}).   (2.5)

This leads immediately to the following theorem.

Theorem 2.2. Suppose that Assumptions 1–5 are fulfilled. Then, under H0,

    T_{n,θ̂} − h^{−d/2} ∫ K²(u) du →d N(0, σ²).

3. Bootstrapping the test statistic

The theoretical results of the previous section motivate the use of bootstrap methods similar to those designed for the i.i.d. case in order to approximate the distribution of both test statistics considered. In fact, Theorems 2.1 and 2.2 suggest that, in order to get an asymptotically correct estimator of the distributions of these statistics, it is not necessary to reproduce the whole (and probably very complicated) dependence structure of the stochastic process generating the observations. We stress here the fact that the theorems obtained are based on asymptotic considerations, i.e., we expect that such a simple bootstrap procedure which neglects the dependence in the data will lead to valuable approximations only if the smoothing bandwidth h is small enough and the dependence of the data weak enough. Since we focus our considerations primarily on the type I error probability, it suffices to provide a consistent estimator of the distribution of the test statistics under the null hypothesis. On the other hand, since one is of course interested in a good power performance, we should also approximate (one of the) distributions corresponding to the null if the true distribution does not correspond to the hypothesis. Hence, we should not use resampling with replacement from the observations Y_1, Y_2, …, Y_n. Rather, we generate independent bootstrap resamples Y*_1, Y*_2, …, Y*_n according to the density f_{θ̂}.
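As a sketch of this procedure for a Gaussian null with unknown mean and variance (the model used in Section 3): fit θ̂ from the data, draw i.i.d. resamples from f_{θ̂}, recompute the statistic on each resample and take an upper quantile as critical value. The function name, grid sizes, bandwidth and number of resamples below are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
s = np.sqrt(3.0)
K = lambda u: np.where(np.abs(u) <= s, 1.0 / (2.0 * s), 0.0)

def bootstrap_critical_value(y, h, n_boot=200, alpha=0.05):
    """i.i.d.-type bootstrap: resample from the FITTED density f_thetahat
    (here a fitted normal), not with replacement from the data themselves."""
    n = len(y)
    mu, sd = y.mean(), y.std(ddof=1)               # thetahat for the Gaussian model
    grid = np.linspace(mu - 5 * sd, mu + 5 * sd, 400)
    dx = grid[1] - grid[0]
    f_fit = np.exp(-(grid - mu) ** 2 / (2 * sd ** 2)) / (sd * np.sqrt(2 * np.pi))
    Kf = (K((grid[:, None] - grid[None, :]) / h) / h) @ f_fit * dx   # K_h * f_thetahat
    stats = np.empty(n_boot)
    for b in range(n_boot):
        ystar = rng.normal(mu, sd, size=n)         # Y*_1, ..., Y*_n i.i.d. from f_thetahat
        fstar = K((grid[:, None] - ystar[None, :]) / h).sum(axis=1) / (n * h)
        stats[b] = n * np.sqrt(h) * np.sum((fstar - Kf) ** 2) * dx
    return np.quantile(stats, 1.0 - alpha)

y = rng.normal(size=200)       # placeholder data; in practice the observed series
crit = bootstrap_critical_value(y, h=0.3)
print(crit)
```

The test rejects at level α when the statistic computed from the data exceeds this bootstrap critical value.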

3.1. Bootstrap approximations

Consider first the case of testing a composite hypothesis, i.e., the case where f = f_{θ_0}. In order to ensure that certain random integrals converge in probability to the correct limits as θ̂ → θ_0, the following additional assumptions are imposed on the parametric density estimate f_{θ̂}.

Assumption 6. (i) sup_x {f_{θ̂}(x)} = O_P(1),
(ii) ∫ [f_{θ̂}(x) − f_{θ_0}(x)]² dx = o_P(1).


The bootstrap procedure proposed in this case can then be described as follows. Let Y*_i, i = 1, 2, …, n, be a random sample from f_{θ̂} and let f̂*_n(x) be a kernel estimator of f_{θ̂} defined by

    f̂*_n(x) = (1/(n h^d)) Σ_{i=1}^n K((x − Y*_i)/h).   (3.1)

In view of the equivalence of T_{n,θ̂} and T_n, it suffices to imitate the statistic T_n, i.e., we consider the bootstrap statistic

    T*_{n,θ̂} = n h^{d/2} ∫ [f̂*_n(x) − (K_h ∗ f_{θ̂})(x)]² dx.   (3.2)

The following theorem justifies theoretically the use of the statistic T*_{n,θ̂} in order to approximate the distribution of T_n and, therefore, also of T_{n,θ̂}. It enables us to use the quantiles of this distribution in order to carry out the test procedure.

Theorem 3.1. Suppose that Assumptions 1, 3, 4 and 6 are fulfilled. Then we have, under H0 and conditionally on Y_1, Y_2, …, Y_n, that

    T*_{n,θ̂} − h^{−d/2} ∫ K²(u) du →d N(0, σ²) in probability.

One could of course also directly approximate the distribution of T_{n,θ̂} by the distribution of the bootstrap statistic T**_{n,θ̂}, where the latter is defined by

    T**_{n,θ̂} = n h^{d/2} ∫ [f̂*_n(x) − (K_h ∗ f_{θ̂*})(x)]² dx.   (3.3)

In the above expression, f_{θ̂*} denotes the estimated parametric fit obtained using the bootstrap sample Y*_1, Y*_2, …, Y*_n. In contrast to T*_{n,θ̂}, the quantity T**_{n,θ̂} also imitates the variability due to estimating the unknown parameter θ. Therefore, we expect to obtain a better approximation to the distribution of the test statistic T_{n,θ̂}. The validity of this method follows from Theorems 2.2 and 3.1 if the difference between T*_{n,θ̂} and T**_{n,θ̂} is asymptotically negligible. To be more specific, we need the fact that with increasing probability the bootstrap distributions of T*_{n,θ̂} and T**_{n,θ̂} are close to each other, i.e., for arbitrary ε > 0, we would like to have

    E[P*(|T*_{n,θ̂} − T**_{n,θ̂}| > ε | Y_1, …, Y_n)] = o(1).

This is conveniently expressed by the following assumption on the unconditional probability:

    P(|T*_{n,θ̂} − T**_{n,θ̂}| > ε) = o(1).   (3.4)

In analogy to Assumption 5, this is ensured by the following assumption:

Assumption 7. (i) ∫ [(K_h ∗ f_{θ̂*})(x) − (K_h ∗ f_{θ̂})(x)]² dx = o_P(n^{−1} h^{−d/2}),
(ii) θ̂* − θ̂ = o_P(n^{−1/2} h^{−d/2}),
(iii) sup_x {|f′_{θ̂}(x)|} = O_P(1),
(iv) ∫ R²(θ̂*, θ̂, x) dx = o_P(n^{−1}),

where o_P and O_P refer here to the joint distribution of (Y_1, …, Y_n) and (Y*_1, …, Y*_n).
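The practical difference between T*_{n,θ̂} and T**_{n,θ̂} is that the latter re-estimates the parameter on each bootstrap sample. A minimal sketch of one draw of T**_{n,θ̂} for a Gaussian model, with illustrative names and grids of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
s = np.sqrt(3.0)
K = lambda u: np.where(np.abs(u) <= s, 1.0 / (2.0 * s), 0.0)

def T_star_star(ystar, h, grid):
    """One draw of T**_{n,thetahat}: the parametric fit is RE-estimated from
    the bootstrap sample (here: normal mean/sd, an illustrative model)."""
    n = len(ystar)
    dx = grid[1] - grid[0]
    mu_s, sd_s = ystar.mean(), ystar.std(ddof=1)   # thetahat* from Y*_1, ..., Y*_n
    f_fit = np.exp(-(grid - mu_s) ** 2 / (2 * sd_s ** 2)) / (sd_s * np.sqrt(2 * np.pi))
    Kf = (K((grid[:, None] - grid[None, :]) / h) / h) @ f_fit * dx   # K_h * f_{thetahat*}
    fstar = K((grid[:, None] - ystar[None, :]) / h).sum(axis=1) / (n * h)
    return n * np.sqrt(h) * np.sum((fstar - Kf) ** 2) * dx

grid = np.linspace(-5.0, 5.0, 400)
ystar = rng.normal(size=200)     # resample drawn from f_thetahat (here standard normal)
print(T_star_star(ystar, h=0.3, grid=grid))
```

Repeating this over many resamples yields the double-bootstrap quantiles that, under Assumption 7, are asymptotically equivalent to those of T*_{n,θ̂}.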


As an immediate corollary to Theorem 3.1 we get:

Corollary 3.1. Suppose that Assumptions 1, 3, 4, 6 and 7 are fulfilled. Then we have, under H0 and conditionally on Y_1, Y_2, …, Y_n, that

    T**_{n,θ̂} − h^{−d/2} ∫ K²(u) du →d N(0, σ²) in probability.

Consider next the case of testing a single hypothesis, i.e., the case f = f_0. Since in this case the distribution of the Y_i is completely known, the appropriate bootstrap statistic is given by

    T*_n = n h^{d/2} ∫ [f̂*_n(x) − (K_h ∗ f_0)(x)]² dx,   (3.5)

where f̂*_n(x) is defined as in (3.1) and the Y*_i's are now i.i.d. samples from the known density f_0. The following theorem can then be established. Its proof follows exactly the same lines as the proof of Theorem 2.1.

Theorem 3.2. Suppose that Assumptions 2(i), 3 and 4 are fulfilled. Then we have, under H0 and conditionally on Y_1, Y_2, …, Y_n, that

    T*_n − h^{−d/2} ∫ K²(u) du →d N(0, σ²).

3.2. Some simulations

The theory of the previous section justifies asymptotically the use of the proposed bootstrap procedure in order to approximate the distribution of the test statistic considered. In this section we study the finite-sample performance of the bootstrap by means of a small simulation experiment. For this, realizations of length n = 200 have been generated from the first-order autoregressive process X_t = ρX_{t−1} + ε_t, where ε_t is an i.i.d. sequence with ε_t ∼ N(0, 1 − ρ²) and the autoregressive parameter ρ takes its values in the set {0, ±0.4, 0.8}. Note that for ρ = 0 we are in the i.i.d. setting, i.e., our test is identical to the test proposed by Bickel and Rosenblatt (1973). The case ρ = ±0.4 corresponds to a 'rather moderate' dependence while ρ = 0.8 to a 'rather strong' dependence in the data. The null hypothesis is that of a Gaussian distribution with unknown mean and variance. The test statistic T_{n,θ̂} has been calculated using the kernel K(x) = (2√3)^{−1} I(−√3 ≤ x ≤ √3), for which some optimality properties have been derived in the testing context considered here; cf. Ghosh and Huang (1991). The smoothing bandwidth has been set equal to h = 0.03. To estimate the exact distribution of T_{n,θ̂}, 1000 replications of the model considered have been used, while the bootstrap approximations are based on 1000 resamples.

Figs. 1 and 2 show the simulated exact densities and three bootstrap estimates of these densities based on different original time series. In each case, the estimated exact density of T_{n,θ̂} as well as the densities of the corresponding bootstrap approximations shown in these exhibits have been obtained using the Gaussian smoothing kernel and a bandwidth selected according to the standard normal reference rule, Silverman (1986, p. 45). Finally, to make some comparisons with the asymptotic Gaussian approximation, we have also plotted in these figures the corresponding Gaussian densities with known variance. As these exhibits show, the asymptotic Gaussian distribution is a poor approximation to the (estimated) exact one. Furthermore, for small and moderate dependence, the bootstrap approximations are quite satisfactory, improving upon the Gaussian approximation and reproducing more closely the overall behavior and the skewness of the (estimated) exact density. Only in the case ρ = 0.8, with a rather strong positive dependence in the data, do the bootstrap approximations become worse. Clearly, we expect that in this case other bootstrap approaches, like the block bootstrap, which explicitly take into account the dependence structure of the data, will lead to better results.
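The data-generating process of the experiment can be reproduced as follows; the burn-in length, the seed and the function name are our own choices, while the innovation variance 1 − ρ² makes the stationary marginal exactly N(0, 1), i.e., the null being tested.

```python
import numpy as np

def ar1_sample(n, rho, rng, burn=200):
    """X_t = rho * X_{t-1} + eps_t with eps_t ~ N(0, 1 - rho^2),
    so that the stationary marginal distribution is N(0, 1)."""
    e = rng.normal(0.0, np.sqrt(1.0 - rho ** 2), size=n + burn)
    x = np.empty(n + burn)
    x[0] = rng.normal()              # start in the stationary distribution
    for t in range(1, n + burn):
        x[t] = rho * x[t - 1] + e[t]
    return x[burn:]

rng = np.random.default_rng(3)
for rho in (0.0, -0.4, 0.4, 0.8):
    x = ar1_sample(200, rho, rng)
    print(rho, round(x.var(), 2))    # marginal variance should be near 1 for every rho
```

The marginal distribution is the same for all four values of ρ; only the dependence strength changes, which is exactly what the experiment varies.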


Fig. 1. Estimated exact density (solid line), three bootstrap approximations (dashed lines) and asymptotic Gaussian approximation (dashed and dotted line) of the test statistic T_{n,θ̂} for the model considered with ρ = 0 (left plot) and ρ = −0.4 (right plot).

Fig. 2. Estimated exact density (solid line), three bootstrap approximations (dashed lines) and asymptotic Gaussian approximation (dashed and dotted line) of the test statistic T_{n,θ̂} for the model considered with ρ = 0.4 (left plot) and ρ = 0.8 (right plot).

4. Proofs

We give only a brief outline of the proofs. Details can be found in an earlier version of this paper, Neumann

and Paparoditis (1998).

Proof of Theorem 2.1. According to a well-known theorem of Brown (1971), one can derive a central limit theorem for statistics that can be written as a sum of an increasing number of martingale differences which satisfy an asymptotic negligibility condition. Dvoretzky (1972) extended this result to statistics that form such a scheme only approximately, which is of particular importance in the context of weakly dependent random variables. We decompose T_n in such a way that the leading term satisfies Dvoretzky's conditions while the remaining terms are of negligible order. The proof follows essentially the same pattern as the proof of a similar assertion in Takahata and Yoshihara (1987).

First, we write T_n in the form

    T_n = Σ_{1≤i<j≤n} H_n(Y_i, Y_j) + (1/2) Σ_{i=1}^n H_n(Y_i, Y_i),   (4.1)


where

    H_n(x, y) = (2/(n h^{3d/2})) ∫ [K((u − x)/h) − EK((u − Y_1)/h)] [K((u − y)/h) − EK((u − Y_1)/h)] du.   (4.2)
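Decomposition (4.1) with kernel (4.2) can be checked numerically for d = 1 under H0: computing T_n directly from (2.2) and via the sums of H_n(Y_i, Y_j) gives the same number up to floating-point error, since both sides use the same discretised integrals. The kernel, grid and names below are our own illustrative choices; EK is approximated on a grid from the null density f_0.

```python
import numpy as np

rng = np.random.default_rng(4)
s = np.sqrt(3.0)
K = lambda u: np.where(np.abs(u) <= s, 1.0 / (2.0 * s), 0.0)

n, h = 50, 0.4
y = rng.normal(size=n)                       # i.i.d. draws from f_0 = N(0, 1), i.e., under H0
grid = np.linspace(-5.0, 5.0, 2001)
du = grid[1] - grid[0]
f0 = np.exp(-grid ** 2 / 2) / np.sqrt(2 * np.pi)
# EK((u - Y_1)/h) = \int K((u - z)/h) f0(z) dz, approximated on the grid
EK = (K((grid[:, None] - grid[None, :]) / h) @ f0) * du

# centred kernel columns: C[:, i] = K((u - Y_i)/h) - EK(u)
C = K((grid[:, None] - y[None, :]) / h) - EK[:, None]
# H[i, j] = H_n(Y_i, Y_j) = (2/(n h^{3/2})) \int C[:, i] C[:, j] du, eq. (4.2) with d = 1
H = (2.0 / (n * h ** 1.5)) * (C.T @ C) * du

# left-hand side: T_n computed directly from (2.2), with (K_h * f0)(u) = EK(u)/h for d = 1
fhat = K((grid[:, None] - y[None, :]) / h).sum(axis=1) / (n * h)
Tn = n * np.sqrt(h) * np.sum((fhat - EK / h) ** 2) * du

# right-hand side of (4.1): sum over i < j plus half the diagonal
rhs = np.triu(H, k=1).sum() + 0.5 * np.trace(H)
print(abs(Tn - rhs))   # ~0: both sides use the same discretisation
```

This identity is exact, not asymptotic; it is the off-diagonal part Σ_{i<j} H_n(Y_i, Y_j) that carries the Gaussian limit, while the diagonal part produces the centring term h^{−d/2} ∫ K²(u) du.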

The proof of the desired central limit theorem for T_n − h^{−d/2} ∫ K²(u) du is facilitated by using a decomposition of Y_1, …, Y_n into an alternating sequence of large and small blocks. The gaps between the large blocks are of length λ_n = [C_1 log(n)], where an appropriate choice of C_1 becomes clear from the calculations below. The length of the large blocks is denoted by l_n, where the only requirement is that λ_n ≪ l_n ≪ n. In accordance with this, the kth large block is formed by Y_{a_k}, Y_{a_k+1}, …, Y_{b_k}, where a_k = (k − 1)(l_n + λ_n) + 1 and b_k = [(k − 1)(l_n + λ_n) + l_n] ∧ n, while the kth small block is given by Y_{b_k+1}, …, Y_{a_{k+1}−1}.

Now, we approximate T_n − h^{−d/2} ∫ K²(u) du by

    U_n = Σ_k S_k,   (4.3)

where

    S_k = Σ_{i=1}^{b_{k−1}} Σ_{j=a_k}^{b_k} H_n(Y_i, Y_j).   (4.4)

(i) Central limit theorem for U_n. According to Theorem 2 of Dvoretzky (1972),

    U_n →d N(0, σ²)   (4.5)

follows from

    Σ_k E(S_k | G_{k−1}) →P 0,   (4.6)

    Σ_k E(S_k² | G_{k−1}) →P σ²,   (4.7)

and, for each ε > 0,

    Σ_k E[S_k² I(|S_k| > ε)] →P 0,   (4.8)

where G_k = σ(Y_1, …, Y_{b_k}). (4.6)–(4.8) can actually be shown by straightforward calculations. To save space, we omit their proofs and refer the interested reader to an earlier version of this paper, Neumann and Paparoditis (1998).

(ii) Difference between T_n − h^{−d/2} ∫ K²(u) du and U_n. We have

    T_n − h^{−d/2} ∫ K²(u) du − U_n
    = Σ_{i=1}^n [ (1/2) H_n(Y_i, Y_i) − (1/(n h^{d/2})) ∫ K²(u) du ]
      + Σ_k Σ_{i=1}^{b_{k−1}−λ_n} Σ_{j=b_{k−1}+1}^{a_k−1} H_n(Y_i, Y_j)
      + Σ_k Σ_{j=b_{k−1}+1}^{a_k−1} Σ_{i=b_{k−1}−λ_n+1}^{j−1} H_n(Y_i, Y_j)
      + Σ_k Σ_{j=a_k}^{b_k} Σ_{i=b_{k−1}+1}^{j−1} H_n(Y_i, Y_j)
    = R_1 + ··· + R_4.   (4.9)

It follows now by straightforward calculations that each of the terms R_1, …, R_4 is o_P(1), which completes the proof (for details see again Neumann and Paparoditis, 1998).


Proof of Theorem 3.1. We write T*_{n,θ̂} in the form

    T*_{n,θ̂} = Σ_{1≤i<j≤n} a_{ij,n} W_n(Y*_i, Y*_j) + (1/(n h^{3d/2})) Σ_{i=1}^n H̃_n(Y*_i, Y*_i),   (4.10)

where

    a_{ij,n} = (2/n) { h^{−3d} E*[H̃_n(Y*_i, Y*_j)]² }^{1/2},   (4.11)

    W_n(Y*_i, Y*_j) = H̃_n(Y*_i, Y*_j) / { E*[H̃_n(Y*_i, Y*_j)]² }^{1/2},   (4.12)

    H̃_n(x, y) = ∫ [K((u − x)/h) − E*K((u − Y*_1)/h)] [K((u − y)/h) − E*K((u − Y*_1)/h)] du   (4.13)

and E* denotes expectation with respect to the bootstrap distribution. One can easily show that

    | (1/(n h^{3d/2})) Σ_{i=1}^n H̃_n(Y*_i, Y*_i) − h^{−d/2} ∫ K²(u) du | → 0

in probability. Finally,

    Σ_{1≤i<j≤n} a_{ij,n} W_n(Y*_i, Y*_j) →d N(0, σ²)

follows by Theorem 5.3 of de Jong (1987), which completes the proof of the theorem. (Details are again given in Neumann and Paparoditis (1998).)

Acknowledgements

The work on this project was initiated while the ?rst author was visiting the University of Cyprus. He

gratefully acknowledges ?nancial support from this institution.

References

Bickel, P., Rosenblatt, M., 1973. On some global measures of the deviations of density function estimators. Ann. Statist. 1, 1071–1095.

Brown, B.M., 1971. Martingale central limit theorems. Ann. Math. Statist. 42, 59–66.

Dvoretzky, A., 1972. Asymptotic normality for sums of dependent random variables. In: LeCam, L., et al. (Eds.), Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 2. University of California Press, Berkeley, pp. 513–555.

Efron, B., 1979. Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1–26.

Fan, Y., Ullah, A., 1999. On goodness-of-fit tests for weakly dependent processes using kernel method. J. Nonparametr. Statist. 11, 337–360.

Ghosh, B.K., Huang, W.-M., 1991. The power and optimal kernel of the Bickel–Rosenblatt test for goodness of fit. Ann. Statist. 19, 999–1009.

Härdle, W., Mammen, E., 1993. Comparing nonparametric versus parametric regression fits. Ann. Statist. 21, 1926–1947.

de Jong, P., 1987. A central limit theorem for generalized quadratic forms. Probab. Theory Related Fields 75, 261–277.

Kreiss, J.-P., Neumann, M.H., Yao, Q., 1998. Bootstrap tests for simple structures in nonparametric time series regression. Preprint No. 98/07, TU Braunschweig.

Neumann, M.H., 1997. On robustness of model-based bootstrap schemes in nonparametric time series analysis. Discussion Paper 88/97, SFB 373, Humboldt University, Berlin.

Neumann, M.H., 1998. Strong approximation of density estimators from weakly dependent observations by density estimators from independent observations. Ann. Statist. 26, 2014–2048.

Neumann, M.H., Kreiss, J.-P., 1998. Regression-type inference in nonparametric autoregression. Ann. Statist. 26, 1570–1613.

Neumann, M.H., Paparoditis, E., 1998. A nonparametric test for the stationary density. Discussion Paper 58/98, SFB 373, Humboldt University, Berlin.

Paparoditis, E., 2000. Spectral density based goodness of fit tests for time series models. Scand. J. Statist. 27, 143–176.

Robinson, P.M., 1983. Nonparametric estimators for time series. J. Time Ser. Anal. 4, 185–207.

Rosenblatt, M., 1975. A quadratic measure of deviation of two-dimensional density estimates and a test of independence. Ann. Statist. 3, 1–14.

Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.

Takahata, H., Yoshihara, K., 1987. Central limit theorems for integrated square error of nonparametric density estimators based on absolutely regular random sequences. Yokohama Math. J. 35, 95–111.