Bootstrap specification tests for diffusion processes

32
Journal of Econometrics 124 (2005) 117 – 148 www.elsevier.com/locate/econbase Bootstrap specication tests for diusion processes Valentina Corradi a ; , Norman R. Swanson b a Department of Economics, Queen Mary, University of London, Mile End, London E1 4NS, UK b Department of Economics, Rutgers University, 75 Hamilton Street, New Brunswick, NJ 08901-1248, USA Accepted 9 February 2004 Abstract This paper discusses specication tests for diusion processes. In the one-dimensional case, our proposed test is closest to the nonparametric test of A t-Sahalia (Rev. Financ. Stud. 9 (1996) 385). However, we compare CDFs instead of densities. In the multidimensional and/or multifactor case, our proposed test is based on comparison of the empirical CDF of actual data and the empirical CDF of simulated data. Asymptotically valid critical values are obtained using an empirical process version of the block bootstrap which accounts for parameter estimation error. An example based on a simple version of the Cox et al. (Econometrica 53 (1985) 385) model is outlined and related Monte Carlo experiments are carried out. c 2003 Elsevier B.V. All rights reserved. JEL classication: C12; C22 Keywords: Block bootstrap; Diusion process; Multifactor model; Parameter estimation error; Specication test; Stochastic volatility 1. Introduction This paper introduces two bootstrap specication tests for diusion processes. In the one-dimensional case, note that the invariant density associated with the diusion is implied by the specication of the drift and variance terms. Therefore, we can construct a Kolomogorov type test, based on comparison of the empirical cumulative distribution function and the cumulative distribution function (CDF) implied by the specication of the drift and the variance, under the null model. This test, which is the rst of our Corresponding author. E-mail addresses: [email protected] (V. Corradi), [email protected] (N.R. Swanson). 0304-4076/$ - see front matter c 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2004.02.013

Transcript of Bootstrap specification tests for diffusion processes

Journal of Econometrics 124 (2005) 117–148www.elsevier.com/locate/econbase

Bootstrap speci!cation tests for di#usionprocesses

Valentina Corradia ;∗, Norman R. Swansonb

aDepartment of Economics, Queen Mary, University of London, Mile End, London E1 4NS, UKbDepartment of Economics, Rutgers University, 75 Hamilton Street, New Brunswick,

NJ 08901-1248, USA

Accepted 9 February 2004

Abstract

This paper discusses speci!cation tests for di#usion processes. In the one-dimensional case,our proposed test is closest to the nonparametric test of A23t-Sahalia (Rev. Financ. Stud. 9 (1996)385). However, we compare CDFs instead of densities. In the multidimensional and/or multifactorcase, our proposed test is based on comparison of the empirical CDF of actual data and theempirical CDF of simulated data. Asymptotically valid critical values are obtained using anempirical process version of the block bootstrap which accounts for parameter estimation error.An example based on a simple version of the Cox et al. (Econometrica 53 (1985) 385) modelis outlined and related Monte Carlo experiments are carried out.c© 2003 Elsevier B.V. All rights reserved.

JEL classi-cation: C12; C22

Keywords: Block bootstrap; Di#usion process; Multifactor model; Parameter estimation error; Speci!cationtest; Stochastic volatility

1. Introduction

This paper introduces two bootstrap speci!cation tests for di#usion processes. In theone-dimensional case, note that the invariant density associated with the di#usion isimplied by the speci!cation of the drift and variance terms. Therefore, we can constructa Kolomogorov type test, based on comparison of the empirical cumulative distributionfunction and the cumulative distribution function (CDF) implied by the speci!cationof the drift and the variance, under the null model. This test, which is the !rst of our

∗ Corresponding author.E-mail addresses: [email protected] (V. Corradi), [email protected] (N.R. Swanson).

0304-4076/$ - see front matter c© 2003 Elsevier B.V. All rights reserved.doi:10.1016/j.jeconom.2004.02.013

118 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

two tests, is closest to the nonparametric test introduced by A23t-Sahalia (1996), in thesense that both procedures determine whether the drift and variance components ofa particular continuous time model are correctly speci!ed, although our test is basedon a comparison of CDFs, while A23t-Sahalia’s is based on a comparison of densities.Thus, his approach requires the use of a nonparametric density estimator (and hencethe choice of the bandwidth parameter) and is characterized by a nonparametric rate,while our test has a parametric rate. In the case of either multidimensional di#usionsor multifactor models characterized by stochastic volatility, the functional form of theinvariant density of the return(s) is no longer guaranteed to be given in closed form,upon joint speci!cation of the drift and variance terms. However, in this case wecan still compare empirical distributions, and our second test is thus based on thecomparison of the empirical distribution of the actual data and the empirical distributionof the (model) simulated data.It should be noted that tests based on the comparison of CDFs have no power

against i.i.d. alternatives that are generated by the same marginal density as thatimplied under H0. However, this feature of our tests is not particularly relevant inthe context of highly dependent !nancial data, for example. Nevertheless, it would inprinciple be interesting to construct speci!cation tests for di#usions based on the spec-i!cation of the transition density. The main diFculty with this is that knowledge of thedrift and variance terms of a di#usion does not in turn generally imply knowledge of thetransition density. Indeed, if the functional form for the transition density were known,we could test the hypothesis of correct speci!cation of a di#usion via the generalizedcross-spectrum approach of Hong (2001), (see also Hong and Li, 2003, and Honget al. (2002) for application to testing !nancial models), the robust periodogram ap-proach of Thompson (2002), the test of Bai (2003) based on the joint use of aKolmogorov test and a martingalization method, or via the normality transformation ap-proach of Duan (2003). For the case in which the transition density is unknown, a testcan be constructed by comparing the kernel (conditional) density estimator of the actualand simulated data, as in Altissimo and Mele (2002). 1 Alternatively, a closed formapproximation of the transition density (and hence the likelihood function) is proposedby A23t-Sahalia (1999, 2002), who also provides conditions under which the argmaxof the approximated likelihood is asymptotically equivalent to the “true” maximumlikelihood estimator. Whether the same approach can be used to obtain an approxima-tion of the conditional distribution to be used in an implementation of a conditionalKolmogorov test, along the lines of Andrews (1997), is left to future research. Apossible justi!cation for comparing marginal distributions instead of conditional distri-butions is the case of subordinated di#usions (see e.g. Conley et al., 1997), where thesubordinated process may have the same stationary density as the underlying di#usion,but not the same transition density.

1 Unfortunately, in this context there is not a well de!ned conditional empirical distribution. Interestingly,Thompson (2002) suggests an ingenious device (based on the use of an Euler scheme) for approximating thetransition function, when the latter is unknown. However, the approximated conditional distribution functionis not di#erentiable over the parameter space, so that contribution of parameter estimation error to the limitingdistribution cannot be accounted for using his approach in conjunction with standard techniques.

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 119

As mentioned above, the test that we propose for the one-dimensional case is basedon comparison of the empirical CDF of the data and the distribution implied by thedrift and the variance of the di#usion, evaluated at the estimated parameters. 2 In themultidimensional and/or multifactor case, our test is based on comparison of the em-pirical distribution of the data and the empirical distribution of data simulated usingestimated parameters. In both cases, parameters are estimated via the simulated gener-alized method of moments (SGMM) approach of DuFe and Singleton (1993), usingas many moment conditions as parameters (exact identi!cation). As is common withthese types of tests, the limiting distributions are functionals of zero mean Gaussianprocesses with covariance kernels that reMect both the contribution of parameter esti-mation error (PEE) as well as the time series nature of the data. Thus, the limitingdistributions are not nuisance parameters free and critical values cannot be tabulated.Note that, in the special case of testing for normality, Bontemps and Meddahi (2003)provide a GMM type test based on moment conditions that is robust to parameterestimation error.Our approach is to provide valid asymptotic critical values via an extension of the

empirical process version of the block bootstrap which properly captures the contribu-tion of PEE, for the case where parameters are estimated via SGMM. Of note in thiscontext is that when the simulation error is negligible (i.e. the simulated sample growsfaster than the historical sample), and given exact identi!cation, the results developedby GonOcalves and White (2004) for QMLE estimators extend to SGMM estimators. 3

The potential usefulness of our proposed bootstrap based tests is examined via aseries of Monte Carlo experiments in the context of testing the goodness of !t of asquare root di#usion process, which is speci!ed using a simpli!ed version of the Coxet al. (1985) model, under the null hypothesis. Under the alternative, logged data aregenerated according to an Ornstein-Uhlenbeck process, so that the data are lognormal.For samples of 400, 800, and 1200 observations, and based on the use of bootstrapcritical values constructed using as few as 100 replications, rejection rates under thenull are quite close to nominal values, and rejection rates under the alternative aregenerally high. 4

The rest of the paper is organized as follows. In Section 2, we outline the spec-i!cation test and analyze its asymptotic behavior, for the case of one-dimensionaldi#usions. Section 3 outlines the bootstrap procedure which we propose, and estab-lishes its asymptotic validity. Section 4 discusses extention to multifactor models, andSection 5 contains the results from our Monte Carlo experiments. Concluding remarksare contained in Section 6. All proofs are collected in Appendix A.

2 A Kolmogorov-Smirnov test for di#usion processes has previously been suggested by Fournie (1993),for the case in which the continuous trajectories are observed.

3 The issue of PEE is addressed by Thompson (2002) by providing upper bounds, which are valid for thecase of eFciently estimated parameters. Another approach is that of Hong and Li (2003), who perform outof sample tests, with the estimation period growing faster than the prediction period; so that the contributionof PEE vanishes asymptotically.

4 It is worth noting that the joint problem of simulating paths and simulating bootstrap replications makethis Monte Carlo study rather computationally intensive, and we are not aware of other simulation studieswhich analyze the performance of bootstrap tests for di#usion processes.

120 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

2. One-dimensional di�usion speci�cation test

In this section we outline our test for the joint correct speci!cation of the driftand variance terms in one-dimensional stationary-ergodic di#usion processes. Let X (t),t¿ 0, be a di#usion process solution to the following stochastic di#erential equation:

dX (t) = b(X (t); �0) dt + �(X (t); �0) dW (t); (1)

where �0 ∈, ⊂ Rk , and is a compact set.It is known that the drift and variance terms (b(·) and �2(·), respectively) uniquely

determine the stationary density, say f(x; �0), associated with the invariant probabilitymeasure of the above di#usion process (see e.g. Karlin and Taylor, 1981, p. 241). Inparticular,

f(x; �0) =c(�0)

�2(x; �0)exp

x∫

2b(v; �0)�2(v; �0)

dv

; (2)

where c(�) is a constant ensuring that the density integrates to one. Now, supposethat we observe a discrete sample (skeleton) of size T , say (X1; X2; : : : ; XT )′, of theunderlying di#usion X (t), and construct an estimator of �0, say �T;S;h, which is based onthe skeleton of the observed data as well as on a (model) simulated path. 5 Hereafter,we use the notation X (t) for the continuous time process and the notation Xt for theskeleton. In addition, let F0(u) be the cumulative distribution function associated withthe underlying di#usion, and let F(u; �0) be the CDF associated with the density in(2). We consider the following hypotheses:

H0 : F0(u) = F(u; �0); for all u∈U (3)

versus

HA : F0(u) �= F(u; �0); for some u∈U; with nonzero Lebesgue measure: (4)

Now, note the a null hypothesis of joint correct speci!cation of the drift and thevariance terms implies H0 in (3). However, the reverse does not hold, as H0 does notnecessarily imply the correct speci!cation of the di#usion process (see Eq. (2)). In fact,we cannot rule out the possibility that even though Xt is not a skeleton of the solutionto the stochastic di#erential equation in (1), we still have that Pr(Xt6 u) = F(u; �0).In order to test H0 versus HA, consider the following test statistic: 6

V 2T;S;h =

∫UV 2T;S;h(u)�(u) du; (5)

5 In the case in which the moment conditions can be written in closed form, we have that �T;S;h = �T , asS is the sample length of the simulated path used in estimation of �0, and h is the discretization parameterused in the application of Euler and/or Milstein approximation schemes, for example (see below for furtherdetails).

6 In Monte Carlo experiments that are reported on below, we also examine the !nite sample proper-ties of two di#erent version of the test statistic, namely |VT;S;h| =

∫U |VT;S;h(u)|�(u) du, and |VT;S;h|sup =

supu∈U |VT;S;h(u)|.

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 121

where

VT;S;h(u) =1√T

T∑t=1

(1{Xt6 u} − F(u; �T;S;h));

with U the compact interval de!ned below (see Assumption 2) and∫U �(u) du = 1.

Further, �T;S;h is the simulated GMM (SGMM) estimator, de!ned to be

�T;S;h = argmin�∈

(1T

T∑t=1

g(Xt) − 1S

S∑t=1

g(X �t;h)

)′

×WT

(1T

T∑t=1

g(Xt) − 1S

S∑t=1

g(X �t;h)

)

= argmin�∈

GT;S;h(�)′WTGT;S;h(�); (6)

where g denotes a vector of p moment conditions, ⊂ Rp (so that we have asmany moment conditions as parameters), and X �

t;h = X �[Nth=S], with S = Nh (S denotes

simulation path length and h is the discretization interval). Finally, WT is the inverseof a heteroskedasticity and autocorrelation (HAC) robust covariance matrix estimator.That is:

W−1T =

1T

lT∑ =−lT

w

T−lT∑t= +1+lT

(g(Xt) − 1

T

T∑t=1

g(Xt)

)

×(g(Xt− ) − 1

T

T∑t=1

g(Xt)

)′

; (7)

where w =1− =(lT+1). In order to construct simulated estimators, we require simulatedpaths, under the null di#usion. If we use a Milstein scheme (see e.g. Pardoux and Talay,1985), then

X �kh − X �

(k−1)h = b(X �(k−1)h; �)h+ �(X �

(k−1)h; �)"kh − 12�(X �

(k−1)h; �)′�(X �

(k−1)h; �)h

+12�(X �

(k−1)h; �)′�(X �

(k−1)h; �)"2kh (8)

where "khiid∼N(0; h), k = 1; : : : ; N , Nh = S, and �′ is the derivative with respect to the

!rst argument. Also de!ne, the pseudo true value:

� † = argmin�∈

G∞(�)′W0G∞(�);

where G∞(�)′W0G∞(�) = p limT;S→∞; h→0 GT;S;h(�)′WTGT;S;h(�), and � † = �0 underthe null. The reason why we limit our attention to the exactly identi!ed case is that

122 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

this ensures that G∞(� †) = 0, even when the model used to simulate the di#usion ismisspeci!ed, in the sense of di#ering from the underlying DGP. 7

A complete treatment of the asymptotic behavior of (nonsimulated) GMM in thejoint presence of overidenti!cation and misspeci!cation is provided by Hall and Inoue(2003), who show that in the case of a HAC-type weighting matrix, the rate of con-vergence depends on the lag truncation parameter. For the sake of simplicity, wefocus on SGMM estimators in this paper. However, in the case of correct speci-!cation, we could equally rely on Indirect Inference (II: Gourieroux et al., 1993)and EFcient Method of Moments (EMM: Gallant and Tauchen, 1996). 8 In addi-tion to simulation based methods, approximate maximum likelihood estimators haverecently received considerable attention. For example, A23t-Sahalia (1999, 2002) sug-gests a closed form approximation of the likelihood function, and Altissimo andMele (2003) propose an estimator based on the minimization of the distance betweena kernel density estimator constructed using the actual data and one constructed usingsimulated data, in which case they provide conditions under which such estimatorsare asymptotically equivalent to maximum likelihood estimators. Nevertheless, it isnot immediate to see what the properties of the Ait-Sahalia and Altissimo and meleestimators are in the misspeci!ed case. Finally, nonparametric estimation of multidi-mensional di#usions via spectral decomposition of the generator function is studied byChen et al. (2000).The following assumptions are used in the sequel.

Assumption A1. (A1). X (t); t ∈R+, is a strictly stationary, geometric ergodic di#usion,under both the null and the alternative hypotheses. Under the null, the invariant densityis f(·; �0), with cumulative distribution function F(·; �0).

Assumption A2. (A2). b(·) and �(·), as de!ned in (1), are twice continuously dif-ferentiable. Also, b; b′; �, and �′ are Lipschitz, with Lipschitz constant independentof �.

Assumption A3. (A3). F(u; �) is twice continuously di#erentiable in the interiorof × U , where and U are compact subsets of Rp and of R, respectively.Also, ∇�F(u; �), ∇2

�F(u; �) and ∇�;uF(u; �) are jointly continuous on the interior of × U .

7 First order conditions imply that

∇�G∞(� †)′W†G∞(� †) = 0:

However, in the case for which the number of parameters and the number of moment conditions is the same,∇�G∞(� †)′W

†is invertible, and so the !rst order conditions also imply that G∞(� †) = 0.

8 A uni!ed framework for simulation based estimators, which nests SGMM, II and EMM, is provided inDridi (1999).

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 123

Assumption A4. (A4). For any !xed h and ∀�∈, X �kh is geometrically ergodic and

strictly stationary. 9

Assumption A5. (A5). WTa:s:→W0 =

∑−10 , where,

∑0 =∑∞

j=−∞ E((g(X1) −E(g(X1)))(g(X1+j) − E(g(X1+j)))′).

Assumption A6. (A6). ∀�∈, and for all h,(E(g(X �

t;h)′g(X �

t;h))2+%) 1

2+% ¡C ¡∞,g(X �

t;h) is Lipschitz, uniformly on , � → E(g(X �t;h)) is continuous, and g(Xt); g(X �

t;h);∇�X �

t;h are 2r-dominated (the last two also on ) for r ¿ 32 .

10

Assumption A7. (A7). Unique identi!ability: G∞(� †)′W0G∞(� †)¡G∞(�)′W0G∞(�),∀� �= � †.

Assumption A8. (A8). (i) �T;S;h and � † are in the interior of ; (ii) g(X �t ) is twice

continuously di#erentiable in the interior of ; and (iii) D† = E(@g�1=@�|�=� †) exists

and is of full rank, p.

Assumption A1 requires the di#usion to be geometric ergodic, under both hypothe-ses. Note also that A1 ensures that the skeleton is strong mixing with mixing coef-!cients decaying at a geometric rate. A3 imposes very mild smoothness requirementson the cumulative distribution function under the null, and is thus easily veri!ed.A4–A7 ensure consistency and asymptotic normality of �T;S;h, under both hypotheses.

Theorem 1. Let A1–A8 hold. As T; S → ∞, h → 0, T=S → 0, and Th2 → 0:(i) Under H0,

V 2T;S;h ⇒

∫UZ2(u)�(u) du;

where Z is a Gaussian process with covariance kernel given by

K(u; u′) = E

( ∞∑s=−∞

(1{X16 u} − F(u; �0))(1{Xs6 u} − F(u; �0))

)

+∇�F(u; �0)′(D0′W0D0)−1∇�F(u; �0)))−2∇�F(u; �0)′(D0′W0D0)−1D0′W 0

×∞∑

s=−∞E((g(Xs) − E(g(X1)))(1{X16 u} − F(u; �0))): (9)

(ii) Under HA, there exists an j¿ 0 such that,

limT→∞

Pr(1T

V 2T;S;h ¿ j

)= 1:

9 Stramer and Tweedie (1997) propose a new algorithm for simulating the path of a di#usion whichensures that the geometric ergodicity of the underlying di#usion is inherited by the simulated paths. This isin general the case for the Euler or the Milstein scheme, whenever the drift grows at most at a linear rateand the drift and variance terms are not “too big”.10 Let g(X �

t;h)i be the ith element of g(X �t;h). We require sup�∈ |g(X �

t;h)i|6Dt , with supt E(Dt)2r ¡∞.

124 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

As the estimated parameters are√T consistent, PEE does not vanish asymptotically,

but instead enters into the asymptotic covariance kernel (the last two lines in (9)summarize the contribution of PEE to the kernel). Note that in the statement of thetheorem above, we require that the simulated sample size grows at a faster rate thanthe historical sample. We can relax this requirement and still get convergence to afunctional of a Gaussian process, although the covariance kernel would be slightlydi#erent. However, in order to establish validity of the block bootstrap under SGMM,we require T=S → 0. Of further note is that by varying the interval of integration,U , over which the speci!cation test is constructed, one can assess the ‘goodness’ ofspeci!cation over various regions of the distribution.

3. Bootstrap critical values

The limiting distribution of V 2T;S;h is a functional of a Gaussian process with a co-

variance kernel that reMects both PEE and the time series nature of the data. Thus,critical values cannot be tabulated. In the present context, valid asymptotic critical val-ues can be obtained in three ways. First, one can use the conditional p-value approachof Corradi and Swanson (2002), which extends Hansen’s (1996) and Inoue’s (2001)results to the case of nonvanishing PEE. Second, one can use the subsampling methodof Politis et al. (1999). Third, one can use an appropriate block bootstrap procedure. Adrawback of the !rst two approaches is that the simulated (or subsample based) criticalvalues diverge at rate l (where l plays the role of the blocksize length or denotes thesubsample size) under the alternative. Thus, we choose to use the third approach. 11 Inorder to show the !rst order validity of the block bootstrap in our context, we derivethe limiting distribution of appropriately formed bootstrap statistics and show that theycoincide with the limiting distribution in Theorem 1. Then, a test with correct asymp-totic size and unit asymptotic power can be obtained by comparing the value of theoriginal statistic with bootstrapped critical values.In the presence of dependent observations, but no PEE, valid bootstrap critical val-

ues are straightforwardly provided by an empirical version of the K2unsch (1989) blockbootstrap (see e.g. B2uhlmann, 1995; Naik-Nimbalkar and Rajarshi, 1994 or Peligrad,1998). 12 However, in the present context we need a bootstrap procedure which prop-erly mimics the contribution of PEE to the covariance kernel. GonOcalves and White

11 As the limiting distributions in Theorem 1 (above) and Theorem 3 (below) are not pivotal, bootstrapcritical values do not provide any re!nement of !rst order asymptotics (see e.g. Hall, 1992, ch. 3).12 Equally, one could use an empirical version the stationary bootstrap of Politis and Romano (1994a,

b). The main di#erence between the block bootstrap and the stationary bootstrap of Politis and Romano(PR: 1994a) is that the former uses a deterministic block length, which may be either overlapping asin K2unsch (1989) or nonoverlapping as in Carlstein (1986), while the latter resamples using blocks ofrandom length. One important feature of the PR bootstrap is that the resampled series, conditional on thesample, is stationary, while a series resampled from the (overlapping or nonoverlapping) block bootstrap isnonstationary, even if the original sample is strictly stationary. However, Lahiri (1999) shows that all blockbootstrap methods, regardless of whether the block length is deterministic or random, have a !rst order biasof the same magnitude, but the bootstrap with deterministic block length has a smaller !rst order variance.In addition, the overlapping block bootstrap is more eFcient than the non overlapping block bootstrap.

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 125

(2004) show the !rst order validity of the block bootstrap for QMLE (or m-estimators),for the case of dependent and heterogeneous observations. In the sequel we show that,in the stationary case, their results are valid also for SGMM estimators in the exactidenti!cation case and when the simulated series sample size grows faster than thehistorical sample size. While a formal proof is provided in the appendix, it is worth-while to also give an intuitive explanation of our result. First, if T=S → 0, simulationerror is negligible, SGMM is asymptotically equivalent to GMM, and consequently wedo not need to bootstrap the simulated series. Second, in the exactly identi!ed case,the bootstrap sample always satis!es the moment conditions, upto a negligible term,therefore GMM estimators can be treated the same way that QMLE estimators aretreated.In order to implement the appropriate bootstrap statistic, we proceed as follows. At

each replication, draw b blocks (with replacement) of length l from the sample Xt

where T = lb. Thus, the !rst block is equal to Xi+1; : : : ; Xi+l, for some i=0; 1; : : : ; T −l, with probability 1=(T − l + 1), the second block is equal to Xi+1; : : : ; Xi+l, forsome i, with probability 1=(T − l + 1), and so on for all blocks. More formally,let Ik , k = 1; : : : ; b be i.i.d. discrete uniform random variables on [0; 1; : : : ; T − l],and let T = bl. Then, the resampled series, is such that X ∗

1 ; X∗2 ; : : : ; X

∗l ; X

∗l+1; : : : ; X

∗T =

XI1+1; XI1+2; : : : ; XI1+l; XI2 ; : : : ; XIb+l, and so a resampled series consists of b blocks thatare discrete i.i.d. uniform random variables, conditional on the sample. Now, de!nethe simulated SGMM estimator as:

�∗T;S;h = argmin

�∈

(1T

T∑t=1

g(X ∗t ) − 1

S

S∑t=1

g(X �t;h)

)′

×WT

(1T

T∑t=1

g(X ∗t ) − 1

S

S∑t=1

g(X �t;h)

)

= argmin�∈

G∗T;S;h(�)

′WTG∗T;S;h(�); (10)

and note that we do not need to resample the simulated series. The reason is that, asT=S → ∞, simulation error vanishes asymptotically and so there is no need to mimicits contribution to the covariance kernel. Finally, de!ne the bootstrap statistic as

V 2∗T;S;h =

∫UV 2∗T;S;h(u)�(u) du; (11)

with∫U �(u) du= 1, and

V ∗T;S;h(u) =

1√T

T∑t=1

((1{X ∗t 6 u} − 1{Xt6 u})

−(F(u; �∗T;S;h) − F(u; �T;S;h))): (12)

The validity of the suggested bootstrap procedure is stated in the subsequent theorem.

126 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Theorem 2. Let A1–A7 hold. As T; S → ∞, h → 0, T=S → 0, Th2 → 0, l → ∞ andl=

√T → 0:

P(! : sup

v∈R

∣∣∣∣P∗(∫

UV 2∗T;S;h(u)�(u) du6 v

)

−P(∫

U(VT;S;h(u)−

√TE(1{Xt6 u} − F(u; � †)))2�(u) du6 v

)∣∣∣∣¿ j)

→ 0;

where P∗ denotes the probability law of the resampled series, conditional on thesample.

In summary, from Theorem 2 we know that V 2∗T;S;h has a well de!ned limiting dis-

tribution, conditional on the sample and for all samples except a set of probabilitymeasure approaching zero. Furthermore, the limiting distribution coincides with that ofV 2T;S;h, under H0. The above results suggest proceeding in the following manner. For

any bootstrap replication, compute the bootstrapped statistic, V 2∗T;S;h. Perform B bootstrap

replications (B large) and compute the percentiles of the empirical distribution of the Bbootstrapped statistics. Reject H0 if V 2

T;S;h is greater than the (1−5)th-percentile of thisempirical distribution. Otherwise, do not reject H0. Now, for all samples except a setwith probability measure approaching zero, V 2

T;S;h has the same limiting distribution asthe corresponding bootstrapped statistic, under H0. Thus, the above approach ensuresthat the test has asymptotic size equal to 5. Under the alternative, V 2

T;S;h diverges toin!nity, while the corresponding bootstrap statistic has a well de!ned limiting distribu-tion. This ensures unit asymptotic power. Note that the validity of the bootstrap criticalvalues requires the number of bootstrap replications to go to in!nity, although in prac-tice we need to choose B. Andrews and Buchinsky (2000) suggest an adaptive rulefor choosing B, while Davidson and McKinnon (2000) suggest a pretesting procedureensuring that there is a “small probability” of drawing di#erent conclusions from theideal bootstrap and from the bootstrap with B replications, for a test with a given level.However, in our case, the limiting distribution is a functional of a Gaussian process,so that we do not know the explicit density function. Thus, we cannot directly applythe approaches suggested in the papers above. In the Monte Carlo section below, weanalyze the robustness of our !ndings to the choice of B, and !nd that even for valuesof B as small as 100, the bootstrap has good !nite sample properties.

4. Extension to multifactor di�usion models

In the case of multidimensional and/or multifactor di#usion models, knowledge ofthe drift and of the di#usion matrix no longer implies knowledge of the invariantdensity. Thus, the test suggested in Section 2 cannot be generalized in a straightfor-ward manner to the multidimensional case. This is quite a severe limitation, as manyapplications, such as term structure analysis, require multifactor models (see e.g. Daiand Singleton, 2000, 2002). Therefore, in this section we outline a new test which isbased on comparison of the empirical CDF of historical data and the empirical CDFof (model) simulated data, where data are simulated using estimators constructed via

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 127

application of SGMM. A related test based on the comparison of historical data andsimulated data is discussed in Corradi and Swanson (CS: 2003), in the context of realbusiness cycle model evaluation. The test introduced below di#ers from the CS testsin two main respects. Namely: (i) we are interested in continuous time processes andthus we account for discretization error (CS consider only discrete time models); and(ii) parameters estimators are constructed using both actual and simulated data (in CS,parameters are either calibrated or estimated using only historical data).In providing an extension to the multidimensional case of the above test of one-

dimensional models, a !rst diFculty lies in the choice of the discrete approximationscheme. In particular, the di#usion process X (t) can be expressed as a function ofthe driving Brownian motion W (t), in the one-dimensional case. However, in the mul-tidimensional case, where we have X (t)∈Rp, note that X (t) cannot in general beexpressed as a function of the p driving Brownian motions, but is instead a func-tion of (Wj(t);

∫ t0 Wj(s) dWi(s)), i; j = 1; : : : ; p (see e.g. Pardoux and Talay, 1985,

pp. 30–32). For this reason, simple approximation schemes like the Euler or theMilstein schemes, which do not involve approximation of stochastic integrals, maynot be adequate in the multidimensional case. One situation in which the Milsteinscheme does straightforwardly generalize to the multidimensional case is when the dif-fusion matrix is commutative. Let 6(X ) = (�1(X ) · · · �P(X )), where �i(X ) is a p × 1vector, for i = 1; : : : ; p. If for all i; j = 1; : : : ; p,(

@�j(X )@X1

· · · @�j(X )@Xp

)�i(X ) =

(@�i(X )@X1

· · · @�i(X )@Xp

)�j(X );

then 6(X ) is commutative. It is immediate to see that almost all of the most frequentlyused stochastic volatility (SV) models violate the commutativity property. The intuitivereason for this is that both the variance of the observable asset and the variance of the(unobservable) variance process depend only on the volatility process. In this situation,more “sophisticated” approximation schemes are necessary.In the remainder of this section, we specialize to the case of two-factor stochastic

volatility models. Extension to general multidimensional and multifactor models followsdirectly. Consider:(

dX (t)

dV (t)

)=

(b1(X (t); �)

b2(V (t); �)

)dt +

(�11(V (t); �)

0

)dW1(t)

+

(�12(V (t); �)

�22(V (t); �)

)dW2(t); (13)

where W1; t and W2; t are independent standard Brownian motions. It is immediate tosee that the di#usion in (13) violates the commutativity property. Also, note thatmost of the popular SV models, such as the square-root model of Heston (1993),the GARCH di#usion model of Nelson (1990), the lognormal model of Hull andWhite (1987) and the eigenfunction models of Meddahi (2001) can be written as (13)above.

128 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Let

b=

(b1

b2

); � =

(�11 �12

0 �22

); (14)

and de!ne the following generalized Milstein scheme (see Eq. (3.3), p. 346 in Kloedenand Platen, 1999):

X �(k+1)h = X �

kh + b1(X �kh; �)h+ �11(V�

kh; �)"1; (k+1)h + �12(V�kh; �)"2; (k+1)h

+12�22(V�

kh; �)@�12(V�

kh; �)@V

"22; (k+1)h

+ �22(V�kh; �)

@�11(V�kh; �)

@V

∫ (k+1)h

kh

(∫ s

khdW1;

)dW2; s; (15)

V�(k+1)h = V�

kh + b2(V�kh; �)h+ �22(V�

kh; �)"2; (k+1)h

+12�22(V�

kh; �)@�22(V�

kh; �)@V

"22; (k+1)h; (16)

where h−1=2"i; kh ∼ N(0; 1), i = 1; 2, E("1; kh"2;mh) = 0 for all k and m, and

b(V; �) =

(b1(V; �)

b2(V; �)

)=

b1(V; �) − 12 �22(V; �)

@�12(V; �)@V

b2(V; �) − 12 �22(V; �)

@�22(V; �)@V

:

The last terms on the RHS of (15) involves stochastic integrals and cannot be explicitlycomputed. However, they can be approximated, up to an error of order o(h) by (seeEq. (3.7), p. 347 in Kloeden and Platen, 1999):∫ (k+1)h

kh

(∫ s

khdW1;

)dW2; s ≈ h

(128182 +

√9p(:1;p82 − :2;p81)

)

+h2�

p∑r=1

1r(&1; r(

√282 + <2; r) − &2; r(

√281 + <1; r));

where for j = 1; 2, 8j; :j;p; &j; r ; <j; r are i.i.d. N(0; 1) and 9p = 112 − (1=2�2)

∑pr=1 1=r2,

and p is such that as h → 0, p → ∞. Now, de!ne,

ZT;S;h(u) =1√T

T∑t=1

(1{Xt6 u} − 1

S

S∑t=1

1{X �T; S; h

t; h 6 u})

; (17)

where X �t;h = X �

[Nth=S], S = Nh, X �t;h is generated using (15) and (16), and �T;S;h is as

de!ned in equation (6), except that X �t;h is simulated using (15) and (16).

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 129

The theorem below requires a strengthened version of Assumption A2 above.Namely:

Assumption A2′. (A2′). b(·) and �(·) (as de!ned in (13) and (14)) and �ij(V; �)@�k–(V; �)=@V are twice continuously di#erentiable, Lipschitz, with Lipschitz constantindependent of �, and grow at most at a linear rate, uniformly in , for i; j; k; –=1; 2.

Theorem 3. Let A1;A2′ and A3–A8 hold, with X �t;h generated according to (15) and

(16). As T; S → ∞, h → 0, T=S → 0, T 2=S → ∞, Sh → 0:(i) Under H0,

Z2T;S;h ⇒

∫UZ2(u)�(u) du;

where Z is a Gaussian process with covariance kernel given by K(u; u′) equal to:

K(u; u′) = E

( ∞∑s=−∞

(1{X16 u} − F(u; �0))(1{Xs6 u} − F(u; �0))

)

+ :′f(�0)(D

0′W0D0)−1:f(�0)

− 2:′f(�0)

′(D0′W0D0)−1D0′W 0∞∑

s=−∞E((g(Xs)

−E(g(X1))(1{X16 u} − F(u; �0))); (18)

where :f(�0) = E(f(u; �0)∇�X�0t; h).

(ii) Under HA, there exists an j¿ 0 such that,

limT→∞

Pr(1T

Z2T;S;h ¿ j

)= 1:

Analogous to that in Theorem 1, the limiting distribution in Theorem 3 is a functionalof a zero mean Gaussian process, with a covariance kernel that reMects both PEE andthe dependence of the data. In fact, the only di#erence between the covariance kernel in(9) and that in (18) lies in the contribution of PEE. Note also that we require Sh → 0instead of Th2 → 0, which is a stronger requirement on the discrete approximationstepsize, h. Further, we require S to grow at a rate faster than T , but slower than T 2.

The relevant bootstrap statistic is

Z2∗T;S;h(u) =

1√T

T∑t=1

((1{X ∗t 6 u} − 1{Xt6 u})

− 1S

S∑t=1

(1{X �∗T; S; h

t; h 6 u} − 1{X �T; S; h

t; h 6 u}))

; (19)

130 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

where X�∗T; S; h

t; h and X �T; S; h

t; h are constructed as in (15), using �∗T;S;h and �T;S;h respectively,

and where �∗T;S;h and �T;S;h are de!ned in (6) and (10). The following theorem then

holds.

Theorem 4. Let A1;A2′ and A3–A7 hold, with X �t;h generated according to (15) and

(16). As T; S → ∞, h → 0, T=S → 0, T 2=S → ∞, Sh → 0, l → ∞ and l=√T → 0:

P(! : sup

v∈R

∣∣∣∣P∗(∫

UZ2∗T;S;h(u)�(u) du6 v

)

−P(∫

U(ZT;S;h(u) −

√TE(1{Xt6 u} − F(u; � †)))2�(u) du6 v

)∣∣∣∣¿ j)

→ 0:

We can thus proceed using the approach discussed below the statement of Theorem 2.

5. Monte Carlo evidence

In this section we discuss implementation of the di#usion speci!cation test and ofthe bootstrap procedure outlined in Sections 2 and 3, for the case in which we wantto test the null hypothesis that the CDF is a gamma distribution. In particular considerthe di#usion below, which is a simpli!ed version of the square root process discussedin Cox et al. (1985): 13

dX (t) = ((c1 − a) − X (t)) dt +√

c1X (t) dW (t);

c1 ¿ 0; and c1 − a¿ 0: (20)

From Wong (1964, pp. 264–265), it is known that the stationary density associatedwith X (t) belongs to the linear exponential (or Pearson) family. The process has anoncentral chi-squared transition density, and the invariant density is a gamma. Theinvariant density can be written as: 14

f(x; a; c1) =(c1=2)−2(1−a=c1)x2(1−a=c1)−1 exp(−x=(c1=2))

?(2(1 − a=c1));

?(2(1 − a=c1)) =∫ ∞

0t2(1−a=c1)−1 exp(−t) dt:

It follows that,

F(u; a; c1) =

∫ u0 (c1=2)

−2(1−a=c1)x2(1−a=c1)−1 exp(−x=(c1=2)) dx?(2(1 − a=c1))

: (21)

13 Note that this di#usion is obtained by de!ning the drift and the variance as inEqs. (10) and (11) in Wong (1964), setting @ = 1, c = 0, d = c1, e = 0 in hisEq. (10), and setting a = −1, b = −a in his Eq. (11). The resulting di#usion is indeed a squareroot process.14 Note that the invariant density under the null hypothesis can be written as f(x; 51; 52) =

5−512 x51−1 exp(−x=52)=?(51), which is the standard form of the density (see e.g. Johnson et al., 1994).In our case, 51 = 2(c1 − a)=c1 and 52 = c1=2.

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 131

For the CDF in (21), the !rst two moments are known (the mean is (c1 − a) and thevariance is (c1=2)(c1 − a)). For details, see Johnson et al. ((1994), ch. 17). Thus, wecan use (nonsimulated) GMM to estimate the parameters of the CDF. Also, as we havetwo moment conditions and two parameters (exact identi!cation), there is some (a†; c†

1)for which the moment conditions are satis!ed under both the null and the alternativehypotheses. Along these lines, de!ne

(a; c1)′ = arg mina;c1∈

G′(�)WTG(�); (22)

where

G′(�) =

((1T

T∑t=1

Xt − (c1 − a)

);

(1T

T∑t=1

(Xt − X )2 − c12(c1 − a)

)):

and

W−1T =

1T

T∑t=1

ftf′t +

1T

lT∑ =1

w

T∑t= +t

(ftf′t− + ft− f′

t )

with ft=(f1t ; f2t)′, f1t=(Xt−X ), f2t=(Xt−X )2, Xf= 1T

∑Tt=1 ft , and w =1− =(lT+1).

Further, assume that we are interested in testing the null hypothesis that the CDF isas de!ned in (21). Finally, note that although the process de!ned under H0 is veryrestrictive, and does not correspond to those versions of the Cox et al. (1985) modelfrequently estimated in practice, it provides us with a convenient form of the squareroot process with which to illustrate the application of our di#usion speci!cation test. Inthe above model, consider the simple case where c1 = 3, and a=−3 (one of the casesconsidered in the experiments reported on below). We generated 1000 observationsaccording to this model. A histogram of these observations is given in Panel 1 ofFig. 1. In addition, daily and monthly observations on the 3-month Treasury Billrate were downloaded from the St. Louis Federal Reserve (FRED) Database. Asthe daily data were available starting in 02/01/1962, we constructed histograms forthese two series starting at that date. These are reported in Panels 2 and 3 of Fig.1. It is interesting that although the historical data are generally ‘smoother’ thanthose generated under H0, they have similar minima, maxima, means, medians, andstandard deviations when compared with the simulated data. Thus, even the restric-tive example considered here for illustrative purposes is not too far removed fromreality.Given the discussion in Section 2 above, the statistic used in our simulation exercise

is: 15

V 2T;S;h = V 2

T =∫UV 2T (u)�(u) du;

15 In addition to examining the !nite sample performance of V 2T , we also examine the properties of

|VT | = ∫U |VT (u)|�(u) du, and |VT |sup = supu∈U |VT (u)|.

132 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

0

400

800

1200

1600

4 6 8 10 12 14 16

Series: DTB3Sample 1 10082Observations 9645

Mean 6.160729Median 5.500000Maximum 17.14000Minimum 2.610000Std. Dev. 2.576973Skewness 1.415659Kurtosis 5.380362

Jarque-Bera 5498.650Probability 0.000000

0

20

40

60

80

100

2 4 6 8 10 12 14 16 18

Series: C13_A3Sample 1 1000Observations 1000

Mean 5.923989Median 5.402733Maximum 18.48595Minimum 0.555858Std. Dev. 2.970412Skewness 0.962083Kurtosis 3.980692

Jarque-Bera 194.3405Probability 0.000000

0

20

40

60

80

4 6 8 10 12 14 16

Series: TB3MSSample 338 800Observations 463

Mean 6.164752Median 5.490000Maximum 16.30000Minimum 2.690000Std. Dev. 2.574769Skewness 1.393517Kurtosis 5.250254

Jarque-Bera 247.5352Probability 0.000000

Fig. 1. Simulated and actual data. Panel 1: 1000 observations generated according to dX (t) = (6 − X (t)) dt+√

3X (t) dW (t). Panel 2: Monthly 3-mo treasury bill data for the period 2/1962-9/2000 (Ann. %). Panel3: Daily 3-mo treasury bill data for the period 02/01/1962-09/22/2000 (Ann. %).

where

VT (u) =1√T

T∑t=1

(1{Xt6 u} −

∫ u0 (c1=2)

−2(1−a=c1)x2(1−a=c1)−1 exp(−x=(c1=2)) dx?(2(1 − a=c1))

);

and a and c1 are de!ned as in (22). The bootstrap statistic is

V 2∗T;S;h = V 2∗

T =∫UV 2∗T (u)�(u) du;

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 133

where

V ∗T (u) =

1√T

T∑t=1

(1{X ∗

t 6 u} −∫ u0 (c

∗1 =2)

−2(1−a∗=c∗1 )x2(1−a∗=c∗

1 )−1 exp(−x=(c∗1 =2)) dx

?(2(1 − a∗=c∗1))

)

− 1√T

T∑t=1

(1{Xt6 u} −

∫ u0 (c1=2)

−2(1−a=c1)x2(1−a=c1)−1 exp(−x=(c1=2)) dx?(2(1 − a=c1))

):

In the above expression, X ∗t denotes the pseudo time series, resampled according to

the moving block scheme described in Section 3. Also, a∗ and c∗1 are de!ned as

(a∗; c∗1)

′ = arg mina;c1∈

G∗′(�)WTG∗(�); (23)

where

G∗′(�) =

((1T

T∑t=1

X ∗t − (c1 − a)

);

(1T

T∑t=1

(X ∗t − X ∗)2 − c1

2(c1 − a)

)):

Given the exact identi!cation of the parameters in this example, the bootstrap seriessatis!es the moment conditions (in the limit) and so no recentering term is required.In our test level experiments, we simulate data according to (20) by using the discrete

Milstein approximation scheme given in Eq. (8). 16 It follows that discrete samples aregenerated according to:

b(X �(k−1)h; �)h= ((c1 − a) − X(k−1)h)h; �(X �

(k−1)h; �) =√

c1X(k−1)h;

�(X �(k−1)h; �)

′ =12(X(k−1)h)−1=2√c1: (24)

Rejection frequencies are tabulated for (a; c1)={(−2; 2); (−3; 3); (−4; 4)}. 17 For powerexperiments we generate log of X (t) as an Ornstein-Uhlenbeck process. Namely, wede!ne:

d logX (t) = −�1 logX (t) dt + � dW (t); �1 ¿ 0;

so that X (t) has a lognormal CDF. In practice we simulate the approximate discretetrajectories of Ykh = log(Xkh) using the following Milstein scheme:

Ykh = −�1Y(k−1)hh+ �"kh;√h"kh ∼ N(0; 1); Xkh = exp(Ykh):

16 Note that in (24) below, paths are only simulated at the “true” parameter values. Thus, we can suppressthe dependence of parameters on the simulation trajectory.17 The three parameterizations considered under H0 can be expressed in terms of 51and 52 as: (51; 52) =

{(4; 1); (4; 3=2); (4; 2)}. The shapes of the densities with these parameterizations are given in Johnson et al.(1994, p. 341). Results qualitatively similar to those reported in Table 1 were also found for (51; 52) ={(10=3; 3=2); (14=3; 3=2)}, although they are not reported here.

134 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Note that under both hypotheses the DGP is characterized by a di#usion term which islocally, but not globally Lipschitz. In the power experiments, we set �2 ={0:1; 0:5; 1:0}and � = {0:3; 0:6; 0:9}. Summarizing, the null is that Xt has a gamma CDF, whilethe alternative is that Xt does not have a gamma CDF. In all experiments, param-eters are estimated using GMM, given the moment conditions implied under H0,and V 2

T , |VT | , and |VT | sup test statistics are constructed. Further, h is set equal toT−1, samples of T = {400; 800; 1200} observations are generated, block lengths ofl = {2; 4; 5; 8; 10; 20; 25; 40; 50} are tried, the integration interval U is set equal to[0; 15], statistics are formed based on uniform grids of 50 points in U , and criticalvalues are set equal to the 90th percentile of the bootstrap distribution. Finally, wetried B= {100; 200; 500}. As results are qualitatively the same for all three values ofB, we report only the !ndings for B= 100.Results based on data generated according to H0 are collected in Table 1, while

those based on data generated according the HA are in Tables 2–4. Recalling that the“nominal” rejection rate is 10%, note that for samples of 400 observations, empiricalrejection rates range from around 8% to 15%, while for samples of 1200 observations,the range is approximately 10–13%. Interestingly, results seem to be quite robust to thechoice of blocklength, l. From Tables 2–4, we see that empirical rejection frequenciesunder the alternative are also somewhat robust to the choice of blocklength, although!nite sample power in some cases increases rather substantially with sample size (seee.g. Table 2, Panels a and b). However, when the variability of the process is �2 =0:5or �2=1:0, !nite sample power is above 0.85 for all sample sizes and cases considered(see Tables 3 and 4).

6. Concluding remarks

In this paper we have outlined two parametric speci!cation tests for di#usion pro-cesses. In the one-dimensional case, we outline a test that is based on a compari-son of the empirical distribution of the skeleton of the di#usion and the CDF im-plied by the speci!cation of the drift and the variance terms, with the latter evalu-ated at estimated (simulated GMM) parameters. In the multidimensional and/or mul-tifactor case, we outline a related test that is based on a comparison of the empir-ical distributions of the actual and simulated data. In both cases, the limiting dis-tribution is a functional of Gaussian process with a covariance kernel that reMectsdata dependence and parameter estimation error. In order to obtain asymptoticallyvalid critical values for the tests, we propose an extension of the empirical ver-sion of the block bootstrap which properly captures the contribution of parameterestimation error.In an illustration, the one-dimensional test is applied to the problem of select-

ing between two alternative continuous time di#usion models, and a limited num-ber of Monte Carlo experiments are carried out, with results suggesting that the testshold some promise, even when applied to samples of data as small as 400observations.

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 135

Table 1Di#usion speci!cation test rejection frequencies: empirical levela data generated under H0 with (c1; a) ={(2;−2); (3;−3); (4;−4)}

l T = 400 observations T = 800 observations T = 1200 observations

V 2T |VT | |VT |sup V 2

T |VT | |VT |sup V 2T |VT | |VT |sup

Panel a: c1 = 2, a = −22 0.086 0.096 0.092 0.114 0.122 0.124 0.130 0.144 0.1323 0.088 0.092 0.090 0.096 0.106 0.118 0.112 0.136 0.1185 0.080 0.096 0.098 0.100 0.114 0.114 0.132 0.134 0.1188 0.084 0.098 0.090 0.104 0.114 0.114 0.112 0.124 0.12610 0.078 0.086 0.088 0.106 0.110 0.130 0.122 0.128 0.12220 0.086 0.096 0.088 0.096 0.096 0.112 0.116 0.126 0.12825 0.086 0.096 0.092 0.112 0.108 0.104 0.118 0.122 0.12640 0.094 0.100 0.094 0.096 0.106 0.114 0.114 0.124 0.11850 0.096 0.108 0.112 0.098 0.106 0.112 0.128 0.118 0.130

Panel b: c1 = 3, a = −32 0.140 0.134 0.128 0.148 0.142 0.118 0.116 0.122 0.0943 0.124 0.124 0.128 0.130 0.134 0.130 0.108 0.120 0.0945 0.144 0.144 0.126 0.134 0.136 0.122 0.112 0.112 0.1088 0.130 0.122 0.120 0.132 0.126 0.116 0.108 0.120 0.10210 0.136 0.134 0.132 0.148 0.142 0.122 0.110 0.110 0.10420 0.126 0.124 0.132 0.146 0.142 0.118 0.114 0.110 0.10025 0.134 0.134 0.138 0.140 0.134 0.136 0.112 0.114 0.09640 0.128 0.128 0.134 0.138 0.142 0.116 0.112 0.102 0.10250 0.144 0.146 0.146 0.144 0.148 0.126 0.112 0.108 0.096

Panel c: c1 = 4, a = −42 0.120 0.124 0.122 0.114 0.132 0.112 0.108 0.114 0.1123 0.110 0.124 0.114 0.120 0.122 0.110 0.108 0.108 0.1105 0.120 0.130 0.122 0.106 0.104 0.106 0.112 0.108 0.1068 0.114 0.122 0.120 0.122 0.122 0.102 0.106 0.102 0.10610 0.128 0.126 0.130 0.112 0.116 0.112 0.108 0.112 0.11220 0.136 0.138 0.116 0.130 0.130 0.112 0.104 0.116 0.10425 0.148 0.150 0.124 0.128 0.132 0.112 0.112 0.116 0.12040 0.138 0.140 0.124 0.142 0.140 0.116 0.114 0.124 0.11250 0.148 0.156 0.156 0.128 0.146 0.120 0.104 0.118 0.106

aNotes: The !rst column of numerical entries are block lengths used in the construction of bootstrap criticalvalues. All other numerical entries are rejection frequencies based on application of the one-dimensionaldi#usion speci!cation test discussed above. Results for three versions of the test, denoted V 2

T , |VT |, and|VT |sup are reported. Critical values are set equal to the 90th percentile of the empirical distribution of thebootstrap statistics, and all empirical distributions are constructed using 100 bootstrap replications. All resultsare based on 1000 Monte Carlo simulations (see above for further details).

Acknowledgements

Parts of this paper were written while the second author was visiting the Universityof California, San Diego. The authors wish to thank Ron Gallant, the associate editor,two anonymous referees, Filippo Altissimo, Marcelo Fernandes, Silvia Goncalves, Soren

136 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Table 2Di#usion speci!cation test rejection frequencies: empirical power Ia data generated under HA with �2 = 0:1

l T = 400 observations T = 800 observations T = 1200 observations

V 2T |VT | |VT |sup V 2

T |VT | |VT |sup V 2T |VT | |VT |sup

Panel a: � = 0:32 0.543 0.563 0.473 0.783 0.820 0.727 0.893 0.903 0.8503 0.480 0.503 0.457 0.750 0.783 0.710 0.877 0.887 0.8375 0.483 0.490 0.433 0.750 0.787 0.700 0.870 0.880 0.8278 0.463 0.480 0.423 0.720 0.760 0.677 0.860 0.863 0.83010 0.450 0.483 0.397 0.720 0.747 0.700 0.850 0.867 0.82320 0.460 0.447 0.427 0.723 0.753 0.677 0.847 0.873 0.79025 0.437 0.443 0.417 0.727 0.740 0.667 0.847 0.870 0.82040 0.467 0.477 0.427 0.707 0.727 0.683 0.847 0.867 0.82750 0.450 0.450 0.413 0.717 0.727 0.670 0.857 0.870 0.827

Panel b: � = 0:62 0.333 0.363 0.290 0.570 0.587 0.483 0.723 0.763 0.6303 0.340 0.353 0.287 0.567 0.583 0.467 0.717 0.747 0.6235 0.337 0.353 0.260 0.560 0.593 0.453 0.710 0.750 0.6178 0.340 0.360 0.290 0.560 0.587 0.450 0.717 0.743 0.62310 0.343 0.357 0.280 0.560 0.607 0.443 0.710 0.757 0.61320 0.330 0.357 0.283 0.540 0.590 0.440 0.720 0.740 0.62025 0.327 0.357 0.283 0.547 0.577 0.427 0.703 0.743 0.60740 0.350 0.350 0.313 0.533 0.573 0.447 0.717 0.750 0.61350 0.353 0.363 0.330 0.537 0.583 0.457 0.707 0.753 0.627

Panel c: � = 0:92 0.213 0.243 0.193 0.377 0.460 0.297 0.510 0.603 0.3603 0.223 0.253 0.183 0.383 0.453 0.273 0.513 0.583 0.3605 0.233 0.247 0.197 0.390 0.480 0.297 0.523 0.587 0.3638 0.237 0.253 0.200 0.380 0.453 0.283 0.497 0.587 0.37010 0.230 0.250 0.193 0.367 0.440 0.270 0.503 0.580 0.36720 0.223 0.257 0.193 0.363 0.470 0.283 0.490 0.573 0.36325 0.243 0.270 0.217 0.390 0.480 0.277 0.477 0.590 0.38740 0.267 0.280 0.233 0.370 0.467 0.300 0.503 0.590 0.38750 0.260 0.273 0.237 0.393 0.477 0.313 0.497 0.580 0.373

aNotes: See notes to Table 1.

Johansen, Oliver Linton, Allan Timmerman, Paolo Za#aroni, and seminar participantsat Ente Einaudi-Roma, Queen Mary, University of London, Universit[e Libre Bruxelles,European Universitary Institute, University of Washington, Purdue University, RutgersUniversity, ESRC-UK econometrics Group in Bristol, the 2000 Winter Meetings of theEconometric Society, and the 2001 Meeting of the European Econometric Society foruseful comments on earlier versions of this paper. In addition, special thanks are dueto Antonio Mele and Bent Sorensen for providing numerous very helpful suggestions.Corradi gratefully acknowledges !nancial support from ESRC, grant code R000230006.

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 137

Table 3Di#usion speci!cation test rejection frequencies: empirical power IIa data generated under HA with �2 = 0:5

l T = 400 observations T = 800 observations T = 1200 observations

V 2T |VT | |VT |sup V 2

T |VT | |VT |sup V 2T |VT | |VT |sup

Panel a: � = 0:32 0.953 0.873 0.980 0.990 0.957 1.000 0.993 0.977 1.0003 0.950 0.843 0.977 0.997 0.963 1.000 0.993 0.967 1.0005 0.943 0.857 0.980 0.987 0.960 1.000 0.990 0.973 1.0008 0.933 0.853 0.977 0.990 0.940 1.000 0.990 0.960 1.00010 0.950 0.850 0.967 0.987 0.937 1.000 0.990 0.950 1.00020 0.937 0.833 0.970 0.990 0.947 1.000 0.990 0.957 1.00025 0.920 0.837 0.957 0.983 0.947 1.000 0.993 0.960 1.00040 0.910 0.833 0.957 0.987 0.943 1.000 0.990 0.947 1.00050 0.907 0.820 0.943 0.993 0.943 1.000 0.990 0.963 1.000

Panel b: � = 0:62 0.937 0.883 0.937 1.000 0.987 1.000 1.000 1.000 1.0003 0.917 0.860 0.910 0.997 0.987 1.000 1.000 1.000 1.0005 0.923 0.857 0.913 1.000 0.990 1.000 1.000 1.000 1.0008 0.917 0.853 0.897 0.993 0.990 1.000 1.000 1.000 1.00010 0.923 0.857 0.900 0.997 0.993 1.000 1.000 1.000 1.00020 0.903 0.847 0.897 1.000 0.997 1.000 1.000 1.000 1.00025 0.890 0.837 0.910 1.000 0.990 1.000 1.000 1.000 1.00040 0.913 0.820 0.890 0.993 0.990 1.000 1.000 0.997 1.00050 0.900 0.830 0.900 0.993 0.993 1.000 1.000 1.000 1.000

Panel c: � = 0:92 0.870 0.857 0.723 0.983 0.980 0.950 0.997 0.997 0.9933 0.867 0.853 0.737 0.980 0.980 0.953 0.993 0.997 0.9905 0.857 0.853 0.723 0.987 0.983 0.957 0.997 0.997 0.9938 0.857 0.850 0.713 0.987 0.980 0.957 0.997 0.997 0.99010 0.863 0.850 0.743 0.980 0.973 0.963 0.993 0.997 0.99320 0.850 0.840 0.723 0.983 0.980 0.957 0.997 0.993 0.99325 0.843 0.827 0.717 0.987 0.977 0.960 0.997 0.997 0.99340 0.830 0.827 0.720 0.983 0.980 0.953 0.997 0.993 0.99350 0.823 0.830 0.723 0.973 0.973 0.950 0.997 0.993 0.997

aNotes: See notes to Table 1.

Appendix A.

The proof of Theorem 1 requires the following Lemma.

Lemma A1. Let A1–A2, A4–A8 hold. If as T; S → ∞, h → 0, T=S → 0 and h2T → 0,then

√T (�T;S;h − � †) d→N(0; D†′

W0D†)−1;

where W0 and D† are de-ned in assumptions A5 and A8, respectively.

138 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Table 4Di#usion speci!cation test rejection frequencies: empirical power IIIa data generated under HA with �2 =1:0

l T = 400 observations T = 800 observations T = 1200 observations

V 2T |VT | |VT |sup V 2

T |VT | |VT |sup V 2T |VT | |VT |sup

Panel a: � = 0:32 0.987 0.973 1.000 1.000 1.000 1.000 1.000 1.000 1.0003 0.977 0.967 0.997 0.997 1.000 1.000 1.000 1.000 1.0005 0.983 0.967 1.000 1.000 1.000 1.000 1.000 1.000 1.0008 0.967 0.953 1.000 1.000 1.000 1.000 1.000 1.000 1.00010 0.967 0.953 0.997 1.000 1.000 1.000 1.000 1.000 1.00020 0.947 0.933 0.990 1.000 1.000 1.000 1.000 1.000 1.00025 0.960 0.933 0.997 1.000 1.000 1.000 1.000 1.000 1.00040 0.933 0.920 0.987 0.997 0.997 1.000 1.000 1.000 1.00050 0.937 0.907 0.987 0.997 0.993 1.000 1.000 0.997 1.000

Panel b: � = 0:62 0.987 0.890 0.997 0.997 0.960 1.000 0.997 0.973 1.0003 0.977 0.887 1.000 0.993 0.957 1.000 0.993 0.957 1.0005 0.980 0.890 0.997 0.990 0.970 1.000 0.997 0.957 1.0008 0.983 0.900 0.993 0.990 0.943 1.000 0.993 0.963 1.00010 0.987 0.893 1.000 0.993 0.957 1.000 0.997 0.967 1.00020 0.973 0.883 0.993 0.993 0.953 1.000 0.997 0.970 1.00025 0.983 0.890 0.997 0.997 0.960 1.000 0.993 0.967 1.00040 0.983 0.890 0.997 0.993 0.960 1.000 0.993 0.967 1.00050 0.990 0.887 0.997 0.993 0.963 1.000 0.990 0.970 1.000

Panel c: � = 0:92 0.987 0.903 0.990 0.993 0.977 1.000 1.000 0.997 1.0003 0.973 0.927 0.990 1.000 0.980 1.000 1.000 0.997 1.0005 0.983 0.910 0.993 0.997 0.980 1.000 1.000 0.993 1.0008 0.973 0.903 0.993 1.000 0.977 1.000 1.000 0.993 1.00010 0.973 0.903 0.993 1.000 0.990 1.000 1.000 1.000 1.00020 0.953 0.913 0.987 0.997 0.980 1.000 0.997 0.993 1.00025 0.960 0.893 0.990 0.997 0.990 1.000 0.997 0.993 1.00040 0.970 0.897 0.990 1.000 0.987 1.000 1.000 0.997 1.00050 0.970 0.910 0.997 0.997 0.983 1.000 1.000 0.990 1.000

aNotes: See notes to Table 1.

Proof of Lemma A1. Note that, because of the !rst order conditions, ∇�GT;S;h(�T;S;h)′

WTGT;S;h(�T;S;h) = 0, then via a mean value expansion around � †,

0 =∇�GT;S;h(�T;S;h)′WTGT;S;h(� †) + ∇�GT;S;h(�T;S;h)′WT∇�GT;S;h( X�T;S;h)

×(�T;S;h − � †);

where X�T;S;h ∈ (�T;S;h; � †). Thus,√T (�T;S;h − � †) = (−∇�GT;S;h(�T;S;h)′WT∇�GT;S;h( X�T;S;h))−1

×∇�GT;S;h(�T;S;h)′WT

√TGT;S;h(� †): (A.1)

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 139

We begin by showing that as T; S → ∞, h → 0, T=S → 0 and h2T → 0,√TGT;S;h(� †) d→N(0; W−1

0 ): (A.2)

Now, given A1 and A5,

√TGT;S;h(� †) =

1√T

T∑t=1

(g(Xt) − E(g(X1))) −√TS

S∑t=1

(g(X � †t; h ) − E(g(X � †

1; h )))

−√T (E(g(X � †

1; h )) − E(g(X � †1 )))

−√T (E(g(X � †

1 )) − E(g(X1))); (A.3)

we need to show that the second, third and fourth term on the RHS of (A.3) approachzero in probability, as T; S → ∞, h → 0, T=S → 0 and h2T → 0. By the !rstorder conditions, ∇�G∞(� †)′W †G∞(� †) = 0 and so, as (∇�G∞(� †)′W †)−1 exists,G∞(� †) = 0, where, given A1 and A3,

0 =G∞(� †) = plimS→∞; h→01S

S∑t=1

g(X � †t; h ) − plimT→∞

1√T

T∑t=1

g(Xt)

= E(g(X � †1 )) − E(g(X1));

thus the last term on the RHS of (A.3) is zero. Now, given A2 and recalling thathT 2 → 0, the second last term on the RHS of (A.3) is oP(1), as a straightforwardconsequence of Theorem 2.3 in Pardoux and Talay (1985). Given A1, A2, A4 andA6, (1=

√S)∑S

t=1 (g(X� †t; h ) − E(g(X � †

1; h ))) is bounded in probability, as it satis!es acentral limit theorem, and so for T =o(S) it vanishes to zero. Given A5, the statementin (A.2) follows straightforwardly. Finally, given A2, and A4–A8, via the uniform lawof large numbers for mixing process, as T; S → ∞ and h → 0,

sup�∈

|GT;S;h(�)′WTGT;S;h(�) − G′∞W0G∞| Pr→0;

and so given unique identi!ability, A7, (�T;S;h − � †) Pr→0. Also, given A2, and A4–A7,

sup�∈

|∇�GT;S;h(�)′WT∇GT;S;h(�) − D(�)′W0D(�)| Pr→0;

and so, given that (�T;S;h − � †) Pr→0,

∇�GT;S;h(�T;S;h)′WT∇GT;S;h( X�T;S;h)Pr→D†′

W0D†;

where D(� †) = D†. The statement in the Lemma then follows.

Proof of Theorem 1. (i) Recall that � †=�0, under the null. We !rst show convergencein distribution for any given u∈U , then we show convergence of the !nite dimensionaldistributions and !nally stochastic equicontinuity over U , this will ensure that VT;S;h(·)weakly converges to Z , and the desired result then follows from the continuous mappingtheorem. Given A1, the skeleton X1; X2; : : : ; XT is a strictly stationary strong mixing

140 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

sequence with mixing coeFcients decaying at a geometric rate. Given A2, we canwrite

VT;S;h(u) =1√T

T∑t=1

((1{Xt6 u} − F(u; �0)) − (F(u; �T;S;h) − F(u; �0)))

=1√T

T∑t=1

(1{Xt6 u} − F(u; �0)) − ∇�F(u; X�T;S;h)′√T (�T;S;h − �0)

= I1T (u) + I2T (u);

where X�T;S;h ∈ (�T;S;h; �0). Recalling A1, A4–A8, by the central limit theorem for strongmixing sequences,(

I1T (u)

I2T (u)

)d→N

(0;

(V1(u) C(u)

C(u) V2(u)

);

where, given Lemma A1,

V1(u) = E

( ∞∑s=−∞

(1{X16 u} − F(u; �0))(1{Xs6 u} − F(u; �0))

);

V2(u) = ∇�F(u; �0)′(D0′W0D0)−1∇�F(u; �0)));

C(u) =−∇�F(u; �0)′(D0′W0D0)−1D0′W0

∞∑s=−∞

E((g(Xs)

−E(g(X1))(1{X16 u} − F(u; �0))):

Thus, ST;S;h(u)d→N(0; K(u; u)), where K(u; u)=V1(u)+V2(u)+2C(u). A straightforward

application of the Cramer Wold device ensures that(VT;S;h(u)

VT;S;h(u′)

)d→N

(0;

(K(u; u) K(u; u′)

K(u; u′) K(u′; u′)

);

where K(u; u′) is as de!ned in (9). As U is compact in R, (and so totally bounded)in order to show weak convergence, we need to show that VT;S;h(u) is stochasticallyequicontinuous on U ,(see e.g. Pollard, 1990, Section 10), that is

limsupT;S→∞; h→0 Pr

(supu

supu′:|u−u′|¡%

|VT;S;h(u) − VT;S;h(u′)|¿ j)

= 0 (A.4)

Now (A.4) will follow if we can show that

limsupT→∞ Pr

(supu

supu′:|u−u′|¡%

∣∣∣∣∣ 1√T

T∑t=1

((1{Xt6 u} − F(u; �0))

− (1{Xt6 u′} − F(u′; �0)))|¿ j=2)

= 0 (A.5)

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 141

limsupT;S→∞; h→0 Pr

(supu

supu′:|u−u′|¡%

|(∇�F(u; X�T )′

− ∇�F(u′; X�T;S;h)′)√T (�T;S;h − �0)|¿ j=2

)= 0 (A.6)

with X�T;S;h ∈ (�T;S;h; �0). We begin by considering (A.6); by a mean value expansion,we have,

supu

supu′:|u−u′|¡%

∣∣∣∣∣∣k∑

j=1

(∇u;�F( Xu; X�T;S;h)j√T (�T;S;h; j − �0; j)(u − u′)

∣∣∣∣∣∣6 k max

j=1;:::; ksup

u×�∈U×|∇u;�F(u; �)j‖

√T (�T;S;h; j − �0; j)|sup

usup

u′:|u−u′|¡%|u − u′|

Now

k lim supT→∞

Pr(

|√T (�T;S;h; j − �0; j)|¿ j=2

% supu×�∈U× |∇u;�F(u; �)j|)= 0;

given that√T (�T;S;h−�0)=Op(1), and A3 ensures that ∇�;uF(u; �)′ is jointly continuous

on × U , and so supu×�∈U× |∇u;�F(u; �)j|6C. It remains to show (A.5).Let mt(u) = 1{Xt6 u} − F(u; �0) and note that,

supt6T;T¿1

supu∈U

|mt(u)| = 1 (A.7)

Without loss of generality set u¡u′,

supt6T;T¿1

(E

(supu

supu′:|u−u′|¡%

(|mt(u) − mt(u′)|p)))1=p

; p¿ 2

6 supu

supu′:|u−u′|¡%

|F(u; �0) − F(u′; �0)|

+

(E

(supu

supu′:|u−u′|¡%

(1{u6Xt6 u′})p))1=p

6 supu×U

|∇uF(u; �0)|%+(

supu∈U

∫ u+%

uf(x) dx

)1=p6 2C%; (A.8)

given A3. Stochastic equicontinuity then follows by Philipp (1982) (see (i)–(iii) inexample 2(a) in Andrews, 1993). In fact (i) is satis!ed given the geometric ergodicityof the skeleton, (ii) is ensured by (A.7) and as shown by Andrews (1993, pp. 201),(iii) is implied by (A.8). Given stochastic equicontinuity and convergence of the !nitedimensional distributions, it follows that VT;S;h(·) ⇒ Z(·) where Z is the Gaussianprocess with the covariance kernel de!ned in (9). The statement then follows from thecontinuous mapping theorem.

142 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

(ii)

ST;S;h(u) =1√T

T∑t=1

((1{Xt6 u} − F0(u)) −√T (F(u; �T;S;h) − F0(u))):

The !rst term on the RHS of the above expression satis!es a central limit theoremand so is Op(1). With regard to the second term, it diverges at rate

√T for all u in a

subset of positive Lebesgue measure. The statement in (ii) then follows.

Hereafter, P∗ denotes the probability law of the resampled series X ∗t , and d∗ de-

notes P∗-convergence in distribution, conditional on the sample. Also, with the notationoP∗(1) Pr − P, we mean a term which approaches zero in P∗-probability, conditionalon the sample and for all sample, but a subset of measure approaching zero.The proof of Theorem 2 requires the following Lemma.

Lemma A2. Let A1–A2, A4–A8 hold. If as T; S → ∞, h → 0, T=S → 0, h2T → 0,l → ∞, and l=

√T → 0, then

Pr(

supx∈Rp

|P∗(√T (�∗

T;S;h − �T;S;h)6 x) − P(√T (�T;S;h − � †)6 x)|¿ j

)→ 0;

Proof of Lemma A2. Via a mean value expansion around � †,

0 = ∇�G∗T;S;h(�

∗T;S;h)

′WTG∗T;S;h(�

†) + ∇�G∗T;S;h(�

∗T;S;h)

′WT∇�G∗T;S;h( X�

∗T;S;h)(�

∗T;S;h − � †);

where X�∗T;S;h ∈ (�∗

T;S;h; �†). Thus,

√T (�∗

T;S;h − � †) = (−∇�G∗T;S;h(�

∗T;S;h)

′WT∇�G∗T;S;h( X�

∗T;S;h))

−1

×∇�G∗T;S;h(�

∗T;S;h)

′WT

√TG∗

T;S;h(�†);

and so given (A.1),√T (�∗

T;S;h − �T;S;h)

= (−∇�G∗T;S;h(�

∗T;S;h)

′WT∇�G∗T;S;h( X�

∗T;S;h))

−1

×∇�G∗T;S;h(�

∗T;S;h)

′WT

√T (G∗

T;S;h(�†) − GT;S;h(� †))

− ((−∇�G∗T;S;h(�

∗T;S;h)

′WT∇�G∗T;S;h( X�

∗T;S;h))

−1∇�G∗T;S;h(�

∗T;S;h)

′WT

− (−∇�GT;S;h(�T;S;h)′WT∇�GT;S;h( X�T;S;h))−1∇�GT;S;h(�T;S;h)′WT )√TGT;S;h(� †):

We begin by showing that√T (G∗

T;S;h(�†) − GT;S;h(� †)) has the same limiting distri-

bution as√TGT;S;h(� †), conditionally on the samples, and for all samples except a set

of probability measure approaching zero. Note that,√T (G∗

T;S;h(�†) − G∞(� †)) =

√T (G∗

T;S;h(�†) − GT;S;h(� †))

+√T (GT;S;h(� †) − G∞(� †));

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 143

where for h2T → 0, the last term on the RHS above is oP(1), by the same argumentused in the proof of Lemma A1. Also, given A1–A4 and given that T=S → 0 (i.e. thesimulation error is vanishing), by the same argument as in the proof of Lemma A1,

√T (G∗

T;S;h(�†) − GT;S;h(� †)) =

1√T

T∑t=1

(g(X ∗t ) − g(Xt)) + oP∗(1): (A.9)

Thus, given A1 and A6, by Theorem 3.5 in K2unsch (1989), the RHS of (A.9) has thesame limiting distribution as 1√

T

∑Tt=1 (g(Xt)−E(g(X1))), conditionally on the samples,

and for all samples except a set of probability measure approaching zero. Therefore,√T (G∗

T;S;h(�†) − GT;S;h(� †))d

∗→N(0; W0), Pr − P, where d∗ denotes P∗-convergence in

distribution. We now need to show that

�∗T;S;h − � † = oP∗(1);Pr − P: (A.10)

First, note that

G∗T;S;h(�) − G∞(�) = (G∗

T;S;h(�) − GT;S;h(�)) + (GT;S;h(�) − G∞(�))

= oP∗(1) + o(1);Pr − P; (A.11)

uniformly in �. In fact, as we do not resample the simulated series, given A1, A4, A6,the !rst term on the RHS of (A.11) is oP∗(1) + o(1) Pr − P because of Lemma A2in GonOcalves and White (2004), the second is oP(1) by the same argument as in theproof of Lemma A1. Therefore, given A5,

sup�∈

|G∗T;S;h(�)

′WTG∗T;S;h(�) − G′

∞W0G∞|Pr∗

→0;Pr − P;

and, given A7, (A.10) follows. By a similar argument, given (A.10), it also followsthat,

((−∇�G∗T;S;h(�

∗T;S;h)

′WT∇�G∗T;S;h( X�

∗T;S;h))

−1∇�G∗T;S;h(�

∗T;S;h)

′WT

−(−∇�GT;S;h(�T;S;h)′WT∇�GT;S;h( X�T;S;h))−1∇�GT;S;h(�T;S;h)′WT )Pr∗→0;Pr − P;

and, as√TGT;S;h(� †) = OP(1), the statement in the lemma follows.

Proof of Theorem 2.

V ∗T;S;h(u) =

1√T

T∑t=1

(1{X ∗t 6 u} − 1{Xt6 u})

−∇�F(u; X�∗T;S;h)

√T (�∗

T;S;h − �T;S;h): (A.12)

Now, given A1 and A3, by the empirical process version of the block bootstrap ofNaik-Nimbalkar and Rajarshi (1994, Theorem 2.1), 1√

T

∑Tt=1 (1{X ∗

t 6 u}−1{Xt6 u})has the same limiting distribution, as a process over U , as 1√

T

∑Tt=1 (1{Xt6 u} −

F(u; � †). Also, given A3 and recalling (A.10), ∇�F(u; X�∗T;S;h) − ∇�F(u; � †) =

oP∗(1) Pr − P. The statement in the theorem then follows from Lemma A2.

144 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Proof of Theorem 3. (i) Recall that � †=�0, under the null. Now, for any given u∈U ,

ZT;S;h(u) =1√T

T∑t=1

1{Xt6 u} −√TS

S∑t=1

1{X �0t 6 u − (X �T; S; h

t; h − X �0t )}

=1√T

T∑t=1

(1{Xt6 u}−F(u; �0))−√TS

S∑t=1

(1{X �0t 6u−(X �T; S; h

t; h −X �0t )}

−F(u − (X �T; S; h

t; h − X �0t ); �0))

−√TS

S∑t=1

(F(u − (X �T; S; h

t; h − X �0t ); �0) − F(u; �0)):

We need to show that,√TS

S∑t=1

(1{X �0t 6 u − (X �T; S; h

t; h − X �0t )} − F(u − (X �T; S; h

t; h − X �0t ); �0))

=

√TS

S∑t=1

(1{X �0t 6 u} − F(u; �0)) + oP(1); (A.13)

where the oP(1) term holds uniformly in u. In order to show (A.13), we !rst need toshow that,

supt6S

|X �T; S; h

t; h − X �0t | = oP(1): (A.14)

First, note that

supt6S

|X �T; S; h

t; h − X �0t |6 sup

t6S|X �0

t; h − X �0t | + sup

t6S|X �T; S; h

t; h − X �0t; h|: (A.15)

As for the !rst term on the RHS of (A.15),

Pr(supt6S

|X �0t; h − X �0

t |¿ j)

6S∑

t=1

Pr(|X �0t; h − X �0

t |¿ j)6 1j2

S∑t=1

E(|X �0t; h − X �0

t |2)

=Sj2 E(|X

�01; h − X �0

1 |2)6CShj2 → 0 as Sh → 0; (A.16)

where the second last equality on the RHS of (A.16) follows from A1 (stationarity-ergodicity of the discrete approximation for all h), while the last inequality, given A2′

and (15), (16) follows from Theorem 10.3.5 in Kloeden and Platen (1999). As for thesecond term on the RHS of (A.15),

supt6S

|X �T; S; h

t; h − X �0t; h|6 sup

t6S|∇′

�XX�T; S; h

t; h (�T;S;h − �0)|;

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 145

where, X�T;S;h ∈ (�T;S;h; �0), and, given Lemma A1, (�T;S;h − �0) = OP(T−1=2). Further,given the domination condition in A6,

Pr(supt6S

T−1=2|∇′�X

X�T; S; h

t; h |¿ j)6

S∑t=1

Pr(T−1=2|∇′�X

X�T; S; h

t; h |¿ j)

6S∑

t=1

Pr(T−1=2Dt ¿ j)6 1j4

ST 2 E(D

4t ) → 0 if

ST 2 → 0:

Then, (A.13) follows because of the stochastic equicontinuity in u of (1=√S)∑S

t=1

(1{X �0t 6 u} − F(u; �0)). Thus,

ZT;S;h(u) =1√T

T∑t=1

(1{Xt6 u} − F(u; �0)) +

√TS

S∑t=1

(1{X �0t 6 u} − F(u; �0))

−√TS

S∑t=1

(F(u − (X �T; S; h

t; h − X �0t ); �0) − F(u; �0)) + oP(1); (A.17)

with the oP(1) term holding uniformly in u. The second term on the RHS of (A.17)is oP(1), for T=S → 0, by the same argument as that used in the proof of Theorem 1.As for the third term on the RHS of (A.17), it can be written as

− 1S

S∑t=1

f(u − (XX�T; S; h

t; h − X �0t ); �0)∇′

�XX�T; S; h

t; h

√T (�T;S;h − �0);

where f denotes the derivative of F with respect to its argument. Given A3, and

recalling (A.14), and Lemma A1, 1S

∑St=1 f(u − (X

X�T; S; h

t; h − X �0t ); �0)∇�X

X�T; S; h

t; h =

E(f(u; �0)∇′�X

�0t; h) + oP(1) = :′

f(�0) + oP(1). Thus,

ZT;S;h(u) =1√T

T∑t=1

(1{Xt6 u} − F(u; �0)) − :′f(�0)

√T (�T;S;h − �0):

The covariance kernel is thus as given in (18), given Lemma A1, and the !nal resultis an immediate consequence of the continuous mapping theorem.(ii) By the same argument as that used in part (i),

ZT;S;h(u) =1√T

T∑t=1

(1{Xt6 u}−F0(u)) −√TS

S∑t=1

(F(u − (X �T; S; h

t; h − X �0t ); �0) − F(u; �0))

−√T (F(u; �0) − F0(u)) + oP(1):

The result then follows by the same arguments as those used in the proof ofTheorem 1.

146 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Proof of Theorem 4.

Z∗T;S;h(u) =

1√T

T∑t=1

(1{X ∗t 6 u} − 1{Xt6 u})

−√TS

S∑t=1

(1{X �∗T; S; h

t; h 6 u} − 1{X �T; S; h

t; h 6 u})

= I1;T − I2;T;S;h:

Now,

I2;T;S;h =

√TS

S∑t=1

(1{X �0t 6 u − (X

�∗T; S; h

t; h − X �0t )} − F(u − (X

�∗T; S; h

t; h − X �0t ); �0))

−√TS

S∑t=1

(1{X �0t 6 u − (X �T; S; h

t; h − X �0t )} − F(u − (X �T; S; h

t; h − X �0t ); �0))

+

√TS

S∑t=1

(F(u − (X�∗T; S; h

t; h − X �0t ); �0)

−F(u − (X �T; S; h

t; h − X �0t ); �0)): (A.18)

By a similar argument to that used in the proof of Theorem 3, and recalling thatT=S → 0, the !rst two terms on the RHS of (A.18) are oP(1), uniformly in u. As forthe last term on the RHS of (A.18), it can be written as:

1S

S∑t=1

f(u − (XX�∗T; S; h

t; h − X �0t ); �0)∇′

�XX�∗T; S; h

t; h

√T (�∗

T;S;h − �0)

− 1S

S∑t=1

f(u − (XX�T; S; h

t; h − X �0t ); �0)∇′

�XX�T; S; h

t; h

√T (�T;S;h − �0)

=1S

S∑t=1

f(u − (XX�∗T; S; h

t; h − X �0t ); �0)∇′

�XX�∗T; S; h

t; h

√T (�∗

T;S;h − �T;S;h)

+1S

S∑t=1

(f(u − (XX�∗T; S; h

t; h − X �0t ); �0)∇′

�XX�∗T; S; h

t; h

−S∑

t=1

f(u − (XX�T; S; h

t; h − X �0t ); �0)∇′

�XX�T; S; h

t; h )√T (�T;S;h − �0)

=1S

S∑t=1

f(u − (XX�∗T; S; h

t; h − X �0t ); �0)∇′

�XX�∗T; S; h

t; h

√T (�∗

T;S;h − �T;S;h)

+oP∗(1); Pr − P;

V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148 147

the desired result then follows from the same arguments as those used in the proof ofTheorem 2.

References

A23t-Sahalia, Y., 1996. Testing continuous time models of the spot interest rate. Review of Financial Studies9, 385–426.

A23t-Sahalia, Y., 1999. Transition densities for interest rate and others nonlinear di#usions. Journal of FinanceLIV, 1361–1395.

A23t-Sahalia, Y., 2002. Maximum likelihood estimation of discretely sampled di#usions: a closed formapproximation approach. Econometrica 70, 223–262.

Altissimo, F., Mele, A., 2002. Testing the closeness of conditional densities by simulated nonparametricmethods. Working Paper, London School of Economics.

Altissimo, F., Mele, A., 2003. Simulated nonparametric estimation of continuous time models of asset pricesand returns. Working Paper, London School of Economics.

Andrews, D.W.K., 1993. An introduction to econometric applications of empirical process theory fordependent random variables. Econometric Reviews 12, 183–216.

Andrews, D.W.K., 1997. A conditional Kolmogorov test. Econometrica 65, 1097–1128.Andrews, D.W.K., Buchinsky, M., 2000. A three step method for choosing the number of bootstrap

replications. Econometrica 68, 23–52.Bai, J., 2003. Testing parametric conditional distributions of dynamic models. Review of Economics and

Statistics 85, 531–549.Bontemps, C., Meddahi, N., 2003. Testing normality: a GMM approach. Journal of Econometrics,

forthcoming.B2uhlmann, P., 1995. The blockwise bootstrap for general empirical processes of stationary sequences.

Stochastic Processes and their Applications 58 (2), 247–266.Carlstein, E., 1986. The use of subseries methods for estimating the variance of a general statistic from a

stationary time series. Annals of Statistics 14, 1171–1179.Chen, X., Hansen, L.P., Scheinkman, J., 2000. Principal components and the long run. Working Paper,

New York University.Conley, T.G., Luttmer, E.G.J., Hansen, L.P., Scheinkman, J.A., 1997. Short-term interest rates as subordinated

di#usions. Review of Financial Studies 10, 525–577.Corradi, V., Swanson, N.R., 2002. A consistent test for out of sample nonlinear predictive ability. Journal

of Econometrics 110, 353–381.Corradi, V., Swanson, N.R., 2003. Evaluation of dynamic stochastic general equilibrium models bases on

distributional comparison of simulated and historical data. Journal of Econometrics, forthcoming.Cox, J.C., Ingersoll, J.E., Ross, S.A., 1985. A theory of the term structure of interest rates. Econometrica

53, 385–407.Dai, Q., Singleton, K.J., 2000. Speci!cation analysis of term structure models. Journal of Finance,

1943–1978.Dai, Q., Singleton, K.J., 2002. Expectation puzzles, time-varying risk premia and aFne models of the term

structure. Journal of Financial Economics 63, 415–441.Dridi, R., 1999. Simulated asymptotic least squares theory. Working Paper, London School of Economics.DuFe, D., Singleton, K., 1993. Simulated moment estimation of Markov models of asset prices. Econometrica

61, 929–952.Fournie, E., 1993. Un test de type Kolmogorov-Smirnov pour processus de di#usion ergodiques. W.P. 1696,

INRIA.Gallant, A.R., Tauchen, G., 1996. Which moments to match. Econometric Theory 12, 657–681.GonOcalves, S., White, H., 2004. Maximum likelihood and the bootstrap for dynamic nonlinear models. Journal

of Econometrics 119, 199–219.Gourieroux, C., Monfort, A., Renault, E., 1993. Indirect inference. Journal of Applied Econometrics 8,

85–118.

148 V. Corradi, N.R. Swanson / Journal of Econometrics 124 (2005) 117–148

Hall, P., 1992. The Bootstrap and Egeworth Expansion. Springer, New York.Hall, A.R., Inoue, A., 2003. The large sample behavior of the generalized method of moments estimator in

misspeci!ed models. Journal of Econometrics, 361–394.Hansen, B.E., 1996. Inference when a nuisance parameter is not identi!ed under the null hypothesis.

Econometrica 64, 413–430.Heston, S.L., 1993. A closed form solution for option with stochastic volatility with applications to bond

and currency options. Review of Financial Studies 6, 327–344.Hong, Y., 2001. Evaluation of out of sample probability density forecasts with applications to S& P 500

stock prices. Working Paper, Cornell University.Hong, Y.M., Li, H., 2003. Nonparametric speci!cation testing for continuous time models with applications

to term structure interest rates. Review of Financial Studies, forthcoming.Hong, Y.M., Li, H., Zhao, F., 2002. Out of sample performance of spot interest rate models. Working Paper,

Cornell University.Hull, J., White, A., 1987. The pricing of options on assets with stochastic volatility. Journal of Finance 42,

281–300.Inoue, A., 2001. Testing for distributional change in time series. Econometric Theory 17, 156–187.Johnson, N.L., Kotz, S., Balakrishnan, N., 1994. Continuous Univariate Distributions I. Wiley, New York.Karlin, S., Taylor, H.M., 1981. A Second Course in Stochastic Processes. Academic Press, San Diego.Kloeden, P.E., Platen, E., 1999. Numerical Solution of Stochastic Di#erential Equations. Springer,

New York.K2unsch, H.R., 1989. The jackknife and the bootstrap for general stationary observations. Annals of Statistics

17, 1217–1241.Lahiri, S.N., 1999. Theoretical comparisons of block bootstrap methods. Annals of Statistics 27, 386–404.Meddahi, N., 2001. An eigenfunction approach for volatility modeling. Working Paper, University of

Montreal.Naik-Nimbalkar, U.V., Rajarshi, M.B., 1994. Validity of blockwise bootstrap for empirical processes with

stationary observations. Annals of Statistics 22, 980–994.Nelson, D.B., 1990. ARCH as di#usion approximations. Journal of Econometrics 45, 7–38.Pardoux, E., Talay, D., 1985. Discretization and simulation of stochastic di#erential equations. Acta

Applicandae Mathematicae 3, 23–47.Peligrad, M., 1998. On the blockwise bootstrap for empirical processes for stationary sequences. Annals of

Probability 26, 877–901.Philipp, W., 1982. Invariance principles for sums of mixing random elements and the multivariate empirical

process. Colloquia Mathematica Societatis Janos Bolyai 36, 843–873.Politis, D.N., Romano, J.P., 1994a. The stationary bootstrap. Journal of the American Statistical Association

89, 1303–1313.Politis, D.N., Romano, J.P., 1994b. Limit theorems for weakly dependent Hilbert space valued random

variables with applications to the stationary bootstrap. Statistica Sinica 4, 461–476.Politis, D.N., Romano, J.P., Wolf, M., 1999. Subsampling. Springer, New York.Pollard, D., 1990. Empirical Processes: Theory and Applications. In: CBMS, Conference Series in Probability

and Statistics, Vol. 2. Institute of Mathematical Statistics, Hayward, CA.Stramer, O., Tweedie, R.L., 1997. Geometric and subgeometric convergence of di#usions with given

stationary distribution and their discretization. Working Paper, University of Iowa and Colorado StateUniversity.

Thompson, S.B., 2002. Evaluating the goodness of !t of conditional distributions with an application to theaFne term structure. Working Paper, Harvard University.

Wong, E., 1964. The construction of a class of stationary Markov processes. In: Bellman, R. (Ed.), SixteenthSymposia in Applied Mathematics: Stochastic Processes in Mathematical Physics and Engineering.American Mathematical Society, Providence, RI.