HAL Id: hal-00005371, https://hal.archives-ouvertes.fr/hal-00005371v2. Submitted on 8 Feb 2007.
To cite this version: Jean-François Coeurjolly. Hurst exponent estimation of locally self-similar Gaussian processes using sample quantiles. Annals of Statistics, Institute of Mathematical Statistics, 2008, 36 (3), pp. 1404-1434. doi:10.1214/009053607000000587. hal-00005371v2.
Hurst exponent estimation of locally self-similar Gaussian processes using sample quantiles

By Jean-François Coeurjolly^1
University of Grenoble 2, France
This paper is devoted to the introduction of a new class of consistent estimators of the fractal dimension of locally self-similar Gaussian processes. These estimators are based on convex combinations of sample quantiles of discrete variations of a sample path over a discrete grid of the interval [0, 1]. We derive the almost sure convergence and the asymptotic normality of these estimators. The key ingredient is a Bahadur representation for sample quantiles of non-linear functions of Gaussian sequences with correlation function decreasing as $k^{-\alpha}L(k)$, for some $\alpha > 0$ and some slowly varying function $L(\cdot)$.
1 Introduction
Many naturally occurring phenomena can be effectively modelled using self-similar processes. Among the simplest models, one can consider the fractional Brownian motion, introduced in the statistics community by Mandelbrot and Van Ness (1968). Fractional Brownian motion can be defined as the only centered Gaussian process, denoted by $(X(t))_{t\in\mathbb{R}}$, with stationary increments and with variance function $v(\cdot)$ given by $v(t) = \sigma^2|t|^{2H}$ for all $t \in \mathbb{R}$. The fractional Brownian motion is an $H$-self-similar process, that is, for all $c > 0$,
$$(X(ct))_{t\in\mathbb{R}} \stackrel{d}{=} c^H (X(t))_{t\in\mathbb{R}}$$
(where $\stackrel{d}{=}$ means equality of finite-dimensional distributions), with autocovariance function behaving like $O(|k|^{2H-2})$ as $|k| \to +\infty$. So the discretized increments of the fractional Brownian motion (called the fractional Gaussian noise) constitute a short-range dependent process when $H < 1/2$,
^1 Supported by a grant from IMAG Project AMOA.
AMS 2000 subject classifications: Primary 60G18; secondary 62G30.
Key words and phrases: locally self-similar Gaussian process, fractional Brownian motion, Hurst exponent estimation, Bahadur representation of sample quantiles.
and a long-range dependent process when $H > 1/2$. The index $H$ also characterizes the path regularity, since the fractal dimension of the fractional Brownian motion is equal to $D = 2 - H$. According to the context (long-range dependent processes, self-similar processes, ...), a very large variety of estimators of the parameter $H$ has been investigated. The reader is referred to Beran (1994), Coeurjolly (2000) or Bardet et al. (2003) for an overview of this problem. Among the most often used estimators are: methods based on the variogram; on the log-periodogram, e.g. Geweke and Porter-Hudak (1983),
in the context of long-range dependent processes; the maximum likelihood estimator (and the Whittle estimator) when the model is parametric, e.g. fractional Gaussian noise; methods based on the wavelet decomposition, e.g. Flandrin (1992) or Stoev et al. (2006) and the references therein; and methods based on discrete filtering, studied by Kent and Wood (1997), Istas and Lang (1997) and Coeurjolly (2001). We are mainly interested in the last one, which
has several similarities with the wavelet decomposition method. Following Constantine
and Hall (1994), Kent and Wood (1997), Istas and Lang (1997), in the case when the
process is observed at times i/n for i = 1, . . . , n, this method is adapted to a larger class
than the fractional Brownian motion, namely the class of centered Gaussian processes
with stationary increments that are locally self-similar (at zero). A process (X(t))t∈R is
said to be locally self-similar (at zero) if its variance function, denoted by v(·), satisfies
$$v(t) = \mathbb{E}(X(t)^2) = \sigma^2|t|^{2H}\,(1 + r(t)), \quad \text{with } r(t) = o(1) \text{ as } |t| \to 0, \qquad (1)$$
for some 0 < H < 1. An estimator of H is derived by using the stationarity of the
increments and the local behavior of the variance function. When observing the process
at regular subdivisions, the stationarity of the increments is crucial since the method
based on discrete filtering (and the one based on the wavelet decomposition) essentially
uses the fact that the variance of the increments can be estimated by the sample moment
of order 2. We do not believe that this framework could be valid for the estimation of the Hurst exponent of the Riemann-Liouville process, e.g. Alos et al. (1999), which is an $H$-self-similar centered Gaussian process whose increments satisfy only some kind of local stationarity; see Remark 2 for more details.
Let us be more specific about the construction of the wavelet decomposition method, see e.g. Flandrin (1992): the authors noticed that the variance of the wavelet coefficient at a given scale $j$ behaves like $2^{j(2H-1)}$. An estimator of $H$ is then derived by regressing the logarithm of the sample moment of order 2 at each scale against $\log(j)$ for various scales. This procedure exhibits good properties, since it is also proved that the more vanishing moments the wavelet has, the more decorrelated the observations are, so asymptotic results are quite easy to obtain. However, Stoev et al. (2006) illustrate the fact that this kind of estimator is very sensitive to additive outliers and to non-stationary artefacts. Therefore, they mainly propose to replace, at each scale, the sample moment of order 2 by the sample median of the squared coefficients. This procedure, for which the authors assert that no theoretical result is available, is clearly more robust.
The main objective of this paper is to extend the procedure proposed by Stoev et al. (2006) by deriving semi-parametric estimators of the parameter $H$, using discrete filtering methods, for the class of processes defined by (1). The procedure is extended in the sense that we consider either convex combinations of sample quantiles or trimmed means. Moreover, we provide convergence results. The key ingredient is a Bahadur representation of sample quantiles obtained in a certain dependence framework. Let $Y = (Y(1), \ldots, Y(n))$ be a vector of $n$ i.i.d. random variables with cumulative distribution function $F$, and denote by $\xi(p)$ and $\widehat{\xi}(p)$ the quantile and the sample quantile of order $p$, respectively. By assuming that $F'(\xi(p)) > 0$ and that $F''(\xi(p))$ exists, Bahadur proved that, as $n \to +\infty$,
$$\widehat{\xi}(p) - \xi(p) = \frac{p - \widehat{F}(\xi(p))}{f(\xi(p))} + r_n,$$
with $r_n = O_{a.s.}(n^{-3/4}\log(n)^{3/4})$, where $\widehat{F}$ denotes the sample cumulative distribution function and $f = F'$. Using a law of the iterated logarithm type result, Kiefer obtained the exact rate $n^{-3/4}\log\log(n)^{3/4}$. Extensions of the above results to dependent random variables have been pursued in Sen and Ghosh (1972) for $\phi$-mixing variables, in Yoshihara (1995) for strongly mixing variables, and recently in Wu (2005) for short-range and long-range dependent linear processes, following works of Hesse (1990) and Ho and Hsing (1996). Our contribution is to provide a Bahadur representation for sample quantiles in another context, that is, for non-linear functions of Gaussian processes with correlation function decreasing as $k^{-\alpha}L(k)$, for some $\alpha > 0$ and some slowly varying function $L(\cdot)$. The bounds for $r_n$ are obtained under the same assumptions as those used by Bahadur (1966).
The paper is organized as follows. In Section 2, we give some basic notation and some background on discrete filtering. In Section 3, we derive semi-parametric estimators of the parameter $H$, when a single sample path of a process defined by (1) is observed over a discrete grid of the interval [0, 1]. Section 4 presents the main results: Bahadur representations and asymptotic results for our estimators. Section 5 presents numerical computations comparing the theoretical asymptotic variances of our estimators, together with a simulation study. In particular, we illustrate the relative efficiency with respect to the Whittle estimator and the fact that such estimators are more robust than classical ones. Finally, proofs of the different results are presented in Section 6.
2 Some notation and background on discrete filtering
Given some random variable $Y$, $F_Y(\cdot)$ denotes the cumulative distribution function of $Y$ and $\xi_Y(p)$ the quantile of order $p$, $0 < p < 1$. If $F_Y(\cdot)$ is absolutely continuous with respect to Lebesgue measure, the probability density function is denoted by $f_Y(\cdot)$. The cumulative distribution (resp. probability density) function of a standard Gaussian variable is denoted by $\Phi(\cdot)$ (resp. $\phi(\cdot)$). Based on the observation of a vector $Y = (Y(1), \ldots, Y(n))$ of $n$ random variables distributed as $Y$, the sample cumulative distribution function and the sample quantile of order $p$ are respectively denoted by $\widehat{F}_Y(\cdot; Y)$ and $\widehat{\xi}_Y(p; Y)$, or simply by $\widehat{F}(\cdot; Y)$ and $\widehat{\xi}(p; Y)$. Finally, for some measurable function $g(\cdot)$, we denote by $g(Y)$ the vector of length $n$ with real components $g(Y(i))$, for $i = 1, \ldots, n$.
A sequence of real numbers $u_n$ is said to be $O(v_n)$ (resp. $o(v_n)$) for another sequence of real numbers $v_n$ if $u_n/v_n$ is bounded (resp. converges to 0 as $n \to +\infty$). A sequence of random variables $U_n$ is said to be $O_{a.s.}(v_n)$ (resp. $o_{a.s.}(v_n)$) if $U_n/v_n$ is almost surely bounded (resp. converges towards 0 with probability 1).
The statistical model corresponds to a discretized version $X = (X(i/n))_{i=1,\ldots,n}$ of a locally self-similar Gaussian process defined by (1).
One of the ideas of our method is to construct estimators by using properties of the variance of the increments of $X$, or of the variance of the increments of order 2 of $X$. While considering the increments of $X$ is conventional, since the associated sequence is stationary, considering the increments of order 2 (or of a higher order) may seem less natural. However, its main interest lies in the fact that the observations of the latter sequences are less correlated than those of the simple increments' sequence. All these vectors can actually be seen as special discrete filterings of the vector $X$. Let us now specify some general background on discrete filtering and its consequences on the correlation structure. The vector $a$ is a filter of length $\ell + 1$ and of order $\nu \geq 1$
with real components if
$$\sum_{q=0}^{\ell} q^j a_q = 0 \ \text{ for } j = 0, \ldots, \nu - 1, \qquad \text{and} \qquad \sum_{q=0}^{\ell} q^{\nu} a_q \neq 0.$$
For example, $a = (1, -1)$ (resp. $a = (1, -2, 1)$) is a filter of order 1 (resp. 2). Let $X^a$ be the series obtained by filtering $X$ with $a$; then
$$X^a\!\left(\frac{i}{n}\right) = \sum_{q=0}^{\ell} a_q\, X\!\left(\frac{i - q}{n}\right) \quad \text{for } i \geq \ell + 1.$$
Applying in turn the filters $a = (1, -1)$ and $a = (1, -2, 1)$ leads to the increments of $X$ and to the increments of $X$ of order 2, respectively. One may also consider other filters, such as
Daubechies wavelet filters, e.g. Daubechies (1992).
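As a concrete illustration, the defining sums above can be checked numerically. The following is a minimal Python/numpy sketch (the function name `filter_order` is ours, not the paper's), using the convention $0^0 = 1$:

```python
import numpy as np

def filter_order(a, max_order=10):
    """Order nu of a filter a: the smallest j with sum_q q^j a_q != 0
    (the defining vanishing-moment sums, with the convention 0^0 = 1)."""
    a = np.asarray(a, dtype=float)
    q = np.arange(len(a), dtype=float)
    for j in range(max_order):
        if not np.isclose(np.sum(q ** j * a), 0.0):
            return j
    raise ValueError("order larger than max_order")

assert filter_order([1, -1]) == 1       # a = (1, -1): the increments
assert filter_order([1, -2, 1]) == 2    # a = (1, -2, 1): increments of order 2
```

The same check applied to a Daubechies wavelet filter with two zero moments returns 2, consistent with its use as an order-2 filter below.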
The following assumption is needed by different results presented hereafter:

Assumption A1(k): for $i = 1, \ldots, k$,
$$v^{(i)}(t) = \sigma^2 \beta^{(i)} |t|^{2H-i} + o\!\left(|t|^{2H-i}\right),$$
with $\beta^{(i)} = 2H(2H-1)\cdots(2H-i+1)$ (where $k \geq 1$ is an integer).

This assumption ensures that the variance function $v(\cdot)$ is sufficiently smooth around 0. It allows us to assert that the correlation structure of a locally self-similar, discretized and filtered Gaussian process can be compared to that of the fractional Brownian motion. This is stated more precisely in the following lemma.
Lemma 1 (e.g. Kent and Wood (1997)) Let $a$ and $a'$ be two filters of lengths $\ell + 1$ and $\ell' + 1$, and of orders $\nu$ and $\nu' \geq 1$. Then we have:
$$\mathbb{E}\left( X^a\!\left(\frac{i}{n}\right) X^{a'}\!\left(\frac{i+j}{n}\right) \right) = -\frac{\sigma^2}{2} \sum_{q=0}^{\ell}\sum_{q'=0}^{\ell'} a_q a'_{q'}\, v\!\left(\frac{q - q' + j}{n}\right) = \gamma_n^{a,a'}(j)\left(1 + \delta_n^{a,a'}(j)\right), \qquad (2)$$
with
$$\gamma_n^{a,a'}(j) = \frac{\sigma^2}{n^{2H}}\, \gamma^{a,a'}(j), \qquad \gamma^{a,a'}(j) = -\frac{1}{2} \sum_{q=0}^{\ell}\sum_{q'=0}^{\ell'} a_q a'_{q'}\, |q - q' + j|^{2H}, \qquad (3)$$
and
$$\delta_n^{a,a'}(j) = \frac{-\frac{1}{2}\sum_{q,q'} a_q a'_{q'}\, |q - q' + j|^{2H}\, r\!\left(\frac{q-q'+j}{n}\right)}{\gamma^{a,a'}(j)}. \qquad (4)$$
Moreover, as $|j| \to +\infty$,
$$\gamma^{a,a'}(j) = O\!\left(|j|^{2H - \nu - \nu'}\right). \qquad (5)$$
Finally, under Assumption A1(ν + ν′), as $n \to +\infty$,
$$\delta_n^{a,a'}(j) = o(1). \qquad (6)$$
Remark 1 In the case of the fractional Brownian motion, the sequence $\delta_n$ is equal to 0, whereas it converges towards 0 for more general locally self-similar Gaussian processes, such as the Gaussian processes with stationary increments and with variance function $v(t) = 1 - \exp(-|t|^{2H})$ or $v(t) = \log(1 + |t|^{2H})$, for which Assumption A1(k) is satisfied (for every $k \geq 1$).
Remark 2 The stationarity of the increments and the local self-similarity required of the process $X(\cdot)$ are important if the process is observed at times $i/n$ for $i = 1, \ldots, n$. The crucial result of Lemma 1 is that the variance function of the filtered series behaves asymptotically as $\gamma_n^{a}(0)$. It seems difficult to relax the constraint of stationarity. Consider for example the Riemann-Liouville process, e.g. Alos et al. (1999). This process is an $H$-self-similar Gaussian process, but its increments satisfy only some kind of local stationarity. Following the computations of Lim (2001), the variance of the increments' series of the Riemann-Liouville process is equal to
$$\mathbb{E}\left( \left( X\!\left(\frac{i+1}{n}\right) - X\!\left(\frac{i}{n}\right) \right)^2 \right) = \frac{1}{n^{2H}}\, \frac{1}{\Gamma(H + 1/2)^2} \left\{ I + \frac{1}{2H} \right\},$$
with $I = \int_0^i \left( (1+u)^{H-1/2} - u^{H-1/2} \right)^2 du + \int_0^{i/n} u^{2H-1}\, du$. This quantity cannot be asymptotically independent of the time index $i$. Note that this could be the case if the process were observed at irregular subdivisions. This question has not been investigated.
Define $Y^a$ as the vector $X^a$ normalized to variance 1. The covariance between $Y^a(i/n)$ and $Y^{a'}((i+j)/n)$ is denoted by $\rho_n^{a,a'}(j)$. Under Assumption A1(ν + ν′), the following equivalence holds as $n \to +\infty$:
$$\rho_n^{a,a'}(j) \sim \rho^{a,a'}(j) = \frac{\gamma^{a,a'}(j)}{\sqrt{\gamma^{a,a}(0)\,\gamma^{a',a'}(0)}}. \qquad (7)$$
When $a = a'$, we set, for the sake of simplicity, $\gamma_n^{a}(\cdot) = \gamma_n^{a,a}(\cdot)$, $\delta_n^{a}(\cdot) = \delta_n^{a,a}(\cdot)$, $\rho_n^{a}(\cdot) = \rho_n^{a,a}(\cdot)$, $\gamma^{a}(\cdot) = \gamma^{a,a}(\cdot)$ and $\rho^{a}(\cdot) = \rho^{a,a}(\cdot)$.
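The decay rate (5) can be verified numerically from the explicit formula (3) in the case $a = a'$. The following is an illustrative Python/numpy sketch (helper names ours); the log-log slope of $|\gamma^a(j)|$ should approach $2H - 2\nu$:

```python
import numpy as np

def gamma_a(a, j, H):
    """gamma^a(j) = -1/2 sum_{q,q'} a_q a_{q'} |q - q' + j|^{2H}  (Eq. (3), a = a')."""
    q = np.arange(len(a), dtype=float)
    d = np.abs(q[:, None] - q[None, :] + j) ** (2.0 * H)
    return -0.5 * np.einsum('i,j,ij->', a, a, d)

a, H, nu = np.array([1.0, -2.0, 1.0]), 0.7, 2     # second-order increments, nu = 2
js = np.array([50, 100, 200, 400])
g = np.array([abs(gamma_a(a, j, H)) for j in js])
# Eq. (5): gamma^a(j) = O(|j|^{2H - 2 nu}); here 2H - 2*nu = -2.6, so the
# squared correlation rho^a(.)^2 is summable for this filter and this H
slopes = np.diff(np.log(g)) / np.diff(np.log(js))
assert np.all(np.abs(slopes - (2 * H - 2 * nu)) < 0.1)
```

For the order-1 filter $(1, -1)$ and large $H$, the same experiment exhibits the slow decay $|j|^{2H-2}$ responsible for the long-range dependent regimes discussed in Section 4.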
Hurst exponent estimation using sample quantiles 7
3 New estimators of H
3.1 Estimators based on a convex combination of sample quantiles
Let $(p, c) = (p_k, c_k)_{k=1,\ldots,K} \in ((0, 1) \times \mathbb{R}^+)^K$ for an integer $1 \leq K < +\infty$. Define the following statistic, based on a convex combination of sample quantiles:
$$\widehat{\xi}(p, c; X^a) = \sum_{k=1}^{K} c_k\, \widehat{\xi}(p_k; X^a), \qquad (8)$$
where the $c_k$, $k = 1, \ldots, K$, are positive real numbers such that $\sum_{k=1}^{K} c_k = 1$. For example, this corresponds to the sample median when $K = 1$, $p = 1/2$, $c = 1$, and to a mean of quartiles when $K = 2$, $p = (1/4, 3/4)$, $c = (1/2, 1/2)$. Consider the following computation: from Lemma 1, we have, as $n \to +\infty$,
$$\widehat{\xi}(p, c; X^a) \sim \left( \frac{\sigma^2}{n^{2H}}\, \gamma^a(0) \right)^{1/2} \widehat{\xi}(p, c; Y^a).$$
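The statistic (8) is elementary to compute. A minimal Python/numpy sketch (the name `xi_hat` is ours) recovering the two examples above:

```python
import numpy as np

def xi_hat(x, p, c):
    """Convex combination of sample quantiles, Eq. (8): sum_k c_k * xi_hat(p_k; x)."""
    p, c = np.atleast_1d(p), np.atleast_1d(c)
    assert np.all(c > 0) and np.isclose(c.sum(), 1.0), "c must be convex weights"
    return float(c @ np.quantile(x, p))

x = np.arange(1.0, 101.0)
# K = 1, p = 1/2, c = 1: the sample median
assert xi_hat(x, 0.5, 1.0) == np.median(x)
# K = 2, p = (1/4, 3/4), c = (1/2, 1/2): the mean of quartiles
assert np.isclose(xi_hat(x, [0.25, 0.75], [0.5, 0.5]),
                  (np.quantile(x, 0.25) + np.quantile(x, 0.75)) / 2)
```

(Any of numpy's quantile interpolation conventions may be used here; the asymptotic results are insensitive to this finite-sample choice.)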
Remark 3 It may be expected that $\widehat{\xi}(p, c; Y^a)$ converges towards a constant as $n \to +\infty$. In itself, this result is not interesting, since two parameters remain unknown ($\sigma^2$ and $H$), and thus it is impossible to derive an estimator of $H$.
Remark 3 suggests that we have to use at least two filters. Among all available filters,
let us consider the sequence $(a^m)_{m\geq 1}$ defined by
$$a^m_i = \begin{cases} a_j & \text{if } i = jm, \\ 0 & \text{otherwise,} \end{cases} \qquad i = 0, \ldots, m\ell,$$
which is none other than the filter $a$ dilated $m$ times. For example, if the filter $a = a^1$ corresponds to the filter $(1, -2, 1)$, then $a^2 = (1, 0, -2, 0, 1)$, $a^3 = (1, 0, 0, -2, 0, 0, 1)$, and so on. As noted by Kent and Wood (1997) or Istas and Lang (1997), the filter $a^m$, of length $m\ell + 1$, is of order $\nu$ and has the following interesting property:
$$\gamma^{a^m}(0) = m^{2H}\, \gamma^{a}(0). \qquad (9)$$
From Lemma 1, this simply means that $\mathbb{E}\left(X^{a^m}(i/n)^2\right) = m^{2H}\, \mathbb{E}\left(X^{a}(i/n)^2\right)$, exhibiting some kind of self-similarity property of the filtered coefficients. As specified in the introduction, the same property can be pointed out in the context of the wavelet decomposition.
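Property (9) is an exact algebraic identity for the quantities defined in (3), whatever the filter, the dilation factor and $H$. A short numerical confirmation in Python/numpy (helper names ours):

```python
import numpy as np

def dilate(a, m):
    """a^m: the filter a dilated m times (a^m_{jm} = a_j, zero elsewhere)."""
    out = np.zeros(m * (len(a) - 1) + 1)
    out[::m] = a
    return out

def gamma0(a, H):
    """gamma^a(0) from Eq. (3): -1/2 sum_{q,q'} a_q a_{q'} |q - q'|^{2H}."""
    q = np.arange(len(a), dtype=float)
    d = np.abs(q[:, None] - q[None, :]) ** (2.0 * H)
    return -0.5 * np.einsum('i,j,ij->', a, a, d)

a = np.array([1.0, -2.0, 1.0])
assert np.allclose(dilate(a, 2), [1, 0, -2, 0, 1])
# property (9): gamma^{a^m}(0) = m^{2H} gamma^a(0), for every m and every H
for H in (0.3, 0.5, 0.8):
    for m in (2, 3, 5):
        assert np.isclose(gamma0(dilate(a, m), H), m ** (2 * H) * gamma0(a, H))
```

The identity follows directly from (3), since $|jm - j'm|^{2H} = m^{2H}|j - j'|^{2H}$; this is what makes the log-regression on $\log(m)$ below possible.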
Our methods, which exploit the nice property (9), are based on a convex combination of sample quantiles $\widehat{\xi}(p, c; g(X^{a^m}))$ for two positive functions $g(\cdot)$: $g(\cdot) = |\cdot|^{\alpha}$ for $\alpha > 0$, and $g(\cdot) = \log|\cdot|$. For such functions $g(\cdot)$ we manage, by using some properties established in Lemma 1, to define some very simple estimators of the Hurst exponent through a simple linear regression. Other choices of the function $g(\cdot)$ have not been investigated in this paper. At this stage, let us specify that our methods extend the one proposed by Stoev et al. (2006); indeed, they only consider the statistic $\widehat{\xi}(p, c; g(X^{a^m}))$ for $p = 1/2$, $c = 1$, $g(\cdot) = (\cdot)^2$, that is, the sample median of the squared coefficients.

From (3) and (9), we have
$$\widehat{\xi}\left(p, c; |X^{a^m}|^{\alpha}\right) = \mathbb{E}\left( (X^{a^m}(1/n))^2 \right)^{\alpha/2} \widehat{\xi}\left(p, c; |Y^{a^m}|^{\alpha}\right) = m^{\alpha H}\, \frac{\sigma^{\alpha}}{n^{\alpha H}}\, \gamma^a(0)^{\alpha/2} \left(1 + \delta_n^{a^m}(0)\right)^{\alpha/2} \widehat{\xi}\left(p, c; |Y^{a^m}|^{\alpha}\right), \qquad (10)$$
and
$$\widehat{\xi}\left(p, c; \log|X^{a^m}|\right) = \frac{1}{2}\log \mathbb{E}(X^{a^m}(1/n))^2 + \widehat{\xi}\left(p, c; \log|Y^{a^m}|\right) = H\log(m) + \frac{1}{2}\log\left( \frac{\sigma^2}{n^{2H}}\, \gamma^a(0) \right) + \frac{1}{2}\log\left(1 + \delta_n^{a^m}(0)\right) + \widehat{\xi}\left(p, c; \log|Y^{a^m}|\right). \qquad (11)$$
Denote by $\kappa_H = n^{-2H}\sigma^2\gamma^a(0)$. Equations (10) and (11) can be rewritten as
$$\log \widehat{\xi}\left(p, c; |X^{a^m}|^{\alpha}\right) = \alpha H \log(m) + \log\left( \kappa_H^{\alpha/2}\, \xi_{|Y|^{\alpha}}(p, c) \right) + \varepsilon^{\alpha}_m, \qquad (12)$$
$$\widehat{\xi}\left(p, c; \log|X^{a^m}|\right) = H\log(m) + \log\left(\kappa_H^{1/2}\right) + \xi_{\log|Y|}(p, c) + \varepsilon^{\log}_m, \qquad (13)$$
with the random variables $\varepsilon^{\alpha}_m$ and $\varepsilon^{\log}_m$ respectively defined by
$$\varepsilon^{\alpha}_m = \log\left( \frac{\widehat{\xi}(p, c; |Y^{a^m}|^{\alpha})}{\xi_{|Y|^{\alpha}}(p, c)} \right) + \frac{\alpha}{2}\log\left(1 + \delta_n^{a^m}(0)\right), \qquad (14)$$
and
$$\varepsilon^{\log}_m = \widehat{\xi}\left(p, c; \log|Y^{a^m}|\right) - \xi_{\log|Y|}(p, c) + \frac{1}{2}\log\left(1 + \delta_n^{a^m}(0)\right), \qquad (15)$$
where, for some random variable $Z$, $\xi_Z(p, c) = \sum_{k=1}^{K} c_k\, \xi_Z(p_k)$. We rewrite Equations (10) and (11) as (12) and (13) because we expect $\varepsilon^{\alpha}_m$ and $\varepsilon^{\log}_m$ to converge (almost surely) towards 0 as $n \to +\infty$.
From Remark 3, two estimators of $H$ can be defined through a linear regression of $\left(\log \widehat{\xi}(p, c; |X^{a^m}|^{\alpha})\right)_{m=1,\ldots,M}$ and $\left(\widehat{\xi}(p, c; \log|X^{a^m}|)\right)_{m=1,\ldots,M}$ on $(\log m)_{m=1,\ldots,M}$, for some $M \geq 2$. These estimators are denoted by $\widehat{H}^{\alpha}$ and $\widehat{H}^{\log}$. Denoting by $A$ the vector of length $M$ with components $A_m = \log m - \frac{1}{M}\sum_{m=1}^{M}\log(m)$, $m = 1, \ldots, M$, we have explicitly, from (12) and (13) and the definition of least squares estimates (see e.g. Antoniadis et al. (1992)):
$$\widehat{H}^{\alpha} = \frac{A^T}{\alpha\|A\|^2} \left( \log \widehat{\xi}\left(p, c; |X^{a^m}|^{\alpha}\right) \right)_{m=1,\ldots,M}, \qquad (16)$$
$$\widehat{H}^{\log} = \frac{A^T}{\|A\|^2} \left( \widehat{\xi}\left(p, c; \log|X^{a^m}|\right) \right)_{m=1,\ldots,M}, \qquad (17)$$
where $\|z\|$, for some vector $z$ of length $d$, denotes the norm defined by $\left( \sum_{i=1}^{d} z_i^2 \right)^{1/2}$. We can point out that $\widehat{H}^{\alpha}$ and $\widehat{H}^{\log}$ are independent of the scaling coefficient $\sigma^2$.
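The whole procedure fits in a few lines. Here is an illustrative Python/numpy sketch of (17) in the simplest case $K = 1$, $p = 1/2$ (the sample median); function names are ours, and the scale invariance noted above can be checked exactly, since multiplying $X$ by a constant shifts every median of $\log|X^{a^m}|$ by the same amount, which $A$ (summing to zero) annihilates:

```python
import numpy as np

def filter_series(X, a, m):
    """X^{a^m}(i/n) = sum_q (a^m)_q X((i-q)/n): filter X with a dilated m times."""
    am = np.zeros(m * (len(a) - 1) + 1)
    am[::m] = a
    return np.convolve(X, am, mode='valid')

def hurst_log_median(X, a=(1.0, -2.0, 1.0), M=5):
    """Sketch of H^log, Eq. (17), with K = 1, p = 1/2: regress the sample
    median of log|X^{a^m}| on log(m), m = 1, ..., M."""
    logm = np.log(np.arange(1, M + 1))
    A = logm - logm.mean()
    xi = [np.median(np.log(np.abs(filter_series(X, np.asarray(a), m))))
          for m in range(1, M + 1)]
    return float(A @ np.array(xi) / (A @ A))

# exact scale invariance, and a sanity check on a discretized Brownian motion
X = np.random.default_rng(0).standard_normal(1000).cumsum()   # H = 1/2 case
assert np.isclose(hurst_log_median(X), hurst_log_median(7.5 * X))
assert 0.2 < hurst_log_median(X) < 0.8
```

The last assertion is deliberately loose: for $n = 1000$ the estimator is consistent but carries the asymptotic variance studied in Section 4.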
3.2 Estimators based on trimmed means
Let $0 < \beta_1 \leq \beta_2 < 1$ and $\beta = (\beta_1, \beta_2)$; denote by $\overline{g(X^a)}^{(\beta)}$ the $\beta$-trimmed mean of the vector $g(X^a)$, given by
$$\overline{g(X^a)}^{(\beta)} = \frac{1}{n - [n\beta_2] - [n\beta_1]} \sum_{i=[n\beta_1]+1}^{n-[n\beta_2]} (g(X^a))_{(i),n},$$
where $(g(X^a))_{(1),n} \leq (g(X^a))_{(2),n} \leq \ldots \leq (g(X^a))_{(n),n}$ are the order statistics of $(g(X^a))_1, \ldots, (g(X^a))_n$. It is well known that $(g(X^a))_{(i),n} = \widehat{\xi}\left( \frac{i}{n}; g(X^a) \right)$. Hence,
by following the ideas of the previous section, one may obtain
$$\log\left( \overline{|X^{a^m}|^{\alpha}}^{(\beta)} \right) = \alpha H \log(m) + \log\left( \kappa_H^{\alpha/2}\, \overline{|Y|^{\alpha}}^{(\beta)} \right) + \varepsilon^{\alpha,tm}_m, \qquad (18)$$
$$\overline{\log|X^{a^m}|}^{(\beta)} = H\log(m) + \log\left(\kappa_H^{1/2}\right) + \overline{\log|Y|}^{(\beta)} + \varepsilon^{\log,tm}_m, \qquad (19)$$
with
$$\varepsilon^{\alpha,tm}_m = \log\left( \frac{\overline{|Y^{a^m}|^{\alpha}}^{(\beta)}}{\overline{|Y|^{\alpha}}^{(\beta)}} \right) + \frac{\alpha}{2}\log\left(1 + \delta_n^{a^m}(0)\right), \qquad (20)$$
and
$$\varepsilon^{\log,tm}_m = \overline{\log|Y^{a^m}|}^{(\beta)} - \overline{\log|Y|}^{(\beta)} + \frac{1}{2}\log\left(1 + \delta_n^{a^m}(0)\right), \qquad (21)$$
where, for some random variable $Z$, $\overline{Z}^{(\beta)}$ refers to
$$\overline{Z}^{(\beta)} = \frac{1}{1 - \beta_2 - \beta_1} \int_{\beta_1}^{1-\beta_2} \xi_Z(p)\, dp. \qquad (22)$$
As in the previous section, two estimators of $H$, denoted by $\widehat{H}^{\alpha,tm}$ and $\widehat{H}^{\log,tm}$, are derived through a log-linear regression:
$$\widehat{H}^{\alpha,tm} = \frac{A^T}{\alpha\|A\|^2} \left( \log \overline{|X^{a^m}|^{\alpha}}^{(\beta)} \right)_{m=1,\ldots,M}, \qquad (23)$$
$$\widehat{H}^{\log,tm} = \frac{A^T}{\|A\|^2} \left( \overline{\log|X^{a^m}|}^{(\beta)} \right)_{m=1,\ldots,M}. \qquad (24)$$

Remark 4 The estimator referred to as the "estimator based on the quadratic variations" in the simulation study, studied with the same formalism by Coeurjolly (2001), corresponds to the estimator $\widehat{H}^{\alpha,tm}$ with $\alpha = 2$, $\beta_1 = \beta_2 = 0$.
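The trimmed mean of Section 3.2 reduces to averaging the central order statistics; a minimal Python/numpy sketch (the function name is ours), illustrating both the degenerate case of Remark 4 ($\beta_1 = \beta_2 = 0$, the ordinary sample mean) and the robustness to outliers that motivates trimming:

```python
import numpy as np

def trimmed_mean(z, beta1, beta2):
    """beta-trimmed mean of Section 3.2: average of the order statistics of
    ranks [n*beta1] + 1, ..., n - [n*beta2]."""
    z = np.sort(np.asarray(z, dtype=float))
    n = len(z)
    lo, hi = int(n * beta1), n - int(n * beta2)
    return z[lo:hi].mean()

z = np.arange(1.0, 11.0)
# beta1 = beta2 = 0 gives back the ordinary sample mean (quadratic-variation
# case of Remark 4 when g(.) = |.|^2)
assert trimmed_mean(z, 0.0, 0.0) == z.mean()
# trimming discards extreme order statistics, hence the robustness
assert trimmed_mean([1.0] * 9 + [1000.0], 0.0, 0.1) == 1.0
```

Plugging this statistic into the log-regression of (23)-(24) gives the trimmed-mean estimators in the same way as in the sample-quantile case.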
4 Main results
To simplify the presentation of the different results, consider the following two assumptions on the parameters involved in the estimation procedures:

Assumption A2(p, c): $a$ is a filter of order $\nu \geq 1$; $\alpha$ is a positive real number; $p$ (resp. $c$) is a vector of length $K$ (for some $1 \leq K < +\infty$) such that $0 < p_k < 1$ (resp. $c_k > 0$ and $\sum_{k=1}^{K} c_k = 1$); $M$ is an integer $\geq 2$.

Assumption A3(β): $a$ is a filter of order $\nu \geq 1$; $\alpha$ is a positive real number; $\beta = (\beta_1, \beta_2)$ is such that $0 < \beta_1 \leq \beta_2 < 1$; $M$ is an integer $\geq 2$.
Since $A^T(\log(m))_{m=1,\ldots,M} = \|A\|^2$ and $A^T\mathbf{1} = 0$ (where $\mathbf{1} = (1)_{m=1,\ldots,M}$), we have
$$\widehat{H}^{\alpha} - H = \frac{A^T}{\alpha\|A\|^2}\, \varepsilon^{\alpha} \quad \text{and} \quad \widehat{H}^{\log} - H = \frac{A^T}{\|A\|^2}\, \varepsilon^{\log}, \qquad (25)$$
and
$$\widehat{H}^{\alpha,tm} - H = \frac{A^T}{\alpha\|A\|^2}\, \varepsilon^{\alpha,tm} \quad \text{and} \quad \widehat{H}^{\log,tm} - H = \frac{A^T}{\|A\|^2}\, \varepsilon^{\log,tm}, \qquad (26)$$
where $\varepsilon^{\alpha} = (\varepsilon^{\alpha}_m)_{m=1,\ldots,M}$, $\varepsilon^{\log} = (\varepsilon^{\log}_m)_{m=1,\ldots,M}$, $\varepsilon^{\alpha,tm} = (\varepsilon^{\alpha,tm}_m)_{m=1,\ldots,M}$ and $\varepsilon^{\log,tm} = (\varepsilon^{\log,tm}_m)_{m=1,\ldots,M}$. Hence, in order to study the convergence of the different estimators, it is sufficient to obtain convergence results for the sample quantiles $\widehat{\xi}(p; g(Y^a))$ for some function $g(\cdot)$ and some filter $a$. Therefore, we first establish a Bahadur representation of sample quantiles for non-linear functions of Gaussian sequences with correlation function decreasing as $k^{-\alpha}$, for some $\alpha > 0$. In fact, the existing literature on non-linear functions of Gaussian sequences (e.g. Taqqu (1977)) allows us to slightly extend this framework by considering correlation functions decreasing as $k^{-\alpha}L(k)$, for some slowly varying function $L(\cdot)$.
4.1 Bahadur representation of sample quantiles
Let us recall some important definitions concerning Hermite polynomials. The $j$-th Hermite polynomial (for $j \geq 0$) is defined for $t \in \mathbb{R}$ by
$$H_j(t) = \frac{(-1)^j}{\phi(t)}\, \frac{d^j\phi(t)}{dt^j}. \qquad (27)$$
The Hermite polynomials form an orthogonal system for the Gaussian measure. More precisely, we have $\mathbb{E}(H_j(Y)H_k(Y)) = j!\,\delta_{j,k}$. For a measurable function $g(\cdot)$ defined on $\mathbb{R}$ with $\mathbb{E}(g(Y)^2) < +\infty$, the following expansion holds:
$$g(t) = \sum_{j \geq \tau} \frac{c_j}{j!}\, H_j(t) \quad \text{with} \quad c_j = \mathbb{E}(g(Y)H_j(Y)),$$
where the integer $\tau$, defined by $\tau = \inf\{j \geq 0 : c_j \neq 0\}$, is called the Hermite rank of the function $g$. Note that this integer plays an important role. For example, it is related to the correlation of $g(Y_1)$ and $g(Y_2)$ (for $Y_1$ and $Y_2$ two standard Gaussian variables with correlation $\rho$), since $\mathbb{E}(g(Y_1)g(Y_2)) = \sum_{k \geq \tau} \frac{(c_k)^2}{k!}\, \rho^k \leq \rho^{\tau}\, \|g\|^2_{L^2(d\phi)}$.
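Both the orthogonality relation and the notion of Hermite rank are easy to probe numerically. The following illustrative Python sketch (numpy's `hermite_e` module implements the probabilists' Hermite polynomials of (27)) uses Monte Carlo to check orthogonality, and to verify that the centered indicator $h_u(t) = \mathbf{1}_{\{|t| \leq u\}} - P(|Y| \leq u)$ used later in (28), with $g(\cdot) = |\cdot|$, has Hermite rank 2, as asserted in Remark 6:

```python
import numpy as np
from numpy.polynomial import hermite_e as He

rng = np.random.default_rng(1)
Y = rng.standard_normal(10**6)

def hermite(j, t):
    """Probabilists' Hermite polynomial H_j of Eq. (27)."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    return He.hermeval(t, c)

# orthogonality: E(H_j(Y) H_k(Y)) = j! delta_{j,k}, checked by Monte Carlo
assert abs(np.mean(hermite(1, Y) * hermite(2, Y))) < 0.05        # j != k -> 0
assert abs(np.mean(hermite(2, Y) ** 2) - 2.0) < 0.05             # j = k = 2 -> 2!

# Hermite rank: for g = |.|, the centered indicator h_u has c_1 = 0 and
# c_2 = -2*u*phi(u) != 0, i.e. rank 2 (the content of Remark 6)
u = 1.0
h = (np.abs(Y) <= u) - np.mean(np.abs(Y) <= u)
assert abs(np.mean(h * hermite(1, Y))) < 0.01                    # c_1 = 0
assert abs(np.mean(h * hermite(2, Y))) > 0.3                     # c_2 != 0
```

(The tolerances simply reflect Monte Carlo error at $10^6$ samples; the identities themselves are exact.)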
In order to obtain a Bahadur representation (see e.g. Serfling (1980)), we have to ensure that $F'_{g(Y)}(\xi(p)) > 0$ and that $F''_{g(Y)}(\cdot)$ exists and is bounded in a neighborhood of $\xi(p)$. This is achieved if the function $g(\cdot)$ satisfies the following assumption (see e.g. Dacunha-Castelle and Duflo (1982), p. 33).

Assumption A4(ξ(p)): there exist disjoint open sets $U_i$, $i = 1, \ldots, L$, such that each $U_i$ contains a unique solution to the equation $g(t) = \xi_{g(Y)}(p)$, such that $F'_{g(Y)}(\xi(p)) > 0$, and such that $g$ is a $C^2$-diffeomorphism on $\cup_{i=1}^{L} U_i$.
Note that under this assumption,
$$F'_{g(Y)}(\xi_{g(Y)}(p)) = f_{g(Y)}(\xi_{g(Y)}(p)) = \sum_{i=1}^{L} \frac{\phi(g_i^{-1}(\xi(p)))}{|g'(g_i^{-1}(\xi(p)))|},$$
where $g_i(\cdot)$ is the restriction of $g(\cdot)$ to $U_i$. Now define, for some real $u$, the function $h_u(\cdot)$ by
$$h_u(t) = \mathbf{1}_{\{g(t) \leq u\}}(t) - F_{g(Y)}(u). \qquad (28)$$
We denote by $\tau(u)$ the Hermite rank of $h_u(\cdot)$. For the sake of simplicity, we set $\tau_p = \tau(\xi_{g(Y)}(p))$. For some function $g(\cdot)$ satisfying Assumption A4(ξ(p)), we denote by
$$\overline{\tau}_p = \inf_{\gamma \in \cup_{i=1}^{L} g(U_i)} \tau(\gamma) \qquad (29)$$
the minimal Hermite rank of $h_u(\cdot)$ for $u$ in a neighborhood of $\xi_{g(Y)}(p)$.
Theorem 2 Let $\{Y(i)\}_{i=1}^{+\infty}$ be a stationary centered Gaussian process with variance 1 and correlation function $\rho(\cdot)$ such that, as $i \to +\infty$,
$$|\rho(i)| \sim L(i)\, i^{-\alpha}, \qquad (30)$$
for some $\alpha > 0$ and some function $L(\cdot)$ slowly varying at infinity. Then, under Assumption A4(ξ(p)), we have, almost surely, as $n \to +\infty$,
$$\widehat{\xi}(p; g(Y)) - \xi_{g(Y)}(p) = \frac{p - \widehat{F}(\xi_{g(Y)}(p); g(Y))}{f_{g(Y)}(\xi_{g(Y)}(p))} + O_{a.s.}(r_n(\alpha, \overline{\tau}_p)), \qquad (31)$$
the sequence $(r_n(\alpha, \overline{\tau}_p))_{n\geq 1}$ being defined by
$$r_n(\alpha, \overline{\tau}_p) = \begin{cases} n^{-3/4}\log(n)^{3/4} & \text{if } \alpha\overline{\tau}_p > 1, \\ n^{-3/4}\log(n)^{3/4}\, L_{\overline{\tau}_p}(n)^{3/4} & \text{if } \alpha\overline{\tau}_p = 1, \\ n^{-1/2-\alpha\overline{\tau}_p/4}\log(n)^{\overline{\tau}_p/4+1/2}\, L(n)^{\overline{\tau}_p/4} & \text{if } 2/3 < \alpha\overline{\tau}_p < 1, \\ n^{-\alpha\overline{\tau}_p}\log(n)^{\overline{\tau}_p}\, L(n)^{\overline{\tau}_p} & \text{if } 0 < \alpha\overline{\tau}_p \leq 2/3, \end{cases} \qquad (32)$$
where, for some $\tau \geq 1$, $L_{\tau}(n) = \sum_{|i| \leq n} |\rho(i)|^{\tau}$.
Note that if $L(\cdot)$ is an increasing function, $L_{\tau}(n) = O(\log(n)L(n)^{\tau})$.
Remark 5 Without giving any details here, let us say that the behaviour of the sequence $r_n(\cdot,\cdot)$ is related to the nature (short-range or long-range dependence) of the process $\{h_u(Y(i))\}_{i=1}^{+\infty}$ for $u$ in a neighborhood of $\xi_{g(Y)}(p)$. In the case $\alpha\overline{\tau}_p > 1$, corresponding to short-range dependent processes, the result is similar to the one proved by Bahadur, see e.g. Serfling (1980), in the i.i.d. case. For short-range dependent linear processes, using a law of the iterated logarithm type result, Wu (2005) obtained a sharper bound, namely $n^{-3/4}\log\log(n)^{3/4}$. This bound is obtained under the assumption that $F'(\cdot)$ and $F''(\cdot)$ exist and are uniformly bounded. For long-range dependent processes ($\alpha\overline{\tau}_p \leq 1$), we can observe that the rate of convergence is always lower than $n^{-3/4}\log(n)^{3/4}$, and that the dominant term $n^{-3/4}$ is obtained as $\alpha\overline{\tau}_p \to 1$.
We now propose a uniform Bahadur-type representation of sample quantiles. Such a representation has an application in the study of trimmed means. For $0 < p_0 \leq p_1 < 1$, consider the following assumption, which extends A4(ξ(p)):

Assumption A5(p₀, p₁): there exist disjoint open sets $U_i$, $i = 1, \ldots, L$, such that each $U_i$ contains a solution to the equation $g(t) = \xi_{g(Y)}(p)$ for all $p_0 \leq p \leq p_1$, such that $F'_{g(Y)}(\xi(p)) > 0$ for all $p_0 \leq p \leq p_1$, and such that $g$ is a $C^2$-diffeomorphism on $\cup_{i=1}^{L} U_i$.

Under the previous assumption, define
$$\tau_{p_0,p_1} = \inf_{\gamma \in \cup_{i=1}^{L} g(U_i)} \tau(\gamma). \qquad (33)$$
Theorem 3 Under the conditions of Theorem 2 and Assumption A5(p₀, p₁), we have, almost surely, as $n \to +\infty$,
$$\sup_{p_0 \leq p \leq p_1} \left| \widehat{\xi}(p; g(Y)) - \xi_{g(Y)}(p) - \frac{p - \widehat{F}(\xi_{g(Y)}(p); g(Y))}{f_{g(Y)}(\xi_{g(Y)}(p))} \right| = O_{a.s.}(r_n(\alpha, \tau_{p_0,p_1})). \qquad (34)$$
Remark 6 To obtain convergence results for estimators of $H$, some results are needed concerning sample quantiles of the form $\widehat{\xi}(p; g(Y^{a^m}))$, with $g(\cdot) = |\cdot|$. Lemma 14 asserts that the Hermite rank $\tau_p$ of the function $h_{\xi_{g(Y)}(p)}(\cdot)$ with $g(\cdot) = |\cdot|$ is equal to 2 for all $0 < p < 1$. Moreover, for all $0 < p < 1$ and for all $0 < p_0 \leq p_1 < 1$, Assumptions A4(ξ(p)) and A5(p₀, p₁) are satisfied, and we have $\overline{\tau}_p = \tau_{p_0,p_1} = 2$. Since, from Lemma 1, the correlation function of $Y^{a^m}$ satisfies (30) with $\alpha = 2\nu - 2H$ and $L(\cdot) = 1$, by applying Theorem 2 the sequence $r_n(\cdot,\cdot)$ is then given by
$$r_n(2\nu - 2H, 2) = n^{-3/4}\log(n)^{3/4}, \quad \text{if } \nu \geq 2, \qquad (35)$$
and, for $\nu = 1$,
$$r_n(2 - 2H, 2) = \begin{cases} n^{-3/4}\log(n)^{3/4} & \text{if } 0 < H < 3/4, \\ n^{-3/4}\log(n)^{3/2} & \text{if } H = 3/4, \\ n^{-1/2-(1-H)}\log(n) & \text{if } 3/4 < H < 5/6, \\ n^{-2(2-2H)}\log(n)^{2} & \text{if } 5/6 \leq H < 1. \end{cases} \qquad (36)$$
4.2 Convergence results of estimators of H
In order to state convergence results, we make the following assumption on the remainder term of the variance function $v(\cdot)$:

Assumption A6(η): there exists $\eta > 0$ such that $v(t) = \sigma^2|t|^{2H}(1 + O(|t|^{\eta}))$, as $|t| \to 0$.

The first result concerns the estimators $\widehat{H}^{\alpha}$ and $\widehat{H}^{\log}$, based on a convex combination of sample quantiles.
Theorem 4 Under Assumptions A1(2ν), A2(p, c) and A6(η),
(i) we have, almost surely, as $n \to +\infty$,
$$\widehat{H}^{\alpha} - H = \begin{cases} O(n^{-\eta}) + O_{a.s.}\!\left(n^{-1/2}\log(n)\right) & \text{if } \nu > H + \frac{1}{4}, \\ O(n^{-\eta}) + O_{a.s.}\!\left(n^{-1/2}\log(n)^{3/2}\right) & \text{if } \nu = 1,\ H = \frac{3}{4}, \\ O(n^{-\eta}) + O_{a.s.}\!\left(n^{-2(1-H)}\log(n)\right) & \text{if } \nu = 1,\ \frac{3}{4} < H < 1. \end{cases} \qquad (37)$$
A similar result holds for $\widehat{H}^{\log}$.
(ii) the mean squared error (MSE) of $\widehat{H}^{\alpha}$ satisfies
$$\mathrm{MSE}\left(\widehat{H}^{\alpha} - H\right) = O(v_n(2\nu - 2H)) + O\!\left(r_n(2\nu - 2H, 2)^2\right) + O\!\left(n^{-2\eta}\right). \qquad (38)$$
The sequence $r_n(2\nu - 2H, 2)$ is given by (35) and (36), and the sequence $v_n(\cdot)$ is defined by
$$v_n(2\nu - 2H) = \begin{cases} n^{-1} & \text{if } \nu > H + \frac{1}{4}, \\ n^{-1}\log(n) & \text{if } \nu = 1,\ H = \frac{3}{4}, \\ n^{-4(1-H)} & \text{if } \nu = 1,\ \frac{3}{4} < H < 1. \end{cases} \qquad (39)$$
Again, the same result holds for $\mathrm{MSE}\left(\widehat{H}^{\log} - H\right)$.
(iii) if the filter $a$ is such that $\nu > H + 1/4$, and if $\eta > 1/2$, then we have the following convergence in distribution, as $n \to +\infty$:
$$\sqrt{n}\left(\widehat{H}^{\alpha} - H\right) \longrightarrow \mathcal{N}(0, \sigma^2_{\alpha}) \quad \text{and} \quad \sqrt{n}\left(\widehat{H}^{\log} - H\right) \longrightarrow \mathcal{N}(0, \sigma^2_{0}), \qquad (40)$$
where $\sigma^2_{\alpha}$ is defined for $\alpha \geq 0$ by
$$\sigma^2_{\alpha} = \sum_{i \in \mathbb{Z}} \sum_{j \geq 1} \frac{1}{(2j)!} \left( \sum_{k=1}^{K} \frac{H_{2j-1}(q_k)\, c_k}{q_k\, \pi^{\alpha}_k} \right)^2 B^T R(i, j)\, B. \qquad (41)$$
The vector $B$ is defined by $B = \frac{A}{\|A\|^2}$, and the real numbers $q_k$ and $\pi^{\alpha}_k$ are defined by
$$q_k = \Phi^{-1}\left(\frac{1 + p_k}{2}\right) \quad \text{and} \quad \pi^{\alpha}_k = \frac{(q_k)^{\alpha}}{\sum_{j=1}^{K} c_j (q_j)^{\alpha}}. \qquad (42)$$
Finally, the matrix $R(i, j)$, defined for $i \in \mathbb{Z}$ and $j \geq 1$, is an $M \times M$ matrix whose $(m_1, m_2)$ entry is
$$(R(i, j))_{m_1,m_2} = \rho^{a^{m_1}, a^{m_2}}(i)^{2j}, \qquad (43)$$
where $\rho^{a^{m_1}, a^{m_2}}(\cdot)$ is the correlation function defined by (7).
Remark 7 The expression of the variance $\sigma^2_{\alpha}$ given by (41) may appear very complicated. However, given some vectors $p$ and $c$ and some integer $M$, it does not take unreasonable effort to compute it for each value of $H$ by truncating the two series. This issue is investigated in Section 5 to compare the different parameters.
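The truncation strategy of Remark 7 can be sketched explicitly in the simplest case $K = 1$, $p = 1/2$ of Remark 9. The following Python/numpy sketch is ours (as are the truncation levels `imax`, `jmax`; Section 5 uses $|i| \leq 200$, $j \leq 150$), and is only meant to illustrate how (41)-(43) assemble from (3), (7) and (42):

```python
import numpy as np
from math import factorial
from statistics import NormalDist
from numpy.polynomial import hermite_e as He

def dilate(a, m):
    """The filter a dilated m times (a^m in the text)."""
    out = np.zeros(m * (len(a) - 1) + 1)
    out[::m] = a
    return out

def gamma(a, b, j, H):
    """gamma^{a,a'}(j) of Eq. (3)."""
    q, qp = np.arange(len(a)), np.arange(len(b))
    d = np.abs(q[:, None] - qp[None, :] + j) ** (2.0 * H)
    return -0.5 * np.einsum('i,j,ij->', a, b, d)

def sigma2_median(H, a=(1.0, -2.0, 1.0), M=5, imax=100, jmax=60):
    """Truncated double series for sigma_0^2 (Remark 9: K = 1, p = 1/2)."""
    a = np.asarray(a)
    logm = np.log(np.arange(1, M + 1))
    A = logm - logm.mean()
    B = A / (A @ A)                            # B = A / ||A||^2
    q = NormalDist().inv_cdf((1 + 0.5) / 2)    # q = Phi^{-1}((1 + p)/2), Eq. (42)
    filt = [dilate(a, m) for m in range(1, M + 1)]
    g0 = np.array([gamma(f, f, 0, H) for f in filt])
    # rho^{a^{m1},a^{m2}}(i) of Eq. (7), for |i| <= imax
    rho = np.array([[[gamma(filt[m1], filt[m2], i, H) / np.sqrt(g0[m1] * g0[m2])
                      for m2 in range(M)] for m1 in range(M)]
                    for i in range(-imax, imax + 1)])
    total = 0.0
    for j in range(1, jmax + 1):
        h = He.hermeval(q, [0.0] * (2 * j - 1) + [1.0])   # H_{2j-1}(q)
        coef = (h / q) ** 2 / float(factorial(2 * j))
        total += coef * np.einsum('m,imn,n->', B, rho ** (2 * j), B)
    return total

s = sigma2_median(0.5)
assert s > 0.0
```

The terms of the $j$-series decay roughly like $j^{-3/2}$, so moderate truncation levels already stabilize the value; the figures of Section 5 are produced from the fully specified truncation of the paper, not from this sketch.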
Remark 8 Let us discuss the result (38). The first term, $O(v_n)$, is due to the variance of the sample cumulative distribution function. The second term, $O(r_n^2)$, is due to the departure of $\widehat{\xi}(p) - \xi(p)$ from $(p - \widehat{F}(\xi(p)))/f(\xi(p))$. We leave the reader to check that
$$O\!\left(r_n(2\nu - 2H, 2)^2\right) + O(v_n(2\nu - 2H)) = \begin{cases} O(v_n(2\nu - 2H)) & \text{if } \nu \geq H + \frac{1}{4}, \\ O\!\left(r_n(2\nu - 2H, 2)^2\right) & \text{if } \nu < H + \frac{1}{4}. \end{cases}$$
Finally, the third term, $O(n^{-2\eta})$, is a bias term due to the misspecification of the variance function $v(\cdot)$ around 0.
Remark 9 If $K = 1$, we have, for every $\alpha > 0$,
$$\sigma^2_{\alpha} = \sigma^2_{0} = \sum_{i \in \mathbb{Z}} \sum_{j \geq 1} \frac{H_{2j-1}(q)^2}{q^2\,(2j)!}\, B^T R(i, j)\, B.$$
Assume A6(η) with $\eta > 1/2$, which allows us to neglect the bias term with respect to the variance term. The result (40) is proved by using a general central limit theorem obtained in this dependence context by Arcones (1994), which is available as soon as $\rho^{a}(\cdot)^2$ is summable. Therefore, if only A1(2) is assumed, the order of the filter $a$ cannot exceed 1 (it then corresponds to $a = (1, -1)$) and, due to (5), the result (40) is valid only for $0 < H < 3/4$. From a practical point of view, one observes that, for such a filter and large values of $H$, the estimators have a very large variance. Note that if A1(2ν) can be assumed for $\nu > 1$, then the asymptotic normality is valid for all values of $H$.
The next result asserts the link between $\widehat{H}^{\log}$ and $\widehat{H}^{\alpha}$.

Corollary 5 Let $(\alpha_n)_{n\geq 1}$ be a sequence such that $\alpha_n \to 0$ as $n \to +\infty$. Then, under the conditions of Theorem 4 (ii), the following convergence in distribution holds, as $n \to +\infty$:
$$\sqrt{n}\left(\widehat{H}^{\alpha_n} - H\right) \longrightarrow \mathcal{N}(0, \sigma^2_{0}). \qquad (44)$$

The following theorem presents the analogous results obtained for the estimators $\widehat{H}^{\alpha,tm}$ and $\widehat{H}^{\log,tm}$ based on trimmed means.
Theorem 6 Under Assumptions A1(2ν), A3(β) and A6(η), properties (i) and (ii) of Theorem 4 hold for the estimators $\widehat{H}^{\alpha,tm}$ and $\widehat{H}^{\log,tm}$, with the same rates of convergence.
(iii) if the filter $a$ is such that $\nu > H + 1/4$ and if $\eta > 1/2$, then, with the notation of Theorem 4, we have the following convergence in distribution, as $n \to +\infty$:
$$\sqrt{n}\left(\widehat{H}^{\alpha,tm} - H\right) \longrightarrow \mathcal{N}(0, \sigma^2_{\alpha,tm}) \quad \text{and} \quad \sqrt{n}\left(\widehat{H}^{\log,tm} - H\right) \longrightarrow \mathcal{N}(0, \sigma^2_{0,tm}), \qquad (45)$$
where $\sigma^2_{\alpha,tm}$ is defined for $\alpha \geq 0$ by
$$\sigma^2_{\alpha,tm} = \sum_{i \in \mathbb{Z}} \sum_{j \geq 1} \frac{1}{(2j)!} \left( \frac{\int_{\beta_1}^{1-\beta_2} H_{2j-1}(q)\, q^{\alpha-1}\, dp}{\int_{\beta_1}^{1-\beta_2} q^{\alpha}\, dp} \right)^2 B^T R(i, j)\, B, \qquad (46)$$
with $q = \Phi^{-1}\left(\frac{1+p}{2}\right)$.
5 Numerical computation and simulations
5.1 Asymptotic constants $\sigma^2_{\alpha}$ and $\sigma^2_{\alpha,tm}$
In order to compare the different estimators, we intend to compute the asymptotic con-
stants σ2α and σ2
α,tm defined by (41) and (46) for various set of parameters (a,p, c,β,M).
For this work, both series defining σ2α and σ2
α,tm are truncated (|i| ≤ 200, j ≤ 150). Fig-
ure 2 illustrates a part of this work. We can propose the following general remarks:
• Among all filters tested, the best one seems to be
a⋆ =
inc1 if 0 < H < 3/4,
db4 otherwise.
where inc1 and db4 respectively denote the filter (1,−1) and the Daubechies wavelet
filter with two zero moments explicitly given by
db4 = (0.4829629,−0.8365763, 0.22414386, 0.12940952) .
• Choice of M: increasing M seems to reduce the asymptotic constant $\sigma^2_\alpha$. Obviously, too large an M increases the bias, since $\widehat{\xi}\left(p, c; g(X^{a^M})\right)$ or $\overline{g(X^{a^M})}(\beta)$ are then estimated with only $N - M\ell$ observations. We recommend the value M = 5.
• We did not manage (theoretically, nor numerically since the series defining (41) and (46) are truncated) to determine the optimal value of α. However, for the examples considered, it appears to be near the value 2.
• Again, it is quite difficult to determine, theoretically or numerically, which choice of p is optimal. What we observed is that, for fixed parameters a, M and α, the asymptotic constants are very close to each other.
• Choice of p in the case of a single quantile (see Figure 2): the optimal p seems to be near the value 90%. However, p = 1/2, corresponding to the estimator based on the median, also leads to good results.
• Choice of $\beta_1 = \beta_2 = \beta$ for the estimators based on trimmed means (see Figure 2): obviously the constant grows with β, but we can point out that estimators based on 10%-trimmed means remain very competitive with the ones obtained by quadratic variations (β = 0).
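To make the truncation scheme above concrete, the following minimal sketch computes the truncated double series ($|i| \leq 200$, $j \leq 150$) in the simplest configuration only: a single quantile $p = 1/2$ (median), $M = 1$ and the filter inc1 $= (1,-1)$, for which the weights reduce, via (108) and Lemma 14, to $H_{2j-1}(q)^2/(q^2(2j)!)$. The function names are ours, and the db4 coefficients are written up to rounding.

```python
import numpy as np
from statistics import NormalDist

# Filters discussed above (db4 up to rounding); Assumption A1 is about
# vanishing moments: inc1 has one zero moment, db4 has two.
inc1 = np.array([1.0, -1.0])
db4 = np.array([0.4829629, -0.8365163, 0.2241439, 0.1294095])
k = np.arange(len(db4))
assert abs(inc1.sum()) < 1e-12
assert abs(db4.sum()) < 1e-6 and abs((k * db4).sum()) < 1e-6  # two zero moments

def rho_fgn(i, H):
    """Correlation of the filtered series Y^a for a = inc1 (standard fGn)."""
    i = abs(i)
    return 0.5 * ((i + 1) ** (2 * H) - 2 * i ** (2 * H) + abs(i - 1) ** (2 * H))

def sigma2_median(H, imax=200, jmax=150):
    """Truncated double series (|i| <= imax, j <= jmax) for the asymptotic
    constant in the median case (K = 1, p = 1/2, M = 1, filter inc1):
    sum_i sum_j He_{2j-1}(q)^2 / (q^2 (2j)!) * rho(i)^(2j)."""
    q = NormalDist().inv_cdf(0.75)  # q = Phi^{-1}((1+p)/2) with p = 1/2
    # normalised Hermite values h_n = He_n(q)/sqrt(n!) by a stable recurrence
    h = np.empty(2 * jmax)
    h[0], h[1] = 1.0, q
    for n in range(2, 2 * jmax):
        h[n] = (q * h[n - 1] - np.sqrt(n - 1) * h[n - 2]) / np.sqrt(n)
    # He_{2j-1}(q)^2 / (2j)! equals h_{2j-1}^2 / (2j), hence the weights below
    weights = [h[2 * j - 1] ** 2 / (2 * j * q * q) for j in range(1, jmax + 1)]
    rho = np.array([rho_fgn(i, H) for i in range(-imax, imax + 1)])
    return sum(w * np.sum(rho ** (2 * j)) for j, w in enumerate(weights, start=1))
```

For $H = 1/2$ the filtered series is i.i.d., so only the $i = 0$ term contributes, which provides a convenient sanity check of the Hermite weights; recall that for the filter inc1 the series only converges when $H < 3/4$.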
5.2 Simulation
A short simulation study is presented in Table 1 and Figure 1 for $n = 1000$ and $H = 0.8$. We consider two locally self-similar Gaussian processes whose variance functions are in turn $v(t) = |t|^{2H}$ (fractional Brownian motion) and $v(t) = 1 - \exp(-|t|^{2H})$. To generate sample paths discretized on a grid of $[0, 1]$, we use the circulant matrix method (see Wood and Chan (1994)), which is particularly fast, even for large sample sizes. Various versions of our estimators are considered and compared with classical ones, namely the one based on quadratic variations, Coeurjolly (2001), and the Whittle estimator, Beran (1994). In order to illustrate the robustness of our estimators, we also applied them to contaminated versions of the sample paths. We obtain a new sample path, discretized at times $i/n$ and denoted by $X^C(i/n)$ for $i = 1, \dots, n$, through the following model:
$$X^C(i/n) = X(i/n) + U(i)V(i), \qquad (47)$$
where $U(i)$, $i = 1,\dots,n$, are independent Bernoulli variables $\mathcal{B}(0.005)$, and $V(i)$, $i = 1,\dots,n$, are independent centered Gaussian variables with variance $\sigma^2_C(i)$ such that the signal-to-noise ratio at time $i/n$ is equal to 20 dB. As a general conclusion of Table 1, one can say that all versions of our estimators are very competitive with the classical ones when the processes are observed without contamination, and they appear particularly robust to additive outliers: both bias and variance are approximately unchanged. This is clearly not the case for the classical estimators. Indeed, concerning the quadratic variations method, the estimation procedure is based on estimating $E\left((X^{a^m}(1/n))^2\right)$ by the sample mean of order 2 of $X^{a^m}$ (Coeurjolly (2001)), which is particularly sensitive to additive outliers. The poor results of the Whittle estimator can be explained by the fact that maximum likelihood methods are also non-robust.
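The simulation setup of this section can be sketched as follows. The circulant-matrix generator reproduces the Wood and Chan (1994) idea for fractional Gaussian noise; the contamination step follows model (47), except that, for brevity, the noise variance is calibrated from a single global 20 dB signal-to-noise ratio rather than pointwise at each time $i/n$ — a simplification on our part. All function names are ours.

```python
import numpy as np

def fgn_circulant(n, H, rng):
    """n values of standard fractional Gaussian noise via circulant embedding
    (the fast method of Wood and Chan (1994) mentioned above)."""
    k = np.arange(n + 1)
    r = 0.5 * (np.abs(k + 1) ** (2 * H) - 2 * np.abs(k) ** (2 * H)
               + np.abs(k - 1) ** (2 * H))
    row = np.concatenate([r, r[-2:0:-1]])            # first row of the circulant matrix
    lam = np.clip(np.fft.fft(row).real, 0.0, None)   # eigenvalues; clip round-off
    m = len(row)
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    return np.fft.fft(np.sqrt(lam / m) * z).real[:n]

def fbm_path(n, H, rng):
    """fBm sampled on the grid i/n, i = 1..n, using self-similarity."""
    return np.cumsum(fgn_circulant(n, H, rng)) * n ** (-H)

def contaminate(x, rng, prob=0.005, snr_db=20.0):
    """Additive outliers as in (47): X_C = X + U V with U ~ Bernoulli(prob);
    the noise variance is set from one global 20 dB signal-to-noise ratio."""
    u = rng.random(len(x)) < prob
    sigma_c = np.sqrt(np.mean(x ** 2) / 10 ** (snr_db / 10))
    return x + u * sigma_c * rng.standard_normal(len(x))
```

A path is then obtained as `x = fbm_path(1000, 0.8, np.random.default_rng(0))` and its contaminated version as `contaminate(x, np.random.default_rng(1))`.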
6 Proofs
We denote by $\|\cdot\|_{L^2(d\phi)}$ (resp. $\|\cdot\|_{\ell^q}$) the norm defined by $\|h\|_{L^2(d\phi)} = E(h(Y)^2)^{1/2}$ for a measurable function $h(\cdot)$ (resp. $\left(\sum_{i\in\mathbb{Z}}|u_i|^q\right)^{1/2}$ for a sequence $(u_i)_{i\in\mathbb{Z}}$). In order to simplify the presentation of the proofs, we use the notations $F(\cdot)$, $\xi(\cdot)$, $f(\cdot)$, $\widehat{F}(\cdot)$ and $\widehat{\xi}(\cdot)$ instead of $F_{g(Y)}(\cdot)$, $\xi_{g(Y)}(\cdot)$, $f_{g(Y)}(\cdot)$, $\widehat{F}(\cdot\,; g(Y))$ and $\widehat{\xi}(\cdot\,; g(Y))$ respectively. For a real $x$, $[x]$ denotes its integer part. Finally, $\lambda$ denotes a generic positive constant.
6.1 Sketch of the proof of Theorem 2

We give here a brief explanation of the strategy used to prove Theorem 2. The proof follows exactly the one proposed by Serfling (1980) in the i.i.d. case. One starts by writing
$$\frac{p - \widehat{F}(\xi(p))}{f(\xi(p))} - \left(\widehat{\xi}(p) - \xi(p)\right) = A(p) + B(p) + C(p),$$
with
$$A(p) = \frac{p - \widehat{F}\left(\widehat{\xi}(p)\right)}{f(\xi(p))} \qquad (48)$$
$$B(p) = \frac{\widehat{F}\left(\widehat{\xi}(p)\right) - \widehat{F}(\xi(p)) - \left(F(\widehat{\xi}(p)) - F(\xi(p))\right)}{f(\xi(p))} \qquad (49)$$
$$C(p) = \frac{F(\widehat{\xi}(p)) - F(\xi(p))}{f(\xi(p))} - \left(\widehat{\xi}(p) - \xi(p)\right). \qquad (50)$$
From the definition of the sample quantile, we have almost surely (see e.g. Serfling (1980)) $A(p) = O_{a.s.}(n^{-1})$. Now, in order to control the term $C(p)$, Taylor's theorem is used together with a control of $\widehat{\xi}(p) - \xi(p)$. The latter is provided by Lemma 10, which exhibits a sequence $\varepsilon_n(\alpha, \tau_p)$ such that $\widehat{\xi}(p) - \xi(p) = O_{a.s.}(\varepsilon_n(\alpha, \tau_p))$. Then, in order to control $B(p)$, it is sufficient to control the random variable
$$S_n(\xi(p), \varepsilon_n(\alpha, \tau_p)) = \sup_{|x| \leq \varepsilon_n(\alpha, \tau_p)} \left|\Delta(\xi(p) + x) - \Delta(\xi(p))\right|,$$
with $\Delta(\cdot) = \widehat{F}(\cdot) - F(\cdot)$. This result is detailed in Lemma 11. In order to establish the rate stated in Theorem 2, we present and prove Lemmas 10 and 11. Some preliminary results, given by Lemma 7, Corollary 8 and Lemma 9, are needed. Among other things, Lemma 7 and Corollary 8 provide inequalities for controlling the sample mean of a non-linear function of a Gaussian sequence with correlation function satisfying (30).
6.2 Auxiliary Lemmas for the proof of Theorem 2

Lemma 7 Let $\{Y(i)\}_{i=1}^{+\infty}$ be a stationary Gaussian process with variance 1 and correlation function $\rho(\cdot)$ such that, as $i \to +\infty$, $|\rho(i)| \sim L(i)\, i^{-\alpha}$, for some $\alpha > 0$ and some function $L(\cdot)$ slowly varying at infinity. Let $h(\cdot) \in L^2(d\phi)$ and denote by $\tau$ its Hermite rank. Define
$$\overline{Y}_n = \frac{1}{n}\sum_{i=1}^n h(Y(i)).$$
Then, for all $\gamma > 0$, there exists a positive constant $\kappa_\gamma = \kappa_\gamma(\alpha, \tau)$ such that
$$P\left(|\overline{Y}_n| \geq \kappa_\gamma y_n\right) = O(n^{-\gamma}), \qquad (51)$$
with
$$y_n = y_n(\alpha, \tau) = \begin{cases} n^{-1/2}\log(n)^{1/2} & \text{if } \alpha\tau > 1,\\ n^{-1/2}\log(n)^{1/2}L_\tau(n)^{1/2} & \text{if } \alpha\tau = 1,\\ n^{-\alpha\tau/2}\log(n)^{\tau/2}L(n)^{\tau/2} & \text{if } 0 < \alpha\tau < 1,\end{cases} \qquad (52)$$
where $L_\tau(n) = \sum_{|i|\leq n}|\rho(i)|^\tau$. In the case $\alpha\tau = 1$, we assume that for all $j > \tau$ the limit $\lim_{n\to+\infty} L_\tau(n)^{-1}\sum_{|i|\leq n}|\rho(i)|^j$ exists.
Proof. Let $(y_n)_{n\geq1}$ be the sequence defined by (52). The proof is split into three parts according to the value of $\alpha\tau$.

Case $\alpha\tau > 1$: From Chebyshev's inequality, we have for all $q \geq 1$
$$P\left(|\overline{Y}_n| \geq \kappa_\gamma y_n\right) \leq \frac{1}{\kappa_\gamma^{2q}y_n^{2q}}\,E\left((\overline{Y}_n)^{2q}\right).$$
From Theorem 1 of Breuer and Major (1983), and in particular Equation (2.6), we have, as $n\to+\infty$,
$$E\left((\overline{Y}_n)^{2q}\right) \sim \frac{(2q)!}{2^qq!}\frac{1}{n^q}\sigma^{2q}, \quad\text{with } \sigma^2 = \sum_{i\in\mathbb{Z}}\sum_{j\geq\tau}\frac{(c_j)^2}{j!}\rho(i)^j, \qquad (53)$$
where $c_j$ denotes the $j$-th Hermite coefficient of $h(\cdot)$. Note that $\sigma^2 \leq \|h\|^2_{L^2(d\phi)}\|\rho\|^2_{\ell^\tau}$. Thus, for $n$ large enough, we have
$$P\left(|\overline{Y}_n| \geq \kappa_\gamma y_n\right) \leq \frac{\lambda}{n^qy_n^{2q}}\frac{(2q)!}{2^qq!}\left(\|h\|^2_{L^2(d\phi)}\|\rho\|^2_{\ell^\tau}\kappa_\gamma^{-2}\right)^q. \qquad (54)$$
From Stirling's formula, we have as $q\to+\infty$
$$\frac{(2q)!}{2^qq!} \sim \sqrt{2}\,q^q\,(2e^{-1})^q. \qquad (55)$$
From (52), by choosing $q = [\log(n)]$, (54) becomes
$$P\left(|\overline{Y}_n| \geq \kappa_\gamma y_n\right) \leq \lambda\left(2e^{-1}\|h\|^2_{L^2(d\phi)}\|\rho\|^2_{\ell^\tau}\kappa_\gamma^{-2}\right)^{\log(n)} = O(n^{-\gamma}),$$
if $\kappa_\gamma^2 > 2\|h\|^2_{L^2(d\phi)}\|\rho\|^2_{\ell^\tau}\exp(\gamma-1)$.
Case $\alpha\tau = 1$: Using the proof of Theorem 1′ of Breuer and Major (1983), we can prove that for all $q \geq 1$
$$E\left(\left(n^{1/2}L_\tau(n)^{-1/2}\overline{Y}_n\right)^{2q}\right) \leq \lambda\frac{(2q)!}{2^qq!}\,E\left(\left(n^{1/2}L_\tau(n)^{-1/2}\overline{Y}_n\right)^2\right)^q \leq \lambda\frac{(2q)!}{2^qq!}\left(\sum_{j\geq\tau}\frac{(c_j)^2}{j!}\lim_{n\to+\infty}L_\tau(n)^{-1}\sum_{|i|\leq n}|\rho(i)|^j\right)^q \leq \lambda\frac{(2q)!}{2^qq!}\,\|h\|^{2q}_{L^2(d\phi)}. \qquad (56)$$
Then, from Chebyshev's inequality, we have for all $q \geq 1$
$$P\left(|\overline{Y}_n| \geq \kappa_\gamma y_n\right) \leq \lambda\frac{L_\tau(n)^q}{n^qy_n^{2q}}\frac{(2q)!}{2^qq!}\left(\|h\|^2_{L^2(d\phi)}\kappa_\gamma^{-2}\right)^q.$$
From (52), by choosing $q = [\log(n)]$, we obtain
$$P\left(|\overline{Y}_n| \geq \kappa_\gamma y_n\right) \leq \lambda\left(2e^{-1}\|h\|^2_{L^2(d\phi)}\kappa_\gamma^{-2}\right)^{\log(n)} = O(n^{-\gamma}),$$
if $\kappa_\gamma^2 > 2\|h\|^2_{L^2(d\phi)}\exp(\gamma-1)$.
Case $0 < \alpha\tau < 1$: Denote by $k_\alpha$ the smallest integer satisfying $k_\alpha\alpha > 1$, that is $k_\alpha = [1/\alpha]+1$, and for $j\geq\tau$ denote by $Z_j$ the random variable
$$Z_j = \frac1n\sum_{i=1}^n\frac{c_j}{j!}H_j(Y(i)).$$
Denote by $\kappa_{1,\gamma}$ and $\kappa_{2,\gamma}$ two positive constants such that $\kappa_\gamma = \max(\kappa_{1,\gamma},\kappa_{2,\gamma})$. From the triangle inequality,
$$P\left(|\overline{Y}_n|\geq\kappa_\gamma y_n\right) \leq P\left(\Big|\overline{Y}_n-\sum_{j=\tau}^{k_\alpha-1}Z_j\Big|\geq\kappa_{1,\gamma}y_n\right)+\sum_{j=\tau}^{k_\alpha-1}P\left(|Z_j|\geq\kappa_{2,\gamma}y_n\right). \qquad (57)$$
Now,
$$\overline{Y}_n-\sum_{j=\tau}^{k_\alpha-1}Z_j = \frac1n\sum_{i=1}^n\sum_{j\geq k_\alpha}\frac{c_j}{j!}H_j(Y(i)) = \frac1n\sum_{i=1}^nh'(Y(i)),$$
where $h'(\cdot)$ is a function with Hermite rank $k_\alpha$, and $\alpha k_\alpha > 1$. Applying Lemma 7 in the case $\alpha\tau > 1$, it follows that, for all $\gamma > 0$, there exists a constant $\kappa_{1,\gamma}$ such that, for $n$ large enough,
$$P\left(\Big|\overline{Y}_n-\sum_{j=\tau}^{k_\alpha-1}Z_j\Big|\geq\kappa_{1,\gamma}y_n\right) = O(n^{-\gamma}). \qquad (58)$$
Now, let $\tau\leq j<k_\alpha$ and $q\geq1$; from Theorem 3 of Taqqu (1977), we have
$$P\left(|Z_j|\geq\kappa_{2,\gamma}y_n\right) \leq \frac{1}{\kappa_{2,\gamma}^{2q}y_n^{2q}}\left(\frac{c_j}{j!}\right)^{2q}n^{-2q}\,E\sum_{i_1,\dots,i_{2q}}H_j(Y(i_1))\cdots H_j(Y(i_{2q})) \leq \lambda\frac{L(n)^{jq}}{n^{\alpha jq}y_n^{2q}}\left(\frac{c_j}{j!}\kappa_{2,\gamma}^{-1}\right)^{2q}\mu_{2q}, \qquad (59)$$
where $\mu_{2q}$ is a constant such that $\mu_{2q}\leq\left(\frac{2}{1-\alpha j}\right)^qE\left(H_j(Y)^{2q}\right)$. It is also proved in Taqqu (1977, p. 228) that $E\left(H_j(Y)^{2q}\right)\sim(2jq)!/(2^{jq}(jq)!)$, as $q\to+\infty$. Thus, from Stirling's formula, we obtain as $q\to+\infty$
$$P\left(|Z_j|\geq\kappa_{2,\gamma}y_n\right) \leq \lambda\,\frac{L(n)^{(j-\tau)q}}{n^{\alpha(j-\tau)q}}\,\log(n)^{-\tau q}\,q^{jq}\left(\frac{2}{1-\alpha j}\left(\frac{c_j}{j!}\right)^2\left(\frac{2j}{e}\right)^j\kappa_{2,\gamma}^{-2}\right)^q.$$
By choosing $q = [\log(n)]$, we finally obtain, as $n\to+\infty$,
$$\sum_{j=\tau}^{k_\alpha-1}P\left(|Z_j|\geq\kappa_{2,\gamma}y_n\right) \leq \lambda\left(\frac{2}{1-\alpha\tau}\left(\frac{c_\tau}{\tau!}\right)^2\left(\frac{2\tau}{e}\right)^\tau\kappa_{2,\gamma}^{-2}\right)^{\log(n)} = O(n^{-\gamma}), \qquad (60)$$
if $\kappa_{2,\gamma}^2 > \frac{2}{1-\alpha\tau}\left(\frac{c_\tau}{\tau!}\right)^2(2\tau)^\tau\exp(\gamma-\tau)$. From (57), we get the result by combining (58) and (60). □
Corollary 8 Under the conditions of Lemma 7, for all $\alpha > 0$, $j\geq1$ and $\gamma > 0$, there exist $q = q(\gamma)\geq1$ and $\zeta_\gamma > 0$ such that
$$E\left\{\frac1n\sum_{i=1}^nH_j(Y(i))\right\}^{2q} \leq \zeta_\gamma n^{-\gamma}. \qquad (61)$$

Proof. (53), (56) and (59) imply that there exists $\lambda = \lambda(q) > 0$ such that, for all $q\geq1$,
$$E\left\{\frac1n\sum_{i=1}^nH_j(Y(i))\right\}^{2q} \leq \lambda(q)\times\begin{cases}n^{-q} & \text{if } \alpha j > 1,\\ L_{\tau_p}(n)^q\,n^{-q} & \text{if } \alpha j = 1,\\ L(n)^{jq}\,n^{-\alpha jq} & \text{if } \alpha j < 1,\end{cases} \;=\; O(n^{-\gamma}). \qquad (62)$$
Indeed, it is sufficient to choose $q$ such that $q > \gamma$ if $\alpha j \geq 1$ and $q > \gamma/\alpha j$ if $\alpha j < 1$. □
Lemma 9 Let $0 < p < 1$, denote by $g(\cdot)$ a function satisfying Assumption A4(ξ(p)) and by $(x_n)_{n\geq1}$ a real sequence such that $x_n\to0$ as $n\to+\infty$. Then, for all $j\geq1$, there exists a positive constant $d_j = d_j(\xi(p)) < +\infty$ such that, for $n$ large enough,
$$|c_j(\xi(p)+x_n)-c_j(\xi(p))| \leq d_j|x_n|. \qquad (63)$$

Proof. Let $j\geq1$; under Assumption A4(ξ(p)), for $n$ large enough, $\xi(p)+x_n\in\cup_{i=1}^Lg(U_i)$. Thus, for $n$ large enough,
$$c_j(\xi(p)+x_n)-c_j(\xi(p)) = \int_{\mathbb{R}}\left(h_{\xi(p)+x_n}(t)-h_{\xi(p)}(t)\right)H_j(t)\phi(t)\,dt = \sum_{i=1}^L\int_{U_i}\left(\mathbf{1}_{g_i(t)\leq\xi(p)+x_n}-\mathbf{1}_{g_i(t)\leq\xi(p)}\right)H_j(t)\phi(t)\,dt = \sum_{i=1}^L\int_{m_{i,n}}^{M_{i,n}}(-1)^j\phi^{(j)}(t)\,dt$$
$$= \begin{cases}\displaystyle\sum_{i=1}^L-\left(\phi(M_{i,n})-\phi(m_{i,n})\right) & \text{if } j=1,\\[1ex] \displaystyle\sum_{i=1}^L(-1)^j\left(\phi^{(j-1)}(M_{i,n})-\phi^{(j-1)}(m_{i,n})\right) & \text{if } j>1,\end{cases}$$
where $g_i(\cdot)$ is the restriction of $g(\cdot)$ to $U_i$, and where $m_{i,n}$ (resp. $M_{i,n}$) is the minimum (resp. maximum) of $g_i^{-1}(\xi(p)+x_n)$ and $g_i^{-1}(\xi(p))$. We leave the reader to check that there exists a positive constant $d_j$ such that, for $n$ large enough and some $u$ lying between $\xi(p)$ and $\xi(p)+x_n$,
$$|c_j(\xi(p)+x_n)-c_j(\xi(p))| \leq d_j|x_n|\times\begin{cases}\displaystyle\sum_{i=1}^L\left|\phi^{(j)}(g_i^{(-1)}(u))\,(g_i^{(-1)})'(u)\right| & \text{if } j=1,2,\\[1ex] \displaystyle\sum_{i=1}^L\left|\phi^{(j-2)}(g_i^{(-1)}(u))\,(g_i^{(-1)})'(u)\right| & \text{if } j>2,\end{cases}$$
which is the desired result. □
Lemma 10 Under the conditions of Theorem 2, there exists a constant $\kappa_\varepsilon = \kappa_\varepsilon(\alpha,\tau_p)$ such that we have almost surely, as $n\to+\infty$,
$$\left|\widehat{\xi}(p;g(Y))-\xi_{g(Y)}(p)\right| \leq \varepsilon_n, \qquad (64)$$
where $\varepsilon_n = \varepsilon_n(\alpha,\tau(\xi(p))) = \kappa_\varepsilon\,y_n(\alpha,\tau(\xi(p)))$, $y_n(\cdot,\cdot)$ being defined by (52).

Proof. We have
$$P\left(\left|\widehat{\xi}(p)-\xi(p)\right|\geq\varepsilon_n\right) = P\left(\widehat{\xi}(p)\leq\xi(p)-\varepsilon_n\right)+P\left(\widehat{\xi}(p)\geq\xi(p)+\varepsilon_n\right). \qquad (65)$$
Using Lemma 1.1.4 (iii) of Serfling (1980), we have
$$P\left(\widehat{\xi}(p)\leq\xi(p)-\varepsilon_n\right) \leq P\left(\widehat{F}(\xi(p)-\varepsilon_n)\geq p\right). \qquad (66)$$
Under Assumption A4(ξ(p)), for $n$ large enough,
$$p-F(\xi(p)-\varepsilon_n) = f(\xi(p))\varepsilon_n+o(\varepsilon_n) \geq \frac{f(\xi(p))}{2}\varepsilon_n.$$
Consequently, for $n$ large enough and from (66),
$$P\left(\widehat{\xi}(p)\leq\xi(p)-\varepsilon_n\right) \leq P\left(\widehat{F}(\xi(p)-\varepsilon_n)-F(\xi(p)-\varepsilon_n)\geq\frac{f(\xi(p))}{2}\varepsilon_n\right). \qquad (67)$$
Define $\tau_{p,n} = \tau(\xi(p)-\varepsilon_n)$; from Lemma 9, we have for $n$ large enough
$$\widehat{F}(\xi(p)-\varepsilon_n)-F(\xi(p)-\varepsilon_n) \leq 2\left(\widehat{F}(\xi(p))-F(\xi(p))\right)+2\varepsilon_n\sum_{j\in J_n}Z_{n,j}, \qquad (68)$$
where
$$J_n = \begin{cases}\{\tau_p < j\leq\tau_{p,n}\} & \text{if } \tau_{p,n}>\tau_p,\\ \emptyset & \text{if } \tau_{p,n}=\tau_p,\\ \{\tau_{p,n}\leq j<\tau_p\} & \text{if } \tau_{p,n}<\tau_p,\end{cases} \qquad\text{and}\qquad Z_{n,j} = \frac1n\sum_{i=1}^n\frac{d_j}{j!}H_j(Y(i)).$$
Now, define $c_\varepsilon = \kappa_\varepsilon f(\xi(p))/4$ and write $Z_n = \sum_{j\in J_n}Z_{n,j}$. Let $\gamma>0$; (61) implies that there exists $q\geq1$ such that, for $n$ large enough,
$$P\left(|2\varepsilon_nZ_n|\geq\frac{f(\xi(p))}{2}\varepsilon_n\right) \leq \sum_{j\in J_n}P\left(|Z_{n,j}|>c_\varepsilon\right) \leq \sum_{j\in J_n}\frac{1}{c_\varepsilon^{2q}}E\left(Z_{n,j}^{2q}\right) = O(n^{-\gamma}). \qquad (69)$$
Let us fix $\gamma = 2$. From (67), (68) and (69), and from Lemma 7 (applied to the function $h_{\xi(p)}(\cdot)$), we obtain
$$P\left(\widehat{\xi}(p)\leq\xi(p)-\varepsilon_n\right) \leq P\left(\left|\widehat{F}(\xi(p))-F(\xi(p))\right|\geq c_\varepsilon\varepsilon_n\right)+O(n^{-2}) = O(n^{-2}),$$
if $c_\varepsilon > \kappa_2$, that is if $\kappa_\varepsilon > 4\kappa_2/f(\xi(p))$.
Let us now focus on the second right-hand term of (65). Following the same lines, we may also obtain, for $n$ large enough,
$$P\left(\widehat{\xi}(p)\geq\xi(p)+\varepsilon_n\right) = O(n^{-2}),$$
if $\kappa_\varepsilon > 4\kappa_2/f(\xi(p))$. Thus, for $n$ large enough, $P\left(\left|\widehat{\xi}(p)-\xi(p)\right|\geq\varepsilon_n\right) = O(n^{-2})$, which leads to the result thanks to the Borel–Cantelli Lemma. □
The following Lemma is the analogue of a result obtained by Bahadur in the i.i.d. framework; see Lemma E, p. 97, of Serfling (1980).

Lemma 11 Under the conditions of Theorem 2, denote by $\Delta(z)$, for $z\in\mathbb{R}$, the random variable $\Delta(z) = \widehat{F}(z;g(Y))-F_{g(Y)}(z)$. Then we have almost surely, as $n\to+\infty$,
$$S_n(\xi_{g(Y)}(p),\varepsilon_n(\alpha,\tau_p)) = \sup_{|x|\leq\varepsilon_n}\left|\Delta(\xi_{g(Y)}(p)+x)-\Delta(\xi_{g(Y)}(p))\right| = O_{a.s.}\left(r_n(\alpha,\overline{\tau}_p)\right), \qquad (70)$$
where $\varepsilon_n = \varepsilon_n(\alpha,\tau_p)$ is defined by (64) and $r_n(\alpha,\overline{\tau}_p)$ by (32).
where εn = εn(α, τp) is defined by (64) and rn(α, τ p) is defined by (32).
Proof. Put εn = εn(α, τp) and rn = rn(α, τ p). Denote by (βn)n≥1 and (ηb,n)n≥1 the
following two sequences
βn = [n3/4εn] and ηb,n = ξ(p) + εnb
βn,
for b = −βn, . . . , βn. Using the monotonicity of F(·) and F (·), we have,
Sn(ξ(p), εn) ≤ max−βn≤b≤βn
|Mb,n| + Gn, (71)
where Mb,n = ∆(ηb,n)−∆(ξ(p)) and Gn = max−βn≤b≤βn−1 (F(ηb+1,n) − F(ηb,n)) . Under
Assumption A4(ξ(p)), we have for n large enough
Gn ≤ (ηb+1,n − ηb,n) × sup|x|≤εn
f(ξ(p) + x) = O(n−3/4
). (72)
Hurst exponent estimation using sample quantiles 25
The proof is finished if one can prove that for all γ > 0 (in particular γ = 2) and for all
b, there exists κ′γ such that
P(|Mb,n| ≥ κ′
γrn
)= O
(n−γ
). (73)
Indeed, since βn = O(n1/2+δ
)for all δ > 0, if (73) is true, then we have
P( max−βn≤b≤βn
|Mb,n| ≥ κ′2rn(α, τp)) ≤ (2βn + 1) × max
−βn≤b≤βn
P(|Mb,n| ≥ κ′
2rn
)
= O(n−3/2+δ
).
Thus, from Borel-Cantelli’s Lemma, we have, almost surely
max−βn≤b≤βn
|Mb,n| = Oa.s. (rn)
And so, from (71) and (72).
Sn(ξ(p), εn) = Oa.s. (rn) + O(n−3/4
)= Oa.s. (rn) , (74)
which is the stated result.
So, the rest of the proof is devoted to proving (73). For the sake of simplicity, denote by $h'_n(\cdot)$ the function $h_{\eta_{b,n}}(\cdot)-h_{\xi(p)}(\cdot)$. For $n$ large enough, the Hermite rank of $h'_n(\cdot)$ is at least equal to $\overline{\tau}_p$, defined by (29). In the sequel, we need the following bound for $\|h'_n\|^2_{L^2(d\phi)}$:
$$\|h'_n\|^2_{L^2(d\phi)} = E(h'_n(Y)^2) = \omega_n(1-\omega_n), \quad\text{with } \omega_n = \left|F_{g(Y)}(\eta_{b,n})-F_{g(Y)}(\xi(p))\right|.$$
As previously, we have $\omega_n = O(\varepsilon_n)$, and so there exists $\zeta > 0$ such that
$$\|h'_n\|^2_{L^2(d\phi)} \leq \zeta\varepsilon_n. \qquad (75)$$
From now on, in order to simplify the proof, we use the upper bound $\varepsilon_n = \varepsilon_n(\alpha,\tau_p) \leq \varepsilon_n(\alpha,\overline{\tau}_p)$, and with a slight abuse we still write $\varepsilon_n = \varepsilon_n(\alpha,\overline{\tau}_p)$. Note also that, from Lemma 9, the $j$-th Hermite coefficient of $h'_n(\cdot)$, for $j\geq\overline{\tau}_p$, is given by $c_j(\eta_{b,n})-c_j(\xi(p))$, and there exists a positive constant $d_j = d_j(\xi(p))$ such that for $n$ large enough
$$|c_j(\eta_{b,n})-c_j(\xi(p))| \leq d_j\varepsilon_n\frac{|b|}{\beta_n} \leq d_j\varepsilon_n. \qquad (76)$$
We now proceed as in the proof of Lemma 7.

Case $\alpha\overline{\tau}_p > 1$: using Theorem 1 of Breuer and Major (1983) and (54), we can obtain for all $q\geq1$
$$P\left(|M_{b,n}|\geq\kappa'_\gamma r_n\right) \leq \lambda\frac{1}{n^qr_n^{2q}}\frac{(2q)!}{2^qq!}\frac{1}{(\kappa'_\gamma)^{2q}}\|h'_n\|^{2q}_{L^2(d\phi)}\|\rho\|^{2q}_{\ell^{\overline{\tau}_p}}. \qquad (77)$$
As $q\to+\infty$, we get
$$P\left(|M_{b,n}|\geq\kappa'_\gamma r_n\right) \leq \lambda\frac{\varepsilon_n^q}{n^qr_n^{2q}}\,q^q\left(2\zeta e^{-1}\|\rho\|^2_{\ell^{\overline{\tau}_p}}\frac{1}{(\kappa'_\gamma)^2}\right)^q.$$
From (32), (52) (with $\tau = \overline{\tau}_p$) and by choosing $q = [\log(n)]$, we have
$$P\left(|M_{b,n}|\geq\kappa'_\gamma r_n\right) \leq \lambda\left(2\zeta\kappa_\varepsilon e^{-1}\|\rho\|^2_{\ell^{\overline{\tau}_p}}\frac{1}{(\kappa'_\gamma)^2}\right)^{\log(n)} = O(n^{-\gamma}), \qquad (78)$$
if $(\kappa'_\gamma)^2 > 2\zeta\kappa_\varepsilon\|\rho\|^2_{\ell^{\overline{\tau}_p}}\exp(\gamma-1)$.
Case $\alpha\overline{\tau}_p = 1$: from (56), we can obtain for all $q\geq1$
$$E\left(M_{b,n}^{2q}\right) \leq \lambda\frac{(2q)!}{2^qq!}\frac{L_{\overline{\tau}_p}(n)^q}{n^q}\|h'_n\|^{2q}_{L^2(d\phi)} \leq \lambda\,\zeta^q\,\frac{(2q)!}{2^qq!}\frac{L_{\overline{\tau}_p}(n)^q\varepsilon_n^q}{n^q} \leq \lambda\frac{L_{\overline{\tau}_p}(n)^q\varepsilon_n^q}{n^q}\left(2\zeta e^{-1}\right)^qq^q.$$
From (32), (52) (with $\tau = \overline{\tau}_p$), by choosing $q = [\log(n)]$, we have
$$P\left(|M_{b,n}|\geq\kappa'_\gamma r_n\right) \leq \frac{1}{(\kappa'_\gamma)^{2q}r_n^{2q}}E\left(M_{b,n}^{2q}\right) \leq \lambda\left(2\zeta\kappa_\varepsilon e^{-1}\frac{d^2_{\overline{\tau}_p}}{\overline{\tau}_p!}\frac{1}{(\kappa'_\gamma)^2}\right)^{\log(n)} = O(n^{-\gamma}),$$
if $(\kappa'_\gamma)^2 > 2\zeta\kappa_\varepsilon\,d^2_{\overline{\tau}_p}/\overline{\tau}_p!\,\exp(\gamma-1)$.
Case $\alpha\overline{\tau}_p < 1$: denote by $(r_{1,n})_{n\geq1}$ and $(r_{2,n})_{n\geq1}$ the two sequences
$$r_{1,n} = n^{-1/2-\alpha\overline{\tau}_p/4}\log(n)^{\overline{\tau}_p/4+1/2}L(n)^{\overline{\tau}_p/4} \quad\text{and}\quad r_{2,n} = n^{-\alpha\overline{\tau}_p}\log(n)^{\overline{\tau}_p}L(n)^{\overline{\tau}_p}. \qquad (79)$$
Note that $\max(r_{1,n},r_{2,n})$ is equal to $r_{1,n}$ when $2/3 < \alpha\overline{\tau}_p < 1$, and to $r_{2,n}$ when $0 < \alpha\overline{\tau}_p \leq 2/3$. So, in order to obtain (73) in the case $0 < \alpha\overline{\tau}_p < 1$, it is sufficient to prove that there exists $\kappa'_\gamma$ such that, for $n$ large enough,
$$P\left(|M_{b,n}|\geq\kappa'_\gamma\max(r_{1,n},r_{2,n})\right) = O(n^{-\gamma}).$$
Denote by $k_\alpha$ the integer $[1/\alpha]+1$, for which $\alpha k_\alpha > 1$, and by $Z_{j,n}$, for $\overline{\tau}_p\leq j<k_\alpha$, the random variable
$$Z_{j,n} = \frac1n\sum_{i=1}^n\frac{c_j(\eta_{b,n})-c_j(\xi(p))}{j!}H_j(Y(i)).$$
From the triangle inequality, we have
$$P\left(|M_{b,n}|\geq\kappa'_\gamma\max(r_{1,n},r_{2,n})\right) \leq P\left(\Big|M_{b,n}-\sum_{j=\overline{\tau}_p}^{k_\alpha-1}Z_{j,n}\Big|\geq\kappa'_\gamma r_{1,n}\right)+\sum_{j=\overline{\tau}_p}^{k_\alpha-1}P\left(|Z_{j,n}|\geq\kappa'_\gamma r_{2,n}\right). \qquad (80)$$
Since
$$M_{b,n}-\sum_{j=\overline{\tau}_p}^{k_\alpha-1}Z_{j,n} = \frac1n\sum_{i=1}^n\sum_{j\geq k_\alpha}\frac{c_j(\eta_{b,n})-c_j(\xi(p))}{j!}H_j(Y(i)) = \frac1n\sum_{i=1}^nh''_n(Y(i)),$$
where $h''_n(\cdot)$ is a function with Hermite rank $k_\alpha$, such that $\alpha k_\alpha > 1$, we have from (77)
$$P\left(\Big|M_{b,n}-\sum_{j=\overline{\tau}_p}^{k_\alpha-1}Z_{j,n}\Big|\geq\kappa'_\gamma r_{1,n}\right) \leq \lambda\frac{1}{n^qr_{1,n}^{2q}}\|h'_n\|^{2q}_{L^2(d\phi)}\frac{(2q)!}{2^qq!}\frac{1}{(\kappa'_\gamma)^{2q}}\|\rho\|^{2q}_{\ell^{k_\alpha}} \qquad (81)$$
for all $q\geq1$. From (75), we obtain, as $q\to+\infty$,
$$P\left(\Big|M_{b,n}-\sum_{j=\overline{\tau}_p}^{k_\alpha-1}Z_{j,n}\Big|\geq\kappa'_\gamma r_{1,n}\right) \leq \lambda\frac{\varepsilon_n^q}{n^qr_{1,n}^{2q}}\,q^q\left(2\zeta e^{-1}\|\rho\|^2_{\ell^{k_\alpha}}(\kappa'_\gamma)^{-2}\right)^q.$$
From (52) (with $\tau = \overline{\tau}_p$), (79) and by choosing $q = [\log(n)]$, we obtain
$$P\left(\Big|M_{b,n}-\sum_{j=\overline{\tau}_p}^{k_\alpha-1}Z_{j,n}\Big|\geq\kappa'_\gamma r_{1,n}\right) \leq \lambda\left(2\zeta e^{-1}\|\rho\|^2_{\ell^{k_\alpha}}\kappa_\varepsilon(\kappa'_\gamma)^{-2}\right)^{\log(n)} = O(n^{-\gamma}), \qquad (82)$$
if $(\kappa'_\gamma)^2 > \kappa'_{1,\gamma} = 2\zeta\|\rho\|^2_{\ell^{k_\alpha}}\kappa_\varepsilon\exp(\gamma-1)$. Now, concerning the last term of (80), from (59) we can prove, for all $\overline{\tau}_p\leq j<k_\alpha$,
$$P\left(|Z_{j,n}|\geq\kappa'_\gamma r_{2,n}\right) \leq \lambda\frac{L(n)^{jq}}{n^{\alpha jq}r_{2,n}^{2q}}\frac{1}{(\kappa'_\gamma)^{2q}}\left(\frac{c_j(\eta_{b,n})-c_j(\xi(p))}{j!}\right)^{2q}\mu_{2q},$$
where $\mu_{2q}$ is a constant such that, as $q\to+\infty$,
$$\mu_{2q} \leq \lambda\left(\frac{2}{1-\alpha j}\right)^q\frac{(2jq)!}{2^{jq}(jq)!}.$$
From (76), we have, as $q\to+\infty$,
$$P\left(|Z_{j,n}|\geq\kappa'_\gamma r_{2,n}\right) \leq \lambda\frac{\varepsilon_n^{2q}L(n)^{jq}}{n^{\alpha jq}r_{2,n}^{2q}}\,q^{jq}\left(\frac{2}{1-\alpha j}\left(\frac{2j}{e}\right)^jd_j^2(\kappa'_\gamma)^{-2}\right)^q.$$
From (32), (52) (with $\tau = \overline{\tau}_p$), by choosing $q = [\log(n)]$, we have, as $n\to+\infty$,
$$P\left(|Z_{j,n}|\geq\kappa'_\gamma r_{2,n}\right) \leq \lambda\left(\frac{\log(n)L(n)}{n^\alpha}\right)^{(j-\overline{\tau}_p)q}\left(\frac{2}{1-\alpha j}\left(\frac{2j}{e}\right)^jd_j^2\,\kappa_\varepsilon^2\,(\kappa'_\gamma)^{-2}\right)^q.$$
Consequently, as $n\to+\infty$, we finally obtain
$$\sum_{j=\overline{\tau}_p}^{k_\alpha-1}P\left(|Z_{j,n}|\geq\kappa'_\gamma r_{2,n}\right) \leq \lambda\left(\frac{2}{1-\alpha\overline{\tau}_p}\left(\frac{2\overline{\tau}_p}{e}\right)^{\overline{\tau}_p}d^2_{\overline{\tau}_p}\,\kappa_\varepsilon^2\,(\kappa'_\gamma)^{-2}\right)^{\log(n)} = O(n^{-\gamma}), \qquad (83)$$
if $(\kappa'_\gamma)^2 > \kappa'_{2,\gamma} = \frac{2}{1-\alpha\overline{\tau}_p}(2\overline{\tau}_p)^{\overline{\tau}_p}d^2_{\overline{\tau}_p}\kappa_\varepsilon^2\exp(\gamma-\overline{\tau}_p)$. Let us choose $\kappa'_\gamma$ such that $(\kappa'_\gamma)^2 > \max(\kappa'_{1,\gamma},\kappa'_{2,\gamma})$. Then, by combining (82) and (83), we deduce from (80) that, for every $\gamma > 0$,
$$P\left(|M_{b,n}|\geq\kappa'_\gamma\max(r_{1,n},r_{2,n})\right) = O(n^{-\gamma}),$$
and so (73) is proved. □
6.3 Proof of Theorem 2

Proof. Let us detail the proof sketched in Section 6.1. We have
$$\frac{p-\widehat{F}(\xi(p))}{f(\xi(p))}-\left(\widehat{\xi}(p)-\xi(p)\right) = A(p)+B(p)+C(p),$$
with $A(p)$, $B(p)$ and $C(p)$ respectively defined by (48), (49) and (50). Under Assumption A4(ξ(p)), from Lemma 10 and Taylor's theorem, we have almost surely, as $n\to+\infty$,
$$C(p) \leq \sup_{|x|\leq\varepsilon_n(\alpha,\tau_p)}F''_{g(Y)}(\xi(p)+x)\left(\widehat{\xi}(p)-\xi(p)\right)^2 = O_{a.s.}\left(\varepsilon_n(\alpha,\tau_p)^2\right).$$
From the definition of the sample quantile, we have almost surely (see e.g. Serfling (1980)) $A(p) = O_{a.s.}(n^{-1})$. Now, by combining Lemmas 10 and 11, we have almost surely $B(p) = O_{a.s.}(r_n(\alpha,\overline{\tau}_p))$. Thus, we finally obtain
$$\widehat{\xi}(p)-\xi(p) = \frac{p-\widehat{F}(\xi(p))}{f(\xi(p))}+O_{a.s.}(n^{-1})+O_{a.s.}\left(r_n(\alpha,\overline{\tau}_p)\right)+O_{a.s.}\left(\varepsilon_n(\alpha,\tau_p)^2\right),$$
which leads to the result by noticing that $\varepsilon_n(\alpha,\tau_p)^2 = O\left(r_n(\alpha,\overline{\tau}_p)\right)$. □
6.4 Auxiliary Lemmas for the proof of Theorem 3

Let $0 < p_0 \leq p_1 < 1$.

Lemma 12 Under the conditions of Theorem 3, there exists a constant $\theta = \theta(\alpha,\tau_{p_0,p_1})$ such that we have almost surely, as $n\to+\infty$,
$$T = \sup_{p_0\leq p\leq p_1}\left|\widehat{\xi}(p;g(Y))-\xi_{g(Y)}(p)\right| \leq \varepsilon_n(\alpha,\tau_{p_0,p_1}), \qquad (84)$$
where $\varepsilon_n = \varepsilon_n(\alpha,\tau_{p_0,p_1}) = \theta\,y_n(\alpha,\tau_{p_0,p_1})$ and $y_n$ is given by (52).

Proof. Define $p_{j,n} = p_0+\frac{j}{[n^{3/2}]}(p_1-p_0)$ for $j = 0,\dots,[n^{3/2}]$, and let $p\in[p_0,p_1]$. Using the monotonicity of $\xi(\cdot)$ and $\widehat{\xi}(\cdot)$, there exists some $j$ such that $p\in[p_{j,n},p_{j+1,n}]$ and such that
$$\widehat{\xi}(p)-\xi(p) \leq \widehat{\xi}(p_{j+1,n})-\xi(p) = \widehat{\xi}(p_{j+1,n})-\xi(p_{j+1,n})+\xi(p_{j+1,n})-\xi(p_{j,n})+\xi(p_{j,n})-\xi(p) \leq \widehat{\xi}(p_{j+1,n})-\xi(p_{j+1,n})+\xi(p_{j+1,n})-\xi(p_{j,n}).$$
This leads to
$$T \leq \max_{j=0,\dots,[n^{3/2}]}\left|\widehat{\xi}(p_{j,n})-\xi(p_{j,n})\right|+\max_{j=0,\dots,[n^{3/2}]-1}\left|\xi(p_{j+1,n})-\xi(p_{j,n})\right|. \qquad (85)$$
Under Assumption A5($p_0,p_1$), it comes
$$\max_{j=0,\dots,[n^{3/2}]-1}\left|\xi(p_{j+1,n})-\xi(p_{j,n})\right| = O(n^{-3/2}). \qquad (86)$$
Now, following the proof of Lemma 10, one can prove that there exists some constant $\theta(\alpha,\tau_{p_0,p_1})$ such that for all $j = 0,\dots,[n^{3/2}]$,
$$P\left(\left|\widehat{\xi}(p_{j,n})-\xi(p_{j,n})\right|\geq\theta\,y_n(\alpha,\tau_{p_0,p_1})\right) = O(n^{-3}).$$
Therefore, as $n\to+\infty$,
$$P\left(\max_{j=0,\dots,[n^{3/2}]}\left|\widehat{\xi}(p_{j,n})-\xi(p_{j,n})\right|\geq\varepsilon_n\right) \leq ([n^{3/2}]+1)\max_{j=0,\dots,[n^{3/2}]}P\left(\left|\widehat{\xi}(p_{j,n})-\xi(p_{j,n})\right|\geq\varepsilon_n\right) = O(n^{-3/2}),$$
which, combined with (85), (86) and the Borel–Cantelli Lemma, leads to the result. □
The following result is an extension of Lemma 11 and of Theorem 4.2 of Sen (1971).

Lemma 13 Under the assumptions of Theorem 3 and with the notation of Lemma 11, we have almost surely, as $n\to+\infty$,
$$S^\star_n = \sup_{\substack{x,y\in[\xi(p_0),\xi(p_1)]\\ |x-y|\leq\varepsilon_n(\alpha,\tau_{p_0,p_1})}}\left|\Delta(x)-\Delta(y)\right| = O_{a.s.}\left(r_n(\alpha,\tau_{p_0,p_1})\right), \qquad (87)$$
where $\tau_{p_0,p_1}$ is defined by (33).

Proof. Set $\varepsilon_n = \varepsilon_n(\alpha,\tau_{p_0,p_1})$ and $r_n = r_n(\alpha,\tau_{p_0,p_1})$. Define $\xi_{j,n} = \xi(p_0)+\frac{j}{p_n}\left(\xi(p_1)-\xi(p_0)\right)$ for $j = 0,\dots,p_n$, with $p_n = [\varepsilon_n^{-1}]$, and let $x,y\in[\xi(p_0),\xi(p_1)]$ be such that $|x-y|\leq\varepsilon_n$. Two cases may occur:

• If there exists some $j$ such that $x,y\in[\xi_{j,n},\xi_{j+1,n}]$, then
$$|\Delta(x)-\Delta(y)| \leq |\Delta(x)-\Delta(\xi_{j,n})|+|\Delta(\xi_{j,n})-\Delta(y)| \leq 2\times S_n(\xi_{j,n},\varepsilon_n).$$

• Otherwise, and without loss of generality, there exist $j,k$ with $k>j$ such that $x\in[\xi_{j,n},\xi_{j+1,n}]$ and $y\in[\xi_{k,n},\xi_{k+1,n}]$. Since $|x-y|\leq\varepsilon_n$, it follows that $|\xi_{k,n}-\xi_{j+1,n}|\leq\varepsilon_n$. Then
$$|\Delta(x)-\Delta(y)| \leq |\Delta(x)-\Delta(\xi_{k,n})|+|\Delta(\xi_{k,n})-\Delta(\xi_{j+1,n})|+|\Delta(\xi_{j+1,n})-\Delta(y)| \leq S_n(\xi_{k,n},\varepsilon_n)+2\times S_n(\xi_{j+1,n},\varepsilon_n).$$

In other words, for all $x,y$ one may obtain
$$|\Delta(x)-\Delta(y)| \leq 3\times\max_{0\leq j\leq p_n}S_n(\xi_{j,n},\varepsilon_n).$$
Hence, $S^\star_n \leq 3\times\max_{0\leq j\leq p_n}S_n(\xi_{j,n},\varepsilon_n)$. Now, following the proof of Lemma 11, one may prove that there exists some positive constant $\theta_\gamma$ such that, for $n$ large enough and for all $j = 0,\dots,p_n$,
$$P\left(S_n(\xi_{j,n},\varepsilon_n)\geq\theta_\gamma r_n\right) = O(n^{-\gamma}).$$
In particular, for $\gamma = 2$, it comes
$$P\left(\max_{0\leq j\leq p_n}S_n(\xi_{j,n},\varepsilon_n)\geq\theta_2r_n\right) \leq (p_n+1)\max_{j=0,\dots,p_n}P\left(S_n(\xi_{j,n},\varepsilon_n)\geq\theta_2r_n\right) = O\left(\frac{p_n}{n^2}\right) = O(n^{-3/2}),$$
whatever the value of $\alpha\tau_{p_0,p_1}$. This leads to the result by using the Borel–Cantelli Lemma. □
6.5 Proof of Theorem 3

Proof. We follow the proof of Theorem 2. Let $p\in[p_0,p_1]$ and let $\varepsilon_n = \varepsilon_n(\alpha,\tau_{p_0,p_1})$; then
$$\frac{p-\widehat{F}(\xi(p))}{f(\xi(p))}-\left(\widehat{\xi}(p)-\xi(p)\right) = A(p)+B(p)+C(p),$$
where $A(p)$, $B(p)$ and $C(p)$ are respectively defined by (48), (49) and (50). Similarly to the proof of Theorem 2, one may prove that $\sup_{p_0\leq p\leq p_1}A(p) = O_{a.s.}(n^{-1})$. Under Assumption A5($p_0,p_1$),
$$C(p) \leq \left(\sup_{|x|\leq\varepsilon_n(\alpha,\tau_p)}F''(x+\xi(p))\right)\frac{\left(\widehat{\xi}(p)-\xi(p)\right)^2}{f(\xi(p))}.$$
Therefore, for $n$ large enough, $C(p) \leq \lambda\left(\sup_{p_0\leq p\leq p_1}\left(\widehat{\xi}(p)-\xi(p)\right)\right)^2$, and from Lemma 12 this leads to
$$\sup_{p_0\leq p\leq p_1}C(p) = O_{a.s.}\left(\varepsilon_n(\alpha,\tau_{p_0,p_1})^2\right).$$
In addition, using Lemma 13, one also has $\sup_{p_0\leq p\leq p_1}B(p) = O_{a.s.}\left(r_n(\alpha,\tau_{p_0,p_1})\right)$, which ends the proof. □
6.6 Auxiliary Lemma for the proof of Theorem 4

Lemma 14 Consider, for $0 < p < 1$, the function $h_p(\cdot)$ given by
$$h_p(t) = \mathbf{1}_{\{|t|\leq\xi_{|Y|}(p)\}}(t)-p, \qquad (88)$$
that is, the function $h_{\xi_{g(Y)}(p)}(\cdot)$ with $g(\cdot) = |\cdot|$. Then, denoting by $c^{h_p}_j$ the $j$-th Hermite coefficient of $h_p(\cdot)$, we have for all $j\geq1$
$$c^{h_p}_0 = c^{h_p}_{2j+1} = 0 \qquad\text{and}\qquad c^{h_p}_{2j} = -2H_{2j-1}(q)\phi(q), \qquad (89)$$
where $q = \xi_{|Y|}(p) = \Phi^{-1}\left(\frac{1+p}{2}\right)$.

Proof. Since $P(|Y|\leq q) = p$ and $h_p(\cdot)$ is even, we have $c^{h_p}_0 = c^{h_p}_{2j+1} = 0$ for all $j\geq1$. Now, (27) implies
$$c^{h_p}_{2j} = \int_{\mathbb{R}}h_p(t)H_{2j}(t)\phi(t)\,dt = 2\int_0^qH_{2j}(t)\phi(t)\,dt = 2\left[\phi^{(2j-1)}(t)\right]_0^q = 2\left[-H_{2j-1}(t)\phi(t)\right]_0^q = -2H_{2j-1}(q)\phi(q).\ \square$$

Remark 10 Let $g(\cdot) = \overline{g}(|\cdot|)$, where $\overline{g}(\cdot)$ is a strictly increasing function on $\mathbb{R}^+$; then, for all $0 < p < 1$, we have
$$\xi_{|Y|}(p) = \overline{g}^{-1}\left(\xi_{g(Y)}(p)\right).$$
Consequently, the functions $h_{\xi_{g(Y)}(p)}(\cdot)$ for $g(\cdot) = |\cdot|$, $g(\cdot) = |\cdot|^\alpha$ and $g(\cdot) = \log|\cdot|$ are strictly identical. Their Hermite decomposition is therefore given by (89), and their Hermite rank is equal to 2.
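The closed form (89) is easy to check numerically. The sketch below (helper names ours) compares a trapezoidal quadrature of $\int h_p(t)H_{2j}(t)\phi(t)\,dt$ — which reduces to an integral over $[-q,q]$, since $\int H_{2j}\phi\,dt = 0$ for $j \geq 1$ — against $-2H_{2j-1}(q)\phi(q)$, using probabilists' Hermite polynomials.

```python
import numpy as np
from statistics import NormalDist

def hermite_prob(n, t):
    """Probabilists' Hermite polynomial He_n(t), via He_n = t He_{n-1} - (n-1) He_{n-2}."""
    h0, h1 = np.ones_like(t), np.asarray(t, dtype=float)
    if n == 0:
        return h0
    for m in range(2, n + 1):
        h0, h1 = h1, t * h1 - (m - 1) * h0
    return h1

def phi(t):
    return np.exp(-t * t / 2) / np.sqrt(2 * np.pi)

def c2j_quadrature(p, j, npts=200001):
    """c_{2j} = int h_p(t) H_{2j}(t) phi(t) dt; since int H_{2j} phi dt = 0 for
    j >= 1, this reduces to the integral of H_{2j} phi over [-q, q]."""
    q = NormalDist().inv_cdf((1 + p) / 2)
    t = np.linspace(-q, q, npts)
    y = hermite_prob(2 * j, t) * phi(t)
    dt = t[1] - t[0]
    return float((y.sum() - 0.5 * (y[0] + y[-1])) * dt)   # trapezoidal rule

def c2j_closed_form(p, j):
    """Right-hand side of (89): -2 He_{2j-1}(q) phi(q)."""
    q = NormalDist().inv_cdf((1 + p) / 2)
    return float(-2 * hermite_prob(2 * j - 1, np.asarray(q)) * phi(q))
```

For instance, for $j = 1$ and $p = 1/2$ both routines return $-2q\phi(q)$, the first even coefficient of the median indicator.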
6.7 Proof of Theorem 4

Proof. (i) Define
$$b_n = \frac12\sum_{m=1}^MB_m\log\left(1+\delta^{a^m}_n(0)\right), \qquad (90)$$
where $\delta^{a^m}_n(0)$ is given by (4). From (14), (15) and (25), we have almost surely
$$\widehat{H}^\alpha-H = \sum_{m=1}^M\frac{B_m}{\alpha}\varepsilon^\alpha_m = \sum_{m=1}^M\frac{B_m}{\alpha}\log\left(\frac{\widehat{\xi}\left(p,c;|Y^{a^m}|^\alpha\right)}{\xi_{|Y|^\alpha}(p,c)}\right)+\alpha\,b_n = \sum_{m=1}^M\frac{B_m}{\alpha\,\xi_{|Y|^\alpha}(p,c)}\left(\widehat{\xi}\left(p,c;|Y^{a^m}|^\alpha\right)-\xi_{|Y|^\alpha}(p,c)\right)(1+o_{a.s.}(1))+\alpha\,b_n, \qquad (91)$$
and
$$\widehat{H}^{\log}-H = \sum_{m=1}^MB_m\varepsilon^{\log}_m = \sum_{m=1}^MB_m\left(\widehat{\xi}\left(p,c;\log|Y^{a^m}|\right)-\xi_{\log|Y|}(p,c)\right)+b_n. \qquad (92)$$
Under Assumption A6(η), we have
$$b_n = O(n^{-\eta}). \qquad (93)$$
Moreover, let $i,j\geq1$; under Assumption A1(2ν), we have, from Lemma 1,
$$E\left(Y^{a^m}(i)Y^{a^m}(i+j)\right) = \rho^{a^m}(j) = O\left(|j|^{2H-2\nu}\right). \qquad (94)$$
Then, for all $m = 1,\dots,M$ and for all $k = 1,\dots,K$, from Lemma 10 and Remark 10, we obtain almost surely
$$\widehat{\xi}\left(p_k;|Y^{a^m}|^\alpha\right)-\xi_{|Y|^\alpha}(p_k) = O_{a.s.}\left(y_n(2\nu-2H,\tau_{p_k})\right),$$
$$\widehat{\xi}\left(p_k;\log|Y^{a^m}|\right)-\xi_{\log|Y|}(p_k) = O_{a.s.}\left(y_n(2\nu-2H,\tau_{p_k})\right),$$
where the sequence $y_n(\cdot,\cdot)$ is defined by (52) with $L(\cdot) = 1$. The result (37) is obtained by combining (91), (92) and (93).
(ii) Let us apply Theorem 2 to the sequence $g(Y^{a^m})$, for $m = 1,\dots,M$, with $g(\cdot) = |\cdot|$, $g(\cdot) = |\cdot|^\alpha$ and $g(\cdot) = \log|\cdot|$. For all $k = 1,\dots,K$, we have almost surely
$$\widehat{\xi}\left(p_k;|Y^{a^m}|\right)-\xi_{|Y|}(p_k) = \frac{p_k-\widehat{F}\left(\xi_{|Y|}(p_k);|Y^{a^m}|\right)}{f_{|Y|}(\xi_{|Y|}(p_k))}+O_{a.s.}(r_n),$$
$$\widehat{\xi}\left(p_k;|Y^{a^m}|^\alpha\right)-\xi_{|Y|^\alpha}(p_k) = \frac{p_k-\widehat{F}\left(\xi_{|Y|^\alpha}(p_k);|Y^{a^m}|^\alpha\right)}{f_{|Y|^\alpha}(\xi_{|Y|^\alpha}(p_k))}+O_{a.s.}(r_n),$$
$$\widehat{\xi}\left(p_k;\log|Y^{a^m}|\right)-\xi_{\log|Y|}(p_k) = \frac{p_k-\widehat{F}\left(\xi_{\log|Y|}(p_k);\log|Y^{a^m}|\right)}{f_{\log|Y|}(\xi_{\log|Y|}(p_k))}+O_{a.s.}(r_n),$$
where, for the sake of simplicity, $r_n = r_n(2\nu-2H,\overline{\tau}_{p_k})$ is defined by (35) and (36). Note that, from Remark 10, $\tau_{p_k} = 2$ for all $k = 1,\dots,K$. With some little computation, we can obtain, almost surely,
$$\widehat{\xi}\left(p_k;|Y^{a^m}|^\alpha\right)-\xi_{|Y|^\alpha}(p_k) = \alpha\,\xi_{|Y|}(p_k)^{\alpha-1}\left(\widehat{\xi}\left(p_k;|Y^{a^m}|\right)-\xi_{|Y|}(p_k)\right)+O_{a.s.}(r_n), \qquad (95)$$
and
$$\widehat{\xi}\left(p_k;\log|Y^{a^m}|\right)-\xi_{\log|Y|}(p_k) = \xi_{|Y|}(p_k)^{-1}\left(\widehat{\xi}\left(p_k;|Y^{a^m}|\right)-\xi_{|Y|}(p_k)\right)+O_{a.s.}(r_n). \qquad (96)$$
From (91), (92), (95), (96) and properties of Gaussian variables, the following results hold almost surely:
$$\widehat{H}^\alpha-H = \sum_{m=1}^M\sum_{k=1}^K\frac{B_mc_k}{2q_k\phi(q_k)}\pi^\alpha_k\left(\widehat{F}\left(q_k;|Y^{a^m}|\right)-p_k\right)+O_{a.s.}(r_n)+O(b_n), \qquad (97)$$
and
$$\widehat{H}^{\log}-H = \sum_{m=1}^M\sum_{k=1}^K\frac{B_mc_k}{2q_k\phi(q_k)}\left(\widehat{F}\left(q_k;|Y^{a^m}|\right)-p_k\right)+O_{a.s.}(r_n)+O(b_n), \qquad (98)$$
where $q_k$ and $\pi^\alpha_k$ are defined by (42). Denote by $\theta^\alpha_{m,k}$ the constant
$$\theta^\alpha_{m,k} = \frac{B_mc_k}{2q_k\phi(q_k)}\pi^\alpha_k.$$
Since $\pi^0_k = 1$, (97) and (98) can be rewritten as
$$\widehat{H}^\alpha-H = Z^\alpha_n+O_{a.s.}(r_n)+O(b_n), \qquad (99)$$
$$\widehat{H}^{\log}-H = Z^0_n+O_{a.s.}(r_n)+O(b_n), \qquad (100)$$
where, for $\alpha\geq0$,
$$Z^\alpha_n = \sum_{m=1}^M\sum_{k=1}^K\theta^\alpha_{m,k}\left(\widehat{F}\left(q_k;|Y^{a^m}|\right)-p_k\right). \qquad (101)$$
Thus, under Assumption A6(η), we have, as $n\to+\infty$,
$$MSE\left(\widehat{H}^\alpha-H\right) = O\left(E\left((Z^\alpha_n)^2\right)\right)+O\left(r_n(2\nu-2H,2)^2\right)+O(n^{-2\eta}), \qquad (102)$$
$$MSE\left(\widehat{H}^{\log}-H\right) = O\left(E\left((Z^0_n)^2\right)\right)+O\left(r_n(2\nu-2H,2)^2\right)+O(n^{-2\eta}). \qquad (103)$$
Now,
$$E\left((Z^\alpha_n)^2\right) = \frac{1}{n^2}\sum_{m_1,m_2=1}^M\sum_{k_1,k_2=1}^K\sum_{i_1,i_2=1}^n\theta^\alpha_{m_1,k_1}\theta^\alpha_{m_2,k_2}\,E\left(h_{q_{k_1}}(Y^{a^{m_1}}(i_1))\,h_{q_{k_2}}(Y^{a^{m_2}}(i_2))\right).$$
For $k_1,k_2 = 1,\dots,K$, $m_1,m_2 = 1,\dots,M$ and $i_1,i_2 = 1,\dots,n$, we have from Lemma 14
$$E\left(h_{q_{k_1}}(Y^{a^{m_1}}(i_1))\,h_{q_{k_2}}(Y^{a^{m_2}}(i_2))\right) = \sum_{j_1\geq\tau_{p_{k_1}}/2}\sum_{j_2\geq\tau_{p_{k_2}}/2}\frac{c^{h_{p_{k_1}}}_{2j_1}c^{h_{p_{k_2}}}_{2j_2}}{(2j_1)!\,(2j_2)!}\,E\left(H_{2j_1}(Y^{a^{m_1}}(i_1))H_{2j_2}(Y^{a^{m_2}}(i_2))\right) = \sum_{j\geq1}\frac{c^{h_{p_{k_1}}}_{2j}c^{h_{p_{k_2}}}_{2j}}{(2j)!}\,\rho^{a^{m_1},a^{m_2}}(i_2-i_1)^{2j}. \qquad (104)$$
Under Assumption A1(2ν), we have from Lemma 1, $\rho^{a^{m_1},a^{m_2}}(i) = O\left(|i|^{2H-2\nu}\right)$. Now, we leave the reader to check that, as $n\to+\infty$,
$$\frac{1}{n^2}\sum_{i_1,i_2=1}^n\rho^{a^{m_1},a^{m_2}}(i_2-i_1)^2 = O\left(\frac1n\sum_{|i|\leq n}|i|^{2(2H-2\nu)}\right) = O\left(v_n(2\nu-2H)\right),$$
where the sequence $v_n(\cdot)$ is given by (39). Thus, we have, as $n\to+\infty$, $E\left((Z^\alpha_n)^2\right) = O\left(v_n(2\nu-2H)\right)$, which leads to the result from (102) and (103).
(iii) Assume $\nu > H+1/4$ and $\eta > 1/2$; then, from (99) and (100), the following equivalences in distribution hold:
$$\sqrt{n}\left(\widehat{H}^\alpha-H\right) \sim \sqrt{n}\,Z^\alpha_n \qquad\text{and}\qquad \sqrt{n}\left(\widehat{H}^{\log}-H\right) \sim \sqrt{n}\,Z^0_n. \qquad (105)$$
Now, decompose $\sqrt{n}\,Z^\alpha_n = T^1_n+T^2_n$, where
$$T^1_n = \frac{1}{\sqrt{n}}\sum_{m=1}^M\sum_{k=1}^K\theta^\alpha_{m,k}\sum_{i=\ell+1}^{M\ell}h_{q_k}(Y^{a^m}(i)) \qquad\text{and}\qquad T^2_n = \sqrt{n}\sum_{m=1}^M\sum_{k=1}^K\theta^\alpha_{m,k}\left\{\frac1n\sum_{i=M\ell+1}^nh_{q_k}(Y^{a^m}(i))\right\}.$$
Clearly, $T^1_n$ converges to 0 in probability, as $n\to+\infty$. Therefore, we have, as $n\to+\infty$,
$$\sqrt{n}\,Z^\alpha_n \sim \sqrt{n}\left\{\frac1n\sum_{i=M\ell+1}^nG_\alpha\left(Y^{a^1}(i),\dots,Y^{a^M}(i)\right)\right\}, \qquad (106)$$
where $G_\alpha$ is the function from $\mathbb{R}^M$ to $\mathbb{R}$ defined, for $\alpha\geq0$ and $t_1,\dots,t_M\in\mathbb{R}$, by
$$G_\alpha(t_1,\dots,t_M) = \sum_{m=1}^M\sum_{k=1}^K\theta^\alpha_{m,k}\,h_{q_k}(t_m). \qquad (107)$$
Denote by $Y^a(i)$ the vector defined, for $i = M\ell+1,\dots,n$, by
$$Y^a(i) = \left(Y^{a^1}(i),\dots,Y^{a^M}(i)\right).$$
We obviously have $E\left(G_\alpha(Y^a(i))^2\right) < +\infty$. Since, for all $k = 1,\dots,K$, the functions $h_{q_k}$ have Hermite rank $\tau_{p_k}$, the function $G_\alpha$ has Hermite rank 2 (see e.g. Arcones (1994) for the definition of the Hermite rank of multivariate functions). Moreover, under Assumption A1(2ν), we have from Lemma 1, as $j\to+\infty$,
$$E\left(Y^{a^{m_1}}(i)Y^{a^{m_2}}(i+j)\right) = \rho^{a^{m_1},a^{m_2}}(j) = O\left(|j|^{2H-2\nu}\right)\in\ell^2(\mathbb{Z}),$$
as soon as $\nu > H+1/4$. Thus, from Theorem 4 of Arcones (1994), there exists $\sigma^2_\alpha$ (defined for $\alpha\geq0$) such that, as $n\to+\infty$, the following convergence in distribution holds:
$$\sqrt{n}\,Z^\alpha_n \longrightarrow \mathcal{N}(0,\sigma^2_\alpha),$$
with
$$\sigma^2_\alpha = \sum_{i\in\mathbb{Z}}E\left(G_\alpha\left(Y^a(i')\right)G_\alpha\left(Y^a(i'+i)\right)\right).$$
With the previous notations, we have
$$\sigma^2_\alpha = \sum_{i\in\mathbb{Z}}\sum_{m_1,m_2=1}^M\sum_{k_1,k_2=1}^K\theta^\alpha_{m_1,k_1}\theta^\alpha_{m_2,k_2}\,E\left(h_{p_{k_1}}(Y^{a^{m_1}}(i'))\,h_{p_{k_2}}(Y^{a^{m_2}}(i'+i))\right) = \sum_{i\in\mathbb{Z}}\sum_{m_1,m_2=1}^M\sum_{k_1,k_2=1}^K\sum_{j\geq1}\frac{c^{h_{p_{k_1}}}_{2j}c^{h_{p_{k_2}}}_{2j}}{(2j)!}\,\theta^\alpha_{m_1,k_1}\theta^\alpha_{m_2,k_2}\,\rho^{a^{m_1},a^{m_2}}(i)^{2j}. \qquad (108)$$
From (89), we can see that formula (108) is equivalent to (41), which ends the proof from (105). □
6.8 Proof of Corollary 5

Proof. Equation (106) remains valid for a sequence $\alpha_n$ such that $\alpha_n\to0$ as $n\to+\infty$, that is,
$$\sqrt{n}\left(\widehat{H}^{\alpha_n}-H\right) \sim \sqrt{n}\left\{\frac1n\sum_{i=M\ell+1}^nG_{\alpha_n}\left(Y^{a^1}(i),\dots,Y^{a^M}(i)\right)\right\}.$$
From (107), and since $\pi^{\alpha_n}_k\to1$ as $n\to+\infty$, we have $G_{\alpha_n}(\cdot)\to G_0(\cdot)$. Therefore, the following equivalence in distribution holds, as $n\to+\infty$:
$$\sqrt{n}\left(\widehat{H}^{\alpha_n}-H\right) \sim \sqrt{n}\left(\widehat{H}^{\log}-H\right),$$
which ends the proof. □
6.9 Auxiliary Lemma for the proof of Theorem 6

Lemma 15 Let $0 < \beta_1 \leq \beta_2 < 1$ and let $Z = (Z_1,\dots,Z_n)$ be $n$ identically distributed random variables such that $\sup_{\beta_1\leq p\leq1-\beta_2}\widehat{\xi}_Z(p;Z) = O_{a.s.}(1)$. Then
$$\overline{Z}(\beta)-\frac{1}{1-\beta_2-\beta_1}\int_{\beta_1}^{1-\beta_2}\widehat{\xi}_Z(p;Z)\,dp = O_{a.s.}(n^{-1}).$$

Proof. It is sufficient to notice that, for $i = 1,\dots,n$,
$$n\int_{(i-1)/n}^{i/n}\widehat{\xi}(p;Z)\,dp \;\leq\; Z_{(i),n} \;\leq\; n\int_{i/n}^{(i+1)/n}\widehat{\xi}(p;Z)\,dp,$$
which leads to
$$\frac{n}{n-[n\beta_2]-[n\beta_1]}\int_{[n\beta_1]/n}^{(n-[n\beta_2])/n}\widehat{\xi}(p;Z)\,dp \;\leq\; \overline{Z}(\beta) \;\leq\; \frac{n}{n-[n\beta_2]-[n\beta_1]}\int_{[n\beta_1]/n+1/n}^{(n-[n\beta_2])/n+1/n}\widehat{\xi}(p;Z)\,dp.$$
The end of the proof is omitted. □
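The inequality underlying Lemma 15 can be illustrated numerically: for the piecewise-constant empirical quantile function, the trimmed mean and the normalized integral of $\widehat{\xi}(\cdot\,;Z)$ differ by a boundary term of order $n^{-1}$. The sketch below (function names ours) takes $\beta_1 = \beta_2 = 0.1$.

```python
import numpy as np

def trimmed_mean(z, b1, b2):
    """beta-trimmed mean of the paper: average of the order statistics
    Z_(i), i = [n b1] + 1, ..., n - [n b2]."""
    zs = np.sort(z)
    n = len(z)
    return float(zs[int(n * b1): n - int(n * b2)].mean())

def quantile_integral(z, b1, b2):
    """(1 - b2 - b1)^{-1} int_{b1}^{1-b2} xi_hat(p) dp, where xi_hat is the
    empirical quantile function, constant equal to Z_(i) on ((i-1)/n, i/n]."""
    zs = np.sort(z)
    n = len(z)
    edges = np.arange(n + 1) / n
    lo = np.clip(edges[:-1], b1, 1 - b2)   # clipped cell boundaries
    hi = np.clip(edges[1:], b1, 1 - b2)
    return float(np.sum(zs * (hi - lo)) / (1 - b2 - b1))

rng = np.random.default_rng(0)
z = rng.standard_normal(997)
gap = trimmed_mean(z, 0.1, 0.1) - quantile_integral(z, 0.1, 0.1)
# Lemma 15 predicts gap = O(1/n): only the two boundary cells contribute
```

When $n\beta_1$ and $n\beta_2$ are integers, the two quantities even coincide exactly, since the boundary cells of $\widehat{\xi}$ are clipped away entirely.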
6.10 Proof of Theorem 6

Proof. (i) From (20), (21) and (26), we have
$$\widehat{H}^{\alpha,tm}-H = \sum_{m=1}^M\frac{B_m}{\alpha}\varepsilon^{\alpha,tm}_m = \sum_{m=1}^M\frac{B_m}{\alpha}\log\left(\overline{|Y^{a^m}|^\alpha}(\beta)\Big/\overline{|Y|^\alpha}(\beta)\right)+\alpha\,b_n = \sum_{m=1}^M\frac{B_m}{\alpha\,\overline{|Y|^\alpha}(\beta)}\left(\overline{|Y^{a^m}|^\alpha}(\beta)-\overline{|Y|^\alpha}(\beta)\right)(1+o_{a.s.}(1))+\alpha\,b_n, \qquad (109)$$
and
$$\widehat{H}^{\log,tm}-H = \sum_{m=1}^MB_m\varepsilon^{\log,tm}_m = \sum_{m=1}^MB_m\left(\overline{\log|Y^{a^m}|}(\beta)-\overline{\log|Y|}(\beta)\right)+b_n. \qquad (110)$$
Let us notice that, from Lemma 12, one can apply Lemma 15 to the vectors $|Y^{a^m}|^\alpha$ and $\log|Y^{a^m}|$. Then it comes
$$\overline{|Y^{a^m}|^\alpha}(\beta)-\overline{|Y|^\alpha}(\beta) = \frac{1}{1-\beta_2-\beta_1}\int_{\beta_1}^{1-\beta_2}\left(\widehat{\xi}\left(p;|Y^{a^m}|^\alpha\right)-\xi_{|Y|^\alpha}(p)\right)dp+O_{a.s.}(n^{-1}),$$
$$\overline{\log|Y^{a^m}|}(\beta)-\overline{\log|Y|}(\beta) = \frac{1}{1-\beta_2-\beta_1}\int_{\beta_1}^{1-\beta_2}\left(\widehat{\xi}\left(p;\log|Y^{a^m}|\right)-\xi_{\log|Y|}(p)\right)dp+O_{a.s.}(n^{-1}).$$
Hence, from (94), Lemma 12, Remark 10 and Assumption A6(η), we obtain
$$\widehat{H}^{\alpha,tm}-H = O_{a.s.}\left(y_n(2\nu-2H,2)\right)+O(n^{-\eta})+O_{a.s.}(n^{-1}),$$
$$\widehat{H}^{\log,tm}-H = O_{a.s.}\left(y_n(2\nu-2H,2)\right)+O(n^{-\eta})+O_{a.s.}(n^{-1}),$$
where the sequence $y_n(\cdot,\cdot)$ is defined by (52) with $L(\cdot) = 1$. This leads to the result by noticing that $n^{-1} = O\left(y_n(2\nu-2H,2)\right)$.

(ii) By following the proof of Theorem 4 (ii) and from Theorem 3, we may obtain the representations
$$\widehat{H}^{\alpha,tm}-H = \sum_{m=1}^M\frac{B_m}{\alpha\,\overline{|Y|^\alpha}(\beta)}\times\frac{1}{1-\beta_2-\beta_1}\int_{\beta_1}^{1-\beta_2}\frac{\widehat{F}\left(q;|Y^{a^m}|\right)-p}{\frac{2}{\alpha}\,q^{1-\alpha}\,\phi(q)}\,dp+O_{a.s.}(r_n)+O_{a.s.}(n^{-1})+O(n^{-\eta}),$$
$$\widehat{H}^{\log,tm}-H = \sum_{m=1}^MB_m\times\frac{1}{1-\beta_2-\beta_1}\int_{\beta_1}^{1-\beta_2}\frac{\widehat{F}\left(q;|Y^{a^m}|\right)-p}{2q\,\phi(q)}\,dp+O_{a.s.}(r_n)+O_{a.s.}(n^{-1})+O(n^{-\eta}).$$
With such representations, we observe that the result (ii) can be proved similarly to the one of Theorem 4.

(iii) By assuming that $\eta > 1/2$ and $\nu > H+1/4$, one may obtain the asymptotic normality of $\widehat{H}^{\alpha,tm}$ and $\widehat{H}^{\log,tm}$ by using the same tools as those presented in the proof of Theorem 4 (iii). Therefore, let us just make explicit the asymptotic variances of the estimators $\widehat{H}^{\alpha,tm}$ and $\widehat{H}^{\log,tm}$. If $\nu > H+1/4$ and $\eta > 1/2$, then, from the previous representations and from (104), we obtain, as $n\to+\infty$,
$$Var\left(\sqrt{n}\left(\widehat{H}^{\alpha,tm}-H\right)\right) \sim \frac{n}{n^2}\sum_{m_1,m_2=1}^M\sum_{i_1,i_2=1}^nB_{m_1}B_{m_2}\frac{1}{\left(\overline{|Y|^\alpha}(\beta)\right)^2}\times\frac{1}{(1-\beta_2-\beta_1)^2}\times\int_{\beta_1}^{1-\beta_2}\int_{\beta_1}^{1-\beta_2}\sum_{j\geq1}\frac{c^{h_{p_1}}_{2j}c^{h_{p_2}}_{2j}}{(2j)!}\,\frac{q_1^{\alpha-1}q_2^{\alpha-1}}{4\phi(q_1)\phi(q_2)}\,dp_1\,dp_2\;\rho^{a^{m_1},a^{m_2}}(i_2-i_1)^{2j},$$
with $q_k = \Phi^{-1}\left(\frac{1+p_k}{2}\right)$ for $k = 1,2$. Due to (89), and since $\overline{|Y|^\alpha}(\beta) = \frac{1}{1-\beta_2-\beta_1}\int_{\beta_1}^{1-\beta_2}q^\alpha\,dp$, this variance converges towards $\sigma^2_{\alpha,tm}$ given by (46), as $n\to+\infty$. We leave the reader to check that the asymptotic variance of $\sqrt{n}\left(\widehat{H}^{\log,tm}-H\right)$ is given by $\sigma^2_{0,tm}$. □
Acknowledgement. The author is very grateful to Anestis Antoniadis and Rémy Drouilhet for helpful comments, and to Kinga Sipos for a careful reading of the present paper.
J.-F. Coeurjolly
LJK, SAGAG Team, Universite Grenoble 2
1251 Av. Centrale BP 47
38040 GRENOBLE Cedex 09
France
E-mail: [email protected]
Figure 1: Two examples of sample paths of non-contaminated (top) and contaminated
processes with variance function v(·) = | · |^{2H} (left), respectively v(·) = 1 − exp(−| · |^{2H}) (right),
see (47).
Figure 2: Left: σ²_{α,tm} in terms of β; Right: σ²_α for estimators based on a single quantile,
in terms of p. Three values of the parameter H are considered: 0.3 (top), 0.5 (middle),
0.8 (bottom). The parameter M is fixed to M = 5. The constant line corresponds to
the asymptotic variance of the Whittle estimator.
Non-contaminated sample paths

Estimators                                      v(·) = | · |^{2H}   v(·) = 1 − exp(−| · |^{2H})
p = 1/2, c = 1 (median)                         0.796 (0.042)       0.801 (0.042)
p = 0.9, c = 1                                  0.797 (0.035)       0.798 (0.036)
p = (1/4, 3/4), c = (1/2, 1/2), g(·) = | · |²   0.795 (0.036)       0.800 (0.037)
10%-trimmed mean, g(·) = | · |²                 0.797 (0.03)        0.799 (0.034)
Quadratic variations method                     0.802 (0.032)       0.798 (0.032)
Whittle estimator                               0.805 (0.024)       0.806 (0.024)

Contaminated sample paths

Estimators                                      v(·) = | · |^{2H}   v(·) = 1 − exp(−| · |^{2H})
p = 1/2, c = 1 (median)                         0.798 (0.047)       0.803 (0.045)
p = 0.9, c = 1                                  0.793 (0.033)       0.789 (0.032)
p = (1/4, 3/4), c = (1/2, 1/2), g(·) = | · |²   0.797 (0.040)       0.796 (0.037)
10%-trimmed mean, g(·) = | · |²                 0.792 (0.037)       0.797 (0.033)
Quadratic variations method                     0.329 (0.162)       0.353 (0.149)
Whittle estimator                               0.519 (0.106)       0.510 (0.100)

Table 1: Means and standard deviations for n = 1000 and H = 0.8, using 500 Monte Carlo
simulations of sample paths of processes with variance function v(·) = | · |^{2H}, respectively
v(·) = 1 − exp(−| · |^{2H}) (first table), and contaminated versions (second table), see (47).