Computational Statistics & Data Analysis 51 (2006) 1063–1074. www.elsevier.com/locate/csda

On the use of the bootstrap for estimating functions with functional data

Antonio Cuevas a,∗, Manuel Febrero b, Ricardo Fraiman c

a Departamento de Matemáticas, Facultad de Ciencias, Universidad Autónoma de Madrid, 28049-Madrid, Spain
b Departamento de Estadística, Universidad de Santiago de Compostela, Spain
c Departamento de Matemática, Universidad de San Andrés, Argentina

Received 20 January 2005; received in revised form 21 October 2005; accepted 26 October 2005
Available online 16 November 2005

Abstract

The bootstrap methodology for functional data and a functional estimation target is considered. A Monte Carlo study analyzing the performance of the bootstrap confidence bands (obtained with different resampling methods) of several functional estimators is presented. Some of these estimators (e.g., the trimmed functional mean) rely on the use of depth notions for functional data and have not yet received much attention in the literature. A real data example in cardiology research is also analyzed. On a more theoretical side, a brief discussion is given providing some insights on the asymptotic validity of the bootstrap methodology when functional data, as well as a functional parameter, are involved.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Bootstrap; Functional data; Smoothed bootstrap; Bootstrap validity; Trimmed functional mean; Functional median

1. Introduction

We deal here with statistical setups where the available sample information consists of (or can be considered as) a set of functions. Depending on the approach and on the assumed structure of the data (which often come in a discretized version), this statistical field is called “longitudinal data analysis” (LDA) or “functional data analysis” (FDA). In the usual LDA situation only a few measurements are available for each individual; see Rice (2004) for a more detailed discussion of this terminology and related topics. We will follow here a purely functional approach, which entails considering the available data as functions and, as a consequence, defining and motivating the methods in a functional framework.

The book by Ramsay and Silverman (1997) has greatly contributed to popularizing FDA techniques among users, offering a number of appealing case studies and practical methodologies. A further book (Ramsay and Silverman, 2002) by the same authors is devoted to the applied aspects of FDA, with examples in growth analysis, meteorology, physiology, economics, medicine, etc. Additional information on FDA can be found in the web sites http://ego.psych.mcgill.ca/misc/fda/ and http://www.lsp.ups-tlse.fr/staph.

∗ Corresponding author. Tel.: +34 914973810; fax: +34 914974889.
E-mail address: [email protected] (A. Cuevas).

0167-9473/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.csda.2005.10.012


Simultaneously, the increasing popularity of FDA methods motivates the need for a solid theoretical foundation, as many basic issues (concerning, e.g., the asymptotic behavior) are often rather involved in the FDA setup. In spite of the considerable progress made in probability theory in function spaces (which provides a potential probabilistic background for FDA) and in estimation of infinite-dimensional parameters (see, e.g., Bickel et al. (1993)), the statistical FDA theory is still incomplete, as many topics remain partly unexplored from the mathematical point of view. Some theoretical developments with functional data have been made in fields such as principal component analysis, linear regression, data depth, clustering and anova models, among others. Some recent references are Boente and Fraiman (2000), Cardot and Sarda (2005), Cardot et al. (2003), Ferraty and Vieu (2002), Cuevas et al. (2004). See also the monographic issue of the journal Statistica Sinica (2004, 14).

If we focus on the topic of location estimation, we realize that FDA theory still offers a limited range of available estimators when compared with the classical parametric point estimation theory. Moreover, the exact calculation of sampling distributions in FDA problems presents an obvious difficulty, so that the bootstrap methodology often turns out to be the only practical alternative if we want to use functional estimators to derive confidence bands or hypothesis tests.

In this paper, we tackle the practical study (via bootstrap and simulation) of different functional estimators, and the asymptotic behavior of the bootstrap in this context. Thus, in Section 2 we consider several functional estimators, defined by analogy with some well-known location and scale real-valued estimators. Their practical performances are evaluated through simulation in Section 3. A real-data example is discussed in Section 4. Finally, Section 5 is devoted to a brief theoretical discussion (no formal proof is given) about the asymptotic validity of the bootstrap methodology with functional data.

2. Location estimators in functional setups

In a similar way to classical univariate point estimation, we can look for location estimators in order to get some idea about the “central value” of the population from which the sample of curves x1 = x1(t), . . . , xn = xn(t) has been drawn. Of course, the obvious candidate is the sample mean, but there are other possibilities, either oriented to achieve higher robustness or to take into account different aspects of the notion of “centrality” with functional data. In particular, we will use the following concept of (functional) depth, proposed by Fraiman and Muniz (2001): for every t ∈ [0, 1] let Fn,t be the empirical distribution of the sample x1(t), . . . , xn(t), and let Zi(t) denote the univariate depth of the value xi(t) in this sample, given by

Zi(t) = 1 − |1/2 − Fn,t(xi(t))|.

Then, define for i = 1, . . . , n,

Ii = ∫_0^1 Zi(t) dt    (1)

and rank the observations xi(t) according to the values of Ii. Thus the functional median (which we will denote by FM-median) coincides with the deepest curve, that is, the function xi(t) for which Ii is maximum.

Other functional L-estimators, such as the α-trimmed mean, can be defined as the average of the 100(1 − α)% deepest functions in the sample.
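To fix ideas, the ranking (1) and the two depth-based estimators admit a very direct implementation when the curves are stored on a common grid. The following R sketch is our own minimal illustration (not the code referred to in Section 3); X denotes a hypothetical n × N matrix whose rows are the discretized curves, and the integral in (1) is approximated by an average over an equispaced grid on [0, 1].

# Fraiman-Muniz depth for curves stored as rows of an n x N matrix X,
# observed on a common equispaced grid of N points in [0, 1]
fm_depth <- function(X) {
  n <- nrow(X)
  Fnt <- apply(X, 2, function(col) rank(col) / n)  # F_{n,t}(x_i(t)), column by column
  Z <- 1 - abs(0.5 - Fnt)                          # univariate depths Z_i(t)
  rowMeans(Z)                                      # I_i: grid approximation of (1)
}

# FM-median: the deepest curve of the sample
fm_median <- function(X) X[which.max(fm_depth(X)), ]

# alpha-trimmed mean: average of the 100(1 - alpha)% deepest curves
fm_trimmed_mean <- function(X, alpha = 0.25) {
  n <- nrow(X)
  deepest <- order(fm_depth(X), decreasing = TRUE)[seq_len(ceiling((1 - alpha) * n))]
  colMeans(X[deepest, , drop = FALSE])
}

With these definitions, fm_median(X) returns the deepest curve and fm_trimmed_mean(X, 0.25) the trimmed mean used later in the simulations.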

The extension of the concept of mode to a functional setup is not straightforward. The point is finding a suitable density function, reasonably easy to handle, whose maximization turns out to be feasible and meaningful in inference terms. Gasser et al. (1998) tackle the problem by using finite-dimensional projections; see also Hall and Heckman (2002). We will use here a somewhat different alternative approach which can be of interest as an exploratory data tool. The basic idea is to select the trajectory most densely surrounded by other trajectories of the process. Given a kernel function K : R → R and a fixed tuning parameter h, we define

g(x; h) = g(x; h; x1, . . . , xn) = (1/(nh)) Σ_{i=1}^n K(‖x − xi‖/h) = Σ_{i=1}^n Kh(‖x − xi‖),   and   M(x1, . . . , xn) = argmax_j g(xj; h).


We will use this version of the kernel mode as a tractable approximation to the following more “direct” version of this estimator:

M1(x1, . . . , xn) = argmax_x g(x; h). (2)
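In the same spirit, a minimal R sketch of the kernel mode M (again our own illustration, with X a hypothetical matrix of discretized curves) reduces to computing the pairwise distances between the sample curves. The Gaussian kernel is used here, and the bandwidth h is left as an argument; Section 3 takes h = 0.2 times the largest pairwise distance.

# Kernel mode over the sample curves: the trajectory most densely
# surrounded by the others, i.e. argmax_j g(x_j; h)
kernel_mode <- function(X, h) {
  n <- nrow(X)
  N <- ncol(X)
  D <- as.matrix(dist(X)) / sqrt(N)            # approximate L2[0,1] distances between curves
  K <- matrix(dnorm(as.vector(D) / h), n, n)   # Gaussian kernel applied entrywise
  g <- rowSums(K) / (n * h)                    # g(x_j; h) for each sample curve x_j
  X[which.max(g), ]                            # the kernel mode M(x_1, ..., x_n)
}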

As mentioned above, all these estimators are oriented to capture the central trend of the process, so their performance will be checked by assuming that the target is the mean function m(t) := E(X(t)) in all cases. This will be quite reasonable if the underlying model fulfills some symmetry requirements which lead these estimators to approach m(t) as n tends to infinity; in this respect the point of view here is quite similar to that of parametric robust statistics. A detailed study of the asymptotic behavior of these estimators is beyond the scope of this paper; see, however, Fraiman and Muniz (2001) for some consistency results for the FM-median and the α-trimmed mean (see also Section 4 below for some empirical assessments of consistency). The focus here is mainly on the use of these estimators as exploratory data analysis tools.

3. A simulation study

A Monte Carlo study has been carried out in order to evaluate the performance of the function-valued estimators defined in Section 2. The sampling information is given by functional data x1 = x1(t), . . . , xn = xn(t) (with n = 25, 100) obtained as iid observations from a stochastic process X(t) with continuous trajectories on [0, 1]. As we will see, the bootstrap procedures play a central role in the estimation methods studied here. The computer codes, written in the R language (R Development Core Team, 2004), are available from the authors. We have used the R packages MASS (Venables and Ripley, 2002) and e1071 (Dimitriadou et al., 2004).

The details of the Monte Carlo study are explained in the next paragraphs.

(a) The considered sampling models
We have used two models for the distribution of X(t):

(A) A Wiener process with trend, defined by X(t) = m(t) + B(t), where B(t) is a standard Brownian motion and m(t) := E(X(t)) = 10t(1 − t).

(B) A Gaussian process X(t) with mean m(t) = 10t(1 − t) and covariance function Cov(X(s), X(t)) = exp(−|s − t|/0.3).

The estimation targets are the mean function m(t) and the variance function V(t) = V(X(t)).

In order to check the robustness properties of the considered estimators, these models will also be used in “contaminated” versions, which are defined as mixtures giving with probability 0.95 the “central” model (A or B) and with probability 0.05 the “outlier” alternative model, defined by replacing the mean m(t) = 10t(1 − t) by 3m(t).

(b) The simulation mechanism
Of course, as both models correspond to continuous-time processes, we have to simulate them in a discretized approximation. Since (A) and (B) are both Gaussian processes, these discrete approximations are fairly simple to obtain: we just have to draw finite-dimensional marginals (X(t1), . . . , X(tN)) from an N-dimensional Gaussian distribution with mean (m(t1), . . . , m(tN)) and covariance matrix Cov(X(ti), X(tj)) = Cov(B(ti), B(tj)) = min{ti, tj} (model A) or Cov(X(ti), X(tj)) = exp(−|ti − tj|/0.3) (model B). In our simulation study we have taken N = 101 and tk = k/100, k = 0, . . . , 100.
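For readers wishing to reproduce this step, the discretized trajectories can be drawn with mvrnorm from the MASS package cited above. The following sketch is our own (the function name and arguments are illustrative); it also includes, as an option, the contamination mixture described in paragraph (a).

library(MASS)  # for mvrnorm

simulate_curves <- function(n, model = c("A", "B"), contaminate = FALSE, N = 101) {
  model <- match.arg(model)
  t <- seq(0, 1, length.out = N)                 # grid t_k = k/100 for N = 101
  m <- 10 * t * (1 - t)                          # mean function m(t)
  if (model == "A") {
    Sigma <- outer(t, t, pmin)                   # Cov(s, t) = min(s, t): Brownian motion
  } else {
    Sigma <- exp(-abs(outer(t, t, "-")) / 0.3)   # Cov(s, t) = exp(-|s - t| / 0.3)
  }
  X <- mvrnorm(n, mu = m, Sigma = Sigma)
  if (contaminate) {                             # mixture: 5% of curves drawn with mean 3 m(t)
    out <- runif(n) < 0.05
    X[out, ] <- X[out, ] + 2 * matrix(m, sum(out), N, byrow = TRUE)
  }
  X
}

For instance, X <- simulate_curves(100, "B") gives a 100 × 101 matrix of discretized curves of the kind used as input for the estimators below.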

(c) The estimators under study
We check the performance of five estimators, namely (see Section 2 above for definitions):

(1) The sample mean: x̄(t) = (1/n) Σ_{i=1}^n xi(t).

(2) The sample variance: V̂(t) = (1/n) Σ_{i=1}^n (xi(t) − x̄(t))^2.

(3) The FM-median, based on Fraiman and Muniz’s (2001) notion of functional depth.
(4) The α-trimmed (FM) mean with α = 0.25.

(5) The kernel-mode with Gaussian kernel K(t) = (1/√(2π)) exp(−t^2/2) and bandwidth h = 0.2 max{‖xi − xj‖ : i, j = 1, . . . , n}, where ‖·‖ denotes either the L2 or the L∞ norm (see paragraph (e) below); a short R sketch of this bandwidth rule is given after this list.
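Under the same hypothetical matrix representation as in the earlier sketches, the bandwidth rule of item (5) takes two lines (L2 version; for the L∞ case the pairwise distances would be the maxima of the absolute differences):

# X: hypothetical matrix of discretized curves (e.g. the output of simulate_curves above)
D <- as.matrix(dist(X)) / sqrt(ncol(X))   # approximate L2 distances, as in the kernel_mode sketch
h <- 0.2 * max(D)                         # bandwidth of item (5)
mode_hat <- kernel_mode(X, h)             # kernel mode of Section 2 with this bandwidth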


[Figure: two panels over [0, 1]; sample mean and FM-median estimates for n = 25, 100, 500, together with the theoretical curve.]
Fig. 1. Estimation of m(t) = 10t(1 − t) using the sample mean and the FM-median.

Besides the sample variance, which is obviously used in the estimation of V(t), the remaining estimators are, in a broad sense, intended to approach m(t) and (in the case of the FM-median, the trimmed mean and the kernel-mode) simultaneously to take into consideration other aspects (robustness, modality, number of groups, etc.). As an example, the behavior of the FM-median under model A, compared to that of the ordinary sample mean, can be visualized in Fig. 1, which shows the target function m(t) = 10t(1 − t) and the approximations obtained with different sample sizes. As expected, in this contamination-free situation the mean is more efficient, though the FM-median is also clearly targeted at m(t).

(e) Bootstrap confidence bands
The aim of the simulation study is to assess the performance of the bootstrap confidence bands constructed as follows. Given the original data x1(t), . . . , xn(t), we draw 500 bootstrap samples from them. Denote by x∗1(t), . . . , x∗n(t) a generic bootstrap sample. A “0.95 bootstrap confidence band” based on the estimator T(x1, . . . , xn) is then defined by calculating the value D(x1, . . . , xn) such that 95% of the bootstrap replications T(x∗1, . . . , x∗n) are within a distance from their average smaller than D(x1, . . . , xn). In other words, D(x1, . . . , xn) is the radius of a bootstrap 95% tolerance band centered at the bootstrap mean of T. In the simulation study the actual performance of such bootstrap bands is evaluated through 500 replications; that is, 500 original samples are drawn, the corresponding bootstrap bands are constructed and we calculate the coverage proportion of the target function (m(t) or V(t)) in such bands. Two distances are used in the construction of the confidence bands: the L2-metric

‖x − y‖ = ( ∫_0^1 (x(t) − y(t))^2 dt )^{1/2},

and the supremum (L∞) metric, ‖x − y‖∞ = sup_{t∈[0,1]} |x(t) − y(t)|.

(f) Two further resampling methods: smoothed and parametric bootstrap
In the classical resampling methodology for univariate data, an alternative bootstrap procedure (called smoothed bootstrap) is sometimes used in order to avoid the appearance of repeated values in the artificial samples. The basic idea is to replace the standard bootstrap samples (made of iid observations from the empirical distribution Fn) by the so-called smoothed bootstrap samples, drawn from a smoothed version F̂n of Fn which is often of kernel type; that is, F̂n is the convolution of Fn with a re-scaled kernel Kh (for example Kh = N(0, h)). In practice this amounts to obtaining bootstrap observations of type X0i = X∗i + Zi, where X∗i is drawn from Fn and Zi is independent of X∗i with distribution Kh.

In our functional setup, the simulation outputs point out a failure of the standard bootstrap for the functional median and also for the kernel-mode. Our attempt to improve things is based as well on a kind of smoothing: the bootstrap process trajectories are approximated by finite-dimensional variables (X0(t1), . . . , X0(tN)) defined by X0(ti) = X∗(ti) + Z(ti), where (X∗(t1), . . . , X∗(tN)) is a standard bootstrap replication drawn from the original discretized trajectories (X(t1), . . . , X(tN)) and (Z(t1), . . . , Z(tN)) is normally distributed with mean 0 and covariance matrix γΣx, where Σx is the covariance matrix of (X(t1), . . . , X(tN)). Here γ is the smoothing parameter. We first tried γ = 0.01, which turned out to be insufficient for the median and mode, although it performed well for the remaining estimators. The simulation results given below have been obtained with γ = 0.05. In Fig. 2 we show the 500 bootstrap replications obtained for the FM-median with (on the left) and without smoothing. It is apparent that the number of different replications in the unsmoothed case is not sufficient to provide a reliable approximation to the confidence band.
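To make the construction of paragraphs (e) and (f) concrete, the following R sketch (our own illustration, not the authors' released code) draws B bootstrap samples of the discretized curves, optionally adds the Gaussian smoothing term with covariance γΣx (Σx estimated here by the sample covariance of the curves), applies a curve-valued statistic, and returns the radius D as the 95% quantile of the distances of the replications to their bootstrap mean; both distances of paragraph (e) are provided.

library(MASS)  # mvrnorm, used for the smoothing perturbation

dist_L2  <- function(x, y) sqrt(mean((x - y)^2))   # grid approximation of the L2[0,1] metric
dist_sup <- function(x, y) max(abs(x - y))         # supremum (L-infinity) metric

# Radius D of the bootstrap band for a curve-valued statistic (colMeans, fm_median, ...);
# gamma = 0 gives the standard bootstrap, gamma > 0 the smoothed version with Z ~ N(0, gamma * Sigma_x)
boot_band <- function(X, statistic, B = 500, distance = dist_sup, gamma = 0, level = 0.95) {
  n <- nrow(X)
  Sigma <- gamma * cov(X)
  Tstar <- t(replicate(B, {
    Xb <- X[sample(n, replace = TRUE), , drop = FALSE]                 # standard bootstrap resample
    if (gamma > 0) Xb <- Xb + mvrnorm(n, mu = rep(0, ncol(X)), Sigma = Sigma)  # smoothed version
    statistic(Xb)
  }))
  center <- colMeans(Tstar)                                            # bootstrap mean of the statistic
  D <- quantile(apply(Tstar, 1, distance, y = center), probs = level)  # radius D(x_1, ..., x_n)
  list(center = center, radius = unname(D), replications = Tstar)
}

For instance, boot_band(X, colMeans) gives the standard-bootstrap band for the sample mean, while boot_band(X, fm_median, gamma = 0.05) gives the smoothed version used for the FM-median.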


Fig. 2. 500 bootstrap replications of the FM-median with and without smoothing.

Table 1. Coverage percentages for the bootstrap bands.

Estimator and           Model A                          Model B
bootstrap method        L2              L∞               L2              L∞
                        n=25    n=100   n=25    n=100    n=25    n=100   n=25    n=100
Mean StB                92.2    92.6    92.0    94.2     93.0    94.4    95.2    96.2
Mean SmB                92.8    93.4    93.2    94.8     94.2    95.6    96.0    96.6
Mean PtB                92.8    93.4    92.4    94.8     93.2    94.4    95.0    96.4
Var StB                 84.6    94.4    86.0    92.8     88.6    96.6    98.2    95.8
Var SmB                 86.8    95.8    88.2    95.6     91.6    98.8    99.8    99.2
FM-median SmB           95.4    95.8    97.6    94.2     97.8    97.4    95.8    95.4
Trimmed mean StB        95.4    94.6    95.4    94.8     96.6    95.6    98.0    97.6
Kernel-mode SmB         97.2    96.4    98.4    98.6     97.2    99.6    100     98.4

Finally, in the case of the sample mean a further resampling mechanism, which we denote by “parametric bootstrap”, has been considered in our simulation study. The basic point is that, by the standard functional central limit theorem (see, e.g., Laha and Rohatgi, 1979, p. 474), the distribution of X̄(t) can be, for large n, approximated by that of a Gaussian process with mean m(t) and covariance function Cov(X(s), X(t))/n. Then, we may use the original sample to estimate m(t) and Cov(X(s), X(t)), so obtaining an estimator of the (approximate) distribution of X̄(t).
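In discretized form this parametric bootstrap is again a single call to mvrnorm: the sample mean curve and the sample covariance matrix divided by n play the roles of m(t) and Cov(X(s), X(t))/n. A minimal sketch, under the same assumptions as before:

library(MASS)  # mvrnorm

# Parametric bootstrap for the functional sample mean: replications of the mean curve
# drawn from its (estimated) Gaussian limit, with mean x-bar(t) and covariance Cov-hat / n
param_boot_mean <- function(X, B = 500) {
  n <- nrow(X)
  mvrnorm(B, mu = colMeans(X), Sigma = cov(X) / n)
}

The resulting B × N matrix of replications can then be fed into the same radius computation used in boot_band above.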

(g) The simulation results
The following tables summarize the numerical outputs of our simulation study. The values in Table 1 correspond to the coverage percentages of m(t) or V(t) for bootstrap confidence bands obtained with different procedures. We have used the following abbreviations: StB = Standard (unsmoothed) Bootstrap, SmB = Smoothed Bootstrap, PtB = Parametric Bootstrap. The notations A, B refer to the considered model and L2, L∞ to the distance used in the construction of the confidence bands.

Obviously, a complete evaluation of the performance of confidence bands must also consider, besides the confidence level, the complementary issue of “accuracy”. This can be measured in terms of the “width” of such bands, that is, the range of distances (radius) covered by every confidence band. Of course, for a given confidence level, higher accuracy corresponds to smaller widths.

We have evaluated the accuracy of the confidence bands by obtaining, for every considered estimator, distance, sample size and underlying model, the histogram of the band widths resulting from the 500 replications. As an example, in Fig. 3 we show two of these histograms; we have used in both cases the standard bootstrap under model A with sample size n = 100.


[Figure: two histograms of band widths (horizontal axis: distance, roughly 0.25–0.45; vertical axis: frequency), labelled “Mean StB” and “Trimmed Mean StB”.]
Fig. 3. Histograms of the widths for the L∞-confidence bands based on the sample mean (left) and the 0.25-trimmed mean (right).

Table 2. Modal values of the confidence band widths.

Estimator and           Model A                          Model B
bootstrap method        L2              L∞               L2              L∞
                        n=25    n=100   n=25    n=100    n=25    n=100   n=25    n=100
Mean StB                5.5     1.50    0.425   0.215    9.5     2.50    0.625   0.295
Mean SmB                5.5     1.75    0.425   0.225    9.5     2.50    0.625   0.290
Mean PtB                5.5     1.50    0.425   0.225    9.5     2.50    0.625   0.307
Var StB                 2.5     1.75    0.550   0.275    12.5    3.75    0.950   0.470
Var SmB                 7.5     2.50    0.550   0.325    12.5    4.25    0.950   0.520
FM-median SmB           17.5    7.75    1.750   1.500    57.5    32.5    1.750   1.500
Trimmed mean StB        9.0     2.25    0.550   0.250    15.0    3.25    0.775   0.375
Kernel-mode SmB         15.0    6.75    2.350   1.550    47.5    31.5    2.350   1.550

Table 3. Estimated ranges for the widths of the confidence bands.

Estimator and          Model A                                            Model B
bootstrap method       L2                     L∞                          L2                     L∞
                       n=25      n=100        n=25        n=100           n=25      n=100        n=25        n=100
Mean StB               2–15      1–3          0.3–0.6     0.16–0.27       5–17      1.5–3.25     0.5–0.85    0.27–0.335
Mean SmB               2–16      1–3.25       0.3–0.65    0.17–0.28       6–18      1.75–3.5     0.5–0.85    0.27–0.36
Mean PtB               2–14      1–3          0.25–0.65   0.16–0.27       5–17      1.5–3.25     0.5–0.85    0.275–0.325
Var StB                0–50      0.5–7.5      0.20–1.30   0.15–0.55       5–45      2–8.5        0.5–1.50    0.35–0.62
Var SmB                0–50      0–8          0.2–1.4     0.15–0.55       5–55      2.5–9        0.6–1.8     0.38–0.62
FM-median SmB          5–40      5.5–12.5     1.5–3       0.75–2.75       30–100    20–60        12.5–30     1–2.75
Trimmed mean StB       2–23      1–4.5        0.3–0.9     0.15–0.4        5–35      2–7          0.6–1.15    0.31–0.44
Kernel-mode SmB        5–35      5–10         1.7–2.9     1.25–1.9        30–90     20–42.5      1.7–2.8     1.25–1.9

The 64 histograms for all the considered cases are summarized in Tables 2 and 3. The numbers in the cells of Table 2 give the estimated modal values of the band widths, defined as the mid point of the corresponding modal intervals of the histograms. Table 3 shows the estimated ranges for the widths, defined as the support intervals of the respective histograms. Let us note that we have used the square of the L2-metric, so that the outputs corresponding to the L2 columns in both tables are in fact expressed in terms of the square of this distance.

Table 4. Coverage percentages (Cover.), modal values of the widths (Mod.), and ranges of widths obtained for the L2-confidence bands resulting under contaminated models.

Estimator and           Model A (contaminated)            Model B (contaminated)
bootstrap method        Cover.   Mod.    Ranges           Cover.   Mod.    Ranges
Mean StB                56.6     3.75    1.00–8.00        64.2     4.25    2.00–13.00
Mean SmB                59.4     3.50    1.00–9.00        66.4     5.50    2.00–13.00
Mean PtB                58.6     3.75    1.00–8.50        65.0     5.50    2.00–12.00
Trimmed mean StB        93.4     2.25    1.00–7.50        93.0     3.75    2.00–7.00

Finally, Table 4 is devoted to robustness performances. It gives the results (coverage, range of widths and modal widths) under the above defined contaminated versions of models A and B. In order to simplify the presentation, this table provides just the results for the (square of the) L2-metric with sample size n = 100 and includes only the mean (as a standard reference) and the trimmed mean (which, in some sense, turns out to be the “winner” of our simulation study).

(h) Conclusions
The main conclusions of our simulation study can be summarized as follows:

(1) Whereas no substantially different conclusion is drawn from the L2 and the L∞ outputs, it is still interesting to have both results, as these distances provide complementary perspectives. The L2 metric leads to a more convenient mathematical setup (that of Hilbert spaces), so it is especially useful for theoretical developments; see Section 5 below. The L∞ metric is more appealing from the practical point of view, as it is directly interpretable in visual terms. As for the comparison between models A and B, the results (especially in Tables 2 and 3) are clearly worse for B. This is not surprising, in view of the higher variability of this model: V(X(t)) = t for A and V(X(t)) = 1 for B.

(2) The parametric bootstrap does not seem to provide any clear advantage in the only case (the sample mean) where it is easily applicable. The smoothed bootstrap is required in order to use either the functional median or the kernel-mode, but it leads to some loss of efficiency in the cases of the sample mean and the sample variance. In general terms, the standard bootstrap appears as a reasonable alternative, except in the cases (e.g., the functional median) where the considered statistic provides too many repeated values.

(3) The simulation outputs show a huge improvement when the sample size increases from 25 to 100. Let us recall that, as our confidence bands rely on resampling procedures, their motivation is necessarily asymptotic, so that moderate or large sample sizes are required.

(4) Maybe the most relevant result of our study is the good behavior of the trimmed mean. Note that the trimming value (0.25) is rather large, so that some efficiency cost with respect to the sample mean is to be expected. However, this cost is moderate, especially taking into account the excellent robustness performance of the trimmed mean, as shown in Table 4, as well as the fact that the trimmed mean does a better job in approximating the nominal level.

(5) With regard to the FM-median and the kernel-mode, more research is needed and several other alternative versions could be checked. For example, we conjecture that an approximation to the median, obtained via the α-trimmed mean with a large value of α, could provide much more satisfactory results. The choice of the tuning parameter h in the kernel mode would also require a more detailed study. We believe that this estimator could be useful as an auxiliary tool for functional clustering, in a similar way to that of Cuevas et al. (2001) for the standard case of multivariate data.

4. A case-study with real data

We consider here an example in experimental cardiology which comes from the research work conducted by Dr. David García-Dorado at the Vall d’Hebron Hospital (Barcelona, Spain); see, e.g., Ruiz-Meana et al. (2003) and references therein for details on the biochemical and medical aspects of this example.


[Figure: two panels with Time (0–300) on the horizontal axis; left: control mean vs. treatment mean; right: control trimmed mean vs. treatment trimmed mean.]
Fig. 4. Confidence bands for the ordinary sample mean and the trimmed mean with the MCO data.

[Figure: two panels with Time (0–300) on the horizontal axis; left: control FM-median vs. treatment FM-median; right: control kernel-mode vs. treatment kernel-mode.]
Fig. 5. Confidence bands for the sample median and the kernel-mode with the MCO data.

The variable under study is the so-called mitochondrial calcium overload (MCO), a measure of the mitochondrial calcium ion Ca2+ level. This variable was measured every 10 s during an hour in isolated mouse cardiac cells, thus leading to genuine functional data. We had previously analyzed these data (from the point of view of functional anova tests) in Cuevas et al. (2004).

It turns out that, during myocardial ischemia, high levels of MCO are associated with better protection against the ischemia process. Thus, it is interesting to assess the effect of different procedures potentially leading to an increase in the MCO level. This is the case of a drug called Cariporide, which was expected to induce an MCO increase as a consequence of its properties as a selective blocker of the ionic exchange.

In short, the data we analyze here consist of two samples with sizes n1 = 45 functions (control group) and n2 = 44 functions (treatment group, with Cariporide). Our purpose here is to make an exploratory analysis of these data using the estimators in Section 2, with their corresponding bootstrap bands.

The results are summarized in Figs. 4–6. The confidence bands (for the mean, trimmed mean, median, mode and variance) shown in these figures have been obtained from 500 replications via smoothed bootstrap, using the supremum (L∞) distance. The darker lines correspond to the treatment group and the lighter ones to the control group. In each case, the estimator is the central, slightly thicker, line. We should mention that, due to technical reasons related to the experimental conditions, the variables under observation show wild oscillations at the beginning of the observation period. For this reason we have disregarded the first 19 observations in every observed curve.

These figures are, hopefully, self-explanatory. It is clear that the treatment group shows a greater variability, mainly due (see Fig. 6) to the 25% of observations with lower depth. It also seems quite reasonable to accept that the treatment group has “central values” consistently higher than those of the control group. For instance, observe that the confidence bands (based on both the sample mean and the trimmed mean) for the control and treatment groups have only a small intersection, at the higher values for the treatment. Also, the confidence band of the median for the control group lies almost completely below the median of the treatment group.
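As an indication of how such figures can be produced, the analysis essentially amounts to applying the band-construction sketch of Section 3 to each group separately. Here mco_control and mco_treatment are hypothetical 45 × N and 44 × N matrices holding the discretized MCO curves (after dropping the first 19 time points); the estimator and the plotting details are of course interchangeable.

# Smoothed-bootstrap bands (supremum distance) for each group, using the earlier sketches
band_ctrl  <- boot_band(mco_control,   fm_trimmed_mean, distance = dist_sup, gamma = 0.05)
band_treat <- boot_band(mco_treatment, fm_trimmed_mean, distance = dist_sup, gamma = 0.05)

# With the supremum distance, each band is simply center +/- radius, pointwise in t
matplot(cbind(band_ctrl$center - band_ctrl$radius, band_ctrl$center, band_ctrl$center + band_ctrl$radius,
              band_treat$center - band_treat$radius, band_treat$center, band_treat$center + band_treat$radius),
        type = "l", lty = c(2, 1, 2, 2, 1, 2), col = c(rep("grey50", 3), rep("black", 3)),
        xlab = "Time", ylab = "MCO")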


[Figure: two panels with Time (0–300) on the horizontal axis; left: control vs. treatment variance; right: the 75% deepest functions for control and treatment.]
Fig. 6. Confidence bands for the variance (left). Graphical representation of the 75% deepest functions (right).

5. Some remarks on asymptotic bootstrap validity

In view of the developments in the previous sections it is natural to ask about the validity (at least in an asymptotic sense) of the involved bootstrap approximations. The basic question is: can we ensure that, for n large enough, the sampling distributions of the considered statistics are properly approximated by their bootstrap counterparts? In the literature on this topic the bootstrap validity (sometimes called “consistency”) results usually apply to statistics of type T(Pn), where T is a differentiable operator (taking values in a functional space) and Pn is the empirical distribution associated with a sample X1, . . . , Xn of n functions drawn from a common distribution P. In practical terms, the goal is to establish that the distribution of √n(T(Pn) − T(P)) is close to that of its corresponding bootstrap version √n(T(P∗n) − T(Pn)), where P∗n is the empirical distribution based on an artificial (bootstrap) sample drawn from the original sample. This is often proved by showing that a suitable distance between the sampling distributions of both sequences tends (almost surely) to zero or, alternatively, by showing that both sequences of centered re-scaled statistics tend (a.s.), in the sense of weak convergence, to the same limit. In particular, this will provide theoretical support for the use of bootstrap confidence bands in the estimation of the functional target T(P).

Not surprisingly, this line of research goes back to the very beginning of the bootstrap as a popular resampling technique. Thus, the classical works by Bickel and Freedman (1981) and Parr (1985), among others, have established the bootstrap validity for a number of real-valued statistics based on real-valued sample observations. These include the ordinary sample mean as well as those statistics generated by differentiable statistical functionals which are “linearized”, that is, approximated by suitable sample means, through first-order Taylor expansions (this is the so-called “delta method”). Considerable effort has also been devoted to the study of the performance of bootstrap-based confidence intervals (e.g., Hall (1988)).

The functional counterpart of this theory is less developed. However, Giné and Zinn (1990) have proved, in a very general setup, a bootstrap version of Donsker's theorem for empirical processes which includes a validity result for the functional mean. A partial extension of this result is given in Sheehy and Wellner (1992); see also Dudley (1990) and Gill (1989). Politis and Romano (1994) have proved the consistency of the bootstrap for the sample mean in the case of uniformly bounded functional (not necessarily independent) variables taking values in a separable Hilbert space. van der Vaart and Wellner (1996, Section 3.9.3) prove three theorems of bootstrap validity, based on the “delta method” for Hadamard (or Fréchet) differentiable statistical functionals taking values in normed spaces. These results are indeed very general and, in principle, could be used for some of the above estimators. However, if additional assumptions (in particular, the uniform boundedness of the observed trajectories) are introduced, a slightly different variant of this methodology can be considered. It relies on the use of the Fréchet differential, with an approach quite similar to that proposed by Parr (1985, Theorem 4) in the finite-dimensional case.

We now outline the basic ideas of this alternative validity result. The technical details, as well as the corresponding proofs, are available from the authors. A preliminary incomplete version can be found in Cuevas and Fraiman (2004). Recall that, in what follows, P is a probability measure on a function space X (we will assume, for the sake of concreteness, X = L2[0, 1]), which corresponds to the underlying distribution generating the functional data, and the estimator T(Pn) is generated by an X-valued operator T, defined on a suitable space of probability measures on X which includes all the empirical distributions.

(a) A bound can be obtained for the distance d(Pn, P) between the underlying distribution P and the empirical Pn. It is uniform in all the possible underlying distributions P whose support is contained in a common bounded set H of X. In precise terms, this bound establishes that for all ε > 0 there exists K = K(ε) such that

P{√n d(Pn, P) > K} < ε,   for all n, for all P with support in H. (3)

Here d represents the so-called bounded Lipschitz metric defined by

d(Pn, P) = sup_{f∈F} | ∫ f dPn − ∫ f dP |, (4)

F being the class of Lipschitz functions with Lipschitz constant ≤ 1. As we will point out below, the uniformity on P of (3) allows us to apply it simultaneously to d(Pn, P) and d(P∗n, Pn), which is particularly useful in the differential methodology for showing bootstrap validity. The proof follows as a consequence of an exponential bound obtained by Yukich (1986, Theorem 1), which in turn is based upon standard techniques of empirical processes.

(b) Now, let us assume that T fulfills a differentiability property of type

T(Q) = T(P) + T′_P(Q − P) + o(d(Q, P)), (5)

where the remainder term o(d(Q, P)) denotes, as usual, an operator such that lim_{Q→P} o(d(Q, P))/d(Q, P) = 0, and T′_P is a linear operator (“the differential at P”) defined on the space of finite signed measures on H. Then, by following the same procedure as in Parr (1985, Theorem 4), the differentiability assumption (5) is used to get first-order Taylor expansions for both T(Pn) − T(P) and T(P∗n) − T(P). The respective remainder terms are of type o(d(Pn, P)) and o(d(P∗n, P)). As a consequence, the re-scaled statistic √n(T(Pn) − T(P)) and its bootstrap counterpart √n(T(P∗n) − T(Pn)) are shown to converge weakly almost surely (that is, in the weak sense for almost all sequences of sample data) to the same Gaussian limit; thus, the sampling distribution (under P) of √n(T(Pn) − T(P)) and the bootstrap sampling distribution (under Pn) of √n(T(P∗n) − T(Pn)) must be close to each other for n large enough.

(c) The proof follows from the fact that √n(T(P∗n) − T(Pn)) can be expressed as the sum of a linear term, with a Gaussian limit as a consequence of the bootstrap validity for the sample mean (see Giné and Zinn (1990, Remark 2.5)), and a sum of two remainder terms of type √n o(d(P∗n, P)) + √n o(d(Pn, P)) which goes to zero, almost surely (a.s.), since √n d(Pn, P) is bounded in probability (a.s.) by (3) and, from the triangle inequality, √n d(P∗n, P) is also bounded in probability (uniformly on P) a.s., as both √n d(P∗n, Pn) and √n d(Pn, P) are, as a further consequence of (3). Note that the bound (3) plays in our functional framework a role similar to that of the well-known Dvoretzky–Kiefer–Wolfowitz (DKW) inequality in the case of real-valued data. The uniformity in F (the underlying distribution) of this latter inequality is used in Parr (1985, Theorem 4) to show the convergence of the bootstrap remainder term o(√n d(F∗n, Fn)), where F∗n is nothing but the empirical distribution drawn from Fn. In this case no assumption of bounded support is needed, as the DKW-inequality is completely universal on F. In our functional setting, however, this assumption is required in order to be able to apply the entropy argument involved in the proof of (3). On the other hand, the hypothesis of uniform boundedness for the observed trajectories is not very restrictive in practice. It is in some sense similar to the assumption of compact support in nonparametric estimation. If one is willing to renounce the usual Gaussian models (which is also the case in nonparametrics), the hypothesis of boundedness looks quite natural, as every observable phenomenon provides in fact observations taking values in a bounded domain (whose limits are imposed by the measurement instruments). Note also that the boundedness condition must be fulfilled in the metric of the space where the random elements Xi take values. For example, if this space is L2[0, 1], the assumption that Xi ∈ H, where H is bounded in L2[0, 1], does not entail that the realizations of Xi have to be bounded in the supremum sense.
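In display form (a sketch only, using the notation above and formula (5)), the decomposition behind (b) and (c) reads

\[
\sqrt{n}\,\bigl(T(P_n^{*}) - T(P_n)\bigr)
  = T'_P\bigl(\sqrt{n}\,(P_n^{*} - P_n)\bigr)
  + \sqrt{n}\, o\bigl(d(P_n^{*}, P)\bigr)
  + \sqrt{n}\, o\bigl(d(P_n, P)\bigr),
\]

obtained by writing T(P∗n) − T(Pn) = [T(P∗n) − T(P)] − [T(Pn) − T(P)] and applying (5) to each bracket; by (3) both remainder terms vanish a.s., while the linear term has the Gaussian limit mentioned in (c).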

(d) As a by-product of some practical interest, the indicated method of proof allows us to incorporate some powerful uniform bounds by Yurinskii (1976) and Yukich (1986), which can be applied to get some information on the difference Δn = T(Pn) − T(P∗n) between the original estimator T(Pn) and its bootstrap version T(P∗n). In particular, two exponential inequalities can be obtained which ensure that Δn converges to zero in probability (almost surely), uniformly in P.


(e) In general it is not easy to prove the differentiability of a given operator, even for the classical case of real-valued functionals T. The functional trimmed mean and the functional mode (2) defined above can both be expressed in the operator format T(Pn), with T(P) = argmax_u ∫ Kh(‖u − x‖) dP(x) for (2) and

T(P)(t) = E(X(t) I_{[β,∞)}(I(X))) / E(I_{[β,∞)}(I(X)))

for the α-trimmed mean, where I denotes the ranking variable defined in (1) and I_{[β,∞)} represents the usual indicator function of the interval [β, ∞), β being such that E(I_{[β,∞)}(I(X))) = 1 − α; see Fraiman and Muniz (2001) for more details. The application of the differential methodology (in any version) to derive the bootstrap validity for these estimators does not seem at all straightforward. However, the sample variance provides an example (apart from the obvious one of the sample mean) where the corresponding functional T is differentiable in the sense (5) and the bootstrap validity can be derived from the differential methodology outlined above. For example, with regard to condition (5), it can be readily seen that (under the imposed boundedness assumptions) the variance estimator is generated by a differentiable operator T whose differential is given by

T′_P(Q − P)(t) = E_{Q−P}[X^2(t)] − 2 E_{Q−P}[X(t)] E_P[X(t)],

where E_P denotes integration with respect to the measure P.
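For completeness, the algebra behind this expression can be sketched as follows (writing T(P)(t) = E_P[X^2(t)] − (E_P[X(t)])^2 and E_Q[X(t)] = E_P[X(t)] + E_{Q−P}[X(t)]):

\[
\begin{aligned}
T(Q)(t) - T(P)(t)
  &= E_Q[X^2(t)] - E_P[X^2(t)] - \bigl(E_Q[X(t)]\bigr)^2 + \bigl(E_P[X(t)]\bigr)^2 \\
  &= E_{Q-P}[X^2(t)] - 2\,E_P[X(t)]\,E_{Q-P}[X(t)] - \bigl(E_{Q-P}[X(t)]\bigr)^2 ;
\end{aligned}
\]

the first two terms are precisely T′_P(Q − P)(t), while the last term is quadratic in E_{Q−P}[X(t)], which is where the boundedness assumptions enter to show that it is o(d(Q, P)).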

Acknowledgements

We are very grateful to David García-Dorado and Marisol Ruiz-Meana (Servicio de Cardiología, Hospital Vall d’Hebron, Barcelona) for providing us with the calcium data. The constructive comments from the reviewers led to substantial improvements in the manuscript. This work has been partially supported by Spanish Grants MTM2004-0098 (A. Cuevas and R. Fraiman), BFM2002-03213 and PGIDIT03-PXIC-20702PN (M. Febrero).

References

Bickel, P.J., Freedman, D.A., 1981. Some asymptotic theory for the bootstrap. Ann. Statist. 9, 1196–1217.
Bickel, P.J., Klaassen, C., Ritov, Y., Wellner, J.A., 1993. Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins University Press, Baltimore.
Boente, G., Fraiman, R., 2000. Kernel-based functional principal components. Statist. Probab. Lett. 48, 335–345.
Cardot, H., Sarda, P., 2005. Estimation in generalized linear models for functional data via penalized likelihood. J. Multivariate Anal. 92, 24–41.
Cardot, H., Ferraty, F., Mas, A., Sarda, P., 2003. Testing hypotheses in the functional linear model. Scand. J. Statist. 30, 241–255.
Cuevas, A., Fraiman, R., 2004. On the bootstrap methodology for functional data. In: Antoch, J. (Ed.), Proceedings in Computational Statistics, COMPSTAT 2004. Physica-Verlag, Heidelberg, pp. 127–135.
Cuevas, A., Febrero, M., Fraiman, R., 2001. Cluster analysis: a further approach based on density estimation. Comput. Statist. Data Anal. 36, 441–459.
Cuevas, A., Febrero, M., Fraiman, R., 2004. An anova test for functional data. Comput. Statist. Data Anal. 47, 111–122.
Dimitriadou, E., Hornik, K., Leisch, F., Meyer, D., Weingessel, A., 2004. e1071: Misc Functions of the Department of Statistics, TU Wien. R package version 1.5-1. http://www.R-project.org.
Dudley, R.M., 1990. Nonlinear functionals of empirical measures and the bootstrap. In: Eberlein, E., Kuelbs, J., Marcus, M.B. (Eds.), Probability in Banach Spaces, vol. 7 (Oberwolfach, 1988). Birkhäuser, Boston, pp. 63–82.
Ferraty, F., Vieu, P., 2002. The functional nonparametric model and application to spectrometric data. Comput. Statist. 17, 545–564.
Fraiman, R., Muniz, G., 2001. Trimmed means for functional data. Test 10, 419–440.
Gasser, T., Hall, P., Presnell, B., 1998. Nonparametric estimation of the mode of a distribution of random curves. J. Roy. Statist. Soc. B 60, 681–691.
Gill, R.D., 1989. Non- and semi-parametric maximum likelihood estimators and the von Mises method. I. Scand. J. Statist. 16, 97–128.
Giné, E., Zinn, J., 1990. Bootstrapping general empirical measures. Ann. Probab. 18, 851–869.
Hall, P., 1988. Theoretical comparison of bootstrap confidence intervals (with discussion). Ann. Statist. 16, 927–985.
Hall, P., Heckman, N., 2002. Estimating and depicting the structure of a distribution of random functions. Biometrika 89, 145–158.
Laha, R.G., Rohatgi, V.K., 1979. Probability Theory. Wiley, New York.
Parr, W.C., 1985. The bootstrap: some large sample theory and connections with robustness. Statist. Probab. Lett. 3, 97–100.
Politis, D.N., Romano, J.P., 1994. Limit theorems for weakly dependent Hilbert space valued random variables with application to the stationary bootstrap. Statist. Sin. 4, 461–476.
R Development Core Team, 2004. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org.
Ramsay, J.O., Silverman, B.W., 1997. Functional Data Analysis. Springer, New York.
Ramsay, J.O., Silverman, B.W., 2002. Applied Functional Data Analysis. Springer, New York.
Rice, J.A., 2004. Functional and longitudinal data analysis: perspectives on smoothing. Statist. Sin. 14, 631–647.
Ruiz-Meana, M., García-Dorado, D., Pina, P., Inserte, J., Agulló, L., Soler-Soler, J., 2003. Cariporide preserves mitochondrial proton gradient and delays ATP depletion in cardiomyocytes during ischemic conditions. Amer. J. Physiol. Heart Circulatory Physiol. 285, H999–H1006.
Sheehy, A., Wellner, J.A., 1992. Uniform Donsker classes of functions. Ann. Probab. 20, 1983–2030.
van der Vaart, A., Wellner, J., 1996. Weak Convergence and Empirical Processes. Springer, New York.
Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S, fourth ed. Springer, New York.
Yukich, J.E., 1986. Uniform exponential bounds for the normalized empirical process. Stud. Math. 84, 71–78.
Yurinskii, V.V., 1976. Exponential inequalities for sums of random vectors. J. Multivariate Anal. 6, 473–479.