Outlier detection algorithms and their performance in GOCE gravity field processing

M. Kern1, T. Preimesberger2, M. Allesch1, R. Pail1, J. Bouman3, R. Koop3

1 Institute of Navigation and Satellite Geodesy, Graz University of Technology, Steyrergasse 30/III, 8010 Graz, Austria; e-mail: [email protected]; Tel: +43 316 873 6349; Fax: +43 316 873 6845
2 Space Research Institute, Austrian Academy of Science, Schmiedlstrasse 6, 8042 Graz, Austria
3 SRON National Institute for Space Research, Sorbonnelaan 2, 3584 CA Utrecht, The Netherlands

Journal of Geodesy (2005) 78: 509–519, DOI 10.1007/s00190-004-0419-9

Received: 5 January 2004 / Accepted: 5 October 2004 / Published Online: 29 January 2005

Abstract. The satellite missions CHAMP, GRACE, and GOCE mark the beginning of a new era in gravity field determination and modeling. They provide unique models of the global stationary gravity field and its variation in time. Due to inevitable measurement errors, sophisticated pre-processing steps have to be applied before further use of the satellite measurements. In the framework of the GOCE mission, this includes outlier detection, absolute calibration and validation of the SGG (satellite gravity gradiometry) measurements, and removal of temporal effects. In general, outliers are defined as observations that appear to be inconsistent with the remainder of the data set. One goal is to evaluate the effect of additive, innovative and bulk outliers on the estimates of the spherical harmonic coefficients. It can be shown that even a small number of undetected outliers (< 0.2% of all data points) can have an adverse effect on the coefficient estimates. Consequently, concepts for the identification and removal of outliers have to be developed. Novel outlier detection algorithms are derived and statistical methods are presented that may be used for this purpose. The methods aim at high outlier identification rates as well as small failure rates. A combined algorithm, based on wavelets and a statistical method, shows best performance with an identification rate of about 99%. To further reduce the influence of undetected outliers, an outlier detection algorithm is implemented inside the gravity field solver (the Quick-Look Gravity Field Analysis tool was used). This results in spherical harmonic coefficient estimates that are of similar quality to those obtained without outliers in the input data.

Key words: Outlier detection – Gravity field and steady-state Ocean Circulation Explorer (GOCE) satellite mission – Satellite gravity gradiometry – Quick-Look Gravity Field Analysis

1 Introduction

The satellite missions CHAMP [CHAllenging Minisatellite Payload; Reigber et al. (1999)], GRACE [Gravity Recovery and Climate Experiment; Tapley (1997)] and GOCE [Gravity field and steady-state Ocean Circulation Explorer; European Space Agency (ESA) (1999)] mark the beginning of a new phase in gravity field determination and modeling. While two of the three satellite gravity missions have already been launched, the satellite mission GOCE will be launched in 2006. Measurement techniques such as satellite gravity gradiometry (SGG) and satellite-to-satellite tracking (SST) will be used. CHAMP already provides precise global long-wavelength features of the Earth's static gravity field. Furthermore, global estimates of the Earth's magnetic field are derived. While GRACE (Tapley 1997) maps its variability in time in an efficient way, GOCE aims at contributing to the knowledge of the static gravity field with an unprecedented accuracy and resolution. Unique models will be derived down to spatial scales of about 100–150 km. In terms of geoidal heights, the GOCE mission is expected to provide an accuracy of about 1–2 cm for the part of the spectrum that can be resolved (ESA 1999). Hence, the missions will provide state-of-the-art information for the lower and medium frequencies since their quality will be unmatched. Undoubtedly, many global and regional applications will greatly benefit from these missions.

To meet the performance goals, sophisticated pre-processing tools have to be developed. In the framework of the GOCE mission, time-variable effects are removed from the SST and SGG data to obtain optimal stationary gravity models. In addition, outliers in the time series should be identified and removed. Finally, the pre-processing includes a so-called internal and external calibration. Instrument imperfections have to be corrected for. Also, the conversion of basic standards and instrument read-outs has to be performed.

Several investigations have been carried out into the calibration and validation of satellite missions (see e.g. Arabelos and Tscherning 1998; Haagmans et al. 2002; Koop et al. 2002). In addition, time-variable effects on SST and SGG measurements have extensively been studied in Pail et al. (2000). Less attention has been given to the detection and removal of outliers (also called gross errors or blunders). Albertella et al. (2000) consider the problem of outliers in long-wavelength SGG measurements. A track-wise approach is compared to an area-wise approach. The track-wise approach, treating the measurements as a time series, employs a statistical test for rejecting outliers. The area-wise approach, in turn, is based on least-squares (LS) collocation: if the difference between the predicted and the observed value exceeds a certain threshold, the point will be considered a questionable measurement.

In this paper, several methods for the detection of outliers in time series will be introduced and compared. Special emphasis is given to the fact that the satellite missions produce large amounts of measurements with distinct signal and error characteristics. The quality of the outlier detection methods will be evaluated in a simulation study. Finally, the effect of outliers on the estimates of spherical harmonic coefficients will be assessed. The Quick-Look Gravity Field Analysis (QL-GFA) tool will be used for this purpose (see e.g. Pail and Plank 2004).

The paper is organized as follows. After a brief review of the problem in Sect. 2, several algorithms will be described for the detection and removal of outliers. In Sect. 3, quality (identification) measures will be introduced and discussed. The numerical test setup is described and representative results using simulated data are presented in Sect. 4. Outlier detection rates and their effect on the final gravity field solution are evaluated in Sect. 5. Finally, concluding remarks and an outlook for future studies are provided in Sect. 6.

2 Methods for outlier detection

Outliers are defined as observations that appear to be inconsistent with the remainder of the data set (Barnett and Lewis 1994, p. 7). Large values in a data set are not necessarily bad or erroneous values. If these measurements are extreme values relative to their neighboring measurements, however, they might be inconsistent with the rest of the data set. Many outliers can be detected by visual inspection of the data set. For large measurement campaigns or time series, however, this is a very cumbersome procedure.

Outliers may arise for many reasons; they may be the result of an instrument malfunction, a misreading, or a calculation error. The offending sample should be removed and, if necessary, replaced by corrected values. Typically, the circumstances are less clear-cut and an outlier is of inexplicable nature. In this case, the outlier identification is ambiguous and the removal and correction of the measurement in question is a demanding task.

Outliers may be categorized by the way they occur in a data set. Apparently isolated outliers or data spikes, for instance, are often superposed on the signal. This type of outlier will be denoted as an additive outlier (Barnett and Lewis 1994, p. 396). An innovative outlier, in turn, is a more 'inherent' contamination form. The innovative outlier produces a carry-over effect to neighboring values (through auto-correlation) and produces a typical pattern. Often, innovative outliers occur at places where the signal itself has an extreme value and are difficult to identify. Lastly, a bulk or block of outliers is typically due to a temporary malfunctioning of the measurement unit. Examples of additive, innovative, and bulk outliers are shown in Fig. 1.

There are many ways to deal with outliers. First, measurements that appear to be inconsistent with the rest of the data can simply be rejected from the data set, with the risk of loss of genuine information. Second, the measurements in question may be left unaltered in the data set with the risk of actual contamination. Third, the outlier can be incorporated into the data set and the (underlying) model is then changed. Finally, we can try to minimize the effect of outliers by employing outlier-robust methods (see e.g. Rousseeuw and Leroy 1987), filter or smoothing algorithms. A decision in favour of one of the above strategies typically depends on the application and the data that are used. Hence, the treatment, detection, and analysis of outliers is a relative problem: relative to the signal properties and the models that are to be applied.

We can distinguish several strategies to identify outliers and eliminate or accommodate their influence; methods that

1. set an upper bound/lower bound: a threshold is set that the measurements (or residuals) should not exceed during normal operation;

2. set a maximum allowable rate of change from one value to the next: this parameter allows spikes in the data to be removed, even though they lie between the upper and the lower bound;

3. set a scanning horizon: if the measurements change more than the maximum allowable rate but remain at this new steady state for longer than a scanning horizon, then the values should not be considered as outliers;

4. accommodate the outliers: nonparametric regression, robust methods, smoothers, kernel regression estimation, splines, wavelet and filter techniques can be used to obtain a smooth estimate of the underlying function.

In principle, it is advantageous to decrease the signal-to-noise ratio (see e.g. Albertella et al. 2000). Outliers then have a distinct visual appearance in the (residual) data series. For SGG measurements, a well-known reference signal (e.g. a geopotential model, the C20 term only, or upward continued ground gravity information) may be removed from the measurements. It can further be reduced by eliminating the effect of the topography and the atmosphere.


2.1 Statistical methods

In the following, several statistical methods are presented that can be used for the automatic detection of outliers. All of them set an upper/lower bound the measurements should not exceed during normal operation. These methods are global methods that would search for outliers in the entire data set. Due to variable signal characteristics, however, only extreme outliers can then be detected. To identify smaller outliers, local or localizing methods may be an alternative. Hence, the global methods are reformulated for smaller data windows that implicitly realize the concept of scanning horizons.

It should be noted at this point that the presentation merely focuses on a limited number and type of statistical methods. Baarda's data snooping (Baarda 1968) or Pope's tau method (see e.g. Heck 1981), for instance, are not discussed. Although they enjoy great acceptance for smaller adjustments and measurement series, they are computationally expensive for large-scale problems. Hence, methods are selected that may be used in large problems and do not require extensive pre-processing time.

2.1.1 Thresholding

A standard approach to cope with outliers is to set a maximum threshold (see e.g. Barnett and Lewis 1994, p. 273). Typically, a threshold is used that corresponds to the characteristics of the data, i.e., about some mean or median of the data. Given a data set $x_i = \{x_1, \ldots, x_n\}$, this procedure may be formalized as

$\mathrm{out}(\mu, \sigma, k) := \{\, i = 1, \ldots, n : |x_i - \mu| > k\sigma,\ k = 1, 2, 3 \,\}$   (1)

where $\mathrm{out}(\mu, \sigma, k)$ is referred to as the outlier region. $\mu$ and $\sigma$ are the (unknown) mean and standard deviation of the data, respectively. Unfortunately, only estimates for the mean $\bar{x}$ and standard deviation $s$ can be obtained from the measurements due to the contamination with errors. Equation (1) can be formulated for the median or other robust measures in a similar way. Confining the computations to smaller data sections, the outlier region can be defined for a window of size $m$ as

$\mathrm{out}(\bar{x}_m, s_m, k, m) := \{\, i = 1, \ldots, m : |x_i - \bar{x}_m| > k s_m,\ k = 1, 2, 3 \,\}$   (2)

where $x_i = \{x_1, \ldots, x_m\}$ are observations of the data vector $x \in \mathbb{R}^n$. After computation of the first window, the data window is shifted by one observation, i.e. $x_i = \{x_2, \ldots, x_{m+1}\}$, and so on. $\bar{x}_m$ and $s_m$ are estimates of the mean and the standard deviation of the $m$ data points in the window. The parameter $k$ determines the range of the threshold region. A data point $x_i$ is 'identified' as an outlier if it belongs to the outlier region, $x_i \in \mathrm{out}(\bar{x}_m, s_m, k, m)$. It is eliminated and/or flagged.

Computationally, the method is a fast algorithm that can easily be applied to a large number of observations. Depending on the window size $m$ and the parameter $k$, however, many actual observations are wrongly detected as outliers. In addition, smaller outliers, innovative outliers, and bulk outliers are hardly detectable.
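The moving-window rule of Eq. (2) is simple enough to sketch directly. The following Python fragment is an illustrative implementation only (the function name, the NumPy dependency and the window handling are assumptions, not part of the original processing chain); it flags every sample that deviates from its window mean by more than k window standard deviations.

```python
import numpy as np

def window_threshold_outliers(x, m=100, k=2):
    """Flag x[i] as an outlier if |x[i] - mean| > k * std within an m-point window (Eq. 2)."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    for start in range(x.size - m + 1):              # window shifted by one observation
        window = x[start:start + m]
        xbar, s = window.mean(), window.std(ddof=1)  # estimates of the window mean and std dev
        flags[start:start + m] |= np.abs(window - xbar) > k * s
    return flags

# Illustrative usage: inject one additive outlier into synthetic noise and recover it.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.24, 10_000)
x[5_000] += 1.0
print(np.flatnonzero(window_threshold_outliers(x)))
```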

2.1.2 Mahalanobis distance

The well-known Mahalanobis distance (see e.g. Rousseeuw and Leroy 1987, p. 224) is given as

Fig. 1. Outlier types: additive, innovative, and bulk or block outliers in black on top of the signal in gray


$\mathrm{out}(\alpha, \mu, \Sigma) := \{\, (x - \mu)^T \Sigma^{-1} (x - \mu) > \chi^2_{1-\alpha}(n-1) \,\}$   (3)

where $\Sigma$ is the (positive definite) covariance matrix. A point $x_i \in \mathrm{out}(\alpha, \mu, \Sigma)$ is identified as an $\alpha$-outlier with respect to the normal distribution $N(\mu, \Sigma)$. Often, $\Sigma$ is a diagonal matrix with $\Sigma = \mathrm{diag}(\sigma_i^2)$. The test quantity in Eq. (3) is only chi-squared distributed if the standard deviation is known.

Similar to setting a threshold, the Mahalanobis distance can efficiently be applied to a large number of observations. Also, the calculation for a moving window is straightforward. Note that a distance or range measure cannot distinguish between an upper outlier, a lower outlier, or both.
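A minimal sketch of how Eq. (3) could be applied per observation in a moving window, assuming a diagonal covariance matrix so that the quadratic form collapses to a squared standardized residual with one degree of freedom; the function name, the use of the window median as a robust centre and the SciPy dependency are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(x, sigma, alpha=0.01, m=100):
    """Per-observation Mahalanobis test (diagonal covariance), windowed version of Eq. (3)."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    crit = chi2.ppf(1.0 - alpha, df=1)          # one squared standardized residual per point
    for start in range(x.size - m + 1):
        window = x[start:start + m]
        centre = np.median(window)              # robust centre of the window (assumption)
        d2 = ((window - centre) / sigma) ** 2   # squared Mahalanobis distance, diagonal case
        flags[start:start + m] |= d2 > crit
    return flags
```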

2.1.3 Grubbs’s test

A somewhat different kind of thresholding is given by the Grubbs test. This is a hypothesis test and can be applied if the data vector is normally distributed according to $x \sim N(\mu, \Sigma)$. Grubbs (1950) proposes to test for a lower outlier as

$\mathrm{out}(\alpha, \bar{x}_m, s_m, m) := \left\{\, i = 1, \ldots, m : \dfrac{\bar{x}_m - \min(x_i)}{s_m} > t_{1-\alpha}(m-1) \,\right\}$   (4)

where $t_{1-\alpha}(m-1)$ is Student's t-distribution, which depends on the degree of freedom and on the significance level $\alpha \in (0, 1)$. Equation (4) is written for the window $m$. The null hypothesis is that the data do not contain outliers. The data have at least one (lower) outlier according to the alternative hypothesis. Upper outliers are tested in an analogous way. A small modification is introduced by Halperin et al. (1955), where the maximum deviation is used in the numerator

$\mathrm{out}(\alpha, \bar{x}_m, s_m, m) := \left\{\, i = 1, \ldots, m : \dfrac{\max(|\bar{x}_m - x_i|)}{s_m} > t_{1-\alpha}(m-1) \,\right\}$   (5)

The method enjoys computational ease and can easily be applied to large data sets.
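A sketch of the windowed Grubbs/Halperin test of Eq. (5), flagging at most the single most deviating observation per window; the function name and the use of the SciPy t quantile as critical value follow the notation of the text but are illustrative assumptions.

```python
import numpy as np
from scipy.stats import t

def grubbs_window_outliers(x, m=100, alpha=0.01):
    """Flag the most extreme value of each m-point window when the statistic of Eq. (5)
    exceeds the t quantile t_{1-alpha}(m-1)."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    crit = t.ppf(1.0 - alpha, df=m - 1)
    for start in range(x.size - m + 1):
        window = x[start:start + m]
        dev = np.abs(window - window.mean())
        if dev.max() / window.std(ddof=1) > crit:   # alternative hypothesis: >= 1 outlier
            flags[start + dev.argmax()] = True
    return flags
```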

2.1.4 Dixon test

The Dixon test is a hypothesis test that uses a ratio of differences between a possible outlier and its nearest or next-nearest neighbor (data excess) to the range. The true standard deviation $\sigma$ does not have to be known. An $\alpha$-outlier region for upper outliers is defined as

$\mathrm{out}(\alpha, m) := \{\, |r_j| > t_{1-\alpha}(m-1),\ j = 1, 2 \,\}$   (6)

where the test functions $r_j$ are given as (Barnett and Lewis 1994, p. 228; Neuilly 1999)

$r_1 = \dfrac{x_m - x_{m-1}}{x_m - x_2}$   (Dixon's $r_{11}$)   or   $r_2 = \dfrac{x_m - x_{m-1}}{x_m - x_3}$   (Dixon's $r_{12}$)   (7)

In Eq. (7), $x_i = \{x_1, \ldots, x_m\}$ is sorted in ascending order. If one of the test functions $r_j$ exceeds the critical value $t_{1-\alpha}(m-1)$, the largest observation $x_m$ is an outlier. The test statistic $r_1$ does not contain the smallest value $x_1$ in order to avoid masking effects (large denominator). Similarly, the test statistic $r_2$ can be used to avoid masking effects from the two smallest values ($x_1$ and $x_2$) in the data window. Critical values are tabularized in Barnett and Lewis (1994, p. 498).

The Dixon test can also identify a pair of upper outliers ($x_{m-1}$, $x_m$) in the data set (Dixon 1951). The outlier region is then defined as

$\mathrm{out}(\alpha, m) := \{\, |r| > t_{1-\alpha}(m-1) \,\}$   (8)

where the test statistics are

$r = \dfrac{x_m - x_{m-2}}{x_m - x_1}$   (Dixon's $r_{20}$ test)   or   $r = \dfrac{x_m - x_{m-2}}{x_m - x_2}$   (Dixon's $r_{21}$ test)   (9)

The $r_{20}$ statistic may also be used for the test of a single upper outlier $x_m$, which avoids the risk of masking by $x_{m-1}$. The $r_{21}$ statistic does not include $x_1$. The Dixon test requires a sorting algorithm; a simple bubble sort algorithm is used in the computations.
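The single-window upper-outlier test of Eqs. (6)–(7) is small enough to sketch. In this illustrative fragment the critical value is supplied by the caller (e.g. taken from the tables in Barnett and Lewis 1994); the function name and the use of NumPy's sort instead of the bubble sort mentioned above are assumptions.

```python
import numpy as np

def dixon_upper_outlier(window, crit):
    """Test the largest value of a window with Dixon's r11 and r12 ratios (Eq. 7).
    `crit` is the tabulated critical value for the window size and significance level."""
    xs = np.sort(np.asarray(window, dtype=float))   # ascending order; xs[-1] is the candidate
    r11 = (xs[-1] - xs[-2]) / (xs[-1] - xs[1])      # denominator excludes the smallest value
    r12 = (xs[-1] - xs[-2]) / (xs[-1] - xs[2])      # denominator excludes the two smallest values
    return bool(r11 > crit or r12 > crit)

# Illustrative usage: the critical value below is a placeholder, not a tabulated entry.
print(dixon_upper_outlier([0.1, 0.2, 0.15, 0.18, 0.12, 0.9], crit=0.56))
```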

2.2 Wavelet outlier detection algorithm

Outlier detection by wavelets is a recent research area [see e.g. Bruce et al. (1994), Wang (1995), or Dorst (1999) for a geodetic example]. Since wavelets can have good local properties, outlier detection by wavelets may be an alternative to the previously discussed methods. Also, the data do not have to belong to a known distribution. A discontinuity in the signal often coincides with an unrepresentative observation or outlier. It can be shown that discontinuities in the signal and in the first and second derivatives can be detected by Haar wavelets (Daubechies 1), and Daubechies 4 and 6 wavelets, because they represent a straight line and a parabola exactly (Strang and Nguyen 1996). In the following, an outlier detection algorithm is developed based on the Haar wavelet. For a general introduction into wavelets and wavelet packages, reference is made to Daubechies (1992) or Mallat (1999).

The continuous wavelet transform (CWT) of a function $f \in L^2(\mathbb{R})$ is defined as (Daubechies 1992)

$\mathrm{CWT}_f(a, b) = \langle f, \psi_{a,b} \rangle = \dfrac{1}{\sqrt{a}} \displaystyle\int f(x)\, \psi\!\left(\dfrac{x - b}{a}\right) \mathrm{d}x$   (10)

where $\psi$ is a family of scaled and translated functions ('mother wavelet') that is given as

$\psi_{a,b}(x) = |a|^{-1/2}\, \psi\!\left(\dfrac{x - b}{a}\right), \quad a > 0,\ a \neq 0,\ b \in \mathbb{R}$   (11)

$\psi$ is a fixed function in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ with $\int \psi(x)\, \mathrm{d}x = 0$. The factor $|a|^{-1/2}$ ensures that the functions $\psi_{a,b}$ have a constant norm in the space $L^2(\mathbb{R})$ of square-integrable functions. $a$ and $b$ are the scale and the translation parameter of the wavelet $\psi$. The well-known Haar wavelet will be used in the following (Daubechies 1992):

$h(z) = \begin{cases} 1 & 0 < z \leq \tfrac{1}{2} \\ -1 & \tfrac{1}{2} < z \leq 1 \\ 0 & z \leq 0 \ \text{or} \ 1 < z \end{cases}$   (12)

The Haar wavelet constitutes an orthonormal basis for $L^2(\mathbb{R})$

$h_{j,k}(z) = 2^{j/2}\, h(2^j z - k)$   (13)

since

$\langle h_{j,k}, h_{j',k'} \rangle = \displaystyle\int h_{j,k}(z)\, h_{j',k'}(z)\, \mathrm{d}z = \begin{cases} 1 & \text{if } j = j' \text{ and } k = k' \\ 0 & \text{otherwise} \end{cases}$   (14)

In Eq. (14), $\langle \cdot, \cdot \rangle$ denotes the $L^2$-inner product and $j, k, j', k' \in \mathbb{Z}$. The Haar function is discontinuous and does not have good time-frequency localization as its Fourier transform decays like $1/z$ for $z \to \infty$. However, it may act as a basis for any square-integrable function $f$ such that (Mallat 1999)

$f(z) = \displaystyle\sum_{j,k} \langle f, h_{j,k} \rangle\, h_{j,k}(z)$   (15)

The coefficients $d_{j,k} = \langle f, h_{j,k} \rangle$ are called the Haar (detailed) wavelet coefficients. It can be shown that these coefficients can be computed as (Daubechies 1992)

$d_{j+1,k} = \dfrac{1}{\sqrt{2}} \left( s_{j,2k} - s_{j,2k+1} \right)$   (16)

where $s_{j,k}$ are the smoothed wavelet coefficients of the $j$-th level. The smoothed Haar wavelets, in turn, can recursively be computed as

$s_{j+1,k} = \dfrac{1}{\sqrt{2}} \left( s_{j,2k} + s_{j,2k+1} \right)$   (17)

where $s_{j,k} = f(k)$ for a fixed $j$. When applied to the SGG time series $x$ (zeroth level), the Haar wavelet can be used to detect outliers. Using a single-level (discrete) wavelet transformation, the procedure may be written as follows.

1. Compute detailed and smoothed wavelet coefficients using the forward wavelet transformation ($k = 1, \ldots, n/2 - 1$)

   $s_{1,k} = \dfrac{1}{\sqrt{2}} (x_{2k} + x_{2k+1}), \qquad d_{1,k} = \dfrac{1}{\sqrt{2}} (x_{2k} - x_{2k+1})$   (18)

2. Threshold the detailed coefficients by setting a maximum threshold $t_d$:

   $d'_{1,k} = \begin{cases} d_{1,k} & \text{for } |d_{1,k}| < t_d \\ 0 & \text{else} \end{cases}$   (19)

   The threshold $t_d$ is selected with respect to the expected signal noise level and the minimum magnitude of the outliers (e.g. $t_d = 0.07$ E). In a real data environment, the threshold can be determined by adding small outliers to the data. The threshold is varied until the added outliers can be identified. An initial guess for the threshold is the standard deviation of the details $d_{1,k}$.

3. Reconstruct the signal using the signal and detailed coefficients (inverse wavelet transformation)

   $x^w_{2k} = \dfrac{1}{\sqrt{2}} \left( s_{1,k} + d'_{1,k} \right), \qquad x^w_{2k+1} = \dfrac{1}{\sqrt{2}} \left( s_{1,k} - d'_{1,k} \right)$   (20)

4. Compute a residual signal using the reconstructed signal $x^w$

   $r_i = x_i - x^w_i, \qquad i = 1, \ldots, n$   (21)

5. Apply a pattern recognition program on the residuals to identify the position of the outlier.

Steps 1–3 are frequently denoted as wavelet shrinkage (Donoho et al. 1995). Since the Haar wavelet is used, outliers appear as spikes in the residual signal. They have a particular pattern that can be searched for. Of course, the algorithm can easily be extended to higher decomposition levels [using Eqs. (16) and (17)] to fully exploit the power of wavelets. Typically, level-dependent thresholds are then employed, such as the soft and hard thresholds, the universal threshold etc. (Donoho et al. 1995). Applying the algorithm iteratively (applying different thresholds) allows for efficient detection of smaller outliers, innovative outliers, and a bulk of outliers.
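Steps 1–4 of the single-level procedure translate almost line by line into code. The sketch below is illustrative only (the function name and the NumPy usage are assumptions); it returns the residual series of Eq. (21), in which outliers appear as spikes, and leaves the pattern search of step 5 to the caller.

```python
import numpy as np

def haar_outlier_residuals(x, t_d):
    """Single-level Haar shrinkage (Eqs. 18-21): threshold the detail coefficients,
    reconstruct the signal, and return residuals in which outliers show up as spikes."""
    x = np.asarray(x, dtype=float)
    n = x.size - (x.size % 2)                        # use an even number of samples
    pairs = x[:n].reshape(-1, 2)
    s1 = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)  # smoothed coefficients, Eq. (18)
    d1 = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)  # detail coefficients, Eq. (18)
    d1_thr = np.where(np.abs(d1) < t_d, d1, 0.0)     # hard thresholding, Eq. (19)
    xw = np.empty(n)
    xw[0::2] = (s1 + d1_thr) / np.sqrt(2.0)          # inverse transformation, Eq. (20)
    xw[1::2] = (s1 - d1_thr) / np.sqrt(2.0)
    return x[:n] - xw                                # residual signal, Eq. (21)

# Large residuals mark candidate outlier positions (step 5 would search their pattern).
```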

3 Quality measures

Quality measures help to analyze and evaluate data and methods. In this context, several outlier-relevant quality measures can be introduced. The outlier rate (OR) is given as

$\mathrm{OR} := \dfrac{n_o}{n}$   (22)

It provides information about the number of outliers $n_o$ with respect to the number of values $n$. The relative size of outliers is of interest in the outlier-to-signal ratio (OSR) and the outlier-to-noise ratio (ONR). The OSR is defined as

$\mathrm{OSR}_i := \dfrac{|x^o_i|}{|x_i|}, \qquad i = 1, \ldots, n$   (23)

where $x^o_i$ is the outlier-contaminated signal. If the ratio is small, outliers are hardly detectable since the contaminated observations are not unrepresentative. The OSR can only be determined in simulations. Another important ratio is the ONR, which describes the relative size of outliers with respect to the data noise, i.e.

$\mathrm{ONR}_i := \dfrac{|x^o_i|}{|x^e_i|}, \qquad i = 1, \ldots, n$   (24)

where $x^e_i = x_i + e_i$ and $e$ is the noise. The prospect of identifying small outliers is small in very noisy data.

In order to evaluate the performance of outlier methods, two ratios are considered. The outlier rate of success (ORS) describes the number of correctly identified outliers $n_c$ with respect to the number of all outliers $n_o$. The outlier rate of failure (ORF), in turn, provides information about incorrectly detected outliers $n_i$ with respect to all data points. The ORF is thus a measure for the number of genuine observations that are discarded as a result of the outlier detection. Note that the ORF is often referred to as type I error in the statistical literature. Mathematically, the ratios are given as

$\mathrm{ORS} = \dfrac{n_c}{n_o} \qquad \text{and} \qquad \mathrm{ORF} = \dfrac{n_i}{n}$   (25)

Both the ORS and the ORF can only be computed in simulation studies.
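In a simulation, where the injected outlier positions are known, the measures of Eqs. (22) and (25) reduce to simple counts over boolean masks. A minimal sketch (the function name and the mask-based interface are assumptions):

```python
import numpy as np

def outlier_quality_measures(true_mask, detected_mask):
    """OR, ORS and ORF from Eqs. (22) and (25); both inputs are boolean masks over the data."""
    true_mask = np.asarray(true_mask, dtype=bool)
    detected_mask = np.asarray(detected_mask, dtype=bool)
    n = true_mask.size
    n_o = true_mask.sum()                      # number of simulated outliers
    n_c = (true_mask & detected_mask).sum()    # correctly identified outliers
    n_i = (~true_mask & detected_mask).sum()   # genuine observations wrongly discarded
    return {"OR": n_o / n, "ORS": n_c / n_o, "ORF": n_i / n}
```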

4 Performance of the outlier detection methods

The main intention of the following simulations is to evaluate the performance of the outlier detection methods. After a thorough description of the signal, error, and outlier characteristics, quality measures for each method are summarized. Some selected results are shown as figures.

4.1 Description of the input data

The simulations are based on the following measurement time series.

1. The main diagonal component Vzz of the SGG measurement tensor is used. (Note that the common-mode or differential-mode accelerations could also be used for outlier detection.) It is based on the geopotential model OSU91A (Rapp et al. 1991) complete up to degree and order 250. The signal was simulated along a 59-day/950-revolution GOCE orbit, which was computed by numerical integration. The data have a sampling interval of 1 s, which is commensurate with a data volume of 5 097 835 observations.

2. Colored noise characteristics according to the specifications by the gradiometer manufacturer Alenia (Cesare 2002) were generated by an auto-regressive moving average (ARMA) process (Schuh 1996). These represent a combination of all expected gradiometer errors. The total error spectral density $S(f)$ within the measurement bandwidth of the instrument (5–100 mHz) does not exceed $4\ \mathrm{mE}/\sqrt{\mathrm{Hz}}$ (1 mE = $1 \cdot 10^{-3}$ E and 1 E = $10^{-9}\ \mathrm{s}^{-2}$). Outside the measurement bandwidth, the upper limit of the total error spectrum is specified by $S(f) \leq 4 \cdot f_{\min}/f\ \mathrm{mE}/\sqrt{\mathrm{Hz}}$ for frequencies lower than $f_{\min} = 5$ mHz, and by $S(f) \leq 4 \cdot (f_{\min}/f)^2\ \mathrm{mE}/\sqrt{\mathrm{Hz}}$ for frequencies larger than $f_{\max} = 100$ mHz.

At present, there are no estimates of the amount, size and characteristics of outliers in real SGG measurements. Therefore, a pessimistic scenario was generated that includes various types of outliers. Systematic errors are not considered in the study. Approximately OR ≈ 5% of the data set were infected by outliers (255 812 data points). These are composed of the following:

1. 255 002 randomly distributed outliers with randomly varying absolute values of 0.07–1.0 E. For representation purposes, only positive outliers are generated;

2. 810 outliers clustered in five groups with different characteristics:

   2.1. One group of randomly scaled outliers (0.07–1.0 E) with a length of 600 points, which is commensurate with a measurement duration of 10 min;

   2.2. Two groups of randomly scaled outliers (one in the range of 0.07–1.0 E, the other one in the range of 0.5–1.0 E) with a length of 60 points, which is equivalent to a measurement interval of 1 min;

   2.3. One group of randomly scaled outliers (0.07–1.0 E) with a length of 30 points, corresponding to a measurement duration of 0.5 min;

   2.4. One group with a length of 60 points consisting of outliers with a constant absolute value of 0.5 E.

Table 1 summarizes the statistical information of the noise-free signal, the noisy gradiometer signal and the signal with outliers. Note that the signal does not contain the C20 term. Also, all outliers are positive outliers for presentation purposes. Clearly, the outliers affect the standard deviation of the signal. The quality measures are computed as ONR = 3.4 and OSR = 0.003, underscoring the large size of the outliers with respect to the noise.

For a visual comparison of the methods, two data sections were considered in particular. The first section, A1, consists of 2000 data points and comprises randomly distributed outliers only (data points 14 001–16 000). Data section A2 includes 3000 data points that are affected by randomly distributed outliers and a bulk of outliers with a length of 1 min and a range of 0.07–1.0 E (data points 1 999 001–2 002 000).

Table 1. Statistical parameters of the gradiometer signal (5 097 835 data points)

                              Min [E]    Max [E]    Mean [E]   Standard deviation [E]
Signal                        -1.562     1.846      0.006      0.240
Noisy signal                  -1.567     1.849      0.006      0.240
Noisy signal with outliers    -1.567     2.774      0.033      0.471


4.2 Statistical methods

The statistical methods described in Sect. 2.1 were implemented and tested. For all simulations, the same window size of m = 100 was used. It was determined in a trial and error fashion. Smaller windows (e.g. m < 50) often led to false detections while larger windows (e.g. m > 500) reduced the success of detection. As mentioned before, only the C20 term was removed from the data.

Observations that were detected as outliers were replaced by appropriate fill-in values. The fill-in values are not used in the gravity field inversion (but they are used inside the filter procedure) and can therefore be based on a simple linear interpolation. Of course, this may adversely affect the outlier rate of success if the neighboring values are erroneous or outliers themselves. An alternative is to flag the doubtful observation and to omit the observation in the next window. Generally, the statistical methods excel in the form of simple program codes as well as short computation times.

Setting a simple threshold was the first method that was implemented. The mean $\bar{x}_m$ was used as a measure for central tendency and $2 s_m$ was selected as the corresponding distance or range. The results of the method are shown in Figs. 2 and 3. The original signal is drawn in gray, the dash-dotted line indicates the $2 s_m$ threshold, and the black line is the signal after threshold cleaning. Applying a threshold to the data results in an outlier detection rate of ORS = 67% (see first row in Table 2). This is commensurate with 171 394 outliers out of the total of 255 812 outliers. The ORF = 0.1% is small (2558 observations are incorrectly declared as outliers out of a total of 5 097 835 measurements) but too many outliers (84 418) remain undetected in the signal. Closer inspection of Figs. 2 and 3 leads to the same conclusion, where many smaller additive and innovative outliers are visible. In reality, a large number of undetected outliers can lead to the suspicion that the underlying model is incorrect. The bulk of outliers, shown in Fig. 3 at around data point 2 000 000, cannot adequately be treated by the method.

The Mahalanobis distance, treated in Sect. 2.1.2, was used in an iterative way. In the first iteration, the Mahalanobis distance was computed and unrepresentative observations were eliminated. The offending observations were replaced by linearly interpolated fill-in values. In the second iteration, the Mahalanobis distance was computed again using the measurements and the fill-in values. Outliers were detected, eliminated and replaced again, and so forth. At least two iterations were needed to obtain a 'useful' detection result. Three iterations led to an ORS of more than 92% (235 347 outliers found). However, the ORF was significantly increased to 10% (25 582 true observations were discarded). More than three iterations led to dissuasive results, i.e. many observations were falsely identified as outliers. In summary: as in the case of simple thresholding, the Mahalanobis distance detects most of the large additive outliers. The Mahalanobis distance also identifies smaller (additive) outliers, but has difficulties in detecting bulk outliers.
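The iterative detect–eliminate–replace loop described above can be sketched independently of the particular detector. In this illustrative fragment (the function name, the generic detect callback and the use of np.interp for the linear fill-in are assumptions), flagged samples are replaced by values interpolated linearly between the neighbouring retained observations before the detector is re-run on the partially cleaned series.

```python
import numpy as np

def iterative_clean(x, detect, n_iter=3):
    """Run an outlier detector repeatedly; replace flagged samples by linear interpolation
    (fill-in values) before the next pass, as described for the Mahalanobis distance."""
    x = np.asarray(x, dtype=float).copy()
    flagged = np.zeros(x.size, dtype=bool)
    idx = np.arange(x.size)
    for _ in range(n_iter):
        new = detect(x) & ~flagged            # outliers newly found in this iteration
        if not new.any():
            break
        flagged |= new
        good = ~flagged
        x[flagged] = np.interp(idx[flagged], idx[good], x[good])
    return x, flagged
```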

The Grubbs test was applied for upper and lower outliers. The test statistic $t_{1-\alpha}(m-1)$ was determined with respect to the number of observations in a moving window (m = 100). A significance level of $\alpha$ = 1% was selected. The Grubbs test performs along the lines of the Mahalanobis distance and has similar computational demands. The ORS is at a high level of about 87% (see Table 2).

Table 2. Results of the outlier detection methods in the pre-processing step

Name                   ORS (%)    ORF (%)
Mean ± 2s_m            67.7       0.1
Mahalanobis            85.6       10.0
Grubbs                 86.9       47.3
Dixon                  68.7       0.0
Wavelet algorithm      99.1       2.4
Combined algorithm     99.6       2.5

Fig. 2. Data section A1: The gray line shows the original measurement signal, the black line the signal after outlier detection with a threshold, and the dash-dotted line is the 2s_m threshold (data points 14 000–16 000, values in E)

Fig. 3. Data section A2: The gray line shows the original measurement signal, the black line the signal after outlier detection with a threshold, and the dash-dotted line is the 2s_m threshold (data points 1 999 000–2 002 000, values in E)


The main objection against this method is the high outlier rate of failure (ORF = 47%). The test eliminates too many actual observations, which would probably decrease the quality of level 2 products.

The Dixon test has by far the shortest computation time of all implemented statistical methods. In addition, it has the ability of dual-outlier detection in each single moving window. The maximum and the minimum value of each data window is tested. The test statistics can be looked up in statistical tables (Barnett and Lewis 1994). Although the ORS (= 68.7%) is only at the level of the threshold method, the method does not suffer from a high ORF. It should be noted that the Dixon test acts in a more robust manner towards bulk outliers than do the other methods.

4.3 Application of the wavelet outlier detection algorithm

The main idea of the wavelet algorithm is to decrease the signal-to-noise ratio and to obtain a characteristic pattern whenever an outlier occurs. A simple pattern recognition algorithm can then be used to detect the position of the outliers. The outlier detection rate can further be increased by an iterative procedure. Observations that are affected by an outlier are eliminated and replaced by interpolated values. As already mentioned in the context of the Mahalanobis distance, a simple linear interpolation was used that involved neighboring points. Of course, more sophisticated procedures could be applied.

The wavelet algorithm detects most of the additive and innovative outliers (see Table 2). The outlier rate of success is larger than 99% (253 509 outliers detected). Only 2.4% of the observations are falsely detected. A closer inspection of the remaining outliers leads to the conclusion that many of the undetected outliers are within a bulk of outliers. This is due to the fact that an individual outlier causes a characteristic pattern (at least four points wide) after application of the wavelet algorithm. For outliers occurring in bulk form, however, the characteristic pattern is changed and the pattern recognition routine fails. The remaining outliers can only be detected by visual detection and/or a successively applied iterative procedure.

4.4 Combined method

The statistical outlier methods and the wavelet algorithm are based on different assumptions. Additionally, they act on the data set in somewhat complementary ways. While the wavelet algorithm can identify outliers (almost) on the noise level, the statistical methods are appropriate tools for the detection of larger outliers and (at least some of the) bulk outliers. Therefore, it might be advantageous to combine them into a method where the wavelet algorithm is used first and a statistical method is applied to the residual data in a second step.

Different combination solutions have been extensively tested. For the simulated data set under investigation, a combination of the wavelet algorithm and the Dixon method provides the best results. The results of this combination are summarized in Table 2. With respect to the wavelet-only method, the ORS is now slightly increased to about 99.6%. The ORF is at a level of 2.5%. The combined method represents the most powerful algorithm for outliers in SGG signals. Figures 4 and 5 visually underline the result in the two data sections A1 and A2. The noisy signal with outliers, shown in gray, is almost completely cleaned. Outliers are detected and replaced with interpolated values. In Fig. 5, some of the bulk outliers at position 2 000 000 cannot be detected and remain in the data.
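A rough sketch of such a two-step combination, reusing the illustrative helpers shown after Sects. 2.1.4 and 2.2 (dixon_upper_outlier and haar_outlier_residuals); the simple residual threshold that stands in for the pattern recognition step, the window handling and the externally supplied Dixon critical value are all assumptions, not the authors' implementation.

```python
import numpy as np

def combined_detection(x, t_d, dixon_crit, m=100):
    """Two-step detection: wavelet shrinkage residuals first (Sect. 2.2), then a windowed
    Dixon test on those residuals (Sect. 2.1.4). Relies on the helper sketches above."""
    res = haar_outlier_residuals(x, t_d)                 # wavelet step
    flags = np.abs(res) > t_d                            # crude stand-in for the pattern search
    for start in range(res.size - m + 1):
        window = res[start:start + m]
        if dixon_upper_outlier(window, dixon_crit):      # statistical step on the residuals
            flags[start + int(np.argmax(window))] = True
    return flags
```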

5 Effect of outliers on the gravity field solution

In the previous sections, several outlier detection algorithms were compared and assessed on the basis of an SGG time series. A combined method, consisting of a wavelet outlier detection algorithm and a statistical method, shows the best performance, with detection rates better than 99%. The following investigations are concerned with the effect of undetected outliers on the final gravity field solution. For this purpose, the Quick-Look Gravity Field Analysis (QL-GFA) software was used. The QL-GFA tool is based on the semi-analytical approach (see Rummel et al. 1993; Sneeuw 2000; Pail and Plank 2002). In the processing architecture of the GOCE mission, QL-GFA is employed to analyze partial and/or incomplete sets of SGG and SST data. Furthermore, it allows for fast analyses of the GOCE system performance by detecting potential distortions of statistical significance (e.g. systematic errors) in parallel to the mission. In addition, it provides feedback to the GOCE mission control segment and estimates SGG and hl-SST noise characteristics.

In the first simulation, the QL-GFA software was applied to the data set described in Sect. 4.1. The spherical harmonic coefficients were computed without outliers in the data.

Fig. 4. Data section A1: The gray line shows the original measurement signal, the black line the signal after outlier detection with the combined method (data points 14 000–16 000, values in E)


The solid curve in Fig. 6 shows the final solution of the iterative QL-GFA algorithm. Degree Median values are then obtained as

$\mathrm{MEDIAN}_{l,i} = \operatorname{median}_m \left\{ \left| \bar{R}^{(\mathrm{est})}_{lm,i} - \bar{R}^{(\mathrm{OSU})}_{lm} \right| \right\}$   (26)

where $\bar{R}^{(\mathrm{est})}_{lm,i} = \{ \bar{C}^{(\mathrm{est})}_{lm}, \bar{S}^{(\mathrm{est})}_{lm} \}$ and $\bar{R}^{(\mathrm{OSU})}_{lm} = \{ \bar{C}^{(\mathrm{OSU})}_{lm}, \bar{S}^{(\mathrm{OSU})}_{lm} \}$ are the spherical harmonic coefficient estimates of the QL-GFA tool and the OSU91A gravity model, respectively. In Eq. (26), $i$ denotes the $i$th iteration step. The total CPU time for the solution complete up to degree $l_{\max} = 250$ (about 63 000 harmonic coefficients) was less than 30 minutes on a single PC. The normal equation system becomes numerically unstable when only Vzz is used (16 significant digits). This is mainly due to a sun-synchronous orbit configuration, which mainly affects the zonal and near-zonal coefficients (Sneeuw and van Gelderen 1997). A Kaula regularization (Kaula 1966) was applied to stabilize the system of equations.
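For one degree $l$, the Degree Median of Eq. (26) is simply the median of the absolute coefficient differences over all orders. A minimal sketch (the function name and the [degree, order] indexing of the coefficient arrays are assumptions):

```python
import numpy as np

def degree_median(C_est, S_est, C_ref, S_ref, l):
    """Degree Median of Eq. (26): median |difference| over all C_lm and S_lm of degree l.
    Coefficient arrays are assumed to be 2-D and indexed as [degree, order]."""
    dC = np.abs(C_est[l, :l + 1] - C_ref[l, :l + 1])
    dS = np.abs(S_est[l, 1:l + 1] - S_ref[l, 1:l + 1])   # S_{l0} is identically zero
    return np.median(np.concatenate([dC, dS]))
```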

In the second run, the outliers were included in the measurement time series. They were not flagged as outliers and, thus, fully contribute to the least-squares (LS) solution. The light-gray curve, denoted as (1) in Fig. 6, demonstrates that the parameter estimates are drastically affected by the outliers. Although the QL-GFA tool converges, the Median degree curve cuts the OSU91A curve at degree 140. Hence, there is a strong demand for outlier detection methods.

In the third run, the outliers were included in the data set. This time, however, they were flagged as outliers and consistently left out in the gravity field solution. The following strategy was used to apply the QL-GFA to the interrupted data set (Pail and Plank 2002; Preimesberger and Pail 2003). Theoretically, the semi-analytical method requires uninterrupted measuring time series to apply fast Fourier transform (FFT) techniques. In practice, however, an iterative technique is applied that fills data gaps by interpolated observations. In the following iterations, only the differences between the existing and the adjusted observations are used to improve the coefficient solution from the previous iteration. Interpolated values are assigned to the data gaps. The idea is that (in the case of convergence) the residuals become gradually smaller with successive iterations. The impact of Gibbs' phenomenon, which is due to the filling of the gaps, becomes less dramatic. By applying this strategy, only existing observations were adjusted, although an uninterrupted time series (filled by artificial values in the data gaps) was processed. The dotted curve (2) in Fig. 6 shows the degree error Median of this configuration. Compared with the solution without outliers (solid line), the results almost coincide.

The next test incorporated the results of the combined wavelet/Dixon method, i.e. those observations that were identified by the method were flagged as outliers. After 10 iterations in the QL-GFA tool, the final solution was obtained. It is shown in Fig. 7 by curve (1) and demonstrates that the 0.37% (= 1037 points) of undetected outliers considerably degrade the accuracy of the LS solution. A similar result would be obtained when using a different kind of gravity field solver, such as the direct methods PCGMA or DNA (see e.g. Pail and Plank 2002). As long as the gravity field inversion is based on the LS principle (L2), the solution will be affected by outliers. Thus, a high success rate of 99.63% may not be enough to obtain an optimal gravity field solution.

In addition to the pre-processing step, an outlier detection algorithm inside the QL-GFA tool was implemented. Since QL-GFA is an iterative solver, it can easily be applied to the residuals of the adjustment. The main idea is that the outliers are not searched for in the SGG signal but in a residual time series. Note that the residuals stem from the adjustment and are not formed by the reduction of an a priori gravity model such as EGM96. The signal-to-noise ratio is decreased even further and outliers have a very distinct appearance. A simple thresholding approach was applied (threshold = 8 × standard deviation of the residuals). Figure 8 shows the situation for data section A1, where some outliers remained undetected in the pre-processing step.
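The screening inside the solver reduces to a one-line test on the adjustment residuals; the sketch below is illustrative (the function name and the centring on the residual mean are assumptions), with the 8-sigma factor taken from the text.

```python
import numpy as np

def flag_residual_outliers(residuals, factor=8.0):
    """Flag adjustment residuals whose deviation exceeds `factor` times their standard
    deviation, as used for the outlier screening inside the iterative solver."""
    r = np.asarray(residuals, dtype=float)
    return np.abs(r - r.mean()) > factor * r.std(ddof=1)
```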

Fig. 5. Data section A2: The gray line shows the original measurement signal, the black line the signal after outlier detection with the combined method (data points 1 999 000–2 002 000, values in E)

Fig. 6. Degree Median of OSU91A, the reference solution (Ref.), no outlier detection applied (1), and exact outlier detection (2)


The gray curve shows the Vzz signal (reduced by the dominant effect of the C20 coefficient). The dashed curve (1) shows the residuals of the adjustment, clearly showing a block of undetected outliers around position number 145 500. Finally, the black solid curve (2) in Fig. 8 shows the results using the pre-processed data that is further outlier-cleaned inside the QL-GFA tool. This procedure leads to the final gravity field solution, shown in Fig. 7 by the black dotted curve (2), where the accuracy of the coefficient estimates is considerably improved compared to the solution with pre-processed data. The effect of outliers is minimized and the gravity field solution comes very close to the one without any outliers. Hence, applying an outlier detection strategy before and inside the gravity field solver ensures high-quality gravity field parameter estimates.

Finally, in order to demonstrate the effect of outliers in derived quantities such as geoidal undulations or gravity anomalies, Fig. 9 shows gravity anomaly differences. Absolute values are shown to stress their magnitude. These differences are obtained between the solution with all outliers correctly flagged and the solution applying the combined outlier detection method. The effect of outliers is clearly visible in the figure and can reach up to 8 mGal in absolute terms.

6 Concluding remarks and outlook

To meet the goals of the satellite missions, sophisticated pre-processing tools have to be developed that include outlier detection methods and calibration procedures. In this paper, several outlier detection methods for SGG measurements have been presented and evaluated. When applied to an erroneous data set, the combined wavelet/Dixon method provided the best results, with detection rates of more than 99%. Only 2.5% of actual observations were falsely identified as outliers. The combined wavelet/Dixon method therefore represents a very good pre-processing algorithm for the detection of outliers in SGG measurements.

The SGG measurements are used for the determination of spherical harmonic coefficients and undetected outliers have an adverse impact on the final coefficients. To improve the solutions, a two-step strategy is proposed where outliers are searched for in the pre-processing step and in the gravity field solver. An interaction between pre-processing on the one hand and application-oriented processing on the other hand is of vital importance.

Fig. 7. Degree Median of OSU91A, the reference solution (Ref.), outlier detection applied in the pre-processing step (1), and outlier detection incorporated in QL-GFA (2)

Fig. 8. Vzz signal including outliers, residuals before (1) and after (2) incorporation of outlier detection in the QL-GFA (data points 14 000–16 000, values in E)

Fig. 9. Gravity anomaly differences (absolute values) between the solution with all outliers correctly flagged and the solution applying the combined outlier detection method (mGal)

In the present study, the QL-GFA software was used. However, the conclusions should also apply to other (direct) gravity field solvers. One difference between semi-analytical solvers and direct gravity solvers is in the treatment of short-term data gaps. The colored noise behaviour of the GOCE gradiometer requires the application of filters. Since this filter procedure is usually performed as a discrete recursive filtering in the direct solver (Schuh 1996), each gap requires a new warm-up phase of the digital filter. This implies a decorrelation with the previous measurement stream and can lead to the loss of a few hundred observations in direct solvers. The treatment of data gaps as a consequence of outliers in the direct solver is beyond the scope of the present study. Moreover, additional outlier detection methods should be implemented and assessed. This includes LS collocation along the orbit, the application of the Laplace condition and the search for outlier pairs (Koch 1985, 1999, p. 302). These topics will form part of a future contribution.

Acknowledgments. Financial support for this study came from the ASA contract ASAP-CO-008/03. The authors would like to thank the three reviewers and the editor for their time and effort in providing constructive comments on an earlier version of this manuscript.

References

Albertella A, Migliaccio F, Sansò F, Tscherning CC (2000) Scientific data production quality assessment using local space-wise preprocessing. From Eötvös to mGal, Study team 2—Workpackage 4. Tech rep, European Space Agency, ESTEC, Noordwijk, The Netherlands

Arabelos D, Tscherning CC (1998) Calibration of satellite gradiometer data aided by ground gravity data. J Geod 72:617–625

Baarda W (1968) A testing procedure for use in geodetic networks. Publ 2(5), Netherlands Geodetic Commission, Delft

Barnett V, Lewis T (1994) Outliers in statistical data, 3rd edn. John Wiley, Chichester

Bruce AG, Donoho LG, Gao HY, Martin RD (1994) Denoising and robust nonlinear wavelet analysis. In: Szu HH (ed) SPIE Proceedings, Wavelet Applications, vol 2242. The International Society for Optical Engineering (SPIE), Orlando, FL, pp 325–336

Cesare S (2002) Performance requirements and budgets for the gradiometric mission. Tech note GOC-TN-AI-0027, Alenia Spazio, Turin

Daubechies I (1992) Ten lectures on wavelets. Society for Industrial and Applied Mathematics, Philadelphia, PA

Dixon WJ (1951) Ratios involving extreme values. Ann Math Statist 22:68–78

Donoho DL, Johnstone IM, Kerkyacharian G, Picard D (1995) Wavelet shrinkage: asymptopia? J R Statist Soc Ser B 57:301–369

Dorst L (1999) Current and additional procedures for superconducting gravimeter data at the main tidal frequency. Graduation report, Delft University of Technology, Delft

European Space Agency (1999) Gravity field and steady-state ocean circulation mission. Reports for mission selection. The four candidate Earth explorer core missions. ESA SP-1233(1), European Space Agency

Grubbs FE (1950) Sample criteria for testing outlying observations. Ann Math Statist 21:27–58

Haagmans R, Prijatna K, Omang O (2002) An alternative concept for validation of GOCE gradiometry results based on regional gravity. In: Proc 3rd Meeting of the International Gravity and Geoid Commission, Edition Ziti, Thessaloniki, pp 281–286

Halperin M, Greenhouse SW, Cornfield J, Zalokar J (1955) Tables of percentage points for the studentized maximum absolute deviate in normal samples. J Am Statist Assn 50:185–195

Heck B (1981) Der Einfluss einzelner Beobachtungen auf das Ergebnis einer Ausgleichung und die Suche nach Ausreissern in den Beobachtungen. Allg Vermess 88(1):17–34

Kaula W (1966) Theory of satellite geodesy. Blaisdell, Waltham, MA

Koch KR (1985) Test von Ausreissern in Beobachtungspaaren. Z Vermess 110:34–38

Koch KR (1999) Parameter estimation and hypothesis testing in linear models, 2nd edn. Springer, Berlin Heidelberg New York

Koop R, Bouman J, Schrama E, Visser P (2002) Calibration and error assessment of GOCE data. In: Schwarz A (ed) IAG Symp Proc 125: Vistas for geodesy in the new millennium. Springer, Berlin Heidelberg New York

Mallat S (1999) A wavelet tour of signal processing, 2nd edn. Academic Press, New York, pp 167–174

Neuilly M (1999) Modelling and estimation of measurement errors. Lavoisier, Paris, France

Pail R, Sünkel H, Hausleitner W, Höck E, Plank G (2000) Temporal variations/oceans. From Eötvös to mGal, Study team 1—Workpackage 6. ESA/ESTEC Contract No. 13392/98/NL/GD, Tech rep, European Space Agency, ESTEC, Noordwijk, The Netherlands

Pail R, Plank G (2002) Assessment of three numerical solution strategies for gravity field recovery from GOCE satellite gravity gradiometry implemented on a parallel platform. J Geod 76:462–474

Pail R, Plank G (2004) GOCE gravity field processing strategy. Stud Geophys Geod 48:289–308

Preimesberger T, Pail R (2003) GOCE quick-look gravity solution: application of the semianalytic approach in the case of data gaps and non-repeat orbits. Stud Geophys Geod 47:435–453

Rapp R, Wang Y, Pavlis N (1991) The Ohio State 1991 geopotential and sea surface topography harmonic coefficient models. OSU Rep 410, Department of Geodetic Science and Surveying, The Ohio State University, Columbus

Reigber Ch, Schwintzer P, Lühr H (1999) CHAMP geopotential mission. Boll Geof Teor Appl 40:285–289

Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley Series in Probability and Mathematical Statistics. Wiley, New York

Rummel R, van Gelderen M, Koop R, Schrama E, Sansò F, Brovelli M, Migliaccio F, Sacerdote F (1993) Spherical harmonic analysis of satellite gradiometry. Publ Geodesy 39, Netherlands Geodetic Commission, Delft

Schuh W-D (1996) Tailored numerical solution strategies for the global determination of the Earth's gravity field. Mitteilungen Geod Inst TU Graz 81, Graz University of Technology, Graz

Sneeuw N (2000) A semi-analytical approach to gravity field analysis from satellite observations. DGK series no. 527, Bayerische Akademie der Wissenschaften, Munich

Sneeuw N, van Gelderen M (1997) The polar gap. In: Sansò F, Rummel R (eds) Geodetic boundary value problems in view of the one centimeter geoid. Lecture Notes in Earth Sciences 65. Springer, Berlin Heidelberg New York, pp 559–568

Strang G, Nguyen T (1996) Wavelets and filter banks. Wellesley-Cambridge Press, Wellesley

Tapley BD (1997) The gravity recovery and climate experiment (GRACE). EOS Trans Am Geophys Un Suppl 78(46)

Wang Y (1995) Jump and sharp cusp detection by wavelets. Biometrika 82:385–397
