Homogenization Techniques for European Monthly Mean Surface Pressure Series

2658 JOURNAL OF CLIMATE VOLUME 12

© 1999 American Meteorological Society

V. C. SLONOSKY, P. D. JONES, AND T. D. DAVIES
Climatic Research Unit, University of East Anglia, Norwich, United Kingdom

Corresponding author address: Vicky Slonosky, Climatic Research Unit, University of East Anglia, Norwich NR4 7TJ, United Kingdom. E-mail: [email protected]

(Manuscript received 15 July 1998, in final form 16 December 1998)

ABSTRACT

The quality of 51 series of surface pressure (extending back to between 1780 and 1871) over Europe is assessed using three different homogenization techniques. A new technique introduced here based on an iteration of multiple qualitative comparisons and adjustments (MCAs), and the Caussinus and Mestre technique, based on multiple decision rules and Bayesian statistics, are two methods that do not require a homogeneous reference series for the detection and adjustment of inhomogeneities. The third technique, the standard normal homogeneity test, does require a homogeneous reference series for the homogenization procedure, and has been used only on the last 100 yr of each station series. The results of the three methods, as well as the original, unadjusted data, are compared for differences in the variance of the individual series and in their interstation correlations. Empirical orthogonal function analysis is also used to assess differences in the results of the adjustment methods. The comparisons suggest that surface pressure in this geographical domain may be considered as being stationary over periods ranging from decades to centuries, and thus homogeneous parts of a surface pressure record can be used to adjust for inhomogeneities, as is done using MCA. It is also seen that EOF analysis can be an effective tool to assess the homogeneity of a dataset. The results of the EOF analysis show that inhomogeneities and poorly adjusted series can have undue influence on subsequent analyses.

1. Introduction

Homogeneity testing and the adjustment of climatic time series for nonclimatic variations are a fundamental part of any analysis of observational data as many long series contain inhomogeneities that must be adjusted before any meaningful analysis for variations can be performed (Alexandersson 1986; Karl and Williams 1987; Peterson and Easterling 1994; Jones 1995; Heino 1997; Peterson et al. 1998). Inhomogeneities can be due to factors such as station relocations, changes in observing procedures (including instrumentation changes), and changes in methods of calculating monthly means (Karl and Williams 1987; Peterson and Easterling 1994; Jones 1995). For pressure, changes in instrument height, which can be related to station relocations, are the most important causes of inhomogeneities (Heino 1997), although changes in procedures for the reduction to mean sea level can also cause discontinuities in station records. Changes in the source of the data, or combining several nearby stations to form one long time series, can also introduce discontinuities (Young 1993). Spurious values, such as errors in data processing, can also occur as outliers (Jones et al. 1987). Often the magnitude of the variations caused by these inhomogeneities is as large as, or even larger than, the signal that is being studied, causing distortions in subsequent analyses. The detection of and adjustment for these nonclimatic variations are therefore of critical importance.

Many different methods exist to detect and adjust for inhomogeneities in climatic time series (Peterson et al. 1998). A series is defined to be "homogeneous if the variations are caused by variations in weather and climate" (Conrad and Pollak 1962). However, in practice it is very difficult to find a station that is homogeneous, and instead researchers focus on relative homogeneity. A series is defined to be "relatively homogeneous with respect to a synchronous series at another place if the differences (ratios) of the pairs of homologous averages represent a series of random numbers, which satisfies the normal law of errors" (Conrad and Pollak 1962). Almost all methods are based on relative homogeneity, that is, comparing the station of interest, called the candidate station, with nearby stations by taking the differences (for pressure or temperature) or ratios (for precipitation) between the candidate and neighboring stations. If both stations are homogeneous, the time series of these differences or ratios should be a random series, with no discontinuities or trends. If discontinuities exist in the difference series, these are referred to as breaks, and the (temporal) location of the break is known as the break point. For the remainder of this paper, the term "homogeneous" will be taken to mean "relatively homogeneous."
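As a minimal sketch of the relative homogeneity idea, the following Python fragment forms the difference series between a candidate and one neighboring station and measures the mean shift across an assumed break point. The function names and the sample values are illustrative assumptions and are not taken from the data used in this paper.

```python
from statistics import mean


def difference_series(candidate, neighbor):
    """Candidate minus neighbor, value by value (used for pressure or temperature)."""
    if len(candidate) != len(neighbor):
        raise ValueError("series must cover the same time steps")
    return [c - n for c, n in zip(candidate, neighbor)]


def mean_shift(diff, break_index):
    """Mean of the difference series from break_index onward minus the mean before it."""
    return mean(diff[break_index:]) - mean(diff[:break_index])


# Hypothetical annual means (hPa). The candidate shares the neighbor's climate signal,
# has small local noise, and gains an artificial 1.5 hPa offset from index 5 onward.
neighbor = [1013.0, 1012.4, 1013.6, 1012.9, 1013.3, 1012.7, 1013.5, 1013.1, 1012.8, 1013.2]
local_noise = [0.2, -0.1, 0.1, 0.0, -0.2, 0.1, 0.0, -0.1, 0.2, -0.2]
candidate = [
    n + e + (1.5 if i >= 5 else 0.0)
    for i, (n, e) in enumerate(zip(neighbor, local_noise))
]

diff = difference_series(candidate, neighbor)
shift = mean_shift(diff, 5)  # the shared climate signal cancels; the offset remains
```

The shared variability cancels in the difference series, so the artificial offset stands out even though it is comparable in size to the year-to-year variability of either station.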


AUGUST 1999 SLONOSKY ET AL. 2659

FIG. 1. Map of station locations.

The difficulty with relative homogeneity is in identifying whether the break is caused by the candidate or neighboring station. Many tests have been devised that compare candidate stations to homogeneous reference series (e.g., Alexandersson 1986; Karl and Williams 1987; Easterling and Peterson 1995; Alexandersson and Moberg 1997). However, finding or creating a homogeneous reference series for comparison purposes is difficult (Alexandersson 1986; Karl and Williams 1987; Peterson and Easterling 1994). Some methods include metadata (i.e., station history information) to facilitate the identification of break points, but problems exist with the quality of the metadata available, as not all stations have complete station history records. Relative homogeneity techniques are also difficult to apply near the boundaries (in space or time) of a dataset, as there may not be sufficient nearby stations for comparison.

In this paper, 51 series of surface pressure from Europe, described in section 2, are homogenized using three different techniques. The first is a new, quasiobjective technique that uses iterative multiple qualitative comparisons to perform simultaneous adjustments (hereafter MCA) and is introduced here. The second method, the Caussinus and Mestre technique (CMT), is based on Bayesian statistics and multiple decision rules, described in Caussinus and Mestre (1996) and Caussinus and Lyazrhi (1997). These two methods do not require a homogeneous reference series and are described in section 3. The third method is the standard normal homogeneity test (SNHT) described in Alexandersson (1986) and Alexandersson and Moberg (1997), which will also be briefly described in section 3. For the purposes of this comparison, these methods consider the effects of discontinuities only: trend inhomogeneities are not considered, although a formulation of the SNHT does exist for trends (Alexandersson and Moberg 1997). However, one would expect a major inhomogeneous trend to be identified as one or more discontinuities. In practice, few trends are found in pressure data, trends being more common in temperature (with the urban heat-island effect) and precipitation (with faulty rain gauges). Although trends may be detected, in practice they are more difficult to adjust for than discontinuities.

Comparisons between the results of the three techniques are undertaken in section 4, using F-tests, interstation correlations, and empirical orthogonal functions (EOFs). All tests consider inhomogeneities with respect to the mean only; the homogeneity of higher-order moments, such as the variance, is not considered, although the different adjustment techniques used may have an effect on the variance of the adjusted series. This is discussed in section 4. Some conclusions are presented in section 5.

2. Surface pressure data

The locations of the 51 time series are shown in Fig. 1, as well as the latest starting date for each station. All series end in 1995. Although much of the data derives from the World Weather Records (WWR), Monthly Climatic Data for the World, and the Reseau Mondiale, some important early data are taken from Hann (1887), de Tillo (1890), and Gorczynski (1917). The series for Lund has been compiled and documented by Barring et al. (1999), the series for Barcelona and Madrid by Rodriguez et al. (1999, manuscript submitted to Int. J. Climatol.), and the series for Reykjavik and Gibraltar by Jones et al. (1997). The sources are described in more detail in Jones et al. (1987) and Jones et al. (1999).

It is interesting to note that the early observations are often of extremely good quality (Lamb and Johnson 1959), as they were usually taken by dedicated amateurs who kept very careful records for long periods of time, sometimes for over 30 yr, as did the French physician Louis Morin in the late seventeenth and early eighteenth centuries (LeGrand and LeGoff 1992). The quality of observations may even in some cases have decreased with time, as observing practices became more routine with several different observers and locations for the same series. However, the sophistication of the corrections necessary to obtain reliable pressure readings (scale errors, zero error, capillarity, temperature, gravity, and sea level) gradually developed over the course of the seventeenth, eighteenth, and nineteenth centuries (Knowles Middleton 1964).

All series have been reduced to mean sea level with the exceptions of Milan, Florence, and Jerusalem, as the data for these stations were mainly recorded as station, not sea level, pressure. All series are of mean monthly pressure.

FIG. 2. (a) Difference plots for Rome, showing annual average (i) Rome–Malta, (ii) Rome–Barcelona, (iii) Rome–Florence, and (iv) Rome–Athens. Solid line shows original differences, with all stations unadjusted; dotted line shows the differences with all stations adjusted. The dotted vertical lines show the position of break points; the dashed vertical line shows the position of an outlier. (b) Break point detection example for the CMT.

3. Methods

a. Homogenization methods

1) MCA

The MCA method for detecting and adjusting for breaks is based on multiple comparisons of a candidate station with four surrounding stations, and simultaneous adjustments of each series. The surrounding stations are chosen as the nearest stations in each direction (one to the north, south, east, and west). For stations near the boundary of the area, such as Jerusalem or Reykjavik, the four nearest stations within the area (i.e., to the north and west of Jerusalem) are used. Inhomogeneous periods are identified by inspection of the four difference series: if a break occurs at the same place in three or more of the difference series, the break is identified as being in the candidate station record (see Fig. 2a). The candidate series is then adjusted by calculating monthly adjustment factors as the differences between the monthly mean value of a modern reference period and the monthly mean value of the inhomogeneous period. The modern reference period is usually taken as 1981–95, but in the cases where there is a break after 1981, an earlier homogeneous period is used to ensure a minimum of 15 yr for the calculation of the reference period average. These adjustments are made for each series. A second multiple comparison is undertaken, to ensure that the dates chosen for the breaks were correct, by considering the impact of the adjustments made based on the first comparison. If the date of a detected break is wrong by a year or so, this should become evident in subsequent comparisons. If such an error is not detected in the comparisons, it is assumed that it will not have a large impact on the adjustments of other stations.

The process is repeated until all time series of station differences are deemed free of breaks. Each iteration applies the MCA to every series. Detection is performed on annual series, although the adjustments are made on a monthly basis. The calculated monthly adjustment values are smoothed using a Gaussian filter to ensure a reasonable annual cycle in the adjustment values. Outliers are also identified in this process and are corrected. By using an iterative approach with multiple comparisons, the use of a homogeneous reference series is avoided.

FIG. 2. (Continued) (b) Break point detection example for the CMT. Rome is compared to eight surrounding stations. Triangles indicate the position of a detected break, and an "A" the position of an outlier, for each comparison. Where a break is detected in the same place in several comparisons, it is attributed to the candidate (Rome) series.

A potential problem arises if there are inhomogeneities that occur at several stations at the same date, as when a country's observing network changes an observing procedure at all stations at the same time. To ensure that nationwide changes would not be masked, care was taken to ensure that all series were compared to at least one series in a different country. An advantage of the iterative comparison technique is that it allows one to experiment with the temporal and spatial location of a suspected discontinuity, until difference series are obtained for each station that are as satisfactory as possible. Even so, problems with nationwide network changes are the most difficult inhomogeneities to deal with and the possibility of a wrongly assigned break point remains.

This method is qualitative in that it does not involve the computation of probabilities or likelihoods with which to assign a numerical confidence in the break detected. However, it is recognized that subjective judgment, particularly when viewing graphs such as Fig. 2a, can factor in considerations such as the magnitude of an apparent discontinuity compared to the variance of the series more easily than can be done numerically. Subjective judgment can also be helpful when the reliability of the metadata varies, as is the case with the long series dealt with here (Peterson et al. 1998). This approach also assumes the time series of surface pressure are stationary at the location on timescales of decades or longer, as the mass of air is a conserved quantity. It is possible that this assumption may lead to the reduction of real decadal-to-century-scale variability in climate. To assess this possibility, the series adjusted using the MCA are compared to series adjusted using neighboring series, which do not rely as heavily on this assumption of stationarity (see section 4). It is important to note, however, that the idea of relative homogeneity testing is that real climatic influences, which should be present in neighboring series as well as the candidate series, are preserved.
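The attribution rule and the monthly adjustment factors of the MCA method can be sketched in Python as follows. This is only an illustration under stated assumptions: the break flags would in practice come from visual inspection of the four difference series, the width of the Gaussian filter is not given in the text (sigma = 1 month and a cyclic window of plus or minus 2 months are assumed here), the adjustment is assumed to be added to the inhomogeneous period, and all names and sample values are hypothetical.

```python
import math


def break_in_candidate(flags, min_agree=3):
    """Attribute a break to the candidate if it appears in at least three of the four difference series."""
    return sum(bool(f) for f in flags) >= min_agree


def monthly_adjustments(monthly, reference_years, inhomogeneous_years):
    """Reference-period monthly mean minus inhomogeneous-period monthly mean, for each of 12 months."""
    adjustments = []
    for m in range(12):
        ref_mean = sum(monthly[y][m] for y in reference_years) / len(reference_years)
        bad_mean = sum(monthly[y][m] for y in inhomogeneous_years) / len(inhomogeneous_years)
        adjustments.append(ref_mean - bad_mean)
    return adjustments


def smooth_cyclic_gaussian(values, sigma=1.0, half_width=2):
    """Gaussian-weighted running mean that wraps around the year (assumed filter shape)."""
    offsets = range(-half_width, half_width + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    n = len(values)
    return [
        sum(w * values[(i + k) % n] for w, k in zip(weights, offsets)) / total
        for i in range(n)
    ]


# Hypothetical station: the early period reads 2 hPa too low in every month.
base = [1016.0 - 0.5 * m for m in range(12)]
inhomogeneous_years = range(1900, 1910)
reference_years = range(1981, 1996)  # the usual 1981-95 reference period
monthly = {y: [v - 2.0 for v in base] for y in inhomogeneous_years}
monthly.update({y: list(base) for y in reference_years})

raw = monthly_adjustments(monthly, reference_years, inhomogeneous_years)
smoothed = smooth_cyclic_gaussian(raw)
adjusted = {
    y: [v + a for v, a in zip(monthly[y], smoothed)] for y in inhomogeneous_years
}
```

In the full procedure these steps would be repeated for every station and the difference series inspected again after each round of adjustments.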


2) CMT

The second method considered here is the CMT, described in detail in Caussinus and Mestre (1996). The principle of CMT is based on the statement that a series is homogeneous between two break points, so the sections of two series between two break points can be used as reference series, and a homogeneous reference series is thus not necessary. Each candidate station is compared to a set of surrounding stations by creating a difference series, as described above. These difference series are assumed to be normally distributed, and are tested for discontinuities and outliers: when a detected break point occurs at the same location (i.e., the same year) in several comparisons, it is attributed to the candidate station series. The inhomogeneities are assumed to be steplike changes (discontinuities) that alter only the average value of the series, while the higher moments, such as the variance, are assumed to be unaffected. The procedure is thus far similar to that described above for the MCA method, but the detection of break points and outliers for each individual difference series is performed using an adapted penalized log-likelihood procedure (Caussinus and Mestre 1996). This procedure can be used for an unknown number of breaks and outliers. It is formulated as a problem of testing multiple hypotheses and provides a multidecision rule, and is Bayes-invariant (Caussinus and Mestre 1996). An example of this break point detection procedure is given in Fig. 2b. The detection of break points for individual difference series (triangles in Fig. 2b) is automated (i.e., objective), although there is a certain amount of subjectivity in determining the existence and position of the break point using all comparison series together for a candidate series. Again, it is not necessarily evident if a break occurs at more than one station at the same time, although the use of six or more difference series increases the confidence in a break being detected in the appropriate series.

The adjustments are calculated as the differences of the station series and the regional climate estimate for each inhomogeneous period. The adjustments are performed on the anomaly values, and are readjusted to those of the last homogeneous (i.e., the most modern) subperiod of the record. As with the MCA method, the adjustments are made on a monthly basis.

3) SNHT

The third method used here for the homogenization of pressure data is that described in Alexandersson (1986) and Alexandersson and Moberg (1997), known as the SNHT for single shifts. This procedure requires a homogeneous reference series. For each pressure series tested here, the homogeneous reference series was calculated from the U.K. Meteorological Office (UKMO) sea level pressure dataset, which is on a 5° lat × 10° long grid. Values for the 51 stations were interpolated from this grid to the station location by using the four surrounding gridpoint values. It is by no means certain that the UKMO gridded data is itself homogeneous. However, the charts are considered reliable for the European region, although there may be some problems in the southeastern part of Europe, over Asia Minor, and the Middle East (Jones 1987; Jones et al. 1999).

The SNHT is based on the maximum likelihood ratio test, with the assumption that there is at most one break in the series, and cannot properly handle series with many breaks (Alexandersson and Moberg 1997). In this study, if a break was detected, it was adjusted by calculating the difference between the candidate series and the reference series and applying the difference as an adjustment factor, as described in Alexandersson and Moberg (1997). The test was then repeated to check for further breaks, until the series was considered homogeneous at the 95% level. As the UKMO gridded data are only available from 1881 onward, a smaller subset of the station records was tested using the SNHT. The results of the SNHT used for comparison purposes were obtained from the last 100 years of data, from 1896 to 1995.
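A minimal Python sketch of the single-shift test statistic, in the commonly quoted form T(v) = v z1^2 + (n - v) z2^2 maximized over the candidate break position v, is given below. Here z1 and z2 are the means of the standardized candidate-minus-reference series before and after v. The use of the sample standard deviation, the function name, and the synthetic data are assumptions made for illustration; the critical value for the 95% level has to be taken from published tables and is not computed here.

```python
from statistics import mean, stdev


def snht_single_shift(candidate, reference):
    """Return (T0, v, shift) for the most likely single break.

    v is the index of the first value after the break; shift is the mean of the
    difference series after the break minus its mean before the break.
    """
    q = [c - r for c, r in zip(candidate, reference)]
    q_mean, q_std = mean(q), stdev(q)
    if q_std == 0:
        raise ValueError("difference series is constant; the statistic is undefined")
    z = [(x - q_mean) / q_std for x in q]
    n = len(z)
    best_t, best_v = 0.0, None
    for v in range(1, n):
        t = v * mean(z[:v]) ** 2 + (n - v) * mean(z[v:]) ** 2
        if t > best_t:
            best_t, best_v = t, v
    shift = mean(q[best_v:]) - mean(q[:best_v])
    return best_t, best_v, shift


# Hypothetical data: 20 annual values, with the first 8 candidate values 1 hPa too low.
reference = [1013.0 + 0.3 * ((i * 7) % 5 - 2) for i in range(20)]
noise = [0.1 if i % 2 == 0 else -0.1 for i in range(20)]
candidate = [r + e - (1.0 if i < 8 else 0.0) for i, (r, e) in enumerate(zip(reference, noise))]

t0, v, shift = snht_single_shift(candidate, reference)
# If t0 exceeds the tabulated critical value, the values before index v could be
# adjusted by adding `shift`, and the test repeated on the adjusted series.
```

Because the statistic assumes at most one break, repeated application after each adjustment, as described above, is what allows it to be used on series with more than one discontinuity.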

4) USE OF METADATA

Possible discontinuities were identified in the metadata and from careful scrutiny of both the WWR station information and any details that were available from the other sources used (see section 2). Occasionally, a change in the station description was found in the WWR station information given next to the data, but no more precise information was given in the station notes concerning these changes. For example, the height of the barometer at Rome was listed as 63 m for the years 1891–1924 (Exner 1944) but is given as 50.16 m for the years 1921–30 (Simpson 1934), although there is no explicit information in the station history concerning such a change. Before 1951, most corrections were applied to the data by the procedures described in the WWR. Since 1951, systematic corrections might have been applied by the country supplying the data, but few details of the pre-1951 type are given.

To compare the adjustment procedures for the MCA method and the CMT, the break points found using each method were carefully studied, and using metadata as a guide, a set of common break points was identified as the most probable break points. Adjustment factors were determined for this common set of break points for both methods.

b. Comparisons

1) CORRELATIONS AND VARIANCES

Two comparisons were done. The first, from 1896 onward, used four sets of annual data: the original data and the data adjusted by the three methods described above. The second comparison was performed from the starting point of each individual series on the original data and the data adjusted by the MCA method and the CMT. For each series, the correlation coefficients of that series with all other series were calculated, summed, and divided by the total number of series, giving a measure of the relative degree of spatial correlation between the series for each set of data. The variance of each series was also calculated over the two periods, as this gives a measure of the degree of temporal variability of each individual series. Variances were also summed over all 51 series in each set, to give a measure of the degrees of difference in spatial and temporal variability engendered by the different adjustment methods. Pairwise F-tests were performed on each series as adjusted by the different methods. The results of these calculations are given in Tables 1a and 1b and are discussed in section 4.

TABLE 1a. Mean pressure, variance, and overall correlation for each dataset.

Set          Pressure   Variance   Correlation
MCA          1013.2     1.40       0.34
CMT          1012.8     1.74       0.33
Original     1012.9     2.24       0.26
MCA*         1013.2     1.41       0.35
CMT*         1013.0     1.50       0.34
SNHT*        1013.2     1.56       0.35
Original*    1013.3     2.00       0.29

* Since 1896.

TABLE 1b. F-test results (number of stations with variances that are significantly different at the 95% level).

Tests               Number of stations with F > F.025
MCA, CMT            17
MCA, original       17
CMT, original       13
MCA, CMT*            3
MCA, SNHT*           5
MCA, original*       8
CMT, SNHT*           1
CMT, original*       6
SNHT, original*      9

* Since 1896.

TABLE 2a. Percentage of total variance explained by first six EOFs.

            EOF1 (%)  EOF2 (%)  EOF3 (%)  EOF4 (%)  EOF5 (%)  EOF6 (%)  Total (%)
EOFA: 1896–1995 (51 stations)
MCA         43.39     26.10     12.11     3.89      2.74      2.52      90.75
CMT         43.68     24.42     10.84     3.99      3.58      2.35      88.86
SNHT        42.63     25.11     11.62     3.98      2.82      2.37      88.53
Original    32.14     20.00     15.20     8.09      6.18      4.03      85.64
EOFB: 1789–1995 (12 stations)
MCA         62.29     20.22      6.21     3.45      2.08      1.49      95.74
CMT         65.56     16.30      7.51     3.18      2.33      2.00      96.88
Original    45.66     17.86     12.50     9.36      4.86      2.96      93.20
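The interstation correlation measure and the variance ratio underlying the pairwise F-tests in this subsection can be sketched in Python as follows. The sketch follows the wording of the text literally (the sum of correlations with all other series divided by the total number of series); the function names and toy series are hypothetical, and the critical value of the F distribution is left to published tables or a statistics library.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series (neither may be constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def interstation_correlation(series):
    """For each series: correlations with all other series, summed, divided by the number of series."""
    n = len(series)
    return [
        sum(pearson(series[i], series[j]) for j in range(n) if j != i) / n
        for i in range(n)
    ]


def variance_ratio(x, y):
    """F statistic for comparing two variances, with the larger variance in the numerator."""
    vx, vy = variance(x), variance(y)
    return max(vx, vy) / min(vx, vy)


# Toy series: b is perfectly correlated with a, c is perfectly anticorrelated with both.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0 * v for v in a]
c = list(reversed(a))

scores = interstation_correlation([a, b, c])
overall = mean(scores)  # one possible single summary value per dataset (an assumption)
f_stat = variance_ratio(a, b)
```

A station would be counted in a table such as Table 1b when its variance ratio between two versions of the same series exceeds the chosen critical value.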

2) EMPIRICAL ORTHOGONAL FUNCTIONS

EOF analyses were also performed on the covariance matrices of the different homogenized datasets, as well as the original data. Five analyses were performed, with different starting dates and hence different numbers of stations in each analysis. Only two are presented here, the first from 1896 to 1995 (EOFA) on the three sets of adjusted data as well as the original data and the second (EOFB) from 1789 to 1995 on the MCA, CMT, and original sets. The number of stations available decreases from 51 for EOFA (all stations present) to only 12 for EOFB. The percentages of variance explained by the first six EOFs for each set are given in Table 2a. The results are presented in section 4, Figs. 4–8.

Correlations between EOF time series were calculated, and correlations between the same order EOFs for the different datasets are shown in Table 2b.
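As an illustration of how the percentage of variance explained by an EOF follows from the covariance matrix, the Python sketch below extracts only the leading EOF by power iteration and divides its eigenvalue by the trace of the covariance matrix. Power iteration is used here purely to keep the example self-contained; it is not stated in the text how the EOFs were computed, and the function names, starting vector, and toy data are assumptions.

```python
def covariance_matrix(data):
    """Station-by-station covariance matrix from a list of equal-length station series."""
    n_t = len(data[0])
    anomalies = []
    for s in data:
        s_mean = sum(s) / n_t
        anomalies.append([x - s_mean for x in s])
    return [
        [sum(p * q for p, q in zip(si, sj)) / (n_t - 1) for sj in anomalies]
        for si in anomalies
    ]


def leading_eof(cov, iterations=200):
    """Leading eigenvalue and unit eigenvector of a symmetric covariance matrix.

    Assumes a nonzero matrix and a start vector that is not orthogonal to the leading EOF.
    """
    n = len(cov)
    vec = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in new) ** 0.5
        vec = [x / norm for x in new]
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, vec


def explained_fraction(cov, eigenvalue):
    """Share of total variance (trace of the covariance matrix) carried by one EOF."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))


# Toy data: stations 1 and 2 vary together, station 3 varies independently of them.
t = [-2.0, -1.0, 0.0, 1.0, 2.0]
stations = [t, [2.0 * v for v in t], [1.0, -2.0, 2.0, -2.0, 1.0]]

cov = covariance_matrix(stations)
eigenvalue, eof1 = leading_eof(cov)
fraction = explained_fraction(cov, eigenvalue)
```

In practice all EOFs would be obtained at once from a symmetric eigenvalue routine of a numerical library, and the percentages in a table such as Table 2a correspond to each eigenvalue divided by the sum of all eigenvalues.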

4. Results

It is difficult to evaluate the effectiveness of the detection procedures as the detected breaks cannot be verified. The metadata is not always an infallible guide, as it may be that not all possible causes of inhomogeneities are detailed. Conversely, some disruptions in the station history, such as station relocations or instrument changes, may be listed, but their effects have already been accounted for in the published data. This is especially the case in older data, which may have been studied over a hundred years ago, although the details of the adjustments applied have been lost.

Due to space limitation, only four series were chosen to be presented in this paper, although all 51 series have been adjusted using the three methods described. Vienna is an example of a station where there are major differences between MCA and CMT in the early years of the series. Rome, Jerusalem, and Athens are stations where there are large inhomogeneities in the original series, and so the differences in the homogenization techniques are more apparent. The anomaly time series (adjusted values minus the overall adjusted mean, for each adjustment method and each station) of these four stations are shown in Fig. 3. These four stations are examples of problematic behavior in the various adjustment procedures. It must be emphasized here that the adjustments applied by the different techniques seem to be broadly comparable for most stations, and show only minor differences in their final values. Due to different reference periods, the different adjustment procedures do not necessarily adjust to the same mean for each station, so the final series are presented as deviations from their overall mean values.

TABLE 2b. Correlation coefficients between sets for EOFA. Correlations in bold are significantly different from zero at the 95% level.

                  EOFA 1   EOFA 2   EOFA 3   EOFA 4   EOFA 5   EOFA 6
MCA, CMT           0.98     0.98     0.99     0.62     0.29     0.73
MCA, SNHT          0.98     0.98     0.98     0.85     0.68    -0.56
MCA, original      0.94    -0.87    -0.18    -0.02     0.00     0.06
CMT, SNHT          0.97     0.99     0.98     0.66     0.11    -0.25
CMT, original      0.98    -0.90    -0.16    -0.10     0.19     0.00
SNHT, original     0.92    -0.90    -0.29    -0.21    -0.32     0.50

Note: Some of the EOFs have changed order. Correlations between EOFA 4 of the original data and EOFA 3 of the MCA, CMT, and SNHT are 0.94, 0.97, and 0.92, respectively. The correlation between EOFA 4 of MCA and EOFA 5 of CMT is 0.91.

TABLE 2c. Correlation coefficients between sets for EOFB. Correlations in bold are significantly different from zero at the 95% level.

                  EOFB 1   EOFB 2   EOFB 3   EOFB 4   EOFB 5   EOFB 6
MCA, CMT           0.88     0.96    -0.51    -0.66    -0.15     0.42
MCA, original      0.84     0.14     0.02    -0.26     0.10    -0.07
CMT, original      0.97     0.26    -0.20     0.27    -0.27     0.38

Note: Some of the EOFs have changed order. Correlations between EOFB 4 of the original data and EOFB 3 of the MCA and CMT are both 0.94.

a. Correlations and variances

The results of the overall variance and correlation analyses are presented in Table 2a. As the MCA method assumes the stationarity of the individual time series, but does not refer to neighboring stations in the calculation of the adjustment factors, it was expected that the MCA set would have relatively lower spatial correlations, as there is more independence between stations, but also relatively lower variances, since there is a risk that real variations are damped by forcing the inhomogeneous periods to the same average as the reference period. In contrast, since the CMT adjustment procedure does refer to neighboring stations, it was expected that the relative degree of spatial dependence, and hence interstation correlation, would be higher than that of MCA. CMT also adjusts to the most recent homogeneous period, and so this should have some effect on the variance. The SNHT method adjusts only to its reference series, in this case the UKMO gridded data. The variance of the data adjusted using the SNHT should therefore be relatively high.

It can be seen that there do not appear to be any overall differences between the interstation correlation or mean variance between the three adjusted datasets, suggesting that the theoretical differences in the adjustment methods do not bias the final results of the adjusted series. The adjusted series all have higher interstation correlations and lower variances than the original, unadjusted data. The data adjusted using the MCA method does have a slightly lower mean variance than the CMT or the SNHT adjusted datasets, and in order to investigate this further, F-tests were performed pairwise on each station to test for significant differences in the variance of the series adjusted using the different methods (Table 1b). When the full records are considered, 33% (17/51) of the stations have significant differences in the variance at the 95% level between the MCA method and the CMT. Many of these stations are those with long records, starting before 1820, for which the adjustment factors tend to be underestimated using the CMT method in the early years of the record (see Fig. 3a). This underestimation would have a significant effect on the variance, and may account for the 10 stations with differing variances between the MCA method and the CMT with starting dates before 1830. It is thought that the CMT has difficulty adjusting the earliest portions of the long series because there are not enough surrounding stations to form a robust estimate of the regional climate, and thus may not be very applicable to the earliest years of the longer series. The same number (and in many cases the same stations) show significant differences in the variance between the MCA adjusted series and the original series. A somewhat lower percentage (25%, or 13/51) of stations show significant differences in the variance between the CMT adjusted data and the original data.

FIG. 3. Adjusted series, plotted as anomalies from station mean, for MCA, CMT, and SNHT adjusted series. Original series (light dotted lines) are also plotted: (a) Vienna, (b) Rome, (c) Athens, and (d) Jerusalem. The station means vary slightly for each method and from the original.

When the period from 1896 to 1995 is considered, there are fewer stations with significant differences in the variance at the 95% level. Only three stations, Athens, Jerusalem, and Cairo, are different between the MCA and CMT datasets. These three stations at the southeastern boundary of the grid are somewhat isolated, and therefore difficult to adjust for using surrounding stations. It can be seen in Fig. 3d that in fact, the CMT method produced a poor adjustment for Jerusalem, giving rise to a very high variance for this station. Five stations show significantly different variances between the MCA and SNHT adjusted datasets, and only one station (Rome) had significant differences in the variance between the CMT and SNHT datasets. Figure 3b shows that the SNHT method did not completely adjust for the discontinuity in the 1940s at Rome, and this would account for the higher variance in the SNHT adjusted series for Rome. A higher proportion of stations (although still less than 20% of all stations) had significant differences in the variance between adjusted and unadjusted stations.
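The pairwise variance comparison described in this subsection can be illustrated with a two-sided F-test on the ratio of two sample variances. The following is a minimal sketch and not the authors' code: the function names `reg_inc_beta`, `f_cdf`, and `f_test_variances` are hypothetical, the two series are assumed to be independent and approximately normally distributed, and the F distribution is evaluated through the regularized incomplete beta function so that only the Python standard library is needed.

```python
import math
from statistics import variance


def _betacf(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_cdf(f, d1, d2):
    """Cumulative distribution function of F with (d1, d2) degrees of freedom."""
    return reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))


def f_test_variances(series_a, series_b):
    """Return (F ratio, two-sided p value) for equality of two variances."""
    f = variance(series_a) / variance(series_b)
    cdf = f_cdf(f, len(series_a) - 1, len(series_b) - 1)
    return f, min(1.0, 2.0 * min(cdf, 1.0 - cdf))


# Illustrative values only, not station data.
f_ratio, p_value = f_test_variances([1.0, 2.0, 3.0, 4.0, 5.0],
                                    [2.0, 4.0, 6.0, 8.0, 10.0])
significant_at_95 = p_value < 0.05
```

Under these assumptions, a station would be counted as having significantly different variances between two adjustment methods when `p_value` falls below 0.05. Serial correlation in real pressure series would reduce the effective degrees of freedom, which this sketch does not account for.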

These comparisons suggest that surface pressure can probably be considered a conserved property over a specific location, at least over periods of decades or longer, as there do not appear to be any systematic differences between the adjustments made using surrounding reference series (CMT and SNHT) and the adjustments made using a homogeneous part of the candidate series (MCA method).
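The idea of adjusting with a homogeneous part of the candidate series, that is, forcing each inhomogeneous period to the same average as a reference period, can be sketched as follows. This is a deliberately simplified illustration under the stationarity assumption discussed above: it is not the full MCA procedure, which iterates over multiple comparisons, and the function name `adjust_to_reference`, the break positions, and the pressure values are all hypothetical.

```python
from statistics import mean


def adjust_to_reference(series, breaks, ref_segment):
    """Shift each segment so that its mean equals the reference segment mean.

    series      -- list of pressure values in time order
    breaks      -- sorted indices at which a new segment starts
    ref_segment -- 0-based index of the segment assumed to be homogeneous
    """
    bounds = [0, *breaks, len(series)]
    segments = [series[lo:hi] for lo, hi in zip(bounds[:-1], bounds[1:])]
    ref_mean = mean(segments[ref_segment])
    adjusted = []
    for seg in segments:
        offset = ref_mean - mean(seg)  # constant adjustment factor for this segment
        adjusted.extend(value + offset for value in seg)
    return adjusted


# Hypothetical series (hPa) with a 5-hPa step after the third value;
# the most recent segment is taken as the reference period.
raw = [1010.0, 1012.0, 1011.0, 1015.0, 1017.0, 1016.0]
homogenized = adjust_to_reference(raw, breaks=[3], ref_segment=1)
# homogenized == [1015.0, 1017.0, 1016.0, 1015.0, 1017.0, 1016.0]
```

The sketch makes the risk noted in the text visible: any real change in the mean between segments is removed along with the artificial step, which is why the approach relies on pressure being stationary over the periods compared.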

b. Empirical orthogonal functions

The percentages of variance explained by the first six EOFs for the two analyses are given in Table 2a, and the correlations between the time series of the same order EOFs between the datasets are given in Tables 2b and 2c. In a few cases, the order of the EOFs appears to have changed: for example, EOF 4 of the original dataset corresponds very highly to EOF 3 of the adjusted datasets, for both analyses, and EOFA 5 of the CMT data corresponds to EOFA 4 of the MCA adjusted data.

The first EOFs for the years 1896–1995 (Fig. 4a for spatial component and Fig. 4b for time coefficients) are very similar for the three adjusted datasets, explaining between 42.7% and 43.4% of the total variance of the datasets, with the correlations between the time series of the adjusted datasets ranging from 0.97 to 0.98 (Table 2b). While the basic spatial patterns of this first EOF are very similar, there are some differences, particularly in the southeastern part of the domain. EOFA 1 of the original dataset explains 32.1% of the total variance of the original dataset, 10% less than that for the adjusted datasets. This feature may be explained in part by the extremely large discontinuity in the record for Athens in 1930 (see Fig. 3c), as there also appears to be a discontinuity in the time series of EOFA 1 for the original data around 1930. The correlations between EOFA 1 of the original and the adjusted datasets are still very high, ranging from 0.92 to 0.97. The spatial and temporal characteristics for EOFA 2 and 3 (not shown) are also qualitatively very similar for the three adjusted datasets, with correlations between the time series still greater than 0.97.

FIG. 4. (a) Spatial patterns of EOFA 1 (1896–1995), for (i) MCA, (ii) CMT, (iii) SNHT, and (iv) original data. Values have been multiplied by 1000 for clarity. (b) Time series of EOFA 1 coefficients, for (i) MCA, (ii) CMT, (iii) SNHT, and (iv) original data. Note: the contouring routine for the EOF spatial fields tends to smooth the data, so individual station anomalies are less apparent in the contour field; for this reason, actual station values are also printed on the maps.

In EOFA 4, differences in the three adjusted datasets start to become apparent. In particular, EOFA 4 (Fig. 5) for the CMT dataset, which accounts for 4.0% of the total variance of the dataset, appears to reflect mainly a discontinuity in the adjusted dataset around Jerusalem (see Fig. 3d). There is a high correlation (0.91) between the time series of EOFA 4 of the MCA dataset and EOFA 5 (Fig. 6) of the CMT dataset, suggesting that EOFA 4 of the CMT is due mostly to the discontinuity in the CMT adjusted series at Jerusalem. Both EOFA 4 and EOFA 5 of the SNHT dataset appear to reflect some adjustment problems in the southeastern part of the domain (Istanbul, Athens, Jerusalem, Cairo, Tbilisi), which may in turn be due to problems with the UKMO grid in this area (see Jones et al. 1999). EOFA 5 (6.2%) of the original dataset is dominated by the discontinuities in the Rome series (see Fig. 3b). EOFA 6 (2.4%) of the SNHT (Fig. 7) adjusted dataset also reflects the discontinuity at Rome, which is not wholly adjusted for by the SNHT method (Fig. 3b).

FIG. 5. As in Fig. 4 but for EOFA 4.

FIG. 6. As in Fig. 4 but for EOFA 5.

FIG. 7. As in Fig. 4 but for EOFA 6.

EOFB 1 for the 12 stations available from 1789 to 1995 is also presented (Fig. 8) for the datasets adjusted using the MCA method and the CMT, as well as for the original (unadjusted) dataset. In the CMT dataset (65.6% of variance explained in EOF 1), the earlier years (pre-1860) would appear to be generally too low over the entire dataset (see, e.g., Fig. 3a). The original data (45.6%) also represents values that are much lower in the early parts of the record.

FIG. 8. As in Fig. 4 but for EOFB 1.

Although the percentages of variance explained by the higher EOFs are relatively small, they still represent the amount of variance explained by the entire dataset (i.e., all 51 stations), so the fact that they can be attributed to inhomogeneities in individual station records, particularly adjusted records, demonstrates the potential influence of inhomogeneities on any data analysis. This also shows a potential of the higher-order EOFs to identify major inhomogeneities in data (Kutzbach 1967).

Some shortcomings of the various adjustment methods can also be seen in the results of these higher order EOFs. In particular, the CMT, while not requiring a reference series, does require a considerable number of surrounding stations in order to build a robust estimate of the adjustment factor, and problems can arise when there are few surrounding stations, as tends to be the case near the geographic boundaries and in the earliest years of the dataset. The SNHT performs well if the reference series is good, but obtaining or constructing a homogeneous reference series is not always obvious and limits the usefulness of the method. It is assumed that the SNHT did not perform well at Rome or Jerusalem because the discontinuities in the station were reflected, albeit less strongly, in the UKMO gridded data, leading to an underestimate of the adjustment factor. This demonstrates the difficulty involved in finding a sufficiently homogeneous and independent reference series for detecting and adjusting for breaks. The MCA method performed well, with no obvious trends or discontinuities in the time series of the EOFs and relatively smooth spatial fields, but assumed stationarity of the time series, an assumption that would not necessarily be valid for other climatic variables, or even to sea level pressure in the Tropics. To apply this technique to data outside extratropical areas, a much longer period for the calculation of the reference average would probably be necessary, in order to take into account the effects of the El Niño–Southern Oscillation phenomenon.
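The EOF diagnostics used in this subsection, namely a spatial pattern, its time coefficients, and the fraction of total variance explained, can be sketched for the leading EOF only. This is an illustrative simplification rather than the analysis performed in the paper: it uses power iteration on the station covariance matrix instead of a full eigen-decomposition, the function name `leading_eof` is hypothetical, and the input values are invented. Higher-order EOFs, which are the ones found here to flag inhomogeneities, would require deflation or a proper eigen-solver.

```python
import math


def leading_eof(data, n_iter=500):
    """Leading EOF of a (time x station) table via power iteration.

    Returns (pattern, time_coefficients, explained_fraction).
    Assumes at least one station has non-zero variance.
    """
    n_t, n_s = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n_t for j in range(n_s)]
    anom = [[row[j] - means[j] for j in range(n_s)] for row in data]
    cov = [[sum(a[i] * a[j] for a in anom) / (n_t - 1) for j in range(n_s)]
           for i in range(n_s)]

    def matvec(vec):
        return [sum(cov[i][j] * vec[j] for j in range(n_s)) for i in range(n_s)]

    # Non-uniform start vector, so that it is unlikely to be orthogonal
    # to the leading eigenvector (an assumption, not a guarantee).
    vec = [float(j + 1) for j in range(n_s)]
    for _ in range(n_iter):
        new = matvec(vec)
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]

    eigenvalue = sum(v * w for v, w in zip(vec, matvec(vec)))
    total_variance = sum(cov[i][i] for i in range(n_s))
    coefficients = [sum(a[j] * vec[j] for j in range(n_s)) for a in anom]
    return vec, coefficients, eigenvalue / total_variance


# Three hypothetical stations driven by one common signal.
table = [[s, 2.0 * s, -s] for s in (0.0, 1.0, 2.0, 3.0, 4.0)]
pattern, coeffs, fraction = leading_eof(table)
```

In this toy case every station is a multiple of the same signal, so the leading EOF accounts for essentially all of the variance. With real station data, a step change at a single station would appear as a localized spatial pattern with a jump in its time coefficients, which is the behavior described for Jerusalem and Rome above.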

5. Conclusions

Three different homogenization techniques have been used to homogenize 51 long series of surface pressure over Europe, and the results assessed using correlation and variance statistics, as well as EOFs. Analysis of the higher order EOFs was shown to be extremely useful in evaluating the quality of homogenized datasets and highlighting where and when problems exist with station homogeneity. This assessment shows that adjusted series still need to be carefully examined for poorly applied adjustments. The MCA method, a new technique introduced here, gave the best results, with no trends or discontinuities in the time series of the first six EOFs, which collectively explained 91% of the variance of the adjusted dataset. The CMT method performed well in most of the cases, when robust estimates of the surrounding climate could be made. However, problems appeared when there were few surrounding stations, either near the geographical boundaries of the data domain, or in the early years of a long record, when there were not many contemporaneous records available to form a reference. In these cases, the adjustment was poor, and this was reflected in the higher order EOFs, suggesting that a poorly adjusted station can affect the analysis of the entire dataset. The SNHT also performed well when a good reference series was available for comparison and adjustment, but was of limited use due to the necessity of having a homogeneous and independent reference series, which is difficult to obtain. Finally, pressure series in this geographical domain can be considered to be stationary on timescales of decades or longer, as the variance and temporal evolution of the adjusted time series were not noticeably affected by the MCA method.

Acknowledgments. Many thanks to Olivier Mestre, for the calculation of the CMT break points and use of Fig. 2b, for the use of his adjustment routines, and for many interesting discussions. Thanks also to David Lister, for help with the data processing, and to Ben Brabson, for many helpful discussions. Modern data for Milan were provided by Maurizio Maugeri and for Florence by Chritian Holtz. Vicky Slonosky was supported for this work by the Natural Sciences and Engineering Research Council of Canada, and also by les Fonds pour la formation des chercheurs et l’aide a la recherche du Quebec. The data collection was undertaken as a part of the European Commission project ADVICE, Annual to Decadal Variability in Climate in Europe (ENV4-CT95-0129).

REFERENCES

Alexandersson, H., 1986: A homogeneity test applied to precipitation data. J. Climatol., 6, 661–675.

Alexandersson, H., and A. Moberg, 1997: Homogenization of Swedish temperature data. Part I: A homogeneity test for linear trends. Int. J. Climatol., 17, 25–34.

Barring, L., P. Jonsson, C. Achberger, M. Ekstrom, and H. Alexandersson, 1999: The Lund pressure record of meteorological instrumental observations: Monthly pressure 1780–1997. Int. J. Climatol., in press.

Caussinus, H., and O. Mestre, 1996: Towards new tools and methodologies for relative homogeneity testing. First Seminar for Homogenization of Surface Climatological Data, Budapest, Hungary, Hungarian Meteorological Service, 62–82.

Caussinus, H., and F. Lyazrhi, 1997: Choosing a linear model with a random number of change-points and outliers. Ann. Inst. Statist. Math., 49, 761–775.

Conrad, V., and L. W. Pollak, 1962: Methods in Climatology. Harvard University Press, 459 pp.

de Tillo, A., 1890: Repartition Geographique de la Pression Atmospherique sur le Territoire de l’Empire de Russie, 1836–85. La Societe Imperiale Russe de Geographie, St. Petersburg, 281 pp.

Easterling, D. R., and T. C. Peterson, 1995: A new method for detecting undocumented discontinuities in climatological time series. Int. J. Climatol., 15, 369–377.

Exner, F., G. C. Simpson, G. Walker, H. H. Clayton, and R. G. Mossman, 1944: World weather records. Vol. 79. Smithsonian Miscellaneous Collections, Smithsonian Institute, Publ. 2913, Washington, DC, 1200 pp. [Available from National Meteorological Library, Meteorological Office, Bracknell, United Kingdom.]

Gorczynski, W., 1917: Pression Atmospherique en Pologne et en Europe. Jan Cotty, 265 pp.

Hann, J., 1887: Die Vertheilung des Luftdruckes uber Mittel und Sud-Europa. Eduard Holzel, 220 pp.

Heino, R., 1997: Metadata and their role in homogenization. First Seminar for Homogenization of Surface Climatological Data, Budapest, Hungary, Hungarian Meteorological Service, 5–8.

Jones, P. D., 1987: The early twentieth century Arctic high—Fact or fiction? Climate Dyn., 1, 63–75.

Jones, P. D., 1995: The instrumental data record: Its accuracy and use in attempts to identify the ‘‘CO2’’ signal. Analysis of Climate Variability, H. von Storch and A. Navarra, Eds., Springer, 53–75.

Jones, P. D., T. M. L. Wigley, and K. R. Briffa, 1987: Monthly mean pressure reconstructions for Europe (back to 1780) and North America (back to 1858). Tech. Note TR037, U.S. Dept. of Energy, Washington, DC, 99 pp. [Available from NTIS, U.S. Dept. of Commerce, Springfield, VA 22161.]

Jones, P. D., T. Jonsson, and D. Wheeler, 1997: Extension to the North Atlantic Oscillation using early instrumental pressure observations from Gibraltar and south-west Iceland. Int. J. Climatol., 17, 1433–1450.

Jones, P. D., and Coauthors, 1999: Monthly mean pressure reconstructions for Europe. Int. J. Climatol., 19, 347–364.

Karl, T. R., and C. N. Williams Jr., 1987: An approach to adjusting climatological time series for discontinuous inhomogeneities. J. Climate Appl. Meteor., 26, 1744–1763.

Knowles Middleton, W. E., 1964: The History of the Barometer. Johns Hopkins Press, 489 pp.

Kutzbach, J. E., 1967: Empirical eigenvectors of sea level pressure, surface temperature and precipitation complexes over North America. J. Appl. Meteor., 6, 791–802.

Lamb, H. H., and A. I. Johnson, 1959: Climatic variations and observed changes in the general circulation. Parts I and II. Geogr. Ann. Stockh., 41, 94–131.

Legrand, J.-P., and M. LeGoff, 1992: Les observations meteorologiques de Louis Morin. Monographie No. 6, Direction de la Meteorologie Nationale, Ministere de l’Equipement, de Logement et des Transports, 41 pp.

Peterson, T. C., and D. R. Easterling, 1994: Creation of a homogeneous composite climatological reference series. Int. J. Climatol., 14, 671–679.

Peterson, T. C., and Coauthors, 1998: Homogeneity adjustments of in situ atmospheric climate data: A review. Int. J. Climatol., 18, 1493–1517.

Simpson, G. C., R. G. Mossman, G. Walker, and F. L. Clayton, 1934: World weather records, 1921–1930. Vol. 90. Smithsonian Miscellaneous Collections, Smithsonian Institute, Publ. 3218, Washington, DC, 616 pp. [Available from National Meteorological Library, Meteorological Office, Bracknell, United Kingdom.]

Young, K. C., 1993: Detecting and removing inhomogeneities from long-term monthly sea level pressure time series. J. Climate, 6, 1205–1220.