856 VOLUME 4 JOURNAL OF HYDROMETEOROLOGY

© 2003 American Meteorological Society

Climatic Influence of Sea Surface Temperature: Evidence of Substantial Precipitation Correlation and Predictability

GREGORY R. MARKOWSKI AND GERALD R. NORTH

Department of Atmospheric Sciences, Texas A&M University, College Station, Texas

(Manuscript received 6 September 2001, in final form 3 February 2003)

ABSTRACT

Using a combination of statistical methods and monthly SST anomalies (SSTAs) from one or two ocean regions, relatively strong SSTA–precipitation relationships are found during much of the year in the United States: hindcast-bias-corrected correlation coefficients 0.2–0.4 and 0.3–0.6, on monthly and seasonal timescales, respectively. Improved rigor is central to these results: the most crucial procedure was a transform giving regression residuals meeting statistical validity requirements. Tests on 1994–99 out-of-sample data gave better results than expected: semiquantitative, mapped predictions, and quantitative, Heidke skills, are shown. Correlations are large enough to suggest that substantial skill can be obtained for one to several months' precipitation and climate forecasts using ocean circulation models, or statistical methods alone. Although this study was limited to the United States for simplicity, the methodology is intended as generally applicable. Previous work suggests that similar or better skills should be obtainable over much of earth's continental area. Ways likely to improve skills are noted.

Pacific SSTAs outside the Tropics showed substantial precipitation influence, but the main area of North Pacific variability, that along the subarctic front, did not. Instead, the east–west position of SSTAs appears important. The main variability is likely due to north–south changes in front position and will likely give PC analysis artifacts. SSTAs from some regions, the Gulf of Mexico in particular, gave very strong correlations over large U.S. areas. Tests indicated that they are likely caused by atmospheric forcing. Because they are unusually strong, they should be useful for testing coupled ocean–atmosphere GCMs. Investigation of differences between ENSO events noted by others showed that they are likely attributable to differing SSTA patterns.

1. Introduction

An important reason to improve regional climate prediction is agricultural production's sensitivity to climate; a second is the scientific merit of elucidating seasonal climatic influences. ENSO is a prominent example and its effects have been investigated for several decades (Berlage 1966; Bjerknes 1966; Ropelewski and Halpert 1987; and many others). Philander (1990) gives a good review of earlier literature. More quantitative studies have been done recently; for example, Ropelewski and Halpert (1996) describe regional precipitation distribution changes worldwide where they previously found influence. While earlier studies used mainly an atmospheric variable for prediction, often the Southern Oscillation index, SST was soon identified as the likely major source of seasonal-timescale effects (Philander 1990).

Detailed, quantitative, statistical studies of SST influence on continental climates became practical with the release of adequate long-term data, especially the Comprehensive Ocean–Atmosphere Data Set (COADS).

Corresponding author address: Dr. Gregory R. Markowski, Dept. of Atmospheric Sciences, Texas A&M University, MS 3150, College Station, TX 77843. E-mail: [email protected]

Barnston (1994) and Unger (1995) studied SST influence on U.S. precipitation, Unger with screening multiple regression. Barnston and Smith (1996; henceforth BS96) extended their analysis globally. These three studies' main goal was empirical forecasting: concurrent prediction (specification, in GCM terminology) was done largely as a benchmark. Montroy (1997, henceforth MT) was perhaps first to identify tropical Pacific SST influence on eastern U.S. precipitation. His study was not limited strictly to ENSO effects and analyzed SST patterns in greater detail. These more quantitative studies used linear statistical methods and, except Unger and MT, mainly canonical techniques. However, precipitation studies, even on seasonal timescales and using near-global SSTs, have achieved limited prediction skill, even at 0 lag, over land outside the Tropics.

This work emphasizes simultaneous (i.e., 0 lag) monthly and seasonal SST correlation to predict precipitation. Precipitation was chosen for several reasons. First, its prediction appears technically difficult due to noisy and highly non-Gaussian distributions. Second, improvements should be applicable to other climate variables. Third, precipitation critically affects food production. Our motivating heuristic is the high sensitivity that the local probability distributions of a chaotic system's variables (regional climate) are likely to show under only moderate, but persistent, changes in system forcing (Lorenz 1964): here, monthly SST and precipitation anomalies (SSTAs and PAs).

OCTOBER 2003 857 MARKOWSKI AND NORTH

TABLE 1. Ocean regions selected for EOF analysis.

Region description               Lat          Lon           EOFs
Tropical and northern Pacific    30°S–56°N    139°E–71°W    10
Gulf of Mexico                   19°–31°N     100°–80°W     4
Extended Gulf of Mexico          10°–38°N     98°–60°W      5
Tropical and northern Atlantic   25°S–66°N    72°W–12°E     10

Using detailed SST pattern analysis and several statistical steps, we find sizeable SST–precipitation correlations during much of the year: multiple, hindcast-bias-corrected, monthly and seasonal correlation coefficients (Rc, Rsc) of about 0.2–0.4 and 0.3–0.6, respectively, over much of the United States, especially winter and summer. The 1994–99 out-of-sample data show better than expected skill. Useful precipitation and climate forecasts from one to a few months appear feasible using an ocean circulation model, or even statistically. Preliminary statistical forecasts (not included; using lagged SSTs, Markowski and North 1999) support this judgment. We also find other results, for example, 1) unexpectedly strong and widespread Gulf of Mexico SSTA correlations; 2) missing influence from the central North Pacific's main variability, that about the subarctic front (east–west anomaly position appears important instead); and 3) ENSO event differences noted by others can be explained by differing SSTA patterns.

While this study was limited to the United States for simplicity, the methodology should be applicable to most regions and climate variables. Related work, for example, BS96, suggests similar skills should be obtainable over much of the global land surface.

2. Methodology

Monthly SST anomaly principal component (PC) time series and transformed precipitation anomalies (PAs) were correlated using multiple least-squares linear regression (MLR). (For clarity, empirical orthogonal function, EOF, will denote a PC's eigenvector, usually shown as a geographic SSTA pattern.) The PAs were derived from the monthly U.S. National Climatic Data Center (NCDC) State Climatic Division (CD) Precipitation Data Base (Guttman and Quayle 1996). Regressions were done for each CD using each of the season's months from each year from 1950 through 1993 (44 yr). Monthly precipitation was first normalized to unit variance (σ² = 1) to remove changes in distribution width, especially from the annual cycle. Monthly means were subtracted to obtain anomalies. (Note: numerous abbreviations and variables are used throughout this text. A list of them is provided in appendix A.)

a. SSTA data preparation

A 1° gridded January 1950 through 1993 COADS dataset (da Silva et al. 1994) was averaged over 3° to 5° squares, depending on ocean region size. Earlier data were not used due to poor Pacific coverage (Oort et al. 1987). Squares with 1/2 coverage were considered valid. Interpolation was done for less coverage if any opposing border squares were valid. Final time series had no missing values and were the PC analysis input.

Because semiglobal analysis would likely miss smaller important patterns, several separate regions were analyzed (Table 1). The PC calculation used all 528 months. Squares were not area-weighted, to retain SSTA pattern sensitivity in northern ocean regions, especially since North Pacific weather systems often propagate directly into the United States. The Gulf of Mexico was chosen mainly for its influence on inflowing U.S. moisture.
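The coverage-and-averaging rule just described can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `coarse_grain`, the `None`-for-missing convention, and the integer box-size argument are all our assumptions.

```python
def coarse_grain(grid, box):
    """Average a 2D monthly anomaly grid (None = missing) over
    box x box squares of 1-degree cells; per the rule in the text,
    a square is valid only if at least half of its cells have data.
    Returns a smaller 2D grid with None where coverage < 1/2."""
    nrows, ncols = len(grid), len(grid[0])
    out = []
    for i in range(0, nrows - box + 1, box):
        row = []
        for j in range(0, ncols - box + 1, box):
            vals = [grid[i + a][j + b]
                    for a in range(box) for b in range(box)
                    if grid[i + a][j + b] is not None]
            if len(vals) * 2 >= box * box:   # >= 1/2 coverage
                row.append(sum(vals) / len(vals))
            else:
                row.append(None)
        out.append(row)
    return out
```

A 2 × 2 square with three of four cells present averages those three; a square with only one cell present is marked missing. The paper's further interpolation step (filling a square when opposing border squares are valid) is not sketched here.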

b. Precipitation anomaly transformation

1) NECESSITY OF A PA TRANSFORMATION

Although precipitation data were averages over substantial time and space, distributions were far from Gaussian. For example, 262 of 344 October–December (OND) CD anomaly distributions, likely to be nonarid and well behaved nearly everywhere, failed our χ² normality test (see later) at p = 0.1. At p = 0.5, only 4% (13) passed, versus ≈50% if normal. Regressions removed only small to moderate variance fractions; thus, residual distributions differed little in shape. Since severely non-Gaussian residuals make even basic MLR statistics unreliable, if not meaningless, a transformation was critical. Since MLR minimizes squared deviations, the usual extremes from floods and droughts would further harm results.

2) TRANSFORMATION DEVELOPMENT AND DESCRIPTION

The precipitation distributions' (PDs) large shape variations over the United States, and often within seasons, complicated transform development. Resemblance to a shifted lognormal suggested a lognormal transform, but distributions were truncated at small values to varying degrees, some highly. Where seasonal dryness was common, 0 or ≈0 values often occurred. Figure 1 schematically illustrates this behavior.

To avoid the log's zero singularity, distributions' smaller values were adjusted. A PA distribution was first shifted so that its smallest value was 0 (largely reconstituting the normalized PD); then a constant was added to the now-0 value(s), forcing geometric symmetry about the median. The addition was tapered to 0 by the median, as the dashed line illustrates in Fig. 1. The result was log transformed and normalized to σ = 1. (See appendix B for detail.) The transformed modes and averages were near 0, and extreme event influence was largely eliminated. Residual normality and independence, the most critical MLR requirements, were carefully checked: normality with a six-bin χ² test, and residual behavior with time series and scatterplots (see appendix B).

FIG. 1. Schematic of typical dry season precipitation distribution (solid line) and distribution adjusted for a lognormal transformation (dashed line). Effect on frequency is not shown. (See section 2.)
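The transform's overall shape can be sketched as below. This only illustrates the steps named in the text (shift so the minimum is 0, pad the low tail with a constant tapered to zero by the median, log-transform, normalize to σ = 1); the linear taper and the `eps_frac` floor size are our assumptions, and the paper's exact tapering rule is in its appendix B.

```python
import math
import statistics

def lognormal_transform(pa, eps_frac=0.05):
    """Sketch of the PA transform: shift so the minimum is 0, add a
    small constant to values below the median (full size at 0,
    tapering linearly to 0 at the median; the taper shape is an
    assumption), log-transform, and rescale to unit std deviation."""
    lo = min(pa)
    shifted = [x - lo for x in pa]          # smallest value -> 0
    med = statistics.median(shifted)
    if med <= 0:
        raise ValueError("degenerate (all-dry) distribution")
    eps = eps_frac * med                    # hypothetical floor size
    adjusted = []
    for x in shifted:
        if x < med:                         # taper: eps at 0, none at median
            x = x + eps * (1.0 - x / med)
        adjusted.append(math.log(x))
    mu = statistics.mean(adjusted)
    sd = statistics.stdev(adjusted)
    return [(y - mu) / sd for y in adjusted]
```

The output is order-preserving, has mean 0 and standard deviation 1, and keeps zero-precipitation months finite, matching the qualitative properties the text requires of the transformed PAs (tPAs).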

Overall, precipitation distributions transformed into approximately normal ratio anomalies, Y(t), giving regressions that predicted (approximately) ratios to medians as y′d(t) = YdM exp[yd(t)σY], where YdM is CD d's seasonal median (normalized) precipitation, σY its tPA's unnormalized σ, and yd(t) the predicted tPA for month t.

c. Multiple linear regression and hindcast bias corrections

Equation (1) shows our usual MLR prediction:

Σi adi xi(t) + Σj adj xj(t) + ado = yd(t) + εd(t),    (1)

where x(t) is a month's PC; the two sums and indices, i, j, indicate that predictors from two ocean regions were sometimes used; adi are the MLR coefficients and constant; and εd(t) is the residual, or error. Months, typically three, were selected serially from each year, a "season," so that the number, N, in an MLR was usually 132. Predictors were usually 10 to 14 leading PCs from one or two ocean regions. MLR statistics were linearly interpolated to a grid using Delaunay triangulation on representative CD centers (from NCDC) and contoured. Significance was log interpolated.

The MLR correlation coefficient's, R's, positive, that is, hindcast, bias was removed using the standard adjustment (Neter et al. 1996; Wherry 1931) to obtain the unbiased population (also termed ensemble) estimate, Rc:

Rc² = 1 − (sy·x/sy)²(N − 1)/[N − (ν + 1)],    (2)

where d is dropped for simplicity, ν is the independent-variable number, and sy·x and sy are the MLR residuals' and dependent variable's (i.e., the tPAs') standard deviations, respectively. (Here, sy = 1 from tPA normalization.) Overall MLR significance was obtained with the F test, which directly accounts for finite sample (i.e., hindcast) bias.
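Since sy = 1 here, (sy·x/sy)² equals 1 − R², and Eq. (2) reduces to the familiar Wherry adjustment. A minimal sketch (the function name is ours, not the authors'):

```python
def wherry_rc2(r2, n, nu):
    """Hindcast-bias-corrected (population) squared correlation,
    Eq. (2): Rc^2 = 1 - (1 - R^2)(N - 1)/[N - (nu + 1)],
    using (s_yx/s_y)^2 = 1 - R^2; nu = number of predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - (nu + 1))
```

For example, with N = 132 months and ν = 14 predictors, an in-sample R² of 0.16 corrects to about 0.06, illustrating how much of a modest raw fit can be hindcast bias; the corrected value can go negative, which is read as no skill.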

The regression unbiased standard error estimate, sy·x, ≈ the rms prediction error (e.g., Neter et al. 1996, chapter 7), and Rc, the hindcast-corrected monthly or seasonal correlation coefficient, are related by

(sy·x/sY)² = 1 − Rc²,    (3)

where sY is Y's unbiased σ. The expected and average anomaly, and the probability that Y differs in sign from predicted, are estimated from these quantities. The value Rc is operationally the most important statistic: it determines nonclimatology prediction frequency and overall SSTA sensitivity [since (1) can be recast as RX(t) = y(t), where X(t) = the left side of (1) and X and y are normalized to σ = 1].

Seasonal average correlation coefficients (Rs) were obtained by averaging each season's predicted and actual tPAs, Y(t), and correlating these 44 paired averages. The value Rs tends to increase since averaging reduces noise. However, its hindcast bias increases for similar reasons; see appendix C. The 44-point correlation also adds bias; it was removed by Eq. (2) with N = 44, ν = 1.
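This seasonal averaging and the N = 44, ν = 1 correction can be sketched as follows, assuming the monthly series is arranged so each consecutive block of months is one year's season; `seasonal_r` and the hand-rolled correlation helper are our own, not the authors' code.

```python
def _pearson(u, v):
    """Plain Pearson correlation (kept dependency-free)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def seasonal_r(pred, actual, months_per_season=3):
    """Average predicted and actual tPAs within each season, correlate
    the paired seasonal means (44 pairs in the paper), then remove the
    correlation's own hindcast bias via Eq. (2) with nu = 1."""
    m = months_per_season
    pbar = [sum(pred[i:i + m]) / m for i in range(0, len(pred), m)]
    abar = [sum(actual[i:i + m]) / m for i in range(0, len(actual), m)]
    rs = _pearson(pbar, abar)
    n = len(pbar)
    rc2 = 1.0 - (1.0 - rs * rs) * (n - 1) / (n - 2)   # Eq. (2), nu = 1
    return rs, rc2
```

Note this only removes the 44-point correlation's bias; the larger seasonal-averaging bias is handled separately by Eq. (4) below.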

Equation (4) gives a conservative, that is, usually high, estimate of the seasonal average hindcast bias:

bs = b1as/[(a/g)(1 − R²) + R²],    (4)

where as is the seasonal correlation's slope, a and g are variance ratios of the monthly and averaged residuals and predicted tPAs, respectively, and b1 is R²'s hindcast bias. The value Rsc², Rs² unbiased, is Rs² − bs. (Appendix C has a brief derivation.) These monthly and seasonal hindcast corrections, and cross-validation's (XV) hindcast-bias correction and out-of-sample (OOS) skill estimation, were checked by Monte Carlo simulation. It showed adding as in (4) substantially reduced false positives where p > 0.1.
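A Monte Carlo check in the same spirit can be sketched for the simplest, single-predictor (ν = 1) null case, where the Eq. (2) correction should drive the average corrected skill to zero. This toy version, with hypothetical helper names, is ours, not the authors' simulation.

```python
import random

def _pearson(u, v):
    """Plain Pearson correlation (kept dependency-free)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def monte_carlo_bias(n=132, trials=400, seed=1):
    """Regress pure noise on a pure-noise predictor: under the null,
    raw (hindcast) R^2 averages about 1/(n - 1), while the Eq. (2)
    corrected value with nu = 1 should average near zero."""
    rng = random.Random(seed)
    raw, corrected = [], []
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        r2 = _pearson(x, y) ** 2
        raw.append(r2)
        corrected.append(1.0 - (1.0 - r2) * (n - 1) / (n - 2))
    return sum(raw) / trials, sum(corrected) / trials
```

With N = 132 the raw null R² averages near 1/131 ≈ 0.008, while the corrected value averages near 0, which is the behavior the bias corrections are designed to enforce.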

d. Trade-offs: Ease of use versus skill optimization and interpretation

Our approach trades off simplicity, ease of use, and interpretation versus skill optimization, typically done by removing nonsignificant predictors. Properly removing them is not simple (see section 2h), while keeping them penalizes MLR significance and skill since (a) variance explained by chance grows; (b) hindcast-bias correction depends directly on predictor number; and (c) seasonal bias correction (4) becomes more conservative. Advantages are that different seasons, ocean regions, and their combinations can be easily examined since biases are readily corrected. We emphasize that the above corrections remove R and Rs's hindcast biases. Results should be conservative since removing poor predictors should appreciably improve skill, and other improvements described below seem likely.

e. Out-of-sample, forecast, hindcast, population, and artificial skill confusion

The population skill Rc is not equivalent to OOS skill (Ros), often termed forecast skill. The value Ros has typically been "estimated" by XV after selecting in-sample variables by their skill. (See BS96 and references therein.) Unlike population (ensemble) skill, Ros is a function, sometimes strong, of Nos, the OOS prediction number. The hindcast and OOS skill difference, R² − Ros² (Michaelsen 1987; henceforth MI), and the population and hindcast skill difference, R² − Rc² (Chelton 1983), have both been called artificial skill, but they are clearly distinct biases (Davis 1977). When Nos = N, the in-sample observation number, then R² − Ros² = 2ΔR², with ΔR² = R² − Rc². However, our transform–MLR approach does not need XV to estimate Ros: Davis shows that Ros² is Rc² − b1, since if Nos ≈ N, b1 ≈ ΔR² when residual statistical requirements are met.

f. Multiple PC regression advantages and disadvantages

1) ADVANTAGES

Multiple PC regression has important advantages compared to often-used canonical correlation methods, such as SVD. Advantages 1 and 2 obtain by combining PC analysis with MLR.

1) MLR removes the question of EOF accuracy (e.g., North et al. 1982) since PC analysis is used only to find reasonably strong and persistent anomaly patterns, as per our heuristic motivation (see introduction). Eigenvalue and EOF pattern accuracy do not have special relevance. Instead, the important quantities are an EOF's influence and statistical significance, which MLR gives directly as adi's size and significance (1) (see also Wang 2001), and R, which is little affected by EOF mixing.

2) Quantitative statistics, such as R, Rs, sy·x, and PA sensitivity to SSTs are also directly obtained, unlike other pattern correlation methods, such as canonical and combined field techniques (Bretherton et al. 1992, henceforth BSW).

3) Target field prediction using only PC 1 performed notably better than canonical correlation analysis (CCA) in most tests by BSW, and nearly as well as other combined field techniques. MLR avoids the largest error BSW found: PC 1 missing the signal, which was in BSW's PC 2.

4) All SST "signal" is available since CD data were not gridded or filtered. Since CDs reflect climate differences, such as a mountainous region bordering a plain, adjacent CDs can behave differently. Interpolation then causes signal loss.

5) MLR is a well-characterized and sensitive technique when its statistical requirements are met.

6) Hindcast bias (often misleadingly termed artificial skill) is modest, well understood, and can be accurately removed (Wherry 1931). Its fundamental cause is finite sample size and, thus, is common to nearly all predictive statistical methods.

7) Predictions, their confidence limits, and PA SST sensitivity can be obtained with textbook methods (e.g., Neter et al. 1996, chapter 6).

8) Optimal predictors can be used when predictions are wanted under all conditions (Davis 1977).

2) LIMITATIONS AND DIFFICULTIES; CHOICE OF EOFS

The main limitation is that predictors must be chosen a priori to avoid adding hard-to-quantify artificial skill (also Nicholls 2001): EOF significance and OOS skill must be traded off against possible exclusion of important predictors (MI). Davis and MI recommend the a priori choice, used here, if a reasonable model is not available. Little difficulty arises if a few PCs capture nearly all variance. Also, eigenvalue size can be misleading. Small ocean regions or SSTAs can disproportionately influence weather patterns but contribute only small variance, for example, the western tropical Pacific.

Figures 2 and 3 show the Pacific and Gulf EOFs used. The first 8 Pacific EOFs were judged likely important by their centers-of-action and variance fraction; 9 and 10 were included for comparison: they appeared unlikely to be important. EOFs > 12 appeared too complex. The first four Gulf EOFs included nearly all Gulf variance. Most results here use these Pacific and Gulf EOFs, or only the Pacific, to avoid false skill from variable selection. Since our methods can easily examine many cases, including predictors based on a posteriori performance is a danger: these EOFs are our a priori, naïve, choice.

g. EOF field significance, pf

EOF significance has scientific and, thus, statistical importance. A current question is which, if any, ocean regions influence climate—except for ENSO, SST is usually assumed slaved to the atmosphere (Neelin and Weng 1999).

The highly spatially correlated tPAs (adjacent CDs ≈0.9) and many CDs make likely sizeable chance patterns with high local (single CD) significance, p, that mimic real influence and require consideration of the entire CD p field (Livezey and Chen 1983, henceforth LC). More sensitive field significance, pf, tests typically use Monte Carlo methods (LC). For qualitative guidance, about 200 EOF significance maps were made substituting random and varying-persistence (lag 1 autocorrelation, r1) first-order autoregressive (AR1) time series for PCs.

However, accurately estimating pf is not straightforward: 1) large chance low-p areas are likely; 2) large acausal correlation areas seem present, many probably from atmospheric forcing: AR1 r1 ≈ 0.5 time series substituted for PCs gave unusually many large areas with p ≪ 0.1; 3) seasonal testing is required: spatial correlation depends notably on season; 4) the large implicit number of trials in a single map, 344, and 10 to 14 maps per season make CDs with chance p ≪ 10⁻³ likely; 5) relatively small areas of substantial influence seem common; thus, the usual, entire-region test (LC) will likely miss significant EOFs. Nonuniform spatial PA correlation increases this difficulty. Items 1 through 4 together require a great many Monte Carlo trials for accurate confidence limits. Item 5 suggests using a regional field test, but properly defining regions does not appear straightforward. Using only statistically significant predictors should add appreciable skill.

FIG. 2. The first 10 Pacific EOFs and their variance fraction (Var). (a)–(d) The most influential and persistent and (e)–(j) usually less influential and less persistent, see text. Loadings have been scaled for presentation by ≈100 after unit length normalization. Dot–dashed lines indicate negative PC contributions. Zero contours are dotted.

FIG. 3. As in Fig. 2 except the first four Gulf of Mexico EOFs and normalization ≈50.

To avoid these difficulties, simple but conservative Bonferroni tests (e.g., Neter et al. 1996; LC) were adopted, adequate to give some interesting results: only local significance is considered, joint independence is assumed, and sufficiently small p is required so that chance occurrence on one map in a set is ≤pf. We chose pf = 0.1 to reduce rejecting EOFs with real influence. Ten maps per set and (conservatively) assuming 300 independent points per map require p ≤ 1.5 × 10⁻⁵ for one, and ≤3.3 × 10⁻⁴ each for two, independent points on one map. (Independence, defined r ≤ 0.20, required CDs ≈1000 km apart.) Ocean regions were separately assessed since influence is expected. The ps for MLR significance, only one map, are less stringent: 1.5 × 10⁻⁴ and 1.3 × 10⁻³.

h. Out-of-sample testing

Although MLR predictions should be reliable since care was given to statistical rigor, out-of-sample testing gives an independent and overall evaluation. Six years following the in-sample data were used, 1994–99 [National Oceanic and Atmospheric Administration–Cooperative Institute for Research in Environmental Sciences (NOAA–CIRES) Climate Diagnostic Center 2000; Reynolds and Smith 1994]. Mapped and quantitative, Heidke skills were examined. For simplicity, predictions were made using an optimal predictor (Davis 1977).

1) OPTIMAL PREDICTOR CONSTRUCTION INCLUDING p AND pf

Davis derives optimal prediction weights for the aid in (1) when an a priori model is unavailable. Following Davis, the weights, wid, used here are

wid = tid²/(tid² + 2λ),    (5)

where tid is aid's Student's t statistic, and λ determines wid's p dependence (via t). Usually 1 < λ < 2. To use PC field and MLR significance advantageously, wid was reduced to 0 by additional factors when p > 0.1; see appendix D.
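Equation (5) itself is a one-liner; a minimal sketch (the function name is ours), with λ = 1.5 as an arbitrary value in the stated 1 < λ < 2 range:

```python
def davis_weight(t, lam=1.5):
    """Optimal-predictor weight of Eq. (5): w = t^2/(t^2 + 2*lambda),
    where t is a coefficient's Student's t statistic and lambda sets
    how sharply the weight falls as significance declines."""
    return t * t / (t * t + 2.0 * lam)
```

A coefficient with |t| = 4 keeps weight 16/19 ≈ 0.84, while one with t near 0 is almost fully suppressed; the extra p-dependent factors described in the paper's appendix D are not sketched here.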

2) QUANTITATIVE SKILL ASSESSMENT—HEIDKE SKILL

Skill was quantitatively assessed using Heidke skill, SH; Barnston et al. (1999, their appendix) describe SH in detail. Briefly, predictands are divided evenly into terciles: low, "normal," and high; SH is given by

SH = 100(Nc − Nec)/(Np − Nec).    (6)

Here, Nc and Np are the number of correct and total predictions (or guesses), respectively. The value Nec is the expected number correct by chance, Np/3. Here, SH = −50, 0, 25, and 100 for 0%, exactly chance (Nc = Nec), 50%, and 100% correct, respectively. Predicting only when accuracy is expected improves SH; otherwise, poor guesses dilute real skill.
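Equation (6) can be checked directly; a minimal sketch (function name ours):

```python
def heidke_skill(n_correct, n_pred):
    """Heidke skill for tercile predictions, Eq. (6):
    S_H = 100 (Nc - Nec)/(Np - Nec), with Nec = Np/3 the number
    of hits expected by chance."""
    n_ec = n_pred / 3.0
    return 100.0 * (n_correct - n_ec) / (n_pred - n_ec)
```

With Np = 90 tercile predictions, Nc = 0, 30, 45, and 90 give SH = −50, 0, 25, and 100, matching the values quoted above.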

3. Results

We show three main types of results: single EOF significance and influence, MLR skill for a set of EOFs, and OOS skills. Results are mainly paired same-month (0 lag) PC and tPA correlations; "forecast" will be used to denote future prediction. (The terms correlation and influence are usually used interchangeably, with real influence the default assumption. Cause and effect are distinguished in some cases.) Different season length results are shown since statistical significance often increases with observation number, N. However, EOFs' influence location and strength often vary, and they may be missed unless constant in location or very strong during part of a long season.

a. Noteworthy EOFs: Large influence, acausality, and Pacific 2's missing influence

We find several clearly significant Pacific and Gulf EOFs, and one Pacific EOF whose influence is unexpectedly absent. Three EOFs stand out on a full annual basis and in most seasons: Pacific 1, and Gulf 1 and 3; they are discussed first.

1) PACIFIC 1: THE MAIN ENSO EOF

Figure 4 shows Pacific 1 significance for 3-month seasons over an annual cycle. The large p < 10⁻³ area positions, and some <1.5 × 10⁻⁵ during January–March (JFM), identify Pacific 1 as the main ENSO EOF. Seasonal behavior is consistent with previous ENSO studies (e.g., Ropelewski and Halpert 1987; MT); and considerably more detail, some substantially larger influence areas, and more seasonal changes appear than previously identified. Three JFM Florida CD ps are <10⁻⁷ and the southernmost Texas (TX) CD is field significant, p < 1.5 × 10⁻⁵. (Henceforth, "field significant" will mean significant by our local Bonferroni tests and Gaussian MLR residuals.) Opposing responses occur; the JFM dipole (Fig. 4a) is especially strong. It also corresponds well with MT's JFM result: his rotated EOF 1 is very close to our Pacific 1. Like MT, we find the dipole fades rapidly on either side of JFM. Identifiable influence is least during July–September (JAS), largely due to shifting influence position. On an annual basis, EOF 1 is field significant over much of the desert southwest: two New Mexico ps are ≤10⁻⁷ and western TX has p < 1.5 × 10⁻⁵. (TX residuals fail our χ² test.) EOF 1's influence moves northward from late spring to summer [Fig. 12 shows June–August (JJA)] and apparently has influence all year, although May–July and August–October (ASO) are not field significant. These seem likely significant based on random time series pattern areas (section 2h).

FIG. 4. Local significance maps of the main ENSO EOF, Pacific 1, for (a)–(f) six 3-month seasons covering the year. Regions enclosed with a thick solid (thin) p = 0.1 contour (and thick dot–dashed inner contours) are wet (dry) during a warm (cold) event. Contour intervals are a factor of 10, except a thin dot–dashed line marks p = 0.03. The p = 1 × 10⁻⁴ contour is labeled 0 and is thickest; the innermost contour is 1.5 × 10⁻⁵. Dots are approximate Climatic Division centers. Note the large areas of significance and the (a) Jan–Mar strong transient dipole. The first 10 Pacific PCs were used as lag 0 predictors—Gulf PCs were not used, see text.

FIG. 5. (a) Gulf EOF 1 and (b) EOF 3 for Mar–Jun and full year, respectively. Otherwise as in Fig. 4 except the four Gulf PCs are included as regression predictors and different line styles only indicate opposite responses.

2) ACAUSAL GULF EOFS

Gulf 1 March–June (Fig. 5a) has p's < 10^-5 and influence throughout the year (inspecting 3-month season maps as earlier). Influence is fairly constant from late fall through spring and weakest in summer and early fall. Gulf 3 (Fig. 5b, annual basis) is not as strong, but our two-point test shows field significance (Idaho and South Carolina CDs). Gulf 3's influence moves considerably and half of its southern dipole (Fig. 5b) is weak each season. (Gulf 2 and 4 often show large areas with low p.)

But contrary to expectations, Gulf influence is usually not near the Gulf and is often distant, for example, the Northwest (Fig. 5a). The Gulf's small area does not seem capable of causing such strong and distant effects. Several explanations seem reasonable:

1) The Gulf anomalies are forced by atmospheric systems causing the precipitation (Frankignoul 1985; Neelin and Weng 1999).

2) Gulf SSTAs are part of a larger pattern that includes the storm formation region (SFR) off Cape Hatteras: this pattern forces the North Atlantic Oscillation (NAO)—the link between cause and effect. Sutton and Allen (1997, henceforth SA) found wintertime SFR SSTAs and NAO strongly correlated.

3) SSTAs not considered force the correlations, for example, a region not analyzed, such as the Indian Ocean, or higher order Pacific PCs. The first 10 only have ≈1/2 of all variance, but higher order PCs (>10) typically have short persistence, ≤0.5, like the Gulf (≈0.5).

4) Nonlinear SST interactions force the Gulf. Rapid variation could occur even though the Pacific PCs vary slowly: most of the first 10 had persistence ≥0.65, 1–3 usually ≥0.8.

Explanations (1) and (2) can be tested. Explanation (1) predicts Gulf PCs to be nearly AR1 time series (red noise): they are. [Water's large heat capacity reddens white-noise atmospheric forcing (Frankignoul 1985).] Precipitation should then mainly correlate with the PC's random forcing (noise) term, ε_t, of its AR1-like time series, that is, x_t = r_1·x_{t−1} + ε_t. The value ε_t is the residual from a month's weather forcing, r_1 is x_t's lag 1 autocorrelation (persistence); and ε_t = x_t − r_1·x_{t−1}. If SST is causal, ε_t should show much weaker correlation than x_t, and x_{t−1} should show 1-month forecast skill where its 0 lag correlations are strong. Here, ε_t's stronger 0 lag correlations than x_t

and xt21’s negligible forecast skill indicate atmosphericforcing. Gulf EOF 1’s loading (Fig. 3a) also suggests at-mospheric forcing: loading decreases monotonically fromthe coast and explains 66% of Gulf variance. (Note: at-mospheric forcing can create causal SSTAs as ‘‘causal’’is used here; they at least need long enough persistence,*0.6, to become influential; their correlation must not bedue to forcing, that is, «t.)

Explanation (2) requires that the strongest Gulf EOF patterns continue into the SFR. The strongest "extended Gulf" EOF (EG; Table 1 and Fig. 6a) has this character, 34% of EG's variance, and similar p, ≤10^-6, at similar locations. However, the statistical tests above indicate


FIG. 6. As in Fig. 3 but for the first two extended Gulf EOFs; see Table 1.

FIG. 7. As in Fig. 4 except Pacific EOF 5 during its season of maximum influence. Wet line style corresponds to EOF 5's right pole warm (Fig. 2d).

acausality. The loading pattern of EG 1 is also like Gulf 1's with respect to the coast. Thus, EG 1 appears related to Gulf 1 but acausal vis-à-vis the NAO, though the opposite may be true.

None of the next four EG EOFs matches Gulf 2 or 3 well. While resembling Gulf 3 (Fig. 3c), EG 2 (Fig. 6b) shows considerable loading pattern difference. Its PC is much more persistent, r_1 ≈ 0.8, has good 1-month forecast skill, and often strongly correlates with the ENSO EOF, Pacific 1, r ≈ 0.6, with distinctly overlapping influence. Extended Gulf 2 seems associated with Pacific 1, but not Gulf 3.

Because the previous results fit very well the behavior expected from atmospheric forcing, it likely causes most or almost all of the Gulf's correlations. Caveats are that explanations (3) and (4) cannot be ruled out, and Gulf 3 or EG 1 may have some causal influence not explained by the Pacific: the large monthly variability here could swamp the winter-average variability SA analyzed. Nevertheless, very strong correlations exist between the Gulf, its surrounding ocean regions, and U.S. precipitation. Reproducing these should be an important test for coupled GCMs.

3) PACIFIC EOF 2'S MISSING INFLUENCE AND EOF 5

Pacific EOF 2 (Fig. 2b) has the second-largest variance fraction, marks the Kuroshio Extension and subarctic front, coincides with the large SST variability across the northern Pacific, and is at least as coherent as EOF 1, but during fall and winter, significance areas appear little more than random. Several researchers conclude that EOF 2 should affect U.S. winter weather (Latif and Barnett 1994, henceforth LB; Chen et al. 1996; Nakamura et al. 1997), but none is apparent here. Contrarily, EOF 2 shows influence March through August. (Fig. 12d shows JJA.) Although usually not close to field significance by our conservative tests, areas are large enough to suggest a positive result by more sensitive (Monte Carlo) tests.

EOF 5 (Fig. 2d), straddling EOF 2, shows major influence instead. EOF 5 has a large influence area with several p's ≈ 7 × 10^-4 on an annual basis. Figure 7 shows its strongest season, October–December; field significance is clear: several CDs within the 10^-4 (0) contour have 10^-6 < p < 10^-5. EOF 5 also shows large influence areas most seasons: March–June, north-central and northeast U.S.; ASO, most of the U.S. middle third; after December, mainly on the West Coast. March–June and October–February maps are field significant. EOF 5 appears to strongly modulate ENSO influence much of the year; compare Figs. 7 and 4. Gershunov and Barnett (1998, henceforth GB98) report modulation during JFM. We see moderate JFM modulation, but EOF 1 and 5 JFM correlation is (unusually) high, −0.45, so they are not independent.

EOF 2 and 5’s differing influence is likely partly dueto EOF physical and statistical stationarity limitations:EOFs cannot explicitly show motion. Much of EOF 2’svariance is likely a motion artifact, such as subarcticfront waves (Gill 1982, 493–547; Nakamura et al. 1997;Yuan et al. 1999), or, especially, north–south front mo-tion. Small position shifts generate large variance from


FIG. 8. (a)–(d) Overall regression local significance for four 3-month seasons. Contours, dots, and predictors (lag 0) as in Fig. 4, except the 0.03 contour is omitted and opposing responses do not apply. Note high significance levels and field significance (see text) for most seasons.

the front’s large temperature gradient which PC analysissingles out. EOF 5’s shape also indicates motion: whensubstantial, a dipole EOF will typically straddle an EOFcentered in the motion (Kim and Wu 1999); front ro-tation also generates dipole variance. Position changeswill convolve regional SSTs; for example, if the ArcticOcean is anomalously warm but the front is south, PC2 is likely to be negative, not positive, since position isso influential. Standard analysis will likely be problem-atic; separating variability is recommended.

EOF 5’s large influence may be dynamically expected:its dipole spacing (508–608) is near typical polar jet wave-numbers 3 and 4. By coupling through cyclones, espe-cially the Aleutian low, the dipole should affect the low’sstrength, position (LB; Chen et al. 1996; Nakamura etal. 1997) and shape and, thus, the short waves formingfrom it and propagating across the United States. Con-sistent with polar jet behavior, EOF 5’s summer influenceweakens and moves to the northern United States.

Pacific EOF 5 is also unusual with respect to causality: its correlations appear to be partly causal and about 2/3 to 1/2 from atmospheric forcing. Region separation is also seen, especially late spring and early

summer: one region tends to be mainly causal, another mainly acausal. Causality seems absent from about July to October, although correlation areas are large. Note that when some causality exists, atmospheric forcing will develop some real influence.

In summary, EOF 2 mainly appears to reflect the subarctic front's position, with actual influence likely during late spring and summer.

b. Seasonal averaging and seasonal average correlations

Figures 8 and 9 show monthly regression significance and hindcast-bias-corrected seasonal average skill, R_sc (section 2c), for four seasons, with predictors the first 10 Pacific PCs. (Figure 12 shows JJA.) Including the Gulf PCs would add much skill, but they were not used due to their apparent acausality. As expected, seasonal skill closely follows monthly significance and usually is >R_c (R_c not shown). Some chance positives and negatives are expected. (p = 0.1 for 3-month seasons corresponds to R_sc ≈ 0.3.) Seasonal skills are likely to be chance if not near a locally significant area.


FIG. 9. Season-averaged MLR bias-corrected 0 lag prediction skill, R_sc, corresponding to the significance results and seasons in Figs. 8a–d. Contours begin at 0.3; intervals are 0.1. The 0.3 and 0.6 contours are enhanced for clarity. Note the good predictability during winter months.

Large low-p areas are Fig. 8's outstanding features. Except the weakest, JAS, all seasons are field significant and most have high skills. The later fall and winter months show much skill area ≥0.5 with some CDs > 0.6; JFM shows skills ≥0.7 and substantial area with R_sc > 0.6. Although JAS's p < 0.1 area is within chance, it has good OOS Heidke skill, 16. JAS's EOF 1 and 2 likely have real influence since their patterns are similar to adjacent seasons where they are stronger, especially EOF 1.

Winter skill is in large part due to ENSO-related EOFs. Pacific 1 is typically dominant, except on most of the West Coast. Other EOFs have greater influence here, especially 3, 4, and 8. EOF 3 makes major contributions from October to March and appears related to an intermediate North Pacific Oscillation (NPO) state (LB; GB98). In many seasons, these EOFs also often strongly modulate the main ENSO response. Except JFM, coastal southern California shows little skill.

c. Out-of-sample test results

Out-of-sample predictions were made for 6 yr, 1994–99, a period expected to be challenging since it includes

weak ENSO conditions and a rapid switch from strong El Niño to strong La Niña states (1998–99). Semiquantitative mapped and quantitative numerical skills follow.

1) OOS MAPPED RESULTS

FIG. 10. Jan–Mar actual and predicted season average precipitation for (a),(c) 1998 and (b),(d) 1999. Averaged monthly transformed anomalies are shown. These are approximately log_e of the ratio of season average to the actual median precipitation of the season's months (monthly data normalized to σ = 1; see text). Predicted (actual) contours start at 0.2 (0.4), with sequence 0.2, 0.4, 0.7, 1.0, 1.2, 1.4, . . . Corresponding ratios to observed medians (reciprocals when negative) are ≈1.1, 1.3, 1.5, 1.8, 2.1, 2.3, . . . Positive (negative) anomalies are indicated by thick (thin) solid outer contours and thin solid (thick dot–dashed) inner contours. Darkest shading starts at ±1.4. Dots are approximate U.S. State Climatic Division centers.

Figures 10 and 11 compare predicted, Y_est, and actual, Y_s, transformed anomaly season averages for the best predicted colder and warmer seasons, JFM and JJA. Transformed anomalies facilitate comparing the wide precipitation range across regions. The maps show approximately the season average log of the ratio of the monthly to the season's median monthly precipitation (monthly logs normalized to σ = 1: tPA normalization; section 2b). Normalized by seasonal average σ, values would be typically 1.4–2 times larger, depending on CD. An important caveat is that longer-term skill may be greatly underestimated due to the small season number: on an R² basis, little or no skill is expected here (Davis 1977). Nevertheless, results are consistent with and confirm the in-sample skills, Fig. 9, since they appear better than expected statistically. Since most seasons' predictions include a large area, they should intuitively represent at least two to four independent predictions per year.
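The transformed-anomaly (tPA) convention described above can be sketched as follows; the function names and details such as truncation handling are illustrative assumptions, not the paper's code.

```python
import numpy as np

def tpa(monthly_precip):
    """Sketch of the transformed precipitation anomaly: log of the ratio
    to the month's median, with the monthly log anomalies normalized to
    sigma = 1 (see section 2b of the paper for the full procedure)."""
    logs = np.log(np.asarray(monthly_precip, float) /
                  np.median(monthly_precip))
    sigma = logs.std(ddof=1)
    return logs / sigma, sigma

def to_ratio(anomaly, sigma):
    """Invert a normalized anomaly to a ratio over the monthly median
    (take the reciprocal when the anomaly is negative)."""
    return np.exp(anomaly * sigma)

# The text's later example: an anomaly of 1.4 with sigma_Y = 0.62
# (Florida CD 4's value) implies e^(1.4 * 0.62), about 2.4 times the
# normal monthly median.
print(round(to_ratio(1.4, 0.62), 1))  # -> 2.4
```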

(i) October–December and April–June seasons

April–June (AMJ) and OND maps were inspected for all OOS years. These seasons' accuracy varied much more than JFM and JJA. The 1999 OND prediction is strikingly accurate, 1996 strikingly poor; the 1997 prediction is poor, while 1994 and 1995 appear generally skillful. Overall OND Heidke skill is good. The 1994–96 AMJ predictions look fairly skillful, and 1997 quite good; but 1998 is poor and 1999 bad. The 1999 result was largely from the main ENSO PC (EOF 1) showing a strong La Niña while precipitation behaved like a strong El Niño. While 1998 was the reverse of 1999 with respect to PC 1 and precipitation, the difference was not as large: although PC 1 was still indicating an El Niño, it was rapidly changing to La Niña values. The rapid change likely accounts for some of the error.

The OND and AMJ accuracies may be highly variable because jet stream and ITCZ positions change rapidly

during these seasons: our PCs may poorly capture position and stability differences, and other ocean regions may be more influential. Regressions for these seasons' first and last 2 months support this reasoning: they usually show the main influence areas shifting markedly, but do little to explain the 1999 AMJ errors.

(ii) Out-of-sample predictions for the January–March season

FIG. 11. As in Fig. 10 but for the 1996 and 1997 Jun–Aug seasons.

Figure 10 shows actual and OOS JFM predictions for the strong 1998 El Niño and 1999 La Niña, and they are about the best obtained. Figure 10a is a fairly classic JFM pattern. The 1998 actual anomalies are large since the values shown are (about) log_e(Y_s/Y_M), Y_M the season's Y(t) median. (This figure preserves the transform inverse, section 2f. Hereafter, "normal" or "normal median" will mean Y_M.) The several 1.4s (Fig. 10a) indicate seasonal anomalies about e^{1.4(0.62)}, or 2.4 times the normal monthly median; 0.62 is Florida's CD 4 σ_Y and typical. Large anomalies are likewise predicted: Florida's CD 4 Y_est and Y_s are nearly identical; good prediction is expected as R_sc is >0.7 (Fig. 9a). Interestingly, the oddly dry CD in western Texas is well predicted. The California (CA) results are better than expected. Figure 10b has large areas with no predictions, especially the central and northwest United States. Most are where regressions are not significant; see Fig. 8. The main errors, ≈1.0 to 1.4σ_Y, are differences from the classic pattern (Barnston et al. 1999). The larger are in Nebraska and Kansas, where expected precipitation was absent, and eastern Michigan and northern Ohio, which were wet instead of dry. Overall, the Appalachian region is not nearly as dry as expected. The major EOFs, 1 and 3, are strong: PC 3 greatly helps correct the overly dry result that would obtain from only the main ENSO EOF (see Fig. 4), especially in Tennessee and southward.

Figures 10c,d show 1999 La Niña anomalies. Actual anomalies again differ notably from the classic pattern, especially the Appalachians, southern Texas, and several states to its north: much area expected dry is close to normal or even wet, such as Oklahoma (OK). Prediction is good in Arizona (AZ) and southern CA, and at least of correct sign along the Appalachian states. Northwest Wyoming error is unusually high. The Southeast is predicted rather well. More PCs are strong than in 1998; PC 3 tends to degrade accuracy except near eastern Pennsylvania and the Northwest. Other strong PCs are 5, 6 (which has little significance), 7, 8, and 9. These generally increase accuracy, especially in AZ and Maryland.

(iii) June–August out-of-sample predictions

Figure 11 shows the 1996 and 1997 JJA summers. In-sample regression results and EOF 1 and 2 significance are in Fig. 12. Year 1996 illustrates more typical accuracy and a nearly null ENSO state, PC 1 about −0.2σ. But precipitation resembles a strong La Niña, so that the very dry conditions in western Wyoming and nearby states are almost completely missed—the largest error is ≈1.5σ_Y. Small dry parts of Montana and Idaho are predicted moderately well, as are locally significant CDs in Washington and Oregon. Principal component 5 predicts a small wet anomaly, which gives small errors in the Dakotas but helps in their north where precipitation is normal; PCs 2 and 3 make the major contributions and give the fairly accurate wet central United States and New Mexico predictions. A few scattered predictions are small but with correct signs: one shows in North Carolina.

Figures 11c,d show the strong 1997 El Niño summer. Accuracy is remarkably good (likely somewhat fortuitously) at CDs where p is small enough for prediction.


FIG. 12. (a) Regression significance as in Fig. 8, (b) prediction skill, R_sc, as in Fig. 9, and (c),(d) Pacific EOF 1 and 2 significance as in Fig. 4, but all for the Jun–Aug season. EOF 2 dry contours correspond to a cold North Pacific anomaly (Fig. 2b).

PCs 1 and 2 make the major contributions; PC 2's main effect is near each side of the Colorado–Kansas border (but adds to Iowa and Nebraska errors). Several other PCs are quite strong, but with much lower significance, and add only moderately to accuracy. Notably, EOF 1 contributes in the proper region, as opposed to much of the area obtained by Wang et al. (1999) without transformed variates. Figures 11b,d also show notably better predictability than indicated by recent GCM results (Koster et al. 1999), although adding a surface moisture variable would probably improve accuracy.

2) OUT-OF-SAMPLE QUANTITATIVE SKILL—HEIDKE SKILL SCORING

Predictions (guesses) were made only where they were likely useful, that is, enough > chance, to avoid washing out real skill; if actual skill or OOS sample size, N_os, is small, as here, results may even be negative (Davis 1977). We optimize R² skill. Some improvement could be gained by optimizing S_H instead, but since large errors are serious in practice, R² seems reasonable.

(i) Prediction decision criteria and parameters

The values Y_est (with Y_s renormalized to σ = 1) and R_sc were the main parameters for outer- and middle-tercile nonclimatology decisions, respectively. Outer terciles were picked if |Y_est| > L_k, k indicating more than one limit, L, was tried. The value L should be large enough to expect an outer-tercile result. Larger L should increase S_H if real skill is present until N_p and N_c become small enough for stochastic noise to dominate, an effect seen in Barnston et al.'s Fig. 7 (1999).

The value L = 0.33 was our main choice: one more correct guess is expected > chance per each eight. Normal distribution tercile boundaries, ±0.44 (N_c/N_p = 1/2), give an increase of one per six. (In practice, fractions correct are likely to be smaller due to OOS size, Y_s departures from normality, and climate nonstationarity.) When R_sc > 0.45, L/(1 − R_sc²)^1/2 was used to account for smaller s_y·x [Eq. (3)]. The middle tercile was chosen if R_sc > 0.45 and |Y_est|(1 − R_sc²)^1/2 ≤ 0.14: a small limit is needed to keep the prediction plus error likely within the tercile. The value 0.14 was simply estimated; optimization may give useful improvement. The factor (1 − R_sc²)^1/2 again is from (3).
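A minimal sketch of these decision rules follows; the function name and structure are our own illustration, not the paper's code.

```python
import math

def tercile_guess(y_est, r_sc, L=0.33, mid_limit=0.14):
    """Nonclimatology tercile decision as described in the text.
    y_est: predicted transformed anomaly (sigma = 1 units);
    r_sc: hindcast-bias-corrected skill."""
    shrink = math.sqrt(1.0 - r_sc ** 2)          # (1 - R_sc^2)^(1/2), Eq. (3)
    limit = L / shrink if r_sc > 0.45 else L     # adjust for smaller s_(y.x)
    if abs(y_est) > limit:                       # outer-tercile guess
        return "above" if y_est > 0 else "below"
    if r_sc > 0.45 and abs(y_est) * shrink <= mid_limit:
        return "middle"                          # middle-tercile guess
    return None                                  # stay with climatology

print(tercile_guess(0.50, 0.30))   # -> above
print(tercile_guess(0.05, 0.60))   # -> middle
print(tercile_guess(-0.20, 0.30))  # -> None
```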


TABLE 2. Heidke skill scores, effective t statistic, difference from correct relative to the binomial distribution standard deviation, σ_B, and actual prediction numbers. Nonclimatology prediction levels and other conditions are rightmost (see text). Columns: Heidke skill S_H; field t_B score; z-score (σ_B from chance); total predictions N_p; expected correct by chance N_ec; actual number correct N_c; difference; nonclimatology prediction limit L_k; other decision criteria.

Season    S_H   t_B   z      N_p   N_ec   N_c   Diff   L_k    Other criteria
Jan–Mar   18    1.4   7.7    878   293    400   107    0.33
          22    1.5   8.2    704   235    337   102    0.44
          24    1.6   8.1    555   185    274    89    0.55
          30    1.8   8.3    376   125    201    76    0.77
          37    2.2   9.0    293    98    170    72    1.0
          18    1.4   7.7    864   288    394   106    0.33   0.43 terciles
          21    1.4   7.7    704   235    331    96    0.44   0.43 terciles
          26    1.7   8.6    547   182    277    95    0.55   0.43 terciles
          31    1.8   8.5    372   124    201    77    0.77   0.43 terciles
          34    2.0   8.1    283    94    158    64    1.0    0.43 terciles
Apr–Jun    0    0.0   −0.3   413   138    135    −3    0.33
           0    0.0    0.0   296    99     99     0    0.44
           6    0.4    1.0   142    47     53     6    0.66
Jun–Aug   10    0.8   3.0    414   138    166    28    0.33
          14    1.0   3.7    344   115    147    32    0.44
          16    1.1   3.7    278    93    122    29    0.55
          15    1.0   3.0    181    60     79    19    0.77
Oct–Dec   12    0.8   4.5    694   231    286    55    0.29
          13    0.8   4.5    648   216    270    54    0.33
          12    0.8   4.1    547   182    227    45    0.44
          13    0.8   4.0    463   154    195    41    0.55
          14    0.9   5.4    805   268    341    73    0.29   1.0, EG2+orth
          15    0.9   6.2    831   277    361    84    0.29   1.2, EG2+orth
          15    0.9   5.7    744   248    320    72    0.33   1.0, EG2+orth
          15    0.9   5.9    758   253    329    76    0.33   1.2, EG2+orth
          16    1.0   5.7    601   200    265    65    0.44   1.0, EG2+orth
          16    1.0   5.7    613   204    270    66    0.44   1.2, EG2+orth
          14    0.8   4.3    490   163    208    45    0.55   1.0, EG2+orth
          15    0.9   5.0    513   171    224    53    0.55   1.2, EG2+orth

t statistic significance (1-tailed, 15 degrees of freedom):
t    1.0    1.34    1.75    2.08    2.60
p    1/6    0.10    0.05    0.03    0.01

(ii) Quantitative skill considerations: Table 2

Table 2 shows quantitative skill for four seasons, each using all OOS years. Results for at least three L_k are shown under prediction limit; some seasons include S_H with other decision criteria for illustration. The number of predictions, N_p, number of correct predictions, N_c, and expected correct, N_ec, columns explicitly show L_k sensitivity. Maximum possible predictions per season are 2064 (6 × 344 CDs). Like the OOS maps, Heidke skills include the in-sample annual cycle so that OOS baseline shifts from climate change, or other nonstationarity, directly reduce skill. Testing here is more demanding than that used by Barnett and Preisendorfer (1987, henceforth BP) since BP derived their terciles from their OOS predictions.

(iii) Quantitative skill: Results, Table 2

Table 2’s salient feature is the consistently positiveskill: even the most problematic season, April–June (seeearlier), shows slightly positive results. The Heidke skillscore usually increases with L, as expected for real skill.JFM’s large Lk range shows behavior when predictionis unusually good (note many high Rsc CDs, Fig. 9a),ENSO response is strong, and strong ENSOs occur:even L 5 1.0 gives ø280 predictions, .1/2 are correctfor L . 0.55. In-sample tercile divisions varied aboutnormal (see appendix D). JFM skills shown for normalterciles (60.44) are nearly as good as including regionalvariation—a result typical of all seasons, not explicitlyshown.

OND illustrates other commonalities. The value 1.2 under "Other decision criteria" shows skills when monthly tPAs are exponentiated by 1.2 before normalization. This change (tails are stretched, middle compressed) corrects much of the kurtosis from the log transform. Reduced tercile jitter and larger N_p should increase skill; both effects are seen. From July through December, adding EG 2 to the predictors and orthogonalizing (EG2+orth under Other decision criteria) notably increased N_p, S_H, and p_fo. (The optimal predictor assumes orthogonality while EG 2 often correlated strongly with several PCs.) Extended Gulf 2 had large low-p areas in all the included 3-month seasons and field significance in most.

(iv) Quantitative skill: Estimating statistical significance, p_fo

Table 2’s third column shows z-scores (see appendixD). The value z determines pfo directly if predictandsare independent, but spatial correlation biases z high(BP, their appendix, section 4). Correction was donefollowing BP (see appendix D). Table 2’s bottom showsconservative, 15 degrees of freedom, tB (field) signifi-cance levels. (The B indicates the underlying binomialdistribution.)

Nearly all Table 2 entries have p_fo ≤ 1/6; during JFM many have p_fo ≤ 0.05. Surprisingly, even JAS (not shown) has p_fo ≈ 1/6 and S_H ≈ 16 in spite of its low MLR significance. Its skill apparently results from a few influential PCs, especially 1 and 2. Considered together, the results described earlier indicate that real skill is achievable during most of the year. Given the small N_os and the t statistic's standard error, ±1, the results seem distinctive.

4. Discussion

a. Seasonal correlation results: Comparison to previous studies

Previous statistical studies often used canonical correlation methods, usually with EOF prefiltering (e.g., BS96; BP). While BS96 predicted temperature reasonably well, precipitation results were not good: simple prediction using ENSO composites shows markedly more potential skill (Barnston et al. 1999). Figures 8 and 11b's seasonal 0 lag skills are notably greater than those of similar studies. Several factors contribute.

1) FACTORS LIMITING AND IMPROVING SKILL

In hindsight, canonical methods’ limited performancemay be straightforwardly explained:

1) These types of methods rely fundamentally on linear regression: precipitation distributions' high skew and extreme values give flawed regression statistics, even for one predictor variable (Tabachnick and Fidell 1996, henceforth TB96, chapter 4; Wang and Swail 2001, henceforth WS). This characteristic of continental precipitation is a nontrivial difficulty.

2) EOF prefiltering the precipitation field is more likely to filter out the desired signal than enhance it since SST influence typically accounts for a small variance fraction. Instead, prefiltering will likely capture mainly stochastic atmospheric variance since it is highly spatially correlated on 1–2-month timescales (e.g., droughts) and usually dominates.

3) Canonical methods typically generate much artificial skill (BP; MI): cross-validation is often used for correction, but its help is limited; see below. Using XV effectively subtracts two large quantities, each with substantial uncertainties. This difference, at best, is likely to have uncertainty hard to characterize [see MI and Davis (1977)].

4) Canonical methods are "exquisitely sensitive" to departures from normality (TB96, p. 640).

Several factors improve our results: (i) smaller ocean regions were analyzed, following MT; (ii) a CD's entire PA was used—no variance was lost by filtering or interpolation; (iii) significance levels, correlations, and reliability were markedly improved by removing the PA distributions' large skew and truncation by transformation (TB96; WS); (iv) MLR recombines signals smeared by PC analysis (see smearing below): skill is improved more than expected from unrelated predictors.

2) CAVEATS AND ADDITIONAL DISCUSSION

Some care is needed when comparing results here to others'. Seasonally averaged tPA predictions are, approximately, the season's geometric mean ratio to each month's (monthly σ normalized) median. This ratio may not be easily related to the usual normal, the season average, if monthly averages or distribution shapes change greatly in the season. (Normalizing anomalies by monthly σ, as usually done, causes a similar ambiguity.)

Secondly, XV has often been used to remove artificial skill. However, XV (assuming Gaussian distributions) gives estimated OOS skill, R_os, for N_os ≈ N, the number of observations in model(s) construction. The R_os will usually be noticeably lower than population-based skills (those shown here) if N is not immense, ≳10³, and R_c² is small, <0.15 (section 2e). If expected OOS (XV) skills were shown in Fig. 9 (same N), the 0.4 contour would be ≈ at 0.3's position, with accordingly less influence area, and 0.4 at ≈0.5. Contours ≥0.6 would be little affected since the governing quantity is squared. For only two or three predictors and N ≥ 100, R_c − R_os becomes small, a trade-off between predictor number and R_os discussed in sections 2c–e. [Davis (1977) treats these skill types in detail.] Note that XV does not identify well most artificial skill generated by screening techniques like CCA: only that from predictors whose skill is from a few influential observations. However, these are preferably dealt with before the main analysis (TB96, chapter 4).
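The skill hierarchy discussed here (in-sample hindcast skill above the population skill, which in turn exceeds expected out-of-sample skill for modest N) is easy to illustrate with a small Monte Carlo; the sample sizes, predictor count, and signal strength below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Many trials: fit MLR on N = 60 samples with k = 10 predictors and a
# weak true signal (population R = sqrt(0.25/1.25) ~ 0.45), then score
# the fitted model on independent data.
n, k, trials = 60, 10, 400
r_in, r_oos = [], []
beta_true = np.zeros(k)
beta_true[0] = 0.5                       # weak true signal
for _ in range(trials):
    X = rng.standard_normal((n, k))
    Xo = rng.standard_normal((n, k))     # independent OOS predictors
    y = X @ beta_true + rng.standard_normal(n)
    yo = Xo @ beta_true + rng.standard_normal(n)
    Xa = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    r_in.append(np.corrcoef(Xa @ b, y)[0, 1])          # hindcast skill
    Xoa = np.column_stack([np.ones(n), Xo])
    r_oos.append(np.corrcoef(Xoa @ b, yo)[0, 1])       # OOS skill

# Hindcast R is inflated above the population R (~0.45), while expected
# OOS R falls below it.
print(np.mean(r_in) > 0.45 > np.mean(r_oos))  # -> True
```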

We emphasize: the term artificial skill requires differentiation from other biases. Hindcast, population (also termed ensemble), and OOS skill are already well defined (section 2e; Davis 1977). Our recommendation is to reserve the term for the false skill arising from selecting predictors based on their performance.

b. ENSO event differences and nonlinearity: Comparison with simpler methods

Methods here explain ENSO event differences found using simpler approaches and show additional SST influence.

1) IMPROVEMENT OVER INDEX-COMPOSITE TECHNIQUES: EXAMPLES

Smith et al. (1999) and Harrison and Larkin (2001, henceforth HL) find the 1997–98 ENSO differing from previous events. Many of their differences appear attributable to using a single ENSO index, even though robust. Artifacts are also likely due to composites' small sample sizes. PC-MLR analysis appears to capture effects better: Pacific EOF 1 shows strong and widespread responses in both time and space (Fig. 4), especially JJA (Fig. 12), when HL find little effect. It also properly predicts, OOS, the 1997–98 winter wet southeast coast and Texas regions—a strong warm event (see Figs. 10b, 9a; also MT). A wet southern CA winter is also correctly predicted (Fig. 9a). (Southern CA is influenced by EOFs 1, 2, 5, and 7; unless PC 1 is very strong, as in 1997–98, we expect differences such as HL note.) Sometimes we confirm index results, especially in SON (Fig. 1, HL): we expected a very dry 1997–98 Northwest winter; and, as we both find, the opposite occurred.

2) WARM AND COLD EVENT NONLINEARITIES: PHYSICAL AND DYNAMICAL

Our work apparently explains warm and cold event differences seen in other studies (e.g., Hoerling et al. 2001, henceforth HO; MT). Differences should result from both physical and dynamical nonlinearities. Some of the former is requisite since precipitation cannot be negative, and water vapor's nearly exponential temperature dependence should cause a positive skew. Our PA transform directly adjusts for both these effects (e.g., the log is the exponential inverse), so that our tPAs should have little purely physical nonlinearity (which inverse parameters absorb and quantify).

Unlike others, we find little dynamical nonlinearity, which can be found by including |PC 1| as a predictor. It typically shows less significance than expected by chance. Other EOFs account for most differences; for example, EOF 4 correlates highly with |PC 1|. In particular, HO's south-central and Northeast difference pattern coincides very well with EOF 3's influence. We do find a small northwestern region with sufficient area and low p to suggest reality, coinciding well with HO.

But the effective sample size is likely quite small and the 1997–98 event a major contributor, so that chance may easily be responsible.

c. Effects inherent in EOF-PC analysis

EOF map sets typically had some maps with a few small areas of local significance, some with fairly large areas, and usually two or three maps with areas large enough to suggest field significance. All three area types often overlapped or fit together. This joint behavior suggests signal smearing among EOFs. More evidence is overlapping areas forming clearly significant MLR fields (common in our Atlantic set, not shown). Much of this smearing appears inherent in PC analysis. Horel (1981) and Richman (1986) show EOFs' sensitivity to analysis region boundaries, especially when variance is near a border (K.-Y. Kim 1999, personal communication). EOFs' poor ability, rotated or not, to represent motion, usual for SSTAs, will complicate EOF patterns and spread variance among several, or more likely, many. MLR helps by recombining at least some of these parts: those overlapping in a map set. The values of R and p are invariant if the predictors are linearly transformed. Varimax, skew rotation, and cyclostationary analysis (Kim 2000) may also help. Factor analysis should give useful noise reduction. However, a simple means to analyze noncyclical motion seems unavailable.
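The invariance of R (and hence p) under linear transformation of the predictors is easy to verify numerically; this is a generic MLR sketch with invented data, not the paper's regression code.

```python
import numpy as np

rng = np.random.default_rng(1)

# MLR multiple correlation R is unchanged by any invertible linear
# transform of the predictors (e.g., recombining smeared PCs), because
# the fitted values depend only on the predictors' column space.
n, k = 200, 4
X = rng.standard_normal((n, k))
y = X @ rng.standard_normal(k) + rng.standard_normal(n)

def multiple_r(X, y):
    """Multiple correlation: corr(fitted values, y) from OLS."""
    Xa = np.column_stack([np.ones(len(y)), X])   # add intercept
    beta, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    return np.corrcoef(Xa @ beta, y)[0, 1]

A = rng.standard_normal((k, k))                  # invertible w.p. 1
print(np.isclose(multiple_r(X, y), multiple_r(X @ A, y)))  # -> True
```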

d. Using additional ocean regions and pattern field testing—Future work

Other ocean regions show clear skill, for example, the Atlantic (Table 1); but simply adding a group of predictors will likely reduce overall skill by adding too many unimportant variables. A pattern-limited field test seems needed for selection since a whole-field test will likely miss smaller significant areas. Other considerations are predictor consistency over time and dynamics; see also Wang (2001).

e. Noise EOFs and SSTAs

Some EOFs, including all Gulf EOFs, had surprisingly strong and distant correlations, likely from atmospheric forcing. Although their PCs had short persistence, ≈1-month half-life, and little 1-month forecast skill, occasionally one had very high 3-month forecast skill, Gulf 3 in particular. Its JJA forecast from March–May (MAM) has clear field significance: p ≈ 10⁻⁷ and a large second area with p's ≪ 10⁻². Since atmospheric systems causing the Gulf correlations are large scale, mechanisms could be excited with delayed influence, for example, oceanic adjustment. Gulf 3 suggests snow cover: MAM forecasts JJA, and its full PC is needed, not just ε(t). Soil moisture does not appear responsible.

An apparent conclusion from the previous material and our AR1 guidance maps (section 2f) is that in the United States, at least, large regions show high local significance when correlated with randomlike or AR1, r₁ ≈ 0.5, time series. While some may be due to SSTAs, many are likely from large-scale atmospheric processes. Since these should have roughly semiglobal effects, they should cause SSTAs in other ocean regions, and their correlations will pass usual statistical tests and XV. They pose a statistical forecasting pitfall and suggest wariness when short-persistence PCs (or other predictors) seem to forecast well. We can only suggest caution, hopefully reasonable causes, and future research.

5. Summary and conclusions

Relatively high concurrent (0 lag) precipitation prediction skill is found over much of the United States over most of the year: hindcast-bias-corrected correlation coefficients typically 0.2–0.4 and 0.3–0.6 on monthly and seasonal timescales, respectively, and considerably higher than earlier comparable studies. As found earlier, high-skill regions are seasonally dependent, winter highest. While this study was limited to the United States, the methods developed can be applied to climate parameters generally. Related studies, such as BS96, suggest similar results are obtainable over most continents. Out-of-sample (OOS) tests confirmed expected accuracy (as did cross-validation).

Factors improving results were detailed SSTA pattern analysis, multiple regression, monthly data, and, critically, a precipitation transform giving Gaussian regression residuals. Method advantages are trustworthy statistics, especially correlation coefficients, and rigorous monthly, and simple and conservative seasonal-average, hindcast-bias removal. The number of predictors chosen (necessarily) a priori can be traded off against the difficulty of removing unimportant ones without introducing false skill. Moreover, the predicted response to an SST state and its confidence limits are obtainable by textbook methods.

Quantitative OOS testing with 1994–99 data showed distinct skill for nearly all seasons—often with surprising accuracy some years and, at times, the opposite. This behavior fit with in-sample scatterplots. Elucidating poor-prediction causes should be interesting future work. At least one non-Pacific EOF found field significant during in-sample analysis (EG 2) performed well in OOS testing, suggesting its skill was not just due to selection from many possibilities.

Improved skill seems likely using better databases and techniques, such as EOF rotation, trimming-corrected COADS data (Wolter 1997), and factor and cyclostationary analysis, in addition to removing unimportant predictors. Several-month-lead forecast prospects appear good by combining ocean circulation and statistical models, or statistical alone. [Initial work showed population skill typically decreased by only ≈0.1 for late fall and winter 3-month forecasts (Markowski and North 1999).] Carefully including other ocean predictors seems likely to noticeably increase forecast and concurrent skill.

Examining single EOF correlations gave additional results:

1) The strong North Pacific variability about the subarctic front (typically the main North Pacific EOF) appears largely due to the front's north–south movement and showed little winter precipitation influence. Instead, the east–west temperature gradient or anomaly position along the front showed much influence and notably modulated ENSO effects during several seasons (seen also by GB98). Other Pacific EOFs also modulated ENSO influence, especially EOF 3.

2) Methods here explain ENSO event differences found by others using simpler approaches, show additional SST influence, and account for much previously unexplained nonlinearity.

3) Gulf of Mexico EOFs and a large region including the Gulf show unusually strong and widespread U.S. precipitation correlations during all seasons. Their PC behavior strongly suggests the correlations are mainly due to atmospheric forcing.

4) The previous and similarly caused correlations are likely to be troublesome in statistical forecasts since they mimic causal influence and are not identified by usual significance and cross-validation tests. However, a few highly significant forecast skills suggest some may be useful forecasters, perhaps due to delayed influences such as snow cover.

5) Because the previous correlations are strong, and extend over large ocean regions (perhaps near global), they should provide useful, if not important, tests for coupled GCMs.

Acknowledgments. This work was supported in part by the National Institute for Global Environmental Change, Southern Region, U.S. DOE. It does not necessarily endorse this study's findings. We are indebted to Kwang-Yul Kim of The Florida State University for many helpful discussions, comments, and the EOF algorithm. This manuscript also benefited substantially from comments and review by Matt Gilmore (University of Illinois at Urbana–Champaign). We thank Steve Schroeder for editing help. The first author is most grateful to Yongyun Hu for his crucial encouragement at the start of this work.

APPENDIX A

Variables and Abbreviations

ε(t)	Error or residual, month t
s, s²	Standard deviation, variance
a_di	MLR coefficient of ith predictor and CD
a_s	Seasonal correlation slope


TABLE B1. Normality χ² results, and numbers and expected percentage of values in tails of transformed and residual distributions. The χ² test used six bins, p = 0.05. Southwest states (CA, NV, UT, AZ, NM, and TX) were considered separately because of their dry seasons. See text for details and discussion.

                               No. (%) failing χ² test,    Actual percent of expected no. in
                               p = 0.05                    indicated tail fraction*
Season and states   Total no.  Transformed   Residual    Transf. 0.02  Resid. 0.02  Transf. 0.005  Resid. 0.005
Nov–Feb
  No Southwest        301      42 (14)**     15 (5)           58            74            15             32
  Southwest            43      12 (28)        0 (0)           58            73             1             48
Jul–Sep
  No Southwest        301      25 (8)        15 (5)           60            73            15             43
  Southwest            43       7 (16)        4 (9)           66            81            26             67

* Fraction of each tail. ** Parenthetical values are percentages of the total no.

b_s	Hindcast bias estimate of R_s
N	Number of observations (usually in a MLR)
N_c	Number of correct predictions
N_cef	Effectively independent N_c
N_ec	Number of expected correct predictions, N_p/3
N_os	Number of OOS predictions
N_p	Number of predictions
N_pef	Effectively independent N_p
p	Significance; single-CD (local) MLR significance
p_f	Field statistical significance
p_fo	Out-of-sample field significance
R	MLR correlation coefficient (hindcast skill)
R′	R_c or R_sc
R_c	Population multiple correlation coefficient
R_os	Expected OOS correlation coefficient (skill)
R_s	Seasonal average correlation coefficient, with hindcast bias
R_sc	R_s corrected for hindcast bias (ensemble estimate, also skill)
r	Simple linear correlation coefficient
r_1	Lag 1 autocorrelation coefficient
r_ij	Cross-correlation coefficient
S_H	Heidke skill score
s_y·x	Standard deviation of regression residuals
s_y	Standard deviation of regression predictand
ν	Number of regression predictors
y(t)	Predicted transformed precipitation anomaly
Y(t)	Transformed PA (tPA)
Y_est	Season's predicted monthly tPA average
Y_s	Season's actual monthly tPA average
AR1	Autoregressive process, order 1
CCA	Canonical correlation analysis
CD(s)	U.S. State Climatic Division(s)
EG	Extended Gulf region: Table 1
MLR	Multiple least-squares linear regression
NCDC	U.S. National Climatic Data Center
OOS	Out-of-sample
PA(s)	Precipitation anomaly(s)
PC(s)	Principal component(s); PC time series
PD(s)	Precipitation distribution(s)
SSTA(s)	SST anomaly(s)
SVD	Singular value decomposition
tPA(s)	Transformed observed monthly PA(s)
XV	Cross-validation

APPENDIX B

Precipitation Anomaly Distribution Transformation and Regression Residual Testing

a. Transformation algorithm

Let y(t) = the normalized PAs for a given CD and a consecutive series of months, t, from each year. After shifting y(t) so that y(t_L) = 0, choose c₀ so that y(t_M)/[y(t_L) + c₀] = y(t_H)/y(t_M): y(t_L), y(t_M), and y(t_H) are the lowest, median, and highest y(t), respectively. When y(t) < y(t_M), square-interpolate the adjustment so that it decreases rapidly as y → y(t_M):

y′(t) = y(t) + c₀ {[y(t) − y(t_M)] / [y(t_M) − y(t_L)]}².   (B1)

No adjustment is made for y(t) > y(t_M). A log transform about the median gives ''geometric'' transformed anomalies, Y(t): Y(t) = ln[y′(t)/y(t_M)]. The values Y(t), normalized to σ = 1, are the tPAs.
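As a concreteness check, the transform above can be written in a few lines. This is an illustrative reading of the algorithm, not the authors' code; the shift convention, the closed form for c₀, and the use of the sample median are assumptions consistent with the text.

```python
import numpy as np

def transform_pa(y):
    """Sketch of the Appendix B transform: shift so the lowest value is zero,
    add a squared-interpolated offset below the median (Eq. B1) so low and
    high anomalies become geometrically symmetric about the median, then take
    logs and normalize. Assumes the shifted median is positive."""
    y = np.asarray(y, dtype=float)
    y = y - y.min()                        # shift so y(t_L) = 0
    y_L, y_M, y_H = 0.0, np.median(y), y.max()
    c0 = y_M**2 / y_H                      # from y(t_M)/[y(t_L) + c0] = y(t_H)/y(t_M)
    adj = np.where(y < y_M, c0 * ((y - y_M) / (y_M - y_L))**2, 0.0)
    Y = np.log((y + adj) / y_M)            # "geometric" anomalies about the median
    return Y / Y.std()                     # normalize tPAs to sigma = 1
```

Note the adjustment equals c₀ at the lowest value and falls quadratically to zero at the median, exactly the behavior Eq. (B1) describes.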

b. Transformed anomaly and residual distributionexamination

1) χ² NORMALITY TESTING AND RESULTS

The χ² normality tests used six bins with equal 18% intervals and 4.5% tail bins (|z| > 1.690; two observations expected per month in season). Tail bin counts were added. Values outside the 2.0% and 0.5% tail fractions were counted to see if the transform reduced or eliminated extreme event bias: counts were too small for χ² use.

Table B1 shows transform performance with predictors for our 10 Pacific and 4 Gulf PCs and χ² p = 0.05. Summer distributions from states with highly arid CDs, typically the U.S. Southwest, are shown separately since they had the most very small values and are the most severe test. Results are typical: residual distributions are closely normal, with failing fractions well within chance. Southwest July–September was little more than 1σ above that expected, even if the several wetter Texas CDs are excluded. The tail counts show extreme values are well removed. The tPAs differ from strict normality, but only residual normality is required. Some limitations: too many very small raw values will give an abnormal, negatively peaked residual distribution; over long seasons, distribution shape changes need consideration.
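A six-bin χ² test of residual normality can be sketched as follows. The bin probabilities (4.5% tails with 18% shoulders) are one reading of the somewhat garbled bin description above and should be treated as an assumption; the degrees-of-freedom reduction for the two fitted parameters is standard practice rather than something the text states.

```python
import numpy as np
from scipy import stats

def chi2_normality(resid, probs=(0.045, 0.18, 0.275, 0.275, 0.18, 0.045)):
    """Six-bin chi-square normality test on standardized residuals.
    probs are the bin probabilities under N(0,1); they must sum to 1.
    Degrees of freedom subtract the two fitted parameters (mean, std)."""
    probs = np.asarray(probs)
    z = (resid - resid.mean()) / resid.std(ddof=1)
    interior = stats.norm.ppf(np.cumsum(probs)[:-1])   # 5 interior bin edges
    counts = np.bincount(np.searchsorted(interior, z), minlength=probs.size)
    expected = z.size * probs
    chi2 = float(((counts - expected) ** 2 / expected).sum())
    p = float(stats.chi2.sf(chi2, df=probs.size - 3))
    return chi2, p
```

For roughly normal residuals the test should fail at p = 0.05 only about 5% of the time, matching the chance-level failure counts in Table B1; a strongly skewed input is rejected decisively because its empty tail bin alone contributes a large χ² term.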

2) RESIDUAL AUTOCORRELATION AND SCATTERPLOT EXAMINATION

We checked residual r₁ autocorrelation nominally, using a season's entire residual time series, and with within-season values only. Nominal calculation includes decorrelation between years: r₁ ≈ 2/3 of the season-only value. Autocorrelation varies seasonally, so an annual basis is not adequate. Autocorrelations appeared statistically consistent with 0. (MLR requirements are based on the nominal method.)

Season-averaged residual time series and their 44-point correlation scatterplots were examined for many CDs with the highest skills. No unusual behavior was found: outlier points did not appear to drive correlations (an important reason for XV), nor was nonrandom residual behavior visible.
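The within-season lag-1 check can be sketched like this. The estimator (mean product of standardized values over same-season month pairs) is our simplification, not the authors' exact formula; its point is that the artificial jump from one year's last month to the next year's first month is excluded.

```python
import numpy as np

def lag1_within_season(resid_by_year):
    """Lag-1 residual autocorrelation from within-season month pairs only.
    resid_by_year: array of shape (years, months_per_season). Pairs that
    straddle the between-year gap never form, so the estimate is not
    diluted by between-year decorrelation."""
    x = np.asarray(resid_by_year, dtype=float)
    x = (x - x.mean()) / x.std()           # standardize over the whole sample
    return float(np.mean(x[:, :-1] * x[:, 1:]))
```

White-noise residuals give a value statistically consistent with zero, while residuals that persist through a season give a value near one.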

APPENDIX C

Brief Derivation of Seasonal Average Hindcast Bias Correction, Eq. (4)

Noting that y_i can be written as y_i = r x_i + e_i, where the e_i are the regression residuals, the product-moment formula for r_s in terms of sums (x and y means = 0) becomes

r_s = Σ_j x_sj (r x_sj + e_sj) / [Σ_j x_sj² · Σ_j (r x_sj + e_sj)²]^(1/2).   (C1)

Neglecting the cross terms in (C1) gives

r_s = r Σ_j x_sj² / [Σ_j x_sj² (r² Σ_j x_sj² + Σ_j e_sj²)]^(1/2).   (C2)

After normalizing x and y to σ = 1, formally let Σ_j e_sj²/(N/n) = α Σ_i e_i²/N and Σ_j x_sj²/(N/n) = γ Σ_i x_i²/N, where i denotes monthly values and n = months per season, and use Σ_i e_i² = (1 − r²) Σ_i x_i². Dividing by γ Σ_i x_i², squaring, canceling, and considering just the bias (which sums as a square) obtains Eq. (4) minus a_s (a_s will be ≈1 when p ≤ 0.1). Here, α will typically be ≈1/n. If the x_i were uncorrelated, γ also would be ≈1/n. But the x_i derive from persistent SSTAs, so that γ will be closer to 1, since each x_sj will be ≈ its corresponding x_i's.
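The cross-term neglect that takes (C1) to (C2) can be sanity-checked numerically: with y = r·x + e and independent noise, the exact product-moment correlation should match the (C2) expression up to sampling error. The sample size, r value, and noise variance below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Numerical check of the step from (C1) to (C2): generate y = r*x + e with
# independent noise scaled so Var(e) = 1 - r^2, then compare the exact
# correlation against the cross-term-free expression of Eq. (C2).
rng = np.random.default_rng(3)
r = 0.5
x = rng.normal(size=50_000)
e = rng.normal(scale=np.sqrt(1.0 - r**2), size=x.size)   # Var(e) = 1 - r^2
y = r * x + e

S_x2, S_e2 = float((x**2).sum()), float((e**2).sum())
r_exact = float(np.corrcoef(x, y)[0, 1])                       # full formula (C1)
r_approx = r * S_x2 / np.sqrt(S_x2 * (r**2 * S_x2 + S_e2))     # Eq. (C2)
```

The two estimates agree to well within the O(1/√N) sampling error, confirming that the neglected cross terms Σ x·e average out for independent residuals.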

APPENDIX D

Out-of-Sample Testing: Optimal Weighting and Heidke Skill Scoring

a. Additional optimal weighting procedures and discussion

Choosing λ [Eq. (5)] depends on predictor confidence: underweighting variables that have real influence, versus overweighting poor predictors and losing forecast skill. Smaller λs give more weight to less-significant (smaller t) predictors. The effect of λ rapidly becomes minor as t increases since t is squared. Davis (1977) shows that λ should lie between 1 and 2 when t is moderately low. (Davis's R²/⟨r²⟩ is well approximated by t² for the N and p here.)

A PC's p_f was included by varying λ versus p_f. For field-significant PCs: λ = 1.7 for t < 1.0, linearly tapered to 0 over 1.0 < t < 2.3 (λ = 0 for t > 2.3). PCs with very large low-p areas plus long persistence, r₁ ≥ 0.7, were treated similarly. Otherwise, λ was more conservative: λ = 2 for t < 1.75, linearly tapered to 0 over 1.75 < t < 3.5.
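The λ(t) schedule just described is a small piecewise-linear function. The breakpoints below follow the text, but they are this study's tuning choices rather than general recommendations, and the function name is ours.

```python
def damping_lambda(t_stat, field_significant):
    """Piecewise-linear lambda(t) schedule from Appendix D(a): a constant
    value at low |t|, tapering linearly to zero at high |t|. The
    field-significant branch also applies to long-persistence PCs with
    very large low-p areas, per the text."""
    t = abs(t_stat)
    if field_significant:
        lo, hi, lam0 = 1.0, 2.3, 1.7     # lambda = 1.7 for t < 1.0, 0 past 2.3
    else:
        lo, hi, lam0 = 1.75, 3.5, 2.0    # the more conservative schedule
    if t <= lo:
        return lam0
    if t >= hi:
        return 0.0
    return lam0 * (hi - t) / (hi - lo)   # linear taper between lo and hi
```

The taper means highly significant predictors (large t) escape damping entirely, consistent with the remark that λ's effect rapidly becomes minor as t increases.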

A PC's p_f was also used where local p was ≈0.1: for CDs with ≥1 field-significant PC, weights applied to Y_est varied linearly from 1.0 to 0.0 over 0.12 < p < 0.18, and over 0.095 ≤ p ≤ 0.135 otherwise. This weighting avoids most nonclimatology predictions for CDs with only a few moderate-p predictors; the higher limits partially compensate for MLR significance reduction from using many predictors. The tradeoff is underweighting PCs with real influence that do not appear field significant.

b. Heidke skill: Tercile divisions

A Y_s's actual tercile divisions often differed from normal. Since 3 does not evenly divide the training season N, 44, divisions fall evenly between two Y_s values. Averaging many Y_s pairs flanking normalized tercile divisions gave 0.43, indicating unbiased terciles. Boundary jitter was evident, often statistically significant between nearby CDs. To reduce this systematic error, average terciles within a state (and adjoining states if <4 CDs) were used; 0.43 was also examined since average regional bias was small.

c. Heidke skill field significance

Z-scores (Table 2, column 3) = (N_c − N_ec)/σ_B, where σ_B is the binomial distribution σ, [N_p p(1 − p)]^(1/2), with p = 1/3 the chance correct probability. Here, σ_B is well approximated by a Gaussian if N ≥ 15 (Spiegel 1961, chapter 7). The value Z overestimates p_fo due to spatial correlation (BP, their appendix, section 4): neighboring CD Y_s are usually correlated ≈0.9. Correction followed BP's approach. Like p_f, effective independent prediction and correct-guess counts, N_pef and N_cef, are needed. They were approximated following BP except that cross correlations, r_ij, were used [BP's Eq. (A17)], as Y_s is normalized to σ = 1; the r_ij were limited to CDs with p ≤ 0.1. They were weighted by the fraction of years predictions were made at CDs i and j, w_i w_j instead of (s_i s_j), to account for differing prediction numbers. Here, |r_ij|'s bias was removed before summing. [In BP's Eq. (A17), using |r_ij| gives the total normalized equivalent CDs per typical CD, ≈1/10 here.] Diagonal elements are summed since w_i is not always 1. Then t_B = Z/(Σ w_i w_j r_ij)^(1/2). Here, t_B corresponds to an N = N_pef binomial distribution.
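The uncorrected part of this scoring can be sketched directly. The Heidke form (N_c − N_ec)/(N_p − N_ec) is the standard definition for categorical forecasts, assumed here since the paper defines N_ec = N_p/3 but does not restate the formula; the spatial-correlation correction (N_pef, N_cef) is deliberately omitted.

```python
import math

def heidke_and_z(n_correct, n_pred, p_chance=1.0 / 3.0):
    """Heidke skill score S_H = (Nc - Nec)/(Np - Nec) and the uncorrected
    binomial Z-score Z = (Nc - Nec)/sigma_B, sigma_B = sqrt(Np*p*(1-p)).
    Without the effective-N correction, this Z overestimates field
    significance when neighboring CDs are spatially correlated."""
    n_exp = n_pred * p_chance                       # chance-expected correct, Np/3
    sh = (n_correct - n_exp) / (n_pred - n_exp)
    sigma_b = math.sqrt(n_pred * p_chance * (1.0 - p_chance))
    z = (n_correct - n_exp) / sigma_b
    return sh, z
```

For example, 20 correct tercile predictions out of 30 gives S_H = 0.5, a strong score against the chance expectation of 10.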

REFERENCES

Barnett, T. P., and R. Preisendorfer, 1987: Origins and levels of monthly and seasonal forecast skill for United States surface air temperature determined by canonical correlation analysis. Mon. Wea. Rev., 115, 1825–1850.

Barnston, A. G., 1994: Linear statistical short-term climate predictive skill in the Northern Hemisphere. J. Climate, 7, 1513–1564.

——, and T. M. Smith, 1996: Specification and prediction of global surface temperature and precipitation from global SST using CCA. J. Climate, 9, 2660–2697.

——, A. Leetma, V. Kousky, R. Livezey, E. O'Lenic, H. Van den Dool, A. J. Wagner, and D. A. Unger, 1999: NCEP forecasts of the El Niño of 1997–98 and its U.S. impacts. Bull. Amer. Meteor. Soc., 80, 1829–1852.

Berlage, H. P., 1966: The Southern Oscillation and World Weather. Meded. Verh. Monogr., No. 88, Koninklijk Nederlands Meteorologisch Instituut, 150 pp.

Bjerknes, J., 1966: A possible response of the atmospheric Hadley circulation to equatorial anomalies of ocean temperature. Tellus, 18, 820–829.

Bretherton, C. S., C. Smith, and J. M. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. J. Climate, 5, 541–560.

Chelton, D. B., 1983: Effects of sampling errors in statistical estimation. Deep-Sea Res., 30, 1083–1103.

Chen, T. C., J. M. Chen, and C. K. Wikle, 1996: Interdecadal variation in U.S. Pacific coast precipitation over the past four decades. Bull. Amer. Meteor. Soc., 77, 1197–1205.

da Silva, A., A. C. Young, and S. Levitus, 1994: Algorithms and Procedures. Vol. 1, Atlas of Surface Marine Data 1994, NOAA Atlas NESDIS 6, 83 pp.

Davis, R. E., 1977: Techniques for statistical analysis and prediction of geophysical fluid systems. Geophys. Astrophys. Fluid Dyn., 8, 245–277.

Frankignoul, C., 1985: Sea surface temperature anomalies, planetary waves, and air–sea feedback in the middle latitudes. Rev. Geophys., 23, 357–390.

Gershunov, A., and T. P. Barnett, 1998: Interdecadal modulation of ENSO teleconnections. Bull. Amer. Meteor. Soc., 79, 2715–2725.

Gill, A. E., 1982: Atmosphere–Ocean Dynamics. Academic Press, 662 pp.

Guttman, N. B., and R. G. Quayle, 1996: A historical perspective of U.S. climate divisions. Bull. Amer. Meteor. Soc., 77, 293–303.

Harrison, D. E., and N. K. Larkin, 2001: Comments on ''Comparison of 1997–98 U.S. temperature and precipitation anomalies to historical ENSO warm phases.'' J. Climate, 14, 1894–1895.

Hoerling, M. P., A. Kumar, and T. Xu, 2001: Robustness of the nonlinear climate response to ENSO's extreme phases. J. Climate, 14, 1277–1293.

Horel, J. D., 1981: A rotated principal component analysis of the interannual variability of the Northern Hemisphere 500 mb height field. Mon. Wea. Rev., 109, 2080–2092.

Kim, K.-Y., 2000: Statistical prediction of cyclostationary processes. J. Climate, 13, 1098–1115.

——, and Q. Wu, 1999: A comparison study of EOF techniques: Analysis of nonstationary data with periodic statistics. J. Climate, 12, 185–199.

Koster, R. D., M. J. Suarez, and M. Heiser, 1999: Variance and predictability of precipitation at seasonal-to-interannual timescales. J. Hydrometeor., 1, 26–46.

Latif, M., and T. P. Barnett, 1994: Causes of decadal climate variability over the North Pacific and North America. Science, 266, 634–637.

Livezey, R. E., and W. Y. Chen, 1983: Statistical field significance and its determination by Monte Carlo techniques. Mon. Wea. Rev., 111, 46–59.

Lorenz, E. N., 1964: The problem of deciding the climate from the governing equations. Tellus, 16, 1–12.

Markowski, G. R., and G. R. North, 1999: On the climatic influence of sea surface temperature: Indications of substantial correlation and predictability. Preprints, 10th Symp. on Global Change Studies, Dallas, TX, Amer. Meteor. Soc., 282–284.

Michaelsen, J., 1987: Cross-validation in statistical climate forecast models. J. Climate Appl. Meteor., 26, 1589–1600.

Montroy, D. L., 1997: Linear relation of central and eastern North American precipitation to tropical Pacific SST anomalies. J. Climate, 10, 541–558.

Nakamura, H., G. Lin, and T. Yamagata, 1997: Decadal climate variability in the North Pacific during recent decades. Bull. Amer. Meteor. Soc., 78, 2215–2225.

Neelin, D. J., and W. Weng, 1999: Analytical prototypes for ocean–atmosphere interaction at midlatitudes. Part I: Coupled feedbacks as a sea surface temperature dependent stochastic process. J. Climate, 12, 697–721.

Neter, J., M. H. Kutner, C. J. Nachtsheim, and W. Wasserman, 1996: Applied Linear Regression Models. 3d ed. Irwin, 720 pp.

Nicholls, N., 2001: The insignificance of significance testing. Bull. Amer. Meteor. Soc., 82, 981–986.

NOAA–CIRES Climate Diagnostics Center, cited 2000: Reynolds SST. [Available online at http://www.cdc.noaa.gov/cdc/data.reynoldspsst.html.]

North, G. R., T. L. Bell, R. F. Cahalan, and F. J. Moeng, 1982: Sampling errors in the estimation of empirical orthogonal functions. Mon. Wea. Rev., 110, 701–706.

Oort, A. H., Y. H. Pan, R. W. Reynolds, and C. F. Ropelewski, 1987: Historical trends in the surface temperature over the oceans based on the COADS. Climate Dyn., 2, 29–38.

Philander, S. G. H., 1990: El Niño, La Niña, and the Southern Oscillation. Academic Press, 289 pp.

Reynolds, R. W., and T. M. Smith, 1994: Improved global sea surface temperature analyses using optimum interpolation. J. Climate, 7, 929–948.

Richman, M. B., 1986: Rotation of principal components. J. Climatol., 6, 293–335.

Ropelewski, C. F., and M. S. Halpert, 1987: Global and regional scale precipitation patterns associated with the El Niño/Southern Oscillation. Mon. Wea. Rev., 115, 1606–1626.

——, and ——, 1996: Quantifying Southern Oscillation–precipitation relationships. J. Climate, 9, 1043–1059.

Smith, S. R., D. M. Legler, M. J. Remigio, and J. J. O'Brien, 1999: Comparison of the 1997–98 U.S. temperature and precipitation anomalies to historical ENSO warm phases. J. Climate, 12, 3507–3515.

Spiegel, M. R., 1961: Theory and Problems of Statistics: Schaum's Outline Series. McGraw-Hill, 359 pp.

Sutton, R. T., and M. R. Allen, 1997: Decadal predictability of North Atlantic SST and climate. Nature, 388, 563–567.

Tabachnick, T. G., and L. S. Fidell, 1996: Using Multivariate Statistics. 3d ed. Harper Collins College, 880 pp.

Unger, D. A., 1995: Skill assessment strategies for screening regression predictions based on a small sample size. Preprints, 13th Conf. on Probability and Statistics in the Atmospheric Sciences, San Francisco, CA, Amer. Meteor. Soc., 260–267.

Wang, H., M. Ting, and M. Ji, 1999: Prediction of seasonal mean United States precipitation based on El Niño sea surface temperatures. Geophys. Res. Lett., 26, 1341–1344.

Wang, R., 2001: Prediction of seasonal climate in a low-dimensional phase space derived from the observed SST forcing. J. Climate, 14, 77–99.

Wang, X. L., and V. R. Swail, 2001: Changes of extreme wave heights in Northern Hemisphere oceans and related atmospheric circulation regimes. J. Climate, 14, 2204–2221.

Wherry, R. J., Sr., 1931: A new formula for predicting the shrinkage of the coefficient of multiple correlation. Ann. Math. Stat., 2, 440–457.

Wolter, K., 1997: Trimming problems and remedies in COADS. J. Climate, 10, 1980–1997.

Yuan, G., I. Nakano, H. Fujimori, T. Nakamura, T. Kamoshida, and A. Kaya, 1999: Tomographic measurements of the Kuroshio Extension meander and its associated eddies. Geophys. Res. Lett., 26, 79–82.