Clues for discriminating between moving average and autoregressive models in spatial processes


Span Econ Rev (2007) 9:273–298
DOI 10.1007/s10108-006-9018-7

REGULAR ARTICLE

Clues for discriminating between moving average and autoregressive models in spatial processes

Jesús Mur · Ana Maria Angulo

Published online: 16 November 2006
© Springer-Verlag 2006

Abstract In this paper we try to provide additional insight into the problem of how to discriminate between the two most common spatial processes: the autoregressive and the moving average. This problem, whose time series analogue is apparently simple, acquires a certain complexity when it is considered in an irregular system of spatial units, mainly because there are few tools with which to carry out this discussion. Nevertheless, even with this limitation, we believe that it is possible to make some progress using the methods available at present. In this paper we discuss the advantages and drawbacks of the different techniques that can help us to discriminate between the two processes. We finish the examination with a Monte Carlo exercise and an application to European regional income, which have enabled us to better understand the performance of several proposals such as the Lagrange Multipliers, the so-called Variance criterion and the tests of Vuong and Clarke.

Keywords Spatial series · Spatial autoregressive process · Spatial moving average process · Selection strategies

JEL Classification Numbers C21 · C50

Introduction

Brockwell and Davis (2003, p. 6) define Time Series Analysis as the ‘techniques for drawing inferences from such series’ for which ‘it is necessary to set up a hypothetical probability model to represent the data. After an appropriate family of models has been chosen, it is then possible to estimate parameters, check for goodness of fit to the data, and possibly to use the fitted model to enhance our understanding of the mechanism generating the series’. If we replace the adjective Time with Spatial, nothing essential in the above definition changes. Brockwell and Davis are less specific with respect to the aim: ‘Once a satisfactory model has been developed, it may be used in a variety of ways depending on the particular field of application’. Prediction is the objective in Time Series whereas in Space the purpose is mainly to gain insight into the spatial structure of the data. Given the obvious conceptual similarity between the two techniques, Time and Spatial Series Analysis, it should come as no surprise that the development of the latter has tended to mimic the strategy of the former.

J. Mur · A. M. Angulo (✉)
Departamento de Análisis Económico, Universidad de Zaragoza, Zaragoza, Spain
e-mail: [email protected]

J. Mur
e-mail: [email protected]

However, the efforts to adapt the Time Series methodology to a spatial context have had limited success, since the Spatial dimension is not fully compatible with the Time dimension. For example, the simplicity of the sequence Past-Present-Future is not applicable to Space, in which relationships are potentially multidimensional. We also lack a natural ordering of the observations. Important concepts, such as the sampling design or the way in which the sample size is taken to infinity in asymptotic arguments, are not self-evident (Lahiri 2003). Problems of endogeneity and of lack of identification of the parameters of the model occur relatively frequently (Lee 2002), and the conditions of Central Limit Theorems are, in many circumstances, very hard to fulfil (Smith 1980). One sign of the gaps that remain in Spatial Analysis is the scant attention paid by the specialized literature to the question of model selection.

Anselin (1998, p. 243) states that the techniques of model selection ‘. . . have not yet found general acceptance in the practice of empirical regional science’ and that, therefore, ‘. . . they form a starting point for several promising and unexplored research directions in spatial econometrics’. Many years later, the situation has improved only partially, as Florax et al. (2003) point out. According to them, a classical forward stepwise approach dominates at present, whose reliability depends on the quality of the misspecification tests carried out during the process. In particular, it is rare for the specification process to be placed under the control of a well-defined model selection strategy. Transferred to the more precise context of Spatial Processes, the situation can be summed up as the lack of an identification scheme for the series. Habitual practice is not very elaborate and consists of specifying autoregressive-type models while discarding, for no apparent reason, other mechanisms.

The objective of our paper is to try to alleviate this situation, focusing the discussion on the two most common processes in the field of Spatial Stochastic Processes: the autoregressive and the moving average. In the first place, we wish to defend the usefulness of the moving average process as opposed to the autoregressive one, which is too mechanically assumed in applied work. We believe that it is possible to make progress in the discussion of how to differentiate between the two types of processes. Indeed, there are several proposals in the literature that may be useful for taking a decision. Furthermore, the question of selecting one from among various rival models is standard in mainstream Econometrics, where we can find several useful proposals capable of being adapted to the spatial case.

In the section “Time and spatial series analysis: some unsettled questions”, we discuss the main characteristics of both types of processes, paying attention to some methods for discriminating between them. For simplicity, we will distinguish between nested and non-nested model approaches. In the section “Discriminating between spatial processes: some Monte Carlo evidence” we carry out a Monte Carlo exercise whose objective is to compare the performance of the proposals that appear to be the most interesting. In the section “An application to the case of the European regional income”, we include an application to the regional distribution of European income in 2002. Finally, the “Conclusions” section gathers the main conclusions of our research.

Time and spatial series analysis: some unsettled questions

The first step in discussing the stochastic structure of a series should be to make sure that this structure exists, through some test of no spatial correlation.¹ We can identify three groups of such tests: (i) the traditional tests, whose specification is ad hoc but of recognised efficacy, such as the I of Moran (1948); (ii) those coming from the maximum likelihood approach, which were introduced belatedly in this field (the first versions of the Lagrange Multiplier, the Likelihood Ratio and the Wald test appear in Burridge 1980, Cliff and Ord 1981, and Anselin 1988, respectively); (iii) those initially designed with specific aims, though of general application, like the KR test of Kelejian and Robinson (1992) and its revision in Anselin and Moreno (2003), the tests robust to endogeneity in the regressors (Anselin and Kelejian 1997) and those of local autocorrelation (Anselin 1995; Ord and Getis 1995).

After rejecting the null hypothesis, a structure of spatial dependencies for the series should be specified. The problem at this point is the lack of efficient instruments with which to take a decision. Put another way, the null hypothesis in the tests cited above is that the distribution of the variable in Space is random but, except in a few cases, no well-defined alternative is proposed. If the null is rejected, it is common to specify a first order spatial autoregressive process:

$$
\text{SAR}(1):\quad
\left.
\begin{aligned}
y &= \rho W y + u \\
u &\sim N(0, \sigma_1^2 I)
\end{aligned}
\right\}
\tag{1}
$$

where y is an (R × 1) vector of observations of the variable y, u is a vector of error terms that we suppose to be normal (for simplicity's sake), uncorrelated and homoskedastic, ρ is a parameter that reflects the intensity of the dependencies and W is the so-called weighting matrix, of order (R × R), whose role is to define the regions with which each region interacts. The elements of the main diagonal of this matrix are zero (a region does not interact with itself). With respect to the other elements, different options are possible (Upton and Fingleton 1985). The simplest consists of introducing a one in the corresponding cell if the two regions are related (physically contiguous, for example) and a zero otherwise. Obviously, the specification of the W matrix has a large (but unavoidable) a priori component (Cressie 1991).

1 See Brett and Pinkse (1997) and De Graaff et al. (2001) for tests of spatial independence.

Specifying a SAR(1) by default is a very general practice but, in our opinion, it is too simplistic because the SAR(1) is not the only type of process capable of introducing spatial dependence. Moving average structures should also be considered:

$$
\text{SMA}(1):\quad
\left.
\begin{aligned}
y &= u - \theta W u \\
u &\sim N(0, \sigma_2^2 I)
\end{aligned}
\right\}
\tag{2}
$$

where θ is the moving average parameter.

The specifications of (1) and (2) are markedly different. For example, if we interpret u as a random input vector and y as the vector of final outputs, the two models describe different adjustment mechanisms over Space. In the moving average case, the influences that a point receives are of a local type; that is, the shocks produced in a region have no impact beyond its immediate neighbours, in accordance with the specification of W. In the autoregressive case, however, the shocks suffered by a specific region spread all over the Space, affecting the whole set of regions. The shock thus has a global character. This distinction has implications in regional analysis when trying to explain questions such as spatial externalities (Anselin 2003), processes of spatial diffusion (Le Gallo et al. 2005; Moreno et al. 2005), hedonic spatial prices (Beron et al. 2004) or convergence mechanisms (Rey and Dev 2004).
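The local versus global reach of a shock can be illustrated numerically. The following is a minimal sketch, not taken from the paper: the helper `rook_weights`, the 5 × 5 lattice, the location of the shock and the value 0.2 for both parameters are assumptions made only for the illustration.

```python
import numpy as np

def rook_weights(n):
    """Binary rook-contiguity matrix for an n x n regular lattice (hypothetical helper)."""
    W = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            k = i * n + j
            if i + 1 < n:   # cell directly below
                W[k, k + n] = W[k + n, k] = 1.0
            if j + 1 < n:   # cell directly to the right
                W[k, k + 1] = W[k + 1, k] = 1.0
    return W

n = 5
W = rook_weights(n)
R = n * n

u = np.zeros(R)
u[R // 2] = 1.0        # unit shock in the central cell, zero elsewhere

rho = theta = 0.2      # illustrative values, assumed for this example only

y_sma = u - theta * (W @ u)                        # Eq. (2)
y_sar = np.linalg.solve(np.eye(R) - rho * W, u)    # Eq. (1) solved for y

# Count the regions whose output moves in response to the single shock
affected_sma = int(np.sum(np.abs(y_sma) > 1e-12))
affected_sar = int(np.sum(np.abs(y_sar) > 1e-12))
print("regions affected under SMA(1):", affected_sma)
print("regions affected under SAR(1):", affected_sar)
```

Under the SMA(1) only the shocked cell and its rook neighbours respond, whereas under the SAR(1) every cell of the lattice responds, with an intensity that decays with distance.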

The problem of discrimination worsens because both types of process generate similar series (Sneek and Rietveld 1997) and, as we have said, we have few techniques available with which to articulate the discussion. Cliff and Ord were already conscious of this problem in 1981, when they proposed using several techniques developed in a Time Series context (the correlogram and the partial correlogram) with spatial data, but the results were very unsatisfactory. This was the overall situation reflected by the statement of Haining (1977, p. 122): “if we reject the null hypothesis we are often in no better position to postulate an alternative model of dependence”. The development of Exploratory Spatial Data Analysis, ESDA (Anselin 1995; Anselin and Bao 1997; Moreno and Vayá 2000; or Haining 2003), as a set of techniques whose aim is to go deeper into the spatial arrangement of data, has improved our perspective on the question, although there is still a lack of statistical support. Our proposal for advancing along these lines is to treat the question as a generic problem of model selection, distinguishing between Non-Nested and Nested Models approaches, to which we dedicate the two following sections.

Non-nested models approach

According to Clarke (2004), we can distinguish four different techniques for selecting between non-nested models: the Bayesian approach, the Cox test, model selection criteria and model selection tests.


The Bayesian approach is attractive though it is not exempt from criticism. The posterior odds combine the prior odds with the Bayes factor, measuring the change in the priors due to the sampling information. The first question is how to define the prior odds. Besides, as Clarke (2004, p. 3) indicates: “(. . .) the Bayes factor does not provide a measure of support for one model over another, but rather it measures ‘the change in the odds in favour of the hypothesis when going from prior to the posterior’”.² For the case of spatial data, Bayesian analyses have been used extensively by, among others, Hepple (1999) and LeSage and Pace (2004), who achieve elegant solutions.

The Cox test (1961) is not easy to obtain, especially when the models are non-linear, and may be confusing because it allows both models to be chosen or discarded simultaneously. It has become very popular through the J and JA tests of Davidson and MacKinnon (1981) and Fisher and McAleer (1979), respectively. However, when attempts have been made to adapt this method to the spatial case, serious computational problems have appeared and the results have not been satisfactory (Anselin 1984).

The most widespread technique for comparing models in a non-nested framework uses one of the wide range of model selection criteria. Their main advantage lies in their simplicity: the selection criterion will always choose the best model according to the basis of the approach. Possibly the most popular is the Akaike Information Criterion, AIC (Akaike 1973), or its Bayesian version in the SBIC statistic (Schwarz 1978). For the case we are dealing with in this paper, and for reasons that will be obvious later on, we would like to refer to an ad hoc criterion, developed in a context of spatial series analysis by Trivez and Mur (2004) and called the Variance criterion. Its objective is to discriminate between SMA(1) and SAR(1) processes and it requires a previous filtering of the series using the eigenvectors of the weighting matrix, W. If this matrix is symmetric, it can be decomposed as³

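$$
W = Q \Lambda Q' \;\Leftrightarrow\;
\begin{cases}
\Lambda =
\begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_R
\end{bmatrix}
= \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_R) \\[2ex]
Q =
\begin{bmatrix}
q_{11} & q_{21} & \cdots & q_{R1} \\
q_{12} & q_{22} & \cdots & q_{R2} \\
\vdots & \vdots & \ddots & \vdots \\
q_{1R} & q_{2R} & \cdots & q_{RR}
\end{bmatrix}
= [\, q_1 \;\; q_2 \;\; \cdots \;\; q_R \,]
\end{cases}
\tag{3}
$$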

2 In the original, quoted from Lavine and Schervish (1999).

3 The weighting matrix may be non-symmetric. For example, it is usual to standardize the matrix so that the elements of each row sum to one. In this case, the decomposition of (3) will not be valid. Standardization is not free of problems; probably the most important is that, as indicated by Baltagi et al. (2006, p. 14), it ‘reduces the amount of spatial correlation’. For this reason, we prefer to standardize the matrix in a second stage, after first resolving the questions related to the identification of the process using the raw binary weighting matrix.


where $\{\lambda_r,\ r = 1, 2, \ldots, R\}$ are the eigenvalues of W, whereas the eigenvectors are in the columns of matrix Q. Introducing this relation into the equation of the SAR(1) model of (1), we obtain:

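$$
\left.
\begin{aligned}
y &= \rho W y + u \\
u &\sim N(0, \sigma_1^2 I)
\end{aligned}
\right\}
\Rightarrow
\left.
\begin{aligned}
y^* &= \rho \Lambda y^* + u^* \\
u^* &\sim N(0, \sigma_1^2 I)
\end{aligned}
\right\}
\tag{4}
$$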

where $y^* = Q'y$ and $u^* = Q'u$ are the filtered series. Note that $y^* \sim N[0;\ \sigma_1^2 \Lambda(\rho)^{-2}]$ with $\Lambda(\rho) = (I - \rho\Lambda)$; that is, the elements of vector $y^*$ are heteroskedastic, $V(y_r^*) = \sigma_1^2 (1 - \rho\lambda_r)^{-2}$, but independent, $\mathrm{Cov}(y_r^*; y_s^*) = 0,\ \forall r \neq s$. In the case of the SMA(1), the filtering process results in

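$$
\left.
\begin{aligned}
y &= u - \theta W u \\
u &\sim N(0, \sigma_2^2 I)
\end{aligned}
\right\}
\Rightarrow
\left.
\begin{aligned}
y^* &= u^* - \theta \Lambda u^* \\
u^* &\sim N(0, \sigma_2^2 I)
\end{aligned}
\right\}.
\tag{5}
$$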
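The effect of the filter in Eqs. (4) and (5) can be checked numerically. The sketch below is only an illustration under assumed values: six regions arranged in a row, parameters equal to 0.3 and unit variance, none of which come from the paper. It verifies that premultiplying by $Q'$ turns the covariance matrices implied by (1) and (2) into diagonal matrices whose entries depend on the eigenvalues of W.

```python
import numpy as np

R = 6
# Hypothetical layout: six regions in a row, each contiguous to the next one
W = np.diag(np.ones(R - 1), 1) + np.diag(np.ones(R - 1), -1)

lam, Q = np.linalg.eigh(W)           # symmetric W: W = Q diag(lam) Q'
rho, theta, sigma2 = 0.3, 0.3, 1.0   # illustrative values (assumed)
I = np.eye(R)

# Covariance of the unfiltered series implied by Eqs. (1) and (2)
A = np.linalg.inv(I - rho * W)
cov_sar = sigma2 * A @ A.T
B = I - theta * W
cov_sma = sigma2 * B @ B.T

# Covariance of the filtered series y* = Q'y
cov_sar_filtered = Q.T @ cov_sar @ Q
cov_sma_filtered = Q.T @ cov_sma @ Q

# Element-wise variances of the filtered series under each process
var_sar = sigma2 * (1.0 - rho * lam) ** (-2)
var_sma = sigma2 * (1.0 - theta * lam) ** 2

print(np.allclose(cov_sar_filtered, np.diag(var_sar)))
print(np.allclose(cov_sma_filtered, np.diag(var_sma)))
```

The check relies on W being symmetric; with a row-standardized matrix `numpy.linalg.eigh` would not be appropriate and the decomposition of (3) would not hold.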

Once again, the elements of vector $y^*$ are heteroskedastic, $V(y_r^*) = \sigma_2^2 (1 - \theta\lambda_r)^2$, but independent, with $y^* \sim N[0;\ \sigma_2^2 \Lambda(\theta)^2]$.

The Variance criterion is based upon the heteroskedastic structure of the filtered series $y^*$. It can be shown that, if the process is an SAR(1) model, the absolute value of the filtered series, $|y_r^*|$, admits an equation such as

$$
|y_r^*| = E\big[|y_r^*|\big] + \nu_r = c\,\big(V[y_r^*]\big)^{1/2} + \nu_r = \frac{\alpha_0}{1 - \alpha_1 \lambda_r} + \nu_r
\tag{6}
$$

where c is a constant, $c = \sqrt{2/\pi}$; $\alpha_0 = \sqrt{c^2 \sigma_1^2}$; $\alpha_1 = \rho$; and $\nu_r$ is a heteroskedastic error term. The equation corresponding to the SMA(1) case reads as:

$$
|y_r^*| = E\big[|y_r^*|\big] + \nu_r = \gamma_0 + \gamma_1 \lambda_r + \nu_r
\tag{7}
$$

where $\gamma_0 = \sqrt{c^2 \sigma_2^2}$ and $\gamma_1 = -\theta \sqrt{c^2 \sigma_2^2}$. The Variance criterion selects the functional form that better fits $|y_r^*|$, using the eigenvalues of the weighting matrix W as regressors. The comparison is resolved through the estimated variance of both equations. Denoting by $\hat\sigma^2_{SAR}$ the estimated variance of Eq. (6) and by $\hat\sigma^2_{SMA}$ the estimated variance of Eq. (7), the SAR model will be selected when $\hat\sigma^2_{SAR} < \hat\sigma^2_{SMA}$, whereas the SMA model will be preferred when $\hat\sigma^2_{SMA} < \hat\sigma^2_{SAR}$.

The novelty that the tests of model selection introduce in this discussion is that, in many situations, the difference between the models is too small to support a decision, and the user should be aware of this. So, if the available evidence does not sufficiently favour either model, it may be preferable to treat them as indifferent alternatives. The test of Vuong (1989) emerges as a reinterpretation of the test of Cox, using the information contained in the individual log-likelihoods of the model (see Appendix A for details). The Distribution-Free Test of Clarke (Appendix B) is even more intuitive because it ‘(...) applies a modified paired sign test to the differences in the individual log-likelihoods from two nonnested models. Whereas the Vuong test determines whether or not the average log-likelihood ratio is statistically different from zero, the Clarke test determines whether or not the median log-likelihood ratio is statistically different from zero. If the models are equally close to the true specification, half the individual log-likelihood ratios should be greater than zero and half should be less than zero’ (Clarke 2004, p. 6). In other words, the Vuong test concentrates on the mean and the Clarke test on the median of the individual log-likelihood ratios.
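The Variance criterion built on Eqs. (6) and (7) can be sketched as follows. This is a hedged illustration rather than the procedure of Trivez and Mur (2004): the function name `variance_criterion`, the grid search over $\alpha_1$ combined with least squares for $\alpha_0$, and the use of the mean squared residual as the estimated variance are all assumptions, since the text does not specify how the two equations are estimated.

```python
import numpy as np

def variance_criterion(y_star, lam, n_grid=199):
    """Return the selected model and the residual variances of Eqs. (6) and (7)."""
    a = np.abs(y_star)

    # Eq. (7): |y*_r| = gamma0 + gamma1 * lam_r + nu_r, fitted by OLS
    X = np.column_stack([np.ones_like(lam), lam])
    gamma, *_ = np.linalg.lstsq(X, a, rcond=None)
    s2_sma = np.mean((a - X @ gamma) ** 2)

    # Eq. (6): |y*_r| = alpha0 / (1 - alpha1 * lam_r) + nu_r.
    # Assumed estimation scheme: grid over alpha1, least squares for alpha0.
    lo, hi = 1.0 / lam.min(), 1.0 / lam.max()
    s2_sar = np.inf
    for a1 in np.linspace(lo, hi, n_grid + 2)[1:-1]:   # interior of the interval
        z = 1.0 / (1.0 - a1 * lam)
        a0 = (z @ a) / (z @ z)
        s2_sar = min(s2_sar, np.mean((a - a0 * z) ** 2))

    return ("SAR" if s2_sar < s2_sma else "SMA"), s2_sar, s2_sma

# Usage on a simulated SAR(1) series (hypothetical chain of 100 regions)
rng = np.random.default_rng(0)
R = 100
W = np.diag(np.ones(R - 1), 1) + np.diag(np.ones(R - 1), -1)
lam, Q = np.linalg.eigh(W)
y = np.linalg.solve(np.eye(R) - 0.4 * W, rng.standard_normal(R))
choice, s2_sar, s2_sma = variance_criterion(Q.T @ y, lam)
print(choice, s2_sar, s2_sma)
```

Both equations have two free coefficients, so comparing the raw residual variances does not favour either functional form through a degrees-of-freedom effect.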

The conditions under which both tests are applicable differ slightly. In the case of the test of Clarke, we need independence between the log-likelihoods and that each of them comes from a continuous population (‘not necessarily the same’, as Clarke 2004, p. 6, indicates). In the case of the Vuong test, we need the full iid clause (independent and identically distributed) for the log-likelihoods. The conditions of regularity that guarantee the existence of the ML estimators⁴ must be observed in both tests. In neither case is either the normality or the linearity of the equation necessary.

Given that independence between the individual log-likelihoods is a requisite for both tests, we again need to filter the series beforehand using the eigenvectors of W (see footnote 3). From (4) we can write the log-likelihood function of the SAR(1) model:

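$$
l(y^*|\varphi_1) = -\frac{R}{2} \ln 2\pi - \frac{R}{2} \ln \sigma_1^2 - \sum_{r=1}^{R} \frac{\left[(1 - \rho\lambda_r)\, y_r^*\right]^2}{2\sigma_1^2} + \sum_{r=1}^{R} \ln(1 - \rho\lambda_r)
\tag{8}
$$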

where $\varphi_1 = (\rho, \sigma_1^2)$ and $\sum_{r=1}^{R} \ln(1 - \rho\lambda_r) = \ln|I - \rho W|$ is the logarithm of the determinant of the Jacobian for the transformation of the random variables u* (or u if not filtered) into the vector of random variables y* (or y). Using (5), we obtain the log-likelihood function of the SMA(1):

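$$
l(y^*|\varphi_2) = -\frac{R}{2} \ln 2\pi - \frac{R}{2} \ln \sigma_2^2 - \sum_{r=1}^{R} \frac{\left[(1 - \theta\lambda_r)^{-1} y_r^*\right]^2}{2\sigma_2^2} - \sum_{r=1}^{R} \ln(1 - \theta\lambda_r)
\tag{9}
$$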

where $\varphi_2 = (\theta, \sigma_2^2)$ and $\sum_{r=1}^{R} \ln(1 - \theta\lambda_r) = \ln|I - \theta W|$ is the logarithm of the determinant of the Jacobian.
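The log-likelihoods of Eqs. (8) and (9) are sums of R independent terms, which makes them simple to evaluate on the filtered series. The following sketch is an illustration under stated assumptions: the function names, the grid search used to maximize the likelihood and the simulated chain of 100 regions are hypothetical choices, not the estimation routine used in the paper.

```python
import numpy as np

def residuals(y_star, lam, delta, model):
    """Filtered innovations u*_r implied by Eq. (4) (SAR) or Eq. (5) (SMA)."""
    if model == "SAR":
        return (1.0 - delta * lam) * y_star
    return y_star / (1.0 - delta * lam)

def loglik_terms(y_star, lam, delta, sigma2, model):
    """Individual terms whose sum is Eq. (8) (SAR) or Eq. (9) (SMA)."""
    e = residuals(y_star, lam, delta, model)
    jac = np.log(1.0 - delta * lam)
    sign = 1.0 if model == "SAR" else -1.0   # the Jacobian enters with opposite signs
    return (-0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(sigma2)
            - e ** 2 / (2.0 * sigma2) + sign * jac)

def fit_ml(y_star, lam, model, n_grid=399):
    """Grid-search ML (assumed scheme); returns (log-likelihood, delta, sigma2)."""
    lo, hi = 1.0 / lam.min(), 1.0 / lam.max()
    best = None
    for delta in np.linspace(lo, hi, n_grid + 2)[1:-1]:
        # For a given delta, the ML estimate of sigma2 is the mean squared residual
        sigma2 = np.mean(residuals(y_star, lam, delta, model) ** 2)
        ll = loglik_terms(y_star, lam, delta, sigma2, model).sum()
        if best is None or ll > best[0]:
            best = (ll, delta, sigma2)
    return best

# Usage on a simulated SMA(1) series (hypothetical chain of 100 regions)
rng = np.random.default_rng(1)
R = 100
W = np.diag(np.ones(R - 1), 1) + np.diag(np.ones(R - 1), -1)
lam, Q = np.linalg.eigh(W)
u = rng.standard_normal(R)
y = u - 0.3 * (W @ u)            # Eq. (2)
y_star = Q.T @ y
for model in ("SAR", "SMA"):
    ll, delta, sigma2 = fit_ml(y_star, lam, model)
    print(model, round(ll, 3), round(delta, 3), round(sigma2, 3))
```

A grid search is used only to keep the sketch free of additional dependencies; a one-dimensional numerical optimizer over the same interval would serve equally well.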

4 Rivers and Vuong (2002) discuss the use of the Vuong test for GMM and non-maximum-likelihood estimators and models with weakly dependent heterogeneous data.


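The likelihood ratio, which is the basis for the test of Vuong, can be expressed as

$$
\begin{aligned}
LR_R(\varphi_1; \varphi_2) &= l(y^*|\varphi_1) - l(y^*|\varphi_2) \\
&= -\frac{R}{2} \ln \frac{\sigma_1^2}{\sigma_2^2} + \sum_{r=1}^{R} \ln\left[(1 - \rho\lambda_r)(1 - \theta\lambda_r)\right]
\end{aligned}
\tag{10}
$$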

which results in

$$
\frac{LR_R(\varphi_1; \varphi_2)}{\sqrt{R}} \xrightarrow{D} N(0;\ \omega_*^2)
\tag{11}
$$

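where

$$
\begin{aligned}
\omega_R^2 &= \frac{1}{R} \sum_{r=1}^{R} \left( \lg \frac{L_r(y_r^*|\varphi_1)}{L_r(y_r^*|\varphi_2)} \right)^2 - \left( \frac{1}{R} \sum_{r=1}^{R} \lg \frac{L_r(y_r^*|\varphi_1)}{L_r(y_r^*|\varphi_2)} \right)^2 \\
\Rightarrow\ \omega_R^2 &= \frac{1}{R} \sum_{r=1}^{R} \left( \lg \frac{L_r(y_r^*|\varphi_1)}{L_r(y_r^*|\varphi_2)} \right)^2 - \left( \frac{LR_R(\varphi_1, \varphi_2)}{R} \right)^2.
\end{aligned}
\tag{12}
$$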
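The quadratic terms of Eqs. (8) and (9) do not appear in Eq. (10). A plausible reading, assuming that (10) is evaluated at the ML estimates (the text does not say so explicitly), is that each quadratic term collapses to the same constant. The ML estimate of each variance is the mean of the squared filtered residuals, so that

```latex
\hat\sigma_1^2 = \frac{1}{R}\sum_{r=1}^{R}\left[(1-\hat\rho\lambda_r)\,y_r^*\right]^2
\;\Rightarrow\;
\sum_{r=1}^{R}\frac{\left[(1-\hat\rho\lambda_r)\,y_r^*\right]^2}{2\hat\sigma_1^2} = \frac{R}{2},
\qquad
\hat\sigma_2^2 = \frac{1}{R}\sum_{r=1}^{R}\left[(1-\hat\theta\lambda_r)^{-1}y_r^*\right]^2
\;\Rightarrow\;
\sum_{r=1}^{R}\frac{\left[(1-\hat\theta\lambda_r)^{-1}y_r^*\right]^2}{2\hat\sigma_2^2} = \frac{R}{2}.
```

Both terms equal $R/2$ and cancel in the difference $l(y^*|\hat\varphi_1) - l(y^*|\hat\varphi_2)$, together with the two $-\frac{R}{2}\ln 2\pi$ constants. What remains is the variance ratio and the sum of the two Jacobian terms, which enter (8) and (9) with opposite signs and therefore add up in (10).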

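Finally, the Clarke test is easily obtained by considering the following results:

$$
\left.
\begin{aligned}
l_r(y^*|\varphi_1) &= -\frac{\ln 2\pi}{2} - \frac{\ln \sigma_1^2}{2} - \frac{\left[(1 - \rho\lambda_r)\, y_r^*\right]^2}{2\sigma_1^2} + \ln(1 - \rho\lambda_r) \\
l_r(y^*|\varphi_2) &= -\frac{\ln 2\pi}{2} - \frac{\ln \sigma_2^2}{2} - \frac{\left[(1 - \theta\lambda_r)^{-1} y_r^*\right]^2}{2\sigma_2^2} - \ln(1 - \theta\lambda_r)
\end{aligned}
\right\}
\Rightarrow d_r(y^*|\varphi_1, \varphi_2) = l_r(y^*|\varphi_1) - l_r(y^*|\varphi_2).
\tag{13}
$$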

If B denotes the number of positive differences in the individual log-likelihoods, $d_r$, its distribution under the null hypothesis is Binomial(R; 0.5).
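Given the individual differences $d_r$ of Eq. (13), both statistics take a few lines. The sketch below is an assumption-laden illustration: the function names are hypothetical, the Vuong statistic is standardized with $\omega_R$ from Eq. (12) and compared with a standard normal, and the Clarke p-value is taken as an exact two-sided Binomial(R; 0.5) tail. The values in `d` are made up for the example.

```python
import math
import numpy as np

def vuong_test(d):
    """Standardized likelihood ratio of Eqs. (10)-(12) and its two-sided normal p-value."""
    d = np.asarray(d, dtype=float)
    R = d.size
    omega = math.sqrt(np.mean(d ** 2) - np.mean(d) ** 2)   # Eq. (12)
    z = d.sum() / (math.sqrt(R) * omega)                    # Eq. (11), scaled by omega
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

def clarke_test(d):
    """Number of positive differences B and an exact two-sided binomial p-value."""
    d = np.asarray(d, dtype=float)
    R = d.size
    B = int(np.sum(d > 0))
    k = min(B, R - B)
    tail = sum(math.comb(R, i) for i in range(k + 1)) / 2.0 ** R
    return B, min(1.0, 2.0 * tail)

# Hypothetical differences d_r = l_r(SAR) - l_r(SMA)
d = [0.8, 0.1, -0.3, 0.5, 0.2, -0.1, 0.4, 0.6]
z, p_vuong = vuong_test(d)
B, p_clarke = clarke_test(d)
print(z, p_vuong, B, p_clarke)
```

A positive and significant `z`, or a value of `B` well above R/2, points towards the SAR(1); the opposite pattern points towards the SMA(1), and a large p-value leaves the two models indistinguishable.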

Before leaving this discussion on non-nested models, it should be noted that the Vuong and Clarke tests may be related to the AIC and SBIC criteria. The AIC criterion selects the model that minimises:

$$
AIC = -2 \log\left[\text{Maximized likelihood}\right] + 2k
\tag{14}
$$

where k is the number of parameters of the model. In our case, k is the same for both models (k = 2), so the choice will depend exclusively on the value of the likelihood. The AIC and SBIC criteria will pick out the model with the highest likelihood, the same as the Vuong test, provided that there is enough distance between the models to discard the possibility that they are indifferent. However, this coincidence need not arise with the Variance criterion.


Nested models approach

The first question to consider now is the nesting structure that relates both specifications, SAR(1) and SMA(1). At least two options can be contemplated. The first follows on from the proposal of Huang (1984) in which a SARMA(1,1) model is adopted as the framework

$$
\text{SARMA}(1,1):\quad
\left.
\begin{aligned}
y &= \rho W y + u - \theta W u \\
u &\sim N(0, \sigma_3^2 I)
\end{aligned}
\right\}.
\tag{15}
$$

The problem is that model (15) is not identified unless different weighting matrices are specified in the SMA and SAR parts of the process. This option is not very reasonable from a strictly spatial point of view.

Haining (1978, 1979) resolves the nesting question by proposing a more restricted approach. Firstly, it is demonstrated that if Space is composed of a regular grid, the SAR(1) process appears to be nested in an SMA process of a large order. This result is the basis for developing a Likelihood Ratio test to compare both models. However, when the Space is of an irregular type, Haining's LR test is not applicable. The work of Mur (1999) follows the previous suggestion of Haining but introduces some elements of Huang (1984). The result is an identification scheme which is also applicable to an irregular Space. The basis of the approach is simple.

The SAR(1) process of (1) can be expressed as the following SMA(R) model:

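$$
\left.
\begin{aligned}
y &= \rho W y + u \\
u &\sim N(0, \sigma_1^2 I)
\end{aligned}
\right\}
\Leftrightarrow
\left\{
\begin{aligned}
y &= u + \theta_1 W u + \theta_2 W^2 u + \cdots + \theta_{R-1} W^{R-1} u + \theta_R W^R u \\
u &\sim N(0, \sigma_1^2 I)
\end{aligned}
\right.
\tag{16}
$$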

where $W^j = \prod_{i=1}^{j} W$ and $\theta_j$ are parameters. Similarly, the SMA(1) process of (2) can be represented as an SAR(R) model:

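$$
\left.
\begin{aligned}
y &= u - \theta W u \\
u &\sim N(0, \sigma_2^2 I)
\end{aligned}
\right\}
\Leftrightarrow
\left\{
\begin{aligned}
y &= u + \rho_1 W y + \rho_2 W^2 y + \cdots + \rho_{R-1} W^{R-1} y + \rho_R W^R y \\
u &\sim N(0, \sigma_2^2 I)
\end{aligned}
\right.
\tag{17}
$$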
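The two representations in (16) and (17) can be checked numerically. The sketch below rests on assumptions that are not stated in the text: the expansion coefficients are taken from the power series of the inverse, that is $\theta_j = \rho^j$ in (16) and $\rho_j = -\theta^j$ in (17); the series is truncated at an arbitrary order J rather than at R; and the eight-region chain and the value 0.3 are illustrative. The series converges only when the parameter times the largest eigenvalue of W is below one in absolute value.

```python
import numpy as np

R = 8
# Hypothetical layout: eight regions in a row
W = np.diag(np.ones(R - 1), 1) + np.diag(np.ones(R - 1), -1)
I = np.eye(R)
rho = theta = 0.3      # illustrative values (assumed)
J = 60                 # truncation order of the series (assumed)

powers = [np.linalg.matrix_power(W, j) for j in range(J + 1)]

# SAR(1) as a moving average: (I - rho W)^{-1} = sum_j rho^j W^j
ma_repr = sum(rho ** j * powers[j] for j in range(J + 1))
sar_exact = np.linalg.inv(I - rho * W)

# SMA(1) as an autoregression: y = u + sum_{j>=1} (-theta^j) W^j y
rng = np.random.default_rng(2)
u = rng.standard_normal(R)
y = u - theta * (W @ u)
ar_repr = u + sum(-(theta ** j) * (powers[j] @ y) for j in range(1, J + 1))

print(np.allclose(ma_repr, sar_exact), np.allclose(ar_repr, y))
```

Under these assumptions every coefficient of the expansion is a non-zero power of the original parameter, which is the property exploited by the zero restrictions discussed next.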

Comparing these expressions, it is clear that what differentiates a SAR(1) process from an SMA(1) process is that, in the moving average representation of the SAR(1), all the expansion coefficients associated with powers of the contiguity matrix equal to or greater than two are different from zero, whereas those of the SMA(1) are zero. Reciprocally, in the autoregressive representation of an SMA(1), all the expansion coefficients are different from zero, but in the case of a SAR(1) they are equal to zero, except for the first. This is the basis for proposing an identification strategy based on zero restrictions on the expansion coefficients, using a sequence of Lagrange Multipliers. The procedure can be summed up in the following two tests:


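$$
\left.
\begin{aligned}
H_0 &: \rho_2 = 0 \\
H_A &: \rho_2 \neq 0
\end{aligned}
\right\}
\qquad
\left.
\begin{aligned}
H_0 &: \theta_2 = 0 \\
H_A &: \theta_2 \neq 0
\end{aligned}
\right\}.
\tag{18}
$$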

For the first hypothesis (left panel), the testing equation will be

$$
y = \rho_1 W y + \rho_2 W^2 y + u
\tag{19}
$$

whereas for the second (right panel)

$$
y = u + \theta_1 W u + \theta_2 W^2 u.
\tag{20}
$$

If the series is an SAR(1), we should reject $H_0: \theta_2 = 0$ in (20) and not reject $H_0: \rho_2 = 0$ in (19). On the contrary, if the series is an SMA(1), then we should reject $H_0: \rho_2 = 0$ in (19) and not reject $H_0: \theta_2 = 0$ in (20). As is shown in Mur (1999), the LM statistic to test such hypotheses adopts a rather unpleasant expression:

$$
LM = \frac{\left( \dfrac{u' W^2 (I - \delta_1 W)^{-1} u}{\sigma^2} - \displaystyle\sum_{r=1}^{R} \dfrac{\lambda_r^2}{1 - \delta_1 \lambda_r} \right)^2}{2R\,(1 - \tau^2)\, V(x)} \ \underset{as}{\sim}\ \chi^2(1)
\tag{21}
$$

where $V(x)$ indicates the variance of $x_r = \frac{\lambda_r^2}{1 - \delta_1 \lambda_r}$; $\tau$ is the correlation coefficient between $x_r$ and $z_r = \frac{\lambda_r}{1 - \delta_1 \lambda_r}$; and $\delta_1$ will be replaced by $\rho_1$ when testing $\rho_2 = 0$ in Eq. (19), and by $\theta_1$ when testing $\theta_2 = 0$ in Eq. (20). This sequence only allows a positive identification when the results of the two tests are Non-Rejection/Rejection. The combination Non-Rejection/Non-Rejection means that both models, SMA(1) and SAR(1), are acceptable for the series, a common situation when the parameter of dependencies (be it autoregressive or moving average) takes a low value. The combination Rejection/Rejection discards both alternatives, which may be due to, for example, the Data Generating Process (DGP from now on) being a SARMA(1,1). In any case, the last two combinations do not allow a positive identification and the method is Non-Conclusive.
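The statistic of Eq. (21) and the decision rule attached to the pair of tests can be sketched as follows. This is a transcription under assumptions, not the routine of Mur (1999): it supposes a symmetric W, so that the quadratic form can be written with the filtered residuals $u^* = Q'u$; it takes $V(x)$ as the population variance; and the residuals, $\sigma^2$ and $\delta_1$ are supposed to come from an estimation under the null that is not shown here. The function names and the 5% critical value are choices made for the example.

```python
import numpy as np

CHI2_1_5PCT = 3.841   # 5% critical value of a chi-square with 1 degree of freedom

def lm_statistic(u_star, sigma2, delta1, lam):
    """Eq. (21) written with filtered residuals u* = Q'u (assumes W symmetric)."""
    x = lam ** 2 / (1.0 - delta1 * lam)
    z = lam / (1.0 - delta1 * lam)
    # u'W^2 (I - delta1 W)^{-1} u equals sum_r x_r u*_r^2 after filtering
    score = np.sum(x * u_star ** 2) / sigma2 - np.sum(x)
    tau = np.corrcoef(x, z)[0, 1]
    v_x = np.var(x)                     # population variance (assumed convention)
    return score ** 2 / (2.0 * lam.size * (1.0 - tau ** 2) * v_x)

def identify(lm_rho2, lm_theta2, crit=CHI2_1_5PCT):
    """Map the two test outcomes to SAR, SMA or NC (not conclusive)."""
    reject_rho2 = lm_rho2 > crit        # H0: rho2 = 0 in Eq. (19)
    reject_theta2 = lm_theta2 > crit    # H0: theta2 = 0 in Eq. (20)
    if reject_theta2 and not reject_rho2:
        return "SAR"
    if reject_rho2 and not reject_theta2:
        return "SMA"
    return "NC"

# Usage with hypothetical values of the two LM statistics
print(identify(lm_rho2=1.2, lm_theta2=7.5))
print(identify(lm_rho2=0.9, lm_theta2=1.4))
```

The second call illustrates the Non-Rejection/Non-Rejection combination, which the rule reports as not conclusive.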

Discriminating between spatial processes: some Monte Carlo evidence

In this section we present the results of a Monte Carlo exercise designed to analyse the performance of the selection methods presented in the section “Time and spatial series analysis: some unsettled questions”. Three of them pertain to the class of Non-Nested Models techniques (the tests of Vuong and Clarke plus the Variance criterion) whereas the fourth is representative of a more classical Nested Models perspective. Their behaviour appears satisfactory in an asymptotic context, although little is known about how they work in finite samples. Besides, to our knowledge, this is the first time that the selection tests (which perform well in time series) have been employed in a context of Spatial Series Analysis, where there are several peculiarities.

Premises of the simulation and additional information

The most relevant characteristics of the exercise are the following:

(i) The observations of the random term u were generated using a normal distribution with mean zero and unit variance. These observations were subsequently transformed into SAR(1) and SMA(1) series, using Eqs. (1) and (2), respectively.

(ii) We used three different sample sizes: 25, 100 and 225.

(iii) The contiguity matrix W was always of a binary type, using rook movements in a regular lattice system of (5 × 5), (10 × 10) or (15 × 15).

(iv) The range of admissible values for the parameters ρ and θ depends on the contiguity matrix and on the type of process used in the exercise. The concept of admissibility is a restriction that affects the range of acceptable values for the parameters ρ and θ. This condition arises from the Jacobian that appears in the likelihood functions of (8) and (9). This determinant must be positive, which implies that (Anselin 1988, p. 62):

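$$
|I - \delta W| = \prod_{r=1}^{R} (1 - \delta\lambda_r) > 0 \;\Rightarrow\; \left( \frac{1}{\lambda_M^-} \right) < \delta < \left( \frac{1}{\lambda_M^+} \right)
\tag{22}
$$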

where δ = ρ or θ, and $\lambda_M^-$ and $\lambda_M^+$ are the largest negative and positive eigenvalues of W (in absolute value). The type of process also intervenes in the discussion, adjusting the condition of (22) to the structure of the DGP. In our case, the ranges of admissible values are the following:

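| Type of process | Parameter | General condition | Weighting matrix (5 × 5) | Weighting matrix (10 × 10) | Weighting matrix (15 × 15) |
|---|---|---|---|---|---|
| SAR | ρ | $1/\lambda_M^- < \rho < 1/\lambda_M^+$ | (−0.274; 0.274) | (−0.248; 0.248) | (−0.204; 0.204) |
| SMA | θ | $-1/\lambda_M^+ < \theta < -1/\lambda_M^-$ | (−0.274; 0.274) | (−0.248; 0.248) | (−0.204; 0.204) |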

(v) In each case, 40 values of the parameter, distributed regularly over the whole interval, were simulated.

(vi) Each combination was repeated 1,000 times.
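The premises (iii) to (v) can be sketched in a few lines. The code below is an illustration under assumptions: `rook_weights`, `admissible_interval` and `parameter_grid` are hypothetical helpers, the bounds are the raw reciprocals of the extreme eigenvalues in Eq. (22), and the 40 values are placed at equally spaced interior points of the interval. The bounds computed this way need not coincide with the ranges reported in the table, which may reflect further adjustments that are not described in the text.

```python
import numpy as np

def rook_weights(n):
    """Binary rook-contiguity matrix for an n x n regular lattice (hypothetical helper)."""
    W = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            k = i * n + j
            if i + 1 < n:   # cell directly below
                W[k, k + n] = W[k + n, k] = 1.0
            if j + 1 < n:   # cell directly to the right
                W[k, k + 1] = W[k + 1, k] = 1.0
    return W

def admissible_interval(W):
    """Bounds of Eq. (22): reciprocals of the extreme eigenvalues of W."""
    lam = np.linalg.eigvalsh(W)      # W is symmetric, so the eigenvalues are real
    return 1.0 / lam.min(), 1.0 / lam.max()

def parameter_grid(W, n_values=40):
    """Equally spaced interior points of the admissible interval (assumed layout)."""
    lo, hi = admissible_interval(W)
    return np.linspace(lo, hi, n_values + 2)[1:-1]

for n in (5, 10, 15):
    W = rook_weights(n)
    lo, hi = admissible_interval(W)
    grid = parameter_grid(W)
    print(n * n, round(lo, 3), round(hi, 3), grid.size)
```

Each value of the grid would then be combined with 1,000 draws of u to generate the SAR(1) and SMA(1) series of premise (i).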

The simulations gave us a large amount of information, which we arrange according to two cases of interest. In the first, the data were generated with the SAR(1) model and, in the decision problem, we have to decide between the SAR(1), which is the true model, and the SMA(1) model. In the second, the data were generated with the SMA(1) model.

Furthermore, depending on the use made of the instruments, we may follow two different approaches to the selection problem. In the first scheme, we speak of the Testing Approach, given that we apply the selection tests of Vuong and of Clarke as they have been presented: the null hypothesis means that the two models are not distinguishable, according to the information available, and the alternative is two-sided. The Lagrange Multipliers of (21) are used as explained in the section “Time and spatial series analysis: some unsettled questions”. In both cases we may reach a Non-Conclusive situation: non-rejection of the null hypothesis in the selection tests, and the combinations Rejection/Rejection and Non-Rejection/Non-Rejection in the case of the Lagrange Multipliers. In these cases we cannot adopt a decision, so the problem remains unresolved. Obviously, in this scheme, the Variance criterion has no role.

In the second scheme, which we call the Criterion Approach, we use the same techniques, including the Variance criterion, but disconnected from their probabilistic structure. That is, if the Vuong statistic (obtained as the standardized difference between the log-likelihood of the SAR(1) model and that corresponding to the SMA(1) model) is negative, we identify the SMA(1) model, and if it is positive, we identify the SAR(1). In the case of Clarke, we choose the model to which the highest number of differences between the individual log-likelihoods corresponds. We replicate this strategy in terms of the Lagrange Multipliers: if the LM statistic for testing $H_0: \theta_2 = 0$ in Eq. (20) is greater than the LM statistic for testing $H_0: \rho_2 = 0$ in Eq. (19), an SAR(1) model is identified. Otherwise, an SMA(1) model is chosen.
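The decision rules of the Criterion Approach reduce to sign and size comparisons. A minimal sketch follows; the function name and the treatment of ties, which are resolved in favour of the SMA(1) here, are assumptions, since the text does not say how ties are handled.

```python
def criterion_approach(vuong_stat, d, lm_rho2, lm_theta2):
    """Pick a model with each instrument, ignoring significance levels.

    d holds the individual differences l_r(SAR) - l_r(SMA) of Eq. (13).
    Ties are assigned to the SMA(1), which is an assumption of this sketch.
    """
    positives = sum(1 for v in d if v > 0)
    negatives = sum(1 for v in d if v < 0)
    return {
        "VU": "SAR" if vuong_stat > 0 else "SMA",
        "CL": "SAR" if positives > negatives else "SMA",
        "LM": "SAR" if lm_theta2 > lm_rho2 else "SMA",
    }

# Usage with hypothetical values
print(criterion_approach(1.2, [0.3, -0.1, 0.4], lm_rho2=0.5, lm_theta2=6.0))
```

Because every instrument always returns one of the two models, this scheme produces no inconclusive outcomes.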

Figures 1 and 2 present the results obtained for the first case (Testing Approach), whereas in Table 1 we include a summary of a selection of cases. In Fig. 1 the data have been generated using an SAR(1) model, whereas in Fig. 2 we used an SMA(1). The data we present in each figure indicate the number of times, in percentages, that the corresponding method chooses one of the two models. Figures 3 and 4, together with Table 2, have the same structure but are associated with the Criterion Approach. The change of focus means that there are no situations of indetermination. In the figures, VU(SAR) means the Vuong test as the method and SAR as the selected model, CL indicates the Clarke test and LM is reserved for the Lagrange Multipliers; finally, NC means that the corresponding test is not conclusive (the figure legends label this outcome INC).

Main results of interest

As can be seen from Figs. 1, 2, 3 and 4, the tests display sharp differences depending on the sample size, the process used to generate the data and, of course, the type of statistic. The main conclusions can be summarised as follows:

• There is an evident sample size effect that underlies the behaviour of all the methods examined. Their reliability improves substantially when the size of


Fig. 1 Testing approach. DGP: SAR. Panels: (a) R=25, (b) R=100, (c) R=225. Vertical axis from 0.00 to 1.00; horizontal axis from −0.27 to 0.27 in (a), −0.25 to 0.25 in (b) and −0.20 to 0.20 in (c). Series: VU(SAR), CL(SAR), LM(SAR), VU(SMA), CL(SMA), LM(SMA), VU(INC), CL(INC), LM(INC)


Fig. 2 Testing approach. DGP: SMA. Panels: (a) R=25, (b) R=100, (c) R=225. Vertical axis from 0.00 to 1.00; horizontal axis from −0.27 to 0.27 in (a), −0.25 to 0.25 in (b) and −0.20 to 0.20 in (c). Series: VU(SAR), CL(SAR), LM(SAR), VU(SMA), CL(SMA), LM(SMA), VU(INC), CL(INC), LM(INC)


Table 1 Distribution of decisions. Testing approach

R=25, DGP=SAR

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | VU (NC) | CL (NC) | LM (NC) |
|---|---|---|---|---|---|---|---|---|---|
| −0.20 | 9.4 | 33.2 | 25.2 | 2.7 | 0.1 | 0.4 | 87.9 | 66.7 | 74.4 |
| −0.15 | 2.0 | 17.5 | 7.5 | 2.9 | 0.2 | 0.8 | 95.1 | 82.3 | 91.7 |
| −0.10 | 0.9 | 12.5 | 3.0 | 2.3 | 0.4 | 1.1 | 96.8 | 87.1 | 95.9 |
| −0.05 | 0.2 | 7.8 | 1.1 | 1.0 | 0.2 | 1.0 | 98.8 | 92.0 | 97.9 |
| 0.05 | 0.0 | 8.3 | 1.0 | 1.5 | 0.3 | 1.1 | 98.5 | 91.4 | 97.9 |
| 0.10 | 0.5 | 12.8 | 2.8 | 2.4 | 0.1 | 1.2 | 97.1 | 87.1 | 96.0 |
| 0.15 | 1.8 | 18.2 | 6.7 | 3.2 | 0.2 | 0.8 | 95.0 | 81.6 | 92.5 |
| 0.20 | 8.4 | 30.9 | 23.0 | 2.6 | 0.2 | 1.0 | 89.0 | 68.9 | 76.0 |

R=25, DGP=SMA

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | VU (NC) | CL (NC) | LM (NC) |
|---|---|---|---|---|---|---|---|---|---|
| −0.20 | 0.5 | 9.3 | 2.4 | 13.8 | 0.4 | 5.7 | 85.7 | 90.3 | 91.9 |
| −0.15 | 0.4 | 10.0 | 2.3 | 5.1 | 0.2 | 1.3 | 94.5 | 89.8 | 96.4 |
| −0.10 | 0.3 | 9.1 | 1.8 | 2.4 | 0.0 | 1.2 | 97.3 | 90.9 | 97.0 |
| −0.05 | 0.0 | 8.4 | 1.1 | 1.1 | 0.1 | 0.9 | 98.9 | 91.5 | 98.0 |
| 0.05 | 0.0 | 8.3 | 1.6 | 1.1 | 0.2 | 0.6 | 98.9 | 91.5 | 97.8 |
| 0.10 | 0.2 | 10.1 | 2.9 | 2.9 | 0.2 | 1.5 | 96.9 | 89.7 | 95.6 |
| 0.15 | 0.9 | 9.5 | 3.7 | 7.1 | 0.2 | 1.9 | 92.0 | 90.3 | 94.4 |
| 0.20 | 0.8 | 8.1 | 3.5 | 14.3 | 0.5 | 5.8 | 84.9 | 91.4 | 90.7 |

R=100, DGP=SAR

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | VU (NC) | CL (NC) | LM (NC) |
|---|---|---|---|---|---|---|---|---|---|
| −0.20 | 31.2 | 66.7 | 70.6 | 0.5 | 0.0 | 1.3 | 68.3 | 33.3 | 28.1 |
| −0.15 | 10.3 | 45.9 | 35.9 | 1.1 | 0.0 | 1.9 | 88.6 | 54.1 | 62.2 |
| −0.10 | 1.4 | 28.8 | 9.7 | 0.9 | 0.0 | 1.7 | 97.7 | 71.2 | 88.6 |
| −0.05 | 0.2 | 18.7 | 2.0 | 0.1 | 0.0 | 1.2 | 99.7 | 81.3 | 96.8 |
| 0.05 | 0.1 | 17.3 | 2.2 | 0.2 | 0.1 | 1.1 | 99.7 | 82.6 | 96.7 |
| 0.10 | 1.3 | 29.2 | 8.9 | 1.3 | 0.0 | 2.4 | 97.4 | 70.8 | 88.7 |
| 0.15 | 8.4 | 44.6 | 35.2 | 0.9 | 0.0 | 1.9 | 90.7 | 55.4 | 62.9 |
| 0.20 | 29.4 | 67.6 | 70.9 | 0.5 | 0.0 | 1.1 | 70.1 | 32.4 | 28.0 |

R=100, DGP=SMA

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | VU (NC) | CL (NC) | LM (NC) |
|---|---|---|---|---|---|---|---|---|---|
| −0.20 | 0.2 | 5.9 | 1.6 | 44.2 | 0.6 | 49.7 | 55.6 | 93.5 | 48.7 |
| −0.15 | 0.2 | 9.3 | 2.8 | 18.5 | 0.2 | 22.5 | 81.3 | 90.5 | 74.7 |
| −0.10 | 0.2 | 16.0 | 2.6 | 2.8 | 0.0 | 5.5 | 97.0 | 84.0 | 91.9 |
| −0.05 | 0.0 | 16.2 | 1.6 | 0.0 | 0.2 | 1.6 | 100.0 | 83.6 | 96.8 |
| 0.05 | 0.0 | 14.7 | 1.7 | 0.5 | 0.3 | 2.2 | 99.5 | 85.0 | 96.1 |
| 0.10 | 0.2 | 15.8 | 2.3 | 4.4 | 0.1 | 6.6 | 95.4 | 84.1 | 91.1 |
| 0.15 | 0.1 | 9.1 | 2.9 | 18.4 | 0.3 | 24.7 | 81.5 | 90.6 | 72.4 |
| 0.20 | 0.1 | 5.5 | 2.0 | 46.1 | 0.9 | 49.9 | 53.8 | 93.6 | 48.1 |

R=225, DGP=SAR

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | VU (NC) | CL (NC) | LM (NC) |
|---|---|---|---|---|---|---|---|---|---|
| −0.20 | 76.0 | 97.5 | 95.5 | 0.0 | 0.0 | 0.0 | 24.0 | 2.5 | 4.5 |
| −0.15 | 21.5 | 78.5 | 64.5 | 0.5 | 0.0 | 1.5 | 78.0 | 21.5 | 34.0 |
| −0.10 | 4.5 | 62.0 | 13.5 | 0.5 | 0.0 | 1.0 | 95.0 | 38.0 | 85.5 |
| −0.05 | 0.0 | 53.0 | 2.5 | 0.5 | 0.0 | 1.0 | 99.5 | 47.0 | 96.5 |
| 0.05 | 0.0 | 45.5 | 1.5 | 0.0 | 0.0 | 1.0 | 100.0 | 54.5 | 97.5 |
| 0.10 | 3.5 | 59.0 | 15.5 | 1.5 | 0.0 | 3.0 | 95.0 | 41.0 | 81.5 |
| 0.15 | 18.0 | 82.0 | 63.0 | 0.5 | 0.0 | 3.0 | 81.5 | 18.0 | 34.0 |
| 0.20 | 77.5 | 96.5 | 93.0 | 0.0 | 0.0 | 0.0 | 22.5 | 3.5 | 7.0 |

R=225, DGP=SMA

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | VU (NC) | CL (NC) | LM (NC) |
|---|---|---|---|---|---|---|---|---|---|
| −0.20 | 0.0 | 5.0 | 0.0 | 86.5 | 2.0 | 93.5 | 13.5 | 93.0 | 6.5 |
| −0.15 | 0.0 | 16.0 | 1.5 | 37.5 | 0.0 | 53.5 | 62.5 | 84.0 | 45.0 |
| −0.10 | 0.0 | 33.5 | 1.5 | 6.0 | 0.0 | 9.5 | 94.0 | 66.5 | 89.0 |
| −0.05 | 0.0 | 45.0 | 1.0 | 0.0 | 0.0 | 1.0 | 100.0 | 55.0 | 98.0 |
| 0.05 | 1.0 | 46.0 | 2.0 | 0.0 | 0.5 | 1.5 | 99.0 | 53.5 | 96.5 |
| 0.10 | 1.5 | 33.0 | 3.0 | 5.5 | 0.0 | 10.5 | 93.0 | 67.0 | 86.5 |
| 0.15 | 0.5 | 16.5 | 3.0 | 35.5 | 0.5 | 50.0 | 64.0 | 83.0 | 47.0 |
| 0.20 | 0.0 | 4.0 | 0.0 | 86.0 | 0.5 | 92.5 | 14.0 | 95.5 | 7.5 |

Percentage of times that each criterion selects the decision that appears in brackets at the heading of the column. The first column of each block (italic entries in the original layout) defines the DGP


Fig. 3 Criteria approach. DGP: SAR. Panels: (a) R=25, (b) R=100, (c) R=225. Vertical axis from 0.00 to 1.00; horizontal axis from −0.27 to 0.27 in (a), −0.25 to 0.25 in (b) and −0.20 to 0.20 in (c). Series: VU(SAR), CL(SAR), LM(SAR), CV(SAR), VU(SMA), CL(SMA), LM(SMA), CV(SMA)


Fig. 4 Criteria approach. DGP: SMA. Panels: (a) R=25, (b) R=100, (c) R=225. Vertical axis from 0.00 to 1.00; horizontal axis from −0.27 to 0.27 in (a), −0.25 to 0.25 in (b) and −0.20 to 0.20 in (c). Series: VU(SAR), CL(SAR), LM(SAR), CV(SAR), VU(SMA), CL(SMA), LM(SMA), CV(SMA)

the sample increases from 25 to 100 observations. The leap from 100 to 225 observations introduces additional improvements, although they are of minor importance.


Table 2 Distribution of decisions. Criteria Approach

R=25, DGP=SAR

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | CV (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | CV (SMA) |
|---|---|---|---|---|---|---|---|---|
| −0.20 | 64.9 | 89.5 | 67.7 | 68.6 | 35.1 | 10.5 | 32.2 | 31.4 |
| −0.15 | 50.0 | 82.8 | 55.6 | 56.5 | 50.0 | 17.2 | 44.4 | 43.5 |
| −0.10 | 45.0 | 78.0 | 51.8 | 50.4 | 55.0 | 22.0 | 48.1 | 49.5 |
| −0.05 | 40.5 | 75.5 | 48.9 | 45.0 | 59.5 | 24.5 | 50.8 | 55.0 |
| 0.05 | 40.8 | 75.9 | 47.4 | 45.8 | 59.2 | 24.0 | 52.2 | 54.2 |
| 0.10 | 45.4 | 78.3 | 52.6 | 50.2 | 54.6 | 21.7 | 47.3 | 49.8 |
| 0.15 | 50.1 | 81.0 | 54.2 | 56.8 | 49.9 | 19.0 | 45.7 | 43.2 |
| 0.20 | 64.8 | 85.9 | 67.2 | 68.9 | 35.2 | 14.1 | 32.8 | 31.1 |

R=25, DGP=SMA

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | CV (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | CV (SMA) |
|---|---|---|---|---|---|---|---|---|
| −0.20 | 18.2 | 66.0 | 30.0 | 33.4 | 81.8 | 34.0 | 70.0 | 66.6 |
| −0.15 | 29.5 | 72.1 | 41.1 | 38.2 | 70.5 | 27.9 | 58.9 | 61.8 |
| −0.10 | 35.7 | 74.3 | 43.5 | 41.7 | 64.3 | 25.7 | 56.4 | 58.3 |
| −0.05 | 38.3 | 74.8 | 44.8 | 43.9 | 61.7 | 25.2 | 55.0 | 56.1 |
| 0.05 | 39.0 | 73.5 | 46.9 | 43.2 | 61.0 | 26.5 | 52.9 | 56.8 |
| 0.10 | 35.8 | 73.4 | 45.1 | 39.6 | 64.2 | 26.6 | 54.8 | 60.4 |
| 0.15 | 29.3 | 70.7 | 40.3 | 35.9 | 70.7 | 29.3 | 59.4 | 64.1 |
| 0.20 | 17.8 | 62.8 | 28.2 | 30.4 | 82.2 | 37.2 | 71.8 | 69.6 |

R=100, DGP=SAR

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | CV (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | CV (SMA) |
|---|---|---|---|---|---|---|---|---|
| −0.20 | 90.0 | 97.7 | 90.6 | 89.7 | 10.0 | 1.4 | 9.4 | 10.3 |
| −0.15 | 75.8 | 94.8 | 77.2 | 75.8 | 24.2 | 3.0 | 22.8 | 24.2 |
| −0.10 | 59.8 | 90.8 | 61.6 | 60.1 | 40.2 | 6.6 | 38.4 | 39.9 |
| −0.05 | 50.2 | 87.2 | 53.4 | 50.6 | 49.8 | 8.2 | 46.3 | 49.4 |
| 0.05 | 50.1 | 87.9 | 51.2 | 50.1 | 49.9 | 7.8 | 48.5 | 49.9 |
| 0.10 | 60.0 | 89.5 | 60.0 | 61.0 | 40.0 | 6.7 | 40.0 | 39.0 |
| 0.15 | 75.4 | 94.7 | 76.3 | 73.7 | 24.6 | 3.6 | 23.7 | 26.3 |
| 0.20 | 94.2 | 99.0 | 94.9 | 91.7 | 5.8 | 0.7 | 5.1 | 8.3 |

R=100, DGP=SMA

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | CV (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | CV (SMA) |
|---|---|---|---|---|---|---|---|---|
| −0.20 | 3.2 | 57.2 | 5.2 | 17.3 | 96.8 | 35.7 | 94.8 | 82.7 |
| −0.15 | 16.2 | 72.0 | 20.8 | 26.0 | 83.8 | 21.3 | 79.2 | 74.0 |
| −0.10 | 31.5 | 84.7 | 38.6 | 35.6 | 68.5 | 11.4 | 61.4 | 64.4 |
| −0.05 | 41.3 | 86.5 | 48.6 | 41.7 | 58.7 | 10.0 | 51.2 | 58.3 |
| 0.05 | 41.2 | 84.3 | 45.7 | 42.0 | 58.8 | 10.6 | 53.8 | 58.0 |
| 0.10 | 32.0 | 82.5 | 37.5 | 36.6 | 68.0 | 11.8 | 62.4 | 63.4 |
| 0.15 | 16.7 | 72.2 | 20.1 | 26.6 | 83.3 | 20.3 | 79.9 | 73.4 |
| 0.20 | 3.3 | 57.1 | 6.5 | 18.1 | 96.7 | 36.2 | 93.5 | 81.9 |

R=225, DGP=SAR

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | CV (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | CV (SMA) |
|---|---|---|---|---|---|---|---|---|
| −0.20 | 99.0 | 100.0 | 99.0 | 99.0 | 1.0 | 0.0 | 1.0 | 1.0 |
| −0.15 | 86.5 | 99.0 | 89.0 | 87.0 | 13.5 | 1.0 | 11.0 | 13.0 |
| −0.10 | 70.5 | 98.5 | 69.5 | 69.5 | 29.5 | 1.5 | 30.5 | 30.5 |
| −0.05 | 56.5 | 96.5 | 55.0 | 54.0 | 43.5 | 3.5 | 45.0 | 46.0 |
| 0.05 | 56.5 | 95.0 | 53.5 | 53.5 | 43.5 | 5.0 | 46.5 | 46.5 |
| 0.10 | 71.5 | 98.0 | 71.5 | 68.5 | 28.5 | 2.0 | 28.5 | 31.5 |
| 0.15 | 86.5 | 98.5 | 88.0 | 85.5 | 13.5 | 1.5 | 12.0 | 14.5 |
| 0.20 | 99.0 | 99.5 | 99.5 | 98.5 | 1.0 | 0.5 | 0.5 | 1.5 |

R=225, DGP=SMA

| Parameter | VU (SAR) | CL (SAR) | LM (SAR) | CV (SAR) | VU (SMA) | CL (SMA) | LM (SMA) | CV (SMA) |
|---|---|---|---|---|---|---|---|---|
| −0.20 | 0.5 | 64.5 | 0.5 | 9.5 | 99.5 | 35.5 | 99.5 | 90.5 |
| −0.15 | 9.5 | 83.0 | 10.0 | 21.5 | 90.5 | 17.0 | 90.0 | 78.5 |
| −0.10 | 25.0 | 93.0 | 26.5 | 35.5 | 75.0 | 7.0 | 73.5 | 64.5 |
| −0.05 | 41.0 | 96.5 | 46.5 | 46.0 | 59.0 | 3.5 | 53.5 | 54.0 |
| 0.05 | 41.5 | 98.0 | 50.5 | 42.5 | 58.5 | 2.0 | 49.0 | 57.5 |
| 0.10 | 25.0 | 93.5 | 28.0 | 31.5 | 75.0 | 6.5 | 72.0 | 68.5 |
| 0.15 | 9.0 | 89.5 | 10.5 | 18.5 | 91.0 | 10.5 | 89.5 | 81.5 |
| 0.20 | 1.0 | 64.0 | 1.0 | 9.0 | 99.0 | 36.0 | 99.0 | 91.0 |

Percentage of times that each criterion selects the decision that appears in brackets at the heading of the column. The first column of each block (italic entries in the original layout) defines the DGP


• The intensity of the autocorrelation, be it in the SAR or in the SMA case, is an even more determining factor than the sample size. The behaviour of the different methods is very poor when the symptoms of dependence are weak, but improves as we approach the extremes of the range of values admissible for the parameter.

• Methods of discrimination seem biased towards the SAR model. In other words, the SAR model is better captured than the SMA model. This implies that the regions of wrong and/or non-conclusive decisions are wider when the data follow an SMA process. The method with the worst behaviour in this respect is the Clarke test, since it systematically favours the SAR model whatever the DGP. This makes the Clarke test unreliable for this specific part of exploratory spatial analysis.

• For the Testing Approach, the LM sequence outperforms the Vuong test in almost all cases. The exception is when the DGP is an SMA model and the sample size is very small.

• Under the Criteria Approach, the Vuong test, the LM test and the Variance criterion perform very similarly when the data are obtained from an SAR process. However, when data are generated by an SMA model, the best criterion is the Vuong test, closely followed by the LM test, with the Variance criterion in third position.

To sum up, we could state that if we are interested in keeping the probability of making a wrong selection to a minimum, it seems advisable to use the Vuong test, first in a Testing Approach and, if we cannot reject the null of indifference, to apply the same statistic but disconnected from its probabilistic structure in a Criteria Approach. This is our main conclusion from the exercise.

An application to the case of the European regional income

The aim of this section is to illustrate the model selection dilemma discussed previously, using the well-known case of the regional distribution of European income in the year 2002. Our data come from the REGIO databank of EUROSTAT and the sample includes a total of 1,274 regions corresponding to the NUTS III division of the 25 member States of the European Union plus the two candidates that will join next (EU27), Bulgaria and Romania (see footnote 5). In this application we are going to focus our attention on the regional per capita gross domestic product, measured in purchasing power standards (PPS). Throughout the exercise we will work in logarithms, and in deviations from the sampling average, to smooth the discrepancies.

Figure 5 shows how per capita income was distributed among the European regions in 2002. The regions with lower incomes were located in the periphery of the regional system in a sweep that includes the South of Spain and Italy, Greece, all the countries of Eastern Europe including the German part and some areas of Finland and Scotland. On the other hand, the regions with the

5 A deeper description of the data may be found in Mur, López and Angulo (2006).


Fig. 5 Box map of the European regional per capita income: year 2002

highest incomes were grouped at the centre of the continent, an area made up of the North of Italy, Austria, Sweden, Denmark, Holland, the South of the United Kingdom and the North of Spain. Apparently, we are faced with a typical centre-periphery structure to which the literature has alluded on many occasions (see Ertur and Koch 2006, for a recent review). Besides, there is a high number of lower outliers (83, 6.5% of the sample) situated in the East of the EU27 (the main nucleus is formed by Bulgarian and Romanian regions together with others from Estonia, Latvia, Lithuania and the East of Poland). Only 1% of European regions (14 in total) fall into the category of upper outlier and these are found dispersed in the central part of the continent.

Along with the data on income, we must specify one or several contiguity matrices to reflect the structure of cross-sectional relationships. Due to the heterogeneity that characterises the regions, we have specified them using the following mixed neighbourhood criterion, which combines distance between the centroids with the k-nearest-neighbours approach:

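$$
w_{ij}(k,r)=
\begin{cases}
\text{if } \min\limits_{i\neq s}\{d_{is}\} > k \;\Rightarrow\;
\begin{cases}
w_{ij}(k,r)=1 & \text{if } j\in N_r(i)\\
w_{ij}(k,r)=0 & \text{if } j\notin N_r(i)
\end{cases}\\[2ex]
\text{if } \min\limits_{i\neq s}\{d_{is}\} \le k \;\Rightarrow\;
\begin{cases}
w_{ij}(k,r)=1 & \text{if } d_{ij}\le k\\
w_{ij}(k,r)=0 & \text{if } d_{ij}> k
\end{cases}
\end{cases}
\tag{23}
$$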

where $d_{ij}$ is the distance in kilometres between the centroids of regions i and j and $N_r(i)$ is the set of the r regions closest to region i. As usual, $w_{ii} = 0$ for


Table 3 ML estimation of the SAR and the SMA model

| | SAR | SMA |
|---|---|---|
| ρ, θ | 0.025* (116.362) | 0.096* (37.872) |
| σ² | 0.188 | 0.174 |
| Log-likelihood | −753.268 | −633.543 |

The t-ratios are given in brackets. An asterisk indicates that the estimator is significant at 5%

all i. In this paper we will present the results obtained for the combination (see footnote 6) of k = 100 and r = 2, W(100,2). The total number of contacts included in matrix W is 19,176, with an average of 15.1 contacts per region and a minimum and maximum by rows of 2 and 52, respectively. Lastly, the largest positive and negative eigenvalues are 40.272 and −7.976, so the admissible range of values for the autocorrelation parameter, bounded by the reciprocals of these eigenvalues (1/(−7.976) ≈ −0.125 and 1/40.272 ≈ 0.025), is (−0.125; 0.025) in an SAR process and (−0.025; 0.125) in the SMA case.
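The mixed neighbourhood rule of (23) and the admissible ranges quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, distances are taken as Euclidean distances between planar centroid coordinates (an assumption, since the paper works with kilometres between centroids), and the eigenvalues are supplied by the user rather than computed.

```python
import math
from typing import List, Sequence, Tuple


def mixed_weights(coords: Sequence[Tuple[float, float]], k: float, r: int) -> List[List[int]]:
    """Binary weights: distance band k, falling back to the r nearest regions."""
    n = len(coords)
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    w = [[0] * n for _ in range(n)]  # the diagonal stays at zero
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = min(dist[i][j] for j in others)
        if nearest > k:
            # No region lies within k: link region i to its r nearest regions.
            neighbours = sorted(others, key=lambda j: dist[i][j])[:r]
        else:
            # At least one region lies within k: link to every region within k.
            neighbours = [j for j in others if dist[i][j] <= k]
        for j in neighbours:
            w[i][j] = 1
    return w


def admissible_ranges(lambda_max: float, lambda_min: float):
    """Ranges as reported in the text: reciprocals of the extreme eigenvalues."""
    sar = (1.0 / lambda_min, 1.0 / lambda_max)
    sma = (-1.0 / lambda_max, -1.0 / lambda_min)  # mirror image of the SAR range
    return sar, sma
```

With the eigenvalues reported for W(100,2), `admissible_ranges(40.272, -7.976)` gives, after rounding to three decimals, (−0.125, 0.025) for the SAR case and (−0.025, 0.125) for the SMA case. Note that a matrix built this way need not be symmetric, because the nearest-neighbour fallback is applied row by row.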

Returning to the case of income, Fig. 5 suggests graphically that there is strong autocorrelation in the spatial distribution of this variable. Moran's I test corroborates this point statistically: the standardized statistic is 6.797, highly significant, pointing to the existence of a structure of positive correlation. The next step must consist of discussing which type of process, SAR or SMA, best fits the data. The estimation of both models appears in Table 3. As can be observed, both spatial parameters are significant, although the value of the log-likelihood function for the SMA model is higher.

Nevertheless, in order to take a decision consistently we need more information, such as that included in Table 4, where we replicate the selection criteria presented in the previous section.

The first conclusion we would like to highlight is that there have been no significant discrepancies between the different techniques. It is clear from the results shown in Table 4 that the spatial moving average model is selected in all the cases. The only exception corresponds to the Lagrange Multiplier test used in a testing approach: in this case the procedure leads us to a Non-Conclusive solution.

Finally, the last result points to an even more important conclusion: it clearly shows that an ad hoc selection of the SAR model would be a wrong decision in the case of, for example, European regional income, even though this process has been systematically chosen to introduce spatial autocorrelation in the applied literature. In our view, the selection of an SMA structure may be more acceptable since, in the end, it is quite reasonable that the shocks produced in a region will have an impact upon its immediate neighbours (there is spatial autocorrelation), but this incidence should decline with distance until it finally disappears (in a process showing an SMA-like pattern).

6 Overall, the results of the application are quite robust to the specification of the weighting matrix.


Table 4 Results for the selection techniques for the European regional income

| Technique | Statistic | Value (c) | Testing approach: selected model | Criteria approach: selected model |
|---|---|---|---|---|
| Variance criterion | σ²(SAR) | 0.133 | – | SMA |
| | σ²(SMA) | 0.125 | | |
| Vuong test (a) | | −3.57* | SMA | SMA |
| Clarke test (b) | B: positive differences | 415* | SMA | SMA |
| LM | H0 : ρ2 = 0 | 289.34* | Non-Conclusive | SMA |
| | H0 : θ2 = 0 | 62.12* | | |

(a) Differences in the individual log-likelihoods and the likelihood ratio have been calculated by subtracting the SMA results from the SAR ones
(b) Differences in the individual log-likelihoods have been calculated by subtracting the SMA results from the SAR ones
(c) An asterisk indicates that the statistic is significant at 5%
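As an illustration of how the Clarke entry of Table 4 can be read, the following is a back-of-the-envelope check rather than the authors' computation. It assumes that the test uses all n = 1,274 regions and replaces the exact Binomial(n; 0.5) distribution of Appendix B by its normal approximation; both are assumptions of this sketch.

```latex
E(B) = \frac{n}{2} = 637, \qquad
\sqrt{V(B)} = \frac{\sqrt{n}}{2} = \frac{\sqrt{1274}}{2} \approx 17.85, \qquad
z = \frac{415 - 637}{17.85} \approx -12.4
```

With only 415 positive differences (SAR minus SMA) out of 1,274, the count lies far below n/2, which is consistent with the SMA model being selected under both approaches.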

Conclusions

The goal of this paper was to provide further evidence on the procedures to identify the stochastic structure that underlies a spatial series. More precisely, we tried to answer one specific question: how to discriminate between an SAR(1) and an SMA(1) process. These models are habitually specified to introduce spatial dependency mechanisms but they are hardly distinguishable from a purely statistical perspective, in spite of being very different in their economic implications. We tried to settle this discussion through a Monte Carlo exercise. Consequently, it is important to recognize the restrictions of our analysis. In the first place, we should mention the small number of alternatives considered (only two and without exogenous variables in the equation; the work of Florax et al. 2003 goes deeply into this line). In the second place, the assumption of symmetry in the weighting matrix is important, especially in applied work. Finally, all the other simplifying suppositions such as normality, linearity, etc., which are necessary to maintain the dimensions of the work within reasonable limits, were also assumed in our study.

Now it is time to answer the question we have been dealing with throughout this work: which method seems to be more suitable for choosing between the two processes, SAR(1) or SMA(1)? Following the Criteria Approach, the power of all the tests increases but the probability of selecting the wrong model also increases. Hence, it seems more advisable to develop a two-stage procedure. First, the Testing Approach should be employed. If we can reach a satisfactory solution at this stage, the procedure finishes with the model selected. If a non-conclusive solution prevails in the first step, this can equally be the end of the discussion: both processes are acceptable (and, in our own experience, the differences between the estimations obtained from the two models will be really small). However, if we persist in selecting only one process, we should carry out a second analysis adopting a Criteria Approach.
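The two-stage procedure just described can be sketched for the Vuong statistic. This is a hypothetical helper, not part of the paper: the function name, the 5% two-sided normal critical value of 1.96 and the `force_choice` switch are assumptions introduced for illustration, and the statistic is taken to be the SAR-minus-SMA version used throughout.

```python
from typing import Tuple


def two_stage_vuong(vuong_stat: float,
                    critical_value: float = 1.96,
                    force_choice: bool = True) -> Tuple[str, str]:
    """Return (selected model, stage at which the decision was taken)."""
    # Stage 1, Testing Approach: decide only if indifference is rejected.
    if abs(vuong_stat) > critical_value:
        return ("SAR" if vuong_stat > 0 else "SMA", "testing")
    # A non-conclusive outcome may be accepted as the end of the discussion.
    if not force_choice:
        return ("NC", "testing")
    # Stage 2, Criteria Approach: use the sign of the same statistic.
    if vuong_stat == 0:
        return ("NC", "criteria")  # assumption: an exact zero stays undecided
    return ("SAR" if vuong_stat > 0 else "SMA", "criteria")
```

For the statistic reported in the application, `two_stage_vuong(-3.57)` returns `("SMA", "testing")`, so the second stage is not needed there.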


The next point is which method should be used at each stage. We can say that, although the LM and the Vuong proposals guarantee a correct decision at the extremes of the stability interval, the test of Vuong seems to be preferable in both situations because, without being inferior in other aspects, it is more sensitive to the symptoms of a moving average in the DGP.

Appendix A: The Cox and Vuong tests

In the Cox test we have two families of conditional density functions, $f_\theta = \{f_{Y|Z}(\theta);\ \theta \in \Theta \subset \mathbb{R}^p\}$ and $g_\gamma = \{g_{Y|Z}(\gamma);\ \gamma \in \Gamma \subset \mathbb{R}^q\}$, and we want to test one against the other. The null hypothesis corresponds to one of the families while the other is the alternative ($H_0 : f_\theta$ vs. $H_A : g_\gamma$). The test statistic is a centred and standardized version of the Likelihood Ratio:

$$
C_0(f) = \frac{LR_n(\hat\theta_n;\hat\gamma_n) - E_Z E_f\left(LR_n(\hat\theta_n;\hat\gamma_n)\right)}{\sqrt{V\left[LR_n(\hat\theta_n;\hat\gamma_n) - E_Z E_f\left(LR_n(\hat\theta_n;\hat\gamma_n)\right)\right]}} \overset{as}{\sim} N(0,1) \tag{A.1}
$$

where $\hat\theta_n$ and $\hat\gamma_n$ are the respective ML estimations of $\theta$ and $\gamma$ and $LR_n$ is the Likelihood Ratio ($LR_n(\hat\theta_n;\hat\gamma_n) = L_f(\hat\theta_n) - L_g(\hat\gamma_n)$). Next, the test should be repeated taking the other density function, $g_\gamma$, in the null hypothesis. The problem with this reasoning is that the model of the alternative hypothesis only has power to reject the model of the null hypothesis, which is a potential source of conflict.

Vuong (1989) uses the structure of the test of Cox, but redefines the content of the null and alternative hypotheses. Now, the former is associated with a situation of indifference between the models, while the alternative is bilateral and identifies the model most favoured by the data. In analytical terms:

$$
\begin{aligned}
H_0 &: E_0\left[\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] = 0\\[1ex]
H_A &: \begin{cases}
H_f : E_0\left[\lg\dfrac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] > 0\\[2ex]
H_g : E_0\left[\lg\dfrac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] < 0
\end{cases}
\end{aligned}
\tag{A.2}
$$

This change of perspective allows us to obtain additional results of convergence in probability and in distribution with respect to the $LR_n$ statistic of (A.1). The most important are summarised in the following expression:

$$
\begin{aligned}
&\sqrt{n}\left\{\frac{LR_n(\hat\theta_n;\hat\gamma_n)}{n} - E_0\left[\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right]\right\} \xrightarrow{D} N(0;\omega^2)\\[1ex]
&\omega^2 = V_0\left[\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] = E_0\left[\left(\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right)^2\right] - \left(E_0\left[\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right]\right)^2
\end{aligned}
\tag{A.3}
$$


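Finally, Theorem 5.1 of Vuong (1989, p. 318) states that ‘. . . if $F_\theta$ and $G_\gamma$ are strictly non-nested, then

(i) under $H_0$ : $n^{-1/2} LR_n[\hat\theta_n;\hat\gamma_n]/\hat\omega_n \xrightarrow{D} N(0;1)$

(ii) under $H_f$ : $n^{-1/2} LR_n[\hat\theta_n;\hat\gamma_n]/\hat\omega_n \xrightarrow{as} +\infty$

(iii) under $H_g$ : $n^{-1/2} LR_n[\hat\theta_n;\hat\gamma_n]/\hat\omega_n \xrightarrow{as} -\infty$ (. . .)’.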

Given that the two models are different, the acceptance of the null should be interpreted as implying that the available evidence does not permit us to discriminate between the two alternatives. With slight variations, the test can also be used in the case where the models are nested or overlapping.
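A minimal sketch of the statistic in Theorem 5.1 may help to fix ideas. It assumes that the individual log-likelihoods of the two fitted models are already available as two sequences of equal length, and it estimates ω² by the sample analogue of (A.3); the function names and the default 5% two-sided critical value of 1.96 are assumptions of this illustration, not taken from the paper.

```python
import math
from typing import Sequence


def vuong_statistic(loglik_f: Sequence[float], loglik_g: Sequence[float]) -> float:
    """n^(-1/2) * LR_n / omega_n from individual log-likelihoods of f and g."""
    if len(loglik_f) != len(loglik_g) or not loglik_f:
        raise ValueError("need two non-empty sequences of equal length")
    d = [a - b for a, b in zip(loglik_f, loglik_g)]
    n = len(d)
    lr = sum(d)  # likelihood ratio, L_f minus L_g
    mean = lr / n
    omega2 = sum(x * x for x in d) / n - mean ** 2  # sample version of (A.3)
    if omega2 <= 0:
        raise ValueError("the differences have zero variance")
    return lr / (math.sqrt(n) * math.sqrt(omega2))


def vuong_decision(stat: float, critical_value: float = 1.96) -> str:
    """Three-way outcome: 'f', 'g', or 'NC' when indifference is not rejected."""
    if stat > critical_value:
        return "f"
    if stat < -critical_value:
        return "g"
    return "NC"
```

In the setting of the paper, f would play the role of the SAR(1) model and g that of the SMA(1) model, so that a significantly negative statistic points to the moving average.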

Appendix B: Clarke’s test

The framework for the test of Clarke (2003) is the following:

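$$
\begin{aligned}
H_0 &: \operatorname{Median}\left[\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] = 0 \;\Rightarrow\; H_0 : \Pr\left[\lg\frac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)} > 0\right] = 0.5\\[1ex]
H_A &: \begin{cases}
H_f : \operatorname{Median}\left[\lg\dfrac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] > 0\\[2ex]
H_g : \operatorname{Median}\left[\lg\dfrac{f(Y_t|Z_t;\theta)}{g(Y_t|Z_t;\gamma)}\right] < 0
\end{cases}
\end{aligned}
\tag{B.1}
$$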

The solution is particularly simple. It is only necessary: (i) to estimate the model corresponding to the family $f_\theta = \{f_{Y|Z}(\theta);\ \theta \in \Theta \subset \mathbb{R}^p\}$, retaining the individual log-likelihoods $\{f_t = \lg f(Y_t|Z_t;\hat\theta_n);\ t = 1, 2, \ldots, n\}$; (ii) to estimate the model corresponding to the family $g_\gamma = \{g_{Y|Z}(\gamma);\ \gamma \in \Gamma \subset \mathbb{R}^q\}$, retaining the individual log-likelihoods $\{g_t = \lg g(Y_t|Z_t;\hat\gamma_n);\ t = 1, 2, \ldots, n\}$; (iii) to obtain the differences between the individual log-likelihoods, $\{d_t = f_t - g_t;\ t = 1, 2, \ldots, n\}$; (iv) calling B the number of positive differences in the last series, its distribution under the null hypothesis of (B.1) is a Binomial(n; 0.5).
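Steps (iii) and (iv) above can be sketched as follows, assuming the individual log-likelihoods from steps (i) and (ii) are already at hand. The function name, the 5% level and the way the two-sided p-value is formed (twice the smaller tail of the symmetric binomial, capped at one) are assumptions of this illustration rather than details given in the paper.

```python
from math import comb
from typing import Sequence, Tuple


def clarke_test(loglik_f: Sequence[float],
                loglik_g: Sequence[float],
                alpha: float = 0.05) -> Tuple[int, float, str]:
    """Return (B, two-sided p-value, decision) for the sign test of (B.1)."""
    if len(loglik_f) != len(loglik_g) or not loglik_f:
        raise ValueError("need two non-empty sequences of equal length")
    d = [a - b for a, b in zip(loglik_f, loglik_g)]
    n = len(d)
    b = sum(1 for x in d if x > 0)  # number of positive differences
    # Exact tail probability of Binomial(n, 0.5), using integer arithmetic.
    tail = min(b, n - b)
    p_value = min(1.0, 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n)
    if p_value >= alpha:
        return b, p_value, "NC"
    return b, p_value, ("f" if b > n / 2 else "g")
```

As in the Vuong sketch, f would stand for the SAR(1) model and g for the SMA(1) model; dropping the p-value and looking only at whether B exceeds n/2 gives the corresponding Criteria Approach rule.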

Acknowledgments We thank the project SECJ2006-02328/ECON of the Ministerio de Educación y Ciencia of the Kingdom of Spain for financial support.

References

Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov B, Csáki F (eds) International Symposium on Information Theory: 267–281. Akadémiai Kiadó, Budapest. Reprinted in Kotz S, Johnson N (eds) Breakthroughs in Statistics, vol I, Springer, Berlin Heidelberg New York, pp. 599–624


Anselin L (1984) Specification tests on the structure of interaction in spatial econometric models. Papers Reg Sci Assoc 54:165–182

Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht

Anselin L (1995) Local indicators of spatial association. Geograph Anal 27:93–115

Anselin L (2003) Spatial externalities, spatial multipliers and spatial econometrics. Int Reg Sci Rev 26:153–166

Anselin L, Bao S (1997) Exploratory spatial data analysis linking spacestat and arcView. In: Fisher M, Getis A (eds) Recent developments in spatial analysis. Springer, Berlin Heidelberg New York

Anselin L, Kelejian H (1997) Testing for spatial error autocorrelation in the presence of endogenous regressors. Int Reg Sci Rev 20:153–182

Anselin L, Moreno R (2003) Properties of tests for spatial error components. Reg Sci Urban Econ 33:595–618

Baltagi B, Bresson G, Pirotte A (2006) Panel unit root tests and spatial dependence. Working Paper, Department of Economics. Syracuse University

Beron K, Hansen Y, Murdoch J, Thayer M (2004) Hedonic price functions and spatial dependence. Implications for the Demand of Urban Air Quality. In: Anselin L, Florax R, Rey S (eds) Advances in spatial econometrics. Methodology, Tools and Applications. Springer, Berlin Heidelberg New York, pp 267–282

Brett C, Pinkse J (1997) Those taxes are all over the map! A test for spatial dependence of municipal tax rates in British Columbia. Int Reg Sci Rev 20:131–151

Brockwell P, Davies R (2003) Introduction to time series and forecasting. Springer, Berlin Heidelberg New York

Burridge P (1980) On the Cliff-Ord test for spatial correlation. J R Stat Soc B 42:107–108

Clarke K (2003) Nonparametric model discrimination in international relations. J Conf Resolut 47:72–93

Clarke K (2004) A simple distribution-free test for nonnested hypothesis. Working paper, Department of Political Science. University of Rochester

Cliff A, Ord J (1981) Spatial processes, Models and Applications. Pion, London

Cox D (1961) Tests of separate families of hypotheses. Proc Fourth Berkeley Symp Math Stat Prob 1:105–123

Cressie N (1991) Statistics for spatial data. Wiley, New York

Davidson R, MacKinnon J (1981) Several tests for model specification in the presence of alternative hypotheses. Econometrica 88:781–793

De Graaff T, Florax R, Nijkamp P, Reggiani A (2001) A general misspecification test for spatial regression models: dependence, heterogeneity, and nonlinearity. J Reg Sci 41:255–276

Ertur C, Koch W (2006) Regional disparities in the European Union and the enlargement process. An exploratory spatial data analysis, 1995–2000. Ann Reg Sci (forthcoming)

Fisher G, MacAleer M (1979) On the interpretation of the Cox tests in Econometrics. Econ Lett 4:145–150

Florax R, Folmer H, Rey S (2003) Specification searches in spatial econometrics: the Relevance of Hendry’s Methodology. Reg Sci Urban Econ 33:557–579

Haining R (1977) Model specification in stationary random fields. Geograph Anal 9:107–129

Haining R (1978) The moving average model for spatial interaction. Trans Pap Inst British Geog New Ser 3:202–225

Haining R (1979) Statistical tests and process generators for random field models. Geograph Anal 11:45–64

Haining R (2003) Spatial data analysis. Theory and Practice. Cambridge University Press, Cambridge

Hepple L (1999) Bayesian techniques in spatial and network econometrics: 1. Model Comparison and Posterior Odds. Environ Plan A 27:447–469

Huang JS (1984) The autoregressive moving average model for spatial analysis. Australian J Stat 26:169–178

Kelejian H, Robinson D (1992) Spatial autocorrelation: a new computationally simple test with application to per capita county police expenditure. Reg Sci Urban Econ 22:317–331

Lahiri S (2003) Central limit theorems for weighted sums of a spatial process under a class of stochastic and fixed designs. Sankhya 65:365–388


Lavine M, Schervish M (1999) Bayes factors: what they are and what they are not. American Stat 53:119–122

Le Gallo J, Beaumont C, Dall’Erba S (2005) On the property of diffusion in the spatial error model. Appl Econ Lett 12:533–536

Lee L (2002) Consistency and efficiency of least squares estimation for mixed regressive, Spatial autoregressive models. Econ Theory 18:252–277

Lesage J, Pace K (eds) (2004) Spatial and spatiotemporal econometrics. Elsevier, Amsterdam

Moran P (1948) The interpretation of statistical maps. J R Stat Soc B 10:243–251

Moreno R, Vayá E (2000) Técnicas econométricas para el tratamiento de datos espaciales: La econometría espacial. Editions Universitat de Barcelona, Barcelona

Moreno R, Paci R, Usai S (2005) Spatial spillovers and innovation activity in European regions. Environ Plan A 37:1793–1812

Mur J (1999) Testing for spatial autocorrelation: moving average versus autoregressive processes. Environ Plan A 31:1371–1382

Mur J, López F, Angulo A (2006) Symptoms of instability in models of spatial dependence. An application to the European case. Working Paper, Department of Economic Analysis, University of Zaragoza

Ord J, Getis A (1995) Local spatial autocorrelation statistics: distributional issues and an application. Geograph Anal 27:286–306

Rey S, Dev B (2004) σ-Convergence in the presence of spatial effects. Working Paper, Department of Geography. San Diego State University

Rivers D, Vuong Q (2002) Model Selection Tests for Nonlinear Dynamic Models. Econ J 5:1–39

Schwarz C (1978) Estimating the dimension of a model. Ann Stat 6:461–464

Smith T (1980) A central limit theorem for spatial samples. Geograph Anal 12:299–324

Sneek J, Rietveld P (1997) On the estimation of the moving average model. Tinberger Institute Discussion Paper 97049/4

Trivez F, Mur J (2004) Some proposals for discriminating between spatial processes. In: Getis A, Mur J, Zoller H (eds) Spatial econometrics and spatial statistics. Palgrave Macmillan Publishers, Basingstoke, pp. 150–175

Upton G, Fingleton B (1985) Spatial data analysis by example. Point pattern and quantitative data. Wiley, New York

Vuong Q (1989) Likelihood ratio-tests for model selection and non-nested hypotheses. Econometrica 57:307–333