THE EFFECTS OF DISAGGREGATION ON FORECASTING NONSTATIONARY I(1) TIME SERIES


Pilar Poncela and Antonio Garcia-Ferrer.

Dept. Analisis Economico: Economia Cuantitativa, Universidad Autonoma de Madrid (SPAIN)

May 21, 2005

Abstract: This paper focuses on the effects of disaggregation on forecast accuracy for nonstationary time series using unobserved components models. Both unrelated and common trends are considered for I(1) processes. Although the basic theoretical results are known for stationary vector ARMA time series, the possibility of cointegration, or equivalently the presence of common trends, brings a new dimension to this problem. Unobserved components models make it possible to model the trends explicitly. We study the presence of common trends among several components using a multivariate dynamic factor model. We also analyze whether the common information can help in forecasting the aggregate. Alternatively, if the dissimilarities among the components are strong, a disaggregated approach will be preferred. The results are applied to quarterly Gross Domestic Product (GDP) data of several European countries of the euro area and to their aggregated GDP. Economic policy decisions are based on macroeconomic variables both at the country level and at the aggregate level. We make forecasts at the country level and then pool them to obtain the forecast of the aggregate. This is compared to the prediction obtained directly from modeling and forecasting the aggregated GDP of these European countries.

Keywords: aggregation, common trends, factor analysis, forecast

Acknowledgment: Both authors acknowledge financial support from the Spanish Ministry of Science and Technology, project number BEC2002-00081. The first author also acknowledges financial support from Comunidad Autonoma de Madrid, project number 06/HSE/0016/2004.


1 Introduction

The effects of disaggregation on forecasting an aggregate have been known for more than two decades. They have been studied for ARMA processes from both theoretical and empirical points of view. From a theoretical point of view, Tiao and Guttman (1980) provide conditions that guarantee no gain in forecast efficiency (measured by the minimum mean squared error, MMSE) when employing the component series rather than the aggregate, where the aggregate is just the sum of the component series. Wei and Abraham (1981) compare the forecast efficiency based on the aggregate series, the univariate component series and the joint multiple time series. Kohn (1982) analyzes the equality of forecasts between the aggregated and disaggregated approaches when the aggregate is any linear combination of the observed series. Later, Lutkepohl (1984, 1985) establishes the equality of forecasts when we consider any set of linear combinations of the observed time series. More recently, Clark (2000) examines the problem of forecasting an aggregate of cointegrated disaggregates, and establishes conditions under which forecasts of an aggregate variable obtained from a disaggregate Vector Error Correction Model (VECM, from now on) will be equal to those from an aggregate univariate time series model. Interestingly, when the underlying disaggregated model takes the form of a VECM, an additional condition, which refers to the error correction term, has to be added to Kohn's (1982) results in order to apply them. Although the empirical counterpart of this issue has shown mixed, case-dependent results, recent results from Zellner and Tobias (2000) for GDP growth rates and from Marcellino et al. (2003) for several European aggregates indicate that, in general, "it pays to disaggregate".

In this paper we will use unobserved components models (UCM, from now on) to study the issue of disaggregation in nonstationary time series. This allows us to model the trends explicitly and to contemplate the case of common trends. We will derive some results where the forecasts based on the aggregated data are equal to those based on the disaggregated data. As a by-product, we have also obtained that, when the weights for the aggregation are given by a cointegrating vector, the pooled forecasts from the dynamic factor model attain the MMSE. Finally, we will apply our results to the case of the GDPs of several countries belonging to the European Community and to their aggregated GDP. Forecasting European aggregates is of increasing interest, since European economic policy decisions are increasingly taken at the aggregate level.

The article is organized as follows. In section 2 we analyze the unrelated trends case. In section 3 we analyze the case of common trends. In section 4 we introduce the one factor model. In section 5 we present the empirical results found for the aggregate GDP of a group of five countries in the euro area of the European Community. Finally, in section 6 we conclude.


2 The unrelated trends model

In this section, we present a multiple time series UCM for non-seasonal nonstationary time series that are not cointegrated. Let $y_t$ be an $m$-dimensional vector of observed time series. Let $z_t = w'y_t$, where $w = (w_1, ..., w_m)'$, be the univariate aggregated series that we want to forecast. In some cases $\sum_{j=1}^{m} w_j = 1$, while in others $w = \mathbf{1} = (1, ..., 1)'$, although the results derived are valid for any linear combination of the series. We will use three forecasting procedures: 1) a univariate UCM for the aggregated series; 2) a univariate UCM for each of the components of the vector time series, whose forecasts are then aggregated; and 3) a multiple UCM for the component series, whose forecasts are then aggregated.

Suppose that each component of the time series vector $y_t$ is generated by a different trend plus noise. Later, we will allow for the presence of common trends. Then

$$y_t = T_t + e_t, \qquad (1)$$

where $T_t = (T_{1,t}, T_{2,t}, ..., T_{m,t})'$ is the $m \times 1$ vector of trends, which collects all the fluctuations associated with the low frequencies and which, in a broad sense, we could call the trend-cycle component, and $e_t$ is multivariate white noise $(0, \Sigma_e)$. We will assume that each component of the trend vector, $T_{j,t}$, is given by

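$$x_{j,t} = F_j x_{j,t-1} + G_j a_{j,t} \qquad (2)$$

where $x_{j,t} = [T_{j,t} \;\; d_{j,t}]'$, with $T_{j,t}$ and $d_{j,t}$ the level and slope of the trend, which evolve according to

$$F_j = \begin{bmatrix} \alpha_j & \beta_j \\ 0 & \gamma_j \end{bmatrix}, \qquad G_j = \begin{bmatrix} \delta_j & 0 \\ 0 & 1 \end{bmatrix}, \qquad (3)$$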

and $a_{j,t} = [\eta_{j,t} \;\; \xi_{j,t}]'$ are uncorrelated white noise processes with variance $\Sigma_{a_j} = \mathrm{diag}(\sigma^2_{\eta_j}, \sigma^2_{\xi_j})$. All the noises $a_{j,t}$, $j = 1, ..., m$, are mutually uncorrelated, so $\Sigma_\eta = \mathrm{diag}(\Sigma_{a_j})$, $j = 1, ..., m$; they are also uncorrelated with the noise $e_t$ of equation (1). This model includes the following well-known formulations for the trend (see, for instance, Young, 1984, and Harvey, 1989): random walk (RW: $\alpha_j = \delta_j = 1$; $\beta_j = \gamma_j = 0$); smooth random walk (SRW: $0 < \alpha_j < 1$; $\beta_j = \gamma_j = 1$; $\delta_j = 0$); integrated random walk (IRW: $\alpha_j = 1$; $\beta_j = \gamma_j = 1$; $\delta_j = 0$); local linear trend (LLT: $\alpha_j = \beta_j = \gamma_j = \delta_j = 1$); and damped trend (DT: $\alpha_j = \beta_j = \delta_j = 1$; $0 < \gamma_j < 1$).

If the noise variance-covariance matrix $\Sigma_e$ is diagonal, there are no cross-correlations among the components of $y_t$, and the univariate models for the components are straightforward to obtain.

Since we are interested in forecasting real GDP and this variable is usually modelled as integrated of order one, that is I(1), we will now assume that all the trends are random walks. The simple model that results is able to capture the dominant term in forecasting a trending variable. In this case $\alpha_j = \delta_j = 1$ and $\beta_j = \gamma_j = 0$, $\forall j = 1, ..., m$, and $\alpha = I_m$, $\gamma = 0_m$, $\eta_t = (\eta_{1,t}, ..., \eta_{m,t})'$. Notice that $\Sigma_e$ represents the short run noise variance, while $\Sigma_\eta$ represents the variance of the noise associated with the long run.
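The trend formulations listed above differ only in the entries of $F_j$ and $G_j$ in (3). The following minimal Python sketch builds these matrices for each named trend and simulates the state recursion (2). The function names, the default value of the free parameter (`free=0.9`) and the simulated shocks are illustrative assumptions, not part of the paper.

```python
import numpy as np

def trend_params(kind, free=0.9):
    """Return (alpha, beta, gamma, delta) for a named trend, as listed in the text.

    `free` is a hypothetical value for the one unrestricted parameter:
    alpha for the SRW and gamma for the DT.
    """
    table = {
        "RW":  (1.0, 0.0, 0.0, 1.0),
        "SRW": (free, 1.0, 1.0, 0.0),
        "IRW": (1.0, 1.0, 1.0, 0.0),
        "LLT": (1.0, 1.0, 1.0, 1.0),
        "DT":  (1.0, 1.0, free, 1.0),
    }
    return table[kind]

def system_matrices(kind, free=0.9):
    """Build F_j and G_j of equation (3)."""
    alpha, beta, gamma, delta = trend_params(kind, free)
    F = np.array([[alpha, beta], [0.0, gamma]])
    G = np.array([[delta, 0.0], [0.0, 1.0]])
    return F, G

def simulate_trend(kind, shocks, x0=(0.0, 0.0), free=0.9):
    """Iterate x_t = F x_{t-1} + G a_t, with a_t = [eta_t, xi_t].

    `shocks` is an (n, 2) array; the result is an (n, 2) array of [T_t, d_t].
    """
    F, G = system_matrices(kind, free)
    x = np.asarray(x0, dtype=float)
    path = np.empty((len(shocks), 2))
    for t, a in enumerate(shocks):
        x = F @ x + G @ a
        path[t] = x
    return path

rng = np.random.default_rng(0)
shocks = rng.normal(size=(200, 2))      # simulated [eta_t, xi_t], unit variances
rw = simulate_trend("RW", shocks)
irw = simulate_trend("IRW", shocks)

# In the RW case the level is the running sum of the eta shocks.
print(np.allclose(rw[:, 0], np.cumsum(shocks[:, 0])))
```

With zero initial conditions, the RW level accumulates the level shocks only, while the IRW level accumulates the slope, so its second difference equals the lagged slope shock.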


2.1 Univariate model for the aggregate

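Let us now derive the model for the univariate aggregated series $z_t = w'y_t$. From (1),

$$z_t = \sum_{j=1}^{m} w_j (T_{j,t} + e_{j,t}) = \bar{T}_t + \bar{e}_t,$$

where $\bar{e}_t = \sum_{j=1}^{m} w_j e_{j,t} = w'e_t$ is the new white noise series, whose variance is $\mathrm{var}(\bar{e}_t) = w'\Sigma_e w$, and $\bar{T}_t = \sum_{j=1}^{m} w_j T_{j,t} = w'T_t$ is the new trend for the aggregate, which is given by

$$\bar{T}_t = \sum_{j=1}^{m} w_j (T_{j,t-1} + \eta_{j,t}) = \bar{T}_{t-1} + \bar{\eta}_t,$$

where $\bar{\eta}_t = \sum_{j=1}^{m} w_j \eta_{j,t} = w'\eta_t$ is the new white noise process associated with the long run, whose variance is given by $\mathrm{var}(\bar{\eta}_t) = \sum_{j=1}^{m} w_j^2 \sigma^2_{\eta_j} = w'\Sigma_\eta w$.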

Since all trends are random walks, the univariate trend for the aggregate is also a random walk, obtained as a weighted sum of the component trends. If the trends for the components are not of the same kind, the trend for the aggregate can be of a different type as well. For instance, assume $m = 2$ and that we model the trends for the components as a RW and an IRW, respectively (then $\alpha_1 = \alpha_2 = \beta_2 = \gamma_2 = \delta_1 = 1$ and $\beta_1 = \gamma_1 = \delta_2 = 0$). It is easy to check that the trend for the aggregate is a new type of I(2) trend given by

$$\begin{bmatrix} \bar{T}_t \\ d_{2,t} \end{bmatrix} = \begin{bmatrix} 1 & w_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \bar{T}_{t-1} \\ d_{2,t-1} \end{bmatrix} + \begin{bmatrix} w_1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \eta_{1,t} \\ \xi_{2,t} \end{bmatrix},$$

which adds to the level at time $t$ a fraction $w_2$ of the random walk process followed by the slope $d_{2,t}$.
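The two-component example can be checked numerically. The sketch below simulates a RW trend and an IRW trend, aggregates them directly, and compares the result with the recursion for the aggregate trend given above. The weights, sample size and seed are hypothetical choices, and both trends are started at zero for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
w1, w2 = 0.4, 0.6              # hypothetical aggregation weights
eta1 = rng.normal(size=n)      # level shocks of the RW component
xi2 = rng.normal(size=n)       # slope shocks of the IRW component

# Component trends, started at zero.
T1 = np.zeros(n)               # RW:  T1_t = T1_{t-1} + eta1_t
T2 = np.zeros(n)               # IRW: T2_t = T2_{t-1} + d2_{t-1}
d2 = np.zeros(n)               #      d2_t = d2_{t-1} + xi2_t
for t in range(1, n):
    T1[t] = T1[t - 1] + eta1[t]
    T2[t] = T2[t - 1] + d2[t - 1]
    d2[t] = d2[t - 1] + xi2[t]
direct = w1 * T1 + w2 * T2     # aggregate trend computed from the components

# Same aggregate trend from the 2x2 recursion with state [T_bar_t, d2_t].
F = np.array([[1.0, w2], [0.0, 1.0]])
G = np.array([[w1, 0.0], [0.0, 1.0]])
state = np.zeros(2)
recursive = np.zeros(n)
for t in range(1, n):
    state = F @ state + G @ np.array([eta1[t], xi2[t]])
    recursive[t] = state[0]

print(np.allclose(direct, recursive))
```

The two paths coincide because the aggregate level inherits the level shock of the RW component scaled by `w1` and the lagged slope of the IRW component scaled by `w2`.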

2.2 Forecasting results

Let $z_t(h)$ denote the $h$-step-ahead forecast of the aggregated series based on the univariate model for the aggregate; $z^u_t(h) = \sum_{j=1}^{m} w_j y^u_{j,t}(h)$ the $h$-step-ahead forecast of the aggregated series based on the univariate forecasts for the components, $y^u_{j,t}(h)$; and $z^m_t(h) = \sum_{j=1}^{m} w_j y_{j,t}(h)$ the $h$-step-ahead forecast of the aggregate series based on the forecasts for the components, $y_{j,t}(h)$, coming from the multiple time series model. We will compare these three forecasts: $z_t(h)$, $z^u_t(h)$ and $z^m_t(h)$. As the UCM is equivalent to an ARIMA model, and the trending behaviour of all series can be removed simply by differencing, the results given in this section are the ones derived by several authors in the eighties. We just have to translate them in terms of the UCM parameters. When the aggregate is just the sum of the component series, Tiao and Guttman (1980) provide conditions for the sum to be forecast as efficiently from the disaggregated approach as by modelling and forecasting the aggregated series directly. Later, Kohn (1982) generalizes these results to any linear combination of the time series. Wei and Abraham (1981) give similar results for stationary series based on Hilbert spaces. Lutkepohl (1984) generalizes the previous results by considering the three types of forecasts, $z_t(h)$, $z^u_t(h)$ and $z^m_t(h)$.

We will briefly review the results by Lutkepohl (1984, theorem 2). Assume that the series can be made stationary through regular differencing. Let the MA representation of the multiple time series $y_t$ be $(1 - L)^d y_t = M(L)u_t$, the MA representation of the aggregated series $z_t$ be $(1 - L)^d z_t = k(L)\bar{u}_t$, and the univariate MA representation of the $i$-th component series $y_{i,t}$ be $(1 - L)^d y_{i,t} = n_i(L)u_{i,t}$, $i = 1, 2, ..., m$. Then, the three forecasting methods are related as follows:

(i) $z^m_t(h) = z_t(h)$, $h = 1, 2, ...$, if and only if $w'M(L) = k(L)w'$;

(ii) $z^u_t(h) = z_t(h)$, $h = 1, 2, ...$, if and only if $w'N(L) = k(L)w'$, where $N(L) = \mathrm{diag}(n_1(L), ..., n_m(L))$;

(iii) if $M^{-1}(L)$ and $N^{-1}(L)$ exist, then $z^m_t(h) = z^u_t(h)$, $h = 1, 2, ...$, if and only if $w'M^{-1}(L) = w'N^{-1}(L)$.

Conditions (i) to (iii) are based both on the weighting vector and on the dynamics of the series. For instance, assume that $y_t$ follows a stationary ($d = 0$) vector MA(1) process and let $M(L) = I - M_1 L$. Therefore, the weighted series $z_t$ is also MA(1). Let $k(L) = 1 - k_1 L$. Then, by (i), $z^m_t(h) = z_t(h)$ if and only if the weighting vector $w$ is a left eigenvector of $M_1$, that is $w'M_1 = k_1 w'$, with associated eigenvalue $k_1$. Kohn (1982) argues that if we want (i) to hold for every linear combination of the observed series (that is, for every vector $w$), then $M(L)$ has to be of the form $a(L)I_m$, where $a(L)$ is scalar. In this latter case all series in $y_t$ will have the same MA(1) parameter and $M_1 = \mathrm{diag}(k_1, ..., k_1)$. Since these conditions do not seem to be easily fulfilled, the equality of forecasts seems to be the exception rather than the rule. Of course, these are theoretical results based on known processes, which is not the case when we analyze real data. Lutkepohl (1985) analyzes how estimation affects the previous results through several Monte Carlo simulations.

In order to check the forecasting results in our case, we first derive the reduced form for the multiple time series as well as for the aggregate, and then check conditions (i) to (iii).

The VARMA model for the component series in $y_t$ given by (1), with all trends given by random walks, is an IMA(1,1):

$$(1 - L)y_t = \eta_t + (1 - L)e_t = (I - \Theta L)u_t,$$

where $\mathrm{var}(u_t) = \Sigma_u$ and $\Theta$ are the solutions of the second moment equations

$$\Sigma_u + \Theta\Sigma_u\Theta' = \Sigma_\eta + 2\Sigma_e, \qquad \Theta\Sigma_u = \Sigma_e. \qquad (4)$$


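If, in addition to $\Sigma_\eta$, $\Sigma_e$ is also diagonal, then $\Theta = \mathrm{diag}(\theta_1, ..., \theta_m)$ and $\Sigma_u = \mathrm{diag}(\sigma_1^2, ..., \sigma_m^2)$, where the invertible solution is given by (see, for instance, Harvey, 1989, page 68)

$$\theta_j = \frac{(q_j + 2) - \sqrt{q_j^2 + 4q_j}}{2}, \qquad \sigma_j^2 = \frac{2\sigma^2_{e_j}}{(q_j + 2) - \sqrt{q_j^2 + 4q_j}}, \qquad (5)$$

where

$$q_j = NVR_j = \frac{\sigma^2_{\eta_j}}{\sigma^2_{e_j}} \qquad (6)$$

is known as the noise-variance ratio in the UCM literature (see, for instance, Young, 1984).

The reduced form of the aggregate is given by

$$(1 - L)z_t = (1 - \theta L)\bar{u}_t,$$

where $\mathrm{var}(\bar{u}_t) = \sigma^2$, and in the case of $\Sigma_e$ diagonal it is straightforward to show that

$$\theta = \frac{(q + 2) - \sqrt{q^2 + 4q}}{2}, \qquad \sigma^2 = \frac{2\sigma_e^2}{(q + 2) - \sqrt{q^2 + 4q}}, \qquad (7a)$$

with $\sigma_e^2 = w'\Sigma_e w$ and

$$q = \frac{w'\Sigma_\eta w}{w'\Sigma_e w}. \qquad (8)$$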
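As a quick numerical check of (5) to (8), the following Python sketch computes the MA(1) parameter and the innovation variance from a noise-variance ratio and verifies that they satisfy the scalar version of the moment equations (4). The variance values and function names are hypothetical; only the formulas come from the text.

```python
import math
import numpy as np

def ma1_from_nvr(q, sigma2_e):
    """Invertible MA(1) parameter and innovation variance, as in (5) and (7a)."""
    root = math.sqrt(q * q + 4.0 * q)
    theta = ((q + 2.0) - root) / 2.0
    sigma2_u = 2.0 * sigma2_e / ((q + 2.0) - root)
    return theta, sigma2_u

def aggregate_nvr(w, Sigma_eta, Sigma_e):
    """Noise-variance ratio of the aggregate, equation (8)."""
    w = np.asarray(w, dtype=float)
    return float(w @ Sigma_eta @ w) / float(w @ Sigma_e @ w)

# Hypothetical long run and short run variances for one series.
sigma2_eta, sigma2_e = 0.5, 2.0
q = sigma2_eta / sigma2_e
theta, sigma2_u = ma1_from_nvr(q, sigma2_e)

# Scalar moment equations: variance and lag-one autocovariance of (1 - L) y_t.
print(math.isclose((1.0 + theta**2) * sigma2_u, sigma2_eta + 2.0 * sigma2_e))
print(math.isclose(theta * sigma2_u, sigma2_e))

# Two hypothetical components with the same NVR: the aggregate shares it.
Sigma_eta = np.diag([0.5, 1.0])
Sigma_e = np.diag([2.0, 4.0])
w = np.ones(2)
q_agg = aggregate_nvr(w, Sigma_eta, Sigma_e)
theta_agg, _ = ma1_from_nvr(q_agg, float(w @ Sigma_e @ w))
print(math.isclose(theta_agg, theta))
```

When every component has the same noise-variance ratio and both covariance matrices are diagonal, the ratio in (8) collapses to that common value, so the aggregate has the same MA(1) parameter as each component.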

Proposition 1 Let $y_t$ be given by the unrelated trends model in (1), where the trends are random walks and the error $e_t$ has a full rank variance-covariance matrix $\Sigma_e$. Let $\Sigma_u$ be the error variance-covariance matrix of the equivalent VARMA form, given by (4). Let $z_t$ be the aggregated series $z_t = w'y_t$ and $\theta$ be the univariate MA(1) parameter of the model for the aggregated series $z_t$. Let $z_t(h)$ be the $h$-step-ahead forecast of the aggregated series based on the univariate model for the aggregate, and $z^m_t(h) = \sum_{j=1}^{m} w_j y_{j,t}(h)$ be the $h$-step-ahead forecast of the aggregate series based on the forecasts for the components, $y_{j,t}(h)$, coming from the multiple time series model. Then $z^m_t(h) = z_t(h)$, $h = 1, 2, ...$, if $w$ is an eigenvector of $\Sigma_u^{-1}\Sigma_e$ with associated eigenvalue $\theta$.

Proof. The proof is given in the Appendix.


Corollary 2 Let $z^u_t(h) = \sum_{j=1}^{m} w_j y^u_{j,t}(h)$ be the $h$-step-ahead forecast of the aggregated series based on the univariate forecasts for the components, $y^u_{j,t}(h)$, and let $q_j$ and $q$ be defined as in (6) and (8), respectively. If, in addition to the hypotheses of Proposition 1, $\Sigma_e$ is diagonal, then $z^m_t(h) = z^u_t(h) = z_t(h)$, $h = 1, 2, ...$, if $q_j = q$, $\forall j = 1, ..., m$.

Proof. The proof is given in the Appendix.

Proposition 1 and Corollary 2 translate the results of theorem 2 by Lutkepohl (1984) to nonstationary series formulated in state space form. The case of cointegrated series is not contemplated; we will address this issue in the next section. With the restriction of $\Sigma_e$ diagonal, we have closed-form formulas for the parameters of the models, and Corollary 2 basically states that $z^m_t(h) = z^u_t(h) = z_t(h)$; that is, the aggregated forecasts (based on the multiple time series model, $z^m_t(h)$, or on the univariate time series models for the components, $z^u_t(h)$) and the forecast of the aggregate, $z_t(h)$, are identical if all the component series have the same noise-variance ratio or, equivalently, if the trends exhibit the same degree of smoothness.

3 Common trends

In the case of common trends, the model for the observed series is given by

$$y_t = PT_t + e_t, \qquad (9)$$

where $T_t$ is an $r \times 1$ vector of trends with $r < m$, $P$ is the $m \times r$ factor loading matrix, and $e_t$ is multivariate white noise $(0, \Sigma_e)$. We assume that each component of the vector of common trends $T_t = (T_{1,t}, T_{2,t}, ..., T_{r,t})'$ is generated by (2). The model, as stated, is not identified, since the product $PT_t$ is not identified. We have to place restrictions to fix the scale. A restriction commonly used in the literature is, for instance, the one in Harvey (1989), who sets $\Sigma_\eta = I$. Even so, the model is not identified under rotations: although some parameters of the system matrices may already be restricted to 0 or 1, additional restrictions are needed. For instance, Harvey (1989) adds $p_{ij} = 0$ if $j > i$, where $P = [p_{ij}]$. A common restriction in factor analysis is that all the cross correlations come from the common factors (common trends in our case); therefore it is usually assumed that $\Sigma_e$ is diagonal. Nevertheless, this assumption has been relaxed in some recent papers, which allow some degree of cross correlation in the errors (see, for instance, Stock and Watson, 2002). In what follows, we will indicate when we place this restriction ($\Sigma_e$ diagonal).

As in the case of unrelated trends, we will keep assuming that all trends are random walks. So, in (3), $\alpha_j = \delta_j = 1$ and $\beta_j = \gamma_j = 0$, $\forall j = 1, ..., r$, and $\alpha = I_r$, $\gamma = 0_r$, $\eta_t = (\eta_{1,t}, ..., \eta_{r,t})'$.
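Model (9) with random-walk trends can be simulated in a few lines. The sketch below generates $m$ series driven by $r < m$ common trends and shows that any combination orthogonal to the columns of $P$ removes the trends, which is the cointegration property mentioned in the Introduction. The dimensions, loadings, noise scale and the use of a singular value decomposition to obtain the orthogonal complement are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
m, r, n = 4, 2, 300                      # hypothetical dimensions and sample size

P = rng.normal(size=(m, r))              # hypothetical factor loading matrix
T = np.cumsum(rng.normal(size=(n, r)), axis=0)   # r random walks, Sigma_eta = I
e = rng.normal(scale=0.5, size=(n, m))   # white noise with diagonal Sigma_e
y = T @ P.T + e                          # row t holds y_t' = (P T_t + e_t)'

# Orthogonal complement of P: the last m - r left singular vectors of P.
U, s, Vt = np.linalg.svd(P, full_matrices=True)
P_perp = U[:, r:]                        # m x (m - r), with P_perp' P = 0

# These combinations of y_t contain no trend: P_perp' y_t = P_perp' e_t.
stationary = y @ P_perp
print(np.allclose(P_perp.T @ P, 0.0, atol=1e-10))
print(np.allclose(stationary, e @ P_perp, atol=1e-8))
```

Each column of `P_perp` plays the role of a cointegrating vector: it weights the series so that the common random walks cancel and only noise remains.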


3.1 Univariate UCM model for the aggregate

Let us now derive the model for the univariate aggregate $z_t = w'y_t$, assuming the presence of common trends. From (9),

$$z_t = \sum_{j=1}^{m} w_j \left( \sum_{i=1}^{r} p_{ji}T_{i,t} + e_{j,t} \right) = \bar{T}^c_t + \bar{e}_t,$$

where $\bar{e}_t = w'e_t$, $\mathrm{var}(\bar{e}_t) = \sigma_e^2 = w'\Sigma_e w$, and $\bar{T}^c_t = \sum_{j=1}^{m}\sum_{i=1}^{r} w_j p_{ji} T_{i,t} = w'PT_t$ is the new trend component coming from the common trends, given by

$$\bar{T}^c_t = \sum_{j=1}^{m}\sum_{i=1}^{r} w_j p_{ji} (T_{i,t-1} + \eta_{i,t}) = \bar{T}^c_{t-1} + \eta^c_t,$$

with $\eta^c_t = \sum_{j=1}^{m}\sum_{i=1}^{r} w_j p_{ji}\eta_{i,t} = w'P\eta_t$ and $\mathrm{var}(\eta^c_t) = w'P\Sigma_\eta P'w = w'PP'w$ if we impose the identification restriction $\Sigma_\eta = I_r$.

In the case of a unique common trend that affects all series in a similar way, $P = p\mathbf{1}$, where $\mathbf{1} = (1, ..., 1)'$, and if the aggregate is a weighted mean of its components such that $\sum_{j=1}^{m} w_j = 1$, then

$$\bar{T}^c_t = p\sum_{j=1}^{m} w_j (T_{t-1} + \eta_t) = p(T_{t-1} + \eta_t).$$

The trend for the aggregate is just the common trend multiplied by a scale factor $p$.

3.2 Univariate UCM models for the components

Let us now derive the model for each of the components of the vector $y_t$, assuming the presence of common trends. Let $p_{j.}$ denote the $j$-th row of the factor loading matrix $P$. From (9),

$$y_{j,t} = \sum_{i=1}^{r} p_{ji}T_{i,t} + e_{j,t} = T^u_{j,t} + e_{j,t},$$

where $T^u_{j,t} = \sum_{i=1}^{r} p_{ji}T_{i,t} = p_{j.}T_t$ is the univariate trend component for the $j$-th series, given by

$$T^u_{j,t} = \sum_{i=1}^{r} p_{ji}(T_{i,t-1} + \eta_{i,t}) = T^u_{j,t-1} + \eta^u_{j,t},$$

where $\eta^u_{j,t} = \sum_{i=1}^{r} p_{ji}\eta_{i,t} = p_{j.}\eta_t$ and $\mathrm{var}(\eta^u_{j,t}) = p_{j.}\Sigma_\eta p_{j.}' = p_{j.}p_{j.}'$, with $p_{j.}$ ($1 \times r$) the $j$-th row of the factor loading matrix $P$ and $\Sigma_\eta = I_r$ due to the identification restriction.

In the case of a unique common trend that affects all series in a similar way, $P = p\mathbf{1}$, where $\mathbf{1} = (1, ..., 1)'$, and then

$$T^u_{j,t} = p(T_{t-1} + \eta_t) = T^u_{j,t-1} + p\eta_t.$$

The trend for each component is just the common trend multiplied by a scale factor $p$.

3.3 Comparisons of the aggregated forecast vs the pooled forecast from the factor model and the univariate UCM

The following propositions give the necessary and sufficient conditions for the equality of forecasts: first, between the aggregate and the pooled forecasts from the factor model; and, second, between the aggregate and the pooled forecasts from the univariate models.

Proposition 3 Let the disaggregated series be given by the common trends model in (9), where all the common trends are random walks. Let $e_t(h)$ be the prediction error of the univariate forecast of $z_{t+h}$, defined as $e_t(h) = z_{t+h} - z_t(h)$, and let $w'\mathbf{e}_t(h)$ be the pooled forecast error of the multivariate common trends model, where $\mathbf{e}_t(h) = y_{t+h} - y_t(h)$ is the vector of forecast errors. Let $k$ and $K$ be the Kalman filter gains of the univariate and multivariate models, respectively. If we run the Kalman filter with equivalent initial conditions, the equality of forecasts is achieved if and only if $kw' = w'PK$, or

$$kw = K'P'w. \qquad (10)$$

Proof. The proof is given in the Appendix.

The equality of forecasts between the pooled forecast from the common trends model and the univariate forecast of the aggregated series is obtained only if the weighting vector $w$ is an eigenvector of $K'P'$ with associated eigenvalue $k$, where $P$ is the factor loading matrix. This condition resembles the one proposed by Tiao and Guttman (1980) in their theorem 1 for stationary MA processes, since it is stated in terms of the eigenvalues of certain matrices associated with the parameters of the models.

Corollary 4 Under the same conditions as Proposition 3, the equality of forecasts between the disaggregated and the aggregated approaches cannot be achieved if the aggregation vector $w$ is a cointegrating vector.

Proof. The proof is given in the Appendix.

As a by-product of the previous proposition, it can be easily proven (see the Appendix) that the aggregation vector $w$ cannot be a cointegrating vector. Let $P_\perp$ be an $m \times (m - r)$ matrix with $\mathrm{rank}(P_\perp) = m - r$, which is the orthogonal complement of the factor loading matrix $P$, that is, $P_\perp'P = 0$. Notice that the columns of $P_\perp$ form a basis for the cointegration space of $y_t$, since $P_\perp'y_t = P_\perp'e_t$ are $m - r$ stationary, linearly independent combinations of the components of $y_t$. Therefore, if the weighting vector is any of the columns of $P_\perp$, from (20) the pooled forecast error, with $\mathbf{e}_t(h) = y_{t+h} - y_t(h)$, reduces to

$$w'\mathbf{e}_t(h) = w'e_{t+h},$$

which gives the minimum MSE in forecasting the aggregated GDP. Moreover, in this case $K'P'w = 0$, so the condition in (31) would require $k = 0$; since $k$ is a ratio of two variances, we can only guarantee that it is $\geq 0$, so the condition cannot be expected to hold.

Proposition 5 Let the disaggregated series be given by the common trends model in (9), where all the common trends are random walks. Let $e_t(h)$ be the prediction error of the univariate forecast of $z_{t+h}$, and let $w'u_t(h)$ be the pooled forecast error of the univariate UCM analysis of the component series, where $u_t(h) = (u_{1,t}(h), ..., u_{m,t}(h))'$ and $u_{j,t}(h)$ is the $h$-step-ahead forecast error of the $j$-th component series when it is analyzed univariately using UCM, $j = 1, 2, ..., m$. Let $k$ and $k^u_j$, $j = 1, ..., m$, be the Kalman filter gains of the univariate models of the aggregate and the $j$-th component series, respectively. Finally, let $K^u = \mathrm{diag}(k^u_1, ..., k^u_m)$. If we run the Kalman filter with equivalent initial conditions, the equality of forecasts is achieved if and only if $kw' = w'K^u$, or

$$kw = K^u w. \qquad (11)$$

Proof. The proof is given in the Appendix.

The equality of forecasts is obtained only if the weighting vector $w$ is an eigenvector of $K^u$ with associated eigenvalue $k$. Since $K^u$ is diagonal, this implies that $k = k^u_j$ for all $j = 1, ..., m$. That is, the Kalman filter gain is the same in all the univariate analyses of the components and in that of their aggregate. This result can be translated in terms of the smoothness of the series.

Taking into account the definition in (26),

$$k^u_j = \frac{v^u_j}{v^u_j + \sigma^2_{e_j}}, \qquad (12)$$

where $v^u_j = \sigma^2_{e_j}\,\dfrac{q_j + \sqrt{q_j^2 + 4q_j}}{2}$ is the solution of the algebraic Riccati equation

$$\frac{(v^u_j)^2}{v^u_j + \sigma^2_{e_j}} = p_{j.}p_{j.}' \qquad (13)$$

and $q_j = \dfrac{p_{j.}p_{j.}'}{\sigma^2_{e_j}} = \dfrac{\sum_{i=1}^{r} p_{ji}^2}{\sigma^2_{e_j}}$.

Now, from (12) and the Riccati equation (13), the Kalman filter gain depends only on the noise-variance ratio:

$$k^u_j = \frac{v^u_j}{v^u_j + \sigma^2_{e_j}} = \frac{p_{j.}p_{j.}'}{v^u_j} = \frac{2}{1 + \sqrt{1 + \dfrac{4}{q_j}}}. \qquad (14)$$


Therefore, in order to obtain the equality of forecasts between the univariate model for the aggregated series and the pooled forecasts from the univariate models of each component, we just need all series to have the same noise-variance ratio, $q_j = q$ for all $j = 1, ..., m$. Or, equivalently, all the series must exhibit the same degree of smoothness.
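The closed form (14) can be compared with the gain obtained by iterating the Kalman filter variance recursion of a random walk plus noise model until it settles. The sketch below does this for hypothetical variances; the function names, the number of iterations and the starting value are illustrative assumptions.

```python
import math

def steady_state_gain(q):
    """Closed-form steady-state Kalman gain as a function of the NVR, eq. (14)."""
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / q))

def iterated_gain(sigma2_eta, sigma2_e, n_iter=200):
    """Iterate the prediction-variance recursion of the random walk plus noise model.

    v is the one-step-ahead prediction variance of the trend; its fixed point
    solves the algebraic Riccati equation v^2 / (v + sigma2_e) = sigma2_eta.
    """
    v = sigma2_eta                       # arbitrary positive starting value
    for _ in range(n_iter):
        k = v / (v + sigma2_e)           # Kalman gain for the current variance
        v = v - k * v + sigma2_eta       # update, then predict one step ahead
    return v / (v + sigma2_e), v

# Hypothetical variances: long run 0.5, short run 2.0.
sigma2_eta, sigma2_e = 0.5, 2.0
q = sigma2_eta / sigma2_e
k_closed = steady_state_gain(q)
k_iter, v = iterated_gain(sigma2_eta, sigma2_e)

# The MA(1) parameter of the reduced form satisfies theta = 1 - k.
theta = ((q + 2.0) - math.sqrt(q * q + 4.0 * q)) / 2.0
print(math.isclose(k_closed, k_iter))
print(math.isclose(k_closed, 1.0 - theta))
```

Since the gain depends on the variances only through $q_j$, two series with different scales but the same noise-variance ratio share the same steady-state gain.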

4 The one factor model

The one factor model has special interest because many economic time series are characterized by a common trend. For example, the GDPs of some countries in a certain area of influence can be considered to be driven by the same common trend. The model has also been widely used in business cycle analysis (see, for instance, Stock and Watson, 1991, and Diebold and Rudebusch, 1996, among others).

In this section we further analyze the condition in (31) for the case of the one factor model, that is, when $r = 1$ in (9) and this equation is written as

$$\begin{bmatrix} y_{1,t} \\ y_{2,t} \\ \vdots \\ y_{m,t} \end{bmatrix} = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_m \end{bmatrix} T_t + \begin{bmatrix} e_{1,t} \\ e_{2,t} \\ \vdots \\ e_{m,t} \end{bmatrix}. \qquad (15)$$

Proposition 6 Let the disaggregated series be given by the common trends model in (15), where the unique common trend is a random walk. Then condition (10) is fulfilled if

$$P'\Sigma_e^{-1}P = \frac{w'PP'w}{w'\Sigma_e w}. \qquad (16)$$

Proof. The proof is given in the Appendix.

This last condition is satisfied, for instance, if $w_i = p_i$ for all $i = 1, ..., m$ and $\sigma_i^2 = \sigma_j^2$ for all $i, j = 1, ..., m$; that is, if all the series have the same short run noise variance and the weight $w_i$ of each component series in the aggregate equals its loading $p_i$ on the common trend. In other words, each series enters the aggregate in the same way as it is affected by the common trend. When the variables are simply added up (as is the case for the aggregated GDP of several countries of the European Union), $w = \mathbf{1}$ and, if $\Sigma_e = \mathrm{diag}(\sigma_1^2, ..., \sigma_m^2)$, condition (16) simplifies to

$$\sum_{i=1}^{m} \frac{p_i^2}{\sigma_i^2} = \frac{\left(\sum_{i=1}^{m} p_i\right)^2}{\sum_{i=1}^{m} \sigma_i^2},$$

which is met if $p_i = p_j$ and $\sigma_i^2 = \sigma_j^2$ for all $i, j = 1, ..., m$, which means that, in addition to having the same short run noise variance, all the series are affected by the common trend in the same way.
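Condition (16) is easy to evaluate for a diagonal $\Sigma_e$. The sketch below computes the difference between its two sides for a few hypothetical loadings, weights and variances; none of the numbers come from the paper.

```python
import numpy as np

def condition_16_gap(p, w, sigma2):
    """Left side minus right side of (16) for the one factor model.

    p: loadings (length m), w: aggregation weights, sigma2: diagonal of Sigma_e.
    """
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    lhs = np.sum(p**2 / sigma2)                     # P' Sigma_e^{-1} P
    rhs = (w @ p) ** 2 / np.sum(w**2 * sigma2)      # w'PP'w / w'Sigma_e w
    return lhs - rhs

p = np.array([0.5, 1.0, 2.0])                       # hypothetical loadings
equal_var = np.full(3, 0.3)                         # same short run variance

# Weights equal to loadings and equal variances: (16) holds.
gap_w_equals_p = condition_16_gap(p, p, equal_var)

# Simple sum (w = 1) with equal loadings and equal variances: (16) holds.
gap_sum_equal = condition_16_gap(np.ones(3), np.ones(3), equal_var)

# Simple sum with unequal loadings: (16) fails.
gap_sum_unequal = condition_16_gap(p, np.ones(3), equal_var)

print(gap_w_equals_p, gap_sum_equal, gap_sum_unequal)
```

By the Cauchy-Schwarz inequality the gap is never negative for a diagonal $\Sigma_e$, and it vanishes exactly when $w_i$ is proportional to $p_i/\sigma_i^2$; the two cases given in the text are instances of this.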


5 Application: GDP data

Data are quarterly observations of the GDP of Belgium, France, Italy, the Netherlands and Spain at market prices in millions of euros (from 1.1.1999) and ECUs (up to 31.12.1998), at constant 1995 prices. The source of the data is Eurostat, which provides them seasonally adjusted. The sample covers the period 1980:III-2002:IV. The aggregated GDP for these 5 countries, named EURO5, is built by summing the GDP of each country. Therefore, the weighting vector is $w = \mathbf{1}$. We then take logs of all series. Plots of the series are presented in figure 1. The same set of countries was also analyzed, using annual data, in Garcia-Ferrer and Poncela (2002), who found two common factors among them.

*Insert figure 1 around here.

5.1 Estimated models

In order to compare the different forecasting methods, we have estimated several models. We will also compare the results from the UCM models with those of the Box-Jenkins approach, which will serve as a benchmark. We fit two models for the aggregate $z_t$: a pure ARI($p$) model and a UCM. As regards the UCM, we need to add a cycle to the univariate model of the aggregated GDP, so that log(GDP) = trend + cycle + irregular. If needed, we fit an AR model to the irregular component to capture any remaining serial correlation. In the disaggregated approach, we build models for $y_t$ and then aggregate the forecasts with the weighting vector $w$. We use two variants of the disaggregated approach. In the first, we obtain univariate forecasts for the GDP of each country and then pool them to obtain the forecast of the aggregated GDP. Two models were fitted to obtain the univariate forecasts for each country: an ARI($p_i$) and a UCM, in the same way as for $z_t$. In the second, we obtain multivariate forecasts for the GDP of the five countries and then pool them to obtain the forecast for the EURO5 series. We have considered two types of models to obtain the multivariate forecasts: a dynamic factor model and a VAR in differences. All the UCM have been estimated in state space form by maximum likelihood using the EM algorithm. In particular, for the factor model, as in Garcia-Ferrer and Poncela (2002), we found two common factors, which are modelled as random walks. The estimated models are defined in table 1.

Aggregated approach: models for z_t       | Disaggregated approach: models for y_t
                                          | Univariate analysis                         | Multivariate analysis
AR EU5: z_t ~ ARI(p)                      | AR DIS: y_{j,t} ~ ARI(p_j)                  | FM: y_t ~ factor model
UCM EU5: log(z_t) = Trend + cycle + irr   | UCM DIS: log(y_{j,t}) = Trend + cycle + irr | DVAR: ∇y_t ~ VAR(P)

Table 1: Definition of the estimated models by the aggregated and disaggregated approaches.


5.2 Forecasting results

We consider two forecasting periods: a short one covering 1998:1-2002:4 and a longer one covering 1991:1-2002:4, to check whether the results found for the first forecasting period extend to the second one. We build one-step-ahead forecasts and re-estimate the models adding one data point at a time. To measure the forecasting accuracy of the different methods, we compute the root mean squared error (RMSE) of the one-step-ahead forecasts. Table 2 presents the results: column 1 gives the procedure used to build the forecasts, and columns 2 and 3 show the RMSE for the short and long forecasting samples, respectively.
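The evaluation scheme described above (one-step-ahead forecasts, re-estimation after each new observation, RMSE over the forecasting period) can be sketched as follows. The forecaster used here is a deliberately simple random walk with drift that stands in for the UCM, ARI, factor and VAR models of the paper; it and the function names are assumptions for illustration only.

```python
import numpy as np

def drift_forecast(history):
    """Placeholder one-step-ahead forecaster: random walk with estimated drift.

    The drift is re-estimated from the whole history each time it is called.
    """
    history = np.asarray(history, dtype=float)
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + drift

def expanding_window_rmse(series, first_forecast_index, forecaster=drift_forecast):
    """RMSE of one-step-ahead forecasts, adding one data point at a time.

    The forecast of series[t] uses only series[:t]; the first forecast is made
    for t = first_forecast_index.
    """
    series = np.asarray(series, dtype=float)
    errors = []
    for t in range(first_forecast_index, len(series)):
        forecast = forecaster(series[:t])       # model sees data up to t - 1
        errors.append(series[t] - forecast)
    return float(np.sqrt(np.mean(np.square(errors))))

# Toy usage on a short hypothetical series.
toy = [0.0, 1.0, 2.0, 4.0]
print(expanding_window_rmse(toy, first_forecast_index=3))
```

For a pooled forecast of the aggregate, the same loop would be run on each component series, and the component forecasts would be combined with the weights $w$ before the error is computed.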

model     | 1998:1-2002:4 | 1991:1-2002:4
UCM EU5   | 0.2823        | 0.3450
UCM DIS   | 0.2990        | 0.3652
FM        | 0.2514        | 0.3301
AR EU5    | 0.2885        | 0.3557
AR DIS    | 0.2902        | 0.3682
DVAR      | 0.4118        | 0.6510

Table 2: RMSE for 1-step ahead forecasts of the aggregated GDP of Belgium, France, Italy, the Netherlands and Spain.

In bold figures we have pointed out the model that provides the best results. For both forecasting samples, it is the factor model (FM), with relative gains in RMSE over the other models ranging from 11% to 39% for the short sample 1998:1-2002:4, and from 4.3% to 49% for the long one 1991:1-2002:4. Notice the poor performance of the VAR in differences. This is probably due to overdifferencing, since this multivariate model does not take into account the presence of common trends or cointegration and differences all the components of the vector $y_t$ (the logs of GDP in this case). As regards the comparison of the remaining UCM and ARIMA approaches, no other significant differences appear.

To better understand these results, we present the last estimated factor model used for forecasting (with all data points but the last one). The model is given by $y_t = Pf_t + e_t$ and $f_t = f_{t-1} + a_t$, where $y_t$ represents the logs of GDP of Belgium, France, Italy, the Netherlands and Spain, with the following estimates for the system matrices: $\Sigma_a = I$ (this is an identification restriction), $\hat{\Sigma}_e = 10^{-3}\,\mathrm{diag}(.0510, .0183, .1371, .0083, .0550)$ and

$$\hat{P} = \begin{bmatrix} .0587 & 0 \\ .0567 & .0027 \\ .0537 & .0027 \\ .0720 & -.0029 \\ .0786 & .0021 \end{bmatrix}.$$

We now check whether condition (31) can be fulfilled. Recall that, if this condition is met, the forecasts from a univariate model for the aggregated GDP of the five European countries and the pooled forecasts from the factor model are equal. The Kalman filter gain for the factor model is

$$\hat{K} = \begin{bmatrix} 1.2420 & 4.2582 & 0.5459 & 7.1636 & 1.7735 \\ 7.1300 & 91.0849 & 11.9538 & -115.7460 & 26.8656 \end{bmatrix}.$$

As regards the estimation of the unobserved components model for the aggregated GDP (denoted by UCM EU5), the Kalman filter gain is .9995. In this case condition (31) is not met, since $kw = 0.9995 \times (1; 1; 1; 1; 1)'$ and $K'P'w = (.4289; 1.7781; 0.2292; 1.7619; 0.6900)'$. Notice that while in $kw$ all the components must be equal, in $K'P'w$ the second component is more than seven times the third one. So, a priori, once the models are estimated, we could expect a better forecasting performance of the factor model over the forecasts of the aggregate.
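The comparison of $kw$ with $K'P'w$ can be reproduced from the reported estimates. The sketch below assumes that the loading matrix is $5 \times 2$ with rows ordered by country and that the gain is a $2 \times 5$ matrix (one row per common factor); this layout is our reading of the printed matrices, and small differences from the reported vector are expected because the estimates are rounded.

```python
import numpy as np

# Reported estimates (layout assumed: rows of P_hat are countries,
# rows of K_hat are the two common factors).
P_hat = np.array([
    [0.0587,  0.0],
    [0.0567,  0.0027],
    [0.0537,  0.0027],
    [0.0720, -0.0029],
    [0.0786,  0.0021],
])
K_hat = np.array([
    [1.2420,  4.2582,  0.5459,    7.1636,  1.7735],
    [7.1300, 91.0849, 11.9538, -115.7460, 26.8656],
])
w = np.ones(5)          # the aggregate is the plain sum of the five series
k = 0.9995              # reported gain of the univariate model (UCM EU5)

lhs = k * w                         # k w
rhs = K_hat.T @ P_hat.T @ w         # K'P'w
reported = np.array([0.4289, 1.7781, 0.2292, 1.7619, 0.6900])

print(np.round(rhs, 4))
print(np.allclose(rhs, reported, atol=0.01))   # matches up to rounding
print(np.allclose(lhs, rhs, atol=0.1))         # the equality condition fails
```

The recomputed vector agrees with the reported one to about two decimal places, and its entries are far from the constant vector $kw$, which is the sense in which the equality condition is not met.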

6 Conclusions

Using unobserved components models, we have found algebraic conditions for nonstationary I(1) time series that guarantee the equality of forecasts between an aggregate and the pooled forecasts of the individual series that form the aggregate. These conditions are stated in terms of eigenvalues of certain matrices associated with the parameters of the models, in the same spirit as the ones found by Kohn (1982) and later generalized by Lutkepohl (1984) and Clark (2000).

We have studied both the cases of unrelated and common trends. For the unrelated trends case we have found that, when the series exhibit the same degree of smoothness, these conditions are automatically fulfilled. In the case of common trends we have also obtained that, if the aggregation vector is a cointegrating vector, the equality of forecasts between both approaches, the aggregated one and the disaggregated one through the factor model, cannot be attained. The second set of forecasts (the ones obtained through the factor model) always has a smaller RMSE.

For the one factor model, we have also obtained that the equality of forecasts between the previous two approaches is guaranteed if, for instance, all the series have the same short run noise variance and their weight in the aggregate corresponds to their factor loading, that is, to how the common trend affects them.

We have examined the case of the GDP of several countries belonging to the European Community and of their aggregated GDP, and found that, when we account for the common trends, forecasting the GDP of each individual country and then aggregating the forecasts to obtain the prediction of the aggregate produces a smaller RMSE than forecasting the aggregated GDP directly. Nevertheless, the other multivariate model (a VAR in first differences) produces the worst forecasting results in terms of RMSE. This points out that, at least for this particular data set and sample, the smaller RMSE is due not only to a larger information set, but also to the possibility of taking into account the presence of common trends in the model.

7 Appendix

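Proof of Proposition 1. By Lutkepohl (1984), Theorem 2, $z^m_t(h) = z_t(h)$, $h = 1, 2, \ldots$, if and only if $\mathbf{w}'\mathbf{M}(L) = k(L)\mathbf{w}'$. In our case, $\mathbf{M}(L) = (\mathbf{I} - \Theta L)$ and $k(L) = (1 - \theta L)$, where $\Theta$ and $\theta$ are given by (4) and (7a), respectively. Matching the coefficients of $L$, $\mathbf{w}'(\mathbf{I} - \Theta L) = (1 - \theta L)\mathbf{w}'$ holds if $\Theta'\mathbf{w} = \theta\mathbf{w}$, that is, if $\theta$ is an eigenvalue of $\Theta'$ with associated eigenvector $\mathbf{w}$. By (4), $\Theta' = \Sigma_u^{-1}\Sigma_e$.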

Proof of Corollary 2. If $\Sigma_e$ is diagonal, $\Theta = \mathrm{diag}(\theta_1, \ldots, \theta_m)$ and $\Sigma_u = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_m^2)$, where $\theta_j$ and $\sigma_j^2$ are given by the set of equations (5). Therefore $z^m_t(h) = z^u_t(h)$ and nothing is gained from the joint modelling of the series. By Lutkepohl (1984), Theorem 2, $z^u_t(h) = z_t(h)$, $h = 1, 2, \ldots$, if and only if $\mathbf{w}'\mathbf{N}(L) = k(L)\mathbf{w}'$. In our case, $\mathbf{N}(L) = \mathrm{diag}(1 - \theta_1 L, \ldots, 1 - \theta_m L)$ and $k(L) = (1 - \theta L)$, where $\theta_j$, $j = 1, \ldots, m$, and $\theta$ are given by (5) and (7a), respectively. Therefore $\mathbf{w}'\mathbf{N}(L) = k(L)\mathbf{w}'$ if and only if $\theta = \theta_j$ for all $j = 1, \ldots, m$. By (5) and (7a), $\theta = \theta_j$ if and only if $q = q_j$.
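As an illustration of the corollary, the sketch below assumes that each component follows an independent local level model, for which the first difference is an MA(1) process $(1 - \theta L)a_t$ with $\theta = \{(q + 2) - \sqrt{q^2 + 4q}\}/2$, where $q$ is the signal-to-noise ratio. This standard closed form is used here as a stand-in for equations (5) and (7a), which are not restated in this appendix; the function names and all numerical values are hypothetical.

```python
import math

def ma_theta(q):
    """Invertible MA(1) parameter of (1 - theta L) for the first difference
    of a local level model with signal-to-noise ratio q (assumed form)."""
    return ((q + 2.0) - math.sqrt(q * q + 4.0 * q)) / 2.0

def aggregate_q(w, sigma2_eta, sigma2_e):
    """Signal-to-noise ratio of z_t = w'y_t when the trend and noise
    disturbances are independent across components (assumption)."""
    signal = sum(wj ** 2 * s for wj, s in zip(w, sigma2_eta))
    noise = sum(wj ** 2 * s for wj, s in zip(w, sigma2_e))
    return signal / noise

w = [1.0, 1.0, 1.0]
sigma2_e = [0.5, 1.0, 2.0]

# Case 1: same degree of smoothness, q_j = 0.4 for every component
sigma2_eta_same = [0.4 * s for s in sigma2_e]
thetas_same = [ma_theta(se / s) for se, s in zip(sigma2_eta_same, sigma2_e)]
theta_agg_same = ma_theta(aggregate_q(w, sigma2_eta_same, sigma2_e))

# Case 2: different degrees of smoothness, q_j = 0.2, 1.0, 2.0
sigma2_eta_diff = [0.1, 1.0, 4.0]
thetas_diff = [ma_theta(se / s) for se, s in zip(sigma2_eta_diff, sigma2_e)]
theta_agg_diff = ma_theta(aggregate_q(w, sigma2_eta_diff, sigma2_e))

print(thetas_same, theta_agg_same)
print(thetas_diff, theta_agg_diff)
```

In the first case every $\theta_j$ coincides with the $\theta$ of the aggregate, so $\mathbf{w}'\mathbf{N}(L) = k(L)\mathbf{w}'$ holds; in the second case it does not.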

Proof of Proposition 3. First, we are going to derive the prediction errors from the multivariate dynamic factor model and then pool them in order to form the forecast of the aggregate. Then, we derive the forecast of the univariate UCM of the aggregate, and later, the comparison of both alternatives is presented.

The factor model is already written in state space form, with state vector given by the vector of common trends. Using the Kalman filter equations we can easily compute the $h$-steps ahead forecast of the state vector with observations up to time $t$, given by $\mathbf{T}_{t+h|t} = \mathbf{T}_{t|t}$, with mean square error (MSE) matrix

$$\mathbf{V}_{t+h|t} = E(\mathbf{T}_{t+h} - \mathbf{T}_{t+h|t})(\mathbf{T}_{t+h} - \mathbf{T}_{t+h|t})' = \mathbf{V}_{t|t} + h\Sigma_\eta = \mathbf{V}_{t|t} + h\mathbf{I}, \tag{17}$$

where the last equality comes from the identification restriction. The $h$-steps ahead forecast for the observed series with origin in $t$ is

$$\mathbf{y}_t(h) = \mathbf{P}\mathbf{T}_{t|t}, \tag{18}$$

with MSE matrix

$$\Sigma_{t+h|t} = E[\mathbf{y}_{t+h} - \mathbf{y}_t(h)][\mathbf{y}_{t+h} - \mathbf{y}_t(h)]' = \mathbf{P}\mathbf{V}_{t+h|t}\mathbf{P}' + \Sigma_e. \tag{19}$$

Let $\mathbf{e}_t(h) = \mathbf{y}_{t+h} - \mathbf{y}_t(h)$ be the forecast errors of the multivariate model. The aggregation of these forecast errors of the multivariate model is

$$\mathbf{w}'\mathbf{e}_t(h) = \mathbf{w}'\mathbf{P}(\mathbf{T}_t - \mathbf{T}_{t|t}) + \mathbf{w}'\mathbf{P}\sum_{i=1}^{h}\boldsymbol{\eta}_{t+i} + \mathbf{w}'\mathbf{e}_{t+h}. \tag{20}$$

On the other hand, the forecast obtained from the univariate model for $z_t = \mathbf{w}'\mathbf{y}_t$ is

$$z_t(h) = T^c_{t|t}, \tag{21}$$

with MSE of prediction

$$s_{t+h|t} = E(z_{t+h} - z_{t+h|t})^2 = v_{t+h|t} + \mathbf{w}'\Sigma_e\mathbf{w}, \tag{22}$$

where $v_{t+h|t}$ is the MSE of estimation of the univariate trend in $t + h$ with information up to $t$. The prediction error of the univariate forecast of $z_{t+h}$, defined as $e_t(h) = z_{t+h} - z_t(h)$, is given by

$$e_t(h) = (T^c_t - T^c_{t|t}) + \mathbf{w}'\mathbf{P}\sum_{i=1}^{h}\boldsymbol{\eta}_{t+i} + \mathbf{w}'\mathbf{e}_{t+h}. \tag{23}$$

Since the trend of the aggregate is $T^c_t = \mathbf{w}'\mathbf{P}\mathbf{T}_t$, the difference between the aggregated forecast errors of the factor model and the forecast error of the univariate model of the aggregate is due only to the difference in estimating the trend, and is given by

$$\mathbf{w}'\mathbf{e}_t(h) - e_t(h) = T^c_{t|t} - \mathbf{w}'\mathbf{P}\mathbf{T}_{t|t}. \tag{24}$$

The update of the state variable is given by (see, for instance, Durbin and Koopman, 2001, chapter 4)

$$T^c_{t|t} = (1 - k_t)T^c_{t|t-1} + k_t z_t, \tag{25}$$

where $k_t$ is the Kalman filter gain, defined for this model as

$$k_t = v_{t|t-1}s^{-1}_{t|t-1}, \tag{26}$$

and

$$T^c_{t|t-1} = (1 - k_{t-1})T^c_{t-1|t-2} + k_{t-1}z_{t-1}.$$

By backwards substitution of $T^c_{t|t-1}$ in (25),

$$T^c_{t|t} = (1 - k_1)\cdots(1 - k_t)T^c_{1|0} + k_1(1 - k_2)\cdots(1 - k_t)z_1 + \cdots + k_{t-1}(1 - k_t)z_{t-1} + k_t z_t, \tag{27}$$

and on the steady state (achieved after a few iterations), $k_j = k$ for all $j$, so that

$$T^c_{t|t} \simeq (1 - k)^t T^c_{1|0} + \sum_{\tau=1}^{t}k(1 - k)^{t-\tau}z_\tau.$$
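The recursion (25)-(26) and its steady-state form can be illustrated with a short simulation. This is a minimal sketch, assuming a scalar local level model for the aggregate; the variances, the sample size, the seed and the function name are hypothetical. It checks that the gain settles at a constant $k$ and that, with a constant gain, unrolling (25) puts weight $k(1 - k)^{t-\tau}$ on $z_\tau$.

```python
import numpy as np

def local_level_filter(z, sigma2_eta, sigma2_e, T_init, v_init):
    """Scalar Kalman filter for z_t = T_t + e_t, T_t = T_{t-1} + eta_t.
    Returns the filtered trends T_{t|t} and the gains k_t."""
    T_pred, v_pred = T_init, v_init            # T_{1|0} and v_{1|0}
    filtered, gains = [], []
    for obs in z:
        k = v_pred / (v_pred + sigma2_e)       # gain, as in (26)
        T_filt = (1.0 - k) * T_pred + k * obs  # update, as in (25)
        filtered.append(T_filt)
        gains.append(k)
        T_pred = T_filt                        # random walk: T_{t+1|t} = T_{t|t}
        v_pred = (1.0 - k) * v_pred + sigma2_eta
    return np.array(filtered), np.array(gains)

sigma2_eta, sigma2_e = 0.5, 1.0                # hypothetical variances
q = sigma2_eta / sigma2_e
v_bar = sigma2_e * (q + np.sqrt(q ** 2 + 4.0 * q)) / 2.0   # steady-state v
k_bar = v_bar / (v_bar + sigma2_e)                         # steady-state gain

rng = np.random.default_rng(0)
n = 50
trend = np.cumsum(rng.normal(scale=np.sqrt(sigma2_eta), size=n))
z = trend + rng.normal(scale=np.sqrt(sigma2_e), size=n)

# Started away from the steady state, the gain converges to k_bar
_, gains = local_level_filter(z, sigma2_eta, sigma2_e, T_init=1.0, v_init=10.0)

# Started at the steady state, the gain is constant and the filter output
# equals the weighted sum with weights k (1 - k)^(t - tau)
filtered_ss, gains_ss = local_level_filter(z, sigma2_eta, sigma2_e, T_init=1.0, v_init=v_bar)
weights = k_bar * (1.0 - k_bar) ** (n - np.arange(1, n + 1))
closed_form = (1.0 - k_bar) ** n * 1.0 + weights @ z

print(k_bar, gains[-1], filtered_ss[-1], closed_form)
```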

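Similarly, for the multiple time series model,

$$\mathbf{T}_{t|t} = (\mathbf{I} - \mathbf{K}_t\mathbf{P})\mathbf{T}_{t|t-1} + \mathbf{K}_t\mathbf{y}_t, \tag{28}$$

where the Kalman filter gain ($r \times m$) is now defined as

$$\mathbf{K}_t = \mathbf{V}_{t|t-1}\mathbf{P}'\Sigma^{-1}_{t|t-1}, \tag{29}$$

and

$$\mathbf{T}_{t|t-1} = (\mathbf{I} - \mathbf{K}_{t-1}\mathbf{P})\mathbf{T}_{t-1|t-2} + \mathbf{K}_{t-1}\mathbf{y}_{t-1}.$$

Proceeding in a similar way as for the univariate aggregate forecasts, on the steady state $\mathbf{K}_t = \mathbf{K}$ for all $t$ and, by backwards substitution of $\mathbf{T}_{t|t-1}$ in (28),

$$\mathbf{T}_{t|t} = (\mathbf{I} - \mathbf{K}\mathbf{P})^t\mathbf{T}_{1|0} + \sum_{\tau=1}^{t}(\mathbf{I} - \mathbf{K}\mathbf{P})^{t-\tau}\mathbf{K}\mathbf{y}_\tau.$$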

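According to (24), the difference in forecast errors is then given by

$$\mathbf{w}'\mathbf{e}_t(h) - e_t(h) = (1 - k)^t T^c_{1|0} + \sum_{\tau=1}^{t}k(1 - k)^{t-\tau}\mathbf{w}'\mathbf{y}_\tau - \mathbf{w}'\mathbf{P}\left((\mathbf{I} - \mathbf{K}\mathbf{P})^t\mathbf{T}_{1|0} + \sum_{\tau=1}^{t}(\mathbf{I} - \mathbf{K}\mathbf{P})^{t-\tau}\mathbf{K}\mathbf{y}_\tau\right). \tag{30}$$

If we run the Kalman filter with equivalent initial conditions such that $T^c_{1|0} = \mathbf{w}'\mathbf{P}\mathbf{T}_{1|0}$, the equality of forecasts is achieved if and only if $k\mathbf{w}' = \mathbf{w}'\mathbf{P}\mathbf{K}$, or

$$k\mathbf{w} = \mathbf{K}'\mathbf{P}'\mathbf{w}. \tag{31}$$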
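The sufficiency of condition (31) can be made explicit with a short derivation. This is a sketch added for illustration; it assumes constant steady-state gains, so that unrolling (25) and (28) puts weights $k(1 - k)^{t-\tau}$ and $(\mathbf{I} - \mathbf{K}\mathbf{P})^{t-\tau}\mathbf{K}$ on the observations at time $\tau$.

```latex
\begin{align*}
\mathbf{w}'\mathbf{P}\mathbf{K} = k\mathbf{w}'
 &\;\Longrightarrow\; \mathbf{w}'\mathbf{P}(\mathbf{I} - \mathbf{K}\mathbf{P})
   = \mathbf{w}'\mathbf{P} - k\mathbf{w}'\mathbf{P} = (1 - k)\mathbf{w}'\mathbf{P} \\
 &\;\Longrightarrow\; \mathbf{w}'\mathbf{P}(\mathbf{I} - \mathbf{K}\mathbf{P})^{j}
   = (1 - k)^{j}\mathbf{w}'\mathbf{P}, \qquad j = 0, 1, 2, \ldots \\
 &\;\Longrightarrow\; \mathbf{w}'\mathbf{P}(\mathbf{I} - \mathbf{K}\mathbf{P})^{j}\mathbf{K}
   = k(1 - k)^{j}\mathbf{w}'.
\end{align*}
```

Hence the coefficients of every $\mathbf{y}_\tau$ in the two filters coincide and, with $T^c_{1|0} = \mathbf{w}'\mathbf{P}\mathbf{T}_{1|0}$, so do the terms in the initial conditions. Conversely, the coefficient of the most recent observation $\mathbf{y}_t$ alone already requires $k\mathbf{w}' = \mathbf{w}'\mathbf{P}\mathbf{K}$.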

Proof of Corollary 4. First, notice that the univariate Kalman filter gain $k$ is always positive since, from (26), it is a ratio of variances. Second, let $\mathbf{P}_\perp$ be an $m \times (m - r)$ matrix with $\mathrm{rank}(\mathbf{P}_\perp) = m - r$, which is the orthogonal complement of the factor loading matrix $\mathbf{P}$, that is, $\mathbf{P}'_\perp\mathbf{P} = \mathbf{0}$. Notice that $\mathbf{P}_\perp$ is a basis for the cointegration space of $\mathbf{y}_t$, since $\mathbf{P}'_\perp\mathbf{y}_t = \mathbf{P}'_\perp\mathbf{e}_t$ are $m - r$ stationary independent linear combinations of the components of $\mathbf{y}_t$. Therefore, if the weighting vector is any of the columns of $\mathbf{P}_\perp$, from (20)

$$\mathbf{w}'\mathbf{e}_t(h) = \mathbf{w}'\mathbf{e}_{t+h},$$

which gives the minimum MSE in forecasting the aggregated GDP. Moreover, in this case $\mathbf{K}'\mathbf{P}'\mathbf{w} = \mathbf{0}$, so the condition in (10) cannot be fulfilled since we can only guarantee $k \geq 0$; therefore the equality of forecasts might not be achieved.

Proof of Proposition 5. First, we are going to derive the prediction errors from the univariate UCMs of each of the components and then pool them in order to form the forecast of the aggregate. Then, we are going to compare it to the univariate unobserved components model of the aggregate derived in the previous section. Using the Kalman filter equations we can easily compute the $h$-steps ahead forecast of the state vector of each univariate UCM with observations up to time $t$, given by

$$y^u_{j,t}(h) = T^u_{j,t|t}, \tag{32}$$

with MSE of prediction

$$s^u_{j,t+h|t} = E\left[y_{j,t+h} - y^u_{j,t}(h)\right]^2 = v^u_{j,t+h|t} + \sigma^2_{e_j}, \tag{33}$$

where $v^u_{j,t+h|t}$ is the MSE of estimation of the univariate trend of the $j$-th component in $t + h$ with information up to $t$. The prediction error of the univariate forecast of the $j$-th component is given by

$$u_{j,t}(h) = (T^u_{j,t} - T^u_{j,t|t}) + \mathbf{p}_{j\cdot}\sum_{i=1}^{h}\boldsymbol{\eta}_{t+i} + e_{j,t+h}.$$

Let $\mathbf{u}_t(h) = (u_{1,t}(h), \ldots, u_{m,t}(h))'$, $\mathbf{T}^u_t = (T^u_{1,t}, \ldots, T^u_{m,t})'$ and $\mathbf{T}^u_{t|t} = (T^u_{1,t|t}, \ldots, T^u_{m,t|t})'$ be the vectors of forecast errors, trends and estimated trends with information up to time $t$ obtained from the univariate analysis of each of the components, respectively. The difference between the forecast error of the univariate model of the aggregate and the pooled forecast errors of the univariate UCMs is only due to the difference in the estimation of the trends. To see this, first notice that the pooling of the forecast errors from the univariate UCMs is given by

$$\mathbf{w}'\mathbf{u}_t(h) = \mathbf{w}'(\mathbf{T}^u_t - \mathbf{T}^u_{t|t}) + \mathbf{w}'\mathbf{P}\sum_{i=1}^{h}\boldsymbol{\eta}_{t+i} + \mathbf{w}'\mathbf{e}_{t+h}. \tag{34}$$

From (23) and (34), the difference between the two forecasting approaches that we are considering in this section is given by

$$\mathbf{w}'\mathbf{u}_t(h) - e_t(h) = T^c_{t|t} - \mathbf{w}'\mathbf{T}^u_{t|t}. \tag{35}$$

Proceeding as in the previous section, we will analyze this difference for the steady state, where the Kalman filter gains are constant. Denote by $k^u_j$ the Kalman filter gain associated with the $j$-th component of the series $\mathbf{y}_t$ on the steady state. In a similar way to (27), it is easy to obtain that

$$T^u_{j,t|t} = (1 - k^u_j)^t T^u_{j,1|0} + \sum_{\tau=1}^{t}k^u_j(1 - k^u_j)^{t-\tau}y_{j,\tau}.$$

Therefore, the difference in forecast errors (35) is given by

$$\mathbf{w}'\mathbf{u}_t(h) - e_t(h) = (1 - k)^t T^c_{1|0} + \sum_{\tau=1}^{t}k(1 - k)^{t-\tau}\mathbf{w}'\mathbf{y}_\tau - \mathbf{w}'\left((\mathbf{I} - \mathbf{K}^u)^t\mathbf{T}^u_{1|0} + \sum_{\tau=1}^{t}(\mathbf{I} - \mathbf{K}^u)^{t-\tau}\mathbf{K}^u\mathbf{y}_\tau\right), \tag{36}$$

where $\mathbf{T}^u_{1|0} = (T^u_{1,1|0}, \ldots, T^u_{m,1|0})'$ is the set of initial conditions of the state vector for the univariate analysis of each of the components and $\mathbf{K}^u = \mathrm{diag}(k^u_1, \ldots, k^u_m)$. If we run the Kalman filter with equivalent initial conditions such that $T^c_{1|0} = \mathbf{w}'\mathbf{T}^u_{1|0}$, the equality of forecasts is achieved if and only if $k\mathbf{w}' = \mathbf{w}'\mathbf{K}^u$, or

$$k\mathbf{w} = \mathbf{K}^u\mathbf{w}. \tag{37}$$

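Proof of Proposition 6. First, notice that on the steady state, for all $t$, $v_{t|t-1} = v$, $s_{t|t-1} = v + \sigma^2_e = v + \mathbf{w}'\Sigma_e\mathbf{w}$ and the Kalman filter gain is $k_t = k$. Taking into account its definition in (26), then

$$k = \frac{v}{v + \mathbf{w}'\Sigma_e\mathbf{w}}, \tag{38}$$

where

$$v = \sigma^2_e\,\frac{q + \sqrt{q^2 + 4q}}{2} \tag{39}$$

is the solution of the algebraic Riccati equation (see, for instance, Durbin and Koopman, 2001, page 33)

$$\frac{v^2}{v + \sigma^2_e} = \sigma^2_\eta, \tag{40}$$

where $q$ is the noise variance ratio of the aggregate univariate model, which in our case is $\mathbf{w}'\mathbf{P}\mathbf{P}'\mathbf{w}/\mathbf{w}'\Sigma_e\mathbf{w}$. Therefore, from (38) and (40),

$$k = \frac{\sigma^2_\eta}{v} = \frac{2}{1 + \sqrt{1 + 4/q}}, \tag{41}$$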

where the last equality is obtained by substituting $v$ by its expression given in (39).

As regards the multivariate model, on the steady state $V_{t|t-1} = V$ is now a scalar given by the solution of the algebraic Riccati equation (see, for instance, Harvey, 1989, page 106)

$$V^2\mathbf{P}'(V\mathbf{P}\mathbf{P}' + \Sigma_e)^{-1}\mathbf{P} = 1, \tag{42}$$

where we have imposed the identification condition $\Sigma_\eta = 1$. Applying the inversion lemma for the sum of two matrices (Rao, 1973) to $(V\mathbf{P}\mathbf{P}' + \Sigma_e)^{-1}$, the solution of (42) is given by

$$V = \frac{1 + \sqrt{1 + 4/\mu}}{2}, \tag{43}$$

where $\mu = \mathbf{P}'\Sigma_e^{-1}\mathbf{P}$. So, $\mathbf{K}'\mathbf{P}'$ on the steady state is given, applying again the inversion lemma for the sum of two matrices (Rao, 1973), by

$$\mathbf{K}'\mathbf{P}' = (\mathbf{P}V\mathbf{P}' + \Sigma_e)^{-1}\mathbf{P}V\mathbf{P}' = \frac{\mu}{\mu + 1/V} = \frac{1}{1 + \dfrac{2}{\mu + \sqrt{\mu^2 + 4\mu}}}, \tag{44}$$

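where the last equality is obtained by plugging (43) into (44). So, taking into account (41) and (44), the condition (31) is satisfied if

$$\frac{2}{1 + \sqrt{1 + 4/q}} = \frac{1}{1 + \dfrac{2}{\mu + \sqrt{\mu^2 + 4\mu}}}. \tag{45}$$

It can be shown by straightforward algebra that (45) is met if $q = \mu$, which means that

$$\mathbf{P}'\Sigma_e^{-1}\mathbf{P} = \frac{\mathbf{w}'\mathbf{P}\mathbf{P}'\mathbf{w}}{\mathbf{w}'\Sigma_e\mathbf{w}}.$$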
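The result can be checked numerically. The sketch below is an illustration under stated assumptions, not the estimation procedure used in the paper: it takes a hypothetical one-factor model with loadings `P` and a common noise variance, iterates the Riccati recursion behind (42) to the steady state, and compares both sides of condition (31) for two choices of weights. All function names and numerical values are invented for the example.

```python
import numpy as np

def univariate_gain(q):
    """Steady-state gain of the aggregate model, as in (41)."""
    return 2.0 / (1.0 + np.sqrt(1.0 + 4.0 / q))

def factor_steady_state(P, Sigma_e, n_iter=200):
    """Iterate the Riccati recursion of the one-factor model with var(eta) = 1.
    Returns the steady-state prediction variance V and the gain K (length m)."""
    V = 1.0
    for _ in range(n_iter):
        S = V * np.outer(P, P) + Sigma_e   # innovation covariance
        K = V * np.linalg.solve(S, P)      # gain V P' S^{-1}, as in (29)
        V = V * (1.0 - K @ P) + 1.0        # next one-step prediction variance
    S = V * np.outer(P, P) + Sigma_e
    K = V * np.linalg.solve(S, P)
    return V, K

def condition_31(w, P, Sigma_e):
    """Return both sides of condition (31): k w and K'P'w."""
    q = (w @ P) ** 2 / (w @ Sigma_e @ w)   # w'PP'w / w'Sigma_e w
    k = univariate_gain(q)
    _, K = factor_steady_state(P, Sigma_e)
    return k * w, K * (P @ w)

P = np.array([0.5, 1.0, 1.5])              # hypothetical factor loadings
Sigma_e = 2.0 * np.eye(3)                  # same short run noise variance

mu = P @ np.linalg.solve(Sigma_e, P)       # mu = P' Sigma_e^{-1} P
V, K = factor_steady_state(P, Sigma_e)
V_closed = (1.0 + np.sqrt(1.0 + 4.0 / mu)) / 2.0   # closed form (43)

lhs_P, rhs_P = condition_31(P, P, Sigma_e)            # weights equal to loadings
lhs_1, rhs_1 = condition_31(np.ones(3), P, Sigma_e)   # equal weights

print(V, V_closed)
print(lhs_P, rhs_P)
print(lhs_1, rhs_1)
```

With weights equal to the factor loadings, $q = \mu$ and both sides of (31) coincide; with equal weights they do not, so the pooled forecasts from the factor model and the direct forecast of the aggregate differ.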

8 References

Clark, T.E. (2000) Forecasting an aggregate of cointegrated disaggregates, Journal of Forecasting, 19, 1-21.

Diebold, F.X. and Rudebusch, G.D. (1996) Measuring business cycles: A modern perspective, The Review of Economics and Statistics, 78, 67-77.

Durbin, J. and Koopman, S.J. (2001) Time series analysis by state space methods. Oxford University Press: Oxford.

Garcia-Ferrer, A. and Poncela, P. (2002) Forecasting European GNP data through common factor models and other procedures, Journal of Forecasting, 21, 225-244.

Harvey, A. (1989) Forecasting, Structural Time Series Models and the Kalman Filter (2nd edn). Cambridge University Press: Cambridge.

Kohn, R. (1982) When is an aggregate of a time series efficiently forecast by its past?, Journal of Econometrics, 18, 337-349.

Lutkepohl, H. (1984) Linear transformations of vector ARMA processes, Journal of Econometrics, 26, 283-293.

Lutkepohl, H. (1985) Forecasting contemporaneously aggregated vector ARMA processes, Journal of Business and Economic Statistics, 2, 201-214.

Marcellino, M., Stock, J. and Watson, M. (2003) Macroeconomic forecasting in the Euro area: country specific versus area-wide information, European Economic Review, 47, 1-18.

Rao, C.R. (1973) Linear statistical inference and its applications. Wiley: New York.

Stock, J.H. and Watson, M.W. (1991) A probability model of the coincident economic indicators, in K. Lahiri and G.H. Moore, eds., Leading economic indicators: new approaches and forecasting records, Cambridge University Press: Cambridge, 63-89.

Stock, J. and Watson, M. (2002) Forecasting using principal components from a large number of predictors, Journal of the American Statistical Association, 97, 1167-1179.

Tiao, G.C. and Guttman, I. (1980) Forecasting contemporal aggregates of multiple time series, Journal of Econometrics, 12, 219-230.

Wei, W.W.S. and Abraham, B. (1981) Forecasting contemporal time series aggregates, Communications in Statistics, Part A-Theory and Methods, 10, 1335-1344.

Young, P.C. (1984) Recursive estimation and time series analysis. Springer-Verlag: Berlin.

Zellner, A. and Tobias, J. (2000) A note on aggregation, disaggregation and forecasting performance, Journal of Forecasting, 19, 457-469.

9 Figures

[Figure 1 (image not reproduced): six time series panels labelled LBEL, LFRA, LITA, LNET, LSPA and LEURO5, each plotted over 1980 to 2002.]

Figure 1: Logs of GDP of Belgium, France, Italy, the Netherlands, Spain and their aggregated GDP.
