An Early Warning System for Tail Risks

An Early Warning System for Tail Risks ∗

Gianni De Nicolo’

Johns Hopkins University, Carey Business School †

Tuesday 18th June, 2019 (Preliminary)

Abstract

This paper formulates an Early Warning System (EWS) for tail risks based on a forecast combination of Value-at-Risk (VaR) and Expected Shortfall (ES) of multi-period returns of selected real and financial indicators. The forecast combination includes baseline (VaR,ES) forecasts conditional on an aggregate risk factor, as well as stress (sVaR,sES) forecasts conditional on the VaR of the aggregate risk factor. Using monthly data of the G-7 economies for the period 1975:01-2017:12, I show that stress forecasts improve forecasting performance, and that the constructed forecast combination has significant out-of-sample forecasting power for real and financial tail risks up to a 6-month forecasting horizon.

Keywords: Value at Risk; Expected Shortfall; Forecast combinations.

JEL Classification: C5; E3; G2.

∗ I thank without implications Fabio Canova, Paul Kupiec, and participants in a seminar at the Norges Central Bank for comments and suggestions.

† Johns Hopkins University Carey Business School ([email protected]).

1 Introduction

The financial crisis of 2007-2009 has spurred renewed efforts in constructing early warning systems for real and financial tail risks, prompted by progress in defining and measuring systemic risk.1 However, a standardized forecasting procedure that maximizes the forecasting performance of tail risk measures, and that individual and policy institutions alike can use to manage risks, is not currently available. This paper proposes such a procedure.

I formulate an Early Warning System (EWS) for tail risks measured by out-of-sample real-time forecast combinations of Value-at-Risk (VaR) and Expected Shortfall (ES) of real and financial returns. Rather than conducting a classical horse race among competing models with the goal of determining whether a winner exists, the proposed EWS exploits the potential of several competing (and likely mis-specified) models to improve forecasting performance in the spirit of Geweke and Amisano (2012).

Two novel features characterize the proposed EWS. First, I construct a forecast combination of (VaR,ES) for each return from specifications of a set of models. The weights assigned to each forecast in the combination are determined by a simple version of an Approximate Bayesian Computation (ABC) method. In a nutshell, at each forecasting date the (VaR,ES) forecast of model specification A is said to be dominated if there exists a forecast of another model that is strictly superior to A’s forecast. Dominance is established by a test of equal forecasting performance at a given confidence level using an appropriate scoring function associated with each model. The test is conducted at each forecasting date using a data evaluation window that precedes the forecasting date. The dominated forecasts receive zero weight in the combination, while the non-dominated forecasts receive equal weights. For ease of reference, I call this (VaR,ES) forecast combination the ABC forecast. The construction of the ABC forecast aims at capturing the potential persistence and time-variation in forecasting performance of different models found in the literature (see e.g. Aiolfi and Timmermann, 2016).

1 An early analysis of measures of systemic risk in banking is in Lehar (2005). Bisias et al. (2012) and Benoit et al. (2016) survey recently proposed measures of systemic risk. Current prominent statistical models aimed at capturing the dynamics of systemic risk include the CoVaR measures of Adrian and Brunnermeier (2016), the Systemic Expected Shortfall measure of Acharya et al. (2017), and the SRISK measures of Brownlees and Engle (2017).

Second, the ABC forecast includes forecasts of (VaR,ES) for each model’s specification conditional on observed predictors, called baseline forecasts, as well as forecasts conditional on the VaR of the predictor, called stress forecasts, and denoted by (sVaR,sES). The sVaR measure is a forecasting version of the CoVaR measure introduced by Adrian and Brunnermeier (2016). The sES measure is the ES conditional on the sVaR. Including stress forecasts in the combination makes it possible to gauge the value added of a stress test in terms of its ability to improve the performance of the ABC forecast. With the exception of Covas, Rump and Zakrajsek (2014) and Kupiec (2018), who explore the predictive power of stress test exercises in the US with respect to the financial crisis period, the integration of stress testing and forecasting has not been explored in the literature to date.2

To construct the ABC forecast, a set of models must be chosen. In this paper my choice of models is deliberately parsimonious, since I wish to gauge in a transparent way the contribution of each model’s forecast to the ABC forecast as related to its underlying assumptions. I consider three basic models of returns. Each model has an aggregate risk factor as a predictor. This risk factor is a measure of the volatility of stock market returns, interpreted as a measure of a “portfolio distance to insolvency” as in Atkeson et al. (2017). The first model is a simple linear model of mean returns predicted by the risk factor under a Gaussian distribution of the innovation. The second model is the same as the first one, except that the variance of a return has the risk factor as predictor. The third model is a quantile model with the risk factor as a predictor.

The weights of each model’s forecast in the ABC forecast are determined by tests of equal forecasting performance using the scoring function derived by Patton, Ziegel and Chen (2019), which is a member of the family of strictly consistent scoring functions identified by Fissler and Ziegel (2016). A scoring (or loss) function for a statistic is strictly consistent if the correct prediction of this statistic is the unique minimizer of the expected score. A statistic for which a strictly consistent scoring function exists is called “elicitable”. Gneiting (2011) shows that ES is not elicitable. Fissler and Ziegel (2016) identify the family of scoring functions such that the pair (VaR,ES) is “jointly” elicitable.3

2 De Nicolo and Lucchetta (2011, 2012) develop a “structural” stress test based on shock identification through a structural VAR, while Corbae et al. (2017) develop a “structural” microprudential stress test based on a banking model. Yet, these studies do not deal with forecasting.

I implement the EWS in real time using monthly time series of industrial production growth and equity returns of indexes of the non-financial and financial sectors of the G-7 economies during the 1975:01-2017:12 period. The (VaR,ES) forecasts of each return are 1-month-, 3-month-, 6-month-, and 12-month-ahead. Recent contributions on tail risks have focused mainly on US data (e.g. Allen et al., 2012, Hubrich and Tetlow, 2015, White et al., 2015, Giglio et al., 2016, De Nicolo and Lucchetta, 2017, Adrian, Boyarchenko and Giannone, 2018, Engle and Ruan, 2018). I use an international sample to evaluate the proposed EWS in the context of datasets with different, yet connected, histories and statistical properties, and with a focus on both real and financial tail risks.

I obtain two main results. First, stress forecasts have a significant role in improving the ABC forecast, since they receive sizable weights. Moreover, their contribution to the ABC forecast is largest during periods of financial stress. Second, the ABC forecast has significant predictive power up to the 6-month-ahead horizon, providing timely signals of increased real and financial vulnerabilities. Interestingly, the equally weighted forecast combination, which a variety of “optimal” unequal weighting schemes proposed in the literature have found hard to beat, is always beaten by the ABC forecast.

The remainder of the paper is composed of three sections. Section 2 describes the EWS setup

and the forecasting procedure. Section 3 details the empirical results. Section 4 concludes. An

Appendix reports additional tables and figures.

2 The EWS set-up

I detail below the choice of risk factor, the set of forecasting models, baseline and stress forecasts,

and the procedure to combine forecasts under different specifications of the models to obtain

the ABC forecast.

Let $R_{t+h} = \ln(X_{t+h}/X_t)$ be the log return of an indicator $X_t$ over the interval $[t, t+h]$. I consider three basic forecasting models of $R_{t+h}$, where the key predictor is a risk factor measured at a country level.

3 For a survey on elicitability and its relationship with back-testing and forecasting, see Nolde and Ziegel (2017).

2.1 The risk factor

Risk factors are proxy measures of the Distance to Insolvency (DI) derived by Atkeson, Eisfeldt, and Weill (2017) for a (market) portfolio of non-financial and financial firms. Using Leland’s (1994) structural model of credit risk, Atkeson, Eisfeldt, and Weill (2017) show that $DI \le \sigma^{-1} \le DD$, where $\sigma$ is the volatility of equity, DD is a measure of the distance to default, and DI is a measure of distance to insolvency. The above inequality is tight if creditors force firms into bankruptcy to minimize the cost of distress. Using U.S. data, they show that measures of $\sigma^{-1}$ for a large set of non-financial and financial firms track measures of insolvency risk derived from a wide range of structural models of firm valuation, as well as those implied by CDS spreads. A portfolio DI is a lower bound of the distance to insolvency of the firms in the portfolio, since its volatility is generally lower than the sum of the volatilities of its components.

This choice of risk factors is consistent with (endogenous) volatility as a key driver of systemic

risk in recent aggregate models of financial intermediation (see e.g. Brunnermeier and Sannikov,

2014, and Klimenko et al., 2016). Measures of risk shocks obtained either by cross-sectional or

time varying indicators of equity volatility have also been shown to be important sources of

business cycle fluctuations (see, e.g. Christiano, Motto and Rostagno, 2014, and Brunnermeier

et al., 2018).

Empirically, risk factors for each country are the (log) equity volatility (standard deviation) of stock market indexes constructed using daily data. As in Bandi and Perron (2008), I use the estimator of monthly realized variance given by $\sigma_t^2 = \sum_{j=1}^{d_j} r^2_{t-1+j/d_j}$, where $d_j$ is the number of trading days in a month and $r^2_{t-1+j/d_j}$ is the squared continuously compounded return in day $j$ of month $t$. A country risk factor at date $t$ is defined by and denoted with $V_t \equiv 0.5 \log \sigma_t^2$.
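As an illustration only, the realized-variance estimator and the risk factor can be sketched in a few lines of Python. The function names and the daily returns below are hypothetical and are not taken from the paper's data or code; the sketch simply sums squared daily log returns within one month and takes half the logarithm of the result.

```python
import math

def realized_variance(daily_log_returns):
    """Monthly realized variance: sum of squared daily log returns in the month."""
    return sum(r * r for r in daily_log_returns)

def risk_factor(daily_log_returns):
    """V_t = 0.5 * log(sigma_t^2), i.e. the log of monthly realized volatility."""
    return 0.5 * math.log(realized_variance(daily_log_returns))

# Hypothetical daily log returns for one (very short) month, for illustration only.
returns = [0.01, -0.02, 0.005, 0.0, -0.01]
sigma2 = realized_variance(returns)   # 0.000625, so realized volatility is 0.025
v_t = risk_factor(returns)            # log(0.025)
```

Because $V_t$ is half the log of the variance, it equals the log of the realized standard deviation, which is why the text refers to it as (log) equity volatility.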

2.2 Baseline and stress forecasts

I consider three models, labeled Model 1, Model 2, and Model 3. For simplicity, multistep

forecasts are h-month-ahead projections. Baseline τ level (VaR, ES) pairs are estimated for

τ = 0.10. Predictions and estimated coefficients are denoted with a “bar”.

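The h-month-ahead projection of a return with Model 1 is:

$$R_{t+h} = \alpha^h_1 + \sum_{i=1}^{p} \beta^h_{1i} V_{t-i} + \sigma_{1,t+h}\,\eta_{t+h} \tag{1}$$

where $p$ is the number of lags and the innovation $\eta_{t+h}$ is i.i.d. $N(0,1)$. The h-month-ahead baseline (VaR,ES) forecasts are:

$$\mathrm{VaR}_\tau(R_{t+h}) = \alpha^h_1 + \sum_{i=1}^{p} \beta^h_{1i} V_{t-i} + \sigma_{1,t+h}\,G(\tau) \tag{2}$$

$$\mathrm{ES}_\tau(R_{t+h}) = \alpha^h_1 + \sum_{i=1}^{p} \beta^h_{1i} V_{t-i} - \sigma_{1,t+h}\,H(\tau) \tag{3}$$

where $G(\tau) \equiv F^{-1}(\tau)$, $H(\tau) \equiv f(F^{-1}(\tau))/\tau$, and $f(\cdot)$ and $F(\cdot)$ are the density function and the cdf of the standard Normal, respectively.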
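The Gaussian formulas (2)-(3) can be checked numerically with a short sketch. This is not the paper's implementation: the function name is hypothetical, and the conditional mean and standard deviation are passed in as plain numbers rather than estimated from the regression in Equation (1).

```python
from statistics import NormalDist

def gaussian_var_es(mean, sigma, tau=0.10):
    """VaR and ES at level tau for a Normal return with the given mean and sigma."""
    std_normal = NormalDist()            # standard Normal, mean 0 and sd 1
    g = std_normal.inv_cdf(tau)          # G(tau) = F^{-1}(tau), negative for tau < 0.5
    h = std_normal.pdf(g) / tau          # H(tau) = f(F^{-1}(tau)) / tau
    var = mean + sigma * g               # Equation (2)
    es = mean - sigma * h                # Equation (3)
    return var, es

# Illustrative inputs: zero conditional mean, unit standard deviation.
var_10, es_10 = gaussian_var_es(mean=0.0, sigma=1.0, tau=0.10)
```

With these inputs the VaR is about -1.28 and the ES about -1.75, so the ES lies further in the left tail than the VaR, as expected.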

Model 2’s projection of the h-month-ahead return is the same as Model 1, but the variance depends on the risk factor:

$$\sigma^2_{2,t+h} = \exp(\phi_0 + \phi_1 V_t) \tag{4}$$

The h-month-ahead baseline (VaR,ES) forecasts of Model 2 are:

$$\mathrm{VaR}_\tau(R_{t+h}) = \alpha^h_2 + \sum_{i=1}^{p} \beta^h_{2i} V_{t-i} + \sqrt{\exp(\phi_0 + \phi_1 V_t)}\,G(\tau) \tag{5}$$

$$\mathrm{ES}_\tau(R_{t+h}) = \alpha^h_2 + \sum_{i=1}^{p} \beta^h_{2i} V_{t-i} - \sqrt{\exp(\phi_0 + \phi_1 V_t)}\,H(\tau) \tag{6}$$

where $G(\tau)$ and $H(\tau)$ are defined as above.

Model 3 is a quantile forecasting model. As stressed by Komunjer (2013), an advantage of

a quantile regression model is its independence of distributional assumptions, which may give

it the potential ability to capture important time-varying asymmetries in the distribution of

variables of interest. De Nicolo and Lucchetta (2017) document that this is the case for several

indicators of tail real and financial risks in the U.S.

The VaR forecast of Model 3 is the h-month-ahead projection of Koenker and Xiao’s (2006) quantile auto-regression (QAR(p)) given by:

$$\mathrm{VaR}_\tau(R_{t+h}) = \alpha^h_3(\tau) + \sum_{i=1}^{p} \beta^h_{3i}(\tau) V_{t-i} \tag{7}$$

To estimate the conditional ES forecast, I use a version of the semi-parametric procedure proposed and implemented by Taylor (2017). The starting point of the procedure is a result by Bassett, Koenker and Kordas (2004), who show that an estimate of the unconditional $\mathrm{ES}_\tau$ of a time series $R_t$ is $\mathrm{ES}_\tau = \bar{R} - \tau^{-1}\bar{\sigma}$, where $\bar{R}$ is the sample mean of $R_t$, $\bar{\sigma}$ is the sample average of the minimized tick loss function $\sigma_t = (R_t - \mathrm{VaR}_\tau(R_t))(\tau - I(R_t \le \mathrm{VaR}_\tau(R_t)))$, and $\mathrm{VaR}_\tau(R_t)$ is the estimated quantile. The conditional h-month-ahead ES forecast can be written as:

$$\mathrm{ES}_\tau(R_{t+h}) = E_t R_{t+h} - \tau^{-1}\sigma_{t+h} \tag{8}$$

where $\sigma_{t+h} = (R_{t+h} - \mathrm{VaR}_\tau(R_{t+h}))(\tau - I(R_{t+h} \le \mathrm{VaR}_\tau(R_{t+h})))$ is the forecast of the minimized tick loss function. Gourieroux and Li (2012) show that VaR and ES are connected by a monotonically increasing link function $L(\tau)$, yielding:

$$E_t R_{t+h} - \tau^{-1}\sigma_{t+h} = L(\tau)\,\mathrm{VaR}_\tau(R_{t+h}) \tag{9}$$

Given a VaR estimate, the ES forecast can be obtained by estimating the parameters of a link function. Let $Z_{t+h} \equiv R_{t+h} - \tau^{-1}\sigma_{t+h}$. I assume a linear link function, given by $L^h(\tau) = a^h_1(\tau) I(\mathrm{VaR}_\tau(R_{t+h}) < 0) + a^h_2(\tau) I(\mathrm{VaR}_\tau(R_{t+h}) > 0)$. Then, the ES forecast of Model 3 is the predicted value of the following regression:

$$Z_{t+h} = \left(a^h_1(\tau) I(\mathrm{VaR}_\tau(R_{t+h}) < 0) + a^h_2(\tau) I(\mathrm{VaR}_\tau(R_{t+h}) > 0)\right)\mathrm{VaR}_\tau(R_{t+h}) + e_{t+h} \tag{10}$$

The baseline ES forecast of Model 3 is:

$$\mathrm{ES}_\tau(R_{t+h}) = \left(a^h_1(\tau) I(\mathrm{VaR}_\tau(R_{t+h}) < 0) + a^h_2(\tau) I(\mathrm{VaR}_\tau(R_{t+h}) > 0)\right)\mathrm{VaR}_\tau(R_{t+h}) \tag{11}$$
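The unconditional result that the procedure starts from, namely that ES equals the sample mean minus the average tick loss scaled by $\tau^{-1}$, can be verified on a toy sample. The sketch below is a simplification under stated assumptions: it uses an order statistic as the empirical quantile instead of a quantile regression, it does not implement the conditional forecast or the link-function regression of Equations (8)-(11), and the function names and data are hypothetical.

```python
import math

def tick_loss(r, var, tau):
    """Tick (check) loss of return r at quantile estimate var: (r - var)(tau - 1{r <= var})."""
    indicator = 1.0 if r <= var else 0.0
    return (r - var) * (tau - indicator)

def unconditional_es(returns, tau):
    """Return (VaR, ES), with ES = mean(R) - mean(tick loss) / tau."""
    n = len(returns)
    ordered = sorted(returns)
    k = max(1, math.ceil(tau * n))    # number of observations in the lower tail
    var = ordered[k - 1]              # empirical tau-quantile (k-th order statistic)
    mean_r = sum(returns) / n
    mean_loss = sum(tick_loss(r, var, tau) for r in returns) / n
    return var, mean_r - mean_loss / tau

# Ten hypothetical returns; with tau = 0.20 the lower tail holds the two worst ones.
returns = [-0.08, -0.05, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.04, 0.06]
var_20, es_20 = unconditional_es(returns, tau=0.20)
tail_mean = (-0.08 + -0.05) / 2       # direct average of the two worst returns
```

In this sample the formula reproduces the average of the two worst returns, which is the usual direct definition of the empirical ES.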

Stress forecasts are h-month-ahead (VaR,ES) projections of real and financial returns conditional on the VaR of the risk factor. The VaR of the risk factor is estimated with Koenker and Xiao’s (2006) quantile auto-regression at level $\tau_V = 0.95$:

$$\mathrm{VaR}_{\tau_V}(V_t) = a(\tau_V) + b(\tau_V) V_{t-1} \tag{12}$$

The estimated VaR is denoted by $\overline{\mathrm{VaR}}_{\tau_V}(V_t)$. All (sVaR,sES) forecasts are measured for the pair of levels $(\tau, \tau_V) = (0.10, 0.95)$, and are obtained by replacing $V_t$ with $\overline{\mathrm{VaR}}_{\tau_V}(V_t)$ in Equations (2)-(3), (5)-(6), and (7), (9) and (11).

2.3 The scoring function for (VaR, ES) forecasts

To evaluate the out-of-sample forecasting performance of (VaR,ES) generated by different model specifications, I use the FZ0 scoring function derived by Patton, Ziegel and Chen (2019, Proposition 1). The FZ0 scoring function is a member of the family of strictly consistent scoring functions introduced by Fissler and Ziegel (2016), given by:

$$FZ_\tau(\mathrm{VaR}, \mathrm{ES}, x) = \left(I(x \le \mathrm{VaR}) - \tau\right)\left(G_1(\mathrm{VaR}) - G_1(x) + \frac{1}{\tau} G_2(\mathrm{ES})\,\mathrm{VaR}\right) - G_2(\mathrm{ES})\left(\frac{1}{\tau} I(x \le \mathrm{VaR})\,x - \mathrm{ES}\right) - H_2(\mathrm{ES}) \tag{13}$$

where $G_1$ is weakly increasing, $G_2$ is strictly increasing and strictly positive, and $H_2' = G_2$. The FZ0 scoring function, obtained with $G_1(x) = 0$ and $G_2(x) = -1/x$, applies to strictly negative values of VaR and ES, and is given by

$$FZ0_\tau(\mathrm{VaR}, \mathrm{ES}, x) = -\frac{1}{\tau\,\mathrm{ES}} I(x \le \mathrm{VaR})(\mathrm{VaR} - x) + \frac{\mathrm{VaR}}{\mathrm{ES}} + \log(-\mathrm{ES}) - 1 \tag{14}$$
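Equation (14) translates directly into code. The following sketch evaluates the FZ0 score for a single observation; the function name and the numerical inputs are hypothetical, chosen only to show how a VaR violation raises the score.

```python
import math

def fz0_score(var, es, x, tau=0.10):
    """FZ0 score of a (VaR, ES) forecast given the realized return x (lower is better)."""
    if var >= 0 or es >= 0:
        raise ValueError("FZ0 is defined only for strictly negative VaR and ES")
    hit = 1.0 if x <= var else 0.0                 # indicator I(x <= VaR)
    return -hit * (var - x) / (tau * es) + var / es + math.log(-es) - 1.0

# Illustrative forecast: VaR = -5%, ES = -8%, at tau = 0.10.
score_no_violation = fz0_score(var=-0.05, es=-0.08, x=0.01)   # return above the VaR
score_violation = fz0_score(var=-0.05, es=-0.08, x=-0.10)     # return below the VaR
```

When the realized return falls below the VaR, the first term becomes positive (here it adds 6.25), so violations are penalized in proportion to their size relative to the ES forecast.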

Following Amisano and Giacomini (2007), pairwise comparisons of forecasting performance of the (VaR,ES) forecasts are carried out by applying Diebold and Mariano (1995) tests of equal forecasting performance (DM tests henceforth) using the FZ0 scoring function. Patton, Ziegel and Chen (2019) show that the difference of the FZ0 scoring function of two (VaR,ES) estimates is homogeneous of degree 0, which is a property that strengthens the power of DM tests. Under standard regularity conditions, the DM statistic of the differences between the FZ0 scores associated with two (VaR,ES) forecasts is asymptotically standard normal under the null hypothesis of equality of scores. Because the FZ0 score is negatively oriented (a lower score is better), a (VaR,ES) forecast with FZ0 score A is superior to a (VaR,ES) forecast with FZ0 score B if the DM statistic of the difference A − B is significantly negative.
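A minimal sketch of such a pairwise comparison is given below. It rests on two simplifying assumptions that are mine rather than the paper's: the variance of the score differences is the plain sample variance, with no correction for the serial correlation that multi-step forecasts typically induce, and the test is one-sided at the 5% level. The function names and score series are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def dm_statistic(scores_a, scores_b):
    """DM statistic of the score differences A - B (simple, non-HAC variance)."""
    d = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(d) / math.sqrt(variance(d) / len(d))

def a_dominates_b(scores_a, scores_b, level=0.05):
    """True if A's mean score is significantly lower than B's (one-sided test)."""
    critical = NormalDist().inv_cdf(level)   # about -1.645 at the 5% level
    return dm_statistic(scores_a, scores_b) < critical

# Hypothetical FZ0 scores of two forecasts over a six-period evaluation window.
scores_a = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
scores_b = [1.5, 1.6, 1.4, 1.7, 1.3, 1.5]
stat = dm_statistic(scores_a, scores_b)
```

Here forecast A has uniformly lower scores, so the statistic is strongly negative and A is judged superior to B, while the reverse comparison does not indicate dominance.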

2.4 The ABC forecast

Forecast combinations have been used extensively in applications focusing on mean and density forecasts (for a review, see Chapter 10 in Elliott and Timmermann, 2016). A “forecast combination puzzle” has emerged, as the forecasting performance of equally weighted forecast combinations has been found hard to beat by a variety of “optimal” unequal weighting schemes in many forecasting applications.

The focus on tail risk measures such as VaR is comparatively more recent, stemming from

studies of combinations of either density forecasts or quantile forecasts. Recent contributions

include Diks et al (2011) and Opschoor et al (2017), who study the performance of optimal

weighting schemes derived from modified versions of the log score criterion (see, e.g. Geweke

and Amisano, 2011), or from the quantile weighted probability score proposed by Gneiting and

Ranjan (2011). De Nicolo and Lucchetta (2017) focus on quantile forecasts evaluated according

to the Gneiting and Ranjan (2011) criterion, showing the superiority of equally weighted forecast

combinations relative to single model VaRs.

Yet, these studies focus only on VaR forecast combinations and, to the best of my knowledge, no study considers ES combinations. Moreover, in most studies the evaluation of forecasting performance is conducted ex-post, since all out-of-sample forecasts obtained with the data are used. In other words, the evaluation data window is the entire sample, which is not available when evaluation is conducted at each forecasting date, i.e. in real-time.

The ABC forecast is obtained by pairwise DM tests of FZ0 scoring functions among candidate

models, which determine the inclusion of a forecast in the combination and the relevant weights

assigned to each included model. This procedure is germane to the computational procedures

of likelihood-free ABC methods, which by-pass the estimation of a posterior distribution (for a

review of ABC methods, see Lintusaari et al., 2017).

The ABC forecast is constructed as follows. At the forecasting date T a forecaster chooses the weights to assign to h-month-ahead tail risk forecasts of a return from a set of candidate models. The choice of weights is determined by pairwise DM tests of equal forecasting performance using past out-of-sample forecasts of each model over an evaluation data window that ends at T. If the forecasting performance of a model is significantly worse than that of at least one competing model, then the forecast of that model is assigned zero weight, that is, that forecast is dominated at the forecasting date. Forecasts that are not dominated are given equal weight, since DM tests do not reject equal forecasting performance. This procedure is repeated at each forecasting date.

Formally, denote with $(\mathrm{VaR}^m(R_{t+h}), \mathrm{ES}^m(R_{t+h}))$ the h-month-ahead forecasts of return $R_t$ obtained with model $m$. The total number of models is $M + N$, where $M$ is the number of individual models and $N$ is the number of reference combinations of all $M$ models with pre-determined weights, whose rationale for inclusion is explained momentarily. Denote with $\omega_T$ the length of the evaluation data window. The h-month-ahead forecast combination $C$ of (VaR,ES) at forecasting date $T$ is:

$$\left(\mathrm{VaR}^C_\tau(R_{T+h}),\ \mathrm{ES}^C_\tau(R_{T+h})\right) = \left(\sum_{m=1}^{M+N} w^m(\omega_T)\,\mathrm{VaR}^m(R_{T+h}),\ \sum_{m=1}^{M+N} w^m(\omega_T)\,\mathrm{ES}^m(R_{T+h})\right) \tag{15}$$

where $w^m(\omega_T) \ge 0$ for all $m$ and $\sum_{m=1}^{M+N} w^m(\omega_T) = 1$.

The weights of forecasts of models $m$ at the forecasting date $T$, for all models, are determined by the following rules:

1. $w^m(\omega_T) = 0$ if there exists a model $m'$ such that the difference of FZ0 means over the evaluation period, given by

$$\frac{1}{\omega_T}\sum_{s=T-\omega_T-h}^{T-h}\left[FZ0_s\left(\mathrm{VaR}^{m'}(R_{s+h}), \mathrm{ES}^{m'}(R_{s+h})\right) - FZ0_s\left(\mathrm{VaR}^{m}(R_{s+h}), \mathrm{ES}^{m}(R_{s+h})\right)\right]$$

is negative and significant according to a DM test at a 5% significance level.

2. Otherwise, $w^m(\omega_T) = \dfrac{1}{M+N-\sum_{j=1}^{M+N} I_j}$, where $I_j = 1$ if $w^j(\omega_T) = 0$, and $I_j = 0$ otherwise.

At each forecasting date, Rule 1 assigns zero weight to all dominated models, while Rule 2 assigns equal weight to all models for which the DM tests do not reject the null of equal forecasting performance. In essence, the ABC forecast is constructed under the assumption of equal (prior) weights and with equal (posterior) weights on the non-dominated models, with the difference of FZ0 scores for a given confidence level serving as the distance function partitioning the forecasts into dominated and non-dominated ones.

Rules 1 and 2 are akin to a combination of the selection rules of so-called “rejection” and “regression” ABC methods. The ABC forecast can also be interpreted as optimal in the sense of implementing the minimization of the weighted sum of the FZ0 scores of all models, subject to the “exclusion” constraint of Rule 1 and the equal (posterior) weight assigned to all non-dominated models by Rule 2.
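Rules 1 and 2 amount to a simple filter followed by equal weighting, which the sketch below illustrates. It assumes that the pairwise DM statistics have already been computed and stored in a matrix whose entry `dm[i][j]` is the statistic of the score difference between forecast i and forecast j, and it uses a one-sided 5% critical value of -1.645. The function names, the matrix layout, and the numbers are illustrative assumptions, not the paper's code.

```python
def abc_weights(dm, critical=-1.645):
    """Weights from a matrix of DM statistics, dm[i][j] = statistic of score_i - score_j."""
    n = len(dm)
    # Rule 1: forecast m is dominated if some other forecast scores significantly lower.
    dominated = [
        any(dm[other][m] < critical for other in range(n) if other != m)
        for m in range(n)
    ]
    survivors = n - sum(dominated)
    if survivors == 0:                 # guard against an inconsistent input matrix
        raise ValueError("every forecast is dominated; Rule 2 is undefined")
    # Rule 2: equal weights on the non-dominated forecasts.
    return [0.0 if is_dom else 1.0 / survivors for is_dom in dominated]

def combine(weights, forecasts):
    """Weighted combination of individual forecasts, as in Equation (15)."""
    return sum(w * f for w, f in zip(weights, forecasts))

# Hypothetical DM statistics for three forecasts: forecast 1 is beaten by 0 and by 2.
dm = [[0.0, -2.5, -0.4],
      [2.5, 0.0, 1.9],
      [0.4, -1.9, 0.0]]
weights = abc_weights(dm)                            # [0.5, 0.0, 0.5]
var_abc = combine(weights, [-0.04, -0.10, -0.06])    # combined VaR forecast
```

The same weights would be applied to the ES forecasts of the three models to obtain the ES component of the combination.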

In sum, the ABC forecast is constructed with the goal to: (a) incorporate the potential

“benefits” of equally weighted combinations; (b) capture forecasting persistence; (c) exploit the

potential of stress forecasts to improve performance; and, (d) exclude inferior forecasts. Its

implementation is in real time, as it replicates what a forecaster could do with the information

available at each forecasting date.

3 Implementation

To illustrate the mechanics of the EWS, some properties of the data, and the model evaluation

procedure, I begin the analysis with the results of a simple ABC (in-sample) prediction, which is

obtained by combining the predictions of the three models considered and their equally weighted

combination (EWC). The evaluation of models’ “fit” is ex-post, since the evaluation data window

is the entire data set. I then turn to the results of the ABC forecast in real time.

The three returns I consider are industrial production growth (IPG), the equity return of

an index of non-financial firms (RNF), and the equity return of an index of banks (RB).

3.1 A simple ABC prediction

Table 1 reports the weights of the simple ABC prediction for the three variables and the four horizons of each of the seven countries. A cursory glance at the table reveals the significant variation of the weights in the simple ABC prediction across variables, horizons and countries: no single model appears to dominate across the board, since a positive weight is assigned to individual models in almost all instances.

Table 2 summarizes the contribution of each individual model and the EWC to the ABC prediction for each variable and model across all horizons and countries. It reports the number and percentage of samples in which a model is dominated (the lower this number or percentage, the larger a model’s contribution to the ABC prediction), as well as the number of samples in which the EWC exactly corresponds to the ABC prediction. Model 1 records the highest number of samples in which it is dominated: this is particularly evident for the financial returns, suggesting that the dependence of the volatility of Model 2 and the quantile of Model 3 on the risk factor plays an important role. In fact, Models 2 and 3 exhibit similar inclusion rates in the ABC prediction across all three variables considered. The EWC equals the ABC prediction in about one third of samples for IPG, but equals the ABC prediction in only one sample for RB, and in no sample for RNF, showing a significant weakness for tail risk predictions of financial variables.

Table 3 and Figure 1 report in tabular and graphical form respectively summary statistics

of the ABC prediction of ES for all variables and countries. Two results stand out. First, both

mean and volatility of the ES forecast vary markedly across variables, horizons and countries,

suggesting significant heterogeneity in the sources of risks. Second, the ES forecast of RNF is

strictly lower than the ES forecast of RB in all countries, indicating significantly higher exposures

of the financial sectors to the risk factor relative to the non-financial sector.

I conduct model validation through backtesting, which aims at establishing the extent to which a model, or a combination of models, is likely to predict (or forecast) actual realizations of tail risks. Several tests are available to backtest VaR and several tests have been proposed in the literature to backtest ES. As pointed out by Acerbi and Szekely (2017), however, the information content of ES tests when considered in isolation is unclear: while VaR is backtestable since it is elicitable, ES is not backtestable since it is not elicitable. To overcome this problem, joint backtests of VaR and ES based on the FZ class of scoring functions have been recently proposed by Nolde and Ziegel (2017) and Patton et al. (2019) under the assumption that the models are correctly specified.

In this application, I evaluate the simple ABC prediction (and later the ABC forecast) with

a standard backtest for VaR, and a joint test for (VaR,ES) using the FZ0 scoring function. In

addition, I provide mean statistics comparing the “historical” ES and the predicted ES.

The VaR backtest is the quantile test proposed by Gaglianone et al. (2011), which has good power in small samples. This test is based on the following quantile regression:

$$R_{t+h} = \beta_0(\tau) + \beta_1(\tau)\,\mathrm{VaR}^{ABC}_\tau(R_{t+h}) \tag{16}$$

where $\mathrm{VaR}^{ABC}_\tau(R_{t+h})$ is the ABC’s VaR prediction (or forecast). The null hypothesis is $H_0: (\beta_0(\tau), \beta_1(\tau)) = (0, 1)$, testing the discrepancy of the estimated VaR from the “true” VaR. Gaglianone et al. (2011) show that the relevant test statistic has a $\chi^2_2$ distribution under the null $H_0$. The joint test on the (VaR,ES) is a DM test on the mean difference between the FZ0 score evaluated at the historical values of (VaR,ES) and the FZ0 score evaluated at the (VaR,ES) prediction.

Table 4 summarizes the results of this backtesting exercise (Appendix Table 1 reports detailed results). First, violations lower than the reference 10% indicate an over-prediction of VaR levels, which is concentrated at horizons longer than 1 month and accounts for 43% of the cases. Second, the quantile test does not reject the null in about 28% of the estimates. Third, the DM test fails to reject equal predictive performance in every case (see Appendix Table 1). Yet, the simple comparison of the historical ES and predicted ES shows that the ES is under-predicted in most cases.
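The violation rate used in the first result is simply the share of periods in which the realized return falls at or below the VaR prediction. A minimal sketch with made-up numbers follows; the function name and data are hypothetical and do not reproduce any entry of Table 4.

```python
def violation_rate(realized, var_forecasts):
    """Share of periods in which the realized return is at or below the VaR forecast."""
    hits = sum(1 for r, v in zip(realized, var_forecasts) if r <= v)
    return hits / len(realized)

# Twenty hypothetical periods with a constant VaR forecast of -5% and one violation.
realized = [0.01] * 19 + [-0.07]
var_forecasts = [-0.05] * 20
rate = violation_rate(realized, var_forecasts)   # 1 violation out of 20 periods
over_predicted = rate < 0.10                     # fewer violations than the 10% reference
```

A rate below the 10% reference level means that returns breach the predicted VaR less often than they should, that is, the predicted VaR is too far in the tail.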

Overall, the in-sample “fit” of the ABC prediction varies significantly across variables, countries, and horizons. These results illustrate the problems in interpreting the backtest statistics pointed out by Acerbi and Szekely (2017): many predictions do not pass the VaR backtest but pass the joint (VaR,ES) test based on the DM tests.

3.2 The ABC forecast

The ABC forecast includes forecasts of the following set of model specifications: baseline and stress forecasts of each model and EWC combinations obtained with an expanding and an 84-month rolling estimation window for each of the three models considered, as well as baseline and stress EWC reference combinations. Therefore, the ABC forecast is obtained by selecting weights of 16 different forecasts (4 baseline forecasts x 2 estimation windows + 4 stress forecasts x 2 estimation windows) for each forecasting horizon at each forecasting date.
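The count of 16 candidate forecasts can be made explicit by enumerating the combinations of forecast type, forecaster, and estimation window. The labels below are hypothetical names chosen for readability, not identifiers used in the paper.

```python
from itertools import product

kinds = ["baseline", "stress"]                           # forecast type
forecasters = ["Model 1", "Model 2", "Model 3", "EWC"]   # three models plus the EWC
windows = ["expanding", "rolling-84m"]                   # estimation window

# Every (type, forecaster, window) triple is one candidate in the ABC forecast.
specs = list(product(kinds, forecasters, windows))
n_specs = len(specs)                                     # 2 x 4 x 2 = 16
```

Each of these 16 candidates enters the pairwise DM comparisons at every forecasting date and horizon.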

The inclusion of forecasts using different estimation windows is meant to capture time variation in the estimated parameters. As reference combinations I consider the equally weighted forecast combination (EWC) of the baseline and stress models (N=2) as benchmarks.4 The inclusion of the EWC combinations is useful to assess their weight in the ABC forecast and gauge the marginal contribution of each individual model specification relative to the EWC. As observed previously, the role of stress forecasts in improving forecasting performance is assessed by the size of their weights in the ABC forecast.

In this application the length of the evaluation window $\omega_T$ is assumed to be expanding, from the first set of forecast evaluations, progressively adding observations up to the forecasting date as time progresses.5 The first estimation is conducted on the data window 1975:1-1984:12, with the first 1-month-ahead forecast for 1985:1. The first evaluation window starts in 1985:1 and ends in 1991:12, and expands thereafter at each forecasting date. Therefore, all forecasts are produced from 1992:1 on.

The following three tables show averages of weights of forecasts of different model specifications in the ABC forecast. Table 5 reports the sum of weights over all models for baseline and stress forecasts. For all variables and countries, the weight of stress forecasts is large, although the stress weights are on average lower than the weight assigned to the baseline forecasts, as may be expected. Specifically, the weight of the stress forecasts increases in periods of high volatility, indicating a positive value added of stress forecasts. Table 6 reports the sum of weights over all models estimated using an expanding window and a rolling window. On average, as well as in most individual estimations, forecasts with a rolling window carry a larger weight than those using an expanding window, indicating that time variations in the parameters of the forecasting model are important. Lastly, Table 7 reports the average weights for each model summed over expanding and rolling windows. No model is a winner across all estimates, but each model contributes to the ABC forecast. Note that in no instance does the ABC forecast equal the EWC forecasts: in other words, the ABC forecast always beats the EWC forecast.

4 Other reference combinations could be considered, such as one whose weights are proportional to the relative magnitude of the FZ0 score associated with each model.

5 Note that the length of the evaluation window might be an important parameter to consider, since a forecaster may be concerned more about the evaluation of recent performance rather than past performance. Further analysis on this issue is in progress.

Table 8 summarizes the results of the VaR and ES backtesting of the ABC forecast (detailed
results are in Appendix Table 8). Three results stand out. First, IPG violations lower than the
reference 10%, which indicate an over-prediction of VaR levels, occur in 8 out of 28 cases; that
is, VaR is over-estimated in about 29% of cases. By contrast, the VaR of RNF and RB is
over-estimated in all cases. In the VaR dimension, the BA forecasts may be viewed as
"conservative" in that they deliver an excess of warning signals: a risk-averse forecaster might
prefer this feature to the opposite case, where warning signals miss increases in tail risk. Second,
looking at the quantile tests, the null hypothesis is not rejected in 18 out of 28 cases for IPG, in
8 out of 28 cases for RNF, and in 12 out of 28 cases for RB. Perhaps unsurprisingly, the number
of tests rejecting the null increases with the length of the horizon. Third, the ES appears to be
under-estimated in all cases, but DM tests do not reject the null of equal FZ0 scores of the
historical (VaR,ES) and the (VaR,ES) forecast in any case (see Appendix Table 8). Overall, the
forecasting performance of the ABC forecast varies significantly across variables, countries, and
horizons, but is relatively better at the shortest horizons.
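The backtest ingredients referred to above can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the paper's procedure: it computes the violation rate against the 10% reference level, the FZ0 loss in the form given by Patton, Ziegel and Chen (2019), and a plain Diebold-Mariano statistic without the autocorrelation correction that overlapping multi-period forecasts would require. The function names and the toy data are assumptions made for the example, and the quantile test used in the paper is not reproduced here.

```python
import math


def violation_rate(realized, var_forecasts):
    """Share of periods in which the realized value falls below the VaR forecast."""
    hits = [y < v for y, v in zip(realized, var_forecasts)]
    return sum(hits) / len(hits)


def fz0_loss(y, var, es, alpha=0.10):
    """FZ0 loss for one observation (Patton, Ziegel and Chen, 2019).

    Requires es < 0; a coherent forecast also satisfies es <= var.
    """
    hit = 1.0 if y <= var else 0.0
    return -hit * (var - y) / (alpha * es) + var / es + math.log(-es) - 1.0


def dm_statistic(losses_a, losses_b):
    """Diebold-Mariano statistic for equal mean loss.

    Simplification: uses the plain sample variance of the loss differential,
    with no HAC correction for serial correlation.
    """
    d = [a - b for a, b in zip(losses_a, losses_b)]
    n = len(d)
    mean = sum(d) / n
    variance = sum((x - mean) ** 2 for x in d) / (n - 1)
    return mean / math.sqrt(variance / n)


# Toy data (hypothetical): ten realized values and a constant VaR forecast.
realized = [-0.5, 0.2, -1.4, 0.8, 0.1, -0.3, 0.6, 1.1, -0.2, 0.4]
var_forecasts = [-1.0] * len(realized)
rate = violation_rate(realized, var_forecasts)
print(f"violation rate: {rate:.2f} (reference level 0.10)")
```

A violation rate below the reference level means the VaR forecast was breached less often than expected, which is the sense in which the text calls the forecast conservative.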

The usefulness of the ABC forecast as a risk monitoring tool rests on its ability to anticipate
tail risk realizations. I assess this ability over the whole sample, as well as in the period
surrounding the 2007-2009 financial crisis.

Figure Set 1 shows the US time series of IPG, RNF and RB and the relevant (negative)
h-month-ahead ES forecasts over the entire sample (upper panel) and during the 2007:01-2011:12
period (lower panel). The ES forecasts anticipate actual tail risk realizations for all horizons
and all variables during the entire forecasting period. The usefulness of the proposed EWS can
be further illustrated by focusing on the 2007:01-2011:12 period. The ES forecasts at each horizon
worsened steadily from 2007:01 onward. Note that at the beginning of the second
quarter of 2008 most financial risk indicators in advanced economies (such as CDS spreads)
were observed to return to levels witnessed in mid-2007 (see BIS, 2008, pp. 1-2), while on
the real side, global growth was projected to slow down moderately (see IMF World Economic
Outlook, 2008). The easing of several risk indicators in the financial sector was perceived as a
decline in the potential for financial tail risk realizations, whereas growth prospects, although
revised downward, were not generally judged as implying a potential for real tail risk realizations.
Yet, as Figure Set 1 shows, the proposed EWS would have given a different message, since ES
forecasts increased continuously from 2007:01 onward.

Figure Set 2 shows the same evidence for the 6-month-ahead ES forecasts for all the other
countries. By and large, the results are similar to those found for the US: the ES forecasts
anticipate actual tail risk realizations for all variables during the entire forecasting period, and
particularly prior to the 2007-2009 financial crisis. It is important to note that in some instances
a sharp drop in the ES forecast for date T + h, made at date T, is followed by an actual drop of
the relevant variable at a date t with T < t < T + h, that is, after the forecasting date but before
the date the forecast refers to. In these instances the ES forecast does not identify the exact date
of the fall of a variable. Nevertheless, it anticipates the fall within the forecasting horizon,
providing a useful early warning signal.
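The timing argument above can be made concrete with a small sketch. It assumes dates are represented as integer month indices and uses a hypothetical helper, `classify_signal`, that is not part of the paper; it only encodes the comparison between the forecast date T, the target date T + h, and the date t of the realized drop.

```python
def classify_signal(forecast_date, horizon, drop_date):
    """Classify a realized drop relative to a forecast made at `forecast_date`.

    All dates are integer month indices; `horizon` is h in months.
    (Hypothetical helper for illustration only.)
    """
    target_date = forecast_date + horizon
    if drop_date <= forecast_date:
        # The drop happened on or before the forecast date: no warning was given.
        return "not anticipated"
    if drop_date < target_date:
        # T < t < T + h: the drop arrives before the date the forecast refers to.
        return "early warning within horizon"
    if drop_date == target_date:
        return "exact date"
    return "outside horizon"


# Example: a 6-month-ahead forecast made in month 100, with the drop in month 103.
label = classify_signal(forecast_date=100, horizon=6, drop_date=103)
print(label)
```

The second branch corresponds to the case discussed in the text, where the forecast misses the exact date but still signals the fall before it occurs.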

4 Conclusions

This paper has formulated an EWS based on a Bayesian-type forecast combination of models

for VaR and ES that integrates historical-based stress testing scenarios into tail risk forecasting.

The implementation on data for the G7 countries shows that the proposed EWS is promising in

delivering timely early warning signals for tail risks up to a 6-month forecasting horizon.

The implementation of the EWS presented in this paper has been designed parsimoniously
in terms of models and variables, in order to illustrate in a transparent way how the EWS can be
implemented and what its underlying assumptions are. However, it can be easily extended in
several directions. One important advantage of the modeling framework underlying the proposed
EWS is its flexibility. The model can be implemented using data at any level of disaggregation
(firm, sector, country), and it can incorporate any desired set of models from which to construct
the ABC forecast. Some of these extensions are part of my future research.


REFERENCES

Acerbi, Carlo and Szekely, Balazs, 2017, General Properties of Backtestable Statistics, available at SSRN: http://dx.doi.org/10.2139/ssrn.2905109

Acharya, Viral, Lasse Pedersen, Thomas Philippon, and Matthew Richardson, 2017, Measuring Systemic Risk, Review of Financial Studies, Vol. 30, n.1: 1-47.

Aiolfi, Marco, and Allan Timmermann, 2016, Persistence in forecasting performance and conditional combination strategies, Journal of Econometrics, Vol. 135: 31-53.

Allen, L., Bali, T. G., and Tang, Y., 2012, Does Systemic Risk in the Financial Sector Predict Future Economic Downturns? Review of Financial Studies, 25: 3000–3036.

Amisano, Gianni, and Raffaella Giacomini, 2007, Comparing Density Forecasts via Weighted Likelihood Ratio Tests, Journal of Business and Economic Statistics, 25(2): 177-190.

Atkeson, Andrew, Andrea Eisfeldt, and Pierre-Olivier Weill, 2017, Measuring the Financial Soundness of U.S. Firms, 1926-2012, Research in Economics, 71(3): 613-635.

Bandi, Federico, and Benoit Perron, 2008, Long-run risk-return trade-offs, Journal of Econometrics, Vol. 143, Issue 2: 349-374.

Bassett, Koenker and Kordas, 2004, Pessimistic Portfolio Allocation and Choquet Expected Utility, Journal of Financial Econometrics, 2: 477-492.

Benoit, Sylvain, Gilbert Colletaz, Christophe Hurlin, and Christophe Perignon, 2013, A Theoretical and Empirical Comparison of Systemic Risk Measures, halshs-00746272v2.

Bisias, Dimitrios, Mark Flood, Andrew Lo and Stavros Valavanis, 2012, A Survey of Systemic Risk Analytics, Office of Financial Research, Working Paper 0001, January.

Brownlees, Christian, and Robert Engle, 2017, SRISK: A Conditional Capital Shortfall Measure of Systemic Risk, Review of Financial Studies, Vol. 30, n.1: 48-79.

Brownlees, C., Chabot, B., Ghysels, E., and Kurz, C., 2018, Back to the Future: Backtesting Systemic Risk Measures during the Great Depression and Historical Bank Runs. Technical report.

Brunnermeier, M., D. Palia, K.A. Sastry, and C. Sims, 2018, Feedbacks: Financial Markets and Economic Activity, working paper, March, http://creativecommons.org/licenses/by-nc-sa/3.0/

Brunnermeier, M. and Y. Sannikov, 2014, A Macroeconomic Model with a Financial Sector, American Economic Review, Vol. 104, 2: 379-421.

Corbae, Dean, Pablo D’Erasmo, Sigurd Galaasen, Alfonso Irarrazabal, and Thomas Siemsen, 2017, Structural Stress Testing, mimeo.

Covas, F.B., B. Rump, and E. Zakrajsek, 2014, Stress-testing US Bank Holding Companies: A Dynamic Panel Quantile Regression Approach, International Journal of Forecasting, 30: 691-713.

Chernozhukov, Victor, Ivan Fernandez-Val and Alfred Galichon, 2010, Quantile and Probability Curves Without Crossing, Econometrica, Vol. 78, 3, pp. 1093-1125.

Christiano, L., R. Motto and M. Rostagno, 2014, Risk Shocks, American Economic Review, Vol. 104(1): 27-65.

De Nicolo, Gianni, and Marcella Lucchetta, 2011, “Systemic Risks and the Macroeconomy,” NBER Working Paper no. 16998, in Quantifying Systemic Risk, Joseph Haubrich and Andrew Lo, eds. (National Bureau of Economic Research, Cambridge, Massachusetts).

De Nicolo, Gianni, and Marcella Lucchetta, 2012, Systemic Real and Financial Risks: Measurement, Forecasting and Stress Testing, IMF Working Paper 12/58, February.

De Nicolo, Gianni, and Marcella Lucchetta, 2017, Forecasting Tail Risks, Journal of Applied Econometrics, 32: 159-170.

Diks, Cees, Valentyn Panchenko, and Dick van Dijk, 2011, Likelihood-based scoring rules for comparing density forecasts in tails, Journal of Econometrics, 163, 215–230.

Diebold F.X., and R. Mariano, 1995, Comparing Predictive Accuracy, Journal of Business and Economic Statistics, 13, 253-263.

Elliott, Graham, and Allan Timmermann, 2016, Economic Forecasting, Princeton University Press, Princeton, NJ.

Geweke, John, and Gianni Amisano, 2011, Optimal Prediction Pools, Journal of Econometrics, 164(1), 130-141.

Geweke, John, and Gianni Amisano, 2012, Predictions with Misspecified Models, American Economic Review: Papers and Proceedings, 102(3): 482–486.

Giacomini, Raffaella, and Ivana Komunjer, 2005, Evaluation and combinations of conditional quantile forecasts, Journal of Business and Economic Statistics, Vol. 23, no. 4: 416-431.

Giglio, S., Kelly, B., and Pruitt, S., 2016, Systemic Risk and the Macroeconomy: An Empirical Evaluation, Journal of Financial Economics, 119: 457-471.

Gneiting, Tilman, 2011, Making and Evaluating Point Forecasts, Journal of the American Statistical Association, Vol. 106, No. 494: 746-762.

Gneiting, Tilman, and Roopesh Ranjan, 2011, Comparing Density Forecasts using Threshold- and Quantile-Weighted Scoring Rules, Journal of Business and Economic Statistics, 29:3, 411-422. DOI: 10.1198/jbes.2010.08110.

Hubrich, K. and Tetlow, R., 2015, Financial Stress and Economic Dynamics: The Transmission of Crises, Journal of Monetary Economics, 70, 100–115.

Klimenko, N., S. Pfeil, JC Rochet and G. De Nicolo’, 2016, Aggregate Bank Capital and Credit Dynamics, Swiss Finance Institute Research Paper n. 16-42.

Lintusaari, Jarno, Michael Gutmann, Ritabrata Dutta, Samuel Kaski, and Jukka Corander, 2017, Fundamentals and Recent Developments in Approximate Bayesian Computation, Systematic Biology, Vol 66(1): e66-e82.

Lehar, Alfred, 2005, Measuring systemic risk: A risk management approach, Journal of Banking and Finance, Vol. 29: 2577-2603.

Koenker, R, and Z. Xiao, 2006, Quantile Autoregression, Journal of the American Statistical Association, Vol. 101, Issue 475: 113-129.

Komunjer, Ivana, 2013, Quantile Prediction, in Handbook of Economic Forecasting (Edited by Elliott, G. and A. Timmermann), Volume 2, Chapter 17, 961-994. ISBN 978-0-444-53683-9.

Kupiec, Paul, 2018, On the accuracy of alternative approaches for calibrating bank stress test models, Journal of Financial Stability, Vol. 38: 142-156.

Fissler, Tobias and Johanna Ziegel, 2016, Higher Order Elicitability and Osband’s Principle, The Annals of Statistics, Vol. 44, No. 4, 1680–1707, DOI: 10.1214/16-AOS1439

Leland, Hayne, 1994, Corporate debt value, bond covenants, and optimal capital structure, Journal of Finance, 49(4): 1213-1252.

Nolde, Natalia, and Johanna F. Ziegel, 2017, Elicitability and Backtesting: Perspectives for Banking Regulation, The Annals of Applied Statistics, 11(4): 1833-1874.

Opschoor, Anne, Dick van Dijk, and Michel van der Wel, Combining density forecasts using focused scoring rules, Journal of Applied Econometrics,

Patton, Andrew J, Johanna F. Ziegel, and Rui Chen, 2019, Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Risk), Journal of Econometrics, forthcoming.


Tables and Figures

Table 1. Weights of the simple BA combination (sample: 1975:01-2017:12)

IPG RNF RB

Horizon (months) 1 3 6 12 1 3 6 12 1 3 6 12

US Model 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Model 2 0.00 0.33 0.00 0.33 0.33 0.00 0.50 0.50 0.50 0.00 0.00 0.33

Model 3 0.50 0.33 0.50 0.33 0.33 1.00 0.50 0.50 0.50 1.00 1.00 0.33

EWP 1-3 0.50 0.33 0.50 0.33 0.33 0.00 0.00 0.00 0.00 0.00 0.00 0.33

CN Model 1 0.00 0.25 0.00 0.25 0.00 0.00 0.00 0.33 0.33 0.00 0.00 0.25

Model 2 0.33 0.25 0.00 0.25 0.00 0.50 1.00 0.33 0.33 0.33 0.50 0.25

Model 3 0.33 0.25 0.50 0.25 0.00 0.50 0.00 0.00 0.00 0.33 0.50 0.25

EWP 1-3 0.33 0.25 0.50 0.25 1.00 0.00 0.00 0.33 0.33 0.33 0.00 0.25

JP Model 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Model 2 0.33 0.00 0.33 1.00 0.33 1.00 0.00 0.33 0.50 0.33 0.00 0.33

Model 3 0.33 1.00 0.33 0.00 0.33 0.00 1.00 0.33 0.50 0.33 1.00 0.33

EWP 1-3 0.33 0.00 0.33 0.00 0.33 0.00 0.00 0.33 0.00 0.33 0.00 0.33

UK Model 1 0.33 0.33 0.25 0.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Model 2 0.00 0.33 0.25 0.33 0.33 1.00 0.50 1.00 0.50 0.33 0.00 0.33

Model 3 0.33 0.00 0.25 0.00 0.33 0.00 0.50 0.00 0.00 0.33 1.00 0.33

EWP 1-3 0.33 0.33 0.25 0.33 0.33 0.00 0.00 0.00 0.50 0.33 0.00 0.33

BD Model 1 0.33 0.00 0.00 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Model 2 0.33 0.00 0.00 0.25 0.50 0.50 0.50 0.33 1.00 0.50 0.00 0.33

Model 3 0.00 1.00 0.50 0.25 0.00 0.50 0.50 0.33 0.00 0.50 1.00 0.33

EWP 1-3 0.33 0.00 0.50 0.25 0.50 0.00 0.00 0.33 0.00 0.00 0.00 0.33

FR Model 1 0.50 0.25 0.00 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Model 2 0.50 0.25 0.33 0.25 0.00 0.33 0.33 0.33 0.50 0.00 0.50 0.50

Model 3 0.00 0.25 0.33 0.25 0.50 0.33 0.33 0.33 0.00 1.00 0.00 0.00

EWP 1-3 0.00 0.25 0.33 0.25 0.50 0.33 0.33 0.33 0.50 0.00 0.50 0.50

IT Model 1 0.33 0.25 0.00 0.25 0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00

Model 2 0.33 0.25 0.33 0.25 0.00 1.00 0.33 0.00 0.33 0.50 0.00 0.50

Model 3 0.00 0.25 0.33 0.25 0.50 0.00 0.33 0.00 0.33 0.00 1.00 0.00

EWP 1-3 0.33 0.25 0.33 0.25 0.50 0.00 0.33 0.50 0.33 0.50 0.00 0.50


Table 2. Number and percentages of dominated models and EWC = BA (sample: 1975:01-2017:12)

Variable / Number of samples / Model 1 / Model 2 / Model 3 / EWC / EWC = BA
IPG 28 14 7 6 4 8
share of samples 0.50 0.25 0.21 0.14 0.29
RNF 28 26 6 9 13 0
share of samples 0.93 0.21 0.32 0.46 0.00
RB 28 26 8 7 13 1
share of samples 0.93 0.29 0.25 0.46 0.04

Table 3. ES estimates (sample: 1975:01-2017:12)

Columns: Country, Horizon (months), then Mean, Std. Dev., Min, Max for IPG, for RNF, and for RB
US 1 -0.95 0.39 -3.24 -0.23 -5.96 3.17 -27.09 -1.06 -10.34 5.85 -46.03 -1.04
US 3 -1.44 0.63 -5.52 -0.29 -6.62 2.85 -20.82 -0.86 -12.76 4.20 -36.29 -0.27
US 6 -2.45 1.50 -11.25 -0.55 -10.95 5.62 -43.67 0.33 -15.65 8.52 -64.70 -0.14
US 12 -3.56 1.67 -14.20 -0.61 -13.62 8.08 -59.85 -0.94 -25.02 6.54 -60.47 -12.37
CN 1 -1.69 0.33 -3.11 -0.88 -6.87 3.08 -25.10 -1.61 -7.91 2.08 -20.48 -3.96
CN 3 -2.32 0.59 -5.28 -1.29 -10.46 3.14 -28.47 -5.37 -11.63 2.02 -22.83 -7.61
CN 6 -3.24 1.54 -12.67 -0.81 -17.86 6.40 -54.35 -7.91 -15.82 4.02 -36.20 -8.69
CN 12 -5.29 1.29 -12.18 -3.11 -22.76 5.99 -52.89 -12.68 -20.07 1.89 -31.01 -16.26
JP 1 -2.84 0.81 -7.99 -1.11 -7.88 3.44 -26.54 -1.25 -11.49 5.52 -41.42 -1.74
JP 3 -3.23 1.06 -9.58 0.11 -14.56 5.07 -40.14 -2.68 -18.18 6.30 -99.69 -4.67
JP 6 -5.40 2.45 -17.93 -0.94 -14.34 5.98 -36.45 -0.29 -21.48 10.69 -69.17 -0.01
JP 12 -9.02 5.18 -38.61 1.75 -23.79 6.88 -51.31 -5.13 -37.58 8.69 -68.79 -13.50
UK 1 -2.07 0.25 -3.00 -1.48 -6.68 3.29 -25.94 -1.00 -10.49 5.09 -43.43 -1.70
UK 3 -2.61 0.43 -4.58 -1.71 -11.23 3.90 -33.81 -2.71 -15.83 5.55 -43.83 -2.45
UK 6 -3.34 0.93 -7.66 -1.64 -12.11 4.77 -36.11 -2.84 -16.82 8.04 -44.51 -1.44
UK 12 -5.01 1.39 -12.71 -2.36 -16.62 7.59 -58.73 -2.15 -27.22 7.74 -65.74 -10.53
BD 1 -2.85 0.14 -3.49 -2.42 -7.13 3.28 -28.01 0.18 -10.82 6.24 -62.02 -2.78
BD 3 -2.89 0.37 -4.91 -1.37 -11.52 2.97 -26.63 -4.85 -18.53 7.01 -60.93 -7.36
BD 6 -3.85 1.30 -18.15 -1.48 -16.84 5.21 -40.86 -5.75 -23.74 11.84 -66.70 -0.52
BD 12 -6.16 0.43 -10.30 -5.13 -21.34 5.87 -44.74 -9.07 -38.01 14.31 -91.87 -11.51
FR 1 -2.30 0.18 -3.00 -1.86 -8.15 4.11 -39.01 -1.66 -9.45 4.80 -36.58 -0.66
FR 3 -2.63 0.47 -4.58 -1.59 -12.61 2.31 -21.71 -6.14 -11.95 4.60 -34.63 0.83
FR 6 -3.20 0.81 -6.80 -1.42 -18.59 4.69 -36.72 -8.28 -17.69 7.13 -41.37 1.74
FR 12 -4.65 0.88 -8.70 -2.73 -23.86 4.02 -37.58 -12.45 -36.30 3.20 -50.42 -27.22
IT 1 -3.08 0.23 -3.93 -2.47 -9.69 3.96 -25.77 -1.38 -13.08 6.11 -42.31 -2.47
IT 3 -3.80 0.55 -5.88 -2.58 -19.67 5.06 -40.87 -9.78 -24.59 7.38 -58.12 -11.06
IT 6 -4.87 1.09 -9.19 -2.57 -23.71 2.94 -34.93 -16.37 -25.20 7.87 -49.59 -7.93
IT 12 -7.18 0.92 -10.84 -5.14 -36.36 1.96 -42.61 -31.91 -59.26 7.00 -82.93 -43.31


Figure 1. ES mean and standard deviation of AB combinations (sample: 1975:01-2017:12)

[Figure omitted: three panels (IPG, RNF, RB), each plotting the ES mean and ES standard deviation by country (US, CN, JP, UK, BD, FR, IT) and horizon (1, 3, 6, 12 months).]

Table 4. VaR and ES backtests¹ (sample: 1975:01-2017:12)

¹ DM tests of ES based on differences between the FZ0 scores of the historical and estimated (VaR,ES) do not reject the null of equal performance for all variables and countries (see Appendix Table 4).

Columns: IPG (h=1, 3, 6, 12), then RNF (h=1, 3, 6, 12), then RB (h=1, 3, 6, 12)

Country / horizon h=1 h=3 h=6 h=12 h=1 h=3 h=6 h=12 h=1 h=3 h=6 h=12

US

Violations (in%) 9.28 10.98 7.39 8.90 9.66 11.17 6.82 9.66 9.85 13.45 5.30 6.63

p-value quantile test 0.31 0.00 0.00 0.00 0.85 0.00 0.00 0.00 0.83 0.00 0.00 0.00

Historical ES -1.10 -2.51 -4.33 -6.74 -7.80 -13.20 -17.58 -26.10 -12.63 -21.34 -28.75 -38.86

Estimated ES -0.95 -1.44 -2.45 -3.56 -5.96 -6.62 -10.95 -13.62 -10.34 -12.76 -15.65 -25.02

CN

Violations (in%) 10.61 13.64 9.47 11.17 9.85 11.36 8.14 10.04 8.90 10.80 8.33 8.90

p-value quantile test 0.69 0.00 0.00 0.02 0.80 0.00 0.01 0.04 0.22 0.00 0.04 0.36

Historical ES -1.85 -3.07 -4.99 -7.46 -8.39 -15.35 -22.30 -28.76 -8.78 -15.98 -20.66 -23.95

Estimated ES -1.69 -2.32 -3.24 -5.29 -6.87 -10.46 -17.86 -22.76 -7.91 -11.63 -15.82 -20.07

JP

Violations (in%) 9.85 13.45 8.33 8.90 10.23 9.47 6.44 9.09 7.58 11.93 8.52 9.47

p-value quantile test 0.01 0.00 0.00 0.04 0.22 0.00 0.00 0.00 0.04 0.00 0.57 0.00

Historical ES -3.02 -5.26 -8.36 -12.42 -9.42 -17.29 -24.37 -34.93 -13.49 -24.58 -34.96 -47.70

Estimated ES -2.84 -3.23 -5.40 -9.02 -7.88 -14.56 -14.34 -23.79 -11.49 -18.18 -21.48 -37.58

UK

Violations (in%) 10.80 10.98 9.09 9.47 11.93 8.14 6.63 9.47 10.04 10.61 5.30 8.90

p-value quantile test 0.57 0.21 0.15 0.07 0.04 0.00 0.00 0.00 0.86 0.01 0.00 0.00

Historical ES -2.16 -3.14 -4.47 -6.77 -8.02 -13.76 -17.40 -22.98 -12.80 -23.45 -30.52 -42.82

Estimated ES -2.07 -2.61 -3.34 -5.01 -6.68 -11.23 -12.11 -16.62 -10.49 -15.83 -16.82 -27.22

BD

Violations (in%) 9.47 12.12 10.04 9.47 11.74 11.93 8.90 10.04 11.17 10.04 6.06 8.14

p-value quantile test 0.06 0.00 0.00 0.27 0.09 0.00 0.00 0.00 0.77 0.00 0.01 0.00

Historical ES -2.88 -4.05 -6.28 -9.30 -9.73 -17.88 -24.43 -31.74 -14.79 -29.16 -43.48 -61.08

Estimated ES -2.85 -2.89 -3.85 -6.16 -7.13 -11.52 -16.84 -21.34 -10.82 -18.53 -23.74 -38.01

FR

Violations (in%) 11.93 11.74 9.66 9.47 10.04 11.17 7.20 11.74 10.23 13.45 7.39 11.93

p-value quantile test 0.90 0.65 0.01 0.00 0.10 0.00 0.00 0.00 0.18 0.00 0.00 0.02

Historical ES -2.33 -3.13 -4.23 -6.38 -9.79 -18.55 -25.48 -34.83 -12.55 -23.96 -33.42 -44.02

Estimated ES -2.30 -2.63 -3.20 -4.65 -8.15 -12.61 -18.59 -23.86 -9.45 -11.95 -17.69 -36.30

IT

Violations (in%) 11.17 12.12 8.14 9.28 10.42 8.14 8.33 11.17 9.85 7.77 6.82 6.25

p-value quantile test 0.60 0.34 0.00 0.08 0.64 0.08 0.00 0.00 0.98 0.00 0.00 0.00

Historical ES -3.20 -4.36 -6.58 -9.81 -11.14 -20.91 -30.00 -38.84 -14.80 -26.60 -39.52 -58.89

Estimated ES -3.08 -3.80 -4.87 -7.18 -9.69 -19.67 -23.71 -36.36 -13.08 -24.59 -25.20 -59.26


Table 5. Average Fraction of Baseline and Stress Forecasts included in the BA Combination (sample: 1992:01-2017:12)

Columns: IPG, RNF, RF at horizons of 1, 3, 6 and 12 months (in that order), followed by the country mean

Country / forecast IPG RNF RF IPG RNF RF IPG RNF RF IPG RNF RF Mean
US
Baseline 0.76 0.85 0.64 0.93 0.60 0.53 0.86 0.54 0.50 0.58 0.55 0.50 0.65
Stress 0.24 0.15 0.36 0.07 0.40 0.47 0.14 0.46 0.50 0.42 0.45 0.50 0.35
CN
Baseline 0.66 0.72 0.67 0.64 0.83 0.65 0.58 0.73 0.47 0.60 0.58 0.70 0.65
Stress 0.34 0.28 0.33 0.36 0.17 0.35 0.42 0.27 0.53 0.40 0.42 0.30 0.35
JP
Baseline 0.46 0.82 0.61 0.62 0.82 0.89 0.46 0.72 0.58 0.45 0.53 0.37 0.61
Stress 0.54 0.18 0.39 0.38 0.18 0.11 0.54 0.28 0.42 0.55 0.47 0.63 0.39
UK
Baseline 0.71 0.94 0.74 0.73 0.73 0.95 0.59 0.44 0.84 0.54 0.55 0.44 0.68
Stress 0.29 0.06 0.26 0.27 0.27 0.05 0.41 0.56 0.16 0.46 0.45 0.56 0.32
BD
Baseline 0.49 0.72 0.59 0.46 0.52 0.56 0.59 0.59 0.67 0.53 0.37 0.55 0.55
Stress 0.51 0.28 0.41 0.54 0.48 0.44 0.41 0.41 0.33 0.47 0.63 0.45 0.45
FR
Baseline 0.48 0.76 0.72 0.44 0.73 1.00 0.77 0.96 0.70 0.64 0.56 0.50 0.69
Stress 0.52 0.24 0.28 0.56 0.27 0.00 0.23 0.04 0.30 0.36 0.44 0.50 0.31
IT
Baseline 1.00 0.78 0.58 0.47 0.76 0.98 0.81 0.53 0.67 0.76 0.60 0.67 0.72
Stress 0.00 0.22 0.42 0.53 0.24 0.02 0.19 0.47 0.33 0.24 0.40 0.33 0.28

Table 6. Average Fraction of Expanding and Rolling Window Forecasts in the BA Combination (sample: 1992:01-2017:12)

Columns: IPG, RNF, RF at horizons of 1, 3, 6 and 12 months (in that order), followed by the country mean

Country / window IPG RNF RF IPG RNF RF IPG RNF RF IPG RNF RF Mean

US

Expanding 0.30 0.44 0.46 0.23 0.22 0.46 0.20 0.25 0.31 0.32 0.38 0.23 0.33

Rolling 0.70 0.56 0.54 0.77 0.78 0.54 0.80 0.75 0.69 0.68 0.62 0.77 0.67

CN

Expanding 0.38 0.39 0.40 0.18 0.24 0.36 0.18 0.20 0.47 0.34 0.45 0.36 0.33

Rolling 0.62 0.61 0.60 0.82 0.76 0.64 0.82 0.80 0.53 0.66 0.55 0.64 0.67

JP

Expanding 0.41 0.52 0.58 0.28 0.56 0.52 0.41 0.51 0.52 0.36 0.60 0.67 0.48

Rolling 0.59 0.48 0.42 0.72 0.44 0.48 0.59 0.49 0.48 0.64 0.40 0.33 0.52

UK

Expanding 0.10 0.57 0.71 0.05 0.41 0.22 0.14 0.36 0.22 0.09 0.30 0.41 0.29

Rolling 0.90 0.43 0.29 0.95 0.59 0.78 0.86 0.64 0.78 0.91 0.70 0.59 0.71

BD

Expanding 0.51 0.47 0.58 0.31 0.34 0.47 0.47 0.18 0.14 0.21 0.32 0.51 0.36

Rolling 0.49 0.53 0.42 0.69 0.66 0.53 0.53 0.82 0.86 0.79 0.68 0.49 0.64

FR

Expanding 0.20 0.51 0.50 0.26 0.35 0.53 0.30 0.42 0.44 0.29 0.06 0.44 0.35

Rolling 0.80 0.49 0.50 0.74 0.65 0.47 0.70 0.58 0.56 0.71 0.94 0.56 0.65

IT

Expanding 0.00 0.71 0.32 0.16 0.38 0.32 0.45 0.41 0.25 0.28 0.38 0.25 0.33

Rolling 1.00 0.29 0.68 0.84 0.62 0.68 0.55 0.59 0.75 0.72 0.62 0.75 0.67


Table 7. Average Fraction of Model Forecasts in the BA Combination (sample: 1992:01-2017:12)

Horizon (months) 1 3 6 12

Country Mean

IPG RNF RF IPG RNF RF IPG RNF RF IPG RNF RF

US

Model 1, baseline 0.11 0.10 0.07 0.14 0.02 0.03 0.15 0.04 0.08 0.12 0.10 0.11 0.09

Model 2, baseline 0.16 0.24 0.18 0.18 0.15 0.17 0.17 0.22 0.21 0.13 0.14 0.09 0.17

Model 3, baseline 0.24 0.30 0.19 0.47 0.25 0.17 0.38 0.04 0.10 0.16 0.19 0.18 0.22

EWC, baseline 0.24 0.20 0.20 0.14 0.17 0.17 0.15 0.23 0.11 0.16 0.12 0.11 0.17

Model 1, stress 0.12 0.01 0.02 0.04 0.14 0.02 0.00 0.05 0.09 0.07 0.07 0.10 0.06

Model 2, stress 0.09 0.12 0.18 0.00 0.15 0.15 0.04 0.25 0.21 0.08 0.17 0.14 0.13

Model 3, stress 0.03 0.02 0.05 0.02 0.07 0.18 0.02 0.04 0.07 0.15 0.07 0.12 0.07

EWC, stress 0.00 0.00 0.12 0.00 0.03 0.12 0.08 0.13 0.13 0.12 0.15 0.14 0.09

CN

Model 1, baseline 0.05 0.07 0.08 0.10 0.02 0.13 0.10 0.09 0.03 0.12 0.15 0.01 0.08

Model 2, baseline 0.21 0.30 0.45 0.10 0.16 0.15 0.09 0.19 0.19 0.12 0.15 0.24 0.19

Model 3, baseline 0.19 0.08 0.01 0.36 0.49 0.33 0.30 0.34 0.16 0.19 0.12 0.24 0.24

EWC, baseline 0.21 0.27 0.13 0.08 0.17 0.04 0.09 0.12 0.08 0.16 0.17 0.21 0.14

Model 1, stress 0.01 0.00 0.00 0.04 0.00 0.13 0.10 0.05 0.07 0.11 0.13 0.08 0.06

Model 2, stress 0.16 0.18 0.31 0.06 0.06 0.08 0.09 0.03 0.16 0.14 0.10 0.11 0.12

Model 3, stress 0.07 0.00 0.01 0.16 0.09 0.13 0.12 0.16 0.21 0.02 0.06 0.05 0.09

EWC, stress 0.10 0.11 0.01 0.10 0.02 0.00 0.10 0.02 0.09 0.12 0.13 0.07 0.07

JP

Model 1, baseline 0.18 0.01 0.00 0.04 0.01 0.09 0.04 0.08 0.08 0.09 0.17 0.11 0.08

Model 2, baseline 0.10 0.29 0.12 0.14 0.28 0.22 0.13 0.23 0.18 0.21 0.14 0.16 0.19

Model 3, baseline 0.13 0.23 0.18 0.28 0.25 0.30 0.16 0.18 0.12 0.03 0.09 0.03 0.16

EWC, baseline 0.05 0.29 0.31 0.15 0.28 0.27 0.13 0.23 0.21 0.11 0.12 0.08 0.19

Model 1, stress 0.21 0.01 0.00 0.07 0.00 0.01 0.08 0.00 0.03 0.13 0.17 0.23 0.08

Model 2, stress 0.19 0.17 0.03 0.05 0.15 0.00 0.15 0.11 0.11 0.23 0.17 0.18 0.13

Model 3, stress 0.09 0.00 0.25 0.25 0.00 0.02 0.16 0.03 0.12 0.01 0.00 0.11 0.09

EWC, stress 0.05 0.01 0.11 0.01 0.02 0.08 0.14 0.13 0.15 0.19 0.13 0.11 0.09

UK

Model 1, baseline 0.17 0.02 0.03 0.06 0.03 0.13 0.12 0.07 0.07 0.15 0.13 0.16 0.09

Model 2, baseline 0.19 0.34 0.19 0.06 0.34 0.42 0.12 0.25 0.40 0.15 0.18 0.25 0.24

Model 3, baseline 0.20 0.34 0.29 0.30 0.18 0.14 0.23 0.00 0.08 0.08 0.10 0.00 0.16

EWC, baseline 0.14 0.23 0.23 0.30 0.18 0.27 0.12 0.13 0.30 0.15 0.14 0.03 0.18

Model 1, stress 0.11 0.01 0.00 0.00 0.00 0.01 0.09 0.01 0.06 0.12 0.14 0.26 0.07

Model 2, stress 0.14 0.02 0.25 0.02 0.27 0.04 0.12 0.24 0.04 0.11 0.19 0.25 0.14

Model 3, stress 0.00 0.00 0.00 0.22 0.00 0.00 0.13 0.06 0.04 0.10 0.06 0.00 0.05

EWC, stress 0.04 0.02 0.01 0.03 0.00 0.00 0.07 0.24 0.00 0.14 0.06 0.05 0.06


BD

Model 1, baseline 0.01 0.00 0.00 0.00 0.03 0.03 0.13 0.14 0.15 0.13 0.02 0.10 0.06

Model 2, baseline 0.09 0.26 0.25 0.09 0.34 0.18 0.11 0.23 0.23 0.13 0.11 0.16 0.18

Model 3, baseline 0.21 0.26 0.20 0.30 0.14 0.16 0.17 0.14 0.17 0.14 0.12 0.13 0.18

EWC, baseline 0.18 0.21 0.14 0.06 0.00 0.19 0.17 0.09 0.12 0.13 0.12 0.17 0.13

Model 1, stress 0.11 0.00 0.00 0.00 0.00 0.00 0.07 0.12 0.08 0.05 0.08 0.14 0.05

Model 2, stress 0.11 0.26 0.38 0.12 0.31 0.32 0.05 0.23 0.23 0.13 0.19 0.10 0.20

Model 3, stress 0.11 0.00 0.01 0.36 0.17 0.09 0.17 0.00 0.00 0.22 0.24 0.10 0.12

EWC, stress 0.18 0.02 0.03 0.06 0.00 0.03 0.12 0.06 0.02 0.06 0.12 0.11 0.07

FR

Model 1, baseline 0.18 0.04 0.04 0.14 0.00 0.00 0.12 0.00 0.08 0.12 0.27 0.08 0.09

Model 2, baseline 0.21 0.23 0.20 0.04 0.00 0.00 0.19 0.16 0.21 0.14 0.16 0.21 0.15

Model 3, baseline 0.00 0.24 0.25 0.12 0.71 1.00 0.27 0.79 0.21 0.24 0.00 0.21 0.34

EWC, baseline 0.09 0.24 0.23 0.14 0.02 0.00 0.19 0.01 0.20 0.15 0.14 0.20 0.13

Model 1, stress 0.19 0.00 0.00 0.14 0.00 0.00 0.04 0.00 0.08 0.04 0.25 0.08 0.07

Model 2, stress 0.30 0.22 0.23 0.12 0.00 0.00 0.07 0.01 0.05 0.06 0.15 0.05 0.11

Model 3, stress 0.01 0.01 0.00 0.15 0.27 0.00 0.04 0.03 0.14 0.19 0.03 0.14 0.08

EWC, stress 0.02 0.01 0.05 0.15 0.00 0.00 0.07 0.00 0.05 0.07 0.00 0.05 0.04

IT

Model 1, baseline 0.16 0.04 0.00 0.11 0.00 0.00 0.14 0.21 0.07 0.15 0.20 0.07 0.10

Model 2, baseline 0.35 0.22 0.29 0.13 0.34 0.00 0.04 0.21 0.15 0.16 0.13 0.15 0.18

Model 3, baseline 0.14 0.29 0.14 0.12 0.20 0.97 0.37 0.00 0.28 0.24 0.08 0.28 0.26

EWC, baseline 0.35 0.23 0.15 0.12 0.21 0.00 0.26 0.11 0.17 0.21 0.19 0.17 0.18

Model 1, stress 0.00 0.03 0.08 0.06 0.00 0.00 0.03 0.16 0.00 0.12 0.20 0.00 0.06

Model 2, stress 0.00 0.13 0.22 0.11 0.22 0.02 0.03 0.20 0.17 0.12 0.06 0.17 0.12

Model 3, stress 0.00 0.01 0.00 0.25 0.01 0.00 0.05 0.01 0.12 0.01 0.00 0.12 0.05

EWC, stress 0.00 0.04 0.12 0.11 0.02 0.00 0.08 0.10 0.04 0.00 0.14 0.04 0.06


Table 8. VaR and ES backtests² for the BA combination (sample: 1992:01-2017:12)

² DM tests of ES based on differences between the FZ0 scores of the historical and estimated (VaR,ES) do not reject the null of equal performance for all variables and countries (see Appendix Table 8).

Horizon (months): 1 (first seven columns), 3 (last seven columns)

Country US CN JP UK BD FR IT US CN JP UK BD FR IT

Variable IPG IPG

Violations (in %) 6.79 10.49 12.65 12.35 12.96 11.73 12.04 7.72 9.88 10.19 13.27 11.73 12.35 9.88

quantile test p-value 0.30 0.18 0.87 0.19 0.93 0.84 0.62 0.03 0.27 0.37 0.89 0.02 0.48 0.10

HistES -1.00 -1.62 -3.65 -1.69 -2.63 -2.33 -2.75 -2.36 -3.01 -6.90 -2.51 -4.42 -3.47 -4.55

EstES -0.83 -1.52 -2.93 -1.64 -2.75 -2.25 -2.43 -1.33 -2.21 -4.50 -2.05 -3.31 -2.75 -3.71

Variable RNF RNF

Violations (in %) 8.64 7.10 8.64 8.95 9.88 5.25 4.94 5.25 6.17 7.41 5.86 7.41 2.78 5.25

quantile test p-value 0.58 0.45 0.18 0.08 0.24 0.02 0.00 0.02 0.05 0.00 0.00 0.22 0.00 0.00

HistES -8.34 -8.76 -9.58 -7.70 -10.74 -9.55 -10.58 -14.22 -16.21 -18.04 -13.17 -19.09 -17.46 -19.41

EstES -6.20 -7.25 -9.41 -6.98 -8.12 -8.84 -10.50 -10.77 -11.59 -15.95 -11.44 -15.48 -13.08 -19.13

Variable RB RB

Violations (in %) 7.41 5.25 5.25 9.57 9.26 8.64 8.95 7.72 6.17 8.64 8.33 8.95 7.10 5.86

quantile test p-value 0.66 0.01 0.00 0.96 0.27 0.21 0.68 0.01 0.03 0.65 0.41 0.16 0.42 0.00

HistES -14.99 -8.59 -15.00 -14.14 -17.74 -14.50 -16.65 -24.28 -15.65 -27.36 -26.37 -34.41 -26.91 -30.63

EstES -10.73 -8.17 -13.98 -11.03 -12.50 -10.70 -13.07 -16.56 -13.74 -22.59 -19.30 -22.16 -14.25 -18.26

Horizon (months): 6 (first seven columns), 12 (last seven columns)

Country US CN JP UK BD FR IT US CN JP UK BD FR IT

Variable IPG IPG

Violations (in %) 5.86 9.26 12.35 7.41 11.73 11.73 10.19 4.32 9.26 13.89 8.33 11.11 10.49 11.42

quantile test p-value 0.01 0.09 0.37 0.02 0.07 0.01 0.00 0.00 0.01 0.40 0.03 0.57 0.21 0.76

HistES -4.33 -4.83 -11.02 -3.64 -7.54 -5.11 -7.43 -7.56 -7.04 -16.07 -6.02 -12.10 -8.35 -12.16

EstES -2.76 -3.50 -6.99 -2.91 -4.86 -3.55 -4.79 -5.80 -5.60 -11.54 -4.71 -7.91 -5.52 -8.42

Variable RNF RNF

Violations (in %) 5.25 7.41 6.48 5.86 7.72 4.63 5.25 4.94 6.17 7.10 7.10 3.70 6.48 5.86

quantile test p-value 0.00 0.02 0.00 0.02 0.60 0.00 0.03 0.00 0.01 0.00 0.06 0.01 0.21 0.01

HistES -21.03 -24.13 -26.64 -17.69 -26.22 -24.42 -28.29 -32.27 -30.83 -37.48 -26.74 -37.68 -37.47 -39.80

EstES -16.99 -17.35 -23.52 -15.79 -22.84 -18.26 -29.60 -21.49 -24.53 -34.88 -20.24 -31.70 -32.03 -37.77

Variable RB RB

Violations (in %) 7.10 3.70 8.02 7.41 8.33 7.41 7.41 4.01 2.78 10.49 6.79 7.41 4.94 7.72

quantile test p-value 0.00 0.00 0.48 0.02 0.37 0.00 0.02 0.00 0.00 0.20 0.01 0.05 0.01 0.30

HistES -34.11 -20.20 -40.08 -36.41 -52.91 -36.92 -46.49 -48.01 -21.42 -51.96 -51.98 -71.92 -49.90 -67.84

EstES -25.29 -18.02 -36.80 -27.48 -38.23 -26.64 -35.09 -33.66 -23.02 -51.26 -35.83 -48.46 -41.93 -48.95


Figure Set 1 (1992:01-2017:12)


Figure Set 2 (1992:01-2017:12)


Figure Set 2 (cont.) (forecasting sample: 1992:01-2017:12)


Appendix

Appendix Table 1. VaR and ES backtests (sample: 1975:01-2017:12)

IPG RNF RB

Country

horizon h=1 h=3 h=6 h=12 h=1 h=3 h=6 h=12 h=1 h=3 h=6 h=12

US

Historical VaR -0.59 -1.17 -1.89 -2.87 -4.23 -6.99 -8.23 -11.51 -6.56 -10.84 -14.86 -15.61

Mean Estimated VaR -0.53 -0.87 -1.56 -2.48 -4.24 -4.59 -8.36 -9.93 -6.75 -7.17 -15.23 -18.57

Violations (in%) 9.28 10.98 7.39 8.90 9.66 11.17 6.82 9.66 9.85 13.45 5.30 6.63

beta0 0.19 0.85 1.03 3.07 0.38 9.43 6.11 5.18 0.47 19.69 8.75 33.94

beta1 1.29 2.15 1.46 2.21 1.06 3.03 1.60 1.74 1.10 3.95 1.37 2.69

test Eq 16 2.38 21.88 32.55 40.58 0.33 146.59 15.32 11.58 0.38 103.37 39.90 86.52

p-value 0.31 0.00 0.00 0.00 0.85 0.00 0.00 0.00 0.83 0.00 0.00 0.00

Historical ES -1.10 -2.51 -4.33 -6.74 -7.80 -13.20 -17.58 -26.10 -12.63 -21.34 -28.75 -38.86

Estimated ES -0.95 -1.44 -2.45 -3.56 -5.96 -6.62 -10.95 -13.62 -10.34 -12.76 -15.65 -25.02

DM stat 2.59 3.19 4.08 4.29 2.85 10.63 3.63 4.11 1.82 6.18 3.01 3.06

p-value 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.97 1.00 1.00 1.00

CN

Historical VaR -1.17 -1.86 -3.03 -3.80 -4.53 -7.50 -12.90 -16.55 -5.59 -9.23 -11.91 -14.11

Mean Estimated VaR -1.16 -1.55 -2.63 -3.78 -4.76 -6.74 -12.14 -15.35 -5.56 -7.81 -12.47 -14.06

Violations (in%) 10.61 13.64 9.47 11.17 9.85 11.36 8.14 10.04 8.90 10.80 8.33 8.90

beta0 0.26 1.51 1.60 4.32 -0.48 7.55 6.73 8.02 0.00 15.73 11.77 15.62

beta1 1.16 2.13 1.54 2.20 0.85 2.41 1.49 1.62 0.87 3.11 1.94 2.09

test Eq 16 0.73 18.01 20.30 8.30 0.45 15.96 10.05 6.46 3.03 14.96 6.27 2.07

p-value 0.69 0.00 0.00 0.02 0.80 0.00 0.01 0.04 0.22 0.00 0.04 0.36

Historical ES -1.85 -3.07 -4.99 -7.46 -8.39 -15.35 -22.30 -28.76 -8.78 -15.98 -20.66 -23.95

Estimated ES -1.69 -2.32 -3.24 -5.29 -6.87 -10.46 -17.86 -22.76 -7.91 -11.63 -15.82 -20.07

DM stat 1.83 1.87 3.42 2.10 1.19 3.88 3.56 1.97 1.07 3.51 2.80 0.90

p-value 0.97 0.97 1.00 0.98 0.88 1.00 1.00 0.98 0.86 1.00 1.00 0.82

JP

Historical VaR -1.56 -2.19 -3.36 -5.51 -5.73 -10.95 -15.45 -23.14 -8.81 -16.18 -24.13 -35.17

Mean Estimated VaR -1.84 -1.63 -3.96 -6.11 -5.40 -10.29 -14.32 -18.52 -8.50 -12.67 -20.73 -29.96

Violations (in%) 9.85 13.45 8.33 8.90 10.23 9.47 6.44 9.09 7.58 11.93 8.52 9.47

beta0 0.72 4.99 2.27 1.12 1.36 6.19 7.62 19.69 0.48 9.93 0.05 27.64

beta1 1.28 4.21 1.43 1.04 1.28 1.58 1.41 2.06 0.95 1.88 0.94 1.88

test Eq 16 9.24 605.10 31.21 6.30 3.07 41.87 83.64 37.05 6.67 22.46 1.13 41.43

p-value 0.01 0.00 0.00 0.04 0.22 0.00 0.00 0.00 0.04 0.00 0.57 0.00

Historical ES -3.02 -5.26 -8.36 -12.42 -9.42 -17.29 -24.37 -34.93 -13.49 -24.58 -34.96 -47.70

Estimated ES -2.84 -3.23 -5.40 -9.02 -7.88 -14.56 -14.34 -23.79 -11.49 -18.18 -21.48 -37.58

DM stat 0.85 4.17 2.56 3.42 4.80 5.50 6.01 4.92 4.37 6.42 5.23 6.07

p-value 0.80 1.00 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00

UK

Historical VaR -1.26 -1.77 -2.26 -2.89 -5.11 -7.36 -9.12 -11.59 -7.21 -13.13 -17.29 -23.35

Mean Estimated VaR -1.37 -1.83 -2.50 -3.51 -4.56 -7.59 -9.29 -9.80 -7.33 -10.65 -16.65 -20.88

Violations (in%) 10.80 10.98 9.09 9.47 11.93 8.14 6.63 9.47 10.04 10.61 5.30 8.90

beta0 0.12 1.64 1.30 2.49 -1.17 5.05 7.14 7.53 0.53 8.15 7.50 28.52

beta1 1.00 1.88 1.45 1.56 0.75 1.63 1.57 1.72 1.09 1.83 1.25 2.28

test Eq 16 1.13 3.12 3.84 5.49 6.40 17.10 20.83 12.16 0.31 9.95 20.99 24.51

p-value 0.57 0.21 0.15 0.07 0.04 0.00 0.00 0.00 0.86 0.01 0.00 0.00

Historical ES -2.16 -3.14 -4.47 -6.77 -8.02 -13.76 -17.40 -22.98 -12.80 -23.45 -30.52 -42.82

Estimated ES -2.07 -2.61 -3.34 -5.01 -6.68 -11.23 -12.11 -16.62 -10.49 -15.83 -16.82 -27.22

DM stat 0.66 1.26 1.45 0.75 3.26 5.60 3.93 2.55 3.11 5.29 3.37 3.77

p-value 0.75 0.90 0.93 0.77 1.00 1.00 1.00 0.99 1.00 1.00 1.00 1.00

Appendix Table 1. VaR and ES in-sample backtests (cont.) (sample: 1975:01-2017:12)

Variable (four columns each) IPG RNF RB

Country / Horizon h=1 h=3 h=6 h=12 h=1 h=3 h=6 h=12 h=1 h=3 h=6 h=12

BD

Historical VaR -1.74 -2.03 -2.73 -3.31 -5.58 -8.80 -13.82 -18.61 -7.42 -14.99 -23.83 -36.77

Mean Estimated VaR -2.00 -1.74 -2.94 -4.29 -5.02 -7.63 -13.75 -16.46 -7.70 -11.87 -23.63 -30.32

Violations (in%) 9.47 12.12 10.04 9.47 11.74 11.93 8.90 10.04 11.17 10.04 6.06 8.14

beta0 0.53 10.62 3.59 -1.68 0.84 10.83 10.20 13.62 0.03 11.28 4.66 20.48

beta1 1.11 7.36 2.17 0.39 1.32 2.66 1.72 1.95 1.05 2.11 1.05 1.68

test Eq 16 5.72 45.63 17.52 2.63 4.94 47.32 10.87 16.57 0.52 38.84 9.78 12.06

p-value 0.06 0.00 0.00 0.27 0.09 0.00 0.00 0.00 0.77 0.00 0.01 0.00

Historical ES -2.88 -4.05 -6.28 -9.30 -9.73 -17.88 -24.43 -31.74 -14.79 -29.16 -43.48 -61.08

Estimated ES -2.85 -2.89 -3.85 -6.16 -7.13 -11.52 -16.84 -21.34 -10.82 -18.53 -23.74 -38.01

DM stat -0.36 2.24 2.86 -0.82 4.38 7.81 4.77 5.46 3.88 7.97 4.58 3.84

p-value 0.36 0.99 1.00 0.21 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00

FR

Historical VaR -1.61 -1.87 -2.24 -2.68 -6.06 -10.96 -15.25 -22.78 -5.97 -12.65 -20.09 -28.37

Mean Estimated VaR -1.66 -1.84 -2.37 -3.33 -5.74 -8.87 -14.67 -18.12 -6.60 -6.89 -17.69 -25.51

Violations (in%) 11.93 11.74 9.66 9.47 10.04 11.17 7.20 11.74 10.23 13.45 7.39 11.93

beta0 -0.28 0.40 1.36 1.67 -1.92 19.92 24.02 51.18 1.44 26.37 16.53 62.15

beta1 0.80 1.30 1.53 1.36 0.69 3.30 2.41 3.74 1.38 4.65 1.78 3.55

test Eq 16 0.20 0.88 9.18 10.76 4.53 35.49 56.56 76.62 3.49 393.58 36.54 7.94

p-value 0.90 0.65 0.01 0.00 0.10 0.00 0.00 0.00 0.18 0.00 0.00 0.02

Historical ES -2.33 -3.13 -4.23 -6.38 -9.79 -18.55 -25.48 -34.83 -12.55 -23.96 -33.42 -44.02

Estimated ES -2.30 -2.63 -3.20 -4.65 -8.15 -12.61 -18.59 -23.86 -9.45 -11.95 -17.69 -36.30

DM stat 0.88 1.82 1.33 1.35 1.02 4.91 5.83 4.04 3.25 10.38 6.58 1.32

p-value 0.81 0.97 0.91 0.91 0.85 1.00 1.00 1.00 1.00 1.00 1.00 0.91

IT

Historical VaR -2.04 -2.51 -3.13 -4.48 -7.69 -14.32 -18.24 -27.19 -8.88 -16.44 -24.80 -39.90

Mean Estimated VaR -2.21 -2.60 -3.75 -5.28 -6.99 -13.88 -18.87 -25.68 -9.60 -17.50 -25.15 -43.40

Violations (in%) 11.17 12.12 8.14 9.28 10.42 8.14 8.33 11.17 9.85 7.77 6.82 6.25

beta0 -1.81 1.09 2.97 1.29 -0.91 5.92 44.62 146.31 -0.21 9.05 17.70 85.85

beta1 0.12 1.49 1.64 1.08 0.89 1.37 3.33 6.74 0.98 1.48 1.65 2.86

test Eq 16 1.03 2.16 19.59 4.99 0.89 5.08 66.61 85.98 0.03 16.95 22.73 83.11

p-value 0.60 0.34 0.00 0.08 0.64 0.08 0.00 0.00 0.98 0.00 0.00 0.00

Historical ES -3.20 -4.36 -6.58 -9.81 -11.14 -20.91 -30.00 -38.84 -14.80 -26.60 -39.52 -58.89

Estimated ES -3.08 -3.80 -4.87 -7.18 -9.69 -19.67 -23.71 -36.36 -13.08 -24.59 -25.20 -59.26

DM stat -0.32 1.33 1.93 -0.35 0.77 4.06 3.99 3.43 2.75 3.24 4.36 1.55

p-value 0.38 0.91 0.97 0.36 0.78 1.00 1.00 1.00 1.00 1.00 1.00 0.94
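The p-value rows in Appendix Table 1 can be sanity-checked from the reported statistics alone. The sketch below rests on two assumptions that are inferred from the tabulated numbers and are not stated in the table itself: that the "test Eq 16" statistic is referred to a chi-squared distribution with two degrees of freedom, and that the DM p-value is the standard normal CDF evaluated at the DM statistic. Under those assumptions the cells listed below are reproduced to two decimals; a few other cells differ in the last digit, plausibly because the statistics are themselves rounded. The function names are illustrative.

```python
import math


def chi2_2df_pvalue(stat):
    """Upper-tail p-value of a chi-squared(2) variable: P(X > stat) = exp(-stat / 2)."""
    return math.exp(-stat / 2.0)


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (statistic, tabulated p-value) pairs copied from Appendix Table 1.
# ASSUMPTION: chi-squared(2) reference distribution for "test Eq 16".
chi2_cells = [(0.73, 0.69), (9.24, 0.01), (3.03, 0.22), (6.30, 0.04)]
# ASSUMPTION: DM p-value = Phi(DM stat), i.e. a one-sided normal test.
dm_cells = [(0.85, 0.80), (-0.36, 0.36), (1.82, 0.97)]

for stat, tabulated in chi2_cells:
    print(f"chi2 stat {stat:6.2f}: computed {chi2_2df_pvalue(stat):.2f}, table {tabulated:.2f}")

for stat, tabulated in dm_cells:
    print(f"DM stat   {stat:6.2f}: computed {normal_cdf(stat):.2f}, table {tabulated:.2f}")
```

Read this way, a DM p-value close to 1.00 goes together with a large positive DM statistic, and a small "test Eq 16" p-value goes together with a large chi-squared statistic.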

Appendix Table 8. VaR and ES backtests for the BA combination (sample: 1992:01-2017:12)

Horizon (months) 1 3

Variable IPG IPG

US CN JP UK BD FR IT US CN JP UK BD FR IT

Historical VaR -0.48 -0.91 -1.72 -1.01 -1.75 -1.61 -1.86 -1.05 -1.59 -2.96 -1.33 -2.05 -1.88 -2.21

Mean VaR estimate -0.53 -1.03 -2.04 -1.15 -1.82 -1.63 -1.79 -0.78 -1.54 -3.48 -1.55 -2.38 -2.09 -3.01

Violations (in %) 6.79 10.49 12.65 12.35 12.96 11.73 12.04 7.72 9.88 10.19 13.27 11.73 12.35 9.88

beta0 0.10 0.18 0.65 -0.52 0.30 0.15 -0.20 0.22 0.49 0.42 0.06 0.73 0.31 0.69

beta1 1.08 1.04 1.38 0.46 1.15 1.05 0.88 1.17 1.19 1.02 1.09 1.36 1.18 1.07

Chi^2 stat 2.40 3.50 0.28 3.34 0.15 0.36 0.97 6.86 2.62 2.00 0.23 8.10 1.49 4.60

p-value 0.30 0.18 0.87 0.19 0.93 0.84 0.62 0.03 0.27 0.37 0.89 0.02 0.48 0.10

HistES -1.00 -1.62 -3.65 -1.69 -2.63 -2.33 -2.75 -2.36 -3.01 -6.90 -2.51 -4.42 -3.47 -4.55

EstES -0.83 -1.52 -2.93 -1.64 -2.75 -2.25 -2.43 -1.33 -2.21 -4.50 -2.05 -3.31 -2.75 -3.71

DMstat 2.10 1.87 0.93 0.19 0.58 1.16 1.24 3.08 1.99 1.66 1.61 1.28 1.23 1.24

p-value 0.98 0.97 0.82 0.58 0.72 0.88 0.89 1.00 0.98 0.95 0.95 0.90 0.89 0.89

Variable RNF RNF

US CN JP UK BD FR IT US CN JP UK BD FR IT

Historical VaR -4.29 -4.64 -6.66 -5.08 -6.80 -5.98 -7.59 -7.73 -7.33 -13.24 -7.15 -12.54 -10.99 -13.85

Mean VaR estimate -4.38 -5.04 -6.70 -4.80 -5.85 -6.43 -7.69 -8.07 -9.44 -12.75 -8.41 -11.80 -12.33 -14.86

Violations (in %) 8.64 7.10 8.64 8.95 9.88 5.25 4.94 5.25 6.17 7.41 5.86 7.41 2.78 5.25

beta0 0.65 0.18 -1.09 -1.20 1.38 0.96 -0.80 2.11 1.88 10.74 4.07 2.00 5.34 5.79

beta1 1.08 0.94 0.73 0.65 1.26 0.94 0.74 0.96 1.05 1.72 1.29 0.95 1.15 1.16

Chi^2 stat 1.10 1.61 3.46 5.18 2.91 7.71 10.87 7.78 6.12 10.85 176.11 3.01 16.75 13.62

p-value 0.58 0.45 0.18 0.08 0.24 0.02 0.00 0.02 0.05 0.00 0.00 0.22 0.00 0.00

HistES -8.34 -8.76 -9.58 -7.70 -10.74 -9.55 -10.58 -14.22 -16.21 -18.04 -13.17 -19.09 -17.46 -19.41

EstES -6.20 -7.25 -9.41 -6.98 -8.12 -8.84 -10.50 -10.77 -11.59 -15.95 -11.44 -15.48 -13.08 -19.13

DMstat 3.04 -0.06 2.77 4.14 3.01 3.01 0.48 2.64 2.34 2.49 3.96 1.76 2.26 3.18

p-value 1.00 0.48 1.00 1.00 1.00 1.00 0.69 1.00 0.99 0.99 1.00 0.96 0.99 1.00

Variable RB RB

US CN JP UK BD FR IT US CN JP UK BD FR IT

Historical VaR -7.66 -5.38 -10.79 -7.30 -9.00 -7.90 -10.35 -12.08 -8.20 -19.97 -15.06 -18.43 -14.80 -17.51

Mean VaR estimate -7.77 -5.87 -11.35 -7.70 -8.89 -7.57 -9.80 -13.09 -10.62 -18.67 -14.14 -17.56 -12.84 -17.18

Violations (in %) 7.41 5.25 5.25 9.57 9.26 8.64 8.95 7.72 6.17 8.64 8.33 8.95 7.10 5.86

beta0 0.79 1.29 3.56 -0.29 2.46 2.71 0.76 6.40 3.89 4.45 4.01 6.88 2.19 4.72

beta1 1.03 1.03 1.11 0.93 1.37 1.37 1.01 1.33 1.13 1.22 1.24 1.40 1.11 1.09

Chi^2 stat 0.83 9.57 11.67 0.09 2.64 3.14 0.78 9.89 7.10 0.86 1.80 3.74 1.72 13.66

p-value 0.66 0.01 0.00 0.96 0.27 0.21 0.68 0.01 0.03 0.65 0.41 0.16 0.42 0.00

HistES -14.99 -8.59 -15.00 -14.14 -17.74 -14.50 -16.65 -24.28 -15.65 -27.36 -26.37 -34.41 -26.91 -30.63

EstES -10.73 -8.17 -13.98 -11.03 -12.50 -10.70 -13.07 -16.56 -13.74 -22.59 -19.30 -22.16 -14.25 -18.26

DMstat 2.25 -0.83 2.31 2.34 3.41 3.48 3.92 1.79 1.47 2.42 2.21 2.71 4.29 3.46

p-value 0.99 0.20 0.99 0.99 1.00 1.00 1.00 0.96 0.93 0.99 0.99 1.00 1.00 1.00

Appendix Table 8. VaR and ES backtests for the BA combination (cont.) (sample: 1992:01-2017:12)

Horizon (months) 6 12

Variable IPG IPG

US CN JP UK BD FR IT US CN JP UK BD FR IT

Historical VaR -1.15 -2.25 -5.09 -1.70 -2.74 -2.28 -3.24 -2.15 -2.97 -7.72 -2.43 -4.03 -3.25 -4.79

Mean VaR estimate -1.57 -2.55 -5.46 -2.27 -3.62 -2.73 -3.90 -3.45 -3.91 -8.74 -3.67 -5.92 -4.22 -6.70

Violations (in %) 5.86 9.26 12.35 7.41 11.73 11.73 10.19 4.32 9.26 13.89 8.33 11.11 10.49 11.42

beta0 0.47 0.68 0.69 0.46 1.57 0.37 2.08 1.44 1.53 0.44 0.40 0.49 0.87 0.70

beta1 1.05 1.07 1.19 1.06 1.65 1.17 1.59 1.12 1.20 1.20 0.89 1.00 1.30 1.10

Chi^2 stat 9.16 4.94 2.00 8.36 5.43 9.19 10.92 306.91 9.38 1.84 6.88 1.14 3.16 0.54

p-value 0.01 0.09 0.37 0.02 0.07 0.01 0.00 0.00 0.01 0.40 0.03 0.57 0.21 0.76

HistES -4.33 -4.83 -11.02 -3.64 -7.54 -5.11 -7.43 -7.56 -7.04 -16.07 -6.02 -12.10 -8.35 -12.16

EstES -2.76 -3.50 -6.99 -2.91 -4.86 -3.55 -4.79 -5.80 -5.60 -11.54 -4.71 -7.91 -5.52 -8.42

DMstat 3.08 2.46 1.38 2.40 1.34 1.14 1.41 2.48 1.95 1.33 2.10 1.63 1.36 1.29

p-value 1.00 0.99 0.92 0.99 0.91 0.87 0.92 0.99 0.97 0.91 0.98 0.95 0.91 0.90

Variable RNF RNF

US CN JP UK BD FR IT US CN JP UK BD FR IT

Historical VaR -8.92 -13.72 -18.58 -10.15 -16.22 -14.57 -18.35 -20.75 -18.43 -27.23 -18.43 -23.09 -24.85 -26.48

Mean VaR estimate -12.09 -14.27 -18.84 -11.68 -16.77 -16.15 -21.64 -14.31 -18.53 -26.79 -14.67 -25.84 -22.86 -28.09

Violations (in %) 5.25 7.41 6.48 5.86 7.72 4.63 5.25 4.94 6.17 7.10 7.10 3.70 6.48 5.86

beta0 1.27 6.22 14.22 5.42 0.40 8.79 5.13 4.13 8.53 12.05 1.56 9.00 2.47 11.13

beta1 0.81 1.29 1.64 1.29 0.96 1.35 1.07 1.05 1.20 1.33 0.96 1.11 0.97 1.23

Chi^2 stat 2148.92 8.13 26.15 7.65 1.02 18.15 6.98 15.23 10.07 13.55 5.66 9.76 3.12 8.84

p-value 0.00 0.02 0.00 0.02 0.60 0.00 0.03 0.00 0.01 0.00 0.06 0.01 0.21 0.01

HistES -21.03 -24.13 -26.64 -17.69 -26.22 -24.42 -28.29 -32.27 -30.83 -37.48 -26.74 -37.68 -37.47 -39.80

EstES -16.99 -17.35 -23.52 -15.79 -22.84 -18.26 -29.60 -21.49 -24.53 -34.88 -20.24 -31.70 -32.03 -37.77

DMstat 0.29 2.05 1.89 1.90 2.02 3.11 1.18 3.47 2.11 1.65 2.53 1.88 3.18 1.47

p-value 0.61 0.98 0.97 0.97 0.98 1.00 0.88 1.00 0.98 0.95 0.99 0.97 1.00 0.93

Variable RB RB

US CN JP UK BD FR IT US CN JP UK BD FR IT

Historical VaR -15.24 -9.29 -29.38 -19.48 -29.94 -21.20 -28.63 -19.28 -10.34 -42.33 -32.94 -47.75 -32.73 -49.90

Mean VaR estimate -18.44 -14.17 -28.61 -20.72 -30.17 -21.47 -29.17 -24.98 -17.01 -40.61 -25.81 -39.05 -33.14 -41.72

Violations (in %) 7.10 3.70 8.02 7.41 8.33 7.41 7.41 4.01 2.78 10.49 6.79 7.41 4.94 7.72

beta0 5.17 9.66 9.74 7.66 6.05 11.74 -2.26 7.90 6.46 -14.25 10.84 14.88 6.28 7.44

beta1 1.23 1.35 1.29 1.35 1.17 1.46 0.78 1.10 0.78 0.65 1.19 1.34 1.01 1.09

Chi^2 stat 10.99 43.06 1.49 8.08 2.02 13.09 8.11 35.29 74.79 3.26 9.33 6.15 9.87 2.44

p-value 0.00 0.00 0.48 0.02 0.37 0.00 0.02 0.00 0.00 0.20 0.01 0.05 0.01 0.30

HistES -34.11 -20.20 -40.08 -36.41 -52.91 -36.92 -46.49 -48.01 -21.42 -51.96 -51.98 -71.92 -49.90 -67.84

EstES -25.29 -18.02 -36.80 -27.48 -38.23 -26.64 -35.09 -33.66 -23.02 -51.26 -35.83 -48.46 -41.93 -48.95

DMstat 1.27 1.51 1.76 1.82 2.07 2.05 1.81 1.78 1.03 1.15 2.03 2.21 1.87 2.84

p-value 0.90 0.94 0.96 0.97 0.98 0.98 0.96 0.96 0.85 0.87 0.98 0.99 0.97 1.00
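The "Violations (in %)" rows report how often realized returns fell below the corresponding VaR forecast. The following is a minimal sketch of that calculation, assuming a violation is counted whenever the realized return is strictly below the forecast VaR, with both expressed as returns (so the VaR is a negative number). The function name and the two input series are hypothetical, made-up values for illustration only, not data from the paper.

```python
def violation_rate_pct(realized, var_forecasts):
    """Percentage of periods in which the realized return falls below the VaR forecast.

    ASSUMPTION: a violation is a strict inequality, realized < VaR forecast.
    """
    if len(realized) != len(var_forecasts) or len(realized) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for r, v in zip(realized, var_forecasts) if r < v)
    return 100.0 * hits / len(realized)


# Hypothetical example: ten realized returns against a constant VaR forecast of -1.5.
realized = [0.4, -2.1, 0.9, -0.3, -1.8, 1.2, 0.1, -0.6, 0.7, 0.2]
var_forecasts = [-1.5] * len(realized)

# Two of the ten returns (-2.1 and -1.8) lie below -1.5.
print(violation_rate_pct(realized, var_forecasts))  # 20.0
```

A well-calibrated VaR forecast should produce a violation rate close to its nominal coverage level, which is the benchmark against which the tabulated percentages are read.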