Bayesian validation assessment of multivariate computational models

This article was downloaded by:[Jiang, Xiaomo] On: 9 January 2008 Access Details: [subscription number 789378972] Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Applied Statistics Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713428038 Bayesian validation assessment of multivariate computational models Xiaomo Jiang a ; Sankaran Mahadevan a a Department of Civil and Environmental Engineering, Vanderbilt University, Nashville, TN, USA Online Publication Date: 01 January 2008 To cite this Article: Jiang, Xiaomo and Mahadevan, Sankaran (2008) 'Bayesian validation assessment of multivariate computational models', Journal of Applied Statistics, 35:1, 49 - 65 To link to this article: DOI: 10.1080/02664760701683577 URL: http://dx.doi.org/10.1080/02664760701683577 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf This article maybe used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.



Journal of Applied Statistics, Vol. 35, No. 1, January 2008, 49–65

Bayesian validation assessment of multivariate computational models

Xiaomo Jiang and Sankaran Mahadevan∗

Department of Civil and Environmental Engineering, Vanderbilt University, Nashville, TN, USA

(Received)

ABSTRACT Multivariate model validation is a complex decision-making problem involving comparison of multiple correlated quantities, based upon the available information and prior knowledge. This paper presents a Bayesian risk-based decision method for validation assessment of multivariate predictive models under uncertainty. A generalized likelihood ratio is derived as a quantitative validation metric based on Bayes’ theorem and Gaussian distribution assumption of errors between validation data and model prediction. The multivariate model is then assessed based on the comparison of the likelihood ratio with a Bayesian decision threshold, a function of the decision costs and prior of each hypothesis. The probability density function of the likelihood ratio is constructed using the statistics of multiple response quantities and Monte Carlo simulation. The proposed methodology is implemented in the validation of a transient heat conduction model, using a multivariate data set from experiments. The Bayesian methodology provides a quantitative approach to facilitate rational decisions in multivariate model assessment under uncertainty.

Keywords: Bayesian statistics; decision making; risk; reliability; model validation; multivariate statistics

Motivation

Model validation involves comparing model predictions with experimental results. During the comparison, decision makers with certain preferences must decide whether to accept or reject the model, based upon the available information and prior knowledge. Therefore, model validation is inherently a decision-making problem. Model validation may involve comparing predicted and observed values of multiple response quantities. In some engineering problems, a single response quantity such as mechanical stress or temperature may be predicted and observed at different spatial and temporal points; thus multiple correlated response quantities will need to be compared in model validation. In such a situation, simple multiple univariate comparisons may yield conflicting inferences [24]. The focus of this study is to develop a Bayesian risk-based decision theoretic method for the effective overall validation assessment of multivariate models, considering uncertainty and correlation of multiple response quantities.

Two types of quantitative approaches may be pursued to develop model validation metrics: hypothesis-testing-based and decision-based methods. Hypothesis-testing-based methods may be based on either classical or Bayesian statistics. The classical hypothesis testing approach is a well-developed statistical method of accepting or rejecting a model based on an error statistic. Bayesian hypothesis testing has been recently explored for model validation in several studies [28,19,24]. In particular, Rebba and Mahadevan [23] developed aggregate validation metrics for multivariate model validation, in the context of both classical and Bayesian hypothesis testing. Refer to the literature in [4,5,6,12,24] for detailed discussions regarding the differences between classical and Bayesian approaches.

∗Corresponding author. Email: [email protected]

ISSN 0266-4763 print/ISSN 1360-0532 online © 2008 Taylor & Francis DOI: 10.1080/02664760701683577 http://www.informaworld.com

The decision-theoretic approach can be related to either classical or Bayesian statistics, but has not been widely pursued for computational model validation. In the classical decision theoretic approach, a decision is made to minimize the expected loss, which is defined as a function of the conditional error probabilities of Type I error (reject a correct model) and Type II error (accept a wrong model). Balci and Sargent [2] presented a classical hypothesis-testing-based cost-risk decision analysis to validate a simulation model of a real system, considering the model user’s risk, model builder’s risk, acceptable validity range, budget, sample sizes, and cost of data collection. Similar to the classical hypothesis testing methods, the classical decision-based approach has difficulties in properly interpreting the error probabilities.

Recently, Jiang and Mahadevan [15] developed a Bayesian decision methodology for computational model validation, considering the risk of using the current model, data support for the current model, and cost of acquiring new information to improve the model. The Bayesian decision approach was found to offer a comprehensive theoretical foundation for model validation by explicitly including risk, and the validation metric for Bayesian hypothesis testing was mathematically derived from the decision approach. The objective of a Bayesian decision-based model validation method is clearly identified, that is, to make the decision (accept or reject) that minimizes cost or risk. A decision maker can determine various decision thresholds by specifying the importance of the different cost sources and the prior knowledge about the null and alternative hypotheses. The Bayes factor validation metric based on Bayesian hypothesis testing is derived in this paper from a generalized risk-based decision perspective. Refer to Jiang and Mahadevan [15] for details of the Bayesian decision method for univariate model validation and its technical merits.

Nevertheless, two important issues need to be addressed prior to applying the Bayesian decision method for the reliability assessment of multivariate predictive models. First, a quantitative validation metric needs to be developed to consider the uncertainty and correlation of multiple response quantities. Multivariate statistical methods are being extensively used in meteorological and climate modeling [26], but they are limited in the assessment of civil, mechanical and aerospace engineering models.

Second, and more importantly, how should the overall validity of multivariate predictive models be assessed? Validation metrics such as the p-value in the classical methods or the Bayes factor in the Bayesian methods run into difficulty in decision-making when multiple experiments give conflicting inferences. Each validation metric value is based only on a given validation data set. Thus, the specific validation metric value can only be used to assess whether the model is valid or not in terms of that particular experiment. For example, assume we have eight sets of experimental data, resulting in eight different validation metric values. Of these, suppose four values support the model and four others reject the model. In this situation, we cannot make a conclusive decision whether to reject or accept the model based on all eight validation metric values. Due to the uncertainty in both experimental results and model prediction, even if a majority of the validation metric values (e.g. six of eight) support the model, we still cannot quantify the confidence that the model is valid in the tested domain.

In order to address the above two issues, a generalized likelihood ratio is derived in this study based on Bayes’ theorem and the normal distribution assumption of the difference between validation data and model prediction, and by including the variability and correlation of multiple quantities. The Bayesian risk of model validation is assessed based on the comparison of the likelihood ratio with a Bayesian decision threshold. Furthermore, the probability density function of the likelihood ratio is constructed using the statistics of the response quantity and Monte Carlo simulation. The overall reliability of the model is assessed based on the simulated results. The proposed Bayesian decision method is implemented in the validation of a transient heat conduction model, using an experimental data set provided by Sandia National Laboratories.

In the next section, the Bayesian risk-based decision method is first described briefly. The generalized likelihood ratio is then derived based on Bayes’ theorem, and several relevant issues are discussed. Finally, the multivariate model validation in both deterministic and stochastic contexts is presented.

Methodology

Bayesian decision rule

Within the context of binary hypothesis testing in model validation, we need to consider two hypotheses H0 and H1, i.e. the point null hypothesis (H0: yexp = ypred) to accept the model and an alternative hypothesis (H1: yexp ≠ ypred) to reject the model. The prior probabilities of two hypotheses are denoted by

π0 = Pr(H0) and π1 = Pr(H1) (1)

Note that π1 = 1 − π0 for the binary hypothesis testing problem. Each time model validation is conducted given the experimental data, one of four possible scenarios, [Hi|Hj] (i = 0, 1; j = 0, 1), may happen, and [Hi|Hj] is the event of inferring Hi when Hj is true. Analogous to the classical testing approach, the type I and II error probabilities (α and β, respectively) are calculated as

α = Pr(H1|H0) and β = Pr(H0|H1) (2)

Let Z represent the entire experimental data set, and Z0 and Z1 represent two mutually exclusive subsets of Z such that Z0 ∪ Z1 = Z, and Z0 ∩ Z1 = ∅, where ∪ and ∩ represent union and intersection, respectively, and ∅ represents an empty set or null space. Thus, Z0 and Z1 represent two decision regions corresponding to two hypotheses H0 and H1. Every possible observation Y belongs to either decision region. The problem to be solved here is to assign each experiment to Z0 or Z1 such that a minimum risk in model validation is obtained.

Assume that the observation Y has a probability density function under each hypothesis, i.e. Y|H0 ∼ f(y|H0) and Y|H1 ∼ f(y|H1). Thus, Pr[Hi|Hj] = ∫_{Zi} f(y|Hj) dy (i = 0, 1; j = 0, 1) is obtained. The loss function (or risk) for model validation is defined to be the expected cost of validation experiments, which is obtained by averaging the decision cost over two probabilities: the prior probability of the hypothesis and the probability of a particular action to be taken [15,22]:

$$ R = \sum_{j=0}^{1} \sum_{i=0}^{1} \left( c_{ij}\,\pi_j \int_{Z_i} f(y|H_j)\,dy \right) \qquad (3) $$

where cij is the cost of deciding Hi when Hj is true (decision consequence).

Based on the assumption that the total risk (cost) resulting from a correct decision is always less than the total risk resulting from a wrong decision, Jiang and Mahadevan [15] derive the Bayes decision rule to accept the model through minimizing the total Bayes risk [Equation (3)] as follows:

$$ \Lambda(y) = \frac{f(y|H_0)}{f(y|H_1)} > \frac{\pi_1 (c_{01} - c_{11})}{\pi_0 (c_{10} - c_{00})} = \eta \qquad (4) $$

where Λ(y) is the likelihood ratio, referred to as the Bayes factor, and η is the acceptable threshold, which is dependent on the prior densities of the two hypotheses and the costs of deciding Hi when Hj is true (i = 0, 1; j = 0, 1). Equation (4) is the Bayesian decision rule developed by Jiang and Mahadevan [15] for univariate model validation. It should be pointed out that, when particular cost information cij (e.g. c00 = c11 = 0 and c01 = c10 = 1) and prior densities πi (e.g. π0 = π1 = 0.5) are assumed, the threshold η = 1 is obtained, as in the Bayes factor approach proposed by Zhang and Mahadevan [28] for model validation.
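The threshold and the risk can be evaluated directly from the costs and priors. The following is a minimal sketch, assuming the conditional error probabilities α = Pr(H1|H0) and β = Pr(H0|H1) of Equation (2) are already known; the function names are illustrative and not taken from the paper.

```python
def decision_threshold(pi0, c00, c01, c10, c11):
    """Threshold eta of Equation (4); c_ij is the cost of deciding H_i when H_j is true."""
    pi1 = 1.0 - pi0
    return pi1 * (c01 - c11) / (pi0 * (c10 - c00))


def bayes_risk(pi0, alpha, beta, c00, c01, c10, c11):
    """Bayes risk of Equation (3) written with conditional error probabilities.

    alpha = Pr(decide H1 | H0 true), beta = Pr(decide H0 | H1 true).
    """
    pi1 = 1.0 - pi0
    return (pi0 * (c00 * (1.0 - alpha) + c10 * alpha)
            + pi1 * (c11 * (1.0 - beta) + c01 * beta))


def accept_model(likelihood_ratio, eta):
    """Bayes decision rule: accept H0 (the model) when the ratio exceeds eta."""
    return likelihood_ratio > eta


# Zero-one costs and equal priors reproduce the threshold eta = 1.
eta = decision_threshold(pi0=0.5, c00=0.0, c01=1.0, c10=1.0, c11=0.0)
print(eta, accept_model(2.0, eta))
```

Raising c01 (the cost of accepting a wrong model) or lowering π0 increases η, so stronger evidence is required before the model is accepted.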

Figure 1 shows the Bayesian decision theoretic method for univariate model validation. The optimal decision boundary Y* (dashed line) is obtained from the Bayes decision rule in Equation (4). The corresponding minimum type I and II errors (shadowed area) are calculated respectively by α = π0 ∫_{Z1} f(y|H0) dy and β = π1 ∫_{Z0} f(y|H1) dy based on Equation (2). If the acceptable threshold η is chosen arbitrarily (dash-dot line), the type I or II error will increase due to the shifting of the decision boundary Ya, resulting in the increase of the total error, as shown by the total shadowed area in Figure 1. Refer to Jiang and Mahadevan [15] for details of minimizing the Bayesian risk in univariate model validation.

In practical applications of the Bayes risk approach for model validation, it becomes critical to compute efficiently the probability density (or likelihood) function of experimental data under each hypothesis. If data are available only on one or more intermediate quantities, a Bayes network [16] approach and a Markov chain Monte Carlo (MCMC) simulation technique have been suggested by Mahadevan and Rebba [19] to estimate the probability density of the response quantity of interest. In the following sections, the likelihoods of experimental data under two hypotheses are derived mathematically, based on Bayes’ theorem for both univariate and multivariate model validation.

Likelihood ratio for univariate model validation

Assume that yexp = {y1,exp, y2,exp, . . . , yn,exp} and ypred = {y1,pred, y2,pred, . . . , yn,pred} are n samples of experimental data and model predictions, respectively. Let di = yi,exp − yi,pred represent the difference between the ith experimental data and the ith model prediction, and y = {d1, d2, . . . , dn} represent n values of the error with distribution N(μ, σ²). Usually σ is assumed to be known and to incorporate the uncertainties of input parameters, experimental conditions and measurement errors. The model validation problem becomes testing H0: μ = e0 versus H1: μ ≠ e0 with μ|H1 ∼ N(ρ, τ²), in which e0 = 0 in the model validation problem, and ρ and τ are the parameters of the prior density of μ under the alternative hypothesis [denoted by f(μ|H1)]. If no information on f(μ|H1) is available, the parameters ρ = e0 and τ² = σ² are suggested in Migon and Gamerman [21]. This selection assumes that the amount of information in the prior is equal to that in the observation, which is consistent with the Fisher information-based method [17].

Figure 1. Bayesian decision theoretic method for univariate model validation.

Notice that H0 is a simple hypothesis with f(y|H0) = f(y|e0), which is referred to as the marginal likelihood of H0 given y. Using the Bayes theorem, the marginal likelihood (or posterior joint function) of H1 given y is obtained by

$$ f(y|H_1) = \int_{\Theta \setminus \{e_0\}} f(y, \mu|H_1)\,d\mu = \int f(y|\mu)\,f(\mu|H_1)\,d\mu \qquad (5) $$

where Θ denotes the parameter space of μ, and the last integration is performed over the entire parameter space because a single point e0 does not alter the integration value. Substituting the distributions of f(y|μ) and f(μ|H1) into Equation (5) and applying a few algebraic transformations yields [21]

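$$
\begin{aligned}
f(y|H_1) &= \int \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(d_i-\mu)^2\right] \times \frac{1}{\sqrt{2\pi}\,\tau}\exp\left[-\frac{1}{2\tau^2}(\mu-\rho)^2\right] d\mu \\
&= \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \frac{\sigma}{\sqrt{n\tau^2+\sigma^2}} \exp\left(-\frac{n s^2}{2\sigma^2}\right) \exp\left[-\frac{1}{2}\,\frac{n(\bar{d}-\rho)^2}{n\tau^2+\sigma^2}\right]
\end{aligned}
\qquad (6)
$$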

where d̄ = (1/n) Σ_{i=1}^{n} di is the mean of n data points, and s² = (1/n) Σ_{i=1}^{n} (di − d̄)² is the mean of squared errors (MSE).

The likelihood ratio (or Bayes factor), Λ(y) in Equation (4), thus becomes

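$$
\begin{aligned}
\Lambda(y) &= \frac{f(y|e_0)}{\int f(y|\mu)\,f(\mu|H_1)\,d\mu} \\
&= \frac{\left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(d_i-e_0)^2\right]}{\left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \frac{\sigma}{\sqrt{n\tau^2+\sigma^2}} \exp\left(-\frac{n s^2}{2\sigma^2}\right) \exp\left[-\frac{1}{2}\,\frac{n(\bar{d}-\rho)^2}{n\tau^2+\sigma^2}\right]} \\
&= \frac{\exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(d_i-\bar{d})^2\right] \times \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(\bar{d}-e_0)^2\right]}{\frac{\sigma}{\sqrt{n\tau^2+\sigma^2}} \exp\left(-\frac{n s^2}{2\sigma^2}\right) \exp\left[-\frac{1}{2}\,\frac{n(\bar{d}-\rho)^2}{n\tau^2+\sigma^2}\right]} \\
&= \frac{\sqrt{n\tau^2+\sigma^2}}{\sigma} \exp\left\{\frac{n}{2}\left[\frac{(\bar{d}-\rho)^2}{n\tau^2+\sigma^2} - \frac{(\bar{d}-e_0)^2}{\sigma^2}\right]\right\}
\end{aligned}
\qquad (7)
$$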

It is observed from Equation (7) that the MSE term cancels in computing the likelihood ratio. Thus, only the mean (d̄) of observations is needed in evaluating the computational model. Since Λ(y) is non-negative, its value is converted into the logarithmic scale for the convenience of comparison among a large range of values as follows [e0 = 0 is used in Equation (8)]:

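$$ b = \ln[\Lambda(y)] = \frac{1}{2}\ln\left(\frac{n\tau^2+\sigma^2}{\sigma^2}\right) + \frac{n}{2}\left[\frac{(\bar{d}-\rho)^2}{n\tau^2+\sigma^2} - \frac{\bar{d}^2}{\sigma^2}\right] \qquad (8) $$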

where ln(·) is the natural logarithm (base e). Based on the Bayesian decision rule represented by Equation (4), a value of b larger than ln η indicates that the observations y are judged to support H0 (i.e. accepting the model). Otherwise, the observations y are judged to reject H0 (i.e. rejecting the model).
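Equation (8) is simple to evaluate, and it can be cross-checked against a brute-force evaluation of the integral in Equation (5). The sketch below does both; the function names, the sample data and the trapezoidal integration settings are assumptions made for illustration.

```python
import math


def log_bayes_factor(d, sigma, rho=0.0, tau=None):
    """Equation (8): b = ln Lambda(y) for H0: mu = 0 against mu | H1 ~ N(rho, tau^2)."""
    n = len(d)
    tau = sigma if tau is None else tau          # default choice tau^2 = sigma^2
    d_bar = sum(d) / n
    mix = n * tau ** 2 + sigma ** 2
    return (0.5 * math.log(mix / sigma ** 2)
            + 0.5 * n * ((d_bar - rho) ** 2 / mix - d_bar ** 2 / sigma ** 2))


def log_bayes_factor_numeric(d, sigma, rho=0.0, tau=None, half_width=12.0, steps=20000):
    """Same quantity by trapezoidal integration of f(y|mu) f(mu|H1) over mu."""
    tau = sigma if tau is None else tau

    def log_lik(mu):
        return sum(-0.5 * math.log(2.0 * math.pi * sigma ** 2)
                   - (di - mu) ** 2 / (2.0 * sigma ** 2) for di in d)

    def prior(mu):
        return math.exp(-(mu - rho) ** 2 / (2.0 * tau ** 2)) / (math.sqrt(2.0 * math.pi) * tau)

    lower = rho - half_width * tau               # integrate over rho +/- half_width * tau
    h = 2.0 * half_width * tau / steps
    total = 0.0
    for k in range(steps + 1):
        mu = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(log_lik(mu)) * prior(mu)
    return log_lik(0.0) - math.log(total * h)


errors = [0.3, -0.1, 0.5, 0.2, -0.4]             # hypothetical differences d_i
print(log_bayes_factor(errors, sigma=1.0))
print(log_bayes_factor_numeric(errors, sigma=1.0))
```

The two values agree to within the integration error, and a positive b means the data support H0 when ln η = 0.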

Substituting the relationships ρ = e0, τ² = σ² and e0 = 0 into Equation (8) gives

$$ b = \frac{1}{2}\ln(n+1) - \frac{n^2}{2(n+1)}\,\frac{\bar{d}^2}{\sigma^2} \le \frac{1}{2}\ln(n+1) \qquad (9) $$

Therefore the value of 0.5 ln(n + 1) is the upper bound of the logarithmic Bayes factor b, attained when d̄ = 0.

Likelihood ratio for multivariate model validation

Assume that Y = [y1 y2 · · · ym]ᵀ is an m × n matrix representing m variables each having n observations, in which yi = [di1 di2 · · · din] (i = 1, 2, . . . , m) represents the n observations of the ith variable yi. The observations Y are assumed to be drawn from a multivariate normal density Nm(μ, Σ), where the vector μ = E[y] represents the corresponding m mean values and Σ = E[(y − μ)(y − μ)′] is an m × m covariance matrix of all variables. Similar to the variance σ² in the univariate case, the covariance matrix Σ is related to the uncertainties of input parameters, experimental conditions and measurement errors, considering the correlation of multiple response quantities. Thus, the likelihood function of the multiple observations, L(Y), is expressed as follows:

$$ L(Y) \propto \frac{|\Sigma|^{-1/2}}{(2\pi)^{m/2}} \exp\left[-\frac{1}{2}(Y-\mu)^{T}\,\Sigma^{-1}\,(Y-\mu)\right] \qquad (10) $$

Let d̄i = (1/n) Σ_{j=1}^{n} dij (i = 1, 2, . . . , m) be the mean of n data points of the ith variable yi. Within the context of multivariate model validation, we wish to test whether the observed set of means, Ȳ = [d̄1 d̄2 · · · d̄m]ᵀ, is equal to the zero vector, E0 = [0 0 · · · 0]ᵀ. Thus, the multivariate model validation problem becomes testing the two hypotheses H0: μ = E0 versus H1: μ ≠ E0 with μ|H1 ∼ N(ρ, Λ). The expression of the likelihood ratio Λ(Y) for multivariate model validation is derived in the Appendix using a procedure similar to that in the univariate case described previously, and converted into the logarithmic scale as follows

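$$ b_M = \frac{1}{2}\ln\left(\frac{n|\Lambda| + |\Sigma|}{|\Sigma|}\right) + \frac{n}{2}\left[(\bar{Y}-\rho)'\,(n\Lambda+\Sigma)^{-1}\,(\bar{Y}-\rho) - \bar{Y}'\,\Sigma^{-1}\,\bar{Y}\right] \qquad (11) $$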

Equation (11) is used to calculate the likelihood ratio bM in a logarithmic scale for multivariate model validation, incorporating the correlation of multiple quantities in the covariance matrix Σ. Similar to the univariate case, the value of 0.5 ln(n + 1) is the upper bound of the logarithmic Bayes factor bM based on the particular selection of ρ and Λ (ρ = E0 and Λ = Σ, by analogy with the univariate case).
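A minimal sketch of Equation (11) follows, evaluating the formula as printed (including the determinant term n|Λ| + |Σ|). The helper names are hypothetical, and a small Gaussian-elimination routine stands in for a linear algebra library so that the block has no dependencies.

```python
import math


def solve_with_det(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting; also return det(a)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]     # augmented copy
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            raise ValueError("singular matrix")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                                   # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                       # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x, det


def log_bayes_factor_mv(y_bar, n, sigma, lam, rho):
    """Equation (11): b_M from the mean vector y_bar of n observations per variable."""
    size = len(y_bar)
    mix = [[n * lam[i][j] + sigma[i][j] for j in range(size)] for i in range(size)]
    diff = [y_bar[i] - rho[i] for i in range(size)]
    x_mix, _ = solve_with_det(mix, diff)                 # (n Lambda + Sigma)^-1 (y_bar - rho)
    x_sig, det_sigma = solve_with_det(sigma, y_bar)      # Sigma^-1 y_bar
    _, det_lam = solve_with_det(lam, [0.0] * size)       # only the determinant is needed
    quad_mix = sum(d * x for d, x in zip(diff, x_mix))
    quad_sig = sum(y * x for y, x in zip(y_bar, x_sig))
    return (0.5 * math.log((n * det_lam + det_sigma) / det_sigma)
            + 0.5 * n * (quad_mix - quad_sig))


cov = [[2.0, 0.6], [0.6, 1.0]]                           # hypothetical covariance of two quantities
print(log_bayes_factor_mv([0.1, -0.2], 10, cov, cov, [0.0, 0.0]))
```

For a single variable (m = 1) the function reduces to Equation (8), which gives a convenient consistency check.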

Issues to be considered in multivariate model validation

Figure 2. Bayesian decision theoretic method for two-variable model validation.

Now the Bayes decision rule in Equation (4) may be pursued for multivariate model validation using the likelihood ratio calculated by Equation (11). For illustration purposes, Figure 2 shows the Bayesian decision theoretic method for two-variable model validation. The contour lines in Figure 2 show the regions where the likelihood function has constant density. From the equation of normal density in Equation (10), the contour lines are defined by points that have the same constant value:

r² = (Y − μ)′Σ⁻¹(Y − μ) (12)

where r is referred to as the Mahalanobis distance from Y to μ [18]. Thus, the contour lines are regarded to be lines of constant Mahalanobis distance. Note that (1) since the value of r depends on the contents of the covariance matrix Σ, the shape of the contour lines is determined by this matrix as well; (2) since this distance is a quadratic function, the contours of constant density are hyperellipsoids of constant r; and (3) the contents of Σ are determined by the model properties and the measurement error information, incorporating the uncertainty and correlation in multiple response quantities.

Three issues are considered here in multivariate model validation. First, a statistical method is used in this paper to demonstrate the advantages of the proposed Bayesian method. Using the correlated normally distributed difference between experimental results (Yexp) and model predictions (Ypred) (i.e. Y = Yexp − Ypred), Equation (12) is employed by Hills and Trucano [10] as a statistical metric to assess the multivariate computational model, calculated by

r² = Y′Σ⁻¹Y (13)

The normal distribution assumption of the difference Y implies that r² in Equation (13) has a χ²(df) (chi-square) distribution, in which df is the number of observations, statistically called the degrees of freedom. Thus, the p-value can be calculated as the tail probability Pr(R² > r²) based on the χ²(df) distribution and used as the evidence to reject the model. However, not having enough evidence to reject a model is not the same as having enough evidence to accept the model, which will be demonstrated later in the illustrative example.
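The classical metric can be sketched for two response quantities as follows. This is an illustration under stated assumptions: the difference vector and covariance matrix are hypothetical, the degrees of freedom are passed in explicitly, and the chi-square tail probability is computed from the standard series for the regularized incomplete gamma function, which is adequate only for moderate values of r² and df.

```python
import math


def mahalanobis_sq_2d(y, cov):
    """r^2 = y' cov^-1 y for a 2-vector y and a 2x2 covariance matrix (Equation (13))."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    return (s22 * y[0] ** 2 - (s12 + s21) * y[0] * y[1] + s11 * y[1] ** 2) / det


def chi2_sf(x, df):
    """Pr(R^2 > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    a, z = 0.5 * df, 0.5 * x
    # First term of the series P(a, z) = z^a e^-z * sum_k z^k / Gamma(a + k + 1).
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    if term == 0.0:
        return 0.0                      # far tail: the first term underflows
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return max(0.0, 1.0 - total)


difference = [1.0, 2.0]                 # hypothetical Y = Yexp - Ypred
covariance = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical covariance matrix
r2 = mahalanobis_sq_2d(difference, covariance)
print(r2, chi2_sf(r2, df=2))
```

A small p-value counts as evidence against the model, whereas a large one only indicates that the data fail to reject it.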


Second, in practical applications, both experimental results Yexp and model prediction Ypred may be assumed to be drawn from normal distributions with covariance matrices Σexp and Σpred, respectively. Thus, the covariance matrix Σ can be easily obtained by the summation of two covariance matrices, i.e. Σ = Σpred + Σexp. It should be pointed out that the normal distribution assumption of the difference Y is independent of the density functions of Yexp and Ypred. If Yexp and Ypred are not normally distributed, a set of simulated data may be generated from their PDFs, their difference Y is obtained, and the value of Σ is calculated in terms of Y.
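The simulation route for non-normal Yexp and Ypred can be sketched as below. The lognormal and uniform distributions, their parameters and the function names are all hypothetical choices made for illustration; only the procedure (simulate, take the difference, estimate Σ) follows the text.

```python
import random


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of equal-length observation vectors."""
    n, m = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(m)] for i in range(m)]


def simulate_differences(n_samples, seed=0):
    """Draw Y = Yexp - Ypred for two quantities from hypothetical non-normal PDFs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        shared = rng.uniform(-1.0, 1.0)          # common factor that induces correlation
        y_exp = [10.0 + shared + rng.lognormvariate(0.0, 0.25),
                 20.0 + 2.0 * shared + rng.lognormvariate(0.0, 0.25)]
        y_pred = [11.0 + rng.uniform(-0.5, 0.5),
                  21.0 + rng.uniform(-0.5, 0.5)]
        rows.append([e - p for e, p in zip(y_exp, y_pred)])
    return rows


sigma_hat = sample_covariance(simulate_differences(2000))
print(sigma_hat)
```

The estimated matrix can then be used as Σ in Equations (11) to (13).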

Third and finally, there are two cases regarding the likelihood ratios calculated using Equation (11) (or Equation (8) for univariate model validation): deterministic and stochastic. In the first case, the deterministic values of the likelihood ratio (or Bayes factor) are calculated based on available experimental data. Various decision thresholds may be determined through specifying the importance of the different cost sources and the prior knowledge about the null and alternative hypotheses. The computational model is then evaluated using Equation (4) based on the comparison of the likelihood ratio with a decision threshold. Accordingly, the confidence for accepting the null hypothesis (i.e. accepting the model) is quantified by using Bayes’ theorem. This situation has been investigated in detail by Jiang and Mahadevan [15] in the validation of two univariate reliability models and will be further demonstrated below within the context of multivariate model validation.

In the second case, considering the uncertainty in the model prediction and validation data, the likelihood ratio is treated as a random variable and used to assess the overall reliability of the corresponding computational model. The probability of accepting the null hypothesis may be calculated by using any simulation technique, as discussed below.

Model assessment

Deterministic case

Given a set of validation data Yexp and model prediction Ypred, the likelihood ratio is calculated using Equation (11). Thus, the Bayesian measure of evidence that the computational model is valid may be quantified by the posterior probability of the null hypothesis Pr(H0|data). Using the Bayes theorem, the relative posterior probabilities of two models are obtained as:

$$ \frac{\Pr(H_0|\text{data})}{\Pr(H_1|\text{data})} = \left[\frac{\Pr(\text{data}|H_0)}{\Pr(\text{data}|H_1)}\right] \left[\frac{\Pr(H_0)}{\Pr(H_1)}\right] \qquad (14) $$

The term in the first set of square brackets on the right hand side is referred to as the ‘Bayes factor’ [15], which is the same as the likelihood ratio in Equation (4). Substituting Equations (1) and (4) into Equation (14) yields:

$$ \frac{\Pr(H_0|\text{data})}{\Pr(H_1|\text{data})} = \Lambda_M(Y)\,\frac{\pi_0}{\pi_1} \qquad (15) $$

where Pr(H1|data) represents the posterior probability of the alternative hypothesis (i.e. the model is rejected). For a binary hypothesis testing we have Pr(H1|data) = 1 − Pr(H0|data) and π1 = 1 − π0. Thus, Pr(H0|Y) can be derived from Equation (15) as follows:

$$ \Pr(H_0|Y) = \frac{\Lambda_M(Y)\,\pi_0}{1 - \pi_0 + \Lambda_M(Y)\,\pi_0} = \frac{e^{b_M}\,\pi_0}{1 - \pi_0 + e^{b_M}\,\pi_0} \qquad (16) $$

where Y represents the observations and Λ_M(Y) = e^{bM} is used to restore the likelihood ratio from the logarithm value bM. Equation (16) is used to quantify the confidence in computational model validation based on the validation data and model prediction. Before conducting experiments, π0 = π1 = 0.5 is assumed due to the absence of any prior knowledge about the two hypotheses. In that case, Equation (16) is simplified as

$$ \Pr(H_0|Y) = \frac{e^{b_M}}{1 + e^{b_M}} \qquad (17) $$

From Equation (17), bM → −∞ indicates 0% confidence in the model validation, and bM → ∞ indicates 100% confidence.
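The mapping from the logarithmic likelihood ratio to the posterior confidence can be sketched as follows. The function name is illustrative, and the computation is arranged through the log posterior odds, a numerical choice made here so that very large or very small values of bM do not overflow.

```python
import math


def posterior_h0(b_m, pi0=0.5):
    """Pr(H0 | Y) from the log likelihood ratio b_M and the prior pi0 (Equation (16))."""
    # Log posterior odds: b_M + ln(pi0 / (1 - pi0)).
    log_odds = b_m + math.log(pi0) - math.log(1.0 - pi0)
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


print(posterior_h0(0.0))                       # no evidence either way: 0.5
print(posterior_h0(0.5 * math.log(11)))        # b at its upper bound for n = 10
print(posterior_h0(1.2, pi0=0.3))              # hypothetical b_M with a sceptical prior
```

One consequence worth noting: with equal priors and the bound 0.5 ln(n + 1) on the logarithmic Bayes factor, n = 10 observations can raise the confidence in the model to at most about 0.77.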

In this situation, the marginal likelihood of H0 given Y, f(Y|H0) = f(Y|E0) where E0 is a zero vector, is obtained by

$$ f(Y|H_0) = \frac{1}{(2\pi)^{(m+n)/2}\,|\Sigma|^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} Y_i'\,\Sigma^{-1}\,Y_i\right] \qquad (18) $$

Further, the marginal likelihood of H1 given Y, f(Y|H1), is obtained by the relation of Λ(Y) = f(Y|H0)/f(Y|H1). Thus, the Bayes risk associated with cost information and prior densities of two hypotheses is determined by Equation (3). Refer to Jiang and Mahadevan [15] for details regarding the Bayes risk assessment of model validation given a set of specific validation data.

Stochastic case

In order to consider the uncertainties in both validation data and model prediction, the likelihood ratio, Λ_M(Y), is treated as a random variable. Given a fixed sample size n and the statistics of the response quantity (i.e. probability density functions of Yexp and Ypred), various values of Y are obtained by using the Monte Carlo simulation technique to produce a distribution of bM. As such, the probability of accepting the model can be estimated by finding the proportion of simulated bM values that are greater than ln(η) as follows:

γ = Pr(bM > ln(η)) (19)

Equation (19) gives the probability of accepting the model based on the given information. It provides a quantitative measure for the overall reliability assessment of a computational model, incorporating the importance of the decision costs and the prior of each hypothesis in terms of the decision threshold η.
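The stochastic assessment can be sketched with a small Monte Carlo loop. To keep the example short it uses the univariate statistic of Equation (9) as a stand-in for bM, and it assumes the differences are drawn from N(bias, σ²), where the hypothetical parameter bias plays the role of the true model error; neither simplification comes from the paper.

```python
import math
import random


def log_bayes_factor(d_bar, n, sigma):
    """Equation (9): log Bayes factor for rho = 0 and tau^2 = sigma^2."""
    return 0.5 * math.log(n + 1) - n ** 2 / (2.0 * (n + 1)) * d_bar ** 2 / sigma ** 2


def acceptance_probability(bias, sigma, n, eta=1.0, trials=20000, seed=1):
    """Monte Carlo estimate of gamma = Pr(b > ln eta), as in Equation (19)."""
    rng = random.Random(seed)
    threshold = math.log(eta)
    accepted = 0
    for _ in range(trials):
        # One simulated validation experiment: n differences and their mean.
        d_bar = sum(rng.gauss(bias, sigma) for _ in range(n)) / n
        if log_bayes_factor(d_bar, n, sigma) > threshold:
            accepted += 1
    return accepted / trials


print(acceptance_probability(bias=0.0, sigma=1.0, n=10))   # unbiased model
print(acceptance_probability(bias=2.0, sigma=1.0, n=10))   # strongly biased model
```

An unbiased model is accepted in most simulated experiments, while a bias of two standard deviations is almost never accepted; raising η lowers both proportions.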

Numerical implementation

A multivariate transient heat conduction model is employed to demonstrate the effectiveness of the proposed methodology. For illustration purposes, assume that the costs of deciding Hi when Hj is true are normalized to be c00 = c11 = 1 (unit) and c01 = c10 = 2 (unit), and the prior densities about the two hypotheses are π0 = π1 = 0.5. This gives the threshold η = 1 and ln η = 0, from Equation (4). Various decision thresholds can be determined by specifying the importance of the different cost sources (cij, i, j = 0, 1) and the prior knowledge of each hypothesis πi (i = 0, 1). Refer to Jiang and Mahadevan [15] for the effects of various cost values on the decision risk in the context of univariate model validation. The sensitivity of results to the choice of costs is also demonstrated in [15].
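As a quick check of the stated threshold, substituting the assumed costs and priors into Equation (4) gives:

```latex
\eta = \frac{\pi_1 (c_{01} - c_{11})}{\pi_0 (c_{10} - c_{00})}
     = \frac{0.5\,(2 - 1)}{0.5\,(2 - 1)} = 1,
\qquad \ln \eta = 0 .
```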

Problem description

Figure 3. Heat conduction model.

A transient heat conduction problem has been designed at Sandia National Laboratories [8,9] as a model validation challenge problem. Its purpose is to incorporate features representing practical realities that are often imposed on modelers and experimentalists in the context of model validation. The one-dimensional heat conduction through a slab (Figure 3) is established by a set of governing differential equations [8]. The corresponding analytical solution is approximated by (Beck et al. [3])

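$$ T(x,t) = T_i + \frac{qL}{k}\left\{\frac{(k/\rho)\,t}{L^2} + \frac{1}{3} - \frac{x}{L} + \frac{1}{2}\left(\frac{x}{L}\right)^2 - \frac{2}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\exp\left[-(n\pi)^2\,\frac{(k/\rho)\,t}{L^2}\right]\cos\left(\frac{n\pi x}{L}\right)\right\} \qquad (20) $$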

where T(x, t) is the temperature at spatial location x and time t; Ti is the initial temperature at the location x = 0; the parameters q and L are the heat flux and the slab thickness, respectively; and the parameters k and ρ represent the thermal properties of the material, namely the thermal conductivity and the heat capacity (written ρc below), respectively. In this study, truncating the series at n = 10,000 terms is found to be sufficient for approximating the infinite sum, based on the code verification data provided by Dowding et al. [8].
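The truncated series can be evaluated directly. The sketch below is a plain-Python illustration, not the authors' code: the function name `slab_temperature` is hypothetical, and the nominal inputs are taken from the means in Equations (21) and (22) and from configuration 1 of Table 1.

```python
import math

def slab_temperature(x, t, Ti, q, L, k, rho, n_terms=10_000):
    """Temperature T(x, t) from the truncated series solution of the slab model."""
    fo = (k / rho) * t / L**2          # dimensionless time (k/rho) t / L^2
    xi = x / L                         # dimensionless position
    series = sum(
        math.exp(-((n * math.pi) ** 2) * fo) * math.cos(n * math.pi * xi) / n**2
        for n in range(1, n_terms + 1)
    )
    bracket = fo + 1.0 / 3.0 - xi + 0.5 * xi**2 - (2.0 / math.pi**2) * series
    return Ti + (q * L / k) * bracket

# Nominal inputs (means only; no uncertainty is propagated in this sketch).
nominal = dict(Ti=25.0, q=1000.0, L=0.0127, k=0.06393, rho=4.1928e5)

for t in (0.0, 500.0, 1000.0):
    front = slab_temperature(0.0, t, **nominal)
    back = slab_temperature(nominal["L"], t, **nominal)
    print(f"t = {t:6.0f} s  T(0, t) = {front:8.2f}  T(L, t) = {back:8.2f}")
```

At t = 0 the bracketed terms cancel up to the truncation error of the series, so the function returns approximately Ti, which is a convenient sanity check on the number of terms retained.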

Equation (20) is the predictive model to be validated in this example, given several sets of experimental data. The uncertain input parameters associated with this model are the specimen thickness, L (m), applied heat flux, q (W/m²), initial temperature, Ti (°C), thermal conductivity, k (W/m°C), and heat capacity, ρc (J/m³°C). For each experiment, the specimen thickness and the applied flux are measured with specified uncertainty as follows [8]

$$L \sim N\left(L_0,\ 2.54 \times 10^{-4}\right) \quad \text{and} \quad q \sim N\left(q_0 + \delta_q,\ 0.015\,q_0\right) \tag{21}$$

where L0 and q0 are nominal values of the specimen thickness and the applied flux, respectively. In Equation (21), the variable δq is the bias in the diagnostic reading of the flux q. The bias is assumed to be uniformly distributed, δq ∼ U(−0.05q0, 0.05q0). For the material properties, k and ρ, the uncertainties are characterized by three sets of material characterization experiments at three specific temperatures, 20 °C, 500 °C and 1000 °C, each having two observations of the two parameters [8]. The model is required to be validated for constant material properties. Normal distributions are assumed for the two material properties, and the parameters are estimated individually using the six relevant data points as follows

$$k \sim N\left(0.06393,\ 1.624 \times 10^{-4}\right) \quad \text{and} \quad \rho c \sim N\left(4.1928 \times 10^{5},\ 5.597 \times 10^{5}\right) \tag{22}$$

The measurement error for the temperature is specified in the same way as that of the applied flux. Thus, the uncertainty of the measured temperature is described by (Dowding et al. [8])

$$T \sim N\left(T_{obs} + \delta_T,\ 0.005\,\Delta T\right) \tag{23}$$


where Tobs is the observed temperature; δT is the measurement bias with δT ∼ U(−0.025ΔT, 0.025ΔT), and ΔT = T − Ti is the temperature difference. In particular, the initial temperature is assumed to be normal, i.e. T ∼ N(25 + δT, 0.005 × ΔT) with ΔT = 25.

In this example, experimental data are available for four different configurations at time increments of 50 seconds over the period of 0 to 1000 seconds, thus resulting in 21 observations for each experiment. Given any configuration, measurements were taken at discrete, regular time intervals to provide multivariate, correlated data at each time period. This example thus serves as a case study for multivariate model validation with correlated observations at different spatial and temporal points for a single response quantity. Table 1 summarizes the four experimental configurations of different specimen thickness L0 and heat flux q0. Measurements are repeated once at each experimental configuration, resulting in eight sets of 21-variable observations available for model validation.

Given a fixed spatial point (x), the predicted temperature is generated using Equation (20) with the random input parameters at the various time points. The uncertainty in the input parameters is thus propagated to the response output by repeatedly evaluating this approximate model to obtain the output statistics. At the corresponding spatial and temporal points, the experimental measurement is simulated based on its statistics [Equation (23)]. The simulation is conducted in the following three steps:

(1) Randomly generate M sets of model input parameters from their statistics (M = 5000 in this example);

(2) Compute the prediction output using Equation (20) with every set of generated input parameters. The covariance of the multivariate prediction output, Σpred, is estimated using the simulated model prediction; and

(3) Randomly generate M sets of experimental data using Equation (23). The covariance of the experimental data is Σexp = (0.005ΔT)² I.

As an illustrative example, Figure 4 shows the curves of mean actual observation and mean model prediction for configuration 1. It is observed that the mean actual observation (dashed line) falls within the 95% bounds of the simulated prediction output.
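The three steps can be sketched as follows for configuration 1 at x = 0. This is an illustration under several stated assumptions rather than the authors' implementation. The second argument of each N(·, ·) is read as a standard deviation, consistent with Σexp = (0.005ΔT)² I. The spread printed for ρc exceeds its mean, so a placeholder of 5% of the mean is used for it instead. The "observed" temperatures are replaced by the nominal-input prediction because the measured data are not reproduced here. The measurement bias is drawn independently at each time point. All function names are hypothetical, and M and the number of series terms are reduced for speed.

```python
import math
import random

def slab_temperature(x, t, Ti, q, L, k, rho, n_terms=200):
    """Truncated series of the slab model; n_terms is kept small here for speed."""
    fo, xi = (k / rho) * t / L**2, x / L
    series = sum(
        math.exp(-((n * math.pi) ** 2) * fo) * math.cos(n * math.pi * xi) / n**2
        for n in range(1, n_terms + 1)
    )
    return Ti + (q * L / k) * (fo + 1 / 3 - xi + 0.5 * xi**2 - 2 / math.pi**2 * series)

def sample_inputs(rng, L0, q0, rho_sd):
    """Step 1: draw one set of the five uncertain inputs."""
    bias_q = rng.uniform(-0.05 * q0, 0.05 * q0)          # flux bias delta_q
    bias_T = rng.uniform(-0.025 * 25.0, 0.025 * 25.0)    # initial-temperature bias
    return {
        "L": rng.gauss(L0, 2.54e-4),
        "q": rng.gauss(q0 + bias_q, 0.015 * q0),
        "Ti": rng.gauss(25.0 + bias_T, 0.005 * 25.0),
        "k": rng.gauss(0.06393, 1.624e-4),
        "rho": rng.gauss(4.1928e5, rho_sd),              # ASSUMED spread (placeholder)
    }

def sample_covariance(rows):
    """Unbiased sample covariance matrix of equal-length rows."""
    count, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / count for j in range(dim)]
    return [
        [sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (count - 1)
         for j in range(dim)]
        for i in range(dim)
    ]

def sample_measurement(rng, t_obs, Ti):
    """Step 3: one simulated replicate of the observed temperatures."""
    replicate = []
    for T in t_obs:
        dT = abs(T - Ti)
        bias = rng.uniform(-0.025 * dT, 0.025 * dT)      # independent bias per point (assumption)
        replicate.append(rng.gauss(T + bias, 0.005 * dT))
    return replicate

rng = random.Random(0)
times = list(range(50, 1001, 50))                        # 20 time points
M = 200                                                  # the paper uses M = 5000

predictions = []
for _ in range(M):
    inputs = sample_inputs(rng, L0=0.0127, q0=1000.0, rho_sd=0.05 * 4.1928e5)
    predictions.append([slab_temperature(0.0, t, **inputs) for t in times])  # step 2
cov_pred = sample_covariance(predictions)                # estimate of Sigma_pred

# Hypothetical stand-in for measured data: prediction at the nominal inputs.
t_obs = [slab_temperature(0.0, t, 25.0, 1000.0, 0.0127, 0.06393, 4.1928e5) for t in times]
experiments = [sample_measurement(rng, t_obs, Ti=25.0) for _ in range(M)]

print(len(cov_pred), len(cov_pred[0]), len(experiments))
```

The off-diagonal entries of `cov_pred` are generally far from zero, because every time point of one simulated curve shares the same sampled inputs; this is the correlation that the multivariate metric is meant to capture.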

Model validation

In this example, the proposed methodology is applied for two cases: (1) four configurations (Case 1: n = 2), and (2) eight experiments (Case 2: n = 1), where n is the number of repeated experiments in every case. In Case 1, assume that measured data are not available for any of the five parameters, namely specimen thickness (L), heat flux (q), initial temperature (Ti), thermal conductivity (k) and heat capacity (ρc). The variation of prediction output thus comes from the uncertainty of all five parameters. Note that the number of variables is equal to the number of observation time intervals, i.e. m = 20, for all four configurations, resulting in a 20-variable model validation problem (the initial temperature is treated as a variable). This case aims to validate the model for applications where test data are not available for all input parameters.

Table 1. Four experimental configurations and model validation results for Case 1.

Config.  n  q0 (W/m²)  L0 (m)   bM       Pr(H0|Y) (%)   r²      Pr(R² > r²) (%)  γ (%)
1        2  1000       0.0127   −2.01    11.82          19.8    47.1             0.08
2        2  1000       0.0254   −15.32   2.22 × 10⁻⁵    224.0   0                0.22
3        2  2000       0.0127   −3.34    3.42           32.9    3.46             0.04
4        2  2000       0.0254   −17.92   1.65 × 10⁻⁶    221.2   0                0.22

Figure 4. Model prediction versus experimental output for configuration 1.

In Case 2, the available measurements of three parameters, L, q, and Ti, are used in model validation. Accordingly, the model prediction for every experiment is obtained using Equation (20) with the measured values of L, q, and Ti and the simulated values of k and ρc. Thus, the variation of prediction output comes only from the thermal properties k and ρc, resulting in the reduction of the uncertainty in model prediction compared with Case 1. Again, this is a 20-variable model validation problem.

Deterministic case

Table 1 summarizes the likelihood ratios obtained by Equation (10), the confidence values of model acceptance obtained by Equation (16), and the statistic r² obtained by Equation (12) with the corresponding p-values for all four scenarios (Case 1). Two phenomena are observed from the results. First, all values of bM are less than 0 (the decision threshold ln η = 0). Based on the Bayes decision rule in Equation (4), the eight sets of experimental data are judged to reject the computational model. Second, all p-values are less than 50%, implying evidence to reject the model, which is consistent with the conclusion of the Bayesian decision method. Therefore, the proposed method reaches the same conclusion as the classical method in rejecting the model.
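The confidence values in Table 1 can be reproduced from bM alone. Equation (16) is not shown in this section, so the sketch below assumes the usual posterior-odds relation Pr(H0|Y) = π0 BM / (π0 BM + π1) with BM = exp(bM); the function name `posterior_h0` is hypothetical.

```python
import math

def posterior_h0(b_m, pi0=0.5):
    """Pr(H0 | Y) from the log Bayes factor b_M and the prior Pr(H0) = pi0."""
    odds = math.exp(b_m) * pi0 / (1.0 - pi0)   # posterior odds of H0 versus H1
    return odds / (1.0 + odds)

# b_M values for the four configurations of Case 1 (Table 1).
for config, b_m in [(1, -2.01), (2, -15.32), (3, -3.34), (4, -17.92)]:
    print(f"configuration {config}: Pr(H0|Y) = {100.0 * posterior_h0(b_m):.3g} %")
```

With equal priors, bM = 0 corresponds to exactly 50% confidence, which is why the decision threshold ln η = 0 separates acceptance from rejection in this example.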

We calculate the marginal likelihoods of H0 given Y using Equation (17) for all four scenarios. The likelihood obtained from Equation (17) for each set of experimental data is very small (less than 10⁻⁶). A minimum risk of Rmin = 1.09 (unit) is obtained by Equation (3) for this case.

Table 2 summarizes the model validation results for Case 2. Three observations are made in this case. First, the value of bM obtained from the second set of validation data is greater than zero, so the model is accepted with 55.1% confidence, while the other seven sets of validation data are judged to reject the model. Second, the values of bM in this case are generally greater than those in Case 1. This may be attributed to the reduction of uncertainty of the input parameters in this case. Third and finally, the p-values are 79.5% and 82.3% for experiments 2 and 6, respectively, implying that there is not enough evidence to reject the model; for experiment 6, this conflicts with the rejection indicated by the Bayesian decision method.
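The classical p-values quoted above can be checked numerically. Equation (12) and its reference distribution are not reproduced in this section, so the sketch below rests on an assumption: that r² is referred to a chi-square distribution with m = 20 degrees of freedom, which appears to match the tabulated values. The function name `chi2_sf_even` is hypothetical; it uses the closed-form upper tail that holds for an even number of degrees of freedom.

```python
import math

def chi2_sf_even(x, dof):
    """Upper-tail probability Pr(X > x) of a chi-square variable with even dof."""
    if dof % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

# r^2 values for experiments 2 and 6 of Case 2 (Table 2), with m = 20 variables.
for experiment, r2 in [(2, 14.67), (6, 14.15)]:
    print(f"experiment {experiment}: p-value = {100.0 * chi2_sf_even(r2, 20):.1f} %")
```

Under this assumption the p-value depends only on r², whereas bM also depends on the prior parameters ρ and Λ, which is one way the two metrics can point in different directions for the same experiment.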

Similar to Case 1, the marginal likelihoods of H0 given Y are calculated using Equation (17) for all scenarios, and a minimum risk of Rmin = 0.25 (unit) is obtained by using Equation (3) for this case. The minimum risk is less than that obtained in Case 1 (Rmin = 1.09) due to the reduction of uncertainty, as expected.

Table 2. Model validation results for Case 2.

Experiment        1       2       3       4       5       6       7       8
q (W/m²)          1023    1006    964     958     2123    2108    2048    2096
L (m)             0.0126  0.0122  0.0255  0.0258  0.0126  0.0129  0.0255  0.0254
Ti (°C)           25.91   25.80   25.87   21.22   25.95   24.92   24.88   23.43
bM                −1.505  0.204   −4.899  −2.348  −0.614  −0.545  −3.454  −2.889
Pr(H0|Y) (%)      18.2    55.1    0.7     8.7     35.1    36.7    3.1     5.3
γ (%)             0.6     0.6     0.5     0.5     0.6     0.7     0.5     0.5
r²                51.39   14.67   354.57  208.56  21.03   14.15   254.47  220.51
Pr(R² > r²) (%)   0.014   79.5    0       0       39.5    82.3    0       0

Stochastic case

In this example, the probability density function (PDF) of bM in each scenario of the two cases is obtained by simulating 10,000 sets of experimental output and model prediction using their statistics and then calculating bM for every set of simulated data. Figures 5 and 6 show the PDFs of bM for Case 1 (four experimental configurations) and Case 2 (eight experiments), respectively. The probabilities of accepting the null hypothesis are given in Table 1 for Case 1 and Table 2 for Case 2. Three phenomena are observed from the results. First, all probabilities of accepting the null hypothesis are less than 1%, implying that these validation experiments reject the computational model with high probability, the same conclusion as obtained in the deterministic situation described previously.

Second, only one probability density function of bM is needed to encapsulate all scenarios in each case. Generally, the likelihood ratio (or Bayes factor) calculated by Equation (10) depends on the uncertainty of validation data and model prediction, as well as the number of repeated tests (i.e. sample size n). Given the uncertainty of input parameters and measurement error, the statistics of both model prediction and experimental output are determined. Thus, the statistics of the resulting likelihood ratio become invariant.

Third and finally, the variation of bM in Figure 6 (Case 2) appears to be less than that in Figure 5 (Case 1). This is consistent with the results obtained in the deterministic situation, where the reduction of the uncertainty of input parameters leads to the reduction of the uncertainty of the validation metric. To sum up, the simulation results indicate that the experimental data in these cases consistently reject the model.

Figure 5. Probability density functions of bM for four experimental configurations (Case 1).

Figure 6. Probability density functions of bM for eight experiments (Case 2).

Comments

The validation results have demonstrated that the predictive model in Equation (20) is rejected overall based on the eight sets of experimental data. One possible reason is that large uncertainty may exist in the two material properties, namely, the thermal conductivity (k) and the heat capacity (ρc). In this example, the model validation computations are based on the assumption that the material properties are constant against temperature variation. However, from the limited test data provided about the material properties, it is observed that the two material properties, particularly k, appear to be strongly correlated with temperature. The assumption of constant material properties may result in inaccurate model prediction. Therefore, more tests should be conducted to quantify the uncertainty of the two material properties or to calibrate the computational model. Another possible reason is that the number of repeated experiments n is small for each scenario. More validation experiments may be conducted to reduce the uncertainty, thus leading to a more accurate model validation result.

To illustrate further the effectiveness of the proposed methodology, measured data collected from a single accreditation validation experiment [8] are employed to assess the model [Equation (20)]. The experimental configuration for the accreditation test consists of q0 = 2969 W/m², L = 0.0189 m, and Ti = 27.37 °C. Temperature values collected at the surface (x = 0), in the middle of the specimen (x = L/2), and at the back surface (x = L) are available for a duration of 1500 seconds at increments of 75 seconds, resulting in a 20-variable model validation problem.

Again, 10,000 sets of experimental output and model prediction are simulated with the statistics of input parameters and measurement error, using the procedure described previously. Substituting the resulting quantities into Equation (10) yields the values of bM = −0.66, −189.24, and −95.60 for the three scenarios. Clearly, all of these values are less than 0 (ln η), suggesting that the model is rejected based on the accreditation data. Numerical simulation also demonstrates that the probability density functions of bM in the three cases are the same as that in Figure 6. The values of bM = −189.24 and −95.60 fall in the far left tail of the PDF of bM, implying that large uncertainty may exist in the accreditation experiment and the model cannot yet be applied for practical prediction.

Concluding remarks

A Bayesian risk-based decision method is developed in this paper for the overall reliability assessment of multivariate computational models, considering the uncertainty and correlation of multiple response quantities. The generalized expression of the likelihood ratio is derived to facilitate the reliability assessment, which is based on the comparison of the likelihood ratio with a Bayesian decision threshold. In the illustrative example of a transient heat conduction problem, the Bayesian decision rule is applied to judge whether to accept or reject the computational model based on the specific metric with a quantified confidence. The minimum risk of model validation is also obtained through correctly assigning experimental data to two decision regions.

Considering the uncertainty of input parameters and experimental data, the probability density function of the likelihood ratio is estimated by using the Monte Carlo simulation technique. The probability of accepting the model is easily obtained from the density function. It is revealed that the density function is unique for any specific uncertainty in the model validation. This feature is particularly useful for assessing the reliability of the predictive model for application in an untested region.

The methodology proposed in this paper has common technological merits with Jiang and Mahadevan’s [15], in terms of the Bayesian decision approach to model validation incorporating the risk of using the current model, data support for the current model, and cost of acquiring new information to improve the model. On the other hand, this paper has several unique and distinct contributions, different from the latter in objectives, methodology, and example problem.

(1) Multivariate model validation is pursued in this paper, whereas the earlier paper by Jiang and Mahadevan [15] pursues a univariate case. This difference is not trivial. It requires considering both uncertainty in validation data and the correlation of multiple response quantities. In addition, a computationally efficient method is needed to estimate the likelihoods of multivariate experimental data under each hypothesis.

(2) The generalized expression of the likelihood ratio is explicitly derived in this paper to facilitate the multivariate reliability assessment, whereas a Bayes network approach and an MCMC simulation technique suggested by Mahadevan and Rebba [19] are used in the earlier paper to estimate the likelihood of the response quantity of interest. To the best of our knowledge, this paper is the first to derive an explicit expression for the point hypothesis testing-based Bayes factor metric for multivariate model validation.

(3) The overall validation of multivariate predictive models is assessed in this paper using the statistics of the response quantity and the Monte Carlo simulation technique. The earlier paper did not address this issue at all. This is especially important when validation metrics, such as the p-value in the classical methods or the Bayes factor in the Bayesian methods, run into difficulty in decision-making when multiple experiments give conflicting inferences.

(4) The example problem in this paper is one of the three model validation challenge problems provided by Sandia National Labs [8]. It is very different from the simple examples used in the earlier paper by Jiang and Mahadevan [15].

It should be pointed out, however, that the current work is based on the assumption of Gaussian (normal) errors. In the case of non-normality, various transformation methods [1,7,25] are available to achieve normality of error data. The transformed data can then be used in the proposed Bayesian decision methodology for model validation. Refer to Rebba and Mahadevan [24] for details of the transformation of non-normality to normality.

In summary, the Bayesian decision methodology provides an effective quantitative approach to assess the reliability of multivariate computational models under uncertainty, considering the correlation of multiple response quantities. In addition, numerical results have demonstrated that the number of validation data appears to play a crucial role in the model validation accuracy. Therefore, the effects of the number of tested data on validation accuracy may be investigated using the Bayesian method presented in [28] (determining the number of tests required) and the Bayesian cross entropy methodology presented in [16] (optimally designing validation experiments).

Acknowledgements

The research was supported by funds from Sandia National Laboratories, Albuquerque, New Mexico (contract no. BG-7732, Project Monitors: Dr Thomas L Paez, Dr Martin Pilch). The support is gratefully acknowledged. Valuable discussions about this study contributed by Dr Remash Rebba and John McFarland are greatly appreciated.

References

[1] D.F. Andrews, R. Gnanadesikan, and J.L. Warner, Transformations of multivariate data, Biometrika 27 (1971), pp. 825–840.

[2] O. Balci and R.G. Sargent, A methodology for cost-risk analysis in the statistical validation of simulation models, Commun. ACM 24 (1981), pp. 190–197.

[3] J.V. Beck et al., Heat Conduction Using Green’s Functions, Hemisphere Publishing Corporation, London, 1992.

[4] J.O. Berger, Statistical Decision Theory and Bayesian Analysis, 2nd ed., Springer-Verlag, New York, 1985.

[5] J.O. Berger and M. Delampady, Testing precise hypotheses, Stat. Sci. 2 (1987), pp. 317–352.

[6] J.O. Berger and T. Sellke, Testing a point null hypothesis: the irreconcilability of P values and evidence, J. Am. Stat. Assoc. 82 (1987), pp. 112–122.

[7] G.E.P. Box and D.R. Cox, An analysis of transformations, J. R. Stat. Soc.: Ser. B (Statistical Methodology) 26 (1964), pp. 211–252.

[8] K.J. Dowding, M. Pilch, and R.G. Hills, Thermal Validation Challenge Problem, Sandia National Laboratories, Albuquerque, NM, 2005.

[9] K.J. Dowding et al., Case study for model validation: assessing a model for thermal decomposition of polyurethane foam, SAND2004-3632, Sandia National Laboratories, Albuquerque, NM, 2004.

[10] R.G. Hills and T.G. Trucano, Statistical validation of engineering and scientific models: a maximum likelihood based metric, Sandia National Laboratories Tec. Rep. Sand. No 2001 – 1783, Albuquerque, NM, 2001.

[11] M.L. Hobbs, K.L. Erickson, and T.Y. Chu, Modeling decomposition of unconfined rigid polyurethane foam, SAND99-2758, Sandia National Laboratories, Albuquerque, NM, 1999.

[12] J.T. Hwang et al., Estimation of accuracy in testing, Ann. Stat. 20 (1992), pp. 490–509.

[13] H. Jeffreys, Theory of Probability, 3rd ed., Oxford University Press, London, 1961.

[14] F.V. Jensen and F.B. Jensen, Bayesian Networks and Decision Graphs, Springer-Verlag, New York, 2001.

[15] X. Jiang and S. Mahadevan, Bayesian risk-based decision method for model validation under uncertainty, Reliab. Eng. Syst. Saf. 92 (2006), pp. 707–718.

[16] ———, Bayesian cross entropy methodology for optimal design of validation experiments, Meas. Sci. Technol. 17 (2006), pp. 1895–1908.

[17] R. Kass and A. Raftery, Bayes factors, J. Am. Stat. Assoc. 90 (1995), pp. 773–795.

[18] W.J. Krzanowski, Principles of Multivariate Analysis, 2nd ed., Clarendon Press, Oxford, 2000.

[19] S. Mahadevan and R. Rebba, Validation of reliability computational models using Bayes networks, Reliab. Eng. Syst. Saf. 87 (2005), pp. 223–232.

[20] J.I. Marden, Hypothesis testing: from p values to Bayes factors, J. Am. Stat. Assoc. 95 (2000), pp. 1316–1320.

[21] H.S. Migon and D. Gamerman, Statistical Inference – An Integrated Approach, Arnold (a Member of the Holder Headline Group), London, 1999.

[22] R. Nowak and C. Scott, The Bayes risk criterion in hypothesis testing, Connexions (2004), Available at http://cnx.rice.edu/content/m11533/1.6/.

[23] R. Rebba and S. Mahadevan, Validation of models with multivariate output, Reliab. Eng. Syst. Saf. 91 (2006), pp. 861–871.

[24] ———, Statistic methods for model validation under uncertainty, Reliab. Eng. Syst. Saf. (2007), in press.

[25] M.S. Srivastava, Methods of Multivariate Statistics, Wiley, New York, 2002.

[26] D.S. Wilks, Statistical Methods in Atmospheric Sciences: An Introduction, Academic Press, London, 1995.

[27] R. Zhang and S. Mahadevan, Integration of computation and testing for reliability estimation, Reliab. Eng. Syst. Saf. 74 (2001), pp. 13–21.

[28] ———, Bayesian methodology for reliability model acceptance, Reliab. Eng. Syst. Saf. 80 (2003), pp. 95–103.

Appendix A. Bayes factor BM for multivariate model validation

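Using a derivation procedure similar to that in the univariate case [Equation (6)], the posterior marginal likelihood of H1 given Y is obtained as follows:

$$
\begin{aligned}
f(\mathbf{Y}|H_1) &= \int f(\mathbf{Y}|\boldsymbol{\mu})\, f(\boldsymbol{\mu}|H_1)\, d\boldsymbol{\mu} \\
&= \int \frac{1}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{Y}_i - \boldsymbol{\mu})\right] \times \frac{1}{\sqrt{2\pi |\Lambda|}} \exp\left[-\frac{1}{2}(\boldsymbol{\mu} - \boldsymbol{\rho})'\Lambda^{-1}(\boldsymbol{\mu} - \boldsymbol{\rho})\right] d\boldsymbol{\mu} \\
&= \int \frac{1}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \bar{\mathbf{Y}})'\Sigma^{-1}(\mathbf{Y}_i - \bar{\mathbf{Y}})\right] \times \exp\left[-\frac{1}{2}\sum_{i=1}^{n} (\bar{\mathbf{Y}} - \boldsymbol{\mu})'\Sigma^{-1}(\bar{\mathbf{Y}} - \boldsymbol{\mu})\right] \\
&\qquad \times \frac{1}{\sqrt{2\pi |\Lambda|}} \exp\left[-\frac{1}{2}(\boldsymbol{\mu} - \boldsymbol{\rho})'\Lambda^{-1}(\boldsymbol{\mu} - \boldsymbol{\rho})\right] d\boldsymbol{\mu} \\
&= \frac{\exp\left(-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \bar{\mathbf{Y}})'\Sigma^{-1}(\mathbf{Y}_i - \bar{\mathbf{Y}})\right)}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \sqrt{\frac{|\Sigma|}{n|\Lambda| + |\Sigma|}} \exp\left[-\frac{n}{2}(\bar{\mathbf{Y}} - \boldsymbol{\rho})'(n\Lambda + \Sigma)^{-1}(\bar{\mathbf{Y}} - \boldsymbol{\rho})\right] \\
&\qquad \times \int \frac{1}{\sqrt{2\pi |\Pi|}} \exp\left[-\frac{1}{2}(\mathbf{Z} - \mathbf{Z}_0)'\Pi^{-1}(\mathbf{Z} - \mathbf{Z}_0)\right] d\mathbf{Z} \\
&= \frac{\exp\left(-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \bar{\mathbf{Y}})'\Sigma^{-1}(\mathbf{Y}_i - \bar{\mathbf{Y}})\right)}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \sqrt{\frac{|\Sigma|}{n|\Lambda| + |\Sigma|}} \exp\left[-\frac{n}{2}(\bar{\mathbf{Y}} - \boldsymbol{\rho})'(n\Lambda + \Sigma)^{-1}(\bar{\mathbf{Y}} - \boldsymbol{\rho})\right]
\end{aligned}
\tag{A.1}
$$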

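where $\mathbf{Y}_i = [y_{i1}\ y_{i2}\ \cdots\ y_{im}]^T$ (i = 1, 2, ..., n) is the ith measurement of m variables, $\bar{\mathbf{Y}}$ is the mean of the n measurements, and |.| denotes the determinant of a matrix. The integral $\int g(\mathbf{Z})\,d\mathbf{Z} = 1$ is used in the algebraic transformation of Equation (A.1), in which $\mathbf{Z} = \boldsymbol{\mu}\sqrt{n|\Lambda| + |\Sigma|}$ and $g(\mathbf{Z}) = \frac{1}{\sqrt{2\pi|\Pi|}} \exp\left[-\frac{1}{2}(\mathbf{Z} - \mathbf{Z}_0)'\Pi^{-1}(\mathbf{Z} - \mathbf{Z}_0)\right]$ is a normal distribution function with the mean $\mathbf{Z}_0 = (n\bar{\mathbf{Y}}|\Lambda| + \boldsymbol{\rho}|\Sigma|)/\sqrt{n|\Lambda| + |\Sigma|}$ and the variance $\Pi = \Sigma\Lambda$.

Accordingly, the likelihood ratio in Equation (4) for multivariate model validation is obtained by

$$
\begin{aligned}
B_M(\mathbf{Y}) &= \frac{f(\mathbf{Y}|\mathbf{E}_0)}{\int f(\mathbf{Y}|\boldsymbol{\mu})\, f(\boldsymbol{\mu})\, d\boldsymbol{\mu}} \\
&= \frac{\dfrac{1}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \mathbf{E}_0)'\Sigma^{-1}(\mathbf{Y}_i - \mathbf{E}_0)\right]}{\dfrac{\exp\left(-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \bar{\mathbf{Y}})'\Sigma^{-1}(\mathbf{Y}_i - \bar{\mathbf{Y}})\right)}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \sqrt{\dfrac{|\Sigma|}{n|\Lambda| + |\Sigma|}} \exp\left[-\frac{n}{2}(\bar{\mathbf{Y}} - \boldsymbol{\rho})'(n\Lambda + \Sigma)^{-1}(\bar{\mathbf{Y}} - \boldsymbol{\rho})\right]} \\
&= \frac{\dfrac{\exp\left(-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \bar{\mathbf{Y}})'\Sigma^{-1}(\mathbf{Y}_i - \bar{\mathbf{Y}})\right)}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} (\bar{\mathbf{Y}} - \mathbf{E}_0)'\Sigma^{-1}(\bar{\mathbf{Y}} - \mathbf{E}_0)\right]}{\dfrac{\exp\left(-\frac{1}{2}\sum_{i=1}^{n} (\mathbf{Y}_i - \bar{\mathbf{Y}})'\Sigma^{-1}(\mathbf{Y}_i - \bar{\mathbf{Y}})\right)}{(2\pi)^{(m+n)/2} |\Sigma|^{n/2}} \sqrt{\dfrac{|\Sigma|}{n|\Lambda| + |\Sigma|}} \exp\left[-\frac{n}{2}(\bar{\mathbf{Y}} - \boldsymbol{\rho})'(n\Lambda + \Sigma)^{-1}(\bar{\mathbf{Y}} - \boldsymbol{\rho})\right]} \\
&= \sqrt{\frac{n|\Lambda| + |\Sigma|}{|\Sigma|}} \exp\left\{\frac{n}{2}\left[(\bar{\mathbf{Y}} - \boldsymbol{\rho})'(n\Lambda + \Sigma)^{-1}(\bar{\mathbf{Y}} - \boldsymbol{\rho}) - (\bar{\mathbf{Y}} - \mathbf{E}_0)'\Sigma^{-1}(\bar{\mathbf{Y}} - \mathbf{E}_0)\right]\right\}
\end{aligned}
\tag{A.2}
$$

where E0 is a zero vector. Again, only the mean ($\bar{\mathbf{Y}}$) of observations is needed in computing the likelihood ratio using Equation (A.2).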
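The final line of Equation (A.2) can be evaluated numerically as sketched below. The code transcribes the expression as printed, including the determinant factor, and returns bM = ln BM to avoid overflow. It is an illustration only: the function names `solve_and_det` and `log_bayes_factor` are hypothetical, the small Gaussian-elimination helper is a standard-library stand-in for a linear algebra package, and the inputs in the example are arbitrary values rather than data from the paper.

```python
import math

def solve_and_det(a, b):
    """Solve a x = b and return (x, det(a)) by Gaussian elimination with pivoting."""
    size = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented copy of [a | b]
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det                                   # a row swap flips the sign
        det *= aug[col][col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for i in range(size - 1, -1, -1):                    # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x, det

def log_bayes_factor(y_bar, n, sigma, lam, rho):
    """b_M = ln B_M from the last line of (A.2), with E0 taken as the zero vector."""
    m = len(y_bar)
    diff = [y - r for y, r in zip(y_bar, rho)]
    combined = [[n * lam[i][j] + sigma[i][j] for j in range(m)] for i in range(m)]
    z_alt, _ = solve_and_det(combined, diff)             # (n*Lambda + Sigma)^-1 (Ybar - rho)
    z_null, det_sigma = solve_and_det(sigma, y_bar)      # Sigma^-1 Ybar
    _, det_lam = solve_and_det(lam, [0.0] * m)
    quad_alt = sum(d * z for d, z in zip(diff, z_alt))
    quad_null = sum(y * z for y, z in zip(y_bar, z_null))
    log_scale = 0.5 * math.log((n * det_lam + det_sigma) / det_sigma)
    return log_scale + 0.5 * n * (quad_alt - quad_null)

# Arbitrary two-variable example (illustrative values only).
identity = [[1.0, 0.0], [0.0, 1.0]]
b_m = log_bayes_factor([1.0, 0.0], n=2, sigma=identity, lam=identity, rho=[0.0, 0.0])
print(f"b_M = {b_m:.4f}")
```

A negative bM, as in this example, falls below ln η = 0 and would lead to rejection under the decision rule used in the main text.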