A Scalable Parallel Uncertainty Analysis and Data Assimilation Framework Applied to Some Hydraulic Problems

Waad Subber∗, Hatef Monajemi∗, Mohammad Khalil∗ and Abhijit Sarkar†

Significant uncertainty exists in numerical predictions of hydraulic and hydrologic phenomena, such as solute transport in river and subsurface flow, due to (a) insufficient knowledge of the spatio-temporal variability of the hydraulic and geological parameters across multiple length scales and (b) incomplete knowledge of the physics and mechanics of the associated phenomena. The conventional approach of treating the system parameters as deterministic quantities may therefore be unacceptable for such models. Due to the uncertainty and complexity of such spatially-distributed interconnected systems, it is necessary to quantify confidence in computer simulation models for their acceptability as reliable alternatives to field experiments. This paper describes a scalable parallel uncertainty quantification and data assimilation framework for some hydraulic and hydrologic systems that can exploit modern supercomputers and blend sensor data to reduce uncertainty in numerical predictions.

I. Introduction

With the recent availability of cost-effective high performance computing platforms, high resolution numerical simulations may offer a cost-effective alternative to extensive field experiments for hydraulic and hydrologic phenomena. When given access to powerful computers while tackling such problems, the first recourse adopted by numerical modelers is to increase the model resolution in spatial and temporal dimensions. In conjunction with modern multiprocessor computers, this strategy permits numerical simulations with substantial model resolution and thereby significantly reduces the numerical (discretization) errors. However, in modelling such natural phenomena, it is necessary to consider the random heterogeneity of the model parameters and model structural errors for realistic computer predictions. On the other hand, the widespread availability of modern sensing technology and high-speed communication networks offers the possibility of including measured data regarding the overall system in its operational state in (near) real time, and permits online and adaptive calibration of the dynamic evolution of the natural system arising in such problems. Although rich with valuable information regarding the operational state of the system, the sensor data is invariably corrupted by measurement noise. Significant interest exists in the assimilation of noisy sensor data into the high resolution computer model to recursively calibrate or infer (in a statistical sense) the parameters used by the numerical simulators and to provide a realistic ensemble forecast (as in weather predictions) regarding the evolution or state of the system.

Whenever substantial statistical data is available, such uncertainty can be modelled using the theory of probability and stochastic processes. This approach forms a rational basis for experiment design, data acquisition strategy, safety and risk assessment. Two main categories of uncertainty are [1]: (a) modeling uncertainty and (b) data (parameter) uncertainty. The modeling uncertainty generally arises from simplifying assumptions made in the mathematical modeling of the physical phenomena. The parameter uncertainty emerges from insufficient information and measurement errors in calibrating the parameters of the mathematical models. When a few parameters capture the uncertainty in the system, the parametric method [2] can adequately tackle the uncertainty quantification procedure. When a large number of random parameters is needed or the modeling uncertainty becomes significant, the so-called non-parametric methods based on the Wishart random matrix [1] representation of uncertainty are more practical. In this investigation, we consider the first category of uncertainty.

This investigation initiates the development of next generation uncertainty quantification models to exploit tera-scale supercomputers to enable high fidelity simulation with new capability to quantify and reduce uncertainty in hydraulic and hydrological predictions. It involves effective computational models that will 1) assess uncertainty in input and models, 2) propagate uncertainty through models, 3) quantify and reduce the effect of uncertainty by exploiting high performance computing (HPC) and data assimilation methods with fewer simplifying assumptions and enhanced physical realism. To demonstrate the capability of the methods, we consider the uncertainty analysis of the following problems [3, 4]: (1) steady-state seepage flow under dams with uncertain permeability properties, (2) pollution dispersion in rivers described by a stochastic advection-diffusion equation.

∗Graduate Student, †Assistant Professor and Canada Research Chair, Dept. of Civil and Environmental Eng., Carleton University, Ottawa, Ontario, Canada.

International Symposium On Uncertainties In Hydrologic And Hydraulic Modeling, Montreal, October 2008

II. Discrete Representation of Dynamical Systems

The discrete model and measurement equations for many nonlinear hydraulic systems are given by

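$$u_{k+1} = \psi_k(u_k, f_k, q_k), \tag{1}$$

$$d_k = h_k(u_k, \epsilon_k). \tag{2}$$

Here $u \in \mathbb{R}^n$ is the state vector, $\psi \in \mathbb{R}^n$ is the discrete nonlinear model operator, $f \in \mathbb{R}^p$ is a deterministic input and $d \in \mathbb{R}^m$ is the measurement vector, which relates to the true state by the measurement operator $h \in \mathbb{R}^m$. $q \in \mathbb{R}^s$ and $\epsilon \in \mathbb{R}^r$ are independent random vectors with covariance matrices $Q \in \mathbb{R}^{s \times s}$ and $\Gamma \in \mathbb{R}^{r \times r}$ respectively.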

Under certain assumptions, Eq.(1) can be recast into a vector Ito Stochastic Differential Equation (SDE) [5] as

$$du = g(u, t)\,dt + H(u, t)\,dw \tag{3}$$

where $u$ is an $n$-dimensional state vector, $g$ is an $n$-dimensional random vector function, $H$ is an $n \times m$ matrix-valued function and $w$ is an $m$-dimensional vector Wiener process. The transition probability density function (pdf) $p(u, t|u_0, t_0)$ of $u(t)$ satisfies the Fokker-Planck equation [5], with a given initial condition $p(u_0, t_0)$, given by

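$$\frac{\partial p}{\partial t} = -\sum_{i=1}^{n} \frac{\partial}{\partial u_i}\left[p\,g_i\right] + \frac{1}{2}\sum_{i,j=1}^{n} \frac{\partial^2}{\partial u_i \partial u_j}\left[p\left(HH^T\right)_{ij}\right]. \tag{4}$$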

Whenever observational data is available, the conditional pdf is obtained using Bayes' formula as [5]:

$$p(u_k, t_k|d_k) = \frac{p(d_k|u_k, t_k)\,p(u_k, t_k|d_k^-)}{\int p(d_k|u_k, t_k)\,p(u_k, t_k|d_k^-)\,du_k}, \tag{5}$$

where $p(u_k, t_k|d_k^-)$ is the conditional pdf given the observations up to but not including $d_k$.

Unfortunately, the closed-form solution of the Fokker-Planck equation (4) is not available in general. Therefore an approximate solution is obtained using Monte Carlo sampling, which can be computationally expensive for large-scale systems. Furthermore, Bayesian estimation of the conditional pdf (5) is not practical for high-dimensional systems. One instead resorts to approximate solutions based on nonlinear filtering techniques such as the Ensemble Kalman Filter and the Particle Filter [6, 7].
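As an illustration of the Monte Carlo route mentioned above, the following minimal sketch propagates samples of a scalar instance of Eq.(3) with the Euler-Maruyama scheme and estimates moments of $p(u, t)$ from the ensemble. The drift $g(u,t) = -u$, the constant diffusion $H = 1$, the step size and the sample count are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def euler_maruyama(g, H, u0, t_end, dt, n_samples, rng):
    """Propagate n_samples realizations of the scalar SDE du = g dt + H dw."""
    u = np.full(n_samples, u0, dtype=float)
    n_steps = int(round(t_end / dt))
    for k in range(n_steps):
        t = k * dt
        # Independent Wiener increments with variance dt
        dw = rng.normal(0.0, np.sqrt(dt), size=n_samples)
        u = u + g(u, t) * dt + H(u, t) * dw
    return u

# Hypothetical test problem: Ornstein-Uhlenbeck process du = -u dt + dw
rng = np.random.default_rng(0)
samples = euler_maruyama(lambda u, t: -u, lambda u, t: 1.0,
                         u0=2.0, t_end=5.0, dt=0.01, n_samples=20000, rng=rng)
sample_mean = samples.mean()       # exact mean is 2 exp(-5)
sample_var = samples.var(ddof=1)   # exact variance approaches 1/2
```

The sampling error decays only as the inverse square root of the number of samples, which is why this approach becomes expensive for large-scale systems.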

III. Forward Modelling

In this section, we review a recently proposed fast parallel numerical algorithm [2] that provides an alternative to the Monte Carlo based approximate solution of $p(u, t)$ for discrete systems arising from hydraulic models described by stochastic partial differential equations. The numerical technique is based on domain decomposition in geometric space and functional decomposition in probabilistic space. What follows is a brief overview of the methodology.

Uncertainty Representation By Stochastic Processes

We assume the data induces a representation of the model parameters as random variables and processes which span the Hilbert space $H_G$. A set of basis functions, $\xi_i$, is identified to characterize this space using the Karhunen-Loeve expansion. The state of the system resides in the Hilbert space $H_L$ with basis functions, $\Psi_i$, being identified with the polynomial chaos (PC) expansion. The Karhunen-Loeve expansion of a stochastic process $\alpha(x, \theta)$ is based on the spectral expansion of its covariance function $R_{\alpha\alpha}(x, y)$. The expansion takes the following form

$$\alpha(x, \theta) = \bar{\alpha}(x) + \sum_{i=1}^{\infty} \sqrt{\lambda_i}\,\xi_i(\theta)\,\phi_i(x) \tag{6}$$

where $\bar{\alpha}(x)$ is the mean of the stochastic process, $\theta$ represents the random dimension, and $\xi_i(\theta)$ is a set of uncorrelated (but not generally independent for non-Gaussian processes) random variables. $\phi_i(x)$ are the eigenfunctions and $\lambda_i$ are the eigenvalues of the covariance kernel, which can be obtained as the solution to the following integral equation

$$\int_{\Omega} R_{\alpha\alpha}(x, y)\,\phi_i(y)\,dy = \lambda_i\,\phi_i(x) \tag{7}$$
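The eigenproblem of the covariance kernel can be approximated numerically. The sketch below discretizes the integral equation on a uniform 1D grid with equal quadrature weights and draws one realization of a truncated expansion, assuming the standard normalization in which each mode is scaled by the square root of its eigenvalue. The exponential kernel, the correlation length `ell`, the grid and the Gaussian $\xi_i$ are illustrative assumptions, not choices made in this paper.

```python
import numpy as np

def discrete_kl(x, cov_fn, n_terms):
    """Leading eigenpairs of a covariance kernel on a uniform 1D grid."""
    h = x[1] - x[0]                        # uniform quadrature weight
    R = cov_fn(x[:, None], x[None, :])     # covariance matrix R(x_i, x_j)
    lam, vec = np.linalg.eigh(R * h)       # symmetric eigenproblem, ascending
    order = np.argsort(lam)[::-1][:n_terms]
    lam = lam[order]
    phi = vec[:, order] / np.sqrt(h)       # scale so that sum(phi**2) * h = 1
    return lam, phi

def sample_field(mean, lam, phi, rng):
    """One realization of the truncated expansion with Gaussian xi_i."""
    xi = rng.standard_normal(lam.size)     # uncorrelated, unit variance
    return mean + phi @ (np.sqrt(lam) * xi)

ell = 1.0                                  # hypothetical correlation length
x = np.linspace(0.0, 1.0, 50)
kernel = lambda a, b: np.exp(-np.abs(a - b) / ell)
lam, phi = discrete_kl(x, kernel, n_terms=4)
field = sample_field(np.zeros_like(x), lam, phi, np.random.default_rng(0))
```

The decay of `lam` indicates how many terms are needed to capture most of the variance of the input process.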

The covariance function of the solution process is not known a priori, and hence the Karhunen-Loeve expansion cannot be used to represent it [2]. Therefore, a generic basis that is complete in the space of all second-order random variables will be identified and used in the approximation process. Since the solution process is a function of the material properties, nodal solution variables, denoted here by $u(\theta)$, can be formally expressed as some nonlinear functional of the set $\xi_i(\theta)$ used to represent the material stochasticity. It has been shown that this functional dependence can be expanded in terms of polynomials in Gaussian random variables, referred to as polynomial chaos. Namely

$$u(\theta) = \sum_{j=0}^{N} \Psi_j(\theta)\,u_j \tag{8}$$

These polynomials are orthogonal in the sense that their inner product $\langle \Psi_j \Psi_k \rangle$, which is defined as the statistical average of their product, is equal to zero for $j \neq k$.
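The orthogonality property can be checked numerically. The sketch below does so for the one-dimensional case, assuming a single standard Gaussian variable and the probabilists' Hermite polynomials as the chaos basis; the multi-dimensional basis used in the paper is built from products of such polynomials. Gauss-Hermite quadrature from NumPy evaluates the statistical averages.

```python
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_inner_products(max_order, n_quad=20):
    """Matrix of <Psi_j Psi_k> for 1D Hermite chaos under a N(0, 1) variable."""
    x, w = He.hermegauss(n_quad)        # nodes and weights for exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)        # normalize to the standard normal pdf
    basis = np.eye(max_order + 1)
    # Row j holds He_j evaluated at the quadrature nodes
    psi = np.array([He.hermeval(x, basis[j]) for j in range(max_order + 1)])
    return (psi * w) @ psi.T

G = hermite_inner_products(4)
# Off-diagonal entries vanish; the diagonal holds <Psi_j^2> = j!
```

The diagonal entries $\langle \Psi_j^2 \rangle$ are the normalization factors that appear when projecting onto the chaos basis.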

Stochastic Domain Decomposition

At time step $t_k$ for a given nonlinear iteration, Eq.(1) can be recast into the following form

$$A(\alpha, \theta)\,u(\theta) = f, \tag{9}$$

where $\alpha$ denotes the random system parameters acting as multiplicative noise and $u$ is the response, generally a non-Gaussian random process.

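For a large scale system, Eq.(9) can be solved efficiently using the domain decomposition method [2], whereby the spatial domain $\Omega$ is partitioned into $n_s$ non-overlapping subdomains $\Omega_s$, $1 \le s \le n_s$, such that

$$\Omega = \bigcup_{s=1}^{n_s} \Omega_s, \quad \Omega_s \cap \Omega_r = \emptyset, \ s \neq r \quad \text{and} \quad \Gamma = \bigcup_{s=1}^{n_s} \Gamma_s \ \text{ where } \ \Gamma_s = \partial\Omega_s \backslash \partial\Omega \tag{10}$$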

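The unknown vector in each subdomain is split into two subvectors $u^s_I(\theta)$ and $u^s_\Gamma(\theta)$, which are respectively the unknowns interior to the subdomain and the interface unknowns shared among neighboring subdomains. Consequently, the stiffness matrix and the forcing vector for a typical subdomain can be expressed as:

$$\begin{bmatrix} A^s_{II}(\theta) & A^s_{I\Gamma}(\theta) \\ A^s_{\Gamma I}(\theta) & A^s_{\Gamma\Gamma}(\theta) \end{bmatrix} \begin{Bmatrix} u^s_I(\theta) \\ u^s_\Gamma(\theta) \end{Bmatrix} = \begin{Bmatrix} f^s_I \\ f^s_\Gamma \end{Bmatrix} \tag{11}$$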

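We assume uncertainty associated with the system parameters can be represented by stochastic processes. These stochastic processes are represented using the PC expansion. Accordingly, the stiffness matrix is approximated as

$$\sum_{i=0}^{L} \Psi_i \begin{bmatrix} A^s_{II,i} & A^s_{I\Gamma,i} \\ A^s_{\Gamma I,i} & A^s_{\Gamma\Gamma,i} \end{bmatrix} \begin{Bmatrix} u^s_I(\theta) \\ u^s_\Gamma(\theta) \end{Bmatrix} = \begin{Bmatrix} f^s_I \\ f^s_\Gamma \end{Bmatrix} \tag{12}$$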

Let us define a restriction operator $R_s$ which maps the entire set of interface degrees of freedom (dofs) $u_\Gamma(\theta)$ into the local interface dofs $u^s_\Gamma(\theta)$ on $\Gamma_s$ as

$$u^s_\Gamma(\theta) = R_s\,u_\Gamma(\theta) \tag{13}$$

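Enforcing the transmission conditions (compatibility and equilibrium) along the interfaces, the global equilibrium equation of the stochastic system can be expressed as the following block linear system of equations:

$$\sum_{i=0}^{L} \Psi_i \begin{bmatrix} A^1_{II,i} & \cdots & 0 & A^1_{I\Gamma,i} R_1 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & A^{n_s}_{II,i} & A^{n_s}_{I\Gamma,i} R_{n_s} \\ R_1^T A^1_{\Gamma I,i} & \cdots & R_{n_s}^T A^{n_s}_{\Gamma I,i} & \sum_{s=1}^{n_s} R_s^T A^s_{\Gamma\Gamma,i} R_s \end{bmatrix} \begin{Bmatrix} u^1_I(\theta) \\ \vdots \\ u^{n_s}_I(\theta) \\ u_\Gamma(\theta) \end{Bmatrix} = \begin{Bmatrix} f^1_I \\ \vdots \\ f^{n_s}_I \\ \sum_{s=1}^{n_s} R_s^T f^s_\Gamma \end{Bmatrix} \tag{14}$$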

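The solution vector can be expanded using the same PC basis. Substituting this expansion in Eq.(14), and applying Galerkin projection to minimize the error over the space spanned by the chaos basis, leads to

$$\begin{bmatrix} \mathcal{A}^1_{II} & \cdots & 0 & \mathcal{A}^1_{I\Gamma} R_1 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & \mathcal{A}^{n_s}_{II} & \mathcal{A}^{n_s}_{I\Gamma} R_{n_s} \\ R_1^T \mathcal{A}^1_{\Gamma I} & \cdots & R_{n_s}^T \mathcal{A}^{n_s}_{\Gamma I} & \sum_{s=1}^{n_s} R_s^T \mathcal{A}^s_{\Gamma\Gamma} R_s \end{bmatrix} \begin{Bmatrix} \mathcal{U}^1_I \\ \vdots \\ \mathcal{U}^{n_s}_I \\ \mathcal{U}_\Gamma \end{Bmatrix} = \begin{Bmatrix} \mathcal{F}^1_I \\ \vdots \\ \mathcal{F}^{n_s}_I \\ \sum_{s=1}^{n_s} R_s^T \mathcal{F}^s_\Gamma \end{Bmatrix} \tag{15}$$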

where we define the following variables

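$$[\mathcal{A}^s_{\alpha\beta}]_{jk} = \sum_{i=0}^{L} \langle \Psi_i \Psi_j \Psi_k \rangle\,A^s_{\alpha\beta,i}\ ; \qquad \mathcal{F}^s_{\alpha,k} = \langle \Psi_k f^s_\alpha \rangle \tag{16}$$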

The subscripts $\alpha$ and $\beta$ represent the indices $I$ and $\Gamma$. The coefficient matrix in Eq.(15) is of order $n(N+1) \times n(N+1)$, where $n$ and $(N+1)$ are the number of physical dofs and chaos coefficients respectively. The reduced system for the interface dofs $\mathcal{U}_\Gamma$, namely the Schur complement system, is given by

$$S\,\mathcal{U}_\Gamma = \mathcal{G}_\Gamma \tag{17}$$

where the Schur complement matrix is given by

$$S = \sum_{s=1}^{n_s} R_s^T \left[\mathcal{A}^s_{\Gamma\Gamma} - \mathcal{A}^s_{\Gamma I} (\mathcal{A}^s_{II})^{-1} \mathcal{A}^s_{I\Gamma}\right] R_s \tag{18}$$

and the right hand side vector is

$$\mathcal{G}_\Gamma = \sum_{s=1}^{n_s} R_s^T \left[\mathcal{F}^s_\Gamma - \mathcal{A}^s_{\Gamma I} (\mathcal{A}^s_{II})^{-1} \mathcal{F}^s_I\right] \tag{19}$$

Once the interface unknowns are available, the interior unknowns can be obtained by solving the interior problem on each subdomain in parallel

$$\mathcal{A}^s_{II}\,\mathcal{U}^s_I = \mathcal{F}^s_I - \mathcal{A}^s_{I\Gamma} R_s\,\mathcal{U}_\Gamma \tag{20}$$

In practice, the Schur complement matrix in Eq.(18) is dense and therefore expensive to form explicitly. Instead, parallel preconditioned conjugate gradient methods are used to solve the reduced system without explicitly constructing the Schur complement matrix.
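The structure of Eqs.(17)-(20) can be illustrated on a small deterministic problem. The sketch below assumes a 1D chain of 11 unknowns with a tridiagonal stiffness matrix, split into three subdomains that share two interface nodes; all matrices, loads and restriction operators are hypothetical stand-ins for the Galerkin-projected blocks. For clarity the Schur complement is assembled explicitly with dense solves, which is precisely what the preconditioned conjugate gradient approach avoids at scale.

```python
import numpy as np

def tridiag(n):
    """Stiffness matrix of a 1D chain with unit element stiffness."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Three subdomains, 3 interior nodes each; global interface nodes are 3 and 7.
subdomains = [
    dict(A_II=tridiag(3), A_IG=np.array([[0.0], [0.0], [-1.0]]),
         A_GG=np.array([[1.0]]), R=np.array([[1.0, 0.0]]),
         f_I=np.ones(3), f_G=np.array([0.5])),
    dict(A_II=tridiag(3), A_IG=np.array([[-1.0, 0.0], [0.0, 0.0], [0.0, -1.0]]),
         A_GG=np.eye(2), R=np.eye(2),
         f_I=np.ones(3), f_G=np.array([0.5, 0.5])),
    dict(A_II=tridiag(3), A_IG=np.array([[-1.0], [0.0], [0.0]]),
         A_GG=np.array([[1.0]]), R=np.array([[0.0, 1.0]]),
         f_I=np.ones(3), f_G=np.array([0.5])),
]

def schur_solve(subdomains, n_gamma):
    S = np.zeros((n_gamma, n_gamma))
    G = np.zeros(n_gamma)
    for sd in subdomains:
        A_GI = sd["A_IG"].T                            # symmetric problem
        X = np.linalg.solve(sd["A_II"], sd["A_IG"])    # A_II^{-1} A_IG
        y = np.linalg.solve(sd["A_II"], sd["f_I"])     # A_II^{-1} f_I
        S += sd["R"].T @ (sd["A_GG"] - A_GI @ X) @ sd["R"]   # Eq.(18)
        G += sd["R"].T @ (sd["f_G"] - A_GI @ y)              # Eq.(19)
    u_gamma = np.linalg.solve(S, G)                          # Eq.(17)
    # Eq.(20): independent interior solves, one per subdomain
    u_int = [np.linalg.solve(sd["A_II"],
                             sd["f_I"] - sd["A_IG"] @ (sd["R"] @ u_gamma))
             for sd in subdomains]
    return u_gamma, u_int

u_gamma, u_int = schur_solve(subdomains, n_gamma=2)
```

Each pass of the loop touches only one subdomain's data, so the subdomain contributions and the final interior solves can be distributed across processors.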

Numerical Illustration: Application to Seepage Problem

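In this section, we analyze the steady-state seepage flow under a dam. The governing stochastic differential equation describing the hydraulic head $u(x, y, \theta)$ is given by:

$$\frac{\partial}{\partial x}\left[c_x(x, y, \theta)\frac{\partial u}{\partial x}\right] + \frac{\partial}{\partial y}\left[c_y(x, y, \theta)\frac{\partial u}{\partial y}\right] = 0 \quad \text{in } \Omega \tag{21}$$

$$u = u_0 \quad \text{on } \Gamma_u \tag{22}$$

$$c_x(x, y, \theta)\frac{\partial u}{\partial x} n_x + c_y(x, y, \theta)\frac{\partial u}{\partial y} n_y = 0 \quad \text{on } \Gamma_q \tag{23}$$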

The random permeability coefficients $c_x(x, y, \theta)$ and $c_y(x, y, \theta)$ are modeled as independent lognormal random variables and $u_0$ is the given boundary data. The Gaussian images of the two random variables have coefficients of variation (cov) equal to 0.3. The upstream and downstream water levels are 10 m and 1 m respectively. Spatial discretization with linear triangular finite elements leads to 1,383,860 elements and 694,023 nodes. Both random permeability coefficients and the response are represented by fourth order PC expansions (L = 9, N = 15). The size of the resulting linear system is 10,410,345 DOF. This large-scale system is tackled using 160 CPUs in a Linux cluster with InfiniBand interconnect (2 Quad-Core 3.0 GHz Intel Xeon processors and 32 GB of memory per node). The graph-partitioning software ParMETIS [8] and the PETSc [9] parallel libraries are used.
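The quoted system sizes can be checked with a short calculation. The sketch below assumes the expansion is taken in the two random variables of this problem and uses the standard count of polynomial chaos terms of total degree at most $p$ in $n$ variables, $(n+p)!/(n!\,p!)$; the function name is illustrative.

```python
from math import comb

def n_pc_terms(n_rv, order):
    """Number of PC terms of total degree <= order in n_rv random variables."""
    return comb(n_rv + order, order)

n_nodes = 694_023
terms_4th = n_pc_terms(2, 4)     # fourth order expansion
terms_3rd = n_pc_terms(2, 3)     # third order expansion
dof_4th = n_nodes * terms_4th    # system size for fourth order PC
dof_3rd = n_nodes * terms_3rd    # system size for third order PC
```

With two random variables the fourth order expansion has 15 coefficients, indexed 0 through 14 as in the panels of Figure 4, which reproduces the 10,410,345 DOF quoted above; the third order expansion has 10 coefficients and gives the 6,940,230 unknowns used in the strong scalability study.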

Figure 1(a) shows the finite element (FE) mesh decomposition of the geometric domain into 160 non-overlapping subdomains. Figure 1(b) shows the mean of the hydraulic head, while the associated standard deviation is shown in Figure 1(c). The coefficient of variation of the solution is presented in Figure 1(d), where the maximum magnitude appears to be 0.2, underscoring the effect of uncertainty. Figure 2 presents the convergence of the PC expansion and the associated execution time. The Euclidean norm of the standard deviation is considered as the metric for convergence, as shown in Figure 2(a). The difference between the norms of the third order and the fourth order PC expansions is relatively small, meaning that the PC expansion converges at the fourth order. In Figure 2(b) the execution time of up to fourth order PC is presented. As the order of the PC expansion increases, the execution time increases almost linearly. The strong scalability of the algorithm for a linear system size of 6,940,230 is shown in Figure 3(a). For strong scalability measures, we fix the problem size and increase the number of subdomains (and CPUs), whereby the goal is to reduce the execution time for a fixed problem size. Figure 3(b) shows the weak scalability of the algorithm. The size of the problem per core is fixed to 102,000 and the number of CPUs is allowed to increase in order to measure the ability of the parallel algorithm to solve larger problems with more processors. Details concerning the stochastic features of the solution process are shown in Figure 4, in which the chaos coefficients of the solution are presented. Note that $u_j$ for $j > 3$ represent the non-Gaussian contributions to the solution field.

IV. Inverse Modelling using Sequential Data Assimilation

In the context of dynamical systems, a forward model provides an approximate pdf of the current state of the system based on the prior knowledge of the uncertain system parameters. This pdf is the so-called “prior” pdf of the system state, as it does not take into account any dynamic measurement data whenever available. In order to reduce uncertainty in the predictive models, sequential data assimilation techniques condition this prior pdf recursively by blending measurement data into the running numerical model. The conditioned pdf is often labelled as the “posterior” pdf, as it represents the best knowledge of the system state given measurement data. A wide range of Monte Carlo based sequential filtering algorithms have been developed to tackle general classes of nonlinear and non-Gaussian systems [6, 7], including applications to hydraulic problems [3]. The EnKF can also be coupled with the stochastic domain decomposition method (DDM) discussed in the previous section, but this coupling is not explored in this paper.

Ensemble Kalman Filter (EnKF)

In EnKF, a finite number of Monte Carlo samples of the state vector $u_k$ are propagated forward in time using the forward model given by Eq.(1). One can estimate the pdf of $u_k$ using ensemble averaging. EnKF adopts a linear analysis step based on the assumption of a Gaussian state and measurement noise, which offers computational efficiency but introduces errors in the estimated conditional pdf of $u_k$. However, the full nonlinear model is integrated forward in time, maintaining non-Gaussian features of the system. Considering again the model and measurement equations as described by Eqs.(1)-(2), EnKF provides estimates of the conditional mean $u^a_k$ and covariance $P^a_k$ of the state vector and the forecast mean $u^f_{k+1}$ and covariance $P^f_{k+1}$ as follows [6]:

1. Analysis step:

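$$C_k = \left.\frac{\partial h_k(u_k, \epsilon_k)}{\partial u_k}\right|_{u_k = u^f_k,\ \epsilon_k = \bar{\epsilon}_k}, \tag{24}$$

$$D_k = \left.\frac{\partial h_k(u_k, \epsilon_k)}{\partial \epsilon_k}\right|_{u_k = u^f_k,\ \epsilon_k = \bar{\epsilon}_k}, \tag{25}$$

$$K_k = P^f_k C_k^T \left[D_k \Gamma_k D_k^T + C_k P^f_k C_k^T\right]^{-1}, \tag{26}$$

$$u^a_{k,i} = u^f_{k,i} + K_k \left(d_{k,i} - h_k\left(u^f_{k,i}, \epsilon_{k,i}\right)\right), \tag{27}$$

$$u^a_k = \frac{1}{N} \sum_{j=1}^{N} u^a_{k,j}, \tag{28}$$

$$P^a_k = \left[I - K_k C_k\right] P^f_k. \tag{29}$$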

2. Forecast step:

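$$u^f_{k+1,i} = \psi_k\left(u^a_{k,i}, f_k, q_{k,i}\right), \tag{30}$$

$$u^f_{k+1} = \frac{1}{N} \sum_{j=1}^{N} u^f_{k+1,j}, \tag{31}$$

$$P^f_{k+1} = \frac{1}{N-1} \sum_{j=1}^{N} \left(u^f_{k+1,j} - u^f_{k+1}\right)\left(u^f_{k+1,j} - u^f_{k+1}\right)^T. \tag{32}$$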

For the case of additive measurement noise, $h_k(u_k, \epsilon_k) = h_k(u_k) + \epsilon_k$, one can avoid the linearisation step of the measurement operator by augmenting the original state vector with $h_k(u_k)$. In that case, the measurement becomes a linear function of the new augmented state vector. For strongly nonlinear models, the linear analysis step in EnKF is the major limitation, as it implicitly assumes statistical closure at the second order moments. However, EnKF appears to perform well for many practical applications with significant nonlinear and non-Gaussian behavior.
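The two steps above can be sketched for the additive-noise case with a linear measurement operator, $d = Cu + \epsilon$, so that $D_k = I$. The two-state linear model, the noise levels and the ensemble size below are illustrative assumptions, and the function names are hypothetical rather than part of any library.

```python
import numpy as np

def enkf_forecast(ens_a, model, f, Q_chol, rng):
    """Eqs.(30)-(32): propagate each member, then form sample mean/covariance."""
    N = ens_a.shape[1]
    q = Q_chol @ rng.standard_normal((Q_chol.shape[1], N))   # model noise
    ens_f = np.column_stack([model(ens_a[:, i], f, q[:, i]) for i in range(N)])
    mean_f = ens_f.mean(axis=1)
    dev = ens_f - mean_f[:, None]
    P_f = dev @ dev.T / (N - 1)
    return ens_f, mean_f, P_f

def enkf_analysis(ens_f, P_f, d, C, Gamma, rng):
    """Eqs.(26)-(29) for d = C u + eps (additive noise, D = I)."""
    N = ens_f.shape[1]
    K = P_f @ C.T @ np.linalg.inv(Gamma + C @ P_f @ C.T)      # Kalman gain
    eps = np.linalg.cholesky(Gamma) @ rng.standard_normal((d.size, N))
    # Eq.(27): each member sees the data minus its own noisy prediction
    ens_a = ens_f + K @ (d[:, None] - (C @ ens_f + eps))
    mean_a = ens_a.mean(axis=1)
    P_a = (np.eye(P_f.shape[0]) - K @ C) @ P_f
    return ens_a, mean_a, P_a

rng = np.random.default_rng(1)
ens0 = rng.standard_normal((2, 500))          # initial ensemble, N = 500
model = lambda u, f, q: 0.9 * u + f + q       # hypothetical linear model
ens_f, mean_f, P_f = enkf_forecast(ens0, model, np.zeros(2), 0.1 * np.eye(2), rng)
C = np.array([[1.0, 0.0]])                    # only the first state is observed
Gamma = np.array([[0.01]])
ens_a, mean_a, P_a = enkf_analysis(ens_f, P_f, np.array([1.0]), C, Gamma, rng)
```

The gain is computed from ensemble statistics only, so the cost of the analysis step scales with the ensemble size and the number of measurements rather than with the full state covariance dynamics.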

Numerical Illustration: Applications to Contaminant Tracking

In this section, EnKF is used to track the distribution of a contaminant in an idealized open channel flow. The concentration of the pollutant is described by the stochastic advection-diffusion equation:

$$\frac{\partial u}{\partial t} + b \cdot \nabla u - \nabla \cdot (c \nabla u) = f \quad \text{in } \Omega \times (0, T] \tag{33}$$

$$u = u_\partial \quad \text{on } \Gamma_u \times (0, T] \tag{34}$$

$$u = u_0 \quad \text{at } \Omega \times \{0\} \tag{35}$$

where $u(x, y, t, \theta)$ is the pollutant concentration, $b(x, y, t, \theta)$ is the velocity vector, $f(x, y, t, \theta)$ is the contaminant source function and $c(x, y, \theta)$ is the diffusion coefficient, where $\theta$ denotes the random dimension. $u_\partial$ and $u_0$ represent the given boundary and initial data. For steady discharge in the channel, the velocity field $b(x, y, \theta)$ can be obtained by solving the Stokes equation:

$$-\mu \Delta b + \nabla p = s \quad \text{in } \Omega \tag{36}$$

$$\nabla \cdot b = 0 \quad \text{in } \Omega \tag{37}$$

$$b = b_0 \quad \text{on } \Gamma_u \tag{38}$$

where $p(x, y, \theta)$ is the pressure, $s(x, y, \theta)$ is the source vector and $\mu(x, y, \theta)$ is the viscosity coefficient. $b_0$ is the given boundary data.

Numerical Simulation

The geometry of the channel consists of a 1 m × 5 m rectangle. The problem domain was discretized using linear triangular finite elements with 1008 nodes and integrated for a time period of 3.3 s with $\Delta t = 0.1$ s using the FEM software FEniCS [10]. The diffusion coefficient was set to 0.001 m²/s. The source and the velocity fields are as follows:

$$f(x, y, t) = 1 \times 10^{3}\,\delta(t)\,\exp\left[-50(x - x_0)^2 + (y - y_0)^2\right] \tag{39}$$

$$b = \begin{Bmatrix} \sin(\pi y) \\ 0 \end{Bmatrix} \tag{40}$$

where $(x_0, y_0)$ denotes the source location and $\delta(t)$ is the Dirac delta function. For the advection-diffusion problem, a homogeneous Dirichlet boundary condition is assumed on the $y = 0$, $y = 1$ and $x = 0$ boundaries.

Data Assimilation using EnKF

The true contaminant concentration to be tracked is obtained from a specific run of an FE model with the solution shown in subplots (a) through (c) of Figure 5. The subplots present the concentration at specific time instances of t = 0 s, t = 0.4 s and t = 3.3 s. With an initially uncertain source term, subplots (d) through (f) and (g) through (i) of Figure 5 present the unconditional mean and standard deviation of the concentration respectively. The source term is given in Eq.(39) with $x_0 = 0.5$ and $y_0$ assumed to be a uniform random variable distributed between 0.1 and 0.9. It is evident that such an uncertainty provides a mean estimate that is far from the sample solution that we are trying to estimate.

Figure 1: (a) Mesh decomposition with 160 non-overlapping subdomains using ParMETIS, (b) The mean $\mu = u_0$, (c) The standard deviation $\sigma = \sqrt{\sum_{j=1}^{N} \langle \Psi_j^2 \rangle u_j^2}$, (d) Coefficient of variation $COV = \sigma / \mu$

Figure 2: Convergence of the Polynomial Chaos and the associated execution time: (a) Convergence of PC ($\|\sigma\|_2$ versus PC order), (b) Execution time (in seconds, versus PC order)

Figure 3: Strong and weak scalability for 3rd order PC expansion (execution time in seconds versus number of CPUs): (a) Strong scalability for a linear system of order 6,940,230, (b) Weak scalability for a linear system of order 102,000 per core

Figure 4: Chaos coefficients $u_j$ of the solution process (panels $u_0$ through $u_{14}$)

Figure 5: (a) through (c): the actual contaminant concentration; (d) through (f) the unconditioned forward model mean; (g) through (i) the unconditioned forward model standard deviation at t = 0.0 s, t = 0.3 s and t = 3.3 s.

Figure 6: (a) through (c): the actual contaminant concentration; (d) through (f) EnKF estimate; (g) through (i) EnKF estimate error standard deviation at t = 0.0 s, t = 0.3 s and t = 3.3 s.

It is assumed that the true concentration to be estimated is sparsely measured at 25 different locations. The measurement data is collected from a 5 × 5 sensor array with the sensors equally spaced by 0.2 m in both the $x$ and $y$ directions. The sensor in the middle of the array is located at ($x = 2.5$, $y = 0.5$). Measurement noise is assumed to be Gaussian with zero mean and variance of $1 \times 10^{-6}$. Furthermore, the measurements are obtained every 0.1 s. The measurements are only assimilated if the concentration value reaches a threshold of 0.1.
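The sensor layout described above can be written down directly. The sketch below builds the 5 × 5 array centred at (2.5, 0.5) with 0.2 m spacing and selects the readings that reach the threshold of 0.1. Applying the threshold sensor by sensor is an assumption made here for illustration; the function names are hypothetical.

```python
import numpy as np

def sensor_grid(center=(2.5, 0.5), n_side=5, spacing=0.2):
    """Coordinates of an n_side x n_side sensor array centred at `center`."""
    offsets = (np.arange(n_side) - (n_side - 1) / 2) * spacing
    xs, ys = np.meshgrid(center[0] + offsets, center[1] + offsets, indexing="ij")
    return np.column_stack([xs.ravel(), ys.ravel()])

def select_measurements(values, threshold=0.1):
    """Indices of sensors whose reading reaches the assimilation threshold."""
    return np.flatnonzero(np.asarray(values) >= threshold)

sensors = sensor_grid()          # 25 rows of (x, y) coordinates
noise_std = np.sqrt(1e-6)        # standard deviation of the measurement noise
active = select_measurements([0.05, 0.1, 0.3])
```

With these values the array spans x from 2.1 to 2.9 and y from 0.1 to 0.9 inside the 1 m × 5 m channel.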

EnKF is applied to assimilate the data into the forward model. An ensemble size of N = 50 is used. The estimated conditional mean and standard deviation of the contaminant concentration are shown in subplots (d) through (f) and (g) through (i) of Figure 6 respectively. As evidenced by subplots (d) and (e), the estimated mean is far from the true concentration at t = 0.0 s and t = 0.3 s. This is because no data is assimilated while the measurements are below the threshold value of 0.1. In contrast to the unconditioned case in Figure 5, EnKF provides an accurate estimate of the true concentration once data is assimilated. An estimate with much smaller uncertainty is obtained as more data is assimilated, as evidenced in Figure 6(i).

V. Conclusion

Firstly, an efficient uncertainty analysis method is employed for large-scale stochastic systems in the context of a hydrologic problem. In addition to reducing discretization errors with high resolution finite element models, the methodology can efficiently propagate uncertainty through the system. The numerical technique, based on a recently proposed domain decomposition of stochastic systems, can effectively exploit modern parallel computers. The parallel performance of the algorithms is demonstrated for non-Gaussian systems arising from the stationary (time-independent) seepage flow under a dam having random soil permeability properties. Secondly, an inverse problem relating to the state estimation of a non-Gaussian hydraulic system is described for a time-dependent system using a sequential data assimilation method, namely the Ensemble Kalman Filter. In particular, the problem of contaminant tracking in an idealized channel is tackled in an uncertain environment. The EnKF demonstrates excellent capability in tracking the spatio-temporal evolution of the contaminant concentration by reducing uncertainty in the predictive model.

Acknowledgments

The authors acknowledge the support of the Natural Sciences and Engineering Research Council of Canada, Canada Research Chair Program, Canada Foundation for Innovation and Ontario Innovation Trust. The authors thank Dr. Ali Rebaine for his help with the ParMETIS graph-partitioning software.

References

[1] S. Adhikari and A. Sarkar. Uncertainty in structural dynamics: Experimental validation of Wishart random matrix model. Journal of Sound and Vibration, 2007. Under review.

[2] A. Sarkar, N. Benabbou, and R. Ghanem. Domain decomposition of stochastic PDEs: Theoretical formulations. International Journal for Numerical Methods in Engineering, 2008. Published online.

[3] J. A. Vrugt, C. G. H. Dicks, H. V. Gupta, W. Bouten, and J. M. Verstraten. Improved treatment of uncertainty in hydrologic modeling: Combining the strengths of global optimization and data assimilation. Water Resources Research, 41:W01017, 2005.

[4] K. J. Beven. Uncertainty in predictions of floods and hydraulic transport. Publications, Institute of Geophysics: Polish Academy of Science, E-7, 401:5–20, 2007.

[5] A. H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press, San Diego, California, 1970.

[6] G. Evensen. Data Assimilation: The Ensemble Kalman Filter. Springer, Berlin, 2006.

[7] M. Khalil, A. Sarkar, and S. Adhikari. Nonlinear filters for chaotic oscillatory systems. Journal of Nonlinear Dynamics.

[8] K. Schloegel, G. Karypis, and V. Kumar. PARMETIS parallel graph partitioning and sparse matrix ordering library, 1998.

[9] Satish Balay, Kris Buschelman, William D. Gropp, Dinesh Kaushik, Matthew G. Knepley, Lois Curfman McInnes, Barry F. Smith, and Hong Zhang. PETSc Web page, 2001. http://www.mcs.anl.gov/petsc.

[10] FEniCS. FEniCS project. http://www.fenics.org/.

[11] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia, second edition, 2000.
