
Ecological Modelling 205 (2007) 437–452

available at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/ecolmodel

Efficiently characterizing the origin and decay rate of a nonconservative scalar using probability theory

Andrew Keats∗, Eugene Yee, Fue-Sang Lien

Department of Mechanical Engineering, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada
Defence R&D Canada – Suffield, P.O. Box 4000, Medicine Hat, Alta. T1A 8K6, Canada

Article info

Article history:
Received 23 September 2006
Received in revised form 6 March 2007
Accepted 9 March 2007
Published on line 3 May 2007

Keywords:
Bayesian inference
Probability theory
Parameter estimation
Dispersion modelling
Adjoint equations
Travel time inventory
Markov chain Monte Carlo
Back trajectories

Abstract

We present a methodology for estimating parameters which describe the source location and strength, as well as the rate of transformation, of a passive, nonconservative scalar released into (or already present in) the atmosphere. A finite number of uncertain (noisy) concentration measurements is the primary source of information for the source reconstruction, which implies that the problem is ill-posed and must be solved using a Bayesian probabilistic inferential framework. All parameters are estimated simultaneously and the model describing the forward problem (prediction of the dispersion of a contaminant given its source and rate of decay) is computationally demanding, so the capability to efficiently calculate the solution to the inverse problem is crucial. It is demonstrated how a backward Lagrangian stochastic particle model facilitates the rapid characterization of the source location and strength, while by monitoring the statistical properties of the particle travel times, the rate of transformation of the agent can be efficiently estimated. Markov chain Monte Carlo is used to sample from the multi-dimensional domain of definition of the posterior probability density function that results from applying Bayes' theorem. The overall methodology is validated in two stages. First, the source reconstruction approach is tested using concentration data measured during an experiment in which a conservative scalar was continuously released from a point source into a statistically stationary flow in a horizontally homogeneous and neutrally stratified atmospheric surface layer (Project Prairie Grass). Next, the reconstruction approach is applied to synthetic concentration data generated using a Lagrangian stochastic model operating under the same atmospheric conditions, with decay of particle mass being modelled by a straightforward first-order mechanism.

∗ Corresponding author. E-mail address: [email protected] (A. Keats).
0304-3800/$ – see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2007.03.010

1. Introduction

Accurately predicting the dispersion of pollutants in the environment has important implications for both emergency and environmental management, and has been a topic of intensive study over the last several decades. Equally important is the inverse problem: determining the characteristics of the source of the pollutant, whether it be natural or anthropogenic, given a finite and noisy set of concentration measurements. Bayesian inference has recently gained popularity as a framework for solving these problems of parameter estimation and model selection; for example, Borsuk and Stow (2000) and Qian et al. (2003) estimated the magnitude, rate and reaction order of a biochemical oxygen demand model using experimentally measured wastewater data. Estimating rate parameters is not only useful for model selection and verification, but it can


also be used to find evidence for specific types of chemistry which occur in the environment. Ariya et al. (1998) isolated the presence of halogen chemistry in the troposphere by examining ozone and nonmethane hydrocarbon (NMHC) depletion in the Arctic boundary layer. They applied linear regression to estimate the removal rates of various NMHCs and deduced a reaction mechanism based on information about Cl, Br and HO radical chemistry. The air mass under consideration remained stagnant, which allowed the authors to treat it as a 'smog chamber' reactor. In contrast, the present work explicitly considers the advection of species and would be useful in situations involving both transport and chemistry.

In the literature, problems related to source determination are addressed in many different contexts of varying scope. In addition to the noun 'determination', other terms used by authors to describe related applications include: identification, inversion, estimation, apportionment, localization, and reconstruction. Assuming the existence of a single model used to represent the source (or sources), the problem of source determination becomes one of parameter estimation in which the parameters describing the source could include its strength (emission rate), location, rate of decay in the environment, and if the source is transient, its turn-on and turn-off times. Hanna et al. (1990) used Eulerian-based models to estimate the source strength for the Project Prairie Grass (PPG) experiments (Barad, 1958; Haugen, 1959), while Flesch et al. (1995) used a backward-time Lagrangian stochastic (LS) model to estimate the emission rate of a sustained surface area source in horizontally homogeneous turbulence. The work of Flesch et al. is important because it presents the backward Lagrangian stochastic model used in the present research. A similarly motivated but more involved study was carried out by Lin et al. (2003), who developed a backward LS model for determining surface CO2 fluxes from aircraft measurements made in the planetary boundary layer. The LS models described by Flesch et al. and Lin et al. both take a 'receptor-oriented' approach to determining detector concentrations, an approach which is also adopted here. However, they do not exploit statistics related to particle travel times for treating potentially nonconservative tracers.

The so-called 'source apportionment' problem1 was addressed by Skiba (2003), who used an adjoint pollution transport model (the Eulerian equivalent of the backward Lagrangian stochastic model) to identify industries operating in violation of emissions regulations. In this case, a limited set of possible source locations was known a priori, whereas in the present work, no such assumptions need to be made about the position of the source. Penenko et al. (2002) and Liu et al. (2005) use a similar method to perform a sensitivity analysis and risk assessment for populated areas which could potentially suffer from the effects of a chemical or radiological accident.

The Bayesian methodology we have adopted for this work is flexible in that any number or type of source parameters may be considered for estimation; however, the main contribution of this research is the attachment of a statistical method (described in Section 3.3) to a Lagrangian stochastic dispersion model. This permits the efficient estimation of the first-order decay rate of a dispersed tracer. Therefore, for simplicity we consider only a single point source which continuously emits material into a statistically (horizontally) homogeneous atmospheric surface layer. Although this may seem like a special case of limited relevance, the method can easily be generalized to account for transient sources and wind fields. Furthermore, many practical transport-related problems in the field of ecological modelling can be addressed or at the very least approximated by assuming a continuous release and statistical homogeneity. In certain wind conditions, the assumption of stationarity (independence of mean variables and turbulence statistics on time) may be considered valid, which allows tracer transport (emanating from a briefly sustained source) over small distances (100–1000 m) to be modelled using a continuous release. For example, the assumption of a continuous release and horizontal homogeneity of the turbulent wind field was adopted by Meyers et al. (1998), who inferred dry deposition rates for SO2, O3, HNO3, as well as particulate matter. In fact, whereas Meyers et al. used eddy correlation methods to estimate the deposition rates, the method presented in this work could also potentially be used (directly or by augmenting a separate experimental procedure) in the estimation of deposition wherever it can be modelled as a first-order decay process.

1 In the source apportionment problem, we are aware of the locations of a number of pre-existing sources and must estimate the relative magnitudes of their emissions.

It is important to distinguish the present work from studies which have been done on global atmospheric transport inversion, such as Rodenbeck et al. (2003) and the TransCom studies, e.g., Denning et al. (1999). While it is true that they incorporate backward trajectories (or adjoint equations) and Bayesian inference techniques, these investigations are driven by global-scale transport models with the objective of determining surface fluxes (of CO2 and SF6) and do not require near-field models to estimate location or reaction rate parameters.

Contaminant source identification in groundwater flows has been addressed by a number of authors, including Aral et al. (2001), who used an optimization approach (based on genetic algorithms) to infer the release history and source location of a contaminant. Michalak and Kitanidis (2002) adopted a Bayesian approach and used Markov chain Monte Carlo to sample the posterior distribution for the source parameters. While both investigations share the goal of inferring the source location of a contaminant, they differ from the present work in that they did not consider scenarios in which the first-order decay coefficient of the contaminant was unknown.

Recent attempts to determine both the source location and strength in atmospheric flows have employed a Bayesian probabilistic framework. Starting with assumptions about the nature of the uncertainties involved in both experimental and modelled detector concentrations, applying probability theory as a form of extended logic (Cox, 1946; Jaynes, 2003) yields an estimate for the source parameters which is logically consistent with these uncertainties.2 In a similar way to

2 We would refer to this as an ‘honest’ estimate.



the present work, Chow et al. (2005) used Markov chain Monte Carlo (MCMC) to sample from the posterior probability density function (PDF) of the source parameters; however, they did not implement adjoint (backwards) dispersion equations, which significantly increased their computational burden. Recent work of Hsieh et al. (2005), Yee et al. (2006) and Keats et al. (2007) has remedied this problem by using the adjoint advection-diffusion equation in conjunction with MCMC to perform the computations efficiently. In the present work, the backward Lagrangian stochastic (bLS) model assumes the role originally played by the Eulerian adjoint advection-diffusion equation in Keats et al. (2007), which considered only conservative tracers.

Nonconservative tracers (which decay or grow in mass over time either through mechanical, chemical or photolytic processes) represent an important subset of dispersion cases. A naïve treatment of these cases will result in computationally challenging and data-intensive calculations. Accordingly, in this paper we present the efficient numerical solution of an inverse problem in which parameters describing the source location and strength, as well as the rate of tracer transformation (growth or decay), are simultaneously estimated. In the next section, we formulate the solution to the source determination problem in terms of the comprehensive probabilistic expression for the source parameters. Efficiently evaluating and interpreting this expression involves techniques which are described in Section 3. In Section 4, we validate the overall methodology using concentration measurements made during Project Prairie Grass (where the scalar was assumed to be conservative), and also against data obtained from a solution to the forward problem using a Lagrangian stochastic model of short-range dispersion in the atmospheric surface layer for a nonconservative scalar.

2. Bayesian problem formulation

Parameter estimation is performed by first obtaining the probability that the vector of source parameters, m, takes certain values given measured concentration data D and any other relevant background information I. The information I encompasses, for example, the practical bounds of the parameter values. Bayes' theorem is used to obtain the conditional probability of m:

P(m|D, I) = P(m|I) P(D|m, I) / P(D|I), (1)

(posterior = prior × likelihood / evidence),

where the source parameters are

m = (xs, ys, zs, qs, ks), (2)

where {xs, ys, zs} represent the spatial location of the source, qs is its strength (of dimension [M T−1]), and ks [T−1] is the rate of tracer transformation. Bias and inaccuracy in the experimental and numerical data (e.g., measured concentration data are subject to experimental uncertainty, while modelled concentration measurements are affected by the accuracy of the numerical model) are taken into account by using PDFs.3

The posterior distribution expresses the plausibility of all possible hypotheses (a single hypothesis consists of a single set of values for the source parameters) and, when evaluated for a specific set of source parameters, is a scalar quantity whose domain of definition has the same dimensionality as m. Therefore, for an n-dimensional problem where each parameter is rendered into s discrete values, the entire posterior distribution might be represented by s^n numbers. For high-dimensional m, this may be an impossibly large number of data points to calculate, which motivates the use of the MCMC technique for exploring only the significant regions of the posterior PDF. Furthermore, MCMC draws samples (i.e., selects sample parameter values) from the posterior PDF without requiring the evidence term as a normalization constant. The relationship between the posterior, prior and likelihood PDFs can then be simplified:

P(m|D, I) ∝ P(m|I) P(D|m, I) (posterior ∝ prior × likelihood). (3)

The bulk of the time required to calculate the value of the posterior PDF for a single hypothesis is determined by the calculation of the likelihood function, which relates modelled to measured concentration data. Using a backward Lagrangian particle model in conjunction with an inventory of averaged particle travel times significantly mitigates this effort; these techniques are described in detail in Section 3.

2.1. Assignment of the likelihood function

The likelihood function essentially quantifies the probability that the measured concentrations D differ from a corresponding set of modelled concentrations, R. Ri(m), obtained using the theoretical source–receptor relationship for detector i, is the concentration that detector i would theoretically measure if the source were correctly characterized by the parameters m. R is typically computed using a dispersion model.

Both the physical concentration measurements and the theoretical source–receptor relationship are subject to uncertainties, which are assumed to have the following properties:

(1) We adopt the basic assumption that the measurement error for detector i can be characterized as additive Gaussian noise with root-mean-square (RMS) experimental error σD,i.

(2) The model error associated with the source–receptor relationship may be characterized in a similar way, having RMS error σT,i.

(3) Both the measurement and the model errors are independent; i.e., measurements at one detector do not affect measurements at another detector, and measurement errors do not affect model errors at each detector location.

3 In this work, we represent errors by simple variances, which precludes accounting for bias in the measurement and model uncertainties. However, the present work could easily be extended to account for bias by adopting a more descriptive PDF than the Gaussian.


Under these assumptions, the likelihood function takes the form:

P(D|m, I) ∝ exp[−(1/2) Σi (Di − Ri(m))² / (σ²D,i + σ²T,i)]. (4)

Assuming that we know D, σD and σT, calculating Ri(m) for various m provides P(D|m, I) up to a constant of proportionality.
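To make Eq. (4) concrete, a minimal sketch of the log-likelihood (function and variable names are ours, not the paper's):

```python
import numpy as np

def log_likelihood(D, R, sigma_D, sigma_T):
    """Log of Eq. (4): Gaussian likelihood whose per-detector variance is
    the sum of measurement (sigma_D**2) and model (sigma_T**2) variances."""
    D, R = np.asarray(D, float), np.asarray(R, float)
    var = np.asarray(sigma_D, float) ** 2 + np.asarray(sigma_T, float) ** 2
    return -0.5 * np.sum((D - R) ** 2 / var)

# Perfect agreement between data and model maximizes the likelihood:
ll_best = log_likelihood(D=[1.0, 2.0], R=[1.0, 2.0],
                         sigma_D=[0.1, 0.2], sigma_T=[0.1, 0.2])
```

Only differences of log-likelihoods matter to MCMC, so the unknown constant of proportionality in Eq. (4) never needs to be computed.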

2.2. Assignment of the prior probabilities

The prior PDF describes our degree of belief that the parameters take particular values before any additional information (in the form of concentration measurements) becomes available. Objectively expressing a prior distribution based on qualitative criteria is difficult, but necessary for conducting inference in a logically consistent manner.

For the present case, we assume a state of ignorance with respect to each of the parameters. Parameters are independent, which allows the prior distribution to be expressed as the product of the priors of the individual parameters (indexed by j):

P(m|I) = ∏j P(mj|I), m ∈ R. (5)

R is a bounded computational domain, ensuring that each prior distribution integrates to unity.4 Ignorance regarding the location and decay parameters, {xs, ys, zs, ks}, is expressed using a uniform distribution:

P(xs|I) = P(ys|I) = P(zs|I) = P(ks|I) = constant, (6)

and the remaining parameter, qs, is assigned a prior which remains invariant under transformations of scale:

P(qs|I) ∝ 1/qs, qs ∈ [qmin, qmax]. (7)

Using a scale-invariant prior ensures that P(qs|I) = P(aqs|I) for any constant a (Jaynes, 2003). The interval [qmin, qmax] ensures that the prior PDF is normalizable; in practice, qmax is chosen to be some finite, reasonable upper bound.
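A sketch of the joint log-prior of Eqs. (5)–(7); the bounds and parameter ordering below are illustrative assumptions, not values from the paper:

```python
import math

def log_prior(m, bounds):
    """Log of Eqs. (5)-(7): uniform priors inside the bounded domain R for
    {xs, ys, zs, ks}, and a scale-invariant 1/qs prior for the strength qs."""
    for value, (lo, hi) in zip(m, bounds):
        if not lo <= value <= hi:
            return -math.inf              # hypothesis outside the domain R
    qs = m[3]                             # assumed ordering: (xs, ys, zs, qs, ks)
    return -math.log(qs)                  # P(qs|I) ∝ 1/qs, up to a constant

# Illustrative bounds for (xs, ys, zs, qs, ks)
bounds = [(0.0, 100.0), (0.0, 100.0), (0.0, 10.0), (1e-3, 1e3), (0.0, 1.0)]
```

The 1/qs prior is uniform in log qs: rescaling qs by any factor shifts the log-prior by a constant independent of qs, which is the scale invariance of Eq. (7).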

2.3. The posterior probability density function

The posterior PDF is proportional to the product of the prior and the likelihood:

P(m|D, I) ∝ P(m|I) P(D|m, I)
∝ I(m ∈ R) (1/qs) exp[−(1/2) Σi (Di − Ri(m))² / (σ²D,i + σ²T,i)], (8)

where I(·) denotes the indicator function.

4 Since we do not consider hypotheses where parameters lie outside of the computational domain, our state of ignorance is not total.
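Because Eq. (8) is known only up to a constant, any MCMC scheme that uses ratios of posterior values can sample it. A generic random-walk Metropolis sketch (not the paper's sampler; the toy 1-D target below merely stands in for Eq. (8)):

```python
import math, random

def metropolis(log_post, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis sampling of an unnormalized log-posterior.
    The evidence P(D|I) cancels in the acceptance ratio, which is why the
    proportionality of Eq. (3) suffices."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lq = log_post(y)
        if math.log(rng.random()) < lq - lp:  # accept with prob min(1, e^(lq-lp))
            x, lp = y, lq
        samples.append(x)
    return samples

# Toy target: an unnormalized standard-normal "posterior"
chain = metropolis(lambda x: -0.5 * x * x, x0=3.0, n_steps=20000)
mean = sum(chain) / len(chain)
```

With the chain started away from the mode, the early samples are transient (burn-in); the long-run sample mean nonetheless converges toward the target's mean.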


3. Modelling and numerical approach

Numerically predicting Ri, the modelled concentration at the ith receptor, requires the use of a model which, when given a specific source configuration (location, strength and decay rate), is capable of providing a concentration value at each of the detector (receptor) locations. Rather than running a forward dispersion model for every possible combination of source parameters, using a backward (or adjoint) dispersion model requires less computational time when the number of detectors is significantly less than the number of possible source locations. Lagrangian stochastic particle dispersion models are routinely applied in the field of meteorology to simulate the dispersion of species in environmental flows. Backward Lagrangian models are structurally very similar to their corresponding forward models (Seibert and Frank, 2004) and are used in the present work to generate the required dual (adjoint) concentration fields.

3.1. Source–receptor relationship

In this paper, we consider an ideal continuous point source of the form:

Q = qs δ(x − xs). (9)

Q is a source density distribution [M L−3 T−1] which releases material continuously at a steady rate of qs [M T−1] from location xs. The mean concentration field C resulting from this release can be found using a forward dispersion model, which relates Q to C through the linear operator L in the following way:

LC = Q. (10)

The definition of the operator L is flexible; in the Eulerian framework, Eq. (10) becomes the steady advection-diffusion equation:

u · ∇C − K∇²C = Q, (11)

where

L(·) ≡ u · ∇(·) − K∇²(·). (12)

In a Lagrangian framework, L effectively describes a forward Lagrangian stochastic (fLS) particle model:

C = ∫Ω G(x|x0) Q dx, (13)

where G(x|x0) is the integral kernel of L−1 and is a function of the specific LS model chosen. In LS models, G(x|x0) represents a transition probability density.

The adjoint operator, L∗, relates a dual (or adjoint) concentration field (viz., the C∗ field) to a 'detector response' function, h [L−3]. For the ith detector, the relationship is

L∗C∗i = hi, (14)


where h = h(x − xr) models the detector response function of a receptor which measures the concentration at location xr. The function h acts as a spatial filter and would be, e.g., a delta function for an ideal detector with infinite resolving power. In an Eulerian framework, Eq. (14) becomes the adjoint advection-diffusion equation. In a Lagrangian framework, L∗ describes a backward Lagrangian stochastic (bLS) model. The C∗ field has units of [T L−3] and can be interpreted as a residence-time density field. Here, we assume that the release is continuous and the flow is statistically stationary, so transient terms are absent in both L and L∗.

The source–receptor relationship is used to obtain Ri, the modelled concentration value at the location of the ith detector. Ri is defined through the inner product:

Ri = ⟨C, hi⟩ ≡ ∫Ω C hi dΩ, (15)

where Ω ⊂ R defines the domain of the location parameters {xs, ys, zs}. The duality relationship, defined by Eq. (16), connects the forward and adjoint operators and provides an alternative way to calculate the source–receptor relationship:

⟨C, L∗C∗⟩ = ⟨LC, C∗⟩, (16)

Ri = ⟨C, hi⟩ = ⟨Q, C∗i⟩ ≡ ∫Ω Q C∗i dΩ. (17)

For a point source, the inner product reduces to a simple multiplication:

Ri(m) = qs C∗i(xs, ys, zs, ks). (18)

In general, one C∗ field must be generated per receptor, with the response function h treated as a 'source'. Once all of the C∗ fields have been generated, the source–receptor relationship can be rapidly calculated using Eq. (18) for any combination of the parameter values {xs, qs}. Varying the tracer decay rate, ks, introduces difficulties which are addressed in Section 3.3.

Qualitatively, each C∗ field delineates a 'region of influence'; outside these regions, a source would be unable to contribute to a detector measurement. Issartel and Baverel (2003) use the term 'retroplumes' to describe these fields, since they take the same form as concentration plumes, except that they emanate from each detector and travel backward in time and space.
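Eq. (18) is what makes the parameter search cheap: once the retroplume fields C∗ are stored, evaluating a new source hypothesis is a table lookup and a multiply. A toy sketch with invented numbers (in the paper, the C∗ values come from the bLS model):

```python
import numpy as np

# Hypothetical adjoint (retroplume) fields for two detectors, tabulated over
# three candidate source cells -- the values below are made up for illustration.
C_star = np.array([[0.0, 0.2, 0.5],    # detector 1
                   [0.1, 0.4, 0.0]])   # detector 2

def modelled_concentrations(q_s, cell):
    """Eq. (18) for a point source in cell `cell`: R_i = q_s * C*_i(x_s)."""
    return q_s * C_star[:, cell]

R = modelled_concentrations(q_s=2.0, cell=1)
```

Changing the hypothesized source cell or strength costs O(number of detectors), with no new dispersion-model run.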

3.2. Backward Lagrangian stochastic dispersion model

Lagrangian stochastic (LS) particle models provide an alternative to Eulerian methods for simulating the dispersion of a tracer in a wind flow. Whereas Eulerian methods directly calculate concentration fields using the advection-diffusion equation discretized over a grid of fixed locations, LS methods track individual 'particles' or 'parcels of fluid' through a flow field and generate a set of particle trajectories which can then be manipulated to yield concentration fields (and other information).

Both the fLS and bLS models used in the present research calculate particle trajectories by solving for velocity increments dui which evolve according to the Langevin equation (Rodean, 1996):

dui = ai(x,u, t) dt + bi(x,u, t) dWt, (19)

where dWt denotes an increment of the standard Wiener process; x is a vector indicating the position of the particle; u is the Lagrangian velocity vector of the particle; and (u1, u2, u3) = (u, v, w) are the streamwise, spanwise and vertical components of the velocity vector. Particle positions are calculated using the following equation:

dxi = ui dt. (20)

The ai terms govern the deterministic component of the particle trajectory and represent a combination of:

(1) Damping coefficients which relax the velocity increments dui back toward the mean flow;

(2) A 'drift correction' term which satisfies the 'well-mixed criterion', a constraint placed on the Langevin equation coefficients by Thomson (1987), designed to avoid non-physical particle distributions.

In the present model, the ai are functions of the velocity ui, the Reynolds stresses and their respective z-derivatives, and the dissipation rate ε. The parameterization of these quantities for the test cases to be addressed later can be found in Section 4.1. The bi terms govern the stochastic component of the trajectory and represent acceleration increments generated by random pressure fluctuations with very short correlation times. The present work assumes horizontally homogeneous flow, for which functional forms of the ai and bi coefficients for Gaussian turbulence consistent with the 'well-mixed criterion' and Kolmogorov's theory of local isotropy can be found in the work by Flesch et al. (1995).
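As a simplified illustration of Eqs. (19) and (20), the sketch below integrates one particle in 1-D homogeneous turbulence with the classical Langevin coefficients a = −u/T_L and b = √(2σu²/T_L), not the paper's full 3-D surface-layer parameterization:

```python
import math, random

def ls_trajectory(n_steps, dt=0.01, T_L=1.0, sigma_u=0.5, seed=1):
    """One-particle Euler-Maruyama integration of Eqs. (19)-(20) in 1-D.
    Homogeneous-turbulence sketch: a = -u/T_L, b = sqrt(2*sigma_u**2/T_L)."""
    rng = random.Random(seed)
    x, u = 0.0, 0.0
    b = math.sqrt(2.0 * sigma_u ** 2 / T_L)
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment, variance dt
        u += -(u / T_L) * dt + b * dW        # Eq. (19): velocity increment
        x += u * dt                          # Eq. (20): position increment
    return x, u

x_end, u_end = ls_trajectory(10000)
```

Over an ensemble of particles, u relaxes to a Gaussian with standard deviation σu, which is the stationary (well-mixed) state for this homogeneous case.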

With respect to the fLS model, dispersion from a continuous source is modelled by releasing a large number P of particles from the source location, with each particle's initial 'pseudo-mass' representing a fraction of the source strength qs. As particles spend time in the flow field, they may undergo transformations which alter their pseudo-mass. These mechanisms are modelled according to a first-order decay process which is described in Section 3.3. For the moment, we consider that the jth particle's pseudo-mass (viz., source strength fraction), qj, is a function of the amount of time, tj, that it has spent in the field:

qj = qj,0 f(ks, tj), (21)

where qj,0 = qs/P is the particle's pseudo-mass at tj = 0, and ks is a coefficient characterizing the decay process. Similarly, with reference to the bLS model, P particles are released from each of the detector locations. Since the detector response function, h, integrates to unity, each particle's initial dual pseudo-mass is q∗j,0 = 1/P, and the decay process is modelled in the same way:

q∗j = q∗j,0 f(ks, τj). (22)

Fig. 1 – Dual particles are 'released' from detectors and travel upstream through the wind field, generating retroplumes. A number of particles will pass through the same grid cell (a potential source location) and will have taken different lengths of time to travel there, by virtue of the stochastic nature of their trajectories.

The temporal frame of reference has been reversed (particles originate from detectors and are travelling backward through the flow field), so τj = −tj + T, where T is an arbitrary transformation constant.

In order to obtain a discrete representation of the particle trajectory defined by Eqs. (19) and (20), particle locations are recorded at discrete time intervals. Between the locations x(n)j and x(n+1)j, and within a volume enveloping {x(n)j, x(n+1)j}, the jth particle spends a 'residence time', Δτj = τ(n+1)j − τ(n)j. To first-order accuracy, the individual contribution of the particle to the residence-time density in a grid cell whose centroid is x can be obtained using:

C∗(x, ks) = Σj:xj∈D φ(xj; x) q∗j Δτj, (23)

where φ(xj; x) is a mass-conserving kernel function with finite support over spatial domain D ⊂ Ω. It assigns a relative contribution to the C∗ value in the cell based on the distance between the jth particle's position xj and the center of the grid cell. The bandwidth (a measure of the region of support) of this kernel is typically on the order of the edge length of a grid cell.

3.3. Tracer decay treatment

The species being dispersed is assumed to undergo transformation (decay or growth) by, e.g., reaction, radiological decay, or scavenging, which can be modelled by the first-order mechanism:

dC/dt = −ks C, (24)

where ks is a rate constant with units of [T−1]. Here, we assume ks to be positive for transformations in which the concentration of the tracer decays over time as it is transported. The solution to Eq. (24) is

C(t) = C0 exp(−kst), (25)

where C0 is the concentration at time t = 0. The adjoint decay mechanism is then modelled using:

dC∗/dτ = −ks C∗, (26)

whose solution is

C∗(τ) = C∗0 exp(−ks τ), (27)

assuming that particles were released from the detector at time τ = 0. Note that τ > 0 refers to earlier times relative to τ = 0.

With respect to the bLS model, 'tagging' the jth particle with its 'accumulated travel time' τj enables us to rapidly quantify its expected transformation for any ks:

q∗j(τj) = q∗j,0 exp(−ks τj), (28)

where q∗j,0 ≡ q∗j(τj = 0). Treating dual pseudo-masses (as opposed to dual concentrations) in this manner is appropriate given the linearity of Eq. (26).

Complications arise when we attempt to calculate the C∗ value in a grid cell through which a large number N of particles have passed. Fig. 1 demonstrates how different [dual] particles travelling upstream from a given detector could take different amounts of time to reach the same grid cell. Strictly speaking, the correct C∗ value for the cell is obtained by re-calculating Eq. (28) using the desired ks value for all N particles, and then substituting q∗j into Eq. (23). However, this requires us to maintain and manipulate lists of all the particles generated from each detector by the bLS model, which would be computationally intractable and excessively data-intensive for large simulations consisting of millions or billions of particles moving in flow fields which might evolve over time. Fortunately, by phrasing the problem in a statistical sense, i.e., by considering the distribution of the individual particle travel times τj, we can estimate C∗(x, ks) with an accuracy determined by the value of ks together with the properties of the distribution of the τj. In Section 4.4 we assess the computational savings made through the use of the statistical method, relative to solving the problem exactly using all available trajectory information.

In order to simplify the analysis of our estimate for C∗(x, ks), we will assume a simple kernel (with a top-hat function) which arithmetically averages both the travel times and dual pseudo-masses of particles passing through the domain D defined by a single grid cell centered on x. Assuming a kernel of this form results in simple expressions for the sample mean travel time, τ̄, and the conservative C∗ field value, C∗0:

φ(xj; x) = 1, xj ∈ D, (29)

τ̄(x) = (1/N) Σj:xj∈D τj, (30)

C∗0(x) = (1/(ΔxΔyΔz)) Σj:xj∈D q∗j,0 Δτj. (31)
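The top-hat bookkeeping of Eqs. (29)–(31) amounts to averaging over the particles that fall inside one cell. A toy sketch with invented particle records (cell volume ΔxΔyΔz taken as 1):

```python
# Each record: (position x, initial dual pseudo-mass q0, travel time tau,
# residence time dtau) -- the numbers are purely illustrative.
particles = [
    (0.2, 0.01, 5.0, 0.1),
    (0.4, 0.01, 7.0, 0.2),
    (1.5, 0.01, 6.0, 0.1),   # outside the cell [0, 1): top-hat kernel gives 0
]

# Eq. (29): top-hat kernel keeps only particles inside the cell
in_cell = [(q0, tau, dtau) for x, q0, tau, dtau in particles if 0.0 <= x < 1.0]
N = len(in_cell)
tau_bar = sum(tau for _, tau, _ in in_cell) / N                      # Eq. (30)
cell_volume = 1.0
C0_star = sum(q0 * dtau for q0, _, dtau in in_cell) / cell_volume    # Eq. (31)
```

Only the summary statistics (N, τ̄, C∗0) need to be retained per cell; the full particle list can then be discarded.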

If we assume that the only information available describing the distribution of particle travel times, τj, is their mean


can be found in the book by Gilks et al. (1996).


and variance, then the principle of maximum entropy (Jaynes, 2003) asserts that the maximally non-committal (least informative) PDF used to describe the particle travel times should be the Gaussian distribution. Given that the τj are obtained directly from the Lagrangian stochastic particle model, obtaining their mean and standard deviation is straightforward.

The true C∗ value in a given grid cell is a consequence of the decaying pseudo-mass, q∗, and is obtained directly using the arithmetic mean of the exponentiated decay coefficients:

C∗(x, ks) = C∗0 (1/N) Σj:xj∈D exp(−ks τj). (32)

aintaining lists of individual �j is impractical, so we mustnd a way to estimate C∗ using the sample mean and standardeviation of the �j. Assuming that the individual particle travelimes � are distributed normally:

∼ N(�, �), (33)

here N(�, �) is a normal (Gaussian) distribution with mean �nd standard deviation �, then the exponentiated decay coef-cients are distributed log-normally:

≡ exp(−ks�) ∼ LN(−ks�, ks�), (34)

ith probability density function given by

(�) = 1

ks��√

2�exp

(−1

2

(log(�) + ks�

ks�

)2). (35)

ere LN(−ks�, ks�) is a log-normal distribution such that theogarithm of the random variate results in a Gaussian distribu-ion whose mean and standard deviation are −ks� and ks�. The

ean of this log-normal distribution provides the followingstimate for C∗:

ˆ ∗(x, ks; �, �) = C∗0 exp

(−ks� + 1

2k2s�

2). (36)
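As a numerical check on the log-normal estimate of Eq. (36), the following Monte Carlo sketch compares the exact average of exponentiated decay coefficients (Eq. (32)) with the closed-form estimate built from sample statistics. The numbers used (mean travel time 100 s, standard deviation 10 s, ks = 0.03 s−1) are illustrative values, not the paper's data.

```python
import math
import random

random.seed(1)

# Illustrative values: sigma/tau_bar = 0.10 matches the ratio observed in Fig. 8.
tau_bar, sigma, ks, C0 = 100.0, 10.0, 0.03, 1.0

# "Exact" treatment, Eq. (32): average the exponentiated decay factors over
# individual particle travel times.
taus = [random.gauss(tau_bar, sigma) for _ in range(100_000)]
C_exact = C0 * sum(math.exp(-ks * t) for t in taus) / len(taus)

# Statistical treatment, Eq. (36): log-normal mean built from sample statistics.
m = sum(taus) / len(taus)
s = math.sqrt(sum((t - m) ** 2 for t in taus) / (len(taus) - 1))
C_approx = C0 * math.exp(-ks * m + 0.5 * ks**2 * s**2)

rel_err = abs(C_approx - C_exact) / C_exact
print(C_exact, C_approx, rel_err)  # relative error well under 1% here
```

For normally distributed travel times the two estimates agree closely; the discrepancy grows with ks σ, which is the regime quantified in Figs. 6 and 7.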

Before proceeding to use this estimate, it is worthwhile to examine its accuracy in view of the fact that the limited availability of computational power constrains the number of particle trajectories that can be simulated (within a reasonable amount of time) during a given bLS model run. Consider a detector for which the bLS model is used to generate a corresponding C∗ field. With increasing upstream distance from this detector, particle trajectories are spread thinly over more grid cells, resulting in lower individual cell particle counts (and in turn lower dual concentration [C∗] values). For grid cells experiencing low tagged particle counts, the law of large numbers does not necessarily guarantee the accuracy of the mean particle travel time, τ̄, which might vary significantly across several different realizations of the same (in the parametric sense) bLS model run. Since the estimate, Ĉ∗, is a function of τ̄, the impact of variability in τ̄ on the variability of Ĉ∗ must be analyzed.

Based on the earlier assumption that particle travel times are normally distributed, their sum, and thus the estimated mean travel time, τ̄, is also normally distributed⁵:

τ̄ ∼ N(μ, σ/√N), (37)

where μ is the 'true' mean travel time, and the standard deviation is σ/√N, where N is the number of particles passing through a given grid cell. N could change for different bLS model realizations, but here we assume for simplicity and with no loss in generality that it remains the same. As with τ, the fact that τ̄ is normally distributed leads to a log-normal distribution for the estimate, Ĉ∗:

Ĉ∗(x, k; τ̄, σ) = C∗0 exp(−k τ̄ + (1/2) k² σ²) ∼ LN(−k μ + log[C∗0 exp((1/2) k² σ²)], k σ/√N), (38)

whose standard deviation is

sd(Ĉ∗) = C∗0 exp((k² σ²/N) ((N + 1)/2) − k μ) √(exp(k² σ²/N) − 1). (39)

It is clear from Eq. (39) that as k and σ decrease, and as N increases, the standard deviation of the sample mean decreases. However, for certain values of k, μ, σ and N, Ĉ∗ may vary significantly. This variance could be treated as part of the overall model (theoretical) uncertainty, σT, information which can be encoded into the likelihood function. It should also be noted that the above analysis assumes that the variability in τ as encoded in σ is known exactly; in practice, σ must be estimated from the sample of N particles that 'move' through the given grid cell.
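The standard deviation of Eq. (39) can be verified against a direct Monte Carlo simulation of Eq. (38): draw many realizations of the sample mean τ̄ and compare the empirical spread of Ĉ∗ with the closed-form expression. The values below (μ = 100 s, σ/μ = 0.15, as in Fig. 9) are illustrative assumptions.

```python
import math
import random

random.seed(2)

# Hypothetical values: true mean travel time mu = 100 s, sigma/mu = 0.15
# (the conservative ratio used for Fig. 9), k = 0.03 1/s, N particles per cell.
mu, sigma, k, C0, N = 100.0, 15.0, 0.03, 1.0, 1000

# Eq. (39): predicted standard deviation of the estimator C*-hat.
sd_pred = (C0
           * math.exp((k**2 * sigma**2 / N) * (N + 1) / 2 - k * mu)
           * math.sqrt(math.exp(k**2 * sigma**2 / N) - 1))

# Monte Carlo check: many realizations of the sample mean tau_bar, Eq. (37),
# each mapped through the estimator of Eq. (36).
ests = []
for _ in range(20_000):
    tau_bar = random.gauss(mu, sigma / math.sqrt(N))
    ests.append(C0 * math.exp(-k * tau_bar + 0.5 * k**2 * sigma**2))
mean_est = sum(ests) / len(ests)
sd_mc = math.sqrt(sum((e - mean_est) ** 2 for e in ests) / (len(ests) - 1))

print(sd_pred, sd_mc)  # the two estimates of sd should agree closely
```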

3.4. Markov chain Monte Carlo

MCMC algorithms are commonly used to generate samples for problems of parameter estimation which have been formulated in a Bayesian framework. As parameter spaces increase in dimensionality, the significant regions of the posterior distribution rapidly diminish in size (Gregory, 2005), rendering direct or even conventional Monte Carlo integration impractical. MCMC algorithms such as Metropolis–Hastings (Hastings, 1970) overcome the so-called 'curse of dimensionality' by generating a sequence of correlated random points (e.g., m(k) ∈ R; m(k) is the kth sample) whose distribution tends asymptotically to a target distribution; in this case, the posterior PDF for the source parameters. For a given target probability distribution, the MCMC algorithm will generate samples from this target distribution, from which summary statistics and histograms can be obtained. Detailed theory of MCMC methods can be found in the book by Gilks et al. (1996).

5 For more general distributions of τ, Eq. (37) is only true in the limit as N → ∞, but for the present case, the relationship is also valid for small N.
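A toy sketch of the Metropolis–Hastings recipe just described, using a symmetric normal proposal. For clarity the target here is a one-dimensional standard normal rather than the paper's five-parameter posterior, and the proposal width (0.5) is an arbitrary illustrative choice.

```python
import math
import random

random.seed(3)

def log_target(m):
    """Log of an unnormalized N(0, 1) 'posterior' -- a stand-in target."""
    return -0.5 * m * m

def metropolis_hastings(n_samples, width=0.5, m0=0.0):
    samples, m = [], m0
    lp = log_target(m)
    for _ in range(n_samples):
        prop = random.gauss(m, width)      # symmetric normal proposal
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop)/target(m)).
        if math.log(random.random()) < lp_prop - lp:
            m, lp = prop, lp_prop
        samples.append(m)                  # rejected moves repeat the old point
    return samples

chain = metropolis_hastings(50_000)
burn = chain[5_000:]                       # discard burn-in
mean = sum(burn) / len(burn)
var = sum((s - mean) ** 2 for s in burn) / len(burn)
print(mean, var)  # should be near 0 and 1 for this target
```

In the paper's setting the same loop runs over the five-dimensional parameter vector (xs, ys, zs, qs, ks), with the log-posterior evaluated via the C∗, τ̄, and σ² fields.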

4. Short-range dispersion in the atmospheric surface layer

Here we apply the source determination methodology to a test case whose wind field and geometry match those used for Project Prairie Grass (PPG), a benchmark tracer dispersion experiment that was conducted over flat terrain with no obstacles. This experiment is described in detail in the original reports (Barad, 1958; Haugen, 1959), and more recently by other authors such as Venkatram and Du (1997) and Hanna et al. (2004).

In PPG, sulfur dioxide (SO2) was released from a small tube placed 46 cm above the ground. Seventy 20-min releases were conducted during July and August 1956, in a wheat field near O'Neil, Nebraska. The wild hay was trimmed to a uniform height of 5–6 cm. Samplers were positioned on concentric semi-circular arcs centred on the release, at downwind distances of 50, 100, 200, 400, and 800 m. The samplers were positioned 1.5 m above the ground, and provided 10-min (averaged) concentration values. Towers for measuring vertical profiles of mean concentration were also available along the arc with a radius of 100 m.

After stating the parameterization used to define the wind field (Section 4.1), we present a reference solution based on part of the actual PPG experiment (Section 4.2). This is followed by a validation of the statistical tracer decay treatment and a discussion of its performance. We validate the overall source determination methodology in two stages. First, in Section 4.5.1 the source reconstruction approach is tested using real concentration data measured during the PPG experiment (in which the scalar was considered to be conservative). During the source reconstruction approach, it is assumed that the rate of decay is unknown. In the second stage (Section 4.5.2), the reconstruction approach is applied to two sets of synthetic concentration data generated using a forward Lagrangian stochastic model operating under the same atmospheric conditions as PPG, with decay of particle mass being modelled by the first-order mechanism described by Eq. (24). These synthetic measurements are then chosen to play the role of D in the inverse problem, and the bLS model is applied to generate the required C∗ fields.

It should be noted that the Lagrangian stochastic particle model described in this paper has already been validated against PPG using parameterized wind statistics, appropriate for a horizontally homogeneous, neutrally stratified atmospheric surface layer (or, adiabatic wall shear layer), for the case of a passive, conservative tracer.

4.1. Wind field

The wind field is fully developed and horizontally homogeneous, so all velocity and turbulence statistics are functions of z (height above the ground surface) only. The mean wind velocity is aligned with the x-axis. The fLS and bLS models require the wind field to be supplied in terms of its mean velocity and turbulence statistics. For the present case, the wind field can be described analytically by semi-empirical relationships developed for a horizontally homogeneous, neutrally stratified surface layer. Parameterizations of wind statistics also exist for describing non-neutral (e.g., stably stratified and convective) boundary layers, but they are not considered in this work. The components used to describe the turbulent wind field are outlined below. These expressions are commonly used for LS models applied to the surface layer, and are similar to those found in Flesch et al. (1995) and Rodean (1996). They are parameterized in terms of u∗, the friction velocity, and z0, the roughness length.

4.1.1. Mean wind velocity profile
The average wind speed in the x (streamwise) direction is assumed to follow a log-law profile in the surface layer:

u(z) = (u∗/κ) ln(z/z0), (40)

where κ ≈ 0.4 is von Karman's constant. The mean y and z velocity components (v and w) are both zero.

4.1.2. Velocity variances
The velocity variances, σu², σv², σw², and the covariance, u′w′, are constant within the surface layer:

σu² = σv² = 4.5 u∗², σw² = 1.69 u∗², u′w′ = −u∗². (41)

It is assumed that the velocity covariances u′v′ and v′w′ vanish in the surface layer.

4.1.3. Dissipation rate
The turbulence kinetic energy dissipation rate is determined as follows:

ε(z) = u∗³/(κ z). (42)

4.2. Reference solution: forward dispersion

The fLS model was used to simulate dispersion under the conditions encountered in Run 24 of the PPG experiment. Using a roughness length of z0 = 0.006 m, a friction velocity of u∗ = 0.38 m s−1, and a source strength of qs = 41.2 g s−1, particle trajectories were used to generate a three-dimensional concentration field over a 420 m × 200 m × 10 m domain. A horizontal slice of this domain (z = 1.5 m), along with the locations of the source and detectors, is shown in Fig. 2. The concentration field was generated for a grid of cells of dimension Δx = Δy = Δz = 1 m.

Fig. 2 – Arrangement of the point source (square) and detector arcs (circular dots) for Project Prairie Grass. The detectors shown measured non-zero concentrations during the PPG experiment. Contours of log10(C [μg/m³]) obtained using the fLS model (ks = 0, z = 1.5 m) are also plotted.

Along each arc, detectors were spaced at 2° intervals, and the streamwise flow direction (aligned with the x-axis) was determined using the maximum concentration measurement. Experimentally measured concentration data for each arc (at radii r = {50, 100, 200, 400} m from the source) are plotted in Fig. 3 together with concentration profiles generated with the fLS model using decay coefficient values of ks = {0, 0.03} s−1.

Fig. 3 – Experimentally measured concentrations (circles), simulated concentration profiles (solid line, obtained using the fLS model), and decayed concentration (dashed line, ks = 0.03 s−1).

4.3. Validation: tracer decay treatment

Before proceeding to solve the inverse problem (Section 4.5), we first examine the suitability of the tracer decay treatment which was presented in Section 3.3. Without loss of generality, consider the detector located at (x, y, z) = (400, 0, 1.5). The C∗ field (or 'retroplume') emanating upwind from this detector was calculated by using the bLS model to determine the trajectories of 1 × 10⁵ tagged particles. For the six grid cell locations marked in Fig. 4, tagged particle travel times were recorded and used to generate the normal probability plots shown in Fig. 5. The Kolmogorov–Smirnov test was applied to each set of travel times. P-values for each set except τa met or exceeded 0.1. Despite τa failing the test, we consider the assumption of normality to be vindicated by the high quality of the statistical tracer decay approximation in regions near to the source. This property is quantified in Fig. 7.

Fig. 4 – Detector (circle) and grid cell locations (squares labelled a–f) in which particle travel times were binned. Contours of log10(C∗) obtained using the bLS model (ks = 0, z = 1.5 m) are plotted in the background.
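The Kolmogorov–Smirnov normality check described above can be sketched as follows. Here we draw 1000 synthetic travel times from a hypothetical N(100 s, 10 s) and test them against that same fully specified normal, rather than against the paper's binned bLS travel times.

```python
import math
import random

random.seed(4)

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(samples, mu, sigma):
    """One-sample K-S statistic D against a fully specified normal."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = normal_cdf(x, mu, sigma)
        # Max deviation between the empirical step CDF and the model CDF,
        # checked on both sides of each step.
        d = max(d, abs((i + 1) / n - cdf), abs(cdf - i / n))
    return d

taus = [random.gauss(100.0, 10.0) for _ in range(1000)]
D = ks_statistic(taus, 100.0, 10.0)
print(D)  # a small D (well below ~1.36/sqrt(n) at the 5% level) is consistent with normality
```

Note that testing against a normal whose mean and standard deviation are themselves estimated from the sample (as in the paper) formally calls for adjusted (Lilliefors) critical values; the sketch above sidesteps this by specifying the reference distribution.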

Having assessed the validity of assuming normally distributed particle travel times, we turn our attention to estimating the relative error incurred by using particle travel time statistics (mean and variance) to approximate the 'true' C∗ field.⁶ The estimate, Ĉ∗, is defined by Eq. (36). We define the absolute value of the relative error incurred by the approximation as

EC∗ = |Ĉ∗ − C∗|/C∗, C∗ > 0. (43)

For the case of the C∗ field shown in Fig. 4, the error term EC∗ was calculated in all cells (where C∗ > 0) throughout the three-dimensional domain for the cases of ks = {0.03, 0.3} s−1. Histograms showing the distribution of EC∗ over the domain as a whole are presented in Fig. 6. For the present test case, the error is clearly significant for the larger value of ks, and Fig. 7 demonstrates that EC∗ generally grows with upstream distance and, by extension, increasing particle travel time.

While Figs. 6 and 7 characterize EC∗ specifically with respect to the estimate of the C∗ field of Fig. 4, it remains desirable to characterize EC∗ for more general cases. When considering bLS simulations that are not problem-specific, the standard deviation of Ĉ∗ as defined by Eq. (39) is indicative of the accuracy of the approximation. This quantity depends on ks, μ, σ and N, but can be characterized effectively by approximating the ratio σ/μ using a constant value. For the bLS test case outlined above, Fig. 8 presents the distribution (over all grid cells) of this ratio, and shows a pronounced mean value of approximately 0.10.⁷ In Fig. 9, we plot contours of sd(Ĉ∗) as a function of ks and N, conservatively assuming that the ratio σ/μ = 0.15. For low N and large ks, sd(Ĉ∗) increases drastically, indicating that bLS simulations involving high decay rates should be made more accurate by increasing the number of particles released from the detector (in the hope of increasing N, the number of particles passing through the grid cell).

We recapitulate that the approximation Ĉ∗ is based on the assumption that particle travel times are normally distributed, which leads to the consequence that decayed pseudo-masses q∗ are distributed log-normally. Alternatively, if we assume that pseudo-masses q∗ are in fact normally distributed, numerical experiments show that the accuracy of the approximation suffers (EC∗ grows, and the histograms of Fig. 6 are shifted to the right by a significant amount).

⁶ By 'true', we refer to a C∗ field calculated using Eq. (32), not necessarily a field generated using a large enough number of particles to ensure small statistical error.

⁷ Empirical observation suggests that this figure remains more or less constant for different scenarios.

Fig. 5 – Normal probability plots of particle travel times recorded at the grid cells shown in Fig. 4.

Fig. 6 – Histograms of the error incurred by the approximation, Eq. (36). The error is evaluated once per grid cell over the three-dimensional spatial domain.

Fig. 7 – The error EC∗ as a percentage, evaluated along the centerline of the C∗ field (y = 0, z = 1.5 m), upstream of the detector.

4.4. Performance of the statistical tracer decay treatment

The above analysis has shown that the statistical tracer decay approximation is valid under the condition that the ratio of the average particle travel time to the decay coefficient is relatively low. Applying this statistical treatment to an existing LS model requires only a little code modification and results in a much faster and more memory-efficient calculation of the expected dual concentration in a grid cell, compared to using an exact approach in which all particle trajectory information would have to be retained.

Due to the considerable variation in computer storage techniques and data structures, there is no point in performing a side-by-side comparison of two LS codes, one which incorporates the statistical treatment and another which does not. Instead, we consider how the required CPU time and memory (storage) requirements relate to typical LS model parameters governing the length and spatiotemporal resolution of a dispersion simulation.

Consider a bLS dispersion model run in which Np particles are released from a single detector. Each of these Np particles follows a trajectory which is recorded either on disk or in memory by saving the individual particle's position (xj), pseudo-mass (q∗j), and cumulative travel time (τj). Trajectory data is built up by saving this positional information at every time step, and we can assume that the average particle spends Nt time steps in the domain. Thus, an LS simulation will typically write (NpNt) entries in a trajectory file.

The desired end product of a bLS simulation is usually a C∗ field. Consider such a field, generated from a trajectory file using the kernel described in Eqs. (29)–(31), and discretized over a Cartesian grid which contains (NxNyNz) grid cells. In order to calculate the hypothetical concentration expected by the detector using Eq. (18), we extract the value of the C∗ field at a potential source location (xs, ys, zs). When the rate of tracer decay (ks) is known a priori, a C∗ field needs to be generated only once from the trajectory data using the first-order decay Eq. (28). However, in the present scenario, ks is unknown, which means that a new C∗ field (or at the very least, the C∗ value at all potential source locations) must be recalculated for many possible values of ks.

Because LS models are inherently stochastic, Np is required to be very large, large enough to generate reasonably smooth C∗ (or concentration, in the fLS case) data on what may be a high-resolution grid of the problem domain. For a given grid cell, especially if it is near the plume centerline, a very high number N of particles will pass through, contributing to the C∗ value in the cell. By way of illustration, the examples addressed in this work use Np = 1 × 10⁵ and for many grid cells, N is on the order of 1000 particles. Larger-scale problems solved over higher-resolution grids may result in the generation of 1000 times as much trajectory data. It should be noted that the C∗ fields which are generated from trajectory data are, for the present example, on the order of 1/1000 the size of the trajectory data. In general, it is desirable to avoid manipulating raw trajectory data once it has been used to generate gridded data fields.

With this background in mind, Table 1 summarizes the savings in computational time and memory requirements obtained by using the statistical decay treatment. From a practical standpoint, three fields must be generated from the trajectory data in order to use the statistical approximation: C∗0(x) (a C∗ field for which ks = 0); τ̄(x) (the average particle travel time for each grid cell); and σ²(x) (the variance of the particle travel times). However, once these fields have been generated, it is no longer necessary to store or manipulate the raw trajectory data.

Table 1 – Comparison of computational effort required by the exact vs. statistical tracer decay treatments

Computational task | Exact approach | Statistical treatment
Trajectory data storage | ∝ NpNt | 0
Field storage | ∝ NxNyNz | ∝ 3 × NxNyNz (one field each for C∗0, τ̄, σ²)
Data retrieval required for C∗(xs, ks) calculation | List of N particles which passed through grid cell centred on xs | Three items: C∗0(xs), τ̄(xs), σ²(xs)
Calculation of C∗(xs, ks) | N evaluations of exp(−ks τj) | One evaluation of C∗0 exp(−ks τ̄ + (1/2) ks² σ²)

Fig. 8 – Histogram of the ratio of travel time standard deviation (σ) to mean particle travel time (τ̄) for all grid cells in the PPG domain.

Fig. 9 – Contours of log10(sd(Ĉ∗)) for varying ks and N. Here σ/μ = 0.15 and μ = 100 s.
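The three-field bookkeeping just described can be sketched as a per-cell accumulator: only a pseudo-mass sum (for C∗0) and a running mean/variance of travel times (Welford's method) are retained, after which Ĉ∗ can be evaluated for any candidate ks without revisiting trajectories. This is not the authors' code; the cell volume and synthetic particle data are hypothetical.

```python
import math
import random

class CellStats:
    """Per-grid-cell accumulators for the statistical decay treatment."""
    def __init__(self, volume=1.0):
        self.volume = volume
        self.mass = 0.0     # sum of q*_{j,0} * dt_j  ->  C0* after dividing by volume
        self.n = 0          # particles seen (N)
        self.mean = 0.0     # running mean of travel times (tau_bar)
        self.m2 = 0.0       # running sum of squared deviations (Welford)

    def add(self, q0_dt, tau):
        self.mass += q0_dt
        self.n += 1
        d = tau - self.mean
        self.mean += d / self.n
        self.m2 += d * (tau - self.mean)

    def c_star(self, ks):
        # Eq. (36): C0* exp(-ks*tau_bar + 0.5*ks^2*sigma^2)
        c0 = self.mass / self.volume
        var = self.m2 / (self.n - 1) if self.n > 1 else 0.0
        return c0 * math.exp(-ks * self.mean + 0.5 * ks**2 * var)

random.seed(0)
cell = CellStats(volume=1.0)
for _ in range(10_000):                      # synthetic particle visits
    cell.add(q0_dt=1e-3, tau=random.gauss(100.0, 10.0))

# One cheap evaluation per candidate ks -- no trajectory lists required.
for ks in (0.0, 0.03, 0.3):
    print(ks, cell.c_star(ks))
```

This mirrors the right-hand column of Table 1: the trajectory file never needs to be re-read when ks changes during the MCMC sweep.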

4.5. Inverse problem: source determination

We begin by assessing the performance of the overall source determination methodology using concentration

data measured during the PPG experiment, Run 24. In this case, the tracer was considered conservative (ks = 0); however, we have included ks as an unknown parameter to be estimated. Following this assessment, in Section

4.5.2 we apply the methodology to two problems involving synthetic, decayed (ks > 0) concentration data generated using the fLS model. For all three test cases we assume the following parameter bounds for the computational domain R:

xs ∈ [−10, 410] m, ys ∈ [−100, 100] m, zs ∈ [0, 10] m, qs ∈ [1, 200] g s−1, ks ∈ [−1, 1] s−1.

4.5.1. Assessment of methodology using PPG experimental data (conservative tracer)
The experimentally measured concentration data used here is extracted from a subset of the arc-based measuring stations shown in Fig. 2. Four stations from each arc were used and are shown in Fig. 10. In the {50, 100} m arcs, detectors are placed off-centerline by {−6, −2, 2, 6}°. In the {200, 400} m arcs, detectors are placed off-centerline by {−4, −1, 1, 4}°.

Fig. 10 – Layout, case 1 (measured data). Unknown source (square) and detector (circular dots) arrangement for determining the source parameters.

Table 2 – Case 1: summary statistics (mean and standard deviation) of the MCMC samples used to generate Fig. 11

mi | xs (m) | ys (m) | zs (m) | qs (g s−1) | ks (s−1)
Actual mi | 0.0 | 0.0 | 0.46 | 41.20 | 0.0
mean (mi_MCMC) | 2.11 | 3.09 | 2.12 | 45.78 | 0.006
sd (mi_MCMC) | 9.12 | 1.84 | 1.58 | 20.53 | 0.012

Fig. 11 – Case 1: marginal parameter distributions generated from MCMC samples. The true parameter value is represented by the solid vertical line in the histograms, and the circular dot in the scatter plot. The mean of the MCMC samples is represented by the dashed vertical line in the histograms, and the square dot in the scatter plot.

The combined uncertainty in the model and measurement noise for each detector was conservatively set such that the model results shown in Fig. 3 lie within three standard deviations of the measurements. This led to the assignment of {100, 50, 50, 50}% of the mean concentration measured at the detectors in each of the four {50, 100, 200, 400} m arcs, respectively. Dual concentration (C∗), and particle travel time mean and variance (τ̄ and σ²) fields were generated for one of the rightmost detectors using the bLS model (10⁵ particles were released), and were subsequently translated in space to all of the other detectors in the array. This translation is admissible due to the horizontally homogeneous nature of the flow encountered in PPG. Vertical translation of the C∗ field is inadmissible owing to the vertical inhomogeneity of the wind statistics, and for general flows which lack homogeneity in any one direction, C∗ fields and particle travel time statistics cannot be translated.

The posterior PDF was sampled using the Metropolis–Hastings algorithm, and MCMC samples for each parameter were binned. 10⁵ points were generated using normal proposal distributions whose width was decided based on observations of each chain's progress. Histograms of the MCMC samples are shown in Fig. 11, and corresponding summary statistics are presented in Table 2. The scatter plots in the lower-right corner of the set of histograms show the MCMC samples drawn in {qs, ks} parameter space. A clear positive correlation is evident in the spread and density of these samples, indicating that difficulty could potentially be encountered when attempting to isolate both source strength and decay rate simultaneously. Detector measurements must be relatively unambiguous in their representation of the effects of decay rate and source

strength in order for the posterior PDF to yield meaningful information about these two parameters.

In this case, all parameters are estimated such that the true values are enclosed within at most two standard deviations of the means of the MCMC samples. This includes the decay rate coefficient, which was known a priori to be zero.

4.5.2. Assessment of methodology using synthetic data (nonconservative tracer)
In Figs. 12 and 13, detector arrangements are shown for two source determination test cases where the unknown decay rate ks differs by an order of magnitude. For the first case, in which the decay rate is very low, detectors which provide meaningful data (given their susceptibility to noise) are generally located closer to the source. By 'meaningful data' we refer to data points which are in general representative of the decay and dispersion of the plume. Hence, the streamwise spread of the detectors in the second case (Fig. 13) is approximately half that of those for the first case (Fig. 12). In both cases, the detector array is asymmetric about the plume centerline, with all detectors placed at a height of z = 1.5 m (as in the PPG field experiment). Synthetic concentration data were obtained for the two decay rates using the concentration field generated by the fLS model after releasing 5 × 10⁴ particles. This data was then subjected to additive Gaussian noise whose standard deviation was 50% of the measured concentration.

Fig. 12 – Layout, case 2. Unknown source (square) and detector (circular dots) arrangement for determining the source parameters (for small ks).

Fig. 13 – Layout, case 3. Unknown source (square) and detector (circular dots) arrangement for determining the source parameters (for large ks).

Fig. 14 – Case 2: marginal parameter distributions generated from MCMC samples. The true parameter value is represented by the solid vertical line in the histograms, and the circular dot in the scatter plot. The mean of the MCMC samples is represented by the dashed vertical line in the histograms, and the square dot in the scatter plot.

For both cases, uncertainties at all detectors were assumed to be 50%, and the same C∗ field was generated and horizontally translated in the x–y plane to each of the synthetic detector locations. The posterior PDF was sampled using the same MCMC approach as with case 1. Histograms for the second case (ks = 0.03) are shown in Fig. 14, and the results of the third case (ks = 0.30) are shown in Fig. 15. Corresponding summary statistics are presented in Tables 3 and 4. The scatter

plots of the MCMC samples drawn from the {qs, ks} parameter space once again demonstrate a positive correlation. In both cases, parameters are generally well estimated; the true parameter values are enclosed within two standard deviations of the means of the MCMC samples.

Fig. 15 – Case 3: marginal parameter distributions generated from MCMC samples. The true parameter value is represented by the solid vertical line in the histograms, and the circular dot in the scatter plot. The mean of the MCMC samples is represented by the dashed vertical line in the histograms, and the square dot in the scatter plot.

Table 3 – Case 2: summary statistics (mean and standard deviation) of the MCMC samples used to generate Fig. 14

mi | xs (m) | ys (m) | zs (m) | qs (g s−1) | ks (s−1)
Actual mi | 0.0 | 0.0 | 0.46 | 41.20 | 0.030
mean (mi_MCMC) | 6.53 | −0.30 | 1.05 | 44.76 | 0.028
sd (mi_MCMC) | 5.37 | 0.79 | 1.05 | 13.22 | 0.011

Table 4 – Case 3: summary statistics (mean and standard deviation) of the MCMC samples used to generate Fig. 15

mi | xs (m) | ys (m) | zs (m) | qs (g s−1) | ks (s−1)
Actual mi | 0.0 | 0.0 | 0.46 | 41.20 | 0.30
mean (mi_MCMC) | 2.35 | 0.12 | 0.82 | 39.05 | 0.29
sd (mi_MCMC) | 1.53 | 0.57 | 0.92 | 14.03 | 0.029

5. Conclusions

Combining the following techniques:

(1) a statistical approach to reconstructing the [dual] concentration field for a given decay coefficient (in LS particle models);
(2) the adjoint approach; and
(3) Markov chain Monte Carlo

results in a computationally efficient method for solving the source determination problem for a nonconservative tracer in a Bayesian probabilistic framework.

The detector positions used in the three test cases might be construed to be arranged based on prior knowledge of the source location, as opposed to being randomly spread about the domain (as might be expected in a real-life scenario). However, our choice of detector arrangement is designed to elicit knowledge about the capabilities and limitations of the source determination methodology. In a random array, detectors upwind of the source would be expected to measure zero concentration, information which would effectively constrain the potential source location (but not the decay rate or the strength). It could be considered that we have utilized a 'subset' of detectors which sample only part of the plume (downwind of the source), which actually results in a more challenging inference.

A prior understanding of the expected scale of tracer decay is clearly important when considering an inverse problem in which the decay coefficient and source strength could vary by orders of magnitude. For the case of a tracer undergoing rapid decay, detectors will yield useful information (in terms of their signal-to-noise ratio) only when placed relatively close to the source. Lagrangian stochastic simulations must be run using relatively large numbers of particles in order to improve model concentration estimates for detectors which lie far from the source. Conversely, for tracers which decay slowly in time, the spread of detectors must be wide enough to capture the relative behaviours of dispersion (indicative of the source strength) and decay. The MCMC samples presented in the previous section demonstrate that while these parameters are closely correlated (i.e., the concentration measured by a single detector could be reduced either by reducing the source

strength or by increasing the decay rate), they can nevertheless be simultaneously estimated (using multiple detectors), since they are not linearly dependent. The ks versus qs MCMC sample scatter plots shown in Figs. 11–15 match the trend seen in the analogous plot of rate coefficient versus ultimate biochemical oxygen demand presented by Qian et al. (2003).

The test cases demonstrate that the method can be applied to environmental flows in which several of the source parameters are unknown. Consider a scenario where tracer decay or scavenging is unexpected, but experimentally unconfirmed. Using an inference procedure based on the 'overparameterized' model (i.e., the model which includes ks) will result in more truthful estimates (in terms of their accuracy) of the other source parameters (Reichert and Omlin, 1997). In other words, uncertainty about the persistence of a tracer in the environment will be reflected in additional uncertainty about its origin and strength. The statistical approach to particle travel times described in this work significantly mitigates the computational effort required to include ks as a source parameter in the inference procedure.

Acknowledgement

The authors wish to acknowledge support from the Chemical Biological Radiological Nuclear Research and Technology Initiative (CRTI) Program under project number CRTI-02-0093RD. We would also like to thank the anonymous reviewers who provided detailed and insightful comments which significantly improved the manuscript.

References

Aral, M.M., Guan, J., Maslia, M.L., 2001. Identification of contaminant source location and release history in aquifers. J. Hydrol. Eng. 6, 225–234.

Ariya, P.A., Jobson, B.T., Sander, R., Niki, H., Harris, G.W., Hopper, J.F., Anlauf, K.G., 1998. Measurements of C2–C7 hydrocarbons during the Polar Sunrise Experiment 1994: further evidence for halogen chemistry in the troposphere. J. Geophys. Res. 103, 13169–13180.

Barad, M.L. (Ed.), 1958. Project Prairie Grass, A Field Program in Diffusion. Geophysical Research Papers No. 59, vols. I and II. Report AFCRC-TR-58-235, Air Force Cambridge Research Center, 439 pp.

Borsuk, M.E., Stow, C.A., 2000. Bayesian parameter estimation in a mixed-order model of BOD decay. Water Res. 34, 1830–1836.

Chow, T.K., Kosovic, B., Chan, S., 2005. Source inversion for contaminant plume dispersion in urban environments using building resolving simulations. In: Proceedings of the Ninth Annual George Mason University Conference on Atmospheric Transport and Dispersion Modeling, Fairfax, VA, USA, July.

Cox, R.T., 1946. Probability, frequency and reasonable expectation. Am. J. Phys. 14, 1–13.

Denning, A.S., Holzer, M., Gurney, K.R., Heimann, M., Law, R.M., Rayner, P.J., Fung, I.Y., Fan, S.-M., Taguchi, S., Friedlingstein, P., Balkanski, Y., Taylor, J., Maiss, M., Levin, I., 1999. Three-dimensional transport and concentration of SF6. A model intercomparison study (TransCom 2). Tellus B 51, 266–297.

Flesch, T.K., Wilson, J.D., Yee, E., 1995. Backward-time Lagrangian stochastic dispersion models and their application to estimate gaseous emissions. J. Appl. Meteorol. 34, 1320–1332.

Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), 1996. MarkovChain Monte Carlo in Practice. Chapman & Hall/CRC, London.

Gregory, P.C., 2005. Bayesian Logical Data Analysis for the Physical Sciences: A Comparative Approach with Mathematica® Support. Cambridge University Press, Cambridge.

Hanna, S.R., Chang, J.S., Strimaitis, D.G., 1990. Uncertainties in source emission rate estimates using dispersion models. Atmos. Environ. 24A, 2971–2980.

Hanna, S.R., Hansen, O.R., Dharmavaram, S., 2004. FLACS CFD air quality model performance evaluation with Kit Fox, MUST, Prairie Grass, and EMU observations. Atmos. Environ. 38, 4675–4687.

Hastings, W.K., 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109.

Haugen, D.A. (Ed.), 1959. Project Prairie Grass, A Field Program in Diffusion. Geophysical Research Papers No. 59, vol. III. Report AFCRC-TR-58-235, Air Force Cambridge Research Center, 673 pp.

Hsieh, K.J., Keats, W.A., Lien, F.-S., Yee, E., 2005. Scalar dispersion and inferred source location in an urban canopy. In: Proceedings of the Ninth Annual George Mason University Conference on Atmospheric Transport and Dispersion Modeling, Fairfax, VA, USA, July.

Issartel, J.-P., Baverel, J., 2003. Inverse transport for the verification of the Comprehensive Nuclear Test Ban Treaty. Atmos. Chem. Phys. 3, 475–486.

Jaynes, E.T., 2003. Probability Theory: The Logic of Science. Cambridge University Press, Cambridge.

Keats, A., Yee, E., Lien, F.-S., 2007. Bayesian inference for source determination with applications to a complex urban environment. Atmos. Environ. 41, 465–479.

Lin, J.C., Gerbig, C., Wofsy, S.C., Andrews, A.E., Daube, B.C., Davis, K.J., Grainger, C.A., 2003. A near-field tool for simulating the upstream influence of atmospheric observations: the Stochastic Time-Inverted Lagrangian Transport (STILT) model. J. Geophys. Res. 108, 4493.

Liu, F., Zhang, Y., Hu, F., 2005. Adjoint method for assessment and reduction of chemical risk in open spaces. Environ. Model. Assess. 10, 331–339.

Meyers, T.P., Finkelstein, P., Clarke, J., Ellestad, T.G., Sims, P.F., 1998. A multilayer model for inferring dry deposition using standard meteorological measurements. J. Geophys. Res. 103, 22645–22661.

Michalak, A.M., Kitanidis, P.K., 2002. Application of Bayesian inference methods to inverse modeling for contaminant source identification at Gloucester Landfill, Canada. Comput. Meth. Water Resour. 2, 1259–1266.

Penenko, V., Baklanov, A., Tsvetova, E., 2002. Methods of sensitivity theory and inverse modeling for estimation of source parameters. Future Gener. Comput. Syst. 18, 661–671.

Qian, S.S., Stow, C.A., Borsuk, M.E., 2003. On Monte Carlo methods for Bayesian inference. Ecol. Model. 159, 269–277.

Reichert, P., Omlin, M., 1997. On the usefulness of overparameterized ecological models. Ecol. Model. 95, 289–299.

Rodean, H.C., 1996. Stochastic Lagrangian models of turbulent diffusion. Meteorological Monographs 26.

Rodenbeck, C., Houweling, S., Gloor, M., Heimann, M., 2003. CO2 flux history 1982–2001 inferred from atmospheric data using a global inversion of atmospheric transport. Atmos. Chem. Phys. 3, 1919–1964.

Seibert, P., Frank, A., 2004. Source–receptor matrix calculation with a Lagrangian particle dispersion model in backward mode. Atmos. Chem. Phys. 4, 51–63.

Skiba, Y.N., 2003. On a method of detecting the industrial plants which violate prescribed emission rates. Ecol. Model. 159, 125–132.

Thomson, D.J., 1987. Criteria for the selection of stochastic models of particle trajectories in turbulent flows. J. Fluid Mech. 180, 529–556.

Venkatram, A., Du, S., 1997. An analysis of the asymptotic behavior of cross-wind-integrated ground-level concentrations using Lagrangian stochastic simulation. Atmos. Environ. 31, 1467–1476.

Yee, E., Lien, F.-S., Keats, W.A., Hsieh, K.J., D'Amours, R., 2006. Validation of Bayesian inference for emission source distribution reconstruction using the Joint Urban 2003 and European Tracer experiments. In: Proceedings of the Fourth International Symposium on Computational Wind Engineering (CWE2006), Yokohama, Japan, July.