Optimal interpolation to re-analyse PM10 concentration modelling simulations

Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference, Shanghai, P.R. China, December 16-18, 2009. WeBIn2.5. 978-1-4244-3872-3/09/$25.00 ©2009 IEEE.

Gabriele Candiani, Claudio Carnevale, Veronica Filisina, Giovanna Finzi, Enrico Pisoni, Marialuisa Volta

Abstract: An optimal interpolation technique is formalized and applied to the output of a deterministic air quality model in order to improve the description of the evolution of a pollutant (namely particulate matter, PM10) in the atmosphere. The paper presents an application of the methodology to the Northern Italy region, which is often affected by high PM10 concentrations. The validation of the methodology, performed for winter 2004, shows that the re-analysis markedly improves the description of the phenomena in terms of both mean error and correlation coefficient.

I. INTRODUCTION

The complexity and nonlinearity of the physical and chemical phenomena taking place in the atmosphere make the verification of pollutant concentration levels in a given domain a very challenging task for National Environmental Authorities. A first answer to this problem comes from monitoring networks, which measure the concentrations of different pollutants at a limited number of points in the area under study. The accuracy of this approach depends heavily on the position and number of monitoring stations that can be placed in the region. In recent decades, the scientific community has moved to a complementary approach, based on the design and implementation of 3D deterministic systems modelling the dynamics of pollutants in the atmosphere (TCAM [1], CAMx [2], CHIMERE [3], EMEP [4], LOTOS [5]). These models offer a fairly comprehensive description of the most relevant gas and aerosol dynamics but, due to their complexity, they need a large amount of very detailed meteorological and emission input data, which are always affected by uncertainties that impact model performance [6], [7].

In recent years, different techniques have been applied to re-analyse model simulations using the information coming from monitoring networks, in order to obtain a system able to reproduce the concentration levels of pollutants in the atmosphere. Some of these techniques (e.g. IDW [8] or residual kriging [8], [9]) do not use any statistical information about the errors of model simulations and observations. On the other hand, since both model simulations and measurements are affected by errors, the application of more detailed multivariate statistical methods [10] has been steadily increasing in recent years. In this study, an optimal interpolation re-analysis technique [11] has been applied to the simulations of the TCAM model over Northern Italy. The paper is organized as follows: Section 2 presents the main features of the TCAM model and the formalization of optimal interpolation; the setup of the model for a case study is presented in Section 3. Finally, the re-analysis results are presented in Section 4.

Department of Electronic for Automation, University of Brescia, Via Branze 38, IT-25123 Brescia, Italy. [email protected]

II. METHODOLOGY

PM10 (the fraction of particles with a diameter below 10 μm) is one of the most important pollutants in the atmosphere because of its significant impact on human health and ecosystems. Moreover, the dynamics leading to the production, accumulation and removal of PM10 in a given area are heavily nonlinear. For these reasons, a detailed and accurate representation of the PM10 concentration level over a domain is a very important but challenging task. In the following, a methodology for computing the PM10 concentration by integrating different information sources is presented. First, the simulation of PM10 phenomena is performed through a deterministic 3D multiphase model; then the data of the measurement network over the domain are collected and analysed; finally, the two information sources are integrated through an optimal interpolation re-analysis procedure.

A. TCAM model

PM10 concentrations are typically simulated by three-dimensional deterministic models. In this work, TCAM (Transport Chemical Aerosol Model) [1] has been used. It is part of the Gas Aerosol Modelling Evaluation System (GAMES) [12] (Figure 1), which also includes the meteorological pre-processor PROMETEO, which provides TCAM with all the meteorological input fields at the correct spatial and temporal resolution, starting from the output of continental-scale models; the emission processor POEM-PM [13]; and a boundary condition pre-processor, which computes the boundary conditions for the TCAM model in the application domain starting from the simulations of continental-scale models.

Fig. 1. The GAMES modeling system.

The multi-phase transport model TCAM [1] is an Eulerian 3D grid model. At each time step and for each cell of the computational grid, it solves a PDE system modelling the horizontal and vertical transport, the multiphase chemical reactions and the gas-to-particle conversion phenomena by means of an operator splitting technique [14]. The horizontal transport is solved by means of a chapeau function approximation and the nonlinear Forester filter [1], while the vertical transport PDE system is solved by a hybrid implicit-explicit scheme [1]. TCAM allows the simulation of gas chemistry using both the lumped structure (Carbon Bond [15]) and the lumped molecule (SAPRC97 [16]) approach. In order to describe the mass transfer between the gas and aerosol phases, the COCOH97 scheme [17], an extended version of the SAPRC97 mechanism including 95 gaseous species and 185 reactions, is implemented. The ODE chemical kinetic system is solved by means of the Implicit-Explicit Hybrid (IEH) solver [18], which splits the species into fast and slow ones according to their reaction rates. The system of fast species is solved by means of the implicit Livermore Solver for Ordinary Differential Equations (LSODE) [18], which implements an Adams predictor-corrector method in the non-stiff case [19] and the Backward Differentiation Formula method in the stiff case [19]. The system of slow species is solved by the Adams-Bashforth method [19].

B. Optimal Interpolation

The Optimal Interpolation (OI) [11] method is an algorithm that merges the available observations with the simulation results of a model, in order to produce an analysis field representing an improved estimate of the state of the atmosphere. However, both the model output and the observations are affected by errors, so that they can be expressed as:

$$x_b(t) = x(t) + \eta_b(t) \tag{1}$$

$$y_0(t) = H(x(t)) + \varepsilon(t) \tag{2}$$

where:

• $x_b$ is the background (first guess) simulated field, that is to say the daily mean PM10 concentration field computed by the TCAM model over a grid domain;
• $x$ is the true field;
• $\eta_b$ represents the error of the background field;
• $y_0$ is the vector of the measurements;
• $H$ is a linear operator linking the grid field and the observations;
• $\varepsilon$ is the observation error.
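The operator H in Equation (2) maps the gridded field onto the station locations. As a minimal sketch, assuming a regular rectangular grid of cell centres and a row-major flattening of the field (the function names, the grid layout and the clamping at the domain edge are illustrative assumptions, not details given in the paper), one row of H per station can be built from bilinear weights:

```python
import numpy as np

def bilinear_row(x, y, x0, y0, dx, dy, nx, ny):
    """Return one row of H: bilinear weights for a station at (x, y).

    Assumes nx-by-ny cell centres starting at (x0, y0) with spacing (dx, dy),
    and a field flattened row-major as index = j * nx + i.
    """
    fx = (x - x0) / dx
    fy = (y - y0) / dy
    # Lower-left corner of the enclosing square of cell centres, clamped to the grid.
    i = int(np.clip(np.floor(fx), 0, nx - 2))
    j = int(np.clip(np.floor(fy), 0, ny - 2))
    tx = fx - i
    ty = fy - j
    row = np.zeros(nx * ny)
    row[j * nx + i] = (1 - tx) * (1 - ty)
    row[j * nx + i + 1] = tx * (1 - ty)
    row[(j + 1) * nx + i] = (1 - tx) * ty
    row[(j + 1) * nx + i + 1] = tx * ty
    return row

def build_H(stations, x0, y0, dx, dy, nx, ny):
    """Stack one bilinear row per station into the (n_stations, nx*ny) matrix H."""
    return np.vstack([bilinear_row(x, y, x0, y0, dx, dy, nx, ny) for x, y in stations])

# Hypothetical 4 x 3 grid and two made-up station positions.
nx, ny, x0, y0, dx, dy = 4, 3, 0.0, 0.0, 2.0, 5.0
xs = x0 + dx * np.arange(nx)
ys = y0 + dy * np.arange(ny)
field = (1.0 + 2.0 * xs[None, :] + 3.0 * ys[:, None]).ravel()  # linear test field
H = build_H([(3.0, 7.5), (2.0, 0.0)], x0, y0, dx, dy, nx, ny)
print(H @ field)  # field values interpolated at the two stations
```

Each row sums to one, so H reproduces constant and linear fields exactly, which is a quick sanity check for any implementation of this kind.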

The goal of the OI is therefore the computation of a re-analysed field $x_a(t)$ (PM10 concentration field), representing the best (linear and unbiased) estimate of $x$ in the least-squares sense. This field can be obtained as the solution of the minimization problem:

$$x_a(t) = \arg\min J[x(t)] = \arg\min \left[ (x(t) - x_b(t))^T P^{-1}(t) (x(t) - x_b(t)) + (y_0(t) - H(x(t)))^T R^{-1}(t) (y_0(t) - H(x(t))) \right] \tag{3}$$

where $P(t) = E[\eta_b(t)\eta_b^T(t)]$ and $R(t) = E[\varepsilon(t)\varepsilon^T(t)]$ are the error covariance matrices for the background field and the observations, respectively. Under the hypothesis of stationarity, $P(t) = P$ and $R(t) = R$. Under the following assumptions:

• $E[\eta_b(t)] = 0$, for each t;
• $E[\varepsilon(t)] = 0$, for each t;
• $E[\eta_b^T(t)\varepsilon(t)] = 0$, for each t;
• $H$ is a linear operator, implementing a bilinear interpolation between model and observation state;

$x_a(t)$ can be explicitly computed as:

$$x_a(t) = x_b(t) + K (y_0(t) - H x_b(t)) \tag{4}$$

where $K$ is the so-called Kalman gain matrix, defined as:

$$K = P H^T (H P H^T + R)^{-1} \tag{5}$$
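Equations (4) and (5) translate almost directly into a few lines of linear algebra. The following is a minimal NumPy sketch, not the authors' implementation; the function name `oi_analysis` and the three-cell toy numbers are assumptions made for illustration. It solves a linear system instead of forming the inverse explicitly, which is valid here because P and H P H^T + R are symmetric.

```python
import numpy as np

def oi_analysis(xb, y0, H, P, R):
    """Return the analysis xa and gain K of Equations (4)-(5).

    xb: background field, shape (n,)
    y0: observations, shape (m,)
    H:  observation operator, shape (m, n)
    P:  background error covariance, shape (n, n), symmetric
    R:  observation error covariance, shape (m, m), symmetric
    """
    S = H @ P @ H.T + R                   # innovation covariance H P H^T + R
    # K = P H^T S^{-1}; since S and P are symmetric, K^T = S^{-1} (H P).
    K = np.linalg.solve(S, H @ P).T
    xa = xb + K @ (y0 - H @ xb)           # Equation (4)
    return xa, K

# Toy example: three grid cells, one station located on cell 0 (made-up values).
xb = np.array([10.0, 10.0, 10.0])
y0 = np.array([20.0])
H = np.array([[1.0, 0.0, 0.0]])
P = 4.0 * np.array([[1.00, 0.50, 0.25],
                    [0.50, 1.00, 0.50],
                    [0.25, 0.50, 1.00]])
R = np.array([[1.0]])
xa, K = oi_analysis(xb, y0, H, P, R)
print(xa)  # the correction decays with the background error correlation
```

In this toy case the observed cell moves most of the way towards the measurement, and neighbouring cells are corrected in proportion to their error covariance with the observed cell.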

The estimate of the covariance matrices P and R is a key issue, as the Kalman gain is significantly affected by their values. Matrix P is estimated following the Gaussian exponential function approach, which assumes that the simulation error variance at a grid point is a characteristic of the model (and is therefore constant for all grid cells), whereas the covariance between the errors at two points (i and j) of the computational grid is a function of the horizontal distance between them. In this way, the P matrix can be approximated as:

$$P = [p_{i,j}] = \exp\left(-\frac{d(i,j)^2}{2 \cdot L_h^2}\right) \cdot v = D(i,j) \cdot v \tag{6}$$

where d(i, j) is the distance between the centres of cells i and j, $L_h$ is a parameter defining the decay of the covariance with distance, and v is the model error variance estimate, computed on the basis of previous simulation results.
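Equation (6) can be evaluated for all pairs of cells at once. The sketch below is illustrative only: the function name, the three cell centres and the values of L_h and v are assumptions, since the paper does not report the values it used.

```python
import numpy as np

def background_covariance(coords, L_h, v):
    """Return (P, D) of Equation (6) for cell centres given as an (n, 2) array.

    D[i, j] = exp(-d(i, j)^2 / (2 * L_h^2)) and P = v * D.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise coordinate differences
    d2 = np.sum(diff ** 2, axis=-1)                  # squared distances d(i, j)^2
    D = np.exp(-d2 / (2.0 * L_h ** 2))
    return v * D, D

# Hypothetical cell centres in km, with assumed L_h = 10 km and v = 4.
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
P, D = background_covariance(coords, L_h=10.0, v=4.0)
print(P)
```

The matrix has n × n entries for n grid cells, so for a large grid a practical implementation would probably only evaluate the columns of P that H actually touches; that optimisation is left out of this sketch.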

The covariances between the observation errors can be approximately set to zero since the instrument errors are independent, so R should be diagonal. Moreover, if the same type of instrument is used for all the measurements, it can also be assumed that all the monitoring stations have the same error variance r. Therefore, under these assumptions, R = rI, where I is the identity matrix.


It should be noted that, under these assumptions about the P and R matrices, Equation 5 can be written as:

$$K = D(i,j) H^T \cdot v \left( v \cdot \left( H D(i,j) H^T + \frac{r}{v} \cdot I \right) \right)^{-1} = D(i,j) H^T \left( H D(i,j) H^T + \sigma \cdot I \right)^{-1} \tag{7}$$

where the only degree of freedom, $\sigma = r/v$, is the ratio between the observation and model error variances.
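A quick numerical check makes the point of Equation (7) concrete: two different pairs (v, r) with the same ratio σ give the same gain. The helper names and the small one-dimensional test grid below are assumptions made for illustration.

```python
import numpy as np

def gain_full(D, H, v, r):
    """Kalman gain from Equation (5) with P = v * D and R = r * I."""
    P = v * D
    R = r * np.eye(H.shape[0])
    return P @ H.T @ np.linalg.inv(H @ P @ H.T + R)

def gain_sigma(D, H, sigma):
    """Kalman gain from Equation (7), which depends only on sigma = r / v."""
    m = H.shape[0]
    return D @ H.T @ np.linalg.inv(H @ D @ H.T + sigma * np.eye(m))

# Hypothetical 1-D grid of five cells, 10 km apart, with L_h = 10 km.
x = np.linspace(0.0, 40.0, 5)
D = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * 10.0 ** 2))
H = np.zeros((2, 5))
H[0, 1] = 1.0   # station on cell 1
H[1, 3] = 1.0   # station on cell 3

K_a = gain_full(D, H, v=4.0, r=0.4)    # sigma = 0.1
K_b = gain_full(D, H, v=50.0, r=5.0)   # sigma = 0.1 again
K_s = gain_sigma(D, H, sigma=0.1)
print(np.allclose(K_a, K_s), np.allclose(K_b, K_s))
```

In practice this means that only σ and L_h need to be chosen to compute the analysis, while the absolute values of v and r do not affect the re-analysed field.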

III. SIMULATION SETUP

The methodology has been applied to the re-analysis of PM10 daily mean concentration fields computed by the TCAM model over a 640 × 410 km² domain including the whole of Northern Italy (Figure 2). The area is often characterized by high PM10 concentration levels, in particular in its central part, where the most important industrial and residential areas (Milano, Torino, Brescia, Verona, Venezia) are located. A winter period (January 2004 - February 2004) has been selected for the simulation. In this period, the high emissions of primary PM10, nitrogen oxides and ammonia caused very high PM10 concentrations in the atmosphere.

The emission fields are computed by the POEM-PM pre-processor starting from the regional emission inventory collected by the Italian Environmental Agency (APAT), while the meteorological fields are computed by the PROMETEO pre-processor starting from continental-scale simulations performed by the MM5 model. The boundary conditions have been computed starting from continental-scale simulations of the CHIMERE model [3]. The observation data are taken from 133 stations of the regional monitoring networks. 80% of the stations (represented as crosses in Figure 2) have been used for the computation of the daily analysis fields, while the remaining 20% (indicated as circles in Figure 2) have been used for the validation of the proposed methodology.
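The paper does not say how the stations were assigned to the two groups. As a minimal sketch, assuming a simple random split with a fixed seed (an assumption, as are the function name and the rounding rule), the 80%/20% partition of 133 stations could be reproduced as follows:

```python
import numpy as np

def split_stations(n_stations, analysis_fraction=0.8, seed=0):
    """Randomly split station indices into analysis and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_stations)
    n_analysis = int(round(analysis_fraction * n_stations))
    return np.sort(order[:n_analysis]), np.sort(order[n_analysis:])

analysis_idx, validation_idx = split_stations(133)
print(len(analysis_idx), len(validation_idx))
```

With this rounding rule, 133 stations split into 106 analysis and 27 validation stations; the exact counts used in the paper are not reported. A split that also balances the spatial coverage of the two groups may be preferable, but that goes beyond what the text describes.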

The ratio σ between the observation and model error variances is assumed to be equal to 0.1, as suggested in [10].

Fig. 2. Application domain.

[Figure: map of the domain in UTM 32 coordinates [m], with axis ticks from 260000 to 860000 (easting) and from 4780000 to 5180000 (northing). Labelled cities: Aosta, Torino, Milano, Genova, Trento, Bologna, Firenze, Venezia, Trieste, Verona, Piacenza, Modena, Brescia, Ravenna.]

IV. RESULTS AND DISCUSSION

The validation of the implemented re-analysis technique has been performed by comparing, for the selected stations, the daily mean PM10 concentration series computed by OI (Optimal Interpolation) with the observations. In order to assess the effectiveness of the methodology, the OI performance has been compared, on the one hand, with the TCAM model output (background field) and, on the other hand, with the fully deterministic IDW (Inverse Distance Weighted) [10] re-analysis technique. The validation statistical indexes include the correlation coefficient, the normalized mean error (NME) and the normalized mean absolute error (NMAE).
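The paper names the three indexes without giving their formulas. The sketch below uses definitions that are common in air quality model evaluation, with the error normalized by the sum of the observations; this normalization and the function names are assumptions, and the authors may have used a different convention.

```python
import numpy as np

def correlation(sim, obs):
    """Pearson correlation coefficient between simulated and observed series."""
    return float(np.corrcoef(np.asarray(sim, float), np.asarray(obs, float))[0, 1])

def nme(sim, obs):
    """Normalized mean error: sum(sim - obs) / sum(obs). Negative means underestimation."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return float(np.sum(sim - obs) / np.sum(obs))

def nmae(sim, obs):
    """Normalized mean absolute error: sum(|sim - obs|) / sum(obs)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return float(np.sum(np.abs(sim - obs)) / np.sum(obs))

# Made-up daily means at one validation station: the model reports half the observed value.
obs = np.array([40.0, 60.0, 80.0, 100.0])
sim = 0.5 * obs
print(correlation(sim, obs), nme(sim, obs), nmae(sim, obs))
```

The toy series shows why the indexes are complementary: a model that systematically halves the concentrations still has perfect correlation, while NME and NMAE expose the bias.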

Figures 3-5 show the statistical indexes computed by comparing the simulated and observed daily mean PM10 concentrations. Each box plot presents the distribution of the index computed for the validation stations in terms of minimum, 25th percentile, median, 75th percentile and maximum values. The implemented procedure leads to better results than the background fields for all the considered indexes. The median of the correlation coefficient rises from 0.4 for the background to 0.9 for the analysis, and the normalized mean error, which indicates a marked underestimation trend in TCAM, becomes approximately 0. This improvement is also confirmed by the value of the normalized mean absolute error. The two re-analysis techniques show comparable results for all three indexes.
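For readers unfamiliar with the IDW baseline, the sketch below shows one plausible form of it: the residuals between observations and background at the stations are spread over the grid with inverse-distance weights and added to the background. Whether the paper interpolates residuals or the observations themselves, and which distance exponent it uses, is not stated, so the residual form, `power=2.0` and all names here are assumptions.

```python
import numpy as np

def idw_reanalysis(xb_grid, grid_xy, station_xy, y0, xb_at_stations, power=2.0, eps=1e-12):
    """Correct the background with IDW-interpolated station residuals.

    xb_grid:        background at grid points, shape (n,)
    grid_xy:        grid point coordinates, shape (n, 2)
    station_xy:     station coordinates, shape (m, 2)
    y0:             observations, shape (m,)
    xb_at_stations: background interpolated to the stations, shape (m,)
    """
    residual = y0 - xb_at_stations
    diff = grid_xy[:, None, :] - station_xy[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))       # (n, m) grid-to-station distances
    w = 1.0 / (dist + eps) ** power                  # eps avoids division by zero
    w /= w.sum(axis=1, keepdims=True)                # weights sum to one per grid point
    return xb_grid + w @ residual

# Toy 1-D layout: three grid points, stations on the first and last one (made-up values).
grid_xy = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
station_xy = np.array([[0.0, 0.0], [10.0, 0.0]])
xb_grid = np.array([10.0, 10.0, 10.0])
xa_idw = idw_reanalysis(xb_grid, grid_xy, station_xy,
                        y0=np.array([20.0, 30.0]),
                        xb_at_stations=np.array([10.0, 10.0]))
print(xa_idw)
```

Unlike the OI gain, these weights depend only on geometry: the analysis passes through the measurements at the station locations, so any measurement error is carried into the field unchanged.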

In order to investigate the differences between the two re-analysis techniques in more depth, their robustness with respect to measurement noise has been assessed.

In this context, a simulation has been performed by adding to the value measured at each station s a random signal ranging in the interval [−0.3·E[PM10s], +0.3·E[PM10s]], where E[PM10s] is the mean concentration measured at the station itself. Figures 6-8 show that, in this case, the OI re-analysis results are better than the IDW ones for all the computed indexes. In fact, measurement noise causes the correlation for IDW to drop from 0.9 to 0.7, while for OI the reduction is only from 0.88 to 0.8. Moreover, the median of the correlation computed for IDW is lower than the 25th percentile of the one computed for OI. The NME is only partially affected by the presence of noise, probably due to error compensation. With regard to the NMAE, it is interesting to note that the OI results are close to those obtained without any measurement noise, with only a limited increase in the median of the distribution.
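The noise experiment can be sketched as follows. The paper gives only the interval of the random signal, so the uniform distribution, the independence across days and stations, the seed and the function name are assumptions.

```python
import numpy as np

def add_measurement_noise(obs, amplitude=0.3, seed=0):
    """Perturb observations with noise in [-amplitude * mean_s, +amplitude * mean_s).

    obs has shape (n_days, n_stations); mean_s is the time mean at station s.
    """
    rng = np.random.default_rng(seed)
    station_mean = obs.mean(axis=0)                       # E[PM10_s] per station
    unit_noise = rng.uniform(-1.0, 1.0, size=obs.shape)   # assumed uniform distribution
    return obs + unit_noise * amplitude * station_mean

# Made-up observations: two days, two stations with means 50 and 100.
obs_clean = np.array([[40.0, 80.0],
                      [60.0, 120.0]])
obs_noisy = add_measurement_noise(obs_clean)
print(obs_noisy - obs_clean)
```

Because the amplitude scales with the station mean, stations with higher average concentrations receive proportionally larger perturbations, which keeps the relative noise level comparable across the network.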

The results in terms of concentration maps are shown in Figures 9 to 13, where both noiseless concentration measurements (circles) and model results (map) are depicted with different colours, ranging from low (black) to high (white) values of PM10 concentration. Figure 9 shows how the TCAM model, without the application of any re-analysis technique, underestimates PM concentrations over the whole domain. When optimal interpolation is applied (Figures 10 and 11), as already shown in the box plots, the results improve both with and without measurement noise. The maps for the IDW application are shown in Figures 12 and 13.


Fig. 3. Correlation for background, IDW and Optimal Interpolation analysis results (OI).

Fig. 4. NME for background, IDW and Optimal Interpolation analysis results (OI).

Fig. 5. NMAE for background, IDW and Optimal Interpolation analysis results (OI).

Fig. 6. Correlation for background, IDW and Optimal Interpolation analysis results (OI): measurement noise case.

Fig. 7. NME for background, IDW and Optimal Interpolation analysis results (OI): measurement noise case.

Fig. 8. NMAE for background, IDW and Optimal Interpolation analysis results (OI): measurement noise case.


Fig. 9. Comparison between PM concentration measurements (circles) and their model estimate (map).

Fig. 10. Comparison between PM concentration measurements (circles) and their model estimate (map), after optimal interpolation application.

V. CONCLUSIONS

A re-analysis method based on optimal interpolation has been used to integrate the simulation results of the deterministic multiphase transport model TCAM and the measurements of a monitoring network, in order to obtain a better representation of the PM10 concentration level over Northern Italy. The validation of the technique shows that the selected method ensures a significant improvement in performance with respect to TCAM for all the selected statistical indexes. The comparison with the classical IDW technique shows very similar performance in the case of noiseless measurements, while OI, thanks to the introduction of an estimate of the error statistics, ensures more robust results in the presence of measurement noise, with a limited increase in computational time.

ACKNOWLEDGMENTS

This work has been developed in the frame of the Pilot Project QUITSAT (QUalita dell’aria mediante l’Integrazione di misure da Terra, da SAtellite e di modellistica chimica multifase e di Trasporto - contract I/035/06/0 - http://www.quitsat.it), sponsored by the Italian Space Agency (ASI). We also acknowledge the Italian Ministry of University and Research (MIUR), the COST728 action (Enhancing mesoscale meteorological modelling capabilities for air pollution and dispersion applications) and the EU Network of Excellence ACCENT (Atmospheric Sustainability).

Fig. 11. Comparison between PM concentration measurements (circles) and their model estimate (map), after optimal interpolation application and considering measurement noise.

Fig. 12. Comparison between PM concentration measurements (circles) and their model estimate (map), after IDW application.

Fig. 13. Comparison between PM concentration measurements (circles) and their model estimate (map), after IDW application and considering measurement noise.

REFERENCES

[1] C. Carnevale, G. Finzi and M. Volta, Design and validation of a multiphase 3D model to simulate tropospheric pollution, Proc. 44th IEEE Conference on Decision and Control and European Control Conference, ECC-CDC 2005, 2005, CD-ROM, ISBN 0-7803-9568-9.

[2] S. Andreani-Aksoyoglu, A. Prevot, U. Baltensperger, J. Keller and J. Dommen, Modeling of formation and distribution of secondary aerosols in the Milan area (Italy), Journal of Geophysical Research, 109, 2004.

[3] H. Schmidt, C. Derognat, R. Vautard and M. Beekmann, A comparison of simulated and observed ozone mixing ratios for the summer of 1998 in Western Europe, Atmospheric Environment, 2001.

[4] D. Simpson, H. Fagerli, J. Jonson, S. Tsyro and P. Wind, Transboundary acidification, eutrophication and ground level ozone in Europe - Part I: Unified EMEP model description, EMEP MSC-W, 2003.

[5] P. Builtjes, Comparison of three models for long term photochemical oxidants in Europe: the LOTOS model results, EMEP MSC-W, 1991.

[6] C. Cuvelier, P. Thunis, R. Vautard, M. Amann, B. Bessagnet, M. Bedogni, R. Berkowicz, J. Brandt, F. Brocheton, P. Builtjes, C. Carnevale, B. Denby, J. Douros, A. Graf, O. Hellmuth, A. Hodzic, C. Honore, J. Jonson, A. Kerschbaumer, F. de Leeuw, E. Minguzzi, N. Moussiopoulos, C. Pertot, V. Peuch, G. Pirovano, L. Rouil, F. Sauter, M. Schaap, R. Stern, L. Tarrason, E. Vignati, M. Volta, L. White, P. Wind and A. Zuber, CityDelta: A model intercomparison study to explore the impact of emission reductions in European cities in 2010, Atmospheric Environment, 41, 2007, pp 189-207.

[7] M. Bedogni, C. Carnevale, G. Pirovano and M. Volta, Can a modelling system bias air quality policy selection?, in P. Horacek, M. Simandl and P. Zitek, editors, Proc. of 16th IFAC World Congress, CD-ROM, 2005.

[8] E. Isaaks and R. Srivastava, Applied Geostatistics, Oxford University Press, 1989.

[9] B. Denby, M. Schaap, A. Segers, P. Builtjes and J. Horalek, Comparison of two data assimilation methods for assessing PM10 exceedances on the European scale, Atmospheric Environment, 42, 2008, pp 7122-7134.

[10] E. Kalnay, Atmospheric modelling, data assimilation and predictability, Cambridge University Press, 2003.

[11] L. Bengtsson, M. Ghil and E. Kallen, Dynamic meteorology: data assimilation methods, Springer-Verlag, 1981.

[12] M. Volta and G. Finzi, GAMES, a comprehensive Gas Aerosol Modelling Evaluation System, Environmental Modelling and Software, 21, 2006, pp 587-594.

[13] C. Carnevale, V. Gabusi and M. Volta, POEM-PM: an emission model for secondary pollution control scenarios, Environmental Modelling and Software, 21, 2006, pp 320-329.

[14] G. Marchuk, Methods of Numerical Mathematics, Springer, 1975.

[15] M. Gery, G. Whitten and J. Killus, A photochemical kinetics mechanism for urban and regional-scale computer modeling, Journal of Geophysical Research, 94, 1989, pp 12925-12956.

[16] W. Carter, D. Luo and I. Malkina, Environmental chamber studies for development of an updated photochemical mechanism for VOC reactivity assessment, California Air Resources Board, Sacramento (CA), 1997.

[17] A. Wexler and J. Seinfeld, Second-generation inorganic aerosol model, Atmospheric Environment, 25, 1991, pp 2731-2748.

[18] D. Chock, S. Winkler and P. Sun, A comparison of stiff chemistry solvers for air quality modeling, Proc. of Air and Waste Management Association 87th Annual Meeting, 1994.

[19] D. Wille, New stepsize estimators for linear multi step methods, Proc. of MCCM, 1994.

[20] R. Balgovind, A. Dalcher, M. Ghil and E. Kalnay, A stochastic-dynamic model for the spatial structure of forecast error statistics, Monthly Weather Review, 111, 1983, pp 701-722.