Daily spatial prediction of PM10 mass concentrations with geostatistics: an Austrian case study

Piero Campalani, Simone Mantovani, Marcus Hirtl, Mark Caglienzi, Gianluca Mazzini

Piero Campalani
UNIFE, Via Saragat 1, 44122 Ferrara, Italy, e-mail: [email protected]
Jacobs University Bremen, Campus Ring 1, 28759 Bremen, Germany, e-mail: p.campalani@jacobs-university.de

Simone Mantovani
MEEO Srl, Via Saragat 9, 44122 Ferrara, Italy, e-mail: [email protected]
SISTEMA GmbH, Wäringerstraße 61, 1090 Wien, Austria, e-mail: [email protected]

Marcus Hirtl
ZAMG, Hohe Warte 38, 1190 Wien, Austria, e-mail: [email protected]

Mark Caglienzi
UNIFE, Via Saragat 1, 44122 Ferrara, Italy, e-mail: [email protected]

Gianluca Mazzini
LepidaSpA, Via Aldo Moro 64, 40127 Bologna, Italy, e-mail: [email protected]

Ninth Conference on Geostatistics for Environmental Applications, geoENV2012, Valencia, Spain, September 19 – 21, 2012

Abstract The correct evaluation of fine particles suspended near the surface is of great concern since they have direct effects on human health. Continuous exposure to elevated concentrations of particulate matter is a cause of several heart and lung diseases. Accurate, high-resolution maps of PM mass concentrations are in high demand both for environmental and health policy and for the design of future monitoring stations. The partial picture seen by the ground monitoring sites (point locations) can lead to severely misclassified epidemiological studies, and the sites themselves might not catch hot-spots of pollution over unobserved locations. In this study we propose an adaptive regression-based geostatistical method to predict daily PM10 concentrations by means of ground measurements, satellite-based maps of atmospheric aerosols and model-based maps of meteorological features. The area of interest is Austria and the analysis spans three years (from 2008 to 2010). Aerosol Optical Thickness (AOT) has a verified relationship with PM dry mass concentrations, but AOT alone is inadequate, primarily because of the different support of the two measurements: instantaneous atmospheric columnar retrievals of AOT against point time-averaged measurements on the ground. In our work, meteorological variables are selected ad hoc to normalize the AOT information and compensate for the lack of vertical profiles. These data, namely planetary boundary layer height, pressure, humidity and wind, which play a major role in the assessment of the AOT-PM relationship, feed the underlying regression as well and ensure that the final map can be predicted even when insufficient aerosol information prevents its inclusion in the model, which is the case on most days during the cold seasons. Excellent regression goodness-of-fit values (R² > 0.9 on average) are observed; cross-validation statistics show overall better performance of kriging in comparison with an inverse distance mechanical interpolator.

Introduction

Previous epidemiological studies suggested that there is an association between the incidence and exacerbation of adverse respiratory and cardiovascular health effects and air pollution (Al-Hamdan et al., 2009). Accurate, high-resolution maps of ground-level particulate matter are therefore in high demand for environmental policies and for the design of future monitoring stations. Although the measurements made by the ground stations ensure a high level of reliability, they cannot provide full spatial coverage of an area and thus might lead to misclassified epidemiological studies (Beelen et al., 2009).

Spaceborne aerosol products like the ones offered by the polar-orbiting MODerate resolution Imaging Spectroradiometer (MODIS) are successfully finding practical applications in scientific research, with a nearly exponential growth of related publications in recent years (Remer et al., 2006). Though not originally intended for this purpose, the Aerosol Optical Thickness or Depth (AOT, AOD, or simply τ) from MODIS has turned out to play a leading role in the evaluation of surface air quality, thanks to its full spatial coverage (limited to clear-sky conditions) and its daily overpasses over almost the entire globe. Though the "promised land" has not been reached yet (Hoff and Christopher, 2009), researchers have verified a correlation between AOT and Particulate Matter (PM) mass concentrations (e.g. Engel-Cox et al., 2004; Gupta et al., 2008; Liu et al., 2007; Schaap et al., 2009), increasing the role of air quality models for large-scale environmental characterization, which is not possible with ground observations alone. This relationship strongly depends on the local conditions of the area of analysis and is not straightforward: several aspects need to be considered to properly translate the columnar aerosol information into ground-level presence of PM. In ideal conditions, i.e. cloud-free skies, no aerosols above the well-mixed boundary layer and aerosols with similar optical properties, the AOT can be written as:

\[
\mathrm{AOT} = \mathrm{PM} \cdot H \cdot f(\mathrm{RH}) \cdot \frac{3\, Q_{ext,dry}}{4\, \rho\, r_{eff}} \qquad (1)
\]

where PM is the particulate matter concentration at the surface with aerodynamic diameter smaller than a certain threshold (10 µm and 2.5 µm are the typical reference thresholds), H is the height of the well-mixed boundary layer, RH is the relative humidity, f(RH) is the ratio of ambient and dry extinction coefficients, ρ is the aerosol mass density (g·m⁻³), Q_ext,dry is the Mie extinction efficiency, and r_eff is the particle effective radius (Hoff and Christopher, 2009).
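As a minimal numerical sketch of Eq. (1), the relationship can be evaluated in both directions. The function names and the sample values used in the usage note are illustrative assumptions, not figures from this study; all inputs are assumed to be in mutually consistent units.

```python
def aot_from_pm(pm, pbl_height, f_rh, q_ext_dry, rho, r_eff):
    """Eq. (1): AOT = PM * H * f(RH) * 3*Q_ext,dry / (4 * rho * r_eff).

    Units must be mutually consistent, e.g. pm and rho in g/m^3,
    pbl_height and r_eff in m; f_rh and q_ext_dry are dimensionless.
    """
    return pm * pbl_height * f_rh * 3.0 * q_ext_dry / (4.0 * rho * r_eff)


def pm_from_aot(aot, pbl_height, f_rh, q_ext_dry, rho, r_eff):
    """Eq. (1) solved for the surface PM concentration."""
    return aot * 4.0 * rho * r_eff / (3.0 * q_ext_dry * pbl_height * f_rh)


# Hypothetical inputs: 30 ug/m^3 of PM, a 1000 m boundary layer, f(RH) = 1.5,
# Q_ext,dry = 2, density 1.5e6 g/m^3 and an effective radius of 0.5 um.
example_aot = aot_from_pm(30e-6, 1000.0, 1.5, 2.0, 1.5e6, 0.5e-6)
```

The inverse function makes explicit why H and f(RH) matter: for a fixed AOT, a deeper boundary layer or a more humid atmosphere implies a lower dry PM concentration at the surface.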

The Austrian region offers a challenging scenario for air quality modelling: it is a highly populated area with substantial sources of particulate emissions from traffic, industry and home heating; the Alps, with their steep slopes, together with local meteorology create a complex context of local air pollution transport, with several currents mixed together. These are caused by air mass differences along the valleys, multiple wind reversals and upslope winds during daytime that decrease low-level pollution concentrations and cause the formation of elevated pollution layers (Gohm et al., 2009), creating a further level of indirection between ground-measured PM and spaceborne AOT, whose retrieval is also inhibited by the presence of clouds and snow along the path. Aerosol monitoring over this complex area hence required high-resolution data: validated MODIS-derived AOT retrievals at 1×1 km² spatial resolution (at nadir) from the PM MAPPER software (details in the section on AOT satellite maps) were used in place of the original MODIS datasets, which are delivered at the coarser resolution of 10×10 km². It should be noted that AOT from remote sensing sources does not necessarily represent a meaningful proxy for PM concentrations (e.g. Emili et al., 2011b), either due to the lack of sufficient explanatory data or to a particularly rugged and complex topography. It should also be pointed out that nowadays a considerable selection of satellite-based AOT products is available for a specific application, including for instance the MISR sensor aboard the Terra satellite (like MODIS), or data from geostationary satellites such as GOES and METEOSAT, which can offer a higher temporal resolution.

Moreover, several research studies, such as (Gupta et al., 2006; Liu et al., 2005; Paciorek et al., 2008), have highlighted how meteorological effects play a major role in the AOT-PM relationship and need to be considered when assessing air quality at the surface: temperature can enhance the photochemical reactions in the atmosphere and hence the production of fine particles, while temperature inversion can reduce the vertical mixing; high relative humidity can enhance the production of secondary particles and hence change the size distribution and the optical properties of aerosols; the height of the planetary boundary layer drives the dilution of pollution in the atmospheric volume. Intuitively, optimal AOT-PM correlations should be found with low relative humidity, higher temperatures and a well-mixed boundary layer (Gupta and Christopher, 2009). Model-based maps of several meteorological variables like temperature, pressure, relative humidity and wind components were available for use in our model over Austria.

This article investigates the use of kriging geostatistical techniques for the spatial filling of daily 1×1 km² maps of PM10 concentrations over Austria by means of spaceborne maps of AOT, ground measurements of PM10 itself and model-based maps of several meteorological features. This work should be regarded as the continuation of (Campalani et al., 2011). It should be noted that geostatistical techniques are only one choice amongst the range of models that can be adopted for such an application, which includes for instance Chemical Transport Models (CTM) (Emili et al., 2011b), mechanical interpolators (Wong et al., 2004), land-use regression (Janssen et al., 2008), dispersion models (Maantay et al., 2009) and hybrid models (Li et al., 2006). The most widely used approach to predict PM concentrations by means of AOT data has instead consisted of empirical estimations of linear coefficients based on large series of co-located measurements (e.g. Vidot et al., 2007).

The input datasets are described in detail in the Datasets section; the model workflow is presented in the Method section, followed by the results. Finally, conclusions and future work are discussed in the closing section.

Datasets

In this section the input datasets actively used in our model are described. All the variables were collected over the Austrian geographic area, delimited by the bounding box with corners [9.5°E, 46°N] and [17°E, 49°N]. The temporal interval of analysis goes from 1 January 2008 to 31 December 2010, totalling three years of daily data. Ground-based daily measurements of PM10 are described in the section "PM10 ground measurements"; the auxiliary predictors, or covariates, are described in the section "AOT satellite maps" for the AOT imagery and in the section "Meteorological data" for the hourly model-based meteorological maps.

PM10 ground measurements

For the analysis of the ground-based aerosol distribution, 24-hour PM10 measurements from all Austrian AQ stations (159 stations) are used. The data were kindly provided by the regional Austrian administrations and extracted from the IDV (Immissions Daten Verbund), a database containing all measurements from the operational Austrian Air Quality network. The distribution of the stations is depicted in Fig. 1 along with two examples of monthly trends in the cold and warm seasons (as can be intuitively expected, higher concentrations are observed during the cold season). These data were already used in the project AQA-PM ("Extension of the Air-Quality model for Austria with satellite based Particulate Matter estimates"), which is supported within the seventh Austrian Space Application Programme (ASAP-7).

Although PM2.5 has been more frequently chosen for AOT comparisons (Hoff and Christopher, 2009), both because of its greater relevance to public health (the finer fractions of PM are inhalable and can reach the lungs) and because of the higher sensitivity of satellite sensors in the visible bands to fine-particle concentrations, in this study we chose PM10 as the target variable. The first reason is the wider availability of data from ground sites; in addition, AOT is related to the extinction of both the fine and coarse particle fractions, which are both considered when assuming a threshold aerodynamic diameter of at least 10 µm (Emili et al., 2011b).


Fig. 1: PM10 monthly trends in January (above) and August (below) 2009. AQ ground stations are highlighted (black circles).

AOT satellite maps

Due to its wide swath (2330 km, for a ±55° angular view), the MODIS sensor aboard the polar-orbiting Earth Observing System (EOS) Terra and Aqua satellites (launched in 1999 and 2002 respectively) is a very appealing source of aerosol information and can provide daily observations over Austria (as well as over almost the entire globe). Although MODIS raw spectral radiances reach 250 m of spatial resolution around the 0.66 and 0.86 µm channels, the final AOT product is delivered at a resolution of 10×10 km² because of signal-to-noise ratio requirements (Remer et al., 2006) regarding near-cloud pixels and other land-related factors. In order to register finer-scale events, which is particularly needed over a mountainous area like the Austrian Alps, MODIS-derived maps of AOT at 550 nm and 1×1 km² resolution were used instead: these are amongst a set of air quality products generated by the PM MAPPER software from MODIS Level 1B multispectral data (calibrated, geolocated reflectances) (MEEO Srl). These aerosol products were validated against the AERONET network of uplooking radiometers over a three-year period throughout Europe in (Campalani, Nguyen, Mantovani, Bottoni and Mazzini, 2011). The whole archive of the PM MAPPER products (including PM, Air Quality Index and Land Cover) can be visualized on the MEA-PM platform at http://alcs.services.meeo.it:8080/sensorer/ (Natali et al., 2011). Figure 2 depicts a monthly composite of satellite-based AOT maps: the strong dependence on terrain topography can be appreciated.

Fig. 2: August 2010 composite of raw AOT observations from PM MAPPER at 1×1 km² of spatial resolution (85 maps; 3.2% of missing pixels computed via inverse distance interpolation).

AOT data retrieval is heavily constrained by cloud and snow cover, which prevents a considerable percentage of pixels from being evaluated, in particular over the Austrian area and during the cold season. An analysis of monthly-aggregated spaceborne AOT availability over Austria showed that a substantial fraction of the area is unretrievable due to clouds and snow in the cold season (up to 90-99% in some cases). For instance, Fig. 3 depicts the availability of AOT pixels over the Austrian area for the year 2009.

In Tab. 1, the analytical details of the survey on AOT pixel-based presence over Austria are shown: even with a minimum threshold of 5% of available pixels over the area, only about 20% to 38% of the available maps (depending on the year) satisfy the requirement, meaning that many maps are almost empty.

Fig. 3: AOT presence (%) in Austria for the year 2009 for each available granule. [Plot "AOT presence over Austria, 2009": % of available AOT pixels (vertical axis) against granule index (horizontal axis).]

Table 1: Percentage of AOT granules satisfying a minimum % of pixel availability over Austria for the three years of analysis and for different thresholds of percentage.

Threshold   2008   2009   2010
5%          33.4   37.8   20.4
10%         24.1   29.4   14.5
20%         16.5   19.0    9.4
30%         11.1   12.9    6.8
40%          6.9   10.1    4.2
50%          3.2    6.9    3.0
60%          1.4    3.0    2.0
70%          0.2    1.7    1.1
80%          0      0.5    0.4
90%          0      0      0
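The kind of tally reported in Table 1 can be sketched as follows. This is only an illustration of the bookkeeping, assuming a list holding the percentage of available AOT pixels over Austria for each granule; the function name and the sample values are hypothetical.

```python
def share_of_granules_above(availability_pct,
                            thresholds=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90)):
    """Percentage of granules whose pixel availability reaches each threshold.

    availability_pct: one value per granule, the % of AOT pixels available
    over the area of interest.
    """
    n = len(availability_pct)
    if n == 0:
        raise ValueError("at least one granule is required")
    return {t: 100.0 * sum(a >= t for a in availability_pct) / n
            for t in thresholds}


# Hypothetical availability values for six granules.
shares = share_of_granules_above([0, 3, 7, 12, 35, 55], thresholds=(5, 30, 90))
```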

Meteorological data

Simulated meteorological fields of wind, temperature, pressure, relative humidity and planetary boundary layer height were provided on a 3-dimensional grid. The model simulations are based on the global forecasts provided by the IFS (Integrated Forecasting System) of the ECMWF (European Centre for Medium-Range Weather Forecasts), which are further processed by the Weather Research and Forecasting (WRF) Model. The IFS data are extracted on 16 pressure levels (between 10 and 1000 hPa) with a spatial resolution of 0.5° in each horizontal direction. These fields are used as initial and boundary conditions by WRF, which produces meteorological forecasts on an hourly basis and on 43 model levels. To obtain the dataset, the modelling system is set up to provide forecasts at a resolution of 27 km over the whole European domain; these were then interpolated to 1 km of spatial resolution by means of a cubic spline mechanical interpolator to meet the scale requirements of the prediction output grid. Amongst the available vertical levels, only the 2D grids at the surface were extracted, so as to vertically co-locate the meteorological datasets with the PM measurements of the ground sites. In Fig. 4 a complete daily set of input meteorological maps is shown: it includes Planetary Boundary Layer Height (PBLH) [m], PRESSure (PRESS) [Pa], Relative Humidity (RH) [%], wind horizontal intensities (U for the easting direction and V for the northing direction) [m/s], and wind vertical intensity (W) [m/s].


Fig. 4: Example of a 1-hour average set of input meteorological maps from the WRF model.

Method

This section describes step by step the creation of the final map of PM10 concentrations. The description focuses on the prediction for a single day, which is repeated iteratively over the whole interval of the analysis (2008 to 2010). The input dataset comprises ground site point measurements of PM10 (daily averages), spaceborne columnar AOT retrievals in the 550 nm band and interpolated grids of model-based meteorological data fields, both at 1×1 km² spatial resolution. The whole study was developed in the R environment for statistical computing (R Development Core Team, 2011), relying heavily on the gstat package for the geostatistical analysis (Pebesma, 2004) and on the sp package for spatial data handling and visualization (Pebesma and Bivand, 2005).

In (Campalani et al., 2011) cokriging techniques were investigated for this application, but the strong mismatch between the experimental and fitted models in the cross and direct variograms upon coregionalization, even in the case of only two variables, suggested the adoption of a different interpolation method (or of a more complex automatic modelling of the variograms). In this study, External Drift Kriging (KED) was chosen as a safer estimator: variogram evaluation is done only on the target variable, while the covariates feed the regression that determines the expected value, i.e. the drift. This kriging technique seems to be more effective for this kind of application (Beelen et al., 2009), and merges the meteorological correlations with the spatial knowledge offered by the variogram. Though cokriging has been adopted in some cases with successful results for air quality modelling (Singh et al., 2011), KED seems to handle gridded covariates more conveniently, and is more widely recognized as a flexible and well-performing technique for unbiased estimation of spatially continuous features (Hengl et al., 2011). This method, however, requires that all the covariates can be evaluated at all the output locations; this was not true in our case for the AOT retrievals, which forced an intermediate interpolation of these data.
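To make the role of the drift concrete, the following is a minimal sketch of a single KED prediction written with the Python standard library only. It is not the gstat implementation used in the study: the exponential covariance, its parameters, the dense Gaussian-elimination solver and all function names are assumptions chosen for brevity.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def exp_cov(h, sill=1.0, rng=10.0):
    """Exponential covariance of the residuals (assumed model, no nugget)."""
    return sill * math.exp(-h / rng)


def ked_predict(points, values, drift, target, target_drift, cov=exp_cov):
    """Kriging with external drift at one target location.

    points: station coordinates; values: observations at the stations;
    drift: covariate values at each station; target_drift: the same
    covariates at the target, where they must all be available.
    """
    n = len(points)
    f = [[1.0] + list(d) for d in drift]      # constant term + covariates
    f0 = [1.0] + list(target_drift)
    p = len(f0)
    a = [[0.0] * (n + p) for _ in range(n + p)]
    b = [0.0] * (n + p)
    for i in range(n):
        for j in range(n):
            a[i][j] = cov(math.dist(points[i], points[j]))
        for k in range(p):                    # unbiasedness constraints
            a[i][n + k] = f[i][k]
            a[n + k][i] = f[i][k]
        b[i] = cov(math.dist(points[i], target))
    for k in range(p):
        b[n + k] = f0[k]
    weights = solve(a, b)[:n]                 # drop the Lagrange multipliers
    return sum(w * z for w, z in zip(weights, values))
```

The constraint rows force the kriging weights to reproduce the covariates at the target exactly, which is why every covariate, AOT included, has to be known at every output location.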

The Preliminary Analysis section describes the preliminary treatment applied to the input data; the Multiple Linear Regression section describes the regression approach; finally, the Variogram Modelling and Kriging section supplies details on the residual variogram fitting for the kriging geostatistical model.

Preliminary Analysis

As a first important step, due to the high percentage of missing AOT pixels, a pixel-based filter is applied to the available AOT data¹ for the day under analysis to decide whether it is worth including in the model. AOT is going to be treated as an independent variable in the regression, just like any meteorological variable, and hence needs to be available at all the output locations: the fewer the AOT data available, the higher the uncertainty of the interpolated AOT input grid. The interpolation method chosen for the AOT maps is simple inverse distance: with such a high number of pixels (thousands or tens of thousands), more complex interpolators would have entailed excessive computational requirements. As a trade-off between AOT acceptance rates and uncertainty in the interpolated grid, a threshold of 30% on the minimum fraction of available pixels is set: if fewer pixels are available, AOT is discarded. Looking at Tab. 1, it can be observed that this threshold filters out approx. 90% of the available AOT maps: models on less complex terrain might admit a softer threshold, but this specific case requires strict conditions due to the high spatial variability of the aerosols. Fig. 5 shows an example of an AOT input dataset, along with its gap-filled grid.
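The filter and the gap filling described above can be sketched as follows. This is a simplified illustration that assumes the AOT map is a small 2D list with None for missing pixels and measures distances in pixel units; the function names, the power of 2 and the use of all valid pixels as neighbours are assumptions, not details stated in the study.

```python
import math


def available_fraction(grid):
    """Fraction of non-missing (not None) pixels in a 2D list."""
    cells = [v for row in grid for v in row]
    return sum(v is not None for v in cells) / len(cells)


def idw_fill(grid, power=2.0):
    """Fill missing pixels with an inverse-distance-weighted mean of valid ones."""
    known = [(r, c, v) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v is not None]
    filled = [list(row) for row in grid]
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v is None:
                # A missing pixel never coincides with a valid one,
                # so the distance below is always positive.
                weights = [(1.0 / math.hypot(r - kr, c - kc) ** power, kv)
                           for kr, kc, kv in known]
                total = sum(w for w, _ in weights)
                filled[r][c] = sum(w * kv for w, kv in weights) / total
    return filled


def prepare_aot(grid, min_fraction=0.30):
    """Return the gap-filled grid, or None when too few pixels are available."""
    if available_fraction(grid) < min_fraction:
        return None  # AOT is discarded for this day
    return idw_fill(grid)
```

A day whose map returns None would then be modelled with the meteorological covariates alone.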

If the AOT is accepted into the model, a Box-Cox analysis is carried out on the pixel values: AOT typically shows a highly positively skewed distribution, and this is even more evident in the products used in this model, which offer high spatial resolution and higher data availability at the cost of higher noise in the retrievals (e.g. near-cloud pixels). In this way the AOT data are dynamically transformed for normalization, depending on the specific daily distribution. Normalization² is highly recommended for the independent variables in a linear regression model (Faraway, 2002).
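A Box-Cox analysis of this kind can be sketched as a search for the exponent that maximizes the profile log-likelihood of the transformed values. The grid search, its bounds and the function names below are assumptions made for illustration; the input values must be strictly positive.

```python
import math


def boxcox(x, lam):
    """Box-Cox transform: log(x) for lam = 0, (x**lam - 1) / lam otherwise."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]


def boxcox_loglik(x, lam):
    """Profile log-likelihood of lam, up to an additive constant."""
    y = boxcox(x, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)


def best_lambda(x, grid=None):
    """Exponent on a grid (default -2..2, step 0.01) with the highest likelihood."""
    if grid is None:
        grid = [i / 100.0 for i in range(-200, 201)]
    return max(grid, key=lambda lam: boxcox_loglik(x, lam))
```

Because the exponent is re-estimated from each day's pixels, a log-like transform is applied on strongly skewed days and a nearly linear one on days with a more symmetric distribution.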

¹ The input AOT data might be the result of the concatenation of multiple adjacent granules.
² Which in this case is meant as a transformation that brings the data density closer to a Normal (i.e. Gaussian) distribution.


Fig. 5: Example of (log transformed) AOT daily aggregation of 2 consecutive granules (left) and the corresponding interpolated map actually used for the kriging prediction (right). Optical thickness values are scaled by a factor of 1000.

Meteorological data did not show a particularly skewed distribution on average, hence they were not transformed (Faraway, 2002). However, an offline analysis was carried out to evaluate the multicollinearity amongst the meteorological predictors: the initial set of available auxiliary data included additional variables, specifically TEMPerature (TEMP), easting and northing geographical coordinates (X and Y), a Digital Elevation Model (DEM) and a map of yearly averages of remotely sensed Night Lights (NL) (taken from http://www.ngdc.noaa.gov/dmsp/downloadV4composites.html). As shown in Tab. 2, temperature and elevation showed a strong collinearity (|r| ≈ 0.8) and were thus removed; geographical location also showed a fair collinearity with these variables (0.5 < |r| < 0.7) and, since kriging is going to account for the relative positions of the stations, the coordinates were removed from the model as well. Although night lights were clearly uncorrelated with all the other covariates, they strongly biased the regression with relatively high β weights, forcing each daily prediction to be a kind of scaled map of night lights: the locations of PM measuring sites are indeed strongly biased towards areas with suspected high air pollution, which often coincide with areas of high human presence (and consequently of night lights).

Table 2: Analysis of multicollinearity amongst the complete set of available gridded input predictors.

        PRESS      RH    TEMP       U       V       W       X       Y     DEM      NL
PBLH    0.394  -0.675   0.563   0.033   -0.06  -0.061   0.273   0.265  -0.367   0.071
PRESS           -0.55   0.918   0.046   0.008   0.357   0.592   0.804  -0.851   0.182
RH                     -0.784    0.13   0.136   0.112  -0.301  -0.366   0.512  -0.112
TEMP                           -0.055  -0.032   0.133   0.503   0.694  -0.799    0.18
U                                       0.064  -0.001    0.02   0.114  -0.037  -0.009
V                                               0.183  -0.178   0.076   0.044   0.012
W                                                       0.216   0.377  -0.314   0.037
X                                                               0.355  -0.639  -0.013
Y                                                                      -0.618   0.034
DEM                                                                            -0.392
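A screening of this kind can be sketched by computing the pairwise Pearson correlations of the predictors and listing the pairs above a chosen magnitude. The cut-off of 0.7 and the function names are assumptions for illustration; the study does not state a formal threshold.

```python
import math


def pearson(a, b):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def collinear_pairs(columns, threshold=0.7):
    """List (name_a, name_b, r) for predictor pairs with |r| >= threshold.

    columns: dict mapping a predictor name to its values at the same locations.
    """
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs
```

Only the upper triangle is visited, mirroring the layout of Table 2.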

In addition, due to the evident clustering of nearby ground measuring stations, especially in the north-eastern area around the capital city (see Fig. 1), a declustering procedure is carried out prior to every analysis. At the cost of negligible changes in the relative distances of the stations with respect to the output map scale, the measurements were overlaid onto a 2×2 km² grid: the locations of the stations were translated to the centre of the corresponding pixel, and the measurements falling inside the same pixel were averaged. In this way, stations very close to each other were grouped into a single virtual site.
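The declustering step can be sketched as a snap-to-grid followed by an average per cell. The sketch assumes projected coordinates in metres and a grid origin at (0, 0); both, like the function name, are illustrative assumptions.

```python
from collections import defaultdict


def decluster(stations, cell=2000.0):
    """Merge stations falling in the same grid cell into one virtual site.

    stations: iterable of (x, y, value) with x and y in metres.
    Returns a list of (x_centre, y_centre, mean_value), one per occupied cell.
    """
    cells = defaultdict(list)
    for x, y, value in stations:
        key = (int(x // cell), int(y // cell))  # index of the containing cell
        cells[key].append(value)
    return [((i + 0.5) * cell, (j + 0.5) * cell, sum(v) / len(v))
            for (i, j), v in sorted(cells.items())]
```

With a 2 km cell, a station moves by at most about 1.4 km (half the cell diagonal), which is small compared with the typical distances between stations at the national scale.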

Afterwards, the whole set of data was standardized to unit standard deviation so that computations involve numbers of approximately the same scale: for example, raw AOT values usually do not exceed an optical depth of 1, whereas pressure values are on the order of tens of thousands of pascals. Finally, before proceeding to the regression analysis, the data were reprojected to a common 2D projection, in our case the Universal Transverse Mercator (UTM) projection, zone 33 North: Cartesian coordinate systems, unlike ellipsoidal ones, speed up the many distance computations required in the estimation of the variogram and in the prediction process.
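The scaling step can be sketched as below. The text only states that the data were brought to unit standard deviation, so centring is left optional here and switched off by default; that choice, the sample standard deviation (n − 1 denominator) and the function name are assumptions.

```python
import math


def scale_to_unit_sd(values, center=False):
    """Divide by the sample standard deviation; optionally subtract the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    shift = mean if center else 0.0
    return [(v - shift) / sd for v in values]


# Hypothetical pressure values in Pa, before and after scaling.
pressure = [100000.0, 100200.0, 100400.0]
scaled = scale_to_unit_sd(pressure)
```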

Multiple Linear Regression

After the proper transformation of the skewed AOT measurements (when accepted into the model) and the scaling of the whole set of covariates, the predictors were overlaid onto a common Austria-shaped grid at 1×1 km² resolution and co-located with the PM10 ground stations for the ordinary least-squares regression. As similarly done in (Beelen et al., 2009), a stepwise procedure based on the Akaike Information Criterion (AIC) was applied a posteriori to the regression in order to keep only the most relevant predictors.

Before that, a regression analysis was carried out over the three years of available data to understand how to optimize the regression fit with the minimum number of degrees of freedom. A first evaluation concerned the benefit of including the intercept in the model: the performance was considerably better without the intercept, with an average goodness-of-fit (R²) of 0.95 against a value of 0.43 for the corresponding analysis with the intercept. Secondly, the effect of removing multicollinear covariates on the regression performance was evaluated: at the cost of a negligible decrease in the model fit, temperature, DEM, coordinates and night lights could be safely removed from the set of independent variables, significantly reducing the regression errors that could derive from such a large set of inputs. After this removal, the signs of the average β regression weights also agreed with the expected directions: for instance, PBLH turned from an average normalized β of 0.163 to −0.155, and PRESS similarly went from −0.57 to 0.221. PM concentrations should in fact be higher at lower altitudes, where anthropogenic installations are located, and lower boundary layer heights should push the aerosols towards ground level.
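A no-intercept least-squares fit with AIC-based backward elimination can be sketched as follows. This is not the procedure as implemented in the study: the normal-equations solver, the Gaussian AIC written as n·ln(RSS/n) + 2k, the backward direction of the search and all names are assumptions made for a compact illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_aic(y, columns):
    """Least squares without intercept; returns (coefficients by name, AIC)."""
    names = list(columns)
    n = len(y)
    xtx = [[sum(p * q for p, q in zip(columns[a], columns[b])) for b in names]
           for a in names]
    xty = [sum(p * q for p, q in zip(columns[a], y)) for a in names]
    beta = solve(xtx, xty)
    fitted = [sum(bk * columns[k][i] for bk, k in zip(beta, names))
              for i in range(n)]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return dict(zip(names, beta)), n * math.log(rss / n) + 2 * len(names)


def backward_stepwise(y, columns):
    """Drop one predictor at a time while doing so lowers the AIC."""
    current = dict(columns)
    best = ols_aic(y, current)[1]
    while len(current) > 1:                   # always keep at least one predictor
        trials = {name: ols_aic(y, {k: v for k, v in current.items()
                                    if k != name})[1]
                  for name in current}
        drop = min(trials, key=trials.get)
        if trials[drop] >= best:
            break                             # no removal improves the AIC
        best = trials[drop]
        del current[drop]
    return ols_aic(y, current)[0]
```

Each removal saves two AIC units for the dropped parameter, so a predictor survives only if it reduces the residual sum of squares enough to pay for itself.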

An evaluation of the proper use of wind information was carried out as well: is it better to use the three components as separate independent variables, or to combine them into a single 2D or 3D vector and use its magnitude as wind data? The analysis showed that the horizontal intensity of the wind ($\sqrt{U^2 + V^2}$) was generally preferred over the 3-dimensional one ($\sqrt{U^2 + V^2 + W^2}$) for explaining the ground PM variability; however, no significant changes were observed in the model performance, so, for more flexibility in the choice of predictors, the single components were kept separate.
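For reference, the two wind magnitudes compared above can be computed as follows; the function names are illustrative.

```python
import math


def wind_speed_2d(u, v):
    """Horizontal wind intensity from the easting (U) and northing (V) components."""
    return math.hypot(u, v)


def wind_speed_3d(u, v, w):
    """Full wind intensity, including the vertical component W."""
    return math.sqrt(u * u + v * v + w * w)
```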

Lastly, a study on the normalization of the AOT measurements was carried out. Previous works, for instance (Tsai et al., 2011), confirmed the intuitive idea of normalizing AOT by the height of the boundary layer to obtain a better proxy from the columnar information to the surface-level particulate matter. In our study we analyzed the correlation between PM and raw AOT, Box-Cox transformed AOT and AOT normalized by each of the meteorological variables described in the section on meteorological data, for the three years of available data. PBLH is in fact not the only candidate for normalization in the AOT-PM relationship: high levels of humidity increase the backscattered reflectance of hygroscopic aerosols, while PM is measured as dry mass; at lower pressure levels (higher altitudes) the atmospheric mixing volume is compressed and hence there is probably a higher agreement between columnar and surface-level aerosols; finally, low wind intensities imply a closer relationship between an instantaneous observation and the daily-averaged measurements made by the ground sites.

The study also analyzed the relationship while varying the size of the discs within which the AOT pixels are averaged, to overcome the support difference between the PM time averages and the AOT instantaneous retrievals (Ichoku et al., 2002). As depicted in Fig. 6, the normalization by means of PBLH curiously produced the worst average correlations, whereas relative humidity and temperature seemed to be the best on average. Although in general the normalization did not seem to benefit the AOT-PM relationship, this might not hold on any single day, so we decided to run this kind of correlation-based analysis on the fly for each daily prediction and normalize the AOT pixels accordingly. Looking again at Fig. 6, it is clear that the optimal averaging disc for the AOT pixels is close to a single point: perhaps, even though the temporal support of the two measurements differs considerably, the highly mountainous topography inhibits significant transport of aerosols over time. In any case the correlations are quite weak, which also confirms how complex this application can be over such a distinctive terrain, as anticipated in the Introduction. On the other hand, this result made it possible to use the AOT information without, for example, a moving-window averaging filter to match the support of the PM measurements, which would have inevitably reduced the resolution of the data.
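The daily correlation-based choice can be sketched as follows: AOT is averaged within a disc around each station, divided by each candidate meteorological variable, and the variant most correlated with PM is kept. Selecting by the highest signed r (a positive AOT-PM relationship being the expected one), the data layout and the function names are all assumptions.

```python
import math


def pearson(a, b):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def disc_mean(pixels, x0, y0, radius):
    """Mean AOT of the pixels whose centres lie within radius of (x0, y0).

    pixels: iterable of (x, y, aot). Returns None if the disc is empty.
    """
    inside = [v for x, y, v in pixels if math.hypot(x - x0, y - y0) <= radius]
    return sum(inside) / len(inside) if inside else None


def best_aot_variant(pm, aot, meteo):
    """Return (name, r) of the raw or normalized AOT best correlated with PM.

    pm, aot: values co-located at the stations; meteo: dict mapping a
    variable name to its co-located values, used as the divisor.
    """
    candidates = {"AOT": aot}
    for name, values in meteo.items():
        candidates["AOT/" + name] = [a / m for a, m in zip(aot, values)]
    scores = {name: pearson(pm, vals) for name, vals in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A Box-Cox transformed AOT could be added to the candidates in the same way.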

[Figure 6. First panel, "PM10/AOT linear correlation, average 2008-2010": r (0 to 0.25) for AOT, boxcox(AOT), AOT/PBLH, AOT/RH, AOT/TEMP, AOT/WIND and AOT/PRESS, with one series per averaging radius (1, 5, 10, 20 and 30 km). Second panel, "PM10/AOT linear correlation, different averaging radius": r (0 to 0.25) against radius [km] (0 to 30), with one series per AOT variant.]

Fig. 6: Average linear correlation between PM ground measurements and co-located raw, transformed and normalized AOT measurements for different averaging disc sizes, under two different groupings.

In summary, when AOT is included in the model, the multiple linear regression can rely either on raw (transformed) AOT or on normalized AOT, depending on which meteorological normalization better translates the columnar aerosols to ground-level concentrations. The remaining meteorological inputs are then also included in the regression. When AOT is filtered out because of an insufficient number of available pixels (see Sect. ), the whole set of meteorological data is simply used, without preliminary analysis. In both cases, irrelevant predictors are subsequently discarded via stepwise elimination.
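A minimal sketch of the daily choice between raw and normalized AOT is given below. It assumes that the variant with the highest Pearson correlation with the co-located PM values is selected; the exact selection criterion, the function names and the sample numbers are assumptions made for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_aot_proxy(pm, aot, meteo):
    """Return (name, r) of the AOT variant best correlated with PM.

    meteo maps a variable name (e.g. "PBLH") to its co-located values,
    which are assumed to be non-zero because they are used as divisors.
    """
    candidates = {"AOT": aot}
    for name, values in meteo.items():
        candidates["AOT/" + name] = [a / m for a, m in zip(aot, values)]
    scores = {name: pearson_r(series, pm) for name, series in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical co-located values at four stations for one day.
pm = [10.0, 20.0, 30.0, 40.0]
aot = [0.05, 0.20, 0.45, 0.80]
meteo = {"PBLH": [500.0, 1000.0, 1500.0, 2000.0],
         "RH": [80.0, 60.0, 70.0, 50.0]}
print(best_aot_proxy(pm, aot, meteo))
```

The selected variant would then enter the regression together with the remaining meteorological inputs, before the stepwise elimination.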


Fig. 7: Visual comparison of kriging maps of daily PM10 concentrations without localizing the neighbor search (left side) or by selecting the nearest 50 PM measurements for the kriging weighting (right side): although stationarity assumptions are more plausible with localized kriging, it caused evident artifacts in the final predictions.

Variogram Modelling and Kriging

After defining the regression formula, the isotropic spatial autocorrelation of the regression residuals is analyzed via an automatic variogram fitting procedure, which chooses the best variogram model among a finite set of models, namely spherical, exponential, Gaussian and Matern (M. Stein's parametrization) for our study. Although from the perspective of the model additional samples will always improve the estimation regardless of their distance from the unsampled location, this is not always true in reality: the stationarity assumptions of the variogram model do not justify the use of very wide search windows, whereas they become more plausible when the search is restricted to the closest samples, i.e. when using localized kriging (Isaaks and Srivastava, 1989). For this reason, and for a lighter computational burden as well³, a study over the three years of data was performed to find an appropriate number of neighbouring locations to involve in a kriging prediction. For this specific pattern of measuring sites and measurements, including the 50 nearest PM10 measurements in the kriging weighting was found to minimize the cross-validation RMSE of the final predictions: the average statistic decreased from 5.359, when all stations are considered, to 5.226. After previewing the prediction maps, however, we decided to remove this constraint and let kriging include the whole set of available stations in a prediction, to avoid the formation of evident artifacts, as can be appreciated in Fig. 7: this required more computational resources but produced smoother filled maps.

³ Ordinary kriging has O(N³) complexity.
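The automatic model selection can be illustrated with the sketch below, which compares candidate models by their residual sum of squares against a sample variogram. It is a simplified stand-in and not the fitting procedure used in the study: the coarse grid search replaces a proper optimizer, the Matern model is omitted because it requires a Bessel function that the Python standard library does not provide, and the sample variogram is synthetic.

```python
import math

def spherical(h, nugget, psill, rng):
    if h >= rng:
        return nugget + psill
    return nugget + psill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)

def exponential(h, nugget, psill, rng):
    return nugget + psill * (1.0 - math.exp(-h / rng))

def gaussian(h, nugget, psill, rng):
    return nugget + psill * (1.0 - math.exp(-(h / rng) ** 2))

MODELS = {"spherical": spherical, "exponential": exponential, "gaussian": gaussian}

def rss(model, params, sample):
    """Residual sum of squares between a model and (lag, semivariance) pairs."""
    return sum((model(h, *params) - g) ** 2 for h, g in sample)

def select_model(sample, nuggets, psills, ranges):
    """Coarse grid search; returns (rss, model name, (nugget, psill, range))."""
    best = None
    for name, model in MODELS.items():
        for n in nuggets:
            for s in psills:
                for r in ranges:
                    score = rss(model, (n, s, r), sample)
                    if best is None or score < best[0]:
                        best = (score, name, (n, s, r))
    return best

# Synthetic sample variogram generated from an exponential model (lags in metres, h > 0).
lags = [10000.0 * k for k in range(1, 16)]
sample = [(h, exponential(h, 0.4, 0.6, 40000.0)) for h in lags]
score, name, params = select_model(sample, nuggets=[0.2, 0.4], psills=[0.6, 0.8],
                                   ranges=[20000.0, 40000.0, 60000.0])
print(name, params, score)
```

All formulas are valid for lags greater than zero; by definition the semivariance at zero distance is zero.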


The geostatistical model is now set up, so that PM10 concentrations can be predicted at each unsampled location of the Austria-shaped output grid: block kriging was used to obtain averaged predictions over squares of 1 km side in the UTM projected plane. The underlying mathematical model also makes it possible to evaluate the uncertainty of each prediction, so that predictions associated with a high theoretical error can be discarded: it is suggested that pixels whose prediction uncertainty is higher than the total variance of the target variable be considered highly uncertain and, as a rule of thumb, that a prediction map be discarded when more than half of its pixels exceed this threshold (Hengl, 2007). After predicting the PM10 concentrations, the values are back-transformed to their original scale and finally stored as GeoTIFF files for further investigation.
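The rule of thumb on highly uncertain pixels can be written down as a short sketch. It assumes that the prediction uncertainty is expressed as the kriging variance of each pixel; the function names and the sample values are hypothetical.

```python
def uncertain_fraction(kriging_variances, target_variance):
    """Fraction of pixels whose kriging variance exceeds the variance of the target."""
    flags = [v > target_variance for v in kriging_variances]
    return sum(flags) / len(flags)

def keep_map(kriging_variances, target_variance, max_fraction=0.5):
    """Keep the map unless more than max_fraction of its pixels are highly uncertain."""
    return uncertain_fraction(kriging_variances, target_variance) <= max_fraction

# Hypothetical kriging variances of four pixels, target variance of 1.0.
variances = [0.5, 1.5, 2.0, 0.8]
print(uncertain_fraction(variances, 1.0), keep_map(variances, 1.0))
```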

Results

This section describes the results of the three-year test of the geostatistical model for daily-averaged PM10 filled maps. To assess the performance of the KED estimator, the Root Mean Square Error (RMSE), Mean Error (ME) and Pearson's r coefficient based on Leave-One-Out (LOO) cross-validation were evaluated. To allow a relative comparison with other interpolation methods, the KED statistics were compared both to a mechanical inverse distance interpolator with an inverse power of 4 (arbitrarily chosen, due to the high spatial gradients of the target variable) and to the multiple linear regression alone⁴. In this way we could evaluate whether the complexity of the model is worthwhile with respect to a simple mechanical univariable interpolator on the one hand, and check the effects of the variogram-based kriging adjustments on the estimated drift on the other.
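As an illustration of the validation scheme, the sketch below runs a leave-one-out cross-validation of the inverse distance interpolator with power 4 and returns RMSE, ME and Pearson's r. The sign convention of the error (predicted minus observed), the function names and the station values are assumptions; any other predictor with the same call signature could be passed in place of `idw_predict`.

```python
import math

def idw_predict(target_xy, stations, power=4):
    """Inverse distance weighted prediction at target_xy from (x, y, value) tuples."""
    num = den = 0.0
    for x, y, value in stations:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value  # exact interpolator at a sampled location
        w = d ** -power
        num += w * value
        den += w
    return num / den

def loo_statistics(stations, predict=idw_predict):
    """Leave-one-out RMSE, ME (predicted minus observed) and Pearson's r."""
    obs, pred = [], []
    for i, (x, y, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]  # drop the station being predicted
        obs.append(value)
        pred.append(predict((x, y), others))
    n = len(obs)
    errors = [p - o for p, o in zip(pred, obs)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    me = sum(errors) / n
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    r = cov / math.sqrt(sum((o - mo) ** 2 for o in obs) * sum((p - mp) ** 2 for p in pred))
    return rmse, me, r

# Hypothetical stations: (x [m], y [m], PM10 [µg/m³]).
stations = [(0.0, 0.0, 18.0), (4000.0, 1000.0, 25.0), (9000.0, 3000.0, 31.0),
            (2000.0, 7000.0, 22.0), (12000.0, 8000.0, 40.0)]
print(loo_statistics(stations))
```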

In Sect. the regression results are presented; variogram models and cross-validation statistics are then discussed in Sect. and Sect. , respectively.

Regression coefficients and goodness-of-fit

As a first analysis, the effect of Box-Cox transformations on the AOT pixels was investigated, to confirm the positive effect of distribution normalization on the traditionally positively skewed AOT pixels: on the 18% of days where AOT had passed the 30%-pixels constraint, AOT was kept in the predictor set by the stepwise AIC-based elimination in 43% of the cases for raw optical depths, and in 50% of the cases for transformed values. When meteorological normalizations of the AOT were considered as well, the rate of inclusion in the model rose to 70%. The linear correlation coefficients between co-located PM10 measurements and AOT pixels, however, showed a negligible decrease on average, from 0.152 to 0.145, when the Box-Cox power transform was applied: while this may be due to the fact that the PM10 values were not transformed, the standard deviation of the r statistic was relatively high (around 0.2), hence the small decrease in correlation should not be considered significant. The addition of the AOT information did not have any effect on the average adjusted R² of the regression, which remained around the (excellent) value of 0.95: this probably means that the meteorological variables alone are enough to explain the variance of the PM measurements. Figure 8 depicts the results of the analysis, showing how the AOT-PM correlations vary strongly from day to day, from negative values up to approximately 0.6.

⁴ The very same multiple linear regression that is computed internally to estimate the drift in the kriging model.

[Figure 8. Upper panel, "Linear correlation AOT/PM, 2008": r(AOT:PM) and r(bxcx(AOT):PM) against day (1 to 361), with r from -0.150 to 0.750 on the primary axis and %AOT on the secondary axis. Lower panel, "Goodness-of-fit, 2008": regr_R^2 and regr.aot_R^2 against day, from 0.800 to 1.000.]

Fig. 8: Above: time series of linear correlations of PM10 ground measurements with raw AOT (dashed blue line) and Box-Cox transformed AOT (continuous blue line). AOT data availability (% over the area of interest) is also shown as a yellow line referring to the secondary (right) Y axis. Below: comparison of the regression R² with (blue segments) and without (red line) AOT in the set of independent variables, for the year 2008. Relative Julian days are given on the abscissa.
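For reference, the Box-Cox power transform applied to the AOT pixels can be sketched as below, together with its inverse. How the parameter λ (here `lam`) was chosen in the study is not stated in this section, so the value used in the example is purely hypothetical.

```python
import math

def box_cox(values, lam):
    """Box-Cox power transform; reduces to the natural logarithm when lam == 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def inverse_box_cox(values, lam):
    """Back-transform to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]

# Hypothetical, positively skewed AOT values and a hypothetical lambda.
aot_pixels = [0.05, 0.08, 0.10, 0.20, 0.80]
transformed = box_cox(aot_pixels, 0.5)
print(transformed)
print(inverse_box_cox(transformed, 0.5))
```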

The ad-hoc normalization of the transformed AOT data showed that PBLH was in the end one of the most frequently selected choices (23% of the cases over the three years), although its average statistics proved to be poor, as seen in Sect. . This might suggest that the mixing volume is a determining factor in the AOT-PM association, but only in extreme cases, e.g. particularly high layer heights or elevated aerosol layers that are not captured by the monitoring stations and are properly removed by the PBLH normalization. Relative humidity was chosen even more frequently as a proxy from AOT to PM (28% of the cases), whereas in 27% of the cases the AOT information was better left as is, without normalization. In the remaining days the other meteorological variables were used, mainly the northward wind component V (8.5%) and pressure (7%). No particular seasonal pattern was observed; however, this analysis deserves a much wider investigation because of its relevance in the model.


The final three-year ad-hoc regression performances revealed that PRESS was the major predictor in the PM10 explanation, with an exclusion rate of only 20%, not counting the cases where PRESS was hidden inside the AOT normalization (which reduce the rate by approximately 1-2%). This confirmed the important role of the topography, which inevitably drives the pressure levels, for air quality predictions over such a complex terrain. The normalized β coefficients of PRESS were also the most stable among the predictors, with a standard deviation of 0.17 on an average value of 0.24. The other predictors, namely AOT (including all the possible ad-hoc normalizations), PBLH, RH, U, V and W, showed much more uncertain roles in the regression, with standard deviations between 0.3 and 0.5 and near-zero mean normalized β values. With due care given these high deviations, PBLH confirmed a negative direction in the regression, as expected. All the results were significant at the 0.05 level on average over the three years of analysis.

The most remarkable result of the multiple linear regression adopted for this application is the very high (adjusted) goodness-of-fit, or R², with a median value of 0.93 over the whole three years. As listed in Tab. 3, the performance tends to increase during the warm season, especially with respect to the first four months of the year. This may be explained by the much higher participation of AOT information in the regression during summer (approximately 50% of the days with AOT in the model are concentrated in the warm season), which may be a more powerful way to model the PM10 than the meteorological maps alone.

Table 3: Median adjusted R² for the stepwise multiple linear regression with ad-hoc AOT normalization, over the years 2008 to 2010.

             2008    2009    2010    overall
Whole year   0.931   0.933   0.929   0.931
Jan-Apr      0.907   0.917   0.938   0.921
May-Aug      0.952   0.946   0.937   0.945
Sep-Dec      0.929   0.933   0.918   0.927
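The adjusted R² reported in Tab. 3 penalizes the plain R² for the number of predictors retained by the stepwise elimination. A minimal sketch of the standard formula follows; the number of stations and of predictors in the example are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R² for a linear model with an intercept and n_predictors regressors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical day: 100 stations and 6 retained predictors.
print(round(adjusted_r2(0.95, 100, 6), 4))
```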

Residual variogram models

External-drift kriging internally computes the expected mean value of the PM10 random process by means of the multiple linear regression studied in the previous section. The spatial autocorrelation of the regression residuals is then analyzed and modeled by fitting a variogram to the experimental semivariances, accounting for the relative distances between the site locations.

As described in Sect. , the variogram fitting was carried out by an automated selection among a set of models, choosing the one that minimizes the residual sum of squares with respect to the sample variogram. The analysis over the three years showed a clear prevalence of the Matern model (M. Stein's parametrization), chosen in 70% of the days. The sample variograms of the remaining days were instead fitted with Gaussian and spherical models (15% each). Looking at Fig. 9, we can conclude that there is usually a relatively high spatial gradient in the PM10 residuals even between nearby measuring stations, as indicated by the steep slope of the fitted Matern variogram near the origin: despite the strong goodness-of-fit of the regression, there is still some misspecification in the regression that creates spatial patterns in the residuals. This is also confirmed by the relevant nugget component in the models, 0.4 on average (over scaled residuals), with higher values in the warm months (0.55).

[Figure 9. Three panels of fitted variograms against distance [m] (50000 to 150000): Matern (Stein's parametrization), Gaussian and Spherical.]

Fig. 9: Examples of the three fitted variogram models that were primarily chosen over the three years of analysis.

[Figure 10. "Variogram models, 2010": nugget and psill (0 to 3) against day (January to December).]

Fig. 10: Trend of partial sill (black line) and nugget effect (green line) of the fitted variogram models for the year 2010.

Looking at Fig. 10, however, it can be seen that in the warm season the higher nugget effect is associated with lower values of the partial sill, resulting in variogram models close to pure nugget: in these cases the kriging adjustments to the regression would not be needed. The spatial pattern of the residuals observed during the cold season may be due to the lack of AOT information, which creates a misspecification in the regression, so that the residuals tend to express a spatial trend.
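One simple way to flag days with a close-to-pure-nugget model is the ratio between the nugget and the total sill, sketched below. The 0.9 cut-off and the partial sill values in the example are hypothetical; no such threshold is defined in the study.

```python
def nugget_ratio(nugget, psill):
    """Share of the total sill (nugget + partial sill) taken by the nugget."""
    return nugget / (nugget + psill)

def is_near_pure_nugget(nugget, psill, threshold=0.9):
    # threshold is a hypothetical cut-off, not a value taken from the study
    return nugget_ratio(nugget, psill) >= threshold

print(is_near_pure_nugget(0.55, 0.05))  # high nugget, small partial sill
print(is_near_pure_nugget(0.40, 0.60))  # clear spatial structure in the residuals
```

When the flag is raised, the residuals carry almost no spatial structure and the kriging step adds little to the regression surface.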


Finally, the range of the fitted models did not show any seasonal pattern, with a high standard deviation (approx. 50 km) and a mean value of 40 km. No direct conclusions on the variability of the PM10 ground measurements can be drawn, though, since the variograms were evaluated on the regression residuals instead of the measurements themselves.

Cross-validation results

As a final and crucial means of model evaluation, a LOO cross-validation was undertaken for each daily prediction by analysing the back-scaled residuals of our models: to appreciate relative performances, the validation was also computed for the drift residuals alone and for a mechanical inverse distance interpolation. The three-year median values of the output statistics, namely RMSE, ME and Pearson's r, are shown in Tab. 4 and Tab. 5, whereas the time series of these data are shown in Figs. 11, 12 and 13.

Looking at Tab. 4, we can appreciate the performance increase of the model when the meteorological and spaceborne auxiliary information is used for the air quality predictions: the IDW interpolator always yields the worst statistics; the linear regression model alone maximizes the correlation between measured and predicted PM10 (0.88 on average) and also minimizes the bias (∼ 10⁻³ µg/m³); the effect of kriging, however, can be appreciated in the lower median RMSE, which is around 5.57 µg/m³ over the three years of analysis. Furthermore, the kriging adjustments to the regression surface are necessary to make the model aware of geographical location. The seasonality of the model performance observed in the previous section can also be seen in Tab. 5 and in Fig. 11: similarly to the linear drift, the kriging predictions are closer to the actual PM10 measurements during summer, as was observed in (Campalani et al., 2011) over Emilia Romagna (Italy). This might also be caused by the higher variance of the PM10 concentrations during the cold months, which gives rise to outliers that are harder to predict.

Although a consistent comparison cannot easily be made with other studies carried out over different areas, the achieved performance seems in line with what was already obtained, e.g., by Emili et al. (2011b) (mean RMSE of 10.6 µg/m³ and ME of 1.4 µg/m³ over two years in the alpine regions) and by Janssen et al. (2008) (mean RMSE of 9.89 µg/m³ and ME of 0.01 µg/m³ over three years in Belgium). The relative differences in cross-validation statistics with respect to these works must be taken with due care: for instance, the absolute RMSE in (Janssen et al., 2008) is indeed higher than in our results, but the IDW interpolator in (Janssen et al., 2008) produced a mean RMSE of 12.12 µg/m³, hence their proposed model improved on it by 18%, whereas we found an 11% RMSE gain over IDW.
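The two relative gains quoted above can be checked with a few lines of arithmetic. The sketch assumes that the gain of this study is computed from the mean of the yearly median RMSE values listed in Tab. 4; this averaging is an assumption, since the exact computation is not spelled out in the text.

```python
def relative_gain(rmse_reference, rmse_model):
    """Relative RMSE reduction of a model with respect to a reference interpolator."""
    return (rmse_reference - rmse_model) / rmse_reference

# Janssen et al. (2008): IDW versus their proposed model.
janssen_gain = relative_gain(12.12, 9.89)

# This study: yearly median RMSE of IDW and KED for 2008, 2009 and 2010 (Tab. 4).
idw = [6.076, 5.988, 6.653]
ked = [5.498, 5.210, 6.008]
own_gain = relative_gain(sum(idw) / 3, sum(ked) / 3)

print(f"{janssen_gain:.1%} {own_gain:.1%}")  # 18.4% 10.7%
```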

As a final remark, almost no highly uncertain pixels were produced in the kriging output maps (in the order of 10⁻¹ % on average), with a slightly higher percentage over the warm months: this is probably because this evaluation is relative to the variance of the daily PM10 measurements, which is typically lower in that period of the year.

Table 4: LOO cross-validation median statistics for daily PM10 predictions using Inverse Distance Weighting (IDW), Linear Model (LM) and External-Drift Kriging (KED), for the three years of analysis.

        RMSE [µg/m³]            ME [µg/m³]                 Pearson's r
        IDW     LM      KED     IDW      LM       KED      IDW     LM      KED
2008    6.076   5.620   5.498   -0.702   -0.002   -0.059   0.611   0.878   0.796
2009    5.988   5.268   5.210   -0.700    0.002   -0.046   0.630   0.884   0.821
2010    6.653   6.276   6.008   -0.534   -0.001   -0.053   0.614   0.873   0.798

Table 5: LOO cross-validation median RMSE for daily PM10 predictions using the External-Drift Kriging (KED) estimator, for different periods of the year.

        Jan-Apr   May-Aug   Sep-Dec
2008    7.588     3.926     6.375
2009    6.624     4.019     5.531
2010    7.496     4.398     6.631

[Figure 11. Three panels, "Cross-Validation RMSE" for 2008, 2009 and 2010: RMSE_idw, RMSE_lm and RMSE_ked (0 to 20) against time (January to December).]

Fig. 11: Temporal trend of the LOO cross-validation RMSE for Inverse Distance Weighting (IDW, fine-dashed line), Linear Model (LM, dashed line) and External-Drift Kriging (KED, continuous line), for 2008, 2009 and 2010 respectively.

Conclusions and Future Work

In this study, a kriging-based geostatistical model for daily PM10 spatial filling has been proposed. The model adaptively uses the available satellite AOT information when the sky and terrain conditions permit a sufficient percentage of AOT to be retrieved by the remote sensor. Model-based meteorological maps are further used as drivers for the PM10 explanation and for the AOT-PM relationship as well: daily ad-hoc analyses are undertaken to normalize the AOT information for an optimized translation from columnar to surface-level aerosols. The study was carried out over the years 2008, 2009 and 2010 over the Austrian region; however, the validity of the model is not tied to a specific geographic area.

The cross-validation results showed an increase in prediction power of about 11% on average with respect to a simple inverse distance interpolator. The study focused on the possibility of using remotely sensed AOT retrievals in such an application and, although the overall results were satisfactory, with a remarkable average goodness-of-fit of the drift surface (0.93), the role of AOT is still not clear: on the one hand, the need for AOT over the whole output prediction grid forced the majority of satellite granules to be discarded; on the other hand, the driving force in the estimation of PM10 actually seemed to be pressure. This is probably due to the complex, rugged topography, which heavily constrains the transport of aerosols. The complexity of the terrain and of the wind profiles of the Austrian region, along with the difference in spatio-temporal support between the ground measurements of PM10 and the atmospheric AOT observations, may be the cause of their highly variable linear correlations, which sometimes cause the AOT information to be excluded from the regression due to its negligible contribution. Model performance, however, is generally better during the hot season, and this might be rooted either in the far higher availability of AOT pixels or in the smaller variance of the ground measurements. Regardless of the availability of satellite aerosol information, the proposed model adapts dynamically to the daily available datasets and data distributions, and can produce a map of ground-level particulate matter with limited resources (a few minutes of computation on an Intel(R) Core(TM)2 Duo [email protected] with 6 GB of RAM).

[Figure 12. Three panels, "Cross-Validation ME" for 2008, 2009 and 2010: ME_idw, ME_lm and ME_ked (about -2 to 1) against time (January to December).]

Fig. 12: Temporal trend of the LOO cross-validation ME for Inverse Distance Weighting (IDW, fine-dashed line), Linear Model (LM, dashed line) and External-Drift Kriging (KED, continuous line), for 2008, 2009 and 2010 respectively.

Further investigations include the spatial and temporal analysis of the output maps produced over the three selected years: these could be interactively analysed with visualization tools (Pebesma et al., 2007) or analytically investigated by means of, e.g., the OGC Web Coverage Processing Service (WCPS) language via ingestion into a rasdaman database (Baumann, 2010).

[Figure 13. Three panels, "Cross-Validation Pearson's r" for 2008, 2009 and 2010: r_idw, r_lm and r_ked (0 to 1) against time (January to December).]

Fig. 13: Temporal trend of the LOO cross-validation Pearson's r correlations for Inverse Distance Weighting (IDW, fine-dashed line), Linear Model (LM, dashed line) and External-Drift Kriging (KED, continuous line), for 2008, 2009 and 2010 respectively.

Finally, several improvements that could be made to the proposed method are listed below:

- The high percentage of discarded AOT pixels suggests enlarging the temporal basis of the air quality predictions, at least over areas with limited satellite capabilities: aggregated annual or seasonal analyses might be preferable to associate the atmospheric aerosols with the ground particles more profitably. Even so, there is still a severe problem of AOT availability in the cold months, with very few available data even in monthly composites.

- Despite the overall good performance, the model still accumulates a considerable amount of uncertainty: the measurement errors of the ground sites, the uncertainty in the MODIS AOT inversion model and in the meteorological forecasts, the intermediate interpolations needed to obtain complete gridded covariates, the variogram model and the linear regression. Either reducing or removing the intermediate-stage grid interpolations by means of a different model could yield more reliable maps.

- The complex topography of the Austrian region required high-resolution AOT observations, obtained here by means of the PM MAPPER software products: these data, however, suffer from higher noise, e.g. on near-cloud pixels, hence an additional filter would be desirable, like the one proposed in (Emili et al., 2011a).

- A generally low correlation between the AOT pixels and the ground measurements of PM10 was still found, and this is probably caused by the disagreement in the spatio-temporal support of the datasets (Kassteele et al., 2006), in association with the high variability of air quality features in both the spatial and temporal domains (Emili et al., 2010): the availability of hourly measurements of particulate matter would probably lead to a closer AOT-PM relationship.

- Ground measuring sites are historically clustered around areas at high risk of pollution, hence they represent a strongly biased input sample of the target variable: although a soft declustering process was undertaken in this study, more advanced techniques should be adopted to inform the model more properly.

- Variogram analysis was carried out without considering possible directional anisotropy, i.e. by considering only the relative distances among the data and not the relative angles: in the study region, however, the mountain faces probably force directional patterns in the particle distribution.

- The input datasets for each prediction were taken individually on a daily basis, without memory: an important and recently suggested approach (Finazzi et al., 2009; Hengl et al., 2011) is to consider the temporal dimension in the statistical model.

- Kriging variogram estimation relies on horizontal distance, which may not be appropriate in a mountainous area with large elevation differences, like our case study, where elevation differences between stations were in the order of 10²-10³ m: 3D predictions in a volume by means of variogram surfaces might improve the kriging predictions.

- The proposed method relies on multiple linear regression and kriging: although this is widely known as a good approach for the prediction of continuous fields, this particular application might require far more complex modelling, due to the support problem and the high variability of the target variable in both space and time.

- Despite the excellent goodness-of-fit, a more detailed regression study should be undertaken over such a large set of independent variables: possible non-linear transforms, inter-variable associations and generally more complex formulas could prevent misspecifications and spatial patterns in the residuals from arising.

Acknowledgements The authors acknowledge the PM MAPPER AOT products, which provided the high-resolution aerosol information inputs and were made possible by the various MODIS software development and support teams responsible for the production and distribution of the MODIS data; the Austrian air quality monitoring network, for the provision of the PM10 ground measurements; the IFS (Integrated Forecast System) of the ECMWF (European Centre for Medium-Range Weather Forecasts) and ZAMG, for the meteorological maps over the Austrian region; and the R-Sig-Geo community, for the continuous support on spatial modeling inside the R environment for statistical computing.


Bibliography

Al-Hamdan, M., Crosson, W., Limaye, A., Rickman, D., Quattrochi, D., Estes Jr, M., Qualters, J., Sinclair, A., Tolsma, D., Adeniyi, K., et al. Methods for characterizing fine particulate matter using ground observations and remotely sensed data: potential use for environmental public health surveillance. Journal of the Air & Waste Management Association, 59(7):865--881, 2009.

Baumann, P. The OGC web coverage processing service (WCPS) standard. Geoinformatica, 14(4):447--479, 2010.

Beelen, R., Hoek, G., Pebesma, E., Vienneau, D., de Hoogh, K., and Briggs, D. Mapping of background air pollution at a fine spatial scale across the European Union. Science of the Total Environment, 407(6):1852--1867, 2009.

Campalani, P., Nguyen, T. N. T., Mantovani, S., and Mazzini, G. On the Automatic Prediction of PM10 with in-situ measurements, satellite AOT retrievals and ancillary data. In ISSPIT, pages 93--98, 2011.

Campalani, P., Nguyen, T. N. T., Mantovani, S., Bottoni, M., and Mazzini, G. Validation of PM MAPPER aerosol optical thickness retrievals at 1×1 km² of spatial resolution. In Software, Telecommunications and Computer Networks (SoftCOM), 2011 19th International Conference on, Sept. 2011.

Emili, E., Popp, C., Petitta, M., Riffler, M., Wunderle, S., and Zebisch, M. PM10 remote sensing from geostationary SEVIRI and polar-orbiting MODIS sensors over the complex terrain of the European Alpine region. Remote sensing of environment, 114(11):2485--2499, 2010.

Emili, E., Lyapustin, A., Wang, Y., Popp, C., Korkin, S., Zebisch, M., Wunderle, S., and Petitta, M. High spatial resolution aerosol retrieval with MAIAC: Application to mountain regions. Journal of Geophysical Research, 116(D23):D23211, 2011a.

Emili, E., Popp, C., Wunderle, S., Zebisch, M., and Petitta, M. Mapping particulate matter in alpine regions with satellite and ground-based measurements: An exploratory study for data assimilation. Atmospheric Environment, 2011b.

Engel-Cox, J., Holloman, C., Coutant, B., and Hoff, R. Qualitative and quantitative evaluation of MODIS satellite sensor data for regional and urban scale air quality. Atmospheric Environment, 38(16):2495--2509, 2004.

Faraway, J. Practical Regression and ANOVA using R, 2002.

Finazzi, F., D’Ariano, C., Fasso, A., Mannarini, G., and Nicolis, O. Integrating satellite and ground level data for air quality monitoring and dynamical mapping. Proceedings of TIES, pages 5--9, 2009.

Gohm, A., Harnisch, F., Vergeiner, J., Obleitner, F., Schnitzhofer, R., Hansel, A., Fix, A., Neininger, B., Emeis, S., and Schafer, K. Air pollution transport in an Alpine valley: Results from airborne and ground-based observations. Boundary-layer meteorology, 131(3):441--463, 2009.

Gupta, P. and Christopher, S. A. Particulate matter air quality assessment using integrated surface, satellite, and meteorological products: Multiple regression approach. Journal of Geophysical Research, 114(D14):D14205+, July 2009. doi: 10.1029/2008JD011496.


Gupta, P., Christopher, S., Wang, J., Gehrig, R., Lee, Y., and Kumar, N. Satelliteremote sensing of particulate matter and air quality assessment over global cities.Atmospheric Environment, 40(30):5880--5892, 2006.

Gupta, P., Christopher, S., et al. Seven year particulate matter air quality assess-ment from surface and satellite measurements. Atmospheric Chemistry and PhysicsDiscussions, 8(1):327--365, 2008.

Hengl, T. A practical guide to geostatistical mapping of environmental variables.EUR, 22904:143, 2007.

Hengl, T., Heuvelink, G., Percec Tadic, M., and Pebesma, E. Spatio-temporal pre-diction of daily temperatures using time-series of MODIS LST images. Theoreticaland Applied Climatology, pages 1--13, 2011.

Hoff, R. and Christopher, S. Remote sensing of particulate pollution from space:have we reached the promised land. J. Air & Waste Manage. Assoc, 59:645--675,2009.

Ichoku, C., Chu, A., Mattoo, S., Kaufman, Y., Remer, L., Tanre, D., Slutsker, I.,and Holben, B. A spatio-temporal approach for global validation and analysisof MODIS aerosol products. Geophys. Res. Lett, 29(12):8006, 2002.

Isaaks, E. and Srivastava, R. An introduction to applied geostatistics, volume 46. OxfordUniversity Press, USA, 1989.

Janssen, S., Dumont, G., Fierens, F., and Mensink, C. Spatial interpolation of airpollution measurements using CORINE land cover data. Atmospheric Environment,42(20):4884--4903, 2008.

Kassteele, J., Koelemeijer, R., Dekkers, A., Schaap, M., Homan, C., and Stein, A. Sta-tistical mapping of PM10 concentrations over Western Europe using secondaryinformation from dispersion modeling and MODIS satellite observations. Stochas-tic Environmental Research and Risk Assessment, 21(2):183--194, 2006.

Li, Y., Huang, G., Veawab, A., Nie, X., and Liu, L. Two-stage fuzzy-stochastic robustprogramming: a hybrid model for regional air quality management. Journal of theAir & Waste Management Association, 56(8), 2006.

Liu, Y., Sarnat, J., Kilaru, V., Jacob, D., and Koutrakis, P. Estimating ground-level PM2.5 in the eastern United States using satellite remote sensing. Environmental science & technology, 39(9):3269--3278, 2005.

Liu, Y., Franklin, M., Kahn, R., and Koutrakis, P. Using aerosol optical thickness to predict ground-level PM2.5 concentrations in the St. Louis area: a comparison between MISR and MODIS. Remote sensing of environment, 107(1-2):33--44, 2007.

Maantay, J., Tu, J., and Maroko, A. Loose-coupling an air dispersion model and a geographic information system (GIS) for studying air pollution and asthma in the Bronx, New York City. International Journal of Environmental Health Research, 19(1):59--79, 2009.

MEEO Srl. PM MAPPER system description, issue 1.1, 2009. Internal report, unpublished. If requested, can be delivered upon agreement from the sponsor of the project.

Natali, S., Beccati, A., D'Elia, S., Veratelli, M., Campalani, P., Folegani, M., and Mantovani, S. Multitemporal data management and exploitation infrastructure. In MultiTemp, Trento, July 2011.

Paciorek, C., Liu, Y., Moreno-Macias, H., and Kondragunta, S. Spatiotemporal associations between GOES aerosol optical depth retrievals and ground-level PM2.5. Environmental science & technology, 42(15):5800--5806, 2008.

Pebesma, E., de Jong, K., and Briggs, D. Interactive visualization of uncertain spatial and spatio-temporal data under different scenarios: an air quality example. International Journal of Geographical Information Science, 21(5):515--527, 2007.

Pebesma, E. J. Multivariable geostatistics in S: the gstat package. Computers & Geosciences, 30:683--691, 2004.

Pebesma, E. J. and Bivand, R. S. Classes and methods for spatial data in R. R News, 5(2):9--13, November 2005. URL http://CRAN.R-project.org/doc/Rnews/.

R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2011. URL http://www.R-project.org/. ISBN 3-900051-07-0.

Remer, L., Tanre, D., Kaufman, Y., Levy, R., and Mattoo, S. Algorithm for remote sensing of tropospheric aerosol from MODIS: Collection 005. National Aeronautics and Space Administration, 2006.

Schaap, M., Apituley, A., Timmermans, R., Koelemeijer, R., and De Leeuw, G. Exploring the relation between aerosol optical depth and PM2.5 at Cabauw, the Netherlands. Atmos. Chem. Phys, 9:909--925, 2009.

Singh, V., Carnevale, C., Finzi, G., Pisoni, E., and Volta, M. A cokriging based approach to reconstruct air pollution maps, processing measurement station concentrations and deterministic model simulations. Environmental Modelling & Software, 2011.

Tsai, T., Jeng, Y., Chu, D., Chen, J., and Chang, S. Analysis of the relationship between MODIS aerosol optical depth and particulate matter from 2006 to 2008. Atmospheric Environment, 45(27):4777--4788, 2011.

Vidot, J., Santer, R., and Ramon, D. Atmospheric particulate matter (PM) estimation from SeaWiFS imagery. Remote Sensing of Environment, 111(1):1--10, 2007.

Wong, D., Yuan, L., and Perlin, S. Comparison of spatial interpolation methods for the estimation of air quality data. Journal of Exposure Science and Environmental Epidemiology, 14(5):404--415, 2004.