Representing soil pollution by heavy metals using continuous limitation scores
www.elsevier.com/locate/cageo
Author’s Accepted Manuscript
Representing soil pollution by heavy metals using continuous limitation scores
Marija Romic, Tomislav Hengl, Davor Romic, Stjepan Husnjak
PII: S0098-3004(07)00090-8
DOI: doi:10.1016/j.cageo.2007.05.002
Reference: CAGEO 1831
To appear in: Computers & Geosciences
Received date: 23 April 2005
Revised date: 11 October 2006
Cite this article as: Marija Romic, Tomislav Hengl, Davor Romic and Stjepan Husnjak, Representing soil pollution by heavy metals using continuous limitation scores, Computers & Geosciences (2007), doi:10.1016/j.cageo.2007.05.002
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Representing soil pollution by heavy metals using continuous limitation scores
Marija Romic a,1, Tomislav Hengl b, Davor Romic a, Stjepan Husnjak a
a Faculty of Agriculture, Svetosimunska 25, 10000 Zagreb, Croatia
b European Commission, Directorate General JRC, Institute for Environment and Sustainability, TP 280, Via E. Fermi 1, I-21020 Ispra (VA), Italy
Abstract
The paper suggests a methodology to represent overall soil pollution in a sampled area using continuous limitation scores.
The interpolated heavy metal concentrations are first transformed to limitation scores using the exponential transfer function
determined by using two threshold values: permissible concentration (0 limitation points) and seriously polluted soil (4
limitation points). The limitation scores can then be summed to produce the map of cumulative limitation scores and visualize
the most critically polluted areas. The methodology was illustrated using the 784 soil samples analyzed for Cd, Cr, Cu, Ni,
Pb and Zn in the central region of Croatia. The samples were taken at 1×1 and 2×2 km grids and at fixed depths of 20 cm.
Heavy metal concentrations in soil were determined by ICP-OES after microwave assisted aqua regia digestion. The sampled
concentrations were interpolated using block regression-kriging with geology and land cover maps, terrain parameters and
industrialization parameters as auxiliary predictors. The results showed that the best auxiliary predictors are geological map,
ground water depth, NDVI and slope map and distance to urban areas. The spatial prediction was satisfactory for Cd, Ni, Pb
and Zn, and somewhat less satisfactory for Cu and Cr. The final map of cumulative limitation scores showed that 33.5% of the
total area is suitable for organic agriculture and 7.2% of the total area is seriously polluted by one or more heavy metals. This
procedure can be used to assess suitability of soils for agricultural production and as a basis for possible legal commitments
to maintain the soil quality.
Key words: heavy metal concentrations, regression-kriging, limitation scores, spatial planning, GIS
Submitted on 23 April 2005 to Computers and Geosciences, special issue Geostats-UK conference 2005, Belfast; first revision on 21 March 2006; second revision on 11 October 2006.
1 Tel.: +385-1-2394014; fax: +385-1-2315300. E-mail address: [email protected]
Preprint submitted to Computers and Geosciences, 12 October 2006

1 Introduction

The problem of soil pollution by heavy metals has been receiving increasing attention in the last few decades. In Europe, decision makers and spatial planners increasingly require information on soil quality for different purposes: to locate areas suitable for organic (ecologically clean) farming and agro-tourism; to select sites suitable for conversion of agricultural to non-agricultural land, particularly for urbanization; to set up protection zones for groundwater pumped for drinking water; to estimate costs of remediation of contaminated areas; and similar. Heavy metals occur naturally in rocks and soils, but increasingly higher quantities of them are being released into the environment by anthropogenic activities. Every decision on the application of any measures in the environment relating to soil quality and management, whether statutory regulations or practical actions, must be based on reliable and comparable data on the status of this part of the environment in the given area. Society must consider various aspects to provide a sustainable environment, including soil free of heavy metal pollution. The first among them is to identify environments (or areas) in which anthropogenic loading of heavy metals puts ecosystems and their inhabitants at health risk. Maps indicating areas with pollution risks can provide decision-makers or local authorities with critical information for delineating areas suitable for the planned land use or soil clean up (Van der Gaast et al., 1998; Broos et al., 1999). Maximum permissible concentrations of heavy metals in soil are now regulated by law in many countries.
Before any solution for the problem of soil heavy metal pollution can be suggested, a distinction needs to be made between natural anomalies and those resulting from human activities. Natural concentrations and distributions of potentially toxic metals can themselves present health problems, as in the case of chromium, cobalt, and particularly nickel in ultramafic soils (Proctor and Baker, 1994). Rock type and geological-geochemical processes can change markedly in a relatively small area, resulting in great spatial variability in the soil content of elements. Soils in the vicinity of urban areas and industry are exposed to inputs of potentially toxic elements, and the situation of agricultural soils is additionally complicated by the continuous application of agrochemicals.
In practice, soil pollution by heavy metals is commonly assessed by interpolating concentrations of heavy metals sampled at point locations, so that each heavy metal is represented in a separate map (Webster and Oliver, 2001; Juang et al., 2003). The first problem of working with maps of separate heavy metal concentrations (hereafter HMCs) is that the limiting values for polluted soils are commonly set as crisp boundaries. For example, a soil is polluted by zinc and not suitable for organic agriculture if the measured values are larger than 150 mg kg−1 (Official Gazette, 2001). This means that a soil with a zinc concentration of 149 mg kg−1 and a soil with a concentration of 151 mg kg−1 will be classified differently, although the difference may be due to measurement or interpolation error. Similarly, if the concentration of zinc at one location is 151 mg kg−1 and at a neighboring location 300 mg kg−1, both locations will be classified as not suitable, although the latter shows a two times higher concentration. The second problem with HMCs is that different elements come in different ranges of values. This makes it fairly difficult to get a picture of the overall soil quality. For example, the threshold value for zinc is 150 mg kg−1 and for cadmium 0.8 mg kg−1. If we measure, at a point, values Zn=130 (suitable) and Cd=1.1 (not suitable), this makes the location unsuitable, but how serious is the problem? Now imagine a case with tens of HMCs: how can we sum these values to get a compound picture of the quality of soil?
To address the problem of representing overall pollution, Romic and Romic (2003) applied factor analysis prior to interpolation and then interpolated only the first factor, indicating anthropogenic loads of heavy metals. Van der Gaast et al. (1998) used maps of background values of soil contaminants focusing on the 90-percentiles. Hanesch et al. (2001) tested fuzzy classification algorithms to distinguish different sources of pollution. Amini et al. (2004) classified HMCs using unsupervised fuzzy k-means to partition the values optimally. The final outputs are maps of memberships to each cluster, which commonly reflect the combination of the most correlated heavy metals. In all these examples the procedures are statistically valid, but the meaning of such factors and continuous memberships is hard to interpret. In practice, decision makers usually wish only to see which areas are polluted, without needing any training in (geo)statistics.
In this paper, we propose an approach to interpolate sampled heavy metal concentrations using numerous environmental predictors and then represent the overall pollution using continuous limitation scores. We advocate the use of cumulative limitation scores because they can be summed and used to represent areas of overall high pollution. Such visualizations can supplement maps of separate HMCs so that end-users can more easily delineate areas of high overall pollution and focus their actions where they are most needed.
2 Materials and methods

2.1 Spatial interpolation
For spatial interpolation of HMCs we used regression-kriging (Odeh et al., 1995), also known as Universal kriging (Webster and Oliver, 2001) or Kriging with External Drift (Goovaerts, 1997) (see also the article on Regression-kriging published in the same issue of this journal). This technique is especially attractive as it can employ both our empirical knowledge about the distribution of HMCs and the spatial autocorrelation between the point samples. It will also minimize the artificial point patterns in the final predictions typical for plain kriging techniques (Juang et al., 2003). We used the generic framework suggested by Hengl et al. (2004), which requires several processing steps in the R (http://r-project.org), Integrated Land and Water Information System (ILWIS) (see Unit Geo Software Development, 2001, and http://itc.nl/ilwis/) and GSTAT (http://gstat.org) packages. The first step is the logit-transformation of all target variables (in this case HMCs), which will (in most cases) ensure the normality of residuals. The original target variable (z) is first transformed to a relative (indicator) variable by:
\[ z^{++} = \ln\left( \frac{z^{+}}{1 - z^{+}} \right); \qquad 0 < z^{+} < 1 \tag{1} \]

where z+ is the target variable standardised to the 0 to 1 range:

\[ z^{+} = \frac{z - z_{\min}}{z_{\max} - z_{\min}}; \qquad z_{\min} < z < z_{\max} \tag{2} \]
and zmin and zmax are the physical minimum and maximum of z. This means that all new predicted values will lie between these two thresholds. So if we measured a concentration of Zn = 88, the indicator variable is (88 − 0)/1000 = 0.088 and the logit-transform is logit(Zn) = −2.338. In this case, the lower threshold (0) is the physical limit of the values, while the upper threshold (1000) is an arbitrary number. Setting these thresholds prevents predictions that have no physical meaning (e.g. negative values).
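The two-step transformation of Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the R code used in the study; the function name `logit_transform` is an assumption made here, and the numbers are the Zn example from the text.

```python
import math


def logit_transform(z, z_min, z_max):
    """Standardise z to the 0-1 range (Eq. 2), then apply the logit (Eq. 1)."""
    if not z_min < z < z_max:
        raise ValueError("z must lie strictly between z_min and z_max")
    z_plus = (z - z_min) / (z_max - z_min)
    return math.log(z_plus / (1.0 - z_plus))


# Worked example from the text: Zn = 88 with limits 0 and 1000.
zn_logit = logit_transform(88.0, 0.0, 1000.0)
print(round(zn_logit, 3))  # -2.338
```

Values at or outside the limits are rejected because the logit is undefined there, which is why the limits have to bracket all observed concentrations.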
The advantage of transforming the original heavy metal concentrations is that they can now all be set to a common range (e.g. -10.000 to 10.000), which also means that the results of the statistical analysis can be directly compared. For example, variograms of several parameters can be displayed together (see later Figure 4). This would not be possible if the original values were used, as the range of values can be very different. For example, for Cd the standard deviation is 0.34 and for Cr it is 15.9, or almost fifty times larger.
After the transformation, the HMCs are estimated using the regression-kriging model:

\[ z(s_0) = \mathbf{q}_0^{T} \cdot \boldsymbol{\beta} + \boldsymbol{\lambda}_0^{T} \cdot \mathbf{e} \tag{3} \]

where q0 is a vector of p + 1 predictors at s0, β is a vector of p + 1 estimated drift model coefficients, λ0 is a vector of n kriging weights and e is a vector of n residuals.
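To make the structure of Eq. (3) concrete, the following sketch evaluates it for a single location. All numbers are hypothetical, and the kriging weights are simply given rather than solved from a variogram, so this illustrates only how the drift and the kriged residual are combined.

```python
def rk_predict(q0, beta, lam0, resid):
    """Eq. 3: drift part (q0' * beta) plus kriged residual (lam0' * e)."""
    if len(q0) != len(beta) or len(lam0) != len(resid):
        raise ValueError("vector lengths do not match")
    trend = sum(q * b for q, b in zip(q0, beta))
    kriged_residual = sum(w * e for w, e in zip(lam0, resid))
    return trend + kriged_residual


# Hypothetical inputs: an intercept term plus one predictor (p = 1)
# and three neighbouring sample points (n = 3).
q0 = [1.0, 2.5]             # leading 1 multiplies the intercept
beta = [-1.5, 0.02]         # assumed drift coefficients
lam0 = [0.5, 0.3, 0.2]      # assumed kriging weights
resid = [0.10, -0.20, 0.30] # assumed regression residuals at the samples

z_hat = rk_predict(q0, beta, lam0, resid)
```

The result is on the logit scale and still has to be back-transformed to a concentration.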
The prediction accuracy of an interpolation technique is commonly analyzed using two measures, the mean prediction error (MPE):

\[ \mathrm{MPE} = \frac{1}{l} \cdot \sum_{j=1}^{l} \left[ z(s_j) - z^{*}(s_j) \right] \tag{4} \]

and the root mean square prediction error (RMSPE):

\[ \mathrm{RMSPE} = \sqrt{ \frac{1}{l} \cdot \sum_{j=1}^{l} \left[ z(s_j) - z^{*}(s_j) \right]^{2} } \tag{5} \]

both calculated at validation points (z∗(sj)), where l is the number of validation points. For detailed instructions on how to run regression-kriging, see also the article on regression-kriging published in the same issue of this journal.
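Both validation measures are straightforward to compute. The sketch below assumes two equally long lists, predictions and values observed at the validation points; the function names and the sample numbers are illustrative only.

```python
import math


def mpe(predicted, observed):
    """Mean prediction error (Eq. 4): average signed difference."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


def rmspe(predicted, observed):
    """Root mean square prediction error (Eq. 5)."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Hypothetical values at four validation points.
pred = [1.0, 2.0, 3.0, 4.0]
obs = [1.5, 1.5, 3.5, 3.5]

print(mpe(pred, obs))    # 0.0
print(rmspe(pred, obs))  # 0.5
```

The example shows why both measures are reported: errors of opposite sign cancel in the MPE (no bias) while the RMSPE still reveals their magnitude.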
After the regression modelling in R, we fitted the variogram models of residuals using the automatic fitting options in GSTAT. In all cases, we used the exponential model and an initial variogram with nugget parameter = 0, sill parameter = sampled variance and range = 10% of the spatial extent of the data. Once we had determined the most significant predictors and estimated the regression model and the variogram of residuals, we created GSTAT scripts to produce the final predictions. Note that GSTAT, in fact, implements the so-called “kriging with external drift” approach, which is computationally slower but gives the same predictions. Finally, the fitted HMCs were back-transformed to the original scale by:

\[ z(s_0) = \frac{e^{z^{++}(s_0)}}{1 + e^{z^{++}(s_0)}} \cdot (z_{\max} - z_{\min}) + z_{\min} \tag{6} \]
In practice, regression-kriging consists of four main steps. After the estimation of regression coefficients, the trend can be fitted by using a map calculation in ILWIS:

    Zn_t_REG = -1.5812 + 0.0197*SLOPE + 0.0001*URBAN - 0.0001*ROADS - 0.0225*WINDE90 - 0.0423*WINDE180 + 0.0293*WINDE225 - 0.0422*WINDE270 - 0.029*WINDE360 + 0.1646*GEO03 + 0.1178*GEO04 - 0.147*GEO07 - 0.2396*GEO08 - 0.1602*GEO09 - 0.409*GEO11 + 0.2147*GEO12

where Zn_t_REG is the fitted trend on the transformed variable and SLOPE, URBAN, WINDE90 etc. are different predictors (raster maps). The second step is to derive residuals and then fit variograms. The third step is to make predictions of values at all locations using universal kriging in GSTAT (Pebesma, 2004). Finally, the fitted trend and residuals can be summed and back-transformed using the following command:

    Zn_RK.mpr = iff(isundef(MASK), ?, exp(Zn_t_REG + RES_Zn_t)/(1 + exp(Zn_t_REG + RES_Zn_t))*1000)

where RES_Zn_t are the residuals interpolated using regression-kriging, Zn_RK is the final prediction map based on regression-kriging and MASK is the map used to restrict the output to the areas of interest. In this case, only the agricultural soils have been sampled and analyzed.
2.2 Continuous limitation scores

Traditionally, suitability maps are derived as Boolean maps (yes or no), in which an area is suitable only where none of the dangerous HMCs exceeds its threshold value (see Table 1). In ILWIS, such a spatial query would look like this:

    SUITABLE{dom=Bool} = iff((Cd_RK>0.8) OR (Cr_RK>50) OR (Cu_RK>50) OR (Ni_RK>30) OR (Pb_RK>50) OR (Zn_RK>150), 0, 1)

This means that only the areas that do not exceed ANY of the given thresholds can be considered suitable for agricultural production. The obvious problem here is that the intensity of pollution within the polluted areas is unknown. Our approach is somewhat different in the sense that we also want to spatially represent the overall soil pollution. For this we use the concept of limitation scores.
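For comparison with the continuous approach, the Boolean query can be written for a single location as below. The thresholds are the X1 values from Table 1; the dictionary layout, the function name and the sample point are assumptions made for this sketch.

```python
# Permissible concentrations X1 (mg kg-1) from Table 1.
X1 = {"Cd": 0.8, "Cr": 50.0, "Cu": 50.0, "Ni": 30.0, "Pb": 50.0, "Zn": 150.0}


def is_suitable(hmc):
    """True only if no metal exceeds its permissible concentration."""
    return all(hmc[metal] <= limit for metal, limit in X1.items())


# Hypothetical point: Zn and Cd as in the introduction, other metals assumed low.
point = {"Cd": 1.1, "Cr": 20.0, "Cu": 15.0, "Ni": 12.0, "Pb": 25.0, "Zn": 130.0}
print(is_suitable(point))  # False
```

The answer is a bare yes or no: a point with Cd = 1.1 and one with Cd = 5 would both return False, which is exactly the loss of information discussed above.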
Table 1
Transformation coefficients calculated for given threshold concentrations. X1: maximum concentration of contaminant to maintain multifunctionality; X2: serious soil pollution. Official threshold levels used in Croatia.

Metal   X1 (mg kg−1)   X2 (mg kg−1)   ln(b0)     b1
Cd      0.8            2              0.392      1.756
Cr      50             100            -9.083     2.322
Cu      50             100            -9.083     2.322
Ni      30             60             -7.897     2.322
Pb      50             150            -5.731     1.465
Zn      150            300            -11.634    2.322
After the HMCs have been interpolated, they can be converted to limitation scores, which then allows us to sum different maps of HMCs. Such a scoring system is often used in land evaluation studies (Triantafilis et al., 2001). For each evaluation parameter, thresholds and limitation scores are predefined and can then be implemented for the whole area. For example, a slope map is typically used to give suitability scores to a certain area. Triantafilis et al. (2001) assigned to each slope class a limitation score based on some empirical rules: 0 for the 0-2% slope class, 1 for 2-8%, 3 for 9-16%, 9 for 17-25% and 27 for slopes >25%. Note that in this case the limitation scores increase exponentially with the increase of slope. Although the slope of the third class is only about two and a half times that of the second class, the third class gets three times more limitation points.
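The class-based scoring quoted above amounts to a simple lookup. In the sketch below, the handling of class boundaries and of the gaps between the printed classes (for example 8 to 9%) is an assumption: each class is closed at its upper bound and gap values fall into the next class.

```python
def slope_limitation_score(slope_percent):
    """Limitation score for a slope value, after the classes quoted in the text."""
    # (upper bound of class in %, score); boundary handling is assumed.
    for upper, score in ((2, 0), (8, 1), (16, 3), (25, 9)):
        if slope_percent <= upper:
            return score
    return 27  # slopes steeper than 25%


print([slope_limitation_score(s) for s in (1, 5, 12, 20, 30)])  # [0, 1, 3, 9, 27]
```

From the second class on, the scores triple from one class to the next (1, 3, 9, 27), which is the exponential growth mentioned in the text.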
We propose here that the limitation scores, instead of being assigned to classes of HMCs, can be derived by using a simple transfer function that converts HMCs directly to limitation scores (LS). A flexible transfer function, also used in this paper, is the exponential:

\[ LS = \begin{cases} b_0 \cdot HMC^{\,b_1} - 1 & \text{if } HMC \ge X_1 \\ 0 & \text{if } HMC < X_1 \end{cases} \tag{7} \]

where LS are the limitation scores, b0 and b1 are the coefficients, HMC is the heavy metal concentration and X1 is the permissible or baseline concentration. An example of how HMCs are transformed to limitation scores can be seen in Figure 1. In this case we also assume that the cost of remediation increases exponentially with HMC. The b0 and b1 coefficients can be estimated by solving the linear regression model:

\[ \ln(LS + 1) = \ln(b_0) + b_1 \cdot \ln(HMC) \tag{8} \]

In this case we used three known points to estimate the two unknowns. For example, for Cr we used LS + 1=0 for HMC=0, LS + 1=1 for HMC=50 and LS + 1=5 for HMC=100. The first threshold (50) for Cr is the permissible concentration and the second threshold (100) is the critical concentration that classifies this soil as being polluted. After we determine the coefficients of the transfer function for each HMC, we can directly derive limitation scores in ILWIS using:

    LS_Zn{dom=value;vr=0.0:50.0:0.1} = iff(Zn_RK<150, 0, exp(-11.634+2.322*ln(Zn_RK))-1)

where LS_Zn is the derived map with limitation scores for Zn and Zn_RK is the Zn concentration interpolated using regression-kriging.
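The coefficients in Table 1 can be reproduced from the two thresholds alone. The sketch below solves Eq. (8) through the points (X1, LS = 0) and (X2, LS = 4); the point at HMC = 0 cannot enter a log-linear fit because ln(0) is undefined, so using only two points is an assumption of this sketch rather than a description of the original fitting. Function names are illustrative.

```python
import math


def fit_transfer(x1, x2, ls_at_x2=4.0):
    """Solve ln(LS + 1) = ln(b0) + b1 * ln(HMC) through (x1, 0) and (x2, ls_at_x2)."""
    b1 = math.log(ls_at_x2 + 1.0) / math.log(x2 / x1)
    ln_b0 = -b1 * math.log(x1)  # because ln(0 + 1) = 0 at HMC = x1
    return ln_b0, b1


def limitation_score(hmc, x1, ln_b0, b1):
    """Eq. 7: zero below the permissible concentration, power curve above it."""
    if hmc < x1:
        return 0.0
    return math.exp(ln_b0 + b1 * math.log(hmc)) - 1.0


# Zn thresholds from Table 1: X1 = 150, X2 = 300 mg kg-1.
ln_b0_zn, b1_zn = fit_transfer(150.0, 300.0)
print(round(ln_b0_zn, 3), round(b1_zn, 3))  # -11.634 2.322
ls_zn_300 = limitation_score(300.0, 150.0, ln_b0_zn, b1_zn)
```

By construction the score is 0 at the permissible concentration and 4 at the serious-pollution threshold, matching the two anchor values shown in Figure 1.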
Fig. 1. Transforming HMCs to limitation scores: the function is determined by two thresholds, the permissible or baseline concentration (0) and serious soil pollution (4).
2.3 Study area and sampling methods

The study area includes Zagreb city and the surrounding county in north-western Croatia (E15°20’–16°44’, N45°25’–46°05’) (Figure 2). The region exhibits a variety of soils developed on diverse lithologies. The main bodies of the mountain ranges (Zumberak, Medvednica and some other smaller ranges) consist mainly of Palaeozoic and Mesozoic rocks: para-metamorphic rocks, ortho-metamorphic rocks, igneous rocks and clastic sedimentary rocks. A major portion of the region consists of Tertiary rocks (limestones, marls, clastites, igneous rocks) and Quaternary rocks (mostly alluvial sediments of the rivers and their tributaries) (Miko et al., 2001). The dominant land use systems in the area are cultivation of mostly corn and vegetables in the floodplain region, and vineyards, orchards and some pastures in the hilly part.
The data set used is part of the multi-element geochemical mapping project covering 3700 km² of agricultural land in the Zagreb region, Croatia. The basic sampling grid was a square mesh with sampling points at intervals of 2×2 km, and 1×1 km in the area of higher urbanization. By doubling the inspection density we wanted to reduce the spatial prediction error in areas where we expected higher values. A total of 784 topsoil samples were collected. We used composite samples made up of 10 increments collected from the upper 20 cm of soil in a cross pattern, with a 5 m distance between increments. Site descriptions were registered at the time of sampling to record the sample location in relation to land use and major environmental features. Soil samples were digested with aqua regia in accordance with the HR ISO 11466 procedure at the Analytical laboratory of the Faculty of Agriculture, University of Zagreb. Heavy metal concentrations (Cd, Cr, Cu, Ni, Pb and Zn) in soil extracts were determined by inductively coupled plasma optical emission spectrometry (Vista MPX AX, Varian). The choice of the method was dictated by Croatian Government regulations, which define the limit values of potentially toxic substances (including trace metals) in agricultural soils (Official Gazette, 1992), as well as soil quality criteria for organic production (Official Gazette, 2001).
For this study area, we assumed that the distribution of HMCs is systematic, i.e. controlled by environmental and anthropogenic factors. Indeed, Romic and Romic (2003) already showed that the distribution of HMCs in part of the study area is primarily controlled by: (a) geology; (b) industrial impact (traffic, heating plants, chemical industry and airports); and (c) external factors, since some heavy metals are brought by the Sava River, which has been exposed to intensive pollution by mining, industries and cities in recent history. A portion of heavy metals is wind-blown from the industrial region of north Italy (Antonic and Legovic, 2004). Following the empirical knowledge about the studied area, we produced a list of potential predictors that were used as auxiliary data in the RK system. Eight GIS layers were prepared in ILWIS:
Geological map (GEO): This layer was produced from the geological map of Croatia at scale 1:100K. This comprised six map sheets that were first georeferenced and then converted into a polygon map in ILWIS GIS. Each map unit was converted to an indicator map, which was then used as a predictor.
Land cover map (LAND): This layer was produced by classifying a LANDSAT 7 satellite sensor image from 20th March 2003. The date corresponds to the sampling period. It was also selected because the vegetation was just starting to develop, so that the soil surface visibility was good. The LAND map was also used to mask out areas such as forests and water bodies that were not of interest for the project.
Normalized Difference Vegetation Index (NDVI): The NDVI, derived from the same LANDSAT image, reflects the actual green mass, i.e. vegetation cover. The urbanized areas typically show small or even negative NDVI values.
Depth to the water table (lnGWD): This is an approximate variable and was derived in two steps: first the water table elevation map was estimated by fitting a 2nd degree trend function to the point map showing the water table at main rivers, then the original DEM was subtracted from this map, which approximates the fall of the water table. The GWD represents the flooding potential of an area and can be used in estimation of the level of the ground water table. The map was log-transformed to emphasize smaller depths.
Slope (SLOPE): Slope was derived in ILWIS as the standard terrain parameter (see http://spatial-analyst.net for terrain parameterization scripts).
Distance to urban areas (URBAN): This variable describes the proximity to sources of pollution. URBAN was derived in two steps: first the urban areas were masked out in the land cover map, then a buffer to these areas was derived using the distance operation in ILWIS. This means that all areas outside the urban areas received a distance, while the urban areas themselves received a zero value.
Distance to roads (ROADS): Similar to the URBAN map, this map was derived as a buffer to the road network.
Fig. 2. Location of the study area Zagreb city and Zagreb county: sampled locations used for interpolation and mask map showing agricultural areas. The key auxiliary predictors: geological and land cover maps, distance to urban areas, NDVI and ln of ground water depth.
[Figure panels: location within Croatia; geological map; land cover map; distance to urban areas; NDVI from Landsat image; depth to ground water (lnGWD). Point legend: control, interpolation. Scale bar: 0 to 25 km.]
Wind exposition (WINDE): Wind exposition was calculated as relative slope insolation for eight positions (azimuths): 45°, 90°, 135°, 180°, 225°, 270°, 315° and 360°, using a vertical exposition angle of 5°. A map named WINDE45 corresponds to an azimuth of 45°.
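Two of the preparation steps in the list above, indicator maps for the geological units and a distance buffer around urban cells, can be sketched on a toy raster. This is a hedged illustration in plain Python: the grids, the 100 m cell size and the brute-force Euclidean distance are assumptions, and the ILWIS distance operation may compute distances differently.

```python
import math


def indicator_maps(class_grid):
    """One 0/1 grid per map unit found in a categorical raster."""
    units = sorted({cell for row in class_grid for cell in row})
    return {
        unit: [[1 if cell == unit else 0 for cell in row] for row in class_grid]
        for unit in units
    }


def distance_to_mask(mask, cell_size=1.0):
    """Distance from each cell to the nearest True cell (0 inside the mask)."""
    targets = [(r, c) for r, row in enumerate(mask) for c, flag in enumerate(row) if flag]
    if not targets:
        raise ValueError("mask contains no target cells")
    return [
        [cell_size * min(math.hypot(r - tr, c - tc) for tr, tc in targets)
         for c in range(len(row))]
        for r, row in enumerate(mask)
    ]


# Toy 2 x 3 rasters (hypothetical).
geo = [["GEO03", "GEO03", "GEO04"], ["GEO03", "GEO04", "GEO04"]]
urban_mask = [[True, False, False], [False, False, False]]

geo_indicators = indicator_maps(geo)
urban_distance = distance_to_mask(urban_mask, cell_size=100.0)
```

Each indicator grid can then enter the regression as a separate 0/1 predictor, which is how names such as GEO03 and GEO04 appear in the fitted trend equation.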
3 Results and discussion

3.1 Regression analysis

The first screening of data showed that almost all HMCs have asymmetrical distributions, clearly shifted toward the lower values. After the logit transformation, the distributions were approximately normal (Figure 3), which allowed us to do further statistical analysis. This confirms that logit transformation is an important step prior to actual interpolation. The step-wise regression analysis in R selected the geological map (GEO), ground water depth (GWD), NDVI, slope map (SLOPE) and distance to urban areas (URBAN) as the best auxiliary predictors (Table 2). The number of predictors selected with step-wise filtering still remained fairly large (on average, 19 out of 31). This confirmed that the pre-selected predictors are relevant for the mapping of the HMCs.
The auxiliary predictors accounted for 31.5% of the total variability on average (Table 2). Most satisfactory was the estimation of Cu (R²=0.51) and Cd (R²=0.46), while somewhat less satisfactory was the estimation of Zn (R²=0.20) and Cr (R²=0.17). Interpolated maps of HMCs agree with our empirical knowledge. The floodplains and lowest terraces of the Sava river in particular were strongly correlated with high concentrations of Cd, Zn and Pb. This suggests that the recent sedimentation of the river deposits is the most probable cause of the accumulation of heavy metals (Romic and Romic, 2003). Emission by anthropogenic sources is especially dominant in the southeastern part of Zagreb city (Velika Gorica). This area has been expanding rapidly in the last decade. Direct industrial emission is the most prominent source, especially of Pb, Cd and Zn.
The geological map was the most useful predictor in all cases. However, geological strata were not always the direct cause of HMCs. For example, high copper concentrations were actually related to the land use. Hilly and mountainous regions in the surroundings of Zagreb are geologically heterogeneous. In the northern part, the old Paleozoic and Mesozoic mountain core comprises belts of Tertiary hills; in the south-western part, Tertiary sediments and Pleistocene loams form well-protected, amphitheatre-shaped areas. These locations have been occupied almost exclusively by vineyards for many decades. Accumulation of copper in the vineyard soils is the most common effect of continuing protection of grapevine against fungal diseases.
In the case of Cd, the strong correlation was probably due to the clear connection between Cd and geological material. Several carbonate soils developed on limestone also contain rather high cadmium concentrations. Romic et al. (2004) studied the origin and preferential features of metal retention in the vineyard topsoil of NW Croatia using multivariate statistics and pointed out the importance of CaCO3 for cadmium retention in soil.
Fig. 3. Histogram of HMC (Cd) before and after logit transformation. The logit transformation ensures the normality of the target variable, which is an important prerequisite for regression-kriging.

3.2 Geostatistical analysis

Spatial autocorrelation of residuals was distinct in all cases except for Cu. The variogram of Cu showed the pure nugget effect (Goovaerts, 1997), which means that the residuals were practically uncorrelated (pure regression analysis is sufficient). We already mentioned that in part of the area the origin of Cu can be related to the copper-based fungicides, i.e. the land use system (vineyards). In this case, plots are very irregularly placed, so that it is hard to detect any spatial correlation at this working scale (2×2 km grids). The average distance at which we measured spatial correlation ranged from 2 (Cd) to 10 km (Pb), or about 6 km on average (Table 2). The automatically fitted variograms in GSTAT can be seen in Figure 4. The shape of the fitted variogram gives us an idea about the speed and intensity of horizontal diffusion of HMCs in an open environment: in this case Pb diffuses faster (longer range), while Cu does not seem to diffuse at all (pure nugget effect).
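The exponential variogram model and the initial parameters described in Section 2.1 (nugget 0, sill equal to the sampled variance, range equal to 10% of the spatial extent) can be sketched as follows. The parameterization nugget + partial sill · (1 − exp(−h/a)) is the common form of the exponential model and is assumed here to match the one used for fitting; the variance and extent values are hypothetical.

```python
import math


def exponential_variogram(h, nugget, partial_sill, range_par):
    """Semivariance at lag h for an exponential model with a nugget."""
    if h == 0:
        return 0.0  # semivariance is zero at zero lag by definition
    return nugget + partial_sill * (1.0 - math.exp(-h / range_par))


def initial_parameters(sample_variance, spatial_extent):
    """Starting values for automatic fitting, as described in Section 2.1."""
    return {
        "nugget": 0.0,
        "partial_sill": sample_variance,
        "range_par": 0.1 * spatial_extent,
    }


# Hypothetical residual variance of 2.0 (logit scale) and 60 km extent.
init = initial_parameters(2.0, 60000.0)
gamma_at_range = exponential_variogram(6000.0, **init)
print(round(gamma_at_range, 3))  # 1.264
```

In this parameterization the model reaches about 63% of the sill at h equal to the range parameter and about 95% at three times that distance, so a longer range parameter, as fitted for Pb, means that residuals stay correlated over longer distances.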
Because the short range (nugget) variation was rather high, we decided to use the block-kriging option in GSTAT to derive final predictions. We set the block size at 100 m to correspond to the output grid size (Hengl, 2006). Note that block-kriging does not give much different output than punctual estimates. However, it has the useful property of adjusting for local outliers (usually very high HMCs), and the final prediction error (UK variance) will be much lower, i.e. the predictions will be more precise (Hengl, 2006).
Fig. 5. (a) Interpolated maps for Cd, Cr, Cu, Ni, Pb and Zn (legends in mg kg−1). Masked areas (white) are forests and water bodies. (b) Map of cumulative limitation scores (LS, legend from 0.0 to 6.0) showing overall soil pollution.
Fig. 4. Variograms estimated for residuals after transformation and regression analysis. Because logit transforms are used, all HMCs are at the same scale, so the variograms can be compared directly. In this case Cd is correlated at shorter and Pb at longer distances; Cu is not correlated at all.

3.3 Final maps of heavy metals

Spatial prediction, when cross-checked at control points, was satisfactory for Cd, Ni, Pb and Zn, and somewhat less satisfactory for Cu and Cr. In all cases, the RMSPE at 50 control points did not exceed 80% of the original variation of the heavy metals (Table 2). On average, the RMSPE at control points was 56% of the total variation (STD) of the HMCs. Interpolated maps for Cd, Cr, Cu, Ni, Pb and Zn can be seen in Figure 5a and the summary map of overall pollution in Figure 5b. In this case, the reddish areas (high values) indicate problematic areas where the concentrations are either above the recommended limit or above the critical limit. At first glance, it appears that there is a strong correlation between the distributions of heavy metals. The distributions of Pb, Zn and Ni especially seem to be spatially correlated. Further statistical analysis of the interpolated maps confirmed that all HMCs are correlated, with the correlation coefficient (r) ranging from 0.24 to 0.72. The most strongly correlated metals in soil were Zn and Ni (r = 0.72), Ni and Cd (r = 0.71) and Zn and Pb (r = 0.70). Cr and Cu are least strongly correlated with other heavy metals.
The final suitability map (Figure 6a) shows that 33.5% of the total area is suitable for organic
agricultural production. These are the areas where none of the HMCs exceeds the permissible
concentration (all HMCs ≤ X1). On the other hand, 7.2% of the total area is critically polluted by one
or more heavy metals (any of the HMCs > X2). This leaves about 59.3% of the area as marginally suitable
soil for organic agriculture. This map can be compared with the map of cumulative limitation scores,
which shows much greater detail and contrast (Figure 6b). Clearly, the area around the city of Velika
Gorica, close to the Zagreb airport, and the areas where the vineyards are located are the most
critically polluted.
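The three-way split described above (suitable, marginally suitable, critically polluted) can be sketched in a few lines of Python. This is an illustration only: the function names, the two-metal example and all threshold and concentration values are hypothetical and are not the X1 and X2 values used in the study.

```python
import numpy as np

def classify_suitability(conc, x1, x2):
    """Classify grid cells from per-metal concentrations.

    conc: array of shape (n_metals, n_cells); x1, x2: per-metal thresholds.
    Returns one code per cell: 0 = suitable, 1 = marginal, 2 = critical.
    """
    conc = np.asarray(conc, dtype=float)
    x1 = np.asarray(x1, dtype=float)[:, None]
    x2 = np.asarray(x2, dtype=float)[:, None]
    suitable = (conc <= x1).all(axis=0)   # no metal exceeds X1
    critical = (conc > x2).any(axis=0)    # at least one metal exceeds X2
    classes = np.ones(conc.shape[1], dtype=int)
    classes[suitable] = 0
    classes[critical] = 2
    return classes

def area_shares(classes):
    """Percentage of cells in each of the three classes."""
    counts = np.bincount(classes, minlength=3)
    return 100.0 * counts / classes.size

# Hypothetical example: 2 metals, 4 grid cells, made-up thresholds.
conc = [[0.5, 1.5, 0.8, 2.5],
        [30.0, 40.0, 60.0, 20.0]]
classes = classify_suitability(conc, x1=[1.0, 50.0], x2=[2.0, 100.0])
shares = area_shares(classes)
print(classes, shares)
```

On a real grid the masked cells (forests and water bodies) would have to be excluded before the area shares are computed.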
Fig. 6. Comparison of crisp and continuous interpretation maps: (a) Boolean map showing locations
suitable for organic agricultural production (legend: Suitable, Not suitable); (b) cumulative
limitation scores (LS, legend from 0.0 to 6.0). Scale bar: 25 km.
4 Discussion and Conclusion
The developed procedure for geostatistical analysis of HMC data enabled us to identify a number of
contamination hotspots and to map the cumulative contamination by heavy metals. Regression-kriging
proved to be a powerful interpolation technique because it utilizes both the linear correlation with
auxiliary predictors and the spatial auto-correlation of the residuals. An alternative to
regression-kriging would be to run multivariable interpolation (all HMCs at once), which is also
possible in the GSTAT package (Pebesma, 2004). This might be computationally challenging, because
interpolating the HMCs separately in this case study already took several hours on a standard PC. In the
current case study, each parameter was evaluated and interpolated separately, which allowed us to do
more in-depth exploratory analysis (step-wise regression).
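To make the two components of regression-kriging concrete, the following Python sketch combines a regression on auxiliary predictors with ordinary kriging of the regression residuals. It is a simplified illustration and not the GSTAT workflow used in the study: it assumes ordinary least squares for the trend, an exponential variogram with user-supplied parameters, and a global (all-points) kriging neighbourhood. All function names, parameter values and data are hypothetical.

```python
import numpy as np

def exp_variogram(h, nugget, sill, range_):
    """Exponential semivariance with gamma(0) = 0; `sill` is the total sill."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0.0, gamma, 0.0)

def regression_kriging(xy, X, z, xy0, X0, nugget, sill, range_):
    """Predict z at locations xy0 from samples (xy, X, z).

    xy, xy0: coordinates, shapes (n, 2) and (m, 2)
    X, X0:   auxiliary predictors, shapes (n, p) and (m, p)
    """
    n, m = len(z), len(xy0)

    # 1. Trend: ordinary least squares on the auxiliary predictors.
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    resid = z - A @ beta

    # 2. Ordinary kriging system for the residuals (last row/column
    #    carries the unbiasedness constraint).
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    K = np.ones((n + 1, n + 1))
    K[:n, :n] = exp_variogram(d, nugget, sill, range_)
    K[n, n] = 0.0
    d0 = np.linalg.norm(xy[:, None, :] - xy0[None, :, :], axis=2)
    rhs = np.vstack([exp_variogram(d0, nugget, sill, range_), np.ones((1, m))])
    weights = np.linalg.solve(K, rhs)[:n]  # drop the Lagrange multiplier

    # 3. Prediction = regression trend + kriged residual.
    trend0 = np.column_stack([np.ones(m), X0]) @ beta
    return trend0 + weights.T @ resid

# Synthetic example (made-up data and variogram parameters).
gen = np.random.default_rng(0)
xy = gen.uniform(0.0, 5000.0, size=(30, 2))
X = gen.normal(size=(30, 2))
z = 1.0 + X @ np.array([0.5, -0.3]) + gen.normal(scale=0.2, size=30)
pred = regression_kriging(xy, X, z, xy, X, nugget=0.02, sill=0.2, range_=1000.0)
```

Because kriging is an exact interpolator, predicting back at the sampled locations reproduces the observations, which gives a quick sanity check of the implementation. Interpolating six HMCs one at a time amounts to calling such a routine six times with different variogram parameters.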
Table 2
Summary results for regression and geostatistical analysis of data. STD: standard deviation of the
original data; m: number of predictors selected in the step-wise regression; RSE: residual square error
or remaining residuals after regression; MPE: mean prediction error at control points; RMSPE: root mean
square prediction error at control points.

             Fitted using step-wise regression             Residuals               Control
     STD     m    Most significant predictors  R2    RSE     Nugget  Sill   Range   MPE     RMSPE
Cd   0.34    28   GEO, NDVI, lnGWD             0.46  0.498   0.020   0.229  860     -0.020  0.217
Cr   15.9    15   GEO, URBAN                   0.17  0.293   0.000   0.089  1152    1.6     12.8
Cu   108.4   18   GEO, WINDE180, lnGWD         0.51  0.746   0.395   0.395  –       -5.0    16.1
Ni   18.1    17   GEO, SLOPE                   0.28  0.432   0.092   0.202  2372    0.7     9.2
Pb   15.7    22   GEO, LAND, WINDE45           0.27  0.455   0.122   0.228  3852    1.0     9.5
Zn   32.8    15   GEO, SLOPE, WINDE270         0.20  0.341   0.080   0.119  1708    -0.5    21.7

An advantage of using limitation scores is that the map of cumulative limitation scores can be directly
interpreted as a map of overall soil pollution, which is not the case when factors, fuzzy classes or
probabilities of exceeding a threshold value are used to represent pollution by heavy metals. Such maps
can supplement the maps of separate HMCs and serve decision makers who require a single map representing
the amount of overall pollution. Note that the formulas in Eq. 7 can easily adopt any model relating
cost and concentration. The most important property of the limitation scores is that they are
standardized and can be summed across different HMCs. The limitation of using the scores is that high
overall pollution can be due either to very high values of a single element or to the cumulative effect
of a large number of HMCs (Figure 7). This means that the map of cumulative limitation scores should be
used only to delineate the most critical areas; the user then needs to return to the separate maps of
HMCs.
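The ambiguity of a high cumulative score can be made explicit by reporting, next to the sum, how much of it comes from the single largest contributor. The Python sketch below assumes the per-element limitation scores have already been computed; the function name, the 0.5 dominance cut-off and the example scores are hypothetical and serve only to illustrate the two situations of Figure 7.

```python
def summarize_cumulative_score(scores, dominance=0.5):
    """Sum per-element limitation scores and flag single-element dominance.

    scores: dict mapping element name to its limitation score at one location.
    dominance: hypothetical cut-off on the largest element's share of the sum.
    """
    total = sum(scores.values())
    if total == 0:
        return {"total": 0.0, "dominant": None, "share": 0.0, "pattern": "none"}
    dominant = max(scores, key=scores.get)
    share = scores[dominant] / total
    pattern = "single element" if share >= dominance else "cumulative"
    return {"total": total, "dominant": dominant, "share": share, "pattern": pattern}

# Two made-up locations with the same cumulative score of 6.0.
spread = {"Cd": 1.0, "Cr": 0.8, "Cu": 1.0, "Ni": 1.2, "Pb": 1.0, "Zn": 1.0}
spike = {"Cd": 0.2, "Cr": 0.0, "Cu": 5.5, "Ni": 0.1, "Pb": 0.1, "Zn": 0.1}

print(summarize_cumulative_score(spread))  # many moderate contributions
print(summarize_cumulative_score(spike))   # one element dominates
```

Mapping the dominant element alongside the cumulative score would tell the user which separate HMC map to consult first.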
Note that we did not evaluate the acidity of soils, which is also an important factor for the pollution
of soils. Mol et al. (2003) showed that the mobility, i.e. bioavailability, of heavy metals in soil
increases as soils become more acidic. In our case study, most of the soils showed a neutral to slightly
acidic reaction (average pH of 6.8 with a standard deviation of 1.01). In areas where soil acidity is a
more serious problem, it would also be important to map soil pH and then either convert this variable to
limitation scores or use it to calculate weighted limitation scores from the input concentration values.
Fig. 7. High cumulative limitation scores can be a result of (a) the cumulative effect of multiple
elements, or (b) a single element that shows very high values. Panels (a) and (b) each show Cd, Cr, Cu,
Ni, Pb and Zn.
Our hope is that this methodological framework will open new perspectives. The next step will be to
develop methods that relate the cumulative limitation scores directly to remediation costs (Broos et
al., 1999), i.e. to estimate the financial losses connected with the constrained use of the land.
Different ratios could have been used for different HMCs. A more objective approach would be to work
with real figures from real-life projects and then adjust the coefficients statistically. Another idea
for future research is to use magnetic susceptibility field images (Schmidt et al., 2005) as predictors
in mapping soil pollutants. The only requirement would be that such images are available for the whole
study area. One could also carry out a separate case study to observe how HMCs change at different
scales, i.e. at different distances and with auxiliary predictors at different grid sizes. Our
experience is that much more detailed maps of auxiliary predictors are needed, especially those related
to urbanization, such as maps of active heating plants and traffic density. In addition, one might
consider the methodology of error propagation (Heuvelink, 1998) to derive the composite uncertainty of
the final soil pollution map. At the moment, multiple conditional simulations with such a large amount
of data are almost impossible in GSTAT due to the computational complexity and the size of the input
maps. Geostatistical simulations would give an idea of the propagated uncertainty, and could also be
used as an input to more complex environmental modelling.
References
Amini, M., Afyuni, M., Fathianpour, N., Khademi, H., Fluhler, H., 2004. Continuous soil pollution mapping using fuzzy logic and spatial interpolation. Geoderma 124 (3-4), 223–233.
Antonic, O., Legovic, T., 1999. Estimating the direction of an unknown air pollution source using a digital elevation model and a sample of deposition. Ecological Modelling 124 (1), 85–95.
Broos, M. J., Aarts, L., van Tooren, C. F., Stein, A., 1999. Quantification of the effects of spatially varying environmental contaminants into a cost model for soil remediation. Journal of Environmental Management 56 (2), 133–145.
Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press, New York, 483 pp.
Hanesch, M., Scholger, R., Dekkers, M. J., 2001. The application of fuzzy c-means cluster analysis and non-linear mapping to a soil data set for the detection of polluted sites. Physics and Chemistry of the Earth, Part A: Solid Earth and Geodesy 26 (11-12), 885–891.
Hengl, T., Heuvelink, G., Stein, A., 2004. A generic framework for spatial prediction of soil variables based on regression-kriging. Geoderma 122 (1-2), 75–93.
Hengl, T., 2006. Finding the right pixel size. Computers & Geosciences 32 (9), 1283–1298.
Heuvelink, G., 1998. Error propagation in environmental modelling with GIS. Taylor & Francis, London, UK.
Juang, K.W., Chen, Y.S., Lee, D.Y., 2003. Using sequential simulation to assess the uncertainty of delineating heavy-metal contaminated soils. Environmental Pollution 127, 229–238.
Miko, S., Halamic, J., Peh, Z., Galovic, L., 2001. Geochemical baseline mapping of soils developed on diverse bedrock from two regions in Croatia. Geologica Croatica 54 (1), 53–118.
Mol, G., Vriend, S., van Gaans, P., 2003. Monitoring soil acidification. Conceptual considerations and practical solutions based on current practice in the Netherlands. Chemical Geology 203 (1-2), 3417–3441.
Odeh, I., McBratney, A., Chittleborough, D., 1995. Further results on prediction of soil properties from terrain attributes: heterotopic cokriging and regression-kriging. Geoderma 67 (3-4), 215–226.
Official Gazette, 1992. Regulation on protection of agricultural land in the Republic of Croatia (in Croatian). Vol. No. 15/92. Narodne novine, Zagreb, Croatia.
Official Gazette, 2001. Regulation on organic agricultural production and products quality (in Croatian). Vol. No. 12/01. Narodne novine, Zagreb, Croatia.
Pebesma, E. J., 2004. Multivariable geostatistics in S: the gstat package. Computers & Geosciences 30 (7), 683–691.
Proctor, J., Baker, A., 1994. The importance of nickel for plant growth in ultramafic (serpentine) soils. In: Ross, S. (Ed.), Toxic metals in soil-plant system. Wiley, New York, pp. 417–432.
Reimann, C., de Caritat, P., 2005. Distinguishing between natural and anthropogenic sources for elements in the environment: regional geochemical surveys versus enrichment factors. Science of The Total Environment 337 (1-3), 91–107.
Romic, M., Romic, D., 2003. Heavy metals distribution in agricultural topsoils in urban areas. Environmental Geology 43, 795–805.
Romic, M., Romic, D., Dolanjski, D., Stricevic, I., 2004. Heavy metals accumulation in topsoil from the wine-growing regions. Part 1. Factors which control retention. Agriculturae Conspectus Scientificus 69, 1–10.
Schmidt, A., Yarnold, R., Hill, M., Ashmore, M., 2005. Magnetic susceptibility as proxy for heavy metal pollution: a site study. Journal of Geochemical Exploration 85 (3), 109–117.
Triantafilis, J., Ward, W., McBratney, A., 2001. Land suitability assessment in the Namoi Valley of Australia, using a continuous model. Australian Journal of Soil Research 39, 273–290.
Unit Geo Software Development, 2001. ILWIS 3.0 Academic user’s guide. ITC, Enschede. URL http://www.itc.nl/ilwis/
Van der Gaast, N., Leenaers, H., Zegwaard, J., 1998. The grey areas in soil pollution risk mapping: the distinction between cases of soil pollution and increased background levels. Journal of Hazardous Materials 61 (1-3), 249–255.
Webster, R., Oliver, M., 2001. Geostatistics for Environmental Scientists. Statistics in Practice. John Wiley & Sons, Chichester.