Estimation or simulation of soil properties? An optimization problem with conflicting criteria

Geoderma 97 (2000) 165–186
www.elsevier.nl/locate/geoderma


P. Goovaerts *

Department of Civil and Environmental Engineering, The University of Michigan, EWRE Bldg., Room 117, Ann Arbor, MI 48109-2125, USA

Received 5 October 1998; received in revised form 11 May 1999; received in revised form 6 July 1999; accepted 6 July 1999

Abstract

Both estimation and simulation approaches are formulated as the selection of a set of attribute values that are optimal for criteria that are typically conflicting. Estimation amounts to minimizing local criteria such as a local error variance, whereas stochastic simulation aims to reproduce global statistics such as the histogram or semivariogram. A simulated annealing (SA) algorithm is presented to generate maps of optimal values: an initial random image is gradually perturbed so as to minimize a weighted combination of three components that measure deviations from local or global features of interest. The approach is illustrated using an environmental data set related to soil contamination by zinc. A validation set shows that, depending on the relative weight given to local and global constraints, the final maps have properties ranging from estimation to simulation in terms of mean square error (MSE) of prediction and extent of the space of uncertainty. Combining both types of constraints leads to better performances (smaller proportions of misclassified locations, smaller prediction errors for the average proportion of contaminated locations within remediation units) than a smooth estimated map or a simulated map that reproduces only the histogram and semivariogram. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: geostatistics; indicator kriging; simulated annealing; prediction error; heavy metals; soil mapping

* Tel.: +1-734-936-0141; fax: +1-734-763-2275. E-mail address: [email protected] (P. Goovaerts).

0016-7061/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0016-7061(00)00037-9


1. Introduction

Most environmental applications, such as the delineation of contaminated areas, require a prior mapping of the target attribute, say a soil pollutant concentration, over the study area. Until recently, soil properties at unsampled grid nodes have been mainly determined using minimum error variance (kriging) interpolation algorithms (Goovaerts, 1999a). Most users are now aware that the map of such local estimates smooths out local details of the spatial variation of the attribute, with small values being overestimated while large values are underestimated. This type of selective bias is a serious shortcoming when one aims at detecting large pollutant concentrations (Goovaerts, 1997a).

Unlike kriging, stochastic simulation does not aim at minimizing local error variance but focuses on the reproduction of statistics such as the sample histogram or the semivariogram model, in addition to the honoring of data values (Goovaerts, 1997b; Deutsch and Journel, 1998). A simulated map, which is some realization of the random function (RF) model adopted, looks more "realistic" than the map of statistically "best" estimates because it reproduces the spatial variability modeled from the sample information. Stochastic simulation is thus increasingly preferred to kriging for applications where the spatial variation of the measured field must be preserved (Srivastava, 1996). Examples in soil science are the delineation of soil contaminated areas (Desbarats, 1996; Goovaerts, 1997a), the modeling of solute transport in the vadose zone (Vanderborght et al., 1997), or the prediction of crop yield (Pachepsky and Acock, 1998). The trade-off cost for the better reproduction of spatial features by simulated maps is that the local property of "minimum error variance" of kriging is lost. A practical consequence is that the mean prediction error tends to be larger for simulated values than for kriging estimates, as pointed out by recent studies (Olea and Pawlowsky, 1996; Goovaerts, 1997a, 1998). The question is thus whether one can strike a balance between local minimization of error variance of the estimated map and reproduction of spatial variability by simulated maps, and what would be the properties of such intermediate maps.

Another difference between stochastic simulation and kriging is that many realizations that all reasonably match the same statistics (histogram, semivariogram) can be generated, whereas there is a single kriged map that yields the minimum error variance at each location. The set of equally probable realizations is particularly useful for assessing the uncertainty in the spatial distribution of attribute values, for investigating the performance of different scenarios (remediation process, land-use policy) or, more generally, for analysing the propagation of errors through GIS (Heuvelink, 1998). Intuitively, the incorporation of local constraints in stochastic simulation should reduce differences between realizations, hence the extent of the space of uncertainty, since it would impart to them features of the unique kriged map.


A recent paper by Goovaerts (1998) brought a new insight into the relation between estimation and simulation, and described an algorithm to generate maps with intermediate properties. Estimation and simulation were presented as two optimization problems that differ in their optimization criteria: minimization of a local expected loss for estimation and reproduction of global statistics (semivariogram, histogram) for simulation. Maps with intermediate properties in terms of mean square prediction error and reproduction of histogram and semivariogram were generated by gradually modifying an initial random image using simulated annealing (SA) and a two-component objective function that controls the reproduction of the semivariogram and the minimization of a local expected loss. The algorithm was applied to the mapping of permeability values. Flow simulation results showed that accounting for local constraints in stochastic simulation yields, on average, smaller errors in production forecast than a smooth estimated map or a simulated map that reproduces only the histogram and semivariogram.

This paper presents a generalization of the aforementioned algorithm in that the objective function here includes three components, which allows one to control separately the minimization of a local expected loss, as well as the reproduction of the histogram and the semivariogram. A new perturbation mechanism is also presented to speed up the completion of the optimization process. This procedure is applied to the mapping of topsoil zinc concentration in a 14.5 km² region of the Swiss Jura. The prediction performances of the algorithm are investigated using a validation set and 36 different weighting schemes for the three components of the objective function.

2. Modeling of local uncertainty

Consider the problem of delineating areas that are potentially contaminated with respect to zinc in a 14.5 km² region of the Swiss Jura. The information available consists of measurements of topsoil Zn at 259 locations depicted at the top of Fig. 1. The sample histogram and semivariogram with the model fitted are displayed in Fig. 2. A detailed description of the sampling, field and laboratory procedures is given in Atteia et al. (1994) and Webster et al. (1994).

A common approach would be to determine first the zinc concentration at $N$ grid nodes $\mathbf{u}_j$ ($j = 1, 2, \ldots, N$) discretizing the study area $\mathcal{A}$, then compare the concentrations with the regulatory threshold. In this example, a square grid with a spacing of 100 m is overlaid over the region.

Regardless of the technique (estimation, simulation) used for determining the concentration at an unsampled grid node $\mathbf{u}_j$, there is necessarily some uncertainty attached to this value. Geostatistics allows the uncertainty about the unknown z-value at $\mathbf{u}_j$, $z(\mathbf{u}_j)$, to be modeled by the conditional cumulative distribution function (ccdf) of the random variable $Z(\mathbf{u}_j)$:

$$F(\mathbf{u}_j; z \mid (n)) = \mathrm{Prob}\{Z(\mathbf{u}_j) \le z \mid (n)\} \tag{1}$$

where the notation "$\mid (n)$" expresses conditioning to the local information, say, $n$ neighboring data $z(\mathbf{u}_\alpha)$. The function (1) gives the probability that the unknown is no greater than any given threshold $z$. Unlike a confidence interval, this model of uncertainty is independent of the choice of a particular estimate at $\mathbf{u}_j$, and thus provides an uncertainty assessment prior to any mapping of the attribute.

Fig. 1. Location map of 259 Zn data, and conditional cdf models provided by ordinary indicator kriging at two grid nodes $\mathbf{u}_1$ and $\mathbf{u}_2$.


Fig. 2. Experimental semivariogram and histogram computed from the data of Fig. 1.

Conditional distributions (ccdfs) can be established using a variety of algorithms that are classified as parametric and non-parametric; see the recent paper in this journal by Goovaerts (1999a). In this paper, a non-parametric approach is used whereby the function $F(\mathbf{u}_j; z \mid (n))$ is assessed for a given number $K$ of threshold values $z_k$ discretizing the range of variation of the attribute $z$:

$$F(\mathbf{u}_j; z_k \mid (n)) = \mathrm{Prob}\{Z(\mathbf{u}_j) \le z_k \mid (n)\}, \quad k = 1, \ldots, K \tag{2}$$

Nine threshold values corresponding to the deciles of the sample distribution of Fig. 2 were considered, and the associated probabilities, $F(\mathbf{u}_j; z_k \mid (n))$, were estimated using ordinary indicator kriging and the indicator semivariogram models of Fig. 3. The resolution of the discrete ccdfs was then increased by performing a linear interpolation between tabulated bounds provided by the sample cdf (Deutsch and Journel, 1998, p. 136). For example, Fig. 1 (bottom graphs) shows the ccdf models at two locations $\mathbf{u}_1$ and $\mathbf{u}_2$. These models reflect the larger uncertainty prevailing at $\mathbf{u}_1$ within a sparsely sampled part of the region, as opposed to the steep ccdf model at $\mathbf{u}_2$.
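To make the discrete-to-continuous step concrete, the sketch below builds a piecewise-linear ccdf from a few kriged probabilities and inverts it to obtain quantiles. It is only an illustration under stated assumptions: the name `make_ccdf`, the bounds `z_min` and `z_max`, and all numerical values are hypothetical, and plain linear interpolation within each class is assumed in place of the sample-cdf-based bounds used in the paper.

```python
import numpy as np

def make_ccdf(thresholds, probs, z_min, z_max):
    """Piecewise-linear ccdf through (z_k, F(u; z_k|(n))) plus two assumed bounds."""
    z = np.concatenate(([z_min], thresholds, [z_max]))
    p = np.concatenate(([0.0], probs, [1.0]))  # F = 0 at z_min, F = 1 at z_max

    def cdf(x):
        # Probability that the unknown value is no greater than x.
        return np.interp(x, z, p)

    def quantile(q):
        # Inverse ccdf; requires the probabilities to be strictly increasing.
        return np.interp(q, p, z)

    return cdf, quantile

# Hypothetical node with three thresholds instead of the nine deciles.
cdf, quantile = make_ccdf([10.0, 20.0, 30.0], [0.2, 0.5, 0.9], z_min=0.0, z_max=50.0)
print(cdf(15.0), quantile(0.5))
```

A steep ccdf (probabilities rising quickly over a narrow range of z) corresponds to a well-informed node such as $\mathbf{u}_2$, whereas a flat one corresponds to a sparsely sampled neighborhood such as that of $\mathbf{u}_1$.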

3. Formulation of the optimization problem

The ccdf model at $\mathbf{u}_j$ provides information about the range of values that are likely to occur there, as well as the corresponding probability of occurrence. Estimation or simulation can be viewed as the selection, within the range of possible z-values, of a single value that is "optimal" for some criterion. Thus, differences between the two types of approaches reside mainly in the optimality criterion that is used.

Fig. 3. Experimental indicator semivariograms for the 9 deciles of the sample distribution of Zn data with models fitted.

3.1. Estimation optimality criterion

For estimation, a common optimality criterion is the minimization of the impact attached to the estimation error $e(\mathbf{u}_j) = z(\mathbf{u}_j) - z^*(\mathbf{u}_j)$ that is likely to occur, where $z^*(\mathbf{u}_j)$ is the estimate of $z$ at $\mathbf{u}_j$. In particular, the least-squares criterion amounts to selecting the value $z^*(\mathbf{u}_j)$ that minimizes the quadratic function of the estimation error $[e(\mathbf{u}_j)]^2$, which is referred to as a loss (Journel, 1989, pp. 27–28; Christakos, 1992, pp. 341–343). Because the actual value $z(\mathbf{u}_j)$ is unknown, the actual loss cannot be computed. However, the model of uncertainty about this unknown (Eq. 1) allows one to calculate the expected loss as

$$\varphi(z^*(\mathbf{u}_j) \mid (n)) = E\{[Z(\mathbf{u}_j) - z^*(\mathbf{u}_j)]^2 \mid (n)\} = \int_{-\infty}^{+\infty} [z - z^*(\mathbf{u}_j)]^2 \, dF(\mathbf{u}_j; z \mid (n)) \tag{3}$$

This expected loss appears as a function $\varphi(\cdot)$ of the estimated value $z^*(\mathbf{u}_j)$. It is minimal if $z^*(\mathbf{u}_j)$ is the expected value (mean) of the ccdf at location $\mathbf{u}_j$:

$$z_E^*(\mathbf{u}_j) = \int_{-\infty}^{+\infty} z \, dF(\mathbf{u}_j; z \mid (n)) \tag{4}$$

The expected loss corresponding to the so-called E-type estimate is the variance of the conditional cdf:

$$\sigma^2(\mathbf{u}_j) = \int_{-\infty}^{+\infty} [z - z_E^*(\mathbf{u}_j)]^2 \, dF(\mathbf{u}_j; z \mid (n)) \tag{5}$$

In practice, the integrals in Eqs. (4) and (5) are approximated by the discrete sums

$$z_E^*(\mathbf{u}_j) \approx \sum_{k=1}^{K+1} \bar{z}_k \cdot [F(\mathbf{u}_j; z_k \mid (n)) - F(\mathbf{u}_j; z_{k-1} \mid (n))] \tag{6}$$

$$\sigma^2(\mathbf{u}_j) \approx \sum_{k=1}^{K+1} [\bar{z}_k - z_E^*(\mathbf{u}_j)]^2 \cdot [F(\mathbf{u}_j; z_k \mid (n)) - F(\mathbf{u}_j; z_{k-1} \mid (n))] \tag{7}$$

where $z_k$, $k = 1, \ldots, K$, are $K$ threshold values discretizing the range of variation of z-values. By convention, $F(\mathbf{u}_j; z_0 \mid (n)) = 0$ and $F(\mathbf{u}_j; z_{K+1} \mid (n)) = 1$. The other thresholds $z_k$ were identified with the p-quantiles corresponding to regularly spaced ccdf increments, i.e., $z_k = F^{-1}(\mathbf{u}_j; k/[K+1] \mid (n))$, with $K = 50$ in the case study. $\bar{z}_k$ is the mean of the class $(z_{k-1}, z_k]$, which depends on the intra-class interpolation model; e.g., for the linear model, $\bar{z}_k = (z_{k-1} + z_k)/2$.
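The discrete sums of Eqs. (6) and (7) translate directly into a few array operations. The sketch below assumes the linear intra-class model for the class means; the function name `etype_and_variance` and the numerical values are hypothetical and chosen only so that the result can be checked by hand.

```python
import numpy as np

def etype_and_variance(z_bounds, ccdf_values):
    """E-type estimate (Eq. 6) and ccdf variance (Eq. 7) at one grid node.

    z_bounds    : z_0, z_1, ..., z_{K+1} (class limits, increasing)
    ccdf_values : F(u; z_k|(n)) at those limits, with F = 0 first and F = 1 last
    """
    z_bounds = np.asarray(z_bounds, dtype=float)
    ccdf_values = np.asarray(ccdf_values, dtype=float)
    class_means = 0.5 * (z_bounds[:-1] + z_bounds[1:])  # linear intra-class model
    class_probs = np.diff(ccdf_values)                  # F(z_k) - F(z_{k-1})
    z_etype = np.sum(class_means * class_probs)
    variance = np.sum((class_means - z_etype) ** 2 * class_probs)
    return z_etype, variance

# Three classes with probabilities 0.25, 0.5 and 0.25.
z_e, var = etype_and_variance([0.0, 10.0, 20.0, 30.0], [0.0, 0.25, 0.75, 1.0])
print(z_e, var)
```

With regularly spaced ccdf increments, as in the case study, all class probabilities are equal and the E-type estimate reduces to the plain average of the class means.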

Consider now the estimation over the $N$ grid nodes $\mathbf{u}_j$. The set of $N$ optimal values is the set of $N$ E-type estimates $\{z_E^*(\mathbf{u}_j),\ j = 1, \ldots, N\}$. The corresponding global expected loss, which is the sum of the $N$ local expected losses, is minimal and equal to the sum of the variances of the $N$ ccdfs. More precisely, the optimality criterion for the grid is

$$\sum_{j=1}^{N} \left| \varphi(z^*(\mathbf{u}_j) \mid (n)) - \sigma^2(\mathbf{u}_j) \right| \quad \text{is minimal} \tag{8}$$

and is equal to zero for the optimal estimation grid.
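A short worked step, added here for clarity rather than taken from the paper, shows why this criterion vanishes only at the E-type estimates. Expanding the expected loss of Eq. (3) around the conditional mean of Eq. (4) gives, for any candidate value $z$ at node $\mathbf{u}_j$:

```latex
\begin{aligned}
\varphi(z \mid (n))
  &= E\{[Z(\mathbf{u}_j) - z]^2 \mid (n)\} \\
  &= E\{[Z(\mathbf{u}_j) - z_E^*(\mathbf{u}_j)]^2 \mid (n)\}
     + [z_E^*(\mathbf{u}_j) - z]^2 \\
  &= \sigma^2(\mathbf{u}_j) + [z - z_E^*(\mathbf{u}_j)]^2 , \\[4pt]
\bigl|\varphi(z \mid (n)) - \sigma^2(\mathbf{u}_j)\bigr|
  &= [z - z_E^*(\mathbf{u}_j)]^2 \;\ge\; 0 .
\end{aligned}
```

The cross term drops out because $z_E^*(\mathbf{u}_j)$ is the mean of the ccdf. Each term of the sum is therefore the squared departure of the candidate value from the local E-type estimate, and the same identity holds for the discrete approximations of Eqs. (6) and (7).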


3.2. Simulation optimality criteria

In stochastic simulation, the objective is not to minimize a local expected loss. Instead, one aims at generating maps or realizations that reproduce statistics deemed most consequential for the problem at hand. In addition to the honoring of data values (exactitude property), which is also achieved by estimation, two basic requisites for such simulated maps are the "approximate" reproduction of a target histogram and semivariogram model. The term "approximate" is used here because: (1) most simulation algorithms do not allow an exact reproduction of histogram and semivariogram models, the discrepancies between realization and model statistics being referred to as ergodic fluctuations (e.g., see Goovaerts, 1999b), and (2) sample statistics may be far from the population parameters and one should be cautious in imposing their strict reproduction, in particular when data are sparse.

Consider first the constraint of reproduction of a target histogram that is typically identified as the sample histogram, possibly after data declustering to correct for a preferential sampling of the study area (Goovaerts, 1997b, pp. 77–81). The constraint is usually expressed in terms of the cumulative distribution, denoted $F(z)$. If the range of variation of $z$ is discretized by a series of $K$ thresholds $z_k$, the optimality criterion is

$$\sum_{k=1}^{K} [\hat{F}(z_k) - F(z_k)]^2 \quad \text{is minimal} \tag{9}$$

where $\hat{F}(z_k)$ is the cumulative frequency at threshold $z_k$ calculated from the realization.

The constraint of semivariogram reproduction can be expressed in a similar way. Considering that the reproduction of the semivariogram model $\gamma(\mathbf{h})$ is usually limited to a specified number $S$ of the first lags, the optimality criterion is written

$$\sum_{s=1}^{S} \frac{[\hat{\gamma}(\mathbf{h}_s) - \gamma(\mathbf{h}_s)]^2}{[\gamma(\mathbf{h}_s)]^2} \quad \text{is minimal} \tag{10}$$

where $\hat{\gamma}(\mathbf{h}_s)$ is the semivariogram value at lag $\mathbf{h}_s$ calculated from the realization. The division by the square of the semivariogram model at each lag $\mathbf{h}_s$ gives more weight to reproduction of the z-semivariogram model near the origin, which is usually the most consequential.
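The two global criteria of Eqs. (9) and (10) can be evaluated on a gridded realization as sketched below. This is a minimal illustration under assumptions that are not in the paper: the realization is a complete 2-D array, lags are integer multiples of the grid spacing, and the experimental semivariogram pools pairs along rows and columns. All function names are hypothetical.

```python
import numpy as np

def histogram_component(values, thresholds, target_cdf):
    """Sum of squared deviations between realized and target cdf (Eq. 9)."""
    values = np.asarray(values, dtype=float).ravel()
    realized = np.array([(values <= t).mean() for t in thresholds])
    return float(np.sum((realized - np.asarray(target_cdf, dtype=float)) ** 2))

def experimental_semivariogram(grid, lag):
    """Half the mean squared difference of pairs `lag` cells apart (rows and columns pooled)."""
    grid = np.asarray(grid, dtype=float)
    dx = grid[:, lag:] - grid[:, :-lag]   # pairs along rows
    dy = grid[lag:, :] - grid[:-lag, :]   # pairs along columns
    squared = np.concatenate((dx.ravel() ** 2, dy.ravel() ** 2))
    return 0.5 * float(squared.mean())

def semivariogram_component(grid, lags, model_values):
    """Relative squared deviations from the semivariogram model (Eq. 10)."""
    realized = np.array([experimental_semivariogram(grid, h) for h in lags])
    model_values = np.asarray(model_values, dtype=float)
    return float(np.sum((realized - model_values) ** 2 / model_values ** 2))

# Tiny checkerboard: every neighbouring pair differs by 1, so gamma(1) = 0.5.
board = np.array([[0.0, 1.0], [1.0, 0.0]])
print(experimental_semivariogram(board, 1))
print(histogram_component(board, thresholds=[0.5], target_cdf=[0.25]))
```

Dividing by the squared model value means that a given absolute deviation counts more at short lags, where the model semivariogram is small, which is the weighting near the origin described above.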

Although the two optimality criteria for simulation (Eqs. 9 and 10) resemble the estimation criterion (Eq. 8), there are two major differences between the estimation and simulation optimization problems. First, whereas the expected loss $\varphi(\cdot)$ introduced for estimation involves each grid node $\mathbf{u}_j$ separately, the constraints of histogram and semivariogram reproduction involve all grid nodes simultaneously (global criteria). Thus, an optimum cannot be reached if the simulated values are derived independently from one another. A second difference with estimation is that there are usually many solutions (i.e., sets of simulated values) to the optimization problem, which contrasts with the uniqueness of the optimal estimation grid; the $l$th solution is hereafter denoted $\{z^{(l)}(\mathbf{u}_j),\ j = 1, \ldots, N\}$. Of the two global constraints, reproduction of the semivariogram is more constraining (fewer solutions) than the reproduction of the histogram (Goovaerts, 1999c). Also, the minimization of the criterion (10) becomes more difficult as the range of the semivariogram increases relative to the size of the simulated domain (Deutsch and Journel, 1998, p. 129). The search for such solutions is not as straightforward as the computation of E-type estimates and requires iterative algorithms, such as SA introduced later.

3.3. Combination of estimation and simulation criteria

Once both estimation and simulation have been formulated as optimization problems, it is straightforward to combine their respective optimization criteria into a single objective function of the type

$$\lambda_1 \cdot \sum_{j=1}^{N} \left| \varphi(z(\mathbf{u}_j) \mid (n)) - \sigma^2(\mathbf{u}_j) \right| + \lambda_2 \cdot \sum_{k=1}^{K} [\hat{F}(z_k) - F(z_k)]^2 + \lambda_3 \cdot \sum_{s=1}^{S} \frac{[\hat{\gamma}(\mathbf{h}_s) - \gamma(\mathbf{h}_s)]^2}{[\gamma(\mathbf{h}_s)]^2} \tag{11}$$

with $\sum_{c=1}^{3} \lambda_c = 1$, $\lambda_c \ge 0$. The relative importance of each component is controlled by the weight $\lambda_c$, which allows the user to strike a balance between a local criterion (minimization of a local expected loss) and global criteria (reproduction of a target histogram or semivariogram model). Moreover, each component is usually standardized by its initial value; see Eq. (12) later.

Depending on the relative weight given to each component, the characteristics of the set of z-values that minimize the objective function (11) range from a smooth estimated map to simulated maps. The ternary plot in Fig. 4 shows 21 different combinations of weights that will be used later in the case study. Open circles depict the weighting schemes where $\lambda_1 > 0.5$, i.e., the combinations where most of the weight is given to the local criterion. Black triangles and circles depict schemes that give priority to the reproduction of the histogram and semivariogram, respectively. Crosses located at the center of the plot correspond to intermediate situations. In summary, moving from the top of the ternary plot to the bottom, increasing importance is given to reproduction of global statistics, hence the corresponding set of optimal values gets farther away from the unique set of E-type estimates (top vertex).

Fig. 4. Ternary plot illustrating different weighting schemes for the three components of the SA objective function. As more weight is given to the local constraint ($\lambda_1$) versus global constraints ($\lambda_2$, $\lambda_3$), the solution of the optimization process is closer to estimation than simulation.
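A regular lattice is one simple way to enumerate weighting schemes on a ternary plot. The sketch below is an assumption, not the paper's stated construction: the text does not say how the weights were chosen, but a lattice with step 1/5 happens to contain 21 triplets, and a step of 1/7 contains 36, the number of schemes used in the validation. The function name `ternary_weights` is hypothetical.

```python
def ternary_weights(steps):
    """All (lambda_1, lambda_2, lambda_3) on a regular lattice with spacing 1/steps.

    Each weight is a multiple of 1/steps, is non-negative, and the three sum to 1.
    Triplets are listed from the top vertex (lambda_1 = 1) downwards.
    """
    schemes = []
    for a in range(steps, -1, -1):            # lambda_1 numerator
        for b in range(steps - a, -1, -1):    # lambda_2 numerator
            c = steps - a - b                 # lambda_3 takes the remainder
            schemes.append((a / steps, b / steps, c / steps))
    return schemes

schemes = ternary_weights(5)
local_priority = [w for w in schemes if w[0] > 0.5]
print(len(schemes), len(local_priority))
```

On this hypothetical lattice the schemes with $\lambda_1 > 0.5$ are the ones nearest the top vertex, that is, the ones whose solutions should stay closest to the map of E-type estimates.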

4. SA

The last step is the identification of the set(s) of z-values that minimize the objective function (11). The set of solutions can be explored using SA, which is a generic name for a family of optimization algorithms based on the principle of stochastic relaxation (Farmer, 1988; Srivastava, 1996). The optimization process amounts to systematically modifying an initial image or realization so as to decrease the value of the objective function, getting the realization acceptably close to the target statistics.

4.1. The algorithm

There are many possible implementations of SA, depending on the way the initial realization is generated and then perturbed, on the components that enter the objective function, and on the type of decision rule and convergence criterion that are adopted for the iterative algorithm (Deutsch and Cockerham, 1994; Goovaerts, 1997b, pp. 409–420). In this paper, the following procedure is used.

(1) Create an initial realization $\{z^{(l)}_{(0)}(\mathbf{u}_j),\ j = 1, \ldots, N\}$ by freezing data values at their locations and assigning to each unsampled grid node a z-value drawn at random from the corresponding ccdf model $F(\mathbf{u}_j; z \mid (n))$.

(2) Compute the initial value of the objective function, $O(i = 0)$, corresponding to that initial realization:

$$O(i) = \lambda_1 \frac{O_1(i)}{O_1(0)} + \lambda_2 \frac{O_2(i)}{O_2(0)} + \lambda_3 \frac{O_3(i)}{O_3(0)} \tag{12}$$

The different components are defined as

$$O_1(i) = \sum_{j=1}^{N} \left| \varphi\!\left(z^{(l)}_{(i)}(\mathbf{u}_j) \mid (n)\right) - \sigma^2(\mathbf{u}_j) \right| \tag{13}$$

$$O_2(i) = \sum_{k=1}^{K} [\hat{F}_{(i)}(z_k) - F(z_k)]^2 \tag{14}$$

$$O_3(i) = \sum_{s=1}^{S} \frac{[\hat{\gamma}_{(i)}(\mathbf{h}_s) - \gamma(\mathbf{h}_s)]^2}{[\gamma(\mathbf{h}_s)]^2} \tag{15}$$

where $\hat{F}_{(i)}(z_k)$ and $\hat{\gamma}_{(i)}(\mathbf{h}_s)$ are the cumulative frequencies and semivariogram values computed from the realization at the $i$th perturbation $\{z^{(l)}_{(i)}(\mathbf{u}_j),\ j = 1, \ldots, N\}$. To prevent the component with the largest unit from dominating the objective function, each component $O_c$ is standardized by its initial value $O_c(0)$. Thus, the initial value of the objective function is 1 since the weights $\lambda_c$ must sum to unity.

(3) Perturb the realization by selecting randomly a location $\mathbf{u}_j$ and replacing the current value $z^{(l)}_{(i=0)}(\mathbf{u}_j)$ by a new value randomly drawn from the conditional cdf $F(\mathbf{u}_j; z \mid (n))$. This perturbation mechanism is an improvement over common procedures (i.e., sampling of the marginal cdf $F(z)$ or swapping of values) in that it accelerates the convergence process and allows one to gain more control on the local distributions of simulated values (Goovaerts, 1999c).

(4) Assess the impact of the perturbation on the reproduction of target statistics by recomputing the objective function, $O_{new}(0)$, accounting for the modification of the initial realization.

(5) Accept all perturbations that lower the objective function. Unfavorable perturbations are accepted according to the value of a negative exponential probability distribution:

$$\mathrm{Prob}\{\text{Accept } i\text{th pert.}\} = \begin{cases} 1 & \text{if } O(i) \le O(i-1) \\ \exp\!\left(\dfrac{O(i-1) - O(i)}{t(i)}\right) & \text{otherwise} \end{cases}$$

where $t(i)$ is the "temperature" at the $i$th perturbation. The idea is to start with an initially high temperature $t(0)$, which allows a large proportion of unfavorable perturbations to be accepted at the beginning of the simulation. As the simulation proceeds, the temperature is gradually lowered so as to limit discontinuous modification of the stochastic image. Two important issues are the timing and magnitude of the temperature reduction, which define the annealing schedule. According to Deutsch and Cockerham's typology (Deutsch and Cockerham, 1994), a fast annealing schedule was used; that is, the initial temperature was set to 1 and lowered by a factor of 20 (reduction factor = 0.05) whenever enough perturbations ($5 \times N$) have been accepted or too many ($50 \times N$) have been tried.

(6) If the perturbation is accepted, update the initial realization into a new image $\{z^{(l)}_{(1)}(\mathbf{u}_j),\ j = 1, \ldots, N\}$ with objective function value $O(1) = O_{new}(0)$.

(7) Repeat steps 3–6 until either the target low value $O_{min} = 0.001$ is reached or the maximum number of attempted perturbations at the same temperature has been reached three times.

Other realizations $\{z^{(l')}(\mathbf{u}_j),\ j = 1, \ldots, N\}$, $l' \ne l$, are generated by repeating the entire process starting from different initial realizations.

Fig. 5. First realizations of the spatial distribution of Zn values generated using simulated annealing and six different weighting schemes for the objective function (12). According to the ternary plot of Fig. 4, as one moves towards bottom maps, the priority shifts from the minimization of local expected loss to the reproduction of histogram and semivariogram.
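Steps 1 to 7 can be sketched compactly as follows. This is a simplified illustration, not the author's implementation, and it rests on several assumptions that are not in the paper: each local ccdf is represented by a small array of equiprobable candidate values, the realization is a complete 2-D grid with lags in grid units, the whole objective function is recomputed after every perturbation (a real implementation would update it incrementally), the redrawn value is forced to differ from the current one, and a `max_tries` cap is added as a safety stop. The names `components` and `anneal` are hypothetical.

```python
import numpy as np

def components(z, z_etype, thresholds, target_cdf, lags, gamma_model):
    """Unstandardized components O1, O2, O3 (Eqs. 13-15) of a 2-D realization z."""
    o1 = np.sum((z - z_etype) ** 2)  # |phi - sigma^2| equals (z - z_E)^2 at each node
    cdf = np.array([(z <= t).mean() for t in thresholds])
    o2 = np.sum((cdf - np.asarray(target_cdf, dtype=float)) ** 2)
    o3 = 0.0
    for h, g in zip(lags, gamma_model):  # pairs pooled along rows and columns
        sq = np.concatenate(((z[:, h:] - z[:, :-h]).ravel() ** 2,
                             (z[h:, :] - z[:-h, :]).ravel() ** 2))
        o3 += (0.5 * sq.mean() - g) ** 2 / g ** 2
    return np.array([o1, o2, o3])

def anneal(candidates, frozen, weights, thresholds, target_cdf, lags, gamma_model,
           rng, t0=1.0, reduction=0.05, o_min=1e-3, max_tries=200_000):
    """candidates[r, c, :] discretizes the ccdf at node (r, c); frozen marks data nodes,
    where all candidates are assumed equal to the datum."""
    n_rows, n_cols, n_cand = candidates.shape
    n = n_rows * n_cols
    args = (candidates.mean(axis=2), thresholds, target_cdf, lags, gamma_model)
    weights = np.asarray(weights, dtype=float)
    # Step 1: initial realization drawn node by node from the local ccdfs.
    pick = rng.integers(n_cand, size=(n_rows, n_cols))
    z = np.take_along_axis(candidates, pick[..., None], axis=2)[..., 0]
    # Step 2: standardize each component by its initial value (guarded against zero).
    o_init = np.maximum(components(z, *args), 1e-12)

    def objective(field):
        return float(weights @ (components(field, *args) / o_init))

    o_cur, t = objective(z), t0
    free = np.argwhere(~frozen)
    accepted = tried = stalls = total = 0
    while o_cur > o_min and stalls < 3 and total < max_tries:   # Step 7
        r, c = free[rng.integers(len(free))]                    # Step 3: pick a free node
        new = (pick[r, c] + 1 + rng.integers(n_cand - 1)) % n_cand
        old_value = z[r, c]
        z[r, c] = candidates[r, c, new]                         # redraw from its ccdf
        o_new = objective(z)                                    # Step 4
        tried += 1
        total += 1
        # Step 5: always accept improvements, sometimes accept deteriorations.
        if o_new <= o_cur or rng.random() < np.exp((o_cur - o_new) / t):
            o_cur, pick[r, c] = o_new, new                      # Step 6
            accepted += 1
        else:
            z[r, c] = old_value
        if accepted >= 5 * n or tried >= 50 * n:                # annealing schedule
            stalls += int(accepted < 5 * n)  # counts only the "too many tried" case
            t = max(t * reduction, 1e-300)
            accepted = tried = 0
    return z, o_cur
```

A call needs a random generator such as `np.random.default_rng(seed)`; different seeds give different realizations, which mirrors the repetition of the whole process from different initial images.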

4.2. Example for the Zn data set

Fig. 5 shows simulated maps generated using SA and six different weighting schemes for the objective function (12). The layout of the maps follows the topology of Fig. 4. The upper map corresponds to the situation where all the weight is given to the minimization of local expected loss ($\lambda_1 = 1$). The solution provided by SA is very close to the map of E-type estimates derived from the indicator-based ccdf models using Eq. (6) (Fig. 6, top graph), which is known to be the unique optimum. Another situation where the optimal grid is unique and can be computed analytically is that of imposing the joint constraints of histogram reproduction and minimization of local expected loss ($\lambda_1 = \lambda_2 = 0.5$). In this case, Goovaerts (1998) showed that the set of optimal values $\{z_c^*(\mathbf{u}_j),\ j = 1, \ldots, N\}$ can be obtained by applying the following rank-preserving transform to the E-type estimates:

$$z_c^*(\mathbf{u}_j) = F^{-1}\!\left(F_E\!\left(z_E^*(\mathbf{u}_j)\right)\right), \quad j = 1, \ldots, N \tag{16}$$

where $F_E(\cdot)$ is the cumulative distribution function (cdf) of the $N$ estimates, and $F(\cdot)$ is the target cdf. Again, the solution provided by SA (Fig. 5, left middle graph) is very close to the analytical optimum displayed in Fig. 6 (bottom graph).

A common feature of the maps of original and rescaled E-type estimates is their failure to reproduce the short-range variability modeled from the data; see the semivariograms of Fig. 7. Reproduction of that variability is imparted by incorporating the third component into the objective function, that is $\lambda_3 > 0$. In the common implementation of the SA perturbation mechanism (i.e., sampling of the marginal cdf or swapping of values), the sole reproduction of the histogram ($\lambda_2 = 1$) would yield very noisy realizations with pure-nugget semivariograms. Here, because the local cdfs already account for the pattern of spatial continuity through indicator kriging, their random sampling yields realizations that display some spatial structure.

Fig. 6. Maps of E-type estimates before and after rescaling to match the sample histogram of Fig. 2.

Fig. 7. Experimental semivariograms computed from the maps of Fig. 5 with the target model (solid line).
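The rank-preserving transform of Eq. (16) used for the case $\lambda_1 = \lambda_2 = 0.5$ can be sketched as a quantile mapping. The implementation choices below are assumptions made for illustration: $F_E$ is taken as the empirical cdf of the estimates with a mid-rank plotting position, ties are not treated specially, and $F^{-1}$ is the linearly interpolated quantile of a set of target values such as the sample data. The function name is hypothetical.

```python
import numpy as np

def rank_preserving_transform(estimates, target_values):
    """Rescale estimates so that their histogram matches the target (Eq. 16)."""
    estimates = np.asarray(estimates, dtype=float)
    ranks = estimates.ravel().argsort().argsort()      # 0 for the smallest estimate
    probs = (ranks + 0.5) / estimates.size             # F_E at each estimate (mid-rank)
    rescaled = np.quantile(np.asarray(target_values, dtype=float), probs)  # F^{-1}
    return rescaled.reshape(estimates.shape)

smooth = np.array([3.0, 1.0, 2.0])            # narrow range, as in a smoothed map
target = np.array([10.0, 20.0, 30.0, 40.0])   # target distribution with a wider range
print(rank_preserving_transform(smooth, target))
```

The ordering of the grid nodes is unchanged by the transform, so the rescaled map keeps the spatial pattern of the E-type map while its values are stretched onto the target histogram.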

The reproduction of the target histogram by the series of maps can be assessed from the Q–Q plots of Fig. 8. In such plots, the quantiles of the histogram of simulated values are plotted against the corresponding quantiles of the target histogram (Deutsch and Journel, 1998, p. 207). A perfect reproduction of the target histogram would entail that the quantiles of both distributions are identical, i.e., all black dots would plot on the 1:1 line. Giving all the weight to the minimization of local expected loss ($\lambda_1 = 1$) produces the typical smoothing effect whereby small values are overestimated while large values are underestimated (Fig. 8, top graph). Such a smoothing effect is partly corrected by incorporating the constraints of histogram or semivariogram reproduction in the objective function (Fig. 8, middle graph). Best results are obtained when the local constraint is ignored ($\lambda_1 = 0$) and the second component is accounted for ($\lambda_2 > 0$).

Fig. 8. Q–Q plots of the distribution of 259 Zn data versus that of simulated values of Fig. 5. Similar distributions should plot on the 1:1 line.

Fig. 9 shows the evolution of the objective function value along the optimization process. The starting value is one because of the initial rescaling of the objective function; recall expression (12). That value increases at the beginning of the iterative procedure, when a large proportion of unfavorable perturbations is accepted (high temperature of the annealing schedule). The increase is much smaller when only the first component is accounted for (top graph), which indicates that the initial realization was far away from the optimum and so unfavorable perturbations were less likely. Another consequence is that more perturbations are required to drive the objective function value to zero; compare with the cases $\lambda_2 = 1$ and $\lambda_3 = 1$. Regardless of the weighting scheme, most of the decrease in the objective function value arises during the first 20,000 perturbations, which is roughly ten times the number of grid nodes. Note that the function including both local and global criteria (middle row) cannot be lowered to zero because these constraints are conflicting.

Fig. 9. Impact of the weighting scheme on the reduction of the objective function value versus the number of perturbations.


5. Validation set

The same algorithm was used to generate 50 realizations of the spatial distribution of Zn values for 36 weighting schemes, including the 21 of Fig. 4. Simulated values were then compared with actual concentrations measured at 100 test locations that have not been used so far; see Fig. 10.

The first two comparative statistics are the average global expected loss and the average variance of the local distributions of 50 simulated values, which were computed as

$$\frac{1}{50} \sum_{l=1}^{50} \sum_{j=1}^{100} \varphi\!\left(z^{(l)}(\mathbf{u}_j) \mid (n)\right) \tag{17}$$

$$\frac{1}{100} \sum_{j=1}^{100} \frac{1}{50} \sum_{l=1}^{50} \left[ z^{(l)}(\mathbf{u}_j) - \frac{1}{50} \sum_{l'=1}^{50} z^{(l')}(\mathbf{u}_j) \right]^2 \tag{18}$$

Results are expressed as percentage of the largest of the 36 values; see Fig. 11. As expected, the global loss decreases as the weight $\lambda_1$ of the first component (13) increases. For $\lambda_1 = 1$, the global loss is minimum and half that obtained when the local constraint is ignored. Also, giving more weight to the minimization of local expected loss reduces differences between realizations, which become more similar to the unique optimum, that is the map of E-type estimates. The average local variance (18), which is a measure of the extent of the space of uncertainty, is thus small. When both constraints of minimum expected loss and histogram reproduction are imposed, the local variance is still small because again there is a single optimum, which is the map of rescaled E-type estimates; see the left side of the ternary plot (Fig. 11, bottom graph). However, as soon as all the weight is given to histogram reproduction, the relative variance jumps from 19% to 100%. The maximum variance is obtained for $\lambda_2 = 1$ because the constraint of histogram reproduction is more general (less restrictive) than the constraint of semivariogram reproduction. In other words, there are many more solutions to the problem of generating a map with a given histogram than that of generating values that obey a given pattern of spatial variability.

Fig. 10. Location map of 100 test locations where zinc concentration is known but has not been used in the optimization process.

Fig. 11. Statistics (average global expected loss, average variance of local distributions of simulated values) computed from 50 realizations generated using simulated annealing and 36 different weighting schemes. Results are expressed as percentage of the largest value.
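The two statistics of Eqs. (17) and (18) reduce to simple array reductions once the realizations are stored as a matrix with one row per realization and one column per test location. The sketch below is illustrative only: the function names are hypothetical, and the expected loss is evaluated as the ccdf variance plus the squared departure from the E-type estimate, which assumes that the ccdf mean and variance at each test location are available.

```python
import numpy as np

def average_global_expected_loss(realizations, z_etype, ccdf_variance):
    """Eq. (17): expected loss summed over locations, averaged over realizations."""
    r = np.asarray(realizations, dtype=float)        # shape (L realizations, J locations)
    loss = ccdf_variance + (r - z_etype) ** 2        # phi(z|(n)) at every node
    return float(loss.sum(axis=1).mean())

def average_local_variance(realizations):
    """Eq. (18): variance across realizations at each location, averaged over locations."""
    r = np.asarray(realizations, dtype=float)
    return float(r.var(axis=0).mean())               # population variance (divide by L)

# Two realizations at two locations; the second location is identical in both.
sims = np.array([[1.0, 2.0], [3.0, 2.0]])
print(average_local_variance(sims))
print(average_global_expected_loss(sims, z_etype=np.array([2.0, 2.0]),
                                   ccdf_variance=np.array([1.0, 1.0])))
```

A location where all realizations agree contributes nothing to the average local variance, which is why this statistic shrinks as the realizations converge to a single optimum.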

The mean square error (MSE) of prediction was computed as the arithmetic average of square differences between actual concentrations and values generated at the 100 test locations using the different weighting schemes. The smallest errors are obtained using an estimation approach that seeks only the minimization of local expected loss (Fig. 12, top vertex). When the constraint of histogram reproduction is included, the prediction error increases. When a third constraint of semivariogram reproduction is included, the prediction error increases even more. In other words, reproduction of spatial variability as modeled by the semivariogram is achieved at the expense of larger errors of prediction, which confirms results of previous studies (Olea and Pawlowsky, 1996; Goovaerts, 1997a, 1998, 1999c).

Fig. 12. Mean square error of prediction obtained on average over 50 realizations generated using simulated annealing and 36 different weighting schemes. Results are expressed as percentage of the largest value. Best if smallest.

Each test location was classified as contaminated if the simulated value exceeds a threshold value identified with one of the 9 deciles of the distribution of test data, and the exactness of the classification was assessed from the actual zinc concentrations. Two performance criteria were computed: the proportion of misclassified locations (a local or pointwise criterion) and the absolute error in predicting the proportion of contaminated locations within 1 km² squares. The latter criterion is called spatial in that its computation involves several test locations (5–10) within each of 15 non-overlapping blocks distributed over the study area. These two criteria were averaged over 50 realizations for each of the 36 weighting schemes. Fig. 13 (top graphs) shows that for 7 deciles the sole constraint of minimization of local expected loss yields the smallest proportion of misclassified locations. For the 4th and 6th deciles, better results are obtained when the constraint of histogram reproduction is included ($\lambda_2 > 0$). The benefit of global constraints is much clearer for the spatial criterion: a correct prediction of the average proportions of contaminated locations within 1 km² blocks requires a correction of the smoothing effect, usually through the constraint of histogram reproduction and, for the two extreme thresholds (1st and 9th deciles), through the reproduction of the pattern of spatial variability ($\lambda_3 > 0$).
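The two performance criteria can be illustrated with the sketch below. It is only an assumed implementation: the function names, the `block_ids` array assigning each test location to a block, and the strict inequality used for "exceeds" are choices made for the example.

```python
import numpy as np

def misclassification_rate(actual, simulated, threshold):
    """Local criterion: proportion of test locations whose contaminated/safe
    status differs from the truth, averaged over realizations."""
    actual_flag = np.asarray(actual, dtype=float) > threshold
    sim_flag = np.asarray(simulated, dtype=float) > threshold  # (n_real, n_test)
    return float(np.mean(sim_flag != actual_flag))

def block_proportion_error(actual, simulated, threshold, block_ids):
    """Spatial criterion: absolute error in the proportion of contaminated
    locations per block, averaged over realizations and then over blocks."""
    actual_flag = (np.asarray(actual, dtype=float) > threshold).astype(float)
    sim_flag = (np.asarray(simulated, dtype=float) > threshold).astype(float)
    block_ids = np.asarray(block_ids)
    errors = []
    for b in np.unique(block_ids):
        in_block = block_ids == b
        true_prop = actual_flag[in_block].mean()
        sim_prop = sim_flag[:, in_block].mean(axis=1)  # one value per realization
        errors.append(np.abs(sim_prop - true_prop).mean())
    return float(np.mean(errors))
```

The nine decile thresholds of the test data could be obtained with `np.quantile(actual, np.arange(1, 10) / 10)`, and both functions evaluated once per threshold and weighting scheme.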

In this case study, the reproduction of the pattern of spatial variability (semivariogram) does not appear to improve greatly the prediction and classification performances. A possible explanation is the high sampling density that attenuates the smoothing effect (short distances between data locations and test locations) while providing accurate estimates of the histogram and local cdf. To enhance the impact of global constraints (in particular reproduction of the semivariogram) relative to the local one, the sampling density was reduced by selecting randomly a subset of 50 Zn data that were used in indicator kriging to model ccdfs. Fifty realizations were then generated using the same algorithm and 36 weighting schemes, and performances were assessed for the same 100 test locations. Fig. 13 (bottom graphs) shows that the third constraint of semivariogram reproduction is now included in 9 (instead of 2 for the high sampling density) optimal weighting schemes for both local and spatial criteria. The weight $\lambda_1$ of the local constraint is particularly lessened for the spatial criterion.

Fig. 13. Weighting schemes (out of 36 tested) yielding the smallest proportions of misclassified test locations (left column) or the best predictions of the proportion of contaminated test locations within 1 km² squares (right column). Two sampling densities (259 and 50 Zn data) and nine thresholds corresponding to the deciles of the distribution of 100 test data are considered.

At this stage, it is difficult to give a rule of thumb for the choice of the optimal weighting scheme. Except for the minimization of the MSE of prediction, results indicate that the optimal combination depends on the performance criterion and the threshold considered. Further research should be conducted to optimize the selection of the relative weights assigned to each component in the objective function, e.g., through response-surface methodology. The optimization could be done using a cross-validation (leave-one-out) procedure or a jackknife approach (independent set of test locations, as in this paper).
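A simple alternative to response-surface methodology is an exhaustive search over a grid of weights, keeping the scheme with the best validation score. The sketch below assumes the three weights are non-negative and sum to 1, and uses a step of 1/7 only because that spacing happens to give 36 combinations; the spacing actually used in the study is not stated in this section. The `score` callable is hypothetical and stands for any criterion computed on independent test data.

```python
from fractions import Fraction
from itertools import product

def simplex_grid(steps):
    """All weight triplets (l1, l2, l3) on a regular grid of spacing 1/steps,
    with non-negative entries that sum to 1."""
    grid = []
    for i, j in product(range(steps + 1), repeat=2):
        k = steps - i - j
        if k >= 0:
            grid.append((Fraction(i, steps), Fraction(j, steps), Fraction(k, steps)))
    return grid

def best_scheme(schemes, score):
    """Return the weighting scheme with the smallest validation score.

    score: hypothetical callable mapping a triplet of weights to a number,
           e.g. the MSE or misclassification rate on a test set.
    """
    return min(schemes, key=score)

schemes = simplex_grid(7)  # assumed spacing; yields 36 triplets
```

In practice each call to `score` would require generating realizations by simulated annealing for that triplet, so the search is expensive and a coarse grid is a natural starting point.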

6. Conclusions

Both estimation and simulation approaches can be formulated as the selection of a set of attribute values that are optimal for specific criteria. SA with a three-component objective function allows one to generate maps with characteristics gradually changing from estimation to simulation as more importance is given to global constraints (histogram and semivariogram reproduction) versus the local constraint of minimization of expected loss. Because of their conflicting nature, the three constraints cannot be jointly met, which means that the objective function cannot be lowered to zero. A balance between the two types of constraints can, however, be achieved through the weighting scheme of the objective function.

The case study shows that as the weight of the local constraint increases, the realizations become smoother and more similar to each other, thereby reducing the spread of the local distribution of simulated values. Also, accounting for local constraints in stochastic simulation reduces the average prediction error and the risk of wrongly classifying contaminated locations as safe. The smallest prediction errors are obtained for the estimation approach ($\lambda_1 = 1$), which generally also leads to the smallest proportions of misclassified locations.

In practice, remedial measures are applied to an area or block, not to a single location. A common criterion is to remediate units where the critical threshold is exceeded over a given extent of the block. For such a spatial criterion that involves many grid nodes simultaneously, it becomes important to reproduce the within-block variability, which can be achieved by imposing the reproduction of the histogram or the semivariogram, or both. The benefit of the spatial constraint is underscored when data are sparse because the smoothing effect of kriging is more pronounced and the spatial information provided by distant observations becomes more valuable.
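The block-based remediation criterion mentioned above can be sketched as follows. The function name, the `min_extent` parameter, and the choice of "at least" (rather than "more than") for the extent comparison are assumptions made for illustration.

```python
import numpy as np

def units_to_remediate(values, block_ids, threshold, min_extent):
    """Flag each block whose fraction of nodes exceeding the threshold
    is at least min_extent (a fraction between 0 and 1)."""
    values = np.asarray(values, dtype=float)
    block_ids = np.asarray(block_ids)
    decisions = {}
    for b in np.unique(block_ids):
        # Fraction of grid nodes in this block above the critical threshold
        exceed_fraction = np.mean(values[block_ids == b] > threshold)
        decisions[b.item()] = bool(exceed_fraction >= min_extent)
    return decisions
```

Because the decision depends on the fraction of nodes above the threshold, a map that smooths out high values within a block can change the outcome, which is why within-block variability matters for this criterion.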


References

Atteia, O., Dubois, J.-P., Webster, R., 1994. Geostatistical analysis of soil contamination in the Swiss Jura. Environ. Pollut. 86, 315–327.

Christakos, G., 1992. Random Field Models in Earth Sciences. Academic Press, New York, 474 pp.

Desbarats, A.J., 1996. Modeling spatial variability using geostatistical simulation. In: Rouhani, S., Srivastava, R.M., Desbarats, A.J., Cromer, M.V., Johnson, A.I. (Eds.), Geostatistics for Environmental and Geotechnical Applications. American Society for Testing and Materials STP 1283, Philadelphia, pp. 32–48.

Deutsch, C.V., Cockerham, P., 1994. Practical considerations in the application of simulated annealing to stochastic simulation. Math. Geol. 26, 67–82.

Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User's Guide. 2nd edn. Oxford Univ. Press, New York, 369 pp.

Farmer, C., 1988. The generation of stochastic fields of reservoir parameters with specified geostatistical distributions. In: Edwards, S., King, P. (Eds.), Mathematics in Oil Production. Clarendon Press, Oxford, pp. 235–252.

Goovaerts, P., 1997a. Kriging vs. stochastic simulation for risk analysis in soil contamination. In: Soares, A., Gómez-Hernández, J., Froidevaux, R. (Eds.), GeoENV I - Geostatistics for Environmental Applications. Kluwer Academic Publishing, Dordrecht, pp. 247–258.

Goovaerts, P., 1997b. Geostatistics for Natural Resources Evaluation. Oxford Univ. Press, New York, 512 pp.

Goovaerts, P., 1998. Accounting for estimation optimality criteria in simulated annealing. Math. Geol. 30, 511–534.

Goovaerts, P., 1999a. Geostatistics in soil science: state-of-the-art and perspectives. Geoderma 89, 1–45.

Goovaerts, P., 1999b. Impact of the simulation algorithm, magnitude of ergodic fluctuations and number of realizations on the spaces of uncertainty of flow properties. Stochastic Environ. Res. Risk Assess. 13 (3), 161–182.

Goovaerts, P., 1999c. Combining minimum error variance and spatial variability in the modeling of petrophysical properties. Stanford Center for Reservoir Forecasting, Stanford University, unpublished annual report No. 12.

Heuvelink, G.B.M., 1998. Error Propagation in Environmental Modelling with GIS. Research Monographs in GIS Series, London, 127 pp.

Journel, A.G., 1989. In: Fundamentals of Geostatistics in Five Lessons. Short Course in Geology, vol. 8. American Geophysical Union, Washington, DC, 40 pp.

Olea, R.A., Pawlowsky, V., 1996. Compensating for estimation smoothing in kriging. Math. Geol. 28, 407–417.

Pachepsky, Y., Acock, B., 1998. Stochastic imaging of soil parameters to assess variability and uncertainty of crop yield estimates. Geoderma 85, 213–229.

Srivastava, M.R., 1996. An overview of stochastic spatial simulation. In: Mowrer, H.T., Czaplewski, R.L., Hamre, R.H. (Eds.), Spatial Accuracy Assessment in Natural Resources and Environmental Sciences: Second International Symposium. U.S. Department of Agriculture, Forest Service, Fort Collins, pp. 13–22, General Technical Report RM-GTR-277.

Vanderborght, J., Jacques, D., Mallants, D., Tseng, P.H., Feyen, J., 1997. Analysis of solute redistribution in heterogeneous soil: II. Numerical simulation of solute transport. In: Soares, A., Gómez-Hernández, J., Froidevaux, R. (Eds.), geoENV I - Geostatistics for Environmental Applications. Kluwer Academic Publishing, Dordrecht, pp. 283–295.

Webster, R., Atteia, O., Dubois, J.-P., 1994. Coregionalization of trace metals in the soil in the Swiss Jura. Eur. J. Soil Sci. 45, 205–218.
