Geoderma 138 (2007) 86–95

www.elsevier.com/locate/geoderma

Optimization of sample patterns for universal kriging of environmental variables

Dick J. Brus ⁎, Gerard B.M. Heuvelink

Soil Science Centre, Wageningen University and Research Centre, P.O. Box 47, 6700 AA Wageningen, The Netherlands

Received 8 March 2006; received in revised form 17 October 2006; accepted 19 October 2006
Available online 28 November 2006

Abstract

The quality of maps obtained by interpolation of observations of a target environmental variable at a restricted number of locations is partly determined by the spatial pattern of the sample locations. A method is presented for optimization of the sample pattern when the environmental variable is interpolated with the help of exhaustively known covariates, which are assumed to be linearly related to the target variable. In this method the spatially averaged universal kriging variance (MUKV), which incorporates trend estimation error as well as spatial interpolation error, is minimized. The optimal pattern is obtained using simulated annealing. The method requires that the covariance function or variogram of the regression residuals is known. The method is tested in a case study on the Mean Highest Water table in a coversand area in The Netherlands. The patterns of 25, 50 and 100 sample locations are optimized and compared with the patterns optimized with the ordinary kriging (OK) model (assuming no trend) and with the multiple linear regression (MLR) model (assuming no spatial autocorrelation of residuals). The results show that the UK-patterns are a good compromise between spreading in geographic space and feature space. The MUKV for the UK-patterns is 19% (n=25), 7% (n=50) and 3% (n=100) smaller than for the OK-patterns. Compared with the MLR-patterns the reduction is 2%, 4% and 4%, respectively.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Geostatistics; Multiple linear regression; Ordinary kriging; Sampling optimization; Simulated annealing

1. Introduction

In mapping environmental variables, two main stages can be distinguished: 1) the sampling stage, during which measurements are taken of the environmental variable at selected locations; and 2) the prediction stage, during which the observations are interpolated to a fine grid. The quality of the resulting map is determined by both stages. Geostatisticians and pedometricians have concentrated most on the second stage, by developing and applying various types of kriging algorithms (Goovaerts, 1997; Heuvelink and Webster, 2001). The initial univariate interpolation algorithms were later extended to multivariate interpolation, whereby interpolation is improved by the use of covariates. In the past decade, however, more attention has been paid to the sampling aspects of mapping.

⁎ Corresponding author. E-mail address: [email protected] (D.J. Brus).

0016-7061/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.geoderma.2006.10.016

We now know that for model-based mapping there is no direct need for probability sampling, and that purposive sampling is generally more efficient (Brus and de Gruijter, 1997; de Gruijter et al., 2006). Suitable sample patterns for model-based mapping are spatial coverage samples and, with the variogram known, geostatistical samples (Brus et al., 2006). Spatial coverage sampling selects the optimal sample pattern by optimizing a geometric quality measure defined in terms of the distance between the observation points and the interpolation-grid points. Geostatistical sampling chooses the optimal sample pattern such that the maximum or average kriging prediction-error variance is minimized. In both types of sample pattern the sample locations are optimized in geographic space. These two types of sample pattern are well suited for mapping by univariate kriging. However, these sample patterns may be suboptimal for spatial prediction with the help of covariates.

A straightforward and simple way of spatial prediction with the help of covariates is to make use of linear regression. For mapping with a regression model, the sampling problem boils down to selecting the locations that lead to the most accurate estimates of the regression coefficients. In D-optimal designs based on experimental design theory, the determinant of the variance–covariance matrix of the estimated regression coefficients is minimized (Atkinson and Donev, 1992). Such samples contain the locations with extreme values for the covariates. If we are uncertain about the model structure (must quadratic and interaction terms also be incorporated?), then response-surface (RS) designs come into scope (Myers and Montgomery, 2002). In both sample designs the sample locations are optimized in feature space.
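The D-optimality criterion mentioned above can be illustrated with a small numerical sketch. It assumes a straight-line model with a single covariate scaled to [0, 1] and independent residuals, so that the variance–covariance matrix of the estimated coefficients is proportional to (X′X)⁻¹. The function name `d_criterion` and the two candidate designs are hypothetical and serve only as illustration.

```python
import numpy as np

def d_criterion(x):
    """Determinant of (X'X)^{-1} for the model z = b0 + b1*x.

    Under independent residuals this is proportional to the determinant
    of the variance-covariance matrix of the estimated coefficients,
    which a D-optimal design minimizes.
    """
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x)), x])  # design matrix with intercept
    return float(np.linalg.det(np.linalg.inv(X.T @ X)))

# Four points at the extremes of the covariate range ...
extreme = d_criterion([0.0, 0.0, 1.0, 1.0])
# ... versus four points near the centre of the range.
central = d_criterion([0.4, 0.5, 0.5, 0.6])
```

In this toy case the design with extreme covariate values gives the smaller determinant, in line with the statement that D-optimal samples contain the locations with extreme covariate values.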

Optimization in feature space may lead to strong spatial clustering of sample locations. There is no penalty on spatial clustering because in standard regression it is assumed that the regression residuals are independent. In practice, the residuals will often be spatially autocorrelated, leading to less precise estimates of the regression coefficients. Maximum spreading in feature space is then no longer optimal, and one needs to find a compromise between spreading in feature space and in geographic space. To ‘eliminate or minimize the effect of spatially dependent error structure’, Lesch et al. (1995) and Lesch (2005) proposed a selection algorithm that starts from a sample that is closest to the optimum RS design. The second and third best RS designs are also computed. Next, points of the best RS design are swapped with corresponding points in the second and third best RS designs. Swaps are accepted if they improve the spatial coverage. Note that the primary objective of the spatial RS sample is model identification and model estimation, and the mapping of the target variable via use of the regression function only. In this approach one does not incorporate the spatial correlation in the residual pattern (if any) into the prediction function (unlike a universal kriging model) (Lesch, 2005).

For spatial interpolation there is an extra reason for spreading of observations in geographical space. When regression residuals are spatially correlated, the map may become more accurate by interpolating these residuals and adding them to the predicted values from the regression model, as in regression-kriging (Knotters et al., 1995; Hengl et al., 2004). In this method, the data are used twice, first for fitting the linear regression model and second for interpolation of the residuals. These two uses impose conflicting requirements on the optimal pattern of the sample locations. Estimation of the regression coefficients requires optimization in feature space, whereas interpolation benefits from spreading in geographic space.

To strike a balance between spreading in feature and geographic space, Hengl et al. (2003) proposed an equal range design for mapping by regression-kriging. In this procedure the study area is stratified on the basis of the frequency distribution of the covariates. Stratification limits are set at equal distances in feature space. From each stratum an equal number of sample points is selected randomly, thus ensuring that the entire sample has a uniform spreading in feature space. Many samples are generated in this way, and the one with the best spatial coverage is retained.

This method is essentially heuristic: it is not known precisely what quality criterion is optimized, nor whether the final sample pattern is a true optimum. Following Müller (2001, Sections 5.4 and 5.5), we propose to use the universal kriging variance as a quality measure for designing samples. In the universal kriging model the spatial distribution of the target variable is described by the sum of a deterministic trend, modelled by a linear regression on covariates, and a realization of a stationary, spatially autocorrelated residual. It will be shown in this paper that by minimizing the spatially averaged universal kriging variance, one automatically obtains the right balance between optimization in geographic and feature space.

The rest of this paper is organized as follows. We first introduce the universal kriging model, which characterizes the spatial variability of the target environmental variable. Next we present a numerical search procedure that is used to obtain the optimal sample pattern, i.e., the sample pattern with the smallest average universal kriging variance. We illustrate the proposed method with a real-world case study on mapping groundwater fluctuation characteristics in The Netherlands.

We stress that our method requires that the universal kriging model is known. More specifically, the method requires knowledge of the structure of the trend (which covariates, quadratic terms, interactions?) and of the model type and parameters of the covariance function or variogram of the residuals. Only the regression coefficients need not be known. This prior knowledge may come from previous surveys in the study area or in comparable areas. In many situations one lacks this prior knowledge, and this raises questions such as ‘how robust is the method against deviations from the model?’ and ‘how can we incorporate uncertainty about the model in the sampling method?’ We will address these questions in Section 5.

2. Universal kriging

We consider the following model (Christensen, 1990):

$$ Z(\mathbf{s}) = \sum_{j=0}^{m} \beta_j x_j(\mathbf{s}) + \varepsilon(\mathbf{s}), \qquad (1) $$

where Z(s) is the target environmental variable, s = (s1 s2)′ is a two-dimensional spatial coordinate, the xj(s) are covariates (note that x0(s) ≡ 1 for all s), the βj are regression coefficients, and ε(s) is a normally distributed residual with zero mean and constant variance c(0). The residual ε is possibly spatially autocorrelated, as quantified through an autocovariance function or variogram.

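In what follows it will be convenient to use matrix notation, so that Eq. (1) may be rewritten as:

$$ Z(\mathbf{s}) = \mathbf{x}'(\mathbf{s})\boldsymbol{\beta} + \varepsilon(\mathbf{s}), \qquad (2) $$

where x and β are column vectors of the m+1 covariates and m+1 regression coefficients, respectively. The universal kriging prediction at an unobserved location s0 from n observations z(si) is given by:

$$ \hat{Z}(\mathbf{s}_0) = \left( \mathbf{c}_0 + \mathbf{X}(\mathbf{X}'\mathbf{C}^{-1}\mathbf{X})^{-1}(\mathbf{x}_0 - \mathbf{X}'\mathbf{C}^{-1}\mathbf{c}_0) \right)' \mathbf{C}^{-1}\mathbf{z}, \qquad (3) $$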

where X is the n×(m+1) matrix of covariates at the observation locations, x0 is the vector of covariates at the prediction location, C is the n×n variance–covariance matrix of the n residuals, c0 is the vector of covariances between the residuals at the observation and prediction locations, and z is the vector of observations z(si). C and c0 are derived from the variogram of ε.

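The universal prediction error variance (universal kriging variance) at s0 is given by (Christensen, 1990, Section VI.2; Müller, 2001, Section 2.2):

$$ \sigma^2(\mathbf{s}_0) = c(0) - \mathbf{c}_0'\mathbf{C}^{-1}\mathbf{c}_0 + (\mathbf{x}_0 - \mathbf{X}'\mathbf{C}^{-1}\mathbf{c}_0)'(\mathbf{X}'\mathbf{C}^{-1}\mathbf{X})^{-1}(\mathbf{x}_0 - \mathbf{X}'\mathbf{C}^{-1}\mathbf{c}_0). \qquad (4) $$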

The universal kriging variance incorporates both the prediction error variance of the residual (first two terms on the right-hand side of Eq. (4)), and the estimation error variance of the trend (third term on the right-hand side of Eq. (4)). By minimizing the spatial average (or sum) of the universal kriging variance at points, one automatically obtains the right balance between optimization of the sample pattern in geographic and feature space.
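As an illustration of Eq. (4), the following NumPy sketch evaluates the universal kriging variance at one prediction location and averages it over a set of prediction points to obtain the MUKV. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`exp_cov`, `uk_variance`, `mukv`) are hypothetical, the covariance is assumed to be exponential of the form psill·exp(−h/a) with the nugget added at distance zero, and the default parameter values are those quoted for the case study in Section 4.1.

```python
import numpy as np

def exp_cov(h, nugget=54.0, psill=175.0, a=350.0):
    """Assumed exponential covariance (cm^2); c(0) = nugget + psill."""
    h = np.asarray(h, dtype=float)
    return np.where(h == 0.0, nugget + psill, psill * np.exp(-h / a))

def uk_variance(S, X, s0, x0, cov=exp_cov):
    """Universal kriging variance at s0, following Eq. (4).

    S  : (n, 2) coordinates of the sample locations
    X  : (n, m+1) covariates at the sample locations (first column = 1)
    s0 : (2,) coordinate of the prediction location
    x0 : (m+1,) covariates at the prediction location
    """
    # C: covariances among sample residuals; c0: sample-to-prediction
    C = cov(np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2))
    c0 = cov(np.linalg.norm(S - s0, axis=1))
    Ci_c0 = np.linalg.solve(C, c0)          # C^{-1} c0
    Ci_X = np.linalg.solve(C, X)            # C^{-1} X
    d = x0 - X.T @ Ci_c0                    # x0 - X' C^{-1} c0
    residual_part = cov(0.0) - c0 @ Ci_c0   # first two terms of Eq. (4)
    trend_part = d @ np.linalg.solve(X.T @ Ci_X, d)  # third term of Eq. (4)
    return float(residual_part + trend_part)

def mukv(S, X, grid_S, grid_X, cov=exp_cov):
    """Mean universal kriging variance over a set of prediction points."""
    return float(np.mean([uk_variance(S, X, s0, x0, cov)
                          for s0, x0 in zip(grid_S, grid_X)]))
```

Keeping the residual part and the trend part as separate terms makes it easy to report the two components of the MUKV individually. Linear solves are used instead of explicit inverses for numerical stability.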

3. Calculating the optimal sample pattern

The objective is to find the sample pattern that has the smallest Mean Universal Kriging Variance (MUKV). We consider the case where, due to budget constraints, the number of observations n on the environmental variable of interest is fixed. The only freedom we have is the choice of locations of the observations, i.e., we want to optimize the pattern of the sample locations. In principle, by discretizing the target area we could evaluate the MUKV for all combinations of sample points, and select the one that has the smallest value. However, for practical applications the number of combinations is so large that an exhaustive search over all possible patterns is prohibitive. Instead, some efficient search algorithm has to be employed. Here we use simulated annealing.

Simulated annealing is an iterative, combinatorial optimization algorithm in which a sequence of combinations is generated by deriving a new combination from slightly and randomly changing the previous combination (van Groenigen et al., 1999). Each time a new combination is generated, the quality measure (i.e., the MUKV) is evaluated and compared with the value of the previous combination. The new combination is accepted if the quality measure has improved by the change. However, the annealing algorithm also accepts some of the changes that worsen the MUKV. This is to avoid being trapped in a local optimum. The probability of accepting a worse combination is given by

$$ P = e^{-\Delta f / T}, \qquad (5) $$

where Δf is the increase in the MUKV and T is a control parameter which, by analogy with the original application of simulating the cooling of a metal into a minimum energy crystalline structure, is known as the system temperature. This temperature is gradually decreased during the optimization. Eq. (5) shows that, given the temperature, the larger the increase of the MUKV, the smaller the probability of acceptance. The temperature remains constant during a fixed number of transitions and is then decreased, so that the acceptance probability (for given Δf) gradually decreases as the iterations continue. The simplest and most commonly used cooling scheme is the exponential cooling scheme:

$$ T_{k+1} = \alpha T_k, \qquad (6) $$

where α is a constant, close to and smaller than 1. Besides this parameter, there are three more parameters to be specified by the user:

• the initial temperature T0;
• the number of (accepted) transitions during which the temperature is kept constant;
• a stopping criterion.

The initial temperature is usually chosen such that the average probability of accepting a worsening combination equals 0.8 (Kirkpatrick, 1984). It can be calculated by performing an initial search in which all increases are accepted. From the results the average of the MUKV increases, $\overline{\Delta f^{+}}$, is calculated. The initial temperature is then given by:

$$ T_0 = -\frac{\overline{\Delta f^{+}}}{\ln(\chi_0)}, \qquad (7) $$

where χ0 is the average increase acceptance probability (i.e., χ0 = 0.8); because ln(χ0) < 0, T0 is positive.
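The calculation of the initial temperature can be sketched as follows. Setting the acceptance probability of Eq. (5) equal to χ0 for the average increase gives T0 = −mean(Δf⁺)/ln(χ0). The function names, the number of trial moves and the generic `perturb` callback are hypothetical choices made for this illustration; the source does not specify them.

```python
import math
import random

def t0_from_mean_increase(mean_increase, chi0=0.8):
    """Solve exp(-mean_increase / T0) = chi0 for T0 (positive for 0 < chi0 < 1)."""
    return -mean_increase / math.log(chi0)

def estimate_t0(f, perturb, state, n_trials=200, chi0=0.8, seed=0):
    """Trial search in which every move is accepted.

    f       : objective function (here it would be the MUKV)
    perturb : function (state, rng) -> new state
    The mean of the observed increases of f determines T0.
    """
    rng = random.Random(seed)
    current = f(state)
    increases = []
    for _ in range(n_trials):
        state = perturb(state, rng)      # accept unconditionally
        new = f(state)
        if new > current:
            increases.append(new - current)
        current = new
    if not increases:
        raise ValueError("no increases observed; run more trial moves")
    return t0_from_mean_increase(sum(increases) / len(increases), chi0)
```

With χ0 = 0.8 the initial temperature is roughly 4.5 times the mean increase, since −1/ln(0.8) ≈ 4.48.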

The number of transitions for a given temperature can be chosen as fixed, but it may also be specified in terms of the number of accepted transitions. The stopping criterion may simply be a fixed number of total iterations, a fixed final temperature, or alternatively one may decide to stop the iterations when there is lack of progress (i.e., when the quality measure did not improve in many tries).

Sacks and Schiller (1988) showed how the simulated annealing algorithm can be used for optimizing sample patterns in situations where observations are spatially correlated. van Groenigen et al. (1999) modified the simulated annealing algorithm by changing the solution generator, i.e. the mechanism to generate a new solution by random perturbation of one of the variables of the previous solution. This is done by moving one randomly selected sample point over a vector h, with random direction, and a random length with a uniform probability distribution with limits 0 and hmax. The upper bound hmax is decreased as the iteration continues.
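The search procedure described in this section can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `anneal`, the clamping of shifted points to a rectangular bounding box, the linear decrease of hmax per chain, and returning the best pattern encountered are all assumptions. The objective `f` is left generic; for the method of this paper it would be the MUKV.

```python
import math
import random

def anneal(points, f, bounds, t0, alpha=0.95, n_chains=50, chain_len=20,
           hmax0=None, hmin=1.0, seed=0):
    """Simulated annealing over a list of (x, y) sample points.

    bounds : ((xmin, xmax), (ymin, ymax)) of the (assumed rectangular) area
    t0     : initial temperature; multiplied by alpha after each chain, Eq. (6)
    """
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    if hmax0 is None:
        hmax0 = 0.5 * max(xmax - xmin, ymax - ymin)
    cur, f_cur = list(points), f(points)
    best, f_best = list(cur), f_cur
    t = t0
    for k in range(n_chains):
        # maximum shift decreases linearly from hmax0 to hmin
        hmax = hmax0 + (hmin - hmax0) * k / max(n_chains - 1, 1)
        for _ in range(chain_len):
            i = rng.randrange(len(cur))            # pick one point at random
            h = rng.uniform(0.0, hmax)             # random shift length
            phi = rng.uniform(0.0, 2.0 * math.pi)  # random direction
            x = min(max(cur[i][0] + h * math.cos(phi), xmin), xmax)
            y = min(max(cur[i][1] + h * math.sin(phi), ymin), ymax)
            cand = cur[:i] + [(x, y)] + cur[i + 1:]
            f_new = f(cand)
            df = f_new - f_cur
            # always accept improvements; accept increases with prob. exp(-df/T)
            if df <= 0 or rng.random() < math.exp(-df / t):
                cur, f_cur = cand, f_new
                if f_cur < f_best:
                    best, f_best = list(cur), f_cur
        t *= alpha                                  # exponential cooling
    return best, f_best
```

Here the temperature is held constant for a fixed number of proposed moves per chain; counting only accepted moves, or stopping on lack of progress, would be straightforward variations.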

In a first test of the method described above, Heuvelink et al. (2006) optimized the pattern of 4, 9 and 16 points in a square area for three cases: 1) no trend; 2) a trend that is linear in the s1-coordinate; and 3) a trend that is quadratic in the s1-coordinate. They found that taking trend estimation into account had a marked effect on the optimized sample pattern, especially when spatial autocorrelation of the residual is weak. The effect on the pattern and on the precision (spatially averaged kriging variance) decreased as the number of sample points increased. Heuvelink et al. (2006) recommended testing the method in practical situations with more than one covariate and larger sample sizes.

Fig. 1. Maps of covariates used in optimizing the sample pattern.

4. Optimized sampling for mapping groundwater dynamics

Finke et al. (2004) developed a method for mapping groundwater dynamics in The Netherlands. The core of the method is formed by a regression kriging model that predicts groundwater dynamics at all points in the study area, using spatially exhaustive covariates derived from a DEM, a topographic map and the existing water-table-class map. The dynamics of the groundwater level are characterized by several variables, including the Mean Highest Water table (MHW) and the Mean Lowest Water table (MLW). Different covariates are important in different situations; therefore the study area is stratified on the basis of thematic maps of geology, soil, topography and other hydrological indicators. For each stratum a specific regression model is fitted and applied.

For fitting these models, in each stratum approximately one sample location per km², with a minimum of 25, is selected. At each selected location, the groundwater level is measured at the end of the winter (shallow levels) and summer (deep levels). The two measurements per sample point are used to estimate the groundwater dynamics, i.e. the response variables of the regression models, at that location. To select the 25 or more locations within each stratum, Finke et al. (2004) make use of the local drainage depth (i.e., one of the possible covariates). Similar to Hengl et al. (2003), an equal range design is used to guarantee that the sample has a uniform spreading in the feature space of the log-transformed drainage depth. To avoid spatial clustering, the locations are in addition selected such that no two locations are closer than a chosen minimum distance.

Spatial predictions of groundwater dynamics are obtained by prediction with the fitted regression models, followed by kriging of the regression residuals. Later on, de Gruijter et al. (2004) adapted the method by integrating these two steps as in universal kriging.

The total number of possible covariates is 15. These 15 variables are grouped in 5 subsets, each group consisting of logically correlated covariates, such as the relative altitudes calculated for different neighbourhoods. In selecting a model, only one covariate from each group is allowed. The grouping of the covariates strongly reduces the total number of possible regression models, but still 3968 models remain to be evaluated. The best regression model is selected on the basis of Mallows's Cp.

4.1. Case study

We tested the new sampling method in one stratum (stratum nr 14) of the waterboard-area ‘De Dommel’ in the province of Noord-Brabant. The entire waterboard-area was mapped using the procedure by Finke et al. (2004) in 2002 (Hoogland et al., 2004). The size of stratum 14 is 4227 ha.

As target variable we chose the Mean Highest Water table (MHW). The selected regression kriging model for MHW contains three covariates:

• relative altitude
• drainage depth
• drainage density.

The relative altitude (local elevation) is calculated with a radius of 300 m. The drainage density is defined as the percentage of raster cells in a local neighbourhood that is crossed by large ditches. Fig. 1 shows maps of these three covariates (size of raster cells is 25 m × 25 m; total number of raster cells and possible sampling locations is 65729), and Fig. 2 shows exhaustive scatter plots. Relative altitude and drainage depth are strongly correlated, whereas the correlations between the other pairs of covariates are very weak.

The autocovariance of the residual ε was modelled by an exponential model with a nugget variance of 54 cm², a partial-sill parameter (sill minus nugget) of 175 cm², and a range (correlation length) parameter of 350 m. This covariance function is based on a correlogram which was calculated from 2009 observations (Finke et al., 2004; Hoogland et al., 2004). The correlogram was rescaled using estimates of the residual variance in stratum 14, based on 51 observations.

Fig. 2. Exhaustive scatter plots for the three covariates relative altitude, drainage depth and drainage density.

Fig. 3. Patterns of 25 sample locations optimized with the universal kriging model (top), ordinary kriging model (middle) and multiple linear regression model (bottom).

The initial temperature for simulated annealing was chosen such that the average increase acceptance probability was 0.80. The temperature was lowered exponentially with α=0.95. We ran 1000 chains of 100 iterations each. The initial pattern was a stratified simple random sample, with compact geographic clusters of raster cells formed by k-means as strata (Brus et al., 1999). The maximum shifts in the s1- and s2-directions for the first chain were half the number of columns and rows, respectively, and were lowered linearly, such that for the final chain the maximum shift in each direction was 1 cell width (25 m). The universal kriging variance was evaluated for all points on a 100 m × 100 m grid.

We considered three sample sizes, namely 25, 50 and 100 points. Besides optimization with the UK-model, sample patterns were also optimized with the ordinary kriging (OK) model and the multiple linear regression (MLR) model. This gives nine optimized sample patterns in total. The OK-model assumes that there is no spatial trend. The autocovariance of the target variable under the OK-model was characterized by an exponential function with zero nugget, sill of 371 cm² and range parameter of 350 m. The MLR-model assumes a trend with the same covariates as in the UK-model. It also assumes that the residuals are spatially uncorrelated. The variance was taken as 229 cm², which equals the sill of the UK variogram. Replications were not permitted, meaning that a location (i.e., raster cell) cannot be part of the sample more than once.

5. Results and discussion

Figs. 3–5 show the optimized patterns for n=25, 50 and 100, respectively. From these figures it is evident that optimization with the ordinary kriging model leads to the best spatial coverage. The patterns optimized with the MLR-model show strong spatial clustering. The spatial coverage of the sample patterns optimized with the UK-model is much better than that of the MLR-patterns, but there are still empty spaces. Figs. 6–8 show the position of the sample locations in feature space. These figures show that for the MLR-patterns the number of points in the outer zones of the scatter clouds (Fig. 2) is the largest, followed by the UK-patterns. The points of the OK-samples are all located at the centres of the scatter clouds, i.e. have covariate values close to the mean, which is inefficient for estimating the regression coefficients.

Table 1 shows the MUKV for the nine sample patterns. In Fig. 9 the MUKV is decomposed into the interpolation-error variance of the residuals (first two terms of Eq. (4)) and the trend-estimation error variance (third term of Eq. (4)). As expected, optimization of the sample pattern with the UK-model leads to the smallest MUKV. Compared to optimization with the OK-model, optimization with the UK-model leads to a much better spreading in feature space, and consequently a strong reduction of the trend-estimation error variance (Fig. 9). This positive effect is not counter-balanced by an increase in the interpolation-error variance due to a worsening of the spatial coverage. For a given sample size, the interpolation-error variances of the residuals are approximately equal for the UK- and OK-patterns. This can be explained by the low density of sampling, and as a result the weak autocorrelation between the values at unobserved locations and sample locations. The sampling density expressed as $\sqrt{A/n}$, where A is the area of the stratum, is 1300 m (n=25), 919 m (n=50) and 650 m (n=100), whereas the distance parameter of the exponential variogram is 350 m, with an effective spatial range of about 1000 m.

Fig. 4. Patterns of 50 sample locations optimized with the universal kriging model (top), ordinary kriging model (middle) and multiple linear regression model (bottom).

Fig. 5. Patterns of 100 sample locations optimized with the universal kriging model (top), ordinary kriging model (middle) and multiple linear regression model (bottom).

When compared to the MLR-patterns, the reduction of the MUKV achieved by the UK-patterns is small. For a given sample size, the trend-estimation error variance for the UK-pattern is smaller than for the MLR-pattern (Fig. 9, top). This can be explained by the strong spatial clustering of the MLR-patterns, which results in a strong redundancy of the information content of observations in the MLR-patterns. The difference in trend-estimation error variance between UK- and MLR-patterns is more or less constant with sample size. Also, given the sample size, the interpolation-error variance of the residuals for the UK-pattern is smaller than for the MLR-pattern, due to the better spatial coverage. This difference increases with sample size (Fig. 9, bottom).

Fig. 6. Scatter plots of the three covariates for sample size 25.

Fig. 7. Scatter plots of the three covariates for sample size 50.

Going from small to large sample sizes, the MLR-pattern and the OK-pattern change places in the ranking. For n=25 optimization with the MLR-model is the best alternative to UK-optimization, whereas for n=100 the OK-pattern is second-best. This is due to the strong reduction of the trend-estimation error variance with sample size for the OK-pattern, and the weak reduction of the interpolation-error variance with sample size for the MLR-pattern compared to the OK-pattern.

Fig. 8. Scatter plots of the three covariates for sample size 100.

Table 1
Mean Universal Kriging Variance (cm²) for sample patterns optimized with the universal kriging model (UK), the ordinary kriging model (OK) and the multiple linear regression model (MLR), for sample sizes n = 25, 50 and 100

n      UK       OK       MLR
25     238.5    293.6    244.3
50     230.8    248.6    239.5
100    224.1    231.1    234.5
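The relative reductions reported in the Abstract can be recomputed from Table 1. The short script below is only an arithmetic check on the tabulated values; the variable and function names are illustrative.

```python
# MUKV values from Table 1, keyed by sample size n
mukv_table = {
    25:  {"UK": 238.5, "OK": 293.6, "MLR": 244.3},
    50:  {"UK": 230.8, "OK": 248.6, "MLR": 239.5},
    100: {"UK": 224.1, "OK": 231.1, "MLR": 234.5},
}

def reduction_pct(n, other):
    """Percentage by which the UK-pattern MUKV is smaller than that of `other`."""
    row = mukv_table[n]
    return 100.0 * (row[other] - row["UK"]) / row[other]

for n in mukv_table:
    print(f"n={n:3d}: vs OK {reduction_pct(n, 'OK'):4.1f}%, "
          f"vs MLR {reduction_pct(n, 'MLR'):4.1f}%")
```

Rounded to whole percentages, this gives 19%, 7% and 3% relative to the OK-patterns and 2%, 4% and 4% relative to the MLR-patterns.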

Fig. 9. Interpolation-error variance (top) and trend-estimation error variance (bottom), for sample patterns optimized with the universal kriging model (UK), the ordinary kriging model (OK) and the multiple linear regression model (MLR), for sample size n equals 25, 50 and 100.

To evaluate how close the UK-pattern as obtained with simulated annealing is to the true optimum, we repeated the simulated annealing algorithm for n=25 multiple times, with different initial samples and different optimization parameters. There were differences between the spatial patterns, but the differences between the associated MUKVs were very small (order of magnitude of 0.1 cm²). A closer look at the patterns revealed interesting similarities. Several points showed up in all or almost all of the patterns. These points were selected for their position in feature space, being close to the extremes. Other points that were selected less frequently were mainly selected for their favorable position in geographic space, for which multiple sample configurations yield near-optimal results.
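The restart check described above can be sketched in Python on a toy problem. Everything in the block is an assumption made for illustration: the 12 × 12 grid, the single synthetic covariate, the exponential covariance with sill 1 and distance parameter 0.3, the cooling schedule, and the simplification that candidate locations are restricted to grid nodes. It is not the implementation used in the case study.

```python
import numpy as np


def exp_cov(h, sill=1.0, a=0.3):
    """Exponential covariance; sill and distance parameter are assumed values."""
    return sill * np.exp(-h / a)


def mukv(idx, coords, F, sill=1.0, a=0.3):
    """Mean universal kriging variance over all grid nodes for sample rows idx."""
    xs, Fs = coords[idx], F[idx]
    C = exp_cov(np.linalg.norm(xs[:, None] - xs[None, :], axis=2), sill, a)
    c0 = exp_cov(np.linalg.norm(xs[:, None] - coords[None, :], axis=2), sill, a)
    Ci_c0 = np.linalg.solve(C, c0)               # C^-1 c0 for every grid node
    interp = sill - np.sum(c0 * Ci_c0, axis=0)   # interpolation error of residuals
    d = F.T - Fs.T @ Ci_c0                       # x0 - X' C^-1 c0
    A = Fs.T @ np.linalg.solve(C, Fs)            # X' C^-1 X
    trend = np.sum(d * np.linalg.solve(A, d), axis=0)  # trend-estimation error
    return float(np.mean(interp + trend))


def anneal(coords, F, n, rng, n_iter=400, t0=0.01, cooling=0.99):
    """Simulated annealing that swaps one sample node for an unsampled node."""
    N = len(coords)
    cur = rng.choice(N, size=n, replace=False)
    f_cur = f_init = mukv(cur, coords, F)
    best, f_best = cur.copy(), f_cur
    t = t0
    for _ in range(n_iter):
        cand = cur.copy()
        free = np.setdiff1d(np.arange(N), cur)
        cand[rng.integers(n)] = rng.choice(free)
        f_cand = mukv(cand, coords, F)
        # Always accept improvements; accept deteriorations with a
        # probability that shrinks as the temperature t decreases.
        if f_cand < f_cur or rng.random() < np.exp((f_cur - f_cand) / t):
            cur, f_cur = cand, f_cand
            if f_cur < f_best:
                best, f_best = cur.copy(), f_cur
        t *= cooling
    return best, f_best, f_init


g = np.linspace(0.0, 1.0, 12)
coords = np.array([(x, y) for x in g for y in g])
z = np.sin(3.0 * coords[:, 0]) + coords[:, 1]      # hypothetical covariate
F = np.column_stack([np.ones(len(coords)), z])     # intercept + covariate

# Repeat the optimization from different random initial samples.
results = [anneal(coords, F, n=10, rng=np.random.default_rng(s)) for s in range(3)]
for best, f_best, f_init in results:
    print(f"initial MUKV {f_init:.4f} -> optimized MUKV {f_best:.4f}")
```

Comparing the optimized MUKVs and the selected nodes across the restarts gives a rough impression of how flat the criterion is near its minimum, which is the kind of check reported in the paragraph above.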

As stressed in the Introduction, the method for optimization of sample patterns described here requires prior knowledge of the model of spatial variation. The structure of the trend and the covariance function or variogram of the residual must be known. This implies that the optimized sample patterns are optimal only if the model assumptions are valid. We may ask how robust the sample pattern and the quality of the resulting map are against deviations from the model. For instance, how sensitive is the result to the assumption that the target variable is linearly related to the covariates and that quadratic and interaction terms are insignificant? Besides locations with extreme values for the covariates, D-optimal samples for models with quadratic terms include locations with intermediate values (Heuvelink et al., 2006). Figs. 6–8 show that the UK-model with linear trend also selects many locations with intermediate values. This is because residuals are spatially autocorrelated, leading to a spreading of the sample in geographical space. The locations that are selected for their position in geographical space happen to have intermediate covariate values in this case. This may indicate that in the presence of autocorrelation, for moderate to large sample sizes, say n > 20, the sensitivity of the result to the assumed linear relationship may not be very large. The sensitivity of the result to the variogram model-type and parameters must also be studied. Changes in the variogram will have an effect on the relative importance of the ‘regression’ and ‘kriging’ elements of the MUKV (Eq. (4)), and therefore we expect that the result is sensitive to the variogram parameters.
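How the balance between the 'kriging' and 'regression' elements shifts with the variogram can be explored numerically. The Python sketch below splits the mean universal kriging variance of one fixed, randomly chosen pattern into its two parts for three values of the distance parameter. The grid, the covariate, the exponential covariance model and all parameter values are assumptions for illustration only; no result of the case study is reproduced.

```python
import numpy as np


def uk_variance_parts(idx, coords, F, sill, a):
    """Mean interpolation-error and mean trend-estimation error variance.

    Assumes an exponential covariance sill * exp(-h / a) for the residuals.
    """
    xs, Fs = coords[idx], F[idx]
    C = sill * np.exp(-np.linalg.norm(xs[:, None] - xs[None, :], axis=2) / a)
    c0 = sill * np.exp(-np.linalg.norm(xs[:, None] - coords[None, :], axis=2) / a)
    Ci_c0 = np.linalg.solve(C, c0)
    kriging = sill - np.sum(c0 * Ci_c0, axis=0)          # 'kriging' element
    d = F.T - Fs.T @ Ci_c0
    A = Fs.T @ np.linalg.solve(C, Fs)
    regression = np.sum(d * np.linalg.solve(A, d), axis=0)  # 'regression' element
    return float(kriging.mean()), float(regression.mean())


g = np.linspace(0.0, 1.0, 12)
coords = np.array([(x, y) for x in g for y in g])
z = np.sin(3.0 * coords[:, 0]) + coords[:, 1]      # hypothetical covariate
F = np.column_stack([np.ones(len(coords)), z])

rng = np.random.default_rng(0)
idx = rng.choice(len(coords), size=20, replace=False)  # one fixed sample pattern

parts = {a: uk_variance_parts(idx, coords, F, sill=1.0, a=a) for a in (0.05, 0.2, 0.8)}
for a, (k, r) in parts.items():
    print(f"a={a:4.2f}: kriging part {k:.4f}, regression part {r:.4f}, "
          f"regression share {r / (k + r):.1%}")
```

A fuller sensitivity study would re-optimize the pattern for each variogram rather than evaluate a single fixed pattern, which this sketch does not attempt.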

The more sensitive the result is to the model assumptions, the greater the need for methods that can deal with uncertainty about the UK-model. Broadly speaking, there are two solutions to the lack of prior knowledge of the model: 1) sampling in two or more phases, and 2) incorporating uncertainty about the model.

When sampling in two phases, the aim of the first sampling phase (reconnaissance survey) is to identify and estimate (calibrate) the model of spatial variation. In the second phase the objective is to map the target variable. The different aims of the two phases lead to different sample patterns. The reconnaissance survey must be tailored to 1) identification of the trend, and 2) identification and calibration of the covariance function of the residual. A spatial RS sample pattern (Lesch, 2005) is well suited for identification of the trend, but will generally be suboptimal for variogram identification and parameter estimation, due to the lack of short-distance pairs. Lark (2002) and Zhu and Stein (2005) present, among others, sampling methods for estimating the parameters of covariance functions (variograms), but these methods are not optimal for identifying the trend. A practical solution is to sample on a regular grid or use a spatial coverage sample, supplemented by points at short distances from these points. The first phase sample data are then used to select the covariates and to estimate the variogram of the residuals. This model may then be used to design the second phase sample as described in this paper, using the first phase sample data as fixed, prior points.
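The practical first-phase design mentioned above, a regular grid supplemented by points at short distances, could be generated along the following lines. This is a minimal sketch on the unit square; the function name `grid_with_close_points`, the grid size, the number of supplementary points and the offset distance are all hypothetical choices.

```python
import numpy as np


def grid_with_close_points(nx, ny, n_close, offset, rng):
    """Cell-centred nx-by-ny grid on the unit square plus n_close extra points."""
    gx = (np.arange(nx) + 0.5) / nx
    gy = (np.arange(ny) + 0.5) / ny
    grid = np.array([(x, y) for x in gx for y in gy])
    # Pick distinct grid nodes and place one extra point at distance `offset`
    # from each of them, in a random direction.
    chosen = rng.choice(len(grid), size=n_close, replace=False)
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n_close)
    extra = grid[chosen] + offset * np.column_stack([np.cos(angle), np.sin(angle)])
    return grid, extra


rng = np.random.default_rng(1)
grid, extra = grid_with_close_points(nx=5, ny=5, n_close=8, offset=0.02, rng=rng)
sample = np.vstack([grid, extra])

# Shortest distance from each supplementary point to the grid.
nearest = np.min(np.linalg.norm(extra[:, None] - grid[None, :], axis=2), axis=1)
print(len(sample), "points; extra-to-grid distances:", np.round(nearest, 3))
```

The supplementary points add point pairs at a separation well below the grid spacing, which is the information needed to estimate the variogram near the origin. In the second phase, the first-phase locations would be kept as fixed points while the additional locations are optimized.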

Even after a reconnaissance survey, one will be uncertain to some extent about the UK-model. Recently, methods have been developed that incorporate the uncertainty about the model in the design of a sample. Zhu and Stein (2006) present an interesting contribution in this relatively new area. The mean squared error of the plug-in kriging predictor, i.e. the kriging predictor that uses the variogram parameters estimated from the sample also used for interpolation, can be approximated by Eq. (4) plus an extra term that allows for the uncertainty of the variogram. This method requires a prior estimate of the variogram, for which the variogram estimated from the reconnaissance survey may be used.

6. Conclusions

Optimization of the sample pattern by minimizing the universal kriging variance integrates optimization in feature and geographic space in one step. The method has no ambiguities and does not involve subjective choices, and is therefore to be preferred over the two-step procedure proposed by Hengl et al. (2003).

For the case study, the gain in precision, or more specifically the reduction of the Mean Universal Kriging Variance compared to optimization with the ordinary kriging model, is considerable. This is due to a much better positioning of sample locations in feature space, which results in a strong reduction of the trend-estimation error variance. However, compared to the multiple linear regression model the gain is small. This is because of the low sampling density for all sample sizes, and consequently the interpolation-error variance of the residuals is not or only marginally reduced by the improved spatial coverage. The trend-estimation error variances for the UK-patterns are smaller than for the MLR-patterns because there is less redundancy in the observations. For n=25 and 50, optimization of the pattern with the MLR-model is the best alternative to optimization with the UK-model; for n=100 the OK-pattern is the best alternative. The presented method requires prior knowledge of the structure of the trend and of the variogram of the residual. The robustness of the presented method against deviations from these assumptions still needs to be studied. In situations where one is uncertain about the model, one may collect information in a first sampling phase (reconnaissance survey). Also, one may incorporate the uncertainty about the UK-model in the sampling method as described by Zhu and Stein (2006).

Acknowledgements

We thank S.M. Lesch and an anonymous referee for their constructive and valuable comments on the original manuscript. We also thank J.J. de Gruijter and T. Hoogland for their help and advice.

References

Atkinson, G.L., Donev, A.N., 1992. Optimum Experimental Design. Clarendon Press, Oxford.

Brus, D.J., de Gruijter, J.J., 1997. Random sampling or geostatistical modelling? Choosing between design-based and model-based sampling strategies for soil (with discussion). Geoderma 80, 1–59.

Brus, D.J., Spätjens, L.E.E.M., de Gruijter, J.J., 1999. A sampling scheme for estimating the mean extractable phosphorus concentration of fields for environmental regulation. Geoderma 89, 129–148.

Brus, D.J., de Gruijter, J.J., van Groenigen, J.W., 2006. Designing spatial coverage samples using the k-means clustering algorithm. In: Lagacherie, P., McBratney, A., Voltz, M. (Eds.), Digital Soil Mapping: An Introductory Perspective. Developments in Soil Science, vol. 3. Elsevier, Amsterdam.

Christensen, R., 1990. Linear Models for Multivariate, Time, and Spatial Data. Springer, New York.

de Gruijter, J.J., van der Horst, J.B.F., Heuvelink, G.B.M., Knotters, M., Hoogland, T., 2004. Grondwater opnieuw op de kaart. Methodiek voor de actualisering van grondwaterstandsinformatie en perceelsclassificatie naar uitspoelingsgevoeligheid voor nitraat. Tech. rep., Alterra, Wageningen.

de Gruijter, J.J., Brus, D.J., Bierkens, M.F.P., Knotters, M., 2006. Sampling for Natural Resource Monitoring. Springer-Verlag, Berlin.

Finke, P.A., Brus, D.J., Bierkens, M.F.P., Hoogland, T., Knotters, M., de Vries, F., 2004. Mapping groundwater dynamics using multiple sources of exhaustive high resolution data. Geoderma 123, 23–39.

Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press, New York.

Hengl, T., Rossiter, D.G., Stein, A., 2003. Soil sampling strategies for spatial prediction by correlation with auxiliary maps. Australian Journal of Soil Research 41, 1403–1422.

Hengl, T., Heuvelink, G.B.M., Stein, A., 2004. A generic framework for spatial prediction of soil variables based on regression-kriging. Geoderma 120, 75–93.

Heuvelink, G.B.M., Webster, R., 2001. Modelling soil variation: past, present, and future. Geoderma 100, 269–301.

Heuvelink, G.B.M., Brus, D.J., de Gruijter, J.J., 2006. Optimization of sample configurations for digital mapping of soil properties with universal kriging. In: Lagacherie, P., McBratney, A., Voltz, M. (Eds.), Digital Soil Mapping: An Introductory Perspective. Developments in Soil Science, vol. 3. Elsevier, Amsterdam.

Hoogland, T., Hoogerwerf, M.R., van Kekem, A.J., 2004. Actualisatie grondwaterdynamiek waterschap De Dommel. Tech. rep., Alterra, Wageningen.

Kirkpatrick, S., 1984. Optimization by simulated annealing: quantitative studies. Journal of Statistical Physics 34, 975–986.

Knotters, M., Brus, D.J., Oude Voshaar, J.H., 1995. A comparison of kriging, co-kriging and kriging with regression for spatial interpolation of horizon depth with censored observations. Geoderma 67, 227–246.

Lark, R.M., 2002. Optimized spatial sampling of soil for estimation of the variogram by maximum likelihood. Geoderma 105, 49–80.

Lesch, S.M., 2005. Sensor-directed response surface sampling designs for characterizing spatial variation in soil properties. Computers and Electronics in Agriculture 46, 153–179.

Lesch, S.M., Strauss, D.J., Rhoades, J.D., 1995. Spatial prediction of soil salinity using electromagnetic induction techniques 2. An efficient spatial sampling algorithm suitable for multiple linear regression model identification and estimation. Water Resources Research 31, 387–398.

Müller, W.G., 2001. Collecting Spatial Data: Optimum Design of Experiments for Random Fields, 2nd edition. Physica-Verlag, Heidelberg.

Myers, R.H., Montgomery, D.C., 2002. Response Surface Methodology: Process and Product Optimization using Designed Experiments, 2nd edition. John Wiley, New York.

Sacks, J., Schiller, S., 1988. Spatial designs. In: Gupta, S., Berger, J. (Eds.), Statistical Decision Theory and Related Topics IV, vol. 2. Springer Verlag, New York, pp. 385–399.

Van Groenigen, J.W., Siderius, W., Stein, A., 1999. Constrained optimisation of soil sampling for minimisation of the kriging variance. Geoderma 87, 239–259.

Zhu, Z., Stein, M.L., 2005. Spatial sampling design for parameter estimation of the covariance function. Journal of Statistical Planning and Inference 134, 583–603.

Zhu, Z., Stein, M.L., 2006. Spatial sampling design for prediction with estimated parameters. Journal of Agricultural Biological and Environmental Statistics 11, 24–44.