
Archaeology and geostatistics

C.D. Lloyd a,*, P.M. Atkinson b

a School of Geography, Queen’s University, Belfast BT7 1NN, UK
b School of Geography, University of Southampton, Highfield, Southampton SO17 1BJ, UK

Received 16 December 2002; received in revised form 24 June 2003; accepted 8 July 2003

Abstract

Geostatistics is used in many different disciplines to characterise spatial variation and for spatial prediction, spatial simulation and sampling design. Archaeology is an inherently spatial discipline and the models and tools provided by geostatistics should be as valuable in archaeology as they are in other disciplines that are concerned with spatially varying properties. However, there have, so far, been few applications of geostatistics in archaeology. This paper seeks to highlight some of the key tools provided by geostatistics and to show, through two case studies, how they may be employed in archaeological applications. Some relevant literature is summarised and two case studies are presented based on the analysis of (i) Roman pottery and (ii) soil phosphate data. © 2003 Elsevier Ltd. All rights reserved.

Keywords: Spatial analysis; Mapping; Sampling design

1. Introduction

Geostatistics is a set of tools used for characterising spatial variation, spatial prediction, spatial simulation and spatial optimisation (e.g., sampling design). Applications of geostatistics are found in a wide range of fields including biology, environmental science, geography, geology, meteorology and mining. Geostatistics is based on the principle of spatial dependence (or spatial autocorrelation): observations close in space tend to be more similar than those further apart. Therefore, if the spatial distribution of some variable is structured (as opposed to being random) geostatistics may be useful in some capacity.

The characterisation of spatial autocorrelation in archaeological variables has been the concern of several researchers. Hodder and Orton [16], in their classic text on spatial analysis in archaeology, provide a section on the subject of spatial autocorrelation. This work included the definition of Moran’s I and Geary’s c, two coefficients which characterise the degree of spatial autocorrelation in a variable. An application was demonstrated based on the distribution of the length/breadth ratio index of Bronze Age spearheads. Specifically, I was estimated for several spatial lags (that is, for pairs of locations separated by several distance and direction vectors), enabling assessment of structure in the spatial distribution of the index. Other studies have applied similar statistical measures of spatial autocorrelation to the terminal distribution of dated monuments at lowland Maya sites [20,40].
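As a minimal sketch of how Moran's I might be estimated for a single distance band, the following Python function uses binary weights: a pair of locations receives a weight of 1 if the two locations are no further apart than `max_dist`, and 0 otherwise. The function name, the binary weighting scheme and the example values are illustrative assumptions and are not taken from Hodder and Orton [16].

```python
import numpy as np

def morans_i(coords, values, max_dist):
    """Moran's I with binary weights for pairs separated by at most max_dist."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(values, dtype=float)
    if coords.ndim == 1:                      # allow a 1-D transect of positions
        coords = coords[:, None]
    # Pairwise separation distances between all locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Binary weights: 1 for distinct locations within the distance band.
    w = ((d > 0) & (d <= max_dist)).astype(float)
    dev = z - z.mean()
    # I = (n / sum of weights) * weighted cross-products / sum of squared deviations.
    return (len(z) / w.sum()) * (w * np.outer(dev, dev)).sum() / (dev ** 2).sum()

# Hypothetical transect with a steady trend: neighbouring values are similar.
trend = morans_i([0, 1, 2, 3, 4], [1, 2, 3, 4, 5], max_dist=1.0)
# Hypothetical alternating transect: neighbouring values are dissimilar.
alternating = morans_i([0, 1, 2, 3], [1, -1, 1, -1], max_dist=1.0)
```

Positive values indicate that nearby observations tend to be similar, and negative values that they tend to be dissimilar. Repeating the calculation for several distance bands gives the kind of lag-by-lag assessment described above.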

There are few published case studies where geostatistics is applied in archaeological contexts. There have, however, been reviews of geostatistics in archaeology: Ebert [13] and Wheatley and Gillings [39] both provide summaries of the basic tools of geostatistics in archaeological contexts. The present paper is intended to take a broader overview and to outline some existing applications of geostatistics in archaeology as well as to present two case studies that are concerned with the analysis of (i) Roman pottery and (ii) soil phosphate data. First, some published applications of geostatistics in archaeology are outlined. Then, geostatistical theory is introduced.

* Corresponding author. Tel.: +44-28-8027-3478; fax: +44-28-9032-1280.

E-mail address: [email protected] (C.D. Lloyd).

Journal of Archaeological Science 31 (2004) 151–165

http://www.elsevier.com/locate/jas

0305-4403/04/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jas.2003.07.004

2. Published applications of geostatistics in archaeology

In this section, a small number of published studies are discussed. These illustrate the wide range of archaeological problems which geostatistics may help to solve.

Zubrow and Harbaugh [42] is one of only a few publications that apply the geostatistical spatial prediction method of kriging in an archaeological context. Kriging was utilised to reduce the effort expended in locating archaeological sites. The sites were located in the archaeological zone of Cañada del Alfaro in Guanajuato, Mexico and the Hay Hollow valley in east-central Arizona, USA. The specific aim of the paper was to predict, from a sample of the sites identified through fieldwork, the expected number of sites in each cell of a regular grid. The paper examined the use of a sample of the total surveyed area from which kriged predictions were made. The subsequent surveying required to locate all sites in the surveyed area was then assessed. It was observed that increasing the initial sample from 12.5% of the surveyed area to 50% made relatively little difference in the number of sites found in cells predicted by kriging. In other words, kriging enabled the location of almost as many of the total sites from 12.5% of the total sample as it did from 50% of the total sample. Thus, in this study, spatial dependence in the density of sites was demonstrated, as was the applicability of methods that utilise this property.

Webster and Burgess [36] examined the application of kriging to mapping electrical resistivity for a Saxon or Norman to 17th century site at Bekesbourne in Kent, England. The data set was used to illustrate how large scale trends (that is, a spatially varying mean) may affect the predictions made using kriging, so the objective was only indirectly archaeological in nature. In a specifically archaeological application, Neiman [27] used variograms to explore spatial variation in the terminal dates of Maya settlements (Whitley [40] and Kvamme [20] had a similar focus).

Geostatistics has been applied in disciplines allied to archaeology. Oliver et al. [29] estimated variograms of leading principal components and canonical variates of pollen counts in a vertical core made through peat in Fife, Scotland. Their objective was to use a range of tools, including variograms (a means of characterising spatial structure; defined below), to explore the structure of the core. Bocquet-Appel and Demars [4] estimated variograms of 14C dates of remains from or associated with European Neanderthals and early modern humans. Models fitted to the variograms were used to generate maps representing the spatial distribution of remains of different dates.

Robinson and Zubrow [34] discuss interpolation in archaeology and they include discussion about kriging, although they caution that the technique should be used with care and that simpler approaches may be suitable in many contexts. Hageman and Bennett [15] provide a short summary of widely used variants of kriging for generating Digital Elevation Models (DEMs) in archaeological applications.

Ebert [13], in a review of geostatistics for the analysis of archaeological fieldwalking data, presents an analysis of the spatial distribution of bulk struck flint. In that application, cross validation (this entails removing a data point, predicting its value, comparing the predicted and observed data points and carrying out the same process for all data) was used to assess the accuracy of kriging predictions. A map was also generated using the variogram model specified in the paper. Wheatley and Gillings [39], in their review of GIS in archaeology, provide a chapter on interpolation which includes a section on geostatistical methods. The examples given are based on elevation data (as is the focus of Hageman and Bennett [15]) and not explicitly archaeological data.
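The cross validation procedure described above can be sketched in a few lines of Python. This is an illustration only: the predictor used here is simple inverse distance weighting, a stand-in chosen for brevity and not the kriging predictor used by Ebert [13]. The function names and the three-point example are hypothetical, and the sketch assumes that no two observations share a location.

```python
import numpy as np

def idw_predict(coords, values, x0, power=2.0):
    """Inverse distance weighted prediction at x0 (a stand-in for kriging)."""
    d = np.linalg.norm(coords - x0, axis=1)
    w = 1.0 / d ** power
    return float(w @ values / w.sum())

def leave_one_out(coords, values, predict):
    """Remove each datum in turn, predict it from the rest, return the errors."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    errors = np.empty(len(values))
    for i in range(len(values)):
        keep = np.arange(len(values)) != i          # all data except datum i
        errors[i] = predict(coords[keep], values[keep], coords[i]) - values[i]
    return errors

# Hypothetical observations along a line.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [0.0, 1.0, 2.0]
errors = leave_one_out(coords, values, idw_predict)
rmse = float(np.sqrt(np.mean(errors ** 2)))         # one summary of accuracy
```

Any predictor with the same call signature, including a kriging routine, could be passed in place of `idw_predict`, so that the cross validation errors of competing methods or variogram models can be compared.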

3. Geostatistics

The basic principles of geostatistics are outlined below. There are many introductions to the subject and several more detailed texts that could be consulted for more information (for example, [2,14,38]). Burrough and McDonnell [7] provide a short introduction to geostatistics in the context of GIS. There are also introductions for specific audiences including users of GISystems [28], physical geographers [30,31] and the remote sensing community [9].

3.1. The theory of regionalised variables

In the Earth sciences knowledge about how properties vary in space is usually sparse. Therefore, it is not feasible, in general, to use a deterministic model to describe spatial variation. If, for example, the objective is to make predictions at locations for which there are no data it is necessary to allow for uncertainty in our description as a result of our lack of knowledge.

The uncertainty inherent in predictions of any property means that what cannot be described deterministically can be accounted for through the use of probabilistic models. With this approach, the data are considered as the outcome of a random process. Isaaks and Srivastava [17] caution that use of a probabilistic model is an admission of ignorance; it does not mean that any spatially referenced property varies randomly in reality.

In geostatistics, spatial variation (at a location, x) is modelled as comprising two distinct parts, a deterministic component (µ(x)) and a stochastic (or ‘random’) component (R(x)):

$$Z(x) = \mu(x) + R(x) \qquad (1)$$


This is termed a random function (RF) model. The upper case Z refers to the RF whereas lower case z refers to the observed data. In geostatistics, a spatially referenced variable, z(x), is treated as an outcome of a RF, Z(x), defined as a spatial set of random variables (RVs). A realisation of a RF is called a regionalised variable (ReV). The Theory of Regionalised Variables [22] is the fundamental framework on which geostatistics is based.

Where the properties of the variable of interest are the same, or at least similar in some sense, across the region of interest we can employ what is termed a stationary model. In other words, we can use the same model parameters at all locations. Stationarity may be divided (for geostatistical purposes) into three classes for which different parameters of the RF may exist. In turn these are: (i) strict stationarity, (ii) second-order stationarity and (iii) intrinsic stationarity [19,26]. Only the latter two concern us here.

For second-order stationarity, the mean and (spatial) covariance are required to be constant. Therefore, the expected value should be the same at all locations, x:

$$E\{Z(x)\} = \mu \quad \text{for all } x \qquad (2)$$

In addition, the covariance, C(h), between the locations x and x+h should depend only on the lag, h (the distance and direction by which paired observations are separated), and not on the location, x:

$$C(h) = E[\{Z(x) - \mu\}\{Z(x+h) - \mu\}] = E[Z(x)Z(x+h)] - \mu^2 \quad \text{for all } x \qquad (3)$$

In some cases, the requirements for second-order stationarity are not met. For example, the variance (or dispersion) may be unlimited as lag increases. For this reason, Matheron [22] defined the intrinsic hypothesis. For a RF to fulfil the intrinsic hypothesis it is required only that the expected value of the variable should not depend on x:

$$E\{Z(x)\} = \mu \quad \text{for all } x \qquad (4)$$

and the variance of the increments should be finite [19]. Thus, the variogram, γ(h), defined as half the expected squared difference between paired RFs, exists and depends only on h:

$$\gamma(h) = \frac{1}{2} E[\{Z(x) - Z(x+h)\}^2] \qquad (5)$$

That is, the expected semivariance is the same for all observations separated by a particular lag irrespective of where the paired observations are located.

Second-order stationarity implies the intrinsic hypothesis, but the intrinsic hypothesis does not imply second-order stationarity. Thus, the covariance function and the correlogram (or autocorrelation function, the standardised covariance) exist only if the RF is second-order stationary, and the variogram must be used when intrinsic stationarity only can be assumed [19].
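The link between the variogram and the covariance function under second-order stationarity can be made explicit with a short derivation. It is given here as a sketch that uses only the definitions of the mean, the covariance and the variogram stated above; the symbol ρ(h) for the correlogram is introduced for illustration. Expanding the squared difference in the definition of the variogram, and using E[Z(x)Z(x+h)] = C(h) + µ² and E[Z(x)²] = C(0) + µ², gives:

$$\begin{aligned} \gamma(h) &= \tfrac{1}{2} E[\{Z(x) - Z(x+h)\}^2] \\ &= \tfrac{1}{2}\{E[Z(x)^2] + E[Z(x+h)^2]\} - E[Z(x)Z(x+h)] \\ &= \{C(0) + \mu^2\} - \{C(h) + \mu^2\} \\ &= C(0) - C(h) \end{aligned}$$

The correlogram then follows as:

$$\rho(h) = \frac{C(h)}{C(0)} = 1 - \frac{\gamma(h)}{C(0)}$$

Both expressions require a finite a priori variance, C(0), which is why they are not available when only the intrinsic hypothesis holds.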

3.2. The variogram

The core tool in geostatistical analysis is the variogram (defined above). The variogram characterises spatial dependence in the property of interest. The experimental variogram, γ̂(h), can be estimated from p(h) paired observations, z(x_α), z(x_α + h), α = 1, 2, ..., p(h) using:

$$\hat{\gamma}(h) = \frac{1}{2p(h)} \sum_{\alpha=1}^{p(h)} \{z(x_\alpha) - z(x_\alpha + h)\}^2 \qquad (6)$$

In simple terms, the variogram is estimated by calculating the squared differences between all the available paired observations and obtaining half the average for all observations separated by that lag (or within a lag tolerance where the observations are not on a regular grid). Fig. 1 gives a simple example of a transect along which observations have been made at regular intervals. Lags (h) of 1 and 2 are indicated. Thus, half the average squared difference between observations separated by a lag of 1 is calculated and the process is repeated for a lag of 2 and so on. The variogram can be estimated for different directions to enable the identification of directional variation (termed anisotropy).
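The transect example can be sketched in Python as follows. This is a minimal illustration for observations made at regular intervals, so that lags are whole numbers of sampling intervals and no lag tolerance is needed. The function name and the five observations are hypothetical.

```python
import numpy as np

def transect_variogram(z, max_lag):
    """Experimental variogram for regularly spaced observations on a transect."""
    z = np.asarray(z, dtype=float)
    lags = np.arange(1, max_lag + 1)
    gamma = np.empty(len(lags))
    for k, h in enumerate(lags):
        # All pairs of observations separated by h sampling intervals.
        diffs = z[:-h] - z[h:]
        # Half the average squared difference for this lag.
        gamma[k] = 0.5 * np.mean(diffs ** 2)
    return lags, gamma

# Five hypothetical observations along a transect.
lags, gamma = transect_variogram([1.0, 3.0, 2.0, 5.0, 4.0], max_lag=2)
```

For these values there are four pairs at a lag of 1 and three pairs at a lag of 2. In practice the number of pairs falls as the lag increases, so estimates at large lags are less reliable.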

A mathematical model may be fitted to the experimental variogram and the coefficients of this model can be used for a range of geostatistical operations such as spatial prediction (kriging) and conditional simulation (defined below). A model is usually selected from one of a set of authorised models. McBratney and Webster [24] provide a review of some of the most widely used authorised models. Further models can be found in a range of texts (for example, [8]).

There are two principal classes of variogram model. Transitive (bounded) models have a sill (finite variance), and indicate a second-order stationary process (as defined above). Unbounded models do not reach an upper bound; they are intrinsic only [24]. Fig. 2 shows the parameters of a bounded variogram model. The nugget effect, c0, represents unresolved variation (a mixture of spatial variation at a finer scale than the sample spacing and measurement error). The structured component, c1, represents the spatially correlated variation. The sill, c0+c1, is the a priori variance. The range, a, represents the scale (or frequency) of spatial variation. For example, if soil phosphate measured at an archaeological site varies markedly over quite small distances then the soil phosphate can be said to have a high frequency of spatial variation (a short range) while if the soil phosphate is quite similar over much of the site and varies markedly only at the extremes of the site (that is, at large separation distances) then the soil phosphate can be said to have a low frequency of spatial variation (a long range).

Fig. 1. Observations (+) made along a transect, with lag (h) of 1 and 2 indicated.

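Variograms used in the case studies presented following this section were fitted with a nugget effect and a spherical component. The nugget variance is given as:

$$\gamma(h) = \begin{cases} 0 & \text{if } h = 0 \\ 1 & \text{otherwise} \end{cases} \qquad (7)$$

The spherical model, a bounded model, is defined as:

$$\gamma(h) = \begin{cases} c \cdot \left[ 1.5 \dfrac{h}{a} - 0.5 \left( \dfrac{h}{a} \right)^3 \right] & \text{if } h \le a \\ c & \text{if } h > a \end{cases} \qquad (8)$$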

where c is the structured component. Authorised models may be used in positive linear combination where a single model is insufficient to represent well the form of the variogram.
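A positive linear combination of a nugget effect and a spherical component, as used in the case studies, might be evaluated as in the following sketch. The function name is an assumption. The coefficients in the example are those shown in the legend of Fig. 3 for SEDBB I (256.785 Nug(0) + 527.289 Sph(119.91)).

```python
import numpy as np

def nugget_spherical(h, c0, c1, a):
    """Nugget effect c0 plus a spherical component with sill c1 and range a."""
    h = np.asarray(h, dtype=float)
    hr = np.minimum(h / a, 1.0)               # beyond the range the model is flat
    spherical = c1 * (1.5 * hr - 0.5 * hr ** 3)
    # The semivariance is zero at a lag of exactly zero and jumps to c0 just beyond it.
    return np.where(h > 0, c0 + spherical, 0.0)

# Coefficients from the legend of Fig. 3 (lag in km).
lags = np.array([0.0, 30.0, 60.0, 119.91, 140.0])
model = nugget_spherical(lags, c0=256.785, c1=527.289, a=119.91)
```

At and beyond the range the model returns the total sill, c0 + c1, which for these coefficients is 784.074.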

3.3. Kriging

There are many varieties of kriging. Its simplest form is called simple kriging (SK). To use SK it is necessary to know the mean of the property of interest and this must be modelled as constant across the region of interest. In practice this is rarely the case. The most widely used variant of kriging, ordinary kriging (OK), allows the mean to vary spatially: the mean is estimated for each prediction neighbourhood. OK predictions are weighted averages of the n available data. The OK weights define the Best Linear Unbiased Predictor (BLUP). The OK prediction, ẑ_OK(x0), is defined as:

$$\hat{z}_{OK}(x_0) = \sum_{\alpha=1}^{n} \lambda_\alpha^{OK} z(x_\alpha) \qquad (9)$$

with the constraint that the weights, $\lambda_\alpha^{OK}$, sum to 1 to ensure an unbiased prediction:

$$\sum_{\alpha=1}^{n} \lambda_\alpha^{OK} = 1 \qquad (10)$$

So, the objective of the kriging system is to find appropriate weights by which the available observations will be multiplied before summing them to obtain the predicted value. These weights are determined using the coefficients of a model fitted to the variogram (or another function such as the covariance function).

The kriging prediction error must have an expected value of 0:

$$E\{\hat{Z}_{OK}(x_0) - Z(x_0)\} = 0 \qquad (11)$$

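The kriging (or prediction) variance, $\sigma^2_{OK}$, is expressed as:

$$\begin{aligned} \hat{\sigma}^2_{OK}(x_0) &= E[\{\hat{Z}_{OK}(x_0) - Z(x_0)\}^2] \\ &= -\gamma(0) - \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} \lambda_\alpha^{OK} \lambda_\beta^{OK} \gamma(x_\alpha - x_\beta) + 2 \sum_{\alpha=1}^{n} \lambda_\alpha^{OK} \gamma(x_\alpha - x_0) \end{aligned} \qquad (12)$$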

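That is, we seek the values of λ1, ..., λn (the weights) that minimise this expression with the constraint that the weights sum to one (equation 10). This minimisation is achieved through Lagrange multipliers. The conditions for the minimisation are given by the OK system comprising n+1 equations and n+1 unknowns:

$$\begin{cases} \sum_{\beta=1}^{n} \lambda_\beta^{OK} \gamma(x_\alpha - x_\beta) + \psi_{OK} = \gamma(x_\alpha - x_0) & \alpha = 1, \ldots, n \\ \sum_{\beta=1}^{n} \lambda_\beta^{OK} = 1 \end{cases} \qquad (13)$$

where $\psi_{OK}$ is a Lagrange multiplier. Knowing $\psi_{OK}$, the prediction variance of OK can be given as:

$$\hat{\sigma}^2_{OK} = \psi_{OK} - \gamma(0) + \sum_{\alpha=1}^{n} \lambda_\alpha^{OK} \gamma(x_\alpha - x_0) \qquad (14)$$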

Fig. 2. The parameters of a bounded variogram model with a nugget effect. [Diagram: variogram model plotted against lag (h), marking the nugget (c0), the structured component (c1), the sill (c0 + c1) and the range (a).]


The kriging variance is a measure of confidence in predictions and is a function of the form of the variogram, the sample configuration and the sample support (the area over which an observation is made, which may be approximated as a point or may be an area) [19]. The kriging variance is not conditional on the data values locally and this has led some researchers to use alternative approaches such as conditional simulation (discussed in the next section) to build models of spatial uncertainty [14].

There are two varieties of OK: punctual OK and block OK. With punctual OK the predictions cover the same area (the support, V) as the observations. In block OK, the predictions are made to a larger support than the observations. With punctual OK the data are honoured. That is, they are retained in the output map. Block OK predictions are averages over areas (i.e., the support has increased). Thus, the prediction is not the same as an observation (at x0) and does not need to honour it.

A worked example of the OK system is provided by Burrough and McDonnell ([7], box 6.2).
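The punctual OK system can also be sketched numerically. The following Python code builds the n+1 equations from a variogram model, solves them for the weights and the Lagrange multiplier, and returns the prediction and the kriging variance. It is a minimal illustration and not the GSLIB or Gstat implementation. The function names, the four observations and the variogram coefficients are hypothetical, γ(0) is taken as 0, and all n data are used rather than a search neighbourhood.

```python
import numpy as np

def spherical_variogram(h, c0, c1, a):
    """Nugget c0 plus spherical component (sill c1, range a); zero at h = 0."""
    h = np.asarray(h, dtype=float)
    hr = np.minimum(h / a, 1.0)
    return np.where(h > 0, c0 + c1 * (1.5 * hr - 0.5 * hr ** 3), 0.0)

def ordinary_kriging(coords, values, x0, gamma):
    """Punctual OK at x0: returns (prediction, kriging variance, weights)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    n = len(values)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Left-hand side: semivariances between data, bordered by ones for the
    # unbiasedness constraint, with a zero in the bottom-right corner.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = gamma(dists)
    lhs[n, n] = 0.0
    # Right-hand side: semivariances between each datum and x0, then 1.
    rhs = np.ones(n + 1)
    rhs[:n] = gamma(np.linalg.norm(coords - x0, axis=1))
    solution = np.linalg.solve(lhs, rhs)
    weights, psi = solution[:n], solution[n]
    prediction = weights @ values
    variance = psi + weights @ rhs[:n]        # kriging variance with gamma(0) = 0
    return prediction, variance, weights

# Four hypothetical observations at the corners of a 10 m square.
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
values = [12.0, 15.0, 11.0, 18.0]
gamma = lambda h: spherical_variogram(h, c0=2.0, c1=8.0, a=80.0)

centre_pred, centre_var, centre_w = ordinary_kriging(coords, values, (5.0, 5.0), gamma)
node_pred, node_var, node_w = ordinary_kriging(coords, values, (0.0, 0.0), gamma)
```

At the centre of the square the four weights are equal by symmetry, so the prediction is the mean of the data. At an observation location the prediction reproduces the datum and the kriging variance is zero, even though the model includes a nugget effect. This is the behaviour of punctual OK described above, in which the data are honoured.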

3.4. Conditional simulation

Kriging predictions are weighted moving averages of the available sample data. Kriging is, therefore, a smoothing interpolator. Conditional simulation (also called stochastic imaging) is not subject to the smoothing associated with kriging (conceptually, the variation lost by kriging due to smoothing is added back) as predictions are drawn from equally probable joint realisations of the RVs which make up a RF model [11]. That is, simulated values are not the expected values (i.e., the mean) but are values drawn randomly from the conditional cumulative distribution function (ccdf): a function of the available observations and the modelled spatial variation [12]. The simulation is considered “conditional” if the simulated values honour the observations at their locations [11]. Simulated realisations represent a possible reality whereas kriging does not. Simulation allows the generation of many different possible realisations that may be used as a guide to potential errors in the construction of a map [18] and multiple realisations encapsulate the uncertainty in spatial prediction.

Probably the most widely used form of conditional simulation is sequential Gaussian simulation (SGS). With sequential simulation, simulated values are conditional on the original data and previously simulated values [11]. In SGS the ccdfs are all assumed to be Gaussian.

The SGS algorithm follows several steps [10,14] as detailed below:

1. Apply a standard normal transform to the data.
2. Go to the location x1.
3. Use SK (note OK is often used instead; see Deutsch and Journel [11] about this issue), conditional on the original data, z(x_α), to make a prediction. The SK prediction and the kriging variance are parameters (the mean and variance) of a Gaussian ccdf:

$$F(x_1; z|(n)) = \mathrm{Prob}\{Z(x_1) \le z|(n)\} \qquad (15)$$

4. Using Monte Carlo simulation, draw a random residual, z^l(x1), from the ccdf.
5. Add the SK prediction and the residual which gives the simulated value; the simulated value is added to the data set.
6. Visit all locations in random order and predict using SK conditional on the n original data and the i−1 values, z^l(x_j), simulated at the previously visited locations x_j, j = 1, ..., i−1 to model the ccdf:

$$F(x_i; z|(n+i-1)) = \mathrm{Prob}\{Z(x_i) \le z|(n+i-1)\} \qquad (16)$$

7. Follow the procedure in steps 4 and 5 until all locations have been visited.
8. Back transform the data values and simulated values.

By using different random number seeds the order of visiting locations is varied and, therefore, multiple realisations can be obtained. In other words, since the simulated values are added to the data set, the values available for use in simulation are partly dependent on the locations at which simulations have already been made and, because of this, the values simulated at any one location vary as the available data vary.

SGS is discussed in detail in several texts (for example, [8,10,11,14]). The use, and benefits, of SGS are explored in this paper.
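The sequential part of the algorithm (steps 2 to 7) can be sketched for a one-dimensional transect as follows. This is a simplified illustration and not the GSLIB algorithm. It assumes that the data are already normal scores with a mean of 0 and a variance of 1, so the normal score transform and back transform (steps 1 and 8) are omitted. It uses all previously available values rather than a search neighbourhood, it assumes a spherical covariance with unit sill and no nugget, and it draws the simulated value directly from the Gaussian ccdf, which is equivalent to adding a random residual to the SK prediction. The function names and data are hypothetical.

```python
import numpy as np

def spherical_covariance(h, a):
    """Covariance C(h) = 1 - gamma(h) for a unit-sill spherical model, range a."""
    hr = np.minimum(np.abs(np.asarray(h, dtype=float)) / a, 1.0)
    return 1.0 - (1.5 * hr - 0.5 * hr ** 3)

def sgs_1d(x_data, y_data, x_grid, a, seed):
    """One conditional realisation of normal scores at the nodes of x_grid."""
    rng = np.random.default_rng(seed)
    xs = [float(x) for x in x_data]           # conditioning locations
    ys = [float(y) for y in y_data]           # conditioning values
    sim = np.empty(len(x_grid))
    for i in rng.permutation(len(x_grid)):    # random path through the nodes
        x0 = float(x_grid[i])
        X, Y = np.array(xs), np.array(ys)
        hit = np.isclose(X, x0)
        if hit.any():                         # node coincides with a datum:
            sim[i] = Y[hit][0]                # honour the datum
            continue
        # Simple kriging with a known mean of zero.
        C = spherical_covariance(X[:, None] - X[None, :], a)
        c0 = spherical_covariance(X - x0, a)
        lam = np.linalg.solve(C, c0)
        sk_mean = lam @ Y
        sk_var = max(1.0 - lam @ c0, 0.0)     # guard against round-off below zero
        # Draw from the Gaussian ccdf and add the value to the conditioning set.
        sim[i] = rng.normal(sk_mean, np.sqrt(sk_var))
        xs.append(x0)
        ys.append(sim[i])
    return sim

# Hypothetical normal-score data on a transect with nodes at 0, 1, ..., 20.
x_data, y_data = [0.0, 7.0, 15.0, 20.0], [-0.8, 0.4, 1.1, -0.3]
x_grid = np.arange(21.0)
realisation_a = sgs_1d(x_data, y_data, x_grid, a=8.0, seed=1)
realisation_b = sgs_1d(x_data, y_data, x_grid, a=8.0, seed=2)
```

Both realisations reproduce the data at the four observation locations but differ elsewhere, because the seed changes both the random path and the values drawn. The spread between many such realisations at a node gives a local measure of uncertainty.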

3.5. Sampling design

Kriging predicts with minimum prediction or kriging error, σ_OK (from here on generalised to σ_K), and also predicts this kriging error for every predicted value. The kriging error depends only on the geometry of the domain or support V to be predicted, the distances between V and the n(x0) data points x_α, the geometry of the n(x0) data, and finally the variogram [19]. The values of the sample observations locally have no influence. Thus, if the variogram is known, the kriging error can be predicted for any proposed sampling strategy prior to the actual survey. Kriging is, therefore, an ideal tool for designing optimal sampling strategies.

Burgess et al. [6] chose as their criterion of a good sampling strategy the minimisation of the maximum kriging error, σ_Kmax. The quantity σ_Kmax is not constant over the region of interest, but rather tends to increase the further the point (or block) to be predicted is from the observations (at least for monotonic increasing variograms). The σ_Kmax is reached when the point to be predicted is furthest from the sample observations, at a distance dmax.

Burgess et al. [6] showed that where spatial variation is isotropic (invariant with orientation) an equilateral triangular grid minimises dmax and hence σ_Kmax. Any alternative sampling schemes, and in particular the random scheme, will have some larger values of dmax and hence larger values of σ_Kmax, although a hexagonal grid may be optimum in restricted circumstances [41]. In practice, a square grid is likely to be preferred for reasons of convenience in indexing, site location and computer handling, and of shorter travelling distances in the field.

The optimal sampling density for a given sampling scheme can be determined by solving the kriging equations for several sampling densities and plotting σ̂_Kmax against sample spacing [23,25]. If the budget for the survey is limited then so too is the maximum precision attainable. If the survey is not limited by funding and the investigators can define a maximum tolerable prediction error, then the optimal sampling strategy is the one that just achieves the desired precision. Greater precision would be wasteful. The optimal strategy is found by reading the required sample spacing from the plot of σ̂_Kmax against sample spacing.
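The procedure of solving the kriging equations for several sample spacings can be sketched as follows for a square grid and punctual OK. The point furthest from the observations on a square grid is the centre of a grid cell, so the kriging variance is evaluated there using a 4 by 4 block of surrounding nodes. The sketch reports the kriging variance (the square of the kriging error). The function names, the neighbourhood size, the variogram coefficients, the candidate spacings and the tolerance are all hypothetical.

```python
import numpy as np

def spherical_variogram(h, c0, c1, a):
    """Nugget c0 plus spherical component (sill c1, range a); zero at h = 0."""
    h = np.asarray(h, dtype=float)
    hr = np.minimum(h / a, 1.0)
    return np.where(h > 0, c0 + c1 * (1.5 * hr - 0.5 * hr ** 3), 0.0)

def ok_variance(coords, x0, gamma):
    """Punctual OK variance at x0; depends only on geometry and the variogram."""
    n = len(coords)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = gamma(dists)
    lhs[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = gamma(np.linalg.norm(coords - x0, axis=1))
    solution = np.linalg.solve(lhs, rhs)
    return solution[n] + solution[:n] @ rhs[:n]

def max_variance_square_grid(spacing, gamma, k=4):
    """OK variance at a cell centre, using the surrounding k by k nodes."""
    offsets = (np.arange(k) - (k - 1) / 2.0) * spacing
    gx, gy = np.meshgrid(offsets, offsets)
    coords = np.column_stack([gx.ravel(), gy.ravel()])
    return ok_variance(coords, np.zeros(2), gamma)

def largest_spacing_within(tolerance, spacings, gamma):
    """Largest candidate spacing whose maximum OK variance is within tolerance."""
    ok = [s for s in spacings if max_variance_square_grid(s, gamma) <= tolerance]
    return max(ok) if ok else None

gamma = lambda h: spherical_variogram(h, c0=2.0, c1=8.0, a=80.0)
spacings = [5.0, 10.0, 20.0, 40.0]
variances = np.array([max_variance_square_grid(s, gamma) for s in spacings])
```

No data values are needed: only the grid geometry and the variogram model enter the calculation, which is why the design can be assessed before the survey takes place. Plotting `variances` against `spacings` gives the curve from which the required spacing is read.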

The above approach for optimal sampling design provides a model-based framework for selecting a sample spacing to achieve a desired precision of prediction. However, this approach has been criticised because it does not provide an adequate measure of local uncertainty (e.g., [14]). True, the quantity σ̂_Kmax generally increases with dmax such that densely sampled areas have smaller values of σ̂_Kmax than sparsely sampled areas. However, the quantity σ̂_Kmax is not affected by the character of spatial variation locally. Thus, in terms of elevation, mountainous regions and floodplain areas would result in the same σ̂_Kmax, for a given sampling framework. This inadequacy is most evident in maps of σ̂_Kmax for gridded data: the same local pattern in σ̂_Kmax is repeated globally. Despite these limitations σ̂_Kmax can be useful as a guide to uncertainty in predictions where spatial variation is similar across the region of interest. In cases where the form of spatial variation changes across the region of interest a non-stationary approach (for example, splitting the data into sub-sets which can be regarded as ‘homogeneous’) can be applied [21].

3.6. Software

The wide range of public domain and low cost software now available (see [35] for a review of some public domain software) means that the tools of geostatistics are readily available to the archaeologist. Widely used public domain software packages include GSLIB (Geostatistical Software Library [11]) and Gstat [32], both used for the case studies presented in this paper. In addition, several commercial GISystems include geostatistical functions and there is a range of commercial geostatistical packages.

4. Case studies

In this section, two case studies are presented. The first case study is an analysis of the distribution of Roman pottery in southern Britain and use of the variogram is illustrated. The second case study shows how the variogram, kriging (punctual and block OK) and conditional simulation (SGS) can be applied to the analysis of the distribution of soil phosphates at an archaeological site in Greece.

4.1. Case study 1: Roman pottery in southern Britain

The first case study utilises the variogram to characterise spatial dependence in assemblages of Roman pottery from the south of Britain from details collected by Allen and Fulford [1]. Allen and Fulford acquired data on five types of pottery, but of these, only two occur with enough regularity at the sites surveyed to provide a large enough sample for geostatistical analysis. The two types considered here are South-East Dorset Black Burnished Category I (SEDBB I) and Severn Valley Ware (SVW). SVW was not recorded at many of the sites and variograms estimated from few data are often ‘noisy’ and visually unstructured. It should also be noted that the percentages of SEDBB I and SVW at each site were obtained in various different ways including sherd counts, sherd weights, estimated vessel equivalent (EVE) and number of vessels represented (VR). Allen and Fulford [1] discuss this issue in some detail.

The omnidirectional variogram for SEDBB I is presented in Fig. 3. The increase in semivariance with lag for the variogram of SEDBB I percentages is indicative of spatial dependence and a model was fitted to this variogram. There is a clear tendency for semivariance to increase up to a lag of about 90 km after which semivariance remains constant. The range (a) of the fitted variogram model was 119.91 km. This may be interpreted as the separation distance above which assemblages of SEDBB I are spatially independent. In archaeological terms, this may represent the redistribution of pottery of this type from production centres to markets. In other words, pottery types that exhibit clearly structured spatial variation may be considered examples of larger scale production, vessels that perhaps dominate in the region of concern and are found consistently in archaeological assemblages. In such a scenario, industries that were only local in scale would, in a regional context, be marked by unstructured spatial variation. A map of SEDBB I%, derived using OK, is given in Fig. 4 (the validity of mapping properties such as artefact proportions is discussed below). Predictions are shown only within 250 km of the observations. The largest SEDBB I concentrations are in Dorset, as would be expected. The linear features visible in the map are characteristic of areas located far away from sample data.

In this case, the map should not be viewed as a map of predicted pottery amounts (that is, percentages) since pottery amount is not a continuously varying property (unlike, for example, elevation or rainfall). As Wheatley and Gillings [39] note, surfaces derived from observations such as counts of artefacts may be useful where the survey from which the data derive was not exhaustive (to give an indication of counts at areas where no data are available) but such maps should be viewed with caution. The kriged surface provides a way of gaining a clearer sense of regional variation in SEDBB I amount than is possible using shaded point maps and it may be considered to represent the idealised catchment of SEDBB I, but it does not represent pottery amount (that is, percentage) per se.

The omnidirectional variogram for SVW in Fig. 5 appears to demonstrate no general increase in semivariance with increase in lag h. In such a case a nugget effect only may describe adequately the form of the variogram and this would be referred to as pure nugget. In other words, semivariance does not increase markedly with respect to the nugget variance as lag increases. Directional variograms were also computed and, in most cases, gave little indication of spatial dependence. However, the variogram for 0° (north–south alignment) given in Fig. 6, to which a model was fitted, demonstrates a fairly clear increase in semivariance as lag h increases.

This indicates that the distribution of SVW is more continuous in the north–south direction than in other directions. Allen and Fulford’s contour map of SVW depicts major contours aligned north–south with more visually erratic changes in the contours in an east–west alignment. This corroborates the form of the variograms. Additionally, there are more data within the north and south extents of the data set than there are within the east–west limits, which means that the variogram would be most stable in form for the north–south direction. However, the variogram is unbounded and this may be indicative of differences between the north and south rather than within-region differences. The variogram provides a means to quantify spatial variation and compare different properties in a manner that is more objective than visually comparing maps.

4.2. Case study 2: Mapping soil phosphates

Kriging has been applied widely in soil survey to map soil types (for example, [37]) and is here used to map soil phosphates from an archaeological site. The data examined were published by Buck et al. [5]. The data represent soil phosphate measured at a site (reference LS 165), probably dating to the Roman period, that was studied as part of the Laconia Survey in Greece. The measures are mg P/100 g of soil and were obtained at 10 m intervals on a 16 by 16 point grid (although no data were obtained at nine locations on the grid due to obstacles at those nodes of the grid).

Variograms were estimated and models fitted to them using Gstat. The omnidirectional variogram of soil phosphate (Fig. 7) illustrates that the soil phosphate is spatially correlated. The omnidirectional variogram was fitted with a nugget effect and a spherical model. The large nugget effect is indicative of uncertainty in the measurement of soil phosphates and local (small-scale) variation in soil phosphates. The model fitted to the omnidirectional variogram has a range of 80.325 m. This can be interpreted as the maximum scale of spatial variation in soil phosphate in this region.

Fig. 3. Omnidirectional variogram for SEDBB I. Nug. is nugget, Sph. is spherical. [Plot of semivariance (SEDBB I %²) against lag (km); fitted model: 256.785 Nug(0) + 527.289 Sph(119.91).]

The variogram was also estimated for different directions within a tolerance of 45 degrees. The variogram with the largest range was for 22° 30′ (modelled range of 80.976 m) while the variogram at 90° from this had a modelled range of 77.079 m. The variograms, and structured components, for those two directions are given in Fig. 8. The models fitted for the two directions have similar ranges but different nugget variances. Where the sill (recall the total sill is the nugget variance plus the structured components) differs for different directions this is termed zonal anisotropy. The differences are not marked and the most straightforward approach, to use the coefficients from the omnidirectional variogram model as input for kriging, was accepted in this case.
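The estimation of omnidirectional and directional variograms can be sketched as follows. This is a minimal illustration of the usual (Matheron) estimator and not the Gstat implementation used here. The data are synthetic values on a 16 by 16 grid at 10 m spacing. The function name, the azimuth convention (degrees clockwise from north) and the half-width of 22.5°, which reads the 45° tolerance as the full angular window, are all assumptions.

```python
import numpy as np

def experimental_variogram(coords, values, lag_width, n_lags,
                           azimuth_deg=None, half_tol_deg=22.5):
    """Half the mean squared difference of paired values, binned by lag.
    azimuth_deg=None gives the omnidirectional variogram."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # every pair once
    dx = coords[j, 0] - coords[i, 0]
    dy = coords[j, 1] - coords[i, 1]
    dist = np.hypot(dx, dy)
    sq_diff = (values[j] - values[i]) ** 2
    keep = np.ones(dist.shape, dtype=bool)
    if azimuth_deg is not None:
        # Pair direction clockwise from north; directions are axial (mod 180).
        az = np.degrees(np.arctan2(dx, dy)) % 180.0
        diff = np.abs(az - (azimuth_deg % 180.0))
        diff = np.minimum(diff, 180.0 - diff)
        keep = diff <= half_tol_deg
    lag_bin = np.rint(dist / lag_width).astype(int)   # bins centred on multiples of lag_width
    lags, gammas, counts = [], [], []
    for b in range(1, n_lags + 1):
        sel = keep & (lag_bin == b)
        if sel.any():
            lags.append(dist[sel].mean())
            gammas.append(0.5 * sq_diff[sel].mean())
            counts.append(int(sel.sum()))
    return np.array(lags), np.array(gammas), np.array(counts)

# Synthetic stand-in for the phosphate grid (values are NOT the Laconia data)
gx, gy = np.meshgrid(np.arange(16) * 10.0, np.arange(16) * 10.0)
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = np.random.default_rng(0).normal(100.0, 28.0, size=len(coords))

lag_o, gam_o, n_o = experimental_variogram(coords, values, 10.0, 8)
lag_n, gam_n, n_n = experimental_variogram(coords, values, 10.0, 8, azimuth_deg=0.0)
lag_a, gam_a, n_a = experimental_variogram(coords, values, 10.0, 8, azimuth_deg=22.5)
lag_b, gam_b, n_b = experimental_variogram(coords, values, 10.0, 8, azimuth_deg=112.5)
print("omnidirectional:", np.round(gam_o, 1))
print("22.5 degrees:   ", np.round(gam_a, 1))
print("112.5 degrees:  ", np.round(gam_b, 1))
```

Comparing the sills and ranges of models fitted to the directional estimates is then the basis for judging whether anisotropy is marked enough to model.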

Fig. 4. Map of SEDBB I%, derived using OK. 1000 m cells. (Legend: SEDBB I %, low 2.78 to high 68.90; scale bar 0 to 240,000 metres.)

The OK functionality of GSLIB was used to krige soil phosphate to a grid with a 2 m spacing. Both punctual OK and block OK were applied: the choice between the two approaches is an important issue. The map of punctual OK predictions is given in Fig. 9 and the corresponding kriging variance is given in Fig. 10. The locations of the observations are obvious in both Fig. 9 and Fig. 10. In Fig. 9, the observations appear as ‘spikes’ in the map. This is a common feature of maps derived using punctual kriging. The OK variance at the observation locations is zero in Fig. 10. This implies that there is no measurement error in the data, but in fact measurement of soil phosphate entails much uncertainty. The analysis was repeated using block OK. The spikes evident in Fig. 9 are not apparent in the map of block OK predictions (Fig. 11). Also, the block OK variances (Fig. 12), unlike in Fig. 10, are not zero at any locations. Note also that the range of values in the block OK map is smaller than the range of values for the punctual OK map because of the process of averaging over a 2 m by 2 m block. It is, as noted previously, important to consider issues such as the support over which predictions will be made.
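The contrast between punctual and block OK at an observation location can be reproduced with a small sketch. This is not the GSLIB code. It solves the ordinary kriging system directly with the omnidirectional model coefficients, and for the block case it averages semivariances over a 4 by 4 discretisation of a 2 m block, with the nugget treated as micro-scale variation. The observations are made-up values on a hypothetical 4 by 4 patch, and all function names are assumptions.

```python
import numpy as np

# Coefficients of the omnidirectional model reported in the text (Fig. 7)
NUGGET, PSILL, RANGE_M = 409.959, 411.119, 80.325

def structured(h):
    """Spherical (structured) component of the semivariance."""
    r = np.minimum(np.asarray(h, dtype=float) / RANGE_M, 1.0)
    return PSILL * (1.5 * r - 0.5 * r**3)

def gamma_point(h):
    """Point-support semivariance: zero at lag 0, nugget + spherical otherwise."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, NUGGET + structured(h), 0.0)

def pairwise(a, b):
    """Euclidean distances between two sets of 2-D points."""
    return np.hypot(a[:, None, 0] - b[None, :, 0], a[:, None, 1] - b[None, :, 1])

def ordinary_kriging(coords, values, target, block=None, n_disc=4):
    """Return (prediction, kriging variance) at target.
    block=None gives punctual OK; block=<side length> gives block OK."""
    n = len(values)
    lhs = np.ones((n + 1, n + 1))            # semivariances bordered by ones
    lhs[:n, :n] = gamma_point(pairwise(coords, coords))
    lhs[n, n] = 0.0
    if block is None:
        g0 = gamma_point(pairwise(coords, target[None, :]))[:, 0]
        g_bb = 0.0
    else:
        # Discretise the block; the nugget enters every point-to-block and
        # within-block average because it is treated as micro-scale variation.
        offs = ((np.arange(n_disc) + 0.5) / n_disc - 0.5) * block
        bx, by = np.meshgrid(target[0] + offs, target[1] + offs)
        pts = np.column_stack([bx.ravel(), by.ravel()])
        g0 = NUGGET + structured(pairwise(coords, pts)).mean(axis=1)
        g_bb = NUGGET + structured(pairwise(pts, pts)).mean()
    sol = np.linalg.solve(lhs, np.append(g0, 1.0))
    weights, mu = sol[:n], sol[n]            # mu is the Lagrange multiplier
    return float(weights @ values), float(weights @ g0 + mu - g_bb)

# Hypothetical 4 x 4 patch of observations at 10 m spacing (values are made up)
gx, gy = np.meshgrid(np.arange(4) * 10.0, np.arange(4) * 10.0)
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = np.random.default_rng(1).normal(80.0, 25.0, size=16)

node = 5                                     # predict at an observed node
z_pt, v_pt = ordinary_kriging(coords, values, coords[node])
z_bl, v_bl = ordinary_kriging(coords, values, coords[node], block=2.0)
print(f"observed {values[node]:.2f}")
print(f"punctual OK: prediction {z_pt:.2f}, variance {v_pt:.2f}")
print(f"block OK:    prediction {z_bl:.2f}, variance {v_bl:.2f}")
```

At the observed node the punctual prediction returns the datum with zero variance, which is the source of the spikes, whereas the block prediction spreads weight over neighbouring observations and has a non-zero variance.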

Buck et al. [5] aimed to delineate areas with high and low concentrations of soil phosphate. Although the aim here has been simply to demonstrate the application of OK for interpolation, other kriging algorithms, in particular, disjunctive kriging [33], may be used to assess the probability that a predicted value exceeds a particular threshold.

Fig. 5. Omnidirectional variogram for SVW. (Axes: lag in km against semivariance in SVW %².)

Fig. 6. Directional variogram (0°) for SVW. Nug. is nugget, Pow. is power. (Axes: lag in km against semivariance in SVW %²; fitted model: 290 Nug(0) + 2 Pow(1).)

Conditional simulation was also applied to the data. Four maps derived using SGS (the algorithm in GSLIB was utilised) are given in Fig. 13. Differences between the four realisations are apparent. Conditional simulation provides a powerful means to explore variation in spatial data and there are extensive potential applications for interpreting and mapping distributions of archaeological variables. Each one of the maps in Fig. 13 represents a possible reality, whereas neither Fig. 9 nor Fig. 11 is a possible reality because they are smoothed representations. Kriging provides the best prediction on a point-by-point basis, whereas simulation is the best on a global basis; that is, it reproduces the original spatial structure. Statistics estimated from multiple simulated realisations may be a useful guide to spatial uncertainty.
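The sequential logic of SGS, and the use of multiple realisations to summarise uncertainty, can be sketched on a small grid. This is a simplified illustration and not the GSLIB program. It assumes the data are already Gaussian (no normal-score transform), uses simple kriging with the data mean taken as the known mean, and conditions on every previously informed node instead of a limited search neighbourhood. The grid, the eight observations and the threshold of 100 mg P/100 g are hypothetical.

```python
import numpy as np

NUGGET, PSILL, RANGE_M = 409.959, 411.119, 80.325   # omnidirectional model (Fig. 7)

def covariance(h):
    """Covariance implied by the nugget + spherical variogram."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / RANGE_M, 1.0)
    return np.where(h > 0, PSILL * (1.0 - 1.5 * r + 0.5 * r**3), NUGGET + PSILL)

def pairwise(a, b):
    """Euclidean distances between two sets of 2-D points."""
    return np.hypot(a[:, None, 0] - b[None, :, 0], a[:, None, 1] - b[None, :, 1])

def sgs_realisation(grid, data_idx, data_val, mean, rng):
    """One conditional realisation on the grid nodes."""
    z = np.full(len(grid), np.nan)
    z[data_idx] = data_val                   # observations are honoured exactly
    known = [int(i) for i in data_idx]
    free = np.setdiff1d(np.arange(len(grid)), data_idx)
    for k in rng.permutation(free):          # random path through unsampled nodes
        pts = grid[known]
        c_kk = covariance(pairwise(pts, pts))
        c_k0 = covariance(pairwise(pts, grid[k][None, :]))[:, 0]
        w = np.linalg.solve(c_kk, c_k0)      # simple kriging weights
        sk_mean = mean + w @ (z[known] - mean)
        sk_var = max(NUGGET + PSILL - w @ c_k0, 0.0)
        z[k] = rng.normal(sk_mean, np.sqrt(sk_var))
        known.append(int(k))                 # simulated value becomes conditioning data
    return z

# Hypothetical 10 x 10 grid at 10 m spacing with eight made-up observations
gx, gy = np.meshgrid(np.arange(10) * 10.0, np.arange(10) * 10.0)
grid = np.column_stack([gx.ravel(), gy.ravel()])
data_idx = np.array([3, 18, 27, 41, 56, 64, 79, 92])
data_val = np.array([45.0, 120.0, 88.0, 60.0, 150.0, 97.0, 72.0, 110.0])

rng = np.random.default_rng(42)
reals = np.array([sgs_realisation(grid, data_idx, data_val, data_val.mean(), rng)
                  for _ in range(20)])

node_sd = reals.std(axis=0)                  # spread between realisations
p_exceed = (reals > 100.0).mean(axis=0)      # share of realisations above threshold
print("largest between-realisation SD:", round(float(node_sd.max()), 2))
print("nodes with P(exceed 100) > 0.5:", int((p_exceed > 0.5).sum()))
```

Each row of `reals` is one possible map; the per-node standard deviation and the proportion of realisations exceeding a threshold are two examples of the summary statistics referred to above.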

Fig. 7. Omnidirectional variogram of soil phosphate. Nug. is nugget, Sph. is spherical, DD is decimal degrees. (Axes: lag in m against semivariance in (mg P/100 g)²; fitted model: 409.959 Nug(0) + 411.119 Sph(80.325).)

Fig. 8. Directional variogram of soil phosphate for 22° 30′ and 112° 30′. Nug. is nugget, Sph. is spherical, DD is decimal degrees. (Axes: lag in m against semivariance in (mg P/100 g)²; fitted models: 22.5 DD: 443.898 Nug(0) + 410.956 Sph(80.976); 112.5 DD: 366.037 Nug(0) + 417.243 Sph(77.079).)

Fig. 9. Map of soil phosphate produced using punctual OK, 2 m cells. Scale is in mg P/100 g of soil. (Map extent 100 m by 100 m; legend from 27.000 to 137.000.)

Fig. 10. Map of punctual OK variances. Scale is in (mg P/100 g)² of soil. (Map extent 100 m by 100 m; legend from 0.0 to 710.680.)

It was noted above that the coefficients of the model fitted to the variogram can be used to ascertain the maximum punctual kriging variance for different sample spacings (for a prediction neighbourhood of 16 observations). This enables the researcher to ascertain the maximum sample spacing possible to achieve a particular precision [3]. To do this it is necessary to obtain a sample data set for which a representative variogram may be estimated. Measurements are often made along a transect for this purpose. The coefficients of the model fitted to the omnidirectional variogram of soil phosphate were input into the Fortran program OSSFIM (Optimal Sampling Schemes for Isarithmic Mapping) [23,25] and the maximum kriging variance, σ²Kmax, for several different sample spacings was obtained (Fig. 14). In Fig. 15, it is shown that if a required σ²Kmax of 625 (mg P/100 g)² (that is, a kriging standard error of 25 mg P/100 g) were stated then a sample spacing of about 25 m would be necessary. The kriging variance is directly dependent on the form of the variogram so it is necessary that the variogram is representative of the region for which it is estimated. If this is the case, this approach could be a useful tool for the archaeologist as an aid to designing sampling strategies.
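The calculation behind this kind of sampling design can be sketched as follows. The code is not OSSFIM. It assumes a square grid, takes the centre of a grid cell as the location of maximum punctual kriging variance, uses the 16 nearest nodes (the surrounding 4 by 4 block) and solves the ordinary kriging system with the omnidirectional model coefficients. The function names and the set of candidate spacings are assumptions, and the sketch is not claimed to reproduce the values plotted in Fig. 14.

```python
import numpy as np

NUGGET, PSILL, RANGE_M = 409.959, 411.119, 80.325   # omnidirectional model (Fig. 7)

def gamma(h):
    """Nugget + spherical semivariance (zero at lag 0)."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / RANGE_M, 1.0)
    return np.where(h > 0, NUGGET + PSILL * (1.5 * r - 0.5 * r**3), 0.0)

def max_kriging_variance(spacing, n_side=4):
    """Punctual OK variance at the centre of a square grid cell, using the
    n_side x n_side nodes around it (16 observations by default)."""
    ax = (np.arange(n_side) - (n_side - 1) / 2.0) * spacing   # cell centre at origin
    gx, gy = np.meshgrid(ax, ax)
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    n = len(pts)
    d = np.hypot(pts[:, None, 0] - pts[None, :, 0], pts[:, None, 1] - pts[None, :, 1])
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = gamma(d)
    lhs[n, n] = 0.0
    g0 = gamma(np.hypot(pts[:, 0], pts[:, 1]))                # node-to-centre lags
    sol = np.linalg.solve(lhs, np.append(g0, 1.0))
    return float(sol[:n] @ g0 + sol[n])                       # weights . g0 + multiplier

def largest_spacing(target_variance, candidates):
    """Largest candidate spacing whose maximum kriging variance meets the target."""
    ok = [s for s in candidates if max_kriging_variance(s) <= target_variance]
    return max(ok) if ok else None

spacings = np.arange(5.0, 45.0, 5.0)                          # metres
variances = np.array([max_kriging_variance(s) for s in spacings])
for s, v in zip(spacings, variances):
    print(f"spacing {s:4.1f} m  max kriging variance {v:7.2f} (mg P/100 g)^2")

best = largest_spacing(625.0, spacings)
print("largest candidate spacing meeting 625 (mg P/100 g)^2:", best)
```

Because only the variogram model and the sampling configuration enter the calculation, the spacing can be chosen before any further data are collected, provided the variogram is representative.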

5. Summary and conclusions

Geostatistics offers many potential benefits to archaeologists who are concerned with the analysis of spatial data. However, like any tool, geostatistics must be used appropriately. In many cases, simpler tools may be appropriate. So, it is necessary to consider carefully the pros and cons of geostatistics in any given situation.

Fig. 11. Map of soil phosphate produced using block OK, 2 m cells. Scale is in mg P/100 g of soil. (Map extent 100 m by 100 m; legend from 44.000 to 104.000.)

Fig. 12. Map of block OK variances. Scale is in (mg P/100 g)² of soil. (Map extent 100 m by 100 m; legend from 81.100 to 294.280.)

Fig. 13. Four maps of soil phosphate produced using SGS, 2 m cells. Scale is in mg P/100 g of soil. (Each map has an extent of 100 m by 100 m; legend from 17.000 to 297.000.)

Fig. 14. Plot of maximum kriging variance against sample spacing. (Axes: sample spacing in m against maximum kriging variance in (mg P/100 g)².)

Where the spatial variation in an archaeological variable is of interest the tools of geostatistics have much potential value. Tools such as the variogram may be utilised to quantify and interpret observed spatial distributions. Hodder and Orton [16] illustrated how measures of spatial autocorrelation could be used to characterise spatial variation in archaeological variables. In addition to characterisation of spatial dependence, this paper has demonstrated how geostatistics may be used to analyse and map archaeological variables. There are many archaeological variables that could be analysed geostatistically. Some obvious ones are artefact densities and dates of objects [39]. The following applications were outlined (using the tools specified in parentheses):

• characterisation of spatial variation (variogram)
• spatial prediction (ordinary kriging)
• assessment of uncertainty in mapped predictions (kriging variance)
• conditional simulation (sequential Gaussian simulation)
• design of optimal sampling strategies (kriging variance).

The tools of geostatistics represent a powerful addition to the archaeologist’s tool kit but, so far, few of the potential benefits have been realised. This is due, in part, to the perceived complexity of the techniques and the models that underlie them. It is hoped that this paper will serve in some way to expand the understanding of geostatistics and to encourage its use in archaeology.

References

[1] J.R.L. Allen, M.G. Fulford, The distribution of South-East Dorset Black Burnished Category 1 pottery in South-West Britain, Britannia 27 (1996) 223–281.

[2] M. Armstrong, Basic Linear Geostatistics, Springer, Berlin, 1998.

[3] P.M. Atkinson, Optimal sampling strategies for raster-based geographical information systems, Global Ecology and Biogeography Letters 5 (1996) 217–280.

[4] J.P. Bocquet-Appel, P.Y. Demars, Neanderthal contraction and modern human colonization of Europe, Antiquity 74 (2000) 544–552.

[5] C.E. Buck, W.G. Cavanagh, C.D. Litton, The spatial analysis of site phosphate data, in: S.P.Q. Rahtz (Ed.), Computer and Quantitative Methods in Archaeology 1988, BAR International Series 446(i), BAR, Oxford, 1988, pp. 151–160.

[6] T.M. Burgess, R. Webster, A.B. McBratney, Optimal interpolation and isarithmic mapping of soil properties. IV. Sampling strategy, Journal of Soil Science 32 (1981) 643–659.

[7] P.A. Burrough, R.A. McDonnell, Principles of Geographical Information Systems, Oxford University Press, Oxford, 1998.

[8] J.P. Chilès, P. Delfiner, Geostatistics: Modeling Uncertainty, John Wiley and Sons, New York, 1999.

[9] P.J. Curran, P.M. Atkinson, Geostatistics in remote sensing, Progress in Physical Geography 22 (1998) 61–78.

[10] C.V. Deutsch, Geostatistical Reservoir Modelling, Oxford University Press, New York, 2002.

[11] C.V. Deutsch, A.G. Journel, GSLIB: Geostatistical Software Library and User’s Guide, second ed., Oxford University Press, New York, 1998.

[12] J.L. Dungan, Conditional simulation, in: A. Stein, F. van der Meer, B. Gorte (Eds.), Spatial Statistics for Remote Sensing, Kluwer Academic Publishers, Dordrecht, 1999, pp. 135–152.

[13] D. Ebert, The potential of geostatistics in the analysis of fieldwalking data, in: D. Wheatley, G. Earl, S. Poppy (Eds.), Contemporary Themes in Archaeological Computing, University of Southampton Department of Archaeology Monograph No. 3, Oxbow Books, Oxford, 2002, pp. 82–89.

[14] P. Goovaerts, Geostatistics for Natural Resources Evaluation, Oxford University Press, New York, 1997.

[15] J.B. Hageman, D.A. Bennett, Construction of digital elevation models for archaeological applications, in: K.L. Westcott, R.J. Brandon (Eds.), Practical Applications of GIS for Archaeologists: A Predictive Modeling Kit, Taylor and Francis, London, 2000, pp. 113–127.

Fig. 15. Plot of maximum kriging variance against sample spacing, showing the sample spacing required to achieve a maximum kriging variance of 625 (mg P/100 g)². (Axes: sample spacing in m against maximum kriging variance in (mg P/100 g)².)

[16] I. Hodder, C. Orton, Spatial Analysis in Archaeology, New Studies in Archaeology 1, Cambridge University Press, Cambridge, 1976.

[17] E.H. Isaaks, R.M. Srivastava, An Introduction to Applied Geostatistics, Oxford University Press, New York, 1989.

[18] A.G. Journel, Modelling uncertainty and spatial dependence: stochastic imaging, International Journal of Geographical Information Systems 10 (1996) 517–522.

[19] A.G. Journel, C.J. Huijbregts, Mining Geostatistics, Academic Press, London, 1978.

[20] K.L. Kvamme, Spatial autocorrelation and the Classic Maya collapse revisited: refined techniques and new conclusions, Journal of Archaeological Science 17 (1990) 197–207.

[21] C.D. Lloyd, P.M. Atkinson, The effect of scale-related issues on the geostatistical analysis of Ordnance Survey® digital elevation data at the national scale, in: J. Gómez-Hernández, A. Soares, R. Froidevaux (Eds.), GeoENV II: Geostatistics for Environmental Applications, Kluwer Academic Publishers, Dordrecht, 1999, pp. 537–548.

[22] G. Matheron, The Theory of Regionalized Variables and its Applications, Les Cahiers du Centre de Morphologie Mathématique de Fontainebleau No. 5, École Nationale Supérieure des Mines, Fontainebleau, 1971.

[23] A.B. McBratney, R. Webster, The design of optimal sampling schemes for local estimation and mapping of regionalised variables. II. Program and examples, Computers and Geosciences 7 (1981) 335–365.

[24] A.B. McBratney, R. Webster, Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates, Journal of Soil Science 37 (1986) 617–639.

[25] A.B. McBratney, R. Webster, T.M. Burgess, The design of optimal sampling schemes for local estimation and mapping of regionalised variables. I. Theory and method, Computers and Geosciences 7 (1981) 331–334.

[26] D.E. Myers, To be or not to be ... stationary? That is the question, Mathematical Geology 21 (1989) 347–362.

[27] F.D. Neiman, Conspicuous consumption as wasteful advertising: a Darwinian perspective on spatial patterns in Classic Maya terminal monuments dates, in: M.C. Barton, G.A. Clark (Eds.), Rediscovering Darwin: Evolutionary Theory and Archaeological Explanation, Archaeological Papers of the American Anthropological Association, 1997, pp. 267–290.

[28] M.A. Oliver, R. Webster, Kriging: a method of interpolation for geographical information systems, International Journal of Geographical Information Systems 4 (1990) 313–332.

[29] M.A. Oliver, R. Webster, K.J. Edwards, G. Whittington, Multivariate, autocorrelation and spectral analyses of a pollen profile from Scotland and evidence of periodicity, Review of Palaeobotany and Palynology 96 (1997) 121–141.

[30] M.A. Oliver, R. Webster, J. Gerrard, Geostatistics in physical geography. Part I: theory, Transactions of the Institute of British Geographers 14 (1989a) 259–269.

[31] M.A. Oliver, R. Webster, J. Gerrard, Geostatistics in physical geography. Part II: applications, Transactions of the Institute of British Geographers 14 (1989b) 270–286.

[32] E.J. Pebesma, C.G. Wesseling, Gstat, a program for geostatistical modelling, prediction and simulation, Computers and Geosciences 24 (1998) 17–31.

[33] J. Rivoirard, Introduction to Disjunctive Kriging and Non-linear Geostatistics, Clarendon Press, Oxford, 1994.

[34] J.M. Robinson, E. Zubrow, Between spaces: interpolation in archaeology, in: M. Gillings, D. Mattingly, J. van Dalen (Eds.), The Archaeology of Mediterranean Landscapes, Oxbow Books, Oxford, 1999, pp. 65–83.

[35] C. Varekamp, A.K. Skidmore, P.A.B. Burrough, Using public domain geostatistical and GIS software for spatial interpolation, Photogrammetric Engineering and Remote Sensing 62 (1996) 845–854.

[36] R. Webster, T.M. Burgess, Optimal interpolation and isarithmic mapping of soil properties III. Changing drift and universal kriging, Journal of Soil Science 31 (1980) 505–524.

[37] R. Webster, M.A. Oliver, Statistical Methods in Soil and Land Resource Survey, Oxford University Press, Oxford, 1990.

[38] R. Webster, M.A. Oliver, Geostatistics for Environmental Scientists, John Wiley and Sons, Chichester, 2000.

[39] D. Wheatley, M. Gillings, Spatial Technology and Archaeology: The Archaeological Applications of GIS, Taylor & Francis, London, 2002.

[40] D.S. Whitley, Spatial autocorrelation tests and the Classic Maya collapse: methods and inferences, Journal of Archaeological Science 12 (1985) 377–395.

[41] E.A. Yfantis, G.T. Flatman, J.V. Behar, Efficiency of kriging estimation for square, triangular, and hexagonal grids, Mathematical Geology 19 (1987) 183–205.

[42] E.B.W. Zubrow, J.W. Harbaugh, Archaeological prospecting: kriging and simulation, in: I. Hodder (Ed.), Simulation Studies in Archaeology, Cambridge University Press, Cambridge, 1978, pp. 109–122.

