
CH1 Introductionrw

Jun 02, 2018

  • 8/10/2019 CH1 Introductionrw

    1/7

    INTRODUCTION

    This book presents an introduction to the set of tools that has become
    known commonly as geostatistics. Many statistical tools are useful
    in developing qualitative insights into a wide variety of natural
    phenomena; many others can be used to develop quantitative answers
    to specific questions. Unfortunately, most classical statistical
    methods make no use of the spatial information in earth science data
    sets. Geostatistics offers a way of describing the spatial continuity
    that is an essential feature of many natural phenomena and provides
    adaptations of classical regression techniques to take advantage of
    this continuity.

    The presentation of geostatistics in this book is not heavily
    mathematical. Few theoretical derivations or formal proofs are given;
    instead, references are provided to more rigorous treatments of the
    material. The reader should be able to recall basic calculus and be
    comfortable with finding the minimum of a function by using the first
    derivative and representing a spatial average as an integral. Matrix
    notation is used in some of the later chapters since it offers a
    compact way of writing systems of simultaneous equations. The reader
    should also have some familiarity with the statistical concepts
    presented in Chapters 2 and 3.
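    As a reminder of the two calculus ideas mentioned above, they can be
    written as follows. The notation is an assumption made for this note
    and is not taken from the book: f is a differentiable function with an
    interior minimum at x*, v(x) is a spatially varying quantity, and A is
    a region whose area (or volume) is |A|.

```latex
\left. \frac{df}{dx} \right|_{x = x^{*}} = 0,
\qquad
\bar{v}_A = \frac{1}{|A|} \int_A v(\mathbf{x}) \, d\mathbf{x}
```

    The first condition is necessary but not sufficient for a minimum;
    the second expresses the average of v over A.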

    Though we have avoided mathematical formalism, the presentation
    is not simplistic. The book is built around a series of case studies on
    a distressingly real data set. As we soon shall see, analysis of earth
    science data can be both frustrating and fraught with difficulty. We
    intend to trudge through the muddy spots, stumble into the pitfalls,
    and wander into some of the dead ends. Anyone who has already


    An Introduction to Applied Geostatistics

    tackled a geostatistical study will sympathize with us in our many
    dilemmas.

    Our case studies differ from those that practitioners encounter
    in only one aspect; throughout our study we will have access to the
    correct answers. The data set with which we perform the studies is in
    fact a subset of a much larger, completely known data set. This gives
    us a yardstick by which we can measure the success of several different
    approaches.

    A warning is appropriate here. The solutions we propose in the
    various case studies are particular to the data set we use. It is not our
    intention to propose these as general recipes. The hallmark of a good
    geostatistical study is customization of the approach to the problem at
    hand. All we intend in these studies is to cultivate an understanding of
    what various geostatistical tools can do and, more importantly, what
    their limitations are.

    The Walker Lake Data Set

    The focus of this book is a data set that was derived from a digital
    elevation model from the western United States, specifically the
    Walker Lake area in Nevada.

    We will not be using the original elevation values as variables in
    our case studies. The variables we do use, however, are related to the
    elevation and, as we shall see, their maps exhibit features which are
    related to the topographic features in Figure 1.1. For this reason, we
    will be referring to specific subareas within the Walker Lake area by
    the geographic names given in Figure 1.1.

    The original digital elevation model contained elevations for about
    2 million points on a regular grid. These elevations have been
    transformed to produce a data set consisting of three variables measured
    at each of 78,000 points on a 260 x 300 rectangular grid. The first
    two variables are continuous and their values range from zero to
    several thousands. The third variable is discrete and its value is either
    one or two. Details on how to obtain the digital elevation model and
    reproduce this data set are given in Appendix A.

    We have tried to avoid writing a book that is too specific
    to one field of application. For this reason the variables in the
    Walker Lake data set are referred to anonymously as V, U, and T.
    Unfortunately, a bias toward mining applications will occasionally creep


    Figure 1.1 A location map of the Walker Lake area in Nevada. The small
    rectangle on the outline of Nevada shows the relative location of the
    area within the state. The larger rectangle shows the major topographic
    features within the area.

    in; this reflects both the historical roots of geostatistics and the
    experience of the authors. The methods discussed here, however, are
    quite generally applicable to any data set in which the values are
    spatially continuous.

    The continuous variables, V and U, could be thicknesses of a
    geologic horizon or the concentration of some pollutant; they could be
    soil strength measurements or permeabilities; they could be rainfall
    measurements or the diameters of trees. The discrete variable, T, can be
    viewed as a number that assigns each point to one of two possible
    categories; it could record some important color difference or two different


    species; it could separate different rock types or different soil
    lithologies; it could record some chemical difference such as the
    presence or absence of a particular element.

    For the sake of convenience and consistency we will refer to V and
    U as concentrations of some material and will give both of them units
    of parts per million (ppm). We will treat T as an indicator of two
    types that will be referred to as type 1 and type 2. Finally, we will
    assign units of meters to our grid even though its original dimensions
    are much larger than 260 x 300 m².

    The Walker Lake data set consists of V, U, and T measurements at
    each of 78,000 points on a 1 x 1 m² grid. From this extremely dense
    data set a subset of 470 sample points has been chosen to represent a
    typical sample data set. To distinguish between these two data sets,
    the complete set of all information for the 78,000 points is called the
    exhaustive data set, while the smaller subset of 470 points is called the
    sample data set.
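    The relationship between the two data sets can be sketched in a few
    lines of Python. This is only an illustration under stated
    assumptions: the values are random placeholders rather than the
    Walker Lake values, the value ranges are hypothetical, and the 470
    locations are drawn uniformly at random, which is not claimed to be
    how the book's sample was chosen. The names make_exhaustive and
    draw_sample are invented for this sketch.

```python
import random

# Grid dimensions taken from the text: 260 x 300 = 78,000 points.
NX, NY = 260, 300
N_SAMPLES = 470


def make_exhaustive(seed=0):
    """Build a stand-in exhaustive data set: one (V, U, T) triple per node.

    The values are random placeholders, NOT the Walker Lake values.
    """
    rng = random.Random(seed)
    grid = {}
    for x in range(1, NX + 1):
        for y in range(1, NY + 1):
            v = rng.uniform(0.0, 5000.0)  # continuous, hypothetical range (ppm)
            u = rng.uniform(0.0, 5000.0)  # continuous, hypothetical range (ppm)
            t = rng.choice((1, 2))        # discrete: type 1 or type 2
            grid[(x, y)] = (v, u, t)
    return grid


def draw_sample(grid, n=N_SAMPLES, seed=1):
    """Pick n grid nodes uniformly at random (an assumption of this sketch)."""
    rng = random.Random(seed)
    locations = rng.sample(sorted(grid), n)
    return {loc: grid[loc] for loc in locations}


exhaustive = make_exhaustive()
sample = draw_sample(exhaustive)

# Because the exhaustive set is fully known, any sample-based estimate
# can be compared with the corresponding true value.
true_mean_v = sum(v for v, _, _ in exhaustive.values()) / len(exhaustive)
sample_mean_v = sum(v for v, _, _ in sample.values()) / len(sample)

print(len(exhaustive), len(sample))
print(f"exhaustive mean of V: {true_mean_v:.1f}")
print(f"sample mean of V:     {sample_mean_v:.1f}")
```

    The last few lines show the "yardstick" idea in miniature: an
    estimate computed from the sample is checked against the value
    computed from the exhaustive set.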

    Goals of the Case Studies

    Using the 470 samples in the sample data set we will address the
    following problems:

    1. The description of the important features of the data.
    2. The estimation of an average value over a large area.
    3. The estimation of an unknown value at a particular location.
    4. The estimation of an average value over small areas.
    5. The use of the available sampling to check the performance of an
       estimation methodology.
    6. The use of sample values of one variable to improve the estimation
       of another variable.
    7. The estimation of a distribution of values over a large area.
    8. The estimation of a distribution of values over small areas.
    9. The estimation of a distribution of block averages.
    10. The assessment of the uncertainty of our various estimates.


    The first question, despite being largely qualitative, is very
    important. Organization and presentation is a vital step in communicating
    the essential features of a large data set. In the first part of this book
    we will look at descriptive tools. Univariate and bivariate description
    are covered in Chapters 2 and 3. In Chapter 4 we will look at various
    ways of describing the spatial features of a data set. We will then take
    all of the descriptive tools from these first chapters and apply them
    to the Walker Lake data sets. The exhaustive data set is analyzed in
    Chapter 5 and the sample data set is examined in Chapters 6 and 7.

    The remaining questions all deal with estimation, which is the topic
    of the second part of the book. Using the information in the sample
    data set we will estimate various unknown quantities and see how well
    we have done by using the exhaustive data set to check our estimates.
    Our approach to estimation, as discussed in Chapter 8, is first to
    consider what it is we are trying to estimate and then to adopt a method
    that is suited to that particular problem. Three important
    considerations form the framework for our presentation of estimation in
    this book. First, do we want an estimate over a large area or estimates
    for specific local areas? Second, are we interested only in some average
    value or in the complete distribution of values? Third, do we want our
    estimates to refer to a volume of the same size as our sample data or
    do we prefer to have our estimates refer to a different volume?

    In Chapter 9 we will discuss why models are necessary and
    introduce the probabilistic models common to geostatistics. In Chapter
    10 we will present two methods for estimating an average value over
    a large area. We then turn to the problem of local estimation. In
    Chapter 11 we will look at some nongeostatistical methods that are
    commonly used for local estimation. This is followed in Chapter 12
    by a presentation of the geostatistical method known as ordinary point
    kriging. The adaptation of point estimation methods to handle the
    problem of local block estimates is discussed in Chapter 13.

    Following the discussion in Chapter 14 of the important issue of
    the search strategy, we will look at cross validation in Chapter 15
    and show how this procedure may be used to improve an estimation
    methodology. In Chapter 16 we will address the practical problem of
    modeling variograms, an issue that arises in geostatistical approaches
    to estimation.

    In Chapter 17 we will look at how to use related information to
    improve estimation. This is a complication that commonly arises in


    practice when one variable is undersampled. When we analyze the
    sample data set in Chapter 6, we will see that the measurements of the
    second variable, U, are missing at many sample locations. The method
    of cokriging presented in Chapter 17 allows us to incorporate the more
    abundant V sample values in the estimation of U, taking advantage
    of the relationship between the two to improve our estimation of the
    more sparsely sampled U variable.

    The estimation of a complete distribution is typically of more use
    in practice than is the estimation of a single average value. In many
    applications one is interested not in an overall average value but in
    the average value above some specified threshold. This threshold is
    often some extreme value and the estimation of the distribution above
    extreme values calls for different techniques than the estimation of the
    overall mean. In Chapter 18 we will explore the estimation of local
    and global distributions. We will present the indicator approach, one
    of several advanced techniques developed specifically for the estimation
    of local distributions.
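    To make the distinction between an overall average and an average
    above a threshold concrete, here is a small Python sketch. It is an
    illustration only: the data are synthetic lognormal values, the
    threshold of 500 ppm is arbitrary, and the function names are
    invented. The 0/1 indicator shown (1 when a value is at or below the
    threshold) is one possible convention and is an assumption here, not
    a statement of how Chapter 18 defines it.

```python
import random


def mean_above(values, threshold):
    """Average of the values strictly above the threshold (None if none)."""
    above = [v for v in values if v > threshold]
    return sum(above) / len(above) if above else None


def indicator_mean(values, threshold):
    """Mean of a 0/1 indicator that is 1 when a value is <= threshold.

    This equals the proportion of values at or below the threshold.
    """
    return sum(1 for v in values if v <= threshold) / len(values)


# Synthetic, positively skewed "concentrations" (hypothetical, in ppm).
rng = random.Random(42)
values = [rng.lognormvariate(5.0, 1.0) for _ in range(1000)]
threshold = 500.0  # arbitrary cutoff chosen for the illustration

overall_mean = sum(values) / len(values)
high_mean = mean_above(values, threshold)
fraction_below = indicator_mean(values, threshold)

print(f"overall mean:             {overall_mean:.1f}")
print(f"mean above threshold:     {high_mean:.1f}")
print(f"proportion <= threshold:  {fraction_below:.3f}")
```

    The overall mean says little about the high values: the mean above
    the cutoff depends on the upper tail of the distribution, which is
    why estimating the distribution itself becomes the goal.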

    A further complication arises if we want our estimates to refer to
    a volume different from the volume of our samples. This is commonly
    referred to as the support problem and frequently occurs in practical
    applications. For example, in a model of a petroleum reservoir one does
    not need estimated permeabilities for core-sized volumes but rather for
    much larger blocks. In a mine, one will be mining and processing
    volumes much larger than the volume of the samples that are typically
    available for a feasibility study. In Chapter 19 we will show that the
    distribution of point values is not the same as the distribution of
    average block values and present two methods for accounting for this
    discrepancy.
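    The difference between point and block distributions can be seen in
    a toy experiment. The sketch below is not one of the two methods
    referred to above; it only averages synthetic point values over
    non-overlapping square blocks and compares the two distributions.
    The grid size, block size, and the use of spatially uncorrelated
    lognormal values are all assumptions made for the illustration.

```python
import random
import statistics


def block_averages(grid, block):
    """Average a 2-D list of point values over non-overlapping
    block x block cells, scanning row by row."""
    n_rows, n_cols = len(grid), len(grid[0])
    averages = []
    for r0 in range(0, n_rows - block + 1, block):
        for c0 in range(0, n_cols - block + 1, block):
            cell = [grid[r][c]
                    for r in range(r0, r0 + block)
                    for c in range(c0, c0 + block)]
            averages.append(sum(cell) / len(cell))
    return averages


# Synthetic point values on a 60 x 60 grid (hypothetical, skewed).
rng = random.Random(7)
points = [[rng.lognormvariate(0.0, 1.0) for _ in range(60)]
          for _ in range(60)]
flat = [v for row in points for v in row]

# 10 x 10 blocks tile the grid exactly, giving 36 block averages.
blocks = block_averages(points, 10)

print("mean of points:     ", statistics.mean(flat))
print("mean of blocks:     ", statistics.mean(blocks))
print("variance of points: ", statistics.pvariance(flat))
print("variance of blocks: ", statistics.pvariance(blocks))
```

    Because the blocks tile the grid exactly, the two means agree, while
    the block averages are less variable than the point values. With
    spatially correlated data, as in real earth science data sets, the
    reduction in variance would generally be smaller than in this
    uncorrelated example.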

    In Chapter 20 we will look at the assessment of uncertainty, an issue
    that is typically muddied by a lack of a clear objective meaning for the
    various uncertainty measures that probabilistic models can provide.
    We will look at several common problems, discuss how our probabilistic
    model might provide a relevant answer, and use the exhaustive data
    set to check the performance of various methods.

    The final chapter provides a recap of the tools discussed in the
    book, recalling their strengths and their limitations. Since this book
    attempts an introduction to basic methods, many advanced methods
    have not been touched; however, the types of problems that require
    more advanced methods are discussed and further references are given.


    Before we begin exploring some basic geostatistical tools, we would
    like to emphasize that the case studies used throughout the book are
    presented for their educational value and not necessarily to provide a
    definitive case study of the Walker Lake data set. It is our hope that
    this book will enable a reader to explore new and creative combinations
    of the many available tools and to improve on the rather simple studies
    we have presented here.