SGeMS User’s Guide

Nicolas Remy, Alexandre Boucher & Jianbing Wu

April 13, 2006

Chapter 1

General Overview

Section ?? indicated how GsTL can be integrated into already existing software. SGeMS, the Geostatistical Earth Modeling Software, is an example of software built from scratch using GsTL. The source code of SGeMS serves as an example of how to use GsTL facilities.

SGeMS was designed with two aims in mind. The first one, geared toward the end-user, is to provide user-friendly software offering a large range of geostatistics tools: the most common geostatistics algorithms are implemented, in addition to more recent developments such as multiple-point statistics simulation. The user-friendliness of SGeMS comes mainly from its non-obtrusive graphical user interface and the possibility to directly visualize data sets and results in a full 3-D interactive environment. The second objective was to design software whose functionality could conveniently be augmented. New features can be added to SGeMS through a system of plug-ins, i.e. pieces of software which cannot be run by themselves but complement a main application. In SGeMS, plug-ins can be used to add new (geostatistics) tools, add new grid data structures (faulted stratigraphic grids for example) or define new import/export filters.

1.1 First Steps with SGeMS

1.1.1 A quick tour of the graphical user interface

The graphical user interface (GUI) of SGeMS is divided into 3 main parts, see Fig. 1.1:

The Algorithm Panel The user selects in this panel which geostatistics tool to use and inputs the required parameters (see Fig. 1.2). The top part of that panel shows a list of available algorithms, e.g. kriging, sequential Gaussian simulation. When an algorithm from that list is selected, a form containing the corresponding input parameters appears below the tools list.

Figure 1.1: SGeMS's graphical interface. The three main panels are highlighted in red. The top-right panel is the Algorithm Panel, the top-left is the Visualization Panel and the bottom is the Command Panel.

The Visualization Panel One or multiple objects can be displayed in this panel, e.g. a

Cartesian grid and a set of points, in an interactive 3-D environment. Visualization

options such as color-maps are also set in the Visualization Panel. The Visualization

Panel is shown in more details in Fig. 1.3.

The Command Panel This panel is not shown by default when SGeMS is started. It gives the possibility to control the software from a command line rather than from the GUI. It displays a history of all commands executed so far and provides an input field where new commands can be typed (see Fig. 1.4). See Tutorial 1.1.2 for more details about the Command Panel.

Figure 1.2: The 3 parts of the Algorithm Panel highlighted in red. The top part displays the list of available tools. The middle part is where the input parameters for the selected tool are entered.

1.1.2 A simple tutorial

This short tutorial describes a SGeMS session in which ordinary kriging is performed on a 100 × 130 × 30 Cartesian grid. The data consist of a set of 400 points in 3-D space, with a rock porosity value associated with each data point. This point-set object is called porosity data.

The steps involved are the following:

1. Load the data set

2. Create a Cartesian grid

3. Select the kriging tool and enter the necessary parameters

4. Display and save the result

Figure 1.3: The Visualization Panel. The left-hand side (highlighted in red) controls which objects (e.g. grids) are visible. It is also used to set display options, such as which color-map to use.

Figure 1.4: The Command Panel

Loading the data set

The first step is to load the data set into the objects database of SGeMS. Click Objects | Load Object and select the file containing the data. Refer to section 1.2.1 for a description of the available data file formats. When the object is loaded, a new entry called porosity data appears in the Objects section of the Visualization Panel, as shown in Fig. 1.5.

Figure 1.5: Object list after the data set is loaded

Click in the square before the point-set name to display it. Displayed objects have a

little eye painted inside the rectangle before their name. The plus sign before the square

indicates that the object contains properties. Click on the plus sign to display the list

of properties. Click in the square before the property name to paint the object with the

corresponding property (see Fig. 1.6).

Figure 1.6: Showing/hiding an object or a property

Creating a grid

The next step is to create the grid on which kriging will be performed. The grid we will create is a 3-D Cartesian grid with 100 × 130 × 30 cells.

Click Objects | New Cartesian Grid to open the grid creation dialog. Enter the dimensions of the grid, the coordinates of the origin of the grid, i.e. the lower left corner (the grid node with smallest x, y, z coordinates), and the dimensions of each grid cell. Provide a name for the new grid, working_grid for example. Click Create Grid to create the grid. A new entry called working_grid appears in the Objects panel of the Visualization Panel, as shown in Fig. 1.7.

Figure 1.7: Object list after the Cartesian grid is created

The object database now contains two objects: a point-set with the rock porosity property, and a Cartesian grid with no property attached to it yet. We can proceed to the kriging run.

Running the kriging algorithm

Select the kriging tool from the list in the Algorithm Panel. A form prompting for the kriging parameters appears below the algorithms list. You can either type in the parameters or load them from a file. Fig. 1.8 shows an example of a parameter file for kriging (refer to section 1.2.2 for a description of the parameter file format and to section 5.0.2 for details on kriging parameters). Using the parameters of Fig. 1.8, ordinary kriging is performed with an isotropic search ellipsoid of radius 50 (section ?? describes how to specify a 3-D ellipsoid in SGeMS) and an isotropic spherical variogram of range 30, sill 1, and a nugget effect of 0.1 (section ?? explains how to specify a variogram).

Once all parameters have been entered, click the Run Algorithm button. If some parameters are not correctly set, they are highlighted in red, and a description of the error will appear if the mouse is left a few seconds on the offending parameter.

<parameters> <algorithm name="kriging" />

<Grid_Name value="working_grid" />

<Property_Name value="krig_porosity" />

<Hard_Data grid="porosity data" property="porosity" />

<Kriging_Type type="Ordinary Kriging (OK)" >

<parameters />

</Kriging_Type>

<Max_Conditioning_Data value="12" />

<Search_Ellipsoid value="50 50 50

0 0 0 " />

<Variogram nugget="0.1" structures_count="1" >

<structure_1 contribution="0.9" type="Spherical" >

<ranges max="30" medium="30" min="30" />

<angles x="0" y="0" z="0" />

</structure_1>

</Variogram>

</parameters>

Figure 1.8: Kriging parameter file

If kriging was run with the parameters shown in Fig. 1.8, the grid named working_grid now contains a new property called krig_porosity.

Displaying and saving the result

The algorithm Kriging created a new property krig_porosity in the grid working_grid. Click on the plus sign before the working_grid entry in the objects list to show the list of properties attached to the grid, and click in the square before the newly created property to display it. To save the results, click Object | Save Object to open the Save Object dialog. Provide a file name, the name of the object to save, e.g. working_grid, and the file format to use (see section 1.2.1).

It is also possible to save all the objects at once by saving the project (File | Save Project).

1.1.3 Automating tasks in SGeMS

Tutorial 1.1.2 showed how to perform a single run of the kriging algorithm. Next, one would like to study the sensitivity of the algorithm to the parameter Max Conditioning Data, the maximum number of conditioning data retained for each kriging. The user would like to vary that number from 1 to 50 in increments of 1. It would be very impractical to perform such a study in successive sequences as explained in Tutorial 1.1.2.

SGeMS provides a solution to this problem through its command line interface. Many

actions in SGeMS can either be performed with mouse clicks or by typing a command in

the Command Panel. For example, loading the data set in step 1 of Tutorial 1.1.2 could

have been achieved by typing the following command:

LoadObjectFromFile /home/user/data_file.dat:All

Each command has the following format:

• the name of the command, e.g. LoadObjectFromFile

• a list of parameters, separated by a colon ":". In the previous example two parameters were supplied: the name of the file to load, and the file format All (meaning that every available file format should be considered).

Every command performed in SGeMS, either typed or resulting from mouse clicks, is recorded both to the "Commands History" section of the Command Panel and to a file called gstlappli_history.log. Hence if one does not know a command name, one can use the GUI to perform the corresponding action and check the command name and syntax in the command history.

It is possible to combine several commands into a single file and have SGeMS execute them all sequentially. For the sensitivity study example, one would have to write a "script" file containing 50 kriging commands, each time changing the Max Conditioning Data parameter. Note that there are no control structures, e.g. "for" loops, in SGeMS script files. However, many programming tools (Perl, Awk, shell scripts, ...) can be used to generate such a script file.
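As a sketch of that approach, the following Python snippet builds such a script file, varying Max Conditioning Data from 1 to 50. The RunGeostatAlgorithm command name and its argument layout are assumptions for illustration; check the "Commands History" panel of your installation for the exact syntax.

```python
# Generate a SGeMS script running kriging 50 times with increasing
# Max_Conditioning_Data. The command syntax below is an assumption;
# verify it against the Commands History panel.

def make_sensitivity_script(param_template, n_runs=50):
    """Return the text of a SGeMS script file, one command per line."""
    lines = []
    for n in range(1, n_runs + 1):
        # Substitute the current Max_Conditioning_Data value and a distinct
        # output property name into the XML parameter string.
        params = param_template.format(max_cond=n, prop=f"krig_porosity_{n}")
        lines.append(f"RunGeostatAlgorithm  kriging::/GeostatParamUtils/XML::{params}")
    return "\n".join(lines)

TEMPLATE = ('<parameters> <algorithm name="kriging" /> '
            '<Grid_Name value="working_grid" /> '
            '<Property_Name value="{prop}" /> '
            '<Max_Conditioning_Data value="{max_cond}" /> '
            '</parameters>')

script = make_sensitivity_script(TEMPLATE)
# write the result to disk, then execute it from SGeMS
```

The generated text can then be saved to a file and executed from SGeMS as a script.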

1.2 File Formats

1.2.1 Objects Files

By default, SGeMS supports two file formats to describe grids and sets of points: the GSLIB format and the SGeMS binary format.

The GSLIB file format

It is a simple ASCII format used by the GSLIB software (ref ??). It is organized by lines:

• the first line gives the title.

• the second line is a single number n indicating the number of properties in the object.

• the n following lines contain the names of each property (one property name per line).

• each remaining line contains the values of each property (n values per line), separated by spaces or tabulations. The order of the property values along each line is given by the order in which the property names were entered.

Note that the same format is used for describing both point sets and Cartesian grids. When a GSLIB file is loaded into SGeMS, the user has to supply all the information that is not provided in the file itself, e.g. the name of the object, or the number of cells in each direction if it is a Cartesian grid.
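As an illustration of the layout described above, this standalone Python sketch (not part of SGeMS) parses a small GSLIB file into per-property columns:

```python
def read_gslib(text):
    """Parse GSLIB-format ASCII text into (title, {property: values})."""
    lines = text.strip().splitlines()
    title = lines[0]                             # first line: title
    n_props = int(lines[1])                      # second line: property count
    names = [lines[2 + i].strip() for i in range(n_props)]
    columns = {name: [] for name in names}
    for row in lines[2 + n_props:]:              # remaining lines: data rows
        values = row.split()                     # spaces or tabs
        for name, v in zip(names, values):
            columns[name].append(float(v))
    return title, columns

# A tiny two-property example in GSLIB layout:
sample = """my data
2
porosity
permeability
0.21 120.5
0.18 80.3
"""
title, cols = read_gslib(sample)
```

Here `cols["porosity"]` holds the first column of values and `cols["permeability"]` the second, in file order.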

The SGeMS binary file format

SGeMS can use an uncompressed binary file format to store objects. Binary formats have two main advantages over ASCII files: they occupy less disk space, and they can be loaded and saved faster. The drawback is a lack of portability between platforms. The SGeMS binary format is self-contained, hence the user need not provide any additional information when loading such a file.

1.2.2 Parameter Files

When an algorithm is selected from the Algorithm Panel (see step 3 of Tutorial 1.1.2), several parameters are called for. It is possible to save those parameters to a file and later retrieve them.

The format of a parameter file in SGeMS is based on the Extensible Markup Language (XML), a standard formatting language of the World Wide Web Consortium (www.w3.org). Fig. 1.8 shows an example of such a parameter file.

In a parameter file, each parameter is represented by an XML element. An element consists of an opening and a closing tag, e.g. <tag> and </tag>, and one or several attributes. Following is an example of an element called algorithm which contains a single attribute "name":

<algorithm name="kriging"> </algorithm>

Elements can themselves contain other elements:

<Variogram nugget="0.1" structures_count="1" >

<structure_1 contribution="0.9" type="Spherical" >

<ranges max="30" medium="30" min="30"> </ranges>

<angles x="0" y="0" z="0"> </angles>

</structure_1>

</Variogram>

Here the element Variogram contains an element structure_1, which itself contains two elements: ranges and angles. Each of these elements has attributes. Note that if an element only contains attributes, the closing tag can be abbreviated: in the previous example, element ranges only contains attributes and could have been written:

<ranges max="30" medium="30" min="30" />

The /> sequence indicates the end of the element.

A SGeMS parameter file always has the two following elements:

• element parameters. It is the root element: it contains all other elements.

• element algorithm. It has a single attribute name which gives the name of the algorithm for which parameters are specified.
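Because parameter files are standard XML, they can also be inspected or batch-edited outside SGeMS with any XML library. A minimal Python sketch, using a fragment of the kriging file of Fig. 1.8 as input:

```python
import xml.etree.ElementTree as ET

# A fragment of the kriging parameter file of Fig. 1.8.
doc = """<parameters> <algorithm name="kriging" />
  <Grid_Name value="working_grid" />
  <Variogram nugget="0.1" structures_count="1">
    <structure_1 contribution="0.9" type="Spherical">
      <ranges max="30" medium="30" min="30" />
    </structure_1>
  </Variogram>
</parameters>"""

root = ET.fromstring(doc)                            # root element: parameters
algo = root.find("algorithm").get("name")            # attribute of a child element
nugget = float(root.find("Variogram").get("nugget")) # numeric attribute
range_max = root.find(".//ranges").get("max")        # nested element lookup
```

Elements are reached by name regardless of their order in the file, which matches the "elements can be entered in any order" property noted in the next section.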

All other elements are algorithm-dependent and are described in sections 5 and ??. Such an XML-formatted parameter file has several advantages:

• Elements can be entered in any order.

• Comments can be inserted anywhere in the file. A comment block starts with <!-- and ends with -->. Comments can span multiple lines. For example:

<!-- An example of a comment block spanning

multiple lines -->

<parameters> <algorithm name="kriging" />

<!-- the name of the working grid -->

<Grid_Name value="working_grid" />

</parameters>

Chapter 2

Data Sets & SGeMS EDA Tools

This chapter presents the data sets which will be used to demonstrate the geostatistics algorithms in the following chapters. It also provides an introduction to the exploratory data analysis (EDA) tools of the SGeMS software.

Section 2.1 presents the two data sets: one in 2D and one in 3D. The smaller 2D data set will be used for most of the examples of running geostatistics algorithms. The 3D data set, which mimics a large deltaic channel reservoir, will be used to demonstrate the practice of these algorithms on real and large 3D applications.

Section 2.2 introduces some basic EDA tools, such as the histogram, Q-Q (quantile to quantile) plot, P-P (probability to probability) plot and scatter plot. These EDA tools are very useful for checking and comparing, both visually and statistically, any given data sets.

2.1 The Data Sets

The application example given for each algorithm should be short yet informative, allowing the user to check his understanding of the algorithm and its input-output parameters. The results of any application based on a specific data set should NOT be extended into a general conclusion. Although the SGeMS software was initially designed for reservoir modeling, the application examples should stress the generality of the algorithms and their software implementation.

2.1.1 The 2D data

To be done ...

2.1.2 The 3D data

The 3D data set retained in this book is extracted from a layer of Stanford VI, a synthetic fluvial channel reservoir (Castro et al., 2005). The corresponding SGeMS project is located at 'dataset/stanford6.prj'. This project contains three SGeMS objects: well, grid and container.

• The well object contains the well data set. There are a total of 26 wells (21 vertical wells, 4 deviated wells and 1 horizontal well). The properties associated with the wells are density, binary facies (sand channel or mud floodplain), P-wave impedance, P-wave velocity, permeability and porosity. These data can be used as hard or soft conditioning data for the geostatistics algorithms. Fig. 2.1 shows the well locations and the porosity distribution along the wells.

• The grid object is a Cartesian reservoir grid (see its rectangular boundary in Fig. 2.1), with:

- grid size: 150 × 200 × 80,

- origin point at (0,0,0),

- unit cell size in each x/y/z direction.

This reservoir grid holds the petrophysical properties; currently it contains:

1. Seismic data. The original Stanford VI contains a cube of P-wave seismic impedance, which is the product of density and P-wave velocity. However, for most geostatistics algorithms such seismic data cannot be used directly as is; it must be calibrated into, e.g., facies probability distributions. Here, two sand probability cubes (properties P(sand|seis) and P(sand|seis)2) are provided: one displaying sharp channel boundaries (best quality data, see Fig. 2.2a); and a second displaying fuzzier channel boundaries with noise (poor quality data, see Fig. 2.2b). These probability data will be used as soft data to constrain the facies modeling.

2. Region code. Typically a large reservoir would be divided into different regions, with each individual region having its own characteristics, for instance different channel orientations or channel thicknesses. The regions associated with the Stanford VI reservoir are rotation regions (property angle) corresponding to different channel orientations (Fig. 2.3), and affinity regions (property affinity) corresponding to different channel thicknesses (Fig. 2.4). Each rotation

region is labeled with an indicator number and assigned an angle value, see Table 2.1. The affinity indicators and the attached affinity values are given in Table 2.2. An affinity value must be assigned for each x/y/z direction; the smaller the affinity value, the thicker the channel in that direction.

Angle category           0    1    2    3   4   5   6   7   8   9
Angle value (degrees)  -63  -49  -35  -21  -7   7  21  35  49  63

Table 2.1: Rotation region indicators

Affinity category         0                1          2
Affinity value ([x,y,z])  [0.5, 0.5, 0.5]  [1, 1, 1]  [2, 2, 2]

Table 2.2: Affinity region indicators

This grid object is used to demonstrate the application examples.

• The container object is composed of all the reservoir nodes which are located inside the channels; hence it is a point set with (x,y,z) coordinates. The user can perform geostatistics on this channel container, for example to obtain the within-channel petrophysical properties. In Fig. 2.5 the channel container is represented by all nodes with value 1.

Although this 3D data set is taken from a reservoir model, it could represent any 3D

spatially-distributed attribute and applications other than reservoir modeling. For exam-

ple, one can interpret each layer of the seismic data cube as satellite measurements defined

over the same area but recorded at different times. The application could then be modeling

landscape change in both space and time.

2.2 The SGeMS EDA Tools

The SGeMS software provides some useful exploratory data analysis (EDA) tools, such as the histogram, quantile-quantile (Q-Q) plot, probability-probability (P-P) plot, scatter plot, and variogram and cross-variogram calculation and fitting. In this chapter, the first four elementary tools are presented; the latter two tools are presented in the next chapter.

Figure 2.1: Well locations and the porosity distribution along the wells.

(a) Good quality probability. (b) Poor quality probability.

Figure 2.2: Sand probability cube calibrated from seismic impedance.

Figure 2.3: Angle indicator cube. Figure 2.4: Affinity indicator cube.

Figure 2.5: Channel container (nodes with value 1).

All the EDA tools can be invoked through the Data Analysis menu of the main SGeMS graphical interface. Once a specific tool is selected, the corresponding SGeMS window pops up. The EDA tool window is independent of the main SGeMS interface, and the user can have multiple windows open for each EDA tool.

2.2.1 Common Parameters

The main interface for any of the EDA tools presented in this chapter has 3 main panels

(see Fig. 2.6, Fig. 2.7 and Fig. 2.8):

A. Parameter Panel The user selects in this panel the properties to be analyzed and

the display options. This panel has two pages: ‘Data’ and ‘Display Options’, the

latter being common to all EDA tools;

B. Visualization Panel This panel shows the graphic result of the selected statistics;

C. Statistics Panel This panel displays some relevant statistical summaries.

In the lower part of the main interface there are two buttons: Save as Image and Close. The Save as Image button is used to save a graphical visualization (for example a histogram) into an image file in either 'png' or 'bmp' format. The user can also write the statistical summaries and/or paint the grid into the saved file by selecting the corresponding options in the figure saving dialog box. The Close button is used to close the current EDA tool interface.

Parameters Description

The parameters of the 'Display Options' page are listed as follows:

• X Axis: The X axis for variable 1. Only the property values between 'Min' and 'Max' are displayed in the plot; the values less than 'Min' or greater than 'Max' still contribute to the statistical summaries. The default values of 'Min' and 'Max' are the minimum and maximum of the selected Property. The X Axis can be set to a logarithmic scale by marking the corresponding check box.

• Y Axis: The Y axis for variable 2. The previous remarks made for the X Axis apply.

The user can modify the parameters through either the keyboard or the mouse. Any modification through the mouse is instantly reflected in the visualization or the statistical summaries, while a change through the keyboard must be confirmed by pressing the 'Enter' key.

2.2.2 Histogram

The histogram tool creates a visual output of the frequency distribution and displays some statistical summaries, such as the mean and variance of the selected variable. The histogram tool is activated by clicking Data Analysis | Histogram. Although the program will automatically scale the histogram, the user can set the histogram limits in the Parameter Panel. The main histogram interface is given in Fig. 2.6, and the parameters of the 'Data' page are listed below.

Parameters Description

• Object: A Cartesian grid or a point set containing variables.

• Property: The values of a variable listed in the Object above.

• bins: The number of classes. The user can change this number through the keyboard or by clicking the scroll bar. Any value change is instantly reflected in the histogram display.

• Clipping Values: Statistical calculation settings. All values less than 'Min' and greater than 'Max' are ignored, and any change of 'Min' and 'Max' will affect the statistics calculation. The default values of 'Min' and 'Max' are the minimum and maximum of the selected Property. After modifying 'Min' and/or 'Max', the user can go back to the default setting by clicking 'Reset'.

2.2.3 Q-Q plot and P-P plot

The Q-Q plot compares the equal-p-value quantiles of two variables; the P-P plot compares the cumulative probability distributions of two variables. The two variables need not be in the same Object or have the same number of data. The Q-Q plot and P-P plot are combined into one program, which can be invoked from Data Analysis | QQ-plot. This EDA tool generates both a graph in the Visualization Panel and some statistical summaries (mean and variance for each variable) in the Statistics Panel, see Fig. 2.7. The parameters in the 'Data' page are listed below.

Parameters Description

• Analysis Type: Algorithm selection. The user can choose either a Q-Q plot or a P-P plot.

• Variable 1: The variable selection for the X axis. The user must first choose an Object, then the Property name.

• Clipping Values for Variable 1: All values strictly less than 'Min' and strictly greater than 'Max' are ignored; any change of 'Min' and 'Max' will affect the statistics calculation. The user can go back to the default setting by clicking 'Reset'.

• Variable 2: The variable selection for the Y axis. The user must first choose an Object, then the Property name. Note that Variable 2 and Variable 1 might be from different objects.

• Clipping Values for Variable 2: Remarks similar to those for Clipping Values for Variable 1.
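To make the Q-Q computation concrete, the following standalone Python sketch extracts equal-p-value quantile pairs from two samples of different sizes. A simple nearest-rank quantile is assumed here, which may differ from the interpolation rule SGeMS uses internally.

```python
def qq_points(x, y, n_quantiles=5):
    """Return (quantile of x, quantile of y) pairs at equal p-values."""
    def quantile(sorted_vals, p):
        # nearest-rank quantile: index proportional to p, clamped to the range
        idx = min(int(p * len(sorted_vals)), len(sorted_vals) - 1)
        return sorted_vals[idx]

    xs, ys = sorted(x), sorted(y)
    # mid-point p-values, so no quantile falls exactly at p = 0 or p = 1
    ps = [(i + 0.5) / n_quantiles for i in range(n_quantiles)]
    return [(quantile(xs, p), quantile(ys, p)) for p in ps]

# Samples of different sizes are allowed, as in the SGeMS tool:
pts = qq_points([1, 2, 3, 4], [10, 20, 30, 40, 50, 60], n_quantiles=2)
```

Plotting the resulting pairs against the 45-degree line (after standardization) is what the Visualization Panel shows.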

2.2.4 Scatter Plot

The scatter plot tool (executed by clicking Data Analysis | Scatter-plot) is used to compare a pair of variables by displaying their bivariate scatter plot and some statistics. All available data pairs are used to compute the summary statistics, such as the correlation coefficient and the mean and variance of each variable (see part [C] in Fig. 2.8). To avoid a crowded figure in the Visualization Panel, at most 10,000 data pairs are shown in the scatter plot. The parameters in the 'Data' page are listed below.

Parameters Description

• Object: A Cartesian grid or a point set containing variables. This Object must contain at least two properties.

• Variable 1: A property of the Object above. This variable is associated with the X axis.

• Clipping Values for Variable 1: All values strictly less than 'Min' and strictly greater than 'Max' are ignored, hence any change of 'Min' and 'Max' will affect the statistics calculation. The user can go back to the default setting by clicking 'Reset'. If Variable 1 has more than 10,000 data, the 'Reset' button can be used to generate a new scatter plot with a new set of up to 10,000 data pairs.

• Variable 2: A property of the same Object. This variable is associated with the Y axis. Variable 2 must have the same number of data as Variable 1.

• Clipping Values for Variable 2: Remarks similar to those for Variable 1.

• Options: The choice of visualizing the least-squares fit line in the scatter plot. The slope and the intercept are given below the check box 'Show Least Square Fit'.
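The summary statistics of the Statistics Panel can be sketched as follows: a standalone Python snippet computing the correlation coefficient and the least-squares line (slope and intercept) for a pair of variables.

```python
def scatter_stats(x, y):
    """Correlation coefficient and least-squares fit line for paired data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # centered sums of squares and cross-products
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # least-squares slope
    intercept = my - slope * mx           # line passes through the means
    corr = sxy / (sxx * syy) ** 0.5       # Pearson correlation
    return corr, slope, intercept

# Perfectly linear data y = 2x + 1 gives correlation 1:
corr, slope, intercept = scatter_stats([0, 1, 2, 3], [1, 3, 5, 7])
```

The slope and intercept returned here correspond to the values shown below the 'Show Least Square Fit' check box.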

Figure 2.6: Histogram interface. [A]: parameter panel; [B]: visualization panel; [C]: statistics panel.

Figure 2.7: Q-Q plot interface. [A]: parameter panel; [B]: visualization panel; [C]: statistics panel.

Figure 2.8: Scatter plot interface. [A]: parameter panel; [B]: visualization panel; [C]: statistics panel.

Chapter 3

Utilities

3.1 Non-parametric distribution

In SGeMS a non-parametric cumulative distribution function (cdf) F(z) is inferred from a set of values z1 ≤ ... ≤ zL, which can be read either from a file or from a property. F(z) is built such that F(z1) = 1/L and F(zL) = 1 − 1/L. This requires modeling the tails of the distribution F(z) if the values z1 and zL are not the minimum and maximum values.

The lower tail extrapolation function provides the shape of the distribution between the minimum zmin and the first value z1. The options for the lower tail are:

• Z is bounded: F(zmin) = 0. The lower tail of F is then modeled with a power model:

    (F(z1) − F(z)) / F(z1) = ((z1 − z) / (z1 − zmin))^ω,   ∀z ∈ ]zmin, z1[        (3.1)

The parameter ω controls the decrease of the function, with the constraint ω ≥ 1. The greater ω, the less likely are low values close to zmin. For ω = 1, all values between zmin and z1 are equally likely.

• Z is not bounded: the lower tail is modeled with an exponential function:

    F(z) = F(z1) exp(−(z − z1)²),   ∀z < z1        (3.2)

The options for the upper tail extrapolation function are similar, but applied to the interval [zL, zmax].

• Z is bounded: F(zmax) = 1. The upper tail of F is then modeled with a power model:

    (F(z) − F(zL)) / (1 − F(zL)) = ((z − zL) / (zmax − zL))^ω,   ∀z ∈ ]zL, zmax[        (3.3)

The parameter ω controls the decrease of the function, with the constraint ω ∈ [0, 1]. The lower the ω value, the less likely are extreme values close to zmax. For ω = 1, all values between zL and zmax are equally likely.

• Z is not bounded: the upper tail is modeled by a hyperbolic model:

    (1 − F(z)) / (1 − F(zL)) = (zL / z)^ω,   ∀z > zL        (3.4)

All L − 1 intermediary intervals [zi, zi+1], for i = 1, ..., L − 1, are interpolated linearly, corresponding to a power model with ω = 1.

Note: When the zmin and zmax values are set to z1 and zL, the functional shape of the tail extrapolation function becomes irrelevant.
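As a sketch of the bounded lower-tail option, the following Python helper evaluates F(z) from Eq. 3.1; it is an illustration of the formula, not SGeMS code.

```python
def lower_tail_cdf(z, z_min, z1, F_z1, omega):
    """Power-model lower tail of Eq. 3.1, rearranged for F(z):
    F(z) = F(z1) * (1 - ((z1 - z)/(z1 - z_min))**omega), for z in [z_min, z1].
    """
    assert omega >= 1 and z_min <= z <= z1   # constraints stated in the text
    return F_z1 * (1.0 - ((z1 - z) / (z1 - z_min)) ** omega)

# omega = 1: F is linear between z_min and z1 (all values equally likely)
f_mid = lower_tail_cdf(0.5, 0.0, 1.0, 0.1, omega=1.0)
# At z = z_min the bounded model gives F(z_min) = 0 for any omega
f_min = lower_tail_cdf(0.0, 0.0, 1.0, 0.1, omega=2.0)
```

With F(z1) = 0.1 and ω = 1, the halfway point gives F = 0.05, i.e. half of F(z1), matching the equally-likely interpretation of ω = 1.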

Parameters description

1. Reference Distribution: Read the data either from the file [ref on file] or from the grid [ref on grid].

2. File with reference distribution [filename]: File containing the reference distribution in one column, without header. Required if [ref on file] is selected.

3. Property with reference distribution: The grid [grid] and property [property] containing the values for the non-parametric distribution.

4. Lower Tail Extrapolation: Parametrization of the lower tail. The type of extrapolation function is selected with [LTI function]. If the power model is selected, the minimum value [LTI min] and the parameter ω [LTI omega] must be specified. Note that the minimum [LTI min] must be less than or equal to the minimum datum value entered in the reference distribution, and the power ω [LTI omega] must be greater than or equal to 1. The exponential model does not require any parameter.

5. Upper Tail Extrapolation: Parametrization of the upper tail. The type of extrapolation function is selected with [UTI function]. If the power model is selected,

the maximum value [UTI max] and the parameter ω [UTI omega] must be specified. Note that the maximum [UTI max] must be greater than or equal to the maximum datum value entered in the reference distribution, and the power ω [UTI omega] must be less than or equal to 1. The hyperbolic model only requires the parameter ω [UTI omega]; the upper tail is unbounded.

Histogram Transformation

The algorithm TRANS allows transforming any histogram into any other one. For example, the Gaussian simulation algorithms (SGSIM and COSGSIM), described in Sections 6.1.1 and 6.1.2, assume Gaussian variables. If the attribute of interest is not Gaussian, it is possible to transform the marginal distribution of that attribute into a Gaussian distribution, then work on the transformed variable.

The algorithm TRANS transforms a property following a source distribution into a new variable following a target distribution. The transformation of a variable Z with source cdf F_Z into a variable Y with target cdf F_Y is written:

    Y = F_Y^{-1}(F_Z(Z))        (3.5)

Note: The transformation of a source distribution into a Gaussian distribution does not ensure that Y is multivariate Gaussian, only that its marginal distribution is. One should check that the multivariate (or at least bi-variate) Gaussian hypothesis holds for Y before performing Gaussian simulation. If the hypothesis is not appropriate, other algorithms that do not require Gaussianity, e.g. sequential indicator simulation (SISIM), should be considered.

Histogram transformation with conditioning data

It is possible to apply a weighting factor to control how much specific values are transformed:

    y = z − ω (z − F_Y^{-1}(F_Z(z)))

When ω = 0, y = z: there is no transformation.

When ω = 1, y = F_Y^{-1}(F_Z(z)), which is the standard rank transform. The weight ω can be set equal to the standardized kriging variance. At data locations the kriging variance is zero, hence there is no transform and the datum value is unchanged. Away from data locations the kriging variance increases, allowing for a larger transform. That option is to be used for slight adjustments of the marginal distribution. When weights are used, the transform is not rank preserving anymore and will only approximately match the target distribution.

The histogram transformation algorithm is:

Algorithm 3.1 Histogram transformation
1: for each value z_k to be rank-transformed do
2:   Get the quantile from the source histogram associated with z_k: p_k = F_Z(z_k)
3:   Get the value y_k from the target distribution associated with p_k: y_k = F_Y^{-1}(p_k)
4:   if weighted transformation then
5:     Apply the weighting to the transform
6:   end if
7: end for
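As an illustration, Algorithm 3.1 can be sketched with empirical (non-parametric) cdfs defined by data samples; the helper `trans` below is hypothetical code, not the SGeMS TRANS implementation:

```python
import numpy as np

def trans(z, source, target, weights=None):
    """Rank-transform values z from a source to a target distribution
    (sketch of Algorithm 3.1; source and target are samples defining
    non-parametric cdfs F_Z and F_Y)."""
    z = np.asarray(z, float)
    source = np.sort(np.asarray(source, float))
    target = np.sort(np.asarray(target, float))
    # step 2: p_k = F_Z(z_k), the quantile of each value in the source
    p = np.searchsorted(source, z, side="right") / len(source)
    # step 3: y_k = F_Y^{-1}(p_k), the matching quantile of the target
    y = np.quantile(target, np.clip(p, 0.0, 1.0))
    if weights is None:
        return y
    # steps 4-5: weighted transform y = z - omega*(z - F_Y^{-1}(F_Z(z)))
    return z - weights * (z - y)

z = np.array([1.0, 2.0, 3.0])
full = trans(z, source=z, target=z + 10.0)                       # full rank transform
none = trans(z, source=z, target=z + 10.0, weights=np.zeros(3))  # omega = 0: y = z
```

With ω = 0 the data are returned unchanged; with no weights the values are mapped monotonically onto the target histogram.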

Parameters description

The TRANS algorithm is activated from Utilities | trans in the algorithm panel. The TRANS interface contains 3 pages: Data, Source and Target (see Figure 3.2). The text inside [ ] is the corresponding keyword in the TRANS parameter file.

1. Object Name [grid]: Selection of the grid containing the properties to be transformed

2. Properties [props]: Properties to be transformed

3. Suffix for output [out suffix]: The name for each output property consists of the original name plus the suffix entered here

4. Local Conditioning [is cond]: Enables the use of weights for the histogram transformation

5. Weight Property [cond prop]: Property with weights for transformation of the histogram conditional to data, see Eq. 3.1. The standardized kriging variance is a good weighting option. Only required if Local Conditioning [is cond] is selected


6. Control Parameter [weight factor]: Value between 0 and 1 adjusting the weights. Only required if Local Conditioning [is cond] is selected

7. Source histogram [ref type source]: Defines the type of the source histogram. That histogram may be either non-parametric, Gaussian, log-normal, or uniform. Each type has its own interface for parameters

8. Gaussian parameters: The mean is given in Mean [G mean source] and the variance in Variance [G variance source]

9. LogNormal parameters: The mean is given in Mean [LN mean source] and the variance in Variance [LN variance source]

10. Uniform parameters: The minimum value is given in Min [Unif min source] and the maximum in Max [Unif max source]

11. Non Parametric: The non-parametric distribution is entered in [nonParamCdf source], see Section 3.1

12. Target histogram [ref type target]: Defines the type of the target histogram. The histogram may be either non-parametric, Gaussian, log-normal, or uniform. Each type has its own interface for parameters

13. Gaussian parameters: The mean is given in Mean [G mean target] and the variance in Variance [G variance target]

14. LogNormal parameters: The mean is given in Mean [LN mean target] and the variance in Variance [LN variance target]

15. Uniform parameters: The minimum value is given in Min [Unif min target] and the maximum in Max [Unif max target]

16. Non Parametric: The non-parametric distribution is entered in [nonParamCdf target], see Section 3.1


Figure 3.1: Widget for non parametric distribution

(a) Data tab (b) Source tab (c) Target tab

Figure 3.2: User interface for TRANS.


Chapter 4

Variograms and variography

Variogram models are used by many geostatistical algorithms; SGeMS allows four basic models and any positive linear combination of them, that is, the linear model of regionalization. Linear models of coregionalization are also possible, but the software does not check the permissibility of the model: that is the user's responsibility. The four basic models are: the nugget effect model, the spherical model, the exponential model and the Gaussian model.

nugget effect model

γ(h) = 0 if ‖h‖ = 0, 1 otherwise    (4.1)

A pure nugget effect model for a variable Z(u) expresses a lack of (linear) dependence between variables Z(u) and Z(u + h).

spherical model with actual range a

γ(h) = (3/2)(‖h‖/a) − (1/2)(‖h‖/a)^3 if ‖h‖ ≤ a, 1 otherwise    (4.2)

exponential model with practical range a

γ(h) = 1 − exp(−3‖h‖/a)    (4.3)

Gaussian model with practical range a

γ(h) = 1 − exp(−3‖h‖²/a²)    (4.4)
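As a numerical illustration, the four basic structures of Eqs. (4.1) to (4.4) can be coded directly; this is an unofficial sketch (isotropic, unit sill, h = ‖h‖), not SGeMS code:

```python
import numpy as np

def nugget(h):
    """Nugget effect model (Eq. 4.1): 0 at the origin, 1 elsewhere."""
    return np.where(np.asarray(h) == 0, 0.0, 1.0)

def spherical(h, a):
    """Spherical model (Eq. 4.2) with actual range a."""
    r = np.minimum(np.asarray(h, float) / a, 1.0)  # clamp: gamma = 1 beyond the range
    return 1.5 * r - 0.5 * r**3

def exponential(h, a):
    """Exponential model (Eq. 4.3) with practical range a."""
    return 1.0 - np.exp(-3.0 * np.asarray(h, float) / a)

def gaussian(h, a):
    """Gaussian model (Eq. 4.4) with practical range a."""
    return 1.0 - np.exp(-3.0 * np.asarray(h, float) ** 2 / a**2)
```

Note how the spherical model reaches its sill exactly at h = a, while the exponential and Gaussian models reach about 95% of the sill at their practical range.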


Notes: Gaussian models are not permissible for binary indicator variables. The term variogram is used to designate what is, strictly speaking, a semi-variogram γ(h); it is also understood as a cross semi-variogram depending on the context.

All the above models are permissible in 3-D and have a covariance counterpart:

C(h) = C(0) − γ(h)

In SGeMS a variogram model

γ(h) = c_0 γ^{(0)}(h) + Σ_{l=1}^{L} c_l γ^{(l)}(h)

is characterized by the following parameters:

• a nugget effect c_0 γ^{(0)} with nugget constant c_0 ≥ 0

• the number L of nested structures. Each structure γ^{(l)}(h) is then defined by:

– a variance contribution c_l ≥ 0

– the type of the variogram: spherical, exponential or Gaussian

– six parameters: the three practical ranges along the main directions of the anisotropy ellipsoid and the three corresponding rotation angles defining that ellipsoid. Note that each nested structure can have a different anisotropy.

Example:

Consider the variogram model γ(h) = 0.3 γ^{(0)}(h) + 0.4 γ^{(1)}(h) + 0.3 γ^{(2)}(h), with:

• γ^{(0)}(h) a nugget effect with sill 0.3

• γ^{(1)}(h) an anisotropic spherical variogram with major range 40, medium range 20 and minor range 5, and angles α = 45°, β = 0, θ = 0

• γ^{(2)}(h) an isotropic Gaussian variogram of range 200


<[Parameter_name] nugget="0.3" structures_count="2" >

<structure_1 contribution="0.4" type="Spherical" >

<ranges max="40" medium="20" min="5" />

<angles x="45" y="0" z="0" />

</structure_1>

<structure_2 contribution="0.3" type="Gaussian" >

<ranges max="200" medium="200" min="200" />

<angles x="0" y="0" z="0" />

</structure_2>

</[Parameter_name]>

where [Parameter_name] is the name of the parameter.
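For illustration, the nested model of this example can be evaluated by summing its weighted structures. The sketch below (hypothetical helper names, not SGeMS code) evaluates γ(h) along the major anisotropy axis only, ignoring the rotation angles:

```python
import numpy as np

def sph(r):
    """Spherical basic structure, unit sill; r = h / range, clamped at 1."""
    r = np.minimum(np.asarray(r, float), 1.0)
    return 1.5 * r - 0.5 * r**3

def gau(r):
    """Gaussian basic structure, unit sill; r = h / practical range."""
    return 1.0 - np.exp(-3.0 * np.asarray(r, float) ** 2)

def gamma(h):
    """gamma(h) = 0.3 nugget + 0.4 Sph(h/40) + 0.3 Gau(h/200), major axis only."""
    h = np.asarray(h, float)
    nugget = np.where(h == 0, 0.0, 0.3)
    return nugget + 0.4 * sph(h / 40.0) + 0.3 * gau(h / 200.0)
```

At h = 0 the model returns 0; at large distances the contributions sum to the total sill 0.3 + 0.4 + 0.3 = 1.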

4.1 Modeling variogram in SGeMS

SGeMS offers the computation and modeling of experimental variograms for data both on regular grids and on point sets. The variogram module is accessible from the Data Analysis | Variogram menu. In addition to the variogram, the experimental covariance and correlogram can also be calculated. SGeMS only fits permissible models to variograms. The variography in SGeMS is done in three steps: 1) choosing the grid and properties, 2) calculating the experimental variograms and 3) fitting a model. Step 2) differs depending on whether the properties are found on a point set or on a grid.

For all steps, the next windows are accessed by clicking on Next.

The parameters for the variogram computation are entered in the second window. That window takes two forms depending on whether the property resides on a point set, shown in Figure 4.1(b), or on a Cartesian grid, shown in Figure 4.1(c).

Selecting properties

The selection of the properties on which to perform the variography is done in the first window, shown in Figure 4.1(a). The experimental variogram can either be loaded from a previous session or computed from scratch. When computation is required, the user has to specify two properties, termed head and tail. Specifying different properties for head and tail results in computing their cross-variogram. If the variogram is retrieved from a


previous session, the next window is the modeling window of Figure 4.1(d); the parametrization window is skipped.

1. Experimental variogram: Either compute the experimental variogram from scratch

or retrieve it from a previous modeling session.

2. Grid and Properties: Select the properties for the experimental variograms. Specifying different properties for head and tail results in computing a cross-variogram. Using the same property for head and tail produces a univariate experimental variogram

Experimental variogram on point set

Calculating experimental variograms on a point set is challenging due to the irregularity of data locations. For a fixed lag and direction, it is unlikely that enough pairs, if any, would be found to calculate the corresponding variogram. Thus, tolerances on the distance and angles are introduced so that enough data pairs can be found on a point set. Parameters 3 to 13 shown in Figure 4.1(b) are required when working on a point set. Parameters entered in this window can be saved for future use.

3. Number of lags: Number of variogram lags to compute

4. Lag separation: Distance between two lags. The experimental variogram will be computed from h = 0 to h = Number of lags times the lag separation

5. Lag tolerance: Tolerance around the lag separation. All pairs of data separated by Lag separation ± Lag tolerance are grouped into the same lag

6. Number of directions: Number of directions along which to compute the experimental variograms, each with the same number of lags. Each direction is parametrized by the next four items

7. Azimuth: Horizontal angle in degrees from North, moving clockwise

8. Dip: Vertical angle in degrees from the horizontal plane, moving clockwise

9. Tolerance: Half-window angle tolerance; tolerance on the angle used to pair data

10. Bandwidth: Maximum acceptable deviation from the direction vector


11. Measure type: Choice of the metric used to measure bivariate spatial patterns. The options are: variogram, indicator variogram, covariance and correlogram. The modeling capability of SGeMS only permits fitting a model to variograms and indicator variograms

12. Head indicator cutoff: Threshold defining the indicator variable. The value of the head continuous property is coded as one if it is below that threshold and zero otherwise. This field is used only for indicator variograms

13. Tail indicator cutoff: Threshold defining the indicator variable. The value of the tail continuous property is coded as one if it is below that threshold and zero otherwise. This field is used only for indicator variograms. Different head and tail thresholds lead to a cross-indicator variogram if the tail and head refer to the same property, or to a cross-indicator cross-variogram if the head and tail properties are different
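As an illustration of the pairing logic, the sketch below computes an omnidirectional experimental semi-variogram on a point set using only the lag separation and lag tolerance (the angle tolerance and bandwidth of items 9 and 10 are omitted for brevity; `experimental_variogram` is a hypothetical helper, not the SGeMS routine):

```python
import numpy as np

def experimental_variogram(coords, values, lag_sep, n_lags, lag_tol):
    """Omnidirectional experimental semi-variogram on a point set:
    a pair of data is grouped into lag k when its separation distance
    is within k*lag_sep +/- lag_tol."""
    coords = np.asarray(coords, float)
    values = np.asarray(values, float)
    # all pairwise distances and half squared value increments
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    i, j = np.triu_indices(len(values), k=1)  # count each pair once
    gammas, counts = [], []
    for k in range(1, n_lags + 1):
        mask = np.abs(d[i, j] - k * lag_sep) <= lag_tol
        counts.append(int(mask.sum()))
        gammas.append(float(sq[i, j][mask].mean()) if mask.any() else float("nan"))
    return gammas, counts

coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
g, n = experimental_variogram(coords, [0.0, 1.0, 0.0, 1.0],
                              lag_sep=1.0, n_lags=2, lag_tol=0.5)
```

Returning the pair counts alongside the variogram values matters for fitting: lags built from few pairs should be given less weight.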

Experimental variogram on Cartesian grid

Pairs of data are easier to find on a Cartesian grid. Experimental variograms on a Cartesian grid do not require tolerances. The computation takes advantage of the grid by finding pairs along a discrete direction vector increment, (∆x, ∆y, ∆z), specified by the user. For example, a direction vector of (1, 0, 0) is horizontal East-West, (1, 1, 0) corresponds to an azimuth of 45 degrees, while (0, 0, 1) is vertical.

Parameters 14 to 18 shown in Figure 4.1(c) are required when the selected head and tail properties are located on a Cartesian grid. Parameters entered in this window can be saved for future use. The parameters without a reference number in Figure 4.1(c) have already been described for Figure 4.1(b).

14. Number of lags: Number of variogram lags to consider

15. Number of directions: Number of discretized direction vectors

16. x: East-West increment

17. y: North-South increment

18. z: Vertical increment
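The grid case can be sketched with array slicing: each node is paired with the node a multiple of the increment away, so no tolerances are involved. The 2-D helper below (`grid_variogram`, a hypothetical name, not SGeMS code) assumes non-negative (dx, dy) increments:

```python
import numpy as np

def grid_variogram(prop, dx, dy, n_lags):
    """Experimental semi-variogram of a 2-D Cartesian grid property
    along the discrete direction increment (dx, dy), in grid cells."""
    prop = np.asarray(prop, float)
    ny, nx = prop.shape
    gammas = []
    for k in range(1, n_lags + 1):
        ox, oy = k * dx, k * dy
        head = prop[: ny - oy, : nx - ox]  # nodes with a pair k increments away
        tail = prop[oy:, ox:]              # the paired nodes
        gammas.append(float(0.5 * np.mean((head - tail) ** 2)))
    return gammas

# stripes of 0/1 along x: full sill at lag 1, zero at lag 2 along (1, 0)
prop = np.tile([0.0, 1.0], (2, 3))  # shape (2, 6), rows 0,1,0,1,0,1
g = grid_variogram(prop, dx=1, dy=0, n_lags=2)
```

Along a diagonal increment such as (1, 1), the actual lag distance is the vector norm of the increment times the cell sizes, which should be kept in mind when plotting.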


Fitting a permissible model

The tools to fit a model to the experimental variogram are shown in Figure 4.1(d). The

figures can be reorganized in the window either manually or by using the menuWindow.

The axes for one or all the plots can be changed from theSettingsmenu. TheFile menu

allows to save or load the variogram model in XML format. The experimental variograms

can also be saved in file from that menu. The last option is to take snapshot for all or for

a particular variogram display.

Right-clicking on any variogram display will toggle the number of pairs next to each

experimental values. When fitting, points with few pairs should be given less considera-

tion.

19. Nugget Effect: Contribution of the nugget effect to the sill

20. Nb of structures: Number of structures for the linear model of regionalization

21. Sill Contribution: Contribution of that structure to the sill

22. Type: Type of variogram for that structure. There are three variogram types: spherical, exponential and Gaussian

23. Ranges: Ranges of the variogram. A range can be changed either by manually entering the value or by sliding the bar

24. Angles: Angles defining the anisotropy


(a) Property selection (b) Parameters for point set

(c) Parameters for cartesian grid

(d) Model fitting

Figure 4.1: User interface for variogram modeling.


Chapter 5

Estimation Algorithms

SGeMS provides several tools for estimating a spatially distributed variable Z(u) from

a limited number of samples. All algorithms rely on the linear, least-square estimation

paradigm called kriging. The kriging estimator can be extended to deal with a variable

whose mean is not stationary, and to the case of multiple covariates (cokriging).

5.0.1 Common Parameters

All estimation algorithms in SGeMS work on a 3-D object, loosely called the estimation

grid: in the current version, that object can either be a Cartesian grid or an unstructured

set of points. When an estimation algorithm is run on an object, a new property containing

the result of the estimation is attached to that object. All estimation algorithms require

the following two parameters:

Parameters description

Grid Name: The grid (or more generally, the object) on which the estimation is to be

performed

Property Name: The name of the new property resulting from the estimation

5.0.2 Kriging

Kriging is a 3-D estimation program for variables defined on a constant volume support. Estimation can be performed by simple kriging (SK), ordinary kriging (OK), kriging with a polynomial trend (KT), or simple kriging with a locally varying mean (LVM). SK, OK and KT were introduced in ??, p. ??. LVM is another variant of kriging, in which the mean m(u) is assumed not constant but known for every u. The kriging problem consists of finding the weights {λ_α}, α = 1, ..., n such that

Var( Σ_{α=1}^{n} λ_α [Z(u_α) − m(u_α)] − [Z(u) − m(u)] )

is minimum
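Minimizing this variance leads, for simple kriging, to the normal equations C λ = c, with C the data-to-data covariance matrix and c the data-to-target covariance vector. A minimal 1-D sketch (hypothetical data and an assumed unit-sill exponential covariance, not the SGeMS implementation):

```python
import numpy as np

def simple_kriging(data_x, data_z, x0, mean, cov):
    """Sketch of simple kriging in 1-D: solve the normal equations
    C lam = c for the weights, then estimate
    Z*(x0) = m + sum_a lam_a * (Z(x_a) - m)."""
    data_x = np.asarray(data_x, float)
    data_z = np.asarray(data_z, float)
    C = cov(np.abs(data_x[:, None] - data_x[None, :]))  # data-to-data covariances
    c = cov(np.abs(data_x - x0))                        # data-to-target covariances
    lam = np.linalg.solve(C, c)
    estimate = mean + lam @ (data_z - mean)
    variance = cov(0.0) - lam @ c                       # simple kriging variance
    return estimate, variance

# hypothetical data; unit-sill exponential covariance, practical range 10
cov = lambda h: np.exp(-3.0 * np.asarray(h) / 10.0)
est, var = simple_kriging([0.0, 5.0], [0.2, 0.4], x0=0.0, mean=0.17, cov=cov)
```

At a datum location kriging is an exact interpolator: the estimate equals the datum value and the kriging variance is zero, the property exploited by the weighted histogram transform of Chapter 3.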

Example:

Simple kriging is used to estimate porosity in the second layer of Stanford V. Porosity

is modeled by a stationary random function with an anisotropic spherical variogram. Its

expected value, inferred from the well data, is 0.17. The input parameters for algorithm

Kriging are reproduced in Fig. 5.1. Fig. 5.2 shows the estimated porosity field.

<parameters> <algorithm name="kriging" />

<Grid_Name value="layer 2" />

<Property_Name value="krig" />

<Hard_Data grid="layer 2 well data" property="porosity" />

<Kriging_Type type="Simple Kriging (SK)" >

<parameters mean="0.17" />

</Kriging_Type>

<Max_Conditioning_Data value="200" />

<Search_Ellipsoid value="100 100 5

0 0 0 " />

<Variogram nugget="0.2" structures_count="1" >

<structure_1 contribution="0.8" type="Spherical" >

<ranges max="80" medium="40" min="2" />

<angles x="30" y="0" z="0" />

</structure_1>

</Variogram>

</parameters>

Figure 5.1: Simple kriging parameters


(a) 3D view

(b) Cross-section 1 (c) Cross-section 2

Figure 5.2: Porosity estimated by simple kriging

Parameters description

1. Hard Data [Hard Data ]: The grid and the name of the property containing the hard

data

2. Kriging Type [Kriging Type ]: Possible types of kriging include: Simple Kriging

(SK), Ordinary Kriging (OK), Kriging with Trend (KT) and Simple Kriging with

Locally Varying Mean (LVM)


3. Trend components [Trend]: The trend components. Possible components are, with u = (x, y, z): t1(u) = x, t2(u) = y, t3(u) = z, t4(u) = x·y, t5(u) = x·z, t6(u) = y·z, t7(u) = x², t8(u) = y², t9(u) = z².

The trend is coded by a string of 9 flags, each separated by a space; for example the string "0 1 0 0 0 1 0 1 0" corresponds to a trend with components t2, t6 and t8, i.e. T(u) = α y + β yz + δ y².

4. Local Mean Property [Local Mean Property ]: The property of the simulation grid

containing the non-stationary mean. A mean value must be available at each loca-

tion to be estimated.

5. Max Conditioning Data [Max Conditioning Data ]: The maximum number of con-

ditioning data to be used for kriging at any location

6. Search Ellipsoid [Search Ellipsoid ]: The ranges and angles defining the search

ellipsoid

7. Variogram [Variogram ]: The variogram of the variable to be estimated by kriging

5.0.3 Indicator Kriging

Let Z(u) be a continuous random variable, and I(u, z_k) the binary indicator function defined at cutoff z_k:

I(u, z_k) = 1 if Z(u) ≤ z_k, 0 otherwise

The aim of indicator kriging is to estimate the conditional cumulative distribution function (ccdf) at any cutoff z_k, conditional to the data (n):

I*(u, z_k) = E*( I(u, z_k) | (n) ) = Prob*( Z(u) ≤ z_k | (n) )    (5.1)

I*(u, z_k) is estimated by the simple kriging estimator:

I*(u, z_k) − E{I(u, z_k)} = Σ_{α=1}^{n} λ_α ( I(u_α, z_k) − E{I(u_α, z_k)} )

Estimating I*(u, z_k) for different cutoffs z_k, k = 1, ..., K, yields a discrete estimate of the conditional cumulative distribution function at the threshold values z_1, ..., z_K.


The algorithm Indicator Kriging assumes that the marginal probabilities E{I(u_α, z_k)} are constant and known for all u_α and z_k (simple indicator kriging, ref ??).

Median IK

Estimating a conditional cumulative distribution function at a given location u requires the knowledge of the variogram of each of the indicator variables I(·, z_k), k = 1, ..., K. Inference of these K variograms can be a daunting task. Moreover, a kriging system has to be solved for each indicator variable. The inference and computational burden can be alleviated if two conditions are met:

• the K indicator variables are intrinsically correlated (ref ??):

γ_Z(h) = γ_I(h, z_k) = γ_I(h, z_k, z_k′)  ∀ z_k, z_k′

where γ_Z(h) is the variogram of variable Z(u) and γ_I(h, z_k, z_k′) is the cross-variogram between indicator variables I(·, z_k) and I(·, z_k′). All these variograms are standardized to a unit sill

• all vectors of hard indicator data are complete: there are no missing values, as could result from inequality constraints on the Z value

When those two conditions are met, it is only necessary to infer one variogram γ_Z(h), and only one kriging system has to be solved for all thresholds z_k (indeed, the data locations and the variogram are the same for all thresholds). This simplification is called median indicator kriging.

Ensuring the validity of the estimated cdf

Since kriging is a non-convex estimator, the cdf values Prob*( Z(u) ≤ z_k ), k = 1, ..., K estimated by indicator kriging may not satisfy the properties of a cdf F, that is:

∀ k ∈ {1, ..., K}   F(z_k) = Prob( Z(u) ≤ z_k ) ∈ [0, 1]    (5.2)

∀ z_k ≥ z_k′   F(z_k) ≥ F(z_k′)    (5.3)

If properties 5.2 and 5.3 are not verified, the program Indicator Kriging modifies the kriging values so as to ensure the validity of the ccdf. The corrections are performed in three steps.


1. Upward correction:

• Loop through all the cutoffs z_1, ..., z_K, starting with the lowest cutoff z_1

• At cutoff z_k, if the estimated probability I*(u, z_k) is not in [I*(u, z_{k−1}), 1], reset it to the closest bound

2. Downward correction:

• Loop through all the cutoffs z_1, ..., z_K, starting with the highest cutoff z_K

• At cutoff z_k, if the estimated probability I*(u, z_k) is not in [0, I*(u, z_{k+1})], reset it to the closest bound

3. The final corrected probability values are the average of the upward and downward corrections
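The three steps can be sketched directly; this is illustrative code, not the SGeMS implementation:

```python
import numpy as np

def correct_ccdf(p):
    """Order-relation correction sketch: average an upward pass and a
    downward pass over the kriged cdf values p[0..K-1]."""
    p = np.asarray(p, float)
    up = p.copy()
    for k in range(len(p)):                    # upward: lowest cutoff first
        lo = up[k - 1] if k > 0 else 0.0
        up[k] = min(max(up[k], lo), 1.0)       # reset into [up[k-1], 1]
    down = p.copy()
    for k in range(len(p) - 1, -1, -1):        # downward: highest cutoff first
        hi = down[k + 1] if k < len(p) - 1 else 1.0
        down[k] = min(max(down[k], 0.0), hi)   # reset into [0, down[k+1]]
    return 0.5 * (up + down)

corrected = correct_ccdf([0.2, 0.1, 0.5, 0.4])
```

Both passes yield monotone sequences in [0, 1], so their average is a valid cdf.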

Categorical Variables

Indicator Kriging can also be applied to categorical variables, i.e. variables that take a discrete, finite number of values (also called classes or categories): z(u) ∈ {0, ..., K − 1}. The indicator variable for class k is then defined as:

I(u, k) = 1 if Z(u) = k, 0 otherwise

and the probability I*(u, k) of Z(u) belonging to class k is estimated by simple kriging:

I*(u, k) − E{I(u, k)} = Σ_{α=1}^{n} λ_α ( I(u_α, k) − E{I(u_α, k)} )

In the case of categorical variables, the estimated probabilities must all be in [0, 1] and verify:

Σ_{k=1}^{K} Prob( Z(u) = k | (n) ) = 1    (5.4)

If not, they are corrected as follows:

1. If I*(u, k) ∉ [0, 1], reset it to the closest bound. If all the probability values are less than or equal to 0, no correction is made and a warning is issued.

2. Standardize the values so that they sum to 1:

I*_corrected(u, k) = I*(u, k) / Σ_{i=1}^{K} I*(u, i)
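The two-step correction can be sketched as follows (illustrative, not the SGeMS source):

```python
import warnings
import numpy as np

def correct_categorical(p):
    """Clip each estimated class probability I*(u,k) to [0, 1], then
    standardize so the K values sum to one (Eq. 5.4). If all values
    are <= 0, no correction is made and a warning is issued."""
    p = np.asarray(p, float)
    if np.all(p <= 0.0):
        warnings.warn("all class probabilities <= 0; no correction made")
        return p
    clipped = np.clip(p, 0.0, 1.0)
    return clipped / clipped.sum()

corrected = correct_categorical([0.5, -0.1, 0.85])
```

The negative probability is reset to 0 and the remaining values are rescaled so the vector sums to 1.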


Parameters description

1. Hard Data Grid [Hard Data Grid]: The grid containing the hard data

2. Hard Data Property [Hard Data Property]: The names of the properties containing the hard data. If there are K indicators, K properties must be specified. Each property contains the indicator values I(u, k), k = 1, ..., K

3. Categorical Variable Flag [Categorical Variable Flag]: A flag indicating whether the variable to be kriged is categorical or continuous. If the flag is on (equal to 1), the variable is assumed categorical

4. Marginal Probabilities [Marginal Probabilities]: The marginal probabilities of each indicator. The probability values are separated by one or more spaces. The first probability corresponds to the probability of the first indicator, and so on

5. Max Conditioning Data [Max Conditioning Data]: The maximum number of conditioning data to be used for each kriging

6. Search Ellipsoid [Search Ellipsoid]: The ranges and angles defining the search ellipsoid

7. Median IK Flag [Median Ik Flag]: A flag indicating whether median IK should be performed

8. Full IK Flag [Full Ik Flag]: A flag indicating whether full IK should be performed. If Median Ik Flag is on, Full Ik Flag is off, and vice-versa

9. Median IK Variogram [Variogram Median Ik]: The variogram of the indicator variables used for median IK. If Median Ik Flag is off, this parameter is ignored

10. Full IK Variogram [Variogram Full Ik]: The variograms of each indicator variable. If Median Ik Flag is on, this parameter is ignored


5.0.4 CoKriging

The aim of cokriging is to estimate a variable Z_1(u) accounting for data on related variables Z_2, ..., Z_{J+1}. The cokriging estimator is given by:

Z_1*(u) − m_1(u) = Σ_α λ_α [Z_1(u_α) − m_1(u_α)] + Σ_{j=2}^{J+1} Σ_{β_j=1}^{N_j} λ_{β_j} [Z_j(u_{β_j}) − m_j(u_{β_j})]    (5.5)

Algorithm CoKriging distinguishes between the two cases of simple and ordinary cokriging:

• the means m_1(u), ..., m_{J+1}(u) are known and constant: simple cokriging

• the means are locally constant but unknown: ordinary cokriging. The weights λ_α and λ_{β_j} must then satisfy:

Σ_α λ_α = 1    (5.6)

Σ_{β_j=1}^{N_j} λ_{β_j} = 0  ∀ j ∈ {1, ..., J}    (5.7)

Solving the cokriging system calls for the inference and modeling of multiple variograms: the variogram of each variable and all cross-variograms between any two variables. In order to alleviate the burden of modeling all these variograms, two models have been proposed: the Markov Model 1 and the Markov Model 2 (ref ??).

For clarity, we will only consider the case of a single secondary variable (J = 1).

Markov Model 1

The Markov Model 1 (MM1) considers the following Markov-type screening hypothesis:

E( Z_2(u) | Z_1(u), Z_1(u + h) ) = E( Z_2(u) | Z_1(u) )    (5.8)

i.e. the dependence of the secondary variable on the primary is limited to the co-located primary variable.

The cross-covariance (or cross-variogram) is then proportional to the auto-covariance of the primary variable (??):

C_12(h) = (C_12(0) / C_11(0)) C_11(h)    (5.9)


where C_12 is the covariance between Z_1 and Z_2 and C_11 is the covariance of Z_1. Solving the cokriging system under the MM1 model only calls for knowledge of C_11, hence the inference and modeling effort is the same as for univariate kriging.

Although very congenial, the MM1 model should not be used when the support of the secondary variable Z_2 is larger than that of Z_1, lest the variance of Z_1 be underestimated. It is better to use the Markov Model 2 in that case.

Markov Model 2

The Markov Model 2 (MM2) was developed for the case where the volume support of the secondary variable is larger than that of the primary variable (ref ??). This is often the case with remote sensing and seismic-related data. The Markov-type hypothesis is now:

E( Z_1(u) | Z_2(u), Z_2(u + h) ) = E( Z_1(u) | Z_2(u) )    (5.10)

i.e. the dependence of the primary variable on the secondary is limited to the co-located secondary variable.

The cross-covariance is now proportional to the covariance of the secondary variable:

C_12(h) = (C_12(0) / C_22(0)) C_22(h)    (5.11)

In order for all three covariances C_11, C_12 and C_22 to be consistent, ref (??) proposed to model C_11 as a linear combination of C_22 and any permissible residual covariance C_R. Expressed in terms of correlograms (C_11(h) = C_11(0) ρ_11(h)), this is written:

ρ_11(h) = ρ_12² ρ_22(h) + (1 − ρ_12²) ρ_R(h)    (5.12)
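Eq. (5.12) can be checked numerically; the sketch below uses hypothetical unit-sill exponential correlograms for ρ_22 and the residual ρ_R:

```python
import numpy as np

def mm2_rho11(h, rho12, rho22, rho_r):
    """Eq. (5.12): primary correlogram built as a linear combination of
    the secondary correlogram rho22 and a permissible residual rho_r."""
    return rho12**2 * rho22(h) + (1.0 - rho12**2) * rho_r(h)

# hypothetical unit-sill exponential correlograms for the illustration
rho22 = lambda h: np.exp(-3.0 * h / 50.0)
rho_r = lambda h: np.exp(-3.0 * h / 10.0)
rho11_0 = mm2_rho11(0.0, rho12=0.8, rho22=rho22, rho_r=rho_r)
```

Since the two weights sum to one, ρ_11(0) = 1 by construction, as a correlogram must satisfy; a positive combination of permissible correlograms is itself permissible.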

Parameters description

1. Primary Hard Data Grid [Primary Harddata Grid ]: The grid containing the pri-

mary hard data

2. Primary Variable [Primary Variable ]: The name of the property containing the

primary hard data

3. Assign Hard Data [Assign Hard Data]: A flag specifying whether the hard data should be relocated onto the simulation grid. See ??


4. Secondary Hard Data[Secondary Harddata Grid ]: The grid containing the sec-

ondary hard data

5. Secondary Variable[Secondary Variable ]: The name of the property containing

the secondary hard data

6. Kriging Type [Kriging Type]: Possible types of cokriging include: Simple Kriging (SK) and Ordinary Kriging (OK). Note that selecting OK while retaining only a single secondary datum (e.g. when doing co-located cokriging) amounts to completely ignoring the secondary information: from Eq. (5.7), the weight associated with the single secondary datum will be 0.

7. SK Means [SK Means]: The means of the primary and secondary variables. If the selected kriging type is Ordinary Kriging, this parameter is ignored

8. Co-kriging Type [Cokriging Type ]: The cokriging option. Available options are:

Full Cokriging, Co-located Cokriging with Markov Model 1 (MM1), and Co-located

Cokriging with Markov Model 2 (MM2)

9. Max Conditioning Data [Max Conditioning Data ]: The maximum number of pri-

mary conditioning data to be used for each cokriging

10. Search Ellipsoid - Primary Variable [Search Ellipsoid 1]: The ranges and angles

defining the search ellipsoid used to find the primary conditioning data

11. Primary Variable Variogram [Variogram C11]: The variogram model for the pri-

mary variable

12. Max Conditioning Data [Max Conditioning Data 2]: The maximum number of secondary conditioning data used for each cokriging. This parameter is only required if Cokriging Type is set to Full Cokriging

13. Search Ellipsoid - Secondary Variable [Search Ellipsoid 2]: The ranges and angles defining the search ellipsoid used to find the secondary conditioning data. This parameter is only required if Cokriging Type is set to Full Cokriging

14. Cross-Variogram [Variogram C12]: The cross-variogram between the primary and secondary variables. This parameter is only required if Cokriging Type is set to Full Cokriging

15. Secondary Variable Variogram [Variogram C22]: The variogram of the secondary variable. This parameter is only required if Cokriging Type is set to Full Cokriging

16. Coefficient of correlation [Correl Z1Z2]: The coefficient of correlation between the primary and secondary variables. This parameter is only required if Cokriging Type is set to MM1

17. Coefficient of correlation [MM2 Correl Z1Z2]: The coefficient of correlation between the primary and secondary variables. This parameter is only required if Cokriging Type is set to MM2

18. Secondary Variable Variogram [MM2 Variogram C22]: The variogram of the secondary variable. This parameter is only required if Cokriging Type is set to MM2. Note that the variogram of the secondary variable and the variogram of the primary variable must verify condition (5.12)


Chapter 6

Simulation Algorithms

This chapter presents the collection of stochastic simulation algorithms provided by SGeMS. Section 6.1 presents the traditional variogram-based (two-point) algorithms: SGSIM (Sequential Gaussian Simulation), SISIM (Sequential Indicator Simulation), COSGSIM (Sequential Gaussian Co-simulation) and COSISIM (Sequential Indicator Co-simulation). SGSIM and COSGSIM are the choices for most applications with continuous variables, while SISIM and COSISIM would be the choice for categorical variables.

Section 6.2 gives detailed descriptions of the two multiple-point simulation (mps) algorithms: SNESIM (Single Normal Equation Simulation) and FILTERSIM (Filter-based Simulation). SNESIM only works for categorical variables, as involved in facies distributions, while FILTERSIM is suited to both continuous and categorical variables.

Each simulation algorithm presented in this chapter is demonstrated with some example runs.

6.1 Variogram-based Simulations

This section covers the variogram-based sequential simulation algorithms implemented in SGeMS. The stochastic realizations from any of these algorithms draw their spatial patterns from input model variograms. Variogram-based algorithms are to be used when the critical patterns are reasonably amorphous (high entropy) and can be represented by bivariate statistics. Cases where variograms fail to reproduce critical spatial patterns should lead to using the multiple-point geostatistics algorithms described in the next section.

Variogram-based sequential simulations are the most popular stochastic simulation


algorithms, mostly due to their robustness and ease of conditioning, both to hard and soft data. Moreover, they do not require a rasterized (regular or Cartesian) grid and allow simulation on irregular grids such as point sets.

SGeMS takes full advantage of this flexibility: all the simulation algorithms in this section work both on point sets and on Cartesian grids. Moreover, the conditioning data may or may not be on the simulation grid. However, working on a point set incurs a performance penalty, as the search for neighboring data is significantly more costly than on a regular (Cartesian) grid.

When the simulation grid is Cartesian, all algorithms have the option to relocate the conditioning data to the nearest grid node for increased execution speed. The relocation strategy is to move each datum to the closest grid node. In case two data share the same closest grid node, the farther one is ignored.

This section first presents the simulation algorithms requiring some Gaussian assumption: the sequential Gaussian simulation SGSIM, and the sequential Gaussian co-simulation COSGSIM for the integration of secondary information through a coregionalization model. Next, the direct sequential simulation DSSIM is presented; DSSIM does not require any Gaussian assumption but may not reproduce the target distribution. Finally, the implementations of the indicator simulation algorithms SISIM and COSISIM are discussed. These last two algorithms rely on the decomposition of the cumulative distribution function by a set of thresholds. At any location the probability of exceeding each threshold is estimated by kriging, and these probabilities are combined to construct the local conditional cumulative distribution function (ccdf) from which simulation is performed.

6.1.1 SGSIM: sequential Gaussian simulation

Let Y(u) be a multivariate Gaussian random function with zero mean, unit variance, and a given variogram model γ(h). Realizations of Y(u), conditioned to the data (n), can be generated by Algorithm 6.1.

A non-Gaussian random function Z(u) must first be transformed into a Gaussian random function Y(u): Z(u) → Y(u). If no analytical model is available, a normal score transform may be applied. Algorithm 6.1 then becomes Algorithm 6.2.

Using the normal score transform involves more than just transforming and back trans-

forming the data at simulation time. First, the algorithm calls for the variogram of the

normal scorenot of the original data. Only that normal score variogram is guaranteed

48

Page 50: SGeMS

Algorithm 6.1 Sequential Gaussian Simulation
1: Define a random path visiting each node of the grid
2: for each node u along the path do
3: Get the conditioning data, consisting of both neighboring original data (n) and previously simulated values
4: Estimate the local conditional cdf as a Gaussian distribution, with mean given by a form of kriging and variance given by the kriging variance
5: Draw a value from that Gaussian ccdf and add the simulated value to the data set
6: end for
7: Repeat for another realization

Algorithm 6.2 Sequential Gaussian Simulation with normal score transform
1: Transform the data into normal score space: Z(u) ↦ Y(u)
2: Perform Algorithm 6.1
3: Back-transform the simulated Gaussian field into the data space: Y(u) ↦ Z(u)

to be reproduced, not the original Z-value variogram. However, in many cases the back-transform does not adversely affect the reproduction of the Z-value variogram model. If required, the SGeMS implementation of SGSIM can automatically perform the normal score transform of the original hard data and back-transform the simulated realizations. The user must still independently perform the normal score transform of the hard data with the TRANS algorithm (see Algorithm 3.1) in order to calculate and model the normal score variogram.

At each location along the path, the algorithm can solve either a simple kriging, ordinary kriging, kriging with a local mean, or kriging with a trend system. Theory only guarantees reproduction of the variogram with the simple kriging system.
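As a concrete illustration of the sequential loop of Algorithm 6.1, below is a minimal 1-D sketch using simple kriging with an exponential covariance; the function names, the covariance model and the (unlimited) neighborhood are illustrative assumptions, not the SGeMS implementation.

```python
import numpy as np

def cov(h, range_=10.0):
    """Exponential covariance with unit sill."""
    return np.exp(-3.0 * np.abs(h) / range_)

def sgsim_1d(n_nodes, data, rng, range_=10.0):
    """data: {node index: conditioning normal-score value}."""
    sim = dict(data)                         # informed nodes so far
    path = rng.permutation([i for i in range(n_nodes) if i not in data])
    for u in path:
        known = sorted(sim)                  # here: use all informed nodes
        C = cov(np.subtract.outer(known, known), range_)
        c0 = cov(np.array(known) - u, range_)
        lam = np.linalg.solve(C, c0)         # simple kriging weights (zero mean)
        mean = lam @ np.array([sim[k] for k in known])
        var = max(1.0 - lam @ c0, 0.0)       # SK variance; unit sill
        sim[u] = mean + np.sqrt(var) * rng.standard_normal()
    return np.array([sim[i] for i in range(n_nodes)])

rng = np.random.default_rng(42)
real = sgsim_1d(50, {0: 1.5, 49: -1.0}, rng)
```

Note that the hard data are honored exactly, since informed nodes are never revisited along the path.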

SGSIM with locally varying mean

In most cases, the locally varying mean z_m(u) = E{Z(u)} is given as a Z-value and must be converted into Gaussian space, such that y_m(u) = E{Y(u)}. Transforming z_m(u) into y_m(u) using F_Z and the rank-preserving transform

y_m(u) = F_Y^{-1}( F_Z( z_m(u) ) )

would not, in all generality, ensure that y_m(u) = E{Y(u)}. A better alternative, when possible, is to obtain y_m(u) directly by calibration of the secondary information with


the normal scores y of the primary attribute z.
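The rank-preserving transform above can be sketched with an empirical cdf standing in for F_Z and the standard normal quantile for F_Y^{-1}; the function name and the (r − 0.5)/n cdf convention are illustrative assumptions, not SGeMS code.

```python
import random
from bisect import bisect_right
from statistics import NormalDist

def normal_score(values, z):
    """Rank-preserving transform: empirical F_Z, then the standard normal quantile."""
    srt = sorted(values)
    n = len(srt)
    p = (bisect_right(srt, z) - 0.5) / n         # (rank - 0.5)/n avoids p = 0 or 1
    p = min(max(p, 0.5 / n), 1.0 - 0.5 / n)
    return NormalDist().inv_cdf(p)

random.seed(0)
z_data = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]  # a skewed Z sample
y_m = [normal_score(z_data, z) for z in (0.5, 1.0, 2.0)]         # transformed local means
assert y_m[0] < y_m[1] < y_m[2]                                   # ranks are preserved
```

Ranks are preserved, but as noted above this does not in general make y_m(u) the expected value E{Y(u)}.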

A note on Gaussian spatial law

Gaussian random functions have a very specific spatial structure and distribution law: median values are maximally correlated, while extreme values are increasingly less correlated. That property is known as destructuration. If the extreme values of the phenomenon under study are known to be well correlated in space, a Gaussian-related simulation algorithm is not appropriate.

Example:

Figure 6.6 shows two conditional realizations on a point set using simple kriging and

normal score transform of the hard data.

Parameters description

The SGSIM algorithm is activated from Simulation—sgsim in the algorithm panel. The main SGSIM interface contains 3 pages: General, Data and Variogram (see Figure 6.1). The text inside ‘[ ]’ is the corresponding keyword in the SGSIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.

2. Property Name Prefix[Property Name]: Prefix for the simulation output. The suffix

real# is added for each realization.

3. # of realizations[Nb Realizations ]: Number of simulations to generate.

4. Seed[Seed]: Seed for the random number generator.

5. Kriging Type [Kriging Type ]: Select the type of kriging system to be solved at each

node along the random path. The simple kriging (SK) mean is set to zero.

6. Hard Data—Object [Hard Data.grid ]: Name of the grid containing the conditioning

data. If no grid is selected, the realizations are unconditional.

7. Hard Data—Property [Hard Data.property]: Property for the conditioning data. Only required if a grid has been selected in Hard Data—Object [Hard Data.grid].


8. Assign hard data to simulation grid [Assign Hard Data ]: If selected, the hard data

are copied on the simulation grid. The program does not proceed if the copying

fails. This option significantly increases execution speed.

9. Max Conditioning Data [Max Conditioning Data]: Maximum number of data to be retained in the search neighborhood.

10. Search Ellipsoid Geometry[Search Ellipsoid ]: Parametrization of the search el-

lipsoid.

11. Target Histogram: If used, the data are normal score transformed prior to simulation and the simulated field is transformed back to the original space. Set the Use Target Histogram [Use Target Histogram] flag to use the normal score transform; [nonParamCdf] parametrizes the target histogram (see Section 3.1).

12. Variogram [Variogram ]: Parametrization of the normal score variogram.

6.1.2 COSGSIM: Sequential Gaussian co-simulation

COSGSIM allows the simulation of a Gaussian variable while accounting for secondary information. Let Y1(u) and Y2(u) be two correlated multi-Gaussian random variables: Y1(u) is called the primary variable and Y2(u) the secondary variable. The COSGSIM simulation of the primary variable Y1, conditioned to both primary and secondary data, is described in Algorithm 6.3.

Algorithm 6.3 Sequential Gaussian Co-simulation
1: Define a path visiting each node u of the grid
2: for each node u do
3: Get the conditioning data, consisting of neighboring original data, previously simulated values and secondary data
4: Get the local Gaussian ccdf for the primary attribute, with mean given by a cokriging estimate and variance by the cokriging variance
5: Draw a value from that Gaussian ccdf and add it to the data set
6: end for

If the primary and secondary variables are not Gaussian, make sure that the transformed variables Y1 and Y2 are at least bi-Gaussian. If they are not, another simulation


algorithm should be considered, for example COSISIM (see Section 6.1.5). If no analytical model is available for such a transformation, a normal score transform may be applied to both variables; the TRANS algorithm only ensures that the respective marginal distributions are Gaussian. Algorithm 6.3 then becomes Algorithm 6.4.

Algorithm 6.4 Sequential Gaussian Co-simulation for non-Gaussian variables
1: Transform Z1 and Z2 into Gaussian variables Y1 and Y2, according to Eq. (3.5)
2: Perform Algorithm 6.3
3: Back-transform the simulated values y1,1, . . . , y1,N into z1,1, . . . , z1,N

Example:

Figure 6.11 shows two conditional realizations with an MM1-type coregionalization and simple cokriging. Both the primary and the secondary information are normal score transformed.

Parameters description

The COSGSIM algorithm is activated from Simulation—cosgsim in the algorithm panel. The COSGSIM interface contains 5 pages: General, Primary Data, Secondary Data, Primary Variogram and Secondary Variogram (see Figure 6.2). The text inside ‘[ ]’ is the corresponding keyword in the COSGSIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.

2. Property Name Prefix[Property Name]: Prefix for the simulation output. The suffix

real# is added for each realization.

3. # of realizations[Nb Realizations ]: Number of simulations to generate.

4. Seed[Seed]: Seed for the random number generator.

5. Kriging Type [Kriging Type ]: Select the type of kriging system to be solved at each

node along the random path.

6. Cokriging Option [Cokriging Type ]: Select the type of coregionalization model:

LMC, MM1 or MM2


7. Primary Hard Data Grid [Primary Harddata Grid ]: Selection of the grid for the

primary variable. If no grid is selected, the realizations are unconditional.

8. Primary Property [Primary Variable ]: Selection of the hard data property for the

primary variable.

9. Assign hard data to simulation grid [Assign Hard Data ]: If selected, the hard data

are copied on the simulation grid. The program does not proceed if the copying

fails. This option significantly increases execution speed.

10. Primary Max Conditioning Data [Max Conditioning Data 1]: Maximum number of primary data to be retained in the search neighborhood.

11. Primary Search Ellipsoid Geometry [Search Ellipsoid 1]: Parametrization of

the search ellipsoid for the primary variable.

12. Target Histogram: If used, the primary data are normal score transformed prior to the simulation and the simulated field is transformed back to the original space. Set the Transform Primary Variable [Transform Primary Variable] flag to use the normal score transform; [nonParamCdf primary] parametrizes the primary variable target histogram (see Section 3.1).

13. Secondary Data Grid[Secondary Harddata Grid ]: Selection of the grid for the

secondary variable.

14. Secondary Property[Secondary Variable ]: Selection of the data property for the

secondary variable.

15. Secondary Max Conditioning Data [Max Conditioning Data 2]: Maximum number of secondary data to be retained in the search neighborhood.

16. Secondary Search Ellipsoid Geometry[Search Ellipsoid 2]: Parametrization of

the search ellipsoid for the secondary variable.

17. Target Histogram: If used, the secondary data are normal score transformed prior to the simulation. Set the Transform Secondary Variable [Transform Secondary Variable] flag to use the normal score transform; [nonParamCdf secondary] parametrizes the secondary variable target histogram (see Section 3.1).


18. Variogram for primary variable [Variogram C11]: Parametrization of the normal

score variogram for the primary variable.

19. Cross-variogram between primary and secondary variables[Variogram C12]:

Parametrization of the cross-variogram for the normal score primary and secondary

variables. Required ifCokriging Option [Cokriging Type ] is set toFull Cokrig-

ing

20. Variogram for secondary variable [Variogram C22]: Parametrization of the nor-

mal score variogram for the secondary variable. Required ifCokriging Option

[Cokriging Type ] is set toFull Cokriging

21. Coef. Correlation Z1,Z2: Coefficient of correlation between the primary and sec-

ondary variable. Only required ifCokriging Option [Cokriging Type ] is set to

MM1 or MM2 . The correlation keyword is [Correl Z1Z2] for the MM1 coregion-

alization, and [MM2Correl Z1Z2] for the MM2 coregionalization.

22. Covariance for secondary variable [MM2Variogram C22]: Parametrization of the normal score variogram for the secondary variable. Required if Cokriging Option [Cokriging Type] is set to MM2.

6.1.3 DSSIM: Direct Sequential Simulation

The direct sequential simulation algorithm DSSIM simulates continuous attributes without any prior indicator coding or Gaussian transform. It can be shown that the only condition for the model variogram to be reproduced is that the ccdf has for mean and variance the simple kriging estimate and variance. The shape of the ccdf does not matter; it need not even be the same at each node. The drawback is that there is no guarantee that the marginal distribution is reproduced.

One solution is to post-process the simulated realizations with a rank transform algorithm to identify the target histogram (see algorithm TRANS in Section 3.1); this may affect variogram reproduction. The second alternative is to set the shape of the local ccdf, at all locations along the path, such that the marginal distribution is approximated at the end of each realization.

SGeMS offers two distribution types for the first alternative: the ccdf is either a uniform or a lognormal distribution. Neither of these distributions would


produce realizations with uniform or lognormal marginal distributions; hence some rank transformation may be required to identify the target marginal histogram.

For the second alternative, the method proposed by Soares (2001) is implemented. The ccdf is sampled from the data marginal distribution, modified to be centered on the kriging estimate with spread equal to the kriging variance. The resulting shape of each local ccdf thus differs from location to location. The method gives reasonable results for a symmetric (even multi-modal) target distribution, but less so for a highly skewed distribution; in this latter case the first option of a lognormal ccdf type, followed by a rank transform, may give better results.

Algorithm 6.5 Direct Sequential Simulation
1: Define a random path visiting each node u of the grid
2: for each location u along the path do
3: Get the conditioning data, consisting of both neighboring original data and previously simulated values
4: Define the local ccdf with mean and variance given by the kriging estimate and variance
5: Draw a value from that ccdf and add the simulated value to the data set
6: end for
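A Soares-style local ccdf as used in step 4 can be sketched as follows: the global marginal distribution is standardized, then shifted and rescaled to the local kriging mean and variance before drawing. This is a simplified reading of the method, with illustrative names, not the SGeMS implementation.

```python
import random
from statistics import fmean, pstdev

def draw_soares(global_values, krig_mean, krig_var, rng):
    """Sample a local ccdf: the global distribution rescaled to the kriging moments."""
    m, s = fmean(global_values), pstdev(global_values)
    # shift and rescale the global distribution to the local kriging moments
    local_ccdf = [krig_mean + krig_var ** 0.5 * (g - m) / s for g in global_values]
    return rng.choice(local_ccdf)

rng = random.Random(7)
data = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]  # skewed global histogram
z = draw_soares(data, krig_mean=2.0, krig_var=0.25, rng=rng)
```

By construction the drawn values have, on average, the kriging mean and variance, while the ccdf shape follows the global histogram.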

Example:

Figure 6.8 shows two conditional realizations on a point set using the Soares method and simple kriging. A locally varying mean with the Soares method is used in Figure 6.9.

Parameters description

The DSSIM algorithm is activated from Simulation—dssim in the upper part of the algorithm panel. The main DSSIM interface contains 3 pages: General, Data and Variogram (see Figure 6.3). The text inside ‘[ ]’ is the corresponding keyword in the DSSIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.

2. Property Name Prefix[Property Name]: Prefix for the simulation output. The suffix

real# is added for each realization.


3. # of realizations[Nb Realizations ]: Number of simulations to generate.

4. Seed[Seed]: Seed for the random number generator.

5. Kriging Type [Kriging Type ]: Select the form of kriging system to be solved at each

node along the random path.

6. SK mean [SK mean]: Mean of the attribute. Only required if Kriging Type [Kriging Type] is set to Simple Kriging (SK).

7. Hard Data—Object [Hard Data.grid ]: Name of the grid containing the conditioning

data. If no grid is selected, the realizations are unconditional.

8. Hard Data—Property [Hard Data.property]: Property for the conditioning data. Only required if a grid has been selected in Hard Data—Object [Hard Data.grid].

9. Assign hard data to simulation grid [Assign Hard Data ]: If selected, the hard data

are copied on the simulation grid. The program does not proceed if the copying

fails. This option significantly increases execution speed.

10. Max Conditioning Data [Max Conditioning Data]: Maximum number of data to be retained in the search neighborhood.

11. Search Ellipsoid Geometry[Search Ellipsoid ]: Parametrization of the search el-

lipsoid.

12. Distribution type [cdf type]: Select the type of cdf to be built at each location along the random path.

13. LogNormal parameters: Only activated if Distribution type [cdf type] is set to LogNormal. Parametrization of the global lognormal distribution: the mean is specified by Mean [LN mean] and the variance by Variance [LN variance].

14. Uniform parameters: Only activated ifDistribution type [cdf type ] is set to

Uniform . Parametrization of the global Uniform distribution, the minimum is

specified byMin [U min ] and the maximum byMax[U max].

15. Soares Distribution[nonParamCdf ]: Only activated ifDistribution type [cdf type ]

is set toSoares. Parametrization of the global distribution from which the lo-

cal distribution is sampled (see section 3.1).


16. Variogram [Variogram]: Parametrization of the variogram. For this algorithm, the sill of the variogram is critical information and should not be standardized to 1.

6.1.4 SISIM : sequential indicator simulation

Sequential indicator simulation SISIM combines the indicator formalism with the sequential paradigm to simulate non-parametric continuous or categorical distributions. In the continuous case, at every location along the path a non-parametric ccdf is built from the kriging estimates obtained from the neighboring indicator data. In the categorical case, the probability of each category occurring is estimated by kriging.

The indicator formalism used with sequential indicator simulation (Algorithm 6.6) does not require any Gaussian assumption. Instead, the ccdf is built by estimating the sequence of probabilities of exceeding a finite set of threshold values. The more thresholds are retained, the more detailed the conditional cumulative distribution function. The indicator formalism removes the need for a normal score transform, but may require an additional modeling effort.

This version of SISIM does not require any prior indicator coding of the data; all coding is done internally, for both the continuous and the categorical cases. Interval or incomplete data may also be entered, but these need to be pre-coded.

When modeling a set of indicator variograms, the user is warned that not all combinations of variograms can be reproduced. For example, if a field has three categories, the spatial patterns of the third category are completely determined by the variogram models of the first two.

Continuous variable

The SISIM algorithm relies on indicator kriging to infer the local ccdf values. The continuous indicator variable is defined as:

i(z(u), zk) = 1 if z(u) ≤ zk, and 0 otherwise.
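The indicator coding above amounts to turning each continuous datum into a vector of K binary values, one per threshold; a minimal sketch, with illustrative names:

```python
def indicator_code(z, thresholds):
    """i(z, z_k) = 1 if z <= z_k, else 0, for each threshold z_k."""
    return [1 if z <= zk else 0 for zk in thresholds]

thresholds = [0.5, 1.0, 2.0]
assert indicator_code(0.7, thresholds) == [0, 1, 1]
```

Note that the indicator vector is always non-decreasing along the thresholds, which is what the order-relation corrections mentioned below enforce on the kriged probabilities.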

Two types of kriging can be used for indicator simulation. The full IK option requires a variogram model for each threshold. The median IK option only requires a variogram model for the median threshold; all other variograms are assumed proportional to that single model.


While both SGSIM and SISIM with median IK require a single variogram, they produce different outputs and spatial patterns: the extreme values of a random field with a median IK regionalization are more spatially correlated than with the SGSIM algorithm.

For continuous attributes, SISIM implements Algorithm 6.6.

Sampling the estimated distribution function

At each location to be simulated, the ccdf FZ(u) is estimated at all threshold values z1, . . . , zK. However, sampling from the ccdf estimated in step 7 of Algorithm 6.6 requires knowledge of FZ(u)(z) for all z ∈ D. In SISIM, FZ(u) is interpolated as follows:

F∗Z(u)(z) =
  φlt(z)                                                              if z ≤ z1
  F∗Z(u)(zk) + [(z − zk)/(zk+1 − zk)] · [F∗Z(u)(zk+1) − F∗Z(u)(zk)]   if zk < z ≤ zk+1
  1 − φut(z)                                                          if z > zK        (6.1)

where F∗Z(u)(zk) = i∗(u, zk) is estimated by indicator kriging, and φlt(z) and φut(z) are respectively the lower and upper tail extrapolations chosen by the user and described in Section 3.1. Values between thresholds zk and zk+1 are interpolated linearly, hence are drawn from a uniform distribution within the interval [zk, zk+1].
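Drawing from the interpolated ccdf of Eq. (6.1) can be sketched by inverting the piecewise-linear cdf; here simple linear tails stand in for the user-chosen tail extrapolations φlt and φut, and all names are illustrative, not SGeMS code.

```python
import random
from bisect import bisect_right

def sample_ccdf(thresholds, probs, zmin, zmax, rng):
    """Draw one value by inverting a piecewise-linear ccdf."""
    z = [zmin] + list(thresholds) + [zmax]       # linear tails stand in for phi_lt, phi_ut
    F = [0.0] + list(probs) + [1.0]
    p = rng.random()
    k = min(bisect_right(F, p) - 1, len(F) - 2)  # interval [z_k, z_{k+1}] containing p
    t = (p - F[k]) / (F[k + 1] - F[k])
    return z[k] + t * (z[k + 1] - z[k])

rng = random.Random(1)
thresholds, probs = [1.0, 2.0, 3.0], [0.2, 0.6, 0.9]  # probs: kriged i*(u, z_k)
z = sample_ccdf(thresholds, probs, zmin=0.0, zmax=5.0, rng=rng)
assert 0.0 <= z <= 5.0
```

The probabilities passed in must already be corrected for order relations, i.e. non-decreasing and within [0, 1].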

Categorical variables

If Z(u) is a categorical variable which only takes the integer values {1, . . . , K}, Algorithm 6.6 is modified as described in Algorithm 6.7. The categorical indicator variable is defined by:

i(u, k) = 1 if Z(u) = k, and 0 otherwise.

For the categorical case, the median IK option indicates that all categories share the same variogram.


Algorithm 6.6 SISIM - continuous variables
1: Choose a discretization of the range D of Z(u): z1, . . . , zK
2: Define a path visiting all locations to be simulated
3: for each location u along the path do
4: Retrieve the neighboring conditioning data: z(uα), α = 1, . . . , N
5: Turn each datum z(uα) into a vector of indicator values: v(uα) = [ i(z(uα), z1), . . . , i(z(uα), zK) ]
6: Estimate the indicator random variable I(u, zk) for each of the K threshold values by solving a kriging system
7: The estimated values i∗(u, zk) = Prob∗(Z(u) ≤ zk), after correction of order-relation problems, define an estimate of the ccdf FZ(u) of the variable Z(u)
8: Draw a value from that ccdf and assign it as a datum at location u
9: end for
10: Repeat the previous steps to generate another simulated realization

Algorithm 6.7 SISIM - categorical variables
1: Define a path visiting all locations to be simulated
2: for each location u along the path do
3: Retrieve the neighboring categorical conditioning data: z(uα), α = 1, . . . , N
4: Turn each datum z(uα) into a vector of indicator data values: v(uα) = [ i(uα, 1), . . . , i(uα, K) ]
5: Estimate the indicator random variable I(u, k) for each of the K categories by solving a simple kriging system
6: The estimated values i∗(u, k) = Prob∗(Z(u) = k), after correction of order-relation problems, define an estimate of the discrete conditional probability density function (cpdf) fZ(u) of the categorical variable Z(u); draw a realization from f and assign it as a datum at location u
7: end for
8: Repeat the previous steps to generate another realization

Example:

Figure 6.10 shows two conditional realizations on a point set with a median IK regional-

ization. Nineteen thresholds were selected.


Parameters description

The SISIM algorithm is activated from Simulation—sisim in the algorithm panel. The main SISIM interface contains 3 pages: General, Data and Variogram (see Figure 6.4). The text inside ‘[ ]’ is the corresponding keyword in the SISIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.

2. Property Name Prefix [Property Name]: Prefix for the simulation output. The suffix real# is added for each realization.

3. # of realizations[Nb Realizations ]: Number of simulations to generate.

4. Seed[Seed]: Seed for the random number generator.

5. Categorical variable [Categorical Variable Flag ]: Indicates if the data are cate-

gorical or not.

6. # of thresholds/classes [Nb Indicators]: Number of classes if the flag [Categorical Variable Flag] is selected, or number of threshold values for continuous attributes.

7. Threshold Values / Classes[Thresholds ]: Threshold values in ascending order. That

field is required only for continuous data.

8. Marginal probabilities [Marginal Probabilities]:

If continuous: Probability of being below each of the above thresholds. The entries must be monotonically increasing.

If categorical: Proportion of each category. The first entry corresponds to category 0, the second to category 1, and so on.

9. Lower tail extrapolation [lowerTailCdf ]: Parametrize the lower tail of the cumula-

tive distribution function for continuous attributes.

10. Upper tail extrapolation [upperTailCdf ]: Parametrize the upper tail of the cumu-

lative distribution function for continuous attributes.

11. Indicator kriging type: If Median IK [Median Ik Flag] is selected, the program uses median indicator kriging to estimate the ccdf. Otherwise, if Full IK [Full Ik Flag] is selected, one IK system is solved for each threshold/class.


12. Hard Data Grid [Hard Data Grid ]: Grid containing the conditioning hard data

13. Hard Data Property [Hard Data Property ]: Conditioning data for the simulation

14. Assign hard data to simulation grid[Assign Hard Data ]: If selected, the hard data

are copied on the simulation grid. The program does not proceed if the copying

fails. This option significantly increases execution speed.

15. Interval Data — Object [coded grid]: Grid containing the interval data. Cannot be used if Median IK [Median Ik Flag] is selected.

16. Interval Data — Properties [coded props]: Properties with the interval and soft data. These data must already be properly coded and must all be found on the grid [coded grid]. There must be [Nb Indicators] properties selected.

17. Max Conditioning data [Max Conditioning Data ]: Maximum number of data to

be retained in the search neighborhood.

18. Search Ellipsoid Geometry[Search Ellipsoid ]: Parametrization of the search el-

lipsoid.

19. Variogram [Variogram ]: Parametrization of the indicator variograms. Only one

variogram is necessary ifMedian IK [Median Ik Flag ] is selected. Otherwise there

are [Nb Indicators ] indicator variograms.

6.1.5 COSISIM: Sequential Indicator Cosimulation

The COSISIM algorithm extends SISIM to handle secondary data. Unlike with SISIM, the data must already be indicator coded prior to using COSISIM. The algorithm does not differentiate between hard and interval data: for any given threshold, both are located on the same property. If no secondary data are selected, COSISIM performs a traditional sequential indicator simulation.

The secondary data are integrated using the Markov-Bayes algorithm. As with the primary attribute, the secondary information must be coded into indicators before use. The Markov-Bayes calibration coefficients are not computed internally and must be given as input. SGeMS allows the Markov-Bayes algorithm to be used with either a full IK or a median IK regionalization model.


A note on conditioning

As opposed to SISIM, COSISIM does not exactly honor hard data for a continuous attribute. The algorithm honors these data in the sense that the simulated values fall within the correct threshold interval; it is not possible to retrieve the exact continuous data values, since these were never provided to the program. A possible post-processing step is to copy the conditioning hard data values over the simulated nodes once the realizations are finished.

Example:

TO DO: example with cosisim

Parameters description

The COSISIM algorithm is activated from Simulation—cosisim in the algorithm panel. The main COSISIM interface contains 3 pages: General, Data and Variogram (see Figure 6.5). The text inside ‘[ ]’ is the corresponding keyword in the COSISIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.

2. Property Name Prefix[Property Name]: Prefix for the simulation output. The suffix

real# is added for each realization.

3. # of realizations[Nb Realizations ]: Number of simulations to generate.

4. Seed[Seed]: Seed for the pseudo random number generator.

5. Categorical variable [Categorical Variable Flag ]: Indicates if the data are cate-

gorical or not.

6. # of thresholds/classes [Nb Indicators]: Number of classes if the flag [Categorical Variable Flag] is selected, or number of threshold values for continuous attributes.

7. Threshold Values / Classes [Thresholds]: Threshold values in ascending order; there must be [Nb Indicators] values entered. That field is required only for continuous data.

8. Marginal probabilities [Marginal Probabilities ]:


If continuous: Probability of being below each of the above thresholds. The entries must be monotonically increasing.

If categorical: Proportion of each category. The first entry corresponds to category 0, the second to category 1, and so on.

9. Lower tail extrapolation [lowerTailCdf ]: Parametrize the lower tail of the ccdf for

continuous attributes.

10. Upper tail extrapolation [upperTailCdf ]: Parametrize the upper tail of the ccdf for

continuous attributes.

11. Kriging Type [Kriging Type ]: Select the type of kriging system to be solved at each

node along the random path.

12. Indicator kriging type: If Median IK [Median Ik Flag] is selected, the program uses median indicator kriging to estimate the ccdf. If Full IK [Full Ik Flag] is selected, [Nb Indicators] IK systems are solved at each location, one for each threshold/class.

13. Hard Data Grid [Primary Hard Data Grid ]: Grid containing the conditioning hard

data

14. Hard Data Indicators Properties [Primary Indicators]: Conditioning primary data for the simulation. There must be [Nb Indicators] properties selected, the first one being for class 0, the second for class 1, and so on. If Full IK [Full Ik Flag] is selected, a location need not be informed for all thresholds.

15. Primary Max Conditioning Data [Max Conditioning Data Primary]: Maximum number of primary indicator data to be retained in the search neighborhood.

11. Primary Search Ellipsoid Geometry [Search Ellipsoid 1]: Parametrization of

the search ellipsoid for the primary variable.

16. Soft Data Grid [Secondary Harddata Grid ]: Grid containing the conditioning soft

data indicators. If no grid is selected, a univariate simulation is performed.

17. Soft Data Indicators Properties[Secondary Indicators ]: Conditioning secondary

data for the simulation. There must be [Nb Indicators ] properties selected, the first


one being for class 0, the second for class 1, and so on. IfFull IK [Full Ik Flag ]

is selected, a location need not be informed for all thresholds.

18. Search Ellipsoid Geometry[Search Ellipsoid Primary ]: Parametrization of the

search ellipsoid for the primary indicator data.

19. B(z,IK) for each indicator [Bz Values ]: Parameters for the Markov Bayes Model,

one B-coefficient value must be entered for each indicator. Only required if sec-

ondary data are used.

20. Secondary Max Conditioning Data [Max Conditioning Data Secondary]: Maximum number of secondary indicator data to be retained in the search neighborhood.

21. Secondary Search Ellipsoid Geometry[Search Ellipsoid 1]: Parametrization of

the search ellipsoid for the secondary data indicators.

22. Variogram [Variogram ]: Parametrization of the indicator variograms. Only one

variogram is necessary ifMedian IK [Median Ik Flag ] is selected. Otherwise there

are [Nb Indicators ] indicator variograms.


(a) General tab (b) Data tab (c) Variogram tab

Figure 6.1: User interface for SGSIM.


(a) General tab (b) Primary data tab (c) Secondary data tab (d) Primary variogram tab (e) Secondary variogram tab for the linear model of coregionalization (f) Secondary variogram tab for the MM2 coregionalization

Figure 6.2: User interface for COSGSIM.


(a) General tab (b) Data tab (c) Variogram tab

Figure 6.3: User interface for DSSIM.


(a) General tab (b) Data tab (c) Variogram tab

Figure 6.4: User interface for SISIM.

(a) General tab (b) Data tab (c) Variogram tab

Figure 6.5: User interface for COSISIM.


(a) Realization #1 (b) Realization #2

Figure 6.6: Two realizations with SGSIM.

(a) Realization #1 (b) Realization #2

Figure 6.7: Two realizations with SGSIM and a locally varying mean.


(a) Realization #1 (b) Realization #2

Figure 6.8: Two realizations with DSSIM.

(a) Realization #1 (b) Realization #2

Figure 6.9: Two realizations with DSSIM and a locally varying mean.


(a) Realization #1 (b) Realization #2

Figure 6.10: Two realizations with SISIM and the median IK regionalization; 19 thresholds are used in this example.

(a) Realization #1 (b) Realization #2

Figure 6.11: Two realizations with COSGSIM and the Markov Model 1 (MM1) coregionalization.


6.2 Multiple-point Simulation algorithms

Before the introduction of multiple-point geostatistics, two families of simulation algorithms for facies modeling were available: pixel-based and object-based. Pixel-based algorithms build the simulated realizations one pixel at a time, thus providing great flexibility for conditioning to data of diverse support volumes and diverse types. Pixel-based algorithms may, however, be slow and have difficulty reproducing complex geometric shapes, particularly if the simulation of these pixel values is constrained only by two-point statistics, such as a variogram or a covariance. Object-based algorithms build realizations by dropping onto the simulation grid one object or pattern at a time; hence they can be fast and faithful to the geometry of the objects. However, they are difficult to condition to local data of different support volumes, particularly when these data are dense, as is the case for seismic surveys.

SGeMS does not yet provide any program for object-based simulation, whether conditional or unconditional. Such programs are available in many free and commercial software packages, such as ‘fluvsim’ (Deutsch and Tran, 2002) or ‘SBED’.

The multiple-point simulation (mps) concept, proposed by Journel (1992) and first implemented by Guardiano and Srivastava (1992), combines the strengths of the two previous classes of simulation algorithms. It operates pixel-wise, with the conditional probability for each pixel value being lifted as a conditional proportion from a training image depicting the geometry and distribution of the objects deemed to prevail in the actual field. That training image (TI), a purely conceptual depiction, can be built using object-based algorithms. What mps does is morph the TI so that it honors all local data.

6.2.1 SNESIM

The original mps implementation by Guardiano and Srivastava (1992) was much too slow, for it required a new scan of the whole TI at each new simulation node to retrieve the required conditional proportion from which to draw the simulated value. mps became practical with the SNESIM implementation of Strebelle (2000). In SNESIM, the TI is scanned only once, and all conditional proportions available in the TI are stored in a smart search tree data structure, from which they can be retrieved quickly.

The SNESIM algorithm contains two main parts: the construction of a search tree where all training proportions are stored, and the simulation itself, where these proportions are read and used to draw the simulated values.


Search tree construction

A search template τ_J is defined by J vectors h_j, j = 1, . . . , J, radiating from a central node u_0. The template is thus constituted by the J nodes u_0 + h_j, j = 1, . . . , J. That template is used to scan the training image for all training patterns pat(u′_0) = {t(u′_0); t(u′_0 + h_j), j = 1, . . . , J}, where u′_0 is any central node of the TI and t(u′_0 + h_j) is the training image value at grid node u′_0 + h_j. All these training patterns are stored in a search tree data structure (Strebelle, 2000), such that one can easily retrieve:

1. the total number n of patterns with exactly the same J values D_J = {d_j, j = 1, . . . , J}; one such pattern would be {t(u′_0 + h_j) = d_j, j = 1, . . . , J};

2. among those patterns, the number n_k which feature a specific value t(u′_0) = k (k = 0, . . . , K − 1) at the central location u′_0,

where K is the total number of categories. The ratio of these two numbers gives the proportion of training patterns featuring the central value t(u′_0) = k among all those featuring the J “data” values t(u′_0 + h_j) = d_j:

P(t(u′_0) = k | D_J) = n_k / n,   k = 0, . . . , K − 1.   (6.2)

Single grid simulation

The SNESIM simulation proceeds one pixel at a time following a random path visiting all the nodes of the simulation grid G for each realization re. Hard data have been relocated to the closest nodes of G, and these data-informed nodes are not visited, thus ensuring data exactitude. For all other nodes, the sequential simulation algorithm (internal ref. ??) is used.

At each simulation node u, the search template τ_J is used to retrieve the conditioning data event dev_J(u), which is defined as

dev_J(u) = {z(s)(u + h_1), . . . , z(s)(u + h_J)},   (6.3)

where z(s)(u + h_j) is an informed nodal value in the SNESIM realization; such a value can be either an original hard datum or a previously simulated value. Note that there can be any number of uninformed nodal values among the J possible locations of the template τ_J centered at u.


Next, find from the search tree the number n of training patterns which have the same values as dev_J(u). If n is less than a fixed threshold cmin (the minimum number of replicates), define the smaller data event dev_(J−1)(u) by dropping the furthest-away informed node from dev_J(u), and repeat the search. This step is repeated until n ≥ cmin. Let J′ (J′ ≤ J) be the data event size for which n ≥ cmin.

The conditional probability from which the nodal value z(s)(u) is drawn is modeled as:

P(Z(u) = k | dev_J(u)) ≈ P(Z(u) = k | dev_J′(u)) = P(t(u′_0) = k | dev_J′(u)).

This probability is conditional to up to J data values found with the search template τ_J. Algorithm 6.8 describes the simplest version of the SNESIM algorithm, for a K-category variable Z(u) valued in {0, . . . , K − 1}.

Algorithm 6.8 Simple single grid SNESIM

1: Define a search template τ_J
2: Construct a search tree Tr specific to template τ_J
3: Relocate hard data to the nearest simulation grid nodes and freeze them during simulation
4: Define a random path visiting all locations to be simulated
5: for each location u along the path do
6:   Find the conditioning data event dev_J(u) defined by template τ_J
7:   Retrieve the conditional probability distribution (ccdf) P(Z(u) = k | dev_J(u)) from the search tree
8:   Draw a simulated value z(s)(u) from that conditional distribution and add it to the data set
9: end for
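A minimal single-grid version of steps 4-9 can be sketched as follows. This is an illustrative reduction, not SGeMS code: the grid is 1D, the search tree is a plain dictionary, and a data event with no training replicate falls back to the marginal proportions instead of the node-dropping loop described above:

```python
import random

def snesim_single_grid(sim, offsets, tree, marginal, K, seed=1):
    """Sequential simulation along a random path (Algorithm 6.8, steps 4-9).
    `sim` holds hard data as category ints and None at nodes to simulate."""
    rng = random.Random(seed)
    path = [u for u in range(len(sim)) if sim[u] is None]
    rng.shuffle(path)                          # step 4: random path
    for u in path:                             # step 5
        # Step 6: conditioning data event (None where uninformed).
        dev = tuple(sim[u + h] if 0 <= u + h < len(sim) else None
                    for h in offsets)
        # Step 7: ccdf from the "tree", with a marginal fallback.
        probs = tree.get(dev, marginal)
        # Step 8: draw z(u) and add it to the data set.
        sim[u] = rng.choices(range(K), weights=probs)[0]
    return sim

# Hard data at nodes 0 and 4 (frozen); simulate the other 3 nodes.
tree = {(0, None): [0.9, 0.1], (None, 1): [0.2, 0.8]}
result = snesim_single_grid([0, None, None, None, 1],
                            offsets=(-1, 1), tree=tree,
                            marginal=[0.5, 0.5], K=2)
print(result)
```

Note how previously simulated values immediately become conditioning data for the nodes visited later along the path — the defining property of sequential simulation.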

The search tree stores all training replicates {t(u′_0); t(u′_0 + h_j), j = 1, . . . , J}, and allows retrieving the conditional probability distribution of step 7 in O(J) operations. This speed comes at the cost of a possibly large RAM demand. Let N_TI be the total number of locations in the training image. No matter the data event size J, there cannot be more than N_TI different data events in the training image; thus an upper bound on the memory demand of the search tree is:

Memory demand ≤ min(K^J, N_TI),

where K^J is the total number of possible data value combinations with K categories and J nodes.

Multiple grid simulation

The multiple grid simulation approach (Tran, 1994) is used to capture large-scale structures with a search template τ_J having a reasonably small number of nodes. Denote by G the 3D Cartesian grid on which simulation is to be performed, and define G^g as the g-th sub-set of G such that G^1 = G and G^g is obtained by down-sampling G^(g−1) by a factor of 2 along each of the three coordinate directions: G^g is the sub-set of G^(g−1) obtained by retaining every other node of G^(g−1). G^g is called the g-th level multi-grid. Fig. 6.12 illustrates a simulation field divided into 3 multiple grids.

Figure 6.12: Three multiple grids (coarsest, medium and finest)

In the g-th multi-grid G^g, the search template τ_J is correspondingly rescaled by a factor 2^(g−1) such that

τ^g_J = {2^(g−1) h_1, . . . , 2^(g−1) h_J}.

Template τ^g_J has the same number of nodes as τ_J but a much greater spatial extent, hence it allows capturing large-scale structures without increasing the search tree size. Fig. 6.13 shows a fine template of size 3 × 3 and the expanded coarse template in the 2nd level coarse grid.
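The isotropic expansion of the template and the down-sampling of the grid can be sketched as follows (illustrative helpers, not SGeMS code):

```python
def expand_template(offsets, g):
    """Rescale a base template for the g-th multi-grid:
    tau_J^g = {2**(g-1) * h_1, ..., 2**(g-1) * h_J}."""
    f = 2 ** (g - 1)
    return [tuple(f * c for c in h) for h in offsets]

def multigrid_nodes(nx, ny, g):
    """Nodes of the g-th level multi-grid G^g of an nx-by-ny grid:
    retain every 2**(g-1)-th node in each direction."""
    f = 2 ** (g - 1)
    return [(i, j) for i in range(0, nx, f) for j in range(0, ny, f)]

base = [(-1, 0), (1, 0), (0, -1), (0, 1)]
print(expand_template(base, 3))       # → [(-4, 0), (4, 0), (0, -4), (0, 4)]
print(len(multigrid_nodes(8, 8, 2)))  # → 16
```

The template keeps its J = 4 nodes at every level; only the spacing of those nodes, and hence the spatial extent of the conditioning, grows with g.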

During simulation, all nodes simulated in the previous coarser grid are frozen, i.e. they are not revisited. Algorithm 6.9 describes the implementation of multiple grids in SNESIM.


Figure 6.13: Multi-grid search template (coarse and fine)

Algorithm 6.9 SNESIM with multiple grids

1: Choose the number NG of multiple grids
2: Start at the coarsest grid G^g, g = NG
3: while g > 0 do
4:   Relocate hard data to the nearest grid nodes in the current multi-grid
5:   Build a new template τ^g_J by re-scaling template τ_J
6:   Build the search tree Tr^g using the training image and template τ^g_J
7:   Simulate all nodes of G^g as in Algorithm 6.8
8:   Remove the relocated hard data from the current multi-grid if g > 1
9:   Move to the next finer grid G^(g−1) (let g = g − 1)
10: end while

Anisotropic template expansion

To get the g-th coarse grid, both the base search template and the simulation grid are expanded by a constant factor 2^(g−1) in all 3 directions. This expansion is thus ‘isotropic’, and is used as the default.

The expansion factor in each direction can also be made different. The g-th coarse grid G^g is then defined by retaining every f^g_x-th node, every f^g_y-th node and every f^g_z-th node in the X, Y and Z directions, respectively.

The corresponding search template τ^g_J is re-scaled as:

τ^g_J = {f^g · h_1, . . . , f^g · h_J},

where f^g = {f^g_x, f^g_y, f^g_z}. Note that the total number J of template nodes remains the same for all grids. This ‘anisotropic’ expansion calls for the expansion factors to be input through the SNESIM interface.

Let i = X, Y, Z and 1 ≤ g ≤ G (G ≤ 10); the requirements for the input anisotropic expansion factors are:

1. all expansion factors f^g_i must be positive integers;

2. the expansion factor for the finest grid must be 1 (f^1_i ≡ 1);

3. the expansion factor for the (g−1)-th multi-grid must be smaller than or equal to that for the g-th multi-grid (f^(g−1)_i ≤ f^g_i);

4. the expansion factor for the (g−1)-th multi-grid must be a factor of that for the g-th multi-grid (f^g_i mod f^(g−1)_i = 0).

For example, valid expansion factors for three multiple grids are:

    1 1 1        1 1 1        1 1 1
    2 2 1   or   4 2 2   or   3 3 1
    4 4 2        8 4 2        9 6 2

A sensitivity analysis of the anisotropic expansion parameters should be performed before any application.
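The four requirements above are easy to check programmatically. A sketch (the helper name and list-of-tuples input format are illustrative, not the SNESIM interface):

```python
def valid_expansion_factors(factors):
    """Check the 4 requirements on anisotropic expansion factors.
    `factors[g]` is (fx, fy, fz) for multi-grid g+1, finest grid first."""
    for g, f in enumerate(factors):
        for i in range(3):
            if not (isinstance(f[i], int) and f[i] > 0):
                return False            # requirement 1: positive integers
            if g == 0 and f[i] != 1:
                return False            # requirement 2: finest grid is 1
            if g > 0:
                prev = factors[g - 1][i]
                if prev > f[i]:
                    return False        # requirement 3: non-decreasing
                if f[i] % prev != 0:
                    return False        # requirement 4: divisibility
    return True

print(valid_expansion_factors([(1, 1, 1), (3, 3, 1), (9, 6, 2)]))  # → True
print(valid_expansion_factors([(1, 1, 1), (2, 2, 1), (5, 4, 2)]))  # → False (5 % 2 != 0)
```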

Marginal distribution reproduction

It is sometimes desirable that the histogram of the simulated variable be close to a given target distribution, e.g. the sample histogram. There is, however, no constraint in SNESIM as described in Algorithm 6.8 or Algorithm 6.9 to ensure that such a target distribution is reproduced. It is recommended to select a training image whose histogram is reasonably close to the target marginal proportions. SNESIM can correct the conditional distribution function read at each nodal location from the search tree (step 7 of Algorithm 6.8) to gradually gear the histogram of the values simulated so far towards the target.

Let p^c_k, k = 0, . . . , K − 1, denote the proportions of values in class k simulated so far, and p^t_k, k = 0, . . . , K − 1, denote the target proportions. Step 7 of Algorithm 6.8 is modified as follows:

1. Compute the conditional probability distribution as originally described in step 7 of Algorithm 6.8.

2. Correct the probabilities P(Z(u) = k | dev_J(u)) into:

P*(Z(u) = k | dev_J(u)) = P(Z(u) = k | dev_J(u)) + (ω / (1 − ω)) · (p^t_k − p^c_k),

where ω ∈ [0, 1) is the servosystem intensity factor. If ω = 0, no correction is performed. Conversely, as ω → 1, reproducing the target distribution entirely controls the simulation process, at the risk of failing to reproduce the training image geological structures.

If P*(Z(u) = k | dev_J(u)) ∉ [0, 1], it is reset to the closest bound. All updated probability values are then rescaled to sum up to 1:

P**(Z(u) = k | dev_J(u)) = P*(Z(u) = k | dev_J(u)) / Σ_{k′=0..K−1} P*(Z(u) = k′ | dev_J(u))

A similar procedure can be called on to reproduce a given vertical proportion curve for each horizontal layer. The vertical proportion curve should be provided in input as a 1D property with the numbers of nodes in the X and Y directions equal to 1, and the number of nodes in the Z direction equal to that of the simulation grid.
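The servosystem correction, clipping and renormalization can be sketched in a few lines (hypothetical helper name; `probs` is the ccdf read from the search tree):

```python
def servosystem_correct(probs, current, target, omega):
    """Correct a ccdf toward target proportions:
    P* = P + (omega / (1 - omega)) * (pt_k - pc_k),
    clipped to [0, 1], then rescaled to sum to 1."""
    w = omega / (1.0 - omega)
    corrected = [min(1.0, max(0.0, p + w * (t - c)))
                 for p, t, c in zip(probs, target, current)]
    total = sum(corrected)
    return [p / total for p in corrected]

# Category 1 is under-represented so far (10% simulated vs 30% target),
# so its probability is pushed up at the expense of category 0:
print(servosystem_correct([0.5, 0.5], current=[0.9, 0.1],
                          target=[0.7, 0.3], omega=0.5))
```

With ω = 0.5 the correction weight ω/(1 − ω) equals 1, so each probability is shifted by the full deficit p^t_k − p^c_k, giving approximately [0.3, 0.7] here.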

Soft data integration

Soft (secondary) data may be available to constrain the simulated realizations. Soft data are typically obtained by remote sensing techniques, such as seismic data, and often provide exhaustive but low-resolution information over the whole simulation grid. SNESIM can account for such secondary information. The soft data Y(u) must first be calibrated into probability data P(Z(u) = k | Y(u)), k = 0, . . . , K − 1, for the presence or absence of each category k at location u, where K is the total number of categories.

The Tau model (Journel, 2002) is used to integrate the conditional probabilities coming from the soft data and from the training image. The conditional distribution function of step 7 of Algorithm 6.8, P(Z(u) = k | dev_J(u)), is updated into P(Z(u) = k | dev_J(u), Y(u)) using the following formula:

P(Z(u) = k | dev_J(u), Y(u)) = 1 / (1 + x),   (6.4)

where x is calculated as

x/a = (b/a)^τ1 · (c/a)^τ2,   τ1, τ2 ∈ (−∞, +∞),   (6.5)

and a, b, c are defined as:

a = (1 − P(Z(u) = k)) / P(Z(u) = k)

b = (1 − P(Z(u) = k | dev_J(u))) / P(Z(u) = k | dev_J(u))

c = (1 − P(Z(u) = k | Y(u))) / P(Z(u) = k | Y(u))

P(Z(u) = k) is the target marginal proportion of category k. The two weights τ1 and τ2 account for the redundancy (Krishnan, 2004) between the soft data Y(u) and the local conditioning data event dev_J(u), respectively. The default values are τ1 = τ2 = 1.

Step 7 of Algorithm 6.8 is then modified as follows:

1. Estimate the probability P(Z(u) = k | dev_J(u)) as described in Algorithm 6.8.

2. Compute the updated probability P(Z(u) = k | dev_J(u), Y(u)) using Eq. 6.4.

3. Draw a realization from the updated distribution function.
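The Tau-model update of Eqs. 6.4-6.5 can be written directly from the definitions of a, b and c. A sketch (hypothetical helper name; all probabilities assumed strictly inside (0, 1)):

```python
def tau_update(p_marg, p_ti, p_soft, tau1=1.0, tau2=1.0):
    """Combine P(Z=k | dev_J) from the TI and P(Z=k | Y) from soft data
    via the Tau model: x/a = (b/a)**tau1 * (c/a)**tau2, P = 1/(1+x)."""
    a = (1.0 - p_marg) / p_marg    # distance to certainty of the prior
    b = (1.0 - p_ti) / p_ti        # ... of the TI-derived probability
    c = (1.0 - p_soft) / p_soft    # ... of the calibrated soft datum
    x = a * (b / a) ** tau1 * (c / a) ** tau2
    return 1.0 / (1.0 + x)

# With tau1 = tau2 = 1 and a non-informative soft datum
# (p_soft equal to the marginal), the TI probability is returned
# unchanged, as expected:
print(tau_update(p_marg=0.3, p_ti=0.6, p_soft=0.3))  # ≈ 0.6
```

A soft datum more favorable than the marginal (p_soft > p_marg) pulls the combined probability above the TI-derived value, and vice versa.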

Subgrid concept

As described earlier in section Single grid simulation (page 73), whenever SNESIM cannot find enough training replicates of a given data event dev_J, it drops the furthest node in dev_J and repeats the search until the number of replicates is ≥ cmin. This data-dropping procedure not only decreases the quality of pattern reproduction, but also significantly increases CPU cost.

The subgrid concept is proposed to alleviate the data-dropping effect. Fig. 6.14.1 shows eight contiguous nodes of a 3D simulation grid, also seen as the 8 corners of a cube (Fig. 6.14.2). Among them, nodes 1 and 8 belong to subgrid 1; nodes 4 and 5 belong to subgrid 2; all other nodes belong to subgrid 3. Fig. 6.14.3 shows the subgrid idea in 2 dimensions. The simulation is performed first over subgrid 1, then subgrid 2, and finally over subgrid 3. This subgrid simulation concept is applied to all multiple grids except the coarsest grid. Fig. 6.14.4 shows the 3 subgrids over the 2nd multi-grid, where ‘A’ denotes the 1st subgrid nodes, ‘B’ the 2nd subgrid nodes, and ‘C’ the 3rd subgrid nodes.


Figure 6.14: Subgrid concept: (1) 8 close nodes in a 3D grid; (2) the 8 nodes represented as the corners of a cube; (3) 4 close nodes in a 2D grid; (4) the 3 subgrids in the 2nd multi-grid

In the 1st subgrid of the g-th multi-grid, most of the nodes (of type A) have already been simulated in the previous coarser, (g+1)-th, multi-grid: with the default isotropic expansion, 80% of the nodes are already simulated in the previous coarser grid in 3D, and 100% of them in 2D. In that 1st subgrid, the search template is designed to use only the A-type nodes as conditioning data, hence the data event is almost full. Recall that the nodes previously simulated in the coarser grid are not re-simulated in this subgrid.

In the 2nd subgrid, all the nodes marked ‘A’ in Fig. 6.14.4 are now informed with a simulated value. In this subgrid, the search template τ_J is designed to use only the A-type nodes, but the conditioning includes in addition the J′ closest B-type nodes; the default is J′ = 4. In total, there are J + J′ nodes in the search template τ for that 2nd subgrid. The left plot of Fig. 6.15 shows the 2nd subgrid nodes and the template nodes for a simple 2D case with isotropic expansion: the basic search template of size 14 is marked by solid circles, and the J′ = 4 additional conditioning nodes are marked by dashed circles. Note that the data event captured by the basic template nodes (solid circles) is always full.

Figure 6.15: Simulation nodes and search template in subgrids (left: 2nd subgrid; right: 3rd subgrid)

When simulating over the 3rd subgrid, all nodes of both the 1st and the 2nd subgrids (of types A and B) are fully informed with simulated values. In that 3rd subgrid, the base template τ_J is designed to search only nodes of types A and B as conditioning data. Again, the J′ nearest nodes of type C in the current subgrid are used as additional conditioning data. In the right plot of Fig. 6.15, the basic search template for the 3rd subgrid is marked by solid circles, and the J′ = 4 additional conditioning nodes are marked by dashed circles.

The subgrid approach mimics a staggered grid, such that more conditioning data can be found during simulation in each subgrid; the conditioning data event is thus more consistent with the training image patterns, and the geological structures present in the training image are better reproduced in the simulated realizations. It is strongly recommended to use this subgrid concept for 3D simulation. Algorithm 6.9 is modified as shown in Algorithm 6.10.

Node re-simulation

The other solution for reducing the data-dropping effect is to re-simulate those nodes which were simulated with a number of conditioning data less than a given threshold. SNESIM records the number N_drop of data event nodes dropped during simulation. After simulation on each subgrid of each multiple grid, those nodes with N_drop larger than a threshold are de-allocated and pooled into a new random path; SNESIM is then performed again along this new random path. This post-processing technique (Remy, 2001) improves the simulated realization to a certain extent; in particular, it can only improve small-scale features.

Algorithm 6.10 SNESIM with multi-grids and the subgrid concept

1: for each subgrid s do
2:   Build a combined search template τ^s_(J,J′) = {h_i, i = 1, . . . , (J + J′)}
3: end for
4: Choose the number L of multiple grids to consider
5: Start with the coarsest grid G^g, g = L
6: while g > 0 do
7:   Relocate hard data to the nearest grid nodes in the current multi-grid
8:   for each subgrid s do
9:     Build a new geometrical template τ^(g,s)_(J,J′) by re-scaling template τ^s_(J,J′)
10:    Build the search tree Tr^(g,s) using the training image and template τ^(g,s)_(J,J′)
11:    Simulate all nodes of G^(g,s) as in Algorithm 6.8
12:   end for
13:   Remove the relocated hard data from the current multi-grid if g > 1
14:   Move to the next finer grid G^(g−1) (let g = g − 1)
15: end while

In the current SNESIM, a percentage P of the nodes along each of the original random paths are re-simulated, with P ∈ [0, 50%] being input through the SNESIM interface.

Accounting for local non-stationarity

Any training image should be reasonably stationary so that meaningful statistics can be inferred by scanning it. It is, however, possible to introduce some non-stationarity in the simulation by accounting for local rotation and local scaling of an otherwise stationary TI. SNESIM provides two approaches to handle such non-stationary simulation: (1) modify the training image locally; (2) use different training images. The first approach is presented in this section; the second is detailed in the next section, Region concept (page 84).

The simulation field G can be divided into several rotation regions, each region associated with a rotation angle. Let r_i (i = 0, . . . , N_rot − 1) be the rotation angle about the (vertical) Z-axis in the i-th region R_i, where N_rot is the total number of rotation regions, and R_0 ∪ · · · ∪ R_(N_rot−1) = G. In SNESIM the azimuth rotation is limited to rotation about the Z-axis, and the angle is measured in degrees, increasing clockwise from the Y-axis.

The simulation grid G can also be divided into a set of scaling regions, each region associated with scaling factors in the X/Y/Z directions. Let f^j = {f^j_x, f^j_y, f^j_z} (j = 0, . . . , N_aff − 1) be the scaling factors, also called affinity ratios, in the j-th region S_j, where N_aff is the number of scaling regions, S_0 ∪ · · · ∪ S_(N_aff−1) = G, and f^j_x, f^j_y, f^j_z are the affinity factors in the X/Y/Z directions, respectively. All affinity factors must be positive, i.e. ∈ (0, +∞). The smaller the affinity factor, the larger the extent of the geological structures in that direction. An affinity factor equal to 1 means no training image scaling. Note that in SNESIM the number N_rot of rotation regions and the number N_aff of scaling regions can be independent of one another, allowing rotation regions to overlap with scaling regions.

Given N_rot rotation regions and N_aff affinity regions, the total number of new training images after scaling and rotation is N_rot · N_aff. Correspondingly, one search tree must be constructed with template τ_J for each new training image TI_(i,j), defined as:

TI_(i,j)(u) = Θ_i · Λ_j · TI(u),

where u is a node of the training image, Θ_i is the rotation matrix for rotation region i, and Λ_j is the scaling matrix for affinity region j:

Θ_i = |  cos r_i   sin r_i   0 |
      | −sin r_i   cos r_i   0 |
      |     0         0      1 |

Λ_j = | 1/f^j_x     0        0    |
      |    0     1/f^j_y     0    |
      |    0        0     1/f^j_z |

N_rot = 1 corresponds to rotating the stationary TI globally, which can also be achieved by specifying a global rotation angle. Similarly, N_aff = 1 corresponds to a global scaling of the TI.
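The matrices Θ_i and Λ_j can be written directly from their definitions. A sketch using plain nested lists (illustrative helpers; angle in degrees, clockwise from the Y-axis as in SNESIM):

```python
import math

def rotation_matrix(azimuth_deg):
    """Theta_i: azimuth rotation about the (vertical) Z-axis."""
    r = math.radians(azimuth_deg)
    return [[math.cos(r),  math.sin(r), 0.0],
            [-math.sin(r), math.cos(r), 0.0],
            [0.0,          0.0,         1.0]]

def scaling_matrix(fx, fy, fz):
    """Lambda_j: affinity scaling; a factor f shrinks the structure
    extent in that direction by 1/f."""
    return [[1.0 / fx, 0.0,      0.0],
            [0.0,      1.0 / fy, 0.0],
            [0.0,      0.0,      1.0 / fz]]

def apply(m, u):
    """Apply a 3x3 matrix to a coordinate vector u."""
    return tuple(sum(m[r][c] * u[c] for c in range(3)) for r in range(3))

# A 90-degree azimuth maps the Y-axis direction onto the X-axis:
print(apply(rotation_matrix(90.0), (0.0, 1.0, 0.0)))
# An affinity factor of 2 in X halves the extent in X:
print(apply(scaling_matrix(2.0, 1.0, 1.0), (4.0, 4.0, 4.0)))  # → (2.0, 4.0, 4.0)
```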

The corresponding SNESIM algorithm is described in Algorithm 6.11.

This option can be very memory-demanding, as one new search tree has to be built for each scaling factor f^j and for each rotation angle r_i. Practice has shown that it is possible to generate fairly complex models using a limited number of regions: a maximum of 5 rotation regions and 5 affinity regions is sufficient in most cases.


Algorithm 6.11 SNESIM with locally varying azimuth and affinity

1: Define a search template τ_J
2: for each rotation region i do
3:   for each affinity region j do
4:     Construct a search tree Tr_(i,j) for training image TI_(i,j) using template τ_J
5:   end for
6: end for
7: Relocate hard data to the nearest simulation grid nodes
8: Define a random path visiting all locations to be simulated
9: for each location u along the path do
10:  Find the conditioning data event dev_J(u) defined by template τ_J
11:  Locate the region indices (i, j) of location u
12:  Retrieve the conditional probability distribution (ccdf) P(Z(u) = k | dev_J(u)) from the corresponding search tree Tr_(i,j)
13:  Draw a simulated value z(s)(u) from that conditional distribution and add it to the data set
14: end for

Region concept

The rotation and affinity concepts presented in the previous section account only for limited non-stationarity, in that the geological structures in the different subdomains are similar except for orientation and size. In more difficult cases, the geological structures may be fundamentally different from one zone or region to another, calling for different training images in different regions; see R1, R2 and R3 in Fig. 6.16. Also, parts of the study field may be inactive (R4 in Fig. 6.16), hence there is no need to perform SNESIM simulation at those locations, and the target proportion should be limited to only the active cells. The region concept allows such flexibility.

The simulation grid G is first divided into a set of subdomains (regions) G_i, i = 0, . . . , N_R − 1, where N_R is the total number of regions and G_0 ∪ · · · ∪ G_(N_R−1) = G. Normal SNESIM simulation is then performed in each active region with its specific training image and its own parameter settings. The regions can be simulated in any order, or simultaneously through a random path visiting all regions.

Except for the first region, simulation in one region can be conditioned to values previously simulated in other regions, so as to reduce discontinuities across region boundaries.

Figure 6.16: Simulation with the region concept: each region is associated with a specific TI.

The simulated result contains not only the property values in the current region, but also the property copied from the other conditioning regions. For instance in Fig. 6.16, when region 2 (R2) is simulated conditional to the property re1 in region 1 (R1), the simulated realization re1,2 contains the property in both R1 and R2. Next, the property re1,2 can be used as conditioning data to perform SNESIM simulation in region 3 (R3), which results in a realization over all active areas (R1+R2+R3).

Target distributions

SNESIM allows three kinds of target proportions: a global target proportion, a vertical proportion curve and a soft probability cube. Three indicators I1, I2, I3 are defined as follows:

I1 = 1 if a global target is given, 0 otherwise

I2 = 1 if a vertical proportion curve is given, 0 otherwise

I3 = 1 if a probability cube is given, 0 otherwise

There are in total 2^3 = 8 possible options. The SNESIM program proceeds as follows, according to the given option:

1. I1 · I2 · I3 = 1 [global target, vertical proportion curve and probability cube all given]: SNESIM ignores the global target and checks the consistency between the soft probability cube and the vertical proportion curve. If they are not consistent, a warning is printed in the background DOS console, and the program continues running without waiting for correction of the inconsistency. The local conditional probability distribution (ccdf) is updated first for the soft probability cube using the Tau model, then the servosystem is enacted using the vertical proportion values as target proportions for each layer.

2. I1 · I2 · (1 − I3) = 1 [global target, vertical proportion curve, no probability cube]: SNESIM ignores the global target and corrects the ccdf with the servosystem, using the vertical proportion values as target proportions for each layer.

3. I1 · (1 − I2) · I3 = 1 [global target, no vertical proportion curve, probability cube]: SNESIM checks the consistency between the soft probability cube and the global target proportion. If they are not consistent, a warning is printed and the program continues running without correction of the inconsistency. The ccdf is updated first for the soft probability cube using the Tau model, then the servosystem is enacted to approach the global target proportion.

4. I1 · (1 − I2) · (1 − I3) = 1 [global target, no vertical proportion curve, no probability cube]: SNESIM corrects the ccdf with the servosystem to approach the global target proportion.

5. (1 − I1) · I2 · I3 = 1 [no global target, vertical proportion curve, probability cube]: Same as case 1.

6. (1 − I1) · I2 · (1 − I3) = 1 [no global target, vertical proportion curve, no probability cube]: Same as case 2.

7. (1 − I1) · (1 − I2) · I3 = 1 [no global target, no vertical proportion curve, probability cube]: SNESIM takes the target proportions from the training image, then checks the consistency between the soft probability cube and those target proportions. If they are not consistent, a warning is printed and the program continues running without correction of the inconsistency. The ccdf is updated first for the soft probability cube using the Tau model, then the servosystem is enacted to approach the target proportions.

8. (1 − I1) · (1 − I2) · (1 − I3) = 1 [no global target, no vertical proportion curve, no probability cube]: SNESIM takes the target proportions from the training image, then corrects the ccdf with the servosystem to approach those proportions.
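The eight cases reduce to a simple precedence rule: the vertical proportion curve overrides the global target, which in turn overrides the training-image proportions, while a probability cube, when given, is always integrated via the Tau model. This reading of the cases can be sketched as (hypothetical helper, summarizing behavior only):

```python
def target_policy(has_global, has_vertical, has_cube):
    """Summarize SNESIM's handling of the 2**3 target-input options:
    which target the servosystem uses, and whether the ccdf is first
    updated with a probability cube via the Tau model."""
    if has_vertical:
        servo_target = "vertical proportion curve (per layer)"
    elif has_global:
        servo_target = "global target proportion"
    else:
        servo_target = "training image proportions"
    return {"tau_update_with_cube": has_cube, "servo_target": servo_target}

# Case 1: all three inputs given -- the global target is ignored:
print(target_policy(True, True, True))
# {'tau_update_with_cube': True, 'servo_target': 'vertical proportion curve (per layer)'}
```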

Parameters description

The SNESIM algorithm is activated from Simulation | snesimstd in the upper part of the algorithm panel. The main SNESIM interface contains 4 pages: ‘General’, ‘Conditioning’, ‘Rotation/Affinity’ and ‘Advanced’ (see Fig. 6.17). The SNESIM parameters are presented page by page in the following. The text inside ‘[ ]’ is the corresponding keyword in the SNESIM parameter file.

1. Simulation Grid Name [GridSelector Sim]: The name of the grid on which simulation is to be performed

2. Property Name Prefix [Property NameSim]: The name of the property to be simulated

3. # of Realizations [Nb Realizations]: Number of realizations to be simulated

4. Seed [Seed]: A large odd number to initialize the pseudo-random number generator

5. Training Image | Object [PropertySelector Training.grid]: The name of the grid containing the training image

6. Training Image | Property [PropertySelector Training.property]: The training image property, which must be a categorical variable with values between 0 and K − 1, where K is the number of categories

7. # of Categories [Nb Facies]: The number K of categories contained in the training image

8. Target Marginal Distribution [Marginal Cdf]: The target category proportions, which must be given in sequence from category 0 to category Nb Facies − 1. The sum of all target proportions must be 1


Figure 6.17: SNESIM main interface. A. General; B. Conditioning; C. Region; D. Advanced


9. # of Nodes in Search Template [Max Cond]: The maximum number J of nodes contained in the search template. The larger the value of J, the better the simulation quality if the training image is correspondingly large, but the more demanding the computer memory. Usually around 40 nodes in 2D, or 60 nodes in 3D with the multi-grid option, can create fairly good models

10. Search Template Geometry [Search Ellipsoid]: The ranges and angles defining the ellipsoid used to search for neighboring conditioning data. The search template τ_J is automatically built from the search ellipsoid by retaining the J closest nodes

11. Hard Data | Object [Hard Data.grid]: The grid containing the hard conditioning data. The hard data object must be a point set. The default input is ‘None’, which means no hard conditioning data are used

12. Hard Data | Property [Hard Data.property]: The property of the hard conditioning data, which must be a categorical variable with values between 0 and K − 1. This parameter is ignored when no hard conditioning data are selected

13. Use Probability Data Calibrated from Soft Data [Use ProbField]: This flag indicates whether the simulation should be conditioned to prior local probability cubes. If checked, SNESIM is performed conditional to the prior local probability information. The default is not to use soft probability cubes

14. Soft Data | Choose Properties [ProbField properties]: Selection of the soft probability data. One property must be specified for each category k. The property sequence is critical to the simulation result: the k-th property corresponds to P(Z(u) = k | Y(u)). This parameter is ignored if Use ProbField is set to 0. Note that the soft probability data must be given over the same simulation grid as defined in (1)

15. Tau Values for Training Image and Soft Data [TauModelObject]: Input two Tau parameter values: the first Tau value is for the training image, the second for the soft conditioning data. The default Tau values are ‘1 1’. This parameter is ignored if Use ProbField is set to 0

16. Vertical Proportion | Object [VerticalPropObject]: The grid containing the vertical proportion curve. This grid must be 1D: the numbers of cells in the X and Y directions must be 1, and the number of cells in the Z direction must be the same as that of the simulation grid. The default input is ‘None’, which means no vertical proportion data are used

17. Vertical Proportion | Choose Properties [VerticalProperties]: Select one and only one proportion for each category k. The property sequence is critical to the simulation result. This parameter is ignored when VerticalPropObject is ‘None’

18. Use Azimuth Rotation [Use Rotation]: The flag to use the azimuth rotation concept to handle non-stationary simulations. If checked (set to 1), the rotation concept is used. The default is unchecked

19. Use Global Rotation [Use Global Rotation]: Rotate the training image by a single azimuth angle. If checked (set to 1), a single angle must be specified in ‘Global Rotation Angle’

20. Use Local Rotation [Use Local Rotation]: Rotate the training image for each region. If selected, a rotation angle must be specified for each region in Rotation categories. Note that Use Global Rotation and Use Local Rotation are mutually exclusive

21. Global Rotation Angle [Global Angle]: The global rotation angle, given in degrees. The training image is rotated by that angle prior to simulation. This parameter is ignored if Use Global Rotation is set to 0

22. Property with Azimuth Rotation Categories [Rotation property]: The property containing the coding of the rotation regions; it must be given over the same simulation grid as defined in (1). The region code ranges from 0 to N_rot − 1, where N_rot is the total number of regions. The angles corresponding to the regions are specified in Rotation categories

23. Rotation Angles per Category [Rotation categories]: The angles, expressed in degrees, corresponding to each region. The angles must be given in sequence, separated by spaces. This parameter is ignored if Use Global Rotation is set to 0

24. Use Scaling [Use Affinity]: The flag to use the affinity concept to handle non-stationary simulations. If checked (set to 1), the affinity concept is used. The default is unchecked

25. Use Global Affinity [Use Global Affinity]: The flag indicating whether to scale the training image with the same constant factors in the X/Y/Z directions. If marked (set to 1), three affinity values must be specified in ‘Global Affinity Change’

26. Use Local Affinity [Use Local Affinity]: Scale the training image for each affinity region. If set to 1, three affinity factors must be specified for each region. Note that Use Global Affinity and Use Local Affinity are mutually exclusive

27. Global Affinity Change [Global Affinity]: Input three values (separated by spaces) for the X/Y/Z directions, respectively. If the affinity value in a given direction is f, then the category width in that direction is 1/f times the original width

28. Property with Affinity Changes Categories [Affinity property]: The property containing the coding of the affinity regions; it must be given over the same simulation grid as defined in (1). The region code ranges from 0 to Naff − 1, where Naff is the total number of affinity regions. The affinity factors are specified by Affinity categories

29. Affinity Changes for Each Category [Affinity categories]: Input the affinity factors in the table: one scaling factor for each X/Y/Z direction and for each region. The region index (the first column in the table) is the region indicator plus 1

30. Min # of Replicates [Cmin]: The minimum number of training replicates of a given conditioning data event that must be found in the search tree before retrieving the conditional probability. The default value is 1.

31. Servosystem Factor [Constraint Marginal ADVANCED]: A parameter (∈ [0, 1]) which controls the servosystem correction. The higher the servosystem parameter value, the better the reproduction of the target category proportions. The default value is 0.5.

32. Re-simulated Nodes Percentage [revisit nodes prop]: The proportion (∈ [0%, 50%]) of nodes to be re-simulated at each multigrid. The default value is 15%.

33. # of Multigrids [Nb Multigrids ADVANCED]: The number of multiple grids to consider in the multiple-grid simulation. The default value is 3.


34. Debug Level [Debug Level]: This option controls the output in the simulation grid. The larger the debug level, the more output from SNESIM:

• If 0, only the final simulation result is output (default value);

• If 1, a map showing the number of nodes dropped during simulation is also output;

• If 2, intermediate simulation results are output in addition to the outputs from options 0 and 1.

35. Use sub-grids [Subgrid choice]: The flag to divide the simulation nodes on the current multi-grid into three groups, simulated in sequence. It is strongly recommended to use this option for 3D simulation

36. Previously simulated nodes [Previously simulated]: The number of nodes in the current subgrid to be used for data conditioning. The default value is 4. This parameter is ignored if Subgrid choice is set to 0

37. Use Region [Use Region]: The flag indicating whether to use the region concept. If marked (set to 1), SNESIM simulation is performed with the region concept; otherwise it is performed over the whole grid

38. Property with Region Code [Region Indicator Prop]: The property containing the coding of the regions; it must be given over the same simulation grid as defined in (1). The region code ranges from 0 to NR − 1, where NR is the total number of regions

39. List of Active Regions [Active Region Code]: The list of regions to be simulated (several regions may be simulated simultaneously). When simulating multiple regions, the input region codes must be separated by spaces

40. Condition to Other Regions [Use Previous Simulation]: The option to perform region simulation conditional to data from other regions

41. Property of Previously Simulated Regions [Previous Simulation Pro]: The property simulated in the other regions. The property can be different from one region to another. See section Region concept (page 84)


42. Isotropic Expansion [expand isotropic]: The flag to use the isotropic expansion method for generating the series of cascaded search templates and multiple grids

43. Anisotropic Expansion [expand anisotropic]: The flag to use anisotropic factors for generating the series of cascaded search templates and multiple grids

44. Anisotropic Expansion Factors [aniso factor]: Input an integer expansion factor for each X/Y/Z direction and for each multiple grid in the given table. The first column of the table indicates the multiple-grid level: the smaller the number, the finer the grid. This option is not recommended for beginners

Examples

This section presents four examples showing how the SNESIM algorithm works with categorical training images, with or without data conditioning, for both 2D and 3D simulations.

1. Example 1: 2D unconditional simulation

Fig. 6.18(a) shows a channel training image of size 150 × 150. This training image contains four facies: mud background, sand channel, levee and crevasse. The facies proportions are 0.45, 0.2, 0.2 and 0.15, respectively. An unconditional SNESIM simulation is performed with this training image using a maximum of 60 conditioning data. The search template is isotropic in 2D with isotropic template expansion. Four multiple grids are used to capture the large-scale channel structures. Fig. 6.18(b) gives one SNESIM realization, whose facies proportions are 0.44, 0.19, 0.2 and 0.17, respectively. The channel continuity and the facies attachment sequence are reasonably well reproduced.

2. Example 2: 3D simulation conditioning to well data and soft seismic data

In this example, the large 3D training image of Fig. 6.19(a) is created with the object-based program ‘fluvsim’ (Deutsch and Tran, 2002). The dimension of this training image is 150 × 195 × 30, and the facies proportions are 0.66, 0.30 and 0.04 for mud background, sand channel and crevasse, respectively. The channels are oriented in the North-South direction with varying sinuosity and width.

The simulated field is of size 100 × 130 × 10. Two vertical wells, five deviated wells and two horizontal wells were drilled during an early production period. Those wells provide hard conditioning data at the well locations, see Fig. 6.19(b). One seismic survey was collected and calibrated, using the well hard data, into soft probability cubes for each facies, as shown in Fig. 6.19(c)-(e).

(a) Four-category training image (b) One SNESIM realization

Figure 6.18: Four facies training image and one SNESIM simulation (black: mud facies; dark gray: channel; light gray: levee; white: crevasse)

For the SNESIM simulation, 60 conditioning data nodes are retained in the search template. The three major axes of the search ellipsoid are of size 20, 20 and 5, respectively. The angles of azimuth, dip and rake are all zero. Four multiple grids are used with isotropic template expansion. The subgrid concept is adopted, with 4 additional nodes in the current subgrid for data conditioning. One SNESIM realization conditioned to both well hard data and seismic soft data is given in Fig. 6.19(f). This simulated field has channels oriented in the NS direction, with more sand facies in the high sand probability areas (light gray to white in Fig. 6.19(d)). The simulated facies proportions are 0.64, 0.32 and 0.04, respectively.

3. Example 3: 2D hard conditioned simulation with affinity and rotation regions

In this example, SNESIM is performed with scaling and rotation to account for local non-stationarity. The simulation field is the last layer of Fig. 6.19(b), which is divided into three affinity regions (Fig. 6.20(a)) and three rotation regions (Fig. 6.20(b)). For the channel training image of Fig. 6.20(c), which is the 4th layer of Fig. 6.19(a), the channel width in each affinity region (0, 1, 2) is extended by a factor of 2, 1 and 0.5, respectively; and the channel orientation in each rotation region (0, 1, 2) is 0°, −60° and 60°, respectively. Fig. 6.20(d) gives one SNESIM realization conditioned to the well data using both the affinity and rotation regions. The channel width varies from one region to another, and the channels between regions are well connected.

(a) Three-category training image (b) Well conditioning data
(c) Probability of mud facies (d) Probability of channel facies
(e) Probability of crevasse facies (f) One SNESIM realization

Figure 6.19: Three facies 3D training image (black: mud facies; gray: channel; white: crevasse), well hard data, facies probability cubes and one SNESIM realization. Graphs (c)-(f) are given for the same slices: X=12, Y=113, Z=4

(a) Affinity region (b) Rotation region (c) Three facies 2D training image (d) One SNESIM realization

Figure 6.20: Affinity and rotation regions (black: region 0; gray: region 1; white: region 2); three facies 2D training image and one SNESIM simulation (black: mud facies; gray: sand channel; white: crevasse)

4. Example 4: 2D simulation with soft data conditioning

In this last example, the simulation grid is again the last layer of Fig. 6.19(b). Both the soft data and the well hard data from that layer are used for data conditioning. Fig. 6.21(a) gives the mud probability field. Fig. 6.20(c) is used as the training image. The search template is isotropic with 60 nodes. Four multiple grids are retained with isotropic template expansion. SNESIM is run for 100 realizations. Fig. 6.21(c)-(e) present 3 realizations: the channels are well connected in the NS direction, and their locations are consistent with the soft probability data (see the dark area in Fig. 6.21(a) for the channel). Fig. 6.21(b) gives the mud facies probability obtained from the 100 simulated realizations; this E-type probability is consistent with the input mud probability of Fig. 6.21(a).

(a) Probability of mud facies (b) Experimental mud probability (c) SNESIM realization 12 (d) SNESIM realization 27 (e) SNESIM realization 78

Figure 6.21: Mud facies probability, experimental mud probability from 100 SNESIM realizations and three SNESIM realizations (black: mud facies; gray: sand channel; white: crevasse)


6.2.2 FILTERSIM

SNESIM is designed for modeling categories, e.g. facies distributions. It is limited by the number of categorical variables it can handle, it is memory-demanding when the training image is large with a large variety of different patterns, and it does not work for continuous variables. The mps algorithm FILTERSIM, called filter-based simulation (Zhang, 2006), has been proposed to circumvent these problems. The FILTERSIM algorithm is much less memory-demanding, at a reasonable CPU cost, and it can handle both categorical and continuous variables.

FILTERSIM utilizes linear filters to classify training patterns in a filter score space of reduced dimension. Similar training patterns are stored in a class characterized by an average pattern called the prototype. During simulation, the prototype closest to the conditioning data event is determined. A pattern from that prototype class is then drawn and pasted onto the simulation grid.

Instead of saving all training replicates faithfully in a search tree as SNESIM does, FILTERSIM only saves the central location of each training pattern in memory, hence reducing the RAM demand.

Filters and scores

A filter is a set of weights associated with a specific data configuration/template τJ = {u0; hi, i = 1, . . . , J}. Each node ui of the template is characterized by a relative offset vector hi = (x, y, z)i from the template center u0 and is associated with a specific filter value or weight fi. The offset coordinates x, y, z are integer values. For a J-node template, the associated filter is {f(hi); i = 1, . . . , J}. The filter configuration can be of any shape: Fig. 6.22.a shows an irregularly shaped filter and Fig. 6.22.b gives a cube-shaped filter of size 5 × 3 × 5.

A search template is used to capture patterns from a training image. The search template of FILTERSIM must be rectangular, of size (nx, ny, nz), where nx, ny, nz are odd positive integers. Each node of this search template is recorded by its relative offset to the centroid. Fig. 6.22.b shows a search template of size 5 × 3 × 5.

FILTERSIM requires that the filter configuration be the same as the search template, so that the filter can be applied to the training pattern centered at location u. The training pattern is then summarized by a filter score S_τ(u):

S_τ(u) = Σ_{i=1}^{n} f(h_i) · pat(u + h_i),   (6.6)

where pat(u + h_i) is the pattern nodal value. Fig. 6.22.c illustrates the process of creating a filter score with a 2D filter.

Figure 6.22: Filter and score: (a) a general template; (b) a cube-shaped template; (c) from filter to score.

Clearly, one filter is not enough to capture the essential information carried by a given training pattern. A set of F filters should be designed to capture the diverse characteristics of a training pattern. These F filters create a vector of F scores to represent the training pattern; Eq. 6.6 is rewritten as:

S^k_τ(u) = Σ_{i=1}^{n} f_k(h_i) · pat(u + h_i),   k = 1, . . . , F   (6.7)

Note that the pattern dimension is reduced from the template size n_x · n_y · n_z to F. For example, a 3D pattern of size 11 × 11 × 3 can be described by the 9 default filter scores proposed in FILTERSIM.
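As a concrete illustration of Eqs. 6.6-6.7, the sketch below scores a small 2D pattern with two filters. It assumes numpy arrays; the pattern, the filter weights and the function name are hypothetical choices for this illustration, not part of SGeMS.

```python
import numpy as np

def filter_scores(pattern, filters):
    # Eq. 6.7 sketch: each score is the sum of filter weights times
    # pattern nodal values over the template nodes.
    return np.array([np.sum(f * pattern) for f in filters])

# Toy 3x3 pattern with a vertical "channel" of 1's in the middle column.
pat = np.array([[0., 1., 0.],
                [0., 1., 0.],
                [0., 1., 0.]])
avg = np.full((3, 3), 1.0 / 9.0)          # crude averaging filter
grad_x = np.array([[-1., 0., 1.]] * 3)    # crude x-gradient filter
scores = filter_scores(pat, [avg, grad_x])
```

Here the averaging score equals the pattern mean (1/3), while the gradient score is zero because the pattern is symmetric in x.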

For a continuous training image (TI), the F filters are directly applied to the continuous values constituting each training pattern. For a categorical training image with K categories, the training image is first transformed into K sets of binary indicators I_k(u), k = 0, . . . , K − 1, u ∈ TI:

I_k(u) = { 1 if u belongs to the kth category; 0 otherwise }   (6.8)

A K-category pattern is thus represented by K sets of binary patterns, each indicating the presence/absence of a single category at a certain location. The F filters are applied to each of the K binary patterns, resulting in a total of F × K scores. A continuous training image can be seen as a special case of a categorical training image with a single category (K = 1).
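The indicator transform of Eq. 6.8 is straightforward to sketch; the function below (hypothetical name, numpy assumed) splits a categorical image into K binary maps.

```python
import numpy as np

def to_indicators(ti, K):
    # Eq. 6.8 sketch: I_k(u) = 1 where the training image equals
    # category k, and 0 elsewhere.
    return [(ti == k).astype(int) for k in range(K)]

ti = np.array([[0, 1, 1],
               [2, 1, 0]])
I = to_indicators(ti, K=3)
```

Since every node belongs to exactly one category, the K indicator maps sum to 1 at every node.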

Filters definition

FILTERSIM accepts two filter definitions: the default filters provided by FILTERSIM, and user-defined filters.

By default, FILTERSIM provides 3 filters (average, gradient and curvature) for each X/Y/Z direction, with the filter configuration identical to that of the search template. Let n_i be the template size in the i direction (i denotes either X, Y or Z), m_i = (n_i − 1)/2, and α_i = −m_i, . . . , +m_i the filter node offset in the i direction. The default filters are then defined as:

• average filter: f^i_1(α_i) = 1 − |α_i|/m_i ∈ [0, 1]

• gradient filter: f^i_2(α_i) = α_i/m_i ∈ [−1, 1]

• curvature filter: f^i_3(α_i) = 2|α_i|/m_i − 1 ∈ [−1, 1]

The default total is 6 filters for a 2D search template and 9 in 3D.
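The three default 1-D profiles can be generated directly from the formulas above; the sketch below (hypothetical helper, numpy assumed) builds them for one axis of odd size n.

```python
import numpy as np

def default_profiles(n):
    # n is the (odd) template size in one direction; m = (n - 1) / 2
    # and alpha = -m..+m are the node offsets along that axis.
    m = (n - 1) // 2
    alpha = np.arange(-m, m + 1, dtype=float)
    average   = 1.0 - np.abs(alpha) / m        # in [0, 1]
    gradient  = alpha / m                      # in [-1, 1]
    curvature = 2.0 * np.abs(alpha) / m - 1.0  # in [-1, 1]
    return average, gradient, curvature

avg, grad, curv = default_profiles(5)   # m = 2, alpha = -2..+2
```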

Users can also design their own filters and enter them in a data file. This option is not recommended for beginners. The filter data file must follow the format below (see Fig. 6.23):

• The first line must be an integer indicating the total number of filters included in the data file. Starting from the second line, the filter definitions are listed one by one.


• For each filter, the first line gives the filter name, which must be a string, and the weight associated with the filter score (this weight is used later for pattern classification). Each of the following lines lists the offset (x, y, z) of a template node and its associated weight f(x, y, z). The four numbers must be separated by spaces.

Although the geometry of the user-defined filters can be of any shape and any size, only those filter nodes lying within the search template are actually retained for the score calculation, to ensure that the filter geometry is the same as that of the search template. For those nodes in the search template but not in the filter template, FILTERSIM adds dummy nodes associated with a zero filter value. There are many ways to create the filters; Principal Component Analysis (PCA) (Jolliffe, 1986) is one alternative, refer to Zhang (2006) for more details.

Figure 6.23: Format of user-defined filter.

Pattern classification

Sliding the F filters over a K-category training image results in FK score maps, where each local training pattern is summarized by a vector of length FK in the filter score space. In general, FK is much smaller than the size of the filter template τ, hence the dimension reduction is significant.

Similar training patterns have similar FK scores. Hence, by partitioning the filter score space, similar patterns can be grouped together. Each pattern class is represented by a pattern prototype prot, defined as the point-wise average of all training patterns falling into that class. A prototype has the same size as the filter template, and is used as the pattern group ID.

For a continuous training image, a prototype associated with the search template τJ is calculated as:

prot(h_i) = (1/c) Σ_{j=1}^{c} pat(u_j + h_i),   i = 1, . . . , J   (6.9)

where h_i is the ith offset location in the search template τJ; c is the number of replicates within that prototype class; u_j (j = 1, . . . , c) is the center of a specific training pattern.
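Eq. 6.9 is a plain point-wise average; a minimal sketch (numpy assumed, names hypothetical):

```python
import numpy as np

def prototype(patterns):
    # Eq. 6.9 sketch: average the c replicates of a class node by node.
    return np.mean(patterns, axis=0)

pats = [np.array([[0., 1.],
                  [1., 1.]]),
        np.array([[0., 1.],
                  [0., 1.]])]
prot = prototype(pats)   # point-wise mean of the two patterns
```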

For a categorical variable, Eq. 6.9 is applied to each of the K sets of binary indicator maps obtained through Eq. 6.8. Hence a categorical prototype consists of K proportion maps, each map giving the probability of a certain category to prevail at a template location u_j + h_i:

prob(h_i) = {prot^k(h_i), k = 1, . . . , K},   (6.10)

where prot^k(h_i) = P(Z(u + h_i) = k).

For maximal CPU efficiency, a two-step partition approach is proposed:

1. group all training patterns into rough pattern clusters using a fast classification algorithm; these rough clusters are called parent classes. Each parent class is characterized by its own prototype;

2. partition those parent classes that contain too many and too diverse patterns, using the same classification algorithm. The resulting sub-classes are called children classes. These children classes may be further partitioned if they still contain too many and too diverse patterns. Each final child class is characterized by its own prototype of type Eq. 6.9.

For any class and corresponding prototype, the diversity is defined as the averaged filter variance:

V = (1/FK) Σ_{k=1}^{FK} ω_k · σ²_k   (6.11)


where:

ω_k ≥ 0 is the weight associated with the kth filter score, with Σ_{k=1}^{FK} ω_k = 1. For the default filter definition, ω_k is proportional to 3, 2 and 1 for the average, gradient and curvature filters, respectively. For user-defined filters, the ω_k value for each filter must be specified in the filter data file (Fig. 6.23);

σ²_k = (1/c) Σ_{i=1}^{c} (S^i_k − m_k)² is the variance of the kth score value over the c replicates;

m_k = (1/c) Σ_{i=1}^{c} S^i_k is the mean value of the kth score over the c replicates;

S^i_k is the kth filter score of the ith replicate defining the prototype.

The prototypes with diversity higher than a threshold and with too many replicates are further partitioned.
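The diversity of Eq. 6.11 can be sketched as follows (numpy assumed; the explicit normalization of the weights is an assumption made here to honor the stated constraint Σω_k = 1):

```python
import numpy as np

def diversity(scores, weights):
    # scores : (c, FK) array, one row of FK filter scores per replicate.
    # weights: FK non-negative weights omega_k, normalized to sum to 1.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    var = scores.var(axis=0)                 # sigma_k^2 over the c replicates
    return float(np.sum(w * var)) / len(w)   # Eq. 6.11: (1/FK) sum w_k var_k

s = np.array([[1., 2.],
              [3., 2.]])                     # two replicates, FK = 2 scores
V = diversity(s, [3., 1.])                   # default-style relative weights
```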

This two-step partition approach allows quickly finding the prototype closest to the data event. For example, without the two-step partition, it would take 3000 distance comparisons at each node to check 3000 prototypes (parents and children); with the two-step partition, it takes only 50 comparisons to find the best parent prototype among 50 parent prototype classes, then on average 60 comparisons to find the best child prototype, i.e. on average a total of 110 comparisons.

Partition method

Two classification methods are provided: cross partition (Zhang, 2006) and K-Mean clustering partition (Hartigan, 1975). The cross partition consists of partitioning each individual filter score independently into equal-frequency bins (see Fig. 6.24.a). Given a score space of dimension FK, if each filter score is partitioned into M bins (2 ≤ M ≤ 10), then the total number of parent classes is M^FK. However, because the filter scores are partitioned independently of one another, many of these classes will contain no training patterns. Fig. 6.25 shows the results of cross partition in a 2-filter score space using the proposed two-step approach, splitting parent classes into children.

The cross partition approach is fast; however, it is rough, resulting in many classes having few or no replicates.
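The equal-frequency cross partition can be sketched with quantile binning (numpy assumed; the function name and the tuple-of-bin-indices class label are illustration choices, not the SGeMS implementation):

```python
import numpy as np

def cross_partition(scores, M):
    # Bin each of the FK filter scores independently into M
    # equal-frequency bins; a pattern's parent class is the tuple of
    # its bin indices (at most M**FK classes, many possibly empty).
    c, FK = scores.shape
    labels = np.empty((c, FK), dtype=int)
    for k in range(FK):
        inner = np.linspace(0.0, 1.0, M + 1)[1:-1]     # interior quantiles
        edges = np.quantile(scores[:, k], inner)       # equal-frequency edges
        labels[:, k] = np.searchsorted(edges, scores[:, k], side='right')
    return [tuple(row) for row in labels]

rng = np.random.default_rng(0)
classes = cross_partition(rng.normal(size=(12, 2)), M=3)
```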

A better partition method using K-Mean clustering is also proposed: given an input number of clusters, the algorithm finds the optimal centroid of each cluster and assigns each training pattern to a cluster according to the distance between the pattern and the cluster centroids (see Fig. 6.24.b). K-Mean clustering is one of the simplest unsupervised learning algorithms; it creates better pattern groups, all with a reasonable number of replicates, but it is slow compared to the cross partition. Also, the number of clusters is critical to both the CPU cost and the final simulation results. Beginner users are not advised to use this option. Fig. 6.26 shows the results of K-Mean clustering partition in a 2-filter score space with the proposed two-step approach.

Figure 6.24: Two classification methods.

Single grid simulation

After creating the prototype list (for all parents and children) built from all the training patterns, one can proceed to generate simulated realizations.

The classic sequential simulation paradigm (Deutsch and Journel, 1998) is extended to pattern simulation. At each node u along the random path visiting the simulation grid G, a search template τ of the same size as the filter template is used to extract the conditioning data event dev(u). The prototype closest to that data event, based on some distance function, is found. Next, a pattern pat is randomly drawn from that closest prototype class and pasted onto the simulation grid G. The inner part of the pasted pattern is frozen as hard data and will not be revisited during simulation on the current (multiple) grid. The simple single-grid FILTERSIM approach is summarized in Algorithm 6.12.


Figure 6.25: Illustration of cross partition in a 2-filter score space. Each dot represents a local training pattern; the solid lines show the first parent partition (M = 3); the dashed lines give the secondary children partition (M = 2).

Figure 6.26: Illustration of K-Mean clustering partition in a 2-filter score space. Each dot represents a local training pattern; the solid lines show the first parent partition (M = 4); the dashed lines give the secondary children partition (M = 3).

Distance definition

A distance function is used to find the prototype closest to a given data event dev. The distance between dev and any prototype is defined as:

d = Σ_{i=1}^{J} ω_i · |dev(u + h_i) − prot(u_0 + h_i)|   (6.12)


Algorithm 6.12 Simple, single-grid FILTERSIM simulation

1: Create score maps with the given filters
2: Partition all training patterns into classes and prototypes in the score space
3: Relocate hard conditioning data onto the simulation grid G
4: Define a random path on the simulation grid G
5: for each node u in the random path do
6:   Extract the conditioning data event dev centered at u
7:   Find the parent prototype prot_p closest to dev
8:   if prot_p has a children prototype list then
9:     Find the child prototype prot_c closest to dev
10:    Randomly draw a pattern pat from prot_c
11:  else
12:    Randomly draw a pattern pat from prot_p
13:  end if
14:  Paste pat onto the realization being simulated, and freeze the nodes within a central patch
15: end for

where:

J is the total number of nodes in the search template τ;
ω_i is the weight associated with each template node;
u is the center node of the data event;
h_i is the node offset in the search template τ;
u_0 is the center node location of the prototype.

Three data types are distinguished: original hard data (d = 1), previously simulated values frozen as hard data (d = 2), and other values informed by pattern pasting (d = 3). The above weight ω_i is defined as

ω_i =
  W_1/N_1 : hard data (d = 1)
  W_2/N_2 : patch data (d = 2)
  W_3/N_3 : other (d = 3),

where W_d (d = 1, 2, 3) is the weight associated with data type d, and N_d is the number of nodes of data type d within the data event dev. It is required that W_1 + W_2 + W_3 = 1 and W_1 ≥ W_2 ≥ W_3, to emphasize the impact of the hard data and of the data frozen as hard (inner patch values).

Note that the data event dev may not be fully informed; only the informed nodes are retained for the distance calculation.
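A sketch of the weighted distance of Eq. 6.12 (numpy assumed; np.nan marks uninformed nodes, and the W values are example choices satisfying W1 ≥ W2 ≥ W3 and W1 + W2 + W3 = 1):

```python
import numpy as np

def distance(dev, prot, dtype, W=(0.5, 0.3, 0.2)):
    # dev, prot : flattened data event and prototype over the template.
    # dtype     : per-node data type, 1 = hard, 2 = patch, 3 = other.
    d = 0.0
    for t, Wd in zip((1, 2, 3), W):
        mask = (dtype == t) & ~np.isnan(dev)   # informed nodes only
        Nd = mask.sum()
        if Nd:
            d += (Wd / Nd) * np.abs(dev[mask] - prot[mask]).sum()
    return d

dev   = np.array([1.0, 0.0, np.nan, 1.0])
prot  = np.array([1.0, 1.0, 0.5,    0.0])
dtype = np.array([1,   1,   2,      3])
dist  = distance(dev, prot, dtype)
```

Note how the uninformed node (np.nan) contributes nothing, matching the remark above.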

Marginal distribution reproduction

It may be desirable that the mean value (for continuous variables) or the global proportions (for categorical variables) of a simulated realization be reasonably close to a target mean or target proportions. There is, however, no such constraint in FILTERSIM as described in Algorithm 6.12. To better reproduce the target, instead of randomly drawing a pattern from the closest prototype class, FILTERSIM introduces a servosystem intensity factor ω ∈ [0, 1), similar to that used in the SNESIM algorithm.

• For a continuous variable, let m_t be the target mean value and m_c the mean value simulated so far. Calculate the gear factor:

µ = (m_t/m_c)^{ω/(1−ω)}.   (6.13)

Before selecting a pattern from the closest prototype class prot, all patterns belonging to class prot are sorted in ascending order based on their inner-patch mean values. Those sorted patterns are numbered from 1 to N_p, where N_p is the total number of patterns falling into prototype class prot. If m_t > m_c, then µ > 1, which means a pattern with a high mean value should be selected. The larger the ω value, the stronger the control on this pattern selection.

A power function

f(i) = (i/N_p)^µ   (6.14)

is used as the probability function to select the pattern ID i within the sorted prototype class. µ > 1 means the current simulated mean is lower than the target, hence a pattern with a high mean value will be preferentially selected, see Fig. 6.27; vice versa for µ < 1. If ω = 0, then µ = 1, which corresponds to a uniform distribution, resulting in all patterns being drawn equi-probably, hence no control on the target mean value. If ω → 1, the target mean is better reproduced, at the risk of failing to reproduce the training image structures.

Steps 10 and 12 of Algorithm 6.12 are modified as in Algorithm 6.13.
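The gear-factor draw of Eqs. 6.13-6.14 can be sketched with inverse-CDF sampling, treating f(i) = (i/N_p)^µ as the cumulative selection probability; the inverse-CDF step and all names are assumptions of this illustration, not the SGeMS implementation.

```python
import random

def draw_pattern_id(Np, m_target, m_current, omega, rng=random):
    # Eq. 6.13: gear factor mu; mu > 1 favours high-mean patterns,
    # mu = 1 (omega = 0) gives a uniform draw over the sorted IDs 1..Np.
    mu = (m_target / m_current) ** (omega / (1.0 - omega))
    r = rng.random()
    return min(int(Np * r ** (1.0 / mu)) + 1, Np)  # inverse-CDF draw

random.seed(1)
ids = [draw_pattern_id(100, m_target=0.4, m_current=0.2, omega=0.5)
       for _ in range(1000)]
# mu = 2 here: patterns with high inner-patch means (large IDs) dominate.
```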


Figure 6.27: Pattern selection function for target reproduction

• For a binary variable, let p^c_0, p^c_1 denote the proportions of values simulated so far, and let p^t_0, p^t_1 denote the target category proportions. Because p^c_1 and p^t_1 are, respectively, the current mean value m_c and the target mean value m_t of the simulated realization, Eq. 6.13 and Algorithm 6.13 are used to select the best pattern for reproducing the target category proportions.

• For a categorical variable with more than 2 categories (K > 2), an index G(i, k) is defined to measure the ability of the ith pattern in the prototype class to tune the kth category proportion:

G(i, k) = |(N_s p^c_k + N_r q^i_k)/(N_s + N_r) − p^t_k|,   i = 1, . . . , N_p,   k = 0, . . . , K − 1

where

N_s is the number of nodes informed by either hard data or patched simulated values;
p^c_k is the proportion of values in class k simulated so far;
N_r is the number of nodes within the inner patch template;
q^i_k is the proportion of category k in pattern i;
p^t_k is the target proportion of class k;
N_p is the number of patterns in the prototype class.

Let H(i) = max {G(i, k); k = 0, . . . , K − 1}, and let V_1 = max {H(i)} and V_2 = min {H(i)} for i = 1, . . . , N_p. If V_2 = V_1, all patterns in the prototype class are drawn equi-probably during simulation. Next, define the gear factor as µ = (V_2/V_1)^{ω/(1−ω)} ≤ 1, and use Algorithm 6.13 to sample the best pattern from the prototype class for better proportion reproduction.
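The tuning index G(i, k) is simple arithmetic; below is a sketch with hypothetical numbers (two categories, a 100-node informed event of which 10 nodes form the inner patch):

```python
def tuning_index(Ns, Nr, p_current, q_pattern, p_target):
    # G(i, k) = |(Ns * p_k^c + Nr * q_k^i) / (Ns + Nr) - p_k^t|:
    # the mismatch with the target if pattern i were patched in.
    return [abs((Ns * pc + Nr * q) / (Ns + Nr) - pt)
            for pc, q, pt in zip(p_current, q_pattern, p_target)]

G = tuning_index(Ns=90, Nr=10,
                 p_current=[0.5, 0.5],
                 q_pattern=[0.2, 0.8],
                 p_target=[0.45, 0.55])
H = max(G)   # H(i) for this candidate pattern
```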

Algorithm 6.13 Servosystem control on target reproduction

1: Compute the gear factor µ (Eq. 6.13)
2: Sort the patterns within the closest prototype class
3: Use Eq. 6.14 to select the ‘best’ pattern for reproducing the target mean

Note that only the nodes within the inner patch template are retained for servosystem control.

Multiple grid simulation

Similar to the SNESIM algorithm, the multiple-grid simulation concept (Tran, 1994) is used to capture the large-scale structures of the training image with a large but coarse template τ. On the gth (1 ≤ g ≤ NG) coarse grid, the filters defined on the rescaled template τ^g are used to calculate the pattern scores. Sequential simulation proceeds from the coarsest grid to the finest grid. All nodes simulated on a coarser grid are re-simulated on the next finer grid.

The template is expanded isotropically, as described in the SNESIM algorithm (page 75). The FILTERSIM multiple-grid simulation is summarized in Algorithm 6.14.

Algorithm 6.14 FILTERSIM simulation with multiple grids

1: repeat
2:   For the gth coarse grid, rescale the geometry of the search template, the inner patch template and the filter template
3:   Create score maps with the rescaled filters
4:   Partition the training patterns into classes and corresponding prototypes
5:   Define a random path on the coarse simulation grid G^g
6:   Relocate hard conditioning data onto the current coarse grid G^g
7:   Perform simulation on the current grid G^g (Algorithm 6.12)
8:   If g ≠ 1, delocate hard conditioning data from the current coarse grid G^g
9: until all multiple grids have been simulated
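The template rescaling in step 2 of Algorithm 6.14 can be sketched as below, assuming the usual multiple-grid convention in which every template offset is multiplied by 2^(g−1) on the gth grid (this factor, and the convention that g = 1 is the finest grid, are assumptions of this illustration):

```python
def rescale_template(offsets, g):
    # On the g-th coarse grid the template offsets h_i are expanded
    # by 2**(g - 1); g = 1 leaves the template unchanged.
    f = 2 ** (g - 1)
    return [(x * f, y * f, z * f) for (x, y, z) in offsets]

base = [(1, 0, 0), (0, -1, 0), (0, 0, 2)]
coarse = rescale_template(base, g=3)   # every offset multiplied by 4
```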


Soft data integration

The FILTERSIM algorithm allows users to constrain simulations to soft data defined over the same simulation grid. When simulating a continuous variable, the soft data should be a spatial trend (locally varying mean, internal ref. ??) of the attribute being modeled; hence only one soft datum, with the same unit as that attribute, is allowed. For categorical training images, there is one soft datum per category. Each soft cube is a probability field indicating the presence/absence of a category at each simulation grid node u; hence there is a total of K probability cubes.

• The procedure for integrating soft data for a continuous variable is described in Algorithm 6.15. The soft data event sdev is used to fill in the data event dev: at any uninformed location u_j in the data event, the value is set to the soft data value at the same location (dev(u_j) = sdev(u_j)). Because the soft data contribute to the prototype selection, the choice of the sampled pattern is constrained by the local trend.

• For a categorical variable, the original training image has been internally transformed into K binary indicator maps (category probabilities) through Eq. 6.8, so each resulting prototype is a set of K probability templates (Eq. 6.10). At each simulation location u, the prototype closest to the data event dev is a probability vector prob(u). The same search template τ is used to retrieve the soft data event sdev(u) at location u. The Tau model (Journel, 2002) is used to integrate sdev(u) and prob(u) pixel-wise at each node u_j of the search template τ into a new probability cube dev* (see section Soft data integration in the SNESIM algorithm: page 78). The prototype closest to dev* is found, and a pattern is randomly drawn from it and pasted onto the simulation grid. The detailed procedure for integrating soft probability data for categorical attributes is presented in Algorithm 6.16.

Accounting for local non-stationarity

The same region concept as presented in the SNESIM algorithm is introduced here to account for local non-stationarity. It is possible to perform FILTERSIM simulation over regions, with each region associated with a specific training image and its own parameter settings. See section Region concept of the SNESIM algorithm (page 84) for details.


Algorithm 6.15 Data integration for a continuous variable
1: At each node u along the random path, use the search template τ to extract both the data event dev from the realization being simulated and the soft data event sdev from the soft data field
2: if dev is empty (no informed data) then
3: Replace dev by sdev
4: else
5: Use sdev to fill in dev at all uninformed data locations within the search template τ centered at u
6: end if
7: Use dev to find the closest prototype, and proceed to the simulation of node u

Algorithm 6.16 Data integration for a categorical variable
1: At each node u along the random path, use the search template τ to retrieve both the data event dev from the realization being simulated and the soft data event sdev from the input soft data field
2: if dev is empty (no informed data) then
3: Replace dev by sdev, and use the new dev to find the closest prototype
4: else
5: Use dev to find the closest prototype prot
6: Use the Tau model to combine the prototype prot and the soft data event sdev into a new data event dev∗ as the local probability map
7: Find the prototype closest to dev∗, and proceed to simulation
8: end if
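The fill-in step shared by both algorithms (steps 2-6 of Algorithm 6.15, and the analogous branch of Algorithm 6.16) can be sketched as stand-alone Python. The function name and the use of None to mark uninformed nodes are assumptions for illustration only.

```python
def fill_data_event(dev, sdev):
    """Fill the uninformed nodes of a data event with the co-located
    soft data values (steps 2-6 of Algorithm 6.15).

    dev and sdev are the data event and the soft data event retrieved
    with the same search template; None marks an uninformed node.
    """
    if all(v is None for v in dev):      # step 2: dev is empty
        return list(sdev)                # step 3: replace dev by sdev
    # step 5: set dev(uj) = sdev(uj) at every uninformed location uj
    return [s if d is None else d for d, s in zip(dev, sdev)]
```

The filled event is then used for prototype selection, so the local trend carried by the soft data constrains which pattern is sampled.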

Parameters description

The FILTERSIM algorithm can be invoked from Simulation | filtersim std in the upper part of the algorithm panel. Its main interface has 4 pages: ‘General’, ‘Conditioning’, ‘Region’ and ‘Advanced’ (see Fig. 6.28). The FILTERSIM parameters are presented page by page in the following. The text inside ‘[ ]’ is the corresponding keyword in the FILTERSIM parameter file.

1. simulation grid name [GridSelector Sim]: The name of the grid on which the simulation is to be performed


Figure 6.28: FILTERSIM main interface. A. General; B. Conditioning; C. Region; D. Advanced


2. property name prefix [Property NameSim]: The name of the property to be simulated

3. # of Realizations [Nb Realizations]: Number of realizations to be simulated

4. Seed [Seed]: A large odd number to initialize the random number generator

5. Training Image | Object [PropertySelector Training.grid]: The name of the grid containing the training image

6. Training Image | Property [PropertySelector Training.property]: The training image property, which must be a categorical variable whose values are between 0 and K − 1, where K is the number of categories

7. Search Template Dimension [Scan Template]: The size of the 3D template used to define the filters. The same template is used to retrieve training patterns and data events during simulation

8. Inner Patch Dimension [Patch Template ADVANCED]: The size of the 3D patch of simulated nodal values frozen as hard data during simulation

9. Continuous Variable [Is Contv]: A flag indicating that the current training image is a continuous variable. The variable type must be consistent with ‘Training Image | Property’

10. Categorical Variable [Is Catv]: A flag indicating that the current training image is a categorical variable. The variable type must be consistent with ‘Training Image | Property’. Note that Is Contv and Is Catv are mutually exclusive

11. Target Mean [Marginal Ave]: The mean value of the simulated attribute when working with a continuous variable

12. # of Facies [Nb Facies]: The total number of categories when working with a categorical variable. This number must be consistent with the number of categories in the training image

13. Target Marginal Distribution [Marginal Cpdf]: The target category proportions when working with a categorical variable, given in sequence from category 0 to category Nb Facies − 1. These proportions must sum to 1


14. Treat as Continuous Data for Classification [Treat Cate As Cont]: A flag to treat a categorical training image as a continuous variable for pattern classification (the simulation is still performed with the categorical variable). With this option, the F filters are applied directly to the training image without having to transform the categorical variable into K sets of binary indicators; hence the resulting score space is of dimension F instead of FK, and simulation is faster. Note that this training image coding will affect the simulation results

15. Hard Data | Object [Hard Data.grid]: The grid containing the hard conditioning data. The hard data object must be a point set. The default input is ‘None’, which means no hard conditioning data are used

16. Hard Data | Property [Hard Data.property]: The property of the hard conditioning data, which must be a categorical variable with values between 0 and K − 1. This parameter is ignored when no hard conditioning data are selected

17. Use Soft Data [Use SoftField]: A flag indicating whether the simulation should be conditioned to prior local soft data. If checked, FILTERSIM is performed conditional to the soft data. The default is not to use soft data

18. Soft Data | Choose Properties [SoftData properties]: Selection of the local soft data. For a continuous variable, only one soft conditioning property is allowed, which is treated as a locally varying mean. For a categorical variable, select one and only one property per category. The property sequence is critical to the simulation result: the kth property corresponds to the kth category. This parameter is ignored if Use SoftField is set to 0. Note that the soft data must be given over the same simulation grid as defined in (1)

19. Tau Values for Training Image and Soft Data [TauModelObject]: Two Tau parameter values: the first is for the training image, the second for the soft conditioning data. The default Tau values are ‘1 1’. This parameter is ignored if Use SoftField is set to 0. Note that TauModelObject only works for categorical variables

20. Use Region [Use Region]: A flag indicating whether to use the region concept. If checked (set to 1), FILTERSIM simulation is performed with the region concept; otherwise FILTERSIM simulation is performed over the whole grid


21. Property with Region Code [Region Indicator Prop]: The property containing the coding of the regions, which must be given over the same simulation grid as defined in (1). The region code ranges from 0 to NR − 1, where NR is the total number of regions

22. List of Active Regions [Active Region Code]: The region to be simulated, or regions when multiple regions are simulated simultaneously. When simulating multiple regions, the input region codes must be separated by spaces

23. Condition to Other Regions [Use Previous Simulation]: The option to perform region simulation conditional to data from other regions

24. Property of Previously Simulated Regions [Previous Simulation Pro]: The property simulated in the other regions. The property can be different from one region to another. See section Region concept (page 84)

25. Servosystem Factor [Constraint Marginal ADVANCED]: A parameter (∈ [0, 1]) controlling the servosystem correction. The higher the servosystem factor, the better the reproduction of the target mean value (continuous variable) or of the category proportions (categorical variable). The default value is 0.5

26. # of Multigrids [Nb Multigrids ADVANCED]: The number of multiple grids to consider in the multiple-grid simulation. The default value is 3

27. Min # of Replicates for Each Grid [Cmin Replicates]: A pattern prototype splitting criterion. Only those prototypes with more than Cmin Replicates replicates can be further divided. Input one Cmin Replicates value for each multiple coarse grid. The default value is 10 for each multigrid

28. Weights to Hard, Patch & Other [Data Weights]: The weights assigned to the different data types (hard data, patched data and all other data). The sum of these weights must be 1. The default values are 0.5, 0.3 and 0.2

29. Debug Level [Debug Level]: A flag controlling the output in the simulation grid. The larger the debug level, the more output from the FILTERSIM simulation:

• If 0, only the final simulation result is output (default value);
• If 1, the filter score maps of the finest grid are also output over the training image grid;
• If 2, the intermediate simulation results are output in addition;
• If 3, the map giving all parent prototype id numbers is output in addition.

30. Cross Partition [CrossPartition]: Perform pattern classification with the cross-partition method (default option)

31. Partition with K-mean [KMeanPartition]: Perform pattern classification with the K-means clustering method. Note that ‘Cross Partition’ and ‘Partition with K-mean’ are mutually exclusive

32. Number of Bins for Each Filter Score | Initialization [Nb Bins ADVANCED]: The number of bins for the parent partition when using cross partition; the default value is 4

33. Number of Bins for Each Filter Score | Secondary Partition [Nb Bins ADVANCED2]: The number of bins for the children partition when using cross partition; the default value is 2

34. Maximum Number of Clusters | Initialization [Nb Clusters ADVANCED]: The maximum number of clusters for the parent partition when using K-means partition; the default value is 200

35. Maximum Number of Clusters | Secondary Partition [Nb Clusters ADVANCED2]: The maximum number of clusters for the children partition when using K-means partition; the default value is 2

36. Distance Calculation Based on | Template Pixels [Use Normal Dist]: With this option, the distance is defined as the pixel-wise sum of differences between the data event values and the corresponding prototype values (Eq. 6.12). This is the default option

37. Distance Calculation Based on | Filter Scores [Use Score Dist]: With this option, the distance is defined as the sum of differences between the data event scores and the pattern prototype scores. This option has not yet been implemented in the current FILTERSIM version. Note that Use Normal Dist and Use Score Dist are mutually exclusive


38. Default Filters [Filter Default]: The option to use the default filters provided by FILTERSIM: 6 filters for a 2D search template and 9 filters for a 3D search template

39. User Defined Filters [Filter User Define]: The option to use the user's own filter definitions. Note that ‘Default’ and ‘User Defined’ are mutually exclusive

40. The Data File with Filter Definition [User Def Filter File]: A data file with the filter definitions (see Fig. 6.23). This parameter is ignored if Filter User Define is set to 0
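The ‘Template Pixels’ distance of item 36 can be sketched as stand-alone Python. This is an illustrative sketch only: the exact form of Eq. 6.12 is not reproduced here, the function name is made up, and the optional weights stand in for the data-type weights of item 28.

```python
def pixel_distance(dev, proto, weights=None):
    """Pixel-wise distance between a data event and a pattern prototype
    (the 'Template Pixels' option, item 36).

    Uninformed nodes (None) are skipped; weights, if given, reflect the
    data types of item 28 (hard, patched, other)."""
    if weights is None:
        weights = [1.0] * len(dev)
    return sum(w * abs(d - p)
               for d, p, w in zip(dev, proto, weights)
               if d is not None)
```

The prototype minimizing this distance is selected; a pattern is then drawn from it and its inner patch pasted onto the simulation grid.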

Examples

In this section, FILTERSIM is run for both unconditional and conditional simulations. The first three examples demonstrate the FILTERSIM algorithm with categorical training images; the last example illustrates the ability of the FILTERSIM algorithm to handle continuous variables.

1. Example 1: 2D unconditional simulation

The first example is a four-facies unconditional simulation using Fig. 6.18(a) as training image. The search template is of size 23 × 23 × 1, and the patch template is of size 15 × 15 × 1. The number of multiple grids is 3, and the minimum number of replicates for each multiple grid is 10. The pattern classification method is cross partition, with 4 bins and 2 bins for the two-step partition, respectively. The target proportions were set to the training image proportions, see Table 6.1. Fig. 6.29 shows two FILTERSIM realizations, which depict a reasonable training pattern reproduction. The facies proportions for these two realizations are given in Table 6.1. Compared to the SNESIM simulation of Fig. 6.18(b), the FILTERSIM algorithm appears to better capture the large-scale channel structures.

2. Example 2: 2D unconditional simulation with affinity and rotation

In this example, the region concept is used to account for local non-stationarity. The simulation field is of size 100 × 130 × 1, the same as used in the third example of the SNESIM algorithm. The 2D training image is given in Fig. 6.20(c). The affinity regions are given in Fig. 6.20(a), and the rotation regions in Fig. 6.20(b). The region settings are exactly the same as used in the third SNESIM example.


                          mud background   sand channel   levee   crevasse
training image                 0.45             0.20       0.20      0.15
SNESIM realization             0.44             0.19       0.20      0.17
FILTERSIM realization 1        0.51             0.20       0.17      0.12
FILTERSIM realization 2        0.53             0.18       0.18      0.11

Table 6.1: Facies proportions of the SNESIM and FILTERSIM simulations

Figure 6.29: Two FILTERSIM realizations using Fig. 6.18(a) as training image (black: mud facies; dark gray: channel; light gray: levee; white: crevasse)

FILTERSIM is performed with a search template of size 11 × 11 × 1, a patch template of size 7 × 7 × 1, three multiple grids and 4 bins for the parent partition. Fig. 6.30(a) shows one FILTERSIM simulation using the affinity regions only, which reflects the decreasing channel width from South to North without significant discontinuity across the region boundaries. Fig. 6.30(b) shows one FILTERSIM realization using only the rotation regions. Again, the channel continuity is well preserved across the region boundaries.

3. Example 3: 2D simulation conditioned to hard well data and soft data

A 2D three-facies FILTERSIM simulation is performed with soft data conditioning. The problem settings are exactly the same as those used in the fourth SNESIM example: the probability fields are taken from the last layer of Fig. 6.19(c)-(e), and the training image is given in Fig. 6.20(c). FILTERSIM is run for 100 realizations with a search template of size 11 × 11 × 1, a patch template of size 7 × 7 × 1,


(a) FILTERSIM simulation with affinity regions; (b) FILTERSIM simulation with rotation regions

Figure 6.30: FILTERSIM simulation with the affinity regions (Fig. 6.20(a)) and with the rotation regions (Fig. 6.20(b)). In both cases, Fig. 6.20(c) is used as training image. (black: mud facies; gray: sand channel; white: crevasse)

three multiple grids and 4 bins for the parent partition. Fig. 6.31(a)-(c) show three realizations, and Fig. 6.31(d) gives the E-type mud facies probability calculated from the 100 realizations. Fig. 6.31(d) is consistent with the soft conditioning data of Fig. 6.21(a). Recall that in FILTERSIM the soft data are used only for the distance calculation, not directly as a probability field.

4. Example 4: 3D simulation of continuous seismic data

In this last example, the FILTERSIM algorithm is used to enhance a 3D seismic image. Fig. 6.32(a) shows a 3D seismic image with a missing zone due to a shadow effect. The whole grid is of size 450 × 249 × 50, and the percentage of missing values is 24.3%. The goal here is to fill in the missing zone by extending the geological structures present in the neighboring areas. The north part of the original seismic image, of size 150 × 249 × 50, is retained as the training image; see the area in the small white rectangular box.

For the FILTERSIM simulation, the size of the search template is 21 × 21 × 7, the size of the patch template is 15 × 15 × 5, the number of multiple grids is 3, and the number of bins for the parent partition is 3. All available seismic data are used for hard conditioning. One FILTERSIM realization is given in Fig. 6.32(b). The simulation area is delineated by the white and black lines. The layering structures are seen to extend from the conditioning area into the simulation area, with the


Figure 6.31: Three FILTERSIM realizations (black: mud facies; gray: sand channel; white: crevasse) and the E-type mud probability from 100 FILTERSIM realizations: (a) realization 18; (b) realization 59; (c) realization 91; (d) experimental mud probability. The training image is given in Fig. 6.20(c)

horizontal large-scale structures reasonably well reproduced. For comparison, the two-point algorithm SGSIM is also used to fill in the empty area, with a variogram modeled from the same training image. Fig. 6.32(c) shows one SGSIM realization, in which the layering structures are lost.


Figure 6.32: FILTERSIM and SGSIM used to fill in the shadow zone of a 3D seismic cube: (a) simulation grid with conditioning data; (b) fill-in simulation with FILTERSIM; (c) fill-in simulation with SGSIM


Chapter 7

Scripting and Commands

7.1 Commands

Most of the tasks performed in SGeMS through the graphical interface, e.g. creating a new cartesian grid or running a geostatistics algorithm, can also be executed with a command line. For example, the following command:

NewCartesianGrid mygrid::100::100::10

creates a new cartesian grid called "mygrid" of dimensions 100 × 100 × 10. Commands can be entered in the Command Panel, which is not displayed by default. To display it, go to the View menu and select Commands Panel.

To execute a command, type it in the field under Run Command (referred to as the command line) and press Enter. The SGeMS Commands History tab shows a log of all commands performed, either from the graphical interface or from the command line. Commands appear in black, messages in blue, and warnings or errors in red. Commands can be copied from the log, pasted to the command line, edited and executed. Clicking the Run Command button will prompt for a file containing a list of commands to be executed. Each command in the file must start on a new line and be contained on a single line. Comments start with a # sign. Use the Help command (just type Help in the command line) to get a list of all available commands. SGeMS actually keeps two logs of the commands executed during a session: one is displayed in the SGeMS Commands History tab, the other is recorded to a file called "sgemshistory.log". This file is located in the directory where SGeMS was started. It contains only the commands: messages, warnings and errors are not recorded.
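A commands file could look as follows (a made-up example; the grid name and file name are hypothetical, and each command is documented in the list below):

```
# build a grid, display it and save a snapshot
NewCartesianGrid mygrid::100::100::10
DisplayObject mygrid
TakeSnapshot mygrid.png::PNG
```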


Commands are an easy way to automate tasks in SGeMS. They however offer limited flexibility: there are no control structures such as loops or tests. Hence, performing 20 runs of a given algorithm, each time changing a parameter, would require writing in the commands file the 20 commands corresponding to the 20 runs. If more powerful scripting capabilities are needed, one should turn to the other automation facility of SGeMS: Python scripts.

The following is a list of all SGeMS commands with their input parameters. Parameters in ‘[ ]’ are optional.

• Help: List all the commands available in SGeMS.

• ClearPropertyValueIf Grid::Prop::Min::Max: Set to not-informed all values of property Prop in grid Grid that fall in the range [Min, Max].

• CopyProperty GridSource::PropSource::GridTarget::PropTarget::Overwrite::isHardData: Copy property PropSource of grid GridSource to property PropTarget of grid GridTarget. If option Overwrite is set to true, the copy overwrites values already in PropTarget. Option isHardData sets the copied values as hard data.

• DeleteObjectProperties Grid::Prop1::Prop2...: Delete all the specified properties.

• DeleteObjects Grid1::Grid2...: Delete the specified objects.

• SwapPropertyToRAM Grid::Prop1::Prop2...: Swap the specified properties from the hard drive to random access memory (RAM).

• SwapPropertyToDisk Grid::Prop1::Prop2...: Swap the specified properties from random access memory (RAM) to the hard drive.

• LoadProject Filename: Load the specified project.

• SaveGeostatGrid Grid::Filename::Filter: Save the specified grid to a file. Filter specifies the data format: either ascii gslib or binary sgems.

• LoadObjectFromFile Filename: Load the object from the specified file.


• NewCartesianGrid Name::Nx::Ny::Nz[::SizeX::SizeY::SizeZ][::Ox::Oy::Oz]: Create a new cartesian grid with the specified geometry. The default cell size [SizeX, SizeY, SizeZ] is 1 and the default origin [Ox, Oy, Oz] is 0.

• RotateCamera x::y::z::angle: Rotate the camera.

• SaveCameraSettings Filename: Save the position of the camera into Filename.

• LoadCameraSettings Filename: Retrieve the camera position from Filename.

• ResizeCameraWindow Width::Height: Set the width and height of the camera window to Width and Height.

• ShowHistogram Grid::Prop[::NumberBins::LogScale]: Display the histogram of the specified property. The number of bins may be set with NumberBins; the default value is 20. The x axis can be changed to log scale by setting LogScale to true.

• SaveHistogram Grid::Prop::Filename[::Format][::NumberBins][::LogScale][::ShowStats][::ShowGrid]: Save the specified histogram into Filename with the format specified by Format; the default format is PNG. When ShowStats is set to true, the statistics are saved in the file; ShowGrid adds a grid to the histogram.

• SaveQQplot Grid::Prop1::Prop2::Filename[::Format][::ShowStats][::ShowGrid]: Save the QQ plot between Prop1 and Prop2 into Filename; the default format is PNG. When ShowStats is set to true, the statistics are saved in the file; ShowGrid adds a grid to the QQ plot.

• SaveScatterplot Grid::Prop1::Prop2::Filename[::Format][::ShowStats][::ShowGrid][::YLogScale::XLogScale]: Save the scatter plot between Prop1 and Prop2 into Filename; the default format is PNG. When ShowStats is set to true, the statistics are saved in the file; ShowGrid adds a grid to the scatter plot.


• DisplayObject Grid[::Prop]: Display Prop in the viewing window. When Prop is not specified, only the grid geometry is displayed.

• HideObject Grid: Remove Grid from the viewing window.

• TakeSnapshot Filename[::Format]: Take a snapshot of the current viewing window and save it into Filename. The default format is PNG.

• RunGeostatAlgorithm Parameters: Run the algorithm specified by Parameters.

7.2 Python script

SGeMS provides a powerful way of performing repetitive tasks by embedding the Python language. Useful information about Python (including a beginner's guide, tutorials and reference manuals) can be found at http://python.org/doc. SGeMS provides a Python extension module that allows Python to call SGeMS commands. The module is called sgems and offers three functions:

• execute('Command'): executes the SGeMS command Command.

• get_property('GridName', 'PropertyName'): returns a list containing the values of property PropertyName of object GridName.

• set_property('GridName', 'PropertyName', Data): sets the values of property PropertyName of object GridName to the values of list Data. If object GridName has no property called PropertyName, a new property is created.

The following is an example script that computes the logarithm of the values of property 'samples' of the grid named 'grid'. It then writes the logarithms to a new property called 'log_samples', displays that property and takes a snapshot in PNG format.

import sgems
from math import *

data = sgems.get_property('grid', 'samples')

for i in range(len(data)):
    if data[i] > 0:
        data[i] = log(data[i])
    else:
        data[i] = -9966699

sgems.set_property('grid', 'log_samples', data)
sgems.execute('DisplayObject grid::log_samples')
sgems.execute('TakeSnapshot log_samples.png::PNG')
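As an illustration of the loop that a commands file cannot express, the 20 runs mentioned in the previous section could be scripted as sketched below. This is only a sketch: the seed values are arbitrary, and the sgems.execute call is shown commented out because the sgems module exists only inside SGeMS and the algorithm parameters are not spelled out here.

```python
# Change the seed and re-run an algorithm 20 times.
base_seed = 14071789                             # any large odd number
seeds = [base_seed + 2 * i for i in range(20)]   # adding 2 keeps every seed odd
for seed in seeds:
    # Inside SGeMS, each run would be launched with something like:
    # sgems.execute('RunGeostatAlgorithm ...')   # parameters carrying this seed
    pass
```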


Bibliography

Castro, S. A., Caers, J., Mukerji, T., May 2005. The Stanford VI reservoir. Report 18 of Stanford Center for Reservoir Forecasting, Stanford, CA.

Deutsch, C. V., Journel, A. G., 1998. GSLIB: Geostatistical Software Library and User's Guide, 2nd Edition. Oxford University Press, New York.

Deutsch, C. V., Tran, T. T., 2002. FLUVSIM: a program for object-based stochastic modeling of fluvial depositional systems. Computers & Geosciences, 525–535.

Guardiano, F., Srivastava, R. M., September 1992. Multivariate geostatistics: beyond bivariate moments. In: Fourth International Geostatistics Congress, Troia.

Hartigan, J. A., 1975. Clustering Algorithms. John Wiley & Sons, New York.

Jolliffe, I. T., 1986. Principal Component Analysis. Springer-Verlag, New York.

Journel, A., 1992. Geostatistics: roadblocks and challenges. In: Soares, A. (Ed.), Geostatistics-Troia. Vol. 1. Kluwer Academic Publishers, pp. 213–224.

Journel, A. G., 2002. Combining knowledge from diverse sources: an alternative to traditional data independence hypotheses. Mathematical Geology 34 (5), 573–596.

Krishnan, S., 2004. Combining diverse and partially redundant information in the earth sciences. PhD thesis, Stanford University, Stanford, CA.

Remy, N., May 2001. Post-processing a dirty image using a training image. Report 14 of Stanford Center for Reservoir Forecasting, Stanford University, Stanford, CA.

Strebelle, S., 2000. Sequential simulation drawing structures from training images. PhD thesis, Stanford University, Stanford, CA.


Tran, T., 1994. Improving variogram reproduction on dense simulation grids. Computers

& Geosciences 20 (7), 1161–1168.

Zhang, T., 2006. Filter-based training pattern classification for spatial pattern simulation. PhD thesis, Stanford University, Stanford, CA.
