Fuzzy parameterization for analysis of natural phenomenon and use in other geophysical problems


Stanford Exploration Project Seminar
Stanford University, CA

12th December, 2008

Pritwiraj Moulik
Research Associate & Visiting Science student, Dept. of Earth Sciences, Univ. of Western Ontario, CANADA
Undergraduate student, Birla Institute of Technology & Science-Pilani, INDIA

Topics

• California fault system
  • Parkfield Earthquakes: Waveform modeling, GIS
  • Earthquake nucleation: Pattern Informatics
• Taiwan Landslides: Neuro-fuzzy framework
• Well log analysis
  • Prydz Bay: Fuzzy Inference system
  • Costa Rica convergent margin: Neuro-fuzzy framework
• Climate modeling
  • Monsoon Prediction
  • Paleo-climatic nonlinear time series analysis

Earthquakes: The unsolved questions…

• Location of earthquakes: nucleation
• Magnitude
• Self organization & Ergodicity?
• Precursory phenomena?

California fault system: Parkfield region

• Aims:
  • Characterize similarities in waveforms: GIS
  • Model the waveforms: Fuzzy membership functions
• Lessons learnt…
  • Similar magnitude and rupture extent: fault segmentation
  • Long-term non-randomness of earthquakes: remarkably similar in size and location of rupture, albeit not in epicentre or rupture propagation direction

Geospatial Analysis: Overview

• Filter and cluster the voluminous seismic data
• System constraints:
  • Earthquake parameters: events should share a similar faulting mechanism, magnitude and rupture direction, and should have occurred on the same fault segment or at the same epicenter. Lower variability may be achieved if events are further constrained to have the same rupture time history and distribution of slip.
  • Source and station characteristics: the geological setting of the station, the source and the path of propagation is also a major consideration.

Why Parkfield?

The 1934, 1966 and 2004 Parkfield earthquakes used to arrive at this model are remarkably similar in size and location of rupture, albeit not in epicenter or rupture propagation direction (Bakun & McEvilly, 1979; Bakun et al., 2005).

Details

• Earthquake data used: 1934, 1966 and 2004 Parkfield earthquakes [COSMOS]
• Data converted to Excel, then used in ArcMap 9.1
• Soil layer data: NRCS; DEM data: CGIAR-CSI
• System constraints used: hypocenter parameters, station/event parameters and sensor description

Geospatial Analysis: Results I

Type  Year  Station                       MUID   Area (sq. km)  Bedrock depth (m)  Soil profile (L1-L11)
I     1966  Cholame Array 2               CA501  53642          152                9;9;9;9;6;6;6;6;6;15;15
I     1966  Cholame Shandon Array 5 & 8   CA502  187777         150                9;9;9;16;16;16;16;16;16;15;15
II    2004  Vineyard Canyon               CA561  59483.1        140                6;6;6;6;6;12;12;12;7;15;15
II    2004  Jack Canyon                   CA344  623444         73                 6;6;3;3;3;15;15;15;15;15
II    1966  Temblor                       CA344  623444         73                 6;6;3;3;3;15;15;15;15;15
III   1966  San Luis Obispo               CA515  831475         89                 12;12;9;11;11;11;11;11;15;15;15
III   2004  San Luis Obispo               CA515  831475         89                 12;12;9;11;11;11;11;11;15;15;15
III   2004  Hollister; Airport Building   CA568  129241         152                12;12;12;12;12;12;12;12;12;15;15
IV    1966  Taft, Lincoln School Tunnel   CA347  2809850        152                3;3;3;3;3;3;3;3;3;15;15
IV    1966  Cholame Array 12              CA503  23115.4        141                3;3;3;3;3;3;6;6;16;15;15
IV    2004  Fresno; VA Medical Center     CA309  41103.9        147                3;3;3;3;3;7;16;16;16;15;15
IV    2004  Fresno; NAMP USGS Office      CA307  2262820        152                3;3;3;3;3;3;3;3;3;15;15
IV    2004  Parkfield; Eades              CA503  29183.2        141                3;3;3;3;3;3;6;6;16;15;15
IV    2004  Coalinga; Fire Station        CA346  394110         152                3;3;3;6;6;6;6;6;6;15;15
V     2004  Hollister; City Hall          CA548  172888         152                9;9;9;9;9;9;16;16;16;15;15
V     2004  Joaquin Canyon                CA558  531502         63                 9;9;9;9;9;11;15;15;15;15;15
V     2004  Donna Lee                     CA558  531502         63                 9;9;9;9;9;11;15;15;15;15;15
V     2004  Parkfield; Froelich           CA502  52690.8        150                9;9;9;16;16;16;16;16;16;15;15
VI    2004  Middle Mountain               CA555  49159.6        82                 8;8;8;8;8;8;8;15;15;15;15
VI    2004  Parkfield; Gold Hill          CA505  529707         82                 8;8;8;8;8;9;16;15;15;15;15
VI    2004  Hog Canyon                    CA555  286145         82                 8;8;8;8;8;8;8;15;15;15;15
VI    2004  Work Ranch                    CA505  686713         82                 8;8;8;8;8;9;16;15;15;15;15
VI    2004  Parkfield; Red Hills          CA505  529707         82                 8;8;8;8;8;9;16;15;15;15;15
VI    2004  Parkfield; UPSAR (1-3,5-13)   CA555  286145         82                 8;8;8;8;8;8;8;15;15;15;15

List of stations grouped into six types.

Geospatial Analysis: Results

• City Recreation Bldg-864 Santa Rosa, San Luis Obispo had records of both earthquakes.
• Comparative analysis of both earthquakes: average S.D. = 0.00693728 cm/s^2, visual similarity

Algorithm System: Basic Structure

Flowchart blocks:

• Clustered Seismic Database
• Data processing
• Filter magnitude < threshold
• Fuzzy Segment Architecture
• Incoming acceleration
• If following P-S region thresholds for any magnitude, retrieve data
• Calculate grade for every magnitude
• Calculate cumulative membership grade for time interval
• Check threshold
• Alarm
• Undetected quake

Algorithm : Process I (Clustering)

• Input: seismic data identified by geospatial analysis
• Aim: to model the general waveform pattern from an active seismic zone
• Three processes:
  • Clustering
  • Membership function development
  • Evolutionary algorithm

Details

• A graph, specific to an instant from the onset of P waves, is plotted between acceleration and magnitude of the corresponding earthquake.
• The clustering algorithm used in the process is Ward’s method.
• The process is repeated to find curves for every instant in the P-S interval.
• Input: the incoming value of acceleration at that station is fed as input
• Output: earthquake magnitudes and corresponding membership grades

Algorithm : Process I (Clustering)

• Only magnitudes whose corresponding membership grade is greater than 0.8 are considered
• Grades are cumulated from t1 to t2
• The cumulative grade must be above a threshold: 0.73, tested for the earthquakes
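The thresholding just described can be sketched in Python. This is only an illustration under stated assumptions, not the implementation used in the study: the function name `alarm_magnitudes`, the input layout (one dictionary of membership grades per time sample), and the choice to normalise the cumulative grade by the number of samples between t1 and t2 (so that it stays comparable to the 0.73 threshold) are all hypothetical.

```python
def alarm_magnitudes(grades_by_sample, grade_floor=0.8, alarm_threshold=0.73):
    """Return {magnitude: cumulative grade} for magnitudes that raise an alarm.

    grades_by_sample: one dict per time sample between t1 and t2, mapping a
    candidate magnitude to its membership grade (hypothetical layout).
    """
    totals = {}
    for grades in grades_by_sample:
        for magnitude, grade in grades.items():
            # Keep only grades above the floor (0.8 in the slides).
            if grade > grade_floor:
                totals[magnitude] = totals.get(magnitude, 0.0) + grade
    n_samples = len(grades_by_sample)
    # ASSUMPTION: the cumulative grade is averaged over the interval.
    cumulative = {m: total / n_samples for m, total in totals.items()}
    # Alarm only when the cumulative grade exceeds the threshold (0.73).
    return {m: g for m, g in cumulative.items() if g > alarm_threshold}


# Made-up grades for two candidate magnitudes over three samples.
samples = [
    {5.0: 0.90, 6.0: 0.50},
    {5.0: 0.85, 6.0: 0.82},
    {5.0: 0.95, 6.0: 0.40},
]
alarms = alarm_magnitudes(samples)
```

With these made-up grades only magnitude 5.0 stays above both thresholds, so it is the only key in `alarms`.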

Limitations

• Data dependent
• Membership function development: computationally intensive

The Pattern Informatics Method

• The PI index is an analytical method for quantifying the spatiotemporal seismicity rate changes in historic seismicity (Tiampo et al., 2002).
• The observed seismicity rate $\psi_{obs}(x_i,t)$ is a proxy for the energy release: the number of earthquakes per unit time with $M > M_{cutoff}$ within the box centred at $x_i$ at time $t$.
• The average seismicity function $S(x_i,t_0,t)$ over the time interval $(t-t_0)$ is defined as:

$$S(x_i,t_0,t) = \frac{1}{t-t_0}\int_{t_0}^{t} \psi_{obs}(x_i,t')\,dt'$$

• The mean-zero, unit-norm function, obtained by deducting the average and dividing by the standard deviation, is defined thereafter as:

$$\hat{S}(x_i,t_0,t) = \frac{S(x_i,t_0,t) - \langle S(x_i,t_0,t)\rangle}{\lVert S(x_i,t_0,t)\rVert}$$

• Physically, the important changes in seismicity are given by:

$$\Delta\hat{S}(x_i,t_1,t_2) = \hat{S}(x_i,t_0,t_2) - \hat{S}(x_i,t_0,t_1)$$

• The final calculation involves averaging over all the base years, $t_0$, to reduce the effects of noise.
• The PI index, from which the mean $\mu_P$ representing the time-independent background is subtracted, is denoted by:

$$\Delta P(x_i,t_1,t_2) = \{\Delta\hat{S}(x_i,t_1,t_2)\}^2 - \mu_P$$
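The PI steps listed above can be sketched on a discrete grid in a few lines of Python. This is a minimal illustration rather than the published code: the names `normalized_rate` and `pi_index`, the use of equally spaced time steps in place of the integral, the reading of the average and standard deviation as taken across boxes, and the synthetic input are all assumptions.

```python
from statistics import fmean, pstdev


def normalized_rate(rates, i0, i1):
    """Time-average each box over steps i0..i1, then standardise across boxes.

    rates[t][b] is the seismicity rate of box b at time step t.
    ASSUMPTION: mean and standard deviation are taken over the boxes.
    A spatially uniform rate (zero deviation) is not handled here.
    """
    n_boxes = len(rates[0])
    s = [fmean(rates[t][b] for t in range(i0, i1 + 1)) for b in range(n_boxes)]
    mu, sigma = fmean(s), pstdev(s)
    return [(value - mu) / sigma for value in s]


def pi_index(rates, i1, i2):
    """PI index per box for the change interval between steps i1 and i2."""
    n_boxes = len(rates[0])
    mean_change = [0.0] * n_boxes
    # Average the change in normalised rate over all base steps t0 < t1.
    for i0 in range(i1):
        late = normalized_rate(rates, i0, i2)
        early = normalized_rate(rates, i0, i1)
        for b in range(n_boxes):
            mean_change[b] += (late[b] - early[b]) / i1
    squared = [c * c for c in mean_change]
    mu_p = fmean(squared)  # background term subtracted in the last step
    return [p - mu_p for p in squared]


# Synthetic example: 30 time steps, 50 boxes, a static background pattern,
# and a rate increase in box 7 during the last 10 steps.
rates = [[1.0 + (b % 3) for b in range(50)] for _ in range(30)]
for t in range(20, 30):
    rates[t][7] += 50.0
dp = pi_index(rates, 19, 29)
hotspot = max(range(50), key=lambda b: dp[b])
```

In this synthetic case the box with the injected rate change is returned as `hotspot`, and the values in `dp` sum to zero because the mean is removed in the last step.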

Pertinent questions…

• Optimal temporal regions?
• Magnitude of forecasted earthquake?
• Cutoff magnitude to filter?
• Threshold PI for hotspots?

Target magnitude & Cutoff magnitude

The success of a forecast is based on maximizing the fraction of earthquakes that occur in alarm cells and minimizing the fraction of alarm cells that do not result in earthquakes.
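The two fractions named in this success criterion can be computed as in the sketch below. The function name `forecast_scores` and the representation of cells as (row, column) tuples are hypothetical choices made for illustration only.

```python
def forecast_scores(alarm_cells, quake_cells):
    """Return (hit fraction, false-alarm fraction) for a binary forecast.

    alarm_cells: cells flagged as hotspots.
    quake_cells: one cell per target earthquake (a cell may repeat).
    """
    alarm_cells = set(alarm_cells)
    # Fraction of earthquakes that fall inside an alarm cell.
    hits = sum(1 for cell in quake_cells if cell in alarm_cells)
    hit_fraction = hits / len(quake_cells)
    # Fraction of alarm cells in which no earthquake occurred.
    empty_alarms = alarm_cells - set(quake_cells)
    false_alarm_fraction = len(empty_alarms) / len(alarm_cells)
    return hit_fraction, false_alarm_fraction


# Made-up example: four alarm cells, four earthquakes.
alarms = {(0, 0), (0, 1), (1, 1), (2, 2)}
quakes = [(0, 0), (0, 0), (1, 1), (3, 3)]
hit_fraction, false_alarm_fraction = forecast_scores(alarms, quakes)
```

Here three of the four earthquakes fall in alarm cells and two of the four alarm cells stay empty, so a good choice of target and cutoff magnitude would push the first number up and the second down.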

A closer look…

Threshold PI for identifying hotspots

Identify optimal temporal regions in a catalog

• The TM fluctuation metric measures effective ergodicity, or the difference between the time average of a quantity and its ensemble average over the entire system (Thirumalai et al., 1989).
• Identify the regions of parameter space which exhibit stationary nature and thereby give an optimal forecast (Tiampo et al., 2003, 2007).
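As a rough illustration of the metric, the sketch below computes, at every time step, the mean squared difference between each box's running time average and the average of those time averages over all boxes. The function name `tm_metric` and the discrete, equally spaced sampling are assumptions; the quantity used in practice (for example a seismicity rate per box) is not specified in the slides.

```python
def tm_metric(series):
    """TM fluctuation metric at each time step.

    series[i][k] is the observed quantity in box i at time step k.
    Returns a list omega, where omega[k] is the variance across boxes of
    the time averages taken over steps 0..k.
    """
    n_boxes, n_steps = len(series), len(series[0])
    running = [0.0] * n_boxes
    omega = []
    for k in range(n_steps):
        for i in range(n_boxes):
            running[i] += series[i][k]
        time_avg = [total / (k + 1) for total in running]
        ensemble_avg = sum(time_avg) / n_boxes
        omega.append(sum((a - ensemble_avg) ** 2 for a in time_avg) / n_boxes)
    return omega


# Two boxes that alternate out of phase: the time averages converge
# towards the ensemble average, so the metric decays.
omega = tm_metric([[1, 0, 1, 0], [0, 1, 0, 1]])
```

For a system that is effectively ergodic over some time span, the metric decays over that span, which is one way to pick out candidate stationary regions of a catalog.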

Optimal forecast for California

• Bin size, dX = 0.1
• Target forecasting magnitude, Mtarget = 5.1
• Threshold PI for binary forecast = 0 for the used bin size
• Catalog magnitude cutoff, Mc = 3.1
• tb = 1932, t1 = 1968, t2 = 1986, t3 = 2004, where t2-t3 is the forecasting interval

Ongoing work…

• Inversion model for forecasting magnitudes of future earthquakes
• Rupture area from PI (Tiampo, 2007)
• Fault segmentation
• PI value of the hotspot

Landslide prediction

• Aim: to formulate and validate a neuro-fuzzy framework and compare it with other empirical approaches
• Study area:
  • Taiwan: circum-Pacific seismic belt
  • Fractured rock mass along highways
  • Heavy rainfall
• Previous work (Lee et al., 1996; Lu, 2001; Chang, 2005):
  • Landslides occurred typically in weathered soils at low elevation
  • Landslides happened at different:
    1. Slope grades
    2. Slope heights
    3. Slope shapes
    4. Geological formations

Framework synopsis

• Parameters
  • Topographic
    • Grade
    • Height
    • Aspect
    • Shape
  • Geological
    • Formation
    • Thickness of soil layer

Geological Formation    Outcome
Kuantaoshan Sandstone   type 1
Tawo Sandstone          type 2
Cholan Formation        type 3
Nanchuang Formation     type 4
Shihliufen Shale        type 5
Chinshui Shale          type 6

Correlation coefficients

Influencing Factor

I II III IV V VI VII VIII IX

I 1.000 -0.016 -0.059 -0.103 -0.047 -0.380 -0.026 -0.096 -0.025

II -0.016 1.000 0.089 0.108 0.070 0.001 -0.026 -0.096 -0.025

III -0.059 0.089 1.000 0.266 0.173 0.102 0.123 -0.074 0.152

IV -0.103 0.108 0.266 1.000 0.246 0.209 0.151 -0.240 0.130

V -0.047 0.070 0.173 0.173 1.000 0.209 0.215 -0.400 -0.014

VI -0.380 0.001 0.102 0.102 0.209 1.000 0.144 -0.098 0.034

VII -0.026 -0.026 0.123 0.123 0.215 0.144 1.000 0.020 0.010

VIII -0.096 -0.096 -0.074 -0.074 -0.400 -0.098 0.020 1.000 -0.056

IX -0.025 -0.025 0.152 0.152 -0.014 0.034 0.010 -0.056 1.000

Results

Frequency distribution of output (1-Landslide, 0-No landslide)

• ANN (80.62%)
• MSA (75.29%)
• Neuro-Fuzzy (86.47%)

Objective and the studied region

• The identification of groundwater, oil and gas formation lithology from well log data largely depends on expert experience and some subjective rules: “if the natural gamma ray reading is high and the separation between shallow formation resistivity and deep formation resistivity is small, then the formation lithology is probably shale” (Chapellier, 1992).
• The well logging data from ODP Leg 188 borehole sites 1166A and 1165C were taken as the case study for the present work (O’Brien et al., 2001).

Modeling Parameters

• Input variables used:
  • Porosity
  • Gamma ray
  • Bulk density
  • Transit time interval
  • Resistivity difference
• Linguistic terms: very low (VL), low (L), medium (M), high (H), and very high (VH)
• Output variables: sand (%), gravel (%) and major soil component size (MSCS), where H -> clay, M -> silt, and L -> sand
• Characterization of diamicts, gravels/conglomerates and breccias modified after Moncrieff (1989)

Linguistic term of output variable   Grain size range of matrix (cm)   Reference boundary of linguistic term
Sand                                 2^0 > AGS > 2^-4                  [0, -4]
Silt                                 2^-4 > AGS > 2^-8                 [-4, -8]
Clay                                 2^-8 > AGS > 2^-12                [-8, -12]

Input & Output trapezoidal membership functions

Rule base: if the inputs (POR, GR, DEN, ΔT, ΔR) are as listed, the outputs (% Sand, % Gravel, MSCS) are as listed, with the given weight.

POR  GR   DEN  ΔT   ΔR      % Sand  % Gravel  MSCS  Weight
VL   M    VH   M    NA      M       M         H     0.8
L    M    VH   M    NA      H       M         H     1
VL   M    H    NA   L       H       M         M     0.8
NA   L    H    M    L       M       L         M     1
L    M    NA   H    VL      M       VL        M     1
VL   M    NA   L    L       L       VL        M     1
M    M    NA   NA   VL      L       VL        L     1
H    L    NA   H    VL      L       VL        L     1
H    L    NA   H    VL      L       VL        L     1
H    M    NA   M    H       VH      VL        H     1
VH   L    NA   H    VH      H       VL        H     1
NA   L    NA   L    L       VH      VL        H     1
NA   M    H    NA   VL      L       VL        L     1
NA   H    H    NA   VL      L       VL        L     1
NA   VH   H    NA   VL      L       VL        L     1
VL   NA   NA   NA   VL      L       VL        L     1
VL   M    VH   NA   VL      H       M         H     1
VL   M    VH   NA   L       H       M         H     1
NA   L    NA   NA   Not L   VH      VL        H     0.8
H    NA   M    NA   NA      VH      VL        H     0.8
VL   M    NA   VL   NA      L       VL        M     1
L    M    NA   VL   NA      M       VL        M     1
NA   M    H    M    L       L       L         M     0.8
NA   M    VH   NA   L       M       M         H     0.8

Abbreviations: POR: porosity log; GR: gamma ray log; DEN: bulk density; ΔT: compressional transit time interval; ΔR: separation between phasor deep induction and spherically focused resistivity log; MSCS: major soil component size; NA: rule did not use this component after system training.
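To make the rule table concrete, the sketch below evaluates the weighted firing strength of its first rule with trapezoidal membership functions. It is an illustration only: the breakpoints in `TERMS`, the rescaling of every log to the range 0 to 1, the use of the minimum as the fuzzy AND, and the names `trapmf` and `rule_strength` are assumptions, since the trained membership functions are not given in the slides.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# HYPOTHETICAL breakpoints for the five linguistic terms on a 0..1 scale.
TERMS = {
    "VL": (0.00, 0.00, 0.10, 0.25),
    "L": (0.10, 0.25, 0.35, 0.50),
    "M": (0.35, 0.45, 0.55, 0.65),
    "H": (0.50, 0.65, 0.75, 0.90),
    "VH": (0.75, 0.90, 1.00, 1.00),
}


def rule_strength(inputs, antecedent, weight):
    """Weighted firing strength of one rule; inputs marked 'NA' are skipped.

    ASSUMPTION: the antecedents are combined with min (a common fuzzy AND).
    A rule with every input marked 'NA' is not handled here.
    """
    grades = [
        trapmf(inputs[name], *TERMS[term])
        for name, term in antecedent.items()
        if term != "NA"
    ]
    return weight * min(grades)


# First rule of the table: POR=VL, GR=M, DEN=VH, dT=M, dR=NA, weight 0.8.
rule = {"POR": "VL", "GR": "M", "DEN": "VH", "dT": "M", "dR": "NA"}
# Made-up, rescaled log readings at one depth.
logs = {"POR": 0.05, "GR": 0.50, "DEN": 0.95, "dT": 0.60, "dR": 0.30}
strength = rule_strength(logs, rule, weight=0.8)
```

In a full inference system the strength of each rule would then scale its output terms (% Sand, % Gravel, MSCS) before aggregation and defuzzification; those steps are omitted here.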

Figure: Comparison between true lithology and fuzzy lithology (1: Diamictite, 2: Clay/Silt, 3: Sand) versus depth, together with the major soil component, for sites 1165C and 1166B.

Well logs & fuzzy lithology – 1165C

Figure panels versus depth: gamma ray, shallow and deep resistivity, bulk density, porosity.

Well logs & fuzzy lithology – 1166A

Figure panels versus depth: gamma ray, shallow and deep resistivity, bulk density, transit time interval, porosity.

Performance analysis

• 80% training data; 20% testing data
• Borehole site 1166A:
  • Training performance: 214 of the 258 training data sets were identified correctly, a success rate of 82.95%
  • Testing performance: 57 of the 65 testing data sets were predicted correctly (Fig. 7), an accuracy of 87.69%
• This technique is also capable of providing significant lithology information where core recovery is incomplete.
• Core analysis provides a more subjective interpretation, but well log analysis may easily:
  • define a permeable sand formation
  • distinguish between silts and sands
  • determine grain size variation in sands
• Errors are due to:
  • heterogeneous and/or anisotropic conditions existing at this depth between the two wells, which resulted in the wrong prediction, and
  • some factors that were not considered in this study, such as the photoelectric log, which may provide another perspective.

Correlation with Geochemical Analysis: 1165C

Figure panels versus depth (m): methane (C1), ethane (C2) and propane (C3) concentrations, Cn (ppmv); C1/C2 ratio; organic carbon (wt %).

Conclusions

• Natural systems show evidence of imprecise parameters which may be modeled using Fuzzy Parameterization

• Earthquake fault systems show nucleation and ergodicity: may help in better forecasts using fuzzy logic

• Landslide prediction parameters are inherently imprecise and are best modeled using fuzzy parameterization

• Well log analysis may be made more subjective while incorporating the expertise of the analyst using an inference engine

• There are limitations in each application which may be considered before using the paradigm.

THANK YOU!!!

• Stanford Exploration Project & Stanford Geophysics

• Mentors, collaborators and supervisors

• Kristy Tiampo – University of Western Ontario

• Gerhard Pratt – Queen’s/ UWO

• J. Srinivasan – Indian Institute of Science

• Der-Har Lee – NCKU, Taiwan

• K. Srinivasa Raju – BITS-Pilani

• Upendra K. Singh – Indian School of Mines, Dhanbad

• Data sources: COSMOS, CGIAR-CSI, NRCS, ANSS, ODP, NCKU

QUESTIONS….
