10 - Ilya Zaliapin

Journal of Geophysical Research: Solid Earth

Supporting Information for

Earthquake declustering using the nearest-neighbor approach in space-time-magnitude domain

Ilya Zaliapin1 and Yehuda Ben-Zion2

1 Department of Mathematics and Statistics, University of Nevada, Reno

2 Department of Earth Sciences, University of Southern California, Los Angeles

Contents of this file

Text Sections S1 to S3; Figures S1 to S9

Introduction

The Supporting Information discusses theoretical motivation for the proposed declustering algorithm, outlines the main steps of its numerical implementation, and includes figures with additional information about declustering in synthetic and real data. It also includes a version of the declustered catalog of Hauksson et al. [2012].

S1. Motivation of the proposed declustering algorithm

Here we provide motivation and justification for the proposed declustering algorithm. It is based on the distribution analysis for the nearest-neighbor proximities and thinning theory of point processes. We discuss the case w = 0 (no magnitude component), which corresponds to the main version of our analysis. The magnitude-dependent case can be examined in a similar fashion. The discussion below explains why the proposed algorithm works in selected basic models of clustered fields, and why one can expect it to work in more general situations. We also discuss specific conditions under which the algorithm gives biased results.

S1.1 Weibull approximation to the nearest-neighbor proximity distribution

The basic model that we use in this analysis is a Poisson space-time point field that is stationary in time and homogeneous in d-dimensional space, with independent space and time components. We refer to the process by its counting measure [Daley and Vere-Jones, 2003]

$H(A)$ = number of events within space-time region $A$.

The first moment measure of the process

$M(A) = \mathsf{E}[H(A)] = \lambda \int_A dt\, dx_1 \cdots dx_d = \lambda |A|$

is completely specified by the process intensity $\lambda$ [yr$^{-1}$ km$^{-d}$]. The number of events that occurred within a space-time region $A$ with volume $|A|$ is a Poisson random variable with intensity $\lambda |A|$. We define the earthquake proximity sphere centered at event $i$ with radius $\eta$ as the space-time region

$S(i,\eta) = \{(t,\mathbf{x})$: the proximity from event $i$ to $(t,\mathbf{x})$ is less than $\eta\}$.

The nearest-neighbor proximity $\eta_i$ of Eqs. (1,3) of the main text calculated for event $i$ signifies that there are no events in the sphere $S(i,\eta_i)$. The Poisson distribution for the number of events in space-time volumes implies

$\mathrm{Prob}[\eta_i > x] = \mathrm{Prob}[H(S(i,x)) = 0] = \exp\{-\lambda |S(i,x)|\}$.

This allows one to find the distribution of the nearest-neighbor proximities. A complete analysis, which involves some additional technical requirements and auxiliary parameters to prevent spheres of infinite volumes, leads to the following approximate distribution [see Zaliapin et al., 2008; Hicks, 2011]:

$\mathrm{Prob}[\eta_i > x] \approx \exp\{-\lambda c\, x^k\}$. (S1)

Here $c$, $k$ are functions of dimension d and the auxiliary parameters; these functions are constants with respect to x. We assume that the values of $c$, $k$ are constants for a given examined catalog.

The approximation (S1) is the Weibull distribution with shape parameter $k$ and scale parameter $s = (\lambda c)^{-1/k}$. It provides a close fit to the proximities in the observed earthquake data and synthetic catalogs [Zaliapin and Ben-Zion, 2013a], and can be used for both integer and fractional dimensions d (see also Fig. S9).

The numerical values of the parameters $c$, $k$ depend on the analysis assumptions (including possible errors in determining the fractal dimension of the epicenters); they are best estimated from the data. Analyses of multiple observed catalogs suggest that the background field corresponds to an approximate range $0.75 \le k \le 1.25$, and often the estimated values of $k$ are close to unity. Recall that the case $k = 1$ in (S1) corresponds to the exponential distribution, the same as the distribution of interevent times in a homogeneous Poisson process [Daley and Vere-Jones, 2003].
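As a quick numerical illustration of the case k = 1, the following Python sketch checks the void-probability argument in a simplified, time-only setting. It is not part of the original analysis: the one-dimensional "sphere" (the interval of length x preceding an event), the rate and window values, and all function names are assumptions made for illustration. Events are placed uniformly at random on an interval, which is a standard large-sample stand-in for a homogeneous Poisson process, and the distance from each event to its nearest earlier neighbor is compared with the exponential survival function exp{-lambda x}.

```python
import math
import random

def nearest_earlier_distances(times):
    """Distance from each event to the closest earlier event (first event skipped)."""
    ordered = sorted(times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def empirical_survival(values, x):
    """Fraction of values strictly greater than x."""
    return sum(v > x for v in values) / len(values)

rng = random.Random(2020)          # fixed seed for reproducibility
rate = 2.0                         # assumed intensity (events per unit time)
duration = 50_000.0                # assumed observation window
n_events = int(rate * duration)    # fixed count approximates Poisson(rate * duration)

times = [rng.uniform(0.0, duration) for _ in range(n_events)]
distances = nearest_earlier_distances(times)

x = 0.5
observed = empirical_survival(distances, x)
expected = math.exp(-rate * x)     # Prob[no events in an interval of length x]
print(f"empirical {observed:.4f} vs exp(-lambda*x) {expected:.4f}")
```

The two printed values should agree to within sampling error; repeating the comparison for other values of x traces out the whole survival function.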

S1.2 Gumbel approximation for the log-proximities

We start with a result that connects the Weibull and Gumbel distributions.

Lemma 1. Suppose a random variable X has the Weibull distribution with scale parameter s > 0 and shape parameter k > 0:

$\mathrm{Prob}[X > x] = \exp\{-(x/s)^k\}, \quad x \ge 0.$ (S2)

Then, the random variable $Y = \log_{10}(X)$ has the Gumbel (minimum) distribution

$\mathrm{Prob}[Y > y] = \exp\{-\exp\{(y-\mu)/\beta\}\}, \quad -\infty < y < \infty,$ (S3)

with location parameter $\mu = \log_{10} s$ and scale parameter $\beta = (k \ln 10)^{-1}$. In particular,

$\mathsf{E}[Y] = \log_{10} s - \gamma (k \ln 10)^{-1}$ and $\mathrm{Var}[Y] = \frac{1}{6}\pi^2 (k \ln 10)^{-2}$,

where $\gamma = 0.5772\ldots$ is the Euler-Mascheroni constant. Conversely, if a random variable Y has the Gumbel distribution (S3), then the random variable $X = 10^Y$ has the Weibull distribution (S2).

Proof. By transforming the cumulative distribution functions of the Weibull and Gumbel distributions.
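The transformation behind the proof can be written out explicitly. The short derivation below is a sketch added for convenience; it uses only the Weibull survival function (S2), the identity $10^y = \exp(y \ln 10)$, and the symbols $\mu$ and $\beta$ for the Gumbel location and scale parameters.

```latex
\begin{aligned}
\mathrm{Prob}[Y > y]
  &= \mathrm{Prob}[\log_{10} X > y]
   = \mathrm{Prob}[X > 10^{y}]
   = \exp\{-(10^{y}/s)^{k}\} \\
  &= \exp\{-\exp\{k \ln 10\,(y - \log_{10} s)\}\}
   = \exp\{-\exp\{(y-\mu)/\beta\}\},
\end{aligned}
\qquad \mu = \log_{10} s, \quad \beta = (k \ln 10)^{-1}.
```

The stated moments then follow from the standard expressions for the Gumbel (minimum) distribution, $\mathsf{E}[Y] = \mu - \gamma\beta$ and $\mathrm{Var}[Y] = \pi^2\beta^2/6$.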

Consider now a point field with space-time intensity $\lambda$ [yr$^{-1}$ km$^{-d}$] and suppose that its nearest-neighbor proximity $\eta_i$ is given by the Weibull distribution (S1) with shape parameter $k$ and scale parameter $s = (\lambda c)^{-1/k}$, where $c$ and $k$ are the parameters of (S1). An example of such a process is given by the homogeneous Poisson model of Sect. S1.1. Lemma 1 implies that the logarithm $\log_{10}\eta_i$ of the nearest-neighbor proximity has the Gumbel (minimum) distribution, with mean

$\mathsf{E}[\log_{10}\eta_i] = -\frac{1}{k}\log_{10}(\lambda c) - \gamma (k \ln 10)^{-1}$

and variance

$\mathrm{Var}[\log_{10}\eta_i] = \frac{1}{6}\pi^2 (k \ln 10)^{-2}$.

Here, the mean depends on the process intensity $\lambda$ and the parameters $c$, $k$, while the variance is independent of the process intensity and is completely determined by the parameter $k$. Accordingly, the random variable

$\log_{10}\alpha_i = \log_{10}\eta_i - \mathsf{E}[\log_{10}\eta_i]$

has the Gumbel distribution with zero mean and variance that is independent of the process intensity $\lambda$. Lemma 1 also implies that the random variable $\alpha_i$ has the Weibull distribution with shape parameter $k$ and scale parameter $\exp(\gamma/k)$. Importantly, the distribution of the random variable $\alpha_i$ does not depend on the process intensity $\lambda$.
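The intensity independence of the centered log-proximity can be checked with a small Monte Carlo experiment. The Python sketch below is illustrative only: the function names, the symbol c for the second parameter of (S1), and the numerical values of the intensity, c, and k are assumptions, not estimates from any catalog. It samples Weibull proximities for two very different intensities, subtracts the theoretical mean of Lemma 1, and compares the sample mean and variance with 0 and $\frac{1}{6}\pi^2 (k \ln 10)^{-2}$.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_moments(lam, c, k):
    """Theoretical mean and variance of log10 of a Weibull variable with
    shape k and scale s = (lam * c) ** (-1 / k), as given by Lemma 1."""
    scale = (lam * c) ** (-1.0 / k)
    beta = 1.0 / (k * math.log(10.0))
    mean = math.log10(scale) - EULER_GAMMA * beta
    variance = (math.pi ** 2 / 6.0) * beta ** 2
    return scale, mean, variance

def centered_log_proximities(lam, c, k, n, rng):
    """Sample log10(eta) - E[log10(eta)] for Weibull-distributed eta."""
    scale, mean, _ = gumbel_moments(lam, c, k)
    return [math.log10(rng.weibullvariate(scale, k)) - mean for _ in range(n)]

rng = random.Random(7)
c, k, n = 1.0, 1.0, 100_000        # illustrative values, not estimates from data
summary = {}
for lam in (0.01, 100.0):          # two very different intensities
    sample = centered_log_proximities(lam, c, k, n, rng)
    summary[lam] = (statistics.fmean(sample), statistics.pvariance(sample))
    print(lam, summary[lam])

_, _, theoretical_variance = gumbel_moments(1.0, c, k)
print("theoretical variance:", theoretical_variance)
```

Both intensities should give a sample mean near zero and nearly the same sample variance, in line with the statement that the centered variable does not depend on the intensity.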

Next, we apply the distribution results of Sects. S1.1, S1.2 to each step of the declustering algorithm (Sect. 4.1 of the main text).

S1.3 Step 1: Identifying the most clustered events

This step takes advantage of the well-documented bimodality in the distribution of the nearest-neighbor proximities. Figure S8 illustrates this in the global NCEDC catalog (panel a) and the Hauksson et al. [2012] catalog for Southern California (panel b). A sharper separation between the modes can be achieved by considering a 2D space-time representation of the proximity, see Sect. 3, Eq. (4), as discussed by Zaliapin et al. [2008] and Zaliapin and Ben-Zion [2013a]. Independently of whether the bimodality is present or not, we expect the right part of the distribution (large proximities) to correspond to the background seismicity. The left part (short proximities) is expected to be a mixture of background and clustered events. Application of the cutoff proximity $\eta_0$ is intended to sample the long proximities, which quantify the (location-dependent) background event distribution. The randomized-reshuffled catalogs of Step 2, constructed with these sampled events, are used to approximate the distribution of nearest-neighbor proximities at each location in the absence of clustering. This estimation is necessarily biased (unless $\eta_0 = 0$ and the catalog is unclustered, which is not the case in most interesting practical situations), since it only uses a fraction of background events (those with parent proximity above $\eta_0$) and hence underestimates the background intensity at each location (i.e., produces a higher fraction of large proximity values). The better the separation of the clustered and background modes (see Fig. S8), the smaller the bias. Even in the presence of the bias, the resulting estimation should reasonably approximate the relative background intensity. This is confirmed by the analysis of synthetic ETAS seismicity in Sect. 6.

S1.4 Step 2: Estimation of relative background intensity

According to Sects. S1.1, S1.2, the empirical distribution of the elements in the proximity vector $\mathbf{k}_i = (\kappa_{1,i},\dots,\kappa_{M,i})$ is approximated by the Weibull distribution with scale parameter that is proportional to $(\lambda_i)^{-1/k}$, where $\lambda_i$ denotes the estimated background intensity at location i, and $k$ is the shape parameter close to unity.

We notice that one can closely estimate the relative location-dependent background intensity only in cases when the separation of the background intensity from the cluster intensity is comparable at different locations. For instance, the location-dependent background intensity may substantially vary from place to place, but if it is always substantially lower than the cluster intensity, our heuristics works. Furthermore, if the location-specific background intensity substantially overlaps the cluster intensity, but the degree of overlap is approximately the same at all locations, the heuristics is still valid. The situation when our estimation may give substantially biased results is when the location-dependent background intensity varies in such a way that in some locations it overlaps with the cluster intensity, and in other locations it does not. In this case, the proposed estimation may distort the relative background intensity levels. This is why we suggest applying the technique to regions where the expected background intensities do not vary over an order of magnitude.

S1.5 Step 3: Normalized nearest-neighbor proximities

At this step, we obtain the normalized nearest-neighbor proximities $\alpha_i$ by rescaling the observed proximities $\eta_i$ according to the mean of the proximity vector $\mathbf{k}_i$. The goal is to obtain a distribution of $\alpha_i$ that is independent of the estimated location-specific background intensity $\lambda_i$. The proposed normalization of Eq. (7) uses the logarithmic representation of the proximity vector, and hence is less sensitive to possible outliers.

In a catalog with constant background intensity $\lambda$, no clustering, and using $\eta_0 = 0$, the normalized proximities $\alpha_i$ have the Weibull distribution, with parameters independent of the intensity $\lambda$; see Sect. S1.2. One can expect that a similar argument applies heuristically to a catalog with space-varying intensity $\lambda(\mathbf{x})$, no clustering, and using $\eta_0 = 0$. Finally, in the presence of clustering and with $\eta_0 > 0$, the right tail of the distribution of $\alpha_i$ is approximately Weibull with intensity-independent parameters, while the left tail might be heavier (a larger proportion of small values) depending on the cluster intensity.

S1.6 Step 4: Thinning by the observed value of normalized proximity

The main component of the declustering procedure is Step 4, which applies thinning with the retention probability of event i being proportional to its normalized proximity $\alpha_i$. The motivation for this procedure comes from the general theory of thinning for point processes [Schoenberg, 2003; Daley and Vere-Jones, 2008]. As a simple motivating example, consider a (possibly multidimensional) Poisson point process with intensity $\lambda(x)$ and apply thinning independently to every event with the retention probability $p(x)$. Then the thinned process is Poisson with intensity $p(x)\lambda(x)$. For instance, if the retention probability is

$p(x) = \lambda_0/\lambda(x)$, (S4)

then the thinned process is homogeneous Poisson with constant intensity $\lambda_0$.
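The thinning rule (S4) is easy to illustrate numerically. The Python sketch below is a toy example under stated assumptions: a one-dimensional Poisson process on [0, 2) with a hypothetical piecewise-constant intensity, a target intensity that does not exceed the smallest intensity value, and function names chosen for illustration. After independent thinning with retention probability equal to the target intensity divided by the local intensity, both halves of the interval should contain about the same number of events.

```python
import random

def poisson_points(rate, start, end, rng):
    """Homogeneous Poisson points on [start, end) built from exponential gaps."""
    points = []
    t = start + rng.expovariate(rate)
    while t < end:
        points.append(t)
        t += rng.expovariate(rate)
    return points

def intensity(x):
    """Assumed piecewise-constant intensity: 20,000 on [0, 1), 80,000 on [1, 2)."""
    return 20_000.0 if x < 1.0 else 80_000.0

def thin(points, target_rate, rng):
    """Keep each point independently with probability target_rate / intensity(x)."""
    return [x for x in points if rng.random() < target_rate / intensity(x)]

rng = random.Random(11)
points = poisson_points(20_000.0, 0.0, 1.0, rng) + poisson_points(80_000.0, 1.0, 2.0, rng)
target_rate = 10_000.0                     # lambda_0, must not exceed the minimal intensity
kept = thin(points, target_rate, rng)

left = sum(x < 1.0 for x in kept)          # retained events in [0, 1)
right = len(kept) - left                   # retained events in [1, 2)
print(len(points), left, right)
```

Although the original process has four times more events in the second half, the retained counts in the two halves should both be close to 10,000, as expected for a homogeneous process with the target intensity.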

Application of this general idea to thinning by estimated process intensity is a delicate problem; see Schoenberg [2003], Moeller and Schoenberg [2010], and Clements et al. [2012] for a comprehensive discussion and further references. Notably, in the one-dimensional case one can avoid complicated estimation of the process intensity, and use a process-dependent thinning to still obtain a homogeneous point process. Specifically, it can be shown (see Lemma 14.2.7 in Chapter 14 of Daley and Vere-Jones [2008]) that thinning of a point process with intensity $\lambda(t) > 0$ using the process-dependent retention probability $\min\{\lambda_0 (t_i - t_{i-1}), 1\}$ results in a point process with intensity $\lambda_0 + \delta(t)$, where the deviation term $\delta(t)$ decreases as $\lambda(t)/\lambda_0$ increases. In other words, the process-dependent thinning results in an almost-homogeneous point process, even if the process intensity is unknown. If one interprets the quantity $(t_i - t_{i-1})^{-1}$ as a single-point estimation of the process intensity $\lambda(t)$ at time $t_i$, then the process-dependent thinning is a natural extension of the general thinning result (S4).
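The process-dependent rule can be tried on a simple simulated sequence. The Python sketch below is an illustration under assumptions that are not part of the original text: a Poisson process whose rate jumps from 10 to 40 halfway through the observation window, a target rate of 1, and hypothetical function names. For a homogeneous Poisson process with rate lambda, a direct calculation of the mean retention probability gives a retained rate of lambda_0 (1 - exp(-lambda/lambda_0)), which is already very close to lambda_0 when lambda/lambda_0 = 10.

```python
import math
import random

def poisson_points(rate, start, end, rng):
    """Homogeneous Poisson points on [start, end) built from exponential gaps."""
    points = []
    t = start + rng.expovariate(rate)
    while t < end:
        points.append(t)
        t += rng.expovariate(rate)
    return points

def process_dependent_thinning(times, target_rate, rng):
    """Keep event i with probability min{target_rate * (t_i - t_{i-1}), 1}.
    The first event has no predecessor and is dropped."""
    kept = []
    for previous, current in zip(times, times[1:]):
        if rng.random() < min(target_rate * (current - previous), 1.0):
            kept.append(current)
    return kept

rng = random.Random(5)
half = 5_000.0                              # assumed length of each half-window
times = poisson_points(10.0, 0.0, half, rng) + poisson_points(40.0, half, 2 * half, rng)
target_rate = 1.0                           # lambda_0
kept = process_dependent_thinning(times, target_rate, rng)

rate_first = sum(t < half for t in kept) / half
rate_second = sum(t >= half for t in kept) / half
predicted_first = target_rate * (1.0 - math.exp(-10.0 / target_rate))
print(rate_first, rate_second, predicted_first)
```

The retained rates in the two halves should both be close to 1 even though the intensity was never estimated, which is the point of the process-dependent construction.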

This theoretical background motivates us to suggest a process-dependent earthquake thinning procedure. Recall that the shape parameter of the Weibull approximation to the nearest-neighbor proximity $\eta_i$ is close to unity. This means that the distribution of $\eta_i$ is close to exponential, the same as the interevent time distribution in the above result. We use thinning with retention probability proportional to the observed normalized nearest-neighbor proximity $\alpha_i$. In the Weibull model (S1), the MLE of the inverse intensity $\lambda^{-1}$ based on a single observation x is given by

$[x/\Gamma(1+1/k)]^k \approx x$ (since $k \approx 1$),

where $\Gamma(x)$ is the gamma function. This allows one to expect that thinning with retention probability $\min\{A_0\alpha_i, 1\}$ results in a point field with approximate intensity proportional to $A_0$.

Figure S9 shows a Weibull approximation to the normalized nearest-neighbor proximities $\alpha_i$ after thinning of Step 4 for the global and southern California catalogs. The fit, although not perfect, is very close. This may serve as an indication that the above heuristics does work in the examined data. This is encouraging, given the enormous variety of seismic regimes, background intensities, and cluster forms that have been analyzed in each examined case. We finally mention that the fit is even closer when examining local regions that are characterized by more uniform background and cluster properties.

S2. Numerical implementation

The numerical implementation of the declustering algorithm (Sect. 4.1) is described below:

1. Set parameters
   d (fractal dimension of epicenters/hypocenters);
   w (parameter of the proximity of Eq. (1));
   $\eta_0$ (initial cutoff threshold);
   $\alpha_0$ (cluster threshold);
   M (number of reshufflings).

2. Calculate the nearest-neighbor proximity $\eta_i$ for each event in the catalog using Eqs. (1),(3).

3. Select $N_0$ events that satisfy $\eta_i > \eta_0$.

4. Create M randomized-reshuffled catalogs and calculate the proximity vectors $\mathbf{k}_i$ for each event i. Specifically, for each k = 1,…,M:

   a. Create $N_0$ independent and uniformly distributed time instants within the examined time interval;
   b. Reshuffle the locations of $N_0$ earthquakes selected in Step 3 using a random uniform permutation of {1,…,$N_0$}. Independently, reshuffle the magnitudes of these events.
   c. Find the nearest-neighbor proximity $\kappa_{k,i}$ from each event i in the original catalog to the events of the randomized-reshuffled catalog k comprised of the random times from step (a) and reshuffled locations and magnitudes from step (b).

5. Calculate the normalized nearest-neighbor proximity $\alpha_i$ for each event in the catalog using Eq. (7).

6. Calculate the retention probability $P_{back,i}$ for each event i in the original catalog according to Eq. (8).

7. Identify background events according to the retention probabilities of Step 6.
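The steps above can be sketched in a few dozen lines of Python. This is a minimal illustration under several assumptions, not the authors' implementation: events are tuples (t, x, y, m) with planar coordinates; the proximity is taken in the form dt * r^d * 10^(-w m) with the magnitude of the earlier event, which is assumed here because Eq. (1) is not reproduced in this file; the normalization follows the description log10(alpha_i) = log10(eta_i) - mean[log10(k_i)]; and the retention rule mirrors the Matlab commands of Sect. S3 rather than Eq. (8). All function names and default parameter values are hypothetical. The code applies the practical comments that follow regarding duplicate locations, infinite values, and the first event.

```python
import math
import random

def proximity(parent, child, d, w):
    """Assumed form of Eq. (1): dt * r**d * 10**(-w * m_parent); inf if parent is not earlier."""
    dt = child[0] - parent[0]
    if dt <= 0:
        return math.inf
    r = math.hypot(child[1] - parent[1], child[2] - parent[2])
    return dt * r ** d * 10.0 ** (-w * parent[3])

def nearest(event, parents, d, w):
    """Smallest proximity from any candidate parent to the event (inf if none)."""
    return min((proximity(p, event, d, w) for p in parents), default=math.inf)

def decluster(catalog, d=1.6, w=0.0, eta0=0.0, alpha0=0.0, M=20, seed=0):
    """catalog: time-sorted list of (t, x, y, m). Returns (log10 alpha, background flags)."""
    rng = random.Random(seed)
    t_min, t_max = catalog[0][0], catalog[-1][0]
    # Part 2: nearest-neighbor proximity of each event to the earlier events.
    eta = [nearest(ev, catalog[:i], d, w) for i, ev in enumerate(catalog)]
    # Part 3: events above the initial cutoff (the first event has no defined eta).
    selected = [i for i, e in enumerate(eta) if math.isfinite(e) and e > eta0]
    # Part 4: proximities to M randomized-reshuffled catalogs.
    logs = [[] for _ in catalog]
    for _ in range(M):
        loc_src = rng.sample(selected, len(selected))   # permuted locations
        mag_src = rng.sample(selected, len(selected))   # independently permuted magnitudes
        shuffled = [(rng.uniform(t_min, t_max), catalog[a][1], catalog[a][2], catalog[b][3], a)
                    for a, b in zip(loc_src, mag_src)]
        for i, ev in enumerate(catalog):
            # Skip the reshuffled event that reuses the location of event i.
            kappa = nearest(ev, [p for p in shuffled if p[4] != i], d, w)
            # Infinite (or zero) values are left out of the average.
            if math.isfinite(kappa) and kappa > 0:
                logs[i].append(math.log10(kappa))
    # Parts 5-7: normalization and thinning.
    log_alpha, background = [], []
    for i, e in enumerate(eta):
        if not logs[i] or not math.isfinite(e) or e <= 0:
            log_alpha.append(math.nan)      # undefined, e.g. for the first event
            background.append(False)        # convention: retention probability 0
            continue
        la = math.log10(e) - sum(logs[i]) / len(logs[i])
        p_back = 1.0 if la >= alpha0 else 10.0 ** (la - alpha0)
        log_alpha.append(la)
        background.append(rng.random() < p_back)
    return log_alpha, background

# Usage on a small synthetic catalog with uniform times and locations (made-up data).
rng = random.Random(1)
catalog = sorted((rng.uniform(0, 10), rng.uniform(0, 100), rng.uniform(0, 100),
                  2 + rng.expovariate(2.3)) for _ in range(60))
log_alpha, background = decluster(catalog, M=10)
print(sum(background), "of", len(catalog), "events flagged as background")
```

The brute-force nearest-neighbor search is quadratic in the catalog size and is meant only to make the sequence of steps concrete; a practical implementation would use a spatial index or vectorized distance computations.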

Some practical comments are in order:

1. In Step 4c, the reshuffled catalog may include the event with the same location as event i from the original catalog. This happens if event i satisfies the condition $\eta_i > \eta_0$ and is used in reshuffling. Such a duplicate location should not be used in computing the proximity $\kappa_{k,i}$, as this leads to severe artifacts. Accordingly, for each event i that satisfies the condition $\eta_i > \eta_0$, the proximity $\kappa_{k,i}$ is computed using $N_0 - 1$ events of the k-th reshuffled catalog, excluding the event with the same location as event i.

2. For several initial events in the original catalog, a reshuffled catalog k may contain no earlier events. This leads to an infinite value of $\kappa_{k,i}$. Such infinite values should be excluded from calculating the average mean[$\log_{10}(\mathbf{k}_i)$] in Eq. (7). Formally speaking, we calculate the conditional nearest-neighbor proximity $\kappa_{k,i}$ given that a randomized-reshuffled catalog k has events prior to event i of the original catalog.

3. The first event in the catalog has undefined $\eta_i$ (no earlier events), and hence an undefined $\alpha_i$. We use the convention that the first event does not satisfy the background condition (equivalently, $P_{back,1} = 0$).

4. As we mentioned in the main text, parts 6 and 7 of the numerical algorithm are implemented via Eq. (9).

5. In part 4c, it is enough to only reshuffle events' magnitudes and use the original locations. Assigning random times to the original event locations serves as location reshuffling.

6. The value of the initial cutoff threshold $\eta_0$ can be selected using the bimodal distribution of the nearest-neighbor proximities $\eta_i$. Hence, one may first calculate the proximities (part 2 of the numerical algorithm above), use them to select the value of $\eta_0$, and then run the other parts of the numerical algorithm.

S3. Sample declustered catalog

We include a version of declustering for the catalog of Hauksson et al. [2012] examined in this work. The catalog is in the file 2018JB017120-01.txt and the format description is in the file 2018JB017120-02.txt.

The sample declustering file refers to 123,275 events with magnitudes $m \ge 2.0$ during 1981 – 2018. The file reports (in column 13) the values of the logarithmic normalized proximities, $\log_{10}(\alpha_i)$, which allows one to produce declustering with different thresholds $\alpha_0$ and create alternative stochastic realizations of declustering for a fixed $\alpha_0$. In Matlab, this can be done with the following commands, which assume that the logarithmic proximities are stored in the variable logalpha and the threshold $\alpha_0$ in the variable alpha0, and produce a vector I of background event indicators (logical 1 or 0):

>> p = 10.^(logalpha-alpha0);
>> I = p>rand(size(p));

These commands identify background events that are the first events in the respective clusters (see Sect. 4.1). Identification of the largest events from each cluster can be done using the information of the spanning time-oriented tree, which is also provided in the file in the form of parent links (column 16).
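For readers who do not use Matlab, the same thinning can be sketched in Python. The function name and the sample values below are hypothetical, and reading column 13 from the file is left out because the file layout is described separately in the format file. The exponent is capped at zero, which does not change the outcome (a probability of 1 or more always retains the event) but avoids floating-point overflow for large values.

```python
import random

def background_indicators(logalpha, alpha0, rng=None):
    """Python counterpart of the two Matlab commands: p = 10^(logalpha - alpha0),
    I = p > rand. Values with logalpha >= alpha0 are always retained."""
    rng = rng or random.Random()
    flags = []
    for value in logalpha:
        # Cap the exponent at 0 so that large values cannot overflow.
        p = 10.0 ** min(value - alpha0, 0.0)
        flags.append(p > rng.random())
    return flags

logalpha = [-3.0, -0.2, 0.4, 1.5]          # made-up values, not taken from the file
realization_a = background_indicators(logalpha, alpha0=0.0, rng=random.Random(1))
realization_b = background_indicators(logalpha, alpha0=0.0, rng=random.Random(2))
print(realization_a, realization_b)
```

Different seeds give alternative stochastic realizations for the same threshold; events with log-proximity at or above the threshold are background in every realization.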

As a specific example of declustering, the file also reports background event indicators for a single stochastic realization of the algorithm with the cluster threshold $\alpha_0 = 0$. Two types of the background indicators are given: the largest cluster event (column 14) and the first cluster event (column 15).

The file reports the SCSN event id (cuspid) in column 8. This allows one to get additional information about the examined events reported in the original catalog.

Example 1: Line 3 refers to the event with the SCSN cuspid 3301566; this event forms a cluster of a single event, and is identified as a background event. Accordingly, it has background index 1 in both column 14 (the largest cluster event indicator) and column 15 (the first cluster event indicator).

Example 2: Line 2 refers to the event with the SCSN cuspid 3301565; this event is the first event in a larger cluster and is identified as a background event. Accordingly, it has background index 0 in column 14 (the largest cluster event indicator) and index 1 in column 15 (the first cluster event indicator). The largest event in this cluster has index 59 (id 3316358); that event has index 1 in column 14 and index 0 in column 15.
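The relation between the two indicator types in Examples 1 and 2 can be sketched as follows. The Python function below is a hypothetical illustration: it assumes that events are time-ordered, that parent links are given as zero-based indices (with None for an event without a parent), and that a cluster consists of a first (background) event together with all events that descend from it without passing through another first event. The actual encoding of column 16 should be taken from the format description file.

```python
def largest_event_indicators(parents, magnitudes, first_event_flags):
    """parents[i]: index of the parent of event i (None if there is no parent);
    first_event_flags[i]: True if event i is the first event of its cluster.
    Returns flags that mark the largest event of each cluster."""
    n = len(parents)
    root = [None] * n
    for i in range(n):
        # Events are assumed time-ordered, so every parent precedes its children.
        if first_event_flags[i] or parents[i] is None:
            root[i] = i
        else:
            root[i] = root[parents[i]]
    best = {}
    for i in range(n):
        r = root[i]
        # Ties are resolved in favor of the earliest event of the largest magnitude.
        if r not in best or magnitudes[i] > magnitudes[best[r]]:
            best[r] = i
    largest = [False] * n
    for i in best.values():
        largest[i] = True
    return largest

# Made-up example: events 0, 1, 2, 4 form one cluster; event 3 is a single-event cluster.
parents = [None, 0, 1, 1, 2]
magnitudes = [3.1, 4.5, 2.0, 2.2, 5.0]
first_event_flags = [True, False, False, True, False]
largest_flags = largest_event_indicators(parents, magnitudes, first_event_flags)
print(largest_flags)
```

In this toy input, the single-event cluster is flagged by both indicators, as in Example 1, while the multi-event cluster has different first and largest events, as in Example 2.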

Figure S1: Declustering results for the ETAS catalog of Gu et al. [2013]. Quality of event identification among earthquakes with magnitude equal to or above $m_{min}$. Blue (top): proportion of the total estimated background events with respect to the true number of background events. Green (middle): proportion of correctly identified triggered events. Red (bottom): proportion of correctly identified background events. The error bars are 95% prediction intervals (not the errors of the mean). The analysis is done for 10,000 independent realizations of declustering with $\alpha_0 = 0.1$ at every examined value of $m_{min}$. The figure summarizes the results for 210,000 declustered catalogs.

Figure S2: Declustering results for the ETAS catalog of Gu et al. [2013]. Proportion of correctly identified triggered (solid blue line) and background (dashed red line) events, as a function of the proximity to the true parent or nearest neighbor, respectively. The analysis refers to a single realization of declustering.

Figure S3: Declustering results in the global NCEDC catalog, $m \ge 5$. Stability of declustering. The analysis is done for 10,000 independent realizations of a declustered catalog for each value of cluster threshold $\alpha_0$. (a) The main panel refers to $\alpha_0 = -0.5$. The rest of notations as in Fig. 5. The actual proportion of events that have the same estimated type in all 10,000 realizations is 9.3% for background and 11.8% for clustered events. This is hidden because of a finite bin width (0.025).

Figure S4: Declustering results for Southern California, $m \ge 2.5$, catalog of Hauksson et al. [2012]. Stability of declustering. The analysis is done for 10,000 independent realizations of a declustered catalog for each value of cluster threshold $\alpha_0$. (a) The main panel refers to $\alpha_0 = 0$. The rest of notations as in Fig. 5.

Figure S5: Declustering results for Southern California, $m \ge 3.5$, catalog of Hauksson et al. [2012]. Stability of declustering. The analysis is done for 10,000 independent realizations of a declustered catalog for each value of cluster threshold $\alpha_0$. (a) The main panel refers to $\alpha_0 = 0.6$. The rest of notations as in Fig. 5.

Figure S6: Declustering results for the Landers (1992, M7.3) sub-catalog of Hauksson et al. [2012]. Stability of declustering. The analysis is done for 10,000 independent realizations of a declustered catalog for each value of cluster threshold $\alpha_0$. (a) The main panel refers to $\alpha_0 = 0.2$. The rest of notations as in Fig. 5.

Figure S7: Declustering results for the Parkfield (2004, M6) sub-catalog of Waldhauser and Schaff [2008]. Stability of declustering. The analysis is done for 10,000 independent realizations of a declustered catalog for each value of cluster threshold $\alpha_0$. (a) The main panel refers to $\alpha_0 = 0.0$. The rest of notations as in Fig. 5.

Figure S8: Bimodal distribution of the nearest-neighbor proximity. (a) Global NCEDC catalog, with $m \ge 5$; (b) Southern California catalog by Hauksson et al. [2012]. (See Sects. 2.1, 2.2 of the main text for complete data description).

Figure S9: Weibull approximation to the normalized nearest-neighbor proximities after thinning. (a) Global NCEDC catalog, with $m \ge 5$; (b) Southern California catalog by Hauksson et al. [2012]. (See Sects. 2.1, 2.2 of the main text for complete data description).
