Markov chain Monte Carlo Revolution
in Reliability Engineering
Konstantin Zuev
Department of Mathematics
University of Southern California
http://www-bcf.usc.edu/∼kzuev
December 3, 2011
Southern California Probability Symposium
Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 1 / 26
MCMC Revolution
P. Diaconis (2009), “The Markov chain Monte Carlo revolution”:
...asking about applications of Markov chain Monte Carlo (MCMC)
is a little like asking about applications of the quadratic formula...
you can take any area of science, from hard to social, and find a
burgeoning MCMC literature specifically tailored to that area.
Statistics: Bayesian inference
Statistical Physics: sampling from the Boltzmann distribution
Biochemistry: protein structure simulation
Astronomy: hypothesis testing for astronomical observations
Linguistics: linguistic data analysis
The main goal of this talk: To show how MCMC can be efficiently used for
solving problems in Reliability Engineering
Outline
1 Reliability Problem
2 Pre-MCMC era
3 First MCMC pancake
4 Subset Simulation
5 Enhancements for Subset Simulation
6 Summary
Reliability Problem
Reliability Problem: To estimate the probability of failure pF
pF = P(x ∈ F) = ∫_{R^d} π(x) I_F(x) dx

Notation:
x ∈ R^d represents the uncertain excitation of a system
  - x is a random vector with joint PDF π(x) (multivariate standard normal)
F ⊂ R^d is the failure domain (unacceptable system performance),
  F = {x : g(x) ≥ b*}
g(x) is a performance function (loss function)
b* is a critical threshold for performance
I_F(x) = 1 if x ∈ F and I_F(x) = 0 if x ∉ F
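When pF is not too small, the integral above can be estimated directly by standard Monte Carlo: draw x^(i) ∼ π and average I_F(x^(i)). A minimal sketch in Python, with a toy linear performance function chosen so that the exact answer pF = Φ(−b*) is known (the function g, the threshold b*, and the sample size are illustrative assumptions, not from the slides):

```python
import math
import random

def standard_mc(g, b_star, d, n, seed=0):
    """Estimate pF = P(g(x) >= b*) for x ~ N(0, I_d) by standard Monte Carlo."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        if g(x) >= b_star:                      # I_F(x) = 1
            hits += 1
    return hits / n

# Toy linear performance function: g(x) = (x_1 + x_2)/sqrt(2) ~ N(0, 1),
# so the exact failure probability is pF = Phi(-b*).
g = lambda x: (x[0] + x[1]) / math.sqrt(2)
b_star = 2.0
p_exact = 0.5 * math.erfc(b_star / math.sqrt(2))   # Phi(-2), about 0.023
p_hat = standard_mc(g, b_star, d=2, n=100_000)
```

This works because pF ≈ 0.023 here; for the pF ∼ 10^−6 met in practice the same relative accuracy would require millions of times more samples, which is the point of the next slide.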
Why is this problem computationally challenging?
pF = ∫_{R^d} π(x) I_F(x) dx,    F = {x : g(x) ≥ b*}

Typically in Applications:
The relationship between x and I_F(x) is not explicitly known
We can compute I_F(x) for any x, but this computation is expensive
The probability of failure pF is very small, pF ∼ 10^−2 to 10^−9
The dimension d is very large, d ∼ 10^3
Consequences:
Numerical integration is not suitable
Standard Monte Carlo is computationally infeasible
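The last point can be quantified: the standard MC estimator has coefficient of variation √((1 − pF)/(N pF)), so the number of samples needed for a target CV δ grows like 1/pF. A small sketch (the target δ = 0.3 is an illustrative choice):

```python
def mc_samples_needed(p_f, delta):
    """The standard MC estimator has CV sqrt((1 - pF)/(N pF));
    solve for the N that achieves a target coefficient of variation delta."""
    return (1.0 - p_f) / (delta ** 2 * p_f)

# Even for a modest target CV of 0.3 the cost blows up like 1/pF:
costs = {p: mc_samples_needed(p, 0.3) for p in (1e-2, 1e-4, 1e-6, 1e-9)}
```

Each "sample" is one expensive evaluation of g, so the ∼10^10 evaluations demanded by pF = 10^−9 are hopeless.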
Outline
1 Reliability Problem
2 Pre-MCMC era
  - Approximate methods
  - Simulation methods
3 First MCMC pancake
4 Subset Simulation
5 Enhancements for Subset Simulation
6 Summary
Pre-MCMC Era: Approximate Methods
FORM: First-Order Reliability Method
Failure domain F = {x : g(x) ≥ b*}
Limit-state surface ∂F = {x : g(x) = b*}
Design point x* = argmin_{x ∈ ∂F} ‖x‖
Reliability index β = ‖x*‖
FORM estimate: pF ≈ Φ(−β)
Main advantage:
If g(x) is linear, then FORM gives the exact result
Main drawbacks:
If g(x) is not linear, then the FORM estimate may be very inaccurate
FORM does not give any measure of the error introduced by linearization
Verdict (Valdebenito et al., 2010): "FORM has no scientific basis"
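For a linear performance function g(x) = a · x the FORM quantities are available in closed form, which makes the method easy to sanity-check. A sketch under that linearity assumption (the vector a and threshold b* are illustrative):

```python
import math

def form_linear(a, b_star):
    """FORM for the linear performance function g(x) = a . x with x ~ N(0, I).
    Then g(x) ~ N(0, ||a||^2), the design point is x* = b* a / ||a||^2,
    the reliability index is beta = ||x*|| = b* / ||a||, and pF ~ Phi(-beta)."""
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    beta = b_star / norm_a
    x_star = [b_star * ai / norm_a ** 2 for ai in a]
    p_form = 0.5 * math.erfc(beta / math.sqrt(2))   # Phi(-beta)
    return x_star, beta, p_form

x_star, beta, p_form = form_linear(a=[3.0, 4.0], b_star=10.0)
# ||a|| = 5, so beta = 2 and Phi(-beta) is exact because g is linear here
```

Because g is linear here, Φ(−β) is exact; for nonlinear g the same recipe can be arbitrarily wrong, which is the drawback the slide records.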
Pre-MCMC Era: Simulation Methods
Importance Sampling:
pF = ∫_{R^d} π(x) I_F(x) dx = ∫_{R^d} [π(x)/µ(x)] I_F(x) µ(x) dx = E_µ[ (π(x)/µ(x)) I_F(x) ]

µ(x) is the importance sampling density (a.k.a. instrumental or trial density)

pF ≈ p̂F = (1/N) Σ_{i=1}^N [π(x^(i))/µ(x^(i))] I_F(x^(i)),    x^(1), …, x^(N) ∼ µ

If supp(µ) ⊇ F ∩ supp(π), then p̂F → pF a.s. as N → ∞
If µ(x) is "good", then IS is an efficient variance reduction technique
Optimal ISD: µ_opt(x) = π(x|F) = π(x) I_F(x) / pF
Main drawbacks:
When F ⊂ R^d is high-dimensional, finding a good ISD is very challenging
Importance sampling suffers from the curse of dimensionality
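A 1D illustration of the identities above, with F = {x ≥ b*} and the shifted proposal µ = N(b*, 1) centered at the most likely failure point (this particular ISD is an illustrative choice that happens to work in 1D, not a general recipe):

```python
import math
import random

def importance_sampling(b_star, n, seed=1):
    """Estimate pF = P(x >= b*) for x ~ N(0,1) using the ISD mu = N(b*, 1).
    The likelihood ratio is w(x) = pi(x)/mu(x) = exp(-b* x + b*^2 / 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(b_star, 1.0)                            # x ~ mu
        if x >= b_star:                                       # I_F(x) = 1
            total += math.exp(-b_star * x + 0.5 * b_star ** 2)
    return total / n

b_star = 4.0
p_hat = importance_sampling(b_star, n=50_000)
p_exact = 0.5 * math.erfc(b_star / math.sqrt(2))   # Phi(-4), about 3.17e-5
```

Standard MC would need millions of samples just to see a handful of failures here, while the shifted proposal puts half of its samples in F. The catch, per the slide, is that such a good µ is hard to construct when F is high-dimensional.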
The first MCMC pancake
Au and Beck (1999): Importance Sampling using kernel density estimators
Key idea: use x^(1), …, x^(M) ∼ π(x|F) = µ_opt(x) to construct
µ̂(x | x^(1), …, x^(M)) ≈ µ_opt(x)
How to obtain x^(1), …, x^(M) ∼ µ_opt(x)?
  - Rejection Sampling is extremely inefficient if pF is small
  - Generate a Markov chain with the stationary distribution µ_opt(x)!
A sampling density estimator:

µ̂(x | x^(1), …, x^(M)) = (1/M) Σ_{i=1}^M [1/(ωλ_i)^d] N( (x − x^(i)) / (ωλ_i) )

Main drawback:
To work well, µ̂(x | x^(1), …, x^(M)) ≈ µ_opt(x) must be a good approximation
⇒ M must be large and x^(1), …, x^(M) must populate F properly
⇒ the method is impractical
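The Markov-chain idea is easy to demonstrate in one dimension: for F = {x ≥ b*} a random-walk Metropolis chain has µ_opt(x) = π(x|F) as its stationary distribution, with none of the waste of rejection sampling. A sketch (the proposal scale, burn-in length, and starting point are illustrative tuning choices):

```python
import math
import random

def metropolis_conditional(b_star, n_steps, seed=2):
    """Random-walk Metropolis chain whose stationary distribution is
    mu_opt(x) = pi(x|F), i.e. proportional to exp(-x^2/2) * I[x >= b*]."""
    rng = random.Random(seed)
    x = b_star + 0.5                 # start inside F
    samples = []
    for step in range(n_steps):
        y = x + rng.gauss(0.0, 1.0)                  # symmetric proposal
        if y >= b_star:                              # moves leaving F are rejected
            # accept with probability min(1, pi(y)/pi(x))
            if rng.random() < math.exp(0.5 * (x * x - y * y)):
                x = y
        if step >= 1000:                             # discard burn-in
            samples.append(x)
    return samples

samples = metropolis_conditional(b_star=2.0, n_steps=20_000)
# The mean of N(0,1) truncated to [2, inf) is phi(2)/Phi(-2), about 2.37
```

Every state lies in F by construction; the drawback the slide notes is not the chain itself but the kernel density estimate built on top of it.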
The first efficient MCMC method
Au and Beck (2001): Subset Simulation

R^d = F_0 ⊃ F_1 ⊃ … ⊃ F_m = F
F = {x : g(x) ≥ b*},   F_i = {x : g(x) ≥ b*_i},   b*_1 < b*_2 < … < b*_m = b*

pF = ∏_{k=0}^{m−1} P(F_{k+1} | F_k)

P(F_{k+1} | F_k) ≈ (1/N) Σ_{i=1}^N I_{F_{k+1}}(x_k^(i)),    x_k^(i) ∼ π(x|F_k) = π(x) I_{F_k}(x) / P(F_k)

How to sample from π(x|F_k)? Use an appropriate MCMC algorithm.
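The whole scheme fits in a short program. A minimal sketch with adaptively chosen levels at p0 = 0.1 and a componentwise (Modified Metropolis) move; the toy performance function, dimension, sample sizes, and proposal scale are illustrative choices, not the settings of the paper:

```python
import math
import random

def mmh_step(x, g, level, rng, sigma=1.0):
    """One Modified Metropolis move targeting pi(x | g(x) >= level),
    pi = standard normal: accept each component, then check the level."""
    y = []
    for xj in x:
        cand = xj + rng.gauss(0.0, sigma)            # S^j(.|x^j) = N(x^j, sigma^2)
        # componentwise a^j = min(1, pi_j(cand)/pi_j(x^j)) for the N(0,1) marginal
        if rng.random() < math.exp(0.5 * (xj * xj - cand * cand)):
            y.append(cand)
        else:
            y.append(xj)
    return y if g(y) >= level else x                 # global accept/reject

def subset_simulation(g, b_star, d, n=1000, p0=0.1, seed=3):
    rng = random.Random(seed)
    xs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    p_f = 1.0
    n_seed = int(p0 * n)
    for _ in range(20):                              # cap on the number of levels
        xs.sort(key=g, reverse=True)
        level = g(xs[n_seed - 1])                    # empirical (1 - p0)-quantile
        if level >= b_star:                          # final level reached
            return p_f * sum(g(x) >= b_star for x in xs) / len(xs)
        p_f *= p0
        seeds = xs[:n_seed]                          # samples already in F_{k+1}
        xs = []
        for s in seeds:                              # grow one chain per seed
            x = s
            for _ in range(n // n_seed):
                x = mmh_step(x, g, level, rng)
                xs.append(x)
    return p_f

g = lambda x: x[0]                      # toy linear g, so pF = Phi(-b*) exactly
p_hat = subset_simulation(g, b_star=3.5, d=2)
p_exact = 0.5 * math.erfc(3.5 / math.sqrt(2))   # about 2.33e-4
```

Each level only has to estimate a conditional probability near p0 = 0.1, which is cheap; the rare event is reached as a product of frequent ones.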
Sampling from π(x|F_k)
Standard Metropolis-Hastings algorithm is not efficient in high dimensions

Modified Metropolis-Hastings algorithm: x_n → x_{n+1}
Generate candidate state y. For each j = 1, …, d:
  - Simulate y^j ∼ S^j(·|x_n^j)
  - Compute the acceptance probability
      a^j(x_n^j, y^j) = min{1, π^j(y^j) / π^j(x_n^j)}
  - Accept/Reject y^j:
      y^j = y^j with prob. a^j(x_n^j, y^j),   y^j = x_n^j with prob. 1 − a^j(x_n^j, y^j)
Accept/Reject y:
  x_{n+1} = y if y ∈ F_k,   x_{n+1} = x_n if y ∉ F_k
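Why the standard (whole-vector) Metropolis-Hastings update degenerates in high dimensions can be seen numerically: for a product target the log acceptance ratio is a sum of d terms with negative mean, so the whole-vector acceptance probability collapses as d grows, while each componentwise ratio used by MMH stays O(1). A sketch (the failure-domain check is omitted to isolate the effect; the proposal scale is illustrative):

```python
import math
import random

def avg_accept(d, n_trials=2000, seed=4):
    """Average Metropolis acceptance probability min(1, pi(y)/pi(x)) for a
    whole-vector proposal y = x + N(0, I_d), target pi = standard normal,
    with x drawn from pi (i.e. the chain at stationarity)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        log_ratio = 0.0
        for _ in range(d):
            xj = rng.gauss(0.0, 1.0)
            yj = xj + rng.gauss(0.0, 1.0)
            log_ratio += 0.5 * (xj * xj - yj * yj)   # log pi(y)/pi(x) factorizes
        total += min(1.0, math.exp(log_ratio))
    return total / n_trials

# Whole-vector acceptance collapses with d; the d = 1 value is also the
# per-component acceptance rate that MMH keeps at every dimension.
a1 = avg_accept(1)
a1000 = avg_accept(1000, n_trials=200)
```

With d = 1000 the chain would essentially never move, which is exactly the regime of the reliability problems above; MMH sidesteps this by accepting each coordinate separately.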
Statistical properties and efficiency of Subset Simulation
SS estimator: p̂F = ∏_{k=0}^{m−1} [ (1/N) Σ_{i=1}^N I_{F_{k+1}}(x_k^(i)) ],    x_k^(i) ∼ π(x|F_k)

Statistical properties:
p̂F is asymptotically unbiased and its bias is O(1/N)
p̂F is consistent and its coefficient of variation δ = O(1/√N)
Efficiency:
What total number of samples is required to achieve a given accuracy in p̂F?
Standard Monte Carlo: N_T ∝ 1/pF
Subset Simulation: N_T ∝ |log pF|^r, where r ≤ 3
SS is very efficient when estimating small probabilities
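The two cost scalings can be compared directly. The constants in the sketch below are illustrative placeholders; only the growth rates 1/pF and |log pF|^r come from the analysis:

```python
import math

def mc_cost(p_f):
    """Standard Monte Carlo: N_T grows like 1/pF at fixed accuracy."""
    return 1.0 / p_f

def ss_cost(p_f, r=3):
    """Subset Simulation: N_T grows like |log pF|^r with r <= 3."""
    return abs(math.log10(p_f)) ** r

# The speedup factor (up to method-dependent constants) widens rapidly:
ratios = {k: mc_cost(10.0 ** -k) / ss_cost(10.0 ** -k) for k in (3, 6, 9)}
```

For pF = 10^−9 the polylogarithmic cost is smaller by roughly six orders of magnitude, which is what makes SS practical for the small probabilities on slide 5.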
Numerical Example
Linear Problem
  - ∂F ⊂ R^d is a hyperplane
  - d = 1000
  - pF = 10^−k, k = 3, 4, 5, 6
What N_T is required to achieve the CV δ = 0.3?
MCMC Big Bang
Reliability methods based on MCMC:
Au and Beck (2001): Subset Simulation
Schueller et al (2004): Line Sampling
Ching et al (2005): Subset Simulation with Splitting
Ching et al (2005): Hybrid Subset Simulation
Katafygiotis and Cheung (2005): Two stage Subset Simulation
Katafygiotis et al (2007): Auxiliary Domain Method
Katafygiotis and Cheung (2007): Spherical Subset Simulation
Zuev and Katafygiotis (2007): Adaptive Linked Importance Sampling
Zuev and Katafygiotis (2011): Horseracing Simulation
Zuev et al (2011): Bayesian Subset Simulation
Outline
1 Reliability Problem
2 Pre-MCMC era
3 First MCMC pancake
4 Subset Simulation
5 Enhancements for Subset Simulation
  - Modified Metropolis-Hastings algorithm with Delayed Rejection
  - Bayesian Subset Simulation
6 Summary
Modifications of the Metropolis-Hastings algorithm
MMH: Modified Metropolis-Hastings algorithm
  - Au and Beck, 2001
MHDR: Metropolis-Hastings algorithm with delayed rejection
  - Tierney and Mira, 1999
MMHDR: Modified Metropolis-Hastings algorithm with delayed rejection
  - Zuev and Katafygiotis, 2011
Metropolis-Hastings algorithm with Delayed Rejection
Tierney and Mira (1999):
a_1(x_n, y_1) = min{1, [π(y_1) / π(x_n)] I_F(y_1)}

a_2(x_n, y_1, y_2) = min{1, [π(y_2) S_1(y_1|y_2) (1 − a_1(y_2, y_1))] / [π(x_n) S_1(y_1|x_n) (1 − a_1(x_n, y_1))] · I_F(y_2)}

Drawback: Inefficient in high dimensions
Reason: S_1(·|x_n) and S_2(·|x_n, y_1) are d-dimensional PDFs
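A 1D sketch of the delayed-rejection move for a plain standard-normal target (two illustrative simplifications: the indicator I_F is dropped, and both stages use the same Gaussian proposal, whose symmetry cancels the second-stage proposal densities in the reverse path):

```python
import math
import random

def dr_step(x, rng, sigma=1.0):
    """One delayed-rejection Metropolis move on pi = N(0,1): if the first
    candidate y1 is rejected, a second candidate y2 gets a corrected
    acceptance probability so that detailed balance still holds."""
    pi = lambda t: math.exp(-0.5 * t * t)                     # unnormalized target
    s1 = lambda b, a: math.exp(-0.5 * ((b - a) / sigma) ** 2)  # S1(b|a), unnormalized
    a1 = lambda u, v: min(1.0, pi(v) / pi(u))
    y1 = x + rng.gauss(0.0, sigma)                 # first-stage proposal
    if rng.random() < a1(x, y1):
        return y1
    y2 = x + rng.gauss(0.0, sigma)                 # second chance after rejection
    num = pi(y2) * s1(y1, y2) * (1.0 - a1(y2, y1))
    den = pi(x) * s1(y1, x) * (1.0 - a1(x, y1))
    if den > 0.0 and rng.random() < min(1.0, num / den):
        return y2
    return x

rng = random.Random(5)
x, chain = 0.0, []
for _ in range(30_000):
    x = dr_step(x, rng)
    chain.append(x)
```

The chain should reproduce the N(0,1) moments; the payoff of the second stage is fewer repeated states, i.e. lower sample correlation, at the price of extra proposal evaluations.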
Modified MH algorithm with Delayed Rejection
Features of the Algorithm:
Samples generated by MMHDR are less correlated than samples generated by MMH.
MMHDR needs more computational effort than MMH for generating the same number of samples.
Whether MMHDR is useful for reliability problems depends on whether the gained reduction in variance compensates for the additional cost.
With fixed computational effort:
  - MMH: more Markov chains with more correlated states
  - MMHDR: fewer Markov chains with less correlated states
Numerical Example: Linear Problem

Geometry
- d = 1000
- p_F = 10^−5, β = 4.265

Proposal PDFs
- MMH: S_j(·|x_j0) = N(x_j0, 1)
- MMHDR: S_j^{1,2}(·|x_j0) = N(x_j0, 1) (the same standard normal at both stages)

Subset Simulation
- MMH(1): SS + MMH, N = 10^3
- MMHDR(1.4): SS + MMHDR, N = 10^3
- MMH(1.4): SS + MMH, N = 1450

Reduction in CV is 11%.

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 19 / 26
“Bayesianization” of Subset Simulation

The key idea of SS:

p_F = ∏_{k=1}^m p_k,   p_k = P(F_k | F_{k−1})

Original (“frequentist”) SS:

p_k ≈ p̂_k = (1/N) ∑_{i=1}^N I_{F_k}(x_{k−1}^{(i)}) = n_k/N,   p_F ≈ p̂_F = ∏_{k=1}^m n_k/N

Bayesian SS:
1. Specify prior PDFs f(p_k) for all p_k = P(F_k|F_{k−1}), k = 1, ..., m.
2. Find the posterior PDFs f(p_k|D_{k−1}) via Bayes’ theorem, using the new data D_{k−1} = {x_{k−1}^{(1)}, ..., x_{k−1}^{(N)} ∼ π(·|F_{k−1})}.
3. Obtain the posterior PDF f(p_F | ∪_{k=0}^{m−1} D_k) of p_F = ∏_{k=1}^m p_k from f(p_1|D_0), ..., f(p_m|D_{m−1}).

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 20 / 26
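The frequentist point estimate is just a product of per-level acceptance fractions. A minimal sketch (the function name and the example counts are invented for illustration):

```python
def subset_simulation_estimate(level_counts, N):
    """Frequentist Subset Simulation estimate p_F ≈ prod_k n_k / N,
    where n_k of the N samples at level k-1 land in the next domain F_k."""
    p = 1.0
    for n_k in level_counts:
        p *= n_k / N
    return p

# Three levels with n_k = 100 out of N = 1000 each give p_F ≈ 1e-3.
print(subset_simulation_estimate([100, 100, 100], 1000))
```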
Prior and Posterior for p_k = P(F_k|F_{k−1})

1. Prior PDF f(p_k)
Principle of Maximum Entropy: f(p_k) = 1, 0 ≤ p_k ≤ 1 (the uniform prior).

2. Posterior PDF f(p_k|D_{k−1})
- If x_{k−1}^{(1)}, ..., x_{k−1}^{(N)} are i.i.d. according to π(·|F_{k−1}), then I_{F_k}(x_{k−1}^{(1)}), ..., I_{F_k}(x_{k−1}^{(N)}) can be interpreted as Bernoulli trials, and Bayes’ theorem (1763) gives

f(p_k|D_{k−1}) = p_k^{n_k} (1 − p_k)^{N−n_k} / B(n_k + 1, N − n_k + 1)

- In fact, x_{k−1}^{(1)}, ..., x_{k−1}^{(N)} are MCMC samples (for k ≥ 2): they are distributed according to π(·|F_{k−1}) but are not independent, so the Beta form holds only approximately:

f(p_k|D_{k−1}) ≈ p_k^{n_k} (1 − p_k)^{N−n_k} / B(n_k + 1, N − n_k + 1)

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 21 / 26
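This posterior is a Beta density with integer parameters, so it can be evaluated with standard-library functions only. A sketch (helper names are mine):

```python
import math

def level_posterior(n_k, N):
    """Beta(n_k + 1, N - n_k + 1) posterior for p_k under the uniform
    prior, treating the N indicator values as Bernoulli trials.
    Returns (alpha, beta, posterior mean)."""
    a, b = n_k + 1, N - n_k + 1
    return a, b, a / (a + b)

def log_beta_pdf(p, a, b):
    """log of the Beta(a, b) density; log-gamma keeps B(a, b) stable."""
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1) * math.log(p) + (b - 1) * math.log(1.0 - p) - log_B
```

For n_k = 100 and N = 1000 the posterior mean is 101/1002 ≈ 0.1008, slightly above the raw fraction n_k/N = 0.1: the uniform prior pulls the estimate toward 1/2.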
Posterior PDF for p_F

Last step: find the PDF of p_F = ∏_{k=1}^m p_k, given the PDFs of all factors, p_k ∼ Beta(n_k + 1, N − n_k + 1).

Idea: approximate p_F by a single beta variable.

Theorem (Da-Yin Fan, 1991)
Let X_1, ..., X_m be beta variables, X_k ∼ Beta(a_k, b_k), and Y = X_1 X_2 ... X_m. Then Y is approximately distributed as Ŷ ∼ Beta(a, b), where

a = μ_1 (μ_1 − μ_2)/(μ_2 − μ_1²),   b = (1 − μ_1)(μ_1 − μ_2)/(μ_2 − μ_1²),

μ_1 = ∏_{k=1}^m a_k/(a_k + b_k),   μ_2 = ∏_{k=1}^m a_k(a_k + 1)/((a_k + b_k)(a_k + b_k + 1)).

Nice property of this approximation: E[Ŷ] = E[Y] and E[Ŷ²] = E[Y²], i.e. the first two moments match exactly.

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 22 / 26
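Fan's approximation is a direct moment-matching computation: by construction, the first two moments of Beta(a, b) reproduce μ_1 and μ_2 exactly. A sketch (function name mine):

```python
def fan_beta_product(params):
    """Beta(a, b) approximation (Fan, 1991) to the product
    Y = X_1 ... X_m of independent X_k ~ Beta(a_k, b_k)."""
    mu1 = mu2 = 1.0
    for a_k, b_k in params:
        mu1 *= a_k / (a_k + b_k)                                  # E[X_k]
        mu2 *= a_k * (a_k + 1) / ((a_k + b_k) * (a_k + b_k + 1))  # E[X_k^2]
    a = mu1 * (mu1 - mu2) / (mu2 - mu1 ** 2)
    b = (1.0 - mu1) * (mu1 - mu2) / (mu2 - mu1 ** 2)
    return a, b
```

One can verify the matching algebraically: a + b = (μ_1 − μ_2)/(μ_2 − μ_1²), so a/(a + b) = μ_1 and the Beta variance equals μ_2 − μ_1².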
Bayesian Subset Simulation

Point estimate p̂_F → PDF f(p_F) = Be(p_F | a, b), with

a = [∏_{k=1}^m (n_k+1)/(N+2)] · [1 − ∏_{k=1}^m (n_k+2)/(N+3)] / [∏_{k=1}^m (n_k+2)/(N+3) − ∏_{k=1}^m (n_k+1)/(N+2)]

b = [1 − ∏_{k=1}^m (n_k+1)/(N+2)] · [1 − ∏_{k=1}^m (n_k+2)/(N+3)] / [∏_{k=1}^m (n_k+2)/(N+3) − ∏_{k=1}^m (n_k+1)/(N+2)]

What is the relationship between f(p_F) and p̂_F?

lim_{N→∞} E_f[p_F] = lim_{N→∞} p̂_F = p_F

Why is Bayesian Subset Simulation useful?
- The CV of f(p_F) can be considered as a measure of uncertainty in the value of p_F.
- The PDF f(p_F) can be fully used for life-cost analyses, decision making, etc.:

E[Loss(p_F)] = ∫ Loss(p_F) f(p_F) dp_F

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 23 / 26
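The closed-form (a, b) above follow from substituting a_k = n_k + 1 and b_k = N − n_k + 1 into Fan's formulas. A sketch, with hypothetical level counts:

```python
def bayesian_ss_posterior(level_counts, N):
    """Beta(a, b) posterior for p_F from per-level counts n_k, obtained
    by plugging a_k = n_k + 1, b_k = N - n_k + 1 into Fan's formulas."""
    q = r = 1.0
    for n_k in level_counts:
        q *= (n_k + 1) / (N + 2)  # product of the posterior means E[p_k]
        r *= (n_k + 2) / (N + 3)
    a = q * (1.0 - r) / (r - q)
    b = (1.0 - q) * (1.0 - r) / (r - q)
    return a, b

a, b = bayesian_ss_posterior([100, 100, 100], 1000)
posterior_mean = a / (a + b)  # close to the point estimate (100/1000)**3
```

The posterior mean is exactly the product of per-level posterior means, ∏ (n_k+1)/(N+2), which approaches the frequentist estimate ∏ n_k/N as N grows.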
Elasto-Plastic Structure Subjected to Ground Motion

S.K. Au (Computers & Structures, 2005):
- 2D moment-resisting steel frame
- Synthetic ground motion a = a(Z):
  - Z = (Z_1, ..., Z_d), Z_i i.i.d. ∼ N(0, 1)
  - Z → Filter → a(Z)
  - d = 1001

Failure domain:

F = {Z ∈ R^d : δ_max(Z) > b},   δ_max = max_{i=1,...,6} δ_i,

where δ_i is the maximum absolute interstory drift ratio of the i-th story within the duration of study, 30 s.

b = 0.5% ⇒ p_F ≈ 8.9 × 10^−3

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 24 / 26
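The failure criterion itself is simple to state in code. A minimal sketch of the indicator (the drift values in the example are invented):

```python
def is_failure(story_drifts, b=0.005):
    """Failure indicator for the benchmark: failure occurs when the
    largest absolute peak interstory drift ratio over the stories
    exceeds the threshold b (here b = 0.5% = 0.005)."""
    return max(abs(d) for d in story_drifts) > b

print(is_failure([0.001, 0.002, 0.0045, 0.003, 0.0025, 0.0015]))  # -> False
print(is_failure([0.001, 0.002, 0.0062, 0.003, 0.0025, 0.0015]))  # -> True
```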
Summary

- MCMC is very useful for solving engineering problems, in particular in Reliability Engineering.
- Subset Simulation (Au and Beck, 2001): a very efficient MCMC method for estimating small failure probabilities.
- Enhancements of Subset Simulation:
  - MMHDR = MMH (Au and Beck, 2001) + MHDR (Tierney and Mira, 1999)
  - Bayesian Subset Simulation

Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 25 / 26
Thank you for your attention!
Konstantin Zuev (USC) MCMC Revolution in Reliability Engineering SCPS 2011 26 / 26