Probabilistic learning for prediction and optimization of complex systems

Roger Ghanem, University of Southern California, Los Angeles, CA, USA

Christian Soize, Universite Paris-Est, Marne-La-Vallee, France

MLUQ Workshop, USC, Los Angeles, CA July 24-26 2019

Roger Ghanem Christian Soize MLUQ July 26 2019 1 / 20

Data

We are often faced with

N data points, each of dimension n.



Data

X = (Q, W, U)

Training set: values of Q, W, and U are available.

Prediction set: given values for U and W, predict distributions for Q.


Inference

Two general approaches

$Q = f(W, U)$

or

$F_{Q|W,U}(q)$

Small Data Challenge

not enough raw data to make credible inference

need additional constraints on inference




Key Ingredient: zero nought sifr zip zilch nada scratch goose-egg 0.

Constraints

extrinsic constraints: differentiability, positivity, etc.

intrinsic constraints: PCA, DMAPS

Measure of proximity between data points:

Euclidean norm, piecewise interpolation
(linear) correlation, Gaussian process interpolation
(physics) intrinsic geodesic interpolation


Data-Driven Discovery of Physics Constraints:

Intrinsic Structure Encoded in Data: use graph analysis and diffusions on manifolds to discover structure


Basic Idea

Characterize intrinsic structure of training set

using diffusion maps

Compute statistics of training set (what is likely to happen as we get more data)

using KDE with Gaussian mixture

Augment data

construct and integrate a manifold-projected Ito equation with the KDE as invariant measure


Use (very large) new data set for inference.


Progress since last year

Bayesian update $\mu_{Q,W,U|Q}$ using data

no recourse to expensive model during update

manifold is the locus of physically realizable events

Minimum KL divergence update $\mu_{Q,W,U|f(Q)}$ using constraints

no recourse to expensive model during update

manifold is the locus of physically realizable events


Ingredients: Probability Model

Construction of Distribution on the Manifold: Kernel Density Models

N data points in $\mathbb{R}^n$ are initially available, associated with random variable $[X]$ with realization $[x_d]$:

$$[X] : [x_d] = \{x_{d,1}, \dots, x_{d,N}\}, \quad x_{d,i} \in \mathbb{R}^n.$$

Data points are reduced through a Karhunen-Loeve expansion truncated at $\nu$, resulting in random variable $[H]$ with realizations $[\eta_d]$:

$$[H] : [\eta_d] = \{\eta_{d,1}, \dots, \eta_{d,N}\}, \quad \eta_{d,i} \in \mathbb{R}^\nu.$$

The N data points are assumed iid, thus:

$$p_{[H]}([\eta]) = p_H(\eta_1) \cdots p_H(\eta_N)$$
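The Karhunen-Loeve reduction can be illustrated with a minimal NumPy sketch. It assumes the data are stored as an n x N array (one column per point) and that the reduced coordinates are normalized to unit variance; the function names `kl_reduce` and `kl_reconstruct` and the synthetic data are hypothetical and not taken from the slides.

```python
import numpy as np

def kl_reduce(x_d, nu):
    """Project data x_d (n x N, one column per point) onto nu normalized KL coordinates."""
    mean = x_d.mean(axis=1, keepdims=True)
    centered = x_d - mean
    cov = centered @ centered.T / (x_d.shape[1] - 1)   # empirical covariance (n x n)
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:nu]             # keep the nu largest
    lam, phi = eigvals[order], eigvecs[:, order]
    eta_d = (phi.T @ centered) / np.sqrt(lam)[:, None] # unit-variance coordinates (nu x N)
    return eta_d, mean, phi, lam

def kl_reconstruct(eta, mean, phi, lam):
    """Map reduced coordinates back to the original space."""
    return mean + phi @ (np.sqrt(lam)[:, None] * eta)

# Hypothetical data: N = 200 points in R^5 lying close to a 2-D subspace.
rng = np.random.default_rng(0)
latent = rng.standard_normal((2, 200))
mixing = rng.standard_normal((5, 2))
x_d = mixing @ latent + 1e-3 * rng.standard_normal((5, 200))

eta_d, mean, phi, lam = kl_reduce(x_d, nu=2)
x_rec = kl_reconstruct(eta_d, mean, phi, lam)
```

With this normalization the rows of `eta_d` are centered and uncorrelated with unit variance, which is what makes an isotropic kernel a reasonable choice in the density estimate that follows.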


Ingredients: Probability Model

Construction of Distribution on the Manifold (cont’d):

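probability density function of $H$ obtained from KDE:

$$p_H(\eta) = \frac{1}{N} \sum_{j=1}^{N} \pi_\nu(\eta_{d,j} - \eta), \quad \eta \in \mathbb{R}^\nu$$

probability model for random variable representing available data:

$$p_{[H]}([\eta]) = p_H(\eta_1) \cdots p_H(\eta_N), \quad [\eta] \in \mathbb{R}^{\nu \times N}$$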
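A minimal sketch of evaluating the kernel density estimate follows. The slides do not specify the kernel $\pi_\nu$; the sketch assumes an isotropic Gaussian kernel with a bandwidth from Silverman's rule, and the names `silverman_bandwidth` and `kde_pdf` are hypothetical.

```python
import numpy as np

def silverman_bandwidth(N, nu):
    """Silverman's rule-of-thumb bandwidth for unit-variance data (an assumption here)."""
    return (4.0 / (N * (2.0 + nu))) ** (1.0 / (nu + 4.0))

def kde_pdf(eta, eta_d, s):
    """p_H(eta) = (1/N) sum_j pi_nu(eta_dj - eta), with pi_nu an isotropic Gaussian of std s.

    eta   : point of shape (nu,)
    eta_d : data of shape (nu, N)
    """
    nu, N = eta_d.shape
    sq = np.sum((eta_d - eta[:, None]) ** 2, axis=0)   # squared distances to each data point
    norm = (2.0 * np.pi * s**2) ** (nu / 2.0)          # Gaussian normalizing constant
    return np.mean(np.exp(-sq / (2.0 * s**2))) / norm

# Hypothetical one-dimensional data set with N = 50 points.
rng = np.random.default_rng(0)
eta_d = rng.standard_normal((1, 50))
s = silverman_bandwidth(N=50, nu=1)
p_at_zero = kde_pdf(np.array([0.0]), eta_d, s)
```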


Ingredients: Diffusion Manifold

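Select Diffusion Kernel

$$k_\varepsilon(\eta, \eta') = e^{-\|\eta - \eta'\|^2 / \varepsilon}$$

Proximity on the graph:

$$[K]_{ij} = k_\varepsilon(\eta_{d,i}, \eta_{d,j}), \quad [b]_{ij} = \delta_{ij} \sum_{j'} [K]_{ij'}, \quad i, j = 1, \dots, N$$

Eigenvectors of diffusion operator

$$[P] = [b]^{-1/2} [K] [b]^{-1/2}, \quad [P] \varphi_\alpha = \lambda_\alpha \varphi_\alpha$$

Eigenvectors:

$$[g] = [g_1 \cdots g_m]$$

$$g_\alpha = \lambda_\alpha^\kappa [b]^{-1/2} \varphi_\alpha \in \mathbb{R}^N, \quad \alpha = 1, \dots, m, \quad \kappa \geq 1$$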
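The construction of the diffusion-map basis can be sketched in a few lines of NumPy. The sketch assumes the kernel $\exp(-\|\eta - \eta'\|^2/\varepsilon)$ and a scaling of each eigenvector by $\lambda_\alpha^\kappa$; the function name `diffusion_basis`, the value of `eps`, and the test data on a circle are hypothetical.

```python
import numpy as np

def diffusion_basis(eta_d, eps, m, kappa=1):
    """Return the basis g (N x m) and eigenvalues lam (m,) for data eta_d (nu x N)."""
    # Pairwise squared distances between data points (N x N).
    sq = np.sum((eta_d[:, :, None] - eta_d[:, None, :]) ** 2, axis=0)
    K = np.exp(-sq / eps)                               # kernel matrix [K]
    b = K.sum(axis=1)                                   # diagonal of [b] (row sums)
    b_inv_sqrt = 1.0 / np.sqrt(b)
    P = b_inv_sqrt[:, None] * K * b_inv_sqrt[None, :]   # symmetric [b]^{-1/2}[K][b]^{-1/2}
    lam, phi = np.linalg.eigh(P)                        # ascending eigenvalues
    order = np.argsort(lam)[::-1][:m]                   # keep the m largest
    lam, phi = lam[order], phi[:, order]
    g = (lam ** kappa) * (b_inv_sqrt[:, None] * phi)    # g_alpha = lam^kappa [b]^{-1/2} phi_alpha
    return g, lam

# Hypothetical data: 60 points on the unit circle in R^2.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 60)
eta_d = np.vstack([np.cos(theta), np.sin(theta)])
g, lam = diffusion_basis(eta_d, eps=0.5, m=3)
```

The leading eigenvalue equals 1 and its basis vector is constant over the data points, so the informative geometry sits in the subsequent columns of `g`.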

Ingredients: ISDE

Ito Sampler from the KDE pdf

An Ito equation is constructed with the KDE pdf as invariant measure.

$$d[U(r)] = [V(r)]\,dr$$

$$d[V(r)] = [L([U(r)])]\,dr - \frac{1}{2} f_0 [V(r)]\,dr + \sqrt{f_0}\,[dW(r)]$$

I.C. $[U(0)] = [H_d]$, $[V(0)] = [N]$ a.s.

$$[L([U(r)])]_{k\ell} = \left[ \frac{\partial}{\partial u_\ell} \log\{q(u_\ell)\} \right]_k$$

$$q(u_\ell) = \frac{1}{N} \sum_{j=1}^{N} \exp\left\{ -\frac{1}{2 s_\nu^2} \|\eta_{d,j} - u_\ell\|^2 \right\}$$

where $u_\ell$ denotes column $\ell$ of $[U(r)]$.
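A minimal sketch of the drift term and of one way to integrate the equation follows. The drift is the gradient of $\log q$ evaluated column by column. The time stepping uses a plain Euler-Maruyama scheme, which is an assumption made for brevity and not necessarily the integrator used by the authors; the values of `f0`, `dr`, `s`, and the function names are hypothetical.

```python
import numpy as np

def drift(u, eta_d, s):
    """[L(u)]_{k,l} = k-th component of grad log q(u_l), for u of shape (nu, M)."""
    diff = eta_d[:, :, None] - u[:, None, :]       # (nu, N, M): eta_dj - u_l
    sq = np.sum(diff**2, axis=0)                   # (N, M) squared distances
    logw = -sq / (2.0 * s**2)
    logw -= logw.max(axis=0, keepdims=True)        # stabilize the exponentials
    w = np.exp(logw)
    w /= w.sum(axis=0, keepdims=True)              # mixture weights per column
    return np.einsum('kjl,jl->kl', diff, w) / s**2

def isde_sample(eta_d, s, f0=1.5, dr=0.01, n_steps=200, rng=None):
    """Euler-Maruyama integration of dU = V dr, dV = L(U) dr - f0/2 V dr + sqrt(f0) dW."""
    rng = np.random.default_rng() if rng is None else rng
    u = eta_d.copy()                               # I.C. [U(0)] = data
    v = rng.standard_normal(eta_d.shape)           # I.C. [V(0)] = Gaussian matrix
    for _ in range(n_steps):
        dW = np.sqrt(dr) * rng.standard_normal(eta_d.shape)
        u_next = u + v * dr
        v = v + (drift(u, eta_d, s) - 0.5 * f0 * v) * dr + np.sqrt(f0) * dW
        u = u_next
    return u

# Hypothetical data: N = 40 points in R^2, bandwidth s = 0.5.
rng = np.random.default_rng(0)
eta_d = rng.standard_normal((2, 40))
new_sample = isde_sample(eta_d, s=0.5, rng=rng)
```

Each call returns one new realization of the whole data matrix; repeating the call (or recording the state at well-separated times) produces additional realizations.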

Ingredients: ISDE

Previous ISDE:

admits a unique invariant measure and a unique solution $([U(r)], [V(r)])$, $r \in \mathbb{R}^+$, that is a second-order diffusion stochastic process, which is stationary and ergodic, and such that, for all $r$ fixed in $\mathbb{R}^+$, the probability density of random matrix $[U(r)]$ is $p_{[H]}([\eta])$.


Ingredients: ISDE

Sampling on the Manifold: Projected Ito Equation

$$d[Z(r)] = [Y(r)]\,dr$$

$$d[Y(r)] = [\mathcal{L}([Z(r)])]\,dr - \frac{1}{2} f_0 [Y(r)]\,dr + \sqrt{f_0}\,[d\mathcal{W}(r)]$$

I.C. $[Z(0)] = [H_d][a]$, $[Y(0)] = [N][a]$ a.s.

$$[\mathcal{L}([Z(r)])] = [L([Z(r)][g]^T)][a], \quad [a] = [g]([g]^T[g])^{-1}$$
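The projection step can be sketched as follows. The matrix $[a]$ is a right inverse of $[g]^T$, so the reduced state has shape $\nu \times m$ and is lifted back to $\nu \times N$ by multiplying with $[g]^T$ before the full drift is evaluated. The basis `g` and the drift used here are hypothetical stand-ins (random columns and the drift of a standard normal density), chosen only so that the sketch runs on its own.

```python
import numpy as np

def projection_matrix(g):
    """[a] = [g]([g]^T[g])^{-1} for a basis g of shape (N, m)."""
    return np.linalg.solve(g.T @ g, g.T).T

def projected_drift(z, g, a, drift_fn):
    """Projected drift [L([z][g]^T)][a] for a reduced state z of shape (nu, m)."""
    return drift_fn(z @ g.T) @ a

rng = np.random.default_rng(2)
nu, N, m = 3, 50, 4
g = rng.standard_normal((N, m))            # hypothetical stand-in for the diffusion-map basis
a = projection_matrix(g)

eta_d = rng.standard_normal((nu, N))       # hypothetical reduced data [H_d]
z0 = eta_d @ a                             # initial condition [Z(0)] = [H_d][a]

def standard_normal_drift(u):
    # Hypothetical drift: gradient of the log-density of a standard normal.
    return -u

dz = projected_drift(z0, g, a, standard_normal_drift)
```

Because $[g]^T[a]$ is the $m \times m$ identity, the reduced equation evolves only $\nu \times m$ unknowns instead of $\nu \times N$, and a new data matrix is recovered as `z @ g.T`.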




Example: ScramJet

> 100M elements; 28 variables


Quantities of Interest: Q

Objective function:

Q1: combustion efficiency

Constraints:

Q2: burned equivalence ratio
Q3: stagnation pressure loss
Q4: maximum RMS pressure

Control Variables: W

primary injector location

primary injector angle

secondary injector location

global equivalence ratio

primary-secondary ratio


Optimal Solution

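$$w^{\text{opt}} = \arg\min_{w \in C_w \subset \mathbb{R}^5} J(w) \quad \text{s.t. } c_i < 0, \quad i = 1, 2, 3$$

Objective:

$$J(w) = E[Q_2]$$

Constraints:

$$c_1(w^{\text{opt}}) = 1 - \alpha - P\{Q_1(w^{\text{opt}}) > L_1 \mid Q_2\}$$

$$c_2(w^{\text{opt}}) = 1 - \alpha - P\{Q_3(w^{\text{opt}}) < U_3 \mid Q_2\}$$

$$c_3(w^{\text{opt}}) = 1 - \alpha - P\{Q_4(w^{\text{opt}}) < U_4 \mid Q_2\}$$

$C_w$:

$w_1 \in [0.5, 1]$
$w_2 \in [0.25, 0.35]$
$w_3 \in [0.231, 0.2564]$
$w_4 \in [0.40755, 0.43295]$
$w_5 \in [5, 25]$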
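The slides do not state how the expectation and the probabilities in the constraints are estimated from the augmented data set. As one possible illustration, the sketch below assumes joint samples of the control variables and a quantity of interest, and estimates statistics conditional on a design point $W = w$ with normalized Gaussian kernel weights (a Nadaraya-Watson style estimator). The conditioning on $W = w$, the bandwidth `h`, the function names, and the synthetic data are all assumptions of this sketch.

```python
import numpy as np

def kernel_weights(w, w_samples, h):
    """Normalized Gaussian kernel weights of each sample for design point w.

    w         : design point of shape (d,)
    w_samples : sampled control variables of shape (M, d)
    """
    sq = np.sum(((w_samples - w) / h) ** 2, axis=1)
    k = np.exp(-0.5 * (sq - sq.min()))          # shift for numerical stability
    return k / k.sum()

def conditional_mean(w, w_samples, q_samples, h):
    """Estimate E[Q | W = w] as a kernel-weighted average."""
    return np.sum(kernel_weights(w, w_samples, h) * q_samples)

def chance_constraint(w, w_samples, q_samples, threshold, alpha, h, upper=True):
    """Estimate c(w) = 1 - alpha - P{Q < threshold | W = w} (or P{Q > threshold | W = w})."""
    weights = kernel_weights(w, w_samples, h)
    event = q_samples < threshold if upper else q_samples > threshold
    return 1.0 - alpha - np.sum(weights * event)

# Hypothetical joint samples: one control variable, Q = W + noise.
rng = np.random.default_rng(3)
M = 20000
w_samples = rng.uniform(0.0, 1.0, size=(M, 1))
q_samples = w_samples[:, 0] + 0.1 * rng.standard_normal(M)

w = np.array([0.5])
j_value = conditional_mean(w, w_samples, q_samples, h=0.05)
c_loose = chance_constraint(w, w_samples, q_samples, threshold=0.9, alpha=0.05, h=0.05)
c_tight = chance_constraint(w, w_samples, q_samples, threshold=0.5, alpha=0.05, h=0.05)
```

A design point is feasible when every estimated constraint value is negative; here the loose threshold gives a negative value and the tight one does not. These estimators need no further calls to the expensive model, only the sampled data set.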


Optimal Solutions

Convergence of statistical estimates at two different resolutions


Optimal solution

[Figure: five panels plotting $\phi_G$, $\phi_R$, $x_1$, $x_2$, and $\theta_1$ against Size of Training Set; legend entries d/8, d/16, d/32.]


Comments

Objective:

Compute credible statistics for risk/decision using a handful of samples.

Bet:

Quantities of interest (e.g., objective functions) are much simpler than the whole physics problem.

In spite of parametric variations, the physics provides sufficient constraints to restrict fluctuations to a computable manifold.

