A Sampling Filter for Non-Gaussian Data Assimilation

Ahmed Attia and Adrian Sandu
Computational Science Laboratory (CSL), Department of Computer Science, Virginia Tech
{attia,sandu}@cs.vt.edu

October 13, 2014, UMD-VT Data Assimilation Day. (http://csl.cs.vt.edu)


Outline

- Motivation
- Problem formulation
- Sampling approach: HMCMC
- Experiment and results
- Future work
- Questions


Motivation:

- A Monte Carlo approach capable of finding a good analysis for DA problems in both Gaussian and non-Gaussian frameworks!
- Why don't we sample directly from the posterior PDF?
- A robust, fast sampling strategy is required!
- MCMC: popular and guaranteed to converge, but may take forever!
- Auxiliary variable MCMC: Hybrid Monte Carlo (HMCMC):
  - Fast convergence rate,
  - High acceptance probability of generated states,
  - Fairly easy to tune the parameters and enhance performance.


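Sequential DA:

- From Bayes' Theorem:

$$ \mathcal{P}^a(\mathbf{x}) = \mathcal{P}(\mathbf{x}|\mathbf{y}) = \frac{\mathcal{P}(\mathbf{y}|\mathbf{x})\,\mathcal{P}^b(\mathbf{x})}{\mathcal{P}(\mathbf{y})} \tag{1} $$

$$ \propto \mathcal{P}(\mathbf{y}|\mathbf{x})\,\mathcal{P}^b(\mathbf{x}). \tag{2} $$

- Gaussian framework:

$$ \mathcal{P}^b(\mathbf{x}) \propto \exp\left(-\frac{1}{2}(\mathbf{x}-\mathbf{x}^b)^T \mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}^b)\right), \tag{3} $$

$$ \mathcal{P}(\mathbf{y}|\mathbf{x}) \propto \exp\left(-\frac{1}{2}(\mathcal{H}(\mathbf{x})-\mathbf{y})^T \mathbf{R}^{-1}(\mathcal{H}(\mathbf{x})-\mathbf{y})\right). \tag{4} $$

- Posterior PDF:

$$ \mathcal{P}^a(\mathbf{x}) \propto \exp\left(-\mathcal{J}(\mathbf{x})\right), \tag{5} $$

$$ \mathcal{J}(\mathbf{x}) = \frac{1}{2}(\mathbf{x}-\mathbf{x}^b)^T \mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}^b) + \frac{1}{2}(\mathcal{H}(\mathbf{x})-\mathbf{y})^T \mathbf{R}^{-1}(\mathcal{H}(\mathbf{x})-\mathbf{y}). \tag{6} $$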
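As an illustration of Eq. (6), the following minimal sketch evaluates the cost function J and its gradient for the special case of a linear observation operator. The function name `cost_and_gradient`, the toy dimensions, and all numerical values are assumptions made for the example; they do not come from the slides. The gradient formula assumes that the inverse covariances are symmetric.

```python
import numpy as np


def cost_and_gradient(x, xb, B_inv, y, R_inv, H):
    """Evaluate J(x) of Eq. (6) and its gradient for a LINEAR operator H(x) = H @ x.

    B_inv and R_inv are the inverse background and observation error
    covariances, both assumed symmetric.
    """
    dx = x - xb                      # departure from the background state
    dy = H @ x - y                   # departure from the observations
    J = 0.5 * dx @ B_inv @ dx + 0.5 * dy @ R_inv @ dy
    grad = B_inv @ dx + H.T @ (R_inv @ dy)
    return J, grad


# Toy setup (hypothetical values): 4 state variables, components 1 and 4 observed.
xb = np.zeros(4)
B_inv = np.eye(4)
H = np.eye(4)[[0, 3]]
R_inv = 4.0 * np.eye(2)              # observation error standard deviation 0.5
y = np.array([1.0, -1.0])

J0, g0 = cost_and_gradient(xb, xb, B_inv, y, R_inv, H)
print(J0, g0)
```

For a nonlinear observation operator, `H.T` in the gradient would be replaced by the transpose of the operator's Jacobian evaluated at x.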


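Hybrid MC:

- HMCMC: analogy to a Hamiltonian mechanical system.
- To draw samples $\{\mathbf{x}(e)\}$ from a given probability distribution $\pi(\mathbf{x})$:
  - View the state variable as a position variable,
  - Add an auxiliary "momentum" variable $\mathbf{p}$ and sample from the joint PDF of both variables $(\mathbf{p}, \mathbf{x})$.
  - The Hamiltonian:

$$ H(\mathbf{p}, \mathbf{x}) = \frac{1}{2}\mathbf{p}^T \mathbf{M}^{-1}\mathbf{p} - \log(\pi(\mathbf{x})) = \underbrace{\frac{1}{2}\mathbf{p}^T \mathbf{M}^{-1}\mathbf{p}}_{\text{kinetic energy}} + \underbrace{\mathcal{J}(\mathbf{x})}_{\text{potential energy}}, \tag{7} $$

  - The Hamiltonian dynamics (symplectic integrator needed):

$$ \frac{d\mathbf{x}}{dt} = \nabla_{\mathbf{p}} H = \mathbf{M}^{-1}\mathbf{p}, \qquad \frac{d\mathbf{p}}{dt} = -\nabla_{\mathbf{x}} H = -\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}). \tag{8} $$

- The canonical PDF of $(\mathbf{p}, \mathbf{x})$:

$$ \exp\left(-H(\mathbf{p}, \mathbf{x})\right) = \exp\left(-\frac{1}{2}\mathbf{p}^T \mathbf{M}^{-1}\mathbf{p} - \mathcal{J}(\mathbf{x})\right) = \exp\left(-\frac{1}{2}\mathbf{p}^T \mathbf{M}^{-1}\mathbf{p}\right)\cdot \pi(\mathbf{x}). \tag{9} $$

- Generate a Markov chain whose stationary distribution is proportional to $\exp\left(-H(\mathbf{p}, \mathbf{x})\right)$.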
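A short worked step shows why sampling the joint variable and then discarding the momentum gives samples of the target. Integrating the canonical PDF (9) over the momentum leaves the target up to a constant factor. The symbol n for the dimension of p is an assumption introduced here.

```latex
\int_{\mathbb{R}^n} \exp\left(-H(\mathbf{p},\mathbf{x})\right)\, d\mathbf{p}
  = \pi(\mathbf{x}) \int_{\mathbb{R}^n}
      \exp\left(-\tfrac{1}{2}\,\mathbf{p}^T \mathbf{M}^{-1}\mathbf{p}\right) d\mathbf{p}
  = (2\pi)^{n/2}\,\det(\mathbf{M})^{1/2}\;\pi(\mathbf{x})
  \;\propto\; \pi(\mathbf{x}).
```

The factor in front of the target does not depend on x, so the x-marginal of the joint chain has the target as its stationary distribution.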


HMCMC Sampling for Sequential DA:

Assimilation at time $t_k$:

- Forecast: given an analysis ensemble $\{\mathbf{x}^a_{k-1}(e)\}_{e=1,2,\ldots,N_{ens}}$ at time $t_{k-1}$, generate the forecast ensemble via the model $\mathcal{M}$:

$$ \mathbf{x}^b_k(e) = \mathcal{M}_{t_{k-1}\rightarrow t_k}\left(\mathbf{x}^a_{k-1}(e)\right), \quad e = 1, 2, \ldots, N_{ens}. \tag{10} $$

- Analysis: given the observation vector $\mathbf{y}_k$ at time point $t_k$, start sampling from $\mathcal{P}^a(\mathbf{x}|\mathbf{y}_k)$:
  i- Initialize the Markov chain with the best estimate available for $\mathbf{x}_0$ (e.g. forecast, EnKF, 3D-Var),
  ii- Optionally, calculate the ensemble-based forecast error covariance matrix $\mathbf{B}_k$,
  iii- Choose the mass matrix $\mathbf{M}$ (e.g. constant diagonal, diagonal of $\mathbf{B}_0$ or $\mathbf{B}_k$),
  iv- Generate the Markov chain and sample at stationarity,
  v- Use the generated samples $\{\mathbf{x}^a_k(e)\}_{e=1,2,\ldots,N_{ens}}$ as an analysis ensemble and calculate the best estimate of the state (e.g. the mean) and the analysis error covariance matrix.
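The forecast step (10) and the preparation steps i- to iii- can be sketched as follows. This is only an illustration under stated assumptions: the linear toy `model`, the ensemble size, and the helper names `forecast_ensemble` and `ensemble_covariance` are hypothetical, and the mass matrix is taken as the diagonal of the ensemble covariance, which is one of the options listed above.

```python
import numpy as np


def forecast_ensemble(analysis_ens, model):
    """Eq. (10): propagate every analysis member through the model."""
    return np.array([model(x) for x in analysis_ens])


def ensemble_covariance(ens):
    """Unbiased sample covariance of an ensemble stored as (N_ens, N_var)."""
    anomalies = ens - ens.mean(axis=0)
    return anomalies.T @ anomalies / (ens.shape[0] - 1)


# Hypothetical stand-in for the model M_{t_{k-1} -> t_k}.
def model(x):
    return 0.9 * x + 0.1


rng = np.random.default_rng(1)
xa_prev = rng.standard_normal((30, 3))   # toy analysis ensemble at t_{k-1}

xb = forecast_ensemble(xa_prev, model)   # forecast ensemble at t_k
x_init = xb.mean(axis=0)                 # step i: initialize the chain at the forecast mean
Bk = ensemble_covariance(xb)             # step ii: ensemble-based B_k
M_diag = np.diag(Bk).copy()              # step iii: mass matrix = diagonal of B_k
print(x_init, M_diag)
```

Steps iv- and v- then run the sampler from `x_init` and summarize the retained states.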


HMCMC Sampling Algorithm:

Given the initial state of the chain:

1. Draw a random vector $\mathbf{p}_k \sim \mathcal{N}(0, \mathbf{M})$.
2. Use a symplectic numerical integrator (e.g. Verlet) to advance the current state $(\mathbf{p}_k, \mathbf{x}_k)$ by a time increment $T$ to obtain a proposal state $(\mathbf{p}^*, \mathbf{x}^*)$:

$$ (\mathbf{p}^*, \mathbf{x}^*) = \Phi_T\left((\mathbf{p}_k, \mathbf{x}_k)\right). $$

3. Evaluate the loss of energy $\Delta H$ based on the Hamiltonian function:

$$ \Delta H = H(\mathbf{p}^*, \mathbf{x}^*) - H(\mathbf{p}_k, \mathbf{x}_k). \tag{11} $$

4. Calculate the probability:

$$ a^{(k)} = 1 \wedge e^{-\Delta H}. \tag{12} $$

5. Discard both $\mathbf{p}^*$ and $\mathbf{p}_k$.
6. (Acceptance/Rejection) Draw a uniform random variable $u^{(k)} \sim \mathcal{U}(0, 1)$:
   i- If $a^{(k)} > u^{(k)}$, accept the proposal as the next sample: $\mathbf{x}_{k+1} := \mathbf{x}^*$;
   ii- If $a^{(k)} \leq u^{(k)}$, reject the proposal and continue with the current state: $\mathbf{x}_{k+1} := \mathbf{x}_k$.
7. Repeat steps 1 to 6 until sufficiently many distinct samples are drawn.
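The steps above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the mass matrix is diagonal (stored as the vector `M_diag`), the proposal uses m position Verlet steps of size h so that T = m h, and the target in the usage example is a two-dimensional standard Gaussian chosen only so the result is easy to check. All function names are hypothetical.

```python
import numpy as np


def position_verlet(x, p, grad_J, M_diag, h, m):
    """Advance (p, x) by m position Verlet steps of size h (total time T = m * h)."""
    for _ in range(m):
        x = x + 0.5 * h * p / M_diag
        p = p - h * grad_J(x)
        x = x + 0.5 * h * p / M_diag
    return x, p


def hamiltonian(x, p, J, M_diag):
    """H(p, x) = kinetic energy + potential energy J(x), diagonal mass matrix."""
    return 0.5 * np.sum(p * p / M_diag) + J(x)


def hmc_chain(x0, J, grad_J, M_diag, h, m, n_steps, rng):
    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_steps, x.size))
    accepted = 0
    for k in range(n_steps):
        p = np.sqrt(M_diag) * rng.standard_normal(x.size)     # step 1: p ~ N(0, M)
        x_star, p_star = position_verlet(x, p, grad_J, M_diag, h, m)   # step 2
        dH = hamiltonian(x_star, p_star, J, M_diag) - hamiltonian(x, p, J, M_diag)  # step 3
        a = np.exp(min(0.0, -dH))                             # step 4: min(1, exp(-dH))
        if a > rng.uniform():                                 # step 6: accept or reject
            x = x_star
            accepted += 1
        chain[k] = x                                          # momenta are discarded (step 5)
    return chain, accepted / n_steps


# Usage on a toy target: standard Gaussian, J(x) = x.x / 2.
rng = np.random.default_rng(0)
chain, rate = hmc_chain(np.zeros(2), lambda x: 0.5 * x @ x, lambda x: x,
                        np.ones(2), h=0.2, m=10, n_steps=2000, rng=rng)
print(rate, chain.mean(axis=0), chain.var(axis=0))
```

Writing the acceptance probability as `exp(min(0, -dH))` avoids overflow when the proposal lowers the energy by a large amount.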


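Symplectic numerical integrators

One step advances the solution of the Hamiltonian equations from time $t_k$ to time $t_{k+1} = t_k + h$ as follows:

1. Position Verlet integrator

$$ \mathbf{x}_{k+1/2} = \mathbf{x}_k + \frac{h}{2}\mathbf{M}^{-1}\mathbf{p}_k, \tag{13a} $$

$$ \mathbf{p}_{k+1} = \mathbf{p}_k - h\,\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_{k+1/2}), \tag{13b} $$

$$ \mathbf{x}_{k+1} = \mathbf{x}_{k+1/2} + \frac{h}{2}\mathbf{M}^{-1}\mathbf{p}_{k+1}. \tag{13c} $$

2. Two-stage integrator

$$ \mathbf{x}_1 = \mathbf{x}_k + (a_1 h)\mathbf{M}^{-1}\mathbf{p}_k, \tag{14a} $$

$$ \mathbf{p}_1 = \mathbf{p}_k - (b_1 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_1), \tag{14b} $$

$$ \mathbf{x}_2 = \mathbf{x}_1 + (a_2 h)\mathbf{M}^{-1}\mathbf{p}_1, \tag{14c} $$

$$ \mathbf{p}_{k+1} = \mathbf{p}_1 - (b_1 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_2), \tag{14d} $$

$$ \mathbf{x}_{k+1} = \mathbf{x}_2 + (a_1 h)\mathbf{M}^{-1}\mathbf{p}_{k+1}, \tag{14e} $$

where $a_1 = 0.21132$, $a_2 = 1 - 2a_1$, $b_1 = 0.5$, so that the position sub-steps sum to $h$.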
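A sketch of one two-stage step, with a numerical check of time reversibility, is given below. It assumes the symmetric arrangement in which the final position update uses a1, so that the position coefficients a1 + a2 + a1 sum to one; the harmonic toy potential J(x) = x.x / 2, the step size, and the function name `two_stage_step` are assumptions made for the example.

```python
import numpy as np

A1 = 0.21132
A2 = 1.0 - 2.0 * A1
B1 = 0.5


def two_stage_step(x, p, grad_J, M_inv, h):
    """One step of the two-stage integrator, Eqs. (14a)-(14e)."""
    x = x + (A1 * h) * (M_inv @ p)
    p = p - (B1 * h) * grad_J(x)
    x = x + (A2 * h) * (M_inv @ p)
    p = p - (B1 * h) * grad_J(x)
    x = x + (A1 * h) * (M_inv @ p)   # assumed a1 here (symmetric composition)
    return x, p


def grad_J(x):
    return x                          # toy potential J(x) = 0.5 * x.x


M_inv = np.eye(2)
x0 = np.array([1.0, -0.5])
p0 = np.array([0.3, 0.8])
energy0 = 0.5 * p0 @ p0 + 0.5 * x0 @ x0

# Integrate forward for 20 steps.
x_fwd, p_fwd = x0, p0
for _ in range(20):
    x_fwd, p_fwd = two_stage_step(x_fwd, p_fwd, grad_J, M_inv, 0.1)
energy_error = abs(0.5 * p_fwd @ p_fwd + 0.5 * x_fwd @ x_fwd - energy0)

# Flip the momentum and integrate again: a reversible scheme returns to the start.
x_back, p_back = x_fwd, -p_fwd
for _ in range(20):
    x_back, p_back = two_stage_step(x_back, p_back, grad_J, M_inv, 0.1)

print(energy_error, np.abs(x_back - x0).max())
```

Reversibility, together with volume preservation, is what makes the simple acceptance rule based on the energy change valid for the resulting chain.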


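Symplectic numerical integrators

3. Three-stage integrator

$$ \mathbf{x}_1 = \mathbf{x}_k + (a_1 h)\mathbf{M}^{-1}\mathbf{p}_k, \tag{15a} $$

$$ \mathbf{p}_1 = \mathbf{p}_k - (b_1 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_1), \tag{15b} $$

$$ \mathbf{x}_2 = \mathbf{x}_1 + (a_2 h)\mathbf{M}^{-1}\mathbf{p}_1, \tag{15c} $$

$$ \mathbf{p}_2 = \mathbf{p}_1 - (b_2 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_2), \tag{15d} $$

$$ \mathbf{x}_3 = \mathbf{x}_2 + (a_2 h)\mathbf{M}^{-1}\mathbf{p}_2, \tag{15e} $$

$$ \mathbf{p}_{k+1} = \mathbf{p}_2 - (b_1 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_3), \tag{15f} $$

$$ \mathbf{x}_{k+1} = \mathbf{x}_3 + (a_1 h)\mathbf{M}^{-1}\mathbf{p}_{k+1}, \tag{15g} $$

where:

$$ a_1 = 0.11888010966548, \quad a_2 = 0.5 - a_1, \quad b_1 = 0.29619504261126, \quad b_2 = 1 - 2b_1. $$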


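Symplectic numerical integrators

4. Four-stage integrator

$$ \mathbf{x}_1 = \mathbf{x}_k + (a_1 h)\mathbf{M}^{-1}\mathbf{p}_k, \tag{16a} $$

$$ \mathbf{p}_1 = \mathbf{p}_k - (b_1 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_1), \tag{16b} $$

$$ \mathbf{x}_2 = \mathbf{x}_1 + (a_2 h)\mathbf{M}^{-1}\mathbf{p}_1, \tag{16c} $$

$$ \mathbf{p}_2 = \mathbf{p}_1 - (b_2 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_2), \tag{16d} $$

$$ \mathbf{x}_3 = \mathbf{x}_2 + (a_3 h)\mathbf{M}^{-1}\mathbf{p}_2, \tag{16e} $$

$$ \mathbf{p}_3 = \mathbf{p}_2 - (b_2 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_3), \tag{16f} $$

$$ \mathbf{x}_4 = \mathbf{x}_3 + (a_2 h)\mathbf{M}^{-1}\mathbf{p}_3, \tag{16g} $$

$$ \mathbf{p}_{k+1} = \mathbf{p}_3 - (b_1 h)\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_4), \tag{16h} $$

$$ \mathbf{x}_{k+1} = \mathbf{x}_4 + (a_1 h)\mathbf{M}^{-1}\mathbf{p}_{k+1}, \tag{16i} $$

where:

$$ a_1 = 0.071353913450279725904, \quad a_2 = 0.268458791161230105820, \quad a_3 = 1 - 2a_1 - 2a_2, \quad b_1 = 0.1916678, \quad b_2 = 0.5 - b_1. $$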
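The three-stage and four-stage schemes share one pattern: alternating position and momentum updates with a palindromic list of coefficients. The sketch below implements that pattern once and feeds it the coefficient sets of (15) and (16). The function names, the harmonic toy potential, and the step size are assumptions made for the example; only the coefficient values come from the slides.

```python
import numpy as np


def multi_stage_step(x, p, grad_J, M_inv, h, a, b):
    """One step of a position-first splitting integrator.

    a holds the position coefficients (length s + 1) and b the momentum
    coefficients (length s) of an s-stage scheme; each list should sum to one.
    """
    for a_i, b_i in zip(a[:-1], b):
        x = x + (a_i * h) * (M_inv @ p)
        p = p - (b_i * h) * grad_J(x)
    x = x + (a[-1] * h) * (M_inv @ p)
    return x, p


def three_stage_coefficients():
    a1 = 0.11888010966548
    a2 = 0.5 - a1
    b1 = 0.29619504261126
    b2 = 1.0 - 2.0 * b1
    return [a1, a2, a2, a1], [b1, b2, b1]


def four_stage_coefficients():
    a1 = 0.071353913450279725904
    a2 = 0.268458791161230105820
    a3 = 1.0 - 2.0 * a1 - 2.0 * a2
    b1 = 0.1916678
    b2 = 0.5 - b1
    return [a1, a2, a3, a2, a1], [b1, b2, b2, b1]


def energy_drift(coefficients, h=0.1, n_steps=50):
    """Absolute energy error on the toy potential J(x) = 0.5 * x.x with M = I."""
    a, b = coefficients
    x, p = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    e0 = 0.5 * p @ p + 0.5 * x @ x
    for _ in range(n_steps):
        x, p = multi_stage_step(x, p, lambda z: z, np.eye(2), h, a, b)
    return abs(0.5 * p @ p + 0.5 * x @ x - e0)


drift3 = energy_drift(three_stage_coefficients())
drift4 = energy_drift(four_stage_coefficients())
print(drift3, drift4)
```

Each extra stage costs one more gradient evaluation per step, so a fair comparison between schemes scales the step size with the number of stages, as in the tuned settings h = 2/Nvar, 3/Nvar, 4/Nvar used later in the deck.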


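Symplectic numerical integrators

5. Infinite dimensional integrator

$$ \mathbf{p}_1 = \mathbf{p}_k - \frac{h}{2}\mathbf{M}^{-1}\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_k), \tag{17a} $$

$$ \mathbf{x}_{k+1} = \cos(h)\,\mathbf{x}_k + \sin(h)\,\mathbf{p}_1, \tag{17b} $$

$$ \mathbf{p}_2 = -\sin(h)\,\mathbf{x}_k + \cos(h)\,\mathbf{p}_1, \tag{17c} $$

$$ \mathbf{p}_{k+1} = \mathbf{p}_2 - \frac{h}{2}\mathbf{M}^{-1}\nabla_{\mathbf{x}}\mathcal{J}(\mathbf{x}_{k+1}). \tag{17d} $$

Here, the loss of energy is computed as:

$$ \Delta H = \phi(\mathbf{x}^*) - \phi(\mathbf{x}_k) + \frac{h^2}{8}\left(\left|\mathbf{M}^{-\frac{1}{2}}(-\nabla\phi(\mathbf{x}_k))\right|^2 - \left|\mathbf{M}^{-\frac{1}{2}}(-\nabla\phi(\mathbf{x}^*))\right|^2\right) + h\sum_{i=1}^{m-1}\mathbf{p}_i^T(-\nabla\phi(\mathbf{x}_i)) + \frac{h}{2}\left(\mathbf{p}_k^T(-\nabla\phi(\mathbf{x}_k)) + (\mathbf{p}^*)^T(-\nabla\phi(\mathbf{x}^*))\right), $$

where $\phi(\mathbf{x}) = -\log(\pi(\mathbf{x}))$ and $(\mathbf{p}_i, \mathbf{x}_i)$, $i = 1, \ldots, m-1$, are the intermediate states of the $m$-step trajectory from $(\mathbf{p}_k, \mathbf{x}_k)$ to $(\mathbf{p}^*, \mathbf{x}^*)$.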


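Experimental Settings

- Applied to the 40-variable Lorenz-96 model, observing every third component,
- Observation operators:

1) Linear:

$$ \mathcal{H}(\mathbf{x}) = \mathbf{H}\mathbf{x} = (x_1, x_4, x_7, \ldots, x_{37}, x_{40})^T \in \mathbb{R}^{14}. $$

2) Quadratic:

$$ \mathcal{H}(\mathbf{x}) = (x_1^2, x_4^2, x_7^2, \ldots, x_{37}^2, x_{40}^2)^T \in \mathbb{R}^{14}. $$

3) Cubic:

$$ \mathcal{H}(\mathbf{x}) = (x_1^3, x_4^3, x_7^3, \ldots, x_{37}^3, x_{40}^3)^T \in \mathbb{R}^{14}. $$

4) Magnitude:

$$ \mathcal{H}(\mathbf{x}) = (|x_1|, |x_4|, |x_7|, \ldots, |x_{37}|, |x_{40}|)^T \in \mathbb{R}^{14}. $$

5) Quadratic with threshold:

$$ \mathcal{H}(\mathbf{x}) = (x'_1, x'_4, x'_7, \ldots, x'_{37}, x'_{40})^T \in \mathbb{R}^{14}, \quad \text{where} \quad x'_i = \begin{cases} x_i^2 & : x_i \geq 0.5 \\ -x_i^2 & : x_i < 0.5 \end{cases} $$

6) Exponential:

$$ \mathcal{H}(\mathbf{x}) = (e^{r\cdot x_1}, e^{r\cdot x_4}, e^{r\cdot x_7}, \ldots, e^{r\cdot x_{37}}, e^{r\cdot x_{40}})^T \in \mathbb{R}^{14}. $$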
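The six observation operators can be written compactly with NumPy indexing. This is a sketch under stated assumptions: the state is a length-40 array with zero-based indexing, so components x1, x4, ..., x40 are positions 0, 3, ..., 39, and the function names are hypothetical. The default values 0.5 and 0.2 for the threshold and r match the experiments reported in the deck.

```python
import numpy as np

# Zero-based positions of x1, x4, ..., x40 in a 40-variable state (14 entries).
OBS_IDX = np.arange(0, 40, 3)


def h_linear(x):
    return x[OBS_IDX]


def h_quadratic(x):
    return x[OBS_IDX] ** 2


def h_cubic(x):
    return x[OBS_IDX] ** 3


def h_magnitude(x):
    return np.abs(x[OBS_IDX])


def h_quadratic_threshold(x, threshold=0.5):
    xo = x[OBS_IDX]
    # x_i^2 where x_i >= threshold, -x_i^2 otherwise
    return np.where(xo >= threshold, xo ** 2, -(xo ** 2))


def h_exponential(x, r=0.2):
    return np.exp(r * x[OBS_IDX])


x = np.linspace(-1.0, 1.0, 40)        # toy state, not a Lorenz-96 trajectory
for op in (h_linear, h_quadratic, h_cubic, h_magnitude,
           h_quadratic_threshold, h_exponential):
    print(op.__name__, op(x).shape)
```

The magnitude and quadratic operators discard the sign of the observed components, which is one way a Gaussian prior can lead to a multimodal, non-Gaussian posterior.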


Results: Quadratic H with a threshold 0.5

[Five plots of RMSE versus time (0 to 10); legend entries: Forecast, EnKF, MLEF. Panels: (a) Position Verlet integrator (13); (b) Two-stage integrator (14); (c) Three-stage integrator (15); (d) Four-stage integrator (16); (e) Integrator defined on Hilbert space (17).]

Figure: Data assimilation results with the quadratic observation operator with a threshold value (12). The integrator used is indicated under each panel. The time step for all integrators is T = 0.1 with h = 0.01, m = 10, and 30 inter-chain steps.


Results: Exponential H with r = 0.2

[Five plots of RMSE versus time (0 to 10); legend entries: Forecast, EnKF, MLEF. Panels: (a) Position Verlet integrator (13); (b) Two-stage integrator (14); (c) Three-stage integrator (15); (d) Four-stage integrator (16); (e) Integrator defined on Hilbert space (17).]

Figure: Data assimilation results with the exponential observation operator with r = 0.2 (12). The integrator used is indicated under each panel. The time step for all integrators is T = 0.1 with h = 0.01, m = 10, and 30 inter-chain steps.


Results: Exponential H with r = 0.2

Figure: Scatter plot of the sampled joint PDF of selected components. Results are obtained from HMCMC ensembles with the three-stage integrator.


Results: Exponential H with a factor r = 0.5

[Two plots of RMSE versus time (0 to 10); legend entry recovered: Forecast. Panels: (a) Three-stage integrator; h = 0.01, m = 60; (b) Four-stage integrator; h = 0.001, m = 30.]

Figure: Data assimilation results with the exponential observation operator with r = 0.5 (12). The integrator used is indicated under each panel. The step size h and the number of steps m are indicated under each panel, and 30 inter-chain steps are considered.


Tuning the parameters:

[Six panels; the plots themselves are not recoverable from the transcript.]

(a) RMSE mean: two-stage integrator; h = 2/Nvar, m = Nvar
(b) RMSE standard deviation: two-stage integrator; h = 2/Nvar, m = Nvar
(c) RMSE mean: three-stage integrator; h = 3/Nvar, m = Nvar/2
(d) RMSE standard deviation: three-stage integrator; h = 3/Nvar, m = Nvar/2
(e) RMSE mean: four-stage integrator; h = 4/Nvar, m = Nvar/2
(f) RMSE standard deviation: four-stage integrator; h = 4/Nvar, m = Nvar/2

Figure: The mean and standard deviation of the RMS error for the quadratic observation operator (12) for different numbers of inter-chain steps.


Results: Quadratic H with tuned parameters

[Plot of RMSE versus time (0 to 10); legend entries: Forecast, EnKF, MLEF.]

Figure: RMS error of the sampling filter using the three-stage integrator with h = 3/Nvar, m = Nvar/2. The number of inter-chain steps is 40.


Future work

- Apply to larger models,
- Extend to 4D case,
- Enhance and parallelize,
- ...
