Data assimilation in biogeochemistry: Adapting the paradigm of numerical weather prediction Peter Rayner University of Melbourne

May 18, 2018

Data assimilation in biogeochemistry: Adapting the paradigm of numerical weather prediction

Peter Rayner

University of Melbourne

Outline of series

1. Basic approach with some simple examples;

2. What can go wrong and how would we know?

3. Some advanced uses, model development and evaluation.

Outline for Lecture One

• Motivation: An example of data assimilation for climate;

• The minefield of nomenclature and notation;

• Data assimilation as Bayesian inference;

• Some simple examples;

• Looking hard at each component.

Motivation

[Figure: anomalous NEP (GtC/yr) on the y-axis against year on the x-axis.]

Uncertainty in terrestrial uptake, 2000–2090. Black lines = current climate, red = climate change. Thin lines = original model, thick = after data.

• Rayner et al., Phil. Trans. 2010;

• Uncertainties completely dominated by climate change;

• Greatly reduced by confronting with data.

The problem

• To improve our knowledge of the state and functioning of a physical system given some observations.

• “State” means the value of physical quantities which may evolve, usually the variables in a numerical model;

• “Function” means the fixed values or even functional forms of the laws governing the system.

Name | Symbol | Description | Examples
--- | --- | --- | ---
Parameters | $\vec{p}$ | Quantities not changed by model | $\xi$ (buffer factor), $b_a$ (terrestrial flux amplitude)
State variables | $\vec{v}$ | Quantities altered by model | leaf area, DIC
Unknowns¹ | $\vec{x}$ | Quantities exposed to optimisation | $\xi$, $c_I(t = 0)$
Observables | $\vec{o}$ | Measurable quantities, maybe in $\vec{v}$ | $c_A$, total carbon
Observation operator | | Transforms $\vec{v}$ to $\vec{o}$ | 1, $c_I + c_O$
Model | $M$ | Predicts $\vec{v}$ given $\vec{p}$ and $\vec{v}(t = 0)$ |
Data | $\vec{d}$ | Measured values of $\vec{o}$ |

Data Assimilation in One Picture

[Figure: the unknown on the x-axis and the data on the y-axis, with regions labelled prior, measurement, and model.]

• Unknown on x-axis, obs on y-axis;

• Light blue = prior on the unknown;

• Light red = obs;

• Green = model;

• Black = solution.

Well, almost one picture

[Figure: the unknown on the x-axis and the data on the y-axis, with regions labelled prior, measurement, and model.]

Solution is multiplication of input PDFs.

[Figure: probability density on the y-axis against the unknown on the x-axis.]

Final PDF projects triangle onto “unknown” axis.

Notes

• Solution is multiplication of PDFs;

• Solution can be constructed with only forward models;

• Normalization doesn’t usually matter.
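
The three notes above can be sketched numerically. This is only an illustration under stated assumptions: Gaussian prior, measurement, and model PDFs with unit standard deviations, a prior centred on 0, a measured value of 1, and an identity forward model M(x) = x. The grid limits and all variable names are hypothetical choices, not part of the lecture.

```python
import numpy as np

def forward_model(x):
    # Hypothetical forward model for illustration: identity, M(x) = x.
    return x

# Regular grid over the unknown (x) and the observable (y); limits are assumed.
x = np.linspace(-4.0, 4.0, 801)
y = np.linspace(-4.0, 4.0, 801)
X, Y = np.meshgrid(x, y, indexing="ij")

# Unnormalised input PDFs (all standard deviations assumed to be 1).
prior = np.exp(-0.5 * X**2)                        # prior on the unknown, centred on 0
measurement = np.exp(-0.5 * (Y - 1.0)**2)          # measurement of the observable, value 1
model = np.exp(-0.5 * (Y - forward_model(X))**2)   # model linking unknown and observable

# The solution is the product of the input PDFs; only forward runs are needed.
joint = prior * measurement * model

# Most likely (x, y) on the grid; normalisation does not change its location.
i, j = np.unravel_index(np.argmax(joint), joint.shape)
x_map, y_map = x[i], y[j]

# Project onto the "unknown" axis by summing over y, then normalise.
dx = x[1] - x[0]
marginal_x = joint.sum(axis=1)
marginal_x /= marginal_x.sum() * dx
x_mean = (x * marginal_x).sum() * dx

print(x_map, y_map, x_mean)
```

The normalisation is applied only at the end, and only because a mean is wanted; the location of the maximum is found from the unnormalised product.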

Gaussian Prior

[Figure: prior PDF plotted over x and y, each axis from -2 to 2.]

$$\frac{1}{\sqrt{2\pi}\,\sigma_P} \exp\left(-\frac{x^2}{2\sigma_P^2}\right)$$

Data

[Figure: data PDF plotted over x and y, each axis from -2 to 2.]

$$\frac{1}{\sqrt{2\pi}\,\sigma_D} \exp\left(-\frac{(y - 1)^2}{2\sigma_D^2}\right)$$

Model

[Figure: model PDF plotted over x and y, each axis from -2 to 2.]

$$\frac{1}{\sqrt{2\pi}\,\sigma_M} \exp\left(-\frac{(y - M(x))^2}{2\sigma_M^2}\right)$$

Prior plus Data plus Model

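[Figure: product of the prior, data, and model PDFs plotted over x and y, each axis from -2 to 2.]

$$\frac{1}{(2\pi)^{3/2}\,\sigma_P\sigma_D\sigma_M} \exp\left(-\frac{x^2}{2\sigma_P^2}\right) \times \exp\left(-\frac{(y - 1)^2}{2\sigma_D^2}\right) \times \exp\left(-\frac{(y - M(x))^2}{2\sigma_M^2}\right)$$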

“Solving” the Inverse Problem

• The joint PDF is the solution;

• For Gaussians the solution can be represented by a mean and variance;

• These can be misleading.

A simple example

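$$P(x, y) = \frac{1}{(2\pi)^{3/2}\,\sigma_x\sigma_y\sigma_M} \exp\left(-\frac{(x - x_0)^2}{2\sigma_x^2}\right) \times \exp\left(-\frac{(y - D)^2}{2\sigma_y^2}\right) \times \exp\left(-\frac{(y - M(x))^2}{2\sigma_M^2}\right)$$

• $x_0 = 0$, $D = 1$, $M(x) = x$ (i.e. $M = 1$), $\sigma_x = \sigma_y = \sigma_M = 1$;

• Multiplying exponentials ↔ adding exponents;

$$P(x, y) = \frac{1}{(2\pi)^{3/2}} \exp\left\{-\left[\frac{x^2}{2} + \frac{(y - 1)^2}{2} + \frac{(y - x)^2}{2}\right]\right\}$$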

Solution Continued

$$P(x, y) = \frac{1}{(2\pi)^{3/2}} \exp\left\{-\left[\frac{x^2}{2} + \frac{(y - 1)^2}{2} + \frac{(y - x)^2}{2}\right]\right\}$$

• Finding the most likely value means maximizing the probability;

• Maximizing $\exp(-J)$ means minimizing $J$:

$$J = \frac{1}{2}\left[x^2 + (y - 1)^2 + (y - x)^2\right]$$

• Example of a least squares cost function.

Solution Continued

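$$J = \frac{1}{2}\left[x^2 + (y - 1)^2 + (y - x)^2\right]$$

• To minimize $J$ (and so maximize the probability), set $\partial J / \partial x = 0$ and $\partial J / \partial y = 0$:

$$2x - y = 0 \qquad (1)$$

$$2y - x - 1 = 0 \qquad (2)$$

• $x = \frac{1}{3}$, $y = \frac{2}{3}$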
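
As a quick numerical check of equations (1) and (2), the two stationarity conditions can be written as a 2×2 linear system and solved directly. This is a minimal sketch; the names `A`, `b`, and `cost` are illustrative and not taken from the lecture.

```python
import numpy as np

# Stationarity conditions of J written as A @ [x, y] = b:
#    2x -  y = 0
#   -x  + 2y = 1
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
b = np.array([0.0, 1.0])

x_hat, y_hat = np.linalg.solve(A, b)

def cost(x, y):
    # Least squares cost function J from the slide.
    return 0.5 * (x**2 + (y - 1.0)**2 + (y - x)**2)

# A is also the Hessian of J; positive eigenvalues confirm a minimum.
eigenvalues = np.linalg.eigvalsh(A)

print(x_hat, y_hat, cost(x_hat, y_hat), eigenvalues)
```

Both eigenvalues of the Hessian are positive, so the stationary point is the minimum of J and therefore the maximum of the joint PDF.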

Illustrating Solution

[Figure: the unknown on the x-axis and the data on the y-axis, with the prior and measurement labelled.]

• Prior estimate is intersection of red and blue lines (0, 1).

• Solution is pulled directly towards model;

• Solution is compromise between prior, measurement and model;

• Solution depends on both values and uncertainties.

More detail on Uncertainties

• Prior PDF is the distribution of the true value, deliberately ignoring the measurements we intend to use. Often expressed as a distribution around a value, but this is not necessary.

• PDF of data is the distribution of the true value, usually distributed around a measurement;

• PDF of model describes the distribution of the true value given a particular value of the “unknown”. Almost never available.

First Simplification

• Often we are not interested in estimating the observable;

• For Gaussian PDFs we can pretend our model is perfect and add observational and modelling error variances (Tarantola 2004, p. 202);

• Thus

$$J = \frac{1}{2}\left[x^2 + (y - 1)^2 + (y - x)^2\right]$$

becomes

$$J^* = \frac{1}{2}\left[x^2 + \frac{(x - 1)^2}{2}\right]$$

• Yields $x = \frac{1}{3}$ but not $y = \frac{2}{3}$.
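
The simplification can be checked numerically. The sketch below assumes the same values as the simple example (prior value 0, measured value 1, identity model, all variances equal to 1); the variable names are illustrative only.

```python
import numpy as np

sigma_prior2 = 1.0   # prior variance of x
sigma_data2 = 1.0    # observational error variance
sigma_model2 = 1.0   # model error variance
x0, d = 0.0, 1.0     # prior value and measured value

# Perfect-model simplification: add observational and modelling error variances.
sigma_obs2 = sigma_data2 + sigma_model2

def cost_star(x):
    # J* = 1/2 [ (x - x0)^2 / sigma_prior2 + (x - d)^2 / sigma_obs2 ]
    return 0.5 * ((x - x0)**2 / sigma_prior2 + (x - d)**2 / sigma_obs2)

# Analytic minimiser: precision-weighted mean of prior and measurement.
precision = 1.0 / sigma_prior2 + 1.0 / sigma_obs2
x_hat = (x0 / sigma_prior2 + d / sigma_obs2) / precision
var_hat = 1.0 / precision  # posterior variance of x

# Brute-force check of the minimiser on a fine grid.
grid = np.linspace(-2.0, 2.0, 400001)
x_grid = grid[np.argmin(cost_star(grid))]

print(x_hat, var_hat, x_grid)
```

The estimate of x matches the full two-variable solution, but y no longer appears in the cost function, so no estimate of the observable is produced.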

Recursive estimation

• Multiplication of PDFs can be done in any order, and many at a time or singly;

• If we preserve the full PDF we can include observations as they arrive;

• For Gaussians the PDF is described by its mean and variance;

• Information is always added so that PDFs are always refined.
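
A minimal sketch of these points for a single scalar unknown, assuming Gaussian PDFs, direct observations of the unknown (identity model), and independent observation errors. The observation values and variances are made up for illustration.

```python
import numpy as np

def update(mean, var, obs, obs_var):
    # Multiply two Gaussian PDFs for the same scalar: precisions add.
    post_var = 1.0 / (1.0 / var + 1.0 / obs_var)
    post_mean = post_var * (mean / var + obs / obs_var)
    return post_mean, post_var

prior_mean, prior_var = 0.0, 1.0
obs = np.array([1.0, 0.6, 1.3, 0.9])      # hypothetical measurements
obs_var = np.array([2.0, 1.0, 0.5, 2.0])  # hypothetical error variances

# Sequential: each posterior becomes the prior for the next observation.
mean, var = prior_mean, prior_var
variances = [var]
for d, r in zip(obs, obs_var):
    mean, var = update(mean, var, d, r)
    variances.append(var)

# Batch: all observations combined with the prior at once.
batch_var = 1.0 / (1.0 / prior_var + np.sum(1.0 / obs_var))
batch_mean = batch_var * (prior_mean / prior_var + np.sum(obs / obs_var))

# Same observations in reverse order.
mean_rev, var_rev = prior_mean, prior_var
for d, r in zip(obs[::-1], obs_var[::-1]):
    mean_rev, var_rev = update(mean_rev, var_rev, d, r)

print(mean, var, batch_mean, batch_var, mean_rev, var_rev)
```

Under these assumptions the sequential, batch, and reversed-order results agree, and the variance shrinks with every observation added.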

Batch and Sequential Methods

BATCH

• Handle all obs at once;

• PDFs for priors and obs unrestricted;

• Model error hard to include;

• Classic example 4dVar for weather prediction.

SEQUENTIAL

• Handle obs as they arrive;

• PDFs for obs restricted (time correlations hard);

• Model error handled very naturally;

• Kalman Filter.

A few Example Applications

• What are the unknowns?

• What is the prior estimate?

• What are the observations?

• What is the model?

• How do they handle the time domain?

Numerical Weather Prediction 4dVar

• Unknown is 3d grid of atmospheric variables at fixed time;

• Prior is previous forecast;

• Observations include in situ and satellite measurements over a fixed time window;

• Model combines dynamic evolution of atmosphere with observation operators;

• All observations handled at once;

• Doesn’t usually have explicit model error.

Numerical Weather Prediction, Kalman filtering

• Unknown is 3d grid of atmospheric variables at each time;

• Prior is previous posterior;

• Observations include in situ and satellite measurements within one timestep;

• Dynamic model and observation operators separated;

• Always has explicit model error.
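
The structure listed above can be illustrated with the textbook linear Kalman filter on a toy two-variable state. This is a sketch only, nothing like an operational weather system: the matrices `F`, `H`, `Q`, and `R` and the synthetic observations are hypothetical values chosen for illustration, and the update equations are the standard linear-Gaussian ones rather than anything specific to the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# All matrices below are hypothetical toy values, not from the lecture.
F = np.array([[1.0, 0.1],
              [0.0, 0.95]])   # dynamic model (state transition)
H = np.array([[1.0, 0.0]])    # observation operator: observe first state variable only
Q = 0.01 * np.eye(2)          # explicit model error covariance
R = np.array([[0.25]])        # observation error covariance

def kalman_step(x_a, P_a, d):
    # Forecast: propagate the previous posterior; model error inflates uncertainty.
    x_f = F @ x_a
    P_f = F @ P_a @ F.T + Q
    # Analysis: the forecast acts as prior and is combined with observation d.
    S = H @ P_f @ H.T + R
    K = P_f @ H.T @ np.linalg.inv(S)
    x_new = x_f + K @ (d - H @ x_f)
    P_new = (np.eye(len(x_a)) - K @ H) @ P_f
    return x_new, P_new, P_f

# Synthetic truth and observations, one observation per time step.
x_true = np.array([1.0, 0.5])
x_a, P_a = np.zeros(2), np.eye(2)
for _ in range(50):
    x_true = F @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    d = H @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)
    x_a, P_a, P_f = kalman_step(x_a, P_a, d)

print(x_a, P_a[0, 0], P_f[0, 0])
```

The dynamic model `F` and observation operator `H` enter separately, and the model error `Q` is added at every forecast step, which is why each analysis variance is smaller than the forecast variance that preceded it.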

Atmospheric Flux Inversion

• Unknown is space-time distribution of surface fluxes;

• Prior often comes from biogeochemical model;

• Observations are atmospheric concentration;

• Model is atmospheric transport;

• All observations usually handled at once;

• Model error sometimes handled via model ensemble.

Biogeochemical data assimilation

• Confusing terminology;

• Unknowns are parameters in model;

• Priors from independent experiment or literature;

• Many different observations (fluxes, concentrations, vegetation indices, ocean colour etc);

• Dynamic model and obs operators separated;

• Equally split between batch and sequential.

Linear Gaussian Case

• Unknowns and data are vectors $\vec{x}$ and $\vec{d}$;

• $\sigma^2$ replaced with variance/covariance matrices $C$ for $\vec{x}$ and $\vec{d}$;

• Model $M$ becomes matrix $\mathbf{M}$;

• Use usual simplification of assuming perfect model and adding data and model uncertainties.

Solution

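$$P(\vec{x}) = K \frac{1}{\sqrt{\det C(\vec{x}_0)\,\det C(\vec{y})}} \exp\left[-\frac{1}{2}(\vec{x} - \vec{x}_0)^T C^{-1}(\vec{x}_0)(\vec{x} - \vec{x}_0)\right] \exp\left[-\frac{1}{2}(\mathbf{M}\vec{x} - \vec{y})^T C^{-1}(\vec{y})(\mathbf{M}\vec{x} - \vec{y})\right]$$

Minimize

$$J = (\vec{x} - \vec{x}_0)^T C^{-1}(\vec{x}_0)(\vec{x} - \vec{x}_0) + (\mathbf{M}\vec{x} - \vec{y})^T C^{-1}(\vec{y})(\mathbf{M}\vec{x} - \vec{y})$$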

Continued

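$$J = (\vec{x} - \vec{x}_0)^T C^{-1}(\vec{x}_0)(\vec{x} - \vec{x}_0) + (\mathbf{M}\vec{x} - \vec{y})^T C^{-1}(\vec{y})(\mathbf{M}\vec{x} - \vec{y})$$

Yields

$$\vec{x} = \vec{x}_0 + C(\vec{x}_0)\mathbf{M}^T\left[\mathbf{M}C(\vec{x}_0)\mathbf{M}^T + C(\vec{y})\right]^{-1}(\vec{y} - \mathbf{M}\vec{x}_0)$$

$$C^{-1}(\vec{x}) = C^{-1}(\vec{x}_0) + \mathbf{M}^T C^{-1}(\vec{y})\mathbf{M}$$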
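
The two expressions above translate almost line for line into NumPy. The sketch below is illustrative only: the function name `linear_gaussian_update` and the two-unknown test case are assumptions, and explicit matrix inverses are used for readability rather than efficiency. The scalar check reuses the values of the simple example, with data and model variances added to give a variance of 2.

```python
import numpy as np

def linear_gaussian_update(x0, C_x0, M, y, C_y):
    # x = x0 + C(x0) M^T [M C(x0) M^T + C(y)]^{-1} (y - M x0)
    S = M @ C_x0 @ M.T + C_y
    gain = C_x0 @ M.T @ np.linalg.inv(S)
    x = x0 + gain @ (y - M @ x0)
    # C^{-1}(x) = C^{-1}(x0) + M^T C^{-1}(y) M
    C_x = np.linalg.inv(np.linalg.inv(C_x0) + M.T @ np.linalg.inv(C_y) @ M)
    return x, C_x

# Scalar check: prior 0 with variance 1, identity model, measurement 1 with variance 2.
x, C_x = linear_gaussian_update(
    np.array([0.0]), np.array([[1.0]]), np.array([[1.0]]),
    np.array([1.0]), np.array([[2.0]]))

# Hypothetical case with two unknowns and three observations.
x0 = np.zeros(2)
C_x0 = np.diag([1.0, 4.0])
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
y = np.array([0.9, 2.1, 2.0])
C_y = 0.1 * np.eye(3)
x2, C_x2 = linear_gaussian_update(x0, C_x0, M, y, C_y)

# The gradient of J should vanish at the solution.
grad = 2.0 * (np.linalg.inv(C_x0) @ (x2 - x0)
              + M.T @ np.linalg.inv(C_y) @ (M @ x2 - y))

print(x, C_x, x2, grad)
```

For larger problems one would normally solve the linear systems with `np.linalg.solve` instead of forming inverses.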

Summary

• Data assimilation is an example of Bayesian Inference;

• BI itself follows from rules for combining PDFs;

• Techniques like least squares minimisation are special cases for particular types of PDF;

• Most approaches such as Kalman Filtering and 4dVar can be expressed with this formalism.