Ensemble Methods

Daryl Kleist
JCSDA Summer Colloquium on Data Assimilation, Fort Collins, CO, 27 July – 7 August 2015
Univ. of Maryland-College Park, Dept. of Atmos. & Oceanic Science

Thanks to Kayo Ide (UMD) and Massimo Bonavita (ECMWF) for much of the inspiration and/or slides for this lecture. Acknowledgements also to Eugenia Kalnay (UMD) and Jeff Whitaker (NOAA/ESRL).
Outline
I. Introduction: Sequential Filtering
   a) Kalman Filter
   b) Extended KF and RRKF
II. The Ensemble Kalman Filter
   a) Stochastic Filter (Perturbed Observations)
   b) Serial/Square Root Filters (EnSRF, EAKF, LETKF)
   c) Technical Challenges
      i. Localization
      ii. Inflation
   d) Toy model example results
III. Ensemble of Vars
IV. Summary
I. Introduction
• Data Assimilation Basics
  – Iterative approach to the estimation/forecast of current/future states x, using computational models and observations of the system

Computational Model: forecast from t_{k-1} to t_k
Observation/Measurement: over the window
Data assimilation: analysis at t_k (assimilation window)
Courtesy: Kayo Ide
I. Introduction
• Data Assimilation Basics, Notation, and Challenges
  – Iterative approach to the estimation/forecast of current/future states x, using computational models and observations of the system

Step 1. Model Forecast (from t_{k-1} to t_k)
  Background: x^b_k, with x_k = M_k(x_{k-1})
Observation/Measurement: y^o_{k'}, with y_{k'} = h(x_{k'})
Step 2. Assimilation (one assimilation cycle)
  Integration of x^b_k and y^o_k
  Analysis: x^a_k = a function of x^b_k and y^o_k
  (All relative to the truth x^t.)

Challenges:
• M_k is nonlinear and imperfect; x can be large
• h_{k'} is/may be nonlinear and imperfect
• y may be large or too small; y_{k'} is/may be insufficient to determine x_k, and not exactly at t_k
Courtesy: Kayo Ide
I. Introduction: Probabilistic View of Data Assimilation
• Here, x*_k represents the true state at time k. Superscripts (f) and (a) represent the forecast and analysis, respectively.
• M represents the nonlinear propagator (the NWP model) that describes the evolution in time: x^f_k = M(x^a_{k-1}).
• Errors are discussed shortly.
Introduction: Kalman Filter
• The linear, unbiased analysis equation can be expressed in the following form for the analysis (a), background (b), and time level (k), using the linear operator H_k:

$$x^a_k = x^b_k + K_k \left( y^o_k - H_k x^b_k \right)$$

• To find the best linear unbiased analysis, the Kalman gain is expressed as:

$$K_k = B_k H_k^T \left( H_k B_k H_k^T + R_k \right)^{-1}$$

• B represents the background error covariance and R the observation error covariance. Under the condition that K is the optimal gain matrix, we can also obtain an equation for the analysis error covariance:

$$A_k = \left( I - K_k H_k \right) B_k$$
Introduction: Kalman Filter
• Comment on notation:
  – You may see f/b superscripts; "forecast" and "background" are used interchangeably.
  – Here, I use B and A for the background and analysis error covariance matrices. The "unified notation" (Ide et al. 1997) recommends P^b and P^a, respectively.
• For the background forecast using the linear model:

$$x^b_k = \mathbf{M}_k x^a_{k-1}$$

• Subtract the true state (x*) to define the error:

$$\varepsilon^b_k = x^b_k - x^*_k = \mathbf{M}_k x^a_{k-1} - x^*_k$$

• Noting that:

$$x^a_{k-1} = x^*_{k-1} + \varepsilon^a_{k-1}$$
Introduction: Kalman Filter
• We can then express the background error as:

$$\varepsilon^b_k = \mathbf{M}_k \varepsilon^a_{k-1} + \left( \mathbf{M}_k x^*_{k-1} - x^*_k \right)$$

• Defining the model error to be:

$$\eta_k = \mathbf{M}_k x^*_{k-1} - x^*_k$$

• And finally rewriting the previous expression as:

$$\varepsilon^b_k = \mathbf{M}_k \varepsilon^a_{k-1} + \eta_k$$

• So that:

$$B_k = E\!\left[ \varepsilon^b_k (\varepsilon^b_k)^T \right] = E\!\left[ \left( \mathbf{M}_k \varepsilon^a_{k-1} + \eta_k \right)\left( \mathbf{M}_k \varepsilon^a_{k-1} + \eta_k \right)^T \right]$$
Introduction: Kalman Filter
• Assuming that the analysis and model errors are uncorrelated:

$$E\!\left[ \varepsilon^a_{k-1} \eta_k^T \right] = 0$$

• And inserting the model error covariance matrix, Q_k = E[η_k η_k^T]:

$$B_k = \mathbf{M}_k A_{k-1} \mathbf{M}_k^T + Q_k$$
Introduction: Kalman Filter (linear)
Forecast step:

$$x^b_k = \mathbf{M}_k x^a_{k-1}, \qquad B_k = \mathbf{M}_k A_{k-1} \mathbf{M}_k^T + Q_k$$

Analysis step:

$$K_k = B_k H_k^T \left( H_k B_k H_k^T + R_k \right)^{-1}$$
$$x^a_k = x^b_k + K_k \left( y^o_k - H_k x^b_k \right), \qquad A_k = \left( I - K_k H_k \right) B_k$$

• Complete set of equations for DA cycling:
  – The state and error covariances are propagated forward in time and updated with the observations at time k.
  – Under assumptions of linearity (M, H), the KF produces the optimal set of analysis states.
  – The analysis is the minimum variance estimate of the state.
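To make the cycling concrete, here is a minimal sketch of one linear KF forecast/analysis cycle in NumPy. The operators M, H, Q, R and all array sizes are illustrative assumptions for a toy problem, not anything from the lecture.

```python
# Minimal sketch of one linear Kalman filter cycle (toy setup).
import numpy as np

def kf_cycle(xa, A, M, Q, H, R, yo):
    """One forecast + analysis step of the linear Kalman filter."""
    # Forecast step: propagate the state and its error covariance
    xb = M @ xa
    B = M @ A @ M.T + Q
    # Analysis step: Kalman gain, state update, covariance update
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    xa_new = xb + K @ (yo - H @ xb)
    A_new = (np.eye(len(xb)) - K @ H) @ B
    return xa_new, A_new

# Example with a 2-variable state, identity model and obs operator:
x, A = kf_cycle(np.zeros(2), np.eye(2), np.eye(2), 0.01 * np.eye(2),
                np.eye(2), 0.1 * np.eye(2), np.array([0.3, -0.1]))
```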
Extended Kalman Filter
• For weakly nonlinear systems, slight modifications can be made. Here, the state update and propagation use the nonlinear operators:

$$x^b_k = M_k\!\left( x^a_{k-1} \right), \qquad x^a_k = x^b_k + K_k \left( y^o_k - h_k\!\left( x^b_k \right) \right)$$

• But the covariance update and propagation use linearized operators (Jacobians, or TL/AD):

$$B_k = \mathbf{M}_k A_{k-1} \mathbf{M}_k^T + Q_k, \qquad A_k = \left( I - K_k \mathbf{H}_k \right) B_k$$

• Where:

$$\mathbf{M}_k = \left. \frac{\partial M_k}{\partial x} \right|_{x^a_{k-1}}, \qquad \mathbf{H}_k = \left. \frac{\partial h_k}{\partial x} \right|_{x^b_k}$$
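To make the linearization concrete, here is a sketch of approximating the Jacobian M_k by finite differences. Operational systems use hand-coded or automatically generated TL/AD models rather than this; `model` and the step size `eps` are illustrative assumptions.

```python
# Sketch: finite-difference Jacobian of a nonlinear model, as used in the
# EKF covariance propagation above (for illustration only).
import numpy as np

def jacobian_fd(model, x, eps=1e-6):
    """Column j is approximately d(model)/dx_j evaluated at x."""
    n = x.size
    fx = model(x)
    J = np.empty((fx.size, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (model(x + dx) - fx) / eps   # one-sided difference
    return J
```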
Extended Kalman Filter
Model Forecast: (x^b, B) → EKF Analysis: (x^a, A), with Observation/Measurement: (y^o, R) along the time axis.

Step 1. Forecast (x^b_k, B_k): obtained by integrating the model starting from (x^a_{k-1}, A_{k-1}) over [t_{k-1}, t_k].
Step 2. Analysis (x^a_k, A_k): combine (x^b_k, B_k) with (y^o, R); R is prescribed.

Courtesy: Kayo Ide
Extended Kalman Filter
Model Forecast: x^b; Observation/Measurement: y^o; Analysis: x^a (one assimilation cycle per assimilation window along the time axis).

• M_k is nonlinear and imperfect.
• The dimension of x may be large; the dimension of B may be (large)².

Courtesy: Kayo Ide
Kalman Filter for Large Dimensions
• Kalman filters (and the EKF) are impractical for large systems like NWP models.
  – For present-day NWP, the state size N can be > O(10^8).
• However, a variety of Kalman filters have been developed for large-dimensional systems.
  – All of these rely on low-rank approximations of the background and analysis error covariance matrices.
• Assume that B_k has rank M << N, so that we can write the error covariance as a function of X^b (N×M), where M can be ~100 (here the columns of X^b_k are suitably normalized perturbations):

$$B_k = X^b_k \left( X^b_k \right)^T$$
Reduced-Rank KF
• The Kalman gain can then be rewritten as:

$$K_k = X^b_k \left( H_k X^b_k \right)^T \left[ \left( H_k X^b_k \right)\left( H_k X^b_k \right)^T + R_k \right]^{-1}$$

• Where the increment is a linear combination of the columns of X^b_k (thus, confined to that subspace).
• It can be shown that the covariance matrix propagation can be rewritten as (requiring only M realizations of M_k):

$$X^b_k = \mathbf{M}_k X^a_{k-1}$$

• Note that there is no full B_k, and H_k operates on the smaller dimension. The analysis update is again:

$$x^a_k = x^b_k + K_k \left( y^o_k - H_k x^b_k \right)$$
Ensemble Kalman Filters
• Ensemble Kalman filters (EnKF) are Monte Carlo approximations/implementations of the KF, using sample covariances from an ensemble (the overbar represents the ensemble mean):

$$B^e_k = \frac{1}{M-1} X^b_k \left( X^b_k \right)^T$$

• Where X^b_k is the N×M matrix of ensemble forecast perturbations:

$$X^b_k = \left[ x^b_{k,1} - \bar{x}^b_k, \; \ldots, \; x^b_{k,M} - \bar{x}^b_k \right]$$

• And the full B^e is never explicitly computed! Instead, we represent it in the M-dimensional ensemble subspace.
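A small sketch of forming the perturbation matrix X^b and the sample covariance B^e from an ensemble, exactly as defined above; the array sizes are arbitrary illustrative assumptions.

```python
# Sketch: ensemble mean, perturbation matrix, and sample covariance.
import numpy as np

def ensemble_stats(Xb_members):
    """Xb_members: (N, M) array whose columns are the M ensemble members."""
    xb_mean = Xb_members.mean(axis=1, keepdims=True)   # ensemble mean
    Xpert = Xb_members - xb_mean                        # N x M perturbations
    Be = Xpert @ Xpert.T / (Xb_members.shape[1] - 1)    # sample covariance
    return xb_mean.ravel(), Xpert, Be

# In practice the N x N matrix Be is never formed explicitly; the algorithms
# below work with Xpert directly in the M-dimensional ensemble subspace.
```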
Ensembles
• Represent the pdf of the state with discrete sampling (ensemble members).
• The mean and covariance of the ensemble members define the evolved pdf.

(Schematic: ensemble members evolving the pdf from t0 to t1.)
Ensemble Approach to Represent p(x)
◆ Ensemble
  • Members → mean
  • Spread → covariance
◆ Issues
  • Sampling of p(x) by the ensemble can be poor, especially for small M or small P.
  • The rank of P is at most M − 1.
  • There are infinitely many ΔX that have the same P = 1/(M−1) ΔX (ΔX)^T.
Courtesy: Kayo Ide
p(x) Sampling & Reconstruction by Ensemble: 1D
(Figure: the original p(x), M samples drawn from p(x), and the reconstructed p*(x), for several different realizations; Gaussian pdfs assumed.)
• If the sampling is done well, then p*(x) ≈ p(x).
• The fitness of p*(x) to p(x) varies case by case, particularly for small M.
• In all cases here, N < M.
Courtesy: Kayo Ide
p(x) Sampling & Reconstruction by Ensemble: 2D
(Figure: the original p(x), M samples drawn from p(x), and the reconstructed p*(x), for several different realizations; Gaussian pdfs assumed.)
• If the sampling is done well, then p*(x) ≈ p(x).
• The fitness of p*(x) to p(x) varies case by case, particularly for small M.
• In all cases here, N ≤ M.
• Recall that the KF (and EKF) propagate the error covariances explicitly, using the TL/AD versions of the model and observation operators.
• In an EnKF, the error covariances are evolved implicitly in time through an ensemble of realizations of the nonlinear model:

$$x^b_{k,m} = M_k\!\left( x^a_{k-1,m} \right), \qquad m = 1, \ldots, M$$

• Note the absence of M_k and M_k^T: there is no TL or AD model needed.
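A one-line sketch of this implicit propagation, with a hypothetical nonlinear `model` function standing in for the NWP model: each member is simply integrated forward.

```python
# Sketch: propagate each ensemble member with the nonlinear model; no TL/AD.
import numpy as np

def forecast_ensemble(model, Xa):
    """Xa: (N, M) analysis ensemble; returns the (N, M) background ensemble."""
    return np.column_stack([model(Xa[:, m]) for m in range(Xa.shape[1])])
```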
Stochastic EnKF
• Starting from the EnKF analysis update equation:

$$x^a_{k,m} = x^b_{k,m} + K_k \left( y^o_k - H_k x^b_{k,m} \right), \qquad K_k = B^e_k H_k^T \left( H_k B^e_k H_k^T + R_k \right)^{-1}$$

• Where B^e_k is represented by the ensemble statistics. The analysis is the mean of the posterior ensemble, and the analysis error covariance is:

$$A^e_k = \frac{1}{M-1} X^a_k \left( X^a_k \right)^T$$

• With a perturbation update following (if the observations are the same for all members):

$$X^a_k = \left( I - K_k H_k \right) X^b_k$$
Stochastic EnKF
• Which yields an estimate of the analysis error covariance of:

$$A^e_k = \left( I - K_k H_k \right) B^e_k \left( I - K_k H_k \right)^T$$

• However, if BLUE is followed, it should actually be:

$$A_k = \left( I - K_k H_k \right) B_k$$

• So the error is underestimated! One solution to this is to stochastically perturb the observations with Gaussian noise drawn from R.
Stochastic EnKF
• In the limit of very large ensemble size, R_p (the sample covariance of the observation perturbations) coincides with the original, prescribed R.
• The new Kalman gain, K_p, is identical to before, but with R replaced by R_p:

$$K_p = B^e H^T \left( H B^e H^T + R_p \right)^{-1}$$

• This yields the correct analysis error covariance matrix. This is known as the perturbed-observation EnKF (Houtekamer and Mitchell 1998).
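A minimal sketch of a perturbed-observation (stochastic) EnKF analysis step in the spirit of Houtekamer and Mitchell (1998); the array shapes, the generic H, and the use of a NumPy random generator are illustrative assumptions, not any center's operational code.

```python
# Sketch: perturbed-observation (stochastic) EnKF analysis step.
import numpy as np

def stochastic_enkf_analysis(Xb, H, R, yo, rng):
    """Xb: (N, M) background ensemble; H: (p, N); R: (p, p); yo: (p,)."""
    N, M = Xb.shape
    xb_mean = Xb.mean(axis=1, keepdims=True)
    Xpert = (Xb - xb_mean) / np.sqrt(M - 1)           # normalized perturbations
    HX = H @ Xpert                                     # perturbations in obs space
    K = Xpert @ HX.T @ np.linalg.inv(HX @ HX.T + R)    # gain built from B_e
    # Perturb the observations for each member: y_m = yo + eps_m, eps ~ N(0, R)
    Ypert = rng.multivariate_normal(yo, R, size=M).T   # (p, M)
    return Xb + K @ (Ypert - H @ Xb)                   # analysis ensemble

# Usage: rng = np.random.default_rng(0); Xa = stochastic_enkf_analysis(...)
```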
Deterministic EnKF [Square Root Filters]
• There is another class of filters that does not require perturbing the observations. Starting with the ensemble background perturbation matrix again:

$$X^b_k = \left[ x^b_{k,1} - \bar{x}^b_k, \; \ldots, \; x^b_{k,M} - \bar{x}^b_k \right]$$

• We define the analysis to be a linear combination of the background perturbations:

$$\bar{x}^a_k = \bar{x}^b_k + X^b_k w_k$$

• Where w_k is a vector of coefficients in ensemble space. Expanding the RHS and defining the departures:

$$d_k = y^o_k - H_k \bar{x}^b_k, \qquad Y^b_k = H_k X^b_k$$
Deterministic EnKF [Square Root Filters]
• Which implies that:

$$w_k = \left( Y^b_k \right)^T \left[ Y^b_k \left( Y^b_k \right)^T + (M-1) R_k \right]^{-1} d_k$$

• Let's define the following matrix in observation space:

$$S_k = Y^b_k \left( Y^b_k \right)^T + (M-1) R_k$$

• Which yields a simplified formulation for the weights:

$$w_k = \left( Y^b_k \right)^T S_k^{-1} d_k$$
Deterministic EnKF [Square Root Filters]
• In other words, the gain is computed in observation space. However, using the Sherman–Morrison–Woodbury formula, this can be rewritten in ensemble space:

$$w_k = \left[ (M-1) I + \left( Y^b_k \right)^T R_k^{-1} Y^b_k \right]^{-1} \left( Y^b_k \right)^T R_k^{-1} d_k$$

• The perturbations can be updated via a transform T satisfying the relationships:

$$X^a_k = X^b_k T_k, \qquad \frac{1}{M-1} X^a_k \left( X^a_k \right)^T = A_k$$

• It can be shown that one such choice accomplishes this:

$$T_k = \left[ I + \frac{1}{M-1} \left( Y^b_k \right)^T R_k^{-1} Y^b_k \right]^{-1/2}$$
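A minimal ETKF-style square-root update sketch following the weight/transform formulation above; the symmetric square root computed via an eigendecomposition is one common choice, and all shapes are illustrative assumptions rather than operational settings.

```python
# Sketch: ETKF analysis computed in ensemble space.
import numpy as np

def etkf_analysis(Xb, H, R, yo):
    """Xb: (N, M) background ensemble; returns the (N, M) analysis ensemble."""
    N, M = Xb.shape
    xb_mean = Xb.mean(axis=1)
    Xpert = Xb - xb_mean[:, None]                 # raw perturbations
    Yb = H @ Xpert                                # perturbations in obs space
    d = yo - H @ xb_mean                          # mean departure
    Rinv = np.linalg.inv(R)
    # Analysis covariance in ensemble space: [(M-1) I + Yb^T R^-1 Yb]^-1
    Pa_tilde = np.linalg.inv((M - 1) * np.eye(M) + Yb.T @ Rinv @ Yb)
    w_mean = Pa_tilde @ Yb.T @ Rinv @ d           # mean weight vector
    # Symmetric square root for the perturbation transform
    evals, evecs = np.linalg.eigh((M - 1) * Pa_tilde)
    T = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    Wa = w_mean[:, None] + T                      # full weight matrix
    return xb_mean[:, None] + Xpert @ Wa          # analysis ensemble
```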
Deterministic EnKF [Square Root Filters]
• There are many implementations of deterministic, square root filters. They differ in the handling of observations (performing the analysis on local patches, serial assimilation of observations, etc.):
  – Ensemble Transform Kalman Filter (ETKF; Bishop et al. 2001)
  – Local Ensemble Transform Kalman Filter (LETKF; Ott et al. 2004; Hunt et al. 2007)
  – Serial Ensemble Square Root Filter (EnSRF; Whitaker and Hamill 2002)
  – Ensemble Adjustment Kalman Filter (EAKF; Anderson 2001)
• Overall, all of the above are largely similar and differ in their practical implementation. This class is distinct from the stochastic filter.
LETKF: Localization based on observations
• Perform the data assimilation in a local volume, choosing the observations.
• The state estimate is updated at the central grid point (red dot).
• All observations (purple diamonds) within the local region are assimilated.
LETKF (Hunt et al. 2007)
Globally:
  Forecast step: evolve each ensemble member with the nonlinear model.
  Analysis step: construct the background mean and perturbations, $\bar{x}^b$ and $X^b$, and their observation-space counterparts.
Locally: choose for each grid point the observations to be used, and compute the local analysis error covariance and perturbations in ensemble space. Compute the analysis mean in ensemble space, $\bar{w}^a$, and add it to W^a to get the analysis ensemble in ensemble space. The new ensemble analyses in model space are the columns of

$$X^a = \bar{x}^b + X^b W^a$$

Gathering the grid-point analyses forms the new global analysis. Note that the outputs of the LETKF are analysis weights $\bar{w}^a$ and perturbation analysis weight matrices W^a; these weights multiply the ensemble forecasts.
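A toy LETKF sketch for a 1-D periodic domain, looping over grid points and using only observations within `halfwidth` points of each one, as described above. The identity observation operator (observing grid values directly), `obs_var`, and `halfwidth` are illustrative assumptions.

```python
# Sketch: LETKF on a 1-D periodic domain, one local analysis per grid point.
import numpy as np

def letkf(Xb, obs_idx, yo, obs_var, halfwidth):
    """Xb: (N, M) ensemble; yo observed at grid indices obs_idx."""
    N, M = Xb.shape
    xb_mean = Xb.mean(axis=1)
    Xpert = Xb - xb_mean[:, None]
    Xa = np.empty_like(Xb)
    for j in range(N):                         # local analysis per grid point
        dist = np.minimum(np.abs(obs_idx - j), N - np.abs(obs_idx - j))
        local = dist <= halfwidth              # pick nearby observations only
        if not local.any():
            Xa[j] = Xb[j]
            continue
        Yb = Xpert[obs_idx[local], :]          # H X^b for the local obs
        d = yo[local] - xb_mean[obs_idx[local]]
        Rinv = np.eye(local.sum()) / obs_var
        Pa = np.linalg.inv((M - 1) * np.eye(M) + Yb.T @ Rinv @ Yb)
        w_mean = Pa @ Yb.T @ Rinv @ d
        evals, evecs = np.linalg.eigh((M - 1) * Pa)
        Wa = evecs @ np.diag(np.sqrt(evals)) @ evecs.T + w_mean[:, None]
        Xa[j] = xb_mean[j] + Xpert[j, :] @ Wa  # update this grid point
    return Xa
```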
Two Main Branches of EnKF: Analysis Processes at a Fixed t_k
◆ Extended Kalman Filter (for reference)
  Given x^b, B, y^o, R, h / H:
  Compute K = B H^T (H B H^T + R)^{-1}
  Obtain x^a = x^b + Δx^a, with Δx^a = K (y^o − h(x^b)), and A = (I − K H) B
  i.e., (x^b, B) → (x^a, A)
◆ Stochastic EnKF: given {x^b_m}, y^o, R, h / H, apply K to each member {x^b_m} and obtain {x^a_m} directly.
◆ Deterministic EnKF: given {x^b_m}, y^o, R, h (/ H), apply K to x^b and B, obtain x^a and A, and hence {x^a_m}.
Courtesy: Kayo Ide
What does Be gain us?
• Allows for flow dependence / errors of the day.
• Multivariate correlations come from the dynamic model.
  – These are quite difficult to incorporate into fixed error covariance models.
• Evolves with the system and can capture changes in the observing network.
• More information extracted from the observations => better analysis => better forecasts.
What does Be gain us?
(Figure: analysis increments from a single temperature observation near a warm front, comparing the static B_f with the ensemble B_e.)
Courtesy: Jeff Whitaker
What does Be gain us?
(Figure: a surface pressure observation near an "atmospheric river". First-guess surface pressure (white contours) and the precipitable water increment (A−G, red contours) after assimilating a single surface pressure observation (yellow dot) using B_e.)
Courtesy: Jeff Whitaker
So what’s the catch?
• Rank deficiency
  – The ensemble sizes used for NWP (~40-100+) are not nearly large enough.
  – There are too few degrees of freedom available to fit the observations.
  – The low-rank approximation yields spurious long-distance correlations.
• Mistreatment of "system error/uncertainty"
  – Sampling (as above), model error, observation operator error, representativeness, etc.
• The state estimate is an ensemble average.
  – This can produce unphysical estimates, smooth out high-fidelity information, etc.
Inflation and Localization
• Inflation
  – Used to inflate the ensemble estimate of uncertainty to avoid filter divergence (additive and multiplicative).
• Localization
  – Domain localization
    • Solves the equations independently for each grid point (LETKF).
  – Covariance localization
    • Performed element-wise (Schur product) on the covariances (see the sketch after this list).
• Additive inflation: add random samples from a specified distribution to each ensemble member after the analysis step.
  – Env. Canada uses random samples of an isotropic 3DVar covariance matrix.
  – NCEP uses random samples of 48-h minus 24-h forecast error (forecasts valid at the same time).
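A minimal sketch of multiplicative inflation and Schur-product covariance localization on a 1-D periodic domain. The simple triangular taper is an illustrative stand-in for, e.g., a Gaspari-Cohn function, and the inflation factor and length scale are arbitrary assumptions.

```python
# Sketch: multiplicative inflation and Schur-product covariance localization.
import numpy as np

def inflate(Xb, factor=1.1):
    """Multiplicative inflation: scale the perturbations about the mean."""
    xb_mean = Xb.mean(axis=1, keepdims=True)
    return xb_mean + factor * (Xb - xb_mean)

def localize_cov(Be, length_scale):
    """Element-wise (Schur) product of Be with a compactly supported taper."""
    N = Be.shape[0]
    i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    dist = np.minimum(np.abs(i - j), N - np.abs(i - j))   # periodic distance
    taper = np.maximum(0.0, 1.0 - dist / length_scale)    # simple taper to 0
    return Be * taper                                      # Schur product
```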
Imperfect Model (Additive + Multiplicative Inflation Example)
• Additive inflation alone outperforms multiplicative inflation alone (compare the values along the y-axis to those along the x-axis).
• A combination of both is better than either alone.
• Are multiplicative and additive inflation representing different error sources in the DA cycle?

From Whitaker and Hamill (2012)
Example of Covariance Localization
Estimates of covariances from a small ensemble will be noisy, with a small signal-to-noise ratio, especially when the covariance itself is small.
Courtesy: Jeff Whitaker
Real-World NWP Example of Localization
Courtesy: Jeff Whitaker
Toy Model Experiments
• Lorenz-96 40-variable model (F = 8.0)
  – 0.05 time-unit cycling (~6 hours)
• Investigate aspects of the EnKF; various RMSE metrics will be shown.

Graphic courtesy: Rahul Mahajan
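A minimal Lorenz-96 sketch matching the setup above (N = 40 variables, F = 8.0, 0.05 time-unit cycling); the fixed-step RK4 integrator is an illustrative choice, not necessarily what was used for the results shown.

```python
# Sketch: the Lorenz-96 toy model used in these experiments.
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F (periodic indices)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One RK4 step of length dt (0.05 time units ~ 6 h of 'atmosphere')."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```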
ETKF – Impact of Covariance Inflation (nobs = 20)
(Figure: average ETKF analysis-mean RMSE as a function of the inflation parameter, for ensemble sizes M = 20, 30, and 40. The average is taken over 1800 cases, ignoring the first 200 to allow for spin-up. Reference: EKF RMSE = 12.0429.)
Localization
(Figure: analysis RMSE as a function of ensemble size (8 to 30) for the ETKF and the LETKF with a ±5 grid-point local region.)
Localization of the observation selection allows for a reduction in ensemble size (inflation kept constant at 1.1 here). For larger ensembles, more work is needed to improve the result (observation-error inflation by distance, for example).
Comparison between 3DVAR, EKF, and LETKF
Time series (cycles 500 to 1000) showing the analysis RMSE (truth − analysis) for the case where 20 grid points are observed, next to 4 unobserved ones. The EKF and LETKF are significantly better than 3DVAR.

Scheme   Mean RMSE
3DVAR    51.6823
EKF      14.6979
LETKF    11.9255
Ensemble of Data Assimilations
• Much like the stochastic EnKF, ECMWF and Météo-France use an ensemble of data assimilations instead of an EnKF:
  – Perturb the observations and the model.
  – Designed to represent and estimate the uncertainty in their deterministic 4DVAR.
• This provides flow-dependent estimates of analysis error for their EPS.
• It also provides flow-dependent estimates of background error for use in DA (either as B0 or in a hybrid; next lecture).
• It can be hugely expensive, given that a variational (4DVAR) update has to be executed for each ensemble member!
Summary
• EnKFs are Monte Carlo implementations of the sequential Kalman filter (minimum variance estimate).
  – PROS: easy to implement; no TL/AD needed; flow-dependent estimates of the error covariances; the solver can work in ensemble space (computationally efficient).
  – CONS: sample sizes are usually much too small for large-dimensional systems such as NWP models; requires ad hoc methods of inflation and localization.
• Many variants of EnKFs exist, but they can be split into two classes:
  – Stochastic, perturbed-observation schemes
  – Deterministic, square root filters
• While these variants differ in their details and practical implementation, they are all solving for essentially the same thing.
• Observation- versus physical-space localization (when might this matter?)
• EnKFs have been successfully applied to many high-dimensional problems.
Summary (cont.)
• As with the variational solutions, similar assumptions need to be made to formulate the EnKF (including Gaussianity).
  – Although not discussed in detail, one does not necessarily need linear, differentiable observation operators as in variational schemes.
• There is another class of "ensemble methods" designed to sample/estimate the full PDF (not just the mean and covariance):
  – Particle filters: nonlinear, non-Gaussian DA, towards Bayesian filtering.
  – They are expensive, with greater dimensionality issues than the EnKF.
Selective References
• Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.
• Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.
• Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
• Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99(C5), 10143–10162.
• Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
• Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.
• Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Physica D, 230, 112–126.
• Ott, E., B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, and co-authors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.
• Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.
• Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.
• Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089.
• Zhang, F., C. Snyder, and J. Sun, 2004: Impact of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 132, 1238–1253.