
Non-parametric postprocessing of ensemble forecasts for extreme and rare events: a focus on daily rainfall using weighted scoring rules for verification

Maxime Taillardat1,2,3

O. Mestre4, M. Zamo4, P. Naveau2 and A-L. Fougères3

1CNRM/Météo-France 2LSCE 3ICJ 4Météo-France

April 25, 2016


Ensemble forecast

source : P. Naveau




Motivations for statistical post-processing

◮ Ensembles are subject to model biases and underdispersion for surface weather variables (Hamill and Colucci 1997 ...).

◮ A simple bias correction is not sufficient.

◮ The skill added by post-processing is not reduced by improvements in ensemble developments (Hemri et al. 2014).

Our goal

Most recent developments are based on parametric techniques.

◮ We want to focus on non-parametric/data-driven techniques.

◮ We want to deal with “tricky” weather variables (precipitation accumulation).



Main calibration techniques

Most popular techniques:

◮ Analog Method (Hamill and Whitaker 2006)

  ◮ Find in the model climate the situations that are closest (according to a metric) to a given prediction

  ◮ Substitute the observations of these “analogs” for this prediction

◮ Bayesian model averaging (Raftery et al. 2005)

\[
\text{Forecasted proba} = \sum_{k=1}^{K} \left[\text{proba from forecaster } k \times \text{posterior of forecaster } k \text{ being correct}\right]
\]

◮ Ensemble model output statistics (Gneiting et al. 2005)

Under Gaussianity, the EMOS predictive mean is a bias-corrected weighted average of the ensemble member forecasts. The EMOS predictive variance is a linear function of the ensemble variance.
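The Gaussian EMOS description above can be illustrated with a minimal Python sketch. The coefficient names `a`, `b`, `c`, `d` and their numerical values are assumptions made for illustration only (in practice they would be fitted on training data), and the ensemble variance is taken here as the population variance of the members, which is also an assumption.

```python
import math
from statistics import pvariance


def emos_gaussian(members, a, b, c, d):
    """Return (mean, variance) of a Gaussian EMOS predictive distribution.

    mean     = a + sum_k b[k] * members[k]   (bias-corrected weighted average)
    variance = c + d * S^2                   (linear in the ensemble variance)
    The coefficients are hypothetical inputs, not fitted values.
    """
    if len(members) != len(b):
        raise ValueError("one weight per ensemble member is required")
    mean = a + sum(bk * xk for bk, xk in zip(b, members))
    variance = c + d * pvariance(members)
    return mean, variance


def gaussian_cdf(y, mean, variance):
    """CDF of the Gaussian predictive distribution evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mean) / math.sqrt(2.0 * variance)))


# Toy 3-member ensemble with made-up coefficients.
members = [1.0, 2.0, 3.0]
mean, variance = emos_gaussian(members, a=0.5, b=[0.3, 0.3, 0.3], c=0.2, d=1.5)
prob_below_3 = gaussian_cdf(3.0, mean, variance)
```

The predictive CDF then gives any probability or quantile of interest, which is what makes EMOS a parametric technique: the whole forecast is summarised by two numbers.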



Plan

1 The QRF technique

2 Ensemble forecast verification

3 Results

4 Prospects



Quantile Regression Forests (QRF)

◮ Meinshausen 2006 (R package “quantregForest”)

◮ Quantile Regression: estimation of the conditional median or any other quantile of the response variable given a set of predictors (Koenker and Bassett Jr 1978)

◮ Random Forests: aggregating predictions from binary decision trees (CART) (Breiman 2001)

◮ Non-parametric: no assumption is made about the variable subject to calibration






From CART to QRF

◮ Binary decision tree

[Figure: binary decision tree in which node A splits into child nodes B and C]

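◮ Let s be a threshold on a predictor X_i; s must create branches that are as homogeneous as possible in terms of variance:

\[
\Delta R(s, b) = \max_{s \in \Sigma} \left[ R(b) - \left( R(b_l) + R(b_r) \right) \right]
\]

where

\[
R(b) = \sum_{x_i \in b} \left( y_i - \bar{y}(b) \right)^2
\]

with b_l and b_r the left and right branches created from node b, and \bar{y}(b) the mean of the responses falling in b.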
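The split criterion can be sketched in a few lines of Python for a single predictor. This is a minimal illustration, not the CART implementation itself: the function names, the use of midpoints between distinct predictor values as candidate thresholds, and the toy data are all assumptions.

```python
def rss(values):
    """Residual sum of squares R(b) of the responses falling in a node."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, gain) maximising R(b) - (R(b_l) + R(b_r))."""
    parent = rss(y)
    best_threshold, best_gain = None, 0.0
    distinct = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        s = (lo + hi) / 2.0
        left = [yi for xi, yi in zip(x, y) if xi <= s]
        right = [yi for xi, yi in zip(x, y) if xi > s]
        gain = parent - (rss(left) + rss(right))
        if gain > best_gain:
            best_threshold, best_gain = s, gain
    return best_threshold, best_gain


# Toy data: the response jumps from 0 to 5 between x = 3 and x = 10.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
threshold, gain = best_split(x, y)
```

On this toy sample the best threshold falls between the two groups, where both branches become perfectly homogeneous. A full tree repeats this search over all predictors and recursively in each branch.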




◮ Unstable trees (low bias but very high variance): one fits K trees using K random samples drawn with replacement from the training set (bootstrap): Tree Bagging

◮ Strongly correlated trees: each split of each bagged tree is built on a random subset of the predictors in Σ: Random Forests

◮ For each final leaf of each tree, one does not compute the mean of the predictand’s values but instead their empirical CDF: Quantile Regression

\[
\hat{F}_x(y) = \hat{P}(Y \le y \mid X = x) = \sum_{i=1}^{n} \pi_i(x)\, I(Y_i \le y)
\]
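The weighted empirical CDF can be sketched as follows, assuming the forest has already been grown. The weights follow the definition in Meinshausen (2006): for each tree, a training point in the same leaf as x receives one over the leaf size, and these are averaged over trees. The leaf assignments, function names and toy values below are hypothetical, and each leaf reached by x is assumed to contain at least one training point.

```python
def qrf_weights(train_leaves, query_leaves):
    """Weights pi_i(x): average over trees of 1{i in leaf(x)} / |leaf(x)|.

    train_leaves[t][i] is the leaf index of training point i in tree t;
    query_leaves[t] is the leaf index reached by the new point x in tree t.
    """
    n_trees = len(train_leaves)
    weights = [0.0] * len(train_leaves[0])
    for leaves, leaf_x in zip(train_leaves, query_leaves):
        members = [i for i, leaf in enumerate(leaves) if leaf == leaf_x]
        for i in members:
            weights[i] += 1.0 / (len(members) * n_trees)
    return weights


def conditional_cdf(weights, y_train, y):
    """Estimated F_x(y) = sum_i pi_i(x) * 1{Y_i <= y}."""
    return sum(w for w, yi in zip(weights, y_train) if yi <= y)


def conditional_quantile(weights, y_train, level):
    """Smallest training response whose weighted CDF reaches `level`."""
    total = 0.0
    for yi, w in sorted(zip(y_train, weights)):
        total += w
        if total >= level - 1e-12:
            return yi
    return max(y_train)


# Toy forest of 2 trees and 4 training points (hypothetical leaf indices).
y_train = [0.0, 1.0, 4.0, 10.0]
train_leaves = [[0, 0, 1, 1], [0, 1, 1, 1]]
query_leaves = [1, 1]
weights = qrf_weights(train_leaves, query_leaves)
median = conditional_quantile(weights, y_train, 0.5)
```

Because the output is a step function over observed values only, the calibrated forecast can never exceed the largest response seen in training.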



Comparison between EMOS and QRF

[Figure: raw ensemble and calibrated ensemble, shown for EMOS and for QRF]





















Paradigm of verification, scoring rules

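“The paradigm of maximizing the sharpness of the predictive distributions subject to calibration” (Gneiting et al. 2006)

◮ A proper score: the CRPS (Murphy 1969; Gneiting and Raftery 2007; Naveau et al. 2015; MT 2016)

\[
\begin{aligned}
\mathrm{CRPS}(F, y) &= \int_{-\infty}^{\infty} \left( F(x) - 1\{x \ge y\} \right)^2 \, dx \\
&= E_F|X - y| - \tfrac{1}{2} E_F|X - X'| \\
&= y + 2\left[ \bar{F}(y)\, E_F(X - y \mid X > y) - E_F(X F(X)) \right] \\
&= E_F|X - y| + E_F(X) - 2 E_F(X F(X))
\end{aligned}
\]

where \bar{F} = 1 - F is the survival function and X' is an independent copy of X.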

◮ Test of equal predictive accuracy: the Diebold-Mariano type test (1995)

◮ The CRPSS (skill score)

\[
\mathrm{CRPSS}(A, B) = 1 - \frac{\mathrm{CRPS}_A}{\mathrm{CRPS}_B}
\]
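For an ensemble forecast, the kernel form of the CRPS, E|X − y| − ½ E|X − X′|, can be evaluated directly by taking F to be the empirical distribution of the members. The sketch below does this and then computes the skill score from mean CRPS values. The function names and toy forecasts are illustrative assumptions.

```python
def crps_ensemble(members, y):
    """CRPS of the empirical ensemble distribution: E|X - y| - 0.5 E|X - X'|."""
    m = len(members)
    term1 = sum(abs(x - y) for x in members) / m
    term2 = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return term1 - 0.5 * term2


def mean_crps(forecasts, observations):
    """Average CRPS over a set of forecast/observation pairs."""
    scores = [crps_ensemble(f, y) for f, y in zip(forecasts, observations)]
    return sum(scores) / len(scores)


def crpss(crps_a, crps_b):
    """Skill of forecast A relative to reference B (1 is perfect, 0 is no gain)."""
    return 1.0 - crps_a / crps_b


# Two toy cases: system A is sharper than reference system B.
observations = [0.5, 3.0]
forecasts_a = [[0.0, 1.0], [2.0, 4.0]]
forecasts_b = [[0.0, 3.0], [0.0, 6.0]]
skill = crpss(mean_crps(forecasts_a, observations),
              mean_crps(forecasts_b, observations))
```

A positive CRPSS means that A improves on B on average; whether the difference is significant is what a Diebold-Mariano type test addresses.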






Data and model fitting

◮ 4 yr of PEARP (ARPEGE 35-member Ensemble Prediction System) data, from 2011 to 2014, at 87 French SYNOP stations, for a 24-h lead time and an 18 UTC initialization

◮ For EMOS precipitation: GEV and Censored/Shifted Gamma distributions are selected (as in Hemri et al. 2014; Scheuerer, Baran 2015) on an overall CRPS minimization criterion (GPD is rejected).

◮ For Analog Method: the Mahalanobis metric is kept.

◮ Two sets of predictors for the QRF technique:

  ◮ Only predictors concerning the variable of interest (QRF_O)

  ◮ As QRF_O, plus the first, fifth and ninth deciles of the PEARP distributions of other variables (QRF_M)
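The decile predictors used in QRF_M can be sketched as below for one auxiliary variable. This is only an illustration: the function name, the output keys and the synthetic 35-member ensemble are assumptions, and the interpolation rule is the default of Python's `statistics.quantiles`, which may differ from the one used in the study.

```python
from statistics import quantiles


def decile_predictors(members):
    """First, fifth and ninth deciles of one ensemble, as QRF predictors."""
    cuts = quantiles(members, n=10)  # 9 cut points: deciles 1 to 9
    return {"d1": cuts[0], "d5": cuts[4], "d9": cuts[8]}


# Synthetic 35-member ensemble for a hypothetical auxiliary variable.
members = [float(v) for v in range(1, 36)]
predictors = decile_predictors(members)
```

Summarising each auxiliary ensemble by three deciles keeps the predictor set small while still describing its location and spread.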



Daily rainfall



Assessing performance for extreme and rare events

◮ A weighted score: the wCRPS (Gneiting and Ranjan 2012; Naveau et al. 2015; MT 2016)

\[
\begin{aligned}
\mathrm{wCRPS}(F, y) &= \int_{-\infty}^{\infty} w(x) \left( F(x) - 1\{x \ge y\} \right)^2 \, dx \\
&= W(y) + 2\left[ \bar{F}(y)\, E_F(W(X) - W(y) \mid X > y) - E_F(W(X) F(X)) \right] \\
&= E_F|W(X) - W(y)| + E_F(W(X)) - 2 E_F(W(X) F(X))
\end{aligned}
\]

where W = \int w, \bar{F} = 1 - F and 0 < \int w f < \infty

◮ The weight function must not depend on the observation: otherwise the score is improper.



Weight functions used

[Figure: the weight functions W1, W2 and W3 plotted against rainfall]

◮ $w_1(x) = 1 - \dfrac{f(x)}{f(0.2)}$, where f is the PDF of the climatology

◮ $w_2(x) = 1\{x \ge 20\}$

◮ $w_3(x) = 2\, w_1(x)\, W_1(x)$
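For the threshold weight w2, the antiderivative is W(x) = max(x − 20, 0), and the weighted CRPS of an ensemble can be computed as the ordinary kernel-form CRPS of the transformed members W(X) against W(y). The sketch below assumes F is the empirical distribution of the members; the function names and toy values are illustrative.

```python
def chain_threshold(x, threshold):
    """W(x) for w(t) = 1{t >= threshold}, i.e. max(x - threshold, 0)."""
    return max(x - threshold, 0.0)


def wcrps_ensemble(members, y, chain):
    """Weighted CRPS: kernel-form CRPS applied to W(members) and W(y)."""
    z = [chain(x) for x in members]
    zy = chain(y)
    m = len(z)
    term1 = sum(abs(zi - zy) for zi in z) / m
    term2 = sum(abs(zi - zj) for zi in z for zj in z) / (m * m)
    return term1 - 0.5 * term2


def w2_chain(x):
    """Chaining function of w2, with the 20 threshold from the slide."""
    return chain_threshold(x, 20.0)


# A heavy-rain case, and a case where everything is below the threshold.
score_heavy = wcrps_ensemble([10.0, 30.0], 25.0, w2_chain)
score_light = wcrps_ensemble([1.0, 2.0], 3.0, w2_chain)
```

With this weight, forecast errors that lie entirely below 20 contribute nothing to the score, so the comparison between methods concentrates on heavy rainfall.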





Daily rainfall with weighted scoring rules

w4(x) = 1{x ≤ 15}






Prospects

◮ The QRF technique performs at least as well as EMOS, and often better, except for very high thresholds. This is expected: if, for example, an event occurs 5 times per year, a 7-year training sample is needed (5 × 7 = 35 cases) to build a sound 35-member ensemble with data-driven techniques.

◮ Reforecast work (Hamill and Whitaker 2006: a 25-yr reforecast has been used)

◮ Combination of QRF and GPD CDF fitting (see tomorrow)

◮ Deal with other parameters (TCC: good preliminary results)

◮ Recovering spatio-temporal trajectories (e.g. ECC, Schefzik et al. 2013)






References

◮ Taillardat, M., O. Mestre, M. Zamo, and P. Naveau, 2016: Calibrated Ensemble Forecasts using Quantile Regression Forests and Ensemble Model Output Statistics. Mon. Wea. Rev., doi:10.1175/MWR-D-15-0260.1, in press.




References I

Baran, S., and S. Lerch, 2015: Log-normal distribution based ensemble model output statistics models for probabilistic wind-speed forecasting. Quarterly Journal of the Royal Meteorological Society.

Ben Bouallègue, Z., 2013: Calibrated short-range ensemble precipitation forecasts using extended logistic regression with interaction terms. Weather and Forecasting, 28 (2), 515–524.

Breiman, L., 1996: Bagging predictors. Machine Learning, 24 (2), 123–140.

Breiman, L., 2001: Random forests. Machine Learning, 45 (1), 5–32.

Breiman, L., J. Friedman, C. J. Stone, and R. A. Olshen, 1984: Classification and regression trees. CRC Press.



References II

Courtier, P., C. Freydier, J. Geleyn, F. Rabier, and M. Rochas, 1991: The ARPEGE project at Météo-France. ECMWF seminar proceedings, Vol. 2, 193–231.

Descamps, L., C. Labadie, A. Joly, E. Bazile, P. Arbogast, and P. Cébron, 2014: PEARP, the Météo-France short-range ensemble prediction system. Quarterly Journal of the Royal Meteorological Society.

Feldmann, K., M. Scheuerer, and T. L. Thorarinsdottir, 2014: Spatial postprocessing of ensemble forecasts for temperature using nonhomogeneous Gaussian regression. arXiv preprint arXiv:1407.0058.

Friederichs, P., and A. Hense, 2007: Statistical downscaling of extreme precipitation events using censored quantile regression. Monthly Weather Review, 135 (6), 2365–2378.



References III

Friederichs, P., and T. L. Thorarinsdottir, 2012: Forecast verification for extreme value distributions with an application to probabilistic peak wind prediction. Environmetrics, 23 (7), 579–594.

Gneiting, T., F. Balabdaoui, and A. E. Raftery, 2007: Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69 (2), 243–268.

Gneiting, T., and M. Katzfuss, 2014: Probabilistic forecasting. Annual Review of Statistics and Its Application, 1, 125–151.

Gneiting, T., and A. E. Raftery, 2007: Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102 (477), 359–378.



References IV

Gneiting, T., A. E. Raftery, A. H. Westveld III, and T. Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review, 133 (5), 1098–1118.

Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Monthly Weather Review, 129 (3), 550–560.

Hamill, T. M., R. Hagedorn, and J. S. Whitaker, 2008: Probabilistic forecast calibration using ECMWF and GFS ensemble reforecasts. Part II: Precipitation. Monthly Weather Review, 136 (7), 2620–2632.

Hamill, T. M., and J. S. Whitaker, 2006: Probabilistic quantitative precipitation forecasts based on reforecast analogs: Theory and application. Monthly Weather Review, 134 (11), 3209–3229.



References V

Hemri, S., M. Scheuerer, F. Pappenberger, K. Bogner, and T. Haiden, 2014: Trends in the predictive performance of raw ensemble weather forecasts. Geophysical Research Letters, 41 (24), 9197–9205.

Hersbach, H., 2000: Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather and Forecasting, 15 (5), 559–570.

Meinshausen, N., 2006: Quantile regression forests. The Journal of Machine Learning Research, 7, 983–999.

Pinson, P., 2012: Adaptive calibration of (u, v)-wind ensemble forecasts. Quarterly Journal of the Royal Meteorological Society, 138 (666), 1273–1284.



References VI

Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski, 2005: Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review, 133 (5), 1155–1174.

Schefzik, R., T. L. Thorarinsdottir, T. Gneiting, and Coauthors, 2013: Uncertainty quantification in complex simulation models using ensemble copula coupling. Statistical Science, 28 (4), 616–640.

Scheuerer, M., 2014: Probabilistic quantitative precipitation forecasting using ensemble model output statistics. Quarterly Journal of the Royal Meteorological Society, 140 (680), 1086–1096.

Schuhen, N., T. L. Thorarinsdottir, and T. Gneiting, 2012: Ensemble model output statistics for wind vectors. Monthly Weather Review, 140 (10), 3204–3219.



References VII

Sloughter, J. M., T. Gneiting, and A. E. Raftery, 2010: Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. Journal of the American Statistical Association, 105 (489), 25–35.

Sloughter, J. M. L., A. E. Raftery, T. Gneiting, and C. Fraley, 2007: Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review, 135 (9), 3209–3220.

Thorarinsdottir, T. L., and T. Gneiting, 2010: Probabilistic forecasts of wind speed: ensemble model output statistics by using heteroscedastic censored regression. Journal of the Royal Statistical Society: Series A (Statistics in Society), 173 (2), 371–388.



References VIII

Weijs, S. V., R. Van Nooijen, and N. Van De Giesen, 2010: Kullback-Leibler divergence as a forecast skill score with classic reliability-resolution-uncertainty decomposition. Monthly Weather Review, 138 (9), 3387–3399.

Wilks, D. S., 1995: Statistical methods in the atmospheric sciences. Academic Press, 467 pp.

Zamo, M., O. Mestre, P. Arbogast, and O. Pannekoucke, 2014: A benchmark of statistical regression methods for short-term forecasting of photovoltaic electricity production. Part II: Probabilistic forecast of daily production. Solar Energy, 105, 804–816.



Results on surface temperature






Interest of QRF for forecasters






QRF can have a meteorological interpretation


