Verification of extreme events Barbara Casati (Environment Canada) D.B. Stephenson (University of Reading) ENVIRONMENT CANADA ENVIRONNEMENT CANADA.


What is an extreme event?

Extreme events can be defined by:

• Maxima/minima
• Magnitude
• Rarity
• Impact/losses

“Man can believe the impossible, but man can never believe the improbable.” - Oscar Wilde

Gare Montparnasse, 22 October 1895

Definition: intense/extreme/rare events are the events in the tail of the distribution

Driving questions

Q1: Is there any skill in forecasting extremes?

Q2: Are traditional categorical verification scores able to detect skill for extreme events?

Q3: Does hedging affect the extreme event verification scores?

Q4: Which scores are more suitable for verifying extremes?

Contingency Image and Table

            X > u         X < u                     Total
Y > u       Hits (a)      False alarms (b)          a+b
Y < u       Misses (c)    Correct rejections (d)    c+d
Total       a+c           b+d                       n

Threshold (u) → Binary images → Contingency table (joint distribution) → Categorical scores and skill scores: TS, ETS, HSS, KSS, OR, YQ, ROC curve.

[Figure panels: X, binary analysis; Y, binary forecast; overlapping binary fields]
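The thresholding step and the scores listed above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the formulas are the standard definitions in terms of a, b, c and d, and every denominator is assumed to be nonzero.

```python
import numpy as np


def contingency_table(forecast, analysis, u):
    """Threshold both fields at u and count the four outcomes."""
    y = np.asarray(forecast) > u   # binary forecast Y
    x = np.asarray(analysis) > u   # binary analysis X
    a = int(np.sum(y & x))         # hits
    b = int(np.sum(y & ~x))        # false alarms
    c = int(np.sum(~y & x))        # misses
    d = int(np.sum(~y & ~x))       # correct rejections
    return a, b, c, d


def categorical_scores(a, b, c, d):
    """Standard categorical scores; assumes no denominator is zero."""
    n = a + b + c + d
    a_r = (a + b) * (a + c) / n    # hits expected by chance
    return {
        "TS": a / (a + b + c),
        "ETS": (a - a_r) / (a + b + c - a_r),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "KSS": a / (a + c) - b / (b + d),
        "OR": (a * d) / (b * c),
        "YQ": (a * d - b * c) / (a * d + b * c),
    }


# Toy one-dimensional fields, thresholded at u = 1.
table = contingency_table([0, 2, 3, 0, 5], [0, 3, 0, 4, 6], u=1)
scores = categorical_scores(30, 20, 10, 40)
```

For rare events a, b or c can be zero, so a real implementation would need to guard the divisions.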

NIMROD case studies

[Figure: NIMROD cases A, B, C, D, E, F, G, H]

As the threshold increases, TS, ETS, HSS and KSS converge to zero (no skill) for all the cases

As the threshold increases, the odds ratio, ROC and Yule's Q separate the cases

Why do the scores behave differently?

Categorical scores versus threshold

Base rate versus threshold

As the threshold increases, the base rate decreases

Base rate = probability of the event:

$$ \alpha = \frac{a+c}{n} = P(X > u) $$

Intense/extreme/rare events when $\alpha \to 0$

Categorical scores versus base rate

Plots in logarithmic scale: the rate of convergence of the statistics plays a key role in discriminating the NIMROD cases

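Asymptotic model:

$$ a/n \sim \alpha^{\beta} \quad \text{as } \alpha \to 0 $$

$\alpha$ = base rate, $\beta$ = slope parameter

$\beta > 1$ for ROC curve regularity

$\beta = 2$ for a random forecast: $\beta > 2$ no skill, $\beta < 2$ skill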
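Because the model is a straight line in logarithmic scale, the slope parameter can be estimated by a least-squares fit of log(a/n) against the log of the base rate over a range of thresholds. The sketch below is an assumption about how this could be done, not the procedure used for the NIMROD cases; `estimate_slope` is a hypothetical name and the input values are synthetic.

```python
import numpy as np


def estimate_slope(base_rates, hit_fractions):
    """Least-squares slope of log(a/n) against log(base rate)."""
    log_base_rate = np.log(np.asarray(base_rates, dtype=float))
    log_hits = np.log(np.asarray(hit_fractions, dtype=float))
    slope, _intercept = np.polyfit(log_base_rate, log_hits, 1)
    return float(slope)


# Synthetic base rates for increasing thresholds.
base_rates = np.array([0.1, 0.05, 0.02, 0.01])

# A random forecast has a/n equal to the squared base rate (slope 2).
slope_random = estimate_slope(base_rates, base_rates ** 2)

# A synthetic skilful forecast whose hits decay more slowly (slope 1.5).
slope_skilful = estimate_slope(base_rates, base_rates ** 1.5)
```

A slope below 2 indicates that the hits vanish more slowly than for a random forecast, which is the sense in which the forecast has skill for rare events.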

Behavior of the hits

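Asymptotic behaviour of the joint distribution (unbiased forecast)

$$
\begin{array}{l|cc|c}
 & P(X > u) & P(X < u) & \\ \hline
P(Y > u) & \alpha^{\beta} & \alpha - \alpha^{\beta} & \alpha \\
P(Y < u) & \alpha - \alpha^{\beta} & 1 - 2\alpha + \alpha^{\beta} & 1 - \alpha \\ \hline
 & \alpha & 1 - \alpha & 1
\end{array}
$$

• 2 degrees of freedom: $(\alpha, \beta)$ fully describe the joint distribution
• Express joint probabilities and verification statistics as functions of $\alpha, \beta$
• Analyze the asymptotic behaviour of the statistics when the base rate $\alpha \to 0$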

Scores asymptotic behaviour (no bias)

The odds ratio exhibits different asymptotic behaviours depending on whether a/n converges to zero faster than, at the same rate as, or slower than a/n for a random forecast

The magnitude of TS, ETS, HSS and KSS converges to zero as the base rate $\alpha \to 0$, whether the forecast is skilful, random or worse than random (the base rate $\alpha$ acts as a diminishing factor)

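With $a/n = \alpha^{\beta}$ and no bias, as the base rate $\alpha \to 0$:

$$ \mathrm{OR} = \frac{\alpha^{\beta}\,(1 - 2\alpha + \alpha^{\beta})}{(\alpha - \alpha^{\beta})^{2}} \sim \alpha^{\beta-2} \to \begin{cases} \infty & \text{if } \beta < 2 \\ 1 & \text{if } \beta = 2 \\ 0 & \text{if } \beta > 2 \end{cases} $$

$$ \mathrm{KSS} = \mathrm{HSS} = \frac{\alpha^{\beta-1} - \alpha}{1 - \alpha} \to 0 $$

$$ \mathrm{ETS} = \frac{\alpha^{\beta} - \alpha^{2}}{2\alpha - \alpha^{\beta} - \alpha^{2}} \sim \frac{\alpha^{\beta-1} - \alpha}{2} \to 0 $$

$$ \mathrm{TS} = \frac{\alpha^{\beta}}{2\alpha - \alpha^{\beta}} \sim \frac{\alpha^{\beta-1}}{2} \to 0 $$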
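The contrast between the two families of scores can be checked numerically. The sketch below builds the unbiased joint distribution from a base rate and a slope parameter, assuming a/n equals the base rate raised to the slope exactly, and evaluates the scores on it; `unbiased_scores` is a hypothetical helper and the parameter values are illustrative only.

```python
def unbiased_scores(base_rate, slope):
    """Scores for an unbiased forecast with a/n = base_rate ** slope."""
    a = base_rate ** slope          # joint probability of a hit
    b = base_rate - a               # false alarms
    c = base_rate - a               # misses (equal to b: no bias)
    d = 1.0 - 2.0 * base_rate + a   # correct rejections
    a_r = base_rate ** 2            # hits expected from a random forecast
    return {
        "TS": a / (a + b + c),
        "ETS": (a - a_r) / (a + b + c - a_r),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "KSS": a / (a + c) - b / (b + d),
        "OR": (a * d) / (b * c),
    }


# A skilful forecast (slope < 2) evaluated at decreasing base rates.
common = unbiased_scores(1e-2, 1.5)
rare = unbiased_scores(1e-4, 1.5)

# A random forecast (slope = 2).
random_forecast = unbiased_scores(1e-2, 2.0)
```

With a slope of 1.5, TS, ETS, HSS and KSS all shrink as the base rate decreases even though the forecast is skilful, while the odds ratio grows; with a slope of 2 the odds ratio stays at 1.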

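Asymptotic behaviour of the joint distribution (biased forecast)

$$
\begin{array}{l|cc|c}
 & P(X > u) & P(X < u) & \\ \hline
P(Y > u) & B\alpha^{\beta} & B\alpha - B\alpha^{\beta} & B\alpha \\
P(Y < u) & \alpha - B\alpha^{\beta} & 1 - (1+B)\alpha + B\alpha^{\beta} & 1 - B\alpha \\ \hline
 & \alpha & 1 - \alpha & 1
\end{array}
$$

• 3 degrees of freedom: $(B, \alpha, \beta)$ fully describe the joint distribution
• Express joint probabilities and verification statistics as functions of $B, \alpha, \beta$
• Analyze the asymptotic behaviour of the statistics when the base rate $\alpha \to 0$
• Analyze the sensitivity of the statistics to the bias $B$ in the limit $\alpha \to 0$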

Scores asymptotic behaviour (bias)

The magnitude of TS, ETS, HSS and KSS increases monotonically as B increases: these scores encourage over-forecasting!

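With $a/n = B\alpha^{\beta}$, as the base rate $\alpha \to 0$:

$$ \mathrm{OR} = \frac{B\alpha^{\beta}\,(1 - \alpha - B\alpha + B\alpha^{\beta})}{(B\alpha - B\alpha^{\beta})(\alpha - B\alpha^{\beta})} \sim \alpha^{\beta-2} $$

$$ \mathrm{KSS} = \frac{B\,(\alpha^{\beta-1} - \alpha)}{1 - \alpha} \sim B\,(\alpha^{\beta-1} - \alpha) $$

$$ \mathrm{HSS} = \frac{2B\,(\alpha^{\beta-1} - \alpha)}{1 + B - 2B\alpha} \sim \frac{2B}{1+B}\,(\alpha^{\beta-1} - \alpha) $$

$$ \mathrm{ETS} = \frac{B\,(\alpha^{\beta-1} - \alpha)}{1 + B - B\alpha^{\beta-1} - B\alpha} \sim \frac{B}{1+B}\,(\alpha^{\beta-1} - \alpha) $$

$$ \mathrm{TS} = \frac{B\alpha^{\beta-1}}{1 + B - B\alpha^{\beta-1}} \sim \frac{B}{1+B}\,\alpha^{\beta-1} $$

The odds ratio is not affected by the bias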
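The sensitivity to the bias can be illustrated by varying B at a fixed, small base rate. The sketch below assumes that the hits scale as B times the base rate raised to the slope parameter, which is the form consistent with a random forecast of bias B; `biased_scores` is a hypothetical helper and the numbers are illustrative only.

```python
def biased_scores(base_rate, slope, bias):
    """Scores for a forecast of frequency bias `bias`.

    Assumes a/n = bias * base_rate ** slope (an assumption of this sketch).
    """
    a = bias * base_rate ** slope            # hits
    b = bias * base_rate - a                 # false alarms
    c = base_rate - a                        # misses
    d = 1.0 - base_rate - bias * base_rate + a
    a_r = bias * base_rate ** 2              # hits expected by chance
    return {
        "TS": a / (a + b + c),
        "ETS": (a - a_r) / (a + b + c - a_r),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "KSS": a / (a + c) - b / (b + d),
        "OR": (a * d) / (b * c),
    }


# Same base rate and slope, increasing bias (B > 1 is over-forecasting).
biases = [0.5, 1.0, 2.0, 4.0]
results = [biased_scores(1e-4, 1.5, bias) for bias in biases]
ts_values = [r["TS"] for r in results]
kss_values = [r["KSS"] for r in results]
or_values = [r["OR"] for r in results]
```

In this example TS and KSS grow steadily with B, whereas the odds ratio changes by only a few percent, and that residual dependence shrinks further as the base rate decreases.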

Extreme dependency score

Forecast and observation values are transformed into empirical cumulative probabilities: uniform marginal distributions, no bias

Cumulative probability: p = 1 – base rate

As the base rate $\alpha \to 0$, $p \to 1$: intense/extreme/rare events
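One way to carry out this transformation is to replace each value by its rank divided by the sample size. The sketch below is an assumption about the procedure, not the authors' implementation; `empirical_cdf` is a hypothetical name, and ties (for example the many zeros of a precipitation field) are broken by position, which a real application may want to handle differently.

```python
import numpy as np


def empirical_cdf(values):
    """Map each value to its empirical cumulative probability in (0, 1]."""
    flat = np.asarray(values, dtype=float).ravel()
    order = np.argsort(flat, kind="stable")
    ranks = np.empty(flat.size, dtype=int)
    ranks[order] = np.arange(1, flat.size + 1)   # rank 1 = smallest value
    return ranks / flat.size


# Synthetic forecast and observation samples with different distributions.
rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=1.0, size=1000)
fcst = 1.5 * obs + rng.normal(scale=0.5, size=1000)

obs_p = empirical_cdf(obs)
fcst_p = empirical_cdf(fcst)

# Thresholding both at the same p gives the same number of events,
# so the transformed forecast is unbiased by construction.
p = 0.95
n_obs_events = int(np.sum(obs_p > p))
n_fcst_events = int(np.sum(fcst_p > p))
```

Because both transformed samples are uniform, any threshold p yields the same event frequency for forecast and observations, which removes the bias before the score is computed.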

[Figure: quadrants a', b', c', d' of the transformed forecast and observation values]

$$ \mathrm{EDS} = \lim_{p \to 1} \frac{2 \ln\left((a'+c')/n\right)}{\ln\left(a'/n\right)} - 1 = \frac{2}{\beta} - 1 $$

1. The EDS does not depend on the base rate

2. It is not affected by the bias

3. It depends only on the parameter $\beta$ (the rate of convergence of the joint probability a/n to zero as the events get rarer)

4. It separates the case studies

Extreme Dependency Score

EDS measures the extreme dependency between forecast and observations
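At a finite threshold the score can be evaluated directly from the hits, the misses and the sample size. The sketch below is a minimal illustration of the formula with made-up counts; `extreme_dependency_score` is a hypothetical name.

```python
import math


def extreme_dependency_score(a, c, n):
    """EDS = 2 ln((a + c) / n) / ln(a / n) - 1 for hits a and misses c."""
    if a <= 0 or a + c >= n:
        raise ValueError("EDS needs at least one hit and a base rate below 1")
    return 2.0 * math.log((a + c) / n) / math.log(a / n) - 1.0


# Random forecast: a/n equals the squared base rate (0.01 ** 2).
eds_random = extreme_dependency_score(a=1, c=99, n=10_000)

# Skilful forecast: a/n = 0.01 ** 1.5, i.e. slope parameter 1.5.
eds_skilful = extreme_dependency_score(a=100, c=900, n=100_000)

# Perfect forecast: every observed event is a hit.
eds_perfect = extreme_dependency_score(a=100, c=0, n=10_000)
```

These three cases give 0, 1/3 and 1 respectively, so the score is zero for a random forecast and increases as the hits vanish more slowly than the squared base rate.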

Conclusions

1. The magnitude of TS, ETS, HSS and KSS converges to zero for skilful, random and worse-than-random forecasts alike: these scores are not suitable for verifying extremes

2. The odds ratio, Yule's Q and ROC are more suitable for detecting skill in forecasting extreme/rare events

3. TS, ETS, HSS and KSS are overly sensitive to the bias in extreme event situations and encourage over-forecasting

4. The odds ratio, Yule's Q and ROC are not affected by the bias when verifying extreme/rare events

5. The Extreme Dependency Score provides a bias-independent and base-rate-independent measure of extreme dependency: very suitable for verifying extremes

References

S. Coles, J. Heffernan, J. Tawn (1999) “Dependence Measures for Extreme Value Analyses”, Extremes 2:4, pp. 339-365

C. Ferro (2007) “A probability model for verifying deterministic forecasts of extreme events”, Weather & Forecasting, in press

B. Casati (2004) “New approaches for the verification of spatial precipitation forecasts”, PhD Thesis (Chapter 6), available at http://www.met.rdg.ac.uk/~swr00bc/PhD.html

D.B. Stephenson, B. Casati, C. Wilson (2004) “Skill measures for forecasts of rare extreme events”, presentation given at the 2nd international workshop on verification methods, Montreal, Sept 2004.

THANK YOU !
