Verification of extreme events
Barbara Casati (Environment Canada), D.B. Stephenson (University of Reading)
What is an extreme event?
Extreme events can be defined by:
• Maxima/minima
• Magnitude
• Rarity
• Impact/losses
“Man can believe the impossible, but man can never believe the improbable.” - Oscar Wilde
Gare Montparnasse, 22 October 1895
Definition: intense/extreme/rare events are the events in the tail of the distribution
Driving questions
Q1: Is there any skill in forecasting extremes?
Q2: Are traditional categorical verification scores able to detect skill for extreme events?
Q3: Does hedging affect the extreme event verification scores?
Q4: Which scores are more suitable for verifying extremes?
Contingency Image and Table
Threshold (u) → Binary Images → Contingency Table (joint distribution) → Categorical Scores and Skill Scores: TS, ETS, HSS, KSS, OR, YQ, ROC curve.
[Figure: X = binary analysis, Y = binary forecast, and their overlap]

          X > u         X < u                     Total
Y > u     Hits (a)      False Alarms (b)          a+b
Y < u     Misses (c)    Correct Rejections (d)    c+d
Total     a+c           b+d                       n
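The chain from threshold to binary images to contingency table to scores can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `contingency_table` and `categorical_scores` are hypothetical, the score formulas are the standard 2x2 definitions, the counts in the usage lines are made up, and the sketch assumes every denominator is non-zero.

```python
import numpy as np

def contingency_table(forecast, analysis, u):
    """Return (a, b, c, d): hits, false alarms, misses, correct rejections."""
    y = np.asarray(forecast) > u   # binary forecast Y
    x = np.asarray(analysis) > u   # binary analysis X
    a = int(np.sum(y & x))         # Y > u and X > u
    b = int(np.sum(y & ~x))        # Y > u and X <= u
    c = int(np.sum(~y & x))        # Y <= u and X > u
    d = int(np.sum(~y & ~x))       # Y <= u and X <= u
    return a, b, c, d

def categorical_scores(a, b, c, d):
    """Standard 2x2 scores; assumes every denominator is non-zero."""
    n = a + b + c + d
    a_random = (a + b) * (a + c) / n   # hits expected by chance
    odds_ratio = (a * d) / (b * c)
    return {
        "TS": a / (a + b + c),
        "ETS": (a - a_random) / (a + b + c - a_random),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "KSS": a / (a + c) - b / (b + d),
        "OR": odds_ratio,
        "YQ": (odds_ratio - 1) / (odds_ratio + 1),
    }

# Made-up counts, for illustration only (not NIMROD results).
scores = categorical_scores(a=30, b=20, c=10, d=940)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In practice the rare-event regime is where empty cells appear, so a real implementation would need to decide how to handle b = 0 or c = 0 in the odds ratio.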
NIMROD case studies
[Figure panels: cases A, B, C, D, E, F, G, H]
As the threshold increases, TS, ETS, HSS and KSS converge to zero (no skill) for all the cases
As the threshold increases, the odds ratio, ROC and Yule's Q separate the cases
Why do the scores behave differently?
Categorical scores versus threshold
Base rate versus threshold
Threshold increases, base rate decreases
Intense/extreme/rare events when the base rate $\alpha \to 0$
Base rate = probability of the event: $\alpha = \frac{a+c}{n} = P(X > u)$
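The dependence of the base rate on the threshold can be illustrated with a short sketch. The helper `base_rate` is a hypothetical name, and the gamma-distributed sample is synthetic stand-in data chosen only because it is positive and skewed; it is not the NIMROD analysis.

```python
import numpy as np

def base_rate(analysis, u):
    """Fraction of analysis values exceeding the threshold u, i.e. (a + c) / n."""
    return float(np.mean(np.asarray(analysis) > u))

# Synthetic stand-in for an analysis field (assumption, not real data).
rng = np.random.default_rng(0)
analysis = rng.gamma(shape=0.5, scale=2.0, size=10_000)

thresholds = [0.5, 1, 2, 4, 8]
rates = [base_rate(analysis, u) for u in thresholds]
for u, rate in zip(thresholds, rates):
    print(f"u = {u}: base rate = {rate:.4f}")
```

Because the exceedance sets are nested, the base rate can only decrease (or stay equal) as the threshold rises.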
Categorical scores versus base rate
Plots in logarithmic scale: the rate of convergence of the statistics plays a key role in discriminating the NIMROD cases
Asymptotic model: $a/n \sim \alpha^{\beta}$ as the base rate $\alpha \to 0$
$\beta$ = slope parameter
$\beta > 1$ for ROC curve regularity
$\beta = 2$ for random forecast: $\beta > 2$ no skill, $\beta < 2$ skill
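One way to read the slope parameter off a log-log plot is a straight-line fit of ln(a/n) against the log of the base rate. The sketch below assumes the power-law model holds over the fitted range; `estimate_slope` is a hypothetical helper, and the input values are generated from the model itself rather than taken from the case studies.

```python
import numpy as np

def estimate_slope(base_rates, hit_fractions):
    """Least-squares slope of ln(a/n) versus ln(base rate)."""
    log_alpha = np.log(np.asarray(base_rates, dtype=float))
    log_hits = np.log(np.asarray(hit_fractions, dtype=float))
    slope, _intercept = np.polyfit(log_alpha, log_hits, 1)
    return float(slope)

# Synthetic check: hit fractions built with a known slope of 1.5.
alphas = np.array([0.1, 0.05, 0.02, 0.01])
fitted = estimate_slope(alphas, alphas ** 1.5)
print(f"fitted slope = {fitted:.3f}")
```

A fitted slope below 2 would indicate skill under this model, and a slope above 2 would indicate a forecast worse than random.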
Behavior of the hits
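Asymptotic behaviour of the joint distribution (un-biased forecast)

            P(X > u)                     P(X < u)                          Total
P(Y > u)    $\alpha^{\beta}$             $\alpha - \alpha^{\beta}$         $\alpha$
P(Y < u)    $\alpha - \alpha^{\beta}$    $1 - 2\alpha + \alpha^{\beta}$    $1 - \alpha$
Total       $\alpha$                     $1 - \alpha$                      1

2 degrees of freedom: ($\alpha$, $\beta$), the base rate and the slope parameter, fully describe the joint distribution
Express joint probabilities and verification statistics as functions of $\alpha$, $\beta$
Analyze the asymptotic behaviour of the statistics when the base rate $\alpha \to 0$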
Scores asymptotic behaviour (no bias)
The odds ratio exhibits different asymptotic behaviours depending on whether a/n converges to zero faster than, at the same rate as, or slower than a/n for a random forecast
The magnitudes of TS, ETS, HSS and KSS converge to zero as the base rate $\alpha \to 0$ for skilful, random and worse-than-random forecasts alike ($\alpha$ acts as a diminishing factor)
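With base rate $\alpha$ and slope parameter $\beta$, as $\alpha \to 0$:
$\mathrm{TS} = \frac{\alpha^{\beta}}{2\alpha - \alpha^{\beta}} \sim \frac{\alpha^{\beta-1}}{2} \to 0$
$\mathrm{ETS} = \frac{\alpha^{\beta} - \alpha^{2}}{2\alpha - \alpha^{\beta} - \alpha^{2}} \sim \frac{\alpha^{\beta-1} - \alpha}{2} \to 0$
$\mathrm{KSS} = \mathrm{HSS} = \frac{\alpha^{\beta} - \alpha^{2}}{\alpha(1 - \alpha)} \sim \alpha^{\beta-1} - \alpha \to 0$
$\mathrm{OR} = \frac{\alpha^{\beta}(1 - 2\alpha + \alpha^{\beta})}{(\alpha - \alpha^{\beta})^{2}} \sim \alpha^{\beta-2} \to \begin{cases} 0 & \text{if } \beta > 2 \\ 1 & \text{if } \beta = 2 \\ \infty & \text{if } \beta < 2 \end{cases}$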
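The contrast between TS and the odds ratio can be checked numerically. The sketch below writes `alpha` for the base rate and `beta` for the slope parameter (names chosen here), and assumes the unbiased model in which the hit probability is `alpha ** beta` and both marginals equal `alpha`; the remaining cells follow from the marginals. The function names are hypothetical.

```python
def joint_unbiased(alpha, beta):
    """Cell probabilities (a, b, c, d)/n for base rate alpha and slope beta."""
    a = alpha ** beta            # hits
    b = alpha - a                # false alarms
    c = alpha - a                # misses
    d = 1.0 - 2.0 * alpha + a    # correct rejections
    return a, b, c, d

def threat_score(a, b, c, d):
    return a / (a + b + c)

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

# beta < 2: skilful, beta = 2: random, beta > 2: worse than random.
for beta in (1.5, 2.0, 2.5):
    for alpha in (1e-1, 1e-2, 1e-3, 1e-4):
        cells = joint_unbiased(alpha, beta)
        print(f"beta={beta} alpha={alpha:g} "
              f"TS={threat_score(*cells):.4g} OR={odds_ratio(*cells):.4g}")
```

For every value of `beta` the threat score shrinks as `alpha` decreases, while the odds ratio grows, stays at 1, or shrinks depending on whether `beta` is below, equal to, or above 2.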
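Asymptotic behaviour of the joint distribution (biased forecast)

            P(X > u)                      P(X < u)                                      Total
P(Y > u)    $B\alpha^{\beta}$             $B\alpha - B\alpha^{\beta}$                   $B\alpha$
P(Y < u)    $\alpha - B\alpha^{\beta}$    $1 - \alpha - B\alpha + B\alpha^{\beta}$      $1 - B\alpha$
Total       $\alpha$                      $1 - \alpha$                                  1

3 degrees of freedom: ($B$, $\alpha$, $\beta$), the bias, the base rate and the slope parameter, fully describe the joint distribution
Express joint probabilities and verification statistics as functions of $B$, $\alpha$, $\beta$
Analyze the asymptotic behaviour of the statistics when the base rate $\alpha \to 0$
Analyze the sensitivity of the statistics to the bias $B$ in the limit $\alpha \to 0$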
Scores asymptotic behaviour (bias)
The magnitudes of TS, ETS, HSS and KSS increase monotonically as B increases: these scores encourage over-forecasting!
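With bias $B$, base rate $\alpha$ and slope parameter $\beta$, as $\alpha \to 0$:
$\mathrm{TS} = \frac{B\alpha^{\beta-1}}{1 + B - B\alpha^{\beta-1}} \sim \frac{B}{1+B}\,\alpha^{\beta-1}$
$\mathrm{ETS} = \frac{B(\alpha^{\beta-1} - \alpha)}{1 + B - B\alpha^{\beta-1} - B\alpha} \sim \frac{B}{1+B}\,(\alpha^{\beta-1} - \alpha)$
$\mathrm{HSS} = \frac{2B(\alpha^{\beta-1} - \alpha)}{1 + B - 2B\alpha} \sim \frac{2B}{1+B}\,(\alpha^{\beta-1} - \alpha)$
$\mathrm{KSS} = \frac{B(\alpha^{\beta-1} - \alpha)}{1 - \alpha} \sim B\,(\alpha^{\beta-1} - \alpha)$
$\mathrm{OR} = \frac{B\alpha^{\beta}\,(1 - (1+B)\alpha + B\alpha^{\beta})}{B(\alpha - \alpha^{\beta})(\alpha - B\alpha^{\beta})} \sim \alpha^{\beta-2}$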
The odds ratio is not affected by the bias
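The different sensitivity to the bias can be illustrated numerically. The sketch below assumes a biased model in which the hit probability is `bias * alpha ** beta`, with `alpha` the base rate, `beta` the slope parameter and `bias` the frequency bias B; this form is an assumption of the sketch, chosen because it reduces to the random forecast value `bias * alpha ** 2` when `beta` is 2. The function names are hypothetical.

```python
def joint_biased(alpha, beta, bias):
    """Cell probabilities (a, b, c, d)/n; assumes a/n = bias * alpha**beta."""
    a = bias * alpha ** beta              # hits
    b = bias * alpha - a                  # false alarms: P(Y > u) = bias * alpha
    c = alpha - a                         # misses: P(X > u) = alpha
    d = 1.0 - alpha - bias * alpha + a    # correct rejections
    return a, b, c, d

def threat_score(a, b, c, d):
    return a / (a + b + c)

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

alpha, beta = 1e-6, 1.5   # a rare event with some skill (illustrative values)
for bias in (0.5, 1.0, 2.0):
    cells = joint_biased(alpha, beta, bias)
    print(f"B={bias}: TS={threat_score(*cells):.4g} OR={odds_ratio(*cells):.6g}")
```

Under this assumed model, quadrupling the bias roughly doubles the threat score, while the odds ratio changes by a fraction of a percent at this base rate.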
Extreme dependency score
Forecast and obs values are transformed into empirical cumulative probabilities → uniform marginal distributions, no bias
Cumulative probability: p = 1 - base rate
As the base rate $\alpha \to 0$, $p \to 1$: intense/extreme/rare events
[Figure: quadrants a', b', c', d' of the transformed forecast and obs values]
$\mathrm{EDS} = \lim_{p \to 1} \frac{2 \ln\left((a' + c')/n\right)}{\ln\left(a'/n\right)} - 1 = \frac{2}{\beta} - 1$
1. does not depend on the base rate
2. is not affected by the bias
3. depends only on the slope parameter $\beta$ (rate of convergence of the joint probability a/n to zero, as the events get rarer)
4. separates the case studies
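A minimal sketch of the score, assuming the form EDS = 2 ln((a + c)/n) / ln(a/n) - 1 evaluated at a finite cumulative probability p rather than in the limit. The function names are hypothetical, the rank transform breaks ties by position (an arbitrary choice), and the sketch assumes at least one hit, since a = 0 makes the logarithm diverge.

```python
import numpy as np

def eds_from_counts(a, c, n):
    """EDS from hits a, misses c and sample size n (requires a > 0)."""
    return 2.0 * np.log((a + c) / n) / np.log(a / n) - 1.0

def empirical_cdf(values):
    """Empirical cumulative probabilities via ranks 1..n (ties broken by order)."""
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values)) + 1
    return ranks / values.size

def eds(forecast, analysis, p):
    """EDS after transforming both fields to uniform marginals."""
    fy = empirical_cdf(forecast)
    fx = empirical_cdf(analysis)
    a = np.sum((fy > p) & (fx > p))    # transformed hits a'
    c = np.sum((fy <= p) & (fx > p))   # transformed misses c'
    return float(eds_from_counts(a, c, fx.size))

# Made-up counts: random-like hits (a/n equal to the squared base rate).
print(eds_from_counts(a=1, c=9, n=100))
# Made-up counts with a/n equal to the base rate to the power 1.5.
print(eds_from_counts(a=1000, c=9000, n=10**6))
# A forecast identical to the analysis.
field = np.arange(100.0)
print(eds(field, field, p=0.9))
```

Because both fields are reduced to ranks before thresholding, the forecast and obs events have the same frequency, which is why the bias does not enter the score.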
Extreme Dependency Score
EDS measures the extreme dependency between forecast and obs
Conclusions
1. The magnitudes of TS, ETS, HSS and KSS converge to zero for skilful, random and worse-than-random forecasts alike: not suitable to verify extremes
2. The odds ratio, Yule's Q and ROC are more suitable for detecting the skill in forecasting extreme/rare events
3. TS, ETS, HSS and KSS are overly sensitive to the bias in extreme event situations and encourage over-forecasting
4. The odds ratio, Yule's Q and ROC are not affected by the bias when verifying extreme/rare events
5. The Extreme Dependency Score provides a bias and base rate independent measure of extreme dependency: very suitable to verify extremes
THANK YOU !