Page 1

Training Seminar, 5 Nov 2008

Verification at JMA on Ensemble Prediction

Hitoshi Sato, Yukiko Naruse

Climate Prediction Division

Japan Meteorological Agency

Page 2

Contents

Part Ⅰ: One-month prediction

Purposes of verification

Verification of one-month prediction

Part Ⅱ: Seasonal prediction

Verification of seasonal prediction

Standardized Verification System (SVS) for Long-Range Forecasts (LRF)

Page 3

Verification at JMA on Ensemble Prediction

Part Ⅰ: One-month prediction

Purposes of verification

Verification of one-month prediction: methods and results

Page 4

Why Verify?

The purposes of verification are:

to monitor forecast quality: how accurate are the forecasts, and are they improving?

to guide forecasters and users: help forecasters understand model biases and skill, and help users interpret forecasts

to guide future developments: identify and correct model faults

Page 5

(Forecast products shown: 1-month forecast, 3-month forecast, Warm/Cold season forecast)

Page 6

Verification of operational 1-month forecasts

Error map for every forecast: ensemble mean forecast error maps, RMSE and Anomaly Correlation

Probabilistic forecast: reliability diagrams and ROC curves

Time sequence of ACC and RMSE; summary in each year

Page 7

Verification of 1-month ensemble mean forecast maps

Z500 over the Northern Hemisphere

Stream function (850 hPa, 200 hPa)

(Map panels: Observation, Forecast, Error)

Page 8

Verification of probabilistic forecasts

・Reliability diagrams and Brier skill scores

・ROC curves and area

Page 9

Reliability diagram

The reliability diagram plots the observed frequency (Y-axis) against the forecast probability (X-axis).

The diagonal line indicates perfect reliability (observed frequency equal to forecast probability for each category). Points below (above) the diagonal line indicate overforecasting (underforecasting).

(Diagram labels: perfect reliability, climatology, Brier Scores, forecast frequency)

Page 10

Steps for making reliability diagram

1. For each forecast probability category, count the number of observed occurrences.

2. Compute the observed relative frequency in each category k:
   obs. relative frequency_k = obs. occurrences_k / num. forecasts_k

3. Plot the observed relative frequency vs the forecast probability.

4. Plot the sample climatology (the no-resolution line):
   sample climatology = obs. occurrences / num. forecasts

5. Plot the forecast frequency. (A computational sketch of these steps follows at the end of this page.)

(Example diagram labels: Reliability diagram, Climatology (no resolution), Forecast frequency)
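As a concrete illustration of the steps above, here is a minimal Python/NumPy sketch (not part of the original slides; the array names `fcst_prob` and `obs` are hypothetical) that bins forecast probabilities and returns the quantities plotted in a reliability diagram.

```python
import numpy as np

def reliability_diagram_data(fcst_prob, obs, n_bins=10):
    """Bin forecast probabilities and compute reliability-diagram quantities.

    fcst_prob : 1-D array of forecast probabilities in [0, 1]
    obs       : 1-D array of observed occurrences (1 = event, 0 = no event)
    """
    fcst_prob = np.asarray(fcst_prob, dtype=float)
    obs = np.asarray(obs, dtype=float)

    # Step 1: assign each forecast to a probability category (bin) and count
    # the forecasts and observed occurrences in each bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(fcst_prob, edges[1:-1]), 0, n_bins - 1)
    num_forecasts = np.bincount(idx, minlength=n_bins)
    occurrences = np.bincount(idx, weights=obs, minlength=n_bins)

    # Step 2: observed relative frequency in each category (NaN for empty bins).
    obs_rel_freq = np.where(num_forecasts > 0,
                            occurrences / np.maximum(num_forecasts, 1),
                            np.nan)

    # Step 4: sample climatology (the no-resolution line).
    sample_clim = obs.mean()

    # Step 5: forecast frequency (share of forecasts falling in each bin).
    fcst_freq = num_forecasts / num_forecasts.sum()

    bin_centers = 0.5 * (edges[:-1] + edges[1:])
    return bin_centers, obs_rel_freq, fcst_freq, sample_clim
```

Plotting obs_rel_freq against bin_centers, together with the diagonal and a horizontal line at sample_clim, reproduces the diagram described on the previous page (step 3).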

Page 11

Brier (skill) score

The Brier Score (BS) measures the mean squared error of the probability forecasts:

$$\mathrm{BS} = \frac{1}{N}\sum_{i=1}^{N} (p_i - o_i)^2$$

where $p_i$ is the forecast probability, $o_i$ is the observed occurrence (0 or 1), and $N$ is the sample size. Range: 0 to 1. Perfect score: 0.

The Brier Skill Score (BSS) measures skill relative to a reference forecast (usually climatology). Since the perfect Brier score is 0,

$$\mathrm{BSS} = \frac{\mathrm{BS} - \mathrm{BS}_{\mathrm{reference}}}{\mathrm{BS}_{\mathrm{perfect}} - \mathrm{BS}_{\mathrm{reference}}} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{reference}}}$$

Range: minus infinity to 1. BSS = 0 indicates no skill compared to the reference forecast. Perfect score: 1.
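A minimal Python/NumPy sketch of the two formulas above (again using the hypothetical `fcst_prob`/`obs` arrays; the reference forecast defaults to the sample climatology, which is one common choice):

```python
import numpy as np

def brier_score(fcst_prob, obs):
    """BS = (1/N) * sum_i (p_i - o_i)**2."""
    p = np.asarray(fcst_prob, dtype=float)
    o = np.asarray(obs, dtype=float)
    return np.mean((p - o) ** 2)

def brier_skill_score(fcst_prob, obs, ref_prob=None):
    """BSS = 1 - BS / BS_reference; the reference defaults to sample climatology."""
    o = np.asarray(obs, dtype=float)
    if ref_prob is None:
        ref_prob = np.full_like(o, o.mean())  # constant climatological probability
    return 1.0 - brier_score(fcst_prob, obs) / brier_score(ref_prob, obs)
```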

Page 12

Decomposition of the Brier score

Murphy (1973) showed that the Brier score can be decomposed into three terms (for K probability classes and N samples). These terms show the sources of error:

$$\mathrm{BS} = \frac{1}{N}\sum_{k=1}^{K} n_k (p_k - \bar{o}_k)^2 \;-\; \frac{1}{N}\sum_{k=1}^{K} n_k (\bar{o}_k - \bar{o})^2 \;+\; \bar{o}(1 - \bar{o})$$

where $p_k$ is the forecast probability of class k, $n_k$ the number of forecasts in class k, $\bar{o}_k$ the observed frequency in class k, and $\bar{o}$ the climatological (sample) frequency of occurrence.

Reliability (brel): the first term, the mean squared difference between the forecast probability and the observed frequency. Perfect score: 0.

Resolution (bres): the second term, the mean squared difference between the observed frequency and the climatological frequency; it indicates the degree to which the forecast can separate different situations. A climatological forecast scores 0.

Uncertainty (bunc): the third term, which measures the variability of the observations.
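The decomposition can be checked numerically. The sketch below (not from the slides) groups forecasts by their distinct probability values, which play the role of the K classes, and returns the three terms; BS = reliability - resolution + uncertainty should then hold to rounding error.

```python
import numpy as np

def brier_decomposition(fcst_prob, obs):
    """Murphy (1973): BS = reliability - resolution + uncertainty."""
    p = np.asarray(fcst_prob, dtype=float)
    o = np.asarray(obs, dtype=float)
    n_total = p.size
    o_bar = o.mean()                      # climatological (sample) frequency

    reliability = 0.0
    resolution = 0.0
    for pk in np.unique(p):               # loop over the K probability classes
        in_class = (p == pk)
        n_k = in_class.sum()
        o_bar_k = o[in_class].mean()      # observed frequency in class k
        reliability += n_k * (pk - o_bar_k) ** 2
        resolution += n_k * (o_bar_k - o_bar) ** 2

    reliability /= n_total
    resolution /= n_total
    uncertainty = o_bar * (1.0 - o_bar)
    return reliability, resolution, uncertainty

# Consistency check against the direct Brier score:
# rel, res, unc = brier_decomposition(fcst_prob, obs)
# assert np.isclose(rel - res + unc,
#                   np.mean((np.asarray(fcst_prob) - np.asarray(obs)) ** 2))
```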

Page 13

Brier skill score

A skill score measures the skill of a forecast relative to a reference forecast:

$$\text{Skill Score} = \frac{\mathrm{Score}_{\mathrm{forecast}} - \mathrm{Score}_{\mathrm{reference}}}{\mathrm{Score}_{\mathrm{perfect}} - \mathrm{Score}_{\mathrm{reference}}}$$

Brier skill score = the relative skill of the probabilistic forecast with respect to climatology (the Brier score of the climatological forecast equals the uncertainty term, $\mathrm{BS}_{\mathrm{clim}} = b_{\mathrm{unc}}$):

$$\mathrm{BSS} = \frac{\mathrm{BS} - \mathrm{BS}_{\mathrm{clim}}}{0 - \mathrm{BS}_{\mathrm{clim}}} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{clim}}}$$

Range: minus infinity to 1. Perfect score: 1. BSS = 0 indicates no skill compared to the climatology; BSS > 0 means better than climatology.

Reliability skill score:

$$B_{\mathrm{rel}} = 1 - \frac{b_{\mathrm{rel}}}{\mathrm{BS}_{\mathrm{clim}}}$$

Perfect score: 1.

Resolution skill score:

$$B_{\mathrm{res}} = \frac{b_{\mathrm{res}}}{\mathrm{BS}_{\mathrm{clim}}}$$

Perfect score: 1.

The larger these skill scores are, the better. (The scores are displayed as Score × 100.)
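Continuing the numerical sketch, the three skill scores on this page can be computed directly from the decomposition terms returned earlier (my function name, not a JMA routine):

```python
def brier_skill_components(reliability, resolution, uncertainty):
    """BSS, reliability skill score and resolution skill score.

    Uses BS_clim = uncertainty and BS = reliability - resolution + uncertainty.
    """
    bs_clim = uncertainty
    bs = reliability - resolution + uncertainty
    bss = 1.0 - bs / bs_clim              # = (resolution - reliability) / bs_clim
    b_rel = 1.0 - reliability / bs_clim   # reliability skill score
    b_res = resolution / bs_clim          # resolution skill score
    return bss, b_rel, b_res
```

With these definitions, BSS = B_rel + B_res - 1, so the two component scores show how reliability and resolution each contribute to the overall skill.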

Page 14

Interpretation of Reliability diagram and BSS

Event: Z500 anomaly > 0, Northern Hemisphere, spring 2008 (28 Feb 2008 to 29 May 2008).

(Reliability diagrams for the 1st week forecast (day 2-8) and for the 3rd and 4th week forecast (day 16-29); the annotations mark overforecasting and underforecasting, with BSS < 0 inferior to climatology and BSS > 0 better than climatology.)

Page 15

Relative Operating Characteristic (ROC)

The ROC is created by plotting the hit rate (Y-axis) against the false alarm rate (X-axis), using increasing probability thresholds to make the yes/no decision.

The area under the ROC curve (the ROC area) is frequently used as a score. Perfect: ROC area = 1. No skill: ROC area = 0.5.

Page 16

Steps for making ROC diagram

1. For each forecast probability category, count the number of hits, misses, false alarms, and correct non-events.

2. Compute the hit rate and false alarm rate in each category k:
   hit rate_k = hits_k / (hits_k + misses_k)
   false alarm rate_k = false alarms_k / (false alarms_k + correct non-events_k)

3. Plot the hit rate vs the false alarm rate.

4. The ROC area is the integrated area under the ROC curve. (A computational sketch follows the contingency table below.)

Contingency table:

                   Observed yes    Observed no
   Forecast yes    hits            false alarms
   Forecast no     misses          correct non-events
   Total           observed yes    observed no
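As with the reliability diagram, these steps can be sketched in a few lines of Python/NumPy (not from the slides; `fcst_prob` and `obs` are the same hypothetical arrays as before). Thresholds are swept from above 1 down to 0 so that the curve runs from (0, 0) to (1, 1), and the ROC area is obtained with the trapezoidal rule.

```python
import numpy as np

def roc_curve(fcst_prob, obs, thresholds=None):
    """Hit rate and false alarm rate for a set of probability thresholds.

    Assumes the sample contains both events and non-events.
    """
    p = np.asarray(fcst_prob, dtype=float)
    o = np.asarray(obs, dtype=bool)
    if thresholds is None:
        # A threshold above 1 gives the (0, 0) end point; 0 gives (1, 1).
        thresholds = np.concatenate(([1.1], np.linspace(1.0, 0.0, 11)))

    hit_rate, false_alarm_rate = [], []
    for t in thresholds:
        forecast_yes = p >= t                       # yes/no decision at this threshold
        hits = np.sum(forecast_yes & o)
        misses = np.sum(~forecast_yes & o)
        false_alarms = np.sum(forecast_yes & ~o)
        correct_non = np.sum(~forecast_yes & ~o)
        hit_rate.append(hits / (hits + misses))
        false_alarm_rate.append(false_alarms / (false_alarms + correct_non))
    return np.array(false_alarm_rate), np.array(hit_rate)

def roc_area(false_alarm_rate, hit_rate):
    """Area under the ROC curve (trapezoidal rule); 1 = perfect, 0.5 = no skill."""
    x, y = false_alarm_rate, hit_rate
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * (x[1:] - x[:-1])))
```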

Page 17

Interpretation of ROC curves

The ROC is not sensitive to bias in the forecast. A biased forecast may still have good resolution and produce a good ROC curve, which means that it may be possible to improve the forecast through calibration. The ROC can therefore be considered a measure of potential usefulness. (A minimal calibration sketch follows at the end of this page.)

The reliability diagram, on the other hand, does measure bias, so it is a good partner to the ROC.

Event: Z500 anomaly > 0, Northern Hemisphere, spring 2008.

(ROC curves for the 1st week forecast, showing high resolution and thus high potential skill, and for the 3rd and 4th week forecast, showing low resolution and low potential skill; the top-left corner of the diagram corresponds to perfect performance.)
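To illustrate the calibration point made above, here is a minimal sketch (my own illustration, not a JMA procedure): each forecast probability is replaced by the observed relative frequency of its probability bin, estimated from an independent training sample. This reduces reliability errors while largely preserving resolution, and hence the ROC.

```python
import numpy as np

def calibrate_probabilities(train_prob, train_obs, new_prob, n_bins=10):
    """Reliability-based recalibration of probability forecasts."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    obs = np.asarray(train_obs, dtype=float)

    # Observed relative frequency in each training bin.
    tr_idx = np.clip(np.digitize(train_prob, edges[1:-1]), 0, n_bins - 1)
    counts = np.bincount(tr_idx, minlength=n_bins)
    events = np.bincount(tr_idx, weights=obs, minlength=n_bins)
    clim = obs.mean()                                  # fallback for empty bins
    bin_freq = np.where(counts > 0, events / np.maximum(counts, 1), clim)

    # Map new forecasts onto the calibrated values of their bins.
    new_idx = np.clip(np.digitize(new_prob, edges[1:-1]), 0, n_bins - 1)
    return bin_freq[new_idx]
```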

Page 18

Anomaly Correlation and RMSE

Time sequence of Anomaly Correlation and RMSE in each season and in each year
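The slides do not write out the formulas, but a minimal sketch of the standard centred anomaly correlation and RMSE is given below (my own sketch; the array names `fcst`, `analysis`, `clim` and the optional area `weights`, e.g. cos(latitude), are assumptions, and the operational domains and weighting used at JMA are not shown here).

```python
import numpy as np

def anomaly_correlation(fcst, analysis, clim, weights=None):
    """Centred anomaly correlation between forecast and analysed anomalies."""
    f = np.asarray(fcst, dtype=float) - np.asarray(clim, dtype=float)      # forecast anomaly
    a = np.asarray(analysis, dtype=float) - np.asarray(clim, dtype=float)  # analysed anomaly
    w = np.ones_like(f) if weights is None else np.asarray(weights, dtype=float)

    fm = np.average(f, weights=w)
    am = np.average(a, weights=w)
    cov = np.average((f - fm) * (a - am), weights=w)
    var_f = np.average((f - fm) ** 2, weights=w)
    var_a = np.average((a - am) ** 2, weights=w)
    return cov / np.sqrt(var_f * var_a)

def rmse(fcst, analysis, weights=None):
    """Root-mean-square error of the forecast against the analysis."""
    d = np.asarray(fcst, dtype=float) - np.asarray(analysis, dtype=float)
    w = np.ones_like(d) if weights is None else np.asarray(weights, dtype=float)
    return float(np.sqrt(np.average(d ** 2, weights=w)))
```

Computing these scores for each forecast and averaging them by season or by year gives the time sequences shown on the following pages.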

Page 19

Anomaly Correlation of T850 in summer 2008

Seasonal mean scores

Page 20

Anomaly correlation of Z500 over the Northern Hemisphere

Page 21

Anomaly correlation of Z500 over the Northern Hemisphere (1996-2008), 28-day mean, running mean of 52 forecasts (1 year).

(El Niño and La Niña periods are indicated on the time series.)

Page 22

Summary of Part Ⅰ: One-month prediction

Verification of operational prediction (published on the TCC website):

Forecast error map (visual verification)

Reliability diagram, BSS, ROC

ACC, RMSE

Verification of hindcast (internal use only):

Bias, ACC, RMSE, forecast maps, etc.

Improvement of forecast skill

Page 23

References

Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595-600.

http://www.eumetcal.org.uk/eumetcal/verification/www/english/courses/msgcrs/index.htm

http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html

http://www.ecmwf.int/newsevents/meetings/workshops/2007/jwgv/index.html