Verification methods - towards a user-oriented verification (WG5)
Jan 01, 2016
COSMO GM Cracow 19.09.2008
The verification process in principle
Verification
Observation
Forecast
Data control
Enduser
Modeller
Analysis
Data control
What are the results of verification?
RMSE
S1
ANOC
ETS
FBI
FSS
BIAS
BSS
ROC
ISS
Attributes of a forecast related to observations (I)
Bias - the correspondence between the mean forecast and mean observation.
Association - the strength of the linear relationship between the forecasts and observations (for example, the correlation coefficient measures this linear relationship)
Accuracy - the level of agreement between the forecast and the truth (as represented by observations). The difference between the forecast and the observation is the error. The lower the errors, the greater the accuracy.
Skill - the relative accuracy of the forecast over some reference forecast. The reference forecast is generally an unskilled forecast such as random chance, persistence (defined as the most recent set of observations, "persistence" implies no change in condition), or climatology. Skill refers to the increase in accuracy due purely to the "smarts" of the forecast system. Weather forecasts may be more accurate simply because the weather is easier to forecast -- skill takes this into account.
http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html#What%20makes%20a%20forecast%20good
referring to: A.H. Murphy, Weather and Forecasting, 8 (1993), Iss. 2, 281-293
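The four attributes above can be illustrated with a minimal sketch. It assumes paired forecast and observation samples, uses RMSE as the accuracy measure and an MSE-based skill score against a climatological reference; the function name and the sample numbers are made up for illustration and are not taken from the talk.

```python
import numpy as np


def basic_attributes(forecast, observed, reference):
    """Return bias, association, accuracy and skill for paired samples."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    r = np.asarray(reference, dtype=float)

    mse = np.mean((f - o) ** 2)        # mean squared error of the forecast
    mse_ref = np.mean((r - o) ** 2)    # mean squared error of the reference
    return {
        "bias": f.mean() - o.mean(),             # mean forecast minus mean observation
        "correlation": np.corrcoef(f, o)[0, 1],  # association (linear relationship)
        "rmse": np.sqrt(mse),                    # accuracy
        "skill": 1.0 - mse / mse_ref,            # accuracy relative to the reference
    }


# Hypothetical sample: the forecast is always 1 unit too high.
observed = np.array([2.0, 4.0, 6.0, 8.0])
forecast = observed + 1.0
climatology = np.full_like(observed, observed.mean())  # reference forecast

scores = basic_attributes(forecast, observed, climatology)
print(scores)
```

In this sample the forecast is perfectly associated with the observations but biased, which shows that the attributes measure different things.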
Attributes of a forecast related to observations (II)
Reliability - the average agreement between the forecast values and the observed values. If all forecasts are considered together, then the overall reliability is the same as the bias. If the forecasts are stratified into different ranges or categories, then the reliability is the same as the conditional bias.
Resolution - the ability of the forecast to sort or resolve the set of events into subsets with different frequency distributions. This means that the distribution of outcomes when "A" was forecast is different from the distribution of outcomes when "B" is forecast. Even if the forecasts are wrong, the forecast system has resolution if it can successfully separate one type of outcome from another.
Attributes of a forecast related to observations (III)
Sharpness - the tendency of the forecast to predict extreme values. To use a counter-example, a forecast of "climatology" has no sharpness. Sharpness is a property of the forecast only, and like resolution, a forecast can have this attribute even if it's wrong (in this case it would have poor reliability).
Discrimination - ability of the forecast to discriminate among observations, that is, to have a higher prediction frequency for an outcome whenever that outcome occurs.
Uncertainty - the variability of the observations. The greater the uncertainty, the more difficult the forecast will tend to be.
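For probability forecasts of a binary event, reliability, resolution and uncertainty appear together in the standard decomposition of the Brier score (Brier score = reliability - resolution + uncertainty). The sketch below assumes that the forecast probabilities take only a few discrete values, so that grouping by value makes the decomposition exact; the function name and the sample are hypothetical.

```python
import numpy as np


def brier_decomposition(prob, event):
    """Return (brier, reliability, resolution, uncertainty)."""
    p = np.asarray(prob, dtype=float)
    o = np.asarray(event, dtype=float)  # 1 if the event occurred, else 0
    n = p.size
    base_rate = o.mean()

    reliability = 0.0
    resolution = 0.0
    for value in np.unique(p):
        mask = p == value
        n_k = mask.sum()
        obs_freq = o[mask].mean()  # observed frequency when `value` was forecast
        reliability += n_k * (value - obs_freq) ** 2
        resolution += n_k * (obs_freq - base_rate) ** 2
    reliability /= n
    resolution /= n

    uncertainty = base_rate * (1.0 - base_rate)  # depends on observations only
    brier = np.mean((p - o) ** 2)
    return brier, reliability, resolution, uncertainty


# Hypothetical sample: two forecast categories, 10 % and 90 %.
prob = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
event = [0, 0, 0, 1, 1, 1, 1, 0]
brier, rel, res, unc = brier_decomposition(prob, event)
print(brier, rel, res, unc)
```

A small reliability term and a large resolution term both lower the Brier score, while the uncertainty term is outside the forecaster's control.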
Current focal points of verification
Spatial verification methods
object oriented methods
"fuzzy" techniques
Verification of probabilistic and ensemble forecasts
ensemble pdf
generic probability forecasts
probability of an event
Verification of extreme (rare) events
high-impact events
Operational verification
evaluation and monitoring
User-oriented verification strategies
verification tailored to each user
Forecast value
cost-loss analysis, development of a universal score
Verification packages
VERSUS, MET
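As an illustration of the cost-loss analysis listed under forecast value, the following sketch computes the relative economic value of a yes/no forecast in the usual static cost-loss model: a user pays a cost C to protect and suffers a loss L if the event occurs unprotected. It assumes 0 < C < L and a base rate strictly between 0 and 1; the function name and the contingency counts are invented for the example.

```python
def relative_value(hits, false_alarms, misses, correct_negatives, cost, loss):
    """Relative economic value: 1 for a perfect forecast, 0 for climatology."""
    n = hits + false_alarms + misses + correct_negatives
    base_rate = (hits + misses) / n

    # Mean expense when acting on the forecast: protect on every "yes"
    # forecast, suffer the loss on every missed event.
    e_forecast = ((hits + false_alarms) * cost + misses * loss) / n
    # Best fixed strategy without a forecast: always or never protect.
    e_climate = min(cost, base_rate * loss)
    # Perfect forecast: protect exactly when the event occurs.
    e_perfect = base_rate * cost

    return (e_climate - e_forecast) / (e_climate - e_perfect)


# Hypothetical contingency table and cost/loss values.
value = relative_value(hits=30, false_alarms=20, misses=10,
                       correct_negatives=140, cost=1.0, loss=5.0)
print(value)
```

The same contingency table gives a different value for each cost/loss ratio, which is one reason why a single score cannot serve every user.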
Some problems concerning significance
Some scores have a statistical form and therefore appear amenable to significance tests.
Traditional significance tests require a defined number of degrees of freedom.
In most cases observations, forecasts and errors are correlated.
Therefore, the degrees of freedom cannot be obtained easily.
One way out: resampling and bootstrapping
What about statistical significance and meteorological significance?
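One possible form of the resampling approach is a moving-block bootstrap, which resamples blocks of consecutive cases so that part of the serial correlation is preserved. The sketch below estimates a 95 % interval for the RMSE difference between two models; the block length, the number of resamples and the synthetic error series are assumptions chosen for illustration only.

```python
import numpy as np


def rmse(err):
    return np.sqrt(np.mean(err ** 2))


def block_bootstrap_rmse_diff(err_a, err_b, block_len=5, n_boot=2000, seed=0):
    """95 % bootstrap interval for RMSE(a) - RMSE(b) using moving blocks."""
    err_a = np.asarray(err_a, dtype=float)
    err_b = np.asarray(err_b, dtype=float)
    n = err_a.size
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)

    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Draw block start positions, expand them to indices, trim to length n.
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]
        # The same indices are used for both models to keep the pairing.
        diffs[i] = rmse(err_a[idx]) - rmse(err_b[idx])

    low, high = np.percentile(diffs, [2.5, 97.5])
    return low, high


# Synthetic, autocorrelated errors; model a has twice the error of model b.
rng = np.random.default_rng(42)
noise = rng.normal(size=200)
err_b = np.empty(200)
err_b[0] = noise[0]
for t in range(1, 200):
    err_b[t] = 0.7 * err_b[t - 1] + noise[t]
err_a = 2.0 * err_b

low, high = block_bootstrap_rmse_diff(err_a, err_b)
print(low, high)
```

If the interval excludes zero, the difference is statistically significant under the assumptions of the resampling scheme; whether it is meteorologically significant remains a separate question.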
User-oriented verification strategies: What are the interests of the users?
Administrator:
Did the forecasts yield better results during the last period of interest, and in general?
Which focal points of model development are currently of interest?
...
Modeller:
What type of errors occur in general?
What are the reasons for such errors?
How should the model be modified in order to avoid or to reduce these errors?
If one has found the reason(s) for the error(s) and one has reduced the effect(s), is the forecast then improved?
...
External users and forecasters:
How can I interpret the forecasts?
What is the benefit of forecasts for me?
...
User-oriented verification step by step
1. Diagnosis of errors - normally done by examining the BIAS or the FBI
The problem - mean values of observed and forecasted T2m over Germany during Summer 2005 and Winter 2005/2006 (RMSE/STDV)
Panels: Summer 2005, Winter 2005/2006
User-oriented verification step by step
1. Diagnosis of errors - normally done by examining the BIAS or the FBI
The problem - mean values of observed and forecasted gusts over Germany during Spring 2007 (RMSE/STDV)
Panels: COSMO-DE, COSMO-EU
User-oriented verification step by step
1. Diagnosis of errors - normally done by examining the BIAS or the FBI
Examples for four scores in four stylized situations:
An example of conditional verification
Forecasted and observed values of surface level pressure over the region of Germany during DJF 2005/2006 (RMSE and STDV)
Forecasted and observed values of surface level pressure over the region of Germany during DJF 2005/2006, observed and forecasted values lower than 1020 hPa (RMSE and STDV)
Forecasted and observed values of surface level pressure over the region of Germany during DJF 2005/2006, observed and forecasted values higher than 1020 hPa (RMSE and STDV)
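The stratification used in this example can be sketched as follows. The code assumes that STDV denotes the standard deviation of the forecast error and that a case belongs to a stratum only if both the observed and the forecasted value fall on the same side of the threshold; the function name and the pressure values are invented for illustration.

```python
import numpy as np


def conditional_scores(forecast, observed, threshold=1020.0):
    """RMSE and error standard deviation for all cases and for two strata."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    err = f - o

    strata = {
        "all": np.ones(err.shape, dtype=bool),
        "below": (f < threshold) & (o < threshold),
        "above": (f > threshold) & (o > threshold),
    }
    result = {}
    for name, mask in strata.items():
        e = err[mask]
        if e.size == 0:  # empty stratum: no score can be computed
            result[name] = {"n": 0, "rmse": float("nan"), "stdv": float("nan")}
            continue
        result[name] = {
            "n": int(e.size),
            "rmse": float(np.sqrt(np.mean(e ** 2))),
            "stdv": float(e.std()),
        }
    return result


# Hypothetical pressure values in hPa.
observed = [1010.0, 1015.0, 1025.0, 1030.0]
forecast = [1012.0, 1013.0, 1026.0, 1027.0]
scores = conditional_scores(forecast, observed)
print(scores)
```

Comparing the strata shows whether the error depends on the weather regime, which an overall score can hide.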
User-oriented verification step by step
2. Some changes made by modellers
New diagnosis of gusts
to reduce the overestimation of gusts:
use wind at 10 m instead of interpolated wind from 30 m to compute gusts
New diagnosis of temperature 2m
to reduce the strong negative bias during winter and get a more realistic diurnal cycle:
set z0 to 2 cm over land
New SSO scheme (currently under examination)
User-oriented verification step by step
3. The effects
New diagnosis of gusts
The overestimation of gusts is now reduced.
But: Extreme gusts are underestimated.
New diagnosis of temperature 2m
The systematic negative bias during winter is now reduced.
The diurnal cycle seems to be more realistic.
But: A positive bias occurs during night and summer.
New SSO scheme (currently under examination)
User-oriented verification step by step
4. The proof of the effects: New diagnosis of gusts
Gusts > 12 m/s
Gust verification of the experiments
Exp. 6278 (COSMO-EU) Operational run Exp. 6301 (COSMO-DE)
ETS
FBI
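ETS and FBI for a threshold such as 12 m/s are derived from the 2x2 contingency table of forecasted and observed exceedances. The sketch below uses the standard definitions; the function name and the gust values are hypothetical and are not the experiment data shown on the slide.

```python
import numpy as np


def fbi_and_ets(forecast, observed, threshold):
    """Frequency bias index and equitable threat score for exceedances."""
    f_yes = np.asarray(forecast, dtype=float) > threshold
    o_yes = np.asarray(observed, dtype=float) > threshold

    hits = int(np.sum(f_yes & o_yes))
    false_alarms = int(np.sum(f_yes & ~o_yes))
    misses = int(np.sum(~f_yes & o_yes))
    n = f_yes.size

    # FBI: number of forecasted events divided by number of observed events.
    fbi = (hits + false_alarms) / (hits + misses)
    # Hits expected by chance for the given marginal frequencies.
    hits_random = (hits + false_alarms) * (hits + misses) / n
    ets = (hits - hits_random) / (hits + false_alarms + misses - hits_random)
    return fbi, ets


# Hypothetical gust values in m/s.
observed = [10, 14, 13, 8, 20, 11, 9, 15]
forecast = [13, 15, 11, 9, 18, 14, 7, 10]
fbi, ets = fbi_and_ets(forecast, observed, threshold=12.0)
print(fbi, ets)
```

In this sample the event is forecasted as often as it is observed (FBI = 1), yet the number of hits is no better than chance (ETS = 0), so the two scores need to be read together.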
User-oriented verification step by step
4. The proof of the effects: New diagnosis of temperature 2m
Comparison of COSMO-EU with experiment 6343, 00 UTC, April/June 2007: RMSE, COSMO-EU area
A basic law during model development:
There are no gains without any losses!
(maybe with some exceptions)
Therefore, one has to look both at benefits and risks.
One of the known exceptions: The effect of an SSO scheme in COSMO-EU
Panels: new SSO scheme, reference experiment
User-oriented verification step by step
5. The risk: New diagnosis of gusts
Gusts > 25 m/s
Gust verification of the experiments
Exp. 6278 (COSMO-EU) Operational run Exp. 6301 (COSMO-DE)
ETS
FBI
User-oriented verification step by step
5. The risk: New diagnosis of gusts
Wind gusts - old vs. new diagnosis, for 16.01.-17.03.08 over Switzerland
Panels: old, new
User-oriented verification step by step
5. The risk: New diagnosis of temperature 2m
Comparison of COSMO-EU with experiment 6343, 00 UTC, April/June 2007: BIAS, COSMO-EU area
User-oriented verification step by step
6. The operational effect: New diagnosis of temperature 2m
Impact on the mean diurnal cycle for stations over Switzerland
Panels: Summer 2007, Summer 2008
The (well known) errors of
- too strong temperature increase in the morning
- maxima reached ~ 1.5-2 h too early
are removed with the new 2m temperature diagnostics (introduced operationally 12.03.2008 @ DWD and 09.06.2008 @ MeteoSwiss)
User-oriented verification step by step
6. The operational effect: New diagnosis of temperature 2m for stations over Germany
User-oriented verification step by step
7. The effect for administrators expressed in "The Score COSI"
asr
Prognostic cloud ice
Prognostic precipitation
LME V 3.19 V 3.22 T 2m
User-oriented verification step by step
8. Questions
Are there any questions to WG5?
Question from WG5: What do users require of the verification process in order to make model development as effective as possible?