
WATER RESOURCES RESEARCH, VOL. 35, NO. 9, PAGES 2739-2750, SEPTEMBER, 1999

Bayesian theory of probabilistic forecasting via deterministic hydrologic model

Roman Krzysztofowicz

Department of Systems Engineering and Division of Statistics, University of Virginia, Charlottesville

Abstract. Rational decision making (for flood warning, navigation, or reservoir systems) requires that the total uncertainty about a hydrologic predictand (such as river stage, discharge, or runoff volume) be quantified in terms of a probability distribution, conditional on all available information and knowledge. Hydrologic knowledge is typically embodied in a deterministic catchment model. Fundamentals are presented of a Bayesian forecasting system (BFS) for producing a probabilistic forecast of a hydrologic predictand via any deterministic catchment model. The BFS decomposes the total uncertainty into input uncertainty and hydrologic uncertainty, which are quantified independently and then integrated into a predictive (Bayes) distribution. This distribution results from a revision of a prior (climatic) distribution, is well calibrated, and has a nonnegative ex ante economic value. The BFS is compared with Monte Carlo simulation and "ensemble forecasting" technique, none of which can alone produce a probabilistic forecast that meets requirements of rational decision making, but each can serve as a component of the BFS.

1. Introduction

1.1. Impetus for Probabilistic Forecasting

Hydrologic models used operationally for forecasting hydrographs of stages or discharges, or time series of runoff volumes, are typically deterministic and complex. They are built of numerous submodels, each mimicking some physical process such as soil moisture accounting, rainfall-runoff transformation, channel routing, or stage-discharge relation. Forecasts produced via such models are typically in the form of time series of estimates. These estimates are not error-free. From the viewpoint of a rational decision maker who receives such estimates and must decide upon a flood warning, operation of a waterway or a barge, or releases from a reservoir, there remains uncertainty about the actual realization of the time series being forecast. Bayesian principles of rationality [DeGroot, 1970; Berger, 1985; Bernardo and Smith, 1994] dictate that (1) this uncertainty should be quantified in terms of a probability distribution and (2) decisions should be made on the basis of this probability distribution rather than on the face value of estimates [Krzysztofowicz, 1983; Murphy, 1991].

The primary source of uncertainty in short-term river forecasts is the future time series of precipitation amounts needed as an input to a hydrologic model. It is safe to say that quantification of uncertainty about this input is prerequisite for probabilistic river forecasting. In this regard, the last decade has brought about significant advances: the confirmation of a steady improvement of 24-hour quantitative precipitation forecasts produced operationally by the National Meteorological Center [Olson et al., 1995], the development and testing (since 1990 to present day) of a prototype system for probabilistic quantitative precipitation forecasting [Krzysztofowicz et al., 1993; Krzysztofowicz, 1998], and the formulation of a strategic plan by the National Weather Service for implementing the probabilistic quantitative precipitation forecasting system nationwide [Graziano, 1998]. In its ultimate operational version the system will produce fields with elements of probability distributions of spatially averaged precipitation amounts for 6-hourly subperiods up to 72 hours into the future. These advances provide an impetus for research into theories and methods of probabilistic river forecasting based on probabilistic quantitative precipitation forecasts.

Copyright 1999 by the American Geophysical Union.

Paper number 1999WR900099. 0043-1397/99/1999WR900099$09.00

1.2. Toward a General Theory

This article lays down a theory for probabilistic forecasting of hydrologic variates. The theory is Bayesian and has five attributes. (1) It works in conjunction with any deterministic hydrologic model without imposing on that model any structural (e.g., linearizing) or distributional (e.g., normalizing) assumptions. (2) It provides a methodological framework for developing a variety of probabilistic forecasting systems suited to different purposes. (3) It outputs a predictive distribution of the variate being forecast (the predictand). The predictive distribution, which results from the Bayesian revision of a prior (climatic) distribution, quantifies the total uncertainty that remains about the predictand, given all information input to the hydrologic model at the forecast time. Consequently, for any event of interest the predictive distribution yields a probability that admits the subjective (Bayesian) interpretation as a degree of credence about the occurrence of the event. (4) It guarantees a self-calibration property: Probabilistic forecasts preserve, in the expected value sense (or in the long run), the prior (climatic) distribution of the predictand. (5) It guarantees a coherence property, which is indispensable for rational decision making under uncertainty: The ex ante economic value of the probabilistic forecast can never be negative (relative to the value of a prior distribution).

Aside from the potential practical utility, the theory may serve as an intellectual tool. For example, it reveals certain desired (normative) properties of any probabilistic forecasting system. It also enables one to identify a proper (and limited) role of Monte Carlo simulation and the so-called "ensemble forecasting" technique.

Section 2 develops the theory whose building blocks are a principle of decomposition of the total uncertainty, the input uncertainty processor, the hydrologic uncertainty processor, and the integrator; the section concludes with an analysis of key properties of the Bayesian forecasting system (BFS). Section 3 develops the understanding of the Bayesian integrator of uncertainties; the vehicle for this development is a univariate forecasting problem which admits a parametric, closed-form solution. Section 4 summarizes properties of the Bayesian framework for probabilistic forecasting and pinpoints the role of Monte Carlo and ensemble simulation techniques.

Figure 1. Structure of the Bayesian forecasting system. [Block diagram. Blocks shown: probabilistic forecast of input; input uncertainty processor; deterministic hydrologic model; hydrologic uncertainty processor; integrator; probabilistic forecast of predictand.]

2. Bayesian Theory

2.1. Decomposition of Uncertainty

The sources of uncertainty associated with a river forecast can be categorized as operational, input, and hydrologic. Operational uncertainty is caused by erroneous or missing data, human processing errors, unpredictable interventions (e.g., changes in reservoir releases not communicated by a dam operator to the forecaster), unpredictable obstacles within a river channel (e.g., ice jams), and the like. These sources of uncertainty are exterior to the forecasting theory. Therefore the term "total uncertainty" used henceforth will not encompass operational uncertainty.

To decompose the total uncertainty, the first step is to screen all inputs to the hydrologic model and identify those whose uncertainty has a significant impact on the model outputs, varies from one forecast to the next, and can be quantified at the forecast time. Such inputs will be treated as random. The remaining inputs will be treated as deterministic. In effect, the total uncertainty will be decomposed into two sources: (1) input uncertainty associated with random inputs to the model and (2) hydrologic uncertainty arising from all sources beyond those classified as random inputs; in general, these sources include model, parameter, estimation, and measurement errors.

For example, in short-term forecasting of floods the principal source of uncertainty is the unknown future rainfall, which is treated as the random input. Future potential evapotranspiration is also unknown, but it is more predictable and of lesser significance than rainfall; it is therefore treated as a deterministic input. The uncertainty due to an error in the potential evapotranspiration estimate is aggregated with all other uncertainties (except rainfall uncertainty), which are collectively referred to as hydrologic uncertainty. These other uncertainties may arise from imperfections of the model: its structure and relations (e.g., soil moisture accounting, rainfall-runoff transformation, channel routing, stage-discharge relation), incorrect values of model parameters (e.g., a recession coefficient), incorrect estimates of deterministic inputs (e.g., past mean areal rainfall), errors in measurements of physical quantities (e.g., precipitation, temperature, river stage), and so on.

The rationale underlying this decomposition of uncertainty is that for the purpose of real-time forecasting it is infeasible, and perhaps unnecessary, to explicitly quantify every single source of uncertainty. There are usually a few sources (such as future rainfall in flood forecasting and future temperature in snowmelt runoff forecasting) whose contribution to the total forecast uncertainty dominates the contribution of any other source. Therefore a plausible compromise between the exactness and the practicality of a theory can be reached by limiting the explicit quantification to (1) the dominant uncertainties (input uncertainty) and (2) all other uncertainties in the aggregate (hydrologic uncertainty). While theoretically suboptimal (with respect to maximizing the informativeness of a probabilistic forecast), this compromise may be practically near-optimal if the hydrologic uncertainty is quantified via a Bayesian processor with suitably chosen state variables.

The decomposition of uncertainty leads to a forecasting system whose structure is depicted in Figure 1. Conceptually, one may think of two statistical processors being attached to a hydrologic model. One processor maps the input uncertainty into the output uncertainty under the hypothesis that there is no hydrologic uncertainty, and another processor quantifies the hydrologic uncertainty under the hypothesis that there is no input uncertainty. Then the two uncertainties are optimally integrated to produce a probabilistic forecast.

2.2. Bayesian Predictive Inference

The theory was outlined in concept by Krzysztofowicz [1993]. It rests on principles of Bayesian predictive inference. The inference scheme is depicted in Figure 2. Although the theory is applicable to any forecasting problem, it is interpreted herein in the context of forecasting river stages from precipitation amounts.

Figure 2. Bayesian predictive inference leading to a probabilistic forecast via a deterministic hydrologic model. [Panels: (a) Characterization of Uncertainties: input uncertainty $\eta(w \mid v)$, deterministic hydrologic model $s = r(w, u)$, hydrologic uncertainty $\phi(h \mid s, h_0, y)$; (b) Inducement of Output Uncertainty: output uncertainty $\pi(s \mid u, v)$, hydrologic uncertainty $\phi(h \mid s, h_0, y)$; (c) Integration of Uncertainties: total uncertainty $\psi(h \mid h_0, y, u, v)$.]

Introduce three random vectors: W (input), S (output), and H (predictand); the realizations (observations) of these vectors are denoted w, s, and h, respectively. The predictand H is the random vector whose observation is to be predicted. The randomness of W arises from the input uncertainty. The randomness of S arises from viewing any output s calculated via the hydrologic model as a realization of random vector S. When precipitation is the input and river stage is the predictand, the structure of these vectors may be as follows:

\[ W = [(W_{11}, \ldots, W_{I1}), \ldots, (W_{1J}, \ldots, W_{IJ})]', \]

\[ S = [(S_{11}, \ldots, S_{N1}), \ldots, (S_{1M}, \ldots, S_{NM})]', \]

\[ H = [(H_{11}, \ldots, H_{N1}), \ldots, (H_{1M}, \ldots, H_{NM})]', \]

where $W_{ij}$ is the precipitation amount accumulated during subperiod i (i = 1, ..., I), counted from the forecast time, and over subarea j (j = 1, ..., J) of the river basin; $S_{nm}$ is the model river stage at time $t_n$ (n = 1, ..., N) and at forecast point m (m = 1, ..., M); and $H_{nm}$ is the actual river stage at time $t_n$ and at forecast point m. Time $t_n$ is measured on a continuous scale, and n is the index of time instances at which river stages are forecasted and observed, with $t_0$ denoting the last observation time before forecast preparation.

Consider now a particular forecasting occasion. First, suppose the uncertainty about the random input is quantified in terms of a generalized probability density function $\eta(\cdot \mid v)$ of W, where v is a parameter vector specified by a probabilistic quantitative precipitation forecast, and the term "generalized" is used because W is a vector of mixed (binary-continuous) variates. Parameter v is a realization of random vector V, and density $\eta(\cdot \mid v)$ is assumed to be well calibrated, in the Bayesian sense, which is defined in section 2.7. Let U denote the input vector whose realization u comprises all deterministic inputs to the hydrologic model; these inputs encompass all exogenous variables and internal states (initial conditions) whose values vary from one forecast time to the next; they exclude parameters whose values remain fixed for a given river basin. Second, suppose the river stage process is Markov of some finite order, so that a prior density of H is conditional on observation $h_0$ of vector $H_0$ of river stages at a finite sequence of times up to the forecast time $t_0$. Let Y denote the state vector whose realization y is available at the forecast time and partially explains the hydrologic uncertainty. (The precise role of y will become apparent soon.) Finally, define a supervector $x = (h_0, y, u, v)$.

The Bayesian theory of probabilistic forecasting derives from the following statement of the total probability law:

\[ \psi(h \mid x) = \int_{-\infty}^{\infty} \phi(h \mid s, x)\, \pi(s \mid x)\, ds, \tag{1} \]


where $\psi$, $\phi$, and $\pi$ are densities of the vectors indicated. This general statement can be specialized to the forecasting problem by recognizing that (u, v) is the sufficient predictor of S and by defining y so that ($h_0$, y) is the sufficient predictor of H, given S = s. Then

\[ \psi(h \mid h_0, y, u, v) = \int_{-\infty}^{\infty} \phi(h \mid s, h_0, y)\, \pi(s \mid u, v)\, ds, \tag{2} \]

where $\pi(\cdot \mid u, v)$ is the density of output S induced by the density $\eta(\cdot \mid v)$ of input W and the hydrologic model with deterministic input u; $\phi(\cdot \mid s, h_0, y)$ is the posterior density of predictand H obtained through the revision of a prior density of H, which is conditional on $h_0$, based on a realization of output S = s from the hydrologic model fed with a perfect forecast of input W, and given state vector y; $\psi(\cdot \mid h_0, y, u, v)$ is the predictive (Bayes) density of H conditional on ($h_0$, y, u, v), that is, the values of all vectors which comprise information used by the forecaster on the particular occasion.

The structure of (2) prescribes the components of the BFS: (1) the input uncertainty processor, which yields density $\pi(\cdot \mid u, v)$, (2) the hydrologic uncertainty processor, which yields a family of densities $\{\phi(\cdot \mid s, h_0, y) : \text{all } s\}$, and (3) the integrator, which yields density $\psi(\cdot \mid h_0, y, u, v)$; this density constitutes a probabilistic forecast of H. These three components are detailed next.
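The integration in (2) can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it assumes a scalar output and a scalar predictand, a Gaussian output density, and a Gaussian posterior family whose mean is linear in s. The grids, the coefficients 0.9 and 0.4, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def integrate(y, x):
    """Trapezoidal rule over the last axis of y, on grid x."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

# Hypothetical grids for model output s and predictand h (e.g., river stage).
s_grid = np.linspace(-10.0, 20.0, 1201)
h_grid = np.linspace(-10.0, 20.0, 601)

# Assumed output density pi(s | u, v), as the input uncertainty processor
# would supply it on a given forecasting occasion.
pi_s = normal_pdf(s_grid, mean=5.0, sd=1.5)

# Assumed posterior family phi(h | s, h0, y): one density of h per value of s.
# Rows index h, columns index s.
phi_h_given_s = normal_pdf(h_grid[:, None], mean=0.9 * s_grid[None, :] + 0.4, sd=0.8)

# Equation (2): psi(h) = integral over s of phi(h | s) * pi(s).
psi_h = integrate(phi_h_given_s * pi_s[None, :], s_grid)

total_mass = integrate(psi_h, h_grid)            # should be close to 1
predictive_mean = integrate(h_grid * psi_h, h_grid)
```

Under these Gaussian assumptions the predictive mean is 0.9 × 5.0 + 0.4 = 4.9, which gives a quick check on the quadrature. With a real hydrologic model neither density would be Gaussian in general, and the grid-based integral would simply use whatever arrays the two processors deliver.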

2.3. Input Uncertainty Processor

Suppose the hydrologic model has been readied to produce a forecast. That is, the deterministic input vector u has its value set, and only input w and output s remain variable. Such a readied model defines a response function r which transforms a realization of input W = w into a realization of output S = s,

\[ s = r(w, u). \tag{3} \]

Output s constitutes an estimate of predictand H, conditional on the hypothesis that the input is W = w.

Density $\eta(\cdot \mid v)$ of W and the response function $r(\cdot, u)$ induce density $\pi(\cdot \mid u, v)$ of S. This density quantifies the uncertainty in model output caused by the input uncertainty. It must be estimated on-line, each time a forecast is prepared.

A numerical estimate of the response function $r(\cdot, u)$ for a fixed u and any set of w values may be obtained via simulation [Box and Draper, 1987]. Likewise, a numerical estimate of the distribution corresponding to density $\pi(\cdot \mid u, v)$, for a set of s values, may be obtained via simulation. For example, in a Monte Carlo approach, a realization of the precipitation field (the input vector) W = w is generated from an approximation to density $\eta(\cdot \mid v)$, and then w is input to a conceptual hydrologic model to obtain a realization of the model river stage vector S = s [Seo and Finnerty, 1998; Schaake and Larson, 1998]. Nevertheless, a research challenge remains to devise methods of estimating $\pi(\cdot \mid u, v)$ that are more efficient and less approximate than the current Monte Carlo algorithms.
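The Monte Carlo approach described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the cited studies: the input is a single precipitation amount that is zero with some probability and exponentially distributed otherwise (a simple mixed binary-continuous law), and `response` is a toy stand-in for the readied hydrologic model. The parameter names `p_wet`, `mean_wet`, `base_stage`, and `gain` are hypothetical.

```python
import numpy as np

def sample_input(v, size, rng):
    """Draw precipitation amounts from an assumed mixed binary-continuous law:
    zero with probability 1 - v['p_wet'], else exponential with mean v['mean_wet']."""
    wet = rng.random(size) < v["p_wet"]
    amounts = rng.exponential(v["mean_wet"], size)
    return np.where(wet, amounts, 0.0)

def response(w, u):
    """Toy stand-in for the readied hydrologic model, s = r(w, u)."""
    return u["base_stage"] + u["gain"] * np.sqrt(w)

rng = np.random.default_rng(seed=0)
v = {"p_wet": 0.6, "mean_wet": 20.0}     # hypothetical forecast parameters
u = {"base_stage": 2.0, "gain": 0.5}     # hypothetical deterministic inputs

# Propagate sampled inputs through the model to obtain realizations of S.
w_samples = sample_input(v, 50_000, rng)
s_samples = response(w_samples, u)

def output_cdf(s):
    """Empirical estimate of P(S <= s | u, v)."""
    return np.mean(s_samples <= s)

prob_no_rise = output_cdf(u["base_stage"])   # probability mass of the dry case
prob_above_4 = 1.0 - output_cdf(4.0)         # exceedance probability of stage 4
```

In this toy setting the dry-case mass should be near 0.4 and the exceedance probability near 0.6 × exp(−16/20) ≈ 0.27, up to sampling error. The sketch also shows why the output distribution inherits a discrete mass from the mixed input, which is one reason a "generalized" density is needed.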

2.4. Hydrologic Uncertainty Processor

2.4.1. Structure. The hydrologic uncertainty processor harnesses the principle of Bayesian revision of a probability distribution. The prior uncertainty about the predictand H, which exists before the preparation of a forecast, is quantified in terms of a family of prior densities $\{g(\cdot \mid h_0) : \text{all } h_0\}$. In accordance with the assumed Markovian structure of the river stage process, density $g(\cdot \mid h_0)$ of H is conditioned on $H_0 = h_0$, a vector of observed river stages up to time $t_0$.

The hydrologic uncertainty is characterized in terms of a family of conditional densities $\{f(\cdot \mid h, y) : \text{all } h, y\}$, where $f(\cdot \mid h, y)$ is the density of model output S, conditional on the hypothesis that the observation of predictand is H = h and the forecast of input W is perfect, and given that the state vector is Y = y. For a fixed output S = s and state Y = y, object $f(s \mid \cdot, y)$ is the likelihood function of predictand H; for convenience, the term likelihood will also be applied to the trivariate function f.

The families g and f carry information about the prior uncertainty and the hydrologic uncertainty into the Bayesian revision process. For any fixed $h_0$ and y the expected density of model output S is given by the total probability law:

\[ \kappa(s \mid h_0, y) = \int_{-\infty}^{\infty} f(s \mid h, y)\, g(h \mid h_0)\, dh, \tag{4} \]

and the posterior density of predictand H, conditional on model output S = s, is given by the Bayes theorem:

\[ \phi(h \mid s, h_0, y) = \frac{f(s \mid h, y)\, g(h \mid h_0)}{\kappa(s \mid h_0, y)}. \tag{5} \]

The family $\{\phi(\cdot \mid s, h_0, y) : \text{all } s, h_0, y\}$ of the posterior densities quantifies the hydrologic uncertainty about predictand H for every possible model output s (induced by a perfect forecast of input W), observation $h_0$, and state y.
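The revision in (4) and (5) can be carried out on a grid. The sketch below assumes a scalar predictand, a normal prior, and a normal likelihood whose mean is linear in h; these forms, the coefficients, and the grids are illustrative assumptions only, and the conditioning on $h_0$ and y is suppressed.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def integrate(y, x):
    """Trapezoidal rule over the last axis of y, on grid x."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

h_grid = np.linspace(-15.0, 25.0, 801)
s_grid = np.linspace(-15.0, 25.0, 801)

# Assumed prior density g(h | h0) of the predictand.
g_h = normal_pdf(h_grid, mean=4.0, sd=2.0)

# Assumed likelihood f(s | h, y): rows index s, columns index h.
f_s_given_h = normal_pdf(s_grid[:, None], mean=1.1 * h_grid[None, :] - 0.3, sd=1.0)

# Equation (4): expected density of output, kappa(s) = integral of f(s|h) g(h) dh.
kappa_s = integrate(f_s_given_h * g_h[None, :], h_grid)

# Equation (5): posterior phi(h | s) = f(s|h) g(h) / kappa(s); rows s, columns h.
phi_h_given_s = f_s_given_h * g_h[None, :] / kappa_s[:, None]

# Posterior mean of H for the model output closest to s = 6.
i6 = np.argmin(np.abs(s_grid - 6.0))
posterior_mean_at_6 = integrate(h_grid * phi_h_given_s[i6], h_grid)

# Averaging the posterior over the expected output density returns the prior.
recovered_prior = integrate(phi_h_given_s.T * kappa_s[None, :], s_grid)
```

For these assumed parameters the conjugate normal formulas give a posterior mean of 7.93 / 1.46 ≈ 5.43 at s = 6, which the grid computation should reproduce. The last line checks the total probability law in reverse: weighting the posterior family by the expected density of output recovers the prior density.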

2.4.2. Prior density. A prototype for the family of the prior densities g is a time series model of river stages. Time series models are often presented as "forecasting models." In our framework of Bayesian forecasting, a time series model of river stages provides only a prior distribution of the predictand. While numerous time series models have been developed, only those which explicitly specify the transition distributions may be suitable for use in the BFS. Once a model for g is chosen, its parameters should be estimated from past observations (a climatic record) of river stages. Borrowing the term from meteorology, g can be called the climatic family of densities.

2.4.3. Likelihood function. To interpret the family of the likelihood functions f, consider the following scenario. The hydrologic model is ready for forecasting (with deterministic input u set for a given forecasting occasion), while a clairvoyant enters and offers the true joint observations (w, h). Given input w, output s = r(w, u) is calculated and compared with h. It should be apparent that any error $\varepsilon = s - h$ is caused solely by the error of the model. If this situation were repeated n times, then the sample of joint observations $\{(s_j, h_j) : j = 1, \ldots, n\}$ would provide data for estimating a family of densities $\{f(\cdot \mid h) : \text{all } h\}$ of S. Next f could be used to probabilistically predict the model error E = S - H, when the true value of W is fixed but unknown. To wit, for any hypothesized observation of the predictand, H = h, the conditional density of model error E is specified by $k(\varepsilon \mid h) = f(\varepsilon + h \mid h)$. Thus the characterization of hydrologic uncertainty in terms of the likelihood function is tantamount to a characterization of the conditional model error that occurs when a clairvoyant forecasts the random input.
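As a worked instance of the relation between the likelihood and the conditional model error density, suppose, purely for illustration, that the likelihood is normal with a mean linear in h; the coefficients a and b and the standard deviation $\sigma$ are hypothetical symbols introduced here. Substituting $s = \varepsilon + h$ gives

\[
f(s \mid h) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(s - a h - b)^2}{2\sigma^2}\right]
\quad\Longrightarrow\quad
k(\varepsilon \mid h) = f(\varepsilon + h \mid h) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{\bigl(\varepsilon - (a - 1) h - b\bigr)^2}{2\sigma^2}\right].
\]

Under this assumption the model error, conditional on H = h, is normal with mean (a - 1)h + b and variance $\sigma^2$, so the model is conditionally unbiased only when a = 1 and b = 0.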

It is possible that conditional on predictand H, model error E is stochastically dependent upon some state Y. In the terminology of regression analysis this means that the observation of the state Y = y, which must be available at the forecast time, partially "explains" the conditional model error E. The general formulation of the likelihood function, as $f(\cdot \mid h, y)$, makes a provision for this possibility. Practically, the choice of vector y is a matter of experimentation. This vector may include some elements of u and some, or even all, elements of $h_0$.

The scenario described above should be followed in a simulation experiment that generates the sample necessary for estimating f. This experiment has three salient features. For every realization (1) the joint observation (w, h, y) must be known, and (2) the deterministic input u to the hydrologic model must be estimated using data and procedures that would be used in operational forecasting of the given observation h of the predictand, that is, without foreknowing observations w and h. Across all realizations (3) the sample of w must be representative of the prior (climatic) distribution of input W. Practically, this condition is satisfied when the empirical distribution constructed from the sample closely matches the climatic distribution. The formal statement of this requirement is given by equation (A4) in the appendix.

In summary, the scenario assumes that there is no input uncertainty but only hydrologic uncertainty. The experiment yields a sample of joint observations $\{(s_j, h_j, y_j) : j = 1, \ldots, n\}$ which provides data for estimating a family of densities $\{f(\cdot \mid h, y) : \text{all } h, y\}$ of S. The experiment can be carried out off-line, before any probabilistic forecast must be prepared. In effect, the family of the posterior densities $\{\phi(\cdot \mid s, h_0, y) : \text{all } s, h_0, y\}$ can be developed off-line and stored in a form ready for use in real-time forecasting.
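The off-line estimation step can be sketched as follows, assuming (as one possible modeling choice, not prescribed by the text) a normal-linear likelihood in which model output is a linear function of the predictand plus normal noise, with the state y omitted. The sample here is synthetic and stands in for the pairs that the simulation experiment would produce; all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Synthetic stand-in for the experiment's sample {(s_j, h_j)}: actual stages
# and model outputs computed from perfect (clairvoyant) inputs.
n = 5000
h_obs = rng.normal(4.0, 2.0, n)
s_sim = 1.1 * h_obs - 0.3 + rng.normal(0.0, 1.0, n)

# Assumed normal-linear likelihood: S = a*H + b + noise, noise ~ N(0, sigma^2).
# np.polyfit returns coefficients from the highest power down: slope, intercept.
a_hat, b_hat = np.polyfit(h_obs, s_sim, deg=1)
residuals = s_sim - (a_hat * h_obs + b_hat)
sigma_hat = residuals.std(ddof=2)   # two regression parameters were estimated

def likelihood(s, h):
    """Estimated density f(s | h) under the normal-linear assumption."""
    z = (s - (a_hat * h + b_hat)) / sigma_hat
    return np.exp(-0.5 * z ** 2) / (sigma_hat * np.sqrt(2.0 * np.pi))
```

Viewed as a function of h for a fixed s, `likelihood` is the likelihood function that enters the Bayes theorem. A state-dependent version would add y as a further regressor, and a non-normal or nonlinear dependence would call for a different family altogether.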

2.4.4. Modeling. How should one model the families of the prior densities and the likelihood functions? First, one should note that these families need not be stationary. They may be periodically stationary. For example, in forecasting daily river stages, the period over which g and f are stationary may be a month, in which case twelve families of g and f must be estimated for real-time forecasting.

In principle, parametric modeling of densities g and f, and the subsequent derivation of densities $\kappa$ and $\phi$, may follow the path through which Bayesian processors of deterministic forecasts have been developed as normal-linear processors [e.g., Krzysztofowicz, 1985, 1987; Krzysztofowicz and Watada, 1986; Krzysztofowicz and Reese, 1991] or meta-Gaussian processors in which the prior density can be of any form and the likelihood function allows for a nonlinear stochastic dependence structure [e.g., Kelly and Krzysztofowicz, 1994, 1995]. A research challenge remains to adapt these models to particular forecasting problems and to extend them to vector-valued time series and hydrologic models with state-dependent predictive capabilities.

2.5. Integrator

At the forecast time, density $\pi(\cdot \mid u, v)$ of S estimated for that occasion and the family of posterior densities $\{\phi(\cdot \mid s, h_0, y) : \text{all } s\}$ of H, retrieved from storage by $H_0 = h_0$ and Y = y observed on that occasion, are integrated according to (2). When (5) is inserted into (2) and the terms are rearranged, one obtains the predictive density of H in the form

\[ \psi(h \mid h_0, y, u, v) = \gamma(h; h_0, y, u, v)\, g(h \mid h_0), \tag{6} \]

where $\gamma$ is the predictive weighting function specified by

\[ \gamma(h; h_0, y, u, v) = \int_{-\infty}^{\infty} f(s \mid h, y)\, \frac{\pi(s \mid u, v)}{\kappa(s \mid h_0, y)}\, ds. \tag{7} \]

The predictive density $\psi(\cdot \mid h_0, y, u, v)$ constitutes the probabilistic forecast of H.

2.6. Properties of Predictive Density

Equations (6)-(7) reveal the fundamental structure of the Bayesian forecasting system. This structure is independent of the hydrologic model to which the BFS is attached. To investigate properties of this structure, it is convenient to omit vectors ($h_0$, y, u, v) whose values are fixed on a particular forecasting occasion (although they vary from one occasion to the next). The operational equations are then

\[ \psi(h) = \gamma(h)\, g(h) \tag{8} \]

and

\[ \gamma(h) = \int_{-\infty}^{\infty} f(s \mid h)\, \frac{\pi(s)}{\kappa(s)}\, ds. \tag{9} \]

The following properties of the BFS can be inferred.

1. The predictive density $\psi$, which constitutes a probabilistic forecast of predictand H, results from a revision of the prior density g.

2. The input uncertainty and the hydrologic uncertainty are integrated into the predictive weighting function $\gamma$, which serves as the multiplier of the prior density g.

3. The predictive density $\psi$ differs from the prior density g if and only if (1) the hydrologic model has a predictive capability, mathematically $f(\cdot \mid h) \neq \kappa$ for at least some h, and (2) forecast of the random input on the particular occasion is informative for predicting output, mathematically $\pi \neq \kappa$. (Recall that $\kappa$, as defined by (4), is the expected density of output that results from the true but unknown input.)

4. If the probabilistic forecast of input is noninformative for predicting output on the particular occasion, in the sense that the induced density of output S equals the expected density, $\pi = \kappa$, then $\gamma(h) = 1$ and $\psi = g$. In other words, the predictive density equals the prior density.

5. If the probabilistic forecast of input is perfect on the particular occasion, in the sense that $P(W = w^*) = 1$ for some $w^*$, then $P(S = s^*) = 1$ for $s^* = r(w^*, u)$ and $\pi(s) = \delta(s - s^*)$, where $\delta$ is the Dirac function. Consequently,

\[ \gamma(h) = \int_{-\infty}^{\infty} f(s \mid h)\, \frac{\delta(s - s^*)}{\kappa(s)}\, ds = \frac{f(s^* \mid h)}{\kappa(s^*)}, \]

\[ \psi(h) = \frac{f(s^* \mid h)\, g(h)}{\kappa(s^*)} = \phi(h \mid s^*). \]

In other words, the predictive density equals the posterior density, given output $S = s^*$.

6. If the hydrologic model has no predictive capability, then $f(\cdot \mid h) = \kappa$ for every h. Consequently, $\gamma(h) = 1$ and $\psi = g$. That is, the predictive density equals the prior density.

7. If the hydrologic model is perfect (and there is no other uncertainty except the input uncertainty), then $f(s \mid h) = \delta(h - s)$. Consequently,

\[ \kappa(s) = \int_{-\infty}^{\infty} \delta(h - s)\, g(h)\, dh = g(s), \]

\[ \gamma(h) = \int_{-\infty}^{\infty} \delta(h - s)\, \frac{\pi(s)}{\kappa(s)}\, ds = \frac{\pi(h)}{g(h)}, \]

Page 6: Bayesian Theory of Probabilistic Forecasting ...€¦ · Hydrologic models used operationally for forecasting hydro- graphs of stages or discharges, or time series of runoff volumes,

2744 KRZYSZTOFOWICZ: BAYESIAN THEORY OF PROBABILISTIC FORECASTING

and therefore ½ = •r, which implies H = S. That is, the predictand equals the output, and the predictive density equals the output density.

Property 1 is a defining property of the Bayesian forecasting system. Property 2 attests to the parsimony of our theory, whose gist is the product $\psi = \gamma g$. Property 3 is essentially a condensation of the remaining properties. Properties 4-7 describe the behavior of the theory in four limiting situations. Together, these properties verify the coherence of the theory. These are the properties that one would (or should) expect from any probabilistic forecasting system.

Properties 4 and 6 are worth highlighting. If the probabilistic forecast of input becomes noninformative, $\pi \to \kappa$, or if the predictive capability of the hydrologic model deteriorates, $f(\cdot \mid h) \to \kappa$ uniformly in $h$, then the predictive density converges to the prior density, $\psi \to g$. Ergo, the BFS automatically guards against the production of a probabilistic forecast whose informativeness is lower than the informativeness of the prior density. This property has a practical significance: it ensures that a probabilistic forecast cannot have a negative ex ante economic value to a rational decision maker, no matter how poorly the hydrologic model performs [Krzysztofowicz, 1983].
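The operational equations (8)-(9) can be checked numerically on a grid. The following is a minimal sketch, not the author's implementation. It assumes, purely for illustration, normal densities for the prior, the likelihood, and the output, with hypothetical parameter values, and it approximates the integrals by Riemann sums. The names `normal_pdf` and `predictive_density` are invented for this example. The last lines illustrate property 4: when the output density equals the expected density, the weighting function is one and the predictive density reproduces the prior.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def predictive_density(g, f, pi, kappa, ds):
    """Riemann-sum version of equations (8)-(9).

    g:     prior density on the h grid, shape (n_h,)
    f:     likelihood f(s|h) on both grids, shape (n_h, n_s)
    pi:    output density on the s grid, shape (n_s,)
    kappa: expected density of output on the s grid, shape (n_s,)
    """
    gamma = (f * (pi / kappa)).sum(axis=1) * ds   # equation (9)
    return gamma * g, gamma                        # equation (8)

# Hypothetical grids and parameters (assumptions for illustration only).
h = np.linspace(-14.0, 16.0, 1501)
s = np.linspace(-14.0, 16.0, 1501)
dh, ds = h[1] - h[0], s[1] - s[0]
M, V2, a, b, sigma2 = 1.0, 4.0, 0.7, 0.4, 2.0

g = normal_pdf(h, M, V2)                                  # prior density of H
f = normal_pdf(s[None, :], a * h[:, None] + b, sigma2)    # f(s|h)
kappa = (f * g[:, None]).sum(axis=0) * dh                 # expected density of S

# Informative forecast: the output density differs from kappa.
pi = normal_pdf(s, 3.0, 1.0)
psi, gamma = predictive_density(g, f, pi, kappa, ds)
psi_mass = psi.sum() * dh          # should be close to 1
psi_mean = (h * psi).sum() * dh    # mean of the predictive density

# Noninformative forecast (property 4): pi = kappa gives gamma = 1 and psi = g.
psi_flat, gamma_flat = predictive_density(g, f, kappa, kappa, ds)
max_gap = np.abs(psi_flat - g).max()
```

The grid approach is only practical for a scalar predictand, and the grid must be wide enough to hold the mass of every density involved.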

2.7. Calibration of Forecasting System

One tool for hydrologic analysis is simulation, whereby a precipitation time series observed in the past is input to a hydrologic model to produce a streamflow time series [Day, 1985]. Then, to verify the hydrologic model, one asks whether statistics of the simulated streamflow (such as mean, variance, skewness, covariance) match statistics of the observed streamflow. Such a comparison of statistics is essentially a practical way of verifying a property which in Bayesian decision theory is called the calibration of a forecaster [Murphy and Winkler, 1974; Alpert and Raiffa, 1982].

Bayesian theory is harnessed to introduce a general definition of calibration of a probabilistic forecasting system and to prove that this definition is satisfied by the BFS.

Let $(H_0, Y, U, V)$ denote the random vector whose realization $(h_0, y, u, v)$ constitutes information upon which the probabilistic forecast of $H$ is based.

Definition 1: A forecasting system producing a probabilistic forecast of input $W$ in the form of a density drawn from the family $\{\eta(\cdot \mid v): \text{all } v\}$ is said to be well calibrated if

$$E[\eta(\cdot \mid V)] = q, \tag{10}$$

where the expectation is taken with respect to $V$, and $q$ is the prior (climatic) density of input $W$.

Definition 2: A forecasting system producing a probabilistic forecast of predictand $H$ in the form of a density drawn from the family $\{\psi(\cdot \mid h_0, y, u, v): \text{all } h_0, y, u, v\}$ is said to be well calibrated if for every $h_0$,

$$E[\psi(\cdot \mid h_0, Y, U, V) \mid H_0 = h_0] = g(\cdot \mid h_0), \tag{11}$$

where the expectation is taken with respect to $(Y, U, V)$, conditional on $H_0 = h_0$, and $g(\cdot \mid h_0)$ is the prior (climatic) density of predictand $H$, conditional on $H_0 = h_0$.

Equation (10) states that the expected density of precipitation input $W$ must be equal to the climatic density of $W$. Equation (11) states that conditional on the observed river stage $H_0 = h_0$, the expected predictive density of river stage $H$ must be equal to the prior (climatic) density of $H$. Obviously, these are more general and more stringent verification criteria than any criteria based on equalities of moments.

Theorem (calibration): If the forecasting system supplying a probabilistic forecast of the random input is well calibrated, then the Bayesian forecasting system defined by (2)-(7) is well calibrated.

In a nutshell, if (10) holds, then (11) holds. The proof is given in the appendix. The theorem reveals one of the unique properties of the BFS: the self-calibration. Its practical significance is twofold. First, it is imperative that the probabilistic forecast of the random input to a hydrologic model come from a system which is well calibrated. This imperative justifies the ongoing effort within the National Weather Service aimed at verification of the calibration property of probabilistic quantitative precipitation forecasts produced for hydrologic purposes [Krzysztofowicz and Sigrest, 1999]. Second, good calibration of the probabilistic forecast inputted to the BFS is sufficient to ensure good calibration of the resultant probabilistic forecast of a hydrologic predictand. In particular, if the likelihood function is properly modeled and estimated, as described in section 2.4.3, and if the prior density of predictand $H$ is also properly modeled and estimated, in the sense that it is based on a climatic record of river observations and preserves climatic estimates of all conditional moments of $(H \mid H_0 = h_0)$, then the probabilistic forecasts of $H$ are guaranteed to preserve, in the long run, all these moments. These moments will be preserved even if the hydrologic model itself, when tested via simulation, fails to preserve them. In fact, the BFS will preserve not only all conditional moments, but the entire family of conditional densities $\{g(\cdot \mid h_0): \text{all } h_0\}$ of $H$.

In a broad sense, our Bayesian framework enables the hydrologist to separately develop two models: a deterministic model of physical processes within a catchment and a stochastic model of streamflow at the forecast points. Then the BFS combines the two models into a forecasting system that preserves properties embedded in each model.
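The calibration theorem can be illustrated with a small discrete analogue. The sketch below is not the proof given in the appendix; it is a toy check under several assumptions. The variates H, S, and V take finitely many values. The random input W is collapsed into the output S, so the forecast is expressed directly as a probability function of S given V. The probability tables are randomly generated and hypothetical. V is taken to be conditionally independent of H given S, which mirrors the sufficiency argument used in the appendix. Under these assumptions the forecast of S is well calibrated by construction, and averaging the predictive probability function over V should return the prior.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_stochastic(rows, cols):
    """Random matrix whose rows are probability vectors (hypothetical data)."""
    m = rng.random((rows, cols)) + 0.1
    return m / m.sum(axis=1, keepdims=True)

n_h, n_s, n_v = 4, 5, 3
g = random_stochastic(1, n_h)[0]              # prior P(H = h)
f = random_stochastic(n_h, n_s)               # likelihood P(S = s | H = h)
p_v_given_s = random_stochastic(n_s, n_v)     # P(V = v | S = s)

kappa = g @ f                                 # expected probability function of S
phi = (f * g[:, None]) / kappa[None, :]       # posterior P(H = h | S = s)
p_v = kappa @ p_v_given_s                     # marginal P(V = v)
pi = (p_v_given_s * kappa[:, None]) / p_v[None, :]   # forecast P(S = s | V = v)

# Discrete analogue of the integrator: psi(h|v) = sum_s phi(h|s) pi(s|v).
psi = phi @ pi
expected_psi = psi @ p_v                      # expectation over V
calibration_gap = np.abs(expected_psi - g).max()

# An overconfident (sharpened) forecast is no longer well calibrated.
pi_sharp = pi ** 2
pi_sharp /= pi_sharp.sum(axis=0, keepdims=True)
expected_psi_sharp = (phi @ pi_sharp) @ p_v
calibration_gap_sharp = np.abs(expected_psi_sharp - g).max()
```

With the calibrated forecast the gap is at the level of floating-point rounding. With the sharpened forecast the expected predictive function generally drifts away from the prior, which is the practical reason the theorem requires a well-calibrated input forecast.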

2.8. Judgmental Quantification of Uncertainties

Although our interpretation of the BFS has focused on statistical estimation of densities based on observations, the densities could be assessed judgmentally in accordance with the subjective probability theory. For instance, probabilistic quantitative precipitation forecasts are produced by meteorologists of the National Weather Service who apply knowledge, experience, and techniques to observations, analyses, and model outputs. To quantify the uncertainty, the meteorologists follow a protocol that prescribes the probabilistic reasoning process [Krzysztofowicz et al., 1993]. The resultant $\eta(\cdot \mid v)$ is a judgmental density. Another situation calling for expert judgments arises when a hydrologic model is deployed in a previously ungauged river basin. Whatever technique is chosen to determine the family of the prior densities $g$, it involves judgments by the hydrologist who selects and applies the technique. Therefore $g(\cdot \mid h_0)$ is a judgmental density. The same could be true with respect to the family of the likelihood functions $f$. In conclusion, the input density, the prior density, or the family of the likelihood functions each may be judgmental.

3. Understanding Integration of Uncertainties

3.1. Forecasting Problem

The key operations performed by the BFS are the quantification and the integration of uncertainties. To gain insight into the integration operation, a hypothetical example is constructed. This example involves a univariate normal density of output and a normal-linear hydrologic uncertainty processor, which afford a closed-form solution for the predictive density. Earlier Bayesian analyses of various forecast data suggest that such a BFS is certainly unsuitable for forecasting river stages or discharges [Kelly and Krzysztofowicz, 1994] but may be suitable for forecasting seasonal snowmelt runoff volumes [Krzysztofowicz and Watada, 1986; Krzysztofowicz and Reese, 1991]. This latter problem provides the context for the example, which is purposely simplistic in order to expose the gist of the Bayesian integrator of uncertainties.

Let $H$ denote the runoff volume measured at a river station during a specified snowmelt season (e.g., April-July); a forecast of $H$ must be made at some fixed time before the onset, or in the early part, of the season. (In actuality, the National Weather Service issues an updated forecast on the first day of each month from January through May or June, depending on the station. The example considers only one of these forecasts.) Let $S$ denote an estimate of $H$ output from some hydrologic model; its structure and complexity do not matter. The deterministic input $u$ to this model may include the initial snowpack depth, snow water equivalent, and antecedent runoff (all of which can be measured). The random input $W$ may include time series of future precipitation and temperature (which must be forecast on the basis of some vector $v$ of climatic predictors).

3.2. Input Uncertainty Processor

The input uncertainty processor is not modeled herein because its structure is immaterial to the behavior of the BFS. What does matter is the form of the density $\pi$ of model output $S$. Suppose this density is normal with mean and variance

$$E(S \mid u, v) = \nu, \qquad \mathrm{Var}(S \mid u, v) = \tau^2. \tag{12}$$

The parameters $(\nu, \tau^2)$ change from one forecast time to the next.

3.3. Hydrologic Uncertainty Processor

3.3.1. Normal-linear processor. The normal-linear hydrologic uncertainty processor for a univariate forecasting problem without a state vector takes the following form [Krzysztofowicz, 1987]. The prior density $g$ of $H$ is normal with mean and variance

$$E(H) = M, \qquad \mathrm{Var}(H) = V^2, \tag{13}$$

and the conditional density $f(\cdot \mid h)$ of $S$ is normal with mean and variance

$$E(S \mid H = h) = ah + b, \qquad \mathrm{Var}(S \mid H = h) = \sigma^2. \tag{14}$$

It then follows that the expected density $\kappa$ of $S$ is normal with moments

$$E(S) = aM + b, \qquad \mathrm{Var}(S) = a^2V^2 + \sigma^2, \tag{15}$$

and the posterior density $\phi(\cdot \mid s)$ of $H$ is normal with moments

$$E(H \mid S = s) = As + B, \qquad \mathrm{Var}(H \mid S = s) = T^2, \tag{16}$$

where

$$A = \frac{aV^2}{a^2V^2 + \sigma^2}, \qquad B = \frac{M\sigma^2 - abV^2}{a^2V^2 + \sigma^2}, \tag{17}$$

$$T^2 = \frac{\sigma^2V^2}{a^2V^2 + \sigma^2}. \tag{18}$$

In other words, the posterior parameters $(A, B, T^2)$ are obtained directly from the prior parameters $(M, V^2)$ and the likelihood parameters $(a, b, \sigma^2)$.
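The mapping from prior and likelihood parameters to posterior parameters in (17)-(18) is simple enough to write as a small function. The sketch below is illustrative only; the names `Posterior` and `posterior_parameters` are invented here, and the numerical inputs are the values listed for examples 1 and 3 in Table 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Posterior:
    A: float    # slope of the posterior mean in s
    B: float    # intercept of the posterior mean
    T2: float   # posterior variance

def posterior_parameters(M, V2, a, b, sigma2):
    """Posterior parameters of the normal-linear processor, equations (17)-(18).

    M, V2:        prior mean and variance of the predictand H
    a, b, sigma2: slope, intercept, and residual variance of the likelihood
    """
    denom = a * a * V2 + sigma2          # variance of S under the expected density
    return Posterior(
        A=a * V2 / denom,
        B=(M * sigma2 - a * b * V2) / denom,
        T2=sigma2 * V2 / denom,
    )

post1 = posterior_parameters(M=1.0, V2=4.0, a=0.7, b=0.4, sigma2=2.0)
post3 = posterior_parameters(M=1.0, V2=4.0, a=0.7, b=0.4, sigma2=8.0)
print(post1)
print(post3)
```

Because the posterior parameters do not depend on the forecast occasion, they can be computed once and stored, which matches the idea of retrieving the posterior family from storage at the forecast time.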

3.3.2. Estimation. The prior parameters $(M, V^2)$ may be estimated from a climatic record of runoff volumes $\{h_i: i = 1, \ldots, m\}$ from $m$ years. The likelihood parameters $(a, b, \sigma^2)$ may be estimated from a joint sample $\{(s_j, h_j): j = 1, \ldots, n\}$ generated by a simulation experiment using data from $n$ years. For a given year the deterministic inputs are set to the estimates of the initial snowpack depth, snow water equivalent, and antecedent runoff which would be available at the forecast time in that year, whereas the random inputs are set to the actual (observed) time series of precipitation and temperature through the end of the snowmelt season in that year, as if their perfect forecast were available. (This assumes that in real time a probabilistic forecast of these time series is available for the same period, from the forecast time to the end of the snowmelt season.) The runoff volume $s_j$ output from the hydrologic model and the actual (observed) runoff volume $h_j$ provide an observation for the joint sample. Given this sample, parameters $a$, $b$, and $\sigma^2$ may be estimated by fitting linear regression (14).
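The estimation step can be sketched on synthetic data. Everything below is an assumption made for illustration: the "true" parameter values, the sample size (far larger than any real record of years, chosen so that the demonstration is stable), and the use of one sample for both the prior and the likelihood, whereas the text allows a longer climatic record for the prior. The regression of model output on observed runoff, as in (14), is fitted with `numpy.polyfit`.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "true" parameters, used only to generate synthetic data.
M_true, V2_true = 1.0, 4.0
a_true, b_true, sigma2_true = 0.7, 0.4, 2.0
n = 5000

# Observed runoff volumes h_j and simulated model outputs s_j.
h = rng.normal(M_true, np.sqrt(V2_true), size=n)
s = a_true * h + b_true + rng.normal(0.0, np.sqrt(sigma2_true), size=n)

# Prior parameters from the record of the predictand.
M_hat = h.mean()
V2_hat = h.var(ddof=1)

# Likelihood parameters from the regression of S on H, equation (14).
a_hat, b_hat = np.polyfit(h, s, deg=1)
residuals = s - (a_hat * h + b_hat)
sigma2_hat = residuals @ residuals / (n - 2)   # unbiased residual variance

print(f"M = {M_hat:.3f}, V2 = {V2_hat:.3f}")
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, sigma2 = {sigma2_hat:.3f}")
```

Note the direction of the regression: the model output is regressed on the observed predictand, not the other way round, because (14) describes the density of S conditional on H.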

3.3.3. Measure of hydrologic uncertainty. The normal-linear likelihood function enables one to completely characterize the predictive capability of the hydrologic model in just three parameters. If, given a perfect forecast of the random input, the output is a perfect estimate of the predictand, then $a = 1$, $b = 0$, and $\sigma^2 = 0$. If, given a perfect forecast of the random input, the output behaves as if it were randomly generated from an arbitrary normal distribution with mean $N$ and variance $R^2$, then $a = 0$, $b = N$, and $\sigma^2 = R^2$; such output is "worthless," of course. These limiting cases suggest an interpretation of the likelihood parameters: the slope $a$ measures output information (or "signal" carried by the output), while the conditional variance $\sigma^2$ measures output uncertainty (or "noise" in the output). Intuitively, one may anticipate that as the "signal" increases and the "noise" decreases, output becomes more informative. Indeed, this is the case, and the "signal-to-noise" ratio, $a^2/\sigma^2$, constitutes a decision-theoretic measure of informativeness of the output, conditional on a perfect forecast of the random input. Alternatively, one may say that $a^2/\sigma^2$ is a measure of hydrologic uncertainty, with $a^2/\sigma^2 \to \infty$ implying no uncertainty and $a^2/\sigma^2 \to 0$ implying "infinite" uncertainty. The statistical theory behind this interpretation of $a^2/\sigma^2$ is given by Krzysztofowicz [1987, 1992].

3.4. Integrator

When the output density $\pi$ is specified by (12) and the family of the posterior densities $\{\phi(\cdot \mid s): -\infty < s < \infty\}$ is specified by (16)-(18), the predictive density $\psi$ is normal with mean $E(H \mid u, v) = m$ and variance $\mathrm{Var}(H \mid u, v) = v^2$, which are given by

$$m = \frac{aV^2(\nu - b) + M\sigma^2}{a^2V^2 + \sigma^2}, \tag{19}$$

$$v^2 = \frac{a^2V^4(\tau^2 + \sigma^2) + \sigma^4V^2}{(a^2V^2 + \sigma^2)^2}. \tag{20}$$

In other words, the predictive parameters $(m, v^2)$, which define the probabilistic forecast, result directly from the output parameters $(\nu, \tau^2)$, the prior parameters $(M, V^2)$, and the likelihood parameters $(a, b, \sigma^2)$.

Table 1. Examples of the Bayesian Forecasting System With Normal-Linear Processors

| Processor | Density | Parameter name | Symbol | Example 1 | Example 2 | Example 3 |
|---|---|---|---|---|---|---|
| Input uncertainty | $\pi(s)$ | mean | $\nu$ | 3 | 3 | 3 |
| | | variance | $\tau^2$ | 1 | 5 | 5 |
| Hydrologic uncertainty | $g(h)$ | mean | $M$ | 1 | 1 | 1 |
| | | variance | $V^2$ | 4 | 4 | 4 |
| | $f(s \mid h)$ | slope | $a$ | 0.7 | 0.7 | 0.7 |
| | | intercept | $b$ | 0.4 | 0.4 | 0.4 |
| | | variance | $\sigma^2$ | 2 | 2 | 8 |
| | $\kappa(s)$ | mean | $aM + b$ | 1.1 | 1.1 | 1.1 |
| | | variance | $a^2V^2 + \sigma^2$ | 3.96 | 3.96 | 9.96 |
| | $\phi(h \mid s)$ | slope | $A$ | 0.71 | 0.71 | 0.28 |
| | | intercept | $B$ | 0.22 | 0.22 | 0.69 |
| | | variance | $T^2$ | 2.02 | 2.02 | 3.21 |
| Integrator | $\psi(h)$ | mean | $m$ | 2.34 | 2.34 | 1.53 |
| | | variance | $v^2$ | 2.52 | 4.52 | 3.61 |
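Equations (19) and (20) are straightforward to evaluate. The sketch below, with the invented function name `predictive_parameters`, computes the predictive mean and variance for the three parameter sets listed in Table 1 so that the integrator rows of the table can be checked to two decimals.

```python
def predictive_parameters(nu, tau2, M, V2, a, b, sigma2):
    """Predictive mean m and variance v2, equations (19)-(20).

    nu, tau2:     mean and variance of the output density
    M, V2:        prior mean and variance of the predictand
    a, b, sigma2: likelihood slope, intercept, and residual variance
    """
    denom = a ** 2 * V2 + sigma2
    m = (a * V2 * (nu - b) + M * sigma2) / denom
    v2 = (a ** 2 * V2 ** 2 * (tau2 + sigma2) + sigma2 ** 2 * V2) / denom ** 2
    return m, v2

# Parameter values as listed in Table 1.
common = dict(M=1.0, V2=4.0, a=0.7, b=0.4)
examples = {
    1: dict(nu=3.0, tau2=1.0, sigma2=2.0),
    2: dict(nu=3.0, tau2=5.0, sigma2=2.0),
    3: dict(nu=3.0, tau2=5.0, sigma2=8.0),
}

results = {k: predictive_parameters(**common, **ex) for k, ex in examples.items()}
for k, (m, v2) in results.items():
    print(f"example {k}: m = {m:.2f}, v2 = {v2:.2f}")
```

The same numbers follow from the posterior parameters, since $m = A\nu + B$ and $v^2 = A^2\tau^2 + T^2$; this identity is a convenient cross-check on any implementation.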

3.5. Numerical Examples

Table 1 lists all the densities and parameters. It also shows three numerical examples intended to convey the importance of optimal integration of uncertainties. In example 1 a relatively low output uncertainty ($\tau^2 = 1$) is integrated with a relatively moderate hydrologic uncertainty ($a = 0.7$, $\sigma^2 = 2$, and $a^2/\sigma^2 = 0.245$). As a result, the predictive variance, $v^2 = 2.52$, is much greater than the output variance, $\tau^2 = 1$. In example 2 the output uncertainty is relatively high ($\tau^2 = 5$), and the hydrologic uncertainty is the same as in example 1. The resultant predictive variance, $v^2 = 4.52$, is smaller than the output variance. Furthermore, the output variance and the predictive variance are each greater than the prior variance, $V^2 = 4$. Finally, in example 3 the output uncertainty is the same as in example 2 but the hydrologic uncertainty is higher ($a = 0.7$, $\sigma^2 = 8$, and $a^2/\sigma^2 = 0.061$). The resultant predictive variance, $v^2 = 3.61$, is still smaller than the output variance. It is even smaller than the prior variance, $V^2 = 4$.

In both examples 2 and 3 the hydrologic uncertainty appears to dampen the effect of the output uncertainty. Moreover, when the hydrologic uncertainty increases (as it does from example 2 to example 3), the dampening becomes stronger. If this result seems counterintuitive, it is possibly because human intuition expects an additive integration of uncertainties. The optimal, Bayesian integrator is nonadditive and nonlinear, the properties examined in the next section.

Figure 3 depicts example 1. By comparing densities $\pi$, $\phi(\cdot \mid s)$, and $\psi$, one can see how the output uncertainty and the hydrologic uncertainty are integrated into the predictive uncertainty. By comparing densities $g$ and $\psi$, one can see how the prior uncertainty is revised on the basis of a probabilistic input forecast and a deterministic hydrologic model. Also shown is density $\kappa$; as may be recalled, if $\pi = \kappa$, then $\psi = g$.

3.6. Behavior of Bayesian Integrator

The closed form of (20) makes it easy to examine the way in which the output uncertainty (induced by some input uncertainty and characterized by $\tau^2$) is integrated with the hydrologic uncertainty (characterized by $a^2$ and $\sigma^2$) into the total uncertainty (characterized by $v^2$). First, there are limiting cases: (1) In the case of a noninformative hydrologic model (as $a^2 \to 0$, or $\sigma^2 \to \infty$, or both), the predictive variance $v^2$ converges to the prior variance $V^2$. This is a unique advantage of the Bayesian approach: it imposes an upper asymptote on $v^2$ and thereby automatically guards the decision maker against a poorly performing hydrologic model. (2) In the case of a perfect hydrologic model (as $a^2 \to 1$ and $\sigma^2 \to 0$), the predictive variance $v^2$ converges to the output variance $\tau^2$. This is to be expected because only input uncertainty remains.

Second, there is the monotonicity of relations. (1) The predictive variance $v^2$ increases linearly with the output variance $\tau^2$. (2) The predictive variance $v^2$ is a nonlinear function of the model signal $a^2$ and the model noise $\sigma^2$. For a fixed $V^2$, $\tau^2$, and $a^2$ the behavior of $v^2$ as a function of $\sigma^2$ follows one of the three patterns sketched in Figure 4:

1. If the ratio of the output variance to model signal is not larger than half of the prior variance, $\tau^2/a^2 \le V^2/2$, then as the model noise $\sigma^2$ increases, the predictive variance $v^2$ increases asymptotically to the prior variance $V^2$.

2. If $V^2/2 < \tau^2/a^2 \le V^2$, then as $\sigma^2$ increases, $v^2$ first decreases to a minimum (which lies above $V^2/2$) and next increases asymptotically to $V^2$. The coordinates of the minimum are

$$\sigma^2 = 2\tau^2 - a^2V^2, \qquad v^2 = V^2 - \frac{a^2V^4}{4\tau^2}. \tag{21}$$

3. If $V^2 < \tau^2/a^2$, then as $\sigma^2$ increases, $v^2$ first decreases to a minimum (which lies above $V^2/2$ and below $V^2$) and next increases asymptotically to $V^2$. The minimum occurs at the point defined by (21). The crossing of $V^2$ en route to the minimum occurs at the point

$$\sigma^2 = \tau^2 - a^2V^2, \qquad v^2 = V^2. \tag{22}$$
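The three patterns and the coordinates in (21)-(22) can be checked numerically against (20). The sketch below is illustrative; the function names and the parameter values (a = 1, V² = 4, τ² = 6, which fall under the third pattern) are assumptions chosen for the demonstration. It locates the minimum of the predictive variance on a grid of model noise values and compares it with the closed-form coordinates.

```python
import numpy as np

def predictive_variance(tau2, a, V2, sigma2):
    """Predictive variance v2 as a function of the model noise, equation (20)."""
    num = a ** 2 * V2 ** 2 * (tau2 + sigma2) + sigma2 ** 2 * V2
    return num / (a ** 2 * V2 + sigma2) ** 2

def pattern(tau2, a, V2):
    """Return 1, 2, or 3 according to the ratio of output variance to signal."""
    ratio = tau2 / a ** 2
    if ratio <= V2 / 2:
        return 1      # v2 increases monotonically toward V2
    if ratio <= V2:
        return 2      # v2 dips to a minimum, starting at or below V2
    return 3          # v2 starts above V2, crosses it, then dips

# Hypothetical parameter values for illustration.
a, V2, tau2 = 1.0, 4.0, 6.0
case = pattern(tau2, a, V2)

# Numerical search for the minimum over a grid of model noise values.
sigma2_grid = np.linspace(0.0, 60.0, 60001)
v2_grid = predictive_variance(tau2, a, V2, sigma2_grid)
i_min = int(np.argmin(v2_grid))

# Closed-form coordinates from (21) and the crossing point from (22).
sigma2_min_closed = 2 * tau2 - a ** 2 * V2
v2_min_closed = V2 - a ** 2 * V2 ** 2 / (4 * tau2)
sigma2_cross = tau2 - a ** 2 * V2
v2_at_cross = predictive_variance(tau2, a, V2, sigma2_cross)

print(case, sigma2_grid[i_min], v2_grid[i_min], sigma2_min_closed, v2_min_closed)
```

Setting the derivative of (20) with respect to the model noise to zero gives the location in (21) directly, so the grid search is only a sanity check, not a substitute for the closed form.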

The first case is intuitive: as the hydrologic uncertainty increases, so does the predictive variance. The other two cases may seem counterintuitive: How can an increase in hydrologic uncertainty lead to a lower predictive variance? To rationalize this behavior of the BFS, note that the predictive variance $v^2$ can be larger than the prior variance $V^2$. This occurs whenever the output uncertainty on a particular occasion is larger than the prior uncertainty, which is assessed for an average occasion ($\tau^2 > V^2$), and the model is near perfect (say, $a^2 = 1$ and $\sigma^2 \to 0$); this is the case at the left end of Figure 4c. Next suppose the model's predictive capability deteriorates ($\sigma^2$ increases). Consequently, model output can no longer be taken at its face value. The BFS recognizes this. In particular, for any output realization $S = s$, the posterior mean $E(H \mid S = s)$ is closer to the prior mean $E(H)$ than is the realization $S = s$. Overall, the BFS dampens the effect of output uncertainty on the predictive density. This dampening effect increases with $\sigma^2$. However, when $\sigma^2$ exceeds $2\tau^2 - a^2V^2$, hydrologic uncertainty begins to dominate output uncertainty. The increasing hydrologic uncertainty pushes the predictive variance $v^2$ upward until it converges to $V^2$.

Figure 3. Example of the Bayesian forecasting system with normal-linear processors: (a) output density $\pi$ and expected density $\kappa$, (b) posterior density $\phi(\cdot \mid s)$, and (c) prior density $g$ and predictive density $\psi$.

Figure 4. Behavior of the predictive variance $v^2$ as a function of the model "noise" $\sigma^2$ for three cases defined by the ratio of the output variance $\tau^2$ to the model "signal" $a^2$, relative to the prior variance $V^2$: (a) $\tau^2/a^2 \le V^2/2$, (b) $V^2/2 < \tau^2/a^2 \le V^2$, and (c) $V^2 < \tau^2/a^2$; plots for fixed $a = 1$ and fixed $V^2$.

In conclusion, the Bayesian integrator of input uncertainty with hydrologic uncertainty has a nonmonotone structure. This structure may appear counterintuitive at first. It is certainly not additive: the effect of hydrologic uncertainty cannot be represented in terms of "noise" $\Theta$ added to the model output, $H = S + \Theta$. Such a heuristic, though popular, is incorrect.
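The contrast between the Bayesian integrator and the additive heuristic can be made concrete. The sketch below compares the predictive variance from (20) with the variance that the heuristic would give. The additive variance, output variance plus noise variance, rests on an assumption not stated in the text: that the added noise is independent of the output, has zero mean, and has variance equal to the model noise. The parameter values are those of examples 2 and 3 in Table 1, extended with two larger, hypothetical noise levels.

```python
def bayes_variance(tau2, a, V2, sigma2):
    """Predictive variance from the Bayesian integrator, equation (20)."""
    num = a ** 2 * V2 ** 2 * (tau2 + sigma2) + sigma2 ** 2 * V2
    return num / (a ** 2 * V2 + sigma2) ** 2

def additive_variance(tau2, sigma2):
    """Variance under the heuristic H = S + noise (assumed independent noise)."""
    return tau2 + sigma2

tau2, a, V2 = 5.0, 0.7, 4.0
noise_levels = (2.0, 8.0, 32.0, 128.0)

bayes = [bayes_variance(tau2, a, V2, s2) for s2 in noise_levels]
additive = [additive_variance(tau2, s2) for s2 in noise_levels]

for s2, vb, va in zip(noise_levels, bayes, additive):
    print(f"sigma2 = {s2:6.1f}: Bayes v2 = {vb:.2f}, additive = {va:.2f}")
```

Under the heuristic the variance grows without bound as the model deteriorates, whereas the Bayesian predictive variance first falls, then rises, and approaches the prior variance from below for these parameter values.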

4. Summary and Conclusions

4.1. Bayesian Framework

The Bayesian theory presented herein offers a methodological foundation and an operational framework for probabilistic forecasting via a deterministic hydrologic model of an arbitrary complexity. This framework, called Bayesian forecasting system (BFS) for short, has five general properties.

1. The BFS decomposes the total uncertainty about the predictand into input uncertainty and hydrologic uncertainty. Input uncertainty is associated with inputs which constitute the dominant sources of uncertainty and which therefore are treated as random and are forecasted probabilistically. Hydrologic uncertainty is associated with all other sources of uncertainty such as model, parameter, estimation, and measurement errors. Each uncertainty is quantified independently of the other, and then both are integrated into the probabilistic forecast.

2. The probabilistic forecast is in the form of a predictive (Bayes) density. This density quantifies the total uncertainty about the predictand, given all knowledge and information incorporated into the hydrologic model, the probabilistic forecast of random inputs, and the estimates of deterministic inputs.

3. The predictive density is a revised prior density of the predictand. The prior density constitutes a stochastic model of the predictand; it quantifies the uncertainty that would exist without the hydrologic model. This uncertainty is tantamount to the natural variability of the predictand.

4. The BFS possesses a self-calibration property: provided the forecasting system supplying the probabilistic forecast of random inputs is well calibrated, the BFS is well calibrated. Loosely speaking, it means that probabilistic forecasts are guaranteed to preserve, in the long run, all distributional properties of the predictand that are captured by the family of prior (climatic) densities.

5. The BFS guarantees a coherence property that is essential for rational decision making: it guards the decision maker against notoriously poor forecasts (whose economic value is negative relative to the value of the prior density that one would use if forecasts were unavailable). In particular, if the hydrologic model has no predictive capability or the input density is noninformative for predicting output, then the predictive density automatically converges to the prior density.

Whereas properties 1 and 3 may be considered specific to the BFS, properties 2, 4, and 5 are submitted as necessary for any probabilistic forecasting system if such a system is to support rational decision making.

4.2. Monte Carlo and Ensemble Techniques

It is instructive to compare the BFS with a typical Monte Carlo simulation: given a distribution of an input vector, one randomly generates realizations of this vector, inputs them to a hydrologic model to produce realizations of the output vector, and, finally, constructs an empirical distribution of the output vector [Law and Kelton, 1991]. In essence, this is a technique for executing the input uncertainty processor of the BFS. In the context of the example constructed in section 3, Monte Carlo simulation yields density $\pi$ with mean $\nu$ and variance $\tau^2$, whereas the BFS yields density $\psi$ with mean $m$ and variance $v^2$. The numerical examples in Table 1 demonstrate that $m \ne \nu$ and $v^2 \ne \tau^2$ and that the differences can be substantial. It should now be apparent that Monte Carlo simulation, without a hydrologic uncertainty processor and an integrator, does not yield a probabilistic forecast.

The term ensemble forecasting is becoming popular. Usually, ensemble forecasting is tantamount to either Monte Carlo simulation or some specialized sampling scheme. For example, a set of realizations (an ensemble) of a meteorologic variate is calculated by perturbing initial conditions in an atmospheric circulation model [Toth and Kalnay, 1993; Tracton and Kalnay, 1993]; or an ensemble of streamflow time series is calculated via a hydrologic model from an ensemble of time series of atmospheric inputs [Day, 1985; Georgakakos et al., 1998]. Like Monte Carlo simulation, ensemble forecasting is only a technique for executing the input uncertainty processor of the BFS.

In summary, unless hydrologic uncertainty is insignificant and can be ignored, neither Monte Carlo simulation nor ensemble forecasting can alone produce a predictive distribution that satisfies properties 2, 4, and 5. In other words, these techniques are not synonymous with probabilistic forecasting, as defined herein, but each can serve as a component of the BFS. Some designers of ensemble forecasting systems [e.g., Schaake and Larson, 1998] recognize the need to account for hydrologic uncertainty and search for an effective solution. As the examples in Table 1 demonstrate, such a solution cannot be limited to adjusting the mean of the ensemble (to correct for a bias of the hydrologic model) but must also, at the very least, adjust the variance of the ensemble.
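One way to picture an ensemble serving as a component of the BFS is to pass each ensemble member through the posterior density of the normal-linear processor of section 3. The sketch below is a hedged illustration, not an operational procedure. The ensemble is a stand-in drawn directly from a normal output density rather than produced by runs of a hydrologic model, the parameter values are those of example 2 in Table 1, and the sample size is an arbitrary choice. Drawing one predictand value per member from the posterior density amounts to a Monte Carlo evaluation of the integration of the posterior family over the output density.

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameters of example 2 in Table 1.
M, V2, a, b, sigma2 = 1.0, 4.0, 0.7, 0.4, 2.0
nu, tau2 = 3.0, 5.0

# Posterior parameters, equations (17)-(18).
denom = a * a * V2 + sigma2
A = a * V2 / denom
B = (M * sigma2 - a * b * V2) / denom
T2 = sigma2 * V2 / denom

# Stand-in ensemble of model outputs (in practice, one hydrologic model run
# per sampled input realization).
ensemble_s = rng.normal(nu, np.sqrt(tau2), size=200_000)

# Hydrologic uncertainty processor applied member by member:
# draw H from the posterior density given S = s.
predictive_h = rng.normal(A * ensemble_s + B, np.sqrt(T2))

ensemble_mean, ensemble_var = ensemble_s.mean(), ensemble_s.var(ddof=1)
pred_mean, pred_var = predictive_h.mean(), predictive_h.var(ddof=1)

print(f"raw ensemble: mean = {ensemble_mean:.2f}, variance = {ensemble_var:.2f}")
print(f"predictive:   mean = {pred_mean:.2f}, variance = {pred_var:.2f}")
```

The raw ensemble and the predictive sample differ in both mean and variance, which is the point made in the text: correcting the ensemble mean for bias is not enough.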

Appendix: Proof of Calibration Theorem To keep notation from expanding, • will denote a general-

ized density of any random vector whose realization constitutes the argument of •. Let •(w, h, ho, Y, u, v) denote a super- density from which nature generates all realizations of vector (W, H, Ho, Y, U, V). The theory is coherent if all other densities are derivable from the superdensity. Density •(y, u, vlho) is easily derived. When this density and (6) are inserted into (11), one finds

E[½(hlh0, Y, V, v)lH0- h0]

=Iff •(hlhø'y'u'v)•(Y'u'vlhø)dydudv =•7(hlho) ff f •/(h; h0, y, u, v)

ß ½(y, u, vlh0) dy du dv. (A1)

The task now is to demonstrate that the integral on the right side equals one. This demonstration takes four steps.

Step 1: According to (4) and (7), function •/(h; ho, y, u, v) is constructed from two conditional densities of S: density f(s]h, y), which is induced by density ½(w, u]h, y) and response function s = r(w, u), and density ,r(s]u, v), which is induced by density r•(w]v) and response function s = r(w, u). The two conditional densities of S are coherent if ½(w, u]h, y) and r•(w]v) come from the same superdensity. This is not guaranteed be- cause density r•(w]v) originates outside the BFS. Therefore a sufficient condition for coherence must be established. Toward

this end, the first density is factorized as

Page 11: Bayesian Theory of Probabilistic Forecasting ...€¦ · Hydrologic models used operationally for forecasting hydro- graphs of stages or discharges, or time series of runoff volumes,

KRZYSZTOFOWICZ: BAYESIAN THEORY OF PROBABILISTIC FORECASTING 2749

•(w, ulh, y) - •(wlh, y, u)•(ulh, y)

= [•(h, y, u w)q(w)/•(h, y, u)]•(ulh, y), (A2)

and the second density is factorized as

w(wlv)- •(vlw)q(w)/•(v). (A3)

The only element common to both factorizations is $q(w)$, which is the prior (climatic) density of the input. From (A2),

$$q(w) = \iiint \xi(w \mid h, y, u)\,\xi(h, y, u)\,dh\,dy\,du = E[\xi(w \mid H, Y, U)], \tag{A4}$$

and from (A3),

$$q(w) = \int \eta(w \mid v)\,\xi(v)\,dv = E[\eta(w \mid V)]. \tag{A5}$$

In conclusion, if (A4)-(A5) hold, then densities $\xi(w, u \mid h, y)$ and $\eta(w \mid v)$ are coherent, and so are densities $f(s \mid h, y)$ and $\pi(s \mid u, v)$. Condition (A4) is upheld by construction of the likelihood function, which takes place within the BFS. Therefore condition (A5), which is the assumption of the theorem, is sufficient to guarantee the coherence of $f(s \mid h, y)$ and $\pi(s \mid u, v)$.

Step 2: The superdensity $\xi(w, h, h_0, y, u, v)$ and the response function $s = r(w, u)$ induce a joint density $\xi(s, h, h_0, y, u, v)$.

Step 3: The joint density can be factorized as follows:

$$\begin{aligned}
\xi(s, h, h_0, y, u, v) &= \xi(h \mid s, h_0, y, u, v)\,\xi(s \mid h_0, y, u, v)\,\xi(u, v \mid h_0, y)\,\xi(h_0, y) \\
&= \phi(h \mid s, h_0, y)\,\pi(s \mid u, v)\,\xi(u, v \mid h_0, y)\,\xi(h_0, y),
\end{aligned} \tag{A6}$$

where the first density reduces to $\phi(h \mid s, h_0, y)$ because, conditional on $S = s$, vector $(h_0, y)$ is the sufficient statistic for the density of $H$, and the second density reduces to $\pi(s \mid u, v)$ because $(u, v)$ is sufficient to induce the density of $S$, via the density of input $\eta(w \mid v)$ and the response function $s = r(w, u)$. When each side of (A6) is integrated with respect to $h$ and next divided by $\xi(h_0, y)$, one obtains

$$\xi(s, u, v \mid h_0, y) = \pi(s \mid u, v)\,\xi(u, v \mid h_0, y).$$

Consequently,

$$\kappa(s \mid h_0, y) = \iint \xi(s, u, v \mid h_0, y)\,du\,dv = \iint \pi(s \mid u, v)\,\xi(u, v \mid h_0, y)\,du\,dv, \tag{A7}$$

which demonstrates that the expected density of $S$ is equal to the conditional expectation of the output density. This expected density of $S$ is identical to that specified by (4) in terms of $g(h \mid h_0)$ and $f(s \mid h, y)$ because $f(s \mid h, y)$ and $\pi(s \mid u, v)$ are coherent.

Step 4: When, on the right side of (A1), $\psi(h \mid h_0, y, u, v)$ is replaced by (7) and $\xi(y, u, v \mid h_0)$ is factorized as $\xi(u, v \mid h_0, y)\,\xi(y \mid h_0)$, the triple integral takes the form

$$\begin{aligned}
&\iiint \left[ \int \frac{f(s \mid h, y)}{\kappa(s \mid h_0, y)}\,\pi(s \mid u, v)\,ds \right] \xi(u, v \mid h_0, y)\,\xi(y \mid h_0)\,dy\,du\,dv \\
&\quad = \iint \frac{f(s \mid h, y)}{\kappa(s \mid h_0, y)} \left[ \iint \pi(s \mid u, v)\,\xi(u, v \mid h_0, y)\,du\,dv \right] \xi(y \mid h_0)\,ds\,dy \\
&\quad = \iint \frac{f(s \mid h, y)}{\kappa(s \mid h_0, y)}\,\kappa(s \mid h_0, y)\,\xi(y \mid h_0)\,ds\,dy \\
&\quad = \int \left[ \int f(s \mid h, y)\,ds \right] \xi(y \mid h_0)\,dy = 1,
\end{aligned} \tag{A8}$$

where the transformation from the second to the third line uses (A7). QED.
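As an illustrative aside that is not part of the original derivation, the role of coherence in the proof can be checked numerically on a small discrete analogue. The sketch below assumes that $h_0$ and $y$ are held fixed and suppressed, that densities are replaced by probability mass functions, and that a single index `k` stands in for the pair $(u, v)$. All function and variable names are hypothetical. When the output pmf satisfies the discrete counterpart of (A7), the predictive pmf averaged over the input states returns the prior; when it does not, the average generally departs from the prior.

```python
import random

def random_pmf(n, rng):
    """Draw an arbitrary strictly positive pmf on n points."""
    weights = [rng.random() + 0.1 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def expected_output_pmf(g, f):
    """kappa[s] = sum_h f[h][s] * g[h], the discrete analogue of (4)."""
    return [sum(f[h][s] * g[h] for h in range(len(g))) for s in range(len(f[0]))]

def coherent_inputs(kappa, n_k, rng):
    """Build xi[k] and pi[k][s] whose mixture over k reproduces kappa (A7)."""
    n_s = len(kappa)
    k_given_s = [random_pmf(n_k, rng) for _ in range(n_s)]
    joint = [[kappa[s] * k_given_s[s][k] for k in range(n_k)] for s in range(n_s)]
    xi = [sum(joint[s][k] for s in range(n_s)) for k in range(n_k)]
    pi = [[joint[s][k] / xi[k] for s in range(n_s)] for k in range(n_k)]
    return pi, xi

def expected_predictive(g, f, pi, xi):
    """Average over k of psi(h | k) = sum_s phi(h | s) * pi(s | k)."""
    n_h, n_s = len(g), len(f[0])
    kappa = expected_output_pmf(g, f)
    # Posterior pmf from Bayes theorem: phi[h][s] = f[h][s] * g[h] / kappa[s].
    phi = [[f[h][s] * g[h] / kappa[s] for s in range(n_s)] for h in range(n_h)]
    averaged = []
    for h in range(n_h):
        total = 0.0
        for k in range(len(xi)):
            psi_hk = sum(phi[h][s] * pi[k][s] for s in range(n_s))
            total += psi_hk * xi[k]
        averaged.append(total)
    return averaged

def max_gap(a, b):
    """Largest absolute difference between two pmfs."""
    return max(abs(x - y) for x, y in zip(a, b))

rng = random.Random(0)
n_h, n_s, n_k = 4, 5, 3
g = random_pmf(n_h, rng)                        # prior pmf of H
f = [random_pmf(n_s, rng) for _ in range(n_h)]  # likelihood rows f[h][.]
kappa = expected_output_pmf(g, f)

# Coherent case: the output pmf mixes back to kappa.
pi_ok, xi_ok = coherent_inputs(kappa, n_k, rng)
avg_coherent = expected_predictive(g, f, pi_ok, xi_ok)
gap_coherent = max_gap(avg_coherent, g)

# Incoherent case: arbitrary output pmfs, unrelated to f and g.
pi_bad = [random_pmf(n_s, rng) for _ in range(n_k)]
xi_bad = random_pmf(n_k, rng)
avg_incoherent = expected_predictive(g, f, pi_bad, xi_bad)
gap_incoherent = max_gap(avg_incoherent, g)

print("coherent gap:", gap_coherent)
print("incoherent gap:", gap_incoherent)
```

The construction in `coherent_inputs` is only one convenient way to satisfy the mixture condition; any output pmf and input pmf whose mixture equals `kappa` would serve the same purpose in this toy setting.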

Acknowledgment. This article is based upon work supported by the National Oceanic and Atmospheric Administration under award NA77WD0556 "Probabilistic Hydrometeorological Forecast System."

References

Alpert, M., and H. Raiffa, A progress report on the training of probability assessors, in Judgment Under Uncertainty: Heuristics and Biases, edited by D. Kahneman, P. Slovic, and A. Tversky, pp. 294-305, Cambridge Univ. Press, New York, 1982.

Berger, J. O., Statistical Decision Theory and Bayesian Analysis, Springer-Verlag, New York, 1985.

Bernardo, J. M., and A. F. M. Smith, Bayesian Theory, John Wiley, New York, 1994.

Box, G. E. P., and N. R. Draper, Empirical Model-Building and Response Surfaces, John Wiley, New York, 1987.

Day, G. N., Extended streamflow forecasting using NWSRFS, J. Water Resour. Plann. Manage., 111(2), 157-170, 1985.

DeGroot, M. H., Optimal Statistical Decisions, McGraw-Hill, New York, 1970.

Georgakakos, A. P., H. Yao, M. G. Mullusky, and K. P. Georgakakos, Impacts of climate variability on the operational forecast and management of the upper Des Moines River basin, Water Resour. Res., 34(4), 799-821, 1998.

Graziano, T. M., The NWS end-to-end forecast process for quantitative precipitation information, in Preprints, Special Symposium on Hydrology, pp. J35-J40, Am. Meteorol. Soc., Boston, Mass., 1998.

Kelly, K. S., and R. Krzysztofowicz, Probability distributions for flood warning systems, Water Resour. Res., 30(4), 1145-1152, 1994.

Kelly, K. S., and R. Krzysztofowicz, Bayesian revision of an arbitrary prior density, in Proceedings, Section on Bayesian Statistical Science, pp. 50-53, Am. Stat. Assoc., Alexandria, Va., 1995.

Krzysztofowicz, R., Why should a forecaster and a decision maker use Bayes theorem, Water Resour. Res., 19(2), 327-336, 1983.

Krzysztofowicz, R., Bayesian models of forecasted time series, Water Resour. Bull., 21(5), 805-814, 1985.

Krzysztofowicz, R., Markovian forecast processes, J. Am. Stat. Assoc., 82(397), 31-37, 1987.

Krzysztofowicz, R., Bayesian correlation score: A utilitarian measure of forecast skill, Mon. Weather Rev., 120(1), 208-219, 1992.

Krzysztofowicz, R., Probabilistic hydrometeorological forecasting system: A conceptual design, in Third National Heavy Precipitation Workshop, NOAA Tech. Mem. NWS ER-87, pp. 29-42, Natl. Weather Serv., Eastern Reg., Bohemia, N.Y., 1993.

Krzysztofowicz, R., Probabilistic hydrometeorological forecasts: Toward a new era in operational forecasting, Bull. Am. Meteorol. Soc., 79(2), 243-251, 1998.

Krzysztofowicz, R., and S. Reese, Bayesian analyses of seasonal runoff forecasts, Stochastic Hydrol. Hydraul., 5(4), 295-322, 1991.

Krzysztofowicz, R., and A. A. Sigrest, Calibration of probabilistic quantitative precipitation forecasts, Weather Forecasting, 14(3), 427-442, 1999.

Krzysztofowicz, R., and L. M. Watada, Stochastic model of seasonal runoff forecasts, Water Resour. Res., 22(3), 296-302, 1986.

Krzysztofowicz, R., W. J. Drzal, T. R. Drake, J. C. Weyman, and L. A. Giordano, Probabilistic quantitative precipitation forecasts for river basins, Weather Forecasting, 8(4), 424-439, 1993.

Law, A. M., and W. D. Kelton, Simulation Modeling and Analysis, McGraw-Hill, New York, 1991.

Murphy, A. H., Probabilities, odds, and forecasts of rare events, Weather Forecasting, 6(2), 302-307, 1991.

Murphy, A. H., and R. L. Winkler, Credible interval temperature forecasting: Some experimental results, Mon. Weather Rev., 102, 784-794, 1974.

Olson, D. A., N. W. Junker, and B. Korty, Evaluation of 33 years of quantitative precipitation forecasting at the NMC, Weather Forecasting, 10, 498-511, 1995.

Schaake, J., and L. Larson, Ensemble streamflow prediction (ESP): Progress and research needs, in Preprints, Special Symposium on Hydrology, pp. J19-J24, Am. Meteorol. Soc., Boston, Mass., 1998.

Seo, D.-J., and B. Finnerty, Simulation of precipitation fields in space and time from probabilistic quantitative precipitation forecast, in Preprints, 14th Conference on Probability and Statistics in the Atmospheric Sciences, pp. 140-141, Am. Meteorol. Soc., Boston, Mass., 1998.

Toth, Z., and E. Kalnay, Ensemble forecasting at NMC: The generation of perturbations, Bull. Am. Meteorol. Soc., 74(12), 2317-2330, 1993.

Tracton, M. S., and E. Kalnay, Operational ensemble prediction at the National Meteorological Center: Practical aspects, Weather Forecasting, 8, 379-398, 1993.

R. Krzysztofowicz, Department of Systems Engineering and Division of Statistics, University of Virginia, Thornton Hall, SE, Charlottesville, VA 22903. ([email protected])

(Received September 16, 1998; revised March 19, 1999; accepted March 23, 1999.)