Data mining issues on improving the accuracy of the rainfall-runoff model for flood forecasting
Jia Liu
Supervisor: Dr. Dawei Han
Email: [email protected]
WEMRC, Department of Civil Engineering, University of Bristol
24 May 2010
How to cope with the ‘data rich’ environment?
Data: large quantity + fast sampling rate
Questions proposed:
A. How to select the most appropriate data to calibrate the model?
1. How long should the data be? (Data length)
2. Which period should the data be selected from? (Data duration)
B. When used for forecasting, what is the most appropriate sampling rate? (Data time interval)
Calibration data selection: data length and duration
The data used for model validation are usually predetermined.
We assume that the more similar the calibration data are to the validation data, the better the rainfall-runoff model should perform after calibration.
[Figure: comparison of the information quality of the two data sets. Panels: validation data set and calibration data set; axes in m3/s and mm.]
A good information quality of the calibration data set = a similar information content to the validation data set
Calibration data selection: data length and duration
An index that reveals the similarity between the calibration and validation data sets can be used as a guide for calibration data selection for the rainfall-runoff model.
Information Cost Function (ICF)
$E_j = \sum_k C_{j,k}^2$
$E_j = \sum_k S_{j,k}^2$
$P_j = E_j / \sum_j E_j$
$\mathrm{ICF} = -\sum_j P_j \ln P_j$
The Information Cost Function (ICF) is an entropy-like function that gives a good estimate of the degree of disorder of a system.
Liu, J., and D. Han (2010), Indices for calibration data selection of the rainfall-runoff model, Water Resour. Res., 46, W04512, doi:10.1029/2009WR008668.
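The ICF calculation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the coefficients are already available as one list per level j, takes the energy of a level as the sum of its squared coefficients, normalises the energies to P_j, and returns the entropy-like sum with a negative sign. The function names `level_energies`, `icf` and `closest_candidate` are hypothetical, and choosing the candidate whose ICF is nearest to that of the validation set is only one plausible reading of how the index guides selection.

```python
import math


def level_energies(coeffs_by_level):
    """Energy of each level: E_j = sum over k of c_{j,k}^2."""
    return [sum(c * c for c in level) for level in coeffs_by_level]


def icf(coeffs_by_level):
    """Entropy-like index: ICF = -sum_j P_j ln P_j, with P_j = E_j / sum_j E_j."""
    energies = level_energies(coeffs_by_level)
    total = sum(energies)
    if total == 0:
        raise ValueError("all coefficients are zero, so P_j is undefined")
    probs = [e / total for e in energies]
    # Levels with zero energy contribute nothing (p ln p tends to 0 as p tends to 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def closest_candidate(validation_icf, candidate_icfs):
    """Assumed selection rule: pick the candidate whose ICF is nearest the validation ICF."""
    return min(candidate_icfs, key=lambda name: abs(candidate_icfs[name] - validation_icf))
```

Energy spread evenly over N levels gives the largest value, ln N, while energy concentrated in a single level gives 0, which matches the reading of the ICF as a measure of disorder.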
[Figure: model error plotted against forecast lead time and data time interval (axes X, Y, Z), with error versus time interval sections X1–Z1 and XN–ZN for long and short lead times.]
Optimal data time interval for the forecast mode
Optimal time interval
Sampling theory, lower boundary: $f_s \ge 2B$
Sampling rate of model input data: too slow, or too fast (leading to numerical problems)
Hypothetical curve: a positive relation between the data time interval and the forecast lead time
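The lower boundary from sampling theory translates directly into a largest admissible data time interval. The sketch below assumes that f_s is the sampling frequency of the input data and B is the highest frequency present in the signal, both in hertz, so that the interval must not exceed 1/(2B). The function names are hypothetical.

```python
def max_time_interval(bandwidth_hz):
    """Largest time interval (seconds) that still satisfies f_s >= 2B."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return 1.0 / (2.0 * bandwidth_hz)


def satisfies_lower_boundary(time_interval_s, bandwidth_hz):
    """True if sampling every time_interval_s seconds gives f_s >= 2B."""
    if time_interval_s <= 0:
        raise ValueError("time interval must be positive")
    sampling_frequency_hz = 1.0 / time_interval_s
    return sampling_frequency_hz >= 2.0 * bandwidth_hz
```

For example, if the fastest fluctuation of interest has a period of two hours (B = 1/7200 Hz), the data time interval should be no longer than about one hour. The criterion says nothing about the "too fast" side, which the slide attributes to numerical problems.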
Optimal data time interval for the forecast mode
Case study
Auto-Regressive Moving Average (ARMA) model for on-line updating
Four catchments are selected from Southwest England:

Catchment         AREA (km²)   LDP (km)   DPSBAR (m/km)
A  Bellever          21.5        13.5         94.9
B  Halsewater        87.8        19.4         85.7
C  Brue             135.2        22.6         71.1
D  Bishop_Hull      202.0        40.2         98.0

LDP: longest drainage path (km)
DPSBAR: mean drainage path slope (m/km)
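The slides name an ARMA model for on-line updating but do not give its form. One common way to use such a model, shown here only as an assumed sketch, is to fit it to the recent errors of the rainfall-runoff model (observed minus simulated flow) and add the predicted errors to the simulated flows over the forecast lead time. The ARMA(1,1) order, the coefficients `phi` and `theta`, and both function names are hypothetical choices made for illustration.

```python
def arma11_error_forecast(errors, phi, theta, lead_steps):
    """Forecast future model errors with e_t = phi*e_{t-1} + theta*a_{t-1} + a_t.

    errors: past errors (observed minus simulated flow), oldest first.
    Returns predicted errors for 1..lead_steps steps ahead.
    """
    if not errors:
        raise ValueError("at least one past error is required")
    innovation = 0.0      # a_{t-1}, assumed zero before the record starts
    previous_error = 0.0  # e_{t-1}, assumed zero before the record starts
    for e in errors:
        # Recover the innovation a_t implied by each observed error.
        innovation = e - phi * previous_error - theta * innovation
        previous_error = e
    forecasts = []
    # One step ahead uses the last error and the last innovation.
    next_error = phi * previous_error + theta * innovation
    for _ in range(lead_steps):
        forecasts.append(next_error)
        # Further ahead, future innovations have zero expectation.
        next_error = phi * next_error
    return forecasts


def updated_forecast(simulated_flows, errors, phi, theta):
    """Add the predicted errors to the simulated flows over the lead time."""
    corrections = arma11_error_forecast(errors, phi, theta, len(simulated_flows))
    return [q + c for q, c in zip(simulated_flows, corrections)]
```

With |phi| < 1 the correction decays as the lead time grows, so the updating has most of its effect at short lead times.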
[Figure: location maps of the four catchments: Bellever, Halsewater, Brue and Bishop_Hull.]
Optimal data time interval for the forecast mode
Case study
The positive pattern between the optimal data time interval and the forecast lead time is found to be highly related to the catchment concentration time.
[Figure: four 3D surface plots (axes X, Y, Z), one per catchment: Bellever, Halsewater, Brue and Bishop_Hull.]
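Reading an optimal data time interval off results like these amounts to picking, for each forecast lead time, the interval with the smallest model error, then checking whether that interval grows with the lead time. The sketch below assumes the results are held as a nested dictionary `errors[lead_time][time_interval]`. The data layout and function names are hypothetical, and no values from the case study are reproduced.

```python
def optimal_intervals(errors):
    """For each lead time, return the data time interval with the smallest error.

    errors: {lead_time: {time_interval: model_error}}
    """
    return {
        lead: min(by_interval, key=by_interval.get)
        for lead, by_interval in errors.items()
    }


def is_positive_relation(optima):
    """True if the optimal interval never decreases as the lead time increases."""
    ordered = [optima[lead] for lead in sorted(optima)]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))
```

The result can be compared with the hypothetical curve, which suggests that longer lead times favour longer data time intervals.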
Conclusions and Future work
Selecting data with the most appropriate length, duration and time interval is of great significance in improving the model performance and helps to enhance the efficiency of data utilization in rainfall-runoff modelling and forecasting.
More research is needed to explore the applicability of the ICF index for calibration data selection and to verify the hypothetical curve of the optimal data time interval.
[Diagram: the Weather Research & Forecasting (WRF) Model supplies rainfall (and evaporation) as real-time inputs to the rainfall-runoff model, which produces runoff. Label: updated by observations.]
The End
Thank you for your attention!