Slide 1 - University of Iowa

Data mining issues on improving the
accuracy of the rainfall-runoff model for
flood forecasting
Jia Liu
Supervisor: Dr. Dawei Han
Email: [email protected]
WEMRC, Department of Civil Engineering
University of Bristol
24 May 2010
Outline
Introduction to the Probability Distributed Model (PDM)
Two data mining issues:
• Selection of data for model calibration
• Optimal data time interval in flood forecasting
Conclusions and Future work
Introduction to rainfall-runoff model
Hydrological cycle: Rainfall (and Evaporation) → Rainfall-Runoff Model → Runoff
Rainfall-runoff model
• A conceptual representation of the hydrological cycle
• The foundation of most water research, e.g., real-time flood forecasting, land-use change evaluation and the design of hydraulic structures
Introduction to rainfall-runoff model
Hydrological cycle
Probability Distributed Model (PDM) by Moore (1985)
13 model parameters to be calibrated: fc, Td, cmin, cmax, b, be, kg, bg, St, k1, k2, kb, qc
How to cope with the ‘data rich’ environment?
Data: large quantity + fast sampling rate
Questions proposed:
A. How to select the most appropriate data to calibrate the model?
1. How long should the data be? (Data length)
2. From which period should the data be selected? (Data duration)
B. When used for forecasting, what is the most appropriate sampling rate? (Data time interval)
Calibration data selection: data length and duration
The data used for model validation are often determined in advance.
We assume that the more similar the calibration data are to the validation data, the better the rainfall-runoff model should perform after calibration.
Good information quality of the calibration data set = information content similar to that of the validation data set
[Figure: Comparison of the information quality of the two data sets (calibration data set and validation data set); axis units mm and m3/s.]
Calibration data selection: data length and duration
An index that reveals the similarity between the calibration and validation data sets can be used as a guide for calibration data selection for the rainfall-runoff model.

• Flow Duration Curve
• Fast Fourier Transform
• Discrete Wavelet Decomposition
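As an illustration of the first index in the list above, the following is a minimal sketch of how two flow duration curves could be compared. The function names, the Weibull plotting position and the root-mean-square distance are assumptions made for this example; the slides do not specify how the curves are compared.

```python
import numpy as np

def flow_duration_curve(flow):
    """Return (exceedance_probability, flow sorted from largest to smallest)."""
    q = np.sort(np.asarray(flow, dtype=float))[::-1]
    n = q.size
    # Weibull plotting position (an assumption; other formulas exist)
    p = np.arange(1, n + 1) / (n + 1)
    return p, q

def fdc_distance(flow_a, flow_b, probs=None):
    """RMS difference between two FDCs at common exceedance probabilities.

    This distance is a hypothetical similarity measure for illustration only.
    """
    if probs is None:
        probs = np.linspace(0.05, 0.95, 19)
    pa, qa = flow_duration_curve(flow_a)
    pb, qb = flow_duration_curve(flow_b)
    # np.interp needs increasing x values; the exceedance probabilities are
    qa_i = np.interp(probs, pa, qa)
    qb_i = np.interp(probs, pb, qb)
    return float(np.sqrt(np.mean((qa_i - qb_i) ** 2)))

# Synthetic flows (m3/s), used only to exercise the functions
rng = np.random.default_rng(0)
calibration = rng.gamma(shape=2.0, scale=3.0, size=500)
validation = rng.gamma(shape=2.0, scale=3.5, size=300)
print(fdc_distance(calibration, validation))
```

A smaller distance would indicate that the candidate calibration period has a flow regime closer to that of the validation period.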
Information Cost Function (ICF)
The Information Cost Function (ICF) is an entropy-like function that gives a good estimate of the degree of disorder of a system.
Energy of approximation: E_j = \sum_k S_{kj}^2
Energy of detail: E_j = \sum_k C_{kj}^2
Percentile energy on each decomposition level: P_j = E_j / \sum_j E_j
ICF = -\sum_j P_j \ln P_j
Liu, J., and D. Han (2010), Indices for calibration data selection of the rainfall-runoff model, Water Resour. Res., 46, W04512, doi:10.1029/2009WR008668.
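The ICF definition above can be sketched in a few lines. This example assumes a Haar wavelet and pools the detail energy of each level with the energy of the final approximation into one distribution; both choices, and the function names, are assumptions for illustration and may differ from the cited paper.

```python
import numpy as np

def haar_dwt_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:                 # drop the last sample if the length is odd
        x = x[:-1]
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def icf(signal, levels):
    """Entropy-like ICF = -sum_j P_j ln P_j over the energy per level."""
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        if approx.size < 2:
            raise ValueError("signal too short for the requested levels")
        approx, detail = haar_dwt_step(approx)
        energies.append(np.sum(detail ** 2))   # energy of detail on this level
    energies.append(np.sum(approx ** 2))       # energy of the final approximation
    e = np.array(energies)
    if e.sum() == 0.0:
        raise ValueError("signal has zero energy")
    p = e / e.sum()                            # percentile energy P_j
    p = p[p > 0]                               # treat 0 * ln 0 as 0
    return float(-np.sum(p * np.log(p)))

# Synthetic series, used only to exercise the function
rng = np.random.default_rng(1)
series = rng.gamma(shape=2.0, scale=1.0, size=256)
print(icf(series, levels=4))
```

Under these assumptions, a series whose energy sits on a single level gives an ICF of zero, and energy spread evenly over the levels gives the largest value.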
Optimal data time interval – for the forecast mode
Sampling rate of model input data
• Too slow: sampling theory gives a lower boundary on the sampling rate, f_s \geq 2B
• Too fast: leads to numerical problems [Åström, 1968; Ljung, 1989]
• The optimal time interval lies between the two
[Figure: Hypothetical curve of model error (Z) against data time interval (X) and forecast lead time (Y), from short lead time to long lead time; the optimal data time interval shows a positive relation with the forecast lead time.]
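The sampling-theory boundary f_s ≥ 2B translates into a longest admissible time interval of 1/(2B). The sketch below computes that bound and aggregates a fine-resolution series to a coarser time interval, which is one way candidate intervals could be generated for comparison. The function names and the choice of summing (for rainfall depths) or averaging (for flows) are assumptions for this example.

```python
import numpy as np

def max_time_interval(bandwidth):
    """Longest sampling interval allowed by f_s >= 2B (units: 1 / bandwidth)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return 1.0 / (2.0 * bandwidth)

def aggregate(series, factor, how="sum"):
    """Coarsen a series by an integer factor.

    how="sum" suits rainfall depths, how="mean" suits flows (an assumption).
    An incomplete trailing block is dropped.
    """
    if how not in ("sum", "mean"):
        raise ValueError("how must be 'sum' or 'mean'")
    x = np.asarray(series, dtype=float)
    n = (x.size // factor) * factor
    blocks = x[:n].reshape(-1, factor)
    return blocks.sum(axis=1) if how == "sum" else blocks.mean(axis=1)

# Example: 15-minute rainfall depths (mm) aggregated to 30 minutes
rain_15min = [0.0, 0.2, 1.0, 0.4, 0.0]
print(aggregate(rain_15min, 2, how="sum"))
```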
Optimal data time interval – for the forecast mode
Case study
• Auto-Regressive Moving Average (ARMA) model for on-line updating
• Four catchments are selected from Southwest England:

Catchment        AREA (km2)   LDP (km)   DPSBAR (m/km)
A Bellever       21.5         13.5       94.9
B Halsewater     87.8         19.4       85.7
C Brue           135.2        22.6       71.1
D Bishop_Hull    202.0        40.2       98.0

LDP: longest drainage path (km)
DPSBAR: mean drainage path slope (m/km)
[Figure: Location maps of the four catchments (Bellever, Halsewater, Brue, Bishop_Hull) with latitude/longitude grids.]
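To give a rough idea of on-line updating with a time-series error model, the sketch below fits a purely autoregressive model to past simulation errors by least squares and adds the predicted errors to the simulated flows. It is a simplification: the moving-average part of ARMA is omitted, and the function names, model order and error definition (observed minus simulated) are assumptions for this example, not details given in the slides.

```python
import numpy as np

def fit_ar(errors, order):
    """Least-squares AR(order) coefficients for an error series."""
    e = np.asarray(errors, dtype=float)
    # Column k holds e[t-1-k] for every target e[t], t = order .. n-1
    X = np.column_stack(
        [e[order - k - 1 : e.size - k - 1] for k in range(order)]
    )
    y = e[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def forecast_errors(errors, coeffs, lead):
    """Iterate the AR recursion 'lead' steps ahead from the latest errors."""
    order = len(coeffs)
    history = list(np.asarray(errors, dtype=float)[-order:])
    out = []
    for _ in range(lead):
        recent_first = history[::-1][:order]       # e[t-1], e[t-2], ...
        nxt = float(np.dot(coeffs, recent_first))
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

def update_forecast(simulated, errors, order=2, lead=3):
    """Add predicted errors to the first 'lead' simulated flows."""
    coeffs = fit_ar(errors, order)
    sim = np.asarray(simulated, dtype=float)[:lead]
    return sim + forecast_errors(errors, coeffs, lead)

# Past errors (observed minus simulated, m3/s) that decay geometrically
past_errors = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25]
print(update_forecast([10.0, 10.0, 10.0], past_errors, order=1, lead=2))
```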
Optimal data time interval – for the forecast mode
Case study
• The positive pattern between the optimal data time interval and the forecast lead time is found to be highly related to the catchment concentration time.
[Figure: Surface plots of Z against X and Y for the four catchments (Bellever, Halsewater, Brue, Bishop_Hull), shown alongside the table of catchment characteristics (AREA, LDP, DPSBAR).]
Conclusions and Future work
Selecting data with the most appropriate length, duration and time interval is of great
significance in improving the model performance and helps to enhance the efficiency of
data utilization in rainfall-runoff modelling and forecasting.
More research is needed to explore the applicability of the ICF index for calibration data
selection and to verify the hypothetical curve of the optimal data time interval.
[Diagram: Weather Research & Forecasting (WRF) Model → Rainfall (and Evaporation), as real-time inputs, updated by observations → Rainfall-Runoff Model → Runoff]
The End
Thank you for your attention!