2 - University of Colorado Boulder


A Stochastic Nonparametric Framework
for Ensemble Hydrologic Forecast and
Downscaling
Balaji Rajagopalan
Department of Civil and Environmental Engg.
University of Colorado
Boulder, CO
IRI / Lamont – Aug 2003
Acknowledgements
James Prairie, Katrina Grantz, Somkiat
Apipattanavis, Nkrintra Singhrattna
Subhrendu Gangopadhyay, Martyn Clark
CIRES/University of Colorado, Boulder, CO
Upmanu Lall
(Lamont-Doherty Earth Observatory)
Columbia University, NY
David Yates
NCAR/University of Colorado, Boulder, CO
A Water Resources Management Perspective
Decision Analysis: Risk + Values
Time horizon, from inter-decadal (climate) down to hours (weather):
• Facility Planning
– Reservoir, Treatment Plant Size
• Policy + Regulatory Framework
– Flood Frequency, Water Rights, 7Q10 flow
• Operational Analysis
– Reservoir Operation, Flood/Drought Preparation
• Emergency Management
– Flood Warning, Drought Response
Data: Historical, Paleo, Scale, Models
Ensemble Forecast (or Scenario Generation)
• Scenarios (synthetic sequences) of hydroclimate are simulated for various decision making situations
– Reservoir operations (USBR/RiverWare)
– Erosion prediction (USDA/WEPP)
– Reservoir sizing (flood frequency)
• Given [Yt], t = 1,2,…,N, a hydroclimate time series (e.g. daily weather variables, streamflow, etc.)
– Parametric models are fit (probability density functions: Gamma, Exponential, etc.)
– Time series models (Auto Regressive models)
Hydrologic Forecasting
• Conditional statistics of future state, given current state
• Current state: $D_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-d_1\tau}, y_t, y_{t-\tau}, y_{t-2\tau}, \ldots, y_{t-d_2\tau})$
• Future state: $x_{t+T}$
• Forecast: $g(x_{t+T}) = f(D_t)$
– where g(.) is a function of the future state, e.g., mean or pdf
– and f(.) is a mapping of the dynamics represented by $D_t$ to g(.)
– Challenges
• Composition of $D_t$
• Identify g(.) given $D_t$ and model structure
– For nonlinear f(.), nonparametric function estimation methods are used
• K-nearest neighbor
• Local Regression
• Regression Splines
• Neural Networks
The Problem
• Ensemble Forecast / Stochastic Simulation / Scenario generation: all of them are conditional probability density function problems

$$f(y_t \mid y_{t-1}, y_{t-2}, \ldots, y_{t-p}) = \frac{f(y_t, y_{t-1}, y_{t-2}, \ldots, y_{t-p})}{\int f(y_t, y_{t-1}, y_{t-2}, \ldots, y_{t-p})\, dy_t}$$

• Estimate conditional PDF and simulate (Monte Carlo, or Bootstrap)
Parametric Models
• Periodic Auto Regressive model (PAR)
– Linear lag(1) model, for year $\nu$ and season $\tau$:
$y_{\nu,\tau} = \mu_\tau + \phi_{1,\tau}\,(y_{\nu,\tau-1} - \mu_{\tau-1}) + \varepsilon_{\nu,\tau}$
– Stochastic Analysis, Modeling, and Simulation (SAMS) (Salas, 1992)
• Data must fit a Gaussian distribution
• Expected to preserve
– mean, standard deviation, lag(1) correlation
– skew dependent on transformation
– Gaussian probability density function
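A minimal sketch of a lag(1) periodic autoregressive simulation, assuming monthly data already close to Gaussian. It works with standardized values, which is equivalent to the form above with $\phi_{1,\tau} = \rho_\tau \sigma_\tau / \sigma_{\tau-1}$. The function names and the synthetic record are illustrative assumptions, not SAMS itself.

```python
import numpy as np

def fit_par1(flows):
    """Seasonal means, standard deviations and lag-1 correlations.

    flows: array (n_years, n_seasons), assumed roughly Gaussian.
    """
    flows = np.asarray(flows, dtype=float)
    mu = flows.mean(axis=0)
    sd = flows.std(axis=0, ddof=1)
    z = (flows - mu) / sd
    n_seasons = flows.shape[1]
    rho = np.empty(n_seasons)
    for tau in range(n_seasons):
        if tau == 0:
            # the first season follows the last season of the previous year
            prev, cur = z[:-1, -1], z[1:, 0]
        else:
            prev, cur = z[:, tau - 1], z[:, tau]
        rho[tau] = np.corrcoef(prev, cur)[0, 1]
    return mu, sd, rho

def simulate_par1(mu, sd, rho, n_years, rng):
    """Simulate n_years of seasonal values from the fitted PAR(1)."""
    n_seasons = len(mu)
    sim = np.empty((n_years, n_seasons))
    z_prev = rng.standard_normal()
    for nu in range(n_years):
        for tau in range(n_seasons):
            noise = np.sqrt(1.0 - rho[tau] ** 2) * rng.standard_normal()
            z_prev = rho[tau] * z_prev + noise
            sim[nu, tau] = mu[tau] + sd[tau] * z_prev
    return sim

# Hypothetical 60-year monthly record (illustration only)
rng = np.random.default_rng(0)
seasonal_cycle = 10.0 + 5.0 * np.sin(2 * np.pi * np.arange(12) / 12)
record = seasonal_cycle + rng.standard_normal((60, 12))
mu, sd, rho = fit_par1(record)
synthetic = simulate_par1(mu, sd, rho, n_years=500, rng=rng)
```

Because the noise is Gaussian, the simulated marginal distributions are Gaussian too, which is the limitation the following slides discuss.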
Parametric Models - Drawbacks
• Model selection / parameter estimation issues
– Select a model (PDFs or time series models) from candidate models
– Estimate parameters
• Limited ability to reproduce nonlinearity and non-Gaussian features
– All the parametric probability distributions are ‘unimodal’
– All the parametric time series models are ‘linear’
All India Monthly Rainfall
Parametric Models - Drawbacks
• Models are fit on the entire data set
– Outliers can inordinately influence parameter estimation (e.g. a few outliers can influence the mean, variance)
– In a mean squared error sense the models are optimal, but locally they can be very poor
– Not flexible
• Not portable across sites
Nonparametric Methods
• Any functional (probability density, regression etc.) estimator is nonparametric if:
– It is “local”: the estimate at a point depends only on a few neighbors around it (effect of outliers is removed)
– No prior assumption of the underlying functional form is made; it is data driven
Nonparametric Methods
• Kernel Estimators
(properties well studied)
• Splines
• Multivariate Adaptive Regression Splines (MARS)
• K-Nearest Neighbor (K-NN) Bootstrap Estimators
• Locally Weighted Polynomials (K-NN Polynomials)
K-NN Philosophy
• Find K-nearest neighbors to the desired point x
• Resample the K historical neighbors (with high probability to the nearest neighbor and low probability to the farthest) → Ensembles
• Weighted average of the neighbors → Mean Forecast
• Fit a polynomial to the neighbors – Weighted Least Squares
– Use the fit to estimate the function at the desired point x (i.e. local regression)
• Number of neighbors K and the order of polynomial p are obtained using GCV (Generalized Cross Validation); K = N and p = 1 → linear modeling framework.
K-Nearest Neighbor Estimators
A k-nearest neighbor density estimate:
$$f_{NN}(x) = \frac{k/n}{V_k(x)} = \frac{k/n}{c_d\, r_k^d(x)}$$

$$f_{GNN}(x) = \frac{1}{r_k^d(x)\, n} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{r_k(x)}\right)$$

A conditional k-nearest neighbor density estimate:
$$f_{GNN}(x \mid D) = \frac{f(x, D)}{f(D)} = \frac{(r_k(x, D)\, n)^{-1} \sum_{i=1}^{n} K\!\left(\frac{(x, D) - (x_i, D_i)}{r_k(x, D)}\right)}{(r_k(D)\, n)^{-1} \sum_{i=1}^{n} K\!\left(\frac{D - D_i}{r_k(D)}\right)}$$

f(.) is continuous on $R^d$, locally Lipschitz of order p; $k(n) = O(n^{2p/(d+2p)})$

A k-nearest neighbor (modified Nadaraya-Watson) conditional mean estimate, with kernel conditions $K(u) \ge 0$, $\int u K(u)\,du = 0$, $\int u^2 K(u)\,du < \infty$:
$$\hat{m}(x \mid D = D^*) = \frac{\sum_{i=1}^{n} x_i\, K\!\left(\frac{D^* - D_i}{r_k(D^*)}\right)}{\sum_{i=1}^{n} K\!\left(\frac{D^* - D_i}{r_k(D^*)}\right)}$$
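A minimal sketch of the k-nearest neighbor density estimate and the Nadaraya-Watson style conditional mean, assuming Euclidean distance and an Epanechnikov kernel (the slides do not name a kernel, so that choice is an assumption). Function names and test data are illustrative.

```python
import numpy as np
from math import gamma, pi

def knn_density(x0, data, k):
    """f_NN(x0) = (k/n) / (c_d * r_k^d), r_k = distance to the k-th neighbor."""
    data = np.asarray(data, dtype=float).reshape(len(data), -1)
    n, d = data.shape
    r_k = np.sort(np.linalg.norm(data - x0, axis=1))[k - 1]
    c_d = pi ** (d / 2) / gamma(d / 2 + 1)  # volume of the unit ball in d dimensions
    return (k / n) / (c_d * r_k ** d)

def knn_conditional_mean(d_star, D, x, k):
    """Kernel-weighted mean of x over the k nearest neighbors of d_star in D."""
    D = np.asarray(D, dtype=float).reshape(len(D), -1)
    x = np.asarray(x, dtype=float)
    dist = np.linalg.norm(D - d_star, axis=1)
    r_k = np.sort(dist)[k - 1]          # bandwidth adapts to the local data density
    u = dist / r_k
    # Epanechnikov kernel (assumed); zero outside the k-neighborhood
    weights = np.where(u < 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    return np.sum(weights * x) / np.sum(weights)

# Hypothetical data: D uniform on [0, 1], x = 2 D + noise
rng = np.random.default_rng(1)
D_obs = rng.uniform(0.0, 1.0, 2000)
x_obs = 2.0 * D_obs + 0.1 * rng.standard_normal(2000)
density_at_half = knn_density(0.5, D_obs, k=50)
mean_at_half = knn_conditional_mean(0.5, D_obs, x_obs, k=50)
```

For this test data the density should come out near 1 and the conditional mean near 1.0, up to sampling noise.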
Classical Bootstrap (Efron):
Given $x_1, x_2, \ldots, x_n$ are i.i.d. random variables with a cdf F(x)
Construct the empirical cdf $\hat{F}(x) = \sum_{i=1}^{n} I(x_i \le x) / n$
Draw a random sample with replacement of size n from $\hat{F}(x)$
Moving Block Bootstrap (Kunsch, Hall, Liu & Singh):
Resample independent blocks of length b<n, and paste them together to form a series of length n
k-Nearest Neighbor Conditional Bootstrap (Lall and Sharma, 1996)
Construct the Conditional Empirical Distribution Function:
$\hat{F}(x \mid D^*) = \sum_{i=1}^{n} I(x_i \le x)\, I(D_i \in B_{r_k}(D^*))\, K(i) / k$
Draw a random sample with replacement from $\hat{F}(x \mid D^*)$
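The first two resampling schemes can be sketched in a few lines. This is an illustrative sketch only; function names are assumptions, and the block bootstrap here draws overlapping blocks with uniformly random start positions.

```python
import numpy as np

def ecdf(x, sample):
    """Empirical cdf: fraction of sample values less than or equal to x."""
    return float(np.mean(np.asarray(sample) <= x))

def classical_bootstrap(sample, rng):
    """Draw n values with replacement from the sample (Efron)."""
    sample = np.asarray(sample)
    return sample[rng.integers(0, len(sample), size=len(sample))]

def moving_block_bootstrap(series, b, rng):
    """Paste randomly chosen blocks of length b into a series of length n."""
    series = np.asarray(series)
    n = len(series)
    n_blocks = -(-n // b)  # ceiling division
    starts = rng.integers(0, n - b + 1, size=n_blocks)
    blocks = [series[s:s + b] for s in starts]
    return np.concatenate(blocks)[:n]

rng = np.random.default_rng(0)
series = np.arange(100)
iid_resample = classical_bootstrap(series, rng)
block_resample = moving_block_bootstrap(series, b=10, rng=rng)
```

Within each block the original ordering, and hence short-range dependence, is kept; dependence across block boundaries is lost, which is one motivation for the conditional k-nearest neighbor bootstrap.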
Logistic Map Example
A time series from the model $x_{t+1} = 1 - 4(x_t - 0.5)^2$
[Figure: one panel shows values of $x_t$ against time (0 to 125) with a 4-state Markov chain discretization (states 1 to 4, boundaries at 0.25, 0.5, 0.75); the other shows $x_{t+1}$ against $x_t$ with the k-nearest neighborhoods A and B for $x_t = x^*_A$ and $x^*_B$ respectively.]
Define the composition of the "feature vector" $D_t$ of dimension d.
(1) Dependence on two prior values of the same time series.
$D_t : (x_{t-1}, x_{t-2})$; d = 2
(2) Dependence on multiple time scales (e.g., monthly+annual)
$D_t : (x_{t-\tau_1}, x_{t-2\tau_1}, \ldots, x_{t-M_1\tau_1}; x_{t-\tau_2}, x_{t-2\tau_2}, \ldots, x_{t-M_2\tau_2})$; d = M1 + M2
(3) Dependence on multiple variables and time scales
$D_t : (x1_{t-\tau_1}, \ldots, x1_{t-M_1\tau_1}; x2_t, x2_{t-\tau_2}, \ldots, x2_{t-M_2\tau_2})$; d = M1 + M2 + 1
Identify the k nearest neighbors of $D_t$ in the data $D_1 \ldots D_n$
Define the kernel function (derived by taking expected values of distances to each of k nearest neighbors, assuming the number of observations of D in a neighborhood $B_r(D^*)$ of $D^*$, $r \to 0$ as $n \to \infty$, is locally Poisson, with rate $\lambda(D^*)$)
$K(j) = \dfrac{1/j}{\sum_{i=1}^{k} 1/i}$, j = 1...k, for the jth nearest neighbor
Selection of k: GCV, FPE, Mutual Information, or rule of thumb ($k = n^{0.5}$)
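The steps above can be sketched for the simplest feature vector, $D_t = (x_t)$ with d = 1, applied to the logistic map series. This is a minimal illustration under those assumptions, using the 1/j kernel and the rule of thumb k = n^0.5; the function names are hypothetical.

```python
import numpy as np

def neighbor_weights(k):
    """K(j) = (1/j) / sum_{i=1..k} (1/i) for the j-th nearest neighbor."""
    w = 1.0 / np.arange(1, k + 1)
    return w / w.sum()

def knn_bootstrap(series, n_sim, rng, k=None):
    """Simulate a sequence by resampling successors of nearest neighbors."""
    x = np.asarray(series, dtype=float)
    D, successor = x[:-1], x[1:]     # feature D_t = x_t, successor x_{t+1}
    n = len(D)
    k = k or int(round(np.sqrt(n)))  # rule of thumb k = n^0.5
    w = neighbor_weights(k)
    sim = np.empty(n_sim)
    current = x[rng.integers(0, n)]
    for t in range(n_sim):
        nearest = np.argsort(np.abs(D - current))[:k]  # k nearest, closest first
        pick = nearest[rng.choice(k, p=w)]             # nearer neighbors more likely
        current = successor[pick]
        sim[t] = current
    return sim

# Logistic map series x_{t+1} = 1 - 4 (x_t - 0.5)^2
series = np.empty(500)
series[0] = 0.3
for t in range(499):
    series[t + 1] = 1.0 - 4.0 * (series[t] - 0.5) ** 2

rng = np.random.default_rng(0)
sim = knn_bootstrap(series, n_sim=200, rng=rng)
```

Every simulated value is a value from the historical record; only the ordering is new. That limitation is what the residual-resampling variant later in the talk addresses.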
Applications to date….
• Monthly Streamflow Simulation
• Space and time disaggregation of monthly to daily streamflow
• Monte Carlo Sampling of Spatial Random Fields
• Probabilistic Sampling of Soil Stratigraphy from Cores
• Hurricane Track Simulation
• Multivariate, Daily Weather Simulation
• Downscaling of Climate Models
• Ensemble Forecasting of Hydroclimatic Time Series
• Biological and Economic Time Series
• Exploration of Properties of Dynamical Systems
• Extension to Nearest Neighbor Block Bootstrapping - Yao and Tong
K-NN Local Polynomial
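K-NN Algorithm: $k = \sqrt{N}$ (e.g. $\sqrt{90} \approx 9$)
[Figure: local polynomial fit of $y_t$ against $y_{t-1}$, giving the mean estimate $y_t^*$]
Residual Resampling: $y_t = y_t^* + e_t^*$
[Figure: residual $e_t^*$ resampled from the neighbors and added to $y_t^*$, plotted against $y_{t-1}$]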
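A minimal sketch of the local polynomial plus residual resampling idea, $y_t = y_t^* + e_t^*$, for a single predictor. It assumes a local linear fit (p = 1) with tricube weights and the 1/j neighbor weights for the residual draw; these specifics and all names are assumptions made for illustration.

```python
import numpy as np

def local_linear_fit(x0, x, y, k):
    """Weighted least squares line through the k nearest neighbors of x0."""
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:k]
    u = dist[idx] / dist[idx[-1]]
    w = (1.0 - u ** 3) ** 3  # tricube weights (assumed)
    # np.polyfit multiplies residuals by w before squaring, so pass sqrt(w)
    coef = np.polyfit(x[idx] - x0, y[idx], deg=1, w=np.sqrt(w))
    return coef[-1]          # intercept = fitted value at x0

def modified_knn_ensemble(x0, x, y, k, n_ens, rng):
    """Mean forecast from the local fit plus residuals resampled from neighbors."""
    fitted = np.array([local_linear_fit(xi, x, y, k) for xi in x])
    residuals = y - fitted
    y_star = local_linear_fit(x0, x, y, k)
    nearest = np.argsort(np.abs(x - x0))[:k]
    w = 1.0 / np.arange(1, k + 1)
    w /= w.sum()
    e_star = residuals[nearest[rng.choice(k, size=n_ens, p=w)]]
    return y_star + e_star

# Hypothetical data: N = 90 points on a nonlinear relationship
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 1.0, 90))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(90)
k = int(round(np.sqrt(len(x))))  # k = sqrt(N), about 9
ensemble = modified_knn_ensemble(0.25, x, y, k, n_ens=200, rng=rng)
```

Unlike the plain K-NN bootstrap, the ensemble members here can take values not present in the historical record, because a resampled residual is added to a fitted mean.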
Applications
K-NN Bootstrap
• Weather Generation – Erosion Prediction (Luce et al.,
2003; Yates et al., 2003)
• Precipitation/Temperature Downscaling (Clark et al., 2003)
Local-Polynomial + K-NN residual bootstrap
• Ensemble Streamflow forecasting
Truckee-Carson basin, NV (Grantz et al., 2002, 2003)
• Ensemble forecast from categorical probabilistic
forecast
Local Polynomial
• Flood Frequency Analysis (Apipattanavis et al., 2003)
Is a 2 State Markov Chain Adequate?
Can the lag-0 and lag-1 dependence across variables be easily preserved?
Are multi-scale statistics preserved by the Daily Model?
Our current implementation uses moving window seasons
Rajagopalan and Lall, 1999
[Figure: January-March Daily Weather, Salt Lake City - Wet Days]
[Figure: January-March Daily Weather, Salt Lake City - Dry Days]
[Figure panels: mean wet spell length; fraction of wet days; standard deviation of wet spell length; longest wet spell length; mean dry spell length; fraction of dry days; standard deviation of dry spell length; longest dry spell length]
[Figure: k-nn daily simulations of precipitation, performance in terms of aggregated statistics. Panels: mean seasonal precipitation; variance of seasonal precipitation; annual mean; annual variance]
[Figure: lag 0 cross correlation for selected daily variables, MAR-1 simulations and k-nn simulations. Variable pairs: SRAD and TMX; SRAD and TMN; SRAD and DPT; TMX and TMN; TMX and DPT; TMX and P; TMN and DPT; TMN and P; DPT and P]
Mean Annual Erosion in Kg/Sq. m.
Location   CLIGEN   BOOTCLIM   BOOTCLIM-scramble
Idaho      0.6      0.1        0.5
Oregon     5.2      1.4        3.2
Arizona    0.6      0.3        0.6
Impact of Improper Dependence Structure on Erosion Estimated
from Physical Model (WEPP) using Simulated Weather
Differences in CLIGEN/BOOTCLIM-scramble vs BOOTCLIM
are due to inability vs ability to preserve cross-correlations
between temperature and precipitation (and hence rain/snow)
Luce et al., 2003
Region
Figure 1. Map depicting the 21-state area of interest in this study. The numbers indicate stations grouped by region. The two dark filled squares in
the east are Stations 114198 and 112140 in Region 4, and the two dark squares in the west are Stations 52281 and 52662 in Region 7.
(Yates et al., 2003)
Temperature Mean and Standard Deviations
Precipitation Statistics
Lag Correlations
Spatial Correlations
Downscaling Concept
Horizontal resolution ~ 200 km
[scale mis-match]
Area of interest ~500 to 2000 km²
• Purpose: Downscale global-scale atmospheric forecasts to local scales in river basins (e.g., individual stations).
Downscaling Approach
• Identify outputs from the global-scale Numerical Weather Prediction (NWP) model that are related to precipitation and temperature in the basins of interest
– Geo-potential height, wind, humidity at five pressure levels etc.
– Various surface flux variables
– Computed variables such as vorticity advection, stability indices, etc.
– Variables lagged to account for temporal phase errors in atmospheric forecasts.
• Use NWP outputs in a statistical model to estimate precipitation and temperature for the basins
– Multiple linear regression
– K-nn
– NWS bias-correction methodology
– Local polynomial regression
– Canonical Correlation Analysis
– Artificial Neural Networks
Multiple Linear Regression (MLR) Approach
• Multiple linear regression with forward selection
Y = a0 + a1X1 + a2X2 + a3X3 + . . . + anXn + e
• Use cross-validation procedures for variable selection – typically fewer than 8 variables are selected for a given equation
• A separate equation is developed for each station, each forecast lead time, and each month.
• Stochastic modeling of the residuals in the regression equation is done to provide ensemble time series
• The ensemble members are subsequently shuffled to reconstruct the observed spatio-temporal covariability
• Regression coefficients are estimated from the period of the NCEP 1998 MRF hindcast (1979-2001)
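A minimal sketch of forward selection driven by cross-validation. The slides do not say which cross-validation scheme was used; leave-one-out error (computed from the hat matrix) is an assumption here, as are the function names and the synthetic predictors.

```python
import numpy as np

def loo_cv_mse(X, y):
    """Leave-one-out mean squared error of an ordinary least squares fit."""
    A = np.column_stack([np.ones(len(y)), X])  # intercept a0 plus predictors
    H = A @ np.linalg.pinv(A)                  # hat matrix
    resid = y - H @ y
    return float(np.mean((resid / (1.0 - np.diag(H))) ** 2))

def forward_select(X, y, max_vars=8):
    """Add one predictor at a time while the cross-validated error improves."""
    chosen = []
    best = loo_cv_mse(np.empty((len(y), 0)), y)  # intercept-only model
    while len(chosen) < max_vars:
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        if not candidates:
            break
        scores = {j: loo_cv_mse(X[:, chosen + [j]], y) for j in candidates}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break                                # no further improvement
        chosen.append(j_best)
        best = scores[j_best]
    return chosen

# Hypothetical predictors: only columns 0 and 3 carry signal
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 6))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.1 * rng.standard_normal(200)
selected = forward_select(X, y)
```

In the approach described above, one such equation would be fit per station, lead time and month, with the regression residuals then modeled stochastically to produce ensembles.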
K-nn Approach - Methodology
• Get all the NCEP MRF output variables within a 14 day window (7 days, lag+lead) centered on the current day
• Perform EOF analysis of the climate variables and retain the first few leading PCs that capture most of the variance
• ~6 PCs capture about 90% of the variance
• The space of the leading PCs becomes the “feature vector” space
• Project the forecast climate variables of the current day onto the PC space, i.e. obtain the “feature vector”
• Select the “nearest” neighbor to the “feature vector” in the PC space – hence, a day from the historical record.
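A minimal sketch of the EOF projection and nearest-neighbor selection, assuming the candidate days (already restricted to the 14 day window) are rows of a matrix of model output variables. The EOFs are computed by SVD of the standardized data; all names and the synthetic data are assumptions.

```python
import numpy as np

def fit_eofs(X, var_frac=0.90):
    """Leading EOFs of standardized data that capture var_frac of the variance."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sd
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    m = int(np.searchsorted(explained, var_frac)) + 1  # number of PCs retained
    return mu, sd, Vt[:m].T

def nearest_analog(x_today, X_hist, mu, sd, eofs):
    """Index of the historical day closest to today's forecast in PC space."""
    pcs_hist = ((X_hist - mu) / sd) @ eofs
    pc_today = ((x_today - mu) / sd) @ eofs  # the "feature vector"
    return int(np.argmin(np.linalg.norm(pcs_hist - pc_today, axis=1)))

# Hypothetical archive: 300 candidate days, 12 model output variables
rng = np.random.default_rng(4)
latent = rng.standard_normal((300, 3))
X_hist = latent @ rng.standard_normal((3, 12)) + 0.1 * rng.standard_normal((300, 12))
mu, sd, eofs = fit_eofs(X_hist)
analog_day = nearest_analog(X_hist[17], X_hist, mu, sd, eofs)
```

The station observations from the selected historical day would then serve as the downscaled values; drawing from several near neighbors rather than only the nearest would give an ensemble.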
BASINS
[Figure: map of the study basins. Cle Elum, East Fork of the Carson and Animas are snowmelt dominated; Alapaha is rainfall dominated. Basin areas shown: 526 km², 922 km², 1792 km², 3626 km²]
The “Schaake Shuffle” method
[Figure: three panels (Stn 1, Stn 2, Stn 3), each with a horizontal axis running from 1 to 14]
The “Schaake Shuffle” method
Ranked ensemble output for a given forecast day:
Rank   Stn 1   Stn 2   Stn 3
1      7.5     6.3     12.4
2      8.3     7.2     13.5
3      8.8     7.5     14.2
4      9.7     7.9     14.5
5      10.1    8.6     15.6
6      10.3    9.3     15.9
7      11.2    11.8    16.3
8      11.9    12.2    17.6
9      12.5    13.5    18.3
10     15.3    17.7    23.9
The “Schaake Shuffle” method
Ranked ensemble output alongside data from days selected from the historical record:
       (ranked ens. output)       (historical data)
No.    Stn 1   Stn 2   Stn 3     Stn 1   Stn 2   Stn 3   Date
1      7.5     6.3     12.4      10.7    7.3     15.1    12 Jan 1972
2      8.3     7.2     13.5      9.3     10.2    14.3    23 Dec 1986
3      8.8     7.5     14.2      6.8     9.3     12.8    7 Jan 1965
4      9.7     7.9     14.5      11.3    11.6    16.2    9 Jan 1973
5      10.1    8.6     15.6      12.2    12.5    11.3    2 Jan 1997
6      10.3    9.3     15.9      13.6    7.6     12.4    28 Dec 1965
7      11.2    11.8    16.3      8.9     5.3     14.1    29 Dec 1958
8      11.9    12.2    17.6      9.9     11.3    17.8    11 Jan 1978
9      12.5    13.5    18.3      11.8    6.5     20.1    3 Jan 1980
10     15.3    17.7    23.9      12.9    16.1    15.8    10 Jan 1976
The “Schaake Shuffle” method
Historical data ranked separately for each station; the number in parentheses is the historical day each value came from:
       (ranked ens. output)       (ranked historical obs)
Rank   Stn 1   Stn 2   Stn 3     Stn 1       Stn 2       Stn 3
1      7.5     6.3     12.4      6.8 (3)     5.3 (7)     11.3 (5)
2      8.3     7.2     13.5      8.9 (7)     6.5 (9)     12.4 (6)
3      8.8     7.5     14.2      9.3 (2)     7.3 (1)     12.8 (3)
4      9.7     7.9     14.5      9.9 (8)     7.6 (6)     14.1 (7)
5      10.1    8.6     15.6      10.7 (1)    9.3 (3)     14.3 (2)
6      10.3    9.3     15.9      11.3 (4)    10.2 (2)    15.1 (1)
7      11.2    11.8    16.3      11.8 (9)    11.3 (8)    15.8 (10)
8      11.9    12.2    17.6      12.2 (5)    11.6 (4)    16.2 (4)
9      12.5    13.5    18.3      12.9 (10)   12.5 (5)    17.8 (8)
10     15.3    17.7    23.9      13.6 (6)    16.1 (10)   20.1 (9)
Re-shuffled output: for each station, member i takes the ranked ensemble value whose rank equals the rank of historical day i at that station.
Member   Stn 1   Stn 2   Stn 3
1        10.1    7.5     15.9
2        8.8     9.3     15.6
3 to 10  (not filled in on the slides)
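The re-ordering shown on these slides can be sketched as follows, using the slide's own numbers. The function name is an assumption, and the sketch assumes no ties within a station's historical values.

```python
import numpy as np

def schaake_shuffle(ensemble, historical):
    """Reorder ensemble members so their ranks match the historical days.

    ensemble, historical: arrays of shape (n_members, n_stations).
    """
    ens_sorted = np.sort(ensemble, axis=0)
    # 0-based rank of each historical day within its own station column
    ranks = np.argsort(np.argsort(historical, axis=0), axis=0)
    return np.take_along_axis(ens_sorted, ranks, axis=0)

# Ranked ensemble output (Stn 1, Stn 2, Stn 3) from the slides
ensemble = np.array([
    [7.5, 6.3, 12.4], [8.3, 7.2, 13.5], [8.8, 7.5, 14.2], [9.7, 7.9, 14.5],
    [10.1, 8.6, 15.6], [10.3, 9.3, 15.9], [11.2, 11.8, 16.3],
    [11.9, 12.2, 17.6], [12.5, 13.5, 18.3], [15.3, 17.7, 23.9],
])
# Historical days 1 to 10 (12 Jan 1972, 23 Dec 1986, ...) from the slides
historical = np.array([
    [10.7, 7.3, 15.1], [9.3, 10.2, 14.3], [6.8, 9.3, 12.8], [11.3, 11.6, 16.2],
    [12.2, 12.5, 11.3], [13.6, 7.6, 12.4], [8.9, 5.3, 14.1],
    [9.9, 11.3, 17.8], [11.8, 6.5, 20.1], [12.9, 16.1, 15.8],
])
shuffled = schaake_shuffle(ensemble, historical)
```

The first two rows of `shuffled` reproduce the two re-shuffled members given on the slides, (10.1, 7.5, 15.9) and (8.8, 9.3, 15.6). Each station keeps exactly the same set of ensemble values; only their assignment to members changes, so the marginal distributions are untouched while the between-station rank structure of the historical days is imposed.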
The “Schaake Shuffle” method (Clark et al., 2003)
[Figure: Maximum Temperature against Forecast Lead Time for Ensembles 1 to 10, in two panels: (Downscaled Ensemble) and (“Observed” Ensemble). The “observed” ensemble is built from the historical periods 8th - 22nd Jan 1996; 17th - 31st Jan 1982; 13th - 27th Jan 2000; 22nd Jan - 5th Feb 1998; 12th - 26th Jan 1968; 9th - 23rd Jan 1976; 10th - 24th Jan 1998; 19th Jan - 2nd Feb 1980; 16th - 30th Jan 1973; 9th - 23rd Jan 1999.]
Results
• RPSS – precipitation and maximum temperature, MLR and KNN
Ranked Probability Skill Score (RPSS) = 1 – RPSf / RPSc, where RPSf and RPSc are the ranked probability scores of the forecast and of climatology
• Spatial autocorrelation – precip, max temp, MLR and KNN
• Lag-1 autocorrelation – precip, max temp, MLR and KNN
• Correlation skill in streamflow simulations from Hydrologic model
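A minimal sketch of the ranked probability skill score for categorical forecasts, assuming the usual definition of the ranked probability score as the sum of squared differences between cumulative forecast and cumulative observed probabilities. Function names are illustrative.

```python
import numpy as np

def rps(forecast_probs, observed_category):
    """Ranked probability score of one categorical probability forecast."""
    forecast_probs = np.asarray(forecast_probs, dtype=float)
    observed = np.zeros_like(forecast_probs)
    observed[observed_category] = 1.0
    diff = np.cumsum(forecast_probs) - np.cumsum(observed)
    return float(np.sum(diff ** 2))

def rpss(forecasts, observed_categories, climatology):
    """RPSS = 1 - mean RPS of the forecasts / mean RPS of climatology."""
    rps_f = np.mean([rps(p, o) for p, o in zip(forecasts, observed_categories)])
    rps_c = np.mean([rps(climatology, o) for o in observed_categories])
    return 1.0 - rps_f / rps_c

climatology = [1 / 3, 1 / 3, 1 / 3]
skill = rpss([[0.6, 0.3, 0.1]], [0], climatology)
```

A perfect forecast gives RPSS = 1, a forecast equal to climatology gives 0, and negative values mean the forecast is worse than climatology.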
MLR Approach – RPSS, PRCP-Jan
Knn Approach – RPSS, PRCP-Jan
MLR Approach – RPSS, PRCP-July
Knn Approach – RPSS, PRCP-July
MLR Approach – RPSS, TMAX-Jan
Knn Approach – RPSS, TMAX-Jan
MLR Approach – RPSS, TMAX-Jul
Knn Approach – RPSS, TMAX-Jul
MLR – Spatial Cor., Unshuffled CO4734-CO1609
Knn Approach – Spatial Cor., CO4734-CO1609
MLR – Spatial Cor., Unshuffled GA0140-GA2266
Knn Approach – Spatial Cor., GA0140-GA2266
MLR – Lag-1, Unshuffled CO7017
MLR – Lag-1, Shuffled CO7017
Knn Approach – Lag-1, CO7017
Knn Approach – Reliability Diagrams (1day forecast),
CO7017
Knn Approach – Lag-1, CO7017
Knn Approach – Lag-1, CO7017
Knn Approach – Lag-1, CO7017
Hydrologic Model
Precipitation Runoff Modeling System (PRMS)
[distributed-parameter, physically based watershed model]
Implemented in: The Modular Modeling System (MMS)
[A set of modeling tools to enable a user to selectively couple the most appropriate algorithms]
Alapaha River Basin (Southern Georgia)
Animas River Basin (Southwest Colorado)
Cle Elum River Basin (Central Washington)
Carson River Basin (CA/NV Border)
Conclusion: Comparison of MLR and KNN
• K-NN method exhibits comparable to better skill than the MLR in downscaling daily precipitation/temperature
• The K-NN provides a flexible and parsimonious framework for downscaling.
• The K-NN approach can be improved to better capture the temporal dependence and also to generate sequences not seen in history.
• Downscaling of Precip/Temp (using nonparametric methods – MLR + Schaake Shuffle or K-NN) improves upon the skill in streamflow forecasts from the NWS’s Extended Streamflow Prediction.
Ensemble Forecast of Spring Streamflows on the Truckee
and Carson Rivers
Study Area
[Figure: map of the Truckee and Carson River basins in California and Nevada. Labels: Lake Tahoe, Pyramid Lake, Winnemucca Lake (dry), Carson Lake, Truckee River, Carson River, Truckee Canal, Stampede, Independence, Donner, Martis, Boca, Prosser, Lahontan, Derby Dam, Newlands Project, Stillwater NWR, Truckee, Tahoe City, Farad, Reno/Sparks, Fernley, Nixon, Carson City, Ft Churchill, Fallon]
Motivation
• USBR needs good seasonal forecasts on Truckee and Carson Rivers
• Forecasts determine how storage targets will be met on Lahontan Reservoir to supply Newlands Project
[Figure label: Truckee Canal]
Outline of Approach
• Climate Diagnostics
To identify large scale features correlated to Spring flow in the
Truckee and Carson Rivers
• Ensemble Forecast
Stochastic Models conditioned on climate indicators (Parametric and
Nonparametric)
• Application
Demonstrate utility of improved forecast to water management
Data
– 1949-1999 monthly averages
• Streamflow at Ft. Churchill and Farad
• Precipitation (regional)
• Geopotential Height 500mb (regional)
• Sea Surface Temperature (regional)
Annual Cycle of Flows
Fall Climate Correlations
Carson Spring Flow
500 mb Geopotential Height
Sea Surface Temperature
Winter Climate Correlations
Carson Spring Flow
500 mb Geopotential Height
Sea Surface Temperature
Winter Climate Correlations
Truckee Spring Flow
500 mb Geopotential Height
Sea Surface Temperature
Climate Composites
High-Low Flow
Sea Surface Temperature
Vector Winds
Precipitation Correlation
Geopotential Height Correlation
SST Correlation
Flow - NINO3 / Geopotential Height
Relationship
Regression Fit
Linear Fit
Local Fit 
Precip Fit 
The Forecasting Model
• Forecast Spring Runoff in Truckee and Carson Rivers
using Winter Precipitation and Climate Data Indices
(Geopotential height index and SST index).
• Linear Regression:
- can capture only linear relationships
- cannot generate ensembles
- symmetric uncertainty bands
• Modified K-NN Method:
– Uses Local Polynomial for the mean forecast
– Bootstraps the residuals for the ensemble
Grantz et al. (2002, 2003)
Wet Years: 1994-1999
[Figure: forecasts for the wet years 1994 to 1999, in two sets of panels: Precipitation; Precipitation and Climate]
• Overprediction w/o Climate (1995, 1996)
– Might release water for flood control– stuck in spring with
not enough water
• Underprediction w/o Climate (1998)
Dry Years: 1987-1992
[Figure: forecasts for the dry years 1987 to 1992, in two sets of panels: Precipitation; Precipitation and Climate]
• Overprediction w/o Climate (1998, 1991)
– Might not implement necessary drought precautions in sufficient time
Fall Prediction w/ Climate
[Figure: fall predictions with climate for the wet years (1994 to 1999) and the dry years (1987 to 1992)]
• Fall Climate forecast captures whether season will be above or below average
• Results comparable to winter forecast w/o climate
Monthly ensemble simulation
[Figure: monthly ensemble simulations for wet years (1997 to 1999) and dry years (1987 to 1989)]
• Modified K-NN applied to ensemble seasonal volumes
Simple Water Balance
$S_t = S_{t-1} + I_t - R_t$
• $S_{t-1}$ is the storage at time ‘t-1’, $I_t$ is the inflow at time ‘t’ and $R_t$ is the release at time ‘t’.
• Method to test the utility of the model
• Pass ensemble forecasts (scenarios) for $I_t$
• Gives water managers a quick look at how much storage they will have available at the end of the season, to evaluate decision strategies
For this demonstration,
• Assume $S_{t-1} = 0$, $R_t = \tfrac{1}{2}$ (average historical inflow)
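The demonstration above amounts to one line of arithmetic per ensemble member. A minimal sketch, with hypothetical inflow numbers and an assumed function name:

```python
import numpy as np

def end_of_season_storage(inflow_ensemble, historical_inflows, initial_storage=0.0):
    """S_t = S_{t-1} + I_t - R_t with R_t = half the average historical inflow."""
    release = 0.5 * np.mean(historical_inflows)
    return initial_storage + np.asarray(inflow_ensemble, dtype=float) - release

# Hypothetical seasonal inflow volumes (arbitrary units)
historical_inflows = [100.0, 200.0, 300.0]
inflow_ensemble = [150.0, 250.0]
storage = end_of_season_storage(inflow_ensemble, historical_inflows)
# storage is [50., 150.]
```

Passing every ensemble member through the balance gives a distribution of end-of-season storage. Negative values in this simplified balance would indicate that the assumed release cannot be met.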
Water Balance
1995 Storage
1995 K-NN
Ensemble
PDF
Historical
PDF
Future Work
• Stochastic Model for
Timing of the Runoff
Disaggregate Spring flows to monthly flows.
• Statistical Physical Model
Couple PRMS with stochastic weather generator
(conditioned on climate info.)
• Test the utility of these approaches to water
management using the USBR operations model
in RiverWare
Region / Data
6 rainfall stations
- Nakhon Sawan, Suphan Buri, Lop Buri, Kanchana Buri, Bangkok, and Don Muang
3 streamflow stations (Chao Phraya basin)
- Nakhon Sawan, Chai Nat, Ang-Thong
5 temperature stations
- Nakhon Sawan, Lop Buri, Kanchana Buri, Bangkok, Don Muang
Large Scale Climate Variables
NCEP-NCAR Re-analysis data (http://www.cdc.noaa.gov)
Temporal Variability
[Figure: Thailand Rainfall. Moving seasonal correlation between Thailand rainfall and ENSO index (Tahiti-Darwin); correlation (-0.60 to 0.60) against year (1960 to 1990)]
[Figure: Indian Rainfall. Moving seasonal correlation between Indian rainfall and ENSO index (Tahiti-Darwin); correlation (0.00 to 0.70) against year (1960 to 1990)]
Composite Maps of High rainfall
Pre 1980
Post 1980
Composite Maps of Low rainfall
Pre 1980
Post 1980
Example Forecast for 1997
Conditional probabilities from historical data (categories are at quantiles):
Flow    La Nina   Neu     El Nino
Low     0.000     0.320   0.385
Neu     0.538     0.440   0.538
High    0.462     0.240   0.077
Categorical ENSO forecast:
La Nina 0.2, Neu 0.2, El Nino 0.6
Conditional flow probabilities using Total Probability Theorem:
Low 0.3, Neu 0.52, High 0.19
Ensemble Forecast from Categorical
Probabilistic forecasts
• If the categorical probabilistic forecasts are P1, P2 and P3 then
– Choose a category with the above probabilities
– Randomly select a historical observation from the chosen category
– Repeat this a number of times to generate ensemble forecasts
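A minimal sketch of both steps, using the 1997 example probabilities from the slides: the total probability theorem turns the ENSO forecast into flow category probabilities, and the ensemble is then drawn from historical flows by category. The historical flows below are synthetic, and splitting them into categories at terciles is an assumption.

```python
import numpy as np

# P(flow category | ENSO state); rows: Low, Neu, High; columns: La Nina, Neu, El Nino
cond = np.array([
    [0.000, 0.320, 0.385],
    [0.538, 0.440, 0.538],
    [0.462, 0.240, 0.077],
])
enso_forecast = np.array([0.2, 0.2, 0.6])  # La Nina, Neu, El Nino

# Total probability theorem: P(flow) = sum over ENSO of P(flow | ENSO) P(ENSO)
flow_probs = cond @ enso_forecast

def categorical_ensemble(hist_flows, cat_probs, n_ens, rng):
    """Pick a category with the forecast probabilities, then a historical flow from it."""
    hist = np.asarray(hist_flows, dtype=float)
    edges = np.quantile(hist, [1 / 3, 2 / 3])  # tercile boundaries (assumed)
    cats = np.digitize(hist, edges)            # 0 = low, 1 = neutral, 2 = high
    p = np.asarray(cat_probs, dtype=float)
    picks = rng.choice(3, size=n_ens, p=p / p.sum())
    return np.array([rng.choice(hist[cats == c]) for c in picks])

rng = np.random.default_rng(5)
hist_flows = rng.gamma(2.0, 500.0, size=50)    # hypothetical historical flows
ensemble = categorical_ensemble(hist_flows, flow_probs, n_ens=100, rng=rng)
```

`flow_probs` comes out at roughly 0.30, 0.52 and 0.19 for low, neutral and high flow, matching the slide's figures to rounding.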
Ensemble Forecast of Thailand
Streamflows – 1997
Summary
• Nonparametric techniques (K-NN framework in particular) provide a flexible alternative to parametric methods for ensemble forecasting/downscaling
• Easy to implement, parsimonious extension to multivariate situations. Water managers can utilize the improved forecasts in operations and seasonal planning
• No prior assumption about the functional form is needed. Can capture nonlinear/non-Gaussian features readily.
Motivation
• Colorado River Basin
– arid and semi-arid climates
– irrigation demands for agriculture
• “Law of the River”
– Mexico Treaty Minute No. 242
– Colorado River Basin Salinity Control Act of
1974
Motivation
• Salinity Control Forum
– Federal Water Pollution Control Act Amendments of
1972
– Numerical salinity criteria
• 723 mg/L below Hoover Dam
• 747 mg/L below Parker Dam
• 879 mg/L at Imperial Dam
• review standards on 3 year intervals
– Develop basin wide plan for salinity control
Salinity Control Efforts
• As of 1998 salinity control projects have removed an estimated 634 Ktons of salt from the river
– total expenditure through 1998 $426 million
• Proposed projects will remove an additional 390 Ktons
– projected additional expenditure $170 million
• Additional 453 Ktons of salinity controls needed by 2015
Data taken from Quality of Water, Progress Report 19, 1999
Existing Colorado River
Simulation System (CRSS)
• Includes three interconnected models
– salt regression model
• USGS salt model
– stochastic natural flow model
• index sequential method
– simulation model of entire Colorado River
basin
• implemented in RiverWare
Existing Salt Model Over-Prediction
Research Objectives
• Investigate and improve generation of natural salt associated with stochastic natural flow
• Investigate and improve modeling natural hydrologic
variability (stochastic natural flow)
• Apply modifications to a case study in the Colorado River
Basin
Case Study Area
• Historic flow from 1906 - 95
• Historic salt from 1941 - 95
USGS gauge 09072500
(Colorado River near Glenwood Springs, CO)
Summary
• Comparison of 3 stochastic hydrology models
– ISM, PAR(1), modified K-NN
• Modified K-NN addresses limitations of both the
ISM and PAR(1) models
– generates values and sequences not seen in the historic
record
– generates a greater variety of flows than the ISM
Statistical Nonparametric Model
for Natural Salt Estimation
• Based on calculated natural flow and
natural salt mass from water year 1941-85
– calculated natural flow = observed natural flow
+ total depletions
– calculated natural salt = observed natural salt
- salt added from agriculture
+ salt removed with exports
• Nonparametric regression (local regression)
– natural salt = f (natural flow)
• Residual resampling
USGS salt model and new salt model
with K-NN resampling comparison
CRSS simulation model for historic verification
[Diagram]
Inputs:
– calculated natural flow (natural flow 1906-95)
– estimated natural salt mass (natural salt 1941-95)
– historic agriculture consumptive use; irrigated lands; agricultural salt loadings (constant salinity pickup 137,000 tons/year)
– historic exports; salt removed with exports (exports removed @ 100 mg/L)
– historic municipal and industrial
– historic effects of off-stream reservoir regulation
Outputs at USGS stream gauge 09072500:
– simulated historic flow
– simulated historic salt mass
Compare results to observed historic for validation
Model Validation
Historic Salt Mass
• 1941-1995 natural flow
• 1941-1995 monthly and annual
salt model
12 monthly regressions
1 annual regression
Determining Salinity Concentration
$$\text{salt concentration (mg/L)} = \frac{\text{salt mass (tons)} \times 735.29}{\text{flow volume (acre-feet)}}$$
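The conversion is a one-line calculation. A minimal sketch, with the function name and example numbers chosen for illustration:

```python
# Conversion factor from the slide: tons per acre-foot to mg/L
TONS_PER_ACRE_FOOT_TO_MG_PER_L = 735.29

def salt_concentration_mg_per_l(salt_mass_tons, flow_volume_acre_feet):
    """Salinity concentration from annual salt mass and flow volume."""
    return salt_mass_tons * TONS_PER_ACRE_FOOT_TO_MG_PER_L / flow_volume_acre_feet

# Hypothetical example: 650,000 tons of salt in 2,000,000 acre-feet of flow
example = salt_concentration_mg_per_l(650_000.0, 2_000_000.0)
```

For a fixed salt mass, lower flow gives higher concentration, which is why concentration standards tend to be violated in low flow years.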
Annual model with resampling
• Based on 1941-1995 natural flow
• 1941-1995 annual salt model
• Simulates 1941-1995
• Historic Flow and Concentration
Policy Analysis
• Fictional Salinity Standards
– Colorado River near Glenwood Springs, CO
– Salt mass standard
• mass remains below 650,000 tons
• salt concentration below 350 mg/L
– Standards occur in tails of distribution
Modified and Existing CRSS Comparison
Historic Salt Mass
• Based on 1906-1995 natural flows
• 1941-1995 monthly salt models
• Simulates 1941-1995
Policy Analysis
Historic Simulation
> 650,000 tons salt
> 350 mg/L salt concentration
CRSS simulation model for future prediction
[Diagram]
Inputs:
– synthetic natural flow (natural flows based on 1906-1995)
– associated synthetic natural salt mass (natural salt model based on 1941-1995)
– future agriculture consumptive use; irrigated lands; agricultural salt loadings (constant Ag salt loading of 137,000 tons/year)
– future exports; salt removed with exports (constant salt removal with exports of 100 mg/L/year)
– future municipal and industrial
– projected depletions 2002-2062
Outputs at USGS stream gauge 09072500:
– simulated future flow
– simulated future salt mass
Stochastic Planning Runs
Projected Future Flow and Salt Mass
• Passing gauge 09072500
• Based on 1906-1995 natural flows
• 1941-1995 monthly salt models
• Simulating 2002 to 2062
Policy Analysis
Future Projections
> 750,000 tons salt
> 600 mg/L salt concentration
Modified Colorado River
Simulation System (CRSS)
• Includes three interconnected models
– stochastic natural flow model
• modified nonparametric K-NN natural flow model
– salt regression model
• statistical nonparametric natural salt model
– simulation model of entire Colorado River
basin
• demonstrated on a case study for basin above USGS
gauge 09072500
Conclusion
• Developed a modified modeling system for
the Colorado River Simulation System
– includes both flow and salt uncertainty
• improved representation of flow variability
• better representation of natural salt and flow
relationship
– developed and tested a modified K-NN
Conclusions
– developed and tested a new statistical
nonparametric natural salt model
– discussed nonparametric techniques
• flexible and easy to implement
• can preserve any arbitrary distribution
• conditioning with additional data
– validation of historic record
– preservation of historic salinity violations
Future Work
• Extend the modified K-NN flow model to perform space-time disaggregation to simulate flow and salt over the entire basin
• Move operational policy to an annual time step
• Incorporate total depletions as a function of natural flow
• Further research into the relationship between salt loading
and land use
• Continue work to incorporate climate information in
streamflow generation
Acknowledgements
• Balaji Rajagopalan, Terry Fulp, Edith Zagona for advising
and support
• Upper Colorado Regional Office
of the US Bureau of Reclamation,
in particular Dave Trueman for
funding and support
• CADWES personnel for use of their
knowledge and computing facilities