Spatial verification of NWP model fields

Beth Ebert
BMRC, Australia
WRF Verification Toolkit Workshop, Boulder, 21-23 February 2007

New approaches are needed to quantitatively evaluate high-resolution model output.

What modelers want

Diagnostic information:
- What scales are well represented by the model?
- How realistic are forecast features / structures?
- How realistic are distributions of intensities / values?
- What are the sources of error?
- How can I improve the model?

It's not so easy! How can I score?

Spatial forecasts

Weather variables defined over spatial domains have coherent spatial structure and features (intrinsic spatial correlation).

Spatial verification techniques aim to:
- account for field spatial structure
- provide information on error in physical terms
- account for uncertainties in timing and location

Recent research in spatial verification

- Scale decomposition methods: measure scale-dependent error
- Fuzzy (neighborhood) verification methods: give credit to "close" forecasts
- Object-oriented methods: evaluate attributes of identifiable features
- Field verification: evaluate phase errors
Scale decomposition methods
scale-dependent error

Wavelet scale components
Briggs and Levine (1997)

[Figure: wavelet scale components of the ECMWF analysis and the 36-h CCM-2 forecast; 500 mb GZ, 9 Dec 1992, 12:00 UTC, North America]

Intensity-scale verification technique
Casati et al. (2004)

Measures skill as a function of the intensity and spatial scale of the error (a minimal sketch follows the figure note below):
1. Intensity: thresholding → categorical approach
2. Scale: 2-D wavelet decomposition of the binary images
3. For each threshold and scale: a skill score associated with the MSE of the binary images (Heidke Skill Score)

[Figure: intensity-scale skill for a displaced intense storm, plotted against threshold (1/16 to 32 mm/h) and spatial scale (5 to 640 km)]
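
A minimal sketch of steps 2 and 3, assuming a square domain whose side is a power of two and using NumPy only; the function names and the unbiased-random-forecast reference MSE are simplifications, not the exact formulation of Casati et al. (2004):

```python
import numpy as np

def haar_scale_components(err):
    """Haar 'father wavelet' decomposition of a 2-D field (2^n x 2^n):
    component l = (block mean over 2^l cells) - (block mean over 2^(l+1) cells).
    The components are mutually orthogonal, so their MSEs sum to the total MSE."""
    n = int(np.log2(err.shape[0]))
    means = [err.astype(float)]
    for l in range(1, n + 1):
        b = 2 ** l
        coarse = err.reshape(err.shape[0] // b, b, err.shape[1] // b, b).mean(axis=(1, 3))
        means.append(np.kron(coarse, np.ones((b, b))))  # upsample back to the full grid
    return [means[l] - means[l + 1] for l in range(n)] + [means[n]]

def intensity_scale_skill(fcst, obs, threshold):
    """Skill of the binary error at each spatial scale, for one intensity threshold."""
    i_f = (fcst >= threshold).astype(float)
    i_o = (obs >= threshold).astype(float)
    eps = i_o.mean()                      # observed base rate (assumed > 0)
    comps = haar_scale_components(i_f - i_o)
    mse_random = 2.0 * eps * (1.0 - eps)  # reference MSE of an unbiased random forecast
    return [1.0 - (c ** 2).mean() * len(comps) / mse_random for c in comps]
```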

Multiscale statistical properties
Harris et al. (2001)

Does the model produce the observed scale-dependent variability of precipitation, i.e. does it look like real rain?

Compare multi-scale statistics for model and radar data (see the sketch below):
- Power spectrum
- Structure function
- Moment scaling
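
A minimal sketch of one of these statistics, a radially averaged power spectrum, using NumPy only; the function name and binning choices are illustrative:

```python
import numpy as np

def radial_power_spectrum(field):
    """Radially averaged 2-D power spectrum: mean spectral power vs. wavenumber."""
    ny, nx = field.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    ky, kx = np.indices((ny, nx))
    k = np.hypot(ky - ny // 2, kx - nx // 2).astype(int)   # integer radial wavenumber bin
    kmax = min(ny, nx) // 2
    totals = np.bincount(k.ravel(), weights=power.ravel())[:kmax]
    counts = np.bincount(k.ravel())[:kmax]
    return totals / np.maximum(counts, 1)

# Compare the shape and small-scale slope of the two spectra, e.g.:
# model_spec = radial_power_spectrum(model_rain)
# radar_spec = radial_power_spectrum(radar_rain)
```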
Fuzzy (multi-scale) verification methods
give credit to "close" forecasts
"Fuzzy" verification methods

Don't require an exact match between forecasts and
observations


Unpredictable scales
Uncertainty in observations
t-1
Squint your
eyes!
t
Frequency
Why
is itincalled
"fuzzy"?
 Look
a space
/ time neighborhood around the point of
interest
t+1
Forecast value

observation
forecast
Evaluate using categorical,
continuous, probabilistic
scores / methods
"Fuzzy" verification methods
Treatment of forecast data within a window:

Mean value (upscaling)

Occurrence of event* somewhere in window

Frequency of event in window  probability

Distribution of values within window
May apply to observations as well as forecasts
(neighborhood observation-neighborhood forecast
approach)
* Event defined here as a value exceeding a given threshold, for example,
rain exceeding 1 mm/hr
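
A minimal sketch of the first three treatments, assuming square windows, SciPy's ndimage filters, and the 1 mm/h event threshold from the footnote; the same calls can be applied to the observed field for a neighborhood-observation approach:

```python
import numpy as np
from scipy.ndimage import maximum_filter, uniform_filter

def neighborhood_treatments(fcst, window=5, threshold=1.0):
    """Three common treatments of forecast values within a (window x window) neighborhood."""
    event = (fcst >= threshold).astype(float)
    upscaled = uniform_filter(fcst, size=window)     # mean value (upscaling)
    anywhere = maximum_filter(event, size=window)    # event occurs somewhere in the window
    frequency = uniform_filter(event, size=window)   # event frequency, read as a probability
    return upscaled, anywhere, frequency
```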

Spatial multi-event contingency table
Atger (2001)

Forecasters mentally "calibrate" the deterministic forecast according to how close the forecast is to the place / time / magnitude of interest:
- Very close → high probability
- Not very close → low probability

Vary decision thresholds (see the sketch below):
- magnitude (e.g. 1 mm h-1 to 20 mm h-1)
- distance from point of interest (e.g. within 10 km, ..., within 100 km)
- timing (e.g. within 1 h, ..., within 12 h)
- anything else that may be important in interpreting the forecast

Interpretation: "high probability of some heavy rain near Sydney", not "62 mm of rain will fall in Sydney".

[Figure: ROC curves for Sydney rainfall comparing the single-threshold deterministic forecast with the EPS]
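
A hypothetical sketch of the varying decision thresholds, treating every grid point as a point of interest and using square grid-point neighborhoods as a stand-in for "within X km"; the function name and threshold lists are illustrative:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def multi_event_roc_points(fcst, obs, magnitudes=(1, 5, 10, 20), radii=(0, 2, 5, 10)):
    """Hit rate and false alarm rate for 'event forecast within r grid points',
    for each magnitude threshold; each (magnitude, radius) pair is one ROC-style point."""
    points = {}
    for m in magnitudes:
        obs_event = obs >= m
        for r in radii:
            fcst_near = maximum_filter((fcst >= m).astype(float), size=2 * r + 1) > 0
            hits = np.sum(fcst_near & obs_event)
            misses = np.sum(~fcst_near & obs_event)
            false_alarms = np.sum(fcst_near & ~obs_event)
            correct_negs = np.sum(~fcst_near & ~obs_event)
            pod = hits / max(hits + misses, 1)            # probability of detection
            pofd = false_alarms / max(false_alarms + correct_negs, 1)
            points[(m, r)] = (pofd, pod)
    return points
```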

Fractions skill score
Roberts (2005)

We want to know:
- How forecast skill varies with neighbourhood size.
- The smallest neighbourhood size that can be used to give sufficiently accurate forecasts.
- Whether higher resolution provides more accurate forecasts on scales of interest (e.g. river catchments).

Compare forecast fractions with observed fractions (radar) in a probabilistic way over different sized neighbourhoods:
FSS = 1 - \frac{\frac{1}{N}\sum_{i=1}^{N} (P_{fcst} - P_{obs})^2}{\frac{1}{N}\sum_{i=1}^{N} P_{fcst}^2 + \frac{1}{N}\sum_{i=1}^{N} P_{obs}^2}

Fractions skill score
Roberts (2005)

[Figure: fractions skill score example]

Decision models

Fuzzy method | Matching strategy* | Decision model for useful forecast
Upscaling (Zepeda-Arce et al. 2000; Weygandt et al. 2004) | NO-NF | Resembles obs when averaged to coarser scales
Minimum coverage (Damrath 2004) | NO-NF | Predicts event over minimum fraction of region
Fuzzy logic (Damrath 2004), joint probability (Ebert 2002) | NO-NF | More correct than incorrect
Fractions skill score (Roberts 2005) | NO-NF | Similar frequency of forecast and observed events
Area-related RMSE (Rezacova et al. 2006) | NO-NF | Similar intensity distribution as observed
Pragmatic (Theis et al. 2005) | SO-NF | Can distinguish events and non-events
CSRR (Germann and Zawadzki 2004) | SO-NF | High probability of matching observed value
Multi-event contingency table (Atger 2001) | SO-NF | Predicts at least one event close to observed event
Practically perfect hindcast (Brooks et al. 1998) | SO-NF | Resembles forecast based on perfect knowledge of observations

* NO-NF = neighborhood observation-neighborhood forecast; SO-NF = single observation-neighborhood forecast

Fuzzy verification framework

[Figure: fuzzy verification framework results, shaded from good performance to poor performance]
Object-oriented methods
evaluate attributes of features

Entity-based approach (CRA)
Ebert and McBride (2000)

- Define entities using a threshold (Contiguous Rain Areas)
- Horizontally translate the forecast until a pattern-matching criterion is met:
  - minimum total squared error between forecast and observations
  - maximum correlation
  - maximum overlap
- The displacement is the vector difference between the original and final locations of the forecast (see the sketch below).

[Figure: observed and forecast rain entities]
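
A minimal sketch of the translation step, assuming the minimum-squared-error criterion; a real CRA implementation searches only within the entity's bounding box, whereas np.roll here wraps at the domain edges:

```python
import numpy as np

def cra_displacement(fcst, obs, max_shift=10):
    """Translate the forecast to minimize the total squared error against the observations;
    returns the displacement vector (in grid points) and the error after shifting."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(fcst, dy, axis=0), dx, axis=1)
            err = np.sum((shifted - obs) ** 2)
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best, best_err
```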

CRA information

Gives information on:
- Location error
- RMSE and correlation before and after the shift
- Attributes of forecast and observed entities
- Error components: displacement, volume, pattern (see the sketch below)
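
One common form of the displacement / volume / pattern decomposition, sketched under the assumption that the shifted forecast from the previous step is available; treat the exact partitioning as indicative rather than the definitive CRA formulation:

```python
import numpy as np

def cra_error_components(fcst, fcst_shifted, obs):
    """Split the total MSE into displacement, volume, and pattern components."""
    mse_total = np.mean((fcst - obs) ** 2)
    mse_shifted = np.mean((fcst_shifted - obs) ** 2)
    volume = (fcst_shifted.mean() - obs.mean()) ** 2
    return {
        "displacement": mse_total - mse_shifted,   # error removed by the shift
        "volume": volume,                          # mean-amount (volume) error
        "pattern": mse_shifted - volume,           # what remains after shift and volume
    }
```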

MODE (Method for Object-based Diagnostic Evaluation)
Davis et al. (2006)

Two parameters (see the sketch below):
1. Convolution radius
2. Threshold
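
A minimal sketch of how the two parameters are used to identify objects; MODE applies a circular convolution filter and restores the raw field inside object boundaries, whereas this sketch uses a simple square filter:

```python
import numpy as np
from scipy.ndimage import label, uniform_filter

def identify_objects(field, conv_radius=3, threshold=5.0):
    """Smooth the field, threshold it, and label contiguous regions as objects."""
    smoothed = uniform_filter(field, size=2 * conv_radius + 1)  # convolution step
    mask = smoothed >= threshold                                # threshold step
    objects, n_objects = label(mask)   # integer label per object, 0 = background
    return objects, n_objects
```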

MODE object matching/merging

Compare attributes:
- centroid location
- intensity distribution
- area
- orientation
- etc.

When objects are not matched:
- false alarms
- missed events
- rain volume
- etc.

[Figure: 24-h forecast of 1-h rainfall on 1 June 2005]

MODE methodology

- Identification: convolution-threshold process
- Measure attributes
- Merging: fuzzy logic approach; merge single objects into composite objects
- Matching: compute interest values; identify matched pairs
- Comparison: compare forecast and observed attributes
- Summarize: accumulate and examine comparisons across many cases

Cluster analysis approach
Marzban and Sandgathe (2006)

- Goal: assess the agreement between fields using clusters identified by agglomerative hierarchical cluster analysis (CA)
- Optimize clusters (and the number of clusters) based on:
  - binary images (x-y optimization)
  - magnitude images (x-y-p optimization)
- Compute the Euclidean distance between clusters in the forecast and observed fields (in x-y and x-y-p space); a minimal sketch follows the figure note below.

[Figure: MM5 precipitation forecasts; 8 clusters identified in x-y-p space]
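
A minimal sketch of the x-y-p clustering, assuming a small field (hierarchical clustering scales poorly with the number of rainy points) and a simple standardization of the three coordinates; names and choices are illustrative:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def xyp_clusters(field, threshold=1.0, n_clusters=8):
    """Agglomerative clustering of rainy grid points in (x, y, precipitation) space."""
    y, x = np.nonzero(field >= threshold)
    pts = np.column_stack([x, y, field[y, x]]).astype(float)
    pts = (pts - pts.mean(axis=0)) / pts.std(axis=0)   # comparable scales for x, y, p
    labels = fcluster(linkage(pts, method="average"), t=n_clusters, criterion="maxclust")
    centroids = np.array([pts[labels == k].mean(axis=0) for k in range(1, n_clusters + 1)])
    return labels, centroids

# The verification error can then be taken as the average distance between
# matched forecast and observed cluster centroids in this x-y-p space.
```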

Cluster analysis example

[Figure: Stage IV analysis vs. COAMPS forecast; error = average distance between matched clusters in x-y-p space, plotted as log_e error]

Composite approach
Nachamkin (2004)

- Goal: characterize distributions of errors from both a forecast and an observation perspective
- Procedure (see the sketch below):
  1. Identify events of interest in the forecasts
  2. Define a kernel and collect coordinated samples
  3. Compare the forecast PDF to the observed PDF
  4. Repeat the process for observed events

[Figure: forecast and observation fields sampled on a kernel centered on the event (x marks the event center)]
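
A minimal sketch of the kernel sampling, assuming event centers are already identified and ignoring events whose kernel falls outside the domain; names and the kernel size are illustrative:

```python
import numpy as np

def composite_around_events(fcst, obs, event_centers, half_width=10):
    """Average forecast and observed fields on a kernel centered on each forecast event;
    repeating with observed event centers gives the observation-perspective composite."""
    size = 2 * half_width + 1
    f_sum, o_sum, count = np.zeros((size, size)), np.zeros((size, size)), 0
    for i, j in event_centers:
        if (half_width <= i < fcst.shape[0] - half_width
                and half_width <= j < fcst.shape[1] - half_width):
            f_sum += fcst[i - half_width:i + half_width + 1, j - half_width:j + half_width + 1]
            o_sum += obs[i - half_width:i + half_width + 1, j - half_width:j + half_width + 1]
            count += 1
    return f_sum / max(count, 1), o_sum / max(count, 1)
```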

Composite example

Compare kernel grid-averaged values:
- Average rain (mm) given an event was predicted
- Average rain (mm) given an event was observed

[Figure: composite fields with forecast shaded and observations contoured]
Field verification
evaluate phase errors

Feature calibration and alignment
Hoffman et al. (1995); Nehrkorn et al. (2003)

Error decomposition:
e = Xf(r) - Xv(r)
where Xf(r) is the forecast, Xv(r) is the verifying analysis, and r is the position.

Forecast adjustment decomposes the error as
e = ep + eb + er
where
ep = Xf(r) - Xd(r)  (phase error)
eb = Xd(r) - Xa(r)  (local bias error)
er = Xa(r) - Xv(r)  (residual error)
with Xd(r) the forecast after removing the phase error and Xa(r) the adjusted forecast.

[Figure panels: original forecast Xf(r), 500 mb analysis Xv(r), adjusted forecast Xa(r), residual error er]

Forecast quality measure (FQM)
Keil and Craig (2007)

- Combines a distance measure and an intensity difference measure
- Pyramidal image matching (optical flow) gives a vector displacement field → e_distance
- Unmatched features are penalized for their intensity errors → e_intensity
- Forecast quality measure:

FQM = \frac{1}{A} \sum_{A} \max(e_{distance}, e_{intensity})

[Figure: satellite observation, original model, and morphed model fields]

Conclusions

What method should you use for model verification? It depends on what question(s) you would like to address.

Many spatial verification approaches:
- Scale decomposition: scale-dependent error
- Fuzzy (neighborhood): credit for "close" forecasts
- Object-oriented: attributes of features
- Field verification: phase error