
A Neural Network for Detecting and Diagnosing Tornadic Circulations
V Lakshmanan, Gregory Stumpf, Arthur Witt
University of Oklahoma, National Severe Storms Laboratory, Meteorological Development Laboratory
7/18/2015
[email protected]
Motivation

- MDA and NSE developed at NSSL
  - MDA identifies storm-scale circulations
  - which may be precursors to tornadoes
- Marzban (1997) developed a NN based on MDA parameters to classify tornadoes
  - using 43 cases
  - found incorporation of NSE promising
- Radar Operations Center wanted us to examine using a MDA+NSE NN operationally
- Extended Marzban's work to 83 cases
  - with a few modifications
MDA and NSE

- Mesocyclone Detection Algorithm (MDA)
  - designed to detect a wide variety of circulations of varying size and strength by analyzing the radial velocity data from a Doppler weather radar
  - 23 attributes for each circulation
- Near Storm Environment (NSE)
  - uses analysis grids from the RUC model to derive 245 different attributes
- Full list of attributes used is in the conference pre-prints
Scalar Measures of Performance

- POD = hit / (hit + miss)
- FAR = fa / (hit + fa)
- CSI = hit / (hit + miss + fa)
- HSS = 2*(null*hit - miss*fa) / [(fa+hit)*(fa+null) + (null+miss)*(miss+hit)]
- We also report Receiver Operating Characteristic (ROC) curves
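These measures follow directly from a 2x2 contingency table; a minimal sketch in code (the function name and the example counts are ours, not from the slides):

```python
def skill_scores(hit, miss, fa, null):
    """Scalar skill measures from a 2x2 contingency table.

    hit:  tornado observed and detected
    miss: tornado observed but not detected
    fa:   detection with no tornado (false alarm)
    null: no detection and no tornado (correct null)
    """
    pod = hit / (hit + miss)          # probability of detection
    far = fa / (hit + fa)             # false alarm ratio
    csi = hit / (hit + miss + fa)     # critical success index
    hss = (2.0 * (null * hit - miss * fa) /
           ((fa + hit) * (fa + null) + (null + miss) * (miss + hit)))
    return pod, far, csi, hss
```

For example, `skill_scores(hit=38, miss=62, fa=15, null=885)` gives POD = 0.38.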

Neural Network

- Fully feedforward resilient backpropagation (RProp) NN
- Tanh activation function on hidden nodes
- Logistic (sigmoid) activation function on output node
- Error function: weighted sum of cross-entropy and the squared sum of all the weights in the network (weight decay)
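A minimal sketch of this architecture's forward pass and error function, assuming one hidden layer; the decay coefficient and all names are our placeholders, and the RProp training loop itself is not shown:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Fully feedforward pass: tanh on the hidden nodes,
    logistic (sigmoid) on the single output node."""
    hidden = np.tanh(x @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))

def error(y_true, y_pred, weight_matrices, decay=1e-3):
    """Weighted sum of cross-entropy and the squared sum of all
    weights in the network (weight decay). The decay coefficient
    here is an arbitrary placeholder."""
    eps = 1e-12  # guard against log(0)
    cross_entropy = -np.mean(y_true * np.log(y_pred + eps) +
                             (1.0 - y_true) * np.log(1.0 - y_pred + eps))
    weight_sq_sum = sum(np.sum(W ** 2) for W in weight_matrices)
    return cross_entropy + decay * weight_sq_sum
```

The weight-decay term penalizes large weights, which is one way to limit over-fitting on a small number of cases.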

Truthing

- Ground truth based on temporal and spatial proximity
  - done by hand: every circulation was classified
- Look for a radar signature from 20 minutes before a tornado is on the ground to 5 minutes after

NN Training Method

- Extract out truthed MDA detections
- Normalize the input features
- Determine a priori probability thresholds
  - 13 attributes known to have univariate tendencies; used to prune the training set
- Divide the set in the ratio 46:20:34 (train:validate:test)
- Bootstrap the train/validate sets
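The normalization, splitting and bootstrapping steps above can be sketched as follows (function names and the use of NumPy are our choices; note the split and bootstrap operate on whole cases, not individual detections):

```python
import numpy as np

def normalize(X):
    """Z-score each input feature (zero mean, unit variance)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def split_cases(case_ids, rng, fractions=(0.46, 0.20, 0.34)):
    """Divide whole cases into train/validate/test sets
    in roughly a 46:20:34 ratio."""
    ids = rng.permutation(case_ids)
    n_train = int(round(fractions[0] * len(ids)))
    n_val = int(round(fractions[1] * len(ids)))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

def bootstrap_cases(train_ids, rng):
    """Case-wise bootstrap: resample whole cases with replacement."""
    return rng.choice(train_ids, size=len(train_ids), replace=True)
```

With 83 cases this yields splits of 38, 17 and 28 cases respectively.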

NN Training Method (contd.)

- Find the optimal number of hidden nodes
  - beyond which validation cross-entropy error increases
- Choose as warning threshold the threshold at which the output of the NN on the validation set has maximum HSS
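The threshold selection can be sketched as a simple sweep (the function name and the candidate grid are our assumptions):

```python
import numpy as np

def best_warning_threshold(y_true, y_prob, candidates=np.linspace(0.0, 1.0, 101)):
    """Sweep candidate warning thresholds over the validation set and
    keep the one at which the NN output has maximum HSS."""
    best_t, best_hss = None, -np.inf
    for t in candidates:
        warn = y_prob >= t
        hit = np.sum(warn & (y_true == 1))
        miss = np.sum(~warn & (y_true == 1))
        fa = np.sum(warn & (y_true == 0))
        null = np.sum(~warn & (y_true == 0))
        denom = (fa + hit) * (fa + null) + (null + miss) * (miss + hit)
        if denom == 0:
            continue  # degenerate table, HSS undefined
        hss = 2.0 * (null * hit - miss * fa) / denom
        if hss > best_hss:
            best_t, best_hss = t, hss
    return best_t, best_hss
```

Varying this threshold instead of fixing it is also what traces out the ROC curves reported later.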
Our method vs. Marzban and Stumpf

- Slightly different from Marzban/Stumpf:
  - error criterion different: weight decay
  - error minimization method different: RProp vs. SCG
  - bootstrapped case-wise instead of pattern-wise
  - automatic pruning based on a priori probability

43-case comparison

- So, we compared against the same 43 cases (with the same independent test cases)
- Most of the difference is due to better generalization
  - case-wise bootstrapping

Method  | POD  | FAR  | CSI  | HSS
--------|------|------|------|------
Marzban | 0.36 | 0.69 | 0.20 | 0.29
Us      | 0.34 | 0.38 | 0.28 | 0.38
MDA NN (83 case)

- The 43-case data set used by Marzban was large/tall/strong
  - a rather easy dataset for tornado detection
- The next 40 cases are more atypical
  - mini-supercells, squall-line tornadoes, tropical events, etc.
- Manually selected an independent set of 27 cases to have a similar distribution of strong and weak tornadoes
  - remaining 56 cases used to verify the network
- Then, use all 83 cases to create the "operational" network
83 case MDA NN

- Performance of the best network on the independent 27-case test set, compared with the 43-case results
- Also, performance of the best network trained using all 83 cases (no independent test set)

Method    | POD  | FAR  | CSI  | HSS
----------|------|------|------|------
Test 27   | 0.44 | 0.53 | 0.29 | 0.41
43-case   | 0.34 | 0.38 | 0.28 | 0.38
Val. (83) | 0.42 | 0.51 | 0.29 | 0.40
83 case MDA NN

[Figure: ROC curves for the 27-case independent test]
MDA + NSE

- Statistics of the dataset change dramatically when we add NSE parameters as inputs
  - 10x as many inputs, so chances of over-fitting are much greater
  - NSE parameters not tied to individual detections
  - NSE parameters highly correlated in space and time
  - NSE parameters not resolved to radar resolution (20 km x 20 km vs. 1 km x 1 km)
  - NSE parameters available hourly; radar data every 5-6 minutes
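Because of these resolution mismatches, each detection has to be paired with an NSE value somehow. The slides do not give the matching rule, so the nearest-in-time, containing-cell scheme below is purely our assumption, sketched to make the mismatch concrete:

```python
import numpy as np

def nse_for_detection(x_km, y_km, t_min, nse_grids, nse_times_min, cell_km=20.0):
    """Attach NSE parameters to one radar detection.

    Matching rule (our assumption): take the most recent hourly NSE
    analysis at or before the detection time, and the 20 km x 20 km
    grid cell containing the detection.

    nse_grids: (n_times, ny, nx, n_params), grid origin at (0 km, 0 km)
    """
    # Most recent analysis at or before the detection time.
    t_idx = max(np.searchsorted(nse_times_min, t_min, side="right") - 1, 0)
    # Grid cell containing the detection.
    row = int(y_km // cell_km)
    col = int(x_km // cell_km)
    return nse_grids[t_idx, row, col, :]
```

Every detection in a 20 km cell during the same hour receives identical NSE values, which is one reason the NSE inputs are so highly correlated.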
[email protected]
14
Feature Selection

- Reduce parameters from 245 to 76 based on meteorological understanding
- Remove one attribute of each highly correlated pair (Pearson's correlation coefficient)
- Take the top fraction "f" of univariate predictors
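The last two steps can be sketched as below; the 0.9 correlation cutoff, the greedy keep-first rule and all names are our placeholders, not values from the slides:

```python
import numpy as np

def drop_correlated(X, names, cutoff=0.9):
    """Remove one attribute of each highly correlated pair, judged
    by Pearson's correlation coefficient. The cutoff is a placeholder."""
    corr = np.corrcoef(X, rowvar=False)
    kept = []
    for j in range(X.shape[1]):
        # Keep feature j only if it is not too correlated with any kept one.
        if all(abs(corr[j, k]) < cutoff for k in kept):
            kept.append(j)
    return X[:, kept], [names[j] for j in kept]

def top_fraction(univariate_scores, f=0.3):
    """Keep the top fraction f of predictors, ranked by a univariate
    skill score (e.g. the HSS of each feature used alone)."""
    n_keep = max(1, int(round(f * len(univariate_scores))))
    ranked = np.argsort(univariate_scores)[::-1]
    return sorted(ranked[:n_keep].tolist())
```

The fraction f is then tuned on validation error, as described on the next slide.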

Choose most general network

- Variation of the neural network training and validation errors as the number of input features is increased
- Choose the number of features where the generalization error is minimum (f = 0.3)
MDA+NSE

- On the independent 27-case set:

Inputs  | POD  | FAR  | CSI  | HSS
--------|------|------|------|------
MDA     | 0.44 | 0.53 | 0.29 | 0.41
MDA+NSE | 0.47 | 0.49 | 0.32 | 0.45
MDA+NSE (27-case set)
Generalization

- Similar HSS scores on the training, validation and independent test data sets
- In MDA+NSE, we sacrificed higher performance to get better generalization

Inputs  | POD  | FAR  | CSI  | HSS
--------|------|------|------|------
MDA     | 0.44 | 0.53 | 0.29 | 0.41
MDA+NSE | 0.47 | 0.49 | 0.32 | 0.45
Is NSE information helpful?

- NSE parameters changed the statistics of the data set
- The MDA+NSE neural network is only marginally better than a MDA NN, but:
  - NSE information has the potential to be useful
  - we used only 4 of the 76 features retained from the original 245!

Inputs  | POD  | FAR  | CSI  | HSS
--------|------|------|------|------
MDA     | 0.44 | 0.53 | 0.29 | 0.41
MDA+NSE | 0.47 | 0.49 | 0.32 | 0.45
Going further

- Where can we go further with this approach?
  - find better ways to reduce the number of features
  - use the time history of detections
  - generate many more data cases
- All of which will yield very little (we believe)
Spatio-temporal Tornado Guidance

- Formulate the tornado prediction problem differently
  - instead of devising a machine intelligence approach to classify detections
  - spatio-temporal: estimate the probability of a tornado event at a particular spatial location within a given time window

Spatio-temporal approach

- Our initial approach:
  - modify ground truth to create a spatial truth field
  - use a least-squares methodology to estimate shear
  - morphological image processing to estimate gradients
  - fuzzy logic to generate compact measures of tornado possibility
  - a classification neural network to generate the final spatio-temporal probability field
  - past and future history, both of observed tornadoes and of the candidate regions, obtained by tracking clustered radar reflectivity values
  - integrate data from other sensors (e.g. numerical models and lightning)
- Paper at IJCNN 2005
Acknowledgements

- Funding for this research was provided under NOAA-OU Cooperative Agreement NA17RJ1227 and supported by the Radar Operations Center
- Caren Marzban and Don Burgess, both of the University of Oklahoma, helped us immensely on the methods and attributes used in this paper
