Extremes_CCT_v3 - Biomathematics and Statistics Scotland

Extreme values and risk
CCTC meeting, September 2007
Adam Butler
Biomathematics & Statistics Scotland
Extreme values and risk
• Extreme value theory (EVT) is a branch of statistics
concerned with the frequency & size of rare events
• EVT methods are widely used in finance, hydrology &
engineering, usually for risk assessment, but are not
yet widely used in the biological sciences
Extreme values and risk
Risk assessment:
What is the probability we will
have more than 100mm of
rain on a given day?
Risk management:
I need to build a flood
defense, and I want the
probability that it fails on
any particular day to be
less than 1-in-10000.
How high should it be?
Extreme values and risk
What is the chance of getting a log
daily return of less than –0.1?
(i.e. a drop in value of about 9.5% or more
since the previous day)
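The link between a log daily return and a percentage change in value can be checked directly. This is a minimal sketch; the function name is illustrative and not part of the talk.

```python
import math

def log_return_to_pct_change(log_return):
    """Convert a log return r = log(P_today / P_yesterday) to a % change in value."""
    return (math.exp(log_return) - 1.0) * 100.0

# A log daily return of -0.1 corresponds to a fall of a little under 10%.
pct_change = log_return_to_pct_change(-0.1)
print(f"log return -0.1 -> {pct_change:.2f}% change in value")
```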
Extreme values and risk
Common features
• We are interested in a process that can be quantified,
and for which we have some data
• …and we want to use these data to say something about
the probability that a rare or extreme event will occur
Extreme values and risk
• We will usually be interested in events that are beyond
the range of the data, i.e. we want to extrapolate
• Extrapolation is rarely advisable, but it is sometimes
unavoidable, especially when doing risk assessment
• The standard approach would be to assume that the
data come from, for example, a normal distribution…
Extreme values and risk
P(X < –0.1) ≈ 10^(–20)
Extreme values and risk
…but:
The extreme values don’t play much of a role when we
estimate the parameters, so the model that we end up
fitting might not describe the extreme values at all well…
Extreme values and risk
Empirical:
P(X < –0.05) ≈ 0.002
Normal:
P(X < –0.05) ≈ 0.000001
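The gap between the empirical and normal-based estimates can be reproduced on synthetic data. The sketch below does not use the data from the talk: it assumes heavy-tailed "returns" drawn from a Student t distribution with 3 degrees of freedom and an arbitrary scale of 0.006, fits a normal distribution by its sample mean and standard deviation, and compares the two tail estimates.

```python
import numpy as np
from scipy.stats import norm

# Synthetic heavy-tailed "returns" (assumption: scaled Student t with 3 d.f.,
# chosen only for illustration; these are not the data shown in the slides).
rng = np.random.default_rng(42)
returns = 0.006 * rng.standard_t(df=3, size=200_000)

threshold = -0.05

# Empirical estimate: proportion of observations below the threshold.
p_empirical = np.mean(returns < threshold)

# Normal estimate: fit by sample mean and standard deviation, then read off the tail.
mu_hat = returns.mean()
sd_hat = returns.std(ddof=1)
p_normal = norm.cdf(threshold, loc=mu_hat, scale=sd_hat)

print(f"empirical P(X < {threshold}) = {p_empirical:.2e}")
print(f"normal    P(X < {threshold}) = {p_normal:.2e}")
```

The fitted normal is driven by the bulk of the data, so it can understate the frequency of large falls by orders of magnitude when the true tails are heavy.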
Extreme values and risk
…and, worse still, extrapolations beyond the range of the
data often differ radically between models that provide
a very similar fit to the bulk of the data…
Extreme values and risk
Cauchy:
P(X < –0.1) ≈ 0.02
Normal:
P(X < –0.1) ≈ 10^(–20)
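The size of this disagreement is easy to reproduce. In the sketch below the two scale values are hypothetical, picked only so that the tail probabilities land near the orders of magnitude quoted on the slide; the fitted parameters used in the talk are not given.

```python
from scipy.stats import cauchy, norm

# Hypothetical scale parameters (assumptions, not the fitted values from the talk).
normal_sd = 0.0108     # standard deviation of a zero-mean normal
cauchy_scale = 0.0063  # scale of a Cauchy distribution centred at zero

x = -0.1
p_normal = norm.cdf(x, loc=0.0, scale=normal_sd)
p_cauchy = cauchy.cdf(x, loc=0.0, scale=cauchy_scale)

print(f"normal P(X < {x}) = {p_normal:.1e}")
print(f"Cauchy P(X < {x}) = {p_cauchy:.1e}")
print(f"ratio Cauchy / normal = {p_cauchy / p_normal:.1e}")
```

Both distributions are symmetric and bell-shaped near the centre, yet their extrapolated tail probabilities differ by many orders of magnitude.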
Extreme values and risk
In EVT we adopt the principle that we should only make
use of the most extreme data that we have observed
⇒ we throw away almost all of the data
Extreme values and risk
[Figure: threshold exceedances]
Extreme values and risk
We consider exceedances of a high threshold
EVT tells us that a good statistical model for exceedances,
x, is the Generalised Pareto Distribution (GPD),
P(x) = 1 – [1 + ξ(x / σ)]^(–1/ξ)
σ = “scale parameter”
ξ = “shape parameter”
(x > 0)
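The GPD distribution function above can be written out directly and checked against SciPy's `genpareto`, whose shape argument `c` plays the role of ξ. This is a sketch: the function name `gpd_cdf` and the test values of σ and ξ are illustrative. The ξ = 0 case is handled as the exponential limit.

```python
import numpy as np
from scipy.stats import genpareto

def gpd_cdf(x, sigma, xi):
    """GPD distribution function P(x) = 1 - [1 + xi * x / sigma]^(-1/xi) for x > 0."""
    x = np.asarray(x, dtype=float)
    if xi == 0.0:
        # Limit as xi -> 0: exponential distribution with mean sigma.
        return 1.0 - np.exp(-x / sigma)
    # For xi < 0 the distribution has a finite upper end point at -sigma / xi;
    # clipping the bracket at zero gives P(x) = 1 beyond that point.
    bracket = np.maximum(1.0 + xi * x / sigma, 0.0)
    return 1.0 - bracket ** (-1.0 / xi)

x = np.linspace(0.1, 10.0, 50)
for sigma, xi in [(1.0, 0.0), (2.0, 0.3), (1.0, -0.5)]:
    same = np.allclose(gpd_cdf(x, sigma, xi), genpareto.cdf(x, c=xi, scale=sigma))
    print(f"sigma={sigma}, xi={xi}: matches scipy.stats.genpareto -> {same}")
```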
Extreme values and risk
GPD: impact of the scale parameter
[Figure: GPD curves for scale parameter σ = 1, 2, 3, with shape parameter ξ = 0]
Extreme values and risk
GPD: impact of the shape parameter
[Figure: GPD curves for different values of the shape parameter ξ (including ξ = 0 and ξ = –0.5), with scale parameter σ = 1]
Extreme values and risk
Threshold = u = 25mm
σ and ξ estimated by maximum likelihood
to be 7.70 and 0.108 respectively
P(X > 100) estimated to be 0.0000209
(about once in 131 years)
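The arithmetic behind these figures can be retraced as follows. The slide does not state the proportion of days on which rainfall exceeds the 25mm threshold, so the sketch backs that rate out from the quoted numbers; treat it as an implied value, not a reported one. The final step, solving for a level with a 1-in-10000 daily exceedance probability, is an illustration of the risk management question and not a result from the talk.

```python
u, sigma, xi = 25.0, 7.70, 0.108   # threshold and ML estimates quoted on the slide
p_100 = 0.0000209                  # quoted daily probability of more than 100mm

def gpd_tail(y, sigma, xi):
    """P(exceedance > y) under the GPD, i.e. 1 - P(y)."""
    return (1.0 + xi * y / sigma) ** (-1.0 / xi)

# Probability of exceeding 100mm given that the 25mm threshold is exceeded.
p_100_given_exceed = gpd_tail(100.0 - u, sigma, xi)

# Implied daily probability of exceeding the threshold (derived, not from the slide).
implied_rate = p_100 / p_100_given_exceed

# Return period in years for a daily probability p_100.
return_period_years = 1.0 / (p_100 * 365.25)

def level_for_daily_prob(p, u, sigma, xi, rate):
    """Level x such that P(X > x) = p on any given day, for p below the rate."""
    return u + (sigma / xi) * ((rate / p) ** xi - 1.0)

design_level = level_for_daily_prob(1e-4, u, sigma, xi, implied_rate)

print(f"P(X > 100 | X > 25)      = {p_100_given_exceed:.5f}")
print(f"implied P(X > 25)        = {implied_rate:.4f}")
print(f"return period for 100mm  = {return_period_years:.0f} years")
print(f"level for 1-in-10000/day = {design_level:.1f} mm")
```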
Extreme values and risk
…but why is the GPD the “right model” to use?
• In theory: for almost any random variable X, the
exceedances of a high threshold u will tend towards the
GPD model as u tends towards infinity
• In practice: we use a threshold that is high but still finite:
we rely on the fact that if this level is sufficiently high
then the asymptotic result will still be approximately true
Extreme value methods
“Parameter stability plot”
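The numbers behind a parameter stability plot can be sketched as follows: fit the GPD to exceedances of several candidate thresholds and check whether the shape estimate, and the modified scale σ – ξu, stay roughly constant. The data here are synthetic (simulated from a GPD with assumed parameters ξ = 0.1, σ = 1), and the function name and threshold choices are illustrative.

```python
import numpy as np
from scipy.stats import genpareto

# Synthetic data (assumption): exact GPD with xi = 0.1, sigma = 1.
rng = np.random.default_rng(0)
true_xi, true_sigma = 0.1, 1.0
data = genpareto.rvs(c=true_xi, scale=true_sigma, size=20_000, random_state=rng)

def stability_table(data, quantiles):
    """Fit the GPD to exceedances of each threshold; return one row per threshold."""
    rows = []
    for q in quantiles:
        u = np.quantile(data, q)
        exceedances = data[data > u] - u
        # Maximum likelihood fit with the location fixed at zero.
        xi_hat, _, sigma_hat = genpareto.fit(exceedances, floc=0)
        modified_scale = sigma_hat - xi_hat * u
        rows.append((u, exceedances.size, xi_hat, modified_scale))
    return rows

table = stability_table(data, [0.5, 0.8, 0.9])
for u, n, xi_hat, mod_scale in table:
    print(f"u={u:6.3f}  n={n:5d}  xi_hat={xi_hat:6.3f}  modified scale={mod_scale:6.3f}")
```

Plotting the last two columns against u, with confidence intervals, gives the usual stability plot; the lowest threshold above which the estimates look stable is a common choice.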
Extreme values and risk
Other extreme value models
• A related approach involves analysing the maximum
values per day, per month or per year (block maxima)
• EVT suggests that a good model to use in this case is
the GEV (Generalised Extreme Value)
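A block maxima analysis can be sketched in a few lines. The daily values below are synthetic (assumed to be exponential with mean 1, 365 days per year, 200 years), not data from the talk. Note that SciPy's `genextreme` uses a shape argument `c` with the opposite sign to the usual GEV shape parameter ξ.

```python
import numpy as np
from scipy.stats import genextreme

# Synthetic daily series (assumption): 200 "years" of 365 exponential daily values.
rng = np.random.default_rng(1)
n_years, days_per_year = 200, 365
daily = rng.exponential(scale=1.0, size=(n_years, days_per_year))

# Block maxima: keep only the largest value in each year.
annual_max = daily.max(axis=1)

# Maximum likelihood fit of the GEV to the annual maxima.
c_hat, loc_hat, scale_hat = genextreme.fit(annual_max)
xi_hat = -c_hat  # convert SciPy's sign convention to the usual GEV shape xi

print(f"number of blocks = {annual_max.size}")
print(f"location = {loc_hat:.2f}, scale = {scale_hat:.2f}, shape xi = {xi_hat:.2f}")
```

For exponential data the maxima tend towards the ξ = 0 (Gumbel) case, so the fitted shape should be close to zero.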
Extreme values and risk
Advantages
• Robust
Relies on weak assumptions
Avoids bias
• Theoretically sound
Justified by asymptotic theory
• Quick & relatively easy to use
• Honest
…about the uncertainties involved
in making statements about very
rare events
Disadvantages
• Inefficient
Most of the data are thrown away
…we may over-estimate uncertainty
…relies on having a large sample size
• Asymptotics
The theory only holds exactly for
infinitely extreme events
Difficult to extend to multivariate case
• Data quality
Sensitive to errors in extreme data
Extreme values and risk
Practicalities
• Basic course: http://www.bioss.ac.uk/alarm/training/
• Software: routines in… R, Genstat, S-plus, Matlab
• Extremes toolkit:
http://www.isse.ucar.edu/extremevalues/evtk.html
• Recommended book: Coles (2001) An introduction to
statistical modeling of extreme values. Springer.
• Contact me: [email protected]