Transcript: Lecture 9

Wroclaw University of Technology
Kingston University, Dept. Computing, Information
Systems and Mathematics
Analyzing stochastic time series
Tutorial
Malgorzata Kotulska
Department of Biomedical Engineering & Instrumentation
Wroclaw University of Technology, Poland
Outline
1. Data-motivated analysis - time series in real life
2. Probability and time series - stochastic vs deterministic
3. Stationarity
4. Correlations in time series
5. Modelling linear time series with short-range correlations – ARIMA processes
6. Time series with long correlations – Gaussian and non-Gaussian self-similar processes, fractional ARIMA
Time series – examples
P. J. Brockwell, R. A. Davis, Introduction to Time Series and Forecasting, Springer, 1987
Ionic channels in cell membrane
M. Kullman, M. Winterhalter, S. Bezrukov, Biophys. J. 82 (2003) p. 802
Nile river
J. Beran, Statistics for long-memory processes, Chapman and Hall, 1994
Objectives of time series analysis
• Data description
• Data interpretation
• Data forecasting
• Control
• Modelling / Hypothesis testing
• Prediction
Time series
[Figure: example signal plotted against time]
Signal $= \sum_i A_i \sin(2\pi f_i t + \varphi_i)$
A = [2, 1, 3, 7, 8]; f = [20, 30, 50, 100]
Time series
• deterministic: periodic or aperiodic
• random
Time series – realization of a stochastic
process
{Xt} is a stochastic time series if each component
takes a value according to a certain probability
distribution function.
A time series model specifies the joint distribution
of the sequence of random variables.
White noise - example of a time series model
Gaussian white noise
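A minimal sketch of Gaussian white noise as a time series model: independent draws from N(0, σ²). The function name `gaussian_white_noise`, the seed and the value σ = 2 are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def gaussian_white_noise(n, sigma=1.0, seed=0):
    """Return n independent draws from N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=sigma, size=n)

# Illustrative parameters (assumed): 10 000 samples with sigma = 2
z = gaussian_white_noise(10_000, sigma=2.0)
sample_mean = z.mean()
sample_var = z.var(ddof=1)
print(sample_mean, sample_var)  # should be near 0 and sigma**2
```

The sample mean and variance only approximate the model parameters; the agreement improves as n grows.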
Stochastic properties of the process
STATIONARITY
System does not change its properties in time
Well-developed analytical methods of signal analysis
and stochastic processes
WHEN IS A STOCHASTIC PROCESS STATIONARY?
$\{X_t\}$ is a strictly stationary time series if
$(X_1, \ldots, X_n) \stackrel{d}{=} (X_{1+h}, \ldots, X_{n+h})$,
where n ≥ 1 and h are integers, and $\stackrel{d}{=}$ denotes equality in distribution.
Properties:
• The random variables are identically distributed.
• An independent identically distributed (iid) sequence is strictly stationary.
Weak stationarity
$\{X_t\}$ is a weakly stationary time series if
• $E X_t = \mu$ and $\mathrm{Var}(X_t) = \sigma^2$ are independent of time t
• $\mathrm{Cov}(X_s, X_r)$ depends on (s − r) only, not on the time t itself.
Properties:
$E(X_t^2)$ is time-invariant.
Quantitative method for stationarity
Reverse Arrangement Test
Weak stationarity:
Testing if $E(X_t^2)$ is time-invariant
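The reverse arrangement test can be sketched as follows, assuming the common variant in which the series is cut into equal blocks, the mean square of each block is computed, and the number of reverse arrangements (pairs i < j with x_i > x_j) in that sequence is compared with its expectation N(N−1)/4 and variance N(2N+5)(N−1)/72 under the no-trend hypothesis. The block count of 20, the normal approximation and all function names are assumptions for illustration.

```python
import numpy as np

def reverse_arrangements(x):
    """Count pairs (i, j) with i < j and x[i] > x[j]."""
    x = np.asarray(x, dtype=float)
    return sum(int(np.sum(x[i] > x[i + 1:])) for i in range(len(x) - 1))

def reverse_arrangements_z(x):
    """Normal-approximation z-score of the count under the no-trend hypothesis."""
    n = len(x)
    mean = n * (n - 1) / 4.0
    var = n * (2 * n + 5) * (n - 1) / 72.0
    return (reverse_arrangements(x) - mean) / np.sqrt(var)

def segment_mean_squares(x, n_segments):
    """Mean square value of each of n_segments equal-length blocks."""
    x = np.asarray(x, dtype=float)
    size = len(x) // n_segments
    blocks = x[:size * n_segments].reshape(n_segments, size)
    return (blocks ** 2).mean(axis=1)

# Assumed example: white noise whose standard deviation grows from 1 to 3
rng = np.random.default_rng(1)
growing = rng.normal(size=2000) * np.linspace(1.0, 3.0, 2000)
z_growing = reverse_arrangements_z(segment_mean_squares(growing, 20))
print(z_growing)  # strongly negative: too few reverse arrangements
```

A z-score far from zero (for example |z| > 1.96 at the 5% level) suggests that the mean square value changes with time, so the series is not weakly stationary.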
Quantile line method
A quantile of order α, 0 ≤ α ≤ 1, is a value $k_\alpha(t)$ such that the probability of the series taking a value no greater than $k_\alpha(t)$ at time t equals α:
$P\{X_t \le k_\alpha(t)\} = \alpha$
PROPERTIES:
• Lines parallel to the time axis ⇒ stationarity
• Lines parallel to each other but not to the time axis ⇒ constant variance, variable mean (or median)
• Lines not parallel to each other ⇒ variable variance (or scale parameter)
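A minimal sketch of quantile lines, assuming that an ensemble of independent realizations of the process is available so that the quantile at each time t can be estimated across realizations. The orders 0.1, 0.5 and 0.9, the linear trend and the name `quantile_lines` are illustrative assumptions.

```python
import numpy as np

def quantile_lines(ensemble, orders=(0.1, 0.5, 0.9)):
    """Empirical quantile lines.

    ensemble has shape (n_realizations, n_times); the result has
    shape (len(orders), n_times), one line per quantile order.
    """
    return np.quantile(np.asarray(ensemble, dtype=float), orders, axis=0)

# Assumed example: white noise plus a linear trend (variable mean, constant variance)
rng = np.random.default_rng(0)
t = np.arange(200)
with_trend = rng.normal(size=(500, t.size)) + 0.05 * t

lines = quantile_lines(with_trend)
spread = lines[2] - lines[0]  # distance between the 0.9 and 0.1 lines
print(lines[1][0], lines[1][-1], spread.mean())
```

Here the lines rise with time but stay roughly the same distance apart, which corresponds to the second case in the list: constant variance with a variable mean.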
Quantile lines of the raw time series
Nonstationarity with a variable mean and variance
Methods for nonstationary time series
• Trend removal
• Segmentation of the series
• Specific analytical methods (e.g. ambiguity function, variograms for the autocorrelation function)
Trend estimation
• Polynomial (or other, e.g. log) estimation and removal
• Filters, e.g. moving average filter, FIR, IIR filters
• Differencing
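The three approaches listed above can be sketched on a synthetic series. The linear trend with slope 0.02, the window length of 25 and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

# Assumed example: linear trend plus Gaussian white noise
rng = np.random.default_rng(0)
t = np.arange(500, dtype=float)
x = 0.02 * t + rng.normal(size=t.size)

# 1. Polynomial estimation and removal (degree 1 here)
coeffs = np.polyfit(t, x, deg=1)
detrended = x - np.polyval(coeffs, t)

# 2. Moving average filter as a trend estimate
window = 25
trend_ma = np.convolve(x, np.ones(window) / window, mode="valid")

# 3. Differencing: X_t - X_{t-1} turns a linear trend into a constant
differenced = np.diff(x)

print(coeffs[0], detrended.mean(), differenced.mean())
```

Differencing needs no model for the trend, but it changes the correlation structure of the noise, so the choice between the methods depends on what is analysed afterwards.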
Seasonal models
Classical decomposition model
$X_t = m_t + Y_t + s_t$
where $X_t$ is the stochastic process, $m_t$ the trend, $Y_t$ the random noise and $s_t$ the seasonal component.
Backshift operator B
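For reference, the backshift operator is conventionally defined as below. The symbol ∇ for the differencing operator follows standard textbook notation and is an assumption here, not something taken from the slide.

```latex
\begin{align*}
B X_t &= X_{t-1}, \qquad B^j X_t = X_{t-j}, \\
\nabla X_t &= X_t - X_{t-1} = (1 - B) X_t, \\
\nabla^d X_t &= (1 - B)^d X_t .
\end{align*}
```

In this notation, differencing a series d times is the same as applying the polynomial $(1-B)^d$ to it.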
Detrended series
P. J. Brockwell, R. A. Davis, Introduction to Time Series and Forecasting, Springer, 1987
Quantile lines of the differenced time series
(Sample) autocorrelation function
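A minimal sketch of the sample autocorrelation function, assuming the usual estimator that divides the lag-h sum of mean-corrected products by n and normalizes by the lag-0 value. The name `sample_acf` and the alternating test series are illustrative assumptions.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho(h) for h = 0, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()                      # mean-corrected series
    c0 = np.dot(xc, xc) / n                # sample autocovariance at lag 0
    return np.array([np.dot(xc[:n - h], xc[h:]) / n / c0
                     for h in range(max_lag + 1)])

# Assumed example: a strictly alternating series of length 100
alternating = np.tile([1.0, -1.0], 50)
rho = sample_acf(alternating, 3)
print(rho)
```

Dividing by n at every lag (rather than by n − h) makes the estimate slightly biased towards zero at large lags but keeps the estimated autocovariance sequence non-negative definite.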
Range of correlations
• Independent data (e.g. WN)
• Short-range correlations
• Long-range correlations (correlated or anti-correlated structure)
ACF for Gaussian WN
Short-range correlations
• Markov processes (e.g. ionic channel conformational transition)
• ARMA (ARIMA) linear processes
• Nonlinear processes (e.g. GARCH process)
ARMA (ARIMA) models
A time series $\{X_t\}$ is an ARMA(p,q) process if it is stationary and if for every t:
$X_t - \phi_1 X_{t-1} - \ldots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \ldots + \theta_q Z_{t-q}$
where $Z_t$ is white noise with mean 0 and variance $\sigma^2$.
The left side of the equation represents the autoregressive AR(p) part, and the right side the moving average MA(q) component.
The polynomials $(1 - \phi_1 z - \ldots - \phi_p z^p)$ and $(1 + \theta_1 z + \ldots + \theta_q z^q)$ must have no common factors.
Examples
The order of the MA component is estimated from the ACF (the last lag at which the sample ACF falls outside Bartlett's limits), and the order of the AR component from the PACF.
The confidence band is commonly taken as $\pm 1.96/\sqrt{n}$ (95% level, series of length n).
Exponential decay of ACF
[Figure: sample ACF of MA(1) and AR(1) processes]
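The different ACF shapes of AR(1) and MA(1) processes can be checked with a short simulation. This is a sketch under assumed parameters (φ = 0.7 or θ = 0.7, Gaussian white noise with unit variance, a burn-in of 500 samples); the theoretical values used for comparison are ρ(h) = φ^h for AR(1), and ρ(1) = θ/(1+θ²) with ρ(h) = 0 for h > 1 for MA(1).

```python
import numpy as np

def simulate_arma11(phi, theta, n, seed=0, burn_in=500):
    """Simulate X_t - phi*X_{t-1} = Z_t + theta*Z_{t-1} with Gaussian WN(0, 1)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + z[t] + theta * z[t - 1]
    return x[burn_in:]          # drop the transient from the zero start

def lag_corr(x, h):
    """Correlation between the series and its copy shifted by h >= 1."""
    return np.corrcoef(x[:-h], x[h:])[0, 1]

n = 20_000
ar1 = simulate_arma11(phi=0.7, theta=0.0, n=n)
ma1 = simulate_arma11(phi=0.0, theta=0.7, n=n, seed=1)

for h in (1, 2, 3):
    print(h, lag_corr(ar1, h), 0.7 ** h, lag_corr(ma1, h))
```

The AR(1) correlations decay geometrically with the lag, while the MA(1) correlations are close to zero beyond lag 1, which is why the ACF is used to read off the MA order.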
Stationary processes with long memory
Qualitative features
• Relatively long periods with high or low levels of observed values
• Over short periods there seem to be cycles and local trends; over the long series there are no particular cycles or persisting trends
• Overall the series looks stationary
Stationary processes with long memory
Quantitative features
• The variance of the sample mean decays to zero at a slower rate: instead of $\mathrm{Var}(\bar{X}_n) \propto n^{-1}$ there is $\mathrm{Var}(\bar{X}_n) \propto n^{-\alpha}$, 0 < α < 1
• The sample autocorrelation function decays to zero in a
power-law manner instead of exponentially
• Similarly, the periodogram (frequency analysis) shows a power-law dependence on frequency
Classical processes with long correlations
• Fractional ARIMA processes (fARIMA)
• Self similar processes
fARIMA
ARMA (p,q):
ARIMA (p,d,q):
fractional ARIMA (p,d,q):
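The three model classes can be written compactly with the backshift operator B. The block below is a sketch in standard textbook notation, which is assumed here; φ and θ denote the autoregressive and moving average polynomials and $Z_t$ is white noise.

```latex
\begin{align*}
\text{ARMA}(p,q):&\quad \phi(B) X_t = \theta(B) Z_t \\
\text{ARIMA}(p,d,q):&\quad \phi(B) (1-B)^d X_t = \theta(B) Z_t, \quad d = 0, 1, 2, \ldots \\
\text{fractional ARIMA}(p,d,q):&\quad \phi(B) (1-B)^d X_t = \theta(B) Z_t, \quad -\tfrac{1}{2} < d < \tfrac{1}{2} \\
\text{with}&\quad \phi(z) = 1 - \phi_1 z - \ldots - \phi_p z^p, \quad \theta(z) = 1 + \theta_1 z + \ldots + \theta_q z^q, \\
&\quad (1-B)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^k
\end{align*}
```

The only difference between ARIMA and fractional ARIMA is that d may take non-integer values, in which case the binomial series for $(1-B)^d$ does not terminate and the process depends on its whole past.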
Self-similar process
A process $X = \{X(t)\}_{t \ge 0}$ is called self-similar if for some H > 0
$X(ct) \stackrel{d}{=} c^H X(t)$ for every c > 0,
where H = 1 − α/2 is the self-similarity index ($H \in \mathbb{R}^+$).
Frequency-domain analysis
Periodogram
Periodogram: an estimate of the power spectral density (PSD) obtained as the Fourier transform of the sample autocorrelation function.
The periodogram of a long-memory time series depends on frequency according to a power law (a straight line on a log-log plot).
This means that doubling the frequency reduces the PSD by the same fraction regardless of the frequency.
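A minimal sketch of a periodogram and its log-log slope. It computes the squared magnitude of the FFT of the mean-corrected series, which at the Fourier frequencies is equivalent to transforming the sample autocovariance. The fitting range (lowest 10% of frequencies), the function names and the two test signals are assumptions; the random walk is used only because its spectrum follows an approximate f⁻² power law.

```python
import numpy as np

def periodogram(x, fs=1.0):
    """Return positive Fourier frequencies and periodogram values."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    spec = np.abs(np.fft.rfft(xc)) ** 2 / (n * fs)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs[1:], spec[1:]            # drop the zero frequency

def loglog_slope(freqs, spec, low_fraction=0.1):
    """Least-squares slope of log(spec) vs log(freq) at low frequencies."""
    k = max(2, int(len(freqs) * low_fraction))
    slope, _ = np.polyfit(np.log(freqs[:k]), np.log(spec[:k]), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.normal(size=4096)             # flat spectrum expected
walk = np.cumsum(white)                   # approximately f**-2 spectrum

slope_white = loglog_slope(*periodogram(white))
slope_walk = loglog_slope(*periodogram(walk))
print(slope_white, slope_walk)
```

The raw periodogram is a noisy estimate, so the fitted slope fluctuates from one realization to another; averaging over segments or over neighbouring frequencies reduces the scatter.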
Basic features of self similar process
1. APPEARANCE. X(rt) is statistically indistinguishable from X(t) with its amplitude rescaled by $r^H$.
2. VARIANCE of the signal changes as $\mathrm{Var}(X(t)) \propto t^{2H}$
3. CORRELATION - correlated or anticorrelated structure
H = 0.5: no memory
H > 0.5: long memory
H < 0.5: antipersistent long correlations ("short memory")
4. PERIODOGRAM - power-law dependence on frequency
Nile
Example
J. Beran, Statistics for long-memory processes, Chapman and Hall, 1994
Methods
• R/S analysis by Hurst
• DFA – Detrended Fluctuation Analysis
• Exponent-based (correlogram, periodogram of the residuals): H = 1 − α/2
• Other (for appropriate PDFs, e.g. Orey index)
Hurst exponent – the algorithm
A series with N elements is divided into shorter series – n elements
each
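A sketch of the classical rescaled-range (R/S) procedure, which may differ in detail from the variant shown in the lecture. For each window length n the series is split into blocks, the range R of the cumulative deviations from the block mean is divided by the block standard deviation S, and H is estimated as the slope of log(R/S) against log(n). The window sizes and function names are assumptions.

```python
import numpy as np

def rescaled_range(segment):
    """R/S statistic of one block."""
    dev = np.cumsum(segment - segment.mean())   # cumulative deviations
    r = dev.max() - dev.min()                   # range R
    s = segment.std()                           # standard deviation S
    return r / s

def hurst_rs(x, window_sizes):
    """Estimate H as the slope of log(mean R/S) vs log(window length)."""
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        k = len(x) // n
        segments = x[:k * n].reshape(k, n)
        rs = [rescaled_range(seg) for seg in segments if seg.std() > 0]
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs)))
    h, _ = np.polyfit(log_n, log_rs, 1)
    return h

rng = np.random.default_rng(0)
white = rng.normal(size=8192)
h_white = hurst_rs(white, window_sizes=[16, 32, 64, 128, 256, 512, 1024])
print(h_white)  # near 0.5 for white noise, usually slightly above
```

For short windows the R/S estimate is known to be biased upwards, so values a little above 0.5 for independent data should not be read as evidence of long memory.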
Hurst exponent
Classical self-similar processes
Gaussian noise
• A series $\xi_n$ (n = 1, ..., N) of uncorrelated random variables
• Each $\xi_n$ has a Gaussian distribution N(0, σ)
Brownian motion
• A sum of a Gaussian white-noise sequence: $y_n(\mathrm{Bm}) = \sum_{i=1}^{n} \xi_i$
$\sigma_n(\mathrm{Bm}) = \sigma n^{1/2}$
Fractional Brownian motions (fBm)
$\sigma_n(\mathrm{fBm}) \propto \sigma n^{H}$
Fractional Brownian motions (fBm)
$X = \{X(t)\}_{t \ge 0}$
is a nonstationary Gaussian process with mean zero and an autocovariance function:
$\gamma(t_1, t_2) = \frac{1}{2} \left( |t_1|^{2H} + |t_2|^{2H} - |t_1 - t_2|^{2H} \right) \mathrm{Var}(X(1))$
Fractional Gaussian noise (fGn)
A stationary process of increments in fBm
(differences between values separated by some step)
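The autocovariance function above is enough to simulate fBm on a small grid: build the covariance matrix, take its Cholesky factor and multiply it by independent standard normal variables. This is a minimal sketch with assumed parameters (H = 0.8, Var(X(1)) = 1, unit time steps, 2000 paths); the Cholesky approach is only one of several simulation methods and is practical for short series.

```python
import numpy as np

def fbm_covariance(times, hurst, var1=1.0):
    """fBm autocovariance matrix for the given time points."""
    t1, t2 = np.meshgrid(times, times, indexing="ij")
    return 0.5 * var1 * (np.abs(t1) ** (2 * hurst)
                         + np.abs(t2) ** (2 * hurst)
                         - np.abs(t1 - t2) ** (2 * hurst))

def simulate_fbm(n_steps, hurst, n_paths=1, seed=0):
    """Return times 1..n_steps and an array of fBm paths (n_paths, n_steps)."""
    times = np.arange(1, n_steps + 1, dtype=float)
    chol = np.linalg.cholesky(fbm_covariance(times, hurst))
    rng = np.random.default_rng(seed)
    paths = (chol @ rng.normal(size=(n_steps, n_paths))).T
    return times, paths

times, paths = simulate_fbm(n_steps=200, hurst=0.8, n_paths=2000)

# Increments of fBm over unit steps: fractional Gaussian noise (X(0) = 0)
fgn = np.diff(paths, axis=1, prepend=0.0)

# Standard deviation across paths should grow as t**H
sd = paths.std(axis=0)
scaling_exponent, _ = np.polyfit(np.log(times), np.log(sd), 1)
print(scaling_exponent, fgn.var(axis=0).mean())
```

The fitted exponent should be close to the chosen H, while the variance of the increments stays roughly constant in time, in line with fBm being nonstationary and fGn stationary.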
Gaussian or non-Gaussian process
Fractional Lévy Stable Motion
In the fractional Lévy stable motion (FLSM) the distribution is Lévy-stable.
[Figure: stable distribution with α = 0.5 (solid) and distributions from its domain of attraction: Burr and Pareto (broken and dotted)]
Scaling properties of PDF
• α-stable distributions have scaling properties: a sum of independent and identically distributed random variables maintains the same shape of the distribution.
• Like the Gaussian distribution, a stable distribution is a limit distribution for normalized sums of independent, identically distributed random variables (central limit theorem, CLT).
• Only a few α-stable distributions have closed-form formulas for their probability density function. Usually only the characteristic function is given. The distinctive properties of α-stable distributions with α < 2 are their long tails, infinite variance and, in some cases, infinite mean value.
Fractional Lévy stable motion
The FLSM process is a self-similar non-stationary process which can be represented as
$Z_\alpha^H(t) = \int_{-\infty}^{\infty} \left[ (t-u)_+^{H-1/\alpha} - (-u)_+^{H-1/\alpha} \right] dZ_\alpha(u)$
where $(x)_+ = \max(x, 0)$, $Z_\alpha(u)$ is a symmetric Lévy α-stable motion, and α is the stability index of the stable distribution.
The increment process of FLSM is stationary and is called fractional stable noise (FSN).
Memory of a self-similar process
d = H − 1/α
For a Gaussian process α = 2
For d > 0 the memory is long – a long-range persistent process.
Otherwise (d < 0) – a long-range antipersistent process ("short memory"). The time series looks very rough.
Summary
• Time series can be deterministic or stochastic. Visual distinction is not always possible.
• A stochastic time series may be tested analytically by statistical methods, and an appropriate model fitted, if the series is stationary. Otherwise pre-processing is needed.
• Random data in a time series may be correlated. Correlations are called memory.
• Independent data (e.g. WN) have no memory. Each element takes its value independently of the others.
Summary (2)
• In short-range memory the time series is correlated with only a few preceding elements; the autocorrelation function shows an exponential decay.
• Typical models of time series with short memory are linear ARMA models: a linear combination of previous elements and white noise components. The order of the MA component is estimated from the ACF, the order of the AR component from the PACF.
• In long-range memory the series is correlated with very distant elements. The autocorrelation function decays more slowly than exponentially, according to a power law.
• Long-range time series can typically be modelled by self-similar processes (fBm, FLSM) or fARIMA linear models.
Recommended text books
• P. J. Brockwell, R. A. Davis, Introduction to Time Series and Forecasting, Springer, 1987
• J. Beran, Statistics for long-memory processes, Chapman and Hall, 1994
• G. E. P. Box, G. M. Jenkins, Time series analysis: forecasting and control, Holden-Day, 1970