Fixing Autocorrelation
Spatial autoregressive
methods
Nr245
Austin Troy
Based on Spatial Analysis by Fortin
and Dale, Chapter 5
Autocorrelation types
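• None: independence
$x_i = \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$
• Spatial independence, functional dependence
$x_i = \beta z_i + \varepsilon_i$
$z_i = \eta_i$
• True autocorrelation >> inherent autoregressive
$x_i = \rho x_{i-1} + \varepsilon_i$, $-1 < \rho < 1$
• Functional autocorrelation >> induced autoregressive
$x_i = \beta z_i + \varepsilon_i$
$z_i = \rho z_{i-1} + \eta_i$
where $\varepsilon_i, \eta_i \sim N(0, \sigma^2)$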
Autocorrelation types
• Double autoregressive
$x_i = \beta z_i + \rho_x x_{i-1} + \varepsilon_i$
$z_i = \rho_z z_{i-1} + \eta_i$
• Notice there are now two autocorrelation parameters, $\rho_x$ and $\rho_z$
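To make the inherent autoregressive case concrete, here is a minimal sketch that simulates a one-dimensional series in which each value is ρ times the previous value plus independent normal noise, then estimates the lag-1 autocorrelation from the result. The function names, the seed, and the choice of ρ = 0.6 are illustrative assumptions, not taken from the slides.

```python
import random


def simulate_ar1(n, rho, sigma=1.0, seed=0):
    """Simulate x_i = rho * x_{i-1} + e_i with e_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    x = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, sigma))
    return x


def lag1_autocorrelation(x):
    """Sample correlation between each value and its predecessor."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i - 1] - mean) for i in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Illustrative values (assumptions): a long series with rho = 0.6.
series = simulate_ar1(n=20000, rho=0.6)
print(round(lag1_autocorrelation(series), 2))
```

With a long series the estimate should land close to the ρ used in the simulation, while setting ρ = 0 reproduces the independent case.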
Effects?
• Standard test statistics become “too liberal”: more significant results than the data justify
• Because observations are not totally independent, the data have lower actual degrees of freedom, or a lower “effective sample size”: n’ instead of n. Since the t statistic's denominator is s/√n, using an n that is too big shrinks the denominator and inflates the t statistic
What to do? Ineffective approaches
• Why not just adjust up the significance level? E.g. 99% instead of 95%? Because we don’t know by how much to adjust without further information. Could end up with a test that is way too conservative
• Why not just adjust sampling to only include “independent samples”? Because it is wasteful of data and because it is easy to misjudge the “critical distance to independence”
Best approach: Adjust effective
sample size
• For large sample sizes: $n' = n \frac{1 - \rho}{1 + \rho}$
– So for instance n = 1000 and ρ = 0.4 means n’ ≈ 429
• Problem is that, to be useful, the autoregressive model (the ρ parameter) has to be an effective descriptor of the structure of autocorrelation in the data
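The large-sample adjustment n' = n(1 − ρ)/(1 + ρ) is simple enough to check by hand, but a small sketch makes it easy to try other values. The function name is an assumption made for illustration; the example reuses the slide's n = 1000 and ρ = 0.4.

```python
def effective_sample_size(n, rho):
    """Large-sample effective sample size under a first-order
    autoregressive model: n' = n * (1 - rho) / (1 + rho)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


# The slide's example: n = 1000 observations with rho = 0.4.
n_eff = effective_sample_size(1000, 0.4)
print(round(n_eff))  # 429
```

Because the t statistic's denominator is s/√n, substituting n' for n scales the statistic by √(n'/n), which is below 1 whenever ρ is positive.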
Moving average models
• How calculated depends on “order”
• A simple model for adjusting sample size: first order autoregressive model, where only immediate (first order) neighbors are correlated, with ρ > 0. Correlations between all other pairs are zero.
• In such a model $x_i$ is a function of $x_{i+1}$ and $x_{i-1}$
• Hence half the info for $x_i$ is in each neighbor; this produces ρ = 0.5 for large n and n’ = n/2.
• A model of order k can take the form $x_i = \varepsilon_i + \sum_{j=1}^{k} \rho_j \varepsilon_{i-j}$
• Translates into generalized matrix form $x = Z\beta + (I + \rho W)\varepsilon$
• With variance covariance matrix $C = \sigma^2 [(I + \rho W)(I + \rho W)^T]$
Moving average
• When you increase the order, calculating sample size gets complicated; e.g. a second order model, where there are now two ρ parameters:
$n' = n^2 / [n + 2(n-1)\rho_1 + 2(n-2)\rho_2]$
• Important point: If there are several different levels of
autocorrelation (k), each k must be incorporated even if
non-significant; this can have a huge impact on the
calculation of effective sample size
• Fortin and Dale recommend not using the moving average approach because it is very sensitive to irregularities in the data and can produce a wide range of estimates
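The second order formula, n' = n² / [n + 2(n − 1)ρ₁ + 2(n − 2)ρ₂], follows a pattern that can be sketched for any number of lags. The function below extends that pattern to an arbitrary list of lag correlations; this generalization and the names used are assumptions made for illustration, not something the slides spell out.

```python
def ma_effective_sample_size(n, rhos):
    """Effective sample size when lag-k correlations rhos[k-1] are nonzero:
    n' = n^2 / (n + 2 * sum_k (n - k) * rho_k)."""
    penalty = sum(2 * (n - k) * rho_k for k, rho_k in enumerate(rhos, start=1))
    return n ** 2 / (n + penalty)


# First order with rho = 0.5: n' is close to n / 2 for large n.
print(ma_effective_sample_size(1000, [0.5]))

# Hypothetical second-order case: a small lag-2 correlation
# still lowers the effective sample size noticeably.
print(ma_effective_sample_size(100, [0.5]))
print(ma_effective_sample_size(100, [0.5, 0.05]))
```

Comparing the last two calls shows why every lag has to be included: even a small second-lag correlation changes the denominator by 2(n − 2)ρ₂, which grows with n.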
Two dimensional approaches
• Problem with the MA approach as it was just presented is that it assumes one-dimensionality
• In spatial data, $x_i$ most likely depends on all neighbors
• Two best ways for dealing with this:
– Simultaneous autoregressive models (SAR)
– Conditional autoregressive models (CAR)
SAR
• Based on the concept of a set of simultaneous equations to be solved. In this, $x_i$ and $x_{i-1}$ are each defined by their own equations
$x = Z\beta + u$
• Where x is a vector and is linearly dependent on a vector of underlying variables $z_1, z_2, z_3, \ldots$, given as matrix Z; u is a vector of non-independent error terms with mean zero and var-covar matrix C
• Spatial autocorrelation enters via u, where
$u = \rho W u + \varepsilon$
• Here ε is an independent error term and W is the matrix of neighbor weights, standardized to row totals of 1. W is not necessarily symmetrical, allowing for inclusion of anisotropy. $W_{ij} > 0$ if the value at location i is not independent of the value at location j
SAR
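• This yields the model
$x = Z\beta + \rho W (x - Z\beta) + \varepsilon$
• With variance covariance matrix (from u)
$C = \sigma^2 [(I - \rho W)^T (I - \rho W)]^{-1}$
• Note how similar to MA; the difference is that the MA formula has no inverse
• The diagonal elements of C are variances; the off-diagonal elements are covariances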
From Fortin and Dale p. 231
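A short numerical sketch can show what the SAR covariance matrix C = σ²[(I − ρW)ᵀ(I − ρW)]⁻¹ looks like for a small set of locations. The four-site transect, the value ρ = 0.5, and the function names are hypothetical choices made for illustration; the sketch assumes NumPy is available.

```python
import numpy as np


def row_standardize(adjacency):
    """Scale each row of a neighbor matrix so that it sums to 1.
    Assumes every location has at least one neighbor."""
    adjacency = np.asarray(adjacency, dtype=float)
    return adjacency / adjacency.sum(axis=1, keepdims=True)


def sar_covariance(W, rho, sigma2=1.0):
    """C = sigma^2 * [(I - rho W)^T (I - rho W)]^{-1}."""
    A = np.eye(W.shape[0]) - rho * W
    return sigma2 * np.linalg.inv(A.T @ A)


# Hypothetical transect of 4 sites; each site neighbors the adjacent ones.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_standardize(adjacency)
C = sar_covariance(W, rho=0.5)
print(np.round(C, 3))
```

Row standardization makes W asymmetric here (the end sites put all their weight on one neighbor), yet C itself is still symmetric, as a covariance matrix must be.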
CAR
• More commonly used in spatial statistics
• Not based on spatial dependence per se; instead probability
of a certain value is conditional on neighbor values
• Similar to SAR, but requires that weight matrix be
symmetrical
• Here $C = \sigma^2 (I - \varphi V)^{-1}$, where φ is the autocorrelation parameter and V is a symmetrical weight matrix
Any SAR process is a CAR process if $V = W + W^T - W^T W$
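One way to see where a relation of this form comes from is to compare the inverse covariance (precision) matrices of the two models. The sketch below assumes the autocorrelation parameter has been absorbed into the weights, so it works with B = ρW and a CAR parameter of 1; that convention, the three-site weight matrix, and ρ = 0.4 are assumptions of the sketch rather than something stated in the slides.

```python
import numpy as np

rho = 0.4
# Hypothetical row-standardized weights for 3 sites in a line.
W = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
])
B = rho * W          # autocorrelation parameter absorbed into the weights
I = np.eye(3)

# SAR precision (up to 1 / sigma^2): (I - B)^T (I - B)
sar_precision = (I - B).T @ (I - B)

# CAR weight matrix built from the SAR weights, with CAR parameter 1.
V = B + B.T - B.T @ B
car_precision = I - V

print(np.allclose(V, V.T))                          # V is symmetric
print(np.allclose(sar_precision, car_precision))    # same precision matrix
```

Expanding (I − B)ᵀ(I − B) gives I − B − Bᵀ + BᵀB, which is exactly I − V, so the two models share a covariance matrix under this convention even though W itself is not symmetric.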