Fixing Autocorrelation


Spatial autoregressive methods
NR245
Austin Troy
Based on Spatial Analysis by Fortin and Dale, Chapter 5
Autocorrelation types
• None: independence
  $x_i = \mu + \varepsilon_i,\quad \varepsilon_i \sim N(0, \sigma^2)$
• Spatial independence, functional dependence
  $x_i = \beta z_i + \varepsilon_i$
  $z_i = \mu + \eta_i$
• True autocorrelation >> inherent autoregressive
  $x_i = \rho x_{i-1} + \varepsilon_i,\quad -1 < \rho < 1$
• Functional autocorrelation >> induced autoregressive
  $x_i = \beta z_i + \varepsilon_i$
  $z_i = \rho z_{i-1} + \eta_i,\quad \eta_i \sim N(0, \sigma^2)$
Autocorrelation types
• Double autoregressive
  $x_i = \beta z_i + \rho_x x_{i-1} + \varepsilon_i$
  $z_i = \rho_z z_{i-1} + \eta_i$
• Notice there are now two autocorrelation parameters, $\rho_x$ and $\rho_z$ (simulated in the sketch below)
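As a rough, self-contained illustration (plain NumPy; the parameter values are arbitrary and not from Fortin and Dale), the sketch below simulates the double autoregressive model above and reports the lag-1 sample autocorrelation of the result.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
beta, rho_x, rho_z = 1.5, 0.5, 0.6   # illustrative values only

# z follows its own first-order autoregressive process: z_i = rho_z * z_{i-1} + eta_i
eta = rng.normal(0.0, 1.0, n)
z = np.zeros(n)
for i in range(1, n):
    z[i] = rho_z * z[i - 1] + eta[i]

# x depends on z and on its own previous value: x_i = beta*z_i + rho_x*x_{i-1} + eps_i
eps = rng.normal(0.0, 1.0, n)
x = np.zeros(n)
for i in range(1, n):
    x[i] = beta * z[i] + rho_x * x[i - 1] + eps[i]

# Lag-1 sample autocorrelation of x reflects both sources of autocorrelation
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(round(r1, 2))
```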
Effects?
• Standard test statistics become “too liberal”: more significant results than the data justify
• Because observations are not totally independent, the data carry fewer actual degrees of freedom, i.e. a lower “effective sample size” n' instead of n; since the t-statistic denominator is $s/\sqrt{n}$, an n that is too big shrinks the standard error and inflates the t statistic (see the numeric sketch below)
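A quick numeric sketch of that inflation (my own illustration, with made-up summary statistics): holding the sample mean and standard deviation fixed, replacing n with a smaller effective n' shrinks the t statistic by the factor $\sqrt{n'/n}$.

```python
import math

xbar, mu0, s = 0.2, 0.0, 1.0   # made-up sample mean, null mean, and SD
n, n_eff = 1000, 429           # nominal vs. effective sample size

t_nominal = (xbar - mu0) / (s / math.sqrt(n))
t_effective = (xbar - mu0) / (s / math.sqrt(n_eff))

print(round(t_nominal, 2), round(t_effective, 2))  # about 6.32 vs 4.14
```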
What to do? Non-effective approaches
• Why not just adjust up the significance level, e.g. 99% instead of 95%? Because we don't know by how much to adjust without further information; we could end up with a test that is far too conservative
• Why not just adjust the sampling to include only “independent” samples? Because that is wasteful of data, and because it is easy to misjudge the “critical distance to independence”
Best approach: Adjust effective
sample size
• For large sample sizes,
  $n' = n\,\frac{1-\rho}{1+\rho}$
  – So, for instance, n = 1000 and ρ = 0.4 gives n' ≈ 429 (see the sketch below)
• The problem is that, to be useful, the autoregressive model (the ρ parameter) has to be an effective descriptor of the structure of autocorrelation in the data
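A minimal sketch of that adjustment (the function name is mine; the values are the slide's example):

```python
def effective_n_ar1(n, rho):
    """Large-sample effective sample size under a first-order
    autoregressive model: n' = n * (1 - rho) / (1 + rho)."""
    return n * (1 - rho) / (1 + rho)

print(round(effective_n_ar1(1000, 0.4)))  # ~429
```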
Moving average models
• How it is calculated depends on the “order”
• A simple model for adjusting sample size: a first-order autoregressive model in which only immediate (first-order) neighbors are correlated, with ρ > 0; all other pairs are zero
• In such a model, $x_i$ is a function of $x_{i+1}$ and $x_{i-1}$
• Hence half the information about $x_i$ is in each neighbor; this produces ρ = 0.5 for large n, and n' = n/2
• A kth-order model can take the form $x_i = \varepsilon_i + \sum_{j=1}^{k} \beta_j \varepsilon_{i-j}$
• This translates into the generalized matrix form $x = Z\beta + W\varepsilon + \varepsilon$
• With variance-covariance matrix $C = \sigma^2 (I + W)(I + W)^T$ (built in the sketch below)
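A small NumPy sketch of that covariance (assuming the reconstruction above; the toy row-standardized weight matrix for five sites on a transect is my own example):

```python
import numpy as np

n, sigma2 = 5, 1.0

# Binary first-order neighbors on a 1-D transect, row-standardized to sum to 1
A = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            A[i, j] = 1.0
W = A / A.sum(axis=1, keepdims=True)

# Moving-average error covariance: C = sigma^2 (I + W)(I + W)^T
I = np.eye(n)
C_ma = sigma2 * (I + W) @ (I + W).T
print(np.round(C_ma, 2))
```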
Moving average
• When you increase the order, calculating the effective sample size gets more complicated; e.g. in a second-order model there are now two ρ parameters (see the sketch after this slide):
  $n' = n^2 / [\,n + 2(n-1)\rho_1 + 2(n-2)\rho_2\,]$
• Important point: if there are several different lags of autocorrelation ($\rho_k$), each $\rho_k$ must be incorporated even if non-significant; this can have a huge impact on the calculation of effective sample size
• Fortin and Dale recommend not using the moving average approach because it is very sensitive to irregularities in the data and can produce a wide range of estimates
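A sketch of that calculation, generalized to k lags (the function name is mine; the ρ values are arbitrary):

```python
def effective_n_ma(n, rhos):
    """Effective sample size for a moving-average model with
    autocorrelation rho_k at lags k = 1..len(rhos):
    n' = n^2 / [n + 2 * sum_k (n - k) * rho_k]."""
    denom = n + 2 * sum((n - k) * r for k, r in enumerate(rhos, start=1))
    return n ** 2 / denom

# Second-order example with rho_1 = 0.4 and rho_2 = 0.1
print(round(effective_n_ma(1000, [0.4, 0.1])))  # ~500
```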
Two-dimensional approaches
• The problem with the MA approach as just presented is that it assumes one-dimensionality
• In spatial data, $x_i$ most likely depends on all of its neighbors
• The two best ways of dealing with this:
– Simultaneous autoregressive models (SAR)
– Conditional autoregressive models (CAR)
SAR
• Based on the concept of a set of simultaneous equations to be solved; here $x_i$ and $x_{i-1}$ are each defined by their own equations
  $x = Z\beta + u$
• Where x is a vector that depends linearly on a vector of underlying variables $z_1, z_2, z_3, \ldots$, given as the matrix Z, and u is a vector of non-independent error terms with mean zero and variance-covariance matrix C
• Spatial autocorrelation enters via u (simulated in the sketch after this slide), where
  $u = Wu + \varepsilon$
• Here ε is an independent error term and W is a matrix of neighbor weights standardized so that row totals equal 1. W is not necessarily symmetrical, allowing for the inclusion of anisotropy. $W_{ij} > 0$ if the value at location i is not independent of the value at location j
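A minimal NumPy sketch (not the book's code) that generates a SAR error vector on a toy 5 × 5 lattice; note that the slide folds the autocorrelation parameter into W, while this sketch keeps a separate ρ for clarity:

```python
import numpy as np

rng = np.random.default_rng(0)
side = 5
n, rho, sigma = side * side, 0.5, 1.0   # illustrative values only

# Rook-neighbor weights on a 5 x 5 grid, row-standardized to sum to 1
A = np.zeros((n, n))
for i in range(n):
    r, c = divmod(i, side)
    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= rr < side and 0 <= cc < side:
            A[i, rr * side + cc] = 1.0
W = A / A.sum(axis=1, keepdims=True)

# SAR errors: u = rho*W u + eps  =>  u = (I - rho*W)^(-1) eps
eps = rng.normal(0.0, sigma, n)
u = np.linalg.solve(np.eye(n) - rho * W, eps)
print(np.round(u[:5], 2))
```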
SAR
• This yields the model
  $x = Z\beta + W(x - Z\beta) + \varepsilon$
• With variance-covariance matrix (from u), formed directly in the sketch after this slide
  $C = \sigma^2 [(I - W)^T (I - W)]^{-1}$
• Note how similar this is to the MA version; the difference is that there is no inverse in the MA formula
• The diagonal elements of C are the variances
From Fortin and Dale, p. 231
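A self-contained sketch of that covariance for a toy transect W (again keeping ρ explicit rather than folded into W):

```python
import numpy as np

n, rho, sigma2 = 5, 0.5, 1.0

# Row-standardized first-order neighbors on a 1-D transect (toy W)
A = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            A[i, j] = 1.0
W = A / A.sum(axis=1, keepdims=True)

# SAR error covariance: C = sigma^2 [ (I - rho W)^T (I - rho W) ]^(-1)
B = np.eye(n) - rho * W
C_sar = sigma2 * np.linalg.inv(B.T @ B)
print(np.round(C_sar, 2))   # diagonal entries are the variances
```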
CAR
• More commonly used in spatial statistics
• Not based on spatial dependence per se; instead, the probability of a certain value is conditional on neighboring values
• Similar to SAR, but requires that the weight matrix be symmetrical
  $C = \sigma^2 (I - \rho V)^{-1}$
• Here ρ is the autocorrelation parameter and V is a symmetrical weight matrix
• Any SAR process is a CAR process if
  $V = W + W^T - W^T W$
  (checked numerically in the sketch below)
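A short numerical check of that correspondence (the toy W is mine; the autocorrelation parameter is folded into W, as on the SAR slide): the SAR covariance $\sigma^2[(I - W)^T(I - W)]^{-1}$ equals the CAR form $\sigma^2(I - V)^{-1}$ once $V = W + W^T - W^TW$.

```python
import numpy as np

n, sigma2 = 5, 1.0

# Toy weight matrix: row-standardized first-order neighbors on a transect,
# scaled by 0.4 so the autocorrelation parameter is folded into W
A = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            A[i, j] = 1.0
W = 0.4 * (A / A.sum(axis=1, keepdims=True))

I = np.eye(n)
C_sar = sigma2 * np.linalg.inv((I - W).T @ (I - W))   # SAR covariance

V = W + W.T - W.T @ W                                 # symmetric CAR weights
C_car = sigma2 * np.linalg.inv(I - V)                 # CAR covariance

print(np.allclose(C_sar, C_car))  # True
```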