Spatial Distribution - UBC Department of Geography

Spatial Statistics
Applied to point data
Centrographic Statistics
• Most basic type of descriptor for spatial
distributions, includes:
– Mean Center
– Median Center
– Standard Deviation
– Standard Distance
– Standard Deviational Ellipse
• Two-dimensional counterparts of the basic statistical
moments of a single-variable distribution
• Modified from one dimension to two
dimensions
Mean Center
• Simply the mean of X and Y
• Also called center of gravity
• Sum of the differences between the mean X
and every X value is zero (same for Y)
$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}$$
$$\bar{Y} = \frac{\sum_{i=1}^{N} Y_i}{N}$$
Weighted Mean Center
• Produced by weighting each coordinate
by another variable (e.g., population)
• Points associated with areas can have
the characteristics of those areas
included
$$\bar{X} = \frac{\sum_{i=1}^{N} W_i X_i}{\sum_{i=1}^{N} W_i}$$
$$\bar{Y} = \frac{\sum_{i=1}^{N} W_i Y_i}{\sum_{i=1}^{N} W_i}$$
Standard Deviation of X and Y
• A measure of dispersion
• Does not provide a single summary
statistic of the dispersion
$$S_x = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}$$
$$S_y = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}}$$
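As a sketch of why this yields two numbers rather than one summary statistic, the following computes the sample standard deviation (N - 1 denominator) of each coordinate separately. The function name and the sample points are hypothetical.

```python
import math


def coordinate_std_devs(points):
    """Return (S_x, S_y), the sample standard deviations of X and Y."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in points) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in points) / (n - 1))
    return s_x, s_y


# Hypothetical points that spread twice as far in X as in Y.
pts = [(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)]
s_x, s_y = coordinate_std_devs(pts)
print(s_x, s_y)
```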
Standard Distance Deviation
• Represents the standard deviation of the distance of
each point from the mean center
• Is the two dimensional equivalent of standard
deviation
• Where:
$$S_{xy} = \sqrt{\frac{\sum_{i=1}^{N} (d_{iMC})^2}{N - 2}}$$
where $d_{iMC}$ is the distance between each point, i, and
the mean center, and N is the total number of points.
We subtract 2 from the number of points to provide
an unbiased estimate of the standard distance, since there
are two constants (the mean of X and the mean of Y)
from which the distance is measured
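The standard distance deviation can be sketched as below, using the N - 2 denominator described above. The function name and the four sample points (corners of a square) are hypothetical.

```python
import math


def standard_distance_deviation(points):
    """Standard distance deviation of (x, y) tuples about their mean center."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points for an N - 2 denominator")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Squared distance of each point from the mean center.
    sum_sq = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)
    return math.sqrt(sum_sq / (n - 2))


# Hypothetical points: corners of a 2 x 2 square centred on (1, 1).
pts = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
sdd = standard_distance_deviation(pts)
print(sdd)
```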
Standard Distance Deviation
• Because it is an average distance from
the mean center, it is represented as a
single vector
Standard Deviation Ellipse
• While the standard distance deviation is
a good single measure of the dispersion
of the incidents around the mean center,
it does not show any directional skew in
the data (anisotropy).
• The standard deviation ellipse gives
dispersion in two dimensions
Standard Deviational Ellipse
$$\text{Distribution} = \frac{(\sigma_x^2 + \sigma_y^2)}{2}$$
where $\sigma_x$ and $\sigma_y$ are the two standard deviations in the x and y directions;
they are orthogonal to each other and define an ellipse.
Testing the Differences
Crime Analysis with
Centrographic Statistics
• A good “free” software product for doing some
basic spatial statistics is CrimeStat
• Review of CrimeStat Figures 4.19 – 4.26
– Seeing the relationship between mean center,
standard distance, and standard deviational ellipse
• Centrographic Statistics in Monroe County
Point Pattern Analysis
• The spatial pattern of the distribution of a
set of point features.
– Spatial properties of the entire body of points
are studied rather than the individual entities
– Points are zero-dimensional objects, so the only
valid measures of their distribution are the
number of occurrences in the pattern and
their respective geographic locations
Descriptive Statistics of Point
Features
• Frequency: number of point features
occurring on a map
Types of Distribution
• Three general patterns
– Random: any point is equally likely to occur at any
location, and the position of any point is not affected
by the position of any other point. There is no
apparent ordering of the distribution
– Uniform: every point is as far from all of its neighbors
as possible
– Clustered: many points are concentrated close
together, and large areas contain very few, if
any, points
Quadrat Analysis
• Based on a measure derived from data
obtained after a uniform grid network is drawn
over a map of the distribution of interest
• The frequency count, the number of points
occurring within each quadrat, is recorded first
• These data are then used to compute a
measure called the variance
• The variance compares the number of points in
each grid cell with the average number of
points over all of the cells
• The variance of the distribution is compared to
the characteristics of a random distribution
Worked example: 20 points in 10 quadrats under three patterns
(x = number of points per quadrat)

Quadrat grids (point counts per cell, 2 rows by 5 columns):
RANDOM:     3 5 2 1 3  /  1 0 1 3 1
UNIFORM:    2 2 2 2 2  /  2 2 2 2 2
CLUSTERED:  0 0 10 0 0  /  0 0 10 0 0

             RANDOM          UNIFORM         CLUSTERED
Quadrat #    x      x^2      x      x^2      x      x^2
1            3      9        2      4        0      0
2            1      1        2      4        0      0
3            5      25       2      4        0      0
4            0      0        2      4        0      0
5            2      4        2      4        10     100
6            1      1        2      4        10     100
7            1      1        2      4        0      0
8            3      9        2      4        0      0
9            3      9        2      4        0      0
10           1      1        2      4        0      0
Sum          20     60       20     40       20     200

Variance     2.222           0.000           17.778
Mean         2.000           2.000           2.000
Var/Mean     1.111           0.000           8.889

$$N = \text{number of quadrats} = 10$$
$$\text{Variance} = \frac{\sum x^2 - [(\sum x)^2 / N]}{N - 1}$$
$$\text{Variance-mean ratio} = \frac{\text{variance}}{\text{mean}}$$
Quadrat Analysis
• A random distribution would indicate that
the variance and mean are the
same. Therefore, we would expect a
variance-mean ratio around 1
• Values other than 1 would indicate a nonrandom distribution:
a ratio well below 1 suggests a uniform pattern, and a ratio
well above 1 suggests a clustered pattern
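The variance-mean ratio for the three worked examples (random, uniform, clustered) can be reproduced with a short sketch. The function name `variance_mean_ratio` is hypothetical; the quadrat counts are the ones tabulated in the worked example.

```python
def variance_mean_ratio(counts):
    """Return (variance, mean, variance / mean) of quadrat counts."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two quadrats")
    total = sum(counts)
    if total == 0:
        raise ValueError("no points in any quadrat, so the mean is zero")
    sum_sq = sum(c * c for c in counts)
    # Computational form: (sum x^2 - (sum x)^2 / N) / (N - 1)
    variance = (sum_sq - total ** 2 / n) / (n - 1)
    mean = total / n
    return variance, mean, variance / mean


# Points per quadrat, taken from the three worked examples.
examples = {
    "RANDOM": [3, 1, 5, 0, 2, 1, 1, 3, 3, 1],
    "UNIFORM": [2] * 10,
    "CLUSTERED": [0, 0, 0, 0, 10, 10, 0, 0, 0, 0],
}
for label, counts in examples.items():
    variance, mean, ratio = variance_mean_ratio(counts)
    print(f"{label:10s} variance={variance:.3f} mean={mean:.3f} ratio={ratio:.3f}")
```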
Weakness of Quadrat Analysis
• Quadrat size and orientation
– If the quadrats are too small, they may contain only a
couple of points. If they are too large, they may
contain too many points
• Some have suggested that quadrat size should be
twice the size of the mean area per point
• Or, test different sizes (or orientations) to determine the
effects of each test on the results
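One way to explore the size problem is to recount the same points with several quadrat sizes and compare the resulting variance-mean ratios. The sketch below is illustrative only: the function names, the 10 x 10 study area, and the randomly generated points are all assumptions, and the "rule" size simply takes a square quadrat whose area is twice the mean area per point.

```python
import math
import random


def suggested_quadrat_side(area, n_points):
    """Side of a square quadrat whose area is twice the mean area per point."""
    return math.sqrt(2.0 * area / n_points)


def quadrat_counts(points, width, height, side):
    """Count points in each square quadrat of the given side length."""
    cols = math.ceil(width / side)
    rows = math.ceil(height / side)
    counts = [0] * (rows * cols)
    for x, y in points:
        # Clamp so points on the far edge fall in the last quadrat.
        col = min(int(x // side), cols - 1)
        row = min(int(y // side), rows - 1)
        counts[row * cols + col] += 1
    return counts


def variance_mean_ratio(counts):
    n = len(counts)
    total = sum(counts)
    variance = (sum(c * c for c in counts) - total ** 2 / n) / (n - 1)
    return variance / (total / n)


# Hypothetical study area and synthetic points, for illustration only.
random.seed(0)
width, height = 10.0, 10.0
pts = [(random.uniform(0, width), random.uniform(0, height)) for _ in range(50)]

rule_side = suggested_quadrat_side(width * height, len(pts))
for side in (1.0, rule_side, 5.0):
    counts = quadrat_counts(pts, width, height, side)
    print(f"side={side:.2f} quadrats={len(counts)} "
          f"VMR={variance_mean_ratio(counts):.3f}")
```

The sides used here divide the study area evenly; with other sides, the outermost quadrats would extend past the boundary and bias the counts.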
Weakness of Quadrat Method
• Actually a measure of dispersion, and not
really pattern, because it is based
primarily on the density of points, and not
their arrangement in relation to one
another
• Results in a single measure for the entire
distribution, so variations within the
region are not recognized
Nearest-Neighbour Analysis
• Unlike quadrat analysis, nearest-neighbour analysis
uses distances between points as its basis.
• The mean of the distance observed
between each point and its nearest
neighbour is compared with the expected
mean distance that would occur if the
distribution were random
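The observed side of this comparison starts from each point's nearest-neighbour distance. A brute-force sketch is shown below; the function name and the four sample points are hypothetical, and a spatial index would be a better choice for large point sets.

```python
import math


def nearest_neighbour_distances(points):
    """For each point, return (index of its nearest neighbour, distance)."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    result = []
    for i, (xi, yi) in enumerate(points):
        best_j, best_d = None, math.inf
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue  # a point is not its own neighbour
            d = math.hypot(xi - xj, yi - yj)
            if d < best_d:
                best_j, best_d = j, d
        result.append((best_j, best_d))
    return result


# Hypothetical coordinates, for illustration only.
pts = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 0.0)]
nn = nearest_neighbour_distances(pts)
observed_mean = sum(d for _, d in nn) / len(nn)
print(nn, observed_mean)
```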
Worked example: nearest-neighbour distances for 10 points under three patterns
(NN = nearest neighbour, r = distance to nearest neighbour)

                 RANDOM            UNIFORM           CLUSTERED
Point            NN     r          NN     r          NN     r
1                2      1          3      2.2        2      0.1
2                3      0.1        4      2.2        3      0.1
3                2      0.1        4      2.2        2      0.1
4                5      1          5      2.2        5      0.1
5                4      1          7      2.2        4      0.1
6                5      2          7      2.2        5      0.1
7                6      2.7        8      2.2        6      0.1
8                10     1          9      2.2        9      0.1
9                10     1          10     2.2        10     0.1
10               9      1          9      2.2        9      0.1

Sum of r         10.9              22                1
Mean r           1.09              2.2               0.1
Area of region   50                50                50
Density          0.2               0.2               0.2
Expected mean    1.118034          1.118034          1.118034
R                0.9749256         1.9677398         0.0894427

$$\bar{r} = \frac{\sum r}{n}$$
$$d = \frac{n}{\text{area}}$$
$$\bar{r}(e) = \frac{0.5}{\sqrt{d}}$$
$$R = \frac{\bar{r}}{\bar{r}(e)}$$
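The R values in the worked example can be checked with a short sketch. The function name `nearest_neighbour_ratio` is hypothetical; the distances and the area of 50 come from the three example tables.

```python
import math


def nearest_neighbour_ratio(distances, area):
    """Return (observed mean, expected mean, R) for nearest-neighbour distances."""
    n = len(distances)
    if n == 0 or area <= 0:
        raise ValueError("need at least one distance and a positive area")
    observed = sum(distances) / n        # mean observed distance
    density = n / area                   # points per unit area
    expected = 0.5 / math.sqrt(density)  # expected mean under randomness
    return observed, expected, observed / expected


# Nearest-neighbour distances from the three worked examples.
examples = {
    "RANDOM": [1, 0.1, 0.1, 1, 1, 2, 2.7, 1, 1, 1],
    "UNIFORM": [2.2] * 10,
    "CLUSTERED": [0.1] * 10,
}
for label, distances in examples.items():
    observed, expected, ratio = nearest_neighbour_ratio(distances, area=50)
    print(f"{label:10s} observed={observed:.2f} "
          f"expected={expected:.6f} R={ratio:.4f}")
```

In these examples R is close to 1 for the random pattern, well above 1 for the uniform pattern, and close to 0 for the clustered pattern.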
Advantages of Nearest-Neighbour
over Quadrat Analysis
• No quadrat size problem to be concerned with
• Takes distance into account
• Problems
– Results depend on the size of the study-area boundary
– Must consider how to measure the boundary
• Arbitrary or some natural boundary
– May not account for points beyond an adjacent boundary