Transcript Document

Geographic Information Science
Geography 625: Intermediate Geographic Information Science
Week 4: Point Pattern Analysis
Instructor: Changshan Wu
Department of Geography
The University of Wisconsin-Milwaukee
Fall 2006
University of Wisconsin-Milwaukee
Geographic Information Science
Outline
1. Revisit IRP/CSR
2. First- and second-order effects
3. Introduction to point pattern analysis
4. Describing a point pattern
5. Density-based point pattern measures
6. Distance-based point pattern measures
7. Assessing point patterns statistically
1. Revisit IRP/CSR
Independent random process (IRP)
Complete spatial randomness (CSR)
1. Equal probability: any point has equal probability of being in any position or, equivalently, each small sub-area of the map has an equal chance of receiving a point.
2. Independence: the positioning of any point is independent of the positioning of any other point.
[Figure: example point pattern; both axes run from 0 to 10]
$P(k, n, x) = \binom{n}{k} \left(\frac{1}{x}\right)^{k} \left(\frac{x-1}{x}\right)^{n-k}$
$P(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$ and $\lambda = \frac{n}{x}$
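A minimal Python sketch of the two quadrat-count probabilities on this slide, assuming n events and x equal-sized quadrats. The function names and the values n = 10, x = 8 are illustrative and do not come from the slide.

```python
import math

def binomial_count_probability(k, n, x):
    """P(k, n, x): probability that a given quadrat (one of x) holds exactly k of n events."""
    p = 1.0 / x  # equal probability: each quadrat is equally likely to receive an event
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_count_probability(k, lam):
    """P(k): Poisson approximation with intensity lam = n / x events per quadrat."""
    return lam**k * math.exp(-lam) / math.factorial(k)

n_events, n_quadrats = 10, 8  # illustrative values only
lam = n_events / n_quadrats
for k in range(4):
    exact = binomial_count_probability(k, n_events, n_quadrats)
    approx = poisson_count_probability(k, lam)
    print(k, round(exact, 4), round(approx, 4))
```

The Poisson values approximate the binomial ones; the approximation is usually better when the number of quadrats is large.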
2. First- and second-order effects
IRP/CSR is not realistic
• The independent random process is mathematically elegant and forms a useful starting point for spatial analysis, but applying it directly is often naive and unrealistic.
• If real-world spatial patterns were indeed generated by unconstrained randomness, geography would have little meaning or interest, and most GIS operations would be pointless.
1. First-order effect
• The assumption of equal probability cannot be satisfied.
• The locations of disease cases tend to cluster in more densely populated areas.
• Plants tend to cluster in areas with favorable soils.
Figure source: http://www.crimereduction.gov.uk/toolkits/fa020203.htm
2. Second-order effect
• The assumption of independence cannot be satisfied.
• Newly developed residential areas tend to be near existing residential areas.
• McDonald's stores tend to be far away from each other.
In a point process the basic properties of the process are set by a
single parameter, the probability that any small area will receive
a point – the intensity of the process.
First-order stationary: no variation in its intensity over space.
Second-order stationary: no interaction between events.
3. Introduction to point pattern analysis
Point patterns, where the only data are the locations of a set of
point objects, represent the simplest possible spatial data.
Examples
1) Hot-spot analysis for crime locations
2) Disease analysis (patterns and environmental relations)
3) Freeway accident pattern analysis
Requirements for a set of events to constitute a point pattern
1) The pattern should be mapped on the plane (preferably in a way that preserves distances between points).
2) The study area should be determined objectively.
3) The pattern should be an enumeration or census of the entities of interest, not a sample.
4) There should be a one-to-one correspondence between objects in the study area and events in the pattern.
5) Event locations must be proper (they should not be, for example, the centroids of polygons).
4. Describing a Point Pattern
Point density (first-order or second-order?)
Point separation (first-order or second-order?)
When first-order effects are marked, absolute location is an important
determinant of observations, and in a point pattern clear variations across
space in the number of events per unit area are observed.
When second-order effects are strong, there is interaction between
locations, depending on the distance between them, and relative location is
important.
First-order or second-order?
A set of locations S with n events:
$S = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$, where $s_1 = (x_1, y_1)$
The study region A has an area a.
Mean center:
$\bar{s} = (\mu_x, \mu_y) = \left( \frac{\sum_{i=1}^{n} x_i}{n}, \frac{\sum_{i=1}^{n} y_i}{n} \right)$
Standard distance: a measure of how dispersed the events are around their mean center
$d = \sqrt{\frac{\sum_{i=1}^{n} \left[ (x_i - \mu_x)^2 + (y_i - \mu_y)^2 \right]}{n}}$
A summary circle can be plotted for the point pattern, centered at the mean center $\bar{s}$ with radius d.
If the standard distance is computed separately for each axis, a summary ellipse can be obtained.
[Figures: summary circle; summary ellipse]
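The mean center, the standard distance, and the per-axis standard distances behind the summary ellipse can be sketched in Python as follows. The function names and the four sample events are illustrative assumptions, not part of the lecture.

```python
import math

def mean_center(events):
    """Return (mu_x, mu_y), the mean of the x and y coordinates."""
    n = len(events)
    return (sum(x for x, _ in events) / n, sum(y for _, y in events) / n)

def standard_distance(events):
    """Root mean squared distance of the events from their mean center."""
    mu_x, mu_y = mean_center(events)
    total = sum((x - mu_x) ** 2 + (y - mu_y) ** 2 for x, y in events)
    return math.sqrt(total / len(events))

def axis_standard_distances(events):
    """Standard distance computed separately along x and y (summary ellipse axes)."""
    mu_x, mu_y = mean_center(events)
    n = len(events)
    d_x = math.sqrt(sum((x - mu_x) ** 2 for x, _ in events) / n)
    d_y = math.sqrt(sum((y - mu_y) ** 2 for _, y in events) / n)
    return d_x, d_y

events = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]  # corners of a square
print(mean_center(events), standard_distance(events), axis_standard_distances(events))
```

This sketch keeps the ellipse axes aligned with the coordinate axes; it does not rotate them to the direction of greatest spread.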
5. Density-based point pattern measures
Crude density / overall intensity
$\lambda = \frac{n}{a} = \frac{\text{no.}(S \in A)}{a}$
The crude density changes depending on the study area.
-Quadrat Count Methods
1. Exhaustive census of quadrats that completely fill the study region with no overlaps
• The choice of origin, quadrat orientation, and quadrat size affects the observed frequency distribution.
• If the quadrat size is too large, then?
• If the quadrat size is too small, then?
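A minimal Python sketch of an exhaustive quadrat census, assuming square quadrats on a regular grid aligned with the axes. The function names, the grid parameters, and the sample events are hypothetical.

```python
from collections import Counter

def quadrat_counts(events, x_min, y_min, cell_size, n_cols, n_rows):
    """Count the events falling in each cell of a regular grid of square quadrats."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    x_max = x_min + n_cols * cell_size
    y_max = y_min + n_rows * cell_size
    for x, y in events:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # event lies outside the study region
        # Events on the upper or right boundary are assigned to the last cell.
        col = min(int((x - x_min) // cell_size), n_cols - 1)
        row = min(int((y - y_min) // cell_size), n_rows - 1)
        counts[row][col] += 1
    return counts

def count_frequencies(counts):
    """Observed frequency distribution: how many quadrats hold k events."""
    return Counter(c for row in counts for c in row)

events = [(0.5, 0.5), (1.5, 0.5), (1.2, 0.7), (0.1, 1.9), (2.0, 2.0)]
counts = quadrat_counts(events, 0.0, 0.0, 1.0, 2, 2)
frequencies = count_frequencies(counts)
print(counts, dict(frequencies))
```

Rerunning the census with a different origin or cell size changes the frequency distribution, which is the sensitivity the slide points out.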
2. The random sampling approach is more frequently applied in fieldwork.
• It is possible to increase the sample size simply by adding more quadrats (for sparse patterns).
• It may describe a point pattern without having complete data on the entire pattern.
Other shapes of quadrats
-Density Estimation
The pattern has a density at any location in the study region, not just at locations where there is an event.
This density is estimated by counting the number of events in a region, or kernel, centered at the location where the estimate is to be made.
Simple density estimation:
$\lambda_p = \frac{\text{no.}[S \in C(p, r)]}{\pi r^2}$
C(p, r) is a circle of radius r centered at the location of interest p.
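The simple density estimate can be sketched in Python as below. The function name and sample data are illustrative, and events exactly on the circle boundary are counted as inside, which is an assumption the slide does not settle.

```python
import math

def simple_density(p, events, r):
    """Number of events within the circle C(p, r), divided by the circle's area."""
    px, py = p
    inside = sum(1 for x, y in events if math.hypot(x - px, y - py) <= r)
    return inside / (math.pi * r**2)

events = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
density = simple_density((0.0, 0.0), events, r=2.0)  # two events fall inside
print(density)
```

Evaluating this function on a grid of locations p gives a density surface for the whole study region.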
Bandwidth r
If r is too large, then ?
If r is too small, then?
Density transformation
1) Visualize a point pattern to detect hot spots.
2) Check whether or not the process is first-order stationary, based on the local intensity variations.
3) Link point objects to other geographic data (e.g., disease and pollution).
Kernel-density estimation (KDE)
Kernel functions: weight nearby
events more heavily than distant
ones in estimating the local
density
IDW
Spline
Kriging
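As one possible illustration of a kernel function, the sketch below uses the quartic (biweight) kernel, a common choice in kernel-density estimation. The slides do not name a specific kernel, so this choice, the function name, and the sample data are assumptions.

```python
import math

def quartic_kde(p, events, r):
    """Kernel-density estimate at p with a quartic kernel of bandwidth r.

    Each event within distance r contributes (3 / pi) * (1 - u^2)^2 / r^2,
    where u = distance / r, so nearby events weigh more than distant ones.
    """
    px, py = p
    total = 0.0
    for x, y in events:
        u = math.hypot(x - px, y - py) / r
        if u < 1.0:
            total += (3.0 / math.pi) * (1.0 - u * u) ** 2
    return total / (r * r)

events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
at_origin = quartic_kde((0.0, 0.0), events, r=2.0)
print(at_origin)
```

The 3 / pi factor makes each event's kernel integrate to one over the plane, so the estimated surface integrates to the number of events.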
6. Distance-based point pattern measures
Look at the distances between events in a point pattern
More direct description of the second-order properties
-Nearest-Neighbor Distance
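Euclidean distance:
$d(s_i, s_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
Nearest-neighbor distance for event $s_i$:
$d_{\min}(s_i) = \min_{j \in \{1, \ldots, n\},\, j \neq i} d(s_i, s_j)$
Mean nearest-neighbor distance:
$\bar{d}_{\min} = \frac{\sum_{i=1}^{n} d_{\min}(s_i)}{n}$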
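A brute-force Python sketch of the nearest-neighbor distance and its mean, comparing every pair of events. The function names and the four sample events are illustrative.

```python
import math

def nearest_neighbor_distances(events):
    """d_min(s_i) for every event: distance to the closest other event."""
    result = []
    for i, (xi, yi) in enumerate(events):
        d_min = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(events)
            if j != i
        )
        result.append(d_min)
    return result

def mean_nearest_neighbor_distance(events):
    """Average of d_min(s_i) over all n events."""
    distances = nearest_neighbor_distances(events)
    return sum(distances) / len(distances)

events = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0), (10.0, 0.0)]
print(nearest_neighbor_distances(events), mean_nearest_neighbor_distance(events))
```

The pairwise comparison costs on the order of n squared operations; large patterns would normally use a spatial index instead.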
If the pattern is clustered, does the mean nearest-neighbor distance $\bar{d}_{\min}$ have a higher or lower value?
$\bar{d}_{\min} = 21.62$
-Distance Functions: G function
$G(d) = \frac{\text{no.}[d_{\min}(s_i) < d]}{n}$
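The G function can be sketched in Python as the fraction of events whose nearest-neighbor distance is below d. The strict "less than" comparison is an assumption, since the operator is not legible in the slide; the function names and sample events are illustrative.

```python
import math

def nearest_neighbor_distances(events):
    """d_min(s_i) for every event: distance to the closest other event."""
    return [
        min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(events)
            if j != i
        )
        for i, (xi, yi) in enumerate(events)
    ]

def g_function(events, d):
    """G(d): proportion of events with nearest-neighbor distance below d."""
    d_mins = nearest_neighbor_distances(events)
    return sum(1 for value in d_mins if value < d) / len(d_mins)

events = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0), (10.0, 0.0)]
g_curve = [(d, g_function(events, d)) for d in (2.0, 3.5, 5.0, 8.0)]
print(g_curve)
```

Plotting G(d) against d gives the cumulative curve whose shape is read as clustered or evenly spaced.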
The shape of the G function can tell us how the events are spaced in a point pattern.
• If events are closely clustered together, G increases rapidly at short distances.
• If events tend to be evenly spaced, then G increases slowly up to the distance at which most events are spaced, and only then increases rapidly.
-Distance Functions: F function
Three steps
1) Randomly select m locations {p_1, p_2, …, p_m} in the study region.
2) Calculate d_min(p_i, S), the minimum distance from location p_i to any event in the point pattern S.
3) Calculate F(d):
$F(d) = \frac{\text{no.}[d_{\min}(p_i, S) < d]}{m}$
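The three steps can be sketched in Python as follows, assuming a rectangular study region and uniformly distributed random locations. The function names, the fixed random seed, and the sample events are illustrative assumptions.

```python
import math
import random

def random_locations(x_min, y_min, x_max, y_max, m, seed=0):
    """Step 1: draw m random locations uniformly within a rectangular region."""
    rng = random.Random(seed)
    return [(rng.uniform(x_min, x_max), rng.uniform(y_min, y_max)) for _ in range(m)]

def f_function(events, locations, d):
    """Steps 2 and 3: share of locations whose nearest event is closer than d."""
    hits = 0
    for px, py in locations:
        d_min = min(math.hypot(px - x, py - y) for x, y in events)
        if d_min < d:
            hits += 1
    return hits / len(locations)

events = [(2.0, 2.0), (2.5, 2.0), (8.0, 7.0)]
locations = random_locations(0.0, 0.0, 10.0, 10.0, m=500)
f_values = [f_function(events, locations, d) for d in (0.0, 1.0, 3.0, 15.0)]
print(f_values)
```

The same set of random locations is reused for every d, so the resulting curve is non-decreasing; a larger m gives a smoother curve.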


• For clustered events, the F function rises slowly at first but more rapidly at longer distances, because a good proportion of the study area is fairly empty.
• For evenly distributed events, the F function rises rapidly at first, then slowly at longer distances.
-Comparisons between G and F functions
$G(d) = \frac{\text{no.}[d_{\min}(s_i) < d]}{n}$
$F(d) = \frac{\text{no.}[d_{\min}(p_i, S) < d]}{m}$
-Distance Functions: K Function
The nearest-neighbor distance, and the G and F functions only
make use of the nearest neighbor for each event or point in a
pattern
This can be a major drawback, especially with clustered
patterns where nearest-neighbor distances are very short
relative to other distances in the pattern.
K functions (Ripley 1976) are based on all the distances
between events in S.
Four steps
1) For a particular event s_i, draw a circle centered at the event with a radius of d.
2) Count the number of other events within the circle: $\text{no.}[S \in C(s_i, d)]$
3) Calculate the mean count over all events: $\frac{1}{n} \sum_{i=1}^{n} \text{no.}[S \in C(s_i, d)]$
4) Divide this mean count by the overall study area event density.

$K(d) = \frac{\sum_{i=1}^{n} \text{no.}[S \in C(s_i, d)]}{n \lambda} = \frac{a}{n} \cdot \frac{1}{n} \sum_{i=1}^{n} \text{no.}[S \in C(s_i, d)]$
where $\lambda = \frac{n}{a}$ is the study area event density.
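A minimal Python sketch of the K function as defined here, with no edge correction. The function name, the study area value, and the sample events are illustrative; events exactly at distance d are counted as inside the circle, which is an assumption.

```python
import math

def k_function(events, d, area):
    """K(d): mean number of other events within distance d, divided by the density."""
    n = len(events)
    lam = n / area  # overall study area event density
    total = 0
    for i, (xi, yi) in enumerate(events):
        # Count the other events inside the circle C(s_i, d).
        total += sum(
            1
            for j, (xj, yj) in enumerate(events)
            if j != i and math.hypot(xi - xj, yi - yj) <= d
        )
    return total / (n * lam)

events = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0), (10.0, 0.0)]
k_value = k_function(events, d=4.5, area=100.0)
print(k_value)
```

Because every pair of events is examined, K uses all inter-event distances rather than only the nearest-neighbor ones.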
Clustered?
Evenly distributed?
-Edge effects
Edge effects arise from the fact
that events near the edge of the
study area tend to have higher
nearest-neighbor distances,
even though they might have
neighbors outside of the study
area that are closer than any
inside it.
7. Assessing Point Patterns Statistically
A clustered pattern is likely to have a peaky density pattern,
which will be evident in either the quadrat counts or in strong
peaks on a kernel-density estimated surface.
An evenly distributed pattern exhibits the opposite, an even
distribution of quadrat counts or a flat kernel-density estimated
surface and relatively long nearest-neighbor distances.
But how clustered? How dispersed?
-Quadrat Counts
Independent random process (IRP)
Complete spatial randomness (CSR)
$P(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$ and $\lambda = \frac{n}{x}$
Mean: $\mu = \lambda$
Variance: $\sigma^2 = \lambda$
The variance/mean ratio (VMR) is expected to be 1.0 if the distribution is Poisson.
What if mean > variance? What if mean < variance?
[Figure: example patterns A and B]
For a particular observed pattern:
Mean = number of events / number of quadrats
$\mu = \frac{n}{x}$
where n is the number of events and x is the number of quadrats.
[Figure: example patterns A and B]
Variance:
$\sigma^2 = \frac{\sum_{k=0}^{n} x_k (k - \mu)^2}{x}$
where $x_k$ is the number of quadrats containing k events.
k = 0: 2 × (0 − 1.25)² = 3.125
k = 1: 3 × (1 − 1.25)² = 0.1875
k = 2: 2 × (2 − 1.25)² = 1.125
k = 3: 1 × (3 − 1.25)² = 3.0625
$\sigma^2 = \frac{3.125 + 0.1875 + 1.125 + 3.0625}{8} = 0.9375$
VMR = Variance / Mean = 0.9375 / 1.25 = 0.75
Clustered?
Random?
Dispersed?
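The worked example can be checked with a short Python sketch. The list of quadrat counts encodes the frequencies used in the lecture (two quadrats with 0 events, three with 1, two with 2, one with 3); the order of the quadrats is arbitrary and the function name is illustrative.

```python
def variance_mean_ratio(counts):
    """Return (mean, variance, VMR) for a list of per-quadrat event counts."""
    x = len(counts)                # number of quadrats
    mean = sum(counts) / x         # events per quadrat
    variance = sum((k - mean) ** 2 for k in counts) / x
    return mean, variance, variance / mean

quadrat_counts = [0, 0, 1, 1, 1, 2, 2, 3]  # 10 events in 8 quadrats
mean, variance, vmr = variance_mean_ratio(quadrat_counts)
print(mean, variance, vmr)
```

The variance divides by the number of quadrats x, matching the formula in the lecture, rather than by x − 1.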
-Nearest Neighbor Distances
The expected value of the mean nearest-neighbor distance under IRP/CSR is
$E(d) = \frac{1}{2\sqrt{\lambda}}$
where $\lambda$ is the event density of the study area.
The ratio R of the observed mean nearest-neighbor distance to this value is used to assess the pattern:
$R = \frac{\bar{d}_{\min}}{E(d)}$
If R > 1, is the pattern dispersed? If R < 1, is it clustered?
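A small Python sketch of the ratio R, assuming the event density is estimated as the number of events divided by the study area. The function names and the numbers (100 events in an area of 10,000 square units, observed mean distance 2.5) are hypothetical.

```python
import math

def expected_mean_nn_distance(n, area):
    """E(d) = 1 / (2 * sqrt(lambda)) under IRP/CSR, with lambda = n / area."""
    lam = n / area
    return 1.0 / (2.0 * math.sqrt(lam))

def nearest_neighbor_ratio(observed_mean, n, area):
    """R: observed mean nearest-neighbor distance relative to its CSR expectation."""
    return observed_mean / expected_mean_nn_distance(n, area)

expected = expected_mean_nn_distance(n=100, area=10000.0)
ratio = nearest_neighbor_ratio(observed_mean=2.5, n=100, area=10000.0)
print(expected, ratio)
```

Here the observed mean distance is half the value expected under CSR, so R is below 1.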
-G and F Functions
Clustered
Evenly Spaced
-K Functions
Under IRP/CSR:
$E[K(d)] = \frac{\lambda \pi d^2}{\lambda} = \pi d^2$
$L(d) = \sqrt{\frac{K(d)}{\pi}} - d$
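The L transformation can be sketched in Python as below; the function name and the sample K values are illustrative. Under IRP/CSR, K(d) is expected to equal πd², so L(d) is expected to be zero.

```python
import math

def l_function(k_value, d):
    """L(d) = sqrt(K(d) / pi) - d, which is 0 when K(d) equals pi * d^2."""
    return math.sqrt(k_value / math.pi) - d

d = 2.0
l_under_csr = l_function(math.pi * d**2, d)  # K(d) at its CSR expectation
l_larger_k = l_function(25.0, d)             # a K(d) above the CSR expectation
print(l_under_csr, l_larger_k)
```

A K(d) above πd² means more events within distance d than CSR predicts, which shows up as a positive L(d).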