Spatial Analysis Variogram
Variance
Discrete random variable:
$$\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 P(x_i)$$
Continuous random variable:
$$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
where $\mu$ is the population mean, $P(x_i)$ is the probability mass function, and
$f(x)$ is the probability density function. In these formulae, the values that
a random variable takes are independent of each other. Independence is
an essential condition for almost all traditional statistical analysis.
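As a small illustration of the discrete formula, the sketch below computes the variance of a random variable from its probability mass function. The function name `discrete_variance` and the fair-die example are assumptions chosen for illustration; they are not part of the original slides.

```python
def discrete_variance(values, probs):
    """Variance of a discrete random variable given its values and P(x_i)."""
    # Population mean: mu = sum of x_i * P(x_i)
    mu = sum(x * p for x, p in zip(values, probs))
    # Variance: sum of (x_i - mu)^2 * P(x_i)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


# Hypothetical example: a fair six-sided die
faces = [1, 2, 3, 4, 5, 6]
pmf = [1 / 6] * 6
die_variance = discrete_variance(faces, pmf)  # equals 35/12
```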
Spatial Analysis
First Law of Geography
Everything is related to everything else, but things that are closer are more
related than those that are further apart.
Remotely sensed data provide continuous coverage of brightness values
for a certain spectral range. Each brightness value in an image is tied to
a location, and therefore the digital number (DN) values are autocorrelated.
Conventional statistics that require independence among individuals cannot be
applied to autocorrelated variables. A random variable that takes
a unique value at each point in space is called a regionalized
variable.
The statistical theory that stresses the spatial aspect of regionalized
variables is geostatistics.
Characteristics of Regionalized Variables
1. Localized: a regionalized variable represents local
information.
2. Continuity: the variable may show more or less steady
continuity in its spatial variation.
3. Anisotropy: there may exist a preferential direction along
which the variable does not change significantly, while
it varies rapidly along a cross-direction.
Semivariance
Definition:
$$\gamma(h) = \frac{1}{2} E\{[f(x) - f(x+h)]^2\}$$
Assumptions: stationary in increment.
Wide-sense stationary: $E\{f(x)\} = M$
Stationary in increment: $E\{f(x+h)\} = M + m_h$,
where $m_h$ depends only on $h$
Semivariance
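Sample values along a transect: 15, 16, 10, 11, 12, 14
$$\gamma(1) = \frac{1}{2 \times 5}\left[(15-16)^2 + (16-10)^2 + (10-11)^2 + (11-12)^2 + (12-14)^2\right]$$
$$\gamma(2) = \frac{1}{2 \times 4}\left[(15-10)^2 + (16-11)^2 + (10-12)^2 + (11-14)^2\right]$$
$$\gamma(3) = \frac{1}{2 \times 3}\left[(15-11)^2 + (16-12)^2 + (10-14)^2\right]$$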
Note: the number of sample pairs decreases as the lag
increases. As a rule of thumb, the maximum lag should be set to
1/2 or 2/3 of the total length.
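The worked example for the six-value transect can be reproduced with a short sketch. The function name `semivariance` is an assumption made for illustration; the estimator simply averages the squared differences of all pairs separated by the given lag and halves the result.

```python
def semivariance(values, lag):
    """Empirical semivariance of a 1-D transect at an integer lag."""
    if lag < 1 or lag >= len(values):
        raise ValueError("lag must be between 1 and len(values) - 1")
    # All pairs of samples separated by exactly `lag` positions
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    # Half the mean squared difference over the available pairs
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


transect = [15, 16, 10, 11, 12, 14]
gammas = {h: semivariance(transect, h) for h in (1, 2, 3)}
# gammas[1] = 43/10 = 4.3, gammas[2] = 63/8 = 7.875, gammas[3] = 48/6 = 8.0
```

The number of pairs drops from 5 to 4 to 3 as the lag grows from 1 to 3, which is why estimates at large lags are less reliable.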
How do you calculate a semivariance for an image?
How many pairs of data do we have for lag 1, 2, …?
What about a whole Landsat image?
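One possible answer to these questions, as a minimal sketch: treat the image as a grid and pair each pixel with the pixel `lag` positions away along a row or a column. The function name `image_semivariance`, the `direction` parameter, and the small 3 x 4 test image are assumptions for illustration only. For an image with M rows and N columns, a lag of h along the rows gives M(N - h) pairs, so a full satellite scene with thousands of rows and columns yields millions of pairs per lag.

```python
def image_semivariance(image, lag, direction="row"):
    """Return (semivariance, pair_count) for pixels `lag` apart along rows or columns."""
    n_rows, n_cols = len(image), len(image[0])
    # Offset between paired pixels: along a row or down a column
    d_row, d_col = (0, lag) if direction == "row" else (lag, 0)
    total, count = 0.0, 0
    for r in range(n_rows - d_row):
        for c in range(n_cols - d_col):
            diff = image[r][c] - image[r + d_row][c + d_col]
            total += diff * diff
            count += 1
    if count == 0:
        raise ValueError("lag is too large for this image")
    return total / (2 * count), count


# Hypothetical 3 x 4 image of brightness values
image = [
    [15, 16, 10, 11],
    [12, 14, 15, 16],
    [10, 11, 12, 14],
]
gamma_row, pairs_row = image_semivariance(image, 1, "row")     # 3 * (4 - 1) = 9 pairs
gamma_col, pairs_col = image_semivariance(image, 1, "column")  # (3 - 1) * 4 = 8 pairs
```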
Semivariogram
$$V_f(h) = \frac{1}{2} E\{[f(x) - f(x+h)]^2\}$$
[Figure: semivariogram plotting semivariance against lag, with the nugget, sill, and range marked]
Terms to describe a semivariogram
Support: area and shape of the surface represented by each sample
point. In remote sensing, this refers to pixel size.
Lag: distance between sampling points.
Sill: maximum level of semivariance.
Range: point on the horizontal axis where the semivariance reaches
its maximum (the sill).
Nugget effect: the semivariance at lag zero, obtained by extrapolating
the semivariance-lag relationship back to lag zero.
What can a semivariogram tell us?
Range: object size.
Sill: object cover. The sill is maximized at 50% cover and decreases
toward either higher or lower cover.
Shape: variation in object size.
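Semivariogram Models
Exponential:
$$\gamma(h) = c\left(1 - e^{-h/a}\right)$$
Spherical:
$$\gamma(h) = \begin{cases} c\left(\dfrac{3h}{2a} - \dfrac{h^3}{2a^3}\right), & h < a \\ c, & h \ge a \end{cases}$$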
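The two models can be written as plain functions, shown here as a sketch. The function and parameter names (`exponential_model`, `spherical_model`, `sill`, `a`) are assumptions for illustration, with `sill` playing the role of c and no nugget term included.

```python
import math


def exponential_model(h, sill, a):
    """Exponential semivariogram: sill * (1 - exp(-h / a))."""
    return sill * (1.0 - math.exp(-h / a))


def spherical_model(h, sill, a):
    """Spherical semivariogram: reaches the sill exactly at h = a."""
    if h >= a:
        return sill
    ratio = h / a
    return sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Hypothetical parameters: sill c = 10, distance parameter a = 5
sph_at_range = spherical_model(5, 10.0, 5.0)     # equals the sill
exp_at_3a = exponential_model(15, 10.0, 5.0)     # about 95% of the sill
```

The spherical model reaches the sill at h = a, whereas the exponential model only approaches it asymptotically, reaching about 95% of the sill at h = 3a.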
Semivariance of Remotely Sensed Data
Remote sensing measurements are spatial averages within a pixel,
so the variogram computed from them is a regularized variogram.
Regularization:
1. Reduces the sill
2. Increases the range
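The reduction in sill can be illustrated with a minimal sketch that averages a fine-resolution image over square blocks and compares the image variance before and after. The helper names `block_average` and `image_variance`, the 40 x 40 random test image, and the use of overall image variance as a stand-in for the sill are all assumptions for illustration; the sketch does not demonstrate the increase in range.

```python
import random


def block_average(image, factor):
    """Aggregate an image by averaging non-overlapping factor x factor blocks."""
    n_rows, n_cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / factor ** 2
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


def image_variance(image):
    """Population variance of all pixel values in the image."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    return sum((v - mean) ** 2 for v in pixels) / len(pixels)


random.seed(0)  # fixed seed so the hypothetical image is reproducible
fine = [[random.random() for _ in range(40)] for _ in range(40)]
coarse = block_average(fine, 4)  # each coarse pixel averages 16 fine pixels

var_fine = image_variance(fine)
var_coarse = image_variance(coarse)  # smaller than var_fine
```

Because the variance of block means can never exceed the variance of the original pixels, the coarse image always has the lower variance; how much lower depends on how the pixel size compares with the size of the objects in the scene.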
Rate of Decrease in Sill Reveals Object Size
Figure Source: Song and Woodcock, 2003 PE&RS Forthcoming.
A Simulated Image with Different Object Sizes
[Figure panels: tree crown diameter = 3 m at pixel sizes of 0.2 m and 10 m; tree crown diameter = 9 m at pixel sizes of 0.2 m and 10 m]
Forest Successional Stages and Image Spatial Characteristics
Canopy Structure and Image Spatial Characteristics
[Figure: tree crown diameter (m) versus image variance (pixel size = 4 x 4 m) for conifers and hardwoods; fitted line y = 0.0001x + 3.5760, R^2 = 0.6066]
[Figure: leaf area index versus R 23 for conifers and hardwoods; fitted line y = -3.7193x + 9.8396, R^2 = 0.4765]
Shannon Index (H)
H is one of several diversity indices used to measure diversity
in categorical data. It is simply the information entropy of the
distribution, treating species as symbols and their relative
population sizes as the probabilities.
$$H = -\sum_{i=1}^{S} p_i \ln p_i$$
S: the number of species (or classes in remote sensing)
p_i = n_i/N, where n_i is the number of individuals (pixels) of species i and N is the total number of individuals.
The advantage of this index is that it takes into account the number of species
and the evenness of the species. The index is increased either by having
additional unique species, or by having a greater species evenness.
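A minimal sketch of the index, assuming the input is a list of per-class counts (individuals per species, or pixels per class). The function name `shannon_index` and the example counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) from per-class counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Classes with zero count contribute nothing to the sum
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


# Hypothetical class counts with the same total and number of classes
even_cover = shannon_index([25, 25, 25, 25])    # equals ln(4)
uneven_cover = shannon_index([97, 1, 1, 1])     # lower, one class dominates
three_classes = shannon_index([25, 25, 50])     # fewer classes than even_cover
```

With four equally abundant classes the index equals ln 4, its maximum for S = 4; both removing a class and making the classes less even lower the value.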
Perimeter/Area Ratio, Patch Size Distribution
P/A: more fragmented landscapes have a larger P/A ratio.
[Figure: patch size distribution, frequency versus patch area]
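The effect of fragmentation on the P/A ratio can be sketched on a small binary raster. The function name `perimeter_area_ratio`, the convention of counting exposed cell edges as perimeter (with each cell having unit area), and both example grids are assumptions for illustration.

```python
def perimeter_area_ratio(grid):
    """P/A ratio of all cells equal to 1, counting exposed cell edges as perimeter."""
    n_rows, n_cols = len(grid), len(grid[0])
    area = 0
    perimeter = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != 1:
                continue
            area += 1
            # An edge is exposed if the neighbor is off the grid or not in the class
            for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + d_row, c + d_col
                outside = not (0 <= rr < n_rows and 0 <= cc < n_cols)
                if outside or grid[rr][cc] != 1:
                    perimeter += 1
    if area == 0:
        raise ValueError("grid contains no cells of the class")
    return perimeter / area


# Same class area (4 cells), arranged as one patch or as four isolated cells
compact = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
fragmented = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]
pa_compact = perimeter_area_ratio(compact)        # 8 edges / 4 cells
pa_fragmented = perimeter_area_ratio(fragmented)  # 16 edges / 4 cells
```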