Image Classification: Supervised Classification
Image Classification:
Supervised Methods
Lecture 8
Prepared by R. Lathrop 11/99
Updated 3/06
Readings:
ERDAS Field Guide 5th Ed. Ch 6:234-260
Where in the World?
Learning objectives
• Remote sensing science concepts
– Basic concept of supervised classification
– Major classification algorithms
– Hard vs Fuzzy Classification.
• Math Concepts
• Skills
– Training set selection: digitized polygon vs. seed pixel/region growing
– Training aids: plots of training data, statistical measures of separability
– Edit/evaluate signatures
– Applying classification algorithms
Supervised vs. Unsupervised
Approaches
• Supervised - image analyst "supervises" the
selection of spectral classes that represent
patterns or land cover features that the analyst
can recognize
Prior Decision
• Unsupervised - statistical "clustering"
algorithms used to select spectral classes
inherent to the data, more computer-automated
Posterior Decision
Supervised: Select training fields ---> Edit/evaluate signatures ---> Classify image ---> Evaluate classification
vs.
Unsupervised: Run clustering algorithm ---> Identify classes ---> Edit/evaluate signatures ---> Evaluate classification
Supervised vs. Unsupervised
Supervised Prior Decision: from Information classes in the
Image to Spectral Classes in Feature Space
[Figure: feature space plot; x-axis: Red, y-axis: NIR]
Unsupervised Posterior Decision: from Spectral Classes in
Feature Space to Information Classes in the Image
Training
• Training: the process of defining criteria by
which spectral patterns are recognized
• Spectral signature: result of training that defines
a training sample or cluster
– Parametric: based on statistical parameters that assume a normal distribution (e.g., mean, covariance matrix)
– Nonparametric: not based on statistics but on discrete objects (polygons) in feature space
Supervised Training Set Selection
• Objective - selecting a homogeneous (unimodal)
area for each apparent spectral class
• Digitize polygons - high degree of user control;
often results in an overestimate of spectral class
variability
• Seed pixel - region-growing technique to reduce
within-class variability; the analyst sets a
threshold of acceptable variance, the total # of
pixels, and adjacency criteria (horiz/vert, diagonal)
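The seed pixel idea can be sketched in a few lines of Python. This is only an illustration, not the ERDAS implementation: it assumes a single band stored as a nested list, compares each candidate against the running region mean, and the names `grow_region`, `max_distance`, and `max_pixels` are made up for the example.

```python
from collections import deque

def grow_region(image, seed, max_distance, max_pixels=1000, diagonal=False):
    """Grow a training region outward from a seed pixel.

    image: 2-D list of DN values (single band, for brevity).
    seed: (row, col) of the seed pixel.
    max_distance: largest allowed |DN - current region mean|.
    max_pixels: cap on the total number of pixels in the region.
    diagonal: if True use 8-neighbour adjacency, else 4-neighbour.
    """
    rows, cols = len(image), len(image[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    region = {seed}
    total = image[seed[0]][seed[1]]  # running sum of DNs in the region
    queue = deque([seed])
    while queue and len(region) < max_pixels:
        r, c = queue.popleft()
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in region:
                continue
            mean = total / len(region)
            if abs(image[nr][nc] - mean) <= max_distance and len(region) < max_pixels:
                region.add((nr, nc))
                total += image[nr][nc]
                queue.append((nr, nc))
    return region

# Hypothetical 3x3 image: a dark patch (DN ~10) next to bright pixels (DN ~50)
image = [
    [10, 11, 50],
    [12, 10, 52],
    [51, 49, 50],
]
region = grow_region(image, seed=(0, 0), max_distance=5)
```

Raising `max_distance` lets the region spill into spectrally different neighbours, which is the trade-off the spectral distance setting controls.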
ERDAS Area of Interest (AOI) tools
Seed pixel or region growing dialog
Region Growing: good for linear
features
Spectral Distance = 7
Spectral Distance = 10
Region Growing: good for spectrally
heterogeneous features
Spectral Distance = 5
Spectral Distance = 10
Supervised Training Set Selection
Whether using the
digitized polygon or
seed pixel technique,
the analyst should
select multiple training
sites to identify the
many possible spectral
classes in each
information class of
interest
Guided Clustering: hybrid
supervised/unsupervised approach
• Polygonal areas of known land cover type
are delineated as training sites
• ISODATA unsupervised clustering
performed on these training sites
• Clusters evaluated and then combined into a
single training set of spectral signatures
Training Stage
• Training set ---> training vector
• Training vector for each spectral class represents a sample in n-dimensional
measurement space where n = # of bands
For a given spectral class j:
$$X_j = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$$
X_1 = mean DN band 1
X_2 = mean DN band 2
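A minimal sketch of how a training vector (and the covariance matrix used later by the parametric classifiers) could be computed from the pixels of one training site, assuming NumPy is available. The function name `signature` and the sample DNs are hypothetical.

```python
import numpy as np

def signature(training_pixels):
    """Return (mean vector, covariance matrix) for one spectral class.

    training_pixels: array-like of shape (n_pixels, n_bands) holding the
    DNs of the pixels inside one training site.
    """
    pixels = np.asarray(training_pixels, dtype=float)
    mean_vector = pixels.mean(axis=0)          # one mean DN per band
    covariance = np.cov(pixels, rowvar=False)  # n_bands x n_bands, sample covariance
    return mean_vector, covariance

# Hypothetical two-band training sample: (band 1 DN, band 2 DN) per pixel
sample = [[40, 90], [42, 94], [44, 92], [46, 96]]
mean_vector, covariance = signature(sample)
```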
Classification Training Aids
• Goal: evaluate spectral class separability
• 1) Graphical plots of training data
- histograms
- coincident spectral plots
- scatter plots
• 2) Statistical measures of separability
- divergence
- Mahalanobis distance
• 3) Training Area Classification
• 4) Quick Alarm Classification
- parallelepiped
Parametric vs. Nonparametric
Distance Approaches
• Parametric - based on statistical parameters
assuming normal distribution of the clusters
e.g., mean, std dev., covariance
• Nonparametric - not based on "normal"
statistics, but on discrete objects and simple
spectral distance in feature space
Parametric Assumption: each spectral
class exhibits a unimodal normal
distribution
[Figure: bimodal histogram showing a mix of Class 1 and Class 2; x-axis: Digital Number (0 to 255), y-axis: # of pixels]
Training Aids
• Graphical portrayals
of training data
– histogram (check
for normality)
[Figure: example histograms labeled “good” and “bad”]
Training Aids
• Graphical portrayals
of training data
– coincident spectral
mean plots
Training Aids
• Scatter plots: each training set sample
constitutes an ellipse in feature space
• Provides 3 pieces of information
- location of ellipse: mean vector
- shape of ellipse: covariance
- orientation of ellipse: slope & sign of
covariance
• Need training vector and covariance matrix
[Figure: Spectral Feature Space; x-axis: Red Reflectance, y-axis: NIR Reflectance; class ellipses: Grass, Mix: grass/trees, Broadleaf Trees, Conifer, Impervious Surface & Bare Soil, Water]
Examine ellipses for gaps and overlaps.
Overlapping ellipses are acceptable within an information class;
limit overlap between information classes.
Training Aids
• Are some training sets redundant, or do they overlap too greatly?
• Statistical Measures of Separability:
expressions of statistical distance that are
sensitive to both mean and variance
- divergence
- Mahalanobis distance
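As one concrete case, the divergence between two signatures can be computed from their mean vectors and covariance matrices. The sketch below uses the standard divergence formula from the pattern recognition literature and assumes NumPy; the function name and example values are illustrative only.

```python
import numpy as np

def divergence(mean_i, cov_i, mean_j, cov_j):
    """Divergence between two class signatures.

    D_ij = 0.5 * tr[(C_i - C_j)(C_j^-1 - C_i^-1)]
         + 0.5 * tr[(C_i^-1 + C_j^-1)(M_i - M_j)(M_i - M_j)^T]
    The first term reacts to differences in covariance, the second to
    differences in the means.
    """
    mean_i, mean_j = np.asarray(mean_i, float), np.asarray(mean_j, float)
    cov_i, cov_j = np.asarray(cov_i, float), np.asarray(cov_j, float)
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    diff = (mean_i - mean_j).reshape(-1, 1)  # column vector
    term_cov = 0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i))
    term_mean = 0.5 * np.trace((inv_i + inv_j) @ diff @ diff.T)
    return float(term_cov + term_mean)

identity = np.eye(2)
same = divergence([0, 0], identity, [0, 0], identity)       # identical signatures
shifted = divergence([0, 0], identity, [3, 4], identity)    # means differ only
spread = divergence([0, 0], identity, [0, 0], 4 * identity) # covariances differ only
```

Larger values indicate better separability; two identical signatures give zero, which flags a redundant training set.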
Training Aids
• Training/Test Area classification: look for
misclassification between information
classes; training areas can be biased,
better to use independent test areas
• Quick alarm classification: on-screen
evaluation of all pixels that fall within
the training decision region (e.g.,
parallelepiped)
Classification Decision Process
• Decision Rule: mathematical algorithm that,
using data contained in the signature,
performs the actual sorting of pixels into
discrete classes
• Parametric vs. nonparametric rules
Parallelepiped or box classifier
• Decision region defined by the rectangular area
defined by the highest and lowest DN’s in each
band; specify by range (min/max) or std dev.
• Pro: Takes variance into account
• Con: Lacks sensitivity to covariance
• Pro: Computationally efficient, useful as first
pass
• Pro: Nonparametric
• Con: Decision regions may overlap; some pixels
may remain unclassified
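A minimal Python sketch of the box rule, assuming each class is described by per-band (low, high) limits. The class names and limits are hypothetical; in practice the limits would come from the training min/max or from mean ± some number of standard deviations.

```python
def parallelepiped_classify(pixel, boxes):
    """Return the names of every class whose box contains the pixel.

    pixel: sequence of DNs, one per band.
    boxes: dict mapping class name -> list of (low, high) limits per band.
    An empty list means the pixel is unclassified; more than one name
    means the pixel lies in an overlap region.
    """
    return [
        name
        for name, limits in boxes.items()
        if all(low <= dn <= high for dn, (low, high) in zip(pixel, limits))
    ]

# Hypothetical limits: [(red low, red high), (NIR low, NIR high)]
boxes = {
    "water":  [(10, 30), (5, 20)],
    "forest": [(20, 45), (60, 110)],
    "soil":   [(40, 90), (50, 100)],
}
```

Returning a list rather than a single label makes both failure modes from the Con bullet visible: an empty list (unclassified) and a list with several names (overlap).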
[Figure: Parallelepiped or Box Classifier in Spectral Feature Space; x-axis: Red Reflectance, y-axis: NIR Reflectance]
Upper and lower limit of each box set by either range (min/max) or # of standard deviations.
Note overlap in Red but not NIR band.
Parallelepipeds have “corners”
[Figure: parallelepiped boundary drawn around a signature ellipse, with a candidate pixel and the band means μ_NIR and μ_red marked; x-axis: Red reflectance, y-axis: NIR reflectance. Adapted from ERDAS Field Guide]
Parallelepiped or Box Classifier:
problems
[Figure: class boxes for Veg 1, Veg 2, Veg 3, Soil 1, Soil 2, Soil 3, Water 1, Water 2; annotations mark unclassified pixels, an overlap region, and a misclassified pixel; x-axis: Red reflectance, y-axis: NIR reflectance. Adapted from Lillesand & Kiefer, 1994]
Minimum distance to means
• Compute the mean of each desired class and
then classify unknown pixels into the class with
the closest mean using simple Euclidean
distance
• Con: insensitive to variance & covariance
• Pro: computationally efficient
• Pro: all pixels classified, can use
thresholding to eliminate pixels far from
means
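The rule above can be sketched in plain Python. The class means below are hypothetical, and the optional `max_distance` argument is one simple way to express the thresholding mentioned in the last bullet.

```python
import math

def minimum_distance_classify(pixel, class_means, max_distance=None):
    """Assign the pixel to the class with the nearest mean vector.

    pixel: sequence of DNs, one per band.
    class_means: dict mapping class name -> mean vector.
    max_distance: optional threshold; pixels farther than this from
    every mean are returned as None (unclassified).
    Returns (class name or None, Euclidean distance to the nearest mean).
    """
    best_class, best_distance = None, math.inf
    for name, mean in class_means.items():
        distance = math.dist(pixel, mean)  # Euclidean spectral distance
        if distance < best_distance:
            best_class, best_distance = name, distance
    if max_distance is not None and best_distance > max_distance:
        return None, best_distance
    return best_class, best_distance

# Hypothetical two-band class means
class_means = {"veg": (92, 153), "soil": (150, 60)}
label, distance = minimum_distance_classify((180, 85), class_means)
```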
Minimum Distance to Means Classifier
[Figure: class means for Veg 1, Veg 2, Veg 3, Soil 1, Soil 2, Soil 3, Water 1, Water 2; x-axis: Red reflectance, y-axis: NIR reflectance. Adapted from Lillesand & Kiefer, 1994]
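Minimum Distance to Means Classifier:
Euclidean Spectral Distance
Example for the points (92, 153) and (180, 85):
Xd = 180 - 92 = 88
Yd = 85 - 153 = -68
$$\text{Distance} = \sqrt{X_d^2 + Y_d^2} = \sqrt{88^2 + (-68)^2} \approx 111.2$$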
Feature Space Classification
• Image analyst draws in decision regions
directly on the feature space image using AOI
tools - often useful for a first-pass broad
classification
• Pixels that fall within a user-defined feature
space class are assigned to that class
• Pro: Good for classes with a non-normal
distribution
• Con: Potential problem with overlap and
unclassified pixels
[Figure: Feature Space Classifier in Spectral Feature Space; x-axis: Red Reflectance, y-axis: NIR Reflectance]
Analyst draws decision regions in feature space
Statistically-based classifiers
• Defines a probability density (statistical) surface
• Each pixel is evaluated for its statistical
probability of belonging in each category,
assigned to class with maximum probability
• The probability density function for each
spectral class can be completely described by
the mean vector and covariance matrix
Parametric Assumption: each spectral
class exhibits a unimodal normal
distribution
[Figure: bimodal histogram showing a mix of Class 1 and Class 2; x-axis: Digital Number (0 to 255), y-axis: # of pixels]
[Figure: 2d vs. 1d views of class overlap for classes wi and wj; 2d view: Band 1 vs. Band 2; 1d view: # of pixels vs. Digital Number (0 to 255) in Band 1]
Probabilities used in likelihood ratio
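[Figure: overlapping histograms of classes wi and wj; x-axis: Digital Number (0 to 255), y-axis: # of pixels; brackets mark p (x | wi) and p (x | wj) at a given Digital Number]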
[Figure: Spectral classes as probability surfaces in Spectral Feature Space; x-axis: Red Reflectance, y-axis: NIR Reflectance]
Ellipses defined by class mean and covariance; creates likelihood contours around each spectral class.
[Figure: Sensitive to large covariance values; Spectral Feature Space; x-axis: Red Reflectance, y-axis: NIR Reflectance]
Some classes may have large variance and greatly overlap other spectral classes.
Mahalanobis Distance Classifier
$$D = (X - M_c)^T \, \mathrm{COV}_c^{-1} \, (X - M_c)$$
D = Mahalanobis distance
c = particular class
X = measurement vector of the candidate pixel
M_c = mean vector of class c
COV_c = covariance matrix of class c
COV_c^{-1} = inverse of the covariance matrix
T = transposition
Pro: takes the variability of the classes into account with
info from COV matrix
Similar to maximum likelihood but without the weighting
factors
Con: parametric, therefore sensitive to large variances
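The distance formula above translates almost directly into NumPy. The sketch below is illustrative only; the two signatures ("tight" and "broad") are made-up values chosen to show the sensitivity to large variances noted in the Con.

```python
import numpy as np

def mahalanobis_distance(x, mean_c, cov_c):
    """D = (X - Mc)^T COVc^-1 (X - Mc) for one candidate pixel and one class."""
    diff = np.asarray(x, float) - np.asarray(mean_c, float)
    return float(diff @ np.linalg.inv(np.asarray(cov_c, float)) @ diff)

def mahalanobis_classify(x, signatures):
    """Assign the pixel to the class with the smallest D.

    signatures: dict mapping class name -> (mean vector, covariance matrix).
    Returns (class name, dict of D per class).
    """
    distances = {
        name: mahalanobis_distance(x, mean_c, cov_c)
        for name, (mean_c, cov_c) in signatures.items()
    }
    return min(distances, key=distances.get), distances

# Hypothetical signatures: one compact class, one with large variance
signatures = {
    "tight": ([50, 50], [[4, 0], [0, 4]]),
    "broad": ([70, 70], [[100, 0], [0, 100]]),
}
label, distances = mahalanobis_classify([60, 60], signatures)
```

The pixel (60, 60) is equally far from both means in Euclidean terms, yet it goes to the "broad" class because that class's large variance shrinks its D.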
Maximum likelihood classifier
• Pro: potentially the most accurate classifier
as it incorporates the most information
(mean vector and COV matrix)
• Con: Parametric procedure that assumes the
spectral classes are normally distributed
• Con: sensitive to large values in the
covariance matrix
• Con: computationally intensive
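A minimal sketch of a maximum likelihood rule, assuming multivariate normal classes with equal prior probabilities and NumPy available. Each pixel is scored with the log of the Gaussian density built from the class mean vector and covariance matrix; the signatures shown are hypothetical.

```python
import numpy as np

def log_likelihood(x, mean_c, cov_c):
    """Log of the multivariate normal density of pixel x under class c."""
    x, mean_c, cov_c = (np.asarray(a, float) for a in (x, mean_c, cov_c))
    diff = x - mean_c
    n_bands = diff.size
    _, log_det = np.linalg.slogdet(cov_c)        # ln |COVc|
    maha = diff @ np.linalg.solve(cov_c, diff)   # (X-Mc)^T COVc^-1 (X-Mc)
    return float(-0.5 * (n_bands * np.log(2 * np.pi) + log_det + maha))

def maximum_likelihood_classify(x, signatures):
    """Assign the pixel to the class with the highest log-likelihood.

    signatures: dict mapping class name -> (mean vector, covariance matrix).
    """
    scores = {
        name: log_likelihood(x, mean_c, cov_c)
        for name, (mean_c, cov_c) in signatures.items()
    }
    return max(scores, key=scores.get)

# Hypothetical signatures: one compact class, one with large variance
signatures = {
    "tight": ([50, 50], [[4, 0], [0, 4]]),
    "broad": ([70, 70], [[100, 0], [0, 100]]),
}
```

The ln |COVc| term penalizes classes with large covariance, so a pixel close to the compact class is not automatically absorbed by a broad neighbour.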
Bayes Optimal approach
• Designed to minimize the average (expected)
cost of misclassifying in maximum likelihood
approach
• Uses an a priori (prior probability) term to
weight decisions - weights more heavily
towards common classes
• Example: prior probability suggests that 60% of
the pixels are forest; the classifier therefore
weights more heavily towards forest in
borderline cases
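The effect of the prior term can be seen in a one-band toy example. This sketch assumes normally distributed classes and multiplies each class likelihood by its prior; the class means, standard deviations, and priors are made up for illustration.

```python
import math

def gaussian_pdf(x, mean, std):
    """Normal probability density at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def bayes_classify(x, classes):
    """Pick the class with the largest prior * likelihood.

    classes: dict mapping class name -> (mean, std, prior probability).
    """
    scores = {
        name: prior * gaussian_pdf(x, mean, std)
        for name, (mean, std, prior) in classes.items()
    }
    return max(scores, key=scores.get)

# Hypothetical single-band classes with equal vs. forest-weighted priors
equal_priors = {"forest": (40, 10, 0.5), "other": (60, 10, 0.5)}
forest_heavy = {"forest": (40, 10, 0.6), "other": (60, 10, 0.4)}

borderline_dn = 51  # slightly closer to the "other" mean
without_prior = bayes_classify(borderline_dn, equal_priors)
with_prior = bayes_classify(borderline_dn, forest_heavy)
```

With equal priors the borderline pixel goes to the nearer class; with a 60% forest prior the same pixel is pulled to forest, while pixels well inside the other class are unaffected.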
Hybrid classification
• Can easily mix various classification algorithms in a
multi-step process
• First pass: some non-parametric rule (feature space or
parallelepiped) to handle the most obvious cases; those
pixels remaining unclassified or in overlap regions fall
to the second pass
• Second pass: some parametric rule to handle the
difficult cases; the training data can be derived from
unsupervised or supervised techniques
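A two-pass scheme of this kind might look as follows. This is a sketch under stated assumptions: the first pass is a box rule, and minimum distance to means stands in for the second-pass rule purely to keep the example short (a maximum likelihood rule could be swapped in). Restricting the second pass to the overlapping classes is a design choice of this example, and all names and values are hypothetical.

```python
import math

def hybrid_classify(pixel, boxes, class_means):
    """Two-pass classification of a single pixel.

    boxes: dict class name -> list of (low, high) limits per band.
    class_means: dict class name -> mean vector.
    Returns (class name, which pass made the decision).
    """
    # First pass: nonparametric box rule handles the obvious cases.
    hits = [
        name
        for name, limits in boxes.items()
        if all(low <= dn <= high for dn, (low, high) in zip(pixel, limits))
    ]
    if len(hits) == 1:
        return hits[0], "first pass"

    # Second pass: overlap pixels are tested against the overlapping
    # classes only; unclassified pixels are tested against every class.
    candidates = hits or list(class_means)
    best = min(candidates, key=lambda name: math.dist(pixel, class_means[name]))
    return best, "second pass"

# Hypothetical two-band signatures: [(red limits), (NIR limits)] and means
boxes = {
    "water":  [(10, 30), (5, 20)],
    "forest": [(20, 45), (60, 110)],
    "soil":   [(40, 90), (50, 100)],
}
class_means = {"water": (20, 12), "forest": (32, 85), "soil": (65, 75)}
```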
Thresholding
• Statistically-based
classifiers do poorest near
the tails of the training
sample data distributions
• Thresholds can be used to
define those pixels that
have a higher probability of
misclassification; these
pixels can be excluded and
labeled unclassified or
retrained using a cluster-busting type of approach
Thresholding: define those pixels that
have a higher probability of
misclassification
[Figure: histograms of Class 1 and Class 2; x-axis: Digital Number (0 to 255), y-axis: # of pixels; thresholds mark the unclassified regions]
Thresholding
• Chi-square distribution used to help define a one-tailed threshold
[Figure: chi-square distribution; x-axis: Chi Square, y-axis: # of pixels; values above the threshold will remain unclassified]
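For normally distributed classes, the Mahalanobis-type distance of a pixel from its class mean follows a chi-square distribution with as many degrees of freedom as there are bands. The sketch below assumes only two bands, because the chi-square cutoff then has a closed form; for more bands a statistics library or table would be needed. Function names are illustrative.

```python
import math

def chi_square_threshold_2_bands(confidence):
    """One-tailed chi-square cutoff for 2 degrees of freedom (two bands).

    For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
    so the cutoff that keeps `confidence` of the class is -2 ln(1 - confidence).
    """
    return -2.0 * math.log(1.0 - confidence)

def apply_threshold(label, distance, cutoff):
    """Keep the label only if the pixel's distance is within the cutoff."""
    return label if distance <= cutoff else "unclassified"

cutoff = chi_square_threshold_2_bands(0.95)  # keep the central 95% of each class
```

A pixel labeled "forest" with a distance of 2.0 keeps its label, while one with a distance of 12.0 falls in the tail and is set to unclassified.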
Hard vs. Fuzzy Classification
Rules
• Hard - “binary” either/or situation: a pixel
belongs to one & only one class
• Fuzzy - soft boundaries, a pixel can have
partial membership to more than one class
Hard vs. Fuzzy Classification
[Figure: Hard Classification vs. Fuzzy Classification of Water, Forested Wetland, and Forest. Adapted from Jensen, 2nd ed. 1996]
Hard vs. Fuzzy Classification
[Figure: hard decision boundaries separating Forest, Forested Wetland, and Water; x-axis: NIR reflectance, y-axis: MIR reflectance. Adapted from Jensen, 2nd ed. 1996]
Fuzzy Classification: In ERDAS
• Fuzzy Classification: in the Supervised Classification option, the analyst can choose Fuzzy Classification and then choose the number of “best classes” per pixel.
• This will create multiple output classification layers, as many as the number of best classes chosen above.
Fuzzy Classification: In ERDAS
• Fuzzy Convolution: calculates the total weighted inverse distance of all the classes in a window of pixels and assigns the center pixel the class with the largest total inverse distance summed over the entire set of fuzzy classification layers.
• This has the effect of creating a context-based classification.
• Classes with a very small distance value will remain unchanged, while classes with higher distance values may change to a neighboring value if there are a sufficient number of neighboring pixels with that class value and small corresponding distance values.
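The fuzzy convolution idea can be sketched for a single output pixel as follows. This is an illustration of the description above, not the ERDAS code: the layers are nested lists, window cells that fall off the image are skipped, and the small floor on the distance (to avoid division by zero) is an assumption of this example.

```python
from collections import defaultdict

def fuzzy_convolve_pixel(class_layers, distance_layers, row, col, weights):
    """Return the class with the largest total weighted inverse distance.

    class_layers: list of 2-D lists, the "best classes" layers.
    distance_layers: matching list of 2-D lists of distance values.
    row, col: position of the center pixel.
    weights: square 2-D list of window weights (e.g. 3x3).
    """
    size = len(weights)
    half = size // 2
    n_rows, n_cols = len(class_layers[0]), len(class_layers[0][0])
    totals = defaultdict(float)  # class value -> summed weight / distance
    for i in range(size):
        for j in range(size):
            r, c = row + i - half, col + j - half
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                continue  # window cell falls outside the image
            for classes, distances in zip(class_layers, distance_layers):
                # Floor of 1e-6 avoids dividing by a zero distance (assumption)
                totals[classes[r][c]] += weights[i][j] / max(distances[r][c], 1e-6)
    return max(totals, key=totals.get)

# Hypothetical single fuzzy layer: class 2 at the center, class 1 around it
classes = [[[1, 1, 1], [1, 2, 1], [1, 1, 1]]]
weak_center = [[[1, 1, 1], [1, 50, 1], [1, 1, 1]]]      # center far from its class
strong_center = [[[1, 1, 1], [1, 0.01, 1], [1, 1, 1]]]  # center very close to its class
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

A center pixel with a large distance is relabeled to the surrounding class, while one with a very small distance keeps its own class.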
Main points of the lecture
• Training:
– Training set selection: digitized polygon vs. seed pixel/region growing
– Training aids: plots of training data, statistical measures of separability
– Edit/evaluate signatures
• Classification algorithms:
– box classifier
– minimum distance to means classifier
– feature space classifier
– statistically-based classifiers (maximum likelihood classifier, Mahalanobis distance classifier)
• Hybrid classification: statistical + threshold method
• Hard vs. fuzzy classification
Homework
1 Homework: Unsupervised classification
(Hand in your Excel file and figure process);
2 Reading Textbook Ch. 9:337-389;
3 Reading Field Guide Ch. 7:226-231, 235-253.