Supervised Classifications

Download Report

Transcript Supervised Classifications

Map of the Great Divide Basin, Wyoming, created using a
neural network and used to find likely fossil beds
See: http://earthobservatory.nasa.gov/IOTD/view.php?id=79412&src=eoa-iotd
Supervised Classifications &
Miscellaneous Classification
Techniques
Using training data to classify digital
imagery
How does supervised classification work?
• Associate areas on the image with informational
classes from the field (training data!)
• Generate statistics to describe the spectral
characteristics of each informational class in terms of
the satellite bands and/or enhancements
• Assign unknown pixels to classes based on similarity
to the statistical description of the training class.
Supervised vs. Unsupervised Classifications
Unsupervised
Supervised
Choose Bands,
enhancements, etc.
Collect training data
for your map area
Cluster pixels into
spectral classes
Choose bands,
enhancements, etc.
Label clusters
corresponding to
informational classes
Assign pixels to
most similar
informational class
Evaluate result
Evaluate result
Supervised Classification Process
Landsat image near Riverton, WY
A = sagebrush
B = water
B
A
C
C = agriculture
D = riparian vegetation
D
Selecting Training Data
• Selection of training data is the most important part
of a supervised classification!
• Training data must:
– Represent all of the classes that you want to map
– Represent the spectral variability within classes
• (Can split informational classes for classification purposes)
– Be carefully selected based on field work and examination
of the image
– Be modified iteratively if necessary to improve your
classification
Training Data
Training Site Selection
Resulting Classification
Supervised Classification Algorithms
• There are many techniques for assigning pixels
to training (informational) classes. Common
methods include:
• Parallelpiped
• Minimum Distance
• Maximum Likelihood
Parallelpiped Classifier
• Determine the range of DNs for each class in
each band
• Use these DN ranges to define multidimensional “boxes” (parallelpipeds) in
feature space
• If a pixel falls within a box, it is assumed to
belong to that class
• If a pixel falls outside of all boxes, it is not
classified
Parallelpiped “boxes” in 3D feature space
Pros/Cons of Parallelpiped Classifier
• Does NOT assign every pixel to a class. Only
the pixels that fall within ranges.
• Good for helping decide if you need additional
classes (if there are unclassified pixels)
• Good for helping decide if you need additional
predictor variables (spectral or ancillary)
• Problems when class ranges overlap—must
develop rules to deal with overlap areas, or
refine the training data.
Minimum Distance Classifier
• Assigns pixels to the class they are closest to in feature
space in terms of Euclidean distance.
• Calculate the average DN for each class across all bands
(= the class centroid)
• Calculate the Euclidean distance from each pixel to each
centroid in feature space
• Assign each pixel to the class with the closest centroid
Use Pythagorean theorem to calculate distance
(in terms of DNs) to each centroid. Assign
unknown pixel to closest centroid.
Class 1 centroid
Unknown pixel
Band Y
Class 2 centroid
Class 3 centroid
Band X
Pros/Cons of Minimum Distance Classifier
• Classifies every pixel in the image (regardless
of probability that it is really in a class)
• Does not explicitly consider the variability
(variance) within classes
• Works well for some images and not as well
for others
Maximum Likelihood Classifier
• Assigns unknown pixels to the class that it has
the highest probability of belonging to
– Based on how many standard deviations the pixel
is from the class centroid (variance of the class)
• Should use with normally distributed data
(bell-shaped histograms) but we are often
permissive about deviation from this
Pros/Cons of Maximum Likelihood
Classifier
• Classifies every pixel in the image
• Recognizes that some classes have lots of
spectral variability and are more likely to
include pixels that are “far” from the class
centroid
• Image data are not always normally
distributed
• Often, but not always, a better choice than
minimum distance classifiers
Supervised Classification--Summary
• Supervised classification uses knowledge of the
locations of informational classes to group pixels
• Requires close attention to development of training
data
• Typically results in better maps than unsupervised
classifications IF you have good training data.
• Requires more work (time/money) than
unsupervised classifications
Other Classification Algorithms
• Fuzzy classifiers
• Decision trees
• Classification and Regression Trees
(CART)
• Object-based image classification
• Many others – neural networks, expert
systems, etc.
• Some of these are covered in Advanced
RS (BOT/GEOG 4211/5211)
Fuzzy Classifications
• Recognize that in the real world, distinctions
between classes are often not distinct
• Assigns the same place on the ground to all
classes (with a membership probability)
Decision Trees
• Similar to a dichotomous key that you might
use to key out plants or animals
• Can combine many types of data to create
classes (e.g., spectral data, elevation data, soil
maps, etc.)
• Can be built “by hand” or using statistical
techniques
• Includes CARTs that build trees based on
statistics
Image Classification Philosophy
• What is it that makes one feature of interest
different from another?
• Can you capture that difference accurately
with spatially distributed data?
• How can you best exploit the differences
between features statistically or otherwise?
• Remember that we must often go beyond just
using spectral data!