Image Classification
Chapter 5
Image Classification
Analysis and applications of remote
sensing imagery
Instructor: Dr. Cheng-Chien Liu
Department of Earth Sciences
National Cheng Kung University
Last updated: 24 May 2005
Introduction
Overall objective of classification
• Automatically categorize all pixels in an image into land cover
classes or themes
Three types of pattern recognition
• Spectral pattern recognition (emphasized in this chapter)
• Spatial pattern recognition
• Temporal pattern recognition
Selection of classification
• No single “right” approach
• Depends on
The nature of the data being analyzed
The computational resources available
The intended application of the classified data
Supervised classification
Fig 7.37
• A hypothetical example
Five bands: B, G, R, NIR, TIR,
Six land cover types: water, sand, forest, urban, corn, hay
Three basic steps (Fig 7.38)
• Training stage
• Classification stage
• Output stage
Supervised classification (cont.)
Classification stage
• Fig 7.39
Pixel observations from selected training sites plotted on scatter
diagram
Two bands are used for demonstration; the approach applies to any number of bands
Clouds of points → multidimensional descriptions of the spectral
response patterns of each category of cover type to be interpreted
• Minimum-Distance-to-Mean classifier
Fig 7.40
Mean vector for each category
Pt 1 → Corn
Pt 2 → Sand ?!!
Advantage: mathematically simple and computationally efficient
Disadvantage: insensitive to different degrees of variance in the spectral
response data
Not widely used if the spectral classes are close to one another in the
measurement space and have high variance
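A minimal sketch of the minimum-distance-to-mean rule, using two bands only. The class means, the function name and the optional distance threshold are hypothetical values chosen for illustration, not figures from Fig 7.40.

```python
import math

# Hypothetical class mean vectors in two bands (digital numbers).
CLASS_MEANS = {
    "water": (20.0, 10.0),
    "sand": (90.0, 80.0),
    "corn": (40.0, 70.0),
}

def classify_min_distance(pixel, class_means, max_distance=None):
    """Return the class whose mean vector is closest (Euclidean) to the pixel.

    If max_distance is given and even the nearest mean is farther away,
    the pixel is labeled "unknown".
    """
    best_class, best_dist = None, math.inf
    for name, mean in class_means.items():
        d = math.dist(pixel, mean)
        if d < best_dist:
            best_class, best_dist = name, d
    if max_distance is not None and best_dist > max_distance:
        return "unknown"
    return best_class

print(classify_min_distance((42, 65), CLASS_MEANS))  # corn
```

Only the class means enter the decision, which is why the rule cannot account for classes with different variances.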
Supervised classification (cont.)
Classification stage (cont.)
• Parallelepiped classifier
Fig 7.41
Range for each category
Pt 1 → Hay ?!!
Pt 2 → Urban
Advantage: mathematically simple and computationally
efficient
Disadvantage: categories with high correlation (covariance) between bands are
poorly described by the rectangular decision regions, which causes confusion
Positive covariance: Corn, Hay, Forest
Negative covariance: Water
Alleviated by using stepped decision region boundaries (Fig
7.42)
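A minimal sketch of the parallelepiped rule in two bands. The per-band ranges and names below are hypothetical, not the values of Fig 7.41. The function returns every class whose box contains the pixel, so overlapping boxes show up as more than one match.

```python
# Hypothetical (min, max) digital-number ranges per band for each class.
CLASS_RANGES = {
    "water": [(5, 30), (0, 20)],
    "hay": [(50, 80), (60, 100)],
    "urban": [(60, 110), (30, 70)],
}

def classify_parallelepiped(pixel, class_ranges):
    """Return all classes whose range contains the pixel in every band."""
    matches = []
    for name, ranges in class_ranges.items():
        if all(lo <= value <= hi for value, (lo, hi) in zip(pixel, ranges)):
            matches.append(name)
    return matches

print(classify_parallelepiped((10, 10), CLASS_RANGES))    # ['water']
print(classify_parallelepiped((70, 65), CLASS_RANGES))    # ['hay', 'urban']
print(classify_parallelepiped((200, 200), CLASS_RANGES))  # []
```

The second pixel falls inside two boxes and the third inside none; a real classifier needs an extra rule for both cases.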
Supervised classification (cont.)
Classification stage (cont.)
• Gaussian maximum likelihood classifier
Assumption: the cloud of points for each category follows a Gaussian (normal)
distribution
Probability density functions → described by the mean vector and covariance matrix
(Fig. 7.43)
Fig 7.44: Ellipsoidal equiprobability contours
Bayesian classifier
A priori probability (anticipated likelihood of occurrence)
Two weighting factors
If suitable data exist for these factors, the Bayesian implementation of the classifier is
preferable
Disadvantage: computationally intensive
Look-up table approach
Reduce the dimensionality (principal or canonical components transform)
Simplify the classification computation by separating certain classes a priori
e.g. water is easier to separate by use of the NIR/Red ratio
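A minimal sketch of the Gaussian maximum likelihood rule with optional a priori probabilities, assuming NumPy is available. The two-band statistics in STATS, the function names and the prior values are hypothetical. The example pixel is closer (in Euclidean terms) to the corn mean, yet it is assigned to sand because the sand class has a much larger variance.

```python
import numpy as np

def gaussian_log_likelihood(pixel, mean, cov, prior=1.0):
    """Log of prior times the multivariate normal density at `pixel`."""
    diff = np.asarray(pixel, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    mahal_sq = diff @ np.linalg.solve(cov, diff)
    n_bands = diff.size
    return np.log(prior) - 0.5 * (n_bands * np.log(2.0 * np.pi) + logdet + mahal_sq)

def classify_max_likelihood(pixel, stats, priors=None):
    """Pick the class with the highest (prior-weighted) likelihood."""
    scores = {}
    for name, (mean, cov) in stats.items():
        prior = 1.0 if priors is None else priors[name]
        scores[name] = gaussian_log_likelihood(pixel, mean, cov, prior)
    return max(scores, key=scores.get)

# Hypothetical training statistics: (mean vector, covariance matrix).
STATS = {
    "corn": (np.array([40.0, 70.0]), np.array([[25.0, 15.0], [15.0, 25.0]])),
    "sand": (np.array([60.0, 70.0]), np.array([[400.0, 0.0], [0.0, 400.0]])),
}

print(classify_max_likelihood((45, 75), STATS))  # corn
print(classify_max_likelihood((48, 62), STATS))  # sand
```

Passing priors such as {"corn": 0.99, "sand": 0.01} gives the Bayesian variant; with these hypothetical numbers the second pixel is then assigned to corn.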
Supervised classification (cont.)
Training stage
• Classification → automatic work
• Assembling the training data → manual work
Both an art and a science
Substantial reference data
Thorough knowledge of the geographic area
You are what you eat!
Results of classification are what you train!
• Training data
Both representative and complete
All spectral classes constituting each information class must be adequately represented in
the training set statistics used to classify an image
e.g. water (turbid or clear)
e.g. crop (date, type, soil moisture, …)
It is common to acquire data from 100+ training areas to represent the spectral variability
Supervised classification (cont.)
Training stage (cont.)
• Training area
Delineate boundaries (Fig 7.45)
Carefully located boundaries → no edge pixels
Seed pixel
Choose a seed pixel → apply statistically based criteria → contiguous pixels → cluster
• Training pixels
Number
At least n+1 pixels for n spectral bands
In practice, 10n to 100n pixels are used
Dispersion → representative
• Training set refinement
Make sure the sample size is sufficient
Assess the overall quality
Check if all data sets are normally distributed and spectrally pure
Avoid redundancy
Delete or merge
Supervised classification (cont.)
Training stage (cont.)
• Training set refinement process
Graphical representation of the spectral response patterns
Fig 7.46: Histograms for data points included in the training areas of “hay”
Visual check on the normality of the spectral response distribution
Two subclasses: normal and bimodal
Fig 7.47: Coincident spectral plot
Corn/hay overlap for all bands
Band 3 and 5 for hay/corn separation (use scatter plot)
Fig 7.48: SPOT HRV multi-spectral images
Fig 7.49 scatter plot of band 1 versus band 2
Fig 7.50 scatter plot of band 2 versus band 3 → less correlated → adequate
Quantitative expressions of category separation
Transformed divergence: a covariance-weighted distance between category means
Table 7.1: Portion of a divergence matrix (< 1500 → spectrally similar classes)
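A minimal sketch of a pairwise transformed divergence calculation, assuming NumPy and the commonly cited form scaled to the range 0 to 2000: D = 0.5 tr[(Ci - Cj)(Cj^-1 - Ci^-1)] + 0.5 tr[(Ci^-1 + Cj^-1)(mi - mj)(mi - mj)^T], TD = 2000 [1 - exp(-D / 8)]. The function name and the sample statistics are hypothetical, and the software used in class may apply a different scaling.

```python
import numpy as np

def transformed_divergence(mean_i, cov_i, mean_j, cov_j):
    """Transformed divergence between two classes, scaled to 0..2000."""
    cov_i = np.asarray(cov_i, dtype=float)
    cov_j = np.asarray(cov_j, dtype=float)
    inv_i = np.linalg.inv(cov_i)
    inv_j = np.linalg.inv(cov_j)
    diff = (np.asarray(mean_i, dtype=float)
            - np.asarray(mean_j, dtype=float)).reshape(-1, 1)
    # Covariance term plus covariance-weighted distance between the means.
    divergence = (0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i))
                  + 0.5 * np.trace((inv_i + inv_j) @ diff @ diff.T))
    return 2000.0 * (1.0 - np.exp(-divergence / 8.0))

# Hypothetical two-band statistics.
COV = [[4.0, 0.0], [0.0, 4.0]]
td_close = transformed_divergence([10, 10], COV, [12, 10], COV)
td_far = transformed_divergence([10, 10], COV, [40, 10], COV)
print(td_close < 1500, td_far > 1900)  # True True
```

With these numbers the first pair would be flagged as spectrally similar under the < 1500 rule of thumb, while the second pair is close to the maximum of 2000.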
Supervised classification (cont.)
Training stage (cont.)
• Training set refinement process (cont.)
Self-classification of training set data
Error matrix for the training areas → not for the test area or the overall scene
Tells us how well the classifier can classify the training areas and nothing more
Overall accuracy assessment is performed after the classification and output stages
Interactive preliminary classification
Plate 29: sample interactive preliminary classification procedure
Representative subscene classification
Complete the classification for the test area → verify and improve
Summary
Revise with merger, deletion and addition to form the final set of statistics used in
classification
Accept reduced accuracy for a class that occurs rarely in the scene to preserve the
accuracy over extensive areas
Alternative methods for separating two spectrally similar classes → GIS data, visual
interpretation, field check, multi-temporal or spatial pattern recognition procedures, …
Supervised classification (cont.)
Training stage (cont.)
• Implementation → region of interest (ROI)
• Three sources of ROI
Manually from an image using the mouse
From pixel scatter plots
From vector layers
Exercise 1
Quick classification using interactive 2-D
scatter plots
• Rationale
Sufficient information to determine appropriate training areas may not
exist
2-D scatter plot → first step in determining the training set
• Data: ca_coast.dat (TMS data)
• Create 2D scatter plot
Tools → 2-D Scatter Plots…
The adjacent bands are usually highly correlated
Choose band 3 for X-axis and band 8 for Y-axis
Check dancing pixels
hold the left-button in the image window
hold the right-button in the image window
Options → Density slice
Self test 1
File: ca_coast.dat
• Use 2D scatter plot to define 5 ROIs
Note:
• Selection of bands for 2D scatter plot
• The least number of pixels required for each
class
• Dispersion of ROIs
• Give each ROI an appropriate name
• Output the ROIs into a file
Exercise 2
Perform classification
• File: ca_coast.dat
• Use the same ROIs that were defined earlier
• Classification method:
Maximum likelihood method
Minimum distance method
• Try various threshold value(s)
• Use Preview function
Change the extent by selecting the Change View button
• Examine the rule image
Exercise 3
Examine class images
• Load results of classification in previous exercise
• Link the displays and examine the differences
• Answer the following questions
Regions of the same classification
Regions of the different classification
Which is better
Do your ROIs seem to be appropriate?
How to improve the classification by changing the ROIs
• Check the header and data type of the classified result
• Change the class color mapping
Exercise 4
Examine rule images
• Display the rule images from the previous exercise
• Link the displays and examine the differences
• Plot the z profile for each rule image
• Move to an arbitrary pixel, check the value
and determine which class this pixel should be
Exercise 5
Perform post classification using the rule
classifier
• Classification → Post Classification → Rule Classifier
• File: dist_rule.img
• Change the thresholds and press Quick Apply
• Examine the result
• Examine the rule image histograms to determine the
appropriate threshold for each class
Press the Hist button for the open ocean class
Set a threshold to encompass the first peak of the bimodal histogram
Repeat for the other classes
Exercise 6
Overlay classes
• Display band 7 of ca_coast.dat in gray
• Overlay → Classification
• File: max_class.img
• Interactive Class Tool dialog
Turn on and off class(es)
Options → Class distribution
Change active class
Options → Associated stats data file
Options → Stats for all classes
Examine the min, max, mean, standard deviation for each class
• Display band 7 of ca_coast.dat in a new window
• Overlay dist_class.img
• Link two displays and examine the differences
Exercise 6 (cont.)
Overlay classes (cont.)
• Repeat setting the Interactive Class Tool dialog for
the new file: dist_class.img
Turn on and off class(es)
Options → Class distribution
Change active class
Options → Associated stats data file
Options → Stats for all classes
Examine the min, max, mean, standard deviation for each class
• Compare the class distribution and stats plots
• Editing pixels of classification using the Interactive
Class Tool
Exercise 7
Convert classes to ROIs
• Using Band Threshold to ROI tool
Overlay → Regions of Interest
Options → Band Threshold to ROI
• Options → Report Area of ROIs
Unsupervised classification
Unsupervised vs. supervised
• Supervised: define useful information
categories → examine their spectral
separability
• Unsupervised: determine spectral classes →
define their informational utility
Illustration: Fig 7.51
• Advantage: the spectral classes are found
automatically (e.g. stressed class)
Unsupervised classification (cont.)
Clustering algorithms
• K-means
Locate centers of seed clusters → assign all pixels to the cluster with the closest
mean vector → revise the mean vector for each cluster → reclassify the image →
iterate until there is no significant change
• Iterative self-organizing data analysis (ISODATA)
Permit the number of clusters to change from one iteration to the next by
Merging: distance < some predefined minimum distance
Splitting: standard deviation > some predefined maximum value
Deleting: number of pixels in a cluster < some specified minimum number
Table 7.2
• Outcome 1: ideal result
• Outcome 2: subclasses → classes
• Outcome 3: a more troublesome result
The information categories are spectrally similar and cannot be differentiated in
the given data set
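A minimal sketch of the K-means loop described on this slide (seed centers, assign pixels, revise means, repeat until the centers stop moving), assuming NumPy. The random seeding strategy, the tolerance, the function name and the sample pixels are assumptions; ISODATA would add the merge, split and delete steps on top of this loop.

```python
import numpy as np

def kmeans(pixels, n_clusters, max_iter=100, tol=1e-4, seed=0):
    """Cluster pixel vectors; returns (labels, cluster mean vectors)."""
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    # Seed the cluster centers with randomly chosen pixels.
    centers = pixels[rng.choice(len(pixels), size=n_clusters, replace=False)]
    for _ in range(max_iter):
        # Assign every pixel to the cluster with the closest mean vector.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Revise the mean vectors; an empty cluster keeps its old center.
        new_centers = np.array([
            pixels[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < tol:  # no significant change: stop iterating
            break
    return labels, centers

# Hypothetical two-band pixels forming two well-separated groups.
PIXELS = [[10, 10], [11, 9], [9, 11], [10, 12],
          [80, 82], [81, 79], [79, 80], [82, 81]]
labels, centers = kmeans(PIXELS, n_clusters=2)
print(labels, centers)
```

The cluster numbers themselves carry no meaning; the analyst still has to decide which information class each spectral cluster represents.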
Exercise 8
Unsupervised classification
• File: ca_coast.dat
• Method: K-means and ISODATA
• Parameter:
• Overlay the result of classification onto the
original true-color image
• Examine the result of classification
• Save both results for exercise 10
Hybrid classification
Unsupervised training areas
• Image sub-areas chosen intentionally to be quite different from
supervised training areas
Supervised → regions of homogeneous cover type
Unsupervised → contain numerous cover types at various locations throughout
the scene
To identify the spectral classes
Guided clustering
• Delineate training areas for class X
• Cluster all class X pixels into spectral subclasses X1, X2, …
• Merge or delete class X signatures
• Repeat for all classes
• Examine all class signatures and merge/delete signatures
• Perform maximum likelihood classification
• Aggregate spectral subclasses
Classification of mixed pixels
Mixed pixels
• IFOV includes more than one type/feature
Low resolution sensors → more serious
Subpixel classification
• Spectral mixture analysis
A deterministic method (not a statistical method)
Pure reference spectral signatures
Measured in the lab, in the field, or from the image itself
Endmembers
Basic assumption
The spectral variation in an image is caused by mixtures of a limited number of surface materials
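Linear mixture → satisfies two basic conditions simultaneously
The sum of the fractional proportions of all potential endmembers: $\sum_{i=1}^{N} F_i = 1$
The observed $DN_\lambda$ for each pixel:
$DN_\lambda = F_1 DN_{\lambda,1} + F_2 DN_{\lambda,2} + \dots + F_N DN_{\lambda,N} + E_\lambda$
($F_i$: fraction of endmember $i$; $DN_{\lambda,i}$: DN of endmember $i$ in band $\lambda$; $E_\lambda$: error term)
B bands → B equations
B+1 equations → solve for B+1 endmember fractions
Fig 7.52: example of a linear spectral mixture analysis
Drawback: multiple scattering → nonlinear mixture model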
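A minimal sketch of linear spectral unmixing, assuming NumPy. The B band equations are stacked with the sum-to-one condition as one extra, heavily weighted equation and solved by least squares. The endmember values, the weight and the function name are hypothetical, and the sketch does not force the fractions to be non-negative.

```python
import numpy as np

def unmix(pixel, endmembers, weight=1000.0):
    """Estimate endmember fractions for one pixel.

    endmembers has shape (n_bands, n_endmembers): one column of pure DNs
    per endmember. Returns (fractions, per-band residual E).
    """
    endmembers = np.asarray(endmembers, dtype=float)
    pixel = np.asarray(pixel, dtype=float)
    n_end = endmembers.shape[1]
    # Extra weighted row of ones pushes the solution toward sum(F) = 1.
    a = np.vstack([endmembers, weight * np.ones((1, n_end))])
    b = np.append(pixel, weight)
    fractions, *_ = np.linalg.lstsq(a, b, rcond=None)
    residual = pixel - endmembers @ fractions
    return fractions, residual

# Hypothetical endmember DNs: columns are water, sand, vegetation.
ENDMEMBERS = [[10.0, 60.0, 30.0],
              [5.0, 80.0, 40.0],
              [2.0, 20.0, 90.0]]
# Pixel built as 20% water + 50% sand + 30% vegetation.
fractions, residual = unmix([41.0, 53.0, 37.4], ENDMEMBERS)
print(np.round(fractions, 3))
```

For this noise-free pixel the recovered fractions are 0.2, 0.5 and 0.3; with real data the residual shows how well the linear model fits each band.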
Classification of mixed pixels (cont.)
Subpixel classification (cont.)
• Fuzzy classification
A given pixel may have partial membership in more than
one category
Fuzzy clustering
Conceptually similar to the K-means unsupervised classification approach
Hard boundaries → fuzzy regions
Membership grade
Fuzzy supervised classification
A classified pixel is assigned a membership grade with respect to its
membership in each information class
Exercise 9
Linear spectral unmixing
• File: ca_coast.dat
• Display the image in true color
• Set 5 ROIs, each containing one pure pixel
• Spectral → Mapping Methods → Endmember Collection
• Import five endmembers from ROIs
• Algorithm → Linear Spectral Unmixing
• Set constrained
• Apply and examine the results
The output stage
Image classification → output products →
end users
• Graphic products
Plate 30, Fig 3 of the paper “IKONOS imagery for resource
management”
• Tabular data
• Digital information files
Postclassification smoothing
Salt-and-pepper appearance
• Low-pass filter cannot be used
• Must operate on the basis of logical operations, rather than
simple arithmetic computations
Majority filter
• Fig 7.53
(a) original classification → salt-and-pepper appearance
(b) 3 x 3 pixel-majority filter
(c) 5 x 5 pixel-majority filter
Imbedded in the algorithm of classification
• Limited
• Need the technique of spatial pattern recognition
• Future development
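A minimal sketch of a majority filter, written to show why the smoothing is a logical operation on class labels rather than an arithmetic average. The strict-majority rule, the handling of image edges (a smaller window) and the function name are assumptions for illustration.

```python
from collections import Counter

def majority_filter(classes, size=3):
    """Relabel each pixel with the majority class of its size x size window."""
    half = size // 2
    n_rows, n_cols = len(classes), len(classes[0])
    out = [row[:] for row in classes]
    for r in range(n_rows):
        for c in range(n_cols):
            # Window is clipped at the image edges.
            window = [
                classes[rr][cc]
                for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                for cc in range(max(0, c - half), min(n_cols, c + half + 1))
            ]
            label, count = Counter(window).most_common(1)[0]
            # Only relabel when one class holds a strict majority.
            if count > len(window) / 2:
                out[r][c] = label
    return out

# Hypothetical classified image with two isolated "salt-and-pepper" pixels.
NOISY = [[1, 1, 1, 1],
         [1, 2, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 3]]
print(majority_filter(NOISY))  # every pixel becomes class 1
```

Averaging the labels instead would produce meaningless values such as class 1.1; counting labels keeps the output a valid class code.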
Exercise 10
Postclassification smoothing
• File: results from exercise 8
• Clump and Sieve
For generalizing classification images, Sieve is usually run
first to remove the isolated pixels based on a size (number
of pixels) threshold. Clump is run to add spatial coherency
to existing classes by combining adjacent similar classified
areas
Classification → Post Classification → Sieve Classes
Classification → Post Classification → Clump Classes
• Combine Classes
Classification → Post Classification → Combine Classes
Classification accuracy assessment
Significance
• A classification is not complete until its accuracy is assessed
Classification error matrix
• Error matrix (confusion matrix, contingency table)
Table 7.3
Omission (exclusion): pixels that should have been assigned to the class but were not
Non-diagonal column elements (e.g. 16 sand pixels were omitted)
Commission (inclusion): pixels that were assigned to the class but should not have been
Non-diagonal row elements (e.g. 38 urban pixels + 79 hay pixels were included in corn)
Overall accuracy
Producer’s accuracy
Indicates how well training set pixels of the given cover type are classified
User’s accuracy
Indicates the probability that a pixel classified into a given category actually represents that category on
the ground
Training area accuracies are sometimes used in the literature as an indication of
overall accuracy. They should not be!
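A minimal sketch of how the three accuracy measures follow from an error matrix, assuming NumPy and the layout used above: rows are the classified categories and columns are the reference categories. The 3-class matrix is hypothetical and is not Table 7.3.

```python
import numpy as np

def accuracy_report(matrix):
    """Return (overall, producer's, user's) accuracies from an error matrix.

    Rows = classification result, columns = reference data.
    """
    m = np.asarray(matrix, dtype=float)
    diag = np.diag(m)
    overall = diag.sum() / m.sum()
    producers = diag / m.sum(axis=0)  # column totals: omission errors
    users = diag / m.sum(axis=1)      # row totals: commission errors
    return overall, producers, users

# Hypothetical error matrix for water, forest, urban.
MATRIX = [[50, 0, 0],
          [5, 40, 15],
          [0, 10, 30]]
overall, producers, users = accuracy_report(MATRIX)
print(round(overall, 3))       # 0.8
print(np.round(producers, 3))  # [0.909 0.8   0.667]
print(np.round(users, 3))      # [1.    0.667 0.75 ]
```

A class can score well on one measure and poorly on the other, so both should be reported alongside the overall accuracy.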
Classification accuracy assessment (cont.)
Sampling considerations
• Test area
Different and more extensive than training area
Withhold some training areas for postclassification accuracy assessment
Being homogeneous, test areas might not provide a valid indication of
classification accuracy at the individual pixel level of land cover variability
• Wall-to-wall comparison
Expensive
Defeat the whole purpose of remote sensing
• Random sampling
Collecting a large sample of randomly distributed points → too expensive and
difficult
e.g. 3/4 of Taiwan's area is covered by the Central Mountain Range
Only sample those pixels without influence of potential registration error
Several pixels away from field boundaries
Stratified random sampling
Each land cover category → stratum
Classification accuracy assessment (cont.)
Sampling considerations (cont.)
• Accomplishment of random sampling
Overlay the classified output data with a grid
Test cells within the grid are selected randomly and groups of pixels
within the test cells are evaluated
• Sample unit
Individual pixels, clusters of pixels or polygons
• Sample number
General area: 50 samples per category
Large area or more than 12 categories: 75 – 100 samples per category
Depend on the variability of each category
Wetlands need more samples than open water
Classification accuracy assessment (cont.)
Evaluating classification error matrices
• Table 7.4: error matrix (randomly sampled test)
Producer’s accuracy for forest is 84% > overall accuracy of
65% → good for classifying forest?!
User’s accuracy for forest is only 60%
Only good for classifying water
Self test 2
Employ all methods and concepts of
classification that you have learned so
far to classify the file ca_coast.dat
carefully. The ground truths in the
validation region will be provided next
week in the form of ROIs to assess your
result.
Tutorial: multispectral classification
Read image
• File → Open Image File
Subdirectory: envidata
File: can_tmr.img
RGB Color
Bands 4, 3, and 2
• Review Image Colors
False color infrared photograph
Bright red areas → high infrared reflectance → healthy vegetation → under cultivation, or along rivers
Slightly darker red areas → native vegetation → coniferous trees
Several distinct geologic classes are also readily apparent, as is urbanization
• Cursor Location/Value
• Examine Spectral Plots
Tools → Profiles → Z Profile (Spectrum)
Note the relations between image color and spectral shape
Pay attention to the location of the image bands in the spectral profile, marked by the red, green, and
blue bars in the plot
Tutorial: multispectral classification
(cont.)
Unsupervised Classification
• Classification → Unsupervised → K-Means or IsoData
• K-Means
Uses a cluster analysis approach which requires the analyst to select the number
of clusters to be located in the data, arbitrarily locates this number of cluster
centers, then iteratively repositions them until optimal spectral separability is
achieved
Choose K-Means as the method, use all of the default values and click on OK
Review the results contained in can_km.img.
Experiment with different numbers of classes, change thresholds, standard
deviations, and maximum distance error values to determine their effect on the
classification.
• Isodata
Calculates class means evenly distributed in the data space and then iteratively
clusters the remaining pixels using minimum distance techniques. Each
iteration recalculates means and reclassifies pixels with respect to the new
means
Choose IsoData as the method, use all of the default values and click on OK, or
review the results contained in can_iso.img.
Tutorial: multispectral classification
(cont.)
Regions of Interest (ROI)
• Select Training Sets Using Regions of Interest (ROI)
• Restore Predefined ROIs
Choosing from the #1 Main Image menu bar Overlay → Region of
Interest
Choose File → Restore ROIs
File: CLASSES.ROI
• Create Your Own ROIs
Overlay → Region of Interest
Draw a polygon
Fix the polygon by clicking the right mouse button a second time
New Region
Edit
Tutorial: multispectral classification
(cont.)
Supervised Classification
• Supervised classification requires that the user
select training areas for use as the basis for
classification
• Classification → Supervised → [method]
[method] is one of the supervised classification methods in
the pull-down menu (Parallelepiped, Minimum Distance,
Mahalanobis Distance, Maximum Likelihood, Spectral
Angle Mapper, Binary Encoding, or Neural Net). Use one of
the two methods below for selecting training areas, also
known as regions of interest (ROIs).
Tutorial: multispectral classification
(cont.)
Classical Supervised Multispectral Classification
• Parallelepiped
Uses a simple decision rule to classify multispectral data. The decision boundaries form
an n-dimensional parallelepiped in the image data space. The dimensions of the
parallelepiped are defined based upon a standard deviation threshold from the mean of
each selected class
Pre-saved results are in the file can_pcls.img
Perform your own classification using the CLASSES.ROI regions of interest
• Maximum Likelihood
Assumes that the statistics for each class in each band are normally distributed
Calculates the probability that a given pixel belongs to a specific class
Unless a probability threshold is selected, all pixels are classified
Each pixel is assigned to the class that has the highest probability
• Minimum Distance
Uses the mean vectors of each ROI and calculates the Euclidean distance from each
unknown pixel to the mean vector for each class
• Mahalanobis Distance
A direction sensitive distance classifier that uses statistics for each class
Assumes all class covariances are equal and therefore is a faster method
Tutorial: multispectral classification
(cont.)
Spectral Classification Methods
• Developed specifically for use on Hyperspectral data,
but provide an alternative/improved method for
classifying multispectral data
• The Endmember Collection Dialog
Spectral → Mapping Methods → Endmember Collection
(Classification → Endmember Collection)
Open File
File: can_tmr.img
Endmember Collection: Parallel dialog
Algorithm → [method]
[method] represents: Parallelepiped, Minimum Distance, Mahalanobis Distance,
Maximum Likelihood, Binary Encoding, and the Spectral Angle Mapper (SAM)
Tutorial: multispectral classification
(cont.)
Spectral Classification Methods (cont.)
• Binary Encoding Classification
Encodes the data and endmember spectra into 0s and 1s based on
whether a band falls below or above the spectrum mean
An exclusive OR function is used to compare each encoded reference
spectrum with the encoded data spectra and a classification image is
produced
All pixels are classified to the endmember with the greatest number of
bands that match unless the user specifies a minimum match threshold,
in which case some pixels may be unclassified if they do not meet the
criteria
Algorithm → Binary Encoding
Import → from ROI from Input File
Select All Items
Endmember Spectra
Options → Plot Endmembers
Apply
Binary Encoding Parameters
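A minimal sketch of the binary encoding idea described on this slide: each spectrum is encoded as 0s and 1s relative to its own mean, the codes are compared with an exclusive OR, and the pixel goes to the endmember with the most matching bands. The reference spectra, the function names and the min_match parameter are hypothetical and do not reproduce ENVI's implementation.

```python
def binary_encode(spectrum):
    """Encode each band as 1 if above the spectrum mean, else 0."""
    mean = sum(spectrum) / len(spectrum)
    return [1 if value > mean else 0 for value in spectrum]

def classify_binary(pixel, references, min_match=0.0):
    """Assign the pixel to the reference with the most matching bands."""
    code = binary_encode(pixel)
    best_name, best_fraction = "unclassified", -1.0
    for name, ref in references.items():
        ref_code = binary_encode(ref)
        # XOR is 1 where the two codes disagree.
        mismatches = sum(a ^ b for a, b in zip(code, ref_code))
        fraction = 1.0 - mismatches / len(code)
        if fraction > best_fraction:
            best_name, best_fraction = name, fraction
    return best_name if best_fraction >= min_match else "unclassified"

# Hypothetical four-band endmember spectra.
REFERENCES = {
    "vegetation": [30, 25, 20, 90],
    "soil": [40, 50, 60, 70],
}
print(binary_encode([30, 25, 20, 90]))                 # [0, 0, 0, 1]
print(classify_binary([28, 22, 25, 80], REFERENCES))   # vegetation
print(classify_binary([80, 70, 20, 10], REFERENCES, min_match=0.5))  # unclassified
```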
Tutorial: multispectral classification
(cont.)
Spectral Classification Methods (cont.)
• Spectral Angle Mapper Classification
Uses the n-dimensional angle to match pixels to reference
spectra
Determines the spectral similarity between two spectra by
calculating the angle between the spectra, treating them as
vectors in a space with dimensionality equal to the number
of bands
Endmember Collection
Algorithm → Spectral Angle Mapper
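A minimal sketch of the spectral angle calculation: both spectra are treated as vectors with one dimension per band, and the angle between them (in radians) measures their similarity. The reference spectra, the 0.1 rad threshold and the function names are hypothetical.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)

def classify_sam(pixel, references, max_angle=0.1):
    """Assign the pixel to the reference with the smallest spectral angle."""
    angles = {name: spectral_angle(pixel, ref) for name, ref in references.items()}
    best = min(angles, key=angles.get)
    return best if angles[best] <= max_angle else "unclassified"

# Hypothetical four-band reference spectra.
SAM_REFERENCES = {
    "vegetation": (30, 25, 20, 90),
    "soil": (40, 50, 60, 70),
}
# A vegetation spectrum at half the brightness has the same direction.
print(classify_sam((15, 12.5, 10, 45), SAM_REFERENCES))  # vegetation
print(classify_sam((90, 20, 25, 30), SAM_REFERENCES))    # unclassified
```

Because only the direction of the vector matters, a uniformly darker or brighter version of the same spectrum yields an angle of approximately zero.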
Tutorial: multispectral classification
(cont.)
Post Classification Processing
• What the rule image values represent, by classification method
Parallelepiped: number of bands that satisfied the parallelepiped criteria
Minimum Distance: sum of the distances from the class means
Maximum Likelihood: probability of pixel belonging to class
Mahalanobis Distance: distances from the class means
Binary Encoding: binary match in percent
Spectral Angle Mapper: spectral angle in radians
Tools → Color Mapping → ENVI Color Tables
Stretch Bottom and Stretch Top sliders
Cursor Location/Value
Classification → Post Classification → Rule Classifier
File: can_tmr.sam
Rule Image Classifier Tool
Tutorial: multispectral classification
(cont.)
Post Classification Processing (cont.)
• Class Statistics
Classification → Post Classification → Class Statistics
Select All Items
• Confusion Matrix
Comparison of two classified images (the classification and the “truth”
image), or a classified image and ROIs
The truth image can be another classified image, or an image created
from actual ground truth measurements
• Classification → Post Classification → Confusion
Matrix → [method]
Using Ground Truth Image, or Using Ground Truth ROIs.
• Match Classes Parameters dialog
Tutorial: multispectral classification
(cont.)
Post Classification Processing (cont.)
• Clump and Sieve
For generalizing classification images, Sieve is usually run first to remove the
isolated pixels based on a size (number of pixels) threshold. Clump is run to add
spatial coherency to existing classes by combining adjacent similar classified
areas
• Compare the pre-calculated results in the files can_sv.img
(sieve) and can_clmp.img (clump of the sieve result) to the
classified image can_pcls.img
• Classification → Post Classification → Sieve Classes
• Classification → Post Classification → Clump Classes
• Combine Classes
• Classification → Post Classification → Combine Classes
File: can_sam.img
Add Combination
Tutorial: multispectral classification
(cont.)
Post Classification Processing (cont.)
• Edit Class Colors
Tools → Color Mapping → Class Color Mapping
To make the changes permanent, select Options → Save Changes
• Overlay Classes
Classification → Post Classification → Overlay Classes
Select can_tmr.img band 3 for each RGB band
Use can_comb.img as the classification input
• Interactive Classification Overlays
Interactively toggle classes on and off as overlays on a displayed image, to edit
classes, get class statistics, merge classes, and edit class colors.
Display band 4 of can_tmr.img
Overlay → Classification
Try the various options for assessing the classification under the Options menu
Choose various options under the Edit menu to interactively change the
contents of specific classes
File → Save Image As → [Device]
Tutorial: multispectral classification
(cont.)
Post Classification Processing (cont.)
• Classes to Vector Layers
• Overlay → Vectors
File: can_clmp.img
• File → Open Vector File → ENVI Vector File
Files: can_v1.evf and can_v2.evf.
Select All Layers
Load Selected
• Classification → Post Classification → Classification to Vector
Raster to Vector Input Band dialog.
Choose the generalized image can_clmp.img
Select Region #1 and Region #2 and enter the root name canrtv
Load Selected at the bottom of the dialog.
Load Vector
Edit→Edit Layer Properties
• Classification Keys Using Annotation
Overlay → Annotation
Object → Map Key
Edit Map Key Items