Image Classification

Chapter
Image Classification
Analysis and applications of remote
sensing imagery
Instructor: Dr. Cheng-Chien Liu
Department of Earth Sciences
National Cheng Kung University
Last updated: 8 April 2016
Introduction
 Overall objective of classification
• Automatically categorize all pixels in an image into land cover
classes or themes
 Three types of pattern recognition
• Spectral pattern recognition → the emphasis of this chapter
• Spatial pattern recognition
• Temporal pattern recognition
 Selection of a classification approach
• No single “right” approach
• Depends on
 The nature of the data being analyzed
 The computational resources available
 The intended application of the classified data
Supervised classification
 Fig 7.37
• A hypothetical example
Five bands: B, G, R, NIR, TIR
Six land cover types: water, sand, forest, urban, corn, hay
 Three basic steps (Fig 7.38)
• Training stage
• Classification stage
• Output stage
Supervised classification (cont.)
 Classification stage
• Fig 7.39
Pixel observations from selected training sites plotted on a scatter diagram
Two bands are used for demonstration; the approach applies to any number of bands
Clouds of points → multidimensional descriptions of the spectral response patterns of each category of cover type to be interpreted
• Minimum-Distance-to-Mean classifier
Fig 7.40
 Mean vector for each category
 Pt 1 → Corn
 Pt 2 → Sand ?!!
Advantage: mathematically simple and computationally efficient
Disadvantage: insensitive to different degrees of variance in the spectral
response data
Not widely used if the spectral classes are close to one another in the
measurement space and have high variance
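A minimal NumPy sketch of this decision rule, using a hypothetical two-band training dictionary (class name → array of training pixels); it illustrates the computation only, not the textbook's figures or any particular software package:

```python
import numpy as np

def minimum_distance_classify(image, training):
    """Assign each pixel to the class whose mean vector is closest (Euclidean).

    image    : (rows, cols, bands) array of pixel values
    training : dict mapping class name -> (n_pixels, bands) array of training pixels
    """
    classes = sorted(training)
    # Mean vector for each category, stacked as (n_classes, bands)
    means = np.stack([training[c].mean(axis=0) for c in classes])
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    # Euclidean distance from every pixel to every class mean
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    labels = dists.argmin(axis=1).reshape(image.shape[:2])
    return classes, labels

# Hypothetical two-band example in the spirit of Fig. 7.40
rng = np.random.default_rng(0)
training = {
    "corn": rng.normal([60, 80], 5, size=(50, 2)),
    "sand": rng.normal([120, 110], 5, size=(50, 2)),
}
image = rng.normal([70, 85], 20, size=(10, 10, 2))
classes, labels = minimum_distance_classify(image, training)
```

Because only the class means enter the decision, the rule cannot account for differing class variances, which is exactly the weakness noted above.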
Supervised classification (cont.)
 Classification stage (cont.)
• Parallelepiped classifier
Fig 7.41
 Range for each category
 Pt 1 → Hay ?!!
 Pt 2 → Urban
Advantage: mathematically simple and computationally
efficient
Disadvantage: confused when class correlation (high covariance) is poorly described by the rectangular decision regions
 Positive covariance: Corn, Hay, Forest
 Negative covariance: Water
Alleviated by use of stepped decision region boundaries (Fig 7.42)
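A corresponding sketch of the parallelepiped rule, under the same hypothetical training-dictionary layout; the class boxes here use the band-wise minimum and maximum of the training pixels (a standard-deviation threshold around the mean is another common way to set the box):

```python
import numpy as np

def parallelepiped_classify(image, training, unclassified=-1):
    """Label each pixel with the first class whose band-wise min/max box contains it.

    Pixels outside every box stay unclassified; pixels inside overlapping boxes
    go to the first matching class, one of the weaknesses of rectangular
    decision regions noted above.
    """
    classes = sorted(training)
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    labels = np.full(pixels.shape[0], unclassified)
    for i, c in enumerate(classes):
        lo = training[c].min(axis=0)   # per-band lower bound of the box
        hi = training[c].max(axis=0)   # per-band upper bound of the box
        inside = np.all((pixels >= lo) & (pixels <= hi), axis=1)
        labels[(labels == unclassified) & inside] = i
    return classes, labels.reshape(image.shape[:2])
```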
Supervised classification (cont.)
 Classification stage (cont.)
• Gaussian maximum likelihood classifier
Assumption: the cloud of points for each class follows a Gaussian (normal) distribution
Probability density functions → defined by the mean vector and covariance matrix (Fig. 7.43)
Fig 7.44: Ellipsoidal equiprobability contours
Bayesian classifier
 A priori probability (anticipated likelihood of occurrence)
 Two weighting factors
 If suitable data exist for these factors, the Bayesian implementation of the classifier is
preferable
Disadvantage: computationally demanding; can be alleviated by
 A look-up table approach
 Reducing the dimensionality (principal or canonical components transform)
 Simplifying the classification by separating certain classes a priori
 e.g., water is easier to separate by use of an NIR/red ratio
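A hedged NumPy sketch of the Gaussian maximum likelihood rule, with an optional dictionary of a priori probabilities for the Bayesian variant; the training-data layout and variable names are assumptions carried over from the sketches above:

```python
import numpy as np

def maximum_likelihood_classify(image, training, priors=None):
    """Assign each pixel to the class with the largest log Gaussian density,
    optionally weighted by a priori probabilities (Bayesian variant)."""
    classes = sorted(training)
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    scores = []
    for c in classes:
        samples = training[c]
        mean = samples.mean(axis=0)
        cov = np.cov(samples, rowvar=False)      # class covariance matrix
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = pixels - mean
        # log of the multivariate normal density (constant term dropped)
        log_pdf = -0.5 * (np.einsum("ij,jk,ik->i", d, inv, d) + logdet)
        prior = 1.0 if priors is None else priors[c]
        scores.append(log_pdf + np.log(prior))
    labels = np.argmax(np.stack(scores, axis=1), axis=1)
    return classes, labels.reshape(image.shape[:2])
```

The per-pixel quadratic form is what makes this method computationally demanding compared with the two simpler classifiers.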
Supervised classification (cont.)
 Training stage
• Classification → automatic work
• Assembling the training data → manual work
Both an art and a science
Substantial reference data
Thorough knowledge of the geographic area
You are what you eat!
 Results of classification are what you train!
• Training data
Both representative and complete
 All spectral classes constituting each information class must be adequately represented in
the training set statistics used to classify an image
 e.g. water (turbid or clear)
 e.g. crop (date, type, soil moisture, …)
 It is common to acquire data from 100+ training areas to represent the spectral variability
Supervised classification (cont.)
 Training stage (cont.)
• Training area
 Delineate boundaries (Fig 7.45)
 Carefully located boundaries → no edge pixels
 Seed pixel
 Choose seed pixel → statistically based criteria → contiguous pixels → cluster
• Training pixels
 Number
 At least n+1 pixels for n spectral bands
 In practice, 10n to 100n pixels is used
 Dispersion throughout the scene → representative statistics
• Training set refinement
 Make sure the sample size is sufficient
 Assess the overall quality
 Check if all data sets are normally distributed and spectrally pure
 Avoid redundancy
 Delete or merge
Supervised classification (cont.)
 Training stage (cont.)
• Training set refinement process
Graphical representation of the spectral response patterns
 Fig 7.46: Histograms for data points included in the training areas of “hay”
 Visual check on the normality of the spectral response distribution
 A bimodal distribution indicates two spectral subclasses
 Fig 7.47: Coincident spectral plot
 Corn/hay overlap in all bands
 Bands 3 and 5 for hay/corn separation (use a scatter plot)
 Fig 7.48: SPOT HRV multi-spectral images
 Fig 7.49 scatter plot of band 1 versus band 2
 Fig 7.50 scatter plot of band 2 versus band 3 → less correlated → adequate
Quantitative expressions of category separation
 Transformed divergence: a covariance-weighted distance between category means
 Table 7.1: Portion of a divergence matrix (< 1500 → spectrally similar classes)
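The slides quote the transformed divergence threshold without the formula. A commonly used form (an assumption here, not taken from the text), with class mean vectors $\mu_i$ and covariance matrices $C_i$, is:

```latex
D_{ij} = \tfrac{1}{2}\,\mathrm{tr}\!\left[(C_i - C_j)\,(C_j^{-1} - C_i^{-1})\right]
       + \tfrac{1}{2}\,\mathrm{tr}\!\left[(C_i^{-1} + C_j^{-1})\,(\mu_i - \mu_j)(\mu_i - \mu_j)^{T}\right]

TD_{ij} = 2000\left[1 - \exp\!\left(-D_{ij}/8\right)\right]
```

Because $TD$ saturates at 2000, values below roughly 1500 flag spectrally similar class pairs, which is consistent with the threshold cited for Table 7.1.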
Supervised classification (cont.)
 Training stage (cont.)
• Training set refinement process (cont.)
Self-classification of training set data
 Error matrix → for the training areas, not for the test areas or the overall scene
 Tells us how well the classifier can classify the training areas and nothing more
 Overall accuracy is assessed after the classification and output stages
Interactive preliminary classification
 Plate 29: sample interactive preliminary classification procedure
Representative subscene classification
 Complete the classification for the test area → verify and improve
Summary
 Revise with merger, deletion and addition to form the final set of statistics used in
classification
 Accept lower classification accuracy for a class that occurs rarely in the scene in order to preserve the accuracy over extensive areas
 Alternative methods for separating two spectrally similar classes → GIS data, visual interpretation, field check, multi-temporal or spatial pattern recognition procedures, …
Unsupervised classification
 Unsupervised vs. supervised
• Supervised → define useful information categories → examine their spectral separability
• Unsupervised → determine spectral classes → define their informational utility
 Illustration: Fig 7.51
• Advantage: the spectral classes are found
automatically (e.g. stressed class)
Unsupervised classification (cont.)
 Clustering algorithms
• K-means
Locate centers of seed clusters → assign all pixels to the cluster with the closest mean vector → revise the mean vectors for each cluster → reclassify the image → iterate until there is no significant change (see the sketch at the end of this list)
• Iterative self-organizing data analysis (ISODATA)
Permits the number of clusters to change from one iteration to the next by
 Merging: distance < some predefined minimum distance
 Splitting: standard deviation > some predefined maximum value
 Deleting: pixel number in a cluster < some specified minimum number
• Texture/roughness
Texture: the multidimensional variance observed in a moving window
passed through the image
Moving window → variance → threshold → smooth/rough
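The clustering loop sketched below is plain K-means in NumPy; ISODATA would add the merge/split/delete rules between iterations. The image layout, cluster count, and random seeding are assumptions for illustration, not a specific package's implementation:

```python
import numpy as np

def kmeans_unsupervised(image, n_clusters=6, max_iter=20, seed=0):
    """Seed cluster centers, assign every pixel to the nearest mean vector,
    revise the means, and iterate until assignments stop changing."""
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), n_clusters, replace=False)]
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(max_iter):
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                                  # no significant change
        labels = new_labels
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)  # revise the mean vector
    return labels.reshape(image.shape[:2]), centers
```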
Unsupervised classification (cont.)
 Poor representation
• Roads and other linear features → not smooth
• Solution → hybrid classification
 Table 7.2
• Outcome 1: ideal result
• Outcome 2: subclasses → classes
• Outcome 3: a more troublesome result
The information categories are spectrally similar and cannot be differentiated in the given data set
Hybrid classification
 Unsupervised training areas
• Image sub-areas chosen intentionally to be quite different from
supervised training areas
 Supervised → regions of homogeneous cover type
 Unsupervised → contain numerous cover types at various locations throughout the scene
 To identify the spectral classes
 Guided clustering
• Delineate training areas for class X
• Cluster all class X training pixels into spectral subclasses X1, X2, …
• Merge or delete class X signatures
• Repeat for all classes
• Examine all class signatures and merge/delete signatures
• Perform maximum likelihood classification
• Aggregate spectral subclasses
Classification of mixed pixels
 Mixed pixels
• IFOV includes more than one type/feature
 Low resolution sensors → more serious
 Subpixel classification
• Spectral mixture analysis
 A deterministic method (not a statistical method)
 Pure reference spectral signatures
 Measured in the lab, in the field, or from the image itself
 Endmembers
 Basic assumption
 The spectral variation in an image is caused by mixtures of a limited number of surface materials
 Linear mixture → satisfies two basic conditions simultaneously
 The sum of the fractional proportions of all potential endmembers: $\sum_i F_i = 1$
 The observed $DN_\lambda$ for each pixel in band $\lambda$:
   $DN_\lambda = F_1\,DN_{\lambda,1} + F_2\,DN_{\lambda,2} + \dots + F_N\,DN_{\lambda,N} + E_\lambda$
 B bands → B equations
 Adding the sum-to-one constraint → B + 1 equations → solve for B + 1 endmember fractions
 Fig 7.52: example of a linear spectral mixture analysis
 Drawback: multiple scattering → requires a nonlinear mixture model
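A sketch of the linear unmixing step for one pixel: the sum-to-one condition is appended as an extra equation, so B bands plus the constraint give B + 1 equations, solved here by ordinary least squares (the endmember spectra below are made up for illustration; real work would typically use constrained solvers and library routines):

```python
import numpy as np

def unmix_pixel(dn, endmembers):
    """Solve DN_lambda = sum_i F_i * DN_{lambda,i} + E_lambda for the fractions F_i,
    with sum(F_i) = 1 added as an extra row of the system (B bands -> B + 1 equations)."""
    B, N = endmembers.shape                        # endmembers: (bands, n_endmembers)
    A = np.vstack([endmembers, np.ones((1, N))])   # append the sum-to-one equation
    b = np.append(np.asarray(dn, dtype=float), 1.0)
    fractions, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = dn - endmembers @ fractions         # per-band error term E_lambda
    return fractions, residual

# Hypothetical 3-band pixel mixed from two endmembers (60 % / 40 %)
em = np.array([[0.05, 0.40],
               [0.07, 0.45],
               [0.60, 0.30]])
pixel = 0.6 * em[:, 0] + 0.4 * em[:, 1]
fractions, residual = unmix_pixel(pixel, em)       # fractions ~ [0.6, 0.4]
```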
Classification of mixed pixels (cont.)
 Subpixel classification (cont.)
• Fuzzy classification
A given pixel may have partial membership in more than
one category
Fuzzy clustering
 Conceptually similar to the K-means unsupervised classification approach
 Hard boundaries → fuzzy regions
 Membership grade
Fuzzy supervised classification
 A classified pixel is assigned a membership grade with respect to its
membership in each information class
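A minimal illustration of membership grades in the style of fuzzy c-means; the inverse-distance weighting and the fuzziness exponent m are assumptions, not the textbook's exact formulation:

```python
import numpy as np

def fuzzy_memberships(pixel, class_means, m=2.0):
    """Membership grade of one pixel in each class: inverse-distance weights
    normalised to sum to 1 (m controls how soft the boundaries are)."""
    d = np.linalg.norm(class_means - pixel, axis=1)
    d = np.maximum(d, 1e-12)                # avoid division by zero on exact hits
    w = d ** (-2.0 / (m - 1.0))
    return w / w.sum()

# A pixel halfway between two class means gets roughly 0.5 / 0.5 membership
means = np.array([[60.0, 80.0], [120.0, 110.0]])
grades = fuzzy_memberships(np.array([90.0, 95.0]), means)
```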
The output stage
 Image classification → output products → end users
• Graphic products
Plate 30
• Tabular data
• Digital information files
Postclassification smoothing
 Majority filter
• Fig 7.53
(a) original classification → salt-and-pepper appearance
(b) 3 × 3 pixel-majority filter
(c) 5 × 5 pixel-majority filter (see the sketch below)
 Spatial pattern recognition
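A possible implementation of the pixel-majority filter using SciPy's generic_filter (window size 3 or 5 as in Fig 7.53); it assumes non-negative integer class labels and only sketches the idea, not ENVI's implementation:

```python
import numpy as np
from scipy.ndimage import generic_filter

def majority_filter(class_image, size=3):
    """Replace each pixel's label with the most frequent label in the
    size x size window around it, removing salt-and-pepper speckle."""
    def mode(window):
        return np.bincount(window.astype(int)).argmax()
    return generic_filter(class_image, mode, size=size, mode="nearest")

# smoothed3 = majority_filter(labels, size=3)
# smoothed5 = majority_filter(labels, size=5)   # stronger smoothing
```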
Classification accuracy assessment
 Significance
• A classification is not complete until its accuracy is assessed
 Classification error matrix
• Error matrix (confusion matrix, contingency table)
 Table 7.3
 Omission (exclusion) errors
 Non-diagonal column elements (e.g. 16 sand pixels were omitted)
 Commission (inclusion) errors
 Non-diagonal row elements (e.g. 38 urban pixels + 79 hay pixels were included in corn)
 Overall accuracy
 Producer’s accuracy
 Indicates how well training-set pixels of the given cover type are classified
 User’s accuracy
 Indicates the probability that a pixel classified into a given category actually represents that category on the ground
 Training area accuracies are sometimes used in the literature as an indication of
overall accuracy. They should not be!
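The three accuracy measures follow directly from the error matrix. In this sketch the rows are assumed to be the classified categories and the columns the reference data, matching the omission (column) and commission (row) convention above:

```python
import numpy as np

def accuracy_from_error_matrix(matrix):
    """Overall, producer's, and user's accuracy from an error matrix whose
    rows are classified categories and columns are reference categories."""
    m = np.asarray(matrix, dtype=float)
    diag = np.diag(m)
    overall = diag.sum() / m.sum()
    producers = diag / m.sum(axis=0)   # column totals -> omission errors
    users = diag / m.sum(axis=1)       # row totals    -> commission errors
    return overall, producers, users
```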
Classification accuracy assessment (cont.)
 Sampling considerations
• Test area
Different from and more extensive than the training areas
Withhold some training areas for postclassification accuracy assessment
• Wall-to-wall comparison
Expensive
Defeats the whole purpose of remote sensing
• Random sampling
Collecting a large sample of randomly distributed points → too expensive and difficult
 e.g., about three quarters of Taiwan is covered by the Central Mountain Range
Only sample those pixels without influence of potential registration
error
 Several pixels away from field boundaries
Stratified random sampling
 Each land cover category → a stratum
Classification accuracy assessment (cont.)
 Sampling considerations (cont.)
• Sample unit
Individual pixels, clusters of pixels or polygons
• Sample number
General area: 50 samples per category
Large area or more than 12 categories: 75 – 100 samples
per category
Depends on the variability of each category
 Wetlands need more samples than open water
Classification accuracy assessment (cont.)
 Evaluating classification error matrices
• Table 7.4: error matrix (randomly sampled test)
Producer’s accuracy for forest (84%) > overall accuracy (65%) → good for classifying forest?!
User’s accuracy for forest is only 60%
Only good for classifying water
Classification accuracy assessment (cont.)
Tutorial: multispectral classification
 Read image
• File → Open Image File
 Subdirectory: envidata
 File: can_tmr.img
 RGB Color
 Bands 4, 3, and 2
• Review Image Colors
 False color infrared photograph
 Bright red areas → high infrared reflectance → healthy vegetation → under cultivation, or along rivers
 Slightly darker red areas → native vegetation → coniferous trees
 Several distinct geologic classes are also readily apparent, as is urbanization
• Cursor Location/Value
• Examine Spectral Plots
 Tools → Profiles → Z Profile (Spectrum)
 Note the relations between image color and spectral shape
 Pay attention to the location of the image bands in the spectral profile, marked by the red, green, and
blue bars in the plot
Tutorial: multispectral classification
(cont.)
 Unsupervised Classification
• Classification → Unsupervised → K-Means or IsoData
• K-Means
 Uses a cluster analysis approach which requires the analyst to select the number
of clusters to be located in the data, arbitrarily locates this number of cluster
centers, then iteratively repositions them until optimal spectral separability is
achieved
 Choose K-Means as the method, use all of the default values and click on OK
 Review the results contained in can_km.img.
 Experiment with different numbers of classes, change thresholds, standard
deviations, and maximum distance error values to determine their effect on the
classification.
• Isodata
 Calculates class means evenly distributed in the data space and then iteratively
clusters the remaining pixels using minimum distance techniques. Each
iteration recalculates means and reclassifies pixels with respect to the new
means
 Choose IsoData as the method, use all of the default values and click on OK, or review the results contained in can_iso.img.
Tutorial: multispectral classification
(cont.)
 Regions of Interest (ROI)
• Select Training Sets Using Regions of Interest (ROI)
• Restore Predefined ROIs
Choose Overlay → Region of Interest from the #1 Main Image menu bar
Choose File → Restore ROIs
File: CLASSES.ROI
• Create Your Own ROIs
Overlay → Region of Interest
Draw a polygon
Fix the polygon by clicking the right mouse button a second time
New Region
Edit
Tutorial: multispectral classification
(cont.)
 Supervised Classification
• Supervised classification requires that the user
select training areas for use as the basis for
classification
• Classification → Supervised → [method]
[method] is one of the supervised classification methods in
the pull-down menu (Parallelepiped, Minimum Distance,
Mahalanobis Distance, Maximum Likelihood, Spectral
Angle Mapper, Binary Encoding, or Neural Net). Use one of
the two methods below for selecting training areas, also
known as regions of interest (ROIs).
Tutorial: multispectral classification
(cont.)
 Classical Supervised Multispectral Classification
• Parallelepiped
 Uses a simple decision rule to classify multispectral data. The decision boundaries form
an n-dimensional parallelepiped in the image data space. The dimensions of the
parallelepiped are defined based upon a standard deviation threshold from the mean of
each selected class
 Pre-saved results are in the file can_pcls.img
 Perform your own classification using the CLASSES.ROI regions of interest
• Maximum Likelihood
 Assumes that the statistics for each class in each band are normally distributed
 Calculates the probability that a given pixel belongs to a specific class
 Unless a probability threshold is selected, all pixels are classified
 Each pixel is assigned to the class that has the highest probability
• Minimum Distance
 Uses the mean vectors of each ROI and calculates the Euclidean distance from each
unknown pixel to the mean vector for each class
• Mahalanobis Distance
 A direction sensitive distance classifier that uses statistics for each class
 Assumes all class covariances are equal and therefore is a faster method
Tutorial: multispectral classification
(cont.)
 Spectral Classification Methods
• Developed specifically for use on Hyperspectral data,
but provide an alternative/improved method for
classifying multispectral data
• The Endmember Collection Dialog
Spectral → Mapping Methods → Endmember Collection
(Classification → Endmember Collection)
Open File
 File: can_tmr.img
 Endmember Collection: Parallel dialog
Algorithm → [method]
 [method] represents: Parallelepiped, Minimum Distance, Mahalanobis Distance, Maximum Likelihood, Binary Encoding, and the Spectral Angle Mapper (SAM)
Tutorial: multispectral classification
(cont.)
 Spectral Classification Methods (cont.)
• Binary Encoding Classification
Encodes the data and endmember spectra into 0s and 1s based on
whether a band falls below or above the spectrum mean
An exclusive OR function is used to compare each encoded reference
spectrum with the encoded data spectra and a classification image is
produced
All pixels are classified to the endmember with the greatest number of matching bands unless the user specifies a minimum match threshold, in which case some pixels may be unclassified if they do not meet the criteria (see the sketch after this slide)
Algorithm → Binary Encoding
 Import → from ROI from Input File
 Select All Items
 Endmember Spectra
 Options → Plot Endmembers
 Apply
 Binary Encoding Parameters
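A NumPy sketch of the encode-and-match rule described above (not ENVI's internal code): each spectrum is coded 1 where a band lies above that spectrum's own mean, and a pixel goes to the endmember with the most agreeing bands, or stays unclassified below a minimum match threshold:

```python
import numpy as np

def binary_encoding_classify(pixels, endmembers, min_match=None):
    """pixels: (n_pixels, bands); endmembers: (n_endmembers, bands).
    Returns per-pixel endmember indices, or -1 where the match is too weak."""
    def encode(spectra):
        # 1 where the band lies above the spectrum's own mean, 0 where below
        return (spectra > spectra.mean(axis=-1, keepdims=True)).astype(np.uint8)
    p, e = encode(np.asarray(pixels, float)), encode(np.asarray(endmembers, float))
    # count bands where the encoded pixel and endmember agree (complement of XOR)
    matches = (p[:, None, :] == e[None, :, :]).sum(axis=2)
    labels = matches.argmax(axis=1)
    if min_match is not None:
        labels = np.where(matches.max(axis=1) >= min_match, labels, -1)
    return labels
```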
Tutorial: multispectral classification
(cont.)
 Spectral Classification Methods (cont.)
• Spectral Angle Mapper Classification
Uses the n-dimensional angle to match pixels to reference
spectra
Determines the spectral similarity between two spectra by
calculating the angle between the spectra, treating them as
vectors in a space with dimensionality equal to the number
of bands
Endmember Collection
Algorithm → Spectral Angle Mapper
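The spectral angle itself is simple to compute. A small sketch follows, with a hypothetical max_angle threshold standing in for the maximum angle parameter; it treats spectra as vectors in band space, as described above:

```python
import numpy as np

def spectral_angle(pixel, reference):
    """Angle (radians) between a pixel spectrum and a reference spectrum;
    smaller angles mean more similar shapes regardless of overall brightness."""
    cos = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def sam_classify(pixels, endmembers, max_angle=0.10):
    """Assign each pixel to the endmember with the smallest spectral angle,
    leaving it unclassified (-1) if that angle exceeds max_angle."""
    angles = np.array([[spectral_angle(p, e) for e in endmembers] for p in pixels])
    labels = angles.argmin(axis=1)
    return np.where(angles.min(axis=1) <= max_angle, labels, -1)
```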
Tutorial: multispectral classification
(cont.)
 Post Classification Processing
• Rule image values, by classification method, represent:
 Parallelepiped → Number of bands that satisfied the parallelepiped criteria
 Minimum Distance → Sum of the distances from the class means
 Maximum Likelihood → Probability of pixel belonging to class
 Mahalanobis Distance → Distances from the class means
 Binary Encoding → Binary match in percent
 Spectral Angle Mapper → Spectral angle in radians
Tools → Color Mapping → ENVI Color Tables
 Stretch Bottom and Stretch Top sliders
Cursor Location/Value
Classification → Post Classification → Rule Classifier
 File: can_tmr.sam
 Rule Image Classifier Tool
Tutorial: multispectral classification
(cont.)
 Post Classification Processing (cont.)
• Class Statistics
Classification → Post Classification → Class Statistics
Select All Items
• Confusion Matrix
Comparison of two classified images (the classification and the “truth”
image), or a classified image and ROIs
The truth image can be another classified image, or an image created
from actual ground truth measurements
• Classification → Post Classification → Confusion
Matrix → [method]
Using Ground Truth Image, or Using Ground Truth ROIs.
• Match Classes Parameters dialog
Tutorial: multispectral classification
(cont.)
 Post Classification Processing (cont.)
• Clump and Sieve
 For generalizing classification images, Sieve is usually run first to remove isolated pixels based on a size (number of pixels) threshold; Clump is then run to add spatial coherency to existing classes by combining adjacent, similarly classified areas (see the sketch at the end of this slide)
• Compare the pre-calculated results in the files can_sv.img
(sieve) and can_clmp.img (clump of the sieve result) to the
classified image can_pcls.img
• Classification → Post Classification → Sieve Classes
• Classification → Post Classification → Clump Classes
• Combine Classes
• Classification → Post Classification → Combine Classes
 File: can_sam.img
 Add Combination
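A minimal sieve sketch using scipy.ndimage (connected components per class, 4-connectivity, signed integer labels assumed); ENVI's Sieve and Clump tools differ in detail, so this only illustrates the size-threshold idea:

```python
import numpy as np
from scipy import ndimage

def sieve_classes(class_image, min_size=4):
    """Mark connected groups of same-class pixels smaller than min_size as
    unclassified (-1) so they can later be clumped into, or filled from,
    their neighbours."""
    out = class_image.astype(int)                  # work on a signed copy
    for c in np.unique(class_image):
        mask = class_image == c
        labeled, n = ndimage.label(mask)           # 4-connected components
        sizes = ndimage.sum(mask, labeled, index=np.arange(1, n + 1))
        small = np.isin(labeled, np.flatnonzero(sizes < min_size) + 1)
        out[small] = -1
    return out
```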
Tutorial: multispectral classification
(cont.)
 Post Classification Processing (cont.)
• Edit Class Colors
 Tools → Color Mapping → Class Color Mapping
 To make the changes permanent, select Options → Save Changes
• Overlay Classes
 Classification → Post Classification → Overlay Classes
 Select can_tmr.img band 3 for each RGB band
 Use can_comb.img as the classification input
• Interactive Classification Overlays
 Interactively toggle classes on and off as overlays on a displayed image, to edit
classes, get class statistics, merge classes, and edit class colors.
 Display band 4 of can_tmr.img
 Overlay → Classification
 Try the various options for assessing the classification under the Options menu
 Choose various options under the Edit menu to interactively change the
contents of specific classes
 File → Save Image As → [Device]
Tutorial: multispectral classification
(cont.)
 Post Classification Processing (cont.)
• Classes to Vector Layers
• Overlay → Vectors
 File: can_clmp.img
• File → Open Vector File → ENVI Vector File
 Files: can_v1.evf and can_v2.evf.
 Select All Layers
 Load Selected
• Classification → Post Classification → Classification to Vector
 Raster to Vector Input Band dialog.
 Choose the generalized image can_clmp.img
 Select Region #1 and Region #2 and enter the root name canrtv
 Load Selected at the bottom of the dialog.
 Load Vector
Edit → Edit Layer Properties
• Classification Keys Using Annotation
 Overlay → Annotation
 Object → Map Key
 Edit Map Key Items