G07 Final Presentation Spatial Semi
Spatial Semi-Supervised Image Classification
Stuart Ness
G07 - Csci 8701 Final Project
1
Outline
Introduction – Traditional Image Classification
Motivation
Problem Definition
Key Concepts
Assumptions
Contributions
Future Work
2
Introduction – Traditional Image Classification
The Classification Problem
How would you begin to classify this data given the following information?
− The classes are:
Building = 1
Forest = 2
???? = 3
Sand = 4
Water = 5
Grass = 6
3
Introduction: Supervised
− The resulting classifier is:
Building = 1 = Red and Orange
Forest = 2 = Green
Sand = 4 = Aqua
Water = 5 = Blue
Grass = 6 = Yellow
Requires extensive domain knowledge
4
Introduction: Unsupervised
Provide the data
Provide a method for clustering
Create Groups
− Group ‘A’ = Red
− Group ‘B’ = Yellow
− Group ‘C’ = Orange
− Group ‘D’ = Blue
− Group ‘E’ = Aqua
− Group ‘F’ = Green
− Group ‘G’ = Purple
Domain Expert must classify each group
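The unsupervised workflow above can be illustrated with a minimal k-means sketch. The slides do not say which clustering method was used, so k-means is an assumption here, as are the two-band pixel values and the `expert_names` mapping, which stands in for the domain expert's final labeling step.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids

# Hypothetical two-band pixel values (not taken from the slides).
pixels = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9), (0.2, 0.1)]
groups, centers = kmeans(pixels, k=2)

# Clustering only yields anonymous groups; a domain expert supplies the names.
expert_names = {0: "Water", 1: "Sand"}  # hypothetical mapping
labels = [expert_names[g] for g in groups]
```

The point of the sketch is the last step: the algorithm never sees class names, so the groups are meaningless until someone with domain knowledge inspects and names them.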
5
Motivation
Problems with Traditional Methods
− Supervised requires extensive domain knowledge
− Supervised may create bias due to the selection of labeled points
− Unsupervised may not have the correct model specified
− Computationally expensive because no initial estimates are available
The project goal is to identify work on semi-supervised learning that may be applied in a spatial context
6
Problem Definition: Semi-Supervised Learning
Given
− Set of Labeled Data (Supervised)
− Set of Unlabeled Data (Unsupervised)
Find
− Fast and accurate method for classifying data
Objectives
− Speed
− Little need for Domain Expert Data
Constraints
7
Key Concepts
Semi-supervised learning has been studied in the textual domain
− Spatial Significance
Semi-Supervised Process (typical)
− Select Data Points (Labeled and Unlabeled)
− Create an initial Cluster with labeled data points and/or probability function
− Cluster Data Samples to create classifier
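The typical process above can be sketched as a seeded clustering loop. This is only one possible reading of the three steps: labeled points fix the initial cluster centers, and unlabeled points then refine them. The function names `seeded_cluster` and `classify`, the nearest-centroid rule, and the sample values are all assumptions made for illustration.

```python
import math

def seeded_cluster(labeled, unlabeled, iterations=10):
    """labeled: list of (point, class_name); unlabeled: list of points."""
    classes = sorted({c for _, c in labeled})

    def mean(pts):
        return tuple(sum(v) / len(pts) for v in zip(*pts))

    # Initial clusters come from the labeled points only.
    centroids = {c: mean([p for p, lab in labeled if lab == c]) for c in classes}
    guesses = []
    for _ in range(iterations):
        # Give every unlabeled point the class of its nearest centroid.
        guesses = [
            min(classes, key=lambda c: math.dist(p, centroids[c]))
            for p in unlabeled
        ]
        # Re-estimate centroids from labeled and newly assigned points.
        for c in classes:
            members = [p for p, lab in labeled if lab == c]
            members += [p for p, g in zip(unlabeled, guesses) if g == c]
            centroids[c] = mean(members)
    return guesses, centroids

def classify(point, centroids):
    """The resulting classifier: nearest learned centroid wins."""
    return min(centroids, key=lambda c: math.dist(point, centroids[c]))

# Hypothetical data: one labeled pixel per class plus three unlabeled pixels.
labeled = [((0.1, 0.1), "Water"), ((0.9, 0.9), "Sand")]
unlabeled = [(0.2, 0.15), (0.8, 0.95), (0.3, 0.2)]
guesses, centroids = seeded_cluster(labeled, unlabeled)
prediction = classify((0.7, 0.7), centroids)
```

Because the clusters start from labeled points, each one already carries a class name, which removes the after-the-fact naming step that purely unsupervised clustering needs.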
8
Key Concepts: Extensions
Pair-wise relation Co-Training
Same Land Types
Different Land Types
9
Key Concepts: Extensions
Markov Random Fields
− General Classification
− Image from http://www.etro.vub.ac.be/Research/IRIS/Research/MVISION/MRF%20models.htm
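As a rough illustration of how a Markov random field can express spatial context in classification, the sketch below scores a label grid with a Potts-style energy: a per-pixel cost plus a penalty for each pair of adjacent pixels with different labels. The slide does not specify a model, so the Potts form, the function name `potts_energy`, and the `beta` weight are assumptions.

```python
def potts_energy(labels, unary_cost, beta=1.0):
    """labels: 2-D list of class ids; unary_cost[r][c][k]: cost of class k at (r, c)."""
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            # Data term: how poorly the chosen class fits this pixel.
            energy += unary_cost[r][c][labels[r][c]]
            # Smoothness term: count each right and down neighbor pair once.
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                energy += beta
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                energy += beta
    return energy

# Hypothetical 2x2 grid with two classes and no data preference.
flat_cost = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
smooth = potts_energy([[0, 0], [0, 0]], flat_cost)
speckled = potts_energy([[0, 0], [0, 1]], flat_cost)
```

A lower energy means a more plausible labeling under this model, so the isolated pixel in the second grid is penalized relative to the uniform one.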
10
Key Concepts: Extensions
Neighborhood EM
− Include information from surrounding areas
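One way to include information from surrounding areas is to boost each class score at a pixel by how strongly the neighboring pixels currently support that class. The sketch below shows a single such update on a grid. It loosely follows the neighborhood EM idea, but the exact update rule, the 4-neighborhood, the name `neighborhood_update`, and the `beta` weight are assumptions rather than details from the slides.

```python
import math

def neighborhood_update(likelihood, posterior, beta=1.0):
    """One spatially weighted posterior update on a grid.

    likelihood[r][c][k]: prior-weighted likelihood of class k at pixel (r, c).
    posterior[r][c][k]: current class probabilities at pixel (r, c).
    """
    rows, cols, k = len(posterior), len(posterior[0]), len(posterior[0][0])
    updated = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Collect the 4-connected neighbors that fall inside the grid.
            neighbors = [
                (r + dr, c + dc)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            scores = []
            for cls in range(k):
                support = sum(posterior[nr][nc][cls] for nr, nc in neighbors)
                scores.append(likelihood[r][c][cls] * math.exp(beta * support))
            total = sum(scores)
            row.append([s / total for s in scores])
        updated.append(row)
    return updated

# Hypothetical 1x3 strip: the middle pixel is ambiguous on its own.
lik = [[[0.9, 0.1], [0.5, 0.5], [0.9, 0.1]]]
with_context = neighborhood_update(lik, lik, beta=1.0)
without_context = neighborhood_update(lik, lik, beta=0.0)
```

With `beta=0` the update reduces to the normalized likelihood, so the middle pixel stays ambiguous; with a positive `beta` its neighbors pull it toward the first class.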
11
Key Concepts: Extensions
Hybrid EM
− Attempt at improving efficiency
− Reduce number of iterations from neighborhood EM
− Handles spatial data, unlike standard EM
− Use traditional EM unless the expectation decreases; then use neighborhood EM
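The switching rule in the last bullet can be sketched as a control loop. This is only a skeleton: `em_step`, `nem_step`, and `objective` are hypothetical callables supplied by the caller, and treating "expectation" as a single objective value tracked across iterations is an assumption.

```python
def hybrid_em(state, em_step, nem_step, objective, max_iter=50, tol=1e-6):
    """Run em_step by default; fall back to nem_step when the objective drops."""
    best = objective(state)
    used = []
    for _ in range(max_iter):
        candidate = em_step(state)
        score = objective(candidate)
        step = "EM"
        if score < best:
            # The cheap step made things worse, so redo it with the spatial step.
            candidate = nem_step(state)
            score = objective(candidate)
            step = "NEM"
        used.append(step)
        converged = abs(score - best) < tol
        state, best = candidate, score
        if converged:
            break
    return state, used

# Toy stand-ins (not real EM): maximize -(x - 3)^2 starting from x = 0.
final, used = hybrid_em(
    0.0,
    em_step=lambda x: x + 2.5,             # cheap step that can overshoot
    nem_step=lambda x: x + (3.0 - x) / 2,  # more careful step
    objective=lambda x: -(x - 3.0) ** 2,
)
```

In the toy run the cheap step is accepted once and then rejected on every later iteration, which mirrors the intent of paying for the neighborhood step only when the ordinary one stops helping.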
12
Assumptions
Unlabeled Samples are Inexpensive
− Not Guaranteed
− Unlabeled samples may not belong to any labeled class (e.g., the Purple class: Snow) and may require extra processing to examine
− Randomly chosen unlabeled samples eliminate bias, but are there benefits to using a set of randomly chosen clusters of points?
Local Maximum from Hill Climbing is sufficient
13
Contributions
Provide a brief summary of semi-supervised methods that pertain to the spatial domain
Identify problems of existing semi-supervised methods
− Unlabeled Samples
− Local Maximum
Identify extensions from the textual domain which could be applied to a spatial context
− Co-training & Neighborhood EM
− Markov Random Fields
− Hybrid EM
Future Work
Deal with the problems of randomly sampled unlabeled data
− Random Sample
− Random Cluster Sample
− Choosing samples from known classes
Improve Algorithm Efficiency
Implement non-hill climbing approach for finding global maximum
15
Conclusion
Semi-supervised learning is fairly well developed.
Minimal work has been done to implement the “spatial” features of the methods, although the background is ready
Selecting unlabeled samples, choosing the correct model, and local maxima are problematic
16