G07 Final Presentation Spatial Semi

Spatial Semi-Supervised Image Classification
Stuart Ness
G07 - Csci 8701 Final Project
1
Outline
 Introduction – Traditional Image Classification
 Motivation
 Problem Definition
 Key Concepts
 Assumptions
 Contributions
 Future Work
2
Introduction – Traditional Image Classification
 The Classification Problem
 How would you begin to classify this data given the following information?
− The classes are:
Building = 1
Forest = 2
???? = 3
Sand = 4
Water = 5
Grass = 6
3
Introduction: Supervised
− The resulting classifier is:
Building = 1 = Red and Orange
Forest = 2 = Green
Sand = 4 = Aqua
Water = 5 = Blue
Grass = 6 = Yellow
 Requires extensive domain knowledge
4
Introduction: Unsupervised
 Provide the data
 Provide a method for clustering
 Create Groups
− Group ‘A’ = Red
− Group ‘B’ = Yellow
− Group ‘C’ = Orange
− Group ‘D’ = Blue
− Group ‘E’ = Aqua
− Group ‘F’ = Green
− Group ‘G’ = Purple
 A domain expert must classify each group
5
Motivation
 Problems with Traditional Methods
− Supervised requires extensive domain knowledge
− Supervised may create bias due to the selection of labeled points
− Unsupervised may not have the correct model specified
− Computationally expensive due to no initial estimates
 The project goal is to identify existing work on semi-supervised learning that may be applied to a spatial context
6
Problem Definition: Semi-Supervised Learning
 Given
− Set of Labeled Data (Supervised)
− Set of Unlabeled Data (Unsupervised)
 Find
− Fast and accurate method for classifying data
 Objectives
− Speed
− Little need for Domain Expert Data
 Constraints
7
Key Concepts
 Semi-supervised learning has been studied in the textual domain
− Spatial Significance
 Semi-Supervised Process (typical)
− Select data points (labeled and unlabeled)
− Create an initial cluster from the labeled data points and/or a probability function
− Cluster the data samples to create the classifier
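The typical process above can be sketched as a seeded, hard-assignment clustering loop: the labeled points fix the initial class means, then the unlabeled points are repeatedly assigned to the nearest mean and the means are re-estimated. This is only a minimal sketch, not the method from the presentation; the function name `seeded_cluster`, the one-dimensional pixel values, and the fixed iteration count are illustrative assumptions.

```python
def seeded_cluster(labeled, unlabeled, iterations=10):
    """Hard-assignment clustering seeded by labeled points (illustrative only).

    labeled:   list of (value, class_id) pairs
    unlabeled: list of values
    Returns (means, assignments); assignments[i] is the class of unlabeled[i].
    """
    # Step 1: initial class means come from the labeled points only.
    classes = sorted({c for _, c in labeled})
    means = {}
    for c in classes:
        vals = [v for v, cc in labeled if cc == c]
        means[c] = sum(vals) / len(vals)

    assignments = []
    for _ in range(iterations):
        # Step 2: assign each unlabeled point to the nearest class mean.
        assignments = [min(classes, key=lambda c: abs(v - means[c]))
                       for v in unlabeled]
        # Step 3: re-estimate each mean from labeled plus assigned points.
        for c in classes:
            vals = [v for v, cc in labeled if cc == c]
            vals += [v for v, a in zip(unlabeled, assignments) if a == c]
            means[c] = sum(vals) / len(vals)
    return means, assignments


# Hypothetical data: class 5 = Water (dark pixels), class 4 = Sand (bright pixels).
labeled = [(0.10, 5), (0.90, 4)]
unlabeled = [0.15, 0.20, 0.80, 0.85, 0.05]
means, assignments = seeded_cluster(labeled, unlabeled)
print(assignments)
```

Only two labeled pixels are needed here because the unlabeled pixels refine the class means, which is the appeal of the semi-supervised setting.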
8
Key Concepts: Extensions
 Pair-wise relation Co-Training
Same Land Types
Different Land Types
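One possible reading of the pair-wise relations, offered here as an assumption rather than the presenter's definition, is that "same land type" pairs should end up in one class and "different land type" pairs should not. The sketch below counts how many such pairs a candidate labeling breaks; `count_violations` and the pixel names are hypothetical.

```python
def count_violations(assignment, same_pairs, different_pairs):
    """Count pairwise relations that a candidate labeling breaks.

    assignment:      dict mapping a pixel id to its class id
    same_pairs:      pairs of pixel ids believed to share a land type
    different_pairs: pairs of pixel ids believed to differ in land type
    """
    violations = 0
    for a, b in same_pairs:
        if assignment[a] != assignment[b]:
            violations += 1
    for a, b in different_pairs:
        if assignment[a] == assignment[b]:
            violations += 1
    return violations


# Hypothetical labeling using the class numbers from the earlier slide.
assignment = {"p1": 2, "p2": 2, "p3": 5, "p4": 6}
same_pairs = [("p1", "p2"), ("p3", "p4")]
different_pairs = [("p1", "p3")]
violations = count_violations(assignment, same_pairs, different_pairs)
print(violations)
```

A clustering step could use such a count as a penalty when choosing between candidate assignments.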
9
Key Concepts: Extensions
 Markov Random Fields
− General Classification
− Image from http://www.etro.vub.ac.be/Research/IRIS/Research/MVISION/MRF%20models.htm
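For reference, a common way to write MRF-based classification is as a maximum a posteriori labeling under a Potts smoothness prior. The slide gives no formula, so this form and its symbols are assumptions: $y_i$ is the observed value of pixel $i$, $x_i$ its class label, $\mathcal{N}$ the set of neighboring pixel pairs, and $\beta \ge 0$ a smoothing weight.

```latex
\hat{x} = \arg\max_{x} \left[ \sum_{i} \log p(y_i \mid x_i)
          \;-\; \beta \sum_{(i,j) \in \mathcal{N}} \mathbf{1}[x_i \neq x_j] \right]
```

A larger $\beta$ favors labelings in which neighboring pixels share a class.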
10
Key Concepts: Extensions
 Neighborhood EM
− Include information from surrounding areas
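A rough sketch of how surrounding areas can enter the E-step: each pixel's class probability is the usual EM term multiplied by a factor that grows with the neighbors' current support for that class. The one-dimensional row of pixels, the Gaussian class models, the weight `beta`, and all names below are illustrative assumptions, not details from the presentation.

```python
import math


def gaussian(x, mean, std):
    """Gaussian density used as the per-class likelihood."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def neighborhood_e_step(values, post, means, stds, priors, beta=1.0):
    """One spatially weighted E-step over a 1-D row of pixels.

    post[i][c] is the current probability that pixel i belongs to class c.
    With beta = 0 this reduces to the ordinary (non-spatial) E-step.
    """
    n, k = len(values), len(means)
    new_post = []
    for i in range(n):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        scores = []
        for c in range(k):
            # Support for class c from the adjacent pixels.
            support = sum(post[j][c] for j in neighbors)
            likelihood = priors[c] * gaussian(values[i], means[c], stds[c])
            scores.append(likelihood * math.exp(beta * support))
        total = sum(scores)
        new_post.append([s / total for s in scores])
    return new_post


# Hypothetical row: the middle pixel is ambiguous, its neighbors are clearly class 0.
values = [0.1, 0.2, 0.55, 0.2, 0.1]
means, stds, priors = [0.2, 0.8], [0.2, 0.2], [0.5, 0.5]
uniform = [[0.5, 0.5] for _ in values]
plain = neighborhood_e_step(values, uniform, means, stds, priors, beta=0.0)
smoothed = neighborhood_e_step(values, plain, means, stds, priors, beta=1.0)
print(plain[2], smoothed[2])
```

In this toy row the middle pixel leans toward class 1 on its own value alone, but the spatial term pulls it toward class 0, the class of its neighbors.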
11
Key Concepts: Extensions
 Hybrid EM
− Attempt at improving efficiency
− Reduces the number of iterations compared with neighborhood EM
− Deals with spatial data, unlike normal EM
− Uses traditional EM unless the expectation decreases, then uses neighborhood EM
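The switching rule in the last bullet can be sketched as a small driver loop. It takes the two kinds of step as callables, tries the cheaper traditional step first, and falls back to the neighborhood step only when the score would drop. The names `hybrid_em`, `plain_step`, `neighborhood_step`, and `score` are hypothetical, and the toy functions in the demo are stand-ins for real EM updates and a real expected log-likelihood.

```python
def hybrid_em(state, plain_step, neighborhood_step, score, max_iter=50, tol=1e-6):
    """Run plain steps, switching to a neighborhood step when the score decreases.

    Returns the final state and a list recording which step was used each iteration.
    """
    history = []
    current = score(state)
    for _ in range(max_iter):
        candidate, used = plain_step(state), "plain"
        new_score = score(candidate)
        if new_score < current:
            # The plain step made things worse, so use the spatial step instead.
            candidate, used = neighborhood_step(state), "neighborhood"
            new_score = score(candidate)
        history.append(used)
        converged = abs(new_score - current) < tol
        state, current = candidate, new_score
        if converged:
            break
    return state, history


# Toy stand-ins: the score peaks at 3.0, the plain step overshoots,
# and the neighborhood step moves halfway toward the peak.
final_state, history = hybrid_em(
    0.0,
    plain_step=lambda x: x + 2.5,
    neighborhood_step=lambda x: x + (3.0 - x) / 2,
    score=lambda x: -(x - 3.0) ** 2,
)
print(final_state, history[:3])
```

The `history` list makes it easy to see how often the more expensive neighborhood step was actually needed.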
12
Assumptions
 Unlabeled Samples are Inexpensive
− Not Guaranteed
− Unlabeled samples may not belong to a labeled class (Purple Class – Snow) and may require extra processing to examine
− Randomly chosen unlabeled samples eliminate bias, but are there benefits to using a set of randomly chosen clusters of points?
 Local Maximum from Hill Climbing is sufficient
13
Contributions
 Provide a brief summary of semi-supervised methods that pertain to the spatial domain
 Identify problems of existing semi-supervised methods
− Unlabeled Samples
− Local Maximum
 Identify extensions from textual domain which could be applied to a spatial context
− Co-training & Neighborhood EM
− Markov Random Fields
− Hybrid EM
Future Work
 Deal with the problems of randomly sampled unlabeled data
− Random Sample
− Random Cluster Sample
− Choosing samples from known classes
 Improve Algorithm Efficiency
 Implement a non-hill-climbing approach for finding the global maximum
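The first two sampling options listed above can be contrasted with a short sketch: a random sample picks individual pixels anywhere in the image, while a random cluster sample picks a few small patches and keeps every pixel inside them. The image size, patch size, and function names are illustrative assumptions.

```python
import random


def random_sample(width, height, n, rng):
    """Pick n distinct pixel coordinates uniformly at random."""
    pixels = [(x, y) for y in range(height) for x in range(width)]
    return rng.sample(pixels, n)


def random_cluster_sample(width, height, n_clusters, block, rng):
    """Pick n_clusters random block-by-block patches and return all their pixels."""
    chosen = set()
    for _ in range(n_clusters):
        # Top-left corner chosen so the whole patch stays inside the image.
        x0 = rng.randrange(width - block + 1)
        y0 = rng.randrange(height - block + 1)
        for dy in range(block):
            for dx in range(block):
                chosen.add((x0 + dx, y0 + dy))
    return sorted(chosen)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
points = random_sample(20, 20, 18, rng)
patches = random_cluster_sample(20, 20, 2, 3, rng)
print(len(points), len(patches))
```

Patches may overlap, so the cluster sample can contain fewer pixels than `n_clusters * block * block`.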
15
Conclusion
 Semi-supervised learning is fairly well developed.
 Minimal work has been done to implement the “spatial” features of the method, although the background is ready.
 Selecting unlabeled samples, choosing the correct model, and local maxima are problematic.
16