Geospatial Data Mining at University of Texas at Dallas
Dr. Bhavani Thuraisingham (Computer Science)
Dr. Latifur Khan (Computer Science)
Dr. Fang Qiu (GIS)
Students
Shaofei Chen (GIS)
Mohammad Farhan (CS)
Shantnu Jain (GIS)
Lei Wang (CS)
Post Doc:
Dr. Chuanjun Li
This Research is Partly Funded by Raytheon
Outline
Case Study
- ASTER Dataset
- Technical Challenges
- Sketches
Process of Our Approach
- Pixel classification using SVM Classifiers
- Ontology Driven Mining
Pixel Merging
Output
Related Work
Future Work
Case Study: Dataset
ASTER (Advanced Spaceborne Thermal Emission and
Reflection Radiometer)
ASTER obtains high-resolution images of the Earth (15 to 90
meters per pixel) in 14 different wavelengths of the
electromagnetic spectrum, ranging from visible to thermal
infrared light.
ASTER data is used to create detailed maps of land surface
temperature, emissivity, reflectivity, and elevation.
Case Study: Dataset & Features
The remote sensing data used in this study is an ASTER image
acquired on 31 December 2005.
- Covers the northern part of Dallas, with Dallas-Fort Worth
International Airport located in the southwest of the image.
ASTER data has 14 channels from the visible through the thermal
infrared regions of the electromagnetic spectrum, providing
detailed information on surface temperature, emissivity,
reflectance, and elevation.
ASTER comprises the following three radiometers:
Visible and Near Infrared Radiometer (VNIR, bands 1
through 3) covers the wavelength range 0.56~0.86 μm.
Case Study: Dataset & Features
Short Wavelength Infrared Radiometer (SWIR, bands 4
through 9) covers the wavelength range 1.60~2.43 μm.
- Mid-infrared regions. Used to extract surface features.
Thermal Infrared Radiometer (TIR, bands 10 through 14)
covers the wavelength range 8.125~11.65 μm.
- Important when research focuses on heat, such as
identifying mineral resources and observing atmospheric
conditions by taking advantage of their thermal infrared
characteristics.
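The band-to-radiometer layout above can be written down as a small lookup. This is only an illustrative sketch: the band ranges come from the slides, while the names `RADIOMETERS`, `radiometer_for_band`, and `split_features` are hypothetical and not part of the original work.

```python
# Band ranges as listed on the slides; all names here are illustrative.
RADIOMETERS = {
    "VNIR": range(1, 4),    # bands 1-3
    "SWIR": range(4, 10),   # bands 4-9
    "TIR": range(10, 15),   # bands 10-14
}


def radiometer_for_band(band):
    """Return the radiometer name that records the given ASTER band."""
    for name, bands in RADIOMETERS.items():
        if band in bands:
            return name
    raise ValueError(f"ASTER has bands 1-14, got {band}")


def split_features(pixel):
    """Split a 14-value pixel feature vector into per-radiometer groups."""
    if len(pixel) != 14:
        raise ValueError("expected 14 features per pixel")
    return {
        name: [pixel[band - 1] for band in bands]
        for name, bands in RADIOMETERS.items()
    }


# Usage: a dummy pixel whose feature values equal their band numbers.
groups = split_features(list(range(1, 15)))
print(groups)
```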
ASTER Dataset: Technical Challenges
Testing will be done based on pixels
Goal: Region-based classification and identification of
high-level concepts
Solution
- Group adjacent pixels that belong to the same class
- Identify high-level concepts using ontology-based mining
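Grouping adjacent pixels of the same class can be sketched as connected-component labeling over the grid of per-pixel class labels. The sketch below assumes 4-connectivity (up, down, left, right) and a small made-up grid; the slides do not say which neighborhood is used, and `group_pixels` is a hypothetical name.

```python
from collections import deque


def group_pixels(labels):
    """Group 4-connected pixels that share a class label.

    Returns a list of (class_label, [(row, col), ...]) regions.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            cls = labels[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            members = []
            # Breadth-first flood fill from the first unseen pixel.
            while queue:
                y, x = queue.popleft()
                members.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((cls, members))
    return regions


# Made-up 3 x 3 grid of classified pixels, for illustration only.
grid = [
    ["Water", "Water", "Grass"],
    ["Water", "Grass", "Grass"],
    ["Road", "Road", "Grass"],
]
regions = group_pixels(grid)
for cls, members in regions:
    print(cls, len(members))
```

Each pixel is enqueued once, so the grouping itself is linear in the number of pixels.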
Sketches: Process of Our Approach
[Flow diagram. Training data, test data, and all pixel data are
taken from the ASTER image; feature extraction yields 14
features per pixel for each. Remaining labels: Classifier
Training, SVM Classifiers, Validation, Classification, Pixel
Grouping, High Level Concepts.]
Process of Our Approach
[Diagram labels: Training Image Pixels, Testing Image Pixels,
SVM Classifier, Classified Pixels, Pixel Merging, Concepts and
Classes, Ontology Driven Mining, High Level Concepts]
SVM Classifiers: Atomic Concepts
Different Class Distribution of Training and Test Sets
Classes          Train set   Test set
Water            1175        1898
Barren Lands     1005        1617
Grass             952        1331
Trees             887        1479
Buildings        1041         768
Road              435         648
House            1584        1364
# of instances   7079        9105
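To see how differently the classes are distributed across the two sets, the counts can be turned into per-class shares. The counts below are copied from the table; the helper name `shares` is an illustrative choice.

```python
# Per-class pixel counts copied from the table on this slide.
train = {"Water": 1175, "Barren Lands": 1005, "Grass": 952, "Trees": 887,
         "Buildings": 1041, "Road": 435, "House": 1584}
test = {"Water": 1898, "Barren Lands": 1617, "Grass": 1331, "Trees": 1479,
        "Buildings": 768, "Road": 648, "House": 1364}


def shares(counts):
    """Convert absolute counts into fractions of the set total."""
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


train_share = shares(train)
test_share = shares(test)

# Print each class's share in both sets and the difference between them.
for cls in train:
    diff = test_share[cls] - train_share[cls]
    print(f"{cls:13s} train {train_share[cls]:.3f}  "
          f"test {test_share[cls]:.3f}  diff {diff:+.3f}")
```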
Ontology-Driven Mining
- The ontology will be represented as a directed acyclic graph (DAG). Each
node in the DAG represents a concept.
- Interrelationships are represented by labeled arcs/links. Various kinds of
interrelationships are used to create an ontology, such as specialization
(Is-a), instantiation (Instance-of), and component membership (Part-of).
[Example diagram. Concepts: Urban, Residential, Apartment, Single Family
Home, Multi-family Home. Arc labels: IS-A, Part-of.]
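A DAG with labeled arcs can be sketched with a plain adjacency list. In the sketch below the class name `Ontology`, its methods, and the particular arcs added in the usage lines are assumptions for illustration; the slide names the concepts and relation types but the transcript does not preserve which arc connects which pair.

```python
from collections import defaultdict

RELATIONS = {"Is-a", "Instance-of", "Part-of"}


class Ontology:
    """Concepts as nodes, labeled arcs as (relation, target) pairs."""

    def __init__(self):
        self.arcs = defaultdict(list)  # source -> [(relation, target)]

    def _reachable(self, start, goal):
        """Depth-first search: can `goal` be reached from `start`?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(target for _, target in self.arcs.get(node, []))
        return False

    def add_arc(self, source, relation, target):
        """Add `source --relation--> target`, refusing arcs that form a cycle."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        if source == target or self._reachable(target, source):
            raise ValueError("arc would create a cycle")
        self.arcs[source].append((relation, target))

    def related(self, concept, relation):
        """Targets reachable from `concept` over one arc with this label."""
        return [t for rel, t in self.arcs.get(concept, []) if rel == relation]


# Assumed arcs, for illustration only.
onto = Ontology()
onto.add_arc("Apartment", "Is-a", "Residential")
onto.add_arc("Single Family Home", "Is-a", "Residential")
onto.add_arc("Residential", "Part-of", "Urban")
print(onto.related("Apartment", "Is-a"))
```

Checking reachability before each insertion keeps the graph acyclic by construction, at the cost of one traversal per added arc.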
Ontology-Driven Mining
We will develop domain-dependent ontologies
- Provide for the specification of fine-grained concepts
- The concept “Residential Area” can be further categorized
into the concepts “House”, “Grass”, “Tree”, etc.
Generic ontologies provide coarser-grained concepts
Ontology Driven Mining
Target Area
- Urban Area: Building, Road
- Residential Area: Tree, House, Grass
- Open Area: Water, Barren Land
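Once the ontology is fixed, mapping a classified pixel to its high-level concept is a dictionary lookup. The grouping of classes under "Urban Area" and "Open Area" below is inferred from the order of labels on the slide and should be read as an assumption; only the "Residential Area" grouping is stated explicitly in the text. The names `ONTOLOGY`, `PARENT`, and `concept_histogram` are hypothetical.

```python
from collections import Counter

# High-level concept -> atomic classes (grouping partly assumed, see above).
ONTOLOGY = {
    "Urban Area": ["Building", "Road"],
    "Residential Area": ["Tree", "House", "Grass"],
    "Open Area": ["Water", "Barren Land"],
}

# Invert the mapping so each atomic class points at its parent concept.
PARENT = {cls: concept for concept, classes in ONTOLOGY.items() for cls in classes}


def concept_histogram(classified_pixels):
    """Count how many pixels fall under each high-level concept."""
    return Counter(PARENT[cls] for cls in classified_pixels)


# Usage with a made-up list of SVM outputs.
hist = concept_histogram(["House", "Grass", "Road", "Water", "Tree"])
print(hist)
```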
Challenges
Region growing
- Find regions of the same class
- Find neighboring regions
- Merge neighboring regions
- Not scalable: regions are irregular and of different sizes,
so it is hard to track boundaries or neighboring regions
Pixel merging
- Only neighboring pixels are considered
- Pixels are converted into Concepts
- Linear time
Pixels Merging
Complexity
There are two iterations:
- The first iteration converts signature classes into Concepts
- The second iteration converts remaining classes and isolated
concepts into Dominating classes
Each pixel takes O(1) time
The target area takes O(n) time, where n is the number of pixels in
the target area
Example (next slide):
- Signature classes: c1, c2, c3
- Non-signature class: c4
- Concepts: C1, C2, C3
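The O(n) bound follows from each pixel inspecting only a fixed number of neighbors in each of the two passes. The sketch below illustrates that structure under strong assumptions: a one-to-one table from signature class to Concept, 4-connected neighbors, and a majority vote for the second pass. The slides do not spell out the actual conversion rule, and this table does not reproduce the worked example on the next slide (where a c1 pixel becomes C2), so treat it as a shape of the computation rather than the authors' algorithm.

```python
from collections import Counter

# Assumed one-to-one mapping; c4 is the non-signature class.
SIGNATURE_TO_CONCEPT = {"c1": "C1", "c2": "C2", "c3": "C3"}


def neighbors(r, c, rows, cols):
    """Yield the in-bounds 4-connected neighbors of (r, c)."""
    for y, x in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= y < rows and 0 <= x < cols:
            yield y, x


def merge_pixels(labels):
    """Two linear passes over the grid of per-pixel class labels."""
    rows, cols = len(labels), len(labels[0])
    # Pass 1: signature classes become Concepts; other pixels stay None.
    first = [[SIGNATURE_TO_CONCEPT.get(cls) for cls in row] for row in labels]
    # Pass 2: unconverted pixels and isolated Concepts take the Concept
    # that dominates among their neighbors (at most 4 lookups per pixel).
    merged = [row[:] for row in first]
    for r in range(rows):
        for c in range(cols):
            around = [first[y][x] for y, x in neighbors(r, c, rows, cols)
                      if first[y][x] is not None]
            current = first[r][c]
            if current is not None and current in around:
                continue  # already agrees with at least one neighbor
            if around:
                merged[r][c] = Counter(around).most_common(1)[0][0]
            elif current is None:
                merged[r][c] = labels[r][c]  # nothing nearby to merge with
    return merged


# Made-up 3 x 3 example: one non-signature pixel and one isolated c3.
grid = [
    ["c2", "c2", "c2"],
    ["c2", "c4", "c2"],
    ["c2", "c3", "c2"],
]
merged = merge_pixels(grid)
for row in merged:
    print(" ".join(row))
```

The second pass votes on the results of the first pass rather than on its own partial output, so the outcome does not depend on scan order.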
Pixels Merging
[Figure: six snapshots (a) to (f) of the same 6 x 4 block of pixels.
Lowercase labels are classes, uppercase labels are Concepts.]
(a)             (b)             (c)
c1 c1 c2 c2     C2 c1 c2 c2     C2 c1 c2 c2
c1 c3 c2 c2     c1 c3 c2 c2     C2 c3 c2 c2
c2 c3 c2 c2     c2 c3 c2 c2     c2 c3 c2 c2
c3 c3 c2 c3     c3 c3 c2 c3     c3 c3 c2 c3
c3 c4 c3 c3     c3 c4 c3 c3     c3 c4 c3 c3
c4 c4 c3 c3     c4 c4 c3 c3     c4 c4 c3 c3
(d)             (e)             (f)
C2 c1 c2 c2     C2 c1 c2 c2     C2 c1 c2 c2
C2 c3 c2 c2     C2 c3 c2 c2     C2 c3 c2 c2
C2 c3 c2 c2     C2 c3 c2 c2     C2 c3 c2 c2
c3 c3 c2 c3     C3 c3 c2 c3     C3 c3 c2 c3
c3 c4 c3 c3     c3 c4 c3 c3     C3 c4 c3 c3
c4 c4 c3 c3     c4 c4 c3 c3     c4 c4 c3 c3
Implementation
Software:
- ArcGIS 9.1 software.
- For programming, we use Visual Basic 6.0 embedded in the
software.
Output:
Output
Related Work
Classification (SVM)
Melgani, F. and Bruzzone, L. Classification of
hyperspectral remote-sensing images with support
vector machines.
Zhu, G. and Blumberg, D. G. (2002). Classification
using ASTER data and SVM algorithms - The case
study of Beer Sheva, Israel.
Huang, C., Davis, L. S. and Townshend, J. R. G. (2002). An
assessment of support vector machines for land
cover classification.
Future Work
Develop a full-fledged prototype (by January 31, 2007)
Generate rules automatically (by June 30, 2007)
- RIPPER (semi-automatic)
- Association mining