Part A - James Madison University

Data Fusion and Data Mining
Julia I. Couto Ph.D.
James Madison University
A future battlefield scenario
– distributed system of sensors for intelligence gathering,
targeting and surveillance
• Platforms hosting operators, sensors and weapons
– tank gunners hosting operators, IR, video cameras
– soldiers equipped with IR, video camera, thermal sights
• Remote sensors
– airborne sensors
– satellite
– unmanned aircraft: IR, radar, video camera
• All interconnected electronically to enhance target identification,
surveillance, assessment and understanding of the battlefield
situation
Data Fusion
• Data fusion is a process dealing with
– association, correlation and combination of data
and information from multiple sensors or
sources to achieve a more accurate assessment of
observed entities in an observed scenario
– continuous refinement of estimates and
assessments
– evaluation of the need for additional sources, or
modification of the process itself, to enhance
situation awareness
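As a minimal illustration of how combining data from multiple sensors can give a more accurate assessment, the sketch below fuses independent range measurements by inverse-variance weighting. The sensor readings, variances and the function name `fuse_estimates` are hypothetical, and the assumption of unbiased sensors with independent Gaussian noise is made only for this example.

```python
def fuse_estimates(measurements):
    """Fuse independent scalar estimates given as (value, variance) pairs.

    Assumes unbiased sensors with independent Gaussian noise (an assumption
    made for this sketch). Returns (fused_value, fused_variance).
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    # Each sensor is weighted by the inverse of its variance.
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    fused_value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total
    fused_variance = 1.0 / total
    return fused_value, fused_variance


# Hypothetical range-to-target readings in metres: (value, variance)
radar = (1000.0, 4.0)
infrared = (1010.0, 16.0)
value, variance = fuse_estimates([radar, infrared])
print(f"fused range = {value:.1f} m, variance = {variance:.2f}")
```

Under these assumptions the fused variance is never larger than that of the best single sensor, which is one concrete sense in which fusion yields a "more accurate assessment".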
Data fusion
• Threat Assessment
• Situation Assessment
• Aggregation of Entities
• Behavior of Entities
• Identity of Entities
• Position/Velocity
• Existence of Entities
D. L. Hall, “Mathematical Techniques for Data Fusion”
JDL Information Fusion Process & Functional Model
[Figure: JDL Data Fusion Process]
• Sources: Intel Sources, Air Surveillance, Surface Surveillance, Space Surveillance
• Data fusion domain
– Level 0 Processing (Sub-object Data Association & Estimation): pixel/signal level data association and characterization
– Level 1 Processing (Object Refinement): fuses data from multiple sensors to a joint estimate of identity for detected entities
– Level 2 Processing (Situation Refinement): assessment of relationships between entities; aggregation of entities into a higher level of abstraction
– Level 3 Processing (Impact Assessment): threat assessment; estimate aggregation of force capabilities; prediction of enemy intent; implications of future actions
– Level 4 Processing (Process Refinement): monitors data collection and fusion
• Human Computer Interaction
• Data Base Management System (Support Database, Fusion Database): data storage and retrieval to support the automated functions of levels 1 - 4
Data Fusion And Data Mining
• Data mining techniques attempt to extract knowledge
from a large amount of heterogeneous, multidimensional data
– Classification of objects/patterns
– Discovery of useful hidden patterns and structures
– Discovery of rules that link two or more objects
– Outlier and deviation detection
– Identification of spatial and time trends
– Prediction of events
Classification
• Assign an object to a category according to its features
– Linear Discriminant Analysis
– Quadratic Discriminant Analysis
– K-nearest Neighbor
– Support Vector Machines
– Decision Trees
– Artificial Neural Networks (ANN)
• Limitations:
– requires labeled training data
– the majority of classification techniques do not deal with
uncertain or missing values and noisy data
• Application:
– Object Identification, Identity Fusion
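To make "assign an object to a category according to its features" concrete, here is a minimal k-nearest-neighbor sketch using only the Python standard library. The feature choice (speed and radar cross-section), the labels and all values are hypothetical, and the features are left unscaled for brevity, which a real application would likely not do.

```python
from collections import Counter
import math


def knn_classify(training, query, k=3):
    """Return the majority label among the k training points nearest to query.

    training: list of (feature_vector, label) pairs; labeled data is required.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (speed in km/h, radar cross-section in m^2)
training = [
    ((60.0, 8.0), "tank"),
    ((55.0, 9.5), "tank"),
    ((50.0, 7.0), "tank"),
    ((90.0, 2.0), "truck"),
    ((100.0, 2.5), "truck"),
    ((95.0, 3.0), "truck"),
]
label = knn_classify(training, (58.0, 8.5), k=3)
print(label)
```

The sketch also shows both limitations listed above: it cannot run without the labeled `training` list, and a query with a missing feature value has no defined distance.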
Clustering
• Analyzes and groups objects according to their similarity
• No training data is required
• Hard Clustering: an object is assigned to one and only one cluster
– Partitioning
– Hierarchical
– Density-Based
– Graph-Based
• Soft Clustering: an object is assigned to a cluster with a certain probability or
degree of membership
– Fuzzy-Based
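The difference between hard and soft assignment can be sketched as follows. The hard rule picks the single nearest cluster center, while the soft rule uses the standard fuzzy c-means membership formula. The cluster centers are taken as given (no iterative fitting is shown), and the coordinates and function names are hypothetical.

```python
import math


def hard_assign(point, centers):
    """Hard clustering: index of the single nearest center."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def fuzzy_memberships(point, centers, m=2.0):
    """Soft clustering: fuzzy c-means membership degrees (they sum to 1).

    m > 1 is the fuzzifier; centers are taken as given for this sketch.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    for j, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if i == j else 0.0 for i in range(len(dists))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]


centers = [(0.0, 0.0), (10.0, 0.0)]  # hypothetical cluster centers
point = (2.0, 0.0)
cluster = hard_assign(point, centers)
degrees = fuzzy_memberships(point, centers)
print(cluster, [round(u, 3) for u in degrees])
```

With these values the point is assigned to the first cluster in the hard case, and receives a membership of 16/17 in the first cluster and 1/17 in the second in the soft case.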
Clustering
• Applications:
– Data preprocessing: image segmentation
– Object tracking: a preprocessing step before track assignment that clusters similar tracks in
a scenario
– Entity aggregation: aggregate similar entities into a higher level of abstraction and
explore new relationships in data
• Limitations:
– Heuristic algorithms
– Definition of an appropriate similarity metric
– Unable to discover arbitrary-shaped clusters
– Do not handle a mixture of numerical and categorical attributes
– Scalability problems
– Do not consider physical constraints that may occur in a real spatial
scenario, such as obstacles and crossings
Bayesian Networks
[Figure: Bayesian network with nodes RSW, EW, IFF, Identity, Comm, Pos, Class, Kin]
K.C. Chang, K. Blackmond Laskey, “Partially Dynamic Bayesian Networks for
Tracking and Object Identification”
Dynamic Bayesian Networks
Partially Dynamic Bayesian Networks
Bayesian Networks: IFF Identification
G. Laskey, K. Laskey , “Combat Identification with Bayesian Networks”
Bayesian networks
• Applications
– Identification of objects: object identity (BN); identification of occluded objects (BN); object tracking (BN, PDBN)
– Threat assessment/event prediction: identification of IFF; assessment of attacks/threats; prediction of consequences of actions (DBN); assessment of enemy intentions (DBN)
Bayesian networks
• Advantages:
– Effective at dealing with noisy and uncertain evidence and missing
data
– Decision theory and risk analysis can be used to choose the action
that maximizes the expected utility
– Learning algorithms exist to learn the structure and parameters of a BN
– Approximate inference can be made with partial/uncertain
information about the scenario
• Limitations:
– Learning the structure and potentials for each node in the network is
slow
– Requires training data to learn the structure and parameters of the
BN
– Exact inference for BNs and PDBNs is intractable in general; recent research
has proposed approximate inference algorithms
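Two of the advantages above, handling missing evidence and choosing the action with maximum expected utility, can be sketched on a very small network. The structure assumed here (Identity as the only parent of IFF and EW), every probability, the utilities and the action names are hypothetical and are not taken from the cited papers. Inference is done by direct enumeration, which is only practical for a network this small.

```python
# Hypothetical prior and conditional probability tables (not from the slides).
PRIOR = {"friend": 0.5, "hostile": 0.5}
LIKELIHOOD = {
    "iff": {  # P(IFF response | identity)
        "friend": {"valid": 0.9, "none": 0.1},
        "hostile": {"valid": 0.05, "none": 0.95},
    },
    "ew": {  # P(emitter type | identity)
        "friend": {"friendly_radar": 0.8, "hostile_radar": 0.2},
        "hostile": {"friendly_radar": 0.1, "hostile_radar": 0.9},
    },
}
# Hypothetical utilities U(action, identity).
UTILITY = {
    "investigate": {"friend": -1.0, "hostile": 10.0},
    "ignore": {"friend": 0.0, "hostile": -20.0},
}


def posterior(evidence):
    """P(identity | evidence) for the network Identity -> {IFF, EW}.

    Sensors missing from `evidence` are left out of the product, which
    marginalizes them away under the conditional-independence assumption.
    """
    scores = {}
    for identity, prior in PRIOR.items():
        score = prior
        for sensor, reading in evidence.items():
            score *= LIKELIHOOD[sensor][identity][reading]
        scores[identity] = score
    total = sum(scores.values())
    return {identity: s / total for identity, s in scores.items()}


def best_action(belief):
    """Action with maximum expected utility under the given belief."""
    def expected(action):
        return sum(belief[i] * UTILITY[action][i] for i in belief)
    return max(UTILITY, key=expected)


belief = posterior({"iff": "none"})  # the EW reading is missing
print(belief, best_action(belief))
```

With only an absent IFF response as evidence, the belief shifts toward "hostile" and the preferred action is "investigate"; adding a valid IFF response and a friendly radar emission reverses both.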
Outlier Detection
• Outlier: objects/events that differ considerably from
the majority of the objects
– Objects whose non-spatial attributes differ
considerably from the non-spatial attributes of
their surrounding objects
– Objects that are far from other objects
Outlier Detection
• Techniques:
– Discordance tests
– Distance-based
– Density-based
– Deviation-based
– Donoho-Stahel Estimators (DSE)
– Cluster-based
– Graph-based methods
• Limitations:
– Most of them are not robust to space transformations
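As one example of the distance-based family, the sketch below flags any object whose distance to its k-th nearest neighbor exceeds a threshold. The positions, `k` and `threshold` are hypothetical values chosen for illustration, and this is only one of several common distance-based definitions.

```python
import math


def knn_distance_outliers(points, k=2, threshold=5.0):
    """Flag points whose distance to their k-th nearest neighbor exceeds threshold."""
    outliers = []
    for i, p in enumerate(points):
        # Sorted distances from p to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if len(dists) >= k and dists[k - 1] > threshold:
            outliers.append(p)
    return outliers


# Hypothetical object positions: four close together, one far away
positions = [(0, 0), (1, 0), (0, 1), (1, 1), (20, 20)]
print(knn_distance_outliers(positions))
```

Because the fixed threshold is tied to the coordinate scale, rescaling one axis can change which points are flagged, which is one way the sensitivity to space transformations shows up.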
Trend Analysis
• Spatial Trend Analysis:
– Detection of changes and trends in a spatial dimension, e.g. the
distribution of enemy troops in relation to the geographical area
– Techniques: spatial regression; breadth-first search for detection of
global trends; depth-first search for detection of local trends
• Spatial-Temporal Trend Analysis:
– Detection of changes and trends in both space and time
in a scenario
– New area of research
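A very simple form of spatial regression can be sketched as an ordinary least-squares fit of a non-spatial attribute against distance from a reference location. The observations, the reference point and the function name `spatial_trend` are hypothetical. The breadth-first and depth-first search techniques are not shown.

```python
import math


def spatial_trend(reference, observations):
    """Least-squares slope and intercept of an attribute against distance.

    observations: list of ((x, y), value). Returns (slope, intercept).
    """
    xs = [math.dist(reference, pos) for pos, _ in observations]
    ys = [value for _, value in observations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all observations are at the same distance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical unit counts observed at increasing distance from a reference point
obs = [((1, 0), 90), ((2, 0), 80), ((3, 0), 70), ((4, 0), 60)]
slope, intercept = spatial_trend((0, 0), obs)
print(slope, intercept)
```

A negative slope here would indicate that the attribute decreases with distance from the reference point, which is the kind of spatial trend described above.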
Conclusions
• In the future, a battlefield scenario (BS) will be a distributed system of sensors.
– Sensor data from several sources must be fused and correlated to support
target identification, order of battle, identification of potential threats,
assessment of enemy intention and prediction of future enemy action, giving
commanders and soldiers a clear understanding of the battlefield
situation
• A combination of data mining techniques can be applied throughout
the hierarchy of inference in the data fusion process
• Current data mining techniques have limitations in dealing with
uncertain and missing data, which may hamper inference in a
data fusion process
– Bayesian networks are one of the most promising techniques for modeling
situations that involve a certain degree of uncertainty and missing data
• Many data mining techniques applied in the data fusion process
are supervised methods; more research should be conducted on the
development of unsupervised data mining techniques