Data Fusion - Spatial Database Group

Fusion Based Knowledge for the
Objective Force: A Science and
Technology Objective
Presented April 3, 2003
By Barbara D. Broome
ARL Data Mining Working Group
Army Science Board* Estimates of
Technology Readiness for Select Fields
Technology Readiness Levels
Enabling Technologies
Aided ATR
Smart Portals to push pull
Mobile Wireless (pagers, PDA)
Malicious Mobile Code
Visualization - Presentation
Data Extraction
Virtual environment
Automatic routers, priorities
2004
3
6
6
1
4
6
3
5
Data fusion, information fusion
2
3
Secure Intelligent Agents
Encryption and authentication
Exploitation Algorithms and assist
RTIC
Future Internet
Individual Soldier Tech.
Collaboration Technologies
Sync Distributed Secure Data base
Secure Access Technology Biometrics
Translingual language transcription
Soldier Education
Associates
Next Generation Internet
2
4
2
5
6
4
6
4
3
4
6
6
6
5
7
2
8
9
8
9
7
8
6
8
7
9
*ASB Study - Knowledge Management and Information
Assurance, dtd 09/01
2008
3
9
9
2
7
8
6
8
Commercial
2
9
7
3
6
8
6
5
7
6
2
9
5
8
5
5
7
7
5
9
Joint Directors of Laboratories (JDL)
Fusion Levels
Level 0: Sensor-level target identification
- Processing raw data near the sensor
Level 1: Where is the enemy? (Multi-sensor correlation)
- Multi INT Correlation for highly detailed Enemy Situation
----------------------------------------------------------------------------
Level 2: What is the enemy doing?
- Aggregation for COP
- Interpreting activities in context
- Develop hypotheses about current ECOA
- Cluster analysis
- Trend analysis
- Association rules
Level 3: What are the enemy’s goals?
- Future ECOA’s
- Predict Intent and Strategy
Level 4: How should we respond?
- How do we redirect the ISR system to get better SU?
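Level 2 lists association rules among its techniques. As a minimal sketch of what such a rule measures, the block below computes support and confidence for one rule over a handful of observations. The observation contents and item names are hypothetical, and the helper functions are illustrative, not part of any fusion system named in the briefing.

```python
# Support and confidence for a single association rule.
# Assumption: each observation is a set of co-occurring indicators (made-up data).

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

observations = [
    {"radio_silence", "bridging_assets_forward", "attack"},
    {"radio_silence", "attack"},
    {"radio_silence"},
    {"bridging_assets_forward", "attack"},
]

rule_support = support(observations, {"radio_silence", "attack"})
rule_confidence = confidence(observations, {"radio_silence"}, {"attack"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A rule is usually kept only when both measures clear user-chosen thresholds; those thresholds are a design choice the briefing does not specify.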
DARPA Programs Related to
Levels 2 & 3 Fusion
[Figure: DATA FUSION PROCESSING and ENABLING TECHNOLOGIES]
Questions addressed: Where, What, When, Who, Why, How, How well
Fusion levels: Level 1: Object Refinement; Level 2: Situation Refinement; Level 3: Threat Refinement; Level 4: Performance Refinement
Scope labels: specific / aggregated; local / global; battle / theatre
Content: physical objects, individual, organizations, events; environment & enemy tactics; enemy doctrine, objectives & capability; friendly vulnerabilities & mission; options, needs, effectiveness; resource management
Enabling technologies: Evidence Extraction & Link Detection; Dynamic Data Exchange; CoABS; DAML; RKF; CPOF; Dynamic Tactical Targeting; Adv. ISR Mgmt; Battle Assessment & Data Dissemination
Ref: DARPA IXO (SUO-SAA) Information Fusion Workshop, final briefing, 28 Feb 2002
Why We Need Fusion
Information volume exceeds war-fighter capabilities to develop situational
understanding required for planning and acting within the adversary’s decision cycle
Echelon | # Msg’s per hour* | # full-time Analysts, w/ workstations | Latency for Level III Fusion
Legacy Division | 400-600 | 15 | 1 Hr
Future UA Bde | 17,000** | 0-6 (TBD) | NRT (req)
Future UA Bn | 4,000** | Zero | NRT (req)
Future UA Company | 1,200** | Zero | NRT (req)
* Current and estimated bottom-up sensor feeds; Top-down feed is much larger
** (Date) Sensor briefing from CG, USAIC&FH to Dir, UAMBL / MAPEX indicates an order of magnitude increase
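The gap shown on this slide can be made concrete with a little arithmetic. The sketch below divides hourly message volume by the number of full-time analysts, using the figures on the slide. Giving the Future UA Bde its upper bound of 6 analysts is an assumption (the slide lists 0-6, TBD), and the function name is illustrative.

```python
# Messages per analyst per hour, using the figures from the slide.
# Assumption: the Future UA Bde is given its upper bound of 6 analysts.

def msgs_per_analyst(msgs_per_hour: float, analysts: int) -> float:
    """Return the hourly message load per analyst; infinite when there are none."""
    if analysts == 0:
        return float("inf")
    return msgs_per_hour / analysts

legacy_low = msgs_per_analyst(400, 15)
legacy_high = msgs_per_analyst(600, 15)
future_bde = msgs_per_analyst(17_000, 6)

print(f"Legacy Division: {legacy_low:.0f}-{legacy_high:.0f} msgs/analyst/hour")
print(f"Future UA Bde:   {future_bde:.0f} msgs/analyst/hour")
print(f"Ratio to legacy upper bound: {future_bde / legacy_high:.0f}x")
```

At battalion and company level the slide lists zero analysts, so the per-analyst load is unbounded, which is the case for automating fusion at those echelons.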
Reports Without Fusion
[Figure: report volume by echelon COP]
UE: Plus… information from echelons above UA
Bde COP: 170K+ Reports/Hour
Bn COP: 56K+ Reports/Hour
Co COP: 18K+ Reports/Hour
PLT COP: 6K+ Reports/Hour
Report count based on DCGS-A MAPEX results using Caspian Sea Scenario.
Reports generated from FCS EO/IR and COMINT Sensors only.
Add MASINT sensors and reporting at UA goes to @ 600K/hour.
Source: Mr. Hayward’s Brief, Force Operating Capability (FOC) S&T Assessment Review
FBKOF: Overcoming Information Overload
BARRIERS
• Limited computational models
• Knowledge/algorithms scenario dependent
• COTS knowledge acq. technology slow
• Information sources poorly integrated
• Knowledge discovery tools limited
APPROACH
• Constrain the problem scope to UA
• Apply Blackboard architecture, Bayesian belief nets, and cooperative human-machine hypothesis generation and management
• Exploit DARPA rapid knowledge formation technologies to develop knowledge-intensive reasoning for interpretation
• Leverage Semantic Web techniques for source integration
• Integrate and tailor COTS tools for directed knowledge discovery
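The approach names Bayesian belief nets for hypothesis management. A full belief network is beyond a slide, but its core step, updating belief in competing hypotheses when evidence arrives, can be sketched as follows. The hypothesis names and every probability are hypothetical illustration values, not figures from the program.

```python
# Minimal Bayesian update over competing hypotheses (not a full belief network).
# Assumption: hypothesis names and all probabilities are made up for illustration.

def bayes_update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Return the posterior P(h | e) from P(h) and P(e | h) for each hypothesis h."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: p / evidence for h, p in unnormalized.items()}

# Prior belief in each enemy course of action.
prior = {"attack": 0.3, "defend": 0.5, "withdraw": 0.2}
# P(report of bridging equipment moving forward | course of action).
likelihood = {"attack": 0.8, "defend": 0.2, "withdraw": 0.1}

posterior = bayes_update(prior, likelihood)
for hypothesis, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{hypothesis}: {p:.2f}")
```

In a cooperative human-machine setting, an analyst could inspect or override the priors and likelihoods; how that interaction is designed is not stated in the briefing.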
DELIVERABLES
• SW for knowledge generation/explanation to answer CCIR’s in a timely manner
• Ontology based information agents for objective force systems
• User-directed knowledge discovery tools
• Modeling and simulation tools
Schedule
[Chart: task timeline across FY03, FY04, FY05, FY06, FY07]
Tasks
• Baseline / Assess Knowledge tools and Fusion Algorithms
• Knowledge Acquisition
• Mining-Component Development
• Knowledge Infrastructure Development
• Modeling and Simulation Support
• C4I experiments and evaluations
• Transitions Decision Points
Semantic Web Concepts Providing a Knowledge
Environment (Agents and Ontologies)
GOALS
• Minimize burden on user
– Automate well-structured problems
– Support ill-structured problems
• Interface tuned to the task and to the user
• Task centered, not tool centered
• Support information push and pull
• Support collaboration
• Accommodate multi-modal data types
• Visualization tools to support understanding
• Smarter integration of sources
– Limit the number of required retrievals (bandwidth)
– Minimize exploration after retrieval (time constrained)
– Automate and personalize the process
[Figure: Interface; Databases; OLAP; DBMS; Knowledge Base; Fusion; Web Search Engine]
Ontology: formal description of the relationships among terms.
Notional Blackboard Architecture
for Fusion Module
Blackboard: Levels of Analysis
• Answers to PIRs
• COAs and COA Fragments
• Relations between objects (command hierarchy, behavioral)
• Events & Activities
• Objects (equipment and platform-level entities)
Knowledge Sources
Plans KS:
• History
• Doctrine
• Terrain & Weather
Activities KS:
• Force Structure
• Commo Patterns
• Tactics
• Terrain & Weather
Sensor-Data Fusion KS:
• Platform & Equipment Classification
• Terrain & Weather
CONTROL
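The notional architecture has three parts: a blackboard of levels of analysis, knowledge sources (KS) that read one level and post to another, and a control component that decides which KS runs. The sketch below is one minimal way to wire those parts together. The class names, level keys, and inference lambdas are all hypothetical placeholders, and the "fire the first KS that can contribute" policy is an assumed control strategy, not the one used by the fusion module.

```python
# Sketch of a blackboard: shared levels of analysis, knowledge sources (KS)
# that post to it, and a control loop that fires a KS able to contribute.
# Assumption: level names and KS rules are placeholders for illustration.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Blackboard:
    levels: dict[str, list[str]] = field(default_factory=lambda: {
        "objects": [], "events": [], "relations": [], "coas": [], "pir_answers": [],
    })

@dataclass
class KnowledgeSource:
    name: str
    reads: str                     # level this KS takes its input from
    writes: str                    # level this KS posts hypotheses to
    infer: Callable[[str], str]    # placeholder for the KS's domain knowledge
    seen: set[str] = field(default_factory=set)

    def step(self, bb: Blackboard) -> bool:
        """Process one unseen entry; return True if something was posted."""
        for entry in bb.levels[self.reads]:
            if entry not in self.seen:
                self.seen.add(entry)
                bb.levels[self.writes].append(self.infer(entry))
                return True
        return False

def control(bb: Blackboard, sources: list[KnowledgeSource], max_cycles: int = 100) -> None:
    """Each cycle, fire the first KS that can contribute; stop when none can."""
    for _ in range(max_cycles):
        if not any(ks.step(bb) for ks in sources):
            break

bb = Blackboard()
bb.levels["objects"].append("tank platoon at grid A")
sources = [
    KnowledgeSource("Plans KS", "events", "coas", lambda e: f"COA fragment from [{e}]"),
    KnowledgeSource("Activities KS", "objects", "events", lambda o: f"activity of [{o}]"),
]
control(bb, sources)
print(bb.levels["events"])
print(bb.levels["coas"])
```

The list order of `sources` acts as a fixed priority here; a real control component would more likely rank knowledge sources by the expected value of their contribution.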
Agents-Based
What are they? (ATL)
• The concept of software agents represents a new way of applying artificial intelligence techniques such as machine reasoning and learning.
• Software agents are computer programs designed to operate in a manner analogous to human agents. Human agents, such as real-estate agents, carry out tasks on your behalf using expertise you may not have. Software agents carry out information processing functions in the same manner.
• Agents can be thought of, in software engineering terms, as a step beyond the objects of object-oriented programming. Whereas objects are passive entities that must be invoked to execute, agents use AI mechanisms such as machine reasoning to actively operate as autonomous entities.
• Research has shown that the greatest utility of multi-agent applications is in information management.
"Active, persistent sw components that perceive, reason, act and communicate" -- Huhns
How do they help?
• Huge problem broken into small components
• Much can be handled in parallel rather than serially
• Reflect changes in priorities without coding changes
• Technology is coming of age
• Many web applications [6, 9]: mediator, personal assistant
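The perceive, reason, act, communicate cycle, and the point about changing priorities without coding changes, can be sketched with a small filtering agent. Everything named here (the class, the message fields, the topics, the threshold) is hypothetical. It illustrates the pattern only and is not the API of CoABS or any other agent framework mentioned in this briefing.

```python
# Sketch of an agent cycle: perceive -> reason -> act -> communicate.
# Priorities live in data, so they can change without code changes.
# Assumption: message fields and topic names are made up for illustration.
from collections import deque

class FilterAgent:
    def __init__(self, priorities: dict[str, int], threshold: int):
        self.priorities = priorities   # topic -> importance (higher matters more)
        self.threshold = threshold
        self.inbox: deque[dict] = deque()
        self.outbox: list[str] = []

    def perceive(self, message: dict) -> None:
        """Accept a message from the environment."""
        self.inbox.append(message)

    def reason(self, message: dict) -> bool:
        """Decide whether a message is important enough to pass on."""
        return self.priorities.get(message["topic"], 0) >= self.threshold

    def act(self) -> None:
        """Drain the inbox, alerting on every message that passes the filter."""
        while self.inbox:
            message = self.inbox.popleft()
            if self.reason(message):
                self.communicate(f"ALERT {message['topic']}: {message['text']}")

    def communicate(self, alert: str) -> None:
        """Publish an alert; a list stands in for a real message channel."""
        self.outbox.append(alert)

agent = FilterAgent(priorities={"armor": 5, "weather": 1}, threshold=3)
agent.perceive({"topic": "weather", "text": "fog clearing"})
agent.perceive({"topic": "armor", "text": "column moving north"})
agent.act()

# The commander's priorities change: only data is updated, not code.
agent.priorities["weather"] = 4
agent.perceive({"topic": "weather", "text": "visibility below 500 m"})
agent.act()
print(agent.outbox)
```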
Source Interface Agent Functionality
• Filter
• Monitor
• Alert
• Retrieve – pull
• Disseminate – push
• Mediate across legacy systems
• Intruder detection
• Policy enforcement
• Adapt to the user priority
• Adapt to the environmental changes
[Figure: Brigade level data sources]
DCGS-A Data Store
Single-INTs: COMINT, ELINT, MASINT, Imagery, Images/Video/Audio, MTI, HUMINT, Other Multimedia, Open Source
External COPs (above/below/beside)
MIDB; Blue Asset Mgmt; Terrain; Weather; Targets; CCIR/IR/OPLANs; Alert/Search Criteria
All Source Fusion (ASFDB): Units, Pieces of Equipment, Facilities, Events, Individuals, Organizations, and their interrelationships
PROBLEMS
• Agents new, few success stories and limited developmental environments
• Present complex parallel processing paradigm
• Issues of teaming, security, mobility, efficiency
• Establishing optimum ontology size/approach
• Integrating ontologies across heterogeneous sources
Ontology: a formal description of the relationship among terms.
Ontology-Facilitated
• Information heterogeneous (type, syntax, semantics)
• Heterogeneity of semantics results in conflicts (naming, scaling, confounding)
• Ontologies explicitly describe information sources
• Identify and share formal descriptions of domain-relevant concepts
• Identify classes of objects and organize them hierarchically
• Characterize classes by the properties they share
• Identify important relationships between classes
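The naming and scaling conflicts listed above can be illustrated with a shared vocabulary that each source maps into. The sketch below is deliberately a flat mapping table, not a formal ontology language such as DAML. The source names, field names, and units are hypothetical.

```python
# Sketch: a shared vocabulary resolves naming and scaling conflicts between
# two sources. Assumption: source names, field names, and units are made up.

# Per-source mapping: local field -> (shared term, factor converting to shared unit)
MAPPINGS = {
    "source_a": {"vis_km": ("visibility_m", 1000.0), "temp_c": ("temperature_c", 1.0)},
    "source_b": {"visibility_meters": ("visibility_m", 1.0), "air_temp": ("temperature_c", 1.0)},
}

def to_shared(source: str, record: dict[str, float]) -> dict[str, float]:
    """Translate one source record into shared terms and units."""
    mapping = MAPPINGS[source]
    shared = {}
    for local_name, value in record.items():
        if local_name not in mapping:
            continue  # field has no counterpart in the shared vocabulary
        term, factor = mapping[local_name]
        shared[term] = value * factor
    return shared

a = to_shared("source_a", {"vis_km": 2.5, "temp_c": 11.0})
b = to_shared("source_b", {"visibility_meters": 2500.0, "air_temp": 11.0})
print(a)
print(a == b)
```

A real ontology would add the class hierarchy, shared properties, and relationships between classes that the bullets call for; this sketch covers only term and unit reconciliation.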
[Figure: Brigade level data sources with agent components]
Components: Mediator Agent; Fusion; Prioritizer; Reasoner Agent; Commo Module Agent; Ontology
DCGS-A Data Store
Single-INTs: COMINT, ELINT, MASINT, Imagery, Images/Video/Audio, MTI, HUMINT, Other Multimedia, Open Source
External COPs (above/below/beside)
MIDB; Blue Asset Mgmt; Terrain; Weather; Targets; CCIR/IR/OPLANs; Alert/Search Criteria
All Source Fusion (ASFDB): Units, Pieces of Equipment, Facilities, Events, Individuals, Organizations, and their interrelationships
Providing User-Directed Knowledge
Discovery Tools
• On Line Analytical Processing (OLAP) emerged in the early 90’s (Inmon, Codd)
• Multi-dimensional data structure
• Better (more flexibly) address decision process (forecasting, time-series analysis, link analysis)
• More natural & efficient storage and retrieval mechanism
• Provides a mechanism for accommodating time and space
• Support drill down and roll up functionality
• Flexible graphical interface
• Commercial Product
• Natural Transition to Data Mining
[Figure: ANALYZING THE DATA, Total Cost X Disaster Type]
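Roll up and drill down over a multi-dimensional structure can be sketched in a few lines. The example follows the "Total Cost X Disaster Type" view named on the slide, but the facts themselves are made-up illustration values, and the dimension names and `aggregate` helper are hypothetical, not the interface of any commercial OLAP product.

```python
# Sketch of roll-up and drill-down over a small multi-dimensional fact table.
# Assumption: the facts are made-up values, not data from the briefing.
from collections import defaultdict

# Each fact: (disaster_type, region, year, cost)
FACTS = [
    ("flood", "north", 2001, 10.0),
    ("flood", "south", 2001, 4.0),
    ("flood", "north", 2002, 6.0),
    ("fire", "north", 2001, 3.0),
    ("fire", "south", 2002, 7.0),
]
DIMS = {"disaster_type": 0, "region": 1, "year": 2}

def aggregate(facts, group_by, where=None):
    """Sum cost grouped by the named dimensions, optionally filtered by equality."""
    where = where or {}
    totals = defaultdict(float)
    for fact in facts:
        if all(fact[DIMS[d]] == v for d, v in where.items()):
            key = tuple(fact[DIMS[d]] for d in group_by)
            totals[key] += fact[3]
    return dict(totals)

# Roll up: total cost by disaster type.
by_type = aggregate(FACTS, ["disaster_type"])
# Drill down: break the flood total out by region.
flood_by_region = aggregate(FACTS, ["region"], where={"disaster_type": "flood"})
print(by_type)
print(flood_by_region)
```

Treating time and space as ordinary dimensions, as here, is the simplest option; the representation of space and time is one of the open problems this briefing lists.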
PROBLEMS
• Representation of space and time
• Complexity of user interface
• Inefficiency of algorithms
• Difficulty in extending functionality
• Difficulty in modifying the structure
Team Members
Information Agents
Data Mining
Tim Hanratty
Joan Forester
John Dumer
Ann Brodeen
George Hartwig
Mike Evans
Mario Torres
John Raby
Sam Chamberlain
Ed Measure
Partners and Leveraged Programs
• CECOM/I2WD
• Army G2 (Woodson / Walsh / ISR Working Group)
• Huachuca (Schlabach – Cahill)
• ADA CTA (U W Fl, UMD, SA Tech, Ohio State?)
• ARMY HPC Program (Namburu/UMINN, Data mining)
• ARL CENTERS OF EXCELLENCE (Evans CAU, Data mining)
• PENN State (Yen, Teaming Agents)
• C2CUT and Warrior’s Edge
• DARPA: Kessler (PBA Seedling); Taylor (ATA); Kott (AIM); RKF; Burke (DAML, CoABS Grid)
• ENDORSEMENTS: BCBL-H; BCBL-L; PM DCGS-A, PM IE, PM FCS
FY03 Deliverables
FY03 : (1) Work with CECOM and the user community in conducting a knowledge audit to design the
Human-Computer Interface (HCI) and identify the fusion tasks most likely to be useful to the user. (2)
Develop a small prototype Knowledge Environment (KE) that uses agent techniques to access the two
highest priority data sources. This will establish a baseline system on which to build in out years,
demonstrate our initial concept of the use of ontologies by the KE agent communities, and provide a
mechanism for integrating CECOM’s fusion modules. (3) Conduct an internal demonstration of the
baseline system to support refinement of the HCI/KE concepts
FY04 : (1) Integrate two more data sources into the baseline system to assess the extensibility of the infrastructure and
provide the CECOM fusion module access to a greater variety of data sources. (2) Develop and populate a prototype
multi-dimensional data structure for user directed data mining or knowledge discovery (KD). This will allow us to
explore the use of user-in-the-loop fusion tools to supplement CECOM automated fusion techniques. (3) Conduct an
internal joint CECOM/ARL demonstration to refine the HCI and KE concepts.
FY05 : (1) Modify the KE system architecture, based on the FY04 evaluation and integrate 5th data/information source.
(2) Jointly demonstrate to DCGS-A and user communities the integration of CECOM’s fusion algorithms, the user-directed KD tools and 5 data sources. This provides a formal review for the targeted transition system developers
(FCS/DCGS-A) of the refined approach at a point when all the required components are in place.
FY06 : (1) Finalize user-directed mining scripts and system architecture, based on FY05 evaluation. The goal will be to
simplify access to the KD tools. (2) Develop information agents to support I2WD fusion task. These agents will be
directed toward increasing the efficiency and effectiveness of information push/pull. (3) Internally demonstrate
automated cross-source integration using the enhanced agent environment and work with CECOM to evaluate and
enhance the system’s functionality.
FY07 : (1) Finalize system development, based on FY06 evaluation. (2) Jointly conduct the final system demonstration
and evaluation to support system transition to FCS LSI contractor, PM-CGS, and PM-IF.
Progress
Task: Conduct Baseline Knowledge Audit of UA S2
Accomplishments:
• Interviewed SME’s
• Extracted data sources from DCGS-A ORD
• Participation in G2’s ISR Working Group
Remaining:
• Integrate SME modifications
• Document
• Participate in Follow-on MAPEX

Task: Prototype HCI for the S2
Accomplishments:
• SA-Tech developed Goal-Directed Task Analysis
Remaining:
• Code/evaluate against user requirements

Task: Configure Info Mgmt for Agent-Based Information Infrastructure
Accomplishments:
• Evaluated ontologies for weather/terrain/red
• Held IMPACT class
• Installed CoABS Grid (operational)
• Installed CAST agents to monitor unit movement on DaVinci map as client
• Installed EMAA (operational and tested)
Remaining:
• EMAA training
• Develop wrappers for ASAS-Lite, IMETS/IWEDA
• Choose/extend ontologies
• Develop agent functionality
• Develop API for fusion sw

Task: Develop User-Directed Data Mining functions
Accomplishments:
• Installed internal MDDB tools (Oracle Express, Oracle 9i, other?)
• Held Data Mining Workshop (WSMR)
• Reviewed CAU demonstrations and proposal
Remaining:
• Data Mining Follow-up (U MN, Apr 3)
• Identify/mod/develop demo functionality (Climatology to weather, spatial displays, model validation)

Task: Develop Evaluation Methods
Accomplishments:
• Worked with SME’s to identify metrics
Remaining:
• Design pilot study
• Develop data collection sw
• Integrate pilot with demo

Task: Demonstrate integrated components
Accomplishments:
• Identified and ordered hardware requirements
• Identified software requirements
Remaining:
• Order remaining sw requirements
• Integrate components
• Conduct demonstration
Action Items from Last Meeting
• Provide to UMINN and CAU actual meteorological measurements, along with requisite format information, for a 2-3 week period in an area for which we have corresponding model data. (Passner/Raby)
• Upon receiving the measurement data, initiate a data mining effort to predict visibility, identify air mass clusters and their movement, and identify associations between measurement parameters and accuracy of forecast predictions. (Kumar/Shekhar/George)
• Coordinate a follow-up meeting in the late March timeframe to discuss the initial data mining findings on the measurement data. Jon Mercurio can provide a meteorology tutorial for the data miners. Hopefully this can be held at one of the universities to involve more faculty and students in the discussion. (Forester)
• Contact McWilliams for more detail on an upcoming weather data mining workshop in the Washington area. (Broome)
• Identify weather data overlays that can be displayed on the FBKOF architecture in August and agree on a format for information exchange. (Evans/Hanratty)
• Follow up on Shekhar research on spatial data mining and its application to the spatial data requirements for Online Analytical Processing. (Broome)
Data Mining Issues
• Incorporating remote sites is extremely time consuming, and as unplanned meetings arise, cross-site coordination is the first thing to go.
– Impacts data mining effort most
– VTC/conference calls not enough
– ARL partners may help
– Occasional one-site meetings critical, especially this year as everyone is learning their role
• Establishing a baseline system this FY
– Topic: Weather, terrain, MAPCUBE, intrusion detection
– Function: Mining forecast errors, micro-scale weather feature clusters, short-term trends
– Location: Georgia/Ft Benning
– Agent Accessed Data source: IMETS, IWEDA, Other (GA, Nat’l Weather Service)
– Structure: Multi-dimensional or Flat File, Standard
– Database: Commercial or Inhouse
– Integration across partners: ATL, CAU, UMINN
• Expanding beyond the weather in out years (distinguishing between BED and FBKOF)
– Terrain
– Time
– Space
– Friendly forces
– Enemy forces
Data Mining Issues
• Obtaining data
– Weather-related (BED)
– Scenario-related (Warrior’s Edge Demo)
• Integrating with Agents Software
– What operating system: Windows 2000
– What software: Oracle 9i, MatLab, Java, C++
– Software handoff: Documentation, installation, demonstration
– Standard data structure
• Contract scope
– Location: Atlanta, WSMR, ALC, APG*
– Staff/students/postdocs
– Data Mining (hands-on) courses
– Summer faculty/visits
• Papers/publications
– Venue
– Collaborations
• Supporting ARL Data Mining Initiatives
– McWilliams, National Environmental Satellite Data Information Service Data Mining Workshop on extracting information from large heterogeneous data sets.
Summary
• Goal: Facilitate quick decisions that fully leverage the huge volumes
of information that the UA will receive.
– Includes, but goes beyond, weather data
• Proposed relatively modest software readiness levels, due to
difficulty of the task, but driving to get a transition:
– PM DCGS-A demonstration in 05, with a transition decision point in 07
– PM IE demonstration in 05, transition decision point in 07
– Demonstration to FCS LSI 05, AMSAA transition decision point in 07
• Data mining resources far exceed initial expectations, but not all can
be targeted toward FBKOF.
• First year of agents development will receive a boost from related
ARL programs (C2CUT, Warrior’s Edge), and can enhance source
integration requirements for data mining applications
• Strong support from user community
– Need to tie work to that community, involve them in the process