Video Analysis and Content Extraction


VACE: Video Analysis and Content Extraction
Executive Brief for MLMI
Dennis Moellman, VACE Program Manager
Briefing Outline
• Introduction
• Phase II
• Evaluation
• Technology Transfer
• Phase III
• Conclusion
Introduction
What is ARDA/DTO/VACE?
• ARDA – Advanced Research and Development Activity
– A high-risk/high-payoff R&D effort sponsored by US DoD/IC
• ARDA taking a new identity
– In FY2007 under the DNI
• Report to: ADNI(S&T)
– Renamed: Disruptive Technology Office
• VACE – Video Analysis and Content Extraction
– Three-phase initiative begun in 2000 and ending in 2009
• Winding down Phase II
• Entering into Phase III
Context
Video Exploitation Barriers
• Problem Creation:
– Video is an ever-expanding source of imagery and open-source intelligence, such that it commands a place in all-source analysis.
• Research Problem:
– Lack of robust software automation tools to assist human analysts:
• Human operators are required to manually monitor video signals
• Human intervention is required to annotate video for indexing purposes
• Content-based routing driven by automated processing is lacking
• Flexible ad hoc search and browsing tools do not exist
• Video Extent:
– Broadcast News; Surveillance; UAV; Meetings; and Ground Reconnaissance
Research Approach
Video Exploitation
• Research Objectives:
– Basic technology breakthroughs
– Video analysis system components
– Video analysis systems
– Formal evaluations: procedures, metrics and data sets
• Evaluate Success:
– Quantitative Testing
Metric     Current       Need
Accuracy   < Human       >> Human
Speed      > Real time   << Real time
– Technology Transition
• Over 70 technologies identified as deliverables
• 50% have been delivered to the government
• Over 20 undergoing government evaluation
Management Approach
Geared for Success
Management Philosophy – NABC
• N – Need
• A – Approach
• B – Benefit
• C – Competition
Interests
System View
[System diagram: source video flows through enhancement filters into the extraction, recognition, and understanding engines (supported by reference data), which feed intelligent content services, concept applications, language/user technology, and visualization.]
VACE Interests
Technology Roadmap
[Roadmap chart spanning Phase 1, Phase 2, Phase 3, and future work; per-phase placement is not recoverable from this transcript. Technologies shown:]
Object Detection & Tracking
Object/Scene Classification
Object Recognition
Object Modeling
Simple Event Detection
Event Recognition
Complex Event Detection
Scene Modeling
Event Understanding
Mensuration
Indexing
Video Browsing
Summarization
Filtering
Advanced query/retrieval using Q&A technologies
Content-based Routing
Video Mining
Change Detection
Video Monitoring
Image Enhancement/Stabilization
Camera Parameter Estimation
Multi-modal Fusion
Integrity Analysis
Motion Analysis
Event Ontology
Event Expression Language
Automated Annotation Language
Evaluation
Funding
Commitment to Success
[Pie charts of FY06 and FY07 funding allocations across R&D services, evaluation, program support services, and technology transition, with the FY07 R&D share split between Tier 1 and Tier 2; individual segment percentages are not reliably recoverable from this transcript.]
Phase II
Programmatics
• Researcher Involvement:
– Fourteen contracts
– Researchers represent a cross-section of industry and academia throughout the U.S., partnering to reach a common goal
• Government Involvement:
– Tap technical experts, analysts and COTRs from DoD/IC agencies
– Each agency is represented on the VACE Advisory Committee, an advisory group to the ARDA/DTO Program Manager
Phase II
Demographics
[Map of participating organizations across the U.S.: 14 prime contractors and 14 subcontractors.]
Organizations shown: Univ. of Washington; Univ. of Illinois at Urbana-Champaign (2); Univ. of Illinois at Chicago; Boeing Phantom Works; Carnegie Mellon Univ. (2) (Robotics Inst., Informedia); IBM T. J. Watson Center; Virage; TASC; Wright State Univ.; Purdue Univ.; AFIT; MIT; BBN; SRI; Salient Stills; Alphatech; Columbia Univ.; Univ. of Southern California; Univ. of Southern California / Information Sciences Inst.; Sarnoff Corp (2); Univ. of Maryland (2); Telcordia Technologies; Georgia Inst. of Tech.; Univ. of Central Florida
Phase II
Projects
Foreign Broadcast News Exploitation
• ENVIE: Extensible News Video Information Exploitation
– Organization: Carnegie Mellon University
– Principal Investigator: Howard Wactlar
• Reconstructing and Mining of Semantic Threads Across Multiple Video Broadcast News Sources using Multi-Level Concept Modeling
– Organization: IBM T.J. Watson Research Center / Columbia Univ.
– Principal Investigators: John Smith / Prof. Shih-Fu Chang

Formal and Informal Meetings
• From Video to Information: Cross-Modal Analysis of Planning Meetings
– Organization: Wright State University / VaTech/AFIT / Univ. of Chicago / Purdue Univ. / Univ. of Illinois-Urbana-Champaign
– Principal Investigators: Francis Quek / Ronald Tuttle / David McNeill & Bennett Bertenthal / Thomas Huang / Mary Harper
• Event Recognition from Video of Formal and Informal Meetings Using Behavioral Models and Multi-modal Events
– Organization: BAE Systems / MIT / Univ. of Maryland / Virage
– Principal Investigators: Victor Tom / William Freeman & John Fisher / Yaser Yacoob & Larry Davis / Andy Merlino
Phase II
Projects
Abstraction and Inference about Surveillance Activities
• Video Event Awareness
– Organization: Sarnoff Corporation / Telcordia Technologies
– Principal Investigators: Rafael Alonso / Dimitrios Georgakopoulos
• Integrated Research on Visual Surveillance
– Organization: University of Maryland
– Principal Investigators: Larry Davis, Yiannis Aloimonos & Rama Chellappa

UAV Motion Imagery
• Adaptive Video Processing for Enhanced Object and Event Recognition in UAV Imagery
– Organization: Boeing Phantom Works (descoped to UAV data collection)
– Principal Investigator: Robert Higgins
• Task and Event Driven Compression (TEDC) for UAV Video
– Organization: Sarnoff Corporation
– Principal Investigator: Hui Cheng

Ground Reconnaissance Video
• Content and Event Extraction from Ground Reconnaissance Video
– Organization: TASC, Inc. / Univ. of Central Florida / Univ. of California-Irvine
– Principal Investigators: Sadiye Guler / Mubarak Shah / Ramesh Jain
Phase II
Projects
Cross Cutting / Enabling Technologies
• Probabilistic Graphical Model Based Tools for Video Analysis
– Organization: University of Illinois, Urbana-Champaign
– Principal Investigator: Thomas Huang
• Automatic Video Resolution Enhancement
– Organization: Salient Stills, Inc.
– Principal Investigator: John Jeffrey Hunter
• Robust Coarse-to-Fine Object Recognition in Video
– Organization: CMU / Pittsburgh Pattern Recognition
– Principal Investigators: Henry Schneiderman & Tsuhan Chen
• Multi-Lingual Video OCR
– Organization: BBN Technologies / SRI International
– Principal Investigators: John Makhoul / Greg Myers
• Model-based Object and Video Event Recognition
– Organization: USC Institute for Robotics and Intelligent Systems / USC Information Sciences Institute
– Principal Investigators: Ram Nevatia, Gerard Medioni & Isaac Cohen / Jerry Hobbs
Evaluation
Goals
• Programmatic:
– Inform ARDA/DTO management of progress/challenges
• Developmental:
– Speed progress via iterative self-testing
– Enable research and evaluation via essential data and tools; build lasting resources
• Key is selecting the right tasks and metrics
– Gear evaluation tasks to the research suite
– Collect data to support all research
Evaluation
The Team
NIST
USF
Video Mining
Evaluation
NIST Process
[Process flow chart relating planning, products, and results:]
• Planning: determine sponsor requirements; task definitions; protocols/metrics; dry-run shakedown; rollout schedule; assess required/existing resources; develop detailed plans with researcher input; data identification; formal evaluation
• Products: evaluation plan; evaluation resources (training, development, and evaluation data); technical workshops and reports; ground truth and other metadata; scoring and truthing tools
• Results: recommendations
Evaluation
NIST Mechanics
[Flow chart: video data goes both to the algorithms under test, which produce system output, and to annotation, which produces ground truth; the evaluation scores system output against ground truth using the defined measures and reports results. A minimal code sketch of this comparison follows.]
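The transcript does not include NIST's scoring code; the sketch below is only an illustration, in Python, of the kind of frame-by-frame comparison described above. The box format, the greedy matching, and all function names are assumptions for illustration; the official VACE scoring tools use their own object matching and normalization.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), pixel coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union (spatial overlap) of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def frame_accuracy(truth: List[Box], output: List[Box]) -> float:
    """Overlap of matched ground-truth/output pairs, normalized by the
    average object count in the frame (an FDA-style score)."""
    if not truth and not output:
        return 1.0
    # Greedy one-to-one matching by decreasing overlap; a real protocol
    # would use an optimal assignment, greedy keeps the sketch short.
    candidates = sorted(
        ((iou(g, d), gi, di)
         for gi, g in enumerate(truth)
         for di, d in enumerate(output)),
        reverse=True)
    used_g, used_d, total = set(), set(), 0.0
    for score, gi, di in candidates:
        if score == 0.0 or gi in used_g or di in used_d:
            continue
        used_g.add(gi)
        used_d.add(di)
        total += score
    return total / ((len(truth) + len(output)) / 2.0)


def sequence_accuracy(truth_frames: List[List[Box]],
                      output_frames: List[List[Box]]) -> float:
    """Average the per-frame score over frames containing any objects."""
    scores = [frame_accuracy(t, o)
              for t, o in zip(truth_frames, output_frames) if t or o]
    return sum(scores) / len(scores) if scores else 1.0
```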
Evaluation
2005-2006 Evaluations
[Matrix of 2005-2006 evaluations by domain (Meeting Room, Broadcast News, UAV, Surveillance, Ground Reconnaissance) and evaluation type (detection, tracking, recognition) for P = Person, F = Face, V = Vehicle, and T = Text; the individual cell marks are not recoverable from this transcript.]
Evaluation
Quantitative Metrics
• Evaluation Metrics (sketched below):
– Detection: SFDA (Sequence Frame Detection Accuracy)
• Metric for determining the accuracy of a detection algorithm with respect to space, time, and the number of objects
– Tracking: STDA (Sequence Tracking Detection Accuracy)
• Metric for determining detection accuracy along with the ability of a system to assign and track the ID of an object across frames
– Text Recognition: WER (Word Error Rate) and CER (Character Error Rate)
• In-scene and overlay text in video
• Focused Diagnostic Metrics (11)
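For orientation, the following is a hedged sketch of how these measures are commonly written in the published VACE/CLEAR evaluation literature; the exact thresholds and normalizations used by the program's official scoring tools may differ.

```latex
% Frame Detection Accuracy at frame t, for ground-truth objects G_i(t),
% system-output objects D_i(t), and N_mapped(t) one-to-one matched pairs:
FDA(t) = \frac{\sum_{i=1}^{N_{\mathrm{mapped}}(t)}
               \frac{|G_i(t) \cap D_i(t)|}{|G_i(t) \cup D_i(t)|}}
              {\bigl(N_G(t) + N_D(t)\bigr)/2}

% SFDA averages FDA(t) over the N_f frames that contain any ground-truth
% or system-output objects:
SFDA = \frac{\sum_{t} FDA(t)}{N_f}

% STDA sums, over matched track pairs, each pair's spatio-temporal overlap
% normalized by the number of frames in which either track exists:
STDA = \sum_{i=1}^{N_{\mathrm{mapped}}}
       \frac{\sum_{t} \frac{|G_i(t) \cap D_i(t)|}{|G_i(t) \cup D_i(t)|}}
            {N_{\{G_i \cup D_i \neq \varnothing\}}}

% Word error rate for text recognition, with S substitutions, D deletions,
% and I insertions against N reference words (CER is the character-level analogue):
WER = \frac{S + D + I}{N}
```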
Evaluation
Phase II Best Results
[Bar chart: spatial detection and tracking percent error (y-axis 0% to 80%) for the best Phase II systems on each evaluation task: Text (Broadcast News), Face (Broadcast News), Face (Meeting Room), Hand (Meeting Room), Person (Meeting Room), and Moving Vehicle (UAV); individual bar values are not recoverable from this transcript.]
Evaluation
Face Detection: BNews (Score Distribution)
[Boxplot of (20%T) SFDA scores for face detection in broadcast news, by site (PPATT, UIUC, UCF); y-axis: (20%T) SFDA, 0 to 1. Individual score distributions are not recoverable from this transcript.]
Evaluation
Text Detection: BNews (SFDA Score distribution)
[Boxplot of (20%T) SFDA scores for English text detection in broadcast news, by site (COLIB, SRI_1, SRI_2, CMU_L); y-axis: (20%T) SFDA, 0 to 1. Individual score distributions are not recoverable from this transcript.]
Evaluation
Open Evaluations and Workshops (International)
• Benefit of open evaluations
– Knowledge about others' capabilities and community feedback
– Increased competition -> progress
• Benefit of evaluation workshops
– Encourage peer review and information exchange, minimize "wheel reinvention", focus research on common problems, and provide a venue for publication
• Current VACE-related open evaluations
– VACE: Core Evaluations
– CLEAR: Classification of Events, Activities, and Relationships
– RT: Rich Transcription
– TRECVID: Text Retrieval Conference Video Track
– ETISEO: Evaluation du Traitement et de l'Interprétation de Séquences Vidéo
Evaluation
Expanded
[Table mapping expanded evaluation tasks to source data (conference meetings, seminar meetings, broadcast news, UAV, surveillance, studio poses), sub-conditions (video, audio, audio + video), and sponsoring evaluation (VACE, CHIL, TRECVID); the cell-by-cell assignments are not recoverable from this transcript. Tasks listed:]
• 3D single-person tracking; 2D multi-person detection and tracking; person ID
• 2D face detection and tracking; head pose estimation
• Hand detection and tracking
• Text detection, tracking, and recognition
• Vehicle detection and tracking
• Feature extraction, shot boundary detection, and search (TRECVID)
• Acoustic event detection and environment classification (CHIL)
Evaluation
Schedule
[Gantt chart, 2005 through 2008, of the NIST/DTO VACE evaluation schedule. Tracks shown:]
• VACE Phase 2 Core: Evaluation Series 1 (text and face in meeting and broadcast news), Series 2 (hand in meeting, vehicle in UAV), Series 3 (person in meeting), and Series 4 (person/vehicle in surveillance), each followed by reported results
• VACE Phase 3 Core: Evaluation Series 5, 6, 7, and 8, each followed by reported results
• Related community evaluations running in parallel: TRECVID (search and extraction), CLEAR (person/face/pose in meeting; extraction/localization), Rich Transcription (English/Arabic text in broadcast news; multi-modal recognition), and ETISEO
[Individual milestone dates are garbled and not recoverable from this transcript.]
TECH TRANSFER
DTO Test and Assessment Activities
Purpose: Move technology from lab to operation
• Technology Readiness Activity
– An independent repository for test and assessment
– Migrate technology out of the lab environment
– Assess technology maturity
– Provide recommendations to DTO and researchers
TECH TRANSFER
DoD Technology Readiness Levels (TRL)
Level 1: Basic principles observed and reported
– Entry condition: Some peer review of ideas
– Contractor activity: Reporting on basic idea
Level 2: Technology concept and/or application formulated
– Entry condition: Target applications are proposed
– Contractor activity: Speculative work; invention
Level 3: Analytical and experimental critical function and/or characteristic proof of concept
– Entry condition: Algorithms run in contractor labs and basic testing is possible (internal, some external may be possible)
– Contractor activity: Doing analytical studies with weakly integrated components
Level 4: Component/breadboard validation in lab
– Entry condition: Proof of concept exists; test plans exist; external testing is possible
– Contractor activity: Low-fidelity integration of components
Level 5: Component/breadboard validation in relevant environment
– Entry condition: Integrated system functions outside contractor lab; some TRA tests completed
– Contractor activity: Working with realistic situations
Level 6: System/subsystem model or prototype demonstration in relevant environment
– Entry condition: IC/DoD users identified; target environment defined; simulated testing possible
– Contractor activity: Demonstrating engineering (software quality) feasibility
Level 7: System prototype demo in operational environment
– Entry condition: Test lab trials in simulated environment completed; installed in operational environment
– Contractor activity: Completing the product
Level 8: Actual system completed and qualified through test and demonstration
– Entry condition: Product completed; test lab trial completed successfully
– Contractor activity: Releasing the product; repairing minor bugs
Level 9: Actual system proven through successful mission operations
– Entry condition: Proven value-added in an operational environment
– Contractor activity: Repairing minor bugs; noting proven operational results
Technology Transfer
Applying TRL
DOD Technology Risk Scale
[Diagram: DoD technology risk scale relating TRL levels 1 through 9 to where testing occurs (contractor test facility, Info-X test facility, IC/DoD test facilities, production), to the classification level (unclassified/classified), and to the degree of DTO control versus DTO influence.]
The scale is used in assessing a project's:
• Technology maturity
• Risk level
• Commercialization potential
[Bar chart: assessed TRL (scale 0 to 9) for delivered technologies: GeoTime, Kinevis, OpinionFinder, MPSG, PIQUANT-II, Gutime, START, Informedia, Marvel, EventEstimation, MultilayerMosaicGenerator, ImportanceMapGenerator, ObjectTracking_StatCamera, Stabilization, BackgroundSubtraction; individual bar values are not recoverable from this transcript.]
Technology Transfer
TRA Maturity Assessments
[Chart: TRL assessment for each technology at three points: assessment prior to delivery, current assessment, and projected TRL at end of contract; values are not recoverable from this transcript.]
Phase III BAA
Programmatics
• Contracting Agency: DOI, Ft. Huachuca, AZ
– DOI provides COR
– ARDA/DTO retain DoD/IC agency COTRs and add more
• Currently in proposal review process
– Spans 3 FYs and 4 CYs
– Remains open through 6/30/08
• Funding objective: $30M over program life
– Anticipated to grow in FY07 and beyond
• Address the same data source domains as Phase II
• Will conduct formal evaluations
• Will conduct maturity evaluations and tech transfer
Phase III BAA
Programmatics
• Emphasis on technology and system approach
– Move up technology path where applicable
– Stress ubiquity
• Divided into two tiers:
– Tier 1: One year base with option year
• Technology focus
• Open to all – US and international
• More awards for lesser funding
– Tier 2: Two year base with option year(s)
• Comprehensive component/system level initiative
• Must be US prime
• Fewer awards for greater funding
Phase III BAA
Schedule
[Gantt chart, June 2005 through October 2006, of the Phase III BAA planning schedule. Activities shown:]
• Scope planning and DOI planning group meetings through fall 2005
• BAA RFP drafting: first draft six weeks after kickoff, second draft six weeks after the first draft, third draft completed prior to the fall workshop, then the final draft
• Contracting final review (sent to the contracting office), draft final BAA, and final BAA
• BAA announcement posted 12/15; questions/comments and proposals due through early 2006
• Proposal evaluation, source selection meeting (4/26-27), funding recommendations (4/28), and contract awards
• Phase 3 kickoff workshop (8/9-10)
[Most individual milestone dates are garbled and not recoverable from this transcript.]
Summary
Take-Aways
• VACE is interested in:
– Solving real problems with risky, radical approaches
– Processing multiple data domains and multi-modal data domains
– Developing technology point solutions as well as component/system solutions
– Evaluating technology progress
– Transferring technology into the user's space
Conclusion
Potential DTO Collaboration
• Invitations:
– Welcome to participate in VACE Phase III
– Welcome to participate in VACE Phase III Evaluations
Contacts
Dennis Moellman, Program Manager
Phones:
202-231-4453 (Dennis Moellman)
443-479-4365 (Paul Matthews)
301-688-7092 (DTO Office)
800-276-3747 (DTO Office)
FAX:
202-231-4242 (Dennis Moellman)
301-688-7410 (DTO Office)
E-Mail (Internet Mail):
[email protected]
[email protected]
Location:
Room 12A69 NBP #1
Suite 6644
9800 Savage Road
Fort Meade, MD 20755-6644