Object Recognition Through Reasoning About Functionality:
A Survey of Related Work and Open Problems
Louise Stark
University of the Pacific
Stockton, California
Melanie Sutton
University of West Florida
Pensacola, Florida
Dagstuhl Oct09
Function-Based Research
Dr. Louise Stark
University of the Pacific
Stockton, CA
University of the Pacific
University of the Pacific
University of West Florida
Pre/post-hurricane season…
Seminar Goals
• This seminar brings together scientists
from disciplines such as computer science,
neuroscience, robotics, developmental
psychology, and cognitive science
Seminar Goals
• Hope to further the knowledge
• how the perception of form relates to
object function
• how intention and task knowledge (and
hence function) aids in the recognition of
relevant objects
Overview
• Recognition based on functionality
• Overview of GRUFF approach
• Functionality in Related Disciplines
• Open Problem Areas
Function-based Approaches
Cognitive Psychology/Human Perception
Representations of object categories
Human-robot interaction strategies
Wayfinding
Artificial Intelligence
Computer Vision
Formal representations of knowledge
Machine learning techniques
to automate reasoning
Document/aerial image analysis
Interpreting human motion
Object recognition/categorization
Robotics
Mapping of indoor environments
Object detection
Navigation/interaction plans
Formalisms for autonomous robot control
Computer Vision?
• Deriving meaningful descriptions of the
environment from images
• Descriptions needed for
• Recognition
• Manipulation
• Reasoning about objects
Generic Object Recognition
Minsky (1991)
• Argued for the necessity of representing
knowledge about functionality
• “… rarely use a representation in an intentional
vacuum, but we always have goals…”
• “… we must classify things… according to
what they can be used for.”
Motivation
Parameterized Model
Structural Model
Could these
be
recognized?
GRUFF
Generic
Recognition
Using
Form and
Function
chair (cher) n. - a piece
of furniture for one
person to sit on
What is the goal?
Develop alternative approaches to generic
object recognition & manipulation
- concentrate on man made objects (artifacts)
Human artifacts – existence or non-existence of
properties can be deduced by analyzing the shape of an
object
For any particular object category – there is some set of
functional properties shared by ALL objects in that
category.
Approach to the Problem
• Derive the format of my function-based
representation
• Confirm feasibility of approach in a test domain:
perfect input - planar face models
• Expand the domains
• Test real data
• Interact to confirm functionality
• Exploit contextual information
Knowledge in GRUFF is of three types:
A category hierarchy which specifies
superordinate / basic / subordinate categories
furniture / chair / arm chair
Functional properties that define each category
(provides_sittable_surface, provides_stability,...)
Knowledge primitives used to reason about shape
(dimensions, relative orientation, ...)
All organized into a "category definition tree"
which is GRUFF's knowledge about the world.
Category Representation Tree
Conventional
Chair
Provides Sittable
Surface
Provides Stable
Support
We imagine the definition of a generic object category to be
something like...
straight_back_chair ::= provides_seating_surface +
provides_stability + provides_back_support_surface
and recognition is conceptualized as ...
Provides_back_support
provides_arm_support
Provides_sittable_surface
provides_stable_support
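A minimal sketch of the definition above, assuming a category can be stored as the set of functional properties it requires. The straight_back_chair entry mirrors the slide; the arm_chair entry and the function name matching_categories are hypothetical, added only for illustration.

```python
# Hypothetical category definitions. The first entry mirrors the slide;
# the second is an illustrative extension, not taken from GRUFF itself.
CATEGORY_DEFINITIONS = {
    "straight_back_chair": {
        "provides_seating_surface",
        "provides_stability",
        "provides_back_support_surface",
    },
    "arm_chair": {
        "provides_seating_surface",
        "provides_stability",
        "provides_back_support_surface",
        "provides_arm_support",
    },
}


def matching_categories(satisfied_properties):
    """Return every category whose required properties are all satisfied."""
    satisfied = set(satisfied_properties)
    return sorted(
        name
        for name, required in CATEGORY_DEFINITIONS.items()
        if required <= satisfied  # subset test: all requirements are met
    )
```

An object that satisfies the three straight-back requirements but offers no arm support would match only straight_back_chair under this sketch.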
Shape-based Knowledge Primitives
A functional requirement such as provides_sittable_surface
is implemented as a sequence of calls to shape-based operators.
dimensions(shape_element, dimensions_type,
range_parameters)
relative_orientation(normal_1, normal_2,
range_parameters)
clearance(shape_element,
clearance_volume)
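A minimal sketch of how such a requirement might be written as a sequence of operator calls. It assumes a face is a plain dictionary with area, height and normal entries, that the operators return booleans rather than graded measures, and that the numeric ranges (metres, square metres, degrees) are reasonable placeholders. None of these details come from the slides.

```python
import math


def angle_between(n1, n2):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def relative_orientation(normal_1, normal_2, range_parameters):
    """True if the angle between the normals lies in the given range."""
    low, high = range_parameters
    return low <= angle_between(normal_1, normal_2) <= high


def dimensions(shape_element, dimensions_type, range_parameters):
    """True if the named dimension of the element lies in the given range."""
    low, high = range_parameters
    return low <= shape_element[dimensions_type] <= high


def provides_sittable_surface(face, up=(0.0, 0.0, 1.0)):
    """Hypothetical requirement: big enough, right height, roughly level."""
    return (
        dimensions(face, "area", (0.1, 1.0))
        and dimensions(face, "height", (0.35, 0.55))
        and relative_orientation(face["normal"], up, (0.0, 10.0))
    )
```

A clearance check is left out here because it needs a full 3D model of the surrounding shape.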
Knowledge Primitives
Abstract shape reasoning
• Metric
dimensions (width, depth, height,
area, contiguous surface, volume)
• Proximity
• Relative orientation
• Clearance
• Stability
• Enclosure
Knowledge Primitives
Physical interaction reasoning
• Change orientation
• Apply force
• Observe deformation
Evaluation Measures
Value returned from knowledge
primitive invocation
[Figure: evaluation measure, from 0.0 to 1.0, plotted against values of the shape property; values marked on the horizontal axis: least, low ideal, high ideal, greatest]
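A minimal sketch of such an evaluation measure, assuming a trapezoidal shape: 1.0 between the low and high ideal values, 0.0 at or beyond the least and greatest values, and a linear ramp in between. The trapezoid and the function name are assumptions; the slide only names the axis marks.

```python
def evaluation_measure(value, least, low_ideal, high_ideal, greatest):
    """Map a measured shape property to a score in [0.0, 1.0].

    Assumed shape (not stated on the slide): full score inside the
    ideal range, zero outside [least, greatest], linear ramps between.
    """
    if low_ideal <= value <= high_ideal:
        return 1.0
    if value <= least or value >= greatest:
        return 0.0
    if value < low_ideal:
        # Rising ramp between least and low_ideal.
        return (value - least) / (low_ideal - least)
    # Falling ramp between high_ideal and greatest.
    return (greatest - value) / (greatest - high_ideal)
```

With hypothetical seat-height limits of 0.2, 0.4, 0.5 and 0.7 metres, a height of 0.45 scores 1.0 and a height of 0.3 scores about 0.5.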
Combining Evidence
• Combine required measurements using
probabilistic AND (0-1)
• Combine descendant subcategory node
measures using probabilistic OR
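A minimal sketch of the two combination rules, assuming the usual definitions: probabilistic AND as the product of the measures, and probabilistic OR as one minus the product of their complements. The slide does not spell out the formulas, so treat these as the standard forms rather than GRUFF's exact code.

```python
import math


def prob_and(measures):
    """Probabilistic AND: product of measures, each in [0, 1]."""
    return math.prod(measures)


def prob_or(measures):
    """Probabilistic OR: 1 - product of (1 - m); equals a + b - a*b for two."""
    return 1.0 - math.prod(1.0 - m for m in measures)
```

Under these definitions a single required measure of 0.0 forces the AND to 0.0, while the OR can only grow as more subcategory measures are added.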
Recognition Process
• Category representation graph is
control structure
• Structural Constraint Propagation –
subcategory nodes constrained by what
was found for the parent
Recognition Stage
2 approaches
1. Check all known categories in the
knowledge base
2. Confirm/deny object can/cannot
function as a specified (sub)category
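A minimal sketch of the two approaches, assuming each category in the knowledge base is a function that returns an evaluation measure for an object. The object format, the two toy category functions, and the 0.5 acceptance threshold are all hypothetical.

```python
def chair_measure(obj):
    """Toy measure: sittable surface AND stable support (product)."""
    return obj.get("sittable_surface", 0.0) * obj.get("stable_support", 0.0)


def table_measure(obj):
    """Toy measure: work surface AND stable support (product)."""
    return obj.get("work_surface", 0.0) * obj.get("stable_support", 0.0)


# Hypothetical knowledge base: category name -> measure function.
KNOWLEDGE_BASE = {"chair": chair_measure, "table": table_measure}


def check_all_categories(obj, threshold=0.5):
    """Approach 1: evaluate the object against every known category."""
    results = {name: measure(obj) for name, measure in KNOWLEDGE_BASE.items()}
    return {name: m for name, m in results.items() if m >= threshold}


def confirm_category(obj, category, threshold=0.5):
    """Approach 2: confirm or deny one specified (sub)category."""
    return KNOWLEDGE_BASE[category](obj) >= threshold
```

The first approach answers "what could this be used as?", while the second answers "can this be used as a chair?" without evaluating the rest of the knowledge base.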
Valid Chairs Recognized by GRUFF
History of GRUFF Project
Context-based Reasoning
GRUFF: Generic object recognition system
Reasons about and generates plans for
understanding 3D scenes of objects
Extension to Context-based Reasoning:
Determine significance of accumulated
functional evidence to infer the existence
of scene concepts
Functionality in the Large
What makes an 'office' an office?
A desk with at least one chair in close
proximity.
You categorize areas or workspaces by
the functional configuration of the objects
in the area.
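A minimal sketch of the office rule above, assuming each recognized object carries a category label and a 2D floor position in metres. The 1.5 m proximity limit and the function name is_office are hypothetical.

```python
import math


def is_office(objects, max_distance=1.5):
    """True if some desk has at least one chair within max_distance."""
    desks = [o for o in objects if o["category"] == "desk"]
    chairs = [o for o in objects if o["category"] == "chair"]
    return any(
        math.dist(desk["position"], chair["position"]) <= max_distance
        for desk in desks
        for chair in chairs
    )
```

The rule looks only at the functional configuration of the objects, not at walls or room labels.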
Context-based Reasoning
Name: Office
Type: Category
Function Verification Plan (Realized by / Potential Results):
• Provides potential seating
  - Infer Seating Areas
  - Infer Back Support
• Provides potential worksurfaces
  - Infer worksurfaces
Reasoning levels: Context-based Reasoning, Shape-based Reasoning
What Did Change?
• Multiple objects in scene
• Relax functional requirements
• Allow partial evidence
What Did Not Change?
• Basic set of functional primitives
• Organization of the representation
• Categorization, not identification
Test Data
Simulated data
- Complete 3D models evaluated
no occlusion surfaces
- Partial 3D models derived from laser
range finder simulation tool
Real data
- Stereo camera system generating
range data (SRI's Small Vision
System software)
Test Scenes Used in
Context-based Reasoning
Test Scenes Used in
Context-based Reasoning
Context-based Reasoning System
Infer contextual relationships from
accumulated functional evidence
Provides potential
worksurfaces
Provides potential seating
(back support and/or seating area)
What is the goal?
Question –
How do we recognize objects we have never
previously encountered?
- we don't have a model (or do we?)
Essentially: we categorize objects using some type of
"model"
Earlier Work
Roberts
“Machine perception of three dimensional solids” 1965
• Analyze intensity image
• Extract edge information
• Match against library of geometric models
- “Model-based vision” paradigm
- “Single arbitrary view 3-D object
recognition” paradigm
Earlier Work
Binford
“Survey of model-based image analysis systems” 1982
“The essential definition of object class is
functional. …
Object classes have an associated 3-D form:
form equals function. …”
Earlier Work
Binford
“Survey of model-based image analysis systems” 1982
“An object’s function is often a geometric
function. The function of a room is to be an
enclosing volume. … The function of a
chair… is to be a flat surface at a
comfortable height for sitting….”
Earlier Work
Winston, Binford, Katz and Lowry
“Learning physical descriptions from functional
definitions, examples and precedents” 1984
• Discussed use of function-based
definitions of object categories
• Infinity of individual physical
descriptions of objects in a category…
• Single functional description to
represent all (cup example)
Earlier Work
Brady, Agre, Braunegg and Connell
“The mechanic’s mate” 1985
Connell and Brady
“Generating and generalizing models of visual objects”
1987
• Discussed relation between geometric
structure and functional significance
• Generalized structural description
learned from sequence of examples
Earlier Work
Minsky
“The Society of Mind”, 1985
“… The solution is that we need to combine
at least two different kinds of descriptions.
On one side, we need structural
descriptions for recognizing chairs when
we see them.”
Earlier Work
Minsky
“The Society of Mind”, 1985
“… On the other side we need functional
descriptions in order to know what we can
do with them… we need connections
between parts of the chair structure and
the requirements of the human body that
those parts are supposed to serve.”
Background
DiManzo, Trucco, Giunchiglia, Ricci
“FUR: Understanding Functional Reasoning”, 1989
• Utilized functional knowledge within an
expert system framework
• Primitives defined as individual expert
systems that evaluate 3D information
Background
Rivlin and Rosenfeld
“Navigational Functionalities”, 1995
• Explored functionality as it relates to
mobile robots
• Navigating agent may classify objects
in its environment in functional terms
as “threat,” “landmark” and so on.
Function-based Approaches
Cognitive Psychology/Human Perception
Representations of object categories
Human-robot interaction strategies
Wayfinding
Artificial Intelligence
Computer Vision
Formal representations of knowledge
Machine learning techniques
to automate reasoning
Document/aerial image analysis
Interpreting human motion
Object recognition/categorization
Robotics
Mapping of indoor environments
Object detection
Navigation/interaction plans
Formalisms for autonomous robot control
Artificial Intelligence
Two areas within AI that impact function-based research
• Work on formal representations of
knowledge about functionality
• Application of machine learning techniques
to automate the process of constructing
function-based systems
Artificial Intelligence
• AI approach developed greater formalism
and depth than that in computer vision
• Advantage as complexity of system
requirements increases
Robotics
• Incorporate best practices from other fields
• Evolution
• Service robots (controlled environment)
• Interaction to confirm function
• General navigational systems
Human Perception Theories
• Klatzky et al. (2005)
• observe how children interact with
objects associated with a specific function
• use information in design of algorithms
for robotic interaction with objects to
reason about their function
Functional Knowledge Representation
• Barsalou et al. (2005)
• HIPE (History, Intentional perspective,
Physical environment, and Event
sequences)
• Raubal and Moratz (2007)
• expanded on theory
• representation of affordance-based
attributes
Affordances?
Goal is object recognition using function
According to Webster…
Affordance - <graphics> A visual clue to the
function of an object.
Yes, GRUFF uses affordances
Affordances
Some interpretations of Gibson's affordances
• Automatic
• Pop out – no processing necessary
Have to admit – there were (are) different
camps
Affordances
According to Gibson
“If you know what can be done with… an
object, what it can be used for, you can
call it whatever you please.”
Affordances
• Considered an error if an object is
misclassified. Yes or no?
www.businesssupply.com
Affordances
According to Gibson
“If a surface of support is knee-high above the
ground, it affords sitting on.
We call it a seat in general.
If it can be discriminated as having just these
properties, it should look sit-on-able.
If it does, the affordance is perceived visually.”
Yes, GRUFF uses affordances
Affordances
Yes, it is a chair
Gibson’s Theory of Affordances
• Properties noted, with the corresponding Knowledge Primitives:
• horizontal: Relative Orientation
• flat: Planar
• extended: Metric Dimensions
• rigid: Requires Interaction
Physical properties, measured relative to the
animal. (Shape Properties)
The Ecological Approach to Visual Perception, J.J. Gibson
Open Problems: Across Disciplines
Work to ensure:
• scalability
• efficiency
• accuracy
• ability to learn
What we learned from
GRUFF
Open Problem Areas
Data Flow
End Goals
Provides potential
containment
Provides potential
table area
Infer contextual relationships from
accumulated functional evidence…
Provides potential worksurfaces
Provides potential seating
(back support and/or seating area)
Infer affordances
“in the large”…(in scale-space)
Factors Influencing System Complexity
Degree of Interaction
Feedback from Interaction
Complexity of Interaction
From: “Function From Visual Analysis and Physical Interaction.” M. Sutton, L. Stark, & K. Bowyer. Image and Vision
Computing. 16 (1998) 745-763.
Knowledge Representation
The internal architecture utilized for reasoning
about affordances:
Representation
Representation
Action/Observation Sequence
Interaction Tests for a Cup Object
Action/Observation Sequence
Action/Observation Sequence
Results: Furniture-like objects
Results: Dish-like objects
Representative OPUS Models
Representative Image Sequences
Results: Segmentation Issues
Segmentation Issues
Summary of Unpredicted Subsystem Failures
Category | Model Building Subsystem | Shape-based Reasoning Subsystem | Interaction-based Reasoning Subsystem
Chairs | 13/45 (29%) | 8/32 (25%) | 3/18 (17%)
Cups | - | 0/27 (0%) | 7/27 (26%)
Task/Affordance Driven Data Flow
Use task information -> Capture image pair -> Calculate disparity and range data (and evaluate) -> Perform segmentation (and evaluate) -> Perform function-based reasoning (and evaluate)
Feedback: Reset parameters?
Data flow from function-based reasoning to refinement of image acquisition
and range segmentation parameters.
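A minimal sketch of the data flow above as a retry loop, assuming each stage is a pair of callables (one that runs the stage, one that evaluates its output) and that a failed evaluation resets the parameters and restarts from image capture. The function name, the stage interface, and the toy stages are hypothetical; the slides do not say how parameters are reset.

```python
def run_with_feedback(stages, params, adjust, max_rounds=5):
    """Run (name, run, evaluate) stages in order.

    On a failed evaluation, call adjust(params, failed_stage_name) and
    restart the pipeline. Returns (result, params, rounds_used); result
    is None if no round passed every evaluation.
    """
    for round_index in range(max_rounds):
        data = None
        failed = None
        for name, run, evaluate in stages:
            data = run(data, params)
            if not evaluate(data):
                failed = name
                break
        if failed is None:
            return data, params, round_index + 1
        params = adjust(params, failed)  # "Reset parameters" feedback step
    return None, params, max_rounds


# Toy stages: the "range" stage only passes once the window reaches 7.
stages = [
    ("range", lambda _data, p: p["window"], lambda d: d >= 7),
    ("segmentation", lambda d, _p: d * 2, lambda d: d > 0),
]


def widen_window(p, _failed_stage):
    """Toy parameter reset: enlarge the hypothetical 'window' setting."""
    return {**p, "window": p["window"] + 2}


result, final_params, rounds = run_with_feedback(stages, {"window": 5}, widen_window)
```

In this toy run the first round fails at the range stage, the window is widened from 5 to 7, and the second round passes both evaluations.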
Implementation Level: Parameter Sets
Implementation Level:
Metrics / Error Calculations
Real Data: SVS
Real Data: Parameter Variations
Surface Extraction and Use of Affordances
Capture image pair -> calculate disparity and range ->
evaluate range data -> perform/evaluate range
segmentation -> perform/evaluate object recognition
Question: How can use of affordances be incorporated into
feedback loops?
Guiding Questions
AND ANSWERS! (from previous Dagstuhl seminar)
What could or should a robot control architecture look like
that makes use of affordances as first-class items in
perceiving the environment?
How could or should such an architecture make use of
affordances for action and reasoning?
Is there more to affordances than function-oriented
perception, action and reasoning?
Guiding Questions
AND ANSWERS! (from previous Dagstuhl seminar)
Should affordances in a robot be programmed or
learned? (Can they be programmed in the first place?)
What about an affordance needs to be represented in a
robot, and how?
How and where in the architecture would attention,
intention, or other internal states filter affordances that
were perceived on a low level?
How would affordance-based control go together with
behavior-based and plan-based control? Is it
complementary? Redundant? Inconsistent?
How can affordances be used for reasoning and action?
Affordances:
…in space
and time…
Affordances:
…within
subsystems…
…supervisors,
specialists,
agents…
Affordances:
…in
scale-space…
In a similar vein, trying to understand perception by studying only neurons
is like trying to understand bird flight by studying only feathers:
It just cannot be done. In order to understand bird flight,
we have to understand aerodynamics; only then do
the structure of feathers and the different
shapes of birds’ wings make sense.
David Marr (1982)
QUESTIONS?
Thank You!