Agent-Oriented Techniques for
Programming Robots
Hans-Dieter Burkhard
Humboldt University Berlin
What is an Agent?
Someone who acts autonomously on behalf of others
• Sales agent
• Insurance agent
• Undercover agent
• .....
Software Agents
• Assistance Systems
• Search engines
• ChatterBots
•…
H.D.Burkhard, HU Berlin
AOT for Programming Robots, Durres, Sept. 10, 2008
Open Systems
Agents arrived with
open systems
Definition (Hewitt)
• Continuous availability
• Extensibility
• Decentralized control
• Asynchronous work
• Inconsistent information
• Arm's-length relationships
Consider: P2P
What is an Agent?
A program that acts autonomously on behalf of its user
An agent is a long-running program whose
work can be meaningfully described as the
autonomous completion of orders or goals
while interacting with the environment.
Further Attributes:
Intelligent, social, reactive, proactive, adaptive, …
AI as research on intelligent agents.
(cf. Textbook Russell/Norvig: Artificial Intelligence)
Agents (Autonomous Systems) in Real World
• Natural language understanding
• Image interpretation
• Driver assistance systems
• Traffic control
• Space exploration
• Autonomous robots:
– Service robots
– Rescue robots
– Entertainment robots
– Industrial robots
– Agricultural robots
–…
Autonomous Systems in Real World
Robot soccer as testbed
(How to build and program
soccer robots?)
Robot “Vision” from Team Osaka
Annual world championships and conference
Long-term goal: play like the FIFA champion by 2050
Chess vs. Soccer
1997: Deep Blue wins against human champion Kasparov
Robot “Nao” from Aldebaran
Chess:
• Static
• 3 minutes per move
• Single action
• Single player
• Information: reliable, complete
Soccer:
• Dynamic
• Milliseconds
• Sequences of actions
• Team
• Information: unreliable, incomplete
RoboCup
Melbourne 2000
Bremen 2006
Service Robots
“Willie, bring me a beer.”
Alternatives:
- from the refrigerator
- from the cellar
- from the neighbor
- from the shop
- from the internet
-…
Which alternative to choose?
What else is needed (glass, …)?
Robot Needs a World Model
“There was a beer in the refrigerator.”
Memory of environment:
Part of state in the program
Facts about the world
– maps, positions of objects, descriptions, …
Methods for processing sensory inputs
– language processing, image processing
Methods for integrating sensory data
– new world model from old model and new sensory data
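The integration step (new world model from old model and new sensory data) could be sketched as follows. This is only an illustration under stated assumptions: the class `WorldModel`, the function `integrate`, and the blending weight `trust` are hypothetical names, and simple weighted averaging stands in for whatever fusion method a real robot uses.

```python
from dataclasses import dataclass, field


@dataclass
class WorldModel:
    """Believed object positions; all names here are illustrative."""
    positions: dict = field(default_factory=dict)  # object name -> (x, y)


def integrate(model, observations, trust=0.5):
    """Return a new model blended from the old model and new sensor data.

    trust is a hypothetical weight in [0, 1] given to the new observation.
    """
    updated = dict(model.positions)
    for name, (x, y) in observations.items():
        if name in updated:
            old_x, old_y = updated[name]
            updated[name] = (
                (1 - trust) * old_x + trust * x,
                (1 - trust) * old_y + trust * y,
            )
        else:
            # First sighting: there is no old belief to blend with.
            updated[name] = (x, y)
    return WorldModel(updated)


old = WorldModel({"beer": (2.0, 0.0)})
new = integrate(old, {"beer": (4.0, 0.0), "glass": (1.0, 1.0)})
```

The old model is left untouched, so the agent can still compare its previous belief with the updated one.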
World Model
Problems:
Environment is only partially observable
Observations are uncertain and noisy
Scene interpretation with Bayesian methods, e.g.
Probability to be at location s given an observation z:
P(s|z) = P(z|s)·P(s) / P(z)
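As a small worked example of the formula, assume a finite set of candidate locations. The function name `posterior`, the two locations, and all numbers below are made up for illustration; P(z) is obtained by summing P(z|s)·P(s) over all locations.

```python
def posterior(prior, likelihood):
    """Apply P(s|z) = P(z|s) * P(s) / P(z) over a finite set of locations.

    prior: dict location -> P(s)
    likelihood: dict location -> P(z|s) for the observation z actually made
    """
    unnormalized = {s: likelihood[s] * prior[s] for s in prior}
    evidence = sum(unnormalized.values())  # P(z), by total probability
    if evidence == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: value / evidence for s, value in unnormalized.items()}


# Hypothetical numbers: the robot is equally likely to be in either room,
# and the observation z is four times as likely in the kitchen.
prior = {"kitchen": 0.5, "cellar": 0.5}
likelihood = {"kitchen": 0.8, "cellar": 0.2}
belief = posterior(prior, likelihood)
```

With these numbers the belief shifts towards the kitchen, and the posterior values again sum to one.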
World Model
World model need not be true knowledge,
only belief of the agent.
“Someone took the beer from the refrigerator!”
Plans may fail.
Need methods for revision.
Memory of Commitments
Commitments:
Part of state in the program
“Why did I go to the refrigerator?”
Tasks/Goals: Desired world states
Plans (Sequence of actions)
Rationality: Agents should only pursue
goals/plans that can be achieved
Goal Oriented Agents
Deliberation: Select goal to achieve
e.g. by calculating utilities
Means-ends reasoning: Planning method
e.g. by search in the action space
Rationality: needs measures of success/quality/benefits.
“Bounded rationality”:
Success w.r.t. available resources (information, time, …)
Utility Estimations
Different options o
Achievable by different plans p
With different results r
Value of result r : v(r)
Probability for achieving r using plan p: P(r | p)
Utility of plan p (expectation): u(p) = Σ_{r result of p} P(r | p) · v(r)
Utility of option o: u(o) = Max{ u(p) | p plan for o }
Decision process (used for simulated soccer player ATH98):
Estimate utilities for options o
Select best option o as goal g
Build plan p for g
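The utility definitions and the first two steps of the decision process can be illustrated with a minimal sketch. The option names, plan names, probabilities, and values are invented for the example, and this is not the ATH98 implementation; each plan is represented as a list of (probability, value) pairs for its possible results.

```python
def plan_utility(outcomes):
    """u(p): expected value over (probability, value) pairs of a plan."""
    return sum(prob * value for prob, value in outcomes)


def option_utility(plans):
    """u(o): utility of the best plan available for the option."""
    return max(plan_utility(outcomes) for outcomes in plans.values())


def select_goal(options):
    """Pick the option with the highest estimated utility as the goal."""
    return max(options, key=lambda name: option_utility(options[name]))


# Hypothetical options, plans, and numbers.
options = {
    "beer from refrigerator": {
        "walk to kitchen": [(0.5, 10.0), (0.5, 0.0)],
    },
    "beer from shop": {
        "walk to shop": [(1.0, 4.0)],
        "order delivery": [(0.5, 6.0), (0.5, 0.0)],
    },
}
goal = select_goal(options)
```

Here the refrigerator option has utility 5.0 and the shop option 4.0 (its better plan), so the refrigerator is selected as goal; building the plan for that goal would follow as a separate step.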
Rationality (Realism)
Goals must be feasible
Selection process:
1. Rough estimation (utilities)
2. If means-ends reasoning (planning) fails:
revise the goal selection
Refinement of Goals
Refinement as iterated decision-process:
Long-term goal
→ intermediate goals
→ ...
→ intermediate goals → actions
Analogy: Stack of procedure calls
Least commitment: Specification only as far as necessary.
Maintaining Multiple Goals: BDI-Approach
Belief (world model)
Desire (desirable future world states)
Intentions (world states to be achieved)
Desires may be in conflict
Intentions must not be in conflict (rationality)
Mental states are based on models of human action (especially
w.r.t. bounded rationality)
M.E. Bratman: Intention, Plans, and Practical Reason,
Harvard University Press, Cambridge, Massachusetts, 1987.
Adaptation vs. Stability
Conflicts between old intentions
and potential new intentions (desires)
“There is a beer on the table!”
Adaptation: always select the best intentions
Stability: continue old intentions
Advantages of stability:
Reliability (important for cooperation)
Reduce overhead for changes
Avoid oscillations
Disadvantages of stability:
Sticking too long to unsatisfactory behavior (fanaticism)
BDI: Screen of Admissibility
Bratman’s solution
for conflicts between old and potential new intentions:
Old intentions restrict admissibility of new intentions,
i.e. set a filter for
- additional intentions
- refinement of intentions
Bounded Rationality
Efficiency:
Reduce repeated evaluation of adopted intentions.
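A minimal sketch of such a filter, assuming conflicts are given as an explicit set of incompatible pairs. The functions `admissible` and `adopt` and the example intentions are hypothetical; the point is only that adopted intentions are kept as they are and merely screen new candidates.

```python
def admissible(candidate, intentions, conflicts):
    """A candidate passes the screen if it conflicts with no adopted intention.

    conflicts is a set of frozensets naming pairs that cannot be pursued together.
    """
    return all(frozenset((candidate, old)) not in conflicts for old in intentions)


def adopt(desires, intentions, conflicts):
    """Add admissible desires to the intentions without re-evaluating old ones."""
    result = list(intentions)
    for desire in desires:
        if desire not in result and admissible(desire, result, conflicts):
            result.append(desire)
    return result


# Hypothetical example: serving beer rules out staying in the charging station.
conflicts = {frozenset(("serve beer", "stay in charging station"))}
intentions = adopt(
    ["stay in charging station", "get glass"],
    ["serve beer"],
    conflicts,
)
```

The old intention "serve beer" is never reconsidered here, which is where the efficiency gain comes from; the price is the risk of sticking to it for too long.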
BDI Agents
BDI architectures widely used
Implementation in different variations
Often only in simplified manner
desire = goal
intention = plan
without parallel intentions
Putting Together: Sense-think-act Cycle
Logical ordering of the agent's internal processing
1. Sense (“input”) + perception (interpretation, world model)
2. Think (“decision”: evaluation, planning)
3. Act (“output”)
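The three steps can be sketched as a sequential loop. The toy environment (a single number `distance_to_ball`) and the function names are assumptions made for this example; a real robot would read sensors and drive actuators instead.

```python
def sense(environment, beliefs):
    """Sense + perception: fold the raw reading into the world model."""
    updated = dict(beliefs)
    updated["distance_to_ball"] = environment["distance_to_ball"]
    return updated


def think(beliefs):
    """Decision: choose an action from the current beliefs."""
    return "kick" if beliefs["distance_to_ball"] <= 0 else "step"


def act(environment, action):
    """Output: apply the chosen action to the (simulated) environment."""
    if action == "step":
        environment["distance_to_ball"] -= 1
    else:
        environment["kicked"] = True


def run(environment, max_cycles=10):
    """Repeat sense, think, act strictly one after another."""
    beliefs, trace = {}, []
    for _ in range(max_cycles):
        beliefs = sense(environment, beliefs)
        action = think(beliefs)
        act(environment, action)
        trace.append(action)
        if environment.get("kicked"):
            break
    return trace


trace = run({"distance_to_ball": 2})
```

In this sequential form a slow `think` step delays both the next `sense` and the next `act`.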
[Figure: cycle of sense, think, act]
Sense-think-act Cycle
Synchronisation (sequential)
[Figure: timeline (axis: time) with input, sense, think, act, output]
Sense-think-act Cycle
Synchronisation (concurrent)
[Figure: timeline (axis: time) with input, sense, think, act, output]
Sense-think-act Cycle
Synchronisation problems
For complicated deliberation processes
[Figure: timeline (axis: time) with input, sense, think, act, output; one step marked “?”]
Different Deliberation Times
Layered architectures with different deliberation cycles, e.g.
- Immediate reactions (avoid obstacles)
- Short term planning
- Long term planning
AIBO:
30 images per second
125 motor commands per second
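To illustrate layers running at different cycle rates, the following sketch computes a simulated activation schedule for the two AIBO rates given above. The function `schedule` and the layer names are hypothetical, and time is simulated with exact fractions rather than a real-time clock.

```python
from fractions import Fraction


def schedule(rates_hz, duration_s=1):
    """Return the chronological list of (time, layer) activations.

    rates_hz maps a layer name to its integer cycle rate per second.
    """
    events = []
    for layer, rate in rates_hz.items():
        period = Fraction(1, rate)
        for k in range(rate * duration_s):
            events.append((k * period, layer))
    # Sort by time only; layers due at the same instant keep dictionary order.
    events.sort(key=lambda event: event[0])
    return events


events = schedule({"motor": 125, "vision": 30})
motor_runs = sum(1 for _, layer in events if layer == "motor")
vision_runs = sum(1 for _, layer in events if layer == "vision")
```

Between two consecutive images the motor layer has to issue about four commands, so it cannot wait for the result of image interpretation each time.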
Structures: Layered Architectures
[Figure: Agent (Layer 1, Layer 2, ..., Layer n) and Environment, connected by sense and act]
Synchronization
Conflicts
Concurrency
Layered Architectures with Mediator
[Figure: Agent (Layer 1, Layer 2, ..., Layer n, Mediator) and Environment, connected by sense and act]
1-Pass-Architecture
[Figure: Agent (Layer 1, Layer 2, ..., Layer n) and Environment, connected by sense and act]
2-Pass-Architecture
[Figure: Agent (Layer 1, Layer 2, ..., Layer n) and Environment, connected by sense and act]
How to Deal with a Dynamic World
Changing situations
Changing expectations
Unexpected situations (e.g. obstacles)
Changing plans
Plans may fail.
Need methods for revision.
Conflict handling by BDI-approach
Least Commitment: Deliberate only as far as necessary
Double pass architecture (DPA)
Option Hierarchies
• Serve beer (and-branch)
– Get glass ...
– Get bottle (or-branch)
   from Shop (and-branch): Get Money, Goto Shop, Buy Bottle, Go home
   from Refr. (and-branch): Go to Refr., Open Refr., Take Bottle
   ...
– Open bottle ...
– Fill glass ...
– Bring Glass ...
“And-branches”
- all suboptions have to be achieved
“Or-branches” (Alternatives)
- one suboption has to be achieved
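The and/or semantics of the option hierarchy can be made concrete with a short sketch. The class `Option` and the function `achievable` are hypothetical names; only the “Get bottle” subtree is modelled, because the other suboptions are not expanded on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    name: str
    kind: str = "leaf"  # "and", "or", or "leaf"
    children: list = field(default_factory=list)


def achievable(option, feasible_leaves):
    """And-branch: all suboptions must be achievable; or-branch: at least one."""
    if option.kind == "leaf":
        return option.name in feasible_leaves
    results = [achievable(child, feasible_leaves) for child in option.children]
    return all(results) if option.kind == "and" else any(results)


from_refr = Option("from Refr.", "and", [
    Option("Go to Refr."), Option("Open Refr."), Option("Take Bottle")])
from_shop = Option("from Shop", "and", [
    Option("Get Money"), Option("Goto Shop"), Option("Buy Bottle"), Option("Go home")])
get_bottle = Option("Get bottle", "or", [from_refr, from_shop])

refr_leaves = {"Go to Refr.", "Open Refr.", "Take Bottle"}
shop_leaves = {"Get Money", "Goto Shop", "Buy Bottle", "Go home"}
```

For example, `achievable(get_bottle, refr_leaves)` holds because one alternative suffices for an or-branch, while removing “Take Bottle” from the feasible set makes the refrigerator alternative fail as a whole.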
Intention Tree
[Figure: option tree for “Serve beer” (Get glass, Get bottle from Refr. or from Shop, Open bottle, Fill glass, Bring Glass)]
Options may be in different states, e.g.
- intended
- active
- done
Intention Tree
[Figure: tree for “Serve beer” with Get glass, Get bottle from Refr. (Go to Refr., Open Refr., Take Bottle), Open bottle, Fill glass, Bring Glass]
Options may be in different states, e.g.
- intended
- active
- done
Activation Path
Part of intention tree
[Figure: tree for “Serve beer” with Get glass, Get bottle from Refr. (Go to Refr., Open Refr., Take Bottle), Open bottle, Fill glass, Bring Glass]
Options may be in different states, e.g.
- intended
- active
- done
Plan Fails
[Figure: tree for “Serve beer”; at Get bottle from Refr. (Go to Refr., Open Refr., Take Bottle): “No Beer inside”]
Need for re-deliberation:
Look for alternatives
Repair: Intention Tree
[Figure: tree for “Serve beer”; Get bottle now also has the alternative from Shop (Get Money, Goto Shop, Buy Bottle, Go home) next to from Refr.]
Re-deliberation not by chronological backtracking
Double Pass Architecture (DPA)
2 Passes:
- Deliberation determines intention tree
modification if necessary (re-deliberation)
- Executor works over intention tree
maintains activation path (top-down processing)
controls actuators
Advantages over stack-oriented approaches:
A procedure stack has access only to the most recent call
Implementations: XABSL, DPA
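The two passes can be sketched roughly as follows. This is a simplified illustration, not the XABSL or DPA implementation: the class `Node`, the functions `deliberate`, `execute`, and `run`, and the `world` dictionary that makes a primitive action fail are all assumptions, and option names are assumed to be unique.

```python
class Node:
    """Option in an intention tree (illustrative, not the XABSL data model)."""

    def __init__(self, name, children=(), alternatives=()):
        self.name = name
        self.children = list(children)          # and-branch, executed in order
        self.alternatives = list(alternatives)  # or-branch, one is chosen
        self.chosen = None
        self.state = "intended"


def deliberate(node, failed):
    """Pass 1: fix one alternative per or-branch, skipping failed ones."""
    if node.alternatives:
        for alt in node.alternatives:
            if alt.name not in failed and deliberate(alt, failed):
                node.chosen = alt
                return True
        node.chosen = None
        return False
    return all(deliberate(child, failed) for child in node.children)


def execute(node, world, failed, log):
    """Pass 2: walk the tree top-down; options already done are not repeated."""
    if node.state == "done":
        return True
    node.state = "active"
    if node.alternatives:
        known = len(failed)
        ok = execute(node.chosen, world, failed, log)
        if not ok and len(failed) == known:
            failed.add(node.chosen.name)  # remembered for re-deliberation
    elif node.children:
        ok = all(execute(child, world, failed, log) for child in node.children)
    else:
        log.append(node.name)
        ok = world.get(node.name, True)   # primitive action succeeds by default
    node.state = "done" if ok else "intended"
    return ok


def run(root, world):
    failed, log = set(), []
    while deliberate(root, failed):
        known = len(failed)
        if execute(root, world, failed, log):
            return True, log
        if len(failed) == known:  # failure outside any or-branch: no repair
            break
    return False, log


def leaves(*names):
    return [Node(name) for name in names]


from_refr = Node("from Refr.", leaves("Go to Refr.", "Open Refr.", "Take Bottle"))
from_shop = Node("from Shop", leaves("Get Money", "Goto Shop", "Buy Bottle", "Go home"))
serve = Node("Serve beer", [
    Node("Get glass"),
    Node("Get bottle", alternatives=[from_refr, from_shop]),
    Node("Open bottle"), Node("Fill glass"), Node("Bring Glass"),
])
success, log = run(serve, {"Take Bottle": False})  # no beer inside
```

After “Take Bottle” fails, only the “Get bottle” branch is repaired by switching to the shop alternative. “Get glass” keeps its state `done` and is not executed again, which is the difference from chronological backtracking on a procedure stack.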
Still: Classical Approach (“Dualism”)
Robot = Agent (Brain) augmented by Sensors + Actuators
[Figure: Robot (Sensors, Agent (program), Actuators) in its Environment, with Input and Output]
Limitations for Complex Actuators
Vehicles have simpler actuation than legged robots
Vehicles:
• Accelerate
• Drive
• Turn
• Stop
Legged robots:
• Coordination of limbs
• Complex kinematics
• Stability maintenance (even in stop state)
Machine Learning
Use “trial and error”.
• Evolutionary algorithms
• Reinforcement learning
• Case-based reasoning
• Neural networks
http://www.robocup.de/AT-Humboldt/simloid-evo.shtml?de
Proprioception: Feeling One's Own Body
Biologically Inspired Robotics
Emergent behavior using situatedness in physical world
Intelligence emerges by “clever connections”
Many sensors
Local processing
Coupling with actuators
Neural Networks
New insights for Artificial Intelligence:
Intelligence needs a body for experiencing the real world.
Acceleration Sensors on Our Robots
Accelboards:
• real time (10 ms cycle)
• C/Assembler program
• local processing
[Figure: robot with board labels ABSR, ABML, ABAL, ABAR, ABHL, ABHR, ABFR, ABFL]
Recent Experiments
Local control by Recurrent Neural Network
Networks developed by evolution
See you at RoboCup 2009 in Graz!
Thank you!