DoshiAIPR07 - University Of Georgia
Artificial Intelligence and Pattern Recognition (AIPR), 2007
On the Role of Interactive Epistemology in Multiagent Planning
Author: Prashant Doshi
Speaker: Sergei Fogelson
Dept. of Computer Science and AI Center, University of Georgia
Decision-theoretic Planning
Single agent setting
[Figure: agent i's action a_i drives the state transition of the physical state (Loc, Orient, ...) across time steps t-1, t]
Markov decision processes (MDPs)
State is perfectly observable
Act to optimize expected reward
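The "act to optimize expected reward" objective can be sketched with value iteration on a toy MDP; the states, actions, transition probabilities, and rewards below are illustrative, not from the talk.

```python
# Value-iteration sketch for a toy fully observable MDP (hypothetical
# states/actions/rewards; not from the talk).

GAMMA = 0.9
STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]

# T[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
T = {
    "s0": {"stay": [("s0", 1.0)], "move": [("s1", 0.8), ("s0", 0.2)]},
    "s1": {"stay": [("s1", 1.0)], "move": [("s0", 0.8), ("s1", 0.2)]},
}
R = {"s0": {"stay": 0.0, "move": 1.0}, "s1": {"stay": 2.0, "move": 0.0}}

def q_value(V, s, a):
    """Immediate reward plus discounted expected future value."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in T[s][a])

def value_iteration(tol=1e-6):
    """Iterate the Bellman optimality backup to convergence."""
    V = {s: 0.0 for s in STATES}
    while True:
        V_new = {s: max(q_value(V, s, a) for a in ACTIONS) for s in STATES}
        if max(abs(V_new[s] - V[s]) for s in STATES) < tol:
            return V_new
        V = V_new

def greedy_policy(V):
    """Best action in each state under the converged values."""
    return {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
```

Here the greedy policy moves out of `s0` toward the rewarding `s1` and stays there.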
Decision-theoretic Planning
Single agent setting
[Figure: agent i now maintains beliefs; its action a_i drives the state transition, and its observation o_i updates the beliefs over the partially observable physical state (Loc, Orient, ...)]
Partially observable Markov decision processes (POMDPs)
State is partially observable
Act to optimize expected reward given beliefs
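The belief maintenance above is a Bayes filter over the hidden state; a minimal sketch on a hypothetical two-state "listen" problem (all probabilities illustrative):

```python
# Minimal POMDP belief-update sketch: b'(s') ∝ O(o|s',a) Σ_s T(s'|s,a) b(s).
# The two-state "listen" problem below is illustrative, not from the talk.

STATES = ["left", "right"]

# Listening never changes the hidden state.
T = {s: {"listen": {s2: 1.0 if s2 == s else 0.0 for s2 in STATES}}
     for s in STATES}

# Noisy observation of where the hidden object is.
O = {
    "left":  {"listen": {"hear-left": 0.85, "hear-right": 0.15}},
    "right": {"listen": {"hear-left": 0.15, "hear-right": 0.85}},
}

def belief_update(b, a, o):
    """One step of the Bayes filter over the hidden state."""
    b_new = {s2: O[s2][a][o] * sum(T[s][a][s2] * b[s] for s in STATES)
             for s2 in STATES}
    z = sum(b_new.values())          # Pr(o | b, a), the normalizer
    return {s2: v / z for s2, v in b_new.items()}
```

Starting from the uniform belief, hearing "hear-left" concentrates the belief on `left` at 0.85.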
Decision-theoretic Planning
Multi-agent setting
[Figure: agents i and j both act (a_i, a_j) on the shared physical state (Loc, Orient, ...); each receives its own observation (o_i, o_j), and agent i maintains beliefs]
How should agent i act in a multi-agent setting?
Planning in Multiagent Settings
Traditional game-theoretic approach
Nash equilibrium
Epistemological commitment
Plans that are best responses to each other
Common knowledge of rationality
Common knowledge of joint prior over agent types
Limitations
Formalization of common knowledge is self-referential
Epistemological commitment difficult to satisfy
N.E. is non-unique: multiple equilibria may exist
N.E. is incomplete: does not prescribe behavior outside of equilibria
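The non-uniqueness limitation is easy to demonstrate: a sketch that enumerates the pure-strategy Nash equilibria of a small coordination game (payoffs are illustrative) finds two of them.

```python
# Enumerate pure-strategy Nash equilibria of a 2x2 coordination game
# (hypothetical payoffs) to illustrate that N.E. need not be unique.

ACTIONS = ["A", "B"]

# payoff[(row_action, col_action)] = (row_payoff, col_payoff)
payoff = {
    ("A", "A"): (2, 2), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1),
}

def is_nash(ai, aj):
    """True when each action is a best response to the other."""
    ui, uj = payoff[(ai, aj)]
    return (all(payoff[(d, aj)][0] <= ui for d in ACTIONS) and
            all(payoff[(ai, d)][1] <= uj for d in ACTIONS))

pure_equilibria = [(ai, aj) for ai in ACTIONS for aj in ACTIONS
                   if is_nash(ai, aj)]
# Both coordinated outcomes are equilibria.
```

With two equilibria, the theory alone does not tell the agents which one to play.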
Illustration: Centipede Game
Each agent may defect or cooperate at alternating decision nodes
[Figure: four-move game tree A -> B -> A -> B; defecting at the successive nodes yields (0,0), (-1,10), (9,9), (8,19); cooperating at every node yields (18,18)]
Common knowledge of rationality => A will defect (backward induction)
Mutual knowledge of rationality to depth = length of the game is also sufficient
A is uncertain about B's rationality => A will cooperate
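The backward-induction argument can be checked mechanically; a sketch using the slide's defect payoffs (0,0), (-1,10), (9,9), (8,19) at the four nodes and (18,18) for full cooperation:

```python
# Backward induction on the four-move centipede game: decision nodes
# alternate A, B, A, B; defecting at node k ends the game with DEFECT[k];
# cooperating at the last node yields COOPERATE_END.

DEFECT = [(0, 0), (-1, 10), (9, 9), (8, 19)]   # (payoff_A, payoff_B)
COOPERATE_END = (18, 18)
MOVERS = ["A", "B", "A", "B"]

def backward_induction():
    """Predicted payoff and A's first move under common knowledge of rationality."""
    payoff = COOPERATE_END
    for k in reversed(range(len(MOVERS))):
        mover = 0 if MOVERS[k] == "A" else 1
        # Defect if it (weakly) beats the payoff from continuing.
        defect_now = DEFECT[k][mover] >= payoff[mover]
        if defect_now:
            payoff = DEFECT[k]
    # After the loop, defect_now holds the decision at the first node.
    return payoff, ("defect" if defect_now else "cooperate")
```

Backward induction predicts that A defects immediately with payoff (0, 0), matching the slide.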
Interactive Epistemology
[Figure: agents i and j each maintain beliefs; actions a_i, a_j drive the physical state (Loc, Orient, ...), and observations o_i, o_j update the respective beliefs]
Agent i's belief is a distribution over the physical state and models of the other agent j
Model of j = (Belief_j, Frame_j)
Agent j's belief is a distribution over the state and models of i
Recursive State Space
Interactive state space: IS_i = S x M_j, where M_j is the set of models of agent j
Agent i's belief: b_i ∈ Δ(IS_i) (a sufficient statistic for planning)
Recursive description: models of j contain j's beliefs, which are themselves over the state and models of i
Limit to finite nesting for computability
Analogous to hierarchical belief systems in game theory
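The finitely nested space can be sketched concretely; the class and names below are illustrative scaffolding, not the paper's implementation. A level-0 model holds a belief over physical states only; level 1 pairs physical states with level-0 models of j.

```python
# Sketch of a finitely nested interactive state space (names are
# illustrative). A level-0 model carries a belief over physical states
# only; level l pairs physical states with level-(l-1) models of j.

from dataclasses import dataclass

@dataclass(frozen=True)
class Model:
    """Model of the other agent = (belief, frame)."""
    belief: tuple   # distribution as ((outcome, probability), ...)
    frame: str      # stands in for j's actions/observations/dynamics/rewards

def interactive_states(physical_states, models_of_j):
    """IS_i = S x M_j: pair each physical state with a model of j."""
    return [(s, m) for s in physical_states for m in models_of_j]

S = ["s0", "s1"]
# Level 0: j's belief ranges over physical states only.
m_j = Model(belief=(("s0", 0.5), ("s1", 0.5)), frame="frame_j")
IS_level1 = interactive_states(S, [m_j])   # agent i's level-1 state space
```

Nesting deeper means putting level-1 models of i inside j's beliefs, and so on up to the chosen finite level.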
Interactive POMDPs (I-POMDPs)
(Gmytrasiewicz & Doshi, JAIR 2005)
I-POMDPs: A framework for decision-theoretic planning in multiagent settings
Generalizes POMDPs to multiagent settings
I-POMDP_i = <IS_i, A, T_i, Ω_i, O_i, R_i>, where
IS_i -- set of interactive states of agent i
A = A_i x A_j -- set of joint moves of all agents
T_i -- state transition function
Ω_i -- set of observations of agent i
O_i -- observation function
R_i -- preferences of agent i
I-POMDPs
Belief update for agent i in I-POMDP
1. Use the other agent's model to predict its action(s)
2. Anticipate the other agent's observations and how it updates its beliefs
3. Use own observations to correct the beliefs
Recursively updates the beliefs at each nesting level down
to level 0
Plan computation
Analogous to POMDPs given the new belief update
Previously this approach has been called
Decision-theoretic approach to game theory (Kadane & Larkey, 1982)
Subjective rationality (Kalai & Lehrer, 1993)
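The three-step belief update can be sketched for a level-1 agent i. This is a simplified illustration: j's model is held fixed (step 2, propagating j's own belief, is elided), and all states, actions, and probabilities are hypothetical.

```python
# Hedged, simplified sketch of agent i's level-1 I-POMDP belief update.
# Step 1: predict j's action from its model. Step 3: correct with i's own
# observation. Step 2 (updating j's belief) is elided for brevity.

def ipomdp_belief_update(b, ai, oi, Ti, Oi, predict_j, states, models_j):
    """b'(s', m_j) ∝ O_i(o_i|s',a_i,a_j) · Σ_s T_i(s'|s,a_i,a_j) b(s, m_j),
    with a_j predicted from j's model m_j."""
    b_new = {}
    for mj in models_j:
        aj = predict_j(mj)                       # step 1: predict j's action
        for s2 in states:
            b_new[(s2, mj)] = Oi[s2][(ai, aj)][oi] * sum(
                Ti[s][(ai, aj)][s2] * b[(s, mj)] for s in states
            )                                    # step 3: weigh by own obs
    z = sum(b_new.values())                      # normalizer Pr(o_i | b, a_i)
    return {k: v / z for k, v in b_new.items()}
```

A full implementation would also update the belief inside each model m_j using j's anticipated observations, recursing down the nesting levels as the slide describes.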
Discussion
In multiagent settings, assumptions about the behaviors of other agents become significant
Common assumption: beliefs are private
Common knowledge of rationality and prior beliefs is a necessary epistemic condition for Nash equilibrium
Common knowledge of beliefs is infeasible
Beliefs over beliefs => nested belief systems
Infinitely nested belief systems are not computable
Discussion
Finitely nested belief systems – Computable but suboptimal
I-POMDPs – Planning in multiagent settings cognizant of nested beliefs
Contributes to a growing focus on dynamic interactive epistemology
Future Work
Address the computational complexity of I-POMDPs
Thank you