
Artificial Intelligence and Pattern Recognition (AIPR), 2007

On the Role of Interactive Epistemology in Multiagent Planning

Author: Prashant Doshi
Speaker: Sergei Fogelson
Dept. of Computer Science and AI Center, University of Georgia
Decision-theoretic Planning
Single-agent setting

[Figure: agent i's action a_i^{t-1} drives a transition of the physical state (Loc, Orient, ...), which i then observes as o_i^t]

Markov decision processes (MDPs)
- State is perfectly observable
- Act to optimize expected reward
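
The MDP planning problem above can be sketched with value iteration. The two-state, two-action model below is purely illustrative (states, actions, rewards, and transition probabilities are assumptions, not from the talk); it shows the Bellman backup that "act to optimize expected reward" refers to.

```python
# Minimal value-iteration sketch for a tiny MDP (hypothetical example).

GAMMA = 0.9  # discount factor

# T[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
T = {
    0: {"stay": [(0, 1.0)], "go": [(1, 0.8), (0, 0.2)]},
    1: {"stay": [(1, 1.0)], "go": [(0, 1.0)]},
}
R = {
    0: {"stay": 0.0, "go": 1.0},
    1: {"stay": 2.0, "go": 0.0},
}

def value_iteration(T, R, gamma=GAMMA, eps=1e-6):
    """Iterate the Bellman optimality backup until values stop changing."""
    V = {s: 0.0 for s in T}
    while True:
        V_new = {
            s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a])
                   for a in T[s])
            for s in T
        }
        if max(abs(V_new[s] - V[s]) for s in T) < eps:
            return V_new
        V = V_new

V = value_iteration(T, R)
```

Because the state is perfectly observable, the optimal plan is simply a mapping from states to actions read off from these values.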

Decision-theoretic Planning
Single-agent setting

[Figure: agent i maintains beliefs over the physical state (Loc, Orient, ...); action a_i^{t-1} drives the state transition, and i receives observation o_i^t]

Partially observable Markov decision processes (POMDPs)
- State is partially observable
- Act to optimize expected reward given beliefs
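
The beliefs in the slide above are maintained by a Bayes filter. The two-state models below are toy assumptions (not from the talk); the function itself is the standard POMDP belief update b'(s') ∝ O(s', a, o) · Σ_s T(s, a, s') · b(s).

```python
# POMDP belief update sketch (hypothetical two-state example).

def belief_update(b, a, o, T, O):
    """Return the posterior belief after taking action a and observing o."""
    states = list(b)
    # Predict with the transition model, then weight by the observation model.
    b_new = {
        s2: O[s2][a][o] * sum(T[s][a][s2] * b[s] for s in states)
        for s2 in states
    }
    norm = sum(b_new.values())  # Pr(o | b, a)
    return {s: v / norm for s, v in b_new.items()}

# Toy models: T[s][a][s'] and O[s'][a][o] (illustrative numbers).
T = {"left": {"listen": {"left": 1.0, "right": 0.0}},
     "right": {"listen": {"left": 0.0, "right": 1.0}}}
O = {"left": {"listen": {"hear-left": 0.85, "hear-right": 0.15}},
     "right": {"listen": {"hear-left": 0.15, "hear-right": 0.85}}}

b = {"left": 0.5, "right": 0.5}
b = belief_update(b, "listen", "hear-left", T, O)
```

Planning then optimizes expected reward over these belief states rather than over the hidden physical states.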

Decision-theoretic Planning
Multi-agent setting

[Figure: agents i and j both act on the physical state (Loc, Orient, ...); actions a_i^{t-1} and a_j^{t-1} jointly drive the state transition, and each agent receives its own observation o_i^t, o_j^t]

How should agent i act in a multi-agent setting?
Planning in Multiagent Settings

Traditional game-theoretic approach: Nash equilibrium
- Plans that are best responses to each other

Epistemological commitment
- Common knowledge of rationality
- Common knowledge of a joint prior over agent types

Limitations
- Formalization of common knowledge is self-referential
- The epistemological commitment is difficult to satisfy
- N.E. is non-unique: multiple equilibria may exist
- N.E. is incomplete: it does not prescribe behavior outside of equilibria
Illustration: Centipede Game

Each agent may defect or cooperate.

[Figure: a four-stage game tree A -> B -> A -> B; defecting at the successive nodes yields (0,0), (-1,10), (9,9), and (8,19) respectively, while cooperating through the final node yields (18,18)]

- Common knowledge of rationality => A will defect (backward induction)
- Mutual knowledge of rationality of depth equal to the length of the game is also sufficient
- A is uncertain about B's rationality => A will cooperate
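
The backward-induction argument can be run mechanically. The tree below assumes the slide's payoffs attach to the successive defect nodes in order of increasing total payoff ((0,0), (-1,10), (9,9), (8,19), with (18,18) at the end); since the figure did not survive the transcript, that placement is an assumption.

```python
# Backward-induction sketch for the centipede game above (payoff
# placement on the tree is an assumed reconstruction of the figure).

# Internal node: (player_to_move, payoffs_if_defect, rest_of_game);
# a bare payoff pair is the leaf reached when everyone cooperates.
game = ("A", (0, 0),
        ("B", (-1, 10),
         ("A", (9, 9),
          ("B", (8, 19),
           (18, 18)))))

def backward_induction(node):
    """Return (action, payoffs) under common knowledge of rationality."""
    if len(node) == 2:                       # leaf: cooperated to the end
        return None, node
    player, defect_payoffs, rest = node
    _, coop_payoffs = backward_induction(rest)
    idx = 0 if player == "A" else 1
    if defect_payoffs[idx] >= coop_payoffs[idx]:
        return "defect", defect_payoffs
    return "cooperate", coop_payoffs

action, payoffs = backward_induction(game)
```

Unraveling from the last node, each player prefers to defect one step earlier, so the first mover defects immediately, which is exactly the (0,0) outcome the slide attributes to common knowledge of rationality.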
Interactive Epistemology

[Figure: agent i maintains beliefs over both the physical state (Loc, Orient, ...) and agent j's beliefs; actions a_i^{t-1}, a_j^{t-1} drive the state transition, and the agents receive observations o_i^t, o_j^t]

- Agent i's belief is a distribution over the physical state and models of the other agent j
- Model of j = (Belief_j, Frame)
- Agent j's belief is, in turn, a distribution over the state and models of i
Recursive State Space

- Interactive state space: IS_i = S x M_j, where M_j is the set of models of agent j
- Agent i's belief: b_i ∈ Δ(IS_i) (a sufficient statistic)
- The description is recursive: limit it to finite nesting for computability
- Analogous to hierarchical belief systems in game theory
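
The finite nesting can be sketched as a recursion: level 0 is just the physical state space, and level l pairs each physical state with a level-(l-1) model of the other agent. The representation below is a toy assumption; real I-POMDP models carry full belief distributions and frames rather than the abstracted tuples used here.

```python
# Sketch of finitely nested interactive state spaces (toy representation).

S = ["s1", "s2"]        # physical states (illustrative)
FRAMES = ["frame_j"]    # a single assumed frame for agent j

def interactive_states(level):
    """IS at level 0 is S; at level l it is S x {level-(l-1) models of j}."""
    if level == 0:
        return list(S)
    lower = interactive_states(level - 1)
    # Abstract a model as (support of a belief over the lower level, frame);
    # a real implementation would carry the full distribution.
    models = [(tuple(lower), f) for f in FRAMES]
    return [(s, m) for s in S for m in models]
```

Cutting the recursion off at a fixed level is what makes the otherwise infinitely nested space computable.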
Interactive POMDPs (I-POMDPs)
(Gmytrasiewicz & Doshi, JAIR 2005)

I-POMDPs: a framework for decision-theoretic planning in multiagent settings. Generalizes POMDPs to multiagent settings:

I-POMDP_i = <IS_i, A, T_i, Ω_i, O_i, R_i>

- IS_i -- set of interactive states of agent i
- A -- set of joint moves of all agents
- T_i -- state transition function
- Ω_i -- set of observations of agent i
- O_i -- observation function
- R_i -- preferences of agent i
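
For concreteness, the tuple can be written down as a plain container. The field names follow the slide's symbols; the concrete types are placeholders, since the talk does not fix a representation.

```python
# The I-POMDP tuple as a simple container (types are placeholder assumptions).
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class IPOMDP:
    IS_i: Sequence      # interactive states of agent i
    A: Sequence         # joint moves of all agents
    T_i: Callable       # state transition function
    Omega_i: Sequence   # observations of agent i
    O_i: Callable       # observation function
    R_i: Callable       # preferences (reward) of agent i
```

The only structural difference from a POMDP tuple is that IS_i replaces the physical state space S and A ranges over joint moves.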
I-POMDPs

Belief update for agent i in an I-POMDP:
1. Use the other agent's model to predict its action(s)
2. Anticipate the other agent's observations and how it updates its beliefs
3. Use own observations to correct the beliefs

The update recursively revises the beliefs at each nesting level, down to level 0.
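
The three steps above can be sketched schematically. Beliefs are represented as dicts over interactive states (s, m_j); the toy model functions at the bottom (act_j, update_j, T, O_i, O_j and their numbers) are illustrative assumptions, not the paper's concrete domain.

```python
# Schematic I-POMDP belief update following the three steps above.

def ipomdp_belief_update(b, a_i, o_i, T, O_i, O_j, act_j, update_j):
    """Steps: (1) predict j's action from its model m_j, (2) anticipate
    j's observation and update m_j accordingly, (3) weight by i's own
    observation likelihood and renormalize."""
    b_new = {}
    for (s, m_j), p in b.items():
        for a_j, p_aj in act_j(m_j).items():                    # step 1
            for s2, p_t in T(s, a_i, a_j).items():
                for o_j, p_oj in O_j(s2, a_i, a_j).items():     # step 2
                    is2 = (s2, update_j(m_j, a_j, o_j))
                    w = p * p_aj * p_t * p_oj * O_i(s2, a_i, a_j)[o_i]  # step 3
                    b_new[is2] = b_new.get(is2, 0.0) + w
    norm = sum(b_new.values())
    return {k: v / norm for k, v in b_new.items()}

# Toy instantiation: two physical states, one static level-0 model of j.
def act_j(m_j):               return {"noop": 1.0}
def update_j(m_j, a_j, o_j):  return m_j          # level-0 model stays fixed
def T(s, a_i, a_j):           return {s: 1.0}     # actions keep the state
def O_j(s2, a_i, a_j):        return {"null": 1.0}
def O_i(s2, a_i, a_j):
    return {"o1": 0.9, "o2": 0.1} if s2 == "s1" else {"o1": 0.1, "o2": 0.9}

b0 = {("s1", "m0"): 0.5, ("s2", "m0"): 0.5}
b1 = ipomdp_belief_update(b0, "listen", "o1", T, O_i, O_j, act_j, update_j)
```

With a nontrivial model of j, update_j would itself invoke this update one nesting level down, which is the recursion the slide describes.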
Plan computation
- Analogous to POMDPs, given the new belief update

Previously, this approach has been called:
- Decision-theoretic approach to game theory (Kadane & Larkey, 1982)
- Subjective rationality (Kalai & Lehrer, 1993)
Discussion

- In multiagent settings, assumptions about the behaviors of other agents assume significance
- Common assumption: common knowledge of rationality and of the prior beliefs
  - A necessary epistemic condition for Nash equilibrium
- Beliefs are private: common knowledge of beliefs is infeasible
- Beliefs over beliefs => nested belief systems
  - Infinitely nested belief systems are not computable
Discussion

- Finitely nested belief systems: computable but suboptimal
- I-POMDPs: planning in multiagent settings cognizant of nested beliefs
- Contributes to a growing focus on dynamic interactive epistemology
Future Work
- Address the computational complexity of I-POMDPs

Thank you