Myopic and non-myopic agent optimization in game theory
Download
Report
Transcript Myopic and non-myopic agent optimization in game theory
Myopic and non-myopic agent optimization
in game theory, economics, biology and
artificial intelligence
Michael J Gagen
Institute of Molecular Bioscience
University of Queensland
Email: [email protected]
Kae Nemoto
Quantum Information Science
National Institute of Informatics, Japan
Email: [email protected]
Overview: Functional Optimization in Strategic Economics (and AI)
Formalized by von Neumann and Morgenstern,
Theory of Games and Economic Behavior (1944)
Mathematics / Physics (minimize action)
Overview: Functional Optimization in Strategic Economics (and AI)
Formalized by von Neumann and Morgenstern,
Theory of Games and Economic Behavior (1944)
Strategic Economics (maximize expected payoff)
Functionals:
Fully general
Not necessarily continuous
Not necessarily differentiable
Nb: Implicit Assumption of Continuity !!
Overview: Functional Optimization in Strategic Economics (and AI)
Strategic Economics (maximize expected payoff)
von Neumann’s “myopic” assumption
Evidence:
von Neumann & Nash used fixed
point theorems in probability simplex
equivalent to a convex subset of a
real vector space
von Neumann and Morgenstern, Theory of Games and Economic Behavior (1944)
J. F. Nash, Equilibrium points in n-person games. PNAS, 36(1):48–49 (1950)
Overview: Functional Optimization in Strategic Economics (and AI)
Non-myopic Optimization
No communications
between players
Correlations
Constraints and
forbidden regions
Overview: Functional Optimization in Strategic Economics (and AI)
“Myopic” Economics (= Physics)
Non-myopic Optimization
∞ correlations &
∞ different trees
Myopic One Constraint = One Tree
constraint sets
X
Myopic “The” Game Tree lists All play options
Myopic = Missing Information!
Correlation = Information
What Information?
Nemoto: “It is not what they are doing, its what they are thinking!”
Chess:
“Chunking” or pattern recognition in human chess play
Experts:
Performance in speed chess doesn’t degrade much
Rapidly direct attention to good moves
Assess less than 100 board positions per move
Eye movements fixate only on important positions
Re-produce game positions after brief exposure better than
novices, but random positions only as well as novices
Learning Strategy = Learning information to help win game
Optimization and Correlations are Non-Commuting!
Complex Systems Theory
Emergence of Complexity via correlated signals higher order structure
Optimization and Correlations are Non-Commuting!
Life Sciences (Evolutionary Optimization)
Selfish Gene Theory
Mayr: Incompatibility between biology and physics
Rosen: “Correlated” Components in biology, rather than “uncorrelated” parts
Mattick: Biology informs information science
6 Gbit DNA program more complex than any human program,
implicating RNA as correlating signals allowing multi-tasking
and developmental control of complex organisms.
Mattick: RNA signals in molecular networks
Prokaryotic gene
Eukaryotic gene
Hidden layer
mRNA
protein
mRNA & eRNA networking
functions
protein
Optimization and Correlations are Non-Commuting!
Economics
Selfish independent agents: “homo economicus”
Challenges: Japanese Development Economics,
Toyota “Just-In-Time” Production System
Optimization and Correlations are Non-Commuting!
1 Player Evolving / Learning Machines (neural and molecular networks)
endogenously exploit correlations to alter own decision tree, dynamics and optima
o
o = F(i)
= F(t,d)
i
= Ft (d) {F(x,y,z), … ,F(x,x,z),…}
Functional Programming, Dataflow computation, re-write architectures, …
Discrepancies: Myopic Agent Optimization and Observation
Heuristic statistics
Iterated Prisoner’s Dilemma
Iterated Ultimatum Game
Chain Store Paradox (Incumbent never fights new market entrants)
Myopic Agent Optimization
Strategic Form
Normal Form
Px
Py
?
von Neumann and Morgenstern (1944):
All possible information = All possible move
combinations for all histories and all futures
?
Sum-Over-Histories or
Path Integral formulation
Myopic Agent Optimization
Optimization
Sum
over all
stages
Sum
over all
paths to
nth stage
Probability
of each
path
Payoff
from each
stage for
each path
Myopic Agent Optimization
Myopic agents ( probability distributions)
uncorrelated
no additional constraints
x1
1-p
y1
p
0 ≤ p ≤ 1/2
Backwards Induction
& Minimax
Non-Myopic Agent Optimization
Fully general, notationally emphasized by:
Optimization
Sum over all
correlation
strategies
Constraint
set of
each
strategy
Payoff for
each path
Probability
of each
strategy
Sum
over all
paths
given
strategy
Conditioned
path
probability
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
In 1950 Melvin Dresher and Merrill
Flood devised a game later
called the Prisoner’s Dilemma
Two prisoners are in separate cells
and must decide to cooperate or
defect
Payoff Matrix
Py
Px
C
D
Cooperation
Defect
C
(2, 2)
(0, 3)
CKR: Common Knowledge of
Rationality
D
(3,0)
(1,1)
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Myopic agent assumption
max
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Myopic agents: N max constraints
=1 > 0 PN-1,x,HN-2(1) = 1
=0
> 0 PNx,HN-1(1) = 1
Simultaneous solution Backwards Induction myopic agents always defect
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Correlated Constraints: (deriving Tit For Tat)
2 max constraints
< 0 P1x(1) = 0, so Px cooperates
< 0 P1y(1) = 0, so Py cooperates
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Families of correlation constraints: k, j index
Change of notation: “dot N” = N, “dot dot N” = 2N, “dot dot N-2” = 2N-2, etc
Optimize via game theory techniques
Many constrained equilibria involving cooperation
Cooperation is rational in IPD
Further Reading and Contacts
Kae Nemoto
Email: [email protected]
URL:
http://www.qis.ex.nii.ac.jp/knemoto.html
Michael J Gagen
Email: [email protected]
URL: http://research.imb.uq.edu.au/~m.gagen/
See:
Cooperative equilibria in the finite iterated prisoner's dilemma, K. Nemoto
and M. J. Gagen, EconPapers:wpawuwpga/0404001
(http://econpapers.hhs.se/paper/wpawuwpga/0404001.htm)