ppt - University of Virginia


On the Emergence of Social
Conventions: modeling, analysis
and simulations
Yoav Shoham & Moshe Tennenholtz
Artificial Intelligence 94(1-2), pp. 139-166, July 1997.
CSRG
Presented by
Souvik Das
11/02/05
1
Authors
• Yoav Shoham
– Professor of Computer Science, Stanford University
– AI, MAS, Game Theory, e-commerce
– http://ai.stanford.edu/~shoham/
• Moshe Tennenholtz
– Professor of Industrial Engineering and Management at
the Technion – Israel Institute of Technology
– AI, MAS, Protocol evolution
– http://iew3.technion.ac.il/Home/Users/Moshet.phtml
2
Definition
• Social Convention
– Limiting agents’ choices to induce subgames
– Such restrictions are social constraints, in
cooperative games
– When the restrictions leave only one strategy for each
agent, they form a social convention
3
Three basic concepts
• Maximin
– Guarantees highest minimal payoff
– Rationality of other players or common knowledge may
not be assumed
• Nash Equilibrium
– No player can deviate unilaterally from the equilibrium
solution without hurting his/her payoff
– Common knowledge and rationality assumed
• Pareto Optimality
– A joint action is Pareto optimal if increasing one
agent's payoff makes another agent's payoff suffer
4
Coordination and Cooperation
games
• Coordination
– $M = \begin{pmatrix} 1,1 & -1,-1 \\ -1,-1 & 1,1 \end{pmatrix}$
– Maximin gives –1 while the other two give 1 as payoff
• Cooperation
– $M = \begin{pmatrix} 1,1 & -3,3 \\ 3,-3 & -2,-2 \end{pmatrix}$
– Maximin and Nash give –2 but this is pareto dominated
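The three concepts can be checked mechanically on the two example games. The following is a minimal Python sketch, assuming the matrices are read row by row with the row player's payoff first and that only pure strategies are considered. The function names (`maximin_value`, `pure_nash`, `pareto_optimal`) are hypothetical and do not come from the paper.

```python
from itertools import product

# GAME[(a, b)] = (payoff to row player, payoff to column player).
# Assumption: action 0 is the first row/column of the matrix on the slide.
COORDINATION = {(0, 0): (1, 1), (0, 1): (-1, -1),
                (1, 0): (-1, -1), (1, 1): (1, 1)}
COOPERATION = {(0, 0): (1, 1), (0, 1): (-3, 3),
               (1, 0): (3, -3), (1, 1): (-2, -2)}

ACTIONS = (0, 1)


def maximin_value(game):
    """Best worst-case payoff the row player can guarantee with a pure action."""
    return max(min(game[(a, b)][0] for b in ACTIONS) for a in ACTIONS)


def pure_nash(game):
    """Joint actions from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        row_ok = all(game[(a, b)][0] >= game[(a2, b)][0] for a2 in ACTIONS)
        col_ok = all(game[(a, b)][1] >= game[(a, b2)][1] for b2 in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


def pareto_optimal(game):
    """Joint actions that no other joint action Pareto dominates."""
    result = []
    for joint, (p, q) in game.items():
        dominated = any(
            p2 >= p and q2 >= q and (p2 > p or q2 > q)
            for other, (p2, q2) in game.items() if other != joint
        )
        if not dominated:
            result.append(joint)
    return sorted(result)
```

Under these assumptions, the cooperation game has a single pure Nash equilibrium at joint action (1, 1) with payoff -2, which is absent from the Pareto optimal set because (0, 0) dominates it.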
5
Motivation
• Under what conditions do conventions
eventually emerge?
• How efficiently are they achieved?
• What are the different parameters affecting
speed of convergence?
6
Game Model
• Symmetric
• Population size N >= 4
• Each game is 2-player, 2-choice
• Typical coordination and cooperation games
• Payoff matrix M of each game g
$M = \begin{pmatrix} x,x & u,v \\ v,u & y,y \end{pmatrix}$
7
Game model cont.
• Social law sl induces a subgame $g_{sl}$, where g is the
unrestricted game
• Rationality test of sl
– Let V be the game variable used for determining
rationality
– Let V(g) denote the value of that variable in game g
• A social law sl is rational with respect to g if
– $V(g) < V(g_{sl})$
Note: Rationality here does not imply optimality
8
Example
• In coordination game, two possible rational
social conventions with respect to maximin
– Restriction on either one of the strategies
• In cooperation game, only one possible
rational social convention with respect to
maximin
– Cooperate
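The rationality test can be applied to every candidate convention in the two example games. The sketch below is a minimal illustration, assuming V is the pure-strategy maximin value of the row player and that a convention restricts every agent to one action. The helper names `restricted_maximin` and `rational_conventions` are hypothetical.

```python
# Payoffs as (row player, column player); action 0 is the first row/column.
COORDINATION = {(0, 0): (1, 1), (0, 1): (-1, -1),
                (1, 0): (-1, -1), (1, 1): (1, 1)}
COOPERATION = {(0, 0): (1, 1), (0, 1): (-3, 3),
               (1, 0): (3, -3), (1, 1): (-2, -2)}


def restricted_maximin(game, allowed):
    """Pure-strategy maximin for the row player when both players are
    restricted to the actions in `allowed`."""
    return max(min(game[(a, b)][0] for b in allowed) for a in allowed)


def rational_conventions(game, actions=(0, 1)):
    """Actions a for which restricting everyone to a gives V(g) < V(g_sl)."""
    v_g = restricted_maximin(game, actions)
    return [a for a in actions if v_g < restricted_maximin(game, (a,))]


print(rational_conventions(COORDINATION))  # both actions qualify
print(rational_conventions(COOPERATION))   # only action 0 (cooperate)
```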
9
The Game Dynamics
• N-k-g stochastic social game
– Unbounded sequence of ordered tuples of k
agents selected at random from given N agents
– Random k agents meet repeatedly and play
game g
– In each iteration, action selection by the agents is
synchronous
10
Action Selection
• An agent switches to a new action iff the total payoff
obtained from that action in the last m iterations (m >= N >= 4)
exceeds the total payoff obtained from its present action over
the same period
• This action update rule is called HCR, or Highest
Cumulative Reward
• More complicated weighted HCR rules can be built on
simple HCR
• m puts a finite bound on history
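The HCR rule and the pairwise dynamics can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' simulator. It assumes one uniformly random ordered pair per iteration, a memory holding the last m encounters in which the agent itself took part, and an update after every encounter. The names `hcr_choice` and `simulate` are hypothetical.

```python
import random
from collections import deque

# Coordination game payoffs as (first agent, second agent).
COORDINATION = {(0, 0): (1, 1), (0, 1): (-1, -1),
                (1, 0): (-1, -1), (1, 1): (1, 1)}


def hcr_choice(current, memory):
    """Switch to the other action iff its cumulative reward in memory
    strictly exceeds that of the current action."""
    totals = {0: 0, 1: 0}
    for action, payoff in memory:
        totals[action] += payoff
    other = 1 - current
    return other if totals[other] > totals[current] else current


def simulate(game, actions, m, iterations, seed=0):
    """Run an N-2-g style process and return the final action of each agent."""
    rng = random.Random(seed)
    actions = list(actions)
    n = len(actions)
    # Assumption: each agent remembers its own last m encounters only.
    memories = [deque(maxlen=m) for _ in range(n)]
    for _ in range(iterations):
        i, j = rng.sample(range(n), 2)      # two distinct agents, ordered
        a_i, a_j = actions[i], actions[j]   # actions are chosen synchronously
        p_i, p_j = game[(a_i, a_j)]
        memories[i].append((a_i, p_i))
        memories[j].append((a_j, p_j))
        actions[i] = hcr_choice(a_i, memories[i])
        actions[j] = hcr_choice(a_j, memories[j])
    return actions
```

For example, `simulate(COORDINATION, [0, 1, 0, 1, 1, 0], m=6, iterations=500)` runs six agents from a mixed start. A population that already follows a convention stays there, since the unused action never accumulates positive reward.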
11
Theorem 1
• Given an N-2-g stochastic social agreement game
– For every ε > 0, there exists a bounded number Λ such
that if the system runs for Λ iterations, the probability that a
social convention is reached is at least 1-ε
– Once the convention is reached, it is never left
– Reaching the convention guarantees each agent a payoff
no less than the maximin value initially guaranteed
– If a social convention exists for g that is rational w.r.t.
the maximin value, then the social convention reached will be
rational w.r.t. maximin
• Corollary
– HCR rule guarantees eventual convergence to a rational
social convention for coordination and cooperation
social games
12
Theorem 2
• Efficiency measured in terms of number of
iterations T(N) required to get desired
behavior
• T(N) = Ω ( N log N ) for any update rule R
which guarantees convergence
13
Proof: Theorem 1
• Case I:
– Coordination games ( y > 0, u < 0, v < 0 )
• A rational social convention will restrict all agents to the same
strategy
• A pair of agents (i,j) with the same strategy meet repeatedly till all
other agents forget their past
• i meets x (not equal to j) and then meets j. This step is repeated
till i has met all agents.
• If $\Lambda = k\,g(N)\,f(N)$, then the probability that the convention is not
reached is $e^{-k}$
• f(N) and g(N) are bounded by an exponent of the form $N^s$, where
s is a polynomial in m and N
14
Proof: Theorem 1
• Case II:
– Cooperation ( y < 0, u < 0, v > 0 )
• Similar structure of proof as Case I
• The major change is in the creation of a pair of cooperative
agents
• Achieved by meeting a pair of agents till a pair of non-cooperative agents forget their past
• These historyless non-cooperative agents meet till all other
non-cooperative agents forget their history
• Then they meet sequentially and convention is reached in
similar way as coordination game
15
Proof: Theorem 2
• Total number of permutations possible for
choosing two players from N is $^{N}P_2 = N(N-1)$
• Ways in which a particular player is chosen is N
• Probability of it not being chosen as player 1 or
player 2 in a 2-person game in one iteration is $(1 - 1/(N-1))^2$
• Probability of a player not being chosen for a stretch
of $T(N) = (N-1)f(N)$ games is $(1 - 1/(N-1))^{2(N-1)f(N)}$,
which converges to $e^{-2f(N)}$
16
Proof: Theorem 2 cont.
• Consider the random variable $Y_N(i)$, the number of
agents that did not participate in any of the first i
iterations
• By linearity of expectation, $E[Y_N(T(N))] \approx N e^{-2f(N)}$
• Convergence to a convention requires that
$E[Y_N(T(N))]$ goes to 0
• If $e^{-2f(N)} > 1/N$, then $E[Y_N(T(N))] > 1$, implying no
convergence
• Therefore, for convergence, $e^{-2f(N)} < 1/N$
• Taking natural log, $f(N) > 0.5 \log N$
• Thus, $T(N) = (N-1)f(N) = \Omega(N \log N)$
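The limiting step and the threshold on f(N) can be checked numerically. The sketch below is a small Python illustration that takes the per-iteration estimate used in the proof, (1 - 1/(N-1)) squared, as given rather than deriving it. The function names are hypothetical.

```python
import math


def prob_never_chosen(n, f):
    """Proof's estimate that a fixed agent sits out T = (n - 1) * f games."""
    return (1 - 1 / (n - 1)) ** (2 * (n - 1) * f)


def expected_idle_agents(n, f):
    """Expected number of agents that never played, by linearity of expectation."""
    return n * prob_never_chosen(n, f)


n = 1000
for c in (0.25, 0.5, 1.0):
    f = c * math.log(n)
    print(c, round(expected_idle_agents(n, f), 4))
```

With f(N) = 0.25 log N the expected number of idle agents stays well above 1, while with f(N) = log N it falls well below 1, in line with the threshold f(N) > 0.5 log N.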
17
Evolution of coordination:
Experimental Results
• Coordination games achieve conventions rapidly
with the HCR rule while cooperation games do not
• Parameters considered are
– Update frequency
• How frequently an agent uses its action update rule HCR
– Memory restarts
• Previous history forgotten, but current action retained
– Memory window
• Previous m iterations in which agent participated versus
previous m iterations regardless of whether the agent
participated in those
18
Update frequency
The efficiency of convention evolution decreases
as the delay in update increases
19
Memory Restarts
With decreasing memory restart distance, convention
evolution efficiency decreases
20
Memory Window
Increasing memory size indefinitely is not helpful
Old information is not as relevant as new information
21
Co-varying memory size and
update frequency
• When update frequency drops below 100, it becomes better to use
statistics of only last window than entire history
• When agents have update delays, they rely on old information
• Systems with large update delays should have frequent memory restarts
22
Convention Evolution Dynamics
• As the number of players remaining to conform to convention
decreases, the rate of convergence slows down
23
Extended Coordination Game
• Symmetric 2-person-s-choice game where both agents receive
payoff x > 0 iff they perform the same action, and –x otherwise
• New update rule used in this case is External
Majority or EM rule
• EM rule
– Strategy i is adopted if it was observed in other agents
more often than any other strategy
– Reduces to HCR rule for s=2
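The EM rule can be written as a small selection function. This is a minimal Python sketch, assuming the agent keeps a list of strategies it observed other agents play. Keeping the current strategy on a tie or with no observations is an assumption the slides do not specify, and the name `em_choice` is hypothetical.

```python
from collections import Counter


def em_choice(current, observed):
    """External Majority: adopt the strategy seen strictly more often in
    other agents than any other strategy; otherwise keep the current one."""
    top = Counter(observed).most_common(2)
    if not top:
        return current              # nothing observed yet
    if len(top) == 2 and top[0][1] == top[1][1]:
        return current              # tie: no strict majority strategy
    return top[0][0]


print(em_choice(0, [2, 2, 1]))  # adopts strategy 2
print(em_choice(0, [1, 2]))     # tie, keeps strategy 0
```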
24
Experimental results
• Adding more potential conventions decreases
the efficiency of convention formation, but at a less than
logarithmic rate
25
General Comments
• These conventions are not necessarily Nash
Equilibria
• Constraints are viewed as regulations laid down
by central authority such as government
• If a central authority is present and able to enforce
certain rules, then it may as well enforce the
efficient convention
• In proofs of theorems, statements are made
without validation
26
Comments on Selection Rule
• HCR rule replaces the Best Response or BR rule used in
evolutionarily stable strategies and stochastically stable
strategies
• Two important criteria for selection function are
obliviousness and locality
– Obliviousness: selection function is independent of identity of players
– Locality: selection function is purely a function of player’s personal history
• Obliviousness is similar to Young’s approach
• Young* uses BR which is global
• Rationale for using local update is that individual decision
making usually happens in absence of global information
• Is HCR really local?
*The Evolution of Conventions, H P Young, Econometrica, Vol 61, No. 1, (Jan 1993), 57-84
27
Comments on the Experiment
• It is not clear
– How many agents play games in each iteration
and how they are chosen
– How does one ensure that a particular pair of
agents play and the rest forget their play history
in instances where the memory window is
based upon the last m iterations in which the
agents participated
28
Comparison with Young’s Work
• Model differences
– BR vs HCR
– Anonymity of history
– Incompleteness of information measured by k/m ratio
– A convention defined as state h consisting of m
repetitions of a pure strategy which is an absorbing
state
– No central authority to dictate restrictions
– Mistakes (deviation from rational behavior assumed)
– Adaptive play’s incomplete sampling helps it to break
out from sub optimal cycles
– As long as m/k and k are large, for 2x2 games,
stochastically stable equilibria is independent of m and
k
29
Questions?
30