No Slide Title

Download Report

Transcript No Slide Title

Mini Quiche
Prepare a double batch of pie crust or purchase a pie crust mix.
Roll out and cut in circles using a donut cutter or cooking cutter.
Place circles into bottom of muffin pan so that pie crust curls up
the sides by about 1/4 inch.
Fill with any combo of filling. I used bacon, mushrooms, onions,
and swiss for one filling. The other was broccoli, onions and
cheddar cheese. Place just enough filling so it does not go above
lip of crust. Beat 2 eggs and 3/4 cup sour cream. Spoon 1
teaspoon of egg mixture over each quiche.
Bake for 20-25 min in 375 oven.
Reheat defrosted quiche for 10 min in a 400 oven.
Recipe Archive at Carnegie Mellon's School of Computer Science (SCS)
Game Networks
Piero La Mura
Stanford University
Motivation:
providing a natural and
computationally efficient framework
for strategic inference
in multi-agent decision problems
Beer/Quiche Game
A Nash Equilibrium
Another Nash Equilibrium
Existing frameworks for strategic inference
include:
• normal forms (NF)
• extensive forms (EF)
• sequence forms (SF)
• influence diagrams (ID)
• expected utility networks (EUN).
Of these,
• EF, SF and ID are dynamic, i.e. capture the causal flow of events
in the game
• NF, EF and SF are multi-agent representations
• EUN capture strategic independencies.
G nets are a dynamic, multi-agent generalization of EUN.
Multi-agent Influence Diagrams?
This doesn’t work.
(of course)
Player1
Player2
G nets, informally
Game networks (G nets) are a novel class of game
representations and associated inference procedures, which
can exploit strategic separabilities in order to simplify the
inference process.
G nets come with two novel inference procedures: one which
selects a single strategic equilibrium as a function of the game
payoffs, and one which identifies all equilibria
.
Decision-theoretic background
Let S be a set of states, A a Boolean algebra on S, and
u: S R a strictly positive (utility) function.
An expected utility function is a pair (p,u), where p is a probability
function on A.
A conditional probability system is a family of conditional
probability functions p( . |F) such that, for all EFG,
• p(E|G)=p(E|F)p(F|G)
• p( . |F) is well defined even when p(F|G)=0, i.e. even when a
condition F has zero probability.
A conditional expected utility function is a pair (p,u) where p is a
conditional probability system.
The expected utility of a possible event E is defined as:
u(E)= u(s) p(s | E)
and the conditional expected utility of E given F is defined as:
u(E | F)= u(E  F) / u(F).
Finally, the conditional value of E given F is defined as:
v(E | F)=p(E | F) u(E | F).
Ceteris Paribus Comparisons
Let {Xk}kN be a set of variables,
for MN, let XM:=kM Xk ,
and let x 0=(xM0, xN -M 0) be an arbitrary reference point.
Ceteris Paribus Comparisons for probabilities
q(xM|xN) = p(xM , xN-M) / p(xM0 , xN-M)
Ceteris Paribus Comparisons for utilities
w(xM|xN) = u(xM , xN-M) / u(xM0 , xN-M)
Conditional independence of probabilities and
utilities
Let {A,B,C} be a partition of N
and let x0=(a0,b0,c0) be an arbitrary reference point.
We say that A is p-independent of B given C if and only if
p(a|b,c)=p(a|b0,c) for all (a,b,c)(A,B,C).
We say that A is u-independent of B given C if and only if
w(a|b,c)=w(a|b0,c), where w(a|b,c):=u(a,b,c)/u(a0,b,c).
Finally, we say that A is strategically independent of B given C
if A is both p- and u- independent of B given C.
Health and Wealth
u(...) -W W
u(…)
-H
1
2
-H
-W
1
H
3
6
H
3
W
2
4
Shoham’s Additive Utility Independence
u(H,W) - u(-H,-W) = [u(-H,W) - u(-H,-W)] + [u(H,-W)-u(-H,-W)]
u(H,W)+ u(-H,-W) = u(-H,W) + u(H,-W)
u-independence
u(H,W) /u(-H,-W) = [u(-H,W) / u(-H,-W)] * [u(H,-W)/u(-H,-W)]
u(H,W)* u(-H,-W) = u(-H,W) * u(H,-W)
Consequences of strategic independence
If strategic independence holds, decisions regarding A and B
given C can be effectively decentralized: the agent who
decides on A does not need to know about B in order to
choose its optimal action given C, and vice versa.
(La Mura and Shoham 1999)
A G net is comprised of the following elements:
• a finite, ordered set of nodes XN (N={1,2,...,n}) corresponding
to a set of strategically relevant variables;
• a partition I of N which determines the identity of the agent
responsible for the decision at each node (including Nature);
• a set of directed (probability) arcs, with no cycles,
representing causal dependencies;
• a set of undirected (utility) arcs representing payoff
dependencies;
• for each Xk, two functions: p(xk|xPP(k)) and w(xk|xUN(k)), where
XPP(k) are the probability parents of Xk and XUN(k) its utility
neighbors.
Example: the Beer/Quiche game
reference point: -S,-B,-F
(not strong, no beer, no fight)
p(S)=0.9
w1(S|-B,-F)=1
payoffs = utility relative to reference point, times 12.
w1(S|B,-F)=1
w1(S|-B,F)=1
w1(S|B,F)=1
w2(S|-F)=1
S
w1(F|S)=1/2
w1(B|S)=3/2
w1(B|-S)=2/3
w1(F|-S)=1/4
B
F
w2(F|S)=1/2
w2(F|-S)=2
Another Example:
a multi-stage game with observed actions
(e.g., the repeated prisoners’ dilemma)
A1|H0
B1|H0
A2|H1
B2|H1
A1
H0
A2
H1
B1
A3
H2
B2
A4
H3
B3
B4
Bayesian rationality in G nets
A G frame is an incompletely specified G net, where only the
probabilities of Nature’s actions (but not those of the other
agents) are specified.
Theorem: Any finite game in extensive form has a G frame
representation.
A G net satisfies Bayesian rationality if, for all kN and
xPP(k)  XPP(k), ui(k)(xk|xPP(k))> ui(k)(x’k|xPP(k)) implies p(x’k|xPP(k))=0.
Corollary: For any G frame there exists a corresponding G net
which satisfies Bayesian rationality.
Strategic equilibrium in G nets
A Nash equilibrium (in the agent-strategic form) is a conditional
probability system p such that, for all q ,
Xk p(xk|xPP(k)) ui(k)(p)(xk|xPP(k))  Xk q(xk|xPP(k)) ui(k)(p)(xk|xPP(k)).
Let f:   be defined by f = z + (1- )v, where:
• z is the conditional probability system which gives uniform
probability to all the possible actions, and
• v(p)(xk|xPP(k)) := p(xk|xPP(k)) ui(k)(p)(xk|xPP(k)).
Then f has a fixed point by Brouwer’s theorem. A limit point of a
sequence of fixed points of f as  goes to zero is a Nash (indeed,
a sequential) equilibrium.
Global convergence to equilibrium in G nets
Nash equilibria are zeros of the function F(p), defined by
F(p)(xk|xPP(k)):= p(xk|xPP(k)) - v(p)(xk|xPP(k)).
Consider the perturbed problem:
F(p)(xk|xPP(k)):= p(xk|xPP(k)) - f(p)(xk|xPP(k)).
This can be rewritten as F0(p) + (1- )F(p), where F is the target
system whose zeros we want to find, and F0 is the trivial system
p(xk|xPP(k)) - z(xk|xPP(k)), whose unique solution is p = z.
Then F defines a convex-linear homotopy h(p,t) := F1-t.
Its homotopy paths are very well behaved, and can be tracked
using standard computational methods (Morgan 1987).
First equilibrium
• For G nets with generic payoffs, the end point of the
homotopy path generated by h(p,t) identifies a unique
sequential equilibrium (first equilibrium).
• The first equilibrium is uniquely determined by the G net
structure, and can be computed using standard path-tracking
techniques.
• Most importantly, strategically independent variables do not
affect the values of F(p)(xk|xPP(k)). It follows that, in the
presence of strategic independencies, convergence to the
zeros of a large system can be reduced to convergence to the
zeros of smaller, strategically independent subsystems.
Computing all equilibria
Let G(p) be defined by
G(p)(xk|xPP(k)):= ui(k)(p)(xPP(k)) F(p)(xk|xPP(k)).
Since u is strictly positive, the zeros of G coincide with the
zeros of F.
Note that G(p) is a vector of polynomial functions, whose zeros
include all the Nash equilibria.
A Nash equilibrium may not have any homotopy path
converging to it in Rn; yet, as it turns out, one can always get at
all of them in the complex space Cn.
Let G0 be the start system defined by
G0j(p)= jdj pjdj - jdj,
where j= (xk|xPP(k)), dj is the degree of Gj(p), and j and j are generic
complex constants. Then G0(p)= 0 has j dj solutions.
Let h(p,t) be the homotopy defined by h(p,t)=(1 - t) G0(p) + t G(p); then
the following result holds.
Theorem (Morgan 1987): Given G, there are sets of measure zero, A
and A in Cn such that, if   A and   A, then:
• the solution set { (p,t) Cn  [0,1) : h(p,t)=0 } is a collection of nonoverlapping, smooth paths;
• the paths move from t=0 to t=1 without backtracking in t;
• each geometrically isolated solution of G=0 of multiplicity m has
exactly m continuation paths converging to it.
• The Nash equilibria are the real solutions of G = 0 in the region
delimited by ui(k)(p)(xk|xPP(k))  1
•The mixed-strategy equilibria are the solutions with multiplicity
higher than one
• The homotopy paths are very well-behaved, and can be tracked
using standard computational methods
• The procedure applies to general games; yet, as in the singleequilibrium case, one can exploit the strategic separabilities in the
G net structure in order to simplify the computations.
Conclusions
• G nets are a novel multi-agent framework for strategic inference
• G nets are more modular and compact than standard gametheoretic representations, thanks to a structured representation
of probabilities, utilities and expected utilities
• moreover, they come with two novel computational procedures
which can take advantage of the extra structure in order to
simplify the inference process
• the first procedure selects a unique sequential equilibrium as a
function of the G net structure
• while the second identifies all Nash equilibria.