Stochastic Games

Transcript Stochastic Games

Stochastic Games
Mr Sujit P Gujar.
e-Enterprise Lab
Computer Science and Automation
IISc, Bangalore.
July 7, 2015
Agenda




Stochastic Game
Special Class of Stochastic Games
Analysis : Shapley’s Result.
Applications
July 7, 2015
e-Enterprise Lab
Repeated Game

When players interact by playing a similar
stage game (such as the prisoner's dilemma)
numerous times, the game is called a
repeated game.
July 7, 2015
e-Enterprise Lab
Stochastic Game




Stochastic game is repeated game with
probabilistic/stochastic transitions.
There are different states of a game.
Transition probabilities depend upon actions
of players.
Two player stochastic game : 2 and 1/2
player game.
July 7, 2015
e-Enterprise Lab
Repeated Prisoner’s Dilemma

Consider Game tree for PD repeated twice.
Assume each player
has the same two
options at each info
set: {C,D}
subga
me
1
2
1
1
2
2
1
1
2
2
What is Player 1’s strategy set?
(Cross product of all choice sets at all information sets…)
{C,D} x {C,D} x {C,D} x {C,D} x {C,D}
25 = 32 possible strategies
July 7, 2015
e-Enterprise Lab
First
Iteratio
n
Second
Iteratio
n
Issues in Analyzing Repeated Games

How to we solve infinitely repeated games?

Strategies are infinite in number.

Need to compare sums of infinite streams of
payoffs
July 7, 2015
e-Enterprise Lab
Stochastic Game : The Big Match




Every day player 2 chooses a number, 0 or 1
Player 1 tries to predict it. Wins a point if he
is correct.
This continues as long as player 1 predicts 0.
But if he ever predicts 1, all future choices for
both players are required to be the same as
that day's choices.
July 7, 2015
e-Enterprise Lab
The Big Match




S = {0,1*,2*} : State space.
 s0 ={0,1} s1 ={0} s2 ={1}
N = {1,2}
02 =
 P
00
P = 0 0
0 1
0 0
1 1
P01 = 1 0
0 0
July 7, 2015

A = Payoff Matrix
=
1* 0*
0 1
e-Enterprise Lab


The "Big-Match" game is introduced by
Gillette (1957) as a difficult example.
The Big Match
David Blackwell; T. S. Ferguson
The Annals of Mathematical Statistics, Vol.
39, No. 1. (Feb., 1968), pp. 159-163.
July 7, 2015
e-Enterprise Lab
Scenario
N
Total number of States/Positions
mk
Choices for row player at position k
nk
Choices for column player at position k
skij > 0
The probability with which the game in position k stops
when player 1 plays i and player 2, j.
pklij
The probability with which the game in position k moves to
l when player 1 plays i and player 2, j.
s
Min skij
akij
Payoff to row player in stage k.
M
Max |akij|
July 7, 2015
e-Enterprise Lab
Stationary Strategies



Enumerating all pure and mixed strategies is
cumbersome and redundant.
Behavior strategies those which specify a
player the same probabilities for his choices
every time the same position is reached by
whatever route.
x = (x1,x2,…,xN) each xk = (xk1, xk2,…, xkmk)
July 7, 2015
e-Enterprise Lab
Notation

Given a matrix game B,




val[B] = minimax value to the first player.
X[B] = The set of optimal strategies for first player.
Y[B] = The set of optimal strategies for second
player.
It can be shown, (B and C having same
dimensions)
|val[B] - val[C]| ≤ max |bij - cij|
July 7, 2015
e-Enterprise Lab


When we start in position k, we obtain a
particular game,
We will refer stochastic game as,
Define,
July 7, 2015
e-Enterprise Lab
1
Shapley’s Results
1
July 7, 2015
L.S. Shapley, Stochastic Games. PNAS 39(1953) 1095-1100
e-Enterprise Lab

Let,
denote the collection of games
whose pure strategies are the stationary
strategies of . The payoff function of these
new games must satisfy,
July 7, 2015
e-Enterprise Lab
Shapley’s Result,
July 7, 2015
e-Enterprise Lab
Applications



1
When N = 1,
By setting all skij = s > 0, we get model of infinitely
repeated game with future payments are discounted by a
factor = (1-s).
If we set nk = 1 for all k, the result is “dynamic
programming model”.
1
July 7, 2015
von Neumann J. , Ergennise eines Math, Kolloquims, 8 73-83 (1937)
e-Enterprise Lab
Example


Consider the game with
N = 1,
A=
1 -1
-2 1



P1 = 1-s
1-s
1-s
1-s
x=(0.6,0.4)
y=(0.4, 0.6)
July 7, 2015

P2 = 1-2s 1-2s
1-s


1-2s
x=(0.61,0.39)
y=(0.39, 0.61)
e-Enterprise Lab

July 7, 2015
Thank You!!
e-Enterprise Lab

Stochastic Games

Transcript Stochastic Games

Directory