Evolving New Strategies - Computer Science & Engineering

Download Report

Transcript Evolving New Strategies - Computer Science & Engineering

Evolving New Strategies
The Evolution of Strategies in the
Iterated Prisoner’s Dilemma
01 / 25
What is the Prisoner’s Dilemma?









There are two prisoners
Each one has taken part in the same criminal act
The authorities are interrogating each one
Each prisoner can choose to keep their mouth shut or rat out their
partner
If both prisoners stay quiet, they each get n months of jail time
If only one prisoner gets ratted out, that prisoner gets n + x months
of jail time while the other prisoner gets n – y months of jail time
If the prisoners rat each other out, they each get n + z months of jail
time.
In this case, n, x, y, and z are all greater than zero.
In this case, x is greater than z.
02 / 25
What is the Iterated Prisoner’s
Dilemma?




Prisoner’s Dilemma performed several times
The two criminals have committed several
crimes together
They are interrogated for each crime, with each
set of interrogations being an instance of the
original Prisoner’s Dilemma
These interrogations are performed in sequence
(or iteratively), and the jail time distributed to
each prisoner is cumulative
03 / 25
How does the IPD relate to GAs?
No optimal solution
 No real strategy
 No clue
 Hard problem


So back to the paper
04 / 25
What This Paper Shows
GAs in a rich social setting
 Advantage of developing new strategies

One parent
 Two parent

Early commitments to paths
 Evolutionary processes optimal or arbitrary

05 / 25
How Does It Show It?
Simulation
 Multiple cases
 Comparative output

06 / 25
The Simulation
Specify the environment
 Specify the encoding
 Testing the effects of random mutation
 Run the simulation
 Analyze the results

07 / 25
The Environment
Prisoner’s dilemma
 Multiple prisoners
 Goal is to achieve mutual cooperation
 Individuals may meet more than once

08 / 25
Initial Experiment

Original strategies were submitted by
fourteen people
Game Theory
 Economics
 Sociology
 Political Science
 Mathematics


Various levels of intricacy
09 / 25
Initial Experiment

Most complex strategy
Markov process
 Bayesian inference


Least complex strategy


TFT
TFT won
10 / 25
Second Experiment

Sixty-two entries
Six countries
 Computer hobbyists, professors


TFT was submitted again

It won
11 / 25
The GA
Population
 Encoding
 Generation
 Crossover
 Mutation
 Fifty Generations

12 / 25
Population
Twenty chromosomes
 Seventy genes

13 / 25
Encoding
For each prisoner’s dilemma, there are
four possibilities
 Each “player” has memory
 What each gene represents

14 / 25
A Single Generation

Multiple games



Each game had one-hundred and fifty-one moves
Each chromosome played eight others
Fitness was assigned




Ratted out – Zero points
Mutually ratted out – One point
Mutual cooperation – Three points
You ratted, other person stayed quit – Five points
15 / 25
Crossover

Fitness proportional selection

Involved standard deviation from mean
Strictly ten crossovers
 Single point
 Two parents

16 / 25
Mutation
Single gene flip
 One gene per two chromosomes

17 / 25
Results

Median resultant member
Just as good as TFT
 Resembled TFT


Five properties were found





Don’t rock the boat
Be provocable
Accept apologies
Forget
Accept a rut
18 / 25
Results

ADJUSTER
Special chromosome which consistently
seeks to exploit
 TFT
 Majority of other chromosomes

19 / 25
Results

Twenty-five percent of runs
Median was better
 Exploit one chromosome

20 / 25
Results: Why is this important?

Chromosomes had to learn
Discriminatory based on evidence
 Self adjusting for exploitation
 No alienation


Break primary rule of first tournament

Be nice? I don’t think so
21 / 25
Results: Misleading?

Median
Exploitative
 Menacing
 A true criminal?

Fixed size population and tournaments
 Simulate real evolution

22 / 25
Results: A Slight Twist

Asexual reproduction
TFT
 Less than half of the medians


Changing environment
Play against everyone
 Everyone starts aggressive

 Fitness
rapidly declines
 Fitness begins to even out
 Fitness begins to rise
23 / 25
Conclusions
The GA is good for searching, large, multidimensional spaces
 Multiple parent crossover helps
 Arbitrary aspects of evolution



Exploration vs. Exploitation


Hitch hikers
Selection Pressure
Evolutionary Commitments can be
irreversible
24 / 25
Related Topics
Mutation
 Crossover
 Inversion
 Coding principles
 Dominant/Recessive
 Rate of evolution
 Population viscosity
 Speciation and niches

25 / 25