
INTERACTIVE COMPUTER GAMES:
A HUMAN-LEVEL ARTIFICIAL INTELLIGENCE APPLICATION
More precisely, what branch of AI is behind them?
Are interactive games an area of human-level AI research?
Is AI used in interactive games?
Picture courtesy: Google Images
Human-like attributes expected in a human-level AI system (human-level capabilities):
• Real-time response
• Robustness
• Autonomous, intelligent interaction with the environment
• Planning
• Communication in natural language
• Common sense reasoning
• Creativity
• Learning
Are interactive games an area of human-level AI research?
Is AI used in interactive games?
Game genres:
• Action games
• Role-playing games
• Adventure games
• Strategy games
• God games
• Team sports games
• Individual sports games

AI roles within these games:
• Tactical enemies
• Partners
• Support characters
• Story directors
• Strategic opponents
• Units
• Commentators
[Diagram: AI subfields feeding into computer games: search, logic, machine learning, NLP, vision, knowledge representation, planning, robotics, and expert systems.]
A case study: the basics
Focus: game tactics, and how AI is used to enhance them.
AI tools used: evolutionary computation and reinforcement learning.
Real-time strategy (RTS) games
AI components used:
• Evolutionary computation, in the form of a genetic algorithm.
• Reinforcement learning, a learning technique with a mathematical reward function.
• The player controls armies to defeat all opposing forces on a virtual battlefield.
• The key to winning lies in efficiently collecting and managing resources, and appropriately allocating these resources over the various action elements.
• Famous examples: Age of Empires, Warcraft.
Picture courtesy: http://www.igniq.com/images/age_of_empires_3
Key terms
• Action: an atomic transformation of the game state.
• Tactic: a sequence consisting of one or more primitive actions in a given game state.
• Strategy: a sequence of tactics used to play the entire game.
[Diagram: tactics driving the game through State 1, State 2, and State 3.]
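To make the nesting concrete, here is a minimal Python sketch; the type names and the GameState representation are our own illustration, not from the source:

```python
# A minimal sketch of how the three terms nest (illustrative types).
from dataclasses import dataclass, field
from typing import Callable, Dict, List

GameState = Dict[str, int]                  # e.g., {"gold": 100, "soldiers": 5}
Action = Callable[[GameState], GameState]   # atomic state transformation

@dataclass
class Tactic:
    """One or more primitive actions executed in a given game state."""
    actions: List[Action]

    def apply(self, state: GameState) -> GameState:
        for action in self.actions:
            state = action(state)
        return state

@dataclass
class Strategy:
    """A sequence of tactics used to play the entire game."""
    tactics: List[Tactic] = field(default_factory=list)
```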
AI Components in the Game
• AI in RTS games determines all decisions of the computer opponents.
• It is encoded in the form of scripts, called STATIC SCRIPTS.
[Diagram: a static script as a fixed chain: State 1 → Tactic A → State 2 → Tactic B → State 3 → Tactic C.]
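For contrast with what follows, a hedged sketch of a static script; the state and tactic names are illustrative:

```python
# A static script hard-wires exactly one tactic per state, mirroring the
# chain above.
STATIC_SCRIPT = {
    "state 1": "tactic A",
    "state 2": "tactic B",
    "state 3": "tactic C",
}

def choose_tactic(state: str) -> str:
    # The same state always yields the same tactic, which is why static
    # scripts are predictable and easy for human players to exploit.
    return STATIC_SCRIPT[state]
```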
Dynamic Scripting
• Each state has multiple possible tactics.
• Tactics have relative weights assigned to them.
• The highest weight marks the currently best tactic.
• Weights are adjusted to adapt to the given situation.
• New tactics are evolved on the fly.
A minimal selection sketch follows below.
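This sketch assumes the per-state weight tables shown in the examples that follow; weight-proportional selection is our simplification of how dynamic scripting picks among tactics:

```python
import random

# Per-state tactic weights, matching the first example table below
# (illustrative values).
weights = {
    "state 1": {"tactic A": 0.4, "tactic B": 0.6},
    "state 2": {"tactic A": 0.7, "tactic B": 0.3},
}

def choose_tactic(state: str) -> str:
    """Pick a tactic with probability proportional to its weight.
    A pure argmax would never explore the currently weaker tactics."""
    names = list(weights[state])
    return random.choices(names, weights=[weights[state][n] for n in names])[0]
```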
Example (weights before adaptation):
          Tactic A   Tactic B
State 1     0.4        0.6
State 2     0.7        0.3

Example (weights after adaptation):
          Tactic A   Tactic B
State 1     0.8        0.2
State 2     0.7        0.3
Another real example
AI: "I have to develop my army well first; only then can I attack. This will take a while."
Human: "I don't care about available resources. Attack at the earliest! Ha ha ha!"
Picture courtesy: World of Warcraft
Another real example
AI: "Gathering resources and preparing for a heavy assault."
Human: "I have suffered heavy losses. Now I need to increase my strength first; small attacks are of no use."
Picture courtesy: World of Warcraft
Dynamic Scripting
• Adaptive agent: S_{a,i} = score at state i.
• Static agent: S_{s,i} = score at state i.
• b = break-even point; at this point the weight remains unchanged.

Weight adjustment is based on a per-state reward R_i and a global reward R (L denotes the final state):

\[
R_i = \frac{S_{a,i} - S_{a,i-1}}{(S_{a,i} - S_{a,i-1}) + (S_{s,i} - S_{s,i-1})}
\]

\[
R = \begin{cases}
\min\!\left(\dfrac{S_{a,L}}{S_{a,L} + S_{s,L}},\, b\right) & \text{if the adaptive agent lost} \\
\max\!\left(\dfrac{S_{a,L}}{S_{a,L} + S_{s,L}},\, b\right) & \text{if the adaptive agent won}
\end{cases}
\]
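Translated directly into code (a sketch; it assumes the score gains and totals are positive, so the denominators are nonzero):

```python
def state_reward(sa_i, sa_prev, ss_i, ss_prev):
    """Per-state reward R_i: the adaptive agent's score gain as a fraction
    of both agents' combined score gain over the last state."""
    delta_a = sa_i - sa_prev
    delta_s = ss_i - ss_prev
    return delta_a / (delta_a + delta_s)

def global_reward(sa_final, ss_final, won, b):
    """Global reward R: the adaptive agent's share of the final score,
    capped at the break-even point b after a loss, floored at b after a win."""
    share = sa_final / (sa_final + ss_final)
    return max(share, b) if won else min(share, b)
```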
Dynamic Scripting
Weight values are bounded to [W_min, W_max]. The weight adjustment combines the global and state rewards:

\[
\Delta W = \begin{cases}
-P_{\max}\left( C_{end}\,\dfrac{b - R}{b} + (1 - C_{end})\,\dfrac{b - R_i}{b} \right) & \text{if } R < b \\
R_{\max}\left( C_{end}\,\dfrac{R - b}{1 - b} + (1 - C_{end})\,\dfrac{R_i - b}{1 - b} \right) & \text{if } R \geq b
\end{cases}
\]

C_end is a parameter set below 0.5, so that the contribution of the state reward is kept larger than that of the global reward. P_max and R_max are the maximum penalty and maximum reward, respectively.
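The same update written out as a sketch (parameter names mirror the formula):

```python
def weight_delta(r, r_i, b, c_end, p_max, r_max):
    """Weight adjustment for the tactic used in state i. Because
    c_end < 0.5, the state reward r_i contributes more than the
    global reward r."""
    if r < b:   # below break-even: penalize, scaled by the shortfall
        return -p_max * (c_end * (b - r) / b + (1 - c_end) * (b - r_i) / b)
    # at or above break-even: reward, scaled by the surplus
    return r_max * (c_end * (r - b) / (1 - b) + (1 - c_end) * (r_i - b) / (1 - b))

def clamp_weight(w, w_min, w_max):
    """Weights stay bounded in [W_min, W_max]."""
    return min(max(w, w_min), w_max)
```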
Automatically Generating Tactics
• The Evolutionary State-based Tactics Generator (ESTG): an application of genetic algorithms.
• Counter-strategies are "played" against training scripts; only the fittest are allowed into the next generation.
Three ingredients:
• Chromosome encoding
• Genetic operators
• Fitness function
Chromosome Encoding
• The evolutionary algorithm works with a population of chromosomes.
• Each chromosome represents a static strategy.
[Chromosome layout: Start | State 1 | State 2 | … | State m | End]
The chromosome is divided into m states.
Chromosome Encoding
States include a state marker followed by the state number and a series
of genes.
A Gene
Parameter
values
4 types of genes
Genes
ID
Build genes
B
Research genes
R
Economy genes
E
Combat genes
C
Followed by
values of
parameters
needed by the
gene .
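One plausible in-memory representation of this encoding; the classes and the parameter values in the fragment are hypothetical:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Gene:
    gene_id: str        # "B" (build), "R" (research), "E" (economy), "C" (combat)
    params: List[int]   # parameter values needed by this gene

@dataclass
class StateBlock:
    number: int         # state number following the state marker
    genes: List[Gene]

# A chromosome is the sequence of state blocks between Start and End.
Chromosome = List[StateBlock]

# Hypothetical fragment: state 1 trains a defensive army (combat gene)
# and constructs a blacksmith (build gene); parameter values are made up.
fragment: Chromosome = [
    StateBlock(number=1, genes=[Gene("C", [4]), Gene("B", [3])]),
]
```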
Chromosome Encoding
[Figure: partial example of a chromosome.]
Fitness Function

\[
F = \begin{cases}
\min\!\left(\dfrac{C_T}{C_{\max}} \cdot \dfrac{M_a}{M_a + M_s},\, b\right) & \text{if } a \text{ lost} \\
\max\!\left(\dfrac{M_a}{M_a + M_s},\, b\right) & \text{if } a \text{ won}
\end{cases}
\]

where:
• C_T is the time step at which the game finished;
• C_max is the maximum time step the game is allowed to continue to;
• M_a is the military points of the adaptive agent a;
• M_s is the military points of the adaptive agent's opponent;
• b is the break-even point.
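The fitness function translates directly into a short sketch (it assumes M_a + M_s > 0):

```python
def fitness(won, c_t, c_max, m_a, m_s, b):
    """Fitness F for the adaptive agent a after one game. In a lost game
    the military share is scaled by C_T / C_max, so chromosomes that
    survive longer score higher."""
    military_share = m_a / (m_a + m_s)
    if won:
        return max(military_share, b)
    return min((c_t / c_max) * military_share, b)
```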
Fitness Function
The goal is to generate a chromosome whose fitness exceeds a target value. When such a chromosome is found, the evolution process ends; this is the fitness-stop criterion.
Because there is no guarantee that a chromosome exceeding the target value will be found, evolution also ends after a maximum number of chromosomes has been generated; this is the run-stop criterion.
Suitable values for both criteria can be determined experimentally.
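A sketch of the resulting loop; `evaluate` and `breed` are placeholders for the procedures the slides describe:

```python
def evolve(population, evaluate, breed, fitness_target, max_chromosomes):
    """Evolution loop honoring both stopping criteria. `evaluate` plays a
    chromosome against the training scripts and returns its fitness;
    `breed` applies the genetic operators to the population."""
    best, best_fitness = None, float("-inf")
    generated = 0
    while generated < max_chromosomes:          # run-stop criterion
        chromosome = breed(population)
        generated += 1
        f = evaluate(chromosome)
        if f > best_fitness:
            best, best_fitness = chromosome, f
        if best_fitness > fitness_target:       # fitness-stop criterion
            break
    return best
```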
Genetic Operators
Parents are selected by size-3 tournament. Four operators are used (state crossover is sketched below):
• State crossover: selects two parents and copies states from either parent to the child chromosome.
• Gene-replace mutation: selects one parent, and replaces economy, research, or combat genes with a 25% probability.
• Gene-biased mutation: selects one parent and mutates parameters of existing economy or combat genes with a 50% probability.
• Randomization: randomly generates a new chromosome.
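A sketch of state crossover; the exact per-state choice rule in the source may differ from this uniform pick:

```python
import random

def state_crossover(parent_a, parent_b):
    """State crossover: build a child by copying each state block from
    one of the two parents, chosen independently per state."""
    assert len(parent_a) == len(parent_b), "parents must span the same states"
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]

# Hypothetical usage, with each chromosome as a list of per-state gene lists:
parent_a = [["C 4", "B 3"], ["E 2"], ["R 1"]]
parent_b = [["B 5"], ["C 2", "E 1"], ["C 9"]]
child = state_crossover(parent_a, parent_b)
```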
KT: State-based Knowledge Transfer
[Diagram: tactics are extracted from evolved chromosomes into state-specific knowledge bases.]
• The possible tactics during a game mainly depend on the available units and technology, which in RTS games typically depend on the buildings that the player possesses.
• Thus, we distinguish tactics using the Wargus states.
• All genes grouped in an activated state (one that includes at least one activated gene) in a chromosome are considered to be a single tactic.
Extracting Tactics for a State
• The example chromosome contains two tactics.
• State 1 holds Gene 1.1 (a combat gene that trains a defensive army) and Gene 1.2 (a build gene that constructs a blacksmith). This tactic is inserted into the knowledge base for state 1.
• Gene 1.2 spawns a state change, so the subsequent genes become part of a tactic for state 3 (constructing a blacksmith causes a transition to state 3, as indicated by the state marker in the example chromosome).
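A sketch of the extraction step; the chromosome representation and gene labels here are hypothetical simplifications:

```python
def extract_tactics(chromosome):
    """Group the genes of every activated state into a single tactic and
    file it under that state's knowledge base. Here a chromosome is a
    list of (state_number, genes) pairs with inactive states omitted."""
    knowledge_bases = {}
    for state_number, genes in chromosome:
        if genes:   # an activated state contains at least one activated gene
            knowledge_bases.setdefault(state_number, []).append(list(genes))
    return knowledge_bases

# Hypothetical chromosome mirroring the example: state 1 trains a
# defensive army and builds a blacksmith, which transitions to state 3.
chromosome = [(1, ["C train-defenders", "B blacksmith"]),
              (3, ["E gather-gold", "C attack"])]
kb = extract_tactics(chromosome)    # {1: [[...]], 3: [[...]]}
```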
Performance of Dynamic Scripting
Experiment scenario:
• The performance of the adaptive agent (controlled by dynamic scripting using the evolved knowledge bases) in Wargus is evaluated by playing it against a static agent.
• Each game lasted until one of the agents was defeated, or until a certain period of time had elapsed.
• If the game ended due to the time restriction, the agent with the higher score was considered to have won.
• After each game, the adaptive agent's policy was adapted.
A sequence of 100 games constituted one experiment. We ran 10 experiments against each of four different strategies for the static agent.
The four static-agent strategies:
• Small Balanced Land Attack (SBLA), played on a small map
• Large Balanced Land Attack (LBLA), played on a large map
• Soldier's Rush (SR)
• Knight's Rush (KR)
Performance Analysis
• RTP is the number of the first game in which the adaptive agent outperforms the static agent.
• A low RTP value indicates good efficiency for dynamic scripting.
[Bar chart: average RTP value against each opponent strategy. The three bars that reached 100 represent runs in which no RTP was found, i.e., dynamic scripting was unable to statistically outperform the specified opponent.]
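Under the slide's simplified definition, RTP can be computed as below; note the slides speak of statistically outperforming the opponent, whereas this sketch just compares per-game scores:

```python
def rtp(adaptive_scores, static_scores):
    """RTP under the slide's simplified definition: the number of the
    first game (1-based) in which the adaptive agent outperforms the
    static agent; None if no such game occurs within the experiment."""
    for game, (a, s) in enumerate(zip(adaptive_scores, static_scores), start=1):
        if a > s:
            return game
    return None
```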
Where We Stand Today…

Human-level capability                                     Status
Real-time response                                         Achieved
Robustness                                                 Achieved
Autonomous intelligent interaction with the environment    Achieved
Planning                                                   Achieved
Communication with natural language                        Achieved
Common sense reasoning                                     Not achieved
Creativity                                                 Not achieved
Learning                                                   Achieved

Picture courtesy: Prince of Persia, Google Images
Drawbacks
• Giving undue advantages ("cheating") to AI agents.

Future Scope
• Removing the "cheating" factor from interactive games.
• Introducing creativity in AI agents.
• Enabling AI agents to reason with human-like common sense.
References
• Ponsen, M. & Spronck, P. (2006). Automatically Generating Game Tactics via Evolutionary Learning.
• Spronck, P., Sprinkhuizen-Kuyper, I. & Postma, E. (2004). Online Adaptation of Game Opponent AI with Dynamic Scripting.
• Sutton, R. & Barto, A. (1998). Reinforcement Learning: An Introduction.