Boosting for transfer

X. Cai C. Fang R. Guo
W. Yu B. Qian S. Jiang
CSC7333 Machine Learning
Project Presentation
April 30, 2013

Super Mario Bros is a 1985 platform video game
developed by Nintendo, published for the Nintendo
Entertainment System as a pseudo-sequel to the
1983 game Mario Bros.

Mario AI Championship 2012

The 2012 Mario AI Championship is the successor to the successful
2010 and 2011 Mario AI Competitions and is run in association with
several major international conferences focusing on computational
intelligence and games.

Our project is inspired by the competition:

The goal is to develop the best possible controller (agent) to play
the game.

Method: artificial neural network (ANN).

Neural networks are used for modeling complex
relationships between inputs and outputs or to find
patterns in data.

Noise and error tolerance.

An ANN can perform tasks that a linear program cannot.

When an element of the neural network fails, the network can
continue to operate because of its parallel nature.
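
As a small illustration of the point about linear methods, XOR is a standard task that no single linear threshold unit can compute, while a network with one hidden layer can. The sketch below is not the project's network: the 2-2-1 layout, the hand-picked weights, and the step activation are all assumptions chosen to keep the example short, and it reads "linear program" as "linear classifier".

```java
// Minimal sketch: a 2-2-1 feed-forward network with hand-picked weights that computes XOR.
// The weights and the step activation are illustrative assumptions, not the project's network.
final class XorNetworkSketch {

    // Step activation: fires (1.0) when the weighted sum is positive.
    static double step(double x) {
        return x > 0.0 ? 1.0 : 0.0;
    }

    // One neuron: weighted sum of the inputs plus a bias, passed through the step function.
    static double neuron(double[] inputs, double[] weights, double bias) {
        double sum = bias;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return step(sum);
    }

    // Hidden unit 1 behaves like OR, hidden unit 2 like AND;
    // the output unit fires for "OR but not AND", which is XOR.
    static int xor(int a, int b) {
        double[] in = {a, b};
        double or = neuron(in, new double[] {1.0, 1.0}, -0.5);
        double and = neuron(in, new double[] {1.0, 1.0}, -1.5);
        double out = neuron(new double[] {or, and}, new double[] {1.0, -1.0}, -0.5);
        return (int) out;
    }
}
```

In a trained network the weights would come from the training data rather than being set by hand.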

Design an ANN approach

The controller learns from the environment around Mario; its
job is to win as many levels as possible.
At each time step the controller has to decide which
action to take (left, right, jump, etc.).


The linear world of Mario is filled with enemies, obstacles and powerups.
The problem the agent faces when playing a game is how to correctly
interpret what it sees in the environment and decide on the best
action to take.

Input & Output:

Data set = input + output (one frame gives one data set)

In every frame our script scans the Mario code once, extracting the
environment matrix and the keyboard action, to obtain one data set.
For example, from 3 minutes of monitored play at 24 frames per second
we obtain 3*60*24 = 4320 data sets.
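
The per-frame recording described above can be sketched as a fixed-size buffer of (input, action) pairs. This is only an illustration: the class name, the methods, and the array-based storage are hypothetical, and the rate of 24 frames per second is inferred from the 3*60*24 arithmetic rather than stated in the slides.

```java
// Minimal sketch of a per-frame recorder; all names here are hypothetical.
final class FrameRecorderSketch {
    // Assumption: 24 frames per second, inferred from 3*60*24 = 4320.
    static final int FRAMES_PER_SECOND = 24;

    private final double[][] inputs;
    private final int[][] actions;
    private int count = 0;

    // Reserves room for one data set per frame over the given play time.
    FrameRecorderSketch(int seconds) {
        int capacity = seconds * FRAMES_PER_SECOND;
        inputs = new double[capacity][];
        actions = new int[capacity][];
    }

    // Called once per frame; returns false once the buffer is full.
    boolean record(double[] input, int[] action) {
        if (count == inputs.length) {
            return false;
        }
        inputs[count] = input.clone();   // copy so later frames cannot overwrite this one
        actions[count] = action.clone();
        count++;
        return true;
    }

    int size() {
        return count;
    }

    int capacity() {
        return inputs.length;
    }
}
```

With these assumptions, `new FrameRecorderSketch(3 * 60).capacity()` gives the 4320 data sets mentioned above.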

Input = environment matrix + Mario position

Output = keyboard action (0 or 1 for each key)

Left  Right  Up  Down  Shoot  Jump
0     0      0   0     0      1

This is a jump action.
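
One way to picture a single data set is as a flat input vector plus a six-element 0/1 action vector. The sketch below is an assumption about the layout, not the project's actual code: the class and method names are hypothetical, the environment matrix is flattened row by row, and Mario is described by a single number appended at the end.

```java
// Minimal sketch of turning one frame into an (input, output) pair; names are hypothetical.
final class FrameEncodingSketch {
    // Key order follows the slide: left, right, up, down, shoot, jump.
    static final int LEFT = 0, RIGHT = 1, UP = 2, DOWN = 3, SHOOT = 4, JUMP = 5;
    static final int KEY_COUNT = 6;

    // Flattens the environment matrix row by row and appends one value describing Mario.
    static double[] encodeInput(int[][] environment, double marioValue) {
        int rows = environment.length;
        int cols = environment[0].length;
        double[] input = new double[rows * cols + 1];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                input[r * cols + c] = environment[r][c];
            }
        }
        input[rows * cols] = marioValue;
        return input;
    }

    // Converts the pressed-key flags into the 0/1 vector used as the training target.
    static int[] encodeAction(boolean[] pressed) {
        int[] output = new int[KEY_COUNT];
        for (int k = 0; k < KEY_COUNT; k++) {
            output[k] = pressed[k] ? 1 : 0;
        }
        return output;
    }
}
```

A frame in which only the jump key is held would then encode to the vector 0 0 0 0 0 1, matching the example above.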

Monitor data training:

Encapsulate the patterns in a Java agent

Encapsulate the learned patterns in Java-style agents so that they
can be used in the Super Mario source code.

Test ANN Agents

Import the ANN Java library into the Super Mario code.

Insert the ANN agent for testing.
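
A wrapped agent could look roughly like the sketch below. The `TrainedNetwork` interface is a hypothetical stand-in for whatever ANN library was imported, and `AnnAgentSketch` is not the agent interface of the Super Mario code. The threshold rule for turning network outputs into key presses is also an assumption.

```java
// Hypothetical stand-in for the trained network; not the API of any real ANN library.
interface TrainedNetwork {
    double[] predict(double[] input);
}

// Minimal sketch of an agent that wraps a trained network.
final class AnnAgentSketch {
    private final TrainedNetwork network;
    private final double threshold;

    AnnAgentSketch(TrainedNetwork network, double threshold) {
        this.network = network;
        this.threshold = threshold;
    }

    // Called once per time step: a key counts as pressed
    // when its network output is above the threshold.
    boolean[] chooseAction(double[] input) {
        double[] outputs = network.predict(input);
        boolean[] keys = new boolean[outputs.length];
        for (int i = 0; i < outputs.length; i++) {
            keys[i] = outputs[i] > threshold;
        }
        return keys;
    }
}
```

Because each output is thresholded independently, several keys can be pressed in the same frame.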

First experiment : 13*19 + 1 + 6

13*19 : environment around Mario

1 : Mario’s state

6 : Human keyboard action (up, down, left, run, jump, fire)

Doesn’t work well: input data is too large

Second experiment : 10*10 + 1 + 2

10*10 : reduce the number of environment data

1 : Mario’s state

2 : Human keyboard action, with run and jump as separate outputs

Doesn’t work well: ‘run’ and ‘jump’ are treated separately, but Mario
sometimes needs to run and jump at the same time.
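
The slides do not say which cells were kept when the environment was reduced from 13*19 to 10*10. One simple possibility, shown here purely as an assumption, is to copy a square window out of the full grid; the class name, method name, and window offsets are hypothetical.

```java
// Minimal sketch of shrinking the environment matrix by cropping; names are hypothetical.
final class EnvironmentCropSketch {

    // Copies a size-by-size window whose top-left corner is at (top, left).
    static int[][] crop(int[][] grid, int top, int left, int size) {
        if (top < 0 || left < 0
                || top + size > grid.length
                || left + size > grid[0].length) {
            throw new IllegalArgumentException("window does not fit inside the grid");
        }
        int[][] window = new int[size][size];
        for (int r = 0; r < size; r++) {
            for (int c = 0; c < size; c++) {
                window[r][c] = grid[top + r][left + c];
            }
        }
        return window;
    }
}
```

Cropping a 13*19 grid to 10*10 this way cuts the environment part of the input from 247 values to 100.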

Third experiment : 10*10 + 1 + 2

10*10 : keep the environment data small but effective

1 : Mario’s state

2 : merge run & jump together

Works better but is still not satisfactory: the training data from
human play is too noisy.

Fourth experiment : 10*10 + 1 + 2

We realized that how the human plays the game while the training data
is recorded is critical for the experiment.

The human player took extra care while playing.

How to obtain the best input/output pair?

Input: remove decorative redundancy.
Output: a good human player is critical.

About the hidden layer

Tuning is the best way.

Is an ANN the best way of designing a Mario agent?

Not really. The major problem is that we cannot obtain sufficient and
accurate training data from human players.
Reinforcement learning may be better.