Othello Artificial Intelligence With Machine Learning

Download Report

Transcript Othello Artificial Intelligence With Machine Learning

Othello Artificial Intelligence
With
Machine Learning
Computer Systems TJHSST
Nick Sidawy
Introduction
Machine learning is an extensive filed of study.
Most of what is done with machine learning is
tied to artificial intelligence which is why Othello
seemed to be a good vessel for my research. It
is a simple enough game for me to work on, yet
difficult enough to keep me working throughout
the quarters.
Purpose
The purpose of this research project is to
implement machine learning with artificial
intelligence. The reason for this is two-fold:
 First, to create a very effective Othello AI.
 Second, and more academically oriented, is to
gain a deeper understanding of machine
learning.
Goal Breakdown
The first goal I would like to achieve with this
project is to program an effective forwardchecker for the AI.
 The second goal is to use a genetic algorithm to
formulate the best evaluation function for the AI
to use.
 My last goal is to have the AI learn from each
move it makes so that the more it plays, the
faster and better it will perform.

Development
The project will be coded in java.
 I have separated it into 7 different programs:
 The driver
 The panel (which controls the game)
 The AI
 The game functions (such as capturing pieces)
 A class called MyButton
 A program to run the Genetic Algorithm
 A program for the game functions that will be
used in the Genetic Algorithm.
 A program to the MachineLearning functions.

Construction (First Quarter)
There are two algorithms that the first quarter
was focused on:

The forward-checking algorithm (Minimax)

The evaluation function

A MyButton class was also created
MyButton Class
The MyButton class extends Jbutton and has the
following variations:


It has a paint component
It can store values which indicate if the spot on
the board is occupied by a piece and will draw
images accordingly
Forward-Checker
The goal of a forward Checker is to traverse a
tree of possible moves and picking the move that
will lead to the best scenario down the line. The
ply determines how many levels of the tree it
goes through.
The trick is that at each level of the three it
picks the move that is best for the player it is
simulating for. Therefore, the computer assumes
the opponent will play perfectly.
It is nicknamed the Minimax algorithm for this
reason. (See Diagram A)
Evaluation Function
This function returns a number rating how good
a particular board is for a player. It does this
based on the positions of the pieces and amount
of available moves for each player. For example,
pieces in the corners are very valuable and will
add many more points to the rating than a piece
near the center.
Construction (Second Quarter)



Recreating the GUI that the AI would run in so
different AI’s and simulations could be used
when running the program just once, instead of
having to restart it.
Creating a program to run the Genetic
Algorithm.
Creating a program that would contain the
moves necessary for the Genetic Algorithm to
run.
The New GUI
Genetic Algorithm
The values used to evaluate a given Othello board
are very important for picking the best move and I
hope to use the Genetic Algorithm to find the best
values for evaluation.

The main components:




The population (A set of different evaluation values)
The fitness evaluation (The way of testing how well a
set of evaluation values performs)
Splicing and Offspring (The production of a new,
improved set of evaluation values)
Genetic Algorithm (Cont’d)



First, 8 sets of evaluation values are created
that will make up the population.
Second, each set is given a score based on
how it does with the fitness function, which in
this case is a game played against a different,
constant set of values.
Third, 8 new sets of values are produced by
splicing the 8 sets of values in the original
generation.
Splicing




The reproduction takes a Darwinian approach.
Depending on how well a particular set does against the
fitness function determines the likelihood it will be chosen
for reproduction.
Two sets (Set A and B) are chosen based on the
probabilities and a crossover point is chosen at random for
the sets.
Next, the two sets will splice by taking all the values before
the crossover point of Set A and combining them with the
values after the crossover point of Set B. Then, the
opposite is performed.
Splicing (Cont’d)




This process is done 4 times so the offspring (next
generation) is created and can be tested against the
fitness function.
There is also a predetermined chance that a mutation may
occur. A mutation is when a value on a specific set is put
to a random number and helps prevent the evaluation sets
from all reaching the same values (plateau).
In the event a plateau is reached and all the sets in a
generation are the same, then the fitness function will
become the one of the sets of the generation and 8 new
sets will be created by random.
In theory this will create better and better evaluation sets
over time.
Construction (Third Quarter)
Creating a program (MachineLearning) which
does the following:



Saving information from each move the AI
makes.
Storing the data in a file so it can be used game
after game.
Loading data from the file and putting it into a
HashMap for quick access during a game.
Data Collecting Procedure



Games will be run between the AI and a
random opponent over and over.
Data from each move made will be stored in
several different variations. (For example, the
board that is given, a reflection of that board,
etc.)
The data will be stored into a text file after each
game where it can be loaded the next time the
program runs.
Data Storage
After each move the AI makes, four pieces of
data will be stored:

Player's piece color

Boardstate (before the move is made)

Move chosen

Evaluation
During a game, this data will be stored in a
HashMap with the player's piece color and
boardstate as the key and the chosen move
and evaluation as the values stored.
Purpose/Uses for Data



By storing all this data about moves the AI has
chosen, the FowardChecker will be more efficient and,
possibly, more effective.
This will prevent the need to traverse through a tree of
possible moves if that board situation has already
encountered.
It will also make it possible to search deeper into a
tree of moves by using the stored information only
after the ForwardChecker has moved three or four
levels deep.
Conclusions
Thus far into the project I have learned the
importance of the evaluation function. A strong
evaluation function has much more weight than
a forward-checker with a high-ply.
 When the project is complete I plan on having a
very skilled Othello AI that uses machine
learning extensively and will be challenging for
the best players.
