Othello Artificial Intelligence With Machine Learning

Download Report

Transcript Othello Artificial Intelligence With Machine Learning

Othello Artificial Intelligence
With
Machine Learning
Computer Systems TJHSST
Nick Sidawy
Introduction
Machine learning is an extensive filed of study.
Most of what is done with machine learning is tied
to artificial intelligence which is why Othello
seemed to be a good vessel for my research. It is a
simple enough game for me to work on, yet difficult
enough to keep me working throughout the quarters.
Purpose
The purpose of this research project is to implement
machine learning with artificial intelligence. The
reason for this is two-fold:
First, to create a very effective Othello AI.
Second, and more academically oriented, is to gain a
deeper understanding of machine learning.
Goal Breakdown
The first goal I would like to achieve with this
project is to program an effective forward-checker for
the AI.
The second goal is to use a genetic algorithm to
formulate the best evaluation function for the AI to
use.
My last goal is to have the AI learn from each move
it makes so that the more it plays, the faster and better
it will perform.

Development
The project will be coded in java.
I have separated it into 7 different programs:
The driver
The panel (which controls the game)
The AI
The game functions (such as capturing pieces)
A class called MyButton
A program to run the Genetic Algorithm
A program for the game functions that will be used in
the Genetic Algorithm.
A program to the MachineLearning functions.

Construction (First Iteration)
There are two algorithms that the first quarter was
focused on:

The forward-checking algorithm (Minimax)

The evaluation function

A MyButton class was also created
MyButton Class
The MyButton class extends Jbutton and has the
following variations:

It has a paint component
It can store values which indicate if the spot on the
board is occupied by a piece and will draw images
accordingly

Forward-Checker
The goal of a forward Checker is to traverse a
tree of possible moves and picking the move that
will lead to the best scenario down the line. The ply
determines how many levels of the tree it goes
through.

The trick is that at each level of the three it
picks the move that is best for the player it is
simulating for. Therefore, the computer assumes the
opponent will play perfectly.
It is nicknamed the Minimax algorithm for this
reason. (See Diagram A)

Evaluation Function
This function returns a number rating how good a
particular board is for a player. It does this based on
the positions of the pieces and amount of available
moves for each player. For example, pieces in the
corners are very valuable and will add many more
points to the rating than a piece near the center.
Construction (Second Iteration)



Recreating the GUI that the AI would run in so
different AI’s and simulations could be used
when running the program just once, instead of
having to restart it.
Creating a program to run the Genetic
Algorithm.
Creating a program that would contain the
moves necessary for the Genetic Algorithm to
run.
The New GUI
Genetic Algorithm
The values used to evaluate a given Othello board
are very important for picking the best move and I
hope to use the Genetic Algorithm to find the best
values for evaluation.
The main components:





The population (A set of different evaluation values)
The fitness evaluation (The way of testing how well a
set of evaluation values performs)
Splicing and Offspring (The production of a new,
improved set of evaluation values)
Genetic Algorithm (Cont’d)



First, 8 sets of evaluation values are created
that will make up the population.
Second, each set is given a score based on
how it does with the fitness function, which in
this case is a game played against a different,
constant set of values.
Third, 8 new sets of values are produced by
splicing the 8 sets of values in the original
generation.
Splicing




The reproduction takes a Darwinian approach.
Depending on how well a particular set does against the
fitness function determines the likelihood it will be chosen
for reproduction.
Two sets (Set A and B) are chosen based on the
probabilities and a crossover point is chosen at random for
the sets.
Next, the two sets will splice by taking all the values before
the crossover point of Set A and combining them with the
values after the crossover point of Set B. Then, the
opposite is performed.
Splicing (Cont’d)




This process is done 4 times so the offspring (next
generation) is created and can be tested against the
fitness function.
There is also a predetermined chance that a mutation may
occur. A mutation is when a value on a specific set is put
to a random number and helps prevent the evaluation sets
from all reaching the same values (plateau).
In the event a plateau is reached and all the sets in a
generation are the same, then the fitness function will
become the one of the sets of the generation and 8 new
sets will be created by random.
In theory this will create better and better evaluation sets
over time.
Construction (Third Iteration)
Creating a program (MachineLearning) which
does the following:
 Saving information from each move the AI
makes.
 Storing the data in a file so it can be used game
after game.
 Loading data from the file and putting it into a
HashMap for quick access during a game.
Data Collecting Procedure


Games will be run between the AI and a
random opponent over and over.
Data from each move made will be stored in
several different variations. (For example, the
board that is given, a reflection of that board,
etc.)
Data Storage
After each move the AI makes, four pieces of
data will be stored:
 Player's piece color
 Boardstate (before the move is made)
 Move chosen
 Evaluation
During a game, this data will be stored in a
HashMap with the player's piece color and
boardstate as the key and the chosen move
and evaluation as the values stored.
Implementation of Data



By storing all this data about moves the AI has
chosen, the FowardChecker should be more efficient
and, possibly, more effective.
This will prevent the need to traverse through a tree of
possible moves if a board situation found anywhere in
the tree has been encountered.
It should make it possible to search deeper into a tree
of moves by using the stored information only after the
ForwardChecker has moved three or four levels deep.
Conclusions
I have learned the importance of the evaluation
function. A strong evaluation function has much
more weight than a forward-checker with a high-ply.
Collecting loads of data and putting it is useful when
you have enough of it, probably, but as far as I can tell
I will need thousands of times more than I have for
any real improvement to take place.
