Document

Transcript Document

Transposition Tables, Zobrist Hashing, GHI problem
Q: If all these search algorithms are
only about 20 lines of code, how
come my search routine is 20 pages?
A: It’s all the enhancements.
The commonest, most important, enhancement is a transposition table - TT.
• The same position may arise from two or more move sequences.
 (Also, a sequence of moves may lead to repetition of a position)
• If search has told you once what might happen in a position, it is waste of
effort to repeat that search.
• A TT allows you to lookup previously computed results.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
1
Repetitions
In many games, the rules pay some special attention to repetitions, eg
• in Chess: if the same position occurs three times, the game is drawn
• in Go: it is forbidden to repeat a position
• in Nine Men’s Morris: it is forbidden to make a mill by reversing your
immediately previous move
x

http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
2
Transpositions
Even where repetitions are not specially interesting, there still may be several
alternative sequences of moves that lead to a common position.
eg in Chess:
1: P-K4, P-K4.
2: P-Q4
1: P-Q4, P-K4.
2: P-K4
Same configuration on the board,
same player to move,
same depth to be searched,
same results!
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
3
What should a TT store?
Each position visited in a search should have information stored that:• allows for later recognition of the same position if it recurs
 disposition of pieces
 who is to move
 other game-specific information
– eg in chess: castling possibilities, en-passant possibilities, moves
since some irreversible event like a capture or pawn move
• tells what valuation should be returned if it is still valid
 numeric score; fail low/fail high; B* bounds; CN range.
• allows judging whether or not the valuation is still valid
 depth of the search that provided this valuation
 (if ) what bounds were in force
 (possibly) which problem was being solved when the entry was stored
• tells what was the best move found by the previous search
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
4
How is a TT stored?
A TT is usually maintained as a hash table, using a numeric key.
In the event of a hash-table collision, (where two distinct situations map to the
same entry in the hash table), two alternative approaches are sensible:
• discard the stored entry and replace it with the new one
if the stored entry relates to an older problem, this is obvious approach;
if the stored entry relates to the current problem, there is an alternative …
• choose between the stored entry and the new one on some principled basis:
 keeping the entry with greatest depth?
 keeping the entry with the largest search effort?
 keeping the entry with most reliable value?
 keeping the entry with the tightest bounds?
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
5
(Alfred) Zobrist Hashing
Zobrist hash codes aim to give a systematic numeric coding for situations arising
in games, to allow easy determination of whether two situations are the same.
(esp. is a situation stored in TT the same as the current situation)
Conventional wisdom is that 64-bit codes are necessary.
• cf Shared Birthday problem
• clashes will occur, but so infrequently as not to matter
The hashtable used for the TT will contain far fewer than 264 entries; some
subset of the Zobrist key bits will be used as TT hash code.
– One way tournament programs have been known to cheat is to
instruct their opponent programs to reduce the size of their TTs.
Zobrist codes are updated by XOR with (64-bit) randomised numbers.
Some claim rand() is good enough; others play Hamming-distance tricks.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
6
Zobrist Hashing uses XOR
In Chess,
In Go,
•There are 12 kinds of piece, 64 squares
•there are 2 kinds of piece, 361 intersections
•Make a 12x64 table of random numbers
•Make a 2x361 table of random numbers
•XOR the piece&position numbers (PPNs)
for all pieces initially on the board
•When a stone is placed, XOR its PPN
•When a piece moves, XOR the PPN for its
starting square (it is no longer there) and
also XOR the PPN for its ending square (it
is there now)
•When a capture occurs, also XOR the PPN
for the captured piece (it is no longer there)
•Use a few extra numbers for
•white’s move vs black’s move
•white can castle kingside, etc
http://csiweb.ucd.ie/Staff/acater/comp4031.html
•When a capture occurs, XOR the PPNs for
all captured stones
•When a move is impossible because of ko,
XOR the PPNs for both black and white;
XOR them back again after each legal move
•Use a few extra numbers for
•white’s move vs black’s move
•balance of prisoners taken (W-B say)
Artificial Intelligence for Games and Puzzles
7
Advantages of using a TT
1. The biggest advantage by far is the reduction of search effort (for chess, x5)
resulting from reuse of information about positions already searched.
In a 2-player game, with branching factor b, if all moves were independent,
• on 3rd ply, b2 moves would be duplications, leaving b3- b2
 substantial saving: b2 searches of subtrees of size bd-3
• on 4th ply, 2x b2 duplication, leaving b4-2x b2
• on 5th and on 6th ply, order b3 duplications
?
2. TTs are also useful to store an opening book.
3. Even where search must be performed again, e.g. because depth of search of
stored move is not sufficient for current needs, the retrieved best-move
information can be used for move ordering for 
4. You sometimes get “extra depth for free”
• It can happen that you find a reusable entry resulting from a search to (say)
depth 7 (from that position) when you only intended to search to depth 3
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
8
Problems of using a TT
1. To be useful, a TT should be large. 105 entries or better - or much better. This
is unlikely to help the caching behaviour of your personal computer.
2. Code complexity: extra code is required to encode positions, probe the
hashtable, judge the reusability of results, decide whether to overwrite or not
in the case of hashtable collisions.
• [1, 2] are generally considered to be prices worth paying
3. Search instability may result, when
• an entry gives “extra depth for free”, then
• gets overwritten with another entry for a different position, then
• its position is revisited, searched again - but to lesser depth than before
4. Beware of the Graph History Interaction (GHI) problem
• how you got to a position may affect what you may do thereafter
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
9
Graph History Interaction problem - GHI
•Recall that many games have special rules about repetition of situations.
• In some games, repetition is illegal (sometimes, as in Nine Men’s Morris; or
always, as in Go with “Superko” rule) and so should be scored as a loss
• In some games, repetition results in a draw (sometimes, as in Chess; or
always), and so should be scored as
 a draw, if you are involved in playing a game
 a failure, if you are trying to achieve a checkmate or similar goal
 a success, if you are trying to avoid a checkmate or similar
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
10
Two manifestations of GHI
•The value of some position P2 may be first computed
according to a lead-up sequence in which some position P1
does occur and does affect the course of the game after P2;
 then this value might not be valid if P2 is reached via
another lead-up sequence not involving P1
•The value of some position P2 may be first computed
according to a lead-up sequence in which some position P1
does not occur and so does not affect the course of the game
after P2;
 then this value might not be valid if P2 is reached via
another lead-up sequence that does involve P1
http://csiweb.ucd.ie/Staff/acater/comp4031.html
P1
P2
Artificial Intelligence for Games and Puzzles
11
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a win, and
scenario is “first-player-loss” - like
not mating when far ahead.
b
g
OR
c
e
d
http://csiweb.ucd.ie/Staff/acater/comp4031.html
AND
f
h
Artificial Intelligence for Games and Puzzles
12
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a win, and
scenario is “first-player-loss” - like
not mating when far ahead.
b
AND
e
d
g
OR
c
f
h
Search a-b-e-h-e: store loss at h
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
13
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a win, and
scenario is “first-player-loss” - like
not mating when far ahead.
b
AND
e
d
g
OR
c
f
h
Search a-b-e-h-e: store loss at h
Search a-b-d: store loss at b
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
14
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a win, and
scenario is “first-player-loss” - like
not mating when far ahead.
b
AND
e
d
g
OR
c
f
h
Search a-b-e-h-e: store loss at h
Search a-b-d: store loss at b
Search a-c-f-h: stored loss at h is backed up to f and c
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
15
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a win, and
scenario is “first-player-loss” - like
not mating when far ahead.
b
AND
e
d
g
OR
c
f
h
Search a-b-e-h-e: store loss at h
Search a-b-d: store loss at b
Search a-c-f-h: stored loss at h is backed up to f and c
Now a is deemed a loss because b and c are both loss.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
16
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a win, and
scenario is “first-player-loss” - like
not mating when far ahead.
b
AND
e
d
g
OR
c
f
h
Search a-b-e-h-e: store loss at h
Search a-b-d: store loss at b
Search a-c-f-h: stored loss at h is backed up to f and c
Now a is deemed a loss because b and c are both loss.
But the sequence a-c-f-h-e-g is a win!
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
17
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a loss, and
scenario is “current-player-loss” like not being allowed to repeat.
b
d
OR
c
e
g
http://csiweb.ucd.ie/Staff/acater/comp4031.html
AND
f
h
Artificial Intelligence for Games and Puzzles
18
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a loss, and
scenario is “current-player-loss” like not being allowed to repeat.
b
d
AND
e
g
OR
c
f
h
Search a-b-e-h: store win at h, since opponent has no legal move from h
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
19
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a loss, and
scenario is “current-player-loss” like not being allowed to repeat.
b
d
AND
e
g
OR
c
f
h
Search a-b-e-h: store win at h, since opponent has no legal move from h
Search a-c-f-h: stored win at h is backed up to f and c
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
20
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a loss, and
scenario is “current-player-loss” like not being allowed to repeat.
b
d
AND
e
g
OR
c
f
h
Search a-b-e-h: store win at h, since opponent has no legal move from h
Search a-c-f-h: stored win at h is backed up to f and c
Now a is deemed a win because c is a win.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
21
GHI causing wrong values to be backed up
Consider the AND-OR tree: (minimax trees are closely related)
a
Suppose d is a loss, g is a loss, and
scenario is “current-player-loss” like not being allowed to repeat.
b
d
AND
e
g
OR
c
f
h
Search a-b-e-h: store win at h, since opponent has no legal move from h
Search a-c-f-h: stored win at h is backed up to f and c
Now a is deemed a win because c is a win.
But a is a loss! a-b-d, a-c-f-h-e-g, a-c-f-h-e-h all lose.
“A general solution to the graph history interaction problem”
Kishimoto & Mueller
http://www.cs.ualberta.ca/~kishi/pdf_file/AAAI04KishimotoA.pdf
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
22
Search enhancement: Killer Heuristic
 search is more efficient with well-ordered moves.
It often happens in practice that one player has a particularly good move
available somewhere in the near future, good in several different situations.
If such a “killer move” can be recognised by the search algorithm, and
considered first when it is among the generated possible moves,  is more
likely to cutoff quickly.
Method:
• at each ply in the search tree, keep a count of the number of times each
possible move has caused a  cutoff.
• when a list of moves for a node is generated, consider first the move that has
caused most cutoffs at that ply level.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
23
Search enhancement: History Heuristic
With similar reasoning, a move that has repeatedly proved good may also be
good at other ply levels.
Method:
• Maintain (for each player separately) a table of moves that have caused beta
cutoffs, recording the number of cutoffs for each move, regardless of ply
depth
• When generating moves for a node, sort them in descending order of the
number of recorded cutoffs.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
24
Fuzzy move matching
With the Killer Heuristic and the History Heuristic, it is in some games quite
likely that other “similar” moves will also be good. In chess for example,
• A particular move found successful might be white rook:b3-b7
This might be successful for various reasons:
• Getting away from the b3 square (to avoid a threat, or to gain mobility)
• Getting a white rook or queen to the 7th rank (to make an attack)
• Getting a piece to the b7 square (to escape a check)
Moves might be compared to known killer moves on the basis of strict identity,
or by sharing one or more of the piece-from-to elements of the description.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
25
Search enhancement: Singular Extensions
Part of the rationale of minimax search is that a static evaluation of a position is
more likely to be reliable as the game gets closer to its end.
The idea of singular extensions is to use a very focused search to several ply
deeper in order to get a more reliable evaluation for midgame positions.
In an  search, when static evaluation of a position is required,
• Pick the possible next move with best static evaluation
• Generate all possible successors of that “singular extension”
• Repeat for some predetermined number of ply
• Report the evaluation at the end of this playout sequence to  routine
The host  search must be carried out to lesser depth if singular extensions are
pursued, since substantial move generation and evaluation is necessary.
http://csiweb.ucd.ie/Staff/acater/comp4031.html
Artificial Intelligence for Games and Puzzles
26

Document

Transcript Document

Directory