VK Dice - UCI Math
Download
Report
Transcript VK Dice - UCI Math
VK Dice
By: Kenny Gutierrez, Vyvy Pham
Mentors: Sarah Eichhorn, Robert Campbell
Rules
Variation of the game Sequences
On each turn, a player rolls 6 dice
Player is given option to reroll once but all 1s must be
kept
Larger sequences are worth more points
Three or more 1s = Restart Score
Repeats of a certain number is counted once
Winner is the first player to reach 100
Scoring
1 – 5 points
1,2 –10 points
1,2,3 –15 points
1,2,3,4 –20 points
1,2,3,4,5 –25 points
1,2,3,4,5,6 –35 points
Objective
Optimal Strategy
When to Reroll
Which dice to keep/reroll
Computer Adaptive Learning Program simulate one
million rolls for each run. Programmed to run 5 times
simultaneously
Determined which actions repeated most frequently for all
game states
Repeated actions of the computer are compared to the
Expected Values for each game state.
Description
6^6= 46,656 game states
462 don’t include repetition
Different states are
grouped into sections
according to the same
numbers, regardless of
repetition
[2, 4, 4, 5, 5, 6]
[2, 4, 4, 5, 6, 6]
[2, 2, 4, 4, 5, 6]
[2, 2, 4, 5, 5, 6]
[2, 2, 4, 5, 6, 6]
[2, 4, 5, 5, 5, 6]
[2, 4, 5, 5, 6, 6]
[2, 4, 5, 6, 6, 6]
[2, 4, 4, 4, 5, 6]
[2, 2, 2, 4, 5, 6]
[1, 2, 2, 2, 2, 6]
[1, 2, 2, 2, 6, 6]
[1, 2, 2, 6, 6, 6]
[1, 2, 6, 6, 6, 6]
[1, 2, 2, 2, 3, 4]
[1, 2, 3, 3, 4, 4]
[1, 2, 2, 3, 3, 4]
[1, 2, 2, 3, 4, 4]
[1, 2, 3, 3, 3, 4]
[1, 2, 3, 4, 4, 4]
Probability for Game States
Sections are further divided into: one 1, two 1s, no 1s
The probability is the same within each section
Probability is calculated for every reroll option.
Game States without 1s (Reroll 1-6 Dice)
Game States with one 1 (Reroll 1-5 Dice)
Game States with two 1s (Reroll 1-4 Dice)
Cases For Each Reroll
Reroll 6 Dice (No 1s)
Reroll 5 Dice (one 1)
-Get 1,2,3,4,5,6,
-Get 2,3,4,5,6
-Get 1, 2,3,4,5 not 6
-Get 2,3,4,5 not 6
-Get 1,2,3,4 not 5, not 1,1,1
-Get 2,3,4 not 5, not 1,1,1
-Get 1,2,3 not 4, not 1,1,1
-Get 2,3 not 4, not 1,1,1
-Get 1,2 not 3, not 1,1,1
Get 2, not 3, not 1,1,1
-Get 1 not 2, not 1,1,1
Ex. Of Calculating Probability
Reroll 5 for Initial Game States Without 1s
Case A: Get 1,3,4,5,6
5! (1/6)5
Probability= 5/324= 1.54%
Case B: Get 1,3,4,5 not 6
*1 3 4 5 (4) = 5! = 120
*1 3 4 5 (1 or 3 or 4 or 5) = 4(5!/2!) = 240
Probability= 350/6^5 = 5/108 = 4.63%
Case C: Get 1,3,4 not 5, not 111
*1 3 4 (2,2) or (6,6) = 2(5!/2!) = 120
* 1 3 4 (2 and 6) = 5! = 120
*1 3 4 (2 or 6) (1 or 3 or 4) = 6(5!/2) = 360
*1 3 4 (4,4) or (3,3) = 2(5!/3!) = 40
*1 3 4 (1,3) or (1,4) or (4,3) = 3(5!/(2! 2!)) = 90
Probability= 730/6^5 = 365/3888 = 9.39%
Case D: Get 1, 3, not 4, not 1,1,1
*1 3 (2,2,2) or (5,5,5) or (6,6,6) = 3(5!/3!) = 60
*1 3 (2,5,6) = 5! = 120
*1 3 (2,2) or (5,5) or (6,6) (Different: 2 or 5 or 6)
= 6(5!/2!) = 360
*1 3 (2,2) or (5,5) or (6,6) (1 or 3) = 6(5!/(2! 2!))
= 180
*1 3 (3,3,3) = 5!/4! = 5
*1 3 (3,3) (2 or 5 or 6) = 3(5!/3!) = 60
*1 3 (1,3) (2 or 5 or 6) = 3(5!/4) = 90
*1 3 (2,2) or (5,6) or (4,6) (1 or 3) = 6(5!/2!) = 360
*1 3 (1,3,3) = 5! (3! 2!) = 10
Probability = 1245/6^5 = 415/2592 = 16.0%
Case E: Get 1, not 3, not 1,1,1
*1 (2,2,2,2) or (4,4,4,4) or (5,5,5,5) or (6,6,6,6) =
4(5!/(2! 2!)) = 20
*1 (2,2,2) or (4,4,4) or (5,5,5) or (6,6,6) (Different:
2,4,5,6) = 12(5/3!) = 240
*1 (2,4,5,6) = 5! = 120
*1 (4,4) or (5,5) or (6,6) or (2,2) Differ:
(6,2)(6,4)(4,5)(4,2)(6,5)(5,2)) = 12(5!/2!) = 720
*1 (4,4) or (5,5) or (6,6) or (2,2) Differ: (4,4) (5,5)
(6,6) (2,2) = 12(5!/(2! 2!)) = 360
*1 (1) (4,4,4) or (5,5,5) or (6,6,6) or (2,2,2) =
4(5!/(3! 2!)) = 40
*1 (1) (2,2) or (4,4) or (5,5) or (6,6) Differ(2,4,5,6)=
12(5!/ (2! 2!)) = 360
*1 (1) (2,4,5) or (2,4,6) or (4,5,6) = 3(5!/2!) = 180
Probability = 2040/65 = 85/324 = 26.23%
Finding Expected Values
Sum of all possible values each multiplied by the
probability of its occurrence
Example: [1,2,3,4,4,5]
Reroll 0
Reroll 1
Reroll 2
Reroll 3
Reroll 4
Reroll 5
25.00
Keep:2,3,4,5
26.6667
Keep:2,3,4
21.5278
Keep:2,3
16.9676
Keep:2
12.7199
Reroll all
9.0766
Reroll 0
25
Reroll 1
((5/6)*(25)+(1/6)*(35))
((1(1/18+1/4+1/36))*(20)+(1/18)*(35)+(1/4
)*(25))
((1(1/36+1/9+61/216+2/27))*(15)+(1/36)*(
35)+(1/9)*(25)+(61/216)*(20))
((1(1/54+7/108+91/648+311/1296+19/144)
)*(10)+
(1/54)*(35)+(7/108)*(25)+(91/648)*(20)
+(311/1296)*(15))
((1(5/324+5/108+25/324+95/648+29/144+1
466 /
7776))*(5)+(5/324)*(35)+(5/108)*(25)+
(25/324)*(20)+(95/648)*(15)+(29/144)*
(10))
Reroll 2
Reroll 3
Reroll 4
Legend
Pink: Probability of each cases
multiplied by score
Yellow: Probability of getting the
same score and not 1,1,1
Reroll 5
Inside the Program
Runs five times
Each Run
1,000,000 dice rolls
Prints computer’s
actions for all
game states
Learns based on
result of each roll
through reward and
punish system
How Does It Work?
Set of six numbers for each initial game state.
Each number pertains to one of the six dice
Initially, each number in the list contains 50
Program generates random number between 1-100 for each
number.
In order to reroll a die, the random number must be between
the range 1-(list of number)
Ex
Game State:
[1, 2, 3, 5, 5, 6]
List:
[55, -10, 32, 87, 98, 103]
Random #s:
[60, 39, 47, 37, 12, 18]
Action:
[Keep, Keep, Keep, Reroll, Reroll, Reroll]
Rewarding & Punishing
Reward:
Certain number of points based on
score after re-roll
IF final score > initial score
Increase probability of repeating
that action by either adding or
subtracting
Adds when the computer
rerolled, subtracts when it
kept dice
Punish:
Only when the re-rolls end with at
least three 1s
Decrease probability to avoid
that action by either adding or
subtracting
Adds when the computer
kept dice, subtracts when it
rerolled.
TABLE FOR PUNISH & REWARD
Reward & Punish
5 - ±1
10 - ±2
15 - ±3
20 - ± 4
25 - ±5
35 - ±7
1,1,1 - ±5
Rewarding & Punishing
Computer will never reroll 1, regardless of the number
in the list
After subtracting and adding to each list, the numbers
will eventually go into the negatives or above 100
Negatives= Always Keep (N)
Over 100= Always Reroll (Y)
Between 1-100
Undetermined (U)
Best Move Mechanic
Mechanism implemented to help computer learn the
optimal strategy
Before keeping a die, the computer checks if there is a
better option
Ex. [ 1, 2, 3, 5, 6, 6]
If it wants to keep two 6s, it will change to keep 2 and 3.
Comparing Program with
Theoretical Probability
Examples of each Initial Game States:
Without any 1s
With one 1
With two 1s
Adaptive learning program- used the actions of the dice
most common out of the five runs
Initial State without 1s
Example: [2, 3, 4, 4, 6, 6]
Theoretical Expected Values
Reroll 0
Reroll 1
Reroll 2
Reroll 3
Reroll 4
Reroll 5
Reroll 6
0
Keep: 2,3,4
3.3333
Keep:2,3,4,6
7.5
Keep: 2,3,4
9.3056
Keep:2,3
8.9969
Keep: 2
8.6002
Reroll All
6.1368
Optimal Move:
Reroll 3 dice; Keeping 2,3,4
Adaptive Learning Program
After 5 runs:
[2, 3, 4, 4, 6, 6]
[N, N, Y, N, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
Most Common Move:
Reroll 3 dice; Keeping 2,3,4
Conclusion: Expected Values matched EXACTLY to the
Adaptive Learning Program
Initial State With one 1
Example: [1, 2, 3, 3, 4, 6]
Theoretical Expected Values
Reroll 0
Reroll 1
20
Keep:2,3,4,6
22.5
Reroll 2
Reroll 3
Reroll 4
Reroll 5
Keep:2,3,4
21.5278
Keep:2,3
16.9676
Keep:2
12.7199
Reroll all
9.0766
Optimal Move:
Reroll 1 Dice;
Keep 1,2,3,4,6
Adaptive Learning Program
5 Sample Runs:
[1, 2, 3, 3, 4, 6]
[N, N, N, Y, N,Y]
[1, 2, 3, 3, 4, 6]
[N, N, Y, N, N, Y]
[1, 2, 3, 3, 4, 6]
[N, N, Y, N, N, Y]
[1, 2, 3, 3, 4, 6]
[N, N, N, Y, N, N]
[1, 2, 3, 3, 4, 6]
[N, N, N, Y, N, Y]
Most Common Move:
Reroll 2 Dice; Keeping 1,2,3,4
Conclusion: The expected values and the results from the
program were similar. The computer chose the 2nd best
action
Initial State With two 1s
Example: [1, 1, 2, 4, 5, 5]
Theoretical Expected Values
Reroll 0
10.00
Reroll 1
Keep:2, 4, 5
10.8333
Reroll 2
Keep:2,4
9.7222
Reroll 3
Keep:2
7.8935
Reroll 4
Reroll All
5
Optimal Move:
Reroll 1 Die; Keeping 1,1,2,4,5
Adaptive Learning Program
After 5 runs:
[1, 1, 2, 3, 5, 5]
[Y, Y, N, N, Y, N]
[1, 1, 2, 3, 5, 5]
[Y, Y, U, U, U, U]
[1, 1, 2, 3, 5, 5]
[Y, Y, U, U, U, U]
[1, 1, 2, 3, 5, 5]
[U, U, N, N, Y, N]
[1, 1, 2, 3, 5, 5]
[Y, Y, N, N, U, U]
Most Common Move:
Uncertain
Conclusion: There is a high probability of rerolling a 1 so
the move is undetermined and needs more runs
Conclusion
Expected Values were found for ALL game states
Adaptive Learning Program with 5 runs and created a
list of actions for the 6 dice for every game state.
Most common move from the program were compared
to the expected values for each game state
Program’s common moves were either the best or 2nd
best action indicated by the expected values
Game states with double 1s
Acknowledgements
Sarah Eichhorn:
Helping with the probabilities of the different game states
Answering questions every step of the way
Robert Campbell:
Helping with the computer Program
Teaching us how to calculate expected values