VK Dice - UCI Math

Download Report

Transcript VK Dice - UCI Math

VK Dice
By: Kenny Gutierrez, Vyvy Pham
Mentors: Sarah Eichhorn, Robert Campbell
Rules
 Variation of the game Sequences
 On each turn, a player rolls 6 dice
 Player is given option to reroll once but all 1s must be
kept
 Larger sequences are worth more points
 Three or more 1s = Restart Score
 Repeats of a certain number is counted once
 Winner is the first player to reach 100
Scoring
1 – 5 points
1,2 –10 points
1,2,3 –15 points
1,2,3,4 –20 points
1,2,3,4,5 –25 points
1,2,3,4,5,6 –35 points
Objective
 Optimal Strategy
 When to Reroll
 Which dice to keep/reroll
 Computer Adaptive Learning Program simulate one
million rolls for each run. Programmed to run 5 times
simultaneously
 Determined which actions repeated most frequently for all
game states
 Repeated actions of the computer are compared to the
Expected Values for each game state.
Description
 6^6= 46,656 game states
 462 don’t include repetition
 Different states are
grouped into sections
according to the same
numbers, regardless of
repetition
[2, 4, 4, 5, 5, 6]
[2, 4, 4, 5, 6, 6]
[2, 2, 4, 4, 5, 6]
[2, 2, 4, 5, 5, 6]
[2, 2, 4, 5, 6, 6]
[2, 4, 5, 5, 5, 6]
[2, 4, 5, 5, 6, 6]
[2, 4, 5, 6, 6, 6]
[2, 4, 4, 4, 5, 6]
[2, 2, 2, 4, 5, 6]
[1, 2, 2, 2, 2, 6]
[1, 2, 2, 2, 6, 6]
[1, 2, 2, 6, 6, 6]
[1, 2, 6, 6, 6, 6]
[1, 2, 2, 2, 3, 4]
[1, 2, 3, 3, 4, 4]
[1, 2, 2, 3, 3, 4]
[1, 2, 2, 3, 4, 4]
[1, 2, 3, 3, 3, 4]
[1, 2, 3, 4, 4, 4]
Probability for Game States
 Sections are further divided into: one 1, two 1s, no 1s
 The probability is the same within each section
 Probability is calculated for every reroll option.
 Game States without 1s (Reroll 1-6 Dice)
 Game States with one 1 (Reroll 1-5 Dice)
 Game States with two 1s (Reroll 1-4 Dice)
Cases For Each Reroll
 Reroll 6 Dice (No 1s)
 Reroll 5 Dice (one 1)
 -Get 1,2,3,4,5,6,
 -Get 2,3,4,5,6
 -Get 1, 2,3,4,5 not 6
 -Get 2,3,4,5 not 6
 -Get 1,2,3,4 not 5, not 1,1,1
 -Get 2,3,4 not 5, not 1,1,1
 -Get 1,2,3 not 4, not 1,1,1
 -Get 2,3 not 4, not 1,1,1
 -Get 1,2 not 3, not 1,1,1
 Get 2, not 3, not 1,1,1
 -Get 1 not 2, not 1,1,1
Ex. Of Calculating Probability
Reroll 5 for Initial Game States Without 1s

Case A: Get 1,3,4,5,6
5! (1/6)5
Probability= 5/324= 1.54%
Case B: Get 1,3,4,5 not 6
*1 3 4 5 (4) = 5! = 120
*1 3 4 5 (1 or 3 or 4 or 5) = 4(5!/2!) = 240
Probability= 350/6^5 = 5/108 = 4.63%
Case C: Get 1,3,4 not 5, not 111
*1 3 4 (2,2) or (6,6) = 2(5!/2!) = 120
* 1 3 4 (2 and 6) = 5! = 120
*1 3 4 (2 or 6) (1 or 3 or 4) = 6(5!/2) = 360
*1 3 4 (4,4) or (3,3) = 2(5!/3!) = 40
*1 3 4 (1,3) or (1,4) or (4,3) = 3(5!/(2! 2!)) = 90
Probability= 730/6^5 = 365/3888 = 9.39%
Case D: Get 1, 3, not 4, not 1,1,1
*1 3 (2,2,2) or (5,5,5) or (6,6,6) = 3(5!/3!) = 60
*1 3 (2,5,6) = 5! = 120
*1 3 (2,2) or (5,5) or (6,6) (Different: 2 or 5 or 6)
= 6(5!/2!) = 360
*1 3 (2,2) or (5,5) or (6,6) (1 or 3) = 6(5!/(2! 2!))
= 180
*1 3 (3,3,3) = 5!/4! = 5
*1 3 (3,3) (2 or 5 or 6) = 3(5!/3!) = 60
*1 3 (1,3) (2 or 5 or 6) = 3(5!/4) = 90
*1 3 (2,2) or (5,6) or (4,6) (1 or 3) = 6(5!/2!) = 360
*1 3 (1,3,3) = 5! (3! 2!) = 10
Probability = 1245/6^5 = 415/2592 = 16.0%
Case E: Get 1, not 3, not 1,1,1
*1 (2,2,2,2) or (4,4,4,4) or (5,5,5,5) or (6,6,6,6) =
4(5!/(2! 2!)) = 20
*1 (2,2,2) or (4,4,4) or (5,5,5) or (6,6,6) (Different:
2,4,5,6) = 12(5/3!) = 240
*1 (2,4,5,6) = 5! = 120
*1 (4,4) or (5,5) or (6,6) or (2,2) Differ:
(6,2)(6,4)(4,5)(4,2)(6,5)(5,2)) = 12(5!/2!) = 720
*1 (4,4) or (5,5) or (6,6) or (2,2) Differ: (4,4) (5,5)
(6,6) (2,2) = 12(5!/(2! 2!)) = 360
*1 (1) (4,4,4) or (5,5,5) or (6,6,6) or (2,2,2) =
4(5!/(3! 2!)) = 40
*1 (1) (2,2) or (4,4) or (5,5) or (6,6) Differ(2,4,5,6)=
12(5!/ (2! 2!)) = 360
*1 (1) (2,4,5) or (2,4,6) or (4,5,6) = 3(5!/2!) = 180
Probability = 2040/65 = 85/324 = 26.23%
Finding Expected Values
 Sum of all possible values each multiplied by the
probability of its occurrence
 Example: [1,2,3,4,4,5]
Reroll 0
Reroll 1
Reroll 2
Reroll 3
Reroll 4
Reroll 5
25.00
Keep:2,3,4,5
26.6667
Keep:2,3,4
21.5278
Keep:2,3
16.9676
Keep:2
12.7199
Reroll all
9.0766
Reroll 0
25
Reroll 1
((5/6)*(25)+(1/6)*(35))
((1(1/18+1/4+1/36))*(20)+(1/18)*(35)+(1/4
)*(25))
((1(1/36+1/9+61/216+2/27))*(15)+(1/36)*(
35)+(1/9)*(25)+(61/216)*(20))
((1(1/54+7/108+91/648+311/1296+19/144)
)*(10)+
(1/54)*(35)+(7/108)*(25)+(91/648)*(20)
+(311/1296)*(15))
((1(5/324+5/108+25/324+95/648+29/144+1
466 /
7776))*(5)+(5/324)*(35)+(5/108)*(25)+
(25/324)*(20)+(95/648)*(15)+(29/144)*
(10))
Reroll 2
Reroll 3
Reroll 4
Legend
Pink: Probability of each cases
multiplied by score
Yellow: Probability of getting the
same score and not 1,1,1
Reroll 5
Inside the Program

Runs five times

Each Run



1,000,000 dice rolls
Prints computer’s
actions for all
game states
Learns based on
result of each roll
through reward and
punish system
How Does It Work?
 Set of six numbers for each initial game state.
 Each number pertains to one of the six dice
 Initially, each number in the list contains 50
 Program generates random number between 1-100 for each
number.
 In order to reroll a die, the random number must be between
the range 1-(list of number)
 Ex
Game State:
[1, 2, 3, 5, 5, 6]
List:
[55, -10, 32, 87, 98, 103]
Random #s:
[60, 39, 47, 37, 12, 18]
Action:
[Keep, Keep, Keep, Reroll, Reroll, Reroll]
Rewarding & Punishing
 Reward:
Certain number of points based on
score after re-roll
IF final score > initial score
 Increase probability of repeating
that action by either adding or
subtracting
 Adds when the computer
rerolled, subtracts when it
kept dice

 Punish:

Only when the re-rolls end with at
least three 1s
 Decrease probability to avoid
that action by either adding or
subtracting
 Adds when the computer
kept dice, subtracts when it
rerolled.
TABLE FOR PUNISH & REWARD
Reward & Punish
5 - ±1
10 - ±2
15 - ±3
20 - ± 4
25 - ±5
35 - ±7
1,1,1 - ±5
Rewarding & Punishing
 Computer will never reroll 1, regardless of the number
in the list
 After subtracting and adding to each list, the numbers
will eventually go into the negatives or above 100
 Negatives= Always Keep (N)
 Over 100= Always Reroll (Y)
 Between 1-100
 Undetermined (U)
Best Move Mechanic
 Mechanism implemented to help computer learn the
optimal strategy
 Before keeping a die, the computer checks if there is a
better option
 Ex. [ 1, 2, 3, 5, 6, 6]
 If it wants to keep two 6s, it will change to keep 2 and 3.
Comparing Program with
Theoretical Probability
 Examples of each Initial Game States:
 Without any 1s
 With one 1
 With two 1s
 Adaptive learning program- used the actions of the dice
most common out of the five runs
Initial State without 1s
Example: [2, 3, 4, 4, 6, 6]
Theoretical Expected Values
Reroll 0
Reroll 1
Reroll 2
Reroll 3
Reroll 4
Reroll 5
Reroll 6
0
Keep: 2,3,4
3.3333
Keep:2,3,4,6
7.5
Keep: 2,3,4
9.3056
Keep:2,3
8.9969
Keep: 2
8.6002
Reroll All
6.1368
 Optimal Move:
 Reroll 3 dice; Keeping 2,3,4
Adaptive Learning Program
 After 5 runs:
[2, 3, 4, 4, 6, 6]
[N, N, Y, N, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
 Most Common Move:
 Reroll 3 dice; Keeping 2,3,4
Conclusion: Expected Values matched EXACTLY to the
Adaptive Learning Program
Initial State With one 1
Example: [1, 2, 3, 3, 4, 6]
Theoretical Expected Values
Reroll 0
Reroll 1
20
Keep:2,3,4,6
22.5
Reroll 2
Reroll 3
Reroll 4
Reroll 5
Keep:2,3,4
21.5278
Keep:2,3
16.9676
Keep:2
12.7199
Reroll all
9.0766
 Optimal Move:
 Reroll 1 Dice;
Keep 1,2,3,4,6
Adaptive Learning Program
 5 Sample Runs:
[1, 2, 3, 3, 4, 6]
[N, N, N, Y, N,Y]
[1, 2, 3, 3, 4, 6]
[N, N, Y, N, N, Y]
[1, 2, 3, 3, 4, 6]
[N, N, Y, N, N, Y]
[1, 2, 3, 3, 4, 6]
[N, N, N, Y, N, N]
[1, 2, 3, 3, 4, 6]
[N, N, N, Y, N, Y]
 Most Common Move:
 Reroll 2 Dice; Keeping 1,2,3,4
Conclusion: The expected values and the results from the
program were similar. The computer chose the 2nd best
action
Initial State With two 1s
Example: [1, 1, 2, 4, 5, 5]
Theoretical Expected Values
Reroll 0
10.00
Reroll 1
Keep:2, 4, 5
10.8333
Reroll 2
Keep:2,4
9.7222
Reroll 3
Keep:2
7.8935
Reroll 4
Reroll All
5
 Optimal Move:
 Reroll 1 Die; Keeping 1,1,2,4,5
Adaptive Learning Program
 After 5 runs:
[1, 1, 2, 3, 5, 5]
[Y, Y, N, N, Y, N]
[1, 1, 2, 3, 5, 5]
[Y, Y, U, U, U, U]
[1, 1, 2, 3, 5, 5]
[Y, Y, U, U, U, U]
[1, 1, 2, 3, 5, 5]
[U, U, N, N, Y, N]
[1, 1, 2, 3, 5, 5]
[Y, Y, N, N, U, U]
 Most Common Move:
 Uncertain
Conclusion: There is a high probability of rerolling a 1 so
the move is undetermined and needs more runs
Conclusion
 Expected Values were found for ALL game states
 Adaptive Learning Program with 5 runs and created a
list of actions for the 6 dice for every game state.
 Most common move from the program were compared
to the expected values for each game state
 Program’s common moves were either the best or 2nd
best action indicated by the expected values
 Game states with double 1s
Acknowledgements
 Sarah Eichhorn:
 Helping with the probabilities of the different game states
 Answering questions every step of the way
 Robert Campbell:
 Helping with the computer Program
 Teaching us how to calculate expected values