national lottery uk

The following lecture has been approved for
University Undergraduate Students
This lecture may contain information, ideas, concepts and discursive anecdotes
that may be thought provoking and challenging
It is not intended for the content or delivery to cause offence
Any issues raised in the lecture may require the viewer to engage in further
thought, insight, reflection or critical evaluation
Probability, Errors & Chance
(crack these and research is easy)
Prof. Craig Jackson
Head of Psychology Division
School of Social Sciences
BCU
[email protected]
Curse of probability
Few subjects more counter-intuitive than probability
Understanding this is essential
“Probability is common sense reduced to calculation”
Pierre Simon Laplace
“[Statistics] are the only tools by which an opening
can be cut through the formidable thicket of
difficulties that bars the path of those who pursue
the science of man.”
Sir Francis Galton
UK National Lottery 1994
Choose 6 numbers between 1 and 49
Jackpot approx. £8 million for all 6 numbers
Smaller prizes for 5 numbers, 4 numbers, and 3 numbers
Week 1 - Nobody won
Week 2 - Rollover
Factory worker in Bradford won £17,880,003 using 26, 35, 38, 43, 47, 49
LOTTERY FEVER STRUCK THE UK!
Insurance company protection
“Out, thou strumpet, Lady Fortune”
UK National Lottery Behaviour
Buying all 13,983,816 possible tickets = a guaranteed jackpot win
If only winner = £6 million loss
If shared winner = lose more
Rollover
If only winner, possibly
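The figures above can be checked with a few lines of arithmetic. This is only a sketch: the £1 ticket price is an assumption (it is not stated on the slide), the jackpot is the approximate £8 million quoted earlier, and smaller prizes are ignored.

```python
import math

PICKS, POOL = 6, 49          # choose 6 numbers between 1 and 49
TICKET_PRICE = 1             # assumption: £1 per ticket
JACKPOT = 8_000_000          # approximate jackpot quoted in the slides

# Number of distinct tickets: "49 choose 6"
combinations = math.comb(POOL, PICKS)

# Cost of buying every combination, and the result if we are the only winner
outlay = combinations * TICKET_PRICE
net = JACKPOT - outlay

print(f"{combinations:,} combinations")
print(f"net result as sole winner: £{net:,}")
```

Under these assumptions the sole winner is roughly £6 million down, which matches the loss quoted on the slide.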
14th Jan 1995
Rollover of £16,292,830
Shared between 133 people who chose numbers 7, 17, 23, 32, 38, 42
If everyone selected numbers at random, only about 4 should have picked this
combination
Some curious human psychology at work
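A rough sketch of how surprising 133 winners would be if tickets really were chosen at random. The Poisson model for the number of winners is an assumption introduced here, not something the slide states; the expected value of 4 and the observed 133 are taken from the slide.

```python
import math

COMBINATIONS = math.comb(49, 6)
EXPECTED_WINNERS = 4      # figure quoted on the slide
ACTUAL_WINNERS = 133      # figure quoted on the slide

# Ticket sales implied by "4 expected winners" under random selection
implied_tickets = EXPECTED_WINNERS * COMBINATIONS

def log10_poisson_pmf(k, mean):
    """log10 of the Poisson probability of exactly k events (log space avoids underflow)."""
    ln_p = -mean + k * math.log(mean) - math.lgamma(k + 1)
    return ln_p / math.log(10)

log10_p = log10_poisson_pmf(ACTUAL_WINNERS, EXPECTED_WINNERS)

print(f"implied tickets sold: {implied_tickets:,}")
print(f"log10 P(exactly {ACTUAL_WINNERS} winners) = {log10_p:.1f}")
```

The probability is so small that random selection cannot plausibly explain it. People evidently do not pick their numbers at random.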
UK National Lottery Behaviour
Rule 1
Win a fortune
Only bet when there is a rollover (Rollover Paradox)
Rule 2
Never bet on numbers that other people will choose.
Avoid numbers under 31 – birthday punters + amateur gamblers
especially avoid 3, 7, 17
Do use “4” and “13”
“Stupid” combinations are better e.g. “34,35,36,37,38,39”
Probability is always ahead
UK national lottery
[Figure: number rack for draw no. 631, Wed 9th Jan 2002. Source: www.llednulb.demon.co.uk]
UK lottery ball frequency
[Figure: bar chart of frequency (0 to 140) against ball no. (1 to 49), draw no. 631, Wed 9th Jan 2002]
Probability Basics
Expressed as “P” or “p”
Decimal measure of the likelihood of something happening
P ranges from 0 through to 1
Certain events: P = 1
Impossible events: P = 0
Two equally likely outcomes (e.g. a coin flip): P = 0.5 each
java applet site demonstrations
www.mste.uiuc.edu
introductory article on probability
Cohen J, Stewart I (1998) That’s amazing, isn’t it? New Scientist, 17 Jan, pp. 24-28
Oldest example of probability
A coin has landed on heads three times in a row...
What is the probability of the next coin landing on tails?
Causal formation: thinking that random things have a memory.
Amateur gamblers do it all the time.
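A small simulation can illustrate that a fair coin has no memory. It is a sketch only: the number of flips and the seed are arbitrary choices, and the coin is assumed to be fair and its flips independent.

```python
import random

def tails_after_three_heads(n_flips, seed=1):
    """Share of tails among flips that immediately follow three heads in a row."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
    # Collect the flip that follows every run of three consecutive heads
    followers = [flips[i + 3] for i in range(n_flips - 3)
                 if flips[i] and flips[i + 1] and flips[i + 2]]
    tails = sum(1 for is_head in followers if not is_head)
    return tails / len(followers)

share = tails_after_three_heads(200_000)
print(f"share of tails after three heads: {share:.3f}")
```

The share stays close to 0.5. The three earlier heads do nothing to make tails "due".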
Counter-intuition #1
4000 flips of a Euro coin
Lands on “heads” 2780 times (69.5%)
Evidence of an unfair coin?
[Figure: chart of heads and tails counts, axis 1000 to 5000]
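One hedged way to put a number on the question is to ask how far 2780 heads lies from what a fair coin would give. The sketch assumes independent flips and uses the normal approximation to the binomial. It is not necessarily the analysis intended in the lecture.

```python
import math

FLIPS, HEADS = 4000, 2780    # counts from the slide

proportion = HEADS / FLIPS                 # observed share of heads
expected = FLIPS * 0.5                     # heads expected from a fair coin
sd = math.sqrt(FLIPS * 0.5 * 0.5)          # binomial standard deviation
z = (HEADS - expected) / sd                # distance from expectation, in SDs

print(f"observed proportion of heads: {proportion:.3f}")
print(f"z-score under a fair coin: {z:.1f}")
```

A z-score above about 2 is already unusual for a fair coin, and this result is more than 20 standard deviations out.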
Counter-intuition #2
Study 1.
Drug x is more effective than a placebo in male patients
Study 2.
Drug x is more effective than a placebo in female patients
Study 3.
(Combining the data from study 1 & 2)
Drug x is less effective than a placebo in all patients
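This reversal is usually called Simpson's paradox. The sketch below uses invented counts, chosen only to reproduce the pattern and not taken from any real study. They show how unequal group sizes can flip the combined result.

```python
# Hypothetical (recovered, treated) counts, invented for illustration only
data = {
    "male":   {"drug": (19, 20), "placebo": (72, 80)},
    "female": {"drug": (40, 80), "placebo": (8, 20)},
}

def rate(recovered, treated):
    return recovered / treated

# Recovery rate within each study
for sex, arms in data.items():
    print(sex, {arm: round(rate(*counts), 2) for arm, counts in arms.items()})

# Recovery rate after pooling both studies
combined = {}
for arm in ("drug", "placebo"):
    recovered = sum(data[sex][arm][0] for sex in data)
    treated = sum(data[sex][arm][1] for sex in data)
    combined[arm] = rate(recovered, treated)

print("combined", {arm: round(r, 2) for arm, r in combined.items()})
```

In these made-up numbers, most males get the placebo and most females get the drug, and males recover more often whichever arm they are in. Pooling therefore makes the drug look worse, even though it does better within each group.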
Calculating Probabilities
Minesweeper
Clicking on a cell containing a mine
ends the game
Clicking on a cell containing no mine,
but adjacent to x mines, will reveal the
number of adjacent mines
Clicking on a cell containing no mine,
and not adjacent to a mine, will reveal nothing
Calculating Probabilities
What is the probability of our first move clicking on a cell
containing a mine?
64 cells 10 mines
10 / 64 = 0.15625
15.6% or p = 0.156
What is the probability of clicking on a mine in the
highlighted block?
8 cells 1 mine
1 / 8 = 0.125
12.5% or p = 0.125
What about the green block?
8 cells 3 mines
3 / 8 = 0.375
37.5% or p = 0.375
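The three calculations above share one rule: mines divided by cells. A minimal sketch follows; the function and dictionary names are illustrative only.

```python
from fractions import Fraction

def mine_probability(mines, cells):
    """Probability that a randomly chosen cell in a region hides a mine."""
    return Fraction(mines, cells)

# (mines, cells) for each situation described in the slides
cases = {
    "first move": (10, 64),
    "highlighted block": (1, 8),
    "green block": (3, 8),
}

probs = {name: float(mine_probability(m, c)) for name, (m, c) in cases.items()}

for name, p in probs.items():
    print(f"{name}: p = {p} ({p:.1%})")
```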
Great mistakes
1950 Palo Alto
Newly formed CIA tested thousands of people in ESP studies
Scoring < target value of 25% = discarded.
Scoring > 25% = stayed
[Figure: chart, axis 0 to 100, with each stage of testing labelled P = .25]
Significant looking data
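The mistake can be sketched as a simulation. Everyone guesses purely by chance, and anyone who does not beat 25% is discarded after each round. The number of people, rounds and trials per round are assumptions chosen for illustration; only the 25% target comes from the slide.

```python
import random

def survivors_by_round(people=10_000, rounds=5, trials=20, p_hit=0.25, seed=7):
    """Number of people remaining after each round of 'score above chance or leave'."""
    rng = random.Random(seed)
    remaining = people
    history = [remaining]
    for _ in range(rounds):
        kept = 0
        for _ in range(remaining):
            # Each person guesses blindly: every trial is a hit with probability p_hit
            hits = sum(rng.random() < p_hit for _ in range(trials))
            if hits / trials > p_hit:
                kept += 1
        remaining = kept
        history.append(remaining)
    return history

history = survivors_by_round()
print(history)
```

The survivors have beaten chance in every round, yet nobody in the simulation has any ability at all. Selecting on the outcome produces significant-looking data.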
“The Random” versus “The Quantum”
People are poor at spotting significant trends in data...
“Coincidence” is often some unaccounted-for common-sense phenomenon
European Grand Prix, Jerez 1997 - qualifying lap times
M. Schumacher 1:21.072
Villeneuve 1:21.072
Frentzen 1:21.072
3 drivers lapping to the same 1000th of a second
Were the computers wrong?
How unlikely is this?
Israeli Lottery 2010: same 6 balls drawn in reverse order weeks apart
Bulgarian Lottery 2009: same 6 balls drawn four days apart
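Repeated draws are less surprising than they first appear, for the same reason as the "birthday problem": with many draws there are many pairs that could match. The sketch below assumes a 6-from-49 game, as in the UK. The Israeli and Bulgarian formats are not given in the slides, and the figure of 1000 draws is an arbitrary example.

```python
import math

def repeat_probability(n_draws, n_combinations):
    """Probability that at least two of n_draws produce the same set of numbers."""
    # P(no repeat) = product over k of (1 - k / N); summed in log space for accuracy
    log_no_repeat = sum(math.log1p(-k / n_combinations) for k in range(n_draws))
    return 1 - math.exp(log_no_repeat)

UK_COMBINATIONS = math.comb(49, 6)   # assumption: 6-from-49 format
p_1000 = repeat_probability(1000, UK_COMBINATIONS)

print(f"P(some repeat within 1000 draws) = {p_1000:.3f}")
```

This is the chance of any two draws matching, not of two specified draws matching, which is why it is so much larger than 1 in 13,983,816.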
Basic Scientific Methodology
VARIABLES: IV, DV, Controlled
SAMPLING: Skewed, Methods, Bias
SUBJECTS: Independent, Matched, Repeated
PROBABILITY: P values
ERRORS: Type 1 and Type 2
SENSITIVITY: Tweaking the methodology
Everything has errors
Werner Heisenberg
Science involves proving changes in dependent variables are due to
(manipulation of) particular independent variables
Need to prove random luck alone has not produced changes in the dependent
variables that were observed
Heisenberg’s uncertainty principle (1927) is an eternal problem for
researchers
Cannot objectively measure a phenomenon without affecting the phenomenon
in some way.
e.g. scanning electron microscopes
Types of Error #1
CONSTANT ERRORS
lack of control
poor variable measurement
wrong tools for measuring the variable(s)
HOW TO REMOVE / CONTROL CONSTANT ERRORS
redefine troublesome variables
control troublesome variables
control measurement of variables
Types of Error #2
RANDOM ERRORS
natural fluctuation of the universe
natural blips occurring in our variables and data
little can be done about them
universe is a “random” and chaotic place
RANDOM ERRORS ARE HERE TO STAY
scientific methods have to take account of this
random errors cancel themselves out with a random sample
Hence the need for a truly random sample
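The contrast between random and constant errors can be sketched with simulated measurements. The true value, noise level and size of the constant error are all made-up numbers for illustration.

```python
import random
import statistics

def measure(true_value, n, constant_error=0.0, noise_sd=5.0, seed=3):
    """Simulate n measurements with random noise plus an optional constant error."""
    rng = random.Random(seed)
    return [true_value + constant_error + rng.gauss(0, noise_sd) for _ in range(n)]

TRUE_VALUE = 100.0

small = statistics.mean(measure(TRUE_VALUE, 10))
large = statistics.mean(measure(TRUE_VALUE, 10_000))
biased = statistics.mean(measure(TRUE_VALUE, 10_000, constant_error=2.0))

print(f"mean of 10 measurements:      {small:.2f}")
print(f"mean of 10,000 measurements:  {large:.2f}")
print(f"with a constant error of +2:  {biased:.2f}")
```

Random errors largely cancel out as the sample grows. A constant error does not, however many measurements are taken.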
The Meaning of P
World is chaotic
Need to know what causes the observed (results in) data
Random luck / natural flux, or the IV ?
Use of an “arbitrary” figure (95% certainty) to let us decide
THE P VALUE IN SCIENTIFIC TERMS
A measure of likelihood of error in our results
The likelihood of the DV being changed by random errors alone, and
not the IV
Significance of .05
Intuitively, a 5% significance level means it can be said with 95%
confidence that observed results are caused by the IV
Significance
Occurs when P=0.05 (or less), meaning there is a 0.95 certainty (or more) of
the IV affecting the DV
High significance
Occurs when P=0.01 (or less), meaning there is a 0.99 certainty (or more) of the
IV affecting the DV
The Meaning of P
Statistical software gives a P value
It has calculated the likelihood of such results happening by chance
If P < 5%, it can be assumed that such results have not occurred by chance
“P > 0.05” results are likely to have been derived from random or constant
errors (or both) and the IV was unlikely to have had any effect on the DV.
NON-SIGNIFICANT
i.e. something else changed the DV
The Meaning of P
“P = 0.05” or “P <0.05” results are unlikely to have derived from random or
constant errors, and the IV can be held responsible for the changes in the DV.
SIGNIFICANT
Repeating experiments is the only sure way of establishing if this is really true
e.g.
“The mean age of males in the group (n=64) was 45 years (±3) and the mean
age of females (n=59) was 37 years (±5); P=0.05 and therefore males were
significantly older than females”.
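A hedged sketch of how a P value might be derived from summary figures like those quoted. It assumes the ± values are standard deviations and uses a normal approximation to an unequal-variance (Welch-style) t test. Neither assumption is stated in the lecture, and the resulting P depends heavily on them.

```python
import math

def two_group_test(mean1, sd1, n1, mean2, sd2, n2):
    """Test statistic and two-sided P (normal approximation) for a difference in means."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)       # standard error of the difference
    stat = (mean1 - mean2) / se
    p_two_sided = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_two_sided

# Males: n=64, mean 45, SD 3 (assumed); females: n=59, mean 37, SD 5 (assumed)
stat, p = two_group_test(45, 3, 64, 37, 5, 59)

print(f"test statistic = {stat:.2f}")
print(f"two-sided P = {p:.2e}")
```

With groups of about 60 the normal approximation is close to the t distribution. For small samples a proper t distribution should be used instead.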
Choosing significance tests
TESTING FOR DIFFERENCES BETWEEN GROUPS
Ordinal data - measurements, normal distributions
Independent samples
t test for independent samples / ANOVA
Matched subjects
t test correlated samples / ANOVA
Repeated measures
t test correlated samples / ANOVA
Frequency data
Chi square
TESTING FOR ASSOCIATIONS BETWEEN GROUPS
Ordinal data - frequencies
Correlation
Spearman’s rho
Frequency data
Chi square
Errors continued. . . . .
TYPE 1 ERRORS
Claim that the IV produces an effect on the DV when it did not
A false positive
TYPE 2 ERRORS
Claim that the IV did not produce an effect on the DV, when in fact it did
A false negative
Errors continued. . . . .
Results of a study may be…
1. Genuine case of the IV having no effect on the DV
or
2. IV may genuinely affect the DV, but this is not detected by the
measurements. A Type 2 error
or
3. IV may be found to be affecting the DV, but in reality it is not (experimenter
error etc.). A Type 1 error
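A Type 1 error can be shown by simulating experiments in which the IV truly has no effect and counting how often a "significant" difference still appears. The group size, number of experiments and the large-sample z test at the 5% level are illustrative assumptions.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def false_positive_rate(n_per_group=50, experiments=2000, z_crit=1.96, seed=11):
    """Share of null experiments (no real effect) declared significant at the 5% level."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(experiments):
        # Both groups come from the same population: the IV does nothing
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        se = math.sqrt(sample_variance(a) / n_per_group + sample_variance(b) / n_per_group)
        z = (mean(a) - mean(b)) / se
        if abs(z) > z_crit:
            false_positives += 1
    return false_positives / experiments

type1_rate = false_positive_rate()
print(f"share of false positives: {type1_rate:.3f}")
```

The share comes out near 5%. That is the price of using P = 0.05 as the threshold: about one null experiment in twenty looks significant.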
Errors continued. . . . .
remedies for type 2 errors
• reduce background noise e.g. Streptomycin trials
• ensure everything measured is standardised
• gain max control of environment
• increase sample size e.g. Lanarkshire milk trials 1930
• avoid floor and ceiling effects of measurement
• increase measurement reliability e.g. multiple measurements
• change from independent subjects design, to matched subjects / repeated
measures design
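The effect of sample size on Type 2 errors can be sketched by simulating experiments in which the IV really does shift the DV, then counting how often the effect is missed. The effect size (0.3 standard deviations), group sizes and z test are illustrative assumptions.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def miss_rate(n_per_group, effect=0.3, experiments=500, z_crit=1.96, seed=5):
    """Share of experiments that fail to detect a real effect (Type 2 errors)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(experiments):
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, 1) for _ in range(n_per_group)]  # real effect present
        se = math.sqrt(sample_variance(control) / n_per_group
                       + sample_variance(treated) / n_per_group)
        z = (mean(treated) - mean(control)) / se
        if abs(z) <= z_crit:
            misses += 1
    return misses / experiments

miss_small = miss_rate(20)
miss_large = miss_rate(200)
print(f"missed with n=20 per group:  {miss_small:.2f}")
print(f"missed with n=200 per group: {miss_large:.2f}")
```

The same real effect is missed far more often with the small groups than with the large ones.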
Everything has errors 2
Significance level achieved in a study (5% or 1%) measures the likelihood of
making a TYPE 1 error
e.g. claiming that the IV affected the DV when it really did not.
P = 1: 100% likelihood of type 1 error. Not good enough. Not significant
P = 0.5: 50% likelihood of type 1 error. Not good enough. Not significant
P = 0.05: 5% likelihood of type 1 error. Good enough. Significant
P = 0.01: 1% likelihood of type 1 error. Even better. Highly significant
P = 0: 0% likelihood of type 1 error. Much better. Highly significant
Is 99.9% certainty enough?
• 12 newborns will be given to the wrong parents daily
• 18,322 pieces of mail mishandled per hour
• 2,000,000 documents lost by the IRS in 2001
• 103,260 income tax returns wrong during the year
• 2.5 million books shipped with the wrong covers
• 2 planes landed at O'Hare airport unsafe every day
• 315 entries in Webster's Dictionary will be misspelled
• 20,000 incorrect drug prescriptions written this year
• 880,000 credit cards in circulation will be incorrect
• 5.5 million cases of fizzy drinks produced will be flat
• 291 pacemaker operations will be performed incorrectly
Further Reading
Altman DG. Practical Statistics For Medical Research. Chapman and Hall,
London 1991.
Bland M. An introduction to medical statistics. (ed.) Oxford Medical
Publications, Oxford 1995.
Gao Smith F, Smith J (eds). Key Topics in Clinical Research. BIOS Scientific
Publications, Oxford 2002.
Stewart I. Does God play dice? Penguin, London 1997.
Feynman RP. What do you care what other people think? Harper Collins,
London 1993.
Tenner E. Why things bite back. Fourth Estate, London 1996.
Lewin R. Complexity. Phoenix, London 1997.
Orkin M. Can you win? Freeman, New York 1991.