Stick Tossing and
Confidence Intervals
Bruce Cohen
Lowell High School, SFUSD
[email protected]
http://www.cgl.ucsf.edu/home/bic
David Sklar
San Francisco State University
[email protected]
Asilomar - December 2006
Ver. 0.5
Estimating a Probability
An Old Problem:
When a thin stick of unit length is
“randomly” tossed onto a grid of
parallel lines spaced one unit apart,
what is the probability that the stick
lands crossing a grid line?
We would like to take a purely experimental and statistical approach to
the problem of finding, or at least estimating, the desired probability.
Our experiments will consist of tossing a stick some fixed number of
times, keeping track of how many times the stick lands crossing a grid
line (the data), and computing the percentage of times this event
occurs (a statistic).
Basic statistical theory will help us understand how to interpret these
results.
Plan
Estimating a simple probability
Toss sticks, gather data
Background material
Estimating the probability
The average and standard deviation of a list of numbers
Estimating the uncertainty in the estimate of the probability
Histograms, what they are and what they aren’t
Confidence intervals and what they mean
The average and standard deviation of a histogram
The normal curve
Where does the procedure for finding confidence intervals come from? Why does it work?
A mathematical model for the data
The mathematics of the model
Box models and histograms for the sum of the draws
The Central Limit Theorem
Sketch of a proof of a special case of the Central Limit Theorem
Estimating the Probability: A Sample Calculation
Result: 20 line crossings in 36 tosses
$$\text{estimated probability} = \frac{\#\text{ crossings}}{\#\text{ tosses}} = \frac{20}{36} \approx 0.556 = 55.6\%$$
Standard Error (SE) for the estimated probability:
$$\text{SE} = \sqrt{\frac{(\text{est. prob. of crossing})(\text{est. prob. of not crossing})}{\#\text{ tosses}}} = \sqrt{\frac{\frac{20}{36}\cdot\frac{16}{36}}{36}} \approx 0.0828 = 8.3\%$$
Conclusions: Based on this data an approximate 68% confidence interval for the
probability that the stick lands crossing a line is 55.6% ± 8.3%, that is, 47.3% to 63.9%;
an approximate 95% confidence interval is 55.6% ± 16.6%, that is, 39.0% to 72.2%.
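The sample calculation can be sketched in a few lines of Python. The function names here are illustrative and not part of the talk. Because the slide adds the rounded values 55.6% and 8.3%, the last digit of an interval endpoint computed from unrounded values can differ by 0.1%.

```python
import math

def estimate_with_se(crossings, tosses):
    """Return (estimated probability, standard error) for a zero-one sample."""
    p_hat = crossings / tosses
    se = math.sqrt(p_hat * (1 - p_hat) / tosses)
    return p_hat, se

def confidence_interval(p_hat, se, n_se=1):
    """Interval p_hat +/- n_se * se; n_se=1 is roughly 68%, n_se=2 roughly 95%."""
    return p_hat - n_se * se, p_hat + n_se * se

p_hat, se = estimate_with_se(20, 36)
low68, high68 = confidence_interval(p_hat, se, 1)
low95, high95 = confidence_interval(p_hat, se, 2)
print(f"estimate {p_hat:.1%}, SE {se:.1%}")
print(f"68% interval: {low68:.1%} to {high68:.1%}")
print(f"95% interval: {low95:.1%} to {high95:.1%}")
```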
68% Confidence Intervals for 10 Experiments
(36 tosses per experiment)

# crossings   estimated prob.   SE      68% confidence interval
20            55.6%             8.3%    47.3% to 63.9%
24            66.7%             7.9%    58.8% to 74.6%
19            52.8%             8.3%    44.5% to 61.1%
28            77.8%             6.9%    70.9% to 84.7%
23            63.9%             8.0%    55.9% to 71.9%
25            69.4%             7.7%    61.7% to 77.1%
24            66.7%             7.9%    58.8% to 74.6%
27            75.0%             7.2%    67.8% to 82.2%
23            63.9%             8.0%    55.9% to 71.9%
21            58.3%             8.2%    50.1% to 66.5%

[Figure: the ten intervals plotted against an axis marked from 40% to 80%.]
Pooling the data
Result: 234 line crossings in 360 (independent) tosses
$$\text{estimated probability} = \frac{\#\text{ crossings}}{\#\text{ tosses}} = \frac{234}{360} = 65.0\%$$
Standard Error (SE) for the estimated probability:
$$\text{SE} = \sqrt{\frac{(\text{est. prob. of crossing})(\text{est. prob. of not crossing})}{\#\text{ tosses}}} = \sqrt{\frac{(.650)(.350)}{360}} \approx 0.025 = 2.5\%$$
Conclusions: Based on this data an approximate 68% confidence interval for the
probability that the stick lands crossing a line is 65.0% ± 2.5%, that is, 62.5% to 67.5%;
an approximate 95% confidence interval is 65.0% ± 5.0%, that is, 60.0% to 70.0%.
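A minimal sketch of the pooling step, using the ten crossing counts recorded in the talk. With ten times as many tosses the SE shrinks by a factor of roughly the square root of 10, which is why 8.3% becomes about 2.5%. Variable names are illustrative.

```python
import math

# Crossing counts from the ten experiments of 36 tosses each (from the table).
crossings = [20, 24, 19, 28, 23, 25, 24, 27, 23, 21]
tosses_per_experiment = 36

total_crossings = sum(crossings)
total_tosses = tosses_per_experiment * len(crossings)

# Treat the 360 tosses as one large sample from the same box.
pooled_estimate = total_crossings / total_tosses
pooled_se = math.sqrt(pooled_estimate * (1 - pooled_estimate) / total_tosses)

print(total_crossings, total_tosses)
print(f"pooled estimate {pooled_estimate:.1%}, SE {pooled_se:.1%}")
```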
68% Confidence Intervals for 10 Experiments
(36 tosses per experiment)

# crossings    estimated prob.   SE      68% confidence interval
20             55.6%             8.3%    47.3% to 63.9%
24             66.7%             7.9%    58.8% to 74.6%
19             52.8%             8.3%    44.5% to 61.1%
28             77.8%             6.9%    70.9% to 84.7%
23             63.9%             8.0%    55.9% to 71.9%
25             69.4%             7.7%    61.7% to 77.1%
24             66.7%             7.9%    58.8% to 74.6%
27             75.0%             7.2%    67.8% to 82.2%
23             63.9%             8.0%    55.9% to 71.9%
21             58.3%             8.2%    50.1% to 66.5%
234 (pooled)   65.0%             2.5%    62.5% to 67.5%

[Figure: the ten intervals and the pooled interval plotted against an axis marked from 40% to 80%.]
Some 95% Confidence Intervals
Where Does the Procedure for Finding
Confidence Intervals Come From?
As with all “real world” applications of mathematics
we begin with a Mathematical Model.
Box Model
The number of line crossings in n tosses of the
stick is like the Sum of values of n draws at
random with replacement from a box with two
kinds of numbered tickets. Those numbered 1
correspond to the stick landing crossing a line,
and those numbered 0 to not crossing. The
percentage of tickets numbered 1 in the box is
not known.
This unknown percentage corresponds to the
probability that a stick lands crossing a line.
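[Figure: a box holding an unknown number of tickets numbered 1 and an unknown number of tickets numbered 0.]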
The set of tickets in the box
is called the population,
and the (unknown) % of
1’s in the population is a
parameter.
The n drawn tickets are a
sample, and the % of 1’s
in the sample is a statistic.
Note: this kind of box is called a zero–one box.
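The box model can be sketched as a small simulation. The fraction of 1's used below (0.65) is purely an assumption for illustration, since the real composition of the box is unknown, and the seed is fixed only so the run is repeatable.

```python
import random

def sum_of_draws(fraction_of_ones, n_draws, rng):
    """Sum of n_draws tickets drawn with replacement from a zero-one box."""
    return sum(1 if rng.random() < fraction_of_ones else 0 for _ in range(n_draws))

rng = random.Random(2006)      # fixed seed so the run is repeatable
assumed_fraction = 0.65        # hypothetical parameter; the real box is unknown
n_tosses = 36

crossings = sum_of_draws(assumed_fraction, n_tosses, rng)   # the data
sample_percentage = crossings / n_tosses                    # the statistic
print(crossings, f"{sample_percentage:.1%}")
```

Running it repeatedly with different seeds gives different sample percentages scattered around the assumed parameter, which is the chance variability the SE is meant to measure.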
The Mathematics of the Model
The goal for the rest of the talk is to develop the mathematics of
the box model.
We first review some basic background material which we then use to
understand the behavior of the sum of the draws from a box of known
composition. Finally we use this understanding to see why the confidence
levels come from areas under the normal curve.
The Average and Standard Deviation of a List of Numbers
Example List: 21, 28, 30, 30, 34, 37
$$\text{The mean or average} = \frac{\text{sum of the values}}{\text{number of elements}} = 30$$
The mean measures the “center” of the list. The average is the balance point.
[Figure: the list on a number line marked 25, 30, 35, balancing at the average.]
$$\text{deviation} = \text{element} - \text{average}$$
deviations: −9, −2, 0, 0, 4, 7
$$\text{The Standard Deviation (SD)} = \sqrt{\text{mean of the squared deviations}} = \sqrt{\frac{(-9)^2 + (-2)^2 + 0^2 + 0^2 + 4^2 + 7^2}{6}} = 5$$
The SD measures the spread of the list about the mean. It has the same units as
the values in the list. It is a natural scale for the list: we are often more interested
in how many SD’s a value is from the mean than in the value itself.
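The two definitions translate directly into code. This sketch divides by the number of elements, as the slide does (not by one less than the number of elements); the function names are illustrative.

```python
import math

def average(values):
    """Sum of the values divided by the number of elements."""
    return sum(values) / len(values)

def sd(values):
    """Root of the mean of the squared deviations (dividing by n)."""
    avg = average(values)
    return math.sqrt(sum((v - avg) ** 2 for v in values) / len(values))

example = [21, 28, 30, 30, 34, 37]
print(average(example), sd(example))
```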
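The Average and Standard Deviation of a List of Numbers
For a list consisting of just 0’s and 1’s we have:
$$\text{average} = \frac{\text{sum of the values}}{\text{number of elements}} = \frac{\text{number of ones}}{\text{number of elements}} = \text{fraction of 1's}$$
and with some algebra we can show that
$$\text{SD} = \sqrt{\text{mean of the squared deviations}} = \sqrt{(\text{fraction of 1's})(\text{fraction of 0's})}$$
We can now re-interpret the procedure for estimating our probability:
$$\text{estimated probability} = \frac{\#\text{ crossings}}{\#\text{ tosses}} = \frac{\text{sample \# of 1's}}{\text{sample size}} = \text{sample fraction of 1's} = \text{sample average}$$
$$\text{SE} = \sqrt{\frac{(\text{est. prob. of crossing})(\text{est. prob. of not crossing})}{\#\text{ tosses}}} = \sqrt{\frac{(\text{sample fraction of 1's})(\text{sample fraction of 0's})}{\text{sample size}}} = \frac{\text{sample SD}}{\sqrt{\text{sample size}}}$$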
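The "some algebra" behind the SD of a zero-one list can be written out as follows. The symbol f for the fraction of 1's is introduced here only for this sketch. A fraction f of the elements equal 1 and deviate from the average by 1 − f; the remaining fraction 1 − f equal 0 and deviate by −f.

$$\begin{aligned}
\text{average} &= f \\
\text{mean of the squared deviations} &= f\,(1-f)^2 + (1-f)\,(0-f)^2 \\
&= f(1-f)\big[(1-f) + f\big] \\
&= f(1-f) \\
\text{SD} &= \sqrt{f(1-f)} = \sqrt{(\text{fraction of 1's})(\text{fraction of 0's})}
\end{aligned}$$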
Properties of The Average and Standard Deviation
1. If we add a constant, B, to each element of a list, the average of the new list is the old average + B.
2. If we multiply each element of a list by a constant, A, the average of the new list is A times the old average.
3. If we add a constant, B, to each element of a list, the SD of the new list is the old SD.
4. If we multiply each element of a list by a constant, A, the SD of the new list is |A| times the old SD.
Standard Units
We are often more interested in how many SD’s a value is from the mean than in
the value itself. For example: 37 is 1.4 SD’s above the average, and 28 is 0.4 SD’s
below the average.
The value of an element in Standard Units is the number of SD’s it is above
(positive), or below (negative) the mean. To convert a value to standard units use
$$\text{value in standard units} = z\text{-value} = \frac{\text{deviation}}{\text{SD}} = \frac{\text{value} - \text{average}}{\text{SD}}$$
Example List: 21, 28, 30, 30, 34, 37 with average 30 and SD 5
In Standard Units: −1.8, −0.4, 0, 0, 0.8, 1.4
A list in standard units will have mean 0 and SD 1.
Adding a constant to each element of a list or multiplying each element by a
positive constant will not change the values of the elements in standard units.
(Multiplying by a negative constant reverses their signs.)
For many lists roughly 68% of the values lie
within 1 SD of the mean and 95% lie within 2 SD’s.
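A short sketch of the conversion to standard units, which also checks that a shift and a positive rescaling leave the standard units unchanged. The multiplier 1.8 and shift 32 are arbitrary values chosen for illustration.

```python
import math

def standard_units(values):
    """Convert each value to (value - average) / SD, with SD computed by dividing by n."""
    avg = sum(values) / len(values)
    sd = math.sqrt(sum((v - avg) ** 2 for v in values) / len(values))
    return [(v - avg) / sd for v in values]

example = [21, 28, 30, 30, 34, 37]
z = standard_units(example)
print([round(v, 1) for v in z])

# Rescale with a positive multiplier and a shift (both chosen arbitrarily).
rescaled = [1.8 * v + 32 for v in example]
z_rescaled = standard_units(rescaled)
print([round(v, 1) for v in z_rescaled])
```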
From Lists to Histograms
Example: 36 Exam Scores
23, 29, 30, 31, 35, 38, 40, 41, 42, 45, 46, 51, 52, 54, 55, 55, 57, 58, 59, 60, 61, 63,
69, 70, 70, 71, 71, 74, 75, 75, 82, 85, 86, 91, 91, 93. Note: Av = 59.1, SD = 18.9
Endpoint convention: class intervals contain
left endpoints, but not right endpoints.

class interval   # of scores   area (%)   density (% per point)
20 - 38          5             13.9       0.8
38 - 50          6             16.7       1.4
50 - 74          16            44.4       1.9
74 - 90          6             16.7       1.0
90 - 100         3             8.3        0.8

A Histogram represents the
percentages by areas (not by heights).
$$\text{area in \%} = (\text{width in pts})(\text{height in \%/pt})$$
$$13.9\% = 18\text{ pts} \times \text{density in \%/pt}$$
so the density of the first block is about 0.8% per point.
[Figure: histogram of the scores; horizontal axis "scores" from 20 to 100, vertical axis "Density (% per point)" from 0.0 to 2.0.]
A histogram is not a bar chart.
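The table of areas and densities can be recomputed from the raw scores. This is a sketch using the class intervals and the endpoint convention stated on the slide; the function name is illustrative.

```python
scores = [23, 29, 30, 31, 35, 38, 40, 41, 42, 45, 46, 51, 52, 54, 55, 55, 57, 58,
          59, 60, 61, 63, 69, 70, 70, 71, 71, 74, 75, 75, 82, 85, 86, 91, 91, 93]
class_intervals = [(20, 38), (38, 50), (50, 74), (74, 90), (90, 100)]

def histogram_blocks(values, intervals):
    """Return (count, area in %, density in % per point) for each class interval."""
    blocks = []
    for left, right in intervals:
        # Endpoint convention: include the left endpoint, exclude the right one.
        count = sum(1 for v in values if left <= v < right)
        area = 100 * count / len(values)
        density = area / (right - left)   # height = area / width
        blocks.append((count, area, density))
    return blocks

blocks = histogram_blocks(scores, class_intervals)
for (left, right), (count, area, density) in zip(class_intervals, blocks):
    print(f"{left}-{right}: {count} scores, {area:.1f}%, {density:.1f}% per point")
```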
A Histogram is Not A Bar Chart
[Figure: two panels, "Bar Chart of Scores" (vertical axis: % of total papers) and "Histogram of Scores" (vertical axis: Density (% per point); horizontal axis: scores).]
A Histogram represents the percentages by areas (not by heights).
A histogram is not a bar chart.
The Average and Standard Deviation of a Histogram
To find the mean or average of a histogram first list the center of each class interval then
multiply each by the area of the block above it and finally sum.
Class intervals: 20 to 38, 38 to 50, 50 to 74, 74 to 90, 90 to 100
List of midpoints: 29, 44, 62, 82, 95
$$\text{Histogram Av} = 29(.139) + 44(.167) + 62(.444) + 82(.167) + 95(.083) = 60.5$$
To find the standard deviation of a histogram find the squared deviations of the center of
each class interval, then multiply each by the area of its corresponding block, then sum, and
finally take the square root.
$$\text{SD} = \sqrt{(29 - 60.5)^2(.139) + (44 - 60.5)^2(.167) + (62 - 60.5)^2(.444) + (82 - 60.5)^2(.167) + (95 - 60.5)^2(.083)} = 19.0$$
[Note for the original data: Av = 59.1, SD = 18.9]
For many histograms roughly 68%
of the area lies within 1 SD of the
mean and 95% lies within 2 SD’s.
[Figure: histogram of the scores with Av = 60.5 and SD = 19 marked; horizontal axis "scores", vertical axis "Density (% per point)".]
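The histogram average and SD can be checked numerically from the midpoints and block areas given on the slide. A minimal sketch, with illustrative variable names:

```python
import math

midpoints = [29, 44, 62, 82, 95]
areas = [0.139, 0.167, 0.444, 0.167, 0.083]   # block areas as fractions of the total

# Average: midpoints weighted by block area.
hist_average = sum(m * a for m, a in zip(midpoints, areas))

# SD: root of the area-weighted squared deviations of the midpoints.
hist_sd = math.sqrt(sum((m - hist_average) ** 2 * a for m, a in zip(midpoints, areas)))

print(round(hist_average, 1), round(hist_sd, 1))
```

The results differ slightly from the values for the original list (59.1 and 18.9) because the histogram treats every score in a class as if it sat at the midpoint.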
Histograms and Standard Units
[Figure: the histogram of scores (Av = 60.5, SD = 19) with a second horizontal axis in Standard Units running from -3 to 3; vertical axis "Density (% per point)".]
The Normal Curve
The normal curve was discovered
by Abraham De Moivre around
1720. Around 1870 Adolph
Quetelet had the idea of using it
as an ideal histogram to which
histograms for data could be
compared. Many histograms
follow the normal curve and
many do not.
The equation for the
Standard Normal Curve is
$$y = f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$
the family:
$$g(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
[Figure: the normal curve, labeled with Area (percent) and Height (% per Std.U.). From: Freedman, Pisani, and Purves, Statistics, 3rd Ed.]
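The areas that later become confidence levels can be read off the standard normal curve numerically. This sketch uses the error function from Python's standard library; the function names are illustrative.

```python
import math

def normal_height(z):
    """Height of the standard normal curve at z (as a fraction per standard unit)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def area_within(z):
    """Area under the standard normal curve between -z and +z."""
    return math.erf(z / math.sqrt(2))

print(f"height at 0: {100 * normal_height(0):.1f}% per standard unit")
print(f"area within 1: {area_within(1):.1%}, within 2: {area_within(2):.1%}")
```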
Histograms, Standard Units, and the Normal curve
[Figure: the histogram of scores (Av = 60.5, SD = 19) with the normal curve drawn over it; horizontal axes in scores and in Standard Units from -3 to 3; vertical axis "Density (% per point)".]
Data Histograms and Probability Histograms
Discrete data convention
From: Freedman, Pisani, and Purves, Statistics, 3rd ed.
Data Histograms and Probability Histograms for
the Sum of the Draws
The Central Limit Theorem
There are many Central Limit Theorems. We state two in terms of box
models. The second is a special case of the first and it covers the model
we are dealing with in our stick tossing problem. It goes back to the early
eighteenth century.
When drawing at random with replacement from a box of numbered tickets
(with bounded range), the probability histogram for the sum of the draws
will follow the standard normal curve, even if the contents of the box do
not. The histogram must be put into standard units, and the number of
draws must be reasonably large.
De Moivre–Laplace version: When drawing at random with replacement
from a zero-one box, the probability histogram for the sum of the draws
will follow the standard normal curve, even if the contents of the box do
not. The histogram must be put into standard units, and the number of
draws must be reasonably large.
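The Normal Curve and Probability Histograms for
the Sum of the Draws
The box with one ticket numbered 0 and one ticket numbered 1
provides a box model for
counting the number of heads
in n tosses of a fair coin.
[Figure: histogram for the box, with a vertical axis from 0 to 100 and blocks at 0 and 1. From: Freedman, Pisani, and Purves]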
The Normal Curve and Probability Histograms for
the Sum of the Draws
From: Freedman, …
The Normal Curve and Probability Histograms for
the Sum of the Draws
[Figure: histogram for the box with tickets 1, 2, 9. From: Freedman, …]
The Central Limit Theorems
When drawing at random with replacement from a box of numbered tickets
(with bounded range), the probability histogram for the sum (and average)
of the draws will follow the standard normal curve, even if the contents
of the box do not. The histogram must be put into standard units, and the
number of draws must be reasonably large.
De Moivre–Laplace version: When drawing at random with replacement
from a zero-one box, the probability histogram for the sum (and average) of
the draws will follow the standard normal curve, even if the contents of
the box do not. The histogram must be put into standard units, and the
number of draws must be reasonably large.
The probability histogram for the average of the draws, when put in standard
units, is the same as for the sum, because multiplying each value of the sum by
1/(# of draws) won’t change the corresponding values in standard units.
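For a zero-one box the probability histogram for the sum of the draws can be computed exactly and compared with the normal curve. The sketch below assumes a box with 65% 1's and 360 draws, values chosen only for illustration, and reports the largest gap between a block height (in standard units) and the curve.

```python
import math

def sum_probability(k, n, p):
    """Exact chance that n draws from a zero-one box (fraction p of 1's) sum to k."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_height(z):
    """Height of the standard normal curve at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

n, p = 360, 0.65                      # p is a hypothetical box composition
expected = n * p                      # expected value of the sum
se_sum = math.sqrt(n * p * (1 - p))   # SE for the sum of the draws

# In standard units each block of the histogram has width 1/se_sum,
# so its height is probability * se_sum.
worst_gap = max(
    abs(sum_probability(k, n, p) * se_sum - normal_height((k - expected) / se_sum))
    for k in range(n + 1)
)
print(f"largest gap between block height and normal curve: {worst_gap:.4f}")
```

Increasing n in this sketch makes the gap smaller, which is the sense in which the number of draws must be "reasonably large".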
Where Does the 68% Confidence Level Come From?
$$\text{Estimated SE for the average} = \frac{\text{SD of the sample}}{\sqrt{\#\text{ of draws}}}$$
$$\text{True SE for the average of the draws} = \frac{\text{Pop. SD}}{\sqrt{\#\text{ of draws}}}$$
[Figure: the normal curve on a Standard units axis, marking the True Population Average, the Sample Average, and 1 standard unit.]
Since the estimated SE for the average, computed from the sample, is on average
about equal to the true SE, a 68% confidence interval will cover the true
population mean whenever the sample mean is within 1 SE of the true mean.
The probability of this happening is, by the Central Limit Theorem, the area
within 1 standard unit of 0 under the normal curve, and this area is about 68%.
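The argument can be checked by computing, for a box of known composition, the exact chance that the interval "sample fraction ± estimated SE" covers the true fraction. The box composition 0.65 is an assumption made only for this sketch; the coverage is close to, but not exactly, the nominal level because the sum of the draws is discrete and the SE is estimated.

```python
import math

def coverage(n, p, n_se=1):
    """Exact chance that sample fraction +/- n_se * estimated SE covers p."""
    total = 0.0
    for k in range(n + 1):
        p_hat = k / n
        se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
        if abs(p_hat - p) <= n_se * se_hat:
            # Add the binomial chance of seeing exactly k ones in n draws.
            total += math.comb(n, k) * p**k * (1 - p)**(n - k)
    return total

assumed_p = 0.65   # hypothetical box composition
for n in (36, 360):
    print(n, f"{coverage(n, assumed_p, 1):.1%}", f"{coverage(n, assumed_p, 2):.1%}")
```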
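How to Prove The De Moivre–Laplace Version of The
Central Limit Theorem
Show that the probability that the sum of n draws at random with replacement
from a zero-one box is exactly k is given by the binomial formula
$$b(k; n, p) = \frac{n!}{k!\,(n-k)!}\; p^k q^{n-k}, \qquad \text{where } q = 1 - p.$$
Then using “Stirling’s Formula”
$$n! \sim \sqrt{2\pi}\; n^{n+\frac{1}{2}}\, e^{-n}$$
show
$$b(k; n, p) \approx \sqrt{\frac{n}{2\pi\, k\,(n-k)}}\; \left(\frac{np}{k}\right)^{k} \left(\frac{nq}{n-k}\right)^{n-k}$$
Letting x = k − np and recalling that q = 1 − p
$$b(k; n, p) \approx \frac{1}{\sqrt{2\pi npq \left(1 + \frac{x}{np}\right)\left(1 - \frac{x}{nq}\right)}} \cdot \frac{1}{\left(1 + \frac{x}{np}\right)^{x+np} \left(1 - \frac{x}{nq}\right)^{nq-x}}$$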
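The Stirling step can be checked numerically by comparing the exact binomial probability with the approximation that results from replacing each factorial by Stirling's formula. The values n = 360 and p = 0.65 are illustrative assumptions; the approximation is only defined for 0 < k < n.

```python
import math

def binomial_exact(k, n, p):
    """Exact binomial probability b(k; n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_stirling(k, n, p):
    """Approximation obtained by replacing n!, k!, (n-k)! with Stirling's formula."""
    q = 1 - p
    front = math.sqrt(n / (2 * math.pi * k * (n - k)))
    return front * (n * p / k) ** k * (n * q / (n - k)) ** (n - k)

n, p = 360, 0.65   # illustrative values; p is hypothetical
for k in (216, 234, 252):
    exact = binomial_exact(k, n, p)
    approx = binomial_stirling(k, n, p)
    print(k, f"relative error {abs(approx - exact) / exact:.5f}")
```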
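How to Prove The De Moivre–Laplace Version of The
Central Limit Theorem -- continued
Use the series for the log to show that, for |x| < npq,
$$\log\left[\left(1 + \frac{x}{np}\right)^{x+np} \left(1 - \frac{x}{nq}\right)^{nq-x}\right] \approx \frac{1}{2}\,\frac{x^2}{npq}$$
Which implies
$$\left(1 + \frac{x}{np}\right)^{x+np} \left(1 - \frac{x}{nq}\right)^{nq-x} \approx e^{\frac{1}{2}\frac{x^2}{npq}}$$
Hence
$$b(k; n, p) \approx \frac{1}{\sqrt{2\pi npq}}\; e^{-\frac{1}{2}\frac{x^2}{npq}} = \frac{1}{\sqrt{2\pi npq}}\; e^{-\frac{1}{2}\left(\frac{k-np}{\sqrt{npq}}\right)^2} = \frac{1}{\sqrt{2\pi npq}}\; e^{-\frac{1}{2}z^2}$$
where z = (k − np)/√(npq) is the value of k in standard units.
The limiting processes in these steps require some care. Both k and n must go to
infinity together in a fixed relationship to each other, and we need to understand
why values of x for which |x|>npq are unimportant.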
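The log-series step can be spelled out as a sketch of the leading terms only, using log(1 + t) = t − t²/2 + t³/3 − … and collecting higher-order terms into O(·) remainders:

$$\begin{aligned}
(np + x)\,\log\left(1 + \frac{x}{np}\right) &= (np + x)\left(\frac{x}{np} - \frac{x^2}{2n^2p^2} + \cdots\right) = x + \frac{x^2}{2np} + O\!\left(\frac{x^3}{n^2p^2}\right) \\
(nq - x)\,\log\left(1 - \frac{x}{nq}\right) &= (nq - x)\left(-\frac{x}{nq} - \frac{x^2}{2n^2q^2} - \cdots\right) = -x + \frac{x^2}{2nq} + O\!\left(\frac{x^3}{n^2q^2}\right)
\end{aligned}$$

Adding the two lines, the terms in x cancel and

$$\frac{x^2}{2np} + \frac{x^2}{2nq} = \frac{x^2}{2n}\cdot\frac{p + q}{pq} = \frac{1}{2}\,\frac{x^2}{npq}, \qquad \text{since } p + q = 1.$$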
Bibliography
1. Freedman, Pisani, & Purves, Statistics, 3rd Ed., W.W. Norton, New York, 1998.
2. W. Feller, An Introduction to Probability Theory and Its Applications, Volume I, 2nd Ed., John Wiley & Sons, New York, London, Sydney, 1957.
3. F. Mosteller, Fifty Challenging Problems in Probability with Solutions, Addison-Wesley, Palo Alto, 1965.
4. http://www-history.mcs.st-andrews.ac.uk/Biographies/De_Moivre.html
5. R Development Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, 2006, <http://www.R-project.org>.