Sampling Distribution Proportion
Sampling Distribution of a Sample Proportion
Lecture 25
Sections 8.1 – 8.2
Fri, Feb 29, 2008
Sampling Distributions
Sampling Distribution of a Statistic
The Sample Proportion
The letter p represents the population
proportion.
The symbol p^ (“p-hat”) represents the sample
proportion.
p^ is a random variable.
The sampling distribution of p^ is the probability
distribution of all the possible values of p^.
Example
Suppose that 2/3 of all males wash their
hands after using a public restroom.
Suppose that we take a sample of 1 male.
Find the sampling distribution of p^.
Example
Tree diagram (W = washes, N = does not wash):
W: P(W) = 2/3
N: P(N) = 1/3
Example
Let x be the sample number of males who
wash.
The probability distribution of x is
x       0      1
P(x)    1/3    2/3
Example
Let p^ be the sample proportion of males
who wash. (p^ = x/n.)
The sampling distribution of p^ is
p^       0      1
P(p^)    1/3    2/3
Example
Now we take a sample of 2 males,
sampling with replacement.
Find the sampling distribution of p^.
Example
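Tree diagram (first male, then second male; W = washes, N = does not wash):
WW: P(WW) = (2/3)(2/3) = 4/9
WN: P(WN) = (2/3)(1/3) = 2/9
NW: P(NW) = (1/3)(2/3) = 2/9
NN: P(NN) = (1/3)(1/3) = 1/9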
Example
Let x be the sample number of males who
wash.
The probability distribution of x is
x       0      1      2
P(x)    1/9    4/9    4/9
Example
Let p^ be the sample proportion of males
who wash. (p^ = x/n.)
The sampling distribution of p^ is
p^       0      1/2    1
P(p^)    1/9    4/9    4/9
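The tree diagram can be checked by listing every possible sample. The following is a minimal sketch, assuming each male washes independently with probability p = 2/3 (sampling with replacement); the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def sampling_distribution_by_enumeration(n, p):
    """List every W/N sequence of length n and total the probability of each p^."""
    dist = {}
    for outcome in product("WN", repeat=n):
        # Multiply along one branch of the tree diagram.
        prob = Fraction(1)
        for person in outcome:
            prob *= p if person == "W" else 1 - p
        p_hat = Fraction(outcome.count("W"), n)
        dist[p_hat] = dist.get(p_hat, Fraction(0)) + prob
    return dict(sorted(dist.items()))

dist = sampling_distribution_by_enumeration(2, Fraction(2, 3))
for p_hat, prob in dist.items():
    print(p_hat, prob)
```

For n = 2 this prints the three values 0, 1/2, 1 with probabilities 1/9, 4/9, 4/9. The branches WN and NW both give p^ = 1/2, which is why their probabilities are added.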
Samples of Size n = 3
If we sample 3 males, then the sample
proportion of males who wash has the
following distribution.
p^       0             1/3           2/3            1
P(p^)    1/27 = .04    6/27 = .22    12/27 = .44    8/27 = .30
Samples of Size n = 4
If we sample 4 males, then the sample
proportion of males who wash has the
following distribution.
p^       0             1/4           2/4            3/4            1
P(p^)    1/81 = .01    8/81 = .10    24/81 = .30    32/81 = .40    16/81 = .20
Samples of Size n = 5
If we sample 5 males, then the sample
proportion of males who wash has the
following distribution.
p^       0               1/5              2/5              3/5              4/5              1
P(p^)    1/243 = .004    10/243 = .041    40/243 = .165    80/243 = .329    80/243 = .329    32/243 = .132
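Listing every sample becomes tedious as n grows. A shortcut, sketched here under the assumption that the draws are independent (so the count x of males who wash is binomial with parameters n and p = 2/3), is to compute each probability directly. The function name is made up for illustration.

```python
from fractions import Fraction
from math import comb

def sampling_distribution(n, p):
    """Exact distribution of p^ = x/n when x is binomial(n, p)."""
    return {Fraction(x, n): comb(n, x) * p**x * (1 - p)**(n - x)
            for x in range(n + 1)}

p = Fraction(2, 3)
for n in (3, 4, 5):
    print(f"n = {n}")
    for p_hat, prob in sampling_distribution(n, p).items():
        # Fractions are printed in lowest terms, e.g. 6/27 appears as 2/9.
        print(f"  p^ = {p_hat}  P = {prob} = {float(prob):.3f}")
```

The values for n = 5 agree with the table above (1/243, 10/243, 40/243, 80/243, 80/243, 32/243).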
Our Experiment
In our experiment, we had 80 samples of
size 5.
Based on the sampling distribution when n = 5, we would expect the following counts.
Value of p^    0.0    0.2    0.4     0.6     0.8     1.0
Actual
Predicted      0.3    3.3    13.2    26.3    26.3    10.5
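The predicted counts are the number of samples (80) times each probability in the n = 5 sampling distribution. A short sketch of that arithmetic, again assuming x is binomial with p = 2/3:

```python
from fractions import Fraction
from math import comb

p = Fraction(2, 3)
n = 5              # size of each sample
num_samples = 80   # number of samples in the class experiment

predicted = []
for x in range(n + 1):
    prob = comb(n, x) * p**x * (1 - p)**(n - x)
    predicted.append(round(float(num_samples * prob), 1))
    print(f"p^ = {x / n:.1f}  predicted count = {predicted[-1]}")
```

The list comes out as 0.3, 3.3, 13.2, 26.3, 26.3, 10.5, matching the Predicted row.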
[Figures: graphs of the pdf of p^ when n = 1, 2, 3, 4, 5, and 10, each drawn over the possible values of p^ from 0 to 1.]
Observations and Conclusions
Observation: The values of p^ are
clustered around p.
Conclusion: p^ is close to p most of the
time.
Observations and Conclusions
Observation: As the sample size
increases, the clustering becomes tighter.
Conclusion: Larger samples give better
estimates.
Conclusion: We can make the estimates of
p as good as we want, provided we make
the sample size large enough.
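One way to put a number on "the clustering becomes tighter" is to compute the probability that p^ lands within a fixed distance of p for several sample sizes. This is only a sketch: the tolerance of 0.10 and the sample sizes are arbitrary choices for illustration, and x is assumed to be binomial with p = 2/3.

```python
from fractions import Fraction
from math import comb

def prob_close(n, p, tol):
    """Exact P(|p^ - p| <= tol) when x is binomial(n, p) and p^ = x/n."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(n + 1)
               if abs(Fraction(x, n) - p) <= tol)

p = Fraction(2, 3)
tol = Fraction(1, 10)   # illustrative tolerance, not from the lecture
probs = {n: prob_close(n, p, tol) for n in (5, 10, 100)}
for n, pr in probs.items():
    print(f"n = {n:3d}  P(|p^ - p| <= 0.10) = {float(pr):.3f}")
```

For these three sample sizes the probability rises with n. Because p^ can take only the values x/n, the rise is not guaranteed to be smooth for every pair of nearby sample sizes.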
Observations and Conclusions
Observation: The distribution of p^ appears
to be approximately normal.
Conclusion: We can use the normal
distribution to calculate just how close to p
we can expect p^ to be.
One More Observation
However, we must know the values of μ and σ for the distribution of p^.
That is, we have to quantify the sampling
distribution of p^.
The Central Limit Theorem for
Proportions
It turns out that the sampling distribution of
p^ is approximately normal with the
following parameters.
Mean of p^: $\mu_{\hat{p}} = p$
Variance of p^: $\sigma_{\hat{p}}^2 = \dfrac{p(1 - p)}{n}$
Standard deviation of p^: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$
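The mean and variance formulas can be checked against the exact sampling distributions found earlier. A minimal sketch, assuming x is binomial with p = 2/3 and using exact fractions so that no rounding is involved:

```python
from fractions import Fraction
from math import comb

def mean_and_variance(n, p):
    """Mean and variance of p^ computed directly from its exact distribution."""
    dist = {Fraction(x, n): comb(n, x) * p**x * (1 - p)**(n - x)
            for x in range(n + 1)}
    mean = sum(value * prob for value, prob in dist.items())
    variance = sum((value - mean)**2 * prob for value, prob in dist.items())
    return mean, variance

p = Fraction(2, 3)
for n in (1, 2, 5, 10):
    mean, variance = mean_and_variance(n, p)
    # The last column is the formula p(1 - p)/n for comparison.
    print(n, mean, variance, p * (1 - p) / n)
```

In every row the mean equals p = 2/3 and the variance equals p(1 - p)/n. Those two formulas hold exactly for any n; only the normal shape is an approximation.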
The Central Limit Theorem for
Proportions
The approximation to the normal
distribution is excellent if
np ≥ 5 and n(1 – p) ≥ 5.
Example
If we gather a sample of 100 males, how
likely is it that between 60 and 70 of them,
inclusive, wash their hands after using a
public restroom?
This is the same as asking the likelihood
that 0.60 ≤ p^ ≤ 0.70.
Example
Use p = 0.66.
Check that
np = 100(0.66) = 66 > 5,
n(1 – p) = 100(0.34) = 34 > 5.
Then p^ has approximately a normal distribution with
$\mu_{\hat{p}} = 0.66$
$\sigma_{\hat{p}} = \sqrt{\dfrac{(0.66)(0.34)}{100}} = 0.04737$
Example
So
P(0.60 ≤ p^ ≤ 0.70)
= normalcdf(.60,.70,.66,.04737)
= 0.6981.
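The calculator's normalcdf can be imitated with Python's standard library. This is a sketch, not the calculator's own routine: the helper below simply subtracts two values of the normal cumulative distribution function.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """P(lower <= X <= upper) for X normal with mean mu and standard deviation sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

p, n = 0.66, 100
sigma = sqrt(p * (1 - p) / n)           # standard deviation of p^
prob = normalcdf(0.60, 0.70, p, sigma)
print(round(sigma, 5), round(prob, 3))
```

The standard deviation rounds to 0.04737 and the probability is approximately 0.698, in line with the slide.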
Why Surveys Work
Suppose that we are trying to estimate the
proportion of the male population who
wash their hands after using a public
restroom.
Suppose the true proportion is 66%.
If we survey a random sample of 1000
people, how likely is it that our error will be
no greater than 5%?
Why Surveys Work
Now we have
$\mu_{\hat{p}} = 0.66$
$\sigma_{\hat{p}} = \sqrt{\dfrac{(0.66)(0.34)}{1000}} = 0.01498.$
Why Surveys Work
Now find the probability that p^ is between
0.61 and 0.71:
normalcdf(.61, .71, .66, .01498) = 0.9992.
It is virtually certain that our estimate will
be within 5% of 66%.
Case Study
Study confirms aprotinin drug increases
cardiac surgery death rate
Aprotinin during Coronary-Artery Bypass
Grafting and Risk of Death
Why Surveys Work
What if we had decided to save money
and surveyed only 100 people?
If it is important to be within 5% of the
correct value, is it worth it to survey 1000
people instead of only 100 people?
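The question can be explored by repeating the normal calculation for both sample sizes. A minimal sketch, assuming p = 0.66 and reading "within 5%" as p^ between 0.61 and 0.71; the function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(p, n, margin):
    """Normal approximation to P(p - margin <= p^ <= p + margin)."""
    sigma = sqrt(p * (1 - p) / n)
    dist = NormalDist(p, sigma)
    return dist.cdf(p + margin) - dist.cdf(p - margin)

results = {n: prob_within(0.66, n, 0.05) for n in (100, 1000)}
for n, prob in results.items():
    print(f"n = {n:4d}  P(within 0.05 of p) = {prob:.4f}")
```

With 1000 people the probability is about 0.999, as found above. With only 100 people it drops to roughly 0.71, so about three surveys in ten would miss by more than 5 percentage points.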
Quality Control
A company will accept a shipment of
components if there is no strong evidence
that more than 5% of them are defective.
H0: 5% of the parts are defective.
H1: More than 5% of the parts are
defective.
Quality Control
They will take a random sample of 100
parts and test them. If no more than 10 of
them are defective, they will accept the
shipment.
What is α?
What is β?
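Both error probabilities for this decision rule can be computed from the binomial distribution. The sketch below assumes the number of defective parts in the sample is binomial with n = 100. The probability of rejecting a shipment that is exactly 5% defective (the Type I error probability, usually written α) uses p = 0.05. The probability of accepting a bad shipment (the Type II error probability, usually written β) depends on how bad the shipment really is, so the value p = 0.10 below is a hypothetical alternative chosen only for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, cutoff = 100, 10   # accept the shipment if at most 10 of 100 parts are defective

# Type I error: reject (more than 10 defective) when exactly 5% are defective.
alpha = 1 - binom_cdf(cutoff, n, 0.05)

# Type II error: accept (at most 10 defective) when the true rate is higher.
p_alt = 0.10          # hypothetical alternative, not given in the lecture
beta = binom_cdf(cutoff, n, p_alt)

print(f"alpha = {alpha:.4f}")
print(f"beta (if p = {p_alt}) = {beta:.4f}")
```

The exact binomial is used here rather than the normal approximation because np = 100(0.05) = 5 sits right at the edge of the rule of thumb.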