Sampling Distribution Proportion


Sampling Distribution of a Sample Proportion
Lecture 25
Sections 8.1 – 8.2
Fri, Feb 29, 2008
Sampling Distributions

Sampling Distribution of a Statistic
The Sample Proportion




The letter p represents the population
proportion.
The symbol p^ (“p-hat”) represents the sample
proportion.
p^ is a random variable.
The sampling distribution of p^ is the probability
distribution of all the possible values of p^.
Example
Suppose that 2/3 of all males wash their
hands after using a public restroom.
Suppose that we take a sample of 1 male.
Find the sampling distribution of p^.

Example
Tree diagram for one male (W = washes, N = does not wash):
  W, with probability 2/3: P(W) = 2/3
  N, with probability 1/3: P(N) = 1/3
Example
Let x be the number of males in the sample
who wash.
The probability distribution of x is

x       0      1
P(x)    1/3    2/3
Example
Let p^ be the sample proportion of males
who wash. (p^ = x/n.)
The sampling distribution of p^ is

p^       0      1
P(p^)    1/3    2/3
Example
Now we take a sample of 2 males,
sampling with replacement.
Find the sampling distribution of p^.

Example
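Tree diagram for two males (first male, then second male):
  W (2/3), then W (2/3): P(WW) = 4/9
  W (2/3), then N (1/3): P(WN) = 2/9
  N (1/3), then W (2/3): P(NW) = 2/9
  N (1/3), then N (1/3): P(NN) = 1/9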
Example
Let x be the number of males in the sample
who wash.
The probability distribution of x is

x       0      1      2
P(x)    1/9    4/9    4/9
Example
Let p^ be the sample proportion of males
who wash. (p^ = x/n.)
The sampling distribution of p^ is

p^       0      1/2    1
P(p^)    1/9    4/9    4/9
Samples of Size n = 3

If we sample 3 males, then the sample
proportion of males who wash has the
following distribution.

p^      P(p^)
0       1/27 = .04
1/3     6/27 = .22
2/3     12/27 = .44
1       8/27 = .30
Samples of Size n = 4

If we sample 4 males, then the sample
proportion of males who wash has the
following distribution.

p^      P(p^)
0       1/81 = .01
1/4     8/81 = .10
2/4     24/81 = .30
3/4     32/81 = .40
1       16/81 = .20
Samples of Size n = 5

If we sample 5 males, then the sample
proportion of males who wash has the
following distribution.

p^      P(p^)
0       1/243 = .004
1/5     10/243 = .041
2/5     40/243 = .165
3/5     80/243 = .329
4/5     80/243 = .329
1       32/243 = .132
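
The tables for n = 1 through n = 5 can all be generated the same way, because the number x of males who wash follows a binomial distribution when we sample with replacement. The following is a minimal sketch of that calculation, not part of the original lecture; the function name sampling_distribution is made up for illustration.

```python
from fractions import Fraction
from math import comb


def sampling_distribution(n, p):
    """Return {p_hat: probability} for samples of size n drawn with replacement.

    Assumes each sampled male washes independently with probability p,
    so the count x is binomial and p_hat = x/n.
    """
    q = 1 - p
    return {Fraction(x, n): comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}


p = Fraction(2, 3)  # population proportion used in the examples
for n in (1, 2, 3, 4, 5):
    print(f"n = {n}")
    for p_hat, prob in sampling_distribution(n, p).items():
        print(f"  p^ = {str(p_hat):>3}   P(p^) = {str(prob):>7} = {float(prob):.3f}")
```

Fractions are printed in lowest terms, so a value such as 6/27 appears as 2/9.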
Our Experiment
In our experiment, we had 80 samples of
size 5.
Based on the sampling distribution when
n = 5, we would expect the following.

Value of p^    Actual    Predicted
0.0                      0.3
0.2                      3.3
0.4                      13.2
0.6                      26.3
0.8                      26.3
1.0                      10.5
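
The predicted counts are just 80 times the n = 5 probabilities. A short sketch of that arithmetic, assuming p = 2/3 as in the earlier examples (the variable names are illustrative, not from the lecture):

```python
from math import comb

n = 5             # sample size
p = 2 / 3         # assumed population proportion
num_samples = 80  # number of samples taken in the class experiment

# Expected number of samples showing each value of p^ = x/n.
predicted = [num_samples * comb(n, x) * p**x * (1 - p) ** (n - x)
             for x in range(n + 1)]

for x, count in enumerate(predicted):
    print(f"p^ = {x / n:.1f}   predicted count = {count:.1f}")
```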
[Figures: graphs of the pdf of p^ when n = 1, 2, 3, 4, 5, and 10.]
Observations and Conclusions
Observation: The values of p^ are
clustered around p.
Conclusion: p^ is close to p most of the
time.

Observations and Conclusions
Observation: As the sample size
increases, the clustering becomes tighter.
Conclusion: Larger samples give better
estimates.
Conclusion: We can make the estimates of
p as good as we want, provided we make
the sample size large enough.

Observations and Conclusions
Observation: The distribution of p^ appears
to be approximately normal.
Conclusion: We can use the normal
distribution to calculate just how close to p
we can expect p^ to be.

One More Observation
However, we must know the values of μ
and σ for the distribution of p^.
That is, we have to quantify the sampling
distribution of p^.

The Central Limit Theorem for
Proportions

It turns out that the sampling distribution of
p^ is approximately normal with the
following parameters.
Mean of p^: $\mu_{\hat{p}} = p$
Variance of p^: $\sigma_{\hat{p}}^2 = \frac{p(1 - p)}{n}$
Standard deviation of p^: $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
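
As a sanity check on these formulas, the mean and variance of p^ can be computed directly from the exact sampling distribution and compared with p and p(1 - p)/n. This is only a sketch under the same assumptions as the earlier examples (p = 2/3, sampling with replacement); the helper name is hypothetical.

```python
from fractions import Fraction
from math import comb


def mean_and_variance(n, p):
    """Exact mean and variance of p^ = x/n, where x is binomial(n, p)."""
    q = 1 - p
    dist = {Fraction(x, n): comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}
    mean = sum(value * prob for value, prob in dist.items())
    variance = sum((value - mean) ** 2 * prob for value, prob in dist.items())
    return mean, variance


p = Fraction(2, 3)
for n in (1, 2, 5, 10):
    mean, variance = mean_and_variance(n, p)
    # Both identities hold exactly because the arithmetic uses fractions.
    assert mean == p
    assert variance == p * (1 - p) / n
    print(f"n = {n:2d}   mean = {mean}   variance = {variance}")
```

The mean and variance formulas hold exactly for every n; only the normal shape is an approximation.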
The Central Limit Theorem for
Proportions

The approximation to the normal
distribution is excellent if
np ≥ 5 and n(1 - p) ≥ 5.
Example
If we gather a sample of 100 males, how
likely is it that between 60 and 70 of them,
inclusive, wash their hands after using a
public restroom?
This is the same as asking the likelihood
that 0.60 ≤ p^ ≤ 0.70.

Example
Use p = 0.66.
Check that
np = 100(0.66) = 66 > 5,
n(1 - p) = 100(0.34) = 34 > 5.

Then p^ has approximately a normal distribution with
$\mu_{\hat{p}} = 0.66$,
$\sigma_{\hat{p}} = \sqrt{\frac{(0.66)(0.34)}{100}} = 0.04737$.
Example

So
P(0.60 ≤ p^ ≤ 0.70)
= normalcdf(.60, .70, .66, .04737)
= 0.6981.
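
The calculator's normalcdf can be reproduced without special libraries. Here is a minimal sketch using the error function from Python's standard library; the function name normal_cdf_between is an illustrative stand-in for the calculator command, not a real library call.

```python
from math import erf, sqrt


def normal_cdf_between(lower, upper, mu, sigma):
    """Area under the normal(mu, sigma) curve between lower and upper."""
    def cdf(x):
        return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))
    return cdf(upper) - cdf(lower)


n, p = 100, 0.66
sigma = sqrt(p * (1 - p) / n)  # standard deviation of p^

# Rule-of-thumb check before trusting the normal approximation.
assert n * p >= 5 and n * (1 - p) >= 5

prob = normal_cdf_between(0.60, 0.70, p, sigma)
print(f"sigma = {sigma:.5f}")
print(f"P(0.60 <= p^ <= 0.70) is approximately {prob:.4f}")
```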
Why Surveys Work
Suppose that we are trying to estimate the
proportion of the male population who
wash their hands after using a public
restroom.
Suppose the true proportion is 66%.
If we survey a random sample of 1000
people, how likely is it that our error will be
no greater than 5 percentage points?

Why Surveys Work

Now we have
$\mu_{\hat{p}} = 0.66$,
$\sigma_{\hat{p}} = \sqrt{\frac{(0.66)(0.34)}{1000}} = 0.01498$.
Why Surveys Work
Now find the probability that p^ is between
0.61 and 0.71:
normalcdf(.61, .71, .66, .01498) = 0.9992.
It is virtually certain that our estimate will
be within 5 percentage points of 66%.

Case Study
Study confirms aprotinin drug increases
cardiac surgery death rate
Aprotinin during Coronary-Artery Bypass
Grafting and Risk of Death

Why Surveys Work
What if we had decided to save money
and surveyed only 100 people?
If it is important to be within 5% of the
correct value, is it worth it to survey 1000
people instead of only 100 people?
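
One way to explore this question is to repeat the normal calculation for both sample sizes. The sketch below assumes the true proportion is 0.66, as in the survey example, and uses the normal approximation throughout; the function name prob_within is made up for illustration.

```python
from math import erf, sqrt


def prob_within(margin, n, p):
    """Approximate P(|p^ - p| <= margin) using the normal approximation."""
    sigma = sqrt(p * (1 - p) / n)  # standard deviation of p^
    z = margin / sigma             # margin measured in standard deviations
    return erf(z / sqrt(2))        # P(-z <= Z <= z) for a standard normal Z


p = 0.66  # assumed true proportion
for n in (100, 1000):
    print(f"n = {n:4d}   P(within 0.05 of p) is approximately {prob_within(0.05, n, p):.4f}")
```

Comparing the two printed probabilities shows how much certainty the extra 900 responses buy.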

Quality Control
A company will accept a shipment of
components if there is no strong evidence
that more than 5% of them are defective.
H0: 5% of the parts are defective.
H1: More than 5% of the parts are
defective.

Quality Control
They will take a random sample of 100
parts and test them. If no more than 10 of
them are defective, they will accept the
shipment.
What is α?
What is β?

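
Assuming the two questions ask for the Type I and Type II error probabilities of this decision rule, they can be sketched with exact binomial sums. The lecture gives no alternative defect rate, so the value p_alt = 0.10 below is purely a hypothetical choice for illustration, and the function name prob_accept is made up.

```python
from math import comb


def prob_accept(n, cutoff, p):
    """P(at most `cutoff` defectives among n parts) when the defect rate is p."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(cutoff + 1))


n, cutoff = 100, 10  # sample size and acceptance cutoff from the slide

# Type I error: rejecting the shipment although H0 (5% defective) is true.
alpha = 1 - prob_accept(n, cutoff, 0.05)

# Type II error: accepting the shipment although the true rate is higher.
p_alt = 0.10  # hypothetical alternative, NOT given in the lecture
beta = prob_accept(n, cutoff, p_alt)

print(f"alpha is approximately {alpha:.4f}")
print(f"beta (if the true rate is {p_alt}) is approximately {beta:.4f}")
```

Because β depends on the true defect rate, a different choice of p_alt gives a different β.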