Chapter 9 Sampling Distributions
Chapter 9
Sampling Distributions
AP Statistics
St. Francis High School
Fr. Chris, 2001
Two Key Ideas
A Statistic is a Random Variable
As such, its mean and standard deviation
can be found by combining the basic
random variables that make up the statistic
Pick Pennies from a Hat
Recall how we did this
Try it again:
– Pick at random
– Note the year
– Compute the mean and standard deviation
of your sample
– NEW: Estimate what you think the mean
and standard deviation of the entire hat are!
Formulas
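$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$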
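The two standard-deviation formulas can be written as small helpers. This is a minimal sketch; the function names are illustrative and do not come from the slides.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def sd_of_sample_proportion(p, n):
    """Standard deviation of p-hat: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the spread of x-bar.
print(sd_of_sample_mean(10, 4))    # sigma = 10, n = 4
print(sd_of_sample_mean(10, 16))   # sigma = 10, n = 16
```

Both formulas assume the population is much larger than the sample, a condition the rules of thumb later in the chapter make precise.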
Statistic vs. Parameter
A Statistic describes a sample and is
used to estimate a parameter
A Parameter describes a population
Which is a statistic, which is a
parameter?
42% of today’s 15 year-old girls will get pregnant
in their teens
42%: parameter
37% said they would vote for Joan Smith; on
election day 41% actually did.
37%: statistic, 41%: parameter
The NIH reports that the mean systolic blood
pressure for males 35-44 years of age is 128 and
the standard deviation is 15. 72 male stockbrokers
in this age group have a mean blood
pressure of 126.07.
128, 15: parameters; 126.07: statistic
Bias vs. Variability
Bias: Is your statistic centered around
the population’s parameter?
Variability: Is your sampling distribution
scattered or focused?
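One way to see both ideas at once is to compute a statistic on many repeated samples and compare the results with the parameter. The sketch below uses a made-up population (not data from the slides) and a hypothetical helper `summarize`. It compares the sample mean with a deliberately poor estimator of the population mean, the sample maximum.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up values, not data from the slides.
population = [random.gauss(50, 10) for _ in range(10_000)]
parameter = statistics.mean(population)  # the population mean

def summarize(estimates, parameter):
    """Return (bias, variability) for a list of repeated estimates."""
    bias = statistics.mean(estimates) - parameter
    variability = statistics.stdev(estimates)
    return bias, variability

# Sample mean from an SRS of 25, repeated 2,000 times.
sample_means = [statistics.mean(random.sample(population, 25))
                for _ in range(2_000)]
# A deliberately biased statistic: the largest value in each sample.
sample_maxes = [max(random.sample(population, 25)) for _ in range(2_000)]

mean_bias, mean_spread = summarize(sample_means, parameter)
max_bias, max_spread = summarize(sample_maxes, parameter)
print("sample mean: bias", round(mean_bias, 2), "variability", round(mean_spread, 2))
print("sample max:  bias", round(max_bias, 2), "variability", round(max_spread, 2))
```

The sample means should center near the parameter (low bias), while the sample maximums sit well above it (high bias) no matter how many samples are drawn.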
Identify the bias and variability
of each:
[Figure: four panels, each marking the Population Parameter]
What about your sample?
Is it variable?
Is it biased? How can you tell?
http://www.mathorama.com/stat/penny97hist.html
http://www.mathorama.com/stat/penny99hist.html
Confidence Intervals
Use your sample statistics and what you know of the
Central Limit Theorem to make an assertion about the
population parameter.
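$\bar{x} \pm 2\sigma$
$\bar{x} \pm z(\text{std. error})$
where z is the z score
for the desired %
By hand:
[Figure: standard normal curve $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, with the horizontal axis marked from -3 to 3]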
http://www.mathorama.com/stat/Confidence.html
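The "x-bar plus or minus z times the standard error" recipe can be sketched in a few lines. The function name `confidence_interval` is illustrative. The example reuses the stockbroker numbers from the Statistic vs. Parameter slide and assumes, for illustration only, that the NIH standard deviation of 15 also applies to stockbrokers.

```python
import math
from statistics import NormalDist

def confidence_interval(x_bar, sigma, n, level=0.95):
    """x-bar +/- z * (sigma / sqrt(n)) for the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Stockbroker example: x-bar = 126.07, n = 72, sigma = 15 (assumed to carry over).
low, high = confidence_interval(126.07, 15, 72)
print(round(low, 2), round(high, 2))
```

A higher confidence level gives a larger z and therefore a wider interval; a larger n shrinks the standard error and narrows it.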
Computer Simulation
http://www.mathorama.com/stat/RandomSamp.html
What about a proportion?
The Gallup poll asked a probability sample of 1785
adults whether they attended church or synagogue
during the past week. Suppose 40% of all adults did attend.
How likely is it that an SRS of 1785 would give a result
within 3 percentage points of this actual value?
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
Two rules of thumb:
The population must be at least 10 times
as large as your sample to use this
formula for standard deviation.
np > 10 and n(1-p) > 10 in order to use
the normal curve for approximating the distribution of p-hat.
Compute the standard
deviation
Since the population is more than 10 times
1785,
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.4(.6)}{1785}} = 0.0116$
The Probability that
p-hat is between 37%-43%
Since (.4)(1785) >10, and (.6)(1785)>10 then
we can convert to z-scores and use the normal
curve.
$z = \frac{x - \mu}{\sigma}$
$\frac{.37 - .4}{0.0116} = -2.586$
$\frac{.43 - .4}{0.0116} = 2.586$
Using the Normal
Distribution…
P(-2.586 < Z < 2.586)=
P(Z<2.586)-P(Z<-2.586)=
normalcdf(-2.586,2.586)=
normalcdf(.37, .43, .4, 0.0116)=
.9903!
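The same calculation can be checked without a calculator. This is a sketch using Python's standard-library `NormalDist` in place of `normalcdf`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.40, 1785

# Rule of thumb for using the normal curve.
assert n * p > 10 and n * (1 - p) > 10

sd_p_hat = math.sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(mu=p, sigma=sd_p_hat)

# P(0.37 < p-hat < 0.43)
prob = sampling_dist.cdf(0.43) - sampling_dist.cdf(0.37)
print(round(sd_p_hat, 4), round(prob, 4))
```

Because the code does not round the standard deviation to 0.0116 first, the last digit may differ slightly from the hand calculation.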
Okay, what if you flip a coin 20
times and it’s heads 14 times?
Is it a fair coin? How can you justify your answer?
Did you mention sample variability? Bias?
Do the rules of thumb apply to find a sigma? To
use the normal distribution?
If you suspect that 70% is this coin’s true
proportion, how many times should we flip it so
we can use the normal curve?
$(.3)(n) \ge 10$
$n \ge \frac{10}{.3}$
$n \ge 34$
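The flip count follows from the smaller of p and 1-p. A minimal sketch, assuming the rule is read as "at least 10" expected successes and failures; the function name is hypothetical, and exact fractions are used so that floating-point round-off cannot push the answer up by one.

```python
import math
from fractions import Fraction

def min_flips_for_normal(p, threshold=10):
    """Smallest n with n*p >= threshold and n*(1-p) >= threshold."""
    p = Fraction(str(p))  # exact arithmetic avoids floating-point round-off
    return math.ceil(threshold / min(p, 1 - p))

print(min_flips_for_normal(0.7))
```

For a suspected proportion of 0.7 the binding condition is n(0.3), since tails is the rarer outcome.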
Dishonest Cola?
DC Cola is suspected of underfilling its cans
of cola. They say the cans average 12 ounces,
with a standard deviation of 0.4 oz.
If this is true, how likely is it to get an
average of 11.9 oz. or less by taking a
random sample of 50 cans?
Work it out...
Z score? $z = \frac{11.9 - 12}{0.4/\sqrt{50}} = -1.77$
Look up -1.77 in Table A, or
normalcdf(-1E99, -1.77)
Or normalcdf(-1E99, 11.9, 12, .4 / √50)
=.0384
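The cola calculation, sketched with the standard library instead of Table A or `normalcdf`. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 12.0, 0.4, 50   # company's claim and sample size
x_bar = 11.9                   # sample mean in question

std_error = sigma / math.sqrt(n)
z = (x_bar - mu) / std_error
p_value = NormalDist().cdf(z)  # P(x-bar <= 11.9) if the claim is true
print(round(z, 2), round(p_value, 4))
```

The code keeps z unrounded, so the probability comes out near 0.0385 rather than the table value for z = -1.77.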
This leads to inference...
If the parameter really is where the company
says it is (12 oz.), there is still about a
4% chance that sample variation alone would
lead you to a sample mean of 11.9 oz. or less.
At what point do you reject the company’s
claim? At 5%? 1%? 0.1%?
Inferential Statistics
We choose a level of rejection (alpha)
We assume that our results are no
different, and any variation is from
chance (Null Hypothesis).
If results like ours would be unlikely under that
assumption (probability less than our chosen
alpha), we reject the “Null Hypothesis”
Then we claim our results are SIGNIFICANTLY
different.
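The decision rule is a single comparison. A small sketch with a hypothetical helper `decide`, applied to the cola probability of .0384 at the three alpha levels mentioned earlier:

```python
def decide(p_value, alpha):
    """Reject the null hypothesis when the p-value is below alpha."""
    return "reject" if p_value < alpha else "fail to reject"

p_value = 0.0384  # from the Dishonest Cola example
for alpha in (0.05, 0.01, 0.001):
    print(alpha, decide(p_value, alpha))
```

The same data lead to different conclusions at different alpha levels, which is why alpha is chosen before looking at the results.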
Central Limit Theorem
Draw an SRS of size n from any population
whatsoever with mean µ and a finite
standard deviation σ. When n is large, the
sampling distribution of the sample mean
x-bar is close to the normal distribution
N[µ, σ/√n] (page 488).
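A quick simulation can make the theorem concrete. This sketch assumes a strongly skewed population (exponential with mean 1 and standard deviation 1), chosen only for illustration, and checks that the sample means have center µ and spread close to σ/√n.

```python
import math
import random
import statistics

random.seed(9)  # fixed seed so the sketch is repeatable

def simulate_sample_means(n, reps):
    """Means of `reps` samples of size n from a skewed (exponential) population."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

n = 50
means = simulate_sample_means(n, 5_000)
print("mean of x-bars:", round(statistics.fmean(means), 3), "(theory: 1)")
print("sd of x-bars:  ", round(statistics.stdev(means), 3),
      "(theory:", round(1 / math.sqrt(n), 3), ")")
```

A histogram of `means` should look roughly bell-shaped even though the population itself is far from normal.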
Law of Large Numbers
Draw observations at random from any
population with finite mean µ. As the
number of observations drawn increases,
the mean x-bar of the observed values
gets closer and closer to µ.
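The law can be watched in action by tracking a running mean. This sketch uses rolls of a fair die (population mean 3.5) as an assumed example; the helper name is hypothetical.

```python
import random

random.seed(2001)  # fixed seed so the sketch is repeatable

def running_means(n_rolls, checkpoints):
    """Roll a fair die n_rolls times; record x-bar at each checkpoint."""
    total = 0
    results = {}
    for i in range(1, n_rolls + 1):
        total += random.randint(1, 6)
        if i in checkpoints:
            results[i] = total / i
    return results

means = running_means(100_000, {10, 100, 1_000, 10_000, 100_000})
for count in sorted(means):
    print(count, round(means[count], 3))
```

Early values of x-bar can wander, but the later checkpoints should settle close to 3.5.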
Homework 9.1-9.4 (489)
Parameter or a statistic?
$\mu = 2.5003$: parameter; $\bar{x} = 2.5009$: statistic
$\hat{p} = 7.2\%$: statistic
$\hat{p} = 48\%$: statistic; $p = 52\%$: parameter
$\bar{x}_1 = 335$: statistic; $\bar{x}_2 = 289$: statistic
9.5 (492)Tumbling Toast
Toss a coin 20 times. P-hat =
10 more times… make a histogram of
your p-hats… Is the center close to .5?
Pool your work… Is the center near .5?
Is it normal?
9.9 (500) Dead Guinea Pigs
9.10 (510)
A) Large Bias, Large Variability
B) Small Bias, Small Variability
C) Small Bias, Large Variability
D) Large Bias, Small Variability
9.17 (503) School Vouchers
Assuming the poll’s sample size is less
than 780,000 (10% of the population of
NJ), the variability would be about the
same
9.19 (511) Got Milk?
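n = 1012
$p = .7$; $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.7)(.3)}{1012}} = .0144$
US population > 10120
$\hat{p} = .67$, $p = .7$
$np = (1012)(.7) = 708.4 > 10$
$n(1-p) = (1012)(.3) = 303.6 > 10$
$P(\hat{p} \le .67) = P(Z \le -2.08) = .0186$
$4 \times 1012 = 4048$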
9.33(519) Juan’s results
$\sigma = 10$
$\frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{3}} = 5.7735$ mg
$\frac{10}{\sqrt{n}} \le 3$; $n \ge 12$
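The sample-size step comes from solving σ/√n for n and rounding up. A minimal sketch; the function name is hypothetical, and it assumes the goal is a standard deviation of x-bar no larger than the target.

```python
import math

def n_for_target_sd(sigma, target_sd):
    """Smallest n with sigma / sqrt(n) <= target_sd."""
    return math.ceil((sigma / target_sd) ** 2)

print(n_for_target_sd(10, 3))
```

Here (10/3)² is about 11.1, so 11 measurements are not quite enough and the count rounds up.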
9.35(524)Bad Rug
Mean=1.6 sd=1.2
normalcdf(2, 9999999, 1.6, 1.2/√200) ≈ 0
9.39(525) Cheap Cola
µ=298, σ=3 P(X<295)? P(x-bar<295, n=6)?
$P(X < 295) = P\left(Z < \frac{295 - 298}{3}\right) = P(Z < -1) = .1587$
$P(\bar{x} < 295) = P\left(Z < \frac{295 - 298}{3/\sqrt{6}}\right) = P(Z < -2.4495) = .0072$
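Comparing one bottle with the mean of six shows how averaging shrinks the spread. This sketch assumes the bottle contents are normally distributed, as the use of z-scores for a single bottle implies; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 298, 3, 6

single_bottle = NormalDist(mu, sigma)
six_bottle_mean = NormalDist(mu, sigma / math.sqrt(n))

p_single = single_bottle.cdf(295)   # P(X < 295), z = -1
p_mean = six_bottle_mean.cdf(295)   # P(x-bar < 295), z = -2.4495
print(round(p_single, 4), round(p_mean, 4))
```

A single bottle falls below 295 fairly often (z = -1, lower-tail area about 0.1587), while the mean of six bottles rarely does.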
9.41(526) What a Wreck!
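µ=2.2, σ=1.4
Not normal, but the distribution of x-bar is approximately normal!
$N\left(2.2, \frac{1.4}{\sqrt{52}} = .1941\right)$
$P(\bar{x} < 2) = P\left(Z < \frac{2 - 2.2}{1.4/\sqrt{52}}\right) = P(Z < -1.0302) = .1515$
$P\left(\bar{x} < \frac{100}{52}\right) = P\left(Z < \frac{100/52 - 2.2}{1.4/\sqrt{52}}\right) = P(Z < -1.4267) = .0768$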