AP Statistics: Section 10.1 A

Confidence Interval Basics
How long does a new laptop battery last? What
proportion of college undergraduates have
engaged in binge drinking? It certainly would
not be feasible to test every laptop or question
every college undergraduate. Instead we choose
a sample from the population of interest and
collect data from these subjects. Our goal is to
use the sample statistic to estimate the
unknown population parameter.
Statistical inference provides methods for
drawing conclusions about a population
from sample data. The two most common
types of formal statistical inference are
confidence intervals and significance tests.
Inference is most reliable when the
data is produced by a properly
randomized design.
Sample values, such as a
proportion or mean, will probably
vary from sample to sample, but
there is only one true population
proportion or mean. Only by
considering our sample as one of
many such samples can we draw
inferences.
The sampling distribution of $\bar{x}$
describes how the values of $\bar{x}$ vary in
repeated samples. Recall from Chapter
9 some important facts about the
sampling distribution of $\bar{x}$.
CENTER: $\mu_{\bar{x}} = \mu$
SPREAD: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Note: the spread formula applies when $N \ge 10n$.
SHAPE:
1. If the population is Normally
distributed, then the distribution of $\bar{x}$ will
be Normally distributed regardless of our
sample size.
2. Central Limit Theorem: If n is
sufficiently large, the sampling distribution
of $\bar{x}$ will be approximately Normal
regardless of the shape of the population
distribution.
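The center, spread, and shape facts above can be checked with a small simulation. The sketch below is only an illustration: the right-skewed population, its size, the seeds, and the helper name simulate_sample_means are all assumptions made up for the demonstration, not part of the section.

```python
import math
import random
import statistics


def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` simple random samples of size n; return the list of sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]


# Hypothetical right-skewed population (exponential, mean about 10); not from the text.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 10) for _ in range(20000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 50  # N = 20000 is well above 10n, so sigma / sqrt(n) should apply
means = simulate_sample_means(population, n, reps=2000)

# CENTER: the mean of the sample means should be close to mu
print("mu:", round(mu, 2), "mean of sample means:", round(statistics.fmean(means), 2))
# SPREAD: the standard deviation of the sample means should be close to sigma / sqrt(n)
print("sigma/sqrt(n):", round(sigma / math.sqrt(n), 2),
      "sd of sample means:", round(statistics.stdev(means), 2))
```

A histogram of `means` would also look roughly Normal even though the population itself is strongly skewed, which is what the Central Limit Theorem predicts for a sample size this large.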
Example: The admissions director at Big City
University proposes using the IQ scores of current
students as a marketing tool. The university decides
to provide him with enough money to administer IQ
tests to an SRS of 50 of the university’s 5000
freshmen. The mean IQ score for the sample is 112.
What can the director say about the mean score of
the population of all 5000 freshmen?
Because n = 50 is fairly large, the Law of Large
Numbers suggests that $\mu$ is probably fairly close to $\bar{x}$ = 112.
Still, 112 is probably not exactly the true population
mean IQ of Big City University freshmen.
The goal of a confidence interval is to give a
range of values that we are “confident” the true
population mean will lie within. The following
will give us a glimpse of how this is done.
When the sampling distribution of $\bar{x}$ is
Normal, the 68-95-99.7 rule for Normal
distributions says that in about 95% of all
samples, the sample mean score $\bar{x}$ will
be within 2 standard deviations of $\bar{x}$ (that is, $2\sigma_{\bar{x}}$) of the
population mean $\mu$.
So, whenever $\bar{x}$ is within 2
standard deviations of $\mu$, $\mu$ is
within 2 standard deviations of $\bar{x}$.
So the unknown $\mu$ lies between
$\bar{x} - 2\sigma_{\bar{x}}$ and $\bar{x} + 2\sigma_{\bar{x}}$ in about
95% of all samples.
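The step from "the sample mean is within 2 standard deviations of the population mean" to "the population mean is within 2 standard deviations of the sample mean" rests only on the symmetry of distance. Written out, with $\sigma_{\bar{x}}$ standing for the standard deviation of the sample mean:

```latex
\begin{align*}
|\bar{x} - \mu| \le 2\sigma_{\bar{x}}
  &\iff -2\sigma_{\bar{x}} \le \mu - \bar{x} \le 2\sigma_{\bar{x}} \\
  &\iff \bar{x} - 2\sigma_{\bar{x}} \le \mu \le \bar{x} + 2\sigma_{\bar{x}}
\end{align*}
```

Since the first statement holds in about 95% of all samples, so does the last.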
For the example above, let’s assume
the standard deviation of freshman IQ
scores at BCU is $\sigma = 15$, so the standard
deviation of $\bar{x}$ is
$\sigma_{\bar{x}} = 15 / \sqrt{50} \approx 2.1$
So, we estimate that $\mu$ lies somewhere in
the interval from $\bar{x} - 2(2.1)$ to $\bar{x} + 2(2.1)$, or
$(\bar{x} - 4.2,\ \bar{x} + 4.2)$. Our sample of 50
freshmen gave $\bar{x} = 112$. The resulting
interval is $112 \pm 4.2$, or (107.8, 116.2).
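The arithmetic for the BCU example can be reproduced in a few lines. This is a minimal sketch: the function name two_sigma_interval is made up for illustration, and it uses the rough "2 standard deviations" multiplier from the 68-95-99.7 rule, as the section does.

```python
import math


def two_sigma_interval(xbar, sigma, n):
    """Return (low, high, margin) for the interval xbar ± 2 * sigma / sqrt(n)."""
    se = sigma / math.sqrt(n)   # standard deviation of the sample mean
    margin = 2 * se             # "2 standard deviations" from the 68-95-99.7 rule
    return xbar - margin, xbar + margin, margin


# Values from the example: sample mean 112, sigma = 15, n = 50
low, high, margin = two_sigma_interval(xbar=112, sigma=15, n=50)
print(f"{low:.1f} to {high:.1f} (margin {margin:.1f})")
# prints: 107.8 to 116.2 (margin 4.2)
```

The code keeps the unrounded standard deviation (about 2.12) instead of 2.1, but the endpoints agree with the hand calculation to one decimal place.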
The key idea is that the sampling
distribution of $\bar{x}$ tells us how big
the error is likely to be when we
use $\bar{x}$ to estimate $\mu$.
Understand that our confidence is
in the procedure used to generate
the interval.
It is incorrect to attach a
probability to an
already computed interval, because
there are only two possibilities:
1. The interval between 107.8 and
116.2 contains the true $\mu$.
2. Our SRS was one of the few
samples for which $\bar{x}$ is not
within 4.2 points of the true $\mu$.
Only about 5% of all samples
give such inaccurate results.
This interval of numbers is called a
95% confidence interval
for $\mu$.
The method catches the unknown $\mu$ in about 95%
of all possible samples.
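The claim that the procedure captures the true mean in about 95% of samples can be illustrated by simulation. The sketch below makes several assumptions that are not in the section: the population is treated as Normal and effectively infinite, the "true" mean of 110 is invented for the demonstration, and coverage_rate is a hypothetical helper name.

```python
import math
import random
import statistics


def coverage_rate(mu, sigma, n, reps, seed=0):
    """Fraction of repeated samples whose interval xbar ± 2*sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)
    margin = 2 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # One sample of size n from an assumed Normal(mu, sigma) population
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


# mu = 110 is a made-up "true" mean; sigma = 15 and n = 50 follow the example
rate = coverage_rate(mu=110, sigma=15, n=50, reps=4000)
print("share of intervals that contain mu:", rate)
```

Each individual interval either contains the true mean or it does not; the roughly 95% figure describes how often the method succeeds across many samples.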
The $\pm 4.2$ is called the
margin of error.
The 95% is the
confidence level.