Part 4 - Angelfire


CHAPTER 9
SAMPLING DISTRIBUTIONS AND
CONFIDENCE INTERVALS
9.1 CHAPTER OBJECTIVES
– Motivation for Point Estimators
– Common Point Estimators
– Desirable Properties of Point Estimators
– Distribution of the Sample Mean: Large Sample or Known σ
– The Central Limit Theorem: A More Detailed Look
– Drawing Inferences by Using the Central Limit Theorem
– Large-Sample Confidence Intervals for the Mean
– Distribution of the Sample Mean: Small Sample and Unknown σ
– Small-Sample Confidence Intervals for the Mean
– Confidence Intervals for Qualitative Data
– Sample Size Calculations

9.2 MOTIVATIONS FOR POINT ESTIMATORS

A point estimate is a single number
calculated from sample data.

It is used to estimate a parameter of the
population.

A point estimator is the formula or rule that
is used to calculate the point estimate for a
particular set of data.
9.3 COMMON POINT ESTIMATORS
9.3.1 Point Estimators for Quantitative Variables

If you are studying a single quantitative
variable, then you typically wish to know the
value of the population mean and the value
of the population standard deviation. That is,
you wish to know the center and the
variability in the population.

We will use the sample mean, X̄, to estimate
μ, and we will use the sample standard
deviation, s, to estimate σ.
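
As a minimal sketch of these two point estimators, the snippet below computes the sample mean and the sample standard deviation with Python's standard library. The diaper weights are made-up numbers used only for illustration.

```python
import statistics

# Hypothetical diaper weights in grams (made-up data for illustration).
weights = [54.2, 55.1, 53.8, 55.6, 54.9, 54.4]

# Point estimate of the population mean.
x_bar = statistics.mean(weights)

# Point estimate of the population standard deviation (uses the n - 1 divisor).
s = statistics.stdev(weights)

print(f"sample mean = {x_bar:.3f}, sample standard deviation = {s:.3f}")
```

The formulas (the estimators) stay the same from sample to sample; only the numbers they produce (the estimates) change.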

Often you want to compare two populations. For example, you may wish to compare:
– the weight of diapers produced by two
different machines
– the sales of a product at two different
locations
– the time it takes to get your burger at
McDonald's and at Burger King
– the salaries for men and women in the
same occupation

For example, if you wanted to compare your
salary to your friend's salary, what would you
do with the two salary figures?

If you said subtract one number from the
other, then you are correct. By convention we
always subtract the second mean from the
first, and thus we need to estimate the true
difference between μ1 and μ2, or μ1 - μ2.

It makes sense to estimate this true difference
by using the actual difference in the two
sample means, or X̄1 - X̄2.
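
The difference between two sample means can be sketched as follows. The service times are hypothetical, and the variable names are chosen only for this example.

```python
import statistics

# Hypothetical service times in minutes at two restaurants (made-up data).
times_first = [3.1, 4.0, 2.8, 3.6, 3.9]
times_second = [2.5, 3.2, 2.9, 3.0, 2.4]

# By convention, subtract the second sample mean from the first.
diff_in_means = statistics.mean(times_first) - statistics.mean(times_second)

print(f"estimated difference in means = {diff_in_means:.2f} minutes")
```

A positive value suggests the first population has the larger mean; a value near zero suggests the two means are similar.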

Another way to compare two numbers is to
find the ratio of one to the other. If the
numbers are the same, then the ratio will
be 1.

We will use ratios to compare the amount
of variation in the first population to the
amount of variation in the second
population.
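
A small sketch of the ratio idea, assuming variation is measured by the sample variance (the slides do not spell out the exact statistic here). The data for the two machines are made up.

```python
import statistics

# Hypothetical diaper weights in grams from two machines (made-up data).
machine_a = [54.2, 55.1, 53.8, 55.6, 54.9]
machine_b = [54.6, 54.8, 54.5, 54.9, 54.7]

var_a = statistics.variance(machine_a)  # sample variance, n - 1 divisor
var_b = statistics.variance(machine_b)

# A ratio near 1 suggests similar variability; far from 1 suggests a difference.
variance_ratio = var_a / var_b

print(f"ratio of sample variances = {variance_ratio:.2f}")
```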
9.3.2 Point Estimators for Qualitative Variables

If the variable you are studying is a
qualitative variable then you typically wish
to know what proportion or percentage of
the population has a particular
characteristic.

For example, you may wish to know the
percentage of defective items in the
population.

The true unknown population percentage is
labeled π. In this case we will use the
sample proportion, p, to estimate π.

We often wish to compare two qualitative
variables. For example, you may wish to
compare
– the proportion of men who would buy a
new product with the proportion of
women who would buy it
– the proportion of defectives produced by
the second shift with the proportion of
defectives produced by the third shift
– the proportion of young people who like
the new packaging with the proportion of
older people who like it

We would like to compare the true
population proportions π1 and π2, and we
accomplish the comparison with a
subtraction.

So we wish to estimate the true difference
in the population proportions, π1 - π2.

It makes sense to estimate this true
difference by using the difference in the
two sample proportions or p1-p2.
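
The sample proportions and their difference can be computed directly from counts. The survey counts below are hypothetical.

```python
# Hypothetical survey results (made-up counts).
men_yes, men_total = 42, 120        # men who would buy the new product
women_yes, women_total = 57, 150    # women who would buy the new product

p1 = men_yes / men_total            # sample proportion for the first group
p2 = women_yes / women_total        # sample proportion for the second group

# Point estimate of the difference in the population proportions.
diff_in_proportions = p1 - p2

print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, p1 - p2 = {diff_in_proportions:.2f}")
```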
9.4 DESIRABLE PROPERTIES OF POINT
ESTIMATORS

In order to develop the properties of point
estimators, we focus on estimators for the
population mean, μ.

Based on what we have seen so far, it
makes sense to consider using one or more
of these statistics as our point estimator of
the unknown value μ.

In summary, what we really want is a
point estimator with the following two
properties:
– The point estimator should yield a
number close to the unknown
population parameter.
– The point estimator should not have a
great deal of variability.

These two properties are more precisely
stated as follows:
– The point estimator should be unbiased.
– The point estimator should have a small
standard deviation.

An unbiased estimator yields an estimate
that is fair. It neither systematically
overestimates the parameter nor
systematically underestimates the
parameter.
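
One way to see what unbiased means is to simulate many samples and average each estimator's results. The sketch below is an illustration under assumed values (a normal population with mean 50 and standard deviation 5, samples of size 10). It compares the sample mean with the sample maximum, which is used here only as an example of an estimator that systematically overestimates the mean.

```python
import random
import statistics

rng = random.Random(0)                     # fixed seed so the run is repeatable
MU, SIGMA, N, REPS = 50.0, 5.0, 10, 20_000  # assumed population and sample size

means, maxima = [], []
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))  # candidate estimator 1: sample mean
    maxima.append(max(sample))              # candidate estimator 2: sample maximum

avg_of_means = statistics.fmean(means)
avg_of_maxima = statistics.fmean(maxima)

print(f"true mean = {MU}")
print(f"average of sample means  = {avg_of_means:.2f}")
print(f"average of sample maxima = {avg_of_maxima:.2f}")
```

The sample means average out very close to the true mean, while the sample maxima average well above it.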
9.5 DISTRIBUTION OF THE SAMPLE MEAN, X̄
9.5.1 Putting Z-scores and the Empirical Rule to Use
To calculate the Z-score we need the following
formula:

$$Z = \frac{\text{Distance between the data value and the average}}{\text{Standard deviation}} = \frac{X - \mu_X}{\sigma_X}$$
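
The Z-score formula translates into a one-line function. The function name and the numbers in the example call are chosen only for illustration.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: a 57 g diaper when the mean is 55 g and sigma is 0.8 g.
z = z_score(57.0, 55.0, 0.8)
print(f"Z = {z:.2f}")
```

A positive Z-score means the value lies above the mean and a negative Z-score means it lies below.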
9.5.2 The Central Limit Theorem
 The standard error is the standard
deviation of a point estimator. It measures
how much the point estimator or sample
statistic varies from sample to sample.

The probability distribution of a point
estimator or a sample statistic is called a
sampling distribution.

The Central Limit Theorem applies when
you have a large enough sample size. A
"large enough" sample size depends on
how much the population distribution
deviates from a normal distribution.

Typically, if the sample size is larger than
30 then it is considered large enough. The
larger the sample size, the better the
normal approximation will be.
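
A short simulation can illustrate the theorem. This sketch assumes a strongly skewed population (exponential with mean 1 and standard deviation 1) and a sample size of 40; both choices are arbitrary. If the normal approximation is good, about 95% of the sample means should land within two standard errors of the population mean.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
N, REPS = 40, 5_000      # assumed sample size and number of simulated samples

# Exponential population with rate 1: right-skewed, mean 1, standard deviation 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(N)])
    for _ in range(REPS)
]

standard_error = 1.0 / math.sqrt(N)
within_two = sum(abs(m - 1.0) <= 2 * standard_error for m in sample_means) / REPS

print(f"share of sample means within 2 standard errors = {within_two:.3f}")
```

Rerunning with a much smaller N (for example 3) shows the approximation getting worse for a population this skewed.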
9.6 THE CENTRAL LIMIT THEOREM
9.6.1 The Shape of the Sampling Distribution of X̄

The diaper company is taking samples of
size n = 5 every hour.

Because each individual sample size is
small (n = 5), to apply the CLT we will need
to assume that the underlying distribution
of diaper weights is normal or close to a
normal shape.

We can get a sense of the shape of the
population distribution by examining a
histogram of sample observations.
9.6.2 The Mean of the Sampling Distribution of X̄

The second point of the Central Limit
Theorem is that the mean of the sampling
distribution of X̄ equals the mean of the
population you are sampling from.

This means the center of the histogram of
the X̄'s should be μ.
9.6.3 The Standard Error of the Sampling Distribution of X̄
The third point of the theorem says that
the standard deviation of the X̄'s (also
called the standard error) depends on two
things:
– the amount of variability you start with
in the population, σ,
– and the sample size, n.
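
The dependence on these two quantities is the usual formula, standard error = σ/√n. The sketch below evaluates it for a few sample sizes; the value of σ is an assumed number for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma divided by sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)


SIGMA = 1.5  # assumed population standard deviation, in grams
for n in (5, 20, 80):
    print(f"n = {n:>2}: standard error = {standard_error(SIGMA, n):.3f}")
```

Quadrupling the sample size cuts the standard error in half, because n enters through a square root.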
9.6.4 Summary of Central Limit Theorem
Combining all three of the points of the
Central Limit Theorem, we get Figure 9.6A,
which displays the sampling distribution of
X̄ when n is sufficiently large.
 We know from our work on the normal
distribution that 68% of values will fall
within one standard deviation of the mean,
95% will fall within two standard deviations
of the mean, and 99.7% will fall within
three standard deviations of the mean.
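
These percentages are rounded values of normal-curve areas. As a quick check, the sketch below computes them with `statistics.NormalDist` from Python's standard library.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the standard normal curve within k standard deviations of the mean.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in coverage.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The exact area within two standard deviations is slightly more than 95%, which is why 1.96 rather than 2 is often used for a 95% interval.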
9.7 DRAWING INFERENCES BY USING
THE CENTRAL LIMIT THEOREM
9.7.1 Using the Central Limit Theorem
We used s, the sample standard deviation,
as an estimate of σ in finding the Z-score,
since σ was not given. This is not precisely
the correct procedure but it is close enough
given the large sample size.

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{s_{\bar{X}}} = \frac{\bar{X} - \mu_{\bar{X}}}{s/\sqrt{n}}$$
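
A minimal sketch of this calculation, with s standing in for the unknown population standard deviation. The function name and the sample figures are hypothetical.

```python
import math


def z_for_sample_mean(x_bar: float, mu: float, s: float, n: int) -> float:
    """Z-score of a sample mean, using s / sqrt(n) as the estimated standard error."""
    estimated_standard_error = s / math.sqrt(n)
    return (x_bar - mu) / estimated_standard_error


# Hypothetical large sample: n = 36, sample mean 54.6 g, s = 1.2 g, claimed mean 55 g.
z = z_for_sample_mean(x_bar=54.6, mu=55.0, s=1.2, n=36)
print(f"Z = {z:.2f}")
```

With these numbers the sample mean sits about two estimated standard errors below the claimed mean.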
9.8 LARGE-SAMPLE CONFIDENCE
INTERVALS FOR THE MEAN
9.8.1 The Basics of Confidence Intervals
Let's examine the components of the
confidence interval. First of all, it has a
lower bound for μ, called L, and an upper
bound for μ, called U.

Finally, it has a probability value, which
is called the confidence level and is
labeled 1 - α.

For any individual interval, μ is either in
the interval or it is not.

In general, a confidence interval for the
population mean has the following form:

$$P(L \le \mu \le U) = 1 - \alpha$$

A confidence interval or an interval
estimate is a range of values with an
associated probability or confidence level,
1 - α.

The confidence level quantifies the chance
that an interval constructed this way
contains the true population parameter.
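
The confidence level describes the procedure over many repetitions, which a simulation can make concrete. The sketch below assumes a normal population with known mean and standard deviation (arbitrary values), builds a 95% interval from each of many samples, and counts how often the interval contains the true mean.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(2)                      # fixed seed so the run is repeatable
MU, SIGMA, N, REPS = 55.0, 1.5, 30, 2_000   # assumed population and sample size

z = NormalDist().inv_cdf(0.975)             # about 1.96 for a 95% interval
half_width = z * SIGMA / math.sqrt(N)

hits = 0
for _ in range(REPS):
    x_bar = statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
    lower, upper = x_bar - half_width, x_bar + half_width
    if lower <= MU <= upper:                # did this interval capture the true mean?
        hits += 1

coverage = hits / REPS
print(f"share of intervals containing the true mean = {coverage:.3f}")
```

Each single interval either contains the true mean or does not; roughly 95% of them do across the whole run.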
9.8.2 Confidence Interval for μ: Normally Distributed
Population and Known Standard Deviation

Remember that X̄ is an unbiased estimate of
μ and so it makes sense to put X̄ right in the
middle of the interval.

To find the lower bound we take X̄ and
subtract e, and to find the upper bound we
take X̄ and add the value of e.

Suppose we decide we want to construct
a 95% confidence interval (α = 0.05) for
the population mean, μ.

We also know that 95% of the values of a
normal distribution fall within two standard
deviations of the mean.

Finally, the standard deviation of the
sample mean, also known as the standard
error, is equal to σ/√n.

Bringing these three pieces of
information together tells us that e
should equal 2σ/√n.
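
Putting the pieces together, an approximate 95% interval is the sample mean plus or minus two standard errors. The sketch below uses hypothetical numbers and a function name chosen for this example.

```python
import math


def approx_95_interval(x_bar: float, sigma: float, n: int) -> tuple[float, float]:
    """Approximate 95% confidence interval for the mean with known sigma."""
    e = 2 * sigma / math.sqrt(n)   # two standard errors on either side
    return x_bar - e, x_bar + e


# Hypothetical sample: n = 25, sample mean 54.6 g, known sigma = 1.5 g.
lower, upper = approx_95_interval(x_bar=54.6, sigma=1.5, n=25)
print(f"approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```

The multiplier 2 is a rounded value; a more exact multiplier comes from the normal table.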

To get the correct value for Z, you must
use that procedure with a tail area
probability equal to α/2. Label this value
as Zα/2.

For a 95% confidence interval we divide
α = 0.05 by 2 and find that the area in one
of the tails is 0.025.
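
Instead of reading a printed table, the Z value for a given tail area can be looked up with the inverse normal CDF. This sketch uses `statistics.NormalDist.inv_cdf` from Python's standard library; the helper name is chosen for this example.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Z value that leaves alpha / 2 in each tail of the standard normal curve."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

For 95% confidence the result is about 1.96, close to the rounded value of 2 used in the empirical rule.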