Transcript C12
Chapter 12
Inference About A Population
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
Inference With Variance Unknown…
Previously, we looked at estimating and testing the
population mean when the population standard deviation (σ)
was known or given:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \qquad \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
But in general we do not know the actual population
standard deviation and have to estimate it from the data.
The minute we do this, the “Z-statistic” used in all formulas
changes to a “t-statistic” [Student t-statistic], provided you
are sampling from a normal distribution.
Inference With Variance Unknown…
When σ is unknown, we use its point estimator s in its place:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Note that the t statistic has one parameter called “degrees of
freedom”. [the normal has 2 parameters, etc.]
The degrees of freedom for the single-mean problems we are
working on are given by d.f. = ν = n–1.
Testing μ when σ is unknown…
When the population standard deviation is unknown and the
population is normal, the test statistic for testing hypotheses
about μ is:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
which is Student t distributed with ν = n–1 degrees of
freedom. The confidence interval estimator of μ
is given by:
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Example 12.1…
IDENTIFY
In a clinical trial, if the average time for a drug to take effect
is greater than 450 minutes, it is declared ineffective. Thus,
each new drug must be subjected to the following hypothesis
test. If the null hypothesis is rejected in favor of the
alternative hypothesis, the drug is not approved.
H0: μ ≤ 450
H1: μ > 450
In general we would use a 5% level of significance and in
this example we are going to randomly sample 50 patients.
Example 12.1…
COMPUTE
Our test statistic is:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
With n=50 data points, we have n–1=49 degrees of freedom.
Our hypothesis under question is:
H1: μ > 450
Our rejection region becomes:
$$t > t_{\alpha,\nu} = t_{0.05,49} \approx 1.677$$
Thus we will reject the null hypothesis in favor of the
alternative if our calculated test statistic falls in this region.
Example 12.1…
COMPUTE
From the data, we calculate
x̄ = 460.38, s = 38.83 and thus:
$$t = \frac{460.38 - 450}{38.83 / \sqrt{50}} \approx 1.89$$
Since t ≈ 1.89 > 1.677,
we reject H0 in favor of H1. That is, there is sufficient
evidence to conclude that the new drug is not effective.
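The arithmetic of Example 12.1 can be checked with a short script. This is a minimal sketch: the function name `t_statistic` is made up for illustration, and the critical value 1.677 is an assumption taken from a Student t table for 49 degrees of freedom rather than something computed here.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """One-sample t statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    return (x_bar - mu_0) / (s / math.sqrt(n))

# Sample values quoted in Example 12.1
x_bar, s, n = 460.38, 38.83, 50
mu_0 = 450.0  # value of the mean under the null hypothesis

# Assumption: upper-tail critical value t(0.05, 49) read from a t table
t_critical = 1.677

t_stat = t_statistic(x_bar, mu_0, s, n)
reject_h0 = t_stat > t_critical  # one-sided test, H1: mu > 450

print(f"t = {t_stat:.2f}, critical value = {t_critical}, reject H0: {reject_h0}")
```

Because the alternative is one-sided (greater than), only the upper tail is compared against the critical value.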
Example 12.2…
IDENTIFY
How do we estimate the mean time for a drug to take effect
when the standard deviation is unknown? Same problem but
different data.
Assume we take a random sample of n = 83 patients and
measure the time it takes for the drug to take effect. We want
to construct a 95% confidence interval for the mean time, i.e.
what is:
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Example 12.2…
COMPUTE
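From the data, we calculate x̄ and s; the interval quoted in the
INTERPRET step corresponds to x̄ = 15.02.
For this term, with ν = n–1 = 82 degrees of freedom:
$$t_{\alpha/2,\nu} = t_{0.025,82} \approx 1.99$$
and so:
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 15.02 \pm 1.82$$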
Example 12.2…
INTERPRET
We are 95% confident that the population mean, μ, i.e. the
mean time for the drug to become effective, lies between
13.20 minutes and 16.84 minutes.
If the sample size had been n = 21, what value of t would
you use?
If the sample size had been n = 5, what value of t would you
use?
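The t value depends only on the tail area and on the degrees of freedom n–1, so these questions can be answered from a t table or numerically. The sketch below uses only the Python standard library; the helper names `t_pdf`, `t_upper_tail` and `t_critical` are hypothetical, and the approach (Simpson's rule plus bisection) is one simple way to approximate the table, not the method used in the slides.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, integrating the density on [0, x] by Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Value t with P(T > t) = alpha, located by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid  # tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence means alpha/2 = 0.025 in each tail
for n in (83, 21, 5):
    print(f"n = {n:2d}, d.f. = {n - 1:2d}, t = {t_critical(0.025, n - 1):.3f}")
```

The smaller the sample, the larger the t value, and therefore the wider the interval for the same s.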
Check Requisite Conditions…
The Student t distribution is robust, which means that if the
population is nonnormal, the results of the t-test and
confidence interval estimate are still valid provided that the
population is “not extremely nonnormal”.
To check this requirement, draw a histogram of the data and
see how “bell shaped” the resulting figure is. If a histogram
is extremely skewed, that could be considered “extremely
nonnormal” and hence t-statistics would not be valid in
this case. There are formal statistical tests available to test
the hypothesis that your data come from a normal
distribution. It is always wise to check this, especially if the
sample size is small.
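As a rough illustration of this check, the sketch below prints a text histogram and computes the sample skewness. The data are hypothetical (made up purely for illustration), the function names are assumptions, and skewness is only an informal companion to looking at the histogram, not one of the formal normality tests mentioned above.

```python
import statistics

def sample_skewness(data):
    """Adjusted sample skewness; values near 0 suggest a roughly symmetric sample."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def text_histogram(data, bins=5):
    """Return one line per equal-width bin, with one '*' per observation."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        idx = min(int((x - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[idx] += 1
    return [f"{lo + i * width:7.1f} | {'*' * c}" for i, c in enumerate(counts)]

# Hypothetical sample of times in minutes, for illustration only
times = [431, 445, 452, 458, 460, 463, 467, 471, 476, 489]

for line in text_histogram(times):
    print(line)
print(f"skewness = {sample_skewness(times):.2f}")
```

A histogram with one long tail, or a skewness far from zero, is a signal to be cautious about the t-based results.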
Inference About Population Variance…
If we are interested in drawing inferences about a
population’s variability, the parameter we need to
investigate is the population variance, σ².
The sample variance (s²) is an unbiased, consistent and
efficient point estimator for σ². Moreover,
the statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
has a chi-squared distribution
with n–1 degrees of freedom (provided the population is normal).
Testing & Estimating Population Variance
The test statistic used to test hypotheses about σ² is:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
(which is chi-squared with ν = n–1 degrees of freedom).
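A small sketch of the chi-squared test statistic for a variance follows. All numbers are hypothetical and chosen only to illustrate the formula; the critical value 36.415 is an assumption read from a chi-squared table (upper tail 0.05, 24 degrees of freedom).

```python
def chi_squared_statistic(s_squared, sigma0_squared, n):
    """Test statistic (n - 1) * s^2 / sigma0^2 for hypotheses about a variance."""
    return (n - 1) * s_squared / sigma0_squared

# Hypothetical example: H0: sigma^2 = 1.0 versus H1: sigma^2 > 1.0
n = 25
s_squared = 1.5        # hypothetical sample variance
sigma0_squared = 1.0   # variance stated in the null hypothesis

# Assumption: chi-squared critical value for alpha = 0.05 and 24 d.f., from a table
chi2_critical = 36.415

chi2_stat = chi_squared_statistic(s_squared, sigma0_squared, n)
reject_h0 = chi2_stat > chi2_critical

print(f"chi-squared = {chi2_stat:.2f}, d.f. = {n - 1}, reject H0: {reject_h0}")
```

In this made-up case the statistic falls just short of the critical value, so H0 would not be rejected at the 5% level.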
Inference: Population Proportion…
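Test statistic for p:
$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
The confidence interval estimator for p is given by:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
(both of which require that np>5 and n(1–p)>5)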
If this condition is not satisfied we can still work the
problem with a different statistical approach.
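The two proportion formulas can be sketched with the standard library's `statistics.NormalDist`. The counts below are hypothetical, and the function names are assumptions introduced for illustration; the np > 5 and n(1–p) > 5 condition is checked before the statistic is used.

```python
import math
from statistics import NormalDist

def proportion_z_statistic(successes, n, p0):
    """z statistic for H0: p = p0, using the null value in the standard error."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Interval p_hat +/- z_(alpha/2) * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 220 successes in a sample of 400, testing H0: p = 0.5
successes, n, p0 = 220, 400, 0.5

condition_ok = n * p0 > 5 and n * (1 - p0) > 5
z_stat = proportion_z_statistic(successes, n, p0)
lower, upper = proportion_confidence_interval(successes, n)

print(f"condition satisfied: {condition_ok}")
print(f"z = {z_stat:.2f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Note that the test statistic uses the hypothesized p in the standard error, while the interval uses the sample proportion.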
Determine the Sample Size necessary to estimate the population proportion
within ± B with 95% confidence…
Two methods – in each case we choose a value for p̂ then
solve the equation for n:
$$n = \left( \frac{z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})}}{B} \right)^2$$
Method 1 : no knowledge of even a rough value of p̂. This
is a ‘worst case scenario’ so we substitute p̂ = .50
Method 2 : we have some idea about the value of p̂. This is
a better scenario and we substitute in our estimated value.
Selecting the Sample Size…
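Method 1 : no knowledge of value of p̂, use 50%:
$$n = \left( \frac{1.96 \sqrt{(0.50)(0.50)}}{B} \right)^2 = \frac{0.9604}{B^2}$$
Method 2 : some idea about a possible p̂ value, say 20%:
$$n = \left( \frac{1.96 \sqrt{(0.20)(0.80)}}{B} \right)^2 \approx \frac{0.6147}{B^2}$$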
Thus, we can sample fewer people if we already have a
reasonable estimate of the population proportion before
starting.
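The comparison between the two methods can be reproduced numerically. In this sketch the bound B = 0.03 is an assumed value chosen only for illustration (the transcript does not preserve the bound used on the slide), and `sample_size_for_proportion` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_guess, bound, confidence=0.95):
    """Smallest n with z_(alpha/2) * sqrt(p_guess * (1 - p_guess) / n) <= bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z * math.sqrt(p_guess * (1 - p_guess)) / bound) ** 2
    return math.ceil(n)  # round up so the bound is not exceeded

B = 0.03  # assumed bound, for illustration only

n_worst_case = sample_size_for_proportion(0.50, B)  # Method 1
n_informed = sample_size_for_proportion(0.20, B)    # Method 2

print(f"Method 1 (p = 0.50): n = {n_worst_case}")
print(f"Method 2 (p = 0.20): n = {n_informed}")
```

The product p̂(1–p̂) is largest at p̂ = 0.50, which is why Method 1 always gives the largest required sample.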