DevStat8e_09_05
9 Inferences Based on Two Samples
Copyright © Cengage Learning. All rights reserved.
9.5 Inferences Concerning Two Population Variances
Methods for comparing two population variances (or
standard deviations) are occasionally needed, though such
problems arise much less frequently than those involving
means or proportions.
For the case in which the populations under investigation
are normal, the procedures are based on a new family of
probability distributions.
The F Distribution
The F probability distribution has two parameters, denoted
by v1 and v2. The parameter v1 is called the number of
numerator degrees of freedom, and v2 is the number of
denominator degrees of freedom; here v1 and v2 are
positive integers.
A random variable that has an F distribution cannot assume
a negative value. Since the density function is complicated
and will not be used explicitly, we omit the formula.
There is an important connection between an F variable
and chi-squared variables.
If X1 and X2 are independent chi-squared rv’s with v1 and v2
df, respectively, then the rv
$$F = \frac{X_1/v_1}{X_2/v_2} \qquad (9.8)$$
(the ratio of the two chi-squared variables divided by their
respective degrees of freedom) can be shown to have an F
distribution.
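The construction in (9.8) can be illustrated by simulation. The following is a minimal sketch using only the Python standard library; the helper name `simulate_f`, the seed, and the number of draws are illustrative choices, not part of the text. It relies on the fact that a chi-squared rv with v df is a gamma rv with shape v/2 and scale 2.

```python
import random

def simulate_f(v1, v2, rng):
    """Draw one value of (X1/v1)/(X2/v2), with X1, X2 independent chi-squared."""
    # Chi-squared with v df == gamma with shape v/2 and scale 2.
    x1 = rng.gammavariate(v1 / 2, 2)
    x2 = rng.gammavariate(v2 / 2, 2)
    return (x1 / v1) / (x2 / v2)

rng = random.Random(0)          # fixed seed so the run is repeatable
v1, v2 = 6, 10
draws = [simulate_f(v1, v2, rng) for _ in range(100_000)]

# An F variable cannot be negative.
smallest = min(draws)

# Share of simulated values beyond 3.22; for v1 = 6 and v2 = 10 this
# should be close to .05 if the ratio really follows the F distribution.
upper_tail = sum(d > 3.22 for d in draws) / len(draws)
print(smallest >= 0, round(upper_tail, 3))
```

The simulated upper-tail share is only an approximation, so it will match a tabulated area to about two decimal places, not exactly.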
Figure 9.8 illustrates the graph of a typical F density
function.
Figure 9.8: An F density curve and critical value
Analogous to the notation $t_{\alpha,v}$ and $\chi^2_{\alpha,v}$, we use
$F_{\alpha,v_1,v_2}$ for the value on the horizontal axis that captures
$\alpha$ of the area under the F density curve with v1 and v2 df in the
upper tail.
The density curve is not symmetric, so it would seem that
both upper- and lower-tail critical values must be tabulated.
This is not necessary, though, because of the fact that
$$F_{1-\alpha,v_1,v_2} = 1/F_{\alpha,v_2,v_1}$$
Appendix Table A.9 gives $F_{\alpha,v_1,v_2}$ for $\alpha$ = .10, .05, .01, and
.001, and various values of v1 (in different columns of the
table) and v2 (in different groups of rows of the table).
For example, F.05,6,10 = 3.22 and F.05,10,6 = 4.06. The critical
value F.95,6,10 , which captures .95 of the area to its right
(and thus .05 to the left) under the F curve with v1 = 6 and
v2 = 10, is F.95,6,10 = 1/F.05,10,6 = 1/4.06 = .246.
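The reciprocal relation used in this example can be justified in one step from the chi-squared representation of an F variable: if F is a ratio of scaled chi-squared variables with v1 and v2 df, then 1/F is the same kind of ratio with the df interchanged, so 1/F has an F distribution with v2 and v1 df. A short derivation, written as a sketch rather than a full proof:

```latex
\begin{align*}
\alpha &= P\left(F \le F_{1-\alpha,v_1,v_2}\right)
        = P\left(\frac{1}{F} \ge \frac{1}{F_{1-\alpha,v_1,v_2}}\right), \\
\text{so}\quad \frac{1}{F_{1-\alpha,v_1,v_2}} &= F_{\alpha,v_2,v_1}
\quad\Longleftrightarrow\quad
F_{1-\alpha,v_1,v_2} = \frac{1}{F_{\alpha,v_2,v_1}} .
\end{align*}
```

The second line holds because the value that captures upper-tail area $\alpha$ for 1/F is, by definition, the critical value of the F distribution with v2 numerator and v1 denominator df.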
The F Test for Equality of Variances
A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$
is based on the following result.
Theorem
Let X1,…, Xm be a random sample from a normal
distribution with variance $\sigma_1^2$, let Y1,…, Yn be another
random sample (independent of the Xi’s) from a normal
distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the
two sample variances. Then the rv
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (9.9)$$
has an F distribution with v1 = m – 1 and v2 = n – 1.
This theorem results from combining (9.8) with the fact that
the variables $(m-1)S_1^2/\sigma_1^2$ and $(n-1)S_2^2/\sigma_2^2$ each have a
chi-squared distribution with m – 1 and n – 1 df,
respectively.
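The substitution behind this statement can be written out explicitly. Taking $X_1 = (m-1)S_1^2/\sigma_1^2$ with $v_1 = m-1$ and $X_2 = (n-1)S_2^2/\sigma_2^2$ with $v_2 = n-1$ in the ratio of scaled chi-squared variables (9.8), the factors $m-1$ and $n-1$ cancel:

```latex
\begin{align*}
F = \frac{X_1/v_1}{X_2/v_2}
  = \frac{\dfrac{(m-1)S_1^2/\sigma_1^2}{m-1}}{\dfrac{(n-1)S_2^2/\sigma_2^2}{n-1}}
  = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} .
\end{align*}
```

The independence of the two samples is what makes $X_1$ and $X_2$ independent, as the F construction requires.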
Because F involves a ratio rather than a difference, the test
statistic is the ratio of sample variances.
The claim that $\sigma_1^2 = \sigma_2^2$ is then rejected if the ratio differs by
too much from 1.
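Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$
Test statistic value: $f = s_1^2/s_2^2$
Alternative Hypothesis          Rejection Region for a Level $\alpha$ Test
$H_a: \sigma_1^2 > \sigma_2^2$          $f \ge F_{\alpha,m-1,n-1}$
$H_a: \sigma_1^2 < \sigma_2^2$          $f \le F_{1-\alpha,m-1,n-1}$
$H_a: \sigma_1^2 \ne \sigma_2^2$          either $f \ge F_{\alpha/2,m-1,n-1}$ or $f \le F_{1-\alpha/2,m-1,n-1}$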
Since critical values are tabled only for $\alpha$ = .10, .05, .01,
and .001, the two-tailed test can be performed only at
levels .20, .10, .02, and .002. Other F critical values can be
obtained from statistical software.
Example 14
On the basis of data reported in the article “Serum Ferritin in
an Elderly Population” (J. of Gerontology, 1979:
521–524), the authors concluded that the ferritin distribution
in the elderly had a smaller variance than in the younger
adults. (Serum ferritin is used in diagnosing iron deficiency.)
For a sample of 28 elderly men, the sample standard
deviation of serum ferritin (mg/L) was s1 = 52.6; for 26 young
men, the sample standard deviation was s2 = 84.2.
Does this data support the conclusion as applied to men?
Let $\sigma_1^2$ and $\sigma_2^2$ denote the variances of the serum ferritin
distributions for elderly men and young men, respectively.
The hypotheses of interest are $H_0: \sigma_1^2 = \sigma_2^2$ versus
$H_a: \sigma_1^2 < \sigma_2^2$.
At level .01, H0 will be rejected if f ≤ F.99,27,25. To obtain the
critical value, we need F.01,25,27. From Appendix Table A.9,
F.01,25,27 = 2.54, so F.99,27,25 = 1/2.54 = .394.
The computed value of F is (52.6)^2/(84.2)^2 = .390. Since
.390 ≤ .394, H0 is rejected at level .01 in favor of Ha, so
variability does appear to be greater in young men than in
elderly men.
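The arithmetic of this example can be checked in a few lines. This is a sketch only: the variable names are illustrative, and the critical value 2.54 is copied from the table lookup quoted in the example, not computed.

```python
# Summary statistics from Example 14
s1, m = 52.6, 28    # elderly men: sample sd and sample size
s2, n = 84.2, 26    # young men: sample sd and sample size

f = s1**2 / s2**2           # test statistic value f = s1^2 / s2^2

f_01_25_27 = 2.54           # F.01,25,27 as read from Appendix Table A.9
crit = 1 / f_01_25_27       # F.99,27,25 via the reciprocal relation

# Lower-tailed test: reject H0 when f is at or below the lower critical value.
reject_h0 = f <= crit
print(round(f, 3), round(crit, 3), reject_h0)
```

Note that the df of the tabled value are interchanged (25 and 27) relative to the df of the test statistic (27 and 25), which is easy to get backwards.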
P-Values for F Tests
Recall that the P-value for an upper-tailed t test is the
area under the relevant t curve (the one with appropriate
df) to the right of the calculated t.
In the same way, the P-value for an upper-tailed F test is
the area under the F curve with appropriate numerator and
denominator df to the right of the calculated f.
Figure 9.9 illustrates this for a test based on v1 = 4 and
v2 = 6.
Figure 9.9: A P-value for an upper-tailed F test
Tabulation of F-curve upper-tail areas is much more
cumbersome than for t curves because two df’s are
involved.
For each combination of v1 and v2, our F table gives only
the four critical values that capture areas .10, .05, .01,
and .001.
Figure 9.10 shows what can be said about the P-value
depending on where f falls relative to the four critical
values.
Figure 9.10: Obtaining P-value information from the F table for an upper-tailed F test
For example, for a test with v1 = 4 and v2 = 6,
f = 5.70:   .01 < P-value < .05
f = 2.16:   P-value > .10
f = 25.03:  P-value < .001
Only if f equals a tabulated value do we obtain an exact
P-value (e.g., if f = 4.53, then P-value = .05).
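The table lookup just described can be sketched as a small function. The function name `p_value_bounds` is hypothetical. Of the four critical values for v1 = 4 and v2 = 6, only 4.53 is stated in the text; the other three (3.18, 9.15, 21.92) are entered here on the assumption that they match Appendix Table A.9 and should be checked against it.

```python
def p_value_bounds(f, crit):
    """Bracket an upper-tailed P-value using tabulated critical values.

    crit maps each tail area alpha to its critical value. Returns
    (lower, upper) bounds on the P-value; None marks a side the table
    cannot bound.
    """
    lower, upper = None, None
    for alpha in sorted(crit):          # smallest tail area first
        if f == crit[alpha]:
            return alpha, alpha         # exact P-value at a tabulated point
        if f > crit[alpha]:
            upper = alpha               # f lies beyond this value: P-value < alpha
            break
        lower = alpha                   # f falls short of it: P-value > alpha
    return lower, upper

# Assumed table entries for v1 = 4, v2 = 6 (only 4.53 is quoted in the text).
crit_4_6 = {0.10: 3.18, 0.05: 4.53, 0.01: 9.15, 0.001: 21.92}

for f in (5.70, 2.16, 25.03, 4.53):
    print(f, p_value_bounds(f, crit_4_6))
```

For a two-tailed test, each bound returned by this sketch would be doubled.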
Once we know that .01 < P-value < .05, H0 would be
rejected at a significance level of .05 but not at a level
of .01.
When P-value < .001, H0 should be rejected at any
reasonable significance level.
The F tests discussed in succeeding chapters will all be
upper-tailed. If, however, a lower-tailed F test is
appropriate, then lower-tailed critical values should be
obtained as described earlier so that a bound or bounds on
the P-value can be established.
In the case of a two-tailed test, the bound or bounds from a
one-tailed test should be multiplied by 2. For example, if
f = 5.82 when v1 = 4 and v2 = 6, then since 5.82 falls
between the .05 and .01 critical values,
2(.01) < P-value < 2(.05), giving .02 < P-value < .10.
H0 would then be rejected if $\alpha$ = .10 but not if $\alpha$ = .01. In this
case, we cannot say from our table what conclusion is
appropriate when $\alpha$ = .05 (since we don’t know whether the
P-value is smaller or larger than this).
However, statistical software shows that the area to the
right of 5.82 under this F curve is .029, so the P-value is
.058 and the null hypothesis should therefore not be
rejected at level .05 (.058 is the smallest $\alpha$ for which H0 can
be rejected and our chosen $\alpha$ is smaller than this).
Various statistical software packages will, of course,
provide an exact P-value for any F test.
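As one illustration of what such software does, the sketch below computes an F upper-tail area with the Python standard library alone. It is not the method of any particular package: it uses the standard identity that the area to the right of f equals the regularized incomplete beta function $I_x(v_2/2, v_1/2)$ at $x = v_2/(v_2 + v_1 f)$, evaluated with a textbook continued-fraction expansion. All function names are illustrative.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction for step m.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = c * d
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_upper_tail(f, v1, v2):
    """Area to the right of f under the F curve with v1 and v2 df."""
    if f <= 0.0:
        return 1.0
    return reg_inc_beta(v2 / 2, v1 / 2, v2 / (v2 + v1 * f))

one_tail = f_upper_tail(5.82, 4, 6)   # upper-tail area for the example above
two_tail = 2 * one_tail               # two-tailed P-value doubles the one-tailed area
print(round(one_tail, 3), round(two_tail, 3))
```

With v1 = 4 and v2 = 6 and f = 5.82, this should agree to three decimal places with the areas .029 and .058 quoted above.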
A Confidence Interval for σ1/σ2
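The CI for $\sigma_1^2/\sigma_2^2$ is based on replacing F in the probability
statement
$$P\left(F_{1-\alpha/2,v_1,v_2} < F < F_{\alpha/2,v_1,v_2}\right) = 1 - \alpha$$
by the F variable (9.9) and manipulating the inequalities to
isolate $\sigma_1^2/\sigma_2^2$. An interval for $\sigma_1/\sigma_2$ results from taking the
square root of each limit.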
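The manipulation can be sketched as follows, with $v_1 = m - 1$ and $v_2 = n - 1$. This is a worked outline under the normality assumption of the theorem, not a quotation of the text's final formula:

```latex
\begin{align*}
1-\alpha
 &= P\left(F_{1-\alpha/2,v_1,v_2}
      < \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}
      < F_{\alpha/2,v_1,v_2}\right) \\
 &= P\left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2,v_1,v_2}}
      < \frac{\sigma_1^2}{\sigma_2^2}
      < \frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{1-\alpha/2,v_1,v_2}}\right).
\end{align*}
```

Substituting the observed $s_1^2$ and $s_2^2$ gives the limits of a $100(1-\alpha)\%$ interval for $\sigma_1^2/\sigma_2^2$. Because $1/F_{1-\alpha/2,v_1,v_2} = F_{\alpha/2,v_2,v_1}$, the upper limit can be computed from upper-tail critical values alone.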