Ch10-Sec10.1

Chapter 10
Chi-Square Tests and the F-Distribution
Chapter Outline
• 10.1 Goodness of Fit
• 10.2 Independence
• 10.3 Comparing Two Variances
• 10.4 Analysis of Variance
Section 10.1
Goodness of Fit
Section 10.1 Objectives
• Use the chi-square distribution to test whether a frequency
distribution fits a claimed distribution
Properties of the Chi-Square Distribution
1. All chi-square values χ² are greater than or equal to zero.
2. The chi-square distribution is a family of curves, each
determined by the degrees of freedom. To find the critical
values, use the χ²-distribution with degrees of freedom
equal to one less than the sample size: d.f. = n – 1.
3. The area under each curve of the chi-square distribution
equals one.
4. Chi-square distributions are positively skewed.
[Figure: chi-square distributions]
Finding Critical Values for the χ²-Test
1. Specify the level of significance α.
2. Determine the degrees of freedom d.f. = n – 1.
3. The critical values for the χ²-distribution are found in Table 6 of
Appendix B. To find the critical value(s) for a
   a. right-tailed test, use the value that corresponds to d.f. and α.
   b. left-tailed test, use the value that corresponds to d.f. and 1 – α.
   c. two-tailed test, use the values that correspond to d.f. and ½α and
      d.f. and 1 – ½α.
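[Figure: critical values for the χ²-test. Right-tailed: area α to the
right of χ₀² and area 1 – α to its left. Left-tailed: area α to the left
of χ₀² and area 1 – α to its right. Two-tailed: area ½α to the left of
χ²_L, area 1 – α between χ²_L and χ²_R, and area ½α to the right of χ²_R.]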
Example: Finding Critical Values for χ²
Find the critical χ²-value for a left-tailed test when
n = 11 and α = 0.01.
Solution:
• Degrees of freedom: n – 1 = 11 – 1 = 10 d.f.
• The area to the right of the critical value is
1 – α = 1 – 0.01 = 0.99.
From Table 6, the critical value is χ₀² = 2.558.
[Figure: χ²-curve with area 0.01 to the left of χ₀² = 2.558]
Example: Finding Critical Values for χ²
Find the critical χ²-values for a two-tailed test when n = 13 and
α = 0.01.
Solution:
• Degrees of freedom: n – 1 = 13 – 1 = 12 d.f.
• The areas to the right of the critical values are
½α = 0.005 (for χ²_R) and 1 – ½α = 0.995 (for χ²_L).
From Table 6, the critical values are χ²_L = 3.074 and χ²_R = 28.299.
[Figure: χ²-curve with area 0.005 to the left of χ²_L = 3.074 and
area 0.005 to the right of χ²_R = 28.299]
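The two table lookups above can be cross-checked numerically. The sketch below is not part of the original slides: it assumes the standard closed-form expressions for the right-tail area of a chi-square distribution with whole-number degrees of freedom, and the helper names chi2_right_tail_area and chi2_critical_value are made up for illustration. The critical value is found by bisection on the right-tail area, mirroring how Table 6 is indexed.

```python
import math


def chi2_right_tail_area(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even d.f.: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd d.f.: a complementary-error-function term plus a finite sum
    # whose j-th term is x^j / (1 * 3 * 5 * ... * (2j + 1)).
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)


def chi2_critical_value(area_right, df):
    """Find x such that the area to the right of x equals area_right (bisection)."""
    lo, hi = 0.0, 1000.0  # assumed search bracket, wide enough for table-sized d.f.
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_right_tail_area(mid, df) > area_right:
            lo = mid  # still too much area to the right, so move right
        else:
            hi = mid
    return (lo + hi) / 2


# Left-tailed example: n = 11, alpha = 0.01, so d.f. = 10 and right area = 0.99
print(chi2_critical_value(0.99, 10))
# Two-tailed example: n = 13, alpha = 0.01, so d.f. = 12
print(chi2_critical_value(0.995, 12), chi2_critical_value(0.005, 12))
```

The printed values should agree with the Table 6 entries up to rounding in the last digit.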
Multinomial Experiments
Multinomial experiment
• A probability experiment consisting of a fixed number of
trials in which there are more than two possible outcomes
for each independent trial.
• A binomial experiment had only two possible outcomes.
• The probability for each outcome is fixed and each outcome
is classified into categories.
Example:
• A radio station claims that the distribution of music preferences
for listeners in the broadcast region is as shown below.

Distribution of music preferences
Classical     4%
Country      36%
Gospel       11%
Oldies        2%
Pop          18%
Rock         29%

Each outcome is classified into categories. The probability for
each possible outcome is fixed.
Chi-Square Goodness-of-Fit Test
• Used to test whether a frequency distribution fits an
expected distribution.
• The null hypothesis states that the frequency distribution fits
the specified distribution.
• The alternative hypothesis states that the frequency
distribution does not fit the specified distribution.
Example:
• To test the radio station’s claim, a marketing executive can perform a
chi-square goodness-of-fit test using the following hypotheses.
H₀: The distribution of music preferences in the
broadcast region is 4% classical, 36% country,
11% gospel, 2% oldies, 18% pop, and 29% rock.
(claim)
Hₐ: The distribution of music preferences differs from
the claimed or expected distribution.
• To calculate the test statistic for the chi-square goodness-of-fit
test, the observed frequencies and the expected frequencies are used.
• The observed frequency O of a category is the frequency
for the category observed in the sample data.
• The expected frequency E of a category is the calculated
frequency for the category.
• Expected frequencies are obtained assuming the specified (or
hypothesized) distribution. The expected frequency for the ith
category is
Eᵢ = npᵢ
where n is the number of trials (the sample size) and pᵢ is the
assumed probability of the ith category.
Example: Finding Observed and Expected Frequencies
A marketing executive randomly selects
500 radio music listeners from the
broadcast region and asks each whether he
or she prefers classical, country, gospel,
oldies, pop, or rock music. The results are
shown below. Find the observed
frequencies and the expected frequencies
for each type of music.

Survey results (n = 500)
Classical      8
Country      210
Gospel        72
Oldies        10
Pop           75
Rock         125
Solution: Finding Observed and Expected Frequencies
Observed frequency: The number of radio music listeners
naming a particular type of music. These are the counts in the
survey results (n = 500): classical 8, country 210, gospel 72,
oldies 10, pop 75, and rock 125.
Expected frequency: Eᵢ = npᵢ, with n = 500

Type of music   % of listeners   Observed frequency   Expected frequency
Classical             4%                  8           500(0.04) = 20
Country              36%                210           500(0.36) = 180
Gospel               11%                 72           500(0.11) = 55
Oldies                2%                 10           500(0.02) = 10
Pop                  18%                 75           500(0.18) = 90
Rock                 29%                125           500(0.29) = 145
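The expected-frequency column can be reproduced in a few lines. This is a minimal sketch rather than anything from the slides: the function name expected_frequencies is hypothetical, and the claimed probabilities are passed as whole-number percentages so that the arithmetic stays exact.

```python
def expected_frequencies(n, percentages):
    """Return E_i = n * p_i for each category, with p_i given in percent."""
    return {category: n * pct / 100 for category, pct in percentages.items()}


# Claimed distribution from the radio station example
claimed_percentages = {
    "Classical": 4, "Country": 36, "Gospel": 11,
    "Oldies": 2, "Pop": 18, "Rock": 29,
}
# A claimed distribution should account for every listener
assert sum(claimed_percentages.values()) == 100

expected = expected_frequencies(500, claimed_percentages)
for category, e in expected.items():
    print(f"{category:<10}{e:>8.1f}")
```

Because the percentages add up to 100, the expected frequencies add up to the sample size, just as the observed frequencies do.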
Chi-Square Goodness-of-Fit Test
For the chi-square goodness-of-fit test to be used, the following
must be true.
1. The observed frequencies must be obtained by using a
random sample.
2. Each expected frequency must be greater than or equal to 5.
• If these conditions are satisfied, then the sampling
distribution for the goodness-of-fit test is approximated by a
chi-square distribution with k – 1 degrees of freedom, where
k is the number of categories.
• The test statistic for the chi-square goodness-of-fit test is
χ² = Σ (O – E)² / E
where O represents the observed frequency of each category
and E represents the expected frequency of each category.
• The test is always a right-tailed test.
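As a rough illustration of the formula, the sketch below computes χ² from paired lists of observed and expected frequencies and refuses to run when an expected frequency falls below 5. The function name chi_square_statistic is an assumption made for this example, and the random-sample condition cannot be checked in code. The data are the radio survey counts and the expected frequencies found earlier.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if min(expected) < 5:
        raise ValueError("each expected frequency must be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Radio survey: classical, country, gospel, oldies, pop, rock
observed = [8, 210, 72, 10, 75, 125]
expected = [20, 180, 55, 10, 90, 145]

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1, with k categories

print(f"chi-square = {chi2:.3f}, d.f. = {degrees_of_freedom}")
```

Every term in the sum is nonnegative, so a poor fit can only push χ² upward. That is why only large values of χ² count against the null hypothesis.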
Chi-Square Goodness-of-Fit Test
In Words (In Symbols)
1. Identify the claim. State the null and alternative
hypotheses. (State H₀ and Hₐ.)
2. Specify the level of significance. (Identify α.)
3. Identify the degrees of freedom. (d.f. = k – 1)
4. Determine the critical value. (Use Table 6 in Appendix B.)
5. Determine the rejection region.
6. Calculate the test statistic. (χ² = Σ (O – E)² / E)
7. Make a decision to reject or fail to reject the null
hypothesis. (If χ² is in the rejection region, reject H₀.
Otherwise, fail to reject H₀.)
8. Interpret the decision in the context of the original claim.
Example: Performing a Goodness-of-Fit Test
Use the music preference data to perform a chi-square goodness-of-fit
test to test whether the distributions are different. Use α = 0.01.

Type of music   Distribution of music preferences   Survey results (n = 500)
Classical                      4%                              8
Country                       36%                            210
Gospel                        11%                             72
Oldies                         2%                             10
Pop                           18%                             75
Rock                          29%                            125
Solution: Performing a Goodness-of-Fit Test
• H₀: music preference is 4% classical, 36% country,
11% gospel, 2% oldies, 18% pop, and 29% rock
• Hₐ: music preference differs from the claimed or
expected distribution
• α = 0.01
• d.f. = 6 – 1 = 5
• Rejection region: χ² > 15.086 (area 0.01 in the right tail)
Type of music   Observed frequency   Expected frequency
Classical               8                    20
Country               210                   180
Gospel                 72                    55
Oldies                 10                    10
Pop                    75                    90
Rock                  125                   145

χ² = Σ (O – E)² / E
   = (8 – 20)²/20 + (210 – 180)²/180 + (72 – 55)²/55
     + (10 – 10)²/10 + (75 – 90)²/90 + (125 – 145)²/145
   ≈ 22.713
• Rejection region: χ² > 15.086 (α = 0.01, d.f. = 6 – 1 = 5)
• Test statistic: χ² = 22.713
• Decision: Reject H₀, because 22.713 > 15.086 lies in the
rejection region.
• Conclusion: There is enough evidence at the 1% level of
significance to conclude that the distribution of music
preferences differs from the claimed distribution.
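The whole procedure, from claimed percentages and survey counts to a decision, fits in one short function. The sketch below is only an illustration: the name goodness_of_fit_test is hypothetical, and the critical value is passed in by hand from Table 6 rather than computed.

```python
def goodness_of_fit_test(observed, claimed_percentages, critical_value):
    """Return (chi-square statistic, True if H0 is rejected).

    The test is right-tailed, so H0 is rejected when the statistic
    exceeds the critical value for d.f. = k - 1.
    """
    n = sum(observed)
    expected = [n * pct / 100 for pct in claimed_percentages]  # E_i = n * p_i
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2 > critical_value


# Radio example: classical, country, gospel, oldies, pop, rock
survey_counts = [8, 210, 72, 10, 75, 125]
claimed = [4, 36, 11, 2, 18, 29]

# 15.086 is the Table 6 value for d.f. = 5 and alpha = 0.01
chi2, reject_h0 = goodness_of_fit_test(survey_counts, claimed, 15.086)
print(f"chi-square = {chi2:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Rerunning the function with a different level of significance changes only the critical value. The test statistic depends on the data and the claim alone.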
Example: Performing a Goodness-of-Fit Test
The manufacturer of M&M’s candies claims that the number of
different-colored candies in bags of dark chocolate M&M’s is
uniformly distributed. To test this claim, you randomly select a
bag that contains 500 dark chocolate M&M’s. The results are
shown in the table below. Using α = 0.10, perform a
chi-square goodness-of-fit test to test the claimed or expected
distribution. What can you conclude? (Adapted from Mars
Incorporated)

Color     Frequency
Brown         80
Yellow        95
Red           88
Blue          83
Orange        76
Green         78
n = 500

Solution:
• The claim is that the distribution
is uniform, so the expected
frequencies of the colors are
equal.
• To find each expected
frequency, divide the sample size
by the number of colors.
• E = 500/6 ≈ 83.3
Solution: Performing a Goodness-of-Fit Test
• H₀: Distribution of different-colored candies in bags
of dark chocolate M&M’s is uniform (claim)
• Hₐ: Distribution of different-colored candies in bags
of dark chocolate M&M’s is not uniform
• α = 0.10
• d.f. = 6 – 1 = 5
• Rejection region: χ² > 9.236 (area 0.10 in the right tail)
Color     Observed frequency   Expected frequency
Brown             80                  83.3
Yellow            95                  83.3
Red               88                  83.3
Blue              83                  83.3
Orange            76                  83.3
Green             78                  83.3

χ² = Σ (O – E)² / E
   = (80 – 83.3)²/83.3 + (95 – 83.3)²/83.3 + (88 – 83.3)²/83.3
     + (83 – 83.3)²/83.3 + (76 – 83.3)²/83.3 + (78 – 83.3)²/83.3
   ≈ 3.016 (using the unrounded value E = 500/6)
• Rejection region: χ² > 9.236 (α = 0.10, d.f. = 6 – 1 = 5)
• Test statistic: χ² = 3.016
• Decision: Fail to reject H₀, because 3.016 < 9.236 is not in
the rejection region.
• Conclusion: There is not enough evidence at the 10% level of
significance to dispute the claim that the distribution is uniform.
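For a uniform claim every expected frequency is n/k, so the computation shortens. The sketch below (the helper name uniform_chi_square is made up for illustration) also shows why the expected frequency should be kept unrounded until the end: 500/6 gives 3.016, while the rounded value 83.3 gives about 3.017.

```python
def uniform_chi_square(observed, expected_each):
    """Chi-square statistic when every category has the same expected frequency."""
    return sum((o - expected_each) ** 2 / expected_each for o in observed)


# M&M's example: brown, yellow, red, blue, orange, green
observed = [80, 95, 88, 83, 76, 78]
n, k = sum(observed), len(observed)

exact = uniform_chi_square(observed, n / k)   # E = 500/6, unrounded
rounded = uniform_chi_square(observed, 83.3)  # E rounded to one decimal place

print(f"unrounded E: {exact:.3f}")
print(f"rounded E:   {rounded:.3f}")

# 9.236 is the Table 6 value for d.f. = 5 and alpha = 0.10
print("Reject H0" if exact > 9.236 else "Fail to reject H0")
```

Either way the statistic is far below 9.236, so the rounding does not affect the decision in this example.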
Section 10.1 Summary
• Used the chi-square distribution to test whether a frequency
distribution fits a claimed distribution