Transcript and... 15

Copyright © 2004 McGraw-Hill Ryerson Limited. All rights reserved.
When you have completed this chapter, you will be able to:
Understand the nature and role of the
chi-square distribution
Identify a wide variety of uses of
the chi-square distribution
Conduct a test of hypothesis comparing an
observed frequency distribution to an
expected frequency distribution
Conduct a test of hypothesis for normality
using the chi-square distribution
Conduct a hypothesis test to determine
whether two attributes are independent
Characteristics of the
Chi-Square Distribution
… it is positively skewed
… it is non-negative
… it is based on degrees of freedom
…when the degrees of freedom change
a new distribution is created
…e.g.
[Figure: chi-square distribution curves for df = 3, df = 5 and df = 10, plotted against χ²]
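How the shape changes with the degrees of freedom can be checked numerically. The sketch below is an illustration and not part of the original slides: the helper name `chi2_pdf` is hypothetical, it uses only Python's standard library, and it is meant for the three df values in the figure (all greater than 2).

```python
import math

def chi2_pdf(x, df):
    """Chi-square density at x (hypothetical helper, intended for df > 2)."""
    if x <= 0:
        return 0.0  # the distribution is non-negative
    half = df / 2.0
    log_density = ((half - 1) * math.log(x) - x / 2.0
                   - half * math.log(2) - math.lgamma(half))
    return math.exp(log_density)

for df in (3, 5, 10):
    mode = df - 2                  # location of the peak when df > 2
    skewness = math.sqrt(8 / df)   # positive, and smaller as df grows
    print(f"df={df:2d}  peak at x={mode}  "
          f"height={chi2_pdf(mode, df):.4f}  skewness={skewness:.3f}")
```

Each df value gives a different curve: the peak moves to the right and the positive skew shrinks as df increases.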
Goodness-of-Fit Test:
Equal Expected Frequencies
Let fo and fe be the observed and expected
frequencies, respectively
H0: There is no difference between the
observed and expected frequencies
H1: There is a difference between the
observed and the expected frequencies
… the test statistic is:
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
… the critical value is a chi-square value with
(k-1) degrees of freedom,
where k is the number of categories
The following information shows the number of
employees absent by day of the week
at a large manufacturing plant.

Day          Frequency
Monday          120
Tuesday          45
Wednesday        60
Thursday         90
Friday          130
Total           445

At the .05 level of significance, is there a difference
in the absence rate by day of the week?
Hypothesis Test
Step 1
H0: There is no difference in absence rate by
day of the week, so the expected frequency for each day is
(120+45+60+90+130)/5 = 89
H1: Absence rates by day are not all equal
Step 2
α = 0.05
Step 3
Use Chi-Square test
Step 4
Degrees of freedom (5-1) = 4
Reject H0 if χ² > 9.488. (see Appendix I)
Using the Table…
Degrees of freedom: 5 – 1 = 4
Right-tail area: α = 0.05
Reject H0 if χ² > 9.488
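The table lookup can be cross-checked in code. The following is a rough sketch, not the textbook's method: `chi2_cdf` and `chi2_critical` are hypothetical helper names, the cumulative probability is computed from the standard series for the regularized incomplete gamma function, and the critical value is found by bisection, all with Python's standard library.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square <= x) via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))

def chi2_critical(alpha, df):
    """Value whose right-tail area equals alpha, located by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:   # widen until the tail is small enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(chi2_critical(0.05, 4), 3))   # compare with 9.488 from the table
```

The same helper can be used to check the other critical values quoted in this chapter (7.815 for df = 3 and 5.991 for df = 2).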
Step 5
Test Statistic
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

Day          Frequency (fo)   Expected (fe)   (fo – fe)²/fe
Monday            120               89             10.80
Tuesday            45               89             21.75
Wednesday          60               89              9.45
Thursday           90               89              0.01
Friday            130               89             18.89
Total             445              445             60.90

For Monday: (120-89)²/89 = 10.80
χ² = 60.90
Reject H0 if χ² > 9.488
Since 60.90 > 9.488, reject the null hypothesis.
Absentee rates are not the same for each day of the week.
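The arithmetic for this example can be reproduced in a few lines. This is a minimal sketch with the slide's figures; the variable names are illustrative, and the critical value 9.488 is taken from the table lookup above rather than computed.

```python
# Observed absences by day, from the slide
observed = {"Monday": 120, "Tuesday": 45, "Wednesday": 60,
            "Thursday": 90, "Friday": 130}

# Equal expected frequencies: total absences spread evenly over the days
expected_each = sum(observed.values()) / len(observed)   # 445 / 5 = 89

# Chi-square statistic: sum of (fo - fe)^2 / fe over the categories
chi_square = sum((fo - expected_each) ** 2 / expected_each
                 for fo in observed.values())

critical_value = 9.488          # table value for df = 4, alpha = 0.05
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, reject H0: {reject_h0}")
```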
A U.S. Bureau of the Census indicated that…
Married                      63.9%
Widowed                       7.7%
Divorced (not re-married)     6.9%
Single (never married)       21.5%
A sample of 500 adults from the Philadelphia area showed:
Married     310
Widowed      40
Divorced     30
Single      120
At the .05 significance level, can we conclude that the
Philadelphia area is different from the U.S. as a whole?
… continued
Status      Observed (fo)   Expected (fe)   (fo – fe)²/fe
Married          310            319.5*          .2825**
Widowed           40             38.5           .0584
Divorced          30             34.5           .5870
Single           120            107.5          1.4535
Total            500                           χ² = 2.3814
* Census figures would predict: i.e. .639 × 500 = 319.5
** Our sample: (310-319.5)²/319.5 = .2825
… continued
Step 1
H0: The distribution has not changed
H1: The distribution has changed.
Step 2
α = 0.05
Step 3
H0 is rejected if
χ² > 7.815, df = 3
Step 4
χ² = 2.3814
Do not reject the null hypothesis, because 2.3814 < 7.815.
The sample does not show that the distribution of marital status in
Philadelphia is different from that of the United States as a whole.
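When the expected frequencies are unequal, each one is the census proportion times the sample size. The sketch below uses the slide's figures; the names are illustrative, and the critical value 7.815 (df = 3, α = 0.05) is taken from the slide rather than computed.

```python
# Census proportions and Philadelphia sample counts, from the slides
proportions = {"Married": 0.639, "Widowed": 0.077,
               "Divorced": 0.069, "Single": 0.215}
observed = {"Married": 310, "Widowed": 40, "Divorced": 30, "Single": 120}

n = sum(observed.values())                        # 500 adults
expected = {k: p * n for k, p in proportions.items()}

# Chi-square statistic: sum of (fo - fe)^2 / fe over the categories
chi_square = sum((observed[k] - expected[k]) ** 2 / expected[k]
                 for k in observed)

critical_value = 7.815                            # df = 4 - 1 = 3
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.4f}, reject H0: {reject_h0}")
```

Because the statistic is below the critical value, the comparison comes out as "do not reject H0".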
Goodness-of-Fit Test:
Normality
… the test investigates
if the observed frequencies in a frequency distribution
match the theoretical normal distribution
… the procedure is to:
- Determine the mean and standard deviation
of the frequency distribution
- Compute the z-value for the lower class limit
and the upper class limit for each class
- Determine fe for each category
- Use the chi-square goodness-of-fit test to
determine if fo coincides with fe
A sample of 500 donations to the Arthritis
Foundation is reported in the
following frequency distribution
Is it reasonable to conclude that the donations are
normally distributed with a mean of $10 and a
standard deviation of $2?
Use the .05 significance level
… continued
Amount Spent       fo
<$6                20
$6 up to $8        60
$8 up to $10      140
$10 up to $12     120
$12 up to $14      90
>$14               70
Total             500
To compute fe for the first class,
first determine the z-value
$z = \frac{X - \mu}{\sigma} = \frac{6 - 10}{2} = -2.00$
Now…
find the probability of a z-value less than –2.00
P(z < -2.00) = 0.5000 - .4772 = .0228
Amount Spent       fo     Area
<$6                20     .02
$6 up to $8        60     .14
$8 up to $10      140     .34
$10 up to $12     120     .34
$12 up to $14      90     .14
>$14               70     .02
Total             500
The expected frequency is the probability of a
z-value less than –2.00 times the sample size
fe = (.0228)(500) = 11.40
The other expected frequencies
are computed similarly
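The remaining expected frequencies can be sketched with Python's standard-library `statistics.NormalDist`, assuming the hypothesized mean of $10 and standard deviation of $2. The variable names are illustrative. The results differ slightly from the slide values in the second decimal place, because the slides use areas rounded to four decimals from a z-table.

```python
from statistics import NormalDist

dist = NormalDist(mu=10, sigma=2)      # hypothesized normal distribution
n = 500                                # sample size
boundaries = [6, 8, 10, 12, 14]        # class limits in dollars

# Cumulative probability at each boundary; the two end classes are open-ended
cumulative = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]

# Area of each class = difference between neighbouring cumulative probabilities
areas = [upper - lower for lower, upper in zip(cumulative, cumulative[1:])]
expected = [n * area for area in areas]

labels = ["<$6", "$6 up to $8", "$8 up to $10",
          "$10 up to $12", "$12 up to $14", ">$14"]
for label, area, fe in zip(labels, areas, expected):
    print(f"{label:<15} area={area:.4f}  fe={fe:.2f}")
```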
Amount Spent       fo     Area      fe      (fo – fe)²/fe
<$6                20     .02     11.40        6.49
$6 up to $8        60     .14     67.95         .93
$8 up to $10      140     .34    170.65        5.50
$10 up to $12     120     .34    170.65       15.03
$12 up to $14      90     .14     67.95        7.16
>$14               70     .02     11.40      301.22
Total             500            500         336.33
Step 1
H0: The observations follow the normal distribution
H1: The observations do NOT follow the normal
distribution
Step 2
α = 0.05
Step 3
H0 is rejected if χ² > 7.815, df = 3
(6 classes - 1, less 2 for the mean and standard deviation
determined from the frequency distribution)
Step 4
χ² = 336.33
H0 is rejected, because 336.33 > 7.815.
The observations do NOT follow the normal distribution
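The statistic can be recomputed directly from the fo and fe columns of the table. This is a small check using the slide's tabulated values; the names are illustrative and the critical value 7.815 is the one quoted on the slide. Summing unrounded terms gives a total that differs from 336.33 only in the last decimal.

```python
# fo and fe columns from the slide's table, in class order
observed = [20, 60, 140, 120, 90, 70]
expected = [11.40, 67.95, 170.65, 170.65, 67.95, 11.40]

# One (fo - fe)^2 / fe term per class
contributions = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_square = sum(contributions)

critical_value = 7.815           # value quoted on the slide
reject_h0 = chi_square > critical_value

for fo, term in zip(observed, contributions):
    print(f"fo={fo:3d}  contribution={term:7.2f}")
print(f"chi-square = {chi_square:.2f}, reject H0: {reject_h0}")
```

Almost all of the total comes from the last class (>$14), where 70 donations were observed against about 11 expected.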
A contingency table is used to investigate
whether two traits or characteristics
are related
… each observation is classified according to two criteria
…the usual hypothesis testing procedure is used
… the degrees of freedom are equal to:
(number of rows -1)(number of columns -1)
… the expected frequency is computed as:
Expected Frequency = (row total)(column total)/grand total
Is there a relationship between the
location of an accident and the gender
of the person involved in the accident?
A sample of 150 accidents reported to the
police was classified by location and gender.
At the .05 level of significance, can we
conclude that gender and the location of
the accident are related?
Sex              Location              Total
           Work    Home    Other
Male        60      20      10          90
Female      20      30      10          60
Total       80      50      20         150
The expected frequency for the work-male
intersection is computed as (90)(80)/150 = 48
Similarly, you can compute the
expected frequencies for the other cells
Step 1
H0: Gender and Location are NOT related
H1: Gender and Location are related
Step 2
α = 0.05
Step 3
H0 is rejected if χ² > 5.991, df = 2
(…there are (3-1)(2-1) = 2 degrees of freedom)
Step 4
Find the value of χ²
$\chi^2 = \frac{(60 - 48)^2}{48} + \dots + \frac{(10 - 8)^2}{8} = 16.667$
H0 is rejected, because 16.667 > 5.991.
Gender and Location are related!
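The full contingency-table calculation, including the expected frequencies for the cells not shown on the slides, can be sketched as follows. The variable names are illustrative, and the critical value 5.991 is taken from the slide rather than computed.

```python
# Observed counts: rows = Male, Female; columns = Work, Home, Other
observed = [[60, 20, 10],
            [20, 30, 10]]

row_totals = [sum(row) for row in observed]            # 90, 60
col_totals = [sum(col) for col in zip(*observed)]      # 80, 50, 20
grand_total = sum(row_totals)                          # 150

# Expected frequency = (row total)(column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic summed over every cell of the table
chi_square = sum((fo - fe) ** 2 / fe
                 for obs_row, exp_row in zip(observed, expected)
                 for fo, fe in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)      # (2-1)(3-1) = 2
critical_value = 5.991                                 # df = 2, alpha = 0.05
reject_h0 = chi_square > critical_value

print(expected)
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```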
Test your learning…
www.mcgrawhill.ca/college/lind
Online Learning Centre
for quizzes
extra content
data sets
searchable glossary
access to Statistics Canada’s E-Stat data
…and much more!
This completes Chapter 15