Doing Statistics for Business

Data, Inference, and Decision Making
Marilyn K. Pelosi
Theresa M. Sandifer
Chapter 15
The Analysis of Qualitative Data
Chapter 15 Objectives
In this chapter you will learn to use chi-square tests for:
• Testing whether a particular probability model fits a set of data (Goodness of Fit test)
• Testing equality of proportions for more than 2 populations
• Testing whether 2 qualitative variables are dependent or independent
The chi-square goodness of fit test checks to see how well a set of data fits the model for a particular probability distribution.

The observed frequencies are the actual number of observations that fall into each class in a frequency distribution or histogram.

The expected frequencies are the number of observations that should fall into each class in a frequency distribution under the hypothesized probability distribution.

A uniform distribution is one in which each outcome or class of outcomes is equally likely to occur.
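
The definitions above can be tied together in symbols. As a sketch, write o_i and e_i for the observed and expected frequencies in class i, k for the number of classes, n for the total number of observations, and p_i for the hypothesized probability of class i. This notation is an assumption made here for illustration; the slides themselves only use o and e in a later table.

```latex
e_i = n\,p_i, \qquad
\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}
% Uniform model: every class is equally likely, so p_i = 1/k and e_i = n/k.
```

A large value of the statistic indicates that the observed frequencies are far from what the hypothesized distribution predicts.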
TRY IT NOW!
Seat-Belt Usage
Setting Up the Hypotheses for the
Goodness of Fit Test
Analysts for insurance companies assume that the number of drivers
who wear seat belts is a binomial random variable with π = 0.70. To
test this assumption they decide to set up checkpoints and sample 10
drivers every 2 hours. Set up the hypotheses to perform an appropriate
chi-square goodness of fit test.
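
One way to write the hypotheses is sketched below. The symbol X (the number of drivers out of 10 who wear seat belts) is introduced here for illustration, and π denotes the hypothesized probability that a driver wears a seat belt; the textbook's own wording may differ.

```latex
H_0:\ X \sim \text{Binomial}(n = 10,\ \pi = 0.70)
\qquad\text{versus}\qquad
H_A:\ X \text{ does not follow a Binomial}(n = 10,\ \pi = 0.70) \text{ distribution}
```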
TRY IT NOW!
Seat-Belt Usage
Calculating the Expected Frequencies
The insurance analysts collect data for 1000 samples of 10 drivers
and obtain the frequency distribution shown in the following table.
Find the expected frequency distribution for the data if the distribution
is really binomial with n = 10 and π = 0.70.
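
A minimal Python sketch of this calculation follows. It assumes the expected frequency for each class is the number of samples (1000) times the binomial probability of that class; the variable and function names are illustrative only.

```python
from math import comb

N_DRIVERS = 10      # drivers per sample (n in the slide)
PI = 0.70           # hypothesized probability that a driver wears a seat belt
N_SAMPLES = 1000    # number of samples of 10 drivers


def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial random variable with n trials and success probability pi."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


probabilities = [binomial_pmf(x, N_DRIVERS, PI) for x in range(N_DRIVERS + 1)]
expected = [N_SAMPLES * p for p in probabilities]

for x, (p, e) in enumerate(zip(probabilities, expected)):
    print(f"{x:>2}  p(x) = {p:.3f}  expected = {e:7.2f}")
```

Rounded to whole numbers, these values agree with the Expected Frequency column that appears in the chi-square table later in the deck.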
TRY IT NOW!
Seat-Belt Usage
Calculating the Expected Frequencies
(con’t)
# Wearing        Observed               Expected
Seatbelts (x)    Frequency    p(x)      Frequency
0                0
1                0
2                1
3                6
4                33
5                116
6                213
7                275
8                216
9                119
10               21
Total            1000
TRY IT NOW!
Seat-Belt Usage
Calculating the Expected Frequencies
(con’t)
Create frequency histograms for both the observed and the expected
frequency distributions. At this point, does it appear that the observed
data conform to the binomial distribution with n = 10 and π = 0.70?
Why or why not?
TRY IT NOW!
Seat-Belt Usage
Calculating the Chi-Square Statistic
Fill in the following table to calculate the value of the
chi-square statistic for the data obtained by the insurance analysts.
# Wearing     Observed               Expected
Seatbelts     Frequency    p(x)      Frequency    (o - e)    (o - e)^2/e
0             0            0.000     0
1             0            0.000     0
2             1            0.001     1
3             6            0.009     9
4             33           0.037     37
5             116          0.103     103
6             213          0.200     200
7             275          0.267     267
8             216          0.233     233
9             119          0.121     121
10            21           0.028     28
Total         1000         1         1000
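
The table can be completed with a short Python sketch. One assumption is made loudly here: the classes x = 0 through 3 are pooled into a single class, because their expected frequencies are zero or very small and (o - e)^2/e is undefined when e = 0. The textbook may pool the classes differently.

```python
observed = [0, 0, 1, 6, 33, 116, 213, 275, 216, 119, 21]   # from the slide
expected = [0, 0, 1, 9, 37, 103, 200, 267, 233, 121, 28]   # from the slide (rounded)

# Assumption: merge x = 0..3 into one "3 or fewer" class so that no expected
# frequency is below 5 (and none is zero).
POOL = 4
obs_pooled = [sum(observed[:POOL])] + observed[POOL:]
exp_pooled = [sum(expected[:POOL])] + expected[POOL:]

# One (o - e)^2 / e term per class; their sum is the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled)]
chi_square = sum(contributions)

print(f"classes after pooling: {len(obs_pooled)}")
print(f"chi-square statistic:  {chi_square:.3f}")
```

Under this pooling there are 8 classes and the statistic is about 7.08.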

Figure 15.1 The Upper Tail of a Chi-Square Distribution (degrees of freedom = k - p - 1)
TRY IT NOW!
Seat-Belt Usage
Finding the Critical Value and Performing
the Test
The insurance analysts decide that they want to test the goodness of fit
hypothesis at the 0.01 level of significance.
How many degrees of freedom will the critical value for the test have?
Find the critical value for the test.
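
As a sketch of how the critical value could be found without a printed table, the Python below computes the upper-tail area of a chi-square distribution using only the standard library and then searches for the cutoff by bisection. The function names are hypothetical, and the class count k = 8 assumes that x = 0 through 3 were pooled into one class; with no pooling, k would be 11. Because π = 0.70 is given rather than estimated, p = 0.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """Upper-tail area P(chi-square with df degrees of freedom > x), integer df >= 1."""
    y = x / 2.0
    # Closed forms exist for df = 1 (a = 1/2) and df = 2 (a = 1); larger df follow
    # from Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1).
    a = 0.5 if df % 2 else 1.0
    tail = erfc(sqrt(y)) if df % 2 else exp(-y)
    while a < df / 2.0:
        tail += y**a * exp(-y) / gamma(a + 1)
        a += 1.0
    return tail


def chi2_critical_value(alpha, df):
    """Find x whose upper-tail area is alpha (the tail area decreases as x grows)."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


k = 8   # assumption: classes remaining after pooling x = 0..3
p = 0   # no parameters estimated from the data
df = k - p - 1
critical = chi2_critical_value(0.01, df)
print(f"df = {df}, critical value = {critical:.3f}")
```

The null hypothesis is rejected only if the chi-square statistic exceeds this critical value.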
TRY IT NOW!
Seat-Belt Usage
Finding the Critical Value and Performing
the Test (con’t)
Based on the chi-square test statistic and the critical value, what can you
conclude about the distribution of the number of people in a sample of size
10 who wear seat belts?
TRY IT NOW!
Technical Support
Setting Up the Chi-Square Test for
Proportions
A company that sells computer software has three different locations
set up to provide customers with technical support for their products.
The support representatives keep a log for each call to technical support,
and as part of that log, they record whether the problem was resolved
successfully.
TRY IT NOW!
Technical Support
Setting Up the Chi-Square Test for
Proportions (con’t)
The company analysts are interested in knowing whether the percentage
of calls that are successfully resolved is the same for each location. They
randomly select logs from each location and collect data on the number
of calls that result in a successful resolution of the problem. The data
are summarized in the following table.
TRY IT NOW!
Technical Support
Setting Up the Chi-Square Test for
Proportions (con’t)
                      Location
Number of             1      2      3      Totals
Successful Calls      257    264    283    804
Unsuccessful Calls    43     86     97     226
Totals                300    350    380    1030
Set up the hypotheses for the software company.
Calculate the proportion of successfully resolved calls for each location.
Based solely on these numbers, do you think that the proportion of
successfully resolved calls for all three locations is the same?
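
The sample proportions can be computed directly from the table. A small Python sketch, with illustrative variable names:

```python
successful = {1: 257, 2: 264, 3: 283}   # successful calls by location (from the table)
totals = {1: 300, 2: 350, 3: 380}       # calls sampled at each location

# Sample proportion of successfully resolved calls at each location.
proportions = {loc: successful[loc] / totals[loc] for loc in totals}

for loc, prop in proportions.items():
    print(f"Location {loc}: {prop:.3f}")
```

Location 1 resolves roughly 86% of its calls, while Locations 2 and 3 are both near 75%; the chi-square test is what decides whether a gap of that size could plausibly be due to sampling alone.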
TRY IT NOW!
Technical Support
Calculating the Expected Frequencies
The data for the computer software company interested in its
technical support locations are
                      Location
Number of             1      2      3      Totals
Successful Calls      257    264    283    804
Unsuccessful Calls    43     86     97     226
Totals                300    350    380    1030
TRY IT NOW!
Technical Support
Calculating the Expected Frequencies
Calculate the overall percentage of calls that are resolved successfully,
assuming that the three locations are the same.
Use the overall proportion of successful calls to find the expected
frequency of successful calls for each location.
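
A hedged Python sketch of these two steps: pool all locations to get the overall proportion, then multiply it by each location's sample size. The variable names are illustrative.

```python
location_totals = {1: 300, 2: 350, 3: 380}   # calls sampled at each location
total_successful = 804                        # successful calls, all locations
grand_total = 1030                            # all sampled calls

# Overall proportion of successful calls if the three locations are the same.
overall = total_successful / grand_total

# Expected successful calls at each location under that assumption.
expected_successful = {loc: n * overall for loc, n in location_totals.items()}

print(f"overall proportion: {overall:.4f}")
for loc, e in expected_successful.items():
    print(f"Location {loc}: expected successful calls = {e:.2f}")
```

The expected number of unsuccessful calls at each location is the location total minus the expected successful calls.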
Figure 15.2 Minitab Output from Chi-Square Analysis
                Non-Binge   Infrequent   Frequent       All
Never                  77           39         31       147
                    58.25        40.22      48.54     147.0
Once or More            7           19         39        65
                    25.75        17.78      21.46      65.0
All                    84           58         70       212
                    84.00        58.00      70.00     212.0
Chi-Square = 40.484, DF = 2, P-Value = 0.00
Cell Contents --
Count
Exp Freq
TRY IT NOW!
Technical Support
Performing the Chi-Square Test for
Proportions
The computer software company with the different technical support
locations wants to complete the test to determine whether the percentage
of successfully resolved calls is the same at all three locations. It wants
to test at the 0.01 level of significance.
TRY IT NOW!
Technical Support
Performing the Chi-Square Test for
Proportions (con’t)
Calculate the value of the chi-square test statistic and complete the test.
Is the proportion of successfully resolved calls the same at each location?
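
A minimal Python sketch of the full test follows. It assumes the usual degrees of freedom for comparing proportions across three populations, (3 - 1)(2 - 1) = 2, and uses the fact that for 2 degrees of freedom the upper-tail area of the chi-square distribution is exactly exp(-x/2), so the critical value can be computed rather than looked up.

```python
from math import exp, log

# Rows: locations 1..3; columns: successful, unsuccessful (from the table).
observed = [[257, 43], [264, 86], [283, 97]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (o - e)^2 / e over all six cells, with e = row total * column total / grand total.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)   # 2
alpha = 0.01
critical = -2 * log(alpha)          # valid for df = 2 only
p_value = exp(-chi_square / 2)      # valid for df = 2 only
reject = chi_square > critical

print(f"chi-square = {chi_square:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```

With these data the statistic is about 14.40 against a critical value of about 9.21, so the hypothesis of equal proportions is rejected at the 0.01 level.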
TRY IT NOW!
Drinking Survey
Setting Up the Contingency Table for a
Test for Independence
The Public Health student who did the study on drinking also collected
data on the number of times that each respondent drove while intoxicated in the
last two weeks (coded). The contingency table for the usable responses is
given on the following slide.
TRY IT NOW!
Drinking Survey
Setting Up the Contingency Table for a
Test for Independence (con’t)
              Number of Times Drove While Intoxicated
Class         Not at All    Once    Twice or More    Total
Freshman      72            5       9                86
Sophomore     19            8       9                36
Junior        16            8       6                30
Senior        8             4       7                19
Total         115           25      31               171
TRY IT NOW!
Drinking Survey
Setting Up the Contingency Table for a
Test for Independence (con’t)
The university is interested in knowing if the number of times a student
drove while intoxicated is related to his or her class in school. It feels
that this information will help target student audiences for programs on
drinking and driving.
TRY IT NOW!
Drinking Survey
Setting Up the Contingency Table for a
Test for Independence (con’t)
Set up the hypotheses that the university should test.
Calculate the expected frequencies for each cell and put them in the
appropriate location in the table.
Are any of the expected frequencies less than 5? If so, can you suggest a
logical way to combine categories to avoid this problem?
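
The expected frequencies can be sketched in Python as row total times column total divided by the grand total, followed by a check for cells below 5. The names are illustrative.

```python
classes = ["Freshman", "Sophomore", "Junior", "Senior"]
columns = ["Not at All", "Once", "Twice or More"]
observed = [
    [72, 5, 9],
    [19, 8, 9],
    [16, 8, 6],
    [8, 4, 7],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Cells whose expected frequency is below 5.
small_cells = [
    (classes[i], columns[j], round(expected[i][j], 2))
    for i in range(len(classes))
    for j in range(len(columns))
    if expected[i][j] < 5
]

for name, row in zip(classes, expected):
    print(name.ljust(10), "  ".join(f"{e:6.2f}" for e in row))
print("expected < 5:", small_cells)
```

Three cells fall below 5, all in the Junior and Senior rows. One possible remedy, offered as an assumption rather than as the textbook's answer, is to merge Junior and Senior into a single class.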
TRY IT NOW!
Drinking Survey
Performing the Chi-Square Test for
Independence
The university that is looking at the relationship between class year
and drinking and driving wants to perform the test at the 0.05 level of
significance.
Calculate the value of the chi-square statistic for the data.
Find the critical value and perform the test.
Are class and drinking and driving independent?
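
A hedged Python sketch of the test follows. It assumes that the Junior and Senior rows have been merged so that every expected frequency is at least 5; this is one possible way of combining categories, not necessarily the one the textbook uses. The critical value is taken from a standard chi-square table, and the function name is hypothetical.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Columns: Not at All, Once, Twice or More.
observed = [
    [72, 5, 9],                 # Freshman
    [19, 8, 9],                 # Sophomore
    [16 + 8, 8 + 4, 6 + 7],     # Junior and Senior merged (assumption)
]

statistic, df = chi_square_independence(observed)
CRITICAL_05_DF4 = 9.488   # standard table value for alpha = 0.05, df = 4
reject = statistic > CRITICAL_05_DF4

print(f"chi-square = {statistic:.2f}, df = {df}, reject independence: {reject}")
```

Under this merging the statistic is about 21.9 with 4 degrees of freedom, which exceeds the critical value, so class and drinking and driving would be judged dependent at the 0.05 level.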
The Chi-Square Test for Independence
in Excel - Creating the Contingency Table
1. From the Data menu, select Pivot Table Report and
follow the steps of the pivot table wizard to create the
table.
2. At step 4 of 4, after you indicate where you want the table
to be placed, do not select Finish. Instead, click Options.
3. Make sure that the checkbox next to For empty cells, show
is checked and in the textbox next to it, type “0”. (Excel
will not accept empty cells in the contingency table.)
4. Click OK and then Finish.
Figure 15.4 Pivot Table Options Dialog Box
Figure 15.5 Contingency Table for Binge Drinking Data
The Chi-Square Test for Independence
In Kadd
Select Hypothesis Testing > Chi Square Test from the Kadd menu.
The dialog box opens.
The Chi-Square Test for Independence
Dialog Box
The Chi-Square Test for Independence
Complete the Dialog Box:
1. The input range should be the location of
the contingency table for the data. Make
sure you do not select the column and row
that contain the totals.
2. Indicate where you want the output
located.
3. Click OK.
The Chi-Square Test for Independence
Chi-square test statistic = 15.3884
p-value = 0.0174
Number of:
rows = 4
columns = 3

Actual frequencies
                                 Variable B
Variable A                       Non-binge   Infrequent   Frequent   Totals
Residence Hall or Dormitory             35           25         46      106
Fraternity or Sorority                   0            1          0        1
Other University Housing                 0            2          1        3
Off Campus House or Apartment           49           30         24      103
Totals                                  84           58         71      213

Chi-square calculations
                                 Variable B
Variable A                       Non-binge   Infrequent   Frequent
Residence Hall or Dormitory         1.1071       0.5172     3.2201
Fraternity or Sorority              0.3944       1.9447     0.3333
Other University Housing            1.1831       1.7135     0.0000
Off Campus House or Apartment       1.7289       0.1360     3.1100
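
The numbers in this output can be checked with a short Python sketch. It recomputes the per-cell contributions (o - e)^2/e from the actual frequencies and obtains the p-value from a closed form for the chi-square upper tail that holds only when the degrees of freedom are even; here df = (4 - 1)(3 - 1) = 6. This is an independent check, not a description of how Kadd computes its results.

```python
from math import exp, factorial

# Columns: Non-binge, Infrequent, Frequent (actual frequencies from the output).
observed = [
    [35, 25, 46],   # Residence Hall or Dormitory
    [0, 1, 0],      # Fraternity or Sorority
    [0, 2, 1],      # Other University Housing
    [49, 30, 24],   # Off Campus House or Apartment
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Per-cell chi-square contributions, matching the "Chi-square calculations" table.
contributions = [
    [(o - r * c / n) ** 2 / (r * c / n) for o, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
statistic = sum(sum(row) for row in contributions)

df = (len(observed) - 1) * (len(observed[0]) - 1)   # 6
# Upper-tail area for even df: exp(-y) * sum of y^i / i! for i < df/2, with y = x/2.
y = statistic / 2
p_value = exp(-y) * sum(y**i / factorial(i) for i in range(df // 2))

print(f"chi-square = {statistic:.4f}, df = {df}, p-value = {p_value:.4f}")
```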
Chapter 15 Summary
In this chapter you have learned:
• The chi-square test involves comparing observed and expected frequencies for different classes of data.
• The chi-square test is quite versatile and can be used to test:
    Goodness of Fit
    Equality of Proportions for more than 2 Populations
    Independence of Qualitative Variables
• The results of a chi-square test do not solve a problem, but simply point out when further action is indicated and when it is not.