The Practice of Statistics Third Edition Chapter 14

The Practice of Statistics
Third Edition
Chapter 14:
Inference for Distributions
of Categorical Variables:
Chi-Square Procedures
Copyright © 2008 by W. H. Freeman & Company
Inference for Two-Way Tables
• We have learned how to compare
proportions between two groups.
– Two sample z-test.
• What if we want to compare more than two
groups?
• We can use a two-way table (rows and
columns).
Does Background Music
Influence Wine Purchases?
• Consider the following table:
                No Music   French Music   Italian Music   Total
French Wine        30           39             30           99
Italian Wine       11            1             19           31
Other Wine         43           35             35          113
Total              84           75             84          243
Conditional Probability (%)
                French Wine      Italian Wine     Other Wine        Total
No Music        35.7% (30/84)    13.1% (11/84)    51.2% (43/84)     100% (84/84)
French Music    52.0% (39/75)    1.3% (1/75)      46.7% (35/75)     100% (75/75)
Italian Music   35.7% (30/84)    22.6% (19/84)    41.7% (35/84)     100% (84/84)
Total           40.7% (99/243)   12.8% (31/243)   46.5% (113/243)   100% (243/243)
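The conditional percents for each music condition can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original slides; the dictionary layout and the function name `conditional_percents` are illustrative choices.

```python
# Observed counts from the music-and-wine table
# (outer keys: wine type, inner keys: music condition).
observed = {
    "French wine":  {"No music": 30, "French music": 39, "Italian music": 30},
    "Italian wine": {"No music": 11, "French music": 1,  "Italian music": 19},
    "Other wine":   {"No music": 43, "French music": 35, "Italian music": 35},
}

def conditional_percents(table):
    """For each music condition, return the percent of bottles of each wine type."""
    music_types = list(next(iter(table.values())))
    result = {}
    for music in music_types:
        # Total bottles sold under this music condition.
        column_total = sum(row[music] for row in table.values())
        result[music] = {
            wine: round(100 * row[music] / column_total, 1)
            for wine, row in table.items()
        }
    return result

percents = conditional_percents(observed)
for music, distribution in percents.items():
    print(music, distribution)
```

Each inner dictionary is one conditional distribution: the percents of the three wine types among the bottles sold under that music condition.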
[Figure: Comparison of percents of different types of wine sold for different music conditions.]
Here are the percents of each type of wine sold under the different music conditions.
                No Music          French Music     Italian Music
French Wine     30/99 = 30.3%     39/99 = 39.4%    30/99 = 30.3%
Italian Wine    11/31 = 35.5%     1/31 = 3.2%      19/31 = 61.3%
Other Wine      43/113 = 38%      35/113 = 31%     35/113 = 31%
Conclusions
• There appears to be an association between
music played and the type of wine that
customers buy.
• Sales of Italian wine are low when French
music is playing, but higher when Italian or
no music is playing.
• More French wine is sold when French
music is playing.
The Problems of Multiple
Comparisons
• We would expect music would influence
sales, so music type is the explanatory
variable (x) and type of wine purchased is
the response variable (y).
• We will compare the column percents that
give the conditional distributions for each
type of music played.
How to Compare
• We could do 3 chi-square goodness of fit
procedures.
• H0: The distribution of wine types for no music is
the same as the distribution of wine for French
music.
• H0: The distribution of wine types for no music is
the same as the distribution of wine for Italian
music.
• H0: The distribution of wine types for French
music is the same as the distribution of wine for
Italian music.
Weaknesses
• We get three results.
• We can’t safely compare many parameters
by doing tests or confidence intervals for
two parameters at a time.
• This is the problem of MULTIPLE
COMPARISONS.
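One way to see why several separate tests are risky is to look at the chance of at least one false rejection across all of them. The sketch below is an illustration only and is not from the slides. It assumes each test uses a significance level of 0.05 and that the tests are independent. Both are idealized assumptions, since the three pairwise tests here share data. The function name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false rejection among independent tests,
    each run at significance level alpha, when every null is true."""
    return 1 - (1 - alpha) ** num_tests

# Assumed values: alpha = 0.05 per test, three pairwise tests.
rate = familywise_error_rate(0.05, 3)
print(round(rate, 4))  # about 0.1426
```

Under these assumptions the overall chance of a false alarm is roughly 14%, not 5%, which is why a single overall test is preferred as the first step.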
Dealing with Multiple Comparisons
• Two parts
– An overall test to see if there is good evidence of
differences among the parameters that we want to
compare.
– A detailed follow-up analysis to decide which of the
parameters differ and to estimate how large the
differences are.
• The overall test is a chi-square test, but it will be
used for comparing several population
proportions.
Two Way Tables
• The tables we have been using are 3x3
tables because there are three types of wine
(rows) and three types of music (columns).
• The explanatory variable (x) is the type of
music.
• The response variable (y) is the type of
wine purchased.
Hypothesis
• Each column represents the sample for one
type of music and each row a type of wine.
• These are separate and independent random
samples from the column populations.
– The columns represent the populations.
– The rows represent the response variable.
• H0: The distribution of the response variable
(type of wine purchased) is the same in all
column populations.
Music and Wine
• We have 3 populations
– Bottles of wine sold with no music
– Bottles of wine sold with French music playing
– Bottles of wine sold with Italian music playing.
• We have three independent samples of 84,
75, and 84 bottles.
• H0: The proportions of each wine sold are the
same in all 3 populations.
The null hypothesis is that the distribution of wine selected is the
same for all three populations of music types.
The alternative is the distributions of wine types are not all the
same.
If we have n independent trials and the probability of success
on each trial is p, we expect np successes.
If we draw a SRS of n individuals from a population in which
the proportion of successes is p, we expect np successes in the
sample.
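The idea that we "expect np successes" can be checked with a small simulation. This is a sketch under assumed values that do not come from the slides: n = 84 trials, p = 0.4, 5000 repetitions, and a fixed random seed. The function name `simulate_mean_successes` is hypothetical.

```python
import random

def simulate_mean_successes(n, p, repetitions, seed=0):
    """Average number of successes over many runs of n independent trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(repetitions):
        # Count the successes in one run of n trials.
        total += sum(1 for _ in range(n) if rng.random() < p)
    return total / repetitions

n, p = 84, 0.4            # assumed illustration values
expected = n * p          # the expected count np
average = simulate_mean_successes(n, p, repetitions=5000)
print(expected, round(average, 2))
```

The simulated average should land close to np, although any single run of n trials can differ from it.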
Two Way Table
                No Music   French Music   Italian Music   Total
French Wine        30           39             30           99
Italian Wine       11            1             19           31
Other Wine         43           35             35          113
Total              84           75             84          243
Finding the Expected Count of Row 1
and Column 1
• The proportion of no music among all 243
subjects: (Column 1 total)/(Table total)
• 84/243 – Think of this as p, the overall proportion
of no music.
• If H0 is true, we would expect this same
proportion of no music in all three groups.
• So the expected count of no music among the 99
subjects who ordered French wine: np =
(99)(84/243) = 34.222
• From the definition: (99)(84)/243
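The calculation above is an instance of the usual expected-count formula for a two-way table, written here in general form and then with the numbers for the French wine, no music cell:

```latex
\[
\text{expected count}
  = \frac{\text{row total} \times \text{column total}}{\text{table total}},
\qquad
\frac{(99)(84)}{243} \approx 34.222
\]
```

Here the row total is the 99 bottles of French wine, the column total is the 84 bottles sold with no music, and the table total is 243.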
Expected Counts for Music and
Wine
                French Wine   Italian Wine   Other Wine   Total
No Music          34.222        10.716        39.062       84.000
French Music      30.556         9.568        34.877       75.001
Italian Music     34.222        10.716        39.062       84.000
Total             99.000        31.00        113.001      243.001
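The whole table of expected counts can be built by applying (row total)(column total)/(table total) to every cell. The sketch below is illustrative and not part of the slides. It stores the observed counts with wine types as rows and music conditions as columns, so its output is the transpose of the expected-counts table above. The function name `expected_counts` is hypothetical.

```python
# Observed counts: rows are wine types, columns are
# no music, French music, Italian music.
observed = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

def expected_counts(table):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print([round(value, 3) for value in row])
```

The unrounded expected counts add up to the observed row and column totals. Totals such as 75.001 and 243.001 in the table come from adding entries that were already rounded to three decimal places.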
Assignment
• Construct the Expected Counts for Music
and Wine Table.
• Exercises 14.3, 14.11
• Read pages 858 – 865
• Exam on Wednesday, April 14th