Notes 23 - Wharton Statistics Department

Stat 112: Lecture 23 Notes
• Chapter 9.3: Two-way Analysis of Variance
• Schedule:
– Homework 6 is due on Friday.
– Quiz 4 is next Tuesday.
– Final homework assignment will be e-mailed
this weekend and due next Monday.
– Final Project due on Dec. 19th
Two-way Analysis of Variance
• We have observations from different
groups, where the groups are classified by
two factors.
• Goal of two-way analysis of variance: Find
out how the mean response in a group
depends on the levels of both factors and
find the best combination.
• As with one-way analysis of variance, two-way analysis of variance can be seen as a
special case of multiple regression. For
two-way analysis of variance, we have two
categorical explanatory variables for the
two factors and also include an interaction
between the factors.
Two-way Analysis of Variance
Example
• Package Design Experiment: Several new
types of cereal packages were designed.
Two colors and two styles of lettering were
considered. Each combination of
lettering/color was used to produce a
package, and each of these combinations
was test marketed in 12 comparable
stores and sales in the stores were
recorded. This is a two-way analysis of variance
in which the two factors are color (levels red,
green) and lettering (levels block, script).
Response Sales
Effect Tests
Source             Nparm   DF   Sum of Squares   F Ratio   Prob > F
Color                  1    1        4641.3333    3.1762     0.0816
TypeStyle              1    1        5985.3333    4.0959     0.0491
TypeStyle*Color        1    1         972.0000    0.6652     0.4191
Expanded Estimates
Nominal factors expanded to all levels
Term                               Estimate    Std Error   t Ratio   Prob>|t|
Intercept                         144.91667     5.517577     26.26     <.0001
Color[Green]                      -9.833333     5.517577     -1.78     0.0816
Color[Red]                        9.8333333     5.517577      1.78     0.0816
TypeStyle[Block]                  -11.16667     5.517577     -2.02     0.0491
TypeStyle[Script]                 11.166667     5.517577      2.02     0.0491
TypeStyle[Block]*Color[Green]          -4.5     5.517577     -0.82     0.4191
TypeStyle[Block]*Color[Red]             4.5     5.517577      0.82     0.4191
TypeStyle[Script]*Color[Green]          4.5     5.517577      0.82     0.4191
TypeStyle[Script]*Color[Red]           -4.5     5.517577     -0.82     0.4191
Estimated Mean for Red Block group = 144.92+9.83-11.17+4.5 = 148.08
Estimated Mean for Red Script group = 144.92+9.83+11.17-4.5 = 161.42
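The same arithmetic gives the estimated mean for every color/lettering combination. Below is a minimal sketch, assuming the expanded estimates are copied by hand from the JMP output; the names `color_eff`, `style_eff`, `inter_eff`, and `cell_mean` are illustrative and not part of JMP.

```python
# Expanded estimates copied from the JMP output for the model with interaction.
intercept = 144.91667
color_eff = {"Green": -9.833333, "Red": 9.8333333}
style_eff = {"Block": -11.16667, "Script": 11.166667}
inter_eff = {
    ("Block", "Green"): -4.5,
    ("Block", "Red"): 4.5,
    ("Script", "Green"): 4.5,
    ("Script", "Red"): -4.5,
}

def cell_mean(color, style):
    """Estimated mean = intercept + color effect + typestyle effect + interaction."""
    return intercept + color_eff[color] + style_eff[style] + inter_eff[(style, color)]

# Estimated mean for each of the four groups.
cell_means = {(c, s): cell_mean(c, s) for c in color_eff for s in style_eff}
for (c, s), m in cell_means.items():
    print(f"{c:5s} {s:6s} {m:8.2f}")
```

Because the model with the interaction has one parameter per group, these estimates should agree with the sample group means reported for the experiment (119.42, 150.75, 148.08, 161.42).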
Interaction in Two-Way ANOVA
• Interaction between two factors: The impact of one factor
on the response depends on the level of the other factor.
• For package design experiment, there would be an
interaction between color and typestyle if the impact of
color on sales depended on the level of typestyle.
• Formally, there is an interaction if
$\mu_{red,block} - \mu_{red,script} \neq \mu_{green,block} - \mu_{green,script}$
• LS Means Plot suggests there is not much interaction.
Impact of changing color from red to green on mean
sales is about the same when the typestyle is block as
when the typestyle is script.
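To make the formal condition concrete, the difference of differences can be computed from the four estimated group means. This is a small sketch using the group means reported for this experiment; the variable names are illustrative.

```python
# Estimated group means for the package design experiment (from the JMP output).
mean = {
    ("red", "block"): 148.083,
    ("red", "script"): 161.417,
    ("green", "block"): 119.417,
    ("green", "script"): 150.750,
}

# Effect of switching lettering from script to block, within each color.
diff_red = mean[("red", "block")] - mean[("red", "script")]
diff_green = mean[("green", "block")] - mean[("green", "script")]

# The interaction contrast is zero exactly when the two differences agree.
contrast = diff_red - diff_green
print(f"red: {diff_red:.2f}, green: {diff_green:.2f}, contrast: {contrast:.2f}")
```

The sample contrast is not exactly zero. Whether it differs from zero by more than chance variation is what the effect test for the interaction assesses.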
LS Means Plot
[Figure: Sales LS Means (vertical axis, 50 to 250) plotted against Color (Green, Red), with separate lines for Script and Block.]
Effect Test for Interaction
• A formal test of the null hypothesis that
there is no interaction, $H_0: \mu_{ij} - \mu_{i'j} = \mu_{ij'} - \mu_{i'j'}$
for all levels i, j, i', j' of factors 1 and 2,
versus the alternative hypothesis that
there is an interaction is given by the
Effect Test for the interaction variable
(here TypeStyle*Color).
Effect Tests
Source             Nparm   DF   Sum of Squares   F Ratio   Prob > F
Color                  1    1        4641.3333    3.1762     0.0816
TypeStyle              1    1        5985.3333    4.0959     0.0491
TypeStyle*Color        1    1         972.0000    0.6652     0.4191
• p-value for Effect Test = 0.4191. No
evidence of an interaction.
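The F ratios in the Effect Tests output can be reproduced by hand. The sketch below assumes a balanced design (12 stores in each of the 4 groups), takes the error mean square as the average of the four within-group variances, and uses the group standard deviations reported in the Means and Std Deviations output for this experiment. The variable names are illustrative.

```python
# Within-group standard deviations (GreenBlock, GreenScript, RedBlock, RedScript).
std_devs = [37.4929, 33.5129, 44.8461, 36.1272]
n_per_group = 12
error_df = len(std_devs) * (n_per_group - 1)  # 4 * 11 = 44

# With equal group sizes, the pooled variance (mean square error) is the
# plain average of the group variances.
mse = sum(s ** 2 for s in std_devs) / len(std_devs)

# Sums of squares and degrees of freedom from the Effect Tests output.
effects = {
    "Color": (4641.3333, 1),
    "TypeStyle": (5985.3333, 1),
    "TypeStyle*Color": (972.0000, 1),
}

# F ratio = (effect sum of squares / effect df) / mean square error.
f_ratios = {name: (ss / df) / mse for name, (ss, df) in effects.items()}
for name, f in f_ratios.items():
    print(f"{name:16s} F = {f:.4f} on ({effects[name][1]}, {error_df}) df")
```

Each p-value is the upper-tail area of the F distribution with (1, 44) degrees of freedom beyond the F ratio. If SciPy is available, `scipy.stats.f.sf(f, 1, 44)` is one way to compute it.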
Implications of No Interaction
• When there is no interaction, the two factors can be
looked at in isolation, one at a time.
• When there is no interaction, the best group is determined
by finding the best level of factor 1 and the best level of factor 2
separately.
• For the package design experiment, suppose there are two
separate groups: one with expertise in lettering and
the other with expertise in coloring. If there is no
interaction, the groups can work independently to decide
the best lettering and color. If there is an interaction, the groups
need to get together to decide on the best combination of
lettering and color.
Model when There is No Interaction
• When there is no evidence of an
interaction, we can drop the interaction
term from the model for parsimony and
more accurate estimates:
Response Sales
Effect Tests
Source        Nparm   DF   Sum of Squares   F Ratio   Prob > F
Color             1    1        4641.3333    3.2000     0.0804
TypeStyle         1    1        5985.3333    4.1266     0.0481
Expanded Estimates
Nominal factors expanded to all levels
Term                  Estimate    Std Error   t Ratio   Prob>|t|
Intercept            144.91667     5.497011     26.36     <.0001
Color[Green]         -9.833333     5.497011     -1.79     0.0804
Color[Red]           9.8333333     5.497011      1.79     0.0804
TypeStyle[Block]     -11.16667     5.497011     -2.03     0.0481
TypeStyle[Script]    11.166667     5.497011      2.03     0.0481
Mean for red block group = 144.92+9.83-11.17 = 143.58
Mean for red script group = 144.92+9.83+11.17 = 165.92
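One way to see where the new F ratios and standard errors come from: dropping the interaction moves its sum of squares and its one degree of freedom into the error term. This is a minimal sketch, assuming a balanced design with 48 observations and recovering the full-model error mean square from the standard error 5.517577 reported for the model with interaction. The names are illustrative.

```python
import math

n_obs = 48                       # 4 groups x 12 stores
se_full = 5.517577               # Std Error of each estimate in the full model
mse_full = n_obs * se_full ** 2  # balanced design: SE = sqrt(MSE / n_obs)
df_full = n_obs - 4              # 44 error df with the interaction in the model

# Pool the interaction sum of squares (972, 1 df) into the error term.
ss_interaction = 972.0
df_add = df_full + 1
mse_add = (df_full * mse_full + ss_interaction) / df_add

# F ratios and standard error for the model without interaction.
f_color = 4641.3333 / mse_add
f_style = 5985.3333 / mse_add
se_add = math.sqrt(mse_add / n_obs)
print(f"F(Color) = {f_color:.4f}, F(TypeStyle) = {f_style:.4f}, SE = {se_add:.6f}")
```

The error mean square drops slightly because the interaction mean square (972) is smaller than the full-model error mean square (about 1461), which is why the standard errors shrink a little.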
Tests for Main Effects When There is No Interaction
• Effect test for color: Tests null hypothesis that group mean does not
depend on color versus alternative that group mean is different for at
least two levels of color. p-value = 0.0804, moderate but not strong
evidence that group mean depends on color.
• Effect test for TypeStyle: Tests null hypothesis that group mean
does not depend on TypeStyle versus alternative that group mean is
different for at least two levels of TypeStyle. p-value = 0.0481,
evidence that group mean depends on TypeStyle.
• These are called tests for “main effects.” These tests only make
sense when there is no interaction.
Example with an Interaction
• Should the clerical employees of a large
insurance company be switched to a four-day
week, allowed to use flextime schedules, or kept
to the usual 9-to-5 workday?
• The data set flextime.JMP contains percentage
efficiency gains over a four-week trial period for
employees grouped by two factors: Department
(Claims, Data Processing, Investment) and
Condition (Flextime, Four-day week, Regular
Hours).
Response Improve
Effect Tests
Source                  Nparm   DF   Sum of Squares    F Ratio   Prob > F
Department                  2    2         154.3087     8.0662     0.0006
Condition                   2    2           0.5487     0.0287     0.9717
Condition*Department        4    4        5588.2004   146.0566     <.0001
There is strong evidence of an interaction.
Interaction Profiles
[Figure: Interaction profile plots of Improve (vertical axis, -15 to 25) by Department (Claims, DP, Invest) and Condition (Flex, FourDay, Regular).]
Which schedule is best appears to differ by department.
Four day is best for investment employees, but
worst for data processing employees.
Which Combination Works Best?
• For which pairs of groups is there strong
evidence that the groups have different
means – is there strong evidence that one
combination works best?
• We combine the two factors into one factor
(Combination) and use Tukey’s HSD to
compare groups pairwise, adjusting for
multiple comparisons.
Oneway Analysis of Improve By Combination
Means Comparisons
Comparisons for all pairs using Tukey-Kramer HSD
Level            Letter       Mean
DPFlex           A        16.89091
InvestFourDay    A        16.87273
InvestRegular    B         9.38182
ClaimsFlex       C         4.32727
ClaimsRegular    C         4.20000
ClaimsFourDay    C         3.12727
DPRegular        C         2.21818
DPFourDay        D        -4.74545
InvestFlex       D        -5.65455
Levels not connected by same letter are significantly different
For Data Processing employees, there is strong evidence
that flextime is best. For Investment employees, there is strong
evidence that Four Day is best. For claims employees, there is
not strong evidence that any of the schedules have different means.
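Reading the connecting-letters report can be mechanized: two levels are flagged as significantly different when they share no letter. This is a small sketch, with the letters transcribed from the Tukey-Kramer report for the flextime data; the helper name `differ` is illustrative.

```python
# Connecting letters from the Tukey-Kramer report (one string of letters per level).
letters = {
    "DPFlex": "A", "InvestFourDay": "A", "InvestRegular": "B",
    "ClaimsFlex": "C", "ClaimsRegular": "C", "ClaimsFourDay": "C",
    "DPRegular": "C", "DPFourDay": "D", "InvestFlex": "D",
}

def differ(level_a, level_b):
    """True when the two levels share no letter, i.e. are significantly different."""
    return not (set(letters[level_a]) & set(letters[level_b]))

# Compare the three schedules within each department.
for dept in ("DP", "Invest", "Claims"):
    schedules = [lvl for lvl in letters if lvl.startswith(dept)]
    for i, a in enumerate(schedules):
        for b in schedules[i + 1:]:
            verdict = "different" if differ(a, b) else "not distinguishable"
            print(f"{a} vs {b}: {verdict}")
```

Storing the letters as strings lets the same helper handle reports in which a level carries more than one letter.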
Checking Assumptions
• As with one-way ANOVA, two-way ANOVA is a
special case of multiple regression and relies on
the assumptions:
– Linearity: Automatically satisfied
– Constant variance: Spread within groups is the same
for all groups.
– Normality: Distribution within each group is normal.
• To check assumptions, combine two factors into
one factor (Combination) and check
assumptions as in one-way ANOVA.
Checking Assumptions
Means and Std Deviations
Level       Number      Mean   Std Dev   Std Err Mean   Lower 95%   Upper 95%
GreenBlo        12   119.417   37.4929         10.823       95.59      143.24
GreenScr        12   150.750   33.5129          9.674      129.46      172.04
RedBlock        12   148.083   44.8461         12.946      119.59      176.58
RedScrip        12   161.417   36.1272         10.429      138.46      184.37
• Check for constant variance: (Largest
group standard deviation / Smallest
group standard deviation)
= (44.85/33.51) = 1.34 < 2. Constant variance OK.
• Check for normality: Look at normal
quantile plots for each combination (not
shown). For all normal quantile plots, the
points fall within the 95% confidence
bands. Normality assumption OK.
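The constant variance check is a one-line computation once the group standard deviations are in hand. This is a minimal sketch using the values from the Means and Std Deviations output, with the level labels kept as they appear there; the threshold of 2 is the rule of thumb used in these notes.

```python
# Group standard deviations from the Means and Std Deviations output.
std_dev = {
    "GreenBlo": 37.4929,
    "GreenScr": 33.5129,
    "RedBlock": 44.8461,
    "RedScrip": 36.1272,
}

# Ratio of the largest to the smallest group standard deviation.
ratio = max(std_dev.values()) / min(std_dev.values())
constant_variance_ok = ratio < 2  # rule of thumb used in these notes
print(f"largest/smallest SD = {ratio:.2f}; constant variance OK: {constant_variance_ok}")
```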
Two-way Analysis of Variance: Steps in Analysis
1. Check assumptions (constant variance, normality,
independence). If constant variance is violated, try
transformations.
2. Use the effect test (commonly called the F-test) to test
whether there is an interaction.
3. If there is no interaction, use the main effect tests to
test whether each factor has an effect. Compare individual
levels of a factor by using t-tests with Bonferroni
correction for the number of comparisons being made.
4. If there is an interaction, use the interaction plot to
visualize the interaction. Create a combination of the
factors and use Tukey’s HSD procedure to investigate
which groups are different, taking into account the fact
that multiple comparisons are being done.
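For step 3, the Bonferroni correction amounts to dividing the overall significance level by the number of pairwise comparisons among the levels of a factor. This is a minimal sketch, assuming an overall level of 0.05 (a conventional choice, not stated in these notes) and a hypothetical helper `bonferroni_threshold`.

```python
from itertools import combinations

def bonferroni_threshold(levels, alpha=0.05):
    """Return the level pairs and the per-comparison significance threshold."""
    pairs = list(combinations(levels, 2))
    return pairs, alpha / len(pairs)

# Example: the three levels of Condition from the flextime data.
pairs, threshold = bonferroni_threshold(["Flex", "FourDay", "Regular"])
print(f"{len(pairs)} comparisons; declare a difference when p < {threshold:.4f}")
```

A factor with only two levels, such as Color or TypeStyle, involves a single comparison, so the threshold is the unadjusted level.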