Transcript Topic 11
Topic 11 – ANOVA II
Balanced Two-Way
ANOVA
(Chapter 19)
1
Two Way ANOVA
We are now interested in the combined effects
of two factors, A and B, on a response (note:
the text refers to these as R = rows and
C = columns; we’ll call them A, B, and later
C for a 3-way ANOVA). Examples:
Want to consider the effects of diet plan (Factor A)
and exercise program (Factor B) on weight.
Want to consider the effects of a drug (Factor A)
and a vitamin tablet (Factor B) on blood pressure.
2
Two Way ANOVA (2)
Interested in a combination of the two
factors; unlike blocking, both of primary
interest.
Could treat each combination of factors as a
treatment and do one-way ANOVA, but then
you have to use contrasts a lot in order to
test the hypotheses of interest.
3
Two Way ANOVA (3)
Interaction is a possibility.
Replication is required to investigate
interaction. You generally want at least
two observations per treatment
combination.
You also generally want a balanced
design (for this topic, we’ll assume cell
sizes are equal).
4
Example (Problem 19.1)
An animal experiment is designed to
investigate whether the drug Levorphanol
reduces stress as reflected in the cortical
sterone level.
It is also likely that Epinephrine (adrenaline)
levels have some effect and there may be
an interaction with the drug effect as well.
Some animals were given a drug that raised
their normal levels of Epinephrine.
5
Example (2)
Control   Levorphanol   Epinephrine   Both
1.90      0.82          5.33          3.08
1.80      3.36          4.84          1.42
1.54      1.64          5.26          4.54
4.10      1.74          4.92          1.25
1.89      1.21          6.07          2.57
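A minimal Python sketch of the data set, assuming the values in the table read four to a row under the labels Control, Levorphanol, Epinephrine, Both (this reading reproduces the cell means quoted later in these notes). The variable names are illustrative, not part of the original SAS program.

```python
# Stress responses, five animals per treatment group.
# Assumption: values are read row by row under the four group labels.
data = {
    "Control":     [1.90, 1.80, 1.54, 4.10, 1.89],
    "Levorphanol": [0.82, 3.36, 1.64, 1.74, 1.21],
    "Epinephrine": [5.33, 4.84, 5.26, 4.92, 6.07],
    "Both":        [3.08, 1.42, 4.54, 1.25, 2.57],
}

# Cell mean = sum of the observations in a group divided by the group size.
cell_means = {group: sum(ys) / len(ys) for group, ys in data.items()}

for group, mean in cell_means.items():
    print(f"{group:12s} n={len(data[group])} mean={mean:.3f}")
```

The design is balanced: every group has the same number of observations.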
6
Example (3)
If we treat this as one-way ANOVA (which is not
ideal in real life but useful here for instructional
purposes), then we can design contrasts to
investigate the effects (C = Control, L = Levorphanol,
E = Epinephrine, B = Both):
Comparison     C    L    E    B
L effect      -1    1   -1    1
E effect      -1   -1    1    1
Interaction    1   -1   -1    1
7
SAS Code (one-way)
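proc glm;
  class trt;      /* trt = treatment group (four levels) */
  model y=trt;    /* one-way ANOVA on the response y */
  /* coefficients assume the trt levels sort in the order C, L, E, B */
  contrast 'L' trt -1 1 -1 1;
  contrast 'E' trt -1 -1 1 1;
  contrast 'Interaction' trt 1 -1 -1 1;
  means trt /tukey;
run; quit;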
8
SAS Output
Source         DF     SS       MS      F Value   Pr > F
Model           3    37.58    12.53    12.30     0.0002
Error          16    16.30     1.02
Total          19    53.88

Contrast       DF     SS       MS      F Value   Pr > F
L               1    12.83    12.83    12.60     0.0027
E               1    18.59    18.59    18.25     0.0006
Interaction     1     6.16     6.16     6.05     0.0257
9
Notes
The three contrasts have a special property:
they are “orthogonal”. Their sums of squares add
up to the model SS: 12.83 + 18.59 + 6.16 = 37.58.
We see that there are distinguishable effects for
both drugs, plus an interaction (F-tests).
A significant interaction means that the size of
the L effect is different at different levels of E (or
equivalently, the size of the E effect is different
for different levels of L); more later.
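The additivity of the contrast sums of squares can be checked numerically. The sketch below assumes the balanced-design formula SS = n(Σ c_i ȳ_i)² / Σ c_i² and uses the four cell means reported later in these notes; the function and variable names are illustrative.

```python
n = 5  # observations per treatment group
# Cell means in the order C, L, E, B (Control, Levorphanol, Epinephrine, Both).
means = [2.246, 1.754, 5.284, 2.572]

contrasts = {
    "L":           [-1,  1, -1, 1],
    "E":           [-1, -1,  1, 1],
    "Interaction": [ 1, -1, -1, 1],
}

def contrast_ss(coefs, group_means, n_per_group):
    """Contrast SS in a balanced design: n * (sum c_i * mean_i)^2 / sum c_i^2."""
    estimate = sum(c * m for c, m in zip(coefs, group_means))
    return n_per_group * estimate ** 2 / sum(c * c for c in coefs)

ss = {name: contrast_ss(c, means, n) for name, c in contrasts.items()}
model_ss = sum(ss.values())

# Orthogonality: every pair of coefficient vectors has a zero dot product.
names = list(contrasts)
orthogonal = all(
    sum(x * y for x, y in zip(contrasts[p], contrasts[q])) == 0
    for i, p in enumerate(names)
    for q in names[i + 1:]
)

for name in names:
    print(f"{name:12s} SS = {ss[name]:.2f}")
print(f"Sum = {model_ss:.2f}, orthogonal = {orthogonal}")
```

With three orthogonal contrasts on four group means, all 3 model degrees of freedom are used, which is why the contrast SS account for the entire model SS.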
10
Two-Way ANOVA
Break up the treatments into two factors.
Factor 1: Levorphanol (Present / Absent)
Factor 2: Epinephrine (High / Low)
Investigates all combinations of the two
factors
              Ep.
Lev.       Low      High
Absent     xxxxx    xxxxx
Present    xxxxx    xxxxx
11
Two-Way ANOVA (2)
Rows / Columns of table represent levels of the
factors. We have 5 observations at each
combination of levels. Another way to view the
design:
12
Output from interaction model
Source    DF     SS       MS      F Value   Pr > F
Model      3    37.58    12.53    12.30     0.0002
Error     16    16.30     1.02
Total     19    53.88

Source    DF   Type I SS   MS      F Value   Pr > F
E          1    18.59     18.59    18.25     0.0006
L          1    12.83     12.83    12.60     0.0027
E*L        1     6.16      6.16     6.05     0.0257
Note: Type III SS will be the same,
since this is a balanced design.
13
Statistical Model
a levels of Factor A; b levels of Factor B
n observations per cell, so the total number
of observations is nab.
Usual basic assumptions: Independent &
Normal Errors with Constant Variance.
14
Statistical Model (2)
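$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
$i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$
$\mu$ = grand mean
$\alpha_i$ = $i$th level effect of Factor A
$\beta_j$ = $j$th level effect of Factor B
$(\alpha\beta)_{ij}$ = interaction effect in cell $ij$
$\varepsilon_{ijk} \sim N(0, \sigma^2)$, independent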
15
Statistical Model (3)
As before we think of all the effects in terms
of deviation from the grand (overall) mean.
We need parameter restrictions:
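$\sum_i \alpha_i = 0$
$\sum_j \beta_j = 0$
$\sum_i (\alpha\beta)_{ij} = 0$ for each $j$
$\sum_j (\alpha\beta)_{ij} = 0$ for each $i$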
SAS does things slightly differently – making
mu the mean for the last treatment
combination (labeled ab), and setting any
parameter with one of those levels to zero.
16
Statistical Model (4)
Original ANOVA table (SAS) will have an
“overall” F-test.
Simply tests whether ANY of the factors are
significant.
Not particularly useful (unless insignificant) as it
doesn’t differentiate between the factors.
Because of this, we usually create an
“extended” ANOVA table by replacing the
model line with the Type I SS.
17
Analysis of Variance Table
Source            DF            SS     MS     F0
Factor A          a-1           SSA    MSA    MSA/MSE
Factor B          b-1           SSB    MSB    MSB/MSE
AB Interaction    (a-1)(b-1)    SSAB   MSAB   MSAB/MSE
Error             ab(n-1)       SSE    MSE
Total             abn-1         SST
18
Replication
Recall that RCBD typically has n = 1.
Notice what happens in the ANOVA table if we
have n = 1.
In this case, we will not be able to investigate
interaction as the DF for error would be zero
(interaction and error effects would be
confounded, or inseparable – a different type of
“confounding” than we discussed last time).
We need at least two replicates in order to
assess interaction.
19
Breakdown of SS
As before, don’t worry too much about the
formulas in the book. Do remember:
SSModel + SSError = SSTotal
SSModel gets broken down via the Type I SS
For balanced design, Type III SS will all be the
same as Type I SS. Balanced designs are to
be preferred.
Know other relationships within the ANOVA table
and how to put together test statistics.
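As an illustration of the breakdown, the sketch below computes the sums of squares for the Levorphanol/Epinephrine example directly from the raw data. It assumes Control = (low E, L absent), Levorphanol = (low E, L present), Epinephrine = (high E, L absent) and Both = (high E, L present), and uses the usual balanced-design formulas; the names are illustrative.

```python
# Raw data keyed by (E level, L level); five animals per cell.
cells = {
    ("Low", "Absent"):   [1.90, 1.80, 1.54, 4.10, 1.89],
    ("Low", "Present"):  [0.82, 3.36, 1.64, 1.74, 1.21],
    ("High", "Absent"):  [5.33, 4.84, 5.26, 4.92, 6.07],
    ("High", "Present"): [3.08, 1.42, 4.54, 1.25, 2.57],
}
e_levels, l_levels, n = ["Low", "High"], ["Absent", "Present"], 5

def mean(values):
    return sum(values) / len(values)

def marginal_mean(position, level):
    """Mean of all observations whose key matches `level` at `position` (0 = E, 1 = L)."""
    return mean([y for key, ys in cells.items() if key[position] == level for y in ys])

all_y = [y for ys in cells.values() for y in ys]
grand = mean(all_y)
cell_mean = {key: mean(ys) for key, ys in cells.items()}
e_mean = {e: marginal_mean(0, e) for e in e_levels}
l_mean = {l: marginal_mean(1, l) for l in l_levels}

ss_e = n * len(l_levels) * sum((e_mean[e] - grand) ** 2 for e in e_levels)
ss_l = n * len(e_levels) * sum((l_mean[l] - grand) ** 2 for l in l_levels)
ss_el = n * sum(
    (cell_mean[e, l] - e_mean[e] - l_mean[l] + grand) ** 2
    for e in e_levels for l in l_levels
)
ss_model = ss_e + ss_l + ss_el
ss_error = sum((y - cell_mean[key]) ** 2 for key, ys in cells.items() for y in ys)
ss_total = sum((y - grand) ** 2 for y in all_y)

print(f"SSE={ss_e:.2f} SSL={ss_l:.2f} SSEL={ss_el:.2f}")
print(f"SSModel={ss_model:.2f} SSError={ss_error:.2f} SSTotal={ss_total:.2f}")
```

The three pieces of the model SS add up cleanly only because the design is balanced.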
20
Tests
There is a specific order in which you need
to do the tests.
General Rule of Thumb: Test higher order
terms first.
If higher order terms are significant (say AB
interaction), then tests for lower order terms (A
and B main effects) do not matter as much.
Why?
21
Test for Interaction
F-statistic: $F = MSAB / MSE$
Hypotheses
$H_0: (\alpha\beta)_{ij} = 0$ for all $i, j$
$H_a$: There is some non-zero $(\alpha\beta)_{ij}$
If $F > F_{0.05;\ DF_{num} = (a-1)(b-1),\ DF_{den} = ab(n-1)}$
then reject H0.
If insignificant, then can test for main effects.
22
Test for Main Effect Factor A
F-statistic: $F = MSA / MSE$
Hypotheses
$H_0: \alpha_1 = \alpha_2 = \ldots = \alpha_a = 0$
$H_a$: There is some non-zero $\alpha_i$
If $F > F_{0.05;\ DF_{num} = a-1,\ DF_{den} = ab(n-1)}$ then reject H0.
23
Test for Main Effect Factor B
F-statistic: $F = MSB / MSE$
Hypotheses
$H_0: \beta_1 = \beta_2 = \ldots = \beta_b = 0$
$H_a$: There is some non-zero $\beta_j$
If $F > F_{0.05;\ DF_{num} = b-1,\ DF_{den} = ab(n-1)}$ then reject H0.
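A short sketch of the three F-tests for the Levorphanol/Epinephrine example, using the rounded sums of squares from the SAS output in these notes. It relies on one shortcut that applies only because every numerator here has 1 degree of freedom: the 5% critical value F(0.05; 1, 16) equals the square of the two-sided t critical value 2.12 quoted later in the notes. Names are illustrative.

```python
a, b, n = 2, 2, 5  # levels of E, levels of L, observations per cell

# Rounded values from the SAS output in these notes.
ss = {"E": 18.59, "L": 12.83, "E*L": 6.16}
df = {"E": a - 1, "L": b - 1, "E*L": (a - 1) * (b - 1)}
ss_error, df_error = 16.30, a * b * (n - 1)

mse = ss_error / df_error

# Valid only for 1 numerator df: F(0.05; 1, 16) = t(0.975; 16)^2.
f_crit = 2.12 ** 2

results = {}
for term in ["E*L", "E", "L"]:  # test the higher-order term first
    f_stat = (ss[term] / df[term]) / mse
    results[term] = (f_stat, f_stat > f_crit)
    print(f"{term:4s} F = {f_stat:6.2f}  reject H0: {f_stat > f_crit}")
```

Because the inputs are rounded to two decimals, the F values can differ from the SAS output in the last digit.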
24
Comparing Factors / Levels
If there is an interaction – you must do
comparisons for one factor at each level of
the other factor. (More later)
If a main effect tests significant (whether or
not there is interaction), you can study the
main effect by averaging over all levels of
the other factor.
This will be most meaningful if there truly is no
interaction; when interaction is present, still best
to study it first.
25
Basic SAS Code
proc glm;
class E L;
model stress = E|L;
Bar notation in model statement tells SAS to
include all combined effects (all main effects
and all interactions) for the factors involved.
Alternatively could do E L E*L; which may
be important if you don’t want all interactions
in a model.
26
Output from interaction model
Source    DF     SS       MS      F Value   Pr > F
Model      3    37.58    12.53    12.30     0.0002
Error     16    16.30     1.02
Total     19    53.88

Source    DF   Type I SS   MS      F Value   Pr > F
E          1    18.59     18.59    18.25     0.0006
L          1    12.83     12.83    12.60     0.0027
E*L        1     6.16      6.16     6.05     0.0257
Note: Type III SS will be the same,
since this is a balanced design.
27
Create “Extended” Table
Source    DF     SS       MS      F Value   Pr > F
E          1    18.59    18.59    18.25     0.0006
L          1    12.83    12.83    12.60     0.0027
E*L        1     6.16     6.16     6.05     0.0257
Error     16    16.30     1.02
Total     19    53.88
The Type I SS lines replace the model line.
This makes it easy to see not only that there
are significant effects, but which effects in
particular are important.
28
Where next?
Interaction is significant, so we will want to
study that.
Because main effects are also significant,
we will be able to distinguish them to some
extent from the interaction.
Our statements about the main effects, however,
will be less useful than they would be if there
were no interaction.
29
Computing the main effects
The means for each combination are
             Low E    High E
L Absent     2.246    5.284
L Present    1.754    2.572
Compare Levorphanol (average out epin.)
$\frac{1.754 + 2.572}{2} - \frac{2.246 + 5.284}{2} = -1.602$
30
Main Effects (2)
Adding L results in a significant decrease in
stress levels for the animals.
Compare Epinephrine (average out lev.)
$\frac{5.284 + 2.572}{2} - \frac{2.246 + 1.754}{2} = 1.928$
Higher levels of E result in significant
increases in the stress levels.
31
Interaction
What about the interaction?
Difference in effect of L between low E and high
E:
$(2.572 - 5.284) - (1.754 - 2.246)$
$= -2.712 - (-0.492) = -2.220$
This says that the effect size for the use of L
is much larger when E levels are high. You
get a greater reduction in stress (or more
bang for your buck).
32
Interaction (2)
You could go the other way and consider:
Difference in effect of E when L is used (vs. not
used):
$(2.572 - 1.754) - (5.284 - 2.246)$
$= 0.818 - 3.038 = -2.220$
Interpretation: The effect size for having
higher levels of E is smaller when the drug L
is used.
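The main-effect and interaction calculations from the last few slides can be collected in a few lines. This is only a sketch based on the four cell means; the dictionary layout and names are illustrative.

```python
# Cell means keyed by (L status, E level).
m = {
    ("Absent", "Low"): 2.246, ("Absent", "High"): 5.284,
    ("Present", "Low"): 1.754, ("Present", "High"): 2.572,
}

# Main effect of L: mean with L present minus mean with L absent, averaged over E.
l_effect = (m["Present", "Low"] + m["Present", "High"]) / 2 \
         - (m["Absent", "Low"] + m["Absent", "High"]) / 2

# Main effect of E: mean at high E minus mean at low E, averaged over L.
e_effect = (m["Absent", "High"] + m["Present", "High"]) / 2 \
         - (m["Absent", "Low"] + m["Present", "Low"]) / 2

# Effect of L at each level of E, and their difference (the interaction).
l_at_high = m["Present", "High"] - m["Absent", "High"]
l_at_low = m["Present", "Low"] - m["Absent", "Low"]
interaction_via_l = l_at_high - l_at_low

# Effect of E with and without L, and their difference (the same interaction).
e_with_l = m["Present", "High"] - m["Present", "Low"]
e_without_l = m["Absent", "High"] - m["Absent", "Low"]
interaction_via_e = e_with_l - e_without_l

print(f"L main effect {l_effect:.3f}, E main effect {e_effect:.3f}")
print(f"interaction {interaction_via_l:.3f} (via L), {interaction_via_e:.3f} (via E)")
```

Both routes give the same difference of differences, which is why the interaction can be described from either factor's point of view.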
33
Conclusions
Higher levels of E result in greater stress (as
we might expect).
The drug L seems to effectively lower stress
levels regardless of E levels. But it works
more efficiently at higher levels of E and
lowers stress levels by a greater amount.
34
Interactions
When there is significant interaction,
the key to the analysis is studying the
interaction and interpreting it.
35
Interaction Plots
Two choices:
Plot MEAN Response vs. Factor B by Factor A
Plot MEAN Response vs. Factor A by Factor B
Possible outcomes include
Main Effects but No Interaction
One Main Effect but No Interaction
Same Direction Interaction (as in example;
increase or decrease, but not by same amount)
Reverse Interaction
See Section 19-6-2.
36
Interaction Plots
We’ll now take a look at a number of 2 x 2
interaction plots to get an idea of what to
look for.
Once you learn to look at the plots, you will
be able to:
Determine if interaction is present
Estimate effect sizes from the plots
37
Interaction Plots (1)
Main Effects, No Interaction
38
Interaction Plots (2)
Only Factor A Main Effect
39
Interaction Plots (3)
Decreasing A effect, but not by same amt.
40
Interaction Plots (4)
Reverse Interaction (Main effects likely appear
insignificant due to the interaction)
41
Key to Interpretation
If interaction is present, then the effect of one
factor depends on the level of the other
factor.
Hence main effects carry explicit meaning only if
we have no interaction.
If opposite behavior (as on the previous slide),
main effect might be cancelled out by the
interaction and appear insignificant.
When interaction is present, you need to
discuss the effect of each variable at a
specific level of the other (cannot separate).
42
Example
We return to our example
Effect of L at each level of E (L present minus L
absent):
High E: -2.712
Low E: -0.492
Can we test for significance?
Need to develop standard errors.
43
Example (2)
From ANOVA table, MSE = 1.019 on 16 degrees of
freedom. For any two-sided t-test, the critical value
for significance level 0.05 will be 2.12.
For E = low, the estimated L effect is -0.492 (see
previous slide). The standard error for that
difference would be $\sqrt{1.019\,(1/5 + 1/5)} = 0.6384$.
So T = -0.492/0.6384 = -0.77. Since |T| < 2.12, we conclude that the
L effect is not significantly different from 0 for low E.
For E = high, the estimated effect of L is -2.712 with SE
of 0.6384. Hence T = -4.25, and since |T| > 2.12 we conclude that
the L effect is a significant decrease for high E.
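The two t-tests above can be reproduced with a few lines of Python. The sketch takes MSE, the cell size, and the critical value 2.12 from the notes as given; the names are illustrative.

```python
import math

mse = 1.019      # error mean square from the ANOVA table (16 df)
n = 5            # observations in each of the two cells being compared
t_crit = 2.12    # two-sided 5% critical value for 16 df, as quoted in the notes

# Standard error of a difference between two cell means.
se = math.sqrt(mse * (1 / n + 1 / n))

# Estimated L effect (present minus absent) at each level of E.
l_effect = {"Low E": 1.754 - 2.246, "High E": 2.572 - 5.284}

t_stats = {level: diff / se for level, diff in l_effect.items()}
significant = {level: abs(t) > t_crit for level, t in t_stats.items()}

for level in l_effect:
    print(f"{level}: effect={l_effect[level]:.3f} SE={se:.4f} "
          f"T={t_stats[level]:.2f} significant={significant[level]}")
```

These are unadjusted comparisons; a multiple-comparison adjustment such as Tukey's would use a larger critical value.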
44
Obtaining these tests from SAS
First, note that you should be able to do the
minor calculations in these notes
by hand.
In SAS, we’ll use the LSMEANS statement in
PROC GLM, and can get all the numbers we
have computed.
New: As an option in LSMeans for the
interaction A*B, use SLICE = <B> to obtain
tests for the significance of Factor A at fixed
levels of Factor B.
45
SAS Code
proc glm;
class E L;
model stress = E|L;
lsmeans L*E / slice=E
adjust=tukey cl pdiff;
46
Studying Interaction
Recall: Since the interaction was significant,
we must study this data at the interaction
level.
We were able to examine main effects (also
significant), but that analysis must be taken with
a grain of salt (you’ll see why in a moment).
Our LSMEANS statement will produce
output for the L effect at each level of E.
47
Basic LSMeans Output
E       L          Stress LSMEAN    LSMEAN Number
High    Absent     5.28400000       1
High    Present    2.57200000       2
Low     Absent     2.24600000       3
Low     Present    1.75400000       4

i/j     1         2         3         4
1                 0.0031    0.0011    0.0002
2       0.0031              0.9553    0.5870
3       0.0011    0.9553              0.8664
4       0.0002    0.5870    0.8664
48
L Effect sliced by E levels
E*L Effect Sliced by E for Stress
E       DF    SS       MS       F Value   Pr > F
High     1    18.39    18.39    18.05     0.0006
Low      1     0.61     0.61     0.59     0.4521
Conclusions
If the epinephrine level is high, then the drug is
effective.
But if the epinephrine level is low, the drug doesn’t
do anything that is statistically significant.
49
Could “slice” on Levorphanol
E*L Effect Sliced by L for y
L    DF    SS       MS       F Value   Pr > F
0     1    23.07    23.07    22.65     0.0002
1     1     1.67     1.67     1.64     0.2183
Conclusions
If the drug is not used, higher epinephrine levels
result in a significant stress level increase.
If the drug is used, then higher epinephrine levels
don’t change the stress level significantly.
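The sliced sums of squares can be checked by hand. With two levels of the factor being tested and n observations per cell, the SS for a slice is n × (difference of the two cell means)² / 2, and F is that SS divided by the overall MSE. The sketch below assumes this formula and uses the rounded MSE from the ANOVA table; the names are illustrative.

```python
n = 5                 # observations per cell
mse = 16.30 / 16      # error mean square from the ANOVA table

# Cell means keyed by (E level, L status).
cell = {
    ("Low", "Absent"): 2.246, ("Low", "Present"): 1.754,
    ("High", "Absent"): 5.284, ("High", "Present"): 2.572,
}

def slice_test(mean_1, mean_2):
    """Return (SS, F) for comparing two cell means within one slice (1 df)."""
    ss = n * (mean_1 - mean_2) ** 2 / 2
    return ss, ss / mse

# L effect within each level of E.
l_at_e = {e: slice_test(cell[e, "Present"], cell[e, "Absent"]) for e in ("High", "Low")}
# E effect within each level of L.
e_at_l = {l: slice_test(cell["High", l], cell["Low", l]) for l in ("Absent", "Present")}

for label, (ss, f) in {**l_at_e, **e_at_l}.items():
    print(f"{label:8s} SS = {ss:6.2f}  F = {f:6.2f}")
```

Each slice F is, up to rounding, the square of the corresponding t statistic for the difference of the two cell means.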
50
Conclusions
Even though the main effects are
“significant” – the presence of interaction
means we must talk about the combined
effects (as on the previous two slides).
The conclusions based on main effects
would be inaccurate.
Main: Drug is effective (wrong!)
Interaction: Drug is effective for those with high
levels of epinephrine (correct!)
51
Interaction Plots
Interaction plots provide another useful way
to look at interaction effects.
An interaction plot is a plot of the treatment
means at each level of one factor for each
level of the other factor.
The plots are overlaid so that you wind up
with a plot in which the different lines
represent the different levels of the 2nd factor.
52
Interaction Plots (2)
To obtain an interaction plot:
1. Use the SORT procedure to sort the data by each
treatment.
2. Use the MEANS procedure to obtain the means
for each combination.
3. Use the GPLOT procedure (and associated
statements) to produce the plot
53
SAS Code
proc sort; by E L;
proc means;
output out=iplot mean=means;
by E L;
proc print; run;
Note: Produces only the combined means
which are the ones you actually want.
54
Output Data Set
Obs    E    L    _TYPE_    _FREQ_    means
1      0    0    0         5         2.246
2      0    1    0         5         1.754
3      1    0    0         5         5.284
4      1    1    0         5         2.572
55
Alternative Code (class stmt)
proc sort; by E L;
proc means;
output out=iplot mean=means;
class E L;
proc print; run;
Note: Produces means for individual
variables as well as combinations.
56
Output Data Set (class stmt)
Obs    E    L    _TYPE_    _FREQ_    means
1      .    .    0         20        2.964
2      .    0    1         10        3.765
3      .    1    1         10        2.163
4      0    .    2         10        2.000
5      1    .    2         10        3.928
6      0    0    3         5         2.246
7      0    1    3         5         1.754
8      1    0    3         5         5.284
9      1    1    3         5         2.572
57
Interesting Tidbit
Means from CLASS statement can be used to
produce effect sizes (which you should be able
to calculate as we did in 1-way ANOVA).
$\hat{\mu} = 2.964$
$\hat{\alpha}_{E=low} = 2.000 - 2.964 = -0.964$
$\hat{\beta}_{L=pres} = 2.163 - 2.964 = -0.801$
$\widehat{(\alpha\beta)}_{low,pres} = 1.754 - 2.163 - 2.000 + 2.964 = +0.555$
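The same arithmetic extends to every level and cell. The sketch below assumes E = 0 is the low level and L = 1 means the drug is present (as in the calculations above) and takes the means from the CLASS-statement output; it also confirms the sum-to-zero restrictions on the estimated effects. Names are illustrative.

```python
grand = 2.964                                    # overall mean (_TYPE_ = 0)
e_means = {"Low": 2.000, "High": 3.928}          # _TYPE_ = 2 rows
l_means = {"Absent": 3.765, "Present": 2.163}    # _TYPE_ = 1 rows
cell = {                                         # _TYPE_ = 3 rows, keyed by (E, L)
    ("Low", "Absent"): 2.246, ("Low", "Present"): 1.754,
    ("High", "Absent"): 5.284, ("High", "Present"): 2.572,
}

# Factor effects are deviations of the marginal means from the grand mean.
alpha = {e: e_means[e] - grand for e in e_means}
beta = {l: l_means[l] - grand for l in l_means}

# Interaction effect: cell mean minus both marginal means plus the grand mean.
ab = {(e, l): cell[e, l] - e_means[e] - l_means[l] + grand for (e, l) in cell}

print({k: round(v, 3) for k, v in alpha.items()})
print({k: round(v, 3) for k, v in beta.items()})
print({k: round(v, 3) for k, v in ab.items()})
```

In a 2 x 2 design the four interaction effects all have the same magnitude, differing only in sign.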
58
Plotting the Combined Means
symbol1 v=dot i=join;
axis1 offset=(5,5) order=('Low' 'High')
label=('Epinephrine');
axis2 label=( angle=90
'Mean Cortical Sterone Level')
order=(0,1,2,3,4,5,6);
proc gplot data=iplot;
plot means*E=L /haxis=axis1 vaxis=axis2;
where _Type_ = 3;
59
Interaction Plot
[Interaction plot: mean cortical sterone level (vertical axis, 0 to 6) against epinephrine level (Low, High), with one line for L = Absent and one for L = Present.]
60
Conclusions (as before)
Higher levels of epinephrine increase stress
levels.
Levorphanol brings stress levels back into
normal range. (It is not effective for anything
if stress levels are already normal).
Levorphanol appears to be useful in animals
with abnormally high stress levels.
61
CLG Activity
Examine some interaction plots.
Perform a two-way ANOVA.
62