Transcript anova
What Is It?
• Analysis of Variance (ANOVA): allows for the simultaneous
comparison of the difference between two or more means
• Partition: a statistical procedure in which the total variance
is divided into separate components
– Partitioning of variance is what gives the ANOVA its name
• One-Way ANOVA: compares the means of two or more levels
(typically three or more) of a single IV
General Linear Model
• Factor: the term used for an IV in an ANOVA
– Factors have several Levels (values or conditions)
• Major Assumptions:
– The only difference in means is due to the levels of the IV
– The variances of the groups are equivalent
(homogeneity of variance)
Assumptions of ANOVA
• Data meet the criteria for parametric statistics
(interval/ratio level data).
• The data are normally distributed in each group.
• There is homogeneity of variance
• The observations in each sample are independent
of one another.
Components of Variance
• Total Variance: the variance of
all scores in the data set
regardless of experimental group
\[ \hat{s}^2_{\text{total}} = \frac{\sum (X_{ij} - \bar{X})^2}{N - 1} \]
\( \bar{X} \) = the grand mean
• Comprised of:
– Between-Groups Variance
– Within-Groups Variance
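As a rough illustration of the partition, the sketch below computes the total variance for a small set of hypothetical scores and checks that the total sum of squares splits into a between-groups part and a within-groups part. The data and the variable names are assumptions made for the example; they do not come from the slides.

```python
# Hypothetical recall scores for three groups (illustrative data only).
groups = {
    "little": [10.0, 11.0, 12.0],
    "average": [13.0, 14.0, 15.0],
    "sufficient": [16.0, 17.0, 18.0],
}

scores = [x for xs in groups.values() for x in xs]
n_total = len(scores)
grand_mean = sum(scores) / n_total

# Total variance: squared deviations from the grand mean, divided by N - 1.
ss_total = sum((x - grand_mean) ** 2 for x in scores)
s2_total = ss_total / (n_total - 1)

# The same total sum of squares, split into its two components.
ss_between = sum(
    len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in groups.values()
)
ss_within = sum(
    (x - sum(xs) / len(xs)) ** 2 for xs in groups.values() for x in xs
)

print(s2_total)                          # 7.5
print(ss_total, ss_between + ss_within)  # 60.0 60.0
```

Note that it is the sums of squares that add up exactly; the variances themselves are obtained by dividing each sum of squares by its own degrees of freedom.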
Within-Groups Variance
• Within-Groups Variance:
estimate of the average
variance within each group
\[ \hat{s}^2_{\text{within}} = \frac{\sum_{j} \frac{\sum_{i} (X_{ij} - \bar{X}_j)^2}{n_j - 1}}{k} \]
k = the number of groups
• Homogeneity of Variance:
\[ \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \dots = \sigma_j^2 \]
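A minimal sketch of the within-groups variance as defined on this slide, that is, the plain average of the per-group sample variances. The scores and the function name `within_groups_variance` are hypothetical. When group sizes are unequal, statistical software typically weights each group's variance by its degrees of freedom instead; with equal group sizes, as here, the two approaches give the same value.

```python
from statistics import variance  # sample variance (divides by n - 1)

# Hypothetical recall scores for three groups (illustrative data only).
groups = {
    "little": [10.0, 11.0, 12.0],
    "average": [13.0, 14.0, 15.0],
    "sufficient": [16.0, 17.0, 18.0],
}


def within_groups_variance(groups):
    """Average the sample variances of the k groups."""
    group_variances = [variance(xs) for xs in groups.values()]
    k = len(group_variances)
    return sum(group_variances) / k


s2_within = within_groups_variance(groups)
print(s2_within)  # 1.0
```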
Between-Groups Variance
• Between-Groups Variance: estimate of variance between
group means
\[ \hat{s}^2_{\text{between}} = \frac{\sum n_j (\bar{X}_j - \bar{X})^2}{k - 1} \]
• Two Sources:
– Error Variance: uncontrolled
and unpredicted differences
among individual scores; the
within-groups variance estimates
the error variance
– Treatment Variance: the variance
among group means that is due to
the effects of the IV
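The between-groups formula can be sketched the same way. The scores below are hypothetical and the function name `between_groups_variance` is an assumption made for the example.

```python
# Hypothetical recall scores for three groups (illustrative data only).
groups = {
    "little": [10.0, 11.0, 12.0],
    "average": [13.0, 14.0, 15.0],
    "sufficient": [16.0, 17.0, 18.0],
}


def between_groups_variance(groups):
    """Size-weighted squared deviations of group means from the grand mean, over k - 1."""
    scores = [x for xs in groups.values() for x in xs]
    grand_mean = sum(scores) / len(scores)
    k = len(groups)
    ss_between = sum(
        len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in groups.values()
    )
    return ss_between / (k - 1)


s2_between = between_groups_variance(groups)
print(s2_between)  # 27.0
```

In this made-up data set the group means (11, 14, 17) are far apart relative to the spread inside each group, so the between-groups variance is large.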
The F-Ratio
• F-Ratio: the ratio of the between-groups variance divided
by the within-groups variance
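• Can be expressed as:
\[ F = \frac{\text{treatment variance} + \text{error variance}}{\text{error variance}} \]
• Or
\[ F = \frac{\text{between-groups variance}}{\text{within-groups variance}} \]
• Or
\[ F = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{within}}} \]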
The F-Ratio
• No Treatment Effect
\[ F = \frac{0.0 + 5.0}{5.0} = 1.00 \]
• Treatment Effect Present
\[ F = \frac{5.0 + 5.0}{5.0} = 2.00 \]
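The two cases on this slide can be reproduced with a few lines of arithmetic. The function name `f_ratio` is just a label chosen for this sketch.

```python
def f_ratio(treatment_variance, error_variance):
    """F = (treatment variance + error variance) / error variance."""
    between = treatment_variance + error_variance  # between-groups variance
    within = error_variance                        # within-groups variance
    return between / within


print(f_ratio(0.0, 5.0))  # 1.0 (no treatment effect)
print(f_ratio(5.0, 5.0))  # 2.0 (treatment effect present)
```

With no treatment effect, the numerator and denominator estimate the same error variance, so F sits near 1; any treatment variance pushes F above 1.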
Is it Significant?
• Same concept as for other test statistics: the F distribution gives the
probability of various F-ratios when the null hypothesis is true
• Two types of degrees of freedom determine the shape of
the distribution.
– Between-Groups: \( df_{\text{between}} = k - 1 \)
– Within-Groups: \( df_{\text{within}} = \sum (n_j - 1) \) or \( df_{\text{within}} = N - k \)
• If computed F > critical F (or if the computer tells you it is),
the F statistic is significant.
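As a quick check of the two degrees-of-freedom formulas, the sketch below uses the group sizes 8, 7, and 5 reported in the SPSS output later in this transcript. The variable names are assumptions made for the example.

```python
# Group sizes from the sleep example in the SPSS output (Little, Average, Sufficient).
group_sizes = [8, 7, 5]

k = len(group_sizes)        # number of groups
n_total = sum(group_sizes)  # total number of observations, N

df_between = k - 1
df_within = sum(n - 1 for n in group_sizes)

# The two expressions for the within-groups df are equivalent.
assert df_within == n_total - k

print(df_between, df_within)  # 2 17
```

These are the df values that appear in the sleep_cat and Error rows of the output.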
Post What?
• The F-ratio does not specify
which means are different from
other means.
• It only indicates that the difference
between at least two of the means
is large enough to be statistically
significant.
• Post hoc tests utilize pairwise
comparisons to determine which
groups are statistically different.
Using SPSS to Compute a One-Way ANOVA
• Analyze → General Linear Model → Univariate
• Move the independent variable to the Fixed Factor(s) box →
Move the dependent variable to the Dependent Variable
box
• Click Options → highlight the independent variable in the
Factor(s) box and move it to the Display Means for box →
Under Display, check descriptive statistics, estimates of
effect size, and homogeneity tests → Note that the
significance level is already set at 0.05 → Click Continue
• Click Post Hoc → highlight the independent variable in the
Factor(s) box and move it to the Post Hoc Tests for box →
Under Equal Variances Assumed, check Tukey (not
Tukey’s-b) → Click Continue.
What Does It All Mean?
Descriptive Statistics
Dependent Variable: Number of Objects Recalled

sleep_cat     Mean      Std. Deviation   N
Little        11.1250   1.55265           8
Average       14.4286   3.55233           7
Sufficient    17.4000   2.40832           5
Total         13.8500   3.55816          20

Levene's Test of Equality of Error Variances (a)
Dependent Variable: Number of Objects Recalled

F      df1   df2   Sig.
.732   2     17    .496

Tests the null hypothesis that the error variance of
the dependent variable is equal across groups.
a. Design: Intercept + sleep_cat
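The Descriptive Statistics box provides the mean, standard
deviation, and number of cases (N) for each group.
Levene’s test compares the error variance of the dependent
variable across groups. We do not want this result to be
significant, because a nonsignificant result is consistent with
the homogeneity of variance assumption. Here Sig. = .496, so
there is no evidence that the assumption is violated.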
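The descriptive statistics already contain enough information to recover two quantities in the rest of the output. The sketch below, with variable names chosen only for this example, rebuilds the grand mean from the group means and the within-groups (error) sum of squares from the group standard deviations. To rounding, the results agree with the Total mean of 13.8500 and with the Error sum of squares of 115.789 in the Tests of Between-Subjects Effects table.

```python
# Values copied from the Descriptive Statistics box.
descriptives = {
    "Little":     {"mean": 11.1250, "sd": 1.55265, "n": 8},
    "Average":    {"mean": 14.4286, "sd": 3.55233, "n": 7},
    "Sufficient": {"mean": 17.4000, "sd": 2.40832, "n": 5},
}

n_total = sum(g["n"] for g in descriptives.values())

# Grand mean: group means weighted by group size.
grand_mean = sum(g["n"] * g["mean"] for g in descriptives.values()) / n_total

# Error sum of squares: (n_j - 1) * s_j^2, summed over the groups.
ss_error = sum((g["n"] - 1) * g["sd"] ** 2 for g in descriptives.values())

print(f"grand mean = {grand_mean:.4f}")
print(f"SS error   = {ss_error:.3f}")
```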
Understanding the Output
Tests of Between-Subjects Effects
Dependent Variable: Number of Objects Recalled

Source            Type III Sum of Squares   df   Mean Square   F         Sig.   Partial Eta Squared
Corrected Model   124.761a                   2   62.380        9.159     .002   .519
Intercept         3943.531                   1   3943.531      578.983   .000   .971
sleep_cat         124.761                    2   62.380        9.159     .002   .519
Error             115.789                   17   6.811
Total             4077.000                  20
Corrected Total   240.550                   19

a. R Squared = .519 (Adjusted R Squared = .462)
The row you are interested in is the one labeled with the name of your independent
variable (here, sleep_cat). The between-groups df appear in this row; the within-groups
df appear in the Error row. F is your test statistic, and Sig. is its p-value (the
probability of an F this large or larger when the null hypothesis is true).
The Estimated Marginal Means box (the next box in the output) is not shown here. It
merely provides the 95% confidence intervals for each of the means.
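To see how the columns of this table relate to one another, the sketch below recomputes the mean squares, F, and partial eta squared from the sums of squares and degrees of freedom in the sleep_cat and Error rows. The variable names are assumptions; the results agree with the table up to rounding, since the printed sums of squares are themselves rounded.

```python
# Sums of squares and df copied from the sleep_cat and Error rows.
ss_between = 124.761
ss_within = 115.789
df_between = 2
df_within = 17

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the between-groups to the within-groups mean square.
f_stat = ms_between / ms_within

# Partial eta squared: effect SS as a share of effect SS plus error SS.
partial_eta_sq = ss_between / (ss_between + ss_within)

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_stat:.3f}")
print(f"eta^2      = {partial_eta_sq:.3f}")
```

In a one-way design there is only one effect, so partial eta squared equals the R Squared value in the table footnote.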
Post Hoc Analysis
Multiple Comparisons
Dependent Variable: Number of Objects Recalled
Tukey HSD

                                Mean Difference                       95% Confidence Interval
(I) sleep_cat   (J) sleep_cat   (I-J)             Std. Error   Sig.   Lower Bound   Upper Bound
Little          Average         -3.3036           1.35071      .063    -6.7686        .1615
Little          Sufficient      -6.2750*          1.48782      .002   -10.0918      -2.4582
Average         Little           3.3036           1.35071      .063     -.1615       6.7686
Average         Sufficient      -2.9714           1.52815      .157    -6.8917        .9488
Sufficient      Little           6.2750*          1.48782      .002     2.4582      10.0918
Sufficient      Average          2.9714           1.52815      .157     -.9488       6.8917

Based on observed means.
*. The mean difference is significant at the .05 level.
The Multiple Comparisons box provides the mean difference between each pair of levels of
the IV and its significance. Each comparison appears twice, once in each direction with
the sign reversed, so it is only necessary to interpret one of each pair; which one
depends on the hypothesis.
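The first two numeric columns of the Multiple Comparisons box can be checked by hand. The sketch below lists each pair of groups once and computes the mean difference and a standard error of the form sqrt(MS error × (1/n_i + 1/n_j)), using the group means, group sizes, and Error mean square from the earlier output. The variable names are assumptions, and the Sig. values and confidence intervals are not reproduced because they depend on the studentized range distribution.

```python
from itertools import combinations
from math import sqrt

# Group means and sizes from the Descriptive Statistics box.
means = {"Little": 11.1250, "Average": 14.4286, "Sufficient": 17.4000}
sizes = {"Little": 8, "Average": 7, "Sufficient": 5}

# Error mean square from the Tests of Between-Subjects Effects table.
ms_error = 6.811

results = {}
for group_i, group_j in combinations(means, 2):  # each pair once
    mean_diff = means[group_i] - means[group_j]
    std_error = sqrt(ms_error * (1 / sizes[group_i] + 1 / sizes[group_j]))
    results[(group_i, group_j)] = (mean_diff, std_error)
    print(f"{group_i} vs {group_j}: diff = {mean_diff:.4f}, SE = {std_error:.4f}")
```

Listing each pair once gives three comparisons instead of the six rows SPSS prints, which is why only one row of each pair needs to be interpreted.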