Transcript Slide 1
Some statistics questions answered:
"Well, am I significantly more popular than Gordon Brown?"
Requests:
How to report statistical results in APA format.
How to calculate SD and SE by hand.
An explanation of levels of measurement.
How to calculate t-tests, and how to interpret ANOVA output.
How to report statistical results:
Ultimate authority:
American Psychological Association. (2010). Publication Manual of the
American Psychological Association (6th ed.).
1. Descriptive statistics:
Mean and Standard Deviation:
Report numbers to two decimal places; exclude unnecessary zeroes.
"The children's reading test performance was fairly poor (M = 45.63, SD =
12.28)".
"The children's mean reading test score was 45.63 (SD = 12.28)".
Percentages:
No decimal places. "Almost all (97%) of the sample liked Sooty".
2. Statistical test results:
Varies between journals to some extent.
Essentials - name of test; test statistic; d.f. or N (depending on
test); probability level.
Chi-Square: χ²(2, N = 17) = 8.85, p < .05
Pearson's correlation: r(55) = .49, p < .01
Spearman's correlation: rs(95) = .36, p < .02
T-tests: t(54) = 5.43, p < .001
ANOVA: F(1, 145) = 5.43, p = .02
(the two d.f. are the between-groups d.f. and the within-groups d.f.)
Mann-Whitney: SPSS converts U into a z score, so: z = 1.97, p < .05
By hand: U(N = 17) = 4.00, p < .005.
Wilcoxon: again, SPSS converts result into a z score.
Kruskal-Wallis: SPSS converts H into a Chi-Square value, so
χ²(2, N = 17) = 10.58, p < .01
By hand, with small Ns, report H.
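The reporting patterns above can be sketched as a small formatter. This is only an illustration: the function names (apa_number, apa_p, apa_t, apa_f) are made up for this example, and the rule used for p (exact value to two or three decimals, with "p < .001" as the floor) is an assumed convention; as noted above, journals vary.

```python
def apa_number(value, decimals=2, leading_zero=True):
    """Format a number to a fixed number of decimal places."""
    text = f"{value:.{decimals}f}"
    if not leading_zero and text.startswith(("0.", "-0.")):
        # Values that cannot exceed 1 (p, r) are written without the leading zero.
        text = text.replace("0.", ".", 1)
    return text


def apa_p(p):
    """Assumed convention: exact p, but never smaller than 'p < .001'."""
    if p < 0.001:
        return "p < .001"
    decimals = 2 if round(p, 2) >= 0.01 else 3
    return "p = " + apa_number(p, decimals, leading_zero=False)


def apa_t(df, t, p):
    return f"t({df}) = {apa_number(t)}, {apa_p(p)}"


def apa_f(df_between, df_within, f, p):
    return f"F({df_between}, {df_within}) = {apa_number(f)}, {apa_p(p)}"


print(apa_f(1, 145, 5.43, 0.02))   # F(1, 145) = 5.43, p = .02
print(apa_t(54, 5.43, 0.0002))     # t(54) = 5.43, p < .001
```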
Standard deviation and standard error:
Standard deviation (SD): a measure of how much scores are spread out
around their mean.
The bigger the SD, the less representative the mean is of the set of scores
from which it was calculated.
$$ s = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} $$
1. Find the mean.
2. Find difference between
each score and the mean.
3. Square the differences.
4. Add them up.
5. Divide by number of scores.
6. Take the square root of the
result.
In practice - use the "square root" function on your calculator!
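The six steps can be followed literally in a few lines of Python. This is a minimal sketch with made-up scores; it divides by the number of scores, as in step 5, which describes the spread of this particular set of scores (an estimate of a population SD from a sample would divide by n - 1 instead).

```python
import math


def standard_deviation(scores):
    mean = sum(scores) / len(scores)            # 1. find the mean
    differences = [x - mean for x in scores]    # 2. each score minus the mean
    squared = [d ** 2 for d in differences]     # 3. square the differences
    total = sum(squared)                        # 4. add them up
    variance = total / len(scores)              # 5. divide by number of scores
    return math.sqrt(variance)                  # 6. take the square root


scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean = 5
print(standard_deviation(scores))   # 2.0
```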
Standard deviation and standard error:
Standard error of the mean (SE): a measure of how much sample
means are spread out around the population mean.
The bigger the SE, the less confident we can be that the sample mean
truly reflects the population mean.
1. Find the standard deviation.
2. Divide it by the square root of the
number of scores.
[Figure: the means of different samples (successive attempts) scattered around the actual population mean.]
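The two steps for the standard error can be sketched the same way. The scores are made up, and the SD here divides by the number of scores, as in the by-hand steps for the SD; some textbooks use the n - 1 version of the SD at this point, so treat this as one reasonable reading rather than the only one.

```python
import math


def standard_error(scores):
    n = len(scores)
    mean = sum(scores) / n
    # 1. find the standard deviation (dividing by n, as in the SD steps)
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    # 2. divide it by the square root of the number of scores
    return sd / math.sqrt(n)


scores = [2, 4, 4, 4, 5, 5, 7, 9] * 2   # hypothetical data: 16 scores, SD = 2
print(standard_error(scores))           # 0.5
```

With the same spread of scores, a bigger sample gives a smaller SE, because the SD is divided by a bigger number.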
Tests you need to be able to calculate by hand (for
section 2 of the exam):
Chi-Square test of association, Chi-Square goodness of fit
Pearson's r, Spearman's rho
Wilcoxon, Mann-Whitney
Friedman's, Kruskal-Wallis
Repeated-measures t-test, Independent-measures t-test
Tests whose output you should be able to understand:
One-way independent-measures ANOVA
One-way repeated-measures ANOVA
Levels of measurement:
Nominal (categorical) data:
Each participant contributes to the frequency for a category.
Voting: Conservative, Lib-Dem, Labour, Green
Handedness: left or right-handed.
All we know is how many people fall into each category - head-counts.
Ordinal data:
In principle, all we can do is place scores in order of magnitude.
In Psychology: ratings, e.g. attitude scales, ratings of pain, attractiveness.
Interval/ratio data:
Share this property - there are equal intervals throughout the measuring
scale.
Often measures of physical properties - time, length, weight.
In Psychology: reaction times, number correct, number of errors.
Levels of measurement:
Difference between interval and ratio scales:
Relatively unimportant for psychology.
Ratio scale - true zero point on scale (representing an absence of the
property being measured). e.g. time, accuracy.
Interval scale - zero point is arbitrary. e.g. Fahrenheit/Centigrade
temperature scales, calendar years.
Christian calendar: "0" is an arbitrary point. Each unit (year) is equal, but
"2000" is NOT twice as much "time" as "1000"!
[Timeline: 2000 BC, 1000 BC, 500 BC, 0 BC, 500 AD, 1000 AD, 2000 AD]
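A quick way to see why ratios are meaningless on an interval scale: change the units and the ratio changes. The sketch below uses made-up temperatures and times; on the ratio scale (time), the ratio survives a change of units.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32


# Interval scale: is 20 degrees "twice as hot" as 10 degrees?
ratio_c = 20 / 10                                                    # 2.0
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)      # 68 / 50 = 1.36

# Ratio scale: 4 seconds really is twice as long as 2 seconds.
ratio_s = 4 / 2                                                      # 2.0
ratio_ms = 4000 / 2000                                               # 2.0

print(ratio_c, ratio_f)    # the "ratio" depends on the arbitrary zero point
print(ratio_s, ratio_ms)   # the ratio is the same in any units
```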
Analysis of Variance:
One-way independent-measures ANOVA:
Used to compare the means of three or more groups representing
different levels of one independent variable.
Each participant does one condition only.
e.g. effects of age (young, middle-aged, old) on bladder-control
(DV = length of time before needing the toilet)
One-way repeated-measures ANOVA:
Used to compare the means of three or more conditions representing
different levels of one independent variable.
Each participant does all conditions.
e.g. effects of drug (terfenadine, cetirizine, Piriton) on hayfever
(DV = number of sneezes in an hour).
Output for a one-way independent-measures ANOVA:
F-ratio = (Mean Squares for between-groups variation) / (Mean Squares for within-groups variation)
The bigger the F-ratio, the bigger the differences between the groups,
compared to the differences within the groups.
Need to take into account the number of groups and the number of
participants.
d.f. are the between-groups d.f. (number of groups minus 1) and the
within-groups d.f. (number of participants in group A minus 1, plus
the number of participants in group B minus 1, etc.)
F(3, 26) = 19.50, p < .01.
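To make the F-ratio and its two d.f. concrete, here is a minimal sketch of the calculation for independent groups. The function name and the three tiny groups are made up for illustration; SPSS output would also give you the p value, which this sketch does not compute.

```python
def one_way_independent_anova(groups):
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups: how far each score is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1                  # number of groups minus 1
    df_within = sum(len(g) - 1 for g in groups)   # (n - 1) summed over groups

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical data
f_ratio, df_between, df_within = one_way_independent_anova(groups)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")   # F(2, 6) = 13.00
```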
Output for a one-way repeated-measures ANOVA:
F-ratio = (Mean Squares for between-conditions variation) / (Mean Squares for within-conditions variation)
The bigger the F-ratio, the bigger the differences between the conditions,
compared to the differences within the conditions.
Need to take into account the number of conditions and the number of
participants.
d.f. are the between-conditions d.f. (number of conditions minus 1) and the
within-conditions (error) d.f. (number of conditions minus 1, multiplied by
the number of participants minus 1).
F(2, 38) = 35.89, p < .01.
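A minimal sketch of the standard repeated-measures calculation, using made-up scores and a made-up function name. Because every participant does every condition, consistent differences between participants are removed before the F-ratio is formed, and the second d.f. is (conditions minus 1) multiplied by (participants minus 1). A result such as F(2, 38) is what 3 conditions and 20 participants would give.

```python
def one_way_repeated_anova(scores):
    """scores[p][c] is participant p's score in condition c."""
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of conditions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    condition_means = [sum(row[c] for row in scores) / n for c in range(k)]
    participant_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_participants = k * sum((m - grand_mean) ** 2 for m in participant_means)
    # What is left once condition and participant differences are removed.
    ss_error = ss_total - ss_conditions - ss_participants

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_conditions / df_conditions) / (ss_error / df_error)
    return f_ratio, df_conditions, df_error


scores = [[1, 3, 5],    # hypothetical data: 3 participants x 3 conditions
          [2, 3, 7],
          [3, 6, 6]]
f_ratio, df_conditions, df_error = one_way_repeated_anova(scores)
print(f"F({df_conditions}, {df_error}) = {f_ratio:.2f}")   # F(2, 4) = 12.00
```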
Independent-measures t-test:
The t value represents the size of the difference between two means
(essentially a z-score for small sample sizes).
The bigger the value of t, the more confident we can be that the difference
between the means is "real", i.e. that it has not occurred just by chance.
t is the difference between two means, divided by an estimate of how much
this difference is likely to vary from occasion to occasion (estimated
standard error of the difference between means).
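As a sketch of "difference between two means, divided by the estimated standard error of the difference", the code below uses the pooled-variance form of the independent-measures t-test, which assumes the two groups have similar variances. The function name and the data are made up for illustration.

```python
import math


def independent_t(group_a, group_b):
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)

    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    # Estimated standard error of the difference between the two means.
    se_difference = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_difference, df


sober = [2, 3, 4, 5, 6]        # hypothetical number of crashes per driver
inebriated = [5, 6, 7, 8, 9]
t, df = independent_t(inebriated, sober)
print(f"t({df}) = {t:.2f}")    # t(8) = 3.00
```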
Repeated-measures t-test:
Same logic, except that we can capitalise on the fact that the same people
did both conditions (and hence random variation in performance is likely
to be less).
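For the repeated-measures version, the usual approach is to work on each person's difference score, which is how the test capitalises on the same people doing both conditions. A minimal sketch with made-up data follows; note that the SD of the differences is estimated with n - 1 here, as is standard for this test.

```python
import math


def repeated_t(condition_a, condition_b):
    # One difference score per participant.
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(differences)
    mean_d = sum(differences) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in differences) / (n - 1))
    se_d = sd_d / math.sqrt(n)   # standard error of the mean difference
    return mean_d / se_d, n - 1


after = [5, 6, 7, 8, 9]    # hypothetical scores, same five people in both lists
before = [4, 4, 6, 5, 6]
t, df = repeated_t(after, before)
print(f"t({df}) = {t:.2f}")   # t(4) = 4.47
```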
When do you use a t-test, and when do you use a correlation?
Depends on whether you have an experimental or correlational design:
t-test:
Two groups or conditions (two levels of one IV)
Looking for differences between them on a single DV.
Does alcohol affect driving performance?
IV: alcohol dosage (sober versus inebriated).
DV: number of crashes.
Correlation:
Two different DVs (alcohol consumption and number of crashes)
Looking for a relationship between them.
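For the correlational case, each person contributes a pair of scores rather than belonging to a group. A minimal sketch of Pearson's r on made-up pairs (units of alcohol consumed, number of crashes) is shown below; the function name is illustrative only.

```python
import math


def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How much the two variables vary together ...
    sum_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # ... relative to how much each varies on its own.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sum_products / math.sqrt(ss_x * ss_y)


consumption = [1, 2, 3, 4, 5]   # hypothetical data
crashes = [2, 4, 5, 4, 5]
r = pearson_r(consumption, crashes)
print(f"r({len(consumption) - 2}) = {r:.2f}")   # r(3) = 0.77; APA style writes this as .77
```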
Answers are shown to two decimal places (e.g. 2.56,
31.95 etc.). Should we round numbers during the
calculations, or only round our answer?
Round only at the very end, otherwise rounding
errors can accumulate and make your final answer
inaccurate.
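A small illustration of how early rounding can change the answer, using made-up intermediate values for a t calculation (a mean difference divided by its standard error):

```python
mean_difference = 1.234   # hypothetical unrounded intermediate values
standard_error = 0.246

# Rounding each intermediate value to two decimal places first.
rounded_early = round(round(mean_difference, 2) / round(standard_error, 2), 2)

# Keeping full precision and rounding only the final answer.
rounded_at_end = round(mean_difference / standard_error, 2)

print(rounded_early)    # 4.92
print(rounded_at_end)   # 5.02
```

The two answers differ in the first decimal place, even though each intermediate value was only rounded slightly.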
Conservative 10,706,647 (36% of the total vote)
Labour 8,604,358 (29% of the total vote)
Liberal Democrat 6,827,938 (23% of the total vote)
Chi-squared goodness of fit test:
Expected frequency = 26,138,943 divided by 3
                       CONSERVATIVE   LABOUR       LIB-DEM
observed frequencies   10,706,647     8,604,358    6,827,938
observed - expected    1,993,666      -108,623     -1,885,043
χ²(2, N = 26,138,943) = 865,363, p < .05 (and then some...)
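The goodness of fit calculation can be checked with a few lines of Python. This sketch uses the vote counts given above and the same assumption of equal expected frequencies; the variable names are illustrative.

```python
observed = {
    "Conservative": 10_706_647,
    "Labour": 8_604_358,
    "Liberal Democrat": 6_827_938,
}

total = sum(observed.values())       # 26,138,943
expected = total / len(observed)     # equal share for each party: 8,712,981

# Sum of (observed - expected) squared, divided by expected, over categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1               # number of categories minus 1

print(f"chi-square({df}, N = {total:,}) = {chi_square:,.0f}")
# chi-square(2, N = 26,138,943) = 865,363
```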