Introduction to Statistical Methods
By Tom Methven
Digital slides and tools available at:
www.macs.hw.ac.uk/~mjc/teaching/ResearchMethods
Moving Bell-curves
Designing the Experiment
• 1. Define exactly what you want to measure
• 2. Pick which statistical test to use first, before collecting data
• 3. Decide on your experimental design
Worked Example
• Two different piano designs, iPiano and Wii
Piano!
• This example will be built up through the
notes to show how to go about
designing/analysing a simple experiment.
Level Of Measurement (NonParametric)
• Nominal – Labelled results or characteristics
– Can be used for equivalence and set membership
– Can show commonality but leaves it undefined
• E.g. List of names, football players’ numbers
• Ordinal – Results in some sort of rank order
– Results do not show differences between places!
– No arithmetic is valid, you cannot say 3rd place is
the same as 1st and 2nd combined, for example
• E.g. Race results, top 10 films
Level Of Measurement (Parametric)
• Interval – Results placed along a regular scale,
with an arbitrary zero-point
– Multiplication/Division is meaningless but the ratio of
differences is valid as differences are consistent
• E.g. Degrees Fahrenheit
• Ratio – Results placed along a regular scale with
an absolute zero-point
– Allows for statements such as 6 units is twice as much
as 3 units
• E.g. Kelvin and most SI units
Statistic Basics
• Mode: The most common score
• Median: The central score
• Mean: The ‘average’ of all the scores
• E.g. For the results 9,2,5,3,6,9,5,6,4,2,6
• Mode: 6
• Median: 5
• Mean: 5.18
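The three statistics above can be checked with a short Python sketch. It uses only the standard `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# The example results from the slide
scores = [9, 2, 5, 3, 6, 9, 5, 6, 4, 2, 6]

mode = statistics.mode(scores)      # most common score
median = statistics.median(scores)  # central score once the list is sorted
mean = statistics.mean(scores)      # sum of the scores divided by their count

print(f"Mode: {mode}, Median: {median}, Mean: {mean:.2f}")
```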
Worked Example Results
• Using time to successfully play a simple tune
as the measure.
• Time is measured on a ratio scale, so we
can use parametric tests.
• Results:
             Interface 1   Interface 2
Person 1     4.28          4.38
Person 2     2.78          4.99
Person 3     7.63          4.30
Person 4     7.93          4.27
Person 5     7.19          5.50
Person 6     5.73          5.22
Person 7     8.40          4.09
Person 8     5.88          4.46
Person 9     5.60          4.00
Person 10    4.89          4.90
Mean:        6.03          4.61
Randomisation and Ordering Effects
• Performing an experiment gives experience
which could improve subjects’ ‘scores’
• For a two-part test, an equal number of
people should try each condition first
• When you have many conditions or trials, it is
easiest to show them in a random order
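One possible way to counter-balance a two-part test is sketched below in Python. The function name `assign_first_condition`, the participant labels and the fixed seed are all assumptions made for illustration; the slides do not prescribe a particular procedure.

```python
import random

def assign_first_condition(participants,
                           conditions=("Interface 1", "Interface 2"),
                           seed=None):
    """Randomly split participants so each condition order is used equally often."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)              # random assignment to the two orders
    half = len(shuffled) // 2
    first, second = conditions
    orders = {}
    for person in shuffled[:half]:
        orders[person] = (first, second)   # this half tries the first condition first
    for person in shuffled[half:]:
        orders[person] = (second, first)   # the rest try the second condition first
    return orders

participants = [f"Person {i}" for i in range(1, 11)]
orders = assign_first_condition(participants, seed=1)  # seed only makes the run repeatable
for person in participants:
    print(person, "->", " then ".join(orders[person]))
```

With an even number of participants, exactly half start on each interface; with an odd number, one order gets one extra person.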
Latin Squares
• A way of counter-balancing condition order for
small numbers of conditions
• E.g. For three possible conditions:
           Order of conditions or trials
Group 1    A    B    C
Group 2    B    C    A
Group 3    C    A    B
• You can also get ‘balanced’ Latin squares for
experiments with an even number of conditions.
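The three-condition square shown on this slide can be produced by shifting the condition list one place per group. The Python sketch below does this; `latin_square` is an illustrative name, and this simple cyclic construction is not the ‘balanced’ variant.

```python
def latin_square(conditions):
    """Build a cyclic Latin square: each row is the previous row shifted left by one."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]

square = latin_square(["A", "B", "C"])
for group, order in enumerate(square, start=1):
    print(f"Group {group}: {' '.join(order)}")
```

Every condition appears once in each row and once in each position, so no condition is always tried first.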
Accuracy of the Mean
• Variance:
$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$
• Standard Deviation:
$s = \sqrt{s^2}$
• Standard Error:
$se = \frac{s}{\sqrt{N}}$
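As a rough illustration, the three quantities on this slide can be computed for the Interface 1 times from the worked example. The helper name `sample_spread` is hypothetical, and the input values are the rounded figures from the results table.

```python
import math

def sample_spread(values):
    """Return (mean, variance, standard deviation, standard error) of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: squared deviations from the mean, divided by N - 1
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)   # standard deviation is the square root of the variance
    se = sd / math.sqrt(n)     # standard error of the mean
    return mean, variance, sd, se

# Interface 1 times from the worked example (rounded to two decimal places)
times = [4.28, 2.78, 7.63, 7.93, 7.19, 5.73, 8.40, 5.88, 5.60, 4.89]
mean, variance, sd, se = sample_spread(times)
print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.2f}, se = {se:.2f}")
```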
Degrees of Freedom
• For samples, ‘N – 1’ is often used
rather than N. This is the simplest calculation
of D.O.F., but it can get very complex.
• We assume that the sample mean is the same
as the population mean. The degrees of freedom
are therefore the number of values that are free
to vary without altering the required mean value:
once N – 1 values are chosen, the last one is fixed.
(Rugby example)
Student’s T-Test
• Used for comparing the means of two sample
groups using the variance
• Several types, depending on the samples:
– ‘Dependent/Paired’ – Using this one!
– ‘Independent Equal Variance’
– ‘Independent Unequal Variance’
• Assumes populations are normally distributed
Student’s T-Test
• Create a ‘null hypothesis’:
– E.g. There is no difference in learning between
iPiano and Wii Piano
• Create an ‘alternative hypothesis’:
– One-tailed: Wii Piano is easier to learn
– Two-tailed: Wii Piano is easier or harder to learn
Dependent T-Test
• Used to compare two sets of results from the same group
$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{N}}$
$\bar{D}$ = Average difference
$\mu_D$ = Expected difference (0 for null hypothesis)
$S_D$ = Standard deviation of differences
$N$ = Sample Size
Worked Example T Result
$\bar{D}$ = 1.420756421
$S_D$ = 1.985348881
$N$ = 10
t-value = 2.26
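The t-value can be reproduced from the results table with a few lines of Python. This is only a sketch: `paired_t` is an illustrative name, and because the table values are rounded to two decimal places, the intermediate figures differ slightly from the unrounded ones quoted on this slide.

```python
import math

# Times from the worked example results table (rounded to two decimal places)
interface_1 = [4.28, 2.78, 7.63, 7.93, 7.19, 5.73, 8.40, 5.88, 5.60, 4.89]
interface_2 = [4.38, 4.99, 4.30, 4.27, 5.50, 5.22, 4.09, 4.46, 4.00, 4.90]

def paired_t(a, b, expected_difference=0.0):
    """Dependent (paired) t-test: return the t-value and degrees of freedom."""
    differences = [x - y for x, y in zip(a, b)]   # one difference per person
    n = len(differences)
    mean_d = sum(differences) / n                 # average difference
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in differences) / (n - 1))
    t = (mean_d - expected_difference) / (sd_d / math.sqrt(n))
    return t, n - 1

t_value, df = paired_t(interface_1, interface_2)
print(f"t = {t_value:.2f}, df = {df}")
```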
Interpreting T-Value
• The T-Test gives a t-value, which can be
converted to a p-value
• If the p-value < 0.05, a difference at least this large
would occur less than 5% of the time if the null
hypothesis were true
• Therefore we reject the null hypothesis and conclude
that the change of condition most likely caused the
observed effect
• Our one-tailed p-value was 0.025, so we can accept our
alternate hypothesis: ‘Wii Piano is easier to
learn’
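To show where the p-value comes from, the sketch below integrates the Student's t density numerically using only the Python standard library. The function names and the Simpson's-rule step count are assumptions made for illustration; a statistics package would normally do this step. It treats the quoted p-value as one-tailed, which matches t = 2.26 with 9 degrees of freedom.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p(t, df, steps=10000):
    """Area under the t density beyond |t|, via Simpson's rule (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t   # the density is symmetric, so half the area lies above 0

p_one = one_tailed_p(2.26, 9)
print(f"one-tailed p = {p_one:.3f}, two-tailed p = {2 * p_one:.3f}")
```

The two-tailed value is simply double the one-tailed one, so which hypothesis you chose beforehand affects whether the result falls below 0.05.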
Effect Size
• Tells us how important the result is in practical
terms. Gives an ‘r-value’.
$r = \sqrt{\frac{t^2}{t^2 + df}}$
– r = 0.10 (small effect): 1% of total variance
– r = 0.30 (medium effect): 9% of total variance
– r = 0.50 (large effect): 25% of total variance
– r DOES NOT work on a linear scale, so these can only be used as guidelines. It is becoming more
common to include the r-value as well as the p-value as it is a well understood and widely used check.
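A minimal Python sketch of the effect-size formula, using the t-value and degrees of freedom from the worked example (the function name `effect_size_r` is illustrative):

```python
import math

def effect_size_r(t, df):
    """Effect size r from a t-value and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))

r = effect_size_r(2.26, 9)
print(f"r = {r:.2f}, share of variance = {r ** 2:.0%}")
```

Squaring r gives the share of total variance, which is why the guideline values of 0.10, 0.30 and 0.50 correspond to 1%, 9% and 25%.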
[letter]-values
Recap:
• t-value: Result of the t-test
– Normally converted into a ‘p-value’ for easy comparison
between statistical tests. Used to calculate ‘r-value’
• p-value: Is the result statistically significant?
– If ‘p-value’ is less than 0.05, then you can reject the null
hypothesis and accept your alternate hypothesis.
• r-value: Is the effect substantial in reality?
– Does the result mean anything in real terms? HOW MUCH
of the change in means does our experiment explain?
Final Results
• p-value = 0.025
• r-value = 0.60
• Degrees of freedom = 9
• “The results show that Wii Piano allows users to play
a set tune successfully significantly faster than iPiano
(p = 0.025). In addition, the effect size was large (r =
0.6), showing the result was substantial in real
terms.”
Error Bars
[Chart: Mean 1 and Mean 2 plotted with error bars]
Error bars: Plot standard error
Excel Example
• TTEST in Excel will give a ‘p-value’ directly, e.g. =TTEST(array1, array2, tails, type),
where type 1 selects the paired test
Summing Up
• Dependent t-test when using a single group
– For example, when using 10 friends to test two interfaces
• Avoid ordering effects
– Randomise order and get an equal number to do each test first
• Use ‘TTEST’ in Excel to get p-value easily
– Easier than calculating it yourself, but harder to get r-value!
• Check p < 0.05 and quote the value and result
– Always make sure to quote the actual p-value in your results
as well as saying whether the result was statistically significant