Day 10
Analysing usability test results
Objectives
To learn more about how to understand and
report quantitative test results
To learn about some basic statistical terms
To learn about t-tests
To learn how to decide whether obtained results
are “significant”
From
www.infodesign.com.au
http://www.usabilitylabrental.com/
Analysing and presenting results - qualitative
Qualitative data
You could group comments from participants that
seem to go together and explain how many had what
problem
For example,
4 stated that they did not realise you would click on
“see it” to find the price of an item
2 stated that they searched the whole site and didn’t
find the price of the item
In your report, you could give quotations to back up
what you are saying (this shows more clearly that
the finding comes from the participants and is not
just your subjective impression)
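A minimal sketch of the grouping step, assuming each comment has already been hand-labelled with a problem category. The labels and the names `coded_comments` and `problem_counts` are made up for illustration.

```python
from collections import Counter

# Hypothetical labels: one entry per participant comment, coded by hand.
coded_comments = [
    "did not realise 'see it' shows the price",
    "did not realise 'see it' shows the price",
    "searched whole site, no price found",
    "did not realise 'see it' shows the price",
    "searched whole site, no price found",
    "did not realise 'see it' shows the price",
]

# Count how many participants reported each problem.
problem_counts = Counter(coded_comments)

# List the problems, most frequently reported first.
for problem, count in problem_counts.most_common():
    print(f"{count} participants: {problem}")
```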
Analysing and presenting results - quantitative
Quantitative data
You might want to report things like:
how many clicks people took on a task compared to
an expert
how much time people took on a task with design A
compared with design B
how many errors novice users made compared with
another group who are experienced users of certain
types of software
But, just reporting the numbers is not enough!
The garden.com study from yesterday
http://www.cit.gu.edu.au/~mf/uidweek10/ergosoft.pdf
For each task, they reported
the means (calculated for the participants)
whether the mean for the participants differed from an expert (Y/N)
The Standard Deviation (a measure of how much the data is
dispersed around the mean … how consistent the data is)
But, this is not enough!
Statistical tests
Notice that they also said they did statistical
tests “to determine whether real differences
exist” between the participants and the expert
They should have given more details
which statistical test they used
the values obtained from the statistical tests
Statistical test – some background
The normal curve
The standard deviation
Types of “experiments”
p values
t-tests
single sample, with hypothesised mean
independent samples
correlated samples
The normal curve and standard deviations
The standard deviation
A measure of the variability of the data about
the mean
A large standard deviation means the values
obtained from the subjects vary a lot from the
mean
A small standard deviation means the values
obtained from the subjects vary little from the
mean
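To make the contrast concrete, here is a small sketch with two made-up sets of task times (in seconds) that share the same mean but differ in spread. The data and variable names are illustrative only; `statistics.stdev` computes the sample standard deviation (n - 1 denominator).

```python
import statistics

# Made-up task times in seconds for two groups of five participants.
consistent_times = [38, 39, 40, 41, 42]
varied_times = [10, 25, 40, 55, 70]

# Both sets have the same mean ...
mean_consistent = statistics.mean(consistent_times)
mean_varied = statistics.mean(varied_times)

# ... but very different standard deviations.
sd_consistent = statistics.stdev(consistent_times)
sd_varied = statistics.stdev(varied_times)

print(f"consistent: mean={mean_consistent}, SD={sd_consistent:.2f}")
print(f"varied:     mean={mean_varied}, SD={sd_varied:.2f}")
```

Reporting only the two means would hide the fact that the second set of participants behaved far less consistently.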
Why is the standard deviation important?
Table 7 of the garden.com study
http://www.cit.gu.edu.au/~mf/uidweek8/ergosoft.pdf
Compare task 1 and task 3
Statistical tests can take this SD difference into
account
The appropriate statistical test is the t-test
The t-test
The t-test will tell you whether one mean is
really different from another
That is, it is a statistical test to compare means
There are really 3 kinds of t-tests
Single sample
when you are comparing the participants' mean with an expert's value
(we’ll call this the hypothesised mean)
Independent samples
when you are comparing performance by two groups
Correlated samples
when you are comparing one group tested in two different
situations
Single sample test
Where you have one group of subjects and test
them against one mean
for example
one value is obtained from one expert, which is then
assumed to be the mean for some expert group
or it could be more like a benchmark, and you
compare the means of the participants to some
benchmark
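A single sample t statistic can be sketched as the difference between the participants' mean and the hypothesised mean, divided by the standard error of the mean. The click counts and the expert value of 6 below are made up, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics

def one_sample_t(values, hypothesised_mean):
    """Return (t, df) for a single sample t-test against a fixed mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # sample SD, n - 1 denominator
    standard_error = sd / math.sqrt(n)
    t = (mean - hypothesised_mean) / standard_error
    return t, n - 1

# Made-up click counts for five participants; the expert needed 6 clicks.
participant_clicks = [8, 10, 9, 12, 11]
t, df = one_sample_t(participant_clicks, 6)

# From a standard t table, the two-tailed .05 critical value for df = 4 is 2.776.
print(f"t = {t:.2f}, df = {df}, significant: {abs(t) > 2.776}")
```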
Independent samples
This is where you have data from different
groups of subjects
for example
you have novices and experienced users and you are
comparing the means of the two groups
Condition: Group 1 members
Condition: Group 2 members
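For two separate groups, one common form of the t statistic pools the variance of both groups. The sketch below assumes roughly equal variances in the two groups; the error counts are made up and `independent_t` is a hypothetical helper name.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an independent samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)     # sample variances
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Made-up error counts: different people in each group.
novice_errors = [5, 7, 6, 8, 9]
experienced_errors = [3, 4, 2, 5, 6]
t, df = independent_t(novice_errors, experienced_errors)

# From a standard t table, the two-tailed .05 critical value for df = 8 is 2.306.
print(f"t = {t:.2f}, df = {df}, significant: {abs(t) > 2.306}")
```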
Correlated samples
This is where you use the same subjects for
two different measures and want to compare
them
for example
you give subjects 2 tasks and see if they found one
harder than the other
Condition 1 and Condition 2: the same group members
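When the same subjects are measured twice, the test works on each subject's difference score. A minimal sketch, with made-up task times and a hypothetical helper `paired_t`:

```python
import math
import statistics

def paired_t(condition_1, condition_2):
    """Return (t, df) for a correlated (paired) samples t-test."""
    # One difference per subject: condition 2 minus condition 1.
    differences = [b - a for a, b in zip(condition_1, condition_2)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Made-up times in seconds; position i in each list is the same participant.
task_1_times = [40, 52, 47, 60, 55]
task_2_times = [48, 58, 50, 70, 64]
t, df = paired_t(task_1_times, task_2_times)

# From a standard t table, the two-tailed .05 critical value for df = 4 is 2.776.
print(f"t = {t:.2f}, df = {df}, significant: {abs(t) > 2.776}")
```

A positive t here means the participants took longer on task 2 than on task 1.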
The p-value
When you run a statistical test, you get a p-value
p-value stands for probability value
The aim of statistical tests is to determine
whether the results could have occurred by
chance
If it is very unlikely that certain results
occurred by chance then there is probably
some other reason; for example, maybe novice
users get more confused than expert users
The importance of p < .05
Usually, if results could be obtained by chance
fewer than 5 times in a hundred, we say the
results are significant
When you do a statistical test, you will get a p-value
expressed as a decimal; for example
p=.04 (if there were no real difference, results like
these would occur by chance just 4 times in 100)
Any p < .05 is significant: you can assume
your observed differences are unlikely to be
due to chance alone
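Statistics packages report the p-value for you. As a rough sketch of where it comes from, the code below integrates the density of Student's t distribution numerically (Simpson's rule) to get a two-tailed p-value from a t value and its degrees of freedom. The function names and the example values (t = 3.0, df = 8) are assumptions for illustration.

```python
import math

def t_density(x, df):
    """Probability density of Student's t distribution at x."""
    coefficient = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule from 0 to |t| (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3          # probability of falling between 0 and |t|
    return 1 - 2 * area           # what is left in both tails

p = two_tailed_p(3.0, 8)
print(f"p = {p:.3f}, significant at .05: {p < 0.05}")
```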
One and two tailed t-tests
one-tailed test: used when you have predicted
the direction of the difference; for example,
novices will use more clicks than experts
two-tailed test: used when you have predicted
a difference, but have not stated the direction
of the difference; for example, there will be a
difference in performance between males and
females
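The choice of tails changes the critical value the t statistic has to beat. The sketch below uses .05-level critical values from a standard t table for two example degrees of freedom; the t value of 2.5 is made up, and `is_significant` is a hypothetical helper that assumes a positive t is the predicted direction.

```python
# Critical t values at the .05 level from a standard t table,
# keyed by degrees of freedom: (one-tailed, two-tailed).
CRITICAL_T_05 = {4: (2.132, 2.776), 8: (1.860, 2.306)}

def is_significant(t, df, tails):
    """Compare t with the .05 critical value for a one- or two-tailed test."""
    one_tailed, two_tailed = CRITICAL_T_05[df]
    if tails == 1:
        # Assumption: a positive t is the direction that was predicted.
        return t > one_tailed
    return abs(t) > two_tailed

# The same result can pass a one-tailed test and fail a two-tailed one.
one_tailed_result = is_significant(2.5, 4, tails=1)
two_tailed_result = is_significant(2.5, 4, tails=2)
print(one_tailed_result, two_tailed_result)
```

This is why the direction of the difference has to be predicted before looking at the data if a one-tailed test is going to be used.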
Today’s lab
We will run some t-tests on some fake
data
http://faculty.vassar.edu/lowry/VassarStats.html