powerpoint slides

Day 10
Analysing usability test results
Objectives
 To learn more about how to understand and
report quantitative test results
 To learn about some basic statistical terms
 To learn about t-tests
 To learn whether obtained results are
“significant”
From
www.infodesign.com.au
 http://www.usabilitylabrental.com/
Analysing and presenting results -
qualitative
 Qualitative data
 You could group comments from participants that
seem to go together and explain how many had what
problem
 For example,
 4 stated that they did not realise you would click on
“see it” to find the price of an item
 2 stated that they searched the whole site and didn’t
find the price of the item
 In your report, you could give quotations to back up
what you are saying (this shows more clearly that
the finding comes from the participants and is not
just your subjective impression)
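The grouping step above can be sketched in a few lines of Python. This is only an illustration: the category labels and the list of coded comments are hypothetical, loosely based on the example counts on this slide.

```python
from collections import Counter

# Hypothetical coding of participant comments: each entry is the problem
# category assigned to one participant's comment (labels are illustrative).
coded_comments = [
    "did not realise 'see it' shows the price",
    "did not realise 'see it' shows the price",
    "did not realise 'see it' shows the price",
    "did not realise 'see it' shows the price",
    "searched the whole site without finding the price",
    "searched the whole site without finding the price",
]

problem_counts = Counter(coded_comments)

# List the problems from most to least frequently reported.
for problem, count in problem_counts.most_common():
    print(f"{count} participants: {problem}")
```

Each line of the output can then be paired with a supporting quotation in the report.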
Analysing and presenting results
- quantitative
 Quantitative data
 You might want to report things like:
 how many clicks people took on a task compared to
an expert
 how much time people took on a task with design A
compared with design B
 how many errors novice users made compared with
another group who are experienced users of certain
types of software
 But, just reporting the numbers is not enough!
The garden.com study from yesterday
 http://www.cit.gu.edu.au/~mf/uidweek10/ergosoft.pdf
 For each task, they reported
 the means (calculated for the participants)
 whether the mean for the participants differed from an expert (Y/N)
 the standard deviation (a measure of how much the data is
dispersed around the mean … how consistent the data is)
 But, this is not enough!
Statistical tests
 Notice that they also said they did statistical
tests “to determine whether real differences
exist” between the participants and the expert
 They should have given more details
 what statistical test
 the values obtained from the statistical tests
Statistical test – some background
 The normal curve
 The standard deviation
 Types of “experiments”
 p values
 t-tests
 single sample, with hypothesised mean
 independent samples
 correlated samples
The normal curve and standard deviations
The standard deviation
 A measure of the variability of the data about
the mean
 A large standard deviation means the values
obtained from the subjects vary a lot from the
mean
 A small standard deviation means the values
obtained from the subjects vary little from the
mean
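As a rough illustration, the sketch below computes the mean and standard deviation for two sets of task times. The numbers are made up for this example; both sets have the same mean, but one is tightly clustered and the other is widely scattered.

```python
import statistics

# Hypothetical task times in seconds for two tasks (made-up numbers).
consistent_times = [28, 30, 32, 29, 31]
scattered_times = [10, 50, 30, 45, 15]

for label, times in [("consistent", consistent_times),
                     ("scattered", scattered_times)]:
    mean = statistics.mean(times)
    sd = statistics.stdev(times)  # sample standard deviation (divides by n - 1)
    print(f"{label}: mean = {mean:.1f}, SD = {sd:.2f}")
```

The two means are identical, so only the standard deviation shows how differently the two sets of participants behaved.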
Why is the standard deviation important?
 Table 7 of the garden.com study
http://www.cit.gu.edu.au/~mf/uidweek8/ergosoft.pdf
 Compare task 1 and task 3
 Statistical tests can take this SD difference into
account
 The appropriate statistical test is the t-test
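To see how a statistical test takes the SD into account, here is a minimal sketch of the single sample t statistic applied to two hypothetical tasks. The data are invented and are not the values from Table 7; the function name and the expert time are assumptions for illustration only. In both tasks the participants average 10 seconds more than the expert, but the SDs differ.

```python
import math
import statistics

def t_against_expert(times, expert_time):
    """t = (participant mean - expert value) / (SD / sqrt(n))."""
    standard_error = statistics.stdev(times) / math.sqrt(len(times))
    return (statistics.mean(times) - expert_time) / standard_error

expert_time = 20  # hypothetical expert time in seconds

# Both hypothetical groups average 30 seconds, 10 more than the expert.
low_sd_times = [28, 30, 32, 29, 31]
high_sd_times = [10, 50, 30, 45, 15]

t_low_sd = t_against_expert(low_sd_times, expert_time)
t_high_sd = t_against_expert(high_sd_times, expert_time)
print(f"small SD: t = {t_low_sd:.2f}")
print(f"large SD: t = {t_high_sd:.2f}")
```

The same 10 second gap gives a much larger t when the participants are consistent, which is why the means alone are not enough.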
The t-test
 The t-test will tell you whether one mean
is really different from another
 That is, it is a statistical test to compare means
 There are 3 kinds of t-tests
 Single sample
 when you are comparing participant means with an expert
(we’ll call this the hypothesised mean)
 Independent samples
 when you are comparing performance by two groups
 Correlated samples
 when you are comparing one group tested in two different
situations
Single sample test
 Where you have one group of subjects and test
them against one mean
 for example
 one value is obtained from one expert, which is then
assumed to be the mean for some expert group
 or the value could be a benchmark, and you
compare the mean of the participants with that
benchmark
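A single sample test could be sketched as follows. The click counts and the expert value are hypothetical, and the function name is an assumption for this example. The critical value 2.262 is the standard t table entry for a two-tailed test at the .05 level with 9 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, hypothesised_mean):
    """Return (t, degrees of freedom) for a single sample t-test."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - hypothesised_mean) / standard_error
    return t, n - 1

# Hypothetical data: clicks taken by 10 participants on one task.
participant_clicks = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
expert_clicks = 9  # hypothetical value obtained from one expert

t, df = one_sample_t(participant_clicks, expert_clicks)

# Two-tailed critical value for alpha = .05 and df = 9 (standard t table).
CRITICAL_T_DF9 = 2.262
print(f"t({df}) = {t:.2f}, significant: {abs(t) > CRITICAL_T_DF9}")
```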
Independent samples
 This is where you have data from different
groups of subjects
 for example
 you have novices and experienced users and you are
comparing the means of the two groups
[Diagram: each condition is tested with a different group: one condition with Group 1 members, the other with Group 2 members]
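An independent samples test might look like the sketch below. It assumes the two groups have roughly equal variances (the classic Student's t-test with a pooled variance); other variants exist. The error counts and the function name are hypothetical. The critical value 2.306 is the standard t table entry for a two-tailed test at the .05 level with 8 degrees of freedom.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t for two independent groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

# Hypothetical error counts for two different groups of people.
novice_errors = [5, 7, 6, 8, 4]
experienced_errors = [2, 3, 1, 4, 0]

t, df = independent_t(novice_errors, experienced_errors)

# Two-tailed critical value for alpha = .05 and df = 8 (standard t table).
CRITICAL_T_DF8 = 2.306
print(f"t({df}) = {t:.2f}, significant: {abs(t) > CRITICAL_T_DF8}")
```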
Correlated samples
 This is where you use the same subjects for
two different measures and want to compare
them
 for example
 you give subjects 2 tasks and see if they found one
harder than the other
[Diagram: Condition 1 and Condition 2 are both tested with the same group members]
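For correlated samples, one common approach is to take each subject's difference between the two measures and test whether the mean difference is different from zero. The sketch below does this with hypothetical task times; the function name is an assumption. The critical value 2.776 is the standard t table entry for a two-tailed test at the .05 level with 4 degrees of freedom.

```python
import math
import statistics

def correlated_t(first_measure, second_measure):
    """t-test on per-subject differences (same subjects, two measures)."""
    differences = [b - a for a, b in zip(first_measure, second_measure)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    t = statistics.mean(differences) / standard_error
    return t, n - 1

# Hypothetical times in seconds: the same 5 subjects did both tasks,
# so position i in each list belongs to the same person.
task_1_times = [30, 42, 35, 50, 38]
task_2_times = [33, 47, 41, 57, 47]

t, df = correlated_t(task_1_times, task_2_times)

# Two-tailed critical value for alpha = .05 and df = 4 (standard t table).
CRITICAL_T_DF4 = 2.776
print(f"t({df}) = {t:.2f}, significant: {abs(t) > CRITICAL_T_DF4}")
```

Pairing the scores matters: the order of the two lists must match subject by subject.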
The p-value
 When you run a statistical test, you get a p-value
 p-value stands for probability value
 The aim of statistical tests is to determine
whether the results could have occurred by
chance
 If it is very unlikely that certain results
occurred by chance then there is probably
some other reason; for example, maybe novice
users get more confused than expert users
The importance of p < .05
 Usually, if results could be obtained by chance
fewer than 5 times in a hundred, we say the
results are significant
 When you do a statistical test, you will get a p-value
expressed as a decimal; for example
p=.04 (if there were no real difference, results at
least this extreme would occur by chance just 4
times in 100)
 Any p < .05 is conventionally treated as significant:
you can conclude that your observed differences are
unlikely to be due to chance alone
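Statistical software normally reports the p-value for you. To show where the number comes from, here is a rough sketch that turns a t statistic into a two-tailed p-value by numerically integrating the t distribution. The function names, the step count, and the example t value are assumptions for illustration; this is not how any particular tool computes it.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Area in both tails beyond |t|, using Simpson's rule (steps is even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The curve is symmetric with total area 1, so both tails together
    # hold whatever lies outside the interval from -|t| to +|t|.
    return max(0.0, 1 - 2 * area_zero_to_t)

ALPHA = 0.05
p = two_tailed_p(2.5, df=9)  # hypothetical t statistic from 10 participants
print(f"p = {p:.3f}, significant: {p < ALPHA}")
```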
One and two tailed t-tests
 one-tailed test: used when you have predicted
the direction of the difference; for example,
novices will use more clicks than experts
 two-tailed test: used when you have predicted
a difference, but have not stated the direction
of the difference; for example, there will be a
difference in performance between males and
females
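The practical difference between the two can be sketched by comparing one t value with both cut-offs. The t value below is hypothetical and the function name is an assumption; the critical values 1.833 (one-tailed) and 2.262 (two-tailed) are the standard t table entries for the .05 level with 9 degrees of freedom.

```python
# Critical t values for alpha = .05 with 9 degrees of freedom,
# taken from a standard t table.
CRITICAL_T_DF9 = {"one-tailed": 1.833, "two-tailed": 2.262}

def is_significant(t, tails):
    """Compare t with the critical value for df = 9 at alpha = .05.

    For the one-tailed case this sketch assumes the predicted direction
    is positive (for example, novice mean minus expert mean above zero).
    """
    critical = CRITICAL_T_DF9[tails]
    if tails == "one-tailed":
        return t > critical
    return abs(t) > critical

t = 2.0  # hypothetical t statistic from a sample of 10 participants
for tails in ("one-tailed", "two-tailed"):
    print(f"{tails}: significant = {is_significant(t, tails)}")
```

A one-tailed test reaches significance with a smaller t, but only if the difference is in the direction that was predicted in advance.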
Today’s lab
 We will run some t-tests on some fake
data
 http://faculty.vassar.edu/lowry/VassarStats.html