Thinking about numbers

Thinking about measurement
Thinking about how measurement and
numbers fit into our decision making
Today’s talk
• Theory of measurement
• Look at SPSS as a tool for comparing groups
Theory of measurement
• Everything can be measured: it is more a
question of how. It is a question of the type of
measurement. It is a question of what
words mean.
• Thought experiment: Is there anything that cannot
be measured?
• But what does it mean to measure
something? What does ‘measure’ mean?
Lord Kelvin
• (when) you can measure what you are speaking
about, and express it in numbers, you know
something about it; but when you cannot measure
it, when you cannot express it in numbers, your
knowledge is of a meagre and unsatisfactory kind.
• Lord Kelvin (William Thomson, 1st Baron Kelvin) (1824-1907), British physicist and mathematician. In:
Popular Lectures and Addresses, London, 1889, vol.
I, p. 73.
Lord Rutherford
• If your experiment needs statistics, then you
ought to have done a better experiment.
• Ernest Rutherford (1st Baron Rutherford of
Nelson) (1871-1937), New Zealand-born British
physicist. Nobel Prize in Chemistry 1908.
St Augustine of Hippo
• "The good Christian should beware of
mathematicians and all those who make
empty prophecies. The danger already exists
that mathematicians have made a covenant
with the devil to darken the spirit and confine
man in the bonds of Hell."
St. Augustine (354-430)
Theory of measurement levels (1)
• Categories: akin to concepts, locations, descriptions,
ascriptions, attributions, or otherwise simple traits.
• For example, you are a student, live in Adelaide, have a
car, have a part-time job, are on Facebook, and want
a good job.
• These are measured by on/offs at the level of the
individual, which means frequency across individuals
becomes a possible unit for analysis.
• Some categories are naturally dichotomies (e.g., male
vs female), whereas others might have many options
(e.g., Australian, Kiwi, Brit, Scot, etc.).
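A minimal sketch of how on/off category membership at the level of the individual turns into frequencies across individuals. The five people and their traits below are invented purely for illustration.

```python
from collections import Counter

# Hypothetical individuals: each category is simply on (True) or off (False).
people = [
    {"student": True, "has_car": True, "on_facebook": True},
    {"student": True, "has_car": False, "on_facebook": True},
    {"student": False, "has_car": True, "on_facebook": False},
    {"student": True, "has_car": True, "on_facebook": True},
    {"student": False, "has_car": False, "on_facebook": True},
]

# Counting the "on" values across individuals gives one frequency per category.
frequencies = Counter()
for person in people:
    for category, present in person.items():
        if present:
            frequencies[category] += 1

print(dict(frequencies))
```

The individual contributes only a yes or a no; the number we analyse is the count across the group.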
Theory of measurement levels (2)
• Ordinality or rank
• Idea here is the presence of a dimension along
which one can make local contrasts.
• So Australia has more people than New
Zealand, but not as many as the UK.
• We can rank NZ < Aust < UK. But we have no
metric here.
• (Overall advice: keep away from rankings. They are bad news for any researcher,
seriously. Bad science, bad thinking, and full of nonsensical measurement,
such as school comparison tables, which are just silly.)
Theory of measurement levels (3)
• Equal interval measurement
• Called ‘metric’, or ‘parametric’.
• Idea here is that there is a scale which is both
objective and quantifiable along its entire
length in terms of unit intervals (e.g., a ruler).
• Hence, intervals have same meaning
irrespective of positioning along the scale.
Theory of measurement levels (4)
• Final stage is called ratio measurement.
• This is equal interval, yes, but with the added
assumption that scales begin at zero.
• This use of the origin is problematic for us in
all the social sciences. Our instruments do not
have zero points (e.g. an IQ of 0 is quite
meaningless, as would be an attitude score).
• We override this assumption by subscribing to
theorems collectively known as central
tendency.
What is central tendency?
• This is an assumption that scores will have the
tendency to aggregate around a midpoint, such
that a mapping of frequencies reveals a normal
distribution.
• Normal distribution theory was worked out 170
years ago, and is popularly referred to as the
bell curve.
• Crucial point here: we describe datasets in terms
of means and deviations from the mean (and
NOT in terms of their ‘start’ and ‘stop’ points)
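A minimal sketch of describing a dataset by its mean and by deviations from the mean, using only Python's standard library. The ten scores are hypothetical, chosen so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical test scores (not real data).
scores = [48, 55, 58, 60, 62, 63, 65, 67, 70, 82]

mean = statistics.mean(scores)   # the midpoint the scores cluster around
sd = statistics.stdev(scores)    # sample standard deviation

# Each score is described by how far it sits from the mean,
# not by where the scale starts or stops.
deviations = [s - mean for s in scores]

print("mean:", mean)
print("standard deviation:", round(sd, 2))
print("deviations from the mean:", deviations)
```

Note that the deviations always sum to zero, which is why the spread has to be summarised with something like the standard deviation rather than a simple total.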
Normal distributions: Are they common?
• Hard to say. They can occur, but:
• Not really that common in the world at large.
• Many natural distributions are skewed to one
side, e.g., income distributions (note how they are
often expressed as medians rather than means).
• The point is that this is a sensible descriptive
theory which can be used as a model of reality.
• As statisticians we have many ways to ‘force’
datasets into normal distributions, and thus
create conditions under which we can use many
statistical tools justifiably.
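One common way to pull a skewed dataset towards a symmetric shape is a log transform. This is offered as an illustration only, not necessarily the method intended here. The incomes below are hypothetical and deliberately constructed to be symmetric after taking logs, so the effect is easy to see.

```python
import math
import statistics

# Hypothetical incomes in thousands: a long tail to the right.
incomes = [20, 40, 40, 80, 80, 80, 160, 160, 320]

mean_raw = statistics.mean(incomes)
median_raw = statistics.median(incomes)

# Log transform: equal ratios become equal distances.
log_incomes = [math.log(x) for x in incomes]
mean_log = statistics.mean(log_incomes)
median_log = statistics.median(log_incomes)

print("raw: mean", round(mean_raw, 1), "median", median_raw)
print("log: mean", round(mean_log, 3), "median", round(median_log, 3))
```

On the raw scale the mean is dragged above the median by the few large incomes; on the log scale the two coincide. Real data will rarely behave this neatly.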
Quotes
• Anon: Like artists, statisticians have the bad habit of
falling in love with their models.
• Anon: Being a statistician means never having to say
you are certain.
• Anon: Having no statistics means a person has no
measure of his or her ignorance.
• Lies, damned lies, and statistics. Various authors,
including Mark Twain, who attributed it to Benjamin
Disraeli, British PM in the Victorian era.
• How to lie with statistics. Title of a stats book by Huff.
But within Education
• We often can create normal curves. They
might occur reasonably naturally, and many of
our assessments as presently used will show
them quite automatically.
• E.g., Magill students' scores on a multiple-choice
test, scored out of 100%, drawn using SPSS
software, but Excel can do the same.
[Figure: histogram of actual percentage on the final test, for EDUC 2047 people in 2008. Vertical axis: frequency (0 to 10); horizontal axis: percentage (29 to 91). Mean = 62.86, Std. Dev. = 9.94, N = 136.]
The notion of probability
• Our statistical tests are devices which enable
us to express the likelihood that a given data
pattern is attributable to chance, or to some other
variable.
• This hinges upon probability.
• Notions of probability arose independently
of central tendency theory, since they were
developed as a means of calculating odds in
gambling, beginning 500 years ago.
Probability
• The Romans had no idea about this. They created
percentages, yes, but never expressed
likelihood in terms of percentages as we do
now.
• Why not? One theory is that they used
knuckle bones instead of dice, which got
invented by the Chinese, perhaps later.
By 1840s (or earlier)
• Around 170 years ago, notions of probability
gelled beautifully into modern mathematics.
• We express outcomes as fractions, or %.
• What is the probability of an event? What is
the probability of living to 100? Of
contracting TB? Of being attacked by a shark?
Or of just getting a “six” on the next throw of
the die?
• Such questions, and the possibility of answers,
are a recent phenomenon.
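A minimal sketch of the simplest of these questions, the chance of a six on the next throw, expressed as a fraction and as a percentage. It assumes a fair six-sided die. The simulation is only a sanity check, and the seed and number of throws are arbitrary choices.

```python
import random
from fractions import Fraction

# One favourable outcome out of six equally likely faces (fair die assumed).
p_six = Fraction(1, 6)
print("as a fraction:", p_six)
print("as a percentage:", round(float(p_six) * 100, 1), "%")

# Sanity check by simulation: the observed proportion of sixes
# should settle close to 1/6 over many throws.
random.seed(1)  # arbitrary seed so the run is repeatable
throws = 60_000
sixes = sum(1 for _ in range(throws) if random.randint(1, 6) == 6)
estimate = sixes / throws
print("simulated proportion:", round(estimate, 3))
```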
Which scenario is more likely:
A, B, or C?
• A family has six children, and let us assume that
boys and girls have an equal chance of being
born.
• Pattern A: Boy, boy, boy, boy, boy, boy.
• Pattern B: Boy, girl, boy, girl, boy, girl.
• Pattern C: Boy, girl, girl, boy, girl, boy.
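A short worked check of the three patterns, assuming (as stated) that boys and girls are equally likely, and additionally assuming that each birth is independent of the others. Under those assumptions every specific birth order of six children has the same probability.

```python
from fractions import Fraction
from itertools import product

p_child = Fraction(1, 2)  # equal chance of boy or girl (stated assumption)

patterns = {"A": "BBBBBB", "B": "BGBGBG", "C": "BGGBGB"}

# Each exact sequence is six independent events, so (1/2) ** 6 for all three.
probabilities = {name: p_child ** len(seq) for name, seq in patterns.items()}
for name, p in probabilities.items():
    print("Pattern", name, "=", p)

# Why C "feels" more likely: many different orders share its mix of
# three boys and three girls, whereas only one order gives six boys.
all_sequences = list(product("BG", repeat=6))
three_each = sum(1 for s in all_sequences if s.count("B") == 3)
print(three_each, "of", len(all_sequences), "orders have three boys and three girls")
```

The exact sequences A, B and C are equally likely at 1/64 each. What differs is the number of sequences that look like them.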
Let's talk about probabilities
• On this campus, are you more likely to meet
more males or more females? Assuming no
means of knowing, we can take a sample.
• How big a sample would we need to take?
• We compare what is in the sample to what we
might expect, on assumption that genders are
equally represented in the overall population
of the state.
• http://www.statpages.org/
• http://graphpad.com/quickcalcs/chisquared1.cfm
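The comparison of sample counts against expected counts can be sketched as a chi-square goodness-of-fit calculation, which is the kind of calculation the online calculators above perform. The sample of 100 students (62 female, 38 male) is hypothetical. The p value uses the fact that, with one degree of freedom, the chi-square tail probability equals erfc(sqrt(chi_square / 2)).

```python
import math

# Hypothetical campus sample (invented counts, not real data).
observed = {"female": 62, "male": 38}
n = sum(observed.values())

# Expected counts if the genders were equally represented.
expected = {group: n / 2 for group in observed}

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)

# Two categories give 1 degree of freedom, where the tail probability
# has a closed form in terms of the complementary error function.
p_value = math.erfc(math.sqrt(chi_square / 2))

print("chi-square =", round(chi_square, 2))
print("p =", round(p_value, 3))
```

With these made-up counts a split this uneven would be unlikely under the 50/50 assumption. A smaller sample with the same proportions would give a much weaker result, which is why the sample-size question matters.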
Curious facts about human thinking
• We have natural tendencies towards being over-optimistic about our capabilities and performance.
• We make judgements based on small numbers,
small samples, without any awareness of sampling
problems. The mind has almost no inherent
quantitative sense.
• We are highly influenced by narrative and
individual cases, but almost never by statistics.
• We have no ability at all to generate random
numbers.
My friend’s neighbour
• My friend says she can often see her
neighbour across the fence, sitting
in the sun, reading a book. He is often
seen there on weekends. He wears glasses,
is short, and slightly overweight.
• Is he more likely to be (a) a university professor,
or (b) a bus driver?
• How did you decide?
How to decode statistical reporting
• There is a grammar to this.
• Students in Sue’s classes exhibited higher
scores than students in Greg’s class, F (1,66) =
5.2, p = .03.
• The sentence has three parts: plain English, the statistical coefficient, and the probability.
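A rough sketch of how the coefficient and the probability in the example are linked. It assumes the F test compares two groups (first degree of freedom = 1), in which case F equals the square of a t statistic with the second degrees of freedom. The function names are hypothetical, and the numerical integration stands in for what a package such as SPSS does internally.

```python
import math

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def p_value_from_f(f_value, df_within, steps=2000):
    """p for F(1, df_within), via the relation F(1, df) = t(df) squared."""
    t = math.sqrt(f_value)
    h = t / steps  # steps must be even for Simpson's rule
    # Simpson's rule for the area under the t density between 0 and t.
    area = t_density(0, df_within) + t_density(t, df_within)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df_within)
    area *= h / 3
    # Probability of a result at least this extreme in either direction.
    return 1 - 2 * area

# Values from the reported sentence: F (1,66) = 5.2
p = p_value_from_f(5.2, 66)
print("p =", round(p, 2))
```

The result comes out at roughly .026, which rounds to the reported p = .03: if there were no real difference between the classes, a gap this large would arise by chance about 3 times in 100.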