Transcript Powerpoint

Statistics
A Word on Statistics - Wislawa Szymborska
Out of every hundred people,
those who always know better:
fifty-two.
Led to error
by youth (which passes):
sixty, plus or minus.
Unsure of every step:
almost all the rest.
Those not to be messed with:
four-and-forty.
Ready to help,
if it doesn't take long:
forty-nine.
Living in constant fear
of someone or something:
seventy-seven.
Always good,
because they cannot be otherwise:
four -- well, maybe five.
Capable of happiness:
twenty-some-odd at most.
Able to admire without envy:
eighteen.
Harmless alone,
turning savage in crowds:
more than half, for sure.
Cruel
when forced by circumstances:
it's better not to know,
not even approximately.
Wise in hindsight:
not many more
than wise in foresight.
Getting nothing out of life except
things:
thirty
(though I would like to be wrong).
Balled up in pain
and without a flashlight in the dark:
eighty-three, sooner or later.
Those who are just:
quite a few, thirty-five.
But if it takes effort to understand:
three.
Worthy of empathy:
ninety-nine.
Mortal:
one hundred out of one hundred --
a figure that has never varied yet.
Today
• Introduction to statistics
• Looking at our qualitative data in a
quantitative way
• Presentations
• More exploration of the data
• Tutorials
Why statistics are important
Statistics are concerned with difference – how much
does one feature of an environment differ from
another
Suicide rates/100,000 people
Why statistics are important
Relationships – how much does one feature of
the environment change as another measure
changes
The response of the fear centre of white people to
black faces depending on their exposure to
diversity as adolescents
The two tasks of statistics
Magnitude: What is the size of the difference or the
strength of the relationship?
Reliability: To what degree can the
measures of the magnitude of variables be
replicated with other samples drawn from the
same population?
Magnitude – what’s our measure?
• Raw number? Rate?
• Some aggregate of numbers? Mean, median, mode?
Suicide rates/100,000 people
How safe do MPHS people feel?
1. Feeling safe in their own home: yes=1, no=0
2. Feeling safe in their local part of MPHS: yes =1,
no=0
3. Feeling safe in MPHS generally: yes=1, no=0
Total safety score = sum of items 1-3; range = 0 to 3.
If people don’t refer to item 1 above, score it as 1.
If people score 0 on item 2, they must score 0 on item 3.
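The scoring rules above can be sketched in code. This is only an illustration, not the study's actual scoring procedure: the function name `total_safety_score` and the use of `None` for a respondent who did not mention their own home are assumptions made for the example.

```python
from typing import Optional


def total_safety_score(home: Optional[int], local: int, general: int) -> int:
    """Add the three yes=1 / no=0 safety items into a 0-3 score."""
    # Rule: if the person did not refer to item 1 (own home), score it as 1.
    if home is None:
        home = 1
    # Rule: a 0 on item 2 (local part of MPHS) forces a 0 on item 3 (MPHS generally).
    if local == 0:
        general = 0
    return home + local + general


# Home not mentioned, feels safe locally and in MPHS generally.
print(total_safety_score(None, 1, 1))
# Feels unsafe locally, so the general item is set to 0 as well.
print(total_safety_score(1, 0, 1))
```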
Arithmetic mean or average
Mean (M or X̄) is the sum (ΣX) of all the sample values
(X1 + X2 + X3 + ... + XN) divided by the sample size (N).
Mean/average = ΣX/N
Carbon footprint scores
63   71   75   78   80   85
64   72   75   79   81   85
66   73   75   79   81   85
67   73   75   79   83   86
68   74   76   79   84   89
70   74   76   80   84   90
70   74   77   80   84   92
71        77        84
Compute the mean
             Total   Polynesian   Other
Total (ΣX)   3483    971          2512
N            45      13           32
Mean         77.4    74.7         78.5
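As a check on the table, the means can be recomputed in a few lines. This sketch assumes the 45 carbon footprint scores are read from the slide row by row; the Polynesian and Other means use the sums and sample sizes reported in the table, because the individual "Other" scores are not listed separately.

```python
# The 45 carbon footprint scores from the slide, read row by row.
scores = [
    63, 71, 75, 78, 80, 85,
    64, 72, 75, 79, 81, 85,
    66, 73, 75, 79, 81, 85,
    67, 73, 75, 79, 83, 86,
    68, 74, 76, 79, 84, 89,
    70, 74, 76, 80, 84, 90,
    70, 74, 77, 80, 84, 92,
    71, 77, 84,
]


def mean(total: float, n: int) -> float:
    """Arithmetic mean: the sum of the values divided by the sample size."""
    return total / n


overall_mean = mean(sum(scores), len(scores))
polynesian_mean = mean(971, 13)  # sum and N as reported on the slide
other_mean = mean(2512, 32)      # sum and N as reported on the slide

print(sum(scores), len(scores), round(overall_mean, 1))
print(round(polynesian_mean, 1), round(other_mean, 1))
```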
The median
• The median is the "middle" value of the sample. There are
as many sample values above the sample median as
below it.
• If the number (N) in the sample is odd, then the
median = the value of the data point at position
(N-1)/2+1 of the sample ordered from smallest to
largest value. E.g. if N=45, the median is the value
at the (45-1)/2+1 = 23rd position.
• If the sample size is even, then the median is the
average of the values at positions N/2 and N/2+1. E.g. if
N=32, the median is the average of the values at the
32/2 (16th) and 32/2+1 (17th) positions.
Why use the median?
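The position rule for odd and even samples can be sketched as follows. The function name `median_by_position` is made up for the example, and the 13 Polynesian scores are taken from the variance table elsewhere in this transcript. The last lines offer one possible answer to "Why use the median?": an extreme value (the 860 is invented for the illustration) shifts the mean a lot but leaves the median where it was.

```python
def median_by_position(values):
    """Median using the (N-1)/2+1 rule for odd N and the N/2, N/2+1 rule for even N."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        position = (n - 1) // 2 + 1        # 1-based position of the middle value
        return ordered[position - 1]       # Python lists are 0-based
    lower = ordered[n // 2 - 1]            # value at position N/2
    upper = ordered[n // 2]                # value at position N/2 + 1
    return (lower + upper) / 2


polynesian = [64, 66, 67, 70, 71, 74, 75, 77, 79, 80, 81, 81, 86]
print(median_by_position(polynesian))

# Hypothetical outlier: replace the largest score with an invented extreme value.
with_outlier = polynesian[:-1] + [860]
print(median_by_position(with_outlier), sum(with_outlier) / len(with_outlier))
```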
Other measures of central tendency
• The mode is the single most frequently
occurring data value. If there are two or more
values used equally frequently, then the data
set is called bi-modal or tri-modal, etc
• The midrange is the midpoint of the sample: the
average of the smallest and largest data
values in the sample.
• The geometric mean (log transformation) and
the harmonic mean (inverse transformation) –
both used where data is skewed with the aim
of creating a more even distribution
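These other measures are all available from, or easy to build on, Python's standard `statistics` module (version 3.8 or later is assumed for `multimode` and `geometric_mean`). The sketch uses the 13 Polynesian scores from the variance table elsewhere in this transcript.

```python
from statistics import geometric_mean, harmonic_mean, mean, multimode

polynesian = [64, 66, 67, 70, 71, 74, 75, 77, 79, 80, 81, 81, 86]

# multimode returns every value tied for most frequent, so a bi-modal
# or tri-modal sample gives a list with two or three entries.
modes = multimode(polynesian)

# Midrange: the average of the smallest and largest values.
midrange = (min(polynesian) + max(polynesian)) / 2

# Geometric and harmonic means are pulled less by large values than the
# arithmetic mean, which is why they are used with skewed data.
arithmetic = mean(polynesian)
geometric = geometric_mean(polynesian)
harmonic = harmonic_mean(polynesian)

print(modes, midrange)
print(round(arithmetic, 2), round(geometric, 2), round(harmonic, 2))
```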
Compute the median and mode
63   71   75   78   80   85
64   72   75   79   81   85
66   73   75   79   81   85
67   73   75   79   83   86
68   74   76   79   84   89
70   74   76   80   84   90
70   74   77   80   84   92
71        77        84
Mean, median, mode, mid-range
           Total        Polynesian   Other
Total      3483         971          2512
N          45           13           32
Mean       77.4         74.7         78.5
Median     77           75           78.5
Mode       75, 79, 84   81           84
Midrange   77           75           77
Ecological footprint histogram
[Figure: histogram. x-axis: ecological footprint scores in bins 63-67, 68-72, 73-77, 78-82, 83-87, 88-92; y-axis: frequency, 0 to 14]
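The bin counts behind a histogram like this one can be tallied directly. The sketch below assumes the 45 scores listed earlier and the six bins named on the x-axis. It prints a rough text histogram rather than reproducing the original figure.

```python
# The 45 carbon footprint scores from the slide, read row by row.
scores = [
    63, 71, 75, 78, 80, 85,
    64, 72, 75, 79, 81, 85,
    66, 73, 75, 79, 81, 85,
    67, 73, 75, 79, 83, 86,
    68, 74, 76, 79, 84, 89,
    70, 74, 76, 80, 84, 90,
    70, 74, 77, 80, 84, 92,
    71, 77, 84,
]

# Inclusive bin edges, as labelled on the histogram's x-axis.
bins = [(63, 67), (68, 72), (73, 77), (78, 82), (83, 87), (88, 92)]

# Count how many scores fall inside each bin.
frequencies = [sum(1 for s in scores if low <= s <= high) for low, high in bins]

for (low, high), count in zip(bins, frequencies):
    print(f"{low}-{high}: {'#' * count} ({count})")
```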
The underlying distribution of the data
Normal distribution
Three things we must know before
we can say events are different
1. the difference in mean scores of two or
more events
- the bigger the gap between means the
greater the difference
2. the degree of variability in the data
- the less variability the better
Variance and Standard Deviation
These are estimates of the spread of data.
They are calculated by measuring the
distance between each data point and the
mean
The variance (s²) is the average of the squared
deviations of each sample value from the
mean: s² = Σ(X-M)²/(N-1)
The standard deviation (s) is the square root
of the variance.
Calculating the variance and the standard deviation for the Polynesian sample
x       (x-Mx)   (x-Mx)²
64      -10.7    114.3
66      -8.7     75.6
67      -7.7     59.2
70      -4.7     22.0
71      -3.7     13.6
74      -0.7     0.5
75      0.3      0.1
77      2.3      5.3
79      4.3      18.6
80      5.3      28.2
81      6.3      39.8
81      6.3      39.8
86      11.3     127.9
Total   971               544.8
Mean (Mx) = 74.7
Nx = 13
Variance = sx² = 41.9
Standard deviation = sx = 6.5
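The worked table can be reproduced step by step. One point is worth noticing: the table's variance of 41.9 corresponds to dividing the sum of squared deviations by N (13), whereas dividing by N-1 (12) gives roughly 45.4, which is the variance that appears in the t-test output later in this transcript. The sketch computes both so the two conventions can be compared. The variable names are chosen for the example.

```python
from math import sqrt
from statistics import pvariance, variance

polynesian = [64, 66, 67, 70, 71, 74, 75, 77, 79, 80, 81, 81, 86]

n = len(polynesian)
m = sum(polynesian) / n

# Sum of squared deviations of each score from the mean.
sum_sq = sum((x - m) ** 2 for x in polynesian)

var_n = sum_sq / n                # divide by N (matches the worked table)
var_n_minus_1 = sum_sq / (n - 1)  # divide by N-1 (the sample variance)
sd_n = sqrt(var_n)
sd_n_minus_1 = sqrt(var_n_minus_1)

print(round(sum_sq, 1), round(var_n, 1), round(sd_n, 1))
print(round(var_n_minus_1, 3), round(sd_n_minus_1, 2))

# The standard library offers both conventions directly.
print(round(pvariance(polynesian), 1), round(variance(polynesian), 3))
```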
All normal distributions have similar properties. The
percentage of the scores that is between one standard
deviation (s) below the mean and one standard deviation
above is always 68.26%
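This property can be checked numerically with the standard library's `NormalDist` (Python 3.8 or later assumed). The calculation gives about 68.27%, in line with the figure quoted above. The second distribution uses the Polynesian mean and standard deviation purely as an example of "any" normal distribution.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

# The same proportion holds for a normal distribution with any mean and SD,
# here using mean 74.7 and SD 6.5 as an example.
example = NormalDist(mu=74.7, sigma=6.5)
within_one_sd_example = example.cdf(74.7 + 6.5) - example.cdf(74.7 - 6.5)

# Proportion within 1.96 standard deviations, used for 95% intervals.
within_196_sd = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(round(within_one_sd * 100, 2), round(within_one_sd_example * 100, 2))
print(round(within_196_sd * 100, 1))
```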
Is there a difference between Polynesian
and “other” scores
[Figure: Polynesian (n=13, blue) and "other" (n=32, red) histograms. x-axis: ecological footprint scores in bins 63-67, 68-72, 73-77, 78-82, 83-87, 88-92; y-axis: percentage, 0% to 35%]
Is there a significant difference between
Polynesian and “other” scores
Three things we must know before
we can say events are different
3. The extent to which the sample is
representative of the population from
which it is drawn
- the bigger the sample the greater the
likelihood that it represents the population
from which it is drawn
- small samples have unstable means. Big
samples have stable means.
Estimating difference
The measure of stability of the mean is the Standard
Error of the Mean = standard deviation/the square root
of the number in the sample.
So stability of mean is determined by the variability in
the sample (this can be affected by the consistency of
measurement) and the size of the sample.
The standard error of the mean (SEM) is the standard
deviation of the normal distribution of the mean if we
were to measure it again and again
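A short sketch shows how the standard error of the mean shrinks as the sample grows. The standard deviation of 6.5 and the sample size of 13 come from the Polynesian sample; the larger sample sizes are hypothetical and are included only to show the square-root effect.

```python
from math import sqrt


def standard_error_of_mean(sd: float, n: int) -> float:
    """SEM = standard deviation divided by the square root of the sample size."""
    return sd / sqrt(n)


sd = 6.5
# 13 is the Polynesian sample size; 52 and 208 are hypothetical larger samples.
for n in (13, 52, 208):
    print(n, round(standard_error_of_mean(sd, n), 2))
```

Quadrupling the sample size halves the standard error, so large gains in stability need much larger samples.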
Yes, it’s significant. The mean of the smaller sample is not too
variable. Its Standard Error of the Mean = 6.5/√13 = 1.80. The
95% confidence interval extends 1.96 standard errors (3.52 units)
either side of the mean (74.7). This gives a range
from 71.2 to 78.2. The “Other” mean (78.5) falls just outside this
confidence interval
[Figure: distribution of the standard error of the mean for the Polynesian sample (Mean = 74.7, SD = 6.5, N = 13)]
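The confidence interval argument can be retraced in code. This sketch uses the rounded summary figures from the slides (mean 74.7, SD 6.5, N 13, and an "Other" mean of 78.5), so the margin comes out at about 3.5 units. The variable names are chosen for the example.

```python
from math import sqrt

mean_polynesian = 74.7
sd_polynesian = 6.5
n_polynesian = 13
mean_other = 78.5

# Standard error of the mean for the Polynesian sample.
sem = sd_polynesian / sqrt(n_polynesian)

# 95% confidence interval: 1.96 standard errors either side of the mean.
margin = 1.96 * sem
lower = mean_polynesian - margin
upper = mean_polynesian + margin

# Does the "Other" mean fall outside the interval?
other_outside = not (lower <= mean_other <= upper)

print(round(sem, 2), round(lower, 1), round(upper, 1), other_outside)
```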
Is the difference between means
significant?
What is clear is that the mean of the Other group is just
outside the area where there is a 95% chance that the
mean for the Polynesian Group will fall, so it is likely
that the Other mean comes from a different population
than the Polynesian mean.
The convention is to say that if mean 2 falls outside of the
area (the confidence interval) where 95% of mean 1
scores are estimated to be, then mean 2 is significantly
different from mean 1. We say the probability of mean
1 and mean 2 being the same is less than 0.05 (p<0.05)
and the difference is significant
The significance of significance
• Not an opinion
• A sign that very specific criteria have been met
• A standardised way of saying one of the following:
There is a difference between two groups – p<0.05;
There is no difference between two groups – p>0.05;
There is a predictable relationship between two
groups – p<0.05; or
There is no predictable relationship between two
groups - p>0.05.
• A way of getting around the problem of variability
One and two tailed tests
[Figure: 1-tailed test and 2-tailed test on the M1 distribution; x-axis: standard deviations, marked at -1.96 and +1.96; regions labelled 2.5%, 95% and 2.5% of the M1 distribution]
If you argue for a one tailed test, saying the difference can
only be in one direction, then you can add the 2.5% error from
the side where no data is expected to the side where it is.
If we were to argue for a one tailed test (that Polynesian people were
more eco-sustainable than the Others), the 95% confidence interval
can all be to the left of the SEM distribution rather than equally
distributed on either side. This means that instead of going to the 47.5%
line on the right we go to the 45% line = 1.65 standard errors, or 3.0 units.
[Figure: normal distribution of the standard error of the mean for the Polynesian sample (Mean = 74.7, SD = 6.5, N = 13)]
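The one-tailed and two-tailed cut-offs can be looked up from the normal distribution rather than from a printed table. The sketch below uses `NormalDist.inv_cdf` from the standard library (Python 3.8 or later assumed) together with the rounded Polynesian summary figures. The one-tailed value comes out near 1.645, which the slide rounds to 1.65.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()

# Two-tailed: 2.5% in each tail, so go out to the 97.5th percentile.
z_two_tailed = standard_normal.inv_cdf(0.975)
# One-tailed: the whole 5% in one tail, so go out to the 95th percentile.
z_one_tailed = standard_normal.inv_cdf(0.95)

sem = 6.5 / sqrt(13)  # standard error of the Polynesian mean
margin_one_tailed = z_one_tailed * sem
cutoff = 74.7 + margin_one_tailed  # one-sided limit above the Polynesian mean

print(round(z_two_tailed, 2), round(z_one_tailed, 3))
print(round(margin_one_tailed, 1), round(cutoff, 1), 78.5 > cutoff)
```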
T-tests
t = (Mx - My)/√(Sx²/Nx + Sy²/Ny),
where t is the value generated and:
Mx = the mean carbon footprint of participants with higher
incomes
My = the mean carbon footprint of participants with moderate
to low incomes
Sx² = the variance of the carbon footprint of participants with
higher incomes
Sy² = the variance of the carbon footprint of participants with
moderate to low incomes
Nx = the number of participants with higher incomes
Ny = the number of participants with moderate to low incomes
T-test result. This does exactly what we have done except it argues that in
every sample the first data point is fixed and that other data points are free to
vary in relation to it. Consequently, when estimating variance we should
divide by (N-1) not N. That makes this test more conservative.
t-Test: Two-Sample Assuming Unequal Variances
                               Polynesian    Other
Mean                           74.69         78.50
Variance                       45.397        44.26
Observations                   13            32
Hypothesized Mean Difference   0
Degrees of freedom (df)        43
t Stat                         -1.73
p(T<=t) one-tail               0.045
t Critical one-tail            + or - 1.68
p(T<=t) two-tail               0.090
t Critical two-tail            + or - 2.02
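The t statistic in this output can be recomputed from the summary figures alone: the difference between the two means divided by the square root of the summed variance-over-N terms. The function name `t_statistic` is made up for the sketch; the critical values are the ones printed in the table.

```python
from math import sqrt


def t_statistic(mean_x, var_x, n_x, mean_y, var_y, n_y):
    """Difference between means divided by the square root of (var_x/n_x + var_y/n_y)."""
    return (mean_x - mean_y) / sqrt(var_x / n_x + var_y / n_y)


# Polynesian versus Other, using the means, variances and Ns from the table.
t = t_statistic(74.69, 45.397, 13, 78.50, 44.26, 32)

# Critical values as printed in the table.
significant_one_tail = abs(t) > 1.68
significant_two_tail = abs(t) > 2.02

print(round(t, 2), significant_one_tail, significant_two_tail)
```

On these figures the difference passes the one-tailed criterion but not the two-tailed one, which is why the choice of tails has to be argued before looking at the data.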
Impact of gender on safety
t-Test: Two-Sample Assuming Unequal Variances
                               women    men
Mean                           2.05     1.58
Variance                       0.95     0.99
Observations                   21.00    12.00
Hypothesized Mean Difference   0.00
df                             23.00
t Stat                         1.30
P(T<=t) one-tail               0.10
t Critical one-tail            1.71
P(T<=t) two-tail               0.21
t Critical two-tail            2.07
Impact of religion on safety
t-Test: Two-Sample Assuming Unequal Variances
                               no religion   religion
Mean                           1.81          2.00
Variance                       1.23          0.86
Observations                   16.00         15.00
Hypothesized Mean Difference   0.00
df                             29.00
t Stat                         -0.51
P(T<=t) one-tail               0.31
t Critical one-tail            1.70
P(T<=t) two-tail               0.61
t Critical two-tail            2.05
Impact of work on safety
t-Test: Two-Sample Assuming Unequal Variances
                               work in MPHS   not working in MPHS
Mean                           2.43           1.47
Variance                       0.26           1.15
Observations                   14.00          19.00
Hypothesized Mean Difference   0.00
df                             27.00
t Stat                         3.39
P(T<=t) one-tail               0.00
t Critical one-tail            1.70
P(T<=t) two-tail               0.00
t Critical two-tail            2.05
Correlations and Chi-square
The correlation with the glacier went unnoticed.
The debate proceeded and receded
with slow heated monotonous cold regularity
although never reversing
at the same point of disagreement.
The correlation with the glacier went. . .
The weight of paper and opinion
now far-exceeding the frozen mountain, even at its zenith.
But no amount of FSC vellum
could paper over the crevasse cracked argument.
The correlation with the glacier . . . .
The blue-green water vein bled
But no aerial artery replenished the source.
The constant melt etching the message
of increased bloodletting from the waning carcase
The correlation with the . . . . .
Lost in the science of the unknown.
The pre-historic signpost, scarred by graffiti,
slowly shrank and collapsed
Its incremental deficit matched by political will.
The correlation . . . . . .
We are, we were, the new dinosaurs,
like the sun-burnt beached berg
doomed for demise in the new non-ice age.
No-one will record its disappearance or ours.
The correlation with humanity went unnoticed.
Correlation by John S http://allpoetry.com/poem/9257026Correlation-by-JohnS
Chi-square test - comparing MPHS samples
with the local populations
• Looks at the magnitude or size of the difference between
observed and expected values (O-E) and then squares those
differences so they are all positive: (O-E)².
• Adjusts those differences so they are relative to the size of the
expected values: (O-E)²/E. This is a variance measure and
takes care of effects that are due to the size of the expected
value, which in turn is related to the sample size.
• Calculates a chi-square value, which is the sum of the adjusted
differences (Σ(O-E)²/E = 14.03). This is compared with the
value that chi-squared would have to reach to be significant
for the number of categories used (n).
• The question: Is the MPHS sample representative of the
cultural mix of the MPHS population?
What would we predict?
Age     MPHS sample (n)   MPHS sample (%)   Population (%)
18-30   9                 27%               48%
31-40   13                39%               22%
41-50   9                 27%               18%
>50     2                 6%                13%
Total   33                100%
In red are the number of participants we would predict (we
EXPECT) based on the percent in each category in the MPHS
population (2006). In blue is what we got (we OBSERVED). Is
the match sufficiently close?
Does the MPHS sample match the population
age distribution?
Age     O    E    O-E     (O-E)²   (O-E)²/E
18-30   9    16   -6.69   44.72    2.85
31-40   13   7    5.82    33.89    4.72
41-50   9    6    3.09    9.54     1.61
>50     2    4    -2.22   4.94     1.17
chi-square = 10.35
Degrees of freedom = N-1 = 3, where N = the number of categories (parameters), not the
number of participants
Value of chi-square (χ²) for p<0.05 = 7.81
Actual χ² is more than 7.81, therefore there is a significant difference between
the MPHS sample and MPHS population
Chi-square table: http://www.medcalc.org/manual/chi-squaretable.php
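The chi-square calculation for the age groups can be retraced as follows. The expected counts on the slide are shown rounded to whole numbers, so this sketch recovers the unrounded expected values from the O-E column (an assumption about how the slide's figures were produced). The critical value of 7.81 is the one quoted on the slide for 3 degrees of freedom.

```python
age_groups = ["18-30", "31-40", "41-50", ">50"]
observed = [9, 13, 9, 2]

# O-E values as printed on the slide; E is recovered as O - (O-E).
deviations = [-6.69, 5.82, 3.09, -2.22]
expected = [o - d for o, d in zip(observed, deviations)]

# Chi-square: sum of squared differences, each scaled by its expected value.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

degrees_of_freedom = len(age_groups) - 1  # number of categories minus 1
critical_value = 7.81                     # p<0.05 for df=3, from the slide
significant = chi_square > critical_value

print(round(chi_square, 2), degrees_of_freedom, significant)
```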
Does the MPHS sample match the population
distribution of number of children?
Children               O    E    O-E     (O-E)²   (O-E)²/E
No Children            6    10   -3.79   14.33    1.46
One Child              10   5    5.00    25.02    5.01
Two Children           7    7    0.08    0.01     0.00
Three Children         6    5    1.46    2.12     0.47
Four Children          1    2    -1.48   2.19     0.88
Five Children          1    1    -0.08   0.01     0.01
Six or More Children   0    1    -1.19   1.41     1.19
chi-square = 9.02
Degrees of freedom = N-1 = 6, where N = the number of categories (parameters), not the number
of participants
Value of chi-square (χ²) for p<0.05 = 12.59
Actual χ² is less than 12.59, therefore there is no significant difference between the
MPHS sample and MPHS population
[Figure: r=0.904, N=33, p<0.00]
Correlations
r = Σ((X - MX)(Y - MY))/(N × SX × SY)
X = GDP purchasing power in $'000s
Y = Better Life Index (0-10)
MX = Mean of X = 25.2 ($25,200)
MY = Mean of Y = 6.34
SX = Standard deviation of X = 7.02
SY = Standard deviation of Y = 1.44
r = correlation coefficient = +0.90
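The correlation formula can be written out directly. The country-level GDP and Better Life Index values are not included in this transcript, so the four data pairs below are invented purely to exercise the function. Because the formula divides by N, the standard deviations here are also computed by dividing by N.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r = sum((X - MX)(Y - MY)) / (N * SX * SY), with SDs that divide by N."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    cross_products = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross_products / (n * sx * sy)


# Hypothetical data, NOT the values behind the slide's r = +0.90.
gdp_thousands = [10, 20, 30, 40]
life_index = [4, 5, 7, 8]

r = pearson_r(gdp_thousands, life_index)
print(round(r, 3))
```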
One or two tails? Have we made a prior
prediction? Yes, that life satisfaction will
increase with wealth = 1 tailed test
What degrees of freedom? df = N-2 = 33-2 = 31
What level of significance should be chosen? It
depends on the number of correlations. p<0.05 if
there is only one correlation. Often there are hundreds,
in which case a tougher criterion should be
chosen.
Where can we find the critical values of r?
Correlation between felt safety and people in the household
                            Children   Adults   Total people   Total safety
Children                    1.00
Adults                      0.18       1.00
Total people in household   0.70       0.83     1.00
Total safety                -0.09      -0.24    -0.22          1.00
Critical value: p<0.05, df=30, r=0.349
• http://www.medcalc.org/manual/chi-squaretable.php
Correlation and regression
• Correlation quantifies the degree to which two
random variables are related. Correlation does not
fit a line through the data points. You simply are
computing a correlation coefficient (r) that tells
you how much one variable tends to change when
the other one does.
• Linear regression finds the best line that predicts
the size of one variable (the dependent variable) from
another (the independent variable). The coefficient of
determination (r²) tells how much of the variability of
the dependent variable is accounted for by
the independent variable
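The difference between the two ideas shows up clearly in code: regression produces a line (a slope and an intercept), and r² then reports how much of the variation in the dependent variable that line accounts for. The sketch below is a plain least-squares fit. The data pairs are invented for illustration and the function name `fit_line` is an assumption of the example.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x, plus r-squared."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # r-squared: 1 minus (unexplained variation / total variation).
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - my) ** 2 for y in ys)
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared


# Hypothetical data: x is the independent variable, y the dependent variable.
xs = [10, 20, 30, 40]
ys = [4, 5, 7, 8]

slope, intercept, r_squared = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```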
Correlations
A perfect relationship, but not a linear correlation
[Figure: plot of y against x]
A powerful relationship, but not a correlation: what’s happening here?
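One way to see how a perfect relationship can still give no linear correlation is a symmetric curve. In the sketch below y is completely determined by x (y = x²), yet r comes out at zero because the rising and falling halves cancel. The data are invented for the illustration and are not taken from the figures on these slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient, with SDs that divide by N."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)


# Hypothetical data: y depends perfectly on x, but along a U-shaped curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)

# Looking at only the right-hand half of the curve shows a strong positive r.
r_right_half = pearson_r(xs[3:], ys[3:])

print(round(r_curve, 3), round(r_right_half, 3))
```

This is one reason to plot the data before relying on r: a coefficient near zero rules out a straight-line relationship, not a relationship of any kind.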
Normality of the data and
Homoscedasticity
[Figure: r=0.904, N=33, p<0.00]
How correlation is used and
misused
Tests of significance
• Tests of difference – t-tests, analysis of
variance, chi-square, odds ratios
• Tests of relationship – correlation,
regression analysis
• Tests of difference and relationship –
analysis of covariance, multiple regression
analysis.
Inferential statistics