lecture 6 hypothesis tests II
Graduate School
Quantitative Research Methods
Gwilym Pryce
[email protected]
Lecture 6
Hypothesis Testing II:
Proportions and 2 Populations
Notices:
Register
Aims & Objectives
Aim
• the aim of this lecture is to continue with our
introduction of the method of hypothesis testing
and to demonstrate a number of applications
Objectives
– by the end of this lecture students should
be able to carry out hypothesis tests on:
• two population means
• one population proportion
• two population proportions
Plan:
1. Review of Significance
2. Review of one sample tests on the mean
3. Hypothesis tests about Two population means
• Homogenous variances
• Heterogeneous variances
4. Deciding on whether variances are equal
5. Hypothesis tests about proportions
– One population
– Two populations
Macro commands:

Confidence Intervals (CI)
Macro command   Definition
CI_L1M          Large sample CI for one mean
CI_S1M          Small sample CI for one mean
CI_S2MP         Small independent samples CI for difference between 2 means (pooled variance)
CI_S2MD         Small independent samples CI for difference between 2 means (different variances)
CI_L1P          Large sample CI for one proportion (presents output for both Traditional and Wilson methods of calculation)
CI_L2P          Large sample CI for comparing two proportions (presents output for both Traditional and Wilson methods of calculation)
N_L1M           Sample size for desired margin of error for the mean

Hypothesis tests
Macro command   Definition
H_L1M           Large sample significance test on one mean
H_S1M           Small sample significance test on one mean
H_S2MP          Small independent samples significance test for equality of 2 means (pooled variance)
H_S2MD          Small independent samples significance test for equality of 2 means (different variances)
H_L1P           Large sample significance test on one proportion
H_L2P           Large samples significance test on two proportions
H_S2VF          Simple small sample F-test on equality of two variances (see also Levene’s test in the SPSS help menu for a more sophisticated test of homogenous variances)
1. Review of Significance
P = significance level = chances of observing a sample mean
at least as far from the hypothesised value as ours, given that
our assumption about the population
(denoted by “H0”) is true.
So if we find that this probability is small, it
might lead us to question our assumption
about the population mean.
– i.e. if our sample mean is a long way from our
assumed population mean then it is:
• either a freak sample
• or our assumption about the population mean is wrong.
If we draw the conclusion that it is our
assumption about μ that is wrong and reject H0
then we have to bear in mind that there is a
chance that H0 was in fact true.
– In other words:
• when we reject H0 whenever P < 0.05, then on about one in every
twenty occasions where H0 is in fact true we would
wrongly reject it.
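The false rejection idea can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, s.d. 10), the sample size and the number of trials are all assumed, and the population s.d. is treated as known so that a plain z test applies. It draws many samples from a population where H0 is true and counts how often a 2-tailed test at the 0.05 level rejects H0.

```python
import math
import random


def upper_tail_normal(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def false_rejection_rate(trials=20000, n=30, mu0=50.0, sigma=10.0,
                         alpha=0.05, seed=1):
    """Share of samples that reject H0 even though H0 (mean = mu0) is true.

    All default values are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / math.sqrt(n))
        p_two_tail = 2 * upper_tail_normal(abs(z))
        if p_two_tail < alpha:
            rejections += 1
    return rejections / trials


rate = false_rejection_rate()
print(round(rate, 3))
```

The printed rate should be close to 0.05, i.e. roughly one false rejection in every twenty samples drawn when H0 is true.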
2. Review of one sample tests on the mean
We introduced a common framework for
hypothesis testing:
4 Steps of Hypothesis testing:
Step (1) state H0 and H1
Step (2) state α and formula
Step (3) state decision rule
Step (4) compute P & decide
We also looked at 2 specific tests:
Large sample sig. test on one mean:
• Formula:
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
• Macro syntax:
H_L1M n=(?) x_bar=(?) m=(?) s=(?).
Small sample sig. test on one mean:
• Formula:
$$t_i = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \quad df = n - 1$$
• Macro syntax:
H_S1M n=(?) x_bar=(?) m=(?) s=(?).
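As a rough illustration of the large sample formula, the sketch below computes z and the tail significance levels from summary statistics. It is not the H_L1M macro itself: the function name and the example numbers (n = 100, sample mean 52, hypothesised mean 50, s = 10) are hypothetical.

```python
import math


def one_mean_z_test(n, x_bar, mu0, s):
    """Large sample test on one mean from summary statistics.

    Returns z plus the lower tail, upper tail and 2-tailed significance.
    """
    se = s / math.sqrt(n)                           # standard error of the mean
    z = (x_bar - mu0) / se
    upper = 0.5 * math.erfc(z / math.sqrt(2))       # P(Z > z)
    lower = 1 - upper                               # P(Z < z)
    return z, lower, upper, 2 * min(lower, upper)


# Hypothetical numbers, for illustration only
z, lower, upper, two_tail = one_mean_z_test(n=100, x_bar=52, mu0=50, s=10)
print(round(z, 2), round(upper, 4))
```

The small sample version uses the same statistic but looks up P in the t distribution with n - 1 degrees of freedom rather than the normal curve.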
3. Hypothesis tests about two
population means
In SPSS: this is called the “Independent
Sample t-test”
• go to Analyse, Compare Means...
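Two different formulas for computing t:
Equal Variances (formula has an exact t-distribution):
$$t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad df = n_1 + n_2 - 2$$
Unequal Variances (does not have an exact t-distribution):
$$t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \quad df = \min[n_1 - 1, n_2 - 1]$$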
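The equal variances formula relies on a pooled standard deviation s_p, which the slide does not spell out. The usual textbook definition, given here as an assumption about what is intended rather than as the macro's documented behaviour, is:

```latex
s_p = \sqrt{\frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}}
```

In other words, s_p^2 is a weighted average of the two sample variances, with weights given by each sample's degrees of freedom.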
Example where variances are different:
As part of your PhD, you want to test whether the
new “Fun Phonics” reading method is better than
the “Letterland” method. You examine the reading
power of 6 year old children from two similar
schools.
– The first used the FP method and you found that this produced
an average reading proficiency score of 53.7 (based on a
sample of 22 children; s.d. = 11.5).
– The second school used the Letterland method and you found
that this produced an average reading proficiency score of
42.51 (sample = 24; s.d. = 16.9).
Test whether the FP method produces higher results at the
1% significance level.
x̄_FP = 53.7, n_FP = 22, s_FP = 11.5
x̄_L = 42.51, n_L = 24, s_L = 16.9
Use the 4 steps and the following formula to
test whether the FP method produces higher
results at the 1% significance level.
4 Steps of Hypothesis testing:
Step (1) state H0 and H1
Step (2) state α and formula
Step (3) state decision rule
Step (4) compute P & decide
$$t_c = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \quad df = \min[n_1 - 1, n_2 - 1]$$
Can you use the canned SPSS procedure to
do this problem?
x̄_FP = 53.7, n_FP = 22, s_FP = 11.5
x̄_L = 42.51, n_L = 24, s_L = 16.9
(1) H0: μ_FP = μ_L (means are equal)
H1: μ_FP > μ_L (upper tail test)
(2) α = 0.01 (implies critical t value of 2.518 for df = 21),
$$t_c = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{53.7 - 42.51}{\sqrt{\frac{132.25}{22} + \frac{285.61}{24}}} = 2.644$$
$$df = \min[n_1 - 1, n_2 - 1] = 21$$
(3) Reject H0 iff P < α, i.e. if P < 0.01
(4) P = Prob(t > 2.644) = 0.0076, so reject H0
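The arithmetic in this example can be reproduced with a short sketch. It is not the H_S2Md macro: the function names are hypothetical, and the upper tail probability is obtained by numerically integrating the t density with Simpson's rule (an assumed shortcut so that only the standard library is needed).

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3      # area beyond |t| in one tail
    return upper if t >= 0 else 1 - upper


def two_means_unequal_var(n1, n2, x_bar1, x_bar2, s1, s2):
    """Unequal variances t statistic with df = min(n1 - 1, n2 - 1), testing H0: equal means."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (x_bar1 - x_bar2) / se
    df = min(n1 - 1, n2 - 1)
    return t, df, t_upper_tail(t, df)


t, df, upper_sig = two_means_unequal_var(22, 24, 53.7, 42.51, 11.5, 16.9)
print(round(t, 3), df, round(upper_sig, 4))
```

The upper tail significance should come out below 0.01, in line with the decision to reject H0 in step (4).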
Doing the calculation in SPSS:
You cannot use the canned SPSS
procedure unless you have the original
data.
But you can use the following macro
commands:
– Homogenous variances:
• H_S2Mp n1=(?) n2=(?) x_bar1=(?) x_bar2=(?) s1=(?) s2=(?).
– Heterogeneous variances:
• H_S2Md n1=(?) n2=(?) x_bar1=(?) x_bar2=(?) s1=(?) s2=(?).
For the Letterland/FP example we
would use the diff. variances syntax:
H_S2Md n1=(22) n2=(24) x_bar1=(53.7) x_bar2=(42.51) s1=(11.5) s2=(16.9).
The upper tail sig. = 0.007588
i.e. less than 1% chance of false rejection, therefore reject H0 of
equal means in favour of the alternative hypothesis that Fun
Phonics results in higher reading scores on average than
Letterland.
4. How do we decide on whether the
variances are similar?
Where variances are hugely
different or exactly the same, the
decision is simple.
When there is any ambiguity, we
can use one of two tests to help us:
• Simple Ratio of Variances Test
• Levene’s Test
Simple Ratio of Variances test:
If we take the ratio of the variances of samples
from two independent populations, we find
that that ratio has an F distribution in
repeated samples:
$$F = s_1^2 / s_2^2$$
where the numerator degrees of freedom are
calculated as n1–1 and the denominator
degrees of freedom are calculated as n2–1.
• NB Because the critical values for the F distribution are
only calculated for the upper tail, if the F value you
have calculated is less than one, you need to invert it (i.e.
swap round the numerator and denominator, and with them the degrees of freedom).
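A minimal sketch of the ratio calculation, including the swap when F would be less than one, is given below. The function name is hypothetical and the sketch stops at F and its degrees of freedom; looking up P needs the F distribution, which is left to the macro or to tables.

```python
def variance_ratio(n1, n2, s1, s2):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    v1, v2 = s1 ** 2, s2 ** 2
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    # F would be below one, so swap numerator and denominator (and their df)
    return v2 / v1, n2 - 1, n1 - 1


# Letterland/FP summary statistics from the earlier example
f_ratio, df_num, df_den = variance_ratio(n1=22, n2=24, s1=11.5, s2=16.9)
print(round(f_ratio, 4), df_num, df_den)
```

For the reading scores the Letterland variance is the larger one, so it goes in the numerator with 23 degrees of freedom.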
This is the formula behind the following
command:
H8_S2VF n1=(?) n2=(?) s1=(?) s2=(?).
E.g. for the Letterland/FP example:
H8_S2VF n1=(22) n2=(24) s1=(11.5) s2=(16.9).
Which tells us that there is less than a 5% chance of
false rejection if we reject the null of equal variances.
So reject the null
• i.e. we can be reasonably confident that the population variances are indeed
different.
The Levene’s test
If we have the original data we can use
Levene’s test which is a canned routine in
SPSS.
The Levene’s test is more sophisticated &
robust than the simple ratio of variances test:
– If P (i.e. “sig.”) from the Levene’s test is small,
reject the H0 of equal variances & use the unequal variances t-formula.
– If P from the Levene’s test is large, do not reject H0 &
use the equal variances t-formula to compute the test statistic.
SPSS output from test of equal purchase prices
between Cumberland and Durham (Nationwide
data):
[SPSS output table not recoverable from the transcript]
Two tails from one:
Along with the Levene’s test results, SPSS
automatically supplies t-test results for both the
equal and different variances formulas.
One problem with the SPSS t-test, however, is
that it only gives the 2 tail sig., but you can
work out the one tail sigs as follows:
• The two tailed significance is twice that of the smallest one
tailed significance:
2 tailed sig. = 2 × min[lower tail sig., upper tail sig.]
• But it can be a bit confusing working out which one tail
significance level is the one you want (see notes).
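One way to avoid the confusion is to let the sign of the t statistic decide which tail is the small one. The sketch below assumes the statistic is computed as (sample 1 mean minus sample 2 mean); the function name and the two-tailed figure used in the example (simply double the macro's upper tail sig. from the reading example) are illustrative assumptions, not SPSS output.

```python
def one_tail_sigs(t, two_tail_sig):
    """Split a 2-tailed sig. into lower and upper tail sigs using the sign of t."""
    smaller = two_tail_sig / 2       # tail on the same side as the observed t
    larger = 1 - smaller             # the opposite tail
    if t >= 0:
        return {"lower": larger, "upper": smaller}
    return {"lower": smaller, "upper": larger}


# Illustrative values: t from the reading example, 2-tailed sig. assumed to be 2 x 0.007588
sigs = one_tail_sigs(t=2.644, two_tail_sig=0.015176)
print(sigs)
```

If H1 is an upper tail hypothesis, read off the "upper" entry; if H1 is a lower tail hypothesis, read off "lower".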
Testing for 2 means summary:
If you’ve got the original data,
• First do the Levene’s test in SPSS
Analyze, Compare Means, Independent Samples
• Then do the appropriate macro t-test to avoid confusion.
H_S2Mp for equal variances or H_S2Md for different variances
If you don’t have the original data,
• First do the ratio of variances test
H8_S2VF
• Then do the appropriate macro t-test
H_S2Mp for equal variances or H_S2Md for different variances
5.1 Hypothesis tests on proportions
One population
(large samples only)
So far we have looked at:
– how to make inferences about the
population mean from our sample mean.
But sometimes the variable of interest is
categorical
• a household has or does not have insurance;
• a person is homosexual or not homosexual;
• a person has AIDS or does not have AIDS
In such cases, what we are interested in
is the proportion of cases that fall into a
particular category:
• the proportion of households with insurance;
• the proportion of people who are homosexual;
• the proportion of people with AIDS
Calculating the sample proportion:
p=x/n
– where:
• x = cases with the attribute of interest
e.g. the number of households with insurance
• n = sample size
CLT and Proportions:
Q/ Does the Central Limit Theorem
apply to sample proportions?
A/ Yes.
– Proportions from repeated large random samples will be
approximately normally distributed around the population
proportion π.
– We can then translate any sample proportion p onto
the standard normal curve by calculating its z
score:
$$z_i = \frac{p - \pi}{\sqrt{\frac{\pi (1 - \pi)}{n}}}$$
Example:
– E.g. 1 As a historian, you want to find the
proportion of citizens in medieval Scotland that
contracted the plague. From a sample of 400
parish records, you find that 22 died of the plague.
The assumption in the literature has been that
10% of the population had died. Test whether this
assumption is valid using both 2 and 1 tailed tests.
Summary of data:
n = 400
x = 22
π0 = 0.1
– (1)
H0: π = 10%
H1: π ≠ 10% (2-tailed test)
– (2) α = 0.02, for example.
$$z_i = \frac{p - \pi}{\sqrt{\frac{\pi (1 - \pi)}{n}}}$$
where:
π = the population proportion
p = the sample proportion = x/n
x = the no. of items in sample with the attribute of interest
n = the sample size.
– (3) Reject H0 iff P < α, i.e. if P < 0.02
(this will happen if zc < -2.33 or if zc > 2.33, where 2.33 is the z
value associated with α = 0.02 in a 2-tailed test. Since zc = -3.00, we know we
can safely reject H0).
– (4) Calculate z:
n = 400
x = 22
π0 = 0.1
$$z_c = \frac{p - \pi}{\sqrt{\frac{\pi (1 - \pi)}{n}}} = \frac{0.055 - 0.1}{\sqrt{\frac{0.1(1 - 0.1)}{400}}} = \frac{-0.045}{\sqrt{\frac{0.09}{400}}} = \frac{-0.045}{0.015} = -3.00$$
P = 2 × Prob(z < -3.00)
= 2 × 0.0013 = 0.0026
since P < 0.02 (i.e. less than one in 50 chance of type I
error) we can reject H0.
In fact, the chances of incorrect rejection of H0 are less
than one in 300.
i.e. the chances of observing p (our sample proportion)
assuming H0 (π = 10%) to be true are so small that we are forced
to question this assumption about π
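The plague example can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the H6_L1P macro itself: the function names are hypothetical and the normal probabilities come from the standard library's error function instead of tables.

```python
import math


def normal_cdf(z):
    """P(Z < z) for a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def one_proportion_test(n, x, pi0):
    """Large sample test on one proportion, with the standard error computed under H0."""
    p = x / n                                   # sample proportion
    se = math.sqrt(pi0 * (1 - pi0) / n)
    z = (p - pi0) / se
    lower = normal_cdf(z)
    upper = 1 - lower
    return z, lower, upper, 2 * min(lower, upper)


z, lower, upper, two_tail = one_proportion_test(n=400, x=22, pi0=0.1)
print(round(z, 2), round(lower, 5), round(two_tail, 4))
```

The 2-tailed figure may differ from the hand calculation in the fourth decimal place, because tables round Prob(z < -3.00) to 0.0013 before doubling.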
One tailed test:
– (1)
H0: π = 10%
H1: π < 10% (lower tail test)
– (2) α = 0.02
– (3) Reject H0 iff P < α, i.e. if P < 0.02
– (4) Lower tail sig. = P = Prob(z < -3.00) = .001350
since P < 0.02 we can reject H0 knowing that the chances of
incorrect rejection of H0 are less than one in 740
» our cut-off rule for rejecting H0 was no more than a one
in 50 chance
» one in 740 is a lot less than one in 50 so we can reject
H0 with confidence.
The macro syntax for one proportion
tests is as follows:
H6_L1P n=(400) x=(22) pi=(0.1).
Which comes to the same result.
5.2 Hypothesis tests about Two
population proportions
To test the hypothesis that the population
proportions are equal:
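H0: π1 = π2
compute the z statistic:
$$z = \frac{p_1 - p_2}{SE_{Dp}}$$
where SE_Dp is the pooled standard error:
$$SE_{Dp} = \sqrt{p (1 - p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
and p is the pooled sample proportion:
$$p = \frac{x_1 + x_2}{n_1 + n_2}$$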
Example:
Two surveys of mortgage payment protection
insurance (MPPI) are carried out, one on
single parents with 1 child and one on single
parents with 3 children. Amongst the first
group, 67 out of a sample of 300 were found
to have taken out MPPI, compared with 15
out of a sample of 101 in the second group. Is
take-up significantly lower amongst the HHs
with three children?
• p1 = 67/300 = 0.2233;
• p2 = 15/101 = 0.1485;
• p = (67 + 15)/(300 + 101) = 0.2045;
– (1)
H0: π1 = π2
H1: π1 > π2
– (2) α = 0.01 (z* = 2.33)
$$SE_{Dp} = \sqrt{0.2045 (1 - 0.2045) \left( \frac{1}{300} + \frac{1}{101} \right)} = 0.0464$$
$$z = \frac{p_1 - p_2}{SE_{Dp}} = \frac{0.2233 - 0.1485}{0.0464} = 1.6125 \text{ (using unrounded values)}$$
– (3) Reject H0 if P < 0.01
– (4) P = Prob(z > 1.6125) = 0.053.
Take-up is not significantly lower amongst HHs
with 3 children at the 1% sig. level; or even at the 5%
significance level.
• i.e. we cannot say that the difference in proportions is
anything more than the effect of sampling variation.
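The MPPI example can be checked in the same way. The sketch below is not the H7_L2P macro: the function name is hypothetical, and it simply applies the pooled standard error formula to the survey counts and reads the upper tail from the normal curve.

```python
import math


def two_proportion_test(n1, n2, x1, x2):
    """Large samples test of H0: equal proportions, using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)              # pooled sample proportion
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    upper = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z > z)
    return z, se, upper


z, se, upper_sig = two_proportion_test(n1=300, n2=101, x1=67, x2=15)
print(round(z, 4), round(se, 4), round(upper_sig, 3))
```

Because the upper tail sig. exceeds both 0.01 and 0.05, the sketch leads to the same decision as the hand calculation: do not reject H0.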
H7_L2P n1=(300) n2=(101) x1=(67) x2=(15).