Transcript day11
The t Tests
Independent Samples
The t Test for Independent
Samples
• Observations in each sample are
independent of (not from the same
population as) each other.
• We want to compare differences between
sample means.
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{hyp}}{s_{\bar{X}_1 - \bar{X}_2}}$$
Sampling Distribution of the
Difference Between Means
• Imagine two sampling distributions of the
mean...
• And then subtracting one from the other…
• If you create a sampling distribution of the
difference between the means…
– Given the null hypothesis, we expect the mean of the
sampling distribution of differences, μ1 - μ2, to be 0.
– We must estimate the standard deviation of the
sampling distribution of the difference between
means.
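The idea of a sampling distribution of differences can be checked by simulation. The sketch below is only an illustration: the population mean, standard deviation, sample sizes, and number of repetitions are hypothetical values, and the function name is made up for this example. Under the null hypothesis both samples come from the same population, so the differences should center on 0.

```python
import random
import statistics

def simulate_mean_differences(n1, n2, mu, sigma, reps, seed=0):
    """Draw `reps` pairs of samples from ONE normal population and
    return the difference between the two sample means each time."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        sample1 = [rng.gauss(mu, sigma) for _ in range(n1)]
        sample2 = [rng.gauss(mu, sigma) for _ in range(n2)]
        diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return diffs

# Hypothetical population values, chosen only for illustration.
diffs = simulate_mean_differences(n1=12, n2=11, mu=8.0, sigma=3.0, reps=5000)
center = statistics.fmean(diffs)   # should be close to 0
spread = statistics.stdev(diffs)   # empirical standard error of the difference
theoretical_se = (3.0**2 / 12 + 3.0**2 / 11) ** 0.5

print(f"mean of differences: {center:.3f}")
print(f"SD of differences:   {spread:.3f} (theory: {theoretical_se:.3f})")
```

In a real study the population standard deviation is unknown, which is why the standard error has to be estimated from the samples.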
Pooled Estimate of the
Population Variance
• Using the assumption of homogeneity of
variance, both s1² and s2² are estimates of the
same population variance.
• If this is so, rather than make two separate
estimates, each based on some small sample, it
is preferable to combine the information from
both samples and make a single pooled
estimate of the population variance.
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$
Pooled Estimate of the Population
Variance
• The pooled estimate of the population variance becomes
the average of both sample variances, once adjusted for
their degrees of freedom.
– Multiplying each sample variance by its degrees of freedom
ensures that the contribution of each sample variance is
proportionate to its degrees of freedom.
– You know you have made a mistake in calculating the pooled
estimate of the variance if it does not come out between the two
estimates.
– You have also made a mistake if it does not come out closer to
the estimate from the larger sample.
• The degrees of freedom for the pooled estimate of the
variance equals the sum of the two sample sizes minus
two, or (n1 - 1) + (n2 - 1).
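The pooled estimate and the two sanity checks above can be sketched in a few lines. The function names and the numbers in the example call are hypothetical, chosen only to show the weighting by degrees of freedom.

```python
def pooled_variance(n1, var1, n2, var2):
    """Weight each sample variance by its degrees of freedom."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * var1 + df2 * var2) / (df1 + df2)

def passes_sanity_checks(n1, var1, n2, var2, pooled):
    """Check 1: pooled value lies between the two sample variances.
    Check 2: it is at least as close to the larger sample's variance."""
    between = min(var1, var2) <= pooled <= max(var1, var2)
    if n1 == n2:
        closer = True  # equal sizes: the pooled value is the simple average
    else:
        big, small = (var1, var2) if n1 > n2 else (var2, var1)
        closer = abs(pooled - big) <= abs(pooled - small)
    return between and closer

# Hypothetical example: a large sample with variance 4, a small one with 10.
sp2 = pooled_variance(n1=20, var1=4.0, n2=5, var2=10.0)
ok = passes_sanity_checks(20, 4.0, 5, 10.0, sp2)
print(f"pooled variance = {sp2:.2f}, df = {(20 - 1) + (5 - 1)}, checks pass: {ok}")
```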
Estimating Standard Error of the
Difference Between Means
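$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{hyp}}{s_{\bar{X}_1 - \bar{X}_2}}$$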
The t Test for Independent
Samples: An Example
• Stereotype Threat
“Trying to develop the test
itself.”
“This test is a measure of
your academic ability.”
The t Test for Independent
Samples: An Example
• State the research question.
– Does stereotype threat hinder the
performance of those individuals to which it is
applied?
• State the statistical hypotheses.
$H_0: \mu_1 - \mu_2 \ge 0$
$H_1: \mu_1 - \mu_2 < 0$
or
$H_0: \mu_1 \ge \mu_2$
$H_1: \mu_1 < \mu_2$
The t Test for Independent Samples:
An Example
• Set the decision rule.
$\alpha = .05$
$df = (n_1 - 1) + (n_2 - 1) = (12 - 1) + (11 - 1) = 21$
$t_{crit} = 1.721$
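The critical value normally comes from a t table. As a rough cross-check, it can also be approximated numerically. The sketch below integrates the t density with Simpson's rule and searches for the cutoff by bisection; the function names are made up for this example, and a statistics package would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(alpha, df, tails=1):
    """Positive cutoff leaving alpha / tails in the upper tail (bisection)."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

df = (12 - 1) + (11 - 1)
crit = t_critical(alpha=0.05, df=df, tails=1)
print(f"df = {df}, one-tailed critical t = {crit:.3f}")
```

Because the research hypothesis predicts a lower mean in the first group, the rejection region is the lower tail, so the observed t is compared with the negative of this value.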
The t Test for Independent Samples:
An Example
• Calculate the test statistic.
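Control   Control Sq   Threat   Threat Sq
4         16           7        49
9         81           8        64
12        144          7        49
8         64           2        4
9         81           6        36
13        169          9        81
12        144          7        49
13        169          10       100
13        169          5        25
7         49           0        0
6         36           10       100
                       8        64
Σ = 106   Σ = 1122     Σ = 79   Σ = 621

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{hyp}}{s_{\bar{X}_1 - \bar{X}_2}}$$
$\bar{X}_1 = \frac{79}{12} = 6.58$ (Threat)
$\bar{X}_2 = \frac{106}{11} = 9.64$ (Control)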
The t Test for Independent Samples:
An Example
• Calculate the test statistic.
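$$s_1^2 = \frac{12(621) - (79)^2}{12(11)} = 9.17$$
$$s_2^2 = \frac{11(1122) - (106)^2}{11(10)} = 10.05$$
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(12 - 1)9.17 + (11 - 1)10.05}{(12 - 1) + (11 - 1)} = 9.59$$
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = \sqrt{\frac{9.59}{12} + \frac{9.59}{11}} = 1.29$$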
The t Test for Independent Samples:
An Example
• Calculate the test statistic.
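$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{hyp}}{s_{\bar{X}_1 - \bar{X}_2}}$$
$\bar{X}_1 = 6.58$
$\bar{X}_2 = 9.64$
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{9.59}{12} + \frac{9.59}{11}} = 1.29$$
$$t = \frac{6.58 - 9.64}{1.29} = -2.37$$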
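Recomputing the example from the raw scores is a useful check on the hand calculation. This is a minimal sketch using only the Python standard library; the function name `independent_t` is made up for illustration. Because it carries full precision instead of rounding at each step, t comes out near -2.36 rather than the hand-calculated -2.37, which leaves the decision unchanged.

```python
import math
import statistics

threat = [7, 8, 7, 2, 6, 9, 7, 10, 5, 0, 10, 8]    # group 1, n = 12
control = [4, 9, 12, 8, 9, 13, 12, 13, 13, 7, 6]   # group 2, n = 11

def independent_t(sample1, sample2, hypothesized_diff=0.0):
    """Pooled-variance t test statistic for two independent samples.
    Returns (t, pooled variance, standard error of the difference)."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)   # sample variance, n - 1 denominator
    var2 = statistics.variance(sample2)
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / ((n1 - 1) + (n2 - 1))
    se = math.sqrt(pooled / n1 + pooled / n2)
    diff = statistics.fmean(sample1) - statistics.fmean(sample2)
    return (diff - hypothesized_diff) / se, pooled, se

t, pooled, se = independent_t(threat, control)
print(f"pooled variance = {pooled:.2f}")
print(f"standard error  = {se:.2f}")
print(f"t               = {t:.2f}")
```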
The t Test for Independent
Samples: An Example
• Decide if your result is significant.
– Reject H0, since -2.37 < -1.721
• Interpret your results.
– Stereotype threat significantly reduced
performance of those to whom it was applied.
Assumptions
1) The observations within each sample must be
independent.
2) The two populations from which the samples are
selected must be normal.
3) The two populations from which the samples are
selected must have equal variances.
– This is also known as homogeneity of variance, and there
are two methods for testing that we have equal variances:
a) informal method – simply compare sample variances
b) Levene’s test – We’ll see this on the SPSS output
4) Random Assignment
– To make causal claims
5) Random Sampling
– To make generalizations to the target population
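Both ways of checking homogeneity of variance can be sketched by hand. The block below computes the ratio of the larger to the smaller sample variance (the informal method) and the mean-centered Levene statistic W, which is referred to an F distribution with 1 and N - 2 degrees of freedom for two groups. The function names are made up for this example, and whether the mean-centered version matches a particular SPSS output exactly is an assumption.

```python
import statistics

def variance_ratio(sample1, sample2):
    """Informal check: larger sample variance divided by the smaller."""
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    return max(v1, v2) / min(v1, v2)

def levene_w(sample1, sample2):
    """Mean-centered Levene statistic: a one-way ANOVA F computed on
    absolute deviations of each score from its own group mean."""
    groups = [sample1, sample2]
    z = [[abs(x - statistics.fmean(g)) for x in g] for g in groups]
    z_means = [statistics.fmean(zi) for zi in z]
    n_total = sum(len(zi) for zi in z)
    grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    k = len(groups)
    return (n_total - k) / (k - 1) * between / within

threat = [7, 8, 7, 2, 6, 9, 7, 10, 5, 0, 10, 8]
control = [4, 9, 12, 8, 9, 13, 12, 13, 13, 7, 6]
ratio = variance_ratio(threat, control)
w = levene_w(threat, control)
print(f"variance ratio = {ratio:.2f}, Levene W = {w:.3f}")
```

A ratio close to 1 and a small W are both consistent with equal variances.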
Which test?
• Each of the following studies requires a t test for one or more
population means. Specify whether the appropriate t test is for one
sample or two independent samples.
– College students are randomly assigned to undergo either behavioral
therapy or Gestalt therapy. After 20 therapeutic sessions, each student
earns a score on a mental health questionnaire.
– One hundred college freshmen are randomly assigned to sophomore
roommates having either similar or dissimilar vocational goals. At the
end of their freshman year, the GPAs of these 100 freshmen are to be
analyzed on the basis of the previous distinction.
– According to the U.S. Department of Health and Human Services, the
average 16-year-old male can do 23 push-ups. A physical education
instructor finds that in his school district, 30 randomly selected 16-year-old males can do an average of 28 push-ups.
Handout Example
Effect Size
1) Simply report the actual results of the study.
(a) Most direct method.
(b) Can be misleading.
2) Calculate Cohen’s d or Δ (preferred).
(a) Magnitude of effect size is standardized by measuring
the mean difference between two treatments in terms
of the standard deviation.
(b) $d = (M_1 - M_2) / \sqrt{s_p^2}$
(c) Evaluate using the following criteria:
i) .20 small effect
ii) .50 medium effect
iii) > .80 large effect
Effect Size: Example
• In the study evaluating stereotype
threat, the null hypothesis was rejected,
with M1 = 6.58, M2 = 9.64, and sp² = 9.59.
• Calculate Cohen’s d, and evaluate the
magnitude of this measure (small,
medium, or large).
• Compare effect size to z table to
determine where the mean of one group
is relative to the other.
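One way to work this example is sketched below. Cohen's d divides the mean difference by the pooled standard deviation, that is, the square root of the pooled variance. Treating the criteria as cutoffs on the absolute value of d is an assumption of this sketch, as are the function names. The last step stands in for the z table lookup by using the standard normal CDF.

```python
import math
from statistics import NormalDist

def cohens_d(m1, m2, pooled_variance):
    """Mean difference in units of the pooled standard deviation."""
    return (m1 - m2) / math.sqrt(pooled_variance)

def effect_label(d):
    """Apply the .20 / .50 / .80 criteria to |d| (cutoff handling assumed)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"

d = cohens_d(6.58, 9.64, 9.59)
label = effect_label(d)
# Proportion of a normal control distribution falling below the threat mean.
percentile = NormalDist().cdf(d)

print(f"d = {d:.2f} ({label})")
print(f"threat mean sits near the {percentile:.0%} point of the control distribution")
```

The sign of d only reflects which group was labeled group 1. The magnitude, about .99, is what gets compared with the criteria.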
Type 1 Error & Type 2 Error
Scientist’s Decision              Null hypothesis is true     Null hypothesis is false
Reject null hypothesis            Type 1 Error                Correct decision
                                  probability = α             probability = 1 - β
Fail to reject null hypothesis    Correct decision            Type 2 Error
                                  probability = 1 - α         probability = β

Type 1 Error = α
Cases in which you reject null hypothesis when it is really true
Type 2 Error = β
Cases in which you fail to reject null hypothesis when it is false
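The meaning of α as a long-run Type 1 error rate can be illustrated by simulation. In the sketch below both groups are always drawn from the same population, so the null hypothesis is true and every rejection is a Type 1 error. The group size, number of repetitions, and function names are hypothetical choices; 2.074 is the two-tailed .05 critical value of t for df = 22.

```python
import math
import random
import statistics

def pooled_t(sample1, sample2):
    """Pooled-variance t statistic for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    pooled = ((n1 - 1) * statistics.variance(sample1)
              + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled / n1 + pooled / n2)
    return (statistics.fmean(sample1) - statistics.fmean(sample2)) / se

def type1_error_rate(n, t_crit, reps, seed=1):
    """Share of two-tailed rejections when the null hypothesis is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]   # same population as `a`
        if abs(pooled_t(a, b)) > t_crit:
            rejections += 1
    return rejections / reps

rate = type1_error_rate(n=12, t_crit=2.074, reps=4000)
print(f"observed Type 1 error rate: {rate:.3f} (alpha = .05)")
```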
Power and sample size estimation
• Power is the probability of correctly
rejecting a false null hypothesis.
– In the social sciences we typically aim for power of .80.
• What determines the power of a study?
– Effect size
– Sample size
– Variance
– α
– One vs. two tailed tests
If you want to know
• Sample Size
– Need to know
• α
• β
• Δ
• Power
– Need to know
• α
• N per condition
• Δ
Calculating sample size
• Remember stereotype threat example
– Δ = .99
• Say we want to perform a test with
– power (1-β) = .80
– Two tailed alpha = .05
Solving for n: 1-β = .80
Solving for n: 1-β = .80 one tailed
Solving for n: 1-β = .90
Solving for n: 1-β = .90 one tailed
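For a rough sense of the numbers behind these cases, a common normal approximation gives the sample size per group as n = 2(z_α + z_β)² / Δ². The sketch below applies it to Δ = .99. It is not the Piface calculation: exact t-based results generally ask for slightly more participants, and the function name is made up for this example.

```python
import math
from statistics import NormalDist

def n_per_group(delta, power, alpha, tails=2):
    """Normal-approximation sample size per group for a two-group design."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / tails)   # cutoff for the chosen alpha
    z_beta = z.inv_cdf(power)                # cutoff for the desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / delta ** 2)

for power in (0.80, 0.90):
    for tails in (2, 1):
        n = n_per_group(delta=0.99, power=power, alpha=0.05, tails=tails)
        print(f"power = {power:.2f}, {tails}-tailed: about {n} per group")
```

Raising the target power increases the required n, and moving from a two-tailed to a one-tailed test lowers it.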
Calculating power
• Say we did the same study with n of 5 in
each condition (N =10)
• We want to know how much power we
have to find d or Δ = .99.
• Again we are using a two tailed test with α
= .05.
Using Piface
What if I did a one tailed test?
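The same normal approximation can be turned around to estimate power for a fixed n. This sketch is not the Piface calculation: it ignores the far rejection tail and uses z instead of t, so with only 5 per group it tends to overstate power somewhat. The function name is made up for this example.

```python
import math
from statistics import NormalDist

def approx_power(delta, n_per_group, alpha, tails=2):
    """Normal-approximation power for a two-group comparison of means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / tails)
    shift = abs(delta) * math.sqrt(n_per_group / 2)   # expected size of the test statistic
    return z.cdf(shift - z_crit)

two_tailed = approx_power(delta=0.99, n_per_group=5, alpha=0.05, tails=2)
one_tailed = approx_power(delta=0.99, n_per_group=5, alpha=0.05, tails=1)
print(f"two-tailed power: about {two_tailed:.2f}")
print(f"one-tailed power: about {one_tailed:.2f}")
```

Either way the study falls well short of the .80 target, and the one-tailed test has more power than the two-tailed test.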
SPSS: Homework Hint
• For the two-sample t tests you will need to
create two variables, cond (X) and score (Y).
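The layout this hint describes has one row per participant: a grouping variable and a score. A small sketch using the stereotype threat scores is shown below. Coding the conditions as 1 = threat and 2 = control is an assumption made only for illustration.

```python
threat = [7, 8, 7, 2, 6, 9, 7, 10, 5, 0, 10, 8]
control = [4, 9, 12, 8, 9, 13, 12, 13, 13, 7, 6]

# One (cond, score) row per participant; the 1/2 coding is hypothetical.
rows = [(1, score) for score in threat] + [(2, score) for score in control]

print("cond  score")
for cond, score in rows:
    print(f"{cond:<5} {score}")
```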