PowerPoint Transcript

PSY 307 – Statistics for the
Behavioral Sciences
Chapter 14 – t-Test for Two
Independent Samples
Independent Samples



Observations in one sample are not
paired on a one-to-one basis with
observations in the other sample.
Effect – any difference between two
population means.
Hypotheses:


Null:        H0: μ1 – μ2 = 0 (two-tailed) or H0: μ1 – μ2 ≤ 0 (one-tailed)
Alternative: H1: μ1 – μ2 ≠ 0 (two-tailed) or H1: μ1 – μ2 > 0 (one-tailed)
The Difference Between Two
Sample Means
Effect size: X̄1 – X̄2, the difference between the two sample means.
The null hypothesis (H0) is that these two means come from
underlying populations with the same mean μ (so the
difference between them is 0 and μ1 – μ2 = 0).
Sampling Distribution of
Differences in Sample Means
[Figure: the distribution of all possible X̄1 – X̄2 difference scores that could occur by chance, centered at μ1 – μ2, with a critical value marking each tail; the observed X̄1 – X̄2 falls beyond a critical value.]
Does our X̄1 – X̄2 exceed the critical value? YES – reject the null (H0).
What if the Difference is Smaller?
[Figure: the same distribution of chance X̄1 – X̄2 difference scores centered at μ1 – μ2, but the observed X̄1 – X̄2 now falls inside the critical values.]
Does our X̄1 – X̄2 exceed the critical value? NO – retain the null (H0).
Distribution of the Differences



In a one-sample case, the mean of
the sampling distribution is the
population mean.
In a two-sample case, the mean of
the sampling distribution is the
difference between the two
population means.
The standard deviation of the
difference scores is the standard
error of this distribution.
Formulas for t-test (independent)
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{hyp}}{s_{\bar{X}_1 - \bar{X}_2}}$$

Estimated standard error:
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

Pooled variance:
$$s_p^2 = \frac{SS_1 + SS_2}{df} = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$

Sums of squares:
$$SS_1 = \sum X_1^2 - \frac{(\sum X_1)^2}{n_1} \qquad SS_2 = \sum X_2^2 - \frac{(\sum X_2)^2}{n_2}$$
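Below is a minimal Python sketch of these formulas. The two groups of scores (group1, group2) and their values are made up purely for illustration; only the formulas themselves come from the slide.

```python
import math

# Hypothetical data; scores and variable names are illustrative only.
group1 = [3, 5, 4, 6, 7, 5]
group2 = [2, 4, 3, 3, 5, 4]

n1, n2 = len(group1), len(group2)

# Sums of squares: SS = sum(X^2) - (sum(X))^2 / n
ss1 = sum(x**2 for x in group1) - sum(group1)**2 / n1
ss2 = sum(x**2 for x in group2) - sum(group2)**2 / n2

# Pooled variance: SS1 + SS2 divided by the combined df
df = n1 + n2 - 2
sp2 = (ss1 + ss2) / df

# Estimated standard error of the difference between means
se = math.sqrt(sp2 / n1 + sp2 / n2)

# t ratio; the hypothesized difference (mu1 - mu2)_hyp is 0 under H0
mean_diff = sum(group1) / n1 - sum(group2) / n2
t = (mean_diff - 0) / se
print(f"t({df}) = {t:.3f}")
```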
Estimated Standard Error



Pooled variance – the variance
common to both populations is
estimated by combining the
variances.
The pooled estimate is a weighted average: each group's
variance is weighted by its degrees of freedom (df), and
the sum is divided by the combined df.
df for pooled variance: n1 + n2 – 2
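If SciPy is available, the pooled-variance result can be cross-checked with scipy.stats.ttest_ind, whose default setting (equal_var=True) uses exactly this pooled estimate. The data below are the same hypothetical scores used in the earlier sketch.

```python
from scipy import stats

# Hypothetical data reused from the sketch above.
group1 = [3, 5, 4, 6, 7, 5]
group2 = [2, 4, 3, 3, 5, 4]

# equal_var=True performs the pooled-variance independent-samples t-test
t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```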
Confidence Intervals for t

The confidence interval for two
independent samples is:
$$(\bar{X}_1 - \bar{X}_2) \pm (t_{conf})(s_{\bar{X}_1 - \bar{X}_2})$$


Find the appropriate value of t in the t table using
df = n1 + n2 – 2.
The true difference in population means will lie between
the upper and lower limits the stated percentage of the
time (e.g., 95% of the time for a 95% confidence interval).
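A short sketch of the interval computation, reusing the hypothetical scores from the earlier example and taking t_conf from scipy.stats.t.ppf for a 95% interval (the choice of 95% is an assumption for illustration):

```python
import math
from scipy import stats

# Hypothetical data; values are illustrative only.
group1 = [3, 5, 4, 6, 7, 5]
group2 = [2, 4, 3, 3, 5, 4]
n1, n2 = len(group1), len(group2)
df = n1 + n2 - 2

# Pooled variance and estimated standard error (as in the earlier sketch)
ss1 = sum(x**2 for x in group1) - sum(group1)**2 / n1
ss2 = sum(x**2 for x in group2) - sum(group2)**2 / n2
sp2 = (ss1 + ss2) / df
se = math.sqrt(sp2 / n1 + sp2 / n2)

mean_diff = sum(group1) / n1 - sum(group2) / n2

# t_conf for a 95% confidence interval (two-tailed)
t_conf = stats.t.ppf(0.975, df)
lower = mean_diff - t_conf * se
upper = mean_diff + t_conf * se
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```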
Assumptions




Both populations are normally
distributed with equal variance.
With equal sample sizes greater than 10, valid results
will occur even with non-normal populations.
Equate sample sizes to minimize the effects of unequal
variances.
Increase sample size to minimize the effects of
non-normality.
Population Correlation Coefficient



Two correlated variables are similar
to a matched sample because in
both cases, observations are paired.
A population correlation coefficient
(ρ) would represent the mean of the r's
for all possible pairs of samples.
Hypotheses:

H0: ρ = 0
H1: ρ ≠ 0
t-Test for Rho (ρ)


Similar to a t-test for a single
group.
Tests whether the value of r is
significantly different from what
might occur by chance.

Do the two variables vary together by
accident or due to an underlying
relationship?
Formula for t
$$t = \frac{r - \rho_{hyp}}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

where the denominator, $\sqrt{(1 - r^2)/(n - 2)}$, is the standard error of prediction.
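As a rough illustration of this ratio, here is a small Python sketch with assumed values r = .45 and n = 30 (both invented, not from the slides):

```python
import math

# Hypothetical values: a sample correlation of r = .45 from n = 30 pairs.
r = 0.45
n = 30
rho_hyp = 0.0  # hypothesized population correlation under H0

# Denominator: sqrt((1 - r^2) / (n - 2))
se_r = math.sqrt((1 - r**2) / (n - 2))
t = (r - rho_hyp) / se_r
print(f"t({n - 2}) = {t:.3f}")
```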
Calculating t for Correlated Variables



The formula for calculating the t
statistic is the same, except that r is
used in place of X̄.
The standard error of prediction is
used in the denominator.
Compare against the critical value
for t with df = n – 2 (where n is the
number of pairs).
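For a cross-check on actual paired data, scipy.stats.pearsonr returns the sample r together with the two-tailed p-value for H0: ρ = 0. The x and y values below are invented for illustration.

```python
from scipy import stats

# Hypothetical paired observations; values are illustrative only.
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [1, 3, 6, 6, 9, 9, 12, 14]

# pearsonr returns the sample r and the two-tailed p-value for H0: rho = 0
r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p_value:.3f}")
```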
Importance of Sample Size

Lower values of r become significant
with greater sample sizes:


As n increases, the critical value of t
decreases and, for a given r, the observed t
grows (since df = n – 2 increases), so it is
easier to obtain a significant result.
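One way to see this is to solve t = r√(n − 2)/√(1 − r²) for the smallest |r| that reaches the two-tailed critical value at α = .05. The sketch below (assuming SciPy is available; the sample sizes chosen are arbitrary) prints that minimum r for a few values of n.

```python
from math import sqrt
from scipy import stats

# For each n, the smallest |r| that is significant at alpha = .05
# (two-tailed) follows from t = r * sqrt(n - 2) / sqrt(1 - r^2):
#   r_min = t_crit / sqrt(t_crit^2 + n - 2)
for n in (10, 30, 100, 1000):
    df = n - 2
    t_crit = stats.t.ppf(0.975, df)
    r_min = t_crit / sqrt(t_crit**2 + df)
    print(f"n = {n:5d}: |r| must exceed {r_min:.3f}")
```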
Cohen’s rule of thumb



.10 = weak relationship
.30 = moderate relationship
.50 = strong relationship