Statistics for the Behavioral
and Social Sciences:
A Brief Course
Fifth Edition
Arthur Aron, Elaine N. Aron, Elliot Coups
Prepared by:
Genna Hymowitz
Stony Brook University
This multimedia product and its contents are protected under copyright law.
The following are prohibited by law:
-any public performance or display, including transmission of any image over a network;
-preparation of any derivative work, including the extraction, in whole or in part, of any images;
-any rental, lease, or lending of the program.
Copyright © 2011 by Pearson Education, Inc. All rights reserved
Introduction to the t Test
Chapter 8
Chapter Outline
• The t Test for a Single Sample
• The t Test for Dependent Means
• Assumptions of the t Test for a Single Sample and t
Test for Dependent Means
• Effect Size and Power for the t Test for Dependent
Means
• Single-Sample t Tests and Dependent Means t Tests
in Research Articles
t Tests
• Hypothesis-testing procedure in which the
population variance is unknown
– compares t scores from a sample to a comparison
distribution called a t distribution
• t Test for a single sample
– hypothesis-testing procedure in which a sample mean
is being compared to a known population mean but the
population variance is unknown
– Works basically the same way as a Z test, but:
• because the population variance is unknown, with a t test
you have to estimate the population variance
• With an estimated variance, the shape of the distribution
is not normal, so a special table is used to find cutoff
scores.
Basic Principle of the t Test:
Estimating the Population Variance
from the Sample Scores
• You can estimate the variance of the population of individuals
from the scores of people in your sample.
– The variance of the scores in your sample will, on average, be slightly
smaller than the variance of the scores in the population.
• Using the variance of the sample to estimate the variance of the
population produces a biased estimate.
• Unbiased Estimate
– estimate of the population variance based on sample scores, which
has been corrected so that it is equally likely to overestimate or
underestimate the true population variance
• The bias is corrected by dividing the sum of squared deviations by the
sample size minus 1.
– $S^2 = \frac{\sum (X - M)^2}{N - 1}$
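The difference between dividing by N and dividing by N – 1 can be sketched in a few lines of Python. This is only an illustration: the function names and the five scores are made up for the example and do not come from the slides.

```python
def sample_variance(scores):
    """Biased estimate: sum of squared deviations divided by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def estimated_population_variance(scores):
    """Unbiased estimate S^2: sum of squared deviations divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


scores = [8, 6, 4, 9, 3]  # hypothetical scores, mean = 6, sum of squares = 26
print(sample_variance(scores))                # 26 / 5
print(estimated_population_variance(scores))  # 26 / 4
```

Dividing by the smaller number (N – 1) always gives the larger result, which is what offsets the sample variance's tendency to run small.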
Degrees of Freedom (df)
• The number by which you divide to get the estimated
population variance
• Number of scores free to vary when estimating a
population parameter
– If you know the mean of the sample and all but one of the
scores in the sample, you can figure out the score you don’t
know.
• Once you know the mean, one of the scores in the sample is
not free to take any value, so the degrees of freedom
= N – 1.
The Standard Deviation of the
Distribution of Means
• After finding the estimated population variance, you can
calculate the standard deviation of the comparison
distribution.
– The variance of a distribution of means is the variance of the
population of individuals divided by the sample size.
– The standard deviation of the distribution of means based on an
estimated population variance is the square root of the variance of
the distribution of means based on an estimated population
variance.
– S is used instead of Population SD when the population variance is
estimated.
$S^2_M = S^2 / N$
$S_M = \sqrt{S^2_M}$
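As a small sketch of these two formulas, the Python below computes the variance and the standard deviation of the distribution of means from an estimated population variance. The function names and the values 36 and 9 are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def variance_of_distribution_of_means(s2, n):
    """S^2_M = S^2 / N."""
    return s2 / n


def sd_of_distribution_of_means(s2, n):
    """S_M = square root of S^2_M."""
    return math.sqrt(variance_of_distribution_of_means(s2, n))


s2 = 36  # hypothetical estimated population variance
n = 9    # hypothetical sample size
print(variance_of_distribution_of_means(s2, n))  # 36 / 9
print(sd_of_distribution_of_means(s2, n))        # square root of 4
```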
The t Distribution
• When the population variance is estimated, you have less
true information and more room for error.
– The shape of the comparison distribution will not be a normal
curve; it will be a t distribution.
• t distributions look like the normal curve—they are bell shaped,
unimodal, and symmetrical—but there are more extreme
scores in t distributions.
– Their tails are higher.
• There are many t distributions, the shapes of which vary
according to the degrees of freedom used to calculate the
distribution.
– There is only one t distribution for any particular degrees of
freedom.
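As a rough numerical check on the claim that t distributions have higher tails than the normal curve, the sketch below compares curve heights at a score of 3. The density formula for the t distribution is the standard one and is not given in these slides; the function names are illustrative only.

```python
import math


def t_density(x, df):
    """Height of the t distribution curve at x for the given degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_density(x):
    """Height of the standard normal curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


for df in (3, 30):
    print("df =", df, "height at 3:", round(t_density(3, df), 4))
print("normal height at 3:", round(normal_density(3), 4))
```

With few degrees of freedom the tail is clearly higher than the normal curve, and the height moves toward the normal value as the degrees of freedom increase.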
Using the t Table
• There is a different t distribution for any particular degrees
of freedom.
• The t table is a table of cutoff scores on the t distribution
for various degrees of freedom, significance levels, and
one- and two-tailed tests.
• The t table only shows positive scores.
• A portion of a t table might look like this:
        One-Tailed Tests           Two-Tailed Tests
df     .10     .05     .01       .10      .05      .01
1    3.078   6.314  31.821     6.314   12.706   63.657
2    1.886   2.920   6.965     2.920    4.303    9.925
3    1.638   2.353   4.541     2.353    3.182    5.841
When Using a t Table…
• Determine whether you have a one- or a two-tailed test.
• If you are using a one-tailed test, decide whether your cutoff
score is a positive or a negative t score.
– If your one-tailed test is testing whether the mean of Population 1 is
greater than the mean of Population 2, the cutoff t score is positive.
– If the one-tailed test is testing whether the mean of Population 1 is
less than the mean of Population 2, the cutoff t score is negative.
• Decide which significance level you will use.
• Find the column labeled with the significance level you are
using.
• Go down to the row for the appropriate degrees of freedom.
• If your study has degrees of freedom between two of the higher
values on the table, you should use the degree of freedom that
is nearest to yours and less than yours.
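The lookup procedure above can be sketched as a small Python function. The table here is deliberately partial: it holds only cutoffs that appear in these slides (df 1 to 3, plus the one-tailed .05 cutoffs for df 9 and df 15 used in the worked examples). The function name and the `lower_tail` parameter are assumptions made for this illustration.

```python
# Partial t table: only cutoffs that appear in these slides.
ONE_TAILED = {
    0.10: {1: 3.078, 2: 1.886, 3: 1.638},
    0.05: {1: 6.314, 2: 2.920, 3: 2.353, 9: 1.833, 15: 1.753},
    0.01: {1: 31.821, 2: 6.965, 3: 4.541},
}
TWO_TAILED = {
    0.10: {1: 6.314, 2: 2.920, 3: 2.353},
    0.05: {1: 12.706, 2: 4.303, 3: 3.182},
    0.01: {1: 63.657, 2: 9.925, 3: 5.841},
}


def t_cutoff(df, alpha, tails=1, lower_tail=False):
    """Look up a cutoff, falling back to the nearest tabled df below the one requested."""
    table = ONE_TAILED if tails == 1 else TWO_TAILED
    column = table[alpha]
    usable = [d for d in column if d <= df]
    if not usable:
        raise ValueError("no tabled df at or below the requested df")
    cutoff = column[max(usable)]
    # A one-tailed test predicting a lower mean uses a negative cutoff.
    return -cutoff if (tails == 1 and lower_tail) else cutoff


print(t_cutoff(15, 0.05))                  # one-tailed, df = 15
print(t_cutoff(2, 0.05, tails=2))          # two-tailed, df = 2
print(t_cutoff(9, 0.05, lower_tail=True))  # one-tailed, predicting a lower mean
```

Because the table is partial, a request such as df = 12 falls back to the df = 9 row. A full table would have a closer row, but the fallback direction (toward fewer degrees of freedom) is the one the slide describes.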
The t Score
• The sample’s mean score on the comparison distribution
• It is calculated in the same way as a Z score, but it is used when
the variance of the comparison distribution is estimated.
• It is the sample’s mean minus the population mean divided by
the standard deviation of the distribution of means.
$t = \frac{M - \text{Population } M}{S_M}$
• If your sample’s mean was 35, the population mean was 46, and
the estimated standard deviation of the distribution of means was 5,
then the t score would be (35 – 46) / 5 = –2.2.
• This sample’s mean is 2.2 standard deviations below the
mean of the comparison distribution.
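The same arithmetic as a minimal Python sketch, using the numbers from the slide (the function name is illustrative):

```python
def t_score(sample_mean, population_mean, s_m):
    """t = (M - Population M) / S_M."""
    return (sample_mean - population_mean) / s_m


# Sample mean 35, population mean 46, S_M of 5, as in the slide's example.
print(t_score(35, 46, 5))  # -2.2
```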
Deciding Whether to Reject
the Null Hypothesis
• This is exactly the same as for the other
hypothesis-testing procedures discussed
in earlier chapters.
– You will compare the t score for your sample
to the cutoff score found using the t table to
decide whether to reject the null hypothesis.
Hypothesis Testing When the
Population Variance Is Unknown
• Restate the question as a research hypothesis and a null hypothesis about the
populations.
• Determine the characteristics of the comparison distribution.
– population mean
• This is the same as the known population mean.
– population variance
• Figure the estimated population variance.
– $S^2 = \frac{\sum (X - M)^2}{df}$
• Figure the variance of the distribution of means.
– $S^2_M = S^2 / N$
– standard deviation of the distribution of means
• Figure the standard deviation of the distribution of means.
– $S_M = \sqrt{S^2_M}$
– shape of the comparison distribution
• t distribution with N – 1 degrees of freedom
• Determine the significance cutoff.
– Decide the significance level and whether to use a one- or a two-tailed test.
– Look up the appropriate cutoff in a t table.
• Determine your sample’s score on the comparison distribution.
– $t = (M - \text{Population } M) / S_M$
• Decide whether to reject the null hypothesis.
– Compare the t score of your sample and the cutoff score from the t table.
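The computational steps above can be sketched as one Python function that starts from raw scores. The function names, the four sample scores, and the population mean of 5 are all hypothetical; the cutoff of 2.353 is the one-tailed .05 value for 3 degrees of freedom from the t table shown earlier.

```python
import math


def single_sample_t(scores, population_mean):
    """Return (t, df) for a t test for a single sample."""
    n = len(scores)
    df = n - 1
    m = sum(scores) / n
    s2 = sum((x - m) ** 2 for x in scores) / df  # estimated population variance
    s2_m = s2 / n                                # variance of the distribution of means
    s_m = math.sqrt(s2_m)                        # SD of the distribution of means
    return (m - population_mean) / s_m, df


def more_extreme(t, cutoff, two_tailed=False):
    """True when t is more extreme than the cutoff in the tested direction."""
    if two_tailed:
        return abs(t) > abs(cutoff)
    return t > cutoff if cutoff > 0 else t < cutoff


scores = [6, 8, 10, 12]  # hypothetical sample
t, df = single_sample_t(scores, population_mean=5)
print(round(t, 2), df)
print(more_extreme(t, 2.353))  # one-tailed test predicting a higher mean
```

The decision step is kept separate from the calculation so the same t score can be compared with different cutoffs.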
Example of Hypothesis Testing
When the Population Variance Is
Unknown: Step 1
• A survey at your university showed that students at your
school study an average of 17 hours a week.
• You surveyed 16 students in your dorm and found that
they study an average of 21 hours per week.
• Restate the question as a research hypothesis and a
null hypothesis about the populations.
– Population 1: the kind of students who live in your dormitory
– Population 2: the kind of students in general at your university
– The research hypothesis is that Population 1 students study more
than Population 2 students.
– The null hypothesis is that Population 1 students do not study more
than Population 2 students.
Example of Hypothesis Testing
When the Population Variance Is
Unknown: Step 2
• Determine the characteristics of the comparison
distribution.
– population mean = 17
• This is the same as the known population mean.
– population variance
• Figure the estimated population variance.
– $S^2 = \frac{\sum (X - M)^2}{df} = \frac{694}{16 - 1} = \frac{694}{15} = 46.27$
• Figure the variance of the distribution of means.
– $S^2_M = S^2 / N = 46.27 / 16 = 2.89$
– standard deviation of the distribution of means
• Figure the standard deviation of the distribution of means.
– $S_M = \sqrt{S^2_M} = \sqrt{2.89} = 1.70$
– shape of the comparison distribution
• t distribution with N – 1 = 15 degrees of freedom
Example of Hypothesis Testing
When the Population Variance Is
Unknown: Step 3
• Determine the significance cutoff.
– Decide the significance level and
whether to use a one- or a two-tailed
test.
– Look up the appropriate cutoff in a t
table.
– In this example, with 15 degrees of
freedom, a significance level of .05, and
a one-tailed test, the t table gives a
cutoff t score of 1.753.
Example of Hypothesis Testing
When the Population Variance Is
Unknown: Step 4
• Determine your sample’s score on
the comparison distribution.
– $t = (M - \text{Population } M) / S_M$
– $t = (21 - 17) / 1.70 = 4 / 1.70 = 2.35$
Example of Hypothesis Testing
When the Population Variance Is
Unknown: Step 5
• Decide whether to reject the null
hypothesis.
– The cutoff score is 1.753.
– Your sample’s score is 2.35.
– Because 2.35 is more extreme than 1.753, you can reject the null hypothesis.
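The arithmetic in this worked example can be rechecked with a short Python sketch. All numbers come from the study-hours example on these slides; only the variable names are invented here.

```python
import math

# Values taken from the study-hours example.
sum_squared_deviations = 694
n = 16
sample_mean = 21
population_mean = 17
cutoff = 1.753  # one-tailed, .05 level, df = 15

df = n - 1
s2 = sum_squared_deviations / df  # estimated population variance
s2_m = s2 / n                     # variance of the distribution of means
s_m = math.sqrt(s2_m)             # standard deviation of the distribution of means
t = (sample_mean - population_mean) / s_m

print(round(s2, 2), round(s2_m, 2), round(s_m, 2), round(t, 2))  # 46.27 2.89 1.7 2.35
print("reject null:", t > cutoff)                                # reject null: True
```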
How Are You Doing?
• How does a sample’s variance differ from the
population’s?
• How do we adjust for bias when estimating
the population variance?
• What does N – 1 represent?
• What is a t distribution?
• What is a t score?
• How is a t score calculated?
The t Test for Dependent
Means
• It is common when conducting research to have two sets
of scores and not to know the mean of the population.
• Repeated Measures Design
– research design in which each person is tested more than
once
– For this type of design, a t test for dependent means is used.
• The means for each group of scores are from the same people
and are dependent on each other.
• A t test for dependent means is calculated the same way as a t
test for a single sample; however:
– Difference scores are used.
– You assume that the mean of the comparison population (Population 2) is 0.
Difference Scores
• For each person, you subtract one score from
the other.
• If the difference compares before versus
after, difference scores are also called
change scores.
• Once you have the difference score for each
person in the study, you do the rest of the
hypothesis testing with difference scores.
– You treat the study as if there were a single
sample of scores.
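Turning paired scores into difference scores is a one-line operation. The sketch below assumes hypothetical before and after scores for three people; the function name and the choice to subtract "before" from "after" are illustrative.

```python
def difference_scores(after, before):
    """One difference (change) score per person: after minus before."""
    if len(after) != len(before):
        raise ValueError("each person needs both a before and an after score")
    return [a - b for a, b in zip(after, before)]


before = [10, 12, 9]  # hypothetical scores
after = [13, 12, 8]
print(difference_scores(after, before))  # [3, 0, -1]
```

Whichever direction of subtraction you pick, use it for every person, because it determines the sign of the resulting t score.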
Population of Difference
Scores with a Mean of 0
• Null hypothesis in a repeated-measures
design
– On average, there is no difference between
the two groups of scores.
• When working with difference scores, you
compare the population of difference scores
from which your sample of difference scores
comes (Population 1) to a population of
difference scores (Population 2) with a mean of
0.
Steps for a t Test for
Dependent Means
• Restate the question as a research hypothesis and a null hypothesis about the populations.
• Determine the characteristics of the comparison distribution.
– Make each person’s two scores into a difference score.
• Do all of the remaining steps using these difference scores.
– Figure the mean of the difference scores.
– Assume the mean of the distribution of means of difference scores = 0.
– Find the standard deviation of the distribution of means of difference scores.
• Figure the estimated population variance of difference scores.
– $S^2 = \frac{\sum (X - M)^2}{df}$
• Figure the variance of the distribution of means of difference scores.
– $S^2_M = S^2 / N$
• Figure the standard deviation of the distribution of means of difference scores.
– $S_M = \sqrt{S^2_M}$
– The shape is a t distribution with N – 1 degrees of freedom.
• Determine the cutoff sample score on the comparison distribution at which the null
hypothesis should be rejected.
– Decide the significance level and whether to use a one- or a two-tailed test.
– Look up the appropriate cutoff in a t table.
• Determine the sample’s score on the comparison distribution.
– $t = (M - \text{Population } M) / S_M$
• Decide whether to reject the null hypothesis.
– Compare the t score for your sample to the cutoff score found using the t table.
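A compact Python sketch of the computational steps above, starting from each participant's two scores. The function name and the four pairs of scores are hypothetical; the comparison mean of 0 follows the slides.

```python
import math


def dependent_means_t(scores_1, scores_2):
    """Return (t, df) for a t test for dependent means (score 1 minus score 2)."""
    if len(scores_1) != len(scores_2):
        raise ValueError("each participant needs two scores")
    diffs = [a - b for a, b in zip(scores_1, scores_2)]
    n = len(diffs)
    df = n - 1
    m = sum(diffs) / n                          # mean of the difference scores
    s2 = sum((d - m) ** 2 for d in diffs) / df  # estimated population variance
    s_m = math.sqrt(s2 / n)                     # SD of the distribution of means
    t = (m - 0) / s_m                           # comparison mean is assumed to be 0
    return t, df


after = [13, 14, 10, 15]   # hypothetical scores
before = [10, 12, 9, 11]
t, df = dependent_means_t(after, before)
print(round(t, 2), df)
```

The t score would then be compared with the cutoff for df = N – 1 from a t table.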
Example of a t Test for
Dependent Means: Step 1
• Use the brain activation example from the text (Aron,
Fisher, Mashek, Strong, & Brown, 2005).
• Restate the question as a research hypothesis and a null
hypothesis about the populations.
– Population 1: individuals like those tested in this study
– Population 2: individuals whose brain activation in the caudate area
of interest is the same whether looking at a picture of their beloved
or a picture of a familiar, neutral person
– Research hypothesis: Population 1’s mean difference score (brain
activation when viewing the beloved’s picture minus brain activation
when viewing a neutral person’s picture) is greater than Population 2’s
mean difference score.
– Null hypothesis: Population 1’s mean difference score is not
greater than Population 2’s mean difference score.
Example of a t Test for
Dependent Means: Step 2
• Determine the characteristics of the comparison distribution.
– Make each person’s two scores into a difference score.
• difference = brain activation for beloved – brain activation for control
• Do all of the remaining steps using these difference scores.
– Figure the mean of the difference scores.
• Sum of the difference scores = 12
• Number of difference scores = 10
• M = 12 / 10 = 1.200
– Assume the mean of the distribution of means of difference scores = 0.
• Population M = 0
– Find the standard deviation of the distribution of means of difference scores.
• Figure the estimated population variance of difference scores.
– $S^2 = \frac{\sum (X - M)^2}{df} = \frac{3.940}{10 - 1} = .438$
• Figure the variance of the distribution of means of difference scores.
– $S^2_M = S^2 / N = .438 / 10 = .044$
• Figure the standard deviation of the distribution of means of difference scores.
– $S_M = \sqrt{S^2_M} = \sqrt{.044} = .210$
– The shape is a t distribution with 9 degrees of freedom.
Example of a t Test for
Dependent Means: Step 3
• Determine the cutoff sample score on the comparison
distribution at which the null hypothesis should be
rejected.
– You chose to use a .05 significance level.
– You will use a one-tailed test because your hypothesis is
directional.
– You have 9 degrees of freedom.
– Look up the appropriate cutoff in a t table.
– The cutoff t score is 1.833.
Example of a t Test for
Dependent Means: Step 4
• Determine the sample’s score on the
comparison distribution.
– $t = (M - \text{Population } M) / S_M$
– $t = (1.200 - 0) / .210 = 5.71$
Example of a t Test for
Dependent Means: Step 5
• Decide whether to reject the null hypothesis.
– Compare the t score for your sample to the cutoff
score found using the t table.
• The sample’s score of 5.71 is more extreme than the
cutoff score of 1.833.
• You can reject the null hypothesis.
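The arithmetic of this example can be rechecked in Python from the summary values given on the slides (sum of difference scores 12, 10 difference scores, sum of squared deviations 3.940, cutoff 1.833). The variable names are invented for this sketch.

```python
import math

# Summary values from the brain activation example.
sum_of_differences = 12
n = 10
sum_squared_deviations = 3.940
population_mean = 0  # assumed mean of Population 2's difference scores
cutoff = 1.833       # one-tailed, .05 level, df = 9

df = n - 1
m = sum_of_differences / n
s2 = sum_squared_deviations / df
s2_m = s2 / n
s_m = math.sqrt(s2_m)
t = (m - population_mean) / s_m

print(round(m, 3), round(s2, 3), round(s2_m, 3), round(s_m, 3), round(t, 2))
print("reject null:", t > cutoff)
```

Carrying full precision through every step gives a t score of about 5.74 rather than the 5.71 shown on the slide, because the slide rounds each intermediate value before the next step. The conclusion is the same either way.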
Review of the Z test, t Test for a Single Sample,
and t Test for Dependent Means
• Z Test
– Population variance is known.
– Population mean is known.
– There is 1 score for each participant.
– The comparison distribution is a Z distribution.
– Formula: $Z = (M - \text{Population } M) / \text{Population } SD_M$
– The best estimate of the population mean is the sample mean.
• t Test for a Single Sample
– Population variance is not known.
– Population mean is known.
– There is 1 score for each participant.
– The comparison distribution is a t distribution.
– df = N – 1
– Formula: $t = (M - \text{Population } M) / S_M$
• t Test for Dependent Means
– Population variance is not known.
– Population mean is not known.
– There are 2 scores for each participant.
– The comparison distribution is a t distribution.
– df = N – 1
– Formula: $t = (M - \text{Population } M) / S_M$
How Are You Doing?
• What is an example of a research study
for which you would need a t test for
dependent means?
• What is a difference score?
Assumptions of the t Test for a Single
Sample and t Test for Dependent
Means
• Assumption
– a condition required for carrying out a particular hypothesis-testing
procedure
– It is part of the mathematical foundation for the accuracy of the tables used
in determining cutoff values.
• A normal population distribution is an assumption of the t test.
– It is a requirement within the logic and mathematics for a t test.
– Strictly, it must be met for the t test to be exactly accurate, although in
practice the results are fairly accurate even when the population is not normal.
Effect Size for the t Test for
Dependent Means
• Mean of the difference scores divided
by the estimated standard deviation of
the population of difference scores
estimated effect size = M/S
M = mean of the difference scores
S = estimated standard deviation of the
population of individual difference scores
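As a sketch of this formula, the Python below applies it to the brain activation example from earlier in these slides, where the mean of the difference scores is 1.200 and the estimated population variance of difference scores is .438. The function name is illustrative; S is taken as the square root of the estimated variance.

```python
import math


def effect_size_dependent(mean_difference, s2):
    """Estimated effect size = M / S, where S is the square root of S^2."""
    return mean_difference / math.sqrt(s2)


d = effect_size_dependent(1.200, 0.438)
print(round(d, 2))  # 1.81
```

Note that the divisor is S, the estimated standard deviation of individual difference scores, not the standard deviation of the distribution of means used for the t score.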
Power
• Power for a t test of dependent means can be
calculated using a power software program, a power
calculator, or a power table.
• Table 8-9 in your textbook shows an example of a
power table for a .05 significance level.
• To use a power table:
– Decide whether you need a one- or a two-tailed test.
– Determine from previous research what effect size (small,
medium, or large) you might expect from your study.
– Determine what sample size you plan to have.
– Look up what level of power you can expect given the
planned sample size, the expected effect size, and whether
you will use a one- or a two-tailed test.
Planning a Sample Size
• A power table can be used to see how many
participants you would need to have enough
power.
– Many studies use 80% as the power needed to
make the study worth conducting.
– To use a power table to determine the number of
participants needed in a sample:
• Decide whether you need a one- or a two-tailed test.
• Determine the expected effect size.
• Determine the level of power you want to achieve
(usually .80).
• Use this information to guide you to the appropriate
columns and rows on the power table.
The Power of Studies Using a
t Test for Dependent Means
• Studies using a repeated-measures design
(using difference scores) often have much
larger effect sizes than studies using other
research designs.
– There is more power with this type of study than if
the participants were divided into groups and each
group was tested under only one condition of the
study.
• The higher power is due to the smaller standard deviation
that occurs in this type of study.
– The variation is smaller because you are comparing
participants to themselves.
t Tests in Research Articles
• Results from t tests are generally reported in the
following format:
– t (df) = x.xx, p < .05
• x.xx represents the t score.
• Commonly, the significance level will be set at p < .05, but it is
also often set at p < .01.
• Research articles use the t test for dependent means more
commonly than the t test for a single sample.
– It is rare to see a study that uses a t test for a single
sample.
• Often a t test for dependent means will be given in
the text, but sometimes results are reported in a table
format.
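The reporting pattern can be sketched as a small Python helper. The function name is hypothetical, and the helper only formats the string: it does not check whether the result is actually significant at the stated level, which remains the researcher's job.

```python
def report_t(t, df, alpha=0.05):
    """Format a result in the 't (df) = x.xx, p < .05' pattern."""
    alpha_text = f"{alpha:.2f}".lstrip("0")  # drop the leading zero: 0.05 -> .05
    return f"t ({df}) = {t:.2f}, p < {alpha_text}"


print(report_t(5.71, 9))  # t (9) = 5.71, p < .05
```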
Key Points
• When you have to estimate the population variance from scores in a sample, you use a
formula that divides the sum of squared deviation scores by the degrees of freedom.
• With an estimated population variance, the comparison distribution is a t distribution; it is
close to normal, but its shape varies with the associated degrees of freedom.
• A t score is the number of standard deviations a sample’s mean is from the mean of the
comparison distribution; it is used when the population variance is estimated.
• A t test for a single sample is used when the population mean is known but the population
variance is unknown.
• A researcher would use a t test for dependent means when there are two scores for
each participant. In this case you would use difference scores.
• An assumption of the t test is that the population distribution is normal, but even if the
distribution is not normal, the results are fairly accurate.
• When testing hypotheses with t tests for dependent means, the mean of Population 2 is
assumed to be 0.
• Effect size for the t test for dependent means = mean of the difference scores / estimated
standard deviation of the population of difference scores.
• Power or sample size can be looked up using a power table.
• The power with a repeated-measures design is usually much higher than that of most other
designs with the same number of participants.
• t tests for dependent means are often found in the text or in a table of a research article in
this format: t (df) = x.xx, p < .05