Chapter 9
Introduction to the t-statistic
PSY295 Spring 2003
Summerfelt
Overview
 CLT, or Central Limit Theorem
 z-score
 Standard error
 t-score
 Degrees of freedom
Learning Objectives
 Know when to use the t statistic for hypothesis testing
 Understand the relationship between z and t
 Understand the concept of degrees of freedom and the t distribution
 Perform calculations necessary to compute t statistic
   Sample mean & variance
   Estimated standard error for X-bar
Central limit theorem
 Based on probability theory
 Two steps
1. Take a given population and draw random samples of the same size again and again
2. Plot the means from the results of Step 1 and it will be approximately a normal curve, where the center of the curve is the population mean and the spread (standard deviation) of the curve is the standard error
 Even if the population distribution is skewed, the distribution from Step 2 will be approximately normal when the samples are reasonably large!
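The two steps can be tried out with a short simulation. This is only a sketch under assumed values: the skewed (exponential) population, the sample size of 30, the 5,000 repetitions, and the helper name `sample_means` are illustrative choices, not part of the lecture.

```python
import random
import statistics


def sample_means(population, n, num_samples, seed=0):
    """Step 1: draw `num_samples` random samples of size `n`; return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)]


# Hypothetical, strongly skewed population (assumption for illustration only)
rng = random.Random(42)
population = [rng.expovariate(1.0) for _ in range(100_000)]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 30
means = sample_means(population, n, num_samples=5_000)

# Step 2: instead of plotting, compare the center and spread of the sample means
print(f"population mean      = {mu:.3f}")
print(f"mean of sample means = {statistics.fmean(means):.3f}")
print(f"sigma / sqrt(n)      = {sigma / n ** 0.5:.3f}")
print(f"SD of sample means   = {statistics.stdev(means):.3f}")
```

The mean of the sample means should land close to the population mean, and their standard deviation close to sigma / sqrt(n), which is the standard error.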
Z-score Review
 A sample mean (X-bar) approximates a population mean (μ)
 The standard error provides a measure of
   how well a sample mean approximates the population mean
   how much difference between X-bar and μ is reasonable to expect just by chance
 The z-score is a statistic used to quantify this inference
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{\text{obtained difference between data and hypothesis}}{\text{standard distance expected by chance}}$
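As a quick check on the formula, here is a minimal sketch of the z-score calculation. The function name `z_score` and the numbers (μ = 50, σ = 10, n = 25, X-bar = 53) are made up for illustration. It assumes the standard error is σ divided by the square root of n.

```python
import math


def z_score(sample_mean, mu, sigma, n):
    """z = (X-bar - mu) / standard error, with standard error = sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error


# Hypothetical values, for illustration only
z = z_score(sample_mean=53, mu=50, sigma=10, n=25)
print(z)  # 1.5
```

Note that the function cannot be called without `sigma`, the population standard deviation.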
What’s the problem with z?
 Need to know the population mean and variance!!! Not always available.
What is the t statistic?
 “Cousin” of the z statistic that does not require the population mean (μ) or variance (σ²) to be known
 Can be used to test hypotheses about a completely
unknown population (when the only information
about the population comes from the sample)
 Required: a sample and a reasonable hypothesis
about the population mean (μ)
 Can be used with one sample or to compare two
samples
When to use the t statistic?
 For single samples/groups,
   Whether a treatment causes a change in the population mean
   Sample mean consistent with hypothesized population mean
 For two samples,
   Coming later!
Difference between X-bar and μ
 Whenever you draw a sample and observe
   there is a discrepancy or “error” between the population mean and the sample mean
   difference between sample mean and population mean
   Called “Sampling Error”; its typical size is measured by the “Standard error of the mean”
 Goal for hypothesis testing is to evaluate the
significance of discrepancy between X-bar & μ
Hypothesis Testing Two Alternatives
 Is the discrepancy simply due to chance?
   X-bar = μ
   Sample mean approximates the population mean
 Is the discrepancy more than would be expected by chance?
   X-bar ≠ μ
   The sample mean is different from the population mean
Standard error of the mean
 In Chapter 8, we calculated the standard error precisely because we had the population parameters.
 For the t statistic,
   We use sample data to compute an “Estimated Standard Error of the Mean”
   Uses the exact same formula but substitutes the sample variance for the unknown population variance
   Or you can use standard deviation
Estimated standard error of mean
$s_{\bar{X}} = \sqrt{\dfrac{s^2}{n}}$
Or
$s_{\bar{X}} = \dfrac{s}{\sqrt{n}}$
Common confusion to avoid
Formula for sample variance and for estimated standard error (is the denominator n or n-1?)
 Sample variance and standard deviation are descriptive statistics
   Describe how scattered the scores are around the mean
   Divide by n-1 or df
 Estimated standard error is an inferential statistic
   Measures how accurately the sample mean describes the population mean
   Divide by n
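The n versus n-1 question is easier to see in code. The sketch below uses four made-up scores and illustrative function names: the sample variance divides SS by n-1, and the estimated standard error then divides that variance by n.

```python
import math


def sample_variance(scores):
    """Descriptive: s^2 = SS / (n - 1), where SS is the sum of squared deviations."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / (n - 1)


def estimated_standard_error(scores):
    """Inferential: s_Xbar = sqrt(s^2 / n)."""
    return math.sqrt(sample_variance(scores) / len(scores))


scores = [2, 6, 6, 6]  # hypothetical data: mean = 5, SS = 12

s2 = sample_variance(scores)             # 12 / 3 = 4.0
s = math.sqrt(s2)                        # 2.0
se = estimated_standard_error(scores)    # sqrt(4.0 / 4) = 1.0
se_from_sd = s / math.sqrt(len(scores))  # same value via s / sqrt(n)

print(s2, s, se, se_from_sd)
```

Both forms of the estimated standard error give the same number, so either the variance or the standard deviation can be used.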

The t statistic
 The t statistic is used to test hypotheses about an unknown population mean (μ) in situations where the value of σ² is unknown.
 t = obtained difference / standard error
 What’s the difference between the t formula and the z-score formula?
$t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
t and z
 Think of t as an estimated z score
 Estimation is due to the unknown population
variance (σ2)
 With large samples, the estimation is good and the
t statistic is very close to z
 In smaller samples, the estimation is poorer
 Why?
 Degrees of freedom is used to describe how well t
represents z
Degrees of freedom
 df = n – 1
 Value of df will determine how well the distribution of t approximates a normal one
   With larger df’s, the distribution of the t statistic will approximate the normal curve
   With smaller df’s, the distribution of t will be flatter and more spread out
 t table uses critical values and incorporates df
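One way to see the "flatter and more spread out" shape is to simulate t values when the null hypothesis is true and count how often they fall beyond ±1.96, the familiar two-tailed 5% cutoff for z. This is a rough sketch: the standard normal population, the sample sizes, the number of trials, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def t_statistic(sample, mu):
    """t = (X-bar - mu) / sqrt(s^2 / n), with s^2 = SS / (n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu) / math.sqrt(s2 / n)


def tail_fraction(n, cutoff=1.96, trials=10_000, seed=1):
    """Fraction of simulated t values beyond +/- cutoff when mu really is 0."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, mu=0.0)) > cutoff:
            extreme += 1
    return extreme / trials


for n in (5, 30):
    print(f"n = {n:2d}, df = {n - 1:2d}: fraction beyond +/-1.96 = {tail_fraction(n):.3f}")
```

With a small df, noticeably more than 5% of the t values fall beyond ±1.96, which is why the t table gives larger critical values for small samples. With df = 29 the fraction is already much closer to 5%.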
Four step procedure for Hypothesis Testing
 Same procedure used with z scores
1. State hypotheses and select a value for α
    Null hypothesis always states a specific value for μ
2. Locate a critical region
    Find value for df and use the t distribution table
    Make sure that you are using the correct table
3. Calculate the test statistic
4. Make a decision
    Reject or “fail to reject” null hypothesis
Example
 GNC is selling a memory booster, should you use it?
 Construct a sample (n=25) & take it for 4 weeks
 Give sample a memory test where μ is known to
be 56
 Sample produced a mean of 59 with SS of 2400
 Use α=0.05
 What statistic will you use? Why?
Steps
1. State Hypotheses and alpha level
2. Locate critical region (need to know n, df, & α)
3. Obtain the data and compute test statistic
4. Make decision
$s^2 = \dfrac{SS}{n - 1}$
$s_{\bar{X}} = \sqrt{\dfrac{s^2}{n}}$
$t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
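The four steps for the memory booster example can be sketched as follows. The numbers n, μ, X-bar, SS and α come from the example. The two-tailed form of the test is an assumption (the slide does not say one- or two-tailed), and the critical value 2.064 is the two-tailed entry for df = 24, α = 0.05 in a standard t table.

```python
import math

# Step 1: hypotheses and alpha. H0: mu = 56 (the booster has no effect); H1: mu != 56
n = 25
mu = 56
sample_mean = 59
ss = 2400
alpha = 0.05

# Step 2: critical region (assumes a two-tailed test)
df = n - 1          # 24
t_critical = 2.064  # t table value for df = 24, alpha = 0.05, two tails

# Step 3: compute the test statistic
s2 = ss / df                     # sample variance: 2400 / 24 = 100.0
s_xbar = math.sqrt(s2 / n)       # estimated standard error: sqrt(100 / 25) = 2.0
t = (sample_mean - mu) / s_xbar  # (59 - 56) / 2.0 = 1.5

# Step 4: make a decision
decision = "reject H0" if abs(t) > t_critical else "fail to reject H0"
print(f"df = {df}, s^2 = {s2}, s_Xbar = {s_xbar}, t = {t}: {decision}")
```

Because t = 1.5 does not exceed 2.064, the result is "fail to reject H0": under these assumptions the sample does not give sufficient evidence that the memory booster changes memory scores.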