Psych 230
Psychological Measurement
and Statistics
Pedro Wolf
October 21, 2009
Today….
• Hypothesis testing
• Null and Alternative Hypotheses
• Z-test
• Significant and Nonsignificant results
• Types of Statistical error (type 1 and type 2)
A scientific question
• A biology professor studies the effect of
nutrition on physical attributes. He theorizes
that a mother's nutrition can affect the height and
weight of her offspring. Further, he thinks that
the time of year a child is conceived, due to
seasonal nutrition factors, has a relationship
with how tall that child will be. After study, he
establishes that yearly nutrition is most
different from the norm in June. So, he wants
to know whether people conceived in June are
a different height than the population.
How do we answer this question?
1. State the hypotheses
2. Design the experiment
3. Collect the data
4. Create the statistical hypotheses
5. Select the appropriate statistical test
6. Decide the size of the rejection region (value of α)
7. Calculate the obtained and critical values
8. Make our conclusion
Hypothesis Testing
• Experimental hypotheses describe the
predicted outcome we may or may not find in
an experiment
• As scientists, we try to be conservative
– we should assume no effect of what we are
observing or testing
• Does Prozac work in treating depression?
• Are men better at math than women?
• Do we learn better when practice is all at once or spread over
time?
Hypothesis Testing
• In experiments, we usually identify two
hypotheses
• The Null Hypothesis (H0)
– there is no difference in the groups we are testing
• The Alternative Hypothesis (H1)
– there is a real difference in the groups we are
testing
Hypothesis Testing - Experiment
• The Null Hypothesis (H0)
– there is no difference in the groups we are testing
• H0 : People born in March are the same height
as those born in other months
• The Alternative Hypothesis
– there is a real difference in the groups we are testing
• H1 : People born in March are not the same height
as those born in other months
Designing the Experiment/Study
• Dependent/Observed Variable?
– We want to measure height
• Independent/Predictor Variable?
– Month of birth
– March vs all other months
• Observational or Experimental study?
Height - Sample
• Heights of those born in March:
63, 64, 62, 67, 68, 66, 72, 64
• Calculate the mean and standard deviation:
X̄ = 65.75
SX = 3.03
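As a quick check of these two values, here is a minimal Python sketch. It assumes SX is the descriptive standard deviation that divides by N, which is the version that reproduces 3.03 for these eight scores (dividing by N - 1 would give a larger value). The variable names are illustrative only.

```python
from statistics import mean, pstdev

# Heights of the eight people born in March (from the slide)
heights = [63, 64, 62, 67, 68, 66, 72, 64]

sample_mean = mean(heights)   # X-bar
sample_sd = pstdev(heights)   # SX, dividing by N (assumed convention)

print(f"mean = {sample_mean:.2f}, SD = {sample_sd:.2f}")
```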
Our Data
Population
• μ = 66.57
• σX = 4.091
Sample
• N = 8
• X̄ = 65.75
• SX = 3.03
Statistical Hypotheses
• Our hypotheses were:
– H0 : People born in March are the same height as
those born in other months
– H1 : People born in March are not the same height
as those born in other months
• H0 : μ = X̄
• H1 : μ ≠ X̄
Statistical Hypotheses
• H0 : μ = X̄
[Figure: illustration of H0, with X̄ marked]
Statistical Hypotheses
• H1 : μ ≠ X̄
[Figure: illustration of H1, with X̄ marked]
Z-Test
• The z-test is the procedure for computing a z-score
for a sample mean on the sampling
distribution of means
• Comparing a sample to a population
Assumptions of the Z-Test
• We have randomly selected one sample
• The dependent variable is at least
approximately normally distributed in the
population and involves an interval or ratio
scale
• We know the mean of the population of raw
scores
• We know the true standard deviation of the
population
Deciding the size of the rejection region
• Usually, psychologists use a rejection region of
0.05
– known as α (alpha)
– sometimes use 0.01 or 0.001
– If H0 is true, the probability of getting an X̄
extreme enough to fall in the rejection region is α
One-tail versus two-tail testing
• A two-tailed test is used when we predict that
there is a relationship, but we do not specifically
predict the direction in which scores will change
– Freshmen and Seniors score differently in tests of
sociability (H1)
– People born in March are not the same height as others
(H1)
• A one-tailed test is used when you predict the
specific direction in which scores will change
– Prozac will improve depression symptoms (H1)
Rejection region
• When a two-tailed test is used, we need to
spread our α value across both tails of the
distribution
• When a one-tailed test is used, all our α value
is put in one tail of the distribution
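The way α is split between the tails determines the critical value. The sketch below uses the standard normal distribution from Python's standard library; the helper name critical_z and its tails parameter are illustrative assumptions, not part of the lecture.

```python
from statistics import NormalDist

def critical_z(alpha, tails=2):
    """Return the positive critical z for a given alpha and number of tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Two tails: alpha is split in half; one tail: all of alpha sits in one tail
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed = critical_z(0.05, tails=2)
one_tailed = critical_z(0.05, tails=1)
print(round(two_tailed, 2), round(one_tailed, 3))
```

For a two-tailed test the rejection region lies beyond both -z and +z; for a one-tailed test the sign of the critical value follows the predicted direction.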
Rejection region
• A criterion of 0.05 and a region of rejection in
two tails
Rejection region
• A criterion of 0.05 and a region of rejection in
just one tail
Rejection region - Experiment
• In our study, we will use α = 0.05
• This is a two-tailed test
• Therefore we will have α/2 = 0.025 in each tail
Rejection region
• A criterion of 0.05 and a region of rejection in
two tails (0.025 in each tail)
The obtained and critical value
Calculate the critical value
• We want the z-score that corresponds to an
area in the tail of 0.025
• Look it up in the z-tables (starting on page 548):
• Find the entry where the area beyond z is 0.025
• Zcrit = ±1.96
Zcrit
[Figure: two-tailed rejection regions, marked at Zcrit = -1.96 and Zcrit = +1.96]
Calculate the obtained value
Population
• μ = 66.57
• σX = 4.09
Sample
• N = 8
• X̄ = 65.75
• SX = 3.03
z = (X̄ - μ) / σX̄, where σX̄ = σX / √N
Calculate the obtained value
σX̄ = σX / √N = 4.09 / √8 = 1.45
Zobt = (X̄ - μ) / σX̄ = (65.75 - 66.57) / 1.45 = -0.57
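The same calculation can be sketched in Python. The function name z_obtained and its parameter names are illustrative, and the critical value 1.96 is the two-tailed value for α = 0.05 used in this example.

```python
from math import sqrt

def z_obtained(sample_mean, pop_mean, pop_sd, n):
    """z-score of a sample mean on the sampling distribution of means."""
    standard_error = pop_sd / sqrt(n)   # standard error of the mean
    return (sample_mean - pop_mean) / standard_error

z_obt = z_obtained(65.75, 66.57, 4.09, 8)
z_crit = 1.96                           # two-tailed, alpha = 0.05
reject_h0 = abs(z_obt) > z_crit         # compare magnitudes for a two-tailed test

print(f"z_obt = {z_obt:.2f}, reject H0: {reject_h0}")
```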
Zcrit and Zobt
|Zobt| < |Zcrit|
[Figure: Zobt = -0.57 lies between Zcrit = -1.96 and Zcrit = +1.96, outside the rejection regions]
Drawing a conclusion
• |Zobt| < |Zcrit| (0.57 < 1.96), therefore we do not reject H0
• We do not have enough evidence to say that our
null hypothesis is false
• When we fail to reject H0 we say the results are
nonsignificant. Nonsignificant indicates that the
results are likely to occur if the predicted
relationship does not exist in the population
• We conclude that we have no evidence that people born in March
differ in height from those born in other months
Drawing a conclusion
• When we fail to reject H0, we do not prove that
H0 is true
• Nonsignificant results provide no convincing
evidence - one way or the other - as to whether
a relationship exists in nature
Drawing a different conclusion
• Let’s assume Zobt = -2.03
Zcrit and Zobt
[Figure: Zobt = -2.03 lies beyond Zcrit = -1.96, inside the lower rejection region; Zcrit = +1.96 marks the upper rejection region]
Drawing a different conclusion
• |Zobt| > |Zcrit| (2.03 > 1.96), therefore we reject H0
• When we reject H0 and accept H1 we say the
results are significant. Significant indicates that
the results are too unlikely to occur if the
predicted relationship does not exist in the
population
• We conclude that people born in March are
significantly shorter in height than those born in
other months
Z scores and p values
• Z scores can be readily changed back into
proportions, and probabilities
• When reporting the results of tests, a z
value (zobt) and p value are often reported
• In future homework assignments you'll
need to use p values
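As a rough illustration of turning a z-score back into a probability, the sketch below computes a two-tailed p value from the standard normal distribution. The helper name two_tailed_p is an assumption made for this example; the two z values are the obtained values discussed earlier.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability of a z at least this extreme, in either direction, if H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (-0.57, -2.03):
    p = two_tailed_p(z)
    verdict = "significant" if p < 0.05 else "nonsignificant"
    print(f"z = {z:+.2f}, p = {p:.3f} ({verdict} at alpha = 0.05)")
```

Comparing p with α gives the same decision as comparing Zobt with Zcrit: p < α exactly when Zobt falls in the rejection region.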
Types of Statistical Error
• When conducting a statistical test, we can
make two kinds of errors:
• Type 1
• Type 2
Type 1 Error
• A Type I error is defined as rejecting H0 when
H0 is actually true
• In a Type I error, we conclude that the
predicted relationship exists when it really does
not
• The probability of a Type I error equals α
Type 2 Error
• A Type II error is defined as retaining H0 when
H0 is false (and H1 is actually true)
• In a Type II error, we conclude that the
predicted relationship does not exist when it
really does
• The probability of a Type II error is β (beta), or 1 - power
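To make the two error probabilities concrete, here is a sketch for the two-tailed z-test on the height example. The "true" mean of 64.57 is a purely hypothetical value chosen for illustration; the lecture does not give one. The function name rejection_probability is also illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def rejection_probability(true_mean, null_mean, pop_sd, n, alpha=0.05):
    """Chance that a two-tailed z-test rejects H0 when the population mean is true_mean."""
    standard_error = pop_sd / sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Under true_mean, z_obt is normal with this mean and standard deviation 1
    shift = (true_mean - null_mean) / standard_error
    lower = STD_NORMAL.cdf(-z_crit - shift)      # z_obt below -z_crit
    upper = 1 - STD_NORMAL.cdf(z_crit - shift)   # z_obt above +z_crit
    return lower + upper

# H0 true: every rejection is a Type I error, so this equals alpha
type1 = rejection_probability(66.57, 66.57, 4.09, 8)

# H0 false (hypothetical true mean): retaining H0 is a Type II error
power = rejection_probability(64.57, 66.57, 4.09, 8)
beta = 1 - power

print(f"Type I = {type1:.3f}, power = {power:.3f}, beta = {beta:.3f}")
```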
Problem 1
Listening to music while taking a test may be relaxing or
distracting. To determine which, 49 participants are
tested while listening to music and they produce a mean
score of 54.63. In the population, the mean score without
music is 50 (std dev of 12).
1. Is this a one-tailed or two-tailed test? Why?
2. What are H0 and H1?
3. Compute zobt
4. With α = 0.05, what is zcrit?
5. What conclusion should we draw from this study?
1. Is this a one-tailed or two-tailed test? Why?
It will be a two-tailed test, as we are not predicting the
direction that the scores will change. That is, we are
asking whether music leads to a different performance,
either better or worse.
2. What are H0 and H1?
H0 : Music does not lead to changes in test
performance
H0 : μmusic = 50
H1 : Music leads to changes in test performance
H1 : μmusic ≠ 50
3. Compute zobt
z = (X̄ - μ) / σX̄, where σX̄ = σX / √N
σX̄ = 12 / √49 = 1.714
Zobt = (54.63 - 50) / 1.714 = +2.70
4. With α = 0.05, what is zcrit?
A two-tailed test with α = 0.05 leaves 0.025 in each tail.
The z-score for 0.025 is 1.96. Therefore zcrit = ±1.96
Zcrit and Zobt
[Figure: Zobt = +2.70 lies beyond Zcrit = +1.96, inside the upper rejection region; Zcrit = -1.96 marks the lower rejection region]
5. What conclusion should we draw from this study?
As zobt > zcrit (+2.70 > +1.96), we reject H0 and tentatively accept H1.
People listening to music while taking a test score
significantly better than those not listening to music.
Problem 1(b)
Listening to music while taking a test may be relaxing or
distracting. To determine which, 49 participants are
tested while listening to music and they produce a mean
score of 54.63. In the population, the mean score without
music is 50 (std dev of 12).
6. Based on our conclusion (music affects test scores),
what is the probability we made a Type 1 error? What
would that error be? What is the probability we made a
Type 2 error?
Probability of a Type 1 error is 0.05 (our α value).
This error would be saying a relationship existed
between music and test scores when really it did not.
As we rejected H0, we cannot have made a Type 2 error;
a Type 2 error is only possible when H0 is retained.
This would be saying there is no relationship when there
really was.
Problem 2
A researcher asks whether attending a private school
leads to higher or lower performance on a test of social
skills. A sample of 100 students from a private school
produces a mean score of 71.30. The μ for students
from public schools is 75.62 (σX = 28).
1. Is this a one-tailed or two-tailed test? Why?
2. What are H0 and H1?
3. Compute zobt
4. With α = 0.01, what is zcrit?
5. What conclusion should we draw from this study?
1. Is this a one-tailed or two-tailed test? Why?
It will be a two-tailed test, as the researcher is looking
for higher or lower scores.
2. What are H0 and H1?
H0 : Public and private students score the same on
tests of social skills
H0 : μprivate = 75.62
H1 : Public and private students score differently on
tests of social skills
H1 : μprivate ≠ 75.62
3. Compute zobt
z = (X̄ - μ) / σX̄, where σX̄ = σX / √N
σX̄ = 28 / √100 = 2.8
Zobt = (71.30 - 75.62) / 2.8 = -1.54
4. With α = 0.01, what is zcrit?
A two-tailed test with α = 0.01 leaves 0.005 in each tail.
The z-score for 0.005 is about 2.57 (more precisely, 2.576). Therefore zcrit = ±2.57
Zcrit and Zobt
[Figure: Zobt = -1.54 lies between Zcrit = -2.57 and Zcrit = +2.57, outside the rejection regions]
5. What conclusion should we draw from this study?
As |zobt| < |zcrit| (1.54 < 2.57), we retain H0
The researcher has no evidence of a relationship
between the type of school attended (public or private)
and social skills.
Problem 3
A researcher measures the self-esteem scores of a
sample of 9 statistics students, reasoning that their
frustration with this course may lower their self-esteem
relative to the typical college student where μ = 55 and
σX = 11.35. This sample has a mean self-esteem score
of 35.11.
1. Is this a one-tailed or two-tailed test? Why?
2. What are H0 and H1?
3. Compute zobt
4. With α = 0.05, what is zcrit?
5. What conclusion should we draw from this study?
1. Is this a one-tailed or two-tailed test? Why?
It will be a one-tailed test, as the researcher is just
interested in lower self-esteem scores.
2. What are H0 and H1?
H0 : Statistics students do not differ from others in
terms of their self-esteem
H0 : μstats ≥ 55
H1 : Statistics students have lower self-esteem than
other students
H1 : μstats < 55
3. Compute zobt
z = (X̄ - μ) / σX̄, where σX̄ = σX / √N
σX̄ = 11.35 / √9 = 3.783
Zobt = (35.11 - 55) / 3.783 = -5.26
4. With α = 0.05, what is zcrit?
A one-tailed test with α = 0.05, looking for lower scores,
leaves 0.05 in the lower tail. The z-score for 0.05 is
1.645. Therefore zcrit = -1.645
Zcrit and Zobt
[Figure: Zobt = -5.26 lies beyond Zcrit = -1.645, inside the lower-tail rejection region]
5. What conclusion should we draw from this study?
As zobt < zcrit (-5.26 < -1.645), we reject H0
The researcher has evidence of a relationship.
Statistics students score significantly lower than other
college students on tests of self-esteem.
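Because Problem 3 is one-tailed, the decision rule compares signed values rather than magnitudes. A minimal sketch, assuming a lower-tail test as in this problem; the function name one_tailed_lower_z_test is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_tailed_lower_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """One-tailed z-test predicting that the sample mean is LOWER than pop_mean."""
    standard_error = pop_sd / sqrt(n)
    z_obt = (sample_mean - pop_mean) / standard_error
    z_crit = NormalDist().inv_cdf(alpha)   # negative: all of alpha is in the lower tail
    return z_obt, z_crit, z_obt < z_crit   # reject H0 only if z_obt is below z_crit

z_obt, z_crit, reject_h0 = one_tailed_lower_z_test(35.11, 55, 11.35, 9)
print(f"z_obt = {z_obt:.2f}, z_crit = {z_crit:.3f}, reject H0: {reject_h0}")
```

Note that a large positive Zobt would not lead to rejecting H0 here, since the rejection region sits only in the lower tail.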