Hypothesis Testing


Sampling Distributions,
Hypothesis Testing and
One-sample Tests
Media Violence
• Does violent content in a video affect
later behavior?
 Bushman (1998)
• Two groups of 100 subjects saw a video
 Violent video versus nonviolent video
• Then free associated to 26 homonyms
with aggressive & nonaggressive forms.
 e.g. cuff, mug, plaster, pound, sock
Cont.
Media Violence--cont.
• Results
 Mean number of aggressive free associates
= 7.10
• Assume we know that without an aggressive
video the mean would be 5.65, and the
standard deviation = 4.5
 These are parameters (μ and σ)
• Is 7.10 enough larger than 5.65 to
conclude that video affected results?
Sampling Distribution of the
Mean
• We need to know what kinds of sample
means to expect if video has no effect.
 i.e. What kinds of means if μ = 5.65 and
σ = 4.5?
 This is the sampling distribution of the mean.
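The sampling distribution described here can be approximated by simulation. The sketch below is only an illustration: it assumes a normal population (the slides do not state the population's shape) and uses the null parameters 5.65 and 4.5 with groups of 100. The function name and the seed are hypothetical choices.

```python
import random
import statistics

MU, SIGMA, N = 5.65, 4.5, 100   # null-hypothesis parameters and group size from the slides

def simulate_sample_means(n_samples=10_000, seed=1):
    """Draw repeated samples of size N and return their means.

    Assumption: the population is treated as normal purely for illustration.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
        for _ in range(n_samples)
    ]

means = simulate_sample_means()
mean_of_means = statistics.fmean(means)   # should land close to 5.65
sd_of_means = statistics.stdev(means)     # should land close to 4.5 / sqrt(100) = 0.45
print(round(mean_of_means, 2), round(sd_of_means, 2))
```

Under these assumptions, a sample mean of 7.10 would sit far out in the upper tail of the simulated means.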
Cont.
Sampling Distribution
[Figure: histogram "Number of Aggressive Associates". x-axis: Mean Number Aggressive Associates (3.75 to 7.25); y-axis: Frequency (0 to 1400). Mean = 5.65, Std. Dev = .45, N = 10,000]
Cont.
Sampling Distribution of the
Mean--cont.
• The sampling distribution of the mean
depends on
 Mean of sampled population
• Why?
 St. dev. of sampled population
• Why?
 Size of sample
• Why?
Cont.
Sampling Distribution of the
Mean--cont.
• Shape of the sampling distribution
 Approaches normal
• Why?
 Rate of approach depends on sample size
• Why?
• Basic theorem
 Central limit theorem
Central Limit Theorem
• Given a population with mean = μ and
standard deviation = σ, the sampling
distribution of the mean (the distribution
of sample means) has a mean = μ, and a
standard deviation = σ/√n. The
distribution approaches normal as n, the
sample size, increases.
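As a small illustration of the theorem's standard deviation term, the sketch below evaluates σ/√n for several sample sizes. The value 4.5 comes from the media-violence example; the function name and the chosen sample sizes are illustrative assumptions.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

SIGMA = 4.5  # population standard deviation from the media-violence example
for n in (1, 25, 100, 400):
    # Quadrupling n halves the standard error.
    print(n, round(standard_error(SIGMA, n), 3))
```

With n = 100 this gives .45, the standard deviation of the sampling distribution used in the example.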
Demonstration
• Let population be very skewed
• Draw samples of 3 and calculate means
• Draw samples of 10 and calculate means
• Plot means
• Note changes in means, standard
deviations, and shapes
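The demonstration can be reproduced approximately in code. This is a sketch under a stated assumption: the slides do not say which skewed population was used, so a chi-square distribution with 3 df (mean 3, standard deviation about 2.45) stands in for it. The function name and seed are hypothetical.

```python
import random
import statistics

def sample_means(n, n_samples=10_000, seed=2):
    """Means of n_samples samples of size n from a skewed population.

    Assumption: a chi-square distribution with 3 df, generated as
    gamma(shape=1.5, scale=2), stands in for the skewed population.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gammavariate(1.5, 2.0) for _ in range(n))
        for _ in range(n_samples)
    ]

results = {}
for n in (1, 3, 10):   # n = 1 reproduces the parent population itself
    means = sample_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(n, round(results[n][0], 2), round(results[n][1], 2))
```

The means should stay near 3 while the standard deviations shrink roughly in proportion to 1/√n. Plotting each list of means as a histogram would show the shape becoming more symmetric.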
Cont.
Parent Population
[Figure: histogram "Skewed Population". x-axis: X (0.0 to 20.0); y-axis: Frequency (0 to 3000). Mean = 3.0, Std. Dev = 2.43, N = 10,000]
Cont.
Sampling Distribution n = 3
[Figure: histogram "Sampling Distribution, Sample size = n = 3". x-axis: Sample Mean (0.00 to 13.00); y-axis: Frequency (0 to 2000). Mean = 2.99, Std. Dev = 1.40, N = 10,000]
Cont.
Sampling Distribution n = 10
[Figure: histogram "Sampling Distribution, Sample size = n = 10". x-axis: Sample Mean (1.00 to 6.50); y-axis: Frequency (0 to 1600). Mean = 2.99, Std. Dev = .77, N = 10,000]
Cont.
Demonstration--cont.
• Means have stayed at 3.00 throughout, except for minor sampling error
• Standard deviations have decreased as σ/√n
predicts: 2.43/√3 = 1.40 and 2.43/√10 = 0.77
• Shapes have become more normal--see
superimposed normal distribution for
reference
Steps in Hypothesis Testing
• Define the null hypothesis.
• Decide what you would expect to find if
the null hypothesis were true.
• Look at what you actually found.
• Reject the null if what you found is not
what you expected.
The Null Hypothesis
• The hypothesis that our subjects came
from a population of normal responders.
• The hypothesis that watching a violent
video does not change mean number of
aggressive interpretations.
• The hypothesis we usually want to reject.
Important Concepts
• Concepts critical to hypothesis testing
 Decision
 Type I error
 Type II error
 Critical values
 One- and two-tailed tests
Decisions
• When we test a hypothesis we draw a
conclusion; either correct or incorrect.
 Type I error
• Reject the null hypothesis when it is actually
correct.
 Type II error
• Retain the null hypothesis when it is actually
false.
Type I Errors
• Assume violent videos really have no
effect on associations
• Assume we conclude that they do.
• This is a Type I error
 Probability set at alpha (α)
• α usually at .05
 Therefore, probability of Type I error = .05
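The claim that the Type I error rate equals alpha can be checked by simulation. The sketch below assumes a normal population with the null parameters from the example and a two-tailed z test at .05. Every experiment is run with the null hypothesis true, so each rejection is a Type I error. The function name, the number of experiments, and the seed are illustrative.

```python
import random
import statistics
from statistics import NormalDist

MU0, SIGMA, N, ALPHA = 5.65, 4.5, 100, 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-tailed cutoff, about 1.96

def type_i_error_rate(n_experiments=5_000, seed=3):
    """Fraction of experiments that reject H0 even though H0 is true.

    Assumption: scores are drawn from a normal population with the null parameters.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        sample_mean = statistics.fmean(rng.gauss(MU0, SIGMA) for _ in range(N))
        z = (sample_mean - MU0) / (SIGMA / N ** 0.5)
        if abs(z) > Z_CRIT:
            rejections += 1   # a false rejection, i.e. a Type I error
    return rejections / n_experiments

rate = type_i_error_rate()
print(rate)
```

The printed proportion should be close to .05, varying slightly with the seed.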
Type II Errors
• Assume violent videos make a difference
• Assume that we conclude they don’t
• This is also an error (Type II)
 Probability denoted beta (β)
• We can’t set beta easily.
• We’ll talk about this issue later.
• Power = (1 - β) = probability of correctly
rejecting false null hypothesis.
Critical Values
• These represent the point at which we
decide to reject null hypothesis.
• e.g. We might decide to reject null when
p(result | null) < .05.
 Our test statistic has some value with p =
.05
 We reject when we exceed that value.
 That value is the critical value.
One- and Two-Tailed Tests
• Two-tailed test rejects null when
obtained value too extreme in either
direction
 Decide on this before collecting data.
• One-tailed test rejects null if obtained
value is too low (or too high)
 We only set aside one direction for rejection.
Cont.
One- & Two-Tailed Example
• One-tailed test
 Reject null if violent video group had too
many aggressive associates
• Probably wouldn’t expect “too few,” and
therefore no point guarding against it.
• Two-tailed test
 Reject null if violent video group had an
extreme number of aggressive associates;
either too many or too few.
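The difference between the two kinds of test shows up in the critical value. The following sketch computes critical z values from the standard normal distribution with Python's `statistics.NormalDist`. The function name and its `tails` parameter are hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha=0.05, tails=2):
    """Critical z for a one- or two-tailed test at significance level alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha between both tails; a one-tailed test does not.
    return NormalDist().inv_cdf(1 - alpha / tails)

print(round(critical_z(0.05, tails=2), 3))  # 1.96
print(round(critical_z(0.05, tails=1), 3))  # 1.645
```

Because the one-tailed test puts all of alpha in one tail, its cutoff is lower in the predicted direction, but it cannot reject for a result in the opposite direction.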
Testing Hypotheses: σ known
• H0: μ = 5.65
• H1: μ ≠ 5.65 (Two-tailed)
• Calculate p(sample mean = 7.10) if
μ = 5.65
• Use z from normal distribution
• Sampling distribution would be normal
Using z To Test H0
• Calculate z
$$ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{7.1 - 5.65}{4.5 / \sqrt{100}} = \frac{1.45}{.45} = 3.22 $$
• If |z| > 1.96, reject H0
• 3.22 > 1.96
 The difference is significant.
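The z calculation can be checked in a few lines. This is a minimal sketch using the numbers from the example. The function name is hypothetical, and the two-tailed p value is an addition for illustration that the slides do not report.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Return z and the two-tailed p value for a one-sample z test."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

z, p = z_test(sample_mean=7.10, mu0=5.65, sigma=4.5, n=100)
print(round(z, 2), p < 0.05)  # 3.22 True
```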
Cont.
z--cont.
• Compare computed z to histogram of
sampling distribution
• The results should look consistent.
• Logic of test
 Calculate probability of getting this mean if
null true.
 Reject if that probability is too small.
Testing When σ Not Known
• Assume same example, but σ not known
• Can’t substitute s for σ because s more
likely to be too small
 See next slide.
• Do it anyway, but call answer t
• Compare t to tabled values of t.
Sampling Distribution of the
Variance
[Figure: histogram of sample variances, 10,000 samples of n = 5. x-axis: Sample variance (0.0 to 800.0); y-axis: Frequency (0 to 1400). Population variance = 138.89; 58.94% of sample variances < 138.89]
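The skew in the sampling distribution of the variance can be reproduced by simulation. The sketch below assumes a normal population with variance 138.89 and samples of n = 5, matching the figures on the slide. The population mean of 0, the function name, and the seed are arbitrary illustrative choices.

```python
import random
import statistics

POP_VARIANCE = 138.89           # population variance shown on the slide
POP_SD = POP_VARIANCE ** 0.5    # about 11.79

def share_of_small_variances(n=5, n_samples=10_000, seed=4):
    """Proportion of sample variances that fall below the population variance.

    Assumption: a normal population with mean 0; the mean does not affect the variance.
    """
    rng = random.Random(seed)
    below = 0
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, POP_SD) for _ in range(n)]
        # statistics.variance uses n - 1 in the denominator (unbiased estimate).
        if statistics.variance(sample) < POP_VARIANCE:
            below += 1
    return below / n_samples

share = share_of_small_variances()
print(share)
```

Even though the sample variance is unbiased on average, well over half of the individual sample variances fall below the population value. That is why s tends to be too small and why t, rather than z, is needed.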
t Test for One Mean
• Same as z except for s in place of σ.
• For Bushman, s = 4.40
$$ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{7.1 - 5.65}{4.40 / \sqrt{100}} = \frac{1.45}{.44} = 3.30 $$
Degrees of Freedom
• Skewness of sampling distribution of
variance decreases as n increases
• t will differ from z less as sample size
increases
• Therefore need to adjust t accordingly
• df = n - 1
• t based on df
t Distribution
Two-Tailed Significance Level
df      .10     .05     .02     .01
10     1.812   2.228   2.764   3.169
15     1.753   2.131   2.602   2.947
20     1.725   2.086   2.528   2.845
25     1.708   2.060   2.485   2.787
30     1.697   2.042   2.457   2.750
100    1.660   1.984   2.364   2.626
Conclusions
• With n = 100, t.05(99) = 1.98
• Because t = 3.30 > 1.98, reject H0
• Conclude that viewing violent video leads
to more aggressive free associates than
normal.
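The t test and the decision can be sketched as follows. The critical values are the two-tailed .05 entries from the t table in these slides. Since df = 99 is not tabled, the df = 100 entry is used as a close approximation. The function and dictionary names are hypothetical.

```python
import math

# Two-tailed .05 critical values from the t table in the slides (df -> critical t).
T_CRIT_05 = {10: 2.228, 15: 2.131, 20: 2.086, 25: 2.060, 30: 2.042, 100: 1.984}

def one_sample_t(sample_mean, mu0, s, n):
    """Return t and its degrees of freedom for a one-sample t test."""
    t = (sample_mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

t, df = one_sample_t(sample_mean=7.10, mu0=5.65, s=4.40, n=100)
# df = 99 has no row in the table, so the df = 100 row stands in for it.
reject = abs(t) > T_CRIT_05[100]
print(round(t, 2), df, reject)  # 3.3 99 True
```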
Factors Affecting t
• Difference between sample and
population means
• Magnitude of sample variance
• Sample size
Factors Affecting Decision
• Significance level α
• One-tailed versus two-tailed test
Size of the Effect
• We know that the difference is
significant.
 That doesn’t mean that it is important.
• Population mean = 5.65, Sample mean =
7.10
• Difference is nearly 1.5 words, or 25%
more violent words than normal.
Cont.
Effect Size (cont.)
• Later we will express this in terms of
standard deviations.
 1.45 units is 1.45/4.40 ≈ 0.33, about 1/3 of a
standard deviation.
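Both ways of describing the size of the effect, the percentage increase and the difference in standard deviation units, are simple arithmetic. The sketch below uses the values from the example. The function name is an illustrative choice and is not taken from the slides.

```python
def standardized_difference(sample_mean, mu0, sd):
    """Difference between the means expressed in standard deviation units."""
    return (sample_mean - mu0) / sd

d = standardized_difference(7.10, 5.65, 4.40)
percent_increase = 100 * (7.10 - 5.65) / 5.65
print(round(d, 2), round(percent_increase, 1))  # 0.33 25.7
```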
Confidence Limits on Mean
• Sample mean is a point estimate
• We want an interval estimate
 Probability that an interval computed this way
includes μ is 0.95
$$ CI_{.95} = \bar{X} \pm t_{.025}\, s_{\bar{X}} $$
For Our Data
$$ CI_{.95} = \bar{X} \pm t_{.025}\, s_{\bar{X}} = 7.1 \pm 1.98 \times 0.44 = 7.1 \pm 0.87 $$
$$ 6.23 \le \mu \le 7.97 $$
Confidence Interval
• The interval does not include 5.65--the
population mean without a violent video
• Consistent with result of t test.
• Confidence interval and effect size tell us about
the magnitude of the effect.
• What can we conclude from confidence
interval?
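The confidence limits for these data can be recomputed with a short sketch. It takes the critical value 1.98 from the slides as given, and the function name is hypothetical.

```python
import math

def confidence_limits(sample_mean, s, n, t_crit):
    """Lower and upper confidence limits: mean +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = confidence_limits(sample_mean=7.10, s=4.40, n=100, t_crit=1.98)
print(round(lower, 2), round(upper, 2))  # 6.23 7.97
print(lower <= 5.65 <= upper)            # False
```

The second line of output confirms that 5.65 lies outside the interval, which agrees with rejecting H0 in the t test.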