Mar. 31 Statistic for the day:
Average number of baseball gloves
that can be made from one cow: 5
Assignment:
Read Chapter 20 (again!)
Do Exercises 4, 6, 8, 9, 10, 15
These slides were created by Tom Hettmansperger and in some cases
modified by David Hunter
True or False?
To construct a confidence interval for a
population PROPORTION, it is enough to
know the sample proportion and the sample
size.
To construct a confidence interval for a
population MEAN, it is enough to know the
sample mean and the sample size.
Do each of the following tend to
make a confidence interval WIDER
or NARROWER?
A larger sample size
A larger confidence coefficient
A larger standard error of the mean
A sample proportion closer to .5
A larger sample mean
Thought questions:
In Example 1 (p. 352), for men who diet but do not exercise
a 95% confidence interval for mean weight loss is
13 to 18 pounds.
For men who exercise but do not diet the 95% confidence
interval for mean weight loss is
6 to 11 pounds.
a. Do you think this means that 95% of all men who diet
will lose between 13 and 18 pounds?
b. On the basis of these results, do you think that you can
conclude that men who diet without exercise lose more
weight on average?
Back to holding babies on the left.
Accepting that Lee Salk has presented a strong case
for holding babies on the left, what is the selective
advantage from the point of view of evolution?
Hypothesis: Holding baby on the left is holding
baby over the heart. And the sound of a human
heartbeat is soothing to baby.
To test this hypothesis, Salk randomly selected a period
of 4 days and played the sound of a heart beating in a
new baby nursery. Then he did the same without the
heartbeat for a new group of newborns.
Babies were divided into three groups:
•light birth weight (2510 – 3000 grams),
•medium birth weight (3010 – 3500 grams) and
•heavy (3510 grams and above).
The weight change from day 1 to day 4 was recorded.
We want to know if the population means for heartbeat
and control (no heartbeat) are close or not.
We don’t know the population means.
We take samples and compute the sample means.
Since the sample means are attracted to the population means,
we want to check to see if the sample means are close.
How do we decide if the sample means are close?
One way: compare the 95% confidence intervals.
If the two confidence intervals are separated then perhaps
we can conclude that the population means are separated
and are not close.
Using confidence intervals takes uncertainty due
to variation in the sample means into account.
95% confidence intervals for weight change
Birth weights (in grams); HB = heartbeat, C = control

              2510-3000         3010-3500         3510 and above
              HB       C        HB       C        HB       C
mean          65       -20      40       -10      10       -45
SD            50       60       50       50       35       75
n             35       28       45       45       20       36
SEM           8.45     11.33    7.45     7.45     7.83     12.50
95% CI low    48.1     -42.7    25.1     -24.9    -5.7     -70
95% CI high   81.9     2.7      54.9     4.9      25.7     -20
[Figure: 95% confidence intervals for mean weight gain (first 4 days).
Vertical axis: weight change, from -50 (wt loss) to 100 (wt gain).
Horizontal axis: 1 L: Con, 1 L: HB, 2 M: Con, 2 M: HB, 3 H: Con, 3 H: HB.]
We informally compare the 95% confidence intervals
to try to decide if the population means are close or not.
In this case we conclude that for all three birth weights
the population mean weight change for heartbeat babies
is greater than the population mean weight change for
the control babies.
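The informal comparison can be sketched in Python. This is a minimal sketch, not part of the original slides. It assumes the summary statistics (mean, SD, n) reported for the six groups and uses sample mean plus or minus 2 SEMs as the approximate 95% interval. The names ci95 and separated are made up for illustration.

```python
import math

# Summary statistics (mean, SD, n) in grams, as reported for each group.
# "HB" = heartbeat group, "C" = control group.
groups = {
    "light":  {"HB": (65, 50, 35), "C": (-20, 60, 28)},
    "medium": {"HB": (40, 50, 45), "C": (-10, 50, 45)},
    "heavy":  {"HB": (10, 35, 20), "C": (-45, 75, 36)},
}

def ci95(mean, sd, n):
    """Approximate 95% interval: sample mean plus or minus 2 SEMs."""
    sem = sd / math.sqrt(n)
    return (mean - 2 * sem, mean + 2 * sem)

def separated(a, b):
    """True if intervals a and b do not overlap."""
    return a[1] < b[0] or b[1] < a[0]

for weight, g in groups.items():
    hb, c = ci95(*g["HB"]), ci95(*g["C"])
    print(f"{weight}: HB {hb[0]:.1f} to {hb[1]:.1f}, "
          f"C {c[0]:.1f} to {c[1]:.1f}, separated: {separated(hb, c)}")
```

For each birth weight the heartbeat interval lies entirely above the control interval, which is the pattern the informal conclusion relies on.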
Is there a more formal way to approach the question:
Is there a difference between the population means?
YUP!
Construct a 95% confidence
interval for the difference in
population means based on the
difference in sample means.
…First, a digression.
Suppose I tell you that I have given IQ tests to
a sample of PSU students.
I tell you the mean IQ for the sample is 105.
Question: Is this close to 100 or not?
What other information do you need in order to
answer the question?
Suppose you ask me for the SEM.
It is 2.
Now is 105 close to 100 or not?
Why?
[Figure: Normal curve of the sample mean (SEM = 2), centered at 105.
The range within 2 SEMs of 105 runs from 101 to 109,
so 100 falls outside it: NOT CLOSE.]
Suppose you ask me for the SEM.
It is 4.
Now is 105 close to 100 or not?
Why?
[Figure: Normal curve of the sample mean (SEM = 4), centered at 105.
The range within 2 SEMs of 105 runs from 97 to 113,
so 100 falls inside it: CLOSE.]
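The two cases can be checked with a few lines of Python. This is only a sketch of the rule of thumb used here: a value counts as "close" to the sample mean if it lies within 2 SEMs of it. The function name is_close is hypothetical.

```python
def is_close(sample_mean, reference, sem):
    """True if the reference value lies within 2 SEMs of the sample mean."""
    return abs(sample_mean - reference) <= 2 * sem

for sem in (2, 4):
    low, high = 105 - 2 * sem, 105 + 2 * sem
    verdict = "close" if is_close(105, 100, sem) else "not close"
    print(f"SEM = {sem}: 2-SEM range is {low} to {high}, so 100 is {verdict}")
```

The same sample mean of 105 gives opposite answers depending on the SEM alone.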
So to answer the question: Is a sample mean
of 105 close to 100 or not, you need the
SEM (standard error of the mean).
I could give it to you directly. OR
I could give you
•the sample size and
•the sample standard deviation, SD
Then SEM = SD/sqrt(sample size)
For example: sample size 100 and SD 20
Then SEM = 20/10 = 2.
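As a quick check of the arithmetic, here is a small Python sketch of SEM = SD/sqrt(sample size). The function name sem_from_sd is made up for illustration, and the second call (sample size 400) is an extra case not taken from the slides.

```python
import math

def sem_from_sd(sd, n):
    """Standard error of the mean: SD divided by sqrt(sample size)."""
    return sd / math.sqrt(n)

print(sem_from_sd(20, 100))  # the slide's example: 20 / 10
print(sem_from_sd(20, 400))  # same SD, four times the sample size
```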
Now think of TWO sample means
Suppose I have two sample means and I want to
know if they are close to each other.
This is equivalent to:
Is the difference between the two sample means
close to zero?
Let D denote the difference in sample means.
What do you need from me to decide if the
difference D is close to zero?
You need the standard deviation of the
difference of sample means.
Example:
Suppose I tell you that I have two samples of babies,
one that listened to heartbeats and the other that did
not.
I measure weight gained and tell you:
•Heartbeat group sample mean weight gain is 65 g
•Control group sample mean weight gain is −20 g
Are the sample means close? Is the difference of 85
grams close to 0?
I now tell you that the standard deviation of the
difference in sample means is: 14.13 g
Can you tell if the difference of 85 g is close to 0?
You need to check to see if 0 is within 2 standard
deviations of 85 (we suppress the normal curve):
85 ± 2 × (14.13)
85 ± 28.26
56.74 to 113.26
So 0 is not close to 85 in this case and we conclude
that the sample means are not close to each other.
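The same check can be written as a short Python sketch. It assumes the figures given in this example (sample means of 65 g and -20 g, standard deviation of the difference 14.13 g). The helper name two_sd_interval is hypothetical.

```python
def two_sd_interval(difference, sd_of_difference):
    """Interval of difference plus or minus 2 standard deviations."""
    half_width = 2 * sd_of_difference
    return (difference - half_width, difference + half_width)

diff = 65 - (-20)  # heartbeat mean minus control mean, in grams
low, high = two_sd_interval(diff, 14.13)

print(f"{low:.2f} to {high:.2f}")
print("0 inside interval:", low <= 0 <= high)
```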
Recall the Pythagorean Theorem:
C = sqrt(A^2 + B^2)
[Figure: right triangle with legs A and B and hypotenuse C]
Question: How can we get the standard deviation of the
difference from information on the two samples?
Suppose we have the SEMs for the two sample means:
•Heartbeat SEM = 8.45 g
•Control SEM = 11.33 g
[Figure: right triangle with legs Heartbeat SEM = 8.45 and Control SEM = 11.33]
Standard deviation of the difference = sqrt(8.45^2 + 11.33^2) = 14.13
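In Python the Pythagorean combination of the two SEMs is a single call. This sketch uses the standard library's math.hypot, which returns sqrt(a^2 + b^2), together with the two rounded SEMs quoted here.

```python
import math

heartbeat_sem = 8.45  # grams
control_sem = 11.33   # grams

# Standard deviation of the difference in sample means
sd_diff = math.hypot(heartbeat_sem, control_sem)
print(round(sd_diff, 2))
```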
To decide if two sample means are close, we check to
see if their difference is close to 0.
We must have the standard deviation of the difference.
Once we have that we can check to see if 0 is within
2 standard deviations of the difference.
We could be given:
•The individual sample mean SEMs.
 Then compute the standard deviation of the difference
 using the Pythagorean theorem.
OR
•The individual sample sizes and the individual
 sample standard deviations.
 Then compute the individual SEMs from SD/sqrt(sample size).
 Then compute the standard deviation of the difference using
 the Pythagorean theorem.
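The whole recipe, starting from sample sizes and sample standard deviations, might be sketched as follows. This is an illustration under stated assumptions, not the authors' code. It uses the means, SDs, and sample sizes from the baby weight-change table, and the function names sem and difference_interval are made up. Because the SEMs are not rounded before combining, the light-weight standard deviation of the difference comes out near 14.14 rather than 14.13.

```python
import math

def sem(sd, n):
    """Standard error of the mean: SD / sqrt(sample size)."""
    return sd / math.sqrt(n)

def difference_interval(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference, SD of difference, difference +/- 2 SDs)."""
    diff = mean1 - mean2
    # Pythagorean theorem applied to the two SEMs
    sd_diff = math.hypot(sem(sd1, n1), sem(sd2, n2))
    return diff, sd_diff, (diff - 2 * sd_diff, diff + 2 * sd_diff)

# (HB mean, HB SD, HB n, C mean, C SD, C n), in grams
comparisons = {
    "light":  (65, 50, 35, -20, 60, 28),
    "medium": (40, 50, 45, -10, 50, 45),
    "heavy":  (10, 35, 20, -45, 75, 36),
}

for label, stats in comparisons.items():
    diff, sd_diff, (low, high) = difference_interval(*stats)
    zero_inside = low <= 0 <= high
    print(f"{label}: difference {diff} g, SD of difference {sd_diff:.2f} g, "
          f"interval {low:.1f} to {high:.1f}, contains 0: {zero_inside}")
```

If 0 falls outside the interval, the two sample means are judged not close to each other.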