Transcript Document

Happiness comes not from material wealth but less desire.
Chapter 17
Inference about a Population Mean
Chapter 17
BPS - 5th Ed.
Conditions for Inference
about a Mean



Data are from an SRS of size n.
Population has a Normal distribution with mean μ and standard deviation σ.
Both μ and σ are usually unknown.
We use inference to estimate μ.
Problem: σ unknown means we cannot use the z procedures previously learned.
Standard Error



When we do not know the population standard deviation σ (which is usually the case), we must estimate it with the sample standard deviation s.
When the standard deviation of a statistic is estimated from data, the result is called the standard error of the statistic.
The standard error of the sample mean $\bar{x}$ is
$$\frac{s}{\sqrt{n}}$$
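As a minimal sketch of this computation in Python: the function name `standard_error` is hypothetical, and the inputs s = 3.9 and n = 7 are borrowed from the American Adult Heights case study later in this chapter.

```python
import math

def standard_error(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

# s = 3.9 inches and n = 7 come from the heights case study.
se = standard_error(3.9, 7)
print(f"standard error = {se:.4f}")
```

The standard error shrinks as n grows, so larger samples give a more precise estimate of the mean.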

One-Sample t Statistic

When we estimate σ with s, our one-sample z statistic becomes a one-sample t statistic.
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad\longrightarrow\quad t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
By changing the denominator to be the standard error, our statistic no longer follows a Normal distribution. The t test statistic follows a t distribution with n – 1 degrees of freedom.
The t Distributions



The t density curve is similar in shape to the
standard Normal curve. They are both symmetric
about 0 and bell-shaped.
The spread of the t distributions is a bit greater
than that of the standard Normal curve (i.e., the t
curve is slightly “fatter”).
As the degrees of freedom increase, the t density
curve approaches the N(0, 1) curve more closely.
This is because s estimates σ more accurately as
the sample size increases.
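The comparison between the t curves and the standard Normal curve can be checked numerically. The sketch below uses only the Python standard library and the textbook formula for the t density. The helper names `t_pdf` and `normal_pdf` and the chosen degrees of freedom are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard Normal distribution N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Compare the height at the center (x = 0) and in the tail (x = 3).
for df in (2, 10, 30, 1000):
    print(f"t({df}): density at 0 = {t_pdf(0, df):.4f}, at 3 = {t_pdf(3, df):.4f}")
print(f"N(0, 1): density at 0 = {normal_pdf(0):.4f}, at 3 = {normal_pdf(3):.4f}")
```

For small degrees of freedom the t curve is lower at the center and higher in the tails than N(0, 1). Both gaps shrink as the degrees of freedom grow.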
Using Table C


Table C on page 693 gives critical values having
upper tail probability p along with corresponding
confidence level C.
z* values are also displayed at the bottom.
Using Table C

Find the value t* with probability 0.025 to its right
under the t(7) density curve.
t* = 2.365
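Without the printed table, the same critical value can be approximated numerically. This is a rough sketch using only the Python standard library: it integrates the t density with Simpson's rule and finds t* by bisection. The helper names (`t_pdf`, `upper_tail`, `t_critical`), the step count, and the search range 0 to 100 are assumptions, not part of the course material. Statistical software provides these quantities directly.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t): 0.5 minus the area from 0 to t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(p, df):
    """Value t* with upper tail probability p, assumed to lie in [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > p:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

t_star = t_critical(0.025, 7)
print(f"t* for p = 0.025, df = 7: {t_star:.3f}")
```

The result should agree with the Table C entry to three decimal places.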
One-Sample t Confidence Interval
Take an SRS of size n from a population with unknown mean μ and unknown standard deviation σ. A level C confidence interval for μ is:
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
where t* is the critical value for confidence level C from the t density curve with n – 1 degrees of freedom.
– This interval is exact when the population distribution is Normal and approximate for large n in other cases.
Case Study
American Adult Heights
A study of 7 American adults from an SRS yields an average height of $\bar{x}$ = 67.2 inches and a standard deviation of s = 3.9 inches. A 95% confidence interval for the average height of all American adults (μ) is:
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 67.2 \pm 2.365 \frac{3.9}{\sqrt{7}} = 67.2 \pm 3.486$$
= 63.714 to 70.686
“We are 95% confident that the average height of all
American adults is between 63.714 and 70.686 inches.”
One-Sample t Test
Like the confidence interval, the t test is close in form to the z test learned earlier. When estimating σ with s, the test statistic becomes:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
where t follows the t density curve with n – 1 degrees of freedom, and the P-value of t is determined from that curve.
– The P-value is exact when the population distribution is Normal and approximate for large n in other cases.
P-value for Testing Means

Ha: μ > μ0
P-value is the probability of getting a value as large or larger than the observed test statistic (t) value.
Ha: μ < μ0
P-value is the probability of getting a value as small or smaller than the observed test statistic (t) value.
Ha: μ ≠ μ0
P-value is two times the probability of getting a value as large or larger than the absolute value of the observed test statistic (t) value.
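The three rules can be written as one small function. The sketch below is an illustration using only the Python standard library: tail areas come from Simpson's rule applied to the t density, the names `t_pdf`, `upper_tail`, and `p_value` are assumptions, and the inputs t = 2.0 with df = 9 are arbitrary example values.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t): 0.5 minus the signed area from 0 to t (Simpson's rule)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def p_value(t, df, alternative):
    """P-value for Ha: μ > μ0 ('greater'), μ < μ0 ('less'), μ ≠ μ0 ('two-sided')."""
    if alternative == "greater":
        return upper_tail(t, df)            # as large or larger than t
    if alternative == "less":
        return 1 - upper_tail(t, df)        # as small or smaller than t
    if alternative == "two-sided":
        return 2 * upper_tail(abs(t), df)   # twice the tail beyond |t|
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Arbitrary example values, for illustration only.
for alt in ("greater", "less", "two-sided"):
    print(f"{alt:>9}: P = {p_value(2.0, 9, alt):.4f}")
```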
Case Study
Sweetening Colas (Ch. 14)
Cola makers test new recipes for loss of sweetness
during storage. Trained tasters rate the sweetness
before and after storage. Here are the sweetness
losses (sweetness before storage minus sweetness
after storage) found by 10 tasters for a new cola recipe:
2.0  0.4  0.7  2.0  -0.4  2.2  -1.3  1.2  1.1  2.3
Are these data good evidence that the cola lost sweetness during storage?
Case Study
Sweetening Colas
It is reasonable to regard these 10 carefully trained
tasters as an SRS from the population of all trained
tasters.
While we cannot judge Normality from
just 10 observations, a stemplot of the
data shows no outliers, clusters, or
extreme skewness. Thus, P-values for
the t test will be reasonably accurate.
Case Study
1. Hypotheses: H0: μ = 0, Ha: μ > 0
2. Test Statistic (df = 10 − 1 = 9):
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.02 - 0}{1.196/\sqrt{10}} = 2.70$$
3. P-value:
P-value = P(T > 2.70) = 0.0123 (using a computer)
P-value is between 0.01 and 0.02 since t = 2.70 is between t* = 2.398 (p = 0.02) and t* = 2.821 (p = 0.01) (Table C)
4. Conclusion:
Since the P-value is smaller than α = 0.02, there is quite strong evidence that the new cola loses sweetness on average during storage at room temperature.
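The summary statistics and test statistic in this case study can be recomputed from the ten sweetness losses. This is a minimal sketch with Python's standard library. The variable names are illustrative, and the two critical values are the Table C entries quoted above rather than computed values.

```python
import math
import statistics

# Sweetness losses reported by the 10 tasters.
losses = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]

n = len(losses)
xbar = statistics.mean(losses)
s = statistics.stdev(losses)            # sample standard deviation
t = (xbar - 0) / (s / math.sqrt(n))     # mu_0 = 0 under H0
df = n - 1

# Table C critical values for df = 9 (upper tail p = 0.02 and p = 0.01).
t_p02, t_p01 = 2.398, 2.821
print(f"x-bar = {xbar:.2f}, s = {s:.3f}, t = {t:.2f}, df = {df}")
print("P-value between 0.01 and 0.02:", t_p02 < t < t_p01)
```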
Matched Pairs t Procedures




To compare two treatments, subjects are matched in pairs
and each treatment is given to one subject in each pair.
Before-and-after observations on the same subjects also
call for using matched pairs.
To compare the responses to the two treatments in a
matched pairs design, apply the one-sample t procedures to
the observed differences (one treatment observation minus
the other).
The parameter μ is the mean difference in the responses
to the two treatments within matched pairs of subjects in
the entire population.
Case Study
Air Pollution
Pollution index measurements were recorded for two areas of a city on each of 8 days.
Are the average pollution levels the same for the two areas of the city?
Area A   Area B   A – B
2.92     1.84     1.08
1.88     0.95     0.93
5.35     4.26     1.09
3.81     3.18     0.63
4.69     3.44     1.25
4.86     3.69     1.17
5.81     4.95     0.86
5.55     4.47     1.08
Case Study
Air Pollution
It is reasonable to regard these 8 measurement pairs as
an SRS from the population of all paired measurements.
While we cannot judge Normality from
just 8 observations, a stemplot of the
data shows no outliers, clusters, or
extreme skewness. Thus, P-values for
the t test will be reasonably accurate.
0 | 689
1 | 11122
These 8 differences have $\bar{x}$ = 1.0113 and s = 0.1960.
Case Study
1. Hypotheses: H0: μ = 0, Ha: μ ≠ 0
2. Test Statistic (df = 8 − 1 = 7):
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.0113 - 0}{0.1960/\sqrt{8}} = 14.594$$
3. P-value:
P-value = 2P(T > 14.594) = 0.0000017 (using a computer)
P-value is smaller than 2(0.0005) = 0.0010 since t = 14.594 is greater than t* = 5.408 (upper tail area = 0.0005, df = 7) (Table C)
4. Conclusion:
Since the P-value is smaller than α = 0.001, there is very strong evidence that the mean pollution levels are different for the two areas of the city.
Case Study
Air Pollution
Find a 95% confidence interval to estimate the difference in pollution indexes (A – B) between the two areas of the city. (df = 8 − 1 = 7 for t*)
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 1.0113 \pm 2.365 \frac{0.1960}{\sqrt{8}} = 1.0113 \pm 0.1639$$
= 0.8474 to 1.1752
We are 95% confident that the pollution index in area
A exceeds that of area B by an average of 0.8474 to
1.1752 index points.
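Because matched pairs inference is just the one-sample t procedure applied to the differences, both the test statistic and this interval follow from the reported summary statistics of the A – B differences. A small sketch follows. The variable names are illustrative, and t* = 2.365 is the Table C value for df = 7.

```python
import math

# Reported summary statistics of the 8 differences (A minus B).
xbar, s, n = 1.0113, 0.1960, 8

se = s / math.sqrt(n)        # standard error of the mean difference
t = (xbar - 0) / se          # H0: mean difference = 0
df = n - 1

t_star = 2.365               # Table C, df = 7, 95% confidence
margin = t_star * se
low, high = xbar - margin, xbar + margin

print(f"t = {t:.3f} on {df} degrees of freedom")
print(f"95% CI: {low:.4f} to {high:.4f}")
```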
Chapter 18
Two-Sample Problems
Two-Sample Problems


The goal of inference is to compare the responses to
two treatments or to compare the characteristics of
two populations.
We have a separate sample from each treatment or
each population.

Each sample is separate. The units are not matched, and
the samples can be of differing sizes.
Case Study
Exercise and Pulse Rates
A study is performed to compare the mean resting pulse rate of adult subjects who regularly exercise to the mean resting pulse rate of those who do not regularly exercise.
                n     mean   std. dev.
Exercisers      29    66     8.6
Nonexercisers   31    75     9.0
This is an example of when to use the two-sample t procedures.
Conditions for Comparing Two Means

We have two independent SRSs, from two distinct populations
– that is, one sample has no influence on the other; matching violates independence
– we measure the same variable for both samples.
Both populations are Normally distributed
– the means and standard deviations of the populations are unknown
– in practice, it is enough that the distributions have similar shapes and that the data have no strong outliers.
Two-Sample t Procedures

In order to perform inference on the difference of two means (μ1 – μ2), we’ll need the standard deviation of the observed difference $\bar{x}_1 - \bar{x}_2$:
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Two-Sample t Confidence Interval


Draw an SRS of size n1 from a Normal population with unknown mean μ1, and draw an independent SRS of size n2 from another Normal population with unknown mean μ2.
A confidence interval for μ1 – μ2 is:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
– here t* is the critical value for confidence level C for the t density curve. The degrees of freedom are equal to the smaller of n1 – 1 and n2 – 1.
Case Study
Exercise and Pulse Rates
Find a 95% confidence interval for the difference in
population means (nonexercisers minus exercisers).
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (75 - 66) \pm 2.048 \sqrt{\frac{(9.0)^2}{31} + \frac{(8.6)^2}{29}}$$
= 9 ± 4.65
= 4.35 to 13.65
“We are 95% confident that the difference in mean
resting pulse rates (nonexercisers minus exercisers) is
between 4.35 and 13.65 beats per minute.”
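The interval can be checked with a short calculation. In this sketch the function name `two_sample_t_interval` is hypothetical, and t* = 2.048 is the Table C value for 28 degrees of freedom (the smaller of 31 – 1 and 29 – 1).

```python
import math

def two_sample_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """Two-sample t interval: (x1 - x2) ± t* · sqrt(s1²/n1 + s2²/n2)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - t_star * se, diff + t_star * se

# Sample 1 = nonexercisers, sample 2 = exercisers, as in the case study.
low, high = two_sample_t_interval(75, 9.0, 31, 66, 8.6, 29, t_star=2.048)
print(f"95% CI: {low:.2f} to {high:.2f}")
```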
Two-Sample t Significance Tests


Draw an SRS of size n1 from a Normal population with unknown mean μ1, and draw an independent SRS of size n2 from another Normal population with unknown mean μ2.
To test the hypothesis H0: μ1 = μ2, the test statistic is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
(the second form holds because μ1 – μ2 = 0 under H0).
Use P-values for the t density curve. The degrees of freedom are equal to the smaller of n1 – 1 and n2 – 1.
P-value for Testing Two Means

Ha: μ1 > μ2
P-value is the probability of getting a value as large or larger than the observed test statistic (t) value.
Ha: μ1 < μ2
P-value is the probability of getting a value as small or smaller than the observed test statistic (t) value.
Ha: μ1 ≠ μ2
P-value is two times the probability of getting a value as large or larger than the absolute value of the observed test statistic (t) value.
Case Study
Exercise and Pulse Rates
Is the mean resting pulse rate of adult subjects who
regularly exercise different from the mean resting pulse
rate of those who do not regularly exercise?

Null: The mean resting pulse rate of adult subjects who regularly exercise is the same as the mean resting pulse rate of those who do not regularly exercise. [H0: μ1 = μ2]

Alt: The mean resting pulse rate of adult subjects who regularly exercise is different from the mean resting pulse rate of those who do not regularly exercise. [Ha: μ1 ≠ μ2]
Degrees of freedom = 28 (smaller of 31 – 1 and 29 – 1).
Case Study
1. Hypotheses: H0: μ1 = μ2, Ha: μ1 ≠ μ2
2. Test Statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{75 - 66}{\sqrt{\frac{(9.0)^2}{31} + \frac{(8.6)^2}{29}}} = 3.961$$
3. P-value:
P-value = 2P(T > 3.961) = 0.000207 (using a computer)
P-value is smaller than 2(0.0005) = 0.0010 since t = 3.961 is greater than t* = 3.674 (upper tail area = 0.0005) (Table C)
4. Conclusion:
Since the P-value is smaller than α = 0.001, there is very strong evidence that the mean resting pulse rates are different for the two populations (nonexercisers and exercisers).
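The test statistic and the conservative degrees of freedom used here can be reproduced as follows. This is a sketch: the function name `two_sample_t` is hypothetical, and 3.674 is the Table C critical value quoted above, not a computed quantity.

```python
import math

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t statistic for H0: mu1 = mu2, with conservative df."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t = (x1 - x2) / se
    df = min(n1 - 1, n2 - 1)   # smaller of n1 - 1 and n2 - 1, as in the text
    return t, df

# Sample 1 = nonexercisers, sample 2 = exercisers.
t, df = two_sample_t(75, 9.0, 31, 66, 8.6, 29)
t_star = 3.674                 # Table C, df = 28, upper tail area 0.0005

print(f"t = {t:.3f}, df = {df}")
print("two-sided P-value below 0.001:", abs(t) > t_star)
```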
Avoid Inference About
Standard Deviations




There are methods for inference about the standard
deviations of Normal populations.
Most software packages have methods for comparing
the standard deviations.
However, these methods are extremely sensitive to
non-Normal distributions and this lack of robustness
does not improve in large samples.
Hence it is not recommended that one do
inference about population standard deviations in
basic statistical practice.