Transcript 9.2

Chapter 9
Section 2
Confidence Intervals about a
Population Mean in Practice where
the Population Standard Deviation
is Unknown
Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 9 Section 2 – Slide 1 of 25
Chapter 9 – Section 2
● Learning objectives
1 Know the properties of the t-distribution
2 Determine t-values
3 Construct and interpret a confidence interval about a
population mean
Chapter 9 – Section 2
● In Section 1, we assumed that we knew the
population standard deviation σ
● Since we did not even know the population mean μ,
assuming that we knew σ seems unrealistic
● In this section, we construct confidence intervals
in the case where we do not know the
population standard deviation
● This is much more realistic
Chapter 9 – Section 2
● If we don’t know the population standard
deviation σ, we obviously can’t use the formula
Margin of error = 1.96 • σ / √ n
because we have no number to use for σ
● However, just as we can use the sample mean
to approximate the population mean, we can
also use the sample standard deviation to
approximate the population standard deviation
Chapter 9 – Section 2
● Because we’ve changed our formula (by using s
instead of σ), we can’t use the normal
distribution any more
● Instead of the normal distribution, we use the
Student’s t-distribution
● This distribution was developed specifically for
the situation when σ is not known
Chapter 9 – Section 2
● Properties of the t-distribution
● Several properties of the Student’s t-distribution
are familiar
  ○ Just like the standard normal distribution, it is centered at 0 and
symmetric about 0
  ○ Just like the normal curve, the total area under the
Student’s t curve is 1, the area to the left of 0 is ½, and
the area to the right of 0 is also ½
  ○ Just like the normal curve, as t increases or decreases without
bound, the Student’s t curve gets close to, but never reaches, 0
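The properties listed above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names t_pdf and simpson, the choice of 5 degrees of freedom, and the integration range of -60 to 60 are illustrative assumptions, not part of the slides.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def simpson(f, a, b, intervals=20000):
    """Composite Simpson's rule on [a, b]; intervals must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

df = 5  # assumed degrees of freedom, chosen only for illustration

# Symmetry about 0: the curve has the same height at x and -x.
symmetric = all(math.isclose(t_pdf(x, df), t_pdf(-x, df)) for x in (0.5, 1.0, 2.5))

# Total area is (approximately) 1 and the area to the left of 0 is 1/2.
total_area = simpson(lambda x: t_pdf(x, df), -60, 60)
left_area = simpson(lambda x: t_pdf(x, df), -60, 0)

# Far from 0 the curve is tiny but still positive.
far_tail = t_pdf(100, df)

print(symmetric, round(total_area, 4), round(left_area, 4), far_tail > 0)
```

The integration range is finite, so the computed areas are approximations; the tails beyond it contribute a negligible amount for this choice of degrees of freedom.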
Chapter 9 – Section 2
● So what’s different?
● Unlike the normal, there are many different
“standard” t-distributions
  ○ There is a “standard” one with 1 degree of freedom
  ○ There is a “standard” one with 2 degrees of freedom
  ○ There is a “standard” one with 3 degrees of freedom
  ○ Etc.
● The number of degrees of freedom is crucial for
the t-distributions
Chapter 9 – Section 2
● When σ is known, the z-score
z = (x̄ – μ) / (σ / √n)
follows a standard normal distribution
● When σ is not known, the t-statistic
t = (x̄ – μ) / (s / √n)
follows a t-distribution with n – 1 degrees of
freedom
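As an illustration of the t-statistic, here is a small sketch that computes it from raw data. The sample values and the hypothesized mean mu are made up for the example, and t_statistic is a hypothetical helper name.

```python
import math
import statistics

def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # divides by n - 1, as the sample standard deviation should
    return (x_bar - mu) / (s / math.sqrt(n))

sample = [9.8, 10.4, 10.1, 9.6, 10.6]  # hypothetical measurements
mu = 10.0                              # hypothetical population mean

t = t_statistic(sample, mu)
df = len(sample) - 1  # degrees of freedom: n - 1
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Note that statistics.stdev (not statistics.pstdev) is the right choice here, because s estimates σ from a sample.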
Chapter 9 – Section 2
● Comparing three curves
  ○ The standard normal curve
  ○ The t curve with 14 degrees of freedom
  ○ The t curve with 4 degrees of freedom
Chapter 9 – Section 2
● Values for the t-distribution can be calculated in
much the same way as values for the normal
distribution
  ○ Using tables (such as Table V on the inside back
cover)
  ○ Using technology (such as Excel, MINITAB,
calculators, StatCrunch, etc.)
● Because t-distribution tables are not complete, it
is suggested that the calculations be done with
one of the technology methods
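For readers without one of the packages named above, the following sketch shows one way technology can find a t critical value: integrate the t density numerically and search for the point with the desired upper-tail area. It uses only the Python standard library; the function names, the Simpson's rule integration, and the bisection search are choices made for this illustration, not the method used by any particular package.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_area(t, df, intervals=1000):
    """P(T > t) for t >= 0: one half minus the area between 0 and t (Simpson's rule)."""
    h = t / intervals
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_area, df):
    """Value t with upper-tail area equal to tail_area, found by bisection."""
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, df) > tail_area:  # widen until the answer is bracketed
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > tail_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (5, 15, 30, 100, 1000):
    print(df, round(t_critical(0.025, df), 3))
```

The printed values should agree with a printed t table to about three decimal places. Dedicated statistical software will be faster and more precise than this sketch.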
Chapter 9 – Section 2
● Critical values for various degrees of freedom for
the t-distribution are (compared to the normal)
n        Degrees of Freedom   t0.025
6        5                    2.571
16       15                   2.131
31       30                   2.042
101      100                  1.984
1001     1000                 1.962
Normal   “Infinite”           1.960
Chapter 9 – Section 2
● The difference between the two formulas
z = (x̄ – μ) / (σ / √n)
t = (x̄ – μ) / (s / √n)
is that the sample standard deviation s is used to
approximate the population standard deviation σ
● The z-score has a standard normal distribution; the
t-statistic (or the t-score) has a
t-distribution
Chapter 9 – Section 2
● A 95% confidence interval, with σ unknown, is
x̄ – t0.025 • s / √n
to
x̄ + t0.025 • s / √n
where t0.025 is the critical value for the
t-distribution with (n – 1) degrees of freedom
Chapter 9 – Section 2
● The different confidence intervals with t0.025
would be
  ○ For n = 6, the sample mean ± 2.571 • s / √6
  ○ For n = 16, the sample mean ± 2.131 • s / √16
  ○ For n = 31, the sample mean ± 2.042 • s / √31
  ○ For n = 101, the sample mean ± 1.984 • s / √101
  ○ For n = 1001, the sample mean ± 1.962 • s / √1001
  ○ When σ is known, the sample mean ± 1.960 • σ / √n
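To see how these margins of error behave, the sketch below evaluates them for one fixed sample standard deviation. The critical values are the ones quoted in this section; the value s = 1.0 is an assumption made purely so the margins can be compared across sample sizes.

```python
import math

# t0.025 critical values quoted in this section, keyed by sample size n
critical_values = {6: 2.571, 16: 2.131, 31: 2.042, 101: 1.984, 1001: 1.962}

s = 1.0  # hypothetical sample standard deviation, held fixed for comparison

# Margin of error = t0.025 * s / sqrt(n)
margins = {n: t * s / math.sqrt(n) for n, t in critical_values.items()}

for n, margin in margins.items():
    print(f"n = {n:4d}: sample mean ± {margin:.3f}")
```

The margin shrinks as n grows for two reasons: the critical value gets smaller and the divisor √n gets larger. The second effect is by far the larger one.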
Chapter 9 – Section 2
● In general, the (1 – α) • 100% confidence
interval, when σ is unknown, is
x̄ – tα/2 • s / √n
to
x̄ + tα/2 • s / √n
where tα/2 is the critical value for the
t-distribution with (n – 1) degrees of freedom
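The general formula translates directly into a short function. This is a sketch under stated assumptions: t_confidence_interval is a hypothetical helper, the critical value is supplied by the caller (from a table or from technology), and the summary statistics in the usage lines are invented for illustration.

```python
import math

def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) = x_bar -/+ t_crit * s / sqrt(n).

    t_crit must be the critical value t_{alpha/2} for n - 1 degrees of freedom.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical summary statistics: n = 16, sample mean 50.0, s = 8.0.
# For 95% confidence and 15 degrees of freedom, t0.025 = 2.131.
lower, upper = t_confidence_interval(50.0, 8.0, 16, 2.131)
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

Passing the critical value in explicitly keeps the function independent of any particular table or statistics package.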
Chapter 9 – Section 2
● As the sample size n gets large, there is less
and less of a difference between the critical
values for the normal and the critical values for
the t-distribution
● It is correct to use the t-distribution when σ is not
known
  ○ When using technology, always use the t-distribution
  ○ When doing a rough assessment by hand, the normal
critical values can be used, particularly when n is
large, for example if n is 30 or more
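How rough is the hand approximation? The sketch below compares the t0.025 critical values quoted in this section with the normal critical value, which is computed with Python's standard statistics.NormalDist. The variable names are illustrative.

```python
import statistics

# Normal critical value with 0.025 in the upper tail (about 1.960)
z = statistics.NormalDist().inv_cdf(0.975)

# t0.025 critical values quoted in this section, keyed by sample size n
t_values = {6: 2.571, 16: 2.131, 31: 2.042, 101: 1.984, 1001: 1.962}

# Percentage by which the t critical value exceeds the normal one
excess = {n: 100 * (t / z - 1) for n, t in t_values.items()}

for n, pct in excess.items():
    print(f"n = {n:4d}: t critical value is {pct:.1f}% larger than z")
```

An interval built with the normal critical value is too narrow by the percentage shown, which is large for very small samples and only a few percent once n is around 30.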
Chapter 9 – Section 2
● When do the t-distribution and the normal differ by
a lot?
● In either of two situations
  ○ When the sample size n is small (particularly if n is 10
or less), or
  ○ When the confidence level needs to be high
(particularly if α is 0.005 or lower)
● When n and the tail area are both small, for example
with 5 degrees of freedom (n = 6) and a tail area of
0.001, the t-distribution critical value is 5.893
compared to the normal critical value of 3.090
Chapter 9 – Section 2
● Assume that we want to estimate the average
weight of a particular type of very rare fish
  ○ We are only able to borrow 7 specimens of this fish
  ○ The average weight of these was 1.38 kg (the sample
mean)
  ○ The standard deviation of these was 0.29 kg (the
sample standard deviation)
● What is a 95% confidence interval for the true
mean weight?
Chapter 9 – Section 2
● n = 7, the critical value t0.025 for 6 degrees of
freedom is 2.447
● Our confidence interval thus is
1.38 – 2.447 • 0.29 / √7 = 1.11
to
1.38 + 2.447 • 0.29 / √7 = 1.65
or (1.11, 1.65)
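The arithmetic for the fish example can be reproduced in a few lines. The sketch also computes, for comparison only, the interval that would result from wrongly using the normal critical value 1.960; that comparison is an addition for illustration and is not part of the worked example.

```python
import math

n, x_bar, s = 7, 1.38, 0.29  # sample size, sample mean (kg), sample standard deviation (kg)
t_crit = 2.447               # t0.025 with n - 1 = 6 degrees of freedom
z_crit = 1.960               # normal critical value, shown only for comparison

standard_error = s / math.sqrt(n)

t_interval = (x_bar - t_crit * standard_error, x_bar + t_crit * standard_error)
z_interval = (x_bar - z_crit * standard_error, x_bar + z_crit * standard_error)

print("t-based interval: (%.2f, %.2f)" % t_interval)
print("z-based interval: (%.2f, %.2f)" % z_interval)
```

With only 7 specimens, the z-based interval is noticeably narrower than the t-based one, so it would overstate how precisely the mean weight is known.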
Chapter 9 – Section 2
● Outliers are always a concern, but they are even
more of a concern for confidence intervals using
the t-distribution
  ○ The number of values n is small, so each outlier has a
major effect on the data set
  ○ The sample mean is sensitive to outliers
  ○ The sample standard deviation is sensitive to outliers
● This is a problem!
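A small experiment makes the problem concrete. In the sketch below, both data sets are invented: the second is the first with one value replaced by an outlier. The critical value 2.447 is t0.025 for 6 degrees of freedom, and summarize is a hypothetical helper name.

```python
import math
import statistics

T_CRIT = 2.447  # t0.025 for n - 1 = 6 degrees of freedom

def summarize(sample):
    """Return (mean, sample standard deviation, 95% interval width) for n = 7 values."""
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)
    width = 2 * T_CRIT * s / math.sqrt(len(sample))
    return x_bar, s, width

clean = [1.05, 1.20, 1.32, 1.38, 1.45, 1.55, 1.71]         # hypothetical weights in kg
with_outlier = [1.05, 1.20, 1.32, 1.38, 1.45, 1.55, 3.90]  # last value replaced by an outlier

mean_clean, s_clean, width_clean = summarize(clean)
mean_out, s_out, width_out = summarize(with_outlier)

print(f"clean:        mean {mean_clean:.2f}, s {s_clean:.2f}, interval width {width_clean:.2f}")
print(f"with outlier: mean {mean_out:.2f}, s {s_out:.2f}, interval width {width_out:.2f}")
```

A single unusual value out of seven shifts the sample mean and inflates the sample standard deviation several times over, so the resulting interval is both moved and much wider.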
Chapter 9 – Section 2
● So what can we do?
  ○ We must always check that the outlier is a
legitimate data value (and not just a typo)
  ○ We can collect more data, for example, to increase n
to over 30
● If neither method above will work, i.e. if the data
value is a legitimate value and we are not able
to collect more data, then there are other
methods (“nonparametric methods”) that could
apply … these are in Chapter 15
Summary: Chapter 9 – Section 2
● We used values from the normal distribution
when we knew the value of the population
standard deviation σ
● When we do not know σ, we estimate σ using
the sample standard deviation s
● We use values from the t-distribution when we
use s instead of σ, i.e. when we don’t know the
population standard deviation