95% confidence interval

Meaning and use of
confidence intervals
(Session 05)
SADC Course in Statistics
Learning Objectives
By the end of this session, you will be able to
• explain the meaning of a confidence interval
• explain the role of the t-distribution in
computing a confidence interval for the
population mean
• calculate a confidence interval for the
population mean using sample data
• state the assumptions underlying the above
calculation
Revision on standard errors
Recall from the previous session that
• The standard error provides a measure of
the precision of the sample mean
• the formula s/n gives the standard error
of the mean when simple random
sampling is used
• A low standard error indicates that the
sample mean has high precision, i.e. the
sample mean is a “good” estimate of the
population mean
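As a minimal sketch of the formula $s/\sqrt{n}$, the standard error can be computed with Python's standard library. The data values and the function name `standard_error` are hypothetical, chosen only for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean, s / sqrt(n), assuming simple random sampling."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divisor n - 1)
    return s / math.sqrt(n)

# Hypothetical data (illustrative only): persons per room in 10 households
persons_per_room = [4, 7, 12, 5, 9, 3, 8, 11, 6, 10]

mean = statistics.mean(persons_per_room)
se = standard_error(persons_per_room)
print(f"sample mean = {mean:.2f}, standard error = {se:.2f}")
```

A smaller standard error here would indicate a more precise sample mean.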
Standard errors more generally…
• Whenever sample data are used to find an
estimate of a population parameter, it should be
accompanied by a measure of its precision!
• The formula $s/\sqrt{n}$ applies only when using $\bar{x}$
as an estimate of the population mean $\mu$.
Formulae will differ for other estimates,
depending on how the sample was selected.
• The higher the standard error, the less
precise is the estimate - but how high should
it be before we start to get worried about our
estimate?
Confidence Interval for $\mu$
Instead of using a point estimate, it is usually
more informative to summarise using an
interval which is likely (i.e. with 95%
confidence) to contain $\mu$.
This is called an interval estimate or a
confidence interval (C.I.)
For example, we could report that the mean
landholding size of households in Kilindi district in
Tanzania is 7.62 acres with 95% confidence
interval (6.95, 8.28), i.e. we are 95%
confident that the interval (6.95, 8.28) includes
the true value $\mu$.
Finding the Confidence Interval
The 95% confidence limits for $\mu$ (lower and
upper) are calculated as:
$\bar{x} - t_{n-1}\,(s/\sqrt{n})$ and $\bar{x} + t_{n-1}\,(s/\sqrt{n})$
where $t_{n-1}$ is the 5% level for the
t-distribution with (n-1) degrees of freedom,
i.e. the value that leaves 2½% in each tail.
Statistical tables and statistical software
give t-values.
[Figure: t-distribution centred at 0, with 2½% of the area below -t and 2½% above t.]
t-values for computation of 95% C.I.
P = two-sided tail probability (%); use the P = 5 column for a 95% C.I.

Degrees of freedom    P = 10    P = 5    P = 2
 1                     6.31     12.7     31.8
 2                     2.92     4.30     6.96
 3                     2.35     3.18     4.54
 4                     2.13     2.78     3.75
 5                     2.02     2.57     3.36
 6                     1.94     2.45     3.14
 7                     1.89     2.36     3.00
 8                     1.86     2.31     2.90
 9                     1.83     2.26     2.82
10                     1.81     2.23     2.76
20                     1.72     2.09     2.53
30                     1.70     2.04     2.46
40                     1.68     2.02     2.42
60                     1.67     2.00     2.39
 ∞                     1.64     1.96     2.33
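The calculation of 95% confidence limits from a sample mean, standard deviation and sample size can be sketched as follows. The t-values are the P = 5 entries from this session's table. The function names are hypothetical, and the rule for untabulated degrees of freedom (fall back to the nearest smaller tabulated value, which gives a slightly wider interval) is an assumption of this sketch, not part of the session.

```python
import math

# Two-sided 5% t-values by degrees of freedom (P = 5 column of the t-table)
T_5_PERCENT = {
    1: 12.7, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57,
    6: 2.45, 7: 2.36, 8: 2.31, 9: 2.26, 10: 2.23,
    20: 2.09, 30: 2.04, 40: 2.02, 60: 2.00,
}

def t_value(df):
    """Tabulated t-value for df degrees of freedom.

    Assumption: for a df not in the table, use the largest tabulated df
    that does not exceed it. This errs on the side of a wider interval.
    """
    if df < 1:
        raise ValueError("need at least 2 observations")
    usable = [k for k in T_5_PERCENT if k <= df]
    return T_5_PERCENT[max(usable)]

def confidence_interval_95(mean, sd, n):
    """Return (lower, upper) = mean -/+ t_{n-1} * sd / sqrt(n)."""
    margin = t_value(n - 1) * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical summary statistics (illustrative only)
lower, upper = confidence_interval_95(mean=20.0, sd=4.0, n=16)
print(f"95% C.I.: ({lower:.2f}, {upper:.2f})")
```

Statistical software would normally supply the exact t-value for any degrees of freedom, so the lookup table is only a stand-in for that step.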
Correct interpretation of C.I.’s
If we sampled repeatedly and found a 95%
C.I. each time, about 95% of them would
include the true $\mu$, i.e. before the sample is
drawn, there is a 95% chance that a single
interval will include $\mu$.
[Figure: plot with vertical axis from 10 to 13 and horizontal axis from 0 to 50.]
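The repeated-sampling idea can be checked with a small simulation. This is only a sketch: the population mean and standard deviation, the number of repeats, and the random seed are all hypothetical choices, and the population is assumed to be normal.

```python
import math
import random
import statistics

N = 10      # sample size
T_9 = 2.26  # two-sided 5% t-value for N - 1 = 9 degrees of freedom

def coverage(true_mean, true_sd, repeats=2000, seed=1):
    """Fraction of 95% confidence intervals that contain true_mean
    when samples of size N are drawn repeatedly from a normal population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(N)]
        xbar = statistics.mean(sample)
        margin = T_9 * statistics.stdev(sample) / math.sqrt(N)
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / repeats

# Hypothetical population values (illustrative only)
print(f"proportion of intervals containing the true mean: {coverage(11.5, 2.0):.3f}")
```

The printed proportion should be close to 0.95, although it will vary a little with the seed and the number of repeats.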
An example (persons per room)
In Practical 3, the first of 50 samples of size
10 gave mean=7.7, std.dev.=3.7 for the
number of persons per room.
Hence a 95% confidence interval for the true
mean number of persons per room:
$7.7 \pm t_{9}\,(s/\sqrt{n}) = 7.7 \pm 2.26\,(3.7/\sqrt{10})$
$= 7.7 \pm 2.64$
$= (5.06, 10.34)$, i.e. about (5.1, 10.3)
Can you interpret this interval? Write down
your answer. We will then discuss.
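For reference, the arithmetic in this example can be set out step by step, using only the values quoted above (mean 7.7, standard deviation 3.7, n = 10, and $t_{9} = 2.26$ from the t-table):

```latex
\begin{align*}
\text{s.e.}(\bar{x}) &= \frac{s}{\sqrt{n}} = \frac{3.7}{\sqrt{10}} \approx 1.17 \\
t_{9} \times \text{s.e.}(\bar{x}) &\approx 2.26 \times 1.17 \approx 2.64 \\
\bar{x} \pm 2.64 &= (7.7 - 2.64,\; 7.7 + 2.64) = (5.06,\; 10.34)
\end{align*}
```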
Underlying assumptions
The above computation of a confidence
interval assumes that the data have a normal
distribution.
More exactly, it requires the sampling
distribution of the mean to have a normal
distribution.
What happens if data are not normal?
Not a serious problem if sample size is large
because of the Central Limit Theorem (see
Session 4)
Using the Central Limit Theorem
Recall this theorem says that the sampling
distribution of the mean is approximately
normal for large sample sizes.
So even when data are not normal, the
formula for a 95% confidence interval will
give an interval whose “confidence” is still
high - approximately 95%.
It is better to attach some measure of uncertainty
than to worry about the exact confidence level.
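A simulation can illustrate how the "confidence" behaves when the data are not normal. The sketch below draws samples from a skewed exponential population (a hypothetical choice, as are the sample sizes, number of repeats and seed) and counts how often the usual interval contains the true mean.

```python
import math
import random
import statistics

def coverage_exponential(n, t, repeats=2000, seed=2):
    """Fraction of nominal 95% intervals that contain the true mean when
    samples of size n come from an exponential population with mean 1.

    t is the two-sided 5% t-value for n - 1 degrees of freedom.
    """
    rng = random.Random(seed)
    true_mean = 1.0
    hits = 0
    for _ in range(repeats):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar = statistics.mean(sample)
        margin = t * statistics.stdev(sample) / math.sqrt(n)
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / repeats

# n = 5 uses t for 4 d.f. (2.78); n = 61 uses t for 60 d.f. (2.00)
for n, t in [(5, 2.78), (61, 2.00)]:
    print(f"n = {n:2d}: proportion containing the true mean = "
          f"{coverage_exponential(n, t):.3f}")
```

With skewed data the proportion is typically somewhat below 0.95 for the small sample and closer to 0.95 for the larger one, which is the practical message of the Central Limit Theorem here.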
Note: The formula for a confidence interval on
slide 6 ("Finding the Confidence Interval") applies
when estimation of $\mu$ is of interest.
Different assumptions on the data, and
interest in other population parameters, will
lead to different confidence intervals.
Practical work follows …