Transcript Document

Statistical Techniques I
EXST7005
Confidence Intervals
Confidence intervals
An expression of what we believe to be a range
of values that is likely to contain the true value of
some parameter is called a confidence interval.
 We can calculate confidence intervals for means
($\mu$) and variances ($\sigma^2$).

Confidence intervals for t and Z
distributions

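Confidence intervals based on the t and Z distributions start with a t or
Z probability statement.
$$P(-t_{\alpha/2} \le t \le t_{\alpha/2}) = 1 - \alpha$$
$$P\left(-t_{\alpha/2} \le \frac{\bar{Y} - \mu}{S_{\bar{Y}}} \le t_{\alpha/2}\right) = 1 - \alpha$$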
Confidence intervals for t and Z
distributions (continued)

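Which is modified to express an interval about $\mu$
instead of t (or Z).
$$P(-t_{\alpha/2} S_{\bar{Y}} \le \bar{Y} - \mu \le t_{\alpha/2} S_{\bar{Y}}) = 1 - \alpha$$
$$P(-\bar{Y} - t_{\alpha/2} S_{\bar{Y}} \le -\mu \le -\bar{Y} + t_{\alpha/2} S_{\bar{Y}}) = 1 - \alpha$$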
Confidence intervals for t and Z
distributions (continued)

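The final form is given below. The expression for
Z has an identical derivation.
$$P(\bar{Y} - t_{\alpha/2} S_{\bar{Y}} \le \mu \le \bar{Y} + t_{\alpha/2} S_{\bar{Y}}) = 1 - \alpha$$
$$P(\bar{Y} - Z_{\alpha/2} \sigma_{\bar{Y}} \le \mu \le \bar{Y} + Z_{\alpha/2} \sigma_{\bar{Y}}) = 1 - \alpha$$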
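The meaning of the probability statement can be checked by simulation. The sketch below is only an illustration: it assumes a hypothetical normal population (mean 10, standard deviation 2), draws repeated samples of 12, and counts how often the t interval contains the true mean. The value 2.201 is the tabulated two-tailed 5% t value for 11 d.f.; the function names are made up for this example.

```python
import random
import statistics

def t_interval(sample, t_crit):
    """Return (lower, upper) for the mean: Ybar -/+ t * S_Ybar."""
    n = len(sample)
    ybar = statistics.mean(sample)
    std_err = statistics.stdev(sample) / n ** 0.5  # S_Ybar = S / sqrt(n)
    return ybar - t_crit * std_err, ybar + t_crit * std_err

def coverage(mu, sigma, n, t_crit, reps, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = t_interval(sample, t_crit)
        if lower <= mu <= upper:
            hits += 1
    return hits / reps

# Hypothetical population values, chosen only for illustration.
rate = coverage(mu=10.0, sigma=2.0, n=12, t_crit=2.201, reps=5000)
print(f"observed coverage: {rate:.3f}")
```

With alpha = 0.05 the observed fraction should land close to 0.95, which is what the 1 - alpha on the right-hand side of the statement promises.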
Confidence intervals for t and Z
distributions (continued)

A common short notation for the interval in the
probability statement is given as
$$\bar{Y} \pm t_{\alpha/2} S_{\bar{Y}}$$
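The same short notation applies to the Z form when the population standard deviation is treated as known. A minimal sketch, assuming made-up inputs (mean 50, sigma 4, n 16) and a hypothetical helper name `z_interval`; the critical value comes from Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_interval(ybar, sigma, n, alpha=0.05):
    """Ybar -/+ Z_{alpha/2} * sigma_Ybar, assuming sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    half_width = z_crit * sigma / n ** 0.5        # sigma_Ybar = sigma / sqrt(n)
    return ybar - half_width, ybar + half_width

# Hypothetical inputs, chosen only for illustration.
lower, upper = z_interval(ybar=50.0, sigma=4.0, n=16)
print(f"50.0 +/- {(upper - lower) / 2:.3f}")
```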
Confidence intervals for variance

Sample variances, scaled as SS/$\sigma^2$, follow a Chi square distribution. The
confidence interval for variance is based on the
Chi Square distribution.
$$P(\chi^2_{\text{lower}} \le \chi^2 \le \chi^2_{\text{upper}}) = 1 - \alpha$$
$$P\left(\chi^2_{\text{lower}} \le \frac{SS}{\sigma^2} \le \chi^2_{\text{upper}}\right) = 1 - \alpha$$
Confidence intervals for variance
(continued)

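Which is solved to isolate $\sigma^2$.
$$P\left(\frac{1}{\chi^2_{\text{lower}}} \ge \frac{\sigma^2}{SS} \ge \frac{1}{\chi^2_{\text{upper}}}\right) = 1 - \alpha$$
$$P\left(\frac{1}{\chi^2_{\text{upper}}} \le \frac{\sigma^2}{SS} \le \frac{1}{\chi^2_{\text{lower}}}\right) = 1 - \alpha$$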
Confidence intervals for variance
(continued)

Giving the expression,
$$P\left(\frac{SS}{\chi^2_{\text{upper}}} \le \sigma^2 \le \frac{SS}{\chi^2_{\text{lower}}}\right) = 1 - \alpha$$
Confidence intervals for variance
(continued)

Notice that the upper tabular Chi square value
comes out in the lower bound and the lower Chi
square in the upper bound.
$$P\left(\frac{SS}{\chi^2_{\text{upper}}} \le \sigma^2 \le \frac{SS}{\chi^2_{\text{lower}}}\right) = 1 - \alpha$$
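The swap of the tabular values is easy to get wrong when computing by hand, so it can help to write it down once. A minimal sketch, assuming a hypothetical helper `variance_ci` and made-up numbers chosen for easy arithmetic:

```python
def variance_ci(ss, chi2_lower, chi2_upper):
    """Return (lower, upper) bounds for sigma^2 from SS and tabled chi-square values."""
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    # Dividing by the larger chi-square value gives the smaller bound.
    return ss / chi2_upper, ss / chi2_lower

# Hypothetical numbers, chosen only for illustration.
lower, upper = variance_ci(ss=10.0, chi2_lower=2.0, chi2_upper=5.0)
print(lower, upper)
```

Here SS divided by the upper value (5) gives the lower bound and SS divided by the lower value (2) gives the upper bound.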
Notes on Confidence intervals
One sided intervals are possible, but uncommon.
 Confidence intervals are one of the most
common expressions in statistics, frequently
occurring in publications.
 Confidence intervals are not commonly
calculated in statistical software programs and
usually must be done by hand.

Example 1
We receive a shipment of apples that are
supposed to be "premium apples", with a
diameter of at least 2.5 inches. We will take a
sample of 12 apples, and place a confidence
interval on the mean. The sample values for the
12 apples are:
 2.9, 2.1, 2.4, 2.8, 3.1, 2.8, 2.7, 3.0, 2.4, 3.2, 2.3,
3.4

Example 1 (continued)
Do we want the Std dev or Std error?
 SAS PROC UNIVARIATE Output

Moments
N          12          Sum Wgts   12
Mean       0.258333    Sum        3.1
Std Dev    0.394181    Variance   0.155379
Skewness   -0.11842    Kurtosis   -0.8353
USS        2.51        CSS        1.709167
CV         152.5863    Std Mean   0.11379
T:Mean=0   2.270258    Pr>|T|     0.0443
Num ^= 0   12          Num > 0    8
M(Sign)    2           Pr>=|M|    0.3877
Sgn Rank   25          Pr>=|S|    0.0493
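Several of the moments can be reproduced directly from the 12 sample values. The sketch below uses only Python's standard `statistics` module. The listed Mean (0.258333) and Sum (3.1) match the data only after 2.5 is subtracted from each diameter, so the code applies that shift; the variable names are illustrative.

```python
import statistics

diameters = [2.9, 2.1, 2.4, 2.8, 3.1, 2.8, 2.7, 3.0, 2.4, 3.2, 2.3, 3.4]
# The output above appears to be computed on diameter minus 2.5,
# so the same shift is applied here.
shifted = [d - 2.5 for d in diameters]

n = len(shifted)
mean = statistics.mean(shifted)
variance = statistics.variance(shifted)  # divides by n - 1
std_dev = variance ** 0.5
css = variance * (n - 1)                 # corrected sum of squares
uss = sum(y * y for y in shifted)        # uncorrected sum of squares
std_mean = std_dev / n ** 0.5            # standard error of the mean
t_stat = mean / std_mean                 # T: Mean=0

rows = [("N", n), ("Mean", mean), ("Std Dev", std_dev), ("Variance", variance),
        ("USS", uss), ("CSS", css), ("Std Mean", std_mean), ("T:Mean=0", t_stat)]
for label, value in rows:
    print(f"{label:<10}{value:.6f}")
```

Note that Std Mean is just Std Dev divided by the square root of N, which is the distinction the question on this slide is pointing at.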
Example 1 (continued)
The standard deviation is the variation in
individual apples. If we wanted the interval that
contained 95% of the apples, we would use the
standard deviation.
 However, we have estimated a mean and we
want to place a confidence interval that
expresses our knowledge of this estimate. Is
our estimate of the mean good or poor? Is the
confidence interval narrow or wide?

Example 1 (continued)

Note the confidence interval about the mean,
and about individual observations.
$$P(\bar{Y} - t_{\alpha/2} S_{\bar{Y}} \le \mu \le \bar{Y} + t_{\alpha/2} S_{\bar{Y}}) = 1 - \alpha$$
$$P(\bar{Y} - t_{\alpha/2} S \le Y \le \bar{Y} + t_{\alpha/2} S) = 1 - \alpha$$
Example 1 (continued)
So we need the mean of the apples and the
standard error.
 Mean = 2.758333 (0.258333 from the SAS output, with 2.5 added back)
 Std Mean = 0.11379 (no adjustment
needed)
 We also need a t-value. With 12 apples and 11
d.f., our two tailed t-value at $\alpha = 0.05$ is 2.201.

Example 1 (continued)
So $\bar{Y} \pm t_{\alpha/2} S_{\bar{Y}}$, or 2.758 ± (2.201)(0.1138) =
2.758 ± 0.250, gives the interval. The best
expression is as a confidence interval probability
statement.
 $P(2.758 - 0.250 \le \mu \le 2.758 + 0.250) = 1 - \alpha$
 $P(2.508 \le \mu \le 3.008) = 0.95$
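The arithmetic can be repeated from the raw data, and the same few lines show how much wider the interval becomes when the standard deviation is used in place of the standard error. This is a sketch using the tabulated t value of 2.201 from the example; the interval for individual apples follows the lecture's form (mean plus or minus t times S), and the variable names are illustrative.

```python
import statistics

diameters = [2.9, 2.1, 2.4, 2.8, 3.1, 2.8, 2.7, 3.0, 2.4, 3.2, 2.3, 3.4]
T_CRIT = 2.201  # tabulated two-tailed t value, alpha = 0.05, 11 d.f.

n = len(diameters)
ybar = statistics.mean(diameters)
s = statistics.stdev(diameters)  # standard deviation of individual apples
s_ybar = s / n ** 0.5            # standard error of the mean

# Interval about the mean uses the standard error.
mean_ci = (ybar - T_CRIT * s_ybar, ybar + T_CRIT * s_ybar)
# Interval about individual observations uses the standard deviation.
obs_interval = (ybar - T_CRIT * s, ybar + T_CRIT * s)

print(f"mean:        {mean_ci[0]:.3f} to {mean_ci[1]:.3f}")
print(f"individuals: {obs_interval[0]:.3f} to {obs_interval[1]:.3f}")
```

Carried at full precision the upper limit for the mean comes out near 3.009; the 3.008 in the statement above reflects rounding the mean and the half-width before adding them.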

Example 2 - Variance CI
Place a confidence interval on the variance
estimate for the apple example. The variance
estimate from the SAS output is $S^2 = 0.155379$
and the corrected sum of squares is 1.709.
 The Chi square values for 11 d.f. are 3.816
(lower) and 21.92 (upper).

Example 2 - Variance CI (continued)

Recall,
$$P\left(\frac{SS}{\chi^2_{\text{upper}}} \le \sigma^2 \le \frac{SS}{\chi^2_{\text{lower}}}\right) = 1 - \alpha$$
Example 2 - Variance CI (continued)
Then
 $P(SS/\chi^2_{\text{upper}} \le \sigma^2 \le SS/\chi^2_{\text{lower}}) = 1 - \alpha$
 $P(1.709/21.92 \le \sigma^2 \le 1.709/3.816) = 0.95$
 $P(0.078 \le \sigma^2 \le 0.448) = 0.95$
 where the variance was $S^2 = 0.155379$
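The same limits can be recomputed from the raw apple data. The sketch below takes the tabulated Chi square values for 11 d.f. quoted in the example (3.816 and 21.92) as given constants and also prints how far each limit sits from the point estimate; variable names are illustrative.

```python
import statistics

diameters = [2.9, 2.1, 2.4, 2.8, 3.1, 2.8, 2.7, 3.0, 2.4, 3.2, 2.3, 3.4]
CHI2_LOWER, CHI2_UPPER = 3.816, 21.92  # tabulated values for 11 d.f.

n = len(diameters)
s2 = statistics.variance(diameters)  # sample variance, divides by n - 1
ss = s2 * (n - 1)                    # corrected sum of squares

lower = ss / CHI2_UPPER  # upper tabular value gives the lower limit
upper = ss / CHI2_LOWER  # lower tabular value gives the upper limit

print(f"P({lower:.3f} <= sigma^2 <= {upper:.3f}) = 0.95, S^2 = {s2:.6f}")
print(f"distance below S^2: {s2 - lower:.3f}, above S^2: {upper - s2:.3f}")
```

Unlike the interval for the mean, this interval is not symmetric about the estimate: the upper limit lies much farther from the sample variance than the lower limit does.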

A note on hypothesis testing

Hypothesis tests can be done by calculating a
confidence interval for the appropriate value of $\alpha$
and checking to see if the hypothesized value is
contained in the interval. This approach is used
in some SAS program output such as Analysis of
Variance.
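For the apple example this check takes one comparison. The sketch below assumes the hypothesized mean is the 2.5 inch premium threshold (the same hypothesis as the T:Mean=0 line in the SAS output, which was run on diameter minus 2.5) and uses the 95% limits from Example 1; the function name is hypothetical.

```python
def contains(interval, hypothesized):
    """True when the hypothesized value lies inside the confidence interval."""
    lower, upper = interval
    return lower <= hypothesized <= upper

mean_ci = (2.508, 3.008)  # 95% interval for the mean from Example 1
h0_mean = 2.5             # assumed hypothesized mean diameter, in inches

reject = not contains(mean_ci, h0_mean)
print("reject H0 at alpha = 0.05" if reject else "fail to reject H0 at alpha = 0.05")
```

Since 2.5 falls just below the lower limit, the hypothesis is rejected at the 5% level, which agrees with the Pr>|T| of 0.0443 in the SAS output.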
Summary

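Confidence intervals for $\mu$ (t or Z distribution) and
$\sigma^2$ (Chi square).
$$P(\bar{Y} - t_{\alpha/2} S_{\bar{Y}} \le \mu \le \bar{Y} + t_{\alpha/2} S_{\bar{Y}}) = 1 - \alpha$$
$$P\left(\frac{SS}{\chi^2_{\text{upper}}} \le \sigma^2 \le \frac{SS}{\chi^2_{\text{lower}}}\right) = 1 - \alpha$$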
Summary (continued)
This is a common and IMPORTANT calculation
 It is often not calculated by statistical software
 It can be used to perform statistical tests of
hypothesis
