Transcript: Lecture 6

Confidence intervals and
hypothesis testing
Petter Mostad
2005.10.03
Confidence intervals (repetition)
• Assume μ and σ² are some real numbers, and
assume the data X1, X2, …, Xn are a random sample
from N(μ, σ²).
– Then
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
– thus
$P(-1.96 \le Z \le 1.96) \approx 95\%$
– so
$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) \approx 95\%$
– and we say that
$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ is a
confidence interval for μ with 95% confidence, based
on the statistic $\bar{X}$.
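As a sketch, the interval above can be computed directly. Python is used here only for illustration (the lecture itself uses SPSS); the data and the "known" σ are made up:

```python
from statistics import NormalDist, mean
from math import sqrt

def ci_known_sigma(data, sigma, confidence=0.95):
    """Confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # 1.96 for 95%
    xbar = mean(data)
    half = z * sigma / sqrt(len(data))
    return xbar - half, xbar + half

# Illustrative data; sigma assumed known to be 2.0
low, high = ci_known_sigma([4.1, 5.2, 3.8, 4.9, 5.5, 4.4], sigma=2.0)
```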
Confidence intervals, general idea
• We have a model with an unknown parameter
• We find a ”statistic” (function of the sample) with
a known distribution, depending only on the
unknown parameter
• This distribution is used to construct an interval
with the following property: If you repeat many
times selecting a parameter and simulating the
statistic, then about (say) 95% of the time, the
confidence interval will contain the parameter
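This repeated-sampling property can be checked by simulation; a minimal sketch, with an arbitrary choice of true parameter, sample size, and number of repetitions:

```python
import random
from statistics import NormalDist, mean
from math import sqrt

random.seed(1)
mu, sigma, n = 10.0, 3.0, 25        # illustrative "true" values
z = NormalDist().inv_cdf(0.975)

hits = 0
trials = 2000
for _ in range(trials):
    data = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = mean(data)
    half = z * sigma / sqrt(n)
    if xbar - half <= mu <= xbar + half:
        hits += 1

coverage = hits / trials  # should be close to 0.95
```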
Hypothesis testing
• Selecting the most plausible model for the data,
among those suggested
• Example: Assume X1,X2,…,Xn is a random sample
from N(μ,σ2), where σ2 is known, but μ is not; we
want to select μ fitting the data.
• One possibility is to look at the probability of
observing the data given different values for μ.
(We will return to this)
• Another is to do a hypothesis test
Example
• We select two alternative hypotheses:
– H0: $\mu = \mu_0$
– H1: $\mu \ne \mu_0$
• Use the value of $\bar{X}$ to test H0 versus H1: If $\bar{X}$
is far from $\mu_0$, it will indicate H1.
• Under H0, we know that
$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu_0 \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) \approx 95\%$
• Reject H0 if $\bar{X}$ is outside
$\left(\mu_0 - 1.96\frac{\sigma}{\sqrt{n}},\ \mu_0 + 1.96\frac{\sigma}{\sqrt{n}}\right)$
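A sketch of this test (the data, μ0, and σ are illustrative, and the helper function name is ours, not from the lecture):

```python
from statistics import NormalDist, mean
from math import sqrt

def z_test(data, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 when sigma is known."""
    n = len(data)
    z = (mean(data) - mu0) / (sigma / sqrt(n))
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, abs(z) > zcrit

# Illustrative: the sample mean lies clearly above mu0 = 0
z, p, reject = z_test([1.2, 0.8, 1.5, 1.1, 0.9, 1.3, 1.0, 1.2],
                      mu0=0.0, sigma=1.0)
```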
General outline for hypothesis
testing
• The possible hypotheses are divided into
H0, the null hypothesis, and H1, the
alternative hypothesis
• A hypothesis can be
– Simple, so that it is possible to compute the
probability of the data (e.g., $\mu = 3.7$)
– Composite, i.e., a collection of simple
hypotheses (e.g., $\mu > 3.7$)
General outline (cont.)
• A test statistic is selected. It must:
– Have a higher probability for ”extreme” values under
H1 than under H0
– Have a known distribution under H0 (when simple)
• If the value of the test statistic is ”too extreme”,
then H0 is rejected.
• The probability, under H0, of observing the given
data or something more extreme is called the
p-value. Thus we reject H0 if the p-value is small.
• The p-value threshold below which we reject H0 is
called the significance level.
Note:
• There is an asymmetry between H0 and H1: In fact,
if the data is inconclusive, we end up not rejecting
H0.
• If H0 is true, the probability of rejecting H0 is (say)
5%. That DOES NOT MEAN we are 95% certain
that H0 is true!
• How much evidence we have for choosing H1 over
H0 depends entirely on how much more probable
rejection is if H1 is true.
Errors of types I and II
• The above can be seen as a decision rule for
H0 or H1.
• For any such rule we can compute (if both
H0 and H1 are simple hypotheses):
                H0 true                          H1 true
Accept H0       P(accept | H0)                   P(accept | H1) = 1 − power
                                                 (TYPE II error)
Reject H0       P(reject | H0) = significance    P(reject | H1) = power
                (TYPE I error)
Significance and power
• If H0 is composite, we compute the
significance from the simple hypothesis that
gives the largest probability of rejecting H0.
• If H1 is composite, we compute a power
value for each simple hypothesis. Thus we
get a power function.
Example 1: Normal distribution with
unknown variance
• Assume $X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2)$
• Then
$\frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$
• Thus
$P\left(\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right) = 1 - \alpha$
• So a confidence interval for μ, with confidence level $1 - \alpha$,
is given by
$\left(\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right)$
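A sketch of this interval, using scipy for the t quantile (the data are made up):

```python
import numpy as np
from scipy import stats

def t_confidence_interval(data, alpha=0.05):
    """(1 - alpha) confidence interval for mu when sigma is unknown."""
    x = np.asarray(data, dtype=float)
    n = len(x)
    s = x.std(ddof=1)                        # sample standard deviation
    tcrit = stats.t.ppf(1 - alpha / 2, n - 1)
    half = tcrit * s / np.sqrt(n)
    return x.mean() - half, x.mean() + half

low, high = t_confidence_interval([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])
```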
Example 1 (Hypothesis testing)
• Hypotheses: H0: $\mu = \mu_0$   H1: $\mu \ne \mu_0$
• Test statistic
$\frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$ under H0
• Reject H0 if
$\bar{X} > \mu_0 + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$
or if
$\bar{X} < \mu_0 - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$
• Alternatively, the p-value for the test can be
computed (if $\bar{X} > \mu_0$) as the α such that
$\bar{X} = \mu_0 + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$
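In practice this whole test is one library call; a sketch using scipy's `ttest_1samp` (data illustrative):

```python
import numpy as np
from scipy import stats

# Two-sided one-sample t-test of H0: mu = mu0 (data made up for illustration)
data = np.array([2.9, 3.4, 3.1, 2.8, 3.3, 3.0, 3.2])
t_stat, p_value = stats.ttest_1samp(data, popmean=3.0)

reject = p_value < 0.05  # reject H0 at the 5% significance level
```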
Example 1 (cont.)
• Hypotheses: H0: $\mu \le \mu_0$   H1: $\mu > \mu_0$
• Test statistic
$\frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$ assuming $\mu = \mu_0$
• Reject H0 if
$\bar{X} > \mu_0 + t_{n-1,\alpha}\frac{s}{\sqrt{n}}$
• Alternatively, the p-value for the test can be
computed as the α such that
$\bar{X} = \mu_0 + t_{n-1,\alpha}\frac{s}{\sqrt{n}}$
Example 1 (cont.)
• Assume that you want to analyze as above
the data in some column of an SPSS table.
• Use ”Analyze” => ”Compare means” =>
”One-sample T Test”
• You get as output a confidence interval, and
a test like the one described above.
• You may adjust the confidence level using
”Options…”
Example 2: Differences between
means
• Assume $X_1, X_2, \ldots, X_{n_x} \sim N(\mu_x, \sigma_x^2)$ and
$Y_1, Y_2, \ldots, Y_{n_y} \sim N(\mu_y, \sigma_y^2)$
• We would like to study the difference $\mu_x - \mu_y$
• Four different cases:
– Matched pairs
– Known population variances
– Unknown but equal population variances
– Unknown and possibly different pop. variances
Known population variances
• We get
$\frac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}} \sim N(0,1)$
• Confidence interval for $\mu_x - \mu_y$:
$\bar{X} - \bar{Y} \pm Z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$
Unknown but equal population
variances
• We get
$\frac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}} \sim t_{n_x + n_y - 2}$
where
$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$
• Confidence interval for $\mu_x - \mu_y$:
$\bar{X} - \bar{Y} \pm t_{n_x + n_y - 2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$
Hypothesis testing: Unknown but
equal population variances
• Hypotheses: H0: $\mu_x = \mu_y$   H1: $\mu_x \ne \mu_y$
• Test statistic:
$\frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}} \sim t_{n_x + n_y - 2}$
”T test with equal variances”
• Reject H0 if
$\frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}} > t_{n_x + n_y - 2,\alpha/2}$
or if
$\frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}} < -t_{n_x + n_y - 2,\alpha/2}$
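A sketch using scipy, where `equal_var=True` requests exactly this pooled test (data illustrative):

```python
import numpy as np
from scipy import stats

# "T test with equal variances": ttest_ind with equal_var=True pools the
# two sample variances into s_p^2 as described above (data made up).
x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4])
y = np.array([9.1, 9.5, 8.9, 9.3, 9.0])
t_stat, p_value = stats.ttest_ind(x, y, equal_var=True)
```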
Unknown and possibly unequal
population variances
• We get
$\frac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}} \sim t_\nu$
where
$\nu = \frac{\left(\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}\right)^2}{\frac{(s_x^2/n_x)^2}{n_x - 1} + \frac{(s_y^2/n_y)^2}{n_y - 1}}$
• Conf. interval for $\mu_x - \mu_y$:
$\bar{X} - \bar{Y} \pm t_{\nu,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$
Hypothesis test: Unknown and
possibly unequal pop. variances
• Hypotheses: H0: $\mu_x = \mu_y$   H1: $\mu_x \ne \mu_y$
• Test statistic
$\frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}} \sim t_\nu$
”T test with unequal variances”
• Reject H0 if
$\frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}} > t_{\nu,\alpha/2}$
or if
$\frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}} < -t_{\nu,\alpha/2}$
Practical examples:
• The lengths of children in a class are measured at
age 8 and at age 10. Use the data to find an
estimate, with confidence limits, of how much
children grow between these ages.
• You want to determine whether a costly operation
is generally done more cheaply in France than in
Norway. Your data is the actual costs of 10 such
operations in Norway and 20 in France.
Example 3: Population proportions
• Assume $X \sim \mathrm{Bin}(\pi, n)$, so that $p = X/n$ is a frequency.
• Then
$\frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0,1)$ (approximately, for large n)
• Thus
$\frac{p - \pi}{\sqrt{p(1-p)/n}} \sim N(0,1)$ (approximately, for large n)
• Thus
$P\left(p - Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) \approx 1 - \alpha$
• Confidence interval for π:
$\left(p - Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ p + Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right)$
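A sketch of this large-sample (Wald) interval; the function name and numbers are illustrative:

```python
from statistics import NormalDist
from math import sqrt

def proportion_ci(x, n, alpha=0.05):
    """Large-sample (Wald) confidence interval for a proportion pi."""
    p = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half

low, high = proportion_ci(40, 100)  # 40 successes in 100 trials
```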
Example 3 (Hypothesis testing)
• Hypotheses: H0: $\pi = \pi_0$   H1: $\pi \ne \pi_0$
• Test statistic
$\frac{p - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} \sim N(0,1)$
under H0, for large n
• Reject H0 if
$p > \pi_0 + Z_{\alpha/2}\sqrt{\frac{\pi_0(1-\pi_0)}{n}}$
or if
$p < \pi_0 - Z_{\alpha/2}\sqrt{\frac{\pi_0(1-\pi_0)}{n}}$
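A sketch of the test; names and numbers are illustrative:

```python
from statistics import NormalDist
from math import sqrt

def proportion_test(x, n, pi0, alpha=0.05):
    """Two-sided large-sample test of H0: pi = pi0."""
    p = x / n
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, abs(z) > zcrit

# 60 successes in 100 trials, testing pi0 = 0.5
z, p_value, reject = proportion_test(60, 100, pi0=0.5)
```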
Example 4: Differences between
population proportions
• Assume $X_1 \sim \mathrm{Bin}(\pi_1, n_1)$ and $X_2 \sim \mathrm{Bin}(\pi_2, n_2)$, so
that $p_1 = X_1/n_1$ and $p_2 = X_2/n_2$ are frequencies
• Then
$\frac{p_1 - p_2 - (\pi_1 - \pi_2)}{\sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}} \sim N(0,1)$ (approximately)
• Confidence interval for $\pi_1 - \pi_2$:
$p_1 - p_2 \pm Z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$
Example 4 (Hypothesis testing)
• Hypotheses: H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
• Test statistic
$\frac{p_1 - p_2}{\sqrt{\frac{p_0(1-p_0)}{n_1} + \frac{p_0(1-p_0)}{n_2}}} \sim N(0,1)$
where
$p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$
• Reject H0 if
$\left|\frac{p_1 - p_2}{\sqrt{\frac{p_0(1-p_0)}{n_1} + \frac{p_0(1-p_0)}{n_2}}}\right| > Z_{\alpha/2}$
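A sketch with the pooled frequency p0 as above (illustrative counts):

```python
from statistics import NormalDist
from math import sqrt

def two_proportion_test(x1, n1, x2, n2, alpha=0.05):
    """Two-sided large-sample test of H0: pi1 = pi2, pooling under H0."""
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)            # pooled frequency, as above
    se = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > zcrit

# 30/100 successes versus 45/100 successes
z, reject = two_proportion_test(30, 100, 45, 100)
```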
Example 5: The variance of a normal
distribution
• Assume $X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2)$
• Then
$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$
• Thus
$P\left(\chi^2_{n-1,1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{n-1,\alpha/2}\right) = 1 - \alpha$
• Confidence interval for $\sigma^2$:
$\left(\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}\right)$
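A sketch using scipy's chi-square quantiles (illustrative data):

```python
import numpy as np
from scipy import stats

def variance_ci(data, alpha=0.05):
    """(1 - alpha) confidence interval for sigma^2, normal data."""
    x = np.asarray(data, dtype=float)
    n = len(x)
    s2 = x.var(ddof=1)                    # sample variance
    lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, n - 1)
    upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, n - 1)
    return lower, upper

low, high = variance_ci([2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3])
```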
Example 6: Comparing variances for
normal distributions
• Assume $X_1, X_2, \ldots, X_{n_x} \sim N(\mu_x, \sigma_x^2)$ and $Y_1, Y_2, \ldots, Y_{n_y} \sim N(\mu_y, \sigma_y^2)$
• We get
$\frac{s_x^2 / \sigma_x^2}{s_y^2 / \sigma_y^2} \sim F_{n_x - 1, n_y - 1}$
• $F_{n_x-1, n_y-1}$ is an F distribution with $n_x - 1$ and $n_y - 1$
degrees of freedom
• We can use this exactly as before to obtain a
confidence interval for $\sigma_x^2 / \sigma_y^2$ and for testing for
example if $\sigma_x^2 = \sigma_y^2$
• Note: The assumption of normality is crucial!
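A sketch of the F-based interval for σx²/σy², assuming (as noted) that both samples are from normal distributions; data illustrative:

```python
import numpy as np
from scipy import stats

# Confidence interval for sigma_x^2 / sigma_y^2 via the F distribution
x = np.array([5.2, 4.8, 5.5, 5.0, 4.9, 5.3, 5.1])
y = np.array([7.1, 6.2, 8.0, 6.8, 7.5, 6.4])

ratio = x.var(ddof=1) / y.var(ddof=1)   # sample variance ratio
dfx, dfy = len(x) - 1, len(y) - 1
alpha = 0.05
low = ratio / stats.f.ppf(1 - alpha / 2, dfx, dfy)
high = ratio / stats.f.ppf(alpha / 2, dfx, dfy)
```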
Sample size computations
• For a sample from a normal population with
known variance, the size of the confidence interval
for the mean depends only on the sample size.
• So we can compute the necessary sample size to
match a required accuracy
• Note: If the variance is unknown, it must
somehow be estimated beforehand to do the
computation
• Works also for population proportion estimation,
giving an inequality for the required sample size
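For the known-variance case, the required n is the smallest integer with $Z_{\alpha/2}\,\sigma/\sqrt{n}$ no larger than the required half-width; a sketch (σ and the required half-width are made up):

```python
from statistics import NormalDist
from math import ceil

def sample_size_for_mean(sigma, half_width, confidence=0.95):
    """Smallest n so the CI for mu has at most the given half-width,
    assuming sigma is known (or estimated beforehand)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * sigma / half_width) ** 2)

# Illustrative: sigma = 2.0, want the 95% CI to be mean +/- 0.5
n = sample_size_for_mean(sigma=2.0, half_width=0.5)
```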
Power computations
• If you reject H0, you know very little about the
evidence for H1 versus H0 unless you study the
power of the test.
• The power is the probability of rejecting H0 given
that a hypothesis in H1 is true (i.e., 1 minus the
probability of a type II error).
• Thus it is a function of the possible hypotheses in
H1.
• We would like our tests to have as high power as
possible.
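For the two-sided z-test with known σ, the power function can be computed explicitly; a sketch (parameter values illustrative):

```python
from statistics import NormalDist
from math import sqrt

def power_two_sided_z(mu1, mu0, sigma, n, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 when the true
    mean is mu1 (sigma known)."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # Under mu = mu1 the test statistic is N(shift, 1), so
    # P(reject) = P(stat > z) + P(stat < -z)
    return (1 - nd.cdf(z - shift)) + nd.cdf(-z - shift)

power = power_two_sided_z(mu1=0.5, mu0=0.0, sigma=1.0, n=25)
```

Note that at mu1 = mu0 the function returns exactly the significance level, as it should.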