Lecture 4 (14/11)
Hypothesis testing
Some general concepts:
Null hypothesis
H0
A statement we “wish” to refute
Alternative hypothesis
H1
The whole or part of the complement
of H0
Common case:
The statement is about an unknown parameter, θ:
H0: θ ∈ ω
H1: θ ∈ Ω − ω (= Ω \ ω)
where ω is a well-defined subset of the parameter space Ω
Simple hypothesis: ω (or Ω − ω) contains only one point (one single value)
Composite hypothesis: The opposite of a simple hypothesis
Critical region (Rejection region)
A subset C of the sample space for the random sample X = (X1, …, Xn) such
that we reject H0 if X ∈ C (and accept (better phrase: do not reject) H0
otherwise).
The complement of C, i.e. C̄, will be referred to as the acceptance region.
C is usually defined in terms of a statistic, T(X), called the test statistic.
Simple null and alternative hypotheses
Errors in hypothesis testing:
Type I error
Rejecting a true H0
Type II error
Accepting a false H0
Significance level α
The probability of a Type I error
Also referred to as the size of the test or the risk level
Risk of Type II error β
The probability of a Type II error
Power
The probability of rejecting a false H0, i.e. the probability of the
complement of a Type II error = 1 − β
Writing it more “mathematically”:
α = Pr(X ∈ C | H0)
β = Pr(X ∉ C | H1) = Pr(X ∈ C̄ | H1)
Power = 1 − β = Pr(X ∈ C | H1)
Classical approach: Fix α and then find a test that makes β desirably small.
A low value of α does not imply a low value of β, rather the contrary.
Most powerful test
A test which minimizes β for a fixed value of α is called a most powerful test
(or best test) of size α.
Neyman-Pearson lemma
x = (x1, …, xn) a random sample from a distribution with p.d.f. f(x; θ)
We wish to test
H0: θ = θ0 (simple hypothesis)
versus
H1: θ = θ1 (simple hypothesis)
Then the most powerful test of size α has a critical region of the form

L(θ1; x) / L(θ0; x) ≥ A

where A is some non-negative constant.
Proof: See the course book.
Note! Both hypotheses are simple.
Example:
x = (x1, …, xn) a random sample from Exp(λ), i.e. with p.d.f.

f(x; λ) = (1/λ) e^(−x/λ); x > 0

Test H0: λ = λ0 vs. H1: λ = λ1 where λ1 > λ0 with a test of size α

L(λ; x) = λ^(−n) e^(−(1/λ)·Σxi)

The critical region for a most powerful test is

λ1^(−n) e^(−(1/λ1)·Σxi) / (λ0^(−n) e^(−(1/λ0)·Σxi)) ≥ A
⇔ (λ1/λ0)^(−n) e^((1/λ0 − 1/λ1)·Σxi) ≥ A
⇔ (1/λ0 − 1/λ1)·Σxi ≥ ln A + n(ln λ1 − ln λ0)
⇔ Σxi ≥ [ln A + n(ln λ1 − ln λ0)] / (1/λ0 − 1/λ1)
(since λ1 > λ0 ⇒ 1/λ0 − 1/λ1 > 0)

i.e. T(x) = Σxi ≥ B

If λ1 had been < λ0 the critical region would instead be Σxi ≤ B.
Logical, as E(X) = λ.

How to find B?
If λ1 > λ0 then B must satisfy

Pr(Σ Xi ≥ B | λ0) = α

Use the result that a sum of n Exp(λ)-distributed variables is Gamma
distributed, Gamma(n, λ), i.e. with T(X) = Σ Xi,

fT(t) = t^(n−1) e^(−t/λ) / (λ^n · Γ(n))

⇒ α = ∫(B to ∞) t^(n−1) e^(−t/λ0) / (λ0^n · Γ(n)) dt

Solve for B (numerically).
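In practice the critical value B can be read off directly from the Gamma quantile function rather than solving the integral equation by hand. A minimal Python sketch, where the values of n, λ0 and α are made-up illustration numbers:

```python
from scipy.stats import gamma

lam0, n, alpha = 2.0, 10, 0.05   # hypothetical values for illustration

# Under H0, T = sum(X_i) ~ Gamma(n, scale=lam0), so B is the
# upper-alpha quantile of that distribution: Pr(T >= B | lam0) = alpha
B = gamma.ppf(1 - alpha, a=n, scale=lam0)
print(B)
```

Here scipy's closed-form quantile replaces the numerical solution of the integral equation above.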
If the sample x comes from a distribution belonging to the one-parameter
exponential family:

L(θ; x) = e^(A(θ)·Σ B(xi) + Σ C(xi) + n·D(θ))

⇒ L(θ1; x) / L(θ0; x) = e^((A(θ1) − A(θ0))·Σ B(xi) + n·(D(θ1) − D(θ0)))

If A(θ1) − A(θ0) > 0: the critical region is of the form T(x) = Σ B(xi) ≥ B*
If A(θ1) − A(θ0) < 0: the critical region is of the form T(x) = Σ B(xi) ≤ B*
(for some constant B*)
“Pure significance tests”
Assume we wish to test H0: θ = θ0 with a test of size α.
The test statistic T(x) is observed to take the value t.
Case 1: H1: θ > θ0
The P-value is defined as Pr(T(X) ≥ t | H0)
Case 2: H1: θ < θ0
The P-value is defined as Pr(T(X) ≤ t | H0)
If the P-value is less than α, H0 is rejected.
Case 3: H1: θ ≠ θ0
The P-value is defined as the probability that T(X) is as extreme as the observed
value, including that it can be extreme in two directions from H0.
In general:
Consider that we just have a null hypothesis, H0, that could specify
• the value of a parameter (like above)
• a particular distribution
• independence between two or more variables
• …
What is important is that H0 specifies something under which calculations are feasible.
Given a test statistic T = t, the P-value is defined as
Pr(T is as extreme as t | H0)
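For the exponential example above with H1: λ > λ0, the one-sided P-value can be computed from the Gamma distribution of T = Σxi. A sketch with an invented data vector:

```python
import numpy as np
from scipy.stats import gamma

lam0 = 2.0
x = np.array([2.8, 1.9, 3.5, 2.2, 4.1, 1.5, 2.9, 3.3, 2.0, 2.6])  # hypothetical data
t_obs = x.sum()

# Case 1 (H1: lambda > lam0): P-value = Pr(T >= t_obs | H0),
# where T ~ Gamma(n, scale=lam0) under H0
p_value = gamma.sf(t_obs, a=len(x), scale=lam0)
print(p_value)
```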
Uniformly most powerful tests (UMP)
Generalizations of some concepts to composite (null and) alternative hypotheses:
H0: θ ∈ ω
H1: θ ∈ Ω − ω (= Ω \ ω)
Power function:
β(θ) = Pr(X ∈ C | θ), i.e. a function of θ
Size:
α = sup(θ ∈ ω) β(θ)
A test of size α is said to be uniformly most powerful (UMP) if

β(θ) ≥ β*(θ) ∀ θ ∈ Ω − ω

where β*(θ) is the power function of any other test of size α.
If H0 is simple but H1 is composite and we have found a best test (Neyman–
Pearson) for H0 vs. H1′: θ = θ1 where θ1 ∈ Ω − ω, then
if this best test takes the same form for all θ1 ∈ Ω − ω, the test is UMP.
Univariate cases:
H0: θ = θ0 vs. H1: θ > θ0 (or H1: θ < θ0): usually a UMP test is found
H0: θ = θ0 vs. H1: θ ≠ θ0: usually a UMP test is not found
Unbiased test:
A test is said to be unbiased if β(θ) ≥ α for all θ ∈ Ω − ω.
Similar test:
A test is said to be similar if β(θ) = α for all θ ∈ ω.
Invariant test:
Assume that the hypotheses of a test are unchanged if a transformation of the sample
data is applied. If the critical region is not changed by this transformation, the test
is said to be invariant.
Consistent test:
Assume the test depends on the sample size n, so that β(θ) = βn(θ).
If lim(n→∞) βn(θ) = 1, the test is said to be consistent.
Efficiency:
Consider two tests of the pair of simple hypotheses H0 and H1. If n1 and n2 are the
minimum sample sizes for tests 1 and 2 resp. to achieve a given size α and a given
power, then the relative efficiency of test 1 vs. test 2 is defined as n2 / n1.
(Maximum) Likelihood Ratio Tests
Consider again that we wish to test
H0: θ ∈ ω
H1: θ ∈ Ω − ω (= Ω \ ω)
The Maximum Likelihood Ratio Test (MLRT) is defined as rejecting H0 if

Λ = max(θ ∈ ω) L(θ; x) / max(θ ∈ Ω) L(θ; x) ≤ A

• 0 ≤ Λ ≤ 1
• For a simple H0 it gives a UMP test
• MLRT is asymptotically most powerful unbiased
• MLRT is asymptotically similar
• MLRT is asymptotically efficient
If H0 is simple, i.e. H0: θ = θ0, the MLRT simplifies to

Λ = L(θ0; x) / L(θ̂ML; x) ≤ A
Example
x = (x1, …, xn) a random sample from Exp(λ)
H0: λ = λ0
H1: λ ≠ λ0

L(λ; x) = Π(i=1 to n) (1/λ) e^(−xi/λ) = λ^(−n) e^(−(1/λ)·Σxi)

λ̂ML = x̄ (according to earlier examples)

Λ = λ0^(−n) e^(−(1/λ0)·Σxi) / (x̄^(−n) e^(−(1/x̄)·Σxi))
  = (x̄/λ0)^n e^(−(1/λ0)·Σxi + n) = (x̄/λ0)^n e^(−n·(x̄/λ0 − 1))

ln Λ = n ln x̄ − n ln λ0 − n(x̄/λ0 − 1)
Distribution of Λ
Sometimes Λ has a well-defined distribution:
e.g. the MLRT can be shown to be an ordinary t-test when the sample is from
the normal distribution with unknown variance and H0: μ = μ0.
Often, this is not the case.
Asymptotic result:
Under H0 it can be shown that −2 ln Λ is asymptotically χ²-distributed with d
degrees of freedom, where
d is the difference in the number of estimated parameters (including “nuisance”
parameters) between
“max(θ ∈ ω) L(θ; x)” and “max(θ ∈ Ω) L(θ; x)”
Example Exp(λ) cont.

ln Λ = n ln x̄ − n ln λ0 − n(x̄/λ0 − 1)

d = 1, as we estimate 0 parameters in the numerator of Λ
and 1 parameter (λ) in the denominator

⇒ −2 ln Λ = −2n ln x̄ + 2n ln λ0 + 2n(x̄/λ0 − 1) is asymptotically
χ²(1)-distributed when λ = λ0 (i.e. when H0 is true)
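The asymptotic χ²(1) approximation can be tried out numerically. A minimal sketch where the data are simulated under H0, with made-up values of λ0 and n:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
lam0, n = 2.0, 200                       # hypothetical values
x = rng.exponential(scale=lam0, size=n)  # data generated under H0
xbar = x.mean()

# -2 ln Lambda from the derivation above
stat = -2 * (n * np.log(xbar) - n * np.log(lam0) - n * (xbar / lam0 - 1))

p_value = chi2.sf(stat, df=1)            # asymptotic chi^2(1) reference
reject = stat > chi2.ppf(0.95, df=1)     # size-0.05 test
print(stat, p_value, reject)
```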
Score tests
Test of H0: θ = θ0 vs. H1: θ ≠ θ0 (θ ∈ Ω − {θ0})
Test statistic:

ψ = u^T(θ0) · I(θ0)^(−1) · u(θ0)

where

u(θ0) = (∂l/∂θ1, ∂l/∂θ2, …, ∂l/∂θk)^T evaluated at θ0

is the score vector and I is the information matrix.
Under H0, ψ is asymptotically χ²(k)-distributed,
and the test is asymptotically equivalent to the
corresponding MLRT.
Wald tests
Test of H0: θ = θ0 vs. H1: θ ≠ θ0 (θ ∈ Ω − {θ0})
Test statistic:

ψ = (θ̂ML − θ0)^T · I(θ̂ML) · (θ̂ML − θ0)

Under H0, ψ is asymptotically χ²(k)-distributed,
and the test is asymptotically equivalent to the
corresponding MLRT.
Score and Wald tests are particularly used in Generalized Linear Models.
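For the one-parameter exponential example, l(λ) = −n ln λ − Σxi/λ gives the score u(λ) = n(x̄ − λ)/λ² and information I(λ) = n/λ², so both statistics have simple closed forms. A sketch with made-up values of λ0 and n:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
lam0, n = 2.0, 200                       # hypothetical values
x = rng.exponential(scale=lam0, size=n)
xbar = x.mean()                          # ML estimate of lambda

# score: u(lam0)^2 / I(lam0);  Wald: (lam_hat - lam0)^2 * I(lam_hat)
score_stat = n * (xbar - lam0)**2 / lam0**2
wald_stat = n * (xbar - lam0)**2 / xbar**2

crit = chi2.ppf(0.95, df=1)              # both are compared to chi^2(1)
print(score_stat, wald_stat, crit)
```

Note that the two statistics differ only in where the information is evaluated: at θ0 (score) versus at θ̂ML (Wald).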
Confidence sets and confidence intervals
Definition:
Let x be a random sample from a distribution with p.d.f. f(x; θ) where θ is an
unknown parameter with parameter space Ω, i.e. θ ∈ Ω.
If SX is a subset of Ω, depending on X, such that

Pr(X : SX ∋ θ) ≥ 1 − α

then SX is said to be a confidence set for θ with confidence coefficient (level) 1 − α.
For a one-dimensional parameter we rather refer to this set as a confidence
interval.
Pivotal quantities
A pivotal quantity is a function g of the unknown parameter and the observations
in the sample, i.e. g = g(X; θ), whose distribution is known and independent of θ.
Examples:
x a random sample from N(μ; σ²):

(X̄ − μ) / (σ/√n) is N(0, 1)-distributed and thus independent of μ and σ²

(X̄ − μ) / (S/√n) is t(n−1)-distributed and thus independent of μ and σ²

(n − 1)S²/σ² is χ²(n−1)-distributed and thus independent of μ and σ²
To obtain a confidence set from a pivotal quantity we write a probability statement
as

Pr(g1 ≤ g(X; θ) ≤ g2) = 1 − α   (1)

For a one-dimensional θ and g monotonic, the probability statement can be rewritten as

Pr(θ1(X) ≤ θ ≤ θ2(X)) = 1 − α

where now the limits are random variables, and the resulting observed confidence
interval becomes

(θ1(x), θ2(x))

For a k-dimensional θ the transformation of (1) to a confidence set is more
complicated, but feasible.
In particular, a point estimator of θ is often used to construct the pivotal quantity.
Example:
x a random sample from N(μ; σ²), μ and σ² unknown

(X̄ − μ) / (S/√n) is t(n−1)-distributed

⇒ Pr(−t(α/2, n−1) ≤ (X̄ − μ)/(S/√n) ≤ t(α/2, n−1)) = 1 − α

⇔ Pr(X̄ − t(α/2, n−1)·S/√n ≤ μ ≤ X̄ + t(α/2, n−1)·S/√n) = 1 − α

⇒ θ1(X) = X̄ − t(α/2, n−1)·S/√n and θ2(X) = X̄ + t(α/2, n−1)·S/√n

⇒ a 1 − α observed confidence interval for μ is

(θ1(x), θ2(x)) = (x̄ − t(α/2, n−1)·s/√n, x̄ + t(α/2, n−1)·s/√n)
(n − 1)S²/σ² is χ²(n−1)-distributed

⇒ Pr(χ²(1 − α/2, n−1) ≤ (n − 1)S²/σ² ≤ χ²(α/2, n−1)) = 1 − α

⇔ Pr((n − 1)S²/χ²(α/2, n−1) ≤ σ² ≤ (n − 1)S²/χ²(1 − α/2, n−1)) = 1 − α

⇒ θ1(X) = (n − 1)S²/χ²(α/2, n−1) and θ2(X) = (n − 1)S²/χ²(1 − α/2, n−1)

⇒ a 1 − α observed confidence interval for σ² is

((n − 1)s²/χ²(α/2, n−1), (n − 1)s²/χ²(1 − α/2, n−1))
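Both intervals are straightforward to compute with scipy's quantile functions. A sketch with an invented data vector:

```python
import numpy as np
from scipy.stats import t, chi2

x = np.array([4.9, 5.1, 4.7, 5.3, 5.0, 4.8, 5.2, 5.1])  # hypothetical data
n, alpha = len(x), 0.05
xbar, s2 = x.mean(), x.var(ddof=1)
s = np.sqrt(s2)

# t-interval for mu
tq = t.ppf(1 - alpha / 2, df=n - 1)
mu_ci = (xbar - tq * s / np.sqrt(n), xbar + tq * s / np.sqrt(n))

# chi-square interval for sigma^2 (upper quantile goes in the lower limit)
sigma2_ci = ((n - 1) * s2 / chi2.ppf(1 - alpha / 2, df=n - 1),
             (n - 1) * s2 / chi2.ppf(alpha / 2, df=n - 1))
print(mu_ci, sigma2_ci)
```

Note that the σ² interval is not symmetric around s², since the χ² distribution is skewed.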
Using the asymptotic normality of MLE:s
One-dimensional parameter θ:

θ̂ML ~ N(θ, I(θ)^(−1)) (approximately)

⇒ (θ̂ML − θ) / √(I^(−1)) ~ N(0, 1)
⇒ Pr(−z(α/2) ≤ (θ̂ML − θ)/√(I^(−1)) ≤ z(α/2)) ≈ 1 − α

Approximate 1 − α confidence interval for θ:

(θ̂ML − z(α/2)·√(I^(−1)), θ̂ML + z(α/2)·√(I^(−1)))

k-dimensional parameter θ:

θ̂ ~ N(θ; I(θ)^(−1))

⇒ (θ̂ − θ)^T · I(θ) · (θ̂ − θ) ~ χ²(k)

⇒ an ellipsoidal confidence set for θ can be constructed.
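Continuing the exponential example: λ̂ML = x̄ and I(λ) = n/λ², so an approximate interval is x̄ ± z(α/2)·x̄/√n. A sketch with made-up simulation values:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
lam_true, n, alpha = 2.0, 500, 0.05      # hypothetical values
x = rng.exponential(scale=lam_true, size=n)

lam_hat = x.mean()                       # ML estimate for Exp(lambda)
info = n / lam_hat**2                    # I(lambda) evaluated at lam_hat
z = norm.ppf(1 - alpha / 2)
half = z / np.sqrt(info)                 # z * sqrt(I^{-1}) = z * lam_hat / sqrt(n)
ci = (lam_hat - half, lam_hat + half)
print(ci)
```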
Construction of confidence intervals from hypothesis tests:
Assume a test of H0: θ = θ0 vs. H1: θ ≠ θ0 with critical region C(θ0).
Then a confidence set for θ with confidence coefficient 1 − α is

SX = {θ0 : X ∈ C̄(θ0)}

where C̄(θ0) is the acceptance region.