Transcript neyman_LRT

Likelihood Ratio Tests
The origin and properties of using the
likelihood ratio in hypothesis testing
Teresa Wollschied
Colleen Kenney
Outline
 Background/History
 Likelihood Function
 Hypothesis Testing
 Introduction to Likelihood Ratio Tests
 Examples
 References
Jerzy Neyman (1894 – 1981)
 April 16, 1894: Born in Bendery, Russia (now in Moldova); Russian version of his name: Yuri Czeslawovich
 1906: Father died. Neyman and his mother moved to Kharkov.
 1912: Neyman began studying both physics and mathematics at the University of Kharkov, where professor Sergei Bernstein introduced him to probability
 1919: Traveled south to Crimea and met Olga Solodovnikova. In 1920, ten days after their wedding, he was imprisoned for six weeks in Kharkov.
 1921: Moved to Poland and worked as an assistant statistical analyst at the Agricultural Institute in Bromberg, then at the State Meteorological Institute in Warsaw.
Neyman biography
 1923-1924: Became an assistant at Warsaw University and taught at the College of Agriculture. Earned a doctorate for a thesis that applied probability to agricultural experimentation.
 1925: Received the Rockefeller fellowship to study at University
College London with Karl Pearson (met Egon Pearson)
 1926-1927: Went to Paris. Visited by Egon Pearson in 1927; began collaborative work on testing hypotheses.
 1934-1938: Took position at University College London
 1938: Offered a position at UC Berkeley. Set up Statistical Laboratory
within Department of Mathematics. Statistics became a separate
department in 1955.
 Died on August 5, 1981
Egon Pearson (1895 – 1980)
 August 11, 1895: Born in Hampstead, England. Middle child of
Karl Pearson
 1907-1909: Attended Dragon School Oxford
 1909-1914: Attended Winchester College
 1914: Started at Cambridge; his studies were interrupted by influenza.
 1915: Joined war effort at Admiralty and Ministry of Shipping
 1920: Awarded B.A. by taking the Military Special Examination; began research in solar physics, attending lectures by Eddington
 1921: Became lecturer at University College London with his
father
 1924: Became assistant editor of Biometrika
Pearson biography
 1925: Met Neyman and corresponded with him through letters
while Neyman was in Paris. Also corresponded with Gosset at the
same time.
 1933: After his father retired, became Head of the Department of Applied Statistics
 1935: Won Weldon Prize for work done with Neyman and began
work on revising Tables for Statisticians and Biometricians
(1954,1972)
 1939: Did war work, eventually receiving a C.B.E.
 1961: Retired from University College London
 1966: Retired as Managing Editor of Biometrika
 Died June 12, 1980
Likelihood and Hypothesis Testing
 “On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference, Part I,” 1928, Biometrika: Likelihood Ratio Tests explained in detail by Neyman and Pearson

“Probability is a ratio of frequencies and this relative measure cannot be termed the ratio of probabilities of the hypotheses, unless we speak of probability a posteriori and postulate some a priori frequency distribution of sampled populations. Fisher has therefore introduced the term likelihood, and calls this comparative measure the ratio of the likelihoods of the two hypotheses.”
Likelihood and Hypothesis Testing
 “On the Problem of the Most Efficient Tests of Statistical Hypotheses,” 1933, Philosophical Transactions of the Royal Society of London: The concept of developing an ‘efficient’ test is expanded upon.

“Without hoping to know whether each hypothesis is true or false,
we may search for rules to govern our behavior with regard to them,
in following which we insure that, in the long run of experience, we
shall not be too often wrong”
Likelihood Function
Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with p.d.f. or p.m.f. $f(x \mid \theta)$. Then, given $\mathbf{X} = \mathbf{x}$ is observed, the likelihood function is the function of $\theta$ defined by:

$$L(\theta \mid x_1, \ldots, x_n) = f(\mathbf{x} \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$
Hypothesis Testing
$H_0: \theta \in \Theta_0$
$H_a: \theta \in \Theta_0^c$
 Define a test statistic $T = r(\mathbf{x})$
 Rejection region $R = \{\mathbf{x} : T > c\}$ for some constant c.
Power Function
 The probability a test will reject $H_0$ is given by:
$$\beta(\theta) = P_\theta(\mathbf{X} \in R)$$
 Size $\alpha$ test: $\sup_{\theta \in \Theta_0} \beta(\theta) = \alpha$, for $0 \le \alpha \le 1$
 Level $\alpha$ test: $\sup_{\theta \in \Theta_0} \beta(\theta) \le \alpha$, for $0 \le \alpha \le 1$
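As an illustrative sketch under an assumed setup (a one-sided test of $H_0: \theta \le 0$ for a $N(\theta, 1)$ mean with rejection region $\{\bar{x} > c\}$; all names and numbers are hypothetical), the power function and the size of the test can be computed from the normal CDF:

```python
from math import erf, sqrt

def power(theta, c, n):
    """beta(theta) = P_theta(Xbar > c) for a N(theta, 1) sample of size n,
    using Xbar ~ N(theta, 1/n) and the standard normal CDF Phi."""
    z = sqrt(n) * (c - theta)
    phi = 0.5 * (1 + erf(z / sqrt(2)))  # Phi(z)
    return 1 - phi

# Size of the test for H0: theta <= 0 is sup over theta <= 0 of beta(theta),
# attained at the boundary theta = 0 (beta is increasing in theta).
n, c = 25, 0.329  # c chosen near z_0.05 / sqrt(n) = 1.645 / 5
size = power(0.0, c, n)
print(round(size, 3))  # 0.05
```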
Types of Error
 Type I Error:
Rejecting H0 when H0 is true
 Type II Error:
Accepting H0 when H0 is false
Likelihood Ratio Test (LRT)
 The LRT statistic for testing $H_0: \theta \in \Theta_0$ vs. $H_a: \theta \in \Theta_0^c$ is:
$$\lambda(\mathbf{x}) = \frac{\sup_{\Theta_0} L(\theta \mid \mathbf{x})}{\sup_{\Theta} L(\theta \mid \mathbf{x})}$$
 An LRT is any test that has a rejection region of the form $\{\mathbf{x} : \lambda(\mathbf{x}) \le c\}$, where c is any number such that $0 \le c \le 1$.
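A small sketch of computing $\lambda(\mathbf{x})$ (illustrative names, not from the slides), again assuming a $N(\theta, 1)$ model with $H_0: \theta = \theta_0$, so the unrestricted supremum is attained at the MLE $\bar{x}$ and the restricted supremum at $\theta_0$:

```python
import math

def log_lik(theta, xs):
    # log-likelihood of an assumed N(theta, 1) sample
    return sum(-0.5 * math.log(2 * math.pi) - (x - theta) ** 2 / 2 for x in xs)

def lrt_stat(xs, theta0):
    """lambda(x) = sup_{Theta0} L / sup_{Theta} L for H0: theta = theta0.
    The unrestricted supremum is attained at the MLE, here the sample mean."""
    mle = sum(xs) / len(xs)
    return math.exp(log_lik(theta0, xs) - log_lik(mle, xs))

xs = [0.3, -0.1, 0.4, 0.2]
lam = lrt_stat(xs, 0.0)
print(0 <= lam <= 1)  # True: the numerator supremum is over a subset
```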
Uniformly Most Powerful (UMP) Test
Let $\delta$ be a test procedure for testing $H_0: \theta \in \Theta_0$ vs. $H_a: \theta \in \Theta_0^c$, with level of significance $\alpha_0$. Then $\delta$, with power function $\beta(\theta)$, is a UMP level $\alpha_0$ test if:
(1) $\alpha(\delta) \le \alpha_0$
(2) For every test procedure $\delta'$ with $\alpha(\delta') \le \alpha_0$, we have $\beta'(\theta) \le \beta(\theta)$ for every $\theta \in \Theta_0^c$.
Neyman-Pearson Lemma
Consider testing $H_0: \theta = \theta_0$ vs. $H_a: \theta = \theta_1$, where the pdf or pmf corresponding to $\theta_i$ is $f(\mathbf{x} \mid \theta_i)$, $i = 0, 1$, using a test with rejection region R that satisfies

(1) $\mathbf{x} \in R$ if $f(\mathbf{x} \mid \theta_1) > k f(\mathbf{x} \mid \theta_0)$ and $\mathbf{x} \in R^c$ if $f(\mathbf{x} \mid \theta_1) < k f(\mathbf{x} \mid \theta_0)$, for some $k \ge 0$, and

(2) $\alpha = P_{\theta_0}(\mathbf{X} \in R)$.

Neyman-Pearson Lemma (cont’d)
Then
(a) Any test that satisfies (1) and (2) is a UMP level $\alpha$ test.
(b) If there exists a test satisfying (1) and (2) with $k > 0$, then every UMP level $\alpha$ test satisfies (2) and every UMP level $\alpha$ test satisfies (1) except perhaps on a set A satisfying $P_{\theta_0}(\mathbf{X} \in A) = P_{\theta_1}(\mathbf{X} \in A) = 0$.
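The lemma can be checked numerically in a small discrete case (an assumed Binomial(5, p) setup, not from the slides): build the rejection region from the outcomes with the largest likelihood ratios, then verify by brute force that no other nonrandomized region of the same or smaller size has higher power.

```python
from itertools import combinations
from math import comb

n, p0, p1 = 5, 0.5, 0.7
pmf = lambda x, p: comb(n, x) * p**x * (1 - p) ** (n - x)

# Neyman-Pearson region: include outcomes with the largest likelihood ratio
# f(x|p1) / f(x|p0); here the top two give size alpha = P_{p0}(X in R) = 6/32.
ratio = {x: pmf(x, p1) / pmf(x, p0) for x in range(n + 1)}
R = sorted(range(n + 1), key=lambda x: ratio[x], reverse=True)[:2]
alpha = sum(pmf(x, p0) for x in R)
power_np = sum(pmf(x, p1) for x in R)

# Exhaustively check: no other region of size <= alpha has more power.
best = max(
    sum(pmf(x, p1) for x in S)
    for r in range(n + 2)
    for S in combinations(range(n + 1), r)
    if sum(pmf(x, p0) for x in S) <= alpha + 1e-12
)
print(abs(best - power_np) < 1e-12)  # True: the NP test is most powerful
```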
Proof: Neyman-Pearson Lemma
Note that, if $\alpha = P_{\theta_0}(\mathbf{X} \in R)$, we have a size $\alpha$ test and hence a level $\alpha$ test, because $\sup_{\theta \in \Theta_0} P_\theta(\mathbf{X} \in R) = P_{\theta_0}(\mathbf{X} \in R) = \alpha$, since $\Theta_0$ has only one point.

Define the test function $\phi$ as: $\phi(\mathbf{x}) = 1$ if $\mathbf{x} \in R$, $\phi(\mathbf{x}) = 0$ if $\mathbf{x} \in R^c$.

Let $\phi(\mathbf{x})$ be the test function of a test satisfying (1) and (2) and $\phi'(\mathbf{x})$ be the test function of any other level $\alpha$ test, where the corresponding power functions are $\beta(\theta)$ and $\beta'(\theta)$.

Since $0 \le \phi'(\mathbf{x}) \le 1$, $(\phi(\mathbf{x}) - \phi'(\mathbf{x}))(f(\mathbf{x} \mid \theta_1) - k f(\mathbf{x} \mid \theta_0)) \ge 0$ for every $\mathbf{x}$. Thus,

(3) $\quad 0 \le \int [\phi(\mathbf{x}) - \phi'(\mathbf{x})][f(\mathbf{x} \mid \theta_1) - k f(\mathbf{x} \mid \theta_0)]\, d\mathbf{x} = \beta(\theta_1) - \beta'(\theta_1) - k(\beta(\theta_0) - \beta'(\theta_0))$.
Proof: Neyman-Pearson Lemma (cont’d)
(a) is proved by noting $\beta(\theta_0) - \beta'(\theta_0) = \alpha - \beta'(\theta_0) \ge 0$. Thus with $k \ge 0$ and (3),
$$0 \le \beta(\theta_1) - \beta'(\theta_1) - k(\beta(\theta_0) - \beta'(\theta_0)) \le \beta(\theta_1) - \beta'(\theta_1),$$
showing $\beta(\theta_1) \ge \beta'(\theta_1)$. Since $\phi'$ is arbitrary and $\theta_1$ is the only point in $\Theta_0^c$, $\phi$ is a UMP test.
Proof: Neyman-Pearson Lemma (cont’d)
Proof of (b):
Now, let $\phi'$ be the test function of any UMP level $\alpha$ test. By (a), $\phi$, the test satisfying (1) and (2), is also a UMP level $\alpha$ test. So $\beta(\theta_1) = \beta'(\theta_1)$. Using this, (3), and $k > 0$,
$$\alpha - \beta'(\theta_0) = \beta(\theta_0) - \beta'(\theta_0) \le 0.$$
Since $\phi'$ is a level $\alpha$ test, $\beta'(\theta_0) \le \alpha$; hence $\beta'(\theta_0) = \alpha$, that is, $\phi'$ is a size $\alpha$ test, implying that (3) is an equality. But the nonnegative integrand in (3) will be 0 only if $\phi'$ satisfies (1), except perhaps on a set A with $\int_A f(\mathbf{x} \mid \theta_i)\, d\mathbf{x} = 0$.
LRTs and MLEs
Let $\hat\theta$ be the MLE of $\theta \in \Theta$ and let $\hat\theta_0$ be the MLE of $\theta \in \Theta_0$. Then the LRT statistic is
$$\lambda(\mathbf{x}) = \frac{L(\hat\theta_0 \mid \mathbf{x})}{L(\hat\theta \mid \mathbf{x})}$$
Example: Normal LRT
Let $X_1, \ldots, X_n$ be a random sample from a $N(\theta, 1)$ population.
Test $H_0: \theta = \theta_0$ versus $H_1: \theta \ne \theta_0$.
Then the LRT statistic is:
$$\lambda(\mathbf{x}) = \frac{L(\theta_0 \mid \mathbf{x})}{L(\bar{x} \mid \mathbf{x})} = \frac{(2\pi)^{-n/2}\, e^{-\sum_{i=1}^{n}(x_i - \theta_0)^2 / 2}}{(2\pi)^{-n/2}\, e^{-\sum_{i=1}^{n}(x_i - \bar{x})^2 / 2}} = e^{-\left[\sum_{i=1}^{n}(x_i - \theta_0)^2 - \sum_{i=1}^{n}(x_i - \bar{x})^2\right] / 2}$$
Note that $\sum_{i=1}^{n}(x_i - \theta_0)^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2 + n(\bar{x} - \theta_0)^2$.
$$\therefore\ \lambda(\mathbf{x}) = e^{-n(\bar{x} - \theta_0)^2 / 2}$$
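The sum-of-squares decomposition used in the last step can be verified numerically (illustrative data, not from the slides):

```python
xs = [2.1, 1.7, 2.4, 1.9, 2.3]
theta0 = 2.0
n = len(xs)
xbar = sum(xs) / n

# Check: sum(xi - theta0)^2 == sum(xi - xbar)^2 + n*(xbar - theta0)^2
lhs = sum((x - theta0) ** 2 for x in xs)
rhs = sum((x - xbar) ** 2 for x in xs) + n * (xbar - theta0) ** 2
print(abs(lhs - rhs) < 1e-12)  # True
```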
Example: Normal LRT (cont’d)
 We will reject $H_0$ if $\lambda(\mathbf{x}) \le c$. We have:
$$\{\mathbf{x} : \lambda(\mathbf{x}) \le c\} = \{\mathbf{x} : e^{-n(\bar{x} - \theta_0)^2 / 2} \le c\} = \{\mathbf{x} : (\bar{x} - \theta_0)^2 \ge -(2 \ln c)/n\} = \{\mathbf{x} : |\bar{x} - \theta_0| \ge \sqrt{-(2 \ln c)/n}\},$$
where $0 \le \sqrt{-(2 \ln c)/n} \le \infty$.
 Therefore, the LRTs are those tests that reject $H_0$ if the sample mean differs from the value $\theta_0$ by more than $\sqrt{-(2 \ln c)/n}$.
Example: Size of the Normal LRT
Choose c such that $\sup_{\theta \in \Theta_0} P_\theta(\lambda(\mathbf{X}) \le c) = \alpha$.
For the previous example, we have $\Theta_0 = \{\theta_0\}$, and $\sqrt{n}(\bar{X} - \theta_0) \sim N(0, 1)$ if $\theta = \theta_0$.
The test:
$$\text{reject if } |\bar{X} - \theta_0| \ge \frac{z_{\alpha/2}}{\sqrt{n}},$$
where $z_{\alpha/2}$ satisfies $P(Z \ge z_{\alpha/2}) = \alpha/2$ with $Z \sim N(0, 1)$, is a size $\alpha$ test.
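A sketch of the resulting size-$\alpha$ test (here $\alpha = 0.05$, with $z_{0.025} \approx 1.96$ hard-coded to stay within the standard library; data and names are illustrative):

```python
from math import sqrt

def z_test(xs, theta0):
    """Size-0.05 LRT for H0: theta = theta0 with assumed N(theta, 1) data:
    reject when |xbar - theta0| >= z_{alpha/2} / sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    z_crit = 1.959964  # z_{0.025}
    return abs(xbar - theta0) >= z_crit / sqrt(n)

# xbar = 0.025 here, well inside the acceptance region 1.96 / sqrt(4) = 0.98
print(z_test([0.1, -0.2, 0.15, 0.05], 0.0))  # False
```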
Sufficient Statistics and LRTs
Theorem: If $T(\mathbf{X})$ is a sufficient statistic for $\theta$, and $\lambda^*(t)$ and $\lambda(\mathbf{x})$ are the LRT statistics based on T and $\mathbf{X}$, respectively, then $\lambda^*(T(\mathbf{x})) = \lambda(\mathbf{x})$ for every $\mathbf{x}$ in the sample space.
Example: Normal LRT with unknown variance
Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ population.
Test $H_0: \mu \le \mu_0$ versus $H_1: \mu > \mu_0$. (Note: $\sigma^2$ is a nuisance parameter.)
Then the LRT statistic is:
$$\lambda(\mathbf{x}) = \frac{\max_{\{\mu, \sigma^2 :\, \mu \le \mu_0,\ \sigma^2 \ge 0\}} L(\mu, \sigma^2 \mid \mathbf{x})}{\max_{\{\mu, \sigma^2 :\, -\infty < \mu < \infty,\ \sigma^2 \ge 0\}} L(\mu, \sigma^2 \mid \mathbf{x})} = \begin{cases} 1 & \text{if } \bar{x} \le \mu_0 \\[2mm] \dfrac{L(\mu_0, \hat\sigma_0^2 \mid \mathbf{x})}{L(\hat\mu, \hat\sigma^2 \mid \mathbf{x})} & \text{if } \bar{x} > \mu_0 \end{cases}$$
which is equivalent to a test based on the Student's t statistic.
Example: Normal LRT with unknown variance (cont’d)
If $\bar{x} > \mu_0$, then
$$\lambda(\mathbf{x}) = \frac{L(\mu_0, \hat\sigma_0^2 \mid \mathbf{x})}{L(\hat\mu, \hat\sigma^2 \mid \mathbf{x})} = \frac{(2\pi\hat\sigma_0^2)^{-n/2}\, e^{-\sum_{i=1}^{n}(x_i - \mu_0)^2 / 2\hat\sigma_0^2}}{(2\pi\hat\sigma^2)^{-n/2}\, e^{-\sum_{i=1}^{n}(x_i - \hat\mu)^2 / 2\hat\sigma^2}} = \frac{(\hat\sigma_0^2)^{-n/2}\, e^{-n/2}}{(\hat\sigma^2)^{-n/2}\, e^{-n/2}} = \left(\frac{\hat\sigma^2}{\hat\sigma_0^2}\right)^{n/2},$$
since each exponent reduces to $-n/2$ when the restricted and unrestricted MLEs of the variance are substituted.
Example: Normal LRT with unknown variance (cont’d)
Note that:
$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{n-1}{n} S^2$$
and
$$\hat\sigma_0^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu_0)^2 = \frac{1}{n}\left[\sum_{i=1}^{n}(x_i - \bar{x})^2 + n(\bar{x} - \mu_0)^2\right] = \frac{n-1}{n} S^2 + (\bar{x} - \mu_0)^2$$
Therefore, $\lambda(\mathbf{x}) \le c$ when
$$\lambda(\mathbf{x}) = \left(\frac{\frac{n-1}{n} S^2}{\frac{n-1}{n} S^2 + (\bar{x} - \mu_0)^2}\right)^{n/2} \le c' \ \text{ and } \ \bar{x} > \mu_0;$$
this is analogous to rejecting $H_0$ when
$$\frac{\bar{X} - \mu_0}{S / \sqrt{n}} \ge t_{n-1, \alpha}.$$
Asymptotic Distribution of the LRT – Simple H0
Theorem: For testing $H_0: \theta = \theta_0$ versus $H_a: \theta \ne \theta_0$, suppose $X_1, \ldots, X_n$ are iid $f(x \mid \theta)$, $\hat\theta$ is the MLE of $\theta$, and $f(x \mid \theta)$ satisfies the following regularity conditions:
(1) The parameter is identifiable; i.e., if $\theta \ne \theta'$, then $f(x \mid \theta) \ne f(x \mid \theta')$.
(2) The densities $f(x \mid \theta)$ have common support, and $f(x \mid \theta)$ is differentiable in $\theta$.
(3) The parameter space $\Theta$ contains an open set $\omega$ of which the true parameter value $\theta_0$ is an interior point.
(4) $\forall x \in \mathcal{X}$, the density $f(x \mid \theta)$ is three times differentiable with respect to $\theta$, the third derivative is continuous in $\theta$, and $\int f(x \mid \theta)\, dx$ can be differentiated three times under the integral sign.
(5) $\forall \theta_0 \in \Theta$, $\exists\, c > 0$ and a function $M(x)$ (both depending on $\theta_0$) such that
$$\left|\frac{\partial^3}{\partial\theta^3} \log f(x \mid \theta)\right| \le M(x) \quad \forall x \in \mathcal{X},\ \theta_0 - c < \theta < \theta_0 + c,$$
with $E_{\theta_0}[M(X)] < \infty$.
Asymptotic Distribution of the LRT – Simple H0 (cont’d)
Then under $H_0$, as $n \to \infty$,
$$-2 \log \lambda(\mathbf{X}) \xrightarrow{D} \chi_1^2.$$
If, instead, $H_0$ specifies $\theta \in \Theta_0$, then $-2 \log \lambda(\mathbf{X}) \xrightarrow{D} \chi_\nu^2$, where the degrees of freedom $\nu$ of the limiting distribution is the difference between the number of free parameters specified by $\theta \in \Theta$ and the number of free parameters specified by $\theta \in \Theta_0$.
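For the simple normal case above, $-2 \log \lambda(\mathbf{X}) = n(\bar{X} - \theta_0)^2$, which under $H_0$ is $\chi_1^2$ (exactly, in that case; the theorem makes this asymptotic in general). A quick simulation sketch (seed, sample size, and replication count are arbitrary choices):

```python
import random

random.seed(0)
n, theta0, reps = 50, 0.0, 20000

# For the N(theta, 1) LRT, -2 log lambda(X) = n * (Xbar - theta0)^2.
# Under H0 this is chi-square with 1 df; check the 95% quantile 3.841.
exceed = 0
for _ in range(reps):
    xbar = sum(random.gauss(theta0, 1) for _ in range(n)) / n
    if n * (xbar - theta0) ** 2 > 3.841:
        exceed += 1
print(abs(exceed / reps - 0.05) < 0.01)  # close to the nominal 5% level
```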
Restrictions
 When a UMP test does not exist, other methods must be used.
 One approach: restrict attention to a subset of tests and search for a test that is UMP within that subset.
References
Casella, G. and Berger, R.L. (2002). Statistical Inference. Duxbury: Pacific Grove, CA.
Neyman, J. and Pearson, E., “On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference, Part I,” Biometrika, Vol. 20A, No. 1/2 (July 1928), pp. 175–240.
Neyman, J. and Pearson, E., “On the Problem of the Most Efficient Tests of Statistical Hypotheses,” Philosophical Transactions of the Royal Society of London, Vol. 231 (1933), pp. 289–337.
http://www-history.mcs.st-andrews.ac.uk/Mathematicians/Pearson_Egon.html
http://www-history.mcs.st-andrews.ac.uk/Mathematicians/Neyman.html