Hypothesis Testing
Presenting:
Lihu Berman
Agenda

Basic concepts

Neyman-Pearson lemma

UMP

Invariance

CFAR
Basic concepts
X is a random vector with distribution $F_\theta(x)$.
$\theta$ is a parameter, belonging to the parameter space $\Theta$.
$\Theta = \Theta_0 \cup \Theta_1$, $\Theta_0 \cap \Theta_1 = \emptyset$: a disjoint covering of the parameter space.
$H_i$ denotes the hypothesis that $\theta \in \Theta_i$.
Binary test of hypotheses: $H_0 : \theta \in \Theta_0$ vs. $H_1 : \theta \in \Theta_1$
M-ary test: $H_0 : \theta \in \Theta_0$ vs. $H_1 : \theta \in \Theta_1$ vs. $\ldots$ vs. $H_{M-1} : \theta \in \Theta_{M-1}$
Basic concepts (cont.)
If $\Theta_i = \{\theta_i\}$ (a single point), then $H_i$ is said to be a simple hypothesis.
Otherwise, it is said to be a composite hypothesis.
Example: $H_0 : \theta = \theta_0$ vs. $H_1 : \theta \ne \theta_0$ (simple vs. composite hypotheses).
RADAR: is there a target or not?
Physical model: is the coin we are tossing fair or not?
A two-sided test: the alternative $H_1$ lies on both sides of $H_0$.
A one-sided test (for scalar $\theta$): $H_0 : \theta \le \theta_0$ vs. $H_1 : \theta > \theta_0$
Basic concepts (cont.)
Introduce the test function $\phi(x)$ for a binary test of hypotheses:
$\phi(x) = 1$ (choose $H_1$) if $x \in R$; $\phi(x) = 0$ (choose $H_0$) if $x \in A$
$A \cup R$ is a disjoint covering of the measurement space.
If the measurement is in the Acceptance region, $H_0$ is accepted.
If it is in the Rejection region, $H_0$ is rejected, and $H_1$ is accepted.
Basic concepts (cont.)
Probability of False-Alarm (a.k.a. Size):
$H_0$ simple: $\alpha = P_{FA} = \Pr_{\theta_0}(\phi(X) = 1) = E_{\theta_0}\phi(X)$
$H_0$ composite: $\alpha = P_{FA} = \sup_{\theta \in \Theta_0} E_\theta \phi(X)$, i.e. the worst case.
Detection probability (a.k.a. Power):
$H_1$ simple: $\beta = P_D = \Pr_{\theta_1}(\phi(X) = 1) = E_{\theta_1}\phi(X)$
$H_1$ composite: $\beta(\theta) = P_D(\theta) = E_\theta \phi(X)$, for $\theta \in \Theta_1$
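The size and power definitions above can be checked numerically. A minimal Monte Carlo sketch for a scalar threshold test (the means, the threshold tau, and the trial count are illustrative choices, not values from the slides):

```python
import random

# Monte Carlo estimate of size (PFA) and power (PD) for the scalar test
# "declare H1 when X > tau", with X ~ N(0,1) under H0 and X ~ N(2,1) under H1.
# tau = 1.645 gives PFA close to 0.05 for a standard normal under H0.
random.seed(0)
tau = 1.645
n_trials = 200_000

pfa = sum(random.gauss(0.0, 1.0) > tau for _ in range(n_trials)) / n_trials
pd = sum(random.gauss(2.0, 1.0) > tau for _ in range(n_trials)) / n_trials

print(round(pfa, 3), round(pd, 3))  # PFA near 0.05, PD near Q(1.645 - 2) ~ 0.64
```
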
Basic concepts (cont.)
Receiver Operating Characteristics (ROC):
[Figure: ROC curve, $P_D$ vs. $P_{FA}$ on $[0,1]^2$, with the chance line $P_D = P_{FA}$]
The best test of size $\alpha$ has the largest $P_D$ among all tests of that size.
The Neyman-Pearson lemma
Let $\Theta = \{\theta_0, \theta_1\}$ and let $f_{\theta_i}(x)$ denote the density function of X under $H_i$. Then:
$\phi(x) = 1$ if $f_1(x) > k f_0(x)$; $\phi(x) = 0$ if $f_1(x) < k f_0(x)$
is the most powerful test of size $\alpha$ for testing which of the two
simple hypotheses is in force, for some $k \ge 0$.
The Neyman-Pearson lemma (cont.)
Proof: Let $\phi'(x)$ denote any test satisfying:
$0 \le \phi'(x) \le 1$
$\alpha' = E_{\theta_0}\phi'(X) \le E_{\theta_0}\phi(X) = \alpha$
Obviously (by its definition, $\phi$ maximizes the integrand pointwise):
$\int \phi(x)\,[f_1(x) - k f_0(x)]\,dx \ge \int \phi'(x)\,[f_1(x) - k f_0(x)]\,dx$
$\Rightarrow \int [\phi(x) - \phi'(x)]\,f_1(x)\,dx \ge \int [\phi(x) - \phi'(x)]\,k f_0(x)\,dx$
$\Rightarrow \beta - \beta' \ge k(\alpha - \alpha') \ge 0$
The Neyman-Pearson lemma (cont.)
Note 1: If $\alpha = 0$, then the most powerful test is:
$\phi(x) = 1$ if $f_0(x) = 0$; $\phi(x) = 0$ if $f_0(x) > 0$
Note 2: Introduce the likelihood ratio: $l(x) = \frac{f_1(x)}{f_0(x)}$
Then the most powerful test can be written as:
$\phi(x) = 1$ if $l(x) > k$; $\phi(x) = 0$ if $l(x) < k$
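A minimal sketch of this likelihood-ratio test for two simple Gaussian hypotheses, $H_0: X \sim N(0,1)$ vs. $H_1: X \sim N(1,1)$ (the means and the threshold k are illustrative choices, not values from the slides):

```python
import math

# Likelihood-ratio test for two simple unit-variance Gaussian hypotheses.
def normal_pdf(x, mean):
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2 * math.pi)

def lrt(x, k=1.0):
    l = normal_pdf(x, 1.0) / normal_pdf(x, 0.0)  # l(x) = f1(x)/f0(x)
    return 1 if l > k else 0                     # 1: accept H1, 0: accept H0

# Here l(x) = exp(x - 1/2) is increasing in x, so l(x) > 1 iff x > 1/2.
print(lrt(2.0), lrt(-1.0))  # prints "1 0"
```
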
The Neyman-Pearson lemma (cont.)
Note 3: Choosing the threshold k.
Denote by $f_0(l)$ the probability density of the likelihood ratio under $H_0$. Then:
$\alpha = \Pr_{\theta_0}(l(X) > k) = \int_k^\infty f_0(l)\,dl$
Note 4: If $F_0(l)$ is not continuous in $l$ (i.e. $\exists k : \Pr_{\theta_0}(l(X) = k) > 0$),
then the previous equation might not work! In that case, use the randomized test:
$\phi(x) = 1$ if $l(x) > k$; $\phi(x) = \gamma$ if $l(x) = k$; $\phi(x) = 0$ if $l(x) < k$
(toss a $\gamma$-coin, and choose $H_1$ if heads up)
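When the likelihood ratio is increasing in a Gaussian statistic, Note 3 reduces to inverting the normal tail. A sketch using Python's statistics.NormalDist (the value of alpha is illustrative):

```python
from statistics import NormalDist

# Choosing the threshold for size alpha when l(x) is increasing in a standard
# normal statistic: alpha = Pr0(l(X) > k) = Pr0(X > x0), so x0 = Phi^{-1}(1 - alpha).
alpha = 0.05
x0 = NormalDist().inv_cdf(1 - alpha)
achieved = 1 - NormalDist().cdf(x0)  # the size actually achieved
print(round(x0, 3), round(achieved, 3))  # prints "1.645 0.05"
```
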
Binary comm. in AWGN
Source ($i \in \{0,1\}$) → Mapper ($m_i \in \mathbb{R}^N$) → $X = m_i + n$, with $n \sim N(0,\, \sigma^2 I)$
'1' = Enemy spotted. '0' = All is clear.
Prior probabilities unknown!
$\Pr(i = 0) = 1 - \Pr(i = 1) = ?$
$H_0 : X \sim N(m_0,\, \sigma^2 I)$ vs. $H_1 : X \sim N(m_1,\, \sigma^2 I)$
Binary comm. in AWGN (cont.)
$l(x) = \frac{f_1(x)}{f_0(x)} = \exp\left(-\frac{(x - m_1)^T(x - m_1) - (x - m_0)^T(x - m_0)}{2\sigma^2}\right)$
The natural logarithm is monotone, enabling the use of the log-likelihood:
$\phi(x) = 1$ if $l(x) > k \iff \ln l(x) > \ln k \iff L(x) > \tau$
(and $\phi(x) = 0$ otherwise)
Binary comm. (cont.)
$L(x) = \ln l(x) = w^T (x - x_0)$, with $w = \frac{m_1 - m_0}{\sigma^2}$ and $x_0 = \frac{m_0 + m_1}{2}$
[Figure: the decision boundary is the hyperplane $w^T(x - x_0) = 0$, separating $m_0$ from $m_1$]
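A minimal sketch of this matched-filter log-likelihood, $L(x) = w^T(x - x_0)$ with $w = (m_1 - m_0)/\sigma^2$ and $x_0 = (m_0 + m_1)/2$; plain Python lists stand in for vectors, and the example signals are illustrative:

```python
# Matched-filter log-likelihood L(x) = w^T (x - x0) for the AWGN binary test.
def log_likelihood(x, m0, m1, sigma2):
    w = [(a - b) / sigma2 for a, b in zip(m1, m0)]      # w = (m1 - m0)/sigma^2
    x0 = [(a + b) / 2.0 for a, b in zip(m0, m1)]        # x0 = (m0 + m1)/2
    return sum(wi * (xi - x0i) for wi, xi, x0i in zip(w, x, x0))

m0, m1 = [0.0, 0.0], [2.0, 0.0]
# A measurement at m1 gives L > 0 (decide H1); at m0, L < 0 (decide H0).
print(log_likelihood(m1, m0, m1, 1.0), log_likelihood(m0, m0, m1, 1.0))  # prints "2.0 -2.0"
```
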
Binary comm. (cont.)
$H_0 : L(X) \sim N(-d^2/2,\, d^2)$
$H_1 : L(X) \sim N(d^2/2,\, d^2)$
where $d^2 = \frac{(m_1 - m_0)^T (m_1 - m_0)}{\sigma^2} = \frac{\|m_1 - m_0\|^2}{\sigma^2}$
Assume equal energies, $E_S = \|m_0\|^2 = \|m_1\|^2$, and define $\rho = \frac{\langle m_0, m_1 \rangle}{E_S}$
Then $\|m_0 - m_1\|^2 = 2E_S(1 - \rho)$, so $d^2 = \frac{2E_S(1 - \rho)}{\sigma^2}$
Binary comm. (cont.)
$H_0 : \frac{L(X)}{d} \sim N(-d/2,\, 1)$ and $H_1 : \frac{L(X)}{d} \sim N(d/2,\, 1)$, with $d = \sqrt{\frac{2E_S(1 - \rho)}{\sigma^2}}$
[Figure: the two unit-variance densities of $L(X)/d$, centered at $-d/2$ and $d/2$; $P_{FA}$ and $P_D$ are the tail areas beyond the threshold]
Binary comm. (cont.)
$P_{FA} = \Pr\left(N(-d^2/2,\, d^2) > \tau\right) = Q\left(\frac{\tau + d^2/2}{d}\right)$, where $Q(z) = \int_z^\infty \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt$
$P_D = \Pr\left(N(d^2/2,\, d^2) > \tau\right) = Q\left(\frac{\tau - d^2/2}{d}\right) = Q\left(Q^{-1}(P_{FA}) - d\right)$
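The ROC relation $P_D = Q(Q^{-1}(P_{FA}) - d)$ can be checked against direct computation from $L(X) \sim N(\pm d^2/2,\, d^2)$. A sketch (the values of d and tau are illustrative):

```python
from statistics import NormalDist

# Q is the standard normal tail, Qinv its inverse.
Q = lambda z: 1 - NormalDist().cdf(z)
Qinv = lambda p: NormalDist().inv_cdf(1 - p)

d, tau = 2.0, 0.5
pfa = Q((tau + d * d / 2) / d)        # PFA from the H0 density of L(X)
pd_direct = Q((tau - d * d / 2) / d)  # PD from the H1 density of L(X)
pd_roc = Q(Qinv(pfa) - d)             # PD from the ROC relation
print(round(pfa, 4), round(pd_direct, 4), round(pd_roc, 4))
```
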
Binary comm. (cont.)
[Figure: $Q(z)$ and the shifted curve $Q(z - d)$; increasing $d$ separates the hypotheses and improves the ROC]
UMP Tests
The Neyman-Pearson lemma holds for simple hypotheses.
Uniformly Most Powerful tests generalize it to composite hypotheses.
A test $\phi(X)$ is UMP of size $\alpha$ if, for any other test $\phi'(X)$, we have:
$\sup_{\theta \in \Theta_0} E_\theta \phi'(X) = \alpha' \le \alpha = \sup_{\theta \in \Theta_0} E_\theta \phi(X)$
$E_\theta \phi'(X) = \beta'(\theta) \le \beta(\theta) = E_\theta \phi(X)$ for all $\theta \in \Theta_1$
UMP Tests (cont.)
Consider scalar R.V.s whose PDFs $f_\theta(x)$ are parameterized by a scalar $\theta$.
Karlin-Rubin Theorem (for UMP one-sided tests):
If the likelihood ratio $l(x)$ is monotone non-decreasing in x for any pair $\theta_1 > \theta_0$, then the test:
$\phi(x) = 1$ if $x > x_0$; $\phi(x) = \gamma$ if $x = x_0$; $\phi(x) = 0$ if $x < x_0$, with $E_{\theta_0}\phi(X) = \alpha$
is the UMP test of size $\alpha$ for testing $H_0 : \theta \le \theta_0$ vs. $H_1 : \theta > \theta_0$
(The larger x is, the more probable $H_1$ looks. The size is computed at the boundary $\theta_0$.)
UMP Tests (cont.)
Proof: begin with fixed values $\theta_0$ and $\theta_1 > \theta_0$.
By the Neyman-Pearson lemma, the most powerful test of size $\alpha$ for
testing $H_0' : \theta = \theta_0$ vs. $H_1' : \theta = \theta_1$ is:
$\phi(x) = 1$ if $l(x) > k$; $\phi(x) = \gamma$ if $l(x) = k$; $\phi(x) = 0$ if $l(x) < k$, with $E_{\theta_0}\phi(X) = \alpha$
As the likelihood ratio is monotone in x, we may replace it with the threshold test:
$\phi(x) = 1$ if $x > x_0$; $\phi(x) = \gamma$ if $x = x_0$; $\phi(x) = 0$ if $x < x_0$, with $E_{\theta_0}\phi(X) = \alpha$
UMP Tests (cont.)
The test is independent of $\theta_1$, so the argument holds for every $\theta_1 > \theta_0$,
making $\phi(x)$ the most powerful test of size $\alpha$ at $\theta = \theta_0$ for testing
the composite alternative $H_1$ vs. the simple hypothesis $H_0'$.
Consider now the power function $\beta(\theta)$.
At $\theta = \theta_0$, $\beta(\theta_0) = \alpha$. For any $\theta_1 > \theta_0$, $\beta(\theta_1) \ge \alpha$, because $\phi(x)$ is
more powerful than the trivial test $\phi'(x) \equiv \alpha$.
A similar argument holds for any $\theta_2 < \theta_0$.
UMP Tests (cont.)
Thus, we conclude that $\beta(\theta) = E_\theta\phi(x)$ is non-decreasing.
Consequently, $\phi(x)$ is also a test whose size satisfies
$\sup_{\theta \le \theta_0} E_\theta \phi(x) = \alpha$
($\theta_0$ is the boundary of $\Theta_0$; all other $\theta \in \Theta_0$ are "easier".)
Finally, no test $\phi'(x)$ of size $\alpha' \le \alpha$ can have power $\beta'(\theta) > \beta(\theta)$ for any $\theta \in \Theta_1$,
as that would contradict the Neyman-Pearson lemma at the pair $(\theta_0, \theta)$.
A note on sufficiency
The statistic T(x) is sufficient for $\theta$ if and only if
$\Pr(X \le x \mid T(x) = t,\, \theta) = \Pr(X \le x \mid T(x) = t)$
No other statistic which can be calculated from the same sample
provides any additional information as to the value of the parameter.
Fisher-Neyman factorization theorem:
The statistic T(x) is sufficient for $\theta$ if and only if $f_\theta(x) = a(x)\, b_\theta(T(x))$
One can write the likelihood ratio in terms of the sufficient statistic.
UMP Tests (cont.)
UMP one-sided tests exist for a host of problems!
Theorem: the one-parameter exponential family of distributions with density:
$f_\theta(x) = c(\theta)\, h(x) \exp(\pi(\theta)\, t(x))$
has a monotone likelihood ratio in the sufficient statistic $t(x)$,
provided that $\pi(\theta)$ is non-decreasing.
Proof: for $\theta_1 > \theta_0$,
$l(x) = \frac{f_{\theta_1}(x)}{f_{\theta_0}(x)} = \frac{c(\theta_1)}{c(\theta_0)} \exp\left([\pi(\theta_1) - \pi(\theta_0)]\, t(x)\right)$
UMP Tests (cont.)
Example: Let $X = (X_0, X_1, \ldots, X_{M-1})$ denote i.i.d. $N(\theta m,\, R)$ vectors. Then:
$f_\theta(X) = (2\pi)^{-MN/2}\, |R|^{-M/2} \exp\left(-\frac{1}{2}\sum_{n=0}^{M-1} (X_n - \theta m)^T R^{-1} (X_n - \theta m)\right)$
$= \underbrace{(2\pi)^{-MN/2}\, |R|^{-M/2} \exp\left(-\frac{1}{2}\theta^2 M\, m^T R^{-1} m\right)}_{c(\theta)}\; \underbrace{\exp\left(-\frac{1}{2}\sum_{n=0}^{M-1} X_n^T R^{-1} X_n\right)}_{h(x)}\; \exp\left(\theta\, m^T R^{-1} \sum_{n=0}^{M-1} X_n\right)$
so $t(x) = m^T R^{-1} \sum_{n=0}^{M-1} X_n$ is a sufficient statistic for $\theta$, with $\pi(\theta) = \theta$.
UMP Tests (cont.)
Therefore, the test
$\phi(t) = 1$ if $t > t_0$; $\phi(t) = 0$ if $t < t_0$, with $E_{\theta_0}\phi(t) = \alpha$
is the Uniformly Most Powerful test of size $\alpha$ for testing
$H_0 : \theta \le \theta_0$ vs. $H_1 : \theta > \theta_0$
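This UMP test can be checked by simulation in the special case $R = I$, where the sufficient statistic simplifies to $t(x) = m^T \sum_n X_n \sim N(\theta M \|m\|^2,\, M\|m\|^2)$. A sketch (m, M, alpha, and the trial count are illustrative choices):

```python
import random, math
from statistics import NormalDist

# UMP one-sided test for i.i.d. X_n ~ N(theta*m, I), using t(x) = m^T sum_n X_n.
random.seed(1)
m = [1.0, 2.0]
M, alpha, theta0 = 8, 0.1, 0.0
energy = sum(mi * mi for mi in m)          # |m|^2
sigma_t = math.sqrt(M * energy)            # std of t under any theta
t0 = theta0 * M * energy + sigma_t * NormalDist().inv_cdf(1 - alpha)

def t_stat(xs):  # xs: list of M measurement vectors
    return sum(mi * xi for x in xs for mi, xi in zip(m, x))

def draw(theta):
    return [[theta * mi + random.gauss(0, 1) for mi in m] for _ in range(M)]

pfa = sum(t_stat(draw(0.0)) > t0 for _ in range(20000)) / 20000
print(round(pfa, 3))  # should be close to alpha = 0.1
```
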
Invariance
Revisit the binary communication example, but with a slight change.
Source ($i \in \{0,1\}$) → Mapper ($m_i \in \mathbb{R}^N$) → $Y = m_i + \theta \mathbf{1} + n$, with $n \sim N(0,\, \sigma^2 I)$, so that
$Y \sim N(m_i + \theta \mathbf{1},\, \sigma^2 I)$, where $\theta$ is an unknown bias.
So what?! Let us continue with the log-likelihood as before...
$\phi(x) = 1$ if $L(x) > \tau$; $\phi(x) = 0$ if $L(x) < \tau$
Oops:
$L(y) = \ln l(y) = w^T (y - x_0 - \theta \mathbf{1})$ depends on the unknown $\theta$!
Invariance (cont.)
Intuitively: search for a statistic that is invariant to the nuisance parameter $\theta$:
$Z = PY = \left(I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right) Y$
Project the measurement on the subspace orthogonal to the disturbance!
[Figure: $m_0$, $m_1$, the disturbance direction $\mathbf{1}$, and the projection $Z = PY$. Which signals are optimal under this projection?]
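For this particular $P = I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T$, applying the projection is exactly subtracting the sample mean, so a minimal sketch needs no matrices (plain lists stand in for vectors):

```python
# Projection onto the subspace orthogonal to the all-ones disturbance:
# z = (I - (1/N) 1 1^T) y is y minus its mean.
def project(y):
    mean = sum(y) / len(y)
    return [yi - mean for yi in y]

y = [3.0, 1.0, 2.0]
print(project(y))                                     # prints "[1.0, -1.0, 0.0]"
print(project([yi + 7.0 for yi in y]) == project(y))  # bias-invariant: prints "True"
```
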
Invariance (formal discussion)
Let G denote a group of transformations, $g(x) \in G$.
X has probability distribution $F_\theta(x)$.
$Y = g(X)$ has distribution $F'_\theta(y) = \Pr_\theta(g(X) \le y)$
The family of distributions $\{F_\theta(x),\, \theta \in \Theta\}$ is invariant to G
if $F'_\theta(y) = F_{\bar{g}(\theta)}(y)$: same distribution, different parameter.
The hypothesis testing problem $H_0$ vs. $H_1$ is invariant to G
if $\bar{g}(\Theta_0) = \Theta_0$ and $\bar{g}(\Theta_1) = \Theta_1$: the original dichotomy is preserved.
Invariance (cont.)
Revisit the previous example (AWGN channel with unknown bias).
The measurement $Y = X + \theta \mathbf{1}$ is distributed as $Y \sim N(m_i + \theta \mathbf{1},\, \sigma^2 I)$,
so it is reasonable to test $H_0 : m = m_0 + \theta \mathbf{1}$ vs. $H_1 : m = m_1 + \theta \mathbf{1}$.
The unknown bias suggests a group of transformations invariant to it: $G = \{g : g(y) = y + c\mathbf{1}\}$
$g(Y) \sim N(m_i + (\theta + c)\mathbf{1},\, \sigma^2 I)$
$\Rightarrow \bar{g}(m_i + \theta \mathbf{1}) = m_i + (\theta + c)\mathbf{1} \in H_i$
Invariance (cont.)
The test $\phi(x)$ is invariant to G if $\phi(g(x)) = \phi(x)$.
A statistic M(x) is said to be a maximal invariant statistic if:
1. (invariant) $M(g(x)) = M(x)$ for all $g \in G$
2. (maximal) $M(x_1) = M(x_2) \Rightarrow x_2 = g(x_1)$ for some $g \in G$
M(x) organizes the measurements x into equivalence classes on which:
1. M(x) is constant
2. Every x is related to every other through a transformation $g \in G$
Every invariant test may be written as $\phi(x) = \psi(M(x))$.
Invariance (cont.)
 y : y  y2  c1 
  y : M  y   Py2 
 y : y  y1  c1 
  y : M  y   Py1
1
y1
Py2
Py1
33
Invariance (cont.)
Let us show that $z = Py$ is indeed a maximal invariant statistic.
1. (invariant) $P(y + c\mathbf{1}) = Py$, since $P\mathbf{1} = 0$, so $M(g(y)) = M(y)$ for all $g \in G$
2. (maximal) $Py_1 = Py_2 \Rightarrow \left(I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right) y_1 = \left(I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right) y_2$
$\Rightarrow y_2 = y_1 + \tfrac{1}{N}\mathbf{1}^T (y_2 - y_1)\, \mathbf{1} = y_1 + c\mathbf{1}$
i.e. $M(y_1) = M(y_2) \Rightarrow y_2 = g(y_1)$ for some $g \in G$
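Both properties can be checked numerically for the mean-removing projection. A sketch (the vectors y1 and the constant c are arbitrary illustrative values):

```python
# Numeric check of maximality: if the projections agree, the two measurements
# differ by a constant vector c*1, recovering the group element g(y) = y + c*1.
def project(y):
    mean = sum(y) / len(y)
    return [yi - mean for yi in y]

y1 = [0.5, -1.0, 2.0]
c = 3.25
y2 = [yi + c for yi in y1]

print(project(y1) == project(y2))  # invariance along the orbit: prints "True"
diff = [b - a for a, b in zip(y1, y2)]
print(all(abs(di - diff[0]) < 1e-12 for di in diff))  # y2 - y1 = c*1: prints "True"
```
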
Invariance (another example)
Another example: $X = \theta S + \sigma n$, with $n \sim N(0, I)$ and $\sigma$ unknown!
As $X \sim N(\theta S,\, \sigma^2 I)$, it is reasonable to test $H_0 : \theta = 0$ vs. $H_1 : \theta > 0$.
Consider the group of transformations: $G = \{g : g(y) = cy,\ c > 0\}$
The hypothesis testing problem is invariant to G.
Invariance (another example, cont.)
What statistic is invariant to the scale of S?
The angle between the measurement and the signal subspace
(or, equivalently, the subspace orthogonal to it):
$Z = \tan(\phi) = \frac{S^T X}{\sqrt{S^T S}\, \sqrt{X^T (I - P_S) X}}$
In fact, Z is a maximal invariant statistic to a broader group G', which also includes
rotation in the subspace orthogonal to S.
G' is specifically appropriate for channels that introduce such rotation as well as gain.
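A minimal sketch of this invariant statistic, with plain lists as vectors (the example S and x are illustrative); scaling the measurement by any $c > 0$ leaves Z unchanged:

```python
import math

# Scale-invariant statistic Z = (S^T x) / sqrt(S^T S * x^T (I - P_S) x).
def invariant_stat(S, x):
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    proj = [dot(S, x) / dot(S, S) * si for si in S]   # P_S x
    resid = [xi - pi for xi, pi in zip(x, proj)]      # (I - P_S) x
    return dot(S, x) / math.sqrt(dot(S, S) * dot(resid, resid))

S, x = [1.0, 0.0, 0.0], [2.0, 1.0, 1.0]
z1 = invariant_stat(S, x)
z2 = invariant_stat(S, [5.0 * xi for xi in x])  # scaled measurement
print(abs(z1 - z2) < 1e-12)  # prints "True"
```
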
Invariance (UMPI & summary)
Invariance may be used to compress measurements into statistics of
low dimensionality that satisfy invariance conditions.
It is often possible to find a UMP test within the class of invariant tests.
Steps when applying invariance:
1. Find a meaningful group of transformations for which the
hypothesis testing problem is invariant.
2. Find a maximal invariant statistic M, and construct a likelihood-ratio test.
3. If M has a monotone likelihood ratio, then the test is UMPI for
testing one-sided hypotheses of the form $H_0 : \theta \le \theta_0$ vs. $H_1 : \theta > \theta_0$
Note: Sufficiency principles may facilitate this process.
CFAR (introductory example)
$X = \theta S + n$, with $n \sim N(0,\, \sigma^2 I)$ and $\sigma$ known; the test: $H_0 : \theta = 0$ vs. $H_1 : \theta > 0$
$\phi(x) = 1$ if $m > m_0$; $\phi(x) = 0$ if $m < m_0$, where
$m = \frac{S^T X}{\sigma \sqrt{S^T S}} \sim N\left(\frac{\theta \sqrt{E_S}}{\sigma},\, 1\right)$
Project the measurement on the signal space.
$P_{FA} = Q(m_0)$. A UMP test!
$P_D = Q\left(m_0 - \frac{\theta \sqrt{E_S}}{\sigma}\right)$
The False-Alarm Rate is Constant. Thus: CFAR.
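With $\sigma$ known, the claim $P_{FA} = Q(m_0)$ can be verified by simulation: $m$ is standard normal under $H_0$ whatever $\sigma$ is. A sketch (S, sigma, m0, and the trial count are illustrative):

```python
import random, math
from statistics import NormalDist

# Matched filter with known sigma: m = S^T X / (sigma * sqrt(S^T S)) ~ N(0,1)
# under H0, so PFA = Q(m0) for any threshold m0.
random.seed(3)
S, sigma, m0 = [2.0, 1.0], 1.5, 1.0

def m_stat(x):
    num = sum(si * xi for si, xi in zip(S, x))
    return num / (sigma * math.sqrt(sum(si * si for si in S)))

trials = 50000
pfa_mc = sum(m_stat([random.gauss(0, sigma) for _ in S]) > m0
             for _ in range(trials)) / trials
pfa_theory = 1 - NormalDist().cdf(m0)  # Q(m0)
print(round(pfa_mc, 3), round(pfa_theory, 3))
```
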
CFAR (cont.)
Now let $\sigma$ be unknown; the test is still $H_0 : \theta = 0$ vs. $H_1 : \theta > 0$.
The statistic m now depends on the unknown $\sigma$.
The test is useless. Certainly not CFAR.
Redraw the problem as $X = \theta S + \sigma n$, with $n \sim N(0, I)$, and utilize invariance!
$H_0 : \theta = 0,\ \sigma > 0$ vs. $H_1 : \theta > 0,\ \sigma > 0$
CFAR (cont.)
As before, but changed slightly:
$t = \frac{S^T X}{\sqrt{S^T S}\, \sqrt{X^T (I - P_S) X / (N - 1)}}$
The numerator satisfies $\frac{S^T X}{\sigma \sqrt{S^T S}} \sim N\left(\frac{\theta \sqrt{E_S}}{\sigma},\, 1\right)$,
the denominator satisfies $\frac{X^T (I - P_S) X}{\sigma^2} \sim \chi^2_{N-1}$ ($\frac{1}{\sigma}$ times an estimate of $\sigma$),
and the two are independent.
The bottom line:
$t \sim t_{N-1}$ (central) under $H_0$; $t \sim t_{N-1}\left(\frac{\theta \sqrt{E_S}}{\sigma}\right)$ (non-central) under $H_1$
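That the distribution of t under $H_0$ is free of $\sigma$ can be checked by Monte Carlo: the estimated false-alarm rate is essentially the same at very different noise levels. A sketch (S, the threshold t0, and the trial count are illustrative):

```python
import random, math

# CFAR check: the t-type statistic gives the same false-alarm rate for any
# unknown noise level sigma under H0 (theta = 0).
random.seed(2)
S = [1.0, 1.0, 1.0, 1.0]
t0 = 2.0

def t_stat(x):
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    proj = [dot(S, x) / dot(S, S) * si for si in S]   # P_S x
    resid = [xi - pi for xi, pi in zip(x, proj)]      # (I - P_S) x
    denom = math.sqrt(dot(S, S) * dot(resid, resid) / (len(x) - 1))
    return dot(S, x) / denom

def pfa(sigma, trials=20000):
    hits = 0
    for _ in range(trials):
        x = [random.gauss(0.0, sigma) for _ in S]     # H0: theta = 0
        hits += t_stat(x) > t0
    return hits / trials

print(round(pfa(1.0), 3), round(pfa(10.0), 3))  # nearly equal: CFAR
```
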
CFAR (cont.)
The distribution of t is completely characterized under $H_0$ even though $\sigma$ is unknown!
Thus, we can set a threshold in the test $\phi(t) = 1$ if $t > t_0$; $\phi(t) = 0$ if $t < t_0$
in order to obtain the desired $P_{FA}$: CFAR!
Furthermore, as the likelihood ratio of the non-central t distribution is monotone, this
test is UMPI for testing $H_0 : \theta = 0$ vs. $H_1 : \theta > 0$ in the distribution
$X \sim N(\theta S,\, \sigma^2 I)$ when $\sigma$ is unknown!
CFAR (cont.)
The actual probability of detection depends on the actual value of the SNR
Summary

Basic concepts

Neyman-Pearson lemma

UMP

Invariance

CFAR