Invariant Hypothesis Testing
Presenting: Lihu Berman
Agenda
Basic concepts
Neyman-Pearson lemma
UMP
Invariance
CFAR
Basic concepts
$X$ is a random vector with distribution $F_X(x;\theta)$.
$\theta$ is a parameter, belonging to the parameter space $\Theta$.
$\Theta_0, \Theta_1$ is a disjoint covering of the parameter space: $\Theta_0 \cup \Theta_1 = \Theta$, $\Theta_0 \cap \Theta_1 = \emptyset$.
$H_i$ denotes the hypothesis that $\theta \in \Theta_i$.
Binary test of hypotheses: $H_0: \theta \in \Theta_0$ vs. $H_1: \theta \in \Theta_1$.
M-ary test: $H_0: \theta \in \Theta_0$ vs. $H_1: \theta \in \Theta_1$ vs. $\cdots$ vs. $H_{M-1}: \theta \in \Theta_{M-1}$.
Basic concepts (cont.)
If $\Theta_i = \{\theta_i\}$ (a single point), then $H_i$ is said to be a simple hypothesis.
Otherwise, it is said to be a composite hypothesis.
Example: $H_0: \theta = \theta_0$ vs. $H_1: \theta \neq \theta_0$ is a test of a simple vs. a composite hypothesis.
RADAR: is there a target or not?
Physical model: is the coin we are tossing fair or not?
A two-sided test: the alternative $H_1$ lies on both sides of $H_0$.
A one-sided test (for scalar $\theta$): $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.
Basic concepts (cont.)
Introduce the test function $\phi(x)$ for a binary test of hypotheses:
$$\phi(x) = \begin{cases} 1 \sim H_1, & x \in R \\ 0 \sim H_0, & x \in A \end{cases}$$
$A \cup R$ is a disjoint covering of the measurement space.
If the measurement is in the Acceptance region, $H_0$ is accepted.
If it is in the Rejection region, $H_0$ is rejected, and $H_1$ is accepted.
Basic concepts (cont.)
Probability of False-Alarm (a.k.a. Size):
$H_0$ simple: $P_{FA} = \Pr_{\theta_0}\{\phi(X) = 1\} = E_{\theta_0}\,\phi(X)$
$H_0$ composite: $P_{FA} = \sup_{\theta \in \Theta_0} E_{\theta}\,\phi(X)$, i.e. the worst case over $\Theta_0$.
Detection probability (a.k.a. Power):
$H_1$ simple: $P_D = \Pr_{\theta_1}\{\phi(X) = 1\} = E_{\theta_1}\,\phi(X)$
$H_1$ composite: $P_D = \beta(\theta) = E_{\theta}\,\phi(X)$ for $\theta \in \Theta_1$ (a function of $\theta$, not a single number).
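As a quick numeric illustration (a minimal sketch, not from the slides, assuming a scalar measurement $X \sim N(0,1)$ under $H_0$, $X \sim N(\mu,1)$ under $H_1$, and an assumed threshold), size and power of a threshold test follow directly from the normal tail function:

```python
# Minimal sketch (assumed scalar Gaussian model): size and power of the
# threshold test phi(x) = 1{x > x0}.
from scipy.stats import norm

mu, x0 = 2.0, 1.645                 # assumed mean shift and threshold
p_fa = norm.sf(x0)                  # P_FA = Pr_{H0}(X > x0)
p_d = norm.sf(x0 - mu)              # P_D  = Pr_{H1}(X > x0)
print(f"P_FA = {p_fa:.3f}, P_D = {p_d:.3f}")   # ~0.050, ~0.639
```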
Basic concepts (cont.)
Receiver Operating Characteristic (ROC):
[Figure: ROC curve, $P_D$ vs. $P_{FA}$ on the unit square, with the chance line $P_D = P_{FA}$.]
The best test of size $\alpha$ has the largest $P_D$ among all tests of that size.
The Neyman-Pearson lemma
Let $\Theta = \{\theta_0, \theta_1\}$ and let $f_i(x)$ denote the density function of $X$ under $H_i$. Then:
$$\phi(x) = \begin{cases} 1, & f_1(x) > k\, f_0(x) \\ 0, & f_1(x) < k\, f_0(x) \end{cases}$$
is the most powerful test of size $\alpha = E_{\theta_0}\,\phi(X)$ for testing which of the two simple hypotheses is in force, for some $k \geq 0$.
The Neyman-Pearson (cont.)
Proof: Let $\phi'(x)$ denote any test satisfying:
$$0 \leq \phi'(x) \leq 1, \qquad \alpha' = E_{\theta_0}\,\phi'(X) \leq E_{\theta_0}\,\phi(X) = \alpha$$
Since $\phi(x) = 1$ wherever $f_1(x) - k f_0(x) > 0$ and $\phi(x) = 0$ wherever it is negative, obviously:
$$\int \phi(x)\,[f_1(x) - k f_0(x)]\,dx \;\geq\; \int \phi'(x)\,[f_1(x) - k f_0(x)]\,dx$$
$$\Rightarrow \int [\phi(x) - \phi'(x)]\,f_1(x)\,dx \;\geq\; \int [\phi(x) - \phi'(x)]\,k\,f_0(x)\,dx$$
$$\Rightarrow \beta - \beta' \;\geq\; k\,(\alpha - \alpha') \;\geq\; 0$$
The Neyman-Pearson (cont.)
Note 1: If $\alpha = 0$, then the most powerful test is:
$$\phi(x) = \begin{cases} 1, & f_0(x) = 0 \\ 0, & f_0(x) > 0 \end{cases}$$
Note 2: Introduce the likelihood ratio: $l(x) = \dfrac{f_1(x)}{f_0(x)}$
Then the most powerful test can be written as:
$$\phi(x) = \begin{cases} 1, & l(x) > k \\ 0, & l(x) < k \end{cases}$$
The Neyman-Pearson (cont.)
Note 3: Choosing the threshold $k$.
Denote by $f_0(l)$ the probability density of the likelihood ratio under $H_0$; then:
$$\alpha = \Pr_{\theta_0}\{l(X) > k\} = \int_k^{\infty} f_0(l)\,dl$$
Note 4: If the CDF $F_0(l)$ is not continuous (i.e. $\exists k: \Pr_{\theta_0}\{l(X) = k\} > 0$),
then the previous equation might not have a solution. In that case, use the randomized test:
$$\phi(x) = \begin{cases} 1, & l(x) > k \\ \gamma, & l(x) = k \\ 0, & l(x) < k \end{cases}$$
"Toss a coin, and choose $H_1$ if heads up": when $l(x) = k$, reject $H_0$ with probability $\gamma$, chosen so that the overall size equals $\alpha$.
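A minimal sketch of Note 3, assuming a scalar Gaussian shift model (all values illustrative): when no closed form is available, the threshold $k$ can be set from the empirical $(1-\alpha)$ quantile of $l(X)$ under $H_0$:

```python
# Minimal sketch (assumed setup): pick the NP threshold k so that
# Pr_{H0}(l(X) > k) = alpha, via Monte Carlo samples of l(X) under H0.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
alpha, mu = 0.05, 2.0                      # assumed size and H1 mean shift
x0 = rng.normal(0.0, 1.0, size=100_000)    # draws of X under H0: N(0,1)
lr = norm.pdf(x0, loc=mu) / norm.pdf(x0)   # likelihood ratio l(X) = f1/f0
k = np.quantile(lr, 1.0 - alpha)           # empirical (1 - alpha) quantile
print(f"k ~= {k:.3f}, empirical P_FA = {(lr > k).mean():.3f}")
```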
Binary comm. in AWGN
Source $i \in \{0,1\}$ → Mapper → $m_i$ → (+) → $X$, with noise $n \sim N(0, \sigma^2 I_N)$.
'1' = Enemy spotted. '0' = All is clear.
Prior probabilities unknown!
$\Pr\{i=0\} = 1 - \Pr\{i=1\} = ?$
$$H_0: X \sim N(m_0, \sigma^2 I) \quad \text{vs.} \quad H_1: X \sim N(m_1, \sigma^2 I)$$
Binary comm. in AWGN
$$l(x) = \frac{f_1(x)}{f_0(x)} = \exp\left( -\frac{(x - m_1)^T (x - m_1)}{2\sigma^2} + \frac{(x - m_0)^T (x - m_0)}{2\sigma^2} \right)$$
The natural logarithm is monotone, enabling the use of the log-likelihood:
$$\phi(x) = \begin{cases} 1, & l(x) > k \\ 0, & l(x) < k \end{cases} \iff \begin{cases} 1, & \ln l(x) > \ln k \\ 0, & \ln l(x) < \ln k \end{cases} \iff \begin{cases} 1, & L(x) > \tau \\ 0, & L(x) < \tau \end{cases}$$
Binary comm. (cont.)
$$L(x) = \ln l(x) = \underbrace{\frac{(m_1 - m_0)^T}{\sigma^2}}_{w^T} \Big( x - \underbrace{\frac{m_1 + m_0}{2}}_{x_0} \Big) = w^T (x - x_0)$$
and the test reduces to $w^T (x - x_0) \gtrless \tau_0$.
[Figure: the decision boundary is the hyperplane through $x_0 = (m_0 + m_1)/2$ orthogonal to $w \propto m_1 - m_0$, separating $m_0$ from $m_1$.]
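The resulting detector is just a correlator against $w = (m_1 - m_0)/\sigma^2$. A minimal sketch, with made-up signals $m_0$, $m_1$ and noise level:

```python
# Minimal sketch (assumed signals): the log-likelihood detector
# L(x) = w^T (x - x0) for H0: N(m0, s^2 I) vs. H1: N(m1, s^2 I).
import numpy as np

def detect(x, m0, m1, sigma2, tau=0.0):
    """Return 1 (decide H1) if L(x) > tau, else 0."""
    w = (m1 - m0) / sigma2          # matched-filter direction
    x0 = 0.5 * (m0 + m1)            # midpoint between the two means
    return int(w @ (x - x0) > tau)

rng = np.random.default_rng(1)
m0, m1, sigma2 = np.array([1., 0.]), np.array([0., 1.]), 0.25
x = m1 + np.sqrt(sigma2) * rng.standard_normal(2)   # a draw under H1
print(detect(x, m0, m1, sigma2))    # usually 1
```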
Binary comm. (cont.)
$$H_0: L(X) \sim N\!\left(-\frac{d^2}{2},\, d^2\right), \qquad H_1: L(X) \sim N\!\left(\frac{d^2}{2},\, d^2\right)$$
where
$$d^2 = \frac{(m_1 - m_0)^T (m_1 - m_0)}{\sigma^2} = \frac{\|m_1\|^2 + \|m_0\|^2 - 2\langle m_0, m_1 \rangle}{\sigma^2}$$
Assume equal energies $E_S = \|m_0\|^2 = \|m_1\|^2$ and define
$$\rho = \frac{\langle m_0, m_1 \rangle}{\|m_0\|\,\|m_1\|} \quad \Rightarrow \quad d^2 = \frac{2 E_S (1 - \rho)}{\sigma^2}$$
Binary comm. (cont.)
Normalizing by $d$:
$$H_0: \frac{L(X)}{d} \sim N\!\left(-\frac{d}{2},\, 1\right), \qquad H_1: \frac{L(X)}{d} \sim N\!\left(\frac{d}{2},\, 1\right), \qquad d^2 = \frac{2 E_S (1 - \rho)}{\sigma^2}$$
[Figure: two unit-variance Gaussians centered at $-d/2$ and $d/2$; the threshold splits the axis, with $P_{FA}$ the $H_0$ tail and $P_D$ the $H_1$ tail above it.]
Binary comm. (cont.)
$$P_{FA} = \Pr\left\{ N\!\left(-\frac{d^2}{2},\, d^2\right) > \tau \right\} = Q\!\left( \frac{\tau}{d} + \frac{d}{2} \right), \qquad Q(z) = \frac{1}{\sqrt{2\pi}} \int_z^{\infty} \exp\!\left(-\frac{t^2}{2}\right) dt$$
$$P_D = \Pr\left\{ N\!\left(\frac{d^2}{2},\, d^2\right) > \tau \right\} = Q\!\left( \frac{\tau}{d} - \frac{d}{2} \right) = Q\!\left( Q^{-1}(P_{FA}) - d \right)$$
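A minimal sketch of the ROC relation above, using scipy's `norm.sf` and `norm.isf` for $Q$ and $Q^{-1}$ (the deflection values $d$ and the operating point are illustrative):

```python
# Minimal sketch: P_D = Q(Q^{-1}(P_FA) - d) from the slide,
# with Q = norm.sf and Q^{-1} = norm.isf.
from scipy.stats import norm

p_fa = 0.1                                 # assumed operating point
for d in (0.5, 1.0, 2.0, 3.0):             # assumed deflection values
    p_d = norm.sf(norm.isf(p_fa) - d)      # P_D for this d
    print(f"d = {d}: P_D = {p_d:.3f}")
```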
Binary comm. (cont.)
[Figure: the normal tail $Q(z)$ and the shifted tail $Q(z - d)$; the detector operates at $z = Q^{-1}(P_{FA})$, where $P_D = Q(z - d)$.]
UMP Tests
The Neyman-Pearson lemma holds for simple hypotheses.
Uniformly Most Powerful (UMP) tests generalize it to composite hypotheses.
A test $\phi(X)$ is UMP of size $\alpha$ if, for any other test $\phi'(X)$ satisfying
$$\alpha' = \sup_{\theta \in \Theta_0} E_{\theta}\,\phi'(X) \;\leq\; \sup_{\theta \in \Theta_0} E_{\theta}\,\phi(X) = \alpha,$$
we have
$$\beta'(\theta) = E_{\theta}\,\phi'(X) \;\leq\; E_{\theta}\,\phi(X) = \beta(\theta) \quad \text{for all } \theta \in \Theta_1$$
UMP Tests (cont.)
Consider scalar R.V.s whose PDFs $f(x;\theta)$ are parameterized by a scalar $\theta$.
Karlin-Rubin theorem (for UMP one-sided tests):
If the likelihood ratio $l(x) = f(x;\theta_1)/f(x;\theta_0)$ is monotone non-decreasing in $x$ for any pair $\theta_1 > \theta_0$ (the larger $x$ is, the more probable $H_1$ looks), then the test:
$$\phi(x) = \begin{cases} 1, & x > x_0 \\ 0, & x < x_0 \end{cases}, \qquad \alpha = E_{\theta_0}\,\phi(X) \;\;\text{(the size is computed at the boundary } \theta_0\text{)}$$
is the UMP test of size $\alpha$ for testing $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.
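A minimal sketch of the theorem for the scalar Gaussian location family $X \sim N(\theta, 1)$ (an assumed example, not from the slides): the threshold is set at the boundary $\theta_0 = 0$, and the power function $\beta(\theta) = Q(x_0 - \theta)$ is non-decreasing:

```python
# Minimal sketch (assumed Gaussian location family): the Karlin-Rubin
# one-sided test for X ~ N(theta, 1), H0: theta <= 0 vs. H1: theta > 0.
import numpy as np
from scipy.stats import norm

alpha = 0.05
x0 = norm.isf(alpha)                    # threshold set at the boundary theta0 = 0
thetas = np.linspace(-1.0, 3.0, 9)
beta = norm.sf(x0 - thetas)             # power function beta(theta)
print(np.all(np.diff(beta) > 0))        # True: beta is non-decreasing
print(f"size = beta(0) = {norm.sf(x0):.3f}")   # equals alpha
```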
UMP Tests (cont.)
Proof: begin with fixed values $\theta_1 > \theta_0$.
By the Neyman-Pearson lemma, the most powerful test of size $\alpha$ for testing $H_0': \theta = \theta_0$ vs. $H_1': \theta = \theta_1$ is:
$$\phi(x) = \begin{cases} 1, & l(x) > k \\ \gamma, & l(x) = k \\ 0, & l(x) < k \end{cases}, \qquad \alpha = E_{\theta_0}\,\phi(X)$$
As the likelihood ratio is monotone in $x$, we may replace it with the threshold test:
$$\phi(x) = \begin{cases} 1, & x > x_0 \\ 0, & x < x_0 \end{cases}, \qquad \alpha = E_{\theta_0}\,\phi(X)$$
UMP Tests (cont.)
The test is independent of $\theta_1$, so the argument holds for every $\theta_1 > \theta_0$, making $\phi(x)$ the most powerful test of size $\alpha$ at $\theta_0$ for testing the composite alternative $H_1$ vs. the simple hypothesis $H_0'$.
Consider now the power function $\beta(\theta) = E_{\theta}\,\phi(X)$.
At $\theta = \theta_0$, $\beta(\theta_0) = \alpha$. For any $\theta_1 > \theta_0$, $\beta(\theta_1) \geq \alpha$, because $\phi(x)$ is more powerful than the trivial test $\phi'(x) \equiv \alpha$.
A similar argument holds for any $\theta_2 > \theta_1$.
UMP Tests (cont.)
Thus, we conclude that $\beta(\theta)$ is non-decreasing:
$$E_{\theta'}\,\phi(X) = \beta(\theta') \leq \beta(\theta_0) = \alpha \quad \text{for all } \theta' \leq \theta_0$$
Consequently, $\phi(x)$ is also a test whose size satisfies $\sup_{\theta \leq \theta_0} E_{\theta}\,\phi(X) = \alpha$.
$\theta_0$ is the boundary of $\Theta_0$; all other $\theta \in \Theta_0$ are "easier".
Finally, no test $\phi'(x)$ with size $\alpha' \leq \alpha$ can have power $\beta'(\theta) > \beta(\theta)$ at any $\theta \in \Theta_1$, as it would contradict the Neyman-Pearson lemma for testing $\theta_0$ vs. that $\theta$.
A note on sufficiency
The statistic $T(X)$ is sufficient for $\theta$ if and only if
$$\Pr\{X = x \mid T(x) = t;\, \theta\} = \Pr\{X = x \mid T(x) = t\} \quad \forall \theta$$
i.e. the conditional distribution given $T$ does not depend on $\theta$.
No other statistic which can be calculated from the same sample provides any additional information as to the value of the parameter.
Fisher-Neyman factorization theorem:
The statistic $T(x)$ is sufficient for $\theta$ if and only if $f(x;\theta) = a(x)\, b_{\theta}(T(x))$.
One can write the likelihood ratio in terms of the sufficient statistic.
UMP Tests (cont.)
UMP one-sided tests exist for a host of problems!
Theorem: the one-parameter exponential family of distributions with density:
$$f(x;\theta) = c(\theta)\, h(x)\, \exp\{\pi(\theta)\, t(x)\}$$
has a monotone likelihood ratio in the sufficient statistic $t(x)$, provided that $\pi(\theta)$ is non-decreasing.
Proof:
$$l(x) = \frac{f(x;\theta_1)}{f(x;\theta_0)} = \frac{c(\theta_1)}{c(\theta_0)} \exp\{[\pi(\theta_1) - \pi(\theta_0)]\, t(x)\}$$
which is non-decreasing in $t(x)$ whenever $\theta_1 > \theta_0$, since then $\pi(\theta_1) \geq \pi(\theta_0)$.
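A quick numeric check of this (illustrative, using the Gaussian location family, where $\pi(\theta) = \theta$ and $t(x) = x$): the likelihood ratio is indeed increasing in $t(x)$:

```python
# Minimal sketch: numerically verifying the monotone likelihood ratio of a
# one-parameter exponential family (Gaussian location: pi(theta)=theta, t(x)=x).
import numpy as np
from scipy.stats import norm

x = np.linspace(-3, 3, 100)
lr = norm.pdf(x, loc=1.0) / norm.pdf(x, loc=0.0)   # l(x) for theta1=1 > theta0=0
print(np.all(np.diff(lr) > 0))                     # True: l increases in t(x)=x
```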
UMP Tests (cont.)
Example: Let $X = (X_0, X_1, \ldots, X_{M-1})$ denote i.i.d. $N(\theta m, R)$ vectors of dimension $N$. Then:
$$f(x;\theta) = (2\pi)^{-MN/2}\, |R|^{-M/2} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} (x_n - \theta m)^T R^{-1} (x_n - \theta m) \right\}$$
$$= \underbrace{(2\pi)^{-MN/2}\, |R|^{-M/2} \exp\left\{ -\frac{M}{2}\, \theta^2\, m^T R^{-1} m \right\}}_{c(\theta)}\; \underbrace{\exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} x_n^T R^{-1} x_n \right\}}_{h(x)}\; \exp\left\{ \theta\, m^T R^{-1} \sum_{n=0}^{M-1} x_n \right\}$$
so $t(x) = m^T R^{-1} \sum_{n=0}^{M-1} x_n$ is a sufficient statistic for $\theta$.
UMP Tests (cont.)
Therefore, the test
$$\phi(t) = \begin{cases} 1, & t > t_0 \\ 0, & t < t_0 \end{cases}, \qquad \alpha = E_{\theta_0}\,\phi(t)$$
is the Uniformly Most Powerful test of size $\alpha$ for testing $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.
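A minimal sketch of this UMP test, with assumed values of $m$, $R$, $M$ and $\theta_0 = 0$: under $\theta$, the sufficient statistic satisfies $t \sim N(M\theta\, m^T R^{-1} m,\; M\, m^T R^{-1} m)$, which fixes the size-$\alpha$ threshold:

```python
# Minimal sketch (assumed data): the UMP one-sided test via the sufficient
# statistic t(x) = m^T R^{-1} sum_n x_n for i.i.d. X_n ~ N(theta*m, R).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
N, M, theta0, alpha = 3, 50, 0.0, 0.05      # assumed sizes and H0 boundary
m = np.array([1.0, 2.0, -1.0])
R = np.eye(N)

x = rng.multivariate_normal(0.3 * m, R, size=M)   # draws with theta = 0.3
Rinv_m = np.linalg.solve(R, m)
t = Rinv_m @ x.sum(axis=0)                  # sufficient statistic t(x)

q = m @ Rinv_m                              # m^T R^{-1} m
t0 = M * theta0 * q + np.sqrt(M * q) * norm.isf(alpha)   # size-alpha threshold
print("decide H1" if t > t0 else "decide H0")
```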
Invariance
Revisit the binary communication example, but with a slight change:
$$Y = X + \theta \mathbf{1} \sim N(m_i + \theta \mathbf{1},\, \sigma^2 I), \qquad n \sim N(0, \sigma^2 I)$$
where $\theta$ is an unknown bias.
So what?! Let us continue with the log-likelihood as before...
$$\phi(y) = \begin{cases} 1, & L(y) > \tau \\ 0, & L(y) < \tau \end{cases}, \qquad L(y) = \ln l(y) = w^T (y - x_0) = w^T (x - x_0) + \theta\, w^T \mathbf{1}$$
Oops: the statistic depends on the unknown $\theta$.
Invariance (cont.)
Intuitively: search for a statistic that is invariant to the nuisance parameter:
$$Z = PY = \left( I - \tfrac{1}{N}\, \mathbf{1} \mathbf{1}^T \right) Y$$
Project the measurement onto the subspace orthogonal to the disturbance!
[Figure: $m_0$, $m_1$ and the measurement $Y$ projected along the $\mathbf{1}$ direction onto $Z = PY$. Optimal signals?]
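A minimal sketch of the projection ($N$ and the measurement are arbitrary): any added bias $c\mathbf{1}$ is annihilated by $P$:

```python
# Minimal sketch: projecting out an unknown DC bias with P = I - (1/N) 1 1^T.
import numpy as np

N = 8
P = np.eye(N) - np.ones((N, N)) / N        # projector onto the 1-perp subspace
y = np.arange(N, dtype=float) + 5.0        # some measurement with a bias
z = P @ y                                  # invariant statistic Z = P Y
print(np.allclose(P @ (y + 3.0), z))       # True: any added c*1 is removed
```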
Invariance (formal discussion)
Let $G$ denote a group of transformations, $g(x),\ g \in G$.
$X$ has probability distribution $F(x;\theta)$.
$Y = g(X)$ has distribution $F'(y;\theta) = \Pr_{\theta}\{g(X) \leq y\}$.
The family of distributions $\{F(x;\theta),\ \theta \in \Theta\}$ is invariant to $G$ if $F'(y;\theta) = F(y;\tilde{g}(\theta))$ for some $\tilde{g}(\theta) \in \Theta$. (Same distribution; different parameter.)
The hypothesis testing problem $H_0$ vs. $H_1$ is invariant to $G$ if, in addition, $\tilde{g}(\Theta_0) = \Theta_0$ and $\tilde{g}(\Theta_1) = \Theta_1$. (Original dichotomy preserved.)
Invariance (cont.)
Revisit the previous example (AWGN channel with unknown bias).
The measurement $Y = X + \theta \mathbf{1}$ is distributed as $Y \sim N(m_i + \theta \mathbf{1},\, \sigma^2 I)$,
so it is reasonable to test $H_0: m = m_0 + \theta \mathbf{1}$ vs. $H_1: m = m_1 + \theta \mathbf{1}$.
The unknown bias suggests a group of transformations to which the problem is invariant: $G = \{g: g(y) = y + c \mathbf{1}\}$:
$$g(Y) \sim N(m_i + (\theta + c) \mathbf{1},\, \sigma^2 I)$$
$$\tilde{g}(m_i + \theta \mathbf{1}) = m_i + (\theta + c) \mathbf{1} \quad \text{and} \quad \tilde{g}(\Theta_i) = \{ m_i + (\theta + c) \mathbf{1} \} = \Theta_i$$
Invariance (cont.)
The test $\phi(x)$ is invariant to $G$ if $\phi(g(x)) = \phi(x)$ for all $g \in G$.
A statistic $M(x)$ is said to be a maximal invariant statistic if:
1. (invariant) $M(g(x)) = M(x)$ for all $g \in G$
2. (maximal) $M(x_1) = M(x_2) \Rightarrow x_2 = g(x_1)$ for some $g \in G$
$M(x)$ organizes the measurements $x$ into equivalence classes on which:
1. $M(x)$ is constant
2. Every $x$ is related to every other through a transformation $g(x)$
Every invariant test may be written as $\phi(x) = f(M(x))$.
Invariance (cont.)
[Figure: orbits of $G$ are the lines $\{y : y = y_i + c \mathbf{1}\}$; each orbit maps to a single value of the maximal invariant, $\{y : M(y) = P y_i\}$, shown here for $y_1$ and $y_2$.]
Invariance (cont.)
Let us show that $Z = Py$ is indeed a maximal invariant statistic.
1. (invariant, $M(g(y)) = M(y)$ for all $g \in G$): $P(y + c\mathbf{1}) = Py$, since $P\mathbf{1} = 0$.
2. (maximal): $Py_1 = Py_2 \Rightarrow \left( I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T \right) y_1 = \left( I - \tfrac{1}{N}\mathbf{1}\mathbf{1}^T \right) y_2$
$$\Rightarrow y_2 = y_1 + \tfrac{1}{N} \left[ \mathbf{1}^T (y_2 - y_1) \right] \mathbf{1} = y_1 + c\mathbf{1}$$
so $M(y_1) = M(y_2) \Rightarrow y_2 = g(y_1)$ for some $g \in G$.
Invariance (another example)
$X = \theta S + n$, with $n \sim N(0, \sigma^2 I_N)$ and $\sigma$ unknown!
As $X \sim N(\theta S,\, \sigma^2 I)$, it is reasonable to test: $H_0: \theta = 0$ vs. $H_1: \theta > 0$.
Consider the group of transformations:
$$G = \{g: g(y) = c\, y \ \text{with} \ c > 0\}$$
The hypothesis testing problem is invariant to $G$: scaling maps $(\theta, \sigma)$ to $(c\theta, c\sigma)$, preserving both $\Theta_0 = \{\theta = 0\}$ and $\Theta_1 = \{\theta > 0\}$.
Invariance (another example)
What statistic is invariant to the scale of the measurement?
The angle between the measurement and the signal subspace $\langle S \rangle$ (or the subspace orthogonal to it, $\langle S \rangle^{\perp}$):
$$Z = \tan \phi = \frac{S^T X}{\sqrt{S^T S}\, \sqrt{X^T (I - P_S)\, X}}$$
In fact, $Z$ is a maximal invariant statistic to a broader group $G'$, which also includes rotation in the $\langle S \rangle$ subspace.
$G'$ is specifically appropriate for channels that introduce rotation in $\langle S \rangle$ as well as gain.
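A minimal sketch (assumed signal and measurement) verifying that the angle statistic is invariant to positive scaling:

```python
# Minimal sketch: the angle statistic Z is invariant to x -> c*x, c > 0.
import numpy as np

def angle_stat(x, s):
    """Z = s^T x / (||s|| * ||(I - P_s) x||)."""
    p_s_x = (s @ x) / (s @ s) * s          # projection of x onto span{s}
    resid = x - p_s_x                      # component orthogonal to s
    return (s @ x) / (np.linalg.norm(s) * np.linalg.norm(resid))

rng = np.random.default_rng(3)
s = np.array([1.0, 1.0, 0.0, -1.0])
x = 0.5 * s + rng.standard_normal(4)
print(np.isclose(angle_stat(x, s), angle_stat(7.3 * x, s)))   # True
```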
Invariance (UMPI & summary)
Invariance may be used to compress measurements into statistics of low dimensionality that satisfy invariance conditions.
It is often possible to find a UMP test within the class of invariant tests.
Steps when applying invariance:
1. Find a meaningful group of transformations, for which the hypothesis testing problem is invariant.
2. Find a maximal invariant statistic M, and construct a likelihood ratio test.
3. If M has a monotone likelihood ratio, then the test is UMPI (Uniformly Most Powerful Invariant) for testing one-sided hypotheses of the form $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.
Note: Sufficiency principles may facilitate this process.
CFAR (introductory example)
$X = \theta S + n$, with $n \sim N(0, \sigma^2 I_N)$ and $\sigma$ known. The test: $H_0: \theta = 0$ vs. $H_1: \theta > 0$.
Project the measurement onto the signal space:
$$m = \frac{S^T X}{\sigma \sqrt{S^T S}} \sim N\!\left( \frac{\theta \sqrt{E_S}}{\sigma},\, 1 \right), \qquad E_S = S^T S$$
$$\phi(x) = \begin{cases} 1, & m > m_0 \\ 0, & m < m_0 \end{cases}$$
A UMP test!
$$P_{FA} = Q(m_0), \qquad P_D = Q\!\left( m_0 - \frac{\theta \sqrt{E_S}}{\sigma} \right)$$
The False-Alarm Rate is Constant (it does not depend on any unknown parameter). Thus: CFAR.
CFAR (cont.)
$X = \theta S + n$, with $n \sim N(0, \sigma^2 I_N)$, but now $\sigma$ is unknown. The test: $H_0: \theta = 0$ vs. $H_1: \theta > 0$.
The statistic $m$ now depends on the unknown $\sigma$.
The test is useless. Certainly not CFAR.
Redraw the problem as: $X = \theta S + \sigma n$, with $n \sim N(0, I_N)$, and hypotheses
$$H_0: \theta = 0,\ \sigma^2 > 0 \quad \text{vs.} \quad H_1: \theta > 0,\ \sigma^2 > 0$$
Utilize Invariance!!
CFAR (cont.)
As before:
$$Z = \tan \phi = \frac{S^T X}{\sqrt{S^T S}\, \sqrt{X^T (I - P_S)\, X}}$$
Change slightly:
$$t = \frac{S^T X \big/ \sqrt{S^T S}}{\sqrt{X^T (I - P_S)\, X / (N - 1)}}$$
The denominator is an estimate of $\sigma$: $X^T (I - P_S) X / \sigma^2 \sim \chi^2_{N-1}$, so $X^T (I - P_S) X / (N-1) \approx \sigma^2$.
The numerator satisfies $S^T X / (\sigma \sqrt{S^T S}) \sim N(\theta \sqrt{E_S}/\sigma,\, 1)$, independent of the denominator.
The bottom line:
$$t \sim \begin{cases} t_{N-1} & \text{under } H_0 \\ t_{N-1}\!\left( \theta \sqrt{E_S} / \sigma \right) & \text{under } H_1 \end{cases}$$
i.e. central and non-central Student's t with $N-1$ degrees of freedom.
CFAR (cont.)
The distribution of $t$ is completely characterized under $H_0$ even though $\sigma$ is unknown!!!
Thus, we can set a threshold in the test:
$$\phi(t) = \begin{cases} 1, & t > t_0 \\ 0, & t < t_0 \end{cases}$$
in order to obtain $P_{FA} = \alpha$. CFAR!
Furthermore, as the likelihood ratio of the non-central t distribution is monotone, this test is UMPI for testing $H_0: \theta = 0$ vs. $H_1: \theta > 0$ in the distribution $X \sim N(\theta S,\, \sigma^2 I)$ when $\sigma$ is unknown!
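A minimal sketch of the CFAR detector (signal, noise levels and $\alpha$ are illustrative): the threshold comes from the central t distribution alone, so it never touches the unknown $\sigma$:

```python
# Minimal sketch: the CFAR t-detector. The threshold depends only on alpha
# and N, never on the unknown sigma, so P_FA is constant by construction.
import numpy as np
from scipy.stats import t as student_t

def cfar_detect(x, s, alpha=0.01):
    """Decide H1 if the t statistic exceeds the size-alpha threshold."""
    N = len(x)
    num = (s @ x) / np.sqrt(s @ s)                        # component along S
    p_s_x = (s @ x) / (s @ s) * s
    sigma2_hat = ((x - p_s_x) @ (x - p_s_x)) / (N - 1)    # noise estimate
    t_stat = num / np.sqrt(sigma2_hat)
    t0 = student_t.isf(alpha, df=N - 1)                   # CFAR threshold
    return t_stat > t0

rng = np.random.default_rng(4)
s = np.ones(16)
print(cfar_detect(5.0 * rng.standard_normal(16), s))            # H0: mostly False
print(cfar_detect(5.0 * s + 5.0 * rng.standard_normal(16), s))  # H1: usually True
```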
CFAR (cont.)
The actual probability of detection depends on the actual value of the SNR.
Summary
Basic concepts
Neyman-Pearson lemma
UMP
Invariance
CFAR