Bayes Hypothesis Testing Bayes risk


Introduction to Signal Detection
Outline

94/10/14
 Given some data and some number of probability distributions from which the data might have been sampled, we want to determine which distribution was in effect at the time the data were taken.
 The solution rests on the branch of mathematics called decision theory.
 In engineering, a typical instance is detecting the presence of a target in some region of a radar surveillance area.
Decision theory
[Block diagram: source → digital sequence → transmitter → channel, where noise n(t) is added → received signal r(t) → receiver → digital sequence]
 In communication systems, a source of message information produces a bit, 0 or 1, which is distorted in transmission and corrupted by noise in the channel and the receiver.
 The task is to design a receiver which will determine whether a 0 or a 1 was the intended message.
 The design rests on a model of the system which specifies the probability distributions of the received signal in the two message cases.
Decision theory
 The task of the receiver is to select between these two distributions, given the received data.
 There are three related problems to be solved:
o specify the model of the system;
o design the receiver, under some mathematization of the desire for the best receiver;
o evaluate its performance on average.
Statement of Hypothesis Problem
 We have available some data, which we model as a random process: some elements in the source of the data cannot be described with certainty, due to an absence of information, or the data are corrupted by noise of one kind or another which is not deterministically predictable.
 We are interested in determining which of a number of situations gave rise to the data at hand. We will specify some number of hypotheses Hi, i = 0, …, m−1, among which we believe there is one that describes the state of affairs at the time the data were produced.
Statement of Hypothesis Problem
 Let Hi refer specifically to m probabilistic models.
 By processing the data set y at hand, we want to determine which of the models Hi was in effect to produce the data in question.
 The result of such processing will be a decision Dj that the data should be associated with Hj.

Statement of Hypothesis Problem
 Given Hi, i = 0, …, m−1, we want to determine how to arrive at the decision Dj which best fits the case, and to evaluate how well the strategy performs on average.
o m = M: M-ary hypothesis testing; m = 2: binary hypothesis testing problem.
 Binary hypotheses:
o H0: null hypothesis
o H1: alternative hypothesis
[Block diagram: source emits H0 or H1 → probabilistic transition mechanism → observation space → decision rule → decision]
Formulation of Binary Hypothesis Testing
 Two possible hypotheses, H0 or H1, correspond to two possible probability distributions, P0 and P1 respectively, on the observation space (Γ, G):

$$H_0 : Y \sim P_0 \quad \text{versus} \quad H_1 : Y \sim P_1$$

 A decision rule δ for H0 versus H1 is any partition of the observation set Γ into Γ1 and Γ0 = Γ1^c; we choose Hj when y ∈ Γj.
Formulation of Binary Hypothesis Testing
 The decision rule δ, viewed as a function on Γ, is given by

$$\delta(y) = \begin{cases} 1 & \text{if } y \in \Gamma_1 \\ 0 & \text{if } y \in \Gamma_0 \end{cases}$$

 The value of δ for a given y is the index of the hypothesis accepted by δ.
Formulation of Binary Hypothesis Testing
 Y: the observation, scalar or multi-dimensional.
 Acceptance region / critical (rejection) region:
o Divide the data space into M regions Γ0, Γ1, …; together they must include every point of y-space, so that a decision is made for any possible observation.
[Figure: the source generates Y with density p0(y) under H0 or p1(y) under H1; the observation space is partitioned into Γ0 (say D0, accept H0) and Γ1 (say D1, accept H1)]
Formulation of Binary Hypothesis Testing
 Parametric decision theory: the probability distributions corresponding to the hypotheses are functions of known form, possibly with parameters having unknown values.
o Simple: the hypotheses have no parameters with unknown values.
o Composite: one or more parameters have unspecified values.
 Non-parametric decision theory: it is too difficult to define probability distributions corresponding to the hypotheses.
o Distribution-free.
Formulation of Binary Hypothesis Testing
 When a decision is made in a binary hypothesis testing problem, one of four outcomes can occur:
o H0 true; choose D0
o H0 true; choose D1 (Type I error)
o H1 true; choose D1
o H1 true; choose D0 (Type II error)
Formulation of Binary Hypothesis Testing
 Type I error: the error of the first kind, the false-alarm probability, the size of the test in statistical work:

$$P_F = P_0(D_1) = \int_{\Gamma_1} p_0(y)\, dy$$

 Type II error: the error of the second kind, the probability of a missed detection (miss probability):

$$P_M = P_1(D_0) = \int_{\Gamma_0} p_1(y)\, dy$$

 The probability of detection, the power of the test in statistical work:

$$P_D = P_1(D_1) = \int_{\Gamma_1} p_1(y)\, dy$$

$$P_D + P_M = \int_{\Gamma_1} p_1(y)\, dy + \int_{\Gamma_0} p_1(y)\, dy = 1$$
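As a small numerical illustration of these definitions (a sketch with an assumed Gaussian pair of densities and a single-threshold critical region, not taken from the slides), the identity PD + PM = 1 can be checked directly:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# Assumed example: H0: N(0, 1), H1: N(2, 1), critical region Gamma_1 = [t, inf)
mu0, mu1, sigma, t = 0.0, 2.0, 1.0, 1.0

pf = 1.0 - phi((t - mu0) / sigma)   # P_F = P0(Gamma_1), false-alarm probability
pm = phi((t - mu1) / sigma)         # P_M = P1(Gamma_0), miss probability
pd = 1.0 - phi((t - mu1) / sigma)   # P_D = P1(Gamma_1), detection probability

print(round(pf, 4), round(pm, 4), round(pd, 4))
print(pd + pm)   # Gamma_0 and Gamma_1 partition the observation space, so this is 1.0
```

Any choice of threshold t gives PD + PM = 1, since Γ0 and Γ1 partition the observation space under P1.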
Decision Criterion
 Decision criterion assumptions:
o The occurrences of hypotheses H0 and H1 are governed by probability assignments, denoted π0 and π1 = 1 − π0:
• the prior probabilities;
• they represent the observer's information about the occurrence of the hypotheses before the test;
• they are unconditioned on the observation Y.
o We attach some relative importance to the possible courses of action:
• cost assignment: Cij is the cost of choosing Di when Hj is true;
• the first subscript indicates the decision chosen;
• the second subscript indicates the hypothesis that was true.
Decision Criterion
 Conditional risk for hypothesis Hj:
o the average or expected cost incurred by decision rule δ when hypothesis Hj is true:

$$R_j(\delta) = C_{1j} P_j(\Gamma_1) + C_{0j} P_j(\Gamma_0), \qquad j = 0, 1$$

 The decision rule is designed so that, on average, the cost is as small as possible.
Bayes Hypothesis Testing
 Bayes risk:

$$r(\delta) = \pi_0 R_0(\delta) + \pi_1 R_1(\delta) = \pi_0 \left( C_{10} P_0(\Gamma_1) + C_{00} P_0(\Gamma_0) \right) + \pi_1 \left( C_{11} P_1(\Gamma_1) + C_{01} P_1(\Gamma_0) \right)$$

$$r(\delta) = \pi_0 \left[ C_{10} P_0(\Gamma_1) + C_{00} \left( 1 - P_0(\Gamma_1) \right) \right] + \pi_1 \left[ C_{11} P_1(\Gamma_1) + C_{01} \left( 1 - P_1(\Gamma_1) \right) \right]$$

$$= \pi_0 C_{00} + \pi_1 C_{01} + \pi_0 \left( C_{10} - C_{00} \right) P_0(\Gamma_1) + \pi_1 \left( C_{11} - C_{01} \right) P_1(\Gamma_1)$$

$$r(\delta) = \sum_{j=0}^{1} \pi_j C_{0j} + \sum_{j=0}^{1} \pi_j \left( C_{1j} - C_{0j} \right) P_j(\Gamma_1)$$
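The final expression can be checked against the direct definition r(δ) = π0 R0(δ) + π1 R1(δ) for a concrete rule (a sketch; the Gaussian densities, threshold, priors, and uniform costs below are assumed for illustration, not from the slides):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# Assumed setup: H0: N(0,1), H1: N(2,1), Gamma_1 = [t, inf), uniform costs
mu0, mu1, sigma, t = 0.0, 2.0, 1.0, 0.8
pi0, pi1 = 0.6, 0.4
c00, c01, c10, c11 = 0.0, 1.0, 1.0, 0.0

p0_g1 = 1.0 - phi((t - mu0) / sigma)    # P0(Gamma_1)
p1_g1 = 1.0 - phi((t - mu1) / sigma)    # P1(Gamma_1)

# Direct definition: r = pi0*R0 + pi1*R1 with Rj = C1j*Pj(Gamma_1) + C0j*Pj(Gamma_0)
r0 = c10 * p0_g1 + c00 * (1 - p0_g1)
r1 = c11 * p1_g1 + c01 * (1 - p1_g1)
direct = pi0 * r0 + pi1 * r1

# Rearranged formula: r = sum_j pi_j*C0j + sum_j pi_j*(C1j - C0j)*Pj(Gamma_1)
formula = (pi0 * c00 + pi1 * c01) + pi0 * (c10 - c00) * p0_g1 + pi1 * (c11 - c01) * p1_g1

print(abs(direct - formula) < 1e-12)   # True: the two expressions agree
```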
Bayes Hypothesis Testing
 Bayes rule: the optimal decision rule is one that minimizes, over all decision rules, the Bayes risk

$$r(\delta) = \sum_{j=0}^{1} \pi_j C_{0j} + \sum_{j=0}^{1} \pi_j \left( C_{1j} - C_{0j} \right) P_j(\Gamma_1)$$

o If Pj has density pj for j = 0, 1, then

$$r(\delta) = \sum_{j=0}^{1} \pi_j C_{0j} + \int_{\Gamma_1} \left[ \sum_{j=0}^{1} \pi_j \left( C_{1j} - C_{0j} \right) p_j(y) \right] dy$$

• The first term is constant and positive.
• To obtain the minimum risk, the second term should be as negative as possible, so that r(δ), the Bayes risk, is minimized over all choices of Γ1.
Bayes Hypothesis Testing
 Decision region for minimum Bayes risk:

$$\Gamma_1 = \left\{ y \in \Gamma : \sum_{j=0}^{1} \pi_j \left( C_{1j} - C_{0j} \right) p_j(y) \le 0 \right\} = \left\{ y \in \Gamma : \pi_1 \left( C_{11} - C_{01} \right) p_1(y) \le \pi_0 \left( C_{00} - C_{10} \right) p_0(y) \right\}$$

 Assume, in general, that the cost of correctly accepting Hj is less than the cost of incorrectly rejecting Hj, i.e. C11 < C01 and C00 < C10; thus the decision region can be rewritten as follows.
Bayes Hypothesis Testing
 Decision region for minimum Bayes risk:

$$\Gamma_1 = \left\{ y : \frac{p_1(y)}{p_0(y)} \ge \frac{\pi_0 \left( C_{10} - C_{00} \right)}{\pi_1 \left( C_{01} - C_{11} \right)} \right\} = \left\{ y : L(y) \ge \tau \right\}$$

$$L(y) = \frac{p_1(y)}{p_0(y)}, \qquad \tau = \frac{\pi_0 \left( C_{10} - C_{00} \right)}{\pi_1 \left( C_{01} - C_{11} \right)}$$

o L(y): the likelihood ratio (likelihood ratio statistic) between H1 and H0.
o τ: the test threshold.
Bayes Hypothesis Testing
 The Bayes decision rule corresponding to decision region Γ1 computes the likelihood ratio for the observation y and makes its decision by comparing the ratio to the threshold τ; a Bayes rule is

$$\delta_B(y) = \begin{cases} 1 & \text{if } L(y) \ge \tau \\ 0 & \text{if } L(y) < \tau \end{cases}$$
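The rule above can be sketched in a few lines of Python (the Gaussian densities, priors, and costs below are assumed for illustration, not from the slides):

```python
import math

def gauss_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def bayes_rule(y, p0, p1, pi0, pi1, c00, c01, c10, c11):
    """Return 1 (accept H1) iff L(y) >= tau, with tau built from priors and costs."""
    tau = (pi0 * (c10 - c00)) / (pi1 * (c01 - c11))
    return 1 if p1(y) / p0(y) >= tau else 0

# Assumed example: H0: N(0,1) vs H1: N(2,1), equal priors, uniform costs -> tau = 1
p0 = lambda y: gauss_pdf(y, 0.0, 1.0)
p1 = lambda y: gauss_pdf(y, 2.0, 1.0)
decide = lambda y: bayes_rule(y, p0, p1, 0.5, 0.5, 0.0, 1.0, 1.0, 0.0)
print(decide(0.2), decide(1.7))   # observations below/above the midpoint 1.0 -> 0 1
```

With equal priors and uniform costs, τ = 1 and the rule reduces to comparing the two densities directly.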
Bayes Hypothesis Testing
 The minimum probability of error
o In general, the (uniform) cost assignment is defined as

$$C_{ij} = \begin{cases} 0 & \text{if } i = j \\ c & \text{if } i \ne j \end{cases}$$

o The Bayes risk for a decision rule δ with critical region Γ1 is given by

$$r(\delta) = c \left( \pi_0 P_0(\Gamma_1) + \pi_1 P_1(\Gamma_0) \right) = c \left( \pi_0 P_F + \pi_1 P_M \right) = c\, P_e$$

• The Bayes risk is proportional to the average probability of error incurred by the decision rule δ.
• Because c is a constant, minimizing the Bayes risk is the same strategy as minimizing the probability of error Pe itself.
Bayes Hypothesis Testing
 The minimum probability of error
o The likelihood ratio test with this threshold minimizes the Bayes risk for the cost structure defined above; it is thus a minimum probability-of-error decision scheme.
o Thus the decision rule is

$$\delta_B(y) = \begin{cases} 1 & \text{if } L(y) \ge \tau \\ 0 & \text{if } L(y) < \tau \end{cases}, \qquad L(y) = \frac{p_1(y)}{p_0(y)}, \quad \tau = \frac{\pi_0}{\pi_1}$$

• The minimum probability-of-error decision rule can be implemented only if the prior probabilities can be defined.
MAP-Maximum a posteriori probability criterion
 Bayes' formula implies that the conditional probability that hypothesis Hj is true, given that the random observation Y takes on the value y, is

$$p_y(H_j) = P\left( H_j \text{ true} \mid Y = y \right) = \frac{p_j(y)\, \pi_j}{p(y)}$$

o p(y): the average density of Y, $p(y) = \pi_0 p_0(y) + \pi_1 p_1(y)$.
o $p_y(H_j)$: the posterior, or a posteriori, probability of the two hypotheses.
 The critical region of the Bayes rule,

$$\Gamma_1 = \left\{ y : \sum_{j=0}^{1} \pi_j \left( C_{1j} - C_{0j} \right) p_j(y) \le 0 \right\} = \left\{ y : \pi_1 \left( C_{11} - C_{01} \right) p_1(y) \le \pi_0 \left( C_{00} - C_{10} \right) p_0(y) \right\},$$

can be rewritten using $p_y(H_j) = \pi_j p_j(y) / p(y)$ as

$$\Gamma_1 = \left\{ y : C_{10}\, p_y(H_0) + C_{11}\, p_y(H_1) \le C_{00}\, p_y(H_0) + C_{01}\, p_y(H_1) \right\}$$

o The average posterior cost incurred by choosing hypothesis Hi given Y = y equals $C_{i0}\, p_y(H_0) + C_{i1}\, p_y(H_1)$.
MAP-Maximum a posteriori probability criterion
 The Bayes rule makes its decision by choosing the hypothesis that yields the minimum posterior cost:

$$\delta_{MAP}(y) = \begin{cases} 1 & \text{if } L(y) \ge 1 \\ 0 & \text{if } L(y) < 1 \end{cases}, \qquad L(y) = \frac{C_{00}\, p_y(H_0) + C_{01}\, p_y(H_1)}{C_{10}\, p_y(H_0) + C_{11}\, p_y(H_1)}$$

 For the uniform cost assignment C00 = C11 = 0, C01 = C10 = 1, the decision rule becomes

$$\delta_{MAP}(y) = \begin{cases} 1 & \text{if } L(y) \ge 1 \\ 0 & \text{if } L(y) < 1 \end{cases}, \qquad L(y) = \frac{p_y(H_1)}{p_y(H_0)}$$

o It is difficult to find the posterior probabilities $p_y(H_0), p_y(H_1)$ directly.
o It is more natural to work with the prior probabilities and the densities $\pi_0 p_0(y), \pi_1 p_1(y)$.
MAP-Maximum a posteriori probability criterion
 Adopting Bayes' rule:

$$p_y(H_0) = \frac{p_0(y)\, \pi_0}{p(y)}, \qquad p_y(H_1) = \frac{p_1(y)\, \pi_1}{p(y)}$$

$$L(y) = \frac{p_y(H_1)}{p_y(H_0)} = \frac{p_1(y)\, \pi_1}{p_0(y)\, \pi_0} \ge 1 \iff \frac{p_1(y)}{p_0(y)} \ge \frac{\pi_0}{\pi_1} = \tau \implies \text{choose } H_1$$

o $p_1(y)/p_0(y)$ is defined as the likelihood ratio, and the test is a likelihood ratio test.
o τ is defined as the threshold value.
o The MAP decision chooses the hypothesis with the maximum a posteriori probability of having occurred given that Y = y, which is the same as the minimum probability-of-error decision.
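The equivalence between comparing posteriors and the prior-weighted likelihood ratio test can be checked numerically (the Gaussian densities and priors below are assumed for illustration, not from the slides):

```python
import math

def gauss_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

# Assumed setup: H0: N(0,1) with prior 0.7, H1: N(2,1) with prior 0.3
pi0, pi1 = 0.7, 0.3
p0 = lambda y: gauss_pdf(y, 0.0, 1.0)
p1 = lambda y: gauss_pdf(y, 2.0, 1.0)

def map_via_posteriors(y):
    """Choose the hypothesis with the larger posterior probability."""
    p = pi0 * p0(y) + pi1 * p1(y)                   # average density p(y)
    post0, post1 = pi0 * p0(y) / p, pi1 * p1(y) / p
    return 1 if post1 >= post0 else 0

def map_via_lrt(y):
    """Equivalent form: likelihood ratio against tau = pi0/pi1."""
    return 1 if p1(y) / p0(y) >= pi0 / pi1 else 0

for y in [-1.0, 0.5, 1.0, 1.5, 3.0]:
    assert map_via_posteriors(y) == map_via_lrt(y)  # identical decisions
print("MAP and LRT decisions agree")
```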
Example
[Figure: p0(y) is the triangular density of height 1 on [−1, 1]; p1(y) is the uniform density of height 1 on [−1/2, 1/2]]
o Assume π0 = π1 = 1/2, so that τ = 1. For 0 < |y| ≤ 1/2, L(y) = p1(y)/p0(y) = 1/(1 − |y|) > 1, so choose H1.
o For |y| > 1/2, p1(y) = 0, so L(y) = 0 < 1 and we choose H0. For y = 0 the ratio equals 1, a boundary case assigned here to H0; since this single point has zero probability, either assignment gives the same risk.
Example
$$H_0 : Y \sim N(\mu_0, \sigma^2) \quad \text{versus} \quad H_1 : Y \sim N(\mu_1, \sigma^2)$$

o The likelihood ratio for the Bayes decision rule:

$$L(y) = \frac{p_1(y)}{p_0(y)} = \frac{\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y - \mu_1)^2}{2\sigma^2} \right)}{\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y - \mu_0)^2}{2\sigma^2} \right)} = \exp\left[ \frac{\mu_1 - \mu_0}{\sigma^2} \left( y - \frac{\mu_1 + \mu_0}{2} \right) \right]$$

o Taking the logarithm:

$$\ln L(y) = \frac{\mu_1 - \mu_0}{\sigma^2} \left( y - \frac{\mu_1 + \mu_0}{2} \right)$$
Example
o The Bayes test is

$$\frac{\mu_1 - \mu_0}{\sigma^2} \left( y - \frac{\mu_1 + \mu_0}{2} \right) \overset{H_1}{\underset{H_0}{\gtrless}} \ln \tau$$

i.e., assuming μ1 > μ0,

$$y \overset{H_1}{\underset{H_0}{\gtrless}} \tau' = \frac{\mu_0 + \mu_1}{2} + \frac{\sigma^2 \ln \tau}{\mu_1 - \mu_0}, \qquad \delta_B(y) = \begin{cases} 1 & \text{if } y \ge \tau' \\ 0 & \text{if } y < \tau' \end{cases}$$

o For the minimum probability of error with equal priors, ln τ = 0, so

$$\tau' = \frac{\mu_0 + \mu_1}{2}$$

o The probability of detection and the probability of false alarm:

$$P_D = P(D_1 \mid H_1) = P_1(\Gamma_1) = \int_{\tau'}^{\infty} N(\mu_1, \sigma^2)\, dy$$

$$P_F = P(D_1 \mid H_0) = P_0(\Gamma_1) = \int_{\tau'}^{\infty} N(\mu_0, \sigma^2)\, dy$$
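For this Gaussian example, τ', PF, and PD can be computed in closed form with the Gaussian tail function Q(x) = 1 − Φ(x) (a sketch; the specific numbers μ0 = 0, μ1 = 2, σ = 1 are assumed for illustration):

```python
import math

def q(x):
    """Gaussian tail probability Q(x) = P(Z > x), Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bayes_gaussian_test(mu0, mu1, sigma, tau=1.0):
    """Threshold tau' and (PF, PD) for H0: N(mu0, s^2) vs H1: N(mu1, s^2), mu1 > mu0."""
    tau_prime = (mu0 + mu1) / 2 + sigma ** 2 * math.log(tau) / (mu1 - mu0)
    pf = q((tau_prime - mu0) / sigma)   # P0(Y >= tau')
    pd = q((tau_prime - mu1) / sigma)   # P1(Y >= tau')
    return tau_prime, pf, pd

# Equal priors and uniform costs give tau = 1, so tau' is the midpoint of the means
tp, pf, pd = bayes_gaussian_test(mu0=0.0, mu1=2.0, sigma=1.0)
print(tp, round(pf, 4), round(pd, 4))   # 1.0 0.1587 0.8413
```

Here PD = Q((τ' − μ1)/σ) = 1 − PF by the symmetry of this particular example, since τ' lies midway between the means.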
Remarks
 The test statistic is the likelihood ratio, and the threshold is a function of the costs and the prior probabilities.
 Minimizing the Bayes risk is possible only when the cost function and the prior probabilities are known.
o If the prior probabilities change:
• the threshold value changes;
• the conditional probabilities PF, PD, PM also change, because the regions of integration for H0 and H1 change;
• the cost-function curve of the Bayes test changes.
 Usually,
o we do not know the prior probabilities of the two hypotheses;
o we do not want to assume the prior probabilities, even though the costs have been defined.
Remarks
 In a communication system, the designer of a decision rule may not have control over, or access to, the mechanism that generates the state of nature.
 Once a receiver is defined, the threshold is defined, and then the conditional probabilities are defined. Therefore,
o the average or Bayes risk is not an acceptable design criterion;
o a single decision rule would not minimize the average risk for every possible prior distribution.
Minimax Hypothesis Testing
 Find the decision rule that minimizes, over all decision rules δ, the maximum of the conditional risks R0(δ), R1(δ).
o A possible design criterion is

$$\min_{\delta} \max \left\{ R_0(\delta), R_1(\delta) \right\}$$
Minimax Hypothesis Testing
 When a threshold is fixed,
o a decision rule δ is fixed, and
o the average risk is a function of the prior probability alone, a straight line in π0:

$$r(\pi_0, \delta) = \pi_0 R_0(\delta) + (1 - \pi_0) R_1(\delta)$$

1) Let $\delta_{\pi_0'}$ denote the Bayes rule corresponding to a certain (unknown) prior probability $\pi_0'$, i.e. the rule that minimizes the Bayes risk for that prior.
2) At π0 = 1, $r(\pi_0, \delta) = R_0(\delta)$; at π0 = 0, $r(\pi_0, \delta) = R_1(\delta)$.
3) Since $r(\pi_0, \delta)$ is linear in π0, its maximum occurs at either π0 = 0 or π0 = 1, and the maximum value is $\max\{R_0(\delta), R_1(\delta)\}$.
4) Minimize $\max_{0 \le \pi_0 \le 1} r(\pi_0, \delta_{\pi_0'})$ over the choice of $\pi_0'$.
Minimax Hypothesis Testing
 Define $V(\pi_0) = r(\pi_0, \delta_{\pi_0})$, the minimum possible Bayes risk for the prior probability $\pi_0 \in [0, 1]$; note that $V(0) = C_{11}$ and $V(1) = C_{00}$.
 Since $r(\pi_0, \delta) \ge V(\pi_0)$ for every rule δ and every π0, only Bayes rules can possibly be minimax rules.
 Let $\pi_L$ be the prior value that maximizes $V(\pi_0)$. The line $r(\pi_0, \delta_{\pi_L})$ is then constant over $\pi_0 \in [0, 1]$, so that

$$\max_{\pi_0} r(\pi_0, \delta_{\pi_L}) = R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L}) = V(\pi_L):$$

a decision rule with equal conditional risks, called an equalizer rule. The prior value $\pi_L$ is called the least-favorable prior.
 The minimax rule is the Bayes rule for the least-favorable prior.
Minimax Hypothesis Testing
 Recall the Bayes risk:

$$r(\pi_0, \delta) = \pi_0 R_0(\delta) + \pi_1 R_1(\delta) = \pi_0 R_0(\delta) + (1 - \pi_0) R_1(\delta)$$

$$= \pi_0 \left[ C_{10} P_F + C_{00} (1 - P_F) \right] + (1 - \pi_0) \left[ C_{11} P_D + C_{01} P_M \right]$$

$$= C_{11} P_D + C_{01} P_M + \pi_0 \left[ C_{10} P_F + C_{00} (1 - P_F) - C_{11} P_D - C_{01} P_M \right]$$

$$\frac{\partial r(\pi_0, \delta)}{\partial \pi_0} = R_0(\delta) - R_1(\delta) = C_{10} P_F + C_{00} (1 - P_F) - C_{11} P_D - C_{01} P_M = 0$$

 The minimax decision rule is chosen such that the conditional risks are the same when H0 and H1, respectively, are the true hypotheses.
 Special case C00 = C11 = 0:

$$C_{10} P_F = C_{01} P_M$$
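The equalizer condition C10·PF = C01·PM can be solved numerically for the single-threshold Gaussian test (a sketch under assumed parameters; since PF decreases and PM increases as the threshold grows, bisection finds the unique crossing):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def minimax_threshold(mu0, mu1, sigma, c10=1.0, c01=1.0, iters=100):
    """Bisect for the threshold t with c10*PF(t) = c01*PM(t) (case C00 = C11 = 0),
    for the test 'choose H1 iff y >= t' with H0: N(mu0, s^2), H1: N(mu1, s^2)."""
    def gap(t):
        pf = 1.0 - phi((t - mu0) / sigma)   # false-alarm probability
        pm = phi((t - mu1) / sigma)         # miss probability
        return c10 * pf - c01 * pm          # strictly decreasing in t
    lo, hi = mu0 - 10 * sigma, mu1 + 10 * sigma
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gap(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Symmetric costs and equal variances: the equalizer threshold is the midpoint of the means
print(round(minimax_threshold(0.0, 2.0, 1.0), 6))   # 1.0
```

Changing the cost ratio c01/c10 moves the equalizer threshold toward the less costly error.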
Remarks
 Bayes decision rule:
o Knowing the prior probabilities of the hypotheses and the cost function, minimize the expected overall cost, i.e. the average risk.
 Minimax decision rule:
o The priors are not assumed known, and the optimum is defined in terms of minimizing the maximum of the conditional expected costs under the two hypotheses.
 In a practical sense,
o the imposition of a specific cost structure on the decisions made in testing is not possible or desirable;
o the design of a test for H0 versus H1 involves a trade-off between the probabilities of the two types of errors: either one can always be made arbitrarily small at the expense of the other.
Neyman-Pearson Hypothesis Testing
 The Neyman-Pearson decision rule:
o place a bound on the false-alarm rate/probability, and then
o minimize the miss probability within this constraint.
o The Neyman-Pearson design criterion is

$$\max_{\delta} P_D(\delta) \quad \text{subject to} \quad P_F(\delta) \le \alpha,$$

where α is the level, or significance level, of the test.
o The NP design goal is to find the most powerful α-level test of H0. It recognizes a basic asymmetry in the importance of the two hypotheses.
Neyman-Pearson Hypothesis Testing
 Optimality: if the decision rule δ satisfies $P_F(\delta) \le \alpha$, and δ' is any decision rule of the form

$$\delta'(y) = \begin{cases} 1 & \text{if } L(y) > \eta \\ \gamma(y) & \text{if } L(y) = \eta \\ 0 & \text{if } L(y) < \eta, \end{cases}$$

where $\eta \ge 0$ and $0 \le \gamma(y) \le 1$ are chosen such that $P_F(\delta') = \alpha$, then $P_D(\delta) \le P_D(\delta')$.
Neyman-Pearson Hypothesis Testing
 Set $P_F(\delta) = \alpha$ and design a test to maximize the probability of detection under this constraint.
 Using Lagrange multipliers,

$$F = P_M + \lambda \left( P_F - \alpha \right),$$

o where

$$P_M = P_1(D_0) = \int_{\Gamma_0} p_1(y)\, dy, \qquad P_F = P_0(D_1) = \int_{\Gamma_1} p_0(y)\, dy$$

$$F = \lambda (1 - \alpha) + \int_{\Gamma_0} \left[ p_1(y) - \lambda\, p_0(y) \right] dy$$

o The first term is positive; to minimize F, the second term should be as negative as possible, i.e. a point y is assigned to Γ0 only where the integrand is negative:

$$p_1(y) - \lambda\, p_0(y) < 0 \iff \frac{p_1(y)}{p_0(y)} < \lambda \implies \text{choose } D_0$$
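Because L(y) is monotone in y for the Gaussian pair used in the earlier example, the α-level NP test reduces to a threshold on y chosen so that PF = α. A sketch (the parameters μ0 = 0, μ1 = 2, σ = 1, α = 0.05 are assumed for illustration):

```python
import math

def q(x):
    """Gaussian tail probability Q(x) = P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q_inv(p, iters=100):
    """Invert the (strictly decreasing) Q function by bisection, for 0 < p < 1."""
    lo, hi = -10.0, 10.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if q(mid) > p else (lo, mid)
    return 0.5 * (lo + hi)

def np_test(mu0, mu1, sigma, alpha):
    """alpha-level NP test of H0: N(mu0, s^2) vs H1: N(mu1, s^2), mu1 > mu0.
    L(y) is increasing in y, so the test is 'y >= t' with P0(Y >= t) = alpha."""
    t = mu0 + sigma * q_inv(alpha)      # threshold achieving PF = alpha
    pd = q((t - mu1) / sigma)           # the resulting power (detection probability)
    return t, pd

t, pd = np_test(0.0, 2.0, 1.0, alpha=0.05)
print(round(t, 3), round(pd, 3))   # t ≈ 1.645
```

Note that no priors or costs enter anywhere: only the bound α and the two densities determine the test.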
Q&A