BN with uncertain evidence
Probabilistic Reasoning with Uncertain Data
Yun Peng and Zhongli Ding, Rong Pan, Shenyong Zhang
Uncertain Evidences
• Causes for uncertainty of evidence
– Observation error
– Unable to observe the precise state the world is in
• Two types of uncertain evidences
– Virtual evidence: evidence with uncertainty
  I’m not sure about my observation that A = a1
– Soft evidence: evidence of uncertainty
  I cannot observe the state of A but have observed the distribution of A as P(A) = (0.7, 0.3)
Virtual Evidences
► Represent uncertainty in VE by likelihood ratio
  $L(A) = (P(Ob(a_1) \mid a_1) : P(Ob(a_2) \mid a_2) : \dots : P(Ob(a_n) \mid a_n))$
► This ratio shall be preserved (invariant) in belief update
► Implemented by adding a VE node
– It is a leaf node, with A as its only parent
– Its CPT conforms to the likelihood ratio
► Many BN engines accept the likelihood ratio directly
► Multiple VEs are not a problem
[Figure: nodes A and B with virtual evidence nodes veA and veB]
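The effect of a virtual evidence on its own variable can be sketched as a pointwise product of the prior and the likelihood ratio, followed by normalization. The numbers and the function name `apply_virtual_evidence` below are illustrative assumptions, not taken from the slides.

```python
def apply_virtual_evidence(prior, likelihood):
    """Posterior over A after a virtual evidence with likelihood ratio L(A).

    Only the ratio between the entries of `likelihood` matters,
    so the vector does not need to sum to 1.
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


prior = [0.6, 0.4]        # hypothetical P(A)
likelihood = [0.8, 0.2]   # hypothetical L(A), i.e. a ratio of 4 : 1

posterior = apply_virtual_evidence(prior, likelihood)
# The same ratio written with a different scale gives the same posterior.
posterior_scaled = apply_virtual_evidence(prior, [4.0, 1.0])

print(posterior)
```

The posterior odds divided by the prior odds equal the likelihood ratio, which is the invariance the slide refers to.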
Soft Evidences
► Represent uncertainty in SE by a distribution
► The distribution itself is to be believed without uncertainty and must be preserved (invariant) in belief update
► Reasoning with a single SE: Jeffrey’s rule
– For the given seA = R(A):
  $P(A \mid se_A) = R(A)$ for evidence variable A
  $Q(C) = \sum_i P(C \mid a_i) R(a_i)$ for the rest of variables
► For BN: convert SE to VE by calculating the likelihood ratio
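Jeffrey’s rule for a single soft evidence can be sketched as a weighted sum of the conditionals P(C | a_i), with the observed distribution R(A) as weights. The conditional table and R(A) below are made-up values, and `jeffrey_update` is a hypothetical helper name.

```python
def jeffrey_update(p_c_given_a, r_a):
    """Q(C) = sum_i P(C | a_i) R(a_i) for one non-evidence variable C."""
    n_states_c = len(p_c_given_a[0])
    return [
        sum(p_c_given_a[i][c] * r_a[i] for i in range(len(r_a)))
        for c in range(n_states_c)
    ]


# Hypothetical P(C | A): row i is the distribution of C given A = a_i.
p_c_given_a = [[0.9, 0.1],
               [0.2, 0.8]]
r_a = [0.7, 0.3]  # soft evidence R(A)

q_c = jeffrey_update(p_c_given_a, r_a)
print(q_c)
```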
Multiple Soft Evidences
► Problem: cannot satisfy all SE
– Updating one variable’s distribution to its target value (the observed distribution) can push those of the others off their targets
  $P(A \mid se_A) = Q(A)$ but $P(A \mid se_A, se_B) \neq Q(A)$
► Solution: IPFP
– A procedure that modifies a distribution by one or more distributions over subsets of variables
[Figure: nodes A and B with soft evidence nodes seA and seB]
Jeffrey’s Rule
• Jeffrey’s rule (J-conditioning) (R. Jeffrey 1983)
– Given SE R(a), any other variable c is updated by
  $Q(c) = \sum_i P(c \mid A_i) R(A_i) = \sum_i P(c, A_i) \frac{R(A_i)}{P(A_i)}$
– Extend Jeffrey’s rule to the entire distribution
  $Q(x) = P(x) \frac{R(a)}{P(a)}$
– Q(a) = R(a)
– Among all JPDs satisfying R(a), Q(x) has the smallest KL distance (I-divergence) to the original P(x)
– Q(x) is called an I-projection of P(x) on R(a)
• What if we have more than one SE?
– R1(educ) and R2(smoker) (constraints)
– How to make a minimum change to P(x) to satisfy ALL constraints?
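A short check, in the slide’s notation, that the extended rule Q(x) = P(x) R(a) / P(a) does reproduce the soft evidence on a. Summing over all variables in x other than a (written here as x \ a, a notation assumed for this sketch), the factor R(a)/P(a) is constant and the remaining sum is P(a):

```latex
Q(a) = \sum_{x \setminus a} Q(x)
     = \sum_{x \setminus a} P(x)\,\frac{R(a)}{P(a)}
     = \frac{R(a)}{P(a)} \sum_{x \setminus a} P(x)
     = \frac{R(a)}{P(a)}\, P(a)
     = R(a)
```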
IPFP
• We can try Jeffrey’s rule
– First on P(x) using R1 -> Q1(x)
– Then on Q1(x) using R2 -> Q2(x)
– Q2(x) satisfies R2 but not R1
• Iterative Proportional Fitting Procedure (IPFP)
– Proposed by R. Kruithof (1937); convergence proved by I. Csiszar (1975)
– Loop over the set of constraints; each step fits one constraint
  $Q_{k+1}(x) = Q_k(x) \frac{R_j(y^j)}{Q_k(y^j)}$
  where $y^j$ is the subset of variables covered by constraint $R_j$
– Converges to Q*(x), which is the I-projection of P(x) on the set of given constraints
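The two bullets above can be sketched on a tiny joint table over two binary variables A and B. Applying Jeffrey’s rule once per constraint leaves the first constraint violated, while looping over the constraints converges. The joint table, the targets, and the helper names `marginal`, `fit`, and `ipfp` are assumptions made for illustration.

```python
def marginal(joint, axis):
    """Marginal of a 2-D joint table: axis 0 is A (rows), axis 1 is B (columns)."""
    if axis == 0:
        return [sum(row) for row in joint]
    return [sum(col) for col in zip(*joint)]


def fit(joint, target, axis):
    """One fitting step: Q_{k+1}(x) = Q_k(x) * R(y) / Q_k(y)."""
    current = marginal(joint, axis)
    scale = [t / c for t, c in zip(target, current)]
    return [[joint[i][j] * (scale[i] if axis == 0 else scale[j])
             for j in range(len(joint[0]))]
            for i in range(len(joint))]


def ipfp(joint, constraints, sweeps=50):
    """Cycle through the (target, axis) constraints a fixed number of times."""
    for _ in range(sweeps):
        for target, axis in constraints:
            joint = fit(joint, target, axis)
    return joint


p = [[0.4, 0.1],
     [0.2, 0.3]]      # hypothetical P(A, B)
r1 = [0.7, 0.3]       # constraint on A
r2 = [0.5, 0.5]       # constraint on B

q1 = fit(p, r1, 0)    # satisfies R1
q2 = fit(q1, r2, 1)   # satisfies R2, but R1 is now off target
q_star = ipfp(p, [(r1, 0), (r2, 1)])

print(marginal(q2, 0), marginal(q_star, 0), marginal(q_star, 1))
```

The fixed sweep count keeps the sketch short; a practical version would stop when the marginals stop changing.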
IPFP
[Figure: IPFP iterations from P through Q1, Q2, Q3 toward Q*, shown against the set of all JPDs satisfying R1 and the set of all JPDs satisfying R2]
IPFP
• Problems with IPFP
– Very slow
  • Each iteration (fitting step) has complexity of $O(2^{|x|})$
  • Factorization -> Bayesian network (BN)
– Inconsistent constraints
  • No JPD satisfies all constraints
  • IPFP won’t converge (oscillating)
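The oscillation can be seen with the simplest possible inconsistency: two constraints that demand different distributions for the same variable A. The joint table and both targets below are made-up values, and `fit_rows` is a hypothetical helper.

```python
def fit_rows(joint, target):
    """One IPFP step: rescale rows so that the marginal on A equals `target`."""
    return [[cell * target[i] / sum(row) for cell in row]
            for i, row in enumerate(joint)]


q = [[0.4, 0.1],
     [0.2, 0.3]]      # hypothetical P(A, B)
r1 = [0.7, 0.3]       # first constraint on A
r2 = [0.4, 0.6]       # second constraint on A, inconsistent with the first

trace = []            # Q_k(A = a1) after each fitting step
for k in range(6):
    target = r1 if k % 2 == 0 else r2
    q = fit_rows(q, target)
    trace.append(round(sum(q[0]), 6))

print(trace)
```

Each step satisfies the constraint just applied and breaks the other one, so the marginal on A keeps jumping between the two targets.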
BN Belief Update with SE
• BN belief update with hard evidence
– HE: a = A1; b = B3
– Clamp node a to A1 and b to B3
– Calculate P(c | A1, B3) for all c
• Virtual evidence
– Uncertainty of the HE (observation)
– Represented as a likelihood ratio
– Virtual node vea, with conditional probability table calculated from L(a)
– When vea is clamped to “true”, the belief on a is updated so that its likelihood ratio equals L(a)
[Figure: nodes a and b, shown alone for hard evidence and with virtual nodes vea and veb for virtual evidence]
BN Belief Update with SE
• Convert SE to VE
– $L(a) = \frac{R_1(a)}{P(a)} = \left(\frac{R_1(a_1)}{P(a_1)}, \dots, \frac{R_1(a_m)}{P(a_m)}\right)$
– Belief update with this likelihood ratio yields Q(a) = R1(a)
• Does not work with multiple SE
– When both sea and seb are applied, Q(a) != R1(a); Q(b) != R2(b)
• Solution: combine VE with IPFP
[Figure: nodes a and b with soft evidence nodes sea and seb]
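The conversion of a single soft evidence to a virtual evidence can be checked numerically: dividing the target R1(a) by the current belief P(a) gives a likelihood ratio, and applying that ratio as virtual evidence returns R1(a). The prior and target are illustrative values, and both function names are hypothetical.

```python
def soft_to_virtual(prior, target):
    """Likelihood ratio L(a) = R1(a) / P(a), computed state by state."""
    return [r / p for r, p in zip(target, prior)]


def apply_likelihood(prior, likelihood):
    """Belief on a after clamping a virtual node with ratio `likelihood`."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


p_a = [0.6, 0.4]   # hypothetical current belief P(a)
r1 = [0.7, 0.3]    # soft evidence R1(a)

l_a = soft_to_virtual(p_a, r1)
q_a = apply_likelihood(p_a, l_a)

print(l_a, q_a)
```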
BN Belief Update with SE
• V-IPFP: at kth iteration
– Pick a sei, say R1(a), and create a new vei,j with its likelihood ratio
– Apply vei,j to update the entire network
• Convergence
– Converges to the I-projection on all constraints
• Cost
– Space: small
– Time: large for large BN
[Figure: virtual evidence nodes sea,1, sea,2, … and seb,1, … added to the network over successive iterations]
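A rough sketch of the V-IPFP loop on a two-node network A -> B, with inference done by brute-force enumeration. It assumes that the likelihood ratio of each new virtual evidence is the target distribution divided by the current belief on that variable, in line with the single-SE conversion; the slide does not show the formula, so that choice, the CPT values, and the names `beliefs` and `v_ipfp` are assumptions.

```python
p_a = [0.5, 0.5]                 # hypothetical P(A)
p_b_given_a = [[0.8, 0.2],
               [0.4, 0.6]]       # hypothetical P(B | A), one row per state of A


def beliefs(ve_a, ve_b):
    """Joint belief over (A, B) given the virtual evidences added so far.

    `ve_a` and `ve_b` are lists of likelihood vectors attached to A and B.
    The CPTs themselves are never modified.
    """
    joint = []
    for i in range(2):
        row = []
        for j in range(2):
            weight = p_a[i] * p_b_given_a[i][j]
            for likelihood in ve_a:
                weight *= likelihood[i]
            for likelihood in ve_b:
                weight *= likelihood[j]
            row.append(weight)
        joint.append(row)
    total = sum(map(sum, joint))
    return [[w / total for w in row] for row in joint]


def v_ipfp(r_a, r_b, iterations=40):
    """Alternate between the two soft evidences, adding one VE per iteration."""
    ve_a, ve_b = [], []
    for k in range(iterations):
        q = beliefs(ve_a, ve_b)
        if k % 2 == 0:
            q_a = [sum(row) for row in q]
            ve_a.append([r / c for r, c in zip(r_a, q_a)])  # assumed ratio R / Q_k
        else:
            q_b = [sum(col) for col in zip(*q)]
            ve_b.append([r / c for r, c in zip(r_b, q_b)])  # assumed ratio R / Q_k
    return beliefs(ve_a, ve_b)


q = v_ipfp([0.7, 0.3], [0.5, 0.5])
q_a = [sum(row) for row in q]
q_b = [sum(col) for col in zip(*q)]
print(q_a, q_b)
```

A real BN engine would replace `beliefs` with its own inference routine; enumeration is used here only to keep the sketch self-contained.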
Inconsistent Constraints
• Smooth:
– Phase I: apply IPFP until oscillation is detected
  • Pull Q to the neighborhood of the solution
– Phase II: continue IPFP, but each time the constraint is modified
  $R_{i,k}(y^i) = (1 - \alpha) R_{i,k-1}(y^i) + \alpha Q_{k-1}(y^i)$
  where $R_{i,k}$ is the new constraint, $R_{i,k-1}$ is the current constraint, and $Q_{k-1}(y^i)$ carries the influences from other constraints
– A new constraint is generated at each step
  • Original constraints are gradually phased out
• Serialized GEMA
– New constraints are generated based only on $R_{i,k}(y^i)$ and $Q_{k-1}(y^i)$
• Incorporating these into V-IPFP for BN reasoning is straightforward
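Phase II of Smooth can be sketched for two inconsistent constraints on the same variable A. Before each fitting step the chosen constraint is blended with the current marginal, so the two constraints drift toward each other instead of oscillating. Phase I is skipped here, and the mixing weight written as `alpha`, its value, the joint table, and the function names are all assumptions for illustration.

```python
def marginal_a(joint):
    """Marginal on A (rows) of a 2-D joint table."""
    return [sum(row) for row in joint]


def fit_rows(joint, target):
    """One IPFP step: rescale rows so that the marginal on A equals `target`."""
    return [[cell * target[i] / sum(row) for cell in row]
            for i, row in enumerate(joint)]


def smooth_phase2(joint, constraints, alpha=0.3, steps=200):
    """Continue IPFP, modifying the active constraint before each fit."""
    constraints = [list(r) for r in constraints]  # keep the originals untouched
    for k in range(steps):
        i = k % len(constraints)
        q_marg = marginal_a(joint)
        # New constraint = (1 - alpha) * current constraint + alpha * Q_{k-1}
        constraints[i] = [(1 - alpha) * r + alpha * q
                          for r, q in zip(constraints[i], q_marg)]
        joint = fit_rows(joint, constraints[i])
    return joint, constraints


p = [[0.4, 0.1],
     [0.2, 0.3]]                           # hypothetical P(A, B)
originals = [[0.7, 0.3], [0.4, 0.6]]       # inconsistent constraints on A

q, smoothed = smooth_phase2(p, originals)
print(marginal_a(q), smoothed)
```

In this toy case the gap between the two smoothed constraints shrinks by a factor of (1 - alpha) at every step, so they settle on a common distribution between the two original targets.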
BN Learning with Uncertain Data
• Modify BN by a set of low-dimensional PDs (constraints)
– Approach 1:
  • Compute the JPD P(x) from the BN
  • Modify P(x) to Q*(x) by constraints using IPFP
  • Construct a new BN from Q*(x) (it may have a different structure than the original BN)
– Our approach:
  • Keep the BN structure unchanged, only modify the CPTs
  • Developed a localized version of IPFP
– Next step:
  • Dealing with inconsistency
  • Change structure (minimum necessary)
  • Learning both structure and CPT with mixed data (samples as low-dimensional PDs)
Remarks
• Wide potential applications
– Probabilistic resources are all over the place (survey data, databases, probabilistic knowledge bases of different kinds)
– This line of research may lead to effective ways to connect them
• Problems with the IPFP based approaches
– Computationally expensive
– Hard to do mathematical proofs
References:
[1] Peng, Y., Zhang, S., Pan, R.: “Bayesian Network Reasoning with Uncertain Evidences”, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 18 (5), 539-564, 2010.
[2] Pan, R., Peng, Y., Ding, Z.: “Belief Update in Bayesian Networks Using Uncertain Evidence”, in Proceedings of the IEEE International Conference on Tools with Artificial Intelligence (ICTAI-2006), Washington, DC, Nov. 13-15, 2006.
[3] Peng, Y., Ding, Z.: “Modifying Bayesian Networks by Probability Constraints”, in Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI-2005), Edinburgh, Scotland, July 26-29, 2005.