Bayes Networks

Outline:
Why Bayes Nets?
Review of Bayes’ Rule
Combining independent items of evidence
General combination of evidence
Benefits of Bayes nets for expert systems
Why Bayes Networks?
Reasoning about events involving many parts or contingencies generally requires that a joint probability distribution be known. Such a distribution might require thousands of parameters (2^n - 1 for n Boolean variables), so modeling at this level of detail is typically not practical.
Bayes nets require making assumptions about the relevance of some conditions to others (conditional independence assumptions). Once the assumptions are made, the joint distribution can be “factored” so that many fewer separate parameters must be specified.
Review of Bayes’ Rule
E: Some evidence exists; i.e., a particular condition is true.
H: Some hypothesis is true.
P(E|H) = probability of E given H.
P(E|~H) = probability of E given not H.
P(H) = prior probability of H, before considering E.
P(H|E) = P(E|H) P(H) / P(E)
where P(E) = P(E|H) P(H) + P(E|~H) (1 - P(H))
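As a sketch, Bayes’ rule is a few lines of Python (the function name and the example numbers are illustrative, not from the slides):

    def posterior(p_h, p_e_given_h, p_e_given_not_h):
        """Bayes' rule: P(H|E) = P(E|H) P(H) / P(E), with P(E)
        expanded by the rule of total probability."""
        p_e = p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)
        return p_e_given_h * p_h / p_e

    # Illustrative numbers: prior 0.01, P(E|H) = 0.9, P(E|~H) = 0.1
    print(posterior(0.01, 0.9, 0.1))  # ~0.083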
Combining Independent Items of Evidence
E1: The patient’s white blood cell count exceeds 110% of average.
E2: The patient’s body temperature is above 101°F.
H: The patient is infected with tetanus.
O(H) = P(H) / P(~H) = 0.01/0.99
O(H|E1) = λ1 O(H), where λ1 = P(E1|H) / P(E1|~H) is the sufficiency factor for high white cell count.
O(H|E2) = λ2 O(H), where λ2 = P(E2|H) / P(E2|~H) is the sufficiency factor for high body temperature.
Assuming E1 and E2 are conditionally independent given H (and given ~H):
O(H|E1 ^ E2) = λ1 λ2 O(H)
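The odds-form update is equally short. A sketch in Python; the λ values here are hypothetical, since the slide does not give them:

    def odds(p):
        """Convert a probability to odds: O = P / (1 - P)."""
        return p / (1.0 - p)

    def prob(o):
        """Convert odds back to a probability: P = O / (1 + O)."""
        return o / (1.0 + o)

    prior_odds = odds(0.01)        # O(H) = 0.01/0.99
    lam1, lam2 = 3.0, 2.5          # hypothetical sufficiency factors
    posterior_odds = lam1 * lam2 * prior_odds  # O(H|E1 ^ E2) = λ1 λ2 O(H)
    print(prob(posterior_odds))    # posterior probability of tetanus, ~0.07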
Bayes Net Example
[Network diagram: A is the parent of both B and C.]
P(A) = 0.2
P(B|A) = 0.5    P(B|~A) = 0.15
P(C|A) = 0.3    P(C|~A) = 0.1
A: Accident (an accident blocked traffic on the highway).
B: Barb Late (Barbara is late for work).
C: Chris Late (Christopher is late for work).
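As a data structure, this net is just a prior for the root A plus one conditional probability table (CPT) per child. A minimal sketch in Python (the dictionary layout is my own choice, not from the slides):

    # Root node prior
    P_A = 0.2

    # Conditional probability tables for the children, keyed by the value of A
    P_B_given_A = {True: 0.5, False: 0.15}
    P_C_given_A = {True: 0.3, False: 0.1}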
Forward Propagation
(from causes to effects)
[Network diagram repeated: A is the parent of B and C, with the same probabilities as above.]
Suppose A (there is an accident):
Then P(B|A) = 0.5
P(C|A) = 0.3
Suppose ~A (no accident):
Then P(B|~A) = 0.15
P(C|~A) = 0.1
(These come directly from the given information.)
Marginal Probabilities
(using forward propagation)
[Network diagram repeated: A is the parent of B and C, with the same probabilities as above.]
Then P(B) = probability Barb is late in any situation =
P(B|A) P(A) + P(B|~A) P(~A) = (0.5)(0.2) + (0.15)(0.8) = 0.22
Similarly P(C) = probability Chris is late in any situation =
P(C|A) P(A) + P(C|~A) P(~A) = (0.3)(0.2) + (0.1)(0.8) = 0.14
Marginalizing means eliminating a contingency by summing the probabilities
for its different cases (here A and ~A).
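These two marginalizations (and the forward-propagation lookups above) can be checked in a few lines of Python; a sketch, reusing the hypothetical dictionary layout from earlier:

    P_A = 0.2
    P_B_given_A = {True: 0.5, False: 0.15}
    P_C_given_A = {True: 0.3, False: 0.1}

    # Marginalize out A: P(X) = P(X|A) P(A) + P(X|~A) P(~A)
    P_B = P_B_given_A[True] * P_A + P_B_given_A[False] * (1 - P_A)
    P_C = P_C_given_A[True] * P_A + P_C_given_A[False] * (1 - P_A)
    print(P_B, P_C)  # 0.22 and 0.14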
Backward Propagation: “diagnosis”
(from effects to causes)
[Network diagram repeated: A is the parent of B and C, with the same probabilities as above.]
Suppose B (Barb is late).
What’s the probability of an accident on the highway?
Use Bayes’ rule:
P(A|B) = P(B|A) P(A) / P(B)
= (0.5)(0.2) / ((0.5)(0.2) + (0.15)(0.8))
= 0.1 / 0.22
= 0.4545
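The same diagnostic step as a Python sketch, reusing the numbers above:

    P_A = 0.2
    P_B_given_A = {True: 0.5, False: 0.15}

    P_B = P_B_given_A[True] * P_A + P_B_given_A[False] * (1 - P_A)  # 0.22
    P_A_given_B = P_B_given_A[True] * P_A / P_B                     # Bayes' rule
    print(P_A_given_B)  # ~0.4545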
Revising Probabilities of Consequences
[Network diagram repeated, with the prior on A replaced by the updated value P(A|B) = 0.4545; the quantity to find is P(C|B).]
P(B|A) = 0.5    P(B|~A) = 0.15
P(C|A) = 0.3    P(C|~A) = 0.1
Suppose B (Barb is late).
What’s the probability that Chris is also late, given this information?
We already figured that P(A|B) = 0.4545.
Because B and C are conditionally independent given A, we can condition on A and marginalize:
P(C|B) = P(C|A) P(A|B) + P(C|~A) P(~A|B)
= (0.3)(0.4545) + (0.1)(0.5455)
= 0.191,
somewhat higher than P(C) = 0.14.
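In Python, this update is one more weighted sum; a sketch that relies on the conditional independence just noted:

    P_A_given_B = 0.4545
    P_C_given_A = {True: 0.3, False: 0.1}

    # P(C|B) = P(C|A) P(A|B) + P(C|~A) P(~A|B), valid because C ⊥ B given A
    P_C_given_B = (P_C_given_A[True] * P_A_given_B
                   + P_C_given_A[False] * (1 - P_A_given_B))
    print(P_C_given_B)  # ~0.191, up from the marginal P(C) = 0.14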
Handling Multiple Causes
[Network diagram: A and D are both parents of B; A is also the parent of C.]
P(B|A^D) = 0.9
P(B|A^~D) = 0.45
P(B|~A^D) = 0.75
P(B|~A^~D) = 0.1
D: Disease (Barb has the flu). P(D) = 0.111
(Assuming the roots A and D are independent, these values are consistent with P(B|A) = 0.5, since (0.9)(0.111) + (0.45)(0.889) ≈ 0.5; see the check below.)
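A quick consistency check in Python (a sketch; it assumes, as the diagram implies, that the roots A and D are independent):

    P_D = 0.111
    P_B_given_AD = {(True, True): 0.9, (True, False): 0.45,
                    (False, True): 0.75, (False, False): 0.1}

    # Marginalize D out of B's table: P(B|A) = P(B|A^D) P(D) + P(B|A^~D) P(~D)
    P_B_given_A = (P_B_given_AD[(True, True)] * P_D
                   + P_B_given_AD[(True, False)] * (1 - P_D))
    print(P_B_given_A)  # ~0.49995, i.e. about 0.5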
Explaining Away
[Network diagram repeated: A and D are both parents of B; A is also the parent of C.]
P(B|A^D) = 0.9
P(B|A^~D) = 0.45
P(B|~A^D) = 0.75
P(B|~A^~D) = 0.1
Suppose B (Barb is late). This raises the probability of each cause:
P(A|B) = 0.4545
P(B|D) = P(B|A^D) P(A) + P(B|~A^D) P(~A) = (0.9)(0.2) + (0.75)(0.8) = 0.78
P(D|B) = P(B|D) P(D) / P(B) = (0.78)(0.111) / 0.22 = 0.3935
Now, in addition, suppose C (Chris is late). C makes it more likely that A is true, and A by itself explains B, so D now becomes less probable: the evidence for A “explains away” B.
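The claim that D becomes less probable can be verified by brute-force enumeration over the joint distribution. A sketch in Python; the exact value of P(D|B^C) is not on the slide, so the ~0.289 below is my own computation under the net’s factorization:

    from itertools import product

    P_A, P_D = 0.2, 0.111
    P_B_given_AD = {(True, True): 0.9, (True, False): 0.45,
                    (False, True): 0.75, (False, False): 0.1}
    P_C_given_A = {True: 0.3, False: 0.1}

    def joint(a, d, b, c):
        """Joint probability of a full assignment, using the net's factorization."""
        pa = P_A if a else 1 - P_A
        pd = P_D if d else 1 - P_D
        pb = P_B_given_AD[(a, d)] if b else 1 - P_B_given_AD[(a, d)]
        pc = P_C_given_A[a] if c else 1 - P_C_given_A[a]
        return pa * pd * pb * pc

    # P(D | B ^ C) by enumeration: fix B and C true, sum out A.
    num = sum(joint(a, True, True, True) for a in (True, False))
    den = sum(joint(a, d, True, True) for a, d in product((True, False), repeat=2))
    print(num / den)  # ~0.289, down from P(D|B) = 0.3935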
Benefits of Bayes Nets
The full joint probability distribution over n Boolean variables normally requires 2^n - 1 independent parameters.
With Bayes nets we only specify these parameters:
1. “Root” node probabilities,
e.g., P(A=true) = 0.2; P(A=false) = 0.8.
2. For each non-root node, a table of 2^k values, where k is the number of parents of that node.
Typically k < 5.
3. Propagating probabilities happens along the paths in the net. With a full joint probability distribution, many more computations may be needed.
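A toy comparison of the two parameter counts in Python (a sketch; the 20-node example is hypothetical, chosen to show how the gap grows):

    def joint_params(n):
        """Independent parameters in a full joint distribution over n Boolean variables."""
        return 2 ** n - 1

    def bayes_net_params(parent_counts):
        """Parameters in a Bayes net: one table of 2^k values per node with k parents."""
        return sum(2 ** k for k in parent_counts)

    # The three-node accident net: A has 0 parents; B and C each have 1.
    print(joint_params(3), bayes_net_params([0, 1, 1]))        # 7 vs. 5

    # A hypothetical 20-node net in which no node has more than 3 parents:
    print(joint_params(20), bayes_net_params([0] + [3] * 19))  # 1048575 vs. 153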