Bayesian Networks Inference and LS Clique-tree Propagation


Lauritzen-Spiegelhalter Algorithm
Probabilistic Inference
In Bayes Networks
Haipeng Guo
Nov. 08, 2000
KDD Lab, CIS Department, KSU
Presentation Outline
• Bayes Networks
• Probabilistic Inference in Bayes Networks
• L-S algorithm
• Computational Complexity Analysis
• Demo
Bayes Networks
• A Bayes network is a directed acyclic graph (DAG) together with a set of conditional probabilities.
• The nodes represent random variables, and arcs between nodes represent conditional dependence of the nodes.
• Each node contains a CPT (Conditional Probability Table) that gives the probability of the node taking each specific value given the values of its parents.
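As a concrete but hypothetical illustration, a fragment of such a network can be written down directly as a table of parents and CPT entries. The variable names and probabilities below are made up for the example, not taken from the slides.

```python
# A minimal sketch of a Bayes network fragment (hypothetical names and numbers):
# each node lists its parents and a CPT mapping parent values to P(node = True).
asia_fragment = {
    "VisitAsia":    {"parents": [],               "cpt": {(): 0.01}},
    "Tuberculosis": {"parents": ["VisitAsia"],    "cpt": {(True,): 0.05, (False,): 0.01}},
    "Dyspnea":      {"parents": ["Tuberculosis"], "cpt": {(True,): 0.80, (False,): 0.10}},
}
```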
Bayes Networks (example network figure)
Probabilistic Inference in Bayes Networks
• Probabilistic inference in Bayes networks is the process of computing the conditional probability of some variables given the values of other variables (evidence).
• P(V=v | E=e): suppose that I observe values e on a set of evidence variables E; what is the probability that variable V has value v, given e?
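For intuition, such a query can be answered by brute-force enumeration over the unobserved variables. The sketch below assumes the dictionary format of the fragment above and binary variables; it is exponential in the number of variables, which is exactly the cost that the propagation algorithms discussed later try to avoid.

```python
from itertools import product

def joint_prob(net, assignment):
    """P(assignment) via the chain rule: the product over nodes of P(v | parents(v))."""
    p = 1.0
    for var, spec in net.items():
        parent_vals = tuple(assignment[q] for q in spec["parents"])
        p_true = spec["cpt"][parent_vals]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

def query(net, target, value, evidence):
    """P(target = value | evidence) by enumeration (exponential; illustration only)."""
    hidden = [v for v in net if v not in evidence and v != target]
    num = den = 0.0
    for combo in product([True, False], repeat=len(hidden)):
        a = dict(evidence, **dict(zip(hidden, combo)))
        for tv in (True, False):
            a[target] = tv
            p = joint_prob(net, a)
            den += p
            if tv == value:
                num += p
    return num / den

# e.g. query(asia_fragment, "Tuberculosis", True, {"VisitAsia": True, "Dyspnea": True})
```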
Probabilistic Inference in Bayes Networks
Example problem:
• Suppose that a patient arrives and it is
known for certain that he has recently visited
Asia and has dyspnea.
• What is the impact of this evidence on the probabilities of the other variables in the network?
• Answering this question is probability propagation in the network.
Probabilistic Inference in Bayes Networks
• The problem of exact probabilistic inference in an arbitrary Bayes network is NP-Hard [Cooper 1988].
• NP-Hard problems are at least as computationally complex as NP-complete problems.
• No algorithm has ever been found that can solve an NP-complete problem in polynomial time.
• Although it has never been proved whether or not P = NP, many believe that they are not equal.
• Accordingly, it is unlikely that we could develop a general-purpose, efficient exact method for propagating probabilities in an arbitrary network.
Lauritzen-Spiegelhalter Algorithm
• The L-S algorithm is an efficient exact probabilistic inference algorithm for an arbitrary Bayes network.
• S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, 1988.
Lauritzen-Spiegelhalter Algorithm
• The L-S algorithm works in two steps:
First, it creates a tree of cliques (a join tree, or junction tree) from the original Bayes network;
Then, it computes probabilities for the cliques during a message propagation, and the individual node probabilities are calculated from the probabilities of the cliques.
L-S Algorithm: Cliques
• An undirected graph is complete if every pair of distinct nodes is adjacent.
• A clique W of G is a maximal complete subset of G; that is, there is no other complete subset of G which properly contains W.
Example: an undirected graph over nodes A, B, C, D, E with two cliques:
Clique 1: {A, B, C, D}
Clique 2: {B, D, E}
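For reference, the maximal cliques of a small undirected graph like this one can be listed with networkx; the edge list below is an assumption, reconstructed so that the two cliques above are exactly the maximal complete subsets.

```python
import networkx as nx

# Maximal cliques of the example graph (find_cliques enumerates maximal cliques).
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
                  ("B", "D"), ("C", "D"), ("B", "E"), ("D", "E")])
print(list(nx.find_cliques(G)))   # [['A', 'B', 'C', 'D'], ['B', 'D', 'E']] (order may vary)
```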
Step 1: Building the tree of cliques
In step 1, we begin with the DAG of a Bayes network and apply a series of graphical transformations that result in a join tree:
1. Construct a moral graph from the DAG of the Bayes network by marrying parents
2. Add arcs to the moral graph to form a triangulated graph, and create an ordering α of all nodes using Maximum Cardinality Search
3. Identify the cliques of the triangulated graph and order them according to α
4. Connect these cliques to build a join tree
Step 1.1: Moralization
Input: G - the DAG of a Bayes network
Output: Gm - the moral graph relative to G
Algorithm: "marry" the parents and drop the directions
(add an arc between every pair of parents of each node)
Example: the DAG over nodes A, B, C, D, E, F, G, H and the resulting moral graph Gm.
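Moralization is mechanical; here is a minimal sketch using networkx (my own helper, not prescribed by the slides):

```python
import itertools
import networkx as nx

def moralize(dag: nx.DiGraph) -> nx.Graph:
    """Return the moral graph: marry the parents of every node, then drop arc directions."""
    moral = nx.Graph()
    moral.add_nodes_from(dag.nodes)
    moral.add_edges_from(dag.edges)                                   # drop directions
    for node in dag.nodes:
        parents = list(dag.predecessors(node))
        moral.add_edges_from(itertools.combinations(parents, 2))     # marry parents
    return moral
```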
Step 1.2: Triangulation
Input: Gm - the moral graph
Output: Gu - the triangulated graph of Gm, and a perfect ordering α of the nodes
Algorithm: 1. Use Maximum Cardinality Search to create a perfect ordering α of the nodes
2. Use the Fill-in Computation algorithm to triangulate Gm, producing Gu
Example: the moral graph and its triangulation, with the Maximum Cardinality Search ordering α: A1, B2, E3, C4, G5, F6, H7, D8.
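The two sub-steps can be sketched as follows; this is a plain greedy version of Maximum Cardinality Search and of the fill-in heuristic (the slides do not fix an implementation, so the helper names and tie-breaking are my own):

```python
import itertools
import networkx as nx

def max_cardinality_search(g: nx.Graph):
    """Number the nodes so that each new node has the most already-numbered neighbors."""
    order, numbered = [], set()
    while len(order) < len(g):
        best = max((n for n in g if n not in numbered),
                   key=lambda n: len(set(g[n]) & numbered))
        order.append(best)
        numbered.add(best)
    return order

def fill_in(g: nx.Graph, order):
    """Greedy fill-in: visit nodes in reverse order and make each node's set of
    earlier-numbered neighbors complete, yielding a triangulated graph."""
    tri = g.copy()
    pos = {n: i for i, n in enumerate(order)}
    for node in reversed(order):
        earlier = [n for n in tri[node] if pos[n] < pos[node]]
        tri.add_edges_from(itertools.combinations(earlier, 2))
    return tri

# e.g.  alpha = max_cardinality_search(Gm);  Gu = fill_in(Gm, alpha)
```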
Step 1.3: Identify Cliques
Input: Gu and an ordering α of the nodes
Output: a list of the cliques of the triangulated graph Gu
Algorithm: Use a clique-finding algorithm to find the cliques of the triangulated graph, then order them according to their highest-labeled nodes with respect to α
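The ordering rule is a one-liner; a sketch, where `alpha` is the node ordering produced by Maximum Cardinality Search and `cliques` could come from `nx.find_cliques(Gu)`:

```python
def order_cliques(cliques, alpha):
    """Rank cliques by the highest α-label occurring among their member nodes."""
    rank = {node: i for i, node in enumerate(alpha)}
    return sorted(cliques, key=lambda clq: max(rank[v] for v in clq))
```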
Step 1.3: Identify Cliques
Example: the cliques of the triangulated graph, ordered by their highest-labeled nodes according to α:
Clique 1: {A1, B2}
Clique 2: {B2, E3, C4}
Clique 3: {E3, C4, G5}
Clique 4: {E3, G5, F6}
Clique 5: {C4, G5, H7}
Clique 6: {C4, D8}
Step 1.4: Build tree of Cliques
Input: a list of cliques of the triangulated graph Gu
Output: a tree of cliques, with separator nodes Si, residual nodes Ri and potential probability ψ(Clqi) computed for every clique
Algorithm: 1. Si = Clqi ∩ (Clq1 ∪ Clq2 ∪ … ∪ Clqi-1)
2. Ri = Clqi - Si
3. If i > 1, identify a j < i such that Clqj is a parent of Clqi (any earlier clique with Si ⊆ Clqj)
4. Assign each node v to a unique clique Clqi such that {v} ∪ c(v) ⊆ Clqi, where c(v) denotes the parents of v
5. Compute ψ(Clqi) = ∏ P(v | c(v)) over the nodes v assigned to Clqi (ψ(Clqi) = 1 if no v is assigned to Clqi)
6. Store Clqi, Ri, Si, and ψ(Clqi) at each vertex in the tree of cliques
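A brief sketch of steps 1-3 (separators, residuals, and a parent for each clique); computing the potentials from the CPTs (steps 4-5) is omitted here, and the helper name is my own:

```python
def clique_tree_skeleton(ordered_cliques):
    """For cliques listed in running order: Si = Clqi ∩ (Clq1 ∪ … ∪ Clqi-1), Ri = Clqi - Si,
    and the parent of Clqi is any earlier clique Clqj that contains Si."""
    seen, entries = set(), []
    for i, clq in enumerate(ordered_cliques):
        clq = set(clq)
        sep = clq & seen
        res = clq - sep
        # the running intersection property of a join tree guarantees such a j exists for i > 1
        parent = next((j for j in range(i) if sep <= set(ordered_cliques[j])), None)
        entries.append({"clique": clq, "S": sep, "R": res, "parent": parent})
        seen |= clq
    return entries
```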
Step 1.4: Build tree of Cliques
Ri: residual nodes; Si: separator nodes; ψ(Clqi): potential probability of clique i

Example (values stored at each vertex of the tree of cliques):
Clq1 = {A, B}      R1 = {A, B}   S1 = {}       ψ(Clq1) = P(B|A)P(A)
Clq2 = {B, E, C}   R2 = {C, E}   S2 = {B}      ψ(Clq2) = P(C|B,E)
Clq3 = {E, C, G}   R3 = {G}      S3 = {E, C}   ψ(Clq3) = 1
Clq4 = {E, G, F}   R4 = {F}      S4 = {E, G}   ψ(Clq4) = P(E|F)P(G|F)P(F)
Clq5 = {C, G, H}   R5 = {H}      S5 = {C, G}   ψ(Clq5) = P(H|C,G)
Clq6 = {C, D}      R6 = {D}      S6 = {C}      ψ(Clq6) = P(D|C)
Step 1: Conclusion
In step 1, we begin with the DAG of a Bayes network and apply a series of graphical transformations that result in a Permanent Tree of Cliques. Stored at each vertex in the tree are the following:
1. A clique Clqi
2. Si
3. Ri
4. ψ(Clqi)
Step 2: Computation inside the join tree
• In step 2, we start by making a copy of the Permanent Tree of Cliques, giving Clqi', Si', Ri', ψ'(Clqi') and P'(Clqi'), leaving P'(Clqi') unassigned at first.
• Then we compute the prior probabilities P'(Clqi'), using the same updating algorithm as the one used to determine the conditional probabilities based on instantiated values.
• After this initialization, we compute the posterior probabilities P'(Clqi') again, with the evidence entered, by passing λ and π messages in the join tree.
• When the probabilities of all cliques are determined, we can compute the probability of each variable from any clique containing that variable.
Step 2: Message passing in the join tree
• The message passing process consists of first sending so-called λ messages from the bottom of the tree to the top, then sending π messages from the top to the bottom, modifying and accumulating the node properties (ψ'(Clqi') and P'(Clqi')) along the way.
• The λ message sent upward is a summed product of all probabilities below the given node.
• The π messages sent downward carry the information needed to update the node prior probabilities.
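To make the two passes concrete, here is a compact sum-product sketch over a rooted clique tree. It follows the same upward (collect) and downward (distribute) pattern, but the bookkeeping is Hugin-style (the downward step divides out the message the child sent upward) rather than the slides' exact λ/π formulas, and the table representation is my own, so treat it as an illustration rather than the L-S algorithm itself. Here `psi[i]` maps each full assignment of clique i's variables (a frozenset of (variable, value) pairs) to its potential; after both passes, `prob[i]` holds the unnormalized joint marginal over clique i.

```python
def restrict(assignment, keep):
    """Project an assignment (a frozenset of (variable, value) pairs) onto the variables in `keep`."""
    return frozenset((v, x) for v, x in assignment if v in keep)

def marginalize(table, keep):
    """Sum a potential table down onto the variables in `keep`."""
    out = {}
    for a, p in table.items():
        key = restrict(a, keep)
        out[key] = out.get(key, 0.0) + p
    return out

def times(table, msg, msg_vars):
    """Multiply a message defined over `msg_vars` into a clique table."""
    return {a: p * msg[restrict(a, msg_vars)] for a, p in table.items()}

def upward(i, psi, children, sep, lam, up_msg):
    """Collect (λ) pass: absorb each child's message, then pass own marginal over Si to the parent."""
    belief = psi[i]
    for c in children[i]:
        up_msg[c] = upward(c, psi, children, sep, lam, up_msg)
        belief = times(belief, up_msg[c], sep[c])
    lam[i] = belief                       # ψ'(Clqi') after absorbing the whole subtree below i
    return marginalize(belief, sep[i])

def downward(i, children, sep, lam, up_msg, prob, pi=None):
    """Distribute (π) pass: combine the parent's message to obtain the (unnormalized) P'(Clqi')."""
    prob[i] = times(lam[i], pi, sep[i]) if pi is not None else lam[i]
    for c in children[i]:
        parent_marg = marginalize(prob[i], sep[c])
        # divide out the message child c already sent upward, then pass the result down to c
        pi_c = {a: (parent_marg[a] / up_msg[c][a]) if up_msg[c][a] else 0.0
                for a in parent_marg}
        downward(c, children, sep, lam, up_msg, prob, pi_c)
```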
Step 2: Message passing in the join tree
(Figure: the λ messages flow upward through the tree of cliques Clq1 … Clq6, and the π messages flow downward.)
• ψ'(Clqi') is modified as the λ messages are passed.
• P'(Clqi') is computed as the π messages are passed.
Step 2: Message passing in the join tree
• When the probabilities of all cliques are determined, then for each vertex Clqi and each variable v ∈ Clqi:
P'(v) = Σ P'(Clq'i = w), summed over all configurations w of Clq'i that are consistent with v.
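Under the table representation used in the sketch above, this last step is a single marginalization (normalization accounts for evidence having been multiplied into the potentials); the helper below is my own:

```python
def node_marginal(clique_prob, var):
    """P'(v): sum the clique's probability table over all configurations that agree on var."""
    out = {}
    for assignment, p in clique_prob.items():
        val = dict(assignment)[var]
        out[val] = out.get(val, 0.0) + p
    total = sum(out.values()) or 1.0
    return {val: p / total for val, p in out.items()}
```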
L-S Algorithm: Computational Complexity Analysis
1. Computations in the algorithm which creates the permanent Tree of Cliques --- O(n·r^m)
• Moralization – O(n)
• Maximum Cardinality Search – O(n+e)
• Fill-in algorithm for triangulating – O(n+e)
---- the general problem of finding an optimal triangulation (minimal fill-in) is NP-Hard, but we are using a greedy heuristic
• Find cliques and build the join tree – O(n+e)
---- the general problem of finding maximum cliques in an arbitrary graph is NP-Hard, but our subject is a triangulated graph
• Compute ψ(Clqi) – O(n·r^m)
---- n = number of variables; m = the maximum number of variables in a clique; r = the maximum number of alternatives for a variable
L-S Algorithm: Computational Complexity Analysis
2. Computations in the updating algorithm --- O(p·r^m)
• Computation for sending all λ messages --- 2p·r^m
• Computation for sending all π messages --- p·r^m
• Computation for receiving all λ messages --- p·r^m
• Computation for receiving all π messages --- p·r^m
---- p = number of vertices in the tree of cliques
⇒ The L-S algorithm has a time complexity of O(p·r^m); in the worst case it is bounded below by 2^m, i.e. it is Ω(2^m).
L-S Algorithm: Computational Complexity Analysis
• It may seem that we should search for a better general-purpose algorithm to perform probability propagation.
• But in practice, most Bayes networks created by human hands contain small clusters of variables, and therefore have a small value of m. So the L-S algorithm works efficiently for many applications, because the networks available so far are often sparse and irregular.
• The L-S algorithm can have very bad performance on more general networks.
L-S Algorithm: Alternative methods
• Since the general problem of probability propagation is NP-Hard, it is unlikely that we could develop an efficient general-purpose algorithm for propagating probabilities in an arbitrary Bayes network.
• This suggests that research should be directed towards obtaining alternative methods which work in different cases:
1. Approximate algorithms
2. Monte Carlo techniques
3. Heuristic algorithms
4. Parallel algorithms
5. Special case algorithms
L-S Algorithm: Demo
• Laura works on step 1
• Ben works on step 2