Bayesian Networks Inference and LS Clique-tree Propagation
Download
Report
Transcript Bayesian Networks Inference and LS Clique-tree Propagation
Lauritzen-Spiegelhalter Algorithm
Probabilistic Inference
In Bayes Networks
Haipeng Guo
Nov. 08, 2000
KDD Lab, CIS Department, KSU
Presentation Outline
•
Bayes Networks
•
Probabilistic Inference in Bayes Networks
•
L-S algorithm
•
Computational Complexity Analysis
•
Demo
Bayes Networks
• A Bayes network is a directed acyclic
graph with a set of conditional probabilities
• The nodes represent random variables and
arcs between nodes represent conditional
dependence of the nodes.
• Each node contains a CPT(Conditional
Probabilistic Table) that contains probabilities
of the node being a specific value given the
values of its parents
Bayes Networks
Probabilistic Inference in Bayes Networks
• Probabilistic Inference in Bayes
Networks is the process of computing the
conditional probability for some variables
given the values of other variables
(evidence).
• P(V=v| E=e): Suppose that I observe e on a
set of variables E(evidence ), what is the
probability that variable V has value v, given
e?
Probabilistic Inference in Bayes Networks
Example problem:
• Suppose that a patient arrives and it is
known for certain that he has recently visited
Asia and has dyspnea.
• What’s the impact that this has on the
probabilities of the other variables in the
network ?
• Probability Propagation in the network
Probabilistic Inference in Bayes Networks
• The problem of exact probabilistic inference in an
arbitrary Bayes network is NP-Hard.[Cooper 1988]
• NP-Hard problems are at least as computational
complex as NP-complete problems
• No algorithms has ever been found which can solve
a NP-complete problem in polynomial time
• Although it has never been proved whether P = NP
or not, many believe that it indeed is not possible.
• Accordingly, it is unlikely that we could develop an
general-purpose efficient exact method for
propagating probabilities in an arbitrary network
Lauritzen-Spiegelhalter Algorithm
•
L-S algorithm is an efficient exact
probability inference algorithm in an
arbitrary Bayes Network
•
S. L. Lauritzen and D. J. Spiegelhalter.
Local computations with probabilities on
graphical structures and their application to
expert systems. Journal of the Royal
Statistical Society , 1988.
Lauritzen-Spiegelhalter Algorithm
•
L-S algorithm works in two steps:
First, it creates a tree of cliques(join tree
or junction tree), from the original Bayes
network;
Then, it computes probabilities for the
cliques during a message propagation and
the individual node probabilities are
calculated from the probabilities of cliques.
L-S Algorithm: Cliques
• An undirected graph is compete if every
pair of distinct nodes is adjacent
• a clique W of G is a maximal complete
subset of G, that is, there is no other complete
subset of G which properly contains W
A
B
Clique 1: {A, B, C, D}
E
C
D
Clique 2: {B, D, E}
Step 1: Building the tree of cliques
In step 1, we begin with the DAG of a Bayes Network,
and apply a series of graphical transformation that
result in a join tree:
1. Construct a moral graph from the DAG of a Bayes
network by marrying parents
2. Add arcs to the moral graph to form a triangulated
graph and create an order of all nodes using
Maximum Cardinality Search
3. Identify cliques from the triangulated graph and
order them according to order
4. Connect these cliques to build a join tree
Step 1.1: Moralization
Input: G - the DAG of a Bayes Network,
Output: Gm - the Moral graph relative to G
Algorithm: “marry” the parents and drop the direction
(add arc for every pair of parents of all nodes)
A
F
B
A
G
E
B
C
D
F
G
E
C
H
D
H
Step 1.2: Triangulation
Input: Gm - the Moral graph
Output: Gu – a perfect ordering of the nodes and the
triangulated graph of Gm
Algorithm: 1. Use Maximum Cardinality Search to create a
perfect ordering of the nodes
2. Use Fill-in Computation algorithm to triangulate Gu
A
F
B
A1
G
E
B2
C
D
F6
G5
E3
C4
H
D8
H7
Step 1.3: Identify Cliques
Input: Gu and a ordering of the nodes
Output: a list of cliques of the triangulated graph Gu
Algorithm: Use Cliques-Finding algorithm to find
cliques of a triangulated graph then order them
according to their highest labeled nodes according
to order
Step 1.3: Identify Cliques
A1
Clique 1
F6
B2
Clique 2
B2
A1
G5
E3
E3
F6
G5
E3
B2
C4
G5
E3
C4
Clique 3
G5
C4
C4
D8
Clique 4
C4
D8
H7
Clique 6
order them according to their highest
labeled nodes according to order
Clique 5
H7
Step 1.4: Build tree of Cliques
Input: a list of cliques of the triangulated graph Gu
Output: Create a tree of cliques, compute Separators nodes
Si,Residual nodes Ri and potential probability (Clqi) for all
cliques
Algorithm: 1. Si = Clqi (Clq1 Clq2 … Clqi-1)
2. Ri = Clqi - Si
3. If i >1 then identify a j < i such that Clqj is a parent of Clqi
4. Assign each node v to a unique clique Clqi that v c(v)
Clqi
5. Compute (Clqi) = f(v) Clqi =P(v|c(v)) {1 if no v is
assigned to Clqi}
6. Store Clqi , Ri , Si, and (Clqi) at each vertex in the tree
of cliques
Step 1.4: Build tree of Cliques
Ri: Residual nodes
Si: Separator nodes
(Clqi): potential probability of Clique i
A1
Clq1
Clq4
Clq2
B2
E3
E3
G5
Clq3
Clq3 = {E,C,G}
R3 = {G}
S3 = { E,C }
(Clq4) =
P(E|F)P(G|F)P(F)
C4
(Clq3) = 1
Clq3
CG
H7
Clq4 = {E, G, F}
R4 = {F}
S4 = { E,G }
(Clq5) = P(H|C,G)
CGH
Clq4
D8
Clq5
Clq2
EG
EGF
C4
(Clq2) = P(C|B,E)
ECG EC
C4
C4
Clq2 = {B,E,C}
R2 = {C,E}
S2 = { B }
G5
E3
B
BEC
G5
(Clq1) = P(B|A)P(A)
Clq1
F6
B2
Clq6
AB
Clq1 = {A, B}
R1 = {A, B}
S1 = {}
Clq5
Clq5 = {C, G,H}
R5 = {H}
S5 = { C,G }
CD
Clq6 = {C, D}
R5 = {D}
S5 = { C}
C
Clq6
(Clq2) = P(D|C)
Step 1: Conclusion
In step 1, we begin with the DAG of a Bayes Network,
and apply a series of graphical transformation that
result in a Permanent Tree of Cliques. Stored at
each vertex in the tree are the following:
1. A clique Clqi
2. Si
3. Ri
4. (Clqi)
Step 2: Computation inside the join tree
•
In step 2, we start from copying a copy of the Permanent Tree of
Cliques to Clqi’, Si’,, Ri’ and ’ (Clqi ’) and P’ (Clqi’) , leave P’ (Clqi’)
unassigned at first.
•
Then we compute the prior probability P’ (Clqi’) using the same
updating algorithm as the one to determine the conditional
probabilities based on instantiated values.
•
After initialization, we compute the posterior probability P’ (Clqi’)
again with evidence by passing and messages in the join tree
•
When the probabilities of all cliques are determined , we can
compute the probability for each variable from any clique
containing the variable
Step 2: Message passing in the join tree
•
The message passing process consists of first
sending so-called messages from the bottom of
the tree to the top, then sending messages from
the top to the bottom, modifying and accumulating
node properties(’ (Clqi ’) and P’ (Clqi’) ) along the way
•
The message upward is a summed product of all
probabilities below the given node
•
The messages downward is information for
updating the node prior probabilities
Step 2: Message passing in the join tree
Upward messages
Clq1
Clq2
• ’ (Clqi ’) is modified as the
messages passing is going
Clq3
Downward messages
Clq4
Clq5
• P’ (Clqi’) is computed as the
messages passing is going
Clq6
Step 2: Message passing in the join tree
•
When the probabilities of all cliques are determined ,
for each vertex Clqi and each variable v Clqi , do
P' (v) :
P' (Clq' )
wClq 'i , w v
i
L-S Algorithm:Computational Complexity
Analysis
1. Computations in the Algorithm which creates the permanent
Tree of Cliques --- O(nrm)
•
Moralization – O(n)
•
Maximum Cardinality Search – O(n+e)
•
Fill-in algorithm for triangulating – O(n+e)
---- the general problem for finding optimal triangulation (minimal fill-in) is
NP-Hard, but we are using a greedy heuristic
•
Find Cliques and build join tree – O(n+e)
--- the general problem for finding minimal Cliques from an arbitrary graph
is NP-Hard, but our subject is a triangulated graph
•
Compute (Clqi) – O(nrm)
--- n = number of variables; m = the maximum number of variables in a
clique; r = maximum number of alternatives for a variable
L-S Algorithm:Computational Complexity
Analysis
2. Computations in the updating Algorithm --- O(prm )
•
Computation for sending all messages --- 2prm
•
Computation for sending all messages --- prm
•
Computation for receiving all messages --- prm
•
Computation for receiving all messages --- prm
---- p = number of vertices in the tree of cliques
L-S algorithm has a time complexity of O(prm), in the
worst case it is bounded below by 2m, i.e. (2m)
L-S Algorithm:Computational Complexity
Analysis
•
It may seem that we should search for a better generalpurpose algorithm to perform probability propagation
•
But in practice, most Bayes networks created by human
hands should often contain small clusters of variables,
and therefore a small value of m. So L-S algorithm works
efficiently for many application because networks available
so far are often sparse and irregular.
•
L-S algorithm could have a very bad performance for more
general networks
L-S Algorithm: Alternative methods
•
Since the general problem of probability propagation is NPHard, it is unlikely that we could develop an efficient generalpurpose algorithm for propagating probabilities in an arbitrary
Bayes network.
•
This suggests that research should be directed towards
obtaining alternative methods which work in different cases:
1. Approximate algorithms
2. Monte Carlo techniques
3. Heuristic algorithms
4. Parallel algorithms
5. Special case algorithms
L-S Algorithm: Demo
•
Laura works on step 1
•
Ben works on step 2