Lifted First-Order Probabilistic Inference [Poole 2005]

Download Report

Transcript Lifted First-Order Probabilistic Inference [Poole 2005]

Lifted First-Order Probabilistic Inference
[de Salvo Braz, Amir, and Roth, 2005]
Daniel Lowd
5/11/2005
Key Ideas
Do exact inference at the first-order level,
rather than grounding out the network
When we have no evidence concerning
many objects, we can treat them identically
Allows for queries that primarily depend on
the size of a domain
Background: variable elimination
Key idea: compute exact marginal probability by
iteratively summing out variables.
Example: want to compute P(A,C) in a Markov network:
A
B
D
C
Pr( A, C )   AB ( A, B) AD ( A, D) BC ( B, C )CD (C , D)
B
D
Background: variable elimination

B
AB
( A, B) AD ( A, D) BC ( B, C )CD (C , D)
D
1. Distribute across sums:

B
AB
( A, B) BC ( B, C ) AD ( A, D)CD (C , D)
D
Background: variable elimination

B
AB
( A, B) AD ( A, D) BC ( B, C )CD (C , D)
D
1. Distribute across sums:

AB
( A, B) BC ( B, C ) AD ( A, D)CD (C , D)
B
2. Sum out D:
D

B
AB
( A, B)BC ( B, C ) D ( A, C )
Background: variable elimination

B
AB
( A, B) AD ( A, D) BC ( B, C )CD (C , D)
D
1. Distribute across sums:

AB
( A, B) BC ( B, C ) AD ( A, D)CD (C , D)
B
2. Sum out D:
D

AB
( A, B)BC ( B, C ) D ( A, C )
 D ( A, C ) AB ( A, B)BC ( B, C )
B
B
Background: variable elimination

B
AB
( A, B) AD ( A, D) BC ( B, C )CD (C , D)
D
1. Distribute across sums:

AB
( A, B) BC ( B, C ) AD ( A, D)CD (C , D)
B
2. Sum out D:
D

AB
( A, B)BC ( B, C ) D ( A, C )
 D ( A, C ) AB ( A, B)BC ( B, C )
B
B
3. Sum out B:  ( A, C ) ( A, C )
D
B
Background: variable elimination

B
AB
( A, B) AD ( A, D) BC ( B, C )CD (C , D)
D
1. Distribute across sums:

AB
( A, B) BC ( B, C ) AD ( A, D)CD (C , D)
B
2. Sum out D:
D

AB
( A, B)BC ( B, C ) D ( A, C )
 D ( A, C ) AB ( A, B)BC ( B, C )
B
B
3. Sum out B: ( A, C ) ( A, C )
D
B
  DB ( A, C)  Pr( A, C)
First-order variable elimination
Instead of factors  and , use parameterized
factors, or parfactors: , A,C


 – potential function
A – set of atoms (may be parameterized)
C – set of constraints on groundings  of the

atoms in A
Example parfactor
Given MLN clause:
{w, Friends(A,B) => Friends(B,A)}
One parfactor ( , A, C ) might be:
: 0.7 if Friends(A,B) => Friends(B,A) is true
0.3 otherwise
 A: Friends(A,B), Friends(B,A)
 C: A != bob

Definitions
Logical variable: a predicate parameter
(e.g., in Friends(A, B), A and B are logical
variables)
Notation: LV(S)  logical vars in S
Random variable: the value of a functor
(e.g., Friends(anna, bob) is a random
variable in the Friends/Smokes domain)
Notation: RV(S)  random vars in S
Joint distribution
We can represent any MLN as a set of
parfactors G.
Joint probability of a world:
Pr( RV (G))    g ( Ag )
gG  G
All random variables
(i.e., predicate truth
assignments) in the world
Joint distribution
We can represent any MLN as a set of
parfactors G.
Joint probability of a world:
Pr( RV (G))    g ( Ag )
gG  G
All random variables
(i.e., predicate truth
assignments) in the world
Parfactors
Groundings
Joint distribution
We can represent any MLN as a set of
parfactors G.
Joint probability of a world:
Pr( RV (G))    g ( Ag )
gG  G
All random variables
(i.e., predicate truth
assignments) in the world
Parfactors
Potential of ground atoms
Groundings
Query probability
Pr(Q) 
  (G)
RV ( G ) \ Q
Query probability
Pr(Q) 
  (G)
RV ( G ) \ Q


  (G
RV ( G ) \ Q \ RV ( E ) RV ( E )
Split apart summation
E
) (GE )
Query probability
Pr(Q) 
  (G)
RV ( G ) \ Q


  (G
E
) (GE )
RV ( G ) \ Q \ RV ( E ) RV ( E )


 (GE )
RV ( G ) \ Q \ RV ( E )
Push first factor before summation
  (G
RV ( E )
E
)
Query probability
Pr(Q) 
  (G) 
RV ( G ) \ Q


  (G ) (G
 (G )   (G
E
E
RV ( G ) \ Q \ RV ( E )

  (G
RV ( E )
E
RV ( G ) \ Q \ RV ( E )

  (G
RV ( G ) \ Q \ RV ( E )
E
)
E
)
RV ( G ) \ Q \ RV ( E ) RV ( E )

Substitute parfactor g’:
E
 g') 
  (G' )
RV ( G ) \ Q \ RV ( E )
) ( g ' )
How to find g’?
Inversion elimination
Complexity is independent of number of
groundings
 Not always applicable

Counting elimination
Always applicable
 Requires computing multinomial distributions
(potentially large factorials)

Shattering
Using unification, split parfactors as
necessary to ensure two conditions:

For every atom pair (p,q) in G, RV(p) and
RV(q) are either identical or disjoint.
Good: Friends(anna, B) and Friends(bob, B)
 Bad: Friends(anna, B) and Friends(A, bob)


Incomplete overlap prevents us from reordering terms for
inversion elimination.
Shattering (cont.)
The second condition is used by counting
elimination:

For every atom pair (p,q) in every parfactor g in G, p
and q are never instantiated to the same random
variable.


Good: Friends(A, B) and Friends(B, A), A != B
Bad: Friends(A, B) and Friends(B, A)

Friends(A,B) and Friends(B,A) may be instantiated to the same
random variable, e.g., Friends(anna, anna), and thus are not
independent.
Inversion elimination
Requirements
E = {e}
(a single atom, possibly parameterized)
 LV(e) = LV(g)
(all logical variables that appear in e’s parfactor, g, also
appear as parameters in e)

Example: suppose Ag = {Friends(A,B),
Smokes(A), and Smokes(B)}


Good: E = {Friends(A,B)}
Bad: E = {Smokes(B)}

Fewer instantiations of Smokes(B) than of parfactor g, since g
is over more logical variables.
Inversion Elimination
 (g)    (A )
g
g
RV (e) G
RV(e)
Because our parfactors were shattered, every
single term is independent, allowing us to invert
the sum and the product:

  (A ) (g')
g
g
 G RV(e)
(see paper for full details)
Counting elimination
Inversion elimination cannot be applied to:
Ag = {Professor(A), IsQualsCourse(B)}
Set E consists of multiple atoms, so that remaining
atoms in g are ground.
Each atom may take on one of several values (e.g.,
for predicates, True or False)
Key idea: Sum out atoms by counting the number
of groundings for each configuration (independent
assignment of atoms to values).
(See paper for further details.)