Relational Models
CSE 515 in One Slide
We will learn to:
Put probability distributions on everything
Learn them from data
Do inference with them
Beyond Vectors
We want to put distributions on:
Trees
Graphs
Objects of different types and their relations
Class hierarchies
Relational databases
Knowledge bases
Programs
Etc.
Two Approaches
Piecemeal
Develop probabilistic models for each
All-in-one
All are easily expressed in first-order logic
Add probability to first-order logic
First-Order Logic
Symbols: Constants, variables, functions, predicates
E.g.: Anna, x, MotherOf(x), Friends(x, y)
Logical connectives: Conjunction, disjunction,
negation, implication, quantification, etc.
Literal: Atom (predicate) or its negation
Clause: Disjunction of literals
All formulas can be converted to conjunction
of clauses
Grounding: Replace all variables by constants
E.g.: Friends(Anna, Bob)
World: Assignment of truth values to all ground atoms
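The grounding and world definitions above can be illustrated with a small sketch. The two constants and the predicate arities are illustrative assumptions taken from the running example, not part of the definitions themselves.

```python
from itertools import product

# Illustrative assumptions: two constants and three predicates with fixed arities.
constants = ["Anna", "Bob"]
predicates = {"Smokes": 1, "Cancer": 1, "Friends": 2}

# Grounding: replace every argument position by every constant.
ground_atoms = [
    f"{name}({', '.join(args)})"
    for name, arity in predicates.items()
    for args in product(constants, repeat=arity)
]

def all_worlds(atoms):
    """Yield every world, i.e. every truth assignment to the ground atoms."""
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

num_worlds = sum(1 for _ in all_worlds(ground_atoms))
print(len(ground_atoms), num_worlds)
```

The number of worlds doubles with every additional ground atom, which is why exhaustive enumeration is only practical for toy domains.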
Example: Friends & Smokers
Smoking causes cancer.
Friends have similar smoking habits.
$\forall x\; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$
$\forall x, y\; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$
Relational Models
Representation                        Logical Language     Probabilistic Language
Knowledge-based model construction    Horn clauses         Bayes nets
Stochastic logic programs             Horn clauses         PCFGs
Probabilistic relational models       Frame systems        Bayes nets
Relational Markov networks            SQL queries          Markov nets
Bayesian logic                        First-order          Bayes nets
Markov logic                          First-order logic    Markov nets
Markov Logic
Most developed approach to date
Many other approaches can be viewed
as special cases
Main focus of this lecture
Markov Logic: Intuition
A logical KB is a set of hard constraints
on the set of possible worlds
Let’s make them soft constraints:
When a world violates a formula,
It becomes less probable, not impossible
Give each formula a weight
(Higher weight ⇒ Stronger constraint)
$P(\text{world}) \propto \exp\left(\sum \text{weights of formulas it satisfies}\right)$
Markov Logic: Definition
A Markov Logic Network (MLN) is a set of
pairs (F, w) where
F is a formula in first-order logic
w is a real number
Together with a set of constants,
it defines a Markov network with
One node for each grounding of each predicate in
the MLN
One feature for each grounding of each formula F
in the MLN, with the corresponding weight w
Example: Friends & Smokers
1.5   $\forall x\; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$
1.1   $\forall x, y\; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$
Two constants: Anna (A) and Bob (B)
Ground atoms (nodes of the ground Markov network):
Smokes(A), Smokes(B), Cancer(A), Cancer(B),
Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)
[Figure: ground Markov network over these eight atoms, built up step by step]
Markov Logic Networks
MLN is template for ground Markov nets
Probability of a world x:
$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i \, n_i(x) \right)$
$w_i$: Weight of formula i
$n_i(x)$: No. of true groundings of formula i in x
Typed variables and constants greatly reduce
size of ground Markov net
Functions, existential quantifiers, etc.
Infinite and continuous domains
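As a rough illustration of the probability of a world, the sketch below evaluates it by brute force for the Friends & Smokers MLN with constants A and B. The function and variable names are hypothetical, and exhaustive enumeration of all 2^8 worlds is an assumption that only works for toy domains.

```python
import math
from itertools import product

CONSTANTS = ["A", "B"]
ATOMS = (
    [f"Smokes({c})" for c in CONSTANTS]
    + [f"Cancer({c})" for c in CONSTANTS]
    + [f"Friends({a},{b})" for a in CONSTANTS for b in CONSTANTS]
)

def n_smoking_causes_cancer(world):
    # True groundings of: Smokes(x) => Cancer(x)
    return sum((not world[f"Smokes({x})"]) or world[f"Cancer({x})"] for x in CONSTANTS)

def n_friends_smoke_alike(world):
    # True groundings of: Friends(x,y) => (Smokes(x) <=> Smokes(y))
    return sum(
        (not world[f"Friends({x},{y})"]) or (world[f"Smokes({x})"] == world[f"Smokes({y})"])
        for x in CONSTANTS
        for y in CONSTANTS
    )

# (weight w_i, counting function n_i) pairs, with the weights from the example.
FORMULAS = [(1.5, n_smoking_causes_cancer), (1.1, n_friends_smoke_alike)]

def unnormalized(world):
    return math.exp(sum(w * n(world) for w, n in FORMULAS))

WORLDS = [dict(zip(ATOMS, vals)) for vals in product([False, True], repeat=len(ATOMS))]
Z = sum(unnormalized(x) for x in WORLDS)  # partition function

def probability(world):
    return unnormalized(world) / Z

all_false = dict.fromkeys(ATOMS, False)
one_violation = dict(all_false, **{"Smokes(A)": True})  # Anna smokes, no cancer
print(probability(all_false) / probability(one_violation))
```

The two worlds differ in exactly one true grounding of the first formula, so their probability ratio is e^1.5.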
Relation to Statistical Models
Markov logic has all the models we’ve seen
in this class as special cases
Markov logic allows objects to be
interdependent (non-i.i.d.)
Markov logic makes it easy to compose
models
Relation to First-Order Logic
Infinite weights ⇒ First-order logic
Satisfiable KB, positive weights ⇒
Satisfying assignments = Modes of distribution
Markov logic allows contradictions between
formulas
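The limit of infinite weights can be checked numerically on a toy case. The sketch below assumes a single ground formula, Smokes(A) => Cancer(A), over two ground atoms; the function name is hypothetical.

```python
import math
from itertools import product

def prob_formula_holds(weight):
    """P(Smokes(A) => Cancer(A)) under an MLN with that single ground formula."""
    worlds = list(product([False, True], repeat=2))  # (smokes, cancer)

    def score(smokes, cancer):
        satisfied = (not smokes) or cancer
        return math.exp(weight if satisfied else 0.0)

    z = sum(score(s, c) for s, c in worlds)
    return sum(score(s, c) for s, c in worlds if (not s) or c) / z

for w in [0.0, 1.5, 5.0, 20.0]:
    print(w, prob_formula_holds(w))
```

With weight 0 the four worlds are equally likely, so the formula holds with probability 3/4. As the weight grows, the single violating world loses almost all of its probability mass, which approaches the hard-constraint behaviour of pure first-order logic.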
Inference in Markov Logic
P(Formula|MLN,C) = ?
MCMC: Sample worlds, check formula holds
P(Formula1|Formula2,MLN,C) = ?
If Formula2 = Conjunction of ground atoms
First construct min subset of network necessary to
answer query (generalization of KBMC)
Then apply MCMC, belief propagation, etc.
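A minimal sketch of "sample worlds, check formula holds", using Gibbs sampling as one possible MCMC method. The choice of Gibbs sampling, the sweep counts, and all names are assumptions for illustration; the exact answer by enumeration is included only because this domain is tiny.

```python
import math
import random
from itertools import product

CONSTANTS = ["A", "B"]
ATOMS = ([f"Smokes({c})" for c in CONSTANTS] + [f"Cancer({c})" for c in CONSTANTS]
         + [f"Friends({a},{b})" for a in CONSTANTS for b in CONSTANTS])

def log_score(world):
    """Sum of w_i * n_i(world) for the two Friends & Smokers formulas."""
    n1 = sum((not world[f"Smokes({x})"]) or world[f"Cancer({x})"] for x in CONSTANTS)
    n2 = sum((not world[f"Friends({x},{y})"])
             or (world[f"Smokes({x})"] == world[f"Smokes({y})"])
             for x in CONSTANTS for y in CONSTANTS)
    return 1.5 * n1 + 1.1 * n2

def gibbs_estimate(query, num_sweeps=5000, burn_in=500, seed=0):
    """Estimate P(query holds) as the fraction of sampled worlds satisfying it."""
    rng = random.Random(seed)
    world = {a: rng.random() < 0.5 for a in ATOMS}
    hits = 0
    for sweep in range(num_sweeps + burn_in):
        for atom in ATOMS:
            # Resample one atom from its conditional given all the others.
            world[atom] = True
            s_true = log_score(world)
            world[atom] = False
            s_false = log_score(world)
            p_true = 1.0 / (1.0 + math.exp(s_false - s_true))
            world[atom] = rng.random() < p_true
        if sweep >= burn_in and query(world):
            hits += 1
    return hits / num_sweeps

def exact(query):
    """Exact probability by enumerating all worlds, for comparison."""
    worlds = [dict(zip(ATOMS, v)) for v in product([False, True], repeat=len(ATOMS))]
    z = sum(math.exp(log_score(x)) for x in worlds)
    return sum(math.exp(log_score(x)) for x in worlds if query(x)) / z

def query(world):
    return world["Cancer(A)"]

print(gibbs_estimate(query), exact(query))
```

The query here is a single ground atom, but any function of the world (for example, a ground formula) can be passed in the same way.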
Ground Network Construction
network ← Ø
queue ← query nodes
repeat
node ← front(queue)
remove node from queue
add node to network
if node not in evidence then
add neighbors(node) to queue
until queue = Ø
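The pseudocode above can be sketched in Python as follows. The neighbor map is written out by hand for the Friends & Smokers example, and the check that skips nodes already in the network is an addition not shown on the slide (it avoids expanding a node twice).

```python
from collections import deque

def construct_network(query_nodes, evidence, neighbors):
    """Return the ground atoms needed to answer the query.

    neighbors maps each ground atom to the atoms it shares a ground formula with.
    """
    network = set()
    queue = deque(query_nodes)
    while queue:
        node = queue.popleft()
        if node in network:
            continue  # already expanded
        network.add(node)
        if node not in evidence:
            queue.extend(n for n in neighbors[node] if n not in network)
    return network

# Hand-written neighbor map for two constants A and B (illustrative).
NEIGHBORS = {
    "Smokes(A)": ["Cancer(A)", "Friends(A,A)", "Friends(A,B)", "Friends(B,A)", "Smokes(B)"],
    "Smokes(B)": ["Cancer(B)", "Friends(B,B)", "Friends(A,B)", "Friends(B,A)", "Smokes(A)"],
    "Cancer(A)": ["Smokes(A)"],
    "Cancer(B)": ["Smokes(B)"],
    "Friends(A,A)": ["Smokes(A)"],
    "Friends(A,B)": ["Smokes(A)", "Smokes(B)"],
    "Friends(B,A)": ["Smokes(A)", "Smokes(B)"],
    "Friends(B,B)": ["Smokes(B)"],
}

evidence = {"Smokes(A)", "Friends(A,B)", "Friends(B,A)"}
network = construct_network(["Cancer(B)"], evidence, NEIGHBORS)
print(sorted(network))
```

Because expansion stops at evidence nodes, atoms that are reachable only through evidence (here Cancer(A) and Friends(A,A)) never enter the network.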
Example Grounding
1.5   $\forall x\; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$
1.1   $\forall x, y\; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$
Query: $P(\mathit{Cancer}(B) \mid \mathit{Smokes}(A), \mathit{Friends}(A,B), \mathit{Friends}(B,A))$
Ground atoms: Smokes(A), Smokes(B), Cancer(A), Cancer(B),
Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)
[Figure: ground network for this query, built up step by step]
Potentials in the resulting network:
On Smokes(B): $e^{2.2}$ if Smokes(B), $e^{0}$ otherwise
On Smokes(B), Cancer(B): $e^{1.5}$ if $\neg \mathit{Smokes}(B) \lor \mathit{Cancer}(B)$, $e^{0}$ otherwise
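As a worked check of this example, the query can be answered by summing over the two unknown atoms Smokes(B) and Cancer(B), using the potentials e^2.2 (on Smokes(B)) and e^1.5 (on Smokes(B) and Cancer(B)). The sketch assumes the remaining non-evidence atoms do not affect the query, since their ground formulas either involve only evidence or are tautologies; the function names are hypothetical.

```python
import math
from itertools import product

def potential_smokes_b(smokes_b):
    # Two groundings of the friendship formula (weight 1.1 each) reduce to
    # Smokes(B) once Smokes(A), Friends(A,B), Friends(B,A) are fixed to true.
    return math.exp(2.2) if smokes_b else math.exp(0.0)

def potential_smokes_cancer_b(smokes_b, cancer_b):
    # Grounding Smokes(B) => Cancer(B) of the first formula (weight 1.5).
    return math.exp(1.5) if (not smokes_b) or cancer_b else math.exp(0.0)

def prob_cancer_b():
    total = 0.0
    cancer_mass = 0.0
    for smokes_b, cancer_b in product([False, True], repeat=2):
        weight = potential_smokes_b(smokes_b) * potential_smokes_cancer_b(smokes_b, cancer_b)
        total += weight
        if cancer_b:
            cancer_mass += weight
    return cancer_mass / total

print(prob_cancer_b())
```

Under these assumptions the result is (e^3.7 + e^1.5) / (e^3.7 + e^2.2 + 2 e^1.5), roughly 0.77.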
Other Options
Lazy inference
Lifted inference
Learning
Data is a relational database
Closed world assumption → Complete data
Learning parameters (weights)
Learning structure (formulas)
Weight Learning
Parameter tying: Groundings of same clause share one weight
$\frac{\partial}{\partial w_i} \log P_w(x) = n_i(x) - E_w[n_i(x)]$
$n_i(x)$: No. of times clause i is true in data
$E_w[n_i(x)]$: Expected no. of times clause i is true according to MLN
Generative learning: Pseudo-likelihood
Discriminative learning: Cond. likelihood,
use MC-SAT or MaxWalkSAT for inference
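The gradient of the log-likelihood (observed count minus expected count) can be sketched by exact enumeration on the two-constant Friends & Smokers domain. Computing the expectation exactly is an assumption that only holds for toy domains, which is why the slide points to pseudo-likelihood and sampling-based inference; the training world and all names below are made up for illustration.

```python
import math
from itertools import product

CONSTANTS = ["A", "B"]
ATOMS = ([f"Smokes({c})" for c in CONSTANTS] + [f"Cancer({c})" for c in CONSTANTS]
         + [f"Friends({a},{b})" for a in CONSTANTS for b in CONSTANTS])
WORLDS = [dict(zip(ATOMS, v)) for v in product([False, True], repeat=len(ATOMS))]

def counts(world):
    """Return (n_1(x), n_2(x)): true groundings of the two clauses."""
    n1 = sum((not world[f"Smokes({x})"]) or world[f"Cancer({x})"] for x in CONSTANTS)
    n2 = sum((not world[f"Friends({x},{y})"])
             or (world[f"Smokes({x})"] == world[f"Smokes({y})"])
             for x in CONSTANTS for y in CONSTANTS)
    return (n1, n2)

def score(weights, world):
    return sum(w * n for w, n in zip(weights, counts(world)))

def log_likelihood(weights, data_world):
    log_z = math.log(sum(math.exp(score(weights, x)) for x in WORLDS))
    return score(weights, data_world) - log_z

def gradient(weights, data_world):
    """d/dw_i log P_w(x) = n_i(x) - E_w[n_i(x)], with an exact expectation."""
    scores = [math.exp(score(weights, x)) for x in WORLDS]
    z = sum(scores)
    expected = [sum(s * counts(x)[i] for s, x in zip(scores, WORLDS)) / z
                for i in range(len(weights))]
    return [n - e for n, e in zip(counts(data_world), expected)]

# Hypothetical training world: Anna smokes and has cancer, Bob does neither,
# and the two are friends.
data = dict.fromkeys(ATOMS, False)
data.update({"Smokes(A)": True, "Cancer(A)": True,
             "Friends(A,B)": True, "Friends(B,A)": True})
print(gradient([1.5, 1.1], data))
```

A gradient-ascent learner would repeatedly add a small multiple of this gradient to the weights.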
Structure Learning
Generalizes feature induction in Markov nets
Any inductive logic programming approach
can be used, but . . .
Goal is to induce any clauses, not just Horn
Evaluation function should be likelihood
Requires learning weights for each candidate
Turns out not to be bottleneck
Bottleneck is counting clause groundings
Solution: Subsampling
Structure Learning
Initial state: Unit clauses or hand-coded KB
Operators: Add/remove literal, flip sign
Evaluation function:
Pseudo-likelihood + Structure prior
Search: Beam, shortest-first, bottom-up, etc.
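The three search operators can be sketched on a simplified clause representation. Representing a clause as a set of (atom template, sign) pairs and ignoring how variables are shared between literals are simplifying assumptions, and the names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Literal = Tuple[str, bool]   # (atom template, True if positive)
Clause = FrozenSet[Literal]

def refinements(clause: Clause, vocabulary: List[str]) -> Set[Clause]:
    """Clauses reachable by one operator: add literal, remove literal, flip sign."""
    result: Set[Clause] = set()
    atoms_used = {atom for atom, _ in clause}
    for atom in vocabulary:                      # add a literal (either sign)
        if atom not in atoms_used:
            result.add(clause | {(atom, True)})
            result.add(clause | {(atom, False)})
    for literal in clause:
        if len(clause) > 1:                      # remove a literal
            result.add(clause - {literal})
        atom, positive = literal                 # flip a literal's sign
        result.add((clause - {literal}) | {(atom, not positive)})
    return result

vocabulary = ["Smokes(x)", "Cancer(x)", "Friends(x,y)"]
unit_clause: Clause = frozenset({("Smokes(x)", True)})
candidates = refinements(unit_clause, vocabulary)
print(len(candidates))
```

A beam search would score each candidate (for example, by pseudo-likelihood plus a structure prior), keep the best few, and refine those again.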
Alchemy
Open-source software including:
Full first-order logic syntax
Inference: MCMC, belief propagation, etc.
Generative & discriminative weight learning
Structure learning
Programming language features
alchemy.cs.washington.edu
Example: Information Extraction
Parag Singla and Pedro Domingos, “Memory-Efficient
Inference in Relational Domains” (AAAI-06).
Singla, P., & Domingos, P. (2006). Memory-efficent
inference in relatonal domains. In Proceedings of the
Twenty-First National Conference on Artificial Intelligence
(pp. 500-505). Boston, MA: AAAI Press.
H. Poon & P. Domingos, Sound and Efficient Inference
with Probabilistic and Deterministic Dependencies”, in
Proc. AAAI-06, Boston, MA, 2006.
P. Hoifung (2006). Efficent inference. In Proceedings of the
Twenty-First National Conference on Artificial Intelligence.
Segmentation
Fields: Author, Title, Venue
[Figure: the four citations above, segmented into fields]
Entity Resolution
[Figure: the four citations above]
State of the Art
Segmentation:
HMM (or CRF) to assign each token to a field
Entity resolution:
Logistic regression to predict same field/citation
Transitive closure
Alchemy implementation: Seven formulas
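The transitive-closure step of the conventional pipeline can be sketched with a union-find structure: pairwise match decisions (for example, from a classifier) are merged into clusters. The citation identifiers and matched pairs below are made up for illustration.

```python
def transitive_closure(items, matched_pairs):
    """Group items into clusters implied by pairwise match decisions."""
    parent = {item: item for item in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matched_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return list(clusters.values())

citations = ["C1", "C2", "C3", "C4"]
pairs = [("C1", "C2"), ("C2", "C3")]  # hypothetical pairwise predictions
clusters = transitive_closure(citations, pairs)
print(sorted(sorted(c) for c in clusters))
```

Here C1 and C3 end up in the same cluster even though no pair links them directly.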
Types and Predicates
Types (declarations are optional):
token = {Parag, Singla, and, Pedro, ...}
field = {Author, Title, Venue}
citation = {C1, C2, ...}
position = {0, 1, 2, ...}
Evidence predicate:
Token(token, position, citation)
Query predicates:
InField(position, field, citation)
SameField(field, citation, citation)
SameCit(citation, citation)
Formulas
Token(+t,i,c) => InField(i,+f,c)
InField(i,+f,c) <=> InField(i+1,+f,c)
f != f' => (!InField(i,+f,c) v !InField(i,+f',c))
Token(+t,i,c) ^ InField(i,+f,c) ^ Token(+t,i',c')
    ^ InField(i',+f,c') => SameField(+f,c,c')
SameField(+f,c,c') <=> SameCit(c,c')
SameField(f,c,c') ^ SameField(f,c',c'')
    => SameField(f,c,c'')
SameCit(c,c') ^ SameCit(c',c'') => SameCit(c,c'')
Formulas (second formula refined to stop at periods)
Token(+t,i,c) => InField(i,+f,c)
InField(i,+f,c) ^ !Token(".",i,c) <=> InField(i+1,+f,c)
f != f' => (!InField(i,+f,c) v !InField(i,+f',c))
Token(+t,i,c) ^ InField(i,+f,c) ^ Token(+t,i',c')
    ^ InField(i',+f,c') => SameField(+f,c,c')
SameField(+f,c,c') <=> SameCit(c,c')
SameField(f,c,c') ^ SameField(f,c',c'')
    => SameField(f,c,c'')
SameCit(c,c') ^ SameCit(c',c'') => SameCit(c,c'')
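In Alchemy, a `+` before a variable means that a separate weight is learned for each constant of that variable's type. The sketch below only illustrates that expansion by string substitution; it is not Alchemy's implementation, and the function name and the small token domain are assumptions.

```python
from itertools import product

def expand_per_constant(template, domains):
    """Instantiate each '+var' in the template with every constant of its type."""
    variables = list(domains)
    formulas = []
    for values in product(*(domains[v] for v in variables)):
        formula = template
        for var, value in zip(variables, values):
            formula = formula.replace(f"+{var}", value)
        formulas.append(formula)
    return formulas

# Illustrative domains; a real token vocabulary would be much larger.
domains = {"t": ["Parag", "Singla", "and"], "f": ["Author", "Title", "Venue"]}
expanded = expand_per_constant("Token(+t,i,c) => InField(i,+f,c)", domains)
for formula in expanded:
    print(formula)
```

Each of the resulting formulas gets its own weight, so the first formula alone behaves like a table of token-by-field weights.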
Results: Segmentation on Cora
[Figure: precision vs. recall curves for four models: Tokens; Tokens + Sequence; Tok. + Seq. + Period; Tok. + Seq. + P. + Comma]
Results: Matching Venues on Cora
[Figure: precision vs. recall curves for four models: Similarity; Sim. + Relations; Sim. + Transitivity; Sim. + Rel. + Trans.]