Inference in First
Download
Report
Transcript Inference in First
CS 2710, ISSP 2610
Chapter 9
Inference in First-Order Logic
1
Pages to skim
• Storage and Retrieval (p. starts bottom
328)
• Efficient forward chaining (starts p. 333)
through Irrelevant facts (ends top 337)
• Efficient implementation of logic programs
(starts p. 340) through Constraint logic
programming (ends p. 345)
• Completeness of resolution (starts p. 350)
(though see notes in slides)
2
Inference with Quantifiers
• Universal Instantiation:
– Given X (person(X) likes(X, sun))
– Infer person(john) likes(john,sun)
• Existential Instantiation:
– Given x likes(x, chocolate)
– Infer: likes(S1, chocolate)
– S1 is a “Skolem Constant” that is not
found anywhere else in the KB and
refers to (one of) the individuals that
likes sun.
3
Reduction to Propositional
Inference
• Simple form (pp. 324-325) not efficient. Useful
conceptually.
• Replace each universally quantified sentence by all
possible instantiations
– All X (man(X) mortal(X)) replaced by
– man(tom) mortal(tom)
– man(chocolate) mortal(chocolate)
– …
• Now, we essentially have propositional logic.
• Use propositional reasoning algorithms from Ch 7
4
Reduction to Propositional
Inference
• Problem: when the KB includes a function
symbol, the set of term substitutions is
infinite. father(father(father(tom))) …
• Herbrand 1930: if a sentence is entailed by
the original FO KB, then there is a proof
using a finite subset of the
propositionalized KB
• Since any subset has a maximum depth of
nesting in terms, we can find the subset by
generating all instantiations with constant
symbols, then all with depth 1, and so on
5
Reduction to Propositional
Inference
• We have an approach to FO inference via
propositionalization that is complete: any entailed
sentence can be proved
• Entailment for FOPC is semi-decidable: algorithms
exist that say yes to every entailed sentence, but
no algorithm exists that also says no to every
nonentailed sentence.
• Our proof procedure could go on and on,
generating more and more deeply nested terms,
but we will not know whether it is stuck in a loop,
or whether the proof is just about to pop out
6
Generalized Modus Ponens
• This is a general inference rule for
FOL that does not require
instantiation
• Given:
– p1’, p2’ … pn’ (p1 … pn) q
– Subst(theta, pi’) = subst(theta, pi) for all i
• Conclude:
– Subst(theta, q)
7
GMP is a lifted version of MP
• GMP “lifts” MP from propositional to
first-order logic
• Key advantage of lifted inference
rules over propositionalization is that
they make only substitutions which
are required to allow particular
inferences to proceed
8
GMP Example
• x,y,z ((parent(x,y) parent(y,z))
grandparent(x,z))
• parent(james, john), parent(james,
richard), parent(harry, james)
• We can derive:
– Grandparent(harry, john), bindings:
{x/harry,y/james,z/john}
– Grandparent(harry, richard), bindings:
{x/harry,y/james,z/richard}
9
Unification
• Process of finding all legal
substitutions
• Key component of all FO inference
algorithms
• Unify(p,q) = theta, where
Subst(theta,p) == Subst(theta,q)
Assuming all variables universally
quantified
10
Standardizing apart
• knows(john,X).
• knows(X,elizabeth).
• These ought to unify, since john knows
everyone, and everyone knows elizabeth.
• Rename variables to avoid such name
clashes
Note:
all X p(X) == all Y p(Y)
All X (p(X) ^ q(X)) == All X p(X) ^ All Y p(Y)
11
def Unify (p, q, bdgs):
d = disagreement(p, q)
# If there is no disagreement, then success.
if not d: return bdgs
elif not isVar(d[0]) and not isVar(d[1]): return 'fail'
else:
if isVar(d[0]): var = d[0] ; other = d[1]
else: var = d[1] ; other = d[0]
if occursp (var,other): return ‘fail’
# Make appropriate substitutions and recurse on the result.
else:
pp = replaceAll(var,other,p)
qq = replaceAll(var,other,q)
return Unify (pp,qq, bdgs + [[var,other]])
For code, click here unify.py
12
================================
unify:
['loves', ['dog', 'var_x'], ['dog', 'fred']]
['loves', 'var_z', 'var_z']
subs: [['var_z', ['dog', 'var_x']], ['var_x', 'fred']]
result: ['loves', ['dog', 'fred'], ['dog', 'fred']]
================================
unify:
['loves', ['dog', 'fred'], 'fred']
['loves', 'var_x', 'var_y']
subs: [['var_x', ['dog', 'fred']], ['var_y', 'fred']]
result: ['loves', ['dog', 'fred'], 'fred']
================================
unify:
['loves', ['dog', 'fred'], 'mary']
['loves', ['dog', 'var_x'], 'var_y']
subs: [['var_x', 'fred'], ['var_y', 'mary']]
result: ['loves', ['dog', 'fred'], 'mary']
================================
13
unify:
['loves', ['dog', 'fred'], 'mary']
['loves', ['dog', 'var_x'], 'var_y']
subs: [['var_x', 'fred'], ['var_y', 'mary']]
result: ['loves', ['dog', 'fred'], 'mary']
================================
unify:
['loves', ['dog', 'fred'], 'fred']
['loves', 'var_x', 'var_x']
failure
================================
unify:
['loves', ['dog', 'fred'], 'mary']
['loves', ['dog', 'var_x'], 'var_x']
failure
================================
unify:
['loves', 'var_x', 'fred']
['loves', ['dog', 'var_x'], 'fred']
var_x occurs in ['dog', 'var_x']
failure
14
unify:
['loves', 'var_x', ['dog', 'var_x']]
['loves', 'var_y', 'var_y']
var_y occurs in ['dog', 'var_y']
failure
================================
unify:
['loves', 'var_y', 'var_y']
['loves', 'var_x', ['dog', 'var_x']]
var_x occurs in ['dog', 'var_x']
failure
================================
unify: (fails because vars not standardized apart)
['hates', 'agatha', 'var_x']
['hates', 'var_x', ['f1', 'var_x']]
failure
================================
unify:
['hates', 'agatha', 'var_x']
['hates', 'var_y', ['f1', 'var_y']]
subs: [['var_y', 'agatha'], ['var_x', ['f1', 'agatha']]]
result: ['hates', 'agatha', ['f1', 'agatha']]
15
Most General Unifier
• The Unify algorithm returns a MGU
L1 = p(X,f(Y),b)
L2 = p(X,f(b),b)
Subst1 = {a/X, b/Y}
Result1 = p(a,f(b),b)
Subst2 = {b/Y}
Result2 = p(X,f(b),b)
Subst1 is more restrictive than Subst2. In fact,
Subst2 is a MGU of L1 and L2.
16
Storage and retrieval
• Hash statements by predicate for quick
retrieval (predicate indexing), e.g., of all
sentences that unify with tall(X)
• Why attempt to unify
– tall(X) and silly(dog(Y))
• Instead
– Predicates[tall] = {all tall facts}
– Unify(tall(X),s) for s in Predicates[tall]
• Subsumption lattice for efficiency (see p.
329)
17
Forward Chaining over FO
Definite (Horn) Clauses
• Clauses (disjunctions) with at most
one positive literal
• First-order literals can include
variables, which are assumed to be
universally quantified
• Use GMP to perform forward chaining
(Semi-decidable as for full FOPC)
18
Def FOL-FC-Ask(KB,A) returns subst or false
KB: set of FO definite clauses with variables standardized apart
A: the query, an atomic sentence
Repeat until new is empty
new {}
for each implication (p1 ^ … ^ pn q) in KB:
for each T such that SUBST(T,p1^…^pn) =
SUBST(T,p1’^…^pn’) for some p1’,…,pn’ in KB
q’ SUBST(T,q)
if q’ is not a renaming of a sentence already in KB or new:
add q’ to new
S Unify(q’,A)
if S is not fail then return S
add new to KB
Return false
Process can be made more efficient; read on your own, for interest
19
Backward Chaining over
Definite (Horn) Clauses
• Logic programming
• Prolog is most popular form
• Depth-first search, so space
requirements are lower, but suffers
from problems from repeated states
20
american(X) ^ weapon(Y) ^ sells(X,Y,Z) ^ hostile(Z) criminal(X).
owns(nono,m1). missile(m1).
missile(X1) ^ owns(nono,X1) sells(west,X1,nono).
missile(X2) weapon(X2).
enemy(X3,america) hostile(X3).
american(west).
enemy(nono,america).
Goal: criminal(west).
Backward chaining proof: in lecture
In Prolog:
criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
21
Horn clauses are all of the form:
L1 ^ L2 ^ ... ^ Ln -> Ln+1
Or, equivalently, in clausal form:
~L1 v ~L2 v ... v ~Ln v Ln+1
Prolog (like databases) makes the "closed world assumption":
if P cannot be proved, infer not P
Think of the system as an arrogant know-it-all:
"If it were true, I would know it. Since I can't prove
it, it must not be true"
Thus, it uses "negation as failure".
22
neighbor(canada,us)
neighbor(mexico,us)
neighbor(pakistan,india)
?- neighbor(canada,india).
no
In full first-order logic, you would have to be able to
infer “~neighbor(canada,india)" for
"neighbor(canada,india)" to be false.
Be careful! “~neighbor(canada,india) is not entailed by the
Sentences above!
23
bachelor(X) :- male(X), \+ married(X).
male(bill).
male(jim).
married(bill).
married(mary).
An individual is a bachelor if it is male and it is
not married. \+ is the negation-as-failure operator in Prolog.
| ?- bachelor(bill).
no
| ?- bachelor(jim).
yes
| ?- bachelor(mary).
no
| ?- bachelor(X).
X = jim;
no
| ?-
24
Inference Methods
• Unification (prerequisite)
• Forward Chaining
– Production Systems
– RETE Method (OPS)
• Backward Chaining
– Logic Programming (Prolog)
• Resolution
– Transform to CNF
– Generalization of Prop. Logic resolution
25
Resolution Theorem Proving
(FOL)
• Convert everything to CNF
• Resolve, with unification
– Save bindings as you go!
• If resolution is successful, proof
succeeds
• If there was a variable in the item to
prove, return variable’s value from
unification bindings
26
Converting to CNF
27
Converting sentences to CNF
1. Eliminate all ↔ connectives
(P ↔ Q) ((P Q) ^ (Q P))
2. Eliminate all connectives
(P Q) (P Q)
3. Reduce the scope of each negation symbol to a single predicate
P P
(P Q) P Q
(P Q) P Q
(x)P (x)P
(x)P (x)P
4. Standardize variables: rename all variables so that each quantifier
has its own unique variable name
28
Converting sentences to clausal form:
Skolem constants and functions
5. Eliminate existential quantification by introducing Skolem
constants/functions
(x)P(x) P(c)
c is a Skolem constant (a brand-new constant symbol
that is not used in any other sentence)
(x)(y)P(x,y) becomes (x)P(x, f(x))
since is within the scope of a universally quantified
variable, use a Skolem function f to construct a new
value that depends on the universally quantified variable
f must be a brand-new function name not occurring in any
other sentence in the KB.
E.g., (x)(y)loves(x,y) becomes (x)loves(x,f(x))
In this case, f(x) specifies the person that x loves
E.g., x1 x2 x3 y p(… y …) becomes
29
x1 x2 x3 p(… ff(x1,x2,x3) …) (ff is a new name)
Converting sentences to clausal
form
6. Remove universal quantifiers by (1) moving them all to the
left end; (2) making the scope of each the entire sentence;
and (3) dropping the “prefix” part
Ex: (x)P(x) P(x)
7. Put into conjunctive normal form (conjunction of
disjunctions) using distributive and associative laws
(P Q) R (P R) (Q R)
(P Q) R (P Q R)
8. Split conjuncts into separate clauses
9. Standardize variables so each clause contains only variable
names that do not occur in any other clause
30
An example
(x)(P(x) ((y)(P(y) P(f(x,y))) (y)(Q(x,y) P(y))))
2. Eliminate
(x)(P(x) ((y)(P(y) P(f(x,y))) (y)(Q(x,y) P(y))))
3. Reduce scope of negation
(x)(P(x) ((y)(P(y) P(f(x,y))) (y)(Q(x,y) P(y))))
4. Standardize variables
(x)(P(x) ((y)(P(y) P(f(x,y))) (z)(Q(x,z) P(z))))
5. Eliminate existential quantification
(x)(P(x) ((y)(P(y) P(f(x,y))) (Q(x,g(x)) P(g(x)))))
6. Drop universal quantification symbols
(P(x) ((P(y) P(f(x,y))) (Q(x,g(x)) P(g(x)))))
31
An Example
7. Convert to conjunction of disjunctions
(P(x) P(y) P(f(x,y))) (P(x) Q(x,g(x)))
(P(x) P(g(x)))
8. Create separate clauses
P(x) P(y) P(f(x,y))
P(x) Q(x,g(x))
P(x) P(g(x))
9. Standardize variables
P(x) P(y) P(f(x,y))
P(z) Q(z,g(z))
P(w) P(g(w))
32
1. all X (read (X) --> literate (X))
2. all X (dolphin (X) --> ~literate (X))
3. exists X (dolphin (X) ^ intelligent (X))
(a translation of ``Some dolphins are intelligent'')
``Are there some who are intelligent but cannot read?''
4. exists X (intelligent(X) ^ ~read (X))
Set of clauses (1-3):
1. ~read(X) v literate(X)
2. ~dolphin(Y) v ~literate(Y)
3a. dolphin (a)
3b. intelligent (a)
Negation of 4:
~(exists Z (intelligent(Z) ^ ~read (Z)))
In Clausal form:
~intelligent(Z) v read(Z)
Resolution proof: in lecture.
33
More complicated example
Did Curiosity kill the cat
• Jack owns a dog. Every dog owner is an animal lover. No
animal lover kills an animal. Either Jack or Curiosity killed the
cat, who is named Tuna. Did Curiosity kill the cat?
• These can be represented as follows:
A. (x) (Dog(x) Owns(Jack,x))
B. (x) (((y) Dog(y) Owns(x, y)) AnimalLover(x))
C. (x) (AnimalLover(x) ((y) Animal(y) Kills(x,y)))
D. Kills(Jack,Tuna) Kills(Curiosity,Tuna)
E. Cat(Tuna)
F. (x) (Cat(x) Animal(x) )
G. Kills(Curiosity, Tuna)
GOAL
34
• Convert to clause form
A1. (Dog(D))
D is a skolem constant
A2. (Owns(Jack,D))
B. (Dog(y), Owns(x, y), AnimalLover(x))
C. (AnimalLover(a), Animal(b), Kills(a,b))
D. (Kills(Jack,Tuna), Kills(Curiosity,Tuna))
E. Cat(Tuna)
F. (Cat(z), Animal(z))
• Add the negation of query:
G: (Kills(Curiosity, Tuna))
35
• The resolution refutation proof
R1: G, D, {}
(Kills(Jack, Tuna))
R2: R1, C, {a/Jack, b/Tuna}
(~AnimalLover(Jack),
~Animal(Tuna))
R3: R2, B, {x/Jack}
(~Dog(y), ~Owns(Jack, y),
~Animal(Tuna))
R4: R3, A1, {y/D}
(~Owns(Jack, D),
~Animal(Tuna))
R5: R4, A2, {}
(~Animal(Tuna))
R6: R5, F, {z/Tuna}
(~Cat(Tuna))
R7: R6, E, {}
FALSE
36
• The proof tree
G
D
{}
R1: K(J,T)
C
{a/J,b/T}
R2: AL(J) A(T)
B
{x/J}
R3: D(y) O(J,y) A(T)
A1
{y/D}
R4: O(J,D), A(T)
A2
{}
R5: A(T)
F
{z/T}
R6: C(T)
A
{}
R7: FALSE
37
Decidability and Completeness
• Resolution is a refutation complete
inference procedure for First-Order Logic
– If a set of sentences contains a contradiction,
then a finite sequence of resolutions will prove
this.
– If not, resolution may loop forever (“semidecidable”)
• Here are notes by Charles Elkan that go
into this more deeply
38
Decidability and Completeness
• Refutation Completeness: If KB |= A then KB |- A
– If it’s entailed, then there’s a proof
• Semi-decidable:
– If there’s a proof, we’ll halt with it.
– If not, maybe halt, maybe not
• Logical entailment in FOL is semidecidable: if the desired conclusion
follows from the premises, then eventually
resolution refutation will find a
contradiction
39
Decidability and Completeness
• Propositional logic
– logical entailment is decidable
– There exists a complete inference
procedure
• First-Order logic
– logical entailment is semi-decidable
– Resolution procedure is refutation
complete
40
• Strategies (heuristics) for efficient
resolution include
– Unit preference. If a clause has only one
literal, use it first.
– Set of support. Identify “useful” rules and
ignore the rest. (p. 305)
– Input resolution. Intermediately generated
sentences can only be combined with original
inputs or original rules.
– Subsumption. Prune unnecessary facts from
the database.
41
Comparing backward chaining in
prolog with resolution
• In lecture
42