
CS 2710, ISSP 2610
Chapter 9
Inference in First-Order Logic
1
Pages to skim
• Storage and Retrieval (starts at the bottom of p. 328)
• Efficient forward chaining (starts p. 333)
through Irrelevant facts (ends top 337)
• Efficient implementation of logic programs
(starts p. 340) through Constraint logic
programming (ends p. 345)
• Completeness of resolution (starts p. 350)
(though see notes in slides)
2
Inference with Quantifiers
• Universal Instantiation:
– Given X (person(X)  likes(X, sun))
– Infer person(john)  likes(john,sun)
• Existential Instantiation:
– Given x likes(x, chocolate)
– Infer: likes(S1, chocolate)
– S1 is a “Skolem Constant” that is not
found anywhere else in the KB and
refers to (one of) the individuals that
likes sun.
3
Reduction to Propositional
Inference
• Simple form (pp. 324-325) is not efficient, but is
useful conceptually.
• Replace each universally quantified sentence by all
possible instantiations
– ∀X (man(X) → mortal(X)) is replaced by
– man(tom) → mortal(tom)
– man(chocolate) → mortal(chocolate)
– …
• Now, we essentially have propositional logic.
• Use propositional reasoning algorithms from Ch 7
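To make the instantiation step concrete, here is a small, illustrative Python sketch (the list-based representation and all names are invented for this example, in the spirit of the unification traces later in these slides): it replaces one universally quantified rule by its ground instances over the constants in the KB.

from itertools import product

def is_var(t):
    # Variables are strings that start with 'var_'.
    return isinstance(t, str) and t.startswith('var_')

def variables_of(term):
    """Collect every variable occurring anywhere in a term."""
    if is_var(term):
        return {term}
    if isinstance(term, list):
        return set().union(set(), *(variables_of(t) for t in term))
    return set()

def substitute(term, binding):
    """Replace variables using a {variable: constant} dict."""
    if is_var(term):
        return binding.get(term, term)
    if isinstance(term, list):
        return [substitute(t, binding) for t in term]
    return term

def propositionalize(rule, constants):
    """Yield every ground instance of a universally quantified rule."""
    vs = sorted(variables_of(rule))
    for combo in product(constants, repeat=len(vs)):
        yield substitute(rule, dict(zip(vs, combo)))

# all X (man(X) -> mortal(X)), instantiated over the constants {tom, chocolate}:
rule = ['implies', ['man', 'var_x'], ['mortal', 'var_x']]
for ground in propositionalize(rule, ['tom', 'chocolate']):
    print(ground)
# ['implies', ['man', 'tom'], ['mortal', 'tom']]
# ['implies', ['man', 'chocolate'], ['mortal', 'chocolate']]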
4
Reduction to Propositional
Inference
• Problem: when the KB includes a function
symbol, the set of term substitutions is
infinite. father(father(father(tom))) …
• Herbrand 1930: if a sentence is entailed by
the original FO KB, then there is a proof
using a finite subset of the
propositionalized KB
• Since any finite subset has a maximum
depth of nesting in its terms, we can find the
subset by generating all instantiations with
constant symbols, then all with depth 1, and so on
5
Reduction to Propositional
Inference
• We have an approach to FO inference via
propositionalization that is complete: any entailed
sentence can be proved
• Entailment for FOPC is semi-decidable: algorithms
exist that say yes to every entailed sentence, but
no algorithm exists that also says no to every
nonentailed sentence.
• Our proof procedure could go on and on,
generating more and more deeply nested terms,
but we will not know whether it is stuck in a loop,
or whether the proof is just about to pop out
6
Generalized Modus Ponens
• This is a general inference rule for
FOL that does not require
propositionalizing the KB first
• Given:
– p1’, p2’, …, pn’, and (p1 ∧ … ∧ pn) → q
– Subst(theta, pi’) = Subst(theta, pi) for all i
• Conclude:
– Subst(theta, q)
7
GMP is a lifted version of MP
• GMP “lifts” MP from propositional to
first-order logic
• Key advantage of lifted inference
rules over propositionalization is that
they make only substitutions which
are required to allow particular
inferences to proceed
8
GMP Example
• ∀x,y,z ((parent(x,y) ∧ parent(y,z)) →
grandparent(x,z))
• parent(james, john), parent(james,
richard), parent(harry, james)
• We can derive:
– grandparent(harry, john), bindings:
{x/harry, y/james, z/john}
– grandparent(harry, richard), bindings:
{x/harry, y/james, z/richard}
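As a rough illustration (not the textbook's code), the following Python sketch performs one GMP step on the grandparent rule: unify each premise with a known fact, then apply the accumulated bindings to the conclusion. The tiny unifier and the list-based term representation are simplifications invented for this sketch.

def is_var(t): return isinstance(t, str) and t.startswith('var_')

def unify(p, q, theta):
    """Tiny unifier (no occurs check): returns extended bindings or None."""
    if theta is None: return None
    if p == q: return theta
    if is_var(p): return unify(theta[p], q, theta) if p in theta else {**theta, p: q}
    if is_var(q): return unify(p, theta[q], theta) if q in theta else {**theta, q: p}
    if isinstance(p, list) and isinstance(q, list) and len(p) == len(q):
        for a, b in zip(p, q):
            theta = unify(a, b, theta)
        return theta
    return None

def subst(term, theta):
    """Apply bindings to a term."""
    if is_var(term): return subst(theta[term], theta) if term in theta else term
    if isinstance(term, list): return [subst(t, theta) for t in term]
    return term

premises   = [['parent', 'var_x', 'var_y'], ['parent', 'var_y', 'var_z']]
conclusion = ['grandparent', 'var_x', 'var_z']
facts      = [['parent', 'harry', 'james'], ['parent', 'james', 'john']]

# GMP: unify each premise pi with a known sentence pi', then apply the
# resulting substitution to the conclusion q.
theta = {}
for prem, fact in zip(premises, facts):
    theta = unify(prem, fact, theta)
print(subst(conclusion, theta), theta)
# ['grandparent', 'harry', 'john'] {'var_x': 'harry', 'var_y': 'james', 'var_z': 'john'}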
9
Unification
• Process of finding substitutions that
make two logical expressions identical
• Key component of all FO inference
algorithms
• Unify(p,q) = theta, where
Subst(theta,p) == Subst(theta,q)
Assuming all variables universally
quantified
10
Standardizing apart
• knows(john,X).
• knows(X,elizabeth).
• These ought to unify, since john knows
everyone, and everyone knows elizabeth.
• Rename variables to avoid such name
clashes
Note:
all X p(X) == all Y p(Y)
all X (p(X) ^ q(X)) == all X p(X) ^ all Y q(Y)
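A minimal sketch of standardizing apart, assuming the list-based terms and var_ naming used in the unification traces that follow (the function and counter scheme are my own):

import itertools

_counter = itertools.count(1)

def standardize_apart(term, renaming=None):
    """Return a copy of term with every variable replaced by a fresh one."""
    if renaming is None:
        renaming = {}
    if isinstance(term, str) and term.startswith('var_'):
        if term not in renaming:
            renaming[term] = 'var_%d' % next(_counter)
        return renaming[term]
    if isinstance(term, list):
        return [standardize_apart(t, renaming) for t in term]
    return term

# knows(john, X) and knows(X, elizabeth) should unify once the two X's
# are recognized as different variables:
p = standardize_apart(['knows', 'john', 'var_x'])       # ['knows', 'john', 'var_1']
q = standardize_apart(['knows', 'var_x', 'elizabeth'])  # ['knows', 'var_2', 'elizabeth']
print(p, q)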
11
def Unify(p, q, bdgs):
    d = disagreement(p, q)
    # If there is no disagreement, then success.
    if not d: return bdgs
    elif not isVar(d[0]) and not isVar(d[1]): return 'fail'
    else:
        if isVar(d[0]): var = d[0]; other = d[1]
        else:           var = d[1]; other = d[0]
        if occursp(var, other): return 'fail'
        # Make appropriate substitutions and recurse on the result.
        else:
            pp = replaceAll(var, other, p)
            qq = replaceAll(var, other, q)
            return Unify(pp, qq, bdgs + [[var, other]])
For code, see unify.py
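unify.py itself is not reproduced here. As a rough sketch only (these definitions are mine, not necessarily the course's), helpers along the following lines make the routine above runnable on list-structured terms whose variables are strings beginning with var_:

def isVar(t):
    # Variables are strings that start with 'var_'.
    return isinstance(t, str) and t.startswith('var_')

def disagreement(p, q):
    """Return the first pair of corresponding subterms where p and q differ,
    or None if the two expressions are identical."""
    if p == q:
        return None
    if isinstance(p, list) and isinstance(q, list) and len(p) == len(q):
        for a, b in zip(p, q):
            d = disagreement(a, b)
            if d:
                return d
        return None
    return [p, q]

def occursp(var, term):
    """True if var occurs anywhere inside term (the occurs check)."""
    if var == term:
        return True
    return isinstance(term, list) and any(occursp(var, t) for t in term)

def replaceAll(var, value, term):
    """Substitute value for every occurrence of var in term."""
    if term == var:
        return value
    if isinstance(term, list):
        return [replaceAll(var, value, t) for t in term]
    return term

With these, Unify(['loves', ['dog', 'var_x'], ['dog', 'fred']], ['loves', 'var_z', 'var_z'], []) returns the bindings shown in the first trace below.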
12
================================
unify:
['loves', ['dog', 'var_x'], ['dog', 'fred']]
['loves', 'var_z', 'var_z']
subs: [['var_z', ['dog', 'var_x']], ['var_x', 'fred']]
result: ['loves', ['dog', 'fred'], ['dog', 'fred']]
================================
unify:
['loves', ['dog', 'fred'], 'fred']
['loves', 'var_x', 'var_y']
subs: [['var_x', ['dog', 'fred']], ['var_y', 'fred']]
result: ['loves', ['dog', 'fred'], 'fred']
================================
unify:
['loves', ['dog', 'fred'], 'mary']
['loves', ['dog', 'var_x'], 'var_y']
subs: [['var_x', 'fred'], ['var_y', 'mary']]
result: ['loves', ['dog', 'fred'], 'mary']
================================
13
unify:
['loves', ['dog', 'fred'], 'mary']
['loves', ['dog', 'var_x'], 'var_y']
subs: [['var_x', 'fred'], ['var_y', 'mary']]
result: ['loves', ['dog', 'fred'], 'mary']
================================
unify:
['loves', ['dog', 'fred'], 'fred']
['loves', 'var_x', 'var_x']
failure
================================
unify:
['loves', ['dog', 'fred'], 'mary']
['loves', ['dog', 'var_x'], 'var_x']
failure
================================
unify:
['loves', 'var_x', 'fred']
['loves', ['dog', 'var_x'], 'fred']
var_x occurs in ['dog', 'var_x']
failure
14
unify:
['loves', 'var_x', ['dog', 'var_x']]
['loves', 'var_y', 'var_y']
var_y occurs in ['dog', 'var_y']
failure
================================
unify:
['loves', 'var_y', 'var_y']
['loves', 'var_x', ['dog', 'var_x']]
var_x occurs in ['dog', 'var_x']
failure
================================
unify: (fails because vars not standardized apart)
['hates', 'agatha', 'var_x']
['hates', 'var_x', ['f1', 'var_x']]
failure
================================
unify:
['hates', 'agatha', 'var_x']
['hates', 'var_y', ['f1', 'var_y']]
subs: [['var_y', 'agatha'], ['var_x', ['f1', 'agatha']]]
result: ['hates', 'agatha', ['f1', 'agatha']]
15
Most General Unifier
• The Unify algorithm returns an MGU (most general unifier)
L1 = p(X,f(Y),b)
L2 = p(X,f(b),b)
Subst1 = {a/X, b/Y}
Result1 = p(a,f(b),b)
Subst2 = {b/Y}
Result2 = p(X,f(b),b)
Subst1 is more restrictive than Subst2. In fact,
Subst2 is an MGU of L1 and L2.
16
Storage and retrieval
• Hash statements by predicate for quick
retrieval (predicate indexing), e.g., of all
sentences that unify with tall(X)
• Why attempt to unify
– tall(X) and silly(dog(Y))? They can never match.
• Instead
– Predicates[tall] = {all tall facts}
– Unify(tall(X),s) for s in Predicates[tall]
• Subsumption lattice for efficiency (see p.
329)
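A minimal sketch of predicate indexing, assuming facts are stored as lists whose first element is the predicate symbol (the class and method names are illustrative, not from the text):

from collections import defaultdict

class IndexedKB:
    def __init__(self):
        self.by_predicate = defaultdict(list)   # predicate name -> list of facts

    def store(self, fact):
        # fact[0] is the predicate symbol, e.g. 'tall'.
        self.by_predicate[fact[0]].append(fact)

    def fetch_candidates(self, query):
        # Only facts with the same predicate can possibly unify with the query.
        return self.by_predicate[query[0]]

kb = IndexedKB()
kb.store(['tall', 'abe'])
kb.store(['tall', 'homer'])
kb.store(['silly', ['dog', 'rex']])

print(kb.fetch_candidates(['tall', 'var_x']))   # only the two tall facts

Only the tall facts are handed to the unifier for the query tall(X); silly(dog(Y)) is never even considered.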
17
Forward Chaining over FO
Definite (Horn) Clauses
• Definite clauses: disjunctions with exactly
one positive literal (Horn clauses have at most one)
• First-order literals can include
variables, which are assumed to be
universally quantified
• Use GMP to perform forward chaining
(Semi-decidable as for full FOPC)
18
Def FOL-FC-Ask(KB, A) returns a substitution or false
KB: set of FO definite clauses with variables standardized apart
A: the query, an atomic sentence
repeat until new is empty:
    new ← {}
    for each implication (p1 ∧ … ∧ pn → q) in KB:
        for each T such that SUBST(T, p1 ∧ … ∧ pn) =
                SUBST(T, p1’ ∧ … ∧ pn’) for some p1’, …, pn’ in KB:
            q’ ← SUBST(T, q)
            if q’ is not a renaming of a sentence already in KB or new:
                add q’ to new
                S ← Unify(q’, A)
                if S is not fail then return S
    add new to KB
return false
Process can be made more efficient; read on your own, for interest
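As a rough Python sketch of the same idea (illustrative only; the representation, names, and bookkeeping are my own, and no efficiency tricks are attempted):

from itertools import product

def is_var(t): return isinstance(t, str) and t.startswith('var_')

def unify(p, q, theta):
    """Tiny unifier (no occurs check): returns extended bindings or None."""
    if theta is None: return None
    if p == q: return theta
    if is_var(p): return unify(theta[p], q, theta) if p in theta else {**theta, p: q}
    if is_var(q): return unify(p, theta[q], theta) if q in theta else {**theta, q: p}
    if isinstance(p, list) and isinstance(q, list) and len(p) == len(q):
        for a, b in zip(p, q):
            theta = unify(a, b, theta)
        return theta
    return None

def subst(term, theta):
    if is_var(term): return subst(theta[term], theta) if term in theta else term
    if isinstance(term, list): return [subst(t, theta) for t in term]
    return term

def fol_fc_ask(facts, rules, query):
    """Rules are (premises, conclusion) pairs; terms are lists; vars start 'var_'."""
    facts = list(facts)
    while True:
        new = []
        for premises, conclusion in rules:
            # Try every way of matching the rule's premises against known facts.
            for chosen in product(facts, repeat=len(premises)):
                theta = {}
                for prem, fact in zip(premises, chosen):
                    theta = unify(prem, fact, theta)
                if theta is None:
                    continue
                q = subst(conclusion, theta)
                if q not in facts and q not in new:
                    new.append(q)
                    answer = unify(q, query, {})
                    if answer is not None:
                        return answer
        if not new:
            return False
        facts.extend(new)

rules = [([['parent', 'var_x', 'var_y'], ['parent', 'var_y', 'var_z']],
          ['grandparent', 'var_x', 'var_z'])]
facts = [['parent', 'harry', 'james'], ['parent', 'james', 'john']]
print(fol_fc_ask(facts, rules, ['grandparent', 'harry', 'var_q']))
# {'var_q': 'john'}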
19
Backward Chaining over
Definite (Horn) Clauses
• Logic programming
• Prolog is most popular form
• Depth-first search, so space
requirements are lower, but it suffers
from problems with repeated states
20
american(X) ^ weapon(Y) ^ sells(X,Y,Z) ^ hostile(Z) → criminal(X).
owns(nono,m1). missile(m1).
missile(X1) ^ owns(nono,X1) → sells(west,X1,nono).
missile(X2) → weapon(X2).
enemy(X3,america) → hostile(X3).
american(west).
enemy(nono,america).
Goal: criminal(west).
Backward chaining proof: in lecture
In Prolog:
criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
21
Horn clauses are all of the form:
L1 ^ L2 ^ ... ^ Ln -> Ln+1
Or, equivalently, in clausal form:
~L1 v ~L2 v ... v ~Ln v Ln+1
Prolog (like databases) makes the "closed world assumption":
if P cannot be proved, infer not P
Think of the system as an arrogant know-it-all:
"If it were true, I would know it. Since I can't prove
it, it must not be true"
Thus, it uses "negation as failure".
22
neighbor(canada,us)
neighbor(mexico,us)
neighbor(pakistan,india)
?- neighbor(canada,india).
no
In full first-order logic, you would have to be able to
infer ~neighbor(canada,india) for
neighbor(canada,india) to be false.
Be careful! ~neighbor(canada,india) is not entailed by the
sentences above!
23
bachelor(X) :- male(X), \+ married(X).
male(bill).
male(jim).
married(bill).
married(mary).
An individual is a bachelor if it is male and it is
not married. \+ is the negation-as-failure operator in Prolog.
| ?- bachelor(bill).
no
| ?- bachelor(jim).
yes
| ?- bachelor(mary).
no
| ?- bachelor(X).
X = jim;
no
| ?-
24
Inference Methods
• Unification (prerequisite)
• Forward Chaining
– Production Systems
– RETE Method (OPS)
• Backward Chaining
– Logic Programming (Prolog)
• Resolution
– Transform to CNF
– Generalization of Prop. Logic resolution
25
Resolution Theorem Proving
(FOL)
• Convert everything to CNF
• Resolve, with unification
– Save bindings as you go!
• If resolution is successful, proof
succeeds
• If there was a variable in the item to
prove, return variable’s value from
unification bindings
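For illustration, here is a small Python sketch of a single binary resolution step with unification (the representation and all names are invented for this sketch): unify a literal in one clause with the negation of a literal in the other, then combine the remaining literals under the resulting bindings, which are saved and returned.

def is_var(t): return isinstance(t, str) and t.startswith('var_')

def unify(p, q, theta):
    """Tiny unifier (no occurs check): returns extended bindings or None."""
    if theta is None: return None
    if p == q: return theta
    if is_var(p): return unify(theta[p], q, theta) if p in theta else {**theta, p: q}
    if is_var(q): return unify(p, theta[q], theta) if q in theta else {**theta, q: p}
    if isinstance(p, list) and isinstance(q, list) and len(p) == len(q):
        for a, b in zip(p, q):
            theta = unify(a, b, theta)
        return theta
    return None

def subst(term, theta):
    if is_var(term): return subst(theta[term], theta) if term in theta else term
    if isinstance(term, list): return [subst(t, theta) for t in term]
    return term

def negate(lit):
    # A negative literal is ['~', atom]; a positive literal is just the atom.
    return lit[1] if lit[0] == '~' else ['~', lit]

def resolve(c1, c2):
    """All resolvents of clauses c1 and c2 (assumed standardized apart)."""
    out = []
    for l1 in c1:
        for l2 in c2:
            theta = unify(l1, negate(l2), {})
            if theta is not None:
                rest = [subst(l, theta) for l in c1 if l != l1] + \
                       [subst(l, theta) for l in c2 if l != l2]
                out.append((rest, theta))
    return out

# Resolve  ~read(X) v literate(X)  with  ~dolphin(Y) v ~literate(Y):
c1 = [['~', ['read', 'var_x']], ['literate', 'var_x']]
c2 = [['~', ['dolphin', 'var_y']], ['~', ['literate', 'var_y']]]
for resolvent, theta in resolve(c1, c2):
    print(resolvent, theta)
# [['~', ['read', 'var_y']], ['~', ['dolphin', 'var_y']]] {'var_x': 'var_y'}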
26
Converting to CNF
27
Converting sentences to CNF
1. Eliminate all ↔ connectives
(P ↔ Q) ⇒ ((P → Q) ∧ (Q → P))
2. Eliminate all → connectives
(P → Q) ⇒ (¬P ∨ Q)
3. Reduce the scope of each negation symbol to a single predicate
¬¬P ⇒ P
¬(P ∧ Q) ⇒ ¬P ∨ ¬Q
¬(P ∨ Q) ⇒ ¬P ∧ ¬Q
¬(∀x)P ⇒ (∃x)¬P
¬(∃x)P ⇒ (∀x)¬P
4. Standardize variables: rename all variables so that each quantifier
has its own unique variable name
28
Converting sentences to clausal form:
Skolem constants and functions
5. Eliminate existential quantification by introducing Skolem
constants/functions
(∃x)P(x) ⇒ P(c)
c is a Skolem constant (a brand-new constant symbol
that is not used in any other sentence)
(∀x)(∃y)P(x,y) becomes (∀x)P(x, f(x))
since ∃ is within the scope of a universally quantified
variable, use a Skolem function f to construct a new
value that depends on the universally quantified variable
f must be a brand-new function name not occurring in any
other sentence in the KB.
E.g., (∀x)(∃y)loves(x,y) becomes (∀x)loves(x,f(x))
In this case, f(x) specifies the person that x loves
E.g., ∀x1 ∀x2 ∀x3 ∃y p(… y …) becomes
∀x1 ∀x2 ∀x3 p(… ff(x1,x2,x3) …) (ff is a new name)
29
Converting sentences to clausal
form
6. Remove universal quantifiers by (1) moving them all to the
left end; (2) making the scope of each the entire sentence;
and (3) dropping the “prefix” part
Ex: (x)P(x)  P(x)
7. Put into conjunctive normal form (conjunction of
disjunctions) using distributive and associative laws
(P ∧ Q) ∨ R ⇒ (P ∨ R) ∧ (Q ∨ R)
(P ∨ Q) ∨ R ⇒ (P ∨ Q ∨ R)
8. Split conjuncts into separate clauses
9. Standardize variables so each clause contains only variable
names that do not occur in any other clause
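Referring back to steps 2 and 3 above, here is a minimal Python sketch of just those two steps, eliminating → and pushing negations inward, assuming formulas are nested lists (the representation and function names are invented for illustration):

def eliminate_implications(f):
    """Rewrite (P -> Q) as (not P or Q), recursively."""
    if not isinstance(f, list):
        return f
    if f[0] == '->':
        return ['or', ['not', eliminate_implications(f[1])],
                      eliminate_implications(f[2])]
    return [f[0]] + [eliminate_implications(x) for x in f[1:]]

def push_negations(f):
    """Move each negation inward until it applies to a single predicate."""
    if not isinstance(f, list):
        return f
    if f[0] == 'not' and isinstance(f[1], list):
        g = f[1]
        if g[0] == 'not':    return push_negations(g[1])
        if g[0] == 'and':    return ['or',  push_negations(['not', g[1]]),
                                            push_negations(['not', g[2]])]
        if g[0] == 'or':     return ['and', push_negations(['not', g[1]]),
                                            push_negations(['not', g[2]])]
        if g[0] == 'all':    return ['exists', g[1], push_negations(['not', g[2]])]
        if g[0] == 'exists': return ['all',    g[1], push_negations(['not', g[2]])]
        return ['not', push_negations(g)]      # negation of an atomic sentence
    return [f[0]] + [push_negations(x) for x in f[1:]]

# not (all x (P(x) -> Q(x))):
f = ['not', ['all', 'x', ['->', ['P', 'x'], ['Q', 'x']]]]
print(push_negations(eliminate_implications(f)))
# ['exists', 'x', ['and', ['P', 'x'], ['not', ['Q', 'x']]]]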
30
An example
(∀x)(P(x) → ((∀y)(P(y) → P(f(x,y))) ∧ ¬(∀y)(Q(x,y) → P(y))))
2. Eliminate →
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ ¬(∀y)(¬Q(x,y) ∨ P(y))))
3. Reduce scope of negation
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ (∃y)(Q(x,y) ∧ ¬P(y))))
4. Standardize variables
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ (∃z)(Q(x,z) ∧ ¬P(z))))
5. Eliminate existential quantification
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ (Q(x,g(x)) ∧ ¬P(g(x)))))
6. Drop universal quantification symbols
(¬P(x) ∨ ((¬P(y) ∨ P(f(x,y))) ∧ (Q(x,g(x)) ∧ ¬P(g(x)))))
31
An Example
7. Convert to conjunction of disjunctions
(¬P(x) ∨ ¬P(y) ∨ P(f(x,y))) ∧ (¬P(x) ∨ Q(x,g(x))) ∧
(¬P(x) ∨ ¬P(g(x)))
8. Create separate clauses
¬P(x) ∨ ¬P(y) ∨ P(f(x,y))
¬P(x) ∨ Q(x,g(x))
¬P(x) ∨ ¬P(g(x))
9. Standardize variables
¬P(x) ∨ ¬P(y) ∨ P(f(x,y))
¬P(z) ∨ Q(z,g(z))
¬P(w) ∨ ¬P(g(w))
32
1. all X (read (X) --> literate (X))
2. all X (dolphin (X) --> ~literate (X))
3. exists X (dolphin (X) ^ intelligent (X))
(a translation of “Some dolphins are intelligent”)
“Are there some who are intelligent but cannot read?”
4. exists X (intelligent(X) ^ ~read (X))
Set of clauses (1-3):
1. ~read(X) v literate(X)
2. ~dolphin(Y) v ~literate(Y)
3a. dolphin (a)
3b. intelligent (a)
Negation of 4:
~(exists Z (intelligent(Z) ^ ~read (Z)))
In Clausal form:
~intelligent(Z) v read(Z)
Resolution proof: in lecture.
33
More complicated example
Did Curiosity kill the cat
• Jack owns a dog. Every dog owner is an animal lover. No
animal lover kills an animal. Either Jack or Curiosity killed the
cat, who is named Tuna. Did Curiosity kill the cat?
• These can be represented as follows:
A. (x) (Dog(x)  Owns(Jack,x))
B. (x) (((y) Dog(y)  Owns(x, y))  AnimalLover(x))
C. (x) (AnimalLover(x)  ((y) Animal(y)  Kills(x,y)))
D. Kills(Jack,Tuna)  Kills(Curiosity,Tuna)
E. Cat(Tuna)
F. (x) (Cat(x)  Animal(x) )
G. Kills(Curiosity, Tuna)
GOAL
34
• Convert to clause form
A1. (Dog(D))
D is a Skolem constant
A2. (Owns(Jack,D))
B. (~Dog(y), ~Owns(x, y), AnimalLover(x))
C. (~AnimalLover(a), ~Animal(b), ~Kills(a,b))
D. (Kills(Jack,Tuna), Kills(Curiosity,Tuna))
E. Cat(Tuna)
F. (~Cat(z), Animal(z))
• Add the negation of the query:
~G: (~Kills(Curiosity, Tuna))
35
• The resolution refutation proof
R1: G, D, {}
(Kills(Jack, Tuna))
R2: R1, C, {a/Jack, b/Tuna}
(~AnimalLover(Jack),
~Animal(Tuna))
R3: R2, B, {x/Jack}
(~Dog(y), ~Owns(Jack, y),
~Animal(Tuna))
R4: R3, A1, {y/D}
(~Owns(Jack, D),
~Animal(Tuna))
R5: R4, A2, {}
(~Animal(Tuna))
R6: R5, F, {z/Tuna}
(~Cat(Tuna))
R7: R6, E, {}
FALSE
36
• The proof tree (same derivation as above; abbreviations: K = Kills,
AL = AnimalLover, A = Animal, D = Dog, O = Owns, C = Cat, J = Jack, T = Tuna)
~G with D, {} ⇒ R1: K(J,T)
R1 with C, {a/J, b/T} ⇒ R2: ~AL(J) ∨ ~A(T)
R2 with B, {x/J} ⇒ R3: ~D(y) ∨ ~O(J,y) ∨ ~A(T)
R3 with A1, {y/D} ⇒ R4: ~O(J,D) ∨ ~A(T)
R4 with A2, {} ⇒ R5: ~A(T)
R5 with F, {z/T} ⇒ R6: ~C(T)
R6 with E, {} ⇒ R7: FALSE
37
Decidability and Completeness
• Resolution is a refutation complete
inference procedure for First-Order Logic
– If a set of sentences contains a contradiction,
then a finite sequence of resolutions will prove
this.
– If not, resolution may loop forever (“semidecidable”)
• Here are notes by Charles Elkan that go
into this more deeply
38
Decidability and Completeness
• Refutation Completeness: If KB |= A then KB |- A
– If it’s entailed, then there’s a proof
• Semi-decidable:
– If there’s a proof, we’ll halt with it.
– If not, maybe halt, maybe not
• Logical entailment in FOL is semidecidable: if the desired conclusion
follows from the premises, then eventually
resolution refutation will find a
contradiction
39
Decidability and Completeness
• Propositional logic
– logical entailment is decidable
– There exists a complete inference
procedure
• First-Order logic
– logical entailment is semi-decidable
– Resolution procedure is refutation
complete
40
• Strategies (heuristics) for efficient
resolution include
– Unit preference. If a clause has only one
literal, use it first.
– Set of support. Identify “useful” rules and
ignore the rest. (p. 305)
– Input resolution. Intermediately generated
sentences can only be combined with original
inputs or original rules.
– Subsumption. Prune unnecessary facts from
the database.
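Referring back to the unit-preference strategy above, a minimal illustration (clause representation as in the earlier sketches; names are my own): order candidate clause pairs so that pairs containing a one-literal clause are tried first.

from itertools import combinations

def order_by_unit_preference(clauses):
    """Return candidate clause pairs, pairs involving a unit clause first."""
    pairs = list(combinations(clauses, 2))
    return sorted(pairs, key=lambda pair: min(len(pair[0]), len(pair[1])))

clauses = [
    [['~', ['read', 'var_x']], ['literate', 'var_x']],           # ~read(X) v literate(X)
    [['~', ['dolphin', 'var_y']], ['~', ['literate', 'var_y']]],  # ~dolphin(Y) v ~literate(Y)
    [['dolphin', 'a']],                                           # unit clause
    [['intelligent', 'a']],                                       # unit clause
]
for c1, c2 in order_by_unit_preference(clauses):
    print(min(len(c1), len(c2)), '-', c1, c2)
# Pairs containing a unit clause (length 1) are listed before the others.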
41
Comparing backward chaining in
Prolog with resolution
• In lecture
42