Transcript Logic6

CS 4700:
Foundations of Artificial Intelligence
Carla P. Gomes
[email protected]
Module:
FOL Inference
(Reading R&N: Chapter 8)
1
Inference
How to perform inference in First Order Logic? How to derive new
information?
“Similar” to propositional logic, but it requires new “tricks” to deal with
quantifiers and variables:
Unification: a substitution that makes atomic sentences involving variables match
Skolemization: instantiation of existential variables to remove existential
quantifiers
2
Outline
Reducing first-order inference to propositional inference
Unification
Generalized Modus Ponens
Forward chaining
Backward chaining
Resolution
3
Universal instantiation (UI)
Every instantiation of a universally quantified sentence is
entailed by it:
∀v α
---------------
Subst({v/g}, α)
for any variable v and ground term g
E.g., ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields, with {x/John}:
King(John) ∧ Greedy(John) ⇒ Evil(John)
4
Existential instantiation (EI)
For any sentence α, variable v, and constant symbol k that
does not appear elsewhere in the knowledge base:
∃v α
---------------
Subst({v/k}, α)
E.g., ∃x Crown(x) ∧ OnHead(x,John) yields:
Crown(C1) ∧ OnHead(C1,John)
provided C1 is a new constant symbol, called a Skolem
constant
5
Reduction to propositional inference
Suppose the KB contains just the following:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
Greedy(John)
Brother(Richard,John)
Instantiating the universal sentence in all possible ways, we have:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(John)
Greedy(John)
Brother(Richard,John)
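The instantiation step above can be sketched in Python. This is only an illustration under assumptions that are not in the slides: sentences are nested tuples, variables are lowercase strings and constants are capitalized strings, and the helper names (is_var, variables, subst, ground_instances) are made up for this example.

```python
from itertools import product

def is_var(t):
    # Assumed convention: variables are lowercase (x), constants capitalized (John).
    return isinstance(t, str) and t[:1].islower()

def variables(s):
    """Collect the variables of a nested-tuple sentence.
    The first element of each tuple is an operator or predicate name."""
    if is_var(s):
        return {s}
    if isinstance(s, tuple):
        return set().union(*(variables(a) for a in s[1:]))
    return set()

def subst(theta, s):
    """Apply the substitution theta (a dict) to a sentence."""
    if is_var(s):
        return theta.get(s, s)
    if isinstance(s, tuple):
        return (s[0],) + tuple(subst(theta, a) for a in s[1:])
    return s

def ground_instances(sentence, constants):
    """Universal instantiation with every combination of the given constants."""
    vs = sorted(variables(sentence))
    for values in product(constants, repeat=len(vs)):
        yield subst(dict(zip(vs, values)), sentence)

# King(x) ∧ Greedy(x) ⇒ Evil(x), with the universal quantifier left implicit
RULE = ('=>', ('and', ('King', 'x'), ('Greedy', 'x')), ('Evil', 'x'))
INSTANCES = list(ground_instances(RULE, ['John', 'Richard']))
for inst in INSTANCES:
    print(inst)
```

With only the constants John and Richard this produces the two ground rules listed on the slide. Once function symbols are allowed the set of ground terms is infinite, so a sketch like this would need a depth bound.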
6
Reduction to Propositional Inference
Every FOL KB can be propositionalized so as to preserve
entailment
(A ground sentence is entailed by new KB iff entailed by
original KB)
Idea: propositionalize KB and query, apply resolution, return
result
Often quite effective to propositionalize a theory
to take advantage of fast propositional solvers!!!
7
Reduction to Propositional Inference
But, at a more theoretical level…
Problem: with function symbols, there are infinitely many ground terms,
– e.g., Father(Father(Father(John)))
Theorem: Herbrand (1930). If a sentence α is entailed by a FOL KB, it is
entailed by a finite subset of the propositionalized KB
Problem: works if α is entailed, loops if α is not entailed
Theorem: Turing (1936), Church (1936). Entailment for FOL is
semidecidable (algorithms exist that say yes to every entailed sentence,
but no algorithm exists that also says no to every nonentailed
sentence.)
Theoretical aspects: beyond this course
8
Unification
Dealing with variables:
Unification: a substitution to match atomic sentences, one that makes two
clauses resolvable.
Unify(P,Q) takes two atomic sentences P and Q and returns a substitution
that makes P and Q look the same.
Rules for substitutions:
• Can replace a variable by a constant. (v1 → C)
• Can replace a variable by a variable. (v2 → v3)
• Can replace a variable by a function expression, as long as the function
expression does not contain the variable. (v4 → f(…))
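The three substitution rules can be turned into a small unifier. The following Python is a minimal sketch, not the textbook's algorithm: it assumes atoms and function terms are tuples such as ('Knows', 'John', 'x'), that lowercase strings are variables, and it uses hypothetical helper names (walk, occurs, unify, mgu). The occurs function implements the "does not contain the variable" condition.

```python
def is_var(t):
    # Assumed convention: lowercase strings are variables, capitalized ones constants.
    return isinstance(t, str) and t[:1].islower()

def walk(t, theta):
    """Follow variable bindings in theta until reaching an unbound variable or a term."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    """True if variable v appears inside term t (under the bindings in theta)."""
    t = walk(t, theta)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, theta) for a in t[1:])
    return False

def unify(a, b, theta):
    """Extend theta so that a and b become equal; return None on failure."""
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return None if occurs(b, a, theta) else {**theta, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

def resolve(t, theta):
    """Apply theta fully to a term."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, theta) for a in t[1:])
    return t

def mgu(a, b):
    """Return a fully applied unifier of a and b as a dict, or None if there is none."""
    theta = unify(a, b, {})
    return None if theta is None else {v: resolve(v, theta) for v in theta}

print(mgu(('Knows', 'John', 'x'), ('Knows', 'John', 'Jim')))
print(mgu(('Knows', 'John', 'x'), ('Knows', 'y', ('Mother', 'y'))))
print(mgu('x', ('F', 'x')))   # blocked by the occurs check
```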
9
Unification
Given:
Knows(John,x) ⇒ Hates(John,x)
Knows(John,Jim)
To perform resolution we need:
Unifier θ = {x/Jim}
which gives:
Hates(John,Jim)
10
Unification
Knows(John,x) ⇒ Hates(John,x)
And
Knows(John,Jim)
How to resolve them? First match them
Solution: UNIFY(Knows(John,x),Knows(John,Jim)) = {x/Jim}
Unifier θ = {x/Jim}
Gives
Knows(John,Jim) ⇒ Hates(John,Jim)
And
Knows(John,Jim)
Conclude by resolution: Hates(John,Jim)
Unification
General rule:
Knows(John,x) ⇒ Hates(John,x)
Facts:
Knows(John,Jim)
Knows(y,Leo)
Knows(y,Mother(y))
Knows(y,Jane)
Matching facts to the general rule
12
Unification:
Standardizing Variables
UNIFY(Knows(John,x),Knows(John,Jim)) = {x/Jim}
UNIFY(Knows(John,x),Knows(y,Leo)) = {x/Leo, y/John}
UNIFY(Knows(John,x),Knows(y,Mother(y))) = {y/John, x/Mother(John)}
UNIFY(Knows(John,x),Knows(x,Jane)) = fail
The last unification fails because x cannot be both John and Jane. But intuitively,
we know that John hates everyone he knows and that everyone knows Jane, so we
should be able to infer that John hates Jane.
This is why we require, if possible, that every variable has a separate name.
UNIFY(Knows(John,x),Knows(y,Jane)) = {y/John, x/Jane} works!
Standardizing apart eliminates overlap of variables, e.g., by renaming
Knows(x,Jane) to Knows(y,Jane)
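Standardizing apart amounts to renaming the variables of one sentence to names that appear nowhere else. A minimal Python sketch follows, assuming atoms are tuples with lowercase strings as variables. The function name standardize_apart and the "_n" suffix scheme are illustrative choices, not from the slides.

```python
from itertools import count

_fresh = count(1)  # global supply of suffixes, so renamed variables never repeat

def is_var(t):
    # Assumed convention: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def standardize_apart(sentence, mapping=None):
    """Return a copy of the sentence with every variable given a fresh name.
    Repeated occurrences of one variable get the same new name."""
    mapping = {} if mapping is None else mapping
    if is_var(sentence):
        if sentence not in mapping:
            mapping[sentence] = f"{sentence}_{next(_fresh)}"
        return mapping[sentence]
    if isinstance(sentence, tuple):
        return (sentence[0],) + tuple(standardize_apart(a, mapping) for a in sentence[1:])
    return sentence

renamed = standardize_apart(('Knows', 'x', 'Jane'))
print(renamed)   # the x in Knows(x,Jane) no longer clashes with the x in Knows(John,x)
```

After the renaming, Knows(John,x) and the renamed fact share no variable, so the failing unification on the slide goes through.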
13
Unification:
Most General Unifier
To unify Knows(John,x) and Knows(y,z),
θ = {y/John, x/z} or θ = {y/John, x/z, z/Mary} or θ = {y/John, x/John, z/John}
Choose the substitution that makes the least commitment
(most general) about the bindings
The first unifier is more general than the second.
MGU = {y/John, x/z}
See pages 277 and 278 for the unification algorithm, O(n²)
(n = size of the expressions being checked).
There is a single most general unifier (MGU) that is unique up to
renaming of variables.
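One way to see what "more general" means: a less general unifier can be obtained by applying a further substitution after the MGU. The Python sketch below composes {y/John, x/z} with {z/John} and recovers the third unifier from the slide. The tuple encoding and the helper names subst and compose are assumptions made for this illustration.

```python
def is_var(t):
    # Assumed convention: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def subst(theta, t):
    """Apply substitution theta once to a term or atom."""
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(theta, a) for a in t[1:])
    return t

def compose(first, second):
    """Substitution equivalent to applying `first` and then `second`."""
    out = {v: subst(second, t) for v, t in first.items()}
    for v, t in second.items():
        out.setdefault(v, t)
    return out

MGU = {'y': 'John', 'x': 'z'}
SPECIFIC = compose(MGU, {'z': 'John'})
print(SPECIFIC)

p, q = ('Knows', 'John', 'x'), ('Knows', 'y', 'z')
print(subst(MGU, p), subst(MGU, q))            # both become Knows(John,z)
print(subst(SPECIFIC, p), subst(SPECIFIC, q))  # both become Knows(John,John)
```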
14
Generalized Modus Ponens (GMP)
p1', p2', … , pn', (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
------------------------------------------
qθ
where pi'θ = pi θ for all i
Example:
p1' is King(John)      p1 is King(x)
p2' is Greedy(y)       p2 is Greedy(x)
θ is {x/John, y/John}
q is Evil(x)
qθ is Evil(John)
All variables assumed universally quantified
Definite clauses (Horn clauses with exactly one positive literal) are a suitable
normal form for use with GMP.
17
Example knowledge base
The law says that it is a crime for an American to
sell weapons to hostile nations. The country Nono,
an enemy of America, has some missiles, and all of
its missiles were sold to it by Colonel West, who is
American.
Prove that Colonel West is a criminal
19
Example knowledge base contd.
... it is a crime for an American to sell weapons to hostile nations:
American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
Nono … has some missiles, i.e., ∃x Owns(Nono,x) ∧ Missile(x):
Owns(Nono,M1) and Missile(M1)   (facts; M1 is an added Skolem constant)
… all of its missiles were sold to it by Colonel West
Missile(u) ∧ Owns(Nono,u) ⇒ Sells(West,u,Nono)
Missiles are weapons:
Missile(v) ⇒ Weapon(v)
An enemy of America counts as "hostile":
Enemy(t,America) ⇒ Hostile(t)
West, who is American …
American(West)
The country Nono, an enemy of America …
Enemy(Nono,America)
All variables assumed universally quantified; quantifiers omitted.
Forward chaining proof
First iteration:
Missile(v) ⇒ Weapon(v) with {v/M1} gives Weapon(M1)
Missile(u) ∧ Owns(Nono,u) ⇒ Sells(West,u,Nono) with {u/M1} gives Sells(West,M1,Nono)
Enemy(t,America) ⇒ Hostile(t) with {t/Nono} gives Hostile(Nono)
Second iteration:
American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
with {x/West, y/M1, z/Nono} gives Criminal(West)
Properties of forward chaining
Sound and complete for first-order Horn definite clauses
Datalog = first-order definite clauses + no functions
FC terminates for Datalog in finite number of iterations
May not terminate in general if α is not entailed
This is unavoidable: entailment with definite clauses is semidecidable
Forward chaining is widely used in deductive databases
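A naive forward chainer for the Datalog case can be sketched as follows, here run on the Colonel West knowledge base. It is an illustrative sketch only: it assumes all facts are ground, uses one-way pattern matching instead of full unification, and recomputes every match on each iteration. The names match, match_all and forward_chain are hypothetical.

```python
def is_var(t):
    # Assumed convention: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def match(pattern, fact, theta):
    """One-way match of a premise against a ground fact; returns an extended theta or None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta

def match_all(premises, facts, theta):
    """Yield every substitution that satisfies all premises against the known facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        t = match(premises[0], fact, theta)
        if t is not None:
            yield from match_all(premises[1:], facts, t)

def forward_chain(rules, facts):
    """Apply Generalized Modus Ponens repeatedly until no new fact is derived."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_all(premises, facts, {}):
                derived = (conclusion[0],) + tuple(theta.get(a, a) for a in conclusion[1:])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new

RULES = [
    ([('American', 'x'), ('Weapon', 'y'), ('Sells', 'x', 'y', 'z'), ('Hostile', 'z')],
     ('Criminal', 'x')),
    ([('Missile', 'u'), ('Owns', 'Nono', 'u')], ('Sells', 'West', 'u', 'Nono')),
    ([('Missile', 'v')], ('Weapon', 'v')),
    ([('Enemy', 't', 'America')], ('Hostile', 't')),
]
FACTS = [('Owns', 'Nono', 'M1'), ('Missile', 'M1'),
         ('American', 'West'), ('Enemy', 'Nono', 'America')]

CLOSURE = forward_chain(RULES, FACTS)
print(('Criminal', 'West') in CLOSURE)
```

Because there are no function symbols, only finitely many ground facts can be derived, which is why the loop terminates here.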
25
Improvements: Incremental forward chaining
Backward chaining example
Goal: Criminal(West)
Rule used: American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x), with {x/West}
Subgoals, solved left to right:
– American(West): a fact
– Weapon(y): by Missile(x) ⇒ Weapon(x) and the fact Missile(M1), with {y/M1}
– Sells(West,M1,z): by Missile(u) ∧ Owns(Nono,u) ⇒ Sells(West,u,Nono) and the facts
Missile(M1) and Owns(Nono,M1), with {z/Nono}
– Hostile(Nono): by Enemy(t,America) ⇒ Hostile(t) and the fact Enemy(Nono,America)
Note: once one subgoal succeeds in a conjunction, its substitution is
applied to subsequent subgoals. E.g., y is bound to M1 and z is bound to Nono.
Properties of backward chaining
Depth-first recursive proof search: space is linear in size of
proof
Incomplete due to infinite loops
– fix by checking current goal against every goal on stack
Inefficient due to repeated subgoals (both success and failure)
– fix using caching of previous results (extra space)
Widely used for logic programming
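Depth-first backward chaining can be sketched with a Python generator. This is a simplified illustration under several assumptions: the knowledge base is function-free (so the small unify below skips the occurs check), facts are written as rules with no premises, and a crude depth limit stands in for the goal-stack check mentioned above. All names are hypothetical.

```python
from itertools import count

def is_var(t):
    # Assumed convention: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def walk(t, theta):
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def unify(a, b, theta):
    """Unify two function-free atoms under theta; return an extended theta or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    theta = dict(theta)
    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s, theta), walk(t, theta)
        if s == t:
            continue
        if is_var(s):
            theta[s] = t
        elif is_var(t):
            theta[t] = s
        else:
            return None
    return theta

_ids = count()

def rename(rule):
    """Standardize a rule apart by suffixing its variables with a fresh number."""
    n = next(_ids)
    def ren(atom):
        return (atom[0],) + tuple(f"{a}_{n}" if is_var(a) else a for a in atom[1:])
    premises, conclusion = rule
    return [ren(p) for p in premises], ren(conclusion)

def back_chain(goals, rules, theta, depth=0, limit=25):
    """Yield every substitution that proves all goals (depth-first, left to right)."""
    if not goals:
        yield theta
        return
    if depth > limit:   # crude loop guard, an assumption of this sketch
        return
    for rule in rules:
        premises, conclusion = rename(rule)
        t = unify(conclusion, goals[0], theta)
        if t is not None:
            # The substitution found so far is carried into the remaining subgoals.
            yield from back_chain(premises + goals[1:], rules, t, depth + 1, limit)

KB = [
    ([('American', 'x'), ('Weapon', 'y'), ('Sells', 'x', 'y', 'z'), ('Hostile', 'z')],
     ('Criminal', 'x')),
    ([('Missile', 'u'), ('Owns', 'Nono', 'u')], ('Sells', 'West', 'u', 'Nono')),
    ([('Missile', 'v')], ('Weapon', 'v')),
    ([('Enemy', 't', 'America')], ('Hostile', 't')),
    ([], ('Owns', 'Nono', 'M1')), ([], ('Missile', 'M1')),
    ([], ('American', 'West')), ([], ('Enemy', 'Nono', 'America')),
]

ANSWERS = [walk('who', th) for th in back_chain([('Criminal', 'who')], KB, {})]
print(ANSWERS)
```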
37
Logic programming: Prolog
Algorithm = Logic + Control
Basis: backward chaining with Horn clauses + bells & whistles
Widely used in Europe, Japan (basis of 5th Generation project)
Interpreted or Compiled (intermediate language, e.g., Lisp C)
Program = set of clauses = head :- literal1, … literaln.
criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
Depth-first, left-to-right backward chaining
Built-in predicates for arithmetic etc., e.g., X is Y*Z+3
Built-in predicates that have side effects (e.g., input and output
predicates, assert/retract predicates)
Closed-world assumption ("negation as failure")
38
Prolog (Programming in Logic)
What is Prolog?
– Full-featured programming language
– Programs consist of logical formulas
– Running a program means proving a theorem
Syntax of Prolog
– Predicates, objects, and functions:
• cat(tuna), append(a,pair(b))
– Variables: X, Y, List (capitalized)
– Facts:
• university(cornell).
• prepend(a,pair(a,X)).
– Rules:
• animal(X) :- cat(X).
• student(X) :- person(X), enrolled(X,Y), university(Y).
Implication “:-” with a single predicate on the left and only non-negated predicates
on the right. All variables implicitly “forall” quantified.
– Queries:
• student(X).
All variables implicitly “exists” quantified.
39
Resolution
40
Resolution
p ∨ q
¬p ∨ r
--------
q ∨ r
Propositional logic
FOL: “Similar” to propositional logic but it requires new
“tricks” to deal with: quantifiers and variables.
41
Resolution
1 – Put in clausal form
All variables universally quantified (standardize names apart)
Main trick: “Skolemization” to remove existential quantifiers.
Idea: Invent names for unknown objects known to exist.
Two cases:
Constant – existential variable not in the scope of any universally quantified
variable → Skolem constant
Function – existential variable in the scope of universally quantified variables
→ Skolem function
2 – Use unification to match atomic sentences
3 – Apply the resolution rule to the clausal set combined with the negated goal.
Attempt to derive the empty clause.
42
Eliminate Existential Quantifiers:
Skolemization
Existential variable not in the scope of any other variable: replace it with a
Skolem constant.
Existential variable in the scope of another variable: replace it with a
Skolem function.
There is one argument for each universally quantified variable whose scope
contains the Skolem function.
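Both cases can be handled by one recursive pass that tracks the enclosing universally quantified variables. The Python below is a sketch under stated assumptions: formulas are nested tuples already in negation normal form with variables standardized apart, and the generated names Sk1, Sk2, … are placeholders (the slides use names such as C1, F and G).

```python
from itertools import count

def substitute(f, var, term):
    """Replace every occurrence of variable `var` in formula `f` by `term`."""
    if f == var:
        return term
    if isinstance(f, tuple):
        return (f[0],) + tuple(substitute(a, var, term) for a in f[1:])
    return f

def skolemize(f, universals=(), fresh=None):
    """Remove existential quantifiers from an NNF formula with distinct variable names."""
    fresh = count(1) if fresh is None else fresh
    op = f[0]
    if op == 'forall':
        _, v, body = f
        return ('forall', v, skolemize(body, universals + (v,), fresh))
    if op == 'exists':
        _, v, body = f
        name = f"Sk{next(fresh)}"
        # Skolem function of the enclosing universals, or a constant if there are none.
        term = (name,) + universals if universals else name
        return skolemize(substitute(body, v, term), universals, fresh)
    if op in ('and', 'or', 'not'):
        return (op,) + tuple(skolemize(a, universals, fresh) for a in f[1:])
    return f  # atomic sentence

# ∃x Crown(x) ∧ OnHead(x,John): no enclosing universal, so a Skolem constant
CROWN = ('exists', 'x', ('and', ('Crown', 'x'), ('OnHead', 'x', 'John')))
# ∀x ∃y Loves(y,x): y depends on x, so a Skolem function of x
LOVED = ('forall', 'x', ('exists', 'y', ('Loves', 'y', 'x')))

print(skolemize(CROWN))
print(skolemize(LOVED))
```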
43
Resolution: brief summary
The two clauses are assumed to be standardized apart so that they share no
variables.
For example,
¬Rich(x) ∨ Unhappy(x)
Rich(Ken)
----------------------
Unhappy(Ken)
with θ = {x/Ken}
Apply resolution steps to CNF(KB ∧ ¬α); refutation complete for FOL
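A single first-order resolution step can be sketched in Python as follows. The encoding is an assumption of this example: a clause is a frozenset of literals, a literal is a pair (is_positive, atom), atoms are tuples, and lowercase strings are variables. The sketch performs binary resolution only (no factoring) and expects the two clauses to be standardized apart already.

```python
def is_var(t):
    # Assumed convention: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def walk(t, theta):
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    t = walk(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t[1:])

def unify(a, b, theta):
    """Extend theta so that a and b become equal; return None on failure."""
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return None if occurs(b, a, theta) else {**theta, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

def apply(t, theta):
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, theta) for a in t[1:])
    return t

def resolvents(c1, c2):
    """All binary resolvents of two clauses that share no variables."""
    out = []
    for s1, a1 in c1:
        for s2, a2 in c2:
            if s1 != s2:                      # complementary signs
                theta = unify(a1, a2, {})
                if theta is not None:
                    rest = (c1 - {(s1, a1)}) | (c2 - {(s2, a2)})
                    out.append(frozenset((s, apply(a, theta)) for s, a in rest))
    return out

C1 = frozenset({(False, ('Rich', 'x')), (True, ('Unhappy', 'x'))})  # ¬Rich(x) ∨ Unhappy(x)
C2 = frozenset({(True, ('Rich', 'Ken'))})                            # Rich(Ken)
print(resolvents(C1, C2))
```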
44
Algorithm: Putting Axioms into Clausal Form
Eliminate biconditionals and implications.
Move the negations inwards.
Eliminate the existential quantifiers.
Rename the variables, if necessary.
Move the universal quantifiers to the left.
Move the disjunctions down to the literals.
Eliminate the conjunctions.
Rename the variables, if necessary.
Eliminate the universal quantifiers.
45
Conversion to CNF
Everyone who loves all animals is loved by someone:
∀x { [∀y Animal(y) ⇒ Loves(x,y)] ⇒ [∃y Loves(y,x)] }
1. Eliminate biconditionals and implications
∀x {¬[∀y Animal(y) ⇒ Loves(x,y)] ∨ [∃y Loves(y,x)]} (eliminate main implication)
∀x {¬[∀y (¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)]} (eliminate other implication)
2. Move ¬ inwards: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
∀x [∃y ¬(¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)]
∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)] (De Morgan's law)
Conversion to CNF contd.
3. Standardize variables: each quantifier should use a different one
∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃z Loves(z,x)]
4. Skolemize: a more general form of existential instantiation.
Each existential variable is replaced by a Skolem function of the enclosing
universally quantified variables:
∀x [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
5. Drop universal quantifiers:
[Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
6. Distribute ∨ over ∧:
[Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Loves(x,F(x)) ∨ Loves(G(x),x)]
47
Resolution:
Example
Jack owns a dog.
Every dog owner is an animal lover.
No animal lover kills an animal.
Either Jack or Curiosity killed the cat, who is named Tuna.
Did Curiosity kill the cat?
48
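Original Sentences (Plus Background Knowledge)
Jack owns a dog.
Dog(D), Owns(Jack,D)   (D is a Skolem constant)
Every dog owner is an animal lover.
¬Dog(y) ∨ ¬Owns(x,y) ∨ AnimalLover(x)
No animal lover kills an animal. (using ¬∃x p ≡ ∀x ¬p)
¬AnimalLover(w) ∨ ¬Animal(v) ∨ ¬Kills(w,v)
Either Jack or Curiosity killed the cat, who is named Tuna.
Kills(Jack,Tuna) ∨ Kills(Curiosity,Tuna)
Cat(Tuna)
Background knowledge: cats are animals.
¬Cat(z) ∨ Animal(z)
Theorem: Kills(Curiosity,Tuna)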
Proof by Resolution
Negation of theorem: ¬Kills(Curiosity,Tuna)
1. ¬Kills(Curiosity,Tuna) with Kills(Jack,Tuna) ∨ Kills(Curiosity,Tuna), θ = {}:
Kills(Jack,Tuna)
2. Kills(Jack,Tuna) with ¬AnimalLover(w) ∨ ¬Animal(v) ∨ ¬Kills(w,v), θ = {w/Jack, v/Tuna}:
¬AnimalLover(Jack) ∨ ¬Animal(Tuna)
3. With Animal(z) ∨ ¬Cat(z), θ = {z/Tuna}:
¬AnimalLover(Jack) ∨ ¬Cat(Tuna)
4. With Cat(Tuna), θ = {}:
¬AnimalLover(Jack)
5. With ¬Dog(y) ∨ ¬Owns(x,y) ∨ AnimalLover(x), θ = {x/Jack}:
¬Dog(y) ∨ ¬Owns(Jack,y)
6. With Dog(D), θ = {y/D}:
¬Owns(Jack,D)
7. With Owns(Jack,D):
NIL (the empty clause)
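The refutation can be checked mechanically on ground instances of the clauses. The Python below is a sketch, not a first-order prover: it assumes the variables have already been instantiated with Jack, Tuna and D, writes literals as strings with "~" for negation, and saturates the clause set with propositional resolution. The function names negate, resolve and refutes are made up for this example.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolve(c1, c2):
    """All non-tautological resolvents of two ground clauses (frozensets of strings)."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            r = (c1 - {lit}) | (c2 - {negate(lit)})
            if not any(negate(l) in r for l in r):
                out.append(frozenset(r))
    return out

def refutes(clauses):
    """True if the empty clause is derivable, i.e. the clause set is unsatisfiable."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True
                new.add(r)
        if new <= clauses:      # nothing new: saturated without a contradiction
            return False
        clauses |= new

KB = [frozenset(c) for c in (
    {'Dog(D)'},
    {'Owns(Jack,D)'},
    {'~Dog(D)', '~Owns(Jack,D)', 'AnimalLover(Jack)'},
    {'~AnimalLover(Jack)', '~Animal(Tuna)', '~Kills(Jack,Tuna)'},
    {'Kills(Jack,Tuna)', 'Kills(Curiosity,Tuna)'},
    {'Cat(Tuna)'},
    {'~Cat(Tuna)', 'Animal(Tuna)'},
)]
NEGATED_GOAL = frozenset({'~Kills(Curiosity,Tuna)'})

print(refutes(KB + [NEGATED_GOAL]))
print(refutes(KB))
```

Adding the negated theorem makes the set unsatisfiable, while the knowledge base on its own is consistent.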
Resolution proof: definite clauses
51
Resolution is Refutation Complete
Resolution with unification, applied to clausal form, is refutation complete!
Interesting proof! Based on building an “artificial” domain of
interpretation called the Herbrand universe.
See R&N pages 300-303.
52
Proofs can be Lengthy
A relatively straightforward KB can quickly overwhelm general resolution
methods.
Theorem provers are in general based on resolution strategies that can
reduce the problem somewhat, but not completely.
As a consequence, many practical Knowledge Representation formalisms in
AI use a restricted form and specialized inference.
– Logic programming (Prolog)
– Datalog – definite clauses, no functions
– Production or expert systems (rule based systems)
– Frame systems and semantic networks (chapter 10)
– Description logics (chapter 10)
53
Successes in Rule-Based Reasoning
Expert systems
DENDRAL (Buchanan et al., 1969)
MYCIN (Feigenbaum, Buchanan, Shortliffe)
PROSPECTOR (Duda et al., 1979)
R1 (McDermott, 1982)
54
Successes in Rule-Based Reasoning
DENDRAL (Buchanan et al., 1969)
– Infers molecular structure from the information provided by a mass
spectrometer
– Generate-and-test method
55
Successes in Rule-Based Reasoning
MYCIN (Feigenbaum, Buchanan, Shortliffe)
– Diagnosis of blood infections
– 450 rules; performs as well as experts
– Incorporated certainty factors
56
Successes in Rule-Based Reasoning
PROSPECTOR (Duda et al., 1979)
– Correctly recommended exploratory drilling at geological site
– Rule-based system founded on probability theory
R1 (McDermott, 1982)
– Designs configurations of computer components
– About 10,000 rules
– Uses meta-rules to change context
57
Prominent expert systems
• CADUCEUS: blood-borne infectious bacteria
• Dendral: analysis of mass spectra
• Jess: Java Expert System Shell. A CLIPS engine implemented in Java,
used in the development of expert systems
• LogicNets: web-based expert system modeling environment to create
expert systems (in collaboration with NASA)
• Mycin: diagnoses infectious blood diseases and recommends antibiotics
(by Stanford University)
• NEXPERT Object: an early general-purpose commercial backward-chaining
inference engine used in the development of expert systems
• Prolog: programming language used in the development of expert
systems
• R1/XCon: order processing
• STD Wizard: expert system for recommending medical screening tests
• PyKe: a knowledge-based inference engine (expert system)