Introduction to Artificial Intelligence –
Unit 5
Inference in first-order logic
Course 67842
The Hebrew University of Jerusalem
School of Engineering and Computer Science
Academic Year: 2008/2009
Instructor: Jeff Rosenschein
(Chapter 9, “Artificial Intelligence: A Modern
Approach”)
Outline
Reducing first-order inference to
propositional inference
Unification
Generalized Modus Ponens
Forward chaining
Backward chaining
Resolution
2
Resolution (Review)
Conjunctive Normal Form (CNF)
conjunction of disjunctions of literals
clauses
E.g., (A ∨ ¬B) ∧ (B ∨ ¬C ∨ ¬D)
Resolution inference rule (for CNF):
    l1 ∨ … ∨ lk,      m1 ∨ … ∨ mn
    l1 ∨ … ∨ li−1 ∨ li+1 ∨ … ∨ lk ∨ m1 ∨ … ∨ mj−1 ∨ mj+1 ∨ … ∨ mn
where li and mj are complementary literals.
E.g., P1,3 ∨ P2,2,      ¬P2,2
      P1,3
Resolution is sound and complete
for propositional logic
3
Resolution (Review)
“Canada is in North America, or Canada is
in the EU”
“Canada is not in North America, or
Canada is in the British Commonwealth”
From this we can conclude:
“Canada is in the EU, or Canada is in the
British Commonwealth”
4
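To make the rule concrete, here is a small Python sketch (an illustration, not part of the original slides): a clause is a frozenset of literals, a literal is a string, and a leading "~" marks negation. Later sketches reuse these helpers.

def negate(lit):
    """Return the complementary literal."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Yield every resolvent of two propositional clauses."""
    for lit in c1:
        if negate(lit) in c2:
            yield frozenset((c1 - {lit}) | (c2 - {negate(lit)}))

# The Canada example from this slide:
c1 = frozenset({"InNorthAmerica", "InEU"})
c2 = frozenset({"~InNorthAmerica", "InCommonwealth"})
print(list(resolve(c1, c2)))   # [frozenset({'InEU', 'InCommonwealth'})]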
Universal instantiation (UI)
Every instantiation of a universally quantified sentence is
entailed by it:
∀v α
Subst({v/g}, α)
for any variable v and ground term g
E.g., ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))
…
5
Existential instantiation (EI)
For any sentence α, variable v, and constant
symbol k that does not appear elsewhere in
the knowledge base:
∃v α
Subst({v/k}, α)
E.g., ∃x Crown(x) ∧ OnHead(x,John) yields:
Crown(C1) ∧ OnHead(C1,John)
provided C1 is a new constant symbol,
called a Skolem constant
6
Reduction to propositional
inference
Suppose the KB contains just the following:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
Greedy(John)
Brother(Richard,John)
Instantiating the universal sentence in all possible ways,
we have:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(John)
Greedy(John)
Brother(Richard,John)
The new KB is propositionalized: proposition symbols are
King(John), Greedy(John), Evil(John), King(Richard), etc.
7
Reduction contd.
Every FOL KB can be propositionalized so as to
preserve entailment
(A ground sentence is entailed by the new KB iff it is entailed by the original KB)
Idea: propositionalize KB and query, apply
resolution, return result
Problem: with function symbols, there are
infinitely many ground terms,
e.g., Father(Father(Father(John)))
8
Reduction contd.
Theorem: Herbrand (1930). If a sentence α is entailed
by an FOL KB, it is entailed by a finite subset of the
propositionalized KB
Idea: For n = 0 to ∞ do
create a propositional KB by instantiating with depth-n
terms
see if α is entailed by this KB
Problem: works if α is entailed, does not terminate if α
is not entailed
Theorem: Turing (1936), Church (1936). Entailment
for FOL is semidecidable (algorithms exist that say
yes to every entailed sentence, but no algorithm
exists that also says no to every non-entailed
sentence)
9
Problems with propositionalization
Propositionalization seems to generate lots of
irrelevant sentences
E.g., from:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
∀y Greedy(y)
Brother(Richard,John)
It seems obvious that Evil(John), but
propositionalization produces lots of facts such
as Greedy(Richard) that are irrelevant
With p k-ary predicates and n constants, there are p·n^k instantiations
10
Unification
We can get the inference immediately if we can find a substitution θ such that King(x) and Greedy(x) match King(John) and Greedy(y)
θ = {x/John, y/John} works
Unify(α,β) = θ if αθ = βθ
p                   q                        θ
Knows(John,x)       Knows(John,Jane)         {x/Jane}
Knows(John,x)       Knows(y,OJ)              {x/OJ, y/John}
Knows(John,x)       Knows(y,Mother(y))       {y/John, x/Mother(John)}
Knows(John,x)       Knows(x,OJ)              fail
Standardizing apart eliminates overlap of variables, e.g., renaming the last q to Knows(z17,OJ)
11–15
Unification
To unify Knows(John,x) and Knows(y,z):
θ = {y/John, x/z} or θ = {y/John, x/John, z/John}
The first unifier is more general than the
second
There is a single most general unifier
(MGU) that is unique up to renaming of
variables
MGU = { y/John, x/z }
16
The unification algorithm
17
The unification algorithm
18
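The algorithm figures on these two slides did not transcribe. As a stand-in, here is a minimal Python sketch of unification with the occurs check, under assumed conventions (not the slides' own code): lower-case strings are variables, capitalized strings are constants, and compound terms are tuples such as ("Knows", "John", "x"). Later sketches reuse is_var, subst, and unify.

def is_var(t):
    """Variables are lower-case strings (an assumed convention)."""
    return isinstance(t, str) and t[0].islower()

def subst(theta, t):
    """Apply substitution theta to term t, chasing chained bindings."""
    if is_var(t):
        return subst(theta, theta[t]) if t in theta else t
    if isinstance(t, tuple):
        return tuple(subst(theta, a) for a in t)
    return t

def occurs(v, t, theta):
    """Occurs check: does variable v appear inside term t under theta?"""
    t = subst(theta, t)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t)

def unify(x, y, theta=None):
    """Return an MGU extending theta, or None on failure."""
    if theta is None:
        theta = {}
    x, y = subst(theta, x), subst(theta, y)
    if x == y:
        return theta
    if is_var(x):
        return None if occurs(x, y, theta) else {**theta, x: y}
    if is_var(y):
        return unify(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

# Rows of the table on the previous slide:
print(unify(("Knows", "John", "x"), ("Knows", "y", ("Mother", "y"))))
# {'y': 'John', 'x': ('Mother', 'John')}
print(unify(("Knows", "John", "x"), ("Knows", "x", "OJ")))   # None (fail)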
Unification (again)
Unification is the process of determining
whether 2 expressions can be made identical
by appropriate substitutions for their variables
Example of a substitution:
{x/A, y/F(B), z/w}
Properties of a substitution:
1.Each variable is associated with at most one
expression;
2.No variable with an associated expression occurs
within any associated expression (that is, no “left
side” appears on a “right side”)
19
Not a Substitution
{x/A, y/F(x), z/y} is NOT a substitution (x and y each appear on both a left side and a right side)
Applying a substitution s to a predicate calculus expression φ produces a new expression φs, called a “substitution instance”:
Ex: P(x,x,y,v){x/A, y/F(B), z/w} = P(A,A,F(B),v)
20
Unifiable
A set of expressions is unifiable if there is a substitution s that makes them identical.
Ex: {x/A, y/B, z/C} unifies the expressions P(A, y, z) and P(x, B, z): both become P(A, B, C)
21
General Unifiers
If there is one unifier for a set of
expressions, there are usually many — but
some are more general than others:
s is more general than t iff there is a d such
that sd = t
Ex: {z / F(w)} is more general than {z / F(C)}
since:
{z / F(w)}{w / C} = {z / F(C)}
22
Most General Unifier (MGU)
There exists a “most general unifier” that is
unique up to a variable renaming:
Ex: {x/A} is the mgu of P(A, y, z) and P(x, y, z)
We’ll be interested in finding the mgu of literals
A recursive procedure for finding an mgu follows (it assumes that an expression is either a constant, a variable, or a structured object; Ex: F(A,G(y)) is an object of length 2, with F as its zero-th part)
23
Recursive Procedure Mgu(x,y)
Begin
  if x = y then Return({ }),
  if Variable(x) then Return(Mguvar(x,y)),
  if Variable(y) then Return(Mguvar(y,x)),
  if Constant(x) or Constant(y) then Return(False),
  if Not(Length(x) = Length(y)) then Return(False),
  Begin
    i gets 0, g gets { },
    Tag: if i = Length(x) then Return(g),
      s gets Mgu(Part(x,i), Part(y,i)),
      if s = False then Return(False),
      g gets Compose(g,s),
      x gets Substitute(x,g),
      y gets Substitute(y,g),
      i gets i + 1, Goto Tag
  End
End
Recursive Procedure Mguvar(x,y)
Begin
  if Includes(x,y) then Return(False),
  Return({x/y})
End
24
Example of Finding MGU
Ex: Find the unifier for P(F(x, G(A, y)), G(A,y)) and P(F(x,z),z).
1. Mgu is called on P and P, returns { }
2. Mgu is called recursively on F(x, G(A,y)) and F(x, z)
3. Mgu is called on F and F, returns { }
4. Mgu is called on x and x, returns { }
5. Mgu is called on G(A,y) and z; since z is a variable, it returns
(via Mguvar) {z / G(A,y)} (after checking that z is not in G(A,y) )
6. {z/G(A,y)} is composed with the previous (empty) substitution
7. The entire substitution is applied to both expressions, yielding
F(x, G(A,y))
8. Since i = 3, Mgu returns {z/G(A,y)}, which is then applied to the
top level expressions. All other checks show equality.
25
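As a check, the unify sketch from the algorithm slides reproduces this trace (same assumed term conventions):

p1 = ("P", ("F", "x", ("G", "A", "y")), ("G", "A", "y"))
p2 = ("P", ("F", "x", "z"), "z")
print(unify(p1, p2))   # {'z': ('G', 'A', 'y')}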
Generalized Modus Ponens
(GMP)
p1′, p2′, …, pn′, (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
qθ
where pi′θ = piθ for all i
p1′ is King(John)          p1 is King(x)
p2′ is Greedy(y)           p2 is Greedy(x)
θ is {x/John, y/John}
q is Evil(x)
qθ is Evil(John)
GMP used with KB of definite clauses (exactly one
positive literal)
All variables assumed universally quantified
26
Soundness of GMP
Need to show that
p1′, …, pn′, (p1 ∧ … ∧ pn ⇒ q) ╞ qθ
provided that pi′θ = piθ for all i
Lemma: For any sentence p, we have p ╞ pθ by UI
1. (p1 ∧ … ∧ pn ⇒ q) ╞ (p1 ∧ … ∧ pn ⇒ q)θ = (p1θ ∧ … ∧ pnθ ⇒ qθ)
2. p1′, …, pn′ ╞ p1′ ∧ … ∧ pn′ ╞ p1′θ ∧ … ∧ pn′θ
3. Since pi′θ = piθ, 2 gives exactly the antecedent of 1, so qθ follows by ordinary Modus Ponens
27
Example knowledge base
The law says that it is a crime for an
American to sell weapons to hostile
nations. The country Nono, an enemy of
America, has some missiles, and all of its
missiles were sold to it by Colonel West,
who is American
Prove that Colonel West is a criminal
28
Example knowledge base
contd.
... it is a crime for an American to sell weapons to hostile nations:
American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
Nono … has some missiles, i.e., ∃x Owns(Nono,x) ∧ Missile(x):
Owns(Nono,M1) and Missile(M1)
… all of its missiles were sold to it by Colonel West:
Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West,x,Nono)
Missiles are weapons:
Missile(x) ⇒ Weapon(x)
An enemy of America counts as “hostile”:
Enemy(x,America) ⇒ Hostile(x)
West, who is American …
American(West)
The country Nono, an enemy of America …
Enemy(Nono,America)
29
Forward chaining algorithm
30
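The algorithm figure did not transcribe. Below is a minimal forward-chaining sketch over the crime KB from slide 29, reusing the unify/subst helpers from the unification sketch; the rule/fact encoding is an assumption for illustration, not the book's FOL-FC-ASK. Its naive premise matching enumerates tuples of facts, which is exactly why matching can be expensive (see the "hard matching" slide later).

from itertools import product

rules = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "x"), ("Owns", "Nono", "x")], ("Sells", "West", "x", "Nono")),
    ([("Missile", "x")], ("Weapon", "x")),
    ([("Enemy", "x", "America")], ("Hostile", "x")),
]
facts = {("Owns", "Nono", "M1"), ("Missile", "M1"),
         ("American", "West"), ("Enemy", "Nono", "America")}

def forward_chain(rules, facts):
    """Repeatedly apply GMP until no new facts are added."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Try every way of matching the premises against known facts.
            for combo in product(facts, repeat=len(premises)):
                theta = {}
                for prem, fact in zip(premises, combo):
                    theta = unify(prem, fact, theta)
                    if theta is None:
                        break
                if theta is not None:
                    new = subst(theta, conclusion)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts

print(("Criminal", "West") in forward_chain(rules, facts))   # True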
Forward chaining proof
31
Forward chaining proof
32
Forward chaining proof
33
Properties of forward chaining
Sound and complete for first-order definite
clauses
Datalog = first-order definite clauses + no
functions
FC terminates for Datalog in finite number of
iterations
May not terminate in general (i.e., non-Datalog)
if α is not entailed
This is unavoidable: entailment with definite
clauses is semidecidable
34
Efficiency of forward chaining
Incremental forward chaining: no need to match a
rule on iteration k if a premise wasn’t added on
iteration k-1
match each rule whose premise contains a newly
added positive literal
Matching itself can be expensive:
Database indexing allows O(1) retrieval of known
facts
e.g., query Missile(x) retrieves Missile(M1)
Forward chaining is widely used in deductive
databases
35
Hard matching example
Diff(wa,nt) Diff(wa,sa) Diff(nt,q)
Diff(nt,sa) Diff(q,nsw) Diff(q,sa)
Diff(nsw,v) Diff(nsw,sa) Diff(v,sa)
Colorable()
Diff(Red,Blue) Diff (Red,Green)
Diff(Green,Red) Diff(Green,Blue)
Diff(Blue,Red) Diff(Blue,Green)
Colorable() is inferred iff the CSP has a solution
CSPs include 3SAT as a special case, hence
matching is NP-hard
36
Backward chaining algorithm
SUBST(COMPOSE(θ1, θ2), p) =
SUBST(θ2, SUBST(θ1, p))
37
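A quick sketch of COMPOSE for the dictionary-based substitutions used in the earlier unify sketch (assumed representation, not the book's definition); it satisfies the identity above:

def compose(t1, t2):
    """COMPOSE(t1, t2): applying the result equals applying t1, then t2."""
    out = {v: subst(t2, t) for v, t in t1.items()}
    out.update({v: t for v, t in t2.items() if v not in t1})
    return out

t1, t2 = {"x": ("F", "y")}, {"y": "A"}
p = ("P", "x", "y")
assert subst(compose(t1, t2), p) == subst(t2, subst(t1, p))
# both give ('P', ('F', 'A'), 'A')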
Backward chaining example
38
Backward chaining example
39
Backward chaining example
40
Backward chaining example
41
Backward chaining example
42
Backward chaining example
43
Backward chaining example
44
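The backward-chaining figures did not transcribe. Here is a minimal depth-first FOL-BC sketch over the same crime KB, reusing unify/subst/is_var and the rules/facts encoding from the forward-chaining sketch (all assumptions for illustration); rename implements standardizing apart.

from itertools import count
fresh = count()

def rename(term, tag):
    """Standardize a rule apart by tagging its variables."""
    if is_var(term):
        return f"{term}_{tag}"
    if isinstance(term, tuple):
        return tuple(rename(a, tag) for a in term)
    return term

def back_chain(goal, theta):
    """Yield every substitution that proves `goal` (depth-first)."""
    goal = subst(theta, goal)
    for fact in facts:
        t = unify(goal, fact, dict(theta))
        if t is not None:
            yield t
    for premises, conclusion in rules:
        tag = next(fresh)
        t = unify(goal, rename(conclusion, tag), dict(theta))
        if t is not None:
            yield from prove_all([rename(p, tag) for p in premises], t)

def prove_all(goals, theta):
    """Prove a conjunction of goals left to right."""
    if not goals:
        yield theta
    else:
        for t in back_chain(goals[0], theta):
            yield from prove_all(goals[1:], t)

theta = next(back_chain(("Criminal", "v"), {}), None)
print(theta is not None and subst(theta, "v"))   # 'West'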
Properties of backward chaining
Depth-first recursive proof search: space is
linear in size of proof
Incomplete due to infinite loops
fix by checking current goal against every goal on
stack
Inefficient due to repeated subgoals (both
success and failure)
fix using caching of previous results (extra space)
Widely used for logic programming
45
Logic programming: Prolog
Algorithm = Logic + Control
Basis: backward chaining with Horn clauses + bells & whistles
Widely used in Europe, Japan (basis of 5th Generation project)
Compilation techniques: 60 million LIPS
Program = set of clauses = head :- literal1, …, literaln.
criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
Depth-first, left-to-right backward chaining
Built-in predicates for arithmetic etc., e.g., X is Y*Z+3
Built-in predicates that have side effects (e.g., input and output
predicates, assert/retract predicates)
Closed-world assumption (“negation as failure”)
e.g., given alive(X) :- not dead(X).
alive(joe) succeeds if dead(joe) fails
46
Prolog
Appending two lists to produce a third:
append([],Y,Y).
append([X|L],Y,[X|Z]) :- append(L,Y,Z).
query: append(A,B,[1,2]) ?
answers: A=[]    B=[1,2]
         A=[1]   B=[2]
         A=[1,2] B=[]
47
Resolution: brief summary
Full first-order version:
    l1 ∨ ··· ∨ lk,      m1 ∨ ··· ∨ mn
    (l1 ∨ ··· ∨ li−1 ∨ li+1 ∨ ··· ∨ lk ∨ m1 ∨ ··· ∨ mj−1 ∨ mj+1 ∨ ··· ∨ mn)θ
where Unify(li, ¬mj) = θ.
The two clauses are assumed to be standardized apart so that they share no variables.
For example,
¬Rich(x) ∨ Unhappy(x),      Rich(Ken)
Unhappy(Ken)
with θ = {x/Ken}
Apply resolution steps to CNF(KB ∧ ¬α); complete for FOL
48
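A sketch of the binary first-order resolution step, built on the earlier unify/subst helpers (the clause and literal encodings are assumptions for illustration): a clause is a list of literals, and a negative literal is wrapped as ("~", atom).

def sign_atom(lit):
    """Split a literal into (is_negative, atom)."""
    return (True, lit[1]) if lit[0] == "~" else (False, lit)

def resolve_fo(c1, c2):
    """Yield every binary resolvent of two standardized-apart clauses."""
    for i, l1 in enumerate(c1):
        for j, l2 in enumerate(c2):
            n1, a1 = sign_atom(l1)
            n2, a2 = sign_atom(l2)
            if n1 != n2:
                theta = unify(a1, a2)
                if theta is not None:
                    rest = c1[:i] + c1[i+1:] + c2[:j] + c2[j+1:]
                    yield [subst(theta, l) for l in rest]

# The Rich/Unhappy example from this slide:
c1 = [("~", ("Rich", "x")), ("Unhappy", "x")]
c2 = [("Rich", "Ken")]
print(list(resolve_fo(c1, c2)))   # [[('Unhappy', 'Ken')]]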
Conversion to CNF
Everyone who loves all animals is loved by someone:
∀x [∀y Animal(y) ⇒ Loves(x,y)] ⇒ [∃y Loves(y,x)]
1. Eliminate biconditionals and implications
∀x [¬∀y ¬Animal(y) ∨ Loves(x,y)] ∨ [∃y Loves(y,x)]
2. Move ¬ inwards: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
∀x [∃y ¬(¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)]
∀x [∃y ¬¬Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]
∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]
49
Conversion to CNF contd.
3. Standardize variables: each quantifier should use a different one
∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃z Loves(z,x)]
4. Skolemize: a more general form of existential instantiation.
Each existential variable is replaced by a Skolem function of the enclosing universally quantified variables:
∀x [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
5. Drop universal quantifiers:
[Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
6. Distribute ∨ over ∧:
[Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Loves(x,F(x)) ∨ Loves(G(x),x)]
50
Clause Form
We want a uniform representation for our
logical propositions, so that we can use
a simple uniform proof procedure.
Clausal form expresses all logical
propositions using literals (atomic
sentences or negated atomic sentences)
and clauses (sets of literals representing
their disjunction).
51
Any set of sentences in predicate calculus
can be converted to clausal form.
Our 8-step procedure for conversion:
1. Replace implications
2. Distribute negations
3. Standardize variables
4. Replace existentials
5. Remove universals
6. Distribute disjunctions
7. Replace operators
8. Rename variables
52
1. Replace implications:
Any sentence of the form φ ⇒ ψ becomes ¬φ ∨ ψ (at all levels)
2. Distribute negations:
Repeatedly push ¬ inwards:
¬¬φ becomes φ
¬(φ ∧ ψ) becomes ¬φ ∨ ¬ψ
¬(φ ∨ ψ) becomes ¬φ ∧ ¬ψ
¬∀x φ becomes ∃x ¬φ
¬∃x φ becomes ∀x ¬φ
53
3. Standardize variables:
Rename variables to ensure that each quantifier has its own unique variable.
Ex: (∃x P(x)) ∨ (∃x Q(x)) becomes (∃x P(x)) ∨ (∃y Q(y))
4. Replace existentials:
Consider ∀y ∃x P(x,y).
The identity of the x that makes this true depends on the value of y — so we can replace x by a function of y:
∀y (P(G(y), y))
G is called a Skolem function
54
Skolemization
Replace each occurrence of an
existentially quantified variable by
a Skolem function.
The Skolem function’s arguments
are the universally quantified
variables that are bound by
universal quantifiers whose scope
includes the scope of the
existential quantifier being
eliminated.
55
Skolemization
(continued):
The Skolem functions must be
new, i.e., not already present in
any of our formulas.
If the existential quantifier is not
within a universal quantifier, use a
Skolem function of no arguments,
i.e., a constant.
56
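A minimal Skolemization sketch under assumed conventions (not the slides' code): formulas are tuples like ("forall", "x", body) or ("exists", "y", body). It implements the rule just stated, replacing each existential variable by a Skolem function of the enclosing universal variables (a fresh constant when there are none).

from itertools import count
sk = count()

def replace(f, var, term):
    """Substitute `term` for every occurrence of `var` in formula f."""
    if f == var:
        return term
    if isinstance(f, tuple):
        return tuple(replace(a, var, term) for a in f)
    return f

def skolemize(f, universals=()):
    if isinstance(f, tuple) and f[0] == "forall":
        return ("forall", f[1], skolemize(f[2], universals + (f[1],)))
    if isinstance(f, tuple) and f[0] == "exists":
        # New Skolem function of the enclosing universal variables.
        fn = f"Sk{next(sk)}"
        term = (fn,) + tuple(universals) if universals else fn
        return skolemize(replace(f[2], f[1], term), universals)
    if isinstance(f, tuple):
        return tuple(skolemize(a, universals) for a in f)
    return f

print(skolemize(("forall", "y", ("exists", "x", ("P", "x", "y")))))
# ('forall', 'y', ('P', ('Sk0', 'y'), 'y'))  -- i.e., the ∀y P(G(y),y) example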
Examples of Skolemization: (formulas not transcribed)
5. Remove universals:
Throw away ∀’s and assume all variables are universally quantified.
57
6. Distribute disjunctions:
Write the formula in conjunctive normal form (a conjunction of disjunctions), using repeatedly the rule:
φ ∨ (ψ ∧ σ) becomes (φ ∨ ψ) ∧ (φ ∨ σ)
58
7. Replace operators:
Replace the conjunction φ1 ∧ … ∧ φn with the set of clauses φ1, …, φn.
Convert each φi into a set of literals (that is, get rid of the disjunctions):
Ex: (P(x) ∨ Q(y)) ∧ (P(x) ∨ ¬R(z))
becomes a set of sets of literals:
{ {P(x), Q(y)},      ← a clause
  {P(x), ¬R(z)} }    ← a clause
59
8. Rename variables:
Change variable names so that no variable symbol appears in more than one clause.
We can do this because:
∀x (P(x) ∧ Q(x)) ≡ ∀x P(x) ∧ ∀y Q(y)
60
Let’s look at an example of
this whole conversion
process into clauses:
61
6. (¬P(x) ∨ ¬P(y) ∨ P(F(x,y)))
   ∧ (¬P(x) ∨ Q(x,G(x)))
   ∧ (¬P(x) ∨ ¬P(G(x)))
7. { {¬P(x), ¬P(y), P(F(x,y))},
     {¬P(x), Q(x,G(x))},
     {¬P(x), ¬P(G(x))} }
8. { {¬P(x1), ¬P(y1), P(F(x1,y1))},
     {¬P(x2), Q(x2,G(x2))},
     {¬P(x3), ¬P(G(x3))} }
62
Resolution proof: definite clauses
63
Restriction strategies for resolution
Various kinds of nonmonotonic
reasoning
64
Resolution can be very inefficient…
1. {P, Q}     Δ
2. {¬P, R}    Δ
3. {¬Q, R}    Δ
4. {¬R}       Γ
(Δ = database clause, Γ = negated-goal clause)
5. {Q, R}  1,2     6. {P, R}  1,3     7. {¬P}  2,4     8. {¬Q}  3,4
9. {R}  3,5     10. {Q}  4,5     11. {R}  2,6     12. {P}  4,6
13. {Q}  1,7    14. {R}  6,7     15. {P}  1,8     16. {R}  5,8
17. { }  4,9    18. {R}  3,10    19. { }  8,10    20. { }  4,11
21. {R}  2,12   22. { }  7,12    23. {R}  3,13    24. { }  8,13
25. { }  4,14   26. {R}  2,15    27. { }  7,15    28. { }  4,16
29. { }  4,18   30. { }  4,21    31. { }  4,23    32. { }  4,26
So we want techniques to help us increase efficiency…
65
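To see the blowup concretely, this snippet saturates the four clauses above under unrestricted resolution, reusing the propositional resolve sketch from earlier:

clauses = [frozenset({"P", "Q"}), frozenset({"~P", "R"}),
           frozenset({"~Q", "R"}), frozenset({"~R"})]
derived, frontier = set(clauses), list(clauses)
while frontier:
    c1 = frontier.pop()
    for c2 in list(derived):
        for r in resolve(c1, c2):
            if r not in derived:
                derived.add(r)
                frontier.append(r)
# Prints the number of distinct clauses generated, and True
# (the empty clause is among them):
print(len(derived), frozenset() in derived)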
Deletion Strategies:
eliminate certain clauses
before they are ever used
1. Pure literal elimination: Remove any clause containing a “pure literal”—a literal that has no complementary instance in the database:
{¬P, ¬Q, R}
{¬P, S}    ✗
{¬Q, S}    ✗
{P}
{Q}
{¬R}
“S” is a pure literal, so these 2 clauses can be “thrown away.”
For the purposes of refutation proofs, those “S” clauses won’t help.
66
2. Tautology elimination
Eliminate clauses that contain complementary literals—tautologies:
{ P(G(A)), ¬P(G(A)) }
and
{ P(x), Q(y), ¬Q(y), R(z) }
3. Subsumption elimination
A clause Φ subsumes a clause Ψ iff there is a substitution s such that Φs ⊆ Ψ
{ P(x), Q(y) } subsumes
{ P(A), Q(v), R(w) }
since the substitution {x/A, y/v} makes the first a
subset of the second.
So we can “throw away” the second.
67
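A sketch of the subsumption test, reusing subst/is_var from the unification sketch. Here match is one-way unification: only the subsuming clause's variables are bound, and the other clause's variables are treated as constants (as subsumption requires after standardizing apart).

def match(pat, t, theta):
    """Bind variables of `pat` only; `t` is treated as ground."""
    pat = subst(theta, pat)
    if pat == t:
        return theta
    if is_var(pat):
        return {**theta, pat: t}
    if isinstance(pat, tuple) and isinstance(t, tuple) and len(pat) == len(t):
        for a, b in zip(pat, t):
            theta = match(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

def subsumes(f, y, theta=None):
    """Does clause f subsume clause y?"""
    if theta is None:
        theta = {}
    if not f:
        return True
    return any(
        (t := match(f[0], lit, theta)) is not None and subsumes(f[1:], y, t)
        for lit in y
    )

print(subsumes([("P", "x"), ("Q", "y")],
               [("P", "A"), ("Q", "v"), ("R", "w")]))   # True, via {x/A, y/v}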
Unit Resolution
Def: At least one of the clauses being resolved at every step is a “unit clause,” i.e., one containing a single literal.
Ex:
1. {P, Q}     Δ
2. {¬P, R}    Δ
3. {¬Q, R}    Δ
4. {¬R}       Γ
5. {¬P}    2,4
6. {¬Q}    3,4
7. {Q}     1,5
8. {P}     1,6
9. {R}     3,7
10. { }    6,7
11. {R}    2,8
12. { }    5,8
68
Refutation Completeness?
Unit resolution is generally not refutation complete:
Ex: {P, Q}, {¬P, Q}, {P, ¬Q}, {¬P, ¬Q}
We can’t perform unit resolution (there is no unit clause).
A Horn clause is a clause with at
most one positive literal.
There is a unit resolution refutation
of a set of Horn clauses if and
only if that set is unsatisfiable.
69
Linear Resolution
Def: At least one of the clauses being resolved at every step is either in the initial database or is an ancestor of the other clause.
Ex: {P, Q}, {¬P, Q}, {P, ¬Q}, {¬P, ¬Q}
A linear derivation: {Q} (from {P,Q} and {¬P,Q}), then {P} (with {P,¬Q}), then {¬Q} (with {¬P,¬Q}), then { } (with the ancestor {Q}).
Linear resolution is refutation complete.
70
Input Resolution
Def: At least one of the clauses being resolved at every step is a member of the initial (i.e., input) database.
Surprisingly, “input” = “unit” in power, in that there is a unit refutation whenever there is an input refutation, and vice versa.
So… input refutation is complete for Horn clauses but incomplete in general.
71
Set of Support Resolution
Def: A subset Γ of a set Δ is called a “set of support” for Δ iff Δ − Γ is satisfiable.
Set of support resolution: at least one of the clauses being resolved at every step is selected from a set of support Γ.
72
Ex:
1. {P, Q}     Δ
2. {¬P, R}    Δ
3. {¬Q, R}    Δ
4. {¬R}       Γ
5. {¬P}    2,4
6. {¬Q}    3,4
7. {Q}     1,5
8. {P}     1,6
9. {R}     3,7
10. { }    6,7
11. {R}    2,8
12. { }    5,8
1. Set of support refutation is complete.
2. Often Γ is chosen to be the clauses derived from the negated goal (if the initial database is satisfiable).
3. Can be seen as working backwards from the goal.
73
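A sketch of the set-of-support restriction on the example above, reusing the resolve helper from the earlier propositional sketch: every step resolves a clause from the SOS against anything seen so far, and resolvents join the SOS.

def sos_refute(database, goal_clauses):
    """Refute by resolution, always using one parent from the set of support."""
    sos = list(goal_clauses)
    seen = set(database) | set(sos)
    while sos:
        c = sos.pop(0)
        for other in list(seen):
            for r in resolve(c, other):
                if not r:
                    return True            # empty clause derived
                if r not in seen:
                    seen.add(r)
                    sos.append(r)          # resolvents stay in the SOS
    return False

database = [frozenset({"P", "Q"}), frozenset({"~P", "R"}), frozenset({"~Q", "R"})]
print(sos_refute(database, [frozenset({"~R"})]))   # True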
Ordered Resolution
Def: Each clause is treated as a linearly ordered set. Resolution is permitted only on the first literal of each clause. Literals in the conclusion preserve the parent clauses’ order – “positive parent” clauses followed by “negative parent” clauses.
74
Ex:
1. {P, Q}     Δ
2. {¬P, R}    Δ
3. {¬Q, R}    Δ
4. {¬R}       Γ
5. {Q, R}   1,2
6. {R}      3,5
7. { }      4,6
Refutation by ordered resolution is complete for Horn clauses but incomplete in general (just like unit resolution and input resolution).
75
Directed Resolution
Def: Use of ordered resolution on a database of “direction” clauses – i.e., Horn clauses with any positive literal either at the beginning or the end of the clause.
Directed resolution can be used in the forward or the backward direction. Sometimes one or the other – or a mixture – is best.
76
Ex: (forward)
1. {¬M(x), P(x)}    Δ    (M(x) ⇒ P(x))
2. {M(A)}           Δ
3. {¬P(z)}          Γ
4. {P(A)}     1,2
5. { }        3,4
Same problem: (backward)
1. {P(x), ¬M(x)}    Δ    (P(x) ⇐ M(x))
2. {M(A)}           Δ
3. {¬P(z)}          Γ
4. {¬M(z)}    1,3
5. { }        2,4
In general, the problem of deciding whether the forward or backward direction (or some mixture) is best is NP-complete.
77
Nonmonotonic
Reasoning
Traditional systems based on
predicate logic are monotonic
– the number of statements
known to be true is strictly
increasing over time:
“Once true, always true”
78
To deal with many real world
situations we need
nonmonotonic reasoning – the
ability to “jump to conclusions”
that are later withdrawn.
(Defeasible conclusions are ones
that can later be withdrawn.)
1. Closed World Assumption
2. Default Rules
3. Circumscription
4. (Truth Maintenance Systems)
79
Defeasible Conclusions
(diagram) A database KB determines its theory plus a set of defeasible conclusions; later on, an updated database KB′ determines the theory of KB′ and a possibly different set of defeasible conclusions.
80
Closed World Assumption
The CWA “completes” a base set
of beliefs, KB, by including the
negation of a ground atom in the
theory whenever that ground
atom does not logically follow
from KB.
It’s as if we add to the belief set all the negative
ground literals whose positive versions cannot be
deduced from KB.
Ex: KB = P(A) ∧ [P(A) ∧ Q(A) ⇒ P(B)]
Q(B) cannot be deduced, so the CWA augments the theory with ¬Q(B) (and likewise for every other underivable ground atom)
81
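A minimal propositional-Horn sketch of the CWA augmentation (an illustration under one reading of the example above, not the formal definition): compute the entailed atoms, then assume the negation of every other atom in the language.

def cwa(facts, rules, atoms):
    """facts: set of atoms; rules: (body, head) pairs; atoms: the language."""
    entailed = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if set(body) <= entailed and head not in entailed:
                entailed.add(head)
                changed = True
    assumed = {"~" + a for a in atoms - entailed}   # negate the underivable atoms
    return entailed | assumed

# P(A) with the rule P(A) & Q(A) => P(B), as in the example above:
print(cwa({"P(A)"}, [(["P(A)", "Q(A)"], "P(B)")],
          {"P(A)", "Q(A)", "P(B)", "Q(B)"}))
# contains P(A), ~Q(A), ~P(B), ~Q(B)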
Closed World Assumption
(diagram) Database KB = {φ, φ ⇒ ψ}; the theory of KB contains ψ, and the CWA augmentation e[KB] adds defeasible conclusions such as ¬ω and ¬σ.
later on… Database KB′ = {φ, φ ⇒ ψ, σ}; the theory of KB′ contains ψ and σ; e[KB′] still contains ¬ω, but ¬σ has been withdrawn.
82
CWA
Formally, the definition of the CWA is:
The formula φ is in T[Δ] iff Δ ╞ φ (the regular definition of the theory T[Δ])
¬P is in Δasm iff the ground atom P is not in T[Δ] (Δasm is the set of added beliefs assumed under the CWA)
φ is in CWA[Δ] iff {Δ ∪ Δasm} ╞ φ (the augmented theory CWA[Δ] is the closure of all beliefs, explicit and assumed)
83
The CWA is often used with database systems:
Neighbor(Israel, Egypt)
Neighbor(Egypt, Libya)
Neighbor(Canada, US)
Neighbor(China, USSR)
………
If using the database with the CWA,
when asked if “Neighbor(Israel, US)”
you could answer “no.”
84
The CWA does not always result in a consistent theory e[B].
Ex: KB = [ P(A) ∨ P(B) ]
e[KB] thus includes ¬P(A) and ¬P(B), but the three clauses together are inconsistent.
In general, the CWA augmentation CWA[KB] of a consistent set KB is inconsistent iff there are positive ground literals L1, …, Ln such that:
KB ╞ L1 ∨ L2 ∨ … ∨ Ln, but for i = 1,…,n, KB ⊭ Li.
85
It may be difficult to test the
conditions above, but we have
another useful rule:
If the clause form of B is Horn* and consistent, then the CWA augmentation e[B] is consistent.
*A Horn clause is defined as a clause with at most one positive literal; a Horn database has only Horn clauses.
86
Other Variations:
Domain closure assumption — we limit the constant terms in the language. Assume the only objects are the ones that can be named using the constant and function symbols in B.
Unique names assumption — if ground terms cannot be proved equal, they can be assumed unequal.
87
Default Rules
Another one of several
alternative approaches to
nonmonotonic reasoning.
Instead of, for example, the
Closed World Assumption, we
introduce:
Nonmonotonic Inference Rules
— Default Rules.
88
Default rules are written in the form:
A(x) : C(x)
W(x)
If, for some X0, A(X0) follows from e[B, D] and C(X0) is consistent with e[B, D], then e[B, D] includes W(X0).
e[B, D] above is the augmentation of a belief set B
and a set of default rules D
89
Ex:
Bird(x) : Flies(x)
Flies(x)
If B = Bird(Tweety) ∧ [Ostrich(x) ⇒ ¬Flies(x)]
then e[B, D] also includes Flies(Tweety)
If we add Ostrich(Tweety) to B, then we cannot conclude that Flies(Tweety).
90
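A naive sketch of firing default rules of the form A : C / W over ground literals (real default logic computes fixed-point extensions; this one-pass version just illustrates the consistency check). It reuses the negate helper from the first resolution sketch.

def augment(beliefs, defaults):
    """beliefs: set of ground literals; defaults: (A, C, W) triples."""
    beliefs = set(beliefs)
    changed = True
    while changed:
        changed = False
        for a, c, w in defaults:
            # Fire the default if A is believed and C is consistent
            # (here: the negation of C is not believed).
            if a in beliefs and negate(c) not in beliefs and w not in beliefs:
                beliefs.add(w)
                changed = True
    return beliefs

# Bird(x) : Flies(x) / Flies(x), instantiated for Tweety:
defaults = [("Bird(Tweety)", "Flies(Tweety)", "Flies(Tweety)")]
print(augment({"Bird(Tweety)"}, defaults))
# {'Bird(Tweety)', 'Flies(Tweety)'}
print(augment({"Bird(Tweety)", "~Flies(Tweety)"}, defaults))
# the default is blocked: Flies(Tweety) is not added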
Couldn’t we write ridiculous rules?
Monkey(x) : Brown(x)
Flies(x)
Yes, but we can also write ridiculous implications in classical logic:
Monkey(x) ⇒ Flies(x)
These are just symbols; we don’t know their interpretation. We are manipulating symbols…
91
In a sense, default rules are a
kind of “weak implication”
Instead of writing “all birds fly,” i.e.,
∀x [ Bird(x) ⇒ Flies(x) ]
which is too strong, we write the default rule:
Bird(x) : Flies(x)
Flies(x)
92
Default Rules
(diagram) Database B = {φ, φ ⇒ ψ} with default rules D = {ψ : ¬σ / ¬σ}; the theory of B contains ψ, and the augmentation e[B,D] adds the defeasible conclusion ¬σ.
later on… Database B′ = {φ, φ ⇒ ψ, σ} with the same default rules D; the default is now blocked, so ¬σ is no longer in e[B′,D].
93
“Normal” Default Rules
are written in the form
A(x) : W(x)
W(x)
Ex1:  Bird(x) : Flies(x)
      Flies(x)
Ex2:  : ¬P(x)
      ¬P(x)
Is example 2 the same as the Closed World Assumption?
94
A Default Theory may have
more than one augmentation:
Ex: B has P ∨ Q
D has  : ¬Q   and  : ¬P
       ¬Q          ¬P
Then e[B, D] could be either (P ∨ Q) and ¬P, or (P ∨ Q) and ¬Q.
The CWA, on the other hand, would
have produced an inconsistent
augmentation.
95
The union of these two alternatives for e[B, D],
(P ∨ Q) and ¬P
(P ∨ Q) and ¬Q,
is inconsistent.
The general rule:
If a normal default theory has
distinct augmentations, they are
mutually inconsistent.
96
There are default theories with no augmentations.
Ex: B is empty
D has  : P
       ¬P
Then e[B, D] is empty.
But every normal default theory has an augmentation.
A default theory can have an inconsistent augmentation iff B itself is inconsistent.
97
If D ⊆ D′ (both sets of normal default rules), then for any e[B, D] there is an e[B, D′] such that e[B, D] ⊆ e[B, D′]
Adding new normal default rules does not require the withdrawal of beliefs (though adding new beliefs might).
98
Circumscription
A non-monotonic technique that
involves computing a certain
formula which, when joined to B,
says that “the only objects
satisfying a predicate are those
that can be shown to satisfy it,
given B.”
99
Circumscription
(diagram) Database B = {φ(A), φ(x) ⇒ ψ(x)} plus a circumscription axiom; the theory contains ψ(A), with defeasible conclusions such as ¬ψ(B).
later (the circ. axiom might also change)… Database B′ = {φ(A), φ(B), φ(x) ⇒ ψ(x)}; the theory now contains ψ(A) and ψ(B), so ¬ψ(B) is withdrawn.
We will first consider the related
idea of Predicate Completion:
Let’s say P(A) is the only formula in B that contains P. P(A) could be written as
∀x[ (x = A) ⇒ P(x) ]      (the “if” direction)
Now, to say “there are no other objects satisfying P” we could write
∀x[ P(x) ⇒ (x = A) ]      (the “only if” direction)
This is called the completion formula for P.
101
The conjunction of B with the completion formula is called the “completion of P in B”:
COMP[B;P] = B ∧ ∀x[ P(x) ⇒ (x = A) ]
or equivalently
COMP[B;P] = B ∧ ∀x[ P(x) ⇔ (x = A) ]
(B already contains ∀x[ (x = A) ⇒ P(x) ])
102
The same sort of technique of predicate completion can be used if, e.g., B contained only 2 formulas in P, say P(A) and P(B):
∀x[((x = A) ∨ (x = B)) ⇒ P(x)]
Our completion formula:
∀x[P(x) ⇒ ((x = A) ∨ (x = B))]
In these examples, predicate completion has the same effect as the CWA with respect to P.
But predicate completion is not defined for arbitrary formulas.
103
Predicate Completion is only
defined for clauses solitary in P.
A set of clauses is “solitary in P” if each
clause with a positive occurrence of P
has at most one occurrence of P.
Q(A) ∨ P(A) is solitary in P
Q(A) ∨ ¬P(B) ∨ P(A) is not solitary in P
104
The formal method for
Predicate Completion:
Set of clauses solitary in P:
Q1 ∨ … ∨ Qm ∨ P(T)
Equivalent to:
(x = T) ∧ ¬Q1 ∧ … ∧ ¬Qm ⇒ P(x)
Converted to:
∃y[(x = T) ∧ ¬Q1 ∧ … ∧ ¬Qm] ⇒ P(x)
(where y are the free variables in Q1 … Qm)
Called “the normal form for predicate completion”
105
Let’s call the formula on the left of the “⇒” E; then we can have a set of normal form clauses:
E1 ⇒ P(x)
E2 ⇒ P(x)
…
Ek ⇒ P(x)
which becomes
∀x[(E1 ∨ E2 ∨ … ∨ Ek) ⇒ P(x)]      (the “if” direction)
The completion formula is then
∀x[ P(x) ⇒ (E1 ∨ E2 ∨ … ∨ Ek)]      (the “only if” direction)
Finally:
COMP[B;P] is defined as
[ B ∧ ∀x[ P(x) ⇒ (E1 ∨ E2 ∨ … ∨ Ek)] ]
Example:
B = { Ostrich(x) ⇒ Bird(x), Bird(Tweety), ¬Ostrich(Sam) }
Clauses with Bird: Ostrich(x) ⇒ Bird(x), Bird(Tweety)
Normal form (for Bird):
∀x[Ostrich(x) ∨ (x = Tweety) ⇒ Bird(x)]
So:
COMP[B; Bird] =
B ∧ ∀x[Bird(x) ⇒ [Ostrich(x) ∨ (x = Tweety)]]
Is Sam a Bird?
107
Circumscription is similar to predicate
completion, but it is defined for more
general formulas (not just sets of
clauses solitary in P).
In order to understand circumscription,
we need to know about “minimal
models” (which is really what we
were using with predicate completion
as well).
(See Genesereth and Nilsson,
“Logical Foundations of Artificial Intelligence”)
108
Let M[B] and M*[B] be 2 models of B, identical in every way (same domain, interpretation on predicates and functions) EXCEPT that the extension of P in M* is a subset of its extension in M.
We say M*[B] is no larger than M[B] in predicate P: M*[B] ≤p M[B]
Ex: The extension of Red in M is {A, B, C}, while the extension of Red in M* is {A, B}; everything else is the same.
109
There may be models of B that are minimal in P according to the ordering ≤p.
[Minimal models do not always exist.]
If a model Mm of B is P-minimal, then no objects are in the extension of P except those that have to be, given B.
Ex: If B includes the two clauses Red(A) and Red(B) (and no other clauses with Red), then M* above is Red-minimal, though M certainly also satisfies B.
110
We want to find a sentence Φp such that the models of B ∧ Φp are P-minimal models of B, that is:
the sentence Φp joined to B asserts that there are no objects that satisfy P except those that have to, given B.
This is what we did with predicate completion; we want a method for doing it with arbitrary (not just solitary) formulas.
112
Abbreviations
write P* ≤ P in place of ∀x(P*(x) ⇒ P(x))
write P* < P in place of [(P* ≤ P) ∧ ¬(P ≤ P*)]
write P* = P in place of [(P* ≤ P) ∧ (P ≤ P*)]
113
Suppose Mm is a P-minimal model of B. Then there can be no interpretation for P* that (together with Mm) satisfies B[P*] ∧ (P* < P) (if there were, we could find a model smaller than Mm), i.e.,
¬∃P*[B(P*) ∧ (P* < P)]
This is exactly the formula Φp that we were looking for…
114
CIRC[B; P] is defined as
B ∧ ¬∃P*[B(P*) ∧ (P* < P)]
This “second order” formula is unwieldy, and there are useful theorems that allow us to handle it more conveniently.
115
Theorem 6.4 (G&N):
Given P and B[P] (a belief set containing P), then for all P′ not containing P (of the same arity as P),
if B[P] ╞ B[P′] ∧ (P′ ≤ P) then CIRC[B;P] is B[P] ∧ (P = P′)
The circumscription formula has simplified. This can be used to confirm conjectures about circumscription formulas: guess P′, the completion formula.
116
Often, the circumscription formula is
not required to be second order:
Theorem:
If all occurrences of P in B are
positive, then CIRC[B;P] is
equivalent to a first-order formula.
But what first-order formula?
Theorem 6.5 (G&N) tells us about
one useful situation.
117
Let N[P] be a formula containing no positive occurrences of P (in its clause form); let E be a formula containing no occurrences of P; let E ≤ P stand (as usual) for ∀x[E(x) ⇒ P(x)]
Theorem 6.5 (G&N):
CIRC[N(P) ∧ (E ≤ P); P] is
N[E] ∧ (E = P), where N[E] is N[P] with each occurrence of P replaced by E.
118
This can, for certain formulas, give
us a simply computed first-order
formula for circumscription.
Also, another useful simplifying
fact:
Circumscription gives the same
result as predicate completion for
the special case of clauses
solitary in P.
119
Example (from Genesereth and Nilsson):
Δ = ∃x ¬On(A,x) ∧ On(A,B)
Compute CIRC[Δ; On]
Rewrite Δ as:
∃x ¬On(A,x) ∧ ∀x,y[(x = A) ∧ (y = B) ⇒ On(x,y)]
The first conjunct is N[On] (with no positive occurrences of On). The second conjunct is (E ≤ On), with E(x,y) being [(x = A) ∧ (y = B)] (E has no occurrences of On).
By Theorem 6.5, CIRC[Δ; On] is
∀x,y [ On(x,y) ⇔ [ (x = A) ∧ (y = B) ] ] ∧ ∃x ¬(x = B)
Just plug it in and simplify.
120
Another useful theorem (6.10, G&N):
If all occurrences of P1, P2, …, PN occur positively in B, then CIRC[B; P] is equivalent to
CIRC[B; P1] ∧ CIRC[B; P2] ∧ … ∧ CIRC[B; PN]
All this work on circumscription lets us state default rules modularly:
∀x[Bird(x) ∧ ¬Ab(x) ⇒ Flies(x)]
∀x[Ostrich(x) ⇒ Ab(x)]
We can circumscribe on Ab and Flies to show that the only abnormal things are Ostriches.
121
Ex: Database KB is
∀x[Bird(x) ∧ ¬Ab(x) ⇒ Flies(x)]
∀x[Ostrich(x) ⇒ Ab(x)]
CIRC[KB; Ab] is
KB ∧ (∀x Ab(x) ⇔ [ Ostrich(x) ∨ (Bird(x) ∧ ¬Flies(x)) ])
The only abnormal things are Ostriches or Birds that don’t fly.
If we then also circumscribe on Flies, i.e., compute CIRC[KB; Ab; Flies], we get
KB ∧ (∀x Ab(x) ⇔ Ostrich(x))
i.e., the only abnormal things are Ostriches. See Genesereth and Nilsson, p. 150 for full examples.
122
1. Truth Maintenance Systems
2. Knowledge and Belief
123
After all this work on formal
approaches to nonmonotonic
reasoning, why not consider a
more straightforward “database”
approach— adding and deleting
facts from a database is also
nonmonotonic.
Then, however, you need to
resolve potential inconsistencies
in the database over time.
124
Truth Maintenance System
(Doyle)
A TMS supports nonmonotonic reasoning. It is a “truth maintenance subsystem” — it doesn’t generate new inferences, but rather maintains consistency among statements generated by some other program:
(diagram) the problem solver (SOP) generates inferences; the TMS detects and “resolves” inconsistencies
125
In a TMS, each statement or rule is
called a “node” which at any given
point is either
IN — Believed to be true
OUT — Not believed to be true
(either no reasons to believe it
true or no currently valid reasons)
126
Each node has a list of justifications,
each of which is one way of
establishing a node’s validity.
IN nodes have at least one currently
valid justification.
OUT nodes have no currently valid
justification.
There are 2 kinds of justifications:
• Support List (SL (in-nodes) (out-nodes))
• Conditional Proof (CP <consequent>
(in-hypotheses)
(out-hypotheses))
127
SL Justifications
Valid if all the nodes on (in-nodes) are IN and all the nodes on (out-nodes) are OUT.
Ex:
1. It is winter   (SL ( ) ( ))
2. It is cold     (SL (1) ( ))
128
An example of “default reasoning” (uses a non-empty OUT list):
Ex:
1. It is winter   (SL ( ) ( ))
2. It is cold     (SL (1) (3))
3. It is warm
3 must be kept around even though it is OUT. Node 2 (with its non-empty OUT list) is called an “assumption.” The TMS doesn’t create these justifications – it must be given them.
129
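A minimal sketch of SL-justification labeling for the example above: a node is IN if some justification has all its in-list nodes IN and all its out-list nodes OUT. Real TMSs are incremental and handle contradictions; this just iterates to a fixed point to illustrate the IN/OUT statuses.

def label(justifications):
    """justifications: node -> list of (in_list, out_list) SL justifications."""
    status = {n: False for n in justifications}    # start everything OUT
    changed = True
    while changed:
        changed = False
        for node, justs in justifications.items():
            new = any(all(status.get(i, False) for i in ins) and
                      not any(status.get(o, False) for o in outs)
                      for ins, outs in justs)
            if new != status[node]:
                status[node] = new
                changed = True
    return {n: ("IN" if s else "OUT") for n, s in status.items()}

# The default-reasoning example above: node 3 has no justification.
print(label({1: [([], [])], 2: [([1], [3])], 3: []}))
# {1: 'IN', 2: 'IN', 3: 'OUT'}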
Conditional Proof
Justifications
These represent hypothetical arguments. They are valid if the “consequent node” is always IN whenever the (in-hypotheses) are IN and the (out-hypotheses) are OUT.
130
Let’s look at an example of “dependency-directed backtracking,” scheduling a meeting:
Day = Wednesday: after many steps, we find 2pm is the only good time; we try to find a room, but all rooms are busy Wednesday. FAIL.
Day = Tuesday: after many steps, we find 2pm is the only good time; we try to find a room. SUCCEED.
We want to avoid the duplicated work.
131
Ex: Start with
(1) Day(M) = Wed.    (SL ( ) (2))
(2) Day(M) NEQ Wed.
Node (1) is an assumption. Now do more reasoning (not the TMS, the other system) and we get
(1) Day(M) = Wed.    (SL ( ) (2))
(2) Day(M) NEQ Wed.
(3) Time(M) = 2pm    (SL (57, 103, 45) ( ))
Nodes (1) and (3) are IN, and (2) is OUT. When no room can be found, there exists a contradiction between (1) and (3), so a new node is created:
(4) Contradiction    (SL (1, 3) ( ))
and dependency-directed backtracking is invoked. It searches back (multiple levels if necessary) from the Contradiction's justifications, to their justifications, etc., looking for a set of assumptions such that if one is eliminated, the contradiction disappears. In this case, only (1) is an option.
132
The NOGOOD node is created to represent the set of inconsistent assumptions:
(1) Day(M) = Wed.     (SL ( ) (2))
(2) Day(M) NEQ Wed.
(3) Time(M) = 2pm     (SL (57, 103, 45) ( ))
(4) Contradiction     (SL (1, 3) ( ))
(5) NOGOOD N-1        (CP 4 (1, 3) ( ))
Now node (1), the inconsistent assumption, is made OUT (by making node (2), on 1’s out-list, IN). We finally get:
(1) Day(M) = Wed.     (SL ( ) (2))
(2) Day(M) NEQ Wed.   (SL (5) ( ))
(3) Time(M) = 2pm     (SL (57, 103, 45) ( ))
(4) Contradiction     (SL (1, 3) ( ))
(5) NOGOOD N-1        (CP 4 (1, 3) ( ))
with (2), (3), and (5) being IN. Node (3) does not have to be rederived.
133
Logics of Knowledge
and Belief
For background reading, see
Genesereth and Nilsson,
Logical Foundations of Artificial
Intelligence, Chapter 9
134
In order for agents to reason about their own knowledge, or about other agents’ knowledge, we must have formalisms for describing what agents know and believe.
135
To represent agent A believing
some statement P, we will use a
modal operator B:
B(A, Father(John, Bill))
To represent agent A knowing some
statement P, we will use a modal
operator K:
K(A, Father(John, Bill))
Known statements are typically assumed to be true.
136
There are two popular ways of
dealing with logical statements
involving knowledge and belief:
Sentential
and
Possible-Worlds
Logics
137
Sentential Logics
A Simple Version:
1. All ordinary wffs are wffs
2. If φ is an ordinary closed wff (one with no free variables), and if a is a ground term, then B(a, φ) is a wff. Such wffs are called belief atoms.
3. If φ and ψ are wffs, then so are any expressions that can be constructed from φ and ψ by the usual propositional connectives.
NOT WFFS:
a. ∃x B(R, P(x))
b. B(R1, B(R2, P(A)))
IS A WFF:
c. B(R, (∃x P(x)))
138
What are the semantics of belief atoms?
We don’t want B(a, φ) to have the same truth value as B(a, ψ) just because φ and ψ have the same truth values! Belief depends on the believed proposition, not just on its denotation.
Nor do we want the truth value of a belief to remain the same when equivalent terms are substituted:
B(a, Hungry(John)) does not imply B(a, Hungry(Father-of(Bill))) even if John is the Father of Bill: referentially opaque vs. referentially transparent.
139
To define the semantics of B, we
extend our domain as follows:
1 .A set of agents;
2 .Each agent has a base set of beliefs,
Da, composed of ordinary closed wffs,
and a set of inference rules, ra.
3 .Ta is the theory formed by the closure
of Da under the inference rules ra
(provability is |– )
a
140
The agent’s theory is not closed under logical implication — it is only closed under the agent’s own inference rules.
These inference rules might limit the number of deductions the agent can make, or the kinds of deductions it can make…
For logical omniscience, give the agent
a complete set of inference rules.
141
B(a, φ) is constrained so that a must be an agent; then —
B(a, φ) has the value true if and only if φ is in the theory associated with the agent denoted by a.
This semantics is referentially opaque.
142
Proof Methods
We want to be able to start with a statement that an agent believes some proposition, B(a, φ), and conclude that he believes some other proposition, B(a, ψ).
We must assume that we have a model of the deduction process of the agent.
143
The inference rule attachment (defined for formulas in clause form):
from:
B(a, φ1) ∨ ψ1
B(a, φ2) ∨ ψ2
…
B(a, φn) ∨ ψn
¬B(a, φn+1) ∨ ψn+1
and
φ1 ∧ … ∧ φn |–a φn+1
conclude
ψ1 ∨ … ∨ ψn+1
144
Example (with no ψ’s):
Nora believes P ⇒ Q but does not believe Q; we want to show that she does not believe P:
1. B(Nora, P ⇒ Q)
2. ¬B(Nora, Q)
3. B(Nora, P)      (the negated goal)
The attachment rule says that if we can show
(P ⇒ Q) ∧ P |–Nora Q
then we have derived the empty clause, and we are done. If Nora can do modus ponens, we are successful.
145
Given B(A, P(b)) ∧ (b = c), we cannot deduce B(A, P(c)) — which is good!
However, if we have B(A, (b = c)), the situation changes…
Another example:
(∀x R(x) ⇒ D(x)) ⇒ B(J, (∀x R(x) ⇒ D(x)))
R(Fred) ⇒ B(J, R(Fred))
¬B(J, D(Fred))
Clause form:
1. R(Sk) ∨ B(J, (∀x R(x) ⇒ D(x)))
2. ¬D(Sk) ∨ B(J, (∀x R(x) ⇒ D(x)))
3. ¬R(Fred) ∨ B(J, R(Fred))
4. ¬B(J, D(Fred))
Depending on John’s inference rules:
5. ¬D(Sk) ∨ ¬R(Fred), and
6. R(Sk) ∨ ¬R(Fred), yielding
7. (¬D(Sk) ∧ R(Sk)) ∨ ¬R(Fred)
146
We can remove two of the
restrictions of our earlier, “simple
version” of sentential logic:
1. We can allow nested beliefs, so that B(R1, B(R2, P(A))) is now a wff;
2. We can allow quantifying-in, so that (∃x B(A, Father(Bill,x))) is a wff.
147
To deal with the semantics of nested beliefs, we need to use nested attachment to arbitrary levels of nesting:
The symbol |–ai,aj,ak indicates proofs using our model of ai’s model of aj’s model of ak’s inference rules…
See Genesereth and Nilsson, pp. 215–216, for the “Wise-Men Puzzle.”
148
Quantifying-in requires even more complicated semantics, to deal with naming objects in different models.
See Genesereth and Nilsson, pp. 216–222, for the bullet operator and related proof methods.
149
Possible-Worlds Logics
Our conceptualization includes
objects, w0, w1, w2, … , wi, …
called possible worlds.
Our language is basically the same
as that presented for sentential
logics, namely, ordinary first-order
predicate calculus plus a modal
operator, K.
150
Semantics for ordinary wffs (no
modal operators):
A wff is not “true” or “false”, but is
rather true or false with respect
to a possible world.
Instead of a single interpretation
(with objects, functions,
relations), we now have one for
each possible world.
151
Normal (not Possible Worlds)
Semantics (review)
Descriptive Semantics:
(diagram) symbols and sentences (e.g., Red(A)) are mapped by an interpretation I to the world, according to our conceptualization.
152
1. We have a set of sentences and a
conceptualization of the world.
2. We associate the symbols used in
the sentences with the objects,
functions and relations of our
conceptualization.
3. We evaluate the truth of sentences
according to this association; that is, a
sentence is true if and only if it
accurately describes the world
according to our conceptualization.
153
In Possible Worlds Semantics…
An ordinary wff has the value
true with respect to the world
wi when it evaluates to true
using the interpretation
associated with wi.
So White(Snow) might be true in world
w0 and false in world w1.
154
How are these worlds related?
Accessibility:
There exists an accessibility relation, k(a, wi, wj); when the relation is satisfied, we say that world wj is accessible from world wi for agent a.
(diagram: worlds 1–4 with accessibility arcs labeled a and b)
155
Possible-Worlds Semantics:
A knowledge atom K(a, φ) has the value true with respect to world wi if and only if φ has the value true in all worlds accessible from wi for agent a.
156
Example (a’s accessibility relation):
From w0 (where P, Q, ¬R hold), agent a’s accessible worlds are w1 (P, Q, ¬R), w2 (P, ¬Q, ¬R), and w3 (P, Q, ¬R).
a knows P and ¬R in w0, but does not know Q or R.
157
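A minimal sketch of this semantics on the example above (the encoding is an assumption for illustration): K(a, p) holds at w iff p has the required value in every world accessible from w for a.

worlds = {
    "w0": {"P": True,  "Q": True,  "R": False},
    "w1": {"P": True,  "Q": True,  "R": False},
    "w2": {"P": True,  "Q": False, "R": False},
    "w3": {"P": True,  "Q": True,  "R": False},
}
k = {("a", "w0"): ["w1", "w2", "w3"]}   # accessibility relation

def knows(agent, prop, w, value=True):
    """Does `agent` know prop == value in world w?"""
    return all(worlds[w2][prop] == value for w2 in k.get((agent, w), []))

print(knows("a", "P", "w0"))               # True  (a knows P)
print(knows("a", "R", "w0", value=False))  # True  (a knows ~R)
print(knows("a", "Q", "w0"))               # False (Q fails in w2)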
Another example — Nested Knowledge:
(diagram: worlds w0–w8; from w0, agent A’s accessible worlds all satisfy P and K(B,P), and from each of those, agent B’s accessible worlds all satisfy P)
“The agent A knows in the real world w0 that
agent B knows P”—in all worlds accessible for
A from w0, K(B, P) has value true…
158
The accessibility relation divides
all possible worlds into those that
are accessible and those that are
not—from any given world.
The relation gives us more,
though, than a simple division of
all possible worlds into
“reachable” and “unreachable”
sets…
159
Properties of Knowledge
Various intuitive properties of an
agent’s knowledge can be
captured using potential
properties of the accessibility
relation introduced above,
things like reflexivity, transitivity,
Euclidean, …
160
Distribution Axiom
(G&N, 9.1)
(K(a, φ) ∧ K(a, φ ⇒ ψ)) ⇒ K(a, ψ)
This is a direct consequence of possible-worlds semantics, regardless of the nature of the accessibility relation.
161
Knowledge Axiom (9.2)
K(a, φ) ⇒ φ
This property, which we might want our agents’ knowledge to satisfy, will occur if the accessibility relation (considered as a binary relation for a fixed knower) is reflexive, i.e., k(a, w1, w1) for any agent a and world w1
162
Positive Introspection
Axiom (9.3)
K(a, φ) ⇒ K(a, K(a, φ))
“If an agent knows something, he knows that he knows it.”
This property will be satisfied if the accessibility relation is transitive (i.e., k(a, w1, w2) and k(a, w2, w3) imply k(a, w1, w3) for any agent and all w1, w2, and w3).
163
Negative Introspection
Axiom (9.4)
¬K(a, φ) ⇒ K(a, ¬K(a, φ))
“If an agent doesn’t know something, he knows that he doesn’t know it.”
This property will be satisfied if the accessibility relation is Euclidean (i.e., k(a, w1, w2) and k(a, w1, w3) imply k(a, w2, w3) for any agent and all w1, w2, and w3).
164
Brouwer Axiom
¬K(a, ¬K(a, φ)) ⇒ φ
This axiom follows from axioms 9.2 and 9.4…
This property will be satisfied if the accessibility relation is symmetric (i.e., k(a, w1, w2) is equivalent to k(a, w2, w1)).
165
Epistemic Necessitation
(Rule 9.5)
From |– φ infer K(a, φ)
This rule follows from possible-worlds semantics.
φ must be a “logical axiom,” i.e., one which is valid under all interpretations, not one which just happens to be true in a given world.
166
Logical
Omniscience
From φ |– ψ and from K(a, φ), infer K(a, ψ)
This follows from Axiom 9.1 and rule 9.5.
This means that an agent knows all the consequences of its knowledge.
167
Other knowledge relationships:
K(a, φ) ∧ K(a, ψ) ⇔ K(a, (φ ∧ ψ))
but it is not true that
K(a, (φ ∨ ψ)) implies K(a, φ) ∨ K(a, ψ)
K(a, ¬φ) ⇒ ¬K(a, φ), but not vice versa
Modal logics of knowledge:
Axioms 9.1 through 9.4, with axioms of ordinary
propositional logic, ordinary inference rules, and rule
9.5 (not all combinations of the axioms are possible):
9.1 through 9.4 is a system called S5
9.1 through 9.3 is a system called S4
9.1 and 9.2 is a system called T
9.1 alone is a system called K
168
Nora knows P ⇒ Q but does not know Q; we want to show that she does not know P:
1. K(Nora, P ⇒ Q) — given
2. K(Nora, P) ⇒ K(Nora, Q) — 9.1
3. ¬K(Nora, Q) ⇒ ¬K(Nora, P) — contrapositive of 2
4. ¬K(Nora, Q) — given
5. ¬K(Nora, P) — 3, 4, Modus Ponens
Prove K(a, φ) ⇒ ¬K(a, ¬φ)
1. ¬K(a, φ) ∨ ¬K(a, ¬φ) — the goal, rewritten
Assume the opposite and show a contradiction:
2. K(a, φ) ∧ K(a, ¬φ)
3. φ ∧ ¬φ — Axiom 9.2
4. False
169