Transcript ICoS-4

Working with
Discourse Representation Theory
Patrick Blackburn & Johan Bos
Lecture 3
DRT and Inference
This lecture
Now that we know how to build DRSs
for English sentences, what do we do
with them?
Well, we can use DRSs to draw
inferences.
In this lecture we show how to do that,
both in theory and in practice.
Overview
Inference tasks
Why FOL?
From model theory to proof theory
Inference engines
From DRT to FOL
Adding world knowledge
Doing it locally
The inference tasks
The consistency checking task
The informativity checking task
Why First-Order Logic?
Why not use higher-order logic?
- Better match with formal semantics
- But: undecidable / no fast provers available
Why not use weaker logics?
- Modal/description logics (decidable fragments)
- But: can't cope with all of natural language
Why use first-order logic?
- Undecidable, but good inference tools available
- DRSs translate to first-order logic
- Easy to add world knowledge
Axioms encode world knowledge
We can write down axioms about the information that we find fundamental:
for example, lexical knowledge, world knowledge, information about the structure of time, events, etc.
By the Deduction Theorem
φ1, …, φn |= ψ
iff
|= (φ1 & … & φn) → ψ
That is, inference reduces to validity of formulas.
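The reduction can be checked concretely in the propositional case. A minimal sketch, with an encoding assumed for illustration (formulas as Python functions over the truth values of p and q), taking φ1 = p, φ2 = p → q, ψ = q:

```python
from itertools import product

# Propositional instance of the Deduction Theorem:
# p, (p -> q) |= q   iff   |= (p & (p -> q)) -> q

def entails(premises, conclusion):
    """Check |=: conclusion holds in every assignment satisfying all premises."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if all(prem(p, q) for prem in premises))

def valid(formula):
    """Check validity: formula holds under every truth assignment."""
    return all(formula(p, q) for p, q in product([True, False], repeat=2))

lhs = entails([lambda p, q: p, lambda p, q: (not p) or q],
              lambda p, q: q)
rhs = valid(lambda p, q: not (p and ((not p) or q)) or q)
print(lhs, rhs)  # True True
```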
From model theory to proof theory
The inference tasks were defined
semantically
For computational purposes, we need
symbolic definitions
We need to move from the concept of |= to |-.
In other words, from validity to provability
Soundness
If provable then valid:
If |- φ then |= φ
Soundness is a 'no garbage' condition
Completeness
If valid then provable:
If |= φ then |- φ
Completeness means that proof theory has captured model theory
Decidability
A problem is decidable if a computer is guaranteed to halt in finite time on any input and give you a correct answer
A problem that is not decidable is undecidable
First-order logic is undecidable
What does this mean?
It is not possible to write a program that is guaranteed to halt when given any first-order formula and correctly tell you whether or not that formula is valid.
Sounds pretty bad!
Good news
FOL is semi-decidable
What does that mean?
- If a formula is in fact valid, it is always possible to symbolically verify this fact in finite time
- That is, things only go wrong for FOL when it is asked to tackle something that is not valid
- On some non-valid input, any algorithm is bound not to terminate
Put differently
Half the task, namely determining validity, is fairly reasonable.
The other half of the task, showing non-validity (or, equivalently, satisfiability), is harder.
This duality is reflected in the fact that there are two fundamental computational inference tools for FOL:
- theorem provers
- and model builders
Theorem provers
Basic thing they do is show that a
formula is provable/valid.
There are many efficient off-the-shelf
provers available for FOL
Theorem proving technology is now
nearly 40 years old and extremely
sophisticated
Examples:
Vampire, Spass, Bliksem, Otter
Theorem provers and informativity
Given a formula φ, a theorem prover will try to prove φ, that is, to show that it is valid/uninformative
If φ is valid/uninformative, in theory, the theorem prover will always succeed
So theorem provers are a negative test for informativity
If the formula φ is not valid/uninformative, all bets are off.
Theorem provers and consistency
Given a formula φ, a theorem prover will try to prove ¬φ, that is, to show that φ is inconsistent
If φ is inconsistent, in theory, the theorem prover will always succeed
So theorem provers are also a negative test for consistency
If the formula φ is not inconsistent, all bets are off.
Model builders
Basic thing that model builders do is try
to generate a [usually] finite model for a
formula. They do so by iteration over
model size.
Model building for FOL is a rather new
field, and there are not many model
builders available.
It is also an intrinsically hard task;
harder than theorem proving.
Examples: Mace, Paradox, Sem.
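The iteration over model size can be illustrated with a toy brute-force builder. This is only a sketch under an assumed encoding (formulas as Python callables over a domain and the extension of a single unary predicate P), not the input format of a real builder like Mace or Paradox:

```python
from itertools import product

# Toy model builder: iterate over domain size, and for each size
# enumerate every possible extension of the unary predicate P.
# Here we search for the smallest model of  Ex P(x) & Ex ~P(x).

def satisfiable(formula, max_size=5):
    for n in range(1, max_size + 1):          # iterate over model size
        domain = list(range(n))
        for bits in product([False, True], repeat=n):
            P = {d for d, b in zip(domain, bits) if b}
            if formula(domain, P):
                return n, P                   # smallest model found
    return None                               # give up (no finite model <= max_size)

phi = lambda dom, P: any(d in P for d in dom) and any(d not in P for d in dom)
print(satisfiable(phi))  # a model of size 2
```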
Model builders and consistency
Given a formula φ, a model builder will try to build a model for φ, that is, to show that φ is consistent
If φ is consistent, and satisfiable on a finite model, then, in theory, the model builder will always succeed
So model builders are a partial positive test for consistency
If the formula φ is not consistent, or it is not satisfiable on a finite model, all bets are off.
Finite model property
A logic has the finite model property if every satisfiable formula is satisfiable on a finite model.
Many decidable logics have this property.
But it is easy to see that FOL lacks this property: for example, the axioms saying that a relation R is transitive, irreflexive, and serial (∀x∃y R(x,y)) are jointly satisfiable, but only on infinite models.
Model builders and informativity
Given a formula φ, a model builder will try to build a model for ¬φ, that is, to show that φ is informative
If ¬φ is satisfiable on a finite model, then, in theory, the model builder will always succeed
So model builders are a partial positive test for informativity
If the formula ¬φ is not satisfiable on a finite model, all bets are off.
Yin and Yang of Inference
Theorem Proving and Model Building
function as opposite forces
Doing it in parallel
We have general negative tests [theorem provers], and partial positive tests [model builders]
Why not try to get the best of both worlds by running these tests in parallel?
That is, given a formula we wish to test for informativity/consistency, we hand it to both a theorem prover and a model builder at once
When one succeeds, we halt the other
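The parallel strategy can be sketched with Python threads. try_prove and try_build_model are hypothetical stand-ins for calls to a real prover (e.g. Vampire) and model builder (e.g. Mace); here they are stubbed so the skeleton runs:

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def try_prove(formula):
    # stub: pretend the theorem prover finds a proof immediately
    return ("proof", formula)

def try_build_model(formula):
    # stub: pretend the model builder is still enlarging the domain
    time.sleep(0.5)
    return ("model", formula)

def parallel_test(formula):
    pool = ThreadPoolExecutor(max_workers=2)
    jobs = [pool.submit(try_prove, formula),
            pool.submit(try_build_model, formula)]
    done, pending = wait(jobs, return_when=FIRST_COMPLETED)
    for job in pending:
        job.cancel()              # halt the other attempt
    pool.shutdown(wait=False)
    return done.pop().result()    # result of whichever finished first

print(parallel_test("PREV -> ~phi")[0])  # proof
```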
Parallel Consistency Checking
Suppose we want to test φ [representing the latest sentence] for consistency wrt the previous discourse
Then:
- If a theorem prover succeeds in finding a proof for PREV → ¬φ, then φ is inconsistent
- If a model builder succeeds in constructing a model for PREV & φ, then φ is consistent
Why is this relevant to natural language?
Testing a discourse for consistency

Discourse                    Theorem prover    Model builder
Vincent is a man.            ??                Model
Mia loves every man.         ??                Model
Mia does not love Vincent.   Proof             ??
Parallel informativity checking
Suppose we want to test the formula φ [representing the latest sentence] for informativity wrt the previous discourse
Then:
- If a theorem prover succeeds in finding a proof for PREV → φ, then φ is not informative
- If a model builder succeeds in constructing a model for PREV & ¬φ, then φ is informative
Why is this relevant to natural language?
Testing a discourse for informativity

Discourse                    Theorem prover    Model builder
Vincent is a man.            ??                Model
Mia loves every man.         ??                Model
Mia loves Vincent.           Proof             ??
Let's apply this to DRT
Pretty clear what we need to do:
- Find efficient theorem provers for DRT
- Find efficient model builders for DRT
- Run them in parallel
- And Bob's your uncle!
Recall that theorem provers are more established technology than model builders
So let's start by finding an efficient theorem prover for DRT…
Googling theorem provers for DRT
Theorem proving in DRT
Oh no!
Nothing there, efficient or otherwise.
Let's build our own.
One phone call to Voronkov later:
- Oops --- does it take that long to build one from scratch?
- Oh dear.
Googling theorem provers for FOL
Use FOL inference technology for DRT
There are a lot of FOL provers available, and they are extremely efficient
There are also some interesting freely available model builders for FOL
We have said several times that DRT is FOL in disguise, so let's get precise about this and put this observation to work
From DRT to FOL
Compile DRS into standard FOL syntax
Use off-the-shelf inference engines for
FOL
Okay --- how do we do this?
Translation function (…)fo
Translating DRT to FOL: DRSs
(writing [x1…xn | C1,…,Cn] for the DRS box)

( [x1…xn | C1,…,Cn] )fo = ∃x1…∃xn((C1)fo & … & (Cn)fo)
Translating DRT to FOL: Conditions
(R(x1…xn))fo = R(x1…xn)
(x1=x2)fo = x1=x2
(¬B)fo = ¬(B)fo
(B1 ∨ B2)fo = (B1)fo ∨ (B2)fo
Translating DRT to FOL: Implicative DRS-conditions

( [x1…xm | C1,…,Cn] ⇒ B )fo = ∀x1…∀xm(((C1)fo & … & (Cn)fo) → (B)fo)
Two example translations

Example 1: [x | man(x), walk(x)]

Example 2: [y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]]
Example 1
[x | man(x), walk(x)]
Example 1
( [x | man(x), walk(x)] )fo
Example 1
∃x( (man(x))fo & (walk(x))fo )
Example 1
∃x( man(x) & (walk(x))fo )
Example 1
∃x( man(x) & walk(x) )
Example 2
[y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]]
Example 2
( [y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]] )fo
Example 2
∃y( (woman(y))fo & ( [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)] )fo )
Example 2
∃y( woman(y) & ( [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)] )fo )
Example 2
∃y( woman(y) & ∀x( (man(x))fo → ( [e | adore(e), agent(e,x), theme(e,y)] )fo ))
Example 2
∃y( woman(y) & ∀x( man(x) → ( [e | adore(e), agent(e,x), theme(e,y)] )fo ))
Example 2
∃y( woman(y) & ∀x( man(x) → ∃e( (adore(e))fo & (agent(e,x))fo & (theme(e,y))fo )))
Example 2
∃y( woman(y) & ∀x( man(x) → ∃e( adore(e) & (agent(e,x))fo & (theme(e,y))fo )))
Example 2
∃y( woman(y) & ∀x( man(x) → ∃e( adore(e) & agent(e,x) & (theme(e,y))fo )))
Example 2
∃y( woman(y) & ∀x( man(x) → ∃e( adore(e) & agent(e,x) & theme(e,y) )))
Basic setup
DRS: [x y | vincent(x), mia(y), love(x,y)]
FOL: ∃x∃y(vincent(x) & mia(y) & love(x,y))
Model:
D = {d1}
F(vincent)={d1}
F(mia)={d1}
F(love)={(d1,d1)}
Background Knowledge (BK)
Need to incorporate BK
Formulate BK in terms of first-order axioms
Rather than just giving φ to the theorem prover (or model builder),
we give it: BK & φ or BK → φ
Basic setup
DRS: [x y | vincent(x), mia(y), love(x,y)]
FOL: ∃x∃y(vincent(x) & mia(y) & love(x,y))
BK: ∀x(vincent(x) → man(x))
∀x(mia(x) → woman(x))
∀x(man(x) → ¬woman(x))
Model:
D = {d1,d2}
F(vincent)={d1}
F(mia)={d2}
F(love)={(d1,d2)}
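A brute-force check that this two-element model satisfies both the translated DRS and the BK axioms. The slide does not list the extensions of man and woman; below they are filled in as forced by the axioms (an assumption for illustration):

```python
from itertools import product

D = ["d1", "d2"]
F = {"vincent": {"d1"}, "mia": {"d2"},
     "man": {"d1"}, "woman": {"d2"},      # assumed, as forced by the BK axioms
     "love": {("d1", "d2")}}

# Ex Ey (vincent(x) & mia(y) & love(x,y)): search all pairs of individuals
drs_fol = any(x in F["vincent"] and y in F["mia"] and (x, y) in F["love"]
              for x, y in product(D, repeat=2))

# The three BK axioms, each a universally quantified implication
bk = (all(x not in F["vincent"] or x in F["man"] for x in D) and
      all(x not in F["mia"] or x in F["woman"] for x in D) and
      all(x not in F["man"] or x not in F["woman"] for x in D))

print(drs_fol and bk)  # True
```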
Local informativity
Example:
- Mia is the wife of Marsellus.
- If Mia is the wife of Marsellus, Vincent will be disappointed.
The second sentence is informative with respect to the first. But…
Local informativity
[x y | mia(x), marsellus(y), wife-of(x,y)]
Local informativity
[x y z | mia(x), marsellus(y), wife-of(x,y), vincent(z), [ | wife-of(x,y)] ⇒ [ | disappointed(z)]]
The antecedent of the conditional repeats wife-of(x,y), which already holds in its context: this sub-DRS is locally uninformative.
Local consistency
Example:
- Jules likes big kahuna burgers.
- If Jules does not like big kahuna burgers, Vincent will order a whopper.
The second sentence is consistent with respect to the first. But…
Local consistency
[x y | jules(x), big-kahuna-burgers(y), like(x,y)]
Local consistency
[x y z | jules(x), big-kahuna-burgers(y), like(x,y), vincent(z), [ | ¬like(x,y)] ⇒ [u | order(z,u), whopper(u)]]
The antecedent ¬like(x,y) contradicts like(x,y) in its context: this sub-DRS is locally inconsistent.
DRT and local inference
Because DRSs group information into contexts, we now have a natural means to check not only global, but also local consistency and informativity.
Important for dealing with presupposition.
Presupposition is not about strange logic, but about using classical logic in new ways.
Tomorrow
Presupposition and Anaphora in DRT