Transcript ICoS-4
Working with
Discourse Representation Theory
Patrick Blackburn & Johan Bos
Lecture 3
DRT and Inference
This lecture
Now that we know how to build DRSs
for English sentences, what do we do
with them?
Well, we can use DRSs to draw
inferences.
In this lecture we show how to do that,
both in theory and in practice.
Overview
Inference tasks
Why FOL?
From model theory to proof theory
Inference engines
From DRT to FOL
Adding world knowledge
Doing it locally
The inference tasks
The consistency checking task: is the discourse free of contradiction, i.e., satisfiable in some model?
The informativity checking task: does the latest sentence add information, i.e., is it not already entailed by the previous discourse?
Why First-Order Logic?
Why not use higher-order logic?
Better match with formal semantics
But: Undecidable/no fast provers available
Why not use weaker logics?
Modal/description logics (decidable fragments)
But: Can’t cope with all of natural language
Why use first-order logic?
Undecidable, but good inference tools available
There is a translation from DRSs to first-order logic
Easy to add world knowledge
Axioms encode world knowledge
We can write down axioms capturing the information
For example, lexical knowledge, world
knowledge, information about the structure of
time, events, etc.
By the Deduction Theorem
φ1, …, φn |= ψ
iff
|= (φ1 & … & φn) → ψ
That is, inference reduces to validity of
formulas.
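For instance (our illustration, not on the slide): man(vincent), ∀x(man(x) → mortal(x)) |= mortal(vincent) holds iff |= (man(vincent) & ∀x(man(x) → mortal(x))) → mortal(vincent). Entailment from finitely many premises thus reduces to the validity of a single formula, which is exactly the question a theorem prover answers.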
From model theory to proof theory
The inference tasks were defined
semantically
For computational purposes, we need
symbolic definitions
We need to move from the concept
of |= to the concept of ⊢
In other words, from validity to provability
Soundness
If provable then valid:
If ⊢ φ then |= φ
Soundness is a 'no garbage' condition
Completeness
If valid then provable:
If |= φ then ⊢ φ
Completeness means that proof theory
has captured model theory
Decidability
A problem is decidable if a computer is
guaranteed to halt in finite time on any
input and give you a correct answer
A problem that is not decidable is
undecidable
First-order logic is undecidable
What does this mean?
It is not possible to write a program
that is guaranteed to halt when given
any first-order formula and to tell you
correctly whether or not that formula is valid.
Sounds pretty bad!
Good news
FOL is semi-decidable
What does that mean?
If a formula is in fact valid, it is always
possible to verify this fact symbolically in
finite time
That is, things only go wrong for FOL
when it is asked to tackle something
that is not valid
On some non-valid inputs, any algorithm is
bound not to terminate
Put differently
Half the task, namely determining validity, is
fairly reasonable.
The other half, showing non-validity or, equivalently, satisfiability, is harder.
This duality is reflected in the fact that there
are two fundamental computational inference
tools for FOL:
theorem provers
and model builders
Theorem provers
Basic thing they do is show that a
formula is provable/valid.
There are many efficient off-the-shelf
provers available for FOL
Theorem proving technology is now
nearly 40 years old and extremely
sophisticated
Examples:
Vampire, Spass, Bliksem, Otter
Theorem provers and informativity
Given a formula φ, a theorem prover will try
to prove φ, that is, to show that it is
valid/uninformative
If φ is valid/uninformative then, in theory, the
theorem prover will always succeed
So theorem provers are a
negative test for informativity
If the formula is not valid/uninformative, all
bets are off.
Theorem provers and consistency
Given a formula φ, a theorem prover will try
to prove ¬φ, that is, to show that φ is
inconsistent
If φ is inconsistent then, in theory, the theorem
prover will always succeed
So theorem provers are also a negative
test for consistency
If the formula is not inconsistent, all bets
are off.
Model builders
Basic thing that model builders do is try
to generate a [usually] finite model for a
formula. They do so by iteration over
model size.
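As a sketch of that iteration (our illustration; satisfiable_with_domain is a stand-in for a call into a Mace- or Paradox-style backend, not a real API):

def find_finite_model(phi, satisfiable_with_domain, max_size=10):
    # Try domains of size 1, 2, ..., max_size in turn; return the first
    # model found, or None if every size fails within the bound.
    for n in range(1, max_size + 1):
        model = satisfiable_with_domain(phi, n)  # a model with |D| = n, or None
        if model is not None:
            return model
    return None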
Model building for FOL is a rather new
field, and there are not many model
builders available.
It is also an intrinsically hard task;
harder than theorem proving.
Examples: Mace, Paradox, Sem.
Model builders and consistency
Given a formula φ, a model builder will try to
build a model for φ, that is, to show that φ is
consistent
If φ is consistent, and satisfiable on a finite
model, then, in theory, the model builder will
always succeed
So model builders are a partial positive
test for consistency
If the formula is not consistent, or it is not
satisfiable on a finite model, all bets are off.
Finite model property
A logic has the finite model property if
every satisfiable formula is satisfiable
on a finite model.
Many decidable logics have this
property.
But it is easy to see that FOL lacks this
property.
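To see why (an illustration not on the slide): the formula ∀x∃y R(x,y) & ∀x ¬R(x,x) & ∀x∀y∀z((R(x,y) & R(y,z)) → R(x,z)) is satisfiable, for instance on the natural numbers with <, but it has no finite model: in a finite domain, following R-successors must eventually revisit an element, and transitivity then forces R(a,a) for some a, contradicting irreflexivity.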
Model builders and informativity
Given a formula φ, a model builder will try to
build a model for ¬φ, that is, to show that φ is
informative
If ¬φ is satisfiable on a finite model then, in
theory, the model builder will always succeed
So model builders are a partial positive
test for informativity
If ¬φ is not satisfiable on a finite
model, all bets are off.
Yin and Yang of Inference
Theorem Proving and Model Building
function as opposite forces
Doing it in parallel
We have general negative tests
[theorem provers], and partial positive
tests [model builders]
Why not try to get the best of both worlds, by
running these tests in parallel?
That is, given a formula φ we wish to test
for informativity/consistency, we hand it
to both a theorem prover and a model
builder at once
When one succeeds, we halt the other
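A minimal sketch of this parallel regime for the consistency test, in Python. Everything here is our illustration: prove(formula) and build_model(formula) are assumed wrappers around an off-the-shelf prover and model builder (say Vampire and Mace; build_model could be find_finite_model from above with its backend and bound fixed), and the ~ negation syntax and 30-second limit are arbitrary choices:

from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def check_consistency(phi, prove, build_model, timeout=30):
    # Race the two engines: a proof of ~phi means phi is inconsistent,
    # a model of phi means phi is consistent. Neither call is guaranteed
    # to terminate, so we stop waiting at the first answer or at timeout.
    pool = ThreadPoolExecutor(max_workers=2)
    proving = pool.submit(prove, "~(" + phi + ")")
    building = pool.submit(build_model, phi)
    done, _ = wait({proving, building}, timeout=timeout,
                   return_when=FIRST_COMPLETED)
    try:
        if proving in done and proving.result():
            return False   # theorem prover proved ~phi: inconsistent
        if building in done and building.result() is not None:
            return True    # model builder found a model of phi: consistent
        return None        # no verdict (a fuller version would keep waiting
                           # for the other engine if the first merely gave up)
    finally:
        # Threads cannot be killed from outside; a real harness would run
        # the engines as subprocesses and kill the loser.
        pool.shutdown(wait=False)

The same harness does informativity checking by handing prove the formula itself and build_model its negation.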
Parallel Consistency Checking
Suppose we want to test φ [representing the
latest sentence] for consistency with respect to
the previous discourse
Then:
If a theorem prover succeeds in finding a proof
of PREV → ¬φ, then φ is inconsistent
If a model builder succeeds in constructing a model
for PREV & φ, then φ is consistent
Why is this relevant to
natural language?
Testing a discourse for consistency
(?? = no answer: the engine need not terminate on that input):

Discourse                     Theorem prover   Model builder
Vincent is a man.             ??               Model
Mia loves every man.          ??               Model
Mia does not love Vincent.    Proof            ??
Parallel informativity checking
Suppose we want to test the formula φ
[representing the latest sentence] for
informativity with respect to the previous discourse
Then:
If a theorem prover succeeds in finding
a proof of PREV → φ, then φ is not informative
If a model builder succeeds in constructing
a model for PREV & ¬φ, then φ is informative
Why is this relevant to
natural language?
Testing a discourse for informativity:

Discourse                 Theorem prover   Model builder
Vincent is a man.         ??               Model
Mia loves every man.      ??               Model
Mia loves Vincent.        Proof            ??
Let's apply this to DRT
Pretty clear what we need to do:
Find efficient theorem provers for DRT
Find efficient model builders for DRT
Run them in parallel
And Bob's your uncle!
Recall that theorem provers are more
established technology than model
builders
So let's start by finding an efficient
theorem prover for DRT…
Googling theorem provers for DRT
Theorem proving in DRT
Oh no!
Nothing there, efficient or otherwise.
Let's build our own.
One phone call to Voronkov later:
Oops --- does it take that long to build one
from scratch?
Oh dear.
Googling theorem provers for FOL
Use FOL inference
technology for DRT
There are a lot of FOL provers available
and they are extremely efficient
There are also some interesting freely
available model builders for FOL
We have said several times that DRT
is FOL in disguise, so let's get precise
about this and put this observation to
work
From DRT to FOL
Compile DRS into standard FOL syntax
Use off-the-shelf inference engines for
FOL
Okay --- how do we do this?
Translation function (…)^fo
[Boxes are written linearly: [x1 … xn | C1, …, Cn] is the DRS with discourse referents x1 … xn and conditions C1, …, Cn]
Translating DRT to FOL: DRSs
([x1 … xn | C1, …, Cn])^fo = ∃x1 … ∃xn((C1)^fo & … & (Cn)^fo)
Translating DRT to FOL: Conditions
(R(x1, …, xn))^fo = R(x1, …, xn)
(x1 = x2)^fo = x1 = x2
(¬B)^fo = ¬(B)^fo
(B1 ∨ B2)^fo = (B1)^fo ∨ (B2)^fo
Translating DRT to FOL:
Implicative DRS-conditions
([x1 … xm | C1, …, Cn] ⇒ B)^fo = ∀x1 … ∀xm(((C1)^fo & … & (Cn)^fo) → (B)^fo)
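These clauses transcribe almost directly into code. Below is a minimal sketch in Python; the tuple encoding of DRSs is our ad-hoc choice, not a standard representation: a DRS is ('drs', referents, conditions), and a condition is ('pred', name, args), ('eq', x, y), ('not', B), ('or', B1, B2) or ('imp', B1, B2), with B, B1, B2 again DRSs. The output is a formula string with exists/forall spelled out:

def drs_to_fol(drs):
    # ([x1 ... xn | C1, ..., Cn])^fo = exists x1 ... xn ((C1)^fo & ... & (Cn)^fo)
    _, refs, conds = drs
    body = " & ".join(cond_to_fol(c) for c in conds)
    for x in reversed(refs):
        body = f"exists {x} ({body})"
    return body

def cond_to_fol(cond):
    tag = cond[0]
    if tag == "pred":    # (R(x1,...,xn))^fo = R(x1,...,xn)
        return f"{cond[1]}({','.join(cond[2])})"
    if tag == "eq":      # (x1 = x2)^fo = x1 = x2
        return f"{cond[1]} = {cond[2]}"
    if tag == "not":     # (not B)^fo = ~(B)^fo
        return f"~({drs_to_fol(cond[1])})"
    if tag == "or":      # (B1 or B2)^fo = (B1)^fo | (B2)^fo
        return f"({drs_to_fol(cond[1])} | {drs_to_fol(cond[2])})"
    if tag == "imp":     # ([x1...|C1...] => B)^fo = forall x1...((C1...) -> (B)^fo)
        _, refs, conds = cond[1]
        ante = " & ".join(cond_to_fol(c) for c in conds)
        body = f"(({ante}) -> {drs_to_fol(cond[2])})"
        for x in reversed(refs):
            body = f"forall {x} {body}"
        return body
    raise ValueError(f"unknown DRS-condition: {cond!r}")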
Two example translations
Example 1:
[x | man(x), walk(x)]
Example 2:
[y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]]
Example 1
([x | man(x), walk(x)])^fo
= ∃x((man(x))^fo & (walk(x))^fo)
= ∃x(man(x) & (walk(x))^fo)
= ∃x(man(x) & walk(x))
Example 2
([y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]])^fo
= ∃y((woman(y))^fo & ([x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)])^fo)
= ∃y(woman(y) & ([x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)])^fo)
= ∃y(woman(y) & ∀x((man(x))^fo → ([e | adore(e), agent(e,x), theme(e,y)])^fo))
= ∃y(woman(y) & ∀x(man(x) → ([e | adore(e), agent(e,x), theme(e,y)])^fo))
= ∃y(woman(y) & ∀x(man(x) → ∃e((adore(e))^fo & (agent(e,x))^fo & (theme(e,y))^fo)))
= ∃y(woman(y) & ∀x(man(x) → ∃e(adore(e) & (agent(e,x))^fo & (theme(e,y))^fo)))
= ∃y(woman(y) & ∀x(man(x) → ∃e(adore(e) & agent(e,x) & (theme(e,y))^fo)))
= ∃y(woman(y) & ∀x(man(x) → ∃e(adore(e) & agent(e,x) & theme(e,y))))
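Running the translation sketch from above on Example 2, in the same ad-hoc encoding, reproduces this result (modulo redundant parentheses):

example2 = ("drs", ["y"],
            [("pred", "woman", ["y"]),
             ("imp",
              ("drs", ["x"], [("pred", "man", ["x"])]),
              ("drs", ["e"], [("pred", "adore", ["e"]),
                              ("pred", "agent", ["e", "x"]),
                              ("pred", "theme", ["e", "y"])]))])

print(drs_to_fol(example2))
# exists y (woman(y) & forall x ((man(x)) -> exists e (adore(e) & agent(e,x) & theme(e,y))))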
Basic setup
DRS:
[x y | vincent(x), mia(y), love(x,y)]
FOL: ∃x∃y(vincent(x) & mia(y) & love(x,y))
Model:
D = {d1}
F(vincent) = {d1}
F(mia) = {d1}
F(love) = {(d1,d1)}
Note that nothing rules out this degenerate model, in which Vincent and Mia are the same individual.
Background Knowledge (BK)
Need to incorporate BK
Formulate BK in terms of first-order
axioms
Rather than just giving φ to the theorem
prover (or model builder),
we give it: BK & φ or BK → φ
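In terms of the earlier sketches, adding BK is just a matter of wrapping the translated DRS before it reaches the engines. The axiom strings below are illustrative renderings of the axioms on the next slide:

bk_axioms = [
    "forall x (vincent(x) -> man(x))",
    "forall x (mia(x) -> woman(x))",
    "forall x (man(x) -> ~woman(x))",
]

def with_bk(phi, axioms=bk_axioms):
    # BK & phi: what we hand to the model builder for a consistency test.
    bk = " & ".join(f"({a})" for a in axioms)
    return f"({bk}) & ({phi})"

# For an informativity test we would instead hand the prover BK -> phi,
# i.e. f"({bk}) -> ({phi})".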
Basic setup
DRS:
[x y | vincent(x), mia(y), love(x,y)]
FOL: ∃x∃y(vincent(x) & mia(y) & love(x,y))
BK: ∀x(vincent(x) → man(x))
∀x(mia(x) → woman(x))
∀x(man(x) → ¬woman(x))
Model:
D = {d1,d2}
F(vincent) = {d1}
F(mia) = {d2}
F(love) = {(d1,d2)}
With the BK axioms the one-element model is ruled out (d1 would have to be both a man and a woman), so the model builder now returns a model in which Vincent and Mia are distinct.
Local informativity
Example:
Mia is the wife of Marsellus.
If Mia is the wife of Marsellus, Vincent will
be disappointed.
The second sentence is informative
with respect to the first. But…
Local informativity
DRS for the first sentence:
[x y | mia(x), marsellus(y), wife-of(x,y)]
DRS after the second sentence:
[x y z | mia(x), marsellus(y), wife-of(x,y), vincent(z),
[ | wife-of(x,y)] ⇒ [ | disappointed(z)]]
The antecedent of the conditional is uninformative in its local context: wife-of(x,y) is already established there.
Local consistency
Example:
Jules likes big kahuna burgers.
If Jules does not like big kahuna burgers,
Vincent will order a whopper.
The second sentence is consistent with
respect to the first. But…
Local consistency
DRS for the first sentence:
[x y | jules(x), big-kahuna-burgers(y), like(x,y)]
DRS after the second sentence:
[x y z | jules(x), big-kahuna-burgers(y), like(x,y), vincent(z),
[ | ¬[ | like(x,y)]] ⇒ [u | order(z,u), whopper(u)]]
The antecedent of the conditional is inconsistent with its local context: like(x,y) is already established there.
DRT and local inference
Because a DRS groups information into
contexts, we now have a natural means
to check not only global, but also local
consistency and informativity (a sketch follows below).
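A minimal sketch of these local checks, reusing the hypothetical prove() wrapper from the parallel-checking sketch. Here context_fol is the FOL translation of the material visible at the sub-DRS (global context plus enclosing boxes, with its discourse referents read as constants, one simple choice among several), and sub_fol is the translation of the sub-DRS itself:

def locally_uninformative(context_fol, sub_fol, prove):
    # The sub-DRS adds nothing in its local context if the context entails
    # it (the wife-of example: the antecedent is entailed, hence uninformative).
    return prove(f"({context_fol}) -> ({sub_fol})")

def locally_inconsistent(context_fol, sub_fol, prove):
    # The sub-DRS contradicts its local context if the context entails its
    # negation (the big kahuna burger example).
    return prove(f"({context_fol}) -> ~({sub_fol})")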
Important for dealing with
presupposition.
Presupposition is not about strange
logic, but about using classical logic in
new ways.
Tomorrow
Presupposition and Anaphora in DRT