PPT - Computational Linguistics and Phonetics


Computational Semantics
http://www.coli.uni-sb.de/cl/projects/milca/esslli
Day II: A Modular Architecture
Aljoscha Burchardt,
Alexander Koller,
Stephan Walter,
Universität des Saarlandes,
Saarbrücken, Germany
ESSLLI 2004, Nancy, France
Computing Semantic Representations
• Yesterday:
– λ-calculus is a nice tool for systematic meaning construction.
– We saw a first, sketchy implementation.
– Some things still to be done.
• Today:
– Let's fix the problems.
– Let's build nice software.
Yesterday: -Calculus
• Semantic representations constructed along
the syntax tree: How to get there?
By using functional application
• s help to guide arguments in the right
place on -reduction:
x.love(x,mary)@john
love(john,mary)
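This application-and-reduction step can be mimicked with a Python function standing in for the λ-term (an illustrative analogue over formula strings, not the course's Prolog code; `love_mary` is our name): applying the function is functional application, and evaluating the call is the β-reduction.

```python
# λx.love(x,mary) as a Python function over formula strings
# (illustrative analogue; the name is ours, not from the course code).
love_mary = lambda x: f"love({x},mary)"

# Applying it to "john" performs the beta-reduction step from the slide.
result = love_mary("john")
print(result)  # love(john,mary)
```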
Yesterday's disappointment
Our first idea for NPs with determiner didn't work out:
"A man" ~> ∃z.man(z)
"A man loves Mary" ~> * love(∃z.man(z),mary)
But what was the idea after all?
Nothing!
∃z.man(z) just isn't the meaning of "a man".
If anything, it translates the complete sentence "There is a man".
Let's try again, systematically…
A solution
What we want is:
"A man loves Mary" ~> ∃z(man(z) ∧ love(z,mary))
What we have is:
"man" ~> λy.man(y)
"loves Mary" ~> λx.love(x,mary)
How about:
∃z(man(z) ∧ love(z,mary))
= ∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
Remember: We can use variables for any kind of term.
So next:
λP.λQ.∃z(P(z) ∧ Q(z)) <~ "A"
(λP.λQ.∃z(P(z) ∧ Q(z)) @ λy.man(y)) @ λx.love(x,mary)
⇒ ∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
But…
"A man … loves Mary"
λP.λQ.∃z(P(z) ∧ Q(z)) @ λy.man(y) @ λx.love(x,mary)
⇒ λQ.∃z(man(z) ∧ Q(z)) @ λx.love(x,mary)
⇒ ∃z(man(z) ∧ λx.love(x,mary)(z))
⇒ ∃z(man(z) ∧ love(z,mary))
"John … loves Mary"
λx.love(x,mary)@john: not systematic!
john@λx.love(x,mary): not reducible!
λP.(P@john)@λx.love(x,mary): better!
⇒ λx.love(x,mary)@john
⇒ love(john,mary): fine!
So:
"John" ~> λP.P(john)
Transitive Verbs
What about transitive verbs (like "love")?
"loves" ~> λy.λx.love(x,y) won't do:
"Mary" ~> λQ.Q(mary)
"loves Mary" ~> λy.λx.love(x,y)@λQ.Q(mary)
⇒ λx.love(x,λQ.Q(mary)) ???
How about something a little more complicated:
"loves" ~> λR.λx.(R@λy.love(x,y))
The only way to understand this is to see it in action...
"John loves Mary" again...
λP.P(john) @ ( λR.λx.(R@λy.love(x,y)) @ λQ.Q(mary) )
(John) ...... (loves) .................. (Mary)
⇒ λP.P(john) @ λx.(λQ.Q(mary)@λy.love(x,y))
⇒ λP.P(john) @ λx.(λy.love(x,y)(mary))
⇒ λP.P(john) @ λx.love(x,mary)
⇒ λx.love(x,mary)(john)
⇒ love(john,mary)
Summing up
• nouns: "man" ~> λx.man(x)
• intransitive verbs: "smoke" ~> λx.smoke(x)
• determiner: "a" ~> λP.λQ.∃z(P(z) ∧ Q(z))
• proper names: "mary" ~> λP.P(mary)
• transitive verbs: "love" ~> λR.λx.(R@λy.love(x,y))
Today's first success
What we can do now (and could not do yesterday):
• Complex NPs (with determiners)
• Transitive verbs
… and all in the same way.
Key ideas:
• Extra λs for NPs
• Variables for predicates
• Apply subject NP to VP
Yesterday’s implementation
s(VP@NP) --> np(NP),vp(VP).
np(john) --> [john].
np(mary) --> [mary].
tv(lambda(X,lambda(Y,love(Y,X)))) --> [loves],
{vars2atoms(X),vars2atoms(Y)}.
iv(lambda(X,smoke(X))) --> [smokes], {vars2atoms(X)}.
iv(lambda(X,snore(X))) --> [snorts], {vars2atoms(X)}.
vp(TV@NP) --> tv(TV),np(NP).
vp(IV) --> iv(IV).
% This doesn't work!
np(exists(X,man(X))) --> [a,man], {vars2atoms(X)}.
Was this a good implementation?
A Nice Implementation
What is a nice implementation?
It should be:
– Scalable: If it works with five examples, upgrading to
5000 shouldn’t be a great problem (e.g. new
constructions in the grammar, more words...)
– Re-usable: Small changes in our ideas about the system
shouldn’t lead to complex changes in the
implementation (e.g. a new representation language)
Solution: Modularity
• Think about your problem in terms of interacting
conceptual components
• Encapsulate these components into modules of your
implementation, with clean and abstract pre-defined
interfaces to each other
• Extend or change modules to scale / adapt the
implementation
Another look at yesterday’s
implementation
• Okay, because it was small
• Not modular at all: all linguistic functionality in one file, packed
inside the DCG
• E.g. scalability of the lexicon: Always have to write new rules,
like:
tv(lambda(X,lambda(Y,visit(Y,X)))) --> [visit],
{vars2atoms(X),vars2atoms(Y)}.
• Changing parts for adaptation? Change every single rule!
Let's modularize!
Semantic Construction: Conceptual Components
[Diagram: "John smokes" goes into a black box; smoke(j) comes out.]
Semantic Construction: Inside the Black Box
[Diagram: the black box splits into Syntax (the DCG) and Semantics. On the phrasal (combinatorial) level, DCG rules are paired with combine-rules; on the word (lexical) level, DCG rules are paired with lexicon-facts.]
The DCG-rules mainly tell us which phrases are acceptable.
Their basic structure is:
s(...) --> np(...), vp(...), {...}.
np(...) --> det(...), noun(...), {...}.
np(...) --> pn(...), {...}.
vp(...) --> tv(...), np(...), {...}.
vp(...) --> iv(...), {...}.
(The gaps will be filled later on)
combine-rules
The combine-rules encode the actual semantic construction
process. That is, they glue representations together using @:
combine(s:(NP@VP),[np:NP,vp:VP]).
combine(np:(DET@N),[det:DET,n:N]).
combine(np:PN,[pn:PN]).
combine(vp:IV,[iv:IV]).
combine(vp:(TV@NP),[tv:TV,np:NP]).
Lexicon
The lexicon-facts hold the elementary information connected to
words:
lexicon(noun,bird,[bird]).
lexicon(pn,anna,[anna]).
lexicon(iv,purr,[purrs]).
lexicon(tv,eat,[eats]).
Their slots contain:
1. syntactic category
2. constant / relation symbol (“core” semantics)
3. the surface form of the word.
Interfaces
[Diagram: as before, but with the interfaces made explicit: combine-calls connect the phrasal (combinatorial) DCG rules to the combine-rules; lexicon-calls and semantic macros connect the word-level (lexical) DCG rules to the lexicon-facts.]
Interfaces in the DCG
Information is transported between the three components of our system by
additional calls and variables in the DCG:
• Lexical rules are now fully abstract. We have one for each category (iv, tv, n,
...). The DCG uses lexicon-calls and semantic macros like this:
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
pn(PN)--> {lexicon(pn,Sym,Word),pnSem(Sym,PN)}, Word.
• In the combinatorial rules, using combine-calls like this:
vp(VP)--> iv(IV),{combine(vp:VP,[iv:IV])}.
s(S)--> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
Interfaces: How they work
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
When this rule applies, the syntactic analysis component:
• looks up the Word found in the string (e.g. "smokes"), ...
• ... checks that its category is iv, ...
• ... and retrieves the relation symbol Sym to be used in the semantic construction.
So we have, from lexicon(iv, smoke, [smokes]):
Word = [smokes]
Sym = smoke
Interfaces: How they work II
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
Then, the semantic construction component:
• takes Sym = smoke ...
• ... and uses the semantic macro ivSem: the call ivSem(Sym,IV) becomes
ivSem(smoke,IV), which matches ivSem(smoke,lambda(X,smoke(X))) ...
• ... to turn it into a full semantic representation for an intransitive verb:
IV = lambda(X, smoke(X))
The DCG-rule is now fully instantiated and looks like this:
iv(lambda(X, smoke(X)))-->
{lexicon(iv,smoke,[smokes]), ivSem(smoke, lambda(X, smoke(X)))},
[smokes].
What's inside a semantic macro?
Semantic macros simply specify how to make a valid semantic
representation out of a naked symbol. The one we've just seen in
action for the verb "smokes" was:
ivSem(Sym,lambda(X,Formula)) :- compose(Formula,Sym,[X]).
compose builds a first-order formula out of Sym and a new variable X:
Formula = smoke(X)
This is then embedded into a λ-abstraction over the same X:
lambda(X, smoke(X))
Another one, without compose:
pnSem(Sym,lambda(P,P@Sym)).
john ~> lambda(P,P@john)
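The slides do not show compose itself; as a minimal sketch of what it must do, here is a Python analogue over strings (`compose`, `iv_sem` and `pn_sem` are our names; the real compose lives in the course's Prolog code):

```python
# A Python analogue of the compose step (assumption: compose builds
# the atomic formula sym(arg1,...,argN) from a symbol and argument list).
def compose(sym, args):
    return f"{sym}({','.join(args)})"

# ivSem: compose the formula over a variable X, then wrap it in an
# abstraction, written here in the course's Prolog-style notation.
def iv_sem(sym):
    return f"lambda(X,{compose(sym, ['X'])})"

# pnSem needs no compose: it just plugs the symbol in directly.
def pn_sem(sym):
    return f"lambda(P,P@{sym})"

print(iv_sem("smoke"))  # lambda(X,smoke(X))
print(pn_sem("john"))   # lambda(P,P@john)
```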
Syntax and Semantics: the whole system at work on "John smokes"
Phrases (combinatorial):
s(S)--> np(NP), vp(VP),{combine(s:S,[np:NP,vp:VP])}.
np(NP) --> …,pn(PN): NP = PN = lambda(P,P@john)
vp(VP) --> …,iv(IV): VP = IV = lambda(X,smoke(X))
Words (lexical):
pn(PN) --> …,[john]: lexicon(pn,john,[john]) gives Word = [john] and Sym = john; pnSem(Sym,PN) gives PN = lambda(P,P@john)
iv(IV) --> …,[smokes]: lexicon(iv,smoke,[smokes]) gives Word = [smokes] and Sym = smoke; ivSem(Sym,IV) gives IV = lambda(X,smoke(X))
A look at combine
combine(s:(NP@VP),[np:NP,vp:VP]).
S = NP@VP
NP = lambda(P,P@john)
VP = lambda(X,smoke(X))
So:
S = lambda(P,P@john)@lambda(X,smoke(X))
That's almost all, folks…
betaConvert(lambda(P,P@john)@lambda(X,smoke(X)), Converted)
Converted = smoke(john)
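The course's actual betaConvert is part of its Prolog code; as a sketch of what such a normalizer does, here is a Python analogue over our own tuple encoding (all names and the encoding are ours; there is no α-renaming, so it only handles examples without variable capture, like this one):

```python
# Term encoding (our assumption): variables/constants are strings,
# ('lam', v, body) stands for lambda(v,body), ('app', f, a) for f@a,
# and tuples like ('smoke', 'X') are atomic formulas.

def substitute(term, var, value):
    """Replace occurrences of var in term by value, stopping at re-binding."""
    if term == var:
        return value
    if isinstance(term, tuple):
        if term[0] == 'lam' and term[1] == var:
            return term  # var is re-bound here: leave the subterm alone
        return tuple(substitute(t, var, value) for t in term)
    return term

def beta_convert(term):
    """Normalize: reduce every ('app', ('lam', v, body), arg) redex."""
    if isinstance(term, tuple):
        term = tuple(beta_convert(t) for t in term)
        if term[0] == 'app' and isinstance(term[1], tuple) and term[1][0] == 'lam':
            return beta_convert(substitute(term[1][2], term[1][1], term[2]))
    return term

# lambda(P,P@john) @ lambda(X,smoke(X))  ~>  smoke(john)
s = ('app', ('lam', 'P', ('app', 'P', 'john')),
            ('lam', 'X', ('smoke', 'X')))
print(beta_convert(s))  # ('smoke', 'john')
```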
Little Cheats
A few “special words” are dealt with in a somewhat different manner:
Determiners: ("every man")
• No semantic Sym in the lexicon:
lexicon(det,_,[every],uni).
• Semantic representation generated by the macro alone:
detSem(uni,lambda(P,lambda(Q,forall(X,(P@X)>(Q@X))))).
Negation – same thing: ("does not walk")
• No semantic Sym in the lexicon:
lexicon(mod,_,[does,not],neg).
• Representation solely from macro:
modSem(neg,lambda(P,lambda(X,~(P@X)))).
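Played through in the same closure style as before (our illustrative sketch, not the course code; the variable X is fixed rather than fresh), the two macro-only entries behave like this:

```python
# detSem(uni): "every" ~> λP.λQ.∀X(P(X) → Q(X)), with > for implication
# as in the Prolog notation above (illustrative analogue over strings).
det_uni = lambda p: lambda q: f"forall(X,{p('X')} > {q('X')})"

# modSem(neg): "does not" ~> λP.λx.¬(P@x)
mod_neg = lambda p: lambda x: f"~{p(x)}"

man  = lambda x: f"man({x})"
walk = lambda x: f"walk({x})"

# "every man does not walk"
print(det_uni(man)(mod_neg(walk)))  # forall(X,man(X) > ~walk(X))
```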
The code that's online
(http://www.coli.uni-sb.de/cl/projects/milca/esslli)
• lexicon-facts have fourth argument for any kind of additional
information:
lexicon(tv,eat,[eats],fin).
e.g. fin/inf, gender
• iv/tv have an additional argument for infinitive/finite: e.g. "eat" vs. "eats"
iv(I,IV)--> {lexicon(iv,Sym,Word,I),…}, Word.
• limited coordination, hence doubled categories: e.g. "talks and walks"
vp2(VP2)--> vp1(VP1A), coord(C), vp1(VP1B),
{combine(vp2:VP2,[vp1:VP1A,coord:C,vp1:VP1B])}.
vp1(VP1)--> v2(fin,V2),
{combine(vp1:VP1,[v2:V2])}.
A demo
lambda :- readLine(Sentence),
parse(Sentence,Formula),
resetVars, vars2atoms(Formula),
betaConvert(Formula,Converted),
printRepresentations([Converted]).
Evaluation
Our new program has become much bigger, but it's…
• Modular: everything's in its right place:
– Syntax in englishGrammar.pl
– Semantics (macros + combine) in lambda.pl
– Lexicon in lexicon.pl
• Scalable: E.g. extend the lexicon by adding facts to lexicon.pl
• Re-usable: E.g. to change the semantic construction method (e.g. to
CLLS on Thursday), change only lambda.pl and keep the rest
What we've done today
• Complex NPs, PNs and TVs in λ-based semantic construction
• A clean semantic construction framework in Prolog
• Its instantiation for λ-based semantic construction
Ambiguity
• Some sentences have more than one reading, i.e.
more than one semantic representation.
• Standard Example: "Every man loves a woman":
– Reading 1: the women may be different
∀x(man(x) → ∃y(woman(y) ∧ love(x,y)))
– Reading 2: there is one particular woman
∃y(woman(y) ∧ ∀x(man(x) → love(x,y)))
• What does our system do?
Excursion: lambda, variables and atoms
• Question yesterday: Why don't we use Prolog
variables for FO-variables?
• Advantage (at first sight): β-reduction as
unification:
betaReduce(lambda(X,F)@X,F).
Now, with X = john and F = walk(X) ("John walks"), the query
betaReduce(lambda(X,walk(X))@john,F)
unifies with the clause head as
betaReduce(lambda(john,walk(john))@john,walk(john))
so F = walk(john)
Nice, but…
Problem: Coordination
"John and Mary"
(λX.λY.λP.((X@P) ∧ (Y@P)) @ λQ.Q(john)) @ λR.R(mary)
⇒ λP.((λQ.Q(john)@P) ∧ (λR.R(mary)@P))
⇒ λP.(P(john) ∧ P(mary))
"John and Mary walk"
λP.(P(john) ∧ P(mary)) @ λx.walk(x)
⇒ λx.walk(x)@john ∧ λx.walk(x)@mary
In Prolog notation:
lambda(X,walk(X))@john & lambda(X,walk(X))@mary
β-reduction as unification now requires:
X = john
X = mary
… but X cannot be bound to both at once!
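Why this fails can be simulated with a toy unification-style binding table (our sketch, not Prolog): a variable bound once cannot be re-bound, so once X is john, binding it to mary must fail.

```python
# beta-reduction as unification: one global binding per variable
# (illustrative sketch; the function name is ours).
def bind(bindings, var, value):
    """Unification-style binding: a variable can have only one value."""
    if bindings is None:
        return None              # earlier failure propagates
    if var in bindings and bindings[var] != value:
        return None              # clash -> reduction fails
    bindings = dict(bindings)
    bindings[var] = value
    return bindings

# "John and Mary walk": lambda(X,walk(X))@john & lambda(X,walk(X))@mary,
# where both redexes share the SAME Prolog variable X.
b = bind({}, 'X', 'john')        # first conjunct:  X = john
b = bind(b, 'X', 'mary')         # second conjunct: X = mary -> clash!
print(b)                         # None
```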