Natural Language Processing


Semantics Interpretation
Allen’s Chapter 9
J&M’s Chapter 15
1
Rule-by-rule semantic interpretation
• Computing Logical forms (i.e., Semantic
Interpretation) can be coupled with parsing
• A syntactic tree can be generated from a specified
logical form (i.e., Semantic Realization)
• To couple syntax and semantics, the meaning of every
constituent must be determined
• By using features, the relation between the
meaning of a constituent and that of its sub
constituents can be specified in the grammar
• Each syntactical rule is associated with a semantic
interpretation rule (this is called Rule-by-Rule
semantic interpretation)
2
Semantic interpretation and compositionality
• Semantic interpretation is assumed to be a
compositional process (similar to parsing)
• The meaning of a constituent is derived
from the meaning of its sub constituents
• Interpretations can be built incrementally
from the interpretations of sub phrases
• Compositional models make grammars
easier to extend and maintain
3
Semantic interpretation and compositionality
• Integrating parsing and semantic interpretation is
harder than it may seem
• One classic obstacle is the mismatch between
syntactic structures and the corresponding
structures of the logical forms
• Syntactic structure of Jill loves every dog is:
((Jill) (loves (every dog)))
• Its unambiguous logical form is:
(EVERY d1 : (DOG1 d1) (LOVES1 l1 (NAME j1 “Jill”) d1))
4
Difficulties of Semantic interpretation via
compositionality
• In the syntactic structure, every dog is a sub
constituent of VP, whereas in the logical form the
situation is reversed: the LOVES1 predication is
nested inside the EVERY quantifier
• How could the meaning of every dog be
represented in isolation and then used to construct
the meaning of the sentence?
• Using quasi-logical forms is one way around this
problem
(LOVES1 l1 (NAME j1 “Jill”) <EVERY d1 DOG1>)
5
Problem with Idioms
• Another obstacle for the compositionality assumption is
the presence of idioms
Jack kicked the bucket = Jack died
• Solution 1: assign the meaning to the entire
phrase, rather than building it incrementally
• What about: The bucket was kicked by Jack?
• Jack kicked the bucket is ambiguous between:
– (KICK1 k1 (NAME j1 “Jack”) <THE b1 BUCKET1>)
– (DIE1 d1 (NAME j1 “Jack”))
• Solution 2: add a new sense, die, for the verb kick,
subcategorized for an object of type BUCKET1
6
Semantic interpretation and compositionality
• In compositional semantic interpretation, we should be able to
assign a semantic structure to any syntactic constituent
• For instance, we should be able to assign a uniform form of
meaning to every verb phrase in any rule involving a VP
• The meaning of the VP in Jack laughed can be shown by a
unary predicate; the full sentence is:
(LAUGHED1 l1 (NAME j1 “Jack”))
• The VP in Jack loved Sue should also be represented by a
unary predicate, although the sentence has two arguments:
(LOVES1 l2 (NAME j1 “Jack”) (NAME s1 “Sue”))
• But how can we express such complex unary predicates?
• Lambda calculus provides a formalism for representing such
predicates
7
Lambda calculus
• Using lambda calculus, loved Sue is
represented as:
(λ x (LOVES1 l1 x (NAME s1 “Sue”)))
• We can apply a lambda expression to an
argument, by a process called lambda
reduction
((λ x (LOVES1 l1 x (NAME s1 “Sue”))) (NAME j1 “Jack”))
= (LOVES1 l1 (NAME j1 “Jack”) (NAME s1 “Sue”))
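The reduction above can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding, the "lambda" tag, and the names substitute and reduce_lambda are assumptions made for this sketch, and it does not rename variables to avoid capture.

```python
# Logical forms as nested tuples; ("lambda", var, body) is a hypothetical
# encoding of (λ var body) chosen for this sketch.

def substitute(form, var, value):
    """Replace every free occurrence of var in form with value."""
    if form == var:
        return value
    if isinstance(form, tuple):
        # Do not substitute under an inner lambda that rebinds the same variable.
        if len(form) == 3 and form[0] == "lambda" and form[1] == var:
            return form
        return tuple(substitute(part, var, value) for part in form)
    return form

def reduce_lambda(lam, arg):
    """One lambda reduction step: apply (λ var body) to arg."""
    tag, var, body = lam
    assert tag == "lambda"
    return substitute(body, var, arg)

# (λ x (LOVES1 l1 x (NAME s1 "Sue"))) applied to (NAME j1 "Jack")
loved_sue = ("lambda", "x", ("LOVES1", "l1", "x", ("NAME", "s1", "Sue")))
jack = ("NAME", "j1", "Jack")
result = reduce_lambda(loved_sue, jack)
print(result)
# ('LOVES1', 'l1', ('NAME', 'j1', 'Jack'), ('NAME', 's1', 'Sue'))
```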
8
Lambda calculus
• Lambda calculus provides a handy tool to
couple syntax and semantics
• Now even verb phrases with different
structures can easily be conjoined
Sue laughs and opens the door
(λ a (LAUGHS1 l1 a)) and
(λ a (OPENS1 o1 a <THE d1 DOOR1>)) can be conjoined into:
(λ a (& (LAUGHS1 l1 a) (OPENS1 o1 a <THE d1 DOOR1>)))
Applying it to (NAME s1 “Sue”) would produce the logical form:
(& (LAUGHS1 l1 (NAME s1 “Sue”))
(OPENS1 o1 (NAME s1 “Sue”) <THE d1 DOOR1>))
9
Lambda calculus (Cont.)
• Prepositional phrase modifiers in noun phrases could be
handled in different ways
1. Search for the location of modifiers in noun phrases and
incorporate them into the interpretations
• Works for the man in the store, but not for the man is in the
store or the man was thought to be in the store
2. Give an independent meaning to prepositional phrases
– in the store is represented by a unary predicate
(λ o (IN-LOC1 o <THE s1 STORE1>))
• The man in the store
(<THE m1 (& (MAN1 m1) (IN-LOC1 m1 <THE s1 STORE1>))>)
• The man is in the store
(IN-LOC1 <THE m1 MAN1> <THE s1 STORE1>)
10
Compositional approach to semantics
• In general, each major syntactic phrase
corresponds to a particular semantic
construction:
– VPs and PPs map to unary predicates,
– sentences map to propositions,
– NPs map to terms, and
– minor categories are used in building major
categories
11
A simple grammar and lexicon for
semantic interpretation
• By using features, logical forms can be computed
during parsing
• The main extension needed is a SEM feature,
which is added to each lexical entry and rule
• Example:
(S SEM (?semvp ?semnp)) → (NP SEM ?semnp) (VP SEM ?semvp)
NP with SEM (NAME m1 “Mary”)
VP with SEM (λ a (SEES1 e8 a (NAME j1 “Jack”)))
• Then the SEM feature of S is:
((λ a (SEES1 e8 a (NAME j1 “Jack”))) (NAME m1 “Mary”)), which is reduced to
(SEES1 e8 (NAME m1 “Mary”) (NAME j1 “Jack”))
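The S rule above can be mimicked with ordinary Python functions standing in for lambda expressions. This is a minimal sketch rather than the parser's actual mechanism; the tuple encoding and the names np_sem, vp_sem, and s_rule are assumptions made for illustration.

```python
# SEM values as nested tuples; a Python function stands in for (λ a ...).
np_sem = ("NAME", "m1", "Mary")

def vp_sem(a):
    # Plays the role of (λ a (SEES1 e8 a (NAME j1 "Jack")))
    return ("SEES1", "e8", a, ("NAME", "j1", "Jack"))

def s_rule(semnp, semvp):
    # (S SEM (?semvp ?semnp)) → (NP SEM ?semnp) (VP SEM ?semvp)
    # Calling the function performs the lambda reduction.
    return semvp(semnp)

s_sem = s_rule(np_sem, vp_sem)
print(s_sem)
# ('SEES1', 'e8', ('NAME', 'm1', 'Mary'), ('NAME', 'j1', 'Jack'))
```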
12
Mary sees Jack
13
Sample lexicon with SEM features
14
A sample grammar with SEM and
VAR features
• A feature called VAR is also needed to store the discourse variable
that corresponds to the constituents
• The VAR feature is automatically generated when a lexical
constituent is constructed from a word
• Then the VAR feature is passed up the parse tree by being treated
as a head feature
• This guarantees that the discourse variables are always unique
• If ?v is m1, SEM of the man is: <THE m1 (MAN1 m1)>
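One way to picture the VAR feature is a counter that hands out a fresh variable each time a lexical constituent is built. The sketch below is hypothetical: the VarGenerator class, the first-letter naming scheme, and the dictionary layout of constituents are assumptions, not part of the grammar in the slides.

```python
from collections import defaultdict

class VarGenerator:
    """Hands out fresh discourse variables such as m1, m2, d1."""

    def __init__(self):
        self._counts = defaultdict(int)

    def new_var(self, word):
        # Assumed naming scheme: first letter of the word plus a counter.
        letter = word[0].lower()
        self._counts[letter] += 1
        return f"{letter}{self._counts[letter]}"

def lexical_constituent(cat, word, sem, gen):
    # VAR is assigned once, when the word becomes a constituent.
    return {"CAT": cat, "WORD": word, "SEM": sem, "VAR": gen.new_var(word)}

def np_rule(art, cnp):
    # VAR is a head feature: the NP takes it from its head, the CNP.
    v = cnp["VAR"]
    return {"CAT": "NP", "VAR": v, "SEM": (art["SEM"], v, (cnp["SEM"], v))}

gen = VarGenerator()
the = lexical_constituent("ART", "the", "THE", gen)
man = lexical_constituent("N", "man", "MAN1", gen)
np = np_rule(the, man)
print(np["SEM"])  # ('THE', 'm1', ('MAN1', 'm1'))
```

A second occurrence of man in the same sentence would receive m2, which is what keeps the discourse variables unique.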
15
Morphological rules with SEM feature
• The lexical rules for morphological derivations must also be
modified to handle SEM features
• For instance, the rule for converting singular nouns to plural ones
takes the SEM of a singular noun and adds a PLUR operator (L7)
• A similar technique is used for tense operators (L1, L2, and L3)
16
Using Chart Parsing for semantic
interpretation
• Only two simple modifications are needed
to use the standard Chart Parser to handle
semantic interpretation:
– When a lexical rule is instantiated, the VAR
feature is set to a new discourse variable
– Whenever a constituent is built, its SEM is
simplified by possible lambda reductions
17
Jill saw the dog
18
Handling auxiliary verbs
• (VP SEM (λ a1 (?semaux (?semvp a1)))) →
(AUX SUBCAT ?v SEM ?semaux)
(VP VFORM ?v SEM ?semvp)
• If ?semaux is the modal operator CAN1,
and ?semvp is (λ x (LAUGHS1 e3 x))
• Then the SEM of the VP can laugh will be
(λ a1 (CAN1 ((λ x (LAUGHS1 e3 x)) a1)))
= (λ a1 (CAN1 (LAUGHS1 e3 a1)))
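The simplification step for can laugh reduces an application that sits inside another lambda. A rough sketch of such a simplifier follows; the tuple encoding, the convention that a pair (function, argument) denotes application, and the function names are all assumptions of this example, and variable capture is not handled.

```python
def substitute(form, var, value):
    """Replace free occurrences of var in form with value."""
    if form == var:
        return value
    if isinstance(form, tuple):
        if len(form) == 3 and form[0] == "lambda" and form[1] == var:
            return form  # var is rebound here; leave the inner lambda alone
        return tuple(substitute(p, var, value) for p in form)
    return form

def is_lambda(form):
    return isinstance(form, tuple) and len(form) == 3 and form[0] == "lambda"

def simplify(form):
    """Perform every possible lambda reduction inside form."""
    if not isinstance(form, tuple):
        return form
    form = tuple(simplify(p) for p in form)
    # An application is encoded as a pair (function, argument).
    if len(form) == 2 and is_lambda(form[0]):
        _, var, body = form[0]
        return simplify(substitute(body, var, form[1]))
    return form

semaux = "CAN1"
semvp = ("lambda", "x", ("LAUGHS1", "e3", "x"))
# (λ a1 (?semaux (?semvp a1))) with the values above filled in
can_laugh = ("lambda", "a1", (semaux, (semvp, "a1")))
print(simplify(can_laugh))
# ('lambda', 'a1', ('CAN1', ('LAUGHS1', 'e3', 'a1')))
```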
19
Handling Prepositional Phrases
• Prepositional phrases play two different
semantic roles:
1. PP can be a modifier to a noun phrase or a
verb phrase (cry in the corner)
2. PP may be needed by a head word, and the
preposition acts more as a term than as an
independent predicate (Jill decided on a
couch)
20
PP as a modifier of a noun phrase
• The appropriate rule for interpreting PP is:
(PP SEM (λ y (?semp y ?semnp))) →
(P SEM ?semp) (NP SEM ?semnp)
• If SEM of P is IN-LOC1, and SEM of NP
is <THE c1 (CORNER1 c1)>,
• Then the SEM of PP in the corner will be:
(λ y (IN-LOC1 y <THE c1 CORNER1>))
21
PP as a modifier of a noun phrase
• If the PP is used in the rule:
(CNP SEM (λ n (& (?semcnp n) (?sempp n)))) →
(CNP SEM ?semcnp) (PP SEM ?sempp)
• Then SEM of CNP man in the corner will be:
(λ n (& (MAN1 n) ((λ y (IN-LOC1 y <THE c1 CORNER1>)) n)))
• That is reduced to:
(λ n (& (MAN1 n) (IN-LOC1 n <THE c1 CORNER1>)))
• Then using rule
(NP VAR ?v SEM <?semart ?v (?semcnp ?v)>) →
(ART SEM ?semart) (CNP SEM ?semcnp)
• The SEM of the man in the corner will be:
<THE m1 ((λ y (& (MAN1 y) (IN-LOC1 y <THE c1 CORNER1>))) m1)>
• Which is reduced to:
<THE m1 (& (MAN1 m1) (IN-LOC1 m1 <THE c1 CORNER1>))>
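The chain of rules for the man in the corner can be traced with Python closures playing the part of lambda expressions. This is an illustrative sketch only; the function names and the tuple encoding of terms are assumptions.

```python
corner = ("THE", "c1", "CORNER1")

def pp_rule(semp, semnp):
    # (PP SEM (λ y (?semp y ?semnp))) → (P SEM ?semp) (NP SEM ?semnp)
    return lambda y: (semp, y, semnp)

def cnp_pp_rule(semcnp, sempp):
    # (CNP SEM (λ n (& (?semcnp n) (?sempp n)))) → (CNP SEM ?semcnp) (PP SEM ?sempp)
    return lambda n: ("&", semcnp(n), sempp(n))

def np_rule(semart, var, semcnp):
    # (NP VAR ?v SEM <?semart ?v (?semcnp ?v)>) → (ART SEM ?semart) (CNP SEM ?semcnp)
    return (semart, var, semcnp(var))

def man(n):
    # Plays the role of the CNP SEM for "man"
    return ("MAN1", n)

in_the_corner = pp_rule("IN-LOC1", corner)
man_in_the_corner = cnp_pp_rule(man, in_the_corner)
np_sem = np_rule("THE", "m1", man_in_the_corner)
print(np_sem)
# ('THE', 'm1', ('&', ('MAN1', 'm1'), ('IN-LOC1', 'm1', ('THE', 'c1', 'CORNER1'))))
```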
22
PP as a modifier of a verb phrase
• Appropriate rule is:
(VP VAR ?v SEM (λ x (& (?semvp x) (?sempp ?v)))) →
(VP VAR ?v SEM ?semvp) (PP SEM ?sempp)
• SEM of PP in the corner is:
(λ y (IN-LOC1 y <THE c1 CORNER1>)),
• SEM of VP of cry is:
(λ x (CRIES1 e1 x)),
• SEM of VP of cry in the corner will be:
(λ a (& (CRIES1 e1 a) (IN-LOC1 e1
<THE c1 CORNER1>)))
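The key point of the VP rule is that the PP predicate is applied to the event variable ?v rather than to the subject. A small sketch with Python closures makes this visible; the names below and the choice of Jill as subject are assumptions for illustration.

```python
def vp_pp_rule(var, semvp, sempp):
    # (VP VAR ?v SEM (λ x (& (?semvp x) (?sempp ?v)))) →
    #     (VP VAR ?v SEM ?semvp) (PP SEM ?sempp)
    # The PP modifies the event ?v, not the subject x.
    return lambda x: ("&", semvp(x), sempp(var))

def cry(x):
    # Plays the role of (λ x (CRIES1 e1 x))
    return ("CRIES1", "e1", x)

def in_the_corner(y):
    # Plays the role of (λ y (IN-LOC1 y <THE c1 CORNER1>))
    return ("IN-LOC1", y, ("THE", "c1", "CORNER1"))

cry_in_the_corner = vp_pp_rule("e1", cry, in_the_corner)
sentence_sem = cry_in_the_corner(("NAME", "j1", "Jill"))
print(sentence_sem)
# ('&', ('CRIES1', 'e1', ('NAME', 'j1', 'Jill')), ('IN-LOC1', 'e1', ('THE', 'c1', 'CORNER1')))
```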
23
Parse tree of Cry in the corner
24
PP as a sub constituent of a head word
• Jill decided on a couch is ambiguous:
1. Jill made a decision while she was on a couch
2. Jill made a decision about a couch
• The appropriate rule for the second one is:
– VP → V[_pp:on] NP PP[on]
• The desired logical form for this is
(λ s (DECIDES-ON1 d1 s <A c1 (COUCH1 c1)>))
• Here the preposition “on” doesn’t have any
semantic contribution (it is only a term)
• A new binary feature PRED is needed to
determine the role of the preposition (i.e., a
predicate (+) or a term (-))
25
Rules for handling PPs
26
Different parse trees for the VP
decide on the couch
27
Lexicalized semantic interpretation
• So far all the complexity of the semantic
interpretation has been encoded in grammar
rules
• A different approach is to encode complexities
in lexical entries and have simpler rules
• Verb decide has two senses, with two different
SEM values:
1. Intransitive sense: (λ y (DECIDES1 e1 y))
2. Transitive sense: (λ o (λ y (DECIDES-ON1 e1 y o)))
28
Lexicalized semantic interpretation
• There is a trade off between the complexity
of the grammatical rules and the complexity
of the lexical entries
• The more complex the logical forms are,
the more attractive the second approach
becomes
• For instance, to handle thematic roles with
simple lexical entries, we need extra rules to
classify different types of verbs
29
Lexicalized semantic interpretation
• See and eat both have transitive forms where the
subject fills the AGENT role and the object fills
the THEME role
• Break also has a transitive sense where the subject
fills the INSTR role and the object fills the
THEME role
• We need a new feature ROLES to be added to the
lexical entries to identify the appropriate form, and
a separate rule for each
30
Lexicalized semantic interpretation
(VP VAR ?v SEM (λ a (?semv ?v [AGENT a] [THEME ?semnp]))) →
(V ROLES AGENT-THEME SEM ?semv) (NP SEM ?semnp)
(VP VAR ?v SEM (λ a (?semv ?v [INSTR a] [THEME ?semnp]))) →
(V ROLES INSTR-THEME SEM ?semv) (NP SEM ?semnp)
• Additional rules must be added for all
combinations of roles that can be used by
the verbs
• This approach is cumbersome, because it
requires adding thematic role information to
the lexicon anyway (using the ROLES feature)
31
Lexicalized semantic interpretation
• It is simpler to enter the appropriate form only in
the lexicon:
See: (V VAR ?v SEM (λ o (λ a (SEES1 ?v [AGENT a] [THEME o]))))
Break: (V VAR ?v SEM (λ o (λ a (BREAKS1 ?v [INSTR a] [THEME o]))))
• In this case, just a single rule is needed:
(VP SEM (?semv ?semnp)) →
(V SEM ?semv) (NP SEM ?semnp)
• SEM of VP See the book is represented as:
((λ o (λ a (SEES1 b1 [AGENT a] [THEME o]))) <THE b1 (BOOK1 b1)>)
= (λ a (SEES1 b1 [AGENT a] [THEME <THE b1 (BOOK1 b1)>]))
• SEM of VP Break the book is represented as:
((λ o (λ a (BREAKS1 b1 [INSTR a] [THEME o]))) <THE b1 (BOOK1 b1)>)
= (λ a (BREAKS1 b1 [INSTR a] [THEME <THE b1 (BOOK1 b1)>]))
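The lexicalized approach amounts to storing curried functions in the lexicon and letting one generic rule apply them. A hedged Python sketch follows; the function names are hypothetical, and the event variables s1 and b2 are chosen here only so that they do not clash with the book's variable b1.

```python
def see(v):
    # Plays the role of (λ o (λ a (SEES1 ?v [AGENT a] [THEME o])))
    return lambda o: lambda a: ("SEES1", v, ("AGENT", a), ("THEME", o))

def break_(v):
    # Plays the role of (λ o (λ a (BREAKS1 ?v [INSTR a] [THEME o])))
    return lambda o: lambda a: ("BREAKS1", v, ("INSTR", a), ("THEME", o))

def vp_rule(semv, semnp):
    # The single rule: (VP SEM (?semv ?semnp)) → (V SEM ?semv) (NP SEM ?semnp)
    return semv(semnp)

book = ("THE", "b1", ("BOOK1", "b1"))
see_the_book = vp_rule(see("s1"), book)        # still a function of the subject
break_the_book = vp_rule(break_("b2"), book)

jack = ("NAME", "j1", "Jack")
print(see_the_book(jack))
# ('SEES1', 's1', ('AGENT', ('NAME', 'j1', 'Jack')), ('THEME', ('THE', 'b1', ('BOOK1', 'b1'))))
```

The role assignment (AGENT versus INSTR) lives entirely in the lexical entry, so vp_rule never needs to inspect the verb.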
32
Hierarchical lexicons
• The problem of making the lexicon more complex is that
there are too many words
• Verb give allows:
– I gave the money
– I gave John the money
– I gave the money to John
• The lexical entries for give include:
– (V SUBCAT _np
SEM λ o λ a (GIVE * [AGENT a] [THEME o]))
– (V SUBCAT _np_np
SEM λ r λ o λ a (GIVE * [AGENT a] [THEME o] [TO-POSS r]))
– (V SUBCAT _np_pp:to
SEM λ r λ o λ a (GIVE * [AGENT a] [THEME o] [TO-POSS r]))
33
Hierarchical lexicons
• This causes a lot of redundancy in the
lexicon
• A better idea is to exploit regularities of
verbs in English
• A very large class of transitive verbs (give,
take, see, find, paint, etc.), which describe
an action, use the same semantic
interpretation rule for SUBCAT _np
34
Hierarchical lexicons
• The idea of a hierarchical lexicon is to organize
verb senses in such a way that their shared
properties can be used (via inheritance)
– INTRANS-ACT defines verbs with SUBCAT _none
• λ s (?PREDN * [AGENT s])
– TRANS-ACT for verbs with SUBCAT _np
• λ o λ a (?PREDN * [AGENT a] [THEME o])
• Give: (V ROOT give PREDN GIVE1
SUP {BITRANS-TO-ACT TRANS-ACT})
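Inheritance in such a lexicon can be sketched as template expansion: each class supplies SEM templates per SUBCAT frame, and a word's entry only names its predicate and superclasses. In the sketch below the dictionary layout and the functions fill and expand are hypothetical, and the content of BITRANS-TO-ACT is an assumption reconstructed from the two ditransitive give entries listed earlier.

```python
# Template shared by both ditransitive frames (assumed content of BITRANS-TO-ACT).
DITRANS = ("lambda", "r", ("lambda", "o", ("lambda", "a",
           ("?PREDN", "*", ("AGENT", "a"), ("THEME", "o"), ("TO-POSS", "r")))))

# Each class maps a SUBCAT frame to a SEM template containing ?PREDN.
CLASSES = {
    "INTRANS-ACT": {"_none": ("lambda", "s", ("?PREDN", "*", ("AGENT", "s")))},
    "TRANS-ACT": {"_np": ("lambda", "o", ("lambda", "a",
                  ("?PREDN", "*", ("AGENT", "a"), ("THEME", "o"))))},
    "BITRANS-TO-ACT": {"_np_np": DITRANS, "_np_pp:to": DITRANS},
}

def fill(template, predn):
    """Replace the ?PREDN placeholder with the verb's predicate name."""
    if template == "?PREDN":
        return predn
    if isinstance(template, tuple):
        return tuple(fill(t, predn) for t in template)
    return template

def expand(entry):
    """Return {SUBCAT: SEM} for an entry by inheriting from its SUP classes."""
    senses = {}
    for cls in entry["SUP"]:
        for subcat, template in CLASSES[cls].items():
            senses[subcat] = fill(template, entry["PREDN"])
    return senses

give = {"ROOT": "give", "PREDN": "GIVE1", "SUP": ["BITRANS-TO-ACT", "TRANS-ACT"]}
senses = expand(give)
print(sorted(senses))  # ['_np', '_np_np', '_np_pp:to']
```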
35
Lexical hierarchy
36
Handling simple questions
• To handle simple questions, we only need to extend the
appropriate grammar rules with the SEM feature
• The lexical entries for the wh-words are extended with
the SEM feature:
who: (PRO WH {Q R} SEM WHO1 AGR {3s 3p})
• But how do the SEM and GAP features interact?
37
Who did Jill see?
38
Prepositional phrase Wh-Questions
• Questions can begin with prepositional phrases:
– In which box did you put the book?
– Where did Jill go?
• The semantic interpretation of these questions
depends on the type of PPs
(S INV – SEM (WH-query ?sems)) →
(PP WH Q PRED ?p PTYPE ?pt SEM ?sempp)
(S INV + SEM ?sems GAP (PP PRED ?p PTYPE ?pt SEM
?sempp))
39
Prepositional phrase Wh-Questions
• To handle wh-terms like where, the
following rule is also needed:
(PP PRED ?p PTYPE ?pt SEM ?sempp) →
(PP-WRD PRED ?p PTYPE ?pt SEM ?sempp)
• There would also be two lexical entries:
(PP-WRD PTYPE {LOC MOT} PRED – VAR ?v
SEM <WH ?v (LOC1 ?v)>)
(PP PRED + VAR ?v
SEM (λ x (AT-LOC x <WH ?v (LOC1 ?v)>)))
40
Where did Jill go?
• But Wh-questions starting with a PP with +PRED cannot be handled yet
• The difficulty comes from the restriction that in rule VP → VP PP, the
GAP is passed only into the VP sub constituent
• Thus there is still no way to create a PP gap that modifies a verb phrase
41
Relative Clauses
• The similarity between relative clauses and
wh-questions continues at the semantic
level, too.
• The logical form of a relative clause is a
unary predicate, which is produced by
(CNP SEM (λ ?x (& (?semcnp ?x) (?semrel ?x)))) →
(CNP SEM ?semcnp) (REL SEM ?semrel)
• When the embedded sentence is completed,
a lambda is wrapped around it to form the
relative clause
42
Who Jill saw
• The wh-term variable is specified in a feature called
RVAR in the rule
(REL SEM (λ ?v ?sems)) →
(NP WH R RVAR ?v AGR ?a SEM ?semnp)
(S [fin] INV – GAP (NP AGR ?a SEM ?semnp) SEM ?sems)
43
Semantic Interpretation by Features Unification
versus Lambda reductions
• Semantic interpretation can be performed by just
using feature values and variables
• The basic idea is to introduce new features for
the argument positions that would have been
filled with lambda reductions
44
Jill saw the dog
45
Features versus Lambda expressions
46
Features versus Lambda expressions
• One advantage of using features is that no special
mechanism (e.g., Lambda Reduction) is needed
for semantic interpretation
• Another significant advantage is that the grammar
in this form is reversible (can be used to generate
sentences)
• But, not all lambda expressions can be eliminated
using this technique
• When handling conjoined subject phrases, as in Sue and
Sam saw Jack, the SUBJ variable would need to unify with
both Sue and Sam, which is not possible
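Both the feature-based approach and its limitation with conjoined subjects can be illustrated with a small unifier. This is a sketch under stated assumptions: the unify, walk, and resolve functions, the "?name" convention for variables, and the verb entry saw_sem are hypothetical, and no occurs check is performed.

```python
def is_var(x):
    return isinstance(x, str) and x.startswith("?")

def walk(x, bindings):
    """Follow variable bindings until a value or an unbound variable is reached."""
    while is_var(x) and x in bindings:
        x = bindings[x]
    return x

def unify(a, b, bindings):
    """Unify two terms; return the extended bindings, or None on failure."""
    a, b = walk(a, bindings), walk(b, bindings)
    if a == b:
        return bindings
    if is_var(a):
        return {**bindings, a: b}
    if is_var(b):
        return {**bindings, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            bindings = unify(x, y, bindings)
            if bindings is None:
                return None
        return bindings
    return None

def resolve(x, bindings):
    """Substitute all bound variables in x."""
    x = walk(x, bindings)
    if isinstance(x, tuple):
        return tuple(resolve(p, bindings) for p in x)
    return x

# Hypothetical entry for "saw": argument slots are feature variables, not lambdas.
saw_sem = ("<PAST SEES1>", "?v", "?subj", "?obj")
b = unify("?v", "s1", {})
b = unify("?subj", ("NAME", "j1", "Jill"), b)
b = unify("?obj", ("THE", "d1", ("DOG1", "d1")), b)
print(resolve(saw_sem, b))

# A conjoined subject would need ?subj to unify with a second, different term:
print(unify("?subj", ("NAME", "s2", "Sam"), b))  # None
```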
47
Generating sentences from Logical
Forms
• Intuitively, it should be easy to reverse a
grammar and use it for generation:
• Decompose the logical form of each
constituent into a series of lexical
constituents with the appropriate meaning
• But, not all grammars are reversible:
e.g., a grammar with Lambda reduction
48
Generating sentences from Logical
Forms
• Consider (<PAST SEES1> s1 (NAME j1 “Jill”)
<THE d1 (DOG1 d1)>)
• Rule S with SEM (?semvp ?semnp) cannot be unified with
the above logical form
• The problem is that lambda reduction was used to convert
the original logical form, which was
((λ a (<PAST SEES1> s1 a <THE d1 (DOG1 d1)>))
(NAME j1 “Jill”))
• Lambda abstraction can be used to find a match, but the
problem is that there are three possible lambda abstractions
– (λ e (<PAST SEES1> e (NAME j1 “Jill”)
<THE d1 (DOG1 d1)>))
– (λ a (<PAST SEES1> s1 a <THE d1 (DOG1 d1)>))
– (λ o (<PAST SEES1> s1 (NAME j1 “Jill”) o))
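The three candidates can be enumerated mechanically, which shows why a realizer cannot know which one the S rule expects. The sketch below is illustrative; the tuple encoding and the names abstractions and apply_lambda are assumptions, and a single variable name x is used where the slides use e, a, and o.

```python
def abstractions(form, var="x"):
    """Yield (lambda_expr, argument) for each argument position of form."""
    pred, *args = form
    for i in range(len(args)):
        body = (pred, *args[:i], var, *args[i + 1:])
        yield ("lambda", var, body), args[i]

def apply_lambda(lam, arg):
    """Shallow lambda reduction, enough for the bodies built above."""
    _, var, body = lam
    return tuple(arg if part == var else part for part in body)

jill = ("NAME", "j1", "Jill")
dog = ("THE", "d1", ("DOG1", "d1"))
form = ("<PAST SEES1>", "s1", jill, dog)

candidates = list(abstractions(form))
for lam, arg in candidates:
    print(lam, "applied to", arg)
```

Each candidate reduces back to the same logical form, so the form alone does not determine which abstraction was intended.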
49
Realization versus Parsing
• Parsing and realization both can be viewed
as building a syntactic tree
• A parser starts with the words and tries to
find a tree that accounts for them and hence
determine the logical form of the sentence
• A realizer starts with a logical form and
tries to find a tree to account for it and
hence determine the words to realize it
50
Realization and Parsing
• Using a standard top-down parsing algorithm for realization is extremely
inefficient
Consider again (<PAST SEES1> s1
(NAME j1 “Jill”) <THE d1 (DOG1 d1)>)
(NP SEM ?semsubj)
(VP SUBJ ?semsubj
SEM (<PAST SEES1> s1 (NAME j1 “Jill”)
<THE d1 (DOG1 d1)>))
• The problem is that the SEM of NP is unconstrained
• Solution: expand the constituents in a different order (e.g.,
choosing the subject based on decisions about the verb and
the structure of the verb phrase)
51
Head Driven Realization algorithm
52
Head driven Realization
• Using rule 3 of grammar 9.14
?semv = <PAST SEES1>,
?v = s1,
?semsubj = (NAME j1 “Jill”),
?semnp = <THE d1 (DOG1 d1)>
• After rewriting the VP, the following list of
constituents is obtained:
(NP SEM (NAME j1 “Jill”))
(V [_np] SEM <PAST SEES1>)
(NP SEM <THE d1 (DOG1 d1)>)
53
Head driven Realization
• Since there is no non-lexical head, the
algorithm picks any non-lexical constituent
with a bound SEM (e.g., the first NP)
• Using rule 5 yields (NAME SEM “Jill”)
• Selecting the remaining NP and using rules
6 and then 7 produces the following list:
(NAME SEM “Jill”), (V [_np] SEM <PAST SEES1>),
(ART SEM THE), (N SEM DOG1)
• Now it is easy to produce Jill saw the dog
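Once only lexical constituents remain, the last step is a lookup from (category, SEM) pairs to words. The sketch below shows just that final step; the LEXICON table and the realize function are hypothetical and are not the head-driven algorithm itself.

```python
# Hypothetical generation lexicon keyed by (category, SEM).
LEXICON = {
    ("NAME", "Jill"): "Jill",
    ("V", "<PAST SEES1>"): "saw",
    ("ART", "THE"): "the",
    ("N", "DOG1"): "dog",
}

def realize(constituents):
    """Map a list of lexical constituents to a word string, in order."""
    return " ".join(LEXICON[(cat, sem)] for cat, sem in constituents)

lexical = [("NAME", "Jill"), ("V", "<PAST SEES1>"), ("ART", "THE"), ("N", "DOG1")]
sentence = realize(lexical)
print(sentence)  # Jill saw the dog
```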
54
Ambiguities in Realization
• With large grammars, a wider range of forms would be
possible
• If only the SEM feature is specified, the realization
program would randomly pick between active and passive
sentences when allowed by the verb
• It might also pick between different sub categorization
structures
– Jill gave the dog to Jack
– Jill gave Jack the dog
• Different word senses may be picked
– Jack gave the money to the Humane Society
– Jack donated the money to the Humane Society
• To force a particular realization, other features in addition to
SEM may be necessary (e.g., VOICE to force active voice)
55