Transcript Slide 1

Whose presentation is this?
SUBJ(present, Violeta Seretan)
(Decoding the predicate-argument structure of nominalizations)
OBL(collaborate, Lorenzo Thione)
PP-OBJ(with, Lorenzo Thione)
SUBJ(supervise, Martin van den Berg)
Overview







nominalization problem
NOMLEX resource
Denominalizer service based on NOMLEX
additional resources (CSLI)
APIs for NOMLEX, CSLI
related and future work
demo
10/25/2005
2
Text normalization for QA

Mark Twain published Adventures of Huckleberry Finn in 1885 in America.
– Who published H.F.?
– Where was H.F. published?
– When was H.F. published?
QA/NLU needs to deal with a large spectrum of variation in text:
1. morphological: published, publishes
2. syntactic: H.F. was published
3. lexical: {novel, book, masterpiece, work} {publish, write, author, appear}
4. nominalization: the publication
Normalization (via parsing):
1. base word form: publishes -> publish; published -> publish
2. canonical word order: SUBJ(publish, Mark Twain); OBJ(publish, H.F.)
Lexical semantic resources:
3. synonyms, hyponyms, hypernyms, …
What about nominalization?
Nominalization
Since the publication of Huckleberry Finn in 1885, there have been many reactions to the novel, some of them quite extreme.
– When was H.F. published?
deverbal noun: publication
nominalization: publication of Huckleberry Finn
OBJ(publish, Huckleberry Finn)
matrix verb
Nominalization: NP having “a systematic correspondence with a clause structure” (Quirk et al. 1985)
Goal: decoding the clause structure
Mapping nominal arguments into verbal roles
Mark Twain’s publication of his book
possessive determiner: Mark Twain’s; PP adjunct: of his book (nominal arguments)
the book publication by Mark Twain
modifier: book; PP adjunct: by Mark Twain (nominal arguments)
Mark Twain – publish – book
SUBJECT: Mark Twain; OBJECT: book (verbal roles)
Role ambiguity
Rome’s destruction – SUBJ or OBJ?
OBJ(destroy, Rome)
SUBJ(destroy, Rome)
A. Rome’s destruction by barbarians – OBJ
B. Rome’s destruction of Carthage – SUBJ
Rome’s destruction – OBJ (by default)
John’s admiration – SUBJ (by default)
NOMLEX – NOMinalization LEXicon
Macleod et al., New York University
1,025 deverbal nouns
detailed mapping from nominal arguments to verb roles
:ORTH "destruction"
:VERB "destroy"
:VERB-SUBC ((NOM-NP :SUBJECT ((N-N-MOD)
                              (DET-POSS)
                              (PP :PVAL ("by")))
                    :OBJECT ((DET-POSS)
                             (N-N-MOD)
                             (PP :PVAL ("of")))
                    :REQUIRED ((OBJECT :DET-POSS-ONLY T
                                       :N-N-MOD-ONLY T))))
role to assign: :SUBJECT, :OBJECT
default role: :REQUIRED (OBJECT)
NOMLEXML
Perl
(NOM :ORTH "accusation"
     :PLURAL "accusations"
     :PLURAL-FREQ "not rare"
     :VERB "accuse"
     :NOUN-SUBC ((NOUN-PP :PVAL ("about")))
     :NOM-TYPE ((VERB-NOM))
     :VERB-SUBJ ((DET-POSS)
                 (N-N-MOD)
                 (PP :PVAL ("by")))
     :SUBJ-ATTRIBUTE ((COMMUNICATOR))
     :OBJ-ATTRIBUTE ((COMMUNICATOR))
     :VERB-SUBC ((NOM-NP-PP :SUBJECT ((DET-POSS)
                                      (N-N-MOD)
                                      (PP :PVAL ("by")))
                            :OBJECT ((PP :PVAL ("against")))
                            :PVAL ("of"))
                 (NOM-NP :SUBJECT ((DET-POSS) …
NOMLEX API in Java
com.fxpal.sake.test (NomLexInterface)
com.fxpal.ltng.services.normalization.noun.nomlex
(NomLex, NomLexEntry, NomLexClassConstants, Subcat)
How useful?
Oracle acquired PeopleSoft at the end of last year.
Oracle’s acquisition of PeopleSoft at the end of last year…
Google hits, 10/25/2005:
"Oracle acquisition of PeopleSoft": ~14,500
"Oracle acquired PeopleSoft": 587
"Oracle's PeopleSoft acquisition": 693
More hits:
"Oracle acquires PeopleSoft": 1,020
"Oracle has acquired PeopleSoft": 248
"Oracle will acquire PeopleSoft": 424
Argument-role mapping
Oracle's acquisition of PeopleSoft
possessive: Oracle's; PP (of): of PeopleSoft
:ORTH "acquisition"
:VERB "acquire"
:VERB-SUBC ((NOM-NP :SUBJECT ((DET-POSS)
                              (N-N-MOD)
                              (PP :PVAL ("by")))
                    :OBJECT ((N-N-MOD)
                             (PP :PVAL ("of"))))
SUBJ(acquire, Oracle)
OBJ(acquire, PeopleSoft)
Denominalizer


Input: sentence
Output: pairs nominal argument – verb role for each nominalization
(noun, (argument – role)*)*
Examples:
• Oracle's acquisition of PeopleSoft finally materialized after an 18-month struggle between the two companies.
(acquisition, (Oracle - SUBJECT) (PeopleSoft - OBJECT))
• Oracle acquisition finally materialized.
(acquisition, (Oracle - SUBJECT) (Oracle - OBJECT))
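The output notation of the Denominalizer can be rendered with a small formatter. This is a minimal Python sketch; the function name `format_output` and the list-of-pairs input are assumptions made for illustration, not part of the actual service.

```python
def format_output(noun, pairs):
    """Render (noun, (argument - role)*) in the notation used on the slide.

    `pairs` is a list of (argument, role) tuples; this input shape is an
    assumption of the sketch.
    """
    body = " ".join(f"({argument} - {role})" for argument, role in pairs)
    return f"({noun}, {body})"


print(format_output("acquisition", [("Oracle", "SUBJECT"), ("PeopleSoft", "OBJECT")]))
```

An ambiguous argument, as in "Oracle acquisition", would simply contribute one pair per candidate role.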
Algorithm
com.fxpal.ltng.services.normalization.noun.*
parse sentence
for each deverbal noun
    get noun arguments
    for each NOMLEX entry for noun
        for each subcat of the entry
            1. match arguments against subcat
            2. filter assignment results
    select a subcat
    output assignments for selected subcat
Note: overlapping nominalizations ok:
an increase in product sales
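The loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the com.fxpal implementation: the `LEXICON` layout, the position labels such as `"PP-of"`, and all function names are hypothetical, parsing is assumed to have happened already, filtering is reduced to "no role filled twice", and subcat selection is simply "first subcat that yields a result".

```python
from itertools import product

# Hypothetical, simplified lexicon (not the real NOMLEX format):
# noun -> underlying verb plus a list of subcat frames, each mapping a
# verbal role to the nominal positions that may realize it.
LEXICON = {
    "acquisition": {
        "verb": "acquire",
        "subcats": [
            {
                "SUBJECT": {"DET-POSS", "N-N-MOD", "PP-by"},
                "OBJECT": {"N-N-MOD", "PP-of"},
            }
        ],
    }
}


def match(arguments, subcat):
    """Step 1: for each argument, list the roles its position may fill."""
    return {
        filler: [role for role, positions in subcat.items() if position in positions]
        for filler, position in arguments
    }


def filter_assignments(alternatives):
    """Step 2: keep only complete assignments in which each role is used once."""
    fillers = list(alternatives)
    kept = []
    for roles in product(*(alternatives[f] for f in fillers)):
        if len(set(roles)) == len(roles):
            kept.append(dict(zip(fillers, roles)))
    return kept


def denominalize(noun, arguments):
    """Try each subcat of the noun; return the verb and the first non-empty result."""
    entry = LEXICON[noun]
    for subcat in entry["subcats"]:
        assignments = filter_assignments(match(arguments, subcat))
        if assignments:  # naive selection: first subcat that works
            return entry["verb"], assignments
    return entry["verb"], []


# "Oracle's PeopleSoft acquisition", with arguments assumed to be pre-parsed.
print(denominalize("acquisition", [("Oracle", "DET-POSS"), ("PeopleSoft", "N-N-MOD")]))
```

Keeping matching and filtering as separate functions mirrors the two numbered steps, so further constraints can be added to the filter without touching the matcher.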
1. Matching
Oracle's acquisition of PeopleSoft finally materialized.
Arguments (acquisition):
POSS(acquisition, Oracle)
ADJUNCT(acquisition, of)
PP-OBJ(of, PeopleSoft)
NOM-NP
:SUBJECT ((DET-POSS)
          (N-N-MOD)
          (PP :PVAL ("by")))
:OBJECT ((N-N-MOD)
         (PP :PVAL ("of")))
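The matching step for this example can be sketched in Python. The triple format `(relation, head, dependent)`, the function names, and the way positions are encoded are assumptions for illustration; only the POSS, MOD, and ADJUNCT + PP-OBJ relations are handled, each mapped to the DET-POSS, N-N-MOD, and PP positions respectively.

```python
def nominal_arguments(noun, triples):
    """Collect (filler, position) pairs for `noun` from parser triples."""
    arguments = []
    for relation, head, dependent in triples:
        if head != noun:
            continue
        if relation == "POSS":
            arguments.append((dependent, "DET-POSS"))
        elif relation == "MOD":
            arguments.append((dependent, "N-N-MOD"))
        elif relation == "ADJUNCT":
            # The adjunct is the preposition; its PP-OBJ is the actual filler.
            for rel2, head2, dep2 in triples:
                if rel2 == "PP-OBJ" and head2 == dependent:
                    arguments.append((dep2, ("PP", dependent)))
    return arguments


def candidate_roles(arguments, subcat):
    """For each filler, list the roles whose allowed positions include its position."""
    return {
        filler: [role for role, positions in subcat.items() if position in positions]
        for filler, position in arguments
    }


# Hypothetical encoding of the NOM-NP frame shown on the slide.
NOM_NP = {
    "SUBJECT": ["DET-POSS", "N-N-MOD", ("PP", "by")],
    "OBJECT": ["N-N-MOD", ("PP", "of")],
}

triples = [
    ("POSS", "acquisition", "Oracle"),
    ("ADJUNCT", "acquisition", "of"),
    ("PP-OBJ", "of", "PeopleSoft"),
]
args = nominal_arguments("acquisition", triples)
print(candidate_roles(args, NOM_NP))
```

Here each argument matches exactly one role, so no filtering is needed for this sentence.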
2. Filtering
Oracle's PeopleSoft acquisition finally materialized.
Arguments (acquisition):
POSS(acquisition, Oracle)
MOD(acquisition, PeopleSoft)
NOM-NP
:SUBJECT ((DET-POSS)
          (N-N-MOD)
          (PP :PVAL ("by")))
:OBJECT ((N-N-MOD)
         (PP :PVAL ("of")))
Alternatives:
Oracle: SUBJECT
PeopleSoft: SUBJECT, OBJECT
NOMLEX constraints (1)

Uniqueness Constraint: A verbal role may be filled only once.
Oracle's PeopleSoft acquisition
Matching alternatives:
Oracle: SUBJECT
PeopleSoft: SUBJECT, OBJECT
Result: Oracle – SUBJECT, PeopleSoft – OBJECT (SUBJECT is already filled by Oracle)
NOMLEX constraints (2)

Ordering Constraint: If there are multiple pre-nominal arguments, they must appear in the order: SUBJECT, INDIRECT OBJECT, DIRECT OBJECT, OBLIQUE.
FX’s printer sales grew by 50%.
Matching alternatives:
FX: SUBJECT, OBJECT
printer: SUBJECT, OBJECT
order: FX, printer
verbal roles: SUBJECT, OBJECT
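The ordering constraint can be checked by ranking the roles. A minimal Python sketch, assuming that OBJECT in the matching alternatives stands for DIRECT OBJECT; the names `ROLE_ORDER`, `satisfies_ordering`, and `order_filter` are hypothetical.

```python
from itertools import product

# Required order of pre-nominal arguments. OBJECT is taken to mean
# DIRECT OBJECT here (an assumption of this sketch).
ROLE_ORDER = ["SUBJECT", "INDIRECT OBJECT", "OBJECT", "OBLIQUE"]


def satisfies_ordering(roles):
    """True if the roles, read left to right, strictly follow ROLE_ORDER."""
    ranks = [ROLE_ORDER.index(role) for role in roles]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def order_filter(prenominal, alternatives):
    """Keep the assignments whose roles respect the surface order of the arguments."""
    kept = []
    for roles in product(*(alternatives[arg] for arg in prenominal)):
        if satisfies_ordering(roles):
            kept.append(dict(zip(prenominal, roles)))
    return kept


# "FX's printer sales": both arguments could be SUBJECT or OBJECT.
alternatives = {"FX": ["SUBJECT", "OBJECT"], "printer": ["SUBJECT", "OBJECT"]}
print(order_filter(["FX", "printer"], alternatives))
```

Because the comparison is strict, an assignment that repeats a role is rejected as well, so this check also covers uniqueness for pre-nominal arguments.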
NOMLEX constraints (3)

Obligatoriness Constraint: By default, the subject and object are optional. A NOMLEX entry can specify obligatory roles to be filled.
circulation - REQUIRED (SUBJECT)
blood circulation
SUBJ(circulate, blood)
destruction - REQUIRED ((OBJECT :DET-POSS-ONLY T :N-N-MOD-ONLY T))
Rome’s destruction
OBJ(destroy, Rome)
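A required role can be used to discard candidate assignments that leave it unfilled. A minimal Python sketch; the function name `apply_required` is hypothetical, and the `:DET-POSS-ONLY` / `:N-N-MOD-ONLY` qualifiers of the real entries are deliberately ignored here.

```python
def apply_required(assignments, required):
    """Keep only the assignments that fill every required role.

    `assignments` is a list of {argument: role} dicts, `required` a list of
    role names. Qualifiers on the requirement are not modeled in this sketch.
    """
    return [a for a in assignments if set(required) <= set(a.values())]


# "Rome's destruction": Rome could be SUBJECT or OBJECT; OBJECT is required.
print(apply_required([{"Rome": "SUBJECT"}, {"Rome": "OBJECT"}], ["OBJECT"]))

# "blood circulation": SUBJECT is required.
print(apply_required([{"blood": "SUBJECT"}, {"blood": "OBJECT"}], ["SUBJECT"]))
```

An assignment such as Rome – SUBJECT, Carthage – OBJECT passes unchanged, since the required OBJECT role is filled.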
Selectional Restrictions
com.fxpal.ltng.services.normalization.noun.csli
(Nouns, Verbs, NounsVerbs)
Applying selectional restrictions
room reservation
Alternatives: room - SUBJECT, OBJECT
reserve - selectional restrictions: SUBJECT: sentient; OBJECT: *
room - location, physobj
Result: room – OBJECT (a room is not sentient, so SUBJECT is ruled out)
semantic types for about 5000 nouns
selectional restrictions for about 5000 verbs
459/941 verbs from NOMLEX (48.78%)
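The room reservation example can be sketched in Python. The two dictionaries are tiny hypothetical stand-ins for the CSLI noun and verb data, types are compared as flat labels (no ontology subsumption), and `*` is read as "any type"; all names are assumptions of the sketch.

```python
# Hypothetical stand-ins for the CSLI data.
NOUN_TYPES = {"room": {"location", "physobj"}}
VERB_RESTRICTIONS = {"reserve": {"SUBJECT": {"sentient"}, "OBJECT": {"*"}}}


def allowed_roles(verb, noun, roles):
    """Drop the roles whose selectional restriction the noun's types cannot satisfy."""
    if noun not in NOUN_TYPES or verb not in VERB_RESTRICTIONS:
        return list(roles)  # no data: leave the alternatives untouched
    types = NOUN_TYPES[noun]
    kept = []
    for role in roles:
        wanted = VERB_RESTRICTIONS[verb].get(role, {"*"})
        if "*" in wanted or wanted & types:
            kept.append(role)
    return kept


print(allowed_roles("reserve", "room", ["SUBJECT", "OBJECT"]))
```

When either the noun or the verb is missing from the data, the sketch returns the alternatives unchanged, which matters given that only about half of the NOMLEX verbs are covered.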
Coverage extension

What if a noun is not in NOMLEX?
1. additional deverbal nouns in the CSLI data:
4,087 “event nouns”
3,348 new, 739 already in NOMLEX
3,348/1,025 ≈ 3.27, i.e. about 327% more data
2. NOMLEX template:
NOM-NP
:SUBJECT ((DET-POSS)
          (N-N-MOD)
          (PP :PVAL ("by")))
:OBJECT ((DET-POSS)
         (N-N-MOD)
         (PP :PVAL ("of")))
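The fallback to a generic template, and the coverage arithmetic, can be sketched in Python. The dictionary encoding of the template, the function name `subcats_for`, and the example noun "takeover" are assumptions for illustration.

```python
# Generic NOM-NP template from the slide, in a hypothetical dictionary encoding.
DEFAULT_TEMPLATE = {
    "SUBJECT": ["DET-POSS", "N-N-MOD", ("PP", "by")],
    "OBJECT": ["DET-POSS", "N-N-MOD", ("PP", "of")],
}


def subcats_for(noun, nomlex):
    """Return the noun's own subcats, or the generic template if it has no entry."""
    entry = nomlex.get(noun)
    return entry if entry is not None else [DEFAULT_TEMPLATE]


# "takeover" is a made-up example of a noun missing from the lexicon.
print(subcats_for("takeover", {}))

# Coverage arithmetic from the slide.
new_nouns, already_covered, nomlex_size = 3348, 739, 1025
total_event_nouns = new_nouns + already_covered
growth = new_nouns / nomlex_size
print(total_event_nouns, f"{growth:.1%}")
```

The template is more permissive than a typical entry (DET-POSS may be either role), so the constraint filters carry more of the disambiguation for nouns handled this way.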
Future work

extensive test and evaluation

other nominalization data
– deverbal noun recognition
– mapping information (FrameNet)

other lexical resources
PropBank – semantic roles
VerbLex – selectional restrictions

role assignment in context
– word sense disambiguation, anaphora, discourse
– collocations
the author will make no accusation
SUBJ(make, author) -> SUBJ(accuse, author)
Related work




PUNDIT system (Dahl et al., 1987)
SNOWY QA system (Hull and Gomez 1996)
NOMLEX for IE (Meyers et al., 1998)
N-N interpretation (Lapata 2002, Girju et al. 2004)
References








Dahl, Deborah A., Martha S. Palmer, and Rebecca J. Passonneau. 1987. Nominalizations in PUNDIT. In Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics, Stanford, CA.
Girju, Roxana, Ana-Maria Giuglea, Marian Olteanu, Ovidiu Fortu, Orest Bolohan, and Dan Moldovan. 2004. Support vector machines applied to the classification of semantic relations in nominalized noun phrases. In Proceedings of the HLT-NAACL Workshop on Computational Lexical Semantics.
Hull, Richard and Fernando Gomez. 1996. Semantic Interpretation of Nominalizations. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, Portland, Oregon, August 1996, pp. 1062-8.
Lapata, Maria. 2002. The Disambiguation of Nominalisations. Computational Linguistics 28:3, 357-388.
Macleod, Catherine, Ralph Grishman, Adam Meyers, Leslie Barrett, and Ruth Reeves. 1998. Nomlex: A lexicon of nominalizations. In Proceedings of the 8th International Congress of the European Association for Lexicography, pages 187–193, Liège, Belgium.
Meyers, A., et al. 1998. Using NOMLEX to produce nominalization patterns for information extraction. In Proceedings of the COLING-ACL Workshop on Computational Treatment of Nominals.
Quirk, R., S. Greenbaum, G. Leech, and J. Svartvik. 1985. A Comprehensive Grammar of the English Language. Longman, Harlow.
Terada, Akira and Takenobu Tokunaga. 2003. Corpus based method of transforming nominalized phrases into clauses for text mining application. IEICE Transactions on Information and Systems, Vol. E86-D, No. 9, pp. 1736-1744.
Thank you!
Selectional restrictions data

CSLI resource:
– nouns: 4858
semantic types (ontology)
– verbs: 4447
subcategorizations
selectional restrictions
– noun-verb: 5700 V (9415 N)
noun-verb pairs
Grammatical Transfer
NOMLEX       XLE                                         Example
DET-POSS     POSS                                        Rome's destruction
PP           ADJUNCT, PP-OBJ (POS=NOUN)                  destruction of Carthage
TO-INF       XCOMP                                       the desire to leave
AS-NPPHRASE  ADJUNCT, PP-OBJ (as, POS=NOUN)              his resignation as chairman
N-N-MOD      MOD                                         the room reservation
P-ING        ADJUNCT, PP-OBJ (POS=VERB)                  the accusation against launching
ING          ADJUNCT, QA_PROG(+)                         my appreciation being there
FOR-TO-INF   ADJUNCT, SUBJ                               the wish for him to go
ADVP         ADJUNCT (POS=ADV)                           his departure abroad
AS-ING       ADJUNCT, PP-OBJ (as, POS=VERB), QA_PROG(+)  characterization as being
AS-ADJP      ADJUNCT, PP-OBJ (as, POS=ADJ)               the characterization as useful
P-POSSING    ADJUNCT, PP-OBJ(POS=VERB), POSS             the acceptance of his talking
FrameNet




aim: word – semantico-syntactic mapping
semantic roles: frame elements (frame-specific)
BNC corpus (100M words); American English – LDC, ANC
more than 600 frames, about 9,000 words
Example: accusation
frame: Judgment_communication
FE (for this word) and their realization:
communicator: not expressed (27/48); possessive determiner (6/48); PP (from) (2/48); …
evaluee: not expressed (40/48); PP (against) (5/48); PP (about) (3/48); …
reason: PP (of) (9/48); S (that) (9/48); not expressed (8/48); … PP (about) (3/48) …
NOMLEX constraints (4)

restrictions on possible combinations
– specified in NOMLEX entry
adaptation
:NOT ((AND
:SUBJECT ((DET-POSS) (N-N-MOD))
:OBJECT ((N-N-MOD))
*plants' weather adaptation
plants’ adaptation to weather
Note: Not implemented (cannot decide which assignment to remove).
Denominalizer UI
com.fxpal.sake.test.DenominalizerTest
parse triples
output