
From Verbal Argument Structures to
Nominal Ones:
A Data-Mining Approach
Olya Gurevich
1 December 2010
Talk Outline
Powerset: a natural language search engine (acquired
by Microsoft in 2008)
Deverbal nouns and their arguments
Data collection and corpus-based modeling
Baseline system
Experiments
Conclusions
Powerset: Natural Language Search
Queries and documents undergo syntactic and semantic
parsing
Semantic representations allow both more constrained
and more expansive matching compared to keywords
► Who invaded Rome ≠ Who did Rome invade
► Who did Rome invade ≈ Who was invaded by Rome
► Who invaded Rome ≈ Who attacked Rome
► Who invaded Rome ≈ Who was the invader of Rome
Worked on English-language Wikipedia
NL technology initially developed at Xerox PARC (XLE)
Deverbal Nouns
Events often realized as nouns, not verbs
► Armstrong’s return after his retirement
Armstrong returned after he retired
► The destruction of Rome by the Huns was
devastating
The Huns destroyed Rome
► The Yankees’ defeat over the Mets
The Yankees defeated the Mets
► Kasparov’s defense of his knight
Kasparov defended his knight
In search, we need to map deverbal expressions to the corresponding verb (or vice versa)
Deverbal Types
Eventive
► destruction, return, death
Agent-like
► Henri IV was the ruler of France
Henri IV ruled France
Patient-like
► Mary is an IBM employee
IBM employs Mary
Deverbal Role Ambiguity
Deverbal syntax doesn’t always determine argument
role
► They jumped to the support of the Queen
==> They supported the Queen
► They enjoyed the support of the Queen
==> The Queen supported them
► We talked about the Merrill Lynch acquisition
==> Was Merrill Lynch acquired? Or did it acquire something?
Particularly problematic if the underlying verb is transitive but the deverbal noun has only one argument
Baseline system
LFG-based syntactic parser (XLE)
► Grammar is rule based
► Disambiguation component statistically trained
List of ~4000 deverbals and corresponding verbs, from
► WordNet derivational morphology
► NomLex, NomLex Plus
► Hand curation
Verb lexicon with subcategorization frames
Baseline system cont.
Parse the sentence using XLE
If a noun is in the list of ~4000 deverbals, map its arguments onto those of the corresponding verb using rule-based heuristics. For transitive verbs:
► X's DV of Y ==> subj(V, X); obj(V, Y), etc.
   Obama's support of reform ==> subj(support, Obama); obj(support, reform)
► X's DV ==> subj(V, X) [default to most-frequent pattern]
   Obama's support ==> subj(support, Obama)
► DV of X ==> obj(V, X) [default to most-frequent pattern]
   support of reform ==> obj(support, reform)
► X DV ==> no role
► Subject-sharing support verbs: make, take
Goal: to improve over default assignments
Baseline system cont.
Agent-like Deverbals
► X's DVer ==> obj(V, X)
   ■ the project's director ==> subj(direct, director); obj(direct, project)
► DVer of X ==> obj(V, X)
   ■ composer of the song ==> subj(compose, composer); obj(compose, song)
Patient-like Deverbals
► X's DVee ==> subj(V, X)
   ■ IBM's employee ==> subj(employ, IBM); obj(employ, employee)
► DVee of X ==> subj(V, X)
   ■ captive of the rebels ==> subj(capture, rebels); obj(capture, captive)
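As a rough illustration (not Powerset's actual implementation), the eventive defaults from the previous slide and the agent/patient rules above could be sketched in Python as follows; the function name and the lexicon shape are assumptions:

def baseline_roles(dv, arg, construction, dv_lexicon):
    """Rule-based role assignment for a deverbal DV with one argument.
    construction: "poss" (X's DV), "of" (DV of X), or "prenom" (X DV).
    dv_lexicon maps each deverbal to (verb, type), type in
    {"eventive", "agent", "patient"} -- an assumed data shape."""
    verb, dv_type = dv_lexicon[dv]
    if dv_type == "agent":
        # X's DVer / DVer of X: X is the object, the DVer is the subject
        return [("subj", verb, dv), ("obj", verb, arg)]
    if dv_type == "patient":
        # X's DVee / DVee of X: X is the subject, the DVee is the object
        return [("subj", verb, arg), ("obj", verb, dv)]
    # Eventive deverbals: default to the most frequent pattern
    if construction == "poss":
        return [("subj", verb, arg)]   # Obama's support ==> subj(support, Obama)
    if construction == "of":
        return [("obj", verb, arg)]    # support of reform ==> obj(support, reform)
    return []                          # X DV ==> no role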
Deverbal Task
Goal: predict relation between transitive V and
argument X, given {X’s DV}, {DV of X}, or {X DV}
► the program’s renewal ==> obj(renew, program)
► the king’s promise ==> subj(promise, king)
► the destruction of Rome ==> obj(destroy, Rome)
► the rule of Henri IV ==> subj(rule, Henri IV)
► the Congress decision ==> subj(decide, Congress)
► castle protection ==> obj(protect, castle)
► domain adaptation ==> ?(adapt, domain)
Inference from verb usage
Large corpus data can indicate lexical preferences
► Armstrong’s return == Armstrong returned
► return of the book == the book was returned
If X is more often a subject of V than an object,
► then X's DV or DV of X ==> subj(V, X)
Need to count subj(V,X) | obj(V,X) occurrences for all
possible pairs (V,X)
Need lots of parsed sentences!
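The counting step might look like the following minimal sketch, assuming the parsed corpus has already been reduced to (Verb, Role, Arg) triples as described on the next slide; all names are illustrative:

from collections import defaultdict

def count_role_pairs(triples):
    """Tally how often each argument X occurs as subject vs. object of
    each verb V, given (verb, role, arg) triples from parsed sentences."""
    counts = defaultdict(lambda: {"subj": 0, "obj": 0})
    for verb, role, arg in triples:
        if role in ("subj", "obj"):
            counts[(verb, arg)][role] += 1
    return counts

# e.g. counts[("renew", "program")] might be {"subj": 9, "obj": 72}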
Data Sparseness
Where can we get enough parsed data to model every pair (V, X)?
We have parsed all of the English Wikipedia (2M docs, 121M sentences)
   ■ cf. the Penn Treebank (~50,000 sentences)
Oceanography: a distributed architecture for fast extraction and analysis of huge parsed data sets
72M Role (Verb, Role, Arg) examples
   ■ 69% of these appear just once, 13% just twice!
Not enough data to make a good prediction for each individual argument
   ■ need to generalize across arguments
Deverbal-only model
For each deverbal DV and related verb V, find corpus occurrences of overlapping arguments
► X SUBJ V, X OBJ V, and X's DV, for all X
If count(X SUBJ V) / count(X OBJ V) > 1.5, consider X "subject-preferring" for this DV
If DV has more subject-preferring than object-preferring arguments, then map:
► X's DV ==> subj(V, X) for all X
   (conversely for object preference)
If the majority of overlapping arguments for a given V are neither subjects nor objects, DV is "other-preferring"
For each DV, average over all arguments X
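A sketch of this model, reusing count_role_pairs from above; the tie-breaking details are assumptions, and the "other-preferring" case is simplified here to "no clear subj/obj skew":

def argument_preference(verb, arg, counts, ratio=1.5):
    """Classify one argument as subject- or object-preferring for a verb,
    using corpus counts of subj(V, X) and obj(V, X)."""
    c = counts.get((verb, arg), {"subj": 0, "obj": 0})
    if c["subj"] > ratio * c["obj"]:
        return "subj"
    if c["obj"] > ratio * c["subj"]:
        return "obj"
    return "other"

def deverbal_preference(verb, dv_args, counts):
    """Majority vote over all arguments X seen with the deverbal: if most
    overlapping arguments are subject-preferring, X's DV maps to
    subj(V, X) for all X (conversely for object preference)."""
    votes = [argument_preference(verb, x, counts) for x in dv_args]
    return max(("subj", "obj", "other"), key=votes.count)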
Walk-through example
renewal : renew
► Argument: program
   ■ program's renewal: 2 occurrences
   ■ obj(renew, program): 72 occurrences
   ■ subj(renew, program): 9 occurrences
   ■ {renewal, program} is object-preferring
► Argument: he
   ■ his renewal: 18 occurrences
   ■ subj(renew, he): 615 occurrences
   ■ obj(renew, he): 42 occurrences
   ■ {renewal, he} is subject-preferring
► Object-preferring arguments: 15
► Subject-preferring arguments: 9
► Overall preference for X's renewal: obj
► But is there a way to model non-majority preferences?
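Running the walk-through counts through the argument_preference sketch above reproduces the per-argument labels:

counts = {
    ("renew", "program"): {"subj": 9, "obj": 72},
    ("renew", "he"): {"subj": 615, "obj": 42},
}
argument_preference("renew", "program", counts)  # "obj": 72 > 1.5 * 9
argument_preference("renew", "he", counts)       # "subj": 615 > 1.5 * 42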
Overall preferences
For possessive arguments, e.g. X's renewal, X's possession
► Subj-preferring: 1786 deverbals (67%)
► Obj-preferring: 884 (33%)
► Default: subj
For 'of' arguments, e.g. renewal of X, possession of X
► Subj-preferring: 839 (29%)
► Obj-preferring: 2036 (71%)
► Default: obj
For prenominal arguments, e.g. X protection, X discovery
► Subj-preferring: 373 (11%)
► Obj-preferring: 1037 (31%)
► Other-preferring: 1933 (58%)
► Default: other (= no role)
Incidence: subjects
[Charts: breakdown of subject heads (N=1000) and subject errors (N=220) by head type: verb vs. deverbal (agent, patient, 2-argument) and construction ('of', poss, prenom, other deverbal)]
Incidence: objects
[Charts: breakdown of object heads (N=1000) and object errors (N=260) by head type: verb vs. deverbal (agent, patient, 2-argument) and construction ('of', poss, prenom, other deverbal)]
Evaluation Data
X's DV:
► 1000 hand-annotated sentences
► Possible judgments: subj, obj, other
► Evaluate classification between subj and non-subj

Judged \ System    Subj        Non-subj
Subj               Correct     Incorrect
Non-subj           Incorrect   Correct
Evaluation
DV of X:
► 750 hand-annotated sentences
► Evaluate classification between obj and non-obj

Judged \ System    Obj         Non-obj
Obj                Correct     Incorrect
Non-obj            Incorrect   Correct
Evaluation
DV X:
► 999 hand-annotated sentences
► Evaluate classification between subj, obj, and other (no role)

Judged \ System    Subj        Obj         Other
Subj               Correct     Incorrect   Incorrect
Obj                Incorrect   Correct     Incorrect
Other              Incorrect   Incorrect   Correct
Evaluation measures
Error = Incorrect / (Correct + Incorrect)
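The same measure computed over the confusion matrices above, as a minimal sketch (the nested-dict layout is an assumption):

def error_rate(confusion):
    """confusion[judged][system] -> count.
    Error = Incorrect / (Correct + Incorrect), i.e. the off-diagonal mass."""
    correct = sum(confusion[label][label] for label in confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    return (total - correct) / total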
Deverbal-only Results: Possessives
Combining all arguments for each deverbal reduces role-labeling error by 39% for possessive arguments

[Bar chart: possessive arguments; subj error, non-subj error, and overall error, baseline vs. deverbal-only model]
Deverbal-only Results: ‘of’ args
Error rate is reduced by 44%

[Bar chart: 'of' arguments; obj error, non-obj error, and overall error, baseline vs. deverbal-only model]
Deverbal-only Results: prenominal args
Error rate is reduced by 28%

[Bar chart: prenominal arguments; subj error, obj error, other error, and overall error, baseline vs. deverbal-only model]
Too much smoothing?
Combining all arguments is fairly drastic
Possible features of arguments that may impact
behavior:
► Ontological class: hard to get reliable classifications
► Animacy: subjects tend to be more animate than objects (a cross-linguistic tendency)
   e.g. the program's renewal vs. his renewal
Possible features of deverbals and verbs that may
impact behavior:
► Ontological class
► Active vs. passive use of verbs
Animacy-based model
Split model into separate predictors for animate(X) and inanimate(X)
► Animate: pronouns (I, you, he, she, they)
► Inanimate: common nouns
► Ignored proper names due to poor classification into people vs. places
vs. organizations
If the model has no prediction for the class of the argument encountered, fall back to the deverbal-only model
Results:
► more accurate subject labeling for animate arguments
► lower recall and less accurate object labeling
► overall error rate is about the same as deverbal-only model
► possibly due to insufficient training data
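A minimal sketch of the animacy split; the pronoun list follows this slide, everything else is an assumption:

ANIMATE_PRONOUNS = {"i", "you", "he", "she", "they"}

def animacy_class(arg, is_proper_name):
    """Bucket an argument for the animacy-split model: pronouns count as
    animate, common nouns as inanimate; proper names are skipped because
    person/place/organization classification was unreliable."""
    if is_proper_name:
        return None  # no animacy prediction; fall back to deverbal-only model
    if arg.lower() in ANIMATE_PRONOUNS:
        return "animate"
    return "inanimate"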
Lexicalized model
Try to make predictions for individual DV+argument
pairs
If the model has insufficient evidence for the pair, default
to deverbal-only model
Results:
► For possessive args, performance about the same as
deverbal-only
► For ‘of’ args, performance slightly worse than
deverbal-only
► For prenominal args, much worse performance
► Model is vulnerable to data sparseness and
systematic parsing errors (e.g. weather conditions)
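Both refinements back off to the deverbal-only model; one way to sketch the combined chain (all names hypothetical):

def predict_role(dv, arg, construction, models, defaults):
    """Try models from most specific to most general: lexicalized
    (DV + argument pair), then animacy split, then deverbal-only.
    Each model returns a role or None when it lacks evidence; fall
    through to the construction-level default (subj for possessives,
    obj for 'of', no role for prenominal arguments)."""
    for model in models:
        prediction = model(dv, arg, construction)
        if prediction is not None:
            return prediction
    return defaults[construction]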
DV+animacy / lex results: possessives
[Bar chart: possessive arguments; subj, non-subj, and overall error for baseline, deverbal-only, animacy, and lexicalized models]
DV+animacy / lex results: ‘of’ arguments
[Bar chart: 'of' arguments; obj, non-obj, and overall error for baseline, deverbal-only, animacy, and lexicalized models]
DV+lex results: prenominal args
[Bar chart: prenominal arguments; subj, obj, other, and overall error for baseline, deverbal-only, and lexicalized models]
Training data size: 10K vs. 2M docs
[Bar charts: error rate and coverage vs. training data size (10K vs. 2M documents) for possessive and 'of' arguments, with baseline error for comparison]
Support (“light”) verbs
Tried using the same method to derive support verbs
► e.g. make a decision, take a walk, receive a gift
Look for patterns like:
   John decided vs. John made a decision ==> lv(decision, make, sb)
   We agreed vs. We had an agreement ==> lv(agreement, have, sb)
Initial predictions had quite a few spurious patterns
After manual curation:
► 96 DV-V pairs got a support verb
► 25 unique support verbs
► 28 support verb / argument patterns
The default model is fairly fragile
The tight semantic relationship between light verbs and deverbals makes this method less applicable
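A rough sketch of this pattern search, assuming the corpus has been reduced to (verb, subj, obj) clause triples; the threshold and all names are invented for illustration:

from collections import Counter

def support_verb_candidates(dv, verb, clauses, min_count=5):
    """clauses: (verb, subj, obj) triples from the parsed corpus.
    Propose support ('light') verbs for deverbal dv of verb: verbs that
    take dv as their object while their subject is also attested as a
    subject of the underlying verb (John decided vs. John made a decision)."""
    verbal_subjects = {s for v, s, o in clauses if v == verb and s is not None}
    candidates = Counter()
    for v, s, o in clauses:
        if o == dv and s in verbal_subjects:
            candidates[v] += 1
    return [v for v, n in candidates.items() if n >= min_count]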
Directions for future work
Less ad hoc parameter setting
Further lexicalization of the model
► Predictions for ontological classes of arguments
► Use properties of verbal constructions (e.g. passive
vs. active, tense, etc.)
More fine-grained classification of non-subj/obj roles
► director of 12 years
► Bill Gates’ foundation
► the Delhi Declaration
Conclusions
Knowing how arguments typically participate in events
allows interpretation of ambiguous deverbal syntax
Large parsed corpora are a valuable resource
Even the simplest models greatly reduce error
More data is better
Thanks to:
Scott Waterman
Dick Crouch
Tracy Holloway King
Powerset NLE Team
References
M. Banko and E. Brill, Scaling to very very large corpora for natural language disambiguation, ACL 2001.
S. Riezler, T. H. King, R. Kaplan, J. T. Maxwell III, R. Crouch, and M. Johnson, Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques, ACL 2002.
S. A. Waterman, Distributed parse mining, SETQA-NLP 2009.
O. Gurevich, R. Crouch, T. H. King, and V. de Paiva, Deverbal nouns in knowledge representation, Journal of Logic and Computation, 18, pp. 385-404, 2008.
O. Gurevich and S. A. Waterman, Mapping verbal argument preferences to deverbal nouns, IJSC 4(1), 2010.
M. Nunes, Argument linking in English derived nominals, in Advances in Role and Reference Grammar, R. Van Valin, Ed., John Benjamins, 1993, pp. 375-432.
R. S. Crouch and T. H. King, Semantics via f-structure rewriting, LFG 2006.
C. Macleod, R. Grishman, A. Meyers, L. Barrett, and R. Reeves, NOMLEX: A lexicon of nominalizations, EURALEX 1998.
A. Meyers, R. Reeves, C. Macleod, R. Szekely, V. Zielinska, B. Young, and R. Grishman, The cross-breeding of dictionaries, LREC 2004.
C. Walker and H. Copperman, Evaluating complex semantic artifacts, LREC 2010.
S. Pradhan, H. Sun, W. Ward, J. H. Martin, and D. Jurafsky, Parsing arguments of nominalizations in English and Chinese, HLT-NAACL 2004.
C. Liu and H. T. Ng, Learning predictive structures for semantic role labeling of NomBank, ACL 2007.
M. Lapata, The disambiguation of nominalizations, Computational Linguistics, 28(3), 357-388, 2002.
S. Pado, M. Pennacchiotti, and C. Sporleder, Semantic role assignment for event nominalisations by leveraging verbal data, COLING 2008.