Disambiguating PP attachment sites with graded semantic data

Transcript

Clayton Greenberg
Princeton University
May 16, 2013
One morning, I shot an elephant in my pajamas.
How he got into my pajamas I’ll never know.
-- Groucho Marx
*
* Right Association (Late Closure)
* “Terminal symbols optimally associate to the
lowest non-terminal node” (Kimball 1973, 24).
* Minimal Attachment
* “stipulates that each lexical item (or other node)
is to be attached into the phrase marker with the
fewest possible number of nonterminal nodes
linking it with the nodes which are already
present” (Frazier and Fodor 1978).
*
* If Right Association always held, we would always
get N-attach. To get V-attach, the NP must
“close” so that the PP can attach higher
in the tree.
*
*
* The number of possible attachments grows as the Catalan numbers (see the counting sketch after the examples below):
* (1) put (“the block”, “in the box on [the table in the kitchen]”)
* (2) put (“the block”, “in [the box on the table] in the kitchen”)
* (3) put (“the block in the box”, “on the table in the kitchen”)
* (4) put (“the block in [the box on the table]”, “in the kitchen”)
* (5) put (“[the block in the box] on the table”, “in the kitchen”)
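To make the Catalan growth concrete, here is a minimal Python sketch (an illustration, not part of the original slides) that computes the Catalan numbers; the three stacked PPs in the example above yield the five readings listed, matching C_3 = 5.

    from math import comb

    def catalan(n: int) -> int:
        """n-th Catalan number: C_n = (2n choose n) / (n + 1)."""
        return comb(2 * n, n) // (n + 1)

    # The "put the block in the box on the table in the kitchen" example above
    # stacks three PPs and has the five readings listed, matching C_3 = 5.
    for n in range(1, 7):
        print(n, catalan(n))   # 1  2  5  14  42  132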
*
* Ordered hierarchy of synsets, per POS.
* Synsets contain words, glosses, hypernyms and
hyponyms.
* Equipped to help with data sparsity.
* Provides semi-unsupervised dimensions to data.
* Perhaps able to capture generalizations about
PP-attachment behaviors.
*
kayak (n.), WordNet hypernym chain (each line is the direct hypernym of the one above):
canoe
small boat
boat
vessel, watercraft
craft
vehicle
conveyance, transport
instrumentality, instrumentation
artifact, artefact
whole, unit
object, physical object
physical entity
entity
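This chain can be read off programmatically. Below is a short sketch assuming NLTK with its WordNet data installed; it follows the first hypernym at each level, which for this sense reproduces the chain above (WordNet 3.0).

    from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

    # Take the first noun sense of "kayak" and walk up its hypernym chain.
    synset = wn.synsets('kayak', pos=wn.NOUN)[0]
    print(synset.name(), '|', synset.definition())

    chain = []
    current = synset
    while current.hypernyms():              # follow the first hypernym at each level
        current = current.hypernyms()[0]
        chain.append(', '.join(l.name().replace('_', ' ') for l in current.lemmas()))

    print(' -> '.join(chain))
    # canoe -> small boat -> ... -> physical entity -> entity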
*
* I ate my food with a spork
* I ate my food with an eating utensil (hypernym)
* I ate my food with a fork
* I ate my food with a spoon
* I gobbled my food with a spork
* I consumed my food with a spork
*
* Beyond hierarchies and glosses.
* Teleological links: subjects were asked, “What
do you use a shovel for? Scissors? An oven?”
* Evocation database: subjects were given pairs
of words and asked how much one word evokes
the other.
See -> telescope (weak).
Telescope -> see (strong).
*
* (1) track the number of times verbs and nouns appear without
prepositions
* (2) classify as verb-attach if noun1 is a pronoun (1.8% of our data)
* (3) classify as verb-attach if the verb is passivized
* (4) classify as noun-attach if there is no viable verb to attach to
* (5) if the unambiguous classifications made previously show that one
classification is significantly more likely than the other for the
current data point, assign accordingly
* (6) if the co-occurrence of the verb and preposition differs
significantly from the co-occurrence of noun1 and preposition,
classify accordingly
* (7) classify the remainder as noun-attach.
* Accuracy: 80%
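A rough sketch of this cascade in Python follows; the stats object and its method names are hypothetical placeholders for the counts gathered from unambiguous cases in step (1) and for the significance tests in steps (5) and (6).

    PRONOUNS = {"it", "them", "him", "her", "us", "me", "you",
                "himself", "herself", "itself", "themselves"}

    def classify(verb, noun1, prep, stats):
        """Apply the heuristics above in order; stats holds the step (1) counts."""
        if noun1.lower() in PRONOUNS:                     # (2) pronoun object
            return "V"
        if stats.is_passivized(verb):                     # (3) passivized verb
            return "V"
        if not stats.has_viable_verb(verb):               # (4) no verb to attach to
            return "N"
        bias = stats.unambiguous_bias(verb, noun1, prep)  # (5) prior from unambiguous cases
        if bias is not None:
            return bias
        v = stats.cooccurrence(verb, prep)                # (6) verb-prep vs. noun1-prep
        n = stats.cooccurrence(noun1, prep)
        if stats.significantly_different(v, n):
            return "V" if v > n else "N"
        return "N"                                        # (7) default: noun attachment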
*
id   verb       noun1        prep   noun2         attachment
0    join       board        as     director      V
1    is         chairman     of     N.V.          N
2    named      director     of     conglomerate  N
3    caused     percentage   of     deaths        N
5    using      crocidolite  in     filters       V
6    bring      attention    to     problem       V
9    is         asbestos     in     products      N
12   led        team         of     researchers   N
13   making     paper        for    filters       N
16   including  three        with   cancer        N
18   is         finding      among  those         N
22   is         one          of     nations       N
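For reference, a small sketch of loading quadruples laid out like the rows above; the field names and the whitespace-separated file format are assumptions.

    from typing import NamedTuple, List

    class Quadruple(NamedTuple):
        idx: int
        verb: str
        noun1: str
        prep: str
        noun2: str
        label: str          # "V" = verb attachment, "N" = noun attachment

    def load_quadruples(path: str) -> List[Quadruple]:
        examples = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                idx, verb, noun1, prep, noun2, label = line.split()
                examples.append(Quadruple(int(idx), verb, noun1, prep, noun2, label))
        return examples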
* Used WordNet hierarchies and transformation-based
learning to induce rules for PP attachment.
* Human-readable output.
* Accuracy: 81.8%
* Training corpus enhancements:
* Replaced four-digit numbers with “YEAR”.
* Replaced other numbers with “NUM”.
* Verbs and prepositions were converted to all lowercase.
* In nouns, all words that started with an uppercase letter
followed by a lowercase letter were replaced with “NAME”.
* All strings NAME-NAME were replaced with NAME.
* All verbs were automatically lemmatized.
* Accuracy: 84.5%
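A sketch of these normalizations applied to one quadruple, assuming NLTK's WordNetLemmatizer stands in for whatever lemmatizer was actually used:

    import re
    from nltk.stem import WordNetLemmatizer   # requires: nltk.download('wordnet')

    _lemmatizer = WordNetLemmatizer()

    def normalize(verb, noun1, prep, noun2):
        """Apply the corpus enhancements listed above to one quadruple."""
        def norm_noun(n):
            n = re.sub(r"\b\d{4}\b", "YEAR", n)          # four-digit numbers -> YEAR
            n = re.sub(r"\b\d+\b", "NUM", n)             # other numbers -> NUM
            n = re.sub(r"\b[A-Z][a-z]\w*", "NAME", n)    # capitalized words -> NAME
            n = n.replace("NAME-NAME", "NAME")           # collapse NAME-NAME
            return n
        verb = _lemmatizer.lemmatize(verb.lower(), pos="v")   # lowercase + lemmatize
        return verb, norm_noun(noun1), prep.lower(), norm_noun(noun2)

    # normalize("Joined", "Boeing-Airbus", "in", "1989")  (made-up input)
    # -> ("join", "NAME", "in", "YEAR")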
*
* Developed a custom word sense disambiguation
system for PP-attachment
* Used decision trees as learning algorithm
* Best-performing system on the Ratnaparkhi dataset,
with an accuracy of 88.1%
*
Method                              Accuracy (%)
Always N-attach                     59.0
Most likely for each preposition    72.2
Hindle and Rooth (1993)             80.0
Ratnaparkhi et al. (1994)           81.6
Brill and Resnik (1994)             81.8
Collins and Brooks (1995)           84.5
Stetina and Nagao (1997)            88.1
Average Human (quadruple)           88.2
Average Human (whole sentence)      93.2
*
* The rabbit is in the hat.
* The rabbit is on the hat.
* The rabbit is with the hat.
* She was blamed for the crime.
* She was accused of the crime.
* She was charged with the crime.
*
*
There is a “general tendency” that complements
Merge first, then adjuncts, then specifiers.
Of-PPs seem to always Merge first, implying a
complement relation:
1) A game of cards with incalculable odds.
2) *A game with incalculable odds of cards.
Either by implementing a TBL system or by having the
insight to divide the corpus into preposition groups,
one would discover that the Ratnaparkhi corpus
shows of-PPs attaching to nouns 99.1% of the time.
The only exceptions are phrasal verbs like accuse of
and misclassifications. And of happens to be the
most frequent preposition in English.
I saw the woman with the telescope.
I saw the woman with the red glasses.
I saw the woman with the handbag.
* Verb-attach PPs tend to be
* (1) instruments (the object with which the action is performed)
* (2) locatives (where the action takes place)
* (3) goals (where or to whom the action is leading)
* (4) sources (where or from whom the action initiated)
* (5) manners (how or when the action occurs).
*
* Cleaned up Ratnaparkhi’s corpus.
* Deleted the missing quadruples, removed
influential errors from Abney et al. (1999) and
others as encountered, and adopted Collins and
Brooks (1995) enhancements.
* Found the original sentence for each quadruple in
the Penn Treebank.
* Programmatically determined three disambiguating
glosses for each quadruple.
* Presented MTurkers with the original sentence and the
glosses; they were asked to rate meaning quality from 1 to 5.
*
EXAMPLES:
Original sentence: One morning, I shot an elephant in my pajamas
GOOD: shot in my pajamas, an elephant
EXCELLENT: shot while in my pajamas, an elephant
FAIR: an elephant could be in my pajamas
This sentence has two possible meanings. In one meaning, the shooter is wearing pajamas. In the other,
the elephant is wearing pajamas. The first meaning is more reasonable, so it gets a higher rating. The
"while" makes the second phrase clearer than the first: the shooter is not shooting into the
pajamas. Therefore, the second phrase gets a higher rating than the first.
Original sentence: He saw the woman with the telescope.
EXCELLENT: saw with the telescope, the woman
NEUTRAL: saw while with the telescope, the woman
EXCELLENT: the woman could be with the telescope
This sentence has two possible meanings. In one meaning, "he" used the telescope to see the woman. In
the other, the woman has the telescope. Both meanings seem equally reasonable. However, perhaps you
have a preference for one over the other. There is NO CORRECT answer. The second phrase is less clear
about whether he used the telescope to see the woman, so it gets a lower rating.
Original sentence: He saw the woman with the handbag.
POOR: saw with the handbag, the woman
POOR: saw while with the handbag, the woman
EXCELLENT: the woman could be with the handbag
This sentence has one possible meaning. The woman has the handbag. "He" did not use the handbag to
see the woman. This is the only CORRECT answer. If the sentence has only one meaning, please rate the
phrase with the correct meaning with EXCELLENT and the other(s) with POOR. If you do not get most of
these correct, we may not accept your submission for payment.
* Combine: Ratnaparkhi classifications, WordNet
senses, instrument distances from
morphosemantic links and evocations, and an
idiom list.
* 3,000 idioms. Not edited by hand.
* Presence on the list ended up being the highest
node of the decision tree, as expected.
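A sketch of combining such signals with a decision tree, using scikit-learn; the feature names, encodings, and toy rows below are invented stand-ins for the classifications, WordNet-derived distances, and idiom-list flag described above.

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Hypothetical feature rows, one per quadruple; the values are invented.
    train_features = [
        {"baseline": "V", "instrument_distance": 2.0, "on_idiom_list": 1},
        {"baseline": "N", "instrument_distance": 9.0, "on_idiom_list": 0},
        {"baseline": "V", "instrument_distance": 2.5, "on_idiom_list": 0},
        {"baseline": "N", "instrument_distance": 8.0, "on_idiom_list": 1},
    ]
    train_labels = ["V", "N", "N", "V"]

    vec = DictVectorizer(sparse=False)
    X = vec.fit_transform(train_features)

    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X, train_labels)

    # Human-readable rules; with these toy rows the idiom-list flag ends up at
    # the root of the tree, mirroring the observation above.
    print(export_text(tree, feature_names=list(vec.get_feature_names_out())))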
*
Correctly Classified Instances 17339 89.0230 %
Incorrectly Classified Instances 2138 10.9771 %
Total Number of Instances 19477
=== Confusion Matrix ===
a b <-- classified as
8546 599 | a = V
1539 8793 | b = N
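As a quick arithmetic check, the accuracy and per-class figures follow directly from the confusion matrix above:

    # Confusion matrix above: rows are the true class, columns the predicted class.
    conf = {("V", "V"): 8546, ("V", "N"): 599,
            ("N", "V"): 1539, ("N", "N"): 8793}

    total = sum(conf.values())                         # 19477
    correct = conf[("V", "V")] + conf[("N", "N")]      # 17339
    print(f"accuracy = {correct / total:.4%}")         # 89.0230%

    for cls in ("V", "N"):
        recall = conf[(cls, cls)] / sum(v for (gold, _), v in conf.items() if gold == cls)
        precision = conf[(cls, cls)] / sum(v for (_, pred), v in conf.items() if pred == cls)
        print(cls, f"precision = {precision:.3f}, recall = {recall:.3f}")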
*
* Correlation coefficient: 0.2165
* We can divide the human judgments into three groups:
unvaried V-attach, unvaried N-attach, and varied.
* The varied data was enough to override the
correlation.
* Many reasons for variation: paraphrases were
awkward, occasional genuine ambiguity, misfires on
idioms.
* There were barely any false positives, but many false
negatives, due to the sparseness of inter-POS links in
WordNet.
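A small sketch of that three-way grouping, assuming the raw ratings have already been reduced to each rater's preferred attachment per item (the data layout is hypothetical):

    from collections import defaultdict

    def group_items(judgments):
        """judgments: iterable of (item_id, preferred_attachment) pairs, one per
        rater, where preferred_attachment is "V" or "N". Returns the item ids
        split into unvaried V-attach, unvaried N-attach, and varied groups."""
        per_item = defaultdict(set)
        for item_id, attachment in judgments:
            per_item[item_id].add(attachment)

        groups = {"unvaried_V": [], "unvaried_N": [], "varied": []}
        for item_id, attachments in per_item.items():
            if attachments == {"V"}:
                groups["unvaried_V"].append(item_id)
            elif attachments == {"N"}:
                groups["unvaried_N"].append(item_id)
            else:
                groups["varied"].append(item_id)
        return groups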
*
* Many improvements were resisted: cleaning up the
idiom list, deleting bad test points, throwing out
quadruples with non-WordNet words, etc.
* Connect to frame semantics?
* Investigate further why people have the
intuitions that they do when the context is
underspecified.
* Find better applications for the graded
judgment corpus.
*
* Over the summer, I adapted a version of TBL to
translate prepositions from Persian into
English.
* The system exclusively used the word forms, a
programmatically determined “governor”
(essentially the attachment site), and the
object.
*
Hassan bx Maryym azdwaj Krd.
Literally: Hassan with Maryam got married.
‘Hassan married Maryam.’
So, we need a rule that says, “If the governor is
azdwaj Krdn, delete bx instead of translating it
to with.”
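A minimal sketch of such a governor-conditioned rule; the rule representation and the default translation table are made up for illustration, and the transliterations bx and azdwaj Krdn come from the example above.

    # Each rule: (source preposition, governor it is conditioned on, English output).
    # An output of None means "delete the preposition instead of translating it".
    RULES = [
        ("bx", "azdwaj Krdn", None),
    ]
    DEFAULT_TRANSLATION = {"bx": "with"}

    def translate_preposition(prep, governor):
        """Apply the first matching governor-conditioned rule, else fall back to
        the default translation of the preposition."""
        for src, gov, target in RULES:
            if prep == src and governor == gov:
                return target
        return DEFAULT_TRANSLATION.get(prep, prep)

    # translate_preposition("bx", "azdwaj Krdn")        -> None   (delete, as in the example)
    # translate_preposition("bx", "some_other_governor") -> "with"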
*
* What we wanted:
*
* What we got: