Transcript ppt

Automatic sense prediction for
implicit discourse relations in text
Emily Pitler, Annie Louis, Ani Nenkova
University of Pennsylvania
ACL 2009
Implicit discourse relations
• Explicit comparison
– I am in Singapore, but I live in the United States.
• Implicit comparison
– The main conference is over Wednesday. I am staying
for EMNLP.
• Explicit contingency
– I am here because I have a presentation to give at ACL.
• Implicit contingency
– I am a little tired; there is a 13 hour time difference.
Related work
• Soricut and Marcu (2003)
– Sentence level
• Wellner et al. (2006)
– Used GraphBank annotations, which do not differentiate between implicit and explicit relations.
– Makes it difficult to verify success on implicit relations.
• Marcu and Echihabi (2002)
– Artificial implicit relations
– Delete the connective to generate the dataset.
– [Arg1, but Arg2] => [Arg1, Arg2]
Word pairs investigation
• The most easily accessible features are the words
in the two text spans of the relation.
• There are relationships that hold between the
words in the two arguments.
– The recent explosion of country funds mirrors the
“closed-end fund mania” of the 1920s, Mr. Foot says,
when narrowly focused funds grew wildly popular.
They fell into oblivion after the 1929 crash.
– Popular and oblivion are almost antonyms.
– This pair triggers the contrast relation between the sentences (see the word-pair sketch below).
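As a rough illustration (not the authors' code), the word-pair features can be sketched as follows: every combination of a word from Arg1 and a word from Arg2 becomes one binary feature, so pairs like popular/oblivion are available to the classifier. Whitespace tokenization is an assumption made for brevity.

```python
# Hedged sketch of word-pair features: each (Arg1 word, Arg2 word)
# combination becomes a binary feature. Tokenization is a plain
# whitespace split, which is a simplification.

def word_pair_features(arg1, arg2):
    """Binary word-pair features for two argument spans."""
    features = {}
    for w1 in arg1.lower().split():
        for w2 in arg2.lower().split():
            features["wp:%s|%s" % (w1, w2)] = 1
    return features

arg1 = "narrowly focused funds grew wildly popular"
arg2 = "they fell into oblivion after the 1929 crash"
feats = word_pair_features(arg1, arg2)
print("wp:popular|oblivion" in feats)  # True: the pair that signals the contrast
```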
Word pairs selection
• Marcu and Echihabi (2002)
– Only nouns, verbs, and other cue phrases.
– Word pairs built over all words were superior to those based on only non-function words.
• Lapata and Lascarides (2004)
– Only verbs, nouns, and adjectives.
– Verb pairs are one of the best features.
– No useful information was obtained using nouns and adjectives.
• Blair-Goldensohn et al. (2007)
– Stemming.
– Small vocabulary.
– Cutoff on the minimum frequency of a feature.
– Filtering stop-words has a negative impact on the results.
Analysis of word pair features
• Finding the word pairs with highest information gain on
the synthetic data.
– The government says it has reached most isolated
townships by now, but because roads are blocked, getting
anything but basic food supplies to people remains difficult.
– Remove but => comparison example
– Remove because => contingency example
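A hedged sketch of how such synthetic examples can be generated by deleting the connective; the two-connective mapping below is only illustrative.

```python
# Sketch of the synthetic "implicit" data used in earlier work: remove an
# explicit connective and label the pair of arguments with that connective's
# sense. Only two connectives are mapped here, as in the example above.

CONNECTIVE_SENSE = {"but": "Comparison", "because": "Contingency"}

def make_synthetic_example(arg1, connective, arg2):
    """Return (arg1, arg2, sense) with the connective removed, or None."""
    sense = CONNECTIVE_SENSE.get(connective.lower())
    if sense is None:
        return None
    return (arg1, arg2, sense)

print(make_synthetic_example("roads are blocked", "because",
                             "getting anything but basic food supplies "
                             "to people remains difficult"))
```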
Features for sense prediction
• Polarity tags
• Inquirer tags
• Money/Percent/Number
• Verbs
• First-last/first 3 words
• Context
Polarity tag pairs
• Similar to word pairs, but words replaced with polarity tags.
• Each word’s polarity was assigned according to its entry in the
Multi-perspective Question Answering Opinion Corpus (Wilson et
al., 2005)
• Each sentiment word is tagged as positive, negative, both, or
neutral.
• The number of negated and non-negated positive, negative,
and neutral sentiment words in each span is used as features (see the sketch below).
• Executives at Time Inc. Magazine Co., a subsidiary of Time Warner,
have said the joint venture with Mr. Lang wasn’t a good one.
[Negated Positive]
• The venture, formed in 1986, was supposed to be Time’s low-cost,
safe entry into women’s magazines. [Positive]
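A rough sketch of these polarity counts, assuming a lexicon lookup plus a simple negation check; the tiny POLARITY lexicon and two-word negation window are placeholders for the MPQA resource and whatever negation handling the original system used.

```python
# Rough sketch of the polarity features: count negated and non-negated
# positive/negative/neutral sentiment words in one argument span. The
# two-entry lexicon and two-word negation window are placeholders.

POLARITY = {"good": "positive", "safe": "positive", "crash": "negative"}
NEGATIONS = {"not", "n't", "never", "no"}

def polarity_counts(tokens):
    counts = {}
    for i, tok in enumerate(tokens):
        polarity = POLARITY.get(tok.lower())
        if polarity is None:
            continue
        negated = any(t.lower() in NEGATIONS for t in tokens[max(0, i - 2):i])
        key = ("negated " if negated else "") + polarity
        counts[key] = counts.get(key, 0) + 1
    return counts

print(polarity_counts("the joint venture was n't a good one".split()))
# {'negated positive': 1}
```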
Inquirer Tags
• Look up what semantic categories each word falls
into according to the General Inquirer lexicon
(Stone et al., 1966).
• We see more observations for each semantic class
than for any particular word, which reduces the data
sparsity problem.
• Complementary classes
– “Understatement” vs. “Overstatement”
– “Rise” vs. “Fall”
– “Pleasure” vs. “Pain”
• Only verbs.
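A minimal sketch of the Inquirer-tag pairing over verbs; the two-entry INQUIRER mapping is a stand-in for the real General Inquirer lexicon.

```python
# Sketch of the Inquirer-tag features: each verb is mapped to its General
# Inquirer categories, and category pairs are formed across the two arguments.
# The two-entry lexicon below is only a stand-in for the real resource.

INQUIRER = {"grew": {"Rise"}, "fell": {"Fall"}}

def inquirer_tag_pairs(verbs_arg1, verbs_arg2):
    """Category-pair features for the verbs in Arg1 and Arg2."""
    pairs = set()
    for v1 in verbs_arg1:
        for v2 in verbs_arg2:
            for c1 in INQUIRER.get(v1, ()):
                for c2 in INQUIRER.get(v2, ()):
                    pairs.add((c1, c2))
    return pairs

print(inquirer_tag_pairs(["grew"], ["fell"]))  # {('Rise', 'Fall')}
```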
Money/Percent/Num
• If two adjacent sentences both contain numbers, dollar
amounts, or percentages, a comparison relation is likely
to hold between them.
• Count of numbers, percentages, and dollar amounts in
the two arguments.
• Number of times each combination of
number/percent/dollar occurs in the two arguments.
• Newsweek's circulation for the first six months of 1989
was 3,288,453, flat from the same period last year
• U.S. News' circulation in the same time was 2,303,328,
down 2.6%
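One way to approximate these counts is with regular expressions; the patterns below are simplifications of whatever number/currency recognizer was actually used.

```python
# Sketch of the Money/Percent/Num counts using simple regular expressions.

import re

MONEY = re.compile(r"\$\s?\d[\d,.]*")
PERCENT = re.compile(r"\d[\d,.]*\s?%")
NUMBER = re.compile(r"\b\d[\d,.]*")

def money_percent_num(text):
    """Counts of dollar amounts, percentages, and plain numbers in one span."""
    return {
        "money": len(MONEY.findall(text)),
        "percent": len(PERCENT.findall(text)),
        "number": len(NUMBER.findall(text)),
    }

print(money_percent_num("U.S. News' circulation in the same time "
                        "was 2,303,328, down 2.6%"))
# {'money': 0, 'percent': 1, 'number': 2}  (the naive NUMBER pattern also
# matches the digits inside the percentage)
```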
Verbs
• Number of pairs of verbs in Arg1 and Arg2 from the
same verb class.
– Two verbs are from the same verb class if each of their
highest Levin verb class levels are the same.
– The more related the verbs, the more likely the relation is
an Expansion.
• Average length of verb phrases in each argument
– They [are allowed to proceed] => Contingency
– They [proceed] => Expansion, Temporal
• POS tags of the main verb
– Same tense => Expansion
– Different tense => Contingency, Temporal
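A sketch of the verb features under the assumption that Levin classes are available as a lookup table; LEVIN_CLASS below is invented for illustration, and verb-phrase lengths are taken as given.

```python
# Sketch of the verb features. Real Levin-class lookups need an external
# resource; the LEVIN_CLASS mapping below is a made-up stand-in.

LEVIN_CLASS = {"say": "37.7", "state": "37.7", "proceed": "47.7"}

def verb_features(verbs_arg1, verbs_arg2, vp_lengths_arg1, vp_lengths_arg2):
    """Same-class verb pairs plus average verb-phrase length per argument."""
    same_class = sum(
        1
        for v1 in verbs_arg1
        for v2 in verbs_arg2
        if v1 in LEVIN_CLASS and LEVIN_CLASS[v1] == LEVIN_CLASS.get(v2)
    )
    avg1 = sum(vp_lengths_arg1) / len(vp_lengths_arg1) if vp_lengths_arg1 else 0.0
    avg2 = sum(vp_lengths_arg2) / len(vp_lengths_arg2) if vp_lengths_arg2 else 0.0
    return {"same_class_pairs": same_class,
            "avg_vp_len_arg1": avg1,
            "avg_vp_len_arg2": avg2}

print(verb_features(["say"], ["state"], [4], [1]))
# {'same_class_pairs': 1, 'avg_vp_len_arg1': 4.0, 'avg_vp_len_arg2': 1.0}
```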
First-Last, First3
• Prior work found first and last words very
helpful in predicting sense
– Wellner et al., 2006
– Often explicit connectives
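These features are straightforward to extract; a minimal sketch for one argument span:

```python
# Minimal sketch of the First-Last / First3 features for one argument span.

def first_last_features(tokens, prefix):
    """First word, last word, and the first three words of a span."""
    return {
        prefix + "_first": tokens[0].lower(),
        prefix + "_last": tokens[-1].lower(),
        prefix + "_first3": " ".join(t.lower() for t in tokens[:3]),
    }

print(first_last_features("They fell into oblivion after the 1929 crash".split(),
                          "arg2"))
# {'arg2_first': 'they', 'arg2_last': 'crash', 'arg2_first3': 'they fell into'}
```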
Context
• Some implicit relations appear immediately
before or immediately after certain explicit
relations.
• Indicates whether the immediately
preceding/following relation was explicit, and if so its:
– Connective
– Sense of the connective
• Indicating if an argument begins a paragraph.
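A sketch of the context features, assuming each relation record exposes its neighbours' connective, sense, and an explicit/implicit flag; the field names are invented for illustration.

```python
# Sketch of the context features: connective and sense of an adjacent
# explicit relation, plus whether Arg1 begins a paragraph.

def context_features(prev_rel, next_rel, arg1_starts_paragraph):
    """Features from the surrounding explicit relations and paragraph position."""
    feats = {"arg1_starts_paragraph": int(arg1_starts_paragraph)}
    for name, rel in (("prev", prev_rel), ("next", next_rel)):
        if rel is not None and rel.get("explicit"):
            feats[name + "_connective"] = rel["connective"]
            feats[name + "_sense"] = rel["sense"]
    return feats

print(context_features({"explicit": True, "connective": "but",
                        "sense": "Comparison"}, None, True))
# {'arg1_starts_paragraph': 1, 'prev_connective': 'but', 'prev_sense': 'Comparison'}
```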
Dataset
• Penn Discourse Treebank
• Largest available annotated corpus of discourse
relations
– Penn Treebank WSJ articles
– 16,224 implicit relations between adjacent sentences
• I am a little tired; [because] there is a 13 hour
time difference.
– Contingency.cause.reason
• Use only the top level of the sense annotations.
Top level discourse relations
• Comparison (contrast)
– but, however, yet, even though, nevertheless, ...
• Contingency (cause and effect)
– because, since, therefore, so, ...
• Expansion (coordination)
– also, moreover, furthermore, ...
• Temporal (temporal sequence)
– before that, afterwards, ...
Discourse relations
Relation sense    Proportion of implicits
Expansion         53%
Contingency       26%
Comparison        15%
Temporal           6%
Experiment setting
• Developed features on sections 0-1
• Trained on sections 2-20
• Tested on sections 21-22
• Binary classification task for each sense
– Trained on equal numbers of positive and negative examples
– Tested on natural distribution
– Naïve Bayes classifier
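A hedged sketch of this setup using scikit-learn's Naive Bayes in place of whatever toolkit was actually used: balance the training sample per sense, then score on the natural test distribution.

```python
# Hedged sketch of the one-vs-rest setup: for each sense, train a Naive Bayes
# classifier on a balanced sample and evaluate f-score on the natural test
# distribution. Feature dicts (e.g. word pairs) are assumed as input.

import random
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import f1_score

def train_binary(train_feats, train_senses, target_sense, seed=0):
    """Train on equal numbers of positive and negative examples."""
    pos = [f for f, s in zip(train_feats, train_senses) if s == target_sense]
    neg = [f for f, s in zip(train_feats, train_senses) if s != target_sense]
    random.Random(seed).shuffle(neg)
    neg = neg[:len(pos)]
    vec = DictVectorizer()
    X = vec.fit_transform(pos + neg)
    y = [1] * len(pos) + [0] * len(neg)
    clf = BernoulliNB().fit(X, y)
    return vec, clf

def test_fscore(vec, clf, test_feats, test_senses, target_sense):
    """F-score against the natural (unbalanced) test distribution."""
    y_true = [int(s == target_sense) for s in test_senses]
    y_pred = clf.predict(vec.transform(test_feats))
    return f1_score(y_true, y_pred)
```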
Results: comparison
Features              f-score
First-Last, First3    21.01
Context               19.32
Money/Percent/Num     19.04
Random                 9.91
Polarity is actually the worst feature (f-score 16.63).
Distribution of opposite polarity pairs
                  Positive-Negative or Negative-Positive pairs
Comparison        30%
Not Comparison    31%
Results: contingency
Features              f-score
First-Last, First3    36.75
Verbs                 36.59
Context               29.55
Random                19.11
Results: expansion
Features          f-score
Polarity Tags     71.29
Inquirer Tags     70.21
Context           67.77
Random            64.74
• Expansion is the majority class
• Precision is more problematic than recall
• These features all help other senses
Results: temporal
Features              f-score
First-Last, First3    15.93
Verbs                 12.61
Context               12.34
Random                 5.38
Temporals often end with words like “Monday” or “yesterday”.
Best feature sets
• Comparison
– Selected word pairs.
• Contingency
– Polarity, verb, first/last, modality, context, selected
word pairs.
• Expansion
– Polarity, inquirer tags, context.
• Temporal
– First/last, selected word pairs.
Best results
Relation       F-score    Baseline
Comparison     21.96      17.13
Contingency    47.13      31.10
Expansion      76.41      63.84
Temporal       16.76      16.21
Sequence model for
discourse relations
• Tried a conditional random field classifier.
Model                        Accuracy
Naïve Bayes Model            43.27%
Conditional Random Fields    44.58%
Conclusion
• First study that predicts implicit discourse
relations in a realistic setting.
• Better understanding of word pairs.
– The word-pair features in fact do not capture opposite
semantic relations but rather give information
about function-word co-occurrences.
• Empirical validation of new and old features.
– Polarity, verb classes, context, and some lexical
features indicate discourse relations.