Learning Subjective Adjectives From Corpora

Identifying Subjective Language
Janyce Wiebe
University of Pittsburgh
1
Overview
General area: acquire knowledge of evaluative
and speculative language and use it in NLP
applications
Primarily corpus-based work
Today: results of exploratory studies
2
Collaborators
Rebecca Bruce, Vasileios Hatzivassiloglou,
Joseph Phillips
Matthew Bell, Melanie Martin, Theresa Wilson
3
Subjectivity Tagging
Recognizing opinions and evaluations
(Subjective sentences) as opposed to
material objectively presented as true
(Objective sentences)
Banfield 1982, Fludernik 1993, Wiebe 1994, Stein & Wright 1995
4
Examples
At several different levels, it’s a fascinating
tale. subjective
Bell Industries Inc. increased its quarterly to
10 cents from 7 cents a share. objective
5
Subjectivity
“Complained”
“You Idiot!”
“Terrible product”
“Enthused”
“Wonderful!”
“Great product”
“Speculated”
“Maybe”
6
Examples
Strong addressee-oriented negative evaluation


Recognizing flames (Spertus 1997)
Personal e-mail filters (Kaufer 2000)
I had in mind your facts, buddy, not hers.
Nice touch. “Alleges” whenever facts posted are not
in your persona of what is “real.”
7
Examples
Opinionated, editorial language


IR, text categorization (Kessler et al. 1997)
Do the writers purport to be objective?
Look, this is a man who has great numbers.
We stand in awe of the Woodstock generation’s
ability to be unceasingly fascinated by the subject
of itself.
8
Examples
Belief and speech reports

Information extraction, summarization, intellectual
attribution (Teufel & Moens 2000)
Northwest Airlines settled the remaining lawsuits,
a federal judge said.
“The cost of health care is eroding our standard of
living and sapping industrial strength”, complains
Walter Maher.
9
Other Applications
Review mining (Terveen et al. 1997)
Clustering documents by ideology (Sack 1995)
Style in machine translation and generation
(Hovy 1987)
10
Potential Subjective Elements
Sap: potential subjective element
“The cost of health care is eroding standards
of living and sapping industrial strength,”
complains Walter Maher.
Subjective element
11
Subjectivity
Multiple types, sources, and targets
Somehow grown-ups believed that
wisdom adhered to youth.
We stand in awe of the Woodstock generation’s
ability to be unceasingly fascinated by the
subject of itself.
12
Outline
Data and annotation
Sentence-level classification
Individual words
Collocations
Combinations
13
Annotations
Manually tagged + existing annotations
Three levels:
expression level
sentence level
document level
14
Expression Level Annotations
[Perhaps you’ll forgive me] for reposting his response
They promised [e+ 2 yet] more for [e+ 3 really good]
[e? 1 stuff]
15
Expression Level Annotations
Probably the most natural level
Difficult for manual and automatic tagging:
detailed
no predetermined classification unit
To date:
used for training and bootstrapping
16
Document Level Annotations
Manual: flames in Newsgroups
Existing:
opinion pieces in the WSJ: editorials, letters to
the editor, arts & leisure reviews
* to ***** reviews
+ More directly related to applications, but …
17
Document Level Annotations
Opinion pieces contain objective sentences, and
non-opinion pieces contain subjective sentences
News reports present reactions (van Dijk 1988)
“Critics claim …”
“Supporters argue …”
Editorials contain facts supporting the argument
Reviews contain information about the product
18
Document Level Annotations
In a WSJ data set:
                     subj   obj
opinion pieces       74%    26%
non-opinion pieces   43%    57%
19
Data in this Talk
Sentence level
1000 WSJ sentences
3 judges reached good agreement after rounds
Used for training and evaluation
Expression level
1000 WSJ sentences (2 judges)
462 newsgroup messages (2 judges) + 15413 words (1 judge)
Single round; results promising
Used to generate features, and not for evaluation
20
Data in this Talk
Document level:
Existing opinion-piece annotations used to generate
features
Manually refined classifications used for evaluation
Identified editorials not marked as such
Only clear instances labeled
To date: 1 judge
Distinct from the other data
3 editions, each more than 150K words
21
Sentence Level Annotations
A sentence is labeled subjective if any significant
expression of subjectivity appears
“The cost of health care is eroding our standard of living and
sapping industrial strength,” complains Walter Maher.
“What an idiot,” the idiot presumably complained.
22
Sentence Classification
Probabilistic classifier
Binary Features:
pronoun, adjective, number, modal other than “will”,
adverb other than “not”, new paragraph
Lexical feature:
good for subj; good for obj; good for neither
10-fold cross validation; 51% baseline
72% average accuracy across folds
82% average accuracy on sentences rated certain
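The binary features listed on this slide could be extracted roughly as follows. This is a minimal sketch, not the classifier used in the study: it assumes each sentence arrives already tagged with Penn Treebank part-of-speech tags, treats possessive pronouns as pronouns, and treats the clitic "n't" like "not". The function name `sentence_features` and the feature keys are hypothetical.

```python
def sentence_features(tagged_tokens, starts_paragraph):
    """Return the six binary features for one sentence.

    tagged_tokens: list of (word, Penn Treebank tag) pairs (assumed input format).
    starts_paragraph: True if the sentence begins a new paragraph.
    """
    pairs = [(word.lower(), tag) for word, tag in tagged_tokens]
    return {
        # Assumption: possessive pronouns (PRP$) count as pronouns.
        "pronoun": any(tag in ("PRP", "PRP$") for _, tag in pairs),
        "adjective": any(tag.startswith("JJ") for _, tag in pairs),
        "number": any(tag == "CD" for _, tag in pairs),
        # Modal other than "will".
        "modal_not_will": any(tag == "MD" and word != "will" for word, tag in pairs),
        # Adverb other than "not" (assumption: "n't" is treated the same way).
        "adverb_not_not": any(
            tag.startswith("RB") and word not in ("not", "n't") for word, tag in pairs
        ),
        "new_paragraph": bool(starts_paragraph),
    }


# Hand-tagged example sentence from the slides (tags supplied for illustration).
example = [
    ("At", "IN"), ("several", "JJ"), ("different", "JJ"), ("levels", "NNS"),
    (",", ","), ("it", "PRP"), ("'s", "VBZ"), ("a", "DT"),
    ("fascinating", "JJ"), ("tale", "NN"), (".", "."),
]
features = sentence_features(example, starts_paragraph=False)
```

The resulting dictionary can be handed to any probabilistic classifier that accepts binary features; the lexical feature on the slide is not covered here.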
23
Identifying PSEs
There are few high precision, high frequency
potential subjective elements
24
Identifying Individual PSEs
Classifications correlated with adjectives
Good subsets
Dynamic adjectives (Quirk et al. 1985)
Positive, negative polarity; gradability
automatically identified in corpora
(Hatzivassiloglou & McKeown 1997)
Results from distributional similarity
25
Distributional Similarity
Word similarity based on distributional pattern of words
Much work in NLP (see Lee 1999, Lee & Pereira 1999)
Purposes:
Improve estimates of unseen events
Thesaurus and dictionary construction from corpora
26
Lin’s Distributional Similarity
[Figure: the sentence “I have a brown dog” with relations R1 to R4 drawn between its words]
Word    R     W
I       R1    have
have    R2    dog
brown   R3    dog
...
Lin 1998
27
Lin’s Distributional Similarity
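[Figure: Word1 and Word2, each linked to its own set of (R, W) pairs]
Each word has a set of (R, W) pairs statistically correlated with it:
RWw1 for Word1, RWw2 for Word2, and RWint for the pairs common to both
\[
\mathrm{sim}(\mathit{Word1}, \mathit{Word2}) =
\frac{\sum_{RW \in RW_{int}} \bigl[ I(\mathit{Word1}, RW) + I(\mathit{Word2}, RW) \bigr]}
     {\sum_{RW \in RW_{w1}} I(\mathit{Word1}, RW) + \sum_{RW \in RW_{w2}} I(\mathit{Word2}, RW)}
\]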
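Lin's measure on this slide divides the summed information of the (R, W) pairs two words share by the summed information of all pairs of each word. A minimal sketch, assuming the information values I(Word, R, W) have already been computed and stored per word in a dictionary keyed by (R, W); the function name `lin_similarity` and the toy values are made up for illustration.

```python
def lin_similarity(info1, info2):
    """Similarity of two words from their (relation, word) -> information maps.

    info1, info2: dicts mapping (R, W) pairs to I(Word, R, W), holding only
    the pairs statistically correlated with each word (assumed precomputed).
    """
    shared = info1.keys() & info2.keys()  # the pairs in both sets (RWint)
    numerator = sum(info1[rw] + info2[rw] for rw in shared)
    denominator = sum(info1.values()) + sum(info2.values())
    return numerator / denominator if denominator else 0.0


# Toy information values, invented for the example.
bizarre = {("R3", "story"): 2.0, ("R3", "behavior"): 1.0}
strange = {("R3", "story"): 1.0, ("R3", "noise"): 2.0}
score = lin_similarity(bizarre, strange)  # (2.0 + 1.0) / (3.0 + 3.0)
```

A word compared with itself scores 1.0, and two words with no shared pairs score 0.0, which is what makes the measure usable for ranking nearest neighbours.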
28
Bizarre
strange similar scary unusual fascinating
interesting curious tragic different
contradictory peculiar silly sad absurd
poignant crazy funny comic compelling
odd
29
Filtering
Seed Words -> Words + Clusters -> Filtered Set
Word + cluster removed if precision on training set < threshold
Parameters
Seed Words -> Words + Clusters
Parameters: threshold, cluster size
33
Seeds from Annotations
1000 WSJ sentences with sentence level and
expression level annotations
They promised [e+ 2 yet] more for
[e+ 3 really good] [e? 1 stuff].
"It's [e? 3 really] [e- 3 bizarre]," says Albert
Lerman, creative director at the Wells agency.
34
Experiments
1/10 used for training, 9/10 for testing
Parameters:
Cluster-size fixed at 20
Filtering threshold: precision of
baseline adjective feature on
the training data
+7.5% average over 10-fold cross validation
[More improvements with other adj features]
35
Opinion Pieces
3 WSJ data sets, over 150K words each
For measuring precision:
Prec(S) = # instances of S in opinions /
total # instances of S
Baseline for comparison:
# words in opinions / total # words
Skewed distribution: 13-17% words in opinions
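The precision measure and its baseline on this slide can be written out directly. A minimal sketch, assuming each document is a token list paired with a flag saying whether it is an opinion piece; the function names and the toy corpus are invented for illustration.

```python
def opinion_precision(feature_words, documents):
    """Prec(S): instances of S in opinion pieces / total instances of S.

    documents: list of (tokens, is_opinion_piece) pairs (assumed data layout).
    """
    in_opinions = total = 0
    for tokens, is_opinion in documents:
        count = sum(1 for token in tokens if token in feature_words)
        total += count
        if is_opinion:
            in_opinions += count
    return in_opinions / total if total else 0.0


def baseline_precision(documents):
    """Baseline: words in opinion pieces / total words."""
    in_opinions = sum(len(tokens) for tokens, is_opinion in documents if is_opinion)
    total = sum(len(tokens) for tokens, _ in documents)
    return in_opinions / total if total else 0.0


# Toy corpus, invented for the example: 4 opinion words out of 20.
documents = [
    (["a", "fascinating", "fascinating", "tale"], True),
    (["fascinating"] + ["filler"] * 15, False),
]
gain = opinion_precision({"fascinating"}, documents) - baseline_precision(documents)
```

Reporting the difference from the baseline matters because of the skew: with only 13-17% of words in opinion pieces, a raw precision figure on its own says little.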
36
Parameters
Seed Words -> Words + Clusters
Threshold: 1-70%
Cluster size: 2-40
37
Results
Varies with parameter settings, but there are smooth
regions of the space
Here: training/validation/testing
38
Low Frequency Words
Single instance in a corpus ~ low frequency
Analysis of expression level annotations:
there are many more single-instance words
in subjective elements than outside them
39
Unique Words
Replace all words that appear once in the test data
with “UNIQUE”
+5-10% points
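The replacement described on this slide could be sketched as below, assuming the data is a list of tokenized sentences and that counting is case-sensitive; the function name `map_unique` is hypothetical.

```python
from collections import Counter


def map_unique(sentences, placeholder="UNIQUE"):
    """Replace every token that occurs exactly once in the data with a placeholder.

    sentences: list of token lists (assumed data layout).
    """
    counts = Counter(token for sentence in sentences for token in sentence)
    return [
        [placeholder if counts[token] == 1 else token for token in sentence]
        for sentence in sentences
    ]


mapped = map_unique([["highly", "unorthodox"], ["highly", "talented"]])
# mapped == [["highly", "UNIQUE"], ["highly", "UNIQUE"]]
```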
40
Collocations
here we go again
get out of here
what a
well and good
rocket science
for the last time
just as well
…!
Start with the observation that low precision words
often compose higher precision collocations
41
Collocations
Identify n-gram PSEs as sequences whose precision
is higher than the maximum precision of their constituents
W1,W2 is a PSE if
prec(W1,W2) > max (prec(W1),prec(W2))
W1,W2,W3 is a PSE if
prec(W1,W2,W3) > max(prec(W1,W2),prec(W3)) or
prec(W1,W2,W3) > max(prec(W1),prec(W2,W3))
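The two tests on this slide translate directly into code. A minimal sketch, assuming precisions have already been computed and stored in a dictionary keyed by word tuples; the function names and the precision values are invented for illustration.

```python
def is_pse_bigram(prec, w1, w2):
    """W1,W2 is a PSE if prec(W1,W2) > max(prec(W1), prec(W2))."""
    return prec[(w1, w2)] > max(prec[(w1,)], prec[(w2,)])


def is_pse_trigram(prec, w1, w2, w3):
    """W1,W2,W3 is a PSE if it beats either way of splitting it in two."""
    whole = prec[(w1, w2, w3)]
    return (
        whole > max(prec[(w1, w2)], prec[(w3,)])
        or whole > max(prec[(w1,)], prec[(w2, w3)])
    )


# Toy precisions, invented for the example.
prec = {
    ("rocket",): 0.20, ("science",): 0.25, ("rocket", "science"): 0.60,
    ("well",): 0.30, ("and",): 0.15, ("good",): 0.40,
    ("well", "and"): 0.35, ("and", "good"): 0.50,
    ("well", "and", "good"): 0.45,
}
bigram_ok = is_pse_bigram(prec, "rocket", "science")
trigram_ok = is_pse_trigram(prec, "well", "and", "good")
```

With these toy numbers the trigram passes through the first split (0.45 exceeds both 0.35 and 0.40) even though it fails the second (0.45 is below 0.50), which shows why the criterion is stated as a disjunction.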
42
Collocations
Moderate improvements: +3-10% points
But with all unique words mapped to “UNIQUE”:
+13-24% points
43
Example Collocations with Unique
highly||adverb UNIQUE||adj
highly unsatisfactory
highly unorthodox
highly talented
highly conjectural
highly erotic
44
Example Collocations with Unique
UNIQUE||verb out||IN
farm out
chuck out
ruling out
crowd out
flesh out
blot out
spoken out
luck out
45
Collocations
UNIQUE||adj to||TO UNIQUE||verb
impervious to reason
strange to celebrate
wise to temper
they||pronoun are||verb UNIQUE||noun
they are fools
they are noncontenders
UNIQUE||noun of||IN its||pronoun
sum of its
usurpation of its
proprietor of its
46
Opinion Results: Summary
             Best             Worst
             baseline 17%     baseline 13%
             +prec/freq       +prec/freq
Adjs         +21/373          +09/2137
Verbs        +16/721          +07/3193
2-grams      +10/569          +04/525
3-grams      +07/156          +03/148
1-U-grams    +10/6065         +06/6045
2-U-grams    +24/294          +14/288
3-U-grams    +27/138          +13/144
Disparate features have consistent performance
Collocation sets largely distinct
47
Does it add up?
Good preliminary results classifying opinion pieces
using density and feature count features.
48
Future Work
Mutual bootstrapping (Riloff & Jones 1999)
Co-training (Collins & Singer 1999) to learn both
PSEs and contextual features
Integration into a probabilistic model
Text classification and review mining
49
References
Banfield, A. (1982). Unspeakable Sentences. Routledge and Kegan
Paul.
Collins, M. & Singer, Y. (1999). Unsupervised models for named entity
classification. EMNLP-VLC-99.
van Dijk, T.A. (1988). News as Discourse. Lawrence Erlbaum.
Fludernik, M. (1993). The Fictions of Language and the Languages of
Fiction. Routledge.
Hovy, E. (1987). Generating Natural Language Under Pragmatic
Constraints. PhD dissertation.
Kaufer, D. (2000). Flaming. www.eudora.com
Kessler, B., Nunberg, G., & Schütze, H. (1997). Automatic Detection of
Genre. ACL-EACL-97.
Riloff, E. & Jones, R. (1999). Learning Dictionaries for Information
Extraction by Multi-Level Bootstrapping. AAAI-99.
50
References
Stein, D. & Wright, S. (1995). Subjectivity and Subjectivisation.
Cambridge.
Terveen, L., Hill, W., Amento, B., McDonald, D., & Creter, J. (1997).
Building Task-Specific Interfaces to High Volume Conversational Data.
CHI-97.
Teufel, S. & Moens, M. (2000). What’s Yours and What’s Mine:
Determining Intellectual Attribution in Scientific Texts. EMNLP-VLC-00.
Wiebe, J. (2000). Learning Subjective Adjectives from Corpora. AAAI-00.
Wiebe, J. (1994). Tracking Point of View in Narrative. Computational
Linguistics 20(2).
Wiebe, J., Bruce, R., & O’Hara, T. (1999). Development and Use of a
Gold Standard Data Set for Subjectivity Classifications. ACL-99.
51
References
Hatzivassiloglou, V. & McKeown, K. (1997). Predicting the Semantic
Orientation of Adjectives. ACL-EACL-97.
Hatzivassiloglou, V. & Wiebe, J. (2000). Effects of Adjective Orientation
and Gradability on Sentence Subjectivity. COLING-00.
Lee, L. (1999). Measures of Distributional Similarity. ACL-99.
Lee, L. & Pereira, F. (1999). ACL-99.
Lin, D. (1998). Automatic Retrieval and Clustering of Similar Words.
COLING-ACL-98.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A
Comprehensive Grammar of the English Language. Longman.
Sack, W. (1995). Representing and Recognizing Point of View. AAAI
Fall Symposium on Knowledge Navigation and Retrieval.
52
Sentence Annotations
Ave pair-wise Kappa scores:
all data: .69
certain data: .88 (60% of the corpus)
Case study of analyzing and improving intercoder
reliability:
if there is symmetric disagreement resulting from bias
assessed by fitting probability models (Bishop et al. 1975, CoCo)
•bias: marginal homogeneity
•symmetric disagreement: quasi-symmetry
use the latent class model to correct disagreements
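The average pair-wise Kappa reported at the top of this slide could be computed along these lines. This sketch assumes Cohen's kappa for each pair of judges, with each judge's labels given as a list aligned by sentence; it does not cover the bias and symmetry tests or the latent class model. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two judges labeling the same items.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each judge's own label distribution.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


def average_pairwise_kappa(judges):
    """Mean of Cohen's kappa over all pairs of judges."""
    pairs = list(combinations(judges, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Toy labels, invented for the example.
judge1 = ["subj", "subj", "obj", "obj"]
judge2 = ["subj", "subj", "obj", "subj"]
judge3 = ["subj", "subj", "obj", "obj"]
average = average_pairwise_kappa([judge1, judge2, judge3])
```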
53
Test for Bias:
Marginal Homogeneity
\( p_{i+} = p_{+i} \) for all i
Worse the fit,
greater the bias
[Figure: 4 x 4 contingency table over categories C1 to C4, with row totals 1+ = X1, 2+ = X2, 3+ = X3, 4+ = X4 and column totals +1 = X1, +2 = X2, +3 = X3, +4 = X4]
54
Test for Symmetric Disagreement:
Quasi-Symmetry
Tests relationships among the off-diagonal counts
Better the fit, higher the correlation
[Figure: 4 x 4 contingency table over categories C1 to C4, with the off-diagonal cells marked *]
55
(Potential) Subjective Elements
Same word, different types
“Great majority”: objective
“Great!”: positive evaluative
“Just great.”: negative evaluative
56
Review Mining
From: Hoodoo <[email protected]>
Newsgroups: rec.gardens
Subject: Re: Garden software
I bought a copy of Garden Encyclopedia from Sierra.
Well worth the time and money.
57