+1 - University of Illinois Urbana

Opinion Integration and Summarization
Yue Lu
University of Illinois at Urbana-Champaign
Opinions are needed in all kinds of decision processes
• Business intelligence: “What do people complain about in the iPhone?”
• Health informatics: “How do people like the new drug?”
• Political science: “How is the new policy received?”
Yue Lu
http://sifaka.cs.uiuc.edu/yuelu2/
2
Online opinions cover all kinds of topics
• Topics: people, events, products, services, …
• Sources: blogs, microblogs, forums, reviews, …
• Scale of example sources: 45M reviews, 53M blogs, 65M msgs/day, 1307M posts, 115M users, 10M groups, …
3
After collecting opinions using Google
How could I read them all?
4
Online opinions are complicated
• Aspect
• Sentiment
• Quality
5
Vision: Opinion Integration & Summarization
Online opinions on Topic = t (Sentence 1, Sentence 2, …, Sentence 100, …, Sentence 900, …) pass through three components, Opinion Integration, Sentiment Analysis, and Quality Prediction, to produce an Integrated Summary:

Aspect   | Opinion Sentences | Sentiment | Quality
Aspect 1 | Sentence 512      | positive  | high
Aspect 1 | Sentence 823      | negative  | medium
Aspect 2 | Sentence 21       | neutral   | low
Aspect 2 | Sentence 153      | positive  | high
…        | …                 | …         | …
6
Existing work cannot scale to different topics
Existing work relies heavily on domain-specific:
• Hand-labeled training data
• Hand-written heuristics/rules
• Review summarization
– Unsupervised feature extraction + opinion polarity identification: [Hu&Liu 04], [Popescu&Etzioni 05], …
– Supervised aspect extraction: [Zhuang et al] …
• Hidden aspect discovery: [Hofmann99] [Chen&Dumais00] [Blei et al03] [Zhai et al04] [Li&McCallum06] [Titov&McDonald08] …
• Sentiment classification
– Binary classification: [Pang&Lee02] [Kim&Hovy04] [Cui et al06] …
– Rating classification: [Pang&Lee05] [Snyder&Barzilay07] …
• Opinion quality prediction: [Zhang&Varadarajan `06] [Kim et al. `06] [Liu et al. `08] [Ghose&Ipeirotis `10] …
7
New idea: exploit naturally available resources
For online opinions on Topic = t, exploit:
• Structured Ontology [COLING'10]
• Expert Articles [WWW‘08]
• Overall Sentiment Ratings [WWW‘09] [KDD’10] [WWW’11]
• Social Networks [WWW'10]
8
Intuition: scalable to different topics
Scale of the available resources: 22 M topics, 3.5 M things, 3.5 M articles, >3 K products/y, >3 M users, 500 M users, 45M reviews
Opportunities?
• Provide domain-specific guidance
• Alleviate heavy dependence on human labor
Challenges?
• Cannot directly apply supervised machine learning
• Need for new methods
9
My Work
Online opinions on Topic = t are turned into an Integrated Summary (aspects, opinion sentences, sentiment, quality) by three components:
• Opinion Integration: [WWW’08], [COLING'10]
• Sentiment Analysis: [WWW’09], [KDD’10], [WWW’11]
• Quality Prediction: [WWW’10]
10
Roadmap
• [WWW’11] “Automatic Construction of a Context-Aware Sentiment Lexicon: an Optimization Approach”
[Pipeline diagram repeated: Opinion Integration [WWW’08] [COLING'10], Sentiment Analysis [WWW’09] [KDD’10] [WWW’11], Quality Prediction [WWW’10]]
11
A well-known challenge: sentiments are domain dependent
Example: “unpredictable” under Domain = Movie vs. Domain = Laptop
Existing Work
• Linguistic heuristics: [Hatzivassiloglou&McKeown `97], [Kanayama&Nasukawa `06], …
• Morphology, synonymy: [Neviarouskaya et al `09], [Mohammad et al `09], …
• Seed sentiment words: [Turney&Littman `03], …
• Document-level sentiment rating: [Choi&Cardie `09], …
12
Sentiments are also aspect dependent
Example: “large” under Aspect = Screen vs. Aspect = Battery (Domain = Laptop)
13
New problem: constructing aspect-dependent sentiment lexicon
Input: Laptop Collection + “Aspects”
• SCREEN: screen, LCD, display, …
• BATTERY: battery, power, charger, …
• PRICE: price, cost, money, …
• …
Output: “Aspect-Adj”: sentiment_score
• SCREEN-large +1
• SCREEN-great +1
• BATTERY-large -1
• …
A challenging problem due to increased sparseness
14
Our idea: exploit multiple resources
To score candidate entries such as SCREEN-large, SCREEN-great, and BATTERY-large (score = ?), combine four resources:
1. General Sentiment Lexicon: excellent, awesome, …; bad, terrible, …
2. Overall Sentiment Ratings: rated reviews with text per aspect (Screen: text…, Battery: text…)
3. Dictionary: synonyms (large ~ big, …) and antonyms (large <-> tiny, …)
4. Language Heuristics: “and” clue, “but” clue, “negation” clue
Challenges:
1. Signals in different formats
2. Contradictory signals
15
A Novel Optimization Framework
Objective function designed to encode signals from multiple resources:
S = argmin_S [ λprior (…) + λrating (…) + λsim (…) + λoppo (…) + δ ], subject to constraints
[The individual terms and constraints appeared as images on the slide.]
• S: Aspect-Dependent Sentiment Lexicon, one score per entry (S1 for SCREEN-large, S2 for SCREEN-great, S3 for BATTERY-large, …)
16
1. sentiment prior
The λprior term of the objective ties S to a general-purpose lexicon G.
• G: General-purpose Sentiment Lexicon, e.g. SCREEN-great 1, SCREEN-bad -1, BATTERY-great 1, …
• S: Aspect-Dependent Sentiment Lexicon (S1 for SCREEN-large, S2 for SCREEN-great, S3 for BATTERY-large, …)
17
2. overall sentiment rating
The λrating term of the objective compares ratings predicted from the lexicon with the observed overall ratings: X * S = Predicted Ratings ~ O
• X: Review Word Matrix, e.g. (R1, SCREEN-bright) 0.2, (R1, BATTERY-large) 0.3, (R1, SCREEN-great) 0.5, (R2, SCREEN-awesome) 0.4, …
• S: Aspect-Dependent Sentiment Lexicon (S1 for SCREEN-large, S2 for SCREEN-great, S3 for BATTERY-large, …)
• Predicted Ratings: R1 0.8, R2 0.5, R3 -0.7, R4 0.1, …
• O: Review Overall Ratings: R1 1, R2 1, R3 -1, R4 0, …
18
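The rating signal on the slide above can be sketched in a few lines of Python. The X entries and the observed ratings for R1 and R2 are taken from the slide; the S values and the absolute-difference gap are assumptions made for illustration, since the exact rating term is not recoverable from this transcript.

```python
# Sparse review-word matrix X from the slide: (review, lexicon entry) -> weight.
X = {
    ("R1", "SCREEN-bright"): 0.2,
    ("R1", "BATTERY-large"): 0.3,
    ("R1", "SCREEN-great"): 0.5,
    ("R2", "SCREEN-awesome"): 0.4,
}

# Hypothetical lexicon scores S (assumed values, not results from the talk).
S = {
    "SCREEN-bright": 1.0,
    "BATTERY-large": -1.0,
    "SCREEN-great": 1.0,
    "SCREEN-awesome": 1.0,
}

# Observed overall ratings O for the two reviews, as on the slide.
O = {"R1": 1.0, "R2": 1.0}


def predict_ratings(X, S):
    """Compute X * S: a review's rating is the weighted sum of its entries' scores."""
    predicted = {}
    for (review, entry), weight in X.items():
        predicted[review] = predicted.get(review, 0.0) + weight * S.get(entry, 0.0)
    return predicted


def rating_gap(predicted, O):
    """Sum of absolute gaps between predicted and observed ratings (assumed L1 form)."""
    return sum(abs(predicted.get(r, 0.0) - O[r]) for r in O)


predicted = predict_ratings(X, S)
gap = rating_gap(predicted, O)
```

A lexicon that explains the observed ratings well yields a small gap; here the negative score for BATTERY-large pulls the prediction for R1 down.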
3. similar sentiments
The λsim term of the objective encourages entries linked in A to receive similar scores.
• A: Similar-Sentiment Matrix (from synonyms and “and” clues), e.g. A(SCREEN-large, SCREEN-big) = 1, A(SCREEN-bad, SCREEN-terrible) = 1, A(BATTERY-small, BATTERY-tiny) = 1, …
• S: Aspect-Dependent Sentiment Lexicon (S1 for SCREEN-large, S2 for SCREEN-great, S3 for BATTERY-large, …)
19
4. opposite sentiment
The λoppo term of the objective, subject to constraints, handles entries linked in B: their signs are different while their absolute values are similar.
• B: Opposite-Sentiment Matrix (from antonyms and “but” clues), e.g. B(SCREEN-large, SCREEN-small) = 1, B(SCREEN-excellent, BATTERY-big) = 1, B(BATTERY-small, BATTERY-big) = 1, …
• S: Aspect-Dependent Sentiment Lexicon (S1 for SCREEN-large, S2 for SCREEN-great, S3 for BATTERY-large, …)
• Separate the representation of Sj into Sj+ and Sj-:
– Sign: only one of Sj+, Sj- is active
– Abs Value: value of the active one
20
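The similar- and opposite-sentiment signals can be illustrated with a small Python sketch. The penalty forms below (absolute difference for similar pairs, absolute sum for opposite pairs) are assumptions chosen to match the stated intuition "sign is different, abs value is similar"; the talk's exact terms were images and may differ. The scores in S and the helper names are hypothetical.

```python
def similar_penalty(S, A):
    """Sum of A[j, k] * |S[j] - S[k]|: small when linked entries have similar scores."""
    return sum(w * abs(S[j] - S[k]) for (j, k), w in A.items())


def opposite_penalty(S, B):
    """Sum of B[j, k] * |S[j] + S[k]|: small when signs differ and magnitudes match."""
    return sum(w * abs(S[j] + S[k]) for (j, k), w in B.items())


def split_sign(score):
    """Represent a score as (S_plus, S_minus) with at most one active part."""
    return (score, 0.0) if score >= 0 else (0.0, -score)


# Pairs taken from the slides' example matrices.
A = {("SCREEN-large", "SCREEN-big"): 1}
B = {("SCREEN-large", "SCREEN-small"): 1, ("BATTERY-small", "BATTERY-big"): 1}

# Hypothetical scores, for illustration only.
S = {
    "SCREEN-large": 1.0,
    "SCREEN-big": 1.0,
    "SCREEN-small": -1.0,
    "BATTERY-small": 1.0,
    "BATTERY-big": -0.5,
}

similar = similar_penalty(S, A)
opposite = opposite_penalty(S, B)
```

With these scores the synonym pair incurs no penalty, while BATTERY-small and BATTERY-big are penalized because their magnitudes differ even though their signs are opposite.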
A Novel Optimization Framework
S = argmin_S [ λprior (…) + λrating (…) + λsim (…) + λoppo (…) + δ ], subject to constraints
• λprior term: General sentiment lexicon (resource 1)
• λrating term: Overall rating (resource 2)
• λsim term: Synonyms and “and” clues (resources 3, 4)
• λoppo term: Antonyms and “but” clues (resources 3, 4)
Weights set as the degree we trust each signal
• Transform to linear programming
• Solved efficiently using GAMS/CPLEX
21
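The transcript does not show how the objective is transformed to a linear program. One standard technique, assuming the objective terms are absolute values of linear expressions in S (an assumption the slides do not confirm), replaces each absolute value with an auxiliary variable and two linear constraints. The symbols a, b, and t below are hypothetical names introduced only for this illustration:

```latex
\min_{S}\; \lvert a^{\top} S - b \rvert
\quad\Longleftrightarrow\quad
\min_{S,\,t}\; t
\quad \text{subject to} \quad
-t \;\le\; a^{\top} S - b \;\le\; t .
```

Applying this rewrite to every absolute-value term leaves a linear objective with linear constraints, which is the form that solvers such as CPLEX accept.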
Evaluation: Data Sets

                         | Hotel Data  | Printer Data
Source                   | TripAdvisor | Customer Survey
# doc                    | 4792        | 3511
# aspects                | 7           | 25
AVG length               | 270         | 24
# judged doc             | 750         | 3511
# judged lexicon entry   | 705         | NA
# judged doc-aspect pair | 2145        | 4634

• Evaluation (1): Lexicon Quality
• Evaluation (2): Doc-Aspect Sentiment, aggregating the sentiment of lexicon entries to the doc level
22
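Evaluation (2) aggregates lexicon-entry sentiment to the document level. The slides do not spell out the aggregation rule, so the following Python sketch assumes a simple one: sum the scores of the entries found for each aspect and keep the sign. The entry PRICE-fair is a hypothetical out-of-lexicon example.

```python
def doc_aspect_sentiment(entries, lexicon):
    """Aggregate lexicon scores of the entries found in one document, per aspect.

    entries: list of "ASPECT-adjective" strings extracted from the document.
    lexicon: dict mapping "ASPECT-adjective" to a sentiment score.
    Returns a dict mapping each aspect to +1, 0, or -1 (sign of the summed scores).
    """
    totals = {}
    for entry in entries:
        if entry not in lexicon:
            continue  # entries missing from the lexicon contribute nothing
        aspect = entry.split("-", 1)[0]
        totals[aspect] = totals.get(aspect, 0.0) + lexicon[entry]
    return {aspect: (t > 0) - (t < 0) for aspect, t in totals.items()}


# Lexicon entries from the "New problem" slide.
lexicon = {"SCREEN-large": 1, "SCREEN-great": 1, "BATTERY-large": -1}
doc_entries = ["SCREEN-great", "SCREEN-large", "BATTERY-large", "PRICE-fair"]
result = doc_aspect_sentiment(doc_entries, lexicon)
```

Aspects with no lexicon coverage are simply absent from the result, which is one way low recall can arise for dictionary-only baselines.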
Evaluation (1): Lexicon Quality
OPT > Global > Dictionary
Hotel Data

Method | Description                            | Precision | Recall | F-Score | F-Score gain of OPT
Random | Guess 1, 0, -1 uniformly               | 0.4932    | 0.2784 | 0.3559  |
MPQA   | General dictionary only                | 0.9631    | 0.3702 | 0.5348  | 39%
INQ    | General dictionary only                | 0.8757    | 0.4397 | 0.5855  | 27%
Global | Overall ratings only [Lu et al. WWW09] | 0.7073    | 0.5929 | 0.6451  | 15%
OPT    | Our method with equal weights          | 0.8125    | 0.6823 | 0.7417  |

OPT uses equal weights, i.e. λprior:λrating:λsim:λoppo = 1:1:1:1.
Interesting sample results using OPT:
• Hotel Data: ROOM-private, FOOD-excelent
• Printer Data: INK-fast, SUPPORT-fast
23
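The F-Score values in the lexicon-quality results are consistent with the standard harmonic mean of precision and recall, and the percentages appear to be relative F-Score gains of OPT over each baseline. A short Python check, using the OPT and Global rows from the slide (the function names are illustrative):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(new, old):
    """Relative improvement of `new` over `old`, as a fraction."""
    return (new - old) / old


opt_f = f_score(0.8125, 0.6823)      # OPT row
global_f = f_score(0.7073, 0.5929)   # Global row
gain = relative_gain(opt_f, global_f)
```

Rounded to four decimals, opt_f and global_f match the reported 0.7417 and 0.6451, and the gain rounds to the reported 15%.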
Tuning weights further improves performance

Setting                         | λprior | λsim | λoppo | λrating | F-Score
OPT default: equal weights      | 1      | 1    | 1     | 1       | 0.7417
Dropping one term               | 0      | 1    | 1     | 1       | 0.6549
                                | 1      | 0    | 1     | 1       | 0.7309
                                | 1      | 1    | 0     | 1       | 0.7408
                                | 1      | 1    | 1     | 0       | 0.6453
More weights on important terms | 2      | 1    | 1     | 2       | 0.7431
                                | 3      | 1    | 1     | 3       | 0.7544
                                | 6      | 1    | 1     | 6       | 0.7510
                                | 8      | 1    | 1     | 8       | 0.7506
24
Evaluation (2): Doc-Aspect Sentiment
OPT > Global > Dictionary

Data    | Method | Precision | Recall | F-Score | F-Score gain of OPT | MSE    | MSE reduction by OPT
Printer | Random | 0.4844    | 0.2629 | 0.3408  |                     | 0.7142 |
Printer | MPQA   | 0.7579    | 0.1597 | 0.2639  | 144%                | 0.5740 | 18%
Printer | INQ    | 0.7879    | 0.3502 | 0.4849  | 33%                 | 0.5365 | 13%
Printer | Global | 0.7645    | 0.5448 | 0.6362  | 1%                  | 0.5091 | 8%
Printer | OPT    | 0.8222    | 0.5276 | 0.6428  |                     | 0.4680 |
Hotel   | Random | 0.4368    | 0.3689 | 0.3999  |                     | 0.5670 |
Hotel   | MPQA   | 0.8128    | 0.5289 | 0.6408  | 17%                 | 0.4700 | 11%
Hotel   | INQ    | 0.7800    | 0.6294 | 0.6966  | 8%                  | 0.4561 | 9%
Hotel   | Global | 0.6975    | 0.7730 | 0.7333  | 2%                  | 0.4426 | 6%
Hotel   | OPT    | 0.7283    | 0.7756 | 0.7512  |                     | 0.4160 |
25
Roadmap
• [WWW’10] “Exploiting Social Context for Review Quality Prediction”
[Pipeline diagram repeated: Opinion Integration [WWW’08] [COLING'10], Sentiment Analysis [WWW’09] [KDD’10] [WWW’11], Quality Prediction [WWW’10]]
26
Existing Work of Quality Prediction
• As a supervised learning problem: learn from labeled reviews (√ Very Helpful, × Not Helpful) to predict the quality of unlabeled reviews (?)
• Textual features
• Meta-data features
[Zhang&Varadarajan `06] [Kim et al. `06] [Liu et al. `08] [Ghose&Ipeirotis `10]
27
Base model: Linear Regression
Quality( i ) = Weights × FeatureVector( i ), where the feature vector holds Textual Features
w = argmin_w { loss over the Labeled reviews }
Closed-form: w = [formula shown as an image on the slide]
Labels are expensive to obtain!
28
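The closed-form expression itself did not survive extraction. For plain least-squares linear regression, the standard closed form solves the normal equations (X^T X) w = X^T y; whether the talk's base model adds a regularization term is not stated, so the Python sketch below assumes the unregularized case. The feature values and labels are hypothetical.

```python
def solve_linear_system(M, v):
    """Solve M x = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_linear_regression(X, y):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    d = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(d)] for a in range(d)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(d)]
    return solve_linear_system(xtx, xty)


def predict_quality(w, features):
    """Quality(i) = Weights x FeatureVector(i)."""
    return sum(wi * fi for wi, fi in zip(w, features))


# Hypothetical labeled reviews: features = [bias, normalized length], label = quality.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
w = fit_linear_regression(X, y)
```

Because the fit uses only labeled reviews, its quality depends directly on how many labels are available, which is the limitation the following slides address.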
Our idea: social context can help!
We also observe: Reviewer Identity + Social Network = Social Context
How to use them to help prediction?
Intuitions:
• Quality( [review] ) is related to Quality( [reviewer] )
• Quality( [reviewer] ) is related to its Social Network
29
Our approach: add social context as graph-based regularizers
w = argmin_w { Loss function + β × Graph Regularizer }
• Loss function: the baseline, computed on labeled data
• β: trade-off parameter
• Graph Regularizer: designed to “favor” our intuitions; also uses unlabeled data
How to design the regularizers?
Advantages:
• Semi-supervised: makes use of unlabeled data
• Applicable to reviews without social context
30
Hypothesis 1: Reviewer Consistency
Reviews 1 and 2 come from one reviewer; reviews 3 and 4 come from another.
Quality( 1 ) ~ Quality( 2 )
Quality( 3 ) ~ Quality( 4 )
Reviewers are consistent!
31
Regularizer for Reviewer Consistency
Reviewer Regularizer = ∑ [ Quality( 1 ) - Quality( 2 ) ]^2, summed over pairs of reviews connected in the Same-Author Graph (A)
Closed-form solution! [Zhou et al. 03] [Zhu et al. 03] [Belkin et al 06]
w = [formula shown as an image on the slide; it involves the Review-Feature Matrix and the Graph Laplacian]
32
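As a minimal illustration of the reviewer regularizer, the Python sketch below builds the same-author pairs and sums the squared quality differences. It evaluates the penalty for given quality scores only; the closed-form solution for w is not reproduced here. The author ids and quality values are hypothetical.

```python
from itertools import combinations


def same_author_pairs(review_authors):
    """List all pairs of review ids written by the same reviewer."""
    by_author = {}
    for review, author in review_authors.items():
        by_author.setdefault(author, []).append(review)
    pairs = []
    for reviews in by_author.values():
        pairs.extend(combinations(sorted(reviews), 2))
    return pairs


def reviewer_regularizer(quality, pairs):
    """Sum of squared quality differences over same-author review pairs."""
    return sum((quality[i] - quality[j]) ** 2 for i, j in pairs)


# Hypothetical data: reviews 1 and 2 share a reviewer, as do reviews 3 and 4.
review_authors = {1: "u1", 2: "u1", 3: "u2", 4: "u2"}
quality = {1: 0.5, 2: 0.75, 3: 0.25, 4: 0.25}
pairs = same_author_pairs(review_authors)
penalty = reviewer_regularizer(quality, pairs)
```

The penalty is zero only when every reviewer's reviews receive identical predicted quality, so adding it to the loss pulls predictions for same-author reviews together, including unlabeled ones.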
Hypothesis 2: Trust Consistency
Quality( [truster] ) - Quality( [trusted reviewer] ) ≤ 0
“I trust people with quality at least as good as mine!”
33
Regularizer for Trust Consistency
Trust Regularizer = ∑ max[ 0, Quality( [truster] ) - Quality( [trusted reviewer] ) ]^2, summed over the edges of the Trust Graph
No closed-form solution…
Still convex → Gradient Descent
34
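The trust regularizer and one gradient-descent step can be sketched in Python as follows. For brevity the sketch updates reviewer quality scores directly, whereas the talk optimizes the regression weights w; the step size, reviewer ids, and quality values are hypothetical.

```python
def trust_regularizer(quality, trust_edges):
    """Sum over edges (truster, trusted) of max(0, Q(truster) - Q(trusted))^2."""
    return sum(max(0.0, quality[u] - quality[v]) ** 2 for u, v in trust_edges)


def trust_gradient(quality, trust_edges):
    """Gradient of the regularizer with respect to each reviewer's quality."""
    grad = {r: 0.0 for r in quality}
    for u, v in trust_edges:
        violation = max(0.0, quality[u] - quality[v])
        grad[u] += 2.0 * violation   # lowering Q(truster) reduces the penalty
        grad[v] -= 2.0 * violation   # raising Q(trusted) reduces the penalty
    return grad


def gradient_step(quality, trust_edges, step=0.25):
    """One gradient-descent update on the qualities (hypothetical step size)."""
    grad = trust_gradient(quality, trust_edges)
    return {r: q - step * grad[r] for r, q in quality.items()}


# Hypothetical reviewers: "b" trusts "a" but currently has the higher quality.
quality = {"a": 0.25, "b": 0.75}
edges = [("b", "a")]
before = trust_regularizer(quality, edges)
after = trust_regularizer(gradient_step(quality, edges), edges)
```

Edges that already satisfy the hypothesis contribute neither penalty nor gradient, which is why the max makes the term one-sided while keeping it convex.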
Hypothesis 3 & 4
• Hypothesis 3: Co-citation Consistency (Co-citation Graph)
• Hypothesis 4: Link Consistency (Link Graph)
[Diagram: Trust Graph, Co-citation Graph, Link Graph]
35
Mathematical Formulations
1. Reviewer Consistency: Closed form
2. Trust Consistency: Gradient descent
3. Co-citation Consistency: Closed form
4. Link Consistency: Closed form
36
Evaluation: Data Sets from Ciao UK

Statistics                     | Cellphone | Beauty | Digital Camera
# Reviews                      | 1943      | 4849   | 3697
Reviews/Reviewer ratio         | 2.21      | 2.84   | 1.06
Trust Graph Density            | 0.0075    | 0.014  | 0.0006

Summary                        | Cellphone | Beauty | Digital Camera
Social Context                 | rich      | rich   | sparse
Gold-std Quality Distribution  | balanced  | skewed | balanced
37
Our methods are most effective with limited labeled data
[Figure: % of MSE Difference relative to the Baseline (0% to -16%; lower is better) vs. Percentage of labeled Data (10%, 25%, 50%, 100%), Cellphone]
38
Our methods are most effective with rich social context
[Figure: % of MSE Difference relative to the Baseline (1% to -15%; lower is better) for Cellphone, Beauty, and Digital Camera; annotation: Reviews/Reviewer ratio = 1.06]
39
Summary of this talk
[Pipeline diagram repeated: Opinion Integration, Sentiment Analysis, and Quality Prediction produce the Integrated Summary of aspects, opinion sentences, sentiment, and quality]
40
Summary of this talk
1. Sentiment Analysis: construct aspect-dependent sentiment lexicon
2. Quality Prediction: exploit social context
[Pipeline diagram repeated: Opinion Integration [WWW’08] [COLING'10], Sentiment Analysis [WWW’09] [KDD’10] [WWW’11], Quality Prediction [WWW’10]]
41
Future Directions
• Task-support Applications
• Efficient Algo for Real-time Interaction
• Integrative Analysis
Data scale: 45M reviews, 53M blogs, 65M msgs/day, 1307M posts, 115M users, 10M groups
42
Summary of my other work: Text Information Management
Areas: Text Mining, Bioinformatics, Information Retrieval
• Opinion Integration and Summarization: [KDD 10] [COLING 10] [WWW 08] [WWW 09] [WWW 10] [WWW 11]
• “Investigation of Topic Models” [IRJ 10]
• “An open system for microarray clustering” [NAR 07]
• “Bio literature mining” [NAR 10]
• “Bio literature IR” [IRJ 09] [TREC 07]
43
Thank you!
&
Questions?
Backup Slides
References
[WWW'11] Yue Lu, Malu Castellanos, Umeshwar Dayal, ChengXiang Zhai. "Automatic Construction of a Context-Aware Sentiment Lexicon: An Optimization Approach", To Appear at WWW’11.
[COLING'10] Yue Lu, Huizhong Duan, Hongning Wang and ChengXiang Zhai. "Exploiting Structured Ontology to Organize Scattered Online Opinions", In Proceedings of the 23rd International Conference on Computational Linguistics, Pages: 734-742.
[KDD’10] Hongning Wang, Yue Lu, and ChengXiang Zhai. "Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach", In Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages: 783-792.
[WWW'10] Yue Lu, Panayiotis Tsaparas, Alexandros Ntoulas, and Livia Polanyi. "Exploiting Social Context for Review Quality Prediction", In Proceedings of the 19th International World Wide Web Conference, Pages: 691-700.
[WWW'09] Yue Lu, ChengXiang Zhai and Neel Sundaresan. "Rated Aspect Summarization of Short Comments", In Proceedings of the 18th International World Wide Web Conference, Pages: 131-140.
[WWW'08] Yue Lu and ChengXiang Zhai. "Opinion Integration Through Semi-supervised Topic Modeling", In Proceedings of the 17th International World Wide Web Conference, Pages: 121-130.
46
Other Publications
[IRJ’10] Yue Lu, Qiaozhu Mei, ChengXiang Zhai. "Investigating Task Performance of Probabilistic Topic Models - An Empirical Study of PLSA and LDA", Information Retrieval. (Topic models)
[NAR’10] X. He, Y. Li, R. Khetani, B. Sanders, Yue Lu, X. Ling, C.-X. Zhai, B. Schatz. "BSQA: Integrated Text Mining Using Entity Relation Semantics Extracted from Biological Literature of Insects", Nucleic Acids Research. (Bioinformatics)
[IRJ’09] Yue Lu, Hui Fang and ChengXiang Zhai. "An Empirical Study of Gene Synonym Query Expansion in Biomedical Information Retrieval", Information Retrieval, Volume 12, Issue 1 (2009), Pages: 51-68. (Biomedical IR)
[TREC'07] Yue Lu, Jing Jiang, Xu Ling, Xin He, ChengXiang Zhai. "Language Models for Genomics Information Retrieval: UIUC at TREC 2007 Genomics Track", In Proceedings of the 16th Text REtrieval Conference. (Biomedical IR)
[NAR’07] Yue Lu, Xin He and Sheng Zhong. "Cross-species microarray analysis with the OSCAR system suggests an INSR->Pax6->NQO1 neuro-protective pathway in ageing and Alzheimer's disease", Nucleic Acids Research 105-114. (Bioinformatics)
47
Generating Candidate Lexicon Entries
Input: The LCD is great but battery is so large.
Parsed: [The/DT LCD/NN is/VBZ great/JJ] but/CC [battery/NN is/VBZ so/RB large/JJ] ./.
Aspect Tagged: [The/DT (LCD/NN):SCREEN is/VBZ great/JJ] but/CC [(battery/NN):BATTERY is/VBZ so/RB large/JJ] ./.
Candidates: SCREEN-great, BATTERY-large
Candidate entries (SCREEN-large, SCREEN-great, BATTERY-large, …) enter the lexicon with unknown scores (?).
48
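The candidate-generation example above can be approximated with a short Python sketch. It assumes the sentence is already POS-tagged, treats every coordinating conjunction (CC) as a clause boundary, and pairs each aspect noun with the adjectives in the same clause; these simplifications and all function names are illustrative assumptions rather than the talk's actual procedure.

```python
ASPECT_KEYWORDS = {  # keyword lists from the "Aspects" input slide (lowercased)
    "SCREEN": {"screen", "lcd", "display"},
    "BATTERY": {"battery", "power", "charger"},
}


def aspect_of(word):
    """Return the aspect whose keyword list contains the word, or None."""
    for aspect, keywords in ASPECT_KEYWORDS.items():
        if word.lower() in keywords:
            return aspect
    return None


def candidate_entries(tagged_tokens):
    """Pair each aspect noun with the adjectives in the same clause.

    tagged_tokens: list of (word, POS tag) pairs; a coordinating
    conjunction (CC) is treated as a clause boundary (a simplification).
    """
    candidates, aspects, adjectives = [], [], []

    def flush():
        # Emit ASPECT-adjective pairs for the clause just finished.
        candidates.extend(f"{a}-{adj}" for a in aspects for adj in adjectives)
        aspects.clear()
        adjectives.clear()

    for word, tag in tagged_tokens:
        if tag == "CC":
            flush()
        elif tag.startswith("NN") and aspect_of(word):
            aspects.append(aspect_of(word))
        elif tag.startswith("JJ"):
            adjectives.append(word.lower())
    flush()
    return candidates


tagged = [("The", "DT"), ("LCD", "NN"), ("is", "VBZ"), ("great", "JJ"),
          ("but", "CC"), ("battery", "NN"), ("is", "VBZ"), ("so", "RB"),
          ("large", "JJ"), (".", ".")]
result = candidate_entries(tagged)
```

On the slide's sentence this yields the same two candidates, SCREEN-great and BATTERY-large.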
Hypotheses Testing (1): Reviewer Consistency
[Figure: Density of the Difference in Review Quality (Cellphone); curves: from same reviewer, Qg( 1 ) - Qg( 2 ); from different reviewers, Qg( 1 ) - Qg( 3 )]
Hypothesis 1: Reviewer Consistency is supported by data
49
Hypotheses Testing (2-4): Social Network-based Consistencies
[Figure: Density of the Difference in Reviewer Quality, Qg( B ) - Qg( A ) (Cellphone); curves: B is not linked to A, B trusts A, B is co-cited with A, B is linked to A]
Hypotheses 2-4: Social Network-based Consistencies are supported by data
50
