NLP – Review analysis


Peer-review analysis
Comprehensive exam
Presented by: Wenting Xiong
Committee: Diane Litman, Rebecca Hwa, Jingtao Wang
Motivation
• Goal
  Mine useful information from peers' feedback and represent it in an intuitive and concise way
• Tasks and related research topics
  – Identify review helpfulness (NLP – Review analysis)
  – Summarize reviewers' comments (NLP – Paraphrasing and Summarization)
  – Make sense of review comments via interactive review exploration (HCI – Visual text analytics)
Part 1
NLP -- Review Analysis
Outline
1. Review helpfulness analysis
2. Sentiment analysis (opinion mining)
   – Aspect detection
   – Sentiment orientation
   – Sentiment classification & extraction
1 Review helpfulness analysis
1. Automatic prediction
   – Learning techniques
   – Feature utilities
   – The ground truth
2. Analysis of perceived review helpfulness
   – Users' bias when voting for helpfulness
   – Influence of the other reviews of the same product
1.1 -- Learning techniques
• Problem formalization (a minimal sketch follows below)
  – Input: textual reviews
  – Output: helpfulness score
• Learning algorithms
  – Supervised learning: regression
    • Product reviews (e.g. electronics) <Kim 2006>, <Zhang 2006>, <Liu 2007>, <Ghose 2010>, <O'Mahony 2010>
    • Trip reviews <Zhang 2006>
    • Movie reviews <Zhang 2006>
  – Unsupervised learning: clustering
    • Book reviews <Tsur 2009>
• Focus
  – Predicting absolute scores vs. rankings
  – Identifying the most helpful <Liu 2007> vs. the most unhelpful <Tsur 2009> reviews
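A minimal sketch of this supervised-regression formulation, assuming scikit-learn and invented toy data; the unigram/bigram features and SVM regressor loosely follow the setup of <Kim 2006>, but everything here is illustrative only:

```python
# Textual reviews in, a real-valued helpfulness score out.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# Toy training data: review texts paired with helpfulness in [0, 1]
# (e.g. the fraction of "helpful" votes used as ground truth by <Kim 2006>).
reviews = [
    "Great battery life, but the screen scratches easily.",
    "bad",
    "Detailed comparison with the previous model; very thorough.",
]
helpfulness = [0.8, 0.1, 0.9]

# TF-IDF over unigrams and bigrams feeding an SVM regressor.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), SVR())
model.fit(reviews, helpfulness)

print(model.predict(["Thorough review with clear pros and cons."]))
```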
1.1 -- Feature utilities
• Features used to model review helpfulness (a few are sketched below):

  Category       | Feature type
  ---------------|------------------------------------------------------------
  Low level      | Structural; lexical (unigrams, bigrams); syntactic
  High level     | Linguistic/semantic: 1) domain lexicons, 2) subjectivity; sentiment analysis; readability metrics
  Social factors | Reviewer profile; product ratings

• Controversial results about the effectiveness of subjectivity features
  – Term-based counts are not useful <Kim 2006>, while category-based counts show that positive words correlate with greater helpfulness <Ghose 2010>
  – Data sparsity issues?
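A hedged sketch of a few of the features in the table, assuming plain standard-library Python; the readability proxy (average sentence length) stands in for full readability metrics, and the social-factor inputs are hypothetical:

```python
import re
from collections import Counter

def review_features(text, reviewer_rating=None, product_avg_rating=None):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"\w+", text.lower())
    feats = {
        "n_tokens": len(tokens),                                   # structural
        "n_sentences": len(sentences),                             # structural
        "avg_sentence_len": len(tokens) / max(len(sentences), 1),  # readability proxy
    }
    feats.update({f"uni_{w}": c for w, c in Counter(tokens).items()})  # lexical
    if reviewer_rating is not None and product_avg_rating is not None:
        # Social factor: how far the reviewer's star rating deviates
        # from the product's average rating.
        feats["rating_deviation"] = abs(reviewer_rating - product_avg_rating)
    return feats

print(review_features("Solid camera. Weak battery.", 2.0, 4.1))
```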
1.1 -- The ground truth
• Various gold standards of review helpfulness
  – Aggregated helpfulness votes (perceived helpfulness), e.g. <Kim 2006>
  – Manual annotations of helpfulness (real helpfulness) <Liu 2007>
• Problems
  – The percentage of helpful votes is not consistent with annotators' judgments based on helpfulness specifications
  – Error rate of preference pairs < 0.5 <Liu 2007>
1 Review helpfulness analysis
1. Automatic prediction
   – Learning techniques
   – Feature utilities
   – The ground truth
2. Analysis of perceived review helpfulness
   – Biased voting of review helpfulness on Amazon.com
   – Perceived helpfulness is not determined by the textual content alone
1.2 Analysis of perceived review helpfulness
• Biased voting of review helpfulness on Amazon.com
  – Imbalanced votes
  – Winner-circle bias
  – Early-bird bias <Liu 2007>
  The "x of y" vote ratio does not capture the true helpfulness of reviews
• Perceived helpfulness is not determined by the textual content alone
  – Influence of the other reviews of the same product
  – Individual bias <Danescu-Niculescu-Mizil 2009>
1 Review helpfulness analysis
• Summary
  – Effective features for identifying review helpfulness
  – Perceived helpfulness vs. real helpfulness
• Comments
  – New features
    • Introduce domain knowledge and information from other dimensions
  – Data sparsity problem
    • High-level features
    • Deep learning from low-level features
  – Other machine learning techniques
    • Theory-based generative models
Outline
1. Review helpfulness analysis
2. Sentiment analysis (opinion mining)
2 Sentiment analysis (opinion mining)
Who thinks what about what?
1. Aspect detection
2. Sentiment orientation
3. Sentiment classification & extraction
2.1 Aspect detection
• Frequency-based approach (a minimal sketch follows)
  – Most frequent noun phrases + sentiment-pivot expansion <Liu 2004>
  – PMI (pointwise mutual information) with meronymy discriminators + WordNet <Popescu 2005>
• Generative approach
  – LDA, MG-LDA <Titov 2008>, sentence-level local LDA <Brody 2010>
  – Multiple-aspect sentiment model <Titov 2008>
  – Content-attitude model <Sauper 2011>
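A minimal sketch of the frequency-based idea, simplified to single nouns rather than full noun phrases and omitting the sentiment-pivot expansion step of <Liu 2004>; assumes nltk with the 'punkt' and 'averaged_perceptron_tagger' resources installed, and the frequency threshold is an arbitrary illustrative choice:

```python
from collections import Counter
import nltk

def frequent_aspects(reviews, min_count=3):
    counts = Counter()
    for review in reviews:
        tokens = nltk.word_tokenize(review)
        for word, tag in nltk.pos_tag(tokens):
            if tag.startswith("NN"):          # nouns as aspect candidates
                counts[word.lower()] += 1
    # Frequent nouns are kept as candidate product aspects.
    return [w for w, c in counts.items() if c >= min_count]
```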
2.2 Sentiment orientation
• Aggregating from subjective terms
  – Manually constructed subjective lexicons
• Bootstrapping with PMI (sketched below)
  – Adjectives & adverbs <Turney 2001>
  – Opinion-bearing words <Liu 2004>
• Graph-based approaches
  – Relaxation labeling <Popescu 2005>
  – Scoring <Brody 2010>
• Domain adaptation
  – SCL algorithm <Blitzer 2007>
• Through topic models
  – MAS: aspect-independent + aspect-dependent sentiment <Titov 2008>
  – Content-attitude models: predicted posterior of the sentiment distribution <Sauper 2011>
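A sketch of the PMI bootstrapping idea in the spirit of <Turney 2001>, where SO(w) = PMI(w, "excellent") - PMI(w, "poor"); co-occurrence is counted within toy review snippets here instead of the search-engine hit counts Turney used, and the 0.01 smoothing avoids log(0):

```python
import math
from collections import Counter

reviews = ["excellent phone , superb battery",
           "poor reception and sluggish menus",
           "superb screen , excellent value",
           "sluggish performance , poor build"]

def cooccurrence_counts(docs):
    # counts[w, v] = number of snippets containing both w and v;
    # counts[w, w] doubles as w's document frequency.
    counts = Counter()
    for doc in docs:
        words = set(doc.split())
        for w in words:
            for v in words:
                counts[w, v] += 1
    return counts

def semantic_orientation(word, counts):
    # The PMI difference reduces to a log-odds over co-occurrence counts.
    return math.log2(
        ((counts[word, "excellent"] + 0.01) * (counts["poor", "poor"] + 0.01))
        / ((counts[word, "poor"] + 0.01) * (counts["excellent", "excellent"] + 0.01)))

counts = cooccurrence_counts(reviews)
print(semantic_orientation("superb", counts))    # > 0: positive orientation
print(semantic_orientation("sluggish", counts))  # < 0: negative orientation
```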
2.3 Sentiment classification and extraction
• Classification
  – Binary <Turney 2001>
  – Finer-grained, e.g. metric labeling <Pang 2005>
• Data sparsity
  – Bag-of-words vs. bag-of-opinions <Qu 2010>
• Opinion-oriented extraction
  – Topic of interest
    • Pre-defined
    • Automatically learned
    • User-specified
2 Summary
Comparing review helpfulness analysis and sentiment analysis:
• In terms of automatic prediction, both are metric-inference problems that can be formalized as standard ML problems with the same input X but different output Y
• The learned knowledge about opinion topics and their associated sentiments can help model the general utility of reviews
Part 2
NLP -- Paraphrasing & Summarization
Outline
1. Paraphrasing
   Paraphrases are semantically equivalent to each other
   1. Paraphrase recognition
   2. Paraphrase generation
2. Summarization
   A shorter representation of the same semantic information as the input text
   1. Informativeness computation
   2. Extractive summarization of evaluative text
1.1 Paraphrase recognition
• Discriminative approach (a feature sketch follows)
  – Various string-similarity metrics
  – Different levels of abstraction of textual strings <Malakasiotis 2009>
Question: are there useful existing resources for identifying equivalent semantic information?
  • Word level: dictionaries, WordNet
  • Phrase level: ?
  • Sentence level: ?
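A minimal sketch of such string-similarity features, assuming only the standard library; the metric choices are illustrative, in the spirit of (not identical to) <Malakasiotis 2009>, and would feed any standard classifier:

```python
import difflib

def similarity_features(s1, s2):
    t1, t2 = set(s1.lower().split()), set(s2.lower().split())
    return {
        "jaccard": len(t1 & t2) / max(len(t1 | t2), 1),                 # word overlap
        "char_ratio": difflib.SequenceMatcher(None, s1, s2).ratio(),    # character level
        "len_ratio": min(len(s1), len(s2)) / max(len(s1), len(s2), 1),  # length balance
    }

print(similarity_features("The company bought the startup.",
                          "The startup was acquired by the company."))
```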
1.2 Paraphrase generation
• Corpora
– Monolingual vs. bilingual
• Methods
– Distributional-similarity based
– Corpora based
• Evaluation
– Intrinsic evaluation vs. extrinsic evaluation
1.2 -- Corpora
• Monolingual corpora
– Parallel corpora
• Translation candidates
• Definitions of the same term
– Comparable corpora
• Summary of the same event
• Documents on the same topic
• Bilingual parallel corpora
1.2 -- Methods.1
• Distributional-similarity based methods (a simplified sketch follows)
  – DIRT: paths that frequently occur with the same words at their ends
    • Uses a single monolingual corpus
    • MI to measure the association strength between a slot and its arguments <Lin 2001>
  – Sentence lattices: argument similarity of multiple slots on sentence lattices
    • Uses a comparable monolingual corpus
    • Hierarchical clustering to group similar sentences
    • MSA to induce lattices <Barzilay 2003>
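A simplified sketch of the DIRT intuition: two dependency paths are likely paraphrases if their X and Y slots are filled by similar word sets. Real DIRT weights fillers by mutual information; this version substitutes plain set overlap for brevity, and the paths and fillers below are invented:

```python
def slot_similarity(fillers_a, fillers_b):
    a, b = set(fillers_a), set(fillers_b)
    return len(a & b) / max(len(a | b), 1)

def path_similarity(path_a, path_b):
    # path = {"X": [slot-X fillers], "Y": [slot-Y fillers]};
    # a geometric mean combines the two slot similarities, as in DIRT.
    return (slot_similarity(path_a["X"], path_b["X"])
            * slot_similarity(path_a["Y"], path_b["Y"])) ** 0.5

x_solves_y = {"X": ["drug", "therapy"], "Y": ["problem", "disease"]}
x_cures_y  = {"X": ["drug", "treatment"], "Y": ["disease", "patient"]}
print(path_similarity(x_solves_y, x_cures_y))
```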
1.2 -- Methods.2
• Corpora-based methods
  – Monolingual parallel corpus
    • Monolingual MT <Quirk 2004>
    • Merging partial parse trees into FSAs <Pang 2003>
    • Paraphrasing from definitions <Hashimoto 2011>
  – Monolingual comparable corpus
    • MSR paraphrase corpus <Dolan 2005>
    • Edit distance, journalism conventions
    • Sentence lattices <Barzilay 2003>
  – Bilingual parallel corpus
    • Pivot approach <Callison-Burch 2005>, <Zhao 2008> (sketched below)
    • Random-walk based HTP <Kok 2009>
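A sketch of the pivot idea: marginalize over foreign phrases f that both English phrases translate to, so that p(e2 | e1) = sum_f p(e2 | f) * p(f | e1). The phrase-table fragments and probabilities below are invented for illustration:

```python
def pivot_paraphrase_prob(e1, e2, p_f_given_e, p_e_given_f):
    # Sum over all pivot phrases f aligned to e1.
    return sum(p * p_e_given_f.get(f, {}).get(e2, 0.0)
               for f, p in p_f_given_e.get(e1, {}).items())

# Toy bilingual phrase-table fragments (invented numbers).
p_f_given_e = {"under control": {"unter kontrolle": 0.9}}
p_e_given_f = {"unter kontrolle": {"in check": 0.4, "under control": 0.5}}

print(pivot_paraphrase_prob("under control", "in check",
                            p_f_given_e, p_e_given_f))  # 0.9 * 0.4 = 0.36
```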
1.2 -- Evaluation
• Intrinsic evaluation
  – Responsiveness
    • Can assess precision, but not recall
  – Standard test references <Callison-Burch 2008>
    • Manually aligned corpus
    • Lower-bound precision & relative recall
• Extrinsic evaluation
  – Alignment tasks in monolingual translation
    • Alignment error rate
    • Alignment precision, recall, F-measure <Dolan 2004>
• Model-specific evaluation
  – FSA <Pang 2003>
2 Summarization
Tasks in automatic summarization:
I. Content selection
II. Information ordering
III. Automatic editing, information fusion

Focus of this talk:
1. Informativeness computation
2. Information selection (and generation)
3. Summarization evaluation
2.1 Computing informativeness
• Semantic information (topic identification)
  – Word level (a TF-IDF sketch follows)
    • Frequency, TF-IDF <Liu 2004>, topic signatures <Lin 2001>, PMI(w, topic) <Wang 2011>, external domain knowledge <Zhuang 2006>
  – Sentence level
    • HMM content models <Barzilay 2004>
    • Category classification + sentence clustering <Abu-Jbara 2011>
  – Summary level
    • Sentiment-aspect match model + KL divergence <Lerman 2009>
• Opinion-based sentiment scores for evaluative texts
  – Sentiment polarity, intensity, mismatch, diversity <Lerman 2009>
• Discriminative approach to predicting informativeness
  – Combine statistical, semantic, and sentiment features in linear or log-linear models <Wang 2011>
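A minimal sketch of word-level informativeness via TF-IDF, the simplest measure above, with sentences then scored by the aggregate weight of their words (the rank-based selection on the next slide builds on such scores); plain standard-library Python with toy data:

```python
import math
import re
from collections import Counter

def tfidf_weights(documents):
    docs = [re.findall(r"\w+", d.lower()) for d in documents]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    tf = Counter(w for d in docs for w in d)       # term frequency over the input
    return {w: tf[w] * math.log(n / df[w]) for w in df}

def sentence_score(sentence, weights):
    words = re.findall(r"\w+", sentence.lower())
    return sum(weights.get(w, 0.0) for w in words) / max(len(words), 1)

weights = tfidf_weights(["battery life is great",
                         "the screen is dim",
                         "battery charges fast"])
print(sentence_score("battery life", weights))
```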
2.2 Information selection & generation
• Extraction
  – Rank-based sentence selection
    • Aggregation of word informativeness weights (+ discourse features) <Carenini 2006>, <Wang 2011>
    • Optimized by Maximal Marginal Relevance (sketched below)
  – Topic-based selection
    • HMM content model <Barzilay 2004>
    • Language-model based clustering of informative phrases <Liu 2010>
    • Summarize citations based on category-cluster-sentence structure <Abu-Jbara 2011>
  – Structured evaluative summaries
    • Aspect + overall rating <Hu 2004>
    • Aspect + pros and cons <Zhuang 2006>
    • Hierarchical aspects + sentiment phrasal expressions <Liu 2010>
• Abstraction
  – Generate evaluative arguments based on aggregation of extracted information <Carenini 2006>
  – Graph-based summarization using an adjacency matrix to model dialogue structure <Wang 2011>
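A sketch of Maximal Marginal Relevance for rank-based sentence selection: greedily pick the sentence that balances relevance against redundancy with what is already selected. relevance() and similarity() are stand-ins (e.g. the TF-IDF scores sketched earlier and cosine similarity), and lam=0.7 is an arbitrary illustrative trade-off:

```python
def mmr_select(sentences, relevance, similarity, k=3, lam=0.7):
    selected, candidates = [], list(sentences)
    while candidates and len(selected) < k:
        # MMR score: lam * relevance - (1 - lam) * max similarity
        # to any already-selected sentence.
        best = max(candidates,
                   key=lambda s: lam * relevance(s)
                   - (1 - lam) * max((similarity(s, t) for t in selected),
                                     default=0.0))
        selected.append(best)
        candidates.remove(best)
    return selected
```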
2.3 Summarization evaluation
• Pyramid (empirical)
  – Multiple human-written gold standards
  – SCUs <Ani 2007>
• ROUGE (a minimal sketch follows)
  – Automatically compared against gold standards
  – Correlation based on unigrams, bigrams, and the longest common subsequence <Lin 2004>
• Fully automatic
  – A good summary should be similar to the input
  – KL divergence, JS divergence <Ani 2009>
• User preference of sentiment summarizers

[Table: comparison of the evaluation methods (Responsiveness, Pyramid, ROUGE, fully automatic) by whether each requires a manual summary and/or a manual rating]
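A minimal sketch of ROUGE-1 recall, the simplest of the variants in <Lin 2004>: clipped unigram overlap between a system summary and one reference. Real ROUGE also handles multiple references, bigrams, and the longest common subsequence:

```python
import re
from collections import Counter

def rouge_1_recall(system, reference):
    sys_counts = Counter(re.findall(r"\w+", system.lower()))
    ref_counts = Counter(re.findall(r"\w+", reference.lower()))
    # Clip each overlapping unigram by its count in the system summary.
    overlap = sum(min(c, sys_counts[w]) for w, c in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)

print(rouge_1_recall("the battery lasts long",
                     "reviewers praise the long battery life"))
```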
Paraphrasing and summarization -- Summary
• Common theme
  – Semantic equivalence
• Related to sentiment analysis in computing the informativeness of reviews
  – Aspect-dependent sentiment orientation
    • Overall vs. distribution statistics
  – Aspect coverage
    • Computed through scoring or by measuring a probabilistic model's distribution divergence
Part 3
HCI -- Visual text analytics
Outline
1. Text visualization
   1. Inner-set visualization for abstraction
   2. Intra-set visualization for comparison
2. Interactive exploration
   1. Design principles and examples
1 Text visualization
• Inner-set visualization for abstraction
– Semantic information
– Sentiment information (opinions)
• Intra-set visualization for comparison
1.1 Inner-set visualization techniques
• Semantic information
  – Original text with highlighted keywords
    • Most detailed information
  – Topic-based representations
    • Lists of target entities (Jigsaw <Stasko 2010>)
    • Haystacks (Themail <Viegas 2006>)
    • Tag clouds (OpinionSeer <Wu 2010>, TIARA <Liu 2009>, ReviewSpotlight <Yatani 2011>)
  – Vector-based representations
    • Dots in space (ThemeScapes <Wise 1995>)
1.1 Inner-set visualization techniques
• Sentiment information
  – Value-based visual representations
    • Bars: opinion polarity and intensity <Liu 2005>
    • Histograms: rating distribution <Carenini 2006>
    • Double-squares: frequency, polarity, intensity <Oelke 2009>
    • Thumbnail tables: opinion reports for people in groups <Oelke 2009>
Comment:
  – Requires NLP techniques for opinion mining and sentiment analysis
    • e.g. intelligent support for identifying salient information for exploration (aspects on which opinions are most (in)consistent) <Carenini 2006>
1 Text visualization
• Inner-set visualization for abstraction
– Semantic information
– Sentiment information (opinions)
• Intra-set visualization for comparison
  – Dimensionality of comparison
    • Via layout or by visualizing metadata as axes
1.2 Intra-set visualization techniques
• Dimensionality of exploration
  – 1D: layout or metadata
  – 2D: layout and/or metadata
  – 3D & 3D+: layout and/or metadata
1.2 Intra-set visualization -- 1D Exploration
• Side-by-side
  – Compare single-product reviews feature by feature <Liu 2005>
  – Connect interesting events from different periods of time (Continuum <Andre 2007>)
  – Explore the connections of entities across documents (Jigsaw <Stasko 2010>)
• Grid layout of data in groups
  – Faceted metadata for image browsing <Yee 2003>
  – Facet boxes for presenting filtering by facet data <Lee 2009>
  – Exploring term-based language patterns across documents <Don 2007>
• Timeline: temporal features
  – Themail <Viegas 2006>, Continuum <Andre 2007>, TIARA <Liu 2009>, TwitInfo <Marcus 2011>, etc.
1.2 Intra-set visualization -- 2D Exploration
• Aspect-based opinion analysis across multiple targets
  – Paired <Liu 2005>
  – Matrix <Oelke 2009>
• Scatter plots of targets with metadata as axes
  – Discover the entity coverage in documents (Jigsaw <Stasko 2010>)
  – Visualize digital-library search results with categorical and hierarchical axes <Shneiderman 2000>
• 2D graphs (layout)
  – Explore relationships between entities and documents (Jigsaw <Stasko 2010>)
  – *Diagrams of social networks (TIARA <Liu 2009>)
• Spatial representations in 2D space
  – Triangle scatter plots of opinions (OpinionSeer <Wu 2010>)
  – *Opinion Space <Faridani 2010>
• Circled correlation maps of review aspects <Oelke 2009>
1.2 Intra-set visualization -- 3D Exploration
• 3D spatial representations
  – ThemeScapes <Wise 1995>
    • Theme strength as elevation (terrain map)
• Combining multiple visualizations of metadata variables
  – OpinionSeer <Wu 2010>
    • Radial visualization with concentric rings + stacked graph + triangle scatter plot
  – TIARA <Liu 2010>
    • Stacked topic models (word clouds) over a timeline
Pros:
  – Discover imperceptible interactions among multiple factors
Cons:
  – Concise but hard to interpret
  – Interaction is more complex and harder to design
2 Interactive exploration
Design principles and examples
• Data on demand and in-depth exploration
  From the data perspective:
  – Overview first, then detailed views
  From the interaction perspective:
  – Zoom in and zoom out for exploration
  – Hierarchical filtering for search and browsing
  – Detailed information as tooltips in explanatory visualizations
• Support exploration of multiple interests
  – View switching for interest-specific visualization techniques
  – Query-based content browsing
  – Pivot actions for navigating between related items
• Context preservation
  – Overview + detailed view
  – Support local interactions (hierarchically structured data)
  – A view of the selection history while browsing
Visual text analytics -- Summary
To conclude:
• Text visualization constructs the semantic mapping between text and visual variables
• Visualize metadata together with textual information for comparison and exploration
• Interaction design should follow people's intuitions about data exploration
  – Data characteristics
  – Inherent connections between data and metadata
Visual text analytics -- Connection between NLP and HCI
• NLP helps visual analytics extract the target information and organize it in a desired way
• Visual analytics provides exploratory tools for text analysis and opinion mining
• It also poses challenges to NLP in terms of both new corpora and interesting problems
Conclusion
In terms of my own research interests:
• Review analysis
  – How to model the real helpfulness of peer reviews
• Paraphrasing and summarization
  – How to identify common themes and aggregate comments from different reviewers
• Visual text analytics
  – How to create informative representations of reviews
  – And how to design intuitive interactive exploration for students or teachers to mine useful information

Challenges and contributions:
• Theory-based high-level information of usefulness
• Summary-style paraphrasing
• Visualizing connections between opinions and detailed semantic information in context