Peer-review analysis
Comprehensive exam
• Goal
Mine useful information from peers’ feedback and represent it in an
intuitive and concise way
• Tasks and related research topics
Identify review helpfulness
Summarize reviewers’ comments
NLP – Paraphrasing and Summarization
Sense-making of review comments through interactive review exploration
HCI – Visual text analytics
1. Review helpfulness analysis
2. Sentiment analysis (opinion mining)
Aspect detection
Sentiment orientation
Sentiment classification & extraction
1 Review helpfulness analysis
1.Automatic prediction
– Learning techniques
– Feature utilities
– The ground-truth
2.Analysis of perceived review helpfulness
– Users’ bias when voting for helpfulness
– Influence of the other reviews of the same product
1.1 -- Learning techniques
• Problem formalization
– Input: textual reviews
– Output: helpfulness score
• Learning Algorithms
– Supervised learning – Regression
• Product reviews (e.g. electronics) <Kim 2006>, <Zhang 2006>, <Liu
2007>,<Ghose 2010>, <O'Mahony 2010>
• Trip reviews <Zhang 2006>
• Movie reviews <Zhang 2006>
– Unsupervised learning – Clustering
• Book reviews <Tsur 2009>
• Focus
– Predict absolute scores vs. rankings
– Identify most helpful <Liu 2007> vs. unhelpful <Tsur 2009>
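The supervised formalization above (input: review text, output: helpfulness score) can be sketched as a one-feature least-squares regression. The feature (token count), the toy reviews, and the scores are assumptions made for illustration; they are not data or features taken from the cited studies.

```python
# Minimal sketch: predict a helpfulness score from one textual feature.
# The feature (token count) and the toy data are illustrative assumptions.

def token_count(review: str) -> int:
    """Number of whitespace-separated tokens in the review."""
    return len(review.split())

def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b

# Hypothetical training pairs: (review text, helpfulness score in [0, 1]).
train = [
    ("bad", 0.1),
    ("works fine for me", 0.4),
    ("battery lasts two days and the screen is sharp", 0.9),
]
xs = [token_count(text) for text, _ in train]
ys = [score for _, score in train]
a, b = fit_line(xs, ys)

def predict(review: str) -> float:
    """Predicted helpfulness score for an unseen review."""
    return a * token_count(review) + b
```

Sorting unseen reviews by `predict` turns the same model into a ranker, which is the "absolute scores vs. rankings" distinction noted above.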
1.1-- Feature utilities
• Features used to model review helpfulness
Feature type
Unigrams, bigrams
Low level
Social factors
Semantic: 1) domain lexicons
2) Subjectivity
Sentiment analysis
Readability metrics
High level
Reviewer profile
Product ratings
– Controversial results about the effectiveness of subjectivity
• term-based counts are not useful <Kim, et al., 2006>; category-based counts
show that positive words correlate with greater helpfulness <Ghose, et al.,
– Data sparsity issues?
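To make the feature types above concrete, here is a small sketch that extracts unigram and bigram counts, lexicon-based sentiment counts, and a crude readability proxy from one review. The function names, the tiny lexicons, and the use of average sentence length as a readability stand-in are all assumptions for illustration, not the feature definitions of the cited papers.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Contiguous n-grams joined into strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(review, positive_lexicon, negative_lexicon):
    tokens = tokenize(review)
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    features = Counter()
    # Low-level features: unigram and bigram counts.
    # Note: bigrams here may span a sentence boundary.
    for gram in ngrams(tokens, 1) + ngrams(tokens, 2):
        features["ngram=" + gram] += 1
    # Category-based sentiment counts from hypothetical lexicons.
    features["pos_words"] = sum(t in positive_lexicon for t in tokens)
    features["neg_words"] = sum(t in negative_lexicon for t in tokens)
    # Crude readability proxy (an assumption, not a standard metric).
    features["avg_sentence_length"] = len(tokens) / max(len(sentences), 1)
    return features

feats = extract_features("Great lens. The battery is poor.", {"great"}, {"poor"})
```

The n-gram features grow with the vocabulary while the category counts stay fixed in number, which is one way to see why sparsity is a concern for the low-level features.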
1.1 --The ground-truth
• Various gold standards of review helpfulness
– Aggregated helpfulness votes: perceived helpfulness
e.g. <Kim 2006>
– Manual annotations of helpfulness: real helpfulness
<Liu 2007>
• Problems
Percentage of helpful votes is not consistent with annotators’
judgments based on helpfulness specifications
Error rate of preference pair < 0.5 <Liu 2007>
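One plausible reading of the preference-pair error rate is sketched below: for every pair of reviews where annotators express a strict preference, check whether the vote ratio x/y orders the pair the same way. This definition, the function names, and the toy numbers are assumptions; the source does not spell out how <Liu 2007> computes the measure.

```python
from itertools import combinations

def helpful_ratio(helpful_votes, total_votes):
    """The "x of y people found this helpful" ratio."""
    return helpful_votes / total_votes if total_votes else 0.0

def preference_pair_error(vote_scores, annotator_scores):
    """Fraction of annotator-preferred pairs that the vote scores order
    differently (ties in the vote scores count as errors)."""
    disagreements = 0
    pairs = 0
    for i, j in combinations(range(len(vote_scores)), 2):
        gold = annotator_scores[i] - annotator_scores[j]
        if gold == 0:
            continue  # annotators express no preference for this pair
        pairs += 1
        predicted = vote_scores[i] - vote_scores[j]
        if predicted * gold <= 0:
            disagreements += 1
    return disagreements / pairs if pairs else 0.0

# Hypothetical data: (helpful votes, total votes) and annotator grades.
votes = [(9, 10), (1, 2), (3, 10)]
annotated = [2, 3, 1]
vote_scores = [helpful_ratio(x, y) for x, y in votes]
error = preference_pair_error(vote_scores, annotated)
```

In this toy case one of the three pairs is ordered differently by votes and by annotators, so the error is one third.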
1.2 Analysis of perceived review helpfulness
• Biased voting of review helpfulness on
– Imbalanced vote
– Winner Circle bias
– Early bird bias <Liu 2007>
“x/y” does not capture the true helpfulness of reviews
• The perceived helpfulness is not only determined by the
textual content
– Influence of the other reviews of the same product
– Individual bias <Danescu-Niculescu-Mizil 2009>
• Summary
– Effective features for identifying review helpfulness
– Perceived helpfulness vs. real helpfulness
• Comments
– New features
• Introduce domain knowledge and information from other
– Data sparsity problem
• High-level features
• Deep learning from low-level features
– Other machine learning techniques
• Theory-based generative models
How people think about what?
1.Aspect detection
2.Sentiment orientation
3.Sentiment classification & extraction
2.1 Aspect detection
• Frequency-based approach
– Most frequent noun-phrase + sentiment-pivot expansion <Liu, 2004>
– PMI (pointwise mutual information) with meronymy discriminators +
WordNet <Popescu 2005>
• Generative approach
LDA, MG-LDA <Titov 2008>, sentence-level local-LDA <Brody 2010>
Multiple-aspect sentiment model <Titov 2008>
Content-attitude model <Sauper 2011>
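The frequency-based approach above can be illustrated with a minimal sketch that keeps noun phrases occurring in at least a given share of review sentences. It assumes the noun phrases have already been extracted by a POS tagger or chunker; the function name and the `min_support` threshold are hypothetical, and the sentiment-pivot expansion step is not shown.

```python
from collections import Counter

def frequent_aspects(sentence_noun_phrases, min_support=0.3):
    """Return noun phrases whose sentence-level support meets the threshold.

    sentence_noun_phrases: one list of noun phrases per review sentence
    (assumed to come from an upstream tagger or chunker).
    """
    n_sentences = len(sentence_noun_phrases)
    counts = Counter()
    for phrases in sentence_noun_phrases:
        counts.update(set(phrases))  # count each phrase once per sentence
    return sorted(
        phrase for phrase, c in counts.items() if c / n_sentences >= min_support
    )

# Toy input: noun phrases found in four review sentences.
tagged = [
    ["battery", "camera"],
    ["battery life", "battery"],
    ["screen"],
    ["camera", "price"],
]
aspects = frequent_aspects(tagged, min_support=0.5)
```

Rare aspects such as "screen" are dropped by the threshold, which is the usual weakness of purely frequency-based detection.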
2.2 Sentiment orientation
• Aggregating from subjective terms
– Manually constructed subjective lexicons
• Bootstrapping with PMI
– Adj & adv <Turney 2001>
– opinion-bearing words <Liu 2004>
• Graph-based approach
– Relaxation labeling <Popescu 2005>
– Scoring <Brody 2010>
• Domain adaptation
– SCL algorithm <Blitzer 2007>
• Through topic models
– MAS -- aspect-independent + aspect-dependent <Titov 2008>
– Content-attitude models -- predicted posterior of sentiment
distribution <Sauper, 2011>
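The PMI bootstrapping idea above can be sketched as follows: the orientation of a phrase is its PMI with a positive seed word minus its PMI with a negative seed word. The counts below are invented, the seed words (for example "excellent" and "poor") are only named as typical choices, and the function names are hypothetical. Zero counts would need smoothing, which this sketch omits.

```python
import math

def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information: log2( p(a, b) / (p(a) * p(b)) )."""
    p_joint = joint / total
    return math.log2(p_joint / ((count_a / total) * (count_b / total)))

def semantic_orientation(cooc_pos, cooc_neg, count_phrase,
                         count_pos, count_neg, total):
    """PMI(phrase, positive seed) - PMI(phrase, negative seed)."""
    return (pmi(cooc_pos, count_phrase, count_pos, total)
            - pmi(cooc_neg, count_phrase, count_neg, total))

# Toy counts: the phrase occurs 20 times in 1000 contexts, and co-occurs
# 8 times with the positive seed and 2 times with the negative seed.
so = semantic_orientation(cooc_pos=8, cooc_neg=2, count_phrase=20,
                          count_pos=100, count_neg=100, total=1000)
```

A positive value marks the phrase as positively oriented; here the phrase co-occurs with the positive seed four times more often than chance and with the negative seed exactly at chance.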
2.3 Sentiment classification and extraction
• Classification
– Binary <Turney 2001>
– Finer-grained e.g. metric labeling <Pang 2005>
Data sparsity
– Bag-of-Words vs. Bag-of-Opinions <Qu 2010>
• Opinion-oriented extraction
– Topic of interest
Automatically learned
2 Summary
Comparing reviews’ helpfulness and sentiment
• In terms of automatic prediction, both are metric-inference
problems that can be formalized as standard ML problems
with the same input X but different outputs Y
• The learned knowledge about opinion topics and the
associated sentiments would help model the general utility
of reviews
Paraphrasing & Summarization
1. Paraphrasing
Paraphrases are semantically equivalent to each other
1. Paraphrase recognition
2. Paraphrase generation
2. Summarization
Shorter representation of the same semantic information of
the input text
1. Informativeness computation
2. Extractive summarization of evaluative text
1.1 Paraphrase recognition
• Discriminative approach
–Various string similarity metrics
–Different level of abstraction of textual strings
<Malakasiotis 2009>
Any useful existing resources for identifying semantic equivalence?
• Word-level: dictionary, WordNet
• Phrase-level: ?
• Sentence-level: ?
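As an illustration of the string-similarity features mentioned above, the sketch below computes edit distance at two levels of abstraction (characters and tokens) plus token-set Jaccard overlap. These three metrics and the two levels are chosen as examples; the source does not list the exact metrics or abstraction levels used in <Malakasiotis 2009>.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def jaccard(a, b):
    """Set overlap: |A intersect B| / |A union B|."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def similarity_features(s1, s2):
    """Feature dictionary that a discriminative classifier could consume."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    return {
        "char_edit_distance": edit_distance(s1.lower(), s2.lower()),
        "token_edit_distance": edit_distance(t1, t2),
        "token_jaccard": jaccard(t1, t2),
    }

pair_feats = similarity_features("the cat sat", "the cat sits")
```

Each sentence pair becomes one feature vector, and a binary classifier trained on labeled pairs then decides whether the pair is a paraphrase.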
1.2 Paraphrase generation
• Corpora
– Monolingual vs. bilingual
• Methods
– Distributional-similarity based
– Corpora based
• Evaluation
– Intrinsic evaluation vs. extrinsic evaluation
1.2 -- Corpora
• Monolingual corpora
– Parallel corpora
• Translation candidates
• Definitions of the same term
– Comparable corpora
• Summary of the same event
• Documents on the same topic
• Bilingual parallel corpora
1.2 -- Methods.1
• Distributional-similarity based methods
– DIRT, paths frequently occur with same words at their
• Using a single monolingual corpus
• MI to measure association strength between slot and its
arguments <Lin 2001>
– Sentence-lattices, argument similarity of multiple slots
on sentence-lattices
• Using a comparable monolingual corpus
• Hierarchical clustering for grouping similar sentences
• MSA to induce lattices <Barzilay 2003>
1.2 -- Methods.2
• Corpora-based methods
– Monolingual parallel corpus
• Monolingual MT <Quirk 2004>
• Merging partial parse trees into an FSA <Pang 2003>
• Paraphrasing from definitions <Hashimoto 2011>
– Monolingual comparable corpus
• MSR paraphrase corpus <Dolan 2005>
• Edit distance, Journalism convention
• Sentence-lattices <Barzilay 2003>
– Bilingual parallel corpus
• Pivot approach <Callison-Burch 2005> <Zhao 2008>
• Random-walk based HTP <Kok 2009>
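The pivot approach listed above can be sketched as marginalizing over foreign phrases: p(e2 | e1) = sum over f of p(f | e1) * p(e2 | f). The phrase tables below are invented toy values and the function name is hypothetical; a real system would estimate the tables from a word-aligned bilingual parallel corpus.

```python
from collections import defaultdict

def pivot_paraphrases(e_to_f, f_to_e, phrase):
    """Score paraphrase candidates of `phrase` by pivoting through
    foreign-language phrases: sum_f p(f | phrase) * p(candidate | f)."""
    scores = defaultdict(float)
    for foreign, p_f_given_e in e_to_f.get(phrase, {}).items():
        for candidate, p_e_given_f in f_to_e.get(foreign, {}).items():
            if candidate != phrase:  # skip the trivial self-paraphrase
                scores[candidate] += p_f_given_e * p_e_given_f
    return dict(scores)

# Toy translation tables (illustrative probabilities only).
e_to_f = {"under control": {"unter kontrolle": 0.8, "im griff": 0.2}}
f_to_e = {
    "unter kontrolle": {"under control": 0.6, "in check": 0.4},
    "im griff": {"under control": 0.5, "in hand": 0.5},
}
paraphrases = pivot_paraphrases(e_to_f, f_to_e, "under control")
```

Here "in check" scores 0.8 * 0.4 and "in hand" scores 0.2 * 0.5, so the candidate reached through the more likely pivot ranks first.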
1.2 -- Evaluation
• Intrinsic evaluation
– Responsiveness
• Can assess precision, but not recall
– Standard test references <Callison-Burch 2008>
• Manually aligned corpus
• Lower bound precision & relative recall
• Extrinsic evaluation
– Alignment tasks in monolingual translation
• Alignment error rate
• Alignment precision, recall, F-measure <Dolan 2004>
• Model-specific evaluation
– FSA <Pang 2003>
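The alignment-based extrinsic measures above can be sketched with the sure/possible link convention that is common in word-alignment evaluation: precision is measured against possible links, recall against sure links, and AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|). Whether the cited work uses exactly these definitions is not stated in the source, so treat this as an assumption; the function name and the toy links are hypothetical.

```python
def alignment_scores(predicted, sure, possible=()):
    """Precision, recall, F-measure and alignment error rate (AER).

    predicted: links proposed by the system, as (i, j) index pairs
    sure:      gold links that annotators marked as certain
    possible:  additional gold links marked as acceptable
    """
    predicted, sure = set(predicted), set(sure)
    possible = set(possible) | sure  # sure links are also possible links
    if not predicted or not sure:
        raise ValueError("need at least one predicted and one sure link")
    hit_sure = len(predicted & sure)
    hit_possible = len(predicted & possible)
    precision = hit_possible / len(predicted)
    recall = hit_sure / len(sure)
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    aer = 1 - (hit_sure + hit_possible) / (len(predicted) + len(sure))
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "aer": aer}

scores = alignment_scores(
    predicted=[(0, 0), (1, 1), (2, 3)],
    sure=[(0, 0), (1, 1), (2, 2)],
    possible=[(2, 3)],
)
```

In the toy case every predicted link is at least possible (precision 1.0), but one sure link is missed, so recall is 2/3 and AER is 1/6.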
2 Summarization
Tasks in automatic summarization
I. Content selection
II. Information ordering
III. Automatic editing, information fusion
Focus of this talk
1. Informativeness computation
2. Information selection (and generation)
3. Summarization evaluation
2.1 Computing informativeness
• Semantic information (Topic identification)
– Word-level
• Frequency, TFIDF <Liu 2004>, Topic signature <Lin 2001>, PMI(w, topic)
<Wang 2011>, external domain knowledge <Zhuang 2006>
– Sentence-level
• HMM content models <Barzilay 2004>
• Category classification + sentence clustering <Abu-Jbara 2011>
– Summary-level
• Sentiment-aspect match model + KL divergence <Lerman 2009>
• Opinion-based sentiment scores for evaluative texts
• Sentiment polarity, intensity, mismatch, diversity <Lerman 2009>
• Discriminative approach to predict informativeness
• Combine statistical, semantic, and sentiment features in linear or log-linear
models <Wang 2011>
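Word-level informativeness as listed above can be sketched with TF-IDF weights, which are then averaged to score a sentence. This uses one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency); the cited systems may use different variants, and the function names and toy documents are assumptions.

```python
import math
from collections import Counter

def tfidf(documents):
    """Per-document word weights: (count / doc length) * ln(N / doc freq)."""
    n_docs = len(documents)
    tokenized = [doc.lower().split() for doc in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # each word counted once per document
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in tf.items()
        })
    return weights

def sentence_score(sentence, word_weights):
    """Average word weight as a simple sentence-level informativeness score."""
    tokens = sentence.lower().split()
    if not tokens:
        return 0.0
    return sum(word_weights.get(t, 0.0) for t in tokens) / len(tokens)

docs = [
    "the battery is great",
    "the screen is dim",
    "the battery drains fast",
]
weights = tfidf(docs)
```

Words that appear in every document, such as "the", receive weight zero, while words unique to one document receive the highest weight.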
2.2 Information selection & generation
• Extraction
– Rank-based sentence selection
• Aggregation of word informative weights (+ discourse features) <Carenini, 2006>
<Wang, 2011>
• Optimized by Maximal Marginal Relevance
– Topic-based selection
• HMM content model <Barzilay, 2004>
• Language-model based clustering of informative phrases <Liu, 2010>
• Summarize citations based on category-cluster-sentence <Abu-Jbara, 2011>
– Structured evaluative summary
• Aspect + overall rating <Hu, 2004>
• Aspect + pros and cons <Zhuang, 2006>
• Hierarchical aspects + sentiment phrasal expressions <Liu 2010>
• Abstraction
– Generate evaluative arguments based on aggregation of extracted
information <Carenini, 2006>
– Graph-based summarization using adjacency matrix to model dialogue
structure <Wang, 2011>
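The rank-based selection optimized by Maximal Marginal Relevance, listed under extraction above, can be sketched as a greedy loop: at each step pick the sentence that maximizes lam * relevance minus (1 - lam) * similarity to the sentences already chosen. The relevance scores, the word-overlap similarity, and the default `lam` are assumptions for illustration.

```python
def word_overlap(a, b):
    """Jaccard overlap of the word sets of two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def mmr_select(sentences, relevance, k, lam=0.7, similarity=word_overlap):
    """Greedy MMR: trade off relevance against redundancy with the summary."""
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max(
                (similarity(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]

sentences = [
    "the battery is great",
    "the battery is really great",
    "the screen is dim",
]
relevance = [0.9, 0.85, 0.5]  # hypothetical informativeness scores
summary = mmr_select(sentences, relevance, k=2, lam=0.5)
```

With `lam=0.5` the second pick is the screen sentence rather than the near-duplicate battery sentence, even though the latter has higher relevance.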
2.3 Summarization evaluation
• Pyramid (empirical)
– Multiple human-written gold standards
– SCU <Ani 2007>
– Automatically compare with gold-standard
– Consider correlation based on unigram, bigram,
longest common subsequence <Lin 2004>
• Fully automatic
– Good summary should be similar to the input
– KL divergence, JS divergence <Ani 2009>
• User preference of sentiment summarizer
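Two of the automatic measures above can be sketched in a few lines: n-gram recall against a gold-standard summary (in the style commonly known as ROUGE-N) and Jensen-Shannon divergence between the unigram distributions of the input and the summary. Tokenization by whitespace, the lack of smoothing, and the function names are simplifying assumptions.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the reference n-gram count."""
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    total = sum(ref.values())
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total if total else 0.0

def js_divergence(text_a, text_b):
    """Jensen-Shannon divergence (base 2) between unigram distributions.
    0 means identical distributions, 1 means no shared words."""
    count_a = Counter(text_a.lower().split())
    count_b = Counter(text_b.lower().split())
    n_a, n_b = sum(count_a.values()), sum(count_b.values())
    if not n_a or not n_b:
        raise ValueError("both texts must contain at least one token")
    divergence = 0.0
    for word in set(count_a) | set(count_b):
        p, q = count_a[word] / n_a, count_b[word] / n_b
        m = (p + q) / 2
        if p:
            divergence += 0.5 * p * math.log2(p / m)
        if q:
            divergence += 0.5 * q * math.log2(q / m)
    return divergence
```

The recall measure needs human references, while the divergence only compares the summary with its own input, which is why the latter counts as fully automatic.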
Paraphrasing and summarization -- Summary
• Common theme
– Semantic equivalence
• Related to sentiment analysis
in computing informativeness of reviews
– Aspect-dependent sentiment orientation
• Overall vs. distribution statistics
– Aspect coverage
• Compute through scoring or measuring probabilistic model's
distribution divergence
HCI -- Visual text analytics
1. Text visualization
1. Inner-set visualization for abstraction
2. Intra-set visualization for comparison
2. Interactive exploration
1. Design principles and examples
1 Text visualization
• Inner-set visualization for abstraction
– Semantic information
– Sentiment information (opinions)
• Intra-set visualization for comparison
1.1 Inner-set visualization techniques
• Semantic information
– Original text with highlighted keywords
• Most detailed information
– Topic-based representation
• List of target entities (Jigsaw, <Stasko 2010>)
• Haystack (Themail, <Viegas 2006>)
• Tagcloud (OpinionSeer <Wu 2010>, TIARA <Liu 2009>,
reviewSpotlight <Yatani, 2011>)
– Vector-based representation
• Dot in space (ThemeScapes <Wise 1995>)
• Sentiment information
– Value-based visual representation
Bar -- Opinion polarity and intensity <Liu 2005>
Histogram -- Rating distribution <Carenini 2006>
Double-square -- Frequency, polarity, intensity <Oelke 2009>
Thumbnail table -- opinion report for people in groups <Oelke
– Requires NLP techniques for opinion mining and sentiment
• e.g. Intelligent support for identifying salient information for exploration
(aspects on which opinions are most (in)consistent) <Carenini 2006>
• Intra-set visualization for comparison
– Dimensionality of comparison
• Via layout or visualizing metadata as axis
1.2 Intra-set visualization techniques
• Dimensionality of exploration
– 1D: layout or metadata
– 2D: layout or/and metadata
– 3D & 3D+: layout or/and metadata
1.2 Intra-set visualization -- 1D Exploration
• Side-by-side
– Compare single product reviews feature-by-feature <Liu 2005>
– Connect interesting events of different time periods (Continuum, <Andre
– Explore the connection of entities across documents (Jigsaw, <Stasko
• Grid-layout of data in groups
– Faceted metadata for image browsing <Yee 2003>
– Facetbox for presenting filtering by facet-data <Lee 2009>
– Exploring term-based language patterns across documents <Don 2007>
• Timeline -- temporal features
– Themail <Viegas 2006>, Continuum <Andre 2007>, Tiara <Liu 2009>, TwitInfo
<Marcus 2011> etc.
1.2 Intra-set visualization -- 2D Exploration
• Aspect-based opinion analysis across multiple targets
Paired <Liu 2005>
Matrix <Oelke 2009>
• Scatter plot of targets with metadata as axis
– Discover the entity-coverage in documents (Jigsaw <Stasko 2010>)
– Visual DL search result with categorical and hierarchical axes
<Shneiderman 2000>
• 2D graph (layout)
Exploring relationships between entities and documents (Jigsaw
<Stasko 2010>)
– *Diagram of social network (TIARA <Liu 2009>)
• Spatial representation in 2D space
Triangle scatter-plot of opinions (OpinionSeer <Wu 2010>)
*Opinion space <Faridani 2010>
• Circled correlation map of review aspects <Oelke 2009>
1.2 Intra-set visualization -- 3D Exploration
• 3D spatial representation
ThemeScapes <Wise 1995>
Theme strength as elevation (terrain map)
• Combine multiple visualization of metadata variables
OpinionSeer <Wu 2010>
Radial visualization with concentric rings
+ stacked graph
+ triangle scatter plot
TIARA <Liu 2010>
Stacked topic-models (Wordcloud)
over timeline
– Discover unperceivable interactions among multiple factors
– Concise but hard to interpret
– Interaction is more complex and hard to design
2 Interactive exploration
Design principles and examples
•Data on-demand and in-depth exploration
From the data perspective
–Overview then detailed view
From the interaction perspective
–zoom-in and zoom-out for exploration
–Hierarchical filtering for search and browse
–Detail information as tooltip in explanatory visualization
•Support exploration of multiple interests
–View switching for interest-specific visualization techniques
–Query-based content browsing
–Pivot action for navigating between related items
•Context preserving
–Overview + detailed view
–Support local interactions (hierarchically structured data)
–A view of selection history of browsing
Visual text analytics -- summary
To conclude
•Text visualization constructs the semantic mapping between
the text and visual variables
•Visualize metadata together with textual information for
comparison and exploration
•Interaction design should follow human intuition about data
–Data characteristics
–Inherent connection between data and metadata
Visual text analytics -- Connection between NLP and HCI
• NLP helps visual analytics extract the target
information and organize it in a desired way
• Visual analytics provides exploratory tools for text analysis
and opinion mining
• Poses challenges to NLP in terms of both new corpora
and interesting problems
In terms of my own research interest
•Review analysis
– How to model the real helpfulness of peer-reviews
•Paraphrasing and summarization
– How to identify common themes and aggregate comments from
different reviewers
•Visual text analytics
– How to create informative representations of reviews
– And design intuitive interactive exploration for students or teachers to
mine useful information
Challenges and contributions
• Theory-based high level information of usefulness
• Summary-style paraphrasing
• Visualize connection between opinions with detailed
semantic information in context