Sentiment analysis / classification with MaxEnt

Sentiment Analysis + MaxEnt*
MAS.S60
Rob Speer
Catherine Havasi
* Lots of slides borrowed from lots of sources! See end.
People on the Web have opinions
The world is full of text
• Customer verbatims
• Blogs
• Comments
• Reviews
• Forums
Measuring public opinion through social media?
[Diagram: polls query people in the U.S. (“I like Obama” / “I do not”); people also write text, which can be queried and aggregated into a text sentiment measure. Can we derive a similar measurement?]
Anne Hathaway
• Oct. 3, 2008 - Rachel Getting Married opens: BRK.A up .44%
• Jan. 5, 2009 - Bride Wars opens: BRK.A up 2.61%
• Feb. 8, 2010 - Valentine's Day opens: BRK.A up 1.01%
• March 5, 2010 - Alice in Wonderland opens: BRK.A up .74%
• Nov. 24, 2010 - Love and Other Drugs opens: BRK.A up 1.62%
• Nov. 29, 2010 - Anne announced as co-host of the Oscars: BRK.A up .25%
Application: Information Extraction
“The Parliament exploded into fury against the
government when word leaked out…”
Observation: subjectivity often causes false hits for IE
Goal: augment the results of IE
Subjectivity filtering strategies to improve IE (Riloff, Wiebe, Phillips, AAAI05)
ICWSM 2008
Sentiment can be hard to analyze
“This is where the money was spent, on well-choreographed kung-fu sequences, on giant Kevlar
hamster balls, on smashed-up crates of bananas,
and on scorpions. Ignore the gaping holes in the plot
(how, exactly, if the villain's legs were broken, did he
escape from the secret Nazi base, and why didn't he
take the key with him?). Don't worry about the
production values, or what, exactly, the Japanese girl
was doing hitchhiking across the Sahara. Just go see
the movie.”
• http://www.killermovies.com/o/operationcondor/reviews/6na.html
Thwarted Expectations Narrative
• “I thought it was going to be amazing… but it’s
not unless you’re a hungover college student.”
– Tripadvisor, Amy’s Café
This is a messy task
• Inter-annotator agreement on sentiment
analysis tasks can be as low as 70%
• Pang et al., 2002: adding n-grams doesn’t
seem to help
Twitter Mood Swings
Alan Mislove, Northeastern
[Figures: Daily Mood, Weekly Mood]
Opinion mining tasks
• At the document (or review) level:
– Task: sentiment classification of reviews
– Classes: positive, negative, and neutral
– Assumption: each document (or review) focuses on a single object (not true in many discussion posts) and contains opinion from a single opinion holder.
• At the sentence level:
– Task 1: identifying subjective/opinionated sentences
    • Classes: objective and subjective (opinionated)
– Task 2: sentiment classification of sentences
    • Classes: positive, negative and neutral.
    • Assumption: a sentence contains only one opinion; not true in many cases.
– Then we can also consider clauses or phrases.
Opinion Mining Tasks (cont.)
• At the feature level:
– Task 1: Identify and extract object features that have been commented on by an opinion holder (e.g., a reviewer).
– Task 2: Determine whether the opinions on the features are positive, negative or neutral.
– Task 3: Group feature synonyms.
– Produce a feature-based opinion summary of multiple reviews.
• Opinion holders: identifying holders is also useful, e.g., in news articles, but they are usually known in user-generated content, i.e., they are the authors of the posts.
Bags of Words
• Look for certain keywords
• “Valence” of a word
• Advantage: Works quickly
• Disadvantage: Lexical Creativity
Sentiment analysis: word counting
• Subjectivity Clues lexicon from OpinionFinder
/ U Pitt
– Wilson et al 2005
– 2000 positive, 3600 negative words
• Procedure
1. Keep only the topical messages
2. Count the messages containing these positive and negative words
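A minimal sketch of the counting procedure, under stated assumptions: the two word sets below are a tiny hypothetical stand-in for the OpinionFinder lexicon, a message counts as "topical" if it contains the topic keyword, and the function name is made up for illustration.

```python
import re

# Toy lexicon: a hypothetical stand-in for the OpinionFinder subjectivity lexicon.
POSITIVE = {"good", "help", "funny", "fantastic"}
NEGATIVE = {"bad", "slump", "bearish", "crackdown"}


def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_sentiment_messages(messages, topic):
    """Count topical messages that contain positive / negative lexicon words.

    A message containing both kinds of word is counted once in each tally.
    """
    pos_count = neg_count = 0
    for message in messages:
        tokens = set(tokenize(message))
        if topic not in tokens:
            continue  # step 1: keep only messages about the topic
        if tokens & POSITIVE:
            pos_count += 1  # step 2: message contains a positive word
        if tokens & NEGATIVE:
            neg_count += 1  # step 2: message contains a negative word
    return pos_count, neg_count


messages = [
    "Good news on jobs today",
    "The jobs report is bad, a real slump",
    "New jobs will help, but the outlook is still bad",
    "I had a fantastic lunch",
]
pos, neg = count_sentiment_messages(messages, "jobs")
print(pos, neg)
```

Counting messages rather than word occurrences means one very emotional message cannot dominate the tally.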
Main resources
• Lexicons
– General Inquirer (Stone et al., 1966)
– OpinionFinder lexicon (Wiebe & Riloff, 2005)
– SentiWordNet (Esuli & Sebastiani, 2006)
• Annotated corpora
– Used in statistical approaches (Hu & Liu 2004, Pang & Lee 2004)
– MPQA corpus (Wiebe et al., 2005)
• Tools
– Algorithm based on minimum cuts (Pang & Lee, 2004)
– OpinionFinder (Wiebe et al., 2005)
Corpus
• MPQA: www.cs.pitt.edu/mqpa/databaserelease (version 2)
• English language versions of articles from the world press (187 news
sources)
• Also includes contextual polarity annotations
• Themes of the instructions:
– No rules about how particular words should be annotated.
– Don’t take expressions out of context and think about what they could
mean, but judge them as they are used in that sentence.
Gold Standards
• Derived from manually annotated data
• Derived from “found” data (examples):
– Livejournal Cambria, Havasi 2008
– Blog tags Balog, Mishne, de Rijke EACL 2006
– Websites for reviews, complaints, political arguments
• amazon.com Pang and Lee ACL 2004
• complaints.com Kim and Hovy ACL 2006
• bitterlemons.com Lin and Hauptmann ACL 2006
• Word lists (example):
– General Inquirer Stone et al. 1966
A note on the sentiment list
• This list is not well suited for social media English.
– “sucks”, “ :) ”, “ :( ”
(Top examples)
word         valence    count
will         positive   3934
bad          negative   3402
good         positive   2655
help         positive   1971
(Random examples)
word         valence    count
funny        positive   114
fantastic    positive   37
cornerstone  positive   2
slump        negative   85
bearish      negative   17
crackdown    negative   5
Patterns
• Lexico-syntactic patterns Riloff & Wiebe 2003
• way with <np>: … to ever let China use force to
have its way with …
• expense of <np>: at the expense of the world’s
security and stability
• underlined <dobj>: Jiang’s subdued tone …
underlined his desire to avoid disputes …
Conjunction
*We cause great leaders
Statistical association
• If words of the same orientation are likely to co-occur,
then the presence of one makes the other more probable
(co-occurrence within a window, in a particular context, etc.)
• Use statistical measures of association to capture this
interdependence
– E.g., Mutual Information (Church & Hanks 1989)
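As a rough illustration of such an association measure, here is a sketch of pointwise mutual information, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), computed over whole-document co-occurrence. The toy documents, the choice of a document as the co-occurrence window, and the function name are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Pointwise mutual information for word pairs that co-occur in a document.

    Probabilities are estimated as the fraction of documents containing
    the word (or both words of the pair).
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = set(doc.lower().split())
        word_counts.update(words)
        # Sort so each unordered pair has one canonical (x, y) key.
        pair_counts.update(combinations(sorted(words), 2))
    table = {}
    for (x, y), n_xy in pair_counts.items():
        p_xy = n_xy / n_docs
        p_x = word_counts[x] / n_docs
        p_y = word_counts[y] / n_docs
        table[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return table


docs = [
    "great fantastic movie",
    "great fantastic acting",
    "awful boring movie",
    "awful boring plot",
]
pmi = pmi_table(docs)
print(pmi[("fantastic", "great")])  # words of the same orientation
print(pmi[("great", "movie")])      # an orientation word and a neutral word
```

In this toy data the two positive words score higher together than a positive word paired with a neutral one, which is the signal the slide describes. Pairs that never co-occur are simply absent from the table.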
Sentiment Ratio Moving Average
• High day-to-day
volatility.
• Average last k days.
• Keyword “jobs”,
k = 1, 7, 30
• (Gallup tracking
polls: 3 or 7-day
smoothing)
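The smoothing step can be sketched as a trailing moving average. This sketch assumes the daily sentiment ratio is the positive message count divided by the negative message count, which the slide does not spell out, and the daily counts are made up.

```python
def moving_average(values, k):
    """Average each day's value with up to k-1 preceding days."""
    smoothed = []
    for i in range(len(values)):
        # The window is shorter than k for the first k-1 days.
        window = values[max(0, i - k + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical daily (positive_count, negative_count) pairs for one keyword.
daily_counts = [(30, 10), (10, 20), (40, 10), (20, 20)]
daily_ratio = [pos / neg for pos, neg in daily_counts]

for k in (1, 2, 7):
    print(k, moving_average(daily_ratio, k))
```

With k = 1 the series is unchanged; larger k trades responsiveness for stability, which is why k = 7 and k = 30 look much calmer than the raw daily ratio.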
Smoothed comparisons
“jobs” sentiment
Beyond good and bad
• Can we identify excitement, embarrassment,
fear, and all kinds of other emotions?
Sentiment as Topics
The Hourglass of Emotions
A quantified version of Robert Plutchik’s psychoevolutionary
“wheel of emotions” (1980)
SenticNet
• Augments ConceptNet with emotion-tagged
data
• Learns a function from semantic vectors to the
emotion space
• Evaluation: classify LJ posts that are tagged
with “Current mood: ...”
Learning valence
• Most classifiers are effectively learning a
valence for every feature
– funny = +1
– disappointed = -2
– seagal = -3
Naïve Bayes again?
• Sure, okay
• But interesting n-grams clearly aren’t
independent
– “Gwyneth Paltrow” will be double-counted every
time
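A small worked example of the double counting, assuming a made-up likelihood ratio of 2 for each name token. Naive Bayes adds one log-ratio per feature, so two tokens that always appear together contribute their shared evidence twice.

```python
import math


def nb_log_odds(tokens, log_ratio, prior_log_odds=0.0):
    """Naive Bayes log-odds of positive vs. negative for a bag of tokens.

    log_ratio[t] is log(p(t | positive) / p(t | negative)); the independence
    assumption means these terms are simply summed.
    """
    return prior_log_odds + sum(log_ratio.get(t, 0.0) for t in tokens)


# Hypothetical ratios: both tokens carry the same evidence because they
# only ever occur together as one name.
log_ratio = {"gwyneth": math.log(2.0), "paltrow": math.log(2.0)}

single = nb_log_odds(["paltrow"], log_ratio)
both = nb_log_odds(["gwyneth", "paltrow"], log_ratio)
print(math.exp(single))  # odds if the name counted as one piece of evidence
print(math.exp(both))    # odds Naive Bayes actually reports
```

The reported odds are the square of what a single piece of evidence would justify, so the classifier becomes overconfident.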
Maximum Entropy (MaxEnt)
• MaxEnt finds a probability distribution that
follows a logistic curve
• Doesn’t require independence
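A sketch of how a MaxEnt model turns feature weights into class probabilities: sum the weights of the active features for each class, exponentiate, and normalize. The weights here are hypothetical and hand-set (they echo the toy valences from the "Learning valence" slide); a real model would learn them from data.

```python
import math

# Hypothetical weights for (class, word) features; not learned from data.
WEIGHTS = {
    ("positive", "funny"): 1.0,
    ("positive", "disappointed"): -2.0,
    ("positive", "seagal"): -3.0,
}
CLASSES = ("positive", "negative")


def maxent_probs(words):
    """Return p(class | words) = exp(sum of active weights) / normalizer."""
    scores = {
        c: sum(WEIGHTS.get((c, w), 0.0) for w in words) for c in CLASSES
    }
    z = sum(math.exp(s) for s in scores.values())  # makes the result sum to 1
    return {c: math.exp(s) / z for c, s in scores.items()}


probs = maxent_probs(["funny", "seagal"])
print(probs)
```

With two classes this reduces to the logistic curve applied to the difference of the two scores.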
The logistic (logit) distribution
• c = class
• d = data
• f = features in the data
• λ = a value for every feature
• Z = whatever you need to divide by to make it add up to 1
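The equation for this slide did not survive in the text. The legend matches the standard conditional MaxEnt (multinomial logistic) form, so it was presumably something like the following; the indexing of features by i and the explicit definition of Z are assumptions.

```latex
P(c \mid d, \lambda) = \frac{1}{Z(d)} \exp\Big( \sum_i \lambda_i f_i(c, d) \Big),
\qquad
Z(d) = \sum_{c'} \exp\Big( \sum_i \lambda_i f_i(c', d) \Big)
```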
MaxEnt learns a probability
distribution
• Optimizing for two things:
– Maximize the probability of your data
– ...but be as uninformative as possible about things
missing from your data
• An unfair coin comes up heads four times.
What’s the probability that it comes up heads
the next time?
Maximum Likelihood
Maximum A Posteriori
• p(class | data) ∝ p(data | class) * p(class)
• posterior ∝ likelihood * prior
• Our prior on p(class) can apply a penalty to
large weights
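The coin question makes the difference concrete. The sketch below compares the maximum-likelihood estimate with a MAP estimate, assuming a Beta(2, 2) prior on p(heads). That prior is an illustrative choice, not something the slides specify; it plays the same role for the coin that a penalty on large weights plays for a MaxEnt model.

```python
def mle_heads(heads, tosses):
    """Maximum-likelihood estimate: the observed fraction of heads."""
    return heads / tosses


def map_heads(heads, tosses, alpha=2.0, beta=2.0):
    """MAP estimate under a Beta(alpha, beta) prior on p(heads).

    The posterior is Beta(heads + alpha, tails + beta); its mode is
    (heads + alpha - 1) / (tosses + alpha + beta - 2), which is valid
    when both posterior parameters exceed 1.
    """
    return (heads + alpha - 1) / (tosses + alpha + beta - 2)


print(mle_heads(4, 4))  # maximum likelihood: certain of heads
print(map_heads(4, 4))  # MAP: pulled back toward 0.5 by the prior
```

Maximum likelihood says the next toss is heads with probability 1, which fits the data perfectly but claims too much from four tosses. The MAP estimate of 5/6 still favors heads while staying less committed.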
Slide Credits
• Brendan O’Connor, CMU – OpinionFinder
• Carmen Banea / Jan Wiebe
• Luminoso
• Smilies: Aditya Joshi