A cognitive study of subjectivity extraction in sentiment annotation

Abhijit Mishra1, Aditya Joshi1,2,3, Pushpak Bhattacharyya1
1 IIT Bombay, India
2 Monash University, Australia
3 IITB-Monash Research Academy
At 5th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, ACL 2014, Baltimore
Subjectivity Extraction
• Goal: To identify subjective portions of text
Motivation
• Strong AI suggests that a machine must perform sentiment analysis in a manner, and with an accuracy, similar to human beings
• Do humans themselves perform subjectivity extraction when they annotate sentiment?
Hence, a "cognitive study" of subjectivity extraction in sentiment annotation
Outline
• Sentiment Oscillations & Subjectivity Extraction
• Experiment Setup
• Anticipation & Homing
• Conclusion & Future Work
Sentiment Oscillations & Subjectivity Extraction
• Subjective documents may be:
– Linear: The story was captivating. The actors did a great job. I absolutely loved the movie!
– Oscillating: The story was captivating. If only they had better actors. But then I enjoyed the movie, on the whole.
• Humans perform subjectivity extraction either as a result of "anticipation" or as "homing".
• Which of the two methods is adopted depends on whether the subjective document is linear or oscillating (a minimal sketch of this distinction follows).
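The distinction can be made concrete. Below is a minimal sketch (not from the slides) of how a document could be labelled linear or oscillating from per-sentence sentiment scores; the scorer, the function names, and the flip threshold are all illustrative assumptions.

```python
# Hypothetical sketch: classify a document as "linear" or "oscillating"
# by counting sign flips in per-sentence sentiment scores.
# Assumes the scores are already available; a real system would need a
# sentence-level sentiment scorer to produce them.

def count_oscillations(sentence_scores):
    """Count sign changes between consecutive non-neutral sentences."""
    signs = [1 if s > 0 else -1 for s in sentence_scores if s != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def label_document(sentence_scores, flip_threshold=1):
    """Label a document by its sentiment oscillations (threshold is illustrative)."""
    flips = count_oscillations(sentence_scores)
    return "oscillating" if flips > flip_threshold else "linear"

# Example: the two toy reviews from the slide, scored by hand.
linear = [0.8, 0.7, 0.9]          # captivating / great job / loved it
oscillating = [0.8, -0.6, 0.5]    # captivating / better actors / enjoyed on the whole
print(label_document(linear))       # -> linear
print(label_document(oscillating))  # -> oscillating
```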
Experiment Setup (1/2)
• A human annotator reads a document and predicts its sentiment
• A Tobii T120 eye-tracker records eye movements while he/she reads the document
* No time restriction and no user input required, in order to minimize errors.
Experiment Setup (2/2)
• Dataset
– 3 movie reviews in English from IMDb
– One linear, one oscillating, one between the two extremes (D0, D1, D2 respectively)
• Three documents? Really?!
– To eliminate predictability
– To reduce errors due to fatigue
• 12 human annotators (P0, ..., P11)
Observations: Anticipation (1/2)
• In the case of linear subjective documents, an annotator reads some sentences and then begins to skip sentences, anticipating the document's sentiment.
Observations: Anticipation (2/2)

Document   Length (sentences)   Avg. non-unique sentences read by participants
D0         10                   21
D1          9                    33.83
D2         13                   50.42
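The slides do not spell out how the "non-unique sentences read" count is computed. The sketch below shows one plausible reading, assuming each participant's gaze trace has already been mapped to a sequence of fixated sentence indices; the input format and all names are assumptions.

```python
# Hypothetical sketch: average number of non-unique sentences read per
# document, given each participant's gaze trace collapsed to a sequence
# of fixated sentence indices.

def non_unique_sentences_read(sentence_sequence):
    """Count sentence visits, collapsing consecutive fixations on the same
    sentence into a single visit (so re-reads count, but dwelling does not)."""
    visits = 0
    previous = None
    for idx in sentence_sequence:
        if idx != previous:
            visits += 1
            previous = idx
    return visits

def average_reads(traces):
    """Average the visit counts over all participants for one document."""
    counts = [non_unique_sentences_read(t) for t in traces]
    return sum(counts) / len(counts)

# Toy traces for two participants reading a 4-sentence document.
traces = [[0, 1, 2, 3], [0, 1, 0, 1, 2, 3, 2]]
print(average_reads(traces))  # -> (4 + 7) / 2 = 5.5
```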
Observations: Homing (1/3)
• In the case of oscillating subjective documents, an annotator (a) first reads all sentences, then (b) revisits some of them.
Observations: Homing (2/3)
• Considerable overlap between sentences that are read in the second pass
• All of them are subjective.

Participant   TFD-SE   PTFD   TFC-SE
P5            7.3      8      21
P7            3.1      5      11
P9            51.94    10     26
P11           116.6    16     56

Reading statistics for D1
TFD-SE: total fixation duration for the subjective extract;
PTFD: proportion of total fixation duration = TFD-SE / total duration;
TFC-SE: total fixation count for the subjective extract
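As a rough illustration of the three measures, here is a sketch that computes TFD-SE, PTFD, and TFC-SE from per-fixation records, assuming fixations are available as (sentence id, duration) pairs and that the subjective extract is a known set of sentence ids. This is not the authors' pipeline, and it does not attempt to reproduce the units of the table above.

```python
# Hypothetical sketch of the three reading statistics in the table,
# assuming fixations are given as (sentence_id, duration) pairs and the
# subjective extract is a known set of sentence ids.

def reading_stats(fixations, subjective_ids):
    """Return (TFD-SE, PTFD, TFC-SE) for one participant and one document."""
    total_duration = sum(d for _, d in fixations)
    se_fixations = [(s, d) for s, d in fixations if s in subjective_ids]
    tfd_se = sum(d for _, d in se_fixations)   # fixation duration on the extract
    tfc_se = len(se_fixations)                 # fixation count on the extract
    ptfd = tfd_se / total_duration if total_duration else 0.0
    return tfd_se, ptfd, tfc_se

# Toy example: 5 fixations; sentences 1 and 3 form the subjective extract.
fixations = [(0, 0.2), (1, 0.4), (2, 0.3), (3, 0.5), (1, 0.1)]
print(reading_stats(fixations, subjective_ids={1, 3}))
# -> (1.0, 0.666..., 3)
```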
Observations: Homing (3/3)
• Homing at a sub-sentence level
– Sarcasm
• Multiple regressions around the sarcastic portion for participant P1, document D1
• Participant P1 does not correctly detect the sentiment of the document
– Thwarting
Conclusion & Future Work
• Based on how sentiment changes through a document, humans may perform subjectivity extraction as a result of anticipation or homing
• Applications:
– Pricing models for crowd-sourced annotation
– Sentiment classifiers that incorporate "sentiment run-lengths" (one possible reading is sketched below)
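"Sentiment run-lengths" are not defined on the slide; one plausible reading is the lengths of maximal runs of same-polarity sentences, sketched below. The feature definition and names are assumptions, not the authors' formulation.

```python
# Hypothetical sketch of a "sentiment run-length" feature: lengths of
# maximal runs of consecutive same-polarity sentences.

def sentiment_run_lengths(polarities):
    """polarities: list of +1 / -1 per sentence.
    Returns the lengths of maximal runs of same-polarity sentences."""
    runs = []
    previous = None
    for p in polarities:
        if p == previous:
            runs[-1] += 1
        else:
            runs.append(1)
            previous = p
    return runs

# A linear review yields one long run; an oscillating one yields many short runs.
print(sentiment_run_lengths([1, 1, 1]))   # -> [3]
print(sentiment_run_lengths([1, -1, 1]))  # -> [1, 1, 1]
```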