Proposal

COMP621U, Spring 2011
Project Proposal
Team: Yuanfeng SONG, Jan VOSECKY
Topic: Sentiment Analysis on Twitter
Dataset
Twitter dataset provided by Z. Cheng, J. Caverlee, and K. Lee and used in CIKM 2010.
Statistics:
Training set: 115,886 Twitter users and 3,844,612 of their tweets.
Test set: 5,136 Twitter users and 5,156,047 of their tweets.
Used in:
Z. Cheng, J. Caverlee, and K. Lee. You Are Where You Tweet: A Content-Based Approach
to Geo-locating Twitter Users. In Proceedings of the 19th ACM Conference on Information
and Knowledge Management (CIKM), Toronto, Oct 2010.
Suggested approach
We plan to employ several machine-learning techniques to extract users’ sentiment (emotion)
from the content of their Twitter messages (‘tweets’). Our general approach will consist of the
following steps:
Preprocessing:
• Manually label a set of training and testing instances
• Represent tweets in an appropriate format, such as Bag-of-Words
• Identify any additional tweet-specific features to include in the feature vector, in
order to leverage additional information that may increase classification accuracy.
Currently, we are considering the addition of temporal data, such as the timestamp.
• Investigate attribute selection and transformation possibilities
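To make the preprocessing step concrete, the following is a minimal sketch of a Bag-of-Words representation augmented with one tweet-specific temporal feature (the posting hour). The tokenizer, the function name, and the `__hour__` feature name are illustrative assumptions, not a fixed design:

```python
import re
from collections import Counter

def extract_features(tweet_text, timestamp_hour):
    """Represent a tweet as a sparse Bag-of-Words feature dict,
    plus an illustrative temporal feature (hour of posting)."""
    # Lowercase and keep word-like tokens only (deliberately simple tokenizer)
    tokens = re.findall(r"[a-z']+", tweet_text.lower())
    features = Counter(tokens)             # word -> occurrence count
    features["__hour__"] = timestamp_hour  # extra tweet-specific feature
    return dict(features)

feats = extract_features("I love this sunny day, love it!", timestamp_hour=14)
```

In practice the word features and any added attributes would be mapped into a fixed-length vector for the learning algorithm; attribute selection would then prune uninformative entries.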
Modelling:
• Build a machine-learning model from the training data
• Evaluate the model on the testing data
We plan to experiment with a number of machine-learning algorithms and compare their
effectiveness. These may include Bayesian classifiers, Maximum Entropy, Support Vector
Machines, as well as clustering algorithms.
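As a self-contained illustration of one candidate algorithm, the sketch below trains a multinomial Naive Bayes classifier with Laplace smoothing on toy data; the function names and toy examples are our own, not taken from any library:

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (token_list, label) pairs. Returns, per label,
    a log-prior and Laplace-smoothed log-likelihoods (multinomial NB)."""
    label_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)  # label -> word counter
    vocab = set()
    for tokens, label in examples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        model[label] = {
            "log_prior": math.log(label_counts[label] / len(examples)),
            "log_like": {w: math.log((word_counts[label][w] + 1) /
                                     (total + len(vocab)))
                         for w in vocab},
            # Smoothed log-probability for words unseen under this label
            "unseen": math.log(1 / (total + len(vocab))),
        }
    return model

def classify(model, tokens):
    """Pick the label maximizing log-prior + sum of token log-likelihoods."""
    def score(label):
        m = model[label]
        return m["log_prior"] + sum(m["log_like"].get(w, m["unseen"])
                                    for w in tokens)
    return max(model, key=score)

train = [(["love", "great", "day"], "pos"),
         (["happy", "great"], "pos"),
         (["hate", "awful", "day"], "neg"),
         (["sad", "awful"], "neg")]
model = train_naive_bayes(train)
```

An SVM or Maximum Entropy model would plug into the same train/classify interface, which is what makes the planned comparison straightforward.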
Possible extensions to our work:
• Topic-sensitive sentiment analysis: public sentiment with respect to specific topics,
including comparisons between different locations.
• Tag-clouds: listing the most prominent sentiment-rich words for a category of tweets.
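The tag-cloud extension could be driven by simple frequency counts of sentiment-lexicon words over a category of tweets. A rough sketch, where the lexicon contents and function name are hypothetical:

```python
import re
from collections import Counter

def sentiment_tag_cloud(tweets, sentiment_lexicon, top_n=5):
    """Count occurrences of sentiment-lexicon words across tweets;
    the counts would determine font sizes in a tag-cloud."""
    counts = Counter()
    for tweet in tweets:
        for token in re.findall(r"[a-z']+", tweet.lower()):
            if token in sentiment_lexicon:
                counts[token] += 1
    return counts.most_common(top_n)

top = sentiment_tag_cloud(["Love this sunny day!", "I love it, so happy"],
                          {"love", "happy", "sad"})
```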
Related work
There has been a large amount of prior research on sentiment analysis, especially in the
domains of product reviews, movie reviews, and blogs. Pang and Lee [4] provide an
up-to-date survey of previous work in sentiment analysis. Most work in this area focuses
on designing platforms or tools for automatic sentiment analysis using machine-learning
models such as latent semantic analysis (LSA), Naive Bayes, and support vector machines
(SVM) [1][2]. Beyond the choice of model, these works differ in their datasets (e.g.
Twitter, blogs) and, consequently, in their feature sets and feature-extraction
methodologies. For example, Mishne [1] trains an SVM binary classifier for sentiment
analysis using many features extracted from the LiveJournal blogging service. Go et al.
[2] use a Twitter dataset and extract features from the messages for sentiment analysis.
Besides treating this problem as binary positive/negative emotion classification,
researchers have also tried to identify a wider range of emotions. Jung et al. [5] show
that mood expression in Plurk messages has some idiosyncratic properties; for example,
the initial mood may change as time passes (also known as mood fluctuation). Moreover,
some blog posts are so intertwined that they are difficult even for humans to classify,
let alone for a machine. All of these characteristics make it relatively hard to identify
multiple emotions. Emotion detection can be applied in several areas, such as
recommendation systems [3] and computer-mediated communication (CMC) [6].
Evaluation metrics
Table 1 Confusion Matrix [7]

                                  Actual
                        Positive   Negative   Total
Predicted   Positive      TP         FP       N(RPM)
            Negative      FN         TN       N(RPB)
            Total        N(RM)      N(RB)       N
Classification accuracy: Accuracy = (TP + TN) / N
Kappa is used to measure the agreement between predicted and observed categorizations of a
dataset, while correcting for agreement that occurs by chance. The equation is as follows:
Kappa = (Accuracy - Pe) / (1 - Pe)
Where Pe is the hypothetical probability of chance agreement, using the observed data to
calculate the probabilities of each observer randomly saying each category.
Pe = (N(RM) / N) × (N(RPM) / N) + (N(RB) / N) × (N(RPB) / N)
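Given the cell counts from Table 1, both metrics can be computed directly. A minimal sketch for the binary setting (function name and toy numbers are ours):

```python
def evaluate(tp, fp, fn, tn):
    """Compute accuracy and Cohen's kappa from the confusion-matrix
    cells of Table 1 (binary classification)."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Chance agreement Pe: product of the actual and predicted marginal
    # rates, summed over both classes
    pe = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    kappa = (accuracy - pe) / (1 - pe)
    return accuracy, kappa

accuracy, kappa = evaluate(tp=40, fp=10, fn=10, tn=40)
```

With these toy counts the marginals are all 50/100, so Pe = 0.5 and the 0.8 raw accuracy is discounted to a kappa of 0.6.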
References
[1] G. Mishne, “Experiments with Mood Classification in Blog Posts,” in Proceedings of the
1st Workshop on Stylistic Analysis of Text For Information Access, 2005.
[2] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant
supervision,” Dec 2009. [Online]. Available:
http://www.stanford.edu/~alecmgo/papers/TwitterDistantSupervision09.pdf
[3] L. Terveen, W. Hill, B. Amento, D. McDonald, and J. Creter, “PHOAKS: A system for
sharing recommendations,” Communications of the Association for Computing Machinery
(CACM), vol. 40, pp. 59–62, 1997.
[4] M.Y. Chen et al., “Classifying Mood in Plurks,” in The 22nd Conference on
Computational Linguistics and Speech Processing, Chi-Nan University, Taiwan.
[5] Y. Jung, Y. Choi, and S.H. Myaeng, “Determining mood for a blog by combining multiple
sources of evidence,” in Proceedings of IEEE/WIC/ACM International Conference on Web
Intelligence, pp. 271-274, 2007.
[6] J.B. Walther and K.P. D’Addario, “The Impacts of Emoticons on Message Interpretation
in Computer-Mediated Communication,” Social Science Computer Review, vol. 19, no. 3,
pp. 324-347, 2001.
[7] Confusion matrix. [Online]. Available: http://en.wikipedia.org/wiki/Confusion_matrix