COMP621U, Spring 2011
Project Proposal

Team: Yuanfeng SONG, Jan VOSECKY
Topic: Sentiment Analysis on Twitter

Dataset

Twitter dataset provided by Z. Cheng, J. Caverlee, and K. Lee and used in CIKM 2010.

Statistics:
- Training set: 115,886 Twitter users and 3,844,612 tweets from those users.
- Test set: 5,136 Twitter users and 5,156,047 tweets from those users.

Used in: Z. Cheng, J. Caverlee, and K. Lee. You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users. In Proceedings of the 19th ACM Conference on Information and Knowledge Management (CIKM), Toronto, Oct 2010.

Suggested approach

We plan to employ several machine-learning techniques to extract users' sentiment (emotion) from the content of their Twitter messages ('tweets'). Our general approach will consist of the following steps:

Preprocessing:
- Manually label a set of training and testing instances.
- Represent tweets in an appropriate format, such as Bag-of-Words.
- Identify any additional tweet-specific features to include in the feature vector, since this extra information may increase classification accuracy. Currently, we are considering the addition of temporal data, such as the timestamp.
- Investigate attribute selection and transformation possibilities.

Modelling:
- Build a machine-learning model from the training data.
- Evaluate the model on the testing data.

We plan to experiment with a number of machine-learning algorithms and compare their effectiveness. These may include Bayesian classifiers, Maximum Entropy, Support Vector Machines, as well as clustering algorithms.

Possible extensions to our work:
- Topic-sensitive sentiment analysis: public sentiment with respect to specific topics.
- Comparison between different locations.
- Tag clouds: listing the most prominent sentiment-rich words for a category of tweets.

Related work

There has been a large amount of prior research in sentiment analysis, especially in the domains of product reviews, movie reviews, and blogs.
Pang and Lee [4] provide an up-to-date survey of previous work in sentiment analysis. Most work in the area focuses on designing platforms or tools for automatic sentiment analysis using machine-learning models such as latent semantic analysis (LSA), Naive Bayes, and support vector machines (SVM) [1] [2]. Besides the choice of model, these works differ in their datasets (Twitter, blogs, etc.) and, as a consequence, in their feature sets and feature extraction methodology. For example, Mishne [1] uses many features extracted from the LiveJournal blog service to train an SVM binary classifier for sentiment analysis. Go et al. [2] use a Twitter dataset and extract features from messages to perform sentiment analysis.

Besides treating the problem as a binary classification into positive and negative emotion, researchers have also tried to identify a wider range of emotions. Jung et al. [5] show that mood expression in Plurk messages has some idiosyncratic properties; for example, the initial mood may change as time passes (also known as the fluctuation of moods). Moreover, some blogs are so intertwined that they are difficult even for a human to classify, let alone for a machine. All these characteristics make it relatively hard to identify multiple emotions.

Emotion detection can be applied in different areas, such as recommendation systems [3] and computer-mediated communication (CMC) [6].

Evaluation metrics

Table 1: Confusion Matrix [7]

                      Actual Positive   Actual Negative   Total
Predicted Positive    TP                FP                N(RPM)
Predicted Negative    FN                TN                N(RPB)
Total                 N(RM)             N(RB)             N

Classification Mean Average Precision (MAP), i.e. the overall fraction of correctly classified instances, which is the Accuracy term in the Kappa equation:

$$\mathrm{MAP} = \frac{TP + TN}{N}$$

Kappa measures the agreement between the predicted and observed categorizations of a dataset, while correcting for agreement that occurs by chance.
The equation is as follows:

$$\mathrm{Kappa} = \frac{\mathrm{Accuracy} - P_e}{1 - P_e}$$

where $P_e$ is the hypothetical probability of chance agreement, calculated from the observed data as the probability of each observer randomly assigning each category:

$$P_e = \frac{N(RM)}{N} \cdot \frac{N(RPM)}{N} + \frac{N(RB)}{N} \cdot \frac{N(RPB)}{N}$$

References

[1] G. Mishne, "Experiments with Mood Classification in Blog Posts," in Proceedings of the 1st Workshop on Stylistic Analysis of Text for Information Access, 2005.
[2] A. Go, R. Bhayani, and L. Huang, "Twitter sentiment classification using distant supervision," Dec 2009. [Online]. Available: http://www.stanford.edu/~alecmgo/papers/TwitterDistantSupervision09.pdf
[3] L. Terveen, W. Hill, B. Amento, D. McDonald, and J. Creter, "PHOAKS: A system for sharing recommendations," Communications of the Association for Computing Machinery (CACM), vol. 40, pp. 59-62, 1997.
[4] M. Y. Chen et al., "Classifying Mood in Plurks," in The 22nd Conference on Computational Linguistics and Speech Processing, Chi-Nan University, Taiwan.
[5] Y. Jung, Y. Choi, and S. H. Myaeng, "Determining mood for a blog by combining multiple sources of evidence," in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, pp. 271-274, 2007.
[6] J. B. Walther and K. P. D'Addario, "The Impacts of Emoticons on Message Interpretation in Computer-Mediated Communication," Social Science Computer Review, vol. 19, no. 3, pp. 324-347, 2001.
[7] Confusion matrix. [Online]. Available: http://en.wikipedia.org/wiki/Confusion_matrix
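
Worked example for the evaluation metrics

The accuracy and Kappa definitions from the Evaluation metrics section can be checked with a small script. The following is a minimal sketch, not part of the planned system. It assumes that N(RM) and N(RB) in Table 1 are the actual positive and negative totals, and that N(RPM) and N(RPB) are the predicted positive and negative totals. The class name `ConfusionMatrix`, its method names, and the counts used in the example are hypothetical and chosen only for illustration; they are not experimental results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier, laid out as in Table 1 (hypothetical helper)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative

    @property
    def n(self) -> int:
        """Total number of classified instances (N)."""
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:
        """(TP + TN) / N."""
        return (self.tp + self.tn) / self.n

    def chance_agreement(self) -> float:
        """Pe: probability that prediction and truth agree by chance."""
        actual_pos = self.tp + self.fn  # assumed to be N(RM)
        actual_neg = self.fp + self.tn  # assumed to be N(RB)
        pred_pos = self.tp + self.fp    # assumed to be N(RPM)
        pred_neg = self.fn + self.tn    # assumed to be N(RPB)
        return (actual_pos * pred_pos + actual_neg * pred_neg) / (self.n ** 2)

    def kappa(self) -> float:
        """(Accuracy - Pe) / (1 - Pe); undefined when Pe equals 1."""
        pe = self.chance_agreement()
        if pe == 1.0:
            raise ValueError("Kappa is undefined when chance agreement is 1")
        return (self.accuracy() - pe) / (1.0 - pe)


if __name__ == "__main__":
    # Illustrative counts only.
    cm = ConfusionMatrix(tp=40, fp=10, fn=20, tn=30)
    print(f"accuracy = {cm.accuracy():.2f}")
    print(f"Pe       = {cm.chance_agreement():.2f}")
    print(f"kappa    = {cm.kappa():.2f}")
```

With these illustrative counts, N = 100, the accuracy is 0.70, Pe is 0.50, and Kappa is 0.40, so a classifier that looks 70% accurate agrees with the labels only moderately better than chance.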