Prediction of Sporting Events through Social
Media across the Multiple Cultures of Australia
--- blinded ---
Abstract— Social media offerings such as Twitter provide a near
real-time forum for expression of personal information through
Tweets. Often these Tweets capture the emotion of the Tweeter at
that point in time and place, either on a personal level or with
regard to some event, organisation or other individual. In this
paper we show how such sentiment can be used to identify and
ultimately predict events. Specifically, we focus on predicting
events that take place in sport, a topic of particular interest
and relevance to Australians, and to Melbourne in particular as a
passionate sports city. The novelty of this work is that it
addresses multicultural factors in the use of Twitter for
predicting sporting events, and uses them as the basis for a
better understanding of the cities and communities of Australia.
To explore this, we focus on event prediction in the FIFA World
Cup that took place in 2014 and the Cricket World Cup that took
place in 2015. We show how events are detected and, importantly,
how event detection varies across cultural groups. We describe
the Cloud-based architecture for collecting and analysing such
large, diverse data sets, and the algorithms used to identify
changes in sentiment, together with their accuracy. We
illustrate how actual events can indeed be predicted from social
media, and how sentiment expression through social media differs
between cultures.
Keywords: Twitter, sentiment analysis, sports prediction.
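As an illustration only, the idea of identifying changes in sentiment can be sketched as a simple spike detector over a time series of sentiment scores. This is not the algorithm used in this work: the function name `detect_sentiment_shifts`, the trailing-window z-score rule, the `window` and `threshold` parameters and the sample data are all assumptions introduced for the sketch.

```python
from statistics import mean, pstdev


def detect_sentiment_shifts(scores, window=5, threshold=2.0):
    """Return indices whose score departs sharply from the trailing window.

    Assumption: `scores` holds one aggregated sentiment value per time
    bucket (for example, the mean Tweet polarity per minute).
    """
    shifts = []
    for i in range(window, len(scores)):
        history = scores[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        # A z-score is undefined for a perfectly flat window, so skip it.
        if sigma == 0:
            continue
        z = (scores[i] - mu) / sigma
        if abs(z) >= threshold:
            shifts.append(i)
    return shifts


# Hypothetical per-minute mean sentiment for one group of supporters.
per_minute = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.65, 0.60, 0.12, 0.10]
print(detect_sentiment_shifts(per_minute))
```

A flagged index marks a time bucket in which sentiment moved abruptly relative to the recent past, which is the kind of signal that could then be compared against actual match events. Running the detector separately per language or cultural group would be one way to compare how those groups react, again under the assumptions stated above.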
I. INTRODUCTION
The
II. RELATED WORK
A
III. CASE STUDY AND ARCHITECTURAL IMPACT
IV. CONCLUSIONS
The
.
ACKNOWLEDGMENTS
The authors would like to thank the National eResearch
Collaboration Tools and Resources (NeCTAR) project for the
infrastructure underlying this work.