CV - Indiana University Bloomington


Shuai Xin

Department of Informatics

Indiana University Bloomington

919 E 10th Street, Room 401

Bloomington, IN 47408

Phone: (812) 606-8969

Office: 401 INFO. East Building

Email: xshuai@indiana.edu

Homepage: http://www.cs.indiana.edu/~xshuai/

Education

Ph.D. Informatics, Indiana University Bloomington, 2008-present.

GPA: 3.862/4.0

M.S. & B.S. Automation & Control Science, Zhejiang University, 2006-2008.

GPA: 3.9/4.0

Research

Interests

Social Media Mining, Predictive Modeling, Information Retrieval and Network Analysis

Working Papers

Shuai, X., O'Hare, N., Aiello, L. M. and James, A. (2014) Predicting Social Events Attendees using Stock Photos Metadata, work done at Yahoo! Research Barcelona and submitted to Journal of Multimedia Tools and Applications

Publications

Shuai, X., Liu, X. Z., Xia, T., Wu, Y. Q. and Guo, C. (2014) Comparing the Categorical Hot Events in Twitter and Weibo (forthcoming), HyperText 2014

Li, D. F., Ding, Y., Shuai, X., Sun, G. Z., Tang, J., Luo, Z. P., Zhang, J. W. and Chambers, T. (2014) Topic-Level Opinion Influence Model (TOIM): An Investigation Using Tencent Microblogging (forthcoming), JASIST

Shuai, X., Jiang, Z. R., Liu, X. Z. and Bollen, J. (2013) A Comparative Study of Academic and Wikipedia Ranking [url], JCDL2013

Mao, H. N., Shuai, X., Ahn, Y.-Y. and Bollen, J. (2013) Mobile Communications Reveal the Regional Economy in Côte d'Ivoire [pdf], NetMob2013, selected as one of 19 candidates from 166 teams to compete for final prizes in the D4D data challenge program.

Shuai, X., Bollen, J. and Pepe, A. (2012) How the Scientific Community Reacts to Newly Submitted Preprints: Article Downloads, Twitter Mentions, and Citations [pdf], PLoS ONE 7(11).

Shuai, X., Chen, S. S., Ding, Y., Sun, Y. Y., Busemeyer, J. R. and Tang, J. (2012) There is more than complex contagion: an indirect influence analysis on Twitter [url], Workshop on Mining Data Semantics in conjunction with KDD2012

Li, D. F., Shuai, X., Sun, G. Z., Tang, J., Ding, Y. and Luo, Z. P. (2012) Mining Topic-Level Opinion Influence for Micro-Blogging (short paper, acceptance rate: 27.8%) [pdf], CIKM2012

Shuai, X., Liu, X. Z. and Bollen, J. (2012) Improving News Ranking by Community Tweets [pdf], Workshop on Mining Social Network Dynamics in conjunction with WWW2012

Shuai, X., Ding, Y. and Busemeyer, J. R. (2012) Multiple spreaders affect the indirect influence on Twitter (poster, acceptance rate: 28.1%) [pdf], WWW2012

Mao, H. N., Shuai, X. and Kapadia, A. (2011) Loose Tweets: An Analysis of Privacy Leaks on Twitter [pdf], WPES 2011: Workshop on Privacy in the Electronic Society, in conjunction with ACM CCS.


Li, D. F., Ding, Y., Shuai, X., Chen, S. S., Tang, J., Bollen, J., Zhu, J. Y. and Rocha, G. (2011) Adding Community and Dynamic to Topic Models [url], Journal of Informetrics, Vol. 6(2):237-253

Projects

Graduate Research Assistant. Indiana University Bloomington, September 2011 – present

– Investigated social media impact on scholarly communications using regression and correlation analysis.

– Designed a binary text classifier to distinguish sensitive tweets leaking privacy from general tweets.

– Improved Google & Yahoo news ranking by modeling geographical community interests on Twitter.

– Predicted users' opinions by jointly considering topic and social influence on a Chinese microblogging site.

– Analyzed and visualized data using Python and R.

Insight Data Science Fellows Program at Mountain View [url], August 2013 – October 2013

– Developed a web app, TwiNews (http://xintwinews.com), in Python to re-rank news articles from Google & Yahoo according to Twitter geographical popularity.

– Crawled news articles and collected public tweets using the Twitter API, stored the data in MySQL, ranked news articles by their cosine similarity to local tweets, and evaluated the ranking on MTurk.

– Implemented the web interface using Flask and deployed it on Amazon Web Services.

D4D Data Challenge [url], Indiana University Bloomington, October 2012 – May 2013

– Inferred a socio-economic index from mobile phone usage patterns using correlation analysis.

– Detected the digital divide phenomenon and communication gap between rich and poor areas in an African country through network analysis.

– Visualized the geographical mobile communication flow using R and JEQL.

Internship @ Yahoo! Research at Barcelona, June-September 2012

– Applied the Hadoop MapReduce framework to large-scale processing of news photo metadata, predicting social event attendees with a language model from the Terrier IR platform.

– Incorporated textual, network and temporal information into the language model to improve prediction.

– Analyzed the effects of topical, temporal and scale properties of events on prediction performance using R.

Internship @ Dialogue Group at Avaya Labs (with a Bell Labs heritage), June-August 2011

– Developed an IR system based on web service technology to support proactive search that enriches enterprise communication and collaboration, including:

* Disambiguated employee names using co-author social networks.

* Recommended experts for a given query based on relevance, reputation and authority.

* Annotated text streams against Wikipedia using key-phrase extraction and concept disambiguation.

Google Summer of Code [url], May-August 2009

– Collaborated with a team to develop an open source software package using GitHub.

– Programmed Python/Perl language interfaces to libsequence (a bioinformatics C++ class library) on multiple operating systems to facilitate biological data analysis in scripting languages.

Skills

Languages: Python, R, C/C++, Java, Perl

Tools: Scikit-learn, NetworkX, NumPy, SciPy, MATLAB, MySQL, Hadoop, Pig, Flask, Git, SVN
