Twitter Mood Predicts the Stock Market
Authors: Johan Bollen, Huina Mao, Xiao-Jun Zeng
Presented By: Krishna Aswani
Computing ID: ka5am

Is it possible to predict stock markets?
Early research: Stock market prices were held to follow the Efficient Market Hypothesis (prices are driven by new information, i.e. news, rather than by present and past prices) and random walk theory.
Recent research: News may be unpredictable, but early indicators can be extracted from online social media (blogs, Twitter feeds, etc.) to predict changes in various economic and commercial indicators.

Method
[Pipeline diagram. Elements: Twitter Feed, Text Analysis, Mood Indicators (Daily); DJIA, Normalization, Stock Markets (Daily); Granger Causality (F-statistic, p-value); SOFNN with inputs at t-1, t-2, t-3 and the value at t=0 (Predicted Value, MAPE, Direction %). Phases 1, 2 and 3 are highlighted in turn.]

Phase 1: Creating sentiment time series
Step 1: Collect public tweets from February 28 to December 19, 2008 (9,853,498 tweets posted by approximately 2.7M users), then remove stopwords, normalize the text, etc.
Step 2: Pass the tweets through Opinion Finder and Google Profile of Mood States (GPOMS) to create time series.
Step 3: To compare the time series from Opinion Finder and GPOMS, a z-score is used to normalize each.
Step 4: Cross-validate against large socio-cultural events.

Opinion Finder is a software package that classifies tweets as Positive or Negative. For each day, the ratio of the total number of positive tweets to the total number of negative tweets is calculated.
Google Profile of Mood States classifies tweets into 6 types: Calm, Alert, Sure, Vital, Kind and Happy.

Phase 2: Correlation between mood time series and DJIA
Step 1: Collect DJIA data for the same time period, normalize it, and plot a time series.
Step 2: Use Granger causality analysis on models 1 and 2.

Granger causality analysis rests on the assumption that if a variable X causes Y, then changes in X will systematically occur before changes in Y.
Correlation does not mean causation.

Phase 3: Non-linear models for accurate stock prediction
The relationship between the DJIA and the mood time series does not look linear, so a Self-Organizing Fuzzy Neural Network (SOFNN) is used to predict with better accuracy.
Different permutations of the input variables (mood time series) are used.

Results
[Result panels: Calm; Calm and Happy]

Factors not considered
Geographic location of tweets. This approach worked because the Twitter user base is predominantly located in the US.
These results are strongly indicative of a predictive correlation between measurements of public mood states from Twitter feeds and DJIA values, but they offer no information on the causative mechanisms that may connect online public mood states with DJIA values.
The approach is highly vulnerable to Twitter bombing campaigns, which very easily become viral.

Applications
Companies like Tower Research Capital (computational investment trading) and Dataminr (social analytics company).
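
The z-score normalization from Phase 1 and the two evaluation figures named in the pipeline (MAPE and Direction %) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the z-score is assumed to be computed over the whole series rather than over a local window, the direction measure is assumed to compare each prediction with the previous actual value, and all numbers are made up for illustration.

```python
from statistics import mean, pstdev


def zscore(series):
    """Rescale a series to zero mean and unit standard deviation.

    Assumption: statistics are taken over the whole series.
    """
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series has no defined z-score")
    return [(x - mu) / sigma for x in series]


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def direction_accuracy(actual, predicted):
    """Percent of days on which the predicted up/down move is correct.

    Assumption: a move is "up" when the value exceeds the previous
    day's actual value; the first day has no previous value and is skipped.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] > actual[t - 1]
        predicted_up = predicted[t] > actual[t - 1]
        hits += actual_up == predicted_up
    return 100.0 * hits / (len(actual) - 1)


# Made-up daily positive/negative tweet ratios (illustration only).
mood_ratio = [1.2, 0.8, 1.0, 1.4, 0.6]
mood_z = zscore(mood_ratio)

# Made-up index values and predictions (illustration only).
djia_actual = [100.0, 102.0, 101.0, 103.0]
djia_predicted = [100.0, 101.0, 103.0, 104.0]

print("normalized mood:", [round(z, 3) for z in mood_z])
print("MAPE (%):", round(mape(djia_actual, djia_predicted), 3))
print("direction (%):", round(direction_accuracy(djia_actual, djia_predicted), 1))
```

Putting both mood series on a z-score scale is what makes the Opinion Finder ratio and the GPOMS dimensions directly comparable. MAPE measures how far the predicted values are from the actual ones, while Direction % only asks whether the up or down move was called correctly.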