Twitter Mood Predicts the Stock Market

Authors: Johan Bollen, Huina Mao, Xiao-Jun Zeng
Presented By: Krishna Aswani
Computing ID: ka5am
Is it possible to predict stock markets?

Early research: Stock market prediction was based on the Efficient
Market Hypothesis (prices are driven by new information, i.e. news, rather
than by present and past prices) and random walk theory.

Recent research: News may be unpredictable, but early
indicators can be extracted from online social media
(blogs, Twitter feeds, etc.) to predict changes in various
economic and commercial indicators.
Method: Phase 1
[Pipeline diagram: Twitter feed → text analysis → mood indicators (daily); DJIA → stock markets (daily); normalization; Granger causality (F-statistic, p-value); SOFNN on lags t-1, t-2, t-3 predicting the value at t=0 (predicted value, MAPE, direction %)]
Phase 1 – Creating sentiment time series

Step 1 – Collect public tweets (February 28 to December 19, 2008; 9,853,498 tweets posted by approximately 2.7M users), then remove stopwords, normalize them, etc.

Step 2 – Pass them through Opinion Finder and Google Profile of Mood States (GPOMS) to create time series.

Opinion Finder is a software package that classifies tweets into positive and negative. For each day, the ratio of the total number of positive tweets to the total number of negative tweets is calculated.

Google Profile of Mood States classifies tweets into 6 types: Calm, Alert, Sure, Vital, Kind and Happy.

Step 3 – To compare the time series from Opinion Finder and
Google Profile of Mood States, each is normalized to z-scores.

Step 4 – Cross-validate the resulting time series against large socio-cultural events.
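
The daily ratio from Step 2 and the z-score normalization from Step 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the label strings, the function names, and the choice of a windowed z-score (mean and population standard deviation over the k days on either side of each day) are assumptions made for the example, as is the window size.

```python
from statistics import mean, pstdev


def positive_negative_ratio(labels):
    """Ratio of positive to negative tweets for one day.

    `labels` is a list of "positive" / "negative" strings: a hypothetical
    classifier output, not Opinion Finder's real format.
    """
    positives = labels.count("positive")
    negatives = labels.count("negative")
    return positives / negatives if negatives else float("nan")


def local_zscores(series, k):
    """Z-score each value against the mean and standard deviation of the
    window of k days before and after it (truncated at the series ends).
    The windowed form and the population standard deviation are assumptions."""
    scores = []
    for t, value in enumerate(series):
        window = series[max(0, t - k): t + k + 1]
        sigma = pstdev(window)
        scores.append((value - mean(window)) / sigma if sigma else 0.0)
    return scores


# Three made-up days of classified tweets, turned into a normalized series.
daily_ratio = [positive_negative_ratio(day) for day in [
    ["positive", "positive", "negative"],
    ["positive", "negative"],
    ["positive", "negative", "negative", "negative", "negative"],
]]
print(local_zscores(daily_ratio, k=1))
```

Putting both series on the same z-score scale is what makes the Opinion Finder and GPOMS curves directly comparable, since their raw values are in different units.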
Method: Phase 2
Phase 2 – Correlation between mood time series and DJIA

Step 1 – Collect DJIA data for the same time period, normalize it, and plot a time series.

Step 2 – Use Granger causality analysis on models 1 and 2 (a model that uses only lagged DJIA values, and one that adds lagged mood values).

Granger causality analysis rests on the assumption that if a variable X causes Y, then changes in X will systematically occur before changes in Y.

Correlation does not mean causation.
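
The idea behind the Granger test can be sketched in plain Python: fit one least-squares model that predicts Y from its own lagged values, fit a second that also includes lagged values of X, and compare their residual sums of squares with an F-statistic. The function names, the synthetic data, and the lag count below are illustrative assumptions; the inputs are assumed to be normalized series, and the p-value (which would come from the F distribution with `lags` and `dof` degrees of freedom) is left out to keep the sketch dependency-free.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit, solved through the
    normal equations with Gauss-Jordan elimination and partial pivoting."""
    n = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    coefs = [a[i][n] / a[i][i] for i in range(n)]
    return sum((t - sum(c * v for c, v in zip(coefs, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(y, x, lags):
    """F-statistic for 'lagged x improves the prediction of y'."""
    restricted, full, targets = [], [], []
    for t in range(lags, len(y)):
        own = [y[t - i] for i in range(1, lags + 1)]
        other = [x[t - i] for i in range(1, lags + 1)]
        restricted.append([1.0] + own)      # intercept + lagged y only
        full.append([1.0] + own + other)    # intercept + lagged y + lagged x
        targets.append(y[t])
    rss_restricted = ols_rss(restricted, targets)
    rss_full = ols_rss(full, targets)
    dof = len(targets) - (2 * lags + 1)
    return ((rss_restricted - rss_full) / lags) / (rss_full / dof)


# Synthetic example: y follows x with a one-day delay, plus noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.5) for t in range(1, 200)]
f_x_to_y = granger_f(y, x, lags=3)
f_y_to_x = granger_f(x, y, lags=3)
print(f_x_to_y, f_y_to_x)
```

A large F-statistic in one direction only says that past values of X carry predictive information about Y; as the slide notes, it does not establish causation.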
Method: Phase 3
Phase 3 – Non-linear models for accurate stock prediction

As the relationship between the DJIA and the mood time series does not appear to be linear, a Self-Organizing Fuzzy Neural Network (SOFNN) is used to predict with better accuracy.

Different combinations of input variables (mood time series) are used.
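
One way to picture the inputs is as rows of lagged values (t-1, t-2, t-3) for the DJIA and for whichever mood series are selected, with the DJIA value at day t as the target. The sketch below only builds those rows; it does not implement the SOFNN itself. The function name, the dictionary layout, and the choice to enumerate single moods and pairs are assumptions made for illustration.

```python
from itertools import combinations

MOODS = ["Calm", "Alert", "Sure", "Vital", "Kind", "Happy"]


def lagged_inputs(djia, moods, names, lags=3):
    """Build (rows, targets) for a predictor.

    Each row holds the previous `lags` DJIA values followed by the previous
    `lags` values of every selected mood series; the target is the DJIA
    value on day t. `moods` maps a mood name to its daily series.
    """
    rows, targets = [], []
    for t in range(lags, len(djia)):
        row = [djia[t - i] for i in range(1, lags + 1)]
        for name in names:
            row += [moods[name][t - i] for i in range(1, lags + 1)]
        rows.append(row)
        targets.append(djia[t])
    return rows, targets


# Hypothetical enumeration of candidate input sets: each mood alone,
# then every pair of moods.
candidate_sets = [list(c) for size in (1, 2) for c in combinations(MOODS, size)]
print(len(candidate_sets), candidate_sets[:3])
```

Each candidate set would be passed to the same model and scored separately, so that the contribution of each mood dimension can be compared.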
Results:
[Results table not recovered. Highlighted input combinations: Calm; Calm and Happy]
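
The two evaluation measures named in the method diagram, MAPE and direction %, can be computed as in the sketch below. The function names and the sample numbers are made up for illustration and are not results from the paper; direction accuracy is taken here to mean the share of days on which the predicted up/down move relative to the previous day's value matches the actual move, which is an assumption about the exact definition.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def direction_accuracy(actual, predicted, previous):
    """Percentage of days where the predicted move (up vs. not up, relative
    to the previous day's value) matches the actual move."""
    hits = sum((a > b) == (p > b)
               for a, p, b in zip(actual, predicted, previous))
    return 100.0 * hits / len(actual)


# Made-up values for two days.
previous = [100.0, 100.0]
actual = [105.0, 95.0]
predicted = [102.0, 101.0]
print(mape(actual, predicted), direction_accuracy(actual, predicted, previous))
```

MAPE rewards predictions that are numerically close, while direction accuracy only asks whether the sign of the move was right, so the two measures can rank input combinations differently.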
Factors not considered

Geographic location of tweets. This approach worked because Twitter's user
base is predominantly located in the US.

These results strongly indicate a predictive correlation between
public mood states measured from Twitter feeds and DJIA values,
but offer no information on the causative mechanisms that may
connect the two.

The approach is highly vulnerable to Twitter bombing campaigns, which can
very easily go viral.
Applications:

Companies like Tower Research Capital (computational
investment trading)

Dataminr (social analytics company)