Uploaded by Baddy Scores

Twitter based FX Portfolio Mgmt #forex #python #twitter

Twitter based Portfolio Management: Is it possible?
Anand Bindumadhavan
Abstract
Dataset: The dataset consists of raw tweets containing FX trade ideas/results.
Question: Is the buy/sell advice for forex currency pairs from Twitter feeds worth
following, and will it make one profitable?
Method: I used rudimentary text processing to identify and isolate tweets containing trade
ideas/results that indicate a profit or loss per trade, used them to build a portfolio with 10K as
initial capital, and derived the equity curve by following each trade/trade result.
Findings: The analysis shows that 50% of the analysts were successful in their trades, ending with
positive portfolio values. This answers the research question: it is indeed
possible to manage a portfolio by smart monitoring of Twitter feeds and selective
tagging of trades!
Motivation
There are quite a few analysts/companies who frequently tweet about whether to buy or sell
some asset, say an equity or currency at a certain price, with some profit targets. I always
wondered whether one could manage his/her own investment portfolio by following the
buy/sell advice from these twitter users.
As there could be quite a variety of such advice/tweets, I will narrow the analysis to tweets
from a few such users/analysts, (optionally) to a specific currency pair like EUR/USD, grab
as many tweets as possible with buy/sell advice, and build an equity curve based on the
profit/loss generated from each piece of advice/signal. I will compare the results to understand
whether it is worthwhile following these tweets.
A positive equity at the end of the period of analysis would indicate that following the tweets
is worthwhile, while a negative equity would indicate that we should not follow such tweets.
Dataset(s)
The data is from historical tweets.
First we grab all tweets from a few randomly picked users (relevant to the topics
#forextrading, #eurusd), understand the structure of their tweets, parse them and
extract the profit/loss value in pips, per trade, and use it to build an equity curve.
Data Preparation and Cleaning
At a high-level, what did you need to do to prepare the data for analysis?
The first step was to identify which users to follow. A search on Twitter for the topic
‘#forextrading’ resulted in a very broad range of tweets – I had to go back and forth to
eventually come up with a query such as ‘topic is #eurusd and contains the words buy or
sell’ to narrow the results, from which I picked 7 users at random for the analysis.
Describe what problems, if any, did you encounter with the dataset?
As the tweets were raw text, I had to study them closely to identify an appropriate
tweet that could be used for data extraction. Even then, different users recorded the profit/loss
differently, e.g. ‘for +20 pips’ vs ‘Profit: 20 pips’.
I also had to drop one of the users eventually because that user tweeted only trade entries,
but never recorded a follow-up tweet with the actual profit/loss figures.
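The two formats mentioned above can be handled by a single pattern. The following is a minimal sketch, not the exact code from the notebook: the pattern `PIPS_RE` and the helper `extract_pips` are hypothetical names, and the sample tweets are taken from the notebook output further below.

```python
import re

# Hypothetical pattern: the word "for" or a "Profit:"/"Loss:" label,
# followed by an optionally signed number and the word "pips".
PIPS_RE = re.compile(
    r"(?:\bfor\b|Profit:|Loss:)\s*([+-]?\d+(?:\.\d+)?)\s*pips",
    re.IGNORECASE,
)

def extract_pips(tweet_text):
    """Return the per-trade result in pips as a float, or None if absent."""
    match = PIPS_RE.search(tweet_text)
    return float(match.group(1)) if match else None

samples = [
    "Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading",
    "Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | 2017.12.15 19:27 GMT",
    "Bought GBPUSD 1.34391 #trading #EURUSD",  # trade entry only, no result
]

for text in samples:
    print(extract_pips(text))
```

Because `search` returns the first match, the per-trade figure is picked up rather than the "total for today" figure that follows it in the same tweet.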
Research Question(s)
Is the buy/sell advice for the EUR/USD (or other) currency pair(s) from Twitter
feeds worth following, and will it make you profitable?
Methods
What methods did you use to analyze the data and why are they appropriate? Be
sure to adequately, but briefly, describe your methods.
I used a very simple method of analyzing tweets and extracting the relevant info using very
simple string manipulation/regular expression operations. I did not use any sophisticated
natural language analysis or any ML model for sentiment analysis etc.
Rather, I employed a very simple and straightforward method: look at all the tweets, filter for
relevant tweets, extract the piece of info that is required, and use it to build a model for
evaluation. It is simple, yet very powerful in my opinion, because it opens a whole new
world of possibilities, for me at least.
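The evaluation model described above boils down to a running sum. Here is a minimal sketch, assuming each pip is worth 1 USD and the starting capital is 10,000; the function name `equity_curve` and the sample pip values are illustrative only.

```python
from itertools import accumulate

def equity_curve(pips_per_trade, initial_capital=10_000.0, usd_per_pip=1.0):
    """Account balance before the first trade and after each subsequent trade."""
    # Convert each trade result from pips to USD (assumed constant pip value).
    pnl = [pips * usd_per_pip for pips in pips_per_trade]
    # accumulate(..., initial=...) yields the starting balance first,
    # then the balance after every trade, in chronological order.
    return list(accumulate(pnl, initial=initial_capital))

# Illustrative per-trade results in pips, oldest trade first.
curve = equity_curve([23.8, -17.2, 10.4, -15.2])
print(curve)
```

The last element of the returned list is the final equity; comparing it with the initial capital tells us whether following the trades would have been profitable under these assumptions.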
Findings – Summary (refer to the next two slides: the equity curve and profit/loss distribution)
The analysis shows that although some analysts consistently lose money and their equity
drops significantly, others do make profits and maintain a healthy equity over a range
of months. So the hypothesis that we could follow Twitter feeds to manage our portfolio is
very much a possibility. However, it requires a structured analysis, in addition to the
preliminary research above, to re-confirm this possibility.
Specific to this analysis, three of the six analysts lost money, while
the other three showed positive returns. Two of them even showed enormous
returns, which at first glance seem too good to be true. As mentioned above, a second
round of validation, correlating the entry price and timestamp with the actual price of the
currency pair at that time, along with the corresponding exit, would confirm that
the trades were indeed profitable and that the tweets were not made up.
So it is indeed possible to manage a portfolio by smart monitoring of Twitter feeds and
selective tagging of trades!
Limitations
If applicable, describe limitations to your findings:
1. This analysis cannot be generalized to all tweets, because each user records his/her
trade ideas/results in a different format, although there is some structure to it.
2. The analysis also assumes that there is a position-sizing logic in place that makes the
value of each pip 1 USD. For example, with a position of 10,000 units of EUR/USD, 1 pip is
worth 1 USD; the value differs for pairs such as EUR/GBP, whose quote currency is not USD.
However, the analysis ignores this difference, as it is beyond the scope of this project.
3. The analysis is based only on tweets, and hence the authenticity of the tweets cannot be
guaranteed. To confirm the authenticity of the tweets, we have to extract the underlying
entry/exit points and validate them against the actual prices at the same points in time.
However, I believe this is a good starting point that can be expanded very easily.
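The position-size assumption in item 2 can be made concrete with a small calculation. This is a sketch under stated assumptions: the helper `pip_value_usd` is hypothetical, the pip sizes are the usual quoting conventions (0.0001 for most pairs, 0.01 for JPY pairs), and the conversion rates are illustrative values loosely based on prices seen in the tweets, not market data used in the analysis.

```python
def pip_value_usd(units, pip_size, quote_to_usd):
    """Value of a one-pip move in USD for a position of `units`.

    quote_to_usd is the price of one unit of the quote currency in USD
    (1.0 when the quote currency is already USD, as in EUR/USD).
    """
    return units * pip_size * quote_to_usd

# EUR/USD: quote currency is USD, so no conversion is needed.
eurusd = pip_value_usd(10_000, 0.0001, 1.0)

# EUR/GBP: quote currency is GBP; 1.34 USD per GBP is an assumed rate.
eurgbp = pip_value_usd(10_000, 0.0001, 1.34)

# USD/JPY: pip size is 0.01; 1/112.5 USD per JPY is an assumed rate.
usdjpy = pip_value_usd(10_000, 0.01, 1 / 112.5)

print(eurusd, eurgbp, usdjpy)
```

With the same 10,000 units, a pip is worth about 1 USD on EUR/USD but a different amount on the other two pairs, which is the difference the analysis ignores.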
Conclusions
Report your overall conclusions, preferably a conclusion per research question
My conclusion is that the research question has been positively answered, i.e. it is
indeed possible to follow trade ideas from Twitter, and if we time them correctly, we
will be able to manage a portfolio well and show a profitable equity curve.
However, it is important to do extensive research to understand
which users to follow and which trade ideas to follow. This analysis shows one
way of doing this research, with the above preliminary conclusion.
Acknowledgements
Where did you get your data?
All data is from historical twitter feeds
Did you use other informal analysis to inform your work?
I used the “search.twitter.com” website, to understand how different search strings
work, and how to optimize the search further.
Did you get feedback on your work by friends or colleagues?
No, I did not get any feedback from friends or colleagues.
References
If applicable, report any references you used in your work.
I mostly used online blogs as references for checking the actual syntax for
specifying search strings, syntax for dataframe manipulation, equity curve, plotting
etc.
PDF copy of the jupyter notebook
is attached below
In [1]:
# Data Source: Forex trade tweets from Twitter
# Analysis: We build a sample portfolio based on historical trade ideas/tweets and evaluate the performance
# Research question: Are the buy/sell advices for the EUR/USD currency pair from twitter feeds worth following
#                    and will they make you profitable?
Twitter based portfolio management
Can we rely on trade ideas from Twitter:
A portfolio simulation based on real trade tweets
In [2]:
# As we already have our twitter credentials in the pickle file from chapter 8,
# we will load the twitter credentials from this file and grab an API context/handle
# Also, while researching the internet for python twitter documentation, I came across tweepy
# tweepy is another python wrapper for the twitter API, and it seems much simpler, at least for me
# So I am using tweepy instead of the twitter package
In [3]:
import pickle
import os
#import twitter
import pandas as pd
import tweepy
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
%matplotlib inline
In [4]:
if not os.path.exists('secret_twitter_credentials.pkl'):
    print('Twitter auth file missing - please make sure, a valid secret_twitter_credentials file is present')
else:
    # use a context manager so the credentials file is closed after loading
    with open('secret_twitter_credentials.pkl','rb') as f:
        Twitter = pickle.load(f)
In [5]:
auth = tweepy.OAuthHandler(Twitter['Consumer Key'],
                           Twitter['Consumer Secret'])
auth.set_access_token(Twitter['Access Token'],Twitter['Access Token Secret'])
twitter_api = tweepy.API(auth)
type(twitter_api)
Out[5]:
tweepy.api.API
Step 1: Initial exploration of forex trading related tweets
In [6]:
# Let's start by searching for all tweets related to forextrading
In [7]:
topic = '#forextrading'
num = 100
status = twitter_api.search(q=topic,count=num)
print(type(status))
print(len(status))
for tweet in status[:15]:
    print (tweet.user.screen_name,tweet.text)
<class 'tweepy.models.SearchResults'>
96
moralfx Fundamental & Technical Analysis in Forex Trading
https://t.co/OJyAbspRjB
#forextrading #forextrader #forex https://t.co/psAfhg7dcx
mxandrv The ULTIMATE GUIDE on how to trade less and make more: https://t.co/bbbLRr4KUR #forex #fx #forextrading
FerruFx #MT4 #MT5 FFx Basket Scanner
See if a currency / related basket is tradable!
https://t.co/zQTWyymFRO
#FerruFx #fx… https://t.co/ee5yugwL2K
closed__ Fx=Full Gelir #forex #forextrading #forextrading #FxCanli #FX
ultraltdnet Entrepreneur Quote of the day!
If you like this quote, Share it Now!
#entrepreneurquotes #founderwords… https://t.co/TcSLpBFkL7
CityofInvestmnt https://t.co/rki7PPm5b5 I CHOOSE TO INVEST #wealth #forextrading #managedforex #money #invest
#Dollar #fx #pension… https://t.co/J6dZt7Majj
FerruFx #MT4 FFx Hidden TP/SL Manager
Hide your targets and stops to your broker
https://t.co/fGy3oOMYZE
#FerruFx #fx… https://t.co/PIFlBPXYC0
FX_haroldShan The first system uses 3 indicators to determine if there is up- or https://t.co/vJ80FBbV9w #Market
#ForexTrading #Invest #Fibonacci #ECN #EA
ForexFalcon_com If you have an edge be the casino. Play every setup. Don't fear the outcome of the next trade #Forex #ForexTrading https://t.co/0bDFKzQMgK
CityofInvestmnt Greetings from City Of Investment #cityofinvestment #managedforex #money #wealth #winners #fx #Forex #trading… https://t.co/8o7lUeZVRJ
CityofInvestmnt Forex Managed Accounts Exclusive x3 45%-75% #forextrading #stocks #forexsignals #cityofinvestment
#fx #Mexico… https://t.co/sv9k0zQuTH
MyForexYoda here's the information for tonights #forextrading #meetup https://t.co/ogCasriQPg
AliceElisson How to REDUCE UNNECESSARY #FOREX LOSSES and increase the number of winning trades ==> https://t.co/MCLsFL1Kag #forextrading #fx
forextradingbay Laser accurate and very fast #forex signals directly on your chart ==> https://t.co/kDNOAjvPMl
#forextrading #fx
SO4FRbcwQSyMhb8 ‫ ﻋﺒﺪ‬: ‫دﻟﯿﻠﻚ اﱃ ﻋﺎﱂ اﳌﺎل واﻻﻋﻤﺎل‬
‫اﻟﻮﺻﻮل اﱃ اﻟﺜﺮاء ﲞﻄﻮات ﺑﺴﯿﻄﺔ‬
‫ﲢﻘﯿﻖ اﻻرﺑﺎح واﻧﺖ ﲡﻠﺲ ﰲ ﻣﻨﺰﻟﻚ‬
‫ …اﻟﻌﺰﯾﺰ‬https://t.co/u6lSat9Bdj
In [8]:
# A search on the topic 'forextrading' is too broad,
# it does not provide any useful tweets that we can analyse in a structured way.
# Let's narrow it down further to a specific currency pair to look for trade ideas.
# I am picking eurusd, as this is the most popular currency pair that's traded
In [9]:
topic = '#forextrading AND #eurusd'
num = 100
status = twitter_api.search(q=topic,count=num)
print(type(status))
print(len(status))
for tweet in status[:5]:
    print (tweet.text)
<class 'tweepy.models.SearchResults'>
74
ApaChE and KproteKT THE #eurusd #trading algos now available for #free at https://t.co/lWI4i6w3Bu #forex
#forextrading #forexsignals #EA #…
ApaChE and KproteKT #eurusd #trading algos now above the 23% mark in real account. Try them out at
https://t.co/lsA88jkAa2 #forextrading #…
ApaChE and KproteKT THE #eurusd #trading algos now available for #free at https://t.co/tHzYJptV9a #forex
#forextrading #forexsignals #EA #…
ApaChE and KproteKT #eurusd #trading algos now above the 23% mark in real account. Try them out at
https://t.co/TfoRnOzeOE #forextrading #…
https://t.co/lxLDmyVqqO http://www.tradingzine.comGuide to Binary Options in US #asset #binary #Binary #options…
https://t.co/9d7dtr6dgl
In [10]:
# We still don't have any useful tweets
# let's narrow the search further using the keywords buy or sell as below
In [11]:
topic = '#eurusd AND buy OR sell'
num = 100
status = twitter_api.search(q=topic,count=num)
print(type(status))
print(len(status))
for tweet in status[:5]:
    print (tweet.text)
<class 'tweepy.models.SearchResults'>
59
Need much money now? Buy FX Robot to generate cash for you! Proven REAL accounts, big money, big fun!
https://t.co/n5k7LXC8b7 #EURUSD #Rich
No need to buy the EA or pay upfront fees. Money Management for every one with best brokers. Over 500%-2,000%/month. #Analize #EURUSD
#EURUSD #TRADESIGNAL
December 18 - 22, 2017
Sell Limit #1 @ 1.17667
and
Sell Limit #2 @ 1.17627
TAKE PROFIT @… https://t.co/TsksgKPO5q
EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/gPi8a6Vvhj #forex #eurusd #fx #news
EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/4mrCeMgi9t #forex #eurusd #fx #news
In [12]:
# This query above results in a better response set
# In a sample run, I got useful tweets such as the structured tweet below
# "SELL #EURUSD at 1.17584 SL:1.19184 TP:1.14784 , check live performances at...."
# Let's grab some tweets and the screen names, and present them in a dataframe format
# this will allow us to pick a few screen names at random that post structured tweets
In [13]:
all_text = []
filtered_results = []
for s in status:
    # keep only the first occurrence of each tweet text
    if not s.text in all_text:
        all_text.append(s.text)
        filtered_results.append(s)
results = filtered_results
len(results)
#print (results[0])
Out[13]:
56
In [14]:
tweet_data = pd.DataFrame(data=[[s.user.screen_name,s.text] for s in results],columns=['Author','Tweet'])
pd.set_option('max_colwidth',140)
tweet_data
Out[14]:
Author
Tweet
0
jerrry_fx
Need much money now? Buy FX Robot to generate cash for you! Proven REAL accounts, big money, big fun!
https://t.co/n5k7LXC8b7 #EURUSD #Rich
1
fxolivia_sh
No need to buy the EA or pay upfront fees. Money Management for every one with best brokers. Over 500%-2,000%/month.
#Analize #EURUSD
2
kenya_forex
#EURUSD #TRADESIGNAL December 18 - 22, 2017\n\nSell Limit #1 @ 1.17667\nand \nSell Limit #2 @ 1.17627\n\nTAKE
PROFIT @… https://t.co/T...
3
thefxcoach
EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/gPi8a6Vvhj #forex #eurusd #fx #news
4
thefxcoach
EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/4mrCeMgi9t #forex #eurusd #fx #news
5
myfxdataprov
2017_12_15_SYD_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
6
myfxdataprov
2017_12_15_SYD_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
7
JohnFXCorona
Don't gamble anymore! Don't lose any more money! Be a winner! Buy our EA and make $50k/month. #EuropeanUnion #EURUSD
8
SuperHeroInvest Closed Buy for BLI Fund #EURUSD 1.18043 for -42.2 pips, total for today -8.8 pips #Investors welcome https://t.co/QKLgwlVjta
9
forex22com
Forex Signals - 2017-12-15 19:45:01 : BUY EURUSD@1.17582 SL@1.17432 TP@1.17732 #EURUSD #ForexSignal
https://t.co/LwMgoUUihA
10 B52Finance
Closed Buy 0.01 Lots #EURUSD 1.18061 for -21.0 pips, total for today -109.6 pips
11 B52Finance
Closed Buy 0.1 Lots #EURUSD 1.18069 for -20.4 pips, total for today -88.6 pips
12 B52Finance
Closed Sell 0.1 Lots #EURUSD 1.17897 for -20.4 pips, total for today -192.3 pips
13 B52Finance
Closed Buy 0.01 Lots #EURUSD 1.17778 for +3.2 pips, total for today -90.4 pips
14 B52Finance
Closed Buy 0.1 Lots #EURUSD 1.17794 for +1.6 pips, total for today -93.6 pips
15 thefxcoach
EUR/USD strategy is to buy on dips https://t.co/D2NYn1SAez #forex #eurusd #fx #news
16 DayTradeScalps $EURUSD #EURUSD | SELL now | Open: 11762.1 | Target: 11755.8 (6.3) | Stop: 11769.8 (7.7) | #fx #forex #daytrading
17 forex22com
Forex Signals - 2017-12-15 17:00:00 : BUY EURUSD@1.17769 SL@1.17619 TP@1.17919 #EURUSD #ForexSignal
https://t.co/LwMgoUUihA
18 DOlefirov
#EurUsd sell 1.15
19 fxcapitalonline
Wednesday Trading Results\n12/13/17\n\n✅Sell #USDJPY Tp Hit +65 Pips\n✅ Sell #USDCAD Manual Close +30pips \n✅ Buy…
https://t.co/SjstAz7f7z
20 fxcapitalonline
Tuesday results 12/12/17\n\n❌Buy #GBPUSD Manual Close -8 pips❤฀\n❌Buy #EURUSD SL triggered - 10 pips❤฀\n❌Sell
#USDJPY S… https://t.co/KR...
21 fxcapitalonline
Monday results 12/11/17\n\n❌Buy #EURUSD SL triggered -12 #pips❤฀\n❌Buy #GBPUSD SL triggered -25 pips❤฀\n❌ #Buy
#AUDUSD… https://t.co/jNc...
22 thefxcoach
EUR/USD: Buy On Dips https://t.co/PKrGVg4Pcw #forex #eurusd #fx #news
23 Wermelgion_Co Closed Sell #Forex #Fx #EURUSD 1.17848 for +12.2 pips, total for today +66.5 pips
24 Wermelgion_Co Closed Sell #Forex #Fx #EURUSD 1.17998 for +9.9 pips, total for today +43.9 pips
25 cosmos4unet
#EURUSD buy signal on 15 DEC 2017 02:00 PM UTC by AdMACD Trading System (Timefr https://t.co/6dzr1x9AMY #Forex…
https://t.co/HbYNZy4MZz
26 DayTradeScalps $EURUSD #EURUSD | BUY now | Open: 11806.6 | Target: 11812.9 (6.3) | Stop: 11798.8 (7.8) | #fx #forex #daytrading
27 DayTradeScalps $EURUSD #EURUSD | BUY now | Open: 11804.1 | Target: 11810.4 (6.3) | Stop: 11796.3 (7.8) | #fx #forex #daytrading
28 myfxdataprov
2017_12_15_NYC_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
29 Limitforex
Decision time at #EURUSD.\n#EUR #USD #dollar #euro #gold #buy #sell #graphic #analysis #forex #fx #trade #limitforex…
https://t.co/gT0vm...
30 arbtrader100
#ARBsignals | BUY #EURUSD @ 1.1797 | SL:1.1777 | TP:1.1817 | SENT 2017-12-15 12:27:08 GMT | #forexsignal #fx #forex #fb
31 myfxdataprov
2017_12_15_NYC_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
32 forex22com
Forex Signals - 2017-12-15 12:00:00 : SELL EURUSD@1.17998 SL@1.18148 TP@1.17848 #EURUSD #ForexSignal
https://t.co/LwMgoUUihA
33 Wermelgion_Co Closed Buy #Forex #Fx #EURUSD 1.18083 for -8.2 pips, total for today +34.0 pips
34 Wermelgion_Co Closed Buy #Forex #Fx #EURUSD 1.17791 for +21.0 pips, total for today +42.2 pips
35 limitforextr
#EURUSD'da karar anı.\n#EUR #USD #dolar #euro #sterlin #yen #gold #altın #kazanç #buy #sell #grafik #analiz #forex…
https://t.co/WfrIh59ZWx
36 smart4trade
Bought #NAS100 6404.0 #smart4trader #sp500 #eurusd #dowjones #nasdaq #buy #sell #вк #vk мой сайт
https://t.co/RGfOk0hSXL
37 smart4trade
Bought #US30 24644.0 #smart4trader #sp500 #eurusd #dowjones #nasdaq #buy #sell #вк #vk мой сайт
https://t.co/RGfOk0hSXL
38 forex22com
Forex Signals - 2017-12-15 08:15:00 : SELL EURUSD@1.17864 SL@1.18014 TP@1.17714 #EURUSD #ForexSignal
https://t.co/LwMgoUUihA
39 EuroBulls_Forex
RT @Thai_TraderFX: GM, traders. #EURUSD buy.\n\nLearn to trade like a pro https://t.co/1961yQ4HIV \n\nJoin my telegram
chann...
40 thirdbrainfx
#x112 SELL #EURUSD at 1.17925 SL:1.19525 TP:1.15125 , check live performances at https://t.co/aTfkud3mWf
https://t.co/E5GbZJw0Ss
41 myfxdataprov
2017_12_15_LON_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
42 Thai_TraderFX
GM, traders. #EURUSD buy.\n\nLearn to trade like a pro https://t.co/1961yQ4HIV \n\nJoin my telegram channel… https://t.co/4X...
43 myfxdataprov
2017_12_15_LON_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
44 SuperHeroInvest Closed Sell for BLI Fund #EURUSD 1.18324 for +48.1 pips, total for today +48.1 pips #Investors welcome https://t.co/QKLgwlVjta
45 tj_fx_live
Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
46 myfxdataprov
2017_12_15_TYO_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
47 tj_fx_live
Closed Sell GBPUSD 1.34295 for -17.2 pips, total for today -17.2 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
48 myfxdataprov
2017_12_15_TYO_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
49 forex22com
Forex Signals - 2017-12-15 01:45:04 : SELL EURUSD@1.17812 SL@1.17962 TP@1.17662 #EURUSD #ForexSignal
https://t.co/LwMgoUUihA
50 forex22com
Forex Signals - 2017-12-14 23:45:03 : BUY EURUSD@1.1774 SL@1.1759 TP@1.1789 #EURUSD #ForexSignal
https://t.co/LwMgoUUihA
51 DarrenwongMfx Closed Sell 0.1 Lots EURUSD 1.17885 for +21.2 pips, total for today +21.2 pips #forex #eurusd
52 DarrenwongMfx Closed Sell 0.1 Lots EURUSD 1.17919 for +20.8 pips, total for today +20.8 pips #forex #eurusd
53 Forexrulebook
Nice buy opportunity on #EURJPY not to be missed. \nUpward sequence likely to continue\n\nGenesis Asset >> Join for $…
https://t....
54 myfxdataprov
2017_12_14_SYD_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
55 DarrenwongMfx
Closed Sell 0.1 Lots EURUSD 1.17952 for +20.3 pips, total for today +20.3 pips #forex #eurusd
Step 2: Grabbing tweets from specific screennames
In [15]:
# After a few trial runs of the above query, I have narrowed it down to the screen names below:
#     @fuchstraders
#     @DayTradeScalps
#     @SignalFactory
#     @tj_fx_live
#     @Wermelgion_Co
#     @MaggiecharFx
#     @DanielWr_fx
# I picked them mainly because of the consistent structure they follow in their tweets,
# making it easier for us to parse and analyse the data. Another important factor for
# choosing them is that they record the profit/loss per trade in the tweet, so
# we don't need special processing to analyse a portfolio based on their tweets
In [16]:
# grabbing user-specific tweets using the screen names picked above
# note that we are not using any hashtag filter or text filter
# I am doing a generic grab of the tweets from these users, to understand
# how often they tweet a trade idea/signal vs something else
In [17]:
# note: count=1000 is requested, but the output below shows 200 tweets returned per call
for name in ['fuchstraders','DayTradeScalps','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    tweets = twitter_api.user_timeline(screen_name=name, count=1000)
    print("*"*60)
    print("Number of tweets extracted for: {} is: {}.\n".format(name,len(tweets)))
    for tweet in tweets[:5]:
        print (tweet.text)
************************************************************
Number of tweets extracted for: fuchstraders is: 200.
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/Lnaq3rA8Pv
https://t.co/3WvOOlHGhW
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/eKRWSL9OH2
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/U2SORrE3XQ
https://t.co/7cJJ7P6xQL
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNxnXG https://t.co/jDWyz0tQa0
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/R1dPfi6neR
https://t.co/NggCw9jZXE
************************************************************
Number of tweets extracted for: DayTradeScalps is: 200.
$EURUSD #EURUSD | SELL now | Open: 11762.1 | Target: 11755.8 (6.3) | Stop: 11769.8 (7.7) | #fx #forex #daytrading
$EURCHF #EURCHF | SELL now | Open: 11667.8 | Target: 11663.1 (4.7) | Stop: 11677 (9.2) | #fx #forex #daytrading
$AUDJPY #AUDJPY | SELL now | Open: 8614.2 | Target: 8609.1 (5.1) | Stop: 8623.1 (8.9) | #fx #forex #daytrading
$NZDUSD #NZDUSD | SELL now | Open: 7006.5 | Target: 7001.6 (4.9) | Stop: 7015.6 (9.1) | #fx #forex #daytrading
$USDCHF #USDCHF | BUY now | Open: 9927.5 | Target: 9932.5 (5) | Stop: 9918.5 (9) | #fx #forex #daytrading
************************************************************
Number of tweets extracted for: SignalFactory is: 200.
Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
Forex Signal | Buy EURAUD@1.53770 | SL:1.53370 | TP:1.54570 | 2017.12.15 19:32 GMT | #fx #forex #fb
Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | 2017.12.15 19:27 GMT | #fx #forex #fb
Forex Signal | Close(SL) Buy CADCHF@0.76917 | Loss: -40 pips | 2017.12.15 19:19 GMT | #fx #forex #fb
Forex Signal | Close(SL) Sell AUDCAD@0.98433 | Loss: -40 pips | 2017.12.15 19:13 GMT | #fx #forex #fb
************************************************************
Number of tweets extracted for: tj_fx_live is: 200.
Bought GBPUSD 1.34391 #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Bought USDJPY 112.239 #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Closed Sell GBPUSD 1.34295 for -17.2 pips, total for today -17.2 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Closed 0.0 for 0.0 pips, total for today 0.0 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
************************************************************
Number of tweets extracted for: Wermelgion_Co is: 200.
Closed Sell #Forex #Fx #USDCHF 0.99072 for +10.4 pips, total for today +131.4 pips
Closed Buy #Forex #Fx #AUDCAD 0.98236 for +9.4 pips, total for today +121.0 pips
Closed Buy #Forex #Fx #AUDCAD 0.9805 for +12.2 pips, total for today +111.6 pips
Closed Sell #Forex #Fx #USDCHF 0.99234 for +7.7 pips, total for today +99.4 pips
Closed Sell #Forex #Fx #AUDCAD 0.97858 for -15.2 pips, total for today +91.7 pips
************************************************************
Number of tweets extracted for: MaggiecharFx is: 200.
Closed Sell 1.0 Lots EURUSD 1.17946 for +25.5 pips, total for today +693.4 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.17947 for +25.5 pips, total for today +667.9 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.17947 for +25.5 pips, total for today +642.4 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.1796 for +25.1 pips, total for today +616.9 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.17984 for +27.0 pips, total for today +591.8 pips #Online #ForexTrading #Advisor #Base #Code
************************************************************
Number of tweets extracted for: DanielWr_fx is: 200.
Maximum Equity Drop (also called "Draw-Down" or "Risk Management") is less than 25% (usually no more than 10-15%). #GetMoney #FreeRobot #FX
You can withdraw your initial deposit after a few days and then we trade only with the profits, so you don't risk your money any more. #Help
You can exit when 2 of 3 indicators reverse or use Trailing Stop Loss, Take Profit, Risk Management, etc.…
https://t.co/zEXnpOus5b
How to start? Let us know which broker you prefer and we tell you further details about deposit, account type, etc. #IntroducingBroker #Job
You don't need to have any Forex knowledge whatsoever. I trade for you – you just watch the profits coming.… https://t.co/xfr8LQQN7d
Step 3: Extract all tweets from one user and load it into a dataframe
In [18]:
# Now that we have some structured tweets at hand, we can move on to the next step
# extracting all tweets from a user and loading them into a dataframe
In [19]:
# Before the extraction, I am going to note down the structure of the tweet that
# is of interest for us, from each of these users - we will use this structure for
# parsing the tweets
In [20]:
for name in ['fuchstraders','DayTradeScalps','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    tweets = twitter_api.user_timeline(screen_name=name, count=1000)
    print("*"*60)
    for tweet in tweets:
        # parentheses are needed: 'and' binds tighter than 'or', so without them
        # a retweet containing 'Close' would also pass the filter
        if 'RT' not in tweet.text and ('close' in tweet.text or 'Close' in tweet.text):
            print("The relevant tweet from: {} that we will use for our analysis is: \n {} \n".format(name,tweet.text))
            break
************************************************************
The relevant tweet from: fuchstraders that we will use for our analysis is:
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/Lnaq3rA8Pv
https://t.co/3WvOOlHGhW
************************************************************
************************************************************
The relevant tweet from: SignalFactory that we will use for our analysis is:
Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
************************************************************
The relevant tweet from: tj_fx_live that we will use for our analysis is:
Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
************************************************************
The relevant tweet from: Wermelgion_Co that we will use for our analysis is:
Closed Sell #Forex #Fx #USDCHF 0.99072 for +10.4 pips, total for today +131.4 pips
************************************************************
The relevant tweet from: MaggiecharFx that we will use for our analysis is:
Closed Sell 1.0 Lots EURUSD 1.17946 for +25.5 pips, total for today +693.4 pips #Online #ForexTrading #Advisor #Base #Code
************************************************************
The relevant tweet from: DanielWr_fx that we will use for our analysis is:
Closed Sell 2.3 Lots EURUSD 1.17644 for +10.5 pips, total for today +797.4 pips #Online #ForexTrading #Advisor #Base #Code
In [21]:
# I can't seem to find the word 'close' or 'Close' in the tweets from DayTradeScalps !!
# Let's do some further specific analysis of this user's tweets, to understand
# whether he tweets the closure of his trades
In [22]:
# After a few trials, I had to go to the twitter search site to search there
# https://twitter.com/search?l=en&q=buy%20OR%20sell%20from%3ADayTradeScalps&src=typd
# The results showed that this user only tweets the trade entries, but does not record
# whether the targets were hit or the stop loss was triggered
# such an open-ended tweet requires further correlation of the targets in the tweets
# with the actual price movements at those times
# Such an analysis is beyond the scope of this project, so I am dropping the user
# @DayTradeScalps from my analysis
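For reference, the correlation step that is out of scope here could look roughly like the sketch below. It is not part of the analysis: the function `resolve_trade` is hypothetical, the entry levels come from one of the @DayTradeScalps tweets shown earlier, and the price path is made up purely for illustration.

```python
def resolve_trade(side, target, stop, prices):
    """Return 'target', 'stop' or 'open', whichever level is touched first.

    prices is a chronological sequence of quotes observed after the entry.
    """
    for price in prices:
        if side == "SELL":
            # a short trade profits when the price falls to the target
            if price <= target:
                return "target"
            if price >= stop:
                return "stop"
        else:
            # a long trade profits when the price rises to the target
            if price >= target:
                return "target"
            if price <= stop:
                return "stop"
    return "open"  # neither level was reached in the observed window

# Levels from: "SELL now | Open: 11762.1 | Target: 11755.8 (6.3) | Stop: 11769.8 (7.7)"
# The price path below is invented for illustration only.
made_up_prices = [11761.0, 11764.5, 11758.0, 11755.0]
outcome = resolve_trade("SELL", target=11755.8, stop=11769.8, prices=made_up_prices)
print(outcome)
```

A real validation would need historical quotes for the same timestamps as the tweets, which is exactly the extra data source this project does not pull in.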
In [23]:
# Let's continue with loading the tweets into a df
# Here we define two helper functions that will help us retrieve all tweets from a specific user
# and load them all up into one dataframe that will help us with the analysis
# I guess there is a max throttle somewhere, but we will proceed to retrieve all tweets
In [24]:
def get_all_unique_tweets(screen_name):
    all_tweets = []
    new_tweets = twitter_api.user_timeline(screen_name = screen_name,count=200)
    # According to twitter api documents, we can request the next set of tweets based on the max_id
    # parameter - so we continue looping retrieving old tweets, and if the number of tweets returned is
    # zero we exit the loop (this also covers a user with no tweets at all)
    while len(new_tweets) != 0:
        all_tweets.extend(new_tweets)
        oldest = all_tweets[-1].id - 1
        new_tweets = twitter_api.user_timeline(screen_name = screen_name,count=200,max_id=oldest)
    # we repeat the filter for unique tweets
    # we could have written a function, but I will get on with it for now
    all_tweet_text = []
    filtered_tweets = []
    for t in all_tweets:
        if not t.text in all_tweet_text:
            all_tweet_text.append(t.text)
            filtered_tweets.append(t)
    return filtered_tweets
In [25]:
def load_tweets_into_df(all_tweets):
    tweet_df = pd.DataFrame(data=[[s.created_at, s.user.screen_name, s.text] for s in all_tweets],
                            columns=['CreatedAt', 'Author', 'Tweet'])
    return tweet_df
In [26]:
frames = []
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    a_tweets = get_all_unique_tweets(name)
    tweet_df = load_tweets_into_df(a_tweets)
    frames.append(tweet_df)
    print('Loaded {} tweets from {} into a dataframe'.format(len(tweet_df),name))
# DataFrame.append was removed in pandas 2.0; pd.concat keeps each author's own 0..n index
all_tweet_df = pd.concat(frames)
print('Total tweets loaded = {}'.format(len(all_tweet_df)))
Loaded 3204 tweets from fuchstraders into a dataframe
Loaded 3118 tweets from SignalFactory into a dataframe
Loaded 763 tweets from tj_fx_live into a dataframe
Loaded 219 tweets from Wermelgion_Co into a dataframe
Loaded 3206 tweets from MaggiecharFx into a dataframe
Loaded 1960 tweets from DanielWr_fx into a dataframe
Total tweets loaded = 12470
Step 4: Data transformation and cleansing
In [27]:
all_tweet_df.head(5)
Out[27]:
            CreatedAt        Author  Tweet
0 2017-12-15 15:33:39  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/Lnaq3rA8Pv https://t.co/3WvOOlHGhW
1 2017-12-15 13:02:14  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/eKRWSL9OH2
2 2017-12-15 09:33:21  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/U2SORrE3XQ https://t.co/7cJJ7P6xQL
3 2017-12-15 07:04:23  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNxnXG https://t.co/jDWyz0tQa0
4 2017-12-15 03:33:19  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/R1dPfi6neR https://t.co/NggCw9jZXE
In [28]:
# This dataframe has all tweets from these six users
# we need to look for tweets that book profits or losses
# Initial analysis shows, such tweets usually contain the word 'pips'
# as we observed in our analysis of the tweet structure
In [29]:
# We could have either filtered such tweets in the twitter_api search
# or we could have filtered those tweets during the dataframe loading exercise
# The third option is to filter them in the dataframe
# I will use the third option for now, as this gives us a good example
# of loading raw tweets into a dataframe and filtering them there
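As a rough illustration of the third option (filtering after loading), here is a minimal sketch in plain Python so that it runs without pandas. The helper name `keep_result_tweets` is hypothetical, the first sample row is shortened from a SignalFactory tweet shown later in this notebook, and the second row is an invented entry-style tweet added only for contrast.

```python
def keep_result_tweets(rows, keyword='pips'):
    """Keep only the rows whose 'Tweet' text mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [row for row in rows if needle in row['Tweet'].lower()]


rows = [
    # Shortened from a tweet that appears later in this notebook
    {'Author': 'SignalFactory',
     'Tweet': 'Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips'},
    # Invented entry-style tweet, for illustration only
    {'Author': 'SignalFactory',
     'Tweet': 'Forex Signal | Sell AUDCAD | entry only, no result yet'},
]

for row in keep_result_tweets(rows):
    print(row['Author'], '|', row['Tweet'])
```

On a dataframe, the analogous filter would be a boolean mask built with `Series.str.contains`, which is the approach used for the 'Close' filter further down.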
In [30]:
before = len(all_tweet_df)
print(before)
12470
In [31]:
all_tweet_df.isnull().any()
Out[31]:
CreatedAt    False
Author       False
Tweet        False
dtype: bool
In [32]:
# To build an equity curve out of the profit/loss values,
# we first need to extract the profit/loss values per trade
# we note that the profit/loss is in between 'for' and 'pips' for all users, except for tweets from SignalFactory
# For tweets from SignalFactory, the profit/loss is in between 'Profit:' and 'pips' or 'Loss:' and 'pips'
In [33]:
# There are multiple ways to handle this
# one way is to split the dataframe again by the tweet authors and
# apply a different extraction logic for each tweet author
# another way is to do a text replace, to bring all records in the tweet to contain the same pattern
# to keep it simple, let's choose the second way, i.e. replace the text 'Profit:' and 'Loss:' with the text 'for'
# however, as each user's tweets will follow a different structure, a possible future extension
# could be to write a dedicated profit/loss extractor for each user and apply it to the dataframe
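The future extension mentioned above, a dedicated extractor per tweet format, could look roughly like the sketch below. It is not part of the original analysis: the style labels `signal_style` and `closed_style` and the function `extract_pips` are hypothetical names, and the patterns are assumptions based only on the two tweet formats shown in this notebook.

```python
import re

# A signed or unsigned decimal number, e.g. +79, -40, 23.2
NUMBER = r'([+-]?\d+(?:\.\d+)?)'

# One pattern per tweet format; the keys are hypothetical labels, not author names
PATTERNS = {
    'signal_style': re.compile(r'(?:Profit|Loss):\s*' + NUMBER + r'\s*pips'),
    'closed_style': re.compile(r'for\s*' + NUMBER + r'\s*pips'),
}


def extract_pips(tweet, style):
    """Return the first profit/loss value found in the tweet as a float, or None."""
    match = PATTERNS[style].search(tweet)
    return float(match.group(1)) if match else None


print(extract_pips('Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips', 'signal_style'))
print(extract_pips('Closed Sell #Forex #Fx #AUDNZD 1.09677 for +23.2 pips, total for today +9.2 pips',
                   'closed_style'))
```

Because each pattern requires a number directly before 'pips', tweets without a result simply yield `None`, which plays the same role as the NaN rows dropped later.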
In [34]:
# Let's first create a filter and see the values that are present
closed_tweet = all_tweet_df['Tweet'].str.contains('Close')
author = all_tweet_df['Author'] == 'SignalFactory'
df_match = closed_tweet & author
print(all_tweet_df[df_match][:5])
all_tweet_df['Tweet'] = all_tweet_df['Tweet'].str.replace('Profit:','for')
all_tweet_df['Tweet'] = all_tweet_df['Tweet'].str.replace('Loss:','for')
# Now if we print the rows based on the same filter, they should contain
# 'for' instead of the words 'Profit:' or 'Loss:'
print(all_tweet_df[df_match][:5])
            CreatedAt         Author  Tweet
0 2017-12-15 19:47:07  SignalFactory  Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
2 2017-12-15 19:32:03  SignalFactory  Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | 2017.12.15 19:27 GMT | #fx #forex #fb
3 2017-12-15 19:30:00  SignalFactory  Forex Signal | Close(SL) Buy CADCHF@0.76917 | Loss: -40 pips | 2017.12.15 19:19 GMT | #fx #forex #fb
4 2017-12-15 19:17:03  SignalFactory  Forex Signal | Close(SL) Sell AUDCAD@0.98433 | Loss: -40 pips | 2017.12.15 19:13 GMT | #fx #forex #fb
7 2017-12-15 17:15:00  SignalFactory  Forex Signal | Close(SL) Buy CADCHF@0.77068 | Loss: -40 pips | 2017.12.15 17:06 GMT | #fx #forex #fb
            CreatedAt         Author  Tweet
0 2017-12-15 19:47:07  SignalFactory  Forex Signal | Close(TP) Sell CADJPY@87.394 | for +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
2 2017-12-15 19:32:03  SignalFactory  Forex Signal | Close(SL) Sell AUDCAD@0.98526 | for -40 pips | 2017.12.15 19:27 GMT | #fx #forex #fb
3 2017-12-15 19:30:00  SignalFactory  Forex Signal | Close(SL) Buy CADCHF@0.76917 | for -40 pips | 2017.12.15 19:19 GMT | #fx #forex #fb
4 2017-12-15 19:17:03  SignalFactory  Forex Signal | Close(SL) Sell AUDCAD@0.98433 | for -40 pips | 2017.12.15 19:13 GMT | #fx #forex #fb
7 2017-12-15 17:15:00  SignalFactory  Forex Signal | Close(SL) Buy CADCHF@0.77068 | for -40 pips | 2017.12.15 17:06 GMT | #fx #forex #fb
In [35]:
# Now all the rows that contain the profit/loss values are normalized to contain the
# profit/loss value between the strings 'for' and 'pips'
In [36]:
# The next step is to extract the profit/loss values, and create a new column out of it
# we can use a combination of the extract function and regular expressions
# however, if I try to extract the text between 'for' and 'pips' using regular expressions,
# I get the second match instead of the first match,
# as the words 'for' and 'pips' appear twice in the tweet, e.g.
# Closed Sell #Forex #Fx #AUDNZD 1.09677 for +23.2 pips, total for today +9.2 pips
# I am using a work around here which is to split the text
# and then extract the text between the words 'for' and 'pips'
In [37]:
all_tweet_df['Profit/Loss String'] = all_tweet_df['Tweet'].str.split(',').str.get(0)
all_tweet_df['Profit/Loss'] = all_tweet_df['Profit/Loss String'].str.extract('.*for(.*)pips.*',expand=True)
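The "second match" behaviour described above comes from greedy matching: the leading `.*` consumes as much text as it can, so the pattern locks onto the last 'for'. As an alternative to splitting on the comma, a lazy (non-greedy) group picks up the first occurrence. The sketch below uses only the standard `re` module and the example tweet quoted in the comment above; it is an illustration, not the method used in the rest of this notebook.

```python
import re

tweet = 'Closed Sell #Forex #Fx #AUDNZD 1.09677 for +23.2 pips, total for today +9.2 pips'

# Greedy: the leading '.*' swallows everything up to the LAST 'for'
greedy = re.search(r'.*for(.*)pips.*', tweet).group(1).strip()

# Lazy: search from the left and stop at the FIRST 'pips' after the first 'for'
first = re.search(r'for(.*?)pips', tweet).group(1).strip()

print('greedy match:', greedy)
print('lazy match:  ', first)
```

Note that '#Forex' does not interfere here because the pattern is case-sensitive and looks for lowercase 'for'.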
In [38]:
len(all_tweet_df)
Out[38]:
12470
In [39]:
# Let's check to make sure we extracted the profit/loss correctly for all tweet authors
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    author = all_tweet_df['Author'] == name
    print(all_tweet_df[author][['CreatedAt','Author','Profit/Loss']][:10])
            CreatedAt         Author Profit/Loss
0 2017-12-15 15:33:39   fuchstraders         0.0
1 2017-12-15 13:02:14   fuchstraders         0.0
2 2017-12-15 09:33:21   fuchstraders         0.0
3 2017-12-15 07:04:23   fuchstraders         0.0
4 2017-12-15 03:33:19   fuchstraders         0.0
5 2017-12-15 01:02:14   fuchstraders         0.0
6 2017-12-14 21:33:28   fuchstraders         0.0
7 2017-12-14 18:33:44   fuchstraders         NaN
8 2017-12-14 18:33:43   fuchstraders         NaN
9 2017-12-14 18:33:41   fuchstraders         NaN
            CreatedAt         Author Profit/Loss
0 2017-12-15 19:47:07  SignalFactory         +79
1 2017-12-15 19:45:04  SignalFactory         NaN
2 2017-12-15 19:32:03  SignalFactory         -40
3 2017-12-15 19:30:00  SignalFactory         -40
4 2017-12-15 19:17:03  SignalFactory         -40
5 2017-12-15 19:15:01  SignalFactory         NaN
6 2017-12-15 17:30:01  SignalFactory         NaN
7 2017-12-15 17:15:00  SignalFactory         -40
8 2017-12-15 17:00:00  SignalFactory         -40
9 2017-12-15 16:15:00  SignalFactory         NaN
            CreatedAt         Author Profit/Loss
0 2017-12-15 07:05:00     tj_fx_live         NaN
1 2017-12-15 02:53:48     tj_fx_live         NaN
2 2017-12-15 02:48:49     tj_fx_live       +23.8
3 2017-12-15 02:28:42     tj_fx_live       -17.2
4 2017-12-14 22:23:23     tj_fx_live         0.0
5 2017-12-14 14:09:14     tj_fx_live         NaN
6 2017-12-14 14:04:13     tj_fx_live       +48.4
7 2017-12-14 07:00:45     tj_fx_live         NaN
8 2017-12-13 22:20:50     tj_fx_live         0.0
9 2017-12-13 22:05:42     tj_fx_live       -35.1
            CreatedAt         Author Profit/Loss
0 2017-12-15 21:35:28  Wermelgion_Co       +10.4
1 2017-12-15 17:35:58  Wermelgion_Co        +9.4
2 2017-12-15 16:55:25  Wermelgion_Co       +12.2
3 2017-12-15 16:25:24  Wermelgion_Co        +7.7
4 2017-12-15 15:44:59  Wermelgion_Co       -15.2
5 2017-12-15 15:44:58  Wermelgion_Co       +24.0
6 2017-12-15 15:24:49  Wermelgion_Co       +16.4
7 2017-12-15 14:59:38  Wermelgion_Co       +12.2
8 2017-12-15 14:29:22  Wermelgion_Co       +10.4
9 2017-12-15 14:24:39  Wermelgion_Co        +9.9
            CreatedAt         Author Profit/Loss
0 2017-12-15 19:32:12   MaggiecharFx       +25.5
1 2017-12-15 19:32:12   MaggiecharFx       +25.5
2 2017-12-15 19:32:11   MaggiecharFx       +25.5
3 2017-12-15 19:32:11   MaggiecharFx       +25.1
4 2017-12-15 19:32:11   MaggiecharFx       +27.0
5 2017-12-15 15:16:40   MaggiecharFx       +26.0
6 2017-12-15 15:16:40   MaggiecharFx       +25.5
7 2017-12-15 15:16:40   MaggiecharFx       +25.9
8 2017-12-15 15:16:40   MaggiecharFx       +28.9
9 2017-12-15 15:16:39   MaggiecharFx       +28.2
            CreatedAt         Author Profit/Loss
0 2017-12-16 11:00:52    DanielWr_fx         NaN
1 2017-12-16 10:00:53    DanielWr_fx         NaN
2 2017-12-16 09:20:24    DanielWr_fx         NaN
3 2017-12-16 09:00:16    DanielWr_fx         NaN
4 2017-12-16 08:45:24    DanielWr_fx         NaN
5 2017-12-16 08:00:52    DanielWr_fx         NaN
6 2017-12-16 07:01:04    DanielWr_fx         NaN
7 2017-12-16 06:00:49    DanielWr_fx         NaN
8 2017-12-16 05:00:47    DanielWr_fx         NaN
9 2017-12-16 04:45:23    DanielWr_fx         NaN
In [40]:
# The next important step is to drop the NaN values, as the rows with
# NaN values are tweets other than those that record the profit/loss
# They could be a valid trade entry related tweet, but we are
# not interested in those anyway, unless we want to follow the trade ideas
# That is a topic by itself - and is the follow-up research after this analysis :-)
In [41]:
all_tweet_df.isnull().any()
Out[41]:
CreatedAt             False
Author                False
Tweet                 False
Profit/Loss String    False
Profit/Loss            True
dtype: bool
In [42]:
before = len(all_tweet_df)
all_tweet_df = all_tweet_df.dropna()
after = len(all_tweet_df)
print ('No. of records dropped = {}'.format(before - after))
No. of records dropped = 3948
In [43]:
all_tweet_df.isnull().any()
Out[43]:
CreatedAt             False
Author                False
Tweet                 False
Profit/Loss String    False
Profit/Loss           False
dtype: bool
In [44]:
all_tweet_df.dtypes
Out[44]:
CreatedAt             datetime64[ns]
Author                        object
Tweet                         object
Profit/Loss String            object
Profit/Loss                   object
dtype: object
In [45]:
# The very last but important step, is to convert the profit/loss column from
# a generic object into a float, so it is easier to process further
In [46]:
all_tweet_df['Profit/Loss'] = pd.to_numeric(all_tweet_df['Profit/Loss'])
all_tweet_df.dtypes
Out[46]:
CreatedAt             datetime64[ns]
Author                        object
Tweet                         object
Profit/Loss String            object
Profit/Loss                  float64
dtype: object
Step 5: Visualization
In [47]:
# In order to visualize the growth in the portfolio, based on the
# profit/loss from individual trades, we start with a seed investment of
# 10000 USD and continue adding the profit/loss values.
# There is an important detail to note here:
# if we place a trade, say buying 10000 units of EUR_USD, then 1 pip in profit results
# in 1 USD in profit
# So the general formula for pip to USD conversion for a EUR_USD trade is:
#     No. of units / 10000 * no. of pips in profit/loss
# In our tweet analysis, I initially wanted to look at only EUR_USD trades; however,
# since the no. of tweets was low, I included all currencies
# Applying a currency conversion factor is beyond the scope of this project
# So we will make an important assumption here that the no. of units traded
# for a specific currency pair, i.e. the position size, already takes this conversion factor into account
# such that a 1 pip profit always results in a 1 USD profit
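To make the conversion concrete, the formula above can be sketched as a small function. The name `pips_to_usd` is hypothetical, and the divisor of 10000 is simply the EUR_USD assumption stated above applied to every pair, which is the simplification this notebook makes rather than a general rule.

```python
def pips_to_usd(units, pips, units_per_usd_pip=10000):
    """Convert a pip result into USD, assuming 10000 units earn 1 USD per pip."""
    return units / units_per_usd_pip * pips


# 10000 units and a 1 pip gain: the case described in the comment above
print(pips_to_usd(10000, 1))
# The same position size on a 40 pip stop-loss
print(pips_to_usd(10000, -40))
# Doubling the position size doubles the USD result
print(pips_to_usd(20000, 23.8))
```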
In [48]:
# Let's filter out the dataframes according to their tweet authors
# and store them in a dictionary with author names as keys, for easy access
In [49]:
dict_of_author_tweets = {}
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    author = all_tweet_df['Author'] == name
    dict_of_author_tweets[name] = all_tweet_df[author].copy()
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss']].head())
             CreatedAt         Author  Profit/Loss
0  2017-12-15 15:33:39   fuchstraders          0.0
1  2017-12-15 13:02:14   fuchstraders          0.0
2  2017-12-15 09:33:21   fuchstraders          0.0
3  2017-12-15 07:04:23   fuchstraders          0.0
4  2017-12-15 03:33:19   fuchstraders          0.0
             CreatedAt         Author  Profit/Loss
0  2017-12-15 19:47:07  SignalFactory         79.0
2  2017-12-15 19:32:03  SignalFactory        -40.0
3  2017-12-15 19:30:00  SignalFactory        -40.0
4  2017-12-15 19:17:03  SignalFactory        -40.0
7  2017-12-15 17:15:00  SignalFactory        -40.0
             CreatedAt         Author  Profit/Loss
2  2017-12-15 02:48:49     tj_fx_live         23.8
3  2017-12-15 02:28:42     tj_fx_live        -17.2
4  2017-12-14 22:23:23     tj_fx_live          0.0
6  2017-12-14 14:04:13     tj_fx_live         48.4
8  2017-12-13 22:20:50     tj_fx_live          0.0
             CreatedAt         Author  Profit/Loss
0  2017-12-15 21:35:28  Wermelgion_Co         10.4
1  2017-12-15 17:35:58  Wermelgion_Co          9.4
2  2017-12-15 16:55:25  Wermelgion_Co         12.2
3  2017-12-15 16:25:24  Wermelgion_Co          7.7
4  2017-12-15 15:44:59  Wermelgion_Co        -15.2
             CreatedAt         Author  Profit/Loss
0  2017-12-15 19:32:12   MaggiecharFx         25.5
1  2017-12-15 19:32:12   MaggiecharFx         25.5
2  2017-12-15 19:32:11   MaggiecharFx         25.5
3  2017-12-15 19:32:11   MaggiecharFx         25.1
4  2017-12-15 19:32:11   MaggiecharFx         27.0
             CreatedAt         Author  Profit/Loss
19 2017-12-15 19:07:09    DanielWr_fx         10.5
20 2017-12-15 19:07:09    DanielWr_fx         10.1
21 2017-12-15 19:07:08    DanielWr_fx         -1.2
22 2017-12-15 19:07:08    DanielWr_fx         10.8
23 2017-12-15 19:07:08    DanielWr_fx         10.5
In [50]:
# Let's create a column named Equity with 0 values initially
# We will calculate the values for equity subsequently
In [51]:
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    dict_of_author_tweets[name]['Equity'] = 0
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
             CreatedAt         Author  Profit/Loss  Equity
0  2017-12-15 15:33:39   fuchstraders          0.0       0
1  2017-12-15 13:02:14   fuchstraders          0.0       0
2  2017-12-15 09:33:21   fuchstraders          0.0       0
3  2017-12-15 07:04:23   fuchstraders          0.0       0
4  2017-12-15 03:33:19   fuchstraders          0.0       0
             CreatedAt         Author  Profit/Loss  Equity
0  2017-12-15 19:47:07  SignalFactory         79.0       0
2  2017-12-15 19:32:03  SignalFactory        -40.0       0
3  2017-12-15 19:30:00  SignalFactory        -40.0       0
4  2017-12-15 19:17:03  SignalFactory        -40.0       0
7  2017-12-15 17:15:00  SignalFactory        -40.0       0
             CreatedAt         Author  Profit/Loss  Equity
2  2017-12-15 02:48:49     tj_fx_live         23.8       0
3  2017-12-15 02:28:42     tj_fx_live        -17.2       0
4  2017-12-14 22:23:23     tj_fx_live          0.0       0
6  2017-12-14 14:04:13     tj_fx_live         48.4       0
8  2017-12-13 22:20:50     tj_fx_live          0.0       0
             CreatedAt         Author  Profit/Loss  Equity
0  2017-12-15 21:35:28  Wermelgion_Co         10.4       0
1  2017-12-15 17:35:58  Wermelgion_Co          9.4       0
2  2017-12-15 16:55:25  Wermelgion_Co         12.2       0
3  2017-12-15 16:25:24  Wermelgion_Co          7.7       0
4  2017-12-15 15:44:59  Wermelgion_Co        -15.2       0
             CreatedAt         Author  Profit/Loss  Equity
0  2017-12-15 19:32:12   MaggiecharFx         25.5       0
1  2017-12-15 19:32:12   MaggiecharFx         25.5       0
2  2017-12-15 19:32:11   MaggiecharFx         25.5       0
3  2017-12-15 19:32:11   MaggiecharFx         25.1       0
4  2017-12-15 19:32:11   MaggiecharFx         27.0       0
             CreatedAt         Author  Profit/Loss  Equity
19 2017-12-15 19:07:09    DanielWr_fx         10.5       0
20 2017-12-15 19:07:09    DanielWr_fx         10.1       0
21 2017-12-15 19:07:08    DanielWr_fx         -1.2       0
22 2017-12-15 19:07:08    DanielWr_fx         10.8       0
23 2017-12-15 19:07:08    DanielWr_fx         10.5       0
In [52]:
# We also notice that the dataframe is in reverse chronological order because
# the tweets were retrieved from the latest to the oldest
# Let's sort the dataframes in a chronological order
# and let's reset the index, so that further manipulation and plotting becomes easier
In [53]:
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    dict_of_author_tweets[name] = dict_of_author_tweets[name].sort_values(by='CreatedAt')
    dict_of_author_tweets[name] = dict_of_author_tweets[name].reset_index(drop=True)
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
            CreatedAt         Author  Profit/Loss  Equity
0 2017-03-24 01:07:36   fuchstraders        -23.4       0
1 2017-03-24 04:14:24   fuchstraders        -12.4       0
2 2017-03-28 21:08:48   fuchstraders       -199.0       0
3 2017-03-28 22:33:40   fuchstraders       -199.0       0
4 2017-03-30 14:21:07   fuchstraders        -19.5       0
            CreatedAt         Author  Profit/Loss  Equity
0 2017-05-30 21:39:20  SignalFactory          0.0       0
1 2017-05-30 21:44:24  SignalFactory        -26.4       0
2 2017-05-30 21:59:27  SignalFactory        -15.8       0
3 2017-05-30 22:04:29  SignalFactory         -9.3       0
4 2017-05-31 01:15:01  SignalFactory         80.0       0
            CreatedAt         Author  Profit/Loss  Equity
0 2017-05-18 14:14:55     tj_fx_live        249.7       0
1 2017-05-18 19:24:34     tj_fx_live         12.6       0
2 2017-05-19 03:42:31     tj_fx_live        -56.3       0
3 2017-05-19 11:57:26     tj_fx_live        -45.5       0
4 2017-05-19 23:49:18     tj_fx_live         16.7       0
            CreatedAt         Author  Profit/Loss  Equity
0 2017-11-26 23:03:03  Wermelgion_Co          7.3       0
1 2017-11-27 01:22:49  Wermelgion_Co        -34.9       0
2 2017-11-27 01:22:49  Wermelgion_Co         36.8       0
3 2017-11-27 01:22:49  Wermelgion_Co          1.0       0
4 2017-11-27 01:27:51  Wermelgion_Co         37.2       0
            CreatedAt         Author  Profit/Loss  Equity
0 2017-04-19 10:12:29   MaggiecharFx         10.2       0
1 2017-04-19 10:12:29   MaggiecharFx         10.6       0
2 2017-04-19 10:12:30   MaggiecharFx         11.0       0
3 2017-04-19 10:12:30   MaggiecharFx          9.8       0
4 2017-04-19 10:12:30   MaggiecharFx         11.0       0
            CreatedAt         Author  Profit/Loss  Equity
0 2017-10-13 15:19:11    DanielWr_fx         28.9       0
1 2017-10-13 15:19:11    DanielWr_fx         25.4       0
2 2017-10-13 15:19:12    DanielWr_fx         26.0       0
3 2017-10-13 15:19:12    DanielWr_fx         26.1       0
4 2017-10-13 18:04:27    DanielWr_fx         -1.2       0
In [54]:
# To calculate the running equity value, we need to start with an initial balance
# of 10000 USD, so let's insert a row in each dataframe at the first position
# with an equity value of 10000 USD
In [55]:
from datetime import timedelta
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    oldest_trade = dict_of_author_tweets[name]['CreatedAt'][0]
    # The seed row is dated one day before the oldest trade so that it sorts first
    new_top_row = [{'CreatedAt': oldest_trade + timedelta(days=-1), 'Author': name, 'Tweet': 'Initial Capital',
                    'Profit/Loss String': 'Happy trading', 'Profit/Loss': 0.0, 'Equity': 10000.0}]
    dict_of_author_tweets[name] = pd.concat([pd.DataFrame(new_top_row), dict_of_author_tweets[name]],
                                            ignore_index=True)
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
            CreatedAt         Author  Profit/Loss   Equity
0 2017-03-23 01:07:36   fuchstraders          0.0  10000.0
1 2017-03-24 01:07:36   fuchstraders        -23.4      0.0
2 2017-03-24 04:14:24   fuchstraders        -12.4      0.0
3 2017-03-28 21:08:48   fuchstraders       -199.0      0.0
4 2017-03-28 22:33:40   fuchstraders       -199.0      0.0
            CreatedAt         Author  Profit/Loss   Equity
0 2017-05-29 21:39:20  SignalFactory          0.0  10000.0
1 2017-05-30 21:39:20  SignalFactory          0.0      0.0
2 2017-05-30 21:44:24  SignalFactory        -26.4      0.0
3 2017-05-30 21:59:27  SignalFactory        -15.8      0.0
4 2017-05-30 22:04:29  SignalFactory         -9.3      0.0
            CreatedAt         Author  Profit/Loss   Equity
0 2017-05-17 14:14:55     tj_fx_live          0.0  10000.0
1 2017-05-18 14:14:55     tj_fx_live        249.7      0.0
2 2017-05-18 19:24:34     tj_fx_live         12.6      0.0
3 2017-05-19 03:42:31     tj_fx_live        -56.3      0.0
4 2017-05-19 11:57:26     tj_fx_live        -45.5      0.0
            CreatedAt         Author  Profit/Loss   Equity
0 2017-11-25 23:03:03  Wermelgion_Co          0.0  10000.0
1 2017-11-26 23:03:03  Wermelgion_Co          7.3      0.0
2 2017-11-27 01:22:49  Wermelgion_Co        -34.9      0.0
3 2017-11-27 01:22:49  Wermelgion_Co         36.8      0.0
4 2017-11-27 01:22:49  Wermelgion_Co          1.0      0.0
            CreatedAt         Author  Profit/Loss   Equity
0 2017-04-18 10:12:29   MaggiecharFx          0.0  10000.0
1 2017-04-19 10:12:29   MaggiecharFx         10.2      0.0
2 2017-04-19 10:12:29   MaggiecharFx         10.6      0.0
3 2017-04-19 10:12:30   MaggiecharFx         11.0      0.0
4 2017-04-19 10:12:30   MaggiecharFx          9.8      0.0
            CreatedAt         Author  Profit/Loss   Equity
0 2017-10-12 15:19:11    DanielWr_fx          0.0  10000.0
1 2017-10-13 15:19:11    DanielWr_fx         28.9      0.0
2 2017-10-13 15:19:11    DanielWr_fx         25.4      0.0
3 2017-10-13 15:19:12    DanielWr_fx         26.0      0.0
4 2017-10-13 15:19:12    DanielWr_fx         26.1      0.0
In [56]:
# Now we have all the required info to calculate the equity growth and
# subsequently visualize it to see how each analyst/author has performed
# over the duration
In [57]:
# We will use a simple formula to calculate the Equity evolution
# Equity[i] = Equity[i-1] + (Equity[i-1]/10000*Profit/Loss[i])
# This formula takes position sizing into account, however, it ignores
# the differences in the Pip vs USD value for different currency pairs as mentioned earlier
In [58]:
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    profitloss = dict_of_author_tweets[name]['Profit/Loss']
    equity = dict_of_author_tweets[name]['Equity']
    for x in range(1,len(profitloss)):
        # Compounding step; writing into this Series is what raises the SettingWithCopyWarning below
        equity[x] = equity[x-1]*(1+1/10000*profitloss[x])
    dict_of_author_tweets[name]['Equity'] = equity
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
/home/theerthan/anaconda3/envs/edx/lib/python3.5/site-packages/ipykernel_launcher.py:5: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
"""
            CreatedAt         Author  Profit/Loss        Equity
0 2017-03-23 01:07:36   fuchstraders          0.0  10000.000000
1 2017-03-24 01:07:36   fuchstraders        -23.4   9976.600000
2 2017-03-24 04:14:24   fuchstraders        -12.4   9964.229016
3 2017-03-28 21:08:48   fuchstraders       -199.0   9765.940859
4 2017-03-28 22:33:40   fuchstraders       -199.0   9571.598635
            CreatedAt         Author  Profit/Loss        Equity
0 2017-05-29 21:39:20  SignalFactory          0.0  10000.000000
1 2017-05-30 21:39:20  SignalFactory          0.0  10000.000000
2 2017-05-30 21:44:24  SignalFactory        -26.4   9973.600000
3 2017-05-30 21:59:27  SignalFactory        -15.8   9957.841712
4 2017-05-30 22:04:29  SignalFactory         -9.3   9948.580919
            CreatedAt         Author  Profit/Loss        Equity
0 2017-05-17 14:14:55     tj_fx_live          0.0  10000.000000
1 2017-05-18 14:14:55     tj_fx_live        249.7  10249.700000
2 2017-05-18 19:24:34     tj_fx_live         12.6  10262.614622
3 2017-05-19 03:42:31     tj_fx_live        -56.3  10204.836102
4 2017-05-19 11:57:26     tj_fx_live        -45.5  10158.404097
            CreatedAt         Author  Profit/Loss        Equity
0 2017-11-25 23:03:03  Wermelgion_Co          0.0  10000.000000
1 2017-11-26 23:03:03  Wermelgion_Co          7.3  10007.300000
2 2017-11-27 01:22:49  Wermelgion_Co        -34.9   9972.374523
3 2017-11-27 01:22:49  Wermelgion_Co         36.8  10009.072861
4 2017-11-27 01:22:49  Wermelgion_Co          1.0  10010.073769
            CreatedAt         Author  Profit/Loss        Equity
0 2017-04-18 10:12:29   MaggiecharFx          0.0  10000.000000
1 2017-04-19 10:12:29   MaggiecharFx         10.2  10010.200000
2 2017-04-19 10:12:29   MaggiecharFx         10.6  10020.810812
3 2017-04-19 10:12:30   MaggiecharFx         11.0  10031.833704
4 2017-04-19 10:12:30   MaggiecharFx          9.8  10041.664901
            CreatedAt         Author  Profit/Loss        Equity
0 2017-10-12 15:19:11    DanielWr_fx          0.0  10000.000000
1 2017-10-13 15:19:11    DanielWr_fx         28.9  10028.900000
2 2017-10-13 15:19:11    DanielWr_fx         25.4  10054.373406
3 2017-10-13 15:19:12    DanielWr_fx         26.0  10080.514777
4 2017-10-13 15:19:12    DanielWr_fx         26.1  10106.824920
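The compounding recurrence Equity[i] = Equity[i-1] * (1 + Profit/Loss[i] / 10000) can also be checked outside the dataframe. The sketch below is plain Python with a hypothetical helper name, `equity_curve`, and it reuses the first four fuchstraders trades from the output above so the numbers can be compared by eye.

```python
def equity_curve(pips, initial_capital=10000.0):
    """Compound each pip result onto the running equity, starting from the seed capital."""
    curve = [initial_capital]
    for p in pips:
        # Position size scales with current equity, so gains and losses compound
        curve.append(curve[-1] * (1 + p / 10000))
    return curve


# First four chronological fuchstraders trades, taken from the output above
fuchstraders_pips = [-23.4, -12.4, -199.0, -199.0]
for value in equity_curve(fuchstraders_pips):
    print(round(value, 6))
```

In pandas, the same recurrence could probably be written without an explicit loop as the seed capital multiplied by `(1 + profitloss / 10000).cumprod()`, which would also avoid the element-wise writes that trigger the SettingWithCopyWarning.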
In [59]:
fig,axis = plt.subplots(2,3,figsize=[20,10])
out = plt.suptitle('Equity curve of various Analysts/FXTraders',y=1.08,fontsize=30)
years = mdates.YearLocator()    # every year
months = mdates.MonthLocator()  # every month
yearsFmt = mdates.DateFormatter('%Y')
names = [['fuchstraders','SignalFactory','tj_fx_live'],
         ['Wermelgion_Co','MaggiecharFx','DanielWr_fx']]
for x in range(len(names)):
    for y in range(len(names[x])):
        dict_of_author_tweets[names[x][y]]['CreatedAt'] = pd.to_datetime(
            dict_of_author_tweets[names[x][y]]['CreatedAt'])
        axis[x][y].plot(dict_of_author_tweets[names[x][y]]['CreatedAt'].values,
                        dict_of_author_tweets[names[x][y]]['Equity'].values)
        axis[x][y].set_title(names[x][y],fontsize=20)
        axis[x][y].grid(True)
fig.autofmt_xdate()
plt.tight_layout()
In [62]:
# We can also look at the distribution of the trades with respect to the no. of pips
# gained or lost i.e. how many trades were executed that had a certain pip range in profit
# or loss
# this will give us an idea of the consistency of the analyst/author i.e. if he or she is
# consistently winning with a few bad trades or consistently losing with a few winning trades etc
# Based on the above equity curve, we would expect that for the first three, the histogram shows
# mostly trades less than 0 (mostly losing trades), while for the other three,
# the histogram shows mostly trades above 0 (mostly winning trades)
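Alongside the histogram, the same expectation can be checked numerically by counting winning and losing trades. This is a minimal sketch with a hypothetical helper, `win_loss_summary`; the sample values are the five most recent SignalFactory results shown earlier, not that author's full history.

```python
def win_loss_summary(pips):
    """Count winning, losing and flat trades, and compute the win rate over decided trades."""
    wins = sum(1 for p in pips if p > 0)
    losses = sum(1 for p in pips if p < 0)
    flat = len(pips) - wins - losses
    decided = wins + losses
    win_rate = wins / decided if decided else None
    return {'wins': wins, 'losses': losses, 'flat': flat, 'win_rate': win_rate}


# Five most recent SignalFactory results from the earlier output (illustrative subset only)
sample_pips = [79.0, -40.0, -40.0, -40.0, -40.0]
print(win_loss_summary(sample_pips))
```

A high win rate alone does not imply a rising equity curve, since a few large losses can outweigh many small wins; the histogram shows that size information, which this count does not.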
In [63]:
fig,axis = plt.subplots(2,3,figsize=[20,10])
out = plt.suptitle('Histogram of Profit/Loss trades',y=1.08,fontsize=30)
names = [['fuchstraders','SignalFactory','tj_fx_live'],
         ['Wermelgion_Co','MaggiecharFx','DanielWr_fx']]
for x in range(len(names)):
    for y in range(len(names[x])):
        # 'normed' was replaced by 'density' in newer Matplotlib releases
        axis[x][y].hist(dict_of_author_tweets[names[x][y]]['Profit/Loss'].values,10, density=False,
                        facecolor='green')
        axis[x][y].set_title(names[x][y],fontsize=20)
        axis[x][y].grid(True)
plt.tight_layout()
Observations
The analysis shows that although some analysts consistently lose money and their equity drops significantly,
others do make profits and maintain a healthy equity over a range of months. So the hypothesis that we
could follow Twitter feeds to manage our portfolio is very much a possibility. However, demonstrating it
requires a structured analysis in addition to the preliminary research above.
Specific to this analysis, we can see that three out of the six analysts have lost money, while the other three
have shown positive returns. Two of them have even shown enormous returns, which at first glance seem
too good to be true. As mentioned above, a second round of validation, correlating each entry price and its
timestamp with the actual price of the currency pair at that time, along with the corresponding exit,
would confirm that the trades were indeed profitable and that the tweets were not made up.
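One piece of that validation could be sketched as follows: given an entry price from a tweet and an exit price from an independent price source, recompute the pips and compare them with the reported figure. Everything here is an assumption for illustration. The function names, the tolerance, and the exit price in the example are hypothetical, and obtaining trustworthy historical prices is outside this notebook. The pip size of 0.0001 applies to most pairs, while JPY-quoted pairs conventionally use 0.01.

```python
def pips_between(entry, exit_price, side, pip_size=0.0001):
    """Pips gained (positive) or lost (negative) for a 'buy' or 'sell' trade."""
    if side not in ('buy', 'sell'):
        raise ValueError("side must be 'buy' or 'sell'")
    direction = 1 if side == 'buy' else -1
    return round(direction * (exit_price - entry) / pip_size, 1)


def is_consistent(reported_pips, entry, exit_price, side, pip_size=0.0001, tolerance=1.0):
    """True when the reported result is within `tolerance` pips of the recomputed one."""
    return abs(pips_between(entry, exit_price, side, pip_size) - reported_pips) <= tolerance


# Entry price from the AUDNZD example tweet; the exit price is invented for this sketch
print(pips_between(1.09677, 1.09445, 'sell'))
print(is_consistent(23.2, 1.09677, 1.09445, 'sell'))
```

A fuller check would also confirm that the market actually traded at the entry and exit prices around the tweet timestamps.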
So it is indeed possible to manage a portfolio by smart monitoring of twitter feeds and selective tagging of
trades!!