Twitter based Portfolio Management: Is it possible?

Anand Bindumadhavan

Abstract

Dataset: The dataset is raw Twitter tweets containing FX trade ideas/results.

Question: Is the buy/sell advice for forex currency pairs from Twitter feeds worth following, and will it make one profitable?

Method: I used rudimentary text processing to identify and isolate tweets containing trade ideas/results that indicate a profit or loss per trade. I used these to build a portfolio with 10K as initial capital, and derived the equity curve by following each trade/trade result.

Findings: The analysis shows that 50% of the analysts were successful in their trades and ended with positive portfolio values. This answers the question: it is indeed possible to manage a portfolio by smart monitoring of Twitter feeds and selective tagging of trades!

Motivation

There are quite a few analysts/companies who frequently tweet about whether to buy or sell some asset, say an equity or a currency, at a certain price, with some profit targets. I always wondered whether one could manage one's own investment portfolio by following the buy/sell advice from these Twitter users.

As there could be quite a variety of such advice/tweets, I will narrow the analysis to tweets from a few such users/analysts, and (optionally) to a specific currency pair like EUR/USD, grab as many tweets as possible with buy/sell advice, and build an equity curve based on the profit/loss generated from each piece of advice/signal. I will compare the results to understand whether it is worthwhile following these tweets. A positive equity at the end of the period of analysis would indicate that following the tweets is worthwhile, while a negative equity would indicate that we should not follow such tweets.

Dataset(s)

The data is from historical tweets.
First we grab all tweets from a few randomly picked users (relevant to the topics #forextrading and #eurusd), understand the structure of their tweets, parse them to extract the profit/loss value in pips per trade, and use it to build an equity curve.

Data Preparation and Cleaning

At a high-level, what did you need to do to prepare the data for analysis?

The first step was to identify which users to follow. A search on Twitter for the topic '#forextrading' resulted in a very broad range of tweets. I had to go back and forth before eventually arriving at a query such as 'topic is #eurusd and contains the words buy or sell' to narrow the results, from which I picked 7 users randomly for the analysis.

Describe what problems, if any, did you encounter with the dataset?

As the tweets were raw text, I had to study them closely to identify an appropriate tweet that could be used for data extraction. Even then, different users recorded the profit/loss differently, e.g. 'for +20 pips' vs. 'Profit: 20 pips'. I also had to drop one of the users eventually, because that user tweeted only trade entries but never posted a follow-up tweet with the actual profit/loss figures.

Research Question(s)

Is the buy/sell advice for the EUR/USD (or other) currency pair(s) from Twitter feeds worth following, and will it make you profitable?

Methods

What methods did you use to analyze the data and why are they appropriate? Be sure to adequately, but briefly, describe your methods.

I used a very simple method of analyzing tweets and extracting the relevant information with basic string manipulation and regular expression operations. I did not use any sophisticated natural language analysis or any ML model for sentiment analysis etc. Rather, I employed a straightforward method: look at all the tweets, filter for relevant tweets, extract the piece of information that is required, and use it to build a model for evaluation.
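
The filter, extract and evaluate steps just described can be illustrated with a minimal sketch. It assumes the two result formats mentioned earlier ('for +20 pips' and 'Profit: 20 pips', plus a matching 'Loss:' form), the 10K initial capital from the abstract, and a value of 1 USD per pip. The names `PIPS_PATTERN`, `extract_pips` and `build_equity_curve` are hypothetical and chosen for illustration; they are not the notebook's actual code.

```python
import re

# Hypothetical pattern (an assumption for illustration): a signed pip value
# that follows "for", "Profit:" or "Loss:" and is followed by "pips".
PIPS_PATTERN = re.compile(
    r"(?:\bfor|Profit:|Loss:)\s*([+-]?\d+(?:\.\d+)?)\s*pips",
    re.IGNORECASE,
)


def extract_pips(tweet_text):
    """Return the per-trade profit/loss in pips, or None if none is reported."""
    match = PIPS_PATTERN.search(tweet_text)
    return float(match.group(1)) if match else None


def build_equity_curve(pips_per_trade, initial_capital=10_000.0, usd_per_pip=1.0):
    """Accumulate per-trade results into a list of equity values."""
    equity = [initial_capital]
    for pips in pips_per_trade:
        equity.append(equity[-1] + pips * usd_per_pip)
    return equity


sample_tweets = [
    "Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips",
    "Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | #fx",
    "Bought GBPUSD 1.34391 #trading #EURUSD",  # trade entry only, no result
]

# Keep only tweets that report a result, then accumulate them into equity.
pips_per_trade = [p for p in map(extract_pips, sample_tweets) if p is not None]
equity_curve = build_equity_curve(pips_per_trade)
print(pips_per_trade)
print(equity_curve)
```

Tweets that carry no result, such as trade entries, return None and are dropped before the curve is built. In a real run the tweets would also need to be sorted oldest first, since the equity curve depends on the order of the trades.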
It is simple, yet very powerful in my opinion, because it opens up a whole new world of possibilities, for me at least.

Findings – Summary

(refer to the next two slides: the equity curve and the profit/loss distribution)

The analysis shows that although some analysts consistently lose money and their equity drops significantly, others do make profits and maintain a healthy equity over a range of months. So the hypothesis that we could follow Twitter feeds to manage our portfolio is very much a possibility. However, it requires a structured analysis, in addition to the preliminary research above, to re-confirm this possibility.

Specific to this analysis, we can see that three of the six analysts have lost money, while the other three have shown positive returns. Two of them have even shown enormous returns, which at first glance seem too good to be true. As mentioned above, a second round of validation, correlating the entry price and time of each trade with the actual price of the currency pair at that time, along with the corresponding exit, would re-confirm that the trades were indeed profitable and that the tweets were not made up.

So it is indeed possible to manage a portfolio by smart monitoring of Twitter feeds and selective tagging of trades!

Limitations

If applicable, describe limitations to your findings:

1. This analysis cannot be generalized to all tweets, because each user records his/her trade ideas/results in a different format, although there is some structure to it.
2. The analysis also assumes that there is a position-sizing logic in place which makes each pip worth 1 USD. Normally, with a position of 10,000 units of EUR/USD, one pip is worth 1 USD, and this is different for other pairs such as EUR/GBP. However, the analysis ignores this difference, as it is beyond the scope of this project.
3. The analysis is based only on tweets, and hence the authenticity of the tweets cannot be guaranteed.
   To re-confirm the authenticity of the tweets, we have to extract the underlying entry/exit points and validate them against the actual prices at the same points in time. However, I believe this is a good starting point that can be expanded very easily.

Conclusions

Report your overall conclusions, preferably a conclusion per research question

My conclusion is that the research question has been positively answered, i.e. it is indeed possible to follow trade ideas from Twitter, and if we time them correctly, we will be able to manage a portfolio well and show a profitable equity curve. However, it is important to do extensive research to understand which users and which trade ideas to follow. This analysis shows one way of doing this research, with the above preliminary conclusion.

Acknowledgements

Where did you get your data?

All data is from historical Twitter feeds.

Did you use other informal analysis to inform your work?

I used the "search.twitter.com" website to understand how different search strings work, and how to optimize the search further.

Did you get feedback on your work by friends or colleagues?

No, I did not get any feedback from friends or colleagues.

References

If applicable, report any references you used in your work.

I mostly used online blogs as references for checking the actual syntax for specifying search strings, dataframe manipulation, building the equity curve, plotting etc.

PDF copy of the jupyter notebook is attached below

In [1]:
    # Data Source: Forex trade tweets from Twitter
    # Analysis: We build a sample portfolio based on historical trade ideas/tweets and evaluate the performance
    # Research question: Is the buy/sell advice for the EUR/USD currency pair from Twitter feeds worth following
    # and will it make you profitable?

Twitter based portfolio management

Can we rely on trade ideas from Twitter: A portfolio simulation based on real trade tweets

In [2]:
    # As we already have our twitter credentials in the pickle file from chapter 8,
    # we will load the twitter credentials from this file and grab an API context/handle
    # Also, while researching the internet for python twitter documentation, I came across tweepy
    # tweepy is another python wrapper for the twitter API, and it seems much simpler, at least for me
    # So I am using tweepy instead of the twitter package

In [3]:
    import pickle
    import os
    #import twitter
    import pandas as pd
    import tweepy
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    import numpy as np
    %matplotlib inline

In [4]:
    if not os.path.exists('secret_twitter_credentials.pkl'):
        print('Twitter auth file missing - please make sure a valid secret_twitter_credentials file is present')
    else:
        # use a context manager so the file handle is closed after loading
        with open('secret_twitter_credentials.pkl', 'rb') as f:
            Twitter = pickle.load(f)

In [5]:
    auth = tweepy.OAuthHandler(Twitter['Consumer Key'], Twitter['Consumer Secret'])
    auth.set_access_token(Twitter['Access Token'], Twitter['Access Token Secret'])
    twitter_api = tweepy.API(auth)
    type(twitter_api)

Out[5]:
tweepy.api.API

Step 1: Initial exploration of forex trading related tweets

In [6]:
    # Let's start by searching for all tweets related to forextrading

In [7]:
    topic = '#forextrading'
    num = 100
    status = twitter_api.search(q=topic, count=num)
    print(type(status))
    print(len(status))
    for tweet in status[:15]:
        print(tweet.user.screen_name, tweet.text)

<class 'tweepy.models.SearchResults'>
96
moralfx Fundamental &amp; Technical Analysis in Forex Trading https://t.co/OJyAbspRjB #forextrading #forextrader #forex https://t.co/psAfhg7dcx
mxandrv The ULTIMATE GUIDE on how to trade less and make more: https://t.co/bbbLRr4KUR #forex #fx #forextrading
FerruFx #MT4 #MT5 FFx Basket Scanner See if a currency / related basket is tradable! https://t.co/zQTWyymFRO #FerruFx #fx… https://t.co/ee5yugwL2K
closed__ Fx=Full Gelir #forex #forextrading #forextrading #FxCanli #FX
ultraltdnet Entrepreneur Quote of the day! If you like this quote, Share it Now! #entrepreneurquotes #founderwords… https://t.co/TcSLpBFkL7
CityofInvestmnt https://t.co/rki7PPm5b5 I CHOOSE TO INVEST #wealth #forextrading #managedforex #money #invest #Dollar #fx #pension… https://t.co/J6dZt7Majj
FerruFx #MT4 FFx Hidden TP/SL Manager Hide your targets and stops to your broker https://t.co/fGy3oOMYZE #FerruFx #fx… https://t.co/PIFlBPXYC0
FX_haroldShan The first system uses 3 indicators to determine if there is up- or https://t.co/vJ80FBbV9w #Market #ForexTrading #Invest #Fibonacci #ECN #EA
ForexFalcon_com If you have an edge be the casino. Play every setup. Don't fear the outcome of the next trade #Forex #ForexTrading https://t.co/0bDFKzQMgK
CityofInvestmnt Greetings from City Of Investment #cityofinvestment #managedforex #money #wealth #winners #fx #Forex #trading… https://t.co/8o7lUeZVRJ
CityofInvestmnt Forex Managed Accounts Exclusive x3 45%-75% #forextrading #stocks #forexsignals #cityofinvestment #fx #Mexico… https://t.co/sv9k0zQuTH
MyForexYoda here's the information for tonights #forextrading #meetup https://t.co/ogCasriQPg
AliceElisson How to REDUCE UNNECESSARY #FOREX LOSSES and increase the number of winning trades ==&gt; https://t.co/MCLsFL1Kag #forextrading #fx
forextradingbay Laser accurate and very fast #forex signals directly on your chart ==&gt; https://t.co/kDNOAjvPMl #forextrading #fx
SO4FRbcwQSyMhb8 ﻋﺒﺪ: دﻟﯿﻠﻚ اﱃ ﻋﺎﱂ اﳌﺎل واﻻﻋﻤﺎل اﻟﻮﺻﻮل اﱃ اﻟﺜﺮاء ﲞﻄﻮات ﺑﺴﯿﻄﺔ ﲢﻘﯿﻖ اﻻرﺑﺎح واﻧﺖ ﲡﻠﺲ ﰲ ﻣﻨﺰﻟﻚ …اﻟﻌﺰﯾﺰhttps://t.co/u6lSat9Bdj

In [8]:
    # A search on the topic 'forextrading' is too broad,
    # it does not provide any useful tweets that we can analyse in a structured way.
    # Let's narrow it down further to a specific currency pair to look for trade ideas.
    # I am picking eurusd, as this is the most popular currency pair that's traded

In [9]:
    topic = '#forextrading AND #eurusd'
    num = 100
    status = twitter_api.search(q=topic, count=num)
    print(type(status))
    print(len(status))
    for tweet in status[:5]:
        print(tweet.text)

<class 'tweepy.models.SearchResults'>
74
ApaChE and KproteKT THE #eurusd #trading algos now available for #free at https://t.co/lWI4i6w3Bu #forex #forextrading #forexsignals #EA #…
ApaChE and KproteKT #eurusd #trading algos now above the 23% mark in real account. Try them out at https://t.co/lsA88jkAa2 #forextrading #…
ApaChE and KproteKT THE #eurusd #trading algos now available for #free at https://t.co/tHzYJptV9a #forex #forextrading #forexsignals #EA #…
ApaChE and KproteKT #eurusd #trading algos now above the 23% mark in real account. Try them out at https://t.co/TfoRnOzeOE #forextrading #…
https://t.co/lxLDmyVqqO http://www.tradingzine.comGuide to Binary Options in US #asset #binary #Binary #options… https://t.co/9d7dtr6dgl

In [10]:
    # We still don't have any useful tweets
    # let's narrow the search further using the keywords buy or sell as below

In [11]:
    topic = '#eurusd AND buy OR sell'
    num = 100
    status = twitter_api.search(q=topic, count=num)
    print(type(status))
    print(len(status))
    for tweet in status[:5]:
        print(tweet.text)

<class 'tweepy.models.SearchResults'>
59
Need much money now? Buy FX Robot to generate cash for you! Proven REAL accounts, big money, big fun! https://t.co/n5k7LXC8b7 #EURUSD #Rich
No need to buy the EA or pay upfront fees. Money Management for every one with best brokers. Over 500%-2,000%/month. #Analize #EURUSD
#EURUSD #TRADESIGNAL December 18 - 22, 2017 Sell Limit #1 @ 1.17667 and Sell Limit #2 @ 1.17627 TAKE PROFIT @… https://t.co/TsksgKPO5q
EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/gPi8a6Vvhj #forex #eurusd #fx #news
EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/4mrCeMgi9t #forex #eurusd #fx #news

In [12]:
    # This query above results in a better response set
    # In a sample run, I got useful tweets such as the structured tweet below
    # "SELL #EURUSD at 1.17584 SL:1.19184 TP:1.14784 , check live performances at...."
    # Let's grab some tweets and the screen names, and present them in a dataframe format
    # this will allow us to pick two screen names at random that post structured tweets

In [13]:
    # keep only the first occurrence of each tweet text
    all_text = []
    filtered_results = []
    for s in status:
        if s.text not in all_text:
            all_text.append(s.text)
            filtered_results.append(s)
    results = filtered_results
    len(results)
    #print (results[0])

Out[13]:
56

In [14]:
    tweet_data = pd.DataFrame(data=[[s.user.screen_name, s.text] for s in results], columns=['Author', 'Tweet'])
    pd.set_option('max_colwidth', 140)
    tweet_data

Out[14]:
    Author           Tweet
 0  jerrry_fx        Need much money now? Buy FX Robot to generate cash for you! Proven REAL accounts, big money, big fun! https://t.co/n5k7LXC8b7 #EURUSD #Rich
 1  fxolivia_sh      No need to buy the EA or pay upfront fees. Money Management for every one with best brokers. Over 500%-2,000%/month. #Analize #EURUSD
 2  kenya_forex      #EURUSD #TRADESIGNAL December 18 - 22, 2017\n\nSell Limit #1 @ 1.17667\nand \nSell Limit #2 @ 1.17627\n\nTAKE PROFIT @… https://t.co/T...
 3  thefxcoach       EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/gPi8a6Vvhj #forex #eurusd #fx #news
 4  thefxcoach       EUR/USD Weekly Technical Analysis: Risk of a Euro Sell-off Rising https://t.co/4mrCeMgi9t #forex #eurusd #fx #news
 5  myfxdataprov     2017_12_15_SYD_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
 6  myfxdataprov     2017_12_15_SYD_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
 7  JohnFXCorona     Don't gamble anymore! Don't lose any more money! Be a winner! Buy our EA and make $50k/month. #EuropeanUnion #EURUSD
 8  SuperHeroInvest  Closed Buy for BLI Fund #EURUSD 1.18043 for -42.2 pips, total for today -8.8 pips #Investors welcome https://t.co/QKLgwlVjta
 9  forex22com       Forex Signals - 2017-12-15 19:45:01 : BUY EURUSD@1.17582 SL@1.17432 TP@1.17732 #EURUSD #ForexSignal https://t.co/LwMgoUUihA
10  B52Finance       Closed Buy 0.01 Lots #EURUSD 1.18061 for -21.0 pips, total for today -109.6 pips
11  B52Finance       Closed Buy 0.1 Lots #EURUSD 1.18069 for -20.4 pips, total for today -88.6 pips
12  B52Finance       Closed Sell 0.1 Lots #EURUSD 1.17897 for -20.4 pips, total for today -192.3 pips
13  B52Finance       Closed Buy 0.01 Lots #EURUSD 1.17778 for +3.2 pips, total for today -90.4 pips
14  B52Finance       Closed Buy 0.1 Lots #EURUSD 1.17794 for +1.6 pips, total for today -93.6 pips
15  thefxcoach       EUR/USD strategy is to buy on dips https://t.co/D2NYn1SAez #forex #eurusd #fx #news
16  DayTradeScalps   $EURUSD #EURUSD | SELL now | Open: 11762.1 | Target: 11755.8 (6.3) | Stop: 11769.8 (7.7) | #fx #forex #daytrading
17  forex22com       Forex Signals - 2017-12-15 17:00:00 : BUY EURUSD@1.17769 SL@1.17619 TP@1.17919 #EURUSD #ForexSignal https://t.co/LwMgoUUihA
18  DOlefirov        #EurUsd sell 1.15
19  fxcapitalonline  Wednesday Trading Results\n12/13/17\n\n✅Sell #USDJPY Tp Hit +65 Pips\n✅ Sell #USDCAD Manual Close +30pips \n✅ Buy… https://t.co/SjstAz7f7z
20  fxcapitalonline  Tuesday results 12/12/17\n\n❌Buy #GBPUSD Manual Close -8 pips❤\n❌Buy #EURUSD SL triggered - 10 pips❤\n❌Sell #USDJPY S… https://t.co/KR...
21  fxcapitalonline  Monday results 12/11/17\n\n❌Buy #EURUSD SL triggered -12 #pips❤\n❌Buy #GBPUSD SL triggered -25 pips❤\n❌ #Buy #AUDUSD… https://t.co/jNc...
22  thefxcoach       EUR/USD: Buy On Dips https://t.co/PKrGVg4Pcw #forex #eurusd #fx #news
23  Wermelgion_Co    Closed Sell #Forex #Fx #EURUSD 1.17848 for +12.2 pips, total for today +66.5 pips
24  Wermelgion_Co    Closed Sell #Forex #Fx #EURUSD 1.17998 for +9.9 pips, total for today +43.9 pips
25  cosmos4unet      #EURUSD buy signal on 15 DEC 2017 02:00 PM UTC by AdMACD Trading System (Timefr https://t.co/6dzr1x9AMY #Forex… https://t.co/HbYNZy4MZz
26  DayTradeScalps   $EURUSD #EURUSD | BUY now | Open: 11806.6 | Target: 11812.9 (6.3) | Stop: 11798.8 (7.8) | #fx #forex #daytrading
27  DayTradeScalps   $EURUSD #EURUSD | BUY now | Open: 11804.1 | Target: 11810.4 (6.3) | Stop: 11796.3 (7.8) | #fx #forex #daytrading
28  myfxdataprov     2017_12_15_NYC_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
29  Limitforex       Decision time at #EURUSD.\n#EUR #USD #dollar #euro #gold #buy #sell #graphic #analysis #forex #fx #trade #limitforex… https://t.co/gT0vm...
30  arbtrader100     #ARBsignals | BUY #EURUSD @ 1.1797 | SL:1.1777 | TP:1.1817 | SENT 2017-12-15 12:27:08 GMT | #forexsignal #fx #forex #fb
31  myfxdataprov     2017_12_15_NYC_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
32  forex22com       Forex Signals - 2017-12-15 12:00:00 : SELL EURUSD@1.17998 SL@1.18148 TP@1.17848 #EURUSD #ForexSignal https://t.co/LwMgoUUihA
33  Wermelgion_Co    Closed Buy #Forex #Fx #EURUSD 1.18083 for -8.2 pips, total for today +34.0 pips
34  Wermelgion_Co    Closed Buy #Forex #Fx #EURUSD 1.17791 for +21.0 pips, total for today +42.2 pips
35  limitforextr     #EURUSD'da karar anı.\n#EUR #USD #dolar #euro #sterlin #yen #gold #altın #kazanç #buy #sell #grafik #analiz #forex… https://t.co/WfrIh59ZWx
36  smart4trade      Bought #NAS100 6404.0 #smart4trader #sp500 #eurusd #dowjones #nasdaq #buy #sell #вк #vk мой сайт https://t.co/RGfOk0hSXL
37  smart4trade      Bought #US30 24644.0 #smart4trader #sp500 #eurusd #dowjones #nasdaq #buy #sell #вк #vk мой сайт https://t.co/RGfOk0hSXL
38  forex22com       Forex Signals - 2017-12-15 08:15:00 : SELL EURUSD@1.17864 SL@1.18014 TP@1.17714 #EURUSD #ForexSignal https://t.co/LwMgoUUihA
39  EuroBulls_Forex  RT @Thai_TraderFX: GM, traders. #EURUSD buy.\n\nLearn to trade like a pro https://t.co/1961yQ4HIV \n\nJoin my telegram chann...
40  thirdbrainfx     #x112 SELL #EURUSD at 1.17925 SL:1.19525 TP:1.15125 , check live performances at https://t.co/aTfkud3mWf https://t.co/E5GbZJw0Ss
41  myfxdataprov     2017_12_15_LON_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
42  Thai_TraderFX    GM, traders. #EURUSD buy.\n\nLearn to trade like a pro https://t.co/1961yQ4HIV \n\nJoin my telegram channel… https://t.co/4X...
43  myfxdataprov     2017_12_15_LON_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
44  SuperHeroInvest  Closed Sell for BLI Fund #EURUSD 1.18324 for +48.1 pips, total for today +48.1 pips #Investors welcome https://t.co/QKLgwlVjta
45  tj_fx_live       Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
46  myfxdataprov     2017_12_15_TYO_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
47  tj_fx_live       Closed Sell GBPUSD 1.34295 for -17.2 pips, total for today -17.2 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
48  myfxdataprov     2017_12_15_TYO_CALL: Weekly Buy/Sell Signals #EURUSD [Dec. 11 to Dec. 15, 2017] https://t.co/kW8gp8MxTK
49  forex22com       Forex Signals - 2017-12-15 01:45:04 : SELL EURUSD@1.17812 SL@1.17962 TP@1.17662 #EURUSD #ForexSignal https://t.co/LwMgoUUihA
50  forex22com       Forex Signals - 2017-12-14 23:45:03 : BUY EURUSD@1.1774 SL@1.1759 TP@1.1789 #EURUSD #ForexSignal https://t.co/LwMgoUUihA
51  DarrenwongMfx    Closed Sell 0.1 Lots EURUSD 1.17885 for +21.2 pips, total for today +21.2 pips #forex #eurusd
52  DarrenwongMfx    Closed Sell 0.1 Lots EURUSD 1.17919 for +20.8 pips, total for today +20.8 pips #forex #eurusd
53  Forexrulebook    Nice buy opportunity on #EURJPY not to be missed. \nUpward sequence likely to continue\n\nGenesis Asset &gt;&gt; Join for $… https://t....
54  myfxdataprov     2017_12_14_SYD_CALL: Monthly Buy/Sell Signals #EURUSD [December, 2017] https://t.co/u1ehOe3QJg
55  DarrenwongMfx    Closed Sell 0.1 Lots EURUSD 1.17952 for +20.3 pips, total for today +20.3 pips #forex #eurusd

Step 2: Grabbing tweets from specific screennames

In [15]:
    # After a few trial runs of the above query, I have narrowed it down to the screen names below:
    # @fuchstraders
    # @DayTradeScalps
    # @SignalFactory
    # @tj_fx_live
    # @Wermelgion_Co
    # @MaggiecharFx
    # @DanielWr_fx
    # I picked them mainly because of the consistent structure they follow in their tweets,
    # making it easier for us to parse and analyse the data. Another important factor for
    # choosing them is that they record the profit/loss per trade in the tweet, so
    # we don't need special processing to analyse a portfolio based on their tweets

In [16]:
    # grabbing user specific tweets using the screen names picked above
    # note that we are not using any hashtag filter or text filter
    # I am doing a generic grab of the tweets from these users, to understand
    # how often they tweet a trade idea/signal vs something else

In [17]:
    for name in ['fuchstraders','DayTradeScalps','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
        # a single user_timeline call returns at most 200 tweets, even though 1000 are requested here
        tweets = twitter_api.user_timeline(screen_name=name, count=1000)
        print("*"*60)
        print("Number of tweets extracted for: {} is: {}.\n".format(name, len(tweets)))
        for tweet in tweets[:5]:
            print(tweet.text)

************************************************************
Number of tweets extracted for: fuchstraders is: 200.

Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/Lnaq3rA8Pv https://t.co/3WvOOlHGhW
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/eKRWSL9OH2
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/U2SORrE3XQ https://t.co/7cJJ7P6xQL
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNxnXG https://t.co/jDWyz0tQa0
Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/R1dPfi6neR https://t.co/NggCw9jZXE
************************************************************
Number of tweets extracted for: DayTradeScalps is: 200.

$EURUSD #EURUSD | SELL now | Open: 11762.1 | Target: 11755.8 (6.3) | Stop: 11769.8 (7.7) | #fx #forex #daytrading
$EURCHF #EURCHF | SELL now | Open: 11667.8 | Target: 11663.1 (4.7) | Stop: 11677 (9.2) | #fx #forex #daytrading
$AUDJPY #AUDJPY | SELL now | Open: 8614.2 | Target: 8609.1 (5.1) | Stop: 8623.1 (8.9) | #fx #forex #daytrading
$NZDUSD #NZDUSD | SELL now | Open: 7006.5 | Target: 7001.6 (4.9) | Stop: 7015.6 (9.1) | #fx #forex #daytrading
$USDCHF #USDCHF | BUY now | Open: 9927.5 | Target: 9932.5 (5) | Stop: 9918.5 (9) | #fx #forex #daytrading
************************************************************
Number of tweets extracted for: SignalFactory is: 200.

Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
Forex Signal | Buy EURAUD@1.53770 | SL:1.53370 | TP:1.54570 | 2017.12.15 19:32 GMT | #fx #forex #fb
Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | 2017.12.15 19:27 GMT | #fx #forex #fb
Forex Signal | Close(SL) Buy CADCHF@0.76917 | Loss: -40 pips | 2017.12.15 19:19 GMT | #fx #forex #fb
Forex Signal | Close(SL) Sell AUDCAD@0.98433 | Loss: -40 pips | 2017.12.15 19:13 GMT | #fx #forex #fb
************************************************************
Number of tweets extracted for: tj_fx_live is: 200.

Bought GBPUSD 1.34391 #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Bought USDJPY 112.239 #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Closed Sell GBPUSD 1.34295 for -17.2 pips, total for today -17.2 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
Closed 0.0 for 0.0 pips, total for today 0.0 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY
************************************************************
Number of tweets extracted for: Wermelgion_Co is: 200.

Closed Sell #Forex #Fx #USDCHF 0.99072 for +10.4 pips, total for today +131.4 pips
Closed Buy #Forex #Fx #AUDCAD 0.98236 for +9.4 pips, total for today +121.0 pips
Closed Buy #Forex #Fx #AUDCAD 0.9805 for +12.2 pips, total for today +111.6 pips
Closed Sell #Forex #Fx #USDCHF 0.99234 for +7.7 pips, total for today +99.4 pips
Closed Sell #Forex #Fx #AUDCAD 0.97858 for -15.2 pips, total for today +91.7 pips
************************************************************
Number of tweets extracted for: MaggiecharFx is: 200.

Closed Sell 1.0 Lots EURUSD 1.17946 for +25.5 pips, total for today +693.4 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.17947 for +25.5 pips, total for today +667.9 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.17947 for +25.5 pips, total for today +642.4 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.1796 for +25.1 pips, total for today +616.9 pips #Online #ForexTrading #Advisor #Base #Code
Closed Sell 1.0 Lots EURUSD 1.17984 for +27.0 pips, total for today +591.8 pips #Online #ForexTrading #Advisor #Base #Code
************************************************************
Number of tweets extracted for: DanielWr_fx is: 200.

Maximum Equity Drop (also called "Draw-Down" or "Risk Management") is less than 25% (usually no more than 10-15%) . #GetMoney #FreeRobot #FX
You can withdraw your initial deposit after a few days and then we trade only with the profits, so you don't risk your money any more. #Help
You can exit when 2 of 3 indicators reverse or use Trailing Stop Loss, Take Profit, Risk Management, etc.… https://t.co/zEXnpOus5b
How to start? Let us know which broker you prefer and we tell you further details about deposit, account type, etc. #IntroducingBroker #Job
You don't need to have any Forex knowledge whatsoever. I trade for you – you just watch the profits coming.… https://t.co/xfr8LQQN7d

Step 3: Extract all tweets from one user and load it into a dataframe

In [18]:
    # Now that we have some structured tweets at hand, we can move on to the next step
    # extracting all tweets from a user and loading them into a dataframe

In [19]:
    # Before the extraction, I am going to note down the structure of the tweet that
    # is of interest for us, from each of these users - we will use this structure for
    # parsing the tweets

In [20]:
    for name in ['fuchstraders','DayTradeScalps','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
        tweets = twitter_api.user_timeline(screen_name=name, count=1000)
        print("*"*60)
        for tweet in tweets:
            # parentheses make the intent explicit: skip retweets, then accept either spelling of 'close'
            if 'RT' not in tweet.text and ('close' in tweet.text or 'Close' in tweet.text):
                print("The relevant tweet from: {} that we will use for our analysis is: \n {} \n".format(name, tweet.text))
                break

************************************************************
The relevant tweet from: fuchstraders that we will use for our analysis is:
 Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/Lnaq3rA8Pv https://t.co/3WvOOlHGhW

************************************************************
************************************************************
The relevant tweet from: SignalFactory that we will use for our analysis is:
 Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb

************************************************************
The relevant tweet from: tj_fx_live that we will use for our analysis is:
 Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips #trading #EURUSD #FX #forex #GBPUSD #USDJPY

************************************************************
The relevant tweet from: Wermelgion_Co that we will use for our analysis is:
 Closed Sell #Forex #Fx #USDCHF 0.99072 for +10.4 pips, total for today +131.4 pips

************************************************************
The relevant tweet from: MaggiecharFx that we will use for our analysis is:
 Closed Sell 1.0 Lots EURUSD 1.17946 for +25.5 pips, total for today +693.4 pips #Online #ForexTrading #Advisor #Base #Code

************************************************************
The relevant tweet from: DanielWr_fx that we will use for our analysis is:
 Closed Sell 2.3 Lots EURUSD 1.17644 for +10.5 pips, total for today +797.4 pips #Online #ForexTrading #Advisor #Base #Code

In [21]:
    # I can't seem to find the word 'close' or 'Close' in the tweets from DayTradeScalps !!
# Let's do some further specific analysis of this user's tweets, to understand
# whether he tweets the closure of his trades

In [22]:
# After a few trials, I had to go to the twitter search site to search there
# https://twitter.com/search?l=en&q=buy%20OR%20sell%20from%3ADayTradeScalps&src=typd
# The results showed that this user only tweets the trade entries, but does not record
# whether the targets were hit or stop loss was triggered
# such an open ended tweet requires further correlation of the targets in the tweets
# with the actual price movements at those times
# Such an analysis is beyond the scope of this project, so I am dropping the user
# @DayTradeScalps from my analysis

In [23]:
# Let's continue with loading the tweets into a df
# Here we define two helper functions that will help us retrieve all tweets from a specific user
# and load them all up into one dataframe that will help us with the analysis
# I guess there is a max throttle somewhere, but we will proceed to retrieve all tweets

In [24]:
def get_all_unique_tweets(screen_name):
    all_tweets = []
    new_tweets = twitter_api.user_timeline(screen_name = screen_name,count=200)
    all_tweets.extend(new_tweets)
    oldest = all_tweets[-1].id - 1
    # According to twitter api documents, we can request the next set of tweets based on the max_id
    # parameter - so we continue looping retrieving old tweets, and if the number of tweets returned is
    # zero we exit the loop
    while len(new_tweets) != 0:
        new_tweets = twitter_api.user_timeline(screen_name = screen_name,count=200,max_id=oldest)
        all_tweets.extend(new_tweets)
        oldest = all_tweets[-1].id - 1
    # we repeat the filter for unique tweets
    # we could have written a function, but I will get on with it for now
    all_tweet_text = []
    filtered_tweets = []
    for t in all_tweets:
        if t.text not in all_tweet_text:
            all_tweet_text.append(t.text)
            filtered_tweets.append(t)
    return filtered_tweets

In [25]:
def load_tweets_into_df(all_tweets):
    tweet_df = pd.DataFrame(data=[[s.created_at,s.user.screen_name,s.text] for s in all_tweets],
                            columns=['CreatedAt','Author','Tweet'])
    return tweet_df

In [26]:
all_tweet_df = []
all_tweet_df = pd.DataFrame(columns=['CreatedAt','Author','Tweet'])
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    a_tweets = get_all_unique_tweets(name)
    tweet_df = load_tweets_into_df(a_tweets)
    all_tweet_df = all_tweet_df.append(tweet_df)
    print('Loaded {} tweets from {} into a dataframe'.format(len(tweet_df),name))
print('Total tweets loaded = {}'.format(len(all_tweet_df)))
Loaded 3204 tweets from fuchstraders into a dataframe
Loaded 3118 tweets from SignalFactory into a dataframe
Loaded 763 tweets from tj_fx_live into a dataframe
Loaded 219 tweets from Wermelgion_Co into a dataframe
Loaded 3206 tweets from MaggiecharFx into a dataframe
Loaded 1960 tweets from DanielWr_fx into a dataframe
Total tweets loaded = 12470

Step 4: Data transformation and cleansing

In [27]:
all_tweet_df.head(5)
Out[27]:
   CreatedAt            Author        Tweet
0  2017-12-15 15:33:39  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/eKRWSL9OH2
1  2017-12-15 13:02:14  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNxnXG https://t.co/jDWyz0tQa0
2  2017-12-15 09:33:21  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/Lnaq3rA8Pv https://t.co/3WvOOlHGhW
3  2017-12-15 07:04:23  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/U2SORrE3XQ https://t.co/7cJJ7P6xQL
4  2017-12-15 03:33:19  fuchstraders  Closed # 0.0 for 0.0 pips, total for today 0.0 pips https://t.co/BMg2rNOZmg https://t.co/R1dPfi6neR https://t.co/NggCw9jZXE

In [28]:
# This dataframe has all tweets from these six users
# we need to look for tweets that book profits or losses
# Initial analysis
# shows, such tweets usually contain the word 'pips'
# as we observed in our analysis of the tweet structure

In [29]:
# We could have either filtered such tweets in the twitter_api search
# or we could have filtered those tweets during the dataframe loading exercise
# The third option is to filter them in the dataframe
# I will use the third option for now, as this gives us a good example
# of loading raw tweets into a dataframe and filtering them there

In [30]:
before = len(all_tweet_df)
print(before)
12470

In [31]:
all_tweet_df.isnull().any()
Out[31]:
CreatedAt    False
Author       False
Tweet        False
dtype: bool

In [32]:
# To build an equity curve out of the profit/loss values,
# we first need to extract the profit/loss values per trade
# we note that the profit/loss is in between 'for' and 'pips' for all users, except for tweets from SignalFactory
# For tweets from SignalFactory, the profit/loss is in between 'Profit:' and 'pips' or 'Loss:' and 'pips'

In [33]:
# There are multiple ways to handle this
# one way is to split the dataframe again by the tweet authors and
# apply a different extraction logic for each tweet author
# another way is to do a text replace, to bring all records in the tweet to contain the same pattern
# to keep it simple, let's choose the second way i.e.
# replace the text 'Profit:' and 'Loss:' with the text 'for'
# however, as each user's tweets will follow a different structure, a possible future extension
# could be to write a dedicated profit/loss extractor for each user and apply it to the dataframe

In [34]:
# Let's first create a filter and see the values that are present
closed_tweet = all_tweet_df['Tweet'].str.contains('Close')
author = all_tweet_df['Author'] == 'SignalFactory'
df_match = closed_tweet & author
print(all_tweet_df[df_match][:5])
all_tweet_df['Tweet'] = all_tweet_df['Tweet'].str.replace('Profit:','for')
all_tweet_df['Tweet'] = all_tweet_df['Tweet'].str.replace('Loss:','for')
# Now if we print the rows based on the same filter, they should contain
# 'for' instead of the words 'Profit:' or 'Loss:'
print(all_tweet_df[df_match][:5])
   CreatedAt            Author  \
0  2017-12-15 19:47:07  SignalFactory
2  2017-12-15 19:32:03  SignalFactory
3  2017-12-15 19:30:00  SignalFactory
4  2017-12-15 19:17:03  SignalFactory
7  2017-12-15 17:15:00  SignalFactory

   Tweet
0  Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
2  Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | 2017.12.15 19:27 GMT | #fx #forex #fb
3  Forex Signal | Close(SL) Buy CADCHF@0.76917 | Loss: -40 pips | 2017.12.15 19:19 GMT | #fx #forex #fb
4  Forex Signal | Close(SL) Sell AUDCAD@0.98433 | Loss: -40 pips | 2017.12.15 19:13 GMT | #fx #forex #fb
7  Forex Signal | Close(SL) Buy CADCHF@0.77068 | Loss: -40 pips | 2017.12.15 17:06 GMT | #fx #forex #fb
   CreatedAt            Author  \
0  2017-12-15 19:47:07  SignalFactory
2  2017-12-15 19:32:03  SignalFactory
3  2017-12-15 19:30:00  SignalFactory
4  2017-12-15 19:17:03  SignalFactory
7  2017-12-15 17:15:00  SignalFactory

   Tweet
0  Forex Signal | Close(TP) Sell CADJPY@87.394 | for +79 pips | 2017.12.15 19:34 GMT | #fx #forex #fb
2  Forex Signal | Close(SL) Sell AUDCAD@0.98526 | for -40 pips | 2017.12.15 19:27 GMT | #fx #forex #fb
3  Forex Signal | Close(SL) Buy CADCHF@0.76917 | for -40 pips | 2017.12.15 19:19 GMT | #fx #forex #fb
4  Forex Signal | Close(SL) Sell AUDCAD@0.98433 | for -40 pips | 2017.12.15 19:13 GMT | #fx #forex #fb
7  Forex Signal | Close(SL) Buy CADCHF@0.77068 | for -40 pips | 2017.12.15 17:06 GMT | #fx #forex #fb

In [35]:
# Now all the rows that contain the profit/loss values are normalized to contain the
# profit/loss value between the strings 'for' and 'pips'

In [36]:
# The next step is to extract the profit/loss values, and create a new column out of it
# we can use a combination of the extract function and regular expressions
# however if I try to extract the text between 'for' and 'pips' using regular expressions,
# I am getting the second match instead of the first match
# as the words 'for' and 'pips' appear twice in the tweet
# Closed Sell #Forex #Fx #AUDNZD 1.09677 for +23.2 pips, total for today +9.2 pips
# I am using a work around here which is to split the text
# and then extract the text between the words 'for' and 'pips'

In [37]:
all_tweet_df['Profit/Loss String'] = all_tweet_df['Tweet'].str.split(',').str.get(0)
all_tweet_df['Profit/Loss'] = all_tweet_df['Profit/Loss String'].str.extract('.*for(.*)pips.*',expand=True)

In [38]:
len(all_tweet_df)
Out[38]:
12470

In [39]:
# Let's check to make sure we extracted the profit/loss correctly for all tweet authors
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    author = all_tweet_df['Author'] == name
    print(all_tweet_df[author][['CreatedAt','Author','Profit/Loss']][:10])
   CreatedAt            Author        Profit/Loss
0  2017-12-15 15:33:39  fuchstraders  0.0
1  2017-12-15 13:02:14  fuchstraders  0.0
2  2017-12-15 09:33:21  fuchstraders  0.0
3  2017-12-15 07:04:23  fuchstraders  0.0
4  2017-12-15 03:33:19  fuchstraders  0.0
5  2017-12-15 01:02:14  fuchstraders  0.0
6  2017-12-14 21:33:28  fuchstraders  0.0
7  2017-12-14 18:33:44  fuchstraders  NaN
8  2017-12-14 18:33:43  fuchstraders  NaN
9  2017-12-14 18:33:41  fuchstraders  NaN
   CreatedAt            Author         Profit/Loss
0  2017-12-15 19:47:07  SignalFactory  +79
1  2017-12-15 19:45:04  SignalFactory  NaN
2  2017-12-15 19:32:03  SignalFactory  -40
3  2017-12-15 19:30:00  SignalFactory  -40
4  2017-12-15 19:17:03  SignalFactory  -40
5  2017-12-15 19:15:01  SignalFactory  NaN
6  2017-12-15 17:30:01  SignalFactory  NaN
7  2017-12-15 17:15:00  SignalFactory  -40
8  2017-12-15 17:00:00  SignalFactory  -40
9  2017-12-15 16:15:00  SignalFactory  NaN
   CreatedAt            Author      Profit/Loss
0  2017-12-15 07:05:00  tj_fx_live  NaN
1  2017-12-15 02:53:48  tj_fx_live  NaN
2  2017-12-15 02:48:49  tj_fx_live  +23.8
3  2017-12-15 02:28:42  tj_fx_live  -17.2
4  2017-12-14 22:23:23  tj_fx_live  0.0
5  2017-12-14 14:09:14  tj_fx_live  NaN
6  2017-12-14 14:04:13  tj_fx_live  +48.4
7  2017-12-14 07:00:45  tj_fx_live  NaN
8  2017-12-13 22:20:50  tj_fx_live  0.0
9  2017-12-13 22:05:42  tj_fx_live  -35.1
   CreatedAt            Author         Profit/Loss
0  2017-12-15 21:35:28  Wermelgion_Co  +10.4
1  2017-12-15 17:35:58  Wermelgion_Co  +9.4
2  2017-12-15 16:55:25  Wermelgion_Co  +12.2
3  2017-12-15 16:25:24  Wermelgion_Co  +7.7
4  2017-12-15 15:44:59  Wermelgion_Co  -15.2
5  2017-12-15 15:44:58  Wermelgion_Co  +24.0
6  2017-12-15 15:24:49  Wermelgion_Co  +16.4
7  2017-12-15 14:59:38  Wermelgion_Co  +12.2
8  2017-12-15 14:29:22  Wermelgion_Co  +10.4
9  2017-12-15 14:24:39  Wermelgion_Co  +9.9
   CreatedAt            Author        Profit/Loss
0  2017-12-15 19:32:12  MaggiecharFx  +25.5
1  2017-12-15 19:32:12  MaggiecharFx  +25.5
2  2017-12-15 19:32:11  MaggiecharFx  +25.5
3  2017-12-15 19:32:11  MaggiecharFx  +25.1
4  2017-12-15 19:32:11  MaggiecharFx  +27.0
5  2017-12-15 15:16:40  MaggiecharFx  +26.0
6  2017-12-15 15:16:40  MaggiecharFx  +25.5
7  2017-12-15 15:16:40  MaggiecharFx  +25.9
8  2017-12-15 15:16:40  MaggiecharFx  +28.9
9  2017-12-15 15:16:39  MaggiecharFx  +28.2
   CreatedAt            Author       Profit/Loss
0  2017-12-16 11:00:52  DanielWr_fx  NaN
1  2017-12-16 10:00:53  DanielWr_fx  NaN
2  2017-12-16 09:20:24  DanielWr_fx  NaN
3  2017-12-16 09:00:16  DanielWr_fx  NaN
4  2017-12-16 08:45:24  DanielWr_fx  NaN
5  2017-12-16 08:00:52  DanielWr_fx  NaN
6  2017-12-16 07:01:04  DanielWr_fx  NaN
7  2017-12-16 06:00:49  DanielWr_fx  NaN
8  2017-12-16 05:00:47  DanielWr_fx  NaN
9  2017-12-16 04:45:23  DanielWr_fx  NaN

In [40]:
# The next important step is to drop the NaN values, as the rows with
# NaN values are tweets other than those that record the profit/loss
# They could be a valid trade entry related tweet, but we are
# not interested in those anyway, unless we want to follow the trade ideas
# That is a topic by itself - and is the follow-up research after this analysis :-)

In [41]:
all_tweet_df.isnull().any()
Out[41]:
CreatedAt             False
Author                False
Tweet                 False
Profit/Loss String    False
Profit/Loss           True
dtype: bool

In [42]:
before = len(all_tweet_df)
all_tweet_df = all_tweet_df.dropna()
after = len(all_tweet_df)
print ('No. of records dropped = {}'.format(before - after))
No. of records dropped = 3948

In [43]:
all_tweet_df.isnull().any()
Out[43]:
CreatedAt             False
Author                False
Tweet                 False
Profit/Loss String    False
Profit/Loss           False
dtype: bool

In [44]:
all_tweet_df.dtypes
Out[44]:
CreatedAt             datetime64[ns]
Author                object
Tweet                 object
Profit/Loss String    object
Profit/Loss           object
dtype: object

In [45]:
# The very last but important step, is to convert the profit/loss column from
# a generic object into a float, so it is easier to process further

In [46]:
all_tweet_df['Profit/Loss'] = pd.to_numeric(all_tweet_df['Profit/Loss'])
all_tweet_df.dtypes
Out[46]:
CreatedAt             datetime64[ns]
Author                object
Tweet                 object
Profit/Loss String    object
Profit/Loss           float64
dtype: object

Step 5: Visualization

In [47]:
# In order to visualize the growth in the portfolio, based on the
# profit/loss from individual trades, we start with a seed investment of
# 10000 USD and continue adding the profit/loss values.
# There is an important detail that has to be noted here
# If we place a trade, say buying 10000 units of EUR_USD, then 1 pip in profit results
# in 1 USD in profit
# So the general formula for pip to USD conversion for a EUR_USD trade is:
# No. of units / 10000 * no. of pips in profit/loss
# In our tweet analysis, initially I wanted to look at only EUR_USD trades, however
# since the no. of tweets was low, I included all currencies
# Applying a currency conversion factor is beyond the scope of this project
# So we will make an important assumption here that the no. of units traded
# for a specific currency pair i.e the position size, already takes into account this conversion factor
# such that a 1 pip profit always results in a 1 USD profit

In [48]:
# Let's filter out the dataframes according to their tweet authors
# and store them in a dictionary with author names as keys, for easy access

In [49]:
dict_of_author_tweets = {}
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    author = all_tweet_df['Author'] == name
    dict_of_author_tweets[name] = all_tweet_df[author].copy()
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss']].head())
   CreatedAt            Author        Profit/Loss
0  2017-12-15 15:33:39  fuchstraders  0.0
1  2017-12-15 13:02:14  fuchstraders  0.0
2  2017-12-15 09:33:21  fuchstraders  0.0
3  2017-12-15 07:04:23  fuchstraders  0.0
4  2017-12-15 03:33:19  fuchstraders  0.0
   CreatedAt            Author         Profit/Loss
0  2017-12-15 19:47:07  SignalFactory  79.0
2  2017-12-15 19:32:03  SignalFactory  -40.0
3  2017-12-15 19:30:00  SignalFactory  -40.0
4  2017-12-15 19:17:03  SignalFactory  -40.0
7  2017-12-15 17:15:00  SignalFactory  -40.0
   CreatedAt            Author      Profit/Loss
2  2017-12-15 02:48:49  tj_fx_live  23.8
3  2017-12-15 02:28:42  tj_fx_live  -17.2
4  2017-12-14 22:23:23  tj_fx_live  0.0
6  2017-12-14 14:04:13  tj_fx_live  48.4
8  2017-12-13 22:20:50  tj_fx_live  0.0
   CreatedAt            Author         Profit/Loss
0  2017-12-15 21:35:28  Wermelgion_Co  10.4
1  2017-12-15 17:35:58  Wermelgion_Co  9.4
2  2017-12-15 16:55:25  Wermelgion_Co  12.2
3  2017-12-15 16:25:24  Wermelgion_Co  7.7
4  2017-12-15 15:44:59  Wermelgion_Co  -15.2
   CreatedAt            Author        Profit/Loss
0  2017-12-15 19:32:12  MaggiecharFx  25.5
1  2017-12-15 19:32:12  MaggiecharFx  25.5
2  2017-12-15 19:32:11  MaggiecharFx  25.5
3  2017-12-15 19:32:11  MaggiecharFx  25.1
4  2017-12-15 19:32:11  MaggiecharFx  27.0
    CreatedAt            Author       Profit/Loss
19  2017-12-15 19:07:09  DanielWr_fx  10.5
20  2017-12-15 19:07:09  DanielWr_fx  10.1
21  2017-12-15 19:07:08  DanielWr_fx  -1.2
22  2017-12-15 19:07:08  DanielWr_fx  10.8
23  2017-12-15 19:07:08  DanielWr_fx  10.5

In [50]:
# Let's create a column named Equity with 0 values initially
# We will calculate the values for equity subsequently

In [51]:
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    dict_of_author_tweets[name]['Equity'] = 0
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
   CreatedAt            Author        Profit/Loss  Equity
0  2017-12-15 15:33:39  fuchstraders  0.0          0
1  2017-12-15 13:02:14  fuchstraders  0.0          0
2  2017-12-15 09:33:21  fuchstraders  0.0          0
3  2017-12-15 07:04:23  fuchstraders  0.0          0
4  2017-12-15 03:33:19  fuchstraders  0.0          0
   CreatedAt            Author         Profit/Loss  Equity
0  2017-12-15 19:47:07  SignalFactory  79.0         0
2  2017-12-15 19:32:03  SignalFactory  -40.0        0
3  2017-12-15 19:30:00  SignalFactory  -40.0        0
4  2017-12-15 19:17:03  SignalFactory  -40.0        0
7  2017-12-15 17:15:00  SignalFactory  -40.0        0
   CreatedAt            Author      Profit/Loss  Equity
2  2017-12-15 02:48:49  tj_fx_live  23.8         0
3  2017-12-15 02:28:42  tj_fx_live  -17.2        0
4  2017-12-14 22:23:23  tj_fx_live  0.0          0
6  2017-12-14 14:04:13  tj_fx_live  48.4         0
8  2017-12-13 22:20:50  tj_fx_live  0.0          0
   CreatedAt            Author         Profit/Loss  Equity
0  2017-12-15 21:35:28  Wermelgion_Co  10.4         0
1  2017-12-15 17:35:58  Wermelgion_Co  9.4          0
2  2017-12-15 16:55:25  Wermelgion_Co  12.2         0
3  2017-12-15 16:25:24  Wermelgion_Co  7.7          0
4  2017-12-15 15:44:59  Wermelgion_Co  -15.2        0
   CreatedAt            Author        Profit/Loss  Equity
0  2017-12-15 19:32:12  MaggiecharFx  25.5         0
1  2017-12-15 19:32:12  MaggiecharFx  25.5         0
2  2017-12-15 19:32:11  MaggiecharFx  25.5         0
3  2017-12-15 19:32:11  MaggiecharFx  25.1         0
4  2017-12-15 19:32:11  MaggiecharFx  27.0         0
    CreatedAt            Author       Profit/Loss  Equity
19  2017-12-15 19:07:09  DanielWr_fx  10.5         0
20  2017-12-15 19:07:09  DanielWr_fx  10.1         0
21  2017-12-15 19:07:08  DanielWr_fx  -1.2         0
22  2017-12-15 19:07:08  DanielWr_fx  10.8         0
23  2017-12-15 19:07:08  DanielWr_fx  10.5         0

In [52]:
# We also notice that the dataframe is in reverse chronological order because
# the tweets were retrieved from the latest to the oldest
# Let's sort the dataframes in a chronological order
# and let's reset the index, so that further manipulation and plotting becomes easier

In [53]:
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    dict_of_author_tweets[name] = dict_of_author_tweets[name].sort_values(by='CreatedAt')
    dict_of_author_tweets[name] = dict_of_author_tweets[name].reset_index(drop=True)
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
   CreatedAt            Author        Profit/Loss  Equity
0  2017-03-24 01:07:36  fuchstraders  -23.4        0
1  2017-03-24 04:14:24  fuchstraders  -12.4        0
2  2017-03-28 21:08:48  fuchstraders  -199.0       0
3  2017-03-28 22:33:40  fuchstraders  -199.0       0
4  2017-03-30 14:21:07  fuchstraders  -19.5        0
   CreatedAt            Author         Profit/Loss  Equity
0  2017-05-30 21:39:20  SignalFactory  0.0          0
1  2017-05-30 21:44:24  SignalFactory  -26.4        0
2  2017-05-30 21:59:27  SignalFactory  -15.8        0
3  2017-05-30 22:04:29  SignalFactory  -9.3         0
4  2017-05-31 01:15:01  SignalFactory  80.0         0
   CreatedAt            Author      Profit/Loss  Equity
0  2017-05-18 14:14:55  tj_fx_live  249.7        0
1  2017-05-18 19:24:34  tj_fx_live  12.6         0
2  2017-05-19 03:42:31  tj_fx_live  -56.3        0
3  2017-05-19 11:57:26  tj_fx_live  -45.5        0
4  2017-05-19 23:49:18  tj_fx_live  16.7         0
   CreatedAt            Author         Profit/Loss  Equity
0  2017-11-26 23:03:03  Wermelgion_Co  7.3          0
1  2017-11-27 01:22:49  Wermelgion_Co  -34.9        0
2  2017-11-27 01:22:49  Wermelgion_Co  36.8         0
3  2017-11-27 01:22:49  Wermelgion_Co  1.0          0
4  2017-11-27 01:27:51  Wermelgion_Co  37.2         0
   CreatedAt            Author        Profit/Loss  Equity
0  2017-04-19 10:12:29  MaggiecharFx  10.2         0
1  2017-04-19 10:12:29  MaggiecharFx  10.6         0
2  2017-04-19 10:12:30  MaggiecharFx  11.0         0
3  2017-04-19 10:12:30  MaggiecharFx  9.8          0
4  2017-04-19 10:12:30  MaggiecharFx  11.0         0
   CreatedAt            Author       Profit/Loss  Equity
0  2017-10-13 15:19:11  DanielWr_fx  28.9         0
1  2017-10-13 15:19:11  DanielWr_fx  25.4         0
2  2017-10-13 15:19:12  DanielWr_fx  26.0         0
3  2017-10-13 15:19:12  DanielWr_fx  26.1         0
4  2017-10-13 18:04:27  DanielWr_fx  -1.2         0

In [54]:
# To calculate the running equity value, we need to start with an initial balance
# of 10000 USD, so let's insert a row in each dataframe at the first position
# with an equity value of 10000 USD

In [55]:
from datetime import timedelta
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    oldest_trade = dict_of_author_tweets[name]['CreatedAt'][0]
    new_top_row = []
    new_top_row.insert(0,{'CreatedAt':oldest_trade + timedelta(days=-1),'Author':name,'Tweet':'Initial Capital',
                          'Profit/Loss String':'Happy trading','Profit/Loss':0.0,'Equity':10000.0})
    dict_of_author_tweets[name] = pd.concat([pd.DataFrame(new_top_row),dict_of_author_tweets[name]],ignore_index=True)
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
   CreatedAt            Author        Profit/Loss  Equity
0  2017-03-23 01:07:36  fuchstraders  0.0          10000.0
1  2017-03-24 01:07:36  fuchstraders  -23.4        0.0
2  2017-03-24 04:14:24  fuchstraders  -12.4        0.0
3  2017-03-28 21:08:48  fuchstraders  -199.0       0.0
4  2017-03-28 22:33:40  fuchstraders  -199.0       0.0
   CreatedAt            Author         Profit/Loss  Equity
0  2017-05-29 21:39:20  SignalFactory  0.0          10000.0
1  2017-05-30 21:39:20  SignalFactory  0.0          0.0
2  2017-05-30 21:44:24  SignalFactory  -26.4        0.0
3  2017-05-30 21:59:27  SignalFactory  -15.8        0.0
4  2017-05-30 22:04:29  SignalFactory  -9.3         0.0
   CreatedAt            Author      Profit/Loss  Equity
0  2017-05-17 14:14:55  tj_fx_live  0.0          10000.0
1  2017-05-18 14:14:55  tj_fx_live  249.7        0.0
2  2017-05-18 19:24:34  tj_fx_live  12.6         0.0
3  2017-05-19 03:42:31  tj_fx_live  -56.3        0.0
4  2017-05-19 11:57:26  tj_fx_live  -45.5        0.0
   CreatedAt            Author         Profit/Loss  Equity
0  2017-11-25 23:03:03  Wermelgion_Co  0.0          10000.0
1  2017-11-26 23:03:03  Wermelgion_Co  7.3          0.0
2  2017-11-27 01:22:49  Wermelgion_Co  -34.9        0.0
3  2017-11-27 01:22:49  Wermelgion_Co  36.8         0.0
4  2017-11-27 01:22:49  Wermelgion_Co  1.0          0.0
   CreatedAt            Author        Profit/Loss  Equity
0  2017-04-18 10:12:29  MaggiecharFx  0.0          10000.0
1  2017-04-19 10:12:29  MaggiecharFx  10.2         0.0
2  2017-04-19 10:12:29  MaggiecharFx  10.6         0.0
3  2017-04-19 10:12:30  MaggiecharFx  11.0         0.0
4  2017-04-19 10:12:30  MaggiecharFx  9.8          0.0
   CreatedAt            Author       Profit/Loss  Equity
0  2017-10-12 15:19:11  DanielWr_fx  0.0          10000.0
1  2017-10-13 15:19:11  DanielWr_fx  28.9         0.0
2  2017-10-13 15:19:11  DanielWr_fx  25.4         0.0
3  2017-10-13 15:19:12  DanielWr_fx  26.0         0.0
4  2017-10-13 15:19:12  DanielWr_fx  26.1         0.0

In [56]:
# Now we have all the required info to calculate the equity growth and
# subsequently visualize it to see how each analyst/author has performed
# over the duration

In [57]:
# We will use a simple formula to calculate the Equity evolution
# Equity[i] = Equity[i-1] + (Equity[i-1]/10000*Profit/Loss[i])
# This formula takes position sizing into account, however, it ignores
# the differences in the Pip vs USD value for different currency pairs as mentioned earlier

In [58]:
for name in ['fuchstraders','SignalFactory','tj_fx_live','Wermelgion_Co','MaggiecharFx','DanielWr_fx']:
    profitloss = dict_of_author_tweets[name]['Profit/Loss']
    equity = dict_of_author_tweets[name]['Equity']
    for x in range(1,len(profitloss)):
        equity[x] = equity[x-1]*(1+1/10000*profitloss[x])
    dict_of_author_tweets[name]['Equity'] = equity
    print(dict_of_author_tweets[name][['CreatedAt','Author','Profit/Loss','Equity']].head())
/home/theerthan/anaconda3/envs/edx/lib/python3.5/site-packages/ipykernel_launcher.py:5: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """
   CreatedAt            Author        Profit/Loss  Equity
0  2017-03-23 01:07:36  fuchstraders  0.0          10000.000000
1  2017-03-24 01:07:36  fuchstraders  -23.4        9976.600000
2  2017-03-24 04:14:24  fuchstraders  -12.4        9964.229016
3  2017-03-28 21:08:48  fuchstraders  -199.0       9765.940859
4  2017-03-28 22:33:40  fuchstraders  -199.0       9571.598635
   CreatedAt            Author         Profit/Loss  Equity
0  2017-05-29 21:39:20  SignalFactory  0.0          10000.000000
1  2017-05-30 21:39:20  SignalFactory  0.0          10000.000000
2  2017-05-30 21:44:24  SignalFactory  -26.4        9973.600000
3  2017-05-30 21:59:27  SignalFactory  -15.8        9957.841712
4  2017-05-30 22:04:29  SignalFactory  -9.3         9948.580919
   CreatedAt            Author      Profit/Loss  Equity
0  2017-05-17 14:14:55  tj_fx_live  0.0          10000.000000
1  2017-05-18 14:14:55  tj_fx_live  249.7        10249.700000
2  2017-05-18 19:24:34  tj_fx_live  12.6         10262.614622
3  2017-05-19 03:42:31  tj_fx_live  -56.3        10204.836102
4  2017-05-19 11:57:26  tj_fx_live  -45.5        10158.404097
   CreatedAt            Author         Profit/Loss  Equity
0  2017-11-25 23:03:03  Wermelgion_Co  0.0          10000.000000
1  2017-11-26 23:03:03  Wermelgion_Co  7.3          10007.300000
2  2017-11-27 01:22:49  Wermelgion_Co  -34.9        9972.374523
3  2017-11-27 01:22:49  Wermelgion_Co  36.8         10009.072861
4  2017-11-27 01:22:49  Wermelgion_Co  1.0          10010.073769
   CreatedAt            Author        Profit/Loss  Equity
0  2017-04-18 10:12:29  MaggiecharFx  0.0          10000.000000
1  2017-04-19 10:12:29  MaggiecharFx  10.2         10010.200000
2  2017-04-19 10:12:29  MaggiecharFx  10.6         10020.810812
3  2017-04-19 10:12:30  MaggiecharFx  11.0         10031.833704
4  2017-04-19 10:12:30  MaggiecharFx  9.8          10041.664901
   CreatedAt            Author       Profit/Loss  Equity
0  2017-10-12 15:19:11  DanielWr_fx  0.0          10000.000000
1  2017-10-13 15:19:11  DanielWr_fx  28.9         10028.900000
2  2017-10-13 15:19:11  DanielWr_fx  25.4         10054.373406
3  2017-10-13 15:19:12  DanielWr_fx  26.0         10080.514777
4  2017-10-13 15:19:12  DanielWr_fx  26.1         10106.824920

In [59]:
fig,axis = plt.subplots(2,3,figsize=[20,10])
out = plt.suptitle('Equity curve of various Analysts/FXTraders',y=1.08,fontsize=30)
years = mdates.YearLocator()    # every year
months = mdates.MonthLocator()  # every month
yearsFmt = mdates.DateFormatter('%Y')
names = [['fuchstraders','SignalFactory','tj_fx_live'],
         ['Wermelgion_Co','MaggiecharFx','DanielWr_fx']]
for x in range(len(names)):
    for y in range(len(names[x])):
        dict_of_author_tweets[names[x][y]]['CreatedAt'] = pd.to_datetime(dict_of_author_tweets[names[x][y]]['CreatedAt'])
        axis[x][y].plot(dict_of_author_tweets[names[x][y]]['CreatedAt'].values,dict_of_author_tweets[names[x][y]]['Equity'].values)
        axis[x][y].set_title(names[x][y],fontsize=20)
        axis[x][y].grid(True)
fig.autofmt_xdate()
plt.tight_layout()

In [62]:
# We can also look at the distribution of the trades with respect to the no. of pips
# gained or lost i.e. how many trades were executed that had a certain pip range in profit
# or loss
# this will give us an idea of the consistency of the analyst/author i.e.
# if he or she is
# consistently winning with a few bad trades or consistently losing with a few winning trades etc
# Based on the above equity curve, we would expect that for the first three, the histogram shows
# mostly trades less than 0 (mostly losing trades), while for the other three,
# the histogram shows mostly trades above 0 (mostly winning trades)

In [63]:
fig,axis = plt.subplots(2,3,figsize=[20,10])
out = plt.suptitle('Histogram of Profit/Loss trades',y=1.08,fontsize=30)
names = [['fuchstraders','SignalFactory','tj_fx_live'],
         ['Wermelgion_Co','MaggiecharFx','DanielWr_fx']]
for x in range(len(names)):
    for y in range(len(names[x])):
        axis[x][y].hist(dict_of_author_tweets[names[x][y]]['Profit/Loss'].values,10, normed=False, facecolor='green')
        axis[x][y].set_title(names[x][y],fontsize=20)
        axis[x][y].grid(True)
plt.tight_layout()

Observations
The analysis shows that although some analysts consistently lose money and their equity drops significantly, others do make profits and maintain a healthy equity over a range of months. So the hypothesis that we could follow twitter feeds and manage our portfolio is very much a possibility. However, demonstrating this possibility requires a structured analysis in addition to the preliminary research above.

Specific to this analysis, we can see that three out of the six analysts have lost money, while the other three have shown positive returns. Two of them have even shown enormous returns, which at first glance seem too good to be true. As mentioned above, a second round of validation, correlating the entry price and time of each trade with the actual price of the currency pair at that time, along with the corresponding exit, would re-confirm that the trades were indeed profitable and that the tweets were not made up.
So it is indeed possible to manage a portfolio by smart monitoring of twitter feeds and selective tagging of trades!!
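As a compact recap of the two core steps of the notebook (extracting the pips per trade and compounding the equity), the sketch below shows one possible alternative formulation. It is not the code that produced the results above: the sample tweets, the names `PIPS_PATTERN`, `extract_pips` and `equity_curve`, and the single regular expression are assumptions made for illustration. The regex anchors on a signed number directly before 'pips', so it picks the per-trade value rather than the 'total for today' value without the text-replace and split work-around, and the equity recurrence Equity[i] = Equity[i-1] * (1 + Profit/Loss[i]/10000) is evaluated with a cumulative product instead of an element-wise loop.

```python
import pandas as pd

# Hypothetical sample rows, modelled on the tweet formats shown in the notebook.
tweets = pd.DataFrame({
    "Author": ["SignalFactory", "SignalFactory", "tj_fx_live"],
    "Tweet": [
        "Forex Signal | Close(TP) Sell CADJPY@87.394 | Profit: +79 pips | 2017.12.15 19:34 GMT",
        "Forex Signal | Close(SL) Sell AUDCAD@0.98526 | Loss: -40 pips | 2017.12.15 19:27 GMT",
        "Closed Sell USDJPY 112.559 for +23.8 pips, total for today +6.6 pips",
    ],
})

# Assumed pattern: 'for', 'Profit:' or 'Loss:' immediately followed by a signed
# number and the word 'pips'. 'total for today +6.6 pips' is skipped because
# 'for' is followed by 'today' there, not by a number.
PIPS_PATTERN = r"(?:\bfor|Profit:|Loss:)\s*([+-]?\d+(?:\.\d+)?)\s*pips"


def extract_pips(tweet_series):
    """Return the first per-trade pip value of each tweet as a float (NaN if absent)."""
    return pd.to_numeric(tweet_series.str.extract(PIPS_PATTERN, expand=False))


def equity_curve(pips, initial_capital=10000.0):
    """Compound the equity trade by trade: E[i] = E[i-1] * (1 + pips[i] / 10000).

    Assumes `pips` is already in chronological order and, as in the notebook,
    that 1 pip on a 10000-unit position is worth 1 USD for every currency pair.
    """
    growth = 1.0 + pips / 10000.0
    return initial_capital * growth.cumprod()


tweets["Profit/Loss"] = extract_pips(tweets["Tweet"])
tweets["Equity"] = equity_curve(tweets["Profit/Loss"])
print(tweets[["Author", "Profit/Loss", "Equity"]])
```

Feeding the first two fuchstraders trades from the sorted dataframe (-23.4 and -12.4 pips) into `equity_curve` gives 9976.6 and roughly 9964.229, the same values as the Equity column printed by cell In [58]. Because the function returns a new Series rather than writing into a slice of the dataframe, it should also avoid the SettingWithCopyWarning seen in that cell.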