The use of Google Search data for macro-economic nowcasting
Per Nymand-Andersen, European Central Bank
CCSA Special session on showcasing big data
ESCAP Headquarters, Bangkok, Thailand

Agenda
1 Reflections on “big data” for policy purposes
2 Showcasing “big data” for macro-economic purposes
3 Preliminary lessons and way forward

1 Reflections on “big data” for policy purposes

“Big data are a source of information and intelligence that have been gathered from a recorded action or from a combination of records”

For example:
• records of supermarket purchases (Walmart tracks > 1 million transactions/hour)
• robot and sensor information in production processes
• road tolls, train, ship, aeroplane, mobile tracking devices, navigation systems
• telephone operators and satellite sensors, electronic images
• behavioural, event-driven and opinion-gathering data from search engines and social media (Twitter, blogs, text messages, Facebook, LinkedIn)
• speech and word recognition
• credit and debit payments, trading and settlement platforms

The list seems endless as more and more information becomes public and digital.

The term “big data”: a large variety of interpretations*

While some institutions may consider single sourced data, such as “administrative data” (business registers) or granular or micro information data (security-by-security datasets), as “big data”, others may take a more holistic approach of complexity of combining size, formats and sources, mainly focussed on non public/private sources.

Big data is not just about large data sets. The 4 Vs (IBM) relate to Volume, Velocity, Variety and Veracity:
• Volume: scale of data
• Velocity: analysis of streaming data
• Variety: different forms of data
• Veracity: uncertainty of data

* “Big data – The hunt for timely insights and decision certainty. Central banking reflections on the use of big data for policy purposes”, P. Nymand-Andersen, IFC publication (2015).
2 Showcasing “big data” for macro-economic purposes

Since 2008, a new and growing field of experimental nowcasting has emerged, mainly covering consumption and selected macro-economic indicators.

[Figure: Macro-economic topic and number of releases. Topics shown: Unemployment, Stock market, House market, Predict sales/consumption, Travel, Consumer sentiment, Inflation, State of economy, Detect influenza.]
Source: “Predicting the euro area unemployment rate using Google data: central banks’ interest in and use of big data.” Nymand-Andersen, P & Koivupalo H, forthcoming publication (2015).

Authors: area of macro-economic topic
• Hal Varian & Choi (2009, 2011, 2013): unemployment rate, retail sales, home sales, travel/tourism, car sales, consumer confidence
• Zimmermann K & Askitas N (2009): DE unemployment rate
• D’Amuri F & Marcucci J (2010, 2013): US unemployment rate
• McLaren N & Shanbhogue R (2011): UK unemployment rate & housing market trends
• Vosen & Schmidt (2011): DE private consumption
• Carriere-Swallow (2011): car purchases in Chile
• Guzmán G (2011): inflation
• Fantazzini D & Toktamysova Z (2014): German car sales
• Morgan J, et al. (2015): DE, FR, IT, ES, NL unemployment rates

How to use Google search data to nowcast euro area unemployment

Target: Eurostat’s euro area 13 and 19 unemployment rates, tested over two periods: 2011–2012 and 2012–2014.

Dataset: Google search data (Google search machines), organised by Google’s taxonomy for categorising search terms, which includes 26 main categories and 269 sub-categories.
(e.g. Finance and Banking)

Google search data is an index of weekly volume changes. The volumes are normalised to start at 1.00, and each following week’s value shows the relative change of Google searches within the category (no absolute volumes are available).

Data from 14 countries: Austria, Belgium, Denmark, France, Germany, Ireland, Italy, Netherlands, Portugal, Spain, Sweden, Slovenia, United Kingdom, USA

Two autoregressive models are used to nowcast the euro area unemployment rate:

$\log(y_t) = a + b \log(y_{t-1}) + c \log(y_{t-12}) + e_t$

$\log(y_t) = a + b \log(y_{t-1}) + c \log(y_{t-12}) + G + e_t$

where $y_t$ is the unemployment rate in month $t$ and $G$ is the Google search index.

Mean absolute error (MAE) by forecast period

Unemployment rate – EA13
Forecast period      Base model   Base model + Google data   Errors reduced
Jan 2011–Dec 2012    1.97         1.61                       18.1%
Nov 2012–Oct 2014    2.23         1.73                       22.6%

Unemployment rate – EA18
Forecast period      Base model   Base model + Google data   Errors reduced
Jan 2011–Dec 2012    1.97         1.41                       28.7%
Nov 2012–Oct 2014    2.02         1.57                       22.2%

Applying the mean absolute error (MAE), preliminary indications suggest that the naïve model including the Google data performs better over the two periods. The improvement (reduction in the errors) ranges from 18.1% to 28.7%.

3 Preliminary lessons and way forward

Robustness, Methodology, Quality
• stability of search terms
• volatility in analytical results
• based on one search engine
• coverage, weights, normalisations
• aggregation methods
• price information
• short time series
• differ across regions
• no quality measurements
• no unit tracking
• rebasing and time lag
• home and host concept

Usability, Availability, Innovation
• nowcasting of retail consumption and selected macro-economic indicators
• conjunctural analysis
• consumer behaviour
• price indexes
• public and free, easy to use
• one system for all countries
• comparability & timeliness
• large taxonomy of searches
• trends in communications
• product loyalty
• advertisement
• social patterns in retail markets
• households & business surveys

Using private data sources in statistics:
• new ideas for statistical input are always met with a degree of scepticism
• simple, cheap and easy to put into statistics production
• creates dependencies, though always free in the start-up phase
• challenges the statistics communication function

Statisticians may need to explore private sources to meet increasing user demands for statistics.

Central banks are interested in cooperating in a structured approach:
• establishing a big data road map
• identifying joint pilot projects
• sharing experience

Relevant pilot projects within the field of using:
1) administrative datasets (e.g. corporate balance sheet data)
2) web search datasets (e.g. Google type search info)
3) commercial datasets (e.g. credit card operators)
4) financial market data (e.g. high frequency trading)

Outlet for statistical papers including big data: ECB Statistics Paper Series (big data)

• “Nowcasting GDP with electronic payments data” by Galbraith J & Tkacz G
  – Electronic payment transactions can be used in nowcasting current gross domestic product growth
  – Finds that debit card transactions contribute most to forecast accuracy
• “Social media sentiment and consumer confidence” by Daas P & Puts M
  – Are changes in consumer confidence related to Dutch public social media?
  – Could be used as an indicator for changes in consumer confidence and as an early indicator
• “Quantifying the effects of online bullishness on international financial markets” by Mao H, Counts S & Bollen J
  – Develops a measure of investor sentiment based on Twitter and Google search queries
  – Twitter and Google bullishness are positively correlated to investor sentiment
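
Illustrative sketch of the nowcasting comparison in section 2. The Python code below fits the base autoregressive model and the same model extended with a search index, then compares their mean absolute errors. It is a minimal sketch under stated assumptions, not the authors' implementation. The data are synthetic, and the function names (`build_design`, `nowcast_mae`, `error_reduction`) are hypothetical. The slides show G without a coefficient; estimating one (d) is an assumption. The coefficients are estimated once by ordinary least squares on the earlier part of the sample, and the MAE is measured on the level of the rate, which may differ from the scale used in the slides.

```python
import numpy as np


def build_design(log_y, g=None, seasonal_lag=12):
    """Regressors for log(y_t) = a + b*log(y_{t-1}) + c*log(y_{t-12}) [+ d*G_t]."""
    log_y = np.asarray(log_y, dtype=float)
    t = np.arange(seasonal_lag, len(log_y))
    cols = [np.ones(len(t)), log_y[t - 1], log_y[t - seasonal_lag]]
    if g is not None:
        # Assumption: the search index enters linearly with its own coefficient d.
        cols.append(np.asarray(g, dtype=float)[t])
    return np.column_stack(cols), log_y[t]


def nowcast_mae(y, g=None, n_test=24, seasonal_lag=12):
    """Fit by OLS on the early sample; return the MAE of the last n_test nowcasts."""
    X, target = build_design(np.log(y), g, seasonal_lag)
    X_train, y_train = X[:-n_test], target[:-n_test]
    X_test, y_test = X[-n_test:], target[-n_test:]
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    pred = np.exp(X_test @ coef)  # back-transform from logs to the rate level
    return float(np.mean(np.abs(np.exp(y_test) - pred)))


def error_reduction(mae_base, mae_google):
    """Percentage reduction in MAE relative to the base model."""
    return 100.0 * (mae_base - mae_google) / mae_base


# Synthetic monthly data (NOT real Eurostat or Google data).
rng = np.random.default_rng(0)
n = 120
g = rng.normal(0.0, 1.0, n)  # stand-in for a search index
log_y = np.full(n, np.log(10.0))
for t in range(12, n):
    log_y[t] = (0.1 * np.log(10.0) + 0.7 * log_y[t - 1] + 0.2 * log_y[t - 12]
                + 0.02 * g[t] + rng.normal(0.0, 0.005))
y = np.exp(log_y)

mae_base = nowcast_mae(y)          # model without the search index
mae_google = nowcast_mae(y, g=g)   # model with the search index
print(f"MAE base: {mae_base:.3f}, MAE with index: {mae_google:.3f}, "
      f"errors reduced: {error_reduction(mae_base, mae_google):.1f}%")
```

Because the synthetic series is generated with a genuine dependence on the index, the extended model wins here by construction. With real data, the size and even the sign of the improvement depend on the sample period and on the search categories chosen.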