Using Big Data to investigate the influence of climate and demography on wine consumer habits

Alastair Reed1, Michael Shannon1, Daniel Mathews2
1 Viticulture and Winemaking, Melbourne Polytechnic
2 School of Mathematical Sciences, Monash University
Contact: alastairreed@melbournepolytechnic.edu.au

Today
Background: Australian wine retail sector
The study: Use of Big Data in wine to derive relationships between geography and climate
Results: Association between temperature, geography and consumer preference
Recommendations: Ongoing research and implications for future management

Introduction
The Australian wine retail sector is a clear duopoly
Dominated by two players: Wesfarmers Ltd [19%] and Woolworths Ltd [39%]
Data analysis opportunity!

Beverage Revenue
Wesfarmers Ltd    $2.0 billion
Woolworths Ltd    $4.1 billion
Source: data estimated by IBISWorld

What affects consumer preference?
Epigenetics of a varietal decision
1. Visual: label, position, status
2. History: regional bias, personal bias
3. Environment: climatic effects, light levels
[Diagram: Decision Genes (Shiraz, Sauvignon Blanc) -> Activation -> Shiraz sale, Sauvignon Blanc sale]
Decision Gene expression can be influenced developmentally and/or environmentally

Developmental vs Environmental
Case Study: Champagne
[Figure: sales chart, vertical axis 0 to 0.6, horizontal axis dated 1/1/2013 to 12/1/2013]
Online Chardonnay sales in Melbourne, Australia, during 2013
Overlaid on the chart: Warm average, Cold average
Annotated events: NYE, Football finals, Easter, Mother’s Day, Melbourne Cup, Tax Returns?
We wish to explain the environmental and developmental…
Can we quantify to what degree wine purchase decisions are influenced by the weather?
Can we explain to what degree wine purchase decisions are influenced by location at a city level?

The data…
Over 3 million transactions from across Victoria, Australia
Closely examined: Shiraz, Chardonnay, Riesling, Sauvignon Blanc, Pinot Gris/Grigio, Cabernet Sauvignon, Merlot, Pinot Noir

Wine Purchase Decision Case Study: Victoria
Geographically diverse state
Desert in the north-west
Alpine in the north-east
Temperate in the south

Melbourne’s Climate
Average temperature: 13 – 25°C
Extreme temperatures: -2 – 46°C
[Figure: temperature by month, vertical axis 0 to 30, horizontal axis months 1 to 12]

Consumer decisions cluster into groups

Temperature

Varieties correlate with temperature on a geographic scale
[Figure: two scatter plots with linear fits. Left: y = 353.25x - 52.241, R² = 0.69. Right: y = -180.76x + 33.717, R² = 0.41]
Association between relative Sauvignon Blanc (left) and Shiraz (right) sales and temperature, across Australia

All analysed varieties were correlated with temperature on a temporal scale
[Figure: scatter plot with linear fit y = -0.0036x + 0.2678, R² = 0.20; vertical axis 0% to 60%, horizontal axis 0 to 45]
Association between relative Shiraz sales and temperature
[Figure: scatter plot with linear fit y = 0.0017x + 0.1195, R² = 0.11; vertical axis 0% to 35%, horizontal axis 0 to 45]
Association between relative Sauvignon Blanc sales and temperature

Google search associates Shiraz with temperature
[Figure: scatter plot of Google search (relative) against Temperature (°C), with linear fit y = -0.7793x + 56.795, R² = 0.37]
Association between relative fortnightly Google searches and average temperature (excluding Christmas period)

Google search associates Sauvignon Blanc with temperature
[Figure: scatter plot with linear fit y = 0.0006x + 0.1445, R² = 0.14]
Association between relative fortnightly Google searches and average temperature (excluding Christmas period)

Link between red wine sales and temperature is consistently stronger than white, except Sauvignon Blanc…

Variety               Proportion of stores with       Average income** when       Average income when
                      significant correlation (r)     significant correlation     insignificant correlation
Cabernet Sauvignon    0.96 (0.29)                     $1632                       $1110
Merlot                0.86 (0.26)                     $1639                       $1436
Pinot Noir            0.57 (0.22)                     $1793                       $1371
Shiraz                0.98 (0.44)                     $1623                       $995
Chardonnay            0.45 (0.17)                     $1703                       $1535
Pinot Gris            0.67 (0.23)                     $1765                       $1303
Riesling              0.61 (0.25)                     $1778                       $1352
Sauvignon Blanc       0.96 (0.29)                     $1626                       $1244
Average               0.76 (0.27)                     $1695a                      $1294b
* >0.027
** fortnightly

Geography

Decision Gene approach
Relative purchase figures can be treated the same as allele frequencies (the frequency of gene variants), where an individual has two alleles for each gene
Genotypes: aa = purchase; Aa or AA = no purchase
We can then use the frequencies to describe the characteristics of a population
Comparing the relative frequency of alleles allows populations to be compared using distance matrices, visualized with traditional phylograms.
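
As a rough illustration of the distance-matrix step, the sketch below treats each variety as one locus with two alleles, using the relative sales share as the frequency of allele "a", and compares outlets with Nei's D_A distance. The slides do not say which distance measure was used, so D_A is an assumption here, and the outlet names and sales shares are invented purely for illustration. The resulting matrix is the kind of input a Neighbour-Joining program such as POPTREE2 would turn into a phylogram.

```python
import math

# Hypothetical relative sales shares per outlet (NOT data from the study).
# Assumption: each variety is one locus with two alleles,
# "a" with frequency = sales share and "A" with frequency = 1 - share.
SHARES = {
    "outlet_north": {"Shiraz": 0.30, "Chardonnay": 0.10, "Riesling": 0.05},
    "outlet_south": {"Shiraz": 0.20, "Chardonnay": 0.15, "Riesling": 0.10},
    "outlet_east": {"Shiraz": 0.30, "Chardonnay": 0.10, "Riesling": 0.05},
}


def nei_da(pop_x, pop_y):
    """Nei's D_A distance between two populations over their shared loci.

    D_A = 1 - (1/r) * sum over loci of sum over alleles of sqrt(x * y),
    where r is the number of loci.
    """
    loci = sorted(set(pop_x) & set(pop_y))
    if not loci:
        raise ValueError("populations share no loci")
    total = 0.0
    for locus in loci:
        p, q = pop_x[locus], pop_y[locus]
        # Two alleles per locus: "a" (p, q) and "A" (1 - p, 1 - q).
        total += math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - total / len(loci))


def distance_matrix(shares):
    """Return sorted outlet names and the symmetric pairwise D_A matrix."""
    names = sorted(shares)
    matrix = [[nei_da(shares[a], shares[b]) for b in names] for a in names]
    return names, matrix


if __name__ == "__main__":
    names, matrix = distance_matrix(SHARES)
    for name, row in zip(names, matrix):
        print(name, " ".join(f"{d:.4f}" for d in row))
```

Outlets with identical sales profiles get a distance of zero and would sit together in the tree; the more two outlets' shares differ, the longer the branch separating them.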

Clustering between distinct geographic areas
Phylogram generated using the Neighbour-Joining (NJ) method on sales frequencies of 7 varieties across 28 retail outlets (derived using POPTREE2 [Takezaki 2010])

Chardonnay sales contradict the cliché
[Map]

High Riesling sales follow SE-NW corridor
[Map]

Demographics roughly align with Chardonnay/Riesling distinction

Sauvignon Blanc is most popular in an outer suburban ring
[Map]

Summary
Significant associations can be made between developmental and environmental factors and consumer preference
Temporal and spatial trends can be identified but need further analysis for confirmation
We are looking for collaborators to consolidate this research. All welcome!
alastairreed@melbournepolytechnic.edu.au

Acknowledgements
Special thanks to the Australian Grape and Wine Authority and Melbourne Polytechnic for supporting my attendance at AAWE