Uncovering Path-to-Purchase Consumer Behaviors using Clustered Multivariate Autoregression

Yicheng Song, Nachiketa Sahoo, Shuba Srinivasan, Chrysanthos Dellarocas
School of Management, Boston University

1 Introduction

It is well known that, in the process of making a purchase, consumers move through a sequence of cognitive states, such as awareness, familiarity, and consideration. The sequence of these states is often referred to as the consumer’s path-to-purchase [Court et al. 2009]. By aligning its marketing strategies with a consumer’s path-to-purchase position and trajectory, a firm can make better use of its limited marketing resources. Despite strong interest in this concept, few approaches are available that can identify the predominant paths-to-purchase of a consumer population. The goal of the current research is to fill this gap by developing an approach that identifies a consumer population’s predominant paths-to-purchase from data commonly available in CRM systems.

We accomplish this goal in three steps. First, we propose a model that captures the interaction between a consumer’s different activities over time; this captures the consumer’s path. Second, we develop a clustering algorithm that identifies segments of consumers with similar paths-to-purchase. Finally, we extract the path-to-purchase for each segment as the sequence of activities that consumers in the cluster engage in before purchase.

2 Model Development

A popular model for estimating the interaction between multivariate time series variables is the Vector Autoregressive (VAR) model. However, most of the modeling apparatus for VAR is built on the assumption that the data are Gaussian. This assumption is reasonable for population-level statistics. However, when the data represent individual consumer activities (e.g., the number of online purchases, offline purchases, browses, or searches per week), they are often sparse, rendering Gaussian models inappropriate.
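To illustrate why sparse individual-level counts sit poorly with a Gaussian assumption, the following minimal sketch simulates weekly counts for a single consumer who is inactive in most weeks, then measures how much probability a moment-matched Gaussian would place on impossible negative counts. The function names, the activity probability, and the Poisson rate are illustrative assumptions, not values from our data.

```python
import math
import random


def simulate_sparse_counts(n_weeks, p_active, rate, seed=0):
    """Weekly counts that are zero unless the consumer is active that week.

    p_active and rate are hypothetical; they are not estimated from any data.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_weeks):
        if rng.random() < p_active:
            # Poisson draw via Knuth's multiplication method (fine for small rates).
            k, prod, limit = 0, rng.random(), math.exp(-rate)
            while prod > limit:
                k += 1
                prod *= rng.random()
            counts.append(k)
        else:
            counts.append(0)
    return counts


def gaussian_mass_below_zero(counts):
    """Probability that a Gaussian with the sample mean and std falls below zero."""
    n = len(counts)
    mean = sum(counts) / n
    std = math.sqrt(sum((c - mean) ** 2 for c in counts) / n)
    if std == 0:
        # Degenerate case: all counts equal, so no mass lies below zero.
        return 0.0
    return 0.5 * (1 + math.erf((0 - mean) / (std * math.sqrt(2))))


if __name__ == "__main__":
    # 104 weeks corresponds to two years of weekly observations.
    counts = simulate_sparse_counts(n_weeks=104, p_active=0.1, rate=1.5)
    print("share of zero weeks:", counts.count(0) / len(counts))
    print("Gaussian mass below zero:", gaussian_mass_below_zero(counts))
```

When most weeks are zero, the sample mean is small relative to the standard deviation, so the fitted Gaussian assigns a sizeable share of its mass to negative values that a count variable can never take.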
We propose a Zero-Inflated Multivariate Autoregressive Poisson (ZMAP) model to accommodate discreteness, sparseness, and both lagged and contemporaneous correlation in the count data. Specifically, ZMAP has three components: (1) zero-inflated Poisson distributions for the marginals of each component time series, to accommodate sparse count data; (2) autoregressive models for the occurrence of zeros and for the non-zero values, to capture the lagged effects of variables; and (3) a multivariate normal copula to capture contemporaneous correlation between the endogenous time series that is not captured by the lagged effects [Heinen and Rengifo 2007]. We further extend ZMAP to a mixture of ZMAP models. The data-generating process of the ZMAP mixture is described in Figure 1.

Figure 1. The Bayesian network of the Clustered ZMAP (CZMAP) model.

We adopt the Expectation-Maximization algorithm to estimate the parameters of the model [Dempster et al. 1977]. From the group-level ZMAP parameters we extract the predominant paths-to-purchase. We define the predominant path-to-purchase as the sequence of consumer activities that starts from a firm’s marketing activity and leads to the peak purchase response in the online or offline channel. Finding the predominant path-to-purchase can be decomposed into two tasks. The first is to measure the online/offline purchase response to marketing effort and find the period with the maximal response. The second is to attribute fractions of the maximal response to the unique sequences of activities that occur between the marketing activity and the peak response.

3 Empirical Application¹

We collected a dataset from a major multi-channel, multi-brand North American retailer covering marketing activity, customers’ website activities, customers’ purchases, and their demographic information. The dataset spans two years, which we segment into weekly data. The dataset used in this study includes 9,805 consumers from the largest brand of the retailer.
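The weekly segmentation can be sketched as follows. This is a minimal illustration that assumes a hypothetical event log of (consumer id, activity, date) records; the activity labels and the record layout are assumptions made for the example and do not describe the retailer's actual data schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity labels; the exact variable set is an assumption.
ACTIVITIES = ("email", "catalog", "browse", "online_purchase", "offline_purchase")


def weekly_counts(events, start, n_weeks):
    """Turn (consumer_id, activity, date) events into weekly count series.

    Returns {consumer_id: [[count per activity] for each week]}.
    Events outside the observation window or with unknown labels are skipped.
    """
    index = {name: j for j, name in enumerate(ACTIVITIES)}
    panel = defaultdict(lambda: [[0] * len(ACTIVITIES) for _ in range(n_weeks)])
    for consumer_id, activity, day in events:
        week = (day - start).days // 7  # 0-based week index from the start date
        if 0 <= week < n_weeks and activity in index:
            panel[consumer_id][week][index[activity]] += 1
    return dict(panel)


if __name__ == "__main__":
    # Made-up events purely for illustration.
    events = [
        ("c1", "email", date(2012, 1, 2)),
        ("c1", "browse", date(2012, 1, 3)),
        ("c1", "online_purchase", date(2012, 1, 10)),
    ]
    print(weekly_counts(events, start=date(2012, 1, 1), n_weeks=2))
```

Each consumer then contributes one multivariate count series with one row per week, which is the form of input a count-based autoregressive model operates on.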
Using the BIC, we determined the number of clusters to be 5 and the number of lags to be 1. The relative sizes of the clusters are {46.11%, 20.82%, 19.60%, 7.89%, 5.56%}. The paths-to-purchase from email and catalog to online and offline purchases are listed in Table 1. Each cluster has a different set of predominant paths-to-purchase. Due to space restrictions, we discuss only clusters 4 and 5.

Cluster 4. Loyal Active Consumers. For this group there are significant paths from email and catalog (external stimuli) to both online and offline purchases (responses). These customers live the farthest from the store, so it appears that they deliberate more and engage in online research before making offline purchases. In contrast, their path to online purchase is just one week long. The proportion of consumers who are loyalty cardholders is the largest in this cluster, as is the cumulative effect of email and catalog on purchases. We therefore name this cluster the “loyal active consumers.”

Cluster 5. Holiday Shoppers. For this group the online purchase response to marketing contact is insignificant and the offline purchase response is very small. 92.11% of their online purchases occur during the holidays. The few offline purchases these customers make after receiving an email or catalog occur after one week of online research. Since these consumers are the most active of all groups during the holidays and do most of their shopping then, we name them “holiday shoppers.”

Table 1. Paths-to-purchase of different clusters. The first column is the starting impulse. The last column is the peak purchase activity. The table cells describe paths as the sequence of activities that occur between the two end points. The parentheses contain the expected number of purchases and the corresponding % of the maximal purchase attributable to the path.
¹ Before applying the proposed approach to a dataset collected from a real-world setting, we verified that it is able to successfully recover parameters from a simulated dataset. The details of these experiments are omitted from this extended abstract to conserve space.

Based on the clustering result, we explore two applications in which our results help guide marketing decisions.

Application 1. Target Marketing: In our data, the company targets customers based on their Recency, Frequency, and Monetary value (RFM). Our path-based segmentation shows that the consumers in cluster 4 are the most sensitive to email and catalog. But are they better targets than the most valuable customers identified by the RFM metric? To test this, we select as many of the most valuable consumers by RFM as there are consumers in cluster 4 and compare the two groups’ cumulative impulse responses to a single email or catalog. The results are listed in Table 2. The average number of purchases made by consumers in cluster 4 is larger than that of consumers selected by RFM, and both groups have a higher response than consumers selected at random.

Table 2. The average number of long-term purchases for different consumer selections.

Application 2. Resource Allocation: Can we use the path-to-purchase to better decide when to send marketing communications? We consider a scenario in which a marketing manager is trying to achieve a 10% increase in sales over a 10-week planning horizon, a common application scenario [Hanssens et al. 2014]. Under this setting, we use dynamic programming to derive the optimal marketing mix for each cluster so that sales come as close to this target as possible. Due to limited space, Figure 2 lists the policy and average loss per person only for clusters 4 and 5. Note that despite the similar sales target, the dynamic programming algorithm recommends very different marketing policies for each cluster.
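As a rough illustration of how such a policy can be derived by dynamic programming, the sketch below runs backward induction over a 10-week horizon. It is not the procedure used in our study: the response model (a same-week lift plus a one-week-lagged lift per message type), the squared-deviation weekly loss, and all names and parameter values are assumptions introduced only for this example.

```python
from itertools import product

# Each action is (send_email, send_catalog) for one week.
ACTIONS = list(product((0, 1), repeat=2))


def plan_policy(base, now_lift, lag_lift, horizon=10, target_growth=0.10):
    """Backward induction; the state is the previous week's action.

    base      : baseline weekly sales per person (hypothetical)
    now_lift  : same-week sales lift for (email, catalog) (hypothetical)
    lag_lift  : one-week-lagged lift for (email, catalog) (hypothetical)
    Returns (list of weekly actions, minimal total loss).
    """
    target = base * (1 + target_growth)

    def weekly_loss(prev, cur):
        sales = base
        sales += sum(x * w for x, w in zip(cur, now_lift))   # same-week response
        sales += sum(x * w for x, w in zip(prev, lag_lift))  # lagged response
        return (sales - target) ** 2  # assumed loss: squared gap to weekly target

    # value[s] = minimal loss from the current week to the end, given previous action s
    value = {s: 0.0 for s in ACTIONS}
    best = []  # best[t][s] = optimal action in week t given previous action s
    for _ in range(horizon):  # iterate from the last week back to the first
        new_value, choice = {}, {}
        for s in ACTIONS:
            a = min(ACTIONS, key=lambda a: weekly_loss(s, a) + value[a])
            choice[s] = a
            new_value[s] = weekly_loss(s, a) + value[a]
        value = new_value
        best.insert(0, choice)

    # Roll the policy forward, assuming nothing was sent before week 1.
    state, plan = (0, 0), []
    for choice in best:
        state = choice[state]
        plan.append(state)
    return plan, value[(0, 0)]


if __name__ == "__main__":
    plan, loss = plan_policy(base=1.0, now_lift=(0.0, 0.3), lag_lift=(0.1, 0.0))
    print(plan, loss)
```

Because the lift parameters differ by cluster, running the same routine with cluster-specific values would generally yield different schedules even under an identical target.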
Figure 2 also lists the optimal marketing mix policy and loss for the scenario in which all consumers are treated as a single cluster. The loss is larger than in the clustered setting, indicating that employing different strategies for different clusters meets the target more precisely.

Figure 2. Optimal email and catalog policies for cluster 3 and cluster 4.

4 Summary

We present one of the first computable definitions of path-to-purchase. We propose an approach to estimate customers’ paths-to-purchase and to segment consumers into groups based on their unique paths. Using a CRM touch-point dataset, we show that different consumer groups have very different paths-to-purchase and that a firm can use this fact to target its marketing communications and optimize its marketing mix. (We are preparing the manuscript for a journal submission.)

References

David Court, Dave Elzinga, Susan Mulder, and Ole Jørgen Vetvik (2009). "The consumer decision journey." McKinsey Quarterly.

Heinen, A. and E. Rengifo (2007). "Multivariate autoregressive modeling of time series count data using copulas." Journal of Empirical Finance 14(4): 564-583.

Dempster, A. P., N. M. Laird and D. B. Rubin (1977). "Maximum Likelihood from Incomplete Data via the EM Algorithm." Journal of the Royal Statistical Society, Series B 39(1): 1-38.

Hanssens, D. M., K. H. Pauwels, S. Srinivasan, M. Vanhuele and G. Yildirim (2014). "Consumer Attitude Metrics for Guiding Marketing Mix Decisions." Marketing Science 33(4): 534-550.