Uncovering Path-to-Purchase Consumer Behaviors using Clustered Multivariate Autoregression

Yicheng Song, Nachiketa Sahoo, Shuba Srinivasan, Chrysanthos Dellarocas
School of Management, Boston University
1 Introduction
It is well known that, in the process of making a purchase, consumers move through a sequence of cognitive states,
such as awareness, familiarity, consideration, etc. The sequence of these states is often referred to as the consumer’s
path-to-purchase [Court et al. 2009]. By aligning a firm’s marketing strategies with a consumer’s path-to-purchase
position and trajectory, a firm can better use its limited marketing resources. Despite strong interest in this concept,
there are few approaches available that can identify the predominant path-to-purchase of a consumer population.
The goal of the current research is to fill this gap by developing an approach that identifies a consumer population’s
predominant paths-to-purchase from data commonly available from CRM systems. We accomplish this goal in three
steps. First, we propose a model to capture the interaction between different activities of a consumer over time. This
captures the path of a consumer. Second, we develop a clustering algorithm that identifies segments of consumers
with similar paths-to-purchase. Finally, we extract the path-to-purchase for each segment as a sequence of activities
that the consumers in the cluster engage in before purchase.
2 Model Development
A popular model for estimating the interaction between multivariate time series variables is the Vector
Autoregressive (VAR) model. However, most of the modeling apparatus for VAR is built on the assumption that
the data are Gaussian. This assumption is reasonable for population-level statistics. However, when the data represent
individual consumer activities (e.g., the number of online purchases, offline purchases, browses, or searches per week), they are
often sparse, rendering Gaussian models inappropriate. We propose a Zero-Inflated Multivariate Autoregressive
Poisson model (ZMAP) to accommodate discreteness, sparseness, and both lagged and contemporaneous correlation
in the count data. Specifically, the ZMAP has three components: (1) zero-inflated Poisson distributions for the
marginal distributions of each component time series to accommodate sparse count data, (2) autoregressive models
for the occurrence of zero and non-zero values to capture the lagged effects of variables, and (3) a multivariate
normal copula to capture the contemporaneous correlation between different endogenous time series that is not
captured by the lagged effects [Heinen and Rengifo 2007].
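The three components can be illustrated with a small simulation. The following is a minimal sketch rather than the authors' specification: the two series (browse and purchase), the log and logit link functions, and every parameter value are assumptions chosen only for illustration. It draws correlated normals (the copula), maps them to uniforms, and inverts a zero-inflated Poisson CDF whose rate and zero-inflation probability depend on last week's activity.

```python
import math
import random
from statistics import NormalDist


def zip_quantile(u, pi_zero, lam):
    """Smallest count y whose zero-inflated Poisson CDF is at least u."""
    if u <= pi_zero:                      # inside the structural-zero mass
        return 0
    target = (u - pi_zero) / (1.0 - pi_zero)  # rescale to the Poisson part
    y, pmf = 0, math.exp(-lam)
    cdf = pmf
    while cdf < target and y < 10_000:    # cap guards against rounding
        y += 1
        pmf *= lam / y
        cdf += pmf
    return y


def simulate(weeks, rho, seed=0):
    """Simulate two linked weekly count series (0 = browse, 1 = purchase)."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    # Hypothetical parameters, not estimates from the paper.
    base_rate = [0.0, -1.0]                 # intercepts of the log Poisson rate
    lag_rate = [[0.3, 0.0], [0.4, 0.2]]     # effect of log(1 + lagged counts)
    base_zero = [0.5, 1.5]                  # intercepts of the logit zero-inflation
    lag_zero = [[-1.0, 0.0], [-0.8, -0.5]]  # effect of lagged activity flags
    prev, out = [0, 0], []
    for _ in range(weeks):
        # Gaussian copula: correlated standard normals mapped to uniforms.
        e0, e1 = rng.gauss(0, 1), rng.gauss(0, 1)
        z = [e0, rho * e0 + math.sqrt(1 - rho ** 2) * e1]
        cur = []
        for i in range(2):
            log_lam = base_rate[i] + sum(
                lag_rate[i][j] * math.log1p(prev[j]) for j in range(2))
            logit = base_zero[i] + sum(
                lag_zero[i][j] * (prev[j] > 0) for j in range(2))
            pi_zero = 1.0 / (1.0 + math.exp(-logit))
            cur.append(zip_quantile(phi(z[i]), pi_zero, math.exp(log_lam)))
        out.append(tuple(cur))
        prev = cur
    return out
```

Calling `simulate(104, 0.6)` yields two years of weekly (browse, purchase) pairs that are mostly zeros, which is the kind of sparse count data that motivates moving away from a Gaussian VAR.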
We further extend ZMAP to a mixture of ZMAP models, so that each consumer segment has its own set of ZMAP parameters.
The data-generating process of the ZMAP mixture is described in Figure 1.
Figure 1. The Bayesian network of the Clustered ZMAP (CZMAP) model
We adopt the Expectation-Maximization algorithm to estimate the parameters of the model [Dempster et al. 1977].
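To show the shape of the E-step and M-step, here is a minimal sketch of EM for a much simpler mixture: each consumer's weekly counts are assumed to come from a plain Poisson distribution with a cluster-specific rate. This is a stand-in for illustration, not the CZMAP estimator; the function names and the toy data in the usage note are hypothetical.

```python
import math


def poisson_loglik(counts, rate):
    """Log-likelihood of one consumer's weekly counts under a Poisson rate."""
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def em_poisson_mixture(series, init_rates, n_iter=50):
    """Fit cluster weights and rates; return them with the responsibilities."""
    rates = list(init_rates)
    k = len(rates)
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each consumer,
        # computed in log space to avoid underflow.
        resp = []
        for counts in series:
            logp = [math.log(weights[c]) + poisson_loglik(counts, rates[c])
                    for c in range(k)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: responsibility-weighted updates of weights and rates.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            events = sum(r[c] * sum(counts) for r, counts in zip(resp, series))
            weeks = sum(r[c] * len(counts) for r, counts in zip(resp, series))
            weights[c] = max(mass / len(series), 1e-12)   # floor avoids log(0)
            rates[c] = max(events / max(weeks, 1e-12), 1e-12)
    return weights, rates, resp
```

For example, `em_poisson_mixture([[0, 0, 1, 0], [0, 1, 0, 0], [5, 6, 4, 7], [6, 5, 5, 4]], [1.0, 3.0])` separates the two low-activity consumers from the two high-activity ones. In the clustered model, the per-cluster Poisson rate would be replaced by a full set of ZMAP parameters.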
From the group-level ZMAP parameters we extract the predominant paths-to-purchase. We define the predominant
path-to-purchase as the sequence of consumer activities that starts from a firm’s marketing activity and leads to
the maximum purchase response in the online or offline channel. Finding the predominant path-to-purchase can be decomposed
into two tasks. The first is to measure the online or offline purchase response to marketing effort and find the
period with the maximal response. The second is to attribute fractions of the maximal response to unique sequences
of activities that occur between the marketing activity and the peak response.
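The two tasks can be sketched for a simplified case. The code below assumes a linear VAR with one lag in place of the nonlinear ZMAP, so the response of purchases h weeks after a unit impulse is the corresponding entry of the h-th power of the coefficient matrix, which equals the sum over all activity sequences of length h of the products of the coefficients along them. The variable names and the coefficient matrix `A` are hypothetical.

```python
def path_contributions(A, source, target, horizon):
    """Map each path source -> ... -> target of the given length to its weight.

    A[i][j] is the effect of variable j in week t-1 on variable i in week t.
    """
    paths = {}

    def extend(path, weight):
        if len(path) - 1 == horizon:
            if path[-1] == target:
                paths[tuple(path)] = weight
            return
        last = path[-1]
        for nxt in range(len(A)):
            coef = A[nxt][last]
            if coef != 0.0:               # skip links that carry no effect
                extend(path + [nxt], weight * coef)

    extend([source], 1.0)
    return paths


def peak_paths(A, source, target, max_horizon):
    """Find the horizon with the largest response and split it by path."""
    best_h, best_paths = 0, {}
    for h in range(1, max_horizon + 1):
        paths = path_contributions(A, source, target, h)
        if sum(paths.values()) > sum(best_paths.values()):
            best_h, best_paths = h, paths
    total = sum(best_paths.values())
    shares = {p: w / total for p, w in best_paths.items()}
    return best_h, total, shares


# Hypothetical coefficients, not estimates from the paper.
NAMES = ["email", "search", "browse", "purchase"]
A = [
    [0.0, 0.0, 0.0, 0.0],   # email is an external impulse
    [0.5, 0.2, 0.0, 0.0],   # search responds to email and past search
    [0.2, 0.6, 0.0, 0.0],   # browse responds to email and search
    [0.1, 0.1, 0.5, 0.0],   # purchase responds to email, search, browse
]
```

With these assumed coefficients, `peak_paths(A, 0, 3, 6)` places the peak purchase response three weeks after the email, and most of that response flows through the sequence email, search, browse, purchase.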
3 Empirical Application¹
We collect a dataset from a major multi-channel, multi-brand North American retailer on marketing activity,
customers’ website activities, customers’ purchases, and their demographic information. The dataset spans two
years, which we segment into weekly data. The dataset used in this study includes 9,805 consumers from the largest
brand of the retailer.
We determined the number of clusters and the number of lags to be 5 and 1, respectively, using the Bayesian
information criterion (BIC). The relative sizes of the clusters are {46.11%, 20.82%, 19.60%, 7.89%, 5.56%}. The
paths-to-purchase from email and catalog to online and offline purchases are listed in Table 1. Each cluster has a
different set of predominant paths-to-purchase. Due to space restrictions, we only discuss clusters 4 and 5.
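The model selection step can be sketched as follows, using the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations, and L the maximized likelihood. The helper names and the candidate grid in the usage note are hypothetical; the log-likelihoods and parameter counts would come from fitting the model for each combination of clusters and lags.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Pick the (clusters, lags) key with the smallest BIC.

    candidates maps (clusters, lags) -> (log_likelihood, n_params).
    """
    return min(candidates, key=lambda key: bic(*candidates[key], n_obs))
```

For instance, with made-up fits `{(1, 1): (-1000.0, 10), (2, 1): (-900.0, 21), (3, 1): (-895.0, 32)}` and 500 observations, `select_by_bic` returns `(2, 1)`: the third cluster improves the likelihood too little to offset the penalty on its extra parameters.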
Cluster 4. Loyal Active Consumers. For this group there are significant paths from email and catalog (external
stimuli) to both online and offline purchase (responses). These customers live the farthest from the store, which may
explain why they appear to deliberate more and engage in online research before making offline purchases. In contrast,
their path to online purchase is just one week long. We find that the proportion of consumers who are loyalty
cardholders is the largest in this cluster. The cumulative effect of email and catalog on purchases is also the
largest in this cluster. We name this cluster the “loyal active consumers.”
Cluster 5. Holiday Shoppers. The online purchase response to marketing contact is insignificant and the offline
purchase response is very small for this group. 92.11% of their online purchases occur during the holidays. The few
offline purchases that these customers make after receiving an email or catalog occur after one week of online research. Since
these consumers are the most active of all groups during the holidays and do most of their shopping then,
we name them “holiday shoppers.”
Table 1. Paths-to-purchase of different clusters. The first column is the starting impulse. The last column is
the peak purchase activity. The table cells describe each path as the sequence of activities that occur between
the two end points. The parentheses contain the expected number of purchases and the corresponding % of
the maximal purchase attributable to the path.
¹ Before applying the proposed approach to a dataset collected from a real-world setting, we verified that it is able to successfully recover
parameters from a simulated dataset. The details of these experiments are omitted from this extended abstract to conserve space.
Based on the clustering results, we explore two applications in which our findings help guide marketing decisions.
Application 1. Target Marketing: In our data, the company has adopted the strategy of targeting customers based
on their Recency, Frequency, and Monetary value (RFM). From our path-based segmentation we find that the
consumers in cluster 4 are the most sensitive to email and catalog. But are they better targets than the most valuable
customers identified by the RFM metric? We test this by selecting as many of the most valuable consumers by
RFM as there are consumers in cluster 4 and comparing the two groups’ cumulative impulse responses to a single email or catalog. The results
are listed in Table 2. We find that the average number of purchases made by consumers in cluster 4 is larger than that of consumers
selected by RFM. Both groups have a higher response than consumers selected at random.
Table 2. The average number of long-term purchases for different consumer selections.
Application 2. Resource Allocation: Can we use the path-to-purchase to better decide when to send marketing
communications? We consider a scenario in which a marketing manager is trying to achieve a 10% increase in sales
over a 10-week planning horizon. This is a common application scenario [Hanssens et al. 2014]. Under this setting
we derive the optimal marketing mix for each cluster using dynamic programming so that we come as close to this
target as possible. Due to limited space, only the policy and the average loss per person for clusters 4 and 5 are shown in
Figure 2. Note that despite the similar sales target, the dynamic programming algorithm recommends very different
marketing policies for each cluster. The optimal marketing mix policy and loss for the scenario in which all consumers
are treated as a single cluster are also shown in Figure 2. The loss is larger than in the clustered setting, indicating that
employing different strategies for different clusters meets the target more precisely.
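The planning problem can be illustrated with a toy dynamic program. This sketch is not the authors' formulation: it assumes that each send produces a fixed, deterministic stream of incremental sales over the following weeks, that one action is chosen per week, and that the loss is the absolute gap between total incremental sales and the target. The response profiles in `PROFILES` are hypothetical; in practice they would be derived from each cluster's estimated impulse responses.

```python
from functools import lru_cache


def gain(profile, week, horizon):
    """Sales from a send in `week` that land inside the planning horizon.

    profile[k] is the incremental sales k + 1 weeks after the send.
    """
    return sum(profile[: max(horizon - 1 - week, 0)])


def plan(profiles, horizon, target):
    """Return (loss, policy) minimizing |total sales - target|."""
    actions = list(profiles)

    @lru_cache(maxsize=None)
    def best(week, total):
        # State: current week and incremental sales already committed.
        if week == horizon:
            return abs(total - target), ()
        options = []
        for action in actions:
            step = gain(profiles[action], week, horizon)
            loss, tail = best(week + 1, total + step)
            options.append((loss, (action,) + tail))
        return min(options, key=lambda option: option[0])

    return best(0, 0)


# Hypothetical response profiles in integer sales units.
PROFILES = {"none": [], "email": [2, 1], "catalog": [1, 3, 2]}
loss, policy = plan(PROFILES, horizon=10, target=12)
```

Running `plan` with a different profile set for each cluster gives a different weekly schedule for each, which mirrors the observation that the same sales target leads to different policies across clusters.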
Figure 2. Optimal email and catalog policies for cluster 4 and cluster 5.
4 Summary
We present one of the first computable definitions of path-to-purchase. We propose an approach to estimate
customers’ paths-to-purchase and segment consumer groups based on their unique paths. Using a CRM touch-point
dataset, we show that different consumer groups have very different paths to purchase and that this fact can be
successfully used by the firm to target its marketing communications and to optimize its marketing mix.
(We are preparing the manuscript for a journal submission.)
Court, D., D. Elzinga, S. Mulder and O. J. Vetvik (2009). "The consumer decision journey." McKinsey Quarterly.
Heinen, A. and E. Rengifo (2007). "Multivariate autoregressive modeling of time series count data using copulas."
Journal of Empirical Finance 14(4): 564-583.
Dempster, A. P., N. M. Laird and D. B. Rubin (1977). "Maximum likelihood from incomplete data via the EM
algorithm." Journal of the Royal Statistical Society, Series B 39(1): 1-38.
Hanssens, D. M., K. H. Pauwels, S. Srinivasan, M. Vanhuele and G. Yildirim (2014). "Consumer Attitude
Metrics for Guiding Marketing Mix Decisions." Marketing Science 33(4): 534-550.