Modeling Heterogeneity in Multi-Stage Purchase Decisions: The

___________________________________________________________________________________________________
Multi-stage purchase decision models
Multi-stage purchase decision models: Accommodating response
heterogeneity, common demand shocks, and endogeneity using
disaggregate data
Rick L. Andrews
Department of Business Administration, University of Delaware, Newark, DE 19716 U.S.A.
andrews@lerner.udel.edu Tel.: +1 302 831 1190
Imran S. Currim
Paul Merage School of Business, University of California, Irvine, California 92697-3125 U.S.A.
iscurrim@uci.edu Tel.: +1 949 824 8368
Abstract
The most comprehensive models of purchase behavior for frequently purchased supermarket items explain
households’ purchase incidence decisions (whether to buy), brand choice decisions (what to buy), and purchase
quantity decisions (how much to buy). In this study we develop a three-stage purchase incidence/brand
choice/purchase quantity model for household-level data in which all three stages are specified with (i) random
coefficients distributions for model covariates and (ii) random effects distributions to account for unobserved
factors affecting demand (known as common demand shocks), while also (iii) controlling for the effects of
endogeneity in prices. Compared to current state-of-the-art models for multi-stage purchase decisions, the
results show improvements in fit and forecasting accuracy when purchase behaviors are modeled with all of
these components in combination. Perhaps more importantly, when common demand shocks are ignored,
substantial differences in parameter estimates and diagnostic information about consumer behavior are likely
(median differences in parameter estimates are 10% and 20% in two product categories), which impact
managerial deliberations about price and promotion policies. Further, failure to account for common demand
shocks affects the mean and variance of random coefficients distributions in unpredictable directions, which could
produce results that encourage managers to pursue inappropriate and costly micro-level product marketing
strategies.
Keywords: Common demand shocks; Response heterogeneity; Endogeneity
______________________________________________________________________________________
1.0. Introduction
A typical supermarket purchase decision consists of at least three stages. First, a consumer
decides whether to make a purchase in a particular product class on a given store visit. If the
consumer decides to make a purchase, s/he must also decide which brand to purchase and how many
units of the brand to purchase (though not necessarily in this order). Focusing only on the brand
choice decision, which is typical in the marketing literature, is insufficient for fully understanding
consumer purchase behavior. Therefore, a number of studies develop joint models for scanner panel
data to describe and explain more than one stage of the purchase process (Gupta, 1988; Krishnamurthi
& Raj, 1988; Bucklin & Lattin, 1991; Chiang, 1991; Bucklin & Gupta, 1992; Böckenholt, 1993a,b;
Chintagunta, 1993; Dillon & Gupta, 1996; Ailawadi & Neslin, 1998; Bucklin, Gupta, & Siddarth,
1998; Chib, Seetharaman, & Strijnev, 2004; Mehta, 2007).
Consider as an exemplar the model by Bucklin, Gupta, and Siddarth (1998, BGS hereafter).
The covariates consist of static variables designed to capture purely cross-sectional heterogeneity (e.g.,
household brand loyalty, purchase and consumption rates), dynamic variables designed to capture
time-variation in purchase behavior (e.g., household inventory levels, the last brand purchased), and
store-level marketing mix variables (e.g., prices and promotions of various brands) to capture factors
unique to the purchase environment.
However, it is clear from previous studies that covariates alone, whatever they may be, are
insufficient for explaining variation in purchase behaviors observed in scanner panel data. Cross-sectional heterogeneity in households' responses to covariates must be accommodated also. The BGS
study develops a finite mixture model to explain heterogeneity in purchase incidence, brand choice,
and purchase quantity decisions, producing segment-level estimates. Their model shows dramatic
gains in explanatory power. The study by Arora, Allenby, and Ginter (1998) obtains individual-level
estimates from continuous random distributions for coefficients in a model that integrates choice and
quantity decisions. Their model demonstrates in-sample fit and predictive performance superior to
that of aggregate and finite mixture models,1 though this may or may not be the case more generally.
1 The model developed by Arora, Allenby, and Ginter was applied to survey data, not scanner panel data.
Regardless of the modeling approach used to capture cross-sectional heterogeneity in
responses to the covariates, existing models for scanner panel data rely on the assumption that the
covariates completely explain the variation in purchase incidence, brand choice probabilities, and
purchase rates. However, factors that are not observed by the researcher may affect purchase
behaviors. For logit brand choice models, these factors may be of several kinds (Chintagunta, Dubé,
& Goh, 2005). Factors that vary across brands, households, and time will be absorbed by the extreme
value error term. Factors that vary across brands but are invariant across households and time will be
reflected in the brand-specific constants. In a random coefficients logit specification, factors that vary
across brands and households but not time will be accommodated through the heterogeneous brand-specific constants. Factors that are household-specific but invariant across brands and time will drop
out of the expression for the logit probabilities unless unique coefficients are estimated for each brand,
which is not typical in the literature. What is not accounted for are characteristics that vary across
brands and time but are invariant across households. Examples of such factors could include
difficult-to-quantify aspects such as advertising or coupon availability, changes in prices of competing goods,
and changes in the economic outlook or weather (Kuksov & Villas-Boas, 2008; Villas-Boas & Winer,
1999). We refer to such factors as common demand shocks.
Chintagunta, Dubé, and Goh (2005) document the importance of accounting for common
demand shocks in household-level brand choice models. They find that ignoring such characteristics
may lead to biased estimates of mean price response parameters and inflated estimates of the variances
in the heterogeneity distributions of price sensitivities across households. Their explanation for this
finding is that, given a total variance associated with the systematic component of brand
utility, ignoring the variance produced by the common demand shocks causes the variance of the
heterogeneity distributions to be inflated. Inflated variances in heterogeneity distributions could cause
the benefits from marketing activities such as household-level targeting to be overstated.
Closely related to the problem of unobserved factors affecting demand is the endogeneity
problem, in which unobserved factors affect both demand and prices simultaneously. For example,
unobserved factors such as style, prestige, or reputation might result in higher prices for a product and
higher demand for that product. The existence of such unobserved factors causes price to be
correlated with the error term of the demand model. If not addressed properly, endogeneity can bias
the price elasticities and subsequent optimization of the marketing mix. In practice, instrumental
variables estimation techniques are often used to remedy an endogeneity problem. The study by
Villas-Boas and Winer (1999) uses an instrumental variables approach to address price endogeneity in
a brand choice model applied to disaggregate data, but it does not consider the purchase incidence or
purchase quantity components of purchase decisions, nor does it accommodate cross-sectional
heterogeneity in consumer preferences or responses.
It is possible that two types of common demand shocks exist: those that affect demand but not
price and those that affect both demand and price, in which case it may be necessary to account for
common demand shocks and endogeneity separately. In this study we develop a three-stage purchase
incidence/purchase quantity/brand choice model for panel data specified with (i) random coefficients
distributions for the covariates and (ii) random effects distributions to account for common demand
shocks in all three stages of the model, while (iii) controlling for the effects of endogeneity in prices as
well. We compare the explanatory power, impact on coefficient estimates, and impact on prediction
accuracy of these model components using scanner panel data from two product categories.
Regarding random effects distributions for common demand shocks, we introduce random
effects distributions into the brand choice and purchase quantity models to account for unobserved
factors that vary across brands and weeks but are common across households. For the purchase
incidence model, which is not brand-specific, the random effects account for demand shocks across
weeks only, capturing seasonality in demand. The study by Chintagunta, Dubé, and Goh (2005)
examines the effects of unmeasured brand characteristics in brand choice models but does not consider
such effects in purchase incidence or purchase quantity models (though their analysis does include a
"no purchase" option). The proposed model will allow us to assess the degree to which random
coefficients distributions might be capturing common demand shocks in multi-stage models rather
than true cross-sectional heterogeneity in responses to the covariates. If so, the means and variances
of the random coefficients distributions could be biased, as Chintagunta, Dubé, and Goh (2005) found
to be the case in brand choice models. This would suggest that it is important for multi-stage purchase
decision models to also account for common demand shocks so that the resulting portrait of consumer
behavior and implications for managerial product marketing decisions are more accurate.
The common demand shocks described above account for unobserved factors that affect
demand but not prices. To account for unobserved factors that affect demand and prices
simultaneously, we use a two-stage instrumental variables approach that uses readily available
instruments. The instruments are first used in a regression model to predict actual prices, and the
predicted prices from the regression are then used in the brand choice and purchase quantity models in
place of the actual prices. In addition, the correction for endogeneity may affect the purchase
incidence component of the model indirectly through the category values of the nested logit
formulation, which are based on the utilities from the brand choice component. No previous literature
on multi-stage purchase decision models for disaggregate panel data has controlled for endogeneity in
prices, so it will be interesting to see whether the correction improves model estimates.
The study by Nair, Dubé, and Chintagunta (2005) develops a model that accommodates
endogeneity and heterogeneity for purchase incidence, brand choice, and purchase quantity decisions,
when only aggregate data are available. Though the model is formulated at the consumer level, its
implementation occurs at the aggregate level. Nair, Dubé, and Chintagunta motivate their work by
stating “While individual-level data are preferred, in many instances such disaggregate information
may not be available” (p. 445). Our work, which is complementary to theirs, will be useful in the
many settings in which disaggregate information is available.
In summary, the current study develops a model for repeated multi-stage decision making at
the disaggregate level which accounts for heterogeneity in consumer preferences and responses to
covariates, for unobserved factors such as coupon availability for brands and in-store effects such as
shelf space that affect demand, and also for unobserved factors that affect both demand and prices. In
the next section, we describe the model specifications, followed by applications of the models to
scanner panel data for two product categories, paper towels and margarine. Finally, we summarize the
contributions, limitations, and implications for future research.
2.0. Modeling approach
Following the BGS study, we conceptualize the consumer purchase decision as consisting of
three stages: purchase incidence, brand choice, and purchase quantity. Given a store visit, a shopper
decides whether to make a purchase in the product category in question. If the consumer decides to
make a purchase, s/he decides which brand to buy and how many units of the brand to buy. In the
following paragraphs, we describe the specification of the base model, which is based heavily on BGS.
Then, we discuss extensions of the model that account for cross-sectional heterogeneity, common
demand shocks, and endogeneity.
2.1. Base model
In this section, we discuss the base specification of the brand choice model, the incidence
model, and the purchase quantity model.
Brand choice model. The probability that household h buys brand i on a store visit at time t,
given a decision to buy in the product category (purchase incidence), is given by the multinomial logit
model:
$$P_t^h(i \mid \text{inc}) = \frac{\exp(U_{it}^h)}{\sum_k \exp(U_{kt}^h)}, \tag{1}$$
where the deterministic portion of household h’s utility for brand i at time t is a function of the
following covariates:
$$U_{it}^h = \beta_{0i} + \beta_1 BL_i^h + \beta_2 LBP_{it}^h + \beta_3 PRICE_{its} + \beta_4 PROM_{its}. \tag{2}$$
The $\beta_{0i}$ are brand-specific constants. Brand loyalty ($BL_i^h$) is the within-household market share of
each brand during the 60-week initialization period prior to the period used for estimating model
parameters, which should be positively related to utility. The brand loyalty measure should capture
purely cross-sectional heterogeneity in preferences. Last Brand Purchased ($LBP_{it}^h$) measures the
time-varying heterogeneity in a household’s preference and should be positively related to utility (e.g.,
Ailawadi, Gedenk, & Neslin, 1999; Heilman & Bowman, 2002; Seetharaman, 2003). $PRICE_{its}$ is the
actual shelf price (including temporary discounts) in the store s in which household h shops, which
should be negatively related to utility. $PROM_{its}$ is a 0/1 indicator variable for an in-store display,
which should be positively related to utility (e.g., Degeratu, Rangaswamy, & Wu, 2000).
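As an illustration of the brand choice specification above, the following minimal Python sketch computes logit choice probabilities from utilities of the form in equation (2). The function names, coefficient values, and covariate values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def brand_utility(intercept, coefs, bl, lbp, price, prom):
    """Deterministic brand utility in the form of equation (2)."""
    b_bl, b_lbp, b_price, b_prom = coefs
    return intercept + b_bl * bl + b_lbp * lbp + b_price * price + b_prom * prom

def choice_probabilities(utilities):
    """Multinomial logit probabilities of equation (1), conditional on incidence."""
    top = max(utilities)  # subtracting the maximum avoids overflow in exp()
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical inputs: three brands, one household, one store visit.
coefs = (2.0, 0.5, -3.0, 1.0)      # loyalty, last brand purchased, price, promotion
intercepts = [0.0, 0.3, -0.2]      # brand-specific constants
covariates = [                     # (BL, LBP, PRICE, PROM) for each brand
    (0.6, 1, 0.89, 0),
    (0.3, 0, 0.79, 1),
    (0.1, 0, 0.69, 0),
]
utilities = [brand_utility(a, coefs, *x) for a, x in zip(intercepts, covariates)]
probs = choice_probabilities(utilities)
```

With these illustrative values, the second brand has the largest choice probability because its display more than offsets its lower loyalty share.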
Purchase incidence model. The probability that household h decides to make a purchase in
the product category of interest during the store trip at time t is
$$P_t^h(\text{inc}) = \frac{\exp(V_t^h)}{1 + \exp(V_t^h)}, \tag{3}$$
where the deterministic portion of utility that household h obtains from making the purchase is given
by
$$V_t^h = \gamma_0 + \gamma_1 CR^h + \gamma_2 INV_t^h + \gamma_3 CV_t^h. \tag{4}$$
Consumption rate, $CR^h$, is a household’s average weekly consumption of the product, calculated as the
total amount of the product purchased by the household during the initialization period divided by the
number of weeks in the period (60). Consumption rate, which is a purely cross-sectional measure of
consumer heterogeneity in purchase incidence probabilities, should be positively related to incidence
utility. The household inventory variable, $INV_t^h$, is designed to capture time-varying heterogeneity in
incidence probabilities. To construct the inventory variable, we assume that households draw down
their supply linearly at their rates of consumption, $CR^h$. We initialize our inventory measure at zero at
the start of the initialization period. We also mean-center $INV_t^h$ by subtracting each household's
average level of inventory during the calibration period. This makes the measure purely longitudinal,
so that $INV_t^h$ becomes a measure of relative inventory in a household (see BGS). Inventory should be
negatively related to incidence utility. The category value variable, $CV_t^h$, measures the attractiveness
of the product category in a nested logit incidence model. It is calculated as the log of the
denominator of the brand choice model (eq. 1), and hence the variables affecting brand choice
indirectly affect incidence through the category value.
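The link between brand choice and incidence through the category value can be sketched as follows. This is a minimal Python illustration under assumed inputs: the coefficient values, utilities, and function names are hypothetical placeholders rather than quantities from the paper.

```python
import math

def category_value(brand_utilities):
    """Log of the denominator of the brand choice model (a log-sum-exp)."""
    top = max(brand_utilities)
    return top + math.log(sum(math.exp(u - top) for u in brand_utilities))

def incidence_probability(coefs, cr, inv, cv):
    """Binary logit incidence probability in the form of equations (3) and (4)."""
    g0, g_cr, g_inv, g_cv = coefs
    v = g0 + g_cr * cr + g_inv * inv + g_cv * cv
    return 1.0 / (1.0 + math.exp(-v))

coefs = (-3.0, 0.8, -0.5, 0.6)          # hypothetical incidence coefficients
regular_week = [-0.97, -1.47, -2.07]    # hypothetical brand utilities
display_week = [-0.97, -0.47, -2.07]    # second brand gains a display

p_regular = incidence_probability(coefs, cr=1.2, inv=0.0, cv=category_value(regular_week))
p_display = incidence_probability(coefs, cr=1.2, inv=0.0, cv=category_value(display_week))
p_stocked = incidence_probability(coefs, cr=1.2, inv=2.0, cv=category_value(display_week))
```

In this sketch, a display on one brand raises the category value and hence the incidence probability, while a higher relative inventory lowers it, which matches the expected signs described above.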
Quantity model. Given purchase incidence and choice of brand i, the probability that
household h buys $q_{it}^h$ units of brand i at store visit t is modeled using a Poisson distribution with
truncation of the zero outcome:
$$P(Q_{it}^h = q_{it}^h \mid Q_{it}^h > 0) = \frac{\exp(-\lambda_{it}^h)\,(\lambda_{it}^h)^{q_{it}^h}}{[1 - \exp(-\lambda_{it}^h)]\; q_{it}^h!}, \tag{5}$$
where the Poisson rate parameter is parameterized with covariates as follows:
$$\lambda_{it}^h = \exp\left(\theta_{0i} + \theta_1 PR^h + \theta_2 INV_t^h + \theta_3 BL_i^h + \theta_4 PRICE_{its} + \theta_5 PROM_{its}\right). \tag{6}$$
Purchase rate, $PR^h$, is defined as the average quantity of the product purchased by the household
during the initialization period, given that a purchase was made. This variable captures purely
cross-sectional heterogeneity in purchase quantities and should be positively related to the rate parameter.
Inventory, which should capture time-varying heterogeneity in purchase rates, should be negatively
related to the purchase rate. Brand loyalty, price, and promotion (all defined earlier) should have
positive, negative, and positive relationships, respectively, with purchase rate.
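The zero-truncated Poisson quantity model can be illustrated with a short Python sketch. The coefficient and covariate values below are hypothetical, and the function names are introduced here only for illustration.

```python
import math

def purchase_rate(intercept, coefs, pr, inv, bl, price, prom):
    """Poisson rate parameter as the exponential of a linear index, as in equation (6)."""
    t_pr, t_inv, t_bl, t_price, t_prom = coefs
    index = intercept + t_pr * pr + t_inv * inv + t_bl * bl + t_price * price + t_prom * prom
    return math.exp(index)

def truncated_poisson_pmf(q, lam):
    """Probability of buying q units given that at least one unit is bought, as in equation (5)."""
    if q < 1:
        return 0.0
    # -expm1(-lam) equals 1 - exp(-lam), computed accurately for small lam.
    log_p = -lam + q * math.log(lam) - math.lgamma(q + 1) - math.log(-math.expm1(-lam))
    return math.exp(log_p)

def truncated_poisson_mean(lam):
    """Expected quantity under the zero-truncated Poisson distribution."""
    return lam / (-math.expm1(-lam))

# Hypothetical household, brand, and store visit.
lam = purchase_rate(0.2, (0.3, -0.1, 0.4, -1.0, 0.3),
                    pr=1.5, inv=0.0, bl=0.6, price=0.89, prom=1)
pmf = [truncated_poisson_pmf(q, lam) for q in range(1, 81)]
expected_units = truncated_poisson_mean(lam)
```

Because the zero outcome is removed, the expected quantity given a purchase exceeds the untruncated rate parameter.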
2.2. Capturing heterogeneity in consumer preferences and responses
To capture heterogeneity in consumer decision processes, one could also allow model
coefficients to follow a distribution such as the multivariate normal distribution. Brand choice utilities
(equation 2) become
$$U_{it}^h = \beta_{0i}^h + \beta_1^h BL_i^h + \beta_2^h LBP_{it}^h + \beta_3^h PRICE_{its} + \beta_4^h PROM_{its}, \tag{7}$$
where $\boldsymbol{\beta}^h$ is described by a normal density $\boldsymbol{\beta}^h \sim N(\bar{\boldsymbol{\beta}}, \mathbf{W}_\beta)$ with mean vector $\bar{\boldsymbol{\beta}}$ and covariance
matrix $\mathbf{W}_\beta$. Equations (4) and (6) are modified similarly. We stack the vectors of parameters from
all three stages of the model and estimate one large covariance matrix for maximum flexibility.
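To make the random coefficients idea concrete, the sketch below draws household-level coefficient vectors from a multivariate normal distribution using a Cholesky factor. It is a minimal illustration in plain Python; the two-element mean vector and the covariance matrix are hypothetical, whereas the model described above stacks coefficients from all three stages.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular factor L with L L' = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low

def draw_coefficients(mean, low, rng):
    """One household's coefficient vector: mean plus L times standard normal draws."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]

mean = [-3.0, 1.0]                   # hypothetical mean price and promotion coefficients
cov = [[1.0, 0.3], [0.3, 0.5]]       # hypothetical covariance matrix
low = cholesky(cov)
rng = random.Random(0)
households = [draw_coefficients(mean, low, rng) for _ in range(5000)]
sample_mean = [sum(h[j] for h in households) / len(households) for j in range(2)]
```

Each simulated household receives its own price and promotion sensitivities, while the sample averages stay close to the population mean vector.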
2.3. Capturing common demand shocks
To account for unobserved factors affecting demand and varying across brands and weeks but
not households, we add a stochastic term to the deterministic portion of utility for the choice model, so
that equation (7) becomes
$$U_{it}^h = \beta_{0i}^h + \beta_1^h BL_i^h + \beta_2^h LBP_{it}^h + \beta_3^h PRICE_{its} + \beta_4^h PROM_{its} + \varepsilon_{it}. \tag{8}$$
The vector of common demand shocks for week t, $\boldsymbol{\varepsilon}_t$, is described by a normal density $\boldsymbol{\varepsilon}_t \sim N(\mathbf{0}, \mathbf{W}_\varepsilon)$,
with a zero mean vector and covariance matrix $\mathbf{W}_\varepsilon$. Means other than zero for the random effects
would not be identified. Note that $\varepsilon_{it}$ is identifiable since there are no other random terms in equation
(8) that vary across brands and weeks; the extreme value error term that will be added to equation (8)
to get the familiar logit formulation will vary across brands, weeks, and households and is therefore
distinct from $\varepsilon_{it}$. Analogous random effects distributions are added to the expressions for the
purchase rates in equation (6). For the incidence model, which is not brand-specific, only one random
effect term (i.e., one variance parameter) is needed.
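The defining property of a common demand shock is that the same brand-week term enters every household's utility. The following minimal Python sketch shows this under a simplifying assumption that is not made in the model above: shocks are drawn independently across brands (a diagonal covariance matrix). All numerical values and function names are hypothetical.

```python
import random

def draw_common_shocks(n_weeks, shock_sds, rng):
    """Draw one shock per brand and week; independence across brands is assumed here."""
    return [[rng.gauss(0.0, sd) for sd in shock_sds] for _ in range(n_weeks)]

def shocked_utilities(household_utilities, week_shocks):
    """Add the same brand-level shocks to every household's utilities for one week."""
    return [[u + e for u, e in zip(row, week_shocks)] for row in household_utilities]

rng = random.Random(1)
shocks = draw_common_shocks(n_weeks=52, shock_sds=[0.5, 0.5, 0.5], rng=rng)
base = [[-0.97, -0.47, -2.07],     # household 1, hypothetical utilities for three brands
        [-1.50, -0.20, -0.90]]     # household 2
week0 = shocked_utilities(base, shocks[0])
shifts = [[s - b for s, b in zip(srow, brow)] for srow, brow in zip(week0, base)]
```

Both households experience identical utility shifts in a given week, which is what distinguishes these terms from household-specific random coefficients.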
2.4. Capturing endogeneity
We use an instrumental variables approach to account for factors that affect both demand and
prices and thereby cause an endogeneity problem. The instruments used to correct for endogeneity,
which we call store mean-corrected prices, are given by $\tilde{Z}_{its} = PRICE_{its} - \bar{P}_{it}$, where $\bar{P}_{it}$ is the
average price of brand i at time t across stores s. It is possible to show (Andrews & Ebbes, 2009) that
these instruments are uncorrelated with the unobserved factor affecting prices and demand and that the
instruments are also at least moderately correlated with actual prices, satisfying both the properties of
desirable instruments. With these instruments, either a simultaneous equation estimation or a two-stage
estimation procedure produces good results for logit-based demand models (Andrews & Ebbes,
2009).
When $PRICE_{its}$ is regressed on the instruments $\tilde{Z}_{its}$, i.e.,
$$PRICE_{its} = \alpha_{0i} + \alpha_1 \tilde{Z}_{its} + \eta_{its}, \tag{9}$$
it can be shown that the OLS estimates are always $\hat{\alpha}_{0i} = \bar{P}_i$ and $\hat{\alpha}_1 = 1$, so the price equation error
term is always $\eta_{its} = \eta_{it} = \bar{P}_{it} - \bar{P}_i$, which does not vary across stores. A simultaneous equation
approach for accounting for the endogeneity would involve allowing an additional common demand
shock $\xi_{it}$ in equation (8),
$$U_{it}^h = \beta_{0i}^h + \beta_1^h BL_i^h + \beta_2^h LBP_{it}^h + \beta_3^h PRICE_{its} + \beta_4^h PROM_{its} + \varepsilon_{it} + \xi_{it} \tag{10}$$
and allowing it to be correlated with $\eta_{it}$, for example, $\xi_{it} = \rho\,\eta_{it} + \nu_{it}$, where $\nu_{it} \sim N(0, \sigma^2)$.
However, potential identification problems could result between $\varepsilon_{it}$ and $\xi_{it}$ with the simultaneous
equation approach. Instead, we use a two-stage estimation approach to avoid the possibility of
identification problems. This approach inserts predicted values $\widehat{PRICE}_{its}$ from equation (9) in the
model instead of actual prices, resulting in
$$U_{it}^h = \beta_{0i}^h + \beta_1^h BL_i^h + \beta_2^h LBP_{it}^h + \beta_3^h \widehat{PRICE}_{its} + \beta_4^h PROM_{its} + \varepsilon_{it}. \tag{11}$$
We use the predicted prices in both the brand choice and purchase quantity models to correct for
potential endogeneity problems in both components of the model. In addition, since the purchase
incidence model depends on the category values $CV_t^h$, which in turn depend on predicted prices, the
estimates in the purchase incidence model may be affected as well.
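The first stage of the two-stage approach can be checked numerically. The sketch below, written in plain Python with a hypothetical three-week, three-store price series for one brand, builds store mean-corrected prices, regresses price on them, and recovers a slope of one, an intercept equal to the brand's overall mean price, and residuals that equal the weekly mean price minus the overall mean price. The helper names are introduced here for illustration only.

```python
def store_mean_corrected(prices):
    """prices[t][s] is one brand's price in week t and store s; subtract the weekly mean."""
    return [[p - sum(week) / len(week) for p in week] for week in prices]

def ols_line(x, y):
    """Intercept and slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical prices: rows are weeks, columns are stores.
prices = [[0.89, 0.85, 0.99],
          [0.79, 0.75, 0.89],
          [0.95, 0.99, 0.91]]
z = store_mean_corrected(prices)
flat_p = [p for week in prices for p in week]
flat_z = [v for week in z for v in week]

intercept, slope = ols_line(flat_z, flat_p)
predicted = [intercept + slope * v for v in flat_z]      # second-stage price inputs
residuals = [p - f for p, f in zip(flat_p, predicted)]   # constant within a week
overall_mean = sum(flat_p) / len(flat_p)
week_means = [sum(week) / len(week) for week in prices]
```

The predicted prices therefore equal actual prices with the brand-week deviation removed, which is the component assumed to carry the unobserved factor that affects both prices and demand.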
2.5. Estimation
Our approach to estimation is Bayesian. Markov Chain Monte Carlo (MCMC) sampling is
used to generate draws from the posterior densities of various sets of model parameters conditional on
other sets of model parameters. A Metropolis-Hastings (MH) algorithm is used to draw the
multivariate normal response coefficients (the same type of MH algorithm is used to draw the
multivariate normal random effects for capturing common demand shocks). For the random
coefficients distributions, we use the same parameters for the normal mean prior and inverse Wishart
covariance prior as those used by Chiang, Chib, and Narasimhan (1999). These hyperparameter
values ensure that the prior distributions are proper but weak, thus allowing the data to dominate the
results. Note that the parameters for the incidence model, the brand choice model, and the purchase
quantity model are drawn simultaneously to accommodate covariances among parameters in different
stages of the choice process. We use 20,000 iterations of the sampler for burn in and an additional
5,000 iterations for collection of information on the posterior distributions of parameters.
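For readers unfamiliar with Metropolis-Hastings sampling, the following sketch shows a generic random-walk sampler with a burn-in phase. It is not the sampler used in this study: it draws a single coefficient of a simple binary logit with a weak normal prior, uses simulated data, and runs far fewer iterations than the 20,000 plus 5,000 reported above. The step size, prior standard deviation, and function names are assumptions made for illustration.

```python
import math
import random

def log_posterior(b, data, prior_sd):
    """Binary logit log likelihood plus a normal(0, prior_sd^2) log prior, up to a constant."""
    total = -0.5 * (b / prior_sd) ** 2
    for x, y in data:
        v = b * x
        softplus = max(v, 0.0) + math.log1p(math.exp(-abs(v)))  # stable log(1 + exp(v))
        total += y * v - softplus
    return total

def random_walk_metropolis(data, n_burn, n_keep, step, prior_sd, rng):
    """Return the retained draws and the overall acceptance rate."""
    current = 0.0
    current_lp = log_posterior(current, data, prior_sd)
    draws, accepted = [], 0
    for it in range(n_burn + n_keep):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data, prior_sd)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        if it >= n_burn:
            draws.append(current)
    return draws, accepted / (n_burn + n_keep)

# Simulated data with a true coefficient of -1.5 (illustrative only).
rng = random.Random(7)
data = []
for _ in range(400):
    x = rng.uniform(0.5, 1.5)
    p = 1.0 / (1.0 + math.exp(1.5 * x))
    data.append((x, 1 if rng.random() < p else 0))

draws, acceptance_rate = random_walk_metropolis(data, 500, 1000, 0.2, 10.0, rng)
posterior_mean = sum(draws) / len(draws)
```

Draws from the burn-in phase are discarded so that the retained draws are less sensitive to the starting value.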
2.6. Restricted models and benchmark models
We estimate various restricted versions of the proposed model as well as finite mixture
versions of the base model to serve as benchmarks. A random coefficients model with correction for
endogeneity will provide a useful benchmark for assessing the benefits of including common demand
shocks in the specification. Random coefficients models with no common demand shocks and no
correction for endogeneity will allow us to assess the extent to which random coefficients will
spuriously accommodate the missing demand shocks. Finally, a base model with fixed coefficients as
well as finite mixture versions of the base model allow us to compare the results of our proposed
model with previous results in the literature.
3.0. Empirical application
3.1. Data
Information Resources, Inc., (IRI) scanner panel data are used to calibrate and validate the
purchase incidence/brand choice/purchase quantity models. The panelists, who shopped in nine stores
from a supermarket chain in a Chicago suburban area, are tracked over a 112-week period. The two
product classes used are paper towels and margarine. For each product category, a random sample of
300 households is used for analysis. The first 60 weeks of each household’s purchase history are used
to initialize the model variables, while the remaining 52 weeks are used to calibrate the model
parameters. For the paper towel data, the total number of store trips made by the sample panelists
during the calibration period was 25,105, with 2,022 of the trips resulting in the purchase of paper
towels; panelists purchased 3,499 rolls of paper towels (an average of 1.73 rolls per incidence). For
the margarine data, sample panelists made 24,315 store trips, with 2,361 trips resulting in margarine
purchases; panelists purchased 3,125 pounds of margarine (an average of 1.32 pounds per incidence).
For each category, another 300 panelists were randomly chosen for model validation purposes. For
the paper towel category, the validation sample panelists made 23,973 store trips, while the validation
sample panelists for the margarine category made 23,795 store trips.
Following the BGS study, which focused on 8-ounce containers of yogurt to keep model
estimation tractable, we focus on single-roll packs in the paper towel category.2 Ninety-one percent of
all purchases were single-roll packs, so parameter estimates for 2-, 3-, 6-, and 12-roll pack indicators
would be questionable in any case. Brand names with 3% or greater market share were retained for
analysis. The seven brand names retained are Bounty, Brawny, HiDri, Mardi Gras, Scott, SoDri, and
Sparkle.3 For the margarine category, the analysis focused on one-pound containers, again in the
interest of model tractability. Ninety percent of all purchases were one-pound containers, so again the
parameter estimates for multi-pound pack indicators would be questionable. The nine brand names
retained for analysis are Blue Bonnet, Brummel & Brown, Fleischmann, I Can’t Believe It’s Not
Butter, Imperial, Land O’ Lakes, Parkay, Promise, and Shedd’s Country Crock.
For each product category, price, store feature advertising, and aisle display data are available.
We found that store feature advertising and aisle display are highly correlated, so the marketing mix
variables included in our models are price and aisle display (which we label as promotion).
2 The study by Andrews and Currim (2005) shows that, when it is not feasible to conduct an analysis at the SKU
level for computational or other reasons, selecting the most popular size and conducting the analysis at the brand
level is the best modeling approach.
3 The Viva brand had sufficient market share to be included in the study, but marketing mix data for this brand
were missing over a 61-week period.
3.2. Estimation and validation results
Table 1 shows the model estimation and prediction results for the paper towel category (part a)
and the margarine category (part b). The log marginal density (LMD), which is used to assess
estimation sample fit for the models estimated with hierarchical Bayes (HB) methods, is computed
using the reweighted importance sampling method outlined in Raftery (1996) and Newton and Raftery
(1994). The log likelihood (Log L) value is presented for the models estimated with maximum
likelihood methods, namely the model with fixed coefficients and the finite mixture models; for all
others, the log marginal density is presented. In the case of the finite mixture models, BIC was used as an
approximation to the log marginal densities. BIC is based on the Schwarz criterion, which in turn is
an approximation to the log marginal density of a model (Schwarz, 1978). In order to make the log
marginal density comparable to the BIC, a statistic widely used in the marketing literature, we
compared -2(LMD) to the BIC measure for the other models (Andrews, Ainslie, & Currim, 2002).
For model validation, we used four performance measures. The log likelihood value for the holdout
sample, Log L(Validation), captures whole-model fit. We also assess the fit of the incidence and
brand choice components separately using the average predicted probability of the actual outcome,
P (Inc) for the incidence model and P (Choice) for the choice model. Finally, we assess the
purchase quantity component by computing the root mean squared error of the predicted purchase
quantities, RMSE(Q).
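The fit and validation measures described above can be sketched in a few lines of Python. The inputs are hypothetical, and the BIC function uses the standard definition (minus two times the log likelihood plus the number of parameters times the log of the number of observations); the paper does not state which observation count it uses, so that argument is left to the caller.

```python
import math

def holdout_log_likelihood(probs_of_actual):
    """Sum of log predicted probabilities of the outcomes that actually occurred."""
    return sum(math.log(p) for p in probs_of_actual)

def average_predicted_probability(probs_of_actual):
    """Mean predicted probability assigned to the actual outcome."""
    return sum(probs_of_actual) / len(probs_of_actual)

def rmse(predicted, actual):
    """Root mean squared error of predicted purchase quantities."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC; comparable in scale to minus two times a log marginal density."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical holdout predictions.
choice_probs = [0.7, 0.5, 0.9, 0.6]
avg_choice = average_predicted_probability(choice_probs)
log_l = holdout_log_likelihood(choice_probs)
quantity_error = rmse([1.2, 2.1, 1.0], [1, 2, 1])
example_bic = bic(-100.0, 5, 1000)
```

Lower values of BIC, -2(LMD), and RMSE indicate better performance, whereas higher values of the log likelihood and the average predicted probability do.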
Looking first at the paper towel category results (Table 1a), the results for the proposed model
are shown first, followed by results for various restricted forms of the proposed model, and finally the
results for the finite mixture benchmark models. We see that the proposed model with random
coefficients, common demand shocks, and endogeneity has the best results for all performance criteria
for both the estimation and validation samples. The differences between this model and the proposed
model that does not accommodate endogeneity are rather small, with the exception of the purchase
quantity component (RMSE(Q)), for which the differences are perhaps slightly more meaningful.
As for the restricted models, the random coefficients model with endogeneity is different from
the proposed model only in that it does not accommodate common demand shocks, so comparison
with the proposed model provides insight on the value of accommodating common demand shocks.
The proposed model has better overall fit in the estimation and validation samples, as well as better
validation sample performance in all three stages of the model. However, it appears that the
accommodation of common demand shocks provides the most significant benefit in the brand choice
and purchase quantity components of the model and the least benefit in the purchase incidence
component. This is likely because the unobserved factors vary across brands and weeks in the brand
choice and purchase quantity components, but they vary only across weeks in the purchase incidence
component since it is not brand specific. Hence there are more random components to explain
variance in the brand choice and purchase quantity components of the model.
A comparison of the restricted random coefficients models with and without corrections for
endogeneity shows that accommodating endogeneity had fairly minor effects. Fit in the estimation
sample is worse when endogeneity is controlled, as expected, because predicted prices are used instead
of actual prices. However, for the validation sample, differences between the two models were slight,
with purchase quantity predictions being slightly better when endogeneity is controlled. The store
mean-corrected price instruments were suitably strong, with R² values from the regressions of prices
on instruments (equation 9) ranging from 0.54 to 0.90.
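As a rough illustration of how instrument strength can be checked, the sketch below computes the R² from a least-squares regression of price on a set of instruments. The data are simulated and the function name `first_stage_r2` is hypothetical; the paper's own first-stage specification (equation 9) is not reproduced here.

```python
import numpy as np

def first_stage_r2(price, instruments):
    """R-squared from an OLS regression of price on instruments plus an intercept."""
    X = np.column_stack([np.ones(len(price)), instruments])
    coef, *_ = np.linalg.lstsq(X, price, rcond=None)
    resid = price - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((price - price.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Simulated (hypothetical) data: a single instrument that tracks price with noise.
rng = np.random.default_rng(0)
instrument = rng.normal(size=500)
price = 1.0 + 0.8 * instrument + rng.normal(scale=0.5, size=500)

r2 = first_stage_r2(price, instrument.reshape(-1, 1))
print(f"first-stage R^2 = {r2:.2f}")
```

Values well above zero, such as the 0.54 to 0.90 range reported above, indicate that the instruments explain a large share of the variation in prices.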
Compared to models with fixed coefficients, the random coefficients specifications performed
much better on all criteria. The importance of heterogeneity is no surprise given how much attention
the topic has received in the literature on disaggregate models of purchase behavior.
For the finite mixture benchmark models, an analyst would most likely choose a 4-segment
solution on the basis of BIC. The advantage of the proposed model over the 4-segment FM model is
most evident in the overall fit statistics and in the choice and quantity components of the model. For
example, the predicted choice probability for the chosen brand for the proposed model is 0.6972,
compared to 0.5939 for the FM model (a 17% improvement). Likewise, the RMSE(Q) value for the
proposed model is 0.8466, compared to 1.0785 for the 4-segment model, an error reduction of almost
22%. The improvement in the incidence component is much smaller (average predicted probability of 0.8705 versus 0.8620).
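The percentages quoted above follow directly from the Table 1a entries. The short sketch below reproduces the arithmetic; the helper names are ours and are used only for illustration.

```python
def relative_gain(proposed, benchmark):
    """Proportional improvement for a criterion where larger is better."""
    return (proposed - benchmark) / benchmark

def relative_error_reduction(proposed, benchmark):
    """Proportional reduction for an error criterion where smaller is better."""
    return (benchmark - proposed) / benchmark

# Table 1a, paper towel category: proposed model versus 4-segment FM model.
choice_gain = relative_gain(0.6972, 0.5939)            # P (Choice)
rmse_cut = relative_error_reduction(0.8466, 1.0785)    # RMSE(Q)
print(f"choice: {choice_gain:.1%}, quantity: {rmse_cut:.1%}")
```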
The results for the margarine category (Table 1b) are quite similar to those of the paper towel
category. The proposed specifications perform similarly whether or not endogeneity is controlled, and
both perform significantly better than all other restricted models and benchmark models. A
comparison of the proposed model with the restricted random coefficients model with endogeneity
highlights the benefits of accommodating common demand shocks. As with the paper towel category,
the benefits are most apparent in the brand choice and purchase quantity components of the model and
less apparent in the purchase incidence component. Accommodating endogeneity again appeared to
have little effect on model performance, as a comparison of the random coefficients models specified
with and without endogeneity reveals. The instruments are again suitably strong, with R² values from
the regressions of prices on instruments ranging from 0.50 to 0.99. Finally, using random coefficients
instead of fixed coefficients produced tremendous gains according to all performance measures.
As for the finite mixture benchmark models, an analyst would likely choose the 5-segment
finite mixture model on the basis of BIC. Compared to this benchmark, the proposed model offers a
10% improvement in the choice model hit rate and a 16% reduction in prediction error for purchase
quantity, but again much less improvement in the incidence model hit rate.
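The BIC-based choice of the number of segments can be retraced from Table 1b. The sketch below assumes the standard Schwarz form, BIC = -2 Log L + P ln(N), with N taken to be the 300 households in the estimation sample. That choice of N is our assumption rather than something stated in this section, although it reproduces the tabled BIC values to within rounding.

```python
import math

N_HOUSEHOLDS = 300  # assumed sample size entering the penalty term

# Margarine category, Table 1b: segments -> (log likelihood, number of parameters P)
fm_models = {
    2: (-10682, 61),
    3: (-10537, 92),
    4: (-10428, 123),
    5: (-10321, 154),
    6: (-10281, 185),
}

def bic(log_l, n_params, n):
    """Schwarz criterion: smaller values indicate a preferred model."""
    return -2.0 * log_l + n_params * math.log(n)

bic_by_segments = {k: bic(ll, p, N_HOUSEHOLDS) for k, (ll, p) in fm_models.items()}
best = min(bic_by_segments, key=bic_by_segments.get)
print(best, round(bic_by_segments[best]))
```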
Table 2 shows the parameter estimates for the proposed model for both the paper towel and
margarine categories. The results are quite similar across product categories, which bodes well for the
stability of the models as well as the generalizability of the findings. Parameter estimates generally
have the expected signs, and the 95% Highest Posterior Density (HPD) regions (analogous to
confidence intervals) generally do not contain zero. Only one of 26 parameters in Table 2 has an
unexpected sign (brand loyalty for the quantity model in the paper towel category), and even then the
95% HPD region almost includes zero. This could be indicative of variety seeking, but in any case the
effect is weak. For one parameter (promotion in the purchase quantity model for the paper towel
category), the 95% HPD region includes zero.
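For readers unfamiliar with HPD regions, the following sketch shows one common way to compute a 95% HPD interval from posterior draws: take the shortest interval that contains 95% of the sorted draws. This is a generic construction for a unimodal posterior, not necessarily the routine used to produce Table 2, and the draws are simulated.

```python
import numpy as np

def hpd_interval(draws, mass=0.95):
    """Shortest interval containing `mass` of the draws (unimodal posterior assumed)."""
    x = np.sort(np.asarray(draws, dtype=float))
    n = len(x)
    k = int(np.ceil(mass * n))               # number of draws inside the interval
    widths = x[k - 1:] - x[: n - k + 1]      # width of each window of k consecutive draws
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]

# Hypothetical posterior draws for a single coefficient.
rng = np.random.default_rng(0)
draws = rng.normal(loc=-0.20, scale=0.07, size=5000)
lower, upper = hpd_interval(draws)
contains_zero = lower <= 0.0 <= upper
```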
As discussed earlier, Chintagunta, Dubé, and Goh (2005) document that ignoring common
demand shocks in brand choice models may lead to downward-biased estimates of mean price
response parameters and also to larger estimates of the variance in the heterogeneity distribution of
price sensitivities across households. Based on their findings, we likewise expect that mean estimates
of price parameters would be biased toward zero when common demand shocks are ignored. Another
argument for why the mean parameter estimates might be smaller when common demand shocks are
ignored is that, to the extent that common demand shocks improve the fit of the model, the scale factor
changes, resulting in a general elevation of all coefficients by a common multiplicative factor. We also expect that the
variances of random coefficients distributions would be larger when common demand shocks are
ignored.
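The scale-factor argument can be illustrated with a small simulation. The sketch below is a deliberately simplified binary logit, not the three-stage model of this paper: a weekly shock shared by all households enters utility, and the price coefficient is estimated once with the shock omitted and once with it included as if it were observed (a stand-in for the random effects treatment). All names and parameter values are hypothetical.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Maximum likelihood logit via Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(1)
n_weeks, n_households = 200, 100
true_price_coef = -2.0

week = np.repeat(np.arange(n_weeks), n_households)
price = rng.normal(size=week.size)                    # exogenous here, so only the scale effect operates
shock = rng.normal(scale=1.5, size=n_weeks)[week]     # one shock per week, shared by all households
utility = true_price_coef * price + shock + rng.logistic(size=week.size)
buy = (utility > 0).astype(float)

ones = np.ones(week.size)
coef_without = fit_logit(np.column_stack([ones, price]), buy)[1]
coef_with = fit_logit(np.column_stack([ones, price, shock]), buy)[1]
print(f"shock omitted: {coef_without:.2f}, shock included: {coef_with:.2f}")
```

In this setup the estimate with the shock omitted is shrunk toward zero, because the unexplained shock variance is absorbed into the error term and the logit coefficients are identified only relative to the error scale.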
Figure 1 shows that accounting for common demand shocks affects both the location and scale
of the estimated household-level price effects. The figure compares the random coefficients distributions
for price for the proposed model accommodating common demand shocks and endogeneity with those
for the model accommodating only endogeneity. For the effects of price in the brand choice model in
the margarine category (Figure 1, part a), the mean price effect is indeed biased slightly towards zero
(9% smaller) when common demand shocks are ignored, and the variation in the household-level
coefficients is indeed larger. For the paper towel category (part b), the same pattern generally holds,
with a 21% reduction in the mean value of the coefficient when common demand shocks are ignored,
except that the variation in coefficients is only slightly larger. For the purchase quantity model,
when common demand shocks are ignored, the mean price coefficients are again considerably
smaller in both the margarine category (54%, part c) and the paper towel category (23%, part d). Counter to
expectations, however, in the paper towel category the variation in price coefficients is actually smaller when
common demand shocks are ignored.
One unanswered question is whether the findings of Chintagunta, Dubé, and Goh (2005) and
also our findings for multi-stage purchase decision models in Figure 1 generalize to other model
variables besides price. We examined all coefficient distributions for both the margarine category and
the paper towel category and conclude that the findings are quite mixed with respect to the direction of
the changes in coefficients resulting from omission of the common demand shock. The mean
coefficient effects can be larger or smaller when common demand shocks are omitted, and the
variances of random coefficient distributions can also be larger or smaller. Across categories, about
80% of coefficients are larger when common demand shocks are accommodated, and about 57% of
coefficients have larger variances. Though this finding was not predicted, especially given the effects
of improved fit on the scale factor of the logit model and the findings of Chintagunta, Dubé, and Goh
(2005), in retrospect we can imagine some situations in which a missing demand shock is negatively
correlated with a marketing mix variable, which could lead to an over-estimation of the marketing mix
coefficient (i.e., bias away from zero). For example, a manager aware of an upcoming major
promotional event on a product in a complementary product category may lower the price on their
own product to capitalize on the opportunity. The amount of the promotional discount in the
complementary category would be negatively correlated with the manager’s price but positively
related to demand. Thus, for a model in which the promotional discount for the complementary
product was not observed, the effects of the manager’s price change would be overstated, not
understated.
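The direction of bias in the complementary-promotion example can be checked with a simple linear analogue. The sketch below uses ordinary least squares on simulated data rather than the logit-based models of this paper, and every variable name and parameter value is hypothetical. The omitted discount is negatively correlated with price and positively related to demand, as in the example above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

comp_discount = rng.normal(size=n)                    # unobserved discount in the complementary category
price = -0.6 * comp_discount + rng.normal(size=n)     # manager lowers price when the discount is deep
demand = -1.0 * price + 0.8 * comp_discount + rng.normal(size=n)

def ols(columns, y):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(X, y, rcond=None)[0]

price_coef_omitting = ols([price], demand)[1]
price_coef_including = ols([price, comp_discount], demand)[1]
print(f"omitting: {price_coef_omitting:.2f}, including: {price_coef_including:.2f}")
```

With the discount omitted, the estimated price coefficient is more negative than the true value of -1.0, that is, biased away from zero.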
Considering all model coefficients, we calculate that the median⁴ absolute bias resulting from
omission of the common demand shocks is 10% in the margarine category and 20% in the paper towel
category. Thus, the omission of common demand shocks can produce substantial changes in
mean parameter estimates, even when price endogeneity is controlled.
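The reason for summarizing with a median rather than a mean can be seen in a small numerical example. The coefficient pairs below are hypothetical and are not taken from our estimates; the last pair has a baseline coefficient close to zero.

```python
import statistics

# Hypothetical pairs: (estimate with common demand shocks, estimate without)
pairs = [(-2.00, -1.60), (2.16, 1.90), (0.50, 0.45), (-0.48, -0.42), (0.02, 0.09)]

abs_pct_change = [abs(without - with_) / abs(with_) for with_, without in pairs]
mean_change = statistics.mean(abs_pct_change)
median_change = statistics.median(abs_pct_change)
print(f"mean: {mean_change:.0%}, median: {median_change:.1%}")
```

A single near-zero coefficient produces a 350% change and pulls the mean above 80%, while the median remains at 12.5%.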
⁴ The mean is distorted by the fact that some coefficients are fairly small, which can produce a very large percentage change. Thus, we computed the medians.
Finally, we investigate the extent to which household-level estimates of price coefficients are
affected by the inclusion of common demand shocks. In Figure 2, we show scatterplots of household-level
price coefficients, for the brand choice and purchase quantity models, for random coefficients
models specified with endogeneity and with or without common demand shocks. In the margarine
category (part a), the correlations between household-level estimates produced by models specified
with and without common demand shocks are weak in both the brand choice and purchase quantity
models (r=0.19 and r=0.25, respectively). For the paper towel category (part b), models specified with and
without common demand shocks produced highly correlated household-level estimates in the brand
choice model (r=0.83) but weakly correlated estimates in the purchase quantity model (r=0.26).
Notice that scale factor changes would not be responsible for low correlations in household-level
estimates, since correlation coefficients are unaffected by transformations of the form y=a+bx with b>0, and a scale
factor change would result in a transformation of the form y=bx. This finding indicates that
household-level targeting decisions could very well be impacted by failure to model common demand
shocks.
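The invariance of the correlation coefficient to rescaling is easy to verify numerically. In the sketch below, the two sets of household-level coefficients are simulated and purely hypothetical; the point is only that multiplying one set by a positive constant, or adding a constant to it, leaves r unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
coef_a = rng.normal(-2.0, 0.5, size=300)                   # hypothetical household-level coefficients
coef_b = 0.3 * coef_a + rng.normal(0.0, 0.5, size=300)     # a second, weakly related set

r_original = np.corrcoef(coef_a, coef_b)[0, 1]
r_rescaled = np.corrcoef(coef_a, 0.8 * coef_b)[0, 1]         # y = bx, a pure scale change
r_shifted = np.corrcoef(coef_a, 1.5 + 0.8 * coef_b)[0, 1]    # y = a + bx
```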
4.0. Discussion and conclusion
The goal of this research was to investigate the importance of accounting for response
heterogeneity, common demand shocks, and endogeneity in three-stage purchase incidence/purchase
quantity/brand choice models for panel data. The focus of the analysis was whether and how
parameter estimates are affected by the omission of unobserved common demand shocks.
Chintagunta, Dubé, and Goh (2005) study the effects of common demand shocks on household-level
brand choice models only and find that omission of common demand shocks can lead to two types of
problems: mean price response estimates biased toward zero and larger estimates of the variance in
the heterogeneity distribution of price sensitivities across households. Our study also complements the
work by Nair, Dubé, and Chintagunta (2005), which develops a model that accommodates
endogeneity and heterogeneity (but not common demand shocks) for purchase incidence, brand
choice, and purchase quantity decisions, when only aggregate data are available. Our work will be
useful in the many settings in which disaggregate information is available. Finally, our study extends
the work by Villas-Boas and Winer (1999), which uses an instrumental variables approach to correct
for endogeneity in brand choice models. They do not consider endogeneity in multi-stage purchase
decision models.
For both product categories studied, the model specified with random coefficients
distributions for covariates and random effects distributions for common demand shocks, estimated
using an instrumental variables approach to control for endogeneity, produced superior fit and
predictive accuracy. In particular, the fully-specified model dominated all finite mixture and random
coefficients benchmark models, including those without common demand shocks and/or corrections
for endogeneity. The findings are intuitively appealing because there are many relevant variables that
may impact consumer purchase behaviors but are not typically observed or included, such as
advertising or coupon availability or other difficult to quantify aspects such as style or prestige. In
addition, in-store variables such as shelf space or shelf talkers could have important effects. The
correction for price endogeneity did not produce managerially significant changes in model performance
criteria in any stage of the purchase decision in either product category, despite the use of strong
instruments.
The results show that accounting for the effects of unobserved variables can affect the location
and scale of all estimated random coefficient distributions for observed variables, not just price.
Importantly, we find that the direction of bias in the location and scale is not intuitively predictable a
priori. In studies of price endogeneity, the typical finding is that ignoring the effects of unobserved
factors results in the price coefficient being biased toward zero. The assumption is that the
unobserved factor (e.g., advertising exposure or prestige) will be positively correlated with both price
and demand. We also find in both product categories that omission of the common demand shock
resulted in underestimation of the price coefficients, as well as (generally) larger estimates of the
variance of the random coefficient distribution. However, for other variables and brand constants,
omission of common demand shocks could result in under- or over-estimation of the mean coefficients
as well as smaller or larger estimates of the variances of the random coefficients distributions. We
might expect the coefficients of the model specified with common demand shocks to be larger relative
to the standard random coefficients model because the scale factor of the logit model changes as fit
improves, resulting in a general elevation of all coefficients. But this is not uniformly the case. Across
categories, about 80% of coefficients are larger when common demand shocks are accommodated, and
about 57% of coefficients have larger variances. Across all model coefficients, the median absolute
percentage change in coefficients resulting from the omission of common demand shocks was 10% in
the margarine category and 20% in the paper towel category. This could have major implications
when the diagnostic information about consumer behavior from such coefficient distributions is used
in managerial deliberations about prices, optimal price reductions, promotions, and optimal promotion
policies.
One might also expect that failure to account for common demand shocks could result in
larger variances for parameter distributions, as the model uses random coefficients distributions to
account for the unobserved and unexplained variation. But again this is not necessarily the case. Our
findings suggest that random coefficients distributions are not substitutes for random effects
distributions for common demand shocks (and vice versa). Intuitively, random coefficients
distributions account for factors that vary across brands and households but not purchase occasions,
whereas common demand shocks vary across brands and purchase occasions but not households.
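The distinction between the two sources of unobserved variation can be made concrete by looking at the dimensions over which each one varies. The sketch below is only a schematic: it uses a single brand-specific household effect in place of a full vector of random coefficients, and all array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n_households, n_brands, n_weeks = 4, 3, 5

# Random coefficients: one value per household and brand, constant over purchase occasions.
household_brand_effect = rng.normal(size=(n_households, n_brands, 1))
# Common demand shocks: one value per brand and week, shared by every household.
brand_week_shock = rng.normal(size=(1, n_brands, n_weeks))

# Broadcasting combines the two into a household x brand x week array.
systematic_utility = household_brand_effect + brand_week_shock
week_to_week_change = np.diff(systematic_utility, axis=2)
```

Because the household effect does not vary over weeks, the week-to-week changes are identical for every household and come entirely from the common demand shocks. Neither component can reproduce the variation captured by the other.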
In addition, we computed the correlations between household-level price coefficients
estimated by models specified with and without common demand shocks and found that the
correlations were generally weak (0.26 or below), with the exception of the price coefficients for the
brand choice model for the paper towel category, which were much more strongly correlated (r=0.83). Thus,
one could argue that it is at least as important to model common demand shocks in the
purchase quantity model as it is in the brand choice model, a finding that builds on the findings of
Chintagunta, Dubé, and Goh (2005).
In the context of behavioral theories of consumer decision making, note that we do not claim
that our proposed model better tracks the decision process of the customer. For example, some
consumers could make the incidence/brand choice/quantity decisions simultaneously while others
could make such decisions sequentially. Analysis of scanner data has not yet been used to provide
insights into which of these processes may be operating, though this is an interesting area for future
research. Further, the inclusion of reference prices, consideration sets (e.g., Siddarth, Bucklin, &
Morrison, 1995), and planned vs. opportunistic decisions (Bucklin & Lattin, 1991), among other such
effects, could provide additional useful insights.
Acknowledgement: The authors would like to thank the Editor, the Area Editor, and the reviewers
for their constructive comments on this manuscript.
References
Ailawadi, K. L., Gedenk, K., & Neslin, S.A. (1999). Heterogeneity and purchase event feedback in
choice models: An empirical analysis with implications for model building. International Journal of
Research in Marketing, 16, 177-198.
Ailawadi, K. L., & Neslin, S.A. (1998). The effect of promotion on consumption: Buying more and
consuming it faster. Journal of Marketing Research, 35 (August), 390-398.
Andrews, R.L., Ainslie, A., & Currim, I.S. (2002). An empirical comparison of logit choice models
with discrete versus continuous representations of heterogeneity. Journal of Marketing Research,
39 (November), 479-487.
Andrews, R. L., & Currim, I.S. (2005). An experimental investigation of scanner data preparation
strategies for consumer choice models. International Journal of Research in Marketing, 22
(September), 319-331.
Andrews, R. L., & Ebbes, P. (2009). Modeling endogeneity in logit-based demand models: Finite
sample results. Working paper, Department of Business Administration, University of Delaware.
Arora, N., Allenby, G.M., & Ginter, J. L. (1998). A hierarchical Bayes model of primary and secondary
demand. Marketing Science, 17 (1), 29-44.
Böckenholt, U. (1993a). Estimating latent distributions in recurrent choice data. Psychometrika, 58
(September), 489-509.
Böckenholt, U. (1993b). A latent-class regression approach for the analysis of recurrent choice data.
British Journal of Mathematical and Statistical Psychology, 46 (May), 95-118.
Bucklin, R., & Gupta, S. (1992). Brand choice, purchase incidence, and segmentation: An integrated
modeling approach. Journal of Marketing Research, 29 (May), 201-215.
Bucklin, R., Gupta, S., & Siddarth, S. (1998). Determining segmentation in sales response across
consumer purchase behaviors. Journal of Marketing Research, 35 (May), 189-197.
Bucklin, R., & Lattin, J.M. (1991). A two-state model of purchase incidence and brand choice.
Marketing Science, 10 (Winter), 24-39.
Chiang, J. (1991). A simultaneous approach to the whether, what, and how much to buy questions.
Marketing Science, 10 (Fall), 297-315.
Chiang, J., Chib, S., & Narasimhan, C. (1999). Markov chain Monte Carlo models of consideration set
and parameter heterogeneity. Journal of Econometrics, 89 (1-2), 223-248.
Chib, S., Seetharaman, P.B., & Strijnev, A. (2004). Model of brand choice with a no-purchase option
calibrated to scanner-panel data. Journal of Marketing Research, 41 (May), 184-196.
Chintagunta, P. K. (1993). Investigating purchase incidence, brand choice and purchase quantity
decisions of households. Marketing Science, 12 (Spring), 184-208.
Chintagunta, P. K., Dubé, J.P., & Goh, K.Y. (2005). Beyond the endogeneity bias: The effect of
unmeasured brand characteristics on household-level brand choice models. Management Science,
51 (May), 832-849.
Degeratu, A. M., Rangaswamy, A., & Wu, J. (2000). Consumer choice behavior in online and
traditional supermarkets: The effects of brand name, price, and other search attributes. International
Journal of Research in Marketing, 17, 55-78.
Dillon, W., & Gupta, S. (1996). A segment-level model of category volume and brand choice.
Marketing Science, 15 (1), 38-59.
Gupta, S. (1988). Impact of sales promotions on when, what, and how much to buy. Journal of
Marketing Research, 25 (November), 342-355.
Heilman, C. M., & Bowman, D. (2002). Segmenting consumers using multiple-category purchase data.
International Journal of Research in Marketing, 19, 225-252.
Krishnamurthi, L., & Raj, S.P. (1988). A model of brand choice and purchase quantity price
sensitivities. Marketing Science, 7 (Winter), 1-20.
Kuksov, D., & Villas-Boas, J.M. (2008). Endogeneity and individual consumer choice. Journal of
Marketing Research, 45 (December), 702-714.
Mehta, N. (2007). Investigating consumers’ purchase incidence and brand choice decisions across
multiple product categories: A theoretical and empirical analysis. Marketing Science, 26 (March-April), 196-217.
Nair, H., Dubé, J.P., & Chintagunta, P. (2005). Accounting for primary and secondary demand effects
with aggregate data. Marketing Science, 24 (Summer), 444-460.
Newton, M.A., & Raftery, A.E. (1994). Approximate Bayesian inference by the weighted likelihood
bootstrap. Journal of the Royal Statistical Society, Series B, 56 (1), 43-48.
Raftery, A. E. (1996). Hypothesis testing and model selection. In W.R. Gilks, S. Richardson, and D.J.
Spiegelhalter (eds.), Markov Chain Monte Carlo in Practice (pp. 163-188). London: Chapman &
Hall.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6 (2), 461-464.
Seetharaman, P. B. (2003). Probabilistic versus random-utility models of state dependence: An
empirical comparison. International Journal of Research in Marketing, 20, 87-96.
Siddarth, S., Bucklin, R.E., & Morrison, D.G. (1995). Making the cut: Modeling and analyzing
choice set restriction in scanner panel data. Journal of Marketing Research, 32 (August), 255-266.
Villas-Boas, J. M., & Winer, R. S. (1999). Endogeneity in brand choice models.
Management Science, 45 (10), 1324-1338.
Table 1a. Calibration and validation results: paper towel category
(Estimation sample: N=300; 25,105 observations; Validation sample: N=300; 23,973 observations)

                                                                 Estimation sample     Validation sample
Model                                                         P  LMD/Log L  -2LMD/BIC  Log L (Valid.)  P (Inc)  P (Choice)  RMSE(Q)
Proposed model:
  Random coefficients, common demand shocks, endogeneity             -8654      17308           -8580   0.8705      0.6972   0.8466
  Random coefficients, common demand shocks                          -8658      17316           -8586   0.8705      0.6958   0.8648
Restricted models:
  Random coefficients, endogeneity                                   -8881      17762           -8852   0.8677      0.6681   0.8747
  Random coefficients                                                -8868      17737           -8851   0.8677      0.6696   0.8831
  Fixed coefficients                                         26     -10716      21580          -11117   0.8484      0.5636   0.9856
Finite mixture models:
  2-segment FM                                               53      -9965      20233          -10298   0.8595      0.5737   1.0322
  3-segment FM                                               80      -9761      19979          -10230   0.8643      0.5996   1.1039
  4-segment FM                                              107      -9672      19953          -10149   0.8620      0.5939   1.0785
  5-segment FM                                              134      -9616      19997           -9813   0.8691      0.5939   1.0052
  6-segment FM                                              161      -9522      19962           -9938   0.8688      0.6051   1.0405

P is the number of parameters. LMD is the Log Marginal Density; Log L is the log likelihood value; BIC is the Bayesian Information Criterion, and -2LMD is an approximation
of it for models estimated with HB methods; P (Inc) is the average predicted probability of the actual purchase incidence outcome; P (Choice)
is the average predicted probability of the actual brand choice outcome, given incidence; RMSE(Q) is the root mean squared error between the
actual and predicted purchase quantity, given incidence.
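The performance criteria defined in the table note can be computed as follows. This is a minimal sketch with made-up predictions for three purchase occasions; the function names are ours, and the code is meant only to make the definitions of P (Choice) and RMSE(Q) concrete. P (Inc) is computed in the same way as P (Choice), with two outcomes (purchase or no purchase) in place of brands.

```python
import numpy as np

def avg_prob_of_actual(pred_probs, actual):
    """Mean predicted probability assigned to the outcome that actually occurred."""
    pred_probs = np.asarray(pred_probs, dtype=float)
    actual = np.asarray(actual, dtype=int)
    return float(pred_probs[np.arange(len(actual)), actual].mean())

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted quantities."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical predictions: three purchase occasions, three brands.
choice_probs = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.2, 0.6]]
chosen_brand = [0, 1, 2]
p_choice = avg_prob_of_actual(choice_probs, chosen_brand)   # (0.7 + 0.5 + 0.6) / 3
rmse_q = rmse([1, 2, 1], [1.5, 1.0, 1.0])
```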
Table 1b. Calibration and validation results, continued: margarine category
(Estimation sample: N=300; 24,315 observations; Validation sample: N=300; 23,795 observations)

                                                                 Estimation sample     Validation sample
Model                                                         P  LMD/Log L  -2LMD/BIC  Log L (Valid.)  P (Inc)  P (Choice)  RMSE(Q)
Proposed model:
  Random coefficients, common demand shocks, endogeneity             -9346      18692           -9077   0.8566      0.6026   0.9574
  Random coefficients, common demand shocks                          -9292      18584           -9089   0.8567      0.6002   0.9581
Restricted models:
  Random coefficients, endogeneity                                   -9566      19132           -9312   0.8549      0.5768   0.9935
  Random coefficients                                                -9560      19119           -9349   0.8536      0.5887   1.0013
  Fixed coefficients                                         30     -11114      22399          -11101   0.8370      0.5015   1.1145
Finite mixture models:
  2-segment FM                                               61     -10682      21711          -10541   0.8467      0.5202   1.1392
  3-segment FM                                               92     -10537      21599          -10330   0.8445      0.5205   1.1296
  4-segment FM                                              123     -10428      21557          -10149   0.8452      0.5335   1.1312
  5-segment FM                                              154     -10321      21521          -10207   0.8408      0.5414   1.1350
  6-segment FM                                              185     -10281      21617          -10129   0.8480      0.5456   1.1112

P is the number of parameters. LMD is the Log Marginal Density; Log L is the log likelihood value; BIC is the Bayesian Information Criterion, and -2LMD is an approximation
of it for models estimated with HB methods; P (Inc) is the average predicted probability of the actual purchase incidence outcome; P (Choice)
is the average predicted probability of the actual brand choice outcome, given incidence; RMSE(Q) is the root mean squared error between the
actual and predicted purchase quantity, given incidence.
Table 2. Parameter estimates for the proposed model (random coefficients, common demand
shocks, and endogeneity)

                  Paper towels              Margarine
Parameter         Est.    95% HPD           Est.    95% HPD
Incidence:
  Constant (0)    -4.18   -4.56, -3.88      -3.38   -3.60, -3.20
  CR h             1.50    1.13,  1.79       1.84    1.68,  2.02
  INVt h          -0.48   -0.58, -0.39      -0.50   -0.62, -0.43
  CVt h            0.50    0.42,  0.58       0.80    0.70,  0.90
Brand choice (a):
  BLhi             2.80    2.54,  3.05       4.17    3.69,  4.60
  LBPith           0.22    0.03,  0.39       0.27    0.07,  0.54
  PRICE its       -2.00   -2.29, -1.77      -2.83   -2.98, -2.71
  PROM its         2.16    1.86,  2.39       1.10    1.02,  1.19
Quantity (a):
  PR h             0.59    0.44,  0.79       0.98    0.86,  1.10
  INVt h          -0.13   -0.19, -0.07      -0.16   -0.24, -0.08
  BLhi            -0.20   -0.33, -0.06       0.51    0.31,  0.71
  PRICE its       -1.20   -1.42, -1.02      -0.94   -1.10, -0.78
  PROM its        -0.03   -0.26,  0.13       0.40    0.31,  0.50

(a) Brand-specific constants not reported in the interest of space.
Figure 1. Comparison of random coefficients distributions for price for models specified with and
without common demand shocks
a) Margarine category, brand choice model. With common demand shocks: mean=-2.83, std. dev.=0.27; without: mean=-2.58, std. dev.=0.54
b) Paper towel category, brand choice model. With common demand shocks: mean=-2.00, std. dev.=0.79; without: mean=-1.59, std. dev.=0.82
c) Margarine category, quantity model. With common demand shocks: mean=-0.94, std. dev.=0.27; without: mean=-0.43, std. dev.=0.36
d) Paper towel category, quantity model. With common demand shocks: mean=-1.20, std. dev.=0.34; without: mean=-0.93, std. dev.=0.14
Figure 2. Scatterplots of household-level price coefficients for random coefficients models with
endogeneity and with or without common demand shocks
a) Margarine category: brand choice model (r=.19) and quantity model (r=.25)
b) Paper towel category: brand choice model (r=.83) and quantity model (r=.26)