ARTICLE IN PRESS
Journal of Retailing and Consumer Services 14 (2007) 383–393
www.elsevier.com/locate/jretconser
Flexible estimation of price response functions using retail scanner data
Winfried J. Steiner (a,*), Andreas Brezger (b), Christiane Belitz (b)
(a) Department of Marketing, University of Regensburg, Universitätsstraße 31, 93053 Regensburg, Germany
(b) Department of Statistics, University of Munich, Ludwigstraße 33, 80539 Munich, Germany
Abstract
Kalyanam and Shively [1998. Estimating irregular pricing effects: a stochastic spline regression approach. Journal of Marketing
Research 35 (1), 16–29] and van Heerde et al. [2001. Semiparametric analysis to estimate the deal effect curve. Journal of Marketing
Research 38 (2), 197–215] have demonstrated the usefulness of nonparametric regression to estimate pricing effects flexibly. The
empirical results of these two studies, however, also revealed that nonparametric regression may suffer from too much flexibility leading
to nonmonotonic shapes for price effects. In this paper, we show how the problem of nonmonotonicity can be dealt with without losing
the power of flexible estimation techniques. We propose a semiparametric approach based on Bayesian P-splines with monotonicity
constraints imposed on own- and cross-price effects. In an empirical application, we illustrate that flexible estimation of own- and cross-price
effects can improve the predictive validity of a sales response model substantially, even when price response curves are constrained
to show a monotonic shape, as suggested by economic theory. We also discuss the consequences from an unconstrained estimation of
price effects.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Price response modeling; Monotonic regression splines; Bayesian estimation
1. Introduction
1.1. Problem description
It is well known that temporary price reductions offered
by retailers may substantially increase sales of brands (e.g.,
Wilkinson et al., 1982; Blattberg and Neslin, 1990;
Bemmaor and Mouchoux, 1991; Blattberg et al., 1995;
Neslin, 2002). There is also empirical evidence that a price
change for one brand may affect sales of competitive items
in the same product category significantly (e.g., Blattberg
and Wisniewski, 1989; Allenby and Rossi, 1991; Mulherne
and Leone, 1991; Bemmaor and Mouchoux, 1991; Sivakumar and Raj, 1997; Sethuraman et al., 1999). Despite a
wealth of empirical studies on own- and cross-price effects,
however, little was known about the shape of price
response curves for frequently purchased consumer goods
until recently. Most studies addressing this issue employed
strictly parametric functions, and came to different results
from model comparisons. Today, multiplicative, semilog
and log-reciprocal functional forms are the most widely
used parametric specifications to represent nonlinearities in
price response for brand sales (e.g., Blattberg and
Wisniewski, 1989; Blattberg and George, 1991; Montgomery, 1997;
Kopalle et al., 1999; Foekens et al., 1999; van
Heerde et al., 2001, 2002; Bemmaor and Wagner, 2002). It
is important to note that these parametric functional forms
are inherently monotonic, i.e., monotonically decreasing in
own-price and monotonically increasing in cross-price,
which is in accordance with economic theory (e.g.,
Hanssens et al., 2001).

* Corresponding author. Tel.: +49 941 943 2274; fax: +49 941 943 2828.
E-mail address: winfried.steiner@wiwi.uni-regensburg.de (W.J. Steiner).
0969-6989/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jretconser.2007.02.008

In order to shed more light on this topic, Kalyanam and
Shively (1998) and van Heerde et al. (2001) proposed
nonparametric regression techniques to estimate price
response curves more flexibly. Specifically, Kalyanam and
Shively proposed a stochastic spline regression approach
and van Heerde et al. a kernel regression approach, and
both obtained superior performance for their models
compared to strictly parametric models. The empirical
results of these studies indicate that own- and cross-price
effects may show complex nonlinearities which are difficult
or impossible to capture with parametric models. These
complex nonlinearities may be caused by the existence of
threshold effects (e.g., flat own price response at the upper
bound of the observed price range), saturation effects
(decreasing returns to scale with decreasing price levels),
odd pricing effects (which can be considered as a special
type of threshold effects), market segments with distinct
reservation prices, or a convolution of several of these
individual effects.¹ In addition, both studies provide
empirical evidence that price response may not only differ
across product categories but also across brands within a
product category. Altogether, the findings of Kalyanam
and Shively (1998) and van Heerde et al. (2001) strongly
support the use of nonparametric techniques to let the data
determine the shape of price response curves. Recently,
Martínez-Ruiz et al. (2006) applied the methodology
suggested by van Heerde et al. to daily (instead of weekly)
store-level scanner data.²
Kalyanam and Shively (1998), however, also reported
strong irregularities in own-price response for some of the
brands examined. In particular, some curves show local
upturns and downturns with spikes at certain price levels,
resulting in less smooth and nonmonotonic shapes (see
Fig. 1 below, right-hand panel, dashed line, for an example). The
authors themselves pointed out that in case of an
insufficient number of data points, the estimated curves
may show irregularities where none exist. The problem of
nonmonotonicity also applied to another brand in their
study, where the estimated curve indicated an increase in
unit sales for higher price levels beyond a certain price
point (see Fig. 1 below, left-hand panel, dashed line). This
irregularity is not in accordance with economic theory and,
as a consequence, would suggest an optimal price at
infinity. The response curves estimated by van Heerde et al.
(2001) were smoother, though not free of
nonmonotonicities. For example, one own-price response
curve indicated a decrease in unit sales as price cuts become
very deep, which is again difficult to interpret from an
economic point of view. The authors noted that such
nonmonotonic effects might be due to chance.
The power of nonparametric regression for estimating
response functions based on aggregate data has also been
demonstrated for market share models by Hruschka
(2002), who proposed a semiparametric attraction model
allowing for functional flexibility. In his empirical study,
the semiparametric model provided better fits according to
the BIC criterion and error measures determined by
bootstrapping compared to the strictly parametric MNL
and MCI attraction models. In another paper, Hruschka
(2001) further shows that a neural net based market share
attraction model can also achieve greater flexibility than a
common parametric attraction model, leading to different
managerial implications. Moreover, the use of nonparametric
or neural net based (also called seminonparametric)
techniques has become increasingly popular for modeling
brand choice of consumers using disaggregate data. Non-/
semiparametric choice models have been proposed by, e.g.,
Abe (1999), Briesch et al. (2002) or Hruschka et al. (2004);
flexible neural net based (enhanced) choice models have
been developed by, e.g., Bentz and Merunka (2000) or
Hruschka et al. (2004). Shively et al. (2000) introduced a
nonparametric approach to identify latent relationships in
hierarchical choice models.

¹ Threshold effects are present if consumers do not change their purchase
intentions unless a price cut exceeds a certain threshold level, say, e.g.,
15% (Gupta and Cooper, 1992; Bucklin and Gupta, 1999). A common
argument for the existence of saturation effects is based on the belief that
consumers can stockpile and/or consume only limited amounts of goods,
e.g., due to inventory constraints or perishability (Blattberg et al., 1995).
Odd pricing refers to the practice of retailers of setting prices in odd
numbers (e.g., 99 cents instead of 1.00 €) and may cause steps or kinks at
the respective odd price point (e.g., Kalyanam and Shively, 1998).
² In another empirical application of nonparametric regression techniques,
van Heerde et al. (2004) used local polynomial regression to allow
for a flexible decomposition of different sales promotion effects.

1.2. Objectives of this study
It is important for retailers to know how sales respond to
price changes. Kalyanam and Shively (1998), van Heerde et
al. (2001) and Martínez-Ruiz et al. (2006) have demonstrated in their studies that flexible regression techniques
have the power to uncover complex nonlinearities in price
response. Large improvements in fit and/or predictive
validity from using nonparametric instead of parametric
regression to estimate price effects have been reported in all
three studies. On the other hand, these studies also revealed
that nonparametric regression techniques are very sensitive
and may suffer from too much flexibility leading to
economically implausible results (i.e., nonmonotonic price
response curves). Nonmonotonic shapes for price response
functions are not only a questionable result from an
economic point of view, but also pose serious problems to
marketing managers for related pricing decisions. In this
paper, we show how the problem of nonmonotonicity can
be dealt with without losing the power of flexible
estimation methods. Our approach is similar to many
applications of conjoint analysis which include the price as
an attribute (e.g., Allenby et al., 1995): we impose
monotonicity constraints on own- and cross-item price
effects. Importantly, imposing monotonicity constraints
does not preclude the estimation of exceptional pricing
effects like steps and kinks at certain price points or
threshold and saturation effects at the extremes of the
observed price ranges.
The remainder of the paper is organized as follows.
In Section 2, we introduce a semiparametric approach
based on Bayesian P-splines to model own- and cross-price
effects flexibly, and we provide some details about the
MCMC techniques used for estimation. In Section 3, we
illustrate our methodology in an empirical application
using weekly store-level scanner data for eight brands of
refrigerated orange juice offered by a large supermarket
chain. Our results show that the semiparametric model,
although constrained to provide monotonic price effects,
outperforms three widely used parametric models in
predictive validity for all but one of the brands. In Section
4, we compare our Bayesian model to a semiparametric
model without monotonicity constraints estimated in a
frequentist (i.e., non-Bayesian) setting with backfitting.
Finally, in Section 5, we conclude with a summary of the
key findings of the paper.

Fig. 1. Parametrically (solid line) versus nonparametrically (dashed line) estimated own-price effects. Numbers along the price response curves indicate the
number of times a price level occurred in the data (Kalyanam and Shively, 1998, p. 26).

2. Methodology
2.1. Semiparametric model
We suggest a semiparametric approach in which we
model a brand’s (log) unit sales as (1) a sum of
nonparametric functions for own- and cross-item price
variables and (2) a parametric function of other variables
(capturing store effects, display effects and seasonal effects):
\[
\ln Q_{is,t} = \alpha_{is} + \sum_{j=1}^{J} f_{ij}(P_{js,t}) + \gamma_i D_{is,t} + \sum_{q=2}^{4} \delta_{i,q} W_{q,t} + \varepsilon_{is,t}, \qquad \varepsilon_{is,t} \sim N(0, \sigma^2), \tag{1}
\]
where $Q_{is,t}$ is the unit sales of brand $i$ in store $s$ and week $t$,
$P_{js,t}$ the observed price of brand $j$ in store $s$ and week $t$, $D_{is,t}$
the dummy variable capturing usage ($= 1$) or nonusage
($= 0$) of a display for brand $i$ in store $s$ and week $t$, $W_{q,t}$ the
seasonal dummy indicating if week $t$ belongs to the $q$th
quarter ($q = 2, 3, 4$), with spring representing the reference
season, $\alpha_{is}$ the random store effect for brand $i$ accounting
for heterogeneity in baseline sales across different stores,
$f_{ij}(P_{js,t})$ the unknown smooth functions for price effects on
unit sales of brand $i$, referring to own price ($j = i$) and
prices of competing brands ($j \neq i$), $\gamma_i$ the own-display effect
for brand $i$, $\delta_{i,q}$ the seasonal effects for brand $i$ ($q = 2, 3, 4$),
and $\varepsilon_{is,t}$ the disturbance term for brand $i$, store $s$ and week $t$.
To model own- and cross-price effects flexibly, we follow
Lang and Brezger (2004), who proposed a Bayesian version
of P-splines originally introduced by Eilers and Marx
(1996) in a frequentist setting. Accordingly, we assume that
an unknown price response function $f_{ij}(P_{js,t})$ can be
approximated by a cubic spline with equally spaced knots
within the observed price range. Suppressing brand index $i$,
store index $s$ and time index $t$ for convenience, we can write
such a spline for the $j$th price effect in terms of a linear
combination of $M_j$ cubic B-spline basis functions $B_{jm}$,
$m = 1, \ldots, M_j$ (we refer to De Boor, 2001 as a key reference for
B-splines):
\[
f_j(P_j) = \sum_{m=1}^{M_j} \beta_{jm} B_{jm}(P_j), \tag{2}
\]
where $\beta_{jm}$ denotes the regression coefficient to be estimated
for the $m$th B-spline basis function. Eilers and Marx (1996)
suggested using a moderately large number of knots (usually
between 20 and 40) to ensure enough flexibility for the
unknown function on the one hand, and introducing a
roughness penalty on adjacent regression coefficients $\beta_{jm}$ to
guarantee sufficient smoothness and to avoid overfitting on
the other hand. The resulting penalized least-squares problem
for the semiparametric model (1) is stated in Appendix A. For
our empirical application presented in Section 3, we use 20
knots for all own- and cross-price effects.
In a Bayesian approach, as considered in this paper, the
unknown regression coefficients $\beta_{jm}$ (as well as all other
parameters of the semiparametric model (1)) are considered
as random variables and have to be supplemented with
appropriate prior distributions. In our Bayesian model
setting, penalization is accomplished by using a second
order random walk for adjacent regression coefficients:
\[
\beta_{jm} = 2\beta_{j,m-1} - \beta_{j,m-2} + u_{jm}, \qquad u_{jm} \sim N(0, \tau_j^2). \tag{3}
\]
The second order random walk is the stochastic analogue
to the second order difference penalty suggested by Eilers
and Marx (1996). The variance parameter $\tau_j^2$ controls the
trade-off between flexibility and smoothness of the P-spline
and corresponds to the smoothing parameter in classical
spline regression (compare Appendix A). In Appendix B,
we illustrate with a simulation example how the P-spline
approach works.
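
The interplay of the B-spline basis in Eq. (2) and the second order difference penalty can be sketched in a few lines. The following is a minimal frequentist illustration in the spirit of Eilers and Marx (1996), not the Bayesian estimator used in this paper: the function names, the simulated data, the fixed smoothing parameter `lam`, and the reading of "20 knots" as 20 equally spaced intervals are all assumptions made for this example.

```python
import numpy as np


def bspline_basis(x, lower, upper, n_intervals=20, degree=3):
    """B-spline design matrix on equally spaced knots (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    h = (upper - lower) / n_intervals
    # Extend the knot grid by `degree` knots on each side of [lower, upper].
    knots = lower + h * np.arange(-degree, n_intervals + degree + 1)
    # Degree 0: indicator functions of the knot intervals.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, : -(d + 1)]) / (d * h)
        right = (knots[None, d + 1:] - x[:, None]) / (d * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape: (len(x), n_intervals + degree)


def fit_pspline(x, y, lam, lower, upper, n_intervals=20):
    """Penalized least squares with a second order difference penalty."""
    B = bspline_basis(x, lower, upper, n_intervals)
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)  # rows of the form (1, -2, 1)
    beta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return beta, B @ beta


# Simulated (hypothetical) log-sales data with a convex, decreasing price effect.
rng = np.random.default_rng(0)
price = np.linspace(1.0, 3.0, 200)
true_effect = 4.0 + 2.0 * np.exp(-3.0 * (price - 1.0))
log_sales = true_effect + rng.normal(scale=0.1, size=price.size)

beta, fitted = fit_pspline(price, log_sales, lam=1.0, lower=1.0, upper=3.0)
rmse = float(np.sqrt(np.mean((fitted - true_effect) ** 2)))

# A larger smoothing parameter can only reduce the roughness of the coefficients.
beta_rough, _ = fit_pspline(price, log_sales, lam=0.01, lower=1.0, upper=3.0)
beta_smooth, _ = fit_pspline(price, log_sales, lam=100.0, lower=1.0, upper=3.0)
roughness = lambda b: float(np.sum(np.diff(b, n=2) ** 2))
```

In this sketch `lam` plays the role of the classical smoothing parameter; in the Bayesian formulation the corresponding trade-off is governed by the variance $\tau_j^2$ of the random walk (3), which is estimated rather than fixed.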
2.2. Bayesian estimation
The main advantage of our Bayesian approach is that
the amount of smoothness for each price effect can be
estimated simultaneously with all other model parameters
by defining an additional hyperprior for the variance
parameters $\tau_j^2$. We assign inverse Gamma $IG(a_j, b_j)$
distributions to the variance parameters $\tau_j^2$ (and also to
the scale parameter $\sigma^2$, compare Eq. (1)) with $a_j = b_j = 0.001$,
leading to almost diffuse priors. To obtain monotonicity,
i.e., $f_j'(P_j) \le 0$ for own-price response ($j = i$) and
$f_j'(P_j) \ge 0$ for cross-price response ($j \neq i$), it can be shown
that it is sufficient to guarantee that subsequent parameters
$\beta_{jm}$ are ordered, such that
\[
\beta_{j1} \ge \beta_{j2} \ge \cdots \ge \beta_{jM_j} \quad \text{or} \quad \beta_{j1} \le \beta_{j2} \le \cdots \le \beta_{jM_j}, \tag{4}
\]
respectively. These constraints are easily imposed by
introducing indicator functions to truncate the second
order random walk prior (3) appropriately. Finally,
concerning the parametric effects, we assume diffuse priors
for the display and seasonal effects ($\gamma_i$, $\delta_{i,q}$) and highly
dispersed normal priors for the random store effects ($\alpha_{is}$).
Estimation of the semiparametric model is fully Bayesian
and uses recently developed MCMC techniques. More
specifically, we successively draw from the full conditionals,
which are all known distributions (i.e., multivariate
normal distributions for both price effects $\beta_j$, $j = 1, \ldots, J$,
and store, display and seasonal effects $\alpha$, $\gamma$ and $\delta$; inverse
Gamma distributions for all variance parameters). Technical
details on the full conditionals, especially that of the
smooth functions for price effects, the employed Gibbs
sampling scheme and efficient implementation, are available from the authors upon request.
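
The sufficiency of ordered coefficients for a monotonic spline can be checked numerically. The sketch below is illustrative only and is not the authors' sampler: it evaluates a cubic B-spline with nonincreasing coefficients on a fine grid, and it shows one generic way a single coefficient could be drawn from a normal distribution truncated between its neighbours (inverse-CDF sampling). The helper names, the coefficient values and the choice of mean and standard deviation are assumptions.

```python
import numpy as np
from statistics import NormalDist


def bspline_basis(x, lower, upper, n_intervals=20, degree=3):
    """B-spline design matrix on equally spaced knots (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    h = (upper - lower) / n_intervals
    knots = lower + h * np.arange(-degree, n_intervals + degree + 1)
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, : -(d + 1)]) / (d * h)
        right = (knots[None, d + 1:] - x[:, None]) / (d * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B


def draw_truncated_normal(mean, sd, lower, upper, u):
    """Inverse-CDF draw from N(mean, sd^2) restricted to [lower, upper].

    `u` is a Uniform(0, 1) variate supplied by the caller.
    """
    dist = NormalDist(mean, sd)
    p = dist.cdf(lower) + u * (dist.cdf(upper) - dist.cdf(lower))
    return dist.inv_cdf(p)


# Hypothetical coefficients ordered as in Eq. (4) for an own-price effect.
rng = np.random.default_rng(1)
beta = np.sort(rng.normal(size=23))[::-1]  # beta_1 >= beta_2 >= ... >= beta_M

# The resulting spline should never increase along the price grid.
grid = np.linspace(1.0, 3.0, 401)
f = bspline_basis(grid, 1.0, 3.0) @ beta
max_upturn = float(np.max(np.diff(f)))

# Updating one coefficient while respecting the ordering of its neighbours.
m = 10
proposal = draw_truncated_normal(beta[m], 0.5, beta[m + 1], beta[m - 1], u=0.5)
```

The reasoning behind the check is that the derivative of a B-spline is itself a B-spline of lower degree whose coefficients are the scaled first differences of the original coefficients, so nonpositive differences yield a nonpositive derivative.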
2.3. Benchmark parametric models
To provide a benchmark for the predictive performance
of our semiparametric model, we compare it in our
empirical application to the three most widely used
parametric models for analyzing sales/price response:
\[
\ln Q_{is,t} = \alpha_{is} + \sum_{j=1}^{J} \beta_{ij} \ln(P_{js,t}) + \gamma_i D_{is,t} + \sum_{q=2}^{4} \delta_{i,q} W_{q,t} + \varepsilon_{is,t}, \qquad \varepsilon_{is,t} \sim N(0, \sigma^2), \tag{5}
\]
\[
\ln Q_{is,t} = \alpha_{is} + \sum_{j=1}^{J} \beta_{ij} P_{js,t} + \gamma_i D_{is,t} + \sum_{q=2}^{4} \delta_{i,q} W_{q,t} + \varepsilon_{is,t}, \qquad \varepsilon_{is,t} \sim N(0, \sigma^2), \tag{6}
\]
\[
\ln Q_{is,t} = \alpha_{is} + \beta_{ii} P_{is,t} + \sum_{\substack{j=1 \\ j \neq i}}^{J} \beta_{ij} (1/P_{js,t}) + \gamma_i D_{is,t} + \sum_{q=2}^{4} \delta_{i,q} W_{q,t} + \varepsilon_{is,t}, \qquad \varepsilon_{is,t} \sim N(0, \sigma^2). \tag{7}
\]
Models (5)–(7) differ from the semiparametric model (1)
only with respect to own- and cross-price effects, which are
now specified parametrically, too. Model (5) uses a multiplicative
(log–log) functional form like the well-known
SCAN*PRO model in its strictly parametric versions (e.g.,
Foekens et al., 1999; Kopalle et al., 1999; van Heerde et al.,
2001, 2002), with $\beta_{ij}$ representing the constant elasticity of
unit sales of brand $i$ with respect to the price of brand $j$.
Model (6) is a semilog (or exponential) model and has been
used by, e.g., Blattberg and George (1991), Montgomery
(1997) or Kalyanam and Shively (1998). Model (7) follows
Blattberg and Wisniewski (1989) and is semilog in own
price and log-reciprocal in competitive prices (also see
Blattberg and Neslin, 1990; Bemmaor and Mouchoux,
1991). Accordingly, $\beta_{ii}$ corresponds to the own-price effect
of brand $i$ and $\beta_{ij}$ ($j \neq i$) to the cross-item price effects. All
three models imply a convex (decreasing) shape for own-price
effects. With respect to cross-price effects, the
multiplicative model (5) allows for both increasing and
decreasing returns to scale (i.e., a convex or concave
shape), the semilog model (6) for increasing returns to scale
(i.e., a convex shape), and the log-reciprocal specification in
(7) for an s-shape. All full conditionals involved in models
(5)–(7) are likewise known distributions and can therefore be
updated easily by Gibbs sampling steps.
3. Empirical study
3.1. Data
In this section, we present results from an empirical
application of our Bayesian semiparametric model to
weekly store-level scanner data for eight brands of
refrigerated orange juice offered by Dominick’s Finer
Foods, a major supermarket chain in the Chicago
metropolitan area. The data were provided by the James
M. Kilts Center, GSB, University of Chicago and include
unit sales, retail prices and display activities for these
brands in 81 stores of the chain over a time span of 89
weeks. Table 1 shows summary statistics pooled across the
stores for average, minimum and maximum weekly market
shares, mean prices as well as price ranges of the individual
brands.
Among the brands are 2 premium brands (made from
freshly squeezed oranges), 5 national brands (reconstituted
from frozen orange juice concentrate) and the supermarket’s
own private label brand (Dominick’s). The
differences in quality across the three tiers are well
represented by higher (lower) average prices for higher
(lower) quality tier brands as well as by different price
ranges. The weekly market shares of all brands vary
considerably, reflecting the high price variation in this
product category.

Table 1
Market shares (%), mean prices and price ranges ($) for brands in the
refrigerated orange juice category

Brand             Average        Lowest         Highest        Mean    Price range
                  market share   market share   market share   price
Tropicana Pure    15             3              73             2.95    [1.60; 3.55]
Florida Natural   5              1              53             2.86    [1.57; 3.16]

Citrus Hill       8              1              78             2.31    [1.09; 2.82]
Minute Maid       21             3              87             2.23    [1.29; 2.92]
Tropicana         21             2              75             2.20    [1.49; 2.75]
Florida Gold      4              1              63             2.17    [0.99; 2.83]
Tree Fresh        4              1              42             2.15    [1.07; 2.48]

Dominick’s        22             1              83             1.75    [0.99; 2.47]

3.2. Cross-price effects
To account for multicollinearity and for the fact that
cross-price effects are usually much weaker than own-price
effects (see Hanssens et al., 2001 for an overview of
empirical findings), we capture cross-price effects in a more
parsimonious way at the tier level rather than at the
individual brand level: we define price_premium_{st} as the
lowest price of a premium brand and price_national_{st} as the
lowest price of a national brand in store s and week t,
respectively. It is important to note that the price of a
brand i under consideration (i.e., the brand for which a
response model is estimated at a time) is excluded from the
computation of price_premium_{st} (price_national_{st}) if brand i
is a premium (national) brand. For example, if our
semiparametric model or any of the parametric models is
estimated for the national brand Citrus Hill, price_national_{st}
represents the lowest price level of any of the four other
national brands (Minute Maid, Tropicana, Florida Gold,
Tree Fresh) in store s and week t. Finally, price_Dominicks_{st}
denotes the actual price for Dominick’s, the only
private label brand, in store s and week t. Previous
approaches have modeled competitive effects either in a
much more parsimonious way through the use of a single
competitive variable (e.g., Blattberg and George, 1991;
Kopalle et al., 1999) or by focusing only on a limited
number of major brands in a product category (e.g.,
Kalyanam and Shively, 1998; van Heerde et al., 2001).³
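
The construction of the tier-level cross-price variables for a single store and week can be made concrete with a small sketch. The tier assignment follows the description of the brands given above; the function name, the dictionary layout and the shelf prices are hypothetical and serve only as an illustration.

```python
# Quality tier of each brand, following the brand description in Section 3.1.
TIER = {
    "Tropicana Pure": "premium",
    "Florida Natural": "premium",
    "Citrus Hill": "national",
    "Minute Maid": "national",
    "Tropicana": "national",
    "Florida Gold": "national",
    "Tree Fresh": "national",
    "Dominick's": "private",
}


def cross_price_variables(prices, focal_brand):
    """Tier-level competitive prices for one store-week, excluding the focal brand."""

    def lowest(tier):
        candidates = [
            price
            for brand, price in prices.items()
            if TIER[brand] == tier and brand != focal_brand
        ]
        return min(candidates) if candidates else None

    result = {
        "price_premium": lowest("premium"),
        "price_national": lowest("national"),
    }
    # For the private label itself, its price is the own price, not a cross price.
    if focal_brand != "Dominick's":
        result["price_Dominicks"] = prices["Dominick's"]
    return result


# Hypothetical shelf prices ($) for one store and week, for illustration only.
week_prices = {
    "Tropicana Pure": 2.99,
    "Florida Natural": 2.79,
    "Citrus Hill": 2.29,
    "Minute Maid": 1.99,
    "Tropicana": 2.19,
    "Florida Gold": 2.39,
    "Tree Fresh": 2.09,
    "Dominick's": 1.69,
}

citrus_hill = cross_price_variables(week_prices, "Citrus Hill")
minute_maid = cross_price_variables(week_prices, "Minute Maid")
florida_natural = cross_price_variables(week_prices, "Florida Natural")
```

With these prices, the national-tier variable for Minute Maid is the lowest price among the four other national brands, so the brand's own price never enters its own competitive price variable.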
³ In general, there is no need to have only a limited number of
nonparametric terms in our Bayesian semiparametric model in order to
obtain good estimation results.

3.3. Predictive validity
We compared the forecasting performance of our
semiparametric model (1) to that of the three parametric
models (5)–(7) in terms of the average mean squared sales
prediction error (AMSE) in validation samples. Specifically,
we randomly split the data into nine equally sized
subsets and performed nine-fold cross-validation. For each
subset used once for validation, we fitted the respective
model to the remaining eight subsets making up the
estimation sample and calculated the mean squared sales
prediction error (MSE) of the fitted model when applied to
the observations in this holdout subset (Efron and
Tibshirani, 1998). Finally, we computed the AMSE
measure by averaging the individual MSE values across the
nine holdout subsets. Because we are interested in unit sales
rather than log unit sales of a brand, the conditional mean
predictions from the estimated log-normal models were
obtained as follows (Goldberger, 1968; Greene, 1997):
\[
\hat{Q}_{is,t} = \exp\left(\widehat{\ln Q}_{is,t} + \hat{\sigma}_i^2 / 2\right), \tag{8}
\]
where $\hat{\sigma}_i^2$ denotes the residual variance of the respective log-normal
model and is included to minimize the bias in the
conditional mean predictions due to estimation in the log-space.
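
The validation procedure, including the retransformation in Eq. (8), can be sketched as follows. This is a simplified illustration under stated assumptions: a single-price semilog model fitted by ordinary least squares stands in for the models of this paper (which are estimated by MCMC and include store, display and seasonal effects), and the simulated data, function names and random seeds are hypothetical.

```python
import numpy as np


def lognormal_mean(log_prediction, sigma2):
    """Conditional mean of sales implied by a log-normal model, as in Eq. (8)."""
    return np.exp(log_prediction + sigma2 / 2.0)


def cross_validated_amse(X, log_sales, n_folds=9, seed=0):
    """Average holdout MSE (in sales units) for a linear model of log sales."""
    n = len(log_sales)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    mse_values = []
    for holdout in folds:
        train = np.setdiff1d(np.arange(n), holdout)
        coef, *_ = np.linalg.lstsq(X[train], log_sales[train], rcond=None)
        resid = log_sales[train] - X[train] @ coef
        sigma2 = resid @ resid / (len(train) - X.shape[1])  # residual variance
        predicted_sales = lognormal_mean(X[holdout] @ coef, sigma2)
        actual_sales = np.exp(log_sales[holdout])
        mse_values.append(np.mean((actual_sales - predicted_sales) ** 2))
    return float(np.mean(mse_values)), [len(fold) for fold in folds]


# Simulated (hypothetical) store-week observations for one brand.
rng = np.random.default_rng(42)
n_obs = 810
price = rng.uniform(1.0, 3.0, size=n_obs)
log_sales = 6.0 - 1.2 * price + rng.normal(scale=0.3, size=n_obs)
X = np.column_stack([np.ones(n_obs), price])

amse, fold_sizes = cross_validated_amse(X, log_sales)
```

Because the exponential function is convex, simply exponentiating the predicted log sales would understate mean sales; the term $\hat{\sigma}_i^2/2$ in Eq. (8) corrects for this under the log-normal assumption.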
3.4. Empirical results
3.4.1. AMSE performance
Table 2 first displays the validation results (AMSE
values) for the three parametric models and shows that the
multiplicative model performed best for six of the eight
brands. This finding indicates that the multiplicative model
has its high popularity in price response modeling not only
due to its constant elasticity property. It also offers a
competitive (and for these six brands a higher) forecasting
accuracy compared with other parametric specifications. In
one case, however, the semilog model (for the store brand
Dominick’s) and in another case the semilog/log-reciprocal
model (for the national brand Tropicana) provided the
highest predictive performance.
Table 3 adds the AMSE values we obtained for our
semiparametric model and compares them to those of the
best parametric model (see Table 2). The results indicate a
superior predictive validity of our flexible approach for all
national and premium brands in the refrigerated orange
juice category (i.e., for 7 out of 8 brands), with improvements
in AMSE over the best performing parametric
model ranging from 6.6% for Minute Maid and Florida
Natural up to 41.6% for Florida Gold. Importantly, the
improvements in predictive validity were attained despite
enforcing monotonicity on the nonparametrically estimated
own- and cross-price effects. We achieved, however,
no improvement for Dominick’s, the retailer’s own store
brand. This implies that flexible estimation of price effects
does not matter for this brand, and that the semiparametric
model here virtually degenerates into the semilog model
(which is nested in our flexible approach). The latter
finding is important, because it demonstrates that nonparametric
modeling of price effects need not necessarily
lead to better prediction results than strictly parametric
modeling. The greater flexibility of nonparametric techniques,
however, pays off if nonlinear effects in price
response are present that cannot be adequately captured
parametrically. We discuss this issue in more detail below.

Table 2
Predictive validity (AMSE results) for strictly parametric models

Brand             Multiplicative   Semilog      Semilog/log-reciprocal
                  model (5)        model (6)    model (7)
Tropicana Pure    3.081            3.355        3.458
Florida Natural   728              895          911

Citrus Hill       8.401            10.538       10.803
Minute Maid       2.795            3.004        2.972
Tropicana         14.012           13.615       13.200
Florida Gold      58.015           63.875       64.234
Tree Fresh        7.440            8.155        8.395

Dominick’s        103.075          101.381      101.705

Table 3
Predictive validity (AMSE results) for the semiparametric and the best
parametric model

Brand             Semiparametric   Best parametric   Improvement
                  model (1)        model             in AMSE
Tropicana Pure    2.844            3.081             7.7%
Florida Natural   680              728               6.6%

Citrus Hill       5.502            8.401             34.5%
Minute Maid       2.612            2.795             6.6%
Tropicana         11.894           13.200            9.9%
Florida Gold      33.892           58.015            41.6%
Tree Fresh        4.637            7.440             37.7%

Dominick’s        102.022          101.381           No

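
The improvement figures in Table 3 follow from the relative reduction in AMSE with respect to the best parametric model. The short check below reproduces them from the AMSE values as printed; the helper name is hypothetical, and the printed values are treated as plain numbers, which does not affect the ratios. Small deviations from the published percentages are to be expected because the tabulated AMSE values are rounded.

```python
# (semiparametric AMSE, best parametric AMSE, reported improvement in %)
# All values are copied from Table 3 as printed.
table3 = {
    "Tropicana Pure": (2.844, 3.081, 7.7),
    "Florida Natural": (680.0, 728.0, 6.6),
    "Citrus Hill": (5.502, 8.401, 34.5),
    "Minute Maid": (2.612, 2.795, 6.6),
    "Tropicana": (11.894, 13.200, 9.9),
    "Florida Gold": (33.892, 58.015, 41.6),
    "Tree Fresh": (4.637, 7.440, 37.7),
}


def improvement_pct(semiparametric, best_parametric):
    """Relative AMSE reduction of the semiparametric model, in percent."""
    return 100.0 * (best_parametric - semiparametric) / best_parametric


computed = {
    brand: improvement_pct(semi, best) for brand, (semi, best, _) in table3.items()
}

# For Dominick's the semiparametric AMSE is higher, so the value is negative.
dominicks_change = improvement_pct(102.022, 101.381)
```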
3.4.2. Estimated price effects
Fig. 2 depicts selected price response curves (with price
on the x-axis and predicted sales on the y-axis) and reveals
why the semiparametric model (1) can provide more
accurate forecasts than the best performing parametric
model. The solid lines represent the flexibly estimated price
effects from our semiparametric model (i.e., the P-splines),
whereas the dashed lines refer to the estimated price effects
with respect to the best performing parametric model
(which is the multiplicative model for the displayed brands,
compare Table 2). Also shown are the 95% pointwise
credible intervals (dotted lines) for the nonparametric price
effects.
Figs. 2a–c show estimated own-price effects for the
national brands Florida Gold, Tree Fresh and Citrus Hill,
which are the three brands with the most noticeable
improvements in predictive validity from flexible estimation of price effects. All three nonparametric own-price
response curves show an L-shape indicating a threshold
level beyond which unit sales rapidly increase for still lower
prices, while the multiplicative model yields an exponential
price response curve. For Florida Gold, the strong sales
spike can be attributed to an odd pricing effect at 99 cents,
the lowest observed price for this brand. The threshold
levels occur at rather low price levels, implying that these
brands can increase their sales significantly only by setting
very low prices. The multiplicative model, in contrast,
dramatically understates the sales effect for low prices. The
estimated P-spline for the premium brand Florida Natural
(see Fig. 2d) shows a reverse s-shape with a threshold effect
around $2.00 and further indicates a saturation effect at the
lowest observed prices. The estimated own-price effects for
the second premium brand Tropicana Pure (not shown
here) are quite similar to those of Florida Natural. The
differences between the nonparametric and the best
parametric own-price response curves for the national
brands Minute Maid and Tropicana and for the store
brand Dominick’s are much less distinct.
Figs. 2e–g illustrate selected cross-price effects for the
brands Tree Fresh, Minute Maid and Florida Gold with
respect to competing items in the national brand tier. The
nonparametric curves show a reverse L-shape (indicating a
saturation effect for prices below a certain price level) or a
mixture of an s- and reverse L-shape. For example, if one
of the competing national brands Citrus Hill, Minute
Maid, Tropicana or Florida Gold only slightly decreases its
price, unit sales of Tree Fresh strongly decrease (compare
Fig. 2e). Noticeably, the nonparametric curve lies above
the parametric curve for high(er) price levels for all three
national brands. The nonparametric cross-price effect for
the premium brand Tropicana Pure with respect to its
direct competitor Florida Natural, the other premium
brand, shows an s-shape indicating both a threshold and a
saturation effect (compare Fig. 2h). In general, cross-price
effects turn out to be weaker than own-price effects, as
becomes evident from the predicted sales numbers on the y-axis in Fig. 2.
All estimated own- and cross-price effects of the best
performing parametric models were significant at 5%. The
display effects for the premium brand Florida Natural, for
the national brand Minute Maid and the store brand
Dominick’s are not significant, while the display effects for
the brands Citrus Hill, Tropicana, Florida Gold and
Tree Fresh are significant at 5% and show the expected
sign. The display effect for the premium brand Tropicana
Pure shows the wrong sign. However, this effect is near
zero (0.04) and hence probably due to chance.
Fig. 2. Estimated price effects from the semiparametric model (solid lines) and the best performing parametric model (dashed lines). Dotted lines indicate
the 95% pointwise credible intervals for the P-splines.

3.4.3. Price elasticities
It is also important to note that the semiparametric
model provides different managerial insights with regard to
price elasticities. Table 4 reports own-price elasticities for
Citrus Hill, Florida Gold, and Tree Fresh, the brands with
the largest improvements in predictive validity from our
semiparametric model. The best performing parametric
model for these brands is the multiplicative model,
which is characterized by a constant elasticity over the
entire price range. Shown are separate elasticity measures
for low, medium and high price levels of these brands.
Importantly, the differences are very large for low prices of
Citrus Hill and Florida Gold, where the semiparametric
model suggests a much higher elasticity than its parametric
counterpart. For high prices, the semiparametric model
suggests a noticeably lower elasticity than the multiplicative model for all three brands.
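
Why the multiplicative model implies the same elasticity at every price level, while other specifications do not, can be seen from the definition of the point elasticity of a model for log sales, $\eta(P) = P \cdot \partial \ln Q / \partial P$. The sketch below evaluates this quantity numerically; the coefficient values are illustrative assumptions (entered with a negative sign for an own-price effect), and the way elasticities were aggregated over the price ranges of Table 4 is not assumed here.

```python
import math


def point_elasticity(log_sales, price, h=1e-5):
    """Price elasticity of a model for log sales, via a central difference."""
    slope = (log_sales(price + h) - log_sales(price - h)) / (2.0 * h)
    return slope * price


def multiplicative(price, beta=-3.60):
    """Price term of a log-log model: beta * ln(P)."""
    return beta * math.log(price)


def semilog(price, beta=-1.50):
    """Price term of a semilog model: beta * P."""
    return beta * price


prices = (1.2, 2.0, 2.8)
elasticity_multiplicative = [point_elasticity(multiplicative, p) for p in prices]
elasticity_semilog = [point_elasticity(semilog, p) for p in prices]
```

In the log-log case the elasticity equals the coefficient at all three prices, whereas in the semilog case it grows in magnitude proportionally to the price. A flexibly estimated curve can be passed to the same function to obtain a price-dependent elasticity.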

Table 4
Estimated own-price elasticities from the semiparametric and the best
parametric model

                Semiparametric model               Multiplicative model
                Price ranges                       Price ranges
Brand           ≤1.5 $   [1.5; 2.5] $   >2.5 $     ≤1.5 $   [1.5; 2.5] $   >2.5 $
Citrus Hill     6.65     2.05           3.04       3.60     3.60           3.60
Florida Gold    9.15     3.87           2.35       3.75     3.75           3.75
Tree Fresh      2.94     1.71           0.89       2.28     2.28           2.28

3.4.4. Unconstrained estimation
In order to assess the impact of the monotonicity
constraints, we also compared our results to those obtained
from an unconstrained semiparametric model estimated in
a frequentist setting like the van Heerde et al. (2001) model.
Specifically, we estimated a non-Bayesian version of our
semiparametric model (1) without monotonicity constraints using the backfitting algorithm (also compare
Appendix A). In this case, the amount of smoothness of
each price effect can no longer be estimated simultaneously
with all other model parameters. Details on the estimation
procedure, which uses the improved AIC criterion for
smoothing parameter selection, can be obtained from the
authors upon request.
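The backfitting idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation procedure used here: it swaps in a Gaussian kernel smoother with a fixed bandwidth for the P-spline smoothers with AIC-based smoothing parameter selection, and the simulated data, the function names (`kernel_smooth`, `backfit`) and all settings are hypothetical.

```python
import numpy as np

def kernel_smooth(x, r, bandwidth=0.3):
    """Nadaraya-Watson smoother: Gaussian-weighted average of r at each x."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w @ r) / w.sum(axis=1)

def backfit(X, y, n_iter=20, bandwidth=0.3):
    """Fit y = alpha + sum_j f_j(x_j) by cycling over the smooth terms."""
    n, J = X.shape
    alpha = y.mean()
    f = np.zeros((n, J))
    for _ in range(n_iter):
        for j in range(J):
            # Partial residuals: remove everything except the j-th smooth term
            partial = y - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = kernel_smooth(X[:, j], partial, bandwidth)
            f[:, j] -= f[:, j].mean()  # center each term for identifiability
    return alpha, f

# Synthetic example with two additive effects (hypothetical data)
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = 1.0 + np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.2, 300)
alpha, f = backfit(X, y)
resid = y - alpha - f.sum(axis=1)
print(round(float(resid.var() / y.var()), 3))
```

Because the smoothing parameter (here the bandwidth) is fixed inside the loop, it has to be chosen in a separate outer step, which is why the amount of smoothness cannot be estimated jointly with the other parameters in this setting.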
Fig. 3 shows three selected price effects as representative
examples from the estimation of the unconstrained
semiparametric model. Fig. 3a refers to the unrestricted
own-price effect for the premium brand Florida Natural
and reveals a strong nonmonotonic downward kink in sales
response in the lower range of the observed prices for this
brand. Fig. 3b displays the own-price effect for the national
brand Minute Maid and also indicates a sharp decrease in
unit sales as price becomes very low. These nonmonotonicities are similar to those reported by van Heerde et al.
(2001) and are difficult to interpret from an economic point
of view. At first glance, one explanation may be that
consumers associate a loss in quality with very low price
levels, but this argument seems very questionable with
frequently purchased consumer nondurables (like orange
juice brands). In addition, the unconstrained semiparametric
model yields a somewhat lower predictive validity for the
brand Florida Natural, as compared to the semiparametric
model with monotonicity constraints (also see below). The
own-price effect for Minute Maid further shows some local
upturns and downturns in the medium price range. Fig. 3c
illustrates the cross-price effect for the brand Tree Fresh
with respect to competing items in the national brand tier.
The curve is also rather unsmooth and exhibits a strong
nonmonotonic effect near the upper bound of the price
range. Accordingly, unit sales of Tree Fresh increase with a
decreasing competitive price in this price area. There is no
(economic) rationale for a meaningful interpretation of this
pattern.
Nearly all price effects estimated by the unconstrained
model suffer from nonmonotonicities and pose serious
problems for interpretation and managerial implications.
There is a tendency for cross-price effects to turn out
less smooth (i.e., showing more local upturns and downturns) than own-price effects from an unconstrained
estimation.

Fig. 3. Estimated price effects from the unconstrained semiparametric
model.
The results for the unconstrained semiparametric model
with respect to predictive validity are comparable to those
of our constrained semiparametric model for most of the
brands. The AMSE value is slightly worse for Dominick’s
and Florida Natural, virtually identical for Florida Gold,
and somewhat better for the other five brands. Importantly, for Citrus Hill, Tree Fresh and Florida Gold, the
three brands which benefit most from nonparametric
estimation, the difference between the unconstrained and
constrained semiparametric models in relative improvement in AMSE over the best performing parametric model
is at most 3.3%.
4. Conclusions
We proposed a new semiparametric model embedded in
a Bayesian framework to predict retail sales. Our results
from an empirical application based on retail scanner data
for brands of orange juice showed that flexible estimation
of price response functions can improve the predictive
validity of a sales response model substantially, even when
the price effects were restricted to have a monotonic shape,
as suggested by economic theory. Specifically, we obtained
a higher predictive accuracy for our semiparametric model
compared to three widely used parametric models for 7 out
of 8 brands. Interestingly, flexible estimation of price
effects offered no advantage over the best parametric
model for the retailer’s own store brand. We also compared
our Bayesian model to a semiparametric model without
monotonicity constraints estimated in a frequentist (i.e.,
non-Bayesian) setting using the backfitting algorithm. The
results indicated a similar predictive performance of both
models for most of the brands. However, nearly all
unrestrictedly estimated price effects revealed strong
nonmonotonicities, which are not in accordance with
economic theory and are likely to represent an artifact
caused by too much flexibility of the unconstrained
semiparametric model.
Acknowledgments
The data for our empirical study was provided by the
James M. Kilts Center, GSB, University of Chicago. We
thank Stefan Lang for his idea to visualize how the P-splines approach works.
Appendix A
A.1. Penalized least-squares problem
Let $v_n$ denote the vector of all parametric effects of the
semiparametric model (1) for the nth observation and let
index j cover all smooth functions for own- and cross-price
effects. This leads to the following penalized least-squares
criterion (suppressing brand index i, store index s and time
index t):

\[
\sum_{n=1}^{N} \Bigl( y_n - \sum_{j=1}^{J} f_j(P_{jn}) - v_n^{T} \zeta \Bigr)^{2}
+ \sum_{j=1}^{J} \lambda_j \sum_{m=k+1}^{M_j} \bigl( \Delta^{(k)} \beta_{jm} \bigr)^{2},
\tag{A.1}
\]

where N is the sample size (number of stores times number
of weeks), $\Delta^{(k)}$ denotes the differences of order k between adjacent
regression coefficients $\beta_{jm}$, and $\lambda_j$ is the smoothing parameter
for price response curve $f_j(P_j)$.
Frequently, as suggested by Eilers and Marx (1996), a
second order difference penalty $\Delta^{(2)} \beta_{jm} = \beta_{jm} - 2\beta_{j,m-1} + \beta_{j,m-2}$
is used. The penalized sum of squared residuals (A.1)
is minimized with respect to the unknown regression
coefficients $\beta_{jm}$ and $\zeta$. The trade-off between flexibility
and smoothness for price effect j is controlled by the
smoothing parameter $\lambda_j$ ($j = 1, \ldots, J$). In a non-Bayesian
setting, estimation of the semiparametric model (1) given
the smoothing parameters can be carried out with backfitting (Hastie and Tibshirani, 1990). "Optimal" smoothing
parameter selection is typically performed via cross
validation or by minimizing an information criterion with
respect to predetermined values $\lambda_j$ ($j = 1, \ldots, J$).
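For a single smooth term with basis matrix B, criterion (A.1) reduces to minimizing ||y - B beta||^2 + lambda ||D_k beta||^2, whose solution satisfies (B'B + lambda D_k'D_k) beta = B'y. The sketch below implements this and picks lambda from a predetermined grid. Several choices are assumptions made for illustration only: generalized cross-validation (GCV) is used as the selection criterion, the demo uses the identity matrix as basis (one coefficient per observation) to stay short, and the data, grid and function names are hypothetical.

```python
import numpy as np

def difference_matrix(M, k=2):
    """(M-k) x M matrix D_k such that D_k @ beta holds the k-th order differences."""
    return np.diff(np.eye(M), n=k, axis=0)

def penalized_fit(B, y, lam, k=2):
    """Solve (B'B + lam * D'D) beta = B'y; also return the trace of the hat matrix."""
    D = difference_matrix(B.shape[1], k)
    BtB = B.T @ B
    A = BtB + lam * (D.T @ D)
    beta = np.linalg.solve(A, B.T @ y)
    edf = np.trace(np.linalg.solve(A, BtB))  # effective degrees of freedom
    return beta, edf

def select_lambda(B, y, grid, k=2):
    """Choose lam from a predetermined grid by minimizing the GCV score."""
    n = len(y)
    scores = []
    for lam in grid:
        beta, edf = penalized_fit(B, y, lam, k)
        rss = float(np.sum((y - B @ beta) ** 2))
        scores.append(n * rss / (n - edf) ** 2)
    return grid[int(np.argmin(scores))]

# Demo on synthetic data (hypothetical curve and noise level)
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 100)
truth = np.sin(x)
y = truth + rng.normal(0, 0.3, x.size)
B = np.eye(x.size)  # simplification: identity basis instead of B-splines
grid = [0.1, 1.0, 10.0, 100.0, 1000.0]
lam_hat = select_lambda(B, y, grid)
beta_hat, _ = penalized_fit(B, y, lam_hat)
rmse_raw = np.sqrt(np.mean((y - truth) ** 2))
rmse_fit = np.sqrt(np.mean((B @ beta_hat - truth) ** 2))
print(lam_hat, round(float(rmse_raw), 3), round(float(rmse_fit), 3))
```

With k = 2, each row of `difference_matrix` carries the weights 1, -2, 1, which corresponds to the second order difference penalty given above. With several smooth terms, the same solve would be applied to partial residuals inside a backfitting loop.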
A.2. Illustration of P-splines
Fig. 4 illustrates how the P-splines approach
works. (a) Suppose you know the true shape of a response
function with respect to an independent variable x within
the range from $-3$ to $+3$ (solid line) and you generate 100
observations by adding a random error term. The objective
is to re-estimate the curve based on these simulated data with
a cubic P-spline. In a first step, using cubic B-splines as
basis functions, the spline can be stated as a linear
combination of M of those B-spline basis functions
$B_m$ ($m = 1, \ldots, M$) as $f(x) = \sum_{m=1}^{M} \beta_m B_m(x)$ (compare
Eq. (2)). (b) A moderately large number of knots is chosen
to divide the domain of x into equidistant intervals, and
cubic B-splines are constructed around the knots. (c)
Estimating the unknown regression coefficients $\beta_m$
($m = 1, \ldots, M$) from the data amounts to weighting
each of the B-spline basis functions $B_m$ accordingly. (d) The
estimated function value f(x) of the spline is obtained by
simply adding up the values of all overlapping weighted basis
functions at position x (i.e., by computing the linear
combination $\sum_{m=1}^{M} \beta_m B_m(x)$). However, the estimated
spline (dotted line), although approximating the true
function (solid line) quite well, obviously suffers from
overfitting, which is reflected by a rather "wiggly"
(unsmooth) curve. This overfitting is the result of not
having incorporated a roughness penalty on adjacent
regression coefficients. (e) By using a second order random
walk to penalize differences between regression coefficients
(compare Eq. (3)), the weights of adjacent B-spline basis functions $B_m$ are
coupled and, as a result, come closer to each other in
magnitude. (f) The estimated P(enalized)-spline is now
much smoother and approximates the true function
even better than the unpenalized estimate.
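Steps (a) to (f) can be retraced numerically. The following is a minimal frequentist sketch under stated assumptions, not the Bayesian estimation used in this paper: the true curve (a sine function), the noise level, the number of knot intervals and the fixed smoothing parameter `lam` are hypothetical choices, and the penalty is applied as a second order difference penalty with a given weight rather than through a random walk prior.

```python
import numpy as np

def bspline_basis(x, lo, hi, n_intervals=20, degree=3):
    """Equidistant B-spline basis (Cox-de Boor recursion).
    Returns an array of shape (len(x), n_intervals + degree)."""
    h = (hi - lo) / n_intervals
    knots = lo + h * np.arange(-degree, n_intervals + degree + 1)
    # Degree-0 basis: indicator of each knot interval
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (d * h)
        right = (knots[None, d + 1:] - x[:, None]) / (d * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

def fit(B, y, lam):
    """Least squares with a second order difference penalty on the coefficients."""
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)
    return np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)

# (a) true curve plus noise, 100 observations on [-3, 3] (hypothetical curve)
rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 100)
truth = np.sin(x)
y = truth + rng.normal(0, 0.3, x.size)

# (b) cubic B-spline basis on equidistant knots
B = bspline_basis(x, -3.0, 3.0)

# (c)-(d) unpenalized weights and fitted curve
beta_unpen = fit(B, y, lam=0.0)
# (e)-(f) penalized weights and fitted curve; lam = 5.0 is an arbitrary choice
beta_pen = fit(B, y, lam=5.0)

def roughness(beta):
    """Sum of squared second differences of adjacent coefficients."""
    return float(np.sum(np.diff(beta, n=2) ** 2))

def rmse(beta):
    """Root mean squared deviation of the fitted curve from the true curve."""
    return float(np.sqrt(np.mean((B @ beta - truth) ** 2)))

print(roughness(beta_unpen), roughness(beta_pen))
print(rmse(beta_unpen), rmse(beta_pen))
```

The roughness of the penalized coefficients cannot exceed that of the unpenalized ones, which mirrors the statement that adjacent weights come closer to each other once the penalty is active.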
[Figure 4 panels: true shape and simulated observations; B-spline basis functions; weighted B-spline basis functions; estimated B-spline (unpenalized); optimal smoothing; estimated P-spline (penalized). Horizontal axis in all panels: x.]
Fig. 4. How the P-splines approach works.
References
Abe, M., 1999. A generalized additive model for discrete choice data.
Journal of Business & Economic Statistics 17, 271–284.
Allenby, G.M., Rossi, P.E., 1991. Quality perceptions and asymmetric
switching between brands. Marketing Science 10 (3), 185–204.
Allenby, G.M., Arora, N., Ginter, J.L., 1995. Incorporating prior
knowledge into the analysis of conjoint studies. Journal of Marketing
Research 32, 152–162.
Bemmaor, A.C., Mouchoux, D., 1991. Measuring the short-term effect of
in-store promotion and retail advertising on brand sales. A factorial
experiment. Journal of Marketing Research 28 (2), 202–214.
Bemmaor, A.C., Wagner, U., 2002. Estimating market-level multiplicative
models of promotion effects with linearly aggregated data: a
parametric approach. In: Franses, P.H., Montgomery, A.L. (Eds.),
Advances in Econometrics, vol. 16, Econometric Models in Marketing.
Bentz, Y., Merunka, D., 2000. Neural networks and the multinomial logit
for brand choice modelling: a hybrid approach. Journal of Forecasting
19, 177–200.
Briesch, R.A., Chintagunta, P., Matzkin, R.L., 2002. Semiparametric
estimation of brand choice behavior. Journal of the American
Statistical Association 97, 973–982.
Blattberg, R.C., George, E.I., 1991. Shrinkage estimation of price and
promotional elasticities. Journal of the American Statistical Association 86 (414), 304–315.
Blattberg, R.C., Neslin, S.A., 1990. Sales Promotion: Concepts, Methods,
and Strategies. Englewood Cliffs, NJ.
Blattberg, R.C., Wisniewski, K.J., 1989. Price-induced patterns of
competition. Marketing Science 8 (4), 291–309.
Blattberg, R.C., Briesch, R., Fox, E.J., 1995. How promotions work.
Marketing Science 14 (3 Part 2), G122–G132.
Bucklin, R.E., Gupta, S., 1999. Commercial use of UPC scanner data:
industry and academic perspectives. Marketing Science 18 (3),
247–273.
De Boor, C., 2001. A Practical Guide to Splines, revised ed. Springer,
New York.
Efron, B., Tibshirani, R.J., 1998. An Introduction to the Bootstrap.
Chapman & Hall, CRC, London, Boca Raton.
Eilers, P.H.C., Marx, B.D., 1996. Flexible smoothing using B-splines and
penalized likelihood (with comments and rejoinder). Statistical Science
11 (2), 89–121.
Foekens, E.W., Leeflang, P.S.H., Wittink, D.R., 1999. Varying parameter
models to accommodate dynamic promotion effects. Journal of
Econometrics 89, 249–268.
Goldberger, A., 1968. The interpretation and estimation of Cobb–Douglas
functions. Econometrica 35, 464–472.
Greene, W., 1997. Econometric Analysis. Prentice-Hall, New Jersey.
Gupta, S., Cooper, L., 1992. The discounting of discounts and promotion
thresholds. Journal of Consumer Research 19, 401–411.
Hanssens, D.M., Parsons, L.J., Schultz, R.L., 2001. Market Response
Models—Econometric and Time Series Analysis. Chapman & Hall,
London.
Hastie, T., Tibshirani, R.J., 1990. Generalized Additive Models. Chapman
& Hall, London.
Hruschka, H., 2001. An artificial neural net attraction model (ANNAM)
to analyze market share effects of marketing instruments. Schmalenbach
Business Review 53, 27–40.
Hruschka, H., 2002. Market share analysis using semi-parametric
attraction models. European Journal of Operational Research 138,
212–225.
Hruschka, H., Fettes, W., Probst, M., 2004. An empirical comparison of
the validity of a neural net based multinomial logit choice model to
alternative model specifications. European Journal of Operational
Research 159, 166–180.
Kalyanam, K., Shively, T.S., 1998. Estimating irregular pricing effects: a
stochastic spline regression approach. Journal of Marketing Research
35 (1), 16–29.
Kopalle, P.K., Mela, C.F., Marsh, L., 1999. The dynamic effect of
discounting on sales: empirical analysis and normative pricing
implications. Marketing Science 18 (3), 317–332.
Lang, S., Brezger, A., 2004. Bayesian P-splines. Journal of Computational
and Graphical Statistics 13, 183–212.
Martínez-Ruiz, M.P., Mollá-Descals, A., Gómez-Borja, M.A., Rojo-Álvarez, J.L., 2006. Using daily store-level data to understand price
promotion effects in a semiparametric regression model. Journal of
Retailing and Consumer Services 13 (3), 193–204.
Montgomery, A.L., 1997. Creating micro-marketing pricing strategies
using supermarket scanner data. Marketing Science 16 (4), 315–337.
Mulhern, F.J., Leone, R.P., 1991. Implicit price bundling of retail
products: a multiproduct approach to maximizing store profitability.
Journal of Marketing 55 (4), 63–76.
Neslin, S.A., 2002. Sales Promotion. Marketing Science Institute, Cambridge, MA.
Sethuraman, R., Srinivasan, V., Kim, D., 1999. Asymmetric and
neighborhood cross-price effects: some empirical generalizations.
Marketing Science 18 (1), 23–41.
Shively, T.S., Allenby, G.M., Kohn, R., 2000. A nonparametric approach
to identifying latent relationships in hierarchical models. Marketing
Science 19, 149–162.
Sivakumar, K., Raj, S.P., 1997. Quality tier competition: how price change
influences brand choice and category choice. Journal of Marketing 61
(3), 71–84.
van Heerde, H.J., Leeflang, P.S.H., Wittink, D.R., 2001. Semiparametric
analysis to estimate the deal effect curve. Journal of Marketing
Research 38 (2), 197–215.
van Heerde, H.J., Leeflang, P.S.H., Wittink, D.R., 2002. How promotions
work: SCAN*PRO-based evolutionary model building. Schmalenbach
Business Review 54, 198–220.
van Heerde, H.J., Leeflang, P.S.H., Wittink, D.R., 2004. Decomposing the
sales promotion bump with store data. Marketing Science 23 (3),
317–334.
Wilkinson, J.B., Mason, J.B., Paksoy, C.H., 1982. Assessing the impact of
short-term supermarket strategy variables. Journal of Marketing
Research 19 (1), 72–86.