An Integrated Trial/Repeat Model for New Product Sales

An Integrated Trial/Repeat Model for New Product Sales Peter S. Fader Bruce G. S. Hardie Chun-Yao Huang1 August 2001 1 Peter S. Fader is Associate Professor of Marketing at the Wharton School, University of Pennsylvania (email: faderp@wharton.upenn.edu; web: www.petefader.com). Bruce G. S. Hardie is Assistant Professor of Marketing, London Business School (email: bhardie@london.edu; web: www.brucehardie.com). Chun-Yao Huang is a PhD candidate at London Business School (email: chuang@london.edu). The second author acknowledges the support of the LBS Centre for Marketing. Abstract Traditional test-market-based new product forecasting models for consumer packaged goods usually suffer from at least one of three deficiencies: (i) any possible connection between trial and subsequent repeat purchases is generally ignored at the household level (which leads to incorrect parameter estimates and inferences), (ii) consumer preferences for the new product are assumed to be stable over time, and (iii) the effects of marketing activities (e.g., advertising and promotion) are disregarded. We present a parsimonious stochastic model of new product purchasing that addresses all of these issues. Our primary objective is to be able to provide an accurate forecast of overall new product sales. By creating a tight linkage between the trial and repeat purchase processes, we can leverage the limited amount of observed repeat data that are available in the initial weeks after launch. The integrated model allows managers to carefully diagnose the sub-components of new product sales (such as percent of triers repeating by time t, repeats per repeater, and so on), without requiring separate models for each one. We formally combine the trial and repeat processes (and accommodate changing consumer preferences over time) by introducing a probabilistic “renewal” process that varies with depth-ofrepeat. Specifically, as customers gain more experience with the product we would expect their preferences (and therefore their underlying buying rates) to settle down. As such, a desirable feature of the model is that it can evolve to a stationary repeat buying process as the product moves from being “new” to “established.” These renewal events also allow for the possibility of consumer dropout, thereby letting us capture different attrition patterns for the new product. We examine two different distributions for interpurchase times — the simple exponential model and the Erlang-2, which allows for more regularity in the time between repeat purchases. We introduce marketing mix effects via a proportional hazards framework at the individual level. Furthermore, beyond a conventional model with constant covariate effects over time, we also develop a specification that lets the coefficients vary with consumer experience (i.e., depth-of-repeat). Overall, this flexible set of model components gives us a general framework to capture and understand the wide variety of possible purchase patterns that can occur for a new product soon after its launch. This framework includes (and generalizes) many of the models considered by Gupta (1991) as well as other previous models that have addressed some of the specific issues described above. We conduct a detailed empirical analysis using data from IRI’s BehaviorScan service, and show how the model can be used to examine the differential impact on trial and repeat sales that emerges when we remove (or add) a particular promotional event. 1 Introduction Since the early work of Fourt and Woodlock (1960), a number of marketing scientists have devoted their attention to the development of models designed to generate forecasts of a new product’s sales performance. During the 1960s and 1970s, the primary focus was on the development of test-market models, in which medium-term sales forecasts were projected from consumer panel data collected during the first few months of the test market. Attention then shifted to the development of simulated (also known as pre-test) market models that generate sales forecasts using data from survey research conducted prior to the introduction of the new product. Given the popularity of simulated test-market services, such as those offered by A.C. Nielsen’s BASES subsidiary, we may be tempted to assume that the test market is a thing of the past. This is far from the truth, as many firms will not commit to the final launch decision purely on the basis of data collected in a simulated test market. Test-market data provide the “hard” numbers about sales patterns as well as promotional response indicators that are desired (or even required) by many brand managers. Furthermore, electronic test market environments such as Scannel (operated by Taylor Nelson Sofres Sécodip in France) and BehaviorScan (operated by Information Resources, Inc. (IRI) in the US) provide a level of detail (and managerial control) that could only have been dreamed of back in the 1960s. Coupled with this desire for actual sales numbers is the desire to get sales estimates “as soon as possible” (e.g., Advertising Age 2000). Thus the need for test-market models is still very strong among consumer packaged goods (CPG) manufacturers. The (test-market) new product sales forecasting models used by the various market research firms are typically minor modifications of those developed 30–40 years ago. Reflecting on the set of models developed during this era, we can immediately identify two shortcomings. The first shortcoming concerns the fact that the majority of these models do not explicitly incorporate the effects of marketing decision variables. This should come as no surprise since these models were developed at a time when consumer panel data were collected using self-completed paper diaries; data on in-store merchandising activities were non-existent, un- 1 less collected via a custom audit. The emergence of the UPC and laser scanners make such data readily available today, but the models used in practice have not kept pace with these technological improvements. The second shortcoming is a little more subtle. When modeling new product sales, it is standard practice to separate total sales into trial (i.e., first purchase) and repeat (i.e., subsequent purchases) components. In order to understand the development of repeat sales, it is common to decompose repeat sales into its first repeat, second repeat, third repeat (and so on) components. Within the literature on test-market forecasting models, there is a long tradition of building so-called “depth-of-repeat” models, which combine the output from each of these sub-models to arrive at an overall sales forecast for the new product — see, for example, Eskin (1973), Fourt and Woodlock (1960), Kalwani and Silk (1980), and Massy (1969). As we transition from repeat level j to level j + 1, the only piece of information we effectively retain about each person is that they made a jth repeat purchase. Thus the probability that an individual will make a 3rd repeat purchase 2 weeks after his 2nd repeat purchase is generally assumed to be exactly the same regardless of whether he made his second repeat purchase in week 3 or week 50. Furthermore, information on the timing of this individual’s trial and first repeat purchases would be completely ignored. Such an approach has typically been used to capture the nonstationarity in purchasing rates that we observe during the early phase of a new product’s life. A key problem with this depth-of-repeat approach is that it will result in misleading inferences about buyer behavior, since the model formulation fails to recognize the dependence across multiple purchases within each individual (Gupta and Morrison 1991). For example, Fader and Hardie (1999) show that the parameters of Eskin-type models of repeat sales will imply the existence of nonstationarity in repeat-buying behavior even when the model is applied to data from a purely stationary (simulated) market! With these two shortcomings in mind, the objective of this paper is to present a stochastic model for the sales of a new CPG product that integrates all of an individual’s purchases of the new product (as opposed to developing separate models for trial, first repeat, etc.) and simultaneously captures the effects of marketing activities and nonstationarity in initial repeat buying behavior at the individual consumer level. The paper proceeds as follows. In the next 2 section we develop our model for the sales of a new CPG product, and we present two extensions to the basic model. This is followed by an empirical analysis in which we examine the fit and forecasting performance of the proposed model, and consider its use in the evaluation of alternative launch scenarios. We conclude with a discussion of several issues that arise from this work and identify several areas worthy of follow-on research. 2 Model Development Our objective is to develop a model of new product sales that incorporates the effects of marketing mix variables and nonstationarity in buying rates at the individual customer level. The primary motivation for nonstationarity is the notion that customers’ preferences for the new product are evolving; as customers gain more experience with the product we would expect their preferences (and therefore their underlying buying rates) to “settle down.” As such, a desired feature of the model is that it can capture the “evolution” towards a stationary repeatbuying process as the product moves from being “new” to “established.” Nonstationarity is modeled using a multiple-changepoint process for the customer-level buying process. At each changepoint, there is a renewal of (or change in) the underlying buying rate. A renewal is interpreted as a revision of preferences, which may be due, perhaps, to experience with the product or some other unobservable phenomenon. A renewal can occur after any purchase of the new product, but the probability of occurrence decreases as the customer gains more experience with the new product (i.e., moves through higher depth-of-repeat levels). Our model for the evolution of new product purchasing is based on the following five assumptions: i. The probability of an individual ever trying the new product is π0 . ii. Let the random variable Tj denote the time (since the launch of the new product) at which a customer makes its j th repeat purchase (j = 0, 1, . . . , J). By convention, j = 0 corresponds to the trial purchase. The hazard- rate function of the with-covariate interpurchase time distribution is of the form 3 h(t|tj ) = λex(t) β ≡ λA(t) where x(t) is the vector of marketing covariates at time t and β the effects of these covariates. (This corresponds with the assumption of an exponential baseline distribution with covariate effects incorporated using the proportional hazards framework.) Assuming the time-varying covariates remain constant within each unit of time (e.g., week), the survivor function of the with-covariate interpurchase time distribution is S(t|tj ; λ) = exp −λB(t, tj ) (1) where B(tb , ta ) = B(tb ) − B(ta ) with Int(t) B(t) = δt≥1 A(i) + [t − Int(t)]A(Int(t) + 1) i=1 It follows that the pdf of the with-covariate interpurchase time distribution is f (t|tj ; λ) = λA(τ ) exp −λB(t, tj ) (2) where τ = t if t is integer and Int(t) + 1 otherwise. iii. Individual purchase rates, λ, are distributed across the population according to a gamma distribution with shape parameter r and scale parameter α; that is g(λ) = αr λr−1 e−αλ Γ(r) iv. Following his j th purchase, a customer renews his value of λ with probability γj . The depth-of-repeat specific renewal probability is of the form 4 γj =    η j=0   1 − ψ(1 − e−θj ) j = 1, 2, . . . (3) where η, ψ ∈ [0, 1] and θ ≥ 0. v. Upon the occurrence of a renewal, a customer receives a value of λ = 0 with probability φ. (This is equivalent to a complete rejection of the new product.) With probability 1−φ, the customer draws a new value of λ, independent of his previous one, from the same gamma distribution of purchase rates described above. The first assumption follows naturally from the established literature on the modeling of first-purchase of a new product. Since the early work of Fourt and Woodlock (1960), modelers have assumed that there is an upper limit on the market penetration level for a new product. The numerical value of this penetration limit can be interpreted as the probability that a randomly chosen individual will eventually try the new product, which we denote by π0 . The incorporation of covariate effects in interpurchase time distributions using the proportional hazard specification with a parametric baseline hazard function is well established within the marketing literature. The choice of the exponential for the baseline distribution (assumption ii) represents the simplest case and is consistent with the assumption of Poisson counts that underlies much of the stochastic modeling work within the marketing literature. Similarly, assumption (iii) follows the long tradition of using the gamma distribution to capture heterogeneity in purchase rates (e.g., Morrison and Schmittlein 1988). Note that these two assumptions give us the “exp/gamma, covariates” model examined by Gupta (1991). We will also examine an alternative timing process, the Erlang-2, later in the paper. The final two assumptions provide a paramorphic, as opposed to strictly behavioral, representation of when (assumption iv) and how (assumption v) preferences for the new product evolve. The logic behind equation (3), the probability that a renewal occurs at depth-of-repeat level j, is as follows: we would expect that the probability of a consumer revising his preferences following a purchase would decrease as he gains more experience with the new product (i.e., 5 moves to a higher depth-of-repeat level). Looking closely at equation (3), we note that as j increases, γj tends to 1 − ψ. Therefore, if ψ = 1, the probability of a renewal tends to zero as a consumer moves to higher depth-of-repeat levels; in other words, the model evolves to a stationary process which would be consistent with the stabilization of consumer preferences. On the other hand, if ψ < 1, γj > 0 ∀j, which means that individual consumer preferences will not stabilize; in other words, there is long-term nonstationarity in the marketplace. (If θ → ∞, then γj is independent of j and equals 1 − ψ ∀ j.) The relationship between γj and j is illustrated in Figure 1 for three sets of values of ψ and θ. 0.8 Probability of Renewal 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 ... ... ... ... ... ψ = 1.0, θ = 0.4 .. .. ..... .. ... .. .... .. ... .. ... .. .... ... .. ... .. ... .. ... .. ... ... .. ... ... ..... ... ..... ... ..... ... ..... ... ψ = 0.8, θ → ∞ ... ... ..... .......... .... .. ..... ........... . ... ... ...... ... .............. .............. ... ... ... ... ........ ........... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ψ = 0.9, θ = 0.7 ........... ................ .................... ............................... ................................................................................... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Depth-of-Repeat Level (j) Figure 1: Probability of Renewal by Depth-of-Repeat Reflecting on assumption (v), the “spike at zero” — receiving a value of λ = 0 with probability φ — is simply a mechanism by which customers can “drop out” of the market for the new product even after making several purchases of it; drawing a value of zero upon a renewal is viewed as being equivalent to rejecting the new product from future purchase consideration. It follows that the proportion of triers who do not make a repeat purchase is given by ηφ. For j > 1, the proportion of consumers making a (j − 1)th repeat purchase who will ultimately make a jth repeat purchase is given by 1 − γj−1 φ. For finite θ and ψ = 1, this proportion increases with j to a limit of 1.0, which is consistent with the observations of Eskin (1973) and Kalwani and Silk (1980) concerning the nature of depth-of-repeat curves. Given the central role played by γj in determining how many people ultimately make a (j + 1)th repeat purchase, the case of j = 0 is treated separately from j ≥ 1 as it has been observed that the proportion of triers who eventually repeat does not reflect how subsequent repeat purchasing evolves (Eskin 1973). 6 With probability 1 − φ, the consumer draws a new value of λ from the underlying gamma distribution; this allows for changes in the consumer’s latent buying rate, which we interpret as a change in his preference for the new product. This principle of independent renewals from a given mixing distribution was first raised in Howard’s “Dynamic Inference” model (Howard 1965). Similar types of renewal processes have been used by Sabavala and Morrison (1981) in their model of media exposure and Fader and Lattin (1993) in their measure of loyalty for scanner data-based choice models. However, these earlier models all utilized fixed (time-invariant) renewal processes, as opposed to the evolutionary process introduced here. (As noted above, equation (3) admits a fixed (time-variant) renewal process as a special case, i.e., θ → ∞, ψ < 1.) In contrast to a standard changepoint process application (e.g., Henderson and Matthews 1993; Pievatolo and Rotondi 2000; Raftery and Akman 1986), our interest is not in the explicit identification of the changepoint(s) in an observed sequence of variables. Rather we are simply using the changepoint framework to accommodate shifts in the underlying stochastic process which can then be used to forecast future outcomes of the process. Furthermore, for each individual, we restrict the set of possible changepoints to times at which at which they purchase the new product. To illustrate and convey the intuition of the proposed model, let us consider the following scenario of a customer who makes three purchases of the new product in the first six weeks of it being on the market: trial at t0 , first repeat at t1 and second repeat at t2 . Let us assume that if a renewal occurs (i.e., preferences are revised), it is immediately after a purchase. One behavioral “story” consistent with this is to assume that consumption immediately follows purchase, and that preference revisions would immediately follow consumption. t0 t1 t2 6 ✲ × × × ✛ wk 1 ✲✛ wk 2 ✲✛ wk 3 ✲✛ wk 4 ✲✛ wk 5 ✲✛ wk 6 ✲ 0 Given t0 , t1 , t2 , we do not know whether the consumer ever revised his preferences and, if he did, how many times and at which points in time. Let us first assume that the consumer never revised his preferences in (0, 6]. By assumptions (i) and (ii), the conditional likelihood function for this consumer is the probability that he eventually tries the new product (π0 ), multiplied by 7 the product of the density and survival functions, that is, L(π0 , λ, β; data) = π0 f (t0 |0; λ)f (t1 |t0 ; λ)f (t2 |t1 ; λ)S(6|t2 ; λ) = π0 λ3 A(2)A(4)A(6) exp − λB(6, 0) Following the third assumption, the unconditional likelihood function is: ∞ αr λr−1 e−αλ dλ Γ(r) 0 r 3 1 α Γ(r + 3)A(2)A(4)A(6) = π0 α + B(6, 0) Γ(r) α + B(6, 0) L(π0 , r, α, β; data) = L(π0 , λ, β; data) (4) This same likelihood function would emerge if we were to apply the “exp/gamma, covariates” model from Gupta (1991) to this hypothetical purchase history. Alternatively, suppose that the consumer revised his preferences following his second (i.e., first repeat) purchase. Let the purchasing rate λa reflect the consumer’s initial preference for the new product, and λb reflect the consumer’s revised preference following his first repeat purchase. The conditional likelihood function for this consumer is therefore: L(π0 , λa , λb , β; data) = π0 f (t0 |0; λa )f (t1 |t0 ; λa )f (t2 |t1 ; λb )S(6|t2 ; λb ) = π0 λ2a A(2)A(4) exp − λa B(t1 , 0) λb A(6) exp − λb B(6, t1 ) Following assumption (v), we note that the renewal resulted in a new value of λ being drawn from the same underlying gamma distribution, an event which occurs with probability 1 − φ. The unconditional likelihood function is therefore: ∞ ∞ −αλb −αλa αr λr−1 αr λr−1 a e b e (1 − φ) dλa dλb Γ(r) Γ(r) 0 0 2 r 1 α Γ(r + 2)A(2)A(4) = π0 (1 − φ) Γ(r) α + B(t1 , 0) α + B(t1 , 0) r 1 α Γ(r + 1)A(6) × (5) Γ(r) α + B(6, t1 ) α + B(6, t1 ) L(π0 , r, α, φ, β; data) = L(π0 , λa , λb , β; data) In general, we cannot tell exactly when (or if) renewals of buying rates take place. For this 8 consumer, the number of renewals could have ranged from zero to three. The set of eight possible renewal patterns is given in Table 1. Equation (4) is the likelihood function corresponding to the renewal pattern in row (i), and the likelihood function corresponding to the renewal pattern in row (iii) is given in equation (5). While we do not know which of the eight patterns corresponds to the consumer, we can write out the unconditional likelihood function associated with each of the possible renewal patterns and compute the consumer’s overall likelihood as the weighted average of the renewal-pattern-specific likelihoods, where the weights are the probabilities of each renewal pattern occuring. (Following assumption (iv), the probability of observing the renewal pattern in row (i) is given by (1 − γ0 )(1 − γ1 )(1 − γ2 ), while the probability of observing the renewal pattern in row (iii) is given by (1 − γ0 )γ1 (1 − γ2 ).) (i) (ii) (iii) (iv) (v) (vi) (vii) (viii) Renewal Occurs After Trial 1st Repeat 2nd Repeat Number of Renewals 0 1 1 1 2 2 2 3 Table 1: Feasible Renewal Patterns for Three Purchases More generally, let Th = {t0 , . . . , tj , . . . , tJ } be the set of times at which household h, (h = 1, . . . , H), makes its K purchases of the new product in the period (0, tc ], where tc is the censoring point that is the end of the calibration period. (Clearly Th = ∅ if K = 0.) The exact nature of the likelihood function for consumer h depends on whether K = 0 or K > 0. If no purchase of the new product is observed (i.e., K = 0), this is due to either (i) the household not being in the market for the new product, or (ii) the household has simply not yet had the opportunity or need to make a trial purchase. Therefore, the likelihood function for a household making no purchases is: L(Th ) = (1 − π0 ) + π0 9 α α + B(tc , 0) r (6) The first term is simply the probability that a household will never try the new product, whereas the second term is the probability that a household will eventually make a trial purchase multiplied by the with-covariate survival function (i.e., the probability that no purchase occurred in (0, tc ]) mixed with the gamma distribution. When K > 0, the possibility of renewals occurring emerges. For a household making K purchases of the product in the period (0, tc ], there are K renewal opportunities. At each renewal opportunity, a renewal either occurs or it does not; consequently, there are 2K sets of possible renewal points. Let there be n ≤ K renewals, and let w = {wi }, i = 1, . . . , n be the set of renewal points, where wi corresponds to the depth-of-repeat level immediately following which a renewal occurs. (For the second example above, w = {1}.) If a renewal occurs after the trial purchase, we have w1 = 0. As we cannot tell exactly when (or if) renewals of buying rates take place, we first formulate the likelihood function conditional on a given renewal pattern, w. For the case of no renewals (n = 0), we have L(Th | w) = π0 J A(τj ) j=0 J+1 r 1 α Γ(r + J + 1) Γ(r) α + B(tc , 0) α + B(tc , 0) (7) where τj is the time period (e.g., week) in which the j th purchase occurred, defined as τj =    tj if tj is integer   Int(tj ) + 1 otherwise For n > 0 renewals, with the last renewal occurring immediately following the last purchase (i.e., wn = J), we have L(Th | w) = π0 (1 − φ) × n i=1 n−1 J A(τj ) j=0 wi −wi−1 r 1 α Γ(r + wi − wi−1 ) Γ(r) α + B(twi , twi−1 ) α + B(twi , twi−1 ) r α (8) × φ + (1 − φ) α + B(tc , tJ ) 10 where w0 = 0. This likelihood function can be interpreted in the following manner. As the nth renewal occurred immediately following the last purchase, the final bracketed term represents the likelihood that no purchase has occurred since twn = tJ . This is either because the renewal resulted in the new product being rejected (i.e., a value of λ = 0 was drawn, with probability φ) or because the consumer has not yet had the opportunity or need to make a repeat purchase in (tJ , tc ] (i.e., the probability of drawing a positive value of λ, (1 − φ), multiplied by the “exp/gamma, covariates” survival function for a time period (tJ , tc ]). The probability that the first n − 1 renewals saw positive values of λ being drawn is (1 − φ)n−1 . For each of the n intervals during which preferences were stable, the second bracketed term is simply the product of the “with covariates” pdfs, mixed with the gamma distribution. Alternatively, if the final renewal occurs some time before the last purchase (i.e., wn < J), we have L(Th | w) = π0 (1 − φ) × n i=1 n J A(τj ) j=0 wi −wi−1 r 1 α Γ(r + wi − wi−1 ) Γ(r) α + B(twi , twi−1 ) α + B(twi , twi−1 ) J−wn r 1 α Γ(r + J − wn ) (9) × Γ(r) α + B(tc , twn ) α + B(tc , twn ) As twn < tJ , we know that all renewals saw positive values of λ being drawn, the probability of which is (1 − φ)n . The second bracketed term is interpreted as above, while the final bracketed term is the likelihood that the last J − wn purchases occurred in (twn , tc ]. Now the probability of a given renewal pattern w is P (w | ψ, θ) = γj j∈w (1 − γj ) (10) j∈I−w where I = {0, 1, . . . , J}. Therefore, for K > 0, the likelihood function associated with Th is simply the weighted average of the renewal-pattern-specific likelihoods, that is, 11 L(Th ) = L(Th | ws )P (ws ) (11) s where the summation is over the possible renewal sets indexed by s = 1, 2, . . . , 2K . (For K = 0, the likelihood function is given by equation (6).) It follows that the overall sample log-likelihood function is: LL = H ln L(Th ) (12) h=1 Equations (6)–(12) define the model as fitted to a given dataset. Maximum likelihood estimates of the model parameters (π0 , r, α, ψ, θ, φ, β) are obtained by maximizing the log-likelihood function given in equation (12) above. Standard numerical optimization methods are employed, using the MATLAB programming language, to obtain the parameter estimates. 2.1 Properties of the Model In its most general form, the model requires the estimation of 7 + s parameters, where s is the number of marketing covariates. It is a very flexible model that can capture many patterns of buying behavior. Examples of such buying phenomena include: • “Traditional” stationary buying behavior. If γj = 0 ∀ j, we have a purchasing process in which the latent purchase rates are stationary. (This is associated with θ → ∞ and ψ = 1.) When π0 = 1, our model reduces to the “exp/gamma, covariates” model considered by Gupta (1991). When β = 0, we have the two parameter exponential-gamma model of stationary repeat buying behavior which is the timing counterpart of the NBD counting model (Gupta and Morrison 1991). The estimates of r and α would equal those obtained by fitting the NBD to the data. Relaxing the assumption that π0 = 1 gives us the timing equivalent of Morrison’s (1969) NBD with “spike at zero” (counting) model where 1 − π0 is the size of the structural “never buyers” segment. • The transition from a “new” to “established” product. If ψ = 1 and θ is finite, then γj → 0 12 as j increases; that is, the probability of a renewal occurring tends to zero as a consumer moves to higher depth-of-repeat levels. This means that the initial nonstationary buying process evolves to a stationary process as the product becomes more established (i.e., when most buyers have made a large number of repeat purchases). Therefore the model is consistent with the notion of nonstationary buying behavior during the early stages of a new product’s life and stationary buying behavior — as characterized by the NBD model — once it has become established in the marketplace. • Long-term nonstationarity in repeat buying. When ψ < 1, the probability of renewal will always be non-zero which means that the repeat buying process is always nonstationary. If θ → ∞, γj is a constant 1 − ψ; that is, the probability of renewal is constant across all depth-of-repeat levels. For finite θ, γj → 1 − ψ as j increases; that is, the probability of a renewal tends to the constant 1 − ψ as a consumer moves to higher depth-of-repeat levels. Such a model can easily capture the “leakage” of repeat buyers phenomena observed by East and Hammond (1996). In particular, if φ > 0, or the underlying gamma distribution has a mode at zero (r ≤ 1), an on-going low-level of renewals will see some consumers drawing a value of λ = 0 on a given renewal, thereby “dropping out” of the market for the product of interest. Other researchers (e.g., Schmittlein, Morrison, and Colombo 1987) have proposed NBD-based models that include a “death” process. However our model is far more flexible, allowing for other forms of nonstationarity (e.g., “speeding up” and “slowing down” of latent purchase rates) beyond a simple “death” process. 2.2 Generating Sales Forecasts In order to evaluate the tracking performance of the proposed model, or to use the model for forecasting sales beyond the model calibration period, it is necessary to generate sales numbers (i.e., counts) from this timing model. We are interested in a number of sales-related measures for the new product: i. the cumulative trial sales by time t, T (t), ii. the cumulative repeat sales by time t, R(t), 13 iii. the total sales by time t, S(t), which by definition is equal to T (t) + R(t), and iv. the depth-of-repeat components of repeat sales. Defining Rj (t) as the number of customers who have made at least j repeat purchases of the new product by time t, we have R(t) = ∞ j=1 Rj (t). Our goal is to generate these numbers over the time interval (0, tf ], where tf denotes the end of the forecast period. While we have a simple closed-form expression for expect cumulative trial sales, E[T (t)] = H × π0 1 − α α + B(t) r , it is not possible to write out a closed-form expression for R(t), and consequently S(t). We therefore propose a simulation-based approach to computing the sales numbers. A complete step-by-step description of this simulation procedure is contained in Appendix A. 2.3 Extensions to the Basic Model We consider two extensions to the basic model: (i) a relaxation of the assumption that the interpurchase times are distributed exponentially, and (ii) a recognition of the possibility that the effects of marketing activities could vary as the consumer gains more experience with the new product. As with numerous other stochastic models of buyer behavior, our model is based on the assumption that individual consumer interpurchase times can be characterized by the exponential distribution. Two potentially troubling characteristics of this distribution are that it is memoryless (i.e., there is no influence of time since the last purchase) and that the mode of the distribution is at zero (which means that the next purchase is most likely to occur immediately after the last one). Consequently a number of researchers have proposed that the Erlang-2 distribution be used to model interpurchase times, as it allows for a more regular purchase process (Chatfield and Goodhardt 1973; Herniter 1971; Jeuland, Bass and Wright 1980). We therefore consider the case of Erlang-2 distributed interpurchase times as an extension to the basic model. 14 Using Gupta’s (1991) approach to incorporating the effects of time-varying covariates into the Erlang-2 distribution, the survivor function and pdf of the with-covariate interpurchase time distribution are given by S(t | tj ) = exp −λB(t, tj ) 1 + λB(t, tj ) f (t | tj ) = λ2 A(τ )B(t, tj ) exp −λB(t, tj ) (13) (14) Coupled with assumptions (i) and (iii)–(v), we arrive at a new set of renewal-pattern-specific likelihood functions which presented in Appendix B. Our second extension allows the response to marketing activities to vary across depth-ofrepeat levels (i.e., as the consumer gains more experience with the new product). This notion is motivated by the work of Helsen and Schmittlein (1994), who examined how price sensitivity varies across depth-of-repeat classes. In theory, we could estimate a separate β vector for trial, first repeat, second repeat, and so on. However we would not be able to generate sales forecasts from such a model as that we would need β vectors for repeat levels not observed during the model calibration period. One way to accommodate changing βs in a forecasting setting is to specify a general structure for the evolution of the coefficients as we move through higher levels of repeat purchasing. We propose the structure β j = β 0 + (β ∞ − β 0 )(1 − e−δj ) (15) in which the covariate effects evolve from their trial values (β 0 ) to their long-run equilibrium values (β ∞ ). The speed with which the equilibrium values are reached (as a function of repeat level j) is determined by the δ parameter. We explore the value of these two extensions in the following empirical analysis. 15 3 Application The basic model developed above nests a simpler model in which covariate effects are ignored. A generalization of the basic model allows for depth-of-repeat-specific βs, as given in equation (15). At the heart of these three model specifications is the assumption of exponential interpurchase times. Replacing this with the assumption of Erlang-2 interpurchase times gives us another three model specifications to consider. We examine the performance of these six model specifications using test market data for “Kiwi Bubbles”, a masked name for a shelf-stable juice drink, aimed primarily at children, which is sold as a multipack with several single-serve containers bundled together. Prior to national launch, it underwent a year-long test conducted in two of IRI’s BehaviorScan test markets. We use BehaviorScan panel data, drawn from 2799 panelists in two markets. Using data for the 267 panelists that tried the new product by the end of week 26, we wish to forecast the purchasing behavior of the whole panel (i.e., 2799 panelists) to the end of the year (week 52). That is, we fit the six model specifications to the first six months of purchasing data and generate sales forecasts for the whole year. We have information on the marketing activity over the 52 weeks the new product was in the test market; this comprises a standard scanner data measure of promotional activity (i.e., any feature and/or display), along with measures of advertising and coupon activity. To account for carryover effects, the advertising and coupon measures are expressed as standard exponentially-smoothed “stock” variables (e.g., Broadbent 1984). The results for the models are reported in Table 2. Looking at the model log-likelihoods, we immediately observe that the fit of the three exponential model specifications dominates their Erlang-2 counterparts. This result is completely consistent with recent work on the modeling of trial purchasing for new CPG products, which finds strong support of the exponential interpurchase time distribution (Fader and Hardie 2001; Hardie, Fader, and Wisniewski 1998). The dominance of the exponential model specification is confirmed when we look at the index of year-end forecast accuracy (WK52 Index); in all three cases, the exponential specification produces more accurate forecasts than its Erlang-2 counterpart. (Note that this does not necessarily follow from the good fit of the exponential model specification; as Armstrong (2001) observed, a large number of researchers have found 16 π0 r α η ψ θ φ β(coupon)† β(advertising)† β(promotion)† β∞ (coupon) β∞ (advertising) β∞ (promotion) δ LL WK52 Index‡ † ‡ Without Covariates 0.159 0.574 46.597 0.428 0.859 2.112 1.000 −− −− −− −− −− −− −− −3770.1 97.6 Exponential Covariates (constant βs) 1.000 0.061 80.615 0.269 0.968 2.388 0.000 5.208 0.000 0.012 −− −− −− −− −3726.5 104.2 Covariates (varying βs) 0.488 0.159 122.239 0.236 0.968 2.464 0.938 3.881 0.000 0.009 14.852 0.012 0.008 0.164 −3724.0 107.7 Without Covariates 0.163 0.416 10.517 0.668 0.777 1.335 0.587 −− −− −− −− −− −− −− −3777.3 85.6 Erlang-2 Covariates (constant βs) 0.426 0.119 14.771 0.570 0.847 1.405 0.000 3.730 0.000 0.009 −− −− −− −− −3744.9 91.0 Covariates (varying βs) 0.470 0.115 19.463 0.541 0.844 1.484 0.000 1.633 0.000 0.013 10.345 0.000 0.003 0.342 −3741.2 90.9 β 0 for the varying βs specification 100 × expected week 52 cumulative total sales / actual week 52 cumulative total sales Table 2: Summary of Model Results that model fit is a poor way to assess predictive validity.) Within the set of exponential models, we observe that (i) the inclusion of covariates results in a significant improvement in calibration-period model fit (p < .001) and (ii) allowing for depth-of-repeat-specific βs does not result in a significant improvement in calibration-period model fit (p = .17). Looking at the index of year-end forecast accuracy, we also observe that the model that allows for depth-of-repeat-specific βs is dominated by the other two model specifications. This is contrary to the findings of Helsen and Schmittlein (1994). The fact that the anaylsis undertaken by Helsen and Schmittlein treated trial, first repeat, and second repeat as independent processes and failed to control for unobserved heterogeneity means that we have more confidence in our findings. It is, however, too soon to draw any conclusion as to whether and how the effects of marketing activities really vary across depth-of-repeat levels. But the modeling approach developed in this paper is the correct way to explore such effects, as it overcomes the shortcomings identified in the Helsen and Schmittlein analysis framework. Reflecting on the parameters of the “exponential with constant covariate effects” model, we note the model suggests that every panelist is potentially in the market for the new product 17 (π̂0 =1). While at first glance this is counter-intuitive, it is completely consistent with the existing literature on the modeling of new product trial, in which it has been found that the estimated value of the penetration limit parameter (i.e., π0 ) is typically either 1.0 or not significantly different from 1.0 (Fader and Hardie 2001). The next two numbers give us the estimates of the shape and scale parameters (r and α) for the underlying gamma distribution that characterizes the heterogeneous purchasing rates across the panelists. When a given panelist makes a trial purchase, there is a 27% chance (η) that he will change his purchase rate. If this does occur, the panelist does not reject the product (φ̂ = 0); rather, he draws a new purchase rate from the original gamma distribution. When this panelist does eventually make a first repeat purchase, we use equation 3 to determine that there is a 12% chance that he will undergo a renewal immediately after this purchase. This drops to 4% following his second repeat purchase. From that point on, the renewal probability effectively reaches its asymptotic value of 1 − ψ = 3.2%. Finally, we find that the couponing and in-store promotional activities have significant impacts on purchase timing. The zero coefficient for advertising reflects the lower bound placed on this parameter when maximizing the log-likelihood function; unconstrained, it is negative but not significantly different from 0. As a benchmark, we also fit the basic Gupta (1991) “exp/gamma, covariates” model to the first six months of purchasing data (allowing for the possibility of never-triers); the resulting six parameter model has a log-likelihood of −3733.0. This represents a significantly worse fit (p = 0.011) than the above “exponential with constant covariate effects” model specification, and it substantially overpredicts year-end sales, with a WK52 index of 114. We can therefore conclude that there is nonstationarity in the repeat buying behavior for the new product — overand-above the temporary changes induced by the marketing activities — that must be explicitly captured in a model for the sales of a new product. The forecasting performance of the “exponential with constant covariate effects” model specification is illustrated in Figure 2. In addition to a total sales forecast, managers are interested in the break-down of total sales into its trial, first repeat, and additional repeat components — see for example Clarke (1984). The model-based predictions provide an accurate tracking of both the total sales curve as well as its trial and repeat components. 18 Cum. Sales per 100 HH 30 20 10 0 Actual ....... ....................... . .................... . . . . ..... ... ....... ........... ......................... ..................... ........................ . . . . . . . . ......... ... .................. ....... ..... ....... ............... ................ . . . . . . . .. .. Total Sales ................... ....... ........... ... ....... .............. ...... ....... ........ ....... .............. ....... . . . ......... ............ . ..... ......... ....... ......... Trial ....... ............. .......... . ...... . ...................................... . ...... .... .................................. .......... ....................................................................................................................................................... .... . . . . . . . . . . . . . . .......... ....... .... .............................................. .......................... .......... ................................... ................................ . ......... . ............ ............................... . ............ Additional Repeat ............ ............................. ................ ....... ........... ................... ..... .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .................... .... ........................... ....... .......................................... ... ............. .................... ....... ..................................................................................................... ......... .................. ................................................................................................................. .................... .. ........ .............. ....................................................... .............. ........ . . . . . . . . . . . . . . . . . . . . . . . . . First Repeat ....... ............................................................ . .. .... ........................................... Predicted 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 2: Predicted Sales Even though the level of additional repeat sales is relatively low at the end of the calibration period, it is evident that additional repeat will quickly bypass the other sales components, and will comprise the lion’s share of total sales in the period following week 52. The ability of our model to accurately track and forecast this key component is, perhaps, the strongest indicator of its validity and usefulness. Two other widely-monitored measures of new product performance are “percent triers repeating” and “repeats per repeater” (Clarke 1984; Rangan and Bell 1994). At any point in time t, percent triers repeating is computed as R1 (t)/T (t), while repeats per repeater is computed as R(t)/R1 (t). In Figures 3 and 4 we compare the actual development of these two measures with the predictions derived from the model, observing that the model-based numbers accurately track the actual numbers. 50 . ....... ....... ....... ...... ... ....... ....... ..... ....... .... . . ............... . . . . . . . . . ............... ....... ..... ........................................................................... ..... ....................................................... ........ .. . ................... . ................................................. . . . . . . ..... ....... .... ... .................... .......... ... ... .. ... .... .... ..... ....... . . . . ..... ....... ....... .... ..... .... ......... .. . ... .. ... . ... ... ..... ..... ... ..... .... . .... ...... ... Actual ..... .. ..... ..... Predicted ... % Triers 40 30 20 10 0 0 4 8 12 16 20 24 28 32 36 40 44 Week Figure 3: Tracking Percent Triers Repeating 19 48 52 Average # Repeat Purchases 3 ........ ...................... ........ ............ ................. ......... . . . . . . . . . . . . . . ... .... ........... .. .... ...... ........ . ....... ..... .. ....... ................................................. .... . . . . . . . . . . ... ....... ..... . ......... ...... .. ...... .... . ............. . ...... .. ........................... .. . . . . . . . ....... .... ...... ..... ...... .. ..... ..... ... ........... .... . . . . ... . ........ .... .... ... . . . . . . . . . . ....... Actual ............. .. ....... .............. . . . . . Predicted . . ........ . ... ........... . . . . . . . ... ... ....... 2 1 0 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 4: Tracking Repeats Per Repeater Referring back to Table 2, we observe that the no-covariate model generates the most accurate forecast, as judged on the basis of the WK52 index. The forecasting performance of this model is illustrated in Figure 5. Cum. Sales per 100 HH 30 20 10 0 Actual .. ............. ........ .. .. ............ .... ............ .. . . . . . .. ..... ................. ... ............... ........................ .................. ............... .... . . . . . . . . . . . . . . . . . . ........ ....... .... .... ......... ... ............. .............. . . . . . . . . . . . . . . . ........... ... .............. ...... ........ ...... ....... ..... . ....... . . . . . ...... ..... ....... .... ......... ... ....... ..... .. ..... .. . . ... . .......... ..... ...... ............ . . . . . ..... ... ...... .......... ... ........ . ......... .... ..... ..... Predicted 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 5: Sales Forecast: No-Covariate Model Does this mean that marketing-mix variables have no value in a new product sales forecasting model? Not at all. Fader and Hardie (2001) note that the inclusion of marketing-mix variables has a big impact on forecasting performance when the model calibration period is relatively short. (However, with longer calibration periods, the forecasts generated using no-covariate models are just as accurate as their with-covariate counterparts.) Furthermore, the with-covariate model can be used to evaluate the impact of incremental changes in the marketing-mix as the marketing manager seeks to finalize the (national) launch plan for the new product. We now consider such an application of the model. 20 One element of the promotional activity for “Kiwi Bubbles” was an FSI coupon distributed in week 3. In order to determine the impact of this early couponing activity, the marketing manager would want to know what the sales path would be had this coupon not been distributed. Alternatively, noting the apparent sales increase in week 3, the marketing manager may consider repeating such a promotional activity further on in the launch phase of the new product. We therefore consider two scenarios, the first corresponding to the removal of the coupon dropped in week 3, the second corresponding to a repeat of this coupon (i.e., same face value and fuse) in week 20. We generate the sales forecasts under each scenario and compare them to the base case corresponding to the sales forecast associated with the marketing plan used in the test market. The predicted total sales paths for these two scenarios is reported in Figure 6, along with the (predicted) sales path associated with the base case. We observe that under scenario 1, first year sales are down by 4.4% while under scenario 2, first year sales are up by 2.1%. Cum. Sales per 100 HH 30 20 10 0 ..... ............... .............. ... ............. ....... ... . ........ .... Scenario 1 ............ .. .... ......... ....... ............. ...................... .. ....... . . . . Scenario 2 .... ........ .... .... .... ............ .. ............... .... ....... .............. ... ......................... . ....... . . . . . ................. ....... ..... . . . .. ... ..... ... .......... ...... ..... . ........... ................ ....... .......... ....... . . . . . . . . . ........ ...... ....... ....... ....... ...... ......... ........ . ....... . . . . . . ...... ... ... ...... ..... . ..... . ..... ... . . . . . . . .. ...... ...... ..... ... .... .... .... .. ... .... . ...... ...... ..... ...... ..... . . . . .. Base case 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 6: Total Sales for Base Case and Scenarios 1 & 2 (Coupon Deleted/Added) These overall changes in total sales are decomposed in Figure 7, which reports cumulative trial and repeat sales under each scenario, indexed against the base case. We observe that yearend trial sales are down by 2.1% under scenario 1 and up by 1.0% under scenario 2. Year-end repeat sales are down by 5.7% under scenario 1 and up by 2.8% under scenario 2. These numbers provide an indication of the permanent loss (or gain) in sales that may be due to the deletion (addition) of this type of coupon event. Further calculations reveal that 17.6% of the change in year 1 sales under scenario 1 is due to the reduction in trial sales (alone), whereas 18.0% of the increase in year 1 sales under scenario 2 is due to the change in trial sales (alone). 21 Sales Index (base case = 100) 110 Scenario 2 ...... ....... ....... ....... ....... ....... ....... .... ... ....... ....... ....... ....... .. . ..... ....... ..... ..... .................. ..................................................... .. . . . . .... ....... ....... ....... ....... ....... ....... ....... ....... ....... ..... .............................................. ... ..... .............................. ... ............................ .............. ..... ........... ....... ....... .... ...... ....... ....... .. .. .. .... ....... ....... ..... .... ....... ....... ... ... .. ....... ....... . .... . . . . . Scenario 1 . .. ...... ....... ..... .. .... . .. ....... ... .... ....... ... . .... ... . . ... .. . . .. ... .... ....... . ....... ... ... .. .. ... .. .. .. . ... ... .. . ... Trial ... .. .. .... ... Repeat .. ... . 100 90 80 70 60 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 7: Scenario Trial and Repeat Sales (Cumulative) Indexed to Base Case These numbers by-themselves can be a little misleading as we must consider the “tricklethrough-repeat” effects of the changes in trial. In Figures 8 and 9, we compare the development of “percent triers repeaters” and “repeats per repeater” under each scenario, indexed against the base case. Under scenario 1 we observe that the percentage of triers who make a repeat purchase initially drops by almost 15% but has effectively recovered by the end of the year (down by 1.3%); the initial drop in the number of repeat purchases per repeater was not as great but has not recovered as much (a difference of 2.5% at year-end). Thus we may conclude that much of the reduction in repeat sales under scenario 1 is due to the fact that a number of consumers who would have been induced to try the new product because of the couponing activity delay their trial purchase and therefore the follow-on repeat purchases are not observed. Under scenario 2, the extra coupon has minimal impact on trial or first repeat; the primary impact is on the repeat-buying behavior of those consumers who are already repeat buyers (repeats per repeater are up by a small 1.3%). This analysis of the “Kiwi Bubbles” test-market data illustrates the value of the model developed in this paper, and demonstrates how we can use such a model to help the marketing manager evaluate incremental changes to the new product launch plan. 22 105 Index (base case = 100) Scenario 2 ....... ....... ....... ....... ...... .... ....... ....... ....... . ... ...... ....... ....... ....... ....... ....... .. ... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ... ....... .. .. . . . . . . . . . . . . . ....... ....... ....... ....... ....... ....... ....... ... ....... ....... .. ....... ...... ... .... ....... . . . .. ....... Scenario 1 .. ... .... ..... .. ....... ... . . ... .... .. . ... ... .. .. ... ... .. . ... ... .. .. ... ... .. ...... ... .... .. . ..... 100 95 90 85 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 8: Scenario Impact on Percent Triers Repeating (Indexed to Base Case) 105 Index (base case = 100) Scenario 2 ... .. . ....... ....... ....... ....... ....... ....... ....... ....... ....... .... .. .. ... .. ... ...... ... . ....... ....... .. .... ....... ...... ..... ... ... . ....... ....... .... ....... .... .. . . . ... ... ..... ....... ....... ... .. .. .. ....... ...... ..... Scenario 1 ....... ... ....... .. ..... . . ... .. .. ..... ...... ... .. ..... . 100 95 90 . ...... ....... ....... ...... ....... ....... ...... ....... ....... ....... ....... ....... ....... ....... ....... .... 0 4 8 12 16 20 24 28 32 36 40 44 48 52 Week Figure 9: Scenario Impact on Repeats Per Repeater (Indexed to Base Case) 4 Conclusions While certain “hot topics” come and go in the field of marketing research, there has always been a high level of interest (shared by academics and practitioners alike) in the issue of forecasting new product sales. At the same time, however, recent years have seen a widening gap between methodological developments in academia and the state-of-the-art in actual practice. This paper bridges this gap with a model featuring three important contributions: (i) a fully integrated model of trial-repeat behavior; (ii) careful consideration of marketing mix (covariate) effects, including the possibility that the impact of advertising, coupons, and in-store promotional campaigns might each evolve in a different manner with deeper depths-of-repeat; (iii) explicit accommodation of nonstationary repeat buying behavior, which allows chaotic early 23 behavior to settle down towards a steady-state buying pattern over time. We examined several variations to our basic model structure, including different individuallevel timing processes (i.e., exponential versus Erlang-2) and covariate schemes (i.e., no covariates versus constant covariates versus varying with depth-of-repeat). One conclusion that emerged is the benefit of simplicity — the simplest model (exponential timing with no covariates) proved to have excellent forecasting capabilities. This is a theme that echoes recent work with trial-only models (Hardie, Fader, and Wisniewksi 1998) as well as repeat-only models (Fader and Hardie 1999). The fact that it continues to hold even when we mix these different types of buying behaviors is a strong tribute to its robustness and generalizability. Although covariates are not necessarily required for our model to produce excellent forecasts, they are still an important (and managerial desirable) component to include in the final specification. One of the principal reasons for running a test market is to learn about the effectiveness of these different levers in order to know which ones to use, and when to use them. Although some marketing mix elements (e.g., end-aisle displays) are aimed primarily at generating new triers, they also impact repeat sales both directly (i.e., enticing a past buyer to buy again) and indirectly (since a promotion-induced trier may continue to buy the product in the future). Our model allows us to capture these different behavioral effects, and can therefore give managers a correct sense of how well their marketing mix allocations are working. Beyond the context (i.e., a single new CPG product) discussed so far in the paper, it is worth discussing other relevant applications/extensions for the general type of methodology presented here. First, it is important to emphasize that the behavioral “story” behind our model is by no means limited to the CPG setting. A similar pattern will likely emerge for other types of products and services (although the specific parameters that characterize the various components of the model will likely vary from one context to another). Likewise, the model might apply nicely to new customers who are first encountering an existing product/service. For instance, as Internet “newbies” first learn about various websites, their behavior over time should conform to the basic set of assumptions outlined here; this would be a very promising area for future investigation. As we run the model over multiple products/services, it will be instructive to look for “meta24 patterns” in the resulting model parameters. Our empirical analysis revealed one particular type of nonstationary behavior, but it would be useful to catalogue different forms of nonstationarity (and covariate effects) and begin to associate them with product characteristics or other external measures. Many firms (e.g., BASES) attempt to database hundreds or thousands of products using simple sales summaries to enable early forecasts for new launches. Such a process can be greatly enhanced by using the parameters from a complete (and behaviorally plausible) model rather than relying strictly on summary statistics (such as repeats per repeater and the other measures we discussed earlier). As our field continues to make rapid advances with hierarchical Bayes methods, this task should become a workable possibility, even for practitioners, in the near future. Finally, one issue not addressed here, but sometimes considered in the context of new product sales, is the role of competition. Our experience with trial-repeat modeling mirrors that of firms such as BASES, who have found that accurate forecasts rarely require any explicit accommodation of competitive effects. Nevertheless, it is interesting to think about how new product entry can affect — and be affected by — existing market structures (see Bronnenberg, et al. (2000) for a recent review of this literature). But beyond these past approaches — mostly post hoc econometric models that were not intended for forecasting purposes — that other researchers have employed, we are intrigued by an extension of our product-specific stochastic model to one that can deal with sales patterns (and perhaps marketing activities) for other rivals. So while we view our integrated model as offering a reasonably accurate and managerially useful picture of the trial-repeat process for a given new product, we see it as just one step towards the creation of a “Holy Grail” model that builds in competition and other category-level phenomena to be able to anticipate the complete set of market dynamics that surround a new product launch. 25 Appendix A Let the non-zero elements of the vector Nh denote the times at which customer h made his trial, first repeat, etc. purchases (if at all). For a given individual, we simulate the elements of Nh in the following manner. We start by drawing a uniform random variate to determine whether the consumer will ever make a purchase of the new product (with probability π0 ). If this is the case, a value of λ is drawn from the gamma distribution. Using this value of λ and the actual values of the covariates, we simulate an interpurchase time off the “exponential with covariates” interpurchase time distribution. This gives us the consumer’s simulated value of t0 , the time of his trial purchase. If t0 > tf , the consumer is deemed to have made zero purchases of the new product by time tf and the procedure moves on to the next consumer. If t0 ≤ tf , we record the time of this purchase (Nh (0) = t0 ) and then draw a uniform random number to determine whether the consumer retains his value of λ (with probability 1 − γ0 ) or whether a renewal occurs (with probability γ0 ), in which case a new value of λ is drawn from the underlying distribution. Another uniform random number is drawn in the process of determining the new value of λ. With probability φ, a value of λ = 0 is drawn and the consumer is deemed to have rejected the new product and the procedure moves on to the next consumer. If the new value of λ is drawn from the gamma distribution (with probability 1 − φ), or no renewal has occurred, another exponential with covariates interpurchase time is simulated and added to t0 to give us the consumer’s simulated value of t1 , the time of his first repeat purchase. If t1 > tf , the consumer is deemed to have made only a trial purchase by time tf and the procedure moves on to the next consumer. If t1 ≤ tf , we record the time of this first repeat purchase (Nh (1) = t1 ) and the whole process continues for this consumer until tj > tf or a value of λ = 0 is drawn when a renewal occurs, at which time the procedure moves to the next consumer. Once we have simulated Nh for all individuals, we can compute total sales and its components in the following manner: 26 T (t) = Rj (t) = R(t) = H h=1 H h=1 ∞ I{0 < Nh (0) ≤ t} I{0 < Nh (j) ≤ t} Rj (t) j=1 S(t) = T (t) + R(t) where I{·} is an indicator function which equals 1 if the logical condition is true, and 0 otherwise. We repeat this simulation, say 100 times, and take the average of the run-specific S(t), T (t), etc. This simulation-based approach will be used in the empirical analysis. 27 Appendix B Assumption (ii) states that the individual consumer interpurchase times follow the exponential with-covariate distribution with survivor function and pdf given by equations (1) and (2). When we replace this with the assumption that the individual consumer interpurchase times follow the Erlang-2 with-covariate distribution with survivor function and pdf given by equations (13) and (14), we arrive at new expressions for the renewal-pattern-specific likelihood functions: i. For a household making no purchases in the calibration period (0, tc ]: L(Th ) = (1 − π0 ) + π0 α α + B(tc , 0) r 1+ rB(tc , 0) α + B(tc , 0) (A1) ii. When K > 0 purchases with no renewals (n = 0), we have L(Th | w) = π0 J A(τj )B(tj , tj−1 ) j=0 × Γ(r + 2(J + 1)) Γ(r) α α + B(tc , 0) r 1 α + B(tc , 0) 2(J+1) (r + 2(J + 1))B(tc , tJ ) (A2) × 1+ α + B(tc , 0) When π0 = 1, this is “Erlang-2/gamma, covariates” model considered by Gupta (1991). iii. For n > 0 renewals, with the last renewal occurring immediately following the last purchase (i.e., wn = J), we have L(Th | w) = π0 (1 − φ)n−1 n i=1 J A(τj )B(tj , tj−1 ) × j=0 r 2(wi −wi−1 ) α 1 Γ(r + 2(wi − wi−1 )) Γ(r) α + B(twi , twi−1 ) α + B(twi , twi−1 ) r rB(tc , tJ ) α (A3) 1+ × φ + (1 − φ) α + B(tc , tJ ) α + B(tc , tJ ) iv. For n > 0 renewals, with the last renewal occurring some time before the last purchase (i.e., wn < J), we have 28 L(Th | w) = π0 (1 − φ)n n i=1 J A(τj )B(tj , tj−1 ) × j=0 r 2(wi −wi−1 ) α 1 Γ(r + 2(wi − wi−1 )) Γ(r) α + B(twi , twi−1 ) α + B(twi , twi−1 ) r 2(J−wn ) α 1 Γ(r + 2(J − wn )) × Γ(r) α + B(tc , twn ) α + B(tc , twn ) (r + 2(J − wn ))B(tc , tJ ) (A4) × 1+ α + B(tc , twn ) Equations (A1)–(A4) replace equations (6)–(9) respectively. Consequently equations (A1)– (A4), (10)–(12) define the model as fitted to a given dataset when we assume the underlying interpurchase times follow the Erlang-2 distribution. 29 References Advertising Age (2000), “Safe at Any Speed?” January 24, 1, 12. Armstrong, J. Scott (2001), “Evaluating Forecasting Methods,” in J. Scott Armstrong (ed.), Principles of Forecasting: A Handbook for Researchers and Practitioners, Norwell, MA: Kluwer Academic Publishers, 443–472. Broadbent, Simon (1984), “Modelling with Adstock,” Journal of the Market Research Society, 26 (4), 295–312. Bronnenberg, Bart J., Vijay Mahajan, and Wilfried R. Vanhonacker (2000), “The Emergence of Market Structure in New Repeat-Purchase Categories: The Interplay of Market Share and Retailer Distribution,” Journal of Marketing Research, 37 (February), 16–31. Chatfield, C. and G. J. Goodhardt (1973), “A Consumer Purchasing Model with Erlang InterPurchase Times,” Journal of the American Statistical Association, 68 (December), 828–835. Clarke, Darral G. (1984), “G. D. Searle & Co.: Equal Low-Calorie Sweetener (B),” Harvard Business School Case 9-585-011. East, Robert and Kathy Hammond (1996), “The Erosion of Repeat-Purchase Loyalty,” Marketing Letters, 7 (March), 163–171. Eskin, Gerald J. (1973), “Dynamic Forecasts of New Product Demand Using a Depth of Repeat Model,” Journal of Marketing Research, 10 (May), 115–129. Fader, Peter S. and Bruce G. S. Hardie (1999), “Investigating the Properties of the Eskin/Kalwani & Silk Model of Repeat Buying for New Products,” in Lutz Hildebrandt, Dirk Annacker, and Daniel Klapper (eds.), Marketing and Competition in the Information Age, Proceedings of the 28th EMAC Conference, May 11–14, Berlin: Humboldt University. Fader, Peter S. and Bruce G. S. Hardie (2001), “Forecasting Trial Sales of New Consumer Packaged Goods,” in J. Scott Armstrong (ed.), Principles of Forecasting: A Handbook for Researchers and Practitioners, Norwell, MA: Kluwer Academic Publishers, 613–630. Fader, Peter S. and James M. Lattin (1993), “Accounting for Heterogeneity and Nonstationarity in a Cross-Sectional Model of Consumer Purchase Behavior,” Marketing Science, 12 (Summer), 304–317. Fourt, Louis A. and Joseph W. Woodlock (1960), “Early Prediction of Market Success for New Grocery Products,” Journal of Marketing, 25 (October), 31–38. Gupta, Sunil (1991), “Stochastic Models of Interpurchase Time with Time-Dependent Covariates,” Journal of Marketing Research, 28 (February), 1–15. Hardie, Bruce G. S., Peter S. Fader, and Michael Wisniewski (1998), “An Empirical Comparison of New Product Trial Forecasting Models,” Journal of Forecasting, 17 (June–July), 209–229. Helsen, Kristiaan and David Schmittlein (1994), “Understanding Price Effects For New Nondurables: How Price Responsiveness Varies Across Depth-of-Repeat Classes and Types of Consumers,” European Journal of Operational Research, 76 (July 28), 359–374. Henderson, R. and J. N. S. Matthews (1993), “An Investigation of Changepoints in the Annual Number of Cases of Haemolytic Uraemic Syndrome,” Applied Statistics, 42 (3), 461–471. 30 Herniter, Jerome (1971), “A Probabilistic Market Model of Purchase Timing and Brand Selection,” Management Science, 18 Part II (December), P102–P113. Howard, Ronald A. (1965), “Dynamic Inference,” Operations Research, 13 (September–October), 712–733. Jeuland, Abel P., Frank M. Bass, and Gordon P. Wright (1980), “A Multibrand Stochastic Model Compounding Heterogeneous Erlang Timing and Multinomial Choice Processes,” Operations Research, 28 (March-April), 255–277. Kalwani, Manohar and Alvin J. Silk (1980), “Structure of Repeat Buying for New Packaged Goods,” Journal of Marketing Research, 17 (August), 316–322. Massy, William F. (1969), “Forecasting the Demand for New Convenience Products,” Journal of Marketing Research, 6 (November), 405–412. Morrison, Donald G. (1969), “Conditional Trend Analysis: A Model that Allows for Nonusers,” Journal of Marketing Research, 6 (August), 342–346. Morrison, Donald G. and David C. Schmittlein (1988), “Generalizing the NBD Model for Customer Purchases: What Are the Implications and Is It Worth the Effort?” Journal of Business and Economic Statistics, 6 (April), 145–159. Pievatolo, Antonio and Renata Rotondi (2000), “Analysing the Interevent Time Distribution to Identify Seismicity Phases: A Bayesian Nonparametric Approach to the Multiple-Changepoint Problem,” Applied Statistics, 49 (4), 543–562. Raftery, A. E. and V .E Akman (1986), “Bayesian Analysis of a Poisson Process with a ChangePoint,” Biometrika, 73 (1), 85–89. Rangan, V. Kasturi and Marie Bell (1994), “Nestlé Refrigerated Foods: Contadina Pasta & Pizza (A),” Harvard Business School Case 9-595-035. Sabavala, Darius J. and Donald G. Morrison (1981), “A Nonstationary Model of Binary Choice Applied to Media Exposure,” Management Science, 27 (June), 637–657. Schmittlein, David C., Donald G. Morrison, and Richard Colombo (1987), “Counting Your Customers: Who They Are and What Will They Do Next?” Management Science, 33 (January), 1–24. 31

An Integrated Trial/Repeat Model for New Product Sales

Related documents

Products

Support

An Integrated Trial/Repeat Model for New Product Sales

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib