Intern. J. of Research in Marketing 31 (2014) 127–140
Journal homepage: www.elsevier.com/locate/ijresmar

From academic research to marketing practice: Exploring the marketing science value chain

John H. Roberts a,⁎, Ujwal Kayande b, Stefan Stremersch c,d

a London Business School and Australian National University, Canberra, Australia
b Melbourne Business School, Melbourne, Australia
c Erasmus School of Economics, Erasmus University Rotterdam, The Netherlands
d IESE Business School, University of Navarra, Barcelona, Spain

Article info
Article history: First received in 28 September 2012 and was under review for 4 months. Available online 1 October 2013.
Area Editor: Dominique M. Hanssens
Guest Editor: Marnik G. Dekimpe

Abstract

We aim to investigate the impact of marketing science articles and tools on the practice of marketing. This impact may be direct (e.g., an academic article may be adapted to solve a practical problem) or indirect (e.g., its contents may be incorporated into practitioners' tools, which then influence marketing decision making). We use the term “marketing science value chain” to describe these diffusion steps, and survey marketing managers, marketing science intermediaries (practicing marketing analysts), and marketing academics to calibrate the value chain. In our sample, we find that (1) the impact of marketing science is perceived to be largest on decisions such as the management of brands, pricing, new products, product portfolios, and customer/market selection, and (2) tools such as segmentation, survey-based choice models, marketing mix models, and pre-test market models have the largest impact on marketing decisions. Exemplary papers from 1982 to 2003 that achieved dual (academic and practice) impact are Guadagni and Little (1983) and Green and Srinivasan (1990). Overall, our results are encouraging.
First, we find that the impact of marketing science has been largest on marketing decision areas that are important to practice. Second, we find moderate alignment between academic impact and practice impact. Third, we identify antecedents of practice impact among dual impact marketing science papers. Fourth, we discover more recent trends and initiatives in the period 2004–2012, such as the increased importance of big data and the rise of digital and mobile communication, using the marketing science value chain as an organizing framework.

© 2013 The Authors. Published by Elsevier B.V. Open access under CC BY-NC-ND license.
http://dx.doi.org/10.1016/j.ijresmar.2013.07.006
ISSN 0167-8116

⁎ Corresponding author. E-mail addresses: jhroberts@london.edu (J.H. Roberts), U.Kayande@mbs.edu (U. Kayande), stremersch@ese.eur.nl (S. Stremersch).

1. Introduction

Does marketing science research affect marketing practice? Which decisions have marketing science articles supported? To which tools has marketing science contributed? Which marketing science articles have had dual impact on both science and practice? These are key questions that we address in this paper.

We define marketing science as the development and use of quantifiable concepts and quantitative tools to understand marketplace behavior and the effect of marketing activity upon it. From this definition, one would consider it reasonable for marketing scientists to seek impact on marketing practice, i.e., seek relevance. However, marketing scientists have recently rekindled the age-old debate on rigor versus relevance. On the one hand, marketing science has been very successful in attracting scholars from other fields such as economics, statistics, econometrics and psychology. This inflow of talented scientists from other fields has clearly added to the rigor of marketing science and has allowed the development of new techniques.
On the other hand, a number of academic scholars have recently called for more emphasis to be placed on the application of marketing science to industry problems, rather than rigor per se (e.g., Lehmann, McAlister, & Staelin, 2011; Lilien, 2011; Reibstein, Day, & Wind, 2009). Such application may also show positive returns to firms. Germann, Lilien, and Rangaswamy (2013) find that increasing analytics deployment by firms leads to an improvement in their return on assets.

Despite the importance of this debate for our field and the strong interest in the drivers of academic impact (e.g., see Stremersch & Verhoef, 2005; Stremersch, Verniers, & Verhoef, 2007), empirical examination of the impact of marketing science on practice is rare. Valuable exceptions are Bucklin and Gupta (1999), Cattin and Wittink (1982), Wittink and Cattin (1989), and Wittink, Vriens, and Burhenne (1994). However, their application areas were narrow. Wittink and his colleagues studied the commercial use of conjoint analysis in North America and Europe, while Bucklin and Gupta studied the usage of scanner data and the models that scholars have developed to analyze them.

Other scholars have conceptually reviewed the impact of marketing science and prescribed areas in which marketing science might have an impact in the future. In a special issue of the International Journal of Research in Marketing, Leeflang and Wittink (2000) summarized the areas in which marketing science has been used to inform management decisions. Roberts (2000) acknowledged the breadth of marketing science applications, but lamented the limited depth of penetration of marketing science (i.e., the proportion of management decisions informed by marketing science models).
Lilien, Roberts, and Shankar (2013) take an applications-based approach to best practice. However, there has been no broad systematic investigation of which marketing science articles and tools have been applied, the decisions that these concepts and tools have informed, and the perceptions of different stakeholders of the usefulness of marketing science in informing decisions. We aim to address this void. We develop the concept of the marketing science value chain, which captures the diffusion of insights from academic articles in a direct (e.g., from article to practice) or indirect (e.g., from article to marketing science tool to practice) manner. We survey the primary agents in this value chain – marketing managers, marketing science intermediaries (marketing analysts), and marketing academics – to calibrate the practice impact of marketing science in all its facets. 2. Methodology 2.1. The framework: The marketing science value chain An important step in our methodology is a conceptualization of the marketing science value chain. Our representation of this chain, illustrated in Fig. 1, depicts activities (full arrows) by which marketing science is translated from academic knowledge to practical tools, and thence to marketing action, as well as the participants involved in the chain. First, new knowledge (marketing science articles) is developed, often but not always, by marketing academics.1 Second, knowledge conversion occurs when new knowledge in articles is adapted and integrated into practical tools and approaches, often but again not always, by marketing intermediaries, such as market research agencies (e.g. ACNielsen or GfK), marketing and strategy consultancies (e.g., McKinsey or Bain), specialist niche marketing consulting firms (e.g. Advanis or Simon-Kucher Partners), or the marketing science division of a marketing organization (e.g. Novartis or General Mills). 
Third, knowledge application occurs when marketing managers implement marketing science knowledge via practical tools to make marketing decisions. While we contend in Fig. 1 that marketing intermediaries play a critical role in the diffusion process, we allow for a direct path as well (disintermediation). For example, marketing academics may work directly with marketing managers to have their tools adopted (marketing science push) or a firm's internal analysts may actively seek out solutions to address the firm's specific problems (marketing science pull). Alternatively, the locus of conceptual innovation may fall further down the value chain (user innovation). Moreover, diffusion may occur through routes other than through intermediaries (for example, via specialist books such as Lilien, Kotler, & Moorthy, 1992; Wierenga & van Bruggen, 2000, and Lilien, Rangaswamy, & De Bruyn, 2007 or general texts such as Kotler & Keller, 2012). In other words, the “direct” influence in Fig. 1 may include a number of further sub-stages that we do not explicitly identify or calibrate. 2.2. The elements: Decisions, tools and articles In Fig. 1, we identify three core elements in the marketing science value chain: decisions, tools, and articles. Selection of stimuli in each of these elements is a critical part of our methodology, especially considering the scope of our study. Not only have thousands of marketing articles been published across many journals, but marketing managers make decisions to solve marketing problems in a wide variety of areas (pricing, promotions, sales force management, etc.), using a considerable range of marketing science tools (segmentation tools, choice models, etc.) to assist in that decision making. To make our calibration practically feasible, we decided to limit the three sets of stimuli to 12 decision areas, 12 marketing science tools, and 20 marketing science articles. 
We decided on these limits iteratively, by trading off the need for a comprehensive classification of the decisions, tools, and articles against the time required for respondents to react to the stimuli. In Section 3.5 we discuss the dynamics of these three elements.

2.2.1. Decisions

Marketing decisions refer to the choice of management actions regarding any part of the firm's marketing activity. To categorize marketing decisions, we followed a four-step procedure. First, we examined subject areas used at the major marketing journals and in leading marketing management textbooks. Second, we integrated and synthesized these lists to create an exhaustive inventory. Third, we aggregated the different decision areas into higher order categories, to create a manageable number. Finally, we tested our list with practicing managers and the Executive Committee of the Marketing Science Institute, and refined it based on their feedback. Our final list of marketing decision areas is:

1. Brand management: Developing, positioning and managing existing brands.
2. New product/service management: New product development, management and diffusion.
3. Marketing strategy: Product line, multi-product and portfolio strategies.
4. Advertising management: Advertising spending, planning and design.
5. Promotion management: Promotion decisions.
6. Pricing management: Pricing decisions.
7. Sales force management: Sales force size, allocation, and compensation decisions.
8. Channel management: Channel strategy, design, and monitoring.
9. Customer/market selection: Targeting decisions.
10. Relationship management: Customer value assessment and maximization, acquisition, retention, and relationship management.
11. Managing marketing investments: Organizing for higher returns and internal marketing.
12. Service/product quality management: Any aspect of quality management.

Fig. 1. The marketing science value chain.

1 For example, a study of Marketing Science over the period 1982–2003 shows that of 1072 article authors, 1001 of them were academics (93.4%). Authors with multiple articles are counted as many times as they have (co-)authored an article.

2.2.2. Tools

Tools are approaches and methodologies that can be used to support marketing decisions. To categorize marketing science tools, we followed a procedure similar to the one used for decision areas (using marketing research and marketing analysis texts). Our list of tools is:

1. Segmentation tools: latent class segmentation, cluster analysis, etc.
2. Perceptual mapping: multidimensional scaling, factor analysis, etc.
3. Survey-based choice models: conjoint analysis, discrete choice, etc.
4. Panel-based choice models: choice models, stochastic models, etc.
5. Pre-test market models: ASSESSOR, durable pre-testing, etc.
6. New product models: diffusion models, dynamic models, etc.
7. Aggregate marketing response models: marketing mix models, etc.
8. Sales force allocation models: Call planning models, etc.
9. Customer satisfaction models: Models of service quality, satisfaction, etc.
10. Game theory models: Models of competition, channel structure, etc.
11. Customer lifetime value models: Loyalty and direct marketing models, etc.
12. Marketing metrics: Accounting models, internal rate of return, etc.

2.2.3. Articles

We selected the twenty marketing science articles by applying four filters. First, we filter the journals and time period from which to sample. Second, we select the 200 articles in the sampled journals and time period that have made the highest academic impact, measured by age-adjusted citations. Third, we reduce the list of 200 to 100 by weighting academic impact by the likelihood that an article represents marketing science.
Fourth, we reduce the list of 100 high-impact marketing articles to the 20 articles that marketing intermediaries rated as most impactful on marketing practice. Next, we explain this procedure in greater detail.

For the first filter, our aim was to achieve a good representation of major marketing journals, which we based on prior scientometric work in marketing (Stremersch et al., 2007). We excluded the Journal of Consumer Research (JCR) as it is not an outlet that typically publishes marketing science articles. We added Management Science because, for example, it consistently features in the Financial Times Top 45 and has a marketing section. This step thus led us to the following selection of journals: International Journal of Research in Marketing (IJRM), Journal of Marketing (JM), Journal of Marketing Research (JMR), Management Science (MGS) and Marketing Science (MKS). Next, we assessed how long the journals were covered in the Social Science Citation Index. Young journals need time to mature and become academically and practically impactful, which may make them less suited for our goals, even if they are a top journal. IJRM is the youngest top journal in the set and was not included in the Social Science Citation Index until 1997. Therefore, in 2006, it was very unlikely for IJRM articles from the period 1997–2003 to have amassed enough citations to be among the top 200 age-adjusted cited articles and be included in our further analytical steps. Later analyses on an expanded sample that included IJRM showed this assessment to be accurate. The most highly ranked IJRM article was Geyskens, Steenkamp, and Kumar (1998) at rank 255. We selected the period 1982–2003 as our observation window. We chose the start year of our data to coincide with the launch of Marketing Science in 1982. We chose the end year of 2003 to allow articles at least 2 full years for their impact to materialize (this is common in citation studies, see Stremersch et al., 2007).
Second, we rank-ordered the resultant 5556 articles on their academic importance, as measured by age-adjusted citations (see Stremersch et al., 2007 for a similar procedure). As citations show a time trend, we first de-trended our measure by regressing the number of citations of an article i (CITE_i) on the number of quarters (Q_i) that have passed between publication and the quarter in which we gathered the citations, and on its square (Q_i^2), including a constant (across all articles). We conducted this study in the 3rd quarter of 2006 and thus obtained the stock of citations that were in the ISI databases in that quarter. As CITE_i shows over-dispersion, we specified a negative binomial count model and optimized with quadratic hill climbing. As expected, our results indicated an inverted U-shaped time trend (the estimated coefficients for Q_i and Q_i^2 were 0.07 and −4.76E−04 respectively, both significant at p < 0.001; R^2 = 0.035). We obtain standardized residuals from the model, denoted by CITERESID_i, which can be regarded as a time-corrected citation measure of academic impact. We retained the top 200 articles ranked on this academic impact measure.

Third, we examined the MGS articles in this top 200 and excluded the 71 articles that did not consider a marketing subject, because they could not possibly be “marketing” science. Next, we calculated the extent to which each of the 129 remaining marketing articles is a marketing “science” article. We found the task of defining marketing “science” difficult.
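To illustrate the inverted U-shape of the de-trending regression in the second filter, the reported coefficients can be turned into an implied citation peak. This is only a sketch: it assumes the conventional log link for a negative binomial model, which the text does not state, and the symbols β0, β1, β2 and Q* are introduced here for illustration. Because the coefficients are reported in rounded form, the result is approximate.

```latex
% Assumed specification (log link), with the reported estimates
\[
  \mathbb{E}\left[\mathrm{CITE}_i \mid Q_i\right]
    = \exp\left(\beta_0 + \beta_1 Q_i + \beta_2 Q_i^2\right),
  \qquad \hat{\beta}_1 = 0.07, \quad \hat{\beta}_2 = -4.76 \times 10^{-4}
\]
% Expected citations peak where the derivative of the exponent is zero
\[
  Q^{*} = -\frac{\hat{\beta}_1}{2\hat{\beta}_2}
        = \frac{0.07}{2 \times 4.76 \times 10^{-4}}
        \approx 73.5 \text{ quarters} \approx 18.4 \text{ years}
\]
```

Under this assumption, the expected citation stock rises with article age up to roughly 18 years after publication and declines thereafter, which is the pattern the standardized residual is meant to correct for.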
After many discussions with experts, we came to the following working definition: “Marketing science is the development and use of quantifiable concepts and quantitative tools to understand marketplace behavior and the effect of marketing activity upon it.”2 To determine whether a specific article satisfied this definition, we asked five pairs of marketing science experts (members of the Marketing Science and IJRM Editorial Boards, and leading marketing intermediaries) to individually code, as marketing science articles or not, 100 articles from a hold-out sample published in the four journals in 2004–2005.3 The proportion of agreement between the raters was 0.77, which translated into a proportional reduction of loss (PRL) inter-rater reliability measure of 0.72 (Rust & Cooil, 1994), satisfactory for the exploratory nature of our research. We created a variable that took the value 1 if both raters agreed it was a marketing science article, 0 otherwise. Next, we inventoried the number of equations to measure an article's mathematical sophistication (also used by Stremersch & Verhoef, 2005), the methodologies an article uses, going from qualitative techniques to time series and analytical models, and the number of referenced articles in econometrics, statistics and mathematics. Stepwise logistic regression revealed two significant predictors: the number of equations and whether the methodology used factor and/or cluster analysis or not. The more equations an article contains, the higher the likelihood of it being considered a marketing science article. Articles that use factor or cluster analyses are generally less likely to be perceived as marketing science articles, probably because these are exploratory techniques. The fit of this model is reasonable; the hit rate was 75%, which compares favorably to chance (50.5%). We applied these model coefficients, calibrated on the out-of-sample 2004–2005 articles, to the 129 marketing articles identified earlier and retrieved an estimated probability that an article is marketing science, denoted as PROBMKS_i. We then weighted the age-adjusted citation impact (CITERESID_i) by the likelihood of the article being marketing science (PROBMKS_i) to obtain our final measure of marketing science academic impact for each article (MKSIMPACT_i). We rank-ordered the 129 articles on this latter measure and selected the top 100 articles. We provide the full list of 100 articles and all metrics in Table 1. Complete references are included in Web appendix 1.3.

2 This definition aligns closely with Germann et al.'s (2013) definition of marketing analytics.
3 We selected 25 articles from each of the four journals. Article selection was random for JM, JMR, and Marketing Science. For Management Science, we inventoried 25 articles from 2004 to 2005 that were marketing-related. The list of 100 articles is provided in Web Appendix 1.1.

Table 1
The 100 academically most impactful papers in marketing science (ordered by practice impact, and then by academic impact; complete bibliography is available in the Web Appendix 3).

Rank | Authors, publication year | Cites total | CITERESID | PROBMKS | Academic impact: MKSIMPACT | Practice impact: INTIMPACT
1 | Green and Srinivasan (1990) | 292 | 4.34 | 0.47 | 2.04 | 4.22
2 | Louviere and Woodworth (1983) | 195 | 2.99 | 0.78 | 2.35 | 3.56
3 | Aaker and Keller (1990) | 170 | 2.21 | 0.45 | 1.00 | 3.50
4 | Cattin and Wittink (1982) | 152 | 2.48 | 0.45 | 1.12 | 3.25
5 | Guadagni and Little (1983) | 431 | 7.44 | 0.80 | 5.94 | 3.22
6 | Mahajan et al. (1990) | 268 | 3.82 | 0.87 | 3.31 | 3.11
7 | Rust et al. (1995) | 146 | 2.88 | 0.77 | 2.22 | 3.00
8 | Hauser and Shugan (1983) | 152 | 2.21 | 0.93 | 2.04 | 3.00
9 | Fornell, Johnson, Anderson, Cha, and Bryant (1996) | 136 | 3.29 | 0.45 | 1.48 | 3.00
10 | Griffin and Hauser (1993) | 166 | 2.62 | 0.47 | 1.23 | 2.89
11 | Day (1994) | 321 | 6.63 | 0.45 | 2.98 | 2.67
12 | Punj and Stewart (1983) | 263 | 4.40 | 0.47 | 2.07 | 2.67
13 | Fornell (1992) | 159 | 2.55 | 0.80 | 2.04 | 2.67
14 | Van Heerde et al. (2003) | 25 | 2.01 | 0.72 | 1.45 | 2.63
15 | Hunt and Morgan (1995) | 149 | 2.95 | 0.45 | 1.33 | 2.63
16 | Anderson, Fornell, and Lehmann (1994) | 217 | 4.18 | 0.65 | 2.73 | 2.44
17 | Simonson and Tversky (1992) | 213 | 3.37 | 0.53 | 1.80 | 2.38
18 | Boulding et al. (1993) | 250 | 4.23 | 0.42 | 1.79 | 2.38
19 | Parasuraman, Zeithaml, and Berry (1985) | 765 | 12.08 | 0.45 | 5.44 | 2.25
20 | Keller (1993) | 250 | 4.23 | 0.45 | 1.90 | 2.25
21 | Yu & Cooper (1983) | 192 | 3.13 | 0.47 | 1.47 | 2.25
22 | Urban, Carter, Gaskin & Mucha (1986) | 162 | 2.07 | 0.57 | 1.19 | 2.25
23 | Carpenter & Nakamoto (1989) | 157 | 1.97 | 0.55 | 1.09 | 2.22
24 | Zeithaml, Berry & Parasuraman (1996) | 191 | 2.65 | 0.45 | 1.19 | 2.13
25 | Dickson & Sawyer (1990) | 160 | 2.09 | 0.45 | 0.94 | 2.13
26 | Zeithaml, Parasuraman & Berry (1985) | 225 | 5.41 | 0.14 | 0.76 | 2.13
27 | Joreskog & Sorbom (1982) | 133 | 2.04 | 0.84 | 1.71 | 2.11
28 | Day & Wensley (1988) | 233 | 3.14 | 0.45 | 1.41 | 2.11
29 | Thaler (1985) | 532 | 8.31 | 0.53 | 4.43 | 2.00
30 | Kamakura & Russell (1989) | 242 | 3.37 | 0.84 | 2.81 | 2.00
31 | Zeithaml (1988) | 390 | 5.64 | 0.45 | 2.54 | 2.00
32 | Bolton (1998) | 85 | 2.31 | 0.69 | 1.59 | 2.00
33 | Tversky & Simonson (1993) | 121 | 1.90 | 0.78 | 1.49 | 2.00
34 | Churchill & Surprenant (1982) | 262 | 4.58 | 0.13 | 0.60 | 2.00
35 | Fornell & Bookstein (1982) | 210 | 3.20 | 0.49 | 1.57 | 1.89
36 | Mittal & Kamakura (2001) | 56 | 2.62 | 0.59 | 1.56 | 1.89
37 | Srivastava, Shervani & Fahey (1998) | 92 | 2.55 | 0.45 | 1.15 | 1.88
38 | Churchill, Ford, Hartley & Walker (1985) | 161 | 2.14 | 0.45 | 0.96 | 1.88
39 | Gupta (1988) | 206 | 2.72 | 0.74 | 2.01 | 1.75
40 | Teas (1993) | 135 | 2.18 | 0.65 | 1.42 | 1.75
41 | Anderson & Sullivan (1993) | 200 | 3.34 | 0.65 | 2.18 | 1.67
42 | Gutman (1982) | 157 | 2.65 | 0.45 | 1.19 | 1.67
43 | Jaworski and Kohli (1993) | 411 | 7.64 | 0.53 | 4.07 | 1.63
44 | Slater & Narver (1994) | 238 | 5.22 | 0.45 | 2.35 | 1.63
45 | McGuire & Staelin (1983) | 140 | 2.08 | 0.95 | 1.98 | 1.63
46 | Parasuraman, Zeithaml & Berry (1994) | 178 | 3.14 | 0.57 | 1.80 | 1.63
47 | Mackenzie & Lutz (1989) | 208 | 2.82 | 0.47 | 1.33 | 1.63
48 | Robinson & Fornell (1985) | 174 | 2.33 | 0.55 | 1.29 | 1.63
49 | Bitner, Booms & Tetreault (1990) | 183 | 2.76 | 0.45 | 1.24 | 1.63
50 | Bolton & Lemon (1999) | 62 | 2.00 | 0.61 | 1.23 | 1.63
51 | Henard & Szymanski (2001) | 42 | 2.08 | 0.49 | 1.02 | 1.63
52 | Bitner (1990) | 294 | 4.28 | 0.45 | 1.93 | 1.50
53 | Perreault & Leigh (1989) | 168 | 2.13 | 0.65 | 1.39 | 1.50
54 | Ruekert & Walker (1987) | 185 | 2.39 | 0.45 | 1.08 | 1.50
55 | Mackenzie, Lutz & Belch (1986) | 167 | 2.11 | 0.45 | 0.95 | 1.50
56 | Alba, Lynch, Weitz, Janiszewski, Lutz, Sawyer & Wood (1997) | 182 | 5.15 | 0.45 | 2.32 | 1.38
57 | Webster (1992) | 273 | 4.57 | 0.45 | 2.06 | 1.38
58 | Haubl & Trifts (2000) | 60 | 2.26 | 0.49 | 1.11 | 1.38
59 | Bearden, Sharma & Teel (1982) | 125 | 1.89 | 0.45 | 0.85 | 1.38
60 | Han, Kim & Srivastava (1998) | 97 | 3.12 | 0.16 | 0.51 | 1.33
61 | Dwyer, Schurr & Oh (1987) | 632 | 9.53 | 0.45 | 4.29 | 1.25
62 | Lynch & Ariely (2000) | 83 | 3.35 | 0.47 | 1.58 | 1.25
63 | Pollay (1986) | 157 | 1.99 | 0.45 | 0.89 | 1.25
64 | Bitner (1992) | 281 | 4.03 | 0.45 | 1.82 | 1.22
65 | Cronin & Taylor (1992) | 399 | 6.82 | 0.21 | 1.45 | 1.22
66 | Oliver (1999) | 81 | 2.64 | 0.45 | 1.19 | 1.22
67 | Garbarino & Johnson (1999) | 114 | 4.15 | 0.13 | 0.54 | 1.22
68 | Crosby, Evans & Cowles (1990) | 259 | 3.74 | 0.13 | 0.49 | 1.22
69 | Cronin & Taylor (1994) | 153 | 2.62 | 0.45 | 1.18 | 1.13
70 | Rindfleisch & Heide (1997) | 92 | 2.43 | 0.45 | 1.09 | 1.13
71 | Kalwani & Narayandas (1995) | 117 | 2.12 | 0.45 | 0.96 | 1.13
72 | Ganesan (1994) | 311 | 6.07 | 0.13 | 0.80 | 1.13
73 | Doney & Cannon (1997) | 218 | 6.05 | 0.13 | 0.79 | 1.13
74 | Morgan and Hunt (1994) | 690 | 14.52 | 0.45 | 6.54 | 1.00
75 | Bakos (1997) | 156 | 4.52 | 0.89 | 4.04 | 1.00
76 | Narver & Slater (1990) | 440 | 6.83 | 0.45 | 3.08 | 1.00
77 | Anderson, Hakansson & Johanson (1994) | 139 | 2.55 | 0.45 | 1.15 | 1.00
78 | Deshpande, Farley & Webster (1993) | 119 | 1.92 | 0.45 | 0.87 | 1.00
79 | Kohli & Jaworski (1990) | 442 | 6.73 | 0.45 | 3.03 | 0.89
80 | Jeuland & Shugan (1983) | 185 | 2.87 | 0.96 | 2.76 | 0.89
81 | Gorn (1982) | 170 | 2.99 | 0.45 | 1.35 | 0.89
82 | Anderson & Coughlan (1987) | 213 | 2.89 | 0.45 | 1.30 | 0.89
83 | Phillips, Chang & Buzzell (1983) | 166 | 2.57 | 0.45 | 1.16 | 0.89
84 | Gaski (1984) | 164 | 2.30 | 0.45 | 1.03 | 0.89
85 | Novak, Hoffman & Yung (2000) | 92 | 3.77 | 0.13 | 0.50 | 0.89
86 | Lovelock (1983) | 191 | 2.98 | 0.45 | 1.34 | 0.78
87 | Solomon, Surprenant, Czepiel & Gutman (1985) | 158 | 2.12 | 0.45 | 0.96 | 0.75
88 | Anderson & Narus (1990) | 442 | 6.67 | 0.13 | 0.88 | 0.75
89 | Zirger & Maidique (1990) | 150 | 1.93 | 0.45 | 0.87 | 0.67
90 | Deshpande & Zaltman (1982) | 248 | 4.19 | 0.13 | 0.55 | 0.67
91 | Sinkula (1994) | 152 | 2.60 | 0.45 | 1.17 | 0.63
92 | Anderson & Weitz (1989) | 185 | 2.39 | 0.45 | 1.08 | 0.63
93 | Hirschman & Holbrook (1982) | 234 | 4.12 | 0.45 | 1.86 | 0.56
94 | Huber & McCann (1982) | 133 | 2.10 | 0.67 | 1.41 | 0.56
95 | Tse & Wilton (1988) | 175 | 2.21 | 0.47 | 1.04 | 0.56
96 | Anderson & Weitz (1992) | 265 | 4.18 | 0.17 | 0.73 | 0.56
97 | Hoffman & Novak (1996) | 287 | 7.31 | 0.45 | 3.29 | 0.50
98 | Slater & Narver (1995) | 197 | 3.54 | 0.45 | 1.59 | 0.44
99 | Ferrell & Gresham (1985) | 290 | 4.27 | 0.45 | 1.92 | 0.33
100 | Gerbing and Anderson (1988) | 453 | 6.64 | 0.21 | 1.41 | 0.33

Notes:
1. In the case of ties in practice impact, we reverted to academic impact to determine which articles got into the top 20.
2. CITERESID is age-adjusted citation impact, measured by the residual from the negative binomial model with citations as the dependent variable and quarters since publication and its square as the independent variables.
3. PROBMKS is the probability that the article is a marketing science article, see Section 2.2.3 for details.
4. MKSIMPACT = CITERESID × PROBMKS.
5. INTIMPACT is awareness-adjusted impact, which is the average impact across all respondents assuming that the impact is 0 for articles of which the respondent is not aware.

Table 1 shows that our methodology leads to credible results, with substantial face validity. For instance, Guadagni and Little (1983) and Mahajan, Muller, and Bass (1990) are more likely to be regarded as being marketing science articles than Morgan and Hunt (1994) and Jaworski and Kohli (1993).

Because one of our goals is to survey academics and intermediaries on the impact on practice of individual marketing science articles, we needed to reduce the list of 100 articles to 20, to make the task manageable for our respondents. In the final reduction from 100 articles to 20, we wanted to account for practice impact and asked 34 marketing intermediaries to rate the practical impact of four randomized blocks of 25 articles. The respondents were from a larger pool of 54 intermediaries (63% response rate) who worked in marketing science intermediary roles in firms such as AC Nielsen, Mercer, GfK, and McKinsey.
These intermediaries were specifically selected because (i) they had previously published papers in or were on the Editorial Board of Marketing Science, and/or (ii) were past or current members of the Practice Committee of INFORMS ISMS. We asked these 34 respondents if they were aware of each article and, if so, the impact on practice that they believed that it had had, using a 5-point verbally anchored scale (1 = no influence; 5 = extremely influential). We gave a score of 0 to those articles of which the respondents were not aware, assuming that there could not be a direct impact if the respondent was not even aware of the article when prompted. We then calculated an average impact across all respondents for each article, calling it an awareness-adjusted practice impact score (denoted as INTIMPACT). Rank-ordering all 100 articles on INTIMPACT allowed us to select the 20 highest ranked articles, which we used in our large-scale survey of academics and intermediaries. We found no significant differences in the average awareness-adjusted practice impact score across the four groups of intermediaries.

We acknowledge that starting with a citation screen (as well as a screen in terms of journal outlet) may preclude consideration of some papers with high impact on practice, but low impact on scholarship. Our intention, though, was not to measure which marketing articles had the highest practice impact per se. Rather, our intention was to identify marketing papers with high dual impact, including both academic and practice impact.

2.3. The participants: Managers, intermediaries and academics

We use samples from each participant population (managers, intermediaries, and academics) to inventory the impact of marketing science on marketing practice, along the marketing science value chain, described in Fig. 1.
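Returning briefly to the article-selection step in Section 2.2.3, the awareness-adjusted scoring and the tie-breaking rule from Note 1 of Table 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the nine-respondent rating vector are hypothetical, while the CITERESID, PROBMKS and INTIMPACT values for the three tied articles are taken from Table 1.

```python
from statistics import mean


def awareness_adjusted_impact(ratings):
    """Average practice impact, scoring unaware respondents as 0.

    Each entry is a 1-5 rating, or None when the respondent was not
    aware of the article.
    """
    return mean(0 if r is None else r for r in ratings)


def rank_articles(articles):
    """Order by INTIMPACT, breaking ties by MKSIMPACT = CITERESID x PROBMKS."""
    return sorted(
        articles,
        key=lambda a: (a["intimpact"], a["citeresid"] * a["probmks"]),
        reverse=True,
    )


# Hypothetical ratings from nine respondents (three were unaware).
example_ratings = [5, 4, None, 3, None, 5, 4, 3, None]
example_score = awareness_adjusted_impact(example_ratings)  # 24 / 9

# Three articles tied at INTIMPACT = 3.00 in Table 1.
tied = [
    {"name": "Fornell, Johnson, Anderson, Cha, and Bryant (1996)",
     "citeresid": 3.29, "probmks": 0.45, "intimpact": 3.00},
    {"name": "Hauser and Shugan (1983)",
     "citeresid": 2.21, "probmks": 0.93, "intimpact": 3.00},
    {"name": "Rust et al. (1995)",
     "citeresid": 2.88, "probmks": 0.77, "intimpact": 3.00},
]
ranked_names = [a["name"] for a in rank_articles(tied)]
print(round(example_score, 2), ranked_names)
```

With these inputs the three tied articles come out in the same order as rows 7 to 9 of Table 1, which is consistent with the tie-breaking rule stated in the table notes.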
We do not expect marketing managers to be aware of many, if any, academic articles, even where those articles have been incorporated into the marketing science tools that they routinely use. Thus, marketing managers can inform us only on knowledge conversion (tools) and knowledge application (decisions). However, we also calibrate managers' perceived importance of different areas of marketing decision making.

2.3.1. Sample of managers

Our sample of senior marketing managers consisted of Marketing Science Institute and Institute for the Study of Business Markets (ISBM) members and company contacts. Both institutes graciously emailed a request from us to their members. In total, we solicited survey participation from 477 managers, of whom 94 (20%)4 provided usable responses. While this group comes from a well-defined population, it almost certainly has a bias towards greater sophistication. This sophistication is likely to introduce an upward bias in the perceived impact of tools and their influence in different areas (the absolute impact of marketing science). However, there is no reason to believe that this bias will be very different for different tools and decision areas, meaning any bias in the relative effects will likely be considerably less.

2.3.2. Sample of intermediaries

We used four sources to create the sample of intermediaries. First, we examined all articles published by practitioner analysts in Marketing Science and included those authors in our sample. Second, we examined the editorial boards of our target journals and included any intermediaries on these boards. Third, the Marketing Science Institute contacted the marketing intermediaries among their members on our behalf. Finally, we surveyed marketing intermediaries attending the 2007 ISMS Marketing Science Practice Conference, held at the Wharton School.
In total, we solicited participation from 93 intermediaries, of whom 34 (37%) participated in the main survey. Twenty-one of these respondents worked at marketing and/or management consulting firms such as McKinsey, AC Nielsen, and Millward Brown, while 13 respondents worked in firms such as General Motors, IBM, and Campbell Soup.

2.3.3. Sample of academics

We defined the sampling frame of marketing academics to be academic marketing science members of the editorial boards of the target journals. We excluded the authors of the current paper from this sampling frame. To identify the “marketing science” members of those editorial boards, we used a peer review process, in which we asked ten marketing science experts to indicate whether they would classify members of these editorial boards (223 in total) as marketing scientists or not.5 Of the 223 editorial board members in total, 126 were classified as marketing scientists, of whom 84 (67%) ultimately responded to our survey.

2.4. The instruments: Surveys among participants

Our instruments are as follows (see Web Appendix 1.2 for details). The survey to managers measured: (1) the overall influence of each of the 12 tools on marketing practice; (2) the overall influence of marketing science on each of the 12 marketing decision areas; and (3) the importance of the 12 marketing decision areas to their company. The survey to intermediaries and academics measured: (1) the overall influence of the 20 marketing science articles on marketing practice; (2) the overall influence of the 12 tools on marketing practice; and (3) the overall influence of marketing science on the 12 marketing decision areas. We also collected respondent background data for each sample.

and practitioners; (7) the stage of their career in which they wrote the article; and (8) the reasons that may have made the article impactful. We summarize our data collection approach in Fig. 2.

3. Results

Moving up the value chain illustrated in Fig. 1, we present the results of our research in four stages: the relative impact of marketing science on different decision making areas (Section 3.1), the impact that different marketing science tools and approaches have had on marketing practice (Section 3.2), the impact of the twenty articles on marketing decisions and tools (Section 3.3), and the antecedents of “dual” (academic and practice) impact from a survey of the authors of 20 top articles (Section 3.4). In Section 3.5, we identify trends since 2004 in the application and use of marketing science.

3.1. Impact of marketing science on marketing decisions

To inventory the impact of marketing science on marketing decision areas, we first present the self-stated importance of each decision area by manager respondents. Next, we present the extent to which our respondents felt that marketing science had impacted each marketing decision area. We end by graphically presenting the alignment between the impact of marketing science on the decision areas and the importance of those areas.

3.1.1. Importance of decision areas

In Table 2, we report the self-stated importance of each of the decision areas to the company, classified by type of firm (B2B, B2C, both B2B and B2C, and total). Overall, pricing management is rated the most important (aggregated across types of firms), while promotion management is rated the least important. However, there are notable differences across B2B and B2C firms. Managers of B2B firms consider pricing management to be the most important decision area, followed by customer/market selection and product portfolio management. Managers of B2C firms consider brand management and new product management to be the most important decision areas.

3.1.2. Impact of marketing science on decision areas

In Table 3, we present the impact of marketing science on specific marketing decision areas, as perceived by academics (A), intermediaries (I), and managers (M). According to managers, marketing science has had the biggest impact on brand management decisions and pricing decisions (mean = 3.77 for both), and new product/service management and customer/market selection (mean = 3.66 for both). Academics feel that marketing science has made the biggest impact on brand management, new product/service management and promotion management. Intermediaries sense that marketing science has made the biggest impact on pricing management, promotion management, and new product/service management. Interestingly, academics believe that marketing science had the biggest impact on promotion management among all decision areas (mean = 3.76), while managers consider that it had the smallest influence among all areas (mean = 3.14). For other areas, such as new product/service management, both seem to agree much more as to the relatively large extent to which marketing science has impacted such decisions (means = 3.70 and 3.66 respectively for academics and managers). Overall, Table 3 shows that while there is consensus between the academic and intermediary groups (ρAI = 0.62) and some moderate level of consensus between the intermediary and manager groups (ρIM = 0.39), there is much disagreement between academics and managers (ρAM = 0.17), pointing to the bridging role of marketing intermediaries.

In Table 3, we also present how managers perceived the impact of marketing science on different decision areas, split by type of firm. As expected, the results indicate some differences by type of firm. While
Additionally, we surveyed the authors of the top 20 dual impact articles to probe: (1) other scholars who influenced the development and execution of the article; (2) academic ideas underlying the article, including the important papers on which the article was built; (3) practitioner influence on the development and execution of the article; (4) the practical ideas underlying the article; (5) whether there was cooperation with practitioners when developing the article; (6) any diffusion efforts the authors undertook to diffuse their work to academics 4 The response rate for the MSI sample was 53% and for the ISBM sample (where the participant request was less personalized), it was 16%. Note that our email solicitation included a URL, which increases the likelihood of the email being classified by spam filters as spam and thus not reaching many members of our sample. As a result, the response rate we report is a lower bound. This comment applies to all three samples (managers, intermediaries, and academics). 5 The inter-rater reliability using a separate sub-sample was 0.90, sufficiently high to indicate that our classification procedure is reliable. 133 J.H. Roberts et al. / Intern. J. of Research in Marketing 31 (2014) 127–140 Stimuli Article selection Tool and Decision selection What is marketing science? Who is a marketing scientist? Marketing Science Editorial Board Screening of 100 most cited 34 Marketing marketing science articles to 20 Intermediaries What are the key areas of marketing decisions? MSI Executive Committee What are the key tools and approaches used? 
Main Questionnaire Manager Survey (N = 94) Intermediary Survey (N = 34) Academic Survey (N = 84) Importance of decision areas Impact of marketing science on marketing decision areas Impact of marketing science on marketing decision areas Impact of marketing science tools Impact of marketing science tools Impact of articles Impact of articles Manager Survey (N = 4) Intermediary Survey (N = 5) Academic Survey (N = 4) Impact of 12 marketing science tools on 12 marketing decision areas Impact of 12 marketing science tools on 12 marketing decision areas Impact of 12 marketing science tools on 12 marketing decision areas Impact of marketing science on marketing decision areas Impact of marketing science tools Transition Matrices Impact of 20 Articles on 12 marketing science tools Impact of 20 Articles on 12 marketing decision areas Impact of 20 Articles on 12 marketing science tools Impact of 20 Articles on 12 marketing decision areas Antecedents of impactful papers Survey of authors of 20 marketing science articles with high academic and practice impact • Influence (academic, industry, literature, problem • Industry co-operation • Effort to diffuse findings • Author background (experience) Fig. 2. Overview of the primary data collection approach. B2B managers perceive the biggest impact on pricing management, B2C managers perceive the impact to be largest on customer insight management. However, there is moderate consistency (ρB2B, B2C =0.45). 3.1.3. Alignment between importance of decision areas and impact of marketing science To examine whether the impact of marketing science on decision areas is aligned with the importance of the decision area to managers, we plot the importance against (managerial perceptions of) impact in Fig. 3. Considering the differences in importance as well as perceived impact across managers from different types of firms, we present the B2B and B2C plots separately. 
(We have not included the plots for firms that do both since these largely lie between the two). Table 2 Average importance of decision areas according to managers in different types of firms (ordered per Table 3). Decision areas B2B (N = 59) B2C (N = 10) B2B & B2C (N = 25) Total (N = 94) Brand management Pricing management New product/service management Customer/market selection Product portfolio management Customer insight management Service/product quality management Channel management Relationship management Salesforce management Advertising management Promotion management 3.51 4.03 3.78 3.79 3.79 3.16 3.57 4.60 4.30 4.60 4.20 4.20 4.20 3.80 4.04 4.12 3.80 3.84 3.76 3.80 3.52 3.77 4.09 3.87 3.85 3.83 3.45 3.58 3.24 3.62 3.62 2.69 2.68 4.10 3.60 4.30 3.90 4.00 3.72 3.56 3.60 3.24 3.12 3.46 3.60 3.69 2.97 2.95 Scale: 1: Of no importance. 5: Extremely important. Both plots indicate that, by and large, the impact of marketing science is aligned with the perceived importance of the decision area. The most notable examples of under-performance are sales force management and service/product quality for both groups, relationship management for B2B, and advertising and channel management for B2C. Table 3 Average impact of marketing science on decision areas (ordered by managers' perceptions; numbers represent average impact given awareness). 
Managers Decision areas Academics Intermediaries All B2B Brand management Pricing management New product/service management Customer/market selectionb Product portfolio managementb Customer insight managementb Service/product quality management Channel managementb,c Relationship management Sales force managementa,c Advertising management Promotion managementb,c Average perceived impact 3.75 3.53 3.70 3.56 3.85 3.68 3.77 3.80 4.10 3.54 3.77 3.82 3.80 3.63 3.66 3.68 3.90 3.50 3.24 2.94 3.58 3.26 3.66 3.70 3.60 3.58 3.55 3.55 3.60 3.54 2.95 3.31 3.42 3.29 4.20 3.38 3.37 3.13 3.41 3.36 3.30 3.58 2.72 3.29 3.43 3.22 3.76 3.32 2.71 3.25 2.80 3.47 3.71 3.36 3.40 3.37 3.26 3.15 3.14 3.46 3.40 3.40 3.29 2.93 3.04 3.44 B2C 3.44 3.56 3.44 3.40 3.60 3.66 B2B & B2C 3.38 3.21 3.13 3.54 3.17 3.43 Scale: 1: No influence at all 5: Extremely influential. a Academics-intermediaries significantly different at p b 0.05. b Academics-managers significantly different at p b 0.05. c Intermediaries-managers significantly different at p b 0.05. Significance assessed with the Welch–Satterthwaite t-test. Degree to which Decision Area is Influenced By Marketing Science Degree to which Decision Area is Influenced By Marketing Science 134 J.H. Roberts et al. / Intern. J. of Research in Marketing 31 (2014) 127–140 Table 4 Average impact of marketing science tools on marketing practice, according to academics, intermediaries, and managers (ordered by intermediaries' perceptions, numbers represent average impact given awareness). 
B2B Firms only (N=58) 4.10 3.90 Brand mgmt Customer/Mkt selection New product mgmt 3.50 2.90 2.70 2.50 2.50 Tools/approaches Academics Intermediaries All B2B B2C Segmentation toolsc Survey-based choice modelsa,b,c Aggregate marketing mix modelsa,b,c Pre-test market modelsb,c Marketing metrics New product modelsb Customer life time value modelsb,c Panel-based choice modelsb,c Perceptual mappinga,b Customer satisfaction modela Sales force allocation modelsb Game theory models Average Perceived Impact 4.29 3.71 4.44 4.15 4.02 4.00 4.30 3.96 3.25 3.06 3.50 3.58 3.36 4.06 2.99 2.88 3.40 3.00 3.93 3.54 3.78 3.84 3.94 3.77 3.74 3.63 3.38 3.73 3.37 3.07 3.76 3.99 3.83 3.58 3.53 3.39 2.82 2.73 3.11 2.87 3.19 3.14 3.80 3.04 3.59 3.66 3.33 3.52 3.62 3.23 3.07 3.02 3.25 3.13 2.18 3.65 2.12 3.63 2.41 2.51 2.44 2.19 3.24 3.18 3.46 3.27 Product portfolio mgmt Channel mgmt Relationship mgmt Service/prod quality mgmt 3.30 3.10 Managers Pricing mgmt 3.70 Customer insights mgmt Salesforce mgmt Promotion mgmt Advtg mgmt Degree of Influence = 1.49+ 0.56 Importance of Decision 3.00 3.50 4.50 4.00 Importance of Decision Area B2C Firms only (N=10) Brand mgmt Customer insights mgmt 4.10 3.90 New product mgmt Pricing mgmt 3.70 Promotion mgmt Product portfolio mgmt Customer/Mkt selection Relationship mgmt 3.50 Channel mgmt Salesforce mgmt Advtg mgmt 3.30 Service/prod quality mgmt 2.98 3.72 3.27 3.18 4.30 3.67 3.67 2.70 B2B & B2C 3.71 3.76 3.48 3.00 Scale: 1: No influence at all 5: Extremely influential. a Academics-intermediaries significantly different at p b 0.05. b Academics-managers significantly different at p b 0.05. c Intermediaries-managers significantly different at p b 0.05. Significance assessed with the Welch–Satterthwaite t-test. 3.10 encouraging. But again, we note that our sample is likely biased toward high levels of sophistication. 2.90 2.70 2.50 2.50 Degree of Influence = 1.16 + 0.60 Importance of Decision 3.00 3.50 4.00 4.50 Importance of Decision Area Fig. 3. 
Impact of marketing science versus importance of decision area (both according to managers). 3.2. Impact of marketing tools on marketing practice Having gauged the decisions that are important to the firm and the extent to which marketing science has influenced them, we examine the tools that provide one route by which that influence is felt. In Table 4, we present the average impact of marketing science tools on marketing practice, as perceived by academics, intermediaries and managers. We also provide a split of manager perceptions, according to whether they are in a B2B, B2C, or both B2B and B2C firm. According to managers, the top three marketing science tools and approaches are: (1) marketing segmentation tools (mean = 4.02), (2) marketing metrics (mean = 3.73), and (3) customer satisfaction models (mean = 3.59). While segmentation tools are also the number 1 pick of academics and intermediaries, opinions diverge on the other ones. Survey-based choice models (number 2 among intermediaries, mean = 4.15) and perceptual mapping techniques (number 2 among academics, mean = 3.99) had less of an impact on marketing practice, according to the marketing managers (means = 3.25 and 3.19 respectively for survey-based choice models and perceptual mapping techniques). Other tools that were consistently found to significantly impact practice are pre-test market models (number 3 or 4 in the three groups) and new product models (number 5 or 6 in the three groups). The different samples also consistently agree on the lack of practical impact of game theory models. The agreement between groups as to the impact of different tools is a lot stronger than the agreement we found on the impact of marketing science on the different decision areas: ρAI = 0.80, ρIM = 0.70, and ρAM = 0.73. Managers' average awareness of marketing science tools was close to 90%, which is 3.3. 
Impact of articles on marketing tools and directly on marketing practice We continue to calibrate practice impact up the value chain in Fig. 1 by examining select marketing science articles and the effect that they have had both on marketing science tools and directly on marketing decision making. We first report results from our precalibration of the top 100 marketing science papers according to academic impact among marketing intermediaries, after which we report on the results from the complete survey of the authors of the top 20 marketing science papers with “dual” impact. In Fig. 4, we plot the academic impact of the top 100 marketing science articles in Table 1 (MKSIMPACT) against the awareness-adjusted impact on practice as perceived by the 34 marketing intermediaries from the precalibration (INTIMPACT). Individual points may be identified by reference to Table 1. While there is a significant relationship between academic and practice impact, it is weak (ρ = 0.19). We find it more insightful to divide the graph into four quadrants, through a median split on both dimensions. Articles in the bottom left quadrant of Fig. 4 have not had a major impact on practice (e.g., Gerbing & Anderson, 1988), and are also below the median for these 100 articles on academic impact. (Note that all 100 candidates for inclusion fall in the top 5% of age-adjusted citation in the profession's top four quantitative journals.) The articles on the bottom right are primarily knowledge drivers — that is, articles that have had above-median academic impact (relative to the 100 papers in this pool), but have had below-median practice impact (e.g., Morgan & Hunt, 1994). The articles on the top left quadrant are practice drivers — articles that have had below-median academic impact among the top 100 pool, but have had above-median practice impact (e.g., Aaker & Keller, 1990). 
The top right quadrant consists of articles that have had dual impact, exceptional academic as well as practice impact (e.g., Guadagni & Little, 1983). The selection from the top 100 on academic impact to the top 20 on dual impact represents articles from both the top-left and the top-right quadrants in Fig. 4 (see Web Appendix 2.2 for articles by quadrant).

Fig. 4. Contrast of academic and practice impact of 100 selected articles. [Scatter plot not reproduced. Horizontal axis: Academic Impact Score. Vertical axis: Awareness-adjusted Practice Impact Score. Labeled points: Aaker & Keller (1990), Guadagni & Little (1983), Morgan & Hunt (1994), Gerbing & Anderson (1988).]
Notes: 1. Awareness-adjusted practice impact score is INTIMPACT from Table 1. It is the average impact of the article assuming that the impact = 0 for articles of which respondents are not aware. 2. Academic Impact Score is MKSIMPACT from Table 1, which is the age-adjusted citation score, further adjusted by the probability of the paper being marketing science.

In Table 5, we present the results of asking our sample of intermediaries (N = 34) and academics (N = 84) to evaluate the practice impact of each of the 20 dual-impact articles we identified earlier. In this table, we present the impact score given awareness for each article⁶ as well as awareness-adjusted practice impact.

Although we need to be careful in drawing very strong conclusions (given quite large standard deviations), Guadagni and Little (1983) and Green and Srinivasan (1990) show the highest impact on practice, both as perceived by academics (mean = 4.28 and 4.17 respectively) and intermediaries (mean = 4.17 and 3.97 respectively). Overall, the ranking across the two samples is quite consistent (ρAI = 0.63). Notable exceptions include Louviere and Woodworth (1983), Van Heerde, Gupta, and Wittink (2003), and Simonson and Tversky (1992), to all of which intermediaries attribute a significantly higher impact on practice than academics do, while only Fornell (1992) shows the opposite. Finally, there is a correlation of 0.65 between the practice impact of these 20 articles gauged from the pre-calibration sample of intermediaries and the calibration sample of intermediaries. (Respondents in the precalibration and calibration samples responded to different tasks, precluding any aggregation of data across samples).

Table 3 describes the impact that marketing science has had on different marketing decisions, and Tables 4 and 5 show the influence of different tools and articles, respectively. We also solicited the more detailed transition matrices of individual articles' impact on individual tools and decisions, and individual tools on individual decisions, from a sub-sample of our respondents. We include and discuss these transition matrices in the marketing science value chain in Web Appendix 2.1. Additionally, many respondents provided open ended comments (included as Web Appendix 2.2). Perhaps the most interesting aspect of those is the variety of “mental maps” with which managers, intermediaries and academics think about marketing science applications.

⁶ As before, although we also report conditional impact (impact given awareness), our awareness adjusted impact assumes that for an article to have impact a respondent must have awareness of it when prompted.

3.4. Antecedents of practice impact among dual impact marketing science articles
As described earlier in our methodology section, we surveyed the authors of the twenty dual-impact articles, shown in Table 5, to learn from their experiences that go beyond the obvious, or possibly deviate from some norms in our field. Participation in our survey of these author teams was 100% (by article). 17 out of the 20 papers had multiple authors. Of those 17, multiple authors in 9 cases responded to our survey. Unsurprisingly, many expected themes emerged from these responses; themes that have been previously identified in the academic and practitioner literature. They include advice from authors to look for gaps in the literature, to ensure a strong grounding in prior theory, to find interesting, unsolved problems that are important to managers, and to fuel the diffusion process, not relying on good ideas to automatically be adopted. Below we focus on the three most interesting new themes that emerged. In addition, Guadagni and Little (2008) share their recollection in a Marketing Science commentary, which they based on our survey to them. 3.4.1. Symbiosis with consulting Many of the authors referred to the symbiosis of their research with consulting as a fertile ground for dual impact papers. Rick Staelin describing Boulding, Kalra, Staelin, and Zeithaml (1993) stated “This paper started with a “consulting” project for the School [Fuqua School of Business, Duke University] trying to improve the service quality of our teaching/delivery system.” Jordan Louviere speaking of Louviere and Woodworth (1983) said “[The problem] came from a consulting project in Australia. I was asked by the Bureau of Transport Economics to help them forecast demand for Qantas flights on transpacific routes.” Many of the authors also (co-)founded professional services companies to commercialize their work. For example, Roland Rust mentioned forming a company to commercialize the approach of Rust, Zahorik, and Keiningham (1995). 
John Little attributes his logit model's practical success largely to the commercialized products based on it. Louviere worked with DRC to commercialize the method he had developed. MDS started selling Hauser and Shugan's (1983) Defender model. Hauser joined Bob Klein in founding Applied Marketing Science, Inc. to commercialize the “voice of the customer” methodology (Griffin & Hauser, 1993).

3.4.2. Going against the grain at the right time

A common topic in many responses was that they went against the grain at the right point in time. Times were either ripe for the radical innovation the authors introduced or the authors rode on a new technology wave that came to transform industry. About the former, Roland Rust nicely phrases it as follows: “We went against the grain, which meant that acceptance of our ideas ensured minds were changed.” Peter Guadagni and John Little attribute part of the success of Guadagni and Little (1983) more to the latter, an impeccable sense of timing: “Much of the impact was due to its early use of data from UPC scanners.”

This does not mean that dual impact author teams were not also firmly grounded in basic theory, despite going against the grain. For example, Peter Guadagni and John Little say: “Consumers make choices to maximize utility. This came from basic economic theory.” In the same vein, John Hauser on Hauser and Shugan (1983) mentions: “There was the Brandaid model by John Little in which he used a multiplicative form for the effects of advertising and distribution. Coupled with Lancaster's model, this gave us an empirically-relevant, but analytically tractable model with which to study the problem.” Indeed, it is of interest that the 20 top papers by practice impact in Table 1 contained an average of 12 equations and 54 references (compared to 5 and 37 respectively for articles ranked 21–100, p < 0.05).

3.4.3.
Working with experience

A long track record of some of the authors and influencers seems to be an essential component of dual impact teams. All author teams have at least one scholar with an academic career of over 15 years before coauthoring the paper (with the exception of the Keller, 1993 article). The most senior author in 14 of the top 20 papers by practice impact in Table 1 held a named chair, in contrast to 23 out of the remaining 80 high academic impact articles (p < 0.01).

Table 5
Average impact of marketing science articles on marketing practice (ranked by intermediaries' perceptions of impact). Intermediaries (I): N = 34. Academics (A): N = 84.

| Article | I: Awareness (%) | I: Impact (avg. given aware) | I: Std. error | I: Rank (impact) | I: Awareness-adjusted impact | A: Awareness (%) | A: Impact (avg. given aware) | A: Std. error | A: Rank (impact) | A: Awareness-adjusted impact | Difference test in A–I impact |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Guadagni and Little (1983) | 85 | 4.17 | 0.19 | 1 | 3.56 | 98 | 4.28 | 0.11 | 1 | 4.18 | 0.50 |
| Green and Srinivasan (1990) | 85 | 3.97 | 0.18 | 2 | 3.38 | 96 | 4.17 | 0.10 | 2 | 4.02 | 1.02 |
| Louviere and Woodworth (1983) | 76 | 3.92 | 0.21 | 3 | 3.00 | 81 | 2.76 | 0.14 | 15 | 2.24 | −4.60ᵃ |
| Griffin and Hauser (1993) | 74 | 3.64 | 0.22 | 4 | 2.68 | 94 | 3.32 | 0.12 | 6 | 3.12 | −1.32 |
| Keller (1993) | 85 | 3.48 | 0.20 | 5 | 2.97 | 94 | 3.78 | 0.11 | 4 | 3.56 | 1.35 |
| Cattin and Wittink (1982) | 85 | 3.41 | 0.20 | 6 | 2.91 | 92 | 3.23 | 0.13 | 9 | 2.96 | −0.75 |
| Parasuraman et al. (1985) | 65 | 3.41 | 0.25 | 7 | 2.21 | 94 | 3.87 | 0.11 | 3 | 3.64 | 1.69 |
| Mahajan et al. (1990) | 91 | 3.35 | 0.20 | 8 | 3.06 | 98 | 3.13 | 0.13 | 11 | 3.06 | −0.93 |
| Fornell et al. (1996) | 76 | 3.27 | 0.20 | 9 | 2.50 | 90 | 3.63 | 0.12 | 5 | 3.29 | 1.59 |
| Aaker and Keller (1990) | 79 | 2.96 | 0.20 | 10 | 2.35 | 92 | 3.30 | 0.12 | 7 | 3.02 | 1.43 |
| Van Heerde et al. (2003) | 74 | 2.96 | 0.19 | 11 | 2.18 | 89 | 2.43 | 0.11 | 18 | 2.17 | −2.46ᵇ |
| Hauser and Shugan (1983) | 74 | 2.92 | 0.24 | 12 | 2.15 | 93 | 3.03 | 0.12 | 14 | 2.81 | 0.40 |
| Simonson and Tversky (1992) | 71 | 2.88 | 0.22 | 13 | 2.03 | 87 | 2.27 | 0.13 | 19 | 1.98 | −2.38ᵇ |
| Rust et al. (1995) | 71 | 2.83 | 0.19 | 14 | 2.00 | 92 | 3.12 | 0.12 | 12 | 2.86 | 1.28 |
| Anderson et al. (1994) | 59 | 2.75 | 0.22 | 15 | 1.62 | 92 | 3.10 | 0.11 | 13 | 2.85 | 1.46 |
| Boulding et al. (1993) | 68 | 2.74 | 0.16 | 16 | 1.85 | 82 | 2.74 | 0.14 | 16 | 2.25 | 0.00 |
| Punj and Stewart (1983) | 65 | 2.73 | 0.24 | 17 | 1.76 | 79 | 2.73 | 0.14 | 17 | 2.14 | 0.00 |
| Day (1994) | 65 | 2.68 | 0.27 | 18 | 1.74 | 86 | 3.19 | 0.12 | 10 | 2.74 | 1.71 |
| Fornell (1992) | 62 | 2.48 | 0.21 | 19 | 1.53 | 90 | 3.29 | 0.12 | 8 | 2.98 | 3.30ᵃ |
| Hunt and Morgan (1995) | 47 | 2.44 | 0.27 | 20 | 1.15 | 73 | 2.02 | 0.13 | 20 | 1.46 | −1.39 |
| Average across articles | 73 | 3.15 | | | | 90 | 3.17 | | | | |

Notes:
1. Scale: 1: No influence at all to 5: Extremely influential.
2. ᵃ p < 0.01, ᵇ p < 0.05, using the Welch–Satterthwaite t-test to test for differences in impact given awareness across academic and intermediary samples.
3. Awareness-adjusted impact is equal to awareness proportion multiplied by impact given awareness. Awareness-adjusted impact assumes that the impact of an article is 0 if the respondent is not aware of the article. Correlation between the two measures is 0.94 for intermediaries and 0.99 for academics.
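The two derived columns of Table 5 can be checked against the reported figures. The sketch below is a minimal illustration, not the authors' code: it applies Note 3 (awareness proportion multiplied by impact given awareness) and computes a Welch-type t statistic from the two reported means and standard errors. The function names are hypothetical, the degrees-of-freedom step of the Welch–Satterthwaite test is omitted, and small gaps against the printed values are to be expected because the table entries are rounded.

```python
import math


def awareness_adjusted(awareness_pct, impact_given_aware):
    """Note 3 to Table 5: awareness proportion times impact given awareness."""
    return awareness_pct / 100.0 * impact_given_aware


def welch_t(mean_a, se_a, mean_i, se_i):
    """t statistic for the A minus I difference, using unpooled standard errors."""
    return (mean_a - mean_i) / math.sqrt(se_a ** 2 + se_i ** 2)


# Values copied from Table 5: (awareness %, impact given awareness, std. error).
guadagni_little = {"I": (85, 4.17, 0.19), "A": (98, 4.28, 0.11)}
louviere_woodworth = {"I": (76, 3.92, 0.21), "A": (81, 2.76, 0.14)}

# Awareness-adjusted impact for Guadagni and Little (1983), both samples.
adj_i = awareness_adjusted(guadagni_little["I"][0], guadagni_little["I"][1])
adj_a = awareness_adjusted(guadagni_little["A"][0], guadagni_little["A"][1])

# Difference tests (academics minus intermediaries) for two articles.
t_gl = welch_t(guadagni_little["A"][1], guadagni_little["A"][2],
               guadagni_little["I"][1], guadagni_little["I"][2])
t_lw = welch_t(louviere_woodworth["A"][1], louviere_woodworth["A"][2],
               louviere_woodworth["I"][1], louviere_woodworth["I"][2])

print(f"adjusted impact I={adj_i:.2f}, A={adj_a:.2f}")
print(f"t (Guadagni and Little)={t_gl:.2f}, t (Louviere and Woodworth)={t_lw:.2f}")
```

With these rounded inputs, the t statistics agree with the printed difference tests for the two articles (0.50 and −4.60) to two decimals, and the awareness-adjusted values land within about 0.02 of the printed 3.56 and 4.18.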
It appears that significant academic experience is close to a prerequisite to writing an article that has large dual impact. In addition, industry experience may help. Authors who responded to our survey also had an average of 6.75 years of experience in industry.

Authors frequently mentioned close liaison with industry. Eight out of 20 teams worked with practitioners on developing at least part of their ideas. Many other sources are mentioned on the practitioner side, both at intermediaries and marketing companies. The top source of inspiration is the Marketing Science Institute (mentioned by 5 author teams out of 20). As individual practitioners, these authors mention people such as Bob Klein, Steve Gaskin, Richard M. Johnson, and Steve Cohen (3 or more mentions).

Academic colleagues with an influence are mainly scholars' coauthors, colleagues from the same department, or scholars on whose work authors built. Within the marketing profession, Glen Urban and Al Silk received three or more mentions. John Hauser notes on Hauser and Shugan (1983): “There were many influences. Chief was the Assessor model by Silk and Urban, which was a pre-test market model to predict the shares of new products. However, for every innovator, there were many defenders. We wanted to know what was the best defensive strategy.” Authors also cite inspiration from well-known scholars outside their own field. Scholars mentioned in that category are Doug Carroll, Dan McFadden, Albert Hirschman, Herman Wold, and Frank Andrews (2 or more mentions).

3.5. Trends since 2004

It is useful to examine changes in the environment in the past nine years and to use our findings to consider likely trends in the impact of marketing science. To do that, we return to the marketing science value chain and examine separately changes to the decisions managers make, the tools that they use, and the articles that have driven the development of those tools.

3.5.1.
Trends in management decisions Clearly, a number of environmental changes have affected the way in which managers need to relate to their marketplaces. These include a greater availability of addressable data (i.e. big data) and the rise of digital and mobile communications, both in terms of access to markets and communications between consumers (such as social networks). To formalize our examination of these trends, we assessed the changing content of marketing management textbooks. We examined marketing management texts rather than cutting edge methodology books because, at this stage of the marketing science value chain, it is the overall managerial decision making environment we wish to study. An examination of sales lists at amazon.com shows that Kotler/Kotler and Keller's Marketing Management (in its various guises) dominates this market. For example, on February 23, 2013 “A Framework for Marketing Management” (5th edition) was 6632 on the best seller list with the closest non-Kotler competitor coming in at 56,620. Therefore, we looked at the evolution of this text over time: before the beginning of our study (1980), four years into our study (1988), at the end of our study (2003), and most recently (2012). The results are included as Web Appendix 3.1. We note the rising importance of branding, customer management and integrated marketing over this time. Because textbooks may be backward looking, we also examined trends in the Marketing Science Institute's Research Priorities which are, themselves, derived from surveys among academics and their members, who are all senior managers (Web Appendix 3.2). As expected, we see more recent topics in this list such as understanding mobile marketing opportunities, the role of social networks, and the harnessing of “big data.” The survey of our authors would suggest that these environmental shifts in possibility and priority bring with them the opportunity to go against the grain at the right time. 
An obvious analogy is John Little's view that his adoption of logit modeling was a direct result of the availability of vast quantities of panel scanner data which enabled a new, less aggregate way of modeling response to changes in the marketing mix.

3.5.2. Trends in tools available

Clearly, many changes have occurred in the statistical tools available to the industry marketing analyst (and marketing intermediary) since 2004. Kluwer's Series in Quantitative Marketing, edited by Josh Eliashberg, provides an excellent resource describing advances in many of the tools available. Many of these are driven by the availability of vast amounts of customer data and with them, the rise of data mining (see Humby, Hunt, & Phillips, 2008 for an example). Much of this work is being conducted by information systems groups rather than marketers. As well as models that account for observed heterogeneity, models that account for unobserved heterogeneity are also gaining traction. Lilien (2011) speaks to the relative success of models that may be implemented by automatic algorithm, rather than as a managerial decision aid, which is an interesting distinction.

To gain a more systematic view of trends in the tools being used in industry, we examined the programs of the American Marketing Association's Advanced Research Techniques (ART) Forum from 2002 to 2013. The ART Forum is an annual meeting of academics, intermediaries, and practicing managers which discusses new and emerging marketing science techniques, as well as conducting tutorials in newly established ones. A summary of these programs is included as Web Appendix 3.3. First, we observe that a number of the 12 types of tool we identified continue to be important over the following nine years (including discrete choice conjoint analysis, customer lifetime value models, and segmentation techniques).
Second, we notice the introduction of new sets of tools, of which the most important are social media and network analysis methods from 2010 to 2013, including viral models, recommendation systems, and user generated content. Also of growing importance are text mining methods (2012) and agent-based modeling (2008 and 2012). Finally, many of the tools that we have described have undergone substantial development and enhancement. Primary among those are the areas of survey based and panel based choice models. The Bayesian treatment of heterogeneity (from 2002 onwards), introduction of new measurement bases such as MaxDiff, and data augmentation techniques stand out. In a rare study of the prevalence of marketing science tool usage, Orme (2013) notes fourteen major trends over the past ten years in the use of Sawtooth software (probably the leader in conjoint/choice analysis software). Primary among those are the mainstreaming of Hierarchical Bayes, the decline of ratings based conjoint, the emergence of MaxDiff scaling, and new applications/methods such as menu based choice, optimization, and adaptive designs.

3.5.3. Trends in marketing science articles

We undertook an examination of the papers published in IJRM, JM, JMR, MGS, and MKS for the period 2004 to 2010. We included IJRM given the more recent time period of study and its recognized importance as a top academic journal (Pieters, Baumgartner, Vermunt, & Bijmolt, 1999). We estimated CITERESID (see Section 2.2.3) for each journal separately (because we are looking for recent trends, which may surface in one journal specifically). In this model, we used the number of quarters to December, 2010 as a measure of age of the article. We then ranked articles by CITERESID within each journal and report the top 10 per journal in Table 6.
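As a reading aid, the age adjustment behind CITERESID can be sketched as below. The sketch is assembled only from the description given in this paper (a journal-specific negative binomial model relating citations to linear and squared article age). The symbols, the log link, and the use of a raw rather than a standardized residual are assumptions made for illustration, not the specification of Section 2.2.3.

```latex
% Illustrative sketch only; notation and residual type are assumptions.
% c_{ij}: citations received by article i in journal j
% a_{ij}: age of article i in quarters (here, quarters to December 2010)
% \alpha_j: journal-specific overdispersion parameter
\begin{align}
  c_{ij} &\sim \mathrm{NegBin}\left(\mu_{ij}, \alpha_j\right), \\
  \log \mu_{ij} &= \beta_{0j} + \beta_{1j}\, a_{ij} + \beta_{2j}\, a_{ij}^{2}, \\
  \mathrm{CITERESID}_{ij} &= c_{ij} - \hat{\mu}_{ij}.
\end{align}
```

Under this reading, a positive CITERESID means that an article has attracted more citations than other articles of the same age in the same journal would be expected to attract.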
Note that we validated that the inclusion of IJRM was appropriate by estimating CITERESID also on the full sample of all articles jointly and found IJRM had 2 representatives in the top 50 (3 in top 100), marking the gradual maturation of IJRM as the youngest member of top journals in marketing. A content-analysis of the 50 papers in Table 6 indicates that the topics of research that have been cited the most are word of mouth and social networks and relationship marketing/management.

In the absence of a formal survey of the impact of marketing science articles since 2004, one way to gain some feel for those that have affected the tools that intermediaries (academics and managers) use to address marketing decisions is to look at those articles that have been mentioned in patents. Because such citations are likely to indicate an article providing the foundation of new tools, we undertook a search using Google Patents for mentions of articles in our target journals in patents issued by the US Patent and Trademark Office (USPTO). To allow comparability with our sample period of 1983 to 2003, we also looked historically at that period as well. The results are included as Web Appendix 3.4. Marketing papers from the five target journals received a total of 1317 citations from patents issued by the USPTO. The first paper to receive a patent citation was published in the Journal of Marketing in 1940. The data indicate a significantly increasing trend of marketing papers being cited in patents. Almost half of the citations (625) to historical marketing papers published in the five target journals have come from patents issued since 2004. Marketing papers published since 2004 have attracted 39 of those 625 citations. The 39 patent citations were obtained by a total of 27 papers published since 2004 in IJRM (2 papers), JM (2), JMR (5), MGS (5), and MKS (13).
Papers on the following topics received more than one citation: pricing and promotions (10), movies (4), online behavior models (4), retail assortment models (3), customer lifetime value models (2), conjoint (2), forecasting (2), innovation (2), and social networks (2).

One interesting trend is the level of engagement of marketing intermediaries and managers in the knowledge generation process. In 1983 (the beginning of our sample period), approximately half of the participants at the ISMS Marketing Science Conference held at the University of Southern California came from industry. By 2012, only 37 out of 930 attendees (4%) were from industry. However, general conferences have been replaced by specialized conferences such as the biennial ISMS Practice Conference. Similarly, the Gary Lilien ISMS-MSI Practice Prize has maintained industry connections with our top journals in terms of authors. The proportion of industry authors of Marketing Science articles fell from 7% in the period 1983 to 2003 to 5% between 2004 and 2012. However, 35 of these 68 industry authors from 2004 to 2012 were a part of Practice Prize Finalist papers, showing the important role special events can have in stemming the disconnect between academic researchers in marketing and those who have to use their research.

3.5.4. Other marketing science trends

A number of other trends emerged in the development and application of marketing science over the past nine years. First, it has become more international at all levels of the value chain. In terms of managerial decision making, globalization has become a major driver of change. In terms of tools, at the American Marketing Association Advanced Research Techniques Forum, the ratio of North American academic presenters to those from other continents went from 15/1 in 2003 to 22/6 in 2008 and 19/6 in 2013. At the other end of the value chain, the number of authors publishing from outside North America in the top marketing journals is increasing.
Looking at the authorship profile of the top 100 articles (by age-adjusted citation impact) published in the five top journals from 2004 to 2010, we find that 22% of the authors of papers from 2004 to 2007 were from non-US locations, while this number increased from 11% in 2004 to 33% in 2010. (See Stremersch & Verhoef, 2005 for evidence of globalization of authorship on the same sample of journals, but including all articles between 1964 and 2002, not merely the top cited articles). Also special fora that aim to bridge the gap between academics and practitioners can enable globalization. 11 of the 25 finalists of the Lilien ISMS-MSI Practice Prize Competition since its inception have come from outside North America (seven from Europe, three from Australia, and one from the Asia Pacific region). Entries from Europe have won the prize four out of the seven times.

Table 6
Top 10 Articles from 2004 to 2012, listed by journal in order of age-adjusted citations.

Articles by journal | Total citations | Age-adjusted impact | Topic

International Journal of Research in Marketing
Reinartz, Haenlein & Henseler (2009) | 40 | 7.31 | Research methodology/SEM
Peres, Muller & Mahajan (2010) | 26 | 5.70 | Diffusion/innovation
Dholakia, Bagozzi & Pearo (2004) | 169 | 5.53 | Social networks
Burgess & Steenkamp (2006) | 71 | 3.91 | Emerging markets
Bagozzi & Dholakia (2006) | 83 | 3.61 | Social networks
Street, Burgess & Louviere (2005) | 83 | 3.31 | Research methodology/choice
Goldenberg, Libai & Muller (2010) | 18 | 3.19 | Network externalities
Du, Bhattacharya & Sen (2007) | 47 | 3.15 | Corporate Social Responsibility (CSR)
Verhoef, Neslin & Vroomen (2007) | 51 | 3.14 | Multichannel shoppers
De Bruyn & Lilien (2008) | 30 | 2.80 | Word of Mouth (WOM)

Journal of Marketing
Vargo & Lusch (2004) | 1029 | 10.47 | Marketing theory
Schau, Muniz & Arnould (2009) | 82 | 5.37 | Customer communities
Palmatier, Dant, Grewal & Evans (2006) | 215 | 4.61 | Relationship Mktg & Mgmt
Trusov, Bucklin & Pauwels (2009) | 72 | 4.59 | WOM/networks
Kozinets, de Valck, Wojnicki & Wilner (2010) | 40 | 2.82 | WOM/networks
Luo & Bhattacharya (2006) | 146 | 2.78 | CSR
Tuli, Kohli & Bharadwaj (2007) | 110 | 2.73 | Mass customization
Brakus, Schmitt & Zarantonello (2009) | 55 | 2.70 | Brand
Palmatier, Dant & Grewal (2007) | 94 | 2.49 | Relationship Mktg & Mgmt
Rust, Lemon & Zeithaml (2004) | 310 | 2.39 | Customer equity

Journal of Marketing Research
Chevalier & Mayzlin (2006) | 284 | 10.05 | WOM
Bergkvist & Rossiter (2007) | 200 | 8.14 | Research methodology/survey research
Gupta, Lehmann & Stuart (2004) | 187 | 6.03 | Customer equity
Mazar, Amir & Ariely (2008) | 91 | 6.02 | Behavioral theory
Reinartz, Krafft & Hoyer (2004) | 190 | 5.78 | Relationship Mktg & Mgmt
Srinivasan & Hanssens (2009) | 64 | 5.24 | Metrics and firm value
Rindfleisch, Malter, Ganesan & Moorman (2008) | 81 | 4.13 | Research methodology/survey research
Trusov, Bodapati & Bucklin (2010) | 21 | 2.95 | Social networks
Nair, Manchanda & Bhatia (2010) | 18 | 2.77 | Social networks
Petrin & Train (2010) | 25 | 2.52 | Research methodology/choice

Marketing Science
Fiebig, Keane, Louviere & Wasi (2010) | 46 | 8.13 | Research methodology/choice
Godes & Mayzlin (2009) | 54 | 7.36 | WOM
Keller & Lehmann (2006) | 128 | 7.18 | Brand
Hauser, Tellis & Griffin (2006) | 122 | 6.80 | Diffusion/innovation
Godes & Mayzlin (2004) | 224 | 5.81 | WOM/networks
Gupta & Zeithaml (2006) | 90 | 4.76 | Metrics and firm value
Rust & Chung (2006) | 79 | 4.06 | Relationship Mktg & Mgmt
Zhang (2010) | 23 | 3.39 | Learning
Van den Bulte & Joshi (2007) | 52 | 2.97 | Social networks/innovation
Eliashberg, Elberse & Leenders (2006) | 61 | 2.91 | Movies

Management Science
Ghose & Yang (2009) | 41 | 6.55 | Search
Cachon & Swinney (2009) | 37 | 3.92 | Pricing
Chen & Xie (2008) | 54 | 3.36 | WOM/social networks
Su (2007) | 75 | 3.26 | Pricing
Franke, Schreier & Kaiser (2010) | 19 | 3.04 | Mass customization
Rahmandad & Sterman (2008) | 44 | 2.83 | Diffusion/innovation
Atasu, Sarvary & Van Wassenhove (2008) | 32 | 2.41 | Remanufacturing
Fleder & Hosanagar (2009) | 23 | 2.33 | Recommender systems
Forman, Ghose & Goldfarb (2009) | 27 | 2.27 | Online marketing
Grewal, Lilien & Mallapragada (2006) | 72 | 2.04 | Social networks

Note: Age-adjusted impact is estimated as the residual from a journal-specific negative binomial model relating number of citations to the age of the article (as measured by the number of quarters to December, 2012). The model includes linear and squared age terms to capture the non-linear time trend of citations.

4. Discussion

4.1. Summary

We have calibrated the relative impact of marketing science research on practice, using our marketing science value chain as a central framework. It is reassuring to see that the impact of marketing science on marketing decisions has been largely felt in areas that are of the greatest importance to the firm (see Fig. 3). Moreover, the managers in our sample are aware of the marketing science tools available to them, and there is a correlation between managers, academics, and intermediaries on the perception of the impact of those tools. Marketing science articles that have influenced practice come in a wide range of flavors. Some articles do not include empirical work (e.g., Hauser and Shugan's Defender model), while others use only laboratory data (e.g., Aaker and Keller's brand extension work).
The survey among authors of top dual impact articles provides excellent pointers as to what it takes to write a top-journal article that achieves high academic and practice impact: symbiosis with consulting, going against the grain at the right time, and working with experience. Examining more recent developments in our field since 2004, we were able to document the rise of digitization, mobile communications, and social networking, as well as further globalization of academia and the important role of special fora. We now discuss implications of our research for academia and practice, limitations of our research, and ideas for future research in this area.

4.2. Implications for academia

Many marketing science academics may not see impacting practice as their primary goal, letting the practice impact occur as a by-product at best. A goal of practical impact might even be seen as counterproductive from the perspective of academic impact, distracting researchers from their primary mission and potentially compromising the rigor and integrity with which a problem is studied. Our study points to several counterarguments as to why the two goals may not necessarily be in conflict. First, practical problems may provide inspiration for new breakthroughs as old tools are found inappropriate to solve them (e.g., Louviere & Woodworth, 1983). Second, practical problems lure academics away from the ivory tower, in which they may be held captive by dominant paradigms. Scholars who seek high practical impact may want to focus their research on decisions that are of greater importance to firms. In Table 2, we identified such areas to be pricing management, new product management, customer and market selection, and product portfolio management.
While scholars may very well choose their research area using other inputs as well, we are able to offer scholars general advice on the challenging road to practical impact, from surveying top 20 dual impact authors. Research in symbiosis with consulting may prove to be a fertile ground for dual impact papers. The right timing in tackling the problem and the willingness to go against the grain seem crucial as well. Too early and radical a new idea may not find acceptance yet, too late and a colleague may beat the researcher to the punch. That dual impact papers require a strong grounding both in marketing science and practice, may explain why we find a disproportionate number of highly experienced scholars in our 20 top dual impact papers.

4.3. Implications for practice

Research in marketing science has relevance to many marketing decisions. At least that is what we find from the practitioners we surveyed. Even though our samples may be biased towards the sophisticated end of practice, our results are encouraging. Intermediaries consider segmentation tools and survey-based choice models to be most influential relative to other tools. Intermediaries find individual articles, such as Guadagni and Little (1983), Green and Srinivasan (1990), and Louviere and Woodworth (1983) to be very influential on practice. Our paper provides a good primer on marketing science for marketing practitioners. It reviews an impressive body of top marketing science articles with dual impact. Therefore, it provides a guide to marketing science research for (i) marketing practitioners with an interest in discovering new areas or (ii) young market research professionals. This paper can help them discover for which decisions or tools it is useful to turn to marketing science research, as well as which specific articles provide potentially useful insights and tools to which they should be exposed.

4.4. Limitations and future research

In undertaking any research with as many dimensions as in our study, researchers must make a number of choices and assumptions. Our primary motivation in designing our research was to have a methodology that was objective and verifiable. To do so, we set up criteria upon which to design our study, carefully evaluating those criteria and obtaining input from a variety of knowledgeable sources at each stage of the research. Yet, we understand that other scholars may have approached the study differently and/or identified other study design criteria. Some significant limitations of our research include the following:

• Citations as a screening mechanism. We are acutely aware of the irony of starting to measure impact on practice with a list ranked by academic impact (i.e., citations). We tried to minimize this effect by including a pre-calibration stage. At worst, however, we can claim to have gauged the practice impact of the population of highly cited marketing science articles (what we call dual impact).
• Biased sample. The use of MSI and ISBM led to practitioner samples that were likely skewed towards greater sophistication. While this likely skew might improve the reliability of responses (and the response rate), we believe that it could introduce considerable bias. We have attempted to address this by focusing largely on relative rather than absolute effects.
• Alternative knowledge diffusion routes. Textbooks, magazines and newspapers represent important, alternate ways by which new marketing knowledge diffuses. Similarly, organizations such as ACNielsen, Sawtooth, and Advanis are responsible for knowledge generation that may not always begin in journal articles. Because we are not claiming a complete catalog of the sources and transition nodes of marketing science knowledge diffusion, this is less of a problem.
• We focus on success and that brings with it a number of benefits, as well as being easier to observe. However, the lack of a control sample of “failures” means that we cannot discriminate between that which works and that which does not (though we can, to some extent, examine correlates of drivers of the degree of success).

Having taken the first step in an effort to calibrate the effect of marketing science on marketing practice, we find ourselves faced with a number of interesting but unanswered questions. These include the possibility of a more comprehensive mapping and measures built up from marketing practice, rather than down from journal articles. In terms of a more comprehensive mapping, it would be useful to consider other knowledge vehicles (e.g., textbooks, magazines and newspapers), routes (e.g., user knowledge generation and seminars), and participants (e.g., specialist training educators). More representative samples would allow inferences to be drawn about absolute impact rather than just relative impact. Finally, the unit of analysis we used is that of articles published in the period 1982–2003. Had it been scholars or over a longer timeframe, other researchers may have been more strongly represented. The measure of relative rather than absolute impact raises another issue; that of market penetration of marketing science knowledge and tools (e.g., Roberts, 2000). Marketing science tools and the articles on which they are based may be used in a wide variety of marketing decision making situations (i.e., the opportunity set is large). A more appropriate benchmark might perhaps be, “Of all the situations to which these tools could have provided insight, in what per cent are the tools actually being applied?” Our sense is that the number is low. If this is indeed the case, it is presumably hard for us to argue that the marketing science tools currently in the market are in any way “standard” approaches to marketing and the measurement of its effect. We could contrast this penetration to that of approaches taught in other management disciplines, such as accounting and finance, for example. Overall, we hope that we have identified the basis for a continued and richer study of the marketing science value chain.

Acknowledgments

We were inspired and supported by the Practice Prize Committee of the INFORMS Society for Marketing Science, the Marketing Science Institute (MSI) and the Institute for the Study of Business Markets (ISBM) in the execution of this work. Many individuals have also contributed to the ideas contained in the paper, particularly Gary Lilien and Bruce Hardie. We would like to thank the Guest Editor, Area Editor and two anonymous reviewers who provided constructive and insightful feedback to us. John Roberts acknowledges support from the London Business School Centre for Marketing and Stefan Stremersch from the Erasmus Center for Marketing and Innovation. We have conducted many secondary analyses in the context of this project, which we do not report in the interest of brevity. Please contact the first author should you have an interest in obtaining supplementary materials.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.ijresmar.2013.07.006.

References

Aaker, D. A., & Keller, K. L. (1990). Consumer evaluations of brand extensions. Journal of Marketing, 54(1), 27–41. Anderson, E. W., Fornell, C., & Lehmann, D. R. (1994). Customer satisfaction, market share, and profitability: Findings from Sweden. Journal of Marketing, 58(3), 53–66. Boulding, W., Kalra, A., Staelin, R., & Zeithaml, V. A. (1993). A dynamic process model of service quality — From expectations to behavioral intentions. Journal of Marketing Research, 30(1), 7–27. Bucklin, R. E., & Gupta, S. (1999). Commercial use of UPC scanner data: Industry and academic perspectives.
Marketing Science, 18(3), 247–273. Cattin, P., & Wittink, D. R. (1982). Commercial use of conjoint-analysis — A survey. Journal of Marketing, 46(3), 44–53. Day, G. S. (1994). The capabilities of market-driven organizations. Journal of Marketing, 58(4), 37–52. Fornell, C. (1992). A national customer satisfaction barometer — The Swedish experience. Journal of Marketing, 56(1), 6–21. Fornell, C., Johnson, M.D., Anderson, E. W., Cha, J. S., & Bryant, B. E. (1996). The American customer satisfaction index: Nature, purpose, and findings. Journal of Marketing, 60(4), 7–18. Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of Marketing Research, 25(2), 186–192. Germann, F., Lilien, G. L., & Rangaswamy, A. (2013). Performance implications of deploying marketing analytics. International Journal of Research in Marketing, 2, 114–128. Geyskens, I., Steenkamp, J. E. B.M., & Kumar, N. (1998). Generalizations about trust in marketing channel relationships using meta-analysis. International Journal of Research in Marketing, 15(3), 223–248. Green, P. E., & Srinivasan, V. (1990). Conjoint-analysis in marketing — New developments with implications for research. Journal of Marketing, 54(4), 3–19. Griffin, A., & Hauser, J. R. (1993). The voice of the customer. Marketing Science, 12(1), 1–27. Guadagni, P., & Little, J.D. C. (1983). A logit model of brand choice calibrated on scanner data. Marketing Science, 2(3), 203–238. Guadagni, P., & Little, J.D. C. (2008). A logit model of brand choice calibrated on scanner data: A 25th anniversary perspective. Marketing Science, 27(1), 26–28. Hauser, J. R., & Shugan, S. (1983). Defensive marketing strategies. Marketing Science, 2(4), 319–360. Humby, C., Hunt, T., & Phillips, T. (2008). Scoring points: How Tesco continues to win customer loyalty. London, UK: Kogan Page. Hunt, S. D., & Morgan, R. M. (1995). The comparative advantage theory of competition. 
Journal of Marketing, 59(2), 1–15. Jaworski, B. J., & Kohli, A. K. (1993). Market orientation — Antecedents and consequences. Journal of Marketing, 57(3), 53–70. Keller, K. L. (1993). Conceptualizing, measuring, and managing customer-based brand equity. Journal of Marketing, 57(1), 1–22. Kotler, P., & Keller, K. L. (2012). Marketing management 14th edition. Upper Saddle River, NJ: Prentice Hall. Leeflang, P.S. H., & Wittink, D. R. (2000). Building models for marketing decisions: Past, present, future. International Journal of Research in Marketing, 17(2–3), 105–126. Lehmann, D. R., McAlister, L., & Staelin, R. (2011). Sophistication in research in marketing. Journal of Marketing, 75(July), 155–165. Lilien, G. L. (2011). Bridging the academic-practitioner divide in marketing decision models. Journal of Marketing, 75(2), 196–210. Lilien, G. L., Kotler, P., & Moorthy, K. S. (1992). Marketing models. Englewood Cliffs: Prentice Hall. Lilien, G. L., Rangaswamy, A., & De Bruyn, A. (2007). Principles of marketing engineering. Trafford. Lilien, G. L., Roberts, J. H., & Shankar, V. (2013). Effective marketing science applications: Insights from the ISMS practice prize finalist papers and projects. Marketing Science, 32(2), 229–245. Louviere, J. J., & Woodworth, G. (1983). Design and analysis of simulated consumer choice or allocation experiments — An approach based on aggregate data. Journal of Marketing Research, 20(4), 350–367. Mahajan, V., Muller, E., & Bass, F. M. (1990). New product diffusion-models in marketing — A review and directions for research. Journal of Marketing, 54(1), 1–26. Morgan, R. M., & Hunt, S. D. (1994). The commitment–trust theory of relationship marketing. Journal of Marketing, 58(3), 20–38. Orme, B. (2013). Advances and trends in marketing science from the Sawtooth Software perspective. Orem, UT: Sawtooth Software, Inc. Parasuraman, A., Zeithaml, V. A., & Berry, L. L. (1985).
A conceptual-model of service quality and its implications for future-research. Journal of Marketing, 49(4), 41–50. Pieters, R., Baumgartner, H., Vermunt, J., & Bijmolt, T. H. A. (1999). Importance and similarity in the evolving citation network of the International Journal of Research in Marketing. International Journal of Research in Marketing, 16(2), 113–127. Punj, G., & Stewart, D. W. (1983). Cluster-analysis in marketing-research — Review and suggestions for application. Journal of Marketing Research, 20(2), 134–148. Reibstein, D. J., Day, G., & Wind, J. (2009). Guest editorial: Is marketing academia losing its way? Journal of Marketing, 73(June), 1–3. Roberts, J. H. (2000). The intersection of modelling potential and practice. International Journal of Research in Marketing, 13(3), 127–134. Rust, R. T., & Cooil, B. (1994). Reliability measures for qualitative data: Theory and implications. Journal of Marketing Research, 31(1), 1–14. Rust, R. T., Zahorik, A. J., & Keiningham, T. L. (1995). Return on quality (ROQ) — Making service quality financially accountable. Journal of Marketing, 59(2), 58–70. Simonson, I., & Tversky, A. (1992). Choice in context — Trade-off contrast and extremeness aversion. Journal of Marketing Research, 29(3), 281–295. Stremersch, S., & Verhoef, P. C. (2005). Globalization of authorship in the marketing discipline: Does it help or hinder the field? Marketing Science, 24(4), 585–594. Stremersch, S., Verniers, I., & Verhoef, P. C. (2007). The quest for citations: Drivers of article impact. Journal of Marketing, 71(3), 171–193. Vanheerde, H. J., Gupta, S., & Wittink, D. R. (2003). Is 75% of the sales promotion bump due to brand switching? No, only 33% is. Journal of Marketing Research, 40(4), 481–491. Wierenga, B., & van Bruggen, G. H. (2000). Marketing management support systems: Principles, tools and implementation. Boston: Kluwer Academic Publishers. Wittink, D. R., & Cattin, P. (1989). Commercial use of conjoint analysis: An update. 
Journal of Marketing, 53(3), 91–96. Wittink, D. R., Vriens, M., & Burhenne, W. (1994). Commercial use of conjoint analysis in Europe: Results and critical reflections. International Journal of Research in Marketing, 11(1), 41–52.

Intern. J. of Research in Marketing 31 (2014) 147–155

Full Length Article

Probabilistic selling vs. markdown selling: Price discrimination and management of demand uncertainty in retailing☆

Dan Hamilton Rice a,⁎, Scott A. Fay b,1, Jinhong Xie c,2

a Department of Marketing, E.J. Ourso College of Business, Louisiana State University, 2119 Building Education Complex, Baton Rouge, LA 70803-6314, United States
b Department of Marketing, Martin J. Whitman School of Management, Syracuse University, 721 University Avenue, Syracuse, NY 13244, United States
c Department of Marketing, University of Florida, P.O. Box 117155, Gainesville, FL 32611-7155, United States

Article info

Article history: First received in 26 September 2011 and was under review for 12 months. Available online 12 October 2013
Area Editor: Els Gijsbrechts
Keywords: Probabilistic selling; Pricing; Demand uncertainty; Markdowns; Price discrimination

Abstract

Markdown selling (i.e., price reductions over the course of the selling season) is a strategy to implement price discrimination and to manage market uncertainty that has been widely adopted by retailers. This paper explores the potential advantage of introducing an additional tool to the arsenal of retailers, probabilistic selling (i.e., offering consumers a choice to buy a product that can turn out to be any item from a predetermined set of distinct items).
We show that both probabilistic and markdown selling strategies serve as price discrimination tools by offering buyers an option to purchase a “damaged” good (an uncertain product under the former and delayed consumption of a product under the latter). However, the two strategies segment markets based on different types of buyer heterogeneity: buyer preference strength under probabilistic selling and buyer patience under markdown selling. Our analytical model reveals that, compared with markdown selling, probabilistic selling can (1) improve margin management by increasing revenue from full-price sales and reducing the magnitude of discounts; and (2) improve inventory utilization by reducing stockouts and the amount of excess inventory. We identify the conditions required for probabilistic selling to be more profitable than markdown selling. © 2013 Elsevier B.V. All rights reserved. 1. Introduction In an effort to obtain the maximum profit across a diverse set of customers, retailers often offer price reductions over the course of the selling season. It is estimated that one-third of all goods are sold at marked-down prices (Friend & Walker, 2001) and discounts due to markdowns by US retailers amount to $200 B a year (Levy, Grewal, Kopalle, & Hess, 2004). 
Although costly, markdowns can be a valuable tool for improving profit margin management because they allow the retailer to price discriminate across time, i.e., sell the product at a high price early in the season to customers who value the product highly and are unwilling to wait, and at a discounted price later in the season to customers who are willing to delay their purchases (Besbes & Lobel, 2012; Nair, 2007; Su, 2007). The markdown strategy can also enhance inventory management for retailers who are unable to accurately predict consumers' demand for each particular product (Lazear, 1986), e.g., by starting with a high price and reducing the price if units of the item remain unsold.

☆ The paper has benefited from the helpful comments made by the seminar participants at the University of California, The Ohio State University, Rice University, Baylor University, University of Missouri–Kansas City, Syracuse University, Purdue University (Krannert), University of North Carolina–Charlotte, University of Illinois at Urbana–Champaign, the Marketing Science Conference (2009; Ann Arbor, Michigan) and the UTD Marketing Conference (University of Texas–Dallas).
⁎ Corresponding author at: Department of Marketing, Louisiana State University, Room 2100, Business Education Complex, Baton Rouge, LA 70808-6314, United States. Tel.: +1 225 578 8788; fax: +1 225 578 8616.
E-mail addresses: danrice@lsu.edu (D.H. Rice), scfay@syr.edu (S.A. Fay), jinhong.xie@warrington.ufl.edu (J. Xie).
1 Tel.: +1 315 443 3456; fax: +1 315 442 1461.
2 Tel.: +1 352 273 3270; fax: +1 352 846 0457.
0167-8116/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.ijresmar.2013.08.006
Retailers are continually searching for more efficient ways to improve margin management and enhance inventory utilization.3 In this paper, we consider one such alternate selling mechanism, namely probabilistic selling (PS), and show that there are situations in which this mechanism can be advantageous relative to traditional markdowns both in enhancing price discrimination and in overcoming the main problems associated with demand uncertainty, namely stockouts and excess inventory.

3 Previous research has focused on developing sophisticated dynamic markdown algorithms (e.g., Bitran & Mondschein, 1997; Chung, Flynn, & Zhu, 2009; Mantrala & Rao, 2001; Sullivan, 2005), implementing inventory management systems (Friend & Walker, 2001; Khouja, 1995; Ross, 1997), and identifying alternate ways to dispose of distressed goods, such as via off-price retailers and outlet stores (Coughlan & Soberman, 2005; Levy & Weitz, 2004, p. 56; Petruzzi & Monahan, 2003) or online auctions (Wang, Gal-Or, & Chatterjee, 2009; Wood, Alford, Jackson, & Gilley, 2005).

Table 1
Related literature and distinguishing characteristics of current paper.
Research focus: Developing theory and applications of probabilistic (opaque) selling strategy; Developing decision support system to implement probabilistic (opaque) selling strategy.

Research paper | Methodology (a) | Endogenous variable: Price | Endogenous variable: Capacity | Consumers optimally time purchase | Probabilistic good can cannibalize full-price sales
Current paper | AM, LE | Yes | Yes | Yes | Yes
Fay and Xie (2012) | AM | Yes | Yes | No | No
Fay (2008) | AM | Yes | No | No | No
Fay and Xie (2010) | AM | Yes | No | Yes (b) | Yes
Fay and Xie (2008) | AM | Yes | No | No | No
Jiang (2007) | AM | Yes | No | No | No
Shapiro and Zillante (2009) | LE | No | No | No | No
Jerath et al. (2010) | AM | Yes | No | Yes | No
Zouaoui and Rao (2009) | E | Yes | No | No | Yes
Granados et al. (2008) | AM, E | Yes | No | No | Yes
Post (2010) | AM | Yes | No | No | Yes
Anderson (2009) | AM | Yes | No | No | No
Anderson and Xie (2012) | E | Yes | No | No | Yes
Gallego and Phillips (2004) | AM | No | No | No | No
Mang et al. (2012) | E | No | No | Yes | Yes
Petrick, Steinhardt, Gonsch, and Klein (2012) | AM | No | No | No | No

a “AM” = Analytical modeling; “E” = Empirical; “LE” = Lab experiments.
b In Fay and Xie (2010), the two selling periods are the advanced period (prior to consumers learning their true valuations) and the spot period. In the current paper, consumers know their valuations in both periods. Thus, the major difference between the two periods is the time delay rather than differences in the information available to consumers.

A probabilistic product is an offer involving the probability of obtaining any one of a set of multiple distinct items (Fay & Xie, 2008). Probabilistic selling (PS) is a selling strategy under which the seller creates probabilistic goods using the seller's distinct products or services and offers such goods to potential buyers as additional purchase choices. Notable examples of sellers of probabilistic products include priceline.com, lastminutetravel.com, and hotwire.com, websites where consumers can purchase travel services for which specific attributes of the service (e.g., the itinerary of the flight, the location of the hotel, or the identity of the car rental company) are not revealed until after payment. Recently, the idea of offering probabilistic goods has also been adopted by several online retailers (e.g., swimoutlet.com, agonswim.com, speedo.com, and kidsurplus.com) who offer discounted “grab bag” apparel and shoes, where patterns and styles are chosen randomly by the website.4 As technological advances make it much more practical to implement PS both in online and brick-and-mortar shopping environments, more retailers can potentially benefit from adopting this novel selling strategy (Fay & Xie, 2008).
While the existing research on PS has significantly advanced our understanding of the fundamental drivers of PS and illustrates its general applicability, it is important to extend the research to understand how this novel strategy may address some unique problems in the retailing industry and to explore whether PS can be a valuable alternative to offering late-season markdowns. Most retailers strategically invest in inventory prior to the selling season, control the prices of their products over the entire selling season, and must account for how consumers time their purchases in response to these chosen prices. We introduce a model that incorporates each of these key characteristics. As shown in Table 1, among the current research on PS,5 ours is the only model that incorporates all of the following three key characteristics: (1) The seller optimally chooses its prices for the probabilistic goods and the specified goods; (2) the seller optimally adjusts its inventory orders when introducing probabilistic goods; and (3) consumers strategically choose when to purchase in order to maximize their expected surplus. By incorporating these three critical factors, we are able to develop the theory and implications of PS for the retailing industry. In particular, our model enables us to compare discounting on the basis of time (high initial price and a discounted price if the consumer delays her purchase) versus discounting on the basis of product opacity (i.e., setting a high price for each specified good and a discounted price if the consumer will purchase the probabilistic good). Thus, the paper's primary contribution is that it is the first to examine the profit advantage of the PS strategy relative to the more commonly utilized strategy of marking down merchandise over time, i.e., the markdown (MD) selling strategy. We identify factors under which PS can be a more useful tool for retailers as they attempt to price discriminate across consumers. We find that PS and MD can be complementary strategies since, in some market settings, PS is a profitable form of price discrimination whereas MD is not, while, in other market settings, price discrimination is profitable via MD but not profitable via PS.

A second contribution of the paper is that, by introducing a model that allows a probabilistic good to cannibalize full-price sales, we can examine the factors that affect the extent of cannibalization by the probabilistic good and determine whether PS can remain advantageous in its presence. Most extant analytical research on PS utilizes a Hotelling model to account for consumer heterogeneity (Fay, 2008; Fay & Xie, 2008, 2012; Jerath, Netessine, & Veeraraghavan, 2010; Jiang, 2007). A feature of the Hotelling model is that all consumers have the same expected value for the probabilistic good. As a result, price can be set at this common expected value, thus eliminating consumer surplus for all buyers of the probabilistic good. Since the probabilistic good does not generate positive surplus, the seller does not have to worry about any consumers switching from a higher-priced specified good to the lower-priced probabilistic good.

4 See an example at http://www.swimoutlet.com/product_p/1623.htm
5 Several papers (Gallego & Phillips, 2004; Mang et al., 2012; Petrick et al., 2012) consider a seller who does not assign products to buyers of the probabilistic good until a time that is substantially later than the day of purchase. They refer to this business model as flexible selling rather than PS. However, consistent with Fay and Xie (2012), we consider these papers as part of the PS literature since delaying product assignment can be viewed as an alternative way of implementing the PS strategy.
However, cannibalization is a crucial concern under MD in the retailing industry because retailers are apprehensive that a discounted price at the end of the season may entice both low- and high-valuation buyers to delay their purchases (especially if the magnitude of the discount is very large). Thus, to provide an adequate comparison of PS with MD, the model must be capable of capturing the cannibalization effect under both strategies. Note that several empirical studies incorporate the cannibalization effect into their model estimations (Anderson & Xie, 2012; Granados, Gupta, & Kauffman, 2008; Mang, Post, & Spann, 2012; Zouaoui & Rao, 2009). However, since demand is modeled in reduced form in these papers, i.e., cross-price effects exist between the probabilistic good and the specified goods, these studies do not analyze the factors which affect the magnitude of this cannibalization effect or how cannibalization impacts the profitability of PS, as we do here.

The rest of this paper is organized as follows. In the next section, we use a lab experiment to illustrate the potential advantages of the PS and MD strategies relative to a No Discounting strategy. In Section 3, we illustrate how both PS and MD can enable a retailer to price discriminate and then compare the profitability of these two strategies. In Section 4, we extend the analytical model to allow for demand uncertainty and demonstrate that our key results continue to hold, and are even strengthened, in such markets. We conclude the paper by summarizing our results and suggesting areas for future research.

2. Two advantageous strategies: Probabilistic selling and markdowns

2.1. Motivation

For MD and PS to be viable selling strategies, they must be more advantageous than offering no discounts to consumers, i.e., the No Discounting (ND) strategy. We performed an experimental study to explore how consumers respond to these different types of discounts versus a situation in which no discounts are offered. One hundred and thirty-eight undergraduate business students participated in the web-based study in exchange for extra credit, purportedly to help researchers better understand the online purchase behavior of college students. The design was a 2 (Selling Strategy: MD, PS) × 2 (discount: 5% or 25%) between-subjects design with an ND control condition. Participants were randomly assigned to one of the five conditions. Five students were eliminated for failure to follow directions, which left 133 responses for analysis.

Participants were told that they would be shown offers for products (T-shirts) that had recently been for sale online, and then would be asked to answer the purchase questions. Next, the concept of probabilistic goods was explained to the participants, and they were told that the displayed offers may or may not include an option to purchase a probabilistic good. Participants were asked to make a purchase decision for T-shirt offers in Period 1 and then to rank their strength of preference (SOP) between the distinct T-shirts on a scale of 0 to 50, where SOP is a measure of the difference between valuations for one's preferred and one's less-preferred product. Heterogeneity in the strength of preferences has been previously hypothesized as a key factor in determining the profitability of PS (Fay & Xie, 2008).

In the MD condition, full-priced items were offered in Period 1 and, if no purchase occurred in this period, marked-down items were offered in Period 2. In the PS condition, full-priced items as well as the probabilistic good were offered in Period 1, but there were no second-period offerings. In the ND control condition, both full-priced items were offered in Period 1 only. If participants chose to purchase a T-shirt in Period 1, the shopping experience ended in all conditions.

2.2. Results

A one-way ANOVA for revenue generated by the purchase decisions of the participants was conducted across the five selling conditions. The results indicate a significant effect of selling condition (F(4,128) = 4.79, p = .001). Specific contrasts indicate that the average revenue under ND ($5.86) was less than under the PS 5% condition (M_PS5% = $13.31, t(133) = 4.00, p < .001), the PS 25% condition (M_PS25% = $12.16, t(133) = 3.31, p = .001), the MD 5% condition (M_MD5% = $9.53, t(133) = 1.95, p = .05) and the MD 25% condition (M_MD25% = $9.42, t(133) = 1.95, p = .05). For the two PS conditions, SOP was significantly greater for those buying full-priced goods (M_DS = 34.05) than for those who purchased a discounted good (M_PS = 22.18; F(1,42) = 7.03, p = .01). No significant preference difference was present in the MD conditions between those who chose full price and those who chose a discounted good (p > .8).

Together these findings suggest that, for our sample, both MD and PS were effective at increasing revenue relative to ND, which raises the question of when each strategy works best. The significant effect of SOP is consistent with the arguments in the extant literature that heterogeneity in the strength of consumers' preferences is a fundamental profit driver for PS. In the following sections, we use an analytical model to explore an environment, similar to our experiment, in which both PS and MD may enable a firm to effectively segment its customers. Specifically, we develop a model to explain the conditions under which PS will be more profitable than MD, and vice-versa.

3. Price discrimination: Probabilistic selling versus markdowns

In this section, we introduce a formal mathematical model to explore whether, and under what conditions, PS is more profitable than MD. We begin by focusing on how PS and MD enable price discrimination. In Section 4, we extend the model to incorporate demand uncertainty and capacity constraints.
Our model allows for asymmetric product preferences, costly inventory, endogenous capacity constraints, heterogeneous discount rates, and a spectrum of different product valuations by consumers. Overall, we find that, although PS is not more profitable than MD in all scenarios, there exists a sufficiently broad range of situations in which PS is advantageous to warrant increased attention to this new tool.

3.1. Modeling assumptions

3.1.1. Seller behavior

Consider a retailer with two products, A and B (e.g., a red T-shirt and a white T-shirt), and two possible selling periods (Periods 1 and 2). Prior to the first selling period, the retailer orders \(K_A\) units of product A and \(K_B\) units of product B, where each unit costs \(c\). The seller has three alternative selling strategies:

(1) No Discounting (ND), under which the prices offered do not change over time, i.e., the price of product A and the price of product B is \(P^{ND}\) in both the first and second periods.
(2) Markdown selling (MD), under which prices vary over time. Specifically, the price in the first period (of product A and of product B) is \(P_1^{MD}\) and the price of each product is \(P_2^{MD}\) in the second period.
(3) Probabilistic selling (PS), under which the seller offers each product individually and also a probabilistic product that can turn out to be either product A or product B. Specifically, the price of product A and the price of product B is \(P_1^{PS}\). In addition, the seller offers a probabilistic good in the first period (at a price of \(P_0^{PS}\)). After purchase, the firm immediately determines which product the buyer will receive, each of which is equally likely.6

3.1.2. Buyer behavior

Each consumer purchases at most one product, choosing the purchase option that yields the highest net surplus. There are two types of consumers: θ = {H,L}, where θ = H represents consumers with high product valuations and θ = L represents consumers with low product valuations.
Each type makes up half of the total population and we normalize the size of each segment to one. Let \(v_{\theta F}\) be consumer θ's value in Period 1 for her favored product and \(v_{\theta U}\) be consumer θ's value for her less favored product. Thus, by definition, we have \(v_{HF} > v_{LF}\), \(v_{HF} > v_{HU}\) and \(v_{LF} > v_{LU}\). To reduce notation, we normalize the valuations so that the maximum valuation is one: \(v_{HF} = 1\). Furthermore, products are more highly valued if they are purchased in Period 1 rather than in Period 2. Valuations are lower in the second period due to consumer impatience, loss of perceived newness of the product, or loss of the opportunity to use the product during the first period. Specifically, in Period 2, a consumer's favored product is valued at \(d_\theta v_{\theta F}\) and her less-favored product is valued at \(d_\theta v_{\theta U}\), where \(d_\theta\) is naturally restricted to the parameter region \(0 < d_\theta < 1\) for θ = {H,L}. We allow the products to differ in their aggregate popularity. In particular, \(\alpha\) of consumers \(\left(\tfrac{1}{2} \le \alpha \le 1\right)\) favor the “popular” product and \(1 - \alpha\) of customers favor the “unpopular” product. Note that a specific consumer's favored product may actually be the less popular one. Throughout this section, we assume that product A is the popular product and that the seller and the buyers both know this. In Section 4, we extend the model to allow for demand uncertainty.

6 Fay and Xie (2008) demonstrate that a seller typically finds an equal probability of assignment optimal under various demand conditions. Thus, the assumption of equal probability is commonly made (e.g., Fay & Xie, 2010; Jiang, 2007).

3.2. Three strategies

3.2.1. No Discounting (ND)

Under ND, the firm sells in both periods at a price \(P^{ND}\). All sales will occur in the first period. The seller chooses the inventory orders \((K_A, K_B)\) and price, \(P^{ND}\), in order to maximize its profit. These optimal values and the resulting profit are:

\[
K_A^{ND} = \begin{cases} 2\alpha & c \le \hat{c} \\ \alpha & \hat{c} < c \le 1 \\ 0 & c > 1 \end{cases}; \qquad
K_B^{ND} = \begin{cases} 2(1-\alpha) & c \le \hat{c} \\ 1-\alpha & \hat{c} < c \le 1 \\ 0 & c > 1 \end{cases}
\]
\[
P^{ND} = \begin{cases} v_{LF} & c \le \hat{c} \\ 1 & \hat{c} < c \le 1 \\ \text{N/A} & c > 1 \end{cases}; \qquad
\Pi^{ND} = \begin{cases} 2(v_{LF} - c) & c \le \hat{c} \\ 1 - c & \hat{c} < c \le 1 \\ 0 & c > 1 \end{cases}
\qquad \text{where } \hat{c} = 2v_{LF} - 1 \tag{1}
\]

If costs are low \((c \le \hat{c})\), the seller orders sufficient capacity to serve all consumers: \(K_A^{ND} + K_B^{ND} = 2\). Price is set so that L-type consumers are willing to purchase. For moderate costs \((\hat{c} < c \le 1)\), the seller only orders enough capacity to serve the H-type consumers, \(K_A^{ND} + K_B^{ND} = 1\), and the price is set so that H-type consumers are just willing to purchase. For higher costs, \(c > 1\), it is impossible for the seller to earn a positive profit. For the remainder of the paper, we assume that \(c \le \hat{c}\), so that it is optimal to serve both consumer types, enabling us to focus on the role of price discrimination (since costs will be the same under all three strategies). For \(c > \hat{c}\), total sales under either MD or PS will be higher than under ND, and thus the role of price discrimination is confounded with the role of market expansion. Furthermore, even at these higher costs, the magnitude of \(c\) does not impact the focal comparison between MD and PS, because these two strategies require the same amount of inventory and thus incur the same costs.

3.2.2. Markdown selling (MD)

In MD, the seller offers each specified product in both the first and second periods (at prices of \(P_1^{MD}\) and \(P_2^{MD}\), respectively). Under this strategy, each H-type consumer buys her favorite product in the first period and each L-type consumer buys her favorite product in the second period.7,8 To meet demand, the seller needs \(K_A^{MD} = 2\alpha\) and \(K_B^{MD} = 2(1 - \alpha)\). The prices \(P_1^{MD}\) and \(P_2^{MD}\) induce such a purchasing pattern only if the following incentive compatibility and participation constraints are met:

\[
\begin{aligned}
[P1] &: \; 1 - P_1^{MD} \ge 0 \\
[P2] &: \; d_L v_{LF} - P_2^{MD} \ge 0 \\
[IC1] &: \; 1 - P_1^{MD} - \left(d_H - P_2^{MD}\right) \ge 0 \\
[IC2] &: \; d_L v_{LF} - P_2^{MD} - \left(v_{LF} - P_1^{MD}\right) \ge 0
\end{aligned}
\tag{2}
\]

The seller chooses prices \((P_1^{MD}, P_2^{MD})\) to maximize its total profit given the constraints in (2)9:

\[
P_1^{MD} = 1 - \mathrm{Max}\left[d_H - d_L v_{LF},\, 0\right], \qquad P_2^{MD} = d_L v_{LF} \tag{3}
\]

These prices result in a profit of:

\[
\Pi^{MD} = 1 - \mathrm{Max}\left[d_H - d_L v_{LF},\, 0\right] + d_L v_{LF} - 2c \tag{4}
\]

7 Alternative strategies in which H-types wait to purchase until the second period and L-types either purchase in the first period or also wait to purchase until the second period would yield strictly less profit.
8 One could envision a scenario in which L-type consumers also purchase in the first period. However, our focus is on identifying scenarios in which MD is strictly more profitable than ND. If L-types also purchase in the first period, no markdown sales occur and thus MD and ND would be identical.
9 At the optimal solution, constraint [P2] must be binding. If \(P_2^{MD}\) was set such that [P2] did not hold with equality, then it would be possible for the seller to raise this second-period price and still sell to L-type consumers (and possibly even sell to the H-types at a higher first-period price), thus increasing its profit. Similarly, either [P1] or [IC1] must bind. If \(P_1^{MD}\) was set such that neither of these constraints held with equality, then it would be possible for the seller to raise this first-period price and still sell to H-type consumers, thus increasing its profit.

3.2.3. Probabilistic selling (PS)

Under PS, in addition to selling each specified product in the first period (each at a price of \(P_1^{PS}\)), the seller also offers a probabilistic good in the first period (at a price of \(P_0^{PS}\)). After purchase, the firm immediately determines which product the buyer will consume (each of which is equally likely). Under this strategy, each H-type consumer buys her favored product in the first period and each L-type consumer buys the probabilistic good (also in the first period).10 To meet demand, the seller selects \(K_A^{PS} = \alpha + \tfrac{1}{2} = \tfrac{2\alpha + 1}{2}\) and \(K_B^{PS} = (1 - \alpha) + \tfrac{1}{2} = \tfrac{3 - 2\alpha}{2}\). The expected value of the probabilistic good to consumer θ is \(v_{\theta 0} = (v_{\theta F} + v_{\theta U})/2\). The prices \(P_1^{PS}\) and \(P_0^{PS}\) induce such a purchasing pattern only if the following incentive compatibility and participation constraints are met:

\[
\begin{aligned}
[P1'] &: \; 1 - P_1^{PS} \ge 0 \\
[P2'] &: \; \frac{v_{LF} + v_{LU}}{2} - P_0^{PS} \ge 0 \\
[IC1'] &: \; 1 - P_1^{PS} - \left(\frac{1 + v_{HU}}{2} - P_0^{PS}\right) \ge 0 \\
[IC2'] &: \; \frac{v_{LF} + v_{LU}}{2} - P_0^{PS} - \left(v_{LF} - P_1^{PS}\right) \ge 0
\end{aligned}
\tag{5}
\]

The seller chooses prices (\(P_1^{PS}\) and \(P_0^{PS}\)) to maximize its total profit given the constraints in Eq. (5)11:

\[
P_1^{PS} = 1 - \mathrm{Max}\left\{\frac{1 + v_{HU}}{2} - \frac{v_{LF} + v_{LU}}{2},\, 0\right\} = \mathrm{Min}\left\{\frac{1 - v_{HU} + v_{LF} + v_{LU}}{2},\, 1\right\}, \qquad
P_0^{PS} = \frac{v_{LF} + v_{LU}}{2} \tag{6}
\]

These prices result in a profit of:

\[
\Pi^{PS} = \mathrm{Min}\left\{\frac{1 - v_{HU} + v_{LF} + v_{LU}}{2},\, 1\right\} + \frac{v_{LF} + v_{LU}}{2} - 2c \tag{7}
\]

10 An alternative strategy in which H-types also buy the probabilistic good would yield strictly less profit since the firm can charge a higher price for a consumer's favored good than it can for the probabilistic good.
11 At the optimal solution, constraint [P2'] must be binding. If \(P_0^{PS}\) was set such that [P2'] did not hold with equality, then it would be possible for the seller to raise the price of the probabilistic product and still sell it to L-type consumers, thus increasing its profit. Similarly, either [P1'] or [IC1'] must bind. If \(P_1^{PS}\) was set such that neither of these constraints held with equality, then it would be possible for the seller to raise the price of the specified goods and still sell to H-type consumers, thus increasing its profit.

3.3. Comparison of profit

Lemma 1 summarizes the conditions under which MD and PS, respectively, are more profitable than ND. Proofs of the Lemmas, Corollaries, and Propositions are given in the Web Appendix.

Lemma 1.
(Conditions required for price discrimination) Compared with the No Discounting strategy, which earns the seller the same profit from both H- and L-type consumers,

a) Markdown selling allows the seller to benefit from price discrimination if the increase in profit from H-type consumers (who buy in the first period) is larger than the decrease in profit from L-type consumers (who buy in the second period).
b) Probabilistic selling allows the seller to benefit from price discrimination if the increase in profit from H-type consumers (who buy the specified goods) is larger than the decrease in profit from the L-type consumers (who buy the probabilistic good).

Mathematically,

\[
\Pi^{MD} > \Pi^{ND} \quad \text{if} \quad \Delta^{MD}_{\text{Price to H-Types}} - \Delta^{MD}_{\text{Price to L-Types}} > 0
\]
\[
\Pi^{PS} > \Pi^{ND} \quad \text{if} \quad \Delta^{PS}_{\text{Price to H-Types}} - \Delta^{PS}_{\text{Price to L-Types}} > 0
\]

where \(\Delta^{MD}_{\text{Price to H-Types}} = P_1^{MD} - P^{ND}\); \(\Delta^{MD}_{\text{Price to L-Types}} = P^{ND} - P_2^{MD}\); \(\Delta^{PS}_{\text{Price to H-Types}} = P_1^{PS} - P^{ND}\); \(\Delta^{PS}_{\text{Price to L-Types}} = P^{ND} - P_0^{PS}\), and these prices are given in Eqs. (1), (3), and (6), respectively.

Two important corollaries to Lemma 1 are:

Corollary 1. A necessary condition for the inequality in Lemma 1a) to be satisfied is \(d_L > d_H\), i.e., H-types are less patient than L-types.

Corollary 2. A necessary condition for the inequality in Lemma 1b) to be satisfied is \(v_{LF} - v_{LU} < 1 - v_{HU}\), i.e., H-types have stronger product preferences than L-types.

These two corollaries indicate that “damaging” the good (either by delaying consumption or pairing it with one's less-favored good) must create greater disutility for the high- than for the low-value consumers in order for each respective pricing strategy to be profitable. Under MD, the discounted second-period product must be relatively attractive to the low-value consumers (so that the second-period price is not too low), but also relatively unattractive to the high-value consumers (so that they are willing to pay a premium to purchase in the earlier period). A necessary condition to achieve a net advantage from MD is that high-value consumers must be less willing to wait than low-value consumers. Su (2007) and Besbes and Lobel (2012) also point out that inter-temporal price discrimination is predicated on there being an inverse relationship between consumers' patience and their valuations. Similarly, under PS, the probabilistic good must be relatively more appealing to the low-value consumers so that the discount for the probabilistic good does not have to be too large and, thus, a relatively high price can be obtained for the specified goods.

Furthermore, although Lemma 1 indicates that both MD and PS can be used to segment the market, the basis of this segmentation is very different in each case: MD is based on buyer heterogeneity in their discount for time (Corollary 1) and PS is based on buyer heterogeneity in their product preference strength (Corollary 2). Such differences create the potential for retailers to benefit from PS, as summarized by the following proposition.

Proposition 1. (Profit advantage of probabilistic selling)

(a) PS expands the scenarios under which the seller can benefit from price discrimination, i.e., there is a subset of parameters such that PS is more profitable than ND, but MD is not.
(b) PS is more profitable than MD if it can create the following two advantages (or if one of these advantages is sufficiently large):
1. A more valuable “damaged” good, i.e., L-type consumers are willing to pay more for the probabilistic product in the first period than for their preferred product in the second period.
2. A lessened cannibalization threat, i.e., H-type consumers demand less surplus to forgo purchasing the discounted product under PS than under MD.
Formally, PS earns the retailer a higher profit if the following condition holds:

\[
\Pi^{PS} - \Pi^{MD} = \Delta_{\text{Value to L-Types}} + \Delta_{\text{Value to H-Types}}
= \left[\frac{v_{LF} + v_{LU}}{2} - d_L v_{LF}\right]
+ \left[\mathrm{Max}\left[d_H - d_L v_{LF},\, 0\right] - \mathrm{Max}\left\{\frac{1 + v_{HU}}{2} - \frac{v_{LF} + v_{LU}}{2},\, 0\right\}\right] > 0
\]

The first bracketed term is the difference in how much L-types value a probabilistic product in the first period versus their favored product in the second period. The second bracketed term is the difference in the surplus the seller must provide to H-types to get them to forgo purchasing the discounted product in the markdown selling strategy versus in the probabilistic selling strategy.
(c) The profit advantage of PS relative to MD \((\Pi^{PS} - \Pi^{MD})\) increases as
1. L-types' valuations for the probabilistic good increase (i.e., a larger \(v_{LU}\)).
2. L-types become more impatient (i.e., a lower \(d_L\)).
3. H-types' valuations for the probabilistic good decrease (i.e., a smaller \(v_{HU}\)).
4. H-types become more patient (i.e., a higher \(d_H\)).

Fig. 1 illustrates the main results from Proposition 1. In Fig. 1a), we show the impact of \(v_{LU}\) and \(d_L\) on the relative profit of PS, MD, and ND (holding \(v_{LF}\), \(v_{HU}\), and \(d_H\) constant). Fig. 1b) shows the impact of \(v_{HU}\) and \(d_H\) on the relative profit, holding \(v_{LF}\), \(v_{LU}\), and \(d_L\) constant. The unshaded regions indicate parameter values for which neither MD nor PS can outperform ND. In the polka-dotted regions, PS is more profitable than both MD and ND.
In shaded regions, MD is more profitable than both PS and ND. Consistent with Corollary 1, MD is not a useful price discrimination tool if either L-type consumers are too impatient (i.e., \(d_L < 2/3\) in Fig. 1a) or H-type consumers are too patient (i.e., \(d_H > 5/8\) in Fig. 1b). However, even with such parameters, L-type consumers may have sufficiently weak preferences (i.e., \(v_{LU} > ½\) in Fig. 1a) and/or H-types may have sufficiently strong preferences (i.e., \(v_{HU} < ½\) in Fig. 1b) so that PS is more profitable than ND. Thus, PS expands the range of market settings under which the seller can price discriminate, i.e., in the regions denoted “Only PS is advantageous to ND,” price discrimination on the basis of time is not profitable, but price discrimination on the basis of consumer preference strength is profitable. In contrast, in the regions denoted “Only MD is advantageous to ND,” price discrimination is profitable if it is done on the basis of time but not on the basis of consumer preference strength. Thus, PS and MD can be viewed as complementary strategies, with one strategy often being profitable in markets where the other strategy would not be beneficial.

In markets where both MD and PS outperform ND, the relative advantage of the two price discrimination mechanisms depends on two effects: (1) Difference in Value to L-types, i.e., how valuations for the low-value consumers differ across the two price discrimination mechanisms, and (2) Difference in Surplus to H-types, i.e., how the amount of surplus that must be allocated to the high-value consumers differs between MD and PS.12

12 In a Hotelling model, which is most often employed in the extant literature (e.g., Fay & Xie, 2008, 2010; Jerath et al., 2010; Jiang, 2007), the Surplus to H-types effect would be missing since, in that model, no consumer obtains a positive surplus from purchasing the probabilistic good and, thus, no extra incentive is needed to induce strong-preferenced consumers to buy their preferred good. Therefore, the current model has the advantage of allowing us to analyze the cannibalization effect of PS.

[Fig. 1, two panels. Panel a) \(v_{LF} = ¾\); \(v_{HU} = ½\); \(d_H = ½\); horizontal axis \(d_L\), vertical axis \(v_{LU}\). Panel b) \(v_{LF} = ¾\); \(d_L = ¾\); \(v_{LU} = ½\); horizontal axis \(d_H\), vertical axis \(v_{HU}\). Region labels: “Only PS is advantageous to ND,” “Only MD is advantageous to ND,” “Neither PS nor MD is advantageous to ND,” A, B1, B2, C.]

Fig. 1. The advantage of price discrimination. The patterns of the regions in Fig. 1a and b represent which strategy is most advantageous for the retailer. In the shaded regions, MD is the most profitable strategy. In the polka-dotted regions, PS is the most profitable strategy. In the unshaded regions, ND is the most profitable strategy. In the regions labeled A, B1, B2, and C, both PS and MD are more profitable than ND. In region A, PS generates more revenue than MD from both the L-types and the H-types. In region C, PS generates less revenue from both consumer types. In regions B1 and B2, PS generates higher revenue from the L-types but lower revenue from the H-types. The net effect is that PS is more profitable than MD in region B1, but less profitable in region B2. Note: The maximum value of \(v_{LU}\) in Fig. 1a is ¾ since, by definition, \(v_{LU} < v_{LF} = ¾\).

First consider the Value-to-L-types effect. Under both MD and PS, low-value consumers purchase a product that is less valuable to them than their preferred product in the first period and the seller charges a price that is just low enough to induce them to make this purchase.
Thus, the difference in revenue per L-type customer between MD and PS equals the difference between how much such a customer values a probabilistic good in Period 1 and how much she values her preferred good in Period 2. The relative advantage of PS is greater the weaker the SOPs of low-value consumers and the less patient they are (Proposition 1(c)). On the other hand, PS is unable to create an advantage (relative to MD) in Value to L-types if low-value customers are patient but “picky” about which product they consume.

We now turn to the Surplus to H-types. Under both MD and PS, the seller must set its first-period price for the specified goods such that the high-value consumers will purchase their preferred products in Period 1. Specifically, the seller must reduce the price below these consumers' willingness-to-pay so they receive at least as much surplus as they would receive if they purchased the discounted product. This maximum obtainable price depends critically upon how much the H-type consumers value the alternate purchase offering and also on the price of this alternate purchase option. Specifically, as indicated in Proposition 1(c), the PS advantage increases as the SOP of low-value consumers weakens (so that \(P_0^{PS}\) is higher), as low-value consumers become less patient (so that \(P_2^{MD}\) is lower), as the SOP of high-value consumers becomes stronger (so that purchasing the probabilistic good would yield less surplus), and as the latter become more patient (so that waiting to purchase until the second period would yield more surplus). On the other hand, PS fails to generate an advantage in Surplus to H-types if low-value customers are patient and picky, while high-value customers are impatient and are not picky.
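The two effects discussed above can be evaluated separately. The following is a minimal sketch, assuming the closed-form terms of the condition in Proposition 1(b); the function names and the parameter values in the usage lines are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction as F


def value_to_l_types(v_lf, v_lu, d_l):
    """Value-to-L-types effect: price of the probabilistic good under PS,
    (v_LF + v_LU) / 2, minus the second-period price under MD, d_L * v_LF."""
    return (v_lf + v_lu) / 2 - d_l * v_lf


def surplus_to_h_types(v_lf, v_lu, v_hu, d_h, d_l):
    """Surplus-to-H-types effect: surplus left to H-types under MD minus the
    surplus left to them under PS (positive values favor PS)."""
    surplus_md = max(d_h - d_l * v_lf, 0)
    surplus_ps = max((1 + v_hu) / 2 - (v_lf + v_lu) / 2, 0)
    return surplus_md - surplus_ps


def ps_minus_md(v_lf, v_lu, v_hu, d_h, d_l):
    """Profit difference between PS and MD as the sum of the two effects."""
    return (value_to_l_types(v_lf, v_lu, d_l)
            + surplus_to_h_types(v_lf, v_lu, v_hu, d_h, d_l))


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    p = dict(v_lf=F(3, 4), v_lu=F(7, 10), v_hu=F(1, 2), d_h=F(1, 2), d_l=F(7, 10))
    print("Value to L-types:  ", value_to_l_types(p["v_lf"], p["v_lu"], p["d_l"]))
    print("Surplus to H-types:", surplus_to_h_types(**p))
    print("PS minus MD profit:", ps_minus_md(**p))
```

With these illustrative values the first effect is positive and the second is negative, so the sign of the sum decides which mechanism earns more. Exact fractions are used only to avoid rounding noise; ordinary floats work as well.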
In short, whether or not PS generates an advantage in Surplus to H-types depends on whether it is easier for the seller to prevent high-value consumers from opting for the (discounted) probabilistic good or to prevent them from waiting until the second period to purchase their preferred products.

As shown in the equation in Proposition 1(b), PS is preferable to MD only if the sum of these two effects is positive.13 If PS creates a more valuable “damaged” product for low-value consumers and a lower (or no change in) required surplus for high-value consumers, then PS is clearly more profitable than MD. This occurs in the region labeled “A” in Fig. 1b. On the other hand, neither advantage is present in region “C” of Fig. 1a. Here, MD is more profitable than PS. If the two effects go in different directions, the larger effect determines which price discrimination mechanism is more profitable. Thus, even if PS has one disadvantage, this strategy can still be more profitable than MD if its advantage is greater than its disadvantage. In the regions labeled “B1,” PS creates a sufficiently large Value-to-L-types effect to offset the negative Surplus-to-H-types effect. However, in the regions labeled “B2,” the negative Surplus-to-H-types effect is of a greater magnitude, and thus MD is more profitable than PS.

13 Note, we assume that the low- and the high-value segments are of equal size. With asymmetric segment sizes, these two effects would need to be weighted according to each segment's size.

3.4. Summary

In sum, for a price discrimination mechanism (e.g., PS or MD) to improve profit, it must create a new purchase option that is attractive to low-value consumers (thus enabling the firm to earn significant revenue from the new purchase option), but is not attractive to high-value consumers (so that high margins can be maintained for the original products).
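To make the comparison concrete, the three closed-form profits can be evaluated side by side. This is a sketch under stated assumptions: it uses the low-cost branch of Eq. (1) together with Eqs. (4) and (7), and the function names, the tie-breaking rule, and the parameter values in the usage lines are hypothetical choices made for illustration.

```python
from fractions import Fraction as F


def profit_nd(v_lf, c):
    """No Discounting profit, low-cost branch of Eq. (1): both types served at v_LF."""
    return 2 * (v_lf - c)


def profit_md(v_lf, d_h, d_l, c):
    """Markdown profit, Eq. (4)."""
    return 1 - max(d_h - d_l * v_lf, 0) + d_l * v_lf - 2 * c


def profit_ps(v_lf, v_lu, v_hu, c):
    """Probabilistic selling profit, Eq. (7)."""
    return min((1 - v_hu + v_lf + v_lu) / 2, 1) + (v_lf + v_lu) / 2 - 2 * c


def best_strategy(v_lf, v_lu, v_hu, d_h, d_l, c):
    """Return 'ND', 'MD', or 'PS'. A discounting strategy is chosen only if it
    is strictly more profitable than the current best (assumed tie rule)."""
    if c > 2 * v_lf - 1:
        raise ValueError("sketch covers only the low-cost case c <= 2*v_LF - 1")
    profits = {
        "ND": profit_nd(v_lf, c),
        "MD": profit_md(v_lf, d_h, d_l, c),
        "PS": profit_ps(v_lf, v_lu, v_hu, c),
    }
    best = "ND"
    for name in ("MD", "PS"):
        if profits[name] > profits[best]:
            best = name
    return best


if __name__ == "__main__":
    # Fixed values follow the panel a) settings of Fig. 1; c and the
    # (d_L, v_LU) pairs are illustrative assumptions.
    fixed = dict(v_lf=F(3, 4), v_hu=F(1, 2), d_h=F(1, 2), c=F(1, 4))
    for d_l, v_lu in [(F(1, 2), F(1, 4)), (F(1, 2), F(7, 10)), (F(9, 10), F(1, 4))]:
        print(f"d_L={d_l}, v_LU={v_lu}:", best_strategy(d_l=d_l, v_lu=v_lu, **fixed))
```

Because inventory costs enter all three profits as the same term in this low-cost case, the ranking does not depend on the value chosen for `c`.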
Whether varying prices over time (via the MD strategy) or creating a probabilistic product (via the PS strategy) is more advantageous depends crucially upon the heterogeneity in consumers' degree of patience and the strength of their preferences. Thus, in some markets we would expect PS to be more profitable than MD and in other markets for the reverse to be true (depending on whether or not the condition from Proposition 1(b) holds).

4. Model extension: Demand uncertainty

Since retailers often cannot predict which products will be more popular, inventory is depleted asymmetrically, with popular products selling faster and unpopular products selling more slowly than expected. Unpopular items that remain after the primary selling season are often severely marked down. In this section, we incorporate demand uncertainty into the base model from Section 3 to examine whether PS remains a viable strategy (relative to MD) in such market settings and to garner additional insights into how demand uncertainty impacts the tradeoff between the two strategies. Specifically, at the time inventory orders are made and when first-period prices are chosen, the seller does not know whether product A or product B will be the more popular good. We assume that the seller knows the value of α, i.e., that one good will be more popular than the other, but not whether α will apply to product A or product B. Instead, the seller believes each product is equally likely to be the popular one. In period 2, the seller learns which product is popular and can adjust its prices accordingly.

D.H. Rice et al. / Intern. J. of Research in Marketing 31 (2014) 147–155

In this section, we derive the optimal inventory orders that occur under MD and PS and make the simplifying assumptions that $c < \min\left[d_L v_{LF},\ (v_{LF} + v_{LU})/2\right]$, $d_H = d_L/2$, and $v_{LU} = v_{HU} = 0$.
These conditions guarantee that MD and PS are advantageous in the absence of demand uncertainty, i.e., the conditions given in Corollary 1 and Corollary 2 both hold, and that the L-type's value of both the probabilistic good under PS and the delayed product under MD exceed the cost of acquiring a unit of a product in order to meet this demand. Lemma 2 gives the optimal solution under PS when demand uncertainty is present:

Lemma 2. (PS with demand uncertainty) If $d_H = d_L/2$, $v_{LU} = v_{HU} = 0$ and $c < v_{LF}/2$, in the presence of demand uncertainty
(a) The firm will purchase $\frac{2\alpha+1}{2}$ units of both product A and product B.
(b) The resulting profit is $\Pi^{PS,DU} = v_{LF} + \frac{1}{2} - (2\alpha+1)c$.

Lemma 2 shows that, if costs are sufficiently low, the inventory order under PS will enable the firm to sell to all customers (of both types). Demand uncertainty does not impact the prices that the firm sets or the amount of sales. However, the firm must order $\frac{2\alpha+1}{2}$ units of both product A and product B, whereas without demand uncertainty, the firm would order $\frac{2\alpha+1}{2}$ units of the popular good and $\frac{3-2\alpha}{2}$ $\left(< \frac{2\alpha+1}{2}\right)$ units of the unpopular good. Thus, demand uncertainty leads to excess inventory for the unpopular good and reduces profit. Proposition 2 compares profit and inventory utilization under PS to that which occurs under MD when demand uncertainty is present.

Proposition 2. (Demand uncertainty with costly inventory) Probabilistic selling addresses demand uncertainty more efficiently than does markdown selling. Specifically,
(a) For sufficiently low costs $\left(c < \max\left[\alpha, \frac{2}{3}\right] d_L v_{LF}\right)$, demand uncertainty reduces the profit under markdown selling more than under probabilistic selling.
(b) Relative to markdown selling, probabilistic selling achieves a higher level of inventory utilization and reduced wastage for all $c < \frac{d_L v_{LF}}{2}$.
(c) Total sales are (weakly) higher under probabilistic selling than under markdown selling.
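As a rough numeric check of Lemma 2 and of the low-cost case of Proposition 2, the following Python sketch compares inventory orders, unsold units, and the profit lost to demand uncertainty under PS and MD. It relies only on the order quantities stated in this section. The function names are hypothetical, the value α = 4/5 is an arbitrary illustration, and v_LF = 3/4 and c = 1/5 mirror the Fig. 2a settings; total sales of 2 (every consumer served) is assumed, as in the low-cost case.

```python
from fractions import Fraction


def ps_orders(alpha, uncertain):
    """Units of (popular, unpopular) good ordered under PS."""
    popular = (2 * alpha + 1) / 2
    # Under demand uncertainty the seller cannot tell the goods apart,
    # so it orders the larger quantity of both (Lemma 2(a)).
    unpopular = popular if uncertain else (3 - 2 * alpha) / 2
    return popular, unpopular


def md_orders(alpha, uncertain):
    """Units of (popular, unpopular) good ordered under MD (low-cost case)."""
    popular = 2 * alpha
    unpopular = popular if uncertain else 2 * (1 - alpha)
    return popular, unpopular


def ps_profit(v_lf, alpha, c, uncertain):
    """PS profit: revenue v_LF + 1/2 (unaffected by uncertainty) minus inventory cost."""
    revenue = v_lf + Fraction(1, 2)
    return revenue - c * sum(ps_orders(alpha, uncertain))


TOTAL_SALES = 2  # assumption: all consumers of both types buy one unit

# Illustrative parameter values (exact fractions avoid rounding noise).
alpha, v_lf, c = Fraction(4, 5), Fraction(3, 4), Fraction(1, 5)

unsold_ps = sum(ps_orders(alpha, True)) - TOTAL_SALES
unsold_md = sum(md_orders(alpha, True)) - TOTAL_SALES
loss_ps = ps_profit(v_lf, alpha, c, False) - ps_profit(v_lf, alpha, c, True)
# Under MD in the low-cost case revenue is unchanged, so the loss is the extra inventory cost.
loss_md = c * (sum(md_orders(alpha, True)) - sum(md_orders(alpha, False)))

print("unsold units (PS, MD):", unsold_ps, unsold_md)
print("profit loss  (PS, MD):", loss_ps, loss_md)
```

For these illustrative values, the sketch gives 3/5 unsold units under PS versus 6/5 under MD, and a profit loss of 3/25 under PS versus 6/25 under MD, in line with the direction of Proposition 2(a) and (b).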
In order to understand Proposition 2, it is instructive to examine how products with asymmetric demand are allocated in the absence of demand uncertainty. Under MD, H-type consumers purchase their preferred good (where α prefer the popular good) in the first period and L-type consumers purchase their preferred good (also where α prefer the popular good) in the second period. Thus, across the two selling periods, 2α units of the popular good and 2(1 − α) units of the unpopular good are sold. Under PS, in the first period, H-type consumers purchase their preferred good (where α prefer the popular good) and L-type consumers will purchase the probabilistic good (where a fraction ½ will be assigned to the popular good). Thus, in total, $\alpha + \frac{1}{2} = \frac{1+2\alpha}{2}$ units of the popular good and $1 - \alpha + \frac{1}{2} = \frac{3-2\alpha}{2}$ units of the unpopular good are consumed. In the absence of demand uncertainty, under both PS and MD, the firm orders inventory to exactly meet the demand for each product, which involves purchasing more units of the popular product. It is critical to note that total consumption of the popular good and the difference in consumption between the popular and the unpopular good are both higher under MD than under PS.

When demand uncertainty exists, the firm does not know which product will be more popular and thus inventory orders cannot be tied to a product's popularity. Mathematically, this is equivalent to imposing a constraint, $K_A^s = K_B^s$, $s \in \{ND, MD, PS\}$, on the maximization problems in Section 3. It is intuitive that, as Proposition 2(a) states, this constraint will have a larger detrimental impact on MD since, without such a restriction, the inventory orders would be more asymmetric under MD than under PS. Under PS, as long as costs are not too large, Lemma 2 shows that the firm's best response to demand uncertainty is to order $K_A^{PS,DU} = K_B^{PS,DU} = \frac{2\alpha+1}{2}$ units. Some units of the unpopular good will go unsold. Thus, even though the firm generates the same revenue as in the absence of demand uncertainty, higher costs are incurred (due to a larger total amount of inventory).

Under MD, one of two options will be optimal. First, if costs are very low $\left(c \le \frac{d_L v_{LF}}{2}\right)$, the firm will order $K_A^{MD,DU} = K_B^{MD,DU} = 2\alpha$ units so that sales are the same as in the absence of demand uncertainty. Since the firm sells to all consumers and at the same prices as when there was no demand uncertainty, the latter has no effect on the revenue earned under MD. However, costs increase due to the higher inventory order, thus leading to unsold units remaining and lower profit. Fig. 2a graphs the number of unsold units under MD and shows the magnitude of the profit decrease that demand uncertainty inflicts on the MD strategy ($\Delta^{MD,DU}$) as a function of demand asymmetry (α). Notice that, as α increases, the seller faces more severe demand uncertainty. Two results are apparent: (1) MD results in more unsold units than does PS (since inventory orders are higher under MD); and (2) demand uncertainty reduces profit more for MD than for PS. As demand uncertainty grows, the magnitude of both of these effects increases.

Fig. 2. Impact of demand uncertainty on PS and MD. Panels: (a) Low; (b) Moderate. Plotted series: Sales (PS), Sales (MD), Unsold (PS), Unsold (MD), and the curves labeled “MD, DU” and “PS, DU”. The graphs are drawn assuming $v_{LF} = d_L = 3/4$. For Fig. 2a, which is drawn for c = .2, total sales equals 2 for both MD and PS. Fig. 2b is drawn assuming c = .3.

Second, if costs are higher $\left(c > \frac{d_L v_{LF}}{2}\right)$, the retailer will order fewer than 2α units of each good. This scenario is illustrated in Fig. 2b. While ordering less inventory reduces the number of unsold units, this also creates stockouts, as some or all of the L-type consumers would be unable to purchase in the second period.
As stated by Proposition 2(c), total sales would be higher under PS since, in this case, all consumers receive one unit of one product, whereas under MD some consumers do not purchase anything. This reduction in revenue makes the profit decrease from demand uncertainty larger under MD than under PS. It is important to recognize that, whichever of these two options is adopted under MD, PS offers the seller a more effective tool for addressing demand uncertainty. Relative to PS, MD either requires a larger investment in inventory (as in option one) or fewer total sales (as in option two). In sum, not only does PS remain a viable strategy for retailers who face demand uncertainty, the relative advantage of PS over MD becomes even stronger when demand uncertainty is present. 5. Concluding comments 5.1. The benefits of probabilistic selling Markdowns are commonplace in retailing and can result in painful repercussions, such as very low (or even negative) margins and a consumer mentality of delaying purchases in anticipation of a big sale. In this paper, we show that probabilistic selling (PS) can be preferred over a typical markdown (MD) strategy. In particular, we illustrate that introducing probabilistic products can be beneficial because they (1) allow the seller to obtain higher prices from customers with strong preferences for one product over alternatives (because probabilistic products may present less of a cannibalization threat to full-price sales than do traditional time-dependent markdowns); (2) reduce the size of necessary discounts (because consumers may be willing to pay more for a randomly selected product at the beginning of the season than for their preferred product late in the season); and (3) more effectively address the negative consequences of demand uncertainty and limited capacity by helping a firm prevent stockouts and reduce excess inventory. 
We find that, while both probabilistic products and markdowns can be used to segment consumers, PS is most effective when low-value consumers have weak preferences and high-value consumers have strong preferences. In contrast, markdown pricing relies on specific differences in consumers' willingness to delay purchase until a later date. The markdown pricing strategy is most effective when low-value consumers are relatively patient and high-value consumers are relatively impatient. Because the MD and PS strategies rely on distinctly different mechanisms to segment the market, PS is a valuable additional tool for retailers, especially those who operate in markets where markdowns are ineffective (e.g., non-fashion products or products unaffected by seasonality so that consumer arrival times are not negatively correlated with product valuations or price sensitivity). 5.2. Limitations and future research While markdown pricing strategies are ubiquitous, the PS strategy is still nascent. More research is needed to fully flesh out how to optimally implement this selling strategy. For example, it is unclear how PS impacts which products (and how many) a seller should stock. Other research has considered a version of a probabilistic good (which they call a “flexible good”) in which the seller does not assign products to buyers until after demand uncertainty is resolved (e.g., Gallego & Phillips, 2004; Mang et al., 2012; Post, 2010). As refinements to PS are made, such as determining the optimal product mix and the optimal timing of product assignments, the profit obtainable from the strategy should increase and, thus, the market settings in which PS is advantageous to traditional markdowns should be enlarged beyond the regions identified in this paper, which utilizes a base case (and a rather conservative) definition of PS. The paper employs a stylized model and thus has several limitations. 
First, in our model, we assume no inventory holding costs and that first- and second-period profits are equally weighted. If holding costs and discounting of future cash flow were incorporated, PS would have the additional benefit over MD of shifting sales forward in time. This advantage would be especially important to sellers who desire rapid turnover or incur significant inventory holding costs. Second, there may be circumstances in practice in which it is beneficial to offer both temporal-based discounts and discounted probabilistic products. Third, our stylized model leads to the firm charging the same price for each product (both in the first and second periods). In addition, we assume that inventory orders can only be placed prior to the selling season. We believe that our assumptions are quite realistic for many retailers since there are many cases in retailing where markdowns are symmetric and mid-period inventory acquisition is not possible. However, in practice, some retailers do offer different-sized markdowns as they better learn the demand for each particular product and others are able to replenish inventory throughout the selling season. Relaxing these (and other) assumptions of the model could be a valuable direction for future research.

Additional research questions that remain unresolved include: What are the ramifications of PS on suppliers (e.g., their market power and effects on supply chain dynamics)? Can PS be effective when the seller is uncertain about total category demand rather than the relative popularity of each specific item? Do some consumers prefer the probabilistic good because (rather than in spite of the fact that) it offers a gamble? Does PS alter consumer preferences (e.g., dilute brand or store loyalty)?

Web Appendix

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.ijresmar.2013.08.006.

References

Anderson, C. K. (2009). Setting prices on priceline. Interfaces, 39(4), 307–315.
Anderson, C.
K., & Xie, X. (2012). A choice-based dynamic programming approach for setting opaque prices. Production and Operations Management, 21(3), 590–605.
Besbes, O., & Lobel, I. (2012). Intertemporal price discrimination: Structure and computation of optimal policies. Columbia Business School Research Paper No. 12/46 (August 7, 2012, downloaded Dec. 4, 2012 from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2126312).
Bitran, G. R., & Mondschein, S. V. (1997). Periodic pricing of seasonal products in retailing. Management Science, 43(1), 64–79.
Chung, C.-S., Flynn, J., & Zhu, J. (2009). The newsvendor problem with an in-season price adjustment. European Journal of Operational Research, 198(1), 148–156.
Coughlan, A., & Soberman, D. (2005). Strategic segmentation using outlet malls. International Journal of Research in Marketing, 22(1), 61–86.
Fay, S. (2008). Selling an opaque product through an intermediary: The case of disguising one's product. Journal of Retailing, 84(1), 59–75.
Fay, S., & Xie, J. (2008). Probabilistic goods: A creative way of selling products and services. Marketing Science, 27(4), 674–690.
Fay, S., & Xie, J. (2010). The economics of buyer uncertainty: Advance selling vs. probabilistic selling. Marketing Science, 29(6), 1040–1057.
Fay, S., & Xie, J. (2012). Timing of commitment as a strategic variable: Using probabilistic selling to enhance inventory. Working Paper.
Friend, S. C., & Walker, P. H. (2001). Welcome to the new world of merchandising. Harvard Business Review, 79(10), 133–141.
Gallego, G., & Phillips, R. (2004). Revenue management of flexible products. Manufacturing & Service Operations Management, 6(4), 321–337.
Granados, N., Gupta, A., & Kauffman, R. J. (2008). Designing online selling mechanisms: Transparency levels and prices. Decision Support Systems, 45(4), 729–745.
Jerath, K., Netessine, S., & Veeraraghavan, S. K. (2010). Revenue management with strategic customers: Last-minute selling and opaque selling.
Management Science, 56(3), 430–448.
Jiang, Y. (2007). Price discrimination with opaque products. Journal of Revenue and Pricing Management, 6(2), 118–134.
Khouja, M. (1995). The newsboy problem under progressive multiple discounts. European Journal of Operational Research, 84(2), 458–466.
Lazear, E. P. (1986). Retail pricing and clearance sales. American Economic Review, 76(1), 14–32.
Levy, M., Grewal, D., Kopalle, P. K., & Hess, J. D. (2004). Emerging trends in retail pricing practice: Implications for research. Journal of Retailing, 80(3), xiii–xxi.
Levy, M., & Weitz, B. A. (2004). Retailing management (5th ed.). Boston: McGraw-Hill Irwin.
Mang, S., Post, D., & Spann, M. (2012). Pricing of flexible products. Review of Managerial Science, 6(4), 361–374.
Mantrala, M. K., & Rao, S. (2001). A decision-support system that helps retailers decide order quantities and markdowns for fashion goods. Interfaces, 31(3), S146–S165.
Nair, H. (2007). Intertemporal price discrimination with forward-looking consumers: Application to the US market for console video-games. Quantitative Marketing and Economics, 5(3), 239–292.
Petrick, A., Steinhardt, C., Gonsch, J., & Klein, R. (2012). Using flexible products to cope with demand uncertainty in revenue management. OR Spectrum, 34(1), 215–242.
Petruzzi, N. C., & Monahan, G. E. (2003). Managing fashion goods inventories: Dynamic recourse for retailers with outlet stores. IIE Transactions, 35(11), 1033–1047.
Post, D. (2010). Variable opaque products in the airline industry: A tool to fill the gaps and increase revenues. Journal of Revenue and Pricing Management, 9(4), 292–299.
Ross, J. R. (1997). Inventory management systems cut costs while keeping store shelves full. Stores, 79(7), 78.
Shapiro, D., & Zillante, A. (2009). Naming your own price mechanisms: Revenue gain or drain? Journal of Economic Behavior and Organization, 72, 725–737.
Su, X. (2007).
Intertemporal pricing with strategic customer behavior. Management Science, 53(5), 726–741.
Sullivan, L. (2005). Fine-tuned pricing. Information Week, 1052, 36–41 (Aug 15-Aug 22).
Wang, T., Gal-Or, E., & Chatterjee, R. (2009). The name-your-own-price channel in the travel industry: An analytical exploration. Management Science, 55(6), 968–979.
Wood, C. M., Alford, B. L., Jackson, R. W., & Gilley, O. W. (2005). Can retailers get higher prices for “end-of-life” inventory through online auctions? Journal of Retailing, 81(3), 181–190.
Zouaoui, F., & Rao, B. V. (2009). Dynamic pricing of opaque airline tickets. Journal of Revenue and Pricing Management, 8(2/3), 148–154.

Intern. J. of Research in Marketing 31 (2014) 156–167

Full Length Article

Does retailer CSR enhance behavioral loyalty? A case for benefit segmentation

Kusum L. Ailawadi ⁎, Scott A. Neslin, Y. Jackie Luan, Gail Ayala Taylor
Tuck School of Business, Dartmouth College, 100 Tuck Hall, Hanover, NH 03755, United States

Article info
Article history: First received in 8 November 2011 and was under review for 10 months. Available online 16 October 2013.
Area Editor: Els Gijsbrechts
Keywords: Corporate social responsibility (CSR); Dimensions of CSR; Retail store patronage; Attitude towards store; Loyalty; Share of wallet

Abstract
We study the effects of consumer perceptions of four types of corporate social responsibility (CSR) activities on their behavioral loyalty toward retailers. The four activities are environmental friendliness, community support, selling locally produced products, and treating employees fairly. Behavioral loyalty is measured by share-of-wallet (SOW). We control for other retailer attributes that drive attitudes and SOW, and examine how the market is segmented in terms of consumer response.
We partition the total effect of CSR on SOW into a direct effect and an indirect effect mediated through attitude towards the store. These effects differ by CSR activity and customer segment. The effects on attitude are positive and positive attitude enhances SOW, so the indirect effects on SOW are positive. While we generally find positive total effects, the total effect of one of the CSR activities, environmental friendliness, is significantly negative for one group of consumers. The magnitude of CSR's total impact on SOW is not only statistically significant but also managerially meaningful in an industry where every share point carries a substantial dollar amount. We characterize the customer segments and conclude with implications for how best a retailer can manage its CSR initiatives. © 2013 Elsevier B.V. All rights reserved. 1. Introduction Corporate social responsibility (CSR) refers to a firm's moral, ethical and social obligations beyond its own economic interests (Brown & Dacin, 1997; McWilliams & Siegel, 2001). As CSR gains strategic importance in the eyes of senior management, companies are engaging in a wide range of CSR programs including environmental sustainability, community support, cause-related marketing, and employee enablement. They are investing significantly in publicizing their CSR initiatives in the hope of strengthening relationships with employees, customers, investors, and the broader community. But, as noted by Luo and Bhattacharya (2009) and others, CSR programs compete for resources that can alternatively be channeled to other areas such as innovation or service improvement. Not surprisingly, both academics and practitioners want to determine the returns to CSR investments. The purpose of this paper is to investigate CSR returns by examining the impact on behavioral loyalty, focusing on the retail grocery industry. Prior research has assessed returns to CSR efforts by examining financial performance. 
Despite a large body of empirical research, the jury is still out regarding this question. Most studies use the Kinder, Lydenberg, Domini (KLD) index of corporate social performance to quantify CSR efforts. The majority of these studies show a positive effect and recent work suggests that CSR reduces firm-specific risk (Luo & Bhattacharya, 2009). But some researchers report a substantial number of insignificant and even negative effects, and methodological and theoretical criticisms of the studies abound (see Margolis & Walsh, 2003 and Orlitzky, Schmidt, & Rynes, 2003 for reviews). These mixed results are attributable in part to the fact that CSR has multiple dimensions whose impact varies across industries, stakeholder groups, and individuals within a stakeholder group (Berman, Wicks, Kotha, & Jones, 1999; Hillman & Keim, 2001; Sen & Bhattacharya, 2001). Godfrey and Hatch (2007) and Raghubir, Roberts, Lemon, and Winer (2010) note that there is a need to conduct industry-specific studies and to distinguish between different dimensions of CSR.

⁎ Corresponding author. Tel.: +1 603 646 2845; fax: +1 603 646 1308. E-mail address: kusum.ailawadi@dartmouth.edu (K.L. Ailawadi). http://dx.doi.org/10.1016/j.ijresmar.2013.09.003

One of the firm's most relevant stakeholders is its customers. Social identity theory and consumer–company identification research suggest that consumers should embrace the more positive and distinctive identity of a company that engages in CSR (e.g., Bhattacharya & Sen, 2003; Sen & Bhattacharya, 2001). Thus, customers should reward such companies with greater loyalty, ultimately enhancing the firm's financial value. But research on how customers respond to CSR efforts is more limited.
Consumer polls paint a rosy picture for CSR initiatives, but they suffer from social desirability bias and other validity concerns (see Auger, Burke, Devinney, & Louviere, 2003 and Cotte & Trudel, 2009 for critiques of these polls). Academic work shows that, by and large, consumers exhibit more favorable attitudes towards socially responsible companies (e.g., Du, Bhattacharya, & Sen, 2007; Klein & Dawar, 2004; Lichtenstein, Drumwright, & Braig, 2004; Luo & Bhattacharya, 2006) but there is considerable heterogeneity in response (e.g., Barone, Miyazaki, & Taylor, 2000; Bhattacharya & Sen, 2004; Brown & Dacin, 1997; Sen & Bhattacharya, 2001). Importantly, it is not clear whether these positive effects translate into behavioral loyalty, for example in the form of share of wallet (SOW). Previous research is largely based on laboratory experiments and measures attitudes and intentions rather than actual behavior. Subjects are typically presented with a description of a company's CSR record and then asked about their attitudes and/or purchase likelihood. Given the salience of the CSR information in the experiment, its impact may be overstated compared to the real-life purchase environment in which several other factors – product quality, price, assortment, convenience, etc. – influence choice. Bhattacharya and Sen (2004) note that, while CSR initiatives produce positive company attitudes, this may not translate into greater purchase behavior because consumers are reluctant to trade off CSR for core attributes such as price. This suggests that attitudes may mediate the impact of CSR activities on behavioral loyalty, but CSR activities may have direct effects as well. In addition, the limited external validity of this body of work has led researchers like Sen and Bhattacharya (2001) and Du et al.
(2007) to call for more research based on data collected in actual marketing environments and in the context of competitive offerings. Thus, prior research reveals the need to: (1) distinguish between different dimensions of CSR; (2) study the response of specific stakeholder groups in individual industries; (3) link consumers' CSR perceptions to their behavioral loyalty in addition to attitude; (4) control for other core firm attributes from which consumers derive utility; (5) examine heterogeneity in CSR response across individuals; and (6) study real-world data. To address this need, we study the effects of key CSR activities in the grocery retail industry on behavioral loyalty. We survey consumers in a geographical market to measure their perceptions of CSR and other attributes, as well as overall attitude, with respect to all major grocery retailers in that market, and measure their behavioral loyalty to these retailers. We use these data to specify and estimate a model of behavioral loyalty that allows for attitudes to mediate the impact of CSR, and for heterogeneity in consumer response. We examine four CSR activities: environmental friendliness, community support, selling local products, and treating employees fairly. In sum, we (1) measure the effects of CSR on behavioral loyalty in a field setting while controlling for other drivers of consumer preferences; (2) allow attitudes to mediate these effects; (3) show how these effects differ across key CSR dimensions in an industry that represents a major sector of the economy (U.S. sales of $580 billion in 2010); and (4) investigate how the response to CSR dimensions varies across consumers. The rest of our paper is organized as follows. We first develop our conceptual framework and describe the data used for our analysis. This is followed by a presentation of our results and we conclude our paper with a discussion of the implications for researchers and managers.

2. Conceptual development

Fig.
1 depicts our conceptual model. It allows consumers' perceptions of CSR to influence behavioral loyalty (measured as share of wallet) through overall attitude as well as directly, while incorporating the impact of other retailer attributes that the literature identifies as important influencers of store patronage. We discuss each major element of the framework below, moving from left to right in the figure.

2.1. The dimensions of CSR

The literature generally follows the KLD classification of CSR into six dimensions – employee support, diversity, community support, environment, products, and non-U.S. operations. Bhattacharya and Sen (2004) propose that consumers may respond more positively to CSR initiatives that directly affect their experience with the firm. Bhattacharya, Sen, and Korschun (2008) also note that stakeholders' response depends upon the benefits they themselves derive from the CSR activities. Related to this is the notion that pro-social behavior is motivated by both selfish and selfless altruism, where the ultimate goal of the former is self-benefit with helping others being an instrumental goal, while the ultimate goal of the latter is helping others with self-benefit as an unintended consequence (Batson & Shaw, 1991; Krishna, 2011). Consumer response is also expected to be more positive for initiatives that are integrated into the core positioning of the firm/brand (Du et al., 2007), as long as this does not generate negative perceptions regarding the firm's motives (Barone, Norman, & Miyazaki, 2007). This suggests that dimensions of CSR that only contribute to broad social good and that are less integrated with a retailer's core offering (e.g., those related to the environment or community) should have a less positive effect on consumer loyalty.
In contrast, CSR dimensions that provide both societal and personal benefit and are integrated into a retailer's core offering (e.g., those related to the product or service experience) should have a more positive effect. We examine four CSR activities that are relevant in our empirical context: environmental friendliness, community support, selling local products, and treating employees fairly. The last two relate directly to the customer's shopping experience, while the first two do not. While the four CSR activities can be grouped into customer-experience versus non-customer-experience, all four are quite different. We therefore examine them separately; the results will reveal whether they exert similar or different effects. 2.2. Other retailer attributes Although our focus is on the effect of CSR, we must control for other retailer attributes that affect loyalty and may be correlated with CSR, especially given previous findings regarding consumers' unwillingness to trade off other attributes for CSR (Barone et al., 2000; Luo & Bhattacharya, 2006; Sen & Bhattacharya, 2001). A review of the literature shows that the drivers of retail store image and patronage can be categorized into a few key attributes – price, assortment, product quality, deals, in-store service and social experience, and convenience of location (Ailawadi & Keller, 2004; Baker, Parasuraman, Grewal, & Voss, 2002; Lindquist, 1974; Mazursky & Jacoby, 1986; Verhoef, Neslin, & Vroomen, 2007). We include these in our model. 2.3. Mediation model Consumer perceptions of CSR and other store attributes can affect behavioral loyalty directly or indirectly through overall attitude towards the store. The indirect route is supported by models of consumer decision-making such as the theory of reasoned action (Fishbein & Ajzen, 1975) later broadened into the theory of planned behavior (Ajzen, 1991). However, attitudes may not fully mediate the impact of perceptions on behavior. 
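One common way to write the direct and indirect routes, assuming a simple linear specification (an illustrative assumption, not necessarily the model estimated in this paper), is sketched below. The coefficient symbols $a$, $b$, and $c'$ are hypothetical labels: $a$ is the effect of a CSR perception on attitude, $b$ the effect of attitude on SOW, and $c'$ the direct effect of the CSR perception on SOW.

```latex
\begin{align}
\text{Attitude} &= a \cdot \text{CSR} + \text{(other attributes)} \\
\text{SOW} &= c' \cdot \text{CSR} + b \cdot \text{Attitude} + \text{(other attributes)} \\
\underbrace{c' + a\,b}_{\text{total effect}} &= \underbrace{c'}_{\text{direct effect}} + \underbrace{a\,b}_{\text{indirect effect}}
\end{align}
```

Under this sketch, positive $a$ and $b$ imply a positive indirect effect, yet the total effect can still be zero or negative when $c'$ is sufficiently negative.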
Perceptions of a store's CSR activities may influence behavior not just because of what CSR says about the store (as would be measured by overall store attitude), but also because of what it says about oneself (e.g., the social identity literature cited earlier). Also, social scientists have identified an “automaticity” effect, whereby behavior may be induced by cues in the environment without a conscious thought process (Bargh, Chen, & Burrows, 1996). The social atmosphere in the store or CSR activities might serve as such cues, evoking a categorization/stereotype that compels the consumer to shop at or avoid a store (Bitner, 1992). Similarly, a convenient location or special deals may directly cause a consumer to shop at that store, without the elaborate thought process assumed by the formation of overall attitude. In summary, the total effect of CSR and other store attributes on behavioral loyalty comprises an indirect effect (mediated by overall attitude toward the store) and a direct effect. Since prior research shows a positive effect of CSR on attitude and attitude is positively correlated with behavior, albeit weakly, we expect a positive indirect effect. However, the direct effect may well be negative if, as some have suggested, the total effect of CSR on behavior is not positive. The magnitude of the total effect and its decomposition into the direct and indirect components are empirical questions that we will investigate.

Fig. 1. The impact of corporate social responsibility on behavioral loyalty. Boxes in the figure: Consumer Heterogeneity (CSR-Corporate Ability Belief, CSR-Costs Belief, Demographics, Grocery Budget); Retailer CSR (Environmental Friendliness, Community Support, Local Products, Employee Fairness); Other Retailer Attributes (Assortment selection, Product quality, Price, Deals, In-store service, In-store social environment, Location convenience); Attitude Towards the Store; Behavioral Loyalty (Share of Wallet).

2.4. Heterogeneity in consumer response

As noted previously, researchers have found considerable variation in consumers' response to CSR. This may be due to how much consumers personally believe in the activity (Sen & Bhattacharya, 2001), whether they believe that CSR impinges on a company's corporate abilities (Brown & Dacin, 1997), and how much importance they place on other aspects of the company's core offering, such as price and service (Bhattacharya & Sen, 2004). Also, research has shown that consumers vary in the value they place on other store attributes, e.g., how much they are willing to engage in price search (e.g., Talukdar, Gauri, & Grewal, 2010; Urbany, Dickson, & Kalapurakal, 1996). Thus, our conceptual model allows for heterogeneity in the response to CSR as well as the other store attributes.

3. Method

3.1. Sample

Our data come primarily from a survey administered to customers of a retail grocery chain located in the northeastern U.S. This “focal” retailer positions itself strongly as a socially responsible retailer. With the retailer's cooperation, we mailed a letter to its approximately 16,000 active loyalty program members (i.e., those who made at least one purchase at the retailer in the previous 6 months) inviting them to participate in the survey that could be completed online or on paper. Paper copies were made available and collected at all of the retailer's stores. The purpose of the survey was introduced in general terms (“to better understand and serve the needs of customers”) without mentioning CSR or any other specific area. It was made clear that the project was being conducted by a team of academic researchers at a nearby university. A lottery of ten gift certificates worth $100 each, redeemable at area businesses, was used to encourage participation.
In total, 2884 responses were obtained during the 1-month period when the survey was live, representing a response rate of about 18%. Note that the sampling frame consists of loyalty program members. However, 77% of the total sales by the focal retailer are to members of its program so this is a highly relevant sampling frame for studying the focal firm's customers. 3.2. Measures The survey comprised four main sections. The first section collected information on the respondent's share of wallet (hereafter SOW), measured as percentage of total grocery spending in the past 6 months with the focal retailer as well as the seven major competing retailers in the area. We also allowed the respondent to indicate “other stores” not listed in the survey. The median (mean) SOW for these other stores is 0% (9.3%), indicating that the eight retailers included in our study account for most of the respondents' grocery spending. The second section asked for respondents' perceptions of the focal retailer on the key attributes identified in the retailing and store image literature (such as product quality, price, and in-store service) and the four CSR dimensions identified earlier, as well as their overall attitude towards the retailer. Items for all constructs used a five-point scale, and the ordering of the items was randomized across respondents. The third section asked for respondents' perceptions of a second store on the same items. In the online version of the survey, the second store was randomly generated among the competing stores that received at least 10% SOW from the respondent (this section was skipped if no competing stores receive more than 10% SOW). This ensured that the respondent had some familiarity with the second store being evaluated. 
In the paper version of the survey, the identity of the second store was randomized across multiple versions of the questionnaire, and the respondent was instructed to skip this section if he or she was unfamiliar with the particular store. The last section of the survey gathered self-reported importance of various retailer attributes and standard demographic and psychographic information. To ensure variation not just across but also within respondents, we retained only respondents who rated two stores. After responses with missing data were discarded, our final sample consisted of 3492 observations from 1746 respondents. To assess how representative our sample is of the sampling frame, we compared it to the focal retailer's population of active loyalty program members. Table 1 summarizes this comparison and shows that our respondents are not very different from the population of loyalty program members in terms of duration of program membership, total spending, and number of trips during the 6 months preceding the study. Table 2 presents measures of each variable, along with descriptive statistics and, for multi-item variables, reliabilities.1 As noted previously, we selected the store attributes based on the retailing and store image literature. We fine-tuned our selection based on qualitative interviews with four managers from the focal retailer and a convenience sample of fifteen consumers who were familiar with most of the stores in our study. The interviews led us to include not just the size of a retailer's assortment but also the extent to which the retailer offers unique items not available elsewhere. They also led us to measure the social experience/clientele aspects of the store through two separate attributes, i.e., how much a consumer feels they have in common with the clientele and how wealthy they perceive the clientele to be. 3.3.
Data quality SOW is of central interest so we first compare the self-reported SOW with actual spending compiled from the focal retailer's customer database.2 The correlation between respondents' self-reported SOW at the focal retailer and their actual spending there over 6 months prior to survey administration is 0.61. We also computed respondents' SOW at the focal retailer from their actual spending and the weekly total grocery budget they reported in the survey. The correlation between computed and self-reported SOW is 0.71. These correlations are much larger than typically reported between perceived and objective measures (e.g., Bommer, Johnson, Rich, Podsakoff, & Mackenzie, 1995), and suggest that self-reported SOW has strong convergent validity. Table 3 provides descriptive statistics of attribute ratings across retailers. The table shows substantial variation in mean ratings both within and across retailers. As the focal store positions itself on CSR and communicates its CSR activities via the quarterly newsletter to program members, its website, and in-store signage, it is not surprising that it rates highest on these dimensions. It also stands out in carrying unique items, product quality, in-store service, and assortment. However, it is rated poorly on price and promotions, and is perceived as having a wealthy clientele. Thus, consumers don't uniformly rate it positively or negatively on all attributes, alleviating concerns about social desirability and halo effects. Ratings of other retailers also show substantial variation across attributes and they generally have face validity. For example, retailers F and G position themselves strongly on price and consumers' mean price perceptions are in line with this. Similarly, retailers D and F are discount/big box stores and consumers' perceptions of these stores as offering attractive prices but being low on assortment and in-store service are in accordance with this. 
Finally, retailer H, which receives the poorest ratings, was struggling and closed its store in the year following our study. Overall, the pattern suggests that respondents rated the stores realistically and alleviates concerns about halo effects.

1 We adapted existing measures where possible (e.g., Baker et al., 2002; the GfK consumer surveys used by van Heerde, Gijsbrechts, & Pauwels, 2008; Verhoef et al., 2007) and developed and pretested others. While multi-item measures may have been desirable for all constructs (but see Bergkvist and Rossiter (2007) for findings to the contrary), survey length was an issue given the number of constructs and the need for respondents to rate two retailers. Therefore, we used single items for some variables.
2 The retailer estimates that well over 90% of members' purchases are captured in their database.

Table 1
Comparison of sample with member population.

Variable                               Sample mean (std. deviation)   Population mean# (std. deviation)
Total spending in last 6 months ($)    1549 (1597)                    1631 (1976)
Number of trips in last 6 months       38 (42)                        35 (44)
Number of months as member             145 (118)                      147 (121)

# The population is the full set of active loyalty program members in the focal retailer's database.

As an additional check for halo effects, we examined the percentage of observations that showed high (4 or 5 on the 5-point scale) ratings for all attributes, and also what percentage of respondents who rated a retailer high on CSR also rated the retailer high on other attributes. We found that the observations with highly positive ratings on all attributes comprise less than 1% of the sample. Further, in terms of CSR versus other attributes, only a minority rated a given store high on both, depending on the CSR and other attribute involved. For example, fewer than 50% rated a retailer high on both environmental friendliness and quality. Only 5% rated a retailer high on both environmental friendliness and price.
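The halo check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings are made-up toy data, the attribute keys are shortened labels, and the helper names `share_all_high` and `share_both_high` are hypothetical rather than taken from the study.

```python
HIGH = 4  # ratings of 4 or 5 on the 5-point scale count as "high"


def share_all_high(observations):
    """Fraction of observations in which every attribute is rated high."""
    flags = [all(r >= HIGH for r in obs.values()) for obs in observations]
    return sum(flags) / len(flags)


def share_both_high(observations, attr_a, attr_b):
    """Fraction of observations rated high on both attr_a and attr_b."""
    flags = [obs[attr_a] >= HIGH and obs[attr_b] >= HIGH for obs in observations]
    return sum(flags) / len(flags)


# Toy data (hypothetical): one dict per respondent-store observation.
toy = [
    {"env": 5, "quality": 5, "price": 2},
    {"env": 4, "quality": 3, "price": 4},
    {"env": 2, "quality": 4, "price": 5},
    {"env": 5, "quality": 4, "price": 4},
]

print(share_all_high(toy))                     # share rated high on everything
print(share_both_high(toy, "env", "quality"))  # CSR dimension vs. quality
print(share_both_high(toy, "env", "price"))    # CSR dimension vs. price
```

A small share of all-high observations, together with pairwise shares that vary by attribute, is the pattern the text interprets as little evidence of halo effects.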
Finally, we interviewed five grocery retail experts in the area and asked them to rate the stores (excluding their own) on the key attributes in our study. Overall, their ratings are consistent with those of our sample. For example, all of them rated the focal retailer highest on CSR attributes, assortment, quality, unique items, and wealthy clientele and worst on price and promotions, although the difference in ratings of the focal retailer and the next best was very small on assortment and quality. In summary, the SOW measure exhibits strong convergent validity, the mean attribute ratings have good face validity as well as discriminant validity, and there is little evidence of halo effects. In addition, concerns about common method bias are alleviated (Rindfleisch, Malter, Ganesan, & Moorman, 2008) because (a) our key dependent variable, SOW, precedes the independent variables in the survey; (b) the order of items relating to CSR, overall attitude, and all other store attributes is randomized across respondents; and (c) SOW is measured using a different measurement scale than that used for other retailer attributes. 3.4. Model The framework in Fig. 1 translates to a model with two equations, one for attitude and one for SOW. Both equations include the CSR dimensions and other store attributes as independent variables; the SOW equation also includes attitude. All variables are mean-centered relative to their grand means in the full sample. This does not affect estimates when there are no interactions and makes it easy to interpret main effects when interactions are subsequently included as a robustness check. Unobserved heterogeneity in parameters across consumers can be incorporated by either the continuous (e.g., Gönül & Srinivasan, 1993) or the finite mixture method (e.g., Kamakura & Russell, 1989). We use the latter while accounting for the dependence between observations from the same respondent in a panel data model. 
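The remark on grand-mean centering can be checked numerically. The sketch below uses made-up numbers and a hypothetical helper, `ols_simple`, for a one-predictor least-squares fit; it is not the authors' estimation code, and for brevity it centers only the predictor. It illustrates that, without interaction terms, centering shifts the intercept but leaves the slope unchanged.

```python
def ols_simple(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def grand_mean_center(values):
    """Subtract the grand mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Toy data (hypothetical): a CSR rating and share of wallet for five cases.
csr = [2.0, 3.0, 4.0, 5.0, 1.0]
sow = [20.0, 35.0, 40.0, 55.0, 10.0]

a_raw, b_raw = ols_simple(csr, sow)
a_ctr, b_ctr = ols_simple(grand_mean_center(csr), sow)

print("slopes:", b_raw, b_ctr)      # identical slopes
print("intercepts:", a_raw, a_ctr)  # only the intercept moves
```

After centering, the intercept equals the mean of the dependent variable, which is what makes main effects easy to read once interactions are added.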
Our choice of the finite mixture (also called latent class) model is dictated by the fact that we are interested not just in controlling for heterogeneity but in identifying actionable consumer segments whose size and preferences provide important managerial insights. Since the latent segments are formed based on response to all model variables, the segment-level parameter estimates characterize the consumer segments not only in terms of how they respond to CSR activities but also in terms of the value they place on other retailer attributes. In addition, per Fig. 1, we use demographics and other consumer characteristics to explain the probability of belonging to a given segment.

Table 2
Measurement of model variables.

Variable                                   Mean   SD     Survey items (notes a, b)
Dependent variables
Attitude (ATT) (α = 0.88)                  3.64   1.02   I consider myself a loyal customer at Retailer A. I would recommend Retailer A to my friends. I would go out of my way to shop at Retailer A.
Behavioral loyalty (SOW)⁎                  35.9   26.4   In the last 6 months, what percentage of your grocery spending was in Retailer A? (0–100%)
CSR
Environmental Friendliness (CSREnv)        3.62   1.19   I believe that Retailer A has environmentally friendly policies.
Community Support (CSRCom)                 3.70   1.26   I believe that Retailer A cares about the local community.
Local Products (CSRLoc)                    3.32   1.47   I believe that Retailer A offers a large selection of local products.
Employee Fairness (CSREmp)                 3.47   0.95   I believe that Retailer A treats employees fairly.
Other retailer attributes
Price (Price) (α = 0.66)                   2.92   0.94   I can get the same items at lower prices in other stores than Retailer A. Prices at Retailer A are good compared to other stores. (reverse coded)
Quality (Qual) (α = 0.90)                  3.93   1.01   I am confident in the quality of products at Retailer A. The quality of products sold at Retailer A is high.
Deals (Deal) (α = 0.71)                    3.64   0.79   There are special deals available on many products at Retailer A. When items are on sale at Retailer A, the discounts are deep.
In-store service (Instor) (α = 0.79)       3.73   0.95   The atmosphere at Retailer A is pleasant. Help is always available when I need it at Retailer A. It is easy to find things at Retailer A.
Assortment selection (Assort)              3.89   1.01   Retailer A offers a big selection of items in many product categories.
Unique items (Unique)                      3.50   1.36   I can find unique products at Retailer A that are not available elsewhere.
Similar shoppers (Similar)                 3.35   0.99   I have a lot in common with others who shop at Retailer A.
Wealthy shoppers (Wlthy)                   3.10   1.29   Shoppers at Retailer A tend to be wealthier than at other stores.
Location convenience (Convloc)             3.61   1.32   Retailer A's location is convenient for me.
Consumer characteristics
Age                                        0.24   0.43   Age-High = 1 if age greater than 65, 0 otherwise
Income                                     0.31   0.47   Income-High = 1 if household income is greater than $100 K, 0 otherwise
                                           0.23   0.42   Income-No Report = 1 if respondent prefers not to report income, 0 otherwise
Education                                  0.46   0.50   Educ-High = 1 if more than college graduation, 0 otherwise
CSR-ability belief (CSR-ability)           2.37   1.04   Environmental and social responsibility makes it difficult for companies to best serve their customers.
CSR-cost belief (CSR-Cost)                 3.66   0.87   Environmental and social responsibility programs increase a company's costs.
Price importance                           3.47   0.99   How important is everyday price when you decide where to do most of your grocery shopping?
Quality importance                         4.35   0.72   How important is product quality when you decide where to do most of your grocery shopping?
Service importance                         3.30   1.02   How important is in-store service when you decide where to do most of your grocery shopping?
CSR importance                             3.44   1.12   How important is environmental and social responsibility when you decide where to do most of your grocery shopping?
Seek local                                 4.08   0.90   I seek out locally grown and locally produced foods.

a All items except SOW are measured on a 5-point scale with 5 = “strongly agree” or “extremely important” and 1 = “strongly disagree” or “not at all important”.
b In the survey, “Retailer A” is replaced by each retailer's actual name.

The equation for attitude is:

\[
\begin{aligned}
\mathit{Att}_{ir} ={}& \delta_{c0} + \delta_{c1}\,\mathit{CSREnv}_{ir} + \delta_{c2}\,\mathit{CSRCom}_{ir} + \delta_{c3}\,\mathit{CSRLoc}_{ir} + \delta_{c4}\,\mathit{CSREmp}_{ir} \\
&+ \delta_{c5}\,\mathit{Price}_{ir} + \delta_{c6}\,\mathit{Assort}_{ir} + \delta_{c7}\,\mathit{Unique}_{ir} + \delta_{c8}\,\mathit{Qual}_{ir} + \delta_{c9}\,\mathit{Deal}_{ir} + \delta_{c10}\,\mathit{Instor}_{ir} \\
&+ \delta_{c11}\,\mathit{Similar}_{ir} + \delta_{c12}\,\mathit{Wlthy}_{ir} + \delta_{c13}\,\mathit{Convloc}_{ir} + \eta_{ir}
\end{aligned}
\tag{1}
\]

where the variables are as defined in Table 2; the subscripts refer to consumer i and retailer r, and c = {1,2,…,C} indicates the latent class or segment. The prior probability that consumer i belongs to segment c is given by:

\[
\mathrm{Prob}(i = c) = \frac{\exp\left(Z_i'\gamma_c\right)}{\sum_{c'=1}^{C} \exp\left(Z_i'\gamma_{c'}\right)}
\tag{2}
\]

where Z_i is a vector of consumer i's characteristics comprising income, education, age, average weekly grocery expenditure, belief that CSR makes it difficult for a firm to effectively serve its customers, and belief that CSR raises a firm's costs. Complete definitions are listed in Table 2.

The equation for SOW is the same as the attitude equation, except that Att_{ir} is included on the right-hand side of the equation:

\[
\begin{aligned}
\mathit{SOW}_{ir} ={}& \beta_{s0} + \beta_{s1}\,\mathit{CSREnv}_{ir} + \beta_{s2}\,\mathit{CSRCom}_{ir} + \beta_{s3}\,\mathit{CSRLoc}_{ir} + \beta_{s4}\,\mathit{CSREmp}_{ir} \\
&+ \beta_{s5}\,\mathit{Price}_{ir} + \beta_{s6}\,\mathit{Assort}_{ir} + \beta_{s7}\,\mathit{Unique}_{ir} + \beta_{s8}\,\mathit{Qual}_{ir} + \beta_{s9}\,\mathit{Deal}_{ir} + \beta_{s10}\,\mathit{Instor}_{ir} \\
&+ \beta_{s11}\,\mathit{Similar}_{ir} + \beta_{s12}\,\mathit{Wlthy}_{ir} + \beta_{s13}\,\mathit{Convloc}_{ir} + \beta_{s14}\,\mathit{Att}_{ir} + \varepsilon_{ir}
\end{aligned}
\tag{3}
\]

Here s = {1,2,…,S} indicates the latent class or segment and the prior probability that consumer i belongs to segment s is given by:

\[
\mathrm{Prob}(i = s) = \frac{\exp\left(Z_i'\gamma_s\right)}{\sum_{s'=1}^{S} \exp\left(Z_i'\gamma_{s'}\right)}
\tag{4}
\]

We estimate the attitude and SOW models separately, using the concomitant variable latent class approach (Greene, 2003; Wedel & Kamakura, 2000), and select the number of latent segments using the Bayesian Information Criterion (BIC).3

Table 3
Descriptive statistics across retailers. Entries are mean (std. error) for each retailer.

Variables                     A (n = 1746)   B (n = 416)    C (n = 575)    D (n = 273)    E (n = 176)    F (n = 167)    G (n = 98)     H (n = 41)
Environmental friendliness    4.55 (.02)     2.74 (.04)     2.76 (.03)     2.47 (.05)     3.02 (.07)     2.34 (.07)     2.90 (.08)     2.37 (.13)
Community support             4.68 (.02)     2.74 (.04)     2.81 (.04)     2.39 (.05)     3.06 (.08)     2.41 (.08)     3.19 (.08)     2.51 (.16)
Local products                4.54 (.02)     2.18 (.04)     2.36 (.04)     1.49 (.04)     2.58 (.07)     1.46 (.05)     2.38 (.10)     1.90 (.15)
Employee fairness             4.01 (.02)     2.99 (.03)     3.02 (.02)     2.88 (.04)     3.09 (.05)     2.34 (.07)     3.17 (.07)     2.78 (.14)
Price                         3.45 (.02)     2.67 (.04)     2.37 (.03)     2.22 (.05)     2.81 (.06)     2.02 (.06)     1.71 (.07)     2.74 (.11)
Assortment selection          4.11 (.02)     3.87 (.04)     4.01 (.03)     2.84 (.07)     3.66 (.07)     3.48 (.08)     4.06 (.09)     2.83 (.20)
Unique items                  4.58 (.02)     2.52 (.05)     2.64 (.04)     2.55 (.07)     2.64 (.08)     2.19 (.08)     2.75 (.12)     1.81 (.14)
Product quality               4.67 (.01)     3.17 (.04)     3.30 (.03)     3.16 (.05)     3.42 (.07)     2.57 (.07)     3.54 (.09)     2.67 (.15)
Deals                         3.50 (.02)     3.77 (.03)     3.95 (.03)     3.69 (.05)     3.35 (.05)     3.76 (.07)     4.21 (.07)     3.42 (.12)
In-store service              4.31 (.02)     3.27 (.04)     3.26 (.03)     2.73 (.05)     3.46 (.07)     2.61 (.07)     3.64 (.08)     2.99 (.15)
Similar shoppers              3.74 (.02)     2.98 (.04)     3.05 (.04)     2.73 (.05)     3.34 (.06)     2.58 (.07)     3.08 (.10)     2.63 (.16)
Wealthy shoppers              4.10 (.02)     2.17 (.04)     2.14 (.03)     2.06 (.05)     2.60 (.06)     1.56 (.06)     1.84 (.07)     1.76 (.16)
Location convenience          3.86 (.03)     3.36 (.06)     3.36 (.05)     3.06 (.07)     4.24 (.08)     2.98 (.10)     3.09 (.14)     3.78 (.23)
Attitude                      4.20 (.02)     2.84 (.04)     3.16 (.04)     3.32 (.05)     3.01 (.07)     2.75 (.08)     3.88 (.09)     2.36 (.17)
SOW (among raters of store)   39.72 (.67)    31.10 (1.12)   36.64 (1.02)   19.68 (.77)    43.35 (2.17)   20.92 (1.25)   45.54 (3.05)   25.39 (3.38)
SOW (full sample n = 1746)    39.72 (.67)    12.04 (.46)    18.92 (.57)    6.32 (.27)     6.72 (.43)     4.49 (.24)     4.18 (.35)     1.20 (.15)

4.
Results

Table 4 presents correlations between key variables in our model. The CSR dimensions are highly correlated as would be expected. There are also strong correlations between CSR dimensions and some other retailer attributes. This underscores the importance of controlling for the latter to ensure that the estimated effects of CSR are not biased due to omitted variables. The strong correlations also suggest the possibility of multicollinearity, so we computed variance inflation factors (VIFs) and condition indices for all the variables in our model before proceeding further. The highest VIFs are for the CSR dimensions, but even these are all less than 5, well below levels of 10 or higher that are considered problematic. The condition number for the model variables is only 6.2, well below the ad hoc standard of 30 that is often used. Next, we ensured that our mediation model is supported by the data (Baron & Kenny, 1986; Zhao, Lynch, & Chen, 2010). First we estimated equations for the effects of store attributes on attitude and SOW, and found significant effects of most attributes in both equations. Then, we added attitude to the SOW model and found that most attributes continued to have significant direct effects on SOW after controlling for attitude, with a significant change in their magnitude. Thus, the data support partial mediation of the effects of store attributes on SOW by attitude.

3 Bucklin, Gupta, and Siddarth (1998) note that the trade-off between separate and joint estimation is one of better fit for the former vs. parsimony (fewer segments) for the latter. The better fit for the separate approach is due to its flexibility. A customer may be in attitude segment 1 and then be in either SOW segment 1 or 2. This is in contrast to the joint approach, which forces all customers who are in attitude segment 1 to be in SOW segment 1. We opted for the separate approach because we are interested in segmentation and therefore flexibility is important.

4.1.
Attitude model

According to the BIC, two segments provided the best fit for the attitude model, with Segment 2 being the majority segment (64.4% versus 35.6% of the sample). Segment-level parameters for the store attributes are provided in the first two columns of Table 5. The CSR coefficients all have positive signs and are statistically significant in four out of eight cases. This confirms that, by and large, CSR improves customer attitudes toward the store. Segments 1 and 2 differ interestingly in the emphasis they put on various CSR activities. Segment 1 places more emphasis on environmental friendliness whereas Segment 2 places more emphasis on employee fairness. The segments are similar in their attitude response to community support and local products. Both value the former and neither has an attitudinal response to the latter. Both segments have the expected signs for other store attributes, differing only in the magnitude of some effects. Perhaps the most important difference between them is that Segment 2 places more emphasis on promotional deals and on price and less emphasis on unique items and quality. In addition, it is helpful to characterize segments in terms of consumer characteristics like demographic variables and to see if their CSR response is related to beliefs about how CSR affects corporate ability (Brown & Dacin, 1997; Sen & Bhattacharya, 2001). The bottom panel of Table 5 provides the effects of these concomitant variables on the probability of belonging to Segment 1 versus Segment 2. We find that higher age and more education increase the likelihood of being in Segment 1 and belief that CSR limits a firm's ability to effectively serve its customers decreases the likelihood of being in Segment 1.

Table 4
Correlations among model variables.

                     EnFr   CmSup  LocP   EmFair Price  Assor  Uniq   Qual   Deal   Serv   Sim    Wlth   LCon   Att    SOW
Env. friendliness    1
Comm. support        .82    1
Local products       .81    .81    1
Employ. fairness     .65    .63    .61    1
Price                .40    .37    .45    .25    1
Assort. selection    .31    .34    .33    .27    −.03   1
Unique items         .71    .73    .75    .52    .40    .33    1
Prod. quality        .78    .79    .78    .61    .32    .39    .72    1
Deals                −.00   .01    −.05   .06    −.42   .30    −.02   .05    1
In-store service     .72    .71    .72    .62    .22    .42    .61    .76    .16    1
Similar shoppers     .49    .50    .48    .43    .13    .32    .43    .53    .14    .53    1
Wlthy. shoppers      .68    .66    .74    .54    .56    .18    .65    .65    −.19   .55    .38    1
Loc. conv.           .24    .24    .23    .20    .15    .05    .15    .24    .03    .28    .23    .21    1
Attitude             .64    .65    .61    .54    .03    .37    .59    .73    .23    .69    .54    .45    .21    1
Share of wallet      .21    .21    .22    .21    −.10   .17    .11    .23    .16    .31    .30    .10    .30    .44    1

Table 5
Attitude and share-of-wallet models with main effects.

                        Attitude                              Share of wallet                       Share of wallet mediated
Independent variable    Seg 1 (34.2%)     Seg 2 (65.8%)       Seg 1 (27.7%)      Seg 2 (72.3%)      Seg 1 (25.4%)       Seg 2 (74.6%)
Store attributes:
Envir. friendliness     .173*** (.030)    .005 (.029)         2.681*** (.985)    −1.287 (.822)      1.944** (.988)      −1.961*** (.765)
Community support       .098*** (.026)    .078*** (.025)      1.864* (.986)      −1.069 (.699)      .767 (.969)         −1.774*** (.659)
Local products          .017 (.025)       .016 (.022)         7.039*** (.744)    1.218* (.681)      7.246*** (.766)     1.365** (.654)
Employee fairness       .004 (.029)       .078*** (.021)      .853 (.819)        1.800** (.745)     .796 (.799)         .905 (.691)
Price                   −.119*** (.028)   −.296*** (.021)     .266 (.803)        −5.616*** (.662)   1.570* (.806)       −2.584*** (.655)
Assortment selection    .015 (.022)       −.016 (.016)        −1.252* (.666)     .938 (.575)        −.880 (.632)        .962* (.546)
Unique items            .147*** (.020)    .074*** (.018)      2.288*** (.705)    −3.564*** (.560)   1.577** (.695)      −4.584*** (.524)
Product quality         .382*** (.036)    .287*** (.026)      .446 (1.060)       −.071 (.939)       −3.578*** (1.122)   −3.119*** (.902)
Deals                   −.028 (.030)      .139*** (.023)      −1.007 (.845)      1.094 (.754)       −2.074** (.840)     .100 (.708)
In-store service        .156*** (.032)    .194*** (.024)      1.672 (1.059)      4.590*** (.865)    .905 (1.059)        2.182** (.868)
Similar shoppers        .062** (.025)     .149*** (.017)      .601 (.720)        4.007*** (.577)    −.364 (.750)        2.552*** (.546)
Wealthy shoppers        .030 (.023)       −.025 (.020)        3.633*** (.734)    −1.865*** (.614)   3.398*** (.718)     −1.524*** (.580)
Location conven.        .023 (.014)       .015 (.012)         1.659*** (.432)    4.566*** (.376)    1.253*** (.432)     4.541*** (.358)
Attitude                –                 –                   –                  –                  8.649*** (.966)     11.096*** (.769)
Concomitant variables#:
CSR-ability belief      −.307*** (.099)   –                   −.358*** (.078)    –                  −.321*** (.083)     –
CSR-costs belief        −.019 (.107)      –                   .033 (.086)        –                  .031 (.092)         –
Education-high          .593*** (.194)    –                   .372** (.149)      –                  .337** (.160)       –
Income-high             .029 (.238)       –                   .079 (.183)        –                  .044 (.195)         –
Income-no report        .317 (.243)       –                   .444*** (.186)     –                  .399** (.198)       –
Age-high                .662*** (.231)    –                   .188 (.178)        –                  .215 (.187)         –
Wkly grocery spend      .054 (.172)       –                   .478*** (.131)     –                  .550*** (.140)      –

Standard errors are in parentheses.
# Effect of concomitant variables on probability of membership in segment 1 versus 2.
*** p < 0.01. ** p < 0.05. * p < 0.10.

4.2. SOW model

The BIC supports a two-segment solution for the SOW model also. Table 5 shows the non-mediated SOW model (i.e., without attitude) in columns 3 and 4, and the mediated SOW model (i.e., with attitude) in columns 5 and 6. Contrasting the two models yields two important conclusions. First, attitude is clearly important. Its coefficient is positive and highly significant in both segments. Second, many CSR and other store attributes remain statistically significant in the mediated SOW model, showing that attitude partially mediates the relationship between attributes and SOW. Thus, we use the mediated SOW model in the remainder of our discussion. The two segments in this SOW model differ substantially in the direct effects of CSR and some other attributes like price, unique items, deals, and wealthy shoppers.
The first segment shows more positive direct effects of CSR, unique items and wealthy shoppers, a more negative direct effect of deals, and an insignificant direct effect of price. In terms of the concomitant variables, customers who are highly educated, bigger spenders, and who refuse to report income are more likely to be in Segment 1.4 It is important to point out that interpreting the direct CSR/attribute coefficients of the mediated SOW model in isolation is not particularly relevant. Managerially, what is of interest is the total effect, i.e., the impact of a change in an attribute, e.g., a CSR dimension, on SOW. This total effect is given by

\[
\frac{d\,\mathit{SOW}}{d\,\mathit{CSR}} = \frac{\partial\,\mathit{SOW}}{\partial\,\mathit{CSR}} + \frac{\partial\,\mathit{SOW}}{\partial\,\mathit{Att}} \times \frac{\partial\,\mathit{Att}}{\partial\,\mathit{CSR}}.
\]

The first term is the direct effect of the CSR dimension; the second is the indirect effect of CSR on SOW through the mediating attitude variable. The decomposition of the total effect shows it is possible for CSR to inspire positive attitudes but little behavioral action. A positive effect of CSR on attitude, together with a positive effect of attitude on behavior, makes the \(\frac{\partial\,\mathit{SOW}}{\partial\,\mathit{Att}} \times \frac{\partial\,\mathit{Att}}{\partial\,\mathit{CSR}}\) term positive. Therefore, for the total effect on behavior to be non-positive, the direct effect, \(\frac{\partial\,\mathit{SOW}}{\partial\,\mathit{CSR}}\), has to be negative. In the next section, we calculate the total effects and interpret these in conjunction with the coefficients reported in Table 5. Since the total effects combine coefficients from the attitude model \(\left(\frac{\partial\,\mathit{Att}}{\partial\,\mathit{CSR}}\right)\) and the mediated SOW model \(\left(\frac{\partial\,\mathit{SOW}}{\partial\,\mathit{Att}}, \frac{\partial\,\mathit{SOW}}{\partial\,\mathit{CSR}}\right)\), they depend on which segment (1 or 2) the consumer belongs to in the attitude model and the mediated SOW model.

4 It may seem odd for a “missing data” code to have strong explanatory power. However, in light of the other features of Segment 1, “Income-no report” probably means that the customer is higher income. The importance of missing data codes such as represented by the income-no report variable is consistent with Blattberg et al. (2008, p. 307), who recommend the use of missing-variable coding in database marketing models.

4.3. Total effect of CSR activities on SOW

First, we classify each consumer in Attitude Segment 1 or 2 and in SOW Segment 1 or 2, using the posterior probabilities of segment membership.5 This generates four groups of consumers: Group 1/1 comprising consumers who are in Attitude Segment 1 and SOW Segment 1, Group 2/1 comprising consumers who are in Attitude Segment 2 and SOW Segment 1, and so on. Next, we compute the total effect of each model variable (i.e., the total number of units by which SOW changes for a one unit increase in the variable) in each of the four groups, using parameter estimates from the corresponding Attitude and SOW segments. Finally, we use bootstrapping to obtain standard errors for these total effects (Efron & Tibshirani, 1993) in each of the four groups. For each group, we draw 500 bootstrap samples with replacement, estimate our models and compute the total effects of each model variable for each sample. The standard deviation of a given total effect across the bootstrapped samples is its standard error. Table 6 provides the total effects and standard errors for all four groups. The first takeaway is that the two attitude segments overlap only partially with the two SOW segments. The largest portion of the sample is in Group 2/2 but a significant proportion of consumers is in Groups 2/1 and 1/2. This underscores the advantage of not forcing a single segmentation scheme for attitude and SOW. The second takeaway is that total SOW returns differ by group and CSR dimension, underscoring the importance of segmentation. Third, of the 16 CSR total effects, 10 are statistically significant – nine with a positive sign and one with a negative sign. Clearly CSR exerts an important impact on SOW.
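The total-effect calculation can be sketched as follows. The numbers are the CSR coefficients and attitude coefficients reported in Table 5 (attitude model and mediated SOW model); the dictionary layout, the short dimension keys, and the function name `total_effect` are illustrative assumptions, not the authors' code. The bootstrapped standard errors are not reproduced here.

```python
# Effect of each CSR dimension on attitude, by attitude segment (Table 5).
ATT = {
    1: {"env": 0.173, "com": 0.098, "loc": 0.017, "emp": 0.004},
    2: {"env": 0.005, "com": 0.078, "loc": 0.016, "emp": 0.078},
}

# Direct effect of each CSR dimension on SOW, by SOW segment (mediated model).
SOW_DIRECT = {
    1: {"env": 1.944, "com": 0.767, "loc": 7.246, "emp": 0.796},
    2: {"env": -1.961, "com": -1.774, "loc": 1.365, "emp": 0.905},
}

# Effect of attitude on SOW, by SOW segment (mediated model).
SOW_ATT = {1: 8.649, 2: 11.096}


def total_effect(att_seg, sow_seg, dim):
    """Direct effect on SOW plus (effect on attitude) x (effect of attitude on SOW)."""
    direct = SOW_DIRECT[sow_seg][dim]
    indirect = SOW_ATT[sow_seg] * ATT[att_seg][dim]
    return direct + indirect


# One group per combination of attitude segment and SOW segment.
for att_seg in (1, 2):
    for sow_seg in (1, 2):
        effects = {
            dim: round(total_effect(att_seg, sow_seg, dim), 3)
            for dim in ("env", "com", "loc", "emp")
        }
        print(f"Group {att_seg}/{sow_seg}: {effects}")
```

For Group 1/1, for example, the environmental friendliness effect is 1.944 + 8.649 × 0.173 ≈ 3.440, and for Group 2/2 it is −1.961 + 11.096 × 0.005 ≈ −1.906, which is how a positive attitude effect can coexist with a negative total effect.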
Groups 1/1 and 2/1 are relatively small (11.6% and 13.8% of the sample, respectively). However, CSR is very important to these consumers – significant and positive in seven of eight cases. Both of these groups place very high emphasis on local products: a one-point increase in local products perceptions garners over seven SOW points. They also respond similarly to community support: a one-point increase in community support garners 1.62 SOW points for Group 1/1 and 1.44 SOW points for Group 2/1. With respect to the remaining two CSR dimensions, Group 1/1 places more emphasis on environmental friendliness, while Group 2/1 places more emphasis on employee fairness. Referring back to Table 5, we see several significant CSR effects both in the attitude and mediated SOW models for Attitude Segments 1 and 2 and SOW Segment 1, so the strong total effects for Groups 1/1 and 2/1 are as expected.

5 The most transparent way to assess the “quality” of classification is to examine the posterior probability of each respondent being in the segment he/she is assigned to. The closer these probabilities are to 1, the better the quality of classification. We find that the average probability of being in the assigned segment is 0.89 for the SOW model and 0.77 for the attitude model. Thus, the quality of assignment is very high, especially for the SOW model.

CSR exerts different effects on Groups 1/2 and 2/2. This follows from the weaker and even negative direct effects in the mediated SOW model

Table 6
Total effects on share of wallet.
Group defined by Total effect of Attitude Seg1 SOW Seg1 (size = 11.6%) Attitude Seg2 SOW Seg1 (size = 13.8%) Attitude Seg1 SOW Seg2 (size = 22.6%) Attitude Seg2 SOW Seg2 (size = 52.0%) Environmental friendliness Community support Local products 3.440⁎⁎⁎ (1.064) 1.615⁎ 1.987⁎⁎⁎ (.782) 1.442⁎⁎ (1.856) 7.393⁎⁎⁎ (.723) 7.384⁎⁎⁎ −1.906⁎⁎ (.933) −.909 (.961) 1.543⁎ (1.074) .831 (.820) .541 (.846) −.750 (.704) 2.848⁎⁎⁎ (.639) −.274 (1.023) −2.316⁎⁎⁎ (.778) 1.471⁎⁎ (.734) −.990 (.664) −1.018⁎⁎ (.883) 2.254⁎⁎⁎ (.461) 2.217⁎⁎⁎ (.516) −1.096 (.826) −.872 (.722) 2.583⁎⁎⁎ −.041 (1.419) −.687 (1.202) 1.554 (1.108) .949 (1.021) −3.904⁎⁎⁎ (1.056) 1.128 (.857) −2.953⁎⁎⁎ (.932) 1.120 (1.492) −.211 (1.173) 3.913⁎⁎⁎ (.754) 4.335⁎⁎⁎ (.928) .172 (.819) 3.657⁎⁎⁎ (.782) 1.452⁎⁎⁎ (.847) .925 (.726) 3.182⁎⁎⁎ (.614) 1.383⁎⁎⁎ (1.312) 3.240⁎⁎⁎ (1.016) −1.191 (.939) 4.796⁎⁎⁎ (.989) 4.205⁎⁎⁎ (.591) −1.801⁎⁎⁎ (.710) 4.707⁎⁎⁎ (.502) (.453) (.619) (.409) Employee fairness Price Assortment selection Unique items Product quality Deals In-store service Similar shoppers Wealthy shoppers Location convenience (.860) 1.770⁎⁎ (.771) −5.868⁎⁎⁎ (.684) .784 (.601) −3.763⁎⁎⁎ (.764) .066 (1.007) 1.642⁎⁎ Note 1: Total effect = (Direct effect on SOW) + (Effect on attitude)⁎(Effect of attitude on SOW). Note 2: Bootstrapped standard errors are in parentheses. Effects of CSR variables are in bold. ⁎⁎⁎ p b 0.01. ⁎⁎ p b 0.05. ⁎ p b 0.10. for segment 2. CSR does not exert a significant impact on SOW for Group 1/2. The local products effect is close but does not achieve statistical significance at the 0.10 level. So for 22.6% of the consumer base, CSR is a non-factor in terms of SOW. Group 2/2, the largest portion of consumers (52.0%), presents a mixed response to CSR. Local products and employee fairness exert positive total effects. A one-point increase in employee fairness is worth 1.77 SOW points and a one-point increase in local products yields 1.54 points of SOW. 
Note that these two CSR variables are related to a consumer's shopping experience, so a stronger response is expected – these CSR efforts provide not just a societal but also a personal benefit. In contrast, the impact of community support is not significant, and environmental friendliness has a negative total impact, significant at the 0.05 level. This is only one negative total impact out of 16 total CSR effects, but nevertheless it is intriguing and calls for explanation.

We offer an ex post psychological explanation for the negative impact of environmental friendliness in this group – attribution. In particular, consumers who see a retailer devoting attention to a CSR cause that is not related to their experience with the store might infer that the retailer's attention is being diverted from serving customers. Consistent with this hypothesis, we find that one of the concomitant variables, the perception that “CSR makes it difficult for companies to serve their customers effectively,” is associated with a higher probability of being in Attitude Segment 2 (where environmental friendliness has an insignificant effect), and a higher probability of being in SOW Segment 2 (where the direct effect of environmental friendliness is negative). The net result is that consumers in Group 2/2 are concerned that the retailer's attention to CSR activities for the broad societal good takes its attention away from the customer. This shows up as a significantly negative total response to environmental friendliness and an insignificant total response to community support, the two CSR dimensions that have a societal but not necessarily a personal benefit.

Overall, Table 6 shows that some CSR efforts can have a strongly positive impact on SOW. The positive effect is especially strong for 25% of our sample.
For 52% of the consumers, CSR related to the consumer's experience in the store – local products and employee fairness – has a significant positive effect, but broader societal good related CSR, particularly environmental friendliness, detracts from SOW. There is a third group – 23% of consumers – for whom CSR is a non-factor.

4.4. Total impact of other store attributes on SOW

Table 6 shows the total effects of other store attributes on SOW, suggesting important contrasts among the groups. Groups 1/1 and 2/1 are not price sensitive, not deal prone, and value unique and special items and a wealthy clientele. Recall that these two groups are the most responsive to CSR. In contrast, Groups 1/2 and 2/2 are price sensitive and value similar shoppers, not wealthy ones. Group 2/2 is particularly price sensitive and deal prone. They respond to unique items by reducing SOW. Perhaps these consumers think of such a store as a place to shop for special items, not for everyday needs. The survey includes a question as to whether the consumer shopped at a chain mainly for special items not available elsewhere. Consistent with our logic, this item is more positively correlated with the Unique attribute in SOW Segment 2 than in SOW Segment 1 (correlation = 0.38 versus 0.17), and its mean is significantly higher for SOW Segment 2 (3.95) than for SOW Segment 1 (3.09).

Perhaps the most surprising result is the lack of importance of quality. Table 5 shows that both Attitude segments have a positive effect of quality, but the direct effect on SOW is negative for both SOW segments. Table 6 shows that the total effect is not significant. Everyone likes high quality “theoretically,” but when it comes to actual shopping, quality may not mean much within the range of the data. The stores carry more or less the same packaged goods brands so, as the retail experts we interviewed noted, quality differs primarily in fresh produce, which is not a large part of total grocery spending.

4.5. Additional characterization of the four groups

The concomitant variables discussed previously characterize the Attitude and SOW segments. In Table 7, we summarize additional self-reported characteristics of the four groups.

Table 7
Shopping characteristics of the four groups.
Mean (standard error) in group defined by: Attitude Seg1 SOW Seg1 (size = 11.6%); Attitude Seg2 SOW Seg1 (size = 13.8%); Attitude Seg1 SOW Seg2 (size = 22.6%); Attitude Seg2 SOW Seg2 (size = 52.0%)
Variable: SOW at focal chain (%); SOW at low price chains (%) (chains C + D + F + G); Importance of price; Importance of quality; Importance of in-store service; Importance of CSR; Seeking local products
73.82⁎ (.74) 12.61⁎ 71.31⁎ (.78) 18.80⁎ 35.01⁎ (1.15) 30.58⁎ (.77) (.71) (1.33) 3.03⁎ (.07) 4.58⁎ 3.29⁎ (.06) 3.93⁎ (.04) 3.47⁎ (.05) 3.42⁎ (.07) 3.79⁎ (.07) 4.38⁎ (.07) 3.57⁎ (.08) 4.25⁎ 3.31⁎ (.05) 4.35 (.04) 3.34 (.05) 3.57⁎ (.05) 4.23⁎ (.06) (.05) (.04) 25.71 (.74) 44.12 (1.02) 3.70 (.03) 4.27 (.03) 3.21 (.03) 3.27 (.04) 3.91 (.03)
⁎ Mean is significantly different from group 4 at p < 0.01.

The contrast between Groups 1/1 and 2/2 (respectively the most and least CSR responsive groups based on our results in Table 6) is particularly interesting. First, Groups 1/1 and 2/1 have much higher SOW at the focal retailer than the other two groups, and the self-reported importance of CSR in store choice, especially in Group 1/1, is significantly higher than for the other two groups. This makes sense given the focal retailer's superior performance on CSR and the high responsiveness of these groups to CSR. Groups 1/1 and 2/1 also have much lower SOW at the price oriented retailers (Chains C, D, F, and G), and the self-reported importance of price in their store choice is significantly lower. Again this is in line with our model-based results showing that Groups 1/1 and 2/1 are much less price sensitive and much less deal prone than the other two groups, especially Group 2/2.
Finally, Groups 1/1 and 2/1 report significantly higher importance of in-store service and quality in their store choice, and they report seeking local products more than Group 2/2. This, too, is in line with our model-based findings. Thus, the differences in these self-reported characteristics across the four groups provide high convergent validity for our model-based results.

4.6. Robustness checks

We conducted several analyses to establish the robustness of our results. The robustness checks relate to (i) use of self-reported SOW versus SOW computed from purchases, (ii) multicollinearity, (iii) chain differences in CSR response, and (iv) potential nonlinear effects of CSR. We summarize our findings below but full details are available upon request.

4.6.1. Self-reported versus computed SOW

The SOW measure used in our analyses is self-reported. We have respondents' actual purchase data from the focal retailer but we are unable to use actual SOW because we do not have the same information about their purchases from other retailers. However, we tested the robustness of our results with actual purchase data in two ways. First, we computed SOW at the focal retailer from the respondents' actual purchases at the focal retailer and their total weekly grocery budget as reported in the survey. Then we re-estimated our model after replacing the self-reported SOW by this computed SOW for observations relating to the focal retailer. We found no substantive difference in results. Second, we estimated the model using only observations for the focal retailer. This allowed us to directly compare results between self-reported and computed SOW because both were available. It was reassuring that the results were very similar for the two SOW measures.

4.6.2. Multicollinearity

Although the VIFs and condition indices in our model are within acceptable limits, we conducted additional checks given the high correlations between the CSR dimensions.
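As a reminder of what the VIF diagnostic mentioned above measures, the following is a minimal Python sketch. The VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on the remaining predictors; with only two predictors this reduces to 1 / (1 − r²), with r their Pearson correlation. The sketch covers only that two-predictor case, and the ratings and function names are hypothetical, not taken from the survey data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF of either predictor when the model contains exactly these two."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical 1-5 ratings on two CSR dimensions for five respondents.
env_friendliness = [1, 2, 3, 4, 5]
community_support = [2, 1, 4, 3, 5]
vif = vif_two_predictors(env_friendliness, community_support)
print(round(vif, 3))
```

Here the correlation is 0.8, which gives a VIF of about 2.78. With more than two predictors the auxiliary regression has to include all the other predictors, which this sketch does not attempt.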
Multicollinearity generally increases standard errors, rendering the estimated coefficients unstable across model variations. Sometimes it can result in “wrong signs”, so we wished to make sure that the negative direct effects of some CSR dimensions are not an artifact of multicollinearity. We re-estimated the mediated SOW model by dropping one CSR dimension at a time and found that the results were very robust – e.g., the negative effects of environmental friendliness and community support in the second segment remained. Noting the high correlations between the quality attribute and the in-store service and unique items attributes, we ran three additional models, dropping in-store service in one, unique items in another, and both in the third, to see if the negative quality coefficient held up. Indeed it did. The signs of the CSR variables held up as well, although there were a few instances where significance levels changed. Finally, we subjected the multi-item store attributes to a principal components analysis and re-estimated the models with the orthogonal component scores. The CSR results were unchanged.

We also examined the correlation matrices for SOW Segments 1 and 2 and found them to be similar. We examined correlations for the focal chain versus other chains. These were somewhat smaller than in the overall sample but, as we discuss in the next section, this is at the cost of less variation. Overall, therefore, we find that multicollinearity is not a problem and it is not responsible for negative or insignificant CSR effects.

4.6.3. Focal versus other chains

Our analysis relies on between chain/within customer variation to estimate the impact of CSR. However, it may be argued that our results, especially the negative direct effects of environmental friendliness and community support, are driven mainly by the focal chain, which is positioned much higher on CSR dimensions.
Although separating the chains reduces variation in the CSR dimensions and hurts statistical power, we repeated our analysis after deleting the focal chain. As we expected, this resulted in fewer segments and reduced statistical significance. However, the direct effects of environmental friendliness and community support do not turn positive even when we exclude the focal chain. Indeed, the direct effect of community support remains significantly negative in one segment, just as in the full sample. We also found that the signs for local products and employee fairness are all positive, and in two of four cases, significant, despite the reduction in statistical power. In summary, the basic pattern of results persists when we remove the focal chain from the analysis, despite the weaker statistical significance resulting from attenuated variation in the CSR variables.

4.6.4. Nonlinear effects of CSR

It is possible that the impact of CSR is nonlinear. For example, the negative impact we find for environmental friendliness could be due to “over-reaching” on the part of the focal chain, i.e., going too far in its emphasis on environmental friendliness. This could lead to an inverted U effect on SOW. We examined this by adding dummy variables for high CSR values, i.e., equal to 1 if the CSR variable is rated 4 or 5 on our 1 to 5 scale (the variables are measured on an interval rather than a ratio scale, so it is not appropriate to include a quadratic term). Although model fit improved, we found that the negative effects we discussed earlier cannot be attributed to an inverted-U relationship. Specifically, the dummy variables for environmental friendliness and community support are not significant for the second SOW segment where our model showed negative effects of environmental friendliness and community support. And directionally, they support the opposite of an inverted U effect. For example, the effect of community support becomes less negative at high levels.
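The dummy-variable coding described above can be sketched in a few lines of Python. This is only an illustration of the coding rule (1 if a CSR variable is rated 4 or 5 on the 1 to 5 scale, 0 otherwise); the function names are hypothetical and the sketch does not reproduce the segment-level estimation.

```python
def high_rating_dummy(rating, threshold=4):
    """Return 1 if a 1-5 rating is at or above the threshold (4 or 5), else 0."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be on the 1-5 scale")
    return 1 if rating >= threshold else 0


def csr_terms(rating):
    """Regressors for one CSR variable: the linear rating and its high-value dummy."""
    return (rating, high_rating_dummy(rating))


# One row of regressors for each possible rating.
design = [csr_terms(r) for r in range(1, 6)]
print(design)
```

Under this coding, a negative coefficient on the dummy alongside a positive coefficient on the linear term would indicate that the effect becomes less positive at high ratings, while a positive dummy coefficient alongside a negative linear term would indicate an effect that becomes less negative.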
The only dimension for which we see an inverted U is one that did not have a negative effect in our model – local products. We find that the effect of local products gets less positive at high levels.

Adding these dummy variables aggravates multicollinearity (we now obtain VIFs above 5 and the condition number is 15). Further, the magnitude of the dummy variable for local products is implausibly high in Segment 1. Overall, therefore, although there are some nonlinear effects (as shown by the improved fit), they do not explain the negative effects in our original model, nor do they change our results directionally. We are also concerned that this is manipulating the 1–5 scale data too much. Overall, we conclude that the exacerbation of multicollinearity is not worth the additional insights and therefore we retain our original model. However, we note that non-monotonic CSR effects present a fruitful direction for future research.

A final nonlinear effect is potential interactions, especially between CSR efforts and other store attributes. We used the “data-driven” procedure proposed by Bijmolt, Van Heerde, and Pieters (2005) to identify potential interactions. Specifically, we added one set of interactions at a time (between the CSR activities and another attribute), determined the number of segments using BIC, and then determined whether this set of interactions improved the fit of the model. We did not find any interactions for the attitude model, while we found interactions of CSR with quality and unique items for the SOW model. We used the results to recalculate the total CSR effects on SOW and found the pattern of results to be similar to those shown in Table 6 (footnote 6).

5. Conclusion

We have used field data to quantify the impact of competing grocery retailers' CSR activities on consumers' behavioral loyalty towards these firms. We measure behavioral loyalty as share of wallet and distinguish among four types of CSR and among consumer segments.
We investigate the role of consumers' attitude toward the store, allowing it to mediate the impact of CSR and other store attributes on SOW. Our key findings are as follows:

(a) CSR perceptions have a direct effect on SOW as well as an indirect effect through attitude toward the store. The effects on attitudes are generally positive and attitudes enhance SOW. However, some of the direct effects are negative, reducing the total effect on SOW, which is a combination of indirect and direct effects.

(b) The total effect on SOW varies substantially across segments and CSR activities. Among the four types of CSR activities in our study, selling locally produced products has strong universal appeal. Employee fairness also has a positive, albeit weaker, impact across segments. Environmental friendliness is a double-edged sword with respect to SOW. A substantial segment of consumers reacts negatively to it.

(c) For the largest group comprising over 50% of our sample, a one-unit improvement in perception (on a 1–5 scale) of local products or employee fairness increases SOW by 1.5 and 1.8 points respectively. However, this group is turned off by environmental friendliness – a similar effort on this dimension loses 1.9 SOW points. For two other groups, comprising approximately 25% of our sample, the SOW gain from local products is much larger and improvements in environmental friendliness and community support also garner substantial increases in SOW. Thus, there is a strong case for benefit segmentation in CSR efforts.

(d) These groups can be distinguished based on education, age, income, and belief that CSR activities limit a firm's ability to effectively serve its customers. They can also be distinguished based on their response to other store attributes.
Compared to the smaller groups that respond positively to all CSR, members of the group with negative SOW response to environmental friendliness and community support are more price sensitive and place greater value on assortment and location convenience. They are turned off by perceptions of exclusivity such as unique items and a wealthy clientele, have a smaller weekly grocery budget, and are more likely to believe that CSR efforts hinder the retailer's ability to serve its customers effectively.

These findings have several important implications. First, the potential gains and losses of SOW due to improvements in CSR perceptions are managerially meaningful. U.S. supermarket sales exceed $550 billion annually and median store sales are over $25 million (Food Marketing Institute, 2010), so every SOW point carries a substantial dollar amount. For instance, consider Kroger, one of the largest U.S. grocery chains with approximately $76 billion in annual sales. It recorded a market share gain of 0.61 points in its major markets in 2008, and a total gain of 2.25 market share points over a four-year period. These gains were considered very “impressive” in its press release announcing the fiscal year results (The Kroger Company, 2009).

Second, not all CSR initiatives are equally important or meaningful. The best CSR initiatives are closely integrated into the company's core customer offering. CSR activities that are directly tied to the customer's experience with the firm – the front-end employees and the products – generate a higher return that is less contingent upon consumers' idiosyncratic beliefs about the relationship between CSR and corporate abilities.

6 Complete details are available from the authors.

Third, our results highlight the importance of targeting CSR communications to consumers. For the largest group, communicating environmental friendliness hurts SOW.
This does not mean that firms should act in ways that are environmentally unfriendly or that exploit their employees. For one, consumers react much more strongly to negative news about a firm's CSR than they might to positive news (Bhattacharya & Sen, 2004). For another, CSR never detracts from overall attitudes toward the store and therefore may lead to other pro-firm behavior not captured in SOW, such as word of mouth referrals and advocacy, and higher willingness to forgive occasional lapses (Bhattacharya & Sen, 2004; Klein & Dawar, 2004). Also, the smaller groups that value CSR for the broader social good are strategically important. Their lifetime value to the retailer is likely to be very high given their lower price sensitivity and high loyalty. In our sample, the average SOW of these segments with the focal retailer is over 70%, and other research shows that the “green consumer” spends more and is more brand and retailer loyal (GMA-Deloitte, 2009). All this suggests that while core-offering related CSR lends itself to a more uniform, mass-market communication approach, non-core related CSR is more nuanced and requires both careful messaging and careful targeting. This is definitely both feasible and cost-effective for a retailer that already has a loyalty program and communicates directly with its consumers, e.g., through e-mail. For instance, all consumers should receive information about a retailer's local product selection and related consumption benefits such as freshness and lower pesticide levels. As also noted in the GMA-Deloitte (2009) study, green products are most effective when they represent a broader value proposition encompassing multiple purchase drivers. Only the higher educated, higher income consumers who use reusable bags or support environmental organizations should receive information about the environmental benefits of local products and about the retailer's environmental programs. 
Even retailers who do not have a loyalty program can use the rich geo-segmentation data available from tools such as Nielsen's PRIZM system to target CSR communications by zip-code (see Blattberg, Kim, & Neslin, 2008; pp. 197–206). Beyond targeting based on consumer demographics and psychographics, our findings highlight the importance of appropriate messaging. Firms must tie their CSR effort not just to the broader social good but also communicate how those efforts translate into a better customer experience. They need to combat the view that some CSR activities do not directly benefit the customer and interfere with the firm's ability to serve its customers. For example, communications about a retailer's environmental programs (e.g., energy and water conservation or waste reduction) should emphasize how they reduce costs and allow the retailer to invest in products and services that the consumer values and/or reduce prices. The British grocery retailer, Tesco, even rewards customers with loyalty program points if they take actions that benefit the environment, e.g., use a reusable shopping bag. Indeed, Manget, Roche, and Munnich (2009) find that the most popular environmentally friendly actions that consumers themselves undertake involve saving money as well. Fourth, it is dangerous for companies to charge higher prices because they perform well on CSR. Although the segment that values all types of CSR does tend to be less price sensitive, it is small. The largest group, that values only some types of CSR, is significantly more price sensitive. Further, we did not find any interactions between CSR efforts and price response, i.e., CSR does not decrease price sensitivity. This recommendation is also consistent with the GMA-Deloitte (2009) finding that consumers don't see why a green product should cost more if it is manufactured with less packaging/waste or if it is not transported far. 
Fifth, firms need to measure the costs of their CSR initiatives realistically to calculate the ROI of their CSR investment. These costs were not available to us so we cannot make these calculations. We do wish to note that, just as not all CSR activities bring equal SOW benefits, not all of them are equally costly. Indeed, offering local products, the activity with the biggest benefit in our study, may not incur incremental costs (Bustillo & Kesmodel, 2011). Sourcing local products may actually be cheaper for a retailer due to lower transportation and spoilage costs and more negotiating leverage over local, often smaller, suppliers. Similarly, environmentally friendly practices such as reducing plastic or water or energy use, or reducing waste, may lower costs, the savings from which can be communicated and passed along to consumers. The point is that costs, which vary substantially across CSR activities, are quantifiable, and our research shows how to quantify economic benefits from the revenue side. Finally, our results underscore the importance of distinguishing between attitudes and behavior in CSR research. Previous studies have suggested that positive attitudes engendered by CSR may not translate into higher purchase incidence, but, to the best of our knowledge, the current research is the first to quantify the interrelationship between attitudes and behavioral loyalty. The conclusion is that attitudes partially mediate the relationship between CSR and SOW, so evaluation of CSR must entail both attitudinal and behavioral measures. Our results show that only measuring impact on attitudes paints a rosier than warranted picture of CSR. We note the limitations of our work and some important future research opportunities. First, our sample comes from the loyalty program of the focal retailer who is strongly positioned on CSR. 
Although much other empirical work has also been done using loyalty program members, we recognize that consumers in our sample may not be representative of the population as a whole. In particular, they may be more responsive to CSR, having chosen to enroll in the focal retailer's program, so our results may reasonably be viewed as an upper bound on the SOW returns of CSR. Even for this sample, we find considerable heterogeneity in CSR response. We hope future research can validate our findings with a broader sample. Second, we have identified consumer segments and developed profiles that can be used for targeted messaging. However, we have not delved deeply into the underlying reasons for consumers' distinct perspectives about CSR. For example, we posit the negative impact of environmental friendliness on SOW in one group may be due to attribution on the part of the consumer that these efforts detract from the retailer's ability to serve the customer. Although we find only this one negative effect out of sixteen effects examined, its significance within a sizeable segment of consumers and the fact that environmental efforts are the most commonly publicized CSR initiatives underscores the importance of further research on this issue. Third, we studied the major stakeholder group for grocery retailers, but it is also important to study how other stakeholders such as employees and investors respond to each CSR dimension. CSR dimensions like environmental friendliness and community support may well have significant effects on these stakeholders and therefore on financial returns even though they have little direct impact on consumers' behavioral loyalty. Fourth, our research is cross-sectional and there are always questions of causality with cross-sectional research. Strictly speaking, our research finds associations though theory suggests the associations are causal. We note that for reasons reported earlier, we do not believe halo effects were a problem. 
In addition, we note that the fact that we found two CSR dimensions negatively related to SOW and two positively related suggests that reverse causality is not at play here – respondents didn't simply rate their favorite store positively on CSR. Still, a longitudinal study over a period in which CSR policies are changed would be very valuable, especially as some retailers like Wal-Mart are investing significantly in environmentally friendly stores, products, and suppliers, and taking steps to improve their reputation on treatment of employees. It could also help to distinguish between the effects of positive versus negative changes in CSR.

Finally, we examined the impact of CSR dimensions in the grocery retail industry. As discussed at the outset of this paper, the impact of CSR should be industry-specific, so industry focus is important. But we hope future researchers will build on our work by conducting field-based analysis of the impact of CSR dimensions in other industries.

Acknowledgments

The authors thank an anonymous grocery retail chain in the northeastern U.S. for access to their loyalty program members and data. They thank Rong Guo, Lynn Foster-Johnson, and Paul Wolfson for their invaluable research support. Finally, we thank the AE and the reviewers for their constructive and valuable comments.

Appendix A

Supplementary data to this article can be found online at www.runmycode.org.

References

Ailawadi, K. L., & Keller, K. L. (2004). Understanding retail branding: Conceptual insights and research priorities. Journal of Retailing, 80(4), 331–342.
Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211.
Auger, P., Burke, P., Devinney, T. M., & Louviere, J. J. (2003). What will consumers pay for social product features? Journal of Business Ethics, 42(3), 281–304.
Baker, J., Parasuraman, A., Grewal, D., & Voss, G. B. (2002).
The influence of multiple store environment cues on perceived merchandise value and patronage intentions. Journal of Marketing, 66(2), 120–141.
Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of social behavior: Direct effects of trait construct and stereotype activation on action. Journal of Personality and Social Psychology, 71(2), 230–244.
Baron, R. M., & Kenny, D. A. (1986). Moderator–mediator variables distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182.
Barone, M. J., Miyazaki, A. D., & Taylor, K. A. (2000). The influence of cause-related marketing on consumer choice: Does one good turn deserve another? Journal of the Academy of Marketing Science, 28(2), 248–262.
Barone, M. J., Norman, A. T., & Miyazaki, A. D. (2007). Consumer response to retailer use of cause-related marketing: Is more fit better? Journal of Retailing, 83(4), 437–445.
Batson, C. D., & Shaw, L. L. (1991). Evidence for altruism: Toward a pluralism of pro-social motives. Psychological Inquiry, 2(2), 107–122.
Bergkvist, L., & Rossiter, J. R. (2007). The predictive validity of multiple-item versus single-item measures of the same constructs. Journal of Marketing Research, 44(2), 175–184.
Berman, S. L., Wicks, A. C., Kotha, S., & Jones, T. M. (1999). Does stakeholder orientation matter? The relationship between stakeholder management models and firm financial performance. Academy of Management Journal, 42(5), 488–506.
Bhattacharya, C. B., & Sen, S. (2003). Consumer-company identification: A framework for understanding consumers' relationships with companies. Journal of Marketing, 67(2), 76–88.
Bhattacharya, C. B., & Sen, S. (2004). Doing better at doing good: When, why, and how consumers respond to corporate social initiatives. California Management Review, 47(1), 9–24.
Bhattacharya, C. B., Sen, S., & Korschun, D. (2008).
Using corporate social responsibility to win the war for talent. MIT Sloan Management Review, 49(2), 37–44.
Bijmolt, T. H. A., Van Heerde, H. J., & Pieters, R. G. M. (2005). New empirical generalizations on the determinants of price elasticity. Journal of Marketing Research, 42, 141–156.
Bitner, M. J. (1992). Servicescapes: The impact of physical surroundings on customers and employees. Journal of Marketing, 56, 57–71.
Blattberg, R. C., Kim, B., & Neslin, S. A. (2008). Database Marketing: Analyzing and Managing Customers. New York: Springer.
Bommer, W. H., Johnson, J. L., Rich, G. A., Podsakoff, P. M., & Mackenzie, S. B. (1995). On the interchangeability of objective and subjective measures of employee performance: A meta-analysis. Personnel Psychology, 48(3), 587–605.
Brown, T. J., & Dacin, P. A. (1997). The company and the product: Corporate associations and consumer product responses. Journal of Marketing, 61(1), 68–84.
Bucklin, R. E., Gupta, S., & Siddarth, S. (1998). Determining segmentation in sales response across consumer purchase behaviors. Journal of Marketing Research, 35(2), 189–197.
Bustillo, M., & Kesmodel, D. (2011, August 1). ‘Local’ grows on Wal-Mart. Wall Street Journal, 258(26), B1–B5.
Cotte, J., & Trudel, R. (2009). Socially conscious consumerism: A systematic review of the body of knowledge. Network for Business Sustainability.
Du, S. L., Bhattacharya, C. B., & Sen, S. (2007). Reaping relational rewards from corporate social responsibility: The role of competitive positioning. International Journal of Research in Marketing, 24(3), 224–241.
Efron, B., & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman & Hall/CRC, Monographs on Statistics and Applied Probability, 57.
Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention, and behavior: An introduction to theory and research. Reading, MA: Addison-Wesley.
Food Marketing Institute (2010). Supermarket facts: Industry overview 2010.
Accessed on July 30, 2011 at http://www.fmi.org/facts_figs/?fuseaction=superfact GMA-Deloitte (2009). Finding the green in today's shoppers: Sustainability trends and new shopper insights. GMA/Deloitte Green Shopper Study. Godfrey, P. C., & Hatch, N. W. (2007). Researching corporate social responsibility: An agenda for the 21st century. Journal of Business Ethics, 70(1), 87–98. Gönül, F., & Srinivasan, K. (1993). Modeling multiple sources of heterogeneity in multinomial logit-models — methodological and managerial issues. Marketing Science, 12(3), 213–229. Greene, W. H. (2003). Econometric analysis (5th ed.)Upper Saddle River, N.J: Prentice Hall, 440–441. Hillman, A. J., & Keim, G. D. (2001). Shareholder value, stakeholder management, and social issues: What's the bottom line? Strategic Management Journal, 22(2), 125–139. Kamakura, W. A., & Russell, G. J. (1989). A probabilistic choice model for market-segmentation and elasticity structure. Journal of Marketing Research, 26(4), 379–390. Klein, J., & Dawar, N. (2004). Corporate social responsibility and consumers' attributions and brand evaluations in a product–harm crisis. International Journal of Research in Marketing, 21(3), 203–217. Krishna, A. (2011). Can supporting a cause decrease donations and happiness? The cause marketing paradox. Journal of Consumer Psychology, 21(3), 338–345. Lichtenstein, D. R., Drumwright, M. E., & Braig, B.M. (2004). The effect of corporate social responsibility on customer donations to corporate-supported nonprofits. Journal of Marketing, 68(4), 16–32. Lindquist, J.D. (1974). Meaning of image. Journal of Retailing, 50(4), 29–38. Luo, X. M., & Bhattacharya, C. B. (2006). Corporate social responsibility, customer satisfaction, and market value. Journal of Marketing, 70(4), 1–18. Luo, X. M., & Bhattacharya, C. B. (2009). The debate over doing good: Corporate social performance, strategic marketing levers, and firm-idiosyncratic risk. Journal of Marketing, 73(6), 198–213. 
Manget, J., Roche, C., & Munnich, F. (2009). Capturing the green advantage for consumer companies. Boston Consulting Group Report. Margolis, J.D., & Walsh, J. P. (2003). Misery loves companies: Rethinking social initiatives by business. Administrative Science Quarterly, 48(2), 268–305. Mazursky, D., & Jacoby, J. (1986). Exploring the development of store images. Journal of Retailing, 62(2), 145–165. McWilliams, A., & Siegel, D. (2001). Corporate social responsibility: A theory of the firm perspective. Academy of Management Review, 26(1), 117–127. Orlitzky, M., Schmidt, F. L., & Rynes, S. L. (2003). Corporate social and financial performance: A meta-analysis. Organization Studies, 24(3), 403–441. Raghubir, P., Roberts, J., Lemon, K. N., & Winer, R. S. (2010). Why, when, and how should the effect of marketing be measured? A stakeholder perspective for corporate social responsibility metrics. Journal of Public Policy & Marketing, 29(1), 66–77. Rindfleisch, A., Malter, A. J., Ganesan, S., & Moorman, C. (2008). Cross-sectional versus longitudinal survey research: Concepts, findings, and guidelines. Journal of Marketing Research, 45(3), 261–279. Sen, S., & Bhattacharya, C. B. (2001). Does doing good always lead to doing better? Consumer reactions to corporate social responsibility. Journal of Marketing Research, 38(2), 225–243. Talukdar, D., Gauri, D. K., & Grewal, D. (2010). An empirical analysis of the extreme cherry picking behavior of consumers in the frequently purchased goods market. Journal of Retailing, 86(4), 336–354. The Kroger Company (2009). Kroger Records Fourth Quarter and Record Full Year 2008 Results. Accessed on January 30, 2011, at http://www.thekrogerco.com/corpnews/ corpnewsinfo_pressreleases_03102009.htm Urbany, J. E., Dickson, P. R., & Kalapurakal, R. (1996). Price search in the retail grocery market. Journal of Marketing, 60(2), 91–104. van Heerde, H., Gijsbrechts, E., & Pauwels, K. (2008). Winners and losers in a major price war. 
Journal of Marketing Research, 45(5), 499–518. Verhoef, P. C., Neslin, S. A., & Vroomen, B. (2007). Multichannel customer management: Understanding the research shopping phenomenon. International Journal of Research in Marketing, 24(2), 129–148. Wedel, M., & Kamakura, W. A. (2000). Market segmentation: Conceptual and methodological foundations (2nd ed.)Boston: Kluwer Academic. Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and truths about mediation analysis. Journal of Consumer Research, 37, 197–206. Intern. J. of Research in Marketing 31 (2014) 168–177 Contents lists available at ScienceDirect Intern. J. of Research in Marketing journal homepage: www.elsevier.com/locate/ijresmar The market value for product attribute improvements under price personalization☆ Garrett P. Sonnier University of Texas at Austin, 1 University Station, Austin, TX 78712, United States a r t i c l e i n f o Article history: First received in 18 July 2012 and was under review for 4 ½ months Available online 14 October 2013 Area Editor: John H. Roberts Keywords: Product management Pricing Personalization Choice models a b s t r a c t Personalization of the marketing mix is a topic of much interest to marketing academics and practitioners. Using discrete choice demand theory, we investigate the aggregate market value for product attribute improvements when firms are engaged in personalized pricing. Our results provide a theoretically grounded rule for how to aggregate consumer valuations to assess the overall profitability of attribute improvements under price personalization. Under common pricing, each consumer contributes the same margin. Profitability of an attribute improvement is thus driven by inducing more consumers to buy. Consumers with high choice probabilities are given less weight in the market valuation under common pricing as they are less responsive to attribute improvements. 
Under personalized pricing, profitability of an attribute improvement is driven by extraction of consumer surplus from high valuation consumers. Consumers with higher valuations, and consequently higher choice probabilities, are given more weight in the market valuation under personalized pricing. Since individual consumers play a more central role in the market valuation under personalized pricing, estimation of consumer-level valuations is of increased importance. Under common pricing, the market valuation for an attribute improvement is robust to extreme estimates of the consumer-level valuations. Through our theoretical and empirical analyses, we demonstrate that this robustness does not hold under personalized pricing.
© 2013 Elsevier B.V. All rights reserved.

1. Introduction

New product development is crucial to sustained firm performance. Companies that fail to develop new products risk being supplanted by more nimble competitors responding to shifts in consumer demand. While new companies often focus on creating disruptive technologies that alter the competitive landscape, most new product development activity focuses on incremental innovation devoted to improving existing products. For example, at Sony, over three quarters of new product activity is dedicated to improving existing products (Kotler & Keller, 2006). Bayus (1994) notes the existence of a similar pattern across a range of industries (Abernathy & Utterback, 1978) as well as evidence that incremental innovation is more crucial to profitability than breakthrough technology (Gomory, 1989).

While new product development is undeniably important, it is also risky. Some studies suggest a failure rate of 95% in the U.S. (Kotler & Keller, 2006). To improve the odds of success, product managers must carefully assess how consumers value product attribute improvements and, importantly, how to aggregate consumer valuations into a market-level valuation useful for product planning decisions.
☆ The author wishes to thank Elie Ofek for providing access to data and for helpful comments.
0167-8116/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.ijresmar.2013.09.002

From the perspective of an individual consumer, the value for a product attribute improvement is typically defined as the change in price that would keep consumer utility constant given the attribute improvement (Train, 2003). Appealing to discrete-choice theory of consumer and firm behavior, Ofek and Srinivasan (2002) derive a market-level analog to this consumer-level valuation termed the market value for an attribute improvement (MVAI). MVAI can be compared to the marginal cost of the attribute improvement, providing product managers with guidance in assessing the overall profitability of the improvement. However, the Ofek and Srinivasan (2002) derivation of MVAI assumes that firms charge a common price to all consumers.

In contrast to a homogeneous pricing policy, the notion of personalized pricing is of great appeal to both marketing academics and managers (Fay, Mitra, & Wang, 2009). A stream of research in the marketing literature has considered the personalization of the marketing mix from both an empirical and theoretical perspective (Chen & Iyer, 2002; Choudhary, Ghose, Mukhopadhyay, & Rajan, 2005; Heilman, Kaefer, & Ramenofsky, 2003; Khan, Lewis, & Singh, 2009; Knox & Eliashberg, 2009; Liu & Zhang, 2006; Rossi, McCulloch, & Allenby, 1996; Shaffer & Zhang, 2002). Firms from the apparel, airline, bank-issued credit card, and enterprise software industries have engaged in personalized pricing (Choudhary et al., 2005; Montgomery & Smith, 2009; Shaffer & Zhang, 2002).
In light of academic and practitioner attention to the topic of personalized pricing, it is interesting to consider whether and how price personalization affects the market value for product attribute improvements.¹

¹ Rather than focusing on the normative question of whether or not firms should engage in price personalization, we adopt a positive point of view to understand the implications of engaging in one-to-one price personalization for estimates of the market value for a product attribute improvement.

The main contribution of this paper is to derive the market value for product attribute improvements when firms are engaged in price personalization. Our results generalize the MVAI measure for common pricing and provide managerial guidance on product planning decisions under personalized pricing. Similar to Ofek and Srinivasan's (2002) analysis of MVAI under common pricing, we obtain closed form expressions for MVAI under personalized pricing in the context of the widely used multinomial logit demand model. However, two important differences in MVAI under common versus personalized pricing emerge from our analysis.

First, under common pricing, every consumer contributes the same margin. Incremental profitability from an attribute improvement is thus driven by inducing more consumers to purchase. Consumers with extreme choice probabilities are given less weight in the aggregate market valuation as these consumers are less responsive to attribute changes. In contrast, under personalized pricing, the profitability of an attribute improvement is driven by the extraction of surplus from consumers with higher valuations and, consequently, higher choice probabilities. Under personalized pricing, consumers with high choice probabilities are given greater weight in the market valuation.
The first difference between market-level valuations under common and personalized pricing (i.e., which consumers matter more for the aggregate market valuation) relates to the second difference. As individual consumers matter more under personalized pricing, extreme consumer-level valuations have a greater impact in this setting. Unlike the case of common pricing, computing MVAI under personalized pricing requires more careful attention to the estimation of the consumer-level valuations, a point underscored by the results of our empirical application.

Choice models specified with additive linear utility imply that the consumer-level valuation for an attribute improvement is identified as the ratio of the estimated attribute and price coefficients (Train, 2003). With a heterogeneous model, the distribution of consumer-level valuations is specified indirectly as a ratio of random coefficients. Such an identification strategy may yield distributions of the valuations that lack finite moments (Daly, Hess, & Train, 2012). Even if finite moments are assured, the distribution may be prone to yield extreme estimates (Meijer & Rouwendal, 2006; Ofek & Srinivasan, 2002). Alternatively, the valuations can be directly identified in the choice model likelihood, which avoids ratio estimation and its associated problems (Cameron & James, 1987; Jedidi, Jagpal, & Manchanda, 2003; Sonnier, Ainslie, & Otter, 2007).

An interesting and important property of MVAI under common pricing is its robustness to extreme consumer valuations (Ofek & Srinivasan, 2002), which renders the estimation of the consumer-level valuations less important. Our results demonstrate that robustness to outliers is not a general property of the MVAI measure and does not hold under personalized pricing. Using Ofek and Srinivasan's (2002) data set on stated preferences for portable camera mounts, we empirically investigate the MVAI under personalized pricing.
Computing MVAI under personalized pricing with ratio estimates of the consumer-level valuations suggests that nearly every attribute improvement is profitable for any product. In contrast, using consumer-level valuations that are directly identified and less prone to extreme estimates to compute MVAI under personalized pricing yields estimates that are smaller in magnitude and suggest a smaller subset of profitable attribute improvements.

The remainder of the paper is organized as follows. We begin with a discussion of personalized pricing to motivate the study of product planning decisions under one-to-one pricing. We then review the derivation of the market valuation for an attribute improvement under common pricing and extend the derivation to the case of one-to-one price personalization. In doing so, we also consider the intermediate case of a discrete segment-based price discrimination strategy. We then discuss discrete choice demand models and the specification of consumer-level valuations used to compute the market-level valuation under personalized pricing. Our empirical application follows. The final section summarizes and concludes.

2. Personalized pricing in marketing

The marketing literature has discussed numerous examples of personalized marketing in both consumer and business-to-business markets. Choudhary et al. (2005) discuss examples of firms in the enterprise software industry, such as IBM, Hewlett–Packard, and Sun Microsystems, that use personalized pricing discounts for products of the same quality. In consumer markets, information technology has enabled firms to develop rich databases of consumer information, giving firms the ability to reach individual consumers and personalize the marketing mix. Direct marketing firms such as Lands' End and L.L. Bean use promotional discounts to tailor prices to individual households (Shaffer & Zhang, 2002).
Firms in the bank-issued credit card industry, such as Wells Fargo, engage in price personalization through personalized discounts on card fees (Choudhary et al., 2005). The consulting firm Accenture offers clients a personalized pricing tool to assist in implementing a one-to-one price promotion program.² A CNN.com report details price variation across consumers for the same product in a variety of online product categories, including airline tickets, digital cameras, and personal computers.³ The online data provision company Lexis–Nexis sells to different consumers at different prices (Ghose & Huang, 2009). Even when met initially with consumer resistance, firms such as Amazon continue to find innovative ways to implement personalized pricing, such as the Gold Box (Choudhary et al., 2005).

A challenge in implementing a personalized pricing strategy is that firms must obtain consumer willingness-to-pay for the products in the competitive set. Fay et al. (2009) consider conditions under which firms invest in technology to solicit preferences from consumers at the point of purchase versus technology that allows the firm to infer preferences based on past observations. Wertenbroch and Skiera (2002) discuss different methods for determining consumer valuations, or willingness-to-pay, in market research. These methods include Vickrey auctions, the Becker–DeGroot–Marschak (BDM) elicitation procedure, and discrete choice models applied to either stated preference data or market transaction data. Cameron and James (1987), Jedidi et al. (2003), and Ofek and Srinivasan (2002) use discrete choice models to estimate consumer valuations for product attributes. Most empirical applications of personalized marketing also utilize discrete choice models (Ansari & Mela, 2003; Khan et al., 2009; Knox & Eliashberg, 2009; Rossi et al., 1996; Zhang & Krishnamurthi, 2004; Zhang & Wedel, 2009).
An advantage of using discrete choice models is that with an attribute-based utility function (Fader & Hardie, 1996), the valuation for the product can easily be decomposed into the valuations for the product attributes. Furthermore, if the valuations can be linked to consumer characteristics, such as demographics or purchase history, the model can be used to impute the valuations for new consumers conditional on the characteristics, enhancing the firm's ability to implement a personalized pricing strategy (Rossi et al., 1996).

² Accenture.com, http://www.accenture.com/NR/rdonlyres/6EFFD307-3CBE-40AEB1929F7FADC5776/0/personalized_pricing_tool.pdf, retrieved on Dec 12, 2009.
³ CNN.com, http://www.cnn.com/2005/LAW/06/24/ramasastry.website.prices/, retrieved on Dec 12, 2009.

In considering the question of whether and how the firm's pricing strategy affects the market value for product attribute improvements, it is natural to address the problem from the perspective of firms selling direct to consumers. Shaffer and Zhang (2002) study one-to-one promotions among competing direct marketing firms. Chen and Iyer (2002) study competition among firms that offer personalized prices assuming that firms have an imperfect ability to reach consumers. Choudhary et al. (2005) consider how price personalization in a duopoly impacts firm choices over product quality. It is important to note, though, that selling through a retailer does not preclude the manufacturer from engaging in personalized pricing. Gerstner, Hess, and Holthausen (1994) analyze a manufacturer targeting pull discounts in the form of coupons or rebates to price sensitive consumers purchasing through a retailer. Liu and Zhang (2006) study personalized pricing in a channel where both the manufacturer and the retailer can personalize price and the manufacturer can open a direct to consumer channel.
Silva-Risso and Ionova (2008) study customized manufacturer incentives in the automotive industry, where manufacturers spend approximately $45 billion per year on sales incentives directed at consumers.

3. Theoretical analysis

3.1. The market value for an attribute improvement under pricing common to all consumers

We begin by reviewing the derivation of the market value for an attribute improvement (MVAI) under common pricing (Ofek & Srinivasan, 2002). Assume a market consisting of i = 1,…,I consumers choosing amongst a set of m = 0,…,M products (where 0 denotes the “outside” alternative). Let product m be defined by a vector of continuously differentiable product attributes, $x_m$, and a common price, $p_m$.⁴ The share of consumers predicted to choose product m from the competitive set is

$$S_m = \frac{1}{I}\sum_{i=1}^{I} \Pr[y_{im} = 1],$$

where $y_{im} = 1$ denotes the choice of product m and $\Pr[y_{im} = 1]$ is the choice probability. Assume that competing firms sell only one product (such that m also indexes firms) and that fixed costs are zero. The profits from product m, $\pi_m$, are given by

$$\pi_m = \sum_{i=1}^{I} \Pr_{im}\,[p_m - c_m] = I \cdot S_m\,[p_m - c_m],$$

where $c_m$ is the variable cost. Note that the aggregation of the choice probabilities into market shares prior to multiplication with the margin is possible in this setting because the prices are common across all consumers. The firm's first order condition for the pricing decision is

$$\frac{\partial \pi_m}{\partial p_m} = S_m + \frac{\partial S_m}{\partial p_m}\,[p_m - c_m] = 0.$$

Now consider the total change in profitability of product m triggered by a change in the kth product attribute, $x_m^k$. The total derivative of profits with respect to a change in $x_m^k$ is

$$\frac{d\pi_m}{dx_m^k} = I\left[\frac{\partial S_m}{\partial x_m^k}\,[p_m - c_m] - S_m\,\frac{\partial c_m}{\partial x_m^k}\right].$$

After substitution of the pricing first order condition, the total derivative of profits with respect to the attribute change is given by

$$\frac{d\pi_m}{dx_m^k} = I\,S_m\left[\left[-\frac{\partial S_m / \partial x_m^k}{\partial S_m / \partial p_m}\right] - \frac{\partial c_m}{\partial x_m^k}\right].$$

Under common pricing, incremental profitability hinges on the changes in market share in response to the attribute improvement and price. As each consumer contributes the same margin, the profitability of an attribute improvement will ultimately depend on inducing more consumers to purchase the product. For the attribute change to be profitable to the firm, the ratio of market share changes with respect to the attribute and price must exceed the marginal cost of the attribute change. Ofek and Srinivasan (2002) term this ratio of market share derivatives, $-\dfrac{\partial S_m / \partial x_m^k}{\partial S_m / \partial p_m}$, the market value for an attribute improvement (MVAI).⁵

3.2. The market value for an attribute improvement and market share simulators

Managers often use the parameter estimates from a discrete choice model to build a market share simulator. The simulator can be used to assess the sensitivity of market share to changes in price and product attributes. Under common pricing, simulation techniques can also be used to compute the price increase given an attribute improvement that leaves aggregate market share constant. The manager first improves the level of the product attribute, then searches for the price change that would leave market share unchanged. Ofek and Srinivasan (2002) show that for common pricing this approach coincides with the MVAI. The total differential of market share with respect to the kth product attribute and price is

$$dS_m = \frac{\partial S_m}{\partial x_m^k}\,dx_m^k + \frac{\partial S_m}{\partial p_m}\,dp_m.$$

The price change that satisfies $dS_m = 0$, which is the price change given an attribute change that holds market share constant, is

$$\frac{dp_m}{dx_m^k} = -\frac{\partial S_m / \partial x_m^k}{\partial S_m / \partial p_m},$$

which is exactly the MVAI under common pricing.

Consider now the case of personalized pricing. Rather than finding the incremental change in the common price that equalizes market share before and after the attribute improvement, the manager seeks the incremental change in the personalized price that leaves the individual's choice probability unchanged.
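The two searches just described can be sketched with a toy market share simulator. The Python snippet below is a minimal illustration, not the author's code: it assumes a single product competing with an outside good under a logit model, the consumer coefficients in `CONSUMERS` are made-up values, and all function names are hypothetical. The closed-form ratio in `mvai_common` relies on the standard logit derivatives.

```python
import math

# Hypothetical consumers: (intercept, attribute coefficient phi_i, price coefficient alpha_i).
CONSUMERS = [(0.5, 1.0, 0.8), (-0.2, 0.6, 1.5), (1.0, 1.4, 0.5),
             (0.0, 0.9, 1.0), (-0.5, 0.3, 2.0)]


def choice_prob(consumer, x, p):
    """Logit probability of buying the product rather than the outside good."""
    a, phi, alpha = consumer
    v = a + phi * x - alpha * p
    return 1.0 / (1.0 + math.exp(-v))


def share(x, p):
    """Predicted market share at attribute level x and common price p."""
    return sum(choice_prob(c, x, p) for c in CONSUMERS) / len(CONSUMERS)


def mvai_common(x, p):
    """Ratio of share derivatives, -(dS/dx)/(dS/dp), using logit derivatives."""
    num = den = 0.0
    for c in CONSUMERS:
        pr = choice_prob(c, x, p)
        num += c[1] * pr * (1.0 - pr)  # dPr/dx = phi_i * Pr * (1 - Pr)
        den += c[2] * pr * (1.0 - pr)  # -dPr/dp = alpha_i * Pr * (1 - Pr)
    return num / den


def constant_share_price_change(x, p, dx, iters=100):
    """Bisection search for the common price change that restores the original share."""
    target = share(x, p)
    # At the upper bound no consumer is better off than before, so share <= target.
    lo, hi = 0.0, dx * max(c[1] / c[2] for c in CONSUMERS)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if share(x + dx, p + mid) > target:
            lo = mid  # share still above target: the price can rise further
        else:
            hi = mid
    return 0.5 * (lo + hi)


def constant_prob_price_change(consumer, dx):
    """Personalized price change that leaves one consumer's choice probability unchanged."""
    _, phi, alpha = consumer
    return phi * dx / alpha  # exact here because utility is linear in x and p


if __name__ == "__main__":
    x0, p0, dx = 1.0, 1.0, 1e-5
    dp = constant_share_price_change(x0, p0, dx)
    print("simulated dp/dx:", dp / dx, "derivative ratio:", mvai_common(x0, p0))
```

For a small attribute change the simulated ratio dp/dx should agree closely with the derivative ratio, while `constant_prob_price_change` returns one number per consumer, which is the set of quantities that still has to be aggregated.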
This consumer specific price change can also be approximated via simulation. The manager changes the product attribute, then searches for the personalized price change that would leave the individual choice probability unchanged. However, at the end of this exercise the manager is left with a set of consumer-level quantities that approximate the consumer-level valuations from the choice model. The question of how to aggregate these quantities into a market-level value to assess the profitability of the attribute improvement remains. Intuitively, the attribute change will be profitable to the firm if the sum of the expected incremental prices that can be captured from consumers exceeds the sum of the expected costs of the improvement.

⁴ As with Ofek and Srinivasan's (2002) derivation of MVAI under common pricing, our conceptual analysis considers continuously differentiable product attributes, such as fuel economy for automobiles or processor speed for personal computers. Some product attributes are, of course, discrete. For the case of discrete product attributes, simulation techniques would be required to assess the profitability of an attribute improvement. For example, in the multinomial logit model, there is a closed form expression for the derivative of the choice probability with respect to a change in a continuous product attribute. For a change in a discrete product attribute, the effect on the choice probability is computed as the difference in choice probabilities with the new and original level of the product attribute.
⁵ See Ofek and Srinivasan (2002) for a more complete discussion of MVAI under common pricing.

3.3. The market value for an attribute improvement under price personalization

Before addressing one-to-one personalized pricing, it is useful to dwell on whether and how MVAI under common pricing would differ if the firm engaged in a more discrete price discrimination strategy. Under such a strategy, the firm might offer product m at different prices to discrete segments of consumers. Assume there are d = 1,…,D segments of consumers, each of size $I_d$ and each of which receives a price of $p_m^d$. Let $S_m^d$ represent the share of product m in segment d. The profits from product m would then be

$$\pi_m = \sum_{d=1}^{D}\left[\sum_{i=1}^{I_d} \Pr_{im}\left[p_m^d - c_m\right]\right] = \sum_{d=1}^{D}\left[I_d \cdot S_m^d\left[p_m^d - c_m\right]\right].$$

The derivative of the profit function is obtained by summing the segment specific derivatives. After substitution of the pricing first order condition, the total derivative of profits with respect to the attribute change is given by

$$\frac{d\pi_m}{dx_m^k} = I\left[\frac{1}{I}\sum_{d=1}^{D}\left[I_d\,S_m^d\left[\left[-\frac{\partial S_m^d / \partial x_m^k}{\partial S_m^d / \partial p_m^d}\right] - \frac{\partial c_m}{\partial x_m^k}\right]\right]\right].$$

Thus, the MVAI under a discrete price discrimination strategy is given by

$$\frac{1}{I}\left[\sum_{d=1}^{D}\left[I_d\,S_m^d\left[-\frac{\partial S_m^d / \partial x_m^k}{\partial S_m^d / \partial p_m^d}\right]\right]\right].$$

Each segment MVAI is computed exactly the same as MVAI under common pricing (i.e., as the ratio of market share derivatives) and is weighted by the segment size and market share within the segment.

Consider now the profits from product m under personalized pricing,

$$\pi_m = \sum_{i=1}^{I} \pi_{im} = \sum_{i=1}^{I} \Pr_{im}\,[p_{im} - c_m].^{6}$$

Unlike common pricing or segment pricing, under price personalization the choice probabilities cannot be aggregated into market shares prior to multiplication with the margin. Profits are obtained in this case by summing over the product of each individual consumer's purchase probability and the consumer specific contribution margin. Thus, we may view segment or common pricing as special cases of the more general case of personalized pricing, where analysis of the two former situations is simplified by the ability to aggregate the choice probabilities into shares prior to multiplication with the common or segment margins.
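The special-case relationship described above can be made concrete with a small numerical sketch. The Python snippet below is illustrative only: it assumes logit demand, treats each consumer's choice probability as given (in practice the probabilities would depend on the prices charged under each pricing regime), and uses made-up values and hypothetical function names. It computes the segment-weighted expression and shows what it reduces to with one segment and with one consumer per segment.

```python
# Hypothetical consumers: (choice probability Pr_im, attribute coefficient phi_i,
# price coefficient alpha_i). Probabilities are held fixed purely for illustration.
CONSUMERS = [(0.9, 2.0, 0.5), (0.5, 1.0, 1.0), (0.2, 0.5, 1.5), (0.6, 1.2, 0.8)]


def segment_mvai(members):
    """MVAI within a segment facing one price: -(dS/dx)/(dS/dp) under logit demand."""
    num = sum(phi * pr * (1.0 - pr) for pr, phi, _ in members)
    den = sum(alpha * pr * (1.0 - pr) for pr, _, alpha in members)
    return num / den


def market_mvai(segments):
    """Segment MVAIs weighted by segment size and within-segment share, divided by I."""
    total = sum(len(seg) for seg in segments)
    value = 0.0
    for seg in segments:
        seg_share = sum(pr for pr, _, _ in seg) / len(seg)
        value += len(seg) * seg_share * segment_mvai(seg)
    return value / total


if __name__ == "__main__":
    one_segment = [CONSUMERS]                 # common pricing: D = 1
    two_segments = [CONSUMERS[:2], CONSUMERS[2:]]
    singletons = [[c] for c in CONSUMERS]     # one consumer per segment: D = I
    for label, segs in [("D = 1", one_segment), ("D = 2", two_segments),
                        ("D = I", singletons)]:
        print(label, market_mvai(segs))
```

With a single segment the expression equals the market share times the common-pricing MVAI; with singleton segments each segment share is a choice probability and each segment ratio is that consumer's own ratio, so the expression becomes a choice-probability-weighted average of consumer-level ratios.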
For each consumer, the firm's first order condition for the pricing decision under personalization is

$$\frac{\partial \pi_{im}}{\partial p_{im}} = \Pr_{im} + \frac{\partial \Pr_{im}}{\partial p_{im}}\,[p_{im} - c_m] = 0. \tag{1}$$

The total derivative of profits with respect to the attribute change is

$$\frac{d\pi_m}{dx_m^k} = \frac{\partial \pi_m}{\partial x_m^k} + \sum_{i=1}^{I} \frac{\partial \pi_m}{\partial p_{im}}\,\frac{dp_{im}}{dx_m^k}. \tag{2}$$

Since $\pi_m = \sum_{i=1}^{I} \pi_{im}$, the second term in this equation becomes $\sum_{i=1}^{I} \dfrac{\partial \pi_{im}}{\partial p_{im}}\,\dfrac{dp_{im}}{dx_m^k}$, which is zero by the first order condition. Thus,

$$\frac{d\pi_m}{dx_m^k} = \sum_{i=1}^{I}\left[\frac{\partial \Pr_{im}}{\partial x_m^k}\,[p_{im} - c_m] - \Pr_{im}\,\frac{\partial c_m}{\partial x_m^k}\right].$$

Plugging in the expression for $[p_{im} - c_m]$ from the first order condition and rearranging terms yields the following condition

$$\frac{d\pi_m}{dx_m^k} = I\left[\frac{1}{I}\sum_{i=1}^{I} \Pr_{im}\left[-\frac{\partial \Pr_{im} / \partial x_m^k}{\partial \Pr_{im} / \partial p_{im}}\right] - S_m\,\frac{\partial c_m}{\partial x_m^k}\right]. \tag{3}$$

Since each consumer has a unique contribution margin under price personalization, the firm cares much more about which consumers are induced to purchase. High valuation consumers are more likely to tolerate higher prices and yield higher margins. Thus, incremental profits depend not solely on attracting more customers (as is the case under common pricing) but more on the extraction of consumer surplus from buyers with larger valuations for the attribute improvement. Under personalized pricing the ratio of consumer choice probability derivatives with respect to the attribute and price determines the MVAI. More specifically, the attribute improvement will be profitable if the weighted average of these ratios exceeds the marginal cost weighted by the product's market share. That is, MVAI under personalized pricing is

$$\mathrm{MVAI}^{per} = \frac{1}{I}\sum_{i=1}^{I} \Pr_{im}\left[-\frac{\partial \Pr_{im} / \partial x_m^k}{\partial \Pr_{im} / \partial p_{im}}\right]. \tag{4}$$

3.4. MVAI under personalized pricing and the multinomial logit (MNL) model

We now discuss expressions for MVAI under personalized pricing implied by the widely used multinomial logit model. Suppose we observe the choices of the i = 1,…,I consumers on a set of t = 1,…,T choice occasions.
Assume a linear indirect utility function, $V_{imt} = x'_{imt}\phi^{*} - \alpha^{*} p_{imt} + \varepsilon^{*}_{imt}$, with error term $\varepsilon^{*}_{imt} \sim EV(0, \mu^{*})$. It is well known that the utility function can be multiplied by a constant without changing the consumer's utility maximizing choice. This scale identification problem is typically addressed by estimating the parameters $\phi = \phi^{*}/\mu^{*}$ and $\alpha = \alpha^{*}/\mu^{*}$, normalizing utility by the scale parameter of the error distribution (Swait & Louviere, 1993; Train, 2003). The choice probabilities are

$$\Pr[y_{imt} = 1] = \frac{\exp\left[x'_{imt}\phi - \alpha p_{imt}\right]}{1 + \sum_{l=1}^{M} \exp\left[x'_{ilt}\phi - \alpha p_{ilt}\right]}. \tag{5}$$

Parametric distributions of heterogeneity are easily incorporated into the analysis. For example, one could specify $\theta_i \sim MVN(\bar{\theta}, \Sigma_\theta)$ where $\theta_i = \left[\phi'_i \;\; \ln(\alpha_i)\right]$. Following Eq. (4), with heterogeneous θ the MVAI for the kth product attribute under personalized pricing is given by

$$\mathrm{MVAI}^{per}_{\theta} = \frac{1}{I}\sum_{i=1}^{I} \Pr_{im}\left[\frac{\phi_i^k}{\alpha_i}\right]. \tag{6}$$

Under personalized pricing the MVAI is the average of the consumer-level valuations weighted by the choice probabilities.⁷ In this specification the distribution of the consumer-level valuations is identified indirectly as the ratio of the random attribute and price coefficients, $\phi_i / \alpha_i$ (Train, 2003). While commonly employed, unfortunately not much can be said in favor of such an identification strategy in the context of MVAI under personalized pricing. The heterogeneity distribution for the attribute and price coefficients implies a distribution for the ratio which will generally be different from that specified for the coefficients. For example, a normal distribution on the coefficients does not imply a normal distribution on the ratio. The implied heterogeneity distribution may reflect a prior belief that the researcher has no intention of expressing.
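The point about the implied distribution of the ratio can be checked with a short simulation. The sketch below is a hedged illustration with made-up parameter values and hypothetical function names, not an analysis of the paper's data: it draws a normal attribute coefficient and a log-normal price coefficient, forms the ratio, and compares its tail behavior with valuations drawn directly from a normal distribution.

```python
import math
import random


def excess_kurtosis(values):
    """Sample excess kurtosis; close to zero for normally distributed data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def draw_ratio_valuations(n, rng):
    """Indirect identification: phi_i ~ Normal, ln(alpha_i) ~ Normal, valuation = phi_i / alpha_i."""
    draws = []
    for _ in range(n):
        phi = rng.gauss(1.0, 0.5)              # hypothetical attribute coefficient
        alpha = math.exp(rng.gauss(0.0, 1.0))  # hypothetical positive price coefficient
        draws.append(phi / alpha)
    return draws


def draw_direct_valuations(n, rng):
    """Direct identification: the valuation itself is given a normal distribution."""
    return [rng.gauss(1.0, 0.5) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(12345)
    ratio = draw_ratio_valuations(20000, rng)
    direct = draw_direct_valuations(20000, rng)
    print("ratio:  max =", max(ratio), " excess kurtosis =", excess_kurtosis(ratio))
    print("direct: max =", max(direct), " excess kurtosis =", excess_kurtosis(direct))
```

Under these assumed parameters, draws with a price coefficient near zero produce very large ratios, so the ratio sample shows a far heavier right tail than the directly specified normal valuations even though both start from normal building blocks.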
Since ratios of random variables are generally heavy tailed, the researcher using indirect identification is implicitly (and perhaps unwittingly) imparting a prior belief that the distribution of consumer-level valuations is heavy tailed. Furthermore, it is not at all clear that the implied distribution possesses finite moments (Daly et al., 2012). Even if the heterogeneity distribution does possess finite moments, it is clear that consumers with estimates of \(\alpha_i\) tending towards zero will be problematic in this setting as their valuations will tend to be very large. Such consumers will inflate the market value for the product attribute improvement. Indeed, only a handful of such consumer-level valuations would likely result in the MVAI exceeding the share weighted marginal cost, suggesting a profitable attribute improvement.

Given the significance of the consumer-level valuations in the MVAI under personalized pricing, it seems advantageous to parameterize the model to directly identify the valuations.⁸ We can estimate \(\beta = \phi^{*}/\alpha^{*}\) and \(\mu = \mu^{*}/\alpha^{*}\), normalizing by the price coefficient and directly identifying the consumer-level valuation via \(\beta\) (Cameron & James, 1987; Sonnier et al., 2007). The choice probabilities are

\[
\Pr[y_{imt} = 1] = \frac{\exp\left[\dfrac{x'_{imt}\beta - p_{imt}}{\mu}\right]}{1 + \sum_{l=1}^{M} \exp\left[\dfrac{x'_{ilt}\beta - p_{ilt}}{\mu}\right]}. \tag{7}
\]

An advantage of direct identification is that the heterogeneity distribution is specified directly on the consumer-level valuations. For example, one could specify normally distributed valuations via \(\lambda_i \sim MVN(\bar{\lambda}, \Sigma_{\lambda})\) where \(\lambda_i = [\beta'_i \;\; \ln(\mu_i)]'\). This would place less prior mass in the tails of the distribution of the valuations, tamping down on outlier valuations. Alternatively, if the researcher believes the distribution of valuations is in fact thick-tailed, a heterogeneity distribution that reflects this belief, such as the t-distribution, may be utilized. The valuations may also be modeled as a function of demographics or other consumer-level covariates, allowing for the prediction of valuations for future consumers conditional on this information. For heterogeneous \(\lambda\), the market value for an improvement in the kth product attribute under one-to-one price personalization is computed as

\[
MVAI^{per}_{\lambda} = \frac{1}{I}\sum_{i=1}^{I} Pr_{im}\left[\beta^{k}_{i}\right]. \tag{8}
\]

The expression for MVAI in Eq. (8) makes use of the directly identified consumer valuations and avoids potential problems associated with ratio estimates of the valuations. From Eqs. (7) and (8) we see that for MVAI under personalized pricing the scale of error variance, captured by the parameter \(\mu\), plays a role similar to that in the case of common pricing. As the effect of the error variance increases, the ability of the valuations to explain consumer choices diminishes. In the extreme, as \(\mu \to \infty\) the value of \(Pr_{im}\) approaches \(\frac{1}{1+M}\). As the effect of the error variance decreases, the probability of choosing the alternative with the highest valuation increases. Under common pricing, MVAI gives smaller weight to such high value, high probability consumers since the weight \(Pr_{im}[1 - Pr_{im}]\) reaches its maximum value at \(Pr_{im} = 0.5\) (Ofek & Srinivasan, 2002).

Under common pricing consumers very likely to buy product m are given a smaller weight in the market valuation for an attribute improvement compared with consumers who are indifferent between product m and the composition of all other products. Under personalized pricing, the consumer-level valuations are weighted by the choice probabilities, \(Pr_{im}\). Consumers very likely to buy product m are given a larger weight in the market valuation. Since the choice probability increases in the valuations, consumers with higher valuations for the attributes of product m are also more likely to be consumers with higher choice probabilities for product m. Under common pricing, the firm can raise the price of a product subsequent to an attribute improvement to capture surplus from higher value consumers, but does so at the risk of losing consumers with lower attribute valuations and choice probabilities. By the nature of the S-shaped logit probability response curve, consumers with choice probabilities away from zero or one will exert most of the influence on the changes in market share with respect to the attribute and price, which ultimately determines MVAI under common pricing. MVAI under common pricing reflects the importance of these consumers by giving them higher weights. However, one-to-one price personalization allows the firm to capture surplus from higher valuation, higher probability consumers without driving lower valuation, lower probability consumers away from the product. Thus, consumers with higher valuations, and consequently higher choice probabilities, are given more weight in the market valuation under one-to-one price personalization as incremental profitability from an attribute improvement is driven by the extraction of surplus from these consumers.

⁶ We will return to the question of how the firm implements personalized pricing in our empirical application.
⁷ Interestingly, the weighted average approach has been suggested as an ad-hoc aggregation rule for consumer-level valuations (Ofek & Srinivasan, 2002).
⁸ The interested reader is directed to Sonnier et al. (2007) for a more detailed discussion of direct and indirect estimation of consumer-level valuations.

G.P. Sonnier / Intern. J. of Research in Marketing 31 (2014) 168–177

4. Empirical application

Using a data set on consumer stated preferences for camera mounts we explore estimates of MVAI under personalized pricing.⁹ A complete description of the data and a thorough treatment of MVAI under common pricing can be found in Ofek and Srinivasan (2002). A total of 302 respondents each rank 18 profiles.
Each profile is described by 5 attributes and a price. In addition, respondents completed a holdout ranking task with 4 profiles. We begin by considering heterogeneity distributions for the directly identified valuations, \(\beta_i\). We consider normal and t heterogeneity distributions for valuations. A normal heterogeneity distribution, \(\beta_i \sim N(\bar{\beta}, \Sigma_{\beta})\), places small prior mass on outlier consumer valuations. In contrast to a normal heterogeneity distribution for the valuations, a t distribution of heterogeneity, \(\beta_i \sim t_{\nu}(\bar{\beta}, \Sigma_{\beta})\), permits more prior mass in the tails. For both cases, we use a log-normal heterogeneity distribution for \(\mu_i\). For the t-distribution, the degrees of freedom parameter \(\nu\) must lie in the range \((0, \infty)\). We specify a lognormal prior for \(\nu\) and treat it as an unknown parameter to be estimated. We estimate the normal and t models with standard Markov chain Monte Carlo methods, running the sampler for a total of 15,000 iterations, keeping the last 5000 for inference. Time series plots of the model log-likelihoods indicate that this is sufficient for convergence. In-sample fit measured by the Deviance Information Criterion (DIC) (Spiegelhalter, Best, Carlin, & van der Linde, 2004) and holdout fit measured by the log predictive density (LPD) suggest that the model based on a t-distribution of heterogeneity outperforms the model based on normal heterogeneity (DIC of 15,994 vs. 16,689 and LPD of −689 vs. −691 for the t vs. normal model, respectively). Table 1 presents the attributes and levels of the five products used in our MVAI analyses along with the marginal costs of improvement for each attribute. An issue to consider is how the firms might implement personalized pricing in practice. An approach widely discussed in the literature is to offer a personalized discount, via a coupon or rebate, off of a regular common price (Rossi et al., 1996; Shaffer & Zhang, 2002).
However, in a personalized marketing environment, firms could forgo regular price altogether. Regular prices place an upper bound on the price a consumer pays which limits the ability of the firm to extract surplus from high value customers. Of course, this issue could be resolved by simply charging a high regular common price, at or above the maximum suggested consumer-level price, and offering discounts accordingly. Shaffer and Zhang (2002) show that in a competitive environment lower regular prices perform the important function of limiting competitive poaching of a firm's high value customers. From a practical vantage point, a regular common price coupled with a personalized discount may also be a more feasible strategy for firms to implement. In light of these issues, we consider a personalized price discount, denoted by \(z_{im}\), off of the common prices used in Ofek and Srinivasan (2002) and reported in Table 1.¹⁰ We show in Appendix A that the MVAI under a personalized discount is equivalent to the expression shown in Eq. (4). To find the optimal personalized price discounts, we compute for each consumer the vector of discounts that satisfies the first order condition with respect to \(z_{im}\) by finding the consumer-specific discount vector that minimizes

\[
\sum_{m=1}^{M} \left( \left[p_m - z_{im} - c_m\right] - \frac{Pr_{im}}{\partial Pr_{im} / \partial z_{im}} \right)^{2}.
\]

As in Rossi et al. (1996), we allow for the possibility that the optimal discount may be zero. Once the optimal discounts are obtained, we compute MVAI accordingly.

⁹ While our empirical application makes use of stated preference data, computation of MVAI is not limited to stated preference data. MVAI can also be computed from a choice model calibrated on revealed preference data.
¹⁰ This can be viewed as an approximation to a two stage game where competing firms choose a regular price in the first stage then price discounts in the second stage.

Table 1
Marginal costs, attribute levels, and common prices for camera mount product simulations.

| Attribute | Marginal cost of attribute improvementᵃ | UltraPod | Q-Pod | Gorilla Pod | Camera Critter | Half Dome |
|---|---|---|---|---|---|---|
| Weight (tens of oz.) | $4.90 | 0.17 | 0.20 | 0.35 | 0.46 | 0.57 |
| Sizeᵇ | $0.23 | 0.80 | 0.98 | 0.84 | 1.27 | 1.20 |
| Set up time (min.) | $1.41 | 0.62 | 0.98 | 0.84 | 0.50 | 0.42 |
| Stabilityᶜ | $0.31 | 2.50 | 1.80 | 2.50 | 2.30 | 3.00 |
| Flexibilityᵈ | $0.26 | 1.80 | 1.96 | 2.17 | 2.84 | 2.33 |
| Common prices for competitive setᵉ | | | | | | |
| Three products | | $8.84 | $9.89 | $9.53 | | |
| Four products | | $7.72 | $9.22 | $8.50 | $8.22 | |
| Five products | | $7.15 | $8.53 | $7.75 | $7.49 | $10.39 |

ᵃ Marginal cost of weight in dollars to reflect coding of weight in tens of ounces. All other marginal cost data are in tens of dollars.
ᵇ Where 1 represents a camera mount that fits into a standard pocket and 3 in a standard book bag.
ᶜ Where 1 means stable enough under light-medium wind conditions for a small camera with a built-in lens and 3 for a full-size camera with a large lens.
ᵈ Where 1 is low flexibility and 3 is high flexibility. Flexibility is the degree to which the camera mount can be adapted to various terrains and adjusted for height and angle.
ᵉ Prices in tens of dollars, as reported in Ofek and Srinivasan (2002).

Table 2 reports the estimates of MVAI under personalized pricing implied by the model that directly identifies the valuations with a t-distribution of heterogeneity. The estimates are reported for the three product market in Ofek and Srinivasan (2002). Under common pricing, Ofek and Srinivasan (2002) find that improvements in weight, size, stability and flexibility are profitable for all three products. Improvements in set up time are not profitable for any of the three products. When firms are engaged in price personalization, MVAI estimates suggest a smaller set of profitable attribute improvements. As with common pricing, improvements in size and flexibility are profitable for all three products. However, improvements in weight fail to generate incremental profits for any of the products. Similarly, improvements in stability are profitable only for Q-Pod and Gorilla Pod. Interestingly, under personalized pricing improvement in set up time is profitable for Gorilla Pod. These results reflect two effects at play when firms move from common to personalized pricing. On the one hand, personalization allows firms to capture more consumer surplus versus common pricing. This implies that finding incremental profitability from attribute improvements may be more difficult under price personalization as firms are already wringing much of the surplus from the market. On the other hand, moving to price personalization may render some attributes that are unprofitable to improve under common pricing profitable. This is due to the fact that profitability under personalization is driven by capturing surplus from high value consumers. Under common pricing, firms are unable to capture this value without driving down share. Price personalization frees the firm from this constraint. As noted, we see both of these effects in our results.

Table 2
Market value for attribute improvements under price personalization: direct identification of consumer-level valuations.

| Attribute | UltraPod | Q-Pod | Gorilla Pod |
|---|---|---|---|
| Weight | 1.70 (0.10) | 1.04 (0.06) | 0.94 (0.05) |
| Size | 0.55ᵃ (0.04) | 0.63 (0.09) | 0.39 (0.03) |
| Set up time | 0.36 (0.03) | 0.44 (0.04) | 1.26 (0.13) |
| Stability | −0.31 (0.28) | 1.09 (0.14) | 0.79 (0.08) |
| Flexibility | 0.48 (0.03) | 0.57 (0.04) | 1.18 (0.11) |

Table cells report the posterior mean (in tens of dollars) and posterior standard error (in parenthesis).
ᵃ Bold indicates that 95% of the distribution of the difference between the valuation and the share weighted marginal costs is positive.

It is interesting to compare the estimates of MVAI under personalized pricing with those under segment pricing. The segments may be determined according to any of a number of bases (e.g., demographics, brand loyalty, or usage). For our illustration, we perform a two-step cluster analysis on the posterior means of the attribute valuations. This cluster analysis results in two segments. The first segment is comprised of 87% of the respondents while the second segment is comprised of 13% of the respondents. The mean attribute valuations in the second segment are all higher than the mean valuations in the first segment. We compute the optimal segment-specific price discounts and then compute the MVAI under segment pricing. The results appear in Table 3. The results under segment pricing suggest largely the same smaller set of profitable attribute improvements as the results under personalized pricing. The exception is that improvements in set up time are not profitable for Gorilla Pod under segment pricing. In addition, the MVAI estimates under segment pricing are, for the most part, smaller in magnitude.

Table 3
Market value for attribute improvements under segment pricing: direct identification of consumer-level valuations.

| Attribute | UltraPod | Q-Pod | Gorilla Pod |
|---|---|---|---|
| Weight | 1.16 (0.08) | 0.82 (0.05) | 0.87 (0.07) |
| Size | 0.50ᵃ (0.04) | 0.40 (0.03) | 0.34 (0.03) |
| Set up time | 0.34 (0.02) | 0.31 (0.03) | 0.51 (0.06) |
| Stability | 0.10 (0.03) | 0.54 (0.10) | 0.83 (0.22) |
| Flexibility | 0.40 (0.02) | 0.41 (0.03) | 0.70 (0.08) |

Table cells report the posterior mean (in tens of dollars) and posterior standard error (in parenthesis).
ᵃ Bold indicates that 95% of the distribution of the difference between the valuation and the share weighted marginal costs is positive.

While an indirect identification strategy on the valuations raises a number of concerns, it is nonetheless instructive to dwell on such a strategy. Normal heterogeneity distributions are often used by academics and practitioners alike.¹¹ We may specify \(\theta_i \sim MVN(\bar{\theta}, \Sigma_{\theta})\) where \(\theta_i = [\phi'_i \;\; \ln(\alpha_i)]'\). The log-normal distribution for \(\alpha_i\) ensures that the consumer-valuations are well-defined (Daly et al., 2012).
However, the valuations are distributed as the ratio of normal and lognormal random variables and are likely to be heavy tailed. This specification is thus analogous to using the thick tailed t-distribution in the case of direct identification. An important difference though is that under indirect identification, small price coefficients will lead to valuations tending towards infinity. While such valuations have smaller prior probability under direct identification, this is not necessarily so under indirect identification. A draconian fix to this problem is to restrict the price coefficient to be homogeneous across consumers. The implied valuations are then identified as the normally distributed attribute coefficients scaled by the homogeneous price coefficient. This is analogous to the direct model using a normal heterogeneity distribution on the valuations. However, such an approach cannot be recommended as the price of normality in this case is the more restrictive homogeneous specification on price responsiveness. While degradation of model fit is a concern, a larger concern is bias in the estimates of the price coefficient and hence the attribute valuations (Chintagunta, Jain, & Vilcassim, 1991; Daly et al., 2012).

¹¹ For example, Sawtooth Software's Hierarchical Bayes module for choice based conjoint analysis uses normal heterogeneity distributions.

Fig. 1. Inter-quartile range of consumer-level attribute valuations based on \(\beta_i\). (Vertical axis: valuation in tens of dollars, 0.00 to 3.50, with the median marked; horizontal axis: Weight, Size, Set Up Time, Stability, Flexibility.)

We estimate an indirect model with normal heterogeneity on the attribute coefficients and log-normal heterogeneity on the price coefficient. In-sample fit measured by the DIC is 16,319 while holdout fit measured by the LPD is −713. Both in-sample and holdout fit are inferior to the direct model with the t-distribution of heterogeneity. As noted, previous research has shown indirect identification of the consumer valuations to be prone to outliers. Figs. 1 and 2, which present the median and inter-quartile ranges of the consumer level valuations implied by the direct and indirect models, respectively, confirm that this is indeed the case for our data. Fig. 1 corresponds to the direct model. The 75th percentile values are in the neighborhood of $10–$30. Fig. 2 corresponds to the indirect model. In Fig. 2, the valuations are far more dispersed. The 75th percentile values range from about $80–$250. The median valuations are all above the marginal costs reported in Table 1.

Table 4 presents the MVAI estimates under personalized pricing for the valuations based on indirect identification. The MVAI estimates imply that improving any of the attributes for nearly all of the products is profitable. The MVAI estimates are also much larger in magnitude compared to those based on the directly identified valuations. Given the distribution of the estimates of the consumer-level valuations under indirect identification, it is not surprising that the MVAI estimates computed with these valuations are large and suggest that nearly any attribute improvement will be profitable. The results demonstrate that extreme consumer valuations have a significant impact on MVAI under personalized pricing.

Table 4
Market value for attribute improvements under price personalization: indirect identification of consumer-level valuations.

| Attribute | UltraPod | Q-Pod | Gorilla Pod |
|---|---|---|---|
| Weight | 6.55ᵃ (2.42) | 7.26 (2.20) | 8.06 (2.50) |
| Size | 2.72 (1.47) | 3.61 (1.85) | 2.93 (1.87) |
| Set up time | 1.96 (0.84) | 3.01 (1.07) | 4.51 (1.73) |
| Stability | −0.48 (3.35) | 4.30 (1.61) | 4.04 (2.04) |
| Flexibility | 2.65 (1.15) | 4.03 (1.28) | 7.60 (2.93) |

Table cells report the posterior mean (in tens of dollars) and posterior standard error (in parenthesis).
ᵃ Bold indicates that 95% of the distribution of the difference between the valuation and the share weighted marginal costs is positive.

Table 5 presents the MVAI estimates under segment pricing for the valuations based on indirect identification. As before, we perform a two-step cluster analysis on the posterior means of the attribute valuations which again results in two segments. The first segment is comprised of 87% of the respondents with lower mean valuations compared to the smaller second segment comprised of 13% of the respondents. However, the average valuations for both segments are much higher compared to the average valuations for the segments derived from the directly identified valuations. This is to be expected given the distributions shown in Figs. 1 and 2. The more interesting issue is how the MVAI estimates under segment pricing compare to those computed with the directly identified valuations. We compute the optimal segment-specific price discounts and then compute the MVAI under segment pricing. Although the indirectly identified valuations are widely dispersed, as noted in Fig. 2, the MVAI estimates under segment pricing are not impacted to the degree to which the MVAI estimates under personalized pricing are affected. However, under segment pricing, the MVAI estimates computed with the indirectly identified valuations suggest that more attributes can be profitably improved compared to those computed with the directly identified estimates. The estimates imply that improvements in size, stability and flexibility are profitable for all products and improvements in set up time are profitable for Q-Pod and Gorilla Pod. In addition, the magnitudes of the estimates computed with the indirectly identified valuations, although not as explosively large as those under personalized pricing, are generally larger compared to those computed with the directly identified valuations.
While the MVAI estimates under segment pricing are more robust compared with those under personalized pricing, it is still the case that the estimates are adversely impacted by the widely dispersed valuations resulting from an indirect identification strategy.

Table 5
Market value for attribute improvements under segment pricing: indirect identification of consumer-level valuations.

| Attribute | UltraPod | Q-Pod | Gorilla Pod |
|---|---|---|---|
| Weight | 1.35 (0.12) | 1.35 (0.14) | 1.68 (0.19) |
| Size | 0.57ᵃ (0.06) | 0.61 (0.07) | 0.69 (0.09) |
| Set up time | 0.48 (0.04) | 0.56 (0.06) | 0.80 (0.10) |
| Stability | 0.18 (0.03) | 0.64 (0.07) | 0.73 (0.08) |
| Flexibility | 0.66 (0.06) | 0.79 (0.09) | 1.23 (0.15) |

Table cells report the posterior mean (in tens of dollars) and posterior standard error (in parenthesis).
ᵃ Bold indicates that 95% of the distribution of the difference between the valuation and the share weighted marginal costs is positive.

Fig. 2. Inter-quartile range of consumer-level attribute valuations based on \(\phi_i/\alpha_i\). (Vertical axis: valuation in tens of dollars, −5.00 to 30.00, with the median marked; horizontal axis: Weight, Size, Set Up Time, Stability, Flexibility.)

4.1. MVAI and competitive entry

Table 6
The effect of increased product competition on the market value for attribute improvements under price personalization.

Three products

| Attribute | UltraPod | Q-Pod | Gorilla Pod |
|---|---|---|---|
| Weight | 1.70 (0.10) | 1.04 (0.06) | 0.94 (0.05) |
| Size | 0.55ᵃ (0.04) | 0.63 (0.09) | 0.39 (0.03) |
| Set up time | 0.36 (0.03) | 0.44 (0.04) | 1.26 (0.13) |
| Stability | −0.31 (0.28) | 1.09 (0.14) | 0.79 (0.08) |
| Flexibility | 0.48 (0.03) | 0.57 (0.04) | 1.18 (0.11) |
| Market share | 36% | 30% | 34% |
| Regular price | $8.84 | $9.89 | $9.53 |
| % receiving discount | 54% | 54% | 41% |
| Average discounted price | $7.45 | $8.30 | $8.48 |

Four products

| Attribute | UltraPod | Q-Pod | Gorilla Pod | Camera Critter |
|---|---|---|---|---|
| Weight | 0.70 (0.04) | 0.44 (0.03) | 0.55 (0.04) | 1.99 (0.11) |
| Size | 0.23 (0.02) | 0.36 (0.06) | 0.23 (0.02) | 0.76 (0.07) |
| Set up time | 0.16 (0.02) | 0.20 (0.02) | 0.81 (0.11) | 0.88 (0.07) |
| Stability | −0.38 (0.28) | 0.71 (0.08) | 0.31 (0.06) | 0.94 (0.09) |
| Flexibility | 0.30 (0.02) | 0.39 (0.03) | 0.94 (0.11) | 0.61 (0.03) |
| Market share | 17% | 16% | 20% | 47% |
| Regular price | $7.72 | $9.22 | $8.50 | $8.22 |
| % receiving discount | 43% | 56% | 33% | 59% |
| Average discounted price | $6.86 | $7.97 | $7.91 | $6.72 |

Five products

| Attribute | UltraPod | Q-Pod | Gorilla Pod | Camera Critter | Half Dome |
|---|---|---|---|---|---|
| Weight | 0.41 (0.03) | 0.29 (0.02) | 0.40 (0.03) | 1.62 (0.09) | 0.95 (0.06) |
| Size | 0.19 (0.02) | 0.30 (0.06) | 0.19 (0.02) | 0.67 (0.07) | 0.24 (0.02) |
| Set up time | 0.11 (0.01) | 0.12 (0.01) | 0.52 (0.05) | 0.52 (0.04) | 0.78 (0.11) |
| Stability | −0.41 (0.33) | 0.19 (0.02) | 0.10 (0.04) | 0.34 (0.02) | 1.37 (0.17) |
| Flexibility | 0.25 (0.02) | 0.22 (0.03) | 0.78 (0.09) | 0.44 (0.02) | 0.55 (0.04) |
| Market share | 13% | 11% | 15% | 39% | 21% |
| Regular price | $7.15 | $8.53 | $7.75 | $7.49 | $10.39 |
| % receiving discount | 37% | 51% | 34% | 53% | 62% |
| Average discounted price | $6.54 | $7.56 | $7.26 | $6.27 | $8.90 |

Table cells report the posterior mean (in tens of dollars) and posterior standard error (in parenthesis).
ᵃ Bold indicates that 95% of the distribution of the difference between the valuation and the share weighted marginal costs is positive.

As noted in the conceptual analysis, MVAI under both common and personalized pricing depends upon the consumer choice probabilities, although in different ways. Dependence on the choice probabilities renders MVAI sensitive to competitive entry. The choice model parameters, of course, do not change. However, competitive entry will alter equilibrium prices, the choice probabilities and hence the MVAI estimates. Under common pricing, the effect of expanding the competitive set on the MVAI of existing products will depend on the choice probabilities through the expression \(Pr_{im}[1 - Pr_{im}]\).
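The dependence of the two weights on the competitive set can be sketched with a small Python example. It evaluates the choice probability in the form of Eq. (7) for one consumer before and after an entrant is added, and reports the personalized-pricing weight \(Pr_{im}\) alongside the common-pricing weight \(Pr_{im}[1 - Pr_{im}]\). The function names and all numbers are hypothetical, and prices are held fixed here for simplicity, whereas in the paper entry also changes equilibrium prices.

```python
import math

def direct_choice_probs(valuations, prices, mu):
    """Probabilities in the form of Eq. (7).

    valuations -- valuations[m] stands for x'_im * beta_i, the money-metric
                  value of product m to the consumer (hypothetical inputs)
    prices     -- price of each product
    mu         -- scale of the error term relative to the price coefficient
    """
    expu = [math.exp((v - p) / mu) for v, p in zip(valuations, prices)]
    denom = 1.0 + sum(expu)  # the outside option contributes exp(0) = 1
    return [e / denom for e in expu]

def weights(pr):
    """Return (personalized-pricing weight, common-pricing weight)."""
    return pr, pr * (1.0 - pr)

# Consumer A strongly prefers incumbent product 0; a close substitute enters.
before_a = direct_choice_probs([3.0, 1.0], [1.0, 1.0], mu=0.5)
after_a = direct_choice_probs([3.0, 1.0, 2.8], [1.0, 1.0, 1.0], mu=0.5)

# Consumer B is indifferent among the incumbents before the same entry.
before_b = direct_choice_probs([1.0, 1.0], [1.0, 1.0], mu=0.5)
after_b = direct_choice_probs([1.0, 1.0, 2.8], [1.0, 1.0, 1.0], mu=0.5)

for label, pr in [("A before", before_a[0]), ("A after", after_a[0]),
                  ("B before", before_b[0]), ("B after", after_b[0])]:
    print(label, weights(pr))
```

In this illustration entry lowers \(Pr_{im}\) for both consumers, so the personalized-pricing weight falls in both cases. The common-pricing weight rises for consumer A, whose probability moves from near one towards 0.5, and falls for consumer B, whose probability moves from one third towards zero.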
Ofek and Srinivasan (2002) show that under common pricing, MVAI may increase or decrease in response to an expansion of the competitive set. For example, a price cut in response to entry may attract more consumers with lower valuations and lower MVAI, while products that can maintain premium pricing in the face of entry may lose some lower valuation consumers while retaining higher valuation consumers, thereby increasing the MVAI. Consider now the effect of expanding the competitive set on MVAI under personalized pricing. Since the consumer-level valuations are directly weighted by the choice probabilities, and competitive entry will reduce the probability of purchase for incumbent products, entry is likely to reduce the incumbent firms' MVAI estimates. Table 6 presents MVAI under personalized pricing when the choice set expands from three to four products and then four to five products.¹² Also listed in Table 6, for each product, is the market share, the regular price, the percentage of consumers receiving a discount, and the average discounted price. When Camera Critter is added to the choice set, it gains a considerable amount of market share at the expense of the three incumbent products. Camera Critter is the dominant alternative on weight and size, shares dominance on stability with Q-Pod, and engages in personalized discounting with broader scope and scale. As a result, Camera Critter obtains a 47% share upon entry. The MVAI estimates for all the incumbent firms decrease. Gorilla Pod is the dominant alternative on set up time and flexibility. Consequently, its MVAI for these attributes does not decline as sharply. Indeed, its advantage on flexibility is substantial and the Gorilla Pod MVAI for flexibility remains the highest even after Camera Critter's entry. Half Dome enters with dominance on set-up time and stability. Camera Critter retains dominance on weight and size while Gorilla Pod retains dominance on flexibility.
Half Dome has the highest average discounted price but still manages to obtain a 21% share. As expected, share declines bring about declines in MVAI for the incumbent firms. However, Camera Critter still has the highest MVAI for size and Gorilla Pod the highest MVAI for flexibility.

¹² For the purposes of this analysis we focus only on the model that directly identifies the consumer-level valuations. See Table 1 for a description of all five products.

4.2. MVAI under asymmetric personalization

Personalization of the marketing mix is costly in terms of information, computing, and administration (Rossi et al., 1996). In light of this, firms will likely differ in their willingness and/or ability to implement personalized pricing strategies. In this section, we examine the impact of asymmetric personalization on MVAI estimates. We use the term asymmetric personalization to refer to the situation where some firms are engaged in personalization while other firms employ common pricing. This is opposed to the case where all firms personalize, which we term full personalization. To conduct the analysis, we assume that after setting regular price, UltraPod and Q-Pod set personalized discounts while Gorilla Pod sells at the regular price. We then compute MVAI under personalized pricing for UltraPod and Q-Pod and MVAI under common pricing for Gorilla Pod. The results are presented in Table 7. Under asymmetric personalization, UltraPod and Q-Pod offer personalized discounts to over 60% of consumers resulting in an average discounted price of $7.51 for UltraPod and $8.30 for Q-Pod. The average discounted prices are close to those under full personalization and considerably lower than Gorilla Pod's regular price of $9.53. As a result, Gorilla Pod share drops to 28% while UltraPod and Q-Pod shares increase to 37% and 35%, respectively. The MVAI estimates for UltraPod and Q-Pod (computed via the rule for MVAI under personalized pricing) increase slightly, commensurate with the share increases. Consumers choosing Gorilla Pod at the premium price are likely consumers with high valuations for the product. Indeed, UltraPod and Q-Pod are unable to profitably entice these consumers to switch even with a personalized discount. This intuition is confirmed by the relatively high MVAIs for Gorilla Pod (computed via the rule for MVAI under common pricing). Improvements in size, stability, and flexibility are profitable for Gorilla Pod under asymmetric personalization.

Table 7
Market value for attribute improvements under asymmetric price personalization.

| Attribute | UltraPod | Q-Pod | Gorilla Pod |
|---|---|---|---|
| Weight | 1.61 (0.10) | 1.20 (0.06) | 2.77 (0.13) |
| Size | 0.55ᵃ (0.04) | 0.67 (0.09) | 1.14 (0.07) |
| Set up time | 0.36 (0.03) | 0.40 (0.03) | 1.58 (0.10) |
| Stability | −0.29 (0.27) | 1.22 (0.14) | 2.32 (0.34) |
| Flexibility | 0.52 (0.03) | 0.58 (0.04) | 2.16 (0.12) |
| Market share | 37% | 35% | 28% |
| Regular price | $8.84 | $9.89 | $9.53 |
| % receiving discount | 62% | 63% | – |
| Average discounted price | $7.51 | $8.30 | $9.53 |

Table cells report the posterior mean (in tens of dollars) and posterior standard error (in parenthesis). For m = 1,2 MVAI computed under personalized pricing. For m = 3 MVAI computed under common pricing.
ᵃ Bold indicates that 95% of the distribution of the difference between the valuation and the share weighted marginal costs is positive.

5. Summary and conclusions

Understanding the market value for product attribute improvements is crucial to successful product planning and new product development. A measure of the consumer's value for an attribute improvement is the increase in price that would leave utility unchanged given the attribute improvement. A discrete choice model calibrated on stated or revealed preference data is a popular method for estimating consumer valuations.
With heterogeneous consumers, the issue of how to aggregate the consumer-level valuations into a market-level valuation to assess profitability arises. Ad-hoc methods such as taking the average may yield misleading results and, empirically, may suffer from the effect of extreme valuations. Based on micro-economic theory of consumer and firm behavior, Ofek and Srinivasan (2002) derive the market valuation for an attribute improvement (MVAI) as the ratio of changes in market share with respect to the attribute improvement and price. Their derivation assumes the firm employs a common pricing strategy, charging the same price to all consumers. Marketing academics have long been interested in the effects of personalizing the marketing mix (Rossi et al., 1996). Recently, online channels have stimulated industry interest in and enabled more widespread use of price personalization based on purchase history or other information. We consider the market value for product attribute improvements for the case of one-to-one price personalization. Our results demonstrate how to assess the profitability of attribute improvements in this interesting and important setting. Compared with the market valuation for an attribute improvement under common pricing, two important differences emerge. First, under common pricing, the profitability of an attribute improvement is driven by inducing more consumers, each of whom contributes the same margin, to buy. Thus, consumers with extreme choice probabilities are given less weight in the market valuation under common pricing as these consumers are less responsive to attribute improvements. Under personalized pricing, the profitability of an attribute improvement is driven by the extraction of consumer surplus from high value consumers. Thus, higher valuation consumers with higher choice probabilities are given greater weight in the market valuation under personalized pricing. 
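The contrast between the aggregation rules can be made concrete with a short Python sketch. The personalized-pricing rule follows Eq. (8). The common-pricing rule is written here as the ratio of the share derivatives with respect to the attribute and (minus) price, using the standard logit own-derivatives in the direct parameterization; that closed form is our own working under the logit assumption rather than an expression quoted from the text. The three consumers and their numbers are hypothetical.

```python
def average_valuation(beta_k):
    """Ad-hoc rule: simple average of the consumer-level valuations."""
    return sum(beta_k) / len(beta_k)

def mvai_personalized(pr, beta_k):
    """Eq. (8): valuations weighted by the choice probabilities Pr_im."""
    return sum(p * b for p, b in zip(pr, beta_k)) / len(pr)

def mvai_common(pr, beta_k, mu):
    """Ratio of share derivatives under logit (assumed closed form).

    With utility (x'beta - p) / mu, dPr/dx_k = (beta_k / mu) * Pr * (1 - Pr)
    and dPr/dp = -(1 / mu) * Pr * (1 - Pr), summed over consumers.
    """
    num = sum(p * (1.0 - p) * b / m for p, b, m in zip(pr, beta_k, mu))
    den = sum(p * (1.0 - p) / m for p, m in zip(pr, mu))
    return num / den

# Hypothetical consumers: choice probability for product m, valuation of
# attribute k (in money units), and error scale.
pr = [0.95, 0.50, 0.10]
beta_k = [4.0, 1.0, 0.5]
mu = [1.0, 1.0, 1.0]

print("simple average:      ", average_valuation(beta_k))
print("personalized pricing:", mvai_personalized(pr, beta_k))
print("common pricing:      ", mvai_common(pr, beta_k, mu))
```

In this example the high valuation, high probability consumer carries most of the weight under the personalized-pricing rule but relatively little under the common-pricing rule, where the indifferent consumer dominates.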
Second, because the individual consumers play a more central role in the market valuation under personalized pricing, MVAI under one-to-one price personalization is not robust to extreme consumer-level valuations. Therefore, when engaged in personalized pricing, the identification and estimation of consumer-level valuations is of increased importance relative to the case of common pricing. With additive linear utility, consumer-level valuations are identified as the ratio of attribute and price coefficients from the discrete choice model. This identification strategy has been shown to yield distributions for the valuations that lack finite moments in some cases and is particularly prone to yield extreme valuations. A simple alternative is to utilize a choice model that directly identifies the valuations. Using a dataset on consumer stated preferences for camera mounts, we demonstrate the managerial relevance of our analysis. We estimate choice models that directly and indirectly identify consumer-level valuations for product attribute improvements. We then use these models to compute the MVAI implied by both models under personalized pricing strategies. Under personalized pricing, models that indirectly identify the consumer-level valuations result in MVAI estimates that suggest nearly any attribute improvement for all products considered is profitable. In contrast, the model that directly identifies the consumer-level valuations provides a better fit to the data and results in a smaller set of profitable attribute improvements. There are a number of avenues for future research. As noted, the problem of discrete product attributes remains a challenge. Our expressions for MVAI are based on a logit demand model. Future research may consider other empirical models of demand. Recent research investigates price discrimination across multiple channels (Wolk & Ebbling, 2010).
Investigating product planning decisions in the context of channel competition where manufacturers and retailers each have the ability to personalize price would be very challenging but may yield interesting insights. Lastly, our analysis considers single product firms. Firms may offer different product attributes via vertically differentiated product lines (Michalek, Ebbes, Adigüzel, Feinberg, & Papalambros, 2011). The impact of price personalization on product attribute decisions in a product line may be an interesting topic to consider. Sorting out the market value for a product attribute improvement in these cases should assist firms in making better product planning decisions.

Appendix A. MVAI with personalized price discounts

Consider our firms engaged in personalized pricing by offering a personalized discount, $z_{im}$, off of the regular price, $p_m$, common to all consumers. Such a discount could be in the form of a personalized coupon or a rebate. We will abstract away from targeting costs and redemption issues. Profits to firm $m$ are

\[
\pi_m = \sum_{i=1}^{I} \mathrm{Pr}_{im}\,[p_m - z_{im} - c_m]
\]

while the MNL choice probabilities in this setting are

\[
\mathrm{Pr}_{im}
= \frac{\exp\left[x_{im}'\phi_i - \alpha_i (p_m - z_{im})\right]}{1 + \sum_{l=1}^{M} \exp\left[x_{il}'\phi_i - \alpha_i (p_l - z_{il})\right]}
= \frac{\exp\left[\dfrac{x_{im}'\beta_i - (p_m - z_{im})}{\mu_i}\right]}{1 + \sum_{l=1}^{M} \exp\left[\dfrac{x_{il}'\beta_i - (p_l - z_{il})}{\mu_i}\right]}.
\tag{A1}
\]

We assume the regular prices are observable to all firms when choosing their personalized discounts. This is consistent with the notion that regular prices are a high level managerial decision slow to adjust in practice (Shaffer & Zhang, 2002).
For each consumer, the manufacturer's first order condition for the discounting decision is

\[
\frac{\partial \pi_{im}}{\partial z_{im}} = -\mathrm{Pr}_{im} + \frac{\partial \mathrm{Pr}_{im}}{\partial z_{im}}\,[p_m - z_{im} - c_m] = 0.
\tag{A2}
\]

The total derivative of manufacturer profits with respect to the attribute change is

\[
\frac{d\pi_m}{dx_{km}} = \sum_{i=1}^{I}\left[\frac{\partial \mathrm{Pr}_{im}}{\partial x_{km}}\,[p_m - z_{im} - c_m] - \mathrm{Pr}_{im}\,\frac{\partial c_m}{\partial x_{km}}\right].
\tag{A3}
\]

Plugging in the expression for $[p_m - z_{im} - c_m]$ from the first order condition and rearranging terms yields the following condition

\[
\frac{d\pi_m}{dx_{km}} = I\left[\frac{1}{I}\sum_{i=1}^{I} \mathrm{Pr}_{im}\,\frac{\partial \mathrm{Pr}_{im}/\partial x_{km}}{\partial \mathrm{Pr}_{im}/\partial z_{im}} - S_m\,\frac{\partial c_m}{\partial x_{km}}\right].
\tag{A4}
\]

Inspection of Eq. (A4) reveals that for heterogeneous choice models parameterized in the space of θ, the MVAI will be given by $\frac{1}{I}\sum_{i=1}^{I} \mathrm{Pr}_{im}\,\phi_i^{k}/\alpha_i$. For heterogeneous choice models parameterized in the space of λ, the MVAI will be given by $\frac{1}{I}\sum_{i=1}^{I} \mathrm{Pr}_{im}\,\beta_i^{k}$.

G.P. Sonnier / Intern. J. of Research in Marketing 31 (2014) 168–177

References

Abernathy, W., & Utterback, J. (1978). Patterns of industrial innovation. Technology Review, 2, 41–47. Ansari, A., & Mela, C. F. (2003). E-customization. Journal of Marketing Research, 40, 131–145. Bayus, B. (1994). Optimal pricing and product development policies for new consumer durables. International Journal of Research in Marketing, 11, 249–259. Cameron, T. A., & James, M. D. (1987). Estimating willingness-to-pay from survey data: An alternative pre-test market evaluation procedure. Journal of Marketing Research, 24, 389–395. Chen, Y., & Iyer, G. (2002). Consumer addressability and customized pricing. Marketing Science, 21, 197–208. Chintagunta, P., Jain, D. C., & Vilcassim, N. J. (1991). Investigating heterogeneity in brand preferences in logit models for panel data. Journal of Marketing Research, 33(4), 417–428. Choudhary, V., Ghose, A., Mukhopadhyay, T., & Rajan, U. (2005). Personalized pricing and quality differentiation. Management Science, 51, 1120–1130. Daly, A. J., Hess, S., & Train, K. E. (2012).
Assuring finite moments for willingness to pay in random coefficients models. Transportation, 39, 19–31. Fader, P., & Hardie, B. (1996). Modeling consumer choice among SKUs. Journal of Marketing Research, 33, 442–452. Fay, S., Mitra, D., & Wang, Q. (2009). Ask or infer? Strategic implications of alternative learning approaches in customization. International Journal of Research in Marketing, 26, 136–152. Gerstner, E., Hess, J. D., & Holthausen, D. M. (1994). Price discrimination through a distribution channel: Theory and evidence. American Economic Review, 84, 1437–1445. Ghose, A., & Huang, K. (2009). Personalized pricing and quality customization. Journal of Economics and Management Strategy, 18, 1095–1135. Gomory, R. (1989). From the ladder of science to the product development cycle. Harvard Business Review, 89, 99–105. Heilman, C. M., Kaefer, F., & Ramenofsky, S. D. (2003). Determining the appropriate amount of data for classifying consumers for direct marketing purposes. Journal of Interactive Marketing, 17, 5–28. Jedidi, K., Jagpal, S., & Manchanda, P. (2003). Measuring heterogeneous reservation prices for product bundles. Marketing Science, 22, 107–130. Khan, R., Lewis, M., & Singh, V. (2009). Dynamic customer management and the value of one-to-one marketing. Marketing Science, 28, 1063–1079. Knox, G., & Eliashberg, J. (2009). The consumer's rent vs. buy decision in the rentailer. International Journal of Research in Marketing, 26, 125–135. Kotler, P., & Keller, K. L. (2006). Marketing management (12e). New Jersey: Pearson Prentice Hall. Liu, Y., & Zhang, J. (2006). The benefits of personalized pricing in a channel. Marketing Science, 25, 97–105. Meijer, E., & Rouwendal, J. (2006). Measuring welfare effects in models with random coefficients. Journal of Applied Econometrics, 21, 227–244. Michalek, J., Ebbes, P., Adigüzel, F., Feinberg, F., & Papalambros, P. (2011).
Enhancing marketing with engineering: Optimal product line decisions for heterogeneous markets. International Journal of Research in Marketing, 28, 1–12. Montgomery, A., & Smith, M. (2009). Prospects for personalization on the internet. Journal of Interactive Marketing, 23, 130–137. Ofek, E., & Srinivasan, V. (2002). How much does the market value an improvement in a product attribute. Marketing Science, 21, 398–411. Rossi, P., McCulloch, R., & Allenby, G. (1996). On the value of household purchase history information in target marketing. Marketing Science, 2, 321–340. Shaffer, G., & Zhang, Z. J. (2002). Competitive one-to-one promotions. Management Science, 48, 1143–1160. Silva-Risso, J., & Ionova, I. (2008). A nested logit model of product and transaction-type choice for planning automakers' pricing and promotions. Marketing Science, 27, 545–566. Sonnier, G., Ainslie, A., & Otter, T. (2007). Heterogeneity distributions of willingness to pay in choice models. Quantitative Marketing and Economics, 5, 313–331. Spiegelhalter, D., Best, N., Carlin, B., & van der Linde, A. (2004). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society B, 64, 583–639. Swait, J., & Louviere, J. (1993). The role of the scale parameter in the estimation and comparison of multinomial logit models. Journal of Marketing Research, 30, 305–314. Train, K. (2003). Discrete choice methods with simulation. Cambridge: Cambridge University Press. Wertenbroch, K., & Skiera, B. (2002). Measuring consumers' willingness to pay at the point of purchase. Journal of Marketing Research, 39, 228–241. Wolk, A., & Ebbling, C. (2010). Multi-channel price differentiation: An empirical investigation of existence and causes. International Journal of Research in Marketing, 27, 142–150. Zhang, J., & Krishnamurthi, L. (2004). Customizing promotions in online stores. Marketing Science, 23, 561–578. Zhang, J., & Wedel, M. (2009). 
The effectiveness of customized promotions in online and offline stores. Journal of Marketing Research, 46, 190–206.

Intern. J. of Research in Marketing 31 (2014) 178–191
journal homepage: www.elsevier.com/locate/ijresmar

Full Length Article

How much to give? — The effect of donation size on tactical and strategic success in cause-related marketing

Sarah S. Müller ⁎, Anne J. Fries, Karen Gedenk 1
University of Hamburg, Max-Brauer-Allee 60, 22765 Hamburg, Germany

Article info
Article history: First received on 7 February 2011 and was under review for 6½ months. Available online 16 October 2013.
Area Editor: Zeynep Gurhan-Canli
Keywords: Cause-related marketing; Donation size; Donation framing; Promotion; Choice experiment

Abstract
In cause-related marketing (CM), companies promise a donation to a cause every time a consumer makes a purchase. We analyze the impact of the size of this donation on brand choice (tactical success) and brand image (strategic success). Our results reveal different effects of donation size on these success measures. For brand choice, the effect of donation size is moderated by a financial trade-off for consumers, whereas the effect on brand image is moderated by donation framing. Specifically, we show that donation size has a positive effect on brand choice if consumers face no financial trade-off; i.e., if they do not have to choose between triggering a donation or saving money. The effect is negative if a trade-off exists such that higher donations come at higher costs. Brand image is enhanced by larger donations if the framing is nonmonetary (e.g., the campaign promises the provision of vaccinations), whereas donation size has a negative effect if donation framing is monetary (e.g., the campaign states the Euro amount). If campaigns use a combination of both frames, the effect of donation size on brand image has an inverted U shape.
Our results suggest that CM enhances tactical and strategic success only if firms select the right donation size, taking into account donation framing and financial trade-offs.

© 2013 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +49 40 42838 7132. E-mail addresses: sarah.mueller@wiso.uni-hamburg.de (S.S. Müller), anne.fries@wiso.uni-hamburg.de (A.J. Fries), karen.gedenk@wiso.uni-hamburg.de (K. Gedenk). 1 Tel.: +49 40 42838 3748.
http://dx.doi.org/10.1016/j.ijresmar.2013.09.005

1. Introduction

In a cause-related marketing (CM) campaign, Tommy Hilfiger featured a promotion in which 50% of the price of a specific bag would be donated to Breast Health International. In another CM promotion, Starbucks donated $1 to the Global Fund to support people living with AIDS in Africa for every pound of East Africa Blend coffee sold. Volvic promoted its “Drink 1, Give 10” campaign in cooperation with UNICEF, stating that for every liter of water sold, the company would provide 10 liters of drinking water in Africa. Procter & Gamble (P&G) promised “1 pack = 1 vaccine” in its CM promotion, in which for every promotional package sold, the company would donate .054€ to UNICEF, equal to the cost of one vaccination against tetanus. In CM campaigns such as these, the firm contributes a specific amount to a cause if a customer buys the firm's product (Varadarajan & Menon, 1988). This transactional element is the main characteristic of CM: The customer must make a purchase to trigger the donation. Corporate sponsorship of social causes has become very frequent, with spending in North America reaching $1.86 billion in 2011 (IEG, 2011). CM is both a tactical tool that firms employ to increase their sales and a strategic activity aimed at improving brand image (Ross, Stutts, & Patterson, 1991). However, whether the investment in CM always pays off is unclear.
On the one hand, by triggering a donation through their purchases, consumers might derive utility from giving, which is known as “warm glow” (Andreoni, 1989), and thus exhibit favorable purchase behaviors. On the other hand, CM might raise consumer skepticism about the company's motivation because the donation is conditional on sales and ensures the company's own benefit (Barone, Miyazaki, & Taylor, 2000). These consumer considerations can negatively impact brand image. Whether positive or negative effects prevail depends on several success factors (Fries, 2010). We study one key success factor, donation size, which is particularly interesting because it is a design element that is directly controlled by managers; i.e., they can decide how much to give when implementing CM. Campaigns vary in their donation sizes as indicated by the introductory examples, in which donations range from 1% of the product's price in the P&G example to 50% of the price in the Tommy Hilfiger campaign. The effect of investing in a larger donation is unclear. On the one hand, consumers may derive more warm glow when donation size increases, which should make them more likely to make a purchase. A larger donation could also produce more favorable evaluations of the brand. On the other hand, consumers who face a CM offer with a substantial donation may prefer to receive this money for themselves or may not believe that the company will really donate as much as promised. Thus, donation size could also have a negative effect on sales and brand image. Previous research has studied the influence of donation size on CM success, but the results are equivocal. Some studies find a positive effect (e.g., Olsen, Pracejus, & Brown, 2003), others a negative one (e.g., Strahilevitz, 1999), and others no effect at all (e.g., Human &
We therefore analyze the effect of donation size on CM success in more depth and extend previous research by focusing on the following three aspects: First, we acknowledge that firms use CM for both tactical and strategic purposes and therefore study two success measures: brand choice and brand image. Previous research has rarely compared these success measures. We expect the effects of donation size on brand choice and brand image to differ because of different underlying drivers. Second, we study two potential moderators of the effect of donation size on CM success that have not been analyzed before: the presence of a financial trade-off and donation framing. A financial trade-off occurs when consumers choose between one brand with a CM campaign and another brand with a price promotion. We expect that such a trade-off moderates the effect of donation size on brand choice. The framing of a CM campaign can be monetary (e.g., 5 cents), nonmonetary (e.g., one vaccination), or a combination of both (e.g., one vaccination, worth 5 cents). We expect framing to moderate the impact of donation size on brand image. Third, we vary our independent variable – donation size – over a wide range and in small intervals, which allows us to test for nonlinear effects. In a large-scale experimental survey, we systematically vary donation size and the potential moderators, and ask respondents to make a brand choice decision and evaluate the image of the focal brand. In an additional exploratory study, we also measure prospective drivers underlying CM success to shed light on the differences between tactical and strategic success. We find that the effect of donation size is different for brand choice (tactical success) versus brand image (strategic success). The effect on brand choice is moderated by the presence of a financial trade-off, and the effect on brand image is moderated by donation framing. 
Furthermore, we find a nonlinear effect of donation size on brand image for a combined monetary and nonmonetary framing. Finally, our exploratory analysis suggests that brand choice is driven by warm glow, whereas brand image mostly depends on what consumers infer about the company's altruism and about the effectiveness of the campaign. Our results have important implications for managers. We show that spending more money on a larger donation does not always produce more favorable effects, but rather donation size has to be chosen carefully, taking into account financial trade-offs and donation framing. Our research contributes to the CM literature by clarifying the effects of donation size: We explain why the effect can be positive, negative, or null. In particular, we detect differences in tactical versus strategic success. Furthermore, we investigate the moderating effects of financial trade-offs and donation framing for the first time and provide new insights into nonlinear effects. We proceed as follows. In Section 2, we review existing research on donation size before presenting our conceptual framework and deriving hypotheses about the effects of donation size on CM success in Section 3. We present the research design of our experimental survey investigating the different effects of donation size in Section 4, and its results in Section 5. To gain insights into the drivers underlying tactical and strategic CM success, we report the data and results of an additional study in Section 6. We conclude by summarizing our work and discussing its implications for both managers and researchers in Section 7. 2. 
Literature review

Much previous research has studied the characteristics of successful CM campaigns (for an overview, see Fries, 2010) and has identified a broad range of success factors, including the characteristics of the cause (e.g., Ross et al., 1991), the company (e.g., Strahilevitz, 2003), the consumer (e.g., Wymer & Samu, 2009), the non-profit organization (NPO) (e.g., Barnes, 1992), the product (e.g., Strahilevitz & Myers, 1998), and the fit among these factors (e.g., Zdravkovic, Magnusson, & Stanley, 2010). A success factor that has been analyzed in several past studies is donation size. As indicated in Table 1, the results of these studies are equivocal, spanning positive (e.g., Dahl & Lavack, 1995; Pracejus et al., 2003/04), negative (e.g., Arora & Henderson, 2007, study 3; Strahilevitz, 1999), and insignificant effects of donation size (e.g., Arora & Henderson, 2007, study 1; Vaidyanathan & Aggarwal, 2005). Although several studies incorporate moderating effects (Table 1), these cannot fully explain the conflicting findings. The (potential) moderators either do not influence the effect of donation size (e.g., promotion size, donation recipient), or they merely affect the strength of a positive or negative effect (e.g., cause involvement, price, product type), but do not change its direction. We suggest three possible reasons why the effect of donation size on CM success can be positive, negative, or null, which have not been studied systematically thus far: differences in tactical versus strategic success, moderating effects, and nonlinear effects. First, companies pursue two main goals with CM: the tactical goal of increasing sales and the strategic goal of improving brand image (Polonsky & Wood, 2001). Previous research on donation size mainly uses sales-related dependent variables such as purchase intention and brand choice to measure tactical success.
Alternatively, a few studies analyze the effects on attitudes towards the brand to capture strategic success. Only Arora and Henderson (2007), Holmes and Kilbane (1993), and Olsen et al. (2003) investigate both types of success measures and find no differences between them. Yet, a more in-depth analysis of the effect of donation size might reveal differences regarding its impact on tactical and strategic success because the underlying drivers of the success measures should be different. Specifically, purchase decisions should be driven mainly by the utility that consumers derive from the campaign, whereas changes in brand image should result mostly from the inferences consumers make about the brand offering the campaign. These distinct underlying drivers should also cause the effect of donation size on brand choice versus brand image to be moderated by different variables, as explained next. Second, two potentially relevant moderators have not been examined so far: the presence of a financial trade-off and donation framing. Some studies on donation size have provided respondents with decision tasks that involve choosing between a CM option, in which the money is donated, and a non-CM option, which offers a price reduction of equal size (e.g., Arora & Henderson, 2007, study 3; Strahilevitz, 1999). In this case, respondents face a financial trade-off: they can either do something good by choosing the CM option or they can gain a financial advantage for themselves by selecting the competitive offer. Other studies have not included such a trade-off. Table 1 reveals that studies with a financial trade-off tend to find that larger donations hurt sales, whereas most studies without a trade-off report that donation size has a positive or no significant effect. These findings suggest that larger donations help only when they come at no increased costs to the consumer. 
This is in line with the assertion of Burnett and Wood (1988) that prosocial behavior depends on the cost of helping; forgoing a price discount could be an important cost. Thus, differences in utility caused by financial trade-offs could explain the equivocal effects of donation size on tactical CM success. To date, the moderating effect of a financial trade-off has not been studied.

Table 1
Research on donation size.

Columns: Study; Donation size†; Financial trade-off; Dependent variable; Result; Moderating effects

Dahl and Lavack (1995) $0.0025 vs. $0.1 No + Promotion size: n.s. Garretson Folse, Niedrich, and Landreth Grau (2010) 0.13–32% of price 1.88–67.5% of price 2.5–40% of price 0–6.8% of price 0–40 cents No No No No No Perceived exploitation of NPO Product appeal CM participation intentions CM participation intentions CM participation intentions Attitude toward ad Willingness to pay Olsen et al. (2003) 5 vs. 40 cents 5 vs. 40 cents 5 vs. 40 cents 1 vs. 10% of price No No No No +/n.s. +/n.s. +/n.s. + Pracejus, Olsen, and Brown (2003/04) Smith and Alcorn (1991) 0–10% of price $0.1; $0.25; $0.4 No No Willingness to pay Willingness to pay Willingness to pay Attitude toward ad Attitude toward brand Purchase intention Brand choice Intention to use coupon Arora and Henderson (2007), study 3 Yes Brand choice − Subrahmanyan (2004) 1 vs. 5% of monthly credit card charge 5 vs. 25% of price 5 vs. 50% of price 1 vs. 25% of price 1 vs. 25% of price 1–20% of price No Yes Yes Yes Yes Behavioral intention Brand choice Brand choice Brand choice Purchase likelihood − − − − − Arora and Henderson (2007), study 1 0–45% of price No n.s. Fries, Gedenk, and Völckner (2010) Holmes and Kilbane (1993) 5 vs. 15% of price 0–6.8% of price No No Human and Terblanche (2012) $0.18 vs. $1.14 No Vaidyanathan and Aggarwal (2005) van den Brink, Odekerken-Schröder, and Pauwels (2006) 6.3 vs. 12.5% of price 0.1 vs. 25% of price Yes/no No Brand choice Purchase likelihood Attitude toward brand Brand choice Attitude toward store Intention to respond Attitude toward cause alliance Attitude toward campaign CM participation intentions Willingness to buy Brand loyalty Holmes and Kilbane (1993) Koschate-Fischer, Stefan, and Hoyer (2012) Chang (2008) Strahilevitz (1999) + + + + + Price: n.s. Attitude toward helping: + Warm glow: + Cause involvement: + Cause organization affinity: + Fit: +/n.s. Fit: +/n.s. Fit: +/n.s. Framing (% of price vs. % of profit): n.s. + + Price: − Product type: + (hedonic) n.s. n.s. Price: n.s. n.s. Donation recipient: n.s. n.s. n.s.

Notes: + = positive effect; − = negative effect; n.s. = no significant effect; † = donation sizes were transformed into % of price when possible.

Another new potential moderator is the framing of the donation in monetary versus nonmonetary terms. So far, two studies have examined donation size and framing. Olsen et al. (2003) compare CM campaigns that present the donation as a percentage of the price versus a percentage of the profit and find no differences in the effect of donation size between the two frames. Chang (2008) shows that expressing a donation in absolute monetary value is more favorable for small donations than a percent of price framing, whereas no difference exists for large donations. Both of these studies compare different monetary frames. However, the examples in our introduction
illustrate that companies use not only such monetary frames (e.g., $1 in the Starbucks example) but also nonmonetary frames in which the donation is presented as a charitable object or service (e.g., 10 l of drinking water in the Volvic example), as well as a combined framing that provides both types of information (e.g., 1 vaccine, worth .054€, in the P&G example). Research on promotions has shown that a promotion's value in relation to the product's price is assessed differently when it is framed in monetary versus nonmonetary terms (e.g., Nunes & Park, 2003; Palazon & Delgado-Ballester, 2009). For CM campaigns, the effect of donation size on brand image could also be affected by monetary versus nonmonetary framing because these frames provide different information that might influence the inferences consumers make about the company. The moderating effect of donation framing in monetary, nonmonetary, or combined terms has not previously been examined. Third, donation size could exert nonlinear effects on CM success. Most previous studies investigate only two different levels of donation size and the range of donation sizes varies across studies. So far, few studies have tested for nonlinear effects. Pracejus et al. (2003/04) find an insignificant quadratic term. However, they only study a range of donation sizes from 0 to 10% of the price, whereas in actual CM campaigns firms donate up to 50% of the price (e.g., Tommy Hilfiger). Koschate-Fischer et al. (2012) also use a quadratic term and report a positive effect of donation size that is concave (i.e., weaker for larger donations). However, their dependent variable is willingness to pay, i.e., they do not vary donation size in relation to the product's price. 
Finally, evidence for the nonlinear effects of donation size appears in the context of charity auctions (Haruvy & Popkowski Leszczyc, 2009), which reveal a negative effect for very large donations but a positive effect when a smaller fraction of the auction's final price is donated. However, whether the same mechanisms apply to both CM campaigns and charity auctions is unclear. More importantly, all three studies analyze the nonlinear effects of donation size for a monetary framing. As we explain in the next section, we expect a nonlinear effect for a combined framing (monetary and nonmonetary), which has not yet been studied.

3. Conceptual framework

3.1. Overview

Fig. 1 depicts our conceptual framework. We analyze the impact of donation size on both brand choice (tactical success) and brand image (strategic success). Furthermore, we consider the presence of a financial trade-off (i.e., non-focal brand on price promotion) as a moderator of the effect on brand choice, and donation framing (i.e., monetary, nonmonetary, combination) as a moderator of the effect on brand image. We expect that the effects of donation size on brand choice versus brand image are different and that different moderators are relevant because we propose that these success measures are affected by different underlying drivers. More specifically, we assume the effect of donation size on brand choice to be driven mostly by the utility that consumers derive for themselves from the CM campaign, whereas the effect on brand image should be driven primarily by what consumers infer about the company.

[Fig. 1. Conceptual framework. Donation size affects tactical CM success (brand choice), moderated by the financial trade-off, and strategic CM success (brand image), moderated by donation framing.]

When consumers make brand choice decisions, they focus on themselves such that the utility that they derive from the campaign is crucial.
Consumers' utility is determined by the benefits and costs of the CM campaign, and a key benefit of a CM campaign is warm glow. Warm glow theory postulates that subjects derive utility from the mere act of giving, which is known as “warm glow” or “moral satisfaction” (Kahneman & Knetsch, 1992). Consumers are thus more likely to choose an option if it provides more warm glow (Andreoni, 1990). Triggering a donation through a product purchase can offer warm glow to consumers (Strahilevitz & Myers, 1998), and the findings of Fries et al. (2010) support the view that warm glow is the main underlying driver of the positive effect of CM on brand choice. Hence, we assume that the choice of the CM product depends primarily on the campaign's utility, which is provided by warm glow, in relation to the costs of engaging in the campaign. The latter should be affected when there is a financial trade-off for consumers, i.e., when selecting the CM option comes at the cost of foregoing savings for oneself. When consumers assess brand image, they focus on the company, so the inferences that they derive about the brand from the CM campaign are crucial. Information integration theory (Anderson, 1981) suggests that new information is incorporated into prior attitudes, resulting in updated attitudes that reflect how the stimulus is evaluated. Thus, how consumers evaluate the company's engagement should affect its impact on brand image. We consider two aspects of this evaluation as the main drivers of brand image: perceived altruism and perceived effectiveness. Perceived altruism captures the degree to which consumers perceive the company to be motivated by a genuine interest in supporting the charitable cause. Perceived effectiveness is the degree to which consumers believe that the company will really donate as much as promised and that this donation will actually reach the needy recipients.
Perceived altruism and perceived effectiveness have been studied as drivers of the effect of CM on brand choice and have been found to be less influential than warm glow (Fries et al., 2010). We assume that they are more important as drivers of the effect of CM on brand image because CM should positively affect brand image only if consumers attribute altruistic motives to the company's efforts and believe the promises stated in the campaign. In the remainder of this section, we build on these underlying drivers when we derive our hypotheses about how donation size affects CM success. 3.2. Brand choice hypotheses We expect the effect of donation size on brand choice to be moderated by the presence of a financial trade-off because brand choice is driven by warm glow and the cost of giving. We do not predict a moderating effect of donation framing for brand choice. Instead, the mere act of triggering a donation through one's purchase should induce warm glow regardless of the framing of the donation. This is in line with the notion that when consumers contribute to a cause, they are satisfied by the fact that something will be done without requiring detailed information (Kahneman, Ritov, Jacowitz, & Grant, 1993). Warm glow is an increasing function of what is given (Andreoni, 1989). Accordingly, without a financial trade-off, i.e., when consumers do not have to choose between doing good and saving money, a higher donation induces no increased costs to consumers and warm glow and utility should thus increase if the donation rises. We therefore propose: H1a. The effect of donation size on brand choice will be positive, if the consumer faces no financial trade-off. In contrast, when consumers choose between a brand with a CM campaign and another brand with a lower price, they face a trade-off and buying the CM brand comes at a cost. 
Donors are price-sensitive, and their likelihood of helping decreases as the cost of helping increases (Burnett & Wood, 1988; Eckel & Grossman, 2003). We expect that this increase in costs outweighs the increase in warm glow for larger donations. This is supported by the previous research summarized in Table 1: almost all studies in which consumers face a financial trade-off find a negative effect of donation size on purchase behavior. Hence, we hypothesize:

H1b. The effect of donation size on brand choice will be negative, if the consumer faces a financial trade-off.

3.3. Brand image hypotheses

We expect that the effect of donation size on brand image is moderated by donation framing because brand image is driven primarily by what consumers infer about the brand, i.e., by perceived altruism and perceived effectiveness. We do not predict a financial trade-off to be a moderator in this context because a financial trade-off should affect consumers' utility, but not inferences about the brand. When donations are framed in monetary terms, larger donations are likely to decrease perceived effectiveness. With larger monetary donations, consumers may become skeptical that the company will really donate this much money and that the complete amount will reach the needy recipients. Similar effects have been shown for price promotions, where consumers do not believe that discounts are really as large as advertised. More specifically, consumers discount price discounts and do so increasingly as promised savings rise (Gupta & Cooper, 1992). A similar effect is likely to occur for CM campaigns with monetary donations such that consumers may assume that the actual donation will be lower than advertised and will therefore increasingly discount the advertised donation, which lowers perceived effectiveness as donations become larger. Hence, we posit:

H2a. The effect of donation size on brand image will be negative, if donation framing is monetary.

A nonmonetary frame emphasizes the output achieved with the donation and the good the company does. Because consumers typically cannot assess the monetary value of public goods (Green, Kahneman, & Kunreuther, 1994), they most likely cannot judge the value of donations that are expressed in nonmonetary terms, such that perceived effectiveness is not affected. However, nonmonetary donations are perceived to require more effort from companies than monetary donations (Ellen, Mohr, & Webb, 2000), such that this frame should indicate more sincere company motives, which is reflected in perceived altruism. Hence, when companies make larger donations, this should be perceived as more effort and result in stronger perceived altruism and thus better brand image. We therefore expect the following:

H2b. The effect of donation size on brand image will be positive, if donation framing is nonmonetary.

Finally, combined frames include both monetary and nonmonetary information and emphasize not only how much money the company donates but also the achieved output. In this case, opposing forces are at work: On the one hand, larger monetary donations decrease perceived effectiveness; on the other hand, larger nonmonetary donations increase perceived altruism. To derive an effect of donation size on brand image for a combined frame, we consider that this frame provides maximum transparency, which should affect perceived effectiveness. Consumers prefer transparency in donation framing (Landreth Grau, Garretson Folse, & Pirsch, 2007). Hence, in the case of small to medium donations, larger donations should not negatively impact perceived effectiveness because the combined frame reveals not only the monetary amount but also the output.
Here, the dominant effect should be that consumers perceive a larger donation as more charitable effort by the company, thereby enhancing perceived altruism such that brand image becomes more favorable with rising donations. However, this only works up to a certain donation level because in the case of a very high monetary amount, consumers might again become skeptical about perceived effectiveness. After this point, we suppose that the effect of donation size on brand image is dominated by these negative inferences and becomes negative. In summary, we expect an inverted U-shaped effect and predict:

H2c. The effect of donation size on brand image will follow an inverted U shape, if donation framing combines monetary and nonmonetary information.

Table 2 summarizes our hypotheses.

Table 2
Hypotheses.

| CM success measure | Hypothesis | Effect of donation size |
|---|---|---|
| Brand choice | H1a | +, if no financial trade-off |
| Brand choice | H1b | −, if financial trade-off |
| Brand image | H2a | −, if donation framing monetary |
| Brand image | H2b | +, if donation framing nonmonetary |
| Brand image | H2c | ∩, if donation framing combined |

Notes: + = positive effect; − = negative effect; ∩ = inverted U-shaped effect.

4. Research design

To test our hypotheses, we conducted a between-subjects experiment based on a large-scale survey. Different groups of respondents considered different CM campaigns, made brand choice decisions, and assessed the image of the CM brand.

4.1. Stimuli

We constructed choice sets with two brands per product category. Participants read the following scenario: “You would like to buy product category X. You can choose between Brand A and Brand B. With regard to your choice, only the brand (Brand A/Brand B) is relevant, not the depicted flavor.” We presented photos of the two brands and information about their prices and product sizes. In the control condition, both brands appeared without a promotion. In the treatment conditions, one brand offered a CM campaign, and the other did not.
In each product category, the CM campaign was always tied to the same focal brand.2 The design of the CM campaign varied across treatment groups. For the treatments with a CM offer and a financial trade-off, the competitive brand offered a price discount of the same size as the donation. We investigated four product categories: chocolate bars, toothpaste, beer, and detergent. All products were fast moving consumer goods (FMCG) that varied in their price levels and degrees of utilitarianism and hedonism to ensure robust results. For each product category, we chose two well-known national brands and presented products that were identical in price, size, and flavor. The prices reflected average prices found in major German supermarkets at the time of our study. We list the brands, sizes, and prices used in the study in Appendix A. We employed the same NPO and cause for all product categories: all CM campaigns promised a donation to SOS Children's Villages to support immunization against tetanus. This well-known charity enjoys a very good reputation (German Fundraising Association, 2009), and immunization against tetanus represents an important and uncontroversial cause. Tetanus remains a major risk in countries with low immunization rates (WHO & UNICEF, 2010). We systematically varied three experimental factors, as specified in Table 3. The donation size manipulation included eight levels: 1, 2.5, 5, 10, 20, 30, 40, and 50% of product price.3 We used these percentages to calculate the respective donation amount in Euros and the number of vaccinations. The factor donation framing had three levels: In the monetary frame, the donation amount was presented in Euros, such as .20€. In the nonmonetary frame, the donation was stated as the number of vaccinations, such as four vaccinations. We used a price of 5 cents per vaccination to translate monetary into nonmonetary donations (WHO & UNICEF, 2010). 
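The translation from donation size to the monetary and nonmonetary wording can be illustrated with a short sketch. It only assumes what the text states (donation as a percentage of price, 5 cents per vaccination); the product price of 2.00€ and all function names are hypothetical, chosen for illustration, since the actual prices appear in Appendix A.

```python
from fractions import Fraction

# Price per tetanus vaccination used in the study (5 cents).
CENTS_PER_VACCINATION = 5

# The eight donation-size levels, in percent of product price.
DONATION_LEVELS_PCT = ["1", "2.5", "5", "10", "20", "30", "40", "50"]


def donation_in_cents(price_cents, pct):
    """Donation amount in cents for a donation of `pct` percent of the price."""
    return Fraction(price_cents) * Fraction(pct) / 100


def vaccinations(donation_cents):
    """Number of vaccinations financed by a donation given in cents."""
    return Fraction(donation_cents) / CENTS_PER_VACCINATION


def describe(price_cents, pct):
    """Return the monetary, nonmonetary, and combined wording for one level."""
    cents = donation_in_cents(price_cents, pct)
    n_vacc = vaccinations(cents)
    euros = f"{float(cents) / 100:.2f} EUR"
    vacc = f"{float(n_vacc):g} vaccinations"
    return {"monetary": euros, "nonmonetary": vacc, "combined": f"{euros} ({vacc})"}


# Hypothetical product priced at 2.00 EUR (an assumption, not a study price).
for pct in DONATION_LEVELS_PCT:
    print(pct, describe(200, pct))
```

Exact fractions are used so that small donations, such as 2 cents corresponding to 0.4 vaccinations, are not distorted by floating-point rounding before formatting.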
The combined frame included both the monetary amount in Euros and the equivalent number of vaccinations. All three frames were presented without a financial trade-off. In these treatments, the CM and the competitive brand were priced equally. We also combined the monetary frame with a financial trade-off because this frame supports an equal framing of donation size and savings. In these treatments, the competitive brand offered a price discount equal in size to the donation promised by the CM brand (Arora & Henderson, 2007; Strahilevitz, 1999; Vaidyanathan & Aggarwal, 2005). We provide an example stimulus for a monetary CM campaign and equivalent competitive price promotion in Appendix B. In our experimental set-up, the three frames without financial trade-offs and the monetary frame with the competitive price promotion were combined with the eight levels of donation size. We also included a control group, such that we tested 33 conditions between-subjects.

Table 3
Experimental factors.

| Factor | Level | Realization |
|---|---|---|
| Donation size | 1/2.5/5/10/20/30/40/50% of product price | Converted into Euro amount and/or number of vaccinations |
| Donation framing | Monetary | Amount in Euros |
| Donation framing | Nonmonetary | Number of vaccinations |
| Donation framing | Combined | Amount in Euros and equivalent number of vaccinations |
| Financial trade-off | Present (only in combination with monetary frame) | Competitive brand offers price discount of equal size as donation by CM brand |
| Financial trade-off | Not present | No competitive promotion |

2 Across categories, there is variance in whether the CM brand has a larger purchase frequency than the non-focal brand (measured as number among the last three category purchases before the survey), a smaller or a similar purchase frequency.

3 A systematic research of CM campaigns in Germany revealed 50% as the maximum donation size.

4.2. Procedure

Subjects were randomly assigned to one of the 33 conditions. They assessed up to four product categories, although they answered questions for a category only if they had made a purchase in that category at least once during the previous year. This filter increases the response quality because subjects who are familiar with a category give more valid and reliable answers (Alba & Hutchinson, 1987). The categories appeared in the same order for all respondents. For each participant, the experimental treatment was kept constant across all categories.

To measure brand awareness, the respondents first indicated whether they were familiar with the two brands in each product category. They then stated their brand preferences by specifying the number of times they had bought the two brands in their last three purchases in the category. In line with previous research (e.g., Bouten, Snelders, & Hultink, 2011; Simonin & Ruth, 1998), we assessed brand awareness and preferences before the experimental manipulation to prevent any influence that the stimulus might have on these measures.⁴ After the presentation of the stimulus, the respondents made their brand choice decision and then revealed their brand image assessments for the CM brand.⁵ Finally, we asked about demographic characteristics.

4 These measurements might make initial preferences more salient and thus affect the measures of our dependent variables. If this were indeed the case, it would make our hypothesis tests conservative because the experimental treatments would have less of an effect.

5 We measured brand choice before brand image because we did not want the choice decision to be biased by consumers' elaboration on the focal CM brand. To test if this order causes a bias in the measurement of brand image, we counterbalanced the order of the dependent measures for one experimental treatment in our second study. We found no evidence that our results for brand choice and brand image were affected by the order of these two measures.

4.3. Measures of CM success

We used brand choice to measure tactical success and brand image to capture strategic success. For each choice set, respondents indicated which of the two brands they would rather buy. Next, respondents evaluated the image of the CM brand on six seven-point semantic scales (Völckner, Sattler, & Kaufmann, 2008; see Appendix C).

4.4. Sample

We sent the questionnaire to participants in an online access panel in Germany. As an incentive, respondents could participate in a drawing to win Amazon gift cards. Of the 1446 respondents who answered the questionnaire between December 2008 and January 2009, 85 were excluded from the analysis due to missing values. The final data set contains 1361 complete observations. Of the respondents, 47% are women. On average, participants are 34 years of age and live in households with 2.3 people. The majority has a monthly net household income ranging from 1000€ to 2000€. More than half (61%) are members of a church, and 25% have children. Finally, 46% of the respondents are employed, and 35% are students.

We find significant differences across the 33 experimental groups with respect to occupation, church membership, and parenthood (p < .05). Therefore, we include these demographics in our models as control variables. No significant differences emerge for other consumer demographics (i.e., age, gender, household size, income), previous donation behavior, or brand-related variables such as brand awareness and brand preference (p > .10). Brand awareness rates are greater than 92% for all brands used in the study, confirming that we selected well-known brands. For the analyses, we pool the data across the four product categories, resulting in a total of 4686 observations, with a minimum of 93 observations per experimental group.

4.5. Models

For brand choice, we estimate the following binary logit model, which includes donation size and the moderators as concomitant variables:

\[ P_{hc} = \frac{1}{1 + e^{-V_{hc}}}, \tag{1} \]

\[ V_{hc} = \sum_{c=1}^{C} \alpha_{hc} \cdot \mathit{CAT}_c + \sum_{c=1}^{C} \beta_c \cdot \mathit{PREPREF}_{hc} \cdot \mathit{CAT}_c + \delta_h \cdot \mathit{CM}_h, \quad \text{and} \tag{2} \]

\[ \delta_h = \gamma_h + \sum_{k=1}^{K} \eta_k \cdot \mathit{DEMO}_{kh} + \sum_{s=1}^{S} \lambda_{sh} \cdot X_{sh}, \tag{3} \]

where

$P_{hc}$: Probability that subject h chooses the CM brand in category c,
$V_{hc}$: Systematic utility of the CM brand for subject h in category c,
$\mathit{CAT}_c$: Category indicator (1 if product category c, and 0 otherwise),
$\mathit{PREPREF}_{hc}$: Stated brand preference of subject h in category c (= number of times the CM brand was bought during the last three purchases before the survey, minus the number of times the other brand was bought on these purchases),
$\mathit{CM}_h$: CM indicator (1 if subject h sees a CM campaign, and 0 otherwise),
$\mathit{DEMO}_{kh}$: Demographic variable k for subject h, and
$X_{sh}$: CM-related concomitant variable s for subject h.

In the utility function (Eq. (2)), we include category-specific intercepts and control for stated preference heterogeneity with a PREPREF variable for each category (Ailawadi, Gedenk, & Neslin, 1999; Horsky, Misra, & Nelson, 2006). The parameter $\delta_h$ captures the effect of a CM campaign on utility. It differs across subjects because different respondents receive different experimental treatments, as described by the concomitant variables $X_{sh}$, and because response is heterogeneous. We control for demographic variables that vary between the experimental groups (Section 4.4). Finally, we use a continuous mixture model to capture unobserved heterogeneity in all parameters except those for PREPREF and the demographics, which are household-specific. We assume that all heterogeneous parameters follow normal distributions and estimate their means and standard deviations.

For brand image, we estimate the following linear regression model with the same independent and concomitant variables:

\[ \mathit{BIMAGE}_{hc} = \sum_{c=1}^{C} \alpha_{hc} \cdot \mathit{CAT}_c + \sum_{c=1}^{C} \beta_c \cdot \mathit{PREPREF}_{hc} \cdot \mathit{CAT}_c + \delta_h \cdot \mathit{CM}_h, \tag{4} \]

where

$\mathit{BIMAGE}_{hc}$: Image of subject h of the CM brand in category c.

The models for both brand choice and brand image are developed in consecutive steps. Starting with demographics, we add the concomitant variables $X_{sh}$ stepwise to check for model improvements from adding moderating and nonlinear effects. The models are nested, as outlined in Table 4. Appendix D summarizes the operationalization of the independent variables. Model 1, with the main effects of donation size, donation framing, and a financial trade-off, is our base model (Jaccard, 2001; Jaccard, Turrisi, & Wan, 1990). In Model 2, we add interactions of donation size with donation framing and a financial trade-off. With Model 2 we can test our hypotheses about the moderating effect of a financial trade-off (H1a and H1b) and donation framing (H2a and H2b).
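To make the continuous mixture specification of the brand choice model more concrete, the following is a minimal sketch of how a choice probability can be simulated when the CM intercept and the concomitant-variable coefficients are drawn from normal distributions. It is an illustration under simplifying assumptions, not the authors' estimation code: the category intercept is held fixed here, the function names are hypothetical, and the numbers in the usage lines are made up.

```python
import math
import random


def logit_probability(v):
    """Eq. (1): probability of choosing the CM brand given systematic utility v."""
    return 1.0 / (1.0 + math.exp(-v))


def cm_effect(gamma, eta, demo, lam, x):
    """Eq. (3): gamma_h + sum_k eta_k * DEMO_kh + sum_s lambda_sh * X_sh."""
    return (gamma
            + sum(e * d for e, d in zip(eta, demo))
            + sum(l * xs for l, xs in zip(lam, x)))


def utility(alpha_c, beta_c, prepref, delta, cm):
    """Eq. (2) for a single category c (CAT_c = 1 there and 0 elsewhere)."""
    return alpha_c + beta_c * prepref + delta * cm


def simulated_probability(alpha_c, beta_c, prepref, cm,
                          gamma_mean, gamma_sd, eta, demo,
                          lam_mean, lam_sd, x, draws=2000, seed=0):
    """Average the logit probability over normal draws of gamma and lambda."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        gamma = rng.gauss(gamma_mean, gamma_sd)
        lam = [rng.gauss(m, s) for m, s in zip(lam_mean, lam_sd)]
        delta = cm_effect(gamma, eta, demo, lam, x)
        total += logit_probability(utility(alpha_c, beta_c, prepref, delta, cm))
    return total / draws


# Made-up inputs: one demographic dummy and one concomitant variable (donation size).
p_fixed = simulated_probability(0.0, 1.0, 0, 1, 0.5, 0.0, [0.1], [1], [0.01], [0.0], [10])
p_mixed = simulated_probability(0.0, 1.0, 0, 1, 0.5, 1.0, [0.1], [1], [0.01], [0.005], [10])
print(p_fixed, p_mixed)
```

With all standard deviations set to zero the simulated probability collapses to the plain logit probability of Eq. (1), which is a convenient sanity check for this kind of code.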
To examine the nonlinear effects of donation size (H2c), we incorporate quadratic terms in Models 3 and 4.⁶ To provide evidence for the significance of the interaction effects (Jaccard, 2001; Jaccard et al., 1990), we first include a quadratic term for donation size (Model 3), and then add quadratic terms for the interactions with donation framing and a financial trade-off (Model 4). Although our hypotheses do not feature all possible moderating and nonlinear effects for both brand choice and brand image, we estimate all four models with both dependent variables to ensure that we do not miss any effects. We estimate our models with simulated maximum likelihood (Train, 2009) using the MAXLIK module in GAUSS.⁷

We test whether pooling across the product categories is appropriate using likelihood ratio tests for the logit models of brand choice and Chow tests for the regression models of brand image. In the Chow tests, the improvement in model fit when we move from a pooled model to four separate models is not significant for any of the models (p > .05). In the likelihood ratio tests, no fit improvement is significant at the 1% level; for Models 2 and 3, the improvement is significant at the 5% level. However, with more than 4000 observations, even small differences tend to be significant, and we find no substantive differences for the effects of our experimental variables across the four categories. Thus, we consider pooling to be appropriate.

5. Results

5.1. Brand choice

Table 5 contains the fit measures for our four brand choice models. Because the models are nested, we use likelihood ratio tests to determine whether more comprehensive models offer a significant improvement over simpler ones. Model 1 includes the main effects of our experimental manipulations of donation size, donation framing and a financial trade-off. In Model 2, we add the moderating effects of donation framing and a financial trade-off to the effect of donation size.
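The nested-model comparison can be sketched in a few lines. The log likelihoods below are the brand choice values reported in Table 5; the degrees of freedom (6, 2, and 6) are an assumption made for this sketch, on the reasoning that each added concomitant variable contributes a mean and a standard deviation. The chi-square tail probability uses a closed form that holds only for even degrees of freedom, which avoids any third-party dependency.

```python
import math


def lr_statistic(loglik_restricted, loglik_full):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Brand choice log likelihoods as reported in Table 5.
LOGLIK = {1: -2360.015, 2: -2318.506, 3: -2317.138, 4: -2314.786}

# Assumed degrees of freedom per comparison (not stated in the text).
ASSUMED_DF = {(1, 2): 6, (2, 3): 2, (3, 4): 6}

results = {}
for (small, large), df in ASSUMED_DF.items():
    stat = lr_statistic(LOGLIK[small], LOGLIK[large])
    results[(small, large)] = (stat, chi2_sf_even_df(stat, df))
    print(f"Model {large} vs. Model {small}: chi2 = {stat:.3f}, p = {results[(small, large)][1]:.3f}")
```

Under these assumed degrees of freedom, the statistics and p values come out close to those listed in Table 5, which suggests the assumption is at least plausible.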
The likelihood ratio test shows that model fit improves significantly, indicating that the effect of donation size on brand choice is moderated. In Models 3 and 4, we add quadratic terms to capture the nonlinear effects of donation size but find no significant improvements. This is in line with our predictions: we expected nonlinear effects for brand image but not for brand choice.

We rely on Model 2 to test our brand choice hypotheses and present its parameter estimates in Table 6. All coefficients for the control variables exhibit plausible signs. The positive PREPREF coefficients (p < .01) indicate that consumers are more likely to choose the CM brand when they preferred it over the competitive brand in their recent purchases. Respondents with children react more favorably to CM campaigns (p < .01), in line with previous research (Ross, Patterson, & Stutts, 1992). Respondents who did not indicate whether they were church members also react more favorably to CM, but this effect is only weakly significant (p < .10).

6 We also tested for thresholds by allowing the coefficient of donation size to be different below and above a threshold in our Model 2. We inserted thresholds at donation sizes of 10 and 30%, which are common for price promotions (e.g., van Heerde, Leeflang, & Wittink, 2001). However, none of these thresholds improved model fit, neither for brand choice nor for brand image (p > .10). Details are available from the authors upon request.

7 We rescaled donation size (by dividing it by 100) in the brand image models to facilitate the estimation (Ailawadi, Gedenk, Lutzky, & Neslin, 2007).

Table 4
Model specification.

| Variables | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Category variables | | | | |
| CAT_CHOCOLATE | ✓ | ✓ | ✓ | ✓ |
| CAT_TOOTHPASTE | ✓ | ✓ | ✓ | ✓ |
| CAT_BEER | ✓ | ✓ | ✓ | ✓ |
| CAT_DETERGENT | ✓ | ✓ | ✓ | ✓ |
| PREPREF_CHOCOLATE | ✓ | ✓ | ✓ | ✓ |
| PREPREF_TOOTHPASTE | ✓ | ✓ | ✓ | ✓ |
| PREPREF_BEER | ✓ | ✓ | ✓ | ✓ |
| PREPREF_DETERGENT | ✓ | ✓ | ✓ | ✓ |
| Demographic variables | | | | |
| Child_YES × CM | ✓ | ✓ | ✓ | ✓ |
| Church membership_YES × CM | ✓ | ✓ | ✓ | ✓ |
| Church membership_NO_RES × CM | ✓ | ✓ | ✓ | ✓ |
| Occupation_FULL × CM | ✓ | ✓ | ✓ | ✓ |
| Occupation_PART × CM | ✓ | ✓ | ✓ | ✓ |
| CM variables | | | | |
| CM | ✓ | ✓ | ✓ | ✓ |
| DONSIZ | ✓ | ✓ | ✓ | ✓ |
| NONMON | ✓ | ✓ | ✓ | ✓ |
| COMBI | ✓ | ✓ | ✓ | ✓ |
| TRADE-OFF | ✓ | ✓ | ✓ | ✓ |
| DONSIZ × NONMON | | ✓ | ✓ | ✓ |
| DONSIZ × COMBI | | ✓ | ✓ | ✓ |
| DONSIZ × TRADE-OFF | | ✓ | ✓ | ✓ |
| Nonlinear effects | | | | |
| DONSIZ² | | | ✓ | ✓ |
| DONSIZ² × COMBI | | | | ✓ |
| DONSIZ² × NONMON | | | | ✓ |
| DONSIZ² × TRADE-OFF | | | | ✓ |

Notes: × = interaction effect; ✓ = included in the model.

Regarding our first hypothesis, we find that the presence of a financial trade-off moderates the effect of donation size, as indicated by λ7. In support of H1a, donation size has a positive effect on brand choice when there is no financial trade-off, according to the significant and positive λ1, which captures the effect of donation size for a monetary frame without a financial trade-off. As expected, we find no differences in the effect of donation size across the three frames without financial trade-offs; neither λ5 nor λ6 is significantly different from zero. To formally test H1a for the nonmonetary and combined frames, we test whether the sums of the respective coefficients (λ1 + λ5 and λ1 + λ6) differ from zero using a Wald test (Greene, 2008). We find a weakly significant positive effect of donation size for the combined frame (p < .10), but for the nonmonetary frame, the effect is only close to significance (p = .11). Thus, the results support H1a for the two frames with a monetary component and without a financial trade-off.

The effect of donation size on brand choice is negative when consumers face a financial trade-off: A Wald test shows that the sum of the coefficients λ1 and λ7 is significantly negative (p < .01), thereby supporting H1b.
To provide a sense of the strength of the effects of donation size on brand choice, we simulate the changes in brand choice probability for different donation sizes and frames. For the simulation, we use the estimated parameter means from Model 2. We assume that a consumer chooses between two brands that are equally preferred (category dummies and PREPREF equal zero). For the demographic variables, we use the most frequent levels (i.e., no children, church member, full-time occupation). We present the simulation results in Fig. 2.

Fig. 2 reveals that the positive impact of donation size for frames without financial trade-offs is moderate. For example, with a monetary frame, a campaign with a donation of 1% of the price increases brand choice probability by 14.3 percentage points (from 50% without a campaign to 64.3%). Increasing the donation size to 20% of the price earns the firm another 5.6 percentage points in brand choice probability, which is unlikely to offset the loss in margin. For a donation of 50% of the price, brand choice probability increases to 77.8%.

Table 5
Model fit and improvement.

| | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Brand choice | | | | |
| Log likelihood | −2360.015 | −2318.506 | −2317.138 | −2314.786 |
| Likelihood ratio test: reference model | | Model 1 | Model 2 | Model 3 |
| Chi² (p) | | 83.017 (<.001) | 2.737 (.255) | 4.705 (.582) |
| Brand image | | | | Model 4* |
| Log likelihood | −6329.615 | −6317.384 | −6316.166 | −6312.979 |
| Likelihood ratio test: reference model | | Model 1 | Model 2 | Model 3 |
| Chi² (p) | | 28.210 (<.001) | 2.437 (.296) | 6.373 (.041) |

Notes: N = 4,686. p values are printed in bold if p < .10.

For a monetary frame with a financial trade-off, brand choice probability is 70.4% for a 1% donation but falls to 6.7% for a 50% donation and an equal competitive discount. Here, the negative effect of donation size on brand choice is substantial.
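The simulation logic can be approximated with the published parameter means. The sketch below is a reading of the Model 2 brand choice column of Table 6, not the authors' code: it assumes that the baseline consumer (church member, full-time occupation, no children) receives the corresponding demographic coefficients as part of the CM effect, and the dictionary keys and function names are hypothetical. Because the table reports rounded coefficients, the results match the reported percentages only approximately.

```python
import math

# Brand choice Model 2 parameter means, as read from Table 6 (rounded values).
COEF = {
    "gamma": 0.423,              # CM constant
    "church_yes": 0.103,         # eta_2, assumed part of the baseline consumer
    "occupation_full": 0.051,    # eta_4, assumed part of the baseline consumer
    "donsiz": 0.014,             # lambda_1
    "nonmon": 0.273,             # lambda_2
    "combi": 0.148,              # lambda_3
    "tradeoff": 0.362,           # lambda_4
    "donsiz_x_nonmon": -0.002,   # lambda_5
    "donsiz_x_combi": -0.002,    # lambda_6
    "donsiz_x_tradeoff": -0.085, # lambda_7
}


def intercept_and_slope(frame="monetary", tradeoff=False):
    """Utility intercept and donation-size slope for one treatment."""
    intercept = COEF["gamma"] + COEF["church_yes"] + COEF["occupation_full"]
    slope = COEF["donsiz"]
    if frame == "nonmonetary":
        intercept += COEF["nonmon"]
        slope += COEF["donsiz_x_nonmon"]
    elif frame == "combined":
        intercept += COEF["combi"]
        slope += COEF["donsiz_x_combi"]
    if tradeoff:
        intercept += COEF["tradeoff"]
        slope += COEF["donsiz_x_tradeoff"]
    return intercept, slope


def choice_probability(donation_pct, frame="monetary", tradeoff=False):
    """Probability of choosing the CM brand; 0.5 corresponds to no campaign."""
    intercept, slope = intercept_and_slope(frame, tradeoff)
    return 1.0 / (1.0 + math.exp(-(intercept + slope * donation_pct)))


def break_even_donation():
    """Donation size (% of price) at which CM and an equal discount tie."""
    intercept, slope = intercept_and_slope("monetary", tradeoff=True)
    return -intercept / slope


for pct in (1, 20, 50):
    print(pct, round(choice_probability(pct), 3), round(choice_probability(pct, tradeoff=True), 3))
print("break-even:", round(break_even_donation(), 1))
```

The break-even helper solves for the donation size at which the CM utility is zero under a financial trade-off, that is, where the choice probability returns to 50%.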
The simulation demonstrates that for donations of up to 13.1% of the price, consumers prefer a CM campaign over a price reduction of the same size, but when donations increase further, CM can no longer compete with an equivalent competitive price promotion. From this point on, consumers are more attracted by the competing firm's discount than by the focal firm's CM campaign. The finding that consumers would rather have the money for themselves than donate it to a cause is in line with research on willingness to pay for ethical products, which shows that consumers are willing to pay only a limited premium for social attributes (Auger, Devinney, Louviere, & Burke, 2008).

Table 6
Parameter estimates for brand choice and brand image models (standard errors in parentheses).

| Parameter | Independent variable | Brand choice, Model 2: Mean | Brand choice, Model 2: SD | Brand image, Model 2: Mean | Brand image, Model 2: SD | Brand image, Model 4*: Mean | Brand image, Model 4*: SD |
|---|---|---|---|---|---|---|---|
| | Category variables† | | | | | | |
| β1 | PREPREF_CHOCOLATE | 1.752*** (.317) | | .204*** (.021) | | .203*** (.021) | |
| β2 | PREPREF_TOOTHPASTE | 1.295*** (.213) | | .195*** (.016) | | .197*** (.016) | |
| β3 | PREPREF_BEER | 1.791*** (.270) | | .395*** (.030) | | .396*** (.031) | |
| β4 | PREPREF_DETERGENT | .921*** (.082) | | .230*** (.019) | | .231*** (.019) | |
| | Concomitant variables†† | | | | | | |
| γ | Constant | .423 (.311) | .832** (.293) | .124 (.123) | .101 (.308) | .156 (.134) | .126 (.104) |
| η1 | Child_YES | .352*** (.135) | | .080* (.048) | | .070 (.047) | |
| η2 | Church membership_YES | .103 (.122) | | .007 (.044) | | .014 (.043) | |
| η3 | Church membership_NO_RES | .580* (.308) | | .064 (.106) | | .052 (.105) | |
| η4 | Occupation_FULL | .051 (.138) | | −.024 (.049) | | −.027 (.049) | |
| η5 | Occupation_PART | .093 (.163) | | −.043 (.058) | | −.045 (.058) | |
| λ1 | DONSIZ | .014** (.006) | .005 (.014) | −.540** (.218) | .500* (.258) | −1.203** (.581) | .253 (.377) |
| λ2 | NONMON | .273 (.235) | .219 (1.224) | −.101 (.086) | .298*** (.091) | −.092 (.090) | .230** (.113) |
| λ3 | COMBI | .148 (.225) | .267 (.703) | −.176** (.080) | .051 (.105) | −.301** (.105) | .011 (.114) |
| λ4 | TRADE-OFF | .362 (.255) | 1.311*** (.332) | −.026 (.078) | .374*** (.080) | −.032 (.101) | .332*** (.096) |
| λ5 | DONSIZ × NONMON | −.002 (.009) | .001 (.019) | 1.122*** (.348) | .162 (.966) | 1.105** (.371) | .488 (.578) |
| λ6 | DONSIZ × COMBI | −.002 (.009) | .009 (.018) | 1.314*** (.320) | .721** (.316) | 3.525** (1.097) | .613 (.509) |
| λ7 | DONSIZ × TRADE-OFF | −.085*** (.014) | .020 (.017) | .027 (.306) | .010 (.434) | .021 (.438) | .196 (.323) |
| λ8 | DONSIZ² | | | | | 1.413 (1.096) | .396 (.763) |
| λ9 | DONSIZ² × COMBI | | | | | −4.703** (2.185) | .957 (1.536) |

Notes: N = 4686; * p < .10; ** p < .05; *** p < .01 (two-sided); SD = Standard deviation; Standard errors in parentheses; † Category constants available upon request; †† Donation size rescaled (divided by 100) for brand image models.

Fig. 2. Change in choice probabilities through a CM campaign. [Line chart; x-axis: donation size (% of price), 0 to 50; y-axis: Δ choice probabilities, −0.5 to 0.4; series: nonmonetary, combined, monetary, and monetary with financial trade-off.]

In summary, our results suggest that the moderating effect of a financial trade-off can explain most of the equivocal findings on the effect of donation size on tactical CM success in previous research (Table 1). We find that the effect of donation size is positive if consumers face no financial trade-off, but becomes negative when larger donations induce higher costs to consumers.

5.2. Brand image

Table 5 presents model fit for the regression models with brand image as the dependent variable. We use likelihood ratio tests to test for improvements in fit in our hierarchy of nested models. Model 1 includes demographics and the three experimental factors as concomitant variables. In Model 2, we add interaction effects of donation size with donation framing and a financial trade-off, and find that model fit improves significantly. Thus, the moderators affect the impact of donation size on brand image. Next, we add nonlinear effects of donation size, but the incorporation of a quadratic term for donation size in Model 3 does not improve model fit.
Hence, the effect of donation size on brand image is not nonlinear per se. Model 4 includes quadratic terms for the interactions of donation size with donation framing and a financial trade-off. In Model 4, though, we encounter problems with multicollinearity; thus, we exclude the quadratic terms DONSIZ² × NONMON and DONSIZ² × TRADE-OFF from the model.⁸ The reduced Model 4* represents a significant improvement over Model 3. We therefore use Models 2 and 4* to test our hypotheses and list their parameter estimates in Table 6.

8 This modification does not limit our insights. In several alternative models, we find no significant parameters for the terms we exclude, and the nonlinear effect of donation size for a combination frame remains stable.

Again, the parameters for all control variables have plausible signs: Previous preferences relate positively to brand image (p < .01), and the effect of CM on brand image is more favorable for respondents with children in Model 2 (p < .10).

In Model 2, we find support for H2a: The impact of donation size is negative when the frame is purely monetary, regardless of whether the competitive brand is on promotion or not. Specifically, the significant negative coefficient λ1 shows that the effect of donation size is negative for a monetary frame without a financial trade-off, and the interaction with a financial trade-off (λ7) is not significant. A t-test further reveals that the sum of λ1 and λ7 is significantly negative (p < .05), providing formal support for H2a. The sum of λ1 and λ5 is significantly positive (p < .05), which supports H2b: The impact of donation size is positive when the frame is purely nonmonetary. Finally, we test for a nonlinear effect of donation size for a combined frame, using Model 4*.
With respect to the expected inverted U shape, we use t-tests pertaining to the sum of the coefficients for donation size and its interaction with the combination frame (λ1 + λ6), as well as the sum of the two quadratic terms (λ8 + λ9) (Jaccard et al., 1990). The sum of λ1 and λ6 is positive and significantly different from zero (p < .05). The sum of λ8 and λ9 is weakly significant and negative (p < .10). That is, the effect of donation size follows an inverted U shape for a combined frame, and H2c is supported.

To illustrate the effects of donation size on brand image for the different frames and to assess the strength of the effects, we again run a simulation. We use the estimated parameter means from Model 4*, and the same assumptions about PREPREF and demographics as in the brand choice simulation. Fig. 3 presents the results.

Fig. 3 reveals the negative effect of donation size for monetary frames, the positive effect for the nonmonetary frame, and the inverted U-shaped effect for a combined frame. When a CM campaign presents both monetary and nonmonetary information, donation size first has a positive effect on brand image and then a negative one. The turning point is reached at a donation of 35.3% of the price, well within the range of realistic donation sizes. CM with large monetary donations (> 15.9% of the price) and very small donations with a combined frame (< 6.9% of the price) hurt brand image. The former finding is in line with our reasoning that high monetary donations might lower consumers' perceived effectiveness. The latter indicates that with a combined frame, transparency is counterproductive in the case of very small donations. Revealing the exchange ratio between money and the charitable object demonstrates how few resources are necessary to achieve a considerable outcome. In turn, consumers likely perceive very small combined donations (e.g., 2 cents equaling 0.4 vaccinations in our study) as paltry, which results in lower perceived altruism.
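The turning point for the combined frame follows from the vertex of a quadratic. The sketch below uses the Model 4* brand image means as read from Table 6 (with donation size rescaled so that 0.10 stands for 10% of the price); the constant names are hypothetical, and the calculation is an illustration rather than the authors' procedure.

```python
# Model 4* brand image parameter means, as read from Table 6 (rounded values).
LAMBDA_1 = -1.203  # DONSIZ
LAMBDA_6 = 3.525   # DONSIZ x COMBI
LAMBDA_8 = 1.413   # DONSIZ^2
LAMBDA_9 = -4.703  # DONSIZ^2 x COMBI

# For a combined frame, the donation-size part of the CM effect is
# (lambda_1 + lambda_6) * d + (lambda_8 + lambda_9) * d**2.
LINEAR = LAMBDA_1 + LAMBDA_6
QUADRATIC = LAMBDA_8 + LAMBDA_9


def combined_frame_effect(d):
    """Donation-size contribution to brand image for a combined frame."""
    return LINEAR * d + QUADRATIC * d ** 2


def turning_point():
    """Donation size (as a fraction of price) where the effect peaks."""
    return -LINEAR / (2.0 * QUADRATIC)


peak = turning_point()
print(f"turning point: {100 * peak:.1f}% of the price")
```

A negative quadratic sum together with a positive linear sum is what makes the vertex a maximum inside the tested range of donation sizes.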
Overall, we find changes between +.37 and −.12 on a seven-point scale; brand image in the control group was 4.95. Given that the well-known brands in our study possess established images that are unlikely to change much because of a single experimental treatment, these effects are substantial. In summary, Fig. 3 shows that donation size can have substantial effects on brand image and highlights the importance of considering donation framing when deciding on the size of the donation.

Fig. 3. Change in brand image through a CM campaign. [Line chart; x-axis: donation size (% of price), 0 to 50; y-axis: Δ brand image, −0.2 to 0.4; series: nonmonetary, combined, monetary, and monetary with financial trade-off.]

6. Underlying drivers

To derive our hypotheses about the effects of donation size on tactical and strategic success, we relied on different underlying drivers. In a second study, which is exploratory in nature, we collected data on these potential underlying drivers, and analyzed how they affect brand choice and brand image. For this purpose, we regressed the two success measures on warm glow, perceived altruism, and perceived effectiveness.

6.1. Data

We included a subset of the stimuli from our large-scale survey. In contrast to our main study, we varied product category between-subjects because of the longer questionnaire, which now included measures on potential underlying drivers. To keep the number of experimental groups tractable, we used fewer donation sizes (2.5, 10, 30 and 50% of product price) and only two product categories (chocolate bars and toothpaste). We employed the same frames as in our large-scale study (monetary, nonmonetary, and combined without a financial trade-off and monetary with a competitive price promotion). The three frames without financial trade-offs and the monetary frame with the financial trade-off were combined with the four levels of donation size.
We employed all combinations for the two product categories, resulting in 32 conditions, which we varied between-subjects. The procedure and measurements for the success variables were the same as in our first study. In addition, we included measures to capture the underlying drivers. After making their brand choice decisions and brand image assessments, participants indicated their warm glow, the perceived effectiveness of the campaign, and the perceived altruism of the company running the campaign (all on seven-point multi-item scales, see Appendix C).⁹

We invited members of an online access panel in Germany to participate in our survey. As an incentive, respondents could participate in a drawing to win Amazon gift cards. Between August and October 2011, 1402 respondents answered the questionnaire. We excluded 34 participants due to a response time of less than 2.5 min (the mean was 7.8 min) or because they clicked through all multi-item scales (straight-line responses on all scales). The final data set contains 1368 complete observations.

Among our respondents, 62% are women. On average, they are 33 years of age and live in households with 1.2 people. Most respondents have a monthly net household income between 1000€ and 2000€. More than half (60%) are church members, and 25% have children. 55% of the respondents are employed, and 31% are students. Thus, the sample's demographics are very similar to those of our main study. We do not find significant differences between the experimental groups on any consumer demographics (i.e., age, gender, household size, income, church membership, parenthood), previous donation behavior, or brand-related variables such as brand awareness and brand preference (p > .10).

⁹ We also measured self-sufficiency and ease of imagination of the donation as additional potential drivers to test for alternative explanations. Because these did not prove to be relevant, we excluded them from the analysis. Full results are available from the authors upon request.

6.2. Results

Our hypotheses are based on the reasoning that brand choice is driven by warm glow, and brand image is driven by perceived altruism of the firm and perceived effectiveness of the campaign. Therefore, we regressed both success measures on these three potential drivers. We also included category-specific intercepts and controlled for stated preference heterogeneity with a PREPREF variable for each category.¹⁰ A Chow test for the regression model of brand image and a likelihood ratio test for the logit model of brand choice indicate that pooling across the product categories is appropriate (p > .10). All variance inflation factors are below 1.68, indicating no problems with multicollinearity.

The estimation results are displayed in Table 7. For the brand choice model, fit is good, as indicated by the value of .328 for Nagelkerke's R². In line with our reasoning, the only significant driver of brand choice is warm glow, which is crucial for the utility consumers derive from a CM campaign. In contrast, perceived altruism and perceived effectiveness do not show significant effects.

For brand image, the linear regression's R² value of .159 is satisfactory, given that we study the same well-known brands as in our main study, for which images have been formed over a long time in the consumer's mind. In line with our reasoning, perceived altruism and perceived effectiveness both exert a significant positive effect on brand image. Warm glow also has a significant effect, which most likely represents a spillover of the good feeling consumers experience through the campaign onto the brand's image. However, the coefficient for warm glow is the smallest. Thus, the primary drivers of brand image are perceived altruism and perceived effectiveness, which both relate to what consumers infer from CM about the brand.
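The brand choice estimates in Table 7 are logit coefficients, so their magnitude is easier to read as odds ratios. The following Python sketch converts the reported coefficients. The function names are ours, and the 50% baseline choice probability in the usage lines is a hypothetical value chosen only for illustration, not a figure from the study.

```python
import math

# Logit coefficients of the underlying drivers in the brand choice model (Table 7).
BRAND_CHOICE_COEFS = {
    "warm_glow": 0.393,
    "perceived_effectiveness": 0.091,
    "perceived_altruism": -0.027,
}


def odds_ratio(coef):
    """Factor by which the odds of choosing the CM brand change per one-point increase."""
    return math.exp(coef)


def shifted_probability(baseline_prob, coef):
    """Choice probability after a one-point increase in a driver, other regressors held fixed."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio(coef)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    for name, coef in BRAND_CHOICE_COEFS.items():
        print(f"{name}: odds ratio = {odds_ratio(coef):.3f}")
    # Hypothetical baseline probability of 0.5, for illustration only.
    shifted = shifted_probability(0.5, BRAND_CHOICE_COEFS["warm_glow"])
    print(f"warm glow, baseline 0.5 -> {shifted:.3f}")
```

Read this way, a one-point increase in warm glow multiplies the odds of choosing the CM brand by roughly 1.48, whereas the odds ratios for perceived effectiveness and perceived altruism stay close to 1, in line with their insignificant coefficients.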
In summary, this study suggests that in CM, brand choice and brand image are indeed affected by different underlying drivers. For brand choice, the utility the consumer derives from the campaign is crucial, and this is determined by the warm glow the campaign triggers. In contrast, brand image is mainly affected by the inferences consumers make about the brand involved in the campaign. It is critical for consumers to believe in the company's sincere motives and that the donation will be used as promised.

¹⁰ Since our dataset contains only one observation per respondent, we do not model unobserved heterogeneity.

Table 7
Results underlying drivers.

Parameter estimates (standard errors) by CM success measure

Independent variables        Brand choice        Brand image
Category variables†
  PREPREF_CHOCOLATE          1.190*** (.103)     .162*** (.026)
  PREPREF_TOOTHPASTE         .704*** (.092)      .190*** (.030)
Underlying drivers
  Warm glow                  .393*** (.053)      .046** (.019)
  Perceived effectiveness    .091 (.062)         .067*** (.023)
  Perceived altruism         −.027 (.064)        .111*** (.023)
Nagelkerke's R²              .328
Chi² (p)                     352.253 (<.001)
Log likelihood               −631.189
R² (Adj. R²)                                     .159 (.155)
F (p)                                            42.741 (<.001)

Notes: N = 1,368; * p < .10; ** p < .05; *** p < .01 (two-sided); standard errors in parentheses; p values are printed in bold if p < .10. † Category constants available upon request.

7. Summary and implications

We have investigated the impact of donation size on the effect of CM on brand choice and brand image in a large-scale experimental survey with different product categories. An additional exploratory study provides insights into the underlying drivers of consumer behavior in the context of CM. Our key findings are the following:

• The effect of donation size on brand choice depends on the presence of a financial trade-off. If consumers face no trade-off, larger donations increase brand choice probability.
However, if consumers have to choose between doing good and saving money for themselves, larger donations (and thus larger trade-offs) decrease brand choice probability.
• The effect of donation size on brand image depends on donation framing. With a monetary frame, larger donations are less favorable for brand image, and CM campaigns with high monetary donations can even hurt brand image. For a purely nonmonetary frame, donation size has a positive effect on brand image, and for a combined frame, the effect follows an inverted U shape.
• In CM, tactical and strategic success appear to be driven by different mechanisms. Brand choice depends on consumers' utility, and thus warm glow is the crucial driver. Brand image improves when consumers make positive inferences about the company and thus is mainly affected by the perceived altruism of the company and the perceived effectiveness of the campaign.

Our systematic analysis of donation size helps explain why donation size can have positive, negative, or no effects on CM success. First, we show that tactical success (brand choice) and strategic success (brand image) are affected differently. This is attributed to differences in their underlying drivers. Second, we identify two new moderators of the effect of donation size on CM success. For brand choice, larger donations exert a favorable effect as long as consumers face no financial trade-offs, whereas the effect is negative when an alternative brand offers a price promotion of equal size. This finding goes a long way toward explaining the contradictory previous results summarized in Table 1. Except for Chang (2008), all studies that find a negative impact include a financial trade-off (e.g., Arora & Henderson, 2007; Strahilevitz, 1999), whereas in none of the studies reporting a positive effect does a larger donation come at a higher cost to consumers (e.g., Koschate-Fischer et al., 2012; Pracejus et al., 2003/04).
For brand image, the effect of donation size is moderated by donation framing. Through the introduction of this moderator, we extend the scope of previous research on donation size, which has studied only monetary frames. Third, we find that the effect of donation size is nonlinear for a frame that combines monetary and nonmonetary information: It follows an inverted U shape. So far, nonlinear effects have only been studied for monetary frames (Koschate-Fischer et al., 2012; Pracejus et al., 2003/ 04). Our results support the findings of Pracejus et al. (2003/04) in that we do not find nonlinear effects for monetary frames. We note that Koschate-Fischer et al. (2012) find a concave effect for a monetary frame, but with a different success measure, i.e., willingness-to-pay. Their result may simply reflect consumers' reluctance to pay more for larger donations, whereas in our study product price remained constant even with larger donations. Our results suggest important implications for managers who intend to use cause-related marketing. First, large donations are not essential for tactical success. CM can be a cost-effective sales promotion instrument to increase brand choice because even small donations have a substantial positive impact. In contrast, most price promotions must pass a 10– 20% discount threshold to significantly affect purchase intentions and behavior (Gupta & Cooper, 1992; van Heerde et al., 2001). Rising CM donations increase brand choice probability only moderately, which is most likely not sufficient to offset their additional costs. Second, large CM donations may not be able to compete in a promotion-intensive environment in which consumers face trade-offs between doing something good and savings for themselves. In many FMCG categories, price discounts are in the range of approximately 20% of the product price (e.g., van Heerde, Leeflang, & Wittink, 2000). 
Donating that much money to a cause would increase brand choice probability but not enough to maintain market share when the competitor offers an equivalent price promotion. However, van Heerde, Leeflang, and Wittink (2004) observe huge variations in price discounts, ranging from 5% to 51% of the price. Against smaller discounts, CM is likely to prevail. Third, larger donations can help or hurt strategic success depending on their framing. When CM campaigns use a monetary frame, the effect of CM on brand image becomes less favorable with increasing donations and can even turn negative. In contrast, donation size has a positive effect on brand image for campaigns with nonmonetary framing. Finally, managers may combine monetary and nonmonetary information in their donation framing. In this case, a medium donation size is optimal for brand image. Therefore, managers should carefully align donation size and donation framing in CM to create a positive effect on strategic success. If they succeed in this task, CM may be an attractive alternative to price promotions, which – even if possibly more effective in the short run – typically hurt brand loyalty in the long run (Neslin & van Heerde, 2009). CM, in contrast, may help managers to increase consumers' brand loyalty by adding a philanthropic component to their brand. Finally, managers should pay attention to their communication in CM campaigns. They should appeal to the warm glow consumers derive from participating in the campaign to enhance tactical success. At the same time, to improve brand image, they need to make a credible claim that the company is committed to help the cause and that the donations reach their targets. Our study also has some limitations that provide opportunities for further research. 
Our measures may suffer from a social desirability bias (e.g., Lautenschlager & Flaherty, 1990; Nancarrow, Brace, & Wright, 2001), such that respondents might be more likely to choose the CM brand and evaluate its brand image more favorably than they would in a real purchase situation. Even if our data suffered from this social desirability bias, though, it would be unlikely to affect our results regarding donation size because the bias would be the same for all experimental groups. However, our measures of the absolute effect of CM would be biased, which would make it challenging to derive specific implications for the optimal donation size. We suspect that our data are not affected by a strong social desirability bias because we find negative effects of CM on brand image and a strong effect of a competitive promotion on brand choice. Nevertheless, it would be worthwhile to validate our results with field data.

Furthermore, we study only fast-moving consumer goods. Investigating the impact of donation size and its two moderators for durable goods might lead to further insights. For example, the effect of a financial trade-off might be different for higher-priced products.

Previous research has indicated that CM is more successful for hedonic than for utilitarian products (e.g., Strahilevitz, 1999; Strahilevitz & Myers, 1998). The proposed underlying mechanism is that CM reduces customers' guilt associated with indulging in hedonic products. We find no differences in the effects of donation size between hedonic and utilitarian products, possibly because in our scenarios we explicitly told consumers that they want to buy a product, such that their purchase incidence decision was already made. Further research should look more deeply into the role of category type and test whether the feeling of guilt indeed plays a role.
Another limitation of our study is that brand image is formed over a long period of time, so it would be valuable to validate our results with a study that repeatedly measures brand image over a longer time period. This would also present an opportunity to measure how improvements in brand image translate into brand equity and affect purchase decisions in the long run (Keller, 1993).

Finally, we have studied a specific type of financial trade-off where the competitive brand offers a price promotion. Our results are relevant for firms that consider using CM to compete in a promotion-intensive environment. Future research may want to study the trade-offs that occur when the CM brand increases its price. The respective results might help firms decide whether they can pass on the costs of the donations to consumers.

Despite these limitations, we think that our study yields interesting results with important implications for managers and researchers, and we hope that further research will build on it.

Appendix A
Product stimuli.

Product category    Brand                    Size           Price
Chocolate bars      KitKat, Duplo            200 g          1.99€
Toothpaste          Colgate, Odol-med3       75 ml          1.99€
Beer                Bitburger, Warsteiner    24 × 0.33 l    12.49€
Detergent           Persil, Ariel            4.75 kg        12.49€

Notes: Italics = CM brand.

Appendix B
Example stimuli for monetary CM campaign and competitive price promotion.

Appendix C
Multi-item scales.
CM success

Brand image
Rating the CM brand on Likert scales:
• −3 = bad and +3 = good
• −3 = not likeable and +3 = likeable
• −3 = low quality and +3 = high quality
• −3 = not trustworthy and +3 = trustworthy
• −3 = unpleasant and +3 = pleasant
• −3 = unattractive and +3 = attractive
Source: Völckner et al. (2008)†
α: .95 (S1), .92 (S2)

Underlying drivers (Study 2)

Warm glow
Extent to which participants agreed/disagreed with the following statements:
• When I purchase [CM brand name], I feel good because I do not only spend money for myself but also for other people.
• I feel comfortable if I donate for a good cause by purchasing [CM brand name].
• I am pleased that I do not only get a product by purchasing [CM brand name], but that I also do a good deed at the same time.
Source: Arora and Henderson (2007), Andreoni (1989), Fries et al. (2010), and Monin (2003)
α: .93

Perceived effectiveness of CM campaign
Extent to which participants agreed/disagreed with the following statements:
• I believe that the donated money reaches the needy persons.
• I am convinced that little of the donated money is wasted.
• I assume that the donated money will be distributed in favor of the cause.
• I trust in the fact that the donated money will be used for the cause.
• I believe that the company actually donates as much as stated in the CM campaign.
Source: Fries et al. (2010), Sargeant and Lee (2004), and Webb, Green, and Brashear (2000)
α: .93

Perceived altruism of company
Extent to which participants agreed/disagreed with the following statements:
• The manufacturer conducts the campaign in order to do a good deed.
• The campaign is an honest effort.
• The manufacturer is not truly committed to the purpose of the donation.
Source: Fries et al. (2010), Nowak (2004), Strahilevitz (2003), and Webb et al. (2000)
α: .93

Notes: α = Cronbach's alpha in our study; † we added the item trustworthy; S = Study.

Appendix D
Variable specifications.

Variable                      Specification
Category variables
  CAT_CHOCOLATE               1 if product category is chocolate bars, 0 otherwise
  CAT_TOOTHPASTE              1 if product category is toothpaste, 0 otherwise
  CAT_BEER                    1 if product category is beer, 0 otherwise
  CAT_DETERGENT               1 if product category is detergent, 0 otherwise
  PREPREF                     (Number of times the CM brand was bought in the last three purchases in the category prior to the survey, minus the number of times the other brand was bought) divided by 3
Demographic variables
  Child_YES                   1 if subject has children, 0 otherwise
  Church membership_YES       1 if subject is member of a church, 0 otherwise
  Church membership_NO_Res    1 if subject did not indicate church membership, 0 otherwise
  Occupation_FULL             1 if subject works full time, 0 otherwise
  Occupation_PART             1 if subject works part time, 0 otherwise
CM variables
  DONSIZ                      Donation size of the CM offer
  NONMON                      1 if the CM campaign includes nonmonetary donation framing, 0 otherwise
  COMBI                       1 if the CM campaign includes combined nonmonetary and monetary donation framing, 0 otherwise
  TRADE-OFF                   1 if the competitive brand offers a price promotion, 0 otherwise

References

Ailawadi, K. L., Gedenk, K., Lutzky, C., & Neslin, S. A. (2007). Decomposition of the sales impact of promotion-induced stockpiling. Journal of Marketing Research, 44(3), 450–467. Ailawadi, K. L., Gedenk, K., & Neslin, S. A. (1999). Heterogeneity and purchase event feedback in choice models: An empirical analysis with implications for model building. International Journal of Research in Marketing, 16(3), 177–198. Alba, J. W., & Hutchinson, J. W. (1987). Dimensions of consumer expertise. Journal of Consumer Research, 13(4), 411–454. Anderson, N. H. (1981). Foundations of information integration theory. New York: Erlbaum. Andreoni, J. (1989). Giving with impure altruism: Applications to charity and Ricardian equivalence.
Journal of Political Economy, 97(6), 1447–1458. Andreoni, J. (1990). Impure altruism and donations to public goods: A theory of warm-glow giving. The Economic Journal, 100(401), 464–477. Arora, N., & Henderson, T. (2007). Embedded premium promotion: Why it works and how to make it more effective. Marketing Science, 26(4), 514–531. Auger, P., Devinney, T. M., Louviere, J. J., & Burke, P. F. (2008). Do social product features have value to consumers? International Journal of Research in Marketing, 25(3), 183–191. Barnes, N. G. (1992). Determinants of consumer participation in cause-related marketing campaigns. American Business Review, 10(2), 21–24. Barone, M. J., Miyazaki, A. D., & Taylor, K. A. (2000). The influence of cause-related marketing on consumer choice: Does one good turn deserve another? Journal of the Academy of Marketing Science, 28(2), 248–262. Bouten, L. M., Snelders, D., & Hultink, E. J. (2011). The impact of fit measures on the consumer evaluation of new co-branded products. Journal of Product Innovation Management, 28(4), 455–469. Burnett, J. J., & Wood, V. R. (1988). A proposed model of the donation decision process. In E. Hirschman, & J. Sheth (Eds.), Research in consumer behavior, 3. (pp. 1–47). Greenwich, CT: Elsevier JAI. Chang, C.-T. (2008). To donate or not to donate? Product characteristics and framing effects of cause-related marketing on consumer purchase behavior. Psychology & Marketing, 25(12), 1089–1110. Dahl, D. W., & Lavack, A. M. (1995). Cause-related marketing: Impact of size of corporate donation and size of cause-related promotion on consumer perceptions and participation. AMA Winter Educators' Conference Proceedings, 6. (pp. 476–481). Eckel, C. C., & Grossman, P. J. (2003). Rebate versus matching: Does how we subsidize charitable contributions matter? Journal of Public Economics, 87(3/4), 681–701. Ellen, P. S., Mohr, L. A., & Webb, D. J. (2000). Charitable programs and the retailer: Do they mix?
Journal of Retailing, 76(3), 393–406. Fries, A. J. (2010). The effects of cause-related marketing campaign characteristics — A literature review. Marketing — Journal of Research and Management, 6(2), 145–157. Fries, A. J., Gedenk, K., & Völckner, F. (2010). Cause-related marketing: Designing successful campaigns. University of Cologne working paper. Garretson Folse, J. A., Niedrich, R. W., & Landreth Grau, S. (2010). Cause-related marketing: The effect of purchase quantity and firm donation amount on consumer inferences and participation intentions. Journal of Retailing, 86(4), 295–309. German Fundraising Association (2009). Spendenbilanz ausgewählter Organisationen 2005–2008. Retrieved March 15, 2010, from. http://www.fundraisingverband.de/ fileadmin/pdf_upload/1Spendenbilanz_2005-2008.pdf Green, D. P., Kahneman, D., & Kunreuther, H. (1994). How the scope and method of public funding affect willingness to pay for public goods. Public Opinion Quarterly, 58(1), 49–67. Greene, W. H. (2008). Econometric analysis (6th ed.)Upper Saddle River, NJ: Pearson Prentice Hall. Gupta, S., & Cooper, L. G. (1992). The discounting of discounts and promotion thresholds. Journal of Consumer Research, 19(3), 401–411. Haruvy, E., & Popkowski Leszczyc, P. T. L. (2009). Bidder motives in cause-related auctions. International Journal of Research in Marketing, 26(4), 324–331. Holmes, J. H., & Kilbane, C. J. (1993). Cause-related marketing: Selected effects of price and charitable donations. Journal of Nonprofit & Public Sector Marketing, 1(4), 67–83. Horsky, D., Misra, S., & Nelson, P. (2006). Observed and unobserved preference heterogeneity in brand-choice models. Marketing Science, 25(4), 322–335. Human, D., & Terblanche, N. S. (2012). Who receives what? The influence of the donation magnitude and donation recipient in cause-related marketing. Journal of Nonprofit and Public Sector Marketing, 24(2), 141–160. IEG (2011). IEG sponsorship report. Retrieved June 2, 2011, from. 
http://www.sponsorship.com/About-IEG/Press-Room/Economic-Uncertainty-To-Slow-Sponsorship-Growth-In.aspx Jaccard, J. (2001). Interaction effects in logistic regression. Thousand Oaks, CA: Sage. Jaccard, J., Turrisi, R., & Wan, C. K. (1990). Interaction effects in multiple regression. Thousand Oaks, CA: Sage. Kahneman, D., & Knetsch, J. L. (1992). Valuing public goods: The purchase of moral satisfaction. Journal of Environmental Economics and Management, 22(1), 57–70. Kahneman, D., Ritov, I., Jacowitz, K. E., & Grant, P. (1993). Stated willingness to pay for public goods: A psychological perspective. Psychological Science, 4(5), 310–315. Keller, K. L. (1993). Conceptualizing, measuring, and managing customer-based brand equity. Journal of Marketing, 57(1), 1–22. Koschate-Fischer, N., Stefan, I. V., & Hoyer, W. D. (2012). Willingness to pay for cause-related marketing: The impact of donation amount and moderating effects. Journal of Marketing Research, 49(6), 910–927. Landreth Grau, S., Garretson Folse, J. A., & Pirsch, J. (2007). Cause-related marketing: An exploratory study of campaign donation structures issues. Journal of Nonprofit & Public Sector Marketing, 18(2), 69–91. Lautenschlager, G. J., & Flaherty, V. L. (1990). Computer administration of questions: More desirable or more social desirability? Journal of Applied Psychology, 75(3), 310–314. Monin, B. (2003). The warm glow heuristic: When liking leads to familiarity. Journal of Personality and Social Psychology, 85(6), 1035–1048. Nancarrow, C., Brace, I., & Wright, L. T. (2001). Tell me lies, tell me sweet little lies: Dealing with socially desirable responses in market research. Marketing Review, 2(1), 55–69. Neslin, S. A., & van Heerde, H. J. (2009). Promotion dynamics. Foundations and Trends in Marketing, 3(4), 177–268. Nowak, L. I. (2004). Cause marketing alliances: Corporate associations and consumer responses.
Journal of Food Products Marketing, 10(2), 33–48. Nunes, J. C., & Park, C. W. (2003). Incommensurate resources: Not just more of the same. Journal of Marketing Research, 40(1), 26–38. Olsen, G. D., Pracejus, J. W., & Brown, N. R. (2003). When profit equals price: Consumer confusion about donation amounts in cause-related marketing. Journal of Public Policy & Marketing, 22(2), 170–180. Palazon, M., & Delgado-Ballester, E. (2009). Effectiveness of price discounts and premium promotions. Psychology & Marketing, 26(12), 1108–1129. Polonsky, M. J., & Wood, G. (2001). Can the overcommercialization of cause-related marketing harm society? Journal of Macromarketing, 21(1), 8–22. Pracejus, J. W., Olsen, G. D., & Brown, N. R. (2003/04). On the prevalence and impact of vague quantifiers in the advertising of cause-related marketing (CRM). Journal of Advertising, 32(4), 19–28. Ross, J. K., Patterson, L. T., & Stutts, M. A. (1992). Consumer perceptions of organizations that use cause-related marketing. Journal of the Academy of Marketing Science, 20(1), 93–97. Ross, J. K., Stutts, M. A., & Patterson, L. (1991). Tactical considerations for the effective use of cause-related marketing. The Journal of Applied Business Research, 7(2), 58–65. Sargeant, A., & Lee, S. (2004). Trust and relationship commitment in the United Kingdom voluntary sector: Determinants of donor behavior. Psychology & Marketing, 21(8), 613–635. Simonin, B. L., & Ruth, J. A. (1998). Is a company known by the company it keeps? Assessing the spillover effects of brand alliances on consumer brand attitudes. Journal of Marketing Research, 35(1), 30–42. Smith, S. M., & Alcorn, D. S. (1991). Cause marketing: A new direction in the marketing of corporate responsibility. Journal of Consumer Marketing, 8(3), 19–35. Strahilevitz, M. (1999). The effects of product type and donation magnitude on willingness to pay more for a charity-linked brand. Journal of Consumer Psychology, 8(3), 215–241. Strahilevitz, M. (2003).
The effects of prior impressions of a firm's ethics on the success of a cause-related marketing campaign: Do the good look better while the bad look worse? Journal of Nonprofit and Public Sector Marketing, 11(1), 77–92. Strahilevitz, M., & Myers, J. G. (1998). Donations to charity as purchase incentives: How well they work may depend on what you are trying to sell. Journal of Consumer Research, 24(4), 434–446. Subrahmanyan, S. (2004). Effects of price premium and product type on the choice of cause-related brands: A Singapore perspective. Journal of Product & Brand Management, 13(2), 116–124. Train, K. E. (2009). Discrete choice methods with simulation (2nd ed.)Cambridge: Cambridge University Press. Vaidyanathan, R., & Aggarwal, P. (2005). Using commitments to drive consistency: Enhancing the effectiveness of cause-related marketing communications. Journal of Marketing Communications, 11(4), 231–246. van den Brink, D., Odekerken-Schröder, G., & Pauwels, P. (2006). The effect of strategic and tactical cause-related marketing on consumers' brand loyalty. Journal of Consumer Marketing, 23(1), 15–25. van Heerde, H. J., Leeflang, P.S. H., & Wittink, D. R. (2000). The estimation of pre- and postpromotion dips with store-level scanner data. Journal of Marketing Research, 37(3), 383–395. van Heerde, H. J., Leeflang, P.S. H., & Wittink, D. R. (2001). Semiparametric analysis to estimate the deal effect curve. Journal of Marketing Research, 38(2), 197–215. van Heerde, H. J., Leeflang, P.S. H., & Wittink, D. R. (2004). Decomposing the sales promotion bump with store data. Marketing Science, 23(3), 317–334. Varadarajan, P. R., & Menon, A. (1988). Cause-related marketing: A coalignment of marketing strategy and corporate philanthropy. Journal of Marketing, 52(3), 58–74. Völckner, F., Sattler, H., & Kaufmann, G. (2008). Image feedback effects of brand extensions: Evidence from a longitudinal field study. Marketing Letters, 19(2), 109–124. Webb, D. J., Green, C. L., & Brashear, T. 
G. (2000). Development and validation of scales to measure attitudes influencing monetary donations to charitable organizations. Journal of the Academy of Marketing Science, 28(2), 299–309. WHO, & UNICEF (). Immunization summary: The 2010 edition. Retrieved January 22, 2010, from http://www.childinfo.org/files/Immunization_Summary_2008_r6.pdf Wymer, W., & Samu, S. (2009). The influence of cause marketing associations on product and cause brand value. International Journal of Nonprofit and Voluntary Sector Marketing, 14(1), 1–20. Zdravkovic, S., Magnusson, P., & Stanley, S. M. (2010). Dimensions of fit between a brand and a social cause and their influence on attitudes. International Journal of Research in Marketing, 27(2), 151–160.

Intern. J. of Research in Marketing 31 (2014) 192–206

Full Length Article

Choosing a digital content strategy: How much should be free?☆

Daniel Halbheer a,⁎, Florian Stahl b, Oded Koenigsberg c, Donald R. Lehmann d

a University of St. Gallen, Department of Economics, Varnbüelstrasse 19, CH-9000 St. Gallen, Switzerland
b University of Mannheim, Department of Business Administration, L5, 2, D-68131 Mannheim, Germany
c London Business School, Regent's Park, London, NW1 4SA, United Kingdom
d Columbia Business School, Uris Hall, 3022 Broadway, New York, NY 10027, United States

Article info

Article history: First received in 27 August 2012 and was under review for 8 months. Available online 7 November 2013.
Area Editor: Kalyan Raman
Guest Editor: Marnik G. Dekimpe
Keywords: Information goods; Content pricing; Sampling; Advertising; Dorfman-Steiner condition

Abstract

Advertising supported content sampling is ubiquitous in online markets for digital information goods.
Yet, little is known about the profit impact of sampling when it serves the dual purpose of disclosing content quality and generating advertising revenue. This paper proposes an analytical framework to study the optimal content strategy for online publishers and shows how it is determined by characteristics of both the content market and the advertising market. The strategy choice is among a paid content strategy, a sampling strategy, and a free content strategy, which follow from the publisher's decisions concerning the size of the sample and the price of the paid content. We show that a key driver of the strategy choice is how sampling affects the prior expectations of consumers, who learn about content quality from the inspection of the free samples. Surprisingly, we find that it can be optimal for the publisher to generate advertising revenue by offering free samples even when sampling reduces both prior quality expectations and content demand. In addition, we show that it can be optimal for the publisher to refrain from revealing quality through free samples when advertising effectiveness is low and content quality is high. To illustrate, we relate our framework to the newspaper industry, where the sampling strategy is known as the “metered model.” © 2013 Elsevier B.V. All rights reserved. 1. Introduction Digital information goods have been available on the Internet for almost twenty years. During that time, publishers have developed different strategies to distribute content. Some publishers provide all their information for free, while some charge consumers for access to their content. Other publishers employ a hybrid business model, giving away a portion of their content for free and charging for access to the rest of their content. Offering free content samples allows publishers to both disclose their content quality and to generate revenues from advertisements shown to online visitors. 
According to Alisa Bowen, general manager of The Wall Street Journal Digital Network, “working with advertisers to offer open houses has proven to be one of the most valuable and efficient ways to expose our premium content to new readers and potential subscribers” (GlobeNewswire, 2012).

☆ We thank Asim Ansari, Jean-Pierre Dubé, Anthony Dukes, Jacob Goldenberg, Avi Goldfarb, Raju Hornis, Ulrich Kaiser, Anja Lambrecht, Philipp Renner, Catherine Tucker and seminar participants at the 11th ZEW ICT Conference (2013), the GEABA 2012 (Graz), the Marketing in Israel Conference 2011 (Tel Aviv), the INFORMS Marketing Science Conference 2011 (Houston), the University of Hamburg, the HEC Paris, the University of Passau, the University of Tilburg, and the University of Zurich for helpful comments and suggestions. Daniel Halbheer gratefully acknowledges support from the Swiss National Science Foundation through grant PA00P1-129097 and thanks the Department of Economics at the University of Virginia for its hospitality while some of this research was being undertaken.
⁎ Corresponding author. E-mail addresses: daniel.halbheer@unisg.ch (D. Halbheer), florian.stahl@bwl.uni-mannheim.de (F. Stahl), okoenigsberg@london.edu (O. Koenigsberg), drl2@columbia.edu (D.R. Lehmann).
0167-8116/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.ijresmar.2013.10.004

The main contribution of this paper is to provide a formal analysis of how publishers should choose between different digital content strategies. Information goods such as digital content are experience goods; hence consumers must have an experience to value them (Shapiro & Varian, 1998). Offering free samples is a way for publishers to disclose their content quality, thereby allowing consumers to have actual experience with the good before purchase.
D. Halbheer et al. / Intern. J. of Research in Marketing 31 (2014) 192–206

Digital information goods are particularly suitable for sampling because the costs of providing free samples are negligible and publishers can include advertisements in the free samples to generate advertising revenues. These two features distinguish sampling of information goods from the typical sampling of perishable or durable goods. Recently, publishers introduced business models that allow consumers to select samples of their choice within a set sample size. A prominent example of this is the “metered model” in the newspaper industry, where publishers offer a number of articles for free and charge for access to the rest. Such “customer selected sampling” differs from the approach where the publisher chooses not only the sample size but also the sample content, which allows the firm to strategically manipulate the sample and creates an environment where customers are likely to discount the sample quality in estimating actual quality. A recent study by the Newspaper Association of America (2012) shows that 62% of the publishers employ a metered model, out of which 95% offer up to twenty free articles monthly. For example, the New York Times currently offers access to ten articles for free on its website each month. Advertising supported sampling is also employed by distributors of music such as Spotify or Rhapsody. Allowing consumers to choose which content to sample means that publishers have no control over the content consumers actually sample. Taking this into account is important for publishers when setting the optimal sample size.
The business model where publishers set a sample size and let consumers choose which content to sample differs from versioning or “freemium,” where a firm selected low-end version is offered for free and consumers have to pay for access to the high-end version.1 Such versioning of information goods is often observed in the software industry (see, for instance, Faugère & Tayi, 2007; Cheng & Tang, 2010). Customer selected sampling, in contrast, does not involve quality differentiation: Within the set sample size, the publisher allows the consumers to sample any of its content for free. This paper develops an analytical framework to study the optimal content strategy for online publishers of information goods. Content strategy consists of two decisions: the size of the sample and the price of the paid content. The publisher is assumed to receive revenues from content sales and from advertisements, which are embedded in the free content. A key feature of sampling is that it serves the dual purpose of generating revenues from advertising and disclosing content quality. Consumers have prior expectations about content quality, which they update in a Bayesian fashion through inspection of the free samples. The information transmitted through the free samples affects the consumers' posterior expectations about content quality, which in turn influence expected content sales. Taking the consumers' quality updating into account, the publisher faces a tradeoff between an expansion effect (through learning) and a cannibalization effect (through free offerings) on expected content demand induced by sampling. When the publisher makes its sampling and pricing decisions, it should take both the effects on content demand and the impact on advertising revenue into account. 
We assume that the publisher can choose among a “sampling strategy,” a pure “paid content strategy,” or a pure “free content strategy.” Importantly, under a sampling strategy, consumers are allowed to sample some content for free, while they have to pay to access the remaining content (which corresponds to the metered model in the newspaper industry). We derive several important results. First, we show how the publisher's advertising-sales revenue ratio and hence its optimal content strategy are determined by characteristics of both the content market and the advertising market. The analysis confirms the insight that the elasticities of content demand and advertising demand jointly determine the publisher's optimal ratio of advertising revenue to sales revenue (Dorfman & Steiner, 1954). Importantly, the elasticity of consumers' updated expectations with respect to the sample size plays a key role in determining the ratio of advertising revenue to sales revenue. This elasticity is positive if sampling increases consumers' expectations about content quality. In this case, managers can expect the advertising-sales revenue ratio to be low due to high revenues from content sales (resulting from the expansion effect). In contrast, if sampling reduces consumers' expectations about content quality, managers can expect the advertising-sales revenue ratio to be high due to low revenues from content sales (resulting from the cannibalization effect). Second, we describe a Bayesian learning mechanism and derive expected content demand from model primitives. This allows us to provide insights into how the effects of pricing and sampling on expected content demand are intertwined with the consumers' prior beliefs. We find that managers must consider two demand regimes, depending on whether consumers' prior expectations are “high” or “low.” In both cases, expected content demand is decreasing in price and increasing in expected posterior quality, as expected.
Importantly, we show that sampling can have a demand-enhancing effect through consumers' learning when prior expectations are sufficiently low (even though sampling produces a cannibalization effect). We establish the rule of thumb that sampling increases content demand if the elasticity of consumers' updated expectations exceeds the ratio of sampled to paid content. Third, to study the publisher's optimal pricing and sampling decisions, we bring our model of the content market together with a standard model of the advertising market. When content quality is common knowledge, we show that a paid content strategy is optimal for the publisher only if the effectiveness of advertising is sufficiently low. For intermediate levels of advertising effectiveness, the publisher should employ a sampling strategy and generate revenues from both sales and advertising. Once advertising is sufficiently effective, the publisher should switch to a free content strategy. Thus, it can be optimal for the publisher to offer content samples even if sampling solely cannibalizes content demand. In the case where consumers are uncertain about actual content quality and learn about it through inspection of the free samples, the optimal strategy is determined by the relationship between advertising effectiveness and quality expectations. As in the benchmark model, two cut-off values of advertising effectiveness determine the publisher's optimal content strategy: a lower bound that depends on prior quality expectations (separating paid from sampling strategies) and an upper bound that depends on posterior quality expectations (separating sampling from free content strategies). Interestingly, it can be optimal for the publisher to generate advertising revenue by adopting a sampling strategy even when sampling reduces both quality expectations and content demand.

1 Bhargava and Choudhary (2008) analyze optimal versioning of information goods.
In addition, we find that it can be optimal for the publisher to adopt a paid content strategy and to refrain from revealing high quality through free samples. Finally, we explore three extensions of the model to generate additional managerial insights. First, when the publisher also includes advertisements in the paid content, the cut-off values depend not only on the relationship between advertising effectiveness and quality expectations, but also on the consumers' attitudes towards advertisements. Second, we provide insight on how the publisher's optimal content strategy is determined when the willingness to pay for advertisements is related to content quality. Third, we introduce competition in the content market and analyze how asymmetries in prior beliefs affect the equilibrium content strategies. This paper is related to two literature streams. The first stream is on media strategy in two-sided markets.2 For instance, Kind, Nilssen, and Sørgard (2009) analyze how competition, captured by the number of media platforms and content differentiation between platforms, affects the composition of revenues from advertising and sales. Godes, Ofek, and Sarvary (2009) investigate a similar question, focusing on competition between platforms in different media industries. Our paper examines optimal advertising supported content sampling and content pricing when the firm can generate revenue from both content sales and advertising. While papers that examine content sampling from different perspectives include Xiang and Soberman (2011) for preview provision and Chellappa and Shivendu (2005) for piracy-mitigating strategies, neither consider the impact of sampling on advertising revenues. To the best of our knowledge, optimal content sampling when sampling impacts revenues from both content sales and online advertising has not been addressed by the literature. This paper is also related to the broad literature on consumer learning about product attributes. 
Firms typically enable consumer learning through disclosing information about their products and services. This information can be disclosed in various ways, including through informative advertising (see Anderson & Renault, 2006, and Bagwell, 2007 for a comprehensive survey), or product descriptions or third-party reviews (Hotz & Xiao, 2013; Sun, 2011). Another way for firms to disclose information is through sampling. Heiman, McWilliams, Shen, and Zilberman (2001) and Bawa and Shoemaker (2004) study how sampling affects demand and the evolution of market shares for consumer goods, while Boom (2009) and Wang and Zhang (2009) investigate sampling of information goods. However, when firms sample information goods, they only offer a portion of the good for free to avoid the “information paradox” (Akerlof, 1970). Consumers' inferences about a product's attributes are most naturally modeled in a Bayesian framework. Bayesian learning processes based on product experience have been widely employed in the literature, for instance, by Erdem and Keane (1996), Ackerberg (2003), and Erdem, Keane, and Sun (2008), and we follow this approach here.

2 See Rysman (2009) for a general review of the two-sided markets literature. Anderson and Gabszewicz (2006) provide a canonical survey of media and advertising.

We organize the remainder of the paper as follows. Section 2 presents the general framework and describes how the publisher operates in two markets: content and advertising. Section 3 describes the content market and studies the impact of sampling on expected content demand. Section 4 describes the advertising market. Section 5 analyzes the publisher's optimal content strategy. Section 6 offers extensions of our analysis. Conclusions and directions for future research are provided in Section 7. To facilitate exposition, we have relegated proofs to the Appendix.

2. General framework

We first introduce the three main components of our modeling framework: the publisher, the content market, and the advertising market. Next, we define the strategies available to the publisher and characterize the optimal advertising-sales revenue ratio and thus the optimal content strategy. Table 1 summarizes the components of the general framework (as well as its main assumptions) and indicates where the reduced-form expressions are replaced by specific functional forms.

Table 1
Components of the general framework.

| Component | Variables | General framework | Assumed properties | Explicit form |
| Publisher | Content parameters: $N$ … content size; $\bar V$ … maximum content quality | | | |
| Publisher | Cost parameters: $F$ … fixed production costs; $c_s$ … unit distribution costs | | | |
| Publisher | Decision variables: $p$ … content price; $n$ … sample size | | | |
| Content market | Prior parameters: $v_0$ … minimum estimate of $\bar V$; $\alpha$ … uncertainty about $v_0$ | | | |
| Content market | Posterior parameters: $\tilde v_0(n)$; $\alpha + n$ | | | |
| Content market | Preference parameters: $\theta$ … valuation of quality; $\xi$ … ad attraction/ad repulsion; $x$ … preferred product characteristic; $\tau$ … sensitivity to mismatch | | | |
| Content market | | Expected posterior quality $\tilde V(n)$ | $\tilde V'(n) \gtrless 0$ | Section 3.3 |
| Content market | | Expected content demand $D^E(p,n) \equiv D(p,n,\tilde V(n))$ | $D_p < 0$, $D_n < 0$, $D_{\tilde V} > 0$ | Proposition 2 |
| Content market | | | $D^E_n > 0$ … demand-enhancing sampling | Lemma 2 |
| Content market | | | $D^E_n < 0$ … demand-reducing sampling | Lemma 2 |
| Content market | | Indirect utility $u(p,n)$ | | Section 3.1 |
| Content market | | Conditional indirect utility $u_i(x; p_i, n_i)$ | | Section 6.3 |
| Advertising market | Advertiser parameter: $\phi$ … advertising effectiveness | | | |
| Advertising market | | Inverse advertising demand $a(n)$ … ads in free content | $a'(n) < 0$ | Section 4 |
| Advertising market | | Inverse advertising demand $a_p(N-n)$ … ads in paid content | $a_p'(N-n) < 0$ | Section 6.1 |
| Advertising market | | Endogenous ad effectiveness $\tilde\phi(n)$ | | Section 6.2 |

2.1. Publisher

We consider a publisher who offers a digital information good with content of size $N > 0$ through an online channel. Content size may be thought of as the number of chapters of a book or movie, the number of songs on an album, or the number of articles on a news platform.
To mirror the cost properties of information goods, we assume that the publisher has fixed costs $F \ge 0$ and zero unit costs to produce the content.3 The cost to provide digital access per subscriber is $c_s \ge 0$ and the costs of providing free samples are normalized to zero. The qualities of the content parts are distributed on the quality spectrum $[0, \bar V]$, where $\bar V$ is the publisher's private information. We treat quality $\bar V$ as an outcome of a previous strategic decision and suppose that the publisher has two decision variables: the sample size $n \in [0, N]$ and the price $p$ at which to sell the good.4 Notice that in the context of the metered model, the sample size $n$ is the “meter” and $p$ is the price for the content behind the “paywall,” which separates free content from paid content.

2.2. Content market

We consider a market with a unit measure of consumers that observe the publisher's sampling and pricing decisions. Consumers are uncertain about content quality. We assume that they update their prior expectations in a Bayesian fashion through inspection of the free samples and denote by $\tilde V(n)$ the consumers' expected posterior quality given the sample size $n$. Expected content demand depends on price $p$, sample size $n$, and $\tilde V(n)$. Specifically, we assume that the publisher's expected content demand is given by

$$D^E(p, n) \equiv D(p, n, \tilde V(n)). \qquad (1)$$

This representation emphasizes that the sample size has both a direct effect on content demand and an indirect effect that operates through the impact of $n$ on expected posterior quality $\tilde V(n)$.

3 Throughout the analysis, we assume that the fixed costs do not exceed the product market profit. Hence they do not change the analysis and can therefore be omitted.
4 The choice of $(p, n)$ is not a multidimensional signal for quality as studied, for instance, by Wilson (1985) and Milgrom and Roberts (1986). In this strand of the literature, $n$ is an advertising signal for quality.
However, in our setting, the publisher's choice of $n$ allows the consumers to gain information about the actual content quality through their sample experience before making the purchase decision.

Expected content demand satisfies the following basic assumptions. First, we assume that $\partial D/\partial p < 0$, i.e. content demand depends negatively on price. Second, we impose that $\partial D/\partial n < 0$, so that a larger sample size has a direct negative effect on demand for the remaining content accessible through the paywall. Third, we require that $\partial D/\partial \tilde V > 0$, i.e. content demand depends positively on expected posterior quality. The overall effect of the sample size $n$ on expected content demand is given by

$$\frac{\partial D^E}{\partial n} = \frac{\partial D}{\partial n} + \frac{\partial D}{\partial \tilde V}\,\tilde V'(n),$$

where the term $\frac{\partial D}{\partial \tilde V}\tilde V'(n)$ captures the indirect effect of the sample size on expected content demand. It is not clear a priori how the sample size affects posterior expectations and hence $\partial D^E/\partial n$. If $\tilde V'(n) < 0$, sampling reduces posterior expectations and is thus demand-reducing. Even if $\tilde V'(n) > 0$, that is, if sampling increases posterior expectations, offering an additional sample may be demand-reducing if the direct effect dominates the indirect effect. However, once $\tilde V'(n)$ is sufficiently large, the indirect effect is stronger than the direct effect so that $\partial D^E/\partial n > 0$ and sampling has a demand-enhancing effect. In line with Bawa and Shoemaker (2004), we refer to the direct effect of sampling on content demand as the “cannibalization effect” and to the indirect effect as the “expansion effect.”

2.3. Advertising market

We consider a market where the advertisements are delivered to the publisher through a representative advertiser (e.g., an advertising agency). We assume that the publisher includes one advertisement in each free content part. The inverse advertising demand is denoted by $a(n)$ and maps the publisher's choice of $n$ into the market price for ads.
Thus, $a(n)$ can be thought of as the advertiser's willingness to pay for placing $n$ advertisements. We make the natural assumption that the price for advertisements decreases in sample size, that is, $a'(n) < 0$.

2.4. Optimal strategy

The publisher receives profits from the content market and from the advertising market. The expected profit from content sales is $(p - c_s)\,D(p, n, \tilde V(n))$, while the profit (revenue) from including advertisements in the free articles is $a(n)\,n$. The publisher makes pricing and sampling decisions so as to maximize its (expected) profit from the two markets:

$$\max_{p,\,n}\ \pi(p, n) = (p - c_s)\,D(p, n, \tilde V(n)) + a(n)\,n \quad \text{s.t.} \quad p \ge 0,\ \ 0 \le n \le N. \qquad (2)$$

As long as the publisher's profit function $\pi(p, n)$ is concave, standard optimization theory posits that there is a unique constrained global maximizer $(p^*, n^*)$ because the constraint set is convex. Depending on the optimal pricing and sampling decision, the following definition gives the strategies available to the publisher.

Definition 1. Strategies
Given the optimal pricing and sampling decision $(p^*, n^*)$, the publisher adopts either (i) a “sampling strategy” if $p^* > 0$ and $n^* \in (0, N)$, (ii) a pure “paid content strategy” if $p^* > 0$ and $n^* = 0$, or (iii) a pure “free content strategy” if $p^* = 0$ and $n^* = N$.

Notice that both the paid content strategy and the free content strategy are nested within the sampling strategy: The publisher receives no advertising revenue under a paid content strategy and no sales revenue under a free content strategy. The following result describes the optimal strategy as the ratio of advertising revenue to sales revenue.

Proposition 1.
Advertising-sales revenue ratio
Under a sampling strategy, the publisher's optimal ratio of advertising revenue to sales revenue is given by

$$\frac{a\,n^*}{D\,p^*} = \frac{\eta_n - \eta_{\tilde V}\,\varepsilon_{\tilde V}}{\left(1 - \frac{1}{\eta_a}\right)\eta_p}, \qquad (3)$$

where $\eta_p \equiv -(\partial D/\partial p)(p/D)$ denotes the elasticity of content demand with respect to price, $\eta_n \equiv -(\partial D/\partial n)(n/D)$ denotes the elasticity of content demand with respect to sample size, $\eta_{\tilde V} \equiv (\partial D/\partial \tilde V)(\tilde V/D)$ denotes the elasticity of content demand with respect to quality, $\varepsilon_{\tilde V} \equiv \tilde V'(n)(n/\tilde V)$ denotes the elasticity of posterior quality expectations with respect to sample size, and $\eta_a \equiv -n'(a)(a/n)$ denotes the price elasticity of advertising demand.

This result has two important managerial insights: First, it shows that the publisher's advertising-sales revenue ratio and hence its optimal content strategy are determined by characteristics of both the content market and the advertising market. Specifically, consumer preferences determine the characteristics of the content market, captured by the elasticities of content demand with respect to price, sample size, and quality. The price elasticity of advertising demand reflects advertiser preferences. This general result thus provides guidance for managers seeking to better understand the contributions of (expected) sales and advertising to total revenue. Second, Proposition 1 shows how changes in the “market environment,” captured by the various elasticities, affect the publisher's composition of revenues. Unsurprisingly, if the price elasticity $\eta_p$ increases, the advertising-sales revenue ratio is lower. Intuitively, for a given sample size, the optimal price for the content is lower, which results in a higher sales revenue. In contrast, higher elasticity of content demand with respect to the sample size $\eta_n$ increases the advertising-sales revenue ratio. Furthermore, the higher the price elasticity of advertising demand $\eta_a$, the lower is the advertising-sales revenue ratio.
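The comparative statics just described can be checked numerically. The following is a minimal sketch, not part of the original analysis: it only evaluates the right-hand side of Eq. (3) for given elasticities. The function name `ad_sales_ratio`, its argument names, and the numerical values are illustrative assumptions.

```python
def ad_sales_ratio(eta_p, eta_n, eta_v, eps_v, eta_a):
    """Evaluate the right-hand side of Eq. (3).

    eta_p: price elasticity of content demand
    eta_n: elasticity of content demand with respect to sample size
    eta_v: elasticity of content demand with respect to quality
    eps_v: elasticity of posterior quality expectations w.r.t. sample size
    eta_a: price elasticity of advertising demand

    All names and the input checks below are assumptions of this sketch.
    """
    if eta_p <= 0:
        raise ValueError("eta_p must be positive")
    if eta_a == 1:
        raise ValueError("the ratio is undefined at eta_a = 1")
    return (eta_n - eta_v * eps_v) / ((1.0 - 1.0 / eta_a) * eta_p)


if __name__ == "__main__":
    # Hypothetical baseline elasticities, chosen only for illustration.
    base = dict(eta_p=2.0, eta_n=0.5, eta_v=1.0, eps_v=0.2, eta_a=4.0)
    print("baseline ratio:", ad_sales_ratio(**base))

    # Vary one elasticity at a time, holding the others at the baseline.
    for name, value in [("eta_p", 3.0), ("eta_n", 0.8), ("eta_a", 8.0)]:
        changed = {**base, name: value}
        print(f"{name} = {value}:", ad_sales_ratio(**changed))
```

Varying one argument at a time mirrors the verbal comparative statics: under these assumed values, a higher `eta_p` or `eta_a` lowers the ratio, while a higher `eta_n` raises it.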
Proposition 1 also highlights the crucial role which the elasticity of posterior quality expectations with respect to sample size plays. Because the elasticity of content with respect to quality ηṼ is positive, the impact of sampling on posterior quality determines the sign of ηṼ εṼ . If εṼ is negative, the ratio of advertising revenue to sales revenue tends to be high, while it tends to be low if εṼ is positive. Intuitively, if ε Ṽ b 0, sampling reduces expected content demand as Ṽ ′ ðnÞ b 0, and hence the advertising-sales revenue ratio is high. In contrast, if εṼ N0, sampling increases expected content demand as consumers revise their expectations about quality upwards, resulting in a lower advertisingsales revenue ratio. Interestingly, the optimal advertising-sales revenue ratio is reminiscent of the well-known Dorfman-Steiner condition, which states that a monopolist's ratio of advertising spending to sales revenue is equal to the ratio of the elasticities of demand with respect to advertising and price (Dorfman & Steiner, 1954). Proposition 1 reduces to this result in the special case when offering additional samples does not affect posterior quality ðε Ṽ ¼ 0Þ and if the advertising demand is perfectly elastic (ηa → ∞). Our general framework is agnostic about how consumers form posterior quality expectations. To shed light on effects of sampling on posterior quality expectations, the next section introduces a Bayesian learning mechanism in which consumers update their prior expectations about content quality through experience with the sample. Importantly, we show how the publisher should take the consumers' learning into account to gauge the effects of offering free samples on content demand. 196 D. Halbheer et al. / Intern. J. of Research in Marketing 31 (2014) 192–206 (B) (A) Fig. 1. Prior expectations about V (where V ≡ 1). 3. 
Content market This section analyzes the impact of customer selected sampling on expected content demand when the qualities of the content parts are ! " uniformly distributed on the quality spectrum 0; V . We begin by laying out assumptions regarding consumer behavior and describing the Bayesian learning mechanism. Taking consumers' learning into account, we derive the publisher's expected content demand given its pricing and sampling decisions. Finally, we provide conditions under which sampling increases or decreases expected content demand. 3.1. Consumer behavior Consumers know that the of the free samples are uniformly ! qualities " distributed on the interval 0; V , but they do not know the upper bound of the publisher's quality spectrum V and are hence uncertain about (average) content quality.5 Consumers have a common prior belief about V that may stem, for instance, from reviews, ratings or “word of mouth.” The natural conjugate family for a random sample from a uniform distribution with unknown upper bound is the Pareto distribution (DeGroot, 1970). We capture uncertainty about V by a prior belief that consists of a minimum estimate v0 of the upper bound V and a level of uncertainty α about this value. Specifically, we assume that prior beliefs follow a Pareto distribution with density function f ðvjv0 ; α Þ ¼ 8 α < αv0 αþ1 :v 0; ; ! " ðα þ nÞe v0 ðnÞ : v0 ðnÞ; α ¼ E V je α þ n−1 Consumers infer the expected quality of the information good E½Vjv1 ; …; vn ' from the average quality of the sampled content parts. Knowing that qualities are uniformly distributed on the quality spectrum offered, the expected quality of the information good is given by E½Vjv1 ; …; vn ' ¼ for v N v0 otherwise: ð4Þ Obviously, prior expectations increase in v0 and decrease in α. Fig. 1 illustrates two prior beliefs along with the corresponding expectations for different parameter values. 
Prior expectations are lower than actual quality in Panel A and higher than actual quality in Panel B. Note that prior expectations can be higher than actual quality even if v0 b V. Consumers update their prior belief about the upper bound of the quality spectrum V by taking the observed qualities of the free samples into account. Specifically, consumers use the n sample qualities Vi = vi Note that the upper bound V is monotonically related to the mean, which may be an alternative way for consumers to think about content quality. 6 Our measure of uncertainty corresponds to the scale parameter α of the Pareto distribution. Hence, when uncertainty is higher, the prior distribution is more spread out. ð5Þ % θNE½Vjv1 ; …; vn ' þ ξn−p; θnE½Vjv1 ; …; vn 'þξn; from purchasing at price p from staying with the f ree samples: The parameter ξ captures the intensity of ad-attraction (ξ N 0) or adrepulsion (ξ b 0) in the population (Gabszewicz, Laussel, & Sonnac, 2005). Thus, when consumers exhibit ad-loving behavior, the utility of both options is augmented by ξn, while the utility of both options is reduced by ξn in the case of ad-avoiding behavior. The value of the information good is equal to the number of content parts multiplied by their expected quality.8 This implies that a consumer will purchase the information good if and only if the indirect utility from buying exceeds the indirect utility from consuming only the free samples, that is, if θðN−nÞE½Vjv1 ; …; vn '−p ≥ 0: ð6Þ This condition means that the (quality-weighted) expected value of the content that has not been sampled must exceed the price. Importantly, purchase does not depend on consumer behavior towards advertising. 7 5 ! " E V je v0 ðnÞ; α : 2 Consumers believe that higher quality is better than lower quality but differ in the way they value quality. To capture this heterogeneity, we introduce a preference parameter for quality θ, which is uniformly distributed on the interval [0,1]. 
We assume that each consumer either purchases the information good at price p or stays with the n free samples. A consumer's indirect utility from these two options is given by uðp; nÞ ¼ We assume α N 1 to ensure existence of prior expectations.6 Based on the consumers' prior knowledge about v0 and α, their prior expectation about V is ! " αv0 E Vjv0 ; α ¼ : α−1 (i = 1, …, n) to form their posterior belief e vðnÞ about V. Using standard Bayesian analysis, e vðnÞ follows a Pareto distribution with minimum value parameter e v0 ðnÞ ¼ maxfv0 ; v1 ; …; vn g and shape parameter α + n (DeGroot, 1970).7 Hence, the posterior expectation of V is given by The proof of this result appears in the Appendix. This additivity assumption is justified for independently valued content parts. However, a concave or convex relationship between the value and the number of content parts might be more appropriate for interrelated content parts, that is, if the content parts are substitutes or complements. 8 D. Halbheer et al. / Intern. J. of Research in Marketing 31 (2014) 192–206 v0 b V, sampling may have a demand-enhancing effect. We next address this possibility. 3.2. Expected content demand Because consumers do not know the upper bound of the quality spectrum V with certainty, content demand is influenced by consumers' posterior quality expectations. Thus, when the publisher makes decisions about the sample size and the price, it has to base them on expected content demand as consumers have not yet evaluated sample qualities and updated their expectations about content quality. In order to compute expected content demand, we assume that the publisher knows the consumers' prior parameters v0 and α, which can be learned using standard market research techniques such as surveys. 
The calculation involves a three-step procedure: In the first step, the publisher computes the expected posterior quality by averaging posterior expectations about V as given by (5) across all possible realizations of sample qualities: E½E½VjV 1 ; …; V n '' ¼ In the second step, the publisher substitutes the expected posterior quality into the purchase condition given by (6) to obtain expected content demand: & p 2ðαþn−1Þ : ðN−nÞ ðα þ nÞE½e v0 ðnÞ' ð7Þ In the third step, the publisher calculates E½ e v0 ðnÞ' to obtain the expected content demand as a function of the underlying model parameters. This calculation leads to the following result. Proposition 2. Expected content demand When the publisher sells the information good at price p and offers n ∈ {1, …, N − 1} samples, then (a) if v0 b V, expected content demand is given by ( ) n p 2ðα þ n−1Þðn þ 1ÞV : Dfv0 b V g ðp; nÞ ¼ max 0; 1− ðN−nÞ ðαþnÞðvnþ1 þ nV nþ1 Þ E 0 ð8Þ (b) if v0 ≥ V, expected content demand is given by E Dfv0 ≥ % ð p; n Þ ¼ max 0; 1− Vg 3.3. The role of quality expectations For a given level of prior expectations about content quality, sampling either increases or decreases expected content demand. Whether sampling compensates for cannibalization through consumers' learning depends on the gap between posterior quality and actual quality. Expected posterior quality is 8 nþ1 > ðα þ nÞðvnþ1 þ nV Þ > 0 > < n ; if v0 b V e ðnÞ ¼ 2ðα þ n−1Þðn þ 1ÞV V > > ðαþnÞv0 > ; if v0 ≥ V : 2ðα þ n−1Þ ð10Þ e ðnÞ− V . Consumers overestimate (unand the quality gap is defined as V 2 derestimate) quality if the expected posterior quality is higher (lower) than actual quality. This leads to the following result. ðα þ nÞE½ e v0 ðnÞ' : 2ðα þ n−1Þ % E D ðp; nÞ ¼ max 0; 1− 197 & p 2ðα þ n−1Þ : ðN−nÞ ðα þ nÞv0 Lemma 1. Quality gap When v0 b V , consumers overestimate quality after their sample experience if v0 V N # $1 α−1 nþ1 ; αþn ð11Þ and underestimate it if the inequality is reversed. 
If v0 ≥ V , consumers overestimate quality irrespective of the sample size and their level of uncertainty α N 1. Prior expectations can be higher than actual quality even if v0 < V (a high level of uncertainty about v0 is captured by a low α). The condition in Eq. (11) applies when consumers overestimate quality based on posterior expectations: This is likely to be the case for a low α and when the publisher offers a small number of free articles n. On the other hand, consumers underestimate quality if their uncertainty is low and the sample size is large. When v0 b V, the shaded area in Fig. 2 illustrates the set of parameters for which consumers overestimate and underestimate quality, respectively. By construction, where α N 1, condition (11) holds and consumers overestimate quality. The parameter region for which cone ðnÞ → V as sumers overestimate quality shrinks as n gets larger since V 2 n → ∞, meaning that consumers learn actual quality once the sample size gets “large enough.” ð9Þ Expected content demand has the intuitive properties that we assumed in our general framework: it decreases in price and increases in expected posterior quality. In addition, sampling has both a direct demand-reducing effect and an indirect effect that operates through its impact on posterior expectations. The direct effect kicks in 1 through the factor N−n and mirrors the cannibalization effect ∂D b 0. ∂n This follows because a larger sample size reduces the utility of the remaining content consumers have to pay for. Proposition 2 nests the expected content demand under a paid content strategy (n = 0), in which case D(p, 0) is a function of prior quality expectations as there is no learning. In contrast, when the publisher employs a free content strategy (n = N), consumers do not purchase the information good as they can download it for free and hence D(p, N) ≡ 0. 
Which of the two demand functions reported in Proposition 2 emerges depends on the value of the minimum estimate $v_0$ about $\bar V$ vis-à-vis the actual upper bound of the quality spectrum $\bar V$: If $v_0 \ge \bar V$, sampling necessarily reduces expected content demand. In contrast, if

Fig. 2. The quality gap for the case $v_0 < \bar V$ (where $\bar V \equiv 10$). The shaded area indicates where consumers underestimate quality.

D. Halbheer et al. / Intern. J. of Research in Marketing 31 (2014) 192–206

The definition of $\tilde V(n)$ allows us to rewrite the expected content demand derived in Proposition 2 more compactly as

$$D^E(p, n) = \max\left\{0,\; 1 - \frac{p}{(N-n)\,\tilde V(n)}\right\}, \tag{12}$$

which is a specific version of the reduced-form demand function in Eq. (1). Hence the number of free samples $n$ has both a direct effect on expected content demand and an indirect effect that operates through posterior quality expectations $\tilde V(n)$. The next result uses this demand function to identify conditions under which sampling has a demand-enhancing effect (that is, $\frac{\partial D^E}{\partial n} > 0$).

Lemma 2. Effects of sampling
Offering free samples has a demand-enhancing effect if $\varepsilon_{\tilde V} > \frac{n}{N-n}$, that is, if the elasticity of consumers' posterior quality expectations exceeds the ratio of sampled to paid content.

Lemma 2 shows that offering free samples may increase expected content demand through consumers' learning, even though it produces a cannibalization effect. Intuitively, the indirect effect dominates the direct cannibalization effect if sampling results in a sufficiently large upward revision of consumers' prior expectations.

4. Advertising market

This section derives inverse advertising demand of a representative advertiser that places advertisements in the free samples offered by the publisher. The representative advertiser can be thought of as an advertising agency that delivers independent product advertisements to the publisher.
Following Godes, Ofek, and Sarvary (2009), we let the advertiser's utility from placing $n$ advertisements be given by

$$u(n) = \phi n - \frac{n^2}{2N} - an,$$

where $\phi > 0$ is a parameter capturing the marginal benefit of an advertisement and $a$ denotes the unit price set by the publisher to run an advertisement. This specification captures decreasing marginal utility from placing advertisements caused by decreasing consumer recall and retention rates for ads (Burke & Srull, 1988). The advertiser maximizes its utility to determine how many advertisements to run on the publisher's platform. The implied inverse advertising demand has the linear form

$$a(n) = \phi - \frac{n}{N}, \tag{13}$$

where $\phi$ is referred to as “advertising effectiveness.” Inverse demand slopes downward, implying that the publisher's revenue per ad impression is decreasing in sample size. Intuitively, $a(n)$ can be thought of as the advertiser's willingness to pay for placing $n$ advertisements.

5. Optimal content strategy

This section analyzes the publisher's optimal content strategy with customer-selected sampling. We first analyze the benchmark case in which the consumers know $\bar V$ and hence the publisher's quality spectrum. In this case, sampling does not affect the consumers' expectations about quality and simply serves to generate advertising revenues. Next, we analyze the case where $\bar V$ is not known to consumers. In contrast to the benchmark case, sampling not only generates advertising revenues but also influences consumers' expectations about quality. We consider three strategies: the paid content strategy, the sampling strategy, and the free content strategy. In order to study the tradeoffs between the three strategies, we focus on the case where $\phi > 1$.⁹ For expositional purposes, we normalize the costs of providing digital access $c_s$ to zero and present the analysis ignoring the integer constraint on the number of free samples.

5.1. Strategy with known quality

We first derive content demand under a sampling strategy and subsequently derive the demands for the two boundary strategies when consumers know content quality. Next, we characterize the optimal pricing and sampling decisions for each of the three strategies. Finally, we determine the optimal content strategy.

5.1.1. Content demand

When consumers know the upper bound $\bar V$ and hence the quality spectrum, the minimum estimate $v_0$ of the upper bound $\bar V$ is equal to $\bar V$ with certainty and thus $\alpha \to \infty$. Proposition 2 implies that content demand can be expressed as

$$D(p, n) = \max\left\{0,\; 1 - \frac{p}{(N-n)\,\frac{\bar V}{2}}\right\} \tag{14}$$

for $n \in \{0, \ldots, N-1\}$. Recall that $D(p, N) \equiv 0$ under a free content strategy.

5.1.2. Optimal pricing and sampling

In the benchmark case with known quality, the publisher makes its pricing and sampling decisions so as to

$$\max_{p,\,n}\; \pi(p, n) = p\left(1 - \frac{p}{(N-n)\,\frac{\bar V}{2}}\right) + \left(\phi - \frac{n}{N}\right) n \quad \text{s.t.} \quad p \ge 0,\; 0 \le n \le N,$$

where content demand is given by Eq. (14) and inverse advertising demand by Eq. (13). From the first-order conditions, the optimal price for a given sample size is

$$p(n) = \frac{(N-n)\,\bar V}{4}. \tag{15}$$

This implies that the more free samples the publisher chooses to offer, the less it will be able to charge the consumer for the remaining content. The next result summarizes the optimal pricing and sampling decisions for each of the three strategies.

Lemma 3. Pricing and sampling
Suppose that consumers know the upper bound of content quality $\bar V$. Then, (i) under a sampling strategy, $p^* = N\bar V\left(8(2-\phi) + \bar V\right)/64$ and $n^* = N\left(8\phi - \bar V\right)/16$, (ii) under a paid content strategy, $p^* = N\bar V/4$ and $n^* = 0$, and (iii) under a free content strategy, $p^* = 0$ and $n^* = N$.

The parameters $\bar V$ and $\phi$ have opposite effects on the optimal price and on the optimal sample size under a sampling strategy: As expected, $p^*$ increases in $\bar V$ while $n^*$ decreases in quality. In contrast, $p^*$ decreases in $\phi$, and $n^*$ increases in advertising effectiveness. Both the optimal price and the optimal sample size increase in content size $N$.
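The interior solution in Lemma 3 (i) can be cross-checked against a brute-force search over the benchmark profit function. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the parameter values ($N = 10$, $\bar V = 10$, $\phi = 2$) are chosen only so that $\phi$ lies strictly between $\bar V/8$ and $\bar V/8 + 2$, and the grid resolution is arbitrary.

```python
def profit_known_quality(p, n, N, v_bar, phi):
    """Benchmark profit: sales revenue p * D(p, n) plus advertising revenue a(n) * n."""
    if n < N:
        demand = max(0.0, 1.0 - p / ((N - n) * v_bar / 2.0))  # Eq. (14)
    else:
        demand = 0.0  # free content strategy: no content sales
    return p * demand + (phi - n / N) * n


def lemma3_sampling(N, v_bar, phi):
    """Closed-form price and sample size under a sampling strategy, Lemma 3 (i)."""
    p_star = N * v_bar * (8.0 * (2.0 - phi) + v_bar) / 64.0
    n_star = N * (8.0 * phi - v_bar) / 16.0
    return p_star, n_star


def grid_search(N, v_bar, phi, n_steps=40, p_steps=400):
    """Coarse search over (p, n); returns (best profit, p, n) on the grid."""
    p_max = N * v_bar / 2.0  # price at which demand vanishes when n = 0
    best = (float("-inf"), 0.0, 0.0)
    for i in range(n_steps + 1):
        n = N * i / n_steps
        for j in range(p_steps + 1):
            p = p_max * j / p_steps
            best = max(best, (profit_known_quality(p, n, N, v_bar, phi), p, n))
    return best


if __name__ == "__main__":
    N, v_bar, phi = 10, 10.0, 2.0  # assumed illustrative values
    p_star, n_star = lemma3_sampling(N, v_bar, phi)
    print("closed form:", p_star, n_star, profit_known_quality(p_star, n_star, N, v_bar, phi))
    print("grid search:", grid_search(N, v_bar, phi))
```

With these assumed values the closed form gives $p^* = 15.625$ and $n^* = 3.75$, which also satisfies $p^* = (N - n^*)\bar V/4$ from Eq. (15); no grid point should yield a higher profit.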
The following proposition characterizes the publisher's optimal strategy.

Proposition 3. Optimal strategy
If consumers know the quality spectrum, then: (i) if $\phi \le \frac{\bar V}{8}$, the publisher should follow a paid content strategy, (ii) if $\phi \in \left(\frac{\bar V}{8}, \frac{\bar V}{8} + 2\right)$, the publisher should employ a sampling strategy, and (iii) if $\phi \ge \frac{\bar V}{8} + 2$, the publisher's optimal strategy is a free content strategy.

9 When $\phi \le 1$, the free content strategy is never optimal because the advertising revenue and hence the publisher's profit are zero. Thus, the publisher's choice is only between the paid content strategy and the sampling strategy. The analysis where $\phi > 1$ encompasses this special case.

Proposition 3 shows that the choice of the optimal strategy is driven by the relationship between content quality $\bar V$ and advertising effectiveness $\phi$. Thus, for a given content quality, a paid content strategy is optimal if the effectiveness of advertising is sufficiently low. For intermediate levels of advertising effectiveness, a sampling strategy that generates revenues from both sales and advertising on the free samples is optimal. If advertising is sufficiently effective, the publisher should switch to a free content strategy. The effects of $\phi$ and $\bar V$ on the optimal strategy can also be understood by inspection of the advertising-sales revenue ratio. The ratio follows from (3) and is

$$\frac{a n^*}{D p^*} = \frac{\left(\phi - \frac{\bar V}{8}\right)\left(\frac{\bar V}{8} + \phi\right)}{\frac{\bar V}{4}\left(\frac{\bar V}{8} + 2 - \phi\right)}.$$

The ratio of advertising revenue to sales revenue tends to zero as $\phi$ approaches the lower bound $\frac{\bar V}{8}$, implying that the publisher should employ a paid content strategy. A sampling strategy is optimal only if advertising is not “too effective,” that is, as long as $\phi \le \frac{\bar V}{8} + 2$. Once $\phi$ exceeds this level, the publisher should switch to a free content strategy.

5.1.3. Summary

When content quality is known to consumers, the publisher's optimal strategy is determined by the relation between advertising effectiveness and content quality. The more effective advertising is, the more free samples the publisher should offer, even if sampling solely cannibalizes content demand.

5.2. Strategy with unknown quality

We first determine the optimal content strategy when consumers do not know content quality and compare our findings to the results from the benchmark case. Next, we study how the interplay of prior expectations and advertising effectiveness governs the optimal choice of content strategy.

5.2.1. Optimal pricing and sampling

When content quality is not known to consumers, the publisher makes its pricing and sampling decisions so as to

$$\max_{p,\,n}\; \pi^E(p, n) = p\left(1 - \frac{p}{(N-n)\,\tilde V(n)}\right) + \left(\phi - \frac{n}{N}\right) n \quad \text{s.t.} \quad p \ge 0,\; 0 \le n \le N,$$

where expected content demand is given by Eq. (12) and inverse advertising demand by Eq. (13). The only difference between this expected profit and the profit when content quality is known to consumers is the dependence on expected quality rather than actual (average) quality. Based on (15) and recalling that $\tilde V(n)$ is the posterior estimate of average quality $\frac{\bar V}{2}$, we thus obtain that

$$p(n) = \frac{(N-n)\,\tilde V(n)}{2}.$$

Substituting $p(n)$ back into the profit function allows us to rewrite the profit maximization problem as

$$\max_{n}\; \pi^E(n) = (N-n)\,\frac{\tilde V(n)}{4} + \left(\phi - \frac{n}{N}\right) n \quad \text{s.t.} \quad 0 \le n \le N. \tag{16}$$

In contrast to our benchmark case, it is not possible to characterize the publisher's optimal pricing and sampling decisions (and hence profits) analytically. Nevertheless, we have the following result.

Proposition 4. Optimal strategy
Suppose that consumers are uncertain about the upper bound of content quality $\bar V$ and that the profit function $\pi^E(n)$ is strictly concave. Then, there are cut-off values $\underline{\phi} = \frac{1}{4}\left(\tilde V(0) - N\tilde V'(0)\right)$ and $\bar\phi = 2 + \frac{\tilde V(N)}{4}$ such that a paid content strategy is optimal for $\phi \le \underline{\phi}$, a sampling strategy is optimal for $\phi \in (\underline{\phi}, \bar\phi)$, and a free content strategy is optimal for $\phi \ge \bar\phi$.

This result is consistent with the insights from the benchmark case when quality is known: a paid content strategy is optimal only if the advertising effectiveness is sufficiently low, a sampling strategy is optimal for intermediate levels of the advertising effectiveness, and the publisher should switch to a free content strategy once advertising is sufficiently effective. Fig. 3 illustrates the optimal strategy for varying advertising effectiveness $\phi$ and the expected profit for each strategy ($\pi^*_{SC}$ for the sampling strategy, $\pi^*_{PC}$ for the paid content strategy, and $\pi^*_{FC}$ for the free content strategy). Hence prior expectations determine the lower of the two cut-off values for a sampling strategy to be optimal whereas posterior expectations for sample size $n = N$ determine the upper cut-off value. In effect, $\underline{\phi}$ is determined by the impact of the “first” free content part on posterior expectations, while $\bar\phi$ is determined by posterior expectations after inspection of the “last” free content part.

Fig. 3. Optimal strategy with unknown quality (for $v_0 = 5$, $\alpha = 2$, $\bar V = 10$, and $N = 10$).

The next lemma shows that the model where quality $\bar V$ is not known to consumers nests the full information benchmark case (see Proposition 3).

Lemma 4. Cut-off values
Suppose that consumers are uncertain about content quality $\bar V$ and that the profit function $\pi^E(n)$ is strictly concave. Then, when consumers have correct quality expectations, that is, if $v_0 = \bar V$ and $\alpha \to \infty$, the lower bound $\underline{\phi}$ converges to $\frac{\bar V}{8}$ and the upper bound $\bar\phi$ converges to $\frac{\bar V}{8} + 2$.

5.2.2. The impact of prior expectations

Proposition 4 shows that the optimal strategy depends not only on advertising effectiveness $\phi$ and quality $\bar V$ as in the benchmark case, but also on the specific values of the prior parameters $v_0$ and $\alpha$ (as well as content size $N$). Fig. 4 illustrates the optimal strategy for given advertising effectiveness and prior expectations. Panel A depicts the cut-off thresholds between the different strategies in the $(\phi, v_0)$-space (given $\alpha = 2$). Similarly, Panel B illustrates the publisher's optimal strategy in the $(\phi, \alpha)$-space (given $v_0 = 5$).

Fig. 4. Optimal strategy (for $\bar V = 10$ and $N = 10$), Panels (A) and (B).

Here prior expectations are correct and coincide with actual quality when $v_0 = 5$ and $\alpha = 2$. The following observation summarizes our insights.

Observation 1. Prior expectations
If advertising effectiveness is high and actual content quality is relatively low, it can be optimal for the publisher to adopt a sampling strategy to generate advertising revenue even when it reduces prior quality expectations and content demand. In contrast, if advertising effectiveness is low and actual content quality is relatively high, it can be optimal to adopt a paid content strategy and not reveal high quality, even though sampling would increase content demand.

The shaded areas in Fig. 4 illustrate these important managerial insights. Notice that the larger areas correspond to the case where prior expectations are high relative to actual content quality. In such market environments, the publisher should sacrifice content demand to boost advertising revenues. The figure also shows that when prior expectations are sufficiently low relative to actual content quality (that is, if $v_0$ is small or $\alpha$ is large), the choice is between a sampling strategy and a free content strategy only. Intuitively, the publisher has an incentive to reveal its higher than expected quality through free samples, possibly offering its content for free. In contrast, when prior expectations are sufficiently high, the optimal strategy is either a paid content strategy or a free content strategy.
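Because the unknown-quality problem has no closed-form solution, the reduced problem in Eq. (16) is natural to explore numerically. The Python sketch below is a simplified illustration only: the function names are hypothetical, the search is restricted to integer sample sizes (whereas the analysis above ignores the integer constraint), and the parameter values ($v_0 = 5$, $\alpha = 2$, $\bar V = 10$, $N = 10$) are taken from the caption of Fig. 3 with three assumed values of $\phi$. It does not rely on the concavity assumption of Proposition 4; it simply compares the expected profit across all feasible $n$.

```python
def expected_posterior_quality(n, v0, alpha, v_bar):
    """Expected posterior quality V~(n) as in Eq. (10); requires alpha > 1."""
    scale = (alpha + n) / (2.0 * (alpha + n - 1.0))
    if v0 >= v_bar:
        return scale * v0
    expected_max = (v0 ** (n + 1) + n * v_bar ** (n + 1)) / ((n + 1) * v_bar ** n)
    return scale * expected_max


def expected_profit(n, phi, N, v0, alpha, v_bar):
    """Reduced-form expected profit pi^E(n) from Eq. (16)."""
    sales = (N - n) * expected_posterior_quality(n, v0, alpha, v_bar) / 4.0
    advertising = (phi - n / N) * n
    return sales + advertising


def best_strategy(phi, N, v0, alpha, v_bar):
    """Return the profit-maximizing integer sample size and a strategy label."""
    n_best = max(range(N + 1),
                 key=lambda n: expected_profit(n, phi, N, v0, alpha, v_bar))
    if n_best == 0:
        label = "paid content"
    elif n_best == N:
        label = "free content"
    else:
        label = "sampling"
    return n_best, label


if __name__ == "__main__":
    # Parameter values from the caption of Fig. 3; the phi values are assumptions.
    for phi in (1.0, 3.0, 10.0):
        print(phi, best_strategy(phi, 10, 5.0, 2.0, 10.0))
```

Sweeping $\phi$ over a finer range in this way gives a rough picture of where the strategy switches occur for a given set of prior parameters.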
In such a market environment with high prior expectations, the publisher has no incentive to reveal its lower than expected quality when $\phi$ is low.

5.2.3. Summary

When actual content quality is not known to consumers, the optimal strategy is determined by the relation between advertising effectiveness, prior quality expectations, and posterior quality expectations. As in the benchmark case, employing a paid content strategy is optimal only if advertising effectiveness is sufficiently low compared to prior quality expectations. For intermediate levels of advertising effectiveness, the publisher should use a sampling strategy. The publisher should switch to a free content strategy once advertising is sufficiently effective compared to posterior quality expectations. Counter to intuition, it can be optimal for the publisher to generate advertising revenue by adopting a sampling strategy even when sampling reduces both prior quality expectations and content demand. In addition, it can be optimal for the publisher to adopt a paid content strategy and to refrain from revealing high quality through free samples.

6. Extensions

This section relaxes three key assumptions and studies the effects on our findings. First, we allow the publisher to generate advertising revenues from paid content as well. Second, we allow advertising effectiveness to depend on content quality. Third, we introduce competition into the model.

6.1. Including advertisements in the paid content

In this section, we extend the model by allowing the publisher to include advertisements in both the free articles and the paid content. To this end, we assume that the market price for advertisements included in the paid content is given by

$$a_p(\hat n) = \phi_p - \frac{\hat n}{N}, \tag{17}$$

where $\hat n \equiv N - n$ and $\phi_p > 1$ denotes the advertising effectiveness for ads in the paid content. This inverse demand is a natural counterpart to the advertising demand $a(n)$ given by Eq.
(13) and indicates that the ad price depends on the number of articles $\hat n$ that have not been offered as free samples. Differences in the levels of advertising effectiveness $\phi_p$ and $\phi$ capture differences in reach or the degree of targeting in the advertising markets for paid and free content.

When allowing for advertisements in the paid content, a consumer's indirect utility from the two options is

$$\hat u(p, n) = \begin{cases} \theta N\,E[V \mid v_1, \ldots, v_n] + \xi N - p, & \text{from purchasing at price } p \\ \theta n\,E[V \mid v_1, \ldots, v_n] + \xi n, & \text{from staying with the free samples.} \end{cases}$$

Importantly, the utility of the purchase option now depends on the overall number of advertisements shown to the consumer rather than the number of ads contained in the free samples only. This augmented specification implies that a consumer will purchase the information good if and only if

$$\theta (N-n)\,E[V \mid v_1, \ldots, v_n] \ge p - \xi (N-n), \tag{18}$$

that is, if the expected value of the content that has not been sampled exceeds its full price $p - \xi(N-n)$. Intuitively, the full price of the content is lower if consumers are ad-lovers ($\xi > 0$) and higher if they are ad-avoiders (in which case exposure to the ads in the paid content results in a nuisance cost $\xi(N-n)$ that has to be added to the price $p$). From the purchase condition in Eq. (18), expected content demand can be derived as

$$\hat D^E(p, n) = \max\left\{0,\; 1 - \left(\frac{p}{N-n} - \xi\right)\frac{1}{E[V \mid v_1, \ldots, v_n]}\right\}. \tag{19}$$

Compared to the case where consumers are ad-neutral, content demand is higher when consumers are ad-lovers and lower when consumers are ad-avoiders.

The publisher makes its pricing and sampling decisions so as to

$$\max_{p,\,n}\; \pi^E(p, n) = \left(p + R_p(\hat n)\right)\hat D^E(p, n) + \left(\phi - \frac{n}{N}\right) n \quad \text{s.t.} \quad p \ge 0,\; 0 \le n \le N,$$

where expected content demand is given by Eq. (19) and inverse advertising demand by Eq. (17). Notice that $R_p(\hat n) \equiv a_p(\hat n)\,\hat n$ are the additional revenues from including ads in the paid content.
By definition, $R_p = 0$ under a free content strategy, while $R_p = (\phi_p - 1)N$ under a paid content strategy. We summarize our insights as follows.

Observation 2. Ads included in paid content
When the publisher includes advertisements in the paid content and consumers are neutral about advertisements or ad-lovers, the range of advertising effectiveness $\phi$ for which the sampling strategy is best expands. In contrast, when consumers are ad-avoiders, the range of $\phi$ for which the sampling strategy is optimal can shrink.

Fig. 5 illustrates the profit effects of including advertisements in the paid content when consumers are neutral about advertisements. Intuitively, exploiting revenues from advertisements in the paid content increases the unit margin from selling content, which translates into a higher profit under a sampling strategy. These profit effects are more pronounced when advertising effectiveness for ads in paid content $\phi_p$ increases. Further, the profit under a sampling strategy is higher when the consumers are ad-lovers (for given $\phi_p$) and lower when they are ad-avoiders.

6.2. Endogenizing advertising effectiveness

Up to now, we considered advertising effectiveness to be exogenous and independent of content quality. In this section, we relax this assumption and allow the advertiser's willingness to pay for advertisements to be positively related to expected posterior quality, so that content quality has a halo effect on ad effectiveness. Specifically, we let inverse advertising demand be given by

$$a(n) = \tilde\phi(n) - \frac{n}{N} \tag{20}$$

and capture the effect of expected posterior quality $\tilde V(n)$ on quality-adjusted advertising effectiveness $\tilde\phi(n)$ in the following way:

$$\tilde\phi(n) \equiv \phi\,\frac{\tilde V(n)}{\tilde V(0)},$$

where $\phi$ is advertising effectiveness (as introduced in Section 4) and $\tilde V(0)$ denotes prior quality expectations.
This specification reflects the idea that free samples become more attractive as an advertising platform when consumers' posterior expectations exceed their prior expectations, which in turn increases the advertiser's willingness to pay for the advertisements (and vice versa).

When advertising effectiveness is positively related to posterior content quality, offering free samples affects the willingness to pay for advertisements in two ways: through changes in the quality-adjusted advertising effectiveness $\tilde\phi(n)$ and through changes in the number of ads shown to consumers. Therefore, advertising demand can be increasing in the number of free samples when prior expectations are sufficiently low relative to consumers' updated expectations (see Panel A in Fig. 6 for a graphical illustration). Over the range where $a(n)$ increases, the marginal benefit of higher advertising effectiveness outweighs the disutility of showing an additional ad to consumers.

The publisher makes its pricing and sampling decisions so as to

$$\max_{p,\,n}\; \pi^E(p, n) = p\left(1 - \frac{p}{(N-n)\,\tilde V(n)}\right) + \left(\tilde\phi(n) - \frac{n}{N}\right) n \quad \text{s.t.} \quad p \ge 0,\; 0 \le n \le N,$$

where expected content demand is given by Eq. (12) and inverse advertising demand by Eq. (20). The next observation summarizes our insights.

Observation 3. Endogenous ad effectiveness
When consumers' prior expectations are low and the advertiser's willingness to pay for advertisements is positively related to the expected posterior content quality, it can be optimal for the publisher to adopt a free content strategy even when baseline advertising effectiveness $\phi$ is low.

Panel B of Fig. 6 illustrates this observation. In the figure, the lines depict the cut-off thresholds between the different strategies in the $(\phi, v_0)$-space.
The solid lines correspond to the case with quality-adjusted advertising effectiveness and show that a free content strategy can also be optimal for low $\phi$, whereas a sampling strategy yields a higher profit when advertising effectiveness is exogenous (indicated by the dashed lines reproduced from Panel A in Fig. 4). This occurs because when prior expectations are low, sampling not only reveals high quality, but also increases the advertiser's willingness to pay for ads and thus boosts advertising revenues.

Fig. 5. Optimal strategy with (dashed lines) and without advertisements in paid content (solid lines) for $\xi = 0$, $v_0 = 5$, $\alpha = 2$, $\bar V = 10$, $N = 10$, and $\phi_p = 1.1$.

Fig. 6. Impact of endogenizing advertising effectiveness (for $\alpha = 2$, $\bar V = 10$, $N = 10$; in addition, in Panel A $v_0 = 2$ and $\phi = 2$), Panels (A) and (B).

6.3. The impact of competition

Thus far, we have examined a publisher operating in a monopoly setting. Here, we allow for competition between two publishers that offer differentiated information goods. We add horizontal product differentiation to capture the intensity of competition between the two publishers.10 In the newspaper industry for instance, horizontal product differentiation may arise due to different political opinions among consumers (Gabszewicz, Laussel, & Sonnac, 2005).

10 All details of the model and the analysis of the competitive case are in the Appendix.

The analysis of the competitive case indicates that, as in the monopoly case, advertising effectiveness is a key driver of the publishers' strategy choice. Specifically, if consumers' prior expectations about content qualities are similar and advertising effectiveness is high, it is optimal for both publishers to adopt a free content strategy in equilibrium. However, two other drivers affect strategy choice in the competitive case: the gaps in prior expectations about the quality of the competing products and the degree of horizontal product differentiation. The following observation summarizes our insights.

Observation 4. Competition
If the gap in prior expectations regarding the product qualities is large and advertising effectiveness is high, both publishers should adopt a sampling strategy. As the degree of horizontal product differentiation increases and the products become less substitutable, the parameter region in which the sampling strategy is optimal for both publishers shrinks.

Counter to intuition, our results show that it is not per se optimal to choose a free content strategy when advertising effectiveness is high. What matters in addition is the gap in prior expectations between the two publishers: if it is small, then both firms should adopt a free content strategy. Instead, if the gap in prior expectations is large, then both firms should adopt a sampling strategy. Intuitively, the publisher that faces lower prior expectations from consumers has a stronger incentive to reveal its higher than expected quality, and it is a best response of the rival publisher to also adopt a sampling strategy.

The second insight is that the parameter region in which the sampling strategy is optimal for both publishers shrinks when products are perceived as less substitutable (e.g., when readers with a left-wing preference consider a right-wing newspaper a poorer alternative). For the publishers, this results in less ability to acquire consumers who purchase from the competitor. Consequently, the publishers try to generate as much revenue as possible from the advertising market by employing a free content strategy.
This result mirrors the findings from the monopoly case: as the degree of horizontal product differentiation increases and publishers tend to become monopolists in their respective market segment, they should employ a free content strategy once advertising effectiveness exceeds a certain threshold level. 7. Conclusion This paper analyzed digital content strategies when content sampling serves the dual purpose of disclosing content quality and generating advertising revenue. One of the key features of the model is that consumers evaluate free samples of their choice within the limit set by the publisher. Consumers then use the information gathered from the free samples to update their prior expectations about content quality in a Bayesian fashion. Taking consumers' quality updating into account, the publisher can adopt a sampling strategy, a paid content strategy, or a free content strategy. We derived several important results. First, we show in our general framework how the publisher's advertising-sales revenue ratio and hence its optimal content strategy is determined by characteristics of both the content market and the advertising market. We capture the characteristics of the content market by the elasticity of consumers' updated quality expectations and the elasticities of content demand with respect to price, sample size, and posterior quality. The corresponding characteristic in the advertising market is the price elasticity of advertising demand. We show that, all else equal, the advertising-sales revenue ratio is higher the lower the price elasticity of content demand, the higher the elasticity of content demand with respect to sample size, and the lower the price elasticity of advertising demand. 
In addition, managers can expect the ratio of advertising revenue to sales revenue to be low when sampling increases consumers' quality expectations (resulting from the expansion effect) and high when sampling reduces quality expectations (resulting from the cannibalization effect). Second, when consumers make purchase decisions based on the price of the content behind the paywall, content demand depends on consumers' posterior quality expectations, which can be influenced by the publisher through its sampling decision. Expected content demand has the natural properties that it is decreasing in price and increasing in expected posterior quality. Further, sampling has a demand-enhancing effect through consumers' learning when prior expectations are sufficiently low (even though sampling produces a cannibalization effect). We uncovered the rule of thumb that sampling increases content demand if the elasticity of consumers' updated expectations exceeds the ratio of sampled to paid content. Third, we characterize the publisher's optimal content strategy when consumers are uncertain about actual content quality and learn about it through inspection of free samples. We identify two cut-off values that determine the publisher's optimal content strategy: a lower bound that depends on prior quality expectations (separating paid from sampling strategies) and an upper bound that depends on posterior quality expectations (separating sampling from free content strategies). From a managerial perspective, it can be optimal to reduce both prior quality expectations and content demand in order to generate advertising revenue. In addition, it can be optimal for managers to adopt a paid content strategy and to refrain from revealing high quality through free samples. We also explore several model extensions that are relevant for managerial decision making. 
First, when the publisher also includes advertisements in the paid content, the analysis shows that the cut-off values between the content strategies depend not only on the relation between advertising effectiveness and updated quality expectations, but also on consumers' attitudes towards advertisements. Second, when the willingness to pay for advertisements is related to content quality, we show that a free content strategy can be optimal even when advertising effectiveness is low. Third, we show that under competition the advertising effectiveness is a key driver of the publishers' equilibrium strategy choices. This analysis also sheds light on recent developments in the newspaper industry and explains why publishers have moved away from pure advertising-financed business models to metered models (Abramson, 2010).

Our general framework offers several avenues for future research. Regarding consumers, we assume they correctly update quality expectations based on their sample experience. One alternative is to assume a consistent bias in the consumers' judgments. In addition, in circumstances where the firm selects the samples, consumers are likely to adjust (discount) observed quality, assuming that the publisher has provided a non-representative set of samples to choose from in order to persuade them to buy the paid content. Further, one could assume that consumers do not evaluate the quality of all free samples because of “sampling costs,” for example due to the opportunity cost of time or mental costs. One could also enrich the model by allowing for internal competition, where the publisher offers two websites to serve different categories of consumers, which relates to the versioning literature.11 Thus, there are several further directions which research in these areas could take.
We view this paper as a step in this process and hope the paper encourages work in these and related directions.

11 For instance, The Boston Globe operates the ad-supported site boston.com and the subscriber-only site BostonGlobe.com.

Appendix A

A.1. Sampling from a uniform distribution

A.1.1. The Pareto distribution

A random variable $X$ has a Pareto distribution with parameters $w_0$ and $\alpha$ ($w_0 > 0$ and $\alpha > 0$) if $X$ has a density

$$f(x \mid w_0, \alpha) = \begin{cases} \dfrac{\alpha w_0^{\alpha}}{x^{\alpha+1}} & \text{for } x > w_0 \\[1.5ex] 0 & \text{otherwise.} \end{cases}$$

For $\alpha > 1$ the expectation of $X$ exists and it is given by $E(X) = \frac{\alpha w_0}{\alpha - 1}$. Regarding sampling from a uniform distribution, we use the following result.

Theorem. (DeGroot, 1970)12
Suppose that $X_1, \ldots, X_n$ is a random sample from a uniform distribution on the interval $(0, W)$, where the value of $W$ is unknown. Suppose also that the prior distribution of $W$ is a Pareto distribution with parameters $w_0$ and $\alpha$ such that $w_0 > 0$ and $\alpha > 0$. Then the posterior distribution of $W$ when $X_i = x_i$ ($i = 1, \ldots, n$) is a Pareto distribution with parameters $w_0'$ and $\alpha + n$, where $w_0' = \max\{w_0, x_1, \ldots, x_n\}$.

Proof. For $w > w_0$, the prior density function $\xi$ of $W$ has the following form:

$$\xi(w) \propto \frac{1}{w^{\alpha+1}}.$$

Furthermore, $\xi(w) = 0$ for $w \le w_0$. The likelihood function $f_n(x_1, \ldots, x_n \mid w)$ of $X_i = x_i$ ($i = 1, \ldots, n$), when $W = w$ ($w > 0$), is given by13

$$f_n(x_1, \ldots, x_n \mid w) = f(x_1 \mid w) \cdots f(x_n \mid w) = \begin{cases} \dfrac{1}{w^{n}} & \text{for } \max\{x_1, \ldots, x_n\} < w \\[1.5ex] 0 & \text{otherwise.} \end{cases}$$

It follows from these relations that the posterior p.d.f. $\xi(w \mid x_1, \ldots, x_n)$ will be positive only for values $w$ such that $w > w_0$ and $w > \max\{x_1, \ldots, x_n\}$. Therefore, $\xi(w \mid \cdot) > 0$ only if $w > w_0'$. For $w > w_0'$, it follows from Bayes' theorem that

$$\xi(w \mid x_1, \ldots, x_n) \propto f_n(x_1, \ldots, x_n \mid w)\,\xi(w) = \frac{1}{w^{\alpha+n+1}}$$

(the marginal joint probability density function $f_n(x_1, \ldots, x_n)$ of $X_1, \ldots, X_n$ is a normalizing constant). □

12 Theorem 1, p. 172.

13 Given $W = w$, the random variables $X_1, \ldots, X_n$ are independent and identically distributed and the common probability density function of each of the random variables is $f(x_i \mid w)$.

A.2. Proofs

Proof of Proposition 1. By strict concavity of the profit function, the solution to the problem in Eq. (2) must satisfy the necessary and sufficient first-order conditions

$$D(p, n, \tilde V(n)) + (p - c_s)\,\frac{\partial D(p, n, \tilde V(n))}{\partial p} + \lambda_1 = 0 \tag{A.1}$$

$$(p - c_s)\left(\frac{\partial D(p, n, \tilde V(n))}{\partial n} + \frac{\partial D(p, n, \tilde V(n))}{\partial \tilde V}\,\tilde V'(n)\right) + a'(n)\,n + a(n) + \lambda_2 - \lambda_3 = 0 \tag{A.2}$$

and the constraints $\lambda_1 p = 0$, $\lambda_2 n = 0$, and $\lambda_3 (n - N) = 0$, where the $\lambda_i$'s are non-negative real numbers (whose existence is assured by the Kuhn–Tucker theorem). Suppressing the arguments of content demand, (A.1) can be rewritten as

$$\frac{p - c_s}{p} = \frac{1}{\eta_p}\left(1 + \frac{\lambda_1}{D}\right). \tag{A.3}$$

Dividing Eq. (A.2) through by $p$ and substituting from Eq. (A.3) produces

$$\frac{1}{\eta_p}\left(1 + \frac{\lambda_1}{D}\right)\left(\frac{\partial D}{\partial n} + \frac{\partial D}{\partial \tilde V}\,\tilde V'(n)\right) + \frac{a'(n)\,n}{p} + \frac{a(n)}{p} + \frac{\lambda_2 - \lambda_3}{p} = 0.$$

Recalling that $n'(a) = \frac{1}{a'(n)}$ (from the inverse function theorem) and using the definitions of the respective elasticities, the preceding equation can be rearranged to obtain

$$\frac{pD}{an}\,\frac{1}{\eta_p}\left(1 + \frac{\lambda_1}{D}\right)\left(\eta_n - \eta_{\tilde V}\,\varepsilon_{\tilde V}\right) = 1 - \frac{1}{\eta_a} + \frac{\lambda_2 - \lambda_3}{a}. \tag{A.4}$$

Under a sampling strategy there is an interior solution and hence the $\lambda_k$'s are zero. Thus, Eq. (A.4) can be rewritten as

$$\frac{an}{Dp} = \frac{\eta_n - \eta_{\tilde V}\,\varepsilon_{\tilde V}}{\left(1 - \frac{1}{\eta_a}\right)\eta_p}.$$

□

Proof of Proposition 2. (a) In order to calculate $E[\tilde v_0(n)]$ when $v_0 < \bar V$, we first derive the distribution of $\tilde v_0(n) = \max\{v_0, V_1, \ldots, V_n\}$. Before doing so, we state a preliminary fact: The distribution function of $M = \max\{V_1, \ldots, V_n\}$ is given by

$$F_M(t) \equiv \Pr\{\max\{V_1, \ldots, V_n\} \le t\} = \Pr\{\{V_1 \le t\} \cap \ldots \cap \{V_n \le t\}\} = \prod_{i=1}^{n} \Pr\{V_i \le t\} = \left(\frac{t}{\bar V}\right)^{n}. \tag{A.5}$$

As an immediate implication, the density function of $M$ is given by

$$f_M(t) = \frac{n t^{n-1}}{\bar V^{n}}. \tag{A.6}$$

Next, we derive the density function of $\tilde v_0(n)$. By definition, $\tilde v_0(n)$ cannot be smaller than $v_0$. Therefore, $\tilde v_0(n) = v_0$ if and only if $\max\{V_1, \ldots, V_n\} \le v_0$. The probability of this event follows from Eq. (A.5) and it is given by

$$F_M(v_0) = \left(\frac{v_0}{\bar V}\right)^{n}.$$

For $\tilde v_0(n) > v_0$, let $\tilde F(\cdot)$ denote the truncated distribution function of $\tilde v_0(n)$.
After removing the lower part of the distribution, we have $\tilde F(t) = F_M(t) - F_M(v_0)$ for $t \in (v_0, \bar V]$. This implies $\tilde f(t) = f_M(t)$ for $t \in (v_0, \bar V]$, and hence

\[
\tilde f(t) = \frac{n t^{n-1}}{\bar V^{n}}, \quad \text{if } v_0 \le t \le \bar V
\]

by Eq. (A.6). The distribution of $\tilde v_0(n)$ has a mixed structure with

\[
\Pr\{\tilde v_0(n) = v_0\} = \left( \frac{v_0}{\bar V} \right)^{n}
\tag{A.7}
\]

and density

\[
\tilde f(t) = \frac{n t^{n-1}}{\bar V^{n}}, \quad \text{if } v_0 \le t \le \bar V.
\tag{A.8}
\]

The expectation of this mixed distribution is given by

\[
E[\tilde v_0(n)] = v_0 \left( \frac{v_0}{\bar V} \right)^{n} + \int_{v_0}^{\bar V} \frac{n t^{n}}{\bar V^{n}}\,dt = \frac{v_0^{n+1} + n \bar V^{n+1}}{(n+1)\,\bar V^{n}}.
\]

Substituting this expression into Eq. (7) produces Eq. (8).

(b) If $v_0 \ge \bar V$, then $\tilde v_0(n)$ is equal to $v_0$, which in turn implies that $E[\tilde v_0(n)] = v_0$. Substituting this expression into Eq. (7) yields Eq. (9). □

Proof of Lemma 1. If $v_0 < \bar V$, the quality gap can be expressed as

\[
\tilde V(n) - \frac{\bar V}{2} = \frac{v_0^{n+1}(\alpha + n) - \bar V^{n+1}(\alpha - 1)}{2(\alpha + n - 1)(n + 1)\,\bar V^{n}}.
\tag{A.9}
\]

Clearly, the sign of the quality gap depends only on the sign of the numerator of (A.9). The latter can easily be rearranged to obtain Eq. (11). If $v_0 \ge \bar V$, the quality gap can be written as

\[
\tilde V(n) - \frac{\bar V}{2} = \frac{(\alpha + n)(v_0 - \bar V) + \bar V}{2(\alpha + n - 1)},
\]

which is strictly positive by our assumptions. □

Proof of Lemma 2. Differentiating Eq. (12) with respect to $n$ yields

\[
\frac{\partial D^{E}(p, n)}{\partial n} = \frac{p\left( (N - n)\,\tilde V'(n) - \tilde V(n) \right)}{\left( (N - n)\,\tilde V(n) \right)^{2}}.
\]

Clearly, sampling is demand-enhancing if $(N - n)\,\tilde V'(n) - \tilde V(n) > 0$, which can be rewritten as

\[
\frac{\tilde V'(n)\,n}{\tilde V(n)} > \frac{n}{N - n}.
\]
□

Proof of Lemma 3. The optimal decisions on the size of the sample and on the price follow from solving the Kuhn–Tucker conditions in Proposition 1.¹⁴ Under a sampling strategy, the $\lambda_k$'s are zero and it follows that $p^{*} = N \bar V \left( 8(2 - \phi) + \bar V \right)/64$ and $n^{*} = N \left( 8\phi - \bar V \right)/16$. Under a paid content strategy, $\lambda_1 = \lambda_3 = 0$, leading to $p^{*} = N \bar V/4$ and $n^{*} = 0$. Under a free content strategy, we have that $p^{*} = 0$ and $n^{*} = N$. □

¹⁴ It is straightforward to show that the objective function is concave for all parameter values.

Proof of Proposition 3. Using Lemma 3, it is straightforward to derive the profits under a free content strategy (FC) and a paid content strategy (PC). The profits are given by, respectively, $\pi^{*}_{FC} = (\phi - 1)N$ and $\pi^{*}_{PC} = N \bar V/8$. Comparing the two profits shows that $\pi^{*}_{FC} \ge \pi^{*}_{PC}$ if and only if $\phi \ge \frac{\bar V}{8} + 1$. The profit under a sampling strategy (SC) follows from Lemma 3 and is given by $\pi^{*}_{SC} = N\left( \bar V^{2} - 16 \bar V(\phi - 2) + 64\phi^{2} \right)/256$. Employing a sampling strategy is optimal if $\pi^{*}_{SC} > \pi^{*}_{PC}$ and $\pi^{*}_{SC} > \pi^{*}_{FC}$. It is immediate that these conditions hold if $\phi \in \left( \frac{\bar V}{8}, \frac{\bar V}{8} + 2 \right)$. A paid content strategy is optimal if $\pi^{*}_{PC} \ge \pi^{*}_{SC}$ and $\pi^{*}_{PC} \ge \pi^{*}_{FC}$, that is, if $\phi \le \frac{\bar V}{8}$. A free content strategy is optimal if $\pi^{*}_{FC} \ge \pi^{*}_{SC}$ and $\pi^{*}_{FC} \ge \pi^{*}_{PC}$, that is, if $\phi \ge \frac{\bar V}{8} + 2$. □

Proof of Proposition 4. At an interior solution, the optimal sample size $n^{*}$ satisfies the first-order condition

\[
\frac{(N - n^{*})\,\tilde V'(n^{*})}{4} - \frac{\tilde V(n^{*})}{4} + \phi - \frac{2 n^{*}}{N} = 0.
\]

For a corner solution involving $n^{*} = 0$, the Kuhn–Tucker conditions imply

\[
\frac{N \tilde V'(0)}{4} - \frac{\tilde V(0)}{4} + \phi \le 0 \quad \Leftrightarrow \quad \phi \le \underline{\phi}.
\]

At the other extreme, when $n^{*} = N$, the Kuhn–Tucker conditions require that

\[
-\frac{\tilde V(N)}{4} + \phi - 2 \ge 0 \quad \Leftrightarrow \quad \phi \ge \bar{\phi}.
\]
□

Proof of Lemma 4. Using the definition of $\tilde V(n)$ in Eq. (10), the lower bound can be expressed in terms of the underlying model parameters as

\[
\underline{\phi} = \frac{\left( 2\alpha(\alpha - 1) + N \right) v_0}{16(\alpha - 1)^{2}}.
\]

Setting $v_0 = \bar V$ and letting $\alpha \to \infty$ yields that $\underline{\phi} \to \frac{\bar V}{8}$. Likewise, we have that

\[
\bar{\phi} = \frac{(\alpha + N)\,\bar V}{8(\alpha + N - 1)} + 2.
\]

Letting $\alpha \to \infty$, we obtain $\bar{\phi} \to \frac{\bar V}{8} + 2$. □

D. Halbheer et al. / Intern. J. of Research in Marketing 31 (2014) 192–206

A.3. Analysis of the competitive case

We study competition between two publishers indexed by $i = 1, 2$ and suppose that they offer differentiated information goods to a population of consumers through an online platform. We frame the analysis in terms of the newspaper market and assume that the readers (consumers) can be politically ranked from left to right on the political spectrum captured by the unit interval $[0, 1]$ (Gabszewicz, Laussel, & Sonnac, 2005).
Horizontal differentiation captures different editorial opinions, and we assume that the publishers are located at the extremes of the political spectrum at $x_1 = 0$ and $x_2 = 1$, respectively. Vertical differentiation captures the firms' different content qualities such as the level of investigative reporting (which are unrelated to the publishers' political orientation). We again assume that the perceived quality $q_i(n_i)$ of information good $i$ is equal to its number of content parts $N_i$ multiplied by its expected posterior quality, that is, $q_i(n_i) = N_i\,E[V_i \mid v_{1i}, \ldots, v_{ni}]$. Consumers make a discrete choice and decide which of the two information goods to purchase. A consumer's conditional indirect utility from buying information good $i$ is given by

\[
u_i(x; p_i, n_i) = q_i(n_i) - \tau\,|x - x_i| - p_i,
\]

where $x \in [0, 1]$ is the consumer's political orientation and the parameter $\tau > 0$ captures the sensitivity to horizontal mismatch $|x - x_i|$. Intuitively, the mismatch arises because consumers make a discrete choice and cannot purchase the information good that perfectly matches their political preferences. Political orientations are drawn independently across consumers from a uniform distribution over the interval $[0, 1]$. The publishers compete for consumers by making their pricing and sampling decisions.

To derive expected content demands, we determine the location $\hat x$ of the consumer who is indifferent between buying from publisher 1 and from publisher 2 for given prices $p = (p_1, p_2)$ and sample sizes $n = (n_1, n_2)$. Clearly, the location of the indifferent consumer $\hat x(p, n)$ is a solution to the indifference condition $u_1(\hat x(p, n)) = u_2(\hat x(p, n))$, which ensures that the indirect utilities from the two information goods are the same.
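With publishers located at $x_1 = 0$ and $x_2 = 1$ and linear mismatch, the indifference condition $q_1 - \tau \hat x - p_1 = q_2 - \tau(1 - \hat x) - p_2$ can be solved in closed form. The following Python sketch illustrates this; the function name and the clamping of the result to $[0, 1]$ (so that one publisher serves the whole market at the boundary) are illustrative assumptions rather than part of the original analysis.

```python
def indifferent_consumer(q1, q2, p1, p2, tau):
    """Location x_hat at which u1(x_hat) == u2(x_hat).

    Assumes publishers at x1 = 0 and x2 = 1 and utilities of the form
    u_i = q_i - tau * |x - x_i| - p_i, so that
    q1 - tau * x - p1 == q2 - tau * (1 - x) - p2.
    """
    if tau <= 0:
        raise ValueError("tau must be strictly positive")
    x_hat = 0.5 + ((q1 - p1) - (q2 - p2)) / (2.0 * tau)
    # Assumption: clamp to the unit interval when one good dominates everywhere.
    return min(1.0, max(0.0, x_hat))


# Hypothetical values: equal net qualities split the market in half.
print(indifferent_consumer(q1=5.0, q2=5.0, p1=1.0, p2=1.0, tau=2.0))
```

A higher perceived quality or a lower price for publisher 1 moves the indifferent consumer to the right, and a larger $\tau$ dampens that shift.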
With linear mismatch, the consumer located at $\hat x(p, n)$ segments the market: Consumers located to the left of $\hat x(p, n)$ purchase from publisher 1, while consumers located to the right of $\hat x(p, n)$ purchase from publisher 2.¹⁵ Ignoring cannibalization, publisher 1 thus faces $\hat x(p, n)$ consumers, while publisher 2 faces $1 - \hat x(p, n)$ consumers. To capture that sampling not only reveals quality but also cannibalizes sales, we let $\frac{N_i - n_i}{N_i}$ denote the conditional purchase probability given sample size $n_i$. Hence expected content demands can be expressed as

\[
D_1^{E}(p, n) = \frac{N_1 - n_1}{N_1}\,\hat x(p, n)
\quad \text{and} \quad
D_2^{E}(p, n) = \frac{N_2 - n_2}{N_2}\,\bigl( 1 - \hat x(p, n) \bigr).
\]

Consumers thus choose their preferred publisher based on prices and posterior quality expectations and subsequently purchase the content with probability $\frac{N_i - n_i}{N_i}$. Similar to the monopoly case, the sampling decision $n_i$ therefore has a direct effect on expected content demand through the conditional purchase probability (a cannibalization effect) and an indirect effect on $\hat x(p, n)$ (an expansion effect). Note that publisher $i$'s expected content demand is zero under a free content strategy due to the cannibalization effect.

Publisher $i$ makes its pricing and sampling decisions so as to

\[
\max_{p_i,\,n_i}\ \pi_i^{E}(p, n) = p_i\,D_i^{E}(p, n) + \left( \phi_i - \frac{n_i}{N_i} \right) n_i
\quad \text{s.t.} \quad p_i \ge 0,\ \ 0 \le n_i \le N_i,
\]

where $\phi_i$ is the effectiveness of publisher $i$'s advertising. Compared to the monopoly case, each publisher now has to take into account the rival's choice of strategy to make its optimal decision. Thus, there are nine possible outcomes, summarized in Fig. A.1. If both firms use a paid content strategy, the firms' corresponding (expected) profits are denoted by $\pi_1^{PP}$ and $\pi_2^{PP}$, respectively (and likewise for the other outcomes). For each outcome, the profit levels can be obtained by solving the publishers' decision problems. The optimal strategy choices are then obtained as a Nash equilibrium of the matrix game depicted in Fig. A.1.
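To illustrate how a matrix game of this kind can be solved, the sketch below enumerates the pure-strategy Nash equilibria of a 3 × 3 game by checking mutual best responses. It is not the authors' code: the function name, the dictionary layout, and all profit values are hypothetical and were chosen only to exercise the routine. In an actual application each cell would hold the profits obtained from solving the publishers' decision problems for that strategy pair.

```python
from itertools import product

STRATEGIES = ("PC", "SC", "FC")  # paid content, sampling, free content


def pure_nash_equilibria(profit1, profit2):
    """Return all pure-strategy Nash equilibria of a two-publisher game.

    profit1[s1][s2] and profit2[s1][s2] are the profits of publishers 1 and 2
    when publisher 1 plays s1 and publisher 2 plays s2.
    """
    equilibria = []
    for s1, s2 in product(STRATEGIES, repeat=2):
        # s1 must be a best response to s2, and s2 a best response to s1.
        best1 = all(profit1[s1][s2] >= profit1[d][s2] for d in STRATEGIES)
        best2 = all(profit2[s1][s2] >= profit2[s1][d] for d in STRATEGIES)
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria


# Hypothetical profits for publisher 1 (rows: own strategy, columns: rival's).
example_profit1 = {
    "PC": {"PC": 3.0, "SC": 2.0, "FC": 1.0},
    "SC": {"PC": 4.0, "SC": 3.5, "FC": 2.0},
    "FC": {"PC": 2.5, "SC": 3.0, "FC": 1.5},
}
# Assumption: a symmetric game, so publisher 2's profits mirror publisher 1's.
example_profit2 = {
    s1: {s2: example_profit1[s2][s1] for s2 in STRATEGIES} for s1 in STRATEGIES
}

print(pure_nash_equilibria(example_profit1, example_profit2))
```

Enumeration is sufficient here because each publisher has only three strategies; a game may have several pure-strategy equilibria or none, in which case the returned list is longer or empty.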
To analyze the optimal strategy choice, we focus on a market environment where consumers have different prior beliefs about the content quality of the two publishers. Specifically, we suppose that the consumers' minimum estimate $v_{01}$ differs from $v_{02}$, while the publishers are actually symmetric and in particular offer the same quality spectrum ($\bar V_1 = \bar V_2$). Fig. A.2 illustrates the publishers' optimal strategy choices as a function of the gap in prior expectations and advertising effectiveness ($\phi_i \equiv \phi$). The figure holds $v_{02}$ constant (at $v_{02} = 1$) and plots different values of $v_{01}$ on the vertical axes. Notice that where $v_{01} < v_{02}$, consumers believe that publisher 1 offers a lower content quality than publisher 2.

¹⁵ See Anderson, de Palma, and Thisse (1992) for an in-depth treatment of Hotelling-type models.

Fig. A.1. Strategy choices and corresponding profits.

The key insights from the analysis of the competitive case can be summarized as follows: If consumers' prior expectations about content qualities are similar and advertising effectiveness is high, it is optimal for both publishers to adopt a free content strategy in equilibrium. Instead, if the gap in prior expectations is large and advertising effectiveness is high, both publishers should adopt a sampling strategy. Further, the parameter region in which the sampling strategy is optimal for both publishers shrinks when consumers are more sensitive to horizontal mismatch and products thus are less substitutable.

To understand these insights, it is important to notice that each point in the $(\phi, v_{01})$-space in Fig. A.2 corresponds to a (pure-strategy) Nash equilibrium of the matrix game described above.¹⁶ The solid line depicts the cut-off threshold between the two types of Nash equilibria: sampling (SC, SC) and free content (FC, FC). Counter to intuition, it is not per se optimal to choose a free content strategy when advertising effectiveness $\phi$ is high.
What matters in addition is the gap in prior expectations among the two publishers: if it is small, then both firms should adopt a free content strategy. Instead, if the gap in prior expectations is large, then both firms should adopt a sampling strategy. Intuitively, publisher 1 has a stronger incentive to reveal its higher than expected quality—and it is a best-response of publisher 2 to also adopt a sampling strategy (that is, publisher 2 attains a higher profit under a sampling strategy than under a free content strategy when taking publisher 1's strategy as given). Fig. A.2 also illustrates that the parameter region in which the sampling strategy is optimal for both publishers shrinks when consumers are more sensitive to horizontal mismatch τ. Intuitively, a higher τ means that products are less substitutable from the viewpoint of the consumers (e.g., when readers with left-wing preferences consider a right-wing newspaper a less good alternative). For the publishers, this results in a lower ability to engage in business stealing in the content market. Consequently, the firms try to generate as much revenue as possible in the advertising market by employing a free content strategy. This comparative statics result mirrors the findings from the monopoly case (cf. Fig. 3): When τ increases and publishers become monopolists on their respective market segment, the firms should employ a free content strategy once ϕ exceeds a certain threshold level—irrespective of the gap in prior expectations (observe that this threshold level lies at around ϕ = 2 in Fig. A.2). Up to now, the focus has been on strategy choices in the upper-right corner of the matrix game in Fig. A.1. The reason is that these strategy combinations capture the essence of the debate in the newspaper industry: whether to offer the content for free or to employ a metered model. 
Of course, there is also a symmetric industry configuration where both publishers employ a paid content strategy. Unsurprisingly, this equilibrium arises if advertising effectiveness is low. In addition, there are asymmetric strategy choices such as (PC, SC), which are optimal when consumers overestimate the content quality of publisher 1 (and hence the incentive to not reveal low quality) and underestimate the content quality of publisher 2 (and hence the incentive to reveal high quality through free samples).

¹⁶ The code to numerically compute the Nash equilibria is available from the authors upon request.

Fig. A.2. Equilibrium strategies (for $v_{02} = 1$, $\alpha = 2$, $\bar V_i = 3$, and $N_i = 10$).

References

Abramson, J. (2010). Sustaining quality journalism. Daedalus, 39–44.
Ackerberg, D. A. (2003). Advertising, learning, and consumer choice in experience good markets: An empirical examination. International Economic Review, 44(3), 1007–1040.
Akerlof, G. A. (1970). The market for lemons. Quarterly Journal of Economics, 84(3), 488–500.
Anderson, S. P., de Palma, A., & Thisse, J.-F. (1992). Discrete choice theory of product differentiation. Cambridge, MA: MIT Press.
Anderson, S. P., & Gabszewicz, J. J. (2006). The media and advertising: A tale of two-sided markets. In V. Ginsburgh, & D. Throsby (Eds.), Handbook of the economics of art and culture (pp. 567–614). Elsevier.
Anderson, S. P., & Renault, R. (2006). Advertising content. American Economic Review, 96, 93–113.
Bagwell, K. (2007). The economic analysis of advertising. In M. Armstrong, & R. Porter (Eds.), Handbook of industrial organization, Vol. III (pp. 1701–1844). Elsevier.
Bawa, K., & Shoemaker, R. (2004). The effects of free sample promotions on incremental brand sales. Marketing Science, 23(3), 345–363.
Bhargava, H. K., & Choudhary, V. (2008). Research note: When is versioning optimal for information goods? Management Science, 54(5), 1029–1035.
Boom, A. (2009). Download for free—When do providers of digital goods offer free samples? Working Paper. Copenhagen Business School.
Burke, R. R., & Srull, T. K. (1988). Competitive interference and consumer memory for advertising. Journal of Consumer Research, 15(1), 55–68.
Chellappa, R. K., & Shivendu, S. (2005). Managing piracy: Pricing and sampling strategies for digital experience goods in vertically segmented markets. Information Systems Research, 16(4), 400–417.
Cheng, H. K., & Tang, Q. C. (2010). Free trial or no free trial: Optimal software product design with network effects. European Journal of Operational Research, 205, 437–447.
DeGroot, M. (1970). Optimal statistical decisions. New York: McGraw-Hill.
Dorfman, R., & Steiner, P. O. (1954). Optimal advertising and optimal quality. American Economic Review, 44(5), 826–836.
Erdem, T., & Keane, M. P. (1996). Decision-making under uncertainty: Capturing dynamic brand choice processes in turbulent consumer goods markets. Marketing Science, 15(1), 1–20.
Erdem, T., Keane, M. P., & Sun, B. (2008). A dynamic model of brand choice when price and advertising signal product quality. Marketing Science, 27(6), 1111–1125.
Faugère, C., & Tayi, G. K. (2007). Designing free software samples: A game theoretic approach. Information Technology Management, 8, 263–278.
Gabszewicz, J. J., Laussel, D., & Sonnac, N. (2005). Attitudes toward advertising and price competition in the press industry. In V. A. Ginsburgh (Ed.), Economics of art and culture (pp. 61–74). Elsevier.
GlobeNewswire (April 5). Wall Street Journal offers digital ‘open house’ today; access includes online, apps for iphone, ipad. http://www.globenewswire.com/-newsroom/news.html?d=250593
Godes, D., Ofek, E., & Sarvary, M. (2009). Content vs advertising: The impact of competition on media firm strategy. Marketing Science, 28(1), 20–35.
Heiman, A., McWilliams, B., Shen, Z., & Zilberman, D. (2001). Learning and forgetting: Modeling optimal product sampling over time. Management Science, 47(4), 532–546.
Hotz, V. J., & Xiao, M. (2013). Strategic information disclosure: The case of multiattribute products with heterogeneous consumers. Economic Inquiry, 51(1), 865–881.
Kind, H. J., Nilssen, T., & Sørgard, L. (2009). Business models for media firms: Does competition matter for how they raise revenue? Marketing Science, 28(6), 1112–1128.
Milgrom, P., & Roberts, J. (1986). Price and advertising signals of product quality. Journal of Political Economy, 94(4), 796–821.
Newspaper Association of America (2012). Paid digital content benchmarking study.
Rysman, M. (2009). The economics of two-sided markets. Journal of Economic Perspectives, 23(3), 125–143.
Shapiro, C., & Varian, H. R. (1998). Information rules. Boston, MA: Harvard Business School Press.
Sun, M. (2011). Disclosing multiple product attributes. Journal of Economics and Management Strategy, 20(1), 134–145.
Wang, C., & Zhang, X. (2009). Sampling of information goods. Decision Support Systems, 4(1), 14–22.
Wilson, R. (1985). Multi-dimensional signalling. Economics Letters, 19(1), 17–21.
Xiang, Y., & Soberman, D. A. (2011). Preview provision under competition. Marketing Science, 30(1), 149–169.

Intern. J. of Research in Marketing 31 (2014) 207–223

Empirical generalizations of demand and supply dynamics for movies

Michel Clement a,⁎, Steven Wu a, Marc Fischer b,c

a University of Hamburg, Institute for Marketing and Media, Welckerstr. 8, D-20354 Hamburg, Germany
b University of Cologne, Chair for Marketing and Market Research, Albertus-Magnus-Platz, D-50923 Köln, Germany
c UTS Business School, Sydney, Australia

Article info

Article history: First received on 24 May 2012 and was under review for 5 months. Available online 10 December 2013.
Area Editor: Dominique M. Hanssens
Guest Editor: Marnik G. Dekimpe
Keywords: Generalizations; Movie industry; Screen allocation; Endogeneity

Abstract

High financial risks in production and marketing, the hedonic nature of products, and the global cultural relevance of movies have encouraged a substantial number of researchers to analyze the success drivers of movies. This research provides empirical generalizations in managing the supply and demand of motion pictures. Prior empirical research either ignored the endogeneity of box office and screen allocation or was based on selective samples, ignoring the large number of smaller movies released to the market. Using two large and unique samples of all movies released in two major movie markets, the US (2000–2010; n = 2098) and Germany (2002–2010; n = 1360), we extend prior research and present empirical generalizations and new fields of research.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

In 2010, the movie Avatar broke all box office records and grossed more than $2.78 billion worldwide within a few weeks. The movie was an exceptional success for the motion picture industry. Each movie is an innovation requiring specific management attention. In addition, the substantial costs to produce the first copy of the movie (Avatar was budgeted at $237 million) and the high prerelease advertising costs (US: $53.14 million and Germany: €1.13 million for the Avatar movie) are both sunk at the time of release, making it a risky business (Eliashberg, Jonker, Sawhney, & Wierenga, 2000). Thus, studio managers face a high financial risk of producing the next gigantic flop, adding to such legendary examples as The Adventures of Pluto Nash, Stealth, and Gigli.

The hedonic nature of movies, their relevance in global culture, the high economic importance of the industry, and the public availability of data have led to a substantial number of academic studies on the success drivers of movies (Eliashberg, Elberse, & Leenders, 2006).
Scholars have analyzed the effect of various variables such as star power (Elberse, 2007), Academy Awards (Deuchert, Adjamah, & Pauly, 2005), word-of-mouth (Liu, 2006), and age restrictions (Leenders & Eliashberg, 2011) on success measures such as box office, number of visitors, and screens. However, the large amount of empirical research has provided conflicting results on the effect of several success drivers. For example, the role of critics has been addressed by several researchers without consistent or generalizable results. While some studies show positive effects of positive reviews and negative effects of negative reviews on sales (e.g., Litman, 1983), we also find studies revealing that even negative reviews lead to higher sales (e.g., King, 2007; Wallace, Seigerman, & Holbrook, 1993) or to positive distribution effects with respect to the number of screens (Elberse & Eliashberg, 2003). Furthermore, some authors find evidence that critics influence sales (e.g., Basuroy, Chatterjee, & Ravid, 2003; Boatwright, Basuroy, & Kamakura, 2007; Kamakura, Basuroy, & Boatwright, 2006; Moon, Bergey, & Iacobucci, 2010), whereas others find that critics (actually) only predict sales and that their influence on sales is rather negligible (Eliashberg & Shugan, 1997).

⁎ Corresponding author. Tel.: +49 40 42838 8721. E-mail addresses: michel@michelclement.com (M. Clement), steven.wu@uni-hamburg.de (S. Wu), marc.fischer@wiso.uni-koeln.de (M. Fischer).
http://dx.doi.org/10.1016/j.ijresmar.2013.10.007

Another field with conflicting results is the star power research. Hennig-Thurau, Völckner, Clement, and Hofmann (2013, Appendix A, p. 45) present a literature overview of previous research with respect to star power and identify ten studies that report a positive impact of stars on revenues or admissions.
However, they also identify twelve studies that find no empirical support for such an effect (six studies find partial support).

The heterogeneous findings may be a result of various data limitations. Many studies are based on outdated data sets or face a substantial selection bias because the authors sampled only successful movies (e.g., top 25 in Variety or a pre-defined minimum production budget) and ignored the large number of “small” movies that entered the market more or less successfully (e.g., Elberse & Eliashberg, 2003; Ravid, 1999). Furthermore, most research focuses on the US market and thus ignores other international markets. Finally, many studies use only a very limited set of variables.

In this research, we focus on the question of whether prior findings in the motion picture industry can be generalized. The relevance of generalizations has been regularly highlighted in Marketing Science (Albers, 2012; Hanssens, 2009). In particular, Bass (1995) and Ehrenberg (1995) emphasized the necessity of empirical research focusing on generalizing prior research findings to provide further insights for new research topics. Additionally, several editors (e.g., Goldenberg & Muller, 2012; Winer, 1998) have highlighted the relevance of replications to investigate the generalizability of earlier research findings.

This research contributes to the literature by generalizing prior empirical findings on the success factors of movies. We rely on the established theoretical and modeling framework of Elberse and Eliashberg (2003), which accounts for the interrelationship in the behavior of audiences and exhibitors. Specifically, their dynamic simultaneous equation models account for the endogeneity of revenues and screens and incorporate the need to determine revenues and screens simultaneously.
This endogenous relationship has also been identified by Krider, Li, Liu, and Weinberg (2005), who visualize causal inferences using graphical analysis. They conclude that “the dominant industry pattern is one of movie exhibitors monitoring box office sales and then responding with screen allocation decisions” (Krider et al., 2005, p. 625).

While Elberse and Eliashberg (2003) (hereafter E/E) used a sample of 164 American (co-)productions from 1999 that needed to appear at least once in the US box office top 25, we base our analysis on a much larger database covering the full US and German movie markets. Our sample consists of all 2098 movies released in the US between 2000 and (partially) 2010 and all 1360 movies released in Germany between the summer of 2002 and the spring of 2010. We collected information on all movies released during this period from various sources and abandoned any minimum box office criterion when choosing the movies to avoid selection biases. Thus, our sample represents the general movie market in two major countries.

Further, we extend the findings of E/E by adding new, important variables such as sequels, MPAA ratings, US productions, genres, and the highly relevant advertising budget for the German market. We also revisit the initially counterintuitive results in E/E's study about the effect of reviews. They find that unfavorable reviews by critics correspond with a higher number of opening screens. This finding is surprising in light of the research findings that address the effects of critics on the box office (e.g., Eliashberg & Shugan, 1997), but can be confirmed and explained by supply dynamics and revenue sharing models, as we will show later in this paper. Finally, aside from generalizing the results for the US market, we are able to estimate the model covering the full German market. Thus, we provide a setting that also allows for generalizations across the two major international movie markets.
Our findings contribute to the literature in two ways. (1) The nature of our data allows for replication as well as substantial extension and actualization of prior research. Thus, we provide empirical generalizations of prior US-based research findings. (2) We provide new managerial insights for the German movie market that allow us to compare the empirical findings across two major movie markets to generate further empirical generalizations. Summarizing, our contribution lies in demand as well as supply generalizations that are compared across two countries. In the next section, we provide an overview of our modeling approach. In Section 3, we discuss our data. The estimation results are presented in Section 4, followed by a discussion in Section 5. We conclude with generalizations and avenues for future research. 2. Model The value chain in the movie business is full of dependencies and conflicting interests (Hennig-Thurau, Henning, Sattler, Eggers, & Houston, 2007). Effectively, two parties are involved in the initial stage of the sequential release strategy (theatrical release) of a movie. Managers of studios and cinemas negotiate with each other, each attempting to enforce favorable conditions (Eliashberg, Swami, Weinberg, & Wierenga, 2001). To analyze the behavior of both parties, our model follows E/E and uses two interdependent equations that cover audience demand (demand equation) and screen allocation by cinemas (supply equation). Research on the diffusion of movies has shown that demand for the majority of movies reaches its maximum during the first week of release (Ainslie, Drèze, & Zufryden, 2005; Sawhney & Eliashberg, 1996). In particular, first weekend box office results serve as an indicator for the movie's total success (Joshi & Hanssens, 2009). Thus, the industry is release-driven and the market players focus on the first week (Karniouchina, 2011). 
Consequently, we model the dynamics of supply and demand determinants over time and explicitly differentiate the first week from the following weeks using separate equations. The dynamic interests of the two parties are a result of changing profit margins of distributors and cinemas over time. The expected number of visitors, or at least the distributor's estimate, is best reflected in the number of opening screens, which needs to be determined before release (Eliashberg, Hegie, Ho, Huisman, Miller, et al., 2009). Based on the number of expected visitors, the distributor provides the relevant number of copies to be distributed to the cinemas. Assuming that a specific market potential of consumers wants to see a movie, declining profit margins over time imply that strong demand for a movie right after release is favorable for the distributor. In contrast, the cinema earns a higher margin if consumers attend the movie in later weeks. Thus, it can be assumed that the distributor will always attempt to collect as many screens as possible for the first weeks, and cinemas will attempt to shift their capacities to later weeks to maximize profits. Therefore, consistent with E/E, we account for the endogeneity of the number of screens when estimating revenues, and we assume that in each week, the errors in the supply and demand equations may be correlated. We also choose a log–log formulation to directly retrieve elasticities that allow us to better compare our results to previous research. 2.1. Model for the US market—week t = 1 Eq. (1) provides the model for the demand (measured in box office) for movie i in week t = 1. 
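\[
\begin{aligned}
\mathrm{REVENUES}_{it} = {} & e^{\beta_0} \cdot \mathrm{SCREENS}_{it}^{\beta_1} \cdot \mathrm{STAR}_{i}^{\beta_2} \cdot \mathrm{DIRECTOR}_{i}^{\beta_3} \cdot \mathrm{AD\_EXP}_{i}^{\beta_4} \cdot \mathrm{REVIEWS}_{i}^{\beta_5} \\
& \cdot \mathrm{COMP\_SCR\_REV}_{it}^{\beta_6} \cdot \mathrm{SEASON}_{it}^{\beta_7} \cdot e^{\beta_8 \cdot \mathrm{SEQUEL}_{i}} \cdot e^{\beta_9 \cdot \mathrm{US}_{i}} \cdot \mathrm{MPAA}_{i}^{\beta_{10}} \\
& \cdot e^{\beta_{11} \cdot \mathrm{CHILDREN}_{i}} \cdot e^{\beta_{12} \cdot \mathrm{ACTION}_{i}} \cdot e^{\beta_{13} \cdot \mathrm{DOCUMENTARY}_{i}} \cdot e^{\beta_{14} \cdot \mathrm{HORROR}_{i}} \\
& \cdot e^{\beta_{15} \cdot \mathrm{COMEDY}_{i}} \cdot e^{\beta_{16} \cdot \mathrm{OTHER}_{i}} \cdot e^{\varepsilon_{Rit}}
\end{aligned}
\tag{1}
\]

Revenues are driven by the number of screens allocated to movie $i$ in week $t = 1$ and a set of time-invariant variables (star power, director power, advertising, critical acclaim, sequel, US production, MPAA rating, and genre variables) and time-variant variables (competition and an index variable to measure season). We provide details on the measurement of the variables in Section 3. The error term for the revenue equation is denoted as $\varepsilon_{Rit}$. We model the supply of screens for movie $i$ in week $t = 1$ as shown in Eq. (2).

\[
\begin{aligned}
\mathrm{SCREENS}_{it} = {} & e^{\alpha_0} \cdot \left( \mathrm{REVENUES}_{it}^{**} \right)^{\alpha_1} \cdot \mathrm{BUDGET}_{i}^{\alpha_2} \cdot \mathrm{STAR}_{i}^{\alpha_3} \cdot \mathrm{DIRECTOR}_{i}^{\alpha_4} \cdot \mathrm{AD\_EXP}_{i}^{\alpha_5} \\
& \cdot \mathrm{REVIEWS}_{i}^{\alpha_6} \cdot e^{\alpha_7 \cdot \mathrm{DISTR\_MAJOR}_{i}} \cdot \mathrm{COMP\_SCR\_NEW}_{it}^{\alpha_8} \cdot \mathrm{COMP\_SCR\_ONG}_{it}^{\alpha_9} \\
& \cdot e^{\alpha_{10} \cdot \mathrm{SEQUEL}_{i}} \cdot e^{\alpha_{11} \cdot \mathrm{US}_{i}} \cdot \mathrm{MPAA}_{i}^{\alpha_{12}} \cdot e^{\alpha_{13} \cdot \mathrm{CHILDREN}_{i}} \cdot e^{\alpha_{14} \cdot \mathrm{ACTION}_{i}} \\
& \cdot e^{\alpha_{15} \cdot \mathrm{DOCUMENTARY}_{i}} \cdot e^{\alpha_{16} \cdot \mathrm{HORROR}_{i}} \cdot e^{\alpha_{17} \cdot \mathrm{COMEDY}_{i}} \cdot e^{\alpha_{18} \cdot \mathrm{OTHER}_{i}} \cdot e^{\varepsilon_{Sit}}
\end{aligned}
\tag{2}
\]

The number of screens allocated to movie $i$ in week $t = 1$ is a function of its expected revenues ($\mathrm{REVENUES}_{it}^{**}$), the time-invariant production budget, the distributor's market power, two time-variant competition variables that cover the competition for screen space from new releases and ongoing movies, and, finally, the same variables as listed in Eq. (1), except that we exclude the season variable in the screen equation because of fixed capacities. We include budget only in the screen supply model because cinema operators usually know the movie budget and use it for evaluating the success potential of the movie. In contrast, the average moviegoer is not aware of the production budget of a movie. The error term for the supply equation is denoted as $\varepsilon_{Sit}$.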
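Because Eqs. (1) and (2) are multiplicative, taking logarithms makes them linear in the parameters, so the exponent on a continuous variable such as SCREENS can be read directly as an elasticity. The Python sketch below illustrates only this log–log mechanic under strong simplifying assumptions: a single regressor, hypothetical noise-free data, and plain ordinary least squares. It deliberately ignores the endogeneity of screens and the correlated errors that the simultaneous-equation model is designed to handle, and the function name is illustrative.

```python
import math


def loglog_elasticity(x, y):
    """OLS intercept and slope of log(y) on log(x) for a single regressor."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx            # elasticity of y with respect to x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data generated from REVENUES = e^{b0} * SCREENS^{b1}, no noise.
b0_true, b1_true = 2.0, 0.8
screens = [50, 200, 800, 2500, 3500]
revenues = [math.exp(b0_true) * s ** b1_true for s in screens]

b0_hat, b1_hat = loglog_elasticity(screens, revenues)
print(round(b0_hat, 6), round(b1_hat, 6))
```

Dummy variables such as SEQUEL enter the equations as $e^{\beta \cdot \mathrm{SEQUEL}_i}$ and therefore appear untransformed on the right-hand side after taking logs, which is why their coefficients are not elasticities.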
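We model the expected revenues by relying on prerelease interest for the movie on the professional website IMDb.com. The website measures interest in a particular movie on its pages and ranks movies accordingly in its “Moviemeter”. We use the Moviemeter rank of movie $i$ at the time of release as an indicator for revenues in week $t = 1$ as formulated in Eq. (3).¹ The pre-launch rank of the Moviemeter is a powerful predictor ($R^2 = .71$) of expected opening-week revenues (Karniouchina, 2011):

\[
\mathrm{REVENUES}_{i}^{**} = e^{\alpha_0} \cdot \mathrm{MOVIEMETER}_{i}^{\alpha_1} \cdot e^{\varepsilon_{i}}.
\tag{3}
\]

¹ E/E use data from the Hollywood Stock Exchange to construct expected revenues for the opening week. We could not use their approach because it covers only “wide-opening” movies. Our data set includes more than just these top range movies.

2.2. Model for the US market—week t > 1

Following E/E, we model revenues for weeks $t > 1$ as denoted in Eq. (4).

\[
\mathrm{REVENUES}_{it} = e^{\beta_0} \cdot \mathrm{SCREENS}_{it}^{\beta_1} \cdot \mathrm{COMP\_SCR\_REV}_{it}^{\beta_2} \cdot \mathrm{SEASON}_{it}^{\beta_3} \cdot \mathrm{WOM}_{it}^{\beta_4} \cdot e^{\beta_5 \cdot \mathrm{WEEK}_{it}} \cdot e^{\varepsilon_{Rit}}
\tag{4}
\]

Here, the variable WOM captures the observed buzz with respect to the released movie. Analogously, we model screen allocation for weeks $t > 1$ in Eq. (5). Compared with E/E, we apply a modification of the supply model. We replace the variable WOM with REVENUES per SCREENS of the previous week because we have learned from our interviews with studio and cinema managers that movie exhibitors focus on these two variables (which can be easily obtained by them) when making their allocation decisions.

\[
\mathrm{SCREENS}_{it} = e^{\alpha_0} \cdot \mathrm{COMP\_SCR\_NEW}_{it}^{\alpha_1} \cdot \mathrm{COMP\_SCR\_ONG}_{it}^{\alpha_2} \cdot \mathrm{REVENUES}_{it-1}^{\alpha_3} \cdot \mathrm{SCREENS}_{it-1}^{\alpha_4} \cdot e^{\alpha_5 \cdot \mathrm{WEEK}_{it}} \cdot e^{\varepsilon_{Sit}}
\tag{5}
\]

2.3. Model for the German market—week t = 1

We model the demand² and supply for the German movie market analogously, and include box office performance of the movie in the US and an interaction term of the time difference in weeks between the US and the German release and the US box office performance. The dummy variable $\mathrm{US\_LAUNCH}_{i}$ indicates whether the movie was launched in the US earlier than in Germany. Our demand and supply models for Germany are displayed in Eqs. (6) and (7), respectively. In the following, coefficients with $p < .10$ are considered to be significant effects (two-sided).

\[
\begin{aligned}
\mathrm{ADMISSIONS}_{it} = {} & e^{\beta_0} \cdot \mathrm{SCREENS}_{it}^{\beta_1} \cdot \mathrm{STAR}_{i}^{\beta_2} \cdot \mathrm{DIRECTOR}_{i}^{\beta_3} \cdot \mathrm{AD\_EXP}_{i}^{\beta_4} \cdot \mathrm{REVIEWS}_{i}^{\beta_5} \\
& \cdot \mathrm{COMP\_SCR\_REV}_{it}^{\beta_6} \cdot \mathrm{SEASON}_{it}^{\beta_7} \cdot \mathrm{US\_PERF}_{i}^{\beta_8 \cdot \mathrm{US\_LAUNCH}_{i}} \\
& \cdot \left( \mathrm{US\_PERF}_{i} \cdot \mathrm{TIME\_LAG}_{i} \right)^{\beta_9 \cdot \mathrm{US\_LAUNCH}_{i}} \cdot e^{\beta_{10} \cdot \mathrm{SEQUEL}_{i}} \cdot e^{\beta_{11} \cdot \mathrm{US}_{i}} \cdot e^{\beta_{12} \cdot \mathrm{GER}_{i}} \\
& \cdot \mathrm{MPAA}_{i}^{\beta_{13}} \cdot e^{\beta_{14} \cdot \mathrm{TIP}_{i}} \cdot e^{\beta_{15} \cdot \mathrm{CHILDREN}_{i}} \cdot e^{\beta_{16} \cdot \mathrm{ACTION}_{i}} \cdot e^{\beta_{17} \cdot \mathrm{DOCUMENTARY}_{i}} \\
& \cdot e^{\beta_{18} \cdot \mathrm{HORROR}_{i}} \cdot e^{\beta_{19} \cdot \mathrm{COMEDY}_{i}} \cdot e^{\beta_{20} \cdot \mathrm{OTHER}_{i}} \cdot e^{\varepsilon_{Rit}}
\end{aligned}
\tag{6}
\]

\[
\begin{aligned}
\mathrm{SCREENS}_{it} = {} & e^{\alpha_0} \cdot \mathrm{ADMISSIONS}_{it}^{\alpha_1} \cdot \mathrm{BUDGET}_{i}^{\alpha_2} \cdot \mathrm{STAR}_{i}^{\alpha_3} \cdot \mathrm{DIRECTOR}_{i}^{\alpha_4} \cdot \mathrm{AD\_EXP}_{i}^{\alpha_5} \\
& \cdot \mathrm{REVIEWS}_{i}^{\alpha_6} \cdot e^{\alpha_7 \cdot \mathrm{DISTR\_MAJOR}_{i}} \cdot \mathrm{COMP\_SCR\_NEW}_{it}^{\alpha_8} \cdot \mathrm{COMP\_SCR\_ONG}_{it}^{\alpha_9} \\
& \cdot \mathrm{US\_PERF}_{i}^{\alpha_{10} \cdot \mathrm{US\_LAUNCH}_{i}} \cdot \left( \mathrm{US\_PERF}_{i} \cdot \mathrm{TIME\_LAG}_{i} \right)^{\alpha_{11} \cdot \mathrm{US\_LAUNCH}_{i}} \cdot e^{\alpha_{12} \cdot \mathrm{SEQUEL}_{i}} \cdot e^{\alpha_{13} \cdot \mathrm{US}_{i}} \\
& \cdot e^{\alpha_{14} \cdot \mathrm{GER}_{i}} \cdot \mathrm{MPAA}_{i}^{\alpha_{15}} \cdot e^{\alpha_{16} \cdot \mathrm{TIP}_{i}} \cdot e^{\alpha_{17} \cdot \mathrm{CHILDREN}_{i}} \cdot e^{\alpha_{18} \cdot \mathrm{ACTION}_{i}} \\
& \cdot e^{\alpha_{19} \cdot \mathrm{DOCUMENTARY}_{i}} \cdot e^{\alpha_{20} \cdot \mathrm{HORROR}_{i}} \cdot e^{\alpha_{21} \cdot \mathrm{COMEDY}_{i}} \cdot e^{\alpha_{22} \cdot \mathrm{OTHER}_{i}} \cdot e^{\varepsilon_{Sit}}
\end{aligned}
\tag{7}
\]

² In Germany, demand data are based on admissions and not on box office. Using admissions, however, does not have any effect on our elasticity estimates, because ticket prices do not differ. Hence, revenues and admissions are only different up to a scale factor. Since we use multiplicative response models, the scale differences are fully absorbed in the estimated regression constant and not the elasticity parameters.

2.4. Model for the German market—week t > 1

Analogous to the US market, we model the system of equations for weeks $t > 1$:

\[
\mathrm{ADMISSIONS}_{it} = e^{\beta_0} \cdot \mathrm{SCREENS}_{it}^{\beta_1} \cdot \mathrm{COMP\_SCR\_REV}_{it}^{\beta_2} \cdot \mathrm{SEASON}_{it}^{\beta_3} \cdot \mathrm{WOM}_{it}^{\beta_4} \cdot e^{\beta_5 \cdot \mathrm{WEEK}_{it}} \cdot e^{\varepsilon_{Rit}}
\tag{8}
\]

and

\[
\mathrm{SCREENS}_{it} = e^{\alpha_0} \cdot \mathrm{COMP\_SCR\_NEW}_{it}^{\alpha_1} \cdot \mathrm{COMP\_SCR\_ONG}_{it}^{\alpha_2} \cdot \mathrm{ADMISSIONS}_{it-1}^{\alpha_3} \cdot \mathrm{SCREENS}_{it-1}^{\alpha_4} \cdot e^{\alpha_5 \cdot \mathrm{WEEK}_{it}} \cdot e^{\varepsilon_{Sit}}.
\tag{9}
\]

3. Data

The data comprise two data sets.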
Each data set consists of the complete inventory of all movies released in the US market (n = 3460 movies) and the German market (n = 2598 movies). For the US market, we cover the time span from 2000 to 2010, and for Germany, we include releases from 2002 to 2010. Contrary to E/E, the German sample includes all movies released in Germany (including the movies in German language). Thus, contrary to many other studies, we do not face a selection effect by, for example, restricting the sample to a minimum box office requirement or by focusing on movies that have been released in the US and then transferred to Germany. After adjusting for missing values that especially occur in information on production budgets, the samples comprise 2098 movies for the US market and 1360 movies for the German market.3 Table 1 provides an overview of the measurement of the variables, and Table 2 presents the respective descriptive statistics. Table 2 further indicates the substantial differences between the sample of E/E and our sample and supports the generalization of our findings to the total market. We follow prior research in measuring our variables. All US-based measures are obtained from variety.com, IMDb.com, Nielsen, or metacritic.com. Measures for the German market are primarily collected from the professional website mediabiz.de and enhanced with the help of executives from Warner Bros., Germany. We received German advertising data from MediaCom. Compared with E/E, we measure star and director power differently. For the US market, we use the IMDb Starmeter ranking to measure director and star power (Karniouchina, 2011). For the German market, we cannot rely on the IMDb Starmeter index because many German actors are not listed in this US-focused service. Therefore, we apply a star power measure of actors and directors that uses the confidence-based weighted mean of two members of the market research department at Warner Bros., Germany (van Bruggen, Lilien, & Kacker, 2002). 
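The confidence-based weighting of the two expert ratings can be illustrated with a small sketch. It assumes the simplest possible form, in which each rating is weighted by the rater's stated confidence; the exact WCMEAN weighting in van Bruggen et al. (2002) may differ, and the function name, ratings, and confidence values below are hypothetical.

```python
def confidence_weighted_mean(ratings, confidences):
    """Average the ratings, weighting each by the rater's stated confidence.

    Assumption: weights are proportional to raw confidence scores. This is an
    illustrative simplification, not necessarily the published WCMEAN formula.
    """
    if len(ratings) != len(confidences) or not ratings:
        raise ValueError("need exactly one confidence value per rating")
    total_confidence = sum(confidences)
    if total_confidence <= 0:
        raise ValueError("confidences must sum to a positive number")
    weighted_sum = sum(r * c for r, c in zip(ratings, confidences))
    return weighted_sum / total_confidence


# Hypothetical example: two experts rate star power on the 0-5 scale.
# Expert A gives 4.0 with confidence 3, expert B gives 2.0 with confidence 1.
star_power = confidence_weighted_mean([4.0, 2.0], [3.0, 1.0])
print(star_power)
```

Under this assumption the more confident rater pulls the combined score toward their own rating, which is the intuition behind using a confidence-based rather than a simple mean.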
To validate the measures, we correlate the star power measure of IMDb.com with our star measure from Germany for US movies also released in Germany (n = 912) and find a significant correlation (p < .01) of r = .45 for actors and r = .57 for directors. For the German sample, we rely on the German age restrictions (FSK received from Vdf.de; Leenders & Eliashberg, 2011) and measure critical acclaim using rankings from the movie magazine Cinema. The magazine also provides a recommendation for the German market that we include as a separate variable (Tip). Compared with prior research, our model includes a wide range of relevant drivers within two large movie markets. We specifically extend the research of E/E by including additional variables as shown in Tables 3 and 4 (extended model). Especially for the German market,

³ Because almost all missing values were attributable to the lack of production budget data, we also estimated all models, including all movies without budget information (see Table 6), to ensure that the missing data did not cause any biases. The results, which are discussed in the robustness checks section, are very robust.

Table 1
Measures.
Variable Description Measure US data Source US-data Measure German data Source German data REVENUES/ADMISSIONS SCREENS REVENUES1**/ ADMISSIONS1** Weekly revenues Weekly number of screens Expected revenues first week Variety.com Variety.com Variety.com IMDb.com Weekly admissions Weekly number of screens Real admissions of first weeka Mediabiz.de Mediabiz.de Mediabiz.de REVENUES**/ ADMISSONS** BUDGET Expected revenues beyond first week Production budget Weekly revenues in US$ Weekly number of screens Expected revenues of the first week were forecasted using an OLS model with Moviemeter as independent and revenues of the first week as dependent variablea Real revenues of the respective weeka Production budget in US$ Variety.com Real admissions of the respective weeka Production budget in US$ Mediabiz.de STAR Star power IMDb Starmeter ranking at release date (highest rank = 1)b DIRECTOR Director power IMDb Starmeter ranking at release date (highest rank = 1)b IMDb.com AD_EXP REVIEWS Advertising expenditures Critical reviews Advertising expenditure in 000 US$ Weighted mean of US critics Nielsen Metacritics.com DISTR_MAJOR Major distributor 1 = Paramount, Sony Pictures (Columbia, TriStar), Disney (Buena Vista, Touchstone, Hollywood Pictures), 20th Century Fox, Universal, Warner (New Line, Fine Line Features); 0 = other Variety.com WOM Word-of-mouth communication Variety.com COMP_SCR_NEW Competition for “screen space” from new releases COMP_SCR_ONG Competition for “screen space” from ongoing movies COMP_SCR_REV Competition for the attention of audiences SEASON Seasonality US_PERF US market performance Revenues per screen in previous week New releases, weighted by advertising expenditures in 000 US $ until release, if there were no advertising expenditures movie was weighted with 1 US$ Average age of ongoing releases, for each calendar week excluding the movie under consideration Number of similar movies (same genre, same MPAA), weighted by runtime of the movie Index value 
per week (0 = min, 100 = max) n.a. TIME_LAG MOVIEMETER Time lag between domestic and foreign release Moviemeter ranking MPAA MPAA rating SEQUEL US Sequel US production GERMANY German production DRAMA CHILDREN ACTION DOCUMENTARY HORROR COMEDY OTHER Genre drama Genre children Genre action Genre documentary Genre horror Genre comedy Other genre TIP Movie is marked with “Tip” (Cinema) a IMDb.com BoxOfficeMojo.com the-numbers.com IMDb.com Variety.com Nielsen Varietey.com Variety.com Variety.com n.a. Moviemeter ranking on the release date 0 = unrated; 1 = G; 2 = PG; 3 = PG-13; 4 = R; 5 = C-17 IMDb.com 1 = sequel; 0 = no sequel 1 = US production; 0 = other country n.a. IMDb.com IMDb.com 1 = Drama; 0 = other 1 = Children; 0 = other 1 = Action; 0 = other 1 = Documentary; 0 = other 1 = Horror; 0 = other 1 = Comedy; 0 = other 1 = Other genre; 0 = any of genre above Variety.com Variety.com Variety.com Variety.com Variety.com Variety.com Variety.com IMDb.com Movies were rated on a 0–5 scale using WCMEAN (highest score = 5)c Movies were rated on a 0–5 scale using WCMEAN (highest score = 5) Advertising expenditure in € Cinema rating (on 1–5 scale, 5 is best) 1 = Paramount, Buena Vista, 20th Century Fox, Constantin, Sony Pictures, United Pictures International, Warner; 0 = other (due to market share differences measure differs slightly from US measure) Admissions per screen in previous week New releases, weighted by marketing budget in €, if there were no advertising expenditures movie was weighted with 1 € Average age of ongoing releases, for each calendar week excluding the movie under consideration Number of similar movies (same genre, same FSK), weighted by runtime of the movie Index value per week (0 = min, 100 = max) Average of revenues per screen in the US over the first two weeks Number of days between US release and release in Germany n.a. 
MPAA reflects FSK rating in Germany 1 = FSK0; 2 = FSK6; 3 = FSK12; 4 = FSK16; 5 = FSK18 1 = sequel; 0 = no sequel 1 = US production; 0 = other country 1 = German production; 0 = other country 1 = Drama; 0 = other 1 = Children; 0 = other 1 = Action; 0 = other 1 = Documentary; 0 = other 1 = Horror; 0 = other 1 = Comedy; 0 = other 1 = Other genre; 0 = any of genre above 1 = “Tip”; 0 = no “Tip” IMDb.com BoxOfficeMojo.com the-numbers.com Own survey by interviews Own survey by interviews MediaCom Cinema.de Mediabiz.de Mediabiz.de Mediabiz.de MediaCom MediaBiz.de MediaBiz.de Babelsberg Charts Variety.com Variety.com Mediabiz.de Vdf.de IMDb.com Vdf.de Vdf.de MediaBiz.de MediaBiz.de MediaBiz.de MediaBiz.de MediaBiz.de MediaBiz.de MediaBiz.de n Measure is different from Elberse and Eliashberg (2003). Variable is reverse coded for estimation, meaning that a larger value represents higher star power. c We asked two experts to rate the star power of each movie in our data set. In addition, they had to provide a confidence value for each rating. The star power for a movie was then generated by applying the confidence-based weighted means method (WCMEAN, van Bruggen et al., 2002). b 211 M. Clement et al. / Intern. J. of Research in Marketing 31 (2014) 207–223 Table 2 Key descriptive statistics. 
Variable                      N      Mean/%      Median      SD          Minimum   Maximum

Extended model: US
Budget (000 US$)              2098   32,815.00   19,000.00   39,775.91   1.10      300,000.00
Star ranking (a)              2098   9172.35     118.00      81,462.15   1.00      2,737,503.00
Director ranking (a)          2098   10,678.39   1936.00     41,469.67   1.00      1,338,551.00
Ad_Exp (000 US$)              2098   11,438.99   4731.91     13,387.57   .00       55,349.52
Review ratings                2098   53.87       54.00       17.89       3.00      100.00
Screens (t = 1)               1917   1525.33     1694.00     1418.86     1.00      4468.00
Revenues (t = 1) (000 US$)    1917   16,252.09   6143.98     26,216.15   .01       196,019.50
Total Revenues (000 US$)      2098   39,106.58   13,585.99   65,737.08   .10       839,081.62
Length of run (weeks)         2098   13.30       12.00       10.32       1.00      220.00
Comp_Scr_New (t = 1)          1917   39,954.59   40,919.03   24,851.38   .00       137,029.97
Comp_Scr_Ong (t = 1)          1917   14.11       11.84       5.45        1.00      31.95
Comp_Scr_Rev (t = 1)          1917   11.43       11.19       6.21        .00       31.38
Season (t = 1)                1917   55.25       48.50       17.37       32.47     100.00
Moviemeter ranking (t = 1)    1917   914.59      21.00       3070.51     1.00      53,569.00
MPAA                          2098   2.86        3.00        1.33        .00       5.00
Sequel                        2098   14.06%
US productions                2098   79.17%
Distr_Major                   2098   39.66%
Children                      2098   2.53%
Action                        2098   26.31%
Documentary                   2098   3.34%
Drama                         2098   34.70%
Horror                        2098   3.86%
Comedy                        2098   24.93%
Other                         2098   4.34%

Extended model: Germany
Budget (000 US$)              1360   36,532.67   20,381.77   43,270.39   .22       448,595.00
Star rating                   1360   1.34        1.08        1.22        .00       4.75
Director rating               1360   .35         .04         .81         .00       5.00
Ad_Exp (000 €)                1360   656.30      193.15      945.03      .00       5512.73
Review ratings                1360   3.38        4.00        1.17        .00       5.00
Screens (t = 1)               1335   243.53      149.00      247.90      1.00      1337.00
Admissions (t = 1) (000)      1335   211.37      64.25       422.47      .04       3680.04
Total admissions (000)        1360   581.42      157.56      1144.55     .05       10,428.18
Length of run (weeks)         1360   10.47       9.00        6.56        1.00      52.00
US_Perf (b) (000 US$)         1360   7.06        1.86        22.82       .00       719.97
Time_Lag (b)                  1360   118.17      104.00      113.44      .00       832.00
Comp_Scr_New (t = 1)          1335   2452.83     2248.23     1676.37     .00       10,966.11
Comp_Scr_Ong (t = 1)          1335   8.04        7.78        1.43        3.97      13.01
Comp_Scr_Rev (t = 1)          1335   8.36        8.27        4.01        .11       25.38
Season (t = 1)                1335   66.19       68.17       12.36       40.00     100.00
MPAA (FSK)                    1360   2.75        3.00        1.11        1.00      5.00
Sequel                        1360   11.69%
US                            1360   64.78%
German productions            1360   11.54%
Distr_Major                   1360   50.59%
Tip                           1360   22.79%
Children                      1360   6.54%
Action                        1360   21.47%
Documentary                   1360   2.57%
Drama                         1360   27.50%
Horror                        1360   5.81%
Comedy                        1360   27.13%
Other                         1360   8.97%

Elberse/Eliashberg attributes: US
Budget (000 US$)              139    36,879.42   30,000.00   29,762.84   22.00     170,000.00
Star rating                   164    46.28       48.39       33.67       1.00      99.73
Director rating               164    25.28       13.82       28.63       1.00      97.53
Ad_Exp (US) (000 US$)         164    10,455.01   10,005.90   6626.67     6.20      27,827.80
Review ratings                158    3.15        3.33        .84         1.00      4.67
Screens (t = 1)               164    1658.73     1870.00     999.82      1.00      3309.00
Revenues (t = 1) (000 US$)    164    10,964.91   6947.73     12,569.02   6.81      63,674.40
Total Revenues (000 US$)      164    43,712.51   22,059.95   58,542.32   752.12    431,088.30
Length of run (weeks)         164    16.21       16.00       6.66        2.00      30.00

Elberse/Eliashberg attributes: Germany
Screens (t = 1)               138    276.69      245.00      229.23      1.00      1001.00
Admissions (t = 1) (000)      138    2876.92     1199.07     4445.93     1.61      32,236.48
Total admissions (000)        138    9650.27     3400.41     16,187.10   3.09      99,859.53
Length of run (weeks)         138    9.67        8.00        7.34        1.00      30.00
US_Perf (000 US$) (b)         138    8.97        5.66        11.89       .78       85.63
Time_Lag                      138    139.83      124.00      97.42       .00       529.00

Note: Ad_Exp = advertising expenditure, Comp_Scr_New = competition for “screen space” from new releases, Comp_Scr_Ong = competition for “screen space” for ongoing movies, Comp_Scr_Rev = competition for the attention of audiences, MPAA(/FSK) = age restriction, Distr_Major = major distributor, US_Perf = US market performance, Time_Lag = time lag between domestic and foreign release, Tip = movie is marked with “Tip” (in the German magazine “Cinema”).
(a) Compared with Elberse and Eliashberg (2003), the variable was reverse coded and represents the ranking positions of the IMDb Starmeter.
(b) Includes only those movies that were launched in the US earlier than in Germany.

Table 3
3SLS estimation results US opening week.
Dependent variable: log(Revenues)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Elberse/Eliashberg variables
Constant                  4.99     .51    .00 ***     .27      1.22   .82
log(Screens)              1.04     .02    .00 ***     .81      .04    .00 ***
log(Star)                 −.00     .01    .93         .10      .04    .01 **
log(Director)             .11      .02    .00 ***     .00      .03    .91
log(Ad_Exp)               .02      .01    .05 **      .20      .07    .01 **
log(Reviews)              1.05     .06    .00 ***     .77      .03    .00 ***
log(Comp_Rev)             −.19     .04    .00 ***     −.20     .06    .00 ***
log(Season)               .29      .06    .00 ***     .02      .27    .95
Extension variables
Sequel                    .18      .06    .00 ***
US                        −.23     .06    .00 ***
log(MPAA)                 −.01     .05    .85
Children                  −.25     .14    .08 *
Action                    −.28     .06    .00 ***
Documentary               .03      .14    .82
Horror                    −.39     .11    .00 ***
Comedy                    −.12     .05    .03 **
Other                     −.21     .11    .06 *
N                         1917                        164
R²                        .92                         .88

Dependent variable: log(Screens)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Elberse/Eliashberg variables
Constant                  −1.66    .89    .06 *       −.29     2.14   .89
log(Revenues)**           .40      .03    .00 ***     1.41     .08    .00 ***
log(Budget)               .28      .03    .00 ***     −.02     .10    .87
log(Star)                 .10      .02    .00 ***     .04      .05    .47
log(Director)             −.17     .03    .00 ***     −.03     .05    .57
log(Ad_Exp)               .31      .02    .00 ***     .25      .11    .02 **
log(Reviews)              −1.15    .10    .00 ***     −1.48    .28    .00 ***
Distr_Major               .67      .07    .00 ***     .10      .19    .61
log(Comp_Scr_New)         −.27     .02    .00 ***     −.19     .21    .36
log(Comp_Scr_Ong)         .07      .08    .39         .07      .16    .65
Extension variables
Sequel                    .26      .11    .01 **
US                        .76      .11    .00 ***
log(MPAA)                 −.49     .09    .00 ***
Children                  .60      .23    .01 **
Action                    .55      .10    .00 ***
Documentary               .64      .24    .01 ***
Horror                    .93      .19    .00 ***
Comedy                    .43      .09    .00 ***
Other                     .17      .19    .36
N                         1917                        164
R²                        .76                         .81

Note: *** p < .01; ** p < .05; * p < .10 (two-sided). Ad_Exp = advertising expenditure, Comp_Scr_Rev = competition for the attention of audiences, Distr_Major = major distributor, Comp_Scr_New = competition for “screen space” from new releases, Comp_Scr_Ong = competition for “screen space” for ongoing movies.

we include major variables (e.g., advertising) to better account for regional market specifics than the prior study by E/E.

4. Estimation

To estimate the dynamic system of equations, we need to compute estimates for the expected revenues ($\mathit{REVENUES}^{**}_{i}$) in the respective screen equations for the first week (t = 1). For the US market, we generate estimates for first-week revenue expectations by estimating the parameters of Eq. (3) using OLS.⁴ For the German market, we use real admissions in the first week. We assume that the expected revenues for the first week in Germany do not differ much from the observed revenues because most movies on the German market have been previously released in the US.

⁴ The double exponential smoothing (DES) procedure used by E/E to estimate the expected revenues for the following weeks could not be applied to our data sets. We find that DES is only applicable for movies with a very regular (already smooth) diffusion pattern, which is the case for the top 25 movies. However, our data sets also contain smaller movies with less regular diffusion patterns, resulting in highly fluctuating DES values for consecutive weeks of a movie if we apply the method proposed by E/E.

We test this assumption by splitting the sample into a calibration sample (all movies released in 2002–2005, N = 614) and a validation sample (all movies released in 2006–2010, N = 721). We estimate the first-week admissions in the calibration sample using Eq. (6), applying OLS (R² = .87). Relying on the estimated parameters, we predict the admissions at t = 1 for the validation sample and compare the estimated and the observed admissions in 2006–2010. The results are highly correlated with r = .94 (p < .01). Therefore, we argue that movie exhibitors have very realistic expectations about how successfully the movie will open.⁵ We then estimate the demand and supply equations for the US and German markets using instrumental variable estimation.
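The calibration/validation check described above can be sketched in a few lines. The sketch below uses simulated data rather than the paper's data, and a reduced log-linear admissions model with only two regressors; the variable names, coefficient values, and noise level are all hypothetical assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1335  # same order of magnitude as the German sample; the data are simulated

# Hypothetical release years 2002..2010 and two log-transformed regressors.
year = rng.integers(2002, 2011, size=n)
log_screens = rng.normal(5.0, 1.0, size=n)
log_reviews = rng.normal(1.2, 0.3, size=n)

# Simulated log admissions from an assumed log-linear model (not the paper's estimates).
log_admissions = (5.3 + 1.1 * log_screens + 0.2 * log_reviews
                  + rng.normal(0.0, 0.5, size=n))

X = np.column_stack([np.ones(n), log_screens, log_reviews])
calibration = year <= 2005   # movies released 2002-2005
validation = ~calibration    # movies released 2006-2010

# Fit OLS on the calibration sample only.
beta, *_ = np.linalg.lstsq(X[calibration], log_admissions[calibration], rcond=None)

# Predict the validation sample and correlate predictions with observed values.
predicted = X[validation] @ beta
r = np.corrcoef(predicted, log_admissions[validation])[0, 1]
print(f"out-of-sample correlation: {r:.2f}")
```

A high out-of-sample correlation in such a check supports using observed first-week admissions as a stand-in for exhibitors' expectations, which is the argument made in the text.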
⁵ Given the lack of Moviemeter data for many non-US movies, a procedure analogous to that for the US sample was not possible. It was also not possible to use US box office data during the first week to obtain a proxy value for the expected revenues for the German market because not all movies in Germany are previously released in the US. Using either method would result in many missing cases, causing a severe systematic bias in our data set.

Table 4
3SLS estimation results Germany opening week.

Dependent variable: log(Admissions)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Elberse/Eliashberg variables
Constant                  5.30     .24    .00 ***     −2.47    1.01   .02 **
log(Screens)              1.11     .06    .00 ***     1.51     .07    .00 ***
log(Star)                 .20      .05    .00 ***     −.03     .04    .51
log(Director)             .12      .05    .03 **      −.02     .03    .56
log(Reviews)              .18      .06    .00 ***     .37      .23    .11
log(Comp_Rev)             −.09     .05    .05 **      −.07     .02    .00 ***
log(Season)               .57      .09    .00 ***     .39      .18    .03 **
log(US_Perf)              .05      .01    .00 ***     .17      .08    .04 **
log(Time_Lag*US_Perf)     −.02     .01    .00 ***     .08      .90    .90
Extension variables
log(Ad_Exp)               .01      .01    .06 *
Sequel                    .21      .06    .00 ***
US                        −.17     .06    .00 ***
Germany                   .02      .07    .83
log(MPAA)                 −.00     .0     .97
Tip                       .32      .05    .00 ***
Children                  −.19     .12    .12
Action                    .00      .07    .96
Documentary               −.02     .13    .87
Horror                    .04      .12    .74
Comedy                    .07      .06    .19
Other                     .04      .09    .62
N                         1335                        138
R²                        .88                         .88

Dependent variable: log(Screens)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Elberse/Eliashberg variables
Constant                  −2.49    .40    .00 ***     .99      1.58   .53
log(Admissions)           .29      .09    .00 ***     .38      .07    .00 ***
log(Budget)               .17      .03    .00 ***     .17      .10    .07 *
log(Star)                 .13      .06    .03 **      .12      .05    .01 **
log(Director)             −.03     .05    .56         −.20     .25    .44
log(Reviews)              .04      .06    .51         −.41     .37    .26
Distr_Major               .16      .04    .00 ***     −.15     .16    .36
log(Comp_Scr_New)         −.00     .01    .75         −.13     .06    .02 **
log(Comp_Scr_Ong)         .17      .09    .07 *       .34      .36    .34
log(US_Perf)              .02      .01    .08 *       .95      .16    .00 ***
log(Time_Lag*US_Perf)     −.02     .0     .01 ***     −.28     .13    .03 **
Extension variables
log(Ad_Exp)               .06      .01    .00 ***
Sequel                    .18      .07    .02 **
US                        .15      .05    .00 ***
Germany                   .21      .07    .00 ***
log(MPAA)                 −.13     .04    .00 ***
Tip                       .08      .06    .21
Children                  .65      .11    .00 ***
Action                    .33      .07    .00 ***
Documentary               −.05     .10    .60
Horror                    .43      .11    .00 ***
Comedy                    .17      .06    .00 ***
Other                     .27      .08    .00 ***
N                         1335                        138
R²                        .85                         .50

Note: *** p < .01; ** p < .05; * p < .10 (two-sided). Comp_Scr_Rev = competition for the attention of audiences, US_Perf = US market performance, Time_Lag = time lag between domestic and foreign release, Ad_Exp = advertising expenditure, Distr_Major = major distributor, Comp_Scr_New = competition for “screen space” from new releases, Comp_Scr_Ong = competition for “screen space” for ongoing movies, Tip = movie is marked with “Tip” (in the German magazine “Cinema”).

The Hausman–Wu specification test (Greene, 2006) provides support for an endogenous relation between REVENUES and SCREENS for all equations (Eqs. (1), (4), (7), (8)) except for Eq. (6). Further, we consider advertising expenditures as a potentially endogenous variable in the demand equation. We test for the exogeneity assumption of advertising expenditures by applying the Hausman–Wu specification test (Greene, 2006).
All exogenous variables that we use to instrument and identify screens in the demand equation also serve as instruments for advertising expenditures. Advertising expenditures in Germany serve as a powerful instrument to identify advertising expenditures on the same movies in the US and vice versa. Expenditures in both countries are correlated because the same studio makes budget decisions. Advertising campaigns in one country, however, should not impact moviegoers in the other country. The Hausman–Wu test does not reject the exogeneity assumption for the advertising variable. The coefficients associated with the additional test variables are not significant (p = .433 for the US and p = .374 for Germany).⁶ Thus, we focus on the endogenous relation between REVENUES and SCREENS for all equations. Analogous to E/E, we do not use additional instrumental variables.

⁶ First-stage R² ranges from .226 (Germany) to .588 (USA) and the associated F-values exceed the threshold of 10 in both cases (Greene, 2006). Thus, we conclude that our instruments are not weak.

The instrumental variables for the respective equations result from the combination of the two interlaced supply and demand equations. For example, in Eq. (6), the first stage is denoted by Eq. (7), which provides the overidentifying instruments BUDGET, DISTR_MAJOR, COMP_SCR_NEW, and COMP_SCR_ONG. These variables serve as instrumental variables because we assume that they only affect the exhibitors' behavior and not the consumers' behavior. In Eq. (7), the first stage is denoted by Eq. (6). In this case, COMP_SCR_REV and SEASON serve as overidentifying instruments, and we assume that they only affect the consumer, not the exhibitor. Based on the tests, we linearize all equations by log-transforming them and estimate the respective parameters using 3SLS.
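The identification logic can be illustrated with a simplified sketch. It uses two-stage least squares (2SLS) on a single demand equation rather than the full 3SLS system used in the paper, and it runs on simulated log-transformed data. The two-equation system, the "true" coefficients, and the use of actual rather than expected revenues in the supply equation are assumptions made for illustration only.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X @ b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


rng = np.random.default_rng(42)
n = 5000

# Hypothetical true parameters of a log-linear system (not the paper's estimates).
b0, b1, b2 = 1.0, 1.0, 0.3   # demand: constant, screens elasticity, season
a0, a1, a2 = 0.5, 0.4, 0.5   # supply: constant, revenues elasticity, budget

log_season = rng.normal(size=n)       # assumed to shift demand only
log_budget = rng.normal(size=n)       # assumed to shift supply only
e_r = rng.normal(scale=0.5, size=n)   # demand shock
e_s = rng.normal(scale=0.5, size=n)   # supply shock

# Solve the two simultaneous equations for screens (reduced form), then revenues.
log_screens = (a0 + a1 * b0 + a1 * b2 * log_season + a2 * log_budget
               + a1 * e_r + e_s) / (1.0 - a1 * b1)
log_revenues = b0 + b1 * log_screens + b2 * log_season + e_r

ones = np.ones(n)

# Naive OLS on the demand equation: biased because screens respond to the demand shock.
beta_ols = ols(np.column_stack([ones, log_screens, log_season]), log_revenues)

# Stage 1: project screens on all exogenous variables, including the excluded
# supply-side instrument (budget).
Z = np.column_stack([ones, log_season, log_budget])
screens_hat = Z @ ols(Z, log_screens)

# Stage 2: replace screens by its first-stage fitted values.
beta_2sls = ols(np.column_stack([ones, screens_hat, log_season]), log_revenues)

print("screens elasticity, OLS :", round(float(beta_ols[1]), 3))
print("screens elasticity, 2SLS:", round(float(beta_2sls[1]), 3))
```

In this simulation the OLS elasticity is pushed upward by the feedback from revenues to screens, while the instrumented estimate stays close to the assumed value of 1.0. 3SLS additionally exploits the correlation between the error terms of the two equations, which this sketch does not do.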
We account for time-specific fixed effects in estimating our model for the first 26 weeks by including a set of dummies in the equations (Eqs. (4), (5), (8), (9)) for the following weeks, which is denoted by the vector WEEK. All weeks after week 26 are captured by a single dummy because we do not expect any time-specific effects after this week. In the appendices, we present the correlation matrices for both markets, which rarely have been presented in prior research. As expected, we find substantial correlations between various variables that raise the question of multicollinearity. However, the VIF values remain below the critical value of 10 and do not indicate severe problems (see Table 6). We ran various robustness checks with respect to multicollinearity by dropping some variables from the model. We find that the results are robust, which is depicted in Table 6 where, for example, advertising spending has been dropped.

5. Results

Tables 3 and 4 provide the 3SLS results for the supply and demand equations of the first week in the US (Table 3) and Germany (Table 4). Table 5 presents the estimation results of the subsequent weeks for both markets. The tables present our estimates and, for comparison, the findings of E/E. A comparison of the R² values indicates a very good fit of all our models. In particular, the models for the German market show a substantially better fit than the findings of E/E. First, we compare the estimated elasticities to the prior findings of E/E. We find substantially more significant influences in the equations addressing the first week, which can partially be attributed to the larger sample size (US 164 versus 1917 and Germany 138 versus 1335 movies).⁷ Generally, our findings support the significant findings of E/E with respect to the direction of the estimated elasticities.

⁷ We analyzed a comparable subsample of our US data set to test whether we could replicate the findings of E/E. We found very similar results, although we used somewhat different measures (the results can be obtained from the authors upon request).

5.1. US demand effects t = 1

Focusing on the first-week estimation results for the US, we find that the most relevant drivers of movie consumers' demand are the number of screens and movie reviews. We assume that reviews reflect the quality of the movie, which is the strongest driver of consumers' demand. These findings correspond to the findings of E/E. However, we find a higher elasticity for reviews (1.05 versus .77, both p < .01). The lower elasticity reported by E/E points towards a sampling bias. Besides blockbusters, our sample also includes low-budget movies with lower media coverage. Thus, reviews may serve as a stronger signal for “smaller” movies as compared to the successful ones. In addition, we assume that our more recent data better reflect the fact that reviews are now easier to obtain via the Internet than before via TV, newspapers, or magazines (Hennig-Thurau, Wiertz, & Feldhaus, 2013). This is reflected in the overall higher elasticity finding. We also find a higher elasticity for screens (1.04 versus .81, both p < .01). This result, however, corresponds to more recent findings of Karniouchina (2011), who uses data from 2005 and finds a screen elasticity of .94 for t = 1.

Further, we find a significant effect of director power on demand (.11, p < .01) but no significant effect of stars on demand (p = .93). Thus, our findings are different from the results of E/E that show a significant effect of stars (.10, p < .05) but no effect of directors (p = .91). With respect to advertising, we find rather small elasticities for movies during the first week (.02, p < .05). These small elasticities are substantially lower than the elasticity reported by E/E (.20, p < .05), but reflect the effect of a potential selection bias because their sample covers only “rather successful” movies that were at least one week in the top 25 of Variety. Very similar to E/E, we find a negative effect of competition on box office. Similar movies (same genre and/or age restriction) will decrease the demand for the movie under consideration. The elasticity of −.19 (p < .01) is close to the −.20 (p < .01) reported by E/E. Contrary to E/E, we find a significant pull effect (.29, p < .01) in high seasons, which substantially increases demand (Radas & Shugan, 1998).

In line with other findings (Hennig-Thurau, Houston, & Heitjans, 2009), we find significantly higher demand for sequels (.18, p < .01), which highlights the recent discussion of sequels' relevance for Hollywood. Ceteris paribus, we find a negative impact of US productions on demand (−.23, p < .01). With respect to MPAA ratings, we find no significant effects on demand (p = .85). Compared to drama (which serves as the base category for the genres), we find significantly lower demand for children (p < .1), action (p < .01), horror (p < .01), comedy (p < .05), or other movies (p < .01) that potentially target genre-specific audiences representing a subset of the market potential. No significant effects are observed for documentaries.

5.2. US supply effects t = 1

Our results reveal a substantial number of significant drivers that influence the behavior of cinema managers, who have a high incentive to shift demand from early weeks to later weeks because margins increase over time. Whereas E/E identified only expected revenues, advertising, and reviews as significant influencers, we find much more differentiated results. Consistent with E/E, we find that positive reviews reduce the number of screens during the first week. The elasticity of reviews with respect to the number of screens is −1.15 (p < .01), compared with −1.48 (p < .01) found by E/E. Thus, movies with better reviews receive fewer screens during the first week.
Karniouchina (2011) observed a similar negative effect with respect to screen allocation for high-quality movies and star buzz. Similar to E/E as well as Karniouchina (2011), we argue that this effect may arise from the larger attractiveness cinemas have for high-quality movies. As a result, the movie is expected to have a stronger staying power. Low-quality movies do not have this staying power, so that cinemas have an incentive to support these movies by providing relatively more screens in the first week. In addition, sometimes distributors systematically limit screens (limited release) in the opening week in order to increase the hype of a high-quality movie (Karniouchina, 2011). Finally, the result may also be an expression of the documented negative relationship between customer satisfaction and market share as noted by Fornell (1995). In this case the quality of the product valued by movie reviewers (typically highbrow content) is in contrast to the notion of lowbrow or “mass appeal” content that comes with widely distributed (and, therefore, high market-share) movies.⁸

⁸ We thank the area editor for pointing this issue out to us.

Table 5
3SLS estimation results US and Germany following weeks.

US following weeks, dependent variable: log(Revenues)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Constant                  .14      .05    .01 ***     .29      .24    .22
log(Screens)              1.00     .00    .00 ***     1.01     .02    .00 ***
log(Comp_Rev)             −.01     .00    .05 *       −.03     .02    .04 **
log(Season)               .15      .01    .00 ***     .02      .06    .70
log(WOM)                  .85      .00    .00 ***     1.05     .04    .00 ***
N                         24,030                      2489
R²                        .97                         .92
Adj. R²                                               .92

US following weeks, dependent variable: log(Screens)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Elberse/Eliashberg variables
Constant                  −2.04    .05    .00 ***     −.59     .38    .12
log(Revenues)                                         1.08     .05    .00 ***
log(Comp_Scr_New)         −.01     .00    .00 ***     −.26     .02    .00 ***
log(Comp_Scr_Ong)         .00      .01    1.00        .06      .05    .27
log(WOM)                                              .35      .09    .00 ***
Extension variables
log(Revenues-1)           .32      .00    .00 ***
log(Screens-1)            .60      .00    .00 ***
N                         24,030                      2489
R²                        .95                         .74

Germany following weeks, dependent variable: log(Admissions)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Constant                  −.03     .04    .51         −.55     .22    .01 **
log(Screens)              1.03     .00    .00 ***     1.08     .03    .00 ***
log(Comp_Rev)             .03      .01    .00 ***     .03      .03    .33
log(Season)               .19      .02    .00 ***     .08      .05    .09 *
log(WOM)                  .86      .01    .00 ***     .74      .03    .00 ***
N                         12,807                      1196
R²                        .94                         .88
Adj. R²                                               .88

Germany following weeks, dependent variable: log(Screens)

                          Extended model              Elberse/Eliashberg
Variable                  Coeff.   se     p           Coeff.   se     p
Elberse/Eliashberg variables
Constant                  −1.02    .05    .00 ***     2.94     .23    .00 ***
log(Admissions)                                       .09      .01    .00 ***
log(Comp_Scr_New)         −.02     .00    .00 ***     −.42     .08    .00 ***
log(Comp_Scr_Ong)         .07      .02    .00 ***     .16      .10    .11
log(WOM)                                              .32      .16    .05 *
Extension variables
log(Admissions-1)         .20      .00    .00 ***
log(Screens-1)            .75      .01    .00 ***
N                         12,807                      12,807
R²                        .96                         .64

Note: *** p < .01; ** p < .05; * p < .10 (two-sided). Comp_Scr_Rev = competition for the attention of audiences, Comp_Scr_New = competition for “screen space” from new releases, Comp_Scr_Ong = competition for “screen space” for ongoing movies, Revenues-1 = revenues in previous week, Screens-1 = number of screens in previous week, Admissions-1 = admissions in previous week.

Further, we find a significant effect (.40, p < .01) of the expected revenues of a movie, albeit smaller than E/E (1.41, p < .01). With respect to the last consistent finding of advertising, we show rather similar results (.31, p < .01 versus .25, p < .05 of E/E). We also identify a number of additional influences. For example, we find a significant effect of production budget on screens (.28, p < .01), which was not reported by E/E. Moreover, consistent with the negative elasticity of reviews on screens, we find that director power significantly reduces the provision of screens. Contrary to actors, the director works behind the camera and is responsible for the visualization of the story with respect to the overall content of the script, resulting in an immense impact on the overall product (Ainslie et al., 2005). Further, some star directors succeed in giving their movies their own distinctive style (Hadida, 2010). Thus, higher director power may serve as an indicator for higher quality and, eventually, longer staying power of the movie. Therefore, we assume that this effect is in line with the argument that positive reviews indicate staying power, which leads to greater incentives for cinema managers to extend the screening period (with higher margins). This finding is supported by the significant correlation of positive reviews and director power (Table A1 in Appendix A). Interestingly, the correlation of star power with reviews is negative, indicating that greater star power does not necessarily lead to high-quality movies.
Therefore, ceteris paribus, star power significantly and positively influences the allocation of screens for the first week (.10, p < .01). Further, we find a strong significant effect of major distributors on screen allocation (.67, p < .01). Thus, major studios are able to monetize their market position by negotiating more screens for their movies. With respect to competition, we also find a negative elasticity for screens from newly released movies (−.27, p < .01), whereas no significant influence is observed from competition generated by ongoing releases. Sequels also generate more screens (.26, p < .05). Therefore, they add further relevance to the total effect of sequels on demand through this substantial indirect effect from screens on total demand. We find a negative effect of US productions on demand (−.23, p < .01). This negative influence on demand may be due to variety-seeking motives of moviegoers. The vast majority of movies shown are produced in Hollywood. A foreign production might be a welcome increase in variety. The result probably would be different if the share of non-US productions was higher. In addition, the finding may reflect the fact that we have a much larger data set including the many “small” US movies released to the market. However, we find a strong positive effect of this variable on screens (.76, p < .01). One reason for this contrary effect could be that the production and distribution of films and television programs is perceived as one of the “nation's most valuable cultural and economic resources” (MPAA, 2013; http://www.mpaa.org/policy/state-by-state), which is used as an argument for substantial financial subsidization of distribution by US states, which in turn may lead to a relatively high number of screens in the opening week. Thus, the overall effect of US productions is positive if the screen elasticity of 1.04 is considered.
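The closing claim about US productions can be checked with a short calculation. The sketch below assumes that the overall effect is the direct demand coefficient plus the supply coefficient multiplied by the screen elasticity, which is one common reading of a log-linear demand and supply system; the text reports only the sign of the overall effect, and the sketch ignores the feedback from expected revenues to screens. The function and constant names are illustrative.

```python
# Coefficients quoted in the text for the US first-week equations.
DIRECT_US_ON_DEMAND = -0.23   # US-production dummy in the demand equation
US_ON_SCREENS = 0.76          # US-production dummy in the supply (screens) equation
SCREEN_ELASTICITY = 1.04      # elasticity of revenues with respect to screens


def total_log_effect(direct: float, on_screens: float, screen_elasticity: float) -> float:
    """Direct effect plus the indirect effect routed through screens (in log points).

    Assumption: effects add up on the log scale; simultaneity feedback is ignored.
    """
    return direct + on_screens * screen_elasticity


indirect = US_ON_SCREENS * SCREEN_ELASTICITY
total = total_log_effect(DIRECT_US_ON_DEMAND, US_ON_SCREENS, SCREEN_ELASTICITY)
print(f"indirect effect via screens: {indirect:.4f}")
print(f"total effect (log points):   {total:.4f}")
```

Under these assumptions the indirect effect (about .79) outweighs the direct effect (−.23), so the sum is positive (about .56), in line with the statement in the text.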
Adding to recent research by Leenders and Eliashberg (2011), we find that restrictive MPAA ratings significantly reduce the number of screens allocated to a movie (−.49, p < .01). Finally, compared with drama, all genre effects on screens are significant and positive, except for “other movies.”

5.3. German demand effects t = 1

The results for the first week in the German movie market support and widely enhance the prior findings of E/E. A strong relevance of screens on demand (1.11, p < .01) is also observed in our study (E/E = 1.51, p < .01). The strong influence of reviews observed in the US market is not present in the German market. However, E/E also found a lower elasticity (.37) for Germany. In our study, we find that reviews significantly influence demand (.18, p < .01) in the German market. In addition, the variable “Tip” (regression coefficient: .32, resulting in a multiplier effect of 37.7%, p < .01), which captures the magazine's recommendation, is also highly significant. Thus, reviews have a major effect on demand in Germany. Interestingly, we find that seasonality has a substantial effect on demand (.57, p < .01), which is higher than E/E's finding (.39, p < .05) and substantially higher than in the US. With respect to (German and international) star power (.20, p < .01) and director power (.12, p < .05), we find that both variables significantly explain demand. Thus, German moviegoers seem to be more attracted by star and director power than their US counterparts. Consistent with the US results, we also find small but significant effects from advertising (.01, p < .1) and competition (−.09, p < .05). In addition, our results reveal that successful US movies face greater demand in Germany (.05, p < .01), although the effect is dampened if the timing of the release is delayed compared with the US release (−.02, p < .01).
Demand is also higher for sequels, and the effect size with respect to admissions (.21, p < .01) compares well with the US (.18, p < .01). Further, US-produced movies face lower demand (−.17, p < .01), which can also be attributed to the fact that we include all US-produced movies released in Germany in our sample. We find all other variables to be insignificant, meaning that we do not find any effect on demand with respect to age ratings, movies produced in Germany, or genres.

5.4. German supply effects t = 1

We find very consistent results in the supply equation for the German market compared with our US results. The estimated effects from expected admissions (.29, p < .01), budget (.17, p < .01), and star power (.13, p < .05) are very close to the findings of E/E. Similar to the results of E/E, we also find no significant effect of director power or reviews (including the German-specific “Tip” variable) on screens. Although this result is consistent with E/E, we found director power and reviews to be significantly negative in the US equation, indicating regional differences between these two markets. Our findings add new insights into the positive effect of advertising on screens. We find a significant but rather small effect (.06, p < .01) of advertising on screens. Additionally, we find significant positive effects on the number of screens attributable to the market power of major distributors (.16, p < .01) that we also identified in the US market, although on a much larger scale. Unlike E/E, we find no significant effect of competition from new releases. However, the competition effect of ongoing movies (.17, p < .1), measured by the average age of the competing movies, significantly influences the number of allocated screens. Thus, greater competition means a lower average age, leading to fewer allocated screens for the movie under consideration.
Moreover, our findings indicate a much smaller effect of US performance on the success in the German market compared with E/E (.02, p < .1 vs. .95, p < .01), and the interaction effect on the time lag between US and German release is much smaller than that of E/E (−.02, p < .01 vs. −.28, p < .05). However, our results are conclusive because E/E do not account for non-US movies or German advertising, likely resulting in an overestimation of the US performance and time lag effect. The remaining effects point in the same direction as in the US supply equation. Sequels (.18, p < .05), US and German productions (.15 and .21, both p < .01), MPAA rating (−.13, p < .01), and genre effects are consistent in both equations (with the exception that, on the one hand, we cannot find a significant effect for documentaries in the German equation and, on the other hand, the genre category “others” becomes significant).

5.5. US demand effects t > 1

The estimated elasticities for the weeks following week 1 are very similar to E/E. In our results, we find that screen elasticity has the highest effect on demand (1.00, p < .01) and that the result is highly consistent with E/E (1.01, p < .01). This result also corresponds to the findings of Karniouchina (2011), who reports elasticities of .94/.89/.93 for t = 2/3/4. We also observe a negative competition effect from other movies of the same genre and/or age rating (−.01, p < .1) that is similar to E/E's findings (−.03, p < .05). Our results differ from E/E in that we find a significant effect of season on demand (.15, p < .01) and a smaller effect of WOM (.85 versus E/E's 1.05, both p < .01) on demand.

5.6. US supply effects t > 1

With respect to supply effects in the following weeks, we find strong effects of the previous week's box office result (.32, p < .01) and, especially, the number of screens allocated to the movie in the previous week (.60, p < .01) on screens.
Our results better separate the findings of E/E, who report a significant effect of expected revenues (1.08, p < .01) on screens. Consistent with E/E but on a smaller scale (−.01 versus −.26, both p < .01), we find a significant negative effect of competition from newly released movies on screens.

5.7. German demand and supply effects t > 1

The results for screen allocation in Germany are similar to our US results with respect to the relevance of the movie's box office performance measured by the admissions (.20, p < .01) and screens (.75, p < .01) of the previous week. However, we find a stronger tendency in the German market to depend on the previous week's screen allocation. Interestingly, we find stronger effects from competition in the German market. On the one hand, competition from similar movies has a positive effect (.03, p < .01) on demand. Such a market expansion effect attributable to competition was also identified by Radas and Shugan (1998). On the other hand, we find a stronger effect (−.02, p < .01) of competing new releases and ongoing movies (.07, p < .01) on the supply side than in the US market.

5.8. Robustness and validation checks

Table 6 shows the results of some of our robustness and validation checks with respect to the results of the first week in the two markets (for comparison purposes, we report our benchmark results in the first column). First, as previously noted, we observe a large number of missing values for our budget variables. If we exclude the production budget from our estimation (column 3SLS w/o budget), we can rely on larger data sets (n = 3460 for the US market and n = 2598 for the German market). Interestingly, we find no substantial effect in the demand model. In the supply model, the exclusion of the budget only leads to some minor changes.
The omission of this variable results in an increase in the effect of major distributors on screens (1.01 instead of .67, both p < .01). Thus, the effect of the budget on screens is picked up by the major studio dummy variable. We also find less relevance for US productions on screen allocation (.35 instead of .76, both p < .01) in the US market. In Germany, we observe consistent effects and find that the market power of the distributor becomes more relevant for screen allocation. Interestingly, we find that the dummy variable indicating German movies becomes insignificant and that documentaries face a significant hurdle in gaining additional screens (−.22, p < .01). Second, we test whether the omission of advertising results in substantial changes attributable to multicollinearity. However, we only find marginal changes in the results (see column 3 in Table 6). Finally, we also report the results of a simple OLS for comparison with E/E. As a conclusion, our robustness checks and our verifications of the results
Table 6 Robustness and validation checks. US first week 3SLS log(Revenues//Admissions) Variables Elberse/ Eliashberg Extension log(Screens) Variables Elberse/ Eliashberg Extension Constant log(Screens) log(Star) log(Director) log(Ad_Exp) log(Reviews) log(Comp_Rev) log(Season) Sequel US Germany log(MPAA) Tip Children Action Documentary Other R2 Adj. R2 Mean VIF Max VIF Constant log(Revenues**// Admissions) log(Budget) log(Star) log(Director) log(Ad_Exp) log(Reviews) Distr_Major log(Comp_Scr_New) log(Comp_Scr_Ong) log(US_Perf) Sequel US Germany log(MPAA) Tip Children Action Documentary Horror Comedy Other R2 Adj.
R2 Mean VIF Max VIF N Germany first week 3SLS w/o Budget 4.99 1.04 −.00 .11 .02 1.05 −.19 .29 .18 −.23 *** *** *** ** *** *** *** *** *** 5.28 1.03 −.01 .13 .03 1.10 −.28 .23 .17 −.17 3SLS w/o Ad-Exp *** *** *** *** *** *** *** *** *** OLS 4.98 *** 1.05 *** .00 .11 *** 1.06 −.17 .29 .18 −.22 *** *** *** *** *** 3SLS 8.36 .79 .09 .16 .09 .66 −.50 .37 .30 −.01 *** *** *** *** *** *** *** *** *** −.01 −.03 −.25 * −.28 *** .03 −.21 * .92 −.48 *** −.29 *** .04 −.18 ** .94 −.21 −.27 *** .04 −.20 * .92 −.19 −.21 *** .07 −.39 *** .95 .94 1.52 2.60 −1.66 * .40 *** 3.05 *** .44 *** −8.68 *** .57 *** −1.73 * .37 *** .28 .10 −.17 .31 −1.15 .67 −.27 .07 *** *** *** *** *** *** *** .10 −.15 .35 −1.25 1.01 −.29 .06 −.02 .01 *** *** *** *** *** *** .47 *** .12 *** −.20 *** −1.07 *** .90 *** −.01 .09 .30 .11 −.16 .29 −1.15 .82 −.24 .02 *** *** *** *** *** *** *** .26 ** .76 *** .36 *** .35 *** .20 * .99 *** .27 ** .76 *** −.49 *** −.52 *** −.28 *** −.45 *** .60 .55 .64 .93 .43 .17 .76 1917 ** *** *** *** *** .67 .59 .18 .69 .21 .09 .81 3460 *** *** * *** *** .79 .50 .77 1.09 .51 .18 .73 1917 *** *** *** *** *** .60 .54 .65 .98 .43 .18 .76 .76 1.91 4.18 1917 ** *** *** *** *** 5.30 1.11 .20 .12 .01 .18 −.09 .57 .21 −.17 .02 −.00 .32 −.19 .00 −.02 .04 .88 3SLS w/o Budget *** *** *** ** * *** ** *** *** *** *** −2.49 *** .29 *** .17 .13 −.03 .06 .04 .16 −.00 .17 .02 .18 .15 .21 −.13 .08 .65 .33 −.05 .43 .17 .27 .85 1335 *** ** *** *** * * ** *** *** *** *** *** *** *** *** 5.72 .98 .22 .22 .03 .19 −.13 .49 .32 −.16 −.05 −.02 .39 −.26 .02 −.28 −.06 .88 3SLS w/o Ad-Exp *** *** *** *** *** *** *** *** *** *** *** ** *** −.61 .34 *** .29 −.02 .06 .03 .31 −.00 .17 .02 .19 .16 .03 −.07 .12 .90 .41 −.22 .47 .22 .27 .85 2598 *** *** *** ** ** *** ** * *** *** *** *** *** *** OLS 5.31 1.13 .22 .12 *** *** *** ** .18 −.10 .56 .21 −.18 .02 .00 .32 −.19 −.01 −.03 .04 .88 *** ** *** *** *** *** −2.84 *** .35 *** .19 *** .20 *** −.01 .05 .17 −.00 .01 .03 .16 .14 .27 −.10 .07 .73 .31 −.07 .49 .16 
.29 .83 1335 *** ** ** *** *** ** *** *** *** *** *** 5.18 1.14 .18 .12 .01 .18 −.09 .56 .20 −.19 .01 .01 .31 −.23 −.02 −.00 .02 .88 .88 1.89 5.04 *** *** *** ** ** *** * *** *** *** *** ** −3.36 *** .56 *** .09 −.02 −.07 .03 −.05 .13 .02 .06 −.01 .02 .16 .10 −.07 −.05 .41 .18 −.04 .18 .06 .11 .89 .89 1.82 5.12 1335 *** * *** *** ** *** ** ** *** *** *** **
Note: ***p < .01; **p < .05; *p < .10 (two sided).
Note: Ad_Exp = advertising expenditure, Comp_Scr_Rev = competition for the attention of audiences, Distr_Major = major distributor, Comp_Scr_New = competition for “screen space” from new releases, Comp_Scr_Ong = competition for “screen space” for ongoing movies, US_Perf = US market performance, Tip = movie is marked with “Tip” (in the German magazine “Cinema”).

with E/E indicate that our findings are robust and may serve as a foundation to generate empirical generalizations.

6. Implications and generalizations

High financial risks in production and marketing, the hedonic nature of products, and the global cultural relevance of movies have attracted a substantial number of researchers who have focused on a large number of various success drivers of movies. The findings of previous research substantially differ with respect to the included variables, the underlying measurements, data quality, and, as a result, the findings. Following the taxonomy of Hubbard and Armstrong (1994), this paper provides an extension of E/E's research and allows new insights and generalizations. We test the conceptual relationships involved in the original study with changes in the initial design by (1) extending the variables included in the model, (2) using two new samples drawn from a different population, and (3) using partially different measures for the variables. Table 7 presents an overview of the findings.
Our study also helps in understanding whether the passage of time has had an effect on the results of E/E and whether we can generalize the findings. We find substantial differences in the effect of marketing variables with respect to the behavior of suppliers and consumers for movies, indicating that a separate analysis of the two players is fruitful. Given our data and the robustness of our findings, we present the following set of generalizations.

6.1. Demand generalizations

The analysis of the first week (t = 1) demand drivers in the two markets reveals insights with respect to the relative magnitude of the elasticities. In the US market, we find that reviews and the number of screens are by far the most important drivers of demand, with elasticities of 1.05 (p < .01; reviews) and 1.04 (p < .01; screens). Next, we observe substantial seasonal primary demand effects (elasticity of .29, p < .01) in the US market. Genre effects also show a substantial multiplier effect for horror (−48%; p < .01), action (−32%; p < .01), children (−28%; p < .1), and other movies (−23%; p < .1). Finally, US-produced movies result in a multiplier effect of −26% (p < .01). All other drivers are equal to or below an elasticity of |.2| or a multiplier effect of |20%|. In the German market, we identify a strong influence of screens (1.11; p < .01) on demand in the first week. Additionally, we find that reviews (.18 and an additional multiplier effect of 37% if the movie has been marked as a “tip” by the leading German movie magazine Cinema; both p < .01) and seasonality (.57; p < .01) have a strong influence on demand. We also find sequels, with a multiplier effect of 24% (p < .01), to be of high relevance in the German market. All other drivers are equal to or below an elasticity of |.2| or a multiplier effect of |20%|.
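The multiplier effects quoted for dummy variables can be related to the underlying regression coefficients with a small calculation. The sketch below assumes the standard conversion for a dummy in a log-linear model, exp(b) − 1, which reproduces the 37.7% reported for the “Tip” coefficient of .32. For negative coefficients, the quoted figures (for example, −26% for the US-production coefficient of −.23) appear to match −(exp(|b|) − 1) rather than exp(b) − 1; this is an inference from the reported numbers, not a formula stated in the paper, and both function names are illustrative.

```python
import math


def multiplier_pct(coef: float) -> float:
    """Standard percentage effect of a dummy in a log-linear model: 100 * (exp(b) - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)


def sign_symmetric_multiplier_pct(coef: float) -> float:
    """Assumed variant: 100 * (exp(|b|) - 1), carrying the sign of b.

    Inferred from the quoted negative multipliers; not confirmed by the paper.
    """
    return math.copysign(100.0 * (math.exp(abs(coef)) - 1.0), coef)


# Coefficients quoted in the text.
quoted = {
    "Tip (Germany, demand)": 0.32,
    "Major distributor (US, screens)": 0.67,
    "Major distributor (Germany, screens)": 0.16,
    "US production (US, demand)": -0.23,
}

for name, coef in quoted.items():
    print(
        f"{name}: b = {coef:+.2f}, "
        f"exp(b) - 1 = {multiplier_pct(coef):+.1f}%, "
        f"sign-symmetric = {sign_symmetric_multiplier_pct(coef):+.1f}%"
    )
```

For positive coefficients the two functions coincide; they differ only for negative coefficients, where exp(b) − 1 gives a smaller absolute percentage.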
For later weeks (t > 1), we find consistent results across both markets with respect to the relative importance of the number of screens (elasticity of 1.00 in the USA and 1.03 in Germany, both p < .01) and WOM (elasticity of .85 in the USA and .86 in Germany, both p < .01). The results lead to the following generalizations with respect to demand effects.

6.1.1. Demand generalization 1

The demand for a movie in its initial release week (t = 1) is driven by the quality (measured using aggregated professional reviews) of the movie. The elasticity for reviews is 1.05 in the US and .18, with an additional 37% if the movie is listed as a “tip”, in Germany. The effect size for Germany is significantly smaller (t-test; p < .01) than for the US.

Table 7
Overview of empirical findings.
Variable: Description
REVENUES/ADMISSIONS: Weekly revenues (US)/admissions (GER)
REVENUES1**: Expected revenues first week
REVENUES(t−1)/ADMISSIONS(t−1): Revenues (US)/admissions (GER) previous week
SCREENS: Weekly number of screens
SCREENS(t−1): Screens previous week
BUDGET: Production budget
STAR: Star power
DIRECTOR: Director power
AD_EXP: Advertising expenditures
REVIEWS: Critical reviews
DISTR_MAJOR: Major distributor
WOM: Word-of-mouth communication
COMP_SCR_NEW: Competition for “screen space” from new releases
COMP_SCR_ONG: Competition for “screen space” from ongoing movies
COMP_SCR_REV: Competition for the attention of audiences
SEASON: Seasonality
US_PERF: US market performance
US_PERF*TIME_LAG: US market performance × Time lag between domestic and foreign release
SEQUEL: Sequel
US: US production
GERMANY: German production
MPAA/FSK: MPAA/FSK rating
CHILDREN^a: Genre children
ACTION^a: Genre action
DOCUMENTARY^a: Genre documentary
HORROR^a: Genre horror
COMEDY^a: Genre comedy
OTHER^a: Other genre
TIP: Movie is marked with “Tip” (Cinema)
Columns: US demand (First week revenues; Following weeks revenues), US supply (First week screens; Following weeks screens), GER demand (First week admissions; Following weeks admissions), GER supply (First week screens; Following weeks screens)
+ + + + + + + + + + + − + − + n.s. + + + + + + n.s. + n.s. + + + + + + + − − n.s. − n.s. n.s. + + − − − + + + + + − + + − + + n.s. − − n.s. − − − − + + + + + n.s. + − n.s. n.s. n.s. n.s. n.s. n.s. n.s. n.s. +
+ − + + + − + + n.s. + + + n.s.
Note: + = coefficient is positive at a significance level of p > .01; − = coefficient is negative at a significance level of p > .01; n.s. = not significant.
a Base level in the model is DRAMA.

6.1.2. Demand generalization 2

The distribution of the movie, measured by its number of screens, has a major influence on revenues. The screen elasticity for the first week is similar and not statistically different (p > .30) for the US and Germany, with elasticities of 1.04 and 1.11, respectively. The elasticity remains very high in later periods (1.00 in the US and 1.03 in Germany; the elasticities are statistically different; t-test; p < .01). Thus, the major managerial challenge for studios and distributors is to generate distribution power.

6.1.3. Demand generalization 3

The demand in later stages of a movie is strongly influenced by WOM (elasticity of .85 in the USA and .86 in Germany; the elasticities are not statistically different; t-test; p < .25).

6.1.4. Demand generalization 4

The elasticities for director power are .11 (USA) and .12 (Germany), and the elasticities are not statistically different (t-test; p < .94). This generalization is also supported by Hadida (2010).

6.2. Supply generalizations

The analysis of the supply drivers in the first week (t = 1) in both markets provides further opportunities for generalizations. We discuss the insights with respect to the relative magnitude of the elasticities. In the US market, we observe a negative elasticity of reviews on screens of −1.15 (p < .01).
Further, we find that investments in production budgets (elasticity .28, p < .01) and advertising (elasticity .31, p < .01) positively influence the number of screens allocated to a new movie. In addition, we find a strong significant effect of major distributors on screen allocation (regression coefficient of .67, resulting in a multiplier effect of 95%, p < .01). Thus, the financial power and the large market share of the major studios result in substantial benefits with respect to receiving a substantial number of screens, especially for US productions (regression coefficient of .76, resulting in a multiplier effect of 113%, p < .01). Finally, we note that the expected revenues have a substantial impact on screen allocation (elasticity .40, p < .01). In the German market, we do not find any support for a negative effect of reviews on screen allocation, as both the review and the tip variable are insignificant. However, in line with the findings regarding the US market, we identify a substantial impact of the expected admissions on screens (elasticity .29, p < .01). Additionally, we find support for the positive effect of financial power (production budget .17 and advertising expenditures .06, both p < .01) and market share advantages (major distributor, regression coefficient of .16, resulting in a multiplier effect of 17%, p < .01) on screens, although both effects are lower than the effects in the US market. For later weeks (t > 1), we find consistent results across both markets with respect to the relative importance of screen (elasticity of .60 in the US and .75 in Germany, both p < .01) and revenue/admission carry-over effects (elasticity of .32 in the US and .20 in Germany). The results lead to the following generalizations with respect to supply effects.

6.2.1. Supply generalization 1

High quality movies with excellent reviews receive substantially fewer screens during the first week only in the USA (elasticity −1.15; p < .01).
This effect is based on the high total revenue expectations of the cinemas for high quality movies, resulting in longer expected staying power. Because the revenue shares for cinemas are higher in later weeks of the movie's run, the cinemas have, ceteris paribus, an incentive to shift demand to later weeks of the movie's run.

6.2.2. Supply generalization 2

Advertising has only limited effects on demand (elasticity: USA .02, p < .05; Germany .01, p < .1; according to a t-test, the elasticities are not statistically different; p < .80). The limited ad effectiveness might point to the existence of advertising thresholds in the market for moviegoers. However, we find that advertising has a major influence on supply (elasticity: USA .31, p < .01; Germany .06, p < .01; the elasticities are statistically different; t-test; p < .01). Thus, advertising serves as an instrument targeted mainly towards the cinemas. The lower elasticity in the German market arises because the German sample includes all German movies, i.e., also movies that spent much less on advertising.

6.2.3. Supply generalization 3

Screen allocation is substantially influenced by the expected revenues (elasticity: USA .40, p < .01; Germany .29, p < .01; the elasticities are not statistically different; t-test; p < .24). Supply in later periods is primarily driven by carryover effects in such a way that revenues/admissions (elasticity of .32 in the USA and .20 in Germany) and, in particular, screens of the previous week (elasticity of .60 in the USA and .75 in Germany) substantially influence the number of screens provided in subsequent periods. Although the effects are statistically different (t-test, p < .01) between the two countries, they support in their relative importance the statement by Krider et al. (2005) that the dominant industry pattern is one of movie exhibitors monitoring box office sales and then responding with screen allocation decisions.

6.3. Country specific findings and generalizations

Besides the generalizations with respect to demand and supply, we find effects in the two markets that are of general interest but that have a lower relative importance with respect to the elasticities.

6.3.1. Generalization of sequel multiplier effect

Sequels result in higher demand (multiplier effect of 20% in the USA and 24% in Germany) and a higher number of screens (multiplier effect of 30% in the USA and 19% in Germany) in both markets. The estimated demand and supply parameters do not statistically differ from each other between the two countries.

6.3.2. Generalization of major multiplier effect

Major distributors are in a position to roll out their global market power and receive a higher number of screens during the opening week than other studios. The multiplier effect in the USA (95%) is much higher than in Germany (17%). The smaller effect of majors in markets outside of the USA has also been noted by E/E for France, the UK, and Spain. The reason is the strong relevance of national content that is partially distributed by local firms that compete with the US majors.

6.3.3. Generalization of competition

The effect of competition on demand and supply is consistently negative for the opening week. The competition of similar movies for the attention of audiences has a higher impact in the US market. We find elasticities of −.19 in the US vs. −.09 in Germany. However, a t-test shows this difference to be not significant (p = .14). Moreover, we find different competition effects to be significant in the US and in Germany. While in the US the competition for screen space is dominated by new releases (−.27, p < .01), in Germany only the ongoing movies have a significant competition effect (.17, p < .1; higher scores represent weaker competition). However, in later weeks, the effects from competition tend to become smaller.
The elasticities of the competition of similar movies for the attention of audiences amount to −.01 (p < .1) in the US vs. .03 (p < .01) in Germany, so in Germany the effect on admissions is even slightly positive, pointing towards a market extension effect of competition. Regarding the number of screens, we find an elasticity of −.01 in the US for the competition of new releases. In Germany, both competition measures for screen space prove to be significant (new releases: −.02, p < .01; ongoing movies: .07, p < .01; higher scores represent weaker competition).
CHILDREN 1 −.099 −.125 −.358 −.124 −.043 .205 −.076 .024 −.313 .030 ACTION

6.3.4. Generalization of star power

Stars lead to significantly more screens (elasticity .10 in the USA and .13 in Germany, not statistically different; t-test; p < .68) in both markets. However, German moviegoers seem to be more attracted by star power (elasticity .20) than their US counterparts, for whom the star effect is not significant. Contrasted with the findings on reviews, it seems likely that US moviegoers rely more on reviews than on other quality signals such as star power.
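Several of the generalizations above rest on tests of whether an elasticity differs between the US and German samples. The paper does not spell out the test statistic, so the following is only a minimal sketch of one common approach: a large-sample z-test for the difference of two coefficients estimated on independent samples. The standard errors in the usage example are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the function name is illustrative.

```python
import math


def coefficient_difference_test(b1: float, se1: float, b2: float, se2: float):
    """Two-sided z-test for H0: b1 == b2, assuming independent samples
    and approximately normal estimators (a large-sample approximation)."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)   # standard error of the difference
    z = (b1 - b2) / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p_value


# Star-power elasticities on screens as quoted in the text (.10 US, .13 Germany).
# The standard errors below are HYPOTHETICAL and serve only to make the example run.
z, p = coefficient_difference_test(b1=0.10, se1=0.04, b2=0.13, se2=0.06)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

With standard errors of this size relative to the gap between the coefficients, the difference would not be significant at conventional levels; the actual conclusion depends on the true standard errors and on the test the authors used.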
MPAA 1 .386 −.047 .260 −.216 .054 −.061 .013 .064 .355 −.249 −.141 −.190 .128 .648 .367 −.003 .244 −.286 .023 .032 .014 −.142 .449 −.229 −.099 −.231 .068 .441 .646 .568 .272 .112 .345 −.231 −.038 −.071 .009 −.096 .531 −.244 −.053 −.373 .142 1 1 .256 .420 .694 .535 .763 .743 .433 .039 .288 −.232 .099 −.034 .015 −.117 .491 −.199 −.114 −.290 .068 .827 .285 .460 .756 .561 .725 .602 .292 .101 .300 −.175 .076 .011 −.008 −.153 .607 −.239 −.087 −.401 .067 1 .949 .759 .245 .474 .696 .524 .676 .504 .280 .082 .288 −.169 .102 .033 −.016 −.291 .585 −.208 −.080 −.366 −.012 1 .100 .224 .120 .210 .198 .008 .057 .198 −.037 .077 −.051 −.012 −.015 .141 −.059 .012 −.161 .104 1 .243 .301 .460 .294 .244 .028 .092 −.041 .051 .070 −.054 −.227 .342 −.083 −.053 −.139 −.005 1 .416 .325 .281 .063 .144 −.120 .052 −.004 −.006 −.050 .322 .524 .138 .109 −.023 1 STAR AD_EXP BUDGET US SEQUAL REVENUES1** REVENUES SCREENS REVENUES REVENUES1** SEQUEL US BUDGET AD_EXP STAR DIRECTOR MPAA CHILDREN ACTION DOCUMENTARY HORROR COMEDY OTHER REVIEW DISTR_MAJOR COMP_SCR_NEW COMP_SCR_ONG COMP_SCR_REV SEASON We thank the reviewers and the editorial team for valuable comments throughout the review process. We also thank Alexa B. Burmester for her excellent comments on prior versions of this paper. Correlations of the US data set (after log transformation, significant correlations on p b .05 level are bold) Acknowledgments Appendix A. Correlation matrices for the US and German data This research has been set up systematically to generate empirical generalizations in the motion picture industry. The necessity of providing generalized findings has been emphasized by many scholars and has led to, for example, a special edition of Marketing Science focusing on empirical generalizations in 1995. Our findings provide such generalizations for both the demand and supply side of a major global industry in two major markets. DIRECTOR Along with these generalizations, we also find interesting avenues for further research. 
(1) We strongly suggest focusing more on profit-driven analyses. Although we are aware of production costs and advertising budgets, we are not able to analyze whether stars or directors are worth their specific costs because the public typically does not know their salaries.
(2) Further, we encourage additional studies to analyze the side payments of distributors to cinemas in the profit estimation. The profits of cinemas are also substantially driven by concession sales (e.g., popcorn, drinks) that go to the exhibitor. The concession profit (which has been assumed to be about 1 EUR per visitor in the Netherlands) depends on the genre of the movie and has been included in prior research on optimizing movie allocation for the Dutch cinema group Pathé (Eliashberg et al., 2001). However, the relative importance of the concession profits compared to the overall profit generated per visitor for a cinema is not mirrored in prior research.
(3) Although the perceived quality of a movie is often included in empirical studies using proxies such as critical acclaim, additional details with respect to the relevance of the movie's content would be helpful. Thus, a stronger interdisciplinary approach using methods grounded in film or, more generally, content analysis is encouraged to extend the prior research of Eliashberg, Hui, and Zhang (2007). Such new data should be used in new studies that explicitly examine the reviews' influence on screen allocation. Clearly, more research on this variable is needed.
(4) Although some research has been done on the optimal allocation of movies to screens (Eliashberg et al., 2001; Swami, Eliashberg, & Weinberg, 1999), we suggest a deeper analysis of simultaneous decision-making processes of cinemas and studios (or distributors) with respect to advertising budgets and, more generally, movie buzz strategies. Our interviews revealed that these issues are of key relevance during negotiations with respect to screens.
(5) The high motivation of individuals to use hedonic goods as signals to their social system explains the high relevance of WOM in the market. However, industry dynamics, coupled with short life cycles of movies in the first window, provide interesting new avenues for further research that analyzes the differences between pre- and post-release WOM on supply and demand. 1 −.089 .102 −.222 .084 .018 −.028 −.104 .136 −.093 −.092 .055 −.052 6.4. Research implications Note: REVENUES1** = expected revenues first week, AD_EXP = advertising expenditure, DISTR_MAJOR = major distributor, COMP_SCR_NEW = competition for “screen space” from new releases, COMP_SCR_ONG = competition for “screen space” for ongoing movies, COMP_SCR_REV = competition for the attention of audiences. M. Clement et al. / Intern. J. of Research in Marketing 31 (2014) 207–223 1 −.100 −.026 −.033 −.094 −.032 .077 .067 −.016 .029 −.298 .057 220 Correlations of the US data set (after log transformation, significant correlations on p b .05 level are bold) DOCUMENTATION HORROR COMEDY 1 −.033 −.093 −.032 .068 −.118 .070 −.001 −.011 −.042 1 −.118 −.041 −.151 −.019 −.005 .020 −.074 −.034 1 −.116 −.122 .034 −.015 .000 .072 .020 OTHER REVIEW DISTR_MAJOR 1 −.112 .061 .039 .050 .123 1 −.192 −.124 −.275 .090 COMP_SCR_NEW COMP_SCR_ONG COMP_SCR_REV SEASON 1 .179 .007 1 −.149 1 1 .032 −.025 −.005 −.009 −.188 .012 1 .300 .539 −.083 Correlations of the German data set (after log transformation, significant correlations on p b .05 level are bold) ADMISSIONS SEQUEL US GERMANY BUDGET AD_EXP STAR DIRECTOR MPAA (FSK) CHILDREN ACTION DOCUMENTARY HORROR COMEDY OTHER TIP REVIEWS DISTR_MAJOR COMP_SCR_NEW COMP_SCR_ONG COMP_SCR_REV US_PERF TL_US SEASON SCRE ENS ADMISSIONS SEQUAL US GERMANY BUDGET AD_EXP STAR DIRECTOR MPAA (FSK) CHILDREN ACTION DOCUMENTATION .927 .262 .397 −.154 .664 .654 .486 .238 −.008 .181 .191 −.212 .058 −.028 .168 .152 .349 .417 −.181 −.031 −.221 .374 .196 .092 1 .272 .360 −.148 .624 .630 .510 .279 .022 .117 
.180 −.198 .052 −.026 .178 .203 .408 .375 −.219 −.026 −.211 .391 .186 .145 1 .145 −.095 .193 .128 .090 .060 .069 .057 .098 −.045 .049 −.058 .111 −.032 .119 .143 −.136 −.017 −.145 .086 −.027 .023 1 −.491 .512 .236 .385 .122 .012 −.025 .107 −.115 .089 .106 .059 .042 .190 .420 −.118 .030 −.121 .596 .593 .010 1 −.367 −.083 −.111 −.023 −.071 .058 −.098 .102 −.080 −.050 −.025 .021 −.090 −.151 .072 −.010 .126 −.522 −.456 .035 1 .431 .510 .225 .024 .109 .216 −.283 −.002 −.019 .175 .064 .290 .424 −.177 .011 −.132 .484 .368 .058 1 .386 .211 .050 .074 .102 −.148 .058 −.044 .107 .127 .229 .269 −.100 −.107 −.133 .268 .145 .074 1 .377 .052 −.165 .187 −.150 −.109 .054 .090 .144 .312 .358 −.145 −.127 .020 .307 .240 .076 1 .104 −.074 .082 −.023 −.045 −.077 .103 .150 .249 .179 −.123 −.143 −.012 .126 .026 .064 1 −.325 .334 −.180 .269 −.293 .085 −.002 .093 −.001 −.024 .051 −.223 .046 −.006 −.049 1 −.138 −.043 −.065 −.159 −.083 .036 −.012 .005 .027 −.009 −.159 −.059 −.045 .041 1 −.086 −.131 −.318 −.166 −.016 .114 .071 −.110 −.024 −.202 .077 .029 −.066 1 −.041 −.100 −.052 −.007 −.089 −.137 .053 .005 −.028 −.139 −.113 −.009 221 Note: AD_EXP = advertising expenditure, Tip = movie is marked with “Tip” (in the German magazine “Cinema”), DISTR_MAJOR = major distributor, COMP_SCR_NEW = competition for “screen space” from new releases, COMP_SCR_ONG = competition for “screen space” for ongoing movies, COMP_SCR_REV = competition for the attention of audiences, US_PERF = US market performance, TL_US = time lag between domestic and foreign release × US market performance. M. Clement et al. / Intern. J. of Research in Marketing 31 (2014) 207–223 REVENUES REVENUES1** SEQUEL US BUDGET AD_EXP STAR DIRECTOR MPAA CHILDREN ACTION DOCUMENTARY HORROR COMEDY OTHER REVIEW DISTR_MAJOR COMP_SCR_NEW COMP_SCR_ONG COMP_SCR_REV SEASON 222 M. Clement et al. / Intern. J. 
of Research in Marketing 31 (2014) 207–223 SEASON 1 1 .043 1 −.095 −.157 −.116 .288 .234 .019 .164 −.285 .036 −.077 .240 .099 −.003 1 1 .327 .002 −.036 .034 −.036 .177 .100 .040 1 .058 .118 .103 −.044 .081 −.148 .067 −.019 .055 1 −.192 −.116 −.064 .081 −.009 .013 .156 .027 .103 −.011 REVIEW TIP OTHER COMEDY HORROR 1 −.151 −.079 −.011 −.044 .044 .019 −.040 −.478 .065 .080 −.019 ADMISSIONS SEQUEL US GERMANY BUDGET AD_EXP STAR DIRECTOR MPAA (FSK) CHILDREN ACTION DOCUMENTARY HORROR COMEDY OTHER TIP REVIEWS DISTR_MAJOR COMP_SCR_NEW COMP_SCR_ONG COMP_SCR_REV US_PERF TL_US SEASON Correlations of the German data set (after log transformation, significant correlations on p b .05 level are bold) Appendix A (continued) DISTR_MAJOR 1 −.092 .079 −.063 .019 .100 COMP_SCR_NEW 1 −.015 .034 .005 −.110 COMP_SCR_ONG 1 −.069 −.066 .018 COMP_SCR_REV 1 .756 .056 US_PERF TL_US Appendix B. Supplementary data Supplementary data to this article can be found online at http:// www.runmycode.org. References Ainslie, A., Drèze, X., & Zufryden, F. S. (2005). Modeling movie life cycles and market share. Marketing Science, 24(3), 508–517. Albers, S. (2012). Optimizable and implementable aggregate response modeling for marketing decision support. International Journal of Research in Marketing, 29(2), 111–122. Bass, F. M. (1995). Empirical generalizations and marketing science: A personal view. Marketing Science, 14(3), G6–G19. Basuroy, S., Chatterjee, S., & Ravid, A. S. (2003). How critical are critical reviews? The box office influence of film critics, star-power, and budgets. Journal of Marketing, 67, 103–117. Boatwright, P., Basuroy, S., & Kamakura, W. A. (2007). Reviewing the reviewers: The impact of individual film critics on box office performance. Quantitative Marketing and Economics, 5, 401–425. Deuchert, E., Adjamah, K., & Pauly, F. (2005). For Oscar glory or Oscar money? Academy awards and movie success. Journal of Cultural Economics, 29, 159–176. Ehrenberg, A. S.C. (1995). 
Empirical generalisations, theory, and method. Marketing Science, 14(3), G20–G28.
Elberse, A. (2007). The power of stars: Do star actors drive the success of movies? Journal of Marketing, 71, 102–120.
Elberse, A., & Eliashberg, J. (2003). Demand and supply dynamics for sequentially released products in international markets: The case of motion pictures. Marketing Science, 22(3), 329–354.
Eliashberg, J., Elberse, A., & Leenders, M. A. A. M. (2006). The motion picture industry: Critical issues in practice, current research, and new research directions. Marketing Science, 25(6), 638–661.
Eliashberg, J., Hegie, Q., Ho, J., Huisman, D., Miller, S. J., Swami, S., et al. (2009). Demand-driven scheduling of movies in a multiplex. International Journal of Research in Marketing, 26(2), 75–88.
Eliashberg, J., Hui, S. K., & Zhang, Z. J. (2007). From story line to box office: A new approach for green-lighting movie scripts. Management Science, 53(6), 881–893.
Eliashberg, J., Jonker, J.-J., Sawhney, M. S., & Wierenga, B. (2000). MOVIEMOD: An implementable decision-support system for prerelease market evaluation of motion pictures. Marketing Science, 19(3), 226–243.
Eliashberg, J., & Shugan, S. M. (1997). Film critics: Influencers or predictors? Journal of Marketing, 61, 68–78.
Eliashberg, J., Swami, S., Weinberg, C. B., & Wierenga, B. (2001). Implementing and evaluating SilverScreener: A marketing management support system for movie exhibitors. Interfaces, 31(3), S108–S127.
Fornell, C. (1995). The quality of economic output: Empirical generalizations about its distribution and relationship to market share. Marketing Science, 14(3), G203–G211.
Goldenberg, J., & Muller, E. (2012). Editorial. International Journal of Research in Marketing, 29(4), v–vii.
Greene, W. H. (2006). Econometric analysis (6th ed.). Upper Saddle River, NJ: Prentice Hall.
Hadida, A. (2010). Commercial success and artistic recognition of motion picture. Journal of Cultural Economics, 34(1), 45–80.
Hanssens, D.
(2009). Empirical generalizations about marketing impact. Cambridge, MA: Marketing Science Institute (MSI) Relevant Knowledge Series.
Hennig-Thurau, T., Henning, V., Sattler, H., Eggers, F., & Houston, M. B. (2007). The last picture show? Timing and order of movie distribution channels. Journal of Marketing, 71, 63–83.
Hennig-Thurau, T., Houston, M. B., & Heitjans, T. (2009). Conceptualizing and measuring the monetary value of brand extensions: The case of motion pictures. Journal of Marketing, 73(6), 167–183.
Hennig-Thurau, T., Völckner, F., Clement, M., & Hofmann, J. (2013a). An ingredient branding approach to determine the financial value of stars: The case of motion pictures. Available at SSRN: http://ssrn.com/abstract=1763547
Hennig-Thurau, T., Wiertz, C., & Feldhaus, F. (2013b). Does Twitter matter? An investigation of the impact of microblogging word of mouth on consumers' adoption of new products. Available at SSRN: http://ssrn.com/abstract=2016548
Hubbard, R., & Armstrong, J. S. (1994). Replications and extensions in marketing: Rarely published but quite contrary. International Journal of Research in Marketing, 11(3), 233–248.
Joshi, A. M., & Hanssens, D. M. (2009). Movie advertising and the stock market valuation of studio: A case of “great expectations”? Marketing Science, 28(2), 239–250.
Kamakura, W. A., Basuroy, S., & Boatwright, P. (2006). Is silence golden? An inquiry into the meaning of silence in professional product evaluations. Quantitative Marketing and Economics, 4(2), 119–141.
Karniouchina, E. V. (2011). Impact of star and movie buzz on motion picture distribution and box office revenue. International Journal of Research in Marketing, 28(1), 62–74.
King, T. (2007). Does film criticism affect box office earnings? Evidence from movies released in the U.S. in 2003. Journal of Cultural Economics, 31(3), 171–186.
Krider, R. E., Li, T., Liu, Y., & Weinberg, C. B. (2005).
The lead-lag puzzle of demand and distribution: A graphical method applied to movies. Marketing Science, 24(4), 635–645.
Leenders, M. A. A. M., & Eliashberg, J. (2011). The antecedents and consequences of restrictive age-based ratings in the global motion picture industry. International Journal of Research in Marketing, 28(4), 367–377.
Litman, B. R. (1983). Predicting success of theatrical movies: An empirical study. Journal of Religion and Popular Culture, 16, 159–175.
Liu, Y. (2006). Word of mouth for movies: Its dynamics and impact on box office revenue. Journal of Marketing, 70, 74–89.
Moon, S., Bergey, P. K., & Iacobucci, D. (2010). Dynamic effects among movie ratings, movie revenues, and viewer satisfaction. Journal of Marketing, 74, 108–121.
MPAA (2013). State-by-State Film & Television Economic Contribution. http://www.mpaa.org/policy/state-by-state (last accessed on December 18, 2013).
Radas, S., & Shugan, S. M. (1998). Seasonal marketing and timing new product introductions. Journal of Marketing Research, 35, 296–315.
Ravid, A. S. (1999). Information, blockbusters, and stars: A study of the film industry. Journal of Business, 72(4), 463–492.
Sawhney, M. S., & Eliashberg, J. (1996). A parsimonious model for forecasting gross box-office revenues of motion pictures. Marketing Science, 15(2), 113–131.
Swami, S., Eliashberg, J., & Weinberg, C. B. (1999). SilverScreener: A modeling approach to movie screens management. Marketing Science, 18(3), 352–372.
van Bruggen, G. H., Lilien, G. L., & Kacker, M. (2002). Informants in organizational marketing research: Why use multiple informants and how to aggregate responses. Journal of Marketing Research, 39, 469–478.
Wallace, T. W., Seigerman, A., & Holbrook, M. B. (1993). The role of actors and actresses in the success of films: How much is a movie star worth? Journal of Cultural Economics, 17(1), 1–27.
Winer, R. S. (1998). From the Editor.
Journal of Marketing Research, 35, iii–v.

Intern. J. of Research in Marketing 31 (2014) 224–238

Full Length Article

Drivers of the cost of capital: The joint role of non-financial metrics

Alexander Himme a,⁎, Marc Fischer a,b,1

a University of Cologne, The Faculty of Management, Economics, and Social Sciences, Chair for Marketing and Market Research, Albertus-Magnus-Platz 1, D-50923 Cologne, Germany
b UTS Business School, Sydney, Australia

Article info

Article history: First received in 1 June 2011 and was under review for 9 months. Available online 2 December 2013.
Area Editor: Koen H. Pauwels
Guest Editor: Marnik G. Dekimpe
Keywords: Cost of capital; Stock market beta; Credit spreads; Brand value; Customer satisfaction; Corporate reputation

Abstract

Recent marketing studies suggest that non-financial metrics, such as customer satisfaction and brand value, help explain the variation in the cost of equity and the cost of debt. These studies typically focus on only one non-financial metric and one component of capital cost. In this study, we broaden the understanding of the relevance of non-financial metrics to the cost of capital. We investigate the joint role of customer satisfaction, brand value, and corporate reputation for stock market beta and credit ratings, which reflect variation in equity and debt risk premiums across firms. In addition to the joint direct influence of these metrics on capital cost, we also study their interaction effects. We develop a conceptual model to explain the effects on capital costs and test the resulting hypotheses in a broad sample of 344 firms from diverse industries using data from the 1991–2006 period.
Our results suggest that higher satisfaction ratings reduce both the cost of equity and cost of debt, whereas brand value and corporate reputation only show a negative direct association with the cost of debt. In addition, both measures moderate the effect of satisfaction on the cost of debt. Brand value attenuates the influence of satisfaction, whereas corporate reputation amplifies this effect.

© 2013 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +49 221 470 8679; fax: +49 221 470 8677.
E-mail addresses: himme@wiso.uni-koeln.de (A. Himme), marc.fischer@wiso.uni-koeln.de (M. Fischer).
1 Tel.: +49 221 470 8675; fax: +49 221 470 8677.
http://dx.doi.org/10.1016/j.ijresmar.2013.10.006

1. Introduction

The weighted average cost of capital (WACC) is an important financial metric relevant both to members of the financial community, such as institutional investors, and to the top management of (publicly listed) firms. Given a stream of future cash flows, a lower WACC indicates a higher present value of that stream. For management, a lower WACC constitutes lower hurdle rates for investment projects because investors require less return from the corresponding capital expenditures. WACC is composed of equity cost and debt cost. Providers of both types of capital demand a return for their investment. The larger the risk that they perceive to be associated with the investment, the higher the required return. The most important measure for equity holder risk is systematic risk, whereas credit ratings are the best signal for debt holders with respect to the default risk of a firm (Brealey, Myers, & Allen, 2007).

Systematic risk and default risk vary across companies and over time. The extant accounting/finance literature has thus addressed the natural question regarding the drivers of such risks (e.g., Beaver, Kettler, & Scholes, 1970; Blume, Lim, & MacKinlay, 1998). Most studies focus predominantly on “hard” financial metrics, such as operating margins, asset growth, leverage, and earnings variability, which are commonly documented in financial reports or can be derived from corporate or analyst disclosures. Researchers have found that several financial variables serve as drivers of the costs of equity and debt; however, they also acknowledge that their models explain only a fraction of the observed variance in capital cost (e.g., Elton, Gruber, Agrawal, & Mann, 2001). Several authors believe that so-called soft or intangible, non-financial metrics, such as management capabilities and marketing metrics, contribute to explaining the residual variance (e.g., Blume et al., 1998; Pinches & Mingo, 1973).

An emerging research stream on the interface between accounting/finance and marketing provides evidence for the value relevance of marketing metrics. In particular, recent efforts demonstrate that advertising expenditures, brand value, customer satisfaction, and corporate social responsibility possess the power to lower the cost of capital (for an overview, see Srinivasan & Hanssens, 2009). However, all these studies investigate only a single non-financial driver of capital cost. We believe that marketing-related non-financial metrics may offer different informational value for investors and creditors. As a result, such metrics may impact capital costs above and beyond each other. Measures such as customer satisfaction, brand value, and corporate reputation reflect competitive advantages from different domains. Satisfaction focuses on the customer, brand value focuses on the product, and corporate reputation emphasizes the firm. Therefore, these measures provide different signals to investors regarding the financial health of a firm that eventually influence the cost of debt and equity.

This study attempts to provide several contributions.
First, we investigate the joint role of the common non-financial measures of customer satisfaction, brand value, and corporate reputation in the cost of capital. We call these measures “non-financial” because they inform investors about the quality of marketing and management capabilities although they may be measured in monetary units (e.g., brand value). Specifically, we consider the popular and publicly available American Customer Satisfaction Index ratings, the financial brand values by Interbrand, and Fortune's corporate reputation scores. We develop a novel conceptual model of the informational value and signals contained in these metrics. From this model, we derive hypotheses regarding the incremental contribution of each metric in explaining the risk components of the cost of capital. In addition, we suggest potential moderating effects. Specifically, we suggest that brand value and corporate reputation moderate the influence of customer satisfaction on the cost of capital. Customer satisfaction plays this central informational role because it reflects customer experiences with past transactions (Fornell, Johnson, Anderson, Cha, & Bryant, 1996). Financial accounting is transaction-based and emphasizes historical earnings, which contain information with the highest certainty level (Kothari, 2001). Brand value and corporate reputation are less transaction-based and rather provide information on a firm's potential for future growth. Therefore, these information signals influence the interpretation and processing of satisfaction ratings by investors.

Second, we test the hypotheses in a broad sample of 344 firms from diverse industries in the 1991–2006 period. Our analysis accounts for the dynamics and the potential endogeneity of our focal non-financial metrics.
Including all three metrics together in the empirical models of equity cost and debt cost enables us to quantify the relative effect of each of the measures above and beyond each individual metric. For managers and investors, it is important to know whether satisfaction ratings, brand value, and corporate reputation scores provide additional distinct information. If not, investors and managers could simply substitute one non-financial metric for another to evaluate risk potential.

Third, given that the focal metrics are measured at different scales, it is difficult to compare their relative importance in driving the cost of capital. Hence, we transform the estimated coefficients into elasticity estimates. This study is among the first to calculate elasticities for the effects of non-financial metrics on the components of capital costs. These elasticities enable managers and investors to assess precisely how changes in non-financial metrics influence the cost of capital. In addition, the results enable us to conduct meta-analyses.

This paper is organized as follows. We briefly discuss the related literature in the next section. Subsequently, we provide details about the conceptualization of our key variables, which is important to assess their informational value. In Section 4, we derive our hypotheses. The next section includes the empirical study and the estimation results. We discuss these results in the final section and finish by presenting the conclusions and limitations of our study.

2. Literature background

In Table 1, we briefly review the related accounting, finance, and marketing literature. From the marketing literature, we include all studies that consider either systematic risk (equity cost) or default risk (debt cost) as a dependent variable and non-financial metrics as an independent variable.

2.1. Accounting and finance literature

The extant literature examines the effects of various factors on systematic risk and the cost of equity. Beaver et al.
(1970) provide one of the first contributions within this field of research. Their model relates systematic risk (measured by beta) to variables that describe the financial position of a firm. The authors find that greater systematic risk is related to lower dividend payout, higher growth, smaller asset size, and greater leverage. Subsequent studies (e.g., Hill & Stone, 1980) consider similar variables and support the results obtained by Beaver et al. (1970).

The research of Horrigan (1966) is among the first studies to analyze drivers of credit ratings that reflect the terms of debt financing. He considers different financial variables (e.g., total assets) to predict corporate bond ratings. Kaplan and Urwitz (1979) use an ordered probit model to predict bond ratings. The authors find, as an example, that total assets, the ratio of long-term debt to total assets, and the stock market beta are relevant. Blume et al. (1998) extend the approach by analyzing a panel of firms in the 1978–1995 period. These researchers introduce new variables, such as pretax interest coverage. We adopt the widely used models by Beaver et al. (1970) and Blume et al. (1998) as baseline specifications that we extend using our focal non-financial metrics.

Table 1
Sample of prior research on drivers of the cost of capital.

Columns: Author(s); Accounting/financial variables; Non-financial (marketing) metrics (Advertising, Brand value, Satisfaction, Reputation); Cost of capital (Debt, Equity).

Studies with focus on accounting/financial variables: Beaver, Kettler, and Scholes (1970); Blume, Lim, and MacKinlay (1998); Horrigan (1966); Kaplan and Urwitz (1979); Pinches and Mingo (1973).

Studies with focus on non-financial (marketing) variables: Agarwal and Berens (2009); Anderson and Mansi (2009); Bharadwaj, Tuli, and Bonfrer (2011); Fornell, Mithas, Morgeson, and Krishnan (2006); Gruca and Rego (2005); Johansson, Dimofte, and Mazvancheryl (2012); Luo, Homburg, and Wieseke (2010); Madden, Fehle, and Fournier (2006); McAlister, Srinivasan, and Kim (2007); Orlitzky and Benjamin (2001); Osinga, Leeflang, Srinivasan, and Wieringa (2011); Rego, Billett, and Morgan (2009); Singh, Faircloth, and Nejadmalayeri (2005); Tuli and Bharadwaj (2009); This study.

a Authors only investigate one dimension of corporate reputation, which is corporate social responsibility.
b Authors investigate one dimension of brand value, which is brand quality.

2.2. Marketing literature

Non-financial metrics provide current and forward-looking information above and beyond the “hard” information that is contained in a firm's financial statements. Five studies (see Table 1 again) focus on the effect of customer satisfaction on one component of the cost of capital (stock market beta or credit spreads). The results of these studies provide strong evidence that customer satisfaction reduces systematic risk (stock market beta) and leads to better credit ratings.

Compared with customer satisfaction, the effect of brand value on the components of capital costs is ambiguous. Madden et al. (2006) and Rego et al.
(2009) report that stronger brands improve credit ratings and lower systematic risk. By contrast, Bharadwaj et al. (2011) find a positive relationship between strong brand quality and systematic risk. Johansson et al. (2012) conclude that top brands as measured by financial brand value (Interbrand) did not show lower systematic risk than the market as a whole during the stock market downturn in the fall of 2008. However, brands scoring the highest on a consumer-based brand equity measure (EquiTrend) have lower systematic risk.

To the best of our knowledge, prior research on the effects of corporate reputation on capital cost is not available. Nevertheless, some studies focus on corporate social responsibility, although this dimension is only one of several that contribute to the overall reputation of a company. Orlitzky and Benjamin (2001) as well as Agarwal and Berens (2009) show that higher corporate social responsibility is associated with lower financial risk and lower capital costs in general.

Collectively, prior studies provide stronger and clearer evidence with regard to customer satisfaction compared with brand value and corporate social responsibility. The findings pertaining to the role of brands with respect to systematic risk are inconsistent. We are not aware of any studies that investigate the relationship between corporate reputation and the components of capital cost. As a consequence, we hope that our joint consideration of all three metrics contributes to providing insight into their role as drivers of the cost of capital.

3. Conceptualization and measurement of key variables

Customer satisfaction, brand value, and corporate reputation are multi-dimensional constructs that are not directly observable. Our hypotheses regarding their influence on capital cost are based on the distinct informational value that these metrics provide for investors. Different approaches have been suggested for measuring these constructs.
It is beyond the scope of this paper to discuss these approaches in detail, but it is important to understand the conceptual foundation of the specific measures that we use in this study. Following the idea of efficient capital markets, we selected measures that are publicly available, consistently measured over time, and widely appreciated by investors. Three measures fulfill these criteria: the American Customer Satisfaction Index (ACSI), Interbrand's brand value measure, and Fortune's corporate reputation index. Following the finance literature, we use credit spreads and the stock market beta to measure the risk components of capital cost that are responsible for company-specific differences in the cost of debt and equity.

3.1. Credit spread (default risk)

For debt holders, the default risk of a firm is the most relevant risk (Blume et al., 1998). Consistent with the literature (Brealey et al., 2007; Ederington, Yawitz, & Roberts, 1987), we measure default risk in terms of credit spreads that are closely related to the credit ratings that are issued by rating agencies, such as Standard & Poor's (S&P). S&P defines credit ratings as follows: “[…] ratings express the agency's opinion about the ability and willingness of an issuer … to meet its financial obligations in full and on time” (Standard & Poor's, 2011a, 3). Typically, analysts obtain information from published reports and financial statements as well as from interviews with the issuer's management. Analysts use such information to assess an entity's financial condition and risk potential. In fact, credit analysts use varying criteria in the rating process. Two risk components lead to the final credit rating. First, credit analysts assess the financial risk of a firm by evaluating “hard” financial metrics, such as capital structure or profitability.
Second, analysts assess a firm's business risk by considering the company's market position, cost efficiency, and management and marketing capabilities (Standard & Poor's, 2011a). Relevant factors may include market share, which reflects a firm's market position and its ability to sustain or increase shares; the strength of the brand; the degree of operating efficiency; and management's track record of product innovation and brand building, including the efficiency and effectiveness of marketing spending (Standard & Poor's, 2011b). All this information is non-financial by nature and relates to marketing capabilities.

3.2. Stock market beta (systematic risk)

Following the Capital Asset Pricing Model (e.g., Brealey et al., 2007), the covariance between firm i's stock return, $r_i$, and the market return, $r_m$, relative to the variance of the market return, $\sigma_{r_m}^2$, measures the systematic risk or stock market beta, respectively. Using the identity $\mathrm{Cov}(r_i, r_m) = \rho_{r_i,r_m}\,\sigma_{r_i}\,\sigma_{r_m}$, we can also write the following equation for beta:

$$\mathrm{BETA}_i = \frac{\rho_{r_i,r_m}\,\sigma_{r_i}\,\sigma_{r_m}}{\sigma_{r_m}^2} = \rho_{r_i,r_m}\,\frac{\sigma_{r_i}}{\sigma_{r_m}}, \qquad (1)$$

where $\rho_{r_i,r_m}$ measures the correlation between returns and $\sigma_{r_i}$ and $\sigma_{r_m}$ are the associated standard deviations. The correlation coefficient measures how closely firm returns follow the overall market trend. For example, insurance companies and banks depend quite heavily on the business cycle, whereas pharmaceutical companies are not greatly affected by market trends. Hence, $\rho_{r_i,r_m}$ rather reflects differences across industries but does not vary greatly over time for a single firm. However, the ratio of the standard deviations of firm and market returns, $\sigma_{r_i}/\sigma_{r_m}$, captures firm-specific differences that also vary over time. Thus, we focus on this ratio when developing our arguments for the associated hypotheses. Consistent with fair value theory, stock returns and earnings/cash flows are highly correlated (Brealey et al., 2007).
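As a numerical check of Eq. (1), the following minimal Python sketch computes beta once as covariance over market variance and once as correlation times the ratio of standard deviations. The return series and all function names are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Sample covariance with the usual n - 1 denominator.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def std(xs):
    # Sample standard deviation, i.e., the square root of the sample variance.
    return math.sqrt(covariance(xs, xs))

def beta_cov(firm, market):
    # Beta as Cov(r_i, r_m) divided by Var(r_m).
    return covariance(firm, market) / covariance(market, market)

def beta_corr(firm, market):
    # Beta as in Eq. (1): correlation times the ratio sigma_i / sigma_m.
    rho = covariance(firm, market) / (std(firm) * std(market))
    return rho * std(firm) / std(market)

# Hypothetical periodic returns (illustrative values only).
market = [0.010, -0.020, 0.015, 0.030, -0.010, 0.005]
firm = [0.018, -0.031, 0.020, 0.041, -0.022, 0.012]

print(beta_cov(firm, market), beta_corr(firm, market))
```

Both functions return the same value up to floating-point rounding, which illustrates that, holding the correlation fixed, beta moves one-for-one with the ratio of firm to market return volatility.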
A major driver of the variation of cash flows is their growth rate. Fischer, Leeflang, and Verhoef (2010) provide a formal proof for this fundamental relationship. Empirical research on capital markets consistently indicates that growth stocks are indeed associated with a higher beta (e.g., Fama & French, 1992). In the subsequent development of our hypotheses, we refer to expectations regarding firm growth rates relative to the market average.

3.3. Customer satisfaction

An individual firm's customer satisfaction represents its current customers' overall evaluation of their total purchase and consumption experiences (Fornell et al., 1996). Customer satisfaction is therefore an indicator of the loyalty and the willingness to pay of current customers and provides information related to revenues from the current customer base (Anderson, Fornell, & Lehmann, 1994). We use customer satisfaction ratings (ACSI) from the National Quality Research Center at the University of Michigan. Fornell et al. (1996) provide details on how ACSI is measured.

3.4. Brand value

Brand value measures the incremental discounted future cash flows accruing from a branded product compared with an identical but unbranded product (e.g., Johansson et al., 2012). We use brand value data from the Interbrand Group (see www.interbrand.com for a detailed description). One specific characteristic of the Interbrand approach is that it forecasts the current and future revenues that are specifically attributable to the branded products. The costs of conducting business (e.g., operating costs) and intangibles, such as patents and management strength, are subtracted to assess what portion of the earnings results from the brand.
Interbrand calculates a brand strength score to measure a brand's ability to secure ongoing customer demand (e.g., loyalty, retention) and thus sustain future earnings, translating branded earnings into net present value. In general, the Interbrand measure is used to reflect the revenue and profit growth potential of a firm as a result of brand strength.

3.5. Corporate reputation

Following Fombrun (1996, 72), we define corporate reputation as “a perceptual representation of a company's past actions and future prospects that describe the firm's overall appeal to all its key constituents when compared to other leading rivals”. Corporate reputation is likely the most complex metric among our focal metrics. In general, corporate reputation comprises the credibility and respect that an organization has among a broad set of constituents (e.g., employees, investors, regulators, customers). In this study, we follow Fortune's approach to measure corporate reputation (Fombrun & Shanley, 1990; Fortune, 2009). A firm's overall reputation score is built from ratings of eight dimensions: financial soundness; innovation; long-term investment; the ability to attract, develop, and retain talented people; product/service quality; the quality of management; social responsibility; and the wise use of corporate assets. The Fortune measure reflects the potential of a firm to increase its future revenues and operational efficiency.

4. Hypotheses

4.1. Conceptual model

Fig. 1 shows the conceptual model that underlies our hypotheses. Information regarding customer satisfaction, brand value, and corporate reputation is assumed to affect the cost of capital via the risk components beta and credit spreads. In this framework, we assume that customer satisfaction plays a prominent role in both beta and credit spreads. Investors prefer clear, certain, and unambiguous information regarding the earnings power of companies.
As a consequence, accounting earnings measurement rules place great emphasis on transaction-based revenue recognition (Kothari, 2001). Our three non-financial metrics contain information about both past transactions and potential future revenues and profits. Therefore, we suggest that these metrics have a direct effect on beta and credit spreads. For example, reduced customer satisfaction leads to customer defection in the long term, which in turn affects firm revenue. Brand value and corporate reputation provide additional information regarding the quality of products and services and the efficiency of firms. A decline in brand value and/or corporate reputation is also expected to influence customer purchase decisions and thus the revenue base of a firm. A decline in customer satisfaction, brand value, and corporate reputation would influence the expectations of a firm's stakeholders about current and future earnings. Because investors immediately react to changes in their expectations, we postulate a direct current effect of the non-financial metrics on the cost of capital. Because economic information often unfolds its full meaning only over time, we also assume carryover effects in our framework.

In addition, we postulate that brand value and corporate reputation moderate the influence of customer satisfaction on the cost of capital. By definition, customers report their satisfaction based on past transactions. Hence, customer satisfaction has the closest link to past transactions among the three non-financial metrics, which suggests that it plays a central role in our framework. Brand value and corporate reputation expand the information set of investors with additional signals regarding the future earnings potential of firms. These information signals influence the interpretation and processing of satisfaction ratings by investors and thus moderate the relationship between satisfaction and cost of capital.

4.2.
Information content of satisfaction, brand value, and corporate reputation The three non-financial metrics contain information to assist investors in assessing future risk. The information content of the three metrics overlaps, but each metric provides unique information.2 Because of this unique information, we believe that each metric has incremental value for investors in evaluating the potential risk of an investment in a specific company. Table 2 summarizes the differences in the information value of the non-financial metrics. Customer satisfaction signals existing customers' loyalty and willingness to pay (e.g., Anderson et al., 1994). Hence, investors make inferences about revenues and cash flows that stem from existing customers in the future. Brand value informs about the strength of a brand. This strength emanates to a great extent from the innovativeness and the potential to grow with existing and new products in existing and new markets (e.g., Barth, Clement, Foster, & Kasznik, 1998; Leone et al., 2006). In addition, brand value signals how familiar investors are with a firm (e.g., Rego et al., 2009). Corporate reputation provides additional non-market-based information that reflects within-firm characteristics (Fombrun & Shanley, 1990). Six of the eight dimensions of Fortune's reputation metric focus on internal firm processes. Financial soundness and the wise use of corporate assets provide signals about corporate cost management and operational efficiency (Fombrun, 1996). In addition, the metric informs about the quality of management and employees. 4.3. Hypotheses on credit spreads We begin this subsection by discussing the potential influence of customer satisfaction, brand value, and corporate reputation on credit spreads, followed by a discussion of the effects on beta. 
Finance research has shown that firms are less able to service their debt obligations when suffering from higher equity risk as measured by the stock market beta (e.g., Blume et al., 1998). A higher beta reflects more vulnerable and volatile cash flows relative to the market average. The default risk of a firm increases, and the risk premium (i.e., the credit spread) for corporate bonds consequently increases as well. Our empirical model to explain variation in credit spreads includes beta as an important predictor. In the following, we focus on developing hypotheses regarding the direct influence of satisfaction, brand value, and reputation on credit spread above and beyond the mediated influence via beta, which we discuss subsequently.

4.3.1. Customer satisfaction

Customer satisfaction positively influences the willingness to pay of customers while also reducing behaviors with negative economic consequences for firms, such as complaints (e.g., Anderson & Mansi, 2009). Satisfied customers are more likely to buy more of the same product, to buy additional products, and to make recommendations to other customers (e.g., Anderson, Fornell, & Mazvancheryl, 2004). Customer satisfaction ratings influence the credit rating process by providing information regarding the behavior of current customers that determines the size of firm profits (Anderson & Mansi, 2009). Firms with a higher level of expected cash flows ensure payment and are viewed as less risky borrowers. Thus, we propose the following hypothesis:

H1. Customer satisfaction is negatively associated with credit spreads.

2 We would like to thank the AE for stimulating the following discussion.

[Fig. 1. Conceptual model. Legend: direct influence; moderating influence.]

4.3.2.
Brand value Strong brands are a signal for excellent marketing, which credit rating agencies consider to be an important criterion in their rating process (Standard & Poor's, 2011b). In addition to customer satisfaction, brand value offers potential to grow the customer base by acquiring new customers. These growth opportunities provide signals about a firm's capability of generating additional sales in the future to help fulfill its liabilities. Moreover, it is well known that a significant proportion of a firm's market value lies in intangible, off-balance sheet assets, such as brands (e.g., Bahadir, Bharadwaj, & Srivastava, 2008). Brands may serve as an elementary security for debt holders in case of a firm's financial distress or even bankruptcy. Finally, brands facilitate the access to fresh capital from equity investors, which again reduces the likelihood of financial distress. For example, investor funds from Abu Dhabi and Qatar provided fresh equity capital to Daimler and Porsche in 2009 when the cash holdings were tight as a result of the deep financial crisis. Both investor funds cited the strength of the premium car brands among the reasons for their investment decisions. Hence, we hypothesize as follows: H2. Brand value is negatively associated with credit spreads. 4.3.3. Corporate reputation Corporate reputation leads to greater familiarity of debt holders and credit rating agencies with a firm (e.g., Fombrun, 1996). Note that corporate reputation offers unique information signals with regard to operational efficiency and the quality of management. Naturally, operational efficiency is an important driver of firm profitability and thus of a firm's ability to fulfill future liabilities (e.g., Singh et al., 2005). The quality of management and employees is also a positive signal for credit rating agencies because it reduces the likelihood of a situation of financial distress (e.g., Blume et al., 1998). 
Well-known companies are generally more successful in attracting and retaining better employees, who are in turn more productive (Luo & Bhattacharya, 2006). Thus, corporate reputation provides positive signals to credit rating agencies that assess the credit worthiness of a firm. Standard and Poor's (2011b) mentions the degree of operating efficiency and management's track record of product innovation among their top factors in providing credit ratings. Therefore, we hypothesize as follows:

H3. Corporate reputation is negatively associated with credit spreads.

Table 2
Main information content of focal non-financial metrics.
Constructs (measures), in column order: Customer satisfaction (ACSI); Brand value (Interbrand financial brand value); Corporate reputation (Fortune reputation index).
Focus labels, in the order printed: Customer focus; Product focus; Firm focus.
Main information content (signal), in row order:
- Loyalty of existing customers
- Willingness to pay by existing customers
- Potential to grow with product/services into new markets/customer segments
- Innovativeness
- Familiarity with product and firm
- Operational efficiency
- Quality of management and employees
Cell entries, in the order extracted: X X (X) (X) X X X; X (X) X X.
(X) means limited signaling content.

4.3.4. Relative strength of effects

All three non-financial metrics offer unique information value for the credit rating process. Therefore, we expect that each of these metrics influences credit spreads. However, considering their different information signals, we assume that these metrics exert effects of different strength. Corporate reputation is measured across a diverse range of dimensions involving a broad set of constituents (see Sections 3 and 4.2). In addition, corporate reputation particularly emphasizes the financial soundness and operational efficiency of a firm. Compared with brand value and customer satisfaction ratings, it may be more difficult to improve corporate reputation, as it requires advancements across several dimensions simultaneously. Credit rating agencies consider all these dimensions in their rating process and combine them with their evaluation of a firm's financial soundness. We therefore assume the relative responsiveness (elasticity) of credit spreads to be higher for corporate reputation than for brand value and customer satisfaction. Thus, we formulate the following hypothesis:

H4. Compared with brand value and customer satisfaction, corporate reputation has the strongest negative effect on credit spreads.

Consistent with our conceptual model in Fig. 1, we believe that both brand value and corporate reputation also moderate the role of customer satisfaction ratings in the credit rating process.

4.3.5. Moderating effect of brand value

Competing arguments regarding the moderating effect of brand value can be made: the uncertainty argument and the price premium argument.

4.3.5.1. The uncertainty argument. Signals from customer satisfaction ratings may be less informative for firms with strong brands. Customer satisfaction ratings are the results of past customer transactions, whereas brand value informs about the potential to grow a business with revenues from new customers. Hence, brand value informs about a second source of future revenues that is not fully reflected in satisfaction ratings from current customers. In addition, future growth from new customers has a side effect, as it increases uncertainty about the exact level of future cash flows, which is important to evaluate a firm's potential to service its debt obligations. This uncertainty also makes the information signal from customer satisfaction less powerful. Thus, we hypothesize as follows:

H5. Brand value attenuates the negative effect of customer satisfaction on credit spreads.

4.3.5.2.
The price-premium argument3. Higher brand value is not only viewed by consumers as a signal of higher prices but is also associated with higher prices (Bharadwaj et al., 2011). If current customers are satisfied, then potential new customers can be expected to show equally high customer satisfaction. As a result, new customers will be more likely to be willing to pay a price premium and contribute to generating increased revenues. The price premium information provided by high brand value can make the information signal from customer satisfaction for credit rating agencies more powerful. Thus, we propose the following hypothesis:

H5(alt). Brand value amplifies the negative effect of customer satisfaction on credit spreads.

3 We would like to thank the editor for suggesting this alternative hypothesis.

4.3.6. Moderating effect of corporate reputation

The perceived quality of a product or service is a primary driver of customer satisfaction (Fornell et al., 1996). As a result, companies invest heavily in implementing systems for customer relationship management or total quality management to increase customer satisfaction (Anderson et al., 1994; Mithas, Krishnan, & Fornell, 2005). Naturally, firms differ with regard to the efficiency of such investments (e.g., Anderson et al., 1994). In particular, corporate reputation provides information regarding the financial soundness and operational efficiency of a firm. Hence, the information signal from a firm's customer satisfaction rating becomes more valuable to credit rating agencies if the firm is known for its operational efficiency and financial soundness (Standard & Poor's, 2011b). This increased information value implies that such a firm generates its revenues from satisfied customers at lower costs. Accordingly, credit rating agencies may evaluate the same customer satisfaction for firms differently depending on their reputation for operational efficiency. Therefore, we hypothesize as follows:

H6.
Corporate reputation amplifies the negative effect of customer satisfaction on credit spreads. 4.4. Hypotheses on stock market beta 4.4.1. Customer satisfaction Several studies have shown that customer satisfaction enhances customer retention and therefore contributes to reducing the volatility and vulnerability of future cash flows (e.g., Gruca & Rego, 2005). Customers are thus more committed to the firm and less likely to switch to other firms. In periods of cyclical downturn, cash flows are cushioned from the downward trend. In upswing periods, the firm probably does not grow as fast as other companies that lost customers and now expand with the market. As a result, the systematic risk for firms with more satisfied customers is lower. Because prior research (e.g., Fornell et al., 2006; Tuli & Bharadwaj, 2009) provides strong support for this relationship, we do not repeat the arguments in detail here. H7. Customer satisfaction is negatively associated with systematic risk (stock market beta). 4.4.2. Brand value A strong brand acts as a barrier to competition and increases the probability of a customer continuing to purchase the brand (McAlister et al., 2007). The perceived value of a brand prevents customers from brand switching even if remaining with a certain brand requires paying a price premium (Rego et al., 2009). Higher brand value results from higher awareness, which in turn reduces consumer search costs and facilitates repeat purchases (Johansson et al., 2012). These forces strengthen the cash flow basis. Compared with average performers, this basis is less likely to erode for strong brands during an economic downturn when demand is shrinking. However, it has also been argued that the opposite is true. According to Bharadwaj et al. (2011), consumers view high brand quality as a signal of high prices. As consumers become more price-conscious in a downturn, strong brands may lose market share more rapidly than weaker but less expensive brands. 
As a result, cash flows and thus stock returns decline more rapidly than the market average.

During an economic upswing, strong brand value signals faster growth. Strong brand value is an indicator of a firm's cross-selling potential, and consumers of strong brands are more likely to increase purchases in the future (Rego et al., 2009). These benefits imply that in an upswing situation, firms with higher brand value may outperform the market average. Although faster growth in cash flows is positive, it has a side effect of involving a higher variance of cash flows (Fischer et al., 2010), i.e., increases in beta (see Eq. (1)). Consequently, arguments favoring both positive and negative relationships between brand value and beta exist. We leave this question to be solved by the empirical analysis.

4.4.3. Corporate reputation

Corporate reputation provides insight into the operational efficiency of a firm and the quality of its management and employees (see Table 2). During a market downturn, an efficient company is more flexible in managing costs compared with less efficient peers (e.g., Soteriou & Zenios, 1999). Firms with high-quality management enjoy stable relationships with their stakeholders, including employees and suppliers (Srivastava, Shervani, & Fahey, 1998). Thus, in economically difficult times, these firms can expect stakeholders to be more willing to cooperate in lowering costs (e.g., by reducing input prices or wages). All these benefits contribute to stabilizing revenues and costs during market downturns. As a consequence, the variance of firm returns is lower than the market average, leading to a lower beta.

Although an excellent reputation may insulate a firm's stock from market downturns, it may also contribute to outperforming competitors during market upswings.
Because of the superior management capabilities and operational efficiency, new markets are entered more rapidly and easily (Fombrun, 1996). In addition, good reputation increases the acceptance of new product introductions among consumers and channel partners (Kaufman, Jayachandran, & Rose, 2006). Higher operational efficiency also implies the potential of firms to increase their revenues at a lower cost relative to competitors. Hence, such companies offer larger growth potential. However, faster growth also leads to greater variance of returns relative to the market average, which in turn increases beta. Thus, arguments for both positive and negative relationships between corporate reputation and beta exist, which again leads to an empirical question. Table 3 summarizes the hypotheses for the main and interaction effects of the focal variables on credit spread and beta. We test these hypotheses subsequently. 5. Empirical study 5.1. Data and measures Because the database and especially the data alignment influence the model specification, we begin with this discussion before presenting the empirical model. We collect data from various databases, including the Center for Research in Security Prices (CRSP), Standard & Poor's COMPUSTAT database, Bloomberg, Interbrand, Financial World, Fortune, and the National Quality Research Center at the University of Michigan. The data cover the 1989–2006 period. However, we cannot observe all variables since 1989. 5.1.1. Credit spread (default risk) Credit ratings are obtained from the COMPUSTAT databases. COMPUSTAT offers Standard & Poor's long-term domestic issuer credit ratings, which measure a firm's capacity to meet its long-term financial commitments. The ratings range from AAA (highest credit standing) to D (firm is in default) on an ordinal scale. 
The credit spread for a rating class is calculated as the difference between the average yield (10-year maturity) of a bond portfolio including only bonds of that rating class and the yield of a risk-free bond (10-year US Treasury Bond). These data are provided by Bloomberg's database.

5.1.2. Stock market beta (systematic risk)

We follow the standard market-model approach (see, for example, McAlister et al., 2007) to estimate firm-specific betas. We use daily stock returns for each firm and the market return of the CRSP Value-Weighted-Return Index of all trading days of the specific year to obtain the estimates.

5.1.3. Customer satisfaction

ACSI produces a customer satisfaction score for each organization that ranges from 0 to 100. ACSI collects and releases data on an annual basis. We obtain the customer satisfaction scores from the fourth quarter of 1994 to the fourth quarter of 2006.

5.1.4. Brand value

Interbrand has been publishing the financial value of the Top 100 global brands since 1992. We obtain these data from publications in Financial World or Business Week or from the website of the Interbrand Group (www.interbrand.com). For the multi-brand firms in our data set, we aggregate the available values across individual brands.

5.1.5. Corporate reputation

Corporate reputation scores are obtained from Fortune's annual report on America's Most Admired Corporations. The reputation score ranges from 0 to 10. Responses are solicited primarily from company executives. Reputation data have been published since 1991.

5.1.6. Financial and other control variables

We follow previous studies in defining the financial variables. We measure growth by the growth rate in total assets (e.g., Beaver et al., 1970). Dividend payout is the ratio of cash dividends to earnings available to common stockholders (e.g., McAlister et al., 2007). We measure leverage by total senior securities (preferred stocks and bonds) divided by total assets (e.g., Beaver et al., 1970).
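As an illustration of the market-model approach named in Section 5.1.2, the following minimal Python sketch computes a beta as the slope of an ordinary least squares regression of stock returns on market returns. It is a sketch under stated assumptions, not the estimation code used in this study: the function name `market_model_beta` and the return series are hypothetical, and the returns are constructed so that the true slope is known.

```python
def market_model_beta(stock_returns, market_returns):
    """Slope of an OLS regression of stock returns on market returns."""
    if len(stock_returns) != len(market_returns) or len(stock_returns) < 2:
        raise ValueError("need two equally long series with at least 2 observations")
    n = len(stock_returns)
    mean_stock = sum(stock_returns) / n
    mean_market = sum(market_returns) / n
    # Sample covariance and variance share the same divisor, so it cancels.
    covariance = sum((s - mean_stock) * (m - mean_market)
                     for s, m in zip(stock_returns, market_returns))
    market_variance = sum((m - mean_market) ** 2 for m in market_returns)
    if market_variance == 0:
        raise ValueError("market returns must vary")
    return covariance / market_variance


# Hypothetical daily returns (illustrative only, not data from the study).
market = [0.010, -0.020, 0.015, 0.000, 0.005]
stock = [0.001 + 2.0 * m for m in market]  # constructed so that beta equals 2
beta = market_model_beta(stock, market)
```

In an application along the lines described above, the two input series would hold one year of daily returns for a firm and for the market index, giving one beta per firm and year.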
Liquidity is the “current ratio” of a firm (e.g., McAlister et al., 2007). Earnings variability is measured as the standard deviation of the earnings–price ratio (Beaver et al., 1970; McAlister et al., 2007). The log of total assets determines firm size (e.g., Rego et al., 2009). Pretax interest coverage is the operating income after depreciation plus interest expense divided by the interest expense (e.g., Blume et al., 1998). We compute the operating margin by dividing operating income before depreciation by firm sales (e.g., Blume et al., 1998). We measure competitive intensity with the C4 concentration index. This index cumulates the market shares of the four largest firms at the two-digit North American Industry Classification System (NAICS) level. Data on these financial control variables are obtained from COMPUSTAT. Appendix A provides the exact data definitions that are used in our analysis. 5.1.7. Data merging procedure Fig. B.1 in Appendix B summarizes the release dates of the financial and non-financial metrics that are supposed to drive credit spreads and beta. This figure shows that the release dates differ across years. This variation implies challenges for model building that are discussed in detail in Appendix B. 5.1.8. Descriptive and correlation statistics Table 4 displays the descriptive statistics of our sample. The mean beta is close to 1. The mean credit spread amounts to 1.24%. Both dependent variables demonstrate large variation. The average brand in our sample is worth $8337 million. We multiply the brand values with the firm-specific WACC by period to correct for the discounting applied by Interbrand (see Appendix C for details). As a result, we obtain average annual future branded earnings of $620.2 million. The mean satisfaction and reputation scores are 75.7 (scale 0–100) and 6.6 (scale 0–10), respectively. 
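The C4 concentration index defined in Section 5.1.6 cumulates the market shares of the four largest firms in an industry. A minimal Python sketch is given below. It assumes that market shares are derived from firm sales within one industry; the function name `c4_index` and the sales figures are hypothetical and serve only to illustrate the arithmetic.

```python
def c4_index(firm_sales):
    """Sum of the market shares of the four largest firms in one industry."""
    total_sales = sum(firm_sales)
    if total_sales <= 0:
        raise ValueError("total industry sales must be positive")
    # Convert sales to shares and rank them from largest to smallest.
    shares = sorted((sales / total_sales for sales in firm_sales), reverse=True)
    return sum(shares[:4])


# Hypothetical sales of six firms in one industry (illustrative only).
industry_sales = [300.0, 50.0, 200.0, 100.0, 150.0, 200.0]
c4 = c4_index(industry_sales)  # shares .30 + .20 + .20 + .15
```

Industries with fewer than four firms simply return the sum of all shares, i.e., a value of 1.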
The means and standard deviations for the control variables can be obtained from Table 4 and are comparable to those in other studies (e.g., McAlister et al., 2007).

Table 3
Overview of hypotheses.

| Independent variable | Credit spread: direct impact | Credit spread: relative strength | Credit spread: moderating the impact of customer satisfaction | Beta: direct impact |
| --- | --- | --- | --- | --- |
| Customer satisfaction | − (H1) | | − | − (H7) |
| Brand value | − (H2) | | +/− (H5; H5(alt)) | +/− |
| Corporate reputation | − (H3) | H4: Strongest impact by corporate reputation | − (H6) | +/− |

Table 4
Univariate statistics.

| Variables | N | Mean | Median | Std. dev. |
| --- | --- | --- | --- | --- |
| Beta | 4940 | .95 | .88 | .54 |
| Credit spread (in percent) | 3196 | 1.24 | .92 | .95 |
| Branded earnings ($m) | 1164 | 620.16 | 321.16 | 854.82 |
| Satisfaction (scale: 1–100) | 1893 | 75.71 | 76.01 | 6.47 |
| Reputation (scale: 1–10) | 1732 | 6.59 | 6.70 | 1.04 |
| Dividend payout | 3992 | 1.09 | .38 | 30.00 |
| Earnings variability | 3598 | .08 | .02 | .21 |
| Growth | 4785 | .11 | .07 | .29 |
| Leverage | 5171 | .44 | .46 | .18 |
| Ln asset size ($m) | 5130 | 9.08 | 9.24 | 1.80 |
| Liquidity | 4516 | 1.54 | 1.30 | .57 |
| Industry concentration | 6153 | .17 | .11 | .17 |
| Pretax interest coverage (in percent) | 4712 | 24.90 | 5.93 | 249.22 |
| Operating margin | 5053 | .18 | .17 | .35 |

Table 5 displays the correlation matrix. We do not note any excessive correlations that would indicate collinearity issues. Nevertheless, we check for potential collinearity issues subsequently. Interestingly, reputation scores are more strongly correlated with satisfaction than with brand value (.31 vs. .23). There is virtually no correlation between the WACC-corrected brand value and customer satisfaction (.12; p > .10). The correlations of brand value and corporate reputation with beta are negative but are not significant on a practical level. In contrast, the correlation of customer satisfaction with beta is strongly significant (−.26; p < .01).
All the correlations of the three non-financial metrics with credit spread are negative and highly significant (satisfaction: −.24; p < .01; brand value: −.16; p < .05; reputation: −.44; p < .01). Consistent with our theoretical arguments (see Table 2), brand value and reputation are positively and significantly correlated with growth (brand value: .12; p < .05; reputation: .20; p < .01), but we find no significant correlation of growth with customer satisfaction (−.02; p > .10).

5.2. Model

Building on the extant research on accounting, finance, and marketing (e.g., Beaver et al., 1970; Blume et al., 1998; McAlister et al., 2007), we specify the following two equations to explain the components of capital cost. We adopt the models by Beaver et al. (1970) and Blume et al. (1998) as baseline specification for Eqs. (3) and (2), respectively:

$$
\begin{aligned}
\mathit{SPREAD}_{it} ={}& \alpha_{0i} + \alpha_{1}\,\mathit{SPREAD}_{it-1} + \alpha_{2}\,\mathit{BV}_{it} + \alpha_{3}\,\mathit{SAT}_{it} + \alpha_{4}\,\mathit{REP}_{it} \\
&+ \alpha_{5}\,\mathit{SAT}_{it} \times \mathit{BV}_{it} + \alpha_{6}\,\mathit{SAT}_{it} \times \mathit{REP}_{it} + \alpha_{7}\,\mathit{BETA}_{it-1} + \alpha_{8}\,\mathit{INT}_{it-1} \\
&+ \alpha_{9}\,\mathit{OPER}_{it-1} + \alpha_{10}\,\mathit{LEV}_{it-1} + \alpha_{11}\ln(\mathit{ASSET}_{it-1}) + \alpha_{12}\,\mathit{CONC}_{it-1} \\
&+ \sum_{l=1}^{L-1} \alpha_{12+l}\,\mathit{ID}_{il} + \upsilon_{it},
\end{aligned}
\tag{2}
$$

$$
\begin{aligned}
\mathit{BETA}_{it} ={}& \beta_{0i} + \beta_{1}\,\mathit{BETA}_{it-1} + \beta_{2}\,\mathit{BV}_{it} + \beta_{3}\,\mathit{SAT}_{it} + \beta_{4}\,\mathit{REP}_{it} \\
&+ \beta_{5}\,\mathit{SAT}_{it} \times \mathit{BV}_{it} + \beta_{6}\,\mathit{SAT}_{it} \times \mathit{REP}_{it} + \beta_{7}\,\mathit{DIV}_{it-1} + \beta_{8}\,\mathit{GROWTH}_{it-1} \\
&+ \beta_{9}\,\mathit{LEV}_{it-1} + \beta_{10}\,\mathit{LIQ}_{it-1} + \beta_{11}\,\mathit{EVAR}_{it-1} + \beta_{12}\ln(\mathit{ASSET}_{it-1}) + \beta_{13}\,\mathit{CONC}_{it-1} \\
&+ \sum_{l=1}^{L-1} \beta_{13+l}\,\mathit{ID}_{il} + \varepsilon_{it},
\end{aligned}
\tag{3}
$$

with

$$
\upsilon_{it} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^{2}_{\upsilon}), \quad \alpha_{0i} = \alpha + \varphi_{i} \ \text{and} \ \varphi_{i} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^{2}_{\varphi}), \quad \operatorname{Cov}(\upsilon_{it}, \varphi_{i}) = 0,
$$

$$
\varepsilon_{it} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^{2}_{\varepsilon}), \quad \beta_{0i} = \beta + \kappa_{i} \ \text{and} \ \kappa_{i} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^{2}_{\kappa}), \quad \operatorname{Cov}(\varepsilon_{it}, \kappa_{i}) = 0,
$$

where

- $\mathit{SPREAD}_{it}$: Credit spread of firm i in period t
- $\mathit{BV}_{it}$: Brand value of firm i in period t
- $\mathit{SAT}_{it}$: Customer satisfaction rating of firm i in period t
- $\mathit{REP}_{it}$: Corporate reputation of firm i in period t
- $\mathit{BETA}_{it-1}$: Systematic risk of firm i in period t − 1
- $\mathit{INT}_{it-1}$: Pretax interest coverage of firm i in period t − 1
- $\mathit{OPER}_{it-1}$: Operating margin of firm i in period t − 1
- $\mathit{LEV}_{it-1}$: Leverage of firm i in period t − 1
- $\mathit{ASSET}_{it-1}$: Asset size of firm i in period t − 1
- $\mathit{DIV}_{it-1}$: Dividend payout of firm i in period t − 1
- $\mathit{GROWTH}_{it-1}$: Asset growth of firm i in period t − 1
- $\mathit{LIQ}_{it-1}$: Liquidity of firm i in period t − 1
- $\mathit{EVAR}_{it-1}$: Earnings variability of firm i in period t − 1
- $\mathit{CONC}_{it-1}$: Industry concentration (C4 index) relevant for firm i in period t − 1
- $\mathit{ID}_{il}$: Industry dummy for firm i and industry l (1 = firm i belongs to industry l; 0 otherwise)
- $\upsilon_{it}, \varepsilon_{it}, \varphi_{i}, \kappa_{i}$: Error terms
- $\sigma^{2}_{\upsilon}, \sigma^{2}_{\varphi}, \sigma^{2}_{\varepsilon}, \sigma^{2}_{\kappa}$: Variances
- $\alpha, \beta$: Parameters to be estimated
- i = 1, …, I (number of firms); t = 1, …, T (number of periods); l = 1, …, L (number of industries).

Beaver et al. (1970) and Blume et al. (1998) provide detailed explanations of the financial control variables and their expected effects, which we do not present here. We extend their models in several ways. First, we add our three focal non-financial metrics and their interactions with customer satisfaction to the baseline model. Second, we include the competitive intensity for each industry (McAlister et al., 2007) and time-invariant industry dummies to control for heterogeneity at the industry level. More highly concentrated industries signal opportunities for new competitors to enter the market and threaten the cash flow stream of incumbents. We therefore expect a positive effect on beta. For credit spreads, the effect of industry concentration is not uniform. Firms in concentrated industries have above-average profits. However, more highly concentrated industries signal opportunities for new competitors to enter the market and threaten the cash flow stream of incumbents. Hence, we do not make a sign prediction in this case. Third, we specify a random constant in our models to capture unobserved heterogeneity. This random constant controls for other firm-specific differences in systematic risk and credit spread that we do not observe but that may affect our estimates. Finally, we capture dynamic effects by including the lagged dependent variable. Moreover, the inclusion of lagged dependent variables also controls for inertia, persistence, and different initial conditions (Tuli & Bharadwaj, 2009).

5.3. Estimation

Note that both Eqs. (2) and (3) include a random constant to account for unobserved firm heterogeneity (Greene, 2008). The terms φi and κi denote firm-specific deviations of the heterogeneous constant from its mean (α; β) and are assumed to be drawn from a normal distribution with zero mean and constant variance. In addition, we acknowledge the possibility that our non-financial variables are endogenous. Changes in these metrics are a result of investments, which are in turn influenced by the cost of capital, creating potential simultaneity issues. We follow Fischer et al. (2010) and adopt their two-step estimation approach. In the first step, we obtain instrumental variables by regressing the respective endogenous variable on its instruments. We then estimate the models with the instrumental variables using the simulated maximum likelihood technique. The estimator is consistent and asymptotically normally distributed under the usual regularity conditions. Further details regarding the estimation procedure can be found in Appendix C.

Table 5
Correlations (number of observations in parentheses). Columns 1–6.

| Variable | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| 1. Beta | 1.00 (4940) | | | | | |
| 2. Credit spread | .09⁎⁎⁎ (3068) | 1.00 (3196) | | | | |
| 3. Brand value | −.09 (1044) | −.16⁎⁎ (781) | 1.00 (1164) | | | |
| 4. Satisfaction | −.26⁎⁎⁎ (1721) | −.24⁎⁎⁎ (1433) | .12 (438) | 1.00 (1893) | | |
| 5. Reputation | −.03⁎ (1639) | −.44⁎⁎⁎ (1311) | .23⁎⁎ (511) | .31⁎⁎⁎ (836) | 1.00 (1732) | |
| 6. Dividend payout | −.03 (3731) | −.00 (2543) | −.03 (900) | −.10⁎⁎⁎ (1370) | −.03 (1343) | 1.00 (3992) |
| 7. Earnings variability | .11⁎⁎⁎ (3475) | .24⁎⁎⁎ (2411) | .08⁎⁎ (798) | −.24⁎⁎⁎ (1401) | −.37⁎⁎⁎ (1301) | .01 (2743) |
| 8. Growth | .22⁎⁎⁎ (4541) | .02 (3049) | .12⁎⁎ (1085) | −.02 (1774) | .20⁎⁎⁎ (1674) | −.06⁎⁎⁎ (3736) |
| 9. Leverage | .17⁎ (4201) | .41⁎⁎⁎ (2733) | −.18⁎⁎ (943) | −.32⁎⁎⁎ (1560) | −.51⁎⁎⁎ (1449) | −.00 (3412) |
| 10. Asset size (log) | −.08⁎⁎⁎ (4781) | −.25⁎⁎⁎ (3218) | .40⁎⁎⁎ (1094) | −.10⁎⁎⁎ (1789) | .08⁎⁎⁎ (1680) | −.01 (3992) |
| 11. Liquidity | −.21⁎⁎⁎ (3807) | .05⁎⁎ (2713) | −.05 (980) | .10⁎⁎⁎ (1603) | .11⁎⁎⁎ (1447) | −.03⁎⁎⁎ (3135) |
| 12. Industry conc. | .02⁎⁎ (4896) | .05⁎ (3167) | .11⁎⁎ (1162) | −.35⁎⁎⁎ (1891) | −.10⁎⁎⁎ (1730) | −.01 (3980) |
| 13. Pretax int. coverage | .05⁎⁎ (4429) | −.08⁎⁎⁎ (2951) | .06⁎⁎ (1017) | .04 (1669) | .08⁎⁎⁎ (1557) | −.01 (3721) |
| 14. Operating margin | −.16⁎⁎⁎ (4744) | −.22⁎⁎⁎ (3047) | .26⁎⁎⁎ (1081) | −.09⁎⁎⁎ (1762) | .20⁎⁎⁎ (1653) | −.01 (3939) |

⁎ Statistical significance at 10% level (two-tailed).
⁎⁎ Statistical significance at 5% level (two-tailed).
⁎⁎⁎ Statistical significance at 1% level (two-tailed).

5.4. Estimation results

Tables 6 and 7 summarize the estimation results with respect to credit spread and beta, respectively. We show the results for models that include a varying set of predictors. The last column of Tables 6 and 7 displays the results when all predictors of Eqs. (2) and (3) are incorporated.
As mentioned above, this inclusion of all predictors creates the highest demand for joint observations, reducing the sample sizes significantly. Therefore, including the results for varying predictor sets helps us to better assess the stability of our results. Overall, the model fit is very good for this class of data. The pseudo-R² values, which are based on the squared correlation between predicted and actual values of the criterion variable, range from .59 to .69 for the beta regressions and from .57 to .76 for the credit spread regressions.

5.4.1. Results for credit spreads

We begin our discussion with the results of the first column in Table 6. Here, we estimate a model that includes only the financial and other control variables. All estimation results show the expected sign. Our results are largely consistent with the findings of prior studies (e.g., Anderson & Mansi, 2009; Blume et al., 1998). The next three columns display the results for models that include only one non-financial metric. We find a significant negative effect (p < .05) on credit spreads for all three focal variables.

The fifth column of Table 6 presents the findings for the model that simultaneously includes all three non-financial metrics. It is noteworthy that although we have a reduced sample size, the picture does not change substantially. The effects are again in the expected direction and reach significance. Hence, the results support all three hypotheses regarding the main effects: H1, H2, and H3. The findings show that customer satisfaction (−.005; p < .01), brand value (−4.9 × 10⁻⁴; p < .05), and corporate reputation (−.072; p < .05) contribute to reducing credit spreads. Because all variables are included simultaneously, the effects are indeed incremental with respect to one another. Moreover, the effect of beta is still positive and significant (.157; p < .05). Hence, non-financial metrics may also influence credit spreads indirectly via beta.
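The pseudo-R² described in Section 5.4, i.e., the squared correlation between predicted and actual values of the criterion variable, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `pseudo_r2` and the numbers are hypothetical and are not taken from our estimation output.

```python
def pseudo_r2(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long series with at least 2 observations")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cross = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    if ss_a == 0 or ss_p == 0:
        raise ValueError("both series must vary")
    return cross ** 2 / (ss_a * ss_p)


# Hypothetical actual and fitted values (illustrative only).
actual_values = [1.0, 2.0, 3.0, 4.0]
fitted_values = [1.0, 3.0, 2.0, 4.0]
r2 = pseudo_r2(actual_values, fitted_values)  # correlation .8, squared .64
```

Because the measure is a squared correlation, it lies between 0 and 1 and is unaffected by a linear rescaling of the predictions.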
The moderating effects of brand value (3.3 × 10⁻⁷; p < .10) and reputation (−5.8 × 10⁻⁴; p < .01) with regard to customer satisfaction are both significant and show the expected sign (sixth column of Table 6). The likelihood ratio test also supports this model extension (χ²(2) = 8.68; p < .05). Brand value significantly attenuates the negative effect of customer satisfaction on credit spreads (H5), whereas corporate reputation amplifies this negative effect (H6). To summarize, we find support for H1 to H3 as well as for H5 and H6. Customer satisfaction, brand value, and corporate reputation are found to significantly decrease credit spreads. We discuss the relative effects of the focal variables (H4) subsequently when we compute the elasticity estimates.

5.4.2. Results for stock market beta

We begin our discussion with the results of the first column in Table 7. Here, we estimate a model that includes only the financial and other control variables. All effects for the financial control variables show the expected sign. The results are similar to those in previous studies (Beaver et al., 1970; McAlister et al., 2007). The next three columns of Table 7 demonstrate the estimation results that are observed when we add only one non-financial metric at a time. We find a strong negative effect of customer satisfaction (−.010; p < .01), which strongly supports H7. We have suggested arguments for both positive and negative effects of brand value and corporate reputation on beta. In fact, we do not find a significant influence for these two non-financial metrics (brand value: −2.7 × 10⁻⁵, p > .05; reputation: −.025, p > .05), which may suggest that both lines of arguments are relevant and that the opposing effects offset one another. This conclusion does not change if we jointly estimate the effects for all three non-financial metrics (fifth column of Table 7).
The coefficient associated with customer satisfaction (−.008, p < .01) is still highly significant and of similar size despite the substantially smaller sample. The sixth column of Table 7 presents the results for a model in which we also consider potential moderating effects of brand value and corporate reputation with regard to customer satisfaction. However, these additional variables (brand value × satisfaction: −3.6 × 10⁻⁷, p > .10; reputation × satisfaction: .008; p > .10) are not significant and do not improve the model fit. The likelihood ratio test does not support this model extension (χ²(2) = 1.36; p > .10).

A. Himme, M. Fischer / Intern. J. of Research in Marketing 31 (2014) 224–238

Variable | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
7 | 1.00 (3598)
8 | −.09⁎⁎⁎ (3172) | 1.00 (4785)
9 | .07⁎⁎⁎ (3415) | −.11⁎⁎⁎ (4073) | 1.00 (4303)
10 | −.43⁎⁎ (3415) | −.12⁎⁎⁎ (4785) | .25⁎⁎⁎ (4302) | 1.00 (5130)
11 | −.04⁎⁎ (2989) | .25⁎⁎⁎ (3626) | −.38⁎⁎⁎ (3923) | .37⁎⁎⁎ (3923) | 1.00 (4059)
12 | −.02 (3590) | −.02 (4776) | −.15⁎⁎⁎ (4291) | −.19⁎⁎⁎ (5118) | .12⁎⁎⁎ (4053) | 1.00 (6153)
13 | −.02 (3082) | .04⁎⁎⁎ (4401) | −.18⁎⁎⁎ (4203) | −.03⁎⁎ (4710) | .30⁎⁎⁎ (3649) | −.01 (4700) | 1.00 (4712)
14 | −.12⁎⁎⁎ (3374) | −.28⁎⁎⁎ (4722) | .01 (4273) | .21⁎⁎⁎ (5052) | .14⁎⁎⁎ (3901) | −.05⁎⁎⁎ (5041) | .08⁎⁎⁎ (4654) | 1.00 (5053)

5.5. Robustness tests

We perform several tests to assess the robustness of our results. First, we note that the results in Tables 6 and 7 already indicate a relatively high level of stability of the estimated effects across several models with varying numbers of predictor variables. To assess the stability of our focal variables, we calculate the coefficient of variation, which is the standard deviation of estimates divided by their mean across different models. For the credit spread regressions, the values are .327 (satisfaction), .251 (brand value), and .197 (corporate reputation). For the beta regressions, we obtain values of .284 (satisfaction), .141 (brand value), and .137 (corporate reputation).
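The coefficient of variation described above can be illustrated with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: that the three satisfaction coefficients from the credit spread models enter as the rounded values −.007, −.005, and −.003 (as read from Table 6), and that the population standard deviation is used. The helper name `coefficient_of_variation` is hypothetical.

```python
import statistics


def coefficient_of_variation(estimates):
    """Standard deviation of the estimates divided by the absolute mean."""
    mean = statistics.mean(estimates)
    # Population standard deviation is an assumption of this sketch
    return statistics.pstdev(estimates) / abs(mean)


# Rounded satisfaction coefficients across the credit spread models (assumed inputs)
satisfaction_coefs = [-0.007, -0.005, -0.003]
cv = coefficient_of_variation(satisfaction_coefs)
print(round(cv, 3))
```

With these rounded inputs the result is about .327, which is in line with the value reported for satisfaction in the credit spread regressions. The other reported values cannot be reproduced exactly from rounded coefficients, so the sketch should be read as an illustration of the definition only.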
Overall, we observe a low relative variance in the coefficient estimates across different models with varying sample sizes and predictor variables. Distributions with a coefficient of variation below 1/3 are considered low variance (McKay, 1932). Second, we determine whether our results are subject to collinearity issues. Following the “artificial orthogonalization” procedure by Hill and Adkins (2008), we regress customer satisfaction on brand value as well as corporate reputation and compute the residuals, which are orthogonal to the regressors by definition. We substitute the residuals into Eqs. (2) and (3) separately for each interaction term and then re-estimate the equations. The results for our interaction terms are not significantly different from the results in Tables 6 and 7.⁴ Third, we estimate models that include changes in the stock market beta and credit spreads as dependent variables and changes in the non-financial metrics and accounting/finance metrics as independent variables (e.g., Bharadwaj et al., 2011). A model of such changes is appropriate to reduce potential problems associated with time-invariant unobservable factors (Tuli & Bharadwaj, 2009) and multicollinearity. However, a change regression reduces the power of tests, as sample size and variation are significantly reduced. We use these change models for every model in Tables 6 and 7. Overall, the results are similar to the results of the level models. Fourth, for the stock market beta, we test for differences in the effects for the non-financial metrics during economic upswings and downswings. We follow Lamey, Deleersnyder, Dekimpe, and Steenkamp (2007) to determine cyclical upturns versus downturns. We include the non-financial metrics moderated by upturn and downturn dummies in our empirical model. The results are consistent with our reasoning in that they show the expected sign.
However, the majority of the estimated coefficients are not significant because of the small sample size.⁵

⁴ However, we note that the coefficient of the substituted collinearity-free variable is consistently estimated, although the coefficients for the other variables are biased (Hill & Adkins, 2008).
⁵ The estimation results can be obtained from the authors upon request.

6. Implications and conclusions

6.1. Managerial and research implications

Our results provide interesting insights that should be useful for both managers and researchers. Conceptually, we provide a detailed distinction between the information content of the three non-financial metrics. Empirically, we find a strong effect of customer satisfaction on both stock market beta and credit spreads. The conclusions differ for brand value and corporate reputation. Neither metric appears to affect beta, but both directly influence credit spreads. In addition, our findings suggest that brand value and corporate reputation significantly moderate the effect of customer satisfaction on credit spreads. We conclude that customer satisfaction, brand value, and corporate reputation provide value-relevant information for investors, creditors, and credit rating agencies above and beyond each individual metric. Hence, stakeholders do not appear to substitute one metric for the other when assessing the various types of risks associated with the cost of capital. Strictly speaking, we note that this conclusion relies on information provided by the ACSI, Interbrand, and Fortune's reputation index. Given that all three non-financial metrics affect components of the cost of capital, we can use our estimates to assess their relative influence. Because the metrics are measured with different scales, we cannot compare coefficient estimates directly but must instead transform them into elasticities. Table 8 summarizes the elasticity estimates.
The table shows the relative (percentage) increase in the cost of equity, cost of debt, and WACC in reaction to a relative (percentage) increase in customer satisfaction ratings, brand value, and corporate reputation ratings. We take sample means for the risk-free rates, the capital structure, and so forth. We also differentiate between short-term and long-term effects. This separation has a direct practical implication because it enables investors to incorporate a time-varying discount factor into their valuation models. The long-term effect is obtained by dividing the short-term elasticity by 1 minus the carryover coefficient. Because we use a common carryover coefficient, differences in the long-term elasticities are driven by the differences in the short-term elasticities. The estimation results in Tables 6 and 7 show that dynamic effects are indeed present, as the carryover coefficient associated with the lagged dependent variable is always significantly different from zero (p < .05). To better assess the robustness of elasticity magnitudes, Table 8 shows both estimates based on the credit spread models with and without moderators. In the following discussion, we refer to the credit spread model that includes the moderating effects. Customer satisfaction shows highly significant (p < .01) short-term and long-term elasticity with respect to the cost of equity (short-term:

Table 6 Estimation results (Eq. (2)); dependent variable: credit spread. Variables Accounting/financial Exp.
sign Accounting/financial Accounting/financial control control variables only control variables + customer variables + brand value satisfaction Accounting/financial control variables + corporate reputation Accounting/financial Accounting/financial control variables + all control variables + all 3 non-financial metrics 3 non-financial metrics + moderators 3.290 (.618)⁎⁎⁎ .392 (.029)⁎⁎⁎ .448 (.134)aaa – – 3.642 (.822)⁎⁎⁎ .335 (.031)⁎⁎⁎ .381 (.121)aaa −.005 (.002)aaa −4.9×10−4 (2.7×10−4)aa −.072 (.040)aa – – Constant Estimated SD Lagged spreadb + Satisfactionc − Brand valuec − 3.412 (.277)⁎⁎⁎ .489 (.016)⁎⁎⁎ .541 (.119)aaa – – 3.661 (.341)⁎⁎⁎ .451 (.071)⁎⁎⁎ .437 (.082)aaa −.007 (.002)aaa – Reputationc Sat.c ∗ br. val.c Sat.c ∗ reput.c − +/− − – – – – – – 3.472 (.291)⁎⁎⁎ .355 (.081)⁎⁎⁎ .412 (.152)aaa – −3.8 × 10−4 (2.0×10−4)aa – – – Beta Pretax int. cov. + − .261 (.152)aa −.009 (.004)aaa .222 (.117)aa −.002 (4.2×10−4)aaa .247 (.068)aaa .157 (.092)aa −8.6×10−4 (2.0×10−4)aaa −.005 (9.2×10−4)aaa Oper. margin Leverage Ln asset size Ind. conc. Log L Pseudo R2 N − + − +/− .201 (.122)aa −.006 (3.4 10−4)aaa −1.471 (.375)aaa 1.321 (.190)aaa −.114 (.039)aaa −.055 (.128) −2067.63 .569 2586 3.716 (1.081)⁎⁎⁎ .342 (.046)⁎⁎⁎ .440 (.065)aaa −.003 (7.6 10−4)aaa −2.9 × 10−4 (2.1×10−4)a −.102 (.052)aa 3.3×10−7 (1.8×10−7)⁎ −5.8 × 10−4 (2.7×10−4)aaa .155 (.083)aa −.004 (.001)aaa −1.329 (.377)aaa 1.219 (.252)aaa −.133 (.071)aa .163 (.155) −1036.93 .621 1262 −1.277 (.317)aaa 1.246 (.415)aaa −.059 (.022)aaa .251 (.186) −353.41 .632 659 −.876 (.286)aaa .758 (.251)aaa −.056 (.027)aaa .115 (.131) −661.52 .665 1082 −.939 (.585)a .550 (.337)a −.014 (.010)a −.167 (.220) −36.97 .758 136 −.129 (.031)aaa – – −1.043 (.711)a .584 (.451)a −.021 (.007)aaa −.241 (.502) −40.59 .735 136 Note: The standard errors are reported in parentheses. Coefficients of industry dummies are not reported, but can be obtained from the authors. 
a Statistical significance at 10% level for variables with directional hypothesis (one-tailed). aa Statistical significance at 5% level for variables with directional hypothesis (one-tailed). aaa Statistical significance at 1% level for variables with directional hypothesis (one-tailed). b Denotes that the variable is instrumented by its lagged deviation from the firm-specific mean. c Denotes that the variables are instrumented by all exogenous variables of model (2) plus dividend payout, growth, liquidity, and earnings variability. ⁎ Statistical significance at 10% level (two-tailed). ⁎⁎ Statistical significance at 5% level (two-tailed). ⁎⁎⁎ Statistical significance at 1% level (two-tailed). Table 7 Estimation results (Eq. (3)); dependent variable: stock market beta. Variables Exp. sign Accounting/financial control variables only Accounting/financial control variables + customer satisfaction Accounting/financial control variables + brand value Accounting/financial control variables + corporate reputation Accounting/financial control variables + all 3 non-financial metrics Accounting/financial control variables + all 3 non-financial metrics + moderators 1.093 (.322)⁎⁎⁎ .341 (.019)⁎⁎⁎ .318 (.059)aaa −.010 (.004)aaa – .844 (.312)⁎⁎⁎ .244 (.017)⁎⁎⁎ .453 (.055)aaa – −2.7 × 10−5 (2.3 × 10−5) – – 1.236 (.172)⁎⁎⁎ .261 (.011)⁎⁎⁎ 1.200 (.602)⁎⁎ .242 (.023)⁎⁎⁎ 1.402 (.713)⁎⁎ .277 (.041)⁎⁎⁎ Constant Estimated SD Lagged betab Satisfactionc Brand valuec + − +/− .733 (.158)⁎⁎⁎ .233 (.022)⁎⁎⁎ .398 (.121)aaa – – Reputationc Sat.c ∗ br. val.c +/− +/− – – – – Sat.c ∗ reput.c Dividend pay. +/− − – −.022 (.011)aaa Growth Liquidity Earnings var. Leverage Ln asset size Ind. conc. 
Log L Pseudo R2 N + − + + − + – −1.3 × 10−4 (2.9 × 10−4) .324 (.075)aaa −.099 (.092) 4.021 (1.342)aaa .543 (.077)aaa −.039 (.008)aaa .144 (.058)aaa −622.49 .591 3204 −.038 (.058) −.241 (.154)a 3.215 (1.539)aaa .367 (.081)aaa −.020 (.015)a .288 (.077)aaa −236.52 .667 1303 – 1.9 × 10−4 (2.3 × 10−4) .341 (.188)aa −.062 (.126) 4.251 (1.374)aaa .122 (.097)a −.029 (.014)aaa .133 (.089)a −122.43 .613 807 .299 (.102)aaa – – −.025 (.016) – – −1.8 × 10−4 (2.4 × 10−4) .271 (.106)aaa −.051 (.083) 2.195 (1.302)aa .406 (.144)aaa −.025 (.021) .162 (.079)aaa −144.79 .657 1184 .217 (.130)aa −.008 (.004)aaa −1.9 × 10−5 (1.6 × 10−5) −.021 (.013) – – −.088 (.035)aaa .229 (.101)aaa −.005 (.003)aa −2.1 × 10−5 (3.5 × 10−5) −.032 (.029) 3.6 × 10−7 (3.8 × 10−7) .008 (.007) −.105 (.052)aaa .293 (.369) −.073 (.054)a 4.214 (2.272)aa .119 (.062)aa −.052 (.044) .027 (.151) −33.60 .682 145 .315 (.278) −.099 (.050)aa 3.871 (1.948)aaa .102 (.078)a −.031 (.038) .029 (.142) −32.92 .688 145

Note: The standard errors are reported in parentheses. Coefficients of industry dummies are not reported, but can be obtained from the authors.
a Statistical significance at 10% level for variables with directional hypothesis (one-tailed).
aa Statistical significance at 5% level for variables with directional hypothesis (one-tailed).
aaa Statistical significance at 1% level for variables with directional hypothesis (one-tailed).
b Denotes that the variable is instrumented by its lagged deviation from the firm-specific mean.
c Denotes that the variables are instrumented by all exogenous variables of model (3) plus interest coverage and operating margin.
⁎ Statistical significance at 10% level (two-tailed).
⁎⁎ Statistical significance at 5% level (two-tailed).
⁎⁎⁎ Statistical significance at 1% level (two-tailed).

Table 8 Comparison of estimated short-term and long-term elasticities.

Elasticity (percentage change) of …
Due to (percentage change) … | N (Eq. (2)) | N (Eq. (3)) | Short-term: Cost of equity | Short-term: Cost of debt | Short-term: WACC | Long-term: Cost of equity | Long-term: Cost of debt | Long-term: WACC

Joint impact WITHOUT MODERATORS (column 5 in Tables 6 and 7)
Brand value | 136 | 145 | −0.004 (0.003) | −0.041 (0.023)aa | −0.020 (0.012) | −0.005 (0.004) | −0.067 (0.039)aa | −0.032 (0.020)
Satisfaction | 136 | 145 | −0.205 (0.103)aaa | −0.065 (0.026)aaa | −0.143 (0.069)aaa | −0.263 (0.131)aaa | −0.110 (0.046)aaa | −0.193 (0.094)aaa
Corporate reputation | 136 | 145 | −0.050 (0.031) | −0.073 (0.041)aa | −0.058 (0.035) | −0.064 (0.040) | −0.120 (0.069)aa | −0.086 (0.053)

Joint impact WITH SIGNIFICANT MODERATORS for credit spread equation (column 6 in Table 6 and column 5 in Table 7)
Brand value | 136 | 145 | −0.004 (0.003) | −0.023 (0.016)a | −0.012 (0.008) | −0.005 (0.004) | −0.041 (0.028)a | −0.021 (0.015)
Satisfaction | 136 | 145 | −0.205 (0.103)aaa | −0.075 (0.019)aaa | −0.148 (0.058)aaa | −0.263 (0.131)aaa | −0.134 (0.034)aaa | −0.206 (0.088)aaa
Corporate reputation | 136 | 145 | −0.050 (0.031) | −0.137 (0.071)aa | −0.083 (0.048)⁎ | −0.064 (0.040) | −0.244 (0.128)aa | −0.137 (0.079)⁎

Note: The standard errors are reported in parentheses. For long-term elasticities, they are approximated with the delta method. Mean values for cost of equity: 8.85% (total sample) and 9.02% (sample with joint impact of non-financials). Mean values for cost of debt: 7.24% (total sample) and 7.66% (sample with joint impact of non-financials). Mean values for WACC: 8.37% (total sample) and 8.61% (sample with joint impact of non-financials).
a Statistical significance at 10% level for variables with directional hypothesis (one-tailed).
aa Statistical significance at 5% level for variables with directional hypothesis (one-tailed).
aaa Statistical significance at 1% level for variables with directional hypothesis (one-tailed).
⁎ Statistical significance at 10% level (two-tailed).
⁎⁎ Statistical significance at 5% level (two-tailed).
⁎⁎⁎ Statistical significance at 1% level (two-tailed).

−.205; long-term: −.263; p < .01).
An elasticity of approximately −.25 suggests decreasing marginal returns for the effect of satisfaction ratings on equity cost, but the magnitude appears to be quite substantial. Consistent with the estimation results, we do not estimate significant equity-cost elasticities with respect to brand value and corporate reputation (p > .05). The picture is different when we consider the effects on credit spreads. Note that we account for the indirect effects of non-financial metrics via beta and interaction effects with respect to customer satisfaction. We apply the delta method to obtain standard errors. Because of the nature of this Taylor-series approximation of a nonlinear random term, the resulting standard errors tend to be inflated. Interestingly, corporate reputation appears to exert the greatest influence on the cost of debt; its short-term elasticity is −.137, and its long-term elasticity is −.244 (for both, p < .05). These elasticities are substantial. The elasticities for customer satisfaction are −.075 (short-term) and −.134 (long-term) and are highly significant (p < .01). The values are considerably smaller for brand value (short-term: −.023; long-term: −.041; both p < .10). Table 8 also shows the ultimate effects of the non-financial metrics on WACC. Here, we consider the average capital structure. WACC elasticity is the highest with respect to satisfaction (short-term: −.148; long-term: −.206; both p < .01), followed by corporate reputation (short-term: −.083; long-term: −.137; both p < .10). The elasticity estimates with respect to brand value are rather small and are subject to a rather large approximated standard error. Table 8 shows that the credit spread elasticity is substantially higher for corporate reputation compared with that for brand value and satisfaction. However, the difference is statistically significant only with respect to brand value (p < .01). Therefore, we find partial support for H4.
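The relationship between the short-term and long-term cost-of-debt elasticities discussed above can be sketched numerically. The example assumes a carryover coefficient of .440, which is the lagged-spread estimate as read from the moderator model in Table 6, and it assumes zero covariance between the short-term elasticity and the carryover coefficient in the delta-method step, because the source does not report that covariance. The function names are hypothetical.

```python
import math


def long_term_elasticity(short_term: float, carryover: float) -> float:
    """Long-term effect: short-term elasticity divided by (1 - carryover)."""
    return short_term / (1.0 - carryover)


def delta_method_se(short_term, se_short, carryover, se_carry, cov=0.0):
    """First-order (delta method) standard error of e / (1 - c).

    cov is the covariance between the two estimates; zero is an assumption.
    """
    d_e = 1.0 / (1.0 - carryover)                # derivative w.r.t. elasticity
    d_c = short_term / (1.0 - carryover) ** 2    # derivative w.r.t. carryover
    variance = (d_e * se_short) ** 2 + (d_c * se_carry) ** 2 + 2 * d_e * d_c * cov
    return math.sqrt(variance)


carryover = 0.440  # assumed value, read from Table 6
lt_satisfaction = long_term_elasticity(-0.075, carryover)
lt_brand = long_term_elasticity(-0.023, carryover)
se_satisfaction = delta_method_se(-0.075, 0.019, carryover, 0.065)

print(round(lt_satisfaction, 3), round(lt_brand, 3))
```

With these rounded inputs, the long-term elasticities come out at about −.134 for satisfaction and −.041 for brand value, which agrees with the values quoted in the text. The standard error from the sketch need not match Table 8, since it ignores the covariance term and works from rounded estimates.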
The estimated elasticities underline that the non-financial metrics contain information that significantly drives the cost of capital. These results may serve as an input for a dialog among senior accounting, finance, and marketing executives regarding the role of non-financial metrics in the terms of financing. Our findings can assist in measuring the full financial impact of improving marketing metrics. Given that many studies show the contribution of marketing to financial success via an increase in the level of cash flows, these results complement the picture via the reduction in capital cost. As a consequence, reducing investments in marketing, customer relationship management or reputation-building activities may not only harm a firm's value with regard to future cash flows but also lead to higher hurdle rates for the required return on capital. Our analyses show that all three non-financial metrics are relevant drivers of capital cost components. We find an amazing stability of coefficients across models with different sample sizes and different predictor sets. We have discussed the joint and distinct informational content with regard to the non-financial metrics. The moderation analysis with regard to the cost of debt reveals the importance of how to frame the communication and disclosure of customer satisfaction ratings to stakeholders. Whereas strong brand value dilutes the information content of customer satisfaction, demonstrating operational efficiency makes the information derived from customer satisfaction much more valuable. Hence, marketing managers can more effectively communicate the effects of these intangibles and make a stronger case for marketing in the boardroom (Luo et al., 2010).

6.2. Limitations and future research

We need to mention a few limitations of this study that could stimulate future research.
Although the results are stable across model estimations with different sample sizes, the full model is estimated with a rather small sample. Analysis of a larger sample is likely to further increase the precision of estimates. Although we have good reasons for the choice of ACSI, Interbrand, and Fortune to measure customer satisfaction, brand value, and corporate reputation, it may be interesting to investigate the sensitivity of the results with respect to alternative measures of satisfaction, brand value, and corporate reputation. Our elasticity analysis shows that the effect of customer satisfaction ratings on capital cost, for example, is substantial. However, this finding does not imply that satisfaction ratings should be improved per se. Improvements require investments that are likely to be subject to decreasing returns to scale. Future research may determine the optimal investment level.

Appendix A.
Variable definitions

Variables | Definition | Measure | COMPUSTAT
Asset size | Log of total assets | Ln(total assets) | DATA 6 (total assets)
Dividend payout | Cash dividends/earnings | Cash dividends/available income | DATA 21 (cash dividend); DATA 20 (income available for common stockholders)
Earnings variability | Standard deviation of earnings/price ratio | Standard deviation of earnings/price ratio | DATA 20 (income before extraordinary items, adjusted for common stock equivalents); DATA 24 (price, close); DATA 25 (common shares outstanding)
Growth | Terminal total assets/initial assets | Ln(total assets_t/total assets_{t−1}) | DATA 6 (total assets)
Industry concentration | C4-concentration index | Sum of market shares of the top four firms in the industry defined at two digits of the NAICS | DATA 12 (sales)
Leverage | Total senior securities (preferred stocks and bonds)/total assets | Total senior securities/total assets | DATA 5 + DATA 9 + DATA 10 (total senior securities); DATA 6 (total assets)
Liquidity | Current ratio | Current assets/current liabilities | DATA 4 (current assets); DATA 5 (current liabilities)
Operating margin | Operating income before depreciation/sales | Operating income before depreciation/sales | DATA 13 (operating income before depreciation); DATA 12 (sales)
Pretax interest coverage | (Operating income after depreciation + interest expense)/interest expense | (Operating income after depreciation + interest expense)/interest expense | DATA 178 (operating income after depreciation); DATA 15 (interest expense)

Appendix B. Data merging procedure

Fig. B.1 in the appendix summarizes the release dates of the financial and non-financial metrics that are supposed to drive credit spreads and beta. The figure shows that release dates differ across years. It also shows the period over which the variables in our empirical models are measured. As is evident in this figure, the non-financial metrics are measured only once per year, whereas the release dates differ across years. Annual ACSI data are collected in different quarters for different industries. Interbrand tracks brands across the year and releases new brand values in the third quarter (usually September). Fortune releases data on corporate reputation in the first quarter (usually March).

Fig. B.1. Data alignment. [Timeline over Year 1 and Year 2 (Q1 to Q4) showing the measurement periods (financial controls, corporate reputation, brand value, stock market beta, credit spreads, ACSI 1 to ACSI 4) and the release dates (financial controls, corporate reputation (Fortune), brand value (Interbrand), customer satisfaction (ACSI)) against the time index t − 1, t in models (2) and (3). Note: bars denote the time span during which a construct is measured. a ACSI 1 etc. denote the quarters when data are released for different economic sectors.]

Our dependent variables credit spread and beta reflect the level at the end of each year. The financial control variables are reported by COMPUSTAT at the annual and quarterly levels, whereas quarterly information is typically unavailable at the beginning of the time series in 1989. The structure of the data has important implications for model building. First, because the focal metrics are measured only at the annual level, the year is the periodicity of our empirical analysis. Ideally, we would align our dependent variables with the release dates of the financial and non-financial information. For example, beta would be calculated over the 12 months preceding the release date of new satisfaction ratings for a firm. Unfortunately, this alignment is not possible with our data because the non-financial metrics are released at different points in time within a year. Although this lack of alignment may be a limitation of this database, we believe that any limitation would be offset by the new insights that we generate with respect to the joint role of the three non-financial metrics.
We also estimate models for credit spreads and beta that account for the different release dates of satisfaction. We find no evidence that the release date moderates the influence of satisfaction on capital costs. Second, we note that the financial control variables themselves could be influenced by the cost of capital, i.e., stock market beta and credit ratings. To avoid such reverse causality between the dependent variables and the financial control variables, we include financial controls of the previous period (t − 1) in our models. Consistent with our hypotheses, the announcement of the non-financial metrics has informational value for investors. Hence, we regress credit spread and beta in period t on the values announced in period t. To account for potential simultaneity issues, we use an instrumental variable estimation approach that we describe subsequently.

Appendix C. Details about the estimation procedure

C.1. Measurement-induced endogeneity of financial brand value

We acknowledge potential concerns with respect to a financial brand value measure such as Interbrand's measure, which involves discounting future brand-induced cash flows. We multiply each brand value by the WACC of the parent company. The result is a value for the average brand-induced future cash flow per annum. Note that dividing annual cash flows by WACC produces the net present value of this cash flow stream. Through this transformation, we remove the effect of capital cost on computing the financial brand value measure. To account for other sources of endogeneity, we still need to instrument this transformed brand value variable as we do for the other non-financial metrics.

C.2. Identification of endogenous non-financial metrics

All predictor variables in Eqs. (2) and (3) that are not endogenous serve as instruments.
Specifically, we consider brand value (i.e., the transformed variable), satisfaction, corporate reputation, and the interaction of brand value and corporate reputation with satisfaction to be endogenous in Eqs. (2) and (3). Pretax interest coverage, operating margin, leverage, asset size (log of total assets), asset growth, dividend payout, liquidity, earnings variability, and industry concentration, which are measured in period t − 1, are assumed to be exogenous. We extend this set of instruments by dividend payout, leverage, the log of total assets, interest coverage, liquidity, earnings variability, and operating margin, which are measured in period t − 2. These 7 two-period lagged instruments plus dividend payout, asset growth, liquidity, and earnings variability provide the overidentifying restrictions for Eq. (2), which includes 5 endogenous variables (brand value, customer satisfaction, corporate reputation, interaction of brand value and corporate reputation with satisfaction). In Eq. (3), the same 7 two-period lagged instruments plus interest coverage and operating margin provide the overidentifying restrictions for 5 endogenous variables. Conceptually, operating margin, industry concentration, and interest coverage in particular should provide the identification for the endogenous non-financial metrics. First, profitability is a determinant of financial brand value (Interbrand, 2012); thus operating margin should be a good instrument for brand value. Second, customers in less concentrated industries are typically more satisfied than those in heavily concentrated industries (Luo et al., 2010). Moreover, firms in heavily concentrated industries have higher market shares on average than firms in less concentrated industries. Rego, Morgan, and Fornell (2013) show a consistently significant negative relationship between market share and customer satisfaction.
Third, interest coverage is an indicator of the financial resources that are available for investments in corporate reputation.

C.3. Testing the exogeneity assumption

Although we have good conceptual reasons for our choice of instruments, we test for their exogeneity. We proceed as follows. First, we examine whether the predetermined predictor variables in Eqs. (2) and (3), which serve as instruments, are indeed exogenous. In Eq. (2), these variables are pretax interest coverage, operating margin, leverage, the log of total assets, and industry concentration, which are all measured in t − 1. In Eq. (3), these variables are dividend payout, asset growth, leverage, liquidity, earnings variability, the log of total assets, and industry concentration, which are all measured in t − 1. We apply the Durbin–Wu–Hausman test (Davidson & MacKinnon, 1993) to test the independence assumption for these regressors with respect to the error term. Specifically, we regress each instrument on all other exogenous variables and obtain the residuals from this regression. These residuals are then included in Eqs. (2) and (3), and the significance of the residuals' coefficients is tested. We find that none of these coefficients is significant (p > .10). Detailed results for the significance of each residual's coefficients can be obtained from the authors upon request. These test results suggest that the exogeneity assumption for our predetermined variables in Eqs. (2) and (3) cannot be rejected. Second, we apply the specification test (HT test) outlined by Hausman and Taylor (1981). We use this test to test for the exogeneity of all other instruments that are not included in the estimation equations. Given a set of exactly identified instruments, the HT test examines the exogeneity of additional overidentifying instruments. For Eq.
(2), the one-year and two-year lagged exogenous variables pretax interest coverage, operating margin, leverage, and asset size as well as the one-year lagged variable industry concentration provide the initial set of instruments. We test for the following set of overidentifying instruments: dividend payout, asset growth, liquidity, and earnings variability, which are measured in period t − 1. The test does not reject the exogeneity of these instruments (χ²(4) = 1.52, p = .82). For Eq. (3), the one-year and two-year lagged exogenous variables dividend payout, growth rate, leverage, liquidity, earnings variability, and the log of total assets as well as the one-year lagged variable industry concentration provide the initial set of instruments. We apply the HT test to the overidentifying instruments of pretax interest coverage and operating margin. Again, the test does not reject their exogeneity (χ²(2) = 1.48, p = .48).

C.4. Strength of instruments

Establishing evidence of the exogeneity of the instruments is necessary but not sufficient, as the instruments may be weak. We therefore check for the strength of our set of instruments. First-stage regressions for the endogenous non-financial metrics of brand value, satisfaction, and corporate reputation show satisfactory levels of R² and F-values. The mean R² is .35, and all F-values exceed the threshold of 10, thus indicating no issues with weak instruments (Stock, Wright, & Yogo, 2002). Detailed estimation results from the first-stage regressions can be obtained from the authors upon request. In addition, variables such as operating margin, industry concentration, and interest coverage show significant effects in the direction that is consistent with our conceptual arguments of identification (p < .01).

C.5. Identification of carryover effects

The lagged credit spread and beta in Eqs. (2) and (3) measure carryover effects. In the estimation, we use the lagged deviations of credit spread and beta from the firm-specific mean.
This procedure is necessary to isolate the true dynamic effects from the heterogeneity effects that are associated with the lagged dependent variable in a panel (e.g., Arellano, 2003; Fischer & Albers, 2010). In addition, the lagged beta is a predictor of the credit spread in Eq. (2). Because the equation system is recursive, we can use observed values for the lagged beta in Eq. (2).

References

Agarwal, M. K., & Berens, G. (2009). How corporate social performance influences financial performance: Cash flow and cost of capital. MSI Working Paper No. 09-001, Cambridge (Mass.).
Anderson, E. W., Fornell, C., & Lehmann, D. R. (1994). Customer satisfaction, market share, and profitability: Findings from Sweden. Journal of Marketing, 58(3), 53–66.
Anderson, E. W., Fornell, C., & Mazvancheryl, S. K. (2004). Customer satisfaction and shareholder value. Journal of Marketing, 68(4), 172–185.
Anderson, E. W., & Mansi, S. (2009). Does customer satisfaction matter to investors? Findings from the bond market. Journal of Marketing Research, 46(5), 703–714.
Arellano, M. (2003). Panel data econometrics. Oxford: Oxford University Press.
Bahadir, S. C., Bharadwaj, S. G., & Srivastava, R. K. (2008). Financial value of brands in mergers and acquisitions: Is value in the eye of the beholder? Journal of Marketing, 72(6), 49–64.
Barth, M. E., Clement, M. B., Foster, G., & Kasznik, R. (1998). Brand values and capital market valuation. Review of Accounting Studies, 3(1–2), 41–68.
Beaver, W., Kettler, P., & Scholes, M. (1970). The association between market-determined and accounting-determined risk measures. The Accounting Review, 45(4), 654–682.
Bharadwaj, S. G., Tuli, K. R., & Bonfrer, A. (2011). The impact of brand quality on shareholder wealth. Journal of Marketing, 75(5), 88–104.
Blume, M. E., Lim, F., & MacKinlay, C. A. (1998). The declining credit quality of U.S. corporate debt: Myth or reality?
Journal of Finance, 53(4), 1389–1413.
Brealey, R. A., Myers, S. C., & Allen, F. (2007). Principles of corporate finance (9th ed.). Boston, MA: McGraw-Hill/Irwin.
Davidson, R., & MacKinnon, J. G. (1993). Estimation and inference in econometrics. New York: Oxford University Press.
Ederington, L. H., Yawitz, J. B., & Roberts, B. E. (1987). The informational content of bond ratings. Journal of Financial Research, 10(3), 211–226.
Elton, E. J., Gruber, M. J., Agrawal, D., & Mann, C. (2001). Explaining the rate spread on corporate bonds. Journal of Finance, 56, 247–277.
Fama, E. F., & French, K. R. (1992). The cross-section of expected stock returns. Journal of Finance, 47(2), 427–465.
Fischer, M., & Albers, S. (2010). Patient- or physician-oriented marketing: What drives primary demand for prescription drugs? Journal of Marketing Research, 47(1), 103–121.
Fischer, M., Leeflang, P. S. H., & Verhoef, P. C. (2010). Drivers of peak-sales for pharmaceutical brands. Quantitative Marketing and Economics, 8(4), 429–460.
Fombrun, C. (1996). Reputation: Realizing value from the corporate image. Boston: Harvard Business School Press.
Fombrun, C. J., & Shanley, M. (1990). What's in a name: Reputation building and corporate strategy. Academy of Management Journal, 33(2), 233–258.
Fornell, C., Johnson, M. D., Anderson, E. W., Cha, J., & Bryant, B. E. (1996). The American Customer Satisfaction Index: Nature, purpose, and findings. Journal of Marketing, 60(4), 7–18.
Fornell, C., Mithas, S., Morgeson, F. V., III, & Krishnan, M. S. (2006). Customer satisfaction and stock prices: High returns, low risks. Journal of Marketing, 70(1), 3–14.
Fortune (2009). Most admired companies. http://money.cnn.com/magazines/fortune/mostadmired/2009/index.html (June 2010)
Greene, W. H. (2008). Econometric analysis (6th ed.). Upper Saddle River, NJ: Pearson.
Gruca, T. S., & Rego, L. L. (2005). Customer satisfaction, cash flow, and shareholder value. Journal of Marketing, 69(3), 115–130.
Hausman, J.
A., & Taylor, W. E. (1981). Panel data and unobservable effects. Econometrica, 49(November), 1377–1398.
Hill, R. C., & Adkins, L. C. (2008). Collinearity. In B. H. Baltagi (Ed.), Companion to theoretical econometrics (pp. 256–278). Chichester: Wiley.
Hill, N. C., & Stone, B. K. (1980). Accounting betas, systematic operating risk, and financial leverage: A risk-composition approach to the determinants of systematic risk. Journal of Financial and Quantitative Analysis, 15(3), 595–637.
Horrigan, J. O. (1966). The determination of long-term credit standing with financial ratios. Journal of Accounting Research, 4, 44–62.
Interbrand (2012). Best global brands: Applications and methodology. http://www.interbrand.com/de/best-global-brands/2012/best-global-brands-methodology.aspx (October 2012)
Johansson, J. K., Dimofte, C. V., & Mazvancheryl, S. K. (2012). The performance of global brands in the 2008 financial crisis: A test of two brand value measures. International Journal of Research in Marketing, 29(3), 235–245.
Kaplan, R. S., & Urwitz, G. (1979). Statistical models of bond ratings: A methodological inquiry. Journal of Business, 52(2), 231–261.
Kaufman, P., Jayachandran, S., & Rose, R. L. (2006). The role of relational embeddedness in retail buyers' selection of new products. Journal of Marketing Research, 43(4), 580–587.
Kothari, S. P. (2001). Capital markets research in accounting. Journal of Accounting and Economics, 31, 105–231.
Lamey, L., Deleersnyder, B., Dekimpe, M. G., & Steenkamp, J.-B. E. M. (2007). How business cycles contribute to private-label success: Evidence from the United States and Europe. Journal of Marketing, 71(1), 1–15.
Leone, R. P., Rao, V. R., Keller, K. L., Luo, A. M., McAlister, L., & Srivastava, R. (2006). Linking brand equity to customer equity. Journal of Service Research, 9(2), 125–138.
Luo, X., & Bhattacharya, C. B. (2006). Corporate social responsibility, customer satisfaction, and market value. Journal of Marketing, 70(October), 1–18.
Luo, X., Homburg, C., & Wieseke, J. (2010). Customer satisfaction, analyst stock recommendations, and firm value. Journal of Marketing Research, 47(6), 1041–1058.
Madden, T. J., Fehle, F., & Fournier, S. (2006). Brands matter: An empirical demonstration of the creation of shareholder value through branding. Journal of the Academy of Marketing Science, 34(2), 224–235.
McAlister, L., Srinivasan, R., & Kim, M. (2007). Advertising, research and development, and systematic risk of the firm. Journal of Marketing, 71(1), 35–48.
McKay, A. T. (1932). Distribution of the coefficient of variation and the extended t-distribution. Journal of the Royal Statistical Society, 95(4), 695–698.
Mithas, S., Krishnan, M. S., & Fornell, C. (2005). Why do customer relationship management applications affect customer satisfaction? Journal of Marketing, 69(4), 201–209.
Orlitzky, M., & Benjamin, J. D. (2001). Corporate social performance and firm risk: A meta-analytic review. Business & Society, 40(4), 369–396.
Osinga, E. C., Leeflang, P. S. H., Srinivasan, S., & Wieringa, J. E. (2011). Why do firms invest in consumer advertising with limited sales response? A shareholder perspective. Journal of Marketing, 75(1), 109–124.
Pinches, G. E., & Mingo, K. A. (1973). A multivariate analysis of industrial bond ratings. Journal of Finance, 28(1), 1–18.
Rego, L. L., Billett, M. T., & Morgan, N. A. (2009). Consumer based brand equity and firm risk. Journal of Marketing, 73(6), 47–60.
Rego, L. L., Morgan, N. A., & Fornell, C. (2013). Reexamining the market share-customer satisfaction relationship. Journal of Marketing, 77(5), 1–20.
Singh, M., Faircloth, S., & Nejadmalayeri, A. (2005). Capital market impact of product marketing strategy: Evidence from the relationship between advertising expenses and cost of capital. Journal of the Academy of Marketing Science, 33(4), 432–444.
Soteriou, A., & Zenios, S. A. (1999). Operations, quality, and profitability in the provision of banking services.
Management Science, 45(9), 1221–1238.
Srinivasan, S., & Hanssens, D. M. (2009). Marketing and firm value: Metrics, methods, findings, and future directions. Journal of Marketing Research, 46(June), 293–312.
Srivastava, R. K., Shervani, T. A., & Fahey, L. (1998). Market-based assets and shareholder value: A framework for analysis. Journal of Marketing, 62(1), 2–18.
Standard & Poor's (2011a). Criteria for rating the global branded nondurable consumer products industry. Available at: http://www.standardandpoors.com/prot/ratings/articles/en/eu/?articleType=HTML&assetID=1245303899075 (January 2011)
Standard & Poor's (2011b). What are credit ratings and how do they work? Available at: http://img.en25.com/Web/StandardandPoors/SP_CreditRatingsGuide.pdf (January 2012)
Stock, J. H., Wright, J. H., & Yogo, M. (2002). A survey of weak instruments and weak identification in generalized method of moments. Journal of Business and Economic Statistics, 20(4), 518–529.
Tuli, K. R., & Bharadwaj, S. G. (2009). Customer satisfaction and stock returns risk. Journal of Marketing, 73(6), 184–197.