What’s in a Name? The Effects of Prominence on Consumer Behavior at Search Engines∗

Michael R. Baye (Indiana University), Babur De los Santos (Indiana University), Matthijs R. Wildenbeest (Indiana University)

Preliminary and Incomplete, September 2012

Abstract

This paper examines the effects of location and brand name prominence on consumer clicking behavior at the major search engines. We find that both are important determinants of the links consumers click following a product search. Even though our results indicate that position on the search results page has a larger direct impact than brand name prominence, once we take into account that more prominent retail sites also get better positions, the effect of brand name is much larger than the effect of position. We show that our findings are robust to several alternative specifications.

Keywords: product search, internet, search engines, prominence

∗ Department of Business Economics and Public Policy, Kelley School of Business, Indiana University, Bloomington IN 47405; mbaye@indiana.edu, babur@indiana.edu, and mwildenb@indiana.edu. We thank seminar participants at Indiana University for valuable comments, and Susan Kayser, Joowon Kim, and Zachary Mays for research assistance. Funding for the data and research assistance related to this research was made possible by a grant from Google to Indiana University. The views expressed in this paper are those of the authors and do not necessarily reflect the views of Indiana University or Google.

1 Introduction

Recent theoretical work by Arbatskaya (2007), Armstrong, Vickers, and Zhou (2009), and Armstrong and Zhou (2011) emphasizes the role of prominence in consumer search models. The key to these models is that consumers visit more prominent firms first; in the context of online product search, this implies that more prominent online retailers receive more clicks than their less prominent rivals.
While these theoretical models are general enough to handle environments where prominence depends on a multiplicity of factors, empirical studies of online search behavior typically view screen location and prominence as synonymous. The idea is that, when confronted with a list of search results, users are more likely to click links in more prominent positions (those at the top of the list rather than in the middle, those on the first page rather than those on the fourth page, and so on). That firms with prominent locations get more clicks is hardly a matter of dispute; there is abundant evidence that screen positions matter in online search environments ranging from paid advertisements at search engines (see Ghose and Yang, 2009) to price comparison sites (see Brynjolfsson, Dick, and Smith, 2010).1

1 See also Rutz and Bucklin (2011), Drèze and Zufryden (2004), and Ansari and Mela (2003). Additionally, Armstrong, Vickers, and Zhou (2009) and De los Santos and Koulayev (2012) summarize a number of studies of “offline” environments (including the yellow pages, voting, and academic citations) that find that “location” significantly impacts choice.

Despite these important contributions, however, little is known about the empirical effects of prominence on the clicks that different firms receive from organic search results. For example, suppose a consumer queries Google or Bing with the phrase “buy product X online” and in the search results page an obscure website, say FlyByNight.com, is displayed in a higher position than Amazon.com. Blind application of existing empirical work (including some of our own) would lead to the prediction that most consumers would click on the link to FlyByNight.com, since it has the most prominent location on the list. This prediction, of course, ignores the fact that Amazon has the most prominent name (or equivalently in this context, is more prominent in terms of name recognition or brand awareness). In general, one would expect both page location and name recognition to affect a firm’s overall level of prominence. While the theoretical literature allows for this possibility, the existing empirical literature is silent on the relative importance of these differing types of prominence on consumer behavior. Our paper represents a first attempt to tackle these issues empirically.

One of the primary reasons existing empirical researchers focus on location rather than name recognition is measurement: it is relatively easy to measure objectively the prominence of a firm’s screen location, but this is not the case for the prominence of its name. In Section 2 we introduce a novel measure of the prominence of a firm’s name. This measure, which we construct using comScore Search Planner data, is based on the number of product searches at Google (or Bing) that include the firm’s name or URL in the search query. Retailers with more of these “name searches” are deemed to have more prominent names than retailers with fewer name searches. We also provide some evidence that this measure works as advertised in an education context: More “prominent” universities (based on U.S. News and World Report Rankings) enjoy more “name searches” than less prominent universities.

Section 3 provides preliminary regression results suggesting that location on the search engine result pages and name recognition are both important determinants of the links consumers click following a product search. These preliminary results, which are robust to the inclusion of a variety of other controls, suggest that location has a larger impact on clicks than name recognition, but that failure to account for name prominence increases the estimated effect of location by about 100 percent.
Section 4 shows that our main finding, that name prominence is an important determinant of clicks, is robust to alternative measures of location and name prominence. Section 4 also tackles the endogeneity problems inherent in this line of research. In particular, search engines base firms’ positions on the results pages, in part, on past clicks. In the above hypothetical, for example, as more and more consumers click Amazon.com, an optimizing search engine will demote FlyByNight.com’s position and elevate Amazon.com toward the top of the list. Our analysis indicates that when one controls for endogeneity, name recognition has a greater impact on clicks than location. We find that a firm moving from the “median” screen location to the “best” gets about 81 percent more clicks, whereas a firm that moves from the “median” to the “best” name recognition gets about 150 percent more clicks. In short, when one controls for endogeneity, both location and name recognition remain economically and statistically important determinants of clicks, but the quantitative impact of name prominence is greater than that of location prominence.

Section 5 presents residualized regression results to “unpack” the direct and indirect effects of improved brand recognition. Here, the direct effect refers to the fact that, holding position constant, searchers are more likely to click on links with more prominent names. The indirect effect refers to the fact that firms with more prominent names get more clicks, and so receive better positions on search engines. For example, suppose a firm moves from the “above median” brand name category to the ranks of the 20 percent with the most prominent names. In our data, the direct effect of this change (which holds position constant) is a 58 percent increase in clicks. But there is also an indirect effect, since the better brand (and accompanying increase in clicks) ultimately improves its position in the listings.
Based on the data, the total effect (the sum of the indirect and direct effects) of such an increase in recognition is a 224 percent increase in clicks. We conclude in Section 6.

2 Methodology and Data

Consider an online retailer interested in attracting traffic to its website. It might invest in advertising through traditional (TV, radio or print) or online channels in an attempt to enhance consumer awareness and generate visits to its website. It might spend large sums to build a customer-centric website with a broad array of product offerings and an efficient network of distribution centers to create customer loyalty and word-of-mouth (or word-of-blog) advertising. Or it might use some other strategy, or a blend of several strategies, to induce consumers to visit its website. A less costly option is to economize on such investments and simply “free ride” off of the traffic obtained through organic search results.

These and other investments by online retailers impact the prominence of their names. Unfortunately, many online retailers are privately held companies; those that are publicly traded do not systematically provide detailed information about the many investments they make to enhance the prominence of their online arms.

Our proposed measure of the prominence of an online retailer’s name in a given period is the number of name searches it obtained during that period. Here, name searches refers to search terms and phrases that include the retailer’s name or URL. For a variety of reasons, this is a potentially useful measure of prominence. First, it is measurable. For example, one can use comScore Search Planner data to calculate the number of name searches different retailers received in a given month. Second, the number of name searches captures the behavior of consumers who are acting on all of the many investments retailers made up to that point in time.
Essentially, a firm’s number of name searches embodies the cumulative branding efforts of the firm up to and including the instant a search is made. Third, the number of name searches in a given month measures the stock of name prominence. In contrast, even if data were available on the investments different retailers made on advertising and other brand-enhancing activities in a given month, such expenditures merely represent flows that incrementally change prominence relative to previous periods, and therefore would not be helpful in conducting cross-sectional analysis of the impact of prominence on consumer search behavior. Even with time-series data on advertising and other brand-enhancing expenditures, one would have to deal with the thorny issue of identifying the “stock” of brand equity from such “flow” data.

In order to examine the potential promise of our proposed measure of name prominence, we obtained U.S. News and World Report rankings of the top 100 universities and placed each university in one of five quintiles. Thus, the most prominent universities (which include the likes of Harvard, Stanford, Princeton, and the other usual suspects) were in the top quintile; Indiana University (a large public university) was in the 4th quintile. We then used comScore Search Planner data (employing a methodology analogous to that described in the next section) to determine the total number of name searches universities in each quintile received during February 2012. The results are displayed in Figure 1: more prominent universities (as measured by the U.S. News and World Report Rankings) received more name searches than less prominent universities; for example, universities in the top 20 received over 80,000 name searches, while those in the bottom 20 received about 22,000 name searches.
We note that Figure 1 is based on the raw data; thus, name searches are a useful way of measuring the prominence of different universities even though more prominent universities tend to have significantly smaller numbers of students than less prominent ones. The results in Figure 1 are therefore hardly driven by spurious correlation with numbers of students.

For these reasons, we believe the number of name searches is a promising measure of the prominence of the names of different retailers. We now provide a more detailed description of our data, and the methods we use to construct the measures of prominence and variables used in our econometric analysis.

2.1 Overview of the Data

Our analysis is based on three datasets. We assembled two of these using data from third-party providers that specialize in electronic commerce marketing data (comScore and Internet Retailer) and created the third dataset using a web scraper written in Java.

The comScore data consist of monthly Search Planner data for June 2012. These data are based on the online browsing activity of two million users in the U.S. They provide a list of search terms and phrases that users entered at search engines (e.g., Google and Bing), along with the number of “organic clicks” that different websites received based on the results pages generated by each search term.

The Internet Retailer data provide a list of the top 500 online retailers, along with the general category in which each retailer operates (e.g., apparel and accessories, housewares and home furnishings, computer and electronics, and so on). For each retailer, the data indicate whether it has a presence on Facebook or Twitter, the year in which the retailer began its online operations, and whether it is a “web only” retailer (as is the case with Amazon) or also has a brick-and-mortar presence (as is the case with Walmart).
Since our goal is to examine product search on general search engines, the first step in our analysis was to link the 500 retailers in the Internet Retailer data with the comScore Search Planner data. In particular, we examined the Search Planner data and identified all of the properties owned by these 500 retailers that were tracked by comScore. Owing to the fact that some retailers own and operate websites with different domain names (e.g., Amazon operates both the Amazon and Zappos sites; Sears operates the Sears site as well as Kenmore and Kmart sites), we ended up with a sample of 759 retail sites.

Next, we extracted the comScore site profile for each of these 759 retail websites. Each site profile provides a list of the search terms and phrases (separately, for Google and Bing) that resulted in organic clicks from the engine’s results page for that search term to a particular retail site. For each search term or phrase, it also indicates the total number of organic clicks each retail site received from results pages on each search engine. For example, across the 759 retail sites, comScore identified a total of 5,549 search terms and phrases that led consumers from Google to one or more of these 759 sites. Table 1 further illustrates by providing the top search terms and phrases on Google that led searchers to click on links at organic results pages directing them to Amazon.com, along with the total number of organic clicks on Google for each of these terms.

A striking feature of Table 1 is that a very large proportion of organic traffic from search engines to Amazon.com stems from name searches, that is, terms and phrases such as “Amazon,” “Amazon.com,” “Amazon books,” and “Amazon music” that searchers use as a substitute for directly navigating to the Amazon.com website.2 In contrast, other searches (like “Panasonic TV” or “Buy Levis jeans”) make up a far lower proportion of the top organic search terms that result in traffic from Google to Amazon.
As shown in Figure 1, other retail sites also receive a substantial proportion of traffic from organic name searches, with some sites receiving an even higher proportion of organic traffic through name searches than Amazon does.

It is hardly surprising that a searcher who includes “Amazon” in a search phrase would click on an “Amazon.com” link on a results page at a search engine; our analysis therefore focuses on searches that are not name searches. For purposes of our analysis, a name search is defined as a search term consisting of the retailer/site name and misspellings (e.g., “Amazon,” “www.amazon.com,” and “Amzon.com”) as well as phrases containing such terms (e.g., “Buy Camera at Amazon.com” or “buy TV at Amzon”).3 An examination of the 5,549 search terms and phrases leading searchers from Google to one of the 759 retailer sites revealed that 3,911 of them were not name searches. For each of the 759 websites, we computed the net number of organic clicks (defined as total organic clicks minus organic name clicks); in other words, the net number of organic clicks for retail site i is defined as the total number of organic clicks site i received when searchers used the 3,911 terms and phrases that did not include the name or URL of retail site i.

2 In industry parlance, name searches are sometimes called “navigational searches.” We use “name search” to emphasize that these searches contain the name or URL of a particular retailer or site.

3 Section 4 shows that our results are robust to a narrower definition of name searches that includes site names (Amazon.com) and misspellings (Amazn.com) but excludes phrases with such terms (“buy camera at amazon.com”).

Finally, we wrote a program in Java that queried Google and Bing in July 2012, captured the first five “results” pages for each of these 3,911 terms and phrases, and identified the position of the retailers in our sample in each of the resulting pages.
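The net organic clicks calculation defined above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the list of name variants, the simple substring rule used to flag name searches, and the click counts are hypothetical, and the classification actually applied to the 5,549 terms may have differed.

```python
# Hypothetical name variants (site name, URL fragments, misspellings) per retail site.
NAME_VARIANTS = {
    "amazon.com": ["amazon", "amzon", "amazn"],
}

def is_name_search(term):
    """Broad definition: the term contains some retailer's name, URL, or misspelling."""
    lowered = term.lower()
    return any(
        variant in lowered
        for variants in NAME_VARIANTS.values()
        for variant in variants
    )

def net_organic_clicks(click_records):
    """Sum organic clicks per site over search terms that are not name searches.

    click_records: iterable of (site, search_term, organic_clicks) tuples.
    """
    totals = {}
    for site, term, clicks in click_records:
        totals.setdefault(site, 0)  # keep sites whose net clicks are zero
        if not is_name_search(term):
            totals[site] += clicks
    return totals

# Made-up click counts, for illustration only.
records = [
    ("amazon.com", "Amazon", 1000),
    ("amazon.com", "buy TV at Amzon", 50),
    ("amazon.com", "Panasonic TV", 30),
    ("amazon.com", "Buy Levis jeans", 20),
]
noc = net_organic_clicks(records)
```

In this toy input only the last two terms count toward net organic clicks. The narrower definition in footnote 3 would require matching the whole term against the variants rather than searching for a substring.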
As discussed in more detail below, capturing these results pages permits us to control for the positions of different retailers on results pages for different queries, as well as to construct controls for ads on results pages that may influence searchers’ decisions to click on organic links.

2.2 Key Variables and Summary Statistics

Our goal is to examine how the prominence of different retail websites influences the organic links that users click following product searches. Our analysis is based on variables constructed from the datasets described above. Table 2 provides descriptive statistics for these variables.

Net Organic Clicks. Our dependent variable, net organic clicks, is formally defined as follows. Let $S$ denote the set of non-name search terms and phrases discussed above, and let $j \in S$ denote one of these 3,911 searches. Based on the comScore dataset, we know that retail site $i$ received a total of $N_{i,j}$ organic clicks from search phrase “$j$.” Thus $N_{i,j}$ measures the number of times searchers who used the search phrase “$j$” viewed the results pages and chose to click an organic link to firm $i$. Firm $i$’s number of net organic clicks is $NOC_i = \sum_{j \in S} N_{i,j}$. As shown in Table 2, the retail sites in our sample got on average close to 269 thousand net organic clicks on Google. There is substantial variation: while 42 retail sites got zero net organic clicks on Google, Amazon, the largest retailer in our sample, got more than 67 million net organic clicks in June 2012.

Position. Based on the data obtained by querying the Google and Bing search engines using the search terms and phrases in $S$, we observe the search results position of the organic links of each of the 759 firms for each keyword. Observed search results positions are numbered from 1 to $n$, with 1 being the best and $n$ being the worst. Since we only obtained data for the first five pages of search results, firms with positions outside of this range are not observed and are assigned a value of $n + 1$.
Letting $P_{i,j}$ denote the best search results position of firm $i$ for search term $j$, firm $i$’s average search results position across all search terms and phrases is $\sum_{j \in S} P_{i,j} / |S|$. Based on these average positions, we categorized the position variables as follows. Sites that never appeared on the first five pages (those with a mean screen position of $n + 1$) were placed in a position category labeled “worst.” While we do not know these sites’ actual average positions, we know these sites had the worst positions of any sites in our sample. Remaining retailers were assigned to position categories depending on the quintile in which their average screen position fell, ranked from “poor” for those retailers with an average screen position in the lowest quintile of those for which at least one screen position was observed, to “best” for the retailers with an average screen position in the highest quintile.4 Thus, our primary measure of screen position is a dummy variable that equals one when a site’s average screen position is in one of these categories (and is zero otherwise). Table 2 gives descriptive statistics for this variable.

Table 2 also gives descriptive statistics for the number of search terms for which a retailer appears on the first page on Google. On average, a retail site appeared close to 14 times on the first page. The standard deviation for this variable is substantial, which is also reflected by the large difference between the minimum and maximum for this variable in the sample: whereas the highest ranked retail site appeared on the first page for 2,194 search terms, a substantial number of firms did not show up on the first page for any of the search terms included in the analysis.

Name Recognition. As discussed above, our primary indicator of name recognition is a site’s number of name searches, and we create two measures: one based on name searches at Google and the other based on name searches at Bing.
The idea behind this measure is that users must be cognizant of a particular site in order to conduct a name search. Furthermore, the more favorably consumers view a particular site’s brand (which embodies not only the name of the site but characteristics of the retailer, including product breadth, quality, and reputation), the greater the number of name searches. Unfortunately, comScore only records search terms and phrases that exceed an unknown threshold, and as a result, 30% of the retail sites on Google and 55% of those on Bing had so few name searches that comScore did not record them. Retail sites in this category were coded as having the “worst” name recognition. The remaining firms (those for which the number of name searches is observed) were categorized into quintiles based on their total number of name searches, ranked from “poor” (number of name searches is in the lowest quintile of observed name searches) to “best” (top quintile of observed name searches).5 Thus, our primary measure of name recognition is a dummy variable that equals one when the site’s number of name searches is in one of these categories (and zero otherwise). Descriptive statistics can be found in Table 2.

4 The intermediate categories are “below median” (second lowest quintile), “median” (middle quintile), and “above median” (next to highest quintile). A retail site in the “below median” category has an average screen position that was in the second lowest quintile of those in which at least one screen position was observed, while a retail site in the “median” category has an average screen position that was in the middle quintile of those in which at least one screen position was observed. The “above median” category contains all those retail sites that have an average search results position that was in the next to highest quintile.

5 Similar to position categorization, the intermediate categories are “below median” (second lowest quintile), “median” (middle quintile), and “above median” (next to highest quintile). A retail site in the “below median” category has a number of name searches that is in the second lowest quintile of observed name searches, while a retail site in the “median” category has a number of name searches that is in the middle quintile of observed name searches. The “above median” category contains all those retail sites that have a number of name searches that is in the second highest quintile of observed name searches.

Ads. In addition to displaying organic results, search engines also display paid results. Paid results are essentially advertisements that expose users to the names of companies, and therefore may impact the prominence of a particular site. Based on the data collected by querying Google and Bing, we capture the number of times an ad was displayed on the first page of results. As shown in Table 2, the average number of search terms for which the retail site had an ad on the first page is 11.42, with a standard deviation close to 56. The retail site with the highest number of search phrases with an ad on page 1 has 1,069 ads.

Social Network Presence. Sites that have a presence on Facebook or Twitter get additional exposure to potential searchers, and this might affect a site’s prominence. For each retail site, we created a dummy that equals one if a firm has a presence on Facebook or Twitter, and zero otherwise. As shown in Table 2, 89 percent of sites had a social network presence.

Retailer Age. One might speculate that firms that have been involved in online sales for a longer period of time are more prominent than newer firms. We constructed a variable that represents the age of the retail site. Table 2 shows that the retailers in our sample have existed for an average of close to 13 years. The youngest is two years of age, while the oldest retail site has been around for 23 years.

Web Only.
To allow for differences in the prominence of pure-play online retailers (such as Amazon) and online retailers that also have a brick-and-mortar presence (such as Walmart), we constructed a dummy variable that equals one if the retail site is operated by a web-only company and zero otherwise. As shown in Table 2, 36 percent of the retail sites in our sample are web only.

Category Fixed Effects. Finally, to control for systematic differences in prominence across different categories of websites, each retail site was assigned to one of the 15 Internet Retailer categories. Table 2 shows these categories and the percentages of sites that belong to each category.

2.3 Other Methodological Considerations

As noted above, one limitation of the Search Planner data is that comScore only records the number of clicks a site obtains from a given search term or phrase when the number is above an unspecified threshold, $T$. An unobserved number of clicks (or missing search term or phrase) does not mean the site did not receive any clicks (or that no users navigated to the site using that search term or phrase). It simply means that the number of clicks was below this threshold.

Our methodology attempts to mitigate this concern in two ways. First, we use categories rather than levels to measure name recognition. Thus, while the actual number of name searches is not observed for some sites, such sites necessarily have fewer name searches than those for which comScore does record a number of searches. So long as sites within the “worst” name recognition category do not have significantly heterogeneous impacts on the dependent variable in our regressions, the ability to observe the number of name searches of these firms would not impact our analysis. Notice that the same is true for our use of categories for page position.
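The category coding described in this first point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `None` is used here as a stand-in for a value that is not observed (a name-search count below the comScore threshold, or a site that never appears in the first five results pages, which the text codes as $n + 1$), the quintile cut-offs use a simple equal-count rank rule, ties are broken arbitrarily, and the function name and sample values are hypothetical.

```python
LABELS = ["poor", "below median", "median", "above median", "best"]

def categorize(values, higher_is_better=True):
    """Map each site to 'worst' (unobserved) or to a quintile label among observed sites.

    values: dict mapping site -> observed number, or None when not observed.
    higher_is_better: True for name searches, False for average screen position.
    """
    categories = {site: "worst" for site, v in values.items() if v is None}
    # Order observed sites from least favorable to most favorable.
    observed = sorted(
        (site for site, v in values.items() if v is not None),
        key=lambda site: values[site],
        reverse=not higher_is_better,
    )
    n = len(observed)
    for rank, site in enumerate(observed):
        # Split the ordered ranks into five equal-count bins.
        categories[site] = LABELS[rank * 5 // n]
    return categories

# Hypothetical name-search counts; site "a" fell below the recording threshold.
name_cats = categorize({"a": None, "b": 10, "c": 20, "d": 30, "e": 40, "f": 50})

# Hypothetical average screen positions (lower is better); "z" was never observed.
pos_cats = categorize(
    {"x": 2.0, "u": 10.0, "v": 20.0, "w": 30.0, "y": 40.0, "z": None},
    higher_is_better=False,
)
```

The sketch makes the argument in the text concrete: the "worst" label is assigned without ever needing the unobserved count, so the coding is unaffected by where exactly the threshold lies.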
Second, we use quantile regressions to mitigate problems stemming from the fact that comScore does not disclose clicks for keywords or phrases in which the number of clicks is below an unspecified threshold, $T$. We initially explored two extreme approaches to this problem using OLS. In the first, we set $T$ equal to the minimum number of clicks observed in our sample and assumed that sites with unobserved clicks received $T - 1$ clicks. In the second, we assumed that sites with missing numbers of clicks received only 1 click. Results based on these two extremes were qualitatively similar (and similar to the results reported below), but the magnitudes of the estimates were sensitive to these two extremes. In contrast, the results reported below based on quantile regressions are robust to these two extremes.

3 Baseline Results

As a starting point, we consider a specification of the form

$$\ln NOC = a + \sum_{b=1}^{5} \alpha_b POS_b + \sum_{b=1}^{5} \beta_b NAME_b + \gamma X + \varepsilon$$

where $POS_b$ and $NAME_b$ are dummy variables corresponding to the position categories and name prominence, and $X$ is a vector of other potential controls. The omitted categories are the “worst” position and “worst” name recognition categories. The coefficients for the position and name dummies have the usual interpretation: A firm moving from position category $b'$ to position category $b''$ experiences a $[\exp(\alpha_{b''} - \alpha_{b'}) - 1] \times 100$ percentage change in net organic clicks; the same expression with $\beta$ in place of $\alpha$ applies to a move from name category $b'$ to name category $b''$.

Baseline quantile regression results are presented in Table 3. Specification (1) includes only controls for position. Consistent with other research, notice that sites with better positions on Google search results pages obtain significantly more net organic clicks through Google.

Specification (2) adds controls for name recognition. Three aspects of this specification are noteworthy.
First, adding controls for name recognition tends to reduce the magnitude of the estimated position effects, although all of the position coefficients remain statistically significant at the 5 percent level. Second, while the effect of name recognition on clicks is not perfectly monotonic in this simple specification, name recognition has a positive and statistically significant effect on clicks, and firms with better name recognition get more clicks than firms with the “worst” name recognition (the omitted category). Third, holding position constant, a firm that moves from the median to the best name recognition category enjoys a 215 percent increase in clicks.6 This is less than the 325 percent increase in clicks it would enjoy if its name recognition were held constant but it moved from the median to the best position category.7 On balance, these results suggest that name prominence is a potentially important determinant of clicks.

Column (3) adds controls for the number of advertisements for the site that appear on the first page of organic search results. Consistent with other research (see, for instance, Goldfarb and Tucker, 2011), the coefficient is positive and statistically significant: A one percent increase in the number of the firm’s ads on page one of search results increases net organic clicks by 0.25 percent. This is consistent with exposure to a paid ad bolstering the prominence of the firm’s name or link, and therefore increasing the firm’s number of net organic clicks. Adding this control, however, does little to the estimates of the position or name recognition coefficients.

If the effects of name recognition identified in specifications (2) and (3) were purely the result of omitted variables or spurious correlation with better measures of firms’ efforts to enhance brand awareness and clicks, the results would not be robust to the inclusion of other variables. Specification (4) adds a control for whether a firm has a social network presence.
While the coefficient is positive, it is not statistically significant. Likewise, one might speculate that the age of the retailer is a useful proxy for the prominence of the names of retail sites, since sites that have been around longer are more likely to be better known than newer sites. The results in column (5) indicate that this variable adds little explanatory power over and above our measure of name recognition. Finally, one might worry that the previous results are driven by differences between web-only and bricks-and-clicks retailers, or differences in the retail segments in which different retailers compete. Column (6) shows that the results are robust to these controls as well.

On balance, these results indicate that the prominence of a site’s name is an important determinant of whether those conducting product searches click through to that site. Comparisons of the position effects in columns (1) and (2), as well as columns (6) and (7), reveal that failure to account for the prominence of a site’s name leads to results that overstate the importance of position.

6 Since (exp(2.119 − 0.971) − 1) × 100 = 215.19.

7 Since (exp(3.925 − 2.479) − 1) × 100 = 324.61.

4 Name Recognition and Endogeneity

This section considers specifications that use alternative measures of name recognition and screen position, as well as specifications that control for potential endogeneity. As an additional robustness check, we replicate all of our analysis using a different dataset constructed from comScore data on Bing.
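As a quick check on the arithmetic behind the percentage effects quoted in Section 3, the following Python sketch applies the formula $[\exp(\text{difference in coefficients}) - 1] \times 100$ to the coefficient values given in footnotes 6 and 7. The function name is hypothetical; the four coefficients are the ones reported in those footnotes.

```python
import math

def percent_change(coef_to, coef_from):
    """Percentage change in clicks implied by moving between two dummy categories
    in a regression whose dependent variable is the log of net organic clicks."""
    return (math.exp(coef_to - coef_from) - 1.0) * 100.0

# Median -> best name recognition (coefficients from footnote 6).
name_effect = percent_change(2.119, 0.971)

# Median -> best position (coefficients from footnote 7).
position_effect = percent_change(3.925, 2.479)

print(round(name_effect, 2), round(position_effect, 2))
```

Because the dependent variable is in logs, the effect depends only on the difference between the two category coefficients, which is why the omitted "worst" category drops out of the comparison.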
4.1 Alternative Measures of Name Recognition and Position

As an initial matter, notice that if one used total organic clicks at Google (including organic name clicks) as the dependent variable, it would hardly be surprising to find that the number of organic name clicks at Google has a positive effect on total organic clicks; indeed, in such a specification, a one unit increase in the number of organic name clicks at Google would result in a one unit increase in total organic clicks at Google. We avoid this issue by using net organic clicks as the dependent variable.

Despite this, one might worry that unobserved factors influencing clicks on Google name searches also affect net organic clicks on Google. If this is the case, our measure of name recognition will be correlated with the error in the regression, potentially biasing the results. To mitigate this concern, we used an identical methodology to construct measures of name recognition based on name searches at Bing. Given differences in Google’s and Bing’s algorithms, and differences in their populations of users, it seems reasonable to assume that unobserved factors that influence net organic clicks at Google are independent of unobserved factors that influence name clicks on Bing. In any event, as shown in column (1) of Table 4, using this alternative measure does not qualitatively change our findings: firms with more prominent names on Bing obtain significantly more clicks on Google than less prominent firms.

One might also worry that our use of position categories somehow masks the importance of being included on the first page of search results. To account for this possibility, column (2) of Table 4 shows results based on an alternative measure of position that is simply the number of times a given retailer appears on the first page of organic search results.
As would be expected, the coefficient is positive and statistically significant: firms that appear more frequently on the first page of organic search results obtain significantly more clicks. But more to the point, our finding that sites with more prominent names receive significantly more organic clicks than less prominent sites continues to hold with this alternative specification.

As an additional robustness check, we also replicated our analysis using a narrower definition of a name search that only includes the name or URL of the site (e.g., it excludes “buy camera at amazon” but includes “amazon.com” and “amazon”). Under this definition, a name search is a pure navigational search: consumers using this query are merely attempting to navigate to a particular firm’s site. As shown in columns (3) and (4) of Table 4, the results are very similar to those based on the broader definition of a name search in columns (1) and (2).

4.2 Position and Ad Endogeneity

Importantly, all of the results described above are based on specifications in which position is treated as an exogenous variable. This would be appropriate if search result positions are predetermined at the time consumers make their click decisions, making it unnecessary to adjust for endogeneity. While one could argue this is true for our data (where we are using cross-sectional data from a single month rather than a time series of data), in practice search engines continually refine and optimize their algorithms in an attempt to present searchers with the most relevant organic results. From the standpoint of estimation, this means that a site’s position in the list of organic search results depends on past clicks. To further complicate matters, past organic clicks depend on past prominence (both location and name recognition).

An additional issue is the potential endogeneity of Ads, one of the controls in our specifications.
Search engines make money when users click on ads, and thus search engines take into account the likelihood that a firm’s ad will be clicked when deciding whether to display a firm’s ad. Again, a search engine’s decision to display an ad depends on the past click-through behavior of consumers.

Thus, there is reason to believe that two of the covariates in our analysis, Position on Google and Ads, may be endogenous. As discussed below, we use information about position and ads on Bing as instruments for position and ads on Google. Since Google’s position and ad decisions are based on past clicks at Google, not Bing, these instruments would seem to satisfy the requirements of valid instruments.

Two-stage Ordered Probit. To facilitate comparisons with earlier results, we first present results using a two-stage ordered probit approach. We use the categorized position dummies on Bing as instruments. In the first stage we estimate an ordered probit model using the categorized position variable as the dependent variable and the instruments and all exogenous covariates as explanatory variables. We then use the predicted values from the first stage to create another set of position categories, which we use to replace the position variables in the original model. In the second stage we estimate this equation by OLS. The results are displayed in column (1) of Table 5.

While the results are qualitatively similar to those ignoring endogeneity, controlling for endogeneity dramatically changes the quantitative importance of position relative to name prominence. In particular, notice first that the coefficients of the “poor” and “below median” position categories are not statistically different from zero; position matters only if it is at or better than the median position. In contrast, name recognition matters across all categories. Second, when one controls for endogeneity, name prominence has a greater impact on clicks than position prominence.
For example, a firm that moves from the median position category to the best position category obtains only 81.3 percent more clicks,8 whereas it enjoys a 150 percent increase in clicks if it moves from the median to the best name recognition category.9

8 Since (exp(1.420 − 0.825) − 1) × 100 = 81.30.
9 Since (exp(2.832 − 1.916) − 1) × 100 = 149.93.

Two-stage Least Squares. As an additional robustness check, we use the alternative measure of position discussed in Section 2 (the number of times a firm appears on the first page of search results). Since both endogenous variables, ln(Page 1) and ln(# of Ads on Page 1), are continuous, we use standard two-stage least squares. The instruments are the corresponding values of these variables on Bing, as well as the log of the average position on Bing’s search result pages. The results are displayed in column (2) of Table 5. Importantly, name recognition remains an important determinant of clicks in this specification. Additionally, comparing the results to those in column (2) of Table 4 (which is the identical specification not controlling for endogeneity) reveals that controlling for endogeneity increases the estimated effects of name recognition on clicks. Finally, the Sargan test does not reject the overidentifying restrictions (p-value of 0.35 in Table 5), which is consistent with the instruments being exogenous.

4.3 Other Robustness Checks

As a final robustness check we replicated all of the above analysis by constructing a dataset for Bing from the comScore data. As shown in Tables A1 through A4 in the Appendix, our main findings are robust to this alternative dataset.

5 Indirect Effects of Name Recognition

In the previous sections we have focused on the direct effect of brand name recognition on clicks.
However, if the search engines’ algorithms are such that the rankings of search results are partly determined by past clicking behavior, there is also an indirect effect of brand name recognition on clicks: retail sites with more prominent names will get more clicks, which results in better future positions. Figure 3 gives the average number of times a retail site appears on the first page for the different name recognition categories, and confirms that the retail sites with relatively high brand recognition also tend to have the better positions on the results pages of the major search engines. In fact, retail sites in the “best” name recognition category have by far the highest number of search terms for which they appear on a first result page, which suggests that the brand name effect is especially important for the most recognized sites.

Whereas the partial coefficients in a standard OLS regression would only capture the direct effect of brand name recognition on clicks, in this section we show how to distinguish between the direct and indirect effects by estimating the combined effect using a residualized regression approach. In a first step we regress the log of the number of search phrases for which a retail site appears on the first page on the name recognition categories for Bing. The residuals from this regression can be considered brand name-adjusted positions. In a second step we regress the log of net organic clicks on Google on the brand name-adjusted positions as well as on the brand name recognition categories for Bing. The estimated coefficients for the brand name variables now contain both the direct and indirect effects of brand name recognition on clicks.10

Column (1) of Table 6 gives estimation results for the first step. In line with the patterns observed in Figure 3, the effect of name recognition is most pronounced for the “best” brand name category, although the effects are significantly different from the omitted category for all categories. Column (3) of Table 6 gives the results for the second step, in which we regress clicks on the brand name-adjusted position variable and the name recognition categories, as well as some other control variables. A comparison of the parameter estimates with those for a standard OLS regression (see column (2) of Table 6) tells us that the impact of brand name more than doubles for the retail sites with the best brand names and almost doubles for those with less well-known brand names. Whereas the direct effect of moving from the “above median” brand name category to the category with the most prominent names is a 58 percent increase in net organic clicks on Google,11 the indirect effect adds another 166 percentage points, for a combined effect of 224 percent.12 Note that the relatively large magnitude of the brand name recognition coefficients does not change if we use IV to control for endogeneity, as shown in column (4) of Table 6.

The large indirect effects we find are visualized in Figure 4, which puts both the total effect obtained from the residualized regression coefficients and the direct effect from the standard OLS regression coefficients in one graph, together with the 95% confidence intervals for the estimated parameters. For retail sites in the three highest name recognition categories the total effect is significantly different from the direct effect, which strengthens our finding that the indirect effect of brand name through position plays an important role in generating clicks. Moreover, Figure 4 shows that if we ignore the indirect effect brand name has on clicks, the differences between brand name categories are not that large, especially for the retail sites that are most recognizable. Once we include the indirect effect, it becomes clear that it is especially important to be among the retailers in the “best” brand name category. So even though in the short run investments in brand name might not improve a retail site’s position on the search engine, in the long run search engines will re-optimize their rankings, which will lead to better positions and thus more clicks.

10 Formally, in the first step we estimate
$$\ln \mathit{POS} = \gamma + \sum_{b=1}^{5} \theta_b \mathit{NAME}_b + \nu.$$
The residuals $\nu$ can be interpreted as brand name-adjusted positions, i.e.,
$$\widehat{\ln \mathit{POS}} = \nu = \ln \mathit{POS} - \gamma - \sum_{b=1}^{5} \theta_b \mathit{NAME}_b.$$
In the second step we estimate
$$\ln \mathit{NOC} = c + \delta\, \widehat{\ln \mathit{POS}} + \sum_{b=1}^{5} \eta_b \mathit{NAME}_b + \omega X + \epsilon.$$
A comparison of this specification to the standard OLS regression
$$\ln \mathit{NOC} = a + \alpha \ln \mathit{POS} + \sum_{b=1}^{5} \beta_b \mathit{NAME}_b + \gamma X + \varepsilon$$
shows that the difference between the two specifications is $\alpha\left(\gamma + \sum_{b=1}^{5} \theta_b \mathit{NAME}_b\right)$. Adding this difference to the OLS specification gives, after reorganizing,
$$\ln \mathit{NOC} = a + \alpha\gamma + \alpha\, \widehat{\ln \mathit{POS}} + \sum_{b=1}^{5} (\beta_b + \alpha\theta_b) \mathit{NAME}_b + \gamma X + \varepsilon,$$
which is equivalent to the second step in the residualized regression approach if $c = a + \alpha\gamma$, $\delta = \alpha$, $\eta_b = \beta_b + \alpha\theta_b$, $\omega = \gamma$, and $\epsilon = \varepsilon$ (here $\gamma$ in $\alpha\gamma$ is the first-step constant, while $\gamma$ in $\gamma X$ and $\omega = \gamma$ is the coefficient on the controls $X$). Note that this means that by combining the OLS parameter estimates with the parameter estimates from the first step in the residualized regression approach, we can break down the coefficients from the second step into individual components (for instance, the total effect of brand name recognition on clicks $\eta_b$ can be split into a direct effect $\beta_b$ and an indirect effect $\alpha\theta_b$). This also shows that the OLS regression and the second step in the residualized regression are equivalent in terms of minimizing the sum of squared residuals. Indeed, Table 6 shows that the R-squared is identical across the two specifications, as are the parameter estimates and standard errors for the position variable and the covariates $X$.
11 Since (exp(1.789 − 1.330) − 1) × 100 = 58.24.
12 Since (exp(4.120 − 2.945) − 1) × 100 = 223.81.
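The decomposition in footnote 10 can be checked numerically. The Python sketch below is only an illustration on simulated data: the sample, the coefficient values used to generate it, and all variable names are our assumptions, not the comScore data or the paper's code. It runs the standard OLS regression, the first-step position regression, and the second-step residualized regression, and then confirms that the position coefficient is unchanged, that each total name effect equals the direct effect plus the indirect effect, and that the R-squared is the same in both specifications.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients and R-squared for y on X (X already contains a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    centered = y - y.mean()
    return coef, 1.0 - (resid @ resid) / (centered @ centered)


rng = np.random.default_rng(0)
n = 759  # matches the paper's sample size only for flavor; the data are simulated

# Six name recognition categories; category 0 ("worst") is the omitted one.
category = rng.integers(0, 6, size=n)
name = np.eye(6)[category][:, 1:]   # five NAME_b dummies
x = rng.normal(size=n)              # a single stand-in control X
const = np.ones((n, 1))

# Hypothetical data-generating process (all numbers are made up).
ln_pos = 0.5 + name @ np.array([0.2, 0.4, 0.6, 1.0, 2.0]) + rng.normal(size=n)
ln_noc = (1.0 + 0.9 * ln_pos + name @ np.array([1.0, 1.2, 1.3, 1.4, 1.8])
          + 0.3 * x + rng.normal(size=n))

# Standard OLS: ln NOC on ln POS, NAME_b, X.
b_ols, r2_ols = ols(ln_noc, np.hstack([const, ln_pos[:, None], name, x[:, None]]))
alpha, beta = b_ols[1], b_ols[2:7]

# First step: ln POS on NAME_b; residuals are brand name-adjusted positions.
first_X = np.hstack([const, name])
b_first, _ = ols(ln_pos, first_X)
theta = b_first[1:]
pos_resid = ln_pos - first_X @ b_first

# Second step: ln NOC on the residualized position, NAME_b, X.
b_res, r2_res = ols(ln_noc, np.hstack([const, pos_resid[:, None], name, x[:, None]]))
delta, eta = b_res[1], b_res[2:7]

print("position coefficient unchanged:", np.isclose(delta, alpha))
print("total = direct + indirect:     ", np.allclose(eta, beta + alpha * theta))
print("same R-squared:                ", np.isclose(r2_ols, r2_res))
```

The equalities hold exactly (up to floating-point error) because the residualized position is a linear combination of regressors already in the model, so both regressions span the same column space and differ only in how the fit is attributed to the name dummies.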
6 Conclusions

In this paper we have investigated the importance of brand name recognition in consumers’ decisions on which search engine results to click. Using a large dataset of search terms and phrases used at the major search engines, we have found that the effects of brand name recognition on clicks are substantial, even if we control for positions on the search engine result pages. Moreover, we have found that adding the indirect effect of brand on clicks through position roughly doubles the direct effect. We have shown that our findings are robust to several alternative specifications.

References

[1] Ansari, Asim and Carl Mela (2003). “E-Customization.” Journal of Marketing Research, 40 (2), pp. 131-146.
[2] Arbatskaya, Maria (2007). “Ordered Search.” The RAND Journal of Economics, 38 (1), pp. 119-126.
[3] Armstrong, Mark, John Vickers, and Jidong Zhou (2009). “Prominence and Consumer Search.” The RAND Journal of Economics, 40 (2), pp. 209-233.
[4] Armstrong, Mark and Jidong Zhou (2011). “Paying for Prominence.” Economic Journal, 121 (556), pp. F368-F395.
[5] Baye, Michael R. and John Morgan (2009). “Brand and Price Advertising in Online Markets.” Management Science, 55 (7), pp. 1139-1151.
[6] Brynjolfsson, Erik, Astrid Dick, and Michael D. Smith (2010). “A Nearly Perfect Market?” Quantitative Marketing and Economics, 8 (1), pp. 1-33.
[7] Brynjolfsson, Erik and Michael D. Smith (2001). “The Great Equalizer: The Role of Shopbots in Electronic Markets.” MIT Sloan Working Paper No. 4208-01.
[8] De los Santos, Babur and Sergei Koulayev (2012). “Optimizing Click-through in Online Rankings for Partially Anonymous Consumers.” Working paper.
[9] Drèze, Xavier and Fred Zufryden (2004). “Measurement of Online Visibility and its Impact on Internet Traffic.” Journal of Interactive Marketing, 18 (1), pp. 20-37.
[10] Ghose, Anindya and Sha Yang (2009). “An Empirical Analysis of Search Engine Advertising: Sponsored Search in Electronic Markets.” Management Science, 55 (10), pp. 1605-1622.
[11] Goldfarb, Avi and Catherine Tucker (2011). “Online Display Advertising: Targeting and Obtrusiveness.” Marketing Science, 30 (3), pp. 389-404.
[12] Rutz, Oliver J. and Randolph E. Bucklin (2011). “From Generic to Branded: A Model of Spillover Dynamics in Paid Search Advertising.” Forthcoming in the Journal of Marketing Research.

[Figure 1: Name Searches and University Rankings. Vertical axis: Avg. Name Searches (thousands). Horizontal axis: University Rank (1-20, 21-40, 41-60, 61-80, 81-100).]

[Figure 2: Organic Clicks and Name Searches that Lead Users to Retailers. Retailers shown: Amazon, Walmart, Apple, Target, Homedepot, Bestbuy, Lowes, Macys. Axis: Number of Searches (thousand). Series: Net Organic Searches, Name Searches.]

[Figure 3: Position and Name Recognition. Vertical axis: Avg. Number of Times on First Page. Horizontal axis: Name Recognition Category (Worst, Poor, Below Median, Median, Above Median, Best).]

[Figure 4: Total Effect of Name Recognition on Clicks. Vertical axis: Coefficients and 95% confidence interval. Horizontal axis: Name Recognition Category (Poor, Below Median, Median, Above Median, Best). Series: Direct Effect, Total Effect.]

Table 1: Search Terms that lead users to Amazon

Rank: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 30 40 50 60 70 80 90 100
Search Phrase (in rank order): amazon amazon.com www.amazon.com kindle 50 shades of grey ebay name name amazon books *** name last name kindle fire google name name book amzon amazonprime boss orange name amazon kindle amazon prime kim kardashian sex tape first name review fifty shades of grey amazo bosch name insanity workout metal detectors the night dad went to jail target name name books in order
# Organic Clicks at Google (in rank order): 9,006,624 1,207,598 247,793 146,406 116,359 106,062 105,408 100,357 96,165 90,934 76,201 72,856 66,754 63,852 56,407 55,281 55,272 55,259 54,028 51,240 29,303 22,909 18,878 15,634 14,419 13,502 12,125 11,301
All Search Terms: 79,244,892

Notes: comScore Search Planner data from June 2012.
Search phrases are ranked by the total number of organic clicks on Google.

Table 2: Descriptive Statistics (N=759)

| Variable | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|
| Net Organic Clicks on Google (thousands) | 268.99 | 2590.29 | 0 | 67497.97 |
| Number of Times on Page 1 | 13.90 | 86.58 | 1 | 2194 |
| Number of Ads on Page 1 | 11.42 | 55.91 | 1 | 1069 |
| Position on Google | | | | |
| Worst | 0.20 | 0.40 | 0 | 1 |
| Poor | 0.15 | 0.36 | 0 | 1 |
| Below Median | 0.17 | 0.37 | 0 | 1 |
| Median | 0.16 | 0.37 | 0 | 1 |
| Above Median | 0.16 | 0.37 | 0 | 1 |
| Best | 0.16 | 0.37 | 0 | 1 |
| Name Recognition on Google | | | | |
| Worst | 0.34 | 0.47 | 0 | 1 |
| Poor | 0.13 | 0.34 | 0 | 1 |
| Below Median | 0.13 | 0.34 | 0 | 1 |
| Median | 0.13 | 0.34 | 0 | 1 |
| Above Median | 0.13 | 0.34 | 0 | 1 |
| Best | 0.13 | 0.34 | 0 | 1 |
| Name Recognition on Bing | | | | |
| Worst | 0.58 | 0.49 | 0 | 1 |
| Poor | 0.08 | 0.28 | 0 | 1 |
| Below Median | 0.08 | 0.28 | 0 | 1 |
| Median | 0.08 | 0.28 | 0 | 1 |
| Above Median | 0.08 | 0.28 | 0 | 1 |
| Best | 0.08 | 0.28 | 0 | 1 |
| Social Network Presence | 0.89 | 0.31 | 0 | 1 |
| Retailer Age | 12.97 | 3.19 | 2 | 23 |
| Web Only Retailer | 0.36 | 0.48 | 0 | 1 |
| Category | | | | |
| Apparel/accessories | 0.30 | 0.46 | 0 | 1 |
| Automotive parts/accessories | 0.01 | 0.11 | 0 | 1 |
| Books/music/video | 0.04 | 0.20 | 0 | 1 |
| Computers/electronics | 0.09 | 0.28 | 0 | 1 |
| Flowers/gifts | 0.04 | 0.19 | 0 | 1 |
| Food/drug | 0.04 | 0.21 | 0 | 1 |
| Hardware/home improvement | 0.04 | 0.20 | 0 | 1 |
| Health/beauty | 0.05 | 0.22 | 0 | 1 |
| Housewares/home furnishings | 0.11 | 0.31 | 0 | 1 |
| Jewelry | 0.02 | 0.14 | 0 | 1 |
| Mass merchant | 0.07 | 0.26 | 0 | 1 |
| Office supplies | 0.03 | 0.17 | 0 | 1 |
| Specialty/non-apparel | 0.08 | 0.28 | 0 | 1 |
| Sporting goods | 0.05 | 0.22 | 0 | 1 |
| Toys/hobbies | 0.03 | 0.16 | 0 | 1 |

Table 3: Baseline Model
Dependent Variable: ln(Net Organic Clicks on Google)

| Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Position on Google | | | | | | | |
| Poor | 1.307 (0.226)* | 1.328 (0.235)* | 1.279 (0.230)* | 1.258 (0.234)* | 1.263 (0.238)* | 1.221 (0.204)* | 1.400 (0.212)* |
| Below Median | 2.009 (0.220)* | 1.846 (0.232)* | 1.718 (0.229)* | 1.707 (0.232)* | 1.715 (0.237)* | 1.761 (0.203)* | 1.862 (0.208)* |
| Median | 3.022 (0.223)* | 2.479 (0.246)* | 2.264 (0.250)* | 2.277 (0.254)* | 2.281 (0.259)* | 2.234 (0.226)* | 2.666 (0.225)* |
| Above Median | 3.886 (0.223)* | 2.993 (0.262)* | 2.773 (0.276)* | 2.811 (0.280)* | 2.828 (0.287)* | 2.881 (0.250)* | 3.412 (0.242)* |
| Best | 5.543 (0.223)* | 3.925 (0.300)* | 3.557 (0.344)* | 3.499 (0.349)* | 3.502 (0.358)* | 3.475 (0.311)* | 4.398 (0.290)* |
| Name Recognition on Google | | | | | | | |
| Poor | | 0.691 (0.224)* | 0.729 (0.219)* | 0.703 (0.223)* | 0.731 (0.228)* | 0.738 (0.201)* | |
| Below Median | | 1.094 (0.237)* | 0.957 (0.234)* | 0.942 (0.237)* | 0.967 (0.242)* | 1.042 (0.211)* | |
| Median | | 0.971 (0.239)* | 0.896 (0.234)* | 0.915 (0.237)* | 0.944 (0.242)* | 1.186 (0.220)* | |
| Above Median | | 1.045 (0.259)* | 0.810 (0.256)* | 0.894 (0.259)* | 0.928 (0.264)* | 1.183 (0.245)* | |
| Best | | 2.119 (0.299)* | 1.858 (0.297)* | 1.958 (0.301)* | 1.980 (0.307)* | 1.952 (0.282)* | |
| ln(# Ads on Page 1) | | | 0.253 (0.080)* | 0.253 (0.081)* | 0.249 (0.083)* | 0.289 (0.072)* | 0.379 (0.073)* |
| Social Network Presence | | | | 0.268 (0.221) | 0.243 (0.225) | 0.564 (0.197)* | 0.480 (0.206)* |
| Retailer Age | | | | | 0.006 (0.022) | 0.018 (0.020) | 0.016 (0.021) |
| Web Only Retailer | | | | | | 0.107 (0.145) | -0.206 (0.144) |
| Constant | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Retailer Category Indicators | No | No | No | No | No | Yes | Yes |
| Observations | 759 | 759 | 759 | 759 | 759 | 759 | 759 |
| Pseudo R2 | 0.35 | 0.38 | 0.39 | 0.39 | 0.39 | 0.43 | 0.40 |

Notes: Standard errors in parentheses. *significant at 5%.

Table 4: Alternative Measures of Name Recognition and Position
Dependent Variable: ln(Net Organic Clicks on Google)
Measure of Name Recognition: columns (1) and (2), Phrase Contains Name or Domain of Retailer; columns (3) and (4), Only Name or Domain of Retailer.

| Variable | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Position on Google | | | | |
| Poor | 1.294 (0.212)* | | 1.363 (0.206)* | |
| Below Median | 1.699 (0.210)* | | 1.746 (0.203)* | |
| Median | 2.473 (0.231)* | | 2.543 (0.223)* | |
| Above Median | 3.030 (0.253)* | | 3.136 (0.244)* | |
| Best | 3.700 (0.316)* | | 3.868 (0.305)* | |
| ln(Page 1) | | 0.855 (0.086)* | | 0.857 (0.088)* |
| Name Recognition on Bing | | | | |
| Poor | 0.387 (0.241) | 0.644 (0.243)* | 0.329 (0.234) | 0.644 (0.249)* |
| Below Median | 0.561 (0.247)* | 0.871 (0.246)* | 0.496 (0.238)* | 0.915 (0.252)* |
| Median | 0.576 (0.256)* | 0.911 (0.259)* | 0.697 (0.251)* | 0.905 (0.268)* |
| Above Median | 0.874 (0.266)* | 0.757 (0.268)* | 0.836 (0.255)* | 0.943 (0.272)* |
| Best | 1.497 (0.298)* | 1.251 (0.303)* | 1.502 (0.287)* | 1.255 (0.308)* |
| ln(# Ads on Page 1) | 0.327 (0.075)* | 0.224 (0.083)* | 0.318 (0.073)* | 0.219 (0.085)* |
| Social Network Presence | 0.321 (0.207) | 0.227 (0.210) | 0.351 (0.201) | 0.257 (0.215) |
| Retailer Age | 0.016 (0.021) | 0.030 (0.021) | 0.019 (0.020) | 0.035 (0.022) |
| Web Only Retailer | -0.006 (0.151) | 0.068 (0.154) | -0.013 (0.145) | 0.030 (0.157) |
| Constant | Yes | Yes | Yes | Yes |
| Retailer Category Indicators | Yes | Yes | Yes | Yes |
| Observations | 759 | 759 | 759 | 759 |
| Pseudo R2 | 0.42 | 0.39 | 0.42 | 0.38 |

Notes: Standard errors in parentheses. *significant at 5%.

Table 5: Specifications Controlling for Endogeneity of Position and Ads
Dependent Variable: ln(Net Organic Clicks on Google)

| Variable | (1) Two-Stage Ordered Probit | (2) Two-Stage Least Squares |
|---|---|---|
| Position on Google | | |
| Poor | -0.475 (0.293) | |
| Below Median | 0.455 (0.274) | |
| Median | 0.825 (0.256)* | |
| Above Median | 1.129 (0.270)* | |
| Best | 1.420 (0.252)* | |
| ln(Page 1) | | 1.115 (0.190)* |
| Name Recognition on Bing | | |
| Poor | 1.158 (0.256)* | 1.049 (0.273)* |
| Below Median | 1.648 (0.227)* | 1.299 (0.277)* |
| Median | 1.916 (0.212)* | 1.325 (0.296)* |
| Above Median | 2.149 (0.221)* | 1.277 (0.317)* |
| Best | 2.832 (0.328)* | 1.801 (0.358)* |
| ln(# Ads on Page 1) | 0.815 (0.070)* | 0.010 (0.188) |
| Social Network Presence | 0.093 (0.217) | 0.245 (0.234) |
| Retailer Age | 0.069 (0.026)* | 0.028 (0.024) |
| Web Only Retailer | 0.101 (0.183) | 0.022 (0.171) |
| Constant | Yes | Yes |
| Retailer Category Indicators | Yes | Yes |
| Observations | 759 | 759 |
| Pseudo R2 | 0.53 | 0.53 |
| Sargan Test (p-value) | | 0.35 |

Notes: Standard errors in parentheses. *significant at 5%.

Table 6: Long-Run Effect of Name Recognition
Dependent Variable: ln(Page 1) in column (1); ln(Net Organic Clicks Google) in columns (2) through (4).

| Variable | (1) OLS Position Regression | (2) OLS | (3) OLS Residualized Regression | (4) IV Residualized Regression |
|---|---|---|---|---|
| ln(Page 1) | | 0.881 (0.079)* | | |
| ln(Page 1) residual | | | 0.881 (0.079)* | 0.635 (0.220)* |
| Name Recognition on Bing | | | | |
| Poor | 0.724 (0.141)* | 1.006 (0.253)* | 1.644 (0.241)* | 1.625 (0.302)* |
| Below Median | 0.831 (0.139)* | 1.272 (0.223)* | 2.004 (0.215)* | 1.983 (0.307)* |
| Median | 1.235 (0.158)* | 1.337 (0.205)* | 2.424 (0.192)* | 2.410 (0.331)* |
| Above Median | 1.834 (0.165)* | 1.330 (0.221)* | 2.945 (0.196)* | 2.891 (0.376)* |
| Best | 2.646 (0.207)* | 1.789 (0.379)* | 4.120 (0.317)* | 4.007 (0.530)* |
| ln(# Ads on Page 1) | | 0.292 (0.072)* | 0.292 (0.072)* | 0.342 (0.202) |
| Social Network Presence | | 0.210 (0.220) | 0.210 (0.220) | 0.170 (0.235) |
| Retailer Age | | 0.034 (0.026) | 0.034 (0.026) | 0.046 (0.025) |
| Web Only Retailer | | 0.008 (0.191) | 0.008 (0.191) | 0.055 (0.171) |
| Constant | Yes | Yes | Yes | Yes |
| Retailer Category Indicators | No | Yes | Yes | Yes |
| Observations | 759 | 759 | 759 | 759 |
| Pseudo R2 | 0.41 | 0.53 | 0.53 | 0.53 |
| Sargan Test (p-value) | | | | 0.80 |

Notes: Standard errors in parentheses. *significant at 5%.

Table A1: Baseline Model
Dependent Variable: ln(Net Organic Clicks on Google)

| Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Position on Bing | | | | | | | |
| Poor | 4.504 (0.621)* | 3.811 (0.526)* | 3.733 (0.530)* | 3.733 (0.531)* | 3.733 (0.524)* | 2.613 (0.510)* | 3.508 (0.577)* |
| Below Median | 5.111 (0.567)* | 3.683 (0.485)* | 3.777 (0.491)* | 3.777 (0.492)* | 3.777 (0.487)* | 2.951 (0.472)* | 3.863 (0.530)* |
| Median | 5.784 (0.589)* | 4.687 (0.505)* | 4.663 (0.513)* | 4.663 (0.513)* | 4.663 (0.508)* | 4.129 (0.500)* | 4.712 (0.558)* |
| Above Median | 6.883 (0.593)* | 4.982 (0.533)* | 5.013 (0.558)* | 5.021 (0.559)* | 5.021 (0.554)* | 3.829 (0.551)* | 5.049 (0.598)* |
| Best | 8.738 (0.587)* | 6.122 (0.580)* | 6.122 (0.657)* | 6.122 (0.657)* | 6.122 (0.658)* | 5.267 (0.654)* | 6.607 (0.698)* |
| Name Recognition on Bing | | | | | | | |
| Poor | | 2.704 (0.580)* | 2.598 (0.594)* | 2.598 (0.594)* | 2.598 (0.587)* | 2.729 (0.578)* | |
| Below Median | | 2.267 (0.576)* | 2.183 (0.588)* | 2.036 (0.588)* | 2.174 (0.581)* | 2.284 (0.577)* | |
| Median | | 1.882 (0.590)* | 1.819 (0.600)* | 1.811 (0.600)* | 1.811 (0.592)* | 2.210 (0.599)* | |
| Above Median | | 2.796 (0.611)* | 2.667 (0.625)* | 2.717 (0.625)* | 2.717 (0.619)* | 2.840 (0.617)* | |
| Best | | 3.827 (0.660)* | 3.665 (0.696)* | 3.665 (0.696)* | 3.665 (0.688)* | 3.360 (0.698)* | |
| ln(# Ads on Page 1) | | | 0.049 (0.194) | 0.049 (0.194) | 0.049 (0.192) | 0.118 (0.188) | 0.398 (0.205) |
| Social Network Presence | | | | 0.000 (0.502) | 0.000 (0.495) | -0.046 (0.491) | 0.079 (0.558) |
| Retailer Age | | | | | 0.000 (0.049) | 0.016 (0.050) | -0.007 (0.057) |
| Web Only Retailer | | | | | | 0.183 (0.362) | -0.398 (0.389) |
| Constant | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Retailer Category Indicators | No | No | No | No | No | Yes | Yes |
| Observations | 759 | 759 | 759 | 759 | 759 | 759 | 759 |
| Pseudo R2 | 0.23 | 0.31 | 0.31 | 0.31 | 0.31 | 0.33 | 0.27 |

Notes: Standard errors in parentheses. *significant at 5%.
Table A2: Alternative Measures of Name Recognition and Position
Dependent Variable: ln(Net Organic Clicks on Google)
Measure of Name Recognition: columns (1) and (2), Phrase Contains Name or Domain of Retailer; columns (3) and (4), Only Name or Domain of Retailer.

| Variable | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Position on Bing | | | | |
| Poor | 1.461 (0.395)* | | 1.525 (0.375)* | |
| Below Median | 1.993 (0.369)* | | 2.121 (0.349)* | |
| Median | 2.519 (0.391)* | | 2.693 (0.370)* | |
| Above Median | 2.297 (0.434)* | | 2.332 (0.411)* | |
| Best | 3.813 (0.519)* | | 3.754 (0.492)* | |
| ln(Page 1) | | 0.789 (0.180)* | | 0.803 (0.184)* |
| Name Recognition on Google | | | | |
| Poor | 2.629 (0.388)* | 3.409 (0.471)* | 2.425 (0.369)* | 3.517 (0.484)* |
| Below Median | 3.553 (0.394)* | 4.587 (0.475)* | 3.704 (0.373)* | 4.695 (0.485)* |
| Median | 3.647 (0.416)* | 5.068 (0.499)* | 3.667 (0.393)* | 4.938 (0.509)* |
| Above Median | 3.762 (0.451)* | 5.234 (0.535)* | 3.780 (0.423)* | 5.088 (0.543)* |
| Best | 4.683 (0.513)* | 5.996 (0.614)* | 4.692 (0.484)* | 5.804 (0.624)* |
| ln(# Ads on Page 1) | 0.148 (0.144) | 0.115 (0.176) | 0.169 (0.136) | 0.189 (0.180) |
| Social Network Presence | 0.414 (0.381) | 0.371 (0.465) | 0.436 (0.362) | 0.367 (0.476) |
| Retailer Age | 0.018 (0.039) | 0.044 (0.047) | 0.012 (0.037) | 0.035 (0.049) |
| Web Only Retailer | 0.166 (0.284) | 0.311 (0.348) | 0.107 (0.268) | 0.076 (0.354) |
| Constant | Yes | Yes | Yes | Yes |
| Retailer Category Indicators | Yes | Yes | Yes | Yes |
| Observations | 759 | 759 | 759 | 759 |
| Pseudo R2 | 0.35 | 0.34 | 0.35 | 0.33 |

Notes: Standard errors in parentheses. *significant at 5%.

Table A3: Specifications Controlling for Endogeneity of Position and Ads
Dependent Variable: ln(Net Organic Clicks on Google)

| Variable | (1) Two-Stage Ordered Probit | (2) Two-Stage Least Squares |
|---|---|---|
| Position on Bing | | |
| Poor | -0.210 (0.340) | |
| Below Median | -0.536 (0.350) | |
| Median | 0.724 (0.339)* | |
| Above Median | 0.903 (0.305)* | |
| Best | 0.956 (0.325)* | |
| ln(Page 1) | | 0.988 (0.212)* |
| Name Recognition on Google | | |
| Poor | 1.590 (0.330)* | 1.455 (0.303)* |
| Below Median | 2.321 (0.365)* | 2.178 (0.307)* |
| Median | 3.100 (0.320)* | 2.872 (0.324)* |
| Above Median | 3.714 (0.338)* | 3.236 (0.351)* |
| Best | 5.141 (0.372)* | 4.032 (0.415)* |
| ln(# Ads on Page 1) | 0.471 (0.095)* | 0.168 (0.226) |
| Social Network Presence | 0.590 (0.325) | 0.625 (0.300)* |
| Retailer Age | 0.056 (0.030) | 0.020 (0.031) |
| Web Only Retailer | 0.402 (0.228) | 0.116 (0.228) |
| Constant | Yes | Yes |
| Retailer Category Indicators | Yes | Yes |
| Observations | 759 | 759 |
| Pseudo R2 | 0.48 | 0.49 |
| Sargan Test (p-value) | | 0.39 |

Notes: Standard errors in parentheses. *significant at 5%.
Table A4: Long-Run Effect of Name Recognition
Dependent Variable: ln(Page 1) in column (1); ln(Net Organic Clicks Google) in columns (2) through (4).

| Variable | (1) OLS Position Regression | (2) OLS | (3) OLS Residualized Regression | (4) IV Residualized Regression |
|---|---|---|---|---|
| ln(Page 1) | | 0.829 (0.111)* | | |
| ln(Page 1) residual | | | 0.829 (0.111)* | 0.371 (0.232) |
| Name Recognition on Google | | | | |
| Poor | 0.260 (0.088)* | 1.516 (0.324)* | 1.732 (0.322)* | 1.625 (0.308)* |
| Below Median | 0.517 (0.096)* | 2.286 (0.349)* | 2.715 (0.339)* | 2.465 (0.321)* |
| Median | 0.632 (0.103)* | 3.022 (0.307)* | 3.546 (0.297)* | 3.210 (0.346)* |
| Above Median | 1.006 (0.113)* | 3.460 (0.329)* | 4.294 (0.309)* | 3.824 (0.383)* |
| Best | 1.940 (0.150)* | 4.414 (0.398)* | 6.023 (0.350)* | 5.142 (0.508)* |
| ln(# Ads on Page 1) | | 0.146 (0.105) | 0.146 (0.105) | 0.668 (0.233)* |
| Social Network Presence | | 0.635 (0.320)* | 0.635 (0.320)* | 0.543 (0.304) |
| Retailer Age | | 0.027 (0.030) | 0.027 (0.030) | 0.042 (0.032) |
| Web Only Retailer | | 0.201 (0.229) | 0.201 (0.229) | 0.300 (0.233) |
| Constant | Yes | Yes | Yes | Yes |
| Retailer Category Indicators | No | Yes | Yes | Yes |
| Observations | 759 | 759 | 759 | 759 |
| Pseudo R2 | 0.32 | 0.49 | 0.49 | 0.48 |
| Sargan Test (p-value) | | | | 0.93 |

Notes: Standard errors in parentheses. *significant at 5%.