What’s in a Name? The Effects of Prominence on
Consumer Behavior at Search Engines∗
Michael R. Baye
Babur De los Santos
Matthijs R. Wildenbeest
Indiana University
Indiana University
Indiana University
Preliminary and Incomplete, September 2012
Abstract
This paper examines the effects of location and brand name prominence on consumer clicking behavior at the major search engines. We find that both are important determinants of
the links consumers click following a product search. Even though our results indicate that
position on the search results page has a larger direct impact than brand name prominence,
once we take into account that more prominent retail sites also get better positions, the effect
of brand name is much larger than the effect of position. We show that our findings are robust
to several alternative specifications.
Keywords: product search, internet, search engines, prominence
∗ Department of Business Economics and Public Policy, Kelley School of Business, Indiana University,
Bloomington IN 47405; mbaye@indiana.edu, babur@indiana.edu, and mwildenb@indiana.edu. We thank
seminar participants at Indiana University for valuable comments, and Susan Kayser, Joowon Kim, and
Zachary Mays for research assistance. Funding for the data and research assistance related to this research
was made possible by a grant from Google to Indiana University. The views expressed in this paper are
those of the authors and do not necessarily reflect the views of Indiana University or Google.
1 Introduction
Recent theoretical work by Arbatskaya (2007), Armstrong, Vickers, and Zhou (2009), and
Armstrong and Zhou (2011) emphasizes the role of prominence in consumer search models.
The key to these models is that consumers visit more prominent firms first; in the context of
online product search, this implies that more prominent online retailers receive more clicks
than their less prominent rivals. While these theoretical models are general enough to handle
environments where prominence depends on a multiplicity of factors, empirical studies of
online search behavior typically view screen location and prominence as synonymous. The
idea is that, when confronted with a list of search results, users are more likely to click links
in more prominent positions (those at the top of the list rather than in the middle, those on
the first page rather than those on the fourth page, and so on). That firms with prominent
locations get more clicks is hardly a matter of dispute; there is abundant evidence that screen
positions matter in online search environments ranging from paid advertisements at search
engines (see Ghose and Yang, 2009) to price comparison sites (see Brynjolfsson, Dick, and
Smith, 2010).1
Despite these important contributions, however, little is known about the empirical effects
of prominence on the clicks that different firms receive from organic search results. For
example, suppose a consumer queries Google or Bing with the phrase “buy product X online”
and in the search results page an obscure website, say FlyByNight.com, is displayed in a
higher position than Amazon.com. Blind application of existing empirical work (including
some of our own) would lead to the prediction that most consumers would click on the link
to FlyByNight.com, since it has the most prominent location on the list. This prediction, of
course, ignores the fact that Amazon has the most prominent name (or equivalently in this
context, is more prominent in terms of name recognition or brand awareness). In general,
one would expect both page location and name recognition to affect a firm’s overall
level of prominence. While the theoretical literature allows for this possibility, the existing
1 See also Rutz and Bucklin (2011), Drèze and Zufryden (2004), and Ansari and Mela (2003). Additionally,
Armstrong, Vickers, and Zhou (2009) and De los Santos and Koulayev (2012) summarize a number of
studies of “offline” environments (including the yellow pages, voting, and academic citations) that find that
“location” significantly impacts choice.
empirical literature is silent on the relative importance of these differing types of prominence
on consumer behavior. Our paper represents a first attempt to tackle these issues empirically.
One of the primary reasons existing empirical researchers focus on location rather than
name recognition is measurement: it is relatively easy to objectively measure the prominence of a firm’s screen location, but this is not the case for the prominence of its name. In
Section 2 we introduce a novel measure of the prominence of a firm’s name. This measure,
which we construct using comScore Search Planner data, is based on the number of product
searches at Google (or Bing) that include the firm’s name or URL in the search query. Retailers
with more of these “name searches” are deemed to have more prominent names than retailers with fewer name searches. We also provide some evidence that this measure works as
advertised in an education context: More “prominent” universities (based on U.S. News and
World Report Rankings) enjoy more “name searches” than less prominent universities.
Section 3 provides preliminary regression results suggesting that location on the search
engine result pages and name recognition are both important determinants of the links consumers click following a product search. These preliminary results, which are robust to the
inclusion of a variety of other controls, suggest that location has a larger impact on clicks
than name recognition, but that failure to account for name prominence increases the estimated effect of location by about 100 percent. Section 4 shows that our main finding—that
name prominence is an important determinant of clicks—is robust to alternative measures
of location and name prominence.
Section 4 also tackles the endogeneity problems inherent in this line of research. In
particular, search engines base firms’ positions on the results pages, in part, on past clicks.
In the above hypothetical, for example, as more and more consumers click Amazon.com
an optimizing search engine will demote FlyByNight.com’s position and elevate Amazon.com
toward the top of the list. Our analysis indicates that when one controls for endogeneity,
name recognition has a greater impact on clicks than location. We find that a firm moving
from the “median” screen location to the “best” gets about 81 percent more clicks, whereas
a firm that moves from the “median” to the “best” name recognition gets about 150 percent
more clicks. In short, when one controls for endogeneity, both location and name recognition
remain economically and statistically important determinants of clicks, but the quantitative
impact of name prominence is greater than location prominence.
Section 5 presents residualized regression results to “unpack” the direct and indirect
effects of improved brand recognition. Here, the direct effect refers to the fact that, holding
position constant, searchers are more likely to click on links with more prominent names.
The indirect effect refers to the fact that firms with more prominent names get more clicks, and
so receive better positions on search engines. For example, suppose a firm moves from the
“above median” brand name category to the ranks of the 20 percent with the most prominent
names. In our data, the direct effect of this change (which holds position constant) is a 58
percent increase in clicks. But there is also an indirect effect, since the better brand (and
accompanying increase in clicks) ultimately improves its position in the listings. Based on
the data, the total effect (the sum of the indirect and direct effect) of such an increase in
recognition is a 224 percent increase in clicks.
We conclude in Section 6.
2 Methodology and Data
Consider an online retailer interested in attracting traffic to its website. It might invest
in advertising through traditional (TV, radio or print) or online channels in an attempt to
enhance consumer awareness and generate visits to its website. It might spend large sums to
build a customer-centric website with a broad array of product offerings and an efficient network of distribution centers to create customer loyalty and word-of-mouth (or word-of-blog)
advertising. Or it might use some other strategy, or a blend of several strategies, to induce
consumers to visit its website. A less costly option is to economize on such investments and
simply “free ride” off of the traffic obtained through organic search results. These and other
investments by online retailers impact the prominence of their names. Unfortunately, many
online retailers are privately held companies; those that are publicly traded do not systematically provide detailed information about the many investments they make to enhance the
prominence of their online arms.
Our proposed measure of the prominence of an online retailer’s name in a given period is
the number of name searches it obtained during that period. Here, name searches refers to
search terms and phrases that include the retailer’s name or URL. For a variety of reasons,
this is a potentially useful measure of prominence. First, it is measurable. For example, one
can use comScore Search Planner data to calculate the number of name searches different
retailers received in a given month. Second, the number of name searches captures the behavior of consumers who are acting on all of the many investments retailers made up to that
point in time. Essentially, a firm’s number of name searches embodies the cumulative branding efforts of the firm up to and including the instant a search is made. Third, the number
of name searches in a given month measures the stock of name prominence. In contrast,
even if data were available on the investments different retailers made on advertising and
other brand-enhancing activities in a given month, such expenditures merely represent flows
that incrementally change prominence relative to previous periods, and therefore would not
be helpful in conducting cross-sectional analysis of the impact of prominence on consumer
search behavior. Even with time-series data on advertising and other brand-enhancing expenditures, one would have to deal with the thorny issue of identifying the “stock” of brand
equity from such “flow” data.
In order to examine the potential promise of our proposed measure of name prominence,
we obtained U.S. News and World Report rankings of the top 100 universities and placed
each university in one of five quintiles. Thus, the most prominent universities (which include
the likes of Harvard, Stanford, Princeton, and the other usual suspects) were in the top
quintile; Indiana University (a large public university) was in the 4th quintile. We then used
comScore Search Planner data (employing a methodology analogous to that described in the
next section) to determine the total number of name searches universities in each quintile
received during February 2012. The results are displayed in Figure 1: more prominent
universities (as measured by the U.S. News and World Report Rankings) received more name
searches than less prominent universities; for example, universities in the top-20 received over
80,000 name searches, while those in the bottom-20 received about 22,000 name searches.
We note that Figure 1 is based on the raw data; thus, name searches are a useful way of
measuring the prominence of different universities even though more prominent universities
tend to have significantly smaller numbers of students than less prominent ones. Thus, the
results in Figure 1 are hardly driven by spurious correlation with numbers of students.
For these reasons, we believe the number of name searches is a promising measure of the
prominence of the names of different retailers. We now provide a more detailed description
of our data, and the methods we use to construct the measures of prominence and variables
used in our econometric analysis.
2.1 Overview of the Data
Our analysis is based on three datasets. We assembled two of these using data from third-party providers that specialize in electronic commerce marketing data (comScore and Internet Retailer) and created the third dataset using a web scraper written in Java.
The comScore data consist of monthly Search Planner data for June 2012. These data
are based on the online browsing activity of two million users in the U.S. They provide a list of
search terms and phrases that users entered at search engines (e.g., Google and Bing), along
with the number of “organic clicks” that different websites received based on the results
pages generated by each search term.
The Internet Retailer data provides a list of the top 500 online retailers, along with the
general category in which each retailer operates (e.g., apparel and accessories, housewares
and home furnishings, computer and electronics, and so on). For each retailer, the data
indicates whether it has a presence on Facebook or Twitter, the year in which the retailer
began its online operations, and whether it is a “web only” retailer (as is the case with
Amazon) or also has a brick-and-mortar presence (as is the case with Walmart).
Since our goal is to examine product search on general search engines, the first step in
our analysis was to link the 500 retailers in the Internet Retailer data with the comScore
Search Planner data. In particular, we examined the Search Planner data and identified all
of the properties owned by these 500 retailers that were tracked by comScore. Owing to
the fact that some retailers own and operate websites with different domain names (e.g.,
Amazon operates both the Amazon and Zappos sites; Sears operates the Sears site as well
as a Kenmore and Kmart site), we ended up with a sample of 759 retail sites.
Next, we extracted the comScore site profile for each of these 759 retail websites. Each
site profile provides a list of the search terms and phrases (separately, for Google and Bing)
that resulted in organic clicks from the engine’s results page for that search term to a
particular retail site. For each search term or phrase, it also indicates the total number of
organic clicks each retail site received from results pages on each search engine. For example,
across the 759 retail sites, comScore identified a total of 5,549 search terms and phrases that
led consumers from Google to one or more of these 759 sites. Table 1 further illustrates by
providing the top search terms and phrases on Google that led searchers to click on links at
organic results pages directing them to Amazon.com, along with the total number of organic
clicks on Google for each of these terms.
A striking feature of Table 1 is that a very large proportion of organic traffic from
search engines to Amazon.com stems from name searches—that is, terms and phrases such
as “Amazon,” “Amazon.com,” “Amazon books,” “Amazon music” that searchers use as a
substitute for directly navigating to the Amazon.com website.2 In contrast, other searches
(like “Panasonic TV” or “Buy Levis jeans”) make up a far lower proportion of the top
organic search terms that result in traffic from Google to Amazon. As shown in Figure 1,
other retail sites also receive a substantial proportion of traffic from organic name searches,
with some sites receiving an even higher proportion of organic traffic through name searches
than Amazon does.
It is hardly surprising that a searcher who includes “Amazon” in a search phrase would
click on an “Amazon.com” link on a results page at a search engine; our analysis therefore
focuses on searches that are not name searches. For purposes of our analysis, a name search is
defined as a search term consisting of the retailer/site name and misspellings (e.g., “Amazon,”
“www.amazon.com,” and “Amzon.com”) as well as phrases containing such terms (e.g., “Buy
Camera at Amazon.com” or “buy TV at Amzon”).3 An examination of the 5,549 search
terms revealed that 3,911 of the 5,549 search terms and phrases leading searchers from Google
to one of the 759 retailer sites were not name searches. For each of the 759 websites, we
computed the net number of organic clicks (defined as total organic clicks minus organic
2 In industry parlance, name searches are sometimes called “navigational searches.” We use “name
search” to emphasize that these searches contain the name or URL of a particular retailer or site.
3 Section 4 shows that our results are robust to a narrower definition of name searches that includes
site names (Amazon.com) and misspellings (Amazn.com) but excludes phrases with such terms (“buy camera
at amazon.com”).
name clicks); in other words, the net number of organic clicks for retail site i is defined as
the total number of organic clicks site i received when searchers used the 3,911 terms and
phrases that did not include the name or URL of retail site i.
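The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names, the list of name variants, and the click counts are all hypothetical, and simple substring matching stands in for whatever procedure was actually used to identify misspellings.

```python
def is_name_search(term, name_variants):
    """Return True if the search term contains any variant of the site's name.

    name_variants is an assumed, hand-built list of lowercase spellings and
    misspellings (e.g. "amazon", "amzon"); the paper does not describe how
    misspellings were identified.
    """
    lowered = term.lower()
    return any(variant in lowered for variant in name_variants)


def net_organic_clicks(clicks_by_term, name_variants):
    """Total organic clicks minus the clicks that came from name searches."""
    return sum(
        clicks
        for term, clicks in clicks_by_term.items()
        if not is_name_search(term, name_variants)
    )


# Made-up click counts for a single site, for illustration only.
amazon_variants = ["amazon", "amzon", "amazn"]
amazon_clicks = {
    "Amazon": 1000,
    "Buy Camera at Amazon.com": 50,
    "buy TV at Amzon": 5,
    "Panasonic TV": 30,
    "Buy Levis jeans": 20,
}
net_clicks = net_organic_clicks(amazon_clicks, amazon_variants)
```

In this toy example only the last two terms survive the filter, so the net count is the sum of their clicks.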
Finally, we wrote a program in Java that queried Google and Bing in July 2012 and
captured the first five “results” pages for each of these 3,911 terms and phrases and identified
the position of the retailers in our sample in each of the resulting pages. As discussed in
more detail below, this permits us to control for the positions of different retailers on results
pages for different queries, as well as to construct controls for ads on results pages that may
influence searchers’ decisions to click on organic links.
2.2 Key Variables and Summary Statistics
Our goal is to examine how the prominence of different retail websites influences the organic
links that users click following product searches. Our analysis is based on variables constructed from the datasets described above. Table 2 provides descriptive statistics for these
variables.
Net Organic Clicks. Our dependent variable, net organic clicks, is formally defined as
follows. Let $S$ denote the set of non-name search terms and phrases discussed above,
and $j \in S$ denote one of these 3,911 searches. Based on the comScore dataset, we know that
retail site $i$ received a total of $N_{i,j}$ organic clicks from search phrase $j$; that is, $N_{i,j}$ measures the
number of times searchers who used the search phrase $j$ viewed the results pages and chose
to click an organic link to firm $i$. Firm $i$’s number of net organic clicks is $NOC_i = \sum_{j \in S} N_{i,j}$.
As shown in Table 2, the retail sites in our sample got on average close to 269 thousand
net organic clicks on Google. There is substantial variation: while 42 retail sites got zero
net organic clicks on Google, Amazon, the largest retailer in our sample, got more than 67
million net organic clicks in June 2012.
Position. Based on the data obtained by querying the Google and Bing search engines
using the search terms and phrases in S, we observe the search results position of the organic
links of each of the 759 firms for each keyword. Observed search results positions are numbered from 1 to n with 1 being the best and n being the worst. Since we only obtained data
for the first five pages of search results, firms with positions outside of this range are not
observed and are assigned a value of $n + 1$. Letting $P_{i,j}$ denote the best search results position
of firm $i$ for search term $j$, firm $i$’s average search results position across all search terms
and phrases is $\sum_{j \in S} P_{i,j} / |S|$. Based on these average positions, we categorized the position
variables as follows. Sites that never appeared on the first five pages (those with a mean
screen position of n + 1) were placed in a position category labeled “worst.” While we do not
know these sites’ actual average positions, we know these sites had the worst positions of any
sites in our sample. Remaining retailers were assigned into position categories depending on
the quintile in which their average screen position fell, ranked from “poor” for those retailers
with an average screen position that was in the lowest quintile of those in which at least
one screen position was observed, to “best” for the retailers with an average screen position
in the highest quintile.4 Thus, our primary measure of screen position is a dummy variable
that equals one when a site’s average screen position is one of these categories (and is zero
otherwise). Table 2 gives descriptive statistics for this variable. Table 2 also gives descriptive
statistics for the number of search terms for which a retailer appears on the first page on Google. On
average, a retail site appeared close to 14 times on the first page. The standard deviation
for this variable is substantial, which is also reflected by the large difference between the
minimum and maximum for this variable in the sample: whereas the highest ranked retail
site appeared on the first page for 2,194 search terms, a substantial number of firms did not
show up on the first page for any of the search terms included in the analysis.
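The construction of the position categories can be illustrated with a short Python sketch. It is only one way to implement the description above. The function names are hypothetical, the value of n used in the example is arbitrary, and the handling of ties and quintile boundaries is an assumption, since the text does not specify it.

```python
from statistics import mean

# Labels from worst observed quintile to best observed quintile.
QUINTILE_LABELS = ["poor", "below median", "median", "above median", "best"]


def average_position(best_positions, n):
    """Average position across search terms; None means the site was not
    observed on the first five pages and is assigned the value n + 1."""
    return mean(n + 1 if p is None else p for p in best_positions)


def position_categories(avg_positions, n):
    """Map each site to "worst" (never observed) or to a quintile label."""
    observed = {s: a for s, a in avg_positions.items() if a < n + 1}
    categories = {s: "worst" for s in avg_positions if s not in observed}
    # Sort observed sites from the largest (worst) to the smallest (best)
    # average position; ties are broken arbitrarily (an assumption).
    ranked = sorted(observed, key=lambda s: observed[s], reverse=True)
    for rank, site in enumerate(ranked):
        categories[site] = QUINTILE_LABELS[rank * 5 // len(ranked)]
    return categories


# Hypothetical average positions for six sites, with n = 50.
example = {"a": 51, "b": 10.0, "c": 20.0, "d": 30.0, "e": 40.0, "f": 50.5}
example_categories = position_categories(example, n=50)
```

Here site "a" never appears in the first five pages and is labeled "worst", while the five observed sites fall into one quintile each.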
Name Recognition. As discussed above, our primary indicator of name recognition is
a site’s number of name searches, and we create two measures: One based on name searches
at Google and the other based on name searches on Bing. The idea is that users must be
cognizant of a particular site in order to conduct a name search. Furthermore, the more
favorably consumers view a particular site’s brand (which embodies not only the name of the
4 The intermediate categories are “below median” (second lowest quintile), “median” (middle quintile),
and “above median” (next to highest quintile). A retail site in the “below median” category has an average
screen position that was in the second lowest quintile of those in which at least one screen position was
observed, while a retail site in the “median” category has an average screen position that was in the middle
quintile of those in which at least one screen position was observed. The “above median” category contains
all those retail sites that have an average search results position that was in the next to highest quintile.
site but characteristics of the retailer, including product breadth, quality, and reputation),
the greater the number of name searches. Unfortunately, comScore only records search terms
and phrases that exceed an unknown threshold, and as a result, 30% of the retail sites on
Google and 55% of those on Bing had so few name searches that comScore did not record
them. Retail sites in this category were coded as having the “worst” name recognition. The
remaining firms—those in which the number of name searches is observed—were categorized
into five quintiles based on their total number of name searches, ranked from “poor” (number
of name searches is in the lowest quintile of observed name searches) to “best” (top quintile
of observed name searches).5 Thus, our primary measure of name recognition is a dummy
variable that equals one when the site’s number of name searches is in one of these categories
(and zero otherwise). Descriptive statistics can be found in Table 2.
Ads. In addition to displaying organic results, search engines also display paid results.
Paid results are essentially advertisements that expose users to the names of companies, and
therefore may impact the prominence of a particular site. Based on the data collected by
querying Google and Bing, we capture number of times an ad was displayed on the first page
of results. As shown in Table 2, the average number of search terms for which the retail
site had an ad on the first page is 11.42, with standard deviation close to 56. The retail
site with the highest number of search phrases with an ad on page 1 has 1,069 ads.
Social Network Presence. Sites that have a presence on Facebook or Twitter get
additional exposure to potential searchers, and this might affect a site’s prominence. For
each retail site, we created a dummy that equals 1 if a firm has a presence on Facebook or
Twitter, and zero otherwise. As shown in Table 2, 89 percent of sites had a social network
presence.
Retailer Age. One might speculate that firms that have been involved in online sales
5 Similar to position categorization, the intermediate categories are “below median” (second lowest quintile),
“median” (middle quintile), and “above median” (next to highest quintile). A retail site in the “below
median” category has a number of name searches that is in the second lowest quintile of observed name
searches, while a retail site in the “median” category has a number of name searches that is in the middle
quintile of observed name searches. The “above median” category contains all those retail sites that have a
number of name searches that is in the second highest quintile of observed name searches.
for a longer period of time are more prominent than newer firms. We constructed a variable
that represents the age of the retail site. Table 2 shows that the retailers in our sample have existed
for an average of close to 13 years. The youngest is two years of age, while the oldest retail
site has been around for 23 years.
Web Only. To allow for differences in the prominence of pure-play online retailers
(such as Amazon) and online retailers that also have a brick-and-mortar presence (such as
Walmart), we constructed a dummy variable that equals one if the retail site is operated by
a web-only company and zero otherwise. As shown in Table 2, 36 percent of the retail sites
in our sample are web only.
Category Fixed Effects. Finally, to control for systematic differences in prominence
across different categories of websites, each retail site was assigned to one of the 15 Internet
Retailer categories. Table 2 shows these categories and the percentages of sites that belong
to each category.
2.3 Other Methodological Considerations
As noted above, one limitation of the Search Planner data is that comScore only records
the number of clicks a site obtains from a given search term or phrase when the number is
above an unspecified threshold, T . An unobserved number of clicks (or missing search term
or phrase) does not mean the site did not receive any clicks (or that no users navigated to
the site using that search term or phrase). It simply means that the number of clicks was
below this threshold. Our methodology attempts to mitigate this concern in two ways.
First, we use categories rather than levels to measure name recognition. Thus, while the
actual number of name searches is not observed for some sites, such sites necessarily have
fewer name searches than those for which comScore does record a number of searches.
So long as sites within the “worst” name recognition category do not have significantly
heterogeneous impacts on the dependent variable in our regressions, the ability to observe
the number of name searches of these firms would not impact our analysis. Notice that the
same is true for our use of categories for page position.
Second, we use quantile regressions to mitigate problems stemming from the fact that
comScore does not disclose clicks for keywords or phrases in which the number of clicks
is below an unspecified threshold, T . We initially explored two extremes to this problem
using OLS. In the first, we set T equal to the minimum number of clicks observed in our
sample and assumed that sites with unobserved clicks received T − 1 clicks. In the second,
we assumed that sites with missing numbers of clicks received only 1 click. Results based on
these two extremes were qualitatively similar—and similar to the results reported below—
but the magnitudes of the estimates were sensitive to these two extremes. In contrast, the
results reported below based on quantile regressions are robust to these two extremes.
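The two imputation extremes can be illustrated with a toy Python example. The click counts below are invented, and the comparison uses an unconditional mean and median rather than the regressions estimated in the paper, so it conveys only the intuition for why a median-based estimator is less sensitive to how censored observations are filled in.

```python
from statistics import mean, median


def impute(clicks, fill_value):
    """Replace unobserved click counts (None) with fill_value."""
    return [fill_value if c is None else c for c in clicks]


# Made-up click counts; None marks values below comScore's threshold T.
observed_clicks = [None, None, 50, 120, 300, 800, 2000]

# First extreme: T is the smallest observed count, missing values get T - 1.
threshold = min(c for c in observed_clicks if c is not None)
upper_extreme = impute(observed_clicks, threshold - 1)

# Second extreme: sites with missing counts received a single click.
lower_extreme = impute(observed_clicks, 1)

mean_gap = mean(upper_extreme) - mean(lower_extreme)
median_gap = median(upper_extreme) - median(lower_extreme)
```

In this example the mean shifts with the imputed value, while the median is unchanged because the censored observations lie below it under either assumption.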
3 Baseline Results
As a starting point, we consider a specification of the form
$$\ln NOC = a + \sum_{b=1}^{5} \alpha_b POS_b + \sum_{b=1}^{5} \beta_b NAME_b + \gamma X + \varepsilon$$
where $POS_b$ and $NAME_b$ are dummy variables corresponding to the position categories
and name prominence categories, and $X$ is a vector of other potential controls. The omitted categories
are the “worst” position and “worst” name recognition categories. The coefficients for the
position and name dummies have the usual interpretation: A firm moving from position
category $b'$ to position category $b''$ (or from name category $b'$ to name category $b''$) experiences
a $[\exp(\alpha_{b''} - \alpha_{b'}) - 1] \times 100$ percent change in net organic clicks (with $\beta$ in place of $\alpha$ for name categories).
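Because the dependent variable is in logs, a difference in dummy coefficients is a difference in log clicks, and exponentiating it gives the ratio of clicks between the two categories. The conversion can be sketched in Python as follows. The function name is hypothetical, and the coefficient values are the ones quoted in the paper's footnotes 6 and 7.

```python
import math


def percent_change(coef_from, coef_to):
    """Percent change in clicks implied by moving between two dummy
    categories in a regression whose dependent variable is in logs."""
    return (math.exp(coef_to - coef_from) - 1.0) * 100.0


# Median -> best name recognition (coefficients from footnote 6).
name_effect = percent_change(0.971, 2.119)

# Median -> best position (coefficients from footnote 7).
position_effect = percent_change(2.479, 3.925)
```

These evaluate to roughly 215 and 325 percent, matching the figures reported in the footnotes.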
Baseline quantile regression results are presented in Table 3. Specification (1) includes
only controls for position. Consistent with other research, notice that sites with better
positions on Google search results pages obtain significantly more net organic clicks through
Google.
Specification (2) adds controls for name recognition. Three aspects of this specification
are noteworthy. First, adding controls for name recognition tends to reduce the magnitude
of the estimated position effects, although all of the position coefficients remain statistically
significant at the 5 percent level. Second, while the effect of name recognition on clicks
is not perfectly monotonic in this simple specification, name recognition has a positive and
statistically significant effect on clicks, and firms with better name recognition get more
clicks than firms with the “worst” name recognition (the omitted category). Third, holding
position constant, a firm that moves from the median to the best name recognition category
enjoys a 215 percent increase in clicks.6 This is less than the 325 percent increase in clicks
it would enjoy if its name recognition were held constant but it moved from the median to the best
position category.7 On balance, these results suggest that name prominence is a potentially
important determinant of clicks.
Column (3) adds controls for the number of advertisements for the site that appear on
the first page of organic search results. Consistent with other research (see, for instance,
Goldfarb and Tucker, 2011), the coefficient is positive and statistically significant: A one
percent increase in the number of the firm’s ads on page one of search results increases net
organic clicks by 0.25 percent. This is consistent with exposure to a paid ad bolstering the
prominence of the firm’s name or link, and therefore increasing the firm’s number of net
organic clicks. Adding this control, however, does little to the estimates of the position or
name recognition coefficients.
If the effects of name recognition identified in specifications (2) and (3) were purely the
result of omitted variables or spurious correlation with better measures of firms’ efforts to
enhance brand awareness and clicks, the results would not be robust to the inclusion of
other variables. Specification (4) adds a control for whether a firm has a social network
presence. While the coefficient is positive, it is not statistically significant. Likewise, one
might speculate that the age of the retailer is a useful proxy for the prominence of the names
of retail sites, since sites that have been around longer are more likely to be better known than
newer sites. The results in column (5) indicate that this variable adds little explanatory
power over and above our measure of name recognition. Finally one might worry that the
previous results are driven by differences between web-only and bricks-and-clicks retailers,
or differences in the retail segments in which different retailers compete. Column (6) shows
that the results are robust to these controls as well.
On balance, these results indicate that the prominence of a site’s name is an important
determinant of whether those conducting product searches click through to that site. Comparisons of the position effects in columns (1) and (2), as well as columns (6) and (7), reveal
that failure to account for the prominence of a site’s name leads to results that overstate the
importance of position.
6 Since (exp(2.119 − 0.971) − 1) × 100 = 215.19.
7 Since (exp(3.925 − 2.479) − 1) × 100 = 324.61.
4 Name Recognition and Endogeneity
This section considers specifications that use alternative measures of name recognition and
screen position, as well as specifications that control for potential endogeneity. As an additional robustness check, we replicate all of our analysis using a different dataset constructed
from comScore data on Bing.
4.1 Alternative Measures of Name Recognition and Position
As an initial matter, notice that if one used total organic clicks at Google (including organic
name clicks) as the dependent variable, it would hardly be surprising to find that the number
of organic name clicks at Google has a positive effect on total organic clicks; indeed, in such
a specification, a one unit increase in the number of organic name clicks at Google would
result in a one unit increase in total organic clicks at Google. We avoid this issue by using
net organic clicks as the dependent variable. Despite this, one might worry that unobserved
factors influencing clicks on Google name searches also impact net organic clicks on Google.
If this is the case, our measure of name recognition will be correlated with the error in the
regression, potentially biasing the results.
To mitigate this concern, we used an identical methodology to construct measures of
name recognition based on name searches at Bing. Given differences in Google and Bing’s
algorithms, and differences in their populations of users, it seems reasonable that unobserved
factors that influence net organic clicks at Google are likely independent of unobserved
factors that influence name clicks on Bing. In any event, as shown in column (1) of Table 4,
using this alternative measure does not qualitatively change our findings: Firms with more
prominent names on Bing obtain significantly more clicks on Google than less prominent
firms.
One might also worry that our use of position categories somehow masks the importance
of being included on the first page of search results. To account for this possibility,
column (2) of Table 4 shows results based on an alternative measure of position that is simply the number of times a given retailer appears on the first page of organic search results.
As would be expected, the coefficient is positive and statistically significant: firms more frequently appearing on the first page of organic search results obtain significantly more clicks.
But more to the point, our finding that sites with more prominent names receive significantly more organic clicks than less prominent sites continues to hold with this alternative
specification.
As an additional robustness check, we also replicated our analysis using a narrower
definition of a name search that only includes the name or URL of the site (e.g., excludes
“buy camera at amazon” but includes “amazon.com” and “amazon”). Under this definition, a
name search is a pure navigational search—consumers using this query are merely attempting
to navigate to a particular firm’s site. As shown in columns (3) and (4) of Table 4, the results
are very similar to those based on the broader definition of a name search in columns (1)
and (2).
4.2 Position and Ad Endogeneity
Importantly, all of the results described above are based on specifications in which position
is treated as an exogenous variable. This could be the case if search result positions are
predetermined at the time consumers make their click decisions, making it unnecessary
to adjust for endogeneity. While one could argue this is true for our data (where we are using
cross-sectional data from a single month rather than a time series of data), in practice search
engines continually refine and optimize their algorithms in an attempt to present searchers
with the most relevant organic results. From the standpoint of estimation, this means that
a site’s position in the list of organic search results depends on past clicks. To further
complicate matters, past organic clicks depend on past prominence (both location and
name recognition).
An additional issue is the potential endogeneity of Ads—one of the controls in our specifications. Search engines make money when users click on ads, and thus search engines take
into account the likelihood that a firm’s ad will be clicked when deciding to display a firm’s
ad. Again, a search engine’s decision to display an ad depends on the past click-through
behavior of consumers. Thus, there is reason to believe that two of the covariates in our
analysis—Position on Google and Ads—may be endogenous. As discussed below, we use
information about position and ads on Bing as instruments for position and ads on Google.
Since Google’s position and ad decisions are based on past clicks at Google, not Bing, these
instruments would seem to satisfy the requirements of valid instruments.
Two-stage Ordered Probit. To facilitate comparisons with earlier results, we first
present results using a two-stage ordered probit approach. We use the categorized position
dummies on Bing as instruments. In the first stage we estimate an ordered probit model
using the categorized position variable as a dependent variable and the instruments and all
exogenous covariates as explanatory variables. We then use the predicted values from the
first stage to create another set of position categories, which we use to replace the position
variables in the original model. In the second stage we estimate this equation by OLS.
The results are displayed in column (1) of Table 5. While the results are qualitatively
similar to those ignoring endogeneity, controlling for endogeneity dramatically changes the
quantitative importance of position relative to name prominence. In particular, notice
first that the coefficients of the “poor” and “below median” position categories are not
statistically different from zero; position matters only if it is at or better than the median
position. In contrast, name recognition matters across all categories. Second, when one
controls for endogeneity, name prominence has a greater impact on clicks than position
prominence. For example, a firm that moves from the median position category to the best
position category obtains only 81.3 percent8 more clicks, whereas it enjoys a 150 percent9
increase in clicks if it moves from the median to the best name recognition category.
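The percentage effects quoted in this section follow from differences between category coefficients in a regression whose dependent variable is in logs. The following is a minimal sketch of that conversion, using the Table 5 column (1) coefficients that footnotes 8 and 9 refer to; the function name `pct_change_between_categories` is hypothetical and introduced here only for illustration.

```python
import math


def pct_change_between_categories(coef_to: float, coef_from: float) -> float:
    """Percent change in clicks implied by moving from one dummy-coded
    category to another when the dependent variable is ln(clicks)."""
    return (math.exp(coef_to - coef_from) - 1.0) * 100.0


# Position on Google: median (0.825) to best (1.420).
position_effect = pct_change_between_categories(1.420, 0.825)
# Name recognition on Bing: median (1.916) to best (2.832).
name_effect = pct_change_between_categories(2.832, 1.916)

print(f"position: {position_effect:.2f}%  name recognition: {name_effect:.2f}%")
```

Because both categories are measured against the same omitted category, only the difference between the two coefficients matters for the comparison.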
8 Since (exp(1.420 − 0.825) − 1) × 100 = 81.30.
9 Since (exp(2.832 − 1.916) − 1) × 100 = 149.93.
Two-stage Least Squares. As an additional robustness check, we use the alternative
measure of position discussed in Section 2 (the number of times a firm appears on the first
page of search results). Since both endogenous variables, ln(Page 1) and ln(# of Ads
on Page 1), are continuous, we use standard two-stage least squares, with the corresponding
values of these variables on Bing and the log of the average position on Bing’s search
result pages as instruments. The results are displayed in column (2) of
Table 5. Importantly, name recognition remains an important determinant of clicks in this
specification. Additionally, comparing the results to those in column (2) of Table 4 (which
is the identical specification not controlling for endogeneity) reveals that controlling for
endogeneity increases the estimated effects of name recognition on clicks. Finally, the Sargan
test fails to reject the null hypothesis that the instruments are uncorrelated with the
residuals, which suggests the instruments are exogenous.
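To make the mechanics of the two-stage least squares step concrete, here is a minimal sketch on synthetic data. It is not the paper's estimation code: the data-generating process, the variable names, and the helper `two_stage_least_squares` are assumptions made for illustration, with `z` playing the role of a Bing-based instrument for an endogenous Google regressor `x`.

```python
import numpy as np


def two_stage_least_squares(y, exog, endog, instruments):
    """2SLS coefficients for y on [exog, endog].

    exog:        (n, k) exogenous regressors, including a constant column
    endog:       (n, m) endogenous regressors
    instruments: (n, q) excluded instruments, with q >= m
    """
    first_stage_X = np.column_stack([exog, instruments])
    # First stage: project each endogenous regressor on all exogenous variables.
    gamma, *_ = np.linalg.lstsq(first_stage_X, endog, rcond=None)
    endog_hat = first_stage_X @ gamma
    # Second stage: use the fitted values in place of the endogenous regressors.
    second_stage_X = np.column_stack([exog, endog_hat])
    beta, *_ = np.linalg.lstsq(second_stage_X, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 20_000
confounder = rng.normal(size=n)   # unobserved; affects both x and y
z = rng.normal(size=n)            # instrument, unrelated to the confounder
x = 0.8 * z + confounder + rng.normal(size=n)
y = 1.0 + 0.5 * x + 2.0 * confounder + rng.normal(size=n)  # true slope is 0.5

const = np.ones((n, 1))
beta_ols, *_ = np.linalg.lstsq(np.column_stack([const, x]), y, rcond=None)
beta_iv = two_stage_least_squares(y, const, x[:, None], z[:, None])
print("OLS slope:", beta_ols[1], "2SLS slope:", beta_iv[1])
```

In this synthetic setup the OLS slope is biased upward by the confounder, while the 2SLS slope is close to the true value. The sketch returns point estimates only; standard errors taken naively from the second-stage regression would be incorrect and need the usual 2SLS correction.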
4.3 Other Robustness Checks
As a final robustness check we replicated all of the above analysis by constructing a dataset
for Bing from the comScore data. As shown in Tables A1 through A4 in the Appendix, our
main findings are robust to this alternative dataset.
5 Indirect Effects of Name Recognition
In the previous sections we have focused on the direct effect of brand name recognition on
clicks. However, if the search engines’ algorithms are such that the rankings of search results
are partly determined by past clicking behavior, there is also an indirect effect of brand
name recognition on clicks: retail sites with more prominent names will get more clicks,
which results in better future positions. Figure 3 gives the average number of times a retail
site appears on the first page for the different name recognition categories, and confirms that
the retail sites with a relatively high brand recognition also tend to have the better positions
on the results pages of the major search engines. In fact, retail sites in the “best” name
recognition category have by far the highest number of search terms for which they appear
on a first result page, which suggests that the brand name effect is especially important
for the most recognized sites.
Whereas the partial coefficients in a standard OLS regression would only capture the
direct effect of brand name recognition on clicks, in this section we show how to distinguish
between the direct and indirect effects by estimating the combined effect using a residualized
regression approach. In a first step we run a regression of the log of the number of search
phrases for which a retail site appears on the first page on the name recognition categories
for Bing. The residuals from this regression can be considered as brand name-adjusted
positions. In a second step we regress the log of the net organic clicks on Google on the
brand name-adjusted positions as well as on the brand name recognition categories for Bing.
The estimated coefficient for the brand name variables now contains both the direct and
indirect effects of brand name recognition on clicks.10
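The two-step procedure can be checked numerically. The sketch below uses synthetic data (not the comScore data) and hypothetical variable names to verify that the second-step coefficient on the residualized position equals the OLS position coefficient, and that each second-step name coefficient equals the OLS name coefficient plus the position coefficient times the first-step name coefficient.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(1)
n, n_cat = 2_000, 6
category = rng.integers(0, n_cat, size=n)   # 0 is the omitted category
name = np.eye(n_cat)[category][:, 1:]       # five name recognition dummies
controls = rng.normal(size=(n, 2))          # stand-in for the covariates X
const = np.ones((n, 1))

# Synthetic outcomes; all coefficients below are arbitrary illustration values.
ln_pos = 0.5 + name @ np.array([0.3, 0.6, 0.9, 1.2, 2.0]) + rng.normal(size=n)
ln_noc = (1.0 + 0.9 * ln_pos + name @ np.array([0.2, 0.4, 0.6, 0.8, 1.0])
          + controls @ np.array([0.3, -0.1]) + rng.normal(size=n))

# Step 1: regress ln POS on the name dummies and keep the residuals.
step1_X = np.column_stack([const, name])
step1 = ols(step1_X, ln_pos)
theta = step1[1:]
ln_pos_resid = ln_pos - step1_X @ step1

# Standard OLS: direct effects of position (alpha) and name (beta).
b_ols = ols(np.column_stack([const, ln_pos, name, controls]), ln_noc)
alpha, beta = b_ols[1], b_ols[2:7]

# Step 2: same regression with residualized position; eta is the total effect.
b_res = ols(np.column_stack([const, ln_pos_resid, name, controls]), ln_noc)
delta, eta = b_res[1], b_res[2:7]

print("delta == alpha:", np.allclose(delta, alpha, atol=1e-6))
print("eta == beta + alpha * theta:", np.allclose(eta, beta + alpha * theta, atol=1e-6))
```

The identities hold exactly (up to floating-point error) because the residualized position is a linear combination of regressors already in the model, so both regressions span the same column space.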
10 Formally, in the first step we estimate
$$\ln POS = \gamma + \sum_{b=1}^{5} \theta_b NAME_b + \nu.$$
The residuals $\nu$ can be interpreted as brand name-adjusted positions, i.e.,
$$\widehat{\ln POS} = \nu = \ln POS - \gamma - \sum_{b=1}^{5} \theta_b NAME_b.$$
In the second step we estimate
$$\ln NOC = c + \delta \widehat{\ln POS} + \sum_{b=1}^{5} \eta_b NAME_b + \omega X + \epsilon.$$
A comparison of this specification to the standard OLS regression
$$\ln NOC = a + \alpha \ln POS + \sum_{b=1}^{5} \beta_b NAME_b + \gamma X + \varepsilon,$$
shows that the difference between the two specifications is $\alpha \left( \gamma + \sum_{b=1}^{5} \theta_b NAME_b \right)$.
Writing $\alpha \ln POS = \alpha \widehat{\ln POS} + \alpha \left( \gamma + \sum_{b=1}^{5} \theta_b NAME_b \right)$ in
the OLS specification gives, after reorganizing,
$$\ln NOC = a + \alpha\gamma + \alpha \widehat{\ln POS} + \sum_{b=1}^{5} (\beta_b + \alpha\theta_b) NAME_b + \gamma X + \varepsilon,$$
which is equivalent to the second step in the residualized regression approach if $c = a + \alpha\gamma$, $\delta = \alpha$,
$\eta_b = \beta_b + \alpha\theta_b$, $\omega = \gamma$, and $\epsilon = \varepsilon$ (in $c = a + \alpha\gamma$ the symbol $\gamma$ is the first-step
intercept, whereas in $\omega = \gamma$ it is the OLS coefficient on $X$). Note that this means that by combining the OLS parameter estimates with
the parameter estimates from the first step in the residualized regression approach, we can break down the
coefficients from the second step into individual components (for instance, the total effect of brand name
recognition on clicks $\eta_b$ can be split into a direct effect $\beta_b$ and an indirect effect $\alpha\theta_b$). This also shows that
the OLS regression and the second step in the residualized regression are equivalent in terms of minimizing
the sum of squared residuals. Indeed, Table 6 shows the R-squared is identical across the two specifications,
Column (1) of Table 6 gives estimation results for the first step. In line with the patterns
observed from Figure 3, the effect of name recognition is most pronounced for the “best”
brand name category, although the effects are significantly different from the omitted category
for all categories. Column (3) of Table 6 gives the results for the second step, in which
we regress clicks on the brand name-adjusted position variable and the name recognition
categories, as well as some other control variables. A comparison of the parameter estimates
with those for a standard OLS regression (see Column (2) of Table 6) tells us that the impact
of brand name more than doubles for the retail sites with the best brand names and almost
doubles for those with less well-known brand names. Whereas the direct effect of moving
from the retail sites with brand names in the “above median” category to those with the
most prominent names leads to a 58 percent increase in net organic clicks on Google,11 the
indirect effect adds another 166 percentage points, for a combined effect of 224 percent.12
Note that the relatively large magnitude of the brand name recognition variables does not
change if we use IV to control for endogeneity, as shown in Column (4) of Table 6.
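The decomposition of the total effect into a direct and an indirect part can be checked against the reported estimates. The sketch below uses coefficients as read from the flattened Table 6 (first-step name coefficients, OLS name coefficients, and residualized-regression name coefficients); the assignment of values to columns is our reading of that table and should be treated as an assumption. Small gaps between implied and reported totals reflect rounding of the published coefficients.

```python
import math

alpha = 0.881  # coefficient on ln(Page 1) in the standard OLS regression

# category: (first-step theta_b, direct effect beta_b, reported total eta_b)
table6 = {
    "Poor":         (0.724, 1.006, 1.644),
    "Below Median": (0.831, 1.272, 2.004),
    "Median":       (1.235, 1.337, 2.424),
    "Above Median": (1.834, 1.330, 2.945),
    "Best":         (2.646, 1.789, 4.120),
}

# Largest gap between beta_b + alpha * theta_b and the reported eta_b.
max_gap = max(abs(beta + alpha * theta - eta)
              for theta, beta, eta in table6.values())

# Moving from "Above Median" to "Best": direct, total, and indirect effects.
direct_pct = (math.exp(table6["Best"][1] - table6["Above Median"][1]) - 1) * 100
total_pct = (math.exp(table6["Best"][2] - table6["Above Median"][2]) - 1) * 100
indirect_pts = total_pct - direct_pct

print(f"max gap between implied and reported totals: {max_gap:.4f}")
print(f"direct {direct_pct:.1f}%, total {total_pct:.1f}%, "
      f"indirect {indirect_pts:.1f} percentage points")
```

The indirect part is expressed in percentage points because it is the difference between two percentage changes, not a percentage change of its own.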
The large indirect effects we find are visualized in Figure 4, which puts both the total
effect obtained from the residualized regression coefficients and the direct effect from the
standard OLS regression coefficients in one graph, together with the 95% confidence intervals
for the estimated parameters. For retail sites that are in the three highest name recognition
categories the total effect is significantly different from the direct effect, which strengthens
our finding that the indirect effect of brand name through position plays an important role in
generating clicks. Moreover, Figure 4 shows that if we ignore the indirect effect brand name
has on clicks, the differences between brand name categories are not that large, especially for
the retail sites that are most recognizable, but once we include the indirect effect it becomes
clear that it is especially important to be among the retailers in the “best” brand name
category. So even though in the short run investments in brand name might not improve
a retail site’s position on the search engine, in the long run search engines will re-optimize
their ranking, which will lead to better positions and thus more clicks.
just as the parameter estimates and standard errors for the position variable and the covariates X.
11 Since (exp(1.789 − 1.330) − 1) × 100 = 58.24.
12 Since (exp(4.120 − 2.945) − 1) × 100 = 223.81.
6 Conclusions
In this paper we have investigated the importance of brand name recognition in consumers’
decisions on which search engine results to click. Using a large dataset of search terms
and phrases used at the major search engines, we have found that the effects of brand name
recognition on clicks are substantial, even if we control for positions on the search engine
result pages. Moreover, we have found that adding the indirect effect of brand on clicks
through position doubles the direct effect. We have shown that our findings are robust to
several alternative specifications.
References
[1] Ansari, Asim and Carl Mela (2003). “E-Customization.” Journal of Marketing Research,
40 (2), pp. 131-146.
[2] Arbatskaya, Maria (2007). “Ordered Search.” The RAND Journal of Economics, 38 (1),
pp. 119-126.
[3] Armstrong, Mark, John Vickers, and Jidong Zhou (2009). “Prominence and Consumer
Search.” The RAND Journal of Economics, 40 (2), pp. 209-233.
[4] Armstrong, Mark and Jidong Zhou (2011). “Paying for Prominence.” Economic Journal,
121 (556), pp. F368-F395.
[5] Baye, Michael R. and John Morgan (2009). “Brand and Price Advertising in Online
Markets.” Management Science, 55 (7), pp. 1139-1151.
[6] Brynjolfsson, Erik, Astrid Dick, and Michael D. Smith (2010). “A Nearly Perfect Market?” Quantitative Marketing and Economics, 8 (1), pp. 1-33.
[7] Brynjolfsson, Erik and Michael D. Smith (2001). “The Great Equalizer: The Role of
Shopbots in Electronic Markets.” MIT Sloan Working Paper No. 4208-01.
[8] De los Santos, Babur and Sergei Koulayev (2012). “Optimizing Click-through in Online
Rankings for Partially Anonymous Consumers.” Working paper.
[9] Drèze, Xavier and Fred Zufryden (2004). “Measurement of Online Visibility and its
Impact on Internet Traffic.” Journal of Interactive Marketing, 18 (1), pp. 20-37.
[10] Ghose, Anindya and Sha Yang (2009). “An Empirical Analysis of Search Engine Advertising: Sponsored Search in Electronic Markets.” Management Science, 55 (10), pp.
1605-1622.
[11] Goldfarb, Avi and Catherine Tucker (2011). “Online Display Advertising: Targeting
and Obtrusiveness.” Marketing Science, 30 (3), pp. 389-404.
[12] Rutz, Oliver J. and Randolph E. Bucklin (2011). “From Generic to Branded: A Model of
Spillover Dynamics in Paid Search Advertising.” Forthcoming in the Journal of Marketing
Research.
Figure 1: Name Searches and University Rankings
[Bar chart. Vertical axis: Avg. Name Searches (thousands). Horizontal axis: University Rank, in groups 1-20, 21-40, 41-60, 61-80, 81-100.]
Figure 2: Organic Clicks and Name Searches that Lead Users to Retailers
[Bar chart. Retailers: Amazon, Walmart, Apple, Target, Homedepot, Bestbuy, Lowes, Macys. Axis: Number of Searches (thousand). Series: Net Organic Searches, Name Searches.]
Figure 3: Position and Name Recognition
[Bar chart. Vertical axis: Avg. Number of Times on First Page. Horizontal axis: Name Recognition Category (Worst, Poor, Below Median, Median, Above Median, Best).]
Figure 4: Total Effect of Name Recognition on Clicks
[Chart. Vertical axis: Coefficients and 95% confidence interval. Horizontal axis: Name Recognition Category (Poor, Below Median, Median, Above Median, Best). Series: Direct Effect, Total Effect.]
Table 1: Search Terms that lead users to Amazon

Rank    Search Phrase                   # Organic Clicks at Google
1       amazon                          9,006,624
2       amazon.com                      1,207,598
3       www.amazon.com                  247,793
4       kindle                          146,406
5       50 shades of grey               116,359
6       ebay                            106,062
7       name name                       105,408
8       amazon books                    100,357
9       ***                             96,165
10      name last name                  90,934
11      kindle fire                     76,201
12      google                          72,856
13      name name book                  66,754
14      amzon                           63,852
15      amazonprime                     56,407
16      boss orange name                55,281
17      amazon kindle                   55,272
18      amazon prime                    55,259
19      kim kardashian sex tape         54,028
20      first name review               51,240
30      fifty shades of grey            29,303
40      amazo                           22,909
50      bosch name                      18,878
60      insanity workout                15,634
70      metal detectors                 14,419
80      the night dad went to jail      13,502
90      target                          12,125
100     name name books in order        11,301
        All Search Terms                79,244,892

Notes: comScore Search Planner data from June 2012. Search
phrases are ranked by the total number of organic clicks on
Google.
Table 2: Descriptive Statistics (N=759)

Variable                                    Mean        Std. Dev.   Min         Max
Net Organic Clicks on Google (thousands)    268.99      2590.29     0           67497.97
Number of Times on Page 1                   13.90       86.58       1           2194
Number of Ads on Page 1                     11.42       55.91       1           1069
Position on Google
  Worst                                     0.20        0.40        0           1
  Poor                                      0.15        0.36        0           1
  Below Median                              0.17        0.37        0           1
  Median                                    0.16        0.37        0           1
  Above Median                              0.16        0.37        0           1
  Best                                      0.16        0.37        0           1
Name Recognition on Google
  Worst                                     0.34        0.47        0           1
  Poor                                      0.13        0.34        0           1
  Below Median                              0.13        0.34        0           1
  Median                                    0.13        0.34        0           1
  Above Median                              0.13        0.34        0           1
  Best                                      0.13        0.34        0           1
Name Recognition on Bing
  Worst                                     0.58        0.49        0           1
  Poor                                      0.08        0.28        0           1
  Below Median                              0.08        0.28        0           1
  Median                                    0.08        0.28        0           1
  Above Median                              0.08        0.28        0           1
  Best                                      0.08        0.28        0           1
Social Network Presence                     0.89        0.31        0           1
Retailer Age                                12.97       3.19        2           23
Web Only Retailer                           0.36        0.48        0           1
Category
  Apparel/accessories                       0.30        0.46        0           1
  Automotive parts/accessories              0.01        0.11        0           1
  Books/music/video                         0.04        0.20        0           1
  Computers/electronics                     0.09        0.28        0           1
  Flowers/gifts                             0.04        0.19        0           1
  Food/drug                                 0.04        0.21        0           1
  Hardware/home improvement                 0.04        0.20        0           1
  Health/beauty                             0.05        0.22        0           1
  Housewares/home furnishings               0.11        0.31        0           1
  Jewelry                                   0.02        0.14        0           1
  Mass merchant                             0.07        0.26        0           1
  Office supplies                           0.03        0.17        0           1
  Specialty/non-apparel                     0.08        0.28        0           1
  Sporting goods                            0.05        0.22        0           1
  Toys/hobbies                              0.03        0.16        0           1
Table 3. Baseline Model
Dependent Variable: ln(Net Organic Clicks on Google)

Variable                      (1)        (2)        (3)        (4)        (5)        (6)        (7)
Position on Google
  Poor                        1.307      1.328      1.279      1.258      1.263      1.221      1.400
                              (0.226)*   (0.235)*   (0.230)*   (0.234)*   (0.238)*   (0.204)*   (0.212)*
  Below Median                2.009      1.846      1.718      1.707      1.715      1.761      1.862
                              (0.220)*   (0.232)*   (0.229)*   (0.232)*   (0.237)*   (0.203)*   (0.208)*
  Median                      3.022      2.479      2.264      2.277      2.281      2.234      2.666
                              (0.223)*   (0.246)*   (0.250)*   (0.254)*   (0.259)*   (0.226)*   (0.225)*
  Above Median                3.886      2.993      2.773      2.811      2.828      2.881      3.412
                              (0.223)*   (0.262)*   (0.276)*   (0.280)*   (0.287)*   (0.250)*   (0.242)*
  Best                        5.543      3.925      3.557      3.499      3.502      3.475      4.398
                              (0.223)*   (0.300)*   (0.344)*   (0.349)*   (0.358)*   (0.311)*   (0.290)*
Name Recognition on Google
  Poor                                   0.691      0.729      0.703      0.731      0.738
                                         (0.224)*   (0.219)*   (0.223)*   (0.228)*   (0.201)*
  Below Median                           1.094      0.957      0.942      0.967      1.042
                                         (0.237)*   (0.234)*   (0.237)*   (0.242)*   (0.211)*
  Median                                 0.971      0.896      0.915      0.944      1.186
                                         (0.239)*   (0.234)*   (0.237)*   (0.242)*   (0.220)*
  Above Median                           1.045      0.810      0.894      0.928      1.183
                                         (0.259)*   (0.256)*   (0.259)*   (0.264)*   (0.245)*
  Best                                   2.119      1.858      1.958      1.980      1.952
                                         (0.299)*   (0.297)*   (0.301)*   (0.307)*   (0.282)*
ln(# Ads on Page 1)                                 0.253      0.253      0.249      0.289      0.379
                                                    (0.080)*   (0.081)*   (0.083)*   (0.072)*   (0.073)*
Social Network Presence                                        0.268      0.243      0.564      0.480
                                                               (0.221)    (0.225)    (0.197)*   (0.206)*
Retailer Age                                                              0.006      0.018      0.016
                                                                          (0.022)    (0.020)    (0.021)
Web Only Retailer                                                                    0.107      -0.206
                                                                                     (0.145)    (0.144)
Constant                      Yes        Yes        Yes        Yes        Yes        Yes        Yes
Retailer Category Indicators  No         No         No         No         No         Yes        Yes
Observations                  759        759        759        759        759        759        759
Pseudo R2                     0.35       0.38       0.39       0.39       0.39       0.43       0.40

Notes: Standard errors in parentheses. *significant at 5%.
Table 4: Alternative Measures of Name Recognition and Position
Dependent Variable: ln(Net Organic Clicks on Google)
Measure of Name Recognition: columns (1) and (2), Phrase Contains Name or Domain of Retailer; columns (3) and (4), Only Name or Domain of Retailer

Variable                      (1)        (2)        (3)        (4)
Position on Google
  Poor                        1.294                 1.363
                              (0.212)*              (0.206)*
  Below Median                1.699                 1.746
                              (0.210)*              (0.203)*
  Median                      2.473                 2.543
                              (0.231)*              (0.223)*
  Above Median                3.030                 3.136
                              (0.253)*              (0.244)*
  Best                        3.700                 3.868
                              (0.316)*              (0.305)*
ln(Page 1)                               0.855                 0.857
                                         (0.086)*              (0.088)*
Name Recognition on Bing
  Poor                        0.387      0.644      0.329      0.644
                              (0.241)    (0.243)*   (0.234)    (0.249)*
  Below Median                0.561      0.871      0.496      0.915
                              (0.247)*   (0.246)*   (0.238)*   (0.252)*
  Median                      0.576      0.911      0.697      0.905
                              (0.256)*   (0.259)*   (0.251)*   (0.268)*
  Above Median                0.874      0.757      0.836      0.943
                              (0.266)*   (0.268)*   (0.255)*   (0.272)*
  Best                        1.497      1.251      1.502      1.255
                              (0.298)*   (0.303)*   (0.287)*   (0.308)*
ln(# Ads on Page 1)           0.327      0.224      0.318      0.219
                              (0.075)*   (0.083)*   (0.073)*   (0.085)*
Social Network Presence       0.321      0.227      0.351      0.257
                              (0.207)    (0.210)    (0.201)    (0.215)
Retailer Age                  0.016      0.030      0.019      0.035
                              (0.021)    (0.021)    (0.020)    (0.022)
Web Only Retailer             -0.006     0.068      -0.013     0.030
                              (0.151)    (0.154)    (0.145)    (0.157)
Constant                      Yes        Yes        Yes        Yes
Retailer Category Indicators  Yes        Yes        Yes        Yes
Observations                  759        759        759        759
Pseudo R2                     0.42       0.39       0.42       0.38

Notes: Standard errors in parentheses. *significant at 5%.
Table 5: Specifications Controlling for Endogeneity of Position and Ads
Dependent Variable: ln(Net Organic Clicks on Google)

Variable                      (1)                (2)
                              Two-Stage          Two-Stage
                              Ordered Probit     Least Squares
Position on Google
  Poor                        -0.475
                              (0.293)
  Below Median                0.455
                              (0.274)
  Median                      0.825
                              (0.256)*
  Above Median                1.129
                              (0.270)*
  Best                        1.420
                              (0.252)*
ln(Page 1)                                       1.115
                                                 (0.190)*
Name Recognition on Bing
  Poor                        1.158              1.049
                              (0.256)*           (0.273)*
  Below Median                1.648              1.299
                              (0.227)*           (0.277)*
  Median                      1.916              1.325
                              (0.212)*           (0.296)*
  Above Median                2.149              1.277
                              (0.221)*           (0.317)*
  Best                        2.832              1.801
                              (0.328)*           (0.358)*
ln(# Ads on Page 1)           0.815              0.010
                              (0.070)*           (0.188)
Social Network Presence       0.093              0.245
                              (0.217)            (0.234)
Retailer Age                  0.069              0.028
                              (0.026)*           (0.024)
Web Only Retailer             0.101              0.022
                              (0.183)            (0.171)
Constant                      Yes                Yes
Retailer Category Indicators  Yes                Yes
Observations                  759                759
Pseudo R2                     0.53               0.53
Sargan Test (p-value)                            0.35

Notes: Standard errors in parentheses. *significant at 5%.
Table 6: Long-Run Effect of Name Recognition

                              (1)                (2)                (3)                (4)
Dependent Variable:           ln(Page 1)         ln(Net Organic Clicks Google)
                              OLS Position       OLS                OLS Residualized   IV Residualized
Variable                      Regression                            Regression         Regression
ln(Page 1)                                       0.881
                                                 (0.079)*
ln(Page 1) residual                                                 0.881              0.635
                                                                    (0.079)*           (0.220)*
Name Recognition on Bing
  Poor                        0.724              1.006              1.644              1.625
                              (0.141)*           (0.253)*           (0.241)*           (0.302)*
  Below Median                0.831              1.272              2.004              1.983
                              (0.139)*           (0.223)*           (0.215)*           (0.307)*
  Median                      1.235              1.337              2.424              2.410
                              (0.158)*           (0.205)*           (0.192)*           (0.331)*
  Above Median                1.834              1.330              2.945              2.891
                              (0.165)*           (0.221)*           (0.196)*           (0.376)*
  Best                        2.646              1.789              4.120              4.007
                              (0.207)*           (0.379)*           (0.317)*           (0.530)*
ln(# Ads on Page 1)                              0.292              0.292              0.342
                                                 (0.072)*           (0.072)*           (0.202)
Social Network Presence                          0.210              0.210              0.170
                                                 (0.220)            (0.220)            (0.235)
Retailer Age                                     0.034              0.034              0.046
                                                 (0.026)            (0.026)            (0.025)
Web Only Retailer                                0.008              0.008              0.055
                                                 (0.191)            (0.191)            (0.171)
Constant                      Yes                Yes                Yes                Yes
Retailer Category Indicators  No                 Yes                Yes                Yes
Observations                  759                759                759                759
Pseudo R2                     0.41               0.53               0.53               0.53
Sargan Test (p-value)                                                                  0.80

Notes: Standard errors in parentheses. *significant at 5%.
Table A1: Baseline Model
Dependent Variable: ln(Net Organic Clicks on Google)

Variable                      (1)        (2)        (3)        (4)        (5)        (6)        (7)
Position on Bing
  Poor                        4.504      3.811      3.733      3.733      3.733      2.613      3.508
                              (0.621)*   (0.526)*   (0.530)*   (0.531)*   (0.524)*   (0.510)*   (0.577)*
  Below Median                5.111      3.683      3.777      3.777      3.777      2.951      3.863
                              (0.567)*   (0.485)*   (0.491)*   (0.492)*   (0.487)*   (0.472)*   (0.530)*
  Median                      5.784      4.687      4.663      4.663      4.663      4.129      4.712
                              (0.589)*   (0.505)*   (0.513)*   (0.513)*   (0.508)*   (0.500)*   (0.558)*
  Above Median                6.883      4.982      5.013      5.021      5.021      3.829      5.049
                              (0.593)*   (0.533)*   (0.558)*   (0.559)*   (0.554)*   (0.551)*   (0.598)*
  Best                        8.738      6.122      6.122      6.122      6.122      5.267      6.607
                              (0.587)*   (0.580)*   (0.657)*   (0.657)*   (0.658)*   (0.654)*   (0.698)*
Name Recognition on Bing
  Poor                                   2.704      2.598      2.598      2.598      2.729
                                         (0.580)*   (0.594)*   (0.594)*   (0.587)*   (0.578)*
  Below Median                           2.267      2.183      2.036      2.174      2.284
                                         (0.576)*   (0.588)*   (0.588)*   (0.581)*   (0.577)*
  Median                                 1.882      1.819      1.811      1.811      2.210
                                         (0.590)*   (0.600)*   (0.600)*   (0.592)*   (0.599)*
  Above Median                           2.796      2.667      2.717      2.717      2.840
                                         (0.611)*   (0.625)*   (0.625)*   (0.619)*   (0.617)*
  Best                                   3.827      3.665      3.665      3.665      3.360
                                         (0.660)*   (0.696)*   (0.696)*   (0.688)*   (0.698)*
ln(# Ads on Page 1)                                 0.049      0.049      0.049      0.118      0.398
                                                    (0.194)    (0.194)    (0.192)    (0.188)    (0.205)
Social Network Presence                                        0.000      0.000      -0.046     0.079
                                                               (0.502)    (0.495)    (0.491)    (0.558)
Retailer Age                                                              0.000      0.016      -0.007
                                                                          (0.049)    (0.050)    (0.057)
Web Only Retailer                                                                    0.183      -0.398
                                                                                     (0.362)    (0.389)
Constant                      Yes        Yes        Yes        Yes        Yes        Yes        Yes
Retailer Category Indicators  No         No         No         No         No         Yes        Yes
Observations                  759        759        759        759        759        759        759
Pseudo R2                     0.23       0.31       0.31       0.31       0.31       0.33       0.27

Notes: Standard errors in parentheses. *significant at 5%.
Table A2: Alternative Measures of Name Recognition and Position
Dependent Variable: ln(Net Organic Clicks on Google)
Measure of Name Recognition: columns (1) and (2), Phrase Contains Name or Domain of Retailer; columns (3) and (4), Only Name or Domain of Retailer

Variable                      (1)        (2)        (3)        (4)
Position on Bing
  Poor                        1.461                 1.525
                              (0.395)*              (0.375)*
  Below Median                1.993                 2.121
                              (0.369)*              (0.349)*
  Median                      2.519                 2.693
                              (0.391)*              (0.370)*
  Above Median                2.297                 2.332
                              (0.434)*              (0.411)*
  Best                        3.813                 3.754
                              (0.519)*              (0.492)*
ln(Page 1)                               0.789                 0.803
                                         (0.180)*              (0.184)*
Name Recognition on Google
  Poor                        2.629      3.409      2.425      3.517
                              (0.388)*   (0.471)*   (0.369)*   (0.484)*
  Below Median                3.553      4.587      3.704      4.695
                              (0.394)*   (0.475)*   (0.373)*   (0.485)*
  Median                      3.647      5.068      3.667      4.938
                              (0.416)*   (0.499)*   (0.393)*   (0.509)*
  Above Median                3.762      5.234      3.780      5.088
                              (0.451)*   (0.535)*   (0.423)*   (0.543)*
  Best                        4.683      5.996      4.692      5.804
                              (0.513)*   (0.614)*   (0.484)*   (0.624)*
ln(# Ads on Page 1)           0.148      0.115      0.169      0.189
                              (0.144)    (0.176)    (0.136)    (0.180)
Social Network Presence       0.414      0.371      0.436      0.367
                              (0.381)    (0.465)    (0.362)    (0.476)
Retailer Age                  0.018      0.044      0.012      0.035
                              (0.039)    (0.047)    (0.037)    (0.049)
Web Only Retailer             0.166      0.311      0.107      0.076
                              (0.284)    (0.348)    (0.268)    (0.354)
Constant                      Yes        Yes        Yes        Yes
Retailer Category Indicators  Yes        Yes        Yes        Yes
Observations                  759        759        759        759
Pseudo R2                     0.35       0.34       0.35       0.33

Notes: Standard errors in parentheses. *significant at 5%.
Table A3: Specifications Controlling for Endogeneity of Position and Ads
Dependent Variable: ln(Net Organic Clicks on Google)

Variable                      (1)                (2)
                              Two-Stage          Two-Stage
                              Ordered Probit     Least Squares
Position on Bing
  Poor                        -0.210
                              (0.340)
  Below Median                -0.536
                              (0.350)
  Median                      0.724
                              (0.339)*
  Above Median                0.903
                              (0.305)*
  Best                        0.956
                              (0.325)*
ln(Page 1)                                       0.988
                                                 (0.212)*
Name Recognition on Google
  Poor                        1.590              1.455
                              (0.330)*           (0.303)*
  Below Median                2.321              2.178
                              (0.365)*           (0.307)*
  Median                      3.100              2.872
                              (0.320)*           (0.324)*
  Above Median                3.714              3.236
                              (0.338)*           (0.351)*
  Best                        5.141              4.032
                              (0.372)*           (0.415)*
ln(# Ads on Page 1)           0.471              0.168
                              (0.095)*           (0.226)
Social Network Presence       0.590              0.625
                              (0.325)            (0.300)*
Retailer Age                  0.056              0.020
                              (0.030)            (0.031)
Web Only Retailer             0.402              0.116
                              (0.228)            (0.228)
Constant                      Yes                Yes
Retailer Category Indicators  Yes                Yes
Observations                  759                759
Pseudo R2                     0.48               0.49
Sargan Test (p-value)                            0.39

Notes: Standard errors in parentheses. *significant at 5%.
Table A4: Long-Run Effect of Name Recognition

                              (1)                (2)                (3)                (4)
Dependent Variable:           ln(Page 1)         ln(Net Organic Clicks Google)
                              OLS Position       OLS                OLS Residualized   IV Residualized
Variable                      Regression                            Regression         Regression
ln(Page 1)                                       0.829
                                                 (0.111)*
ln(Page 1) residual                                                 0.829              0.371
                                                                    (0.111)*           (0.232)
Name Recognition on Google
  Poor                        0.260              1.516              1.732              1.625
                              (0.088)*           (0.324)*           (0.322)*           (0.308)*
  Below Median                0.517              2.286              2.715              2.465
                              (0.096)*           (0.349)*           (0.339)*           (0.321)*
  Median                      0.632              3.022              3.546              3.210
                              (0.103)*           (0.307)*           (0.297)*           (0.346)*
  Above Median                1.006              3.460              4.294              3.824
                              (0.113)*           (0.329)*           (0.309)*           (0.383)*
  Best                        1.940              4.414              6.023              5.142
                              (0.150)*           (0.398)*           (0.350)*           (0.508)*
ln(# Ads on Page 1)                              0.146              0.146              0.668
                                                 (0.105)            (0.105)            (0.233)*
Social Network Presence                          0.635              0.635              0.543
                                                 (0.320)*           (0.320)*           (0.304)
Retailer Age                                     0.027              0.027              0.042
                                                 (0.030)            (0.030)            (0.032)
Web Only Retailer                                0.201              0.201              0.300
                                                 (0.229)            (0.229)            (0.233)
Constant                      Yes                Yes                Yes                Yes
Retailer Category Indicators  No                 Yes                Yes                Yes
Observations                  759                759                759                759
Pseudo R2                     0.32               0.49               0.49               0.48
Sargan Test (p-value)                                                                  0.93

Notes: Standard errors in parentheses. *significant at 5%.