Why do Category Labels Stick? Unpacking the Innovation Paradox

Stine Grodal, Fernando Suarez, Diego Zunino
Boston University Questrom School of Business

ABSTRACT

Adding to the traditional studies of technology adoption by scholars in economics and management, scholars with socio-cognitive and institutional perspectives have recently proposed an apparent paradox in the adoption of new technologies: The need to balance the perceived familiarity and novelty of a new technology. This idea has become well accepted, but scant empirical research has been done to confirm or deepen our theorizing. By examining an important element of how innovations are understood, their related category labels, we show how adoption is affected by the degree to which a new product label is perceived as familiar or creative. Drawing on a large data set from the smartphone industry from 1998 to 2011, we track firms’ adoption of category labels over time. We find support for an inverted U-shape relationship between the firms’ decision of which label to adopt and the degree to which labels are perceived as familiar or creative. We confirm these results with an online experiment designed to address potential endogeneity concerns. Our paper expands our understanding of the innovation paradox and further builds theory around why some category labels stick while others are left in the dustbin of history.

INTRODUCTION

The question of how new technologies get adopted has been a key concern of scholars in technology management and economics for decades (Hannan and McDowell, 1984; Rogers, 2010; Majumdar and Venkataraman, 1998). A more recent literature has begun to explore the role that socio-cognitive and institutional factors play in the stakeholders’ framing and understanding of new technologies, and their consequent influence on how each technology fares in the market (Kennedy and Fiss, 2013). This literature has highlighted an apparent paradox that every new technology needs to overcome.
On the one hand, new technologies have to differentiate themselves from existing technological offerings; in other words, they must convey ideas of novelty, creativity, and originality. At the same time, however, new technologies have to appear familiar enough to stakeholders and invoke existing understandings, so as to minimize the natural reluctance toward something new and unproven (Hargadon and Douglas, 2001; Bingham and Kahl, 2013).

A growing body of research has examined the socio-cognitive aspects surrounding new technologies and markets (Rosa et al., 1999; Pontikes, 2012; Suarez, Grodal and Gotsopoulos, 2014). This research has stressed the role of categories and their associated labels in shaping stakeholders’ understanding and perception of new technologies and products (e.g. Navis and Glynn, 2010). Categories are defined as socially constructed partitions or taxonomies that divide the social space into groupings of objects perceived to be similar (Bowker and Star, 2000). Category labels, the words or phonemes used to reference objects that are perceived to be distinct or to belong together, are the first instantiation of categories. As such, they are one of the earliest factors that shape stakeholders’ understandings about a new product.

When introducing an innovation to the market, producers often lack terms with which to reference their new product (Kaplan and Tripsas, 2008). In order to communicate the meaning of their product to stakeholders, they often invent category labels, such as “pocket PC”, “camera phone” or “smartphone”. When confronted with a newly created category label, stakeholders have no prior experience with the kind of products that such a label may refer to; consequently, at the beginning, the meaning of most new category labels tends to be shallow.
Extant research has suggested that stakeholders begin to make sense of new labels by linking the label’s constituent parts via a relation to existing objects, or by projecting properties of a constituent (Wisniewski, 1996; Smith, Osherson, Rips and Keane, 1988). Through an iterative and negotiated process by which labels are adopted and used by different stakeholders, labels become infused with value and begin to be associated with pre-existing categories, and thus gradually become meaningful categories themselves (Peirce, 1931; Grodal et al., 2014). As the meaning of the new categories develops, they begin to form “rules” or “boundaries” that dictate which objects can claim membership in a given category (Hannan, Polos, and Carroll, 2007; Navis and Glynn, 2010). Category labels that do not get traction fail to get infused with meaning and are ultimately abandoned. As this process continues, emerging industries often coalesce around a dominant category, or “the conceptual schema that most stakeholders adhere to when referring to products that address similar needs and compete for the same market space” (Suarez et al., 2013).

Despite the importance that category labels have in shaping stakeholders’ understanding about new technologies, we know relatively little about the process by which some of these labels are selected over others and how they “emerge and fall out of use” (Kennedy and Fiss, 2013, p. 1). This is an important gap in our theory, since the performance of entrants into a new industry can depend on the categorical labels they choose when positioning their products (Rosa et al., 1999; Pontikes, 2012).

In this paper, we tackle this gap by presenting an empirical study of the creation and selection of categorical labels in what came to be known as the “smartphone” industry.
By collecting data from the official product launches of each smartphone in the period 1998 to 2011 in the U.S., we trace the categorical labels used by smartphone producers over time. Borrowing from linguistics and extant research on categories, we test and expand existing theory about why some labels tend to stick while others are abandoned. In particular, we revisit and test the idea that, to secure adoption, producers have to find ways to convey both notions of change and stability when introducing new products (Hargadon and Douglas, 2001). We posit that one of the key ways in which producers convey these notions to consumers and other stakeholders is through their choice of category labels.

The literature, however, has not been consistent or specific enough as to what kind of change and stability is needed to maximize adoption. Some authors seem to imply some type of trade-off between these dimensions (e.g. Bingham and Kahl, 2013), while other authors seem to suggest that both can be achieved at once (Hargadon and Douglas, 2001). Studies of adoption in other contexts that also use this change-stability distinction are equally unspecific. For instance, Uzzi et al. (2012), in one of the few large-scale empirical studies besides ours, use the opposite sides of a single measure to operationalize both constructs (in their terms, novelty and conventionality) in their study of how atypical combinations of knowledge affect the adoption (citations) of scientific papers, which would suggest some kind of trade-off. However, they then argue that the most-cited papers score high in both novelty and conventionality. McKinley et al. (1999), studying adoption of schools of thought in organizational theory, recognize the existence of a tension between continuity and novelty, but then go on to argue that adoption requires an “adequate” level of each (p. 637).

In this paper, we provide a fresh empirical analysis from which our existing theory and understanding can be improved.
By focusing on why some category labels are adopted more than others, we provide both granularity and specificity to the discussion of these important issues. In particular, we hypothesize a non-linear relationship between label adoption and a construct denoting stability and conventionality, which we hereafter refer to as label familiarity; we also hypothesize a non-linear relationship between label adoption and a construct denoting change and novelty, which we hereafter refer to as label creativity. Our results largely support our hypotheses, making this not only one of the first large empirical studies of category label adoption, but arguably the most detailed data-based work to date on the change-stability paradox. Moreover, we complement our econometric analysis by carrying out and reporting the results of an online experiment specially designed to provide further confirmation of our findings, and to address possible concerns about endogeneity in our regressions. The results of the experiment strongly confirm our empirical findings, making our paper also, to our knowledge, the first multiple-methods work on this topic.

THEORY AND HYPOTHESES

Categories are arguably the most widely researched construct among several that have been proposed to capture the socio-cognitive dimension of technology emergence and adoption, such as “field frames” (Lounsbury 2001; Lounsbury, Ventresca and Hirsch, 2003); “schemas” (Bingham and Kahl, 2013), and “technological frames” (Kaplan and Tripsas, 2008; Orlikowski and Gash, 1994). Categories help group similar objects and can indeed determine the set of characteristics that the objects belonging to a given category are expected to possess, and the elements that differentiate them (or not) from members of other categories (Vygotsky, 1986).
As the first instantiation of categories, category labels help make sense of new products by allowing stakeholders to develop semantic links to other categories and their associated labels, relating a label to other objects or concepts and borrowing from the inherited properties of the label components (Wisniewski, 1996). It is through this process that the “deepening” of meaning takes place (Grodal et al., 2014), and through which some of these labels eventually become established categories (Peirce, 1931; Bingham and Kahl, 2013). When stakeholders observe a category label, they construct the group of objects that they perceive as being associated with it (Yamauchi and Markman, 1998). While producers are the main creators of labels in the process of introducing their products to an emergent industry, labels are socially constructed, and thus other stakeholders such as users, industry analysts or observers, and bloggers can also create them. For instance, the “mountain bike” label was created by users who tweaked their standard bikes for racing down hills, the “robot” label was created by a writer of a science fiction book, and the label “impressionism” was created by art critics (originally with a negative connotation) to refer to the unconventional work of Monet and other painters of the time.

Extant research has sought to understand the extent to which firms spanning categorical boundaries face performance penalties (Zuckerman, 1999; Hsu et al., 2009), and the conditions under which spanning categorical boundaries might be permissible (Ruef and Patterson, 2009; Fleischer, 2009; Pontikes, 2012; Granqvist et al., 2013). Despite notable progress in our understanding of categories, scant research, theoretical or empirical, has been conducted to unveil the dynamics of competing categories and their associated labels; that is, the contestation process that leads to the adoption of some labels and the abandonment of others.
In order to tackle this challenge, in this study we test and extend the theoretical argument that, in order to drive higher adoption, category labels have to resolve the paradox of being simultaneously (1) creative enough to convey notions of novelty and change so as to attract the attention of stakeholders, and (2) familiar enough that they convey notions of continuity and stability, so as to be readily comprehensible. Similar versions of this broad argument have been proposed by a handful of authors. For instance, Hargadon and Douglas (2001) suggest that, “Without invoking existing understandings, innovations may never be understood and adopted in the first place. Yet, by hewing closely to existing institutions, innovators risk losing the valued details, representing the innovation’s true novelty, that ultimately change those institutions. Success, then, requires entrepreneurs to locate their ideas within the set of understandings and patterns of action …in order to gain initial acceptance, yet somehow retain the inherent differences” (p. 478).

While the general argument of a paradox is appealing, it is by no means widely accepted, particularly when it comes to the form and limits of the tension between the two constructs. Moreover, the argument has, to our knowledge, never been empirically tested. The existing literature has been rather unspecific and even in apparent disagreement with respect to the existence of a familiarity-creativity tension, and the role that it may take during technology adoption. The quote from Hargadon and Douglas above would suggest that the authors believe that familiarity and creativity can be achieved at the same time, and that having more of each is always beneficial for adoption. However, no conclusion can be drawn from their theorizing as to how much of each dimension maximizes adoption; for instance, should a technology sponsor always try to convey more familiarity and more creativity in their label to secure stronger adoption?
Bingham and Kahl (2013), in contrast, perceive a trade-off. They claim that their study resolves, “a conundrum related to the process of emergence – how to manage the simultaneous existence of two inconsistent states, familiarity and novelty” (p. 15). They propose that these “inconsistent states” can be overcome by focusing on one first (familiarity) and then the other (novelty). Studies of adoption that have used the familiarity-creativity dilemma in contexts other than technology are equally unspecific. McKinley et al. (1999) state that, “Our central thesis concerning novelty and continuity is that although there exist tension between them, adequate level of each are necessary [for adoption].” Not surprisingly, they give no clues to determine what an “adequate” level is.

We contribute to the existing literature by providing a fresh perspective on these open issues, based on a careful empirical study complemented by an online experiment that allows us to add precision to our theorizing. We focus on the structure and content of category labels in order to develop a more nuanced account of the familiarity and creativity constructs. In our view, these constructs are best understood through the prism of recombination. An established literature has shown that new creations are formed through the recombination of existing elements (Schumpeter, 1939; Wisniewski, 1996), and that the level of creativity in these recombinations can influence both the success of the innovation (Fleming, 2001; Fleming, Mingo, and Chen, 2007) and how widely it diffuses (Grodal and Thoma, 2014). A category label can be creative while still using very familiar words or phonemes.
For example, when Jake Burton Carpenter created the compound label “snowboard” in the 1970s, it was creative, because it combined two words, “snow” and “board,” that had seldom been used together, but at the same time it was familiar, because even stakeholders who were exposed to the compound for the first time were able to associate it with elements they knew and that helped them make sense of the new product. Indeed, compounds, defined as “the simple concatenation of any two or more nouns [or other words] functioning as a third nominal” (Downing, 1977: 810), are an important way for stakeholders to create category labels that: (a) build links to existing categories and meanings, thus invoking continuity and familiarity but, at the same time, (b) allow for novelty and change by supporting creative recombinations (Lieber, 1983; Wry, Lounsbury and Jennings, forthcoming).

A central tenet of our theorizing in this paper is that, when it comes to adoption, there are first increasing and later decreasing returns to both creativity and familiarity. In other words, the more familiar a category label, the higher the adoption, but only up to a point. Likewise, the more creative a category label is, the higher the adoption, but also only up to a point. Both the econometric analysis and the online experiment below are designed to test these hypotheses.

Labels become familiar by establishing links to existing categories and their associated labels. Category labels that fail to make these connections will most likely be abandoned because stakeholders will be confused about the labels’ meaning. As we noted before, the label “snowboard” became dominant in that industry and was widely adopted, as it represented a creative recombination of words that conveyed both change and familiarity. However, “snowboard” was not the label used by the first such product introduced to the market.
The first product was introduced using a different label, the “snurfer,” which failed to make enough connections to existing labels and categories and therefore was later abandoned in favor of “snowboard.” While the snurfer label drew from the words “snow” and “surfer,” these associations were disguised, making it hard for stakeholders encountering the label for the first time to make sense of it.

While being unfamiliar might be problematic, being too familiar has its disadvantages as well, because when category labels are perceived to be too familiar they become taken for granted and thereby fail to elicit interest and scrutiny by stakeholders (Hsu and Grodal, 2014). Too much familiarity may render the label obvious or uninteresting and thus fail to capture the attention of stakeholders. For instance, one of the early labels in the smartphone industry was “camera phone,” directly derived from the fact that one of the first technological features that the new type of phones added was a digital camera. The label was an obvious combination of two well-known labels that had existed for a long time, and as such did not capture major attention from stakeholders. We therefore hypothesize:

H1: There is an inverted U-shape relationship between the familiarity of a category label and its degree of adoption in an emerging industry.

A similar dynamic occurs with creativity in labels. While label creators should strive to have their labels be as novel and creative as possible so as to make them attractive to other stakeholders, the recombination of words in the creation of category labels (like the recombination of novel technologies) can be taken to an extreme: “The set of potential combinations and, a fortiori, the possible ways that each set of potential combinations can be combined has become essentially infinite” (Fleming, 2001).
In other words, label sponsors have limitless possibilities to come up with creative labels by recombining words and phonemes that have never been recombined before. However, if a label’s creativity is too high, it leads to inconsistent semantic connections to existing labels or categories, thus bringing confusion to stakeholders who encounter the label. Inconsistent semantic connections mean that stakeholders cannot easily find informational value in the creative combination of words, thus hampering their understanding. Labels that are too creative risk being difficult for stakeholders to comprehend and thus are likely to be abandoned. Consistent with our thinking here, Fleming (2001) finds that recombinations based on local search tend to be more successful than those based on more distant search. It follows that:

H2: There is an inverted U-shape relationship between the creativity of a category label and its degree of adoption in an emerging industry.

METHODS

Setting: The “Smartphone” Industry

We chose to study the smartphone industry because several characteristics make it a suitable context in which to test our hypotheses. First, this industry represents a market space that emerged recently, and therefore data can be retrieved on the entire set of category labels created by the key stakeholders. Second, many categorical labels were introduced during a relatively short period of time, which highlights the label contestation dynamics we are interested in. Third, due to the extensive and far-reaching technological possibilities offered by smartphones and the rapid pace of technological change, there was much categorical uncertainty, particularly in the early years: that is, the meaning of the many categorical labels introduced was often ambiguous and overlapping. It was not easy for customers and other stakeholders to make sense of the different categorical labels they were presented with.

Regression Analysis

In our regressions, the unit of analysis is label-year.
Because we deal with count data, a Poisson model is a natural starting point. The Poisson model assumes that the expected value and the variance are both equal to the rate parameter 𝜆, so that the dispersion index (the ratio of variance to expected value) equals one. This assumption can be too strict, so we relax it by using a Negative Binomial regression, a generalization of the Poisson model that does not require the variance to equal the expected value. The Negative Binomial distribution, however, assumes overdispersion; that is, the variance is larger than the expected value. The summary statistics (Table 3) lend face validity to the assumption of overdispersion, and we also report the estimate of ln(alpha) for each regression. In a Poisson model, alpha is constrained to zero, so the negative binomial adds little over a Poisson regression only when the estimated alpha is indistinguishable from zero. Note that an estimate of ln(alpha) that is not different from zero corresponds to alpha equal to 1, which still indicates overdispersion.

The distribution of the Negative Binomial is given by:

The negative binomial regression equation is then:

where X is the vector of time-invariant characteristics of the category label, and 𝜏 is a set of dummy variables, one for each year in the sample. We include both a linear and a quadratic term for familiarity and for originality.

DATA AND VARIABLES

To test our hypotheses, we constructed a unique dataset of category labels and their adoption in what we know today as the smartphone industry, which emerged in the late 1990s. We identified category labels by examining the press releases that smartphone manufacturers used when introducing their new products to the market. Press releases are a reliable record of the category labels chosen and used by different companies over time, since they represent the firms’ best efforts to communicate the position of their products vis-à-vis pre-existing market categories.
Indeed, press releases have already been used in extant research on categories; Pontikes (2012), for instance, reports that each producer issues more than one unique category label every year (mean 1.5, median 1.3). We coded 390 category labels from 382 press releases between 2000 and 2010, for a total of 1,924 label-year observations. Our dataset tracks category labels from 31 companies used to position 308 devices identified as belonging to what would later be commonly referred to as the “smartphone” category. The reference markets are the United Kingdom and the United States, and the time covered goes from 2000 to 2010.

Figure 1 shows the average number of labels per press release by company. The chart gives an idea of the different strategies followed by the firms entering the industry in the use of category labels. For instance, while Apple uses very few categorical labels (1.7 per press release), LG and Motorola do the opposite, using almost six category labels per press release. The figure also suggests that the number of category labels used by a producer is not necessarily correlated with the producer’s performance in the market. For example, Motorola and Nokia, among the companies using the largest number of category labels, failed to remain in the industry as independent companies and were later acquired by Google and Microsoft, respectively. Figure 2 shows that, by 2010, the last year in our data, the total number of labels used by producers is still increasing, suggesting that a dominant category (Suarez et al., 2013) has not yet emerged.

We collected press releases from different sources. Table 1 shows the relative importance of each of our sources: three quarters of the press releases come from smartphone manufacturers themselves, and the remaining quarter comes predominantly from operators (23.7%). The residual is divided between OS developers (0.8%) and retailers (0.5%).
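
For reference, the count model described under Regression Analysis can be written out explicitly. The following is a sketch under one standard parameterization of the negative binomial (often called NB2), which is an assumption and not necessarily the form the authors estimated. The symbols $\mu_{it}$, $\alpha$, $\beta$, $\gamma$, $F$, and $O$ are notation introduced here for illustration: $F_{i,t-1}$ and $O_{i,t-1}$ stand for the lagged familiarity and originality indices of label $i$, while $X_i$ and $\tau_t$ are the label characteristics and year dummies named in the text.

```latex
% Negative binomial (NB2) probability mass function;
% alpha > 0 is the overdispersion parameter (assumed parameterization).
\Pr(Y_{it} = y \mid \mu_{it}, \alpha)
  = \frac{\Gamma(y + \alpha^{-1})}{\Gamma(y + 1)\,\Gamma(\alpha^{-1})}
    \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu_{it}}\right)^{\alpha^{-1}}
    \left(\frac{\mu_{it}}{\alpha^{-1} + \mu_{it}}\right)^{y},
\qquad
\mathrm{E}[Y_{it}] = \mu_{it}, \quad
\mathrm{Var}[Y_{it}] = \mu_{it} + \alpha\,\mu_{it}^{2}.

% Log-linear conditional mean with linear and quadratic terms
% in the logged familiarity and originality indices.
\ln \mu_{it} = \beta_0
  + \beta_1 \ln F_{i,t-1} + \beta_2 \left(\ln F_{i,t-1}\right)^{2}
  + \beta_3 \ln O_{i,t-1} + \beta_4 \left(\ln O_{i,t-1}\right)^{2}
  + X_i \gamma + \tau_t .
```

Under this parameterization the Poisson model is the limiting case $\alpha \to 0$. A quadratic in $x = \ln F$ with $\beta_2 < 0$ peaks at $x^{*} = -\beta_1 / (2\beta_2)$, and analogously for originality; whether that point falls inside the observed range of the data is an empirical matter.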
*** INSERT FIGURE 1 HERE ***

*** INSERT TABLE 1 HERE ***

*** INSERT FIGURE 2 HERE ***

To construct indexes for the familiarity and originality of each label, we broke down the label into words. For labels with two or more words, we also deconstructed the compounds, even when the words were written together as one word. For instance, the label “smartphone” was deconstructed into the words “smart” and “phone”. In total, there were 206 words used to construct 390 category labels. We constructed a measure of a category label’s familiarity by counting the number of times that each word in the label appeared in the Factiva database of major US newspapers. For originality, given the computational requirements, we drew on data coming only from three major U.S. newspapers: The New York Times, The Wall Street Journal, and USA Today. We randomly selected one day per month for each year in our data collection period, excluding weekends because of possible differences in weekend editions. We then collected all the articles that were published by the three newspapers on that particular day: on average, each random day provided 400 to 450 articles, with exceptions for holidays, when the article count dropped to 190 to 320. Table 2 shows that we collected more than 4,500 articles per year over 11 years, with a total of 54,161 articles. Finally, we used MemeStat, a content-analysis software package, to obtain both counts of words and, in particular, counts of word co-occurrences at the paragraph level.

*** INSERT TABLE 2 HERE ***

Dependent Variables

Count of press releases citing the label. The dependent variable in our analysis is a measure of the level of adoption of a given category label. This measure provides greater granularity than alternative measures of category label success, such as survival (i.e. label still in use).
We compute a label’s count of press releases as the number of press releases by any smartphone producer that use a focal label, within the first year from the label’s first use in any manufacturer’s press release. The label showing the highest adoption under this measure was used in 82 press releases (in 2010), while many category labels were mentioned in just one press release.

Count of label citations. Counting press releases does not distinguish between a press release that contains just one mention of a given category label and another that uses the label several times. To take this difference into account, we use an alternative dependent variable that counts the total number of times a label is used in all the press releases by any smartphone producer, within the first year from the label’s first use by any manufacturer. This dependent variable can be thought of as measuring the intensity of use of a given category label.

Explanatory Variables

Two-word compound is a dummy variable that assumes the value of one if the category label is a recombination of two words (as in “pocket computer”), and zero otherwise. We created another dummy variable to control for the rather rare occurrence of compounds with three or more words (as in “electronic messaging device”).

Familiarity (t-1). We constructed a familiarity index by searching our corpus of newspaper articles for the frequency of each word used in the labels within any given year. We then used this metric to estimate how common each word was relative to other words used in the category labels for that year. For labels with more than one word, we averaged the frequency of each word in the label. To obtain a scale from 0 to 1, we divided the average frequency by the highest frequency in that year. Because of the shape of the distribution, we use the logarithm of the familiarity index in the regressions.
Given our hypotheses, familiarity enters the regressions both in linear and quadratic terms.

Originality (t-1). This variable represents, for any given category label, the level of originality in the recombination of words. We measured this by identifying how common it was for two or more words to appear together in the same press release or story for any given year. For example, consider this excerpt from The Wall Street Journal, July 10, 2007, which is part of our sample: “The colorful computers and the ads are part of an effort by Dell to redefine its brand, which had lost its focus in recent years as the PC business changed. Dell became famous with the direct-distribution sales model pioneered by founder Michael Dell, in which it sold computers over the phone and on the Internet. Instead of aiming to create an image for the brand, Dell's ads used to focus on the technical specifications of its computers, such as the speed of its processors.” In the excerpt above we would code the words “PC” and “phone” as co-occurring in a given paragraph. A high number of co-occurrences like this would signal that the combination of those two words is not that original. We measure originality as the difference between one and the ratio of the number of co-occurrences of two words over the number of times the two words are mentioned individually. In the denominator, we subtracted the intersection from the sum of the counts of the two words in order to avoid counting the intersection twice; the originality index is illustrated in Figure 3. For multiple-word labels, we computed all the possible two-word combinations within the label and then averaged them to calculate the originality ratio.

*** INSERT FIGURE 3 HERE ***

Category labels with more than two words are only 7% of the category label-year observations and just 17 of them receive between one and three mentions. For simplicity, given the minimal loss, we limit the analysis of originality to two-word category labels.
Theoretically, a value of originality equal to zero means that the words A and B are always mentioned together (always co-occur), while a value of originality equal to one means that the two words have not co-occurred in that year. Because of the shape of the distribution, we used the logarithm of originality in the regression. The originality index enters the regression both in linear and quadratic terms. As we did with familiarity, we performed the analysis of originality at the article and paragraph level using the three-newspaper sample, and also at the article level using the larger set of 69 newspapers.

Control Variables

We controlled for several other characteristics of category labels. We include a set of dummy variables to capture the number of words used in a label. We use dummy variables to control for the use of trademarks (e.g. Galaxy), the reference to a particular technology (e.g. LCD), to an operating system (e.g. Android), or to a technology generation (e.g. 3G). We also control for the use of suffixes such as “enabled”, “powered”, or “enhanced”. We take into account the age of the category label, defining it as the difference between the focal year t and the label’s year of introduction. Finally, in some robustness checks, we use a full set of dummies to capture the “head” of the category label, meaning “the word that is obligatory and controls the other words, its dependents. […] heads convey central pieces of information and their dependents contribute extra information” (Brown & Miller, 2013). For example, in the category label “smart device”, the word that controls the other word is “device”, while in “pocket PC” the head is “PC”. Some of these heads are repeated several times in our data, such as in “smart device,” “portable device,” “mobile device,” etc. When we control for the head, we look at the familiarity and the originality of the dependents, given a particular head. This is why this control enters as a robustness check.
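
The familiarity and originality indices described under Explanatory Variables can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (the paper reports obtaining the counts with MemeStat). The function names, the toy corpus, the treatment of words absent from the corpus, and the choice to scale familiarity by the highest word frequency in the year's counts are all hypothetical choices made for the example.

```python
from itertools import combinations


def familiarity_index(label_words, word_counts):
    """Average frequency of a label's words, scaled to the 0-1 range.

    word_counts maps each word to its frequency in one year's corpus.
    Assumption: the scaling constant is the highest word frequency
    found in word_counts for that year.
    """
    average = sum(word_counts.get(w, 0) for w in label_words) / len(label_words)
    return average / max(word_counts.values())


def pair_originality(paragraphs, word_a, word_b):
    """One minus co-occurrences over the union of individual mentions."""
    n_a = sum(1 for p in paragraphs if word_a in p)
    n_b = sum(1 for p in paragraphs if word_b in p)
    n_both = sum(1 for p in paragraphs if word_a in p and word_b in p)
    union = n_a + n_b - n_both  # subtract the intersection once
    if union == 0:
        # Assumption: words absent from the corpus count as fully original.
        return 1.0
    return 1.0 - n_both / union


def label_originality(paragraphs, label_words):
    """Average pair originality over all two-word combinations in a label."""
    pairs = list(combinations(label_words, 2))
    return sum(pair_originality(paragraphs, a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy corpus: each paragraph is the set of words it contains.
paragraphs = [
    {"pc", "phone", "computers"},
    {"phone", "internet"},
    {"pc", "business"},
    {"camera"},
]
word_counts = {"smart": 50, "phone": 150, "the": 200}

print(familiarity_index(["smart", "phone"], word_counts))
print(pair_originality(paragraphs, "pc", "phone"))
print(label_originality(paragraphs, ["camera", "phone"]))
```

In this toy corpus, "pc" and "phone" co-occur in one of the three paragraphs that mention either word, so their originality is 1 - 1/3. A single-word label has no word pairs, so `label_originality` is only meaningful for compounds, which matches the paper's decision to restrict the originality analysis to multi-word labels.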
RESULTS

Table 3 reports the descriptive statistics of the variables, while Table 4 reports the correlation matrix. Table 3 shows that category labels use on average a little more than three words. On average, the age of a category label is three and a quarter years, while the oldest category label is fourteen years old in 2010. Almost half of the labels reference technological features, 13% reference an operating system, 9% reference a technology generation, 7% contain a trademark term, and about 13% use the suffixes “enabled”, “powered”, or “enhanced”. From the correlation matrix we observe that familiarity and originality are negatively correlated; this correlation is attenuated when we take the logarithms in the regressions.

Table 5 shows the results for the test of Hypotheses 1 and 2. Models 1 and 2 include the control variables described in the section above. The two models differ in the dependent variable used: in Model 1 the dependent variable is the count of press releases citing the label, while in Model 2 it is the count of label citations. In the appendix (Table A1) we report the hierarchical models with progressive inclusion of controls. Models 3 and 4 add a set of dummies for the “head” nouns. We first analyze the control variables and then the explanatory variables.

Category labels with more words are associated with less adoption, as measured by either of the two dependent variables. Comparatively, two-word labels perform better than labels with more than two words: in all four models the negative coefficient associated with two-word labels is smaller in magnitude than that associated with labels of more than two words. The dummy for labels with a trademark is not significant in Models 1 and 2 but becomes significant in Models 3 and 4, when we control for the “head” nouns, suggesting an association with a lower level of adoption.
The same pattern holds for age: older labels tend to be adopted less, and this negative association is stronger after we control for the “head” nouns. There is no significant difference between category labels that refer to a technological generation and those that do not. Nevertheless, we observe a negative association between adoption and category labels that refer to a particular technology (e.g., Bluetooth, MMS, or megapixel camera). In contrast, category labels that refer to an operating system, and those that have “enabled”, “powered”, or “enhanced” as a suffix, show a positive association.

As for the explanatory variables, the first two rows of Table 5 show the results for familiarity. The linear term is positively related to the dependent variables, while the quadratic term is negative. The linear and quadratic terms are significant at the 1% and 5% levels, respectively, for the count of press releases citing the label, and even more significant (0.1% and 1%, respectively) for the count of label citations. Once we control for head nouns in Models 3 and 4, the quadratic term becomes larger and more significant, while the linear coefficient is smaller for the first dependent variable and larger for the second. Overall, these results provide substantial evidence that familiarity and the degree of adoption of a category label follow an inverted U-shaped relationship, which lends support to Hypothesis 1.

The third and fourth rows show the coefficients for originality. At first glance, one notices that the coefficients for the originality score are very large, even when using a logarithmic transformation. This is because of the skewness of the originality distribution: most of the scores lie between 71 and 100, with a large number of them between 99 and 100. The coefficients for the linear term of originality are positive and significant for each of the two dependent variables, at the 1% and 5% significance levels, respectively.
The coefficient for the quadratic term is negative and significant in both Models 1 and 2, at significance levels similar to those of the linear term. When controlling for the head nouns in Models 3 and 4, the coefficients become larger and more significant in both cases. Our results therefore provide evidence that originality and the degree of adoption of a category label have an inverted U-shaped relationship. This evidence supports Hypothesis 2.

In Table 6 we analyze how familiarity and originality interact. In the appendix (Table A2) we report the hierarchical models with progressive inclusion of controls. The four models in Table 6 have the same specification as those in Table 5, with the addition of the interaction terms, both linear and quadratic, between originality and familiarity (first two rows). We first describe the main effects, and then we describe the interaction.

The main effects for familiarity and originality show no change in the sign of the coefficients with respect to those of Table 5. Possibly due to multicollinearity, the significance of the linear coefficient for familiarity drops from 1% to 10% in Model 1, and from 0.1% to 1% in Model 2. With the addition of head dummies, the significance of familiarity drops in both linear and quadratic terms when compared to Table 5, for both dependent variables. The significance of originality does not seem to drop when compared to Table 5, and it even increases in the case of Model 2.

The interaction between familiarity and originality enters the regression in both linear and quadratic terms. The sign of the linear coefficient for the interaction is negative for all the model specifications in Table 6, and significant (at the 10% level in Model 1 and at the 5% level or better in the other models). The coefficient for the squared interaction term is positive and significant, again across all model specifications.
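As a back-of-the-envelope reading of the Table 5 main effects discussed above, one can locate the peak of each inverted U. The sketch below is an illustration added for clarity rather than an analysis reported in the study. It assumes that the quadratic regressor is the square of the logged score, so that with a linear coefficient b1 and a quadratic coefficient b2 the predicted log count peaks at -b1 / (2 * b2), holding the other covariates fixed. The coefficients are those of Model 1 in Table 5; the function and variable names are hypothetical.

```python
import math


def turning_point(b_linear, b_quadratic):
    """Return the value of x that maximizes b_linear * x + b_quadratic * x**2."""
    if b_quadratic >= 0:
        raise ValueError("an inverted U requires a negative quadratic coefficient")
    return -b_linear / (2.0 * b_quadratic)


# Model 1 of Table 5: (linear, quadratic) coefficients on the logged scores.
TABLE5_MODEL1 = {
    "familiarity": (0.748, -0.146),
    "originality": (1305.357, -144.983),
}

for name, (b1, b2) in TABLE5_MODEL1.items():
    peak_log = turning_point(b1, b2)
    # Undo the logarithm to express the peak on the scale of the raw score.
    print(f"{name}: peak at log score {peak_log:.3f}, raw score {math.exp(peak_log):.2f}")
```

Under these assumptions the peaks fall at a familiarity score of roughly 13 and an originality score of roughly 90, both inside the ranges reported in the summary statistics, which is what an interior maximum requires.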
Thus, overall we show evidence of a U-shaped relationship between the familiarity-originality interaction and our two measures of category label adoption. One way to interpret this result is to think of familiarity as mitigating the inverted U-shaped relationship between label originality and label traction.

These results hold overall under two robustness specifications, which we report in Tables 7 and 8. In Table 7, we select only one category label for each press release: the one most used, as measured by the frequency of use within the press release. With this specification, we omit other labels that are used but that can be considered secondary to a firm’s attempts to associate its product with a category. The number of observations drops from 1900 to 338. The specification of Models 1 and 2 is the same as in Models 1 and 2 of Table 5, while the specification of Models 3 and 4 is the same as in Models 1 and 2 of Table 6. When compared to the larger sample, the signs of the coefficients do not change. The coefficients for familiarity and originality still suggest an inverted U-shaped relationship with the level of adoption of a label, while the interaction terms still show a U-shaped relationship. The significance of the coefficients is still robust for familiarity, ranging from 5% to 0.1%. Originality suffers from a lack of power because of its skewness: when no interactions are added, it is significant only at the 10% level.

In Table 8, in a further robustness test, we dropped from the analysis any category label containing the words “phone” or “device” unless it was the only label used to describe the product in a given press release. These two words are among the most used in the press releases, but they differ from other label words insofar as “phone” refers to the previous (and broader) category, while “device” is a very generic term for a technology object. The specifications of Table 8 do not differ from those in Table 7.
Moreover, the signs in Table 8 are consistent with the signs in Tables 5 and 6, and with our hypotheses. It should be noted that the significance is lower for familiarity when there are no interactions; indeed, the quadratic term is not significant in Models 1 and 2. When interactions are added, however, the coefficients for familiarity become significant for both the linear and quadratic terms.

Figures 2 and 3 provide a graphical interpretation of the interaction between originality and familiarity. We first segmented familiarity into terciles, and labeled them “low”, “medium”, and “high” familiarity, respectively. In Figure 2 we show the quadratic fit of the count of press releases citing the label over the values of originality for each of the three familiarity terciles. The figure shows that the highest predicted count of press releases for a category label occurs when originality and familiarity are both “medium”. Moreover, the shape of the curve is more skewed in the case of medium familiarity, and almost flat in the case of high familiarity. This suggests that, when the familiarity of the label words is high, increasing originality produces a small but monotonic positive effect on the traction of the label. This result does not change when we use the alternative dependent variable. Therefore, consistent with our theory, we find evidence that the highest counts appear when both originality and familiarity are high but not too high.

DISCUSSION AND FINAL REMARKS

Our research augments our understanding of the socio-cognitive dimension of industry emergence. Drawing from the literatures on categories and industry evolution, we present an in-depth look at, and an empirical test of, the socio-cognitive dynamics that take place as an industry develops.
Our study shows that, as theorized by recent research (see, for instance, Suarez et al., 2014), an initial period of categorical divergence, in which the number of categories in use increases over time, is followed by a period of convergence, in which the number of categories in use is reduced; this pattern is illustrated in Figure 4. In doing so, our study begins to answer the call by Kennedy and Fiss (2013) to study the process through which categories are created, adopted, and fall out of use.

*** INSERT FIGURE 4 HERE ***

We focus on category labels because they represent the first instantiation of categories. During the early period of industry evolution, producers introduce new category labels as they struggle to find the right term to describe their innovative products. As the industry evolves, producers gradually cease to introduce new category labels and instead converge on using the same category label(s). In this study we set out to understand why some category labels are adopted while others are abandoned. We argue that, in order to be successful, category labels have to overcome an inherent tension between familiarity and originality. Drawing from recent literature (Kennedy and Fiss, 2013; Grodal et al., 2014), we hypothesized a non-linear, inverted U-shaped relationship between adoption and both familiarity and originality. In other words, successful category labels are familiar, but not so familiar as to be uninteresting, and original, but not so original that stakeholders cannot relate to them.

Our results, using data we collected on what is today called the smartphone industry, largely support our hypotheses. Indeed, smartphone, the “dominant category” (Suarez et al., 2013) that emerged in this industry, outcompeted more than 200 different labels that firms used to refer to their products. Moreover, as Table 4 shows, the smartphone label falls in the sweet spot of familiarity and originality that we propose.
The arguments and results of this article have direct and important implications for firms’ strategies. It is likely that firms, when introducing their products, are not fully aware of the socio-cognitive dynamics that take place in their emergent industry, let alone their consequences. As existing research has shown (Zuckerman, 1999; Pontikes, 2012), choosing a categorical positioning that is not consistent with what customers and other stakeholders begin to accept as the major categories in an industry can have important consequences for the success of the firm’s products and the firm’s overall performance. While it is always possible for a firm to reposition its products using a category label different from the one used when introducing the product, making such changes is costly. At the end of the day, even the most powerful firms have to conform to the dominant category once it has emerged. In the smartphone industry, for instance, Apple resisted for years the use of the “smartphone” label in its communications and advertising, emphasizing the power and customer awareness of its “iPhone,” first launched in 2007. However, by 2012 Apple had begun using the “smartphone” category label to refer to the iPhone. Firms have to conform to the dominant category label because, if they do not, they run the risk of not being in the consideration set of customers when they look for a product in that category. When meaningful categories form from category labels, they create “rules of membership” or “boundaries” (Navis and Glynn, 2010), that is, rules by which stakeholders determine which products belong to the category and which do not, and these rules can be sharply delineated and unforgiving.

The implication of our study is, therefore, that firms should not only pay more attention to the labels they create and use when introducing their products, but also closely follow the evolution of other category labels in the industry.
Balancing familiarity and originality is not trivial, and there may be different strategies open to firms, such as hedging by trying to position an early product in more than one category while learning and collecting information about which labels seem to work best. It is also clear from our results that firms are better off using two-word compounds when creating category labels than longer compounds or single words. Compounds make it easier for firms to overcome the tension between originality and familiarity than single words do, as one word in the compound can provide the familiar link while the other provides an original twist. Moreover, two familiar words that are not often used together may provide the originality needed for a label to succeed.

Despite our contributions, there are still several limitations to our study. First, we have only studied what drives the adoption of category labels, and have not explored in our regressions the link to product or firm performance. Given that many of the firms competing in the smartphone industry are large firms that produce many different products for different industries, it has not yet been possible to obtain a reliable measure of smartphone performance for all firms in the sample. It would be beneficial for future research to address this relationship. Nevertheless, the increased understanding of the dynamics of category creation and adoption that our study provides is an important issue in its own right. Second, one may question the external validity of our study, because we have only studied one industry. However, the dynamics of categorical evolution have been investigated in other industries (e.g., Rosa et al., 1999; Pontikes, 2012) and, while those studies do not contain the kind of empirical analysis that we present here, the basic pattern of category creation and adoption seems to be similar to what we observed and documented for the smartphone industry.

LIST OF TABLES AND FIGURES

Figure 1.
Average number of category labels used by the different companies in the study, 2000-2010

[Bar chart not reproduced. Horizontal scale: 0 to 8 labels. Companies shown: Kyocera, LG, Motorola, Huawei, Siemens, HTC, Nokia, ZTE, Sendo, AlphaCell, Samsung, HP, Pantech & Curitel, Hitachi, GarminAsus, Ericsson, Dell Inc, ALVA, RIM, Palm, Sony Ericsson, ASUS, Sagem, Google, Danger, Audiovox, Apple, Novatel Wireless.]

Figure 2. Number of labels used by all smartphone producers each year, 2000-2010

[Chart not reproduced. Vertical scale: 0 to 120 labels. Horizontal axis: years 2000 to 2010.]

Figure 3. Label Adoption over Time: Ten Labels (percent of total adoption per year)

Table 1. Sources of Smartphone Press Releases in US and UK, 2000-2010

| Source | Share |
|---|---|
| Manufacturer | 68.4% |
| Operator | 23.4% |
| OS developer | 0.8% |
| Manufacturer and Operator | 5.0% |
| Manufacturer and OS developer | 0.8% |
| Operator and OS developer | 0.3% |
| Manufacturer, Operator, and OS developer | 0.8% |
| Retailer | 0.5% |
| Total | 100% |

Table 2. Number of random articles considered each year

| Year | Count of Articles | Share |
|---|---|---|
| 1999 | 4709 | 9% |
| 2000 | 4279 | 8% |
| 2001 | 4861 | 9% |
| 2002 | 4555 | 8% |
| 2003 | 4866 | 9% |
| 2004 | 4319 | 8% |
| 2005 | 4492 | 8% |
| 2006 | 4275 | 8% |
| 2007 | 4285 | 8% |
| 2008 | 4975 | 9% |
| 2009 | 4376 | 8% |
| 2010 | 4382 | 8% |
| Total | 54,374 | 100% |

Notes: Random draws from Factiva.

Table 3. Summary Statistics

| Variable | N | Mean | S.D. | Min | Max |
|---|---|---|---|---|---|
| Number of Articles | 1924 | 0.81 | 3.34 | 0 | 84 |
| Cites per label | 1924 | 2.14 | 13.22 | 0 | 345 |
| Familiarity_{t-1} | 1924 | 12.40 | 15.4 | 0.012 | 100 |
| Originality_{t-1} | 1900 | 98.93 | 1.62 | 77.22 | 100 |
| Age | 1924 | 3.25 | 2.96 | 0 | 14 |
| Number of words | 1924 | 3.19 | 1.32 | 1 | 9 |
| Trademark | 1924 | 0.07 | 0.24 | 0 | 1 |
| Generation | 1924 | 0.09 | 0.29 | 0 | 1 |
| Technology | 1924 | 0.51 | 0.50 | 0 | 1 |
| Operating System | 1924 | 0.13 | 0.34 | 0 | 1 |
| “Enabled” suffix | 1924 | 0.13 | 0.34 | 0 | 1 |

Notes: Summary Statistics of Category Labels between 2000 and 2010, which appeared at least in a given year.

Table 4.
Correlation Matrix

| Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (1) Number of Articles | 1 | | | | | | | | | | | |
| (2) Cites per label | 0.873 | 1 | | | | | | | | | | |
| (3) Familiarity_{t-1} | -0.01 | -0.026 | 1 | | | | | | | | | |
| (4) Originality_{t-1} | -0.03 | -0.004 | -0.583 | 1 | | | | | | | | |
| (5) Age | -0.163 | -0.164 | 0.148 | -0.111 | 1 | | | | | | | |
| (6) Number of words | -0.03 | -0.024 | 0.106 | -0.018 | -0.125 | 1 | | | | | | |
| (7) Trademark | 0.147 | 0.151 | 0.085 | -0.042 | -0.072 | 0.087 | 1 | | | | | |
| (8) Generation | -0.033 | -0.036 | -0.057 | 0.055 | 0.104 | 0.024 | -0.086 | 1 | | | | |
| (9) Technology | -0.121 | -0.117 | -0.128 | 0.109 | 0.415 | -0.085 | -0.117 | 0.308 | 1 | | | |
| (10) Operating System | -0.019 | -0.03 | 0.006 | -0.05 | 0.285 | -0.096 | -0.095 | 0.047 | 0.381 | 1 | | |
| (11) “Enabled” suffix | -0.04 | -0.045 | -0.101 | 0.061 | 0.313 | -0.095 | -0.092 | -0.025 | 0.284 | 0.332 | 1 | |
| (12) Year | 0.015 | 0.015 | -0.054 | 0.101 | 0.018 | -0.037 | 0.253 | 0.076 | 0.084 | 0.082 | 0.078 | 1 |

Notes: Summary Statistics of Category Labels between 2000 and 2010, which appeared at least in a given year.

Table 5. Test of Familiarity and Originality on Count

| Dependent variable | (1) Count of press releases citing the label | (2) Count of label citations | (3) Count of press releases citing the label, head dummies | (4) Count of label citations, head dummies |
|---|---|---|---|---|
| Log(Familiarity_{t-1}) | 0.748** (0.28) | 0.700*** (0.16) | 0.587** (0.18) | 0.851** (0.27) |
| Log(Familiarity_{t-1})^2 | -0.146* (0.07) | -0.119** (0.04) | -0.148*** (0.04) | -0.207*** (0.06) |
| Log(Originality_{t-1}) | 1305.357** (487.76) | 1183.899* (548.45) | 1495.177*** (282.65) | 1562.395*** (352.03) |
| Log(Originality_{t-1})^2 | -144.983** (54.14) | -131.576* (60.96) | -165.827*** (31.28) | -173.405*** (38.94) |
| # of Words: 2 words | -2.168** (0.71) | -2.845*** (0.81) | -2.430*** (0.59) | -3.239*** (0.68) |
| 3 words | -3.225*** (0.67) | -4.512*** (0.71) | -3.520*** (0.43) | -4.894*** (0.45) |
| 4 words | -3.787*** (0.66) | -5.177*** (0.72) | -4.135*** (0.56) | -5.661*** (0.57) |
| 5 words | -4.362*** (0.67) | -5.954*** (0.73) | -4.639*** (0.56) | -6.222*** (0.59) |
| 6 words | -4.273*** (0.71) | -5.886*** (0.78) | -4.721*** (0.54) | -6.550*** (0.54) |
| 7 words | -3.522** (1.25) | -5.085*** (1.34) | -3.846*** (0.93) | -5.405*** (1.24) |
| 8 words | -4.865*** (0.70) | -6.549*** (0.83) | -5.277*** (0.33) | -7.072*** (0.39) |
| 9 words | -4.470*** (0.81) | -6.159*** (0.95) | -4.841*** (0.76) | -6.781*** (0.82) |
| Trademark(=1) | -0.044 (0.26) | -0.020 (0.30) | -0.337** (0.10) | -0.350* (0.15) |
| Age | -0.067 (0.04) | -0.088* (0.04) | -0.075*** (0.01) | -0.111*** (0.02) |
| Generation(=1) | 0.218 (0.19) | 0.239 (0.24) | 0.120 (0.23) | 0.078 (0.19) |
| Technology(=1) | -0.529** (0.18) | -0.706** (0.25) | -0.707** (0.22) | -0.914*** (0.24) |
| Operating System(=1) | 0.613** (0.21) | 0.914*** (0.24) | 0.698*** (0.17) | 0.855*** (0.16) |
| Enabled suffix | 0.528* (0.22) | 0.584* (0.25) | 0.556** (0.18) | 0.741** (0.23) |
| Year: 2001 | 0.596 (0.43) | 0.813+ (0.45) | 0.574 (0.84) | 0.619 (1.06) |
| 2002 | 1.075* (0.43) | 1.194* (0.48) | 0.856 (0.91) | 0.605 (1.09) |
| 2003 | 1.049** (0.40) | 1.028* (0.44) | 0.759 (0.85) | 0.404 (0.96) |
| 2004 | 1.028** (0.39) | 0.992* (0.43) | 0.700 (0.86) | 0.398 (1.00) |
| 2005 | 1.147** (0.38) | 1.304** (0.45) | 0.673 (0.96) | 0.399 (1.22) |
| 2006 | 0.670+ (0.40) | 0.816+ (0.45) | 0.150 (0.85) | -0.185 (1.07) |
| 2007 | 0.745+ (0.39) | 0.622 (0.44) | 0.272 (0.88) | -0.213 (1.09) |
| 2008 | 0.570 (0.40) | 0.423 (0.44) | 0.070 (0.88) | -0.415 (1.12) |
| 2009 | 0.592 (0.41) | 0.536 (0.45) | 0.015 (1.07) | -0.424 (1.33) |
| 2010 | 1.237** (0.43) | 1.276* (0.51) | 0.675 (1.04) | 0.313 (1.31) |
| Head Dummies | No | No | Yes | Yes |
| Ln(alpha) | 0.897*** (0.10) | 1.440*** (0.09) | 0.628*** (0.11) | 1.210*** (0.09) |
| N | 1900 | 1900 | 1900 | 1900 |

Notes: Negative Binomial Regression with label-clustered standard errors for models 1 and 2, and head-clustered standard errors for models 3 and 4, in parentheses. Significance levels: + p<0.1, * p<0.05, ** p<0.01, *** p<0.001

Table 6.
Test of Interaction between Familiarity and Originality on Count

| Dependent variable | (1) Count of press releases citing the label | (2) Count of label citations | (3) Count of press releases citing the label, head dummies | (4) Count of label citations, head dummies |
|---|---|---|---|---|
| Log(Familiarity_{t-1}) * Log(Originality_{t-1}) | -36.482+ (19.68) | -61.038** (22.25) | -16.469* (8.03) | -24.091* (10.80) |
| Log(Familiarity_{t-1})^2 * Log(Originality_{t-1})^2 | 0.824* (0.37) | 1.280** (0.42) | 0.405** (0.14) | 0.502* (0.21) |
| Log(Familiarity_{t-1}) | 168.524+ (90.62) | 281.712** (102.44) | 76.194* (37.03) | 111.620* (49.75) |
| Log(Familiarity_{t-1})^2 | -17.557* (7.78) | -27.226** (8.80) | -8.671** (3.02) | -10.800* (4.37) |
| Log(Originality_{t-1}) | 1567.068** (577.63) | 1774.230** (682.83) | 1473.929*** (374.85) | 1700.132*** (446.82) |
| Log(Originality_{t-1})^2 | -169.991** (62.13) | -189.311** (73.44) | -161.983*** (40.26) | -185.571*** (47.94) |
| Control Variables | Yes | Yes | Yes | Yes |
| Year Dummies | Yes | Yes | Yes | Yes |
| Head Dummies | No | No | Yes | Yes |
| Ln(alpha) | 0.987*** (0.12) | 1.535*** (0.11) | 0.715*** (0.11) | 1.294*** (0.13) |
| N | 1900 | 1900 | 1900 | 1900 |

Notes: Negative Binomial Regression with label-clustered standard errors in parentheses for models 1 and 2, head-clustered standard errors in parentheses for models 3 and 4. Control Variables are Derivation, Trademark, Generation, “Enabled” suffix, and Operating System dummy variables, Age, and a set of dummy variables for the number of words. Significance levels: + p<0.1, * p<0.05, ** p<0.01, *** p<0.001

Table 7.
Robustness Check: Only principal labels per press release

| Dependent variable | (1) Count of press releases citing the label | (2) Count of label citations | (3) Count of press releases citing the label | (4) Count of label citations |
|---|---|---|---|---|
| Log(Familiarity_{t-1}) * Log(Originality_{t-1}) | | | -48.373* (21.18) | -87.770** (29.69) |
| Log(Familiarity_{t-1})^2 * Log(Originality_{t-1})^2 | | | 1.246* (0.55) | 2.314** (0.79) |
| Log(Familiarity_{t-1}) | 2.344** (0.90) | 3.417*** (1.02) | 223.989* (97.13) | 405.975** (136.28) |
| Log(Familiarity_{t-1})^2 | -0.381* (0.18) | -0.559** (0.19) | -26.521* (11.68) | -49.198** (16.69) |
| Log(Originality_{t-1}) | 587.981+ (312.53) | 828.391+ (428.80) | 941.923* (378.18) | 1464.964** (532.49) |
| Log(Originality_{t-1})^2 | -65.962+ (35.24) | -92.625+ (48.33) | -100.096* (40.68) | -154.093** (57.10) |
| Control Variables | Yes | Yes | Yes | Yes |
| Year Dummies | Yes | Yes | Yes | Yes |
| Ln(alpha) | 0.743* (0.32) | 1.583*** (0.20) | 0.737* (0.33) | 1.561*** (0.20) |
| N | 338 | 338 | 338 | 338 |

Notes: Negative Binomial Regression with label-clustered standard errors in parentheses. Control Variables are Derivation, Trademark, Generation, “Enabled” suffix, and Operating System dummy variables, Age, and a set of dummy variables for the number of words. Significance levels: + p<0.1, * p<0.05, ** p<0.01, *** p<0.001

Table 8. Robustness Check: Labels excluding “phone” and “device” if not used as only label

| Dependent variable | (1) Count of press releases citing the label | (2) Count of label citations | (3) Count of press releases citing the label | (4) Count of label citations |
|---|---|---|---|---|
| Log(Familiarity_{t-1}) * Log(Originality_{t-1}) | | | -55.383* (25.18) | -80.453** (26.36) |
| Log(Familiarity_{t-1})^2 * Log(Originality_{t-1})^2 | | | 1.195** (0.46) | 1.648*** (0.49) |
| Log(Familiarity_{t-1}) | 0.541+ (0.30) | 0.493** (0.16) | 255.283* (115.94) | 370.889** (121.37) |
| Log(Familiarity_{t-1})^2 | -0.098 (0.08) | -0.068 (0.05) | -25.351** (9.72) | -34.970*** (10.30) |
| Log(Originality_{t-1}) | 1502.156* (613.03) | 1301.807* (659.39) | 2001.157** (768.51) | 2217.440** (812.65) |
| Log(Originality_{t-1})^2 | -166.564* (68.02) | -144.480* (73.31) | -215.210** (82.29) | -235.331** (87.14) |
| Control Variables | Yes | Yes | Yes | Yes |
| Year Dummies | Yes | Yes | Yes | Yes |
| Ln(alpha) | 0.978*** (0.14) | 1.578*** (0.12) | 0.949*** (0.15) | 1.541*** (0.12) |
| N | 1875 | 1875 | 1875 | 1875 |

Notes: Negative Binomial Regression with label-clustered standard errors in parentheses. Control Variables are Derivation, Trademark, Generation, “Enabled” suffix, and Operating System dummy variables, Age, and a set of dummy variables for the number of words. Significance levels: + p<0.1, * p<0.05, ** p<0.01, *** p<0.001

Figure 4. Mediation of familiarity on originality (count of press releases citing the label as dependent variable)

Figure 5. Mediation of familiarity on originality (count of label citations as dependent variable)

REFERENCES

Anderson, P., and Tushman, M. L. 1990. Technological discontinuities and dominant designs: A cyclical model of technological change. Administrative Science Quarterly, 604-633.
Berger, J. L., Addis, M. E., Reilly, E. D., Syzdek, M. R., and Green, J. D. 2012. Effects of gender, diagnostic labels, and causal theories on willingness to report symptoms of depression. Journal of Social and Clinical Psychology, 31(5), 439-457.
Bingham, C. B., and Kahl, S. J.
2013. The process of schema emergence: Assimilation, deconstruction, unitization and the plurality of analogies. Academy of Management Journal, 56(1), 14-34.
Bowker, G. C., and Star, S. L. 2000. Invisible mediators of action: Classification and the ubiquity of standards. Mind, Culture, and Activity, 7(1-2), 147-163.
Brown, K., and Miller, J. 2013. The Cambridge Dictionary of Linguistics. Cambridge University Press.
Downing, P. 1977. On the creation and use of English compound nouns. Language, 810-842.
Fleischer, A. 2009. Ambiguity and the equity of rating systems: United States brokerage firms, 1995-2000. Administrative Science Quarterly, 54(4), 555-574.
Fleming, L. 2001. Recombinant uncertainty in technological search. Management Science, 47(1), 117-132.
Fleming, L., Mingo, S., and Chen, D. 2007. Collaborative Brokerage, Generative Creativity, and Creative Success. Administrative Science Quarterly, 52(3), 443-475.
Granqvist, N., Grodal, S., and Woolley, J. L. 2013. Hedging Your Bets: Explaining Executives' Market Labeling Strategies in Nanotechnology. Organization Science, 24(2), 395-413.
Grodal, S., and Thoma, G. 2014. Cross-pollination in science and technology: concept mobility in the nanobiotechnology field. Annals of Economics and Statistics, Forthcoming.
Grodal, S., Gotsopoulos, A., and Suarez, F. 2014. The Co-evolution of Technologies and Categories during Industry Emergence. Academy of Management Review.
Hannan, M. T., Polos, L., and Carroll, G. R. 2007. Logics of organization theory: Audiences, codes and ecologies. Princeton and Oxford: Princeton University Press.
Hannan, T. H., and McDowell, J. M. 1984. The determinants of technology adoption: The case of the banking firm. The RAND Journal of Economics.
Hargadon, A. B., and Douglas, Y. 2001. When innovations meet institutions: Edison and the design of the electric light. Administrative Science Quarterly, 46(3), 476-501.
Hsu, G. 2006. Evaluative schemas and the attention of critics in the US film industry. Industrial and Corporate Change, 15(3), 467-496.
Hsu, G., and Grodal, S. 2013. Category Taken-for-grantedness as a Strategic Opportunity: The Case of Light Cigarettes, 1964-1993. Working Paper.
Hsu, G., Hannan, M. T., and Koçak, Ö. 2009. Multiple category memberships in markets: An integrative theory and two empirical tests. American Sociological Review, 74(1), 150-169.
Kaplan, S., and Tripsas, M. 2008. Thinking about technology: Applying a cognitive lens to technical change. Research Policy, 37(5), 790-805.
Kennedy, M. T. 2008. Getting counted: Markets, media, and reality. American Sociological Review, 73(2), 270-295.
Kennedy, M. T., and Fiss, P. C. 2013. An ontological turn in categories research: from standards of legitimacy to evidence of actuality. Journal of Management Studies, 50(6), 1138-1154.
Lieber, R. 1983. Argument linking and compounds in English. Linguistic Inquiry, 251-285.
Lounsbury, M. 2001. Institutional sources of practice variation: Staffing college and university recycling programs. Administrative Science Quarterly, 46(1), 29-56.
Lounsbury, M., and Ventresca, M. 2003. The new structuralism in organizational theory. Organization Science, 10(3), 457-480.
Lounsbury, M., Ventresca, M., and Hirsch, P. M. 2003. Social movements, field frames and industry emergence: a cultural-political perspective on US recycling. Socio-Economic Review, 1(1), 71-104.
Majumdar, S. K., and Venkataraman, S. 1998. Network effects and the adoption of new technology: Evidence from the U.S. telecommunications industry. Strategic Management Journal, 19, 1045-1062.
Navis, C., and Glynn, M. A. 2010. How new market categories emerge: Temporal dynamics of legitimacy, identity, and entrepreneurship in satellite radio, 1990-2005. Administrative Science Quarterly, 55(3), 439-471.
Negro, G., Hannan, M. T., and Rao, H. 2011. Category reinterpretation and defection: Modernism and tradition in Italian winemaking. Organization Science, 22(6), 1449-1463.
Orlikowski, W. J., and Gash, D. C. 1994. Technological frames: making sense of information technology in organizations. ACM Transactions on Information Systems (TOIS), 12(2), 174-207.
Peirce, C. S. 1931. Collected papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press.
Pieters, R., Warlop, L., and Wedel, M. 2002. Breaking Through the Clutter: Benefits of Advertisement Originality and Familiarity for Brand Attention and Memory. Management Science, 48(6), 765-781.
Pontikes, E. G. 2012. Two Sides of the Same Coin: How Ambiguous Classification Affects Multiple Audiences’ Evaluations. Administrative Science Quarterly, 57(1), 81-118.
Rosa, J. A., Porac, J. F., Runser-Spanjol, J., and Saxon, M. S. 1999. Sociocognitive Dynamics in a Product Market. Journal of Marketing, 63(4).
Ruef, M., and Patterson, K. 2009. Credit and classification: The impact of industry boundaries in nineteenth-century America. Administrative Science Quarterly, 54(3), 486-520.
Santos, F. M., and Eisenhardt, K. M. 2009. Constructing markets and shaping boundaries: Entrepreneurial power in nascent fields. Academy of Management Journal, 52(4), 643-671.
Schumpeter, J. A. 1939. Business cycles (Vol. 1, pp. 161-74). New York: McGraw-Hill.
Seidel, V. P., and O'Mahony, S. 2014. Managing the Repertoire: Stories, Metaphors, Prototypes, and Concept Coherence in Product Innovation. Organization Science.
Smith, E. E., Osherson, D. N., Rips, L. J., and Keane, M. 1988. Combining prototypes: A modification model. Cognitive Science, 12, 485-527.
Suarez, F. F., Grodal, S., and Gotsopoulos, A. 2013. Perfect Timing? Dominant Category, Dominant Design, and the Window of Opportunity for Firm Entry. Strategic Management Journal.
Utterback, J. M. 1996. Mastering the dynamics of innovation. Harvard Business Press.
Wisniewski, E. J. 1996. Construal and similarity in conceptual combination. Journal of Memory & Language, 35, 434-453.
Wry, T., Lounsbury, M., and Jennings, P. D. 2013.
Hybrid Vigor: Securing Venture Capital by Spanning Categories in Nanotechnology. Academy of Management Journal.
Yamauchi, T., and Markman, A. B. 1998. Category learning by inference and classification. Journal of Memory and Language, 39(1), 124-148.
Zuckerman, E. W. 1999. The categorical imperative: Securities analysts and the illegitimacy discount. American Journal of Sociology, 104(5), 1398-1438.
Zuckerman, E. W. 2000. Focusing the corporate product: Securities analysts and dediversification. Administrative Science Quarterly, 45, 591-619.

APPENDIX

TABLE A1: Nested Regression Table for Test of H1 and H2

Models 1-4. Dependent Variable: Number of Citing Press Releases

Model                     (1)                   (2)                   (3)                   (4)
Specification             Baseline              Controls              Year FE               Head FE
Log(Familiarity_t-1)      0.290 (0.23)          0.763** (0.28)        0.747** (0.28)        0.583*** (0.18)
Log(Familiarity^2_t-1)    -0.072 (0.07)         -0.150* (0.07)        -0.145* (0.07)        -0.147*** (0.04)
Log(Originality_t-1)      403.693 (719.74)      1566.355*** (417.42)  1311.585** (485.80)   1518.810*** (308.38)
Log(Originality^2_t-1)    -45.058 (79.87)       -173.927*** (46.31)   -145.624** (53.92)    -168.385*** (34.04)
# of Words:
  2 words                 -                     -2.265** (0.70)       -2.164** (0.71)       -2.431*** (0.59)
  3 words                 -                     -3.331*** (0.66)      -3.217*** (0.67)      -3.514*** (0.42)
  4 words                 -                     -3.888*** (0.65)      -3.775*** (0.66)      -4.128*** (0.55)
  5 words                 -                     -4.513*** (0.66)      -4.347*** (0.66)      -4.630*** (0.56)
  6 words                 -                     -4.451*** (0.69)      -4.275*** (0.70)      -4.730*** (0.53)
  7 words                 -                     -3.430** (1.31)       -3.511** (1.25)       -3.839*** (0.93)
  8 words                 -                     -4.957*** (0.70)      -4.849*** (0.70)      -5.270*** (0.32)
  9 words                 -                     -4.533*** (0.82)      -4.458*** (0.81)      -4.834*** (0.76)
Trademark(=1)             -                     -0.004 (0.25)         -0.045 (0.26)         -0.338** (0.10)
Age                       -                     -0.068+ (0.04)        -0.067 (0.04)         -0.075*** (0.01)
Generation(=1)            -                     0.286 (0.20)          0.219 (0.19)          0.123 (0.22)
Technology(=1)            -                     -0.492** (0.18)       -0.538** (0.18)       -0.718** (0.22)
Operating System(=1)      -                     0.621** (0.22)        0.618** (0.21)        0.703*** (0.17)
Enabled suffix            -                     0.534* (0.23)         0.526* (0.22)         0.553** (0.18)
Year:
  2001                    -                     -                     0.532 (0.43)          0.452 (0.80)
  2002                    -                     -                     1.017* (0.43)         0.746 (0.85)
  2003                    -                     -                     0.982* (0.40)         0.639 (0.80)
  2004                    -                     -                     0.985* (0.38)         0.606 (0.81)
  2005                    -                     -                     1.090** (0.37)        0.564 (0.90)
  2006                    -                     -                     0.612 (0.40)          0.037 (0.80)
  2007                    -                     -                     0.687+ (0.39)         0.160 (0.84)
  2008                    -                     -                     0.513 (0.39)          -0.041 (0.83)
  2009                    -                     -                     0.535 (0.40)          -0.097 (1.01)
  2010                    -                     -                     1.183** (0.43)        0.564 (0.98)
Head Dummies              No                    No                    No                    Yes
Ln(alpha)                 1.564*** (0.16)       1.051*** (0.12)       1.006*** (0.12)       0.716*** (0.11)
N                         1900                  1900                  1900                  1900

Models 5-8. Dependent Variable: Cites of Label

Model                     (5)                   (6)                   (7)                   (8)
Specification             Baseline              Controls              Year FE               Head FE
Log(Familiarity_t-1)      0.170 (0.16)          0.688*** (0.16)       0.700*** (0.16)       0.848** (0.26)
Log(Familiarity^2_t-1)    -0.066 (0.06)         -0.117** (0.04)       -0.119** (0.04)       -0.207*** (0.06)
Log(Originality_t-1)      -393.402 (940.83)     1534.633*** (451.14)  1208.141* (540.23)    1611.497*** (356.78)
Log(Originality^2_t-1)    43.043 (104.25)       -170.465*** (50.10)   -134.213* (60.05)     -178.782*** (39.41)
# of Words:
  2 words                 -                     -3.004*** (0.80)      -2.845*** (0.81)      -3.247*** (0.67)
  3 words                 -                     -4.661*** (0.69)      -4.510*** (0.71)      -4.896*** (0.45)
  4 words                 -                     -5.354*** (0.71)      -5.172*** (0.72)      -5.664*** (0.56)
  5 words                 -                     -6.140*** (0.72)      -5.946*** (0.72)      -6.222*** (0.59)
  6 words                 -                     -6.181*** (0.74)      -5.904*** (0.77)      -6.578*** (0.51)
  7 words                 -                     -5.006*** (1.38)      -5.079*** (1.34)      -5.405*** (1.23)
  8 words                 -                     -6.737*** (0.81)      -6.538*** (0.82)      -7.071*** (0.38)
  9 words                 -                     -6.198*** (0.95)      -6.153*** (0.95)      -6.784*** (0.82)
Trademark(=1)             -                     0.012 (0.28)          -0.020 (0.30)         -0.351* (0.15)
Age                       -                     -0.092* (0.04)        -0.089* (0.04)        -0.111*** (0.02)
Generation(=1)            -                     0.325 (0.26)          0.243 (0.24)          0.083 (0.19)
Technology(=1)            -                     -0.651** (0.25)       -0.716** (0.25)       -0.925*** (0.25)
Operating System(=1)      -                     0.919*** (0.26)       0.918*** (0.24)       0.860*** (0.16)
Enabled suffix            -                     0.538* (0.25)         0.583* (0.25)         0.741** (0.24)
Year:
  2001                    -                     -                     0.768+ (0.45)         0.505 (1.01)
  2002                    -                     -                     1.150* (0.47)         0.515 (1.01)
  2003                    -                     -                     0.980* (0.43)         0.305 (0.90)
  2004                    -                     -                     0.961* (0.42)         0.320 (0.93)
  2005                    -                     -                     1.263** (0.45)        0.308 (1.13)
  2006                    -                     -                     0.772+ (0.45)         -0.278 (0.99)
  2007                    -                     -                     0.579 (0.44)          -0.306 (1.02)
  2008                    -                     -                     0.381 (0.44)          -0.507 (1.04)
  2009                    -                     -                     0.494 (0.44)          -0.516 (1.24)
  2010                    -                     -                     1.234* (0.51)         0.218 (1.22)
Head Dummies              No                    No                    No                    Yes
Ln(alpha)                 2.194*** (0.12)       1.603*** (0.11)       1.562*** (0.11)       1.290*** (0.13)
N                         1900                  1900                  1900                  1900

Notes: Negative Binomial Regression with label-clustered standard errors in parentheses. Head-clustered standard errors for Models 4 and 8. Significance levels: + p<0.1, * p<0.05, ** p<0.01, *** p<0.001. A dash (-) marks a term not estimated in that model.

TABLE A2: Nested Regression Table for Test of H3

Models 1-4. Dependent Variable: Number of Citing Press Releases

Model                     (1)                   (2)                   (3)                   (4)
Specification             Baseline              Controls              Year FE               Head FE
Log(Familiarity_t-1)      188.041 (139.03)      208.869* (86.35)      176.303* (89.38)      79.927** (29.73)
Log(Familiarity^2_t-1)    -18.531+ (11.10)      -20.769** (7.49)      -17.838* (7.76)       -8.595*** (2.38)
Log(Originality_t-1)      650.873 (925.35)      1983.664*** (476.40)  1658.206** (561.76)   1600.921*** (354.20)
Log(Originality^2_t-1)    -67.345 (98.95)       -214.580*** (51.15)   -179.493** (60.49)    -175.616*** (38.21)
Log(Familiarity_t-1) *
  Log(Originality_t-1)    -40.807 (30.19)       -45.245* (18.75)      -38.169* (19.41)      -17.274** (6.45)
Log(Familiarity^2_t-1) *
  Log(Originality^2_t-1)  0.873+ (0.52)         0.976** (0.35)        0.838* (0.37)         0.401*** (0.11)
# of Words:
  2 words                 -                     -2.376*** (0.72)      -2.281** (0.73)       -2.473*** (0.61)
  3 words                 -                     -3.394*** (0.67)      -3.293*** (0.68)      -3.537*** (0.44)
  4 words                 -                     -4.019*** (0.67)      -3.903*** (0.68)      -4.171*** (0.58)
  5 words                 -                     -4.520*** (0.69)      -4.375*** (0.69)      -4.624*** (0.60)
  6 words                 -                     -4.636*** (0.71)      -4.458*** (0.72)      -4.769*** (0.55)
  7 words                 -                     -3.639** (1.27)       -3.688** (1.22)       -3.908*** (0.86)
  8 words                 -                     -5.212*** (0.73)      -5.066*** (0.73)      -5.362*** (0.34)
  9 words                 -                     -4.761*** (0.85)      -4.674*** (0.85)      -4.895*** (0.79)
Trademark(=1)             -                     -0.009 (0.24)         -0.043 (0.25)         -0.319** (0.10)
Age                       -                     -0.076* (0.04)        -0.075+ (0.04)        -0.079*** (0.01)
Generation(=1)            -                     0.283 (0.20)          0.214 (0.19)          0.139 (0.22)
Technology(=1)            -                     -0.487** (0.18)       -0.527** (0.18)       -0.703** (0.22)
Operating System(=1)      -                     0.612** (0.22)        0.603** (0.21)        0.705*** (0.16)
Enabled suffix            -                     0.560* (0.24)         0.546* (0.23)         0.546** (0.18)
Year Dummies              No                    No                    Yes                   Yes
Head Dummies              No                    No                    No                    Yes
Ln(alpha)                 1.544*** (0.16)       1.035*** (0.12)       0.993*** (0.12)       0.718*** (0.11)
N                         1900                  1900                  1900                  1900

Models 5-8. Dependent Variable: Cites of Label

Model                     (5)                   (6)                   (7)                   (8)
Specification             Baseline              Controls              Year FE               Head FE
Log(Familiarity_t-1)      389.182+ (200.18)     344.477*** (93.50)    299.248** (98.57)     127.112** (43.32)
Log(Familiarity^2_t-1)    -31.975* (15.08)      -32.338*** (8.16)     -28.232** (8.61)      -11.612** (3.86)
Log(Originality_t-1)      503.037 (1110.13)     2351.324*** (520.53)  1931.374** (641.07)   1905.966*** (406.92)
Log(Originality^2_t-1)    -42.571 (121.66)      -250.926*** (55.84)   -205.738** (69.04)    -207.508*** (43.77)
Log(Familiarity_t-1) *
  Log(Originality_t-1)    -84.454+ (43.43)      -74.673*** (20.30)    -64.845** (21.41)     -27.449** (9.40)
Log(Familiarity^2_t-1) *
  Log(Originality^2_t-1)  1.502* (0.71)         1.521*** (0.38)       1.327** (0.41)        0.540** (0.18)
# of Words:
  2 words                 -                     -3.198*** (0.82)      -3.040*** (0.82)      -3.291*** (0.69)
  3 words                 -                     -4.776*** (0.71)      -4.649*** (0.72)      -4.920*** (0.46)
  4 words                 -                     -5.602*** (0.72)      -5.414*** (0.73)      -5.724*** (0.57)
  5 words                 -                     -6.219*** (0.74)      -6.051*** (0.75)      -6.249*** (0.61)
  6 words                 -                     -6.539*** (0.77)      -6.258*** (0.79)      -6.635*** (0.53)
  7 words                 -                     -5.445*** (1.33)      -5.458*** (1.29)      -5.536*** (1.16)
  8 words                 -                     -7.201*** (0.83)      -6.930*** (0.86)      -7.204*** (0.37)
  9 words                 -                     -6.611*** (0.98)      -6.541*** (1.00)      -6.874*** (0.85)
Trademark(=1)             -                     -0.014 (0.27)         -0.032 (0.29)         -0.336* (0.15)
Age                       -                     -0.109** (0.04)       -0.106* (0.04)        -0.116*** (0.02)
Generation(=1)            -                     0.305 (0.25)          0.221 (0.23)          0.090 (0.19)
Technology(=1)            -                     -0.646** (0.24)       -0.696** (0.24)       -0.907*** (0.25)
Operating System(=1)      -                     0.860*** (0.25)       0.861*** (0.24)       0.863*** (0.15)
Enabled suffix            -                     0.648* (0.26)         0.671** (0.26)        0.750** (0.23)
Year Dummies              No                    No                    Yes                   Yes
Head Dummies              No                    No                    No                    Yes
Ln(alpha)                 -                     1.572*** (0.11)       1.537*** (0.11)       1.291*** (0.13)
N                         1900                  1900                  1900                  1900

Notes: Negative Binomial Regression with label-clustered standard errors in parentheses. Poisson Regression for Model 5 because the Negative Binomial model did not converge (a multicollinearity problem). Head-clustered standard errors for Models 4 and 8. Significance levels: + p<0.1, * p<0.05, ** p<0.01, *** p<0.001. A dash (-) marks a term not estimated in that model.
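
As a reading aid for the inverted U-shape tested in Table A1, the sketch below computes where a linear-plus-quadratic specification peaks. It assumes, as is not stated explicitly in the tables, that the squared term is the square of the logged measure, so that the linear predictor has the form b1*x + b2*x^2 with x the logged measure. Because the count models use a log link, expected adoption peaks at the same x as the linear predictor, namely x* = -b1 / (2*b2). The function name `turning_point` is hypothetical. The inputs are the Model 4 point estimates from Table A1, and the calculation ignores estimation uncertainty.

```python
def turning_point(linear: float, quadratic: float) -> float:
    """Return the x at which linear*x + quadratic*x**2 reaches its maximum.

    An inverted U requires a negative quadratic coefficient; otherwise the
    curve has no interior maximum and a ValueError is raised.
    """
    if quadratic >= 0:
        raise ValueError("an inverted U requires a negative quadratic coefficient")
    return -linear / (2.0 * quadratic)


# Point estimates copied from Table A1, Model 4 (assumed functional form, see above).
fam_peak = turning_point(0.583, -0.147)        # peak in logged familiarity
orig_peak = turning_point(1518.810, -168.385)  # peak in logged originality

print(round(fam_peak, 2), round(orig_peak, 2))  # prints: 1.98 4.51
```

Under these assumptions, predicted adoption rises with the logged measure up to the turning point and falls beyond it. Whether the turning point lies inside the observed range of each measure would have to be checked against the data, which the tables do not report.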