Contributing to Public Document Repositories: A Critical Mass Theory Perspective

Naren B. Peddibhotla
npeddibhotla@csom.umn.edu

Mani R. Subramani
msubramani@csom.umn.edu

3-365, Department of Information and Decision Sciences
Carlson School of Management
University of Minnesota
321, 19th Ave South
Minneapolis, MN 55455
USA

March 30, 2006

Under second-round review at Organization Studies for the Special Issue on Online Communities. Comments are welcome.

Abstract: Public document repositories (PDRs) are valuable resources available on the Internet and are a component of the broader information commons freely accessible to the public. Instances of PDRs include the repositories of reviews at Amazon.com and bn.com, and Wikipedia, the online encyclopedia. These repositories are created and sustained by the voluntary contributions of individuals who are not compensated for their inputs. While the potential value of these repositories is recognized, there has been little prior examination of the fundamental mechanisms underlying the willingness of individuals to take the time and effort to make contributions. This paper draws on critical mass theory to examine the benefits, motivations and dynamics of contributions by the critical mass of contributors and is based on profiles of contributors and data on their contributions of reviews to Amazon.com. It identifies a small critical mass of contributors that makes a significant contribution to the maintenance of the PDR. The paper contributes to the development of a theory of collective action related to public repositories of information goods.

Keywords: public document repository, critical mass theory, collective action, dynamics, information technology.
Critical Mass: “a small segment of the population that chooses to make a big contribution to the collective action while the majority do little or nothing. These few individuals are precisely those who diverge most from the average. …the number of such deviants and the extremity of their deviance - is one key to predicting the probability, extent and effectiveness of collective action” (Oliver et al., 1985: 524).

Introduction

A number of websites that are freely accessible over the Internet provide users with useful content. Instances of such sites, which we term public document repositories (PDRs), include the repository of reviews of books, movies and music at Amazon.com, repositories of travel and tourism information at travelpost.com and lonelyplanet.com, and the large body of reviews of consumer products at epinions.com. Such repositories are created by the largely uncompensated efforts of individuals contributing content (e.g., book reviews, comments on hotels and tourist destinations) for the benefit of others who may be considering reading the books, choosing hotels or visiting these destinations. The scale of many of these repositories is substantial. For instance, Dooyoo.co.uk had over 200,000 reviews available on its site contributed by over 20,000 individuals, and Amazon.com had over 3.5 million reviews available on its site in 2004 contributed by over a million individuals. Such repositories are termed discretionary databases by Thorn and Connolly (1987) since they comprise private information that is shared by individuals, at their discretion, with others. Advances in information technologies facilitating large-scale storage and retrieval make it increasingly feasible to consolidate the collective knowledge and resources of even widely dispersed individuals into shared repositories that can be extremely useful to the general public.
However, motivating individuals to contribute to collective repositories is a daunting challenge, and initiatives to establish such repositories, even when they are seen as serving the common good, overwhelmingly fail (Fulk et al. 2004). Viewed against this backdrop, evidence that popular public repositories of reviews such as Amazon.com and bn.com are thriving suggests the need for a closer examination of the factors linked to their success. While the value of such publicly accessible online repositories is generally recognized, there has been little prior research examining contribution and participation in PDRs. Thorn and Connolly (1987) observe that “the technology of storing and distributing information is advancing rapidly; but we see relatively little evidence of parallel growth in the understanding of how this technology can best be harnessed” (page 527). Their observation, made nearly two decades ago, remains valid today. In this paper, we attempt to move the field toward a greater understanding of the dynamics of collective action in public document repositories. We apply critical mass theory to the context of PDRs and suggest propositions related to the dynamics of repository contributions. Using data on contributors to the large PDR at Amazon.com, we identify a critical mass of reviewers and the factors linked to their repository contributions.

Public Document Repositories

The act of making a repository contribution has several unique characteristics that set it apart from instances of helping behavior in physical contexts (Clary et al. 1998) as well as contributions in the context of technology-mediated forums such as email and listservs (Constant et al. 1996; Butler 2001). First, a repository contribution such as the posting of a book review is independently initiated by an individual with the expectation that this might be useful to others.
Such contributions are made not only without a request for help but also without specific information on the individuals being helped by the action. Second, while helpful actions generally occur in a dyadic context of individuals or groups linked by social ties, PDR contributions represent attempts by individuals to help unknown others with whom they typically have no discernible ties other than those arising from participation in the PDR. Third, repository contributions are accomplished through impersonal interactions with a database. Users typically log into a website, fill out a form describing the contribution and either attach a document or copy and paste their contribution into a text box. Contributions thus occur in a context devoid of social cues, a rather peculiar feature since helping is a fundamentally social act. Fourth, tangible incentives for contribution are mostly nonexistent and are, at best, minimal. Though organizations maintaining PDRs encourage contributions, they usually provide no direct incentives for them. Finally, contributors usually get no feedback when (or if) their contributions are viewed by others. Repositories such as Amazon.com provide mechanisms for users viewing reviews to give feedback on their quality, but leaving feedback is optional and the feedback left is generally meager. Thus, the individuals spending their time and taking the effort to make repository contributions appear to be doing so in spite of impediments that inhibit contributions.

Further, the technologies of PDRs create a unique environment with two important characteristics: a) PDRs are not excludable since they cannot be withheld from any individual once they become available, regardless of whether or not he or she contributed to their creation, and b) use of PDRs is non-rivalrous since one person’s use of the PDR does not affect its availability or its utility to other individuals.
PDRs are therefore collective goods or public goods (Hardin 1982). This view that PDRs are public goods is consistent with the arguments of Thorn and Connolly (1987) and Fulk et al. (2004) regarding repositories of discretionary information. PDRs exhibit another characteristic of collective goods termed jointness of supply: the costs associated with creating the public good are fixed, regardless of the number of individuals that take advantage of it. The costs of writing and submitting content remain the same whether the content is used by one individual or a very large number. As a result, free riding is not a burden and PDRs can potentially be created and sustained for the collective through the efforts of a relatively small minority. These features of PDRs create conditions that Olson (1965) described as the exploitation of the great by the small. Critical mass theory (Oliver et al. 1985), which proposes a framework to explain collective action with respect to public goods, therefore provides a useful lens to study contributions to PDRs.

Critical Mass Theory and Collective Action

Critical mass theory (Oliver et al. 1985; Marwell and Oliver 1993) presents a framework to explain collective action. The central insight suggested by the theory is that the presence of a critical mass, a sub-group of the population that shoulders most of the initial cost, can trigger broader participation and the creation of the public good. Another important insight is the interdependence of contribution by individuals in the population. An illustration of collective action where the initial efforts of a critical mass are important is political lobbying by a neighborhood to fight a school closure, where an affluent minority can jump-start the movement by hiring a lawyer with their own funds before others join (Oliver et al. 1985).
This theory has been applied to study phenomena involving collective action by researchers in a variety of domains (Oliver and Marwell 2001). It has been used by Markus (1987) in the literature on computer-mediated communications (CMC) to explain the diffusion of interactive media, by Thorn and Connolly (1987) and Fulk et al. (2004) to explain contributions to discretionary organizational databases, and by Monge et al. (1998) to study pooled information in inter-organizational alliances. However, to the best of our knowledge, there has been no prior application of this theory to study collective action related to PDRs. In applying the theory to the context of PDRs, we advance propositions extending the theory in the light of empirical observations of PDR contribution and use.

Critical Mass Theory and Public Document Repositories

Role of the critical mass of contributors: Public goods present a social dilemma since any individual can derive benefits from their use irrespective of his or her participation in creating them, leading to the temptation for individuals to free ride on the contributions of others. Free riding leads to sub-optimal provisioning of public goods. In the context of PDRs, widespread free riding can lead to PDRs having very little useful content. When most users withhold their own contributions to the PDR in the expectation of benefiting from the contributions of others, few contributions are ever made for anyone to take advantage of. The central premise of critical mass theory applied to PDRs is that a small minority in the population that is interested in the PDR, the critical mass, can make most of the contributions and lead to the creation of a useful PDR that the majority of users exploit. In the absence of prior work on the validity of this perspective for PDRs, our first research question is:

Are public document repositories created and sustained by a critical mass of contributors?
Thorn and Connolly (1987), in their examination of discretionary databases in organizations, conclude that discretionary information will be chronically undersupplied. However, anecdotal evidence of successful PDRs on the Internet such as those at Amazon.com and Wikipedia.com appears to be inconsistent with these predictions. For instance, the online encyclopedia at Wikipedia.com that was launched in 2001 had over 3.7 million articles (over 1 million articles in English) contributed by users as of March 2006. Wikipedia, one of the few PDRs for which detailed contribution statistics are available, reports receiving over 46 million content updates by users on 3.7 million pages since July 2002. Wikipedia reported having over 17,000 active contributors (those submitting 5 or more times in a month) in December 2005, and the number of active contributors has been growing consistently every month since January 2001. While such detailed statistics are unavailable from other PDRs operated by commercial entities such as Amazon.com and epinions.com, a steady growth of contributors is reported for these sites as well. This evidence suggests the need to revisit some of the assumptions underlying models of contributor behavior employed in prior research on discretionary databases. Consistent with the uses and gratifications paradigm, in which outcomes observed are linked to diverse sources of benefits that can influence the nature of technology use by participants (Katz et al. 1974), we focus on the nature of the uses and gratifications from repository contributions.

Self-oriented usage of technology: One central assumption in the models of Thorn and Connolly (1987) and Fulk et al. (2004) is that contributions by individuals can benefit only other people and not the contributor, and that benefits to contributors accrue only from access to the contributions of others.
This assumption is unlikely to be valid in the context of PDRs such as Amazon.com since the process of making repository contributions can be expected to be useful for individuals in various ways. For instance, in contributing a review of a book to Amazon.com, the process of reflecting on the book and providing a critique helps contributors develop skills related to critical analysis and composition. Posting a review can also serve as a means of self-expression. While private benefits have been recognized in the general case of public goods (Kim and Bearman 1997), they have been assumed away as non-existent or ignored in prior research on discretionary databases (e.g., Thorn and Connolly 1987; Fulk et al. 2004). The presence of such private benefits can lead to greater repository contributions than predicted by prior theory since the overall value to individuals from the PDR is enhanced. This leads to the following research question:

Do contributors to PDRs derive direct benefits from making their contributions?

Motivations for PDR contributions: In view of the unique context of PDR contributions highlighted earlier, an understanding of the set of salient motivations for contributions is an important issue. The work of Thorn and Connolly (1987) suggests reciprocity as the sole motive for repository contributions. Since PDRs are contexts with fluid memberships that individuals can join or leave at any time and where individuals can participate anonymously, it is hard to imagine that reciprocity, a feature that is prevalent in stable groups of identified individuals, can provide a dominant motive. In the same vein, motivations for action such as altruism and social affiliation highlighted in instances of pro-social behavior (such as caring for a stranger who collapses on the street) are unlikely to be as salient in potentially de-individuating contexts such as PDRs, where relatively anonymous individuals contribute and retrieve documents.
Prior research indicates that individuals sharing common interests can develop social bonds in virtual communities (Butler et al. 2002). However, while it is also likely that the motives for PDR contribution may be social and other-oriented, there is little guidance from prior theory regarding the motives operative in the case of repository contributions. This leads to the following research question:

What are the key motivations for contributors to PDRs?

Interdependence of individual contributions: An important focus of prior work on collective action (Marwell and Oliver 1993) is the nature of interdependence among individual contributors. Thorn and Connolly (1987) viewed individual contributions to discretionary databases as being independent since all players make simultaneous decisions about contributing in each period of a multi-period game. This is unlikely to be the case in PDRs since the technology can provide considerable transparency regarding prior contributions by others, and this can influence the willingness of individuals to make their own contributions. Critical mass theory suggests two patterns in the interdependence of incremental contributions. Incremental contributions of individuals can be accelerating (Oliver et al. 1985), with contributions being more valuable in the presence of prior contributions. Alternatively, incremental contributions can be decelerating, with contributions being less valuable in the presence of prior contributions. The key issue determining the nature of interdependence is the interpretation by potential PDR contributors of information on the contributions by others (Kim and Bearman 1997). Does the availability of prior reviews inhibit subsequent PDR contribution (the marginal value of an additional review is considered to be small)?
Or does the availability of prior reviews encourage contribution (an additional review is considered an important contribution to the ongoing articulation of the value of a book or movie)? There is little guidance in the literature to determine the nature of interdependence of PDR contributions. We therefore examine the following research question:

What is the nature of interdependence of incremental contributions in a PDR?

Methods

We examined these questions using self-disclosed reviewer profiles and data available on their contributions to the review repository at Amazon.com, a site visited by about 40 million users every month (Nielsen NetRatings 2006). The PDR of reviews at Amazon.com has over 3.5 million reviews contributed by over a million reviewers. While the repository is owned and operated by a commercial firm, it is freely accessible without exclusions to the public over the web and is searchable in a variety of ways: by keyword, book title, author or topic. While users can choose to purchase items they see listed on the site, no purchase is necessary to use the content or the facilities provided by the site. Participation as a contributor has minimal prerequisites. Any individual with an email address, irrespective of his or her location in the world, can sign up for an Amazon.com account and begin to contribute content (reviews of books, music, videos and other products sold on the site) or provide comments on content contributed by other users. Amazon.com provides basic guidelines for reviews and all submissions are moderated. A small group of Amazon.com editors using automated text search programs deletes or replaces inappropriate or offending content in contributions before they are posted online. To eliminate confounds from the characteristics of the user interface and features provided by different public repositories, we focused our data collection on this large PDR.
Contributing reviews to the Amazon repository is not compensated; it is entirely voluntary. The only reward, if any, is intangible, in the form of a higher rank among Amazon reviewers. Amazon.com ranks reviewers using a composite measure based on the number of reviews submitted and the average number of helpful votes received by reviews from users. A reviewer’s categorization as a #1 Reviewer, Top 10, Top 50, Top 500 or Top 1000 reviewer is displayed along with the text of his or her reviews. The possibility of joining the ranks of reviewers in any of these five tiers represents the only formal incentive offered to contributors. Amazon.com also provides all reviewers the option to disclose personal information (up to 4000 words) and upload a photograph. This profile information is made available on a personal page that is linked to each reviewer’s name when it appears alongside the review. Users are free to provide as much or as little information as they see fit in these profiles. Another feature of the Amazon.com site is the facility for users to select one or more reviewers as a “favorite person”. Individuals receive email notifications with a URL to the contribution whenever one of their favorite persons posts a review.

Reviewers at Amazon.com come from a wide variety of backgrounds and include teachers, librarians, a former Speaker of the US House of Representatives, journalists, lawyers, consultants and college students. While the total number of reviewers is large, those contributing a total of 10 or more reviews number only about 47,000. The critical mass of reviewers, the focus of this study, is likely to be a subset of this population of active reviewers and, by definition, comprises the highest ranked members of this active group. Reviewers in the Top 1000 are clearly likely to be part of the critical mass. We therefore collected detailed data only on this select group of prolific contributors.
In the first phase of data collection, we conducted hour-long semi-structured interviews with two of the Top 50 reviewers at Amazon.com. In the second phase, we collected contribution data on all reviewers at Amazon.com. For each contributor to the PDR, we gathered data on the number of reviews and the helpful votes each review had received. In the third phase, we collected detailed profile information provided by reviewers in the Top 1000 list. This final sample has 1009 individuals since multiple individuals shared ranks in the Top 1000 list. These self-disclosed profiles (collected only for reviewers in the Top 1000 list) often contained a variety of personal details such as their email address, URL of their personal webpage, their location, their professional career, their hobbies and interests, details of their families and pets, factors motivating them to write reviews, their favorite books and the music that they liked. The disclosure of profile information is voluntary and some contributors provided only their name (or pen-name) and little else. Three illustrative profiles with different amounts of profile information are shown in Appendix 1. We also collected data on the number of reviews already available at the time that each reviewer on the Top 1000 list contributed each of his or her reviews.

Data Analysis

Quantitative Analyses: We used SPSS to fit a curve to data on repository contributions by all reviewers. For the Top 1000 reviewers, we examined correlations among variables using the data on review contributions and variables coded from their profiles.

Qualitative Analysis: We read each profile and coded it into categories that reflected the various motivations that reviewers had revealed. In analyzing the data, we followed the techniques of open coding and axial coding advocated by Strauss and Corbin (1998). We used open coding to categorize the text in the reviewer profiles into categories suggested by prior theory.
We identified keywords suggesting different categories of motives such as reciprocity and enriched this set with keywords we encountered in profiles. We also used explanations suggested by the data in the profiles. In creating new categories derived from the data, we often backtracked to earlier profiles to check whether any of them could be recoded into the new category. After coding the data, we grouped the categories that reflected similar concepts and themes, consistent with the notion of axial coding. These steps highlighted the core phenomenon of the motivations underlying contribution of reviews, and we used the linkage between the categories to infer the theoretical explanation. An individual’s profile was coded into multiple categories when the profile indicated multiple motives for contribution.

Findings

Critical Mass: Is there a critical mass of reviewers at the Amazon.com review repository? To examine the evidence for this, we examined the distribution of contributions among the reviewers. The earliest in the set of reviews submitted by reviewers in the Top 1000 was in March 1997. 55% of the reviewers wrote their first review in 1999 or earlier. 97% wrote their first review in or before 2000. Clearly, this group comprises a set of longstanding, prolific contributors to the PDR. We grouped reviewers in order of their Amazon.com reviewer ranks (100 ranks in each group). For each of the groups (e.g., ranks from 1-100, 101-200, and so on), we calculated the total number of reviews contributed. The number of contributions by each of the groups and the curve fitting this distribution are shown in Figure 1.

---------Figure 1 about here ----------

The Top 100 reviewers contributed 95,995 reviews, while those ranked between 900 and 1,000 contributed 14,730 reviews, those ranked between 5,000 and 5,100 contributed 4,923 reviews, and those ranked between 9,900 and 10,000 contributed 2,533 reviews.
This pattern indicates that review contributions are considerably lower for lower-ranked reviewers. The curve that best fits this distribution (R^2 = 0.96) is a power-law function:

Y = 82756.1 * X^(-0.7217)

Such a distribution indicates that small contributions are extremely common whereas large contributions are uncommon (Adamic 2002). This pattern is described by Juran (1992) as reflecting the vital few and useful many and indicates a small group exerting a disproportionately large influence compared to the rest of the population. Further, this suggests the possibility that reviewers in the Top 1000 represent a key component of the critical mass of contributors to the repository. The data also indicate that the PDR has the characteristics of a large-group solution where no single individual makes a perceptible difference to the collective (Olson 1965). We find that the contribution of the critical mass is small compared to the overall volume of contributions of the larger group (see Table 1). On average, members of the critical mass in our case each contribute 148 reviews, but their contributions amount to just over seven percent of the reviews in the repository; the rest comes from a very large number of individuals, each contributing, on average, just one review. It is interesting to note that members of the critical mass, who contribute the greatest number of reviews per person, are also the most helpful: members of the critical mass on average receive 1177 helpful votes while the rest on average receive just three. Each of the reviews of the critical mass of contributors, on average, received 8.03 helpful votes while those of the others on average received 2.12. This indicates the empirical validity of a feature of critical mass theory, the exploitation of the great by the small (Oliver et al. 1985), in the case of PDRs.
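To make the reported power-law fit concrete, the following is a minimal sketch of one common way to estimate such a curve: ordinary least squares on the log-transformed values. It is not the authors' SPSS procedure. The function names and the mapping of rank ranges to group indices (group 1 for ranks 1-100, group 10 for the ranks up to 1,000, group 51 for the ranks up to 5,100, group 100 for the ranks up to 10,000) are assumptions made for illustration; only the four group totals and the two coefficients come from the text.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (ln x, ln y).

    Returns (a, b). All xs and ys must be positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space is ln(a)
    return a, b


def power_law(a, b, x):
    """Evaluate y = a * x**b."""
    return a * x ** b


# Coefficients reported in the text.
A_REPORTED, B_REPORTED = 82756.1, -0.7217

# Group totals quoted in the text, keyed by an ASSUMED group index
# (group 1 = ranks 1-100, group 10 = ranks up to 1,000, and so on).
OBSERVED = {1: 95995, 10: 14730, 51: 4923, 100: 2533}

if __name__ == "__main__":
    for group, reviews in OBSERVED.items():
        fitted = power_law(A_REPORTED, B_REPORTED, group)
        print(f"group {group:3d}: observed {reviews:6d}, reported curve {fitted:9.1f}")
```

Running the script prints the observed totals next to the values implied by the reported coefficients, which gives a quick check that the curve is being read with the group index as X and the group's review total as Y.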
---------Table 1 about here ----------

One of the central roles of the critical mass highlighted by the theory is the early contribution of resources to collective action. The logic suggested by critical mass theory is that a minority of the population, the critical mass, through its early contribution to the collective good enhances the probability of success of collective action. This in turn creates conditions for the majority to join in, and the collective goal is achieved by the participation of the majority (Marwell and Oliver 1993). Applying this logic to the case of PDRs, we suggest that the critical mass of contributors, by their early contributions, makes early content available in the repository. This supply of content is important in enabling the repository to be a useful resource for the general public, a key feature that helps generate participation and use of the repository by the public at large. This participation by the larger group subsequently creates the potential for discretionary contributions to the PDR by the others. In the case of a review repository like Amazon.com, where the set of books, movies and other products needing reviews is constantly expanding, we argue that the critical mass plays an ongoing role in providing the early set of reviews of products so that reviews are available when the average user comes to the site to look up the newly added products. In contrast to the role of critical mass in other contexts where the group makes an early and important but one-time effort to get collective action started, the critical mass in the case of PDRs performs the ongoing role of providing early content in the different categories created as the repository expands. To assess the role of this set of reviewers in the ongoing maintenance of the PDR, we examined the extent to which they provided the early reviews of products on the site.
For each of the 98,799 reviews contributed by the 466 reviewers in the Top 1000 list for whom we had profile information, we calculated the frequency with which the review was among the set of early reviews available on the site. The number of reviews available prior to the reviews contributed by the 466 reviewers is shown in Figure 2. The data indicate that 16 percent of the reviews submitted were the first ones available on Amazon.com for the book or movie. 32 percent of the reviews were among the first three reviews available and 42 percent of the reviews were among the first five available. Overall, 55 percent of the reviews posted by the critical mass were among the first ten reviews posted on the site. This statistic provides compelling evidence regarding the central role of the critical mass of reviewers as early contributors of content to the PDR. Even though the overall number of reviews contributed by the critical mass is small compared to the total number of reviews available on the site (about 7 percent of the total), these reviews are among the earliest that are available to the users of the site.

---------Figure 2 about here ----------

The critical mass thus comprises resourceful individuals who step up to make contributions to the public good in instances where there are few alternatives available. This indicates that the critical mass plays an extremely valuable role not only in setting collective action in motion, but also in maintaining the quality of the public goods available in the PDR on an ongoing basis. Drawing from critical mass theory, we therefore suggest the following proposition:

Critical mass proposition: Public document repositories rely heavily on a small number of active contributors both for their establishment and their sustenance.

We now turn to an analysis of the personal benefits and motivations of this critical mass using data from the profiles of reviewers in the Top 1000 list.
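Before the profile analysis, the early-review shares reported above (the proportion of reviews that were the first, or among the first three, five or ten, for their product) can be illustrated with a short sketch. It assumes each review is represented by the number of reviews already on the site when it was posted; the function name and the toy numbers are hypothetical and are not the study's data.

```python
def share_among_first(prior_counts, k):
    """Fraction of reviews that were among the first k posted for their product.

    prior_counts[i] is the number of reviews already on the site when
    review i was posted, so a review is among the first k when prior < k.
    """
    if not prior_counts:
        raise ValueError("prior_counts must not be empty")
    early = sum(1 for prior in prior_counts if prior < k)
    return early / len(prior_counts)


# Hypothetical toy data for illustration only, NOT the study's data.
toy_prior_counts = [0, 0, 1, 2, 4, 4, 7, 9, 12, 30]

if __name__ == "__main__":
    for k in (1, 3, 5, 10):
        share = share_among_first(toy_prior_counts, k)
        print(f"among the first {k:2d}: {share:.0%}")
```

Because each threshold includes the reviews counted under every smaller threshold, the shares are cumulative, which is how the 16, 32, 42 and 55 percent figures in the text should be read.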
Among the critical mass of reviewers, profile information was available for 900; 109 reviewers had no profile text at all. 258 of the 900 profiles were brief and had fewer than 50 words. 466 of the profiles disclosed at least one motivation. Our analysis is based on the information on motivations provided by this subset of the Top 1000 reviewers. To check for biases arising from the focus on this subset of 466 reviewers, we compared attributes of this subset with those of the set of 543 reviewers who either did not provide profiles or provided no information on motivations or benefits in their profiles. The median reviewer ranks for these two sets were 482.5 and 533 respectively. In addition, those who had disclosed motivations had contributed a median of 162 reviews, had received 1223 helpful votes for their reviews, and had on average 7.91 helpful votes per review. Those who had not disclosed motivations had written a median of 135 reviews, had received 1134 helpful votes and had on average 8.10 helpful votes per review. The similarity of the two groups suggests that the set of reviewers disclosing profile details is representative of the critical mass of reviewers in the Top 1000 list.

Personal benefits to critical mass: The private benefits that contributors reported obtaining from making contributions were coded into five categories: self-expression, development of writing skills, enhanced understanding of the topic, utilitarian benefits and personal enjoyment. The frequency of occurrence and examples of these private benefits in the profiles are indicated in Table 2 (in decreasing order of frequency).

---------Table 2 about here ----------

The data thus provide evidence that contributors to PDRs do obtain personal benefits from making their own contributions. The existence of direct, private benefits suggests a broader view of benefits to contributors than recognized in prior literature.
We therefore present the following proposition highlighting the influence of direct benefits to contributors on contributions:

Direct benefits proposition: Contributions to PDRs by the critical mass are linked to direct benefits obtained by contributors from their own contributions.

Motivations of critical mass: Mentions of motivations for contribution in the profiles were coded into three categories of motivation: social affiliation, altruism and reciprocity. The frequency of occurrence of these motivations in the profiles is indicated in Table 3 (in decreasing order of frequency).

--------- Table 3 about here ---------

In a context that appears devoid of social cues, the presence of social motives is interesting. Nearly half the reviewers indicated social affiliation as a motivation for contributing. This suggests a social view of contributions to PDRs, quite different from that of interactions with a database. Rather than being created by individuals seeing their actions as being one-on-one interactions with a repository, PDRs are created by contributors who are aware of the presence of a wider audience for their inputs. The profile data reveal altruism and reciprocity as the other motives for contribution. It is interesting to note that reciprocity, the sole motivation for contribution considered by Thorn and Connolly (1987), is the least frequently mentioned among motives. We therefore propose that:

Social motives proposition: Contributions to PDRs by the critical mass are linked to social motives of contributors.

Interdependence among contributors: The data in the profiles indicated that reviewers’ choices regarding the topics and areas to contribute in were influenced by the contributions of others. 64 profiles (13.73% of the set of profiles) contained details indicating that reviewers tried to pick topics where there were few prior contributions or where they believed prior reviews were inadequate.
This clearly suggests that contributions to PDRs are influenced by the contributing behavior of other reviewers. The pattern of interdependence suggested by the profile data fits that described by Oliver et al. (1985) as decelerating, a situation where early contributions are most valuable and incremental contributions are less valuable:

“Do I read more conventional books? Yes, but if there are already reviews that approximate what I think needs to be known, I don't bother to review them.”

“For new music CDs, I review everything. But for catalog material (stuff that is on my shelves at home), I do not write reviews if others have written reviews.”

We therefore propose that:

Interdependence proposition: PDR contributions by the critical mass are more likely where there are fewer prior contributions and the likelihood of contribution decreases with increases in prior contributions by others.

Private benefits, social motives and contribution behavior

In addition to analyses guided by our research questions, we examined the data to gain insights into the nature of contributions by the critical mass of reviewers. The PDR at Amazon.com provides two measures of reviewer contribution – quantity (number of reviews) and quality (helpful votes received). Prior theory suggests that incentives to contribute are likely to lead to increased volume of contributions but decreased quality of contributions (Thorn and Connolly 1987), indicating a trade-off between contribution quantity and contribution quality. Our data support this view: the number of reviews submitted by the critical mass of reviewers was negatively correlated with the quality of reviews (correlation = -0.15, p<0.01, N=466). The correlations among contribution quantity and quality of contribution, benefits and motives are indicated in Table 4.
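To make the reported correlation easier to interpret, the following minimal Python sketch shows how a Pearson correlation between quantity and quality could be computed and how the associated t statistic follows from r and N. The helper names `pearson_r` and `t_statistic` are hypothetical, and the sketch assumes a standard Pearson correlation with a two-tailed t test; the text above does not state which correlation coefficient or test was used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Values reported in the text: r = -0.15, N = 466.
t = t_statistic(-0.15, 466)
print(round(t, 2))  # roughly -3.27
```

Under these assumptions, the magnitude of t (about 3.27 with 464 degrees of freedom) exceeds the two-tailed critical value for p < 0.01 (roughly 2.59), which is consistent with the significance level reported above.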
--------- Table 4 about here ---------

Factors linked to contribution quantity: Table 4 indicates that private benefits (utilitarian benefits, self-expression) are positively correlated with quantity of contributions (Table 4, rows 5 and 6). This parallels the finding of Thorn and Connolly (1987) that raising the benefits from contributions increases the quantity of contributions. It is surprising that mention of the social affiliation motive is negatively correlated with the quantity of reviews (Table 4, row 3). It is likely that individuals for whom the social affiliation motive is important are less inclined to contribute reviews on an ongoing basis. Perhaps they restrict their contributions to specific areas where they are aware of other reviewers, thus adversely impacting the overall quantity of contributions.

Quantity of contribution proposition: The quantity of contributions by the critical mass will tend to be a) Positively related to personal benefits to contributors from contributions. b) Negatively related to social motives of contributors.

Factors linked to contribution quality: The data suggest that reciprocity has a significant positive correlation with quality (Table 4, row 2), while altruism has a weak positive correlation (Table 4, row 1). This suggests that individuals contributing to reciprocate provide higher quality content, indicating the useful role of the social context. It is likely that the higher attention-to-task observed in contexts of greater self-presentation (Sproull et al. 1996) is applicable to PDRs as well. It may also arise from reviewers’ attempts to be equitable in reciprocating help (Adams, 1965). The association of altruism and quality of reviews suggests that people who are out to help ensure that their reviews are useful. Table 4 also indicates that private benefits (development of writing skills) and quality of reviews are negatively correlated.
This suggests that contributors with a focus on private benefits tend to focus less on the usefulness of content to potential users.

Quality of contribution proposition: Contributions of higher quality content by the critical mass will tend to be a) Negatively related to personal benefits to contributors from making contributions; b) Positively related to social motives of contributors.

Discussion

Publicly accessible repositories that rely on voluntary contributions of content are increasingly emerging as important sources of information for the public but have received little attention from researchers. To understand the dynamics of contribution and use of such systems we need theoretically grounded models of collective action in these contexts. To this end, we drew on Critical Mass theory and applied it to examine behavior at one large PDR – the repository of reviews at Amazon.com. Our work suggests the applicability of the theory to the context of PDRs and also suggests several aspects along which received theory needs to be modified.

In this study, we make several contributions. First, we confirm the presence and identify the important role of a relatively small group of individuals, the critical mass, whose ongoing contributions are central to the sustenance of the PDR as a public good. In the Amazon PDR, the critical mass comprises the core set of 1009 individuals (of the population of 1.3 million reviewers) who contribute the largest number of reviews per person and the most useful reviews. Over half of the reviews they contribute are also among the first ten reviews available to users on the site. As the coverage of the repository continually expands, the critical mass plays an important ongoing role in creating content in areas where few prior reviews exist. This is remarkable, considering that there is no coordination between the addition of new products to the site and the contributions of reviews by the critical mass.
Second, we highlight that individuals obtain private benefits from their own PDR contributions. This runs counter to views in prior research on discretionary databases (e.g., Thorn and Connolly 1987; Fulk et al. 2004) that contributors derive benefits only from the information contributed by others and do not directly benefit from making contributions. Our study suggests that the process of making a contribution – a process that involves framing one’s thoughts, composing opinions and presenting them – is an exercise with personal benefits of various kinds. These private benefits may be important motivations to contribute to PDRs, a feature that has been overlooked in prior research.

Third, we find social motives to be salient for contribution. This is a surprising finding given that repository contribution and use is a context characterized by minimal social interaction. These motives are another set of potential motivations that were overlooked in prior research and that our work highlights as important in understanding repository contributions.

Fourth, we find that individuals in the critical mass are less likely to contribute on topics that already have contributions from others. This provides a new perspective on the congestion / information overload phenomenon recognized in prior research on online forums (Butler 2001; Jones et al. 2004). Prior work suggests that congestion leads users to completely avoid participation in online forums. In our study, we find that individuals recognize differences in the levels of congestion within individual topics and, rather than avoiding the PDR as a whole, contribute to less congested topics with few prior contributions.

Finally, our study suggests that the determinants of quantity and quality of content are distinct. This contrasts with the results of Wasko and Faraj (2005), who found several common antecedents of contribution quality and quantity in the context of an online bulletin board.
Limitations

Our study has several limitations. First, our results are based on data collected at one PDR, the repository of reviews at Amazon.com. While this choice minimized confounds due to contextual differences between multiple sites, it is likely that the specific features implemented at Amazon.com influenced our findings. Second, the work is based on the sample of the critical mass of contributors at Amazon. The critical mass is recognized as being distinct in attitudes, motives and behaviors from the larger public, and the results are unlikely to be generalizable to the average repository contributor. Third, the qualitative data on reviewer profiles are in the form of statements that were not solicited by the authors of this paper. They were willingly provided by many reviewers using the facilities provided by Amazon.com. Since we used self-disclosed profile data, it is likely that our results are biased by the self-presentation of contributors. Fourth, the cross-sectional nature of our study limits inferences of causality among the variables. Longitudinal examinations of repository contributions and the role of motivational and contextual factors can provide a deeper understanding of cause and effect relationships explaining repository contributions.

Despite these limitations, our study makes a number of contributions to research and practice. This represents one of the first field studies identifying the critical mass of contributors and examining the benefits and motives of this group. The study highlights the importance of social motives even in a context where the act of contribution does not involve social interaction. It provides evidence of private benefits to contributors from their own contributions. This study also provides empirical evidence regarding the relationship between key outcome variables – quantity and quality – and contributor motives and private benefits.
Our exploratory study suggests an initial theoretical framework that explains contribution behavior in online PDRs. Figure 3 depicts a summary of our findings in line with our earlier propositions.

--------- Figure 3 about here ---------

Based on our findings we suggest two sets of factors influencing quality and quantity of contribution among individuals in the critical mass of contributors. For quality of contribution, social motives are proposed to be antecedents. On the other hand, we propose that private benefits influence quantity of contribution. Also, social motives and private benefits are negatively related to quantity and quality of contributions respectively.

Implications for practice: Our findings also have implications for practice. The current study was based on a specific but important type of public document repository. An increasing number of e-commerce sites are providing facilities that allow people to submit reviews on products they have bought (Kawakami 2005). According to a recent Forrester Research study, nearly 26% of online retailers provide product review forums on their websites (Mendelsohn and McNabb 2005). The procedure we used to identify the critical mass and study its characteristics can be usefully applied to direct incentives to the appropriate set of key participants. The findings can similarly be applied within organizations seeking to identify the critical mass of contributors in knowledge management initiatives that seek to develop repositories based on discretionary contributions of content by employees (Fulk et al. 2004). Our results suggest that attending to private benefits and tapping social motives are important levers to encourage the critical mass to contribute to PDRs. Our results also suggest that the factors linked to quality and quantity of contribution are different and can guide the development of incentive mechanisms for the key group of prolific contributors.
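For practitioners wishing to apply a similar procedure to their own repository, the following minimal Python sketch ranks contributors and reports the share of content produced by the top group. It is a simplification under stated assumptions: it ranks by number of contributions only, whereas our critical mass was taken from the Top 1000 list published by Amazon.com; the function name `top_contributor_share` and the toy data are hypothetical.

```python
def top_contributor_share(review_counts, top_k):
    """Return (ids of the top_k contributors, their percent share of all reviews).

    review_counts maps a contributor id to that contributor's number of reviews.
    Ranking by count alone is an assumption made for this illustration.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    total = sum(review_counts.values())
    if total == 0:
        raise ValueError("no reviews to analyze")
    ranked = sorted(review_counts.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:top_k]
    top_total = sum(count for _, count in top)
    return [cid for cid, _ in top], 100.0 * top_total / total


# Hypothetical toy repository with five contributors.
toy_counts = {"a": 50, "b": 30, "c": 5, "d": 5, "e": 10}
print(top_contributor_share(toy_counts, top_k=2))

# Share of the critical mass in the Amazon.com data, using the totals in Table 1.
print(round(100.0 * 257773 / (257773 + 3428054), 1))
```

The last line reproduces, from the review totals in Table 1, the approximately 7 percent share of all reviews contributed by the critical mass.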
Conclusion

This paper draws on Critical Mass theory to develop a theory to explain collective action in the development of public document repositories. Our results, based on data from the large PDR of reviews at Amazon.com, highlight the complex role of the critical mass of contributors in establishing and sustaining collective action. The results also suggest the utility of a broader view of benefits from contribution and the recognition of social motives linked to contributions by the critical mass. Further, our results contribute to a more nuanced view of determinants and outcomes of contribution since the factors linked to the quality of contribution are distinct from those linked to the quantity of contribution. Our approach also opens up several areas for further theoretical and empirical work to understand the complexity of the establishment of collective action to create publicly accessible information goods.

References

Adamic, Lada 2002 ‘Zipf, power-laws, and Pareto – a ranking tutorial’. Accessed online at http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html on March 13, 2006.

Adams, John S. 1965 ‘Inequity in social exchange’. In L. Berkowitz (Ed.) Advances in experimental social psychology Volume 1: 267-299. New York: Academic Press.

Butler, Brian S. 2001 ‘Membership size, communication activity, and sustainability: A resource-based model of online social structures’. Information Systems Research 12 / 4: 346-362.

Butler, Brian, Lee Sproull, Sara Kiesler, and Robert Kraut 2002 ‘Community effort in online groups: Who does the work and why?’ In Leadership at a distance. Suzanne P. Weisband and Leanne Atwater (Eds.) Mahwah, NJ: Lawrence Erlbaum.

Clary, Gil, Mark Snyder, Robert Ridge, John Copeland, Arthur Stukas, Julie Haugen, and Peter Miene 1998 ‘Understanding and assessing the motivations of volunteers: A functional approach’. Journal of Personality and Social Psychology 74: 1516-1530.

Constant, David, Sara Kiesler, and Lee Sproull 1994 ‘What’s mine is ours, or is it? A study of attitudes about information sharing’. Information Systems Research 5: 400-421.

Constant, David, Lee Sproull, and Sara Kiesler 1996 ‘The kindness of strangers: The usefulness of electronic weak ties for technical advice’. Organization Science 7: 119-135.

Fulk, Janet, Rebecca Heino, Andrew J. Flanagin, Peter R. Monge, and Francois Bar 2004 ‘A test of the individual action model for organizational information commons’. Organization Science 15: 569-585.

Hardin, Russell 1982 Collective Action. Baltimore: Johns Hopkins University Press.

Jones, Quentin, Gilad Ravid, and Sheizaf Rafaeli 2004 ‘Information overload and the message dynamics of online interaction spaces: A theoretical model and empirical investigation’. Information Systems Research 15: 194-210.

Juran, Joseph 1992 Juran on Quality by Design. New York: Free Press.

Katz, Elihu, Jay Blumler, and Michael Gurevitch 1974 ‘Uses and gratifications research’. Public Opinion Quarterly 37: 509-523.

Kawakami, Laurie 2005 ‘Giving reviews the thumbs down’. Wall Street Journal, August 4, 2005. Accessed at http://online.wsj.com/public/article/SB112311985732004654x3qD8ODU_jwTvTplbvV6uhXfyBc_20060803.html on March 13, 2006.

Kim, Hyojoung, and Peter Bearman 1997 ‘The structure and dynamics of movement participation’. American Sociological Review 62: 70-93.

Markus, Lynne 1987 ‘Toward a critical mass theory of interactive media: Universal access, interdependence and diffusion’. Communication Research 14: 491-511.

Marwell, Gerald, and Pamela Oliver 1993 The critical mass in collective action: A micro-social theory. New York: Cambridge University Press.

Mendelsohn, Tamara, and Kyle McNabb 2005 ‘Using web content management to drive e-commerce’. Forrester Research, October 20, 2005 teleconference. Accessed at http://www.forrester.com/Events/Overview/0,5158,1211,00.html on March 13, 2006.

Monge, Peter, Janet Fulk, Michael Kalman, Andrew Flanagin, Claire Parnasa, and Suzanne Rumsey 1998 ‘Production of collective action in alliance-based interorganizational communication and information systems’. Organization Science 9: 411-433.

Nielsen NetRatings 2006 ‘Two-Thirds of Active U.S. Web Population Using Broadband, Up 28 Percent Year-Over-Year to an All-Time High’. Accessed at http://www.netratings.com/pr/pr_060314.pdf on March 13, 2006.

Oliver, Pamela, and Gerald Marwell 2001 ‘Whatever happened to critical mass theory? A retrospective and assessment’. Sociological Theory 19: 292-311.

Oliver, Pamela, Gerald Marwell, and Ruy Teixeira 1985 ‘A theory of critical mass I: Interdependence, group heterogeneity, and the production of collective action’. American Journal of Sociology 91: 522-556.

Olson, Mancur 1965 The Logic of Collective Action: Public Goods and the Theory of Groups. Cambridge, MA: Harvard University Press.

Rafaeli, Sheizaf 1986 ‘The electronic bulletin board: A computer-driven mass medium’. In Computers and the Social Sciences 2. Paradigm Press: Osprey, FL.

Sproull, Lee, Mani Subramani, Sara Kiesler, Janet Walker, and Keith Waters 1996 ‘When the interface is a face’. Human-Computer Interaction 11: 97-124.

Strauss, Anselm L. and Juliet M. Corbin 1998 Basics of qualitative research: Techniques and procedures for developing grounded theory, 2nd ed. Thousand Oaks, CA: Sage.

Thorn, Brian K. and Terry Connolly 1987 ‘Discretionary data bases: A theory and some experimental findings’. Communication Research 14 / 5: 512-528.

Wasko, Molly M. and Samer Faraj 2005 ‘Why should I share? Examining social capital and knowledge contribution in electronic networks of practice’. Management Information Systems Quarterly 29 / 1: 35-57.
Figures

[Figure 1: Distribution of contributions by rank. Plot of the distribution of reviews against reviewer rank (in 100s). Note: Graph plotted for top 10,000 reviewers only due to scaling considerations.]

[Figure 2: Reviews available prior to contribution by critical mass (N=98799). Plot of the cumulative distribution of prior reviews; y-axis: Cumulative percentage of reviews; x-axis: Number of prior reviews.]

[Figure 3: Model of critical mass contribution at PDRs. Social motives: positive (+) link to Quality of contribution, negative (–) link to Quantity of contribution. Private benefits: positive (+) link to Quantity of contribution, negative (–) link to Quality of contribution.]

Tables

                                                         Critical mass    All other reviewers
                                                         (N=1009)         (N=1,321,493)
Total number of reviews                                  257,773          3,428,054
Median number of reviews per reviewer                    148              1
Median number of helpful votes per reviewer              1177             3
Median number of helpful votes per review per reviewer   8.03             2.12

Table 1: Contribution volume, helpfulness of critical mass

Personal benefit (frequency of mentions), followed by illustrative comments:

Self-expression: 139 (29.8 %)
“(Writing reviews on Amazon) gives me the opportunity to express my opinion on the items that I have purchased.”
“I think what people listen to or watch (or don’t) says a little bit about who they are… I try to compare and contrast within a genre. I also try to compare an artist’s work with his / her past accomplishments rather than with someone else’s work…”

Developing writing skills: 82 (17.6 %)
“Writing reviews has enabled me to use some of the writing skills that I learned in law school.”
“I am a technical writer by profession; reviews allow me to take out my adjectives and brush the dust off them.”

Enhancing understanding of topic: 39 (8.4 %)
“I write reviews on Amazon.com's website … to clarify and organize my own thoughts.”
“I review largely to fix the book for myself in my head”.

Utilitarian benefit: 29 (6.22 %)
“I get promo copies of CDs from record companies … I have realized that putting reviews on Amazon impresses record companies as much as writing reviews for print weeklies. I often send links of my reviews to record companies”.
“I enjoy free gift certificates and would appreciate any!”

Enjoyment: 23 (4.9 %)
“Reviewing is fun. I do it for my own enjoyment.”
“I am doing this for fun and imagine that, besides myself and I, no one else will ever read this.”

Table 2: Personal benefits to critical mass (N=466)

Motivation (frequency of mentions), followed by illustrative comments:

Social affiliation: 208 (44.6 %)
“This is so cool that Amazon permits us book lovers the space to share our thoughts about what we're reading. I love to peruse other peoples' thoughts on the books I'm about to buy and enjoy exchanging comments and ideas with other readers.”
“Most pals, buddies know about my writing reviews. They do not look at my reviews. Feedback from readers of my magazine reviews is usually from people whom I know. What is noteworthy is the feedback from customers at Amazon: people who do not know you. I get mail from people all over the world.”

Altruism: 136 (29.2 %)
“Wanting to help is the primary reason I write book reviews on Amazon.com”
“I am trying to help others in a purchase decision.”

Reciprocity: 49 (10.6 %)
“I know I read these reviews prior to buying any book and they have been excellent help, so if I can steer someone to one they will enjoy, well, then I've paid my dues.”
“I have consulted Amazon's public reviews for years before making a purchase and I decided to start giving back to the Amazon community”

Table 3: Motives of Critical Mass (N=466)

S. No.   Mentions of benefits and motives   Quantity of contributions   Quality of contributions
                                            (# reviews)                 (Votes per review)
1        Altruism                                                       0.08†
2        Reciprocity                                                    0.15**
3        Social affiliation                 -0.12*
4        Development of writing skills                                  -0.13*
5        Utilitarian benefits               0.10*
6        Self-expression                    0.14**

Table 4: Correlation of Motives, Benefits and Attributes of Contribution (N=466)
**: p < 0.01, *: p < 0.05, †: p < 0.1.

Appendix 1

Illustrative profiles provided by Amazon.com reviewers (Identifying details removed)

Name:
Nickname:
Reviewer Rank: 20
At a glance
Reviews written: 514 (1319 helpful votes)

Name:
Nickname:
E-mail:
Reviewer Rank: 153
About me: I hope you find my reviews helpful. They were not helpful, please let me know. I want to give you the best information to make your choice. Evaluating cookbooks is something I love to do, and I hope you can benefit from some of my experience.
Reviews written: 158 (1084 helpful votes)

Name:
Nickname:
E-mail:
Reviewer Rank: 339
About me: I am a senior engineer for network security operations at a Fortune 500 firm. From 1998 through 2001 I defended global American information assets as a captain in the Air Force Computer Emergency Response Team (AFCERT). Now I provide network security monitoring to protect my employer and our clients. My professional interests include intrusion detection, incident response, digital forensics, system administration, and multibooting of operating systems on the Intel architecture. I read and review to learn, assist, and contribute. My reviews are not always popular, since telling the truth is more important to me than inflating sales. If I decide a book offers nothing new or useful, I will skim it. If I believe a book is lacking technical accuracy, I may critique it. My goal is to give you straight advice on books. I will earn your trust!
Reviews written: 66 (722 helpful votes)