The Dimensions of Reputation in Electronic Markets
2nd Statistical Challenges in E-Commerce Research Symposium

Anindya Ghose, Stern School of Business, New York University
Panagiotis Ipeirotis, Stern School of Business, New York University
Arun Sundararajan, Stern School of Business, New York University

Overview: We present a methodology for identifying the different dimensions of online reputation and characterizing their influence on the pricing power of sellers. Theory predicts that sellers with better recorded online reputation can successfully charge higher prices than competing sellers of identical products, and that their pricing power increases with their recorded level of experience. We develop and implement a new text mining technique that identifies and quantitatively assesses dimensions of importance in reputation profiles, and use this technique to create a new data set containing detailed reputation profiles and prices for sellers of consumer software on Amazon.com's online secondary marketplace. The estimation of a set of econometric models on this data set validates the predictions of our theory, and further, ranks these dimensions of reputation based on their effect on measured seller value, identifying those that have the most significant impact on reputation. This paper is the first study that integrates econometric and text mining techniques toward a more complete analysis of the information captured by reputation systems, and it presents new evidence of the importance of their effective and judicious design.

Theory: When buyers purchase products in an electronic market, they assess and pay not only for the product they wish to purchase, but for a set of fulfillment characteristics as well: for instance, packaging, timeliness of delivery, the extent to which the product description matches the actual product, and reliability of settlement.
In traditional (bricks and mortar) retailing, buyers have a deterministic way of assessing the quality of such fulfillment characteristics. However, such characteristics cannot be reliably described or verified ex ante in an electronic market. Buyers make inferences about a seller's true characteristics based on their interpretation of its feedback profile, which comprises numerical and text-based information about the seller characteristics observed in each of its prior transactions. Since buyers are cognitively bounded, they pay closer attention to more recent transactions by other buyers, which appear first in the reputation profile. Also, buyers are heterogeneous in the relative value they place on each of these characteristics. We build a model that leads to a set of hypotheses which, briefly, conjecture that sellers with a higher frequency of positive assessments on each of these characteristics can successfully charge higher prices, and that this pricing power increases with the size of a seller's feedback profile (or their “level of experience”).

Estimation Methodology: We estimate an econometric model that relates a seller's numerical reputation score and level of experience (that is, the number of transactions in the seller's profile) to the price premium the seller can command over other sellers who have an identical product available at the time the transaction takes place. The results of this estimation confirm that a higher average reputation and a higher level of experience each independently increase pricing power. Next, we calibrate a scoring function that assigns dimension-specific pricing premium scores to each of the modifiers we have mined.
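As a rough illustration of the kind of estimation described above, the following Python sketch regresses a price premium on a reputation score and a level of experience. It is not the specification used in this study: the definition of the premium (seller price minus the mean competing price), the log transform of experience, the function names `price_premium` and `fit_premium_model`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def price_premium(seller_price, competitor_prices):
    """Premium of a seller's price over the mean price of competing offers
    for the identical product (an assumed definition of the premium)."""
    return seller_price - float(np.mean(competitor_prices))

def fit_premium_model(premiums, reputation, n_transactions):
    """Ordinary least squares of the premium on an intercept, the numerical
    reputation score, and the log of the number of past transactions."""
    X = np.column_stack([
        np.ones(len(premiums)),
        np.asarray(reputation, dtype=float),
        np.log(np.asarray(n_transactions, dtype=float)),
    ])
    y = np.asarray(premiums, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(["intercept", "reputation", "log_experience"], coef))

# Synthetic data only: generated so that both reputation and experience
# raise the premium, then recovered by the regression.
rng = np.random.default_rng(0)
n = 500
reputation = rng.uniform(3.0, 5.0, n)       # hypothetical 1-5 rating scale
n_transactions = rng.integers(1, 5000, n)   # hypothetical profile sizes
premiums = (-8.0 + 1.5 * reputation + 0.4 * np.log(n_transactions)
            + rng.normal(0.0, 0.5, n))

estimates = fit_premium_model(premiums, reputation, n_transactions)
print(estimates)
```

Positive estimated coefficients on both regressors would correspond to the pattern reported above, in which reputation and experience each independently increase pricing power.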
Since the dimension-specific scores are based on regressions that control for seller- and product-specific fixed effects, they isolate the extent to which the information contained explicitly in the text feedback of a seller's profile contributes toward its pricing power, and provide a ranking of the value of dimension-modifier pairs based on their impact on pricing power.

Contribution: There has been an emerging stream of work that examines the impact of reputation systems (Dellarocas et al. 2004, Zeckhauser and Resnick 2002). In this paper, we provide a new econometric framework for measuring the extent to which numerical and text-based reputation affects outcomes in electronic markets. To the best of our knowledge, our study is the first to use text mining techniques for analyzing reputation feedback, and it provides the first set of results that establish the value of information contained in the text-based feedback of an online reputation system. Using a new text mining technique we have developed and implemented, we then structure the text feedback in the reputation profile of each seller who has participated in at least one of over 9,500 transactions for the sale of packaged consumer software on Amazon’s secondary marketplace. Our text mining technique identifies the unique dimensions associated with a seller's recorded reputation (examples of dimensions include “delivery” and “packaging”) and then locates the modifiers associated with each dimension (examples of such modifiers include “fast delivery” and “careless packaging”). Hence, we convert unstructured text feedback for a seller into a vector of dimension-modifier pairs. This yields a rich data set that describes, for each transaction, the prices and characteristics of successful sellers, as well as those of their competitors.
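To make the idea of a vector of dimension-modifier pairs concrete, here is a minimal Python sketch. It does not reproduce the text mining technique developed in this study, which is not detailed here. It assumes a naive adjacency heuristic (the word directly before a dimension noun is treated as its modifier), and the `DIMENSIONS` and `STOPWORDS` lists, the function names, and the sample comments are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, hand-picked vocabulary for illustration only.
DIMENSIONS = {"delivery", "packaging", "service", "shipping"}
STOPWORDS = {"the", "a", "an", "and", "but", "was", "is", "my", "your", "of", "for", "with"}

def extract_pairs(feedback):
    """Return (modifier, dimension) pairs where a non-stopword directly
    precedes a known dimension noun in the comment."""
    tokens = re.findall(r"[a-z]+", feedback.lower())
    return [
        (prev, tok)
        for prev, tok in zip(tokens, tokens[1:])
        if tok in DIMENSIONS and prev not in STOPWORDS
    ]

def profile_vector(feedback_comments, vocabulary):
    """Count how often each pair in `vocabulary` occurs across a seller's
    feedback comments, in the order given by `vocabulary`."""
    counts = Counter(
        pair for comment in feedback_comments for pair in extract_pairs(comment)
    )
    return [counts[pair] for pair in vocabulary]

# Made-up feedback comments for a single seller.
comments = [
    "Fast delivery, great service!",
    "Careless packaging but fast delivery.",
    "The delivery was slow.",
]
vocabulary = [("fast", "delivery"), ("careless", "packaging"), ("great", "service")]
vector = profile_vector(comments, vocabulary)
print(vector)  # [2, 1, 1]
```

The third comment shows the limit of the adjacency heuristic: "slow" follows the dimension rather than preceding it, so the pair is missed. A realistic implementation would need syntactic analysis rather than word adjacency.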
To discover the semantic orientation (Hatzivassiloglou and McKeown 1997), the strength of each modifier, and the importance of each dimension, we connect the dimension-modifier pairs with the pricing power that the respective sellers achieve. Our analysis transforms the seemingly unstructured and qualitative text feedback into a structured quantitative framework, allowing us to use a rich set of statistical techniques to analyze the feedback text. Our approach to the analysis of text feedback is likely to gain importance as the fraction of used-good exchanges taking place on trading networks that are not mediated by a central market maker increases.

Statistical Challenges: A key statistical challenge we face in this study is determining the most robust way of incorporating the vast number of modifiers and dimensions that emerge from the text mining analysis into our econometric specifications. Assigning an economic value to each dimension-modifier pair is the first statistical challenge that we have addressed. We will discuss the appropriateness of different statistical techniques, such as factor analysis, principal component analysis, and latent Dirichlet allocation, for this analysis, as well as their shortcomings. We will describe our approaches to using reputation data to predict the probability that a seller makes the sale. Finally, we will describe some of the research contributions we have made in bridging research in econometrics with research in computer science (text mining), and how they may generalize to future empirical research in electronic commerce, especially research requiring computation with large data sets.

References
Dellarocas, Chrysanthos, Ming Fan, Charles A. Wood. 2004. Self-interest, reciprocity, and participation in online reputation systems. Working paper.
Hatzivassiloglou, Vasileios, Kathleen R. McKeown. 1997. Predicting the semantic orientation of adjectives.
Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL'97). 174–181.
Zeckhauser, Richard, Paul Resnick. 2002. Trust among strangers in internet transactions: Empirical analysis of eBay's reputation system. Michael Baye, ed., The Economics of the Internet and E-Commerce. Elsevier Science, 127–157.