MEASURING AND COMPARING THE EFFECTIVENESS OF E-COMMERCE WEBSITE DESIGNS

Jungpil Hahn
Doctoral Candidate

Robert J. Kauffman
Co-Director, MIS Research Center
Professor and Chair

Information and Decision Sciences
Carlson School of Management
University of Minnesota
Minneapolis, MN 55455
Email: {jhahn, rkauffman}@csom.umn.edu

Last revised: January 20, 2003
_____________________________________________________________________________________

ABSTRACT

The assessment of the effectiveness of e-commerce websites is of critical importance to online retailers. However, current techniques for evaluating effectiveness are limited in that they do not allow for formal empirical measurement of the productivity and performance of the website. In this paper, we use the theoretical perspective of production economics to measure the performance of Internet-based selling websites. We model the website as a production system in which customers consume inputs (i.e., use various functionalities of the website) to produce an output (i.e., a basket full of items at checkout). With this basic perspective, we propose an analysis methodology for measuring and comparing website efficiency and for attributing observed inefficiency to customer inefficiency or website design inefficiency. The application of the proposed evaluation methodology to a currently operational e-commerce website demonstrates the value of our technique.
_____________________________________________________________________________________

KEYWORDS: Business value, customer efficiency, data envelopment analysis, design efficiency, economic analysis, electronic commerce, e-tailing, productivity, website design.
_____________________________________________________________________________________

Acknowledgments: An earlier version of this paper was presented at the 2002 Workshop on Information Systems and Economics, Barcelona, Spain. The authors would like to thank Gordon Davis, Alok Gupta, Joe Konstan and Jinsoo Park, who provided useful feedback at an earlier stage in the development of this work. We also thank Rajiv Banker, Indranil Bardhan, Mayuram Krishnan, Sandra Slaughter and other participants at the 2002 Workshop on Information Systems and Economics for helpful suggestions.

INTRODUCTION

Since the crash of the DotComs in the American stock market in April and May of 2000, the evaluation of e-commerce websites in terms of business value has become increasingly important (Varianini and Vaturi, 2000). No longer are venture capital firms willing to make portfolio investments in e-commerce properties that only present potential future return on investment (ROI) opportunities. Instead, they seek e-commerce firms with more immediate opportunities, especially firms that can demonstrate a well-developed discipline for evaluating investments in web-related software development and e-commerce business models. This way, the risks and uncertainties of investing in this emerging market are better balanced with the rewards. During the first phase of e-commerce, the goal for most companies was to secure a share of the virtual market space through an online presence by attracting as many visitors as possible to their website. However, the industry has progressed to the second phase of e-commerce. As e-commerce begins to mature, the ability to retain customers and conduct online operations justified by ROI is the only way an e-business can survive (Agrawal, Arjona, and Lemmens, 2001; Straub and Watson, 2001; Chen and Hitt, 2002).
Recent industry analyses, however, point out that e-commerce retailers are earning low scores on ROI because they fail to meet consumers' purchase needs, owing to the poor usability and errant designs of their web-based storefronts (Souza, Manning, Sonderegger, Roshan, and Dorsey, 2001; Anderson, 2002). For example, a study by Zona Research reported that 60% of web-savvy users dropped out of the purchasing process because they could not find the products on the online retailers' websites (Zona Research, 1999). Another study, conducted by A.T. Kearney, showed that 80% of experienced online shoppers gave up shopping on e-commerce websites due to problems they encountered while interacting with the website (Rizzuti and Dickinson, 2000). Yet another study, conducted by Creative Good, showed that 43% of purchase attempts ended in failure due to poor usability of the websites (Rehman, 2000). This shortfall in realized value compared to the potential value that web-based selling approaches offer is dramatic. The Creative Good study points out that this level of failed purchase attempts is consistent with an estimated loss of $14 billion in sales for e-commerce retailers in the 2000 Christmas-New Year's holiday shopping season alone.

Recent academic research reinforces the picture that emerges. Almost all of the papers in the two recent guest-edited issues of INFORMS Information Systems Research on e-commerce metrics included metrics related to the design and usability of e-commerce websites (Straub, Hoffman, Weber, and Steinfield, 2002a, 2002b). Apparently, the quality of the online customer experience that effectively-designed websites create not only has a positive effect on the financial performance of a firm, but also possesses the potential to create unique and sustainable competitive advantage for Internet-based sellers and other e-commerce firms (Rajgopal, Venkatachalam, and Kotha, 2001).

Developing, launching and maintaining an e-commerce website entails a significant investment for e-commerce firms. Simple e-commerce websites can cost $1-2 million per year for setup and maintenance, whereas more sophisticated websites with dynamic capabilities require annual investments of up to $52 million (Rizzuti and Dickinson, 2000; Dalton, Hagen, and Drohan, 2001). Despite website development requiring such significant investments, the process of designing high quality websites for e-commerce is still more of an art than a science. E-commerce companies still rely largely on intuition when it comes to designing their websites (Hahn, Kauffman, and Park, 2002). To make matters worse, design changes and their impacts are not tracked, making it impossible to measure the benefits of website design (Wallach, 2001). This situation brings to the foreground the importance of value-driven evaluation and design of e-commerce websites. However, e-businesses are facing difficulties due to the lack of proven tools and methods for accomplishing this.
According to Ivory and Hearst (2001), traditional approaches to e-commerce website evaluation fall into three major categories: user testing, where users are asked to perform representative tasks with a given website and problems are determined based on the range of observed user interactions (e.g., Spool, Scanlon, Schroeder, Snyder, and DeAngelo, 1999; Rizzuti and Dickinson, 2000); inspection, where domain experts use a set of criteria (e.g., web usability heuristics, such as those suggested by Nielsen (1994)) to identify potential usability problems in the website design (e.g., Nielsen and Mack, 1994); and inquiry, where users provide feedback on the website via interviews, surveys, participation in focus groups, etc. (e.g., Schubert and Selz, 1999). These methods have been adopted from the field of user interface evaluation (UIE) within the broader field of human-computer interaction (HCI). However, even though these approaches have been successfully applied to the evaluation of user interfaces of traditional IS applications, they are not perfectly suited for web-based e-commerce applications. For example, websites are updated and redesigned very frequently, which makes the recurring cost of recruiting test users, experts or survey respondents for the evaluation of each redesign prohibitive for most organizations with limited labor and capital resources. It is also important to emphasize that users of web-based applications are most often customers, unlike traditional IS applications developed for use by employees within a firm. As a result, greater constraints are placed on what a designer/developer must do to create a desirable setting for system use by a user/customer, since training is not a viable option.

The purpose of this paper is to present a methodology for assessing the effectiveness of e-commerce website design. Our proposed approach to e-commerce website evaluation is not for comparative evaluation of websites of different companies (e.g., testing whether the Amazon.com website is more effective than the competing BN.com website). Rather, our approach is intended for use within a firm for assessing the effectiveness of one's own website or comparing the effectiveness of different redesigns of one's own website. The intuition behind our proposed methodology is that we can measure (or estimate) the effectiveness of an e-commerce website by analyzing how well the website enables efficient customer behaviors that are observable from clickstream data in web server logs.

The paper is organized as follows. We review the relevant literature to provide a theoretical foundation for modeling the effectiveness of e-commerce websites in terms of production economics. We propose a new model of e-commerce website effectiveness that views the e-commerce website as a production environment where effectiveness can be characterized in terms of customer transaction productivity. We also discuss the methods for estimating the efficiency of website designs in the analytical context of data envelopment analysis (DEA) (Charnes, Cooper, and Rhodes, 1978, 1981; Banker, Charnes, and Cooper, 1984). Next, we illustrate the value of our proposed methodology by applying it in the empirical evaluation of the effectiveness of website design at an Internet retailer of groceries. We also discuss our methodology in depth prior to presenting the results of the empirical analysis.
We conclude with discussions, implications of our results and directions for future research.

ONLINE SHOPPING AS AN ECONOMIC PRODUCTION PROCESS

We conceptualize online shopping as an economic production process in which customers make use of the e-commerce website in producing an economic transaction. We set the stage for this conceptualization by reviewing literature that provides a foundation for characterizing e-commerce websites as self-service technologies (SSTs, also called self-service production technologies). We then discuss the basic concepts in production economics to offer a basis for measuring the effectiveness of e-commerce websites. Finally, we present our model of online shopping as an economic production process, as well as our approach for evaluating the effectiveness of e-commerce website designs.

Service Production and Self-Service Technologies (SSTs)

Early research in service operations management recognized the importance of the customer's involvement in the service production and delivery process as a source for increasing a service firm's productivity (Chase, 1978; Lovelock and Young, 1979). Given that the presence of the customer (or at least her input) is generally required in the service delivery process, customers have also been regarded as partial employees of the service firm (Mills and Morris, 1986). This perspective of customer co-production is especially relevant when the service encounter involves the use of SSTs in service production and delivery (e.g., automated teller machines (ATMs) for banking transactions, e-ticket kiosks for airline check-in, e-commerce websites for online shopping, etc.), since customers are actually performing the necessary tasks that a paid employee of the service firm would otherwise execute.

In a similar vein, the service marketing literature identifies employees' performance in a service delivery system as a vital factor affecting the service firm's productivity and service quality (Zeithaml, Parasuraman, and Berry, 1990). Since customers are co-producers of the service, the customers' efficiency and productivity also become important precursors to high quality service. The concept of customer efficiency has been defined and investigated in prior research by Xue and Harker (2002), who propose that customer efficiency consists of transaction efficiency (i.e., more efficient transactions) and value efficiency (i.e., more frequent transactions). The authors also refer to quality efficiency for services when the major content of the service product is provided by peer customers. Transaction efficiency creates value for the firm through cost savings, whereas value efficiency creates value through increased volume. This focus on labor efficiency has strategic implications. Given the lack of training opportunities for customers, it is difficult to increase the productivity of customers. Hence, the strategic implication is that a service firm should identify and select more efficient customers in order to increase productivity and profitability. This perspective is consistent with profitability-based customer segmentation: it has been suggested that firms should identify and serve only profitable customers (Brooks, 1999; Zeithaml, Rust, and Lemon, 2001). However, the narrow focus on customer efficiency does not paint the whole picture of service production efficiency, especially in service delivery environments with technology-based service production.
The emphasis on customer efficiency (i.e., employee performance) was primarily due to the fact that, traditionally, services were labor-intensive processes that involved the co-presence of the employee and customer. This efficiency perspective needs to be extended given the rise of SSTs. Productivity increases can result not only from improving the quality of the labor force (i.e., increasing employee and customer efficiency by making the employees or the self-served customers more efficient), but also from investing in more efficient capital equipment (i.e., increasing technological efficiency by making SSTs more efficient). This perspective on technological efficiency has a very different strategic implication. Identifying efficient and profitable customers should no longer be the main focus or the only focus; rather, the effective design of the technology, so that even inefficient customers can become more efficient, should move to the foreground.

E-commerce websites, especially transactional web-based applications for Internet-based selling, can be viewed as SSTs (Meuter, Ostrom, Roundtree, and Bitner, 2000). The design of SSTs has significant impact on the adoption of the channel as well as the quality of the service production and delivery process. It has been shown that the adoption of SSTs is sensitive to their design and the degree of customer contact (Walley and Amin, 1994). In other words, low-contact services (i.e., services that require only the presence of the customer) can deal with highly complex operational services, whereas high-contact services (i.e., services in which the customer is the direct producer) have typically employed technologies for low operational complexity (e.g., ATMs). In the case of complex services that require a high degree of customer contact, ease of use of the self-service technology becomes critically important due to the lack of training opportunities that can be provided to the customer (Chase and Tansik, 1984).

Production Economics and the General Model of a Production Frontier

We model the customer-website interaction as an economic production process (Kriebel and Raviv, 1980). An economic production process defines the technical means by which inputs (e.g., materials and resources) are converted into outputs (e.g., goods and services). This technical relationship is represented by the production function, which articulates the maximum level of outputs produced for each given level of inputs (i.e., the efficient frontier or the "best practice" production frontier). Deviations from the production frontier reflect inefficiencies in individual observations (Aigner and Chu, 1968). Another important concept in production economics is returns to scale, the relative increase in outputs as all inputs are increased proportionately without changing the relative factor mix (Varian, 1992). A production process is said to exhibit constant returns to scale if, when all inputs are proportionately increased by a factor k, the outputs also increase by k. The production process exhibits increasing returns to scale if outputs increase by a proportion greater than k, and decreasing returns to scale if outputs increase by a proportion smaller than k. For example, if doubling all inputs (k = 2) more than doubles output, the technology exhibits increasing returns to scale. A general model of production is given by:

  x_k = f(y_l, s_i) + ε ,   (General Model of Production)

where x_k = input k, y_l = output l, s_i = environmental variable i influencing the production process, and ε = deviation from the production frontier. Figure 1 provides a graphical illustration of the basic logic of our production model.
(See Figure 1.) The production function represents the most efficient production process. All points that lie on the curve (e.g., the point E) are said to be efficient since there is no deviation from the production frontier (i.e., ε = 0). On the other hand, all observations that lie below the curve (e.g., the point I) are inefficient in the sense that the same level of output could be achieved with ε less input, ε > 0. ε can take on both positive and negative values. Inefficiencies in the production process result in negative deviations from the production frontier (i.e., less output is produced, indicating that the production process is less efficient). Random effects (e.g., measurement errors, effects of any factors that are not included in the model, or randomness due to human indeterminacy) may cause both positive and negative deviations. Assuming a symmetric distribution for the random effects deviations, the amount of downward deviation from the frontier, on average, will be greater than or equal to the amount of upward deviation.

Figure 1. Production Frontier in Production Economics (figure: output Y plotted against input X; the production frontier x = f(y, s) separates efficient points such as E, with ε = 0, from inefficient points such as I, with ε > 0, and exhibits regions of increasing, constant, and decreasing returns to scale)

Conceptualizing Website Design Effectiveness

Our view of an e-commerce website as a service production environment enables us to start thinking about the evaluation of website performance: the ability to transform inputs into outputs. In our context of online shopping, we conceptualize the inputs as customers' actions in navigating through the e-commerce website in order to produce a transaction, in which the output can be regarded as a checkout of a basket of products. Before proceeding with the specification of the online shopping production model, it is important to consider the axioms of economic production, to verify their conformance with the online shopping context that we examine here.

Axioms of Production in Online Shopping. There are three basic assumptions of production postulated by economic theory (Varian, 1992).

Regularity. The regularity axiom, also called the "no free lunch" assumption, states that the input requirement set is a closed non-empty set for all output levels y with y ≥ 0. With regularity, it is not possible to produce something from nothing; at least some non-zero input needs to be consumed in order to produce an output. In online shopping, this translates into our assumption that a customer must interact with the site (i.e., non-zero input) in order to produce a transaction on the website.

Monotonicity. The monotonicity axiom, also called the "free disposability" assumption, states that for a given production possibility set of outputs and inputs, it is possible to produce the same level of outputs with another bundle of inputs of greater amount. Thus, it should also be possible to produce an equal or smaller amount of all outputs by using at least as much of all inputs. In the online shopping context, our interpretation is that a customer may end up with a transaction of equal size (i.e., the same number of items in the cart) by browsing more within the website. For example, a customer may add an item to the cart when she first visits the item's product page (i.e., an input), or on first visit to that page think about buying that item only to decide to add it to her cart later on. However, she would then have to revisit the item's product page in order to add the item to her cart (i.e., an additional unit of input).
In other words, customers may view product pages without adding the items to their carts (i.e., freely disposable input).

Convexity. The convexity axiom states that the input requirement set is a convex set. In other words, given any two input bundles in the input requirement set, any convex combination of those two input bundles is also in the input requirement set and, hence, can produce the same level of output. This assumption is also satisfied in the online shopping context, where the customer can end up with the same contents in the shopping cart by using different functionalities of the e-commerce website. For example, during one transaction, she may add a particular item to her cart from the product page within the product category hierarchy, and in another transaction, the same item can be added to the cart from the specials and promotions section of the website.

The Online Shopping Production Model. Consistent with the axioms of economic production, we conceptualize the online shopping production model as:

  x_kj = f(y_lj, s_ij) + ε_j ,   (Online Shopping Production Model)

where x_kj = input k for shopping transaction j, y_lj = output l for shopping transaction j, s_ij = environmental variable i influencing shopping transaction j, and ε_j = deviation from the frontier for shopping transaction j. Given that each customer may perform multiple online transactions, we subscript the variables with j in the online shopping production model to distinguish between different shopping transactions. The inputs relate to the effort put forth by the customers in filling their virtual shopping carts (e.g., number of product page views, extent of navigation through product listings, amount of search conducted, references to help pages, etc.). The outputs in the model describe the transaction (e.g., number of items in the shopping cart at checkout, dollar amount of items in the shopping cart at checkout, etc.). Various other factors may influence the production process. In online shopping, these may include the level of experience of the customer (e.g., number of previous transactions at the website, etc.), the quality of the Internet connection (e.g., connection speed) and so forth.

Efficiency Concepts for Analyzing Internet-Based Selling Websites. For Internet-based selling, we define website efficiency as the degree to which website design supports efficient online purchasing.1

1 Website design efficiency is similar to the concept of task-technology fit that has been investigated extensively in the information systems literature in the areas of graphical information presentation (e.g., Benbasat, Dexter, and Todd, 1986), tables versus graphs (e.g., Vessey, 1991), and general information systems (e.g., Goodhue and Thompson, 1995). In the context of online shopping, the task comprises the shopping goals and decision processes. The technology is the design of the e-commerce website. In other words, website design efficiency measures the fit between the intended usage via system design and actual usage (i.e., how well the design of the website actually supports a consumer's goal for using the website). In a similar vein, information foraging theory (Pirolli and Card, 1999) posits that people will adopt information seeking strategies that maximize their rate of gaining valuable information (i.e., maximize the return on information foraging) for a given structure of the information environment (i.e., systems and interface design). In other words, the effectiveness of different information environments may be assessed by comparing the respective returns on information foraging, which measure the rate of gaining valuable information in the context of an embedding task.

The effectiveness (or ineffectiveness) of the e-commerce website (i.e., the online production environment) can be inferred by analyzing the inefficiencies of the customer-website interactions, which can be measured by estimating the deviations from the production frontier for the observed online transactions. Note, however, that the focus of efficiency estimation here is not on assessing the productivity of each production unit (i.e., a customer transaction) per se. Instead, it is to assess the overall productivity of the production environment. Hence, we are not interested in the individual inefficiency estimates but in the overall distribution of the inefficiency estimates given a particular production environment (i.e., a particular website design). Conceptually, there may be two sources of inefficiency: customer inefficiency and website design inefficiency. This distinction is similar to the managerial versus program efficiency investigated by Charnes, Cooper and Rhodes (1981) in the context of evaluating the effectiveness of "Program Follow Through" compared to "Non-Follow Through" in public education.
First, the customers may be inefficient in that they do not use the website design features that are optimal for task performance. This is customer inefficiency (i.e., inefficiency due to poor execution by the customers). Second, the design of the website may be so poor that even efficient consumers cannot complete their tasks efficiently. This is website design inefficiency (i.e., inefficiency due to poor website design).

Analyzing the Sources of Inefficiency. The question then is to determine from the observed efficiency measures what the source of inefficiency is. We approach this by extending the work of Frei and Harker (1999), organizing the observations into subgroups of production environments (e.g., different website designs) to see if one subgroup outperforms another. In other words, we organize the customer shopping transactions by website designs to estimate efficiency scores for each subgroup. As a result, each customer transaction will have two efficiency measures: one overall efficiency measure, and one for its design subgroup. Customer and website design inefficiencies are measured by:

  CustomerInefficiency = Inefficiency_DesignGroup / Inefficiency_Overall , and

  WebsiteDesignInefficiency = 1 − CustomerInefficiency = 1 − (Inefficiency_DesignGroup / Inefficiency_Overall) .

The logic behind these measures is illustrated in Figure 2, which depicts a hypothetical situation where we are comparing two website designs, Design 1 and Design 2. (See Figure 2.)

Figure 2. Decomposing Inefficiency (figure: output versus input, showing three production frontiers, one for Design 1, one for Design 2, and one overall frontier, with three transactions j1, j2 and j3 plotted against them)

The figure depicts three production frontiers: one for Design 1, one for Design 2, and finally one overall frontier for all observations. First, consider the online shopping transaction j1, which is part of Design 1. j1 lies on the efficiency frontier of both its design group and the overall frontier. Hence, j1 is efficient (i.e., there is no room for customer or website design inefficiency). Second, consider the transaction j2, which is part of Design 2.
Even though j2 lies on the efficiency frontier of its design group (Design 2), it lies below the overall efficiency frontier and hence is not 100% efficient. Since the inefficiency score (i.e., deviation from the production frontier) for the design group is 0, the portion of inefficiency due to poor execution (i.e., customer inefficiency) is 0%, whereas the portion of inefficiency due to poor website design (i.e., website design inefficiency) is 100%. In other words, even though the transaction j2 was highly productive within its current production environment (Design 2), there still exists a potential for increased productivity: the same level of output could have been achieved with less input if j2 had interacted with the other website design (Design 1).

Third, consider the transaction j3, which is inefficient overall, as well as within its own design group (Design 2). We can see that the deviation from the overall frontier is approximately twice the deviation from its design group frontier. Hence, customer inefficiency and website design inefficiency will both be approximately 50%.

Analyzing the different sources of website design inefficiencies provides an innovative approach to analyzing the design performance of e-commerce websites. The results of such inefficiency analyses will have different implications depending on how the results turn out. If the source of inefficiency is the website design (i.e., the inefficiency is present for all customers), then it becomes a signal to the e-commerce manager that she should think about a radical redesign. However, if the source of inefficiency is the customer, then radical redesign would not be necessary. Rather, various other remedial approaches will be more effective. For example, the e-commerce firm may target email messages to less efficient customers to inform them about features that exist which they do not currently use, or redesign the website to make these hidden areas more salient and accessible. A minimal computational sketch of this decomposition follows.
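The decomposition can be computed directly from the two sets of DEA deviations. The following minimal Python sketch is our illustration (the function and variable names are assumptions, not the paper's code); it reproduces the j2 and j3 examples discussed above:

    def decompose(dev_group, dev_overall):
        """Split an inefficient DMU's deviation into its two sources.

        dev_group:   deviation from the design group's frontier (theta_group - 1)
        dev_overall: deviation from the overall frontier (theta_overall - 1);
                     must be > 0, i.e., the DMU is inefficient overall
        """
        customer = dev_group / dev_overall    # share due to poor execution
        design = 1.0 - customer               # share due to website design
        return customer, design

    print(decompose(0.0, 0.4))   # j2: (0.0, 1.0), all inefficiency from design
    print(decompose(0.3, 0.6))   # j3: (0.5, 0.5), roughly half and half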
Towards a Formal Methodology. With the above conceptualization of customer and website design inefficiency, we can devise formal methods for comparing different website designs. Since customer inefficiency relates to inefficiency due to poor execution given a production environment, customer inefficiency can be measured by the deviation from the production frontier. Hence, comparing customer inefficiency between website designs equates to comparing the means of the distributions of the inefficiency deviations. Consider Figure 3, which depicts a hypothetical scenario where the individual observations (i.e., customer transactions) for two different website designs are plotted. (See Figure 3.)

Figure 3. Comparing Customer Inefficiency (figure: two scatter plots of output against input, one for Website Design A and one for Website Design B, each with its own production frontier; the observations for Design A cluster closer to their frontier than those for Design B)

We may clearly see that the inefficiency deviations for Website Design A are smaller than those for Website Design B. In other words, these results would suggest that Website Design B has greater customer inefficiency than Website Design A, or that Website Design A is more "customer efficient" than Website Design B. Next, since website design inefficiency refers to the inefficiency due to poor website design, this inefficiency is associated with the best practice production frontiers for the respective production environments. In other words, we are interested in how well the website design performs if the website were to produce the maximum level of outputs for each given level of input. Hence, the comparison of website design inefficiency between website designs can be achieved by directly comparing the production frontiers. Figure 4 presents a hypothetical situation where the production frontier for Website Design A dominates the production frontier of Website Design B (i.e., for any level of output, Website Design A can produce that output with less input than Website Design B).2 Such results would suggest that Website Design B has greater website design inefficiency compared to Website Design A, or that Website Design A is more "website design efficient" than Website Design B. (See Figure 4.)

Figure 4. Comparing Website Design Inefficiency (figure: output versus input, showing the production frontier of Website Design A lying above that of Website Design B)

2 There may also be a situation where the frontiers intersect (Charnes and Cooper, 1980). In such cases, one can compare website design inefficiencies for subsets of the two website designs grouped by magnitude of input (or output) (Brockett and Golany, 1996).

EMPIRICAL METHODS FOR EXAMINING WEBSITE EFFICIENCY

To illustrate the value of our proposed e-commerce website evaluation methodology, we apply our technique to a currently operational e-commerce website. We next present the research methodology of the empirical examination.

Research Site and Data

We next present information on the research site, and the nature of the unique data and new methods that permit us to provide new kinds of insights into the problem of website design and evaluation in Internet-based selling.

Research Site. Data for this study were collected at an online retailer of groceries. The online grocer is a pure-play Internet-based retailer that delivers groceries directly to the customer's doorstep with the mission of "taking the dread out of grocery shopping." The company made its first delivery in April 1999, and by mid-July 2000 it had over 9,000 customers who generated more than $16 million in revenue. Currently, the organization continues to operate in only one metropolitan area in the upper Midwest, where it is the only online service within its regional market.

Performance evaluation of the website at the company has multiple purposes. First, performance evaluation is carried out to assess and manage the business, and to assure investors that their invested funds are deployed in a manner that has the potential to create significant returns. Second, performance evaluation of the website is employed to find ways to improve the business process that customers participate in when they shop, and, as a result, firm performance. Similar to many other web-based businesses, the company has adopted the attitude that "you can't manage what you can't measure": competent measurement is a precursor to the formulation of effective management policy for the firm's online operations. With this goal in mind, management spends time on website performance evaluation so that it can generate insights into how the website is operating, what changes are required to improve service quality, and why one change might be given a greater priority than another, due to the relative leverage on ROI that each may provide. Currently, the data for estimating these business metrics are derived from two separate systems.
One is the customer data warehouse and the other is a website analysis tool that is provided as a bundled service by an out-of-state application service provider that hosts the firm's website. The data warehouse, which contains customer and sales data, is used to conduct market basket analysis (Berry and Linoff, 1997). For example, final sales statistics are used to answer questions such as: "What are our best selling products?" "What are the demographics of our customer segments?" And "What is the average profitability for each customer segment?" This analysis is valuable for assessing the overall performance of the online service (e.g., merchandizing effectiveness). However, it provides very little managerially-actionable information about how to improve the company's website.

The website analysis and data mining tool, WebTrends, is employed towards this second goal. It compiles web server logs to generate website usage statistics. (For more detailed information on this tool, the interested reader should see www.netiq.com.) The analysis tool offers a series of pre-packaged reports that show various aspects of online activity. For example, some reports list the most requested pages, whether the page hits come to the website through external "referring" websites, the browsers that are used by people who visit the site, the number of hits and visits for a given date range on different parts of the website, and the most frequently occurring HTTP errors. These reports are used to answer various questions such as "What are the most popular product categories?" "What are the most popular products?" "When do customers shop?" and "What is the ratio of customers who shop or browse versus customers who purchase?"

A shortcoming of the ready-made reports is that they are designed to list only a set of top 200 statistics, constraining the extent to which the tool can extract useful data to support a variety of managerial decision and evaluation tasks. For example, like a typical grocery store, the number of products offered by our data site is much greater than 200, so it is impossible for the firm's management to acquire a complete and accurate view of site usage if they wish to track more than this number of products. As the reader can imagine, this has become a major source of frustration for the firm's managers. They often are more interested in identifying the least visited areas of the website, and the pre-packaged statistics tend to focus more on the most frequently-visited pages (e.g., the home page, the checkout page, the sale items page, and so on). Thus, in our field study, we learned that the tools and techniques for evaluating the performance of the company's website were surprisingly limited, nowhere near what we expected for a firm that prided itself on its innovation and technology-enabled managerial sophistication. As a result, and given the lack of useful managerially-actionable information, senior managers at the online grocer rely largely on "gut feel" and intuition when it comes to decision-making about design changes.

Data Collection. Clickstream data were collected directly from the online grocer's web servers. The website uses HTTP session cookies downloaded onto the visitor's computer to track the customer's shopping behavior at the site. Typical data pre-processing procedures for web server logs were used to extract navigation path sequences for each individual visitor from the clickstream data (Cooley, Mobasher, and Srivastava, 1999); a minimal sketch of the sessionization step appears below.
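To make this step concrete, the following minimal Python sketch illustrates cookie-based sessionization with a conventional 30-minute inactivity timeout, in the spirit of the standard web usage mining procedures cited above. It is our illustration with assumed names and parameters, not the firm's or the authors' actual pipeline:

    from collections import defaultdict
    from datetime import timedelta

    SESSION_TIMEOUT = timedelta(minutes=30)   # a conventional inactivity cutoff

    def sessionize(requests):
        """Group (cookie_id, timestamp, url) records, assumed time-ordered,
        into per-visitor navigation sessions split on long inactivity gaps."""
        sessions = defaultdict(list)          # cookie_id -> list of sessions
        last_seen = {}
        for cookie_id, ts, url in requests:
            prev = last_seen.get(cookie_id)
            if prev is None or ts - prev > SESSION_TIMEOUT:
                sessions[cookie_id].append([])    # start a new session
            sessions[cookie_id][-1].append((ts, url))
            last_seen[cookie_id] = ts
        return sessions

The resulting per-visitor navigation path sequences are the raw material from which the usage metrics described next are computed.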
The navigation sessions were then combined to identify purchase transactions, from which website usage metrics were extracted to measure the extent to which various areas of the website were used in each of the purchasing processes. An overview of the data processing procedure is presented in Figure 5. (See Figure 5.)

Figure 5. Data Processing Procedure (diagram: the website source code (HTML, ASP files) feeds a website design analysis step (page classification and website topology identification) producing website design metadata; the web server logs feed a pre-processing step (data cleaning, path completion, session identification, user identification and transaction identification) producing a user-session-transaction database; together these yield the website usage metrics)

The current dataset spans the two weeks from June 23 to July 5, 2001. In this time period, a total of 36,051 sessions were recorded by 18,297 unique customers. The analysis will focus on 5,383 actual completed purchasing transactions from 4,941 customers. We selected this period for analysis because there was a design change in the middle; only the homepage of the website (i.e., the first page after the login screen) was changed.

Empirical Methods Using Data Envelopment Analysis (DEA)

We will illustrate the value of the proposed methodology by comparing the efficiencies of two website designs. To evaluate the effectiveness of the online grocer's e-commerce website, we employ data envelopment analysis (DEA), a non-parametric methodology for production frontier estimation (Charnes, Cooper, and Rhodes, 1978; Banker, Charnes, and Cooper, 1984). We employ the non-parametric model, DEA, rather than a parametric model (e.g., stochastic frontier estimation) to estimate the production relationship between online shopping input and output. We do this because DEA does not assume a specific functional form for the production function and requires only the relatively weak assumptions of a monotonically increasing and convex relationship between inputs and outputs. The parametric formulation of stochastic frontier estimation and the non-parametric formulation of DEA have been shown in prior research to yield very similar results (Banker, Datar, and Kemerer, 1991). DEA estimates the relative efficiencies of decision-making units (DMUs) from observed input measures and output measures. The relative productivity of a DMU is evaluated by comparing it against a hypothetical DMU that is constructed as a convex combination of other DMUs in the dataset. In our current analyses, we employ an input-oriented CCR model (Charnes, Cooper, and Rhodes, 1978, 1981) to estimate the efficiencies of online shopping transactions.3

3 The BCC model (Banker, Charnes, and Cooper, 1984) allows for variable returns to scale to estimate technical inefficiency at a given scale of production; the CCR model (Charnes, Cooper, and Rhodes, 1981) assumes constant returns to scale and estimates the aggregate of technical and scale inefficiencies. There are several conflicting arguments for or against the appropriateness of each model in our application. For example, one may argue that since the CCR model estimates measures of efficiency that represent the aggregate of scale and technical efficiencies whereas the BCC model estimates the technical efficiency, the BCC model may be more appropriate. Why so? Because the analytical objective is to estimate and compare the technical efficiencies of website designs. (This was suggested to us by Rajiv Banker in a personal communication, December 15, 2002.) Furthermore, using BCC estimates of efficiency for hypothesis testing when comparing the efficiencies of website designs may provide a stricter test, since BCC estimates will inevitably be less than (or at most equal to) the CCR estimates. On the other hand, it may be argued that the CCR model provides a more appropriate conceptualization of the efficiency measures. Why? Because in the online shopping context the size of the transaction (i.e., the number of items purchased), as an indicator of scale size, is under the control of the customer and not the e-commerce firm.

When using DEA, the subsequent analysis is only as good as the initial selection of input and output variables. The input and output variables are to be selected such that the inputs represent the resources consumed by the DMUs and the outputs represent the performance of the DMUs. (See Table 1.) Our model views the online shopping experience as a self-service production system where customers (i.e., decision-making units or DMUs) use a production technology (i.e., the e-commerce website) to produce an output (i.e., a shopping transaction). In our online shopping context, we conceptualize the input as customers' actions in navigating through the e-commerce website in order to produce a transaction, in which the output can be regarded as a checkout of a basket of products.
Table 1. Input and Output Variables for Website Efficiency Measurement

  CATEGORY  VARIABLE  MEASURE         DESCRIPTION
  Inputs    x1        Products        Number of product page views
            x2        Lists           Number of product list views
            x3        Personal        Number of personal list views
            x4        Order History   Number of order history page views
            x5        Search          Number of searches conducted
            x6        Promotion       Number of promotional page views
            x7        Recipe          Number of recipe page views
            x8        Checkout        Number of checkout page views
            x9        Help            Number of help page views
  Output    y1        Basket size     Number of items at checkout

The efficiency h0 of an online transaction j0, characterized on the basis of the consumption of inputs x_i0 and the production of output y_0, is assessed by solving the following linear program:

  Min h0   (Online Shopping Efficiency Evaluation Model)

  subject to

    h0 x_i0 ≥ Σ_{j=1..n} x_ij λ_j ,  i = 1, …, 9 inputs
    y_r0 ≤ Σ_{j=1..n} y_rj λ_j ,  r = 1 output
    λ_j ≥ 0 for all j

The specification of the constraints in the above linear program is such that the production possibility set conforms to the axioms of production in terms of convexity, monotonicity, constant returns to scale and minimum extrapolation (Banker, Charnes, and Cooper, 1984). The first constraint ensures that all observed input combinations lie on or within the production possibility set defined by the production frontier (i.e., the envelopment conditions for the input). The second constraint maintains that the output levels of inefficient observations are compared to the output levels of a convex combination of observed outputs. The final constraint ensures that all values of the production convexity weights are greater than or equal to zero. The DEA program is run iteratively for all online shopping transactions to yield efficiency scores h*_j for all DMUs (j = 1…J), with the inefficiency score defined as the reciprocal, θ*_j = 1/h*_j. A DMU j0 is said to be fully efficient if the optimal solution h*_j0 to its linear program yields h*_j0 = 1 with no slack (i.e., no excess input or output slack). DMUs with h*_j < 1 are said to be inefficient.
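For concreteness, the linear program above can be solved with any off-the-shelf LP solver. The following minimal Python sketch, using scipy, is our illustration of the input-oriented CCR envelopment model for one DMU; the paper does not report its implementation, and all names here are assumptions:

    import numpy as np
    from scipy.optimize import linprog

    def ccr_input_efficiency(X, y, j0):
        """Efficiency h* of DMU j0 under the input-oriented CCR model:
        minimize theta such that a convex combination of all DMUs produces
        at least y[j0] while using at most theta * X[j0] of every input."""
        n, m = X.shape                      # n transactions, m = 9 inputs
        c = np.zeros(n + 1)                 # decision vector [theta, lambda_1..n]
        c[0] = 1.0                          # objective: minimize theta
        # Inputs: sum_j lambda_j * x_ij - theta * x_i0 <= 0, for each input i
        A_in = np.hstack([-X[j0].reshape(m, 1), X.T])
        # Output: -sum_j lambda_j * y_j <= -y[j0]  (output at least y[j0])
        A_out = np.hstack([[0.0], -y]).reshape(1, n + 1)
        res = linprog(c,
                      A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.concatenate([np.zeros(m), [-y[j0]]]),
                      bounds=[(0, None)] * (n + 1),
                      method="highs")
        return res.x[0]                     # h*_j0; inefficiency theta* = 1/h*

    # Run iteratively over all transactions, as described above:
    # scores = np.array([ccr_input_efficiency(X, y, j) for j in range(len(y))])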
Empirical Analysis and Hypothesis Testing

Given that the management at our research site implemented a design change for its website in the middle of the data collection period, our analysis will focus on estimating and comparing the website efficiencies, both customer efficiency and website design efficiency, of and between the two designs. Customer inefficiency relates to inefficiency due to poor execution with a particular website design. Hence, the estimation of customer efficiency/inefficiency involves conducting DEA efficiency estimations separately for each website design condition (i.e., each DMU is compared only to other DMUs that transacted with the same website design). To test whether the customer inefficiency scores for one website design are greater (or smaller) than those of another website design, we need to compare the inefficiency distributions between the two website design conditions.

On the other hand, website design inefficiency relates to inefficiency due to poor website design. Hence, estimating website design inefficiency involves estimating the production frontier (i.e., the data envelope), and comparing website design inefficiencies between different website designs involves analyzing whether the production frontier for one website design outperforms that of the other website design. In order to do so, the inefficient DMUs are adjusted to their "level-if-efficient" values by projecting each DMU onto the production frontier of its website design condition (a minimal sketch of this adjustment is shown below). In other words, for each website design condition, all (or most) DMUs (i.e., the originally-efficient DMUs and the adjusted-to-efficiency inefficient DMUs) should be expected to be rated as efficient within their website design condition. Next, a pooled (inter-envelope) DEA is conducted with all DMUs (i.e., all DMUs in both website conditions) at their adjusted efficient levels. Finally, to test whether website design inefficiency for one website design is greater (or smaller) than that of the other website design, we need to compare the efficiency ratings derived from the pooled DEA between the website design conditions.
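The adjustment step referenced above can be sketched as follows. This is our illustration under the assumption that only the radial projection is applied; any remaining non-radial slacks are ignored for brevity:

    import numpy as np

    def adjust_to_frontier(X, scores):
        """Radial projection onto the frontier: x_hat_j = h*_j * x_j, where
        scores holds each DMU's within-group efficiency h*_j."""
        return np.asarray(X) * np.asarray(scores)[:, None]

    # X1_adj = adjust_to_frontier(X1, h_star_d1)   # Week 1 DMUs
    # X2_adj = adjust_to_frontier(X2, h_star_d2)   # Week 2 DMUs
    # The pooled (inter-envelope) DEA is then run on the stacked adjusted data.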
We adopt the statistical test procedure proposed by Banker (1993) for comparing efficiency ratings between groups. Basically, the statistical procedure involves testing whether the means of the inefficiency score probability distributions for different conditions are different. Two test statistics were proposed by Banker, depending on whether the inefficiency deviations of the observed data are postulated to be drawn from exponential or half-normal distributions. It is reasonable to assume an exponential distribution for the inefficiency deviations when one has reason to believe that most observations are close to the production frontier, whereas a half-normal distribution should be assumed when few observations are likely to be close to the frontier.

The overall test procedure is as follows. Let j represent an online shopping transaction in the overall dataset. The set J of online transactions consists of two subsets D1 and D2 representing the two different website designs (i.e., D1 for Week 1 and D2 for Week 2). We denote the inefficiency score of a shopping transaction j in group Di by θ_j^Di to distinguish between the two groups and to allow for the possibility that the probability distributions for the two sets of inefficiency scores θ^D1 and θ^D2 may be different. If we assume the inefficiency deviations to be exponentially distributed (i.e., θ^D1 − 1 exponentially distributed with parameter σ1, and θ^D2 − 1 exponentially distributed with parameter σ2), the null hypothesis is that the two website designs are not different in terms of inefficiency deviations, H0: σ1 = σ2. The alternative hypothesis is H1: σ1 > σ2, that the website design of Week 1 has greater input inefficiency than that of Week 2 (i.e., the website design of Week 1 is less efficient than that of Week 2). The test statistic is:

  [ Σ_{j∈D1} (θ_j^D1 − 1) / n_D1 ] / [ Σ_{j∈D2} (θ_j^D2 − 1) / n_D2 ]

The test statistic asymptotically follows the F-distribution with (2n_D1, 2n_D2) degrees of freedom for large n, where n_D1 and n_D2 are the numbers of observations in the subsets D1 and D2, respectively. On the other hand, if we assume the inefficiency deviations to be half-normally distributed, then we need to use a different test statistic, as follows:

  [ Σ_{j∈D1} (θ_j^D1 − 1)^2 / n_D1 ] / [ Σ_{j∈D2} (θ_j^D2 − 1)^2 / n_D2 ]

This statistic again asymptotically follows an F-distribution with (n_D1, n_D2) degrees of freedom for large n.

When comparing customer inefficiency between website designs, the inefficiency scores are derived by running DEA separately for each website design condition, whereas when comparing website design inefficiency, the inefficiency scores are derived by running the pooled DEA with all DMUs at their adjusted efficient levels for their respective website design. To avoid confusion between the two sets of inefficiency scores, we will use θ_j^Di to denote the inefficiency ratings derived from the separate (between-group) DEA and θ̂_j^Di (with a hat) to denote the inefficiency ratings derived from the pooled (inter-envelope) DEA with adjusted-to-efficiency DMUs. In either case, the general statistical test procedure is the same for both customer inefficiency and website design inefficiency comparisons.
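Both test statistics reduce to ratios of average (or average squared) inefficiency deviations, so they are straightforward to compute once the DEA scores are in hand. The following minimal Python sketch is our illustration, not the authors' code:

    import numpy as np
    from scipy.stats import f as f_dist

    def banker_tests(theta1, theta2):
        """Banker's (1993) F-tests of H1: group 1 is more inefficient than
        group 2, given per-DMU inefficiency scores theta = 1/h*."""
        d1, d2 = np.asarray(theta1) - 1.0, np.asarray(theta2) - 1.0
        n1, n2 = len(d1), len(d2)
        # Exponential deviations: ratio of mean deviations, df = (2n1, 2n2)
        F_exp = d1.mean() / d2.mean()
        p_exp = f_dist.sf(F_exp, 2 * n1, 2 * n2)
        # Half-normal deviations: ratio of mean squared deviations, df = (n1, n2)
        F_hn = (d1 ** 2).mean() / (d2 ** 2).mean()
        p_hn = f_dist.sf(F_hn, n1, n2)
        return (F_exp, p_exp), (F_hn, p_hn)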
RESULTS

We now present the results of our empirical examination of the evaluation of the online grocer's e-commerce website. We first present overall results of the aggregate efficiency estimations, and then follow by presenting the results of comparing customer and website design efficiencies between the website designs.

Overall Results

We first report the overall results of the aggregate efficiency estimation. Table 2 presents descriptive statistics of the DEA results. These results seem to suggest that, overall, the website exhibits an average level of website efficiency, since the efficiency scores for the middle 50% of the transactions range between 0.522 and 0.785. There also seem to be quite a few inefficient transactions, since the efficiency scores for the bottom 25% of the transactions range between 0.108 and 0.521. Another way to see this is to observe the long tail of the distribution of inefficiency deviations, where the top 25% range between 0.915 and 8.292. (See Table 2.)

Table 2. DEA Aggregate Efficiency Score Summary Statistics

  STATISTIC       EFFICIENCY SCORE   INEFFICIENCY DEVIATION
  Minimum         0.108              0.000
  Maximum         1.000              8.292
  Mean            0.648              0.717
  Std Deviation   0.187              0.697
  1st Quartile    0.522              0.274
  Median          0.644              0.552
  3rd Quartile    0.785              0.915

Note: The number of observations in this analysis is 5,383. Efficiency scores (0 < h*_j ≤ 1) are estimated from the DEA, whereas the inefficiency deviations are derived from the efficiency scores (θ*_j − 1 = 1/h*_j − 1). Transactions with efficiency scores close to 1 are efficient.

Figure 6 plots the aggregate efficiency scores of all DMUs (5,383 observations) against the respective output of each observation (i.e., the basket size, or number of items in the cart at checkout). (See Figure 6.) Visual inspection gives a summary of overall website efficiency. The plot shows variability of efficiency scores at all levels of output, suggesting that the website may be ineffective.

Figure 6. DEA Aggregate Efficiency Scores by Output Level

Note: The graph represents the aggregate efficiency scores estimated from the overall DEA (5,383 observations). Hence, at this point we do not distinguish between transactions that occurred with the different website designs. The horizontal axis represents the efficiency scores of the online shopping transactions (0 < h*_j ≤ 1), whereas the output level (i.e., number of items in the cart at checkout) is represented on the vertical axis. The efficient transactions lie on (or near) the right edge of the graph (h*_j ≈ 1).

Comparing DEA Scores for Customer Efficiency

We now compare the efficiencies of the two website designs in our data collection period. We first identified two sub-samples of online shopping transactions that did not span weeks. In other words, since a customer may initiate a transaction during Week 1 and complete it during Week 2, we are interested in comparing efficiency scores for only those transactions that were performed entirely with one of the two website designs. Of the 5,383 completed transactions in our dataset, we identified 789 transactions which started and ended during Week 1 (subgroup D1) and 604 which started and ended during Week 2 (subgroup D2). The online shopping DEA model was run iteratively for each of the subsets to estimate the relative customer efficiencies (h_j^Di) and the inefficiency deviations (θ_j^Di − 1). Table 3 presents the descriptive statistics of the DEA results and Figure 7 shows the distribution of observed inefficiency deviations. These results seem to suggest that D2 outperforms D1 in customer efficiency (i.e., greater average customer efficiency scores, or smaller average customer inefficiency deviations). (See Table 3 and Figure 7.)

Table 3. DEA Customer Efficiency Scores Summary Statistics

                  EFFICIENCY SCORE     INEFFICIENCY DEVIATION
  STATISTIC       D1        D2         D1        D2
  Minimum         0.111     0.211      0.000     0.000
  Maximum         1.000     1.000      8.035     3.742
  Mean            0.551     0.766      1.060     0.415
  Std Deviation   0.177     0.186      0.895     0.494
  1st Quartile    0.428     0.640      0.513     0.078
  Median          0.551     0.782      0.815     0.278
  3rd Quartile    0.661     0.928      1.336     0.562

Note: The subset D1 for Week 1 has 789 observations; the subset D2 for Week 2 has only 604. These observations represent online transactions that started and ended during their respective weeks. They are only those customer transactions where the customer interacted with one particular website design in completing a transaction.

Figure 7. Distribution of Observed Customer Inefficiency Deviations (histogram: percentage of customers by inefficiency deviation, 0 to 5, for D1 (Week 1), D2 (Week 2) and overall)

Note: The figure shows the distributions of the inefficiency deviations (1/h*_j − 1) for the customer efficiency scores for the subsets D1 and D2, and for the aggregate customer efficiency scores. Greater mass near the left edge of the graph (i.e., inefficiency deviation 1/h*_j − 1 ≈ 0, or h*_j ≈ 1) implies that most observations are on or close to the efficient frontier. The long tail indicates that some, but not many, transactions are highly inefficient.
We conducted hypothesis tests using the DEA-based test statistics proposed by Banker (1993) to validate these findings. Table 4 presents a summary of the results. (See Table 4.)

Table 4. Summary of Hypothesis Test Comparing Customer Efficiency Scores

  HYPOTHESIS     ASSUMED DISTRIBUTION FOR    TEST       CRITICAL F (α = 0.01)    RESULT
                 INEFFICIENCY DEVIATIONS     STATISTIC
  H0: σ1 = σ2    Exponential                 2.557      F(1578, 1208) = 1.135    Reject H0
  H1: σ1 > σ2
  H0: σ1 = σ2    Half-normal                 4.624      F(789, 604) = 1.119      Reject H0
  H1: σ1 > σ2

Note: This table summarizes the hypothesis tests comparing the customer efficiency scores using the test statistics proposed by Banker (1993), as given above. Note that the degrees of freedom for the F-test when the assumed distribution for the inefficiency deviations is exponential are twice the number of observations in each subset (2 × n_D1 = 2 × 789 = 1578 and 2 × n_D2 = 2 × 604 = 1208).

The null hypothesis (i.e., H0: σ1 = σ2) that there are no differences in input customer inefficiencies was rejected (at the α = 0.01 level) for both the exponential and half-normal distribution assumptions for the inefficiency deviations. Therefore, we accept the alternative hypothesis that the website design of Week 2 resulted in reduced input customer inefficiencies. Similar results were obtained with the non-parametric test of Brockett and Golany (1996), which uses the Mann-Whitney rank statistic to evaluate the significance of differences in observed customer efficiencies between website designs. The test statistic Z_test using the Mann-Whitney statistic (U = 156,440) was −10.999 < −Z_α/2 = −2.576 (α = 0.01). This suggests that customer inefficiency for D1 was greater than for D2. Again we have evidence that the Week 2 design led to reduced input customer inefficiencies.
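For completeness, a rank-based comparison of this kind can be performed with SciPy's implementation of the Mann-Whitney U test. The sketch below is our illustration only; the placeholder data stand in for the actual per-group inefficiency scores, with exponential parameters loosely echoing the mean deviations in Table 3:

    import numpy as np
    from scipy.stats import mannwhitneyu

    # Placeholders: the real inputs would be the within-group DEA
    # inefficiency scores (theta = 1/h*) for Weeks 1 and 2.
    rng = np.random.default_rng(0)
    theta_d1 = 1.0 + rng.exponential(1.060, size=789)   # Week 1 (D1)
    theta_d2 = 1.0 + rng.exponential(0.415, size=604)   # Week 2 (D2)

    # One-sided test: is inefficiency for D1 stochastically greater than D2?
    U, p = mannwhitneyu(theta_d1, theta_d2, alternative="greater")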
Figure 8. Distribution of Observed Website Design Inefficiency Deviations

[Figure: distributions of inefficiency deviations (horizontal axis, 0 to 0.5) against the percentage of customers (vertical axis, 0% to 60%) for D1 (Week 1), D2 (Week 2), and the overall sample.]

Note: The above graph shows the distributions of the inefficiency deviations ($1/\hat{h}_j^* - 1$) for the website design efficiency scores for the two subsets (D1 and D2), as well as for the aggregate website design efficiency scores. The interpretation of the graph is the same as for Figure 7. However, note the difference in order of magnitude of the inefficiency deviations (horizontal axis) between the two graphs. This is because the observations have been adjusted to efficiency levels within their respective subsets prior to conducting the pooled DEA. Hence, most of the observations should be close to the efficient frontier, leading to the order of magnitude difference in inefficiency deviation levels.

As before, we conducted hypothesis tests using the DEA-based heuristics proposed by Banker (1993) to validate these findings. (See Table 6.)

Table 6. Summary of Hypothesis Test Comparing Website Design Efficiency Scores

Both tests evaluate H0: $\sigma_1 = \sigma_2$ against H1: $\sigma_1 < \sigma_2$.

Assumed distribution for inefficiency deviations: Exponential
Test statistic: $\dfrac{\sum_{j \in D_2} (\hat{\theta}_j^{D_2} - 1)/n_{D_2}}{\sum_{j \in D_1} (\hat{\theta}_j^{D_1} - 1)/n_{D_1}} = 8.546$
Critical F ($\alpha = 0.01$): $F(1208, 1578) = 1.093$. Result: Reject H0.

Assumed distribution for inefficiency deviations: Half-normal
Test statistic: $\dfrac{\sum_{j \in D_2} (\hat{\theta}_j^{D_2} - 1)^2/n_{D_2}}{\sum_{j \in D_1} (\hat{\theta}_j^{D_1} - 1)^2/n_{D_1}} = 33.970$
Critical F ($\alpha = 0.01$): $F(604, 789) = 1.133$. Result: Reject H0.

Note: This table summarizes the hypothesis test comparing the website design efficiency scores. Note that the direction of the inequality for the alternative hypothesis (H1) is reversed, since we are testing whether the website design inefficiency of Week 2 (D2) was greater than that of Week 1 (D1). As a consequence, the numerators and denominators of the test statistics are switched (i.e., the numerator has the average inefficiency deviation, or average squared inefficiency deviation, for D2, whereas the average for D1 is in the denominator). Also, the degrees of freedom for the F-test are now reversed (i.e., F(1208, 1578) and F(604, 789) instead of F(1578, 1208) and F(789, 604)).

The null hypothesis (i.e., H0: $\sigma_1 = \sigma_2$) that there are no differences in input website design inefficiencies was rejected ($\alpha = 0.01$) for both the exponential and half-normal distribution assumptions for the inefficiency deviations. Therefore, we accept the alternative hypothesis that the website design of Week 2 resulted in greater input website design inefficiencies; that is, the Week 1 design was the more efficient of the two. Lending additional credibility to these results, similar results were obtained with the non-parametric test. The test statistic $Z_{test}$ using the Mann-Whitney statistic ($U = 385{,}329$) was $19.764 > Z_{\alpha/2} = 2.576$ ($\alpha = 0.01$), suggesting that website design inefficiency for D2, the Week 2 design, was greater than for D1, the Week 1 design.
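A computational sketch of these tests follows, under the assumption that the inefficiency deviations for the two subsets are held in arrays; the argument names and the use of scipy are ours, but the two F-ratios are the ones tabulated above.

```python
import numpy as np
from scipy import stats

def banker_tests(dev_a, dev_b, alpha=0.01):
    """dev_a, dev_b: inefficiency deviations (theta - 1); pass the subset
    hypothesized to have the LARGER deviations first, as in Tables 4 and 6."""
    n_a, n_b = len(dev_a), len(dev_b)
    # Exponential assumption: ratio of mean deviations ~ F(2*n_a, 2*n_b)
    f_exp = dev_a.mean() / dev_b.mean()
    crit_exp = stats.f.ppf(1.0 - alpha, 2 * n_a, 2 * n_b)
    # Half-normal assumption: ratio of mean squared deviations ~ F(n_a, n_b)
    f_hn = np.mean(dev_a ** 2) / np.mean(dev_b ** 2)
    crit_hn = stats.f.ppf(1.0 - alpha, n_a, n_b)
    return (f_exp, crit_exp, f_exp > crit_exp), (f_hn, crit_hn, f_hn > crit_hn)

# The rank-based check of Brockett and Golany (1996) can be run on the
# efficiency scores themselves, e.g.:
# u_stat, p_value = stats.mannwhitneyu(h_d1, h_d2, alternative="two-sided")
```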
Decomposing Website Efficiency

We next investigate more qualitatively the difference between the two website designs in terms of the source of inefficiency. In particular, are the observed inefficiencies largely due to customer inefficiency or to website design inefficiency? We compute both inefficiencies (i.e., customer and website design inefficiencies) for each of the inefficient DMUs. Figure 9 presents a visual overview of the results. (See Figure 9.)

Figure 9. Histogram of Website Design Inefficiency Scores

[Figure: histogram of the share of each DMU's inefficiency attributable to website design, binned from 0% to 100% on the horizontal axis, with the proportion of DMUs (vertical axis, 0 to 0.6) shown separately for D1 (Week 1) and D2 (Week 2).]

Note that website design inefficiency and customer inefficiency are complementary, so a 0% website design inefficiency corresponds to 100% customer inefficiency. Hence, distributions with their mass concentrated at the left suggest that the inefficiency lies with customers and not website design, whereas mass concentrated at the right suggests that most of the measured inefficiency is attributable to website design.

Consistent with our hypothesis tests, the website design inefficiency scores for the two different designs (D1 for Week 1 versus D2 for Week 2) show stark differences. For D1 (Week 1), more DMUs showed less inefficiency due to website design: any observed inefficiency is more likely to be due to poor execution by the customers. For D2 (Week 2), a reversal occurs: there is a large proportion of users for whom all sources of inefficiency are due to poor website design, and not poor execution.

The shift in the source of inefficiency (i.e., from more customer inefficiency during Week 1 to more website inefficiency during Week 2) suggests that the website design during Week 1 had the potential to be highly efficient. However, the customer base of the online grocer did not have the capabilities to reap the benefits of this efficient design. On the other hand, the website design during Week 2, in general, was easier to use (i.e., based on reduced customer inefficiencies) but had less potential for high efficiency. We may infer from these findings that the customer base of the online grocer is not composed of highly sophisticated Internet users, which may be expected given the domain of grocery shopping. With such a user/customer base, it may be more beneficial to go with a less-than-optimal design (in terms of the best practice frontier) so that the overall customer base may still use the website. A useful analogy is Unix versus Windows in computer operating systems. Unix is efficient because its user base consists mainly of expert programmers or power users who understand the cryptic command-line interface. However, the Windows graphical user interface makes it possible for novices to perform necessary tasks productively.
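As for how the complementary shares plotted in Figure 9 might be computed: the text above does not spell out the exact split, so the sketch below apportions each DMU's log-inefficiency between the two sources under the multiplicative decomposition $h^* = h^{D_i} \cdot \hat{h}$ implied by the adjust-then-pool procedure. Treat this as one plausible operationalization rather than the authors' formula.

```python
import numpy as np

def design_inefficiency_share(h_customer, h_design):
    """h_customer, h_design: arrays of scores in (0, 1] per DMU.
    Returns the % of each DMU's inefficiency attributed to website design
    (an assumed log-share split, for illustration only)."""
    log_cust = -np.log(h_customer)           # customer log-inefficiency
    log_design = -np.log(h_design)           # design log-inefficiency
    total = log_cust + log_design
    share = np.divide(log_design, total,
                      out=np.zeros_like(total), where=total > 0)
    return 100.0 * share                     # 0% = all customer, 100% = all design
```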
DISCUSSION

The major problems that managers at e-commerce sites face are associated with understanding how their online storefront is operating, whether the current design of the website is effective, and, if not, what design changes may increase its effectiveness. However, these problems are difficult to tackle because obtaining an adequate representation of overall website effectiveness is problematic due to the lack of visibility of customers' actions. The major challenge lies in the difficulty of gaining an understanding of how users are actually using the website. Unlike with physical stores, managers at e-commerce firms cannot directly observe customers' behaviors within the online storefront. In a physical store, for example, sales representatives may intervene to help when they encounter customers having trouble finding items in the aisles (Underhill, 1999). With e-commerce websites, however, it is difficult to observe what is happening within the virtual store. The major managerial insight that can be derived from this paper is that even though one cannot directly observe consumers' actions within the online storefront, it is possible to indirectly infer their behaviors from the clickstream traces left by the customers.

The problem then turns into a data summarization problem. Even though it is possible to reconstruct the customer's navigation traces, it is still very costly to examine these traces on an individual basis. Hence, we need to summarize the set of individual navigation traces into a meaningful measure, or set of measures, that can provide managerially actionable information about how to improve the company's website. In this paper, we have proposed that website efficiency is a worthwhile representation of the effectiveness of the e-commerce website. Measures of website efficiency represent how well the e-commerce website supports online shopping transactions.

Even though website efficiency provides a useful metric for assessing the effectiveness of the online shopping environment, it only provides a diagnosis. In other words, low website efficiency scores may indicate that the design of the e-commerce website may be ineffective. However, why it is ineffective is still unknown.4 Unfortunately, this kind of diagnosis does not provide management with immediate insights into how to improve the design of the website to increase its effectiveness.

Footnote 4: This is a recurring problem with many performance metrics provided by current weblog analysis and data mining tools. For example, knowing that a particular product is the top-selling product for this week is nice to know, but there is little that can be done with that knowledge alone. Instead, it would be more useful to have specific information that indicates the effectiveness of the placement of product promotions, and the extent to which different kinds of placement create marginal impacts on sales. With this kind of information in hand, management will have more guidance about how to improve the design of the website so as to maximize the ROI associated with the screen real estate that is being used.

With this goal in mind, we have proposed the measurement concepts of customer efficiency and website design inefficiency. Customer inefficiency relates to inefficiency due to poor execution given a particular website design. Website design inefficiency conveys the inefficiency due to poor design of the website. These finer-grained measures of website efficiency (and inefficiency) provide much-needed insight into why a particular design may be ineffective and also what to do about it. For example, high levels of website design inefficiency would suggest that the poor design of the website impedes the efficiency of the customer's interaction with the website. This would be a signal to the e-commerce manager that a fundamental redesign of the website may be necessary. On the other hand, high levels of customer inefficiency would imply that the customers are not utilizing the website to its full efficiency potential. In such a situation, the correct remedial measure would be to educate the customers. For example, the e-commerce firm may target email messages to those less efficient customers to inform them about features that exist which they do not currently use, or redesign the website to make these hidden areas more salient and accessible. (A toy illustration of this remedial logic follows below.)
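The remedial framework just described reduces to a simple triage rule over the two finer-grained scores. The cutoff values and action labels below are arbitrary illustrations, not values from the study.

```python
def remedial_action(h_overall, design_share_pct,
                    efficient_cutoff=0.9, design_split=50.0):
    """h_overall: overall efficiency score in (0, 1]; design_share_pct: % of
    the DMU's inefficiency attributable to website design. Both thresholds
    are assumed values for illustration."""
    if h_overall >= efficient_cutoff:
        return "no action"                   # already near the frontier
    if design_share_pct >= design_split:
        return "flag for design review"      # design inefficiency dominates
    return "target customer education"       # execution inefficiency dominates
```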
The application of our proposed evaluation methodology illustrates the value of our technique. We were able to generate several interesting insights concerning the effectiveness of the e-commerce website of the Internet grocer that we studied. These insights would otherwise have been difficult (or impossible) to obtain with currently available and widely-used website evaluation techniques. For example, we were able to estimate significant differences in efficiencies (and inefficiencies) for the online grocer's website pre- and post-redesign. The results of the efficiency estimation suggest that the online grocer's website had potential for highly efficient transactions pre-redesign; however, a good portion of the customers could not attain this high level of efficiency. On the other hand, the results also suggest that this potential for highly efficient transactions was lessened as a result of the redesign. Still, the online grocer's customers were relatively better able to use this less-than-optimal design to its full extent. These results are rich, and they provide a valuable empirical basis for thinking about how to redesign the website in order to increase its effectiveness.

CONCLUSION

Evaluating the effectiveness of e-commerce website design is a very important, yet highly complex, problem for e-commerce retailers. Given that the success of e-commerce retailers hinges to a great extent on the ability of the e-commerce firm to provide a high-quality website, e-commerce retailers need to constantly monitor the effectiveness of their web-based storefronts. However, current methods for website evaluation do not offer practical means for solving this problem. In this paper, we have proposed a methodology for evaluating and measuring the effectiveness of e-commerce website design that offers an innovative approach to assessing the effectiveness of e-commerce websites on an ongoing basis. Our methodology allows not only the measurement of the effectiveness of a particular design, but also the comparison of website effectiveness between designs over time. We conclude with a summary of the practical and theoretical contributions of this paper, along with caveats and considerations in applying our methodology. We end with directions for future development of this line of work.

Contributions

Our website evaluation methodology provides significant benefits over the methods currently in wide use. One of the major advantages of our proposed measurement technique is that we are able to make use of observable consumer actions for all users/customers at a given website. In fact, scalability has been a major concern with previous evaluation methods such as user testing, inquiry, or expert inspection. For example, with user testing, the number of subjects needed to generate a representative picture of website usability problems is still in debate (Spool and Schroeder, 2001; Bevan, Barnum, Cockton, Nielsen, Spool, and Wixon, 2003). Also, due to bounded rationality, it is difficult for usability experts to identify all the usability problems that may arise for the wide variety of different users who may be customers at the website. We are not, however, arguing that testing, inquiry, and inspection methods do not provide any value. On the contrary, we believe that such methods have their own specific complementary strengths and should be employed in conjunction with our proposed method.

Second, our methodology provides an unobtrusive approach to data collection. Though online user surveys leverage the available web technologies and are currently being widely adopted, non-respondents will still exist. Moreover, in the context of frequent website redesigns, which are the norm rather than the exception in e-commerce, it becomes difficult to solicit continuous responses for each website redesign.
Furthermore, the obtrusive nature of the survey method (and also of the user testing method) may introduce response bias from the participants, which may contaminate the results. Thus, we view the survey method as a weaker method for studying these systems design contexts. A major benefit of our methodology, then, is that we may bypass the aforementioned problems by making use of automatically collected web server log data. Since the web navigation behavior is collected as it occurs in a genuine real-world setting, the potential for bias in the data is minimized.5 In addition, with additional effort, the data collection, preparation, and even the efficiency estimation procedures can be systematically programmed into the web application servers, making it possible to automatically generate efficiency metrics on a continuous basis, so that e-commerce managers can monitor the effectiveness of their website on an ongoing basis without suffering the costs of extraneous data collection and tedious analysis. Such additional developments will allow e-commerce firms to gain competitive advantage by reducing the feedback cycle time between website evaluation and website redesign.

Footnote 5: However, there are important privacy concerns, since customers are extremely averse to the idea that someone is monitoring their website usage behaviors (Srivastava, Cooley, Deshpande, and Tan, 2000). The World Wide Web Consortium (W3C) has an ongoing initiative called the "Platform for Privacy Preferences." It recommends that site administrators publish a site's privacy policies in machine-readable format, so that web browsers only request and display pages that conform with the user's privacy preferences. Still, most users are not aware of these features, and conformance by firms to these protocols is currently not regulated by law.
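As a concrete, deliberately simplified illustration of the automated pipeline just described, the sketch below sessionizes raw web server log records into per-transaction DEA inputs and an output. The log field names, the set of functionalities, and the 30-minute session timeout are our assumptions, not details from the research site.

```python
from collections import defaultdict
from datetime import timedelta

SESSION_TIMEOUT = timedelta(minutes=30)                 # assumed timeout
FUNCTIONALITIES = ("search", "browse", "list", "cart")  # assumed page types

def build_dmus(records):
    """records: iterable of dicts with keys 'user', 'time' (datetime),
    'page_type', and 'basket_size' on checkout requests. Returns one
    (input-counts, output) pair per completed transaction."""
    sessions = defaultdict(list)             # user -> list of sessions
    for r in sorted(records, key=lambda r: (r["user"], r["time"])):
        user_sessions = sessions[r["user"]]
        if (not user_sessions or
                r["time"] - user_sessions[-1][-1]["time"] > SESSION_TIMEOUT):
            user_sessions.append([])         # start a new session
        user_sessions[-1].append(r)

    dmus = []
    for user_sessions in sessions.values():
        for sess in user_sessions:
            inputs = {f: sum(r["page_type"] == f for r in sess)
                      for f in FUNCTIONALITIES}
            checkouts = [r for r in sess if r["page_type"] == "checkout"]
            if checkouts:                    # keep completed transactions only
                dmus.append((inputs, checkouts[-1]["basket_size"]))
    return dmus
```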
Third, this paper presents a formal methodology for website evaluation. Our methodology outlines the full cycle, starting from model formulation, data requirements and collection, model operationalization, and efficiency estimation, all the way to hypothesis testing. We also present a framework for remedial action when the results suggest it is appropriate. Furthermore, the empirical methods outlined in this paper do not employ any proprietary data that is specific to the particular research site that we have investigated. Rather, our methods only make use of data that are available to all e-commerce firms (i.e., raw clickstream data from web server logs). Therefore, our method should be readily applicable to different firms.

Finally, from a theoretical standpoint, this paper is the first to introduce the production perspective to the evaluation of online shopping. The theory of production features a number of useful conceptualizations for the kind of context that we have studied (including, e.g., the production function, the best practice frontier, returns to scale, the technical rate of substitution, and so on). The use of this theoretical basis also opens up the use of other related, highly sophisticated methodological tools for estimating the productivity of a production technology (e.g., data envelopment analysis, parametric frontier estimation, etc.). We have only touched upon a few of these concepts and techniques in the current paper. We believe that new insights will be generated by incorporating additional concepts from production theory into the problem of website evaluation. This paper thus presents a first step in this exciting direction. This newly-introduced evaluation perspective for online shopping should stimulate additional research into the area of e-commerce website evaluation, and also, more broadly, into the research domain of online shopping.

Caveats and Considerations

The reader should consider a number of caveats and considerations relative to the interpretation of the results of this study, as well as to the implementation of the methodology that we propose.

On the Possibility of Inappropriate Generalization. Even though the value of the proposed website evaluation methodology can be inferred from the interesting results enumerated above, care must be taken not only when interpreting the results but also when trying to apply the methodology more widely. We have shown that the estimated efficiencies of the different website designs were significantly different. But we have purposely made an effort not to reveal many qualitative details about the nature of the design changes. This is because we do not want the reader to over-generalize and assume that the changes that were made to the website led to increased customer efficiency and reduced website design efficiency. In other words, one should not, given the above results and insights, assume that a similar change to a different website would lead to similar results in terms of website efficiency changes. Indeed, we acknowledge and emphasize the fact that many factors come into play in terms of e-commerce website effectiveness and that many of these factors may be site-specific. We are not interested in uncovering universal design guidelines that may be applied to any setting (e.g., identifying the optimal organization of product hierarchies in an e-commerce website). Instead, the focus of our methodology is to provide managers at e-commerce firms with useful feedback concerning how their customers are performing in the presence of their website designs.

As we briefly described in the Introduction of this article, our proposed evaluation methodology is intended to be used as a tool for the continuous management of website quality. The rationale is similar in spirit to an important research stream in software engineering economics, where metrics for evaluating software development and maintenance productivity have been developed as a vehicle for managing and maximizing the value of software development projects (e.g., Banker, Datar, and Kemerer, 1991; Banker and Kauffman, 1991; Banker and Slaughter, 1997; Chidamber, Darcy, and Kemerer, 1998). Likewise, our proposed evaluation methodology is intended for use within a firm for managing its website development initiatives.

Uncovering insights related to specific design features can also be accommodated, given that our model of online shopping as a production process is very general. Even though we have only employed measures of input ($x_{kj}$) and output ($y_{lj}$) in our empirical investigation, other environmental variables ($s_{ij}$) may be included in the production function. For example, in order to estimate the impact of a particular design feature (e.g., a new personalization engine), the presence (or absence) of that feature may be included as an environmental variable in the frontier estimation. The resulting coefficient from the frontier estimation will represent the impact of that design feature on website efficiency. We are currently collecting data related to the design qualities of the online grocer's website in order to estimate the impacts of various design features on website efficiency.
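One common way to operationalize this idea, sketched below, is a second-stage regression of the estimated efficiency scores on a design-feature dummy. The two-stage formulation and the variable names are our illustrative assumptions, since the text above leaves the exact estimator open.

```python
import numpy as np

def feature_impact(h, feature_present):
    """h: DEA efficiency scores in (0, 1]; feature_present: 0/1 per DMU
    indicating whether the design feature was active. Returns the OLS
    coefficient of the feature dummy on log-efficiency."""
    y = np.log(h)                            # log keeps the effect multiplicative
    X = np.column_stack([np.ones_like(y), np.asarray(feature_present, float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]                           # estimated impact of the feature
```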
The Limitations of the Production Paradigm. Another concern related to broader application of our methodology relates to the appropriateness of the production paradigm in modeling online shopping, since our methodology is theoretically grounded in production economics. First, when applying our website evaluation methodology, one must be confident that the production model is an appropriate framework for effectiveness. In other words, the online shopping process that is being analyzed must be consistent with the theoretical perspective of production that we discussed earlier in this article. For example, in selecting input and output variables for frontier estimation, care must be taken so that they properly define the production possibility set.

Second, our assertion that efficiency is an appropriate measure of performance also should be justified. Shopping efficiency is meaningful and important in the online grocery shopping domain that we are investigating in this paper. In fact, the target market for the grocery shopping website in our study is the time-pinched customer who seeks convenience in her grocery shopping activities. Consequently, one of the key operational goals for website design set forth by our research site is to have first-time customers be able to check out within an hour, and to have their subsequent transaction sessions reduced to thirty minutes. Hence, we see that website efficiency is indeed a major focus in the current context.

Goal-Directed Versus Experiential Shopping. It may, however, not be as clear whether efficiency is also an appropriate metric for other e-commerce websites where shopping for enjoyment may be common. The consumer behavior literature in marketing identifies two distinct motivations for purchasing: goal-directed (or utilitarian) versus experiential (or hedonic). The two different motivations for shopping bring about considerable differences in consumer behavior (Babin, Darden, and Griffin, 1994). Goal-directed shopping, which typically occurs with a near-immediate purchase horizon, entails a highly focused information search process whereby consumers seek information specific to the products in their consideration set so that it can be used in purchase decision-making. The focus of goal-directed shopping is thus on the efficiency of the purchasing process. On the other hand, experiential shopping tends to focus on the recreational or hedonic motives of the consumer, and entails ongoing search without a specific purchase horizon (Bloch, Sherrell, and Ridgway, 1986). Wolfinbarger and Gilly (2001) argue that goal-directed shopping will be more prevalent in online contexts than experiential shopping. The rationale behind their argument is that time-strapped consumers are more likely to adopt online channels to minimize the costs associated with going to physical stores (Bellman, Lohse, and Johnson, 1999).

"Efficiency" Metrics for Internet-Based Shopping. Taken together, these theoretical arguments seem to support the claim that efficiency is indeed an important metric in the context of online shopping. However, whether our "efficiency orientation" holds for other product types (e.g., apparel) or shopping motivations (e.g., buying a gift) remains an empirical question that needs to be validated. In addition, employing efficiency as the dependent variable may raise questions in the minds of marketing managers at e-commerce websites.
Does website efficiency lead to greater business value? As an illustrative example, we know that at physical grocery stores the overall layout of the shopping aisles is not designed to maximize efficiency. Milk and other everyday-use dairy products tend to be located at the back of the store, so that shoppers who only want to quickly grab a carton of milk still need to pass through the whole store. This often leads to impulse purchases of other items that were not on the shopper's list. Interestingly, when the retailer has some intent to manipulate consumer behavior in this way, it seems that the business value of the store layout ought to be inversely related to its efficiency from the customer's point of view. We remind the reader that prior research on customer efficiency (e.g., Xue and Harker, 2002) suggests that efficiency consists of transaction efficiency and value efficiency, and that such efficiencies create business value through cost savings and increased sales volume. However, whether website efficiency is also positively related to business value is still an open empirical question.

Future Research

Several directions for extending the current work are apparent. Given that we may observe recurring transactions by customers over time, the deterministic frontier analysis that we employ needs to be extended to stochastic frontier analysis (Aigner, Lovell, and Schmidt, 1977; Banker, 1993); a sketch of this extension appears at the end of this section. Since each customer will have multiple measures of efficiency over time, we need to take into consideration the measurement and random error components of the efficiency ratings. This will require a larger sample with many more recurring transactions than the limited sample we used for illustration purposes in this paper. We are currently in the process of collecting a longitudinal sample of web log data for this purpose.

We also should point out that our empirical analyses did not include any environmental variables that may affect the production process, even though our general model provides the possibility of including such factors. Our current data collection effort includes the collection of measures for various environmental variables (e.g., customer experience) so that the impact of these additional factors may also be estimated.

Another direction for further development comes from the fact that we have not yet fully characterized the qualities of the different website designs in the present analysis. We represented the two different website designs as "black box" production environments, instead of richly portraying the differences between them in our models. We are currently working to represent the qualities of the website designs more faithfully, so that these design variables can also be incorporated in the production model. This will enable managers to assess the impact of different design variables on website efficiency.

Finally, in order to provide a stronger rationale for the importance of the efficiency perspective, we need to empirically validate whether website efficiency does in fact have a positive impact on business value. Our current research project also includes additional data collection efforts in order to link our proposed efficiency metrics with business value.
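As flagged above, a minimal sketch of the stochastic-frontier extension follows, using the half-normal composed-error specification of Aigner, Lovell, and Schmidt (1977). The parameterization, optimizer, and variable names are our own assumptions, intended only to indicate the shape of the estimation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def sfa_half_normal(X, y):
    """Production frontier ln y = X beta + v - u, with v ~ N(0, sigma_v^2)
    and u ~ |N(0, sigma_u^2)|. X: (n, k) regressors incl. a constant;
    y: (n,) log outputs. Returns (beta, sigma_v, sigma_u)."""
    n, k = X.shape

    def negloglik(params):
        beta = params[:k]
        sv, su = np.exp(params[k]), np.exp(params[k + 1])  # keep sigmas > 0
        sigma = np.hypot(sv, su)
        lam = su / sv
        eps = y - X @ beta                   # composed error v - u
        ll = (np.log(2.0 / sigma) + norm.logpdf(eps / sigma)
              + norm.logcdf(-eps * lam / sigma))
        return -ll.sum()

    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)          # OLS starting values
    res = minimize(negloglik, np.r_[beta0, 0.0, 0.0], method="Nelder-Mead")
    return res.x[:k], np.exp(res.x[k]), np.exp(res.x[k + 1])
```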
REFERENCES

Agrawal, V., Arjona, L. D. and Lemmens, R. (2001). "E-Performance: The Path to Rational Exuberance". The McKinsey Quarterly, 2001(1), pp. 31-43.
Aigner, D. J. and Chu, S. F. (1968). "On Estimating the Industry Production Function". American Economic Review, 58(4), pp. 826-839.
Aigner, D. J., Lovell, C. A. K. and Schmidt, P. (1977). "Formulation and Estimation of Stochastic Frontier Production Function Models". Journal of Econometrics, 6(1), pp. 21-37.
Anderson, L. (2002). "In Search of the Perfect Web Site". Smart Business, March 2002, pp. 60-64.
Babin, B. J., Darden, W. R. and Griffin, M. (1994). "Work and/or Fun: Measuring Hedonic and Utilitarian Shopping Value". Journal of Consumer Research, 20(4), pp. 644-656.
Banker, R. D. (1993). "Maximum Likelihood, Consistency and Data Envelopment Analysis: A Statistical Foundation". Management Science, 39(10), pp. 1265-1273.
Banker, R. D., Charnes, A. and Cooper, W. W. (1984). "Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis". Management Science, 30(9), pp. 1078-1092.
Banker, R. D., Datar, S. M. and Kemerer, C. F. (1991). "A Model to Evaluate Variables Impacting the Productivity of Software Maintenance Projects". Management Science, 37(1), pp. 1-18.
Banker, R. D. and Kauffman, R. J. (1991). "Reuse and Productivity: An Empirical Study of Integrated Computer-Aided Software Engineering (ICASE) at the First Boston Corporation". MIS Quarterly, 15(3), pp. 374-401.
Banker, R. D. and Slaughter, S. A. (1997). "A Field Study of Scale Economies in Software Maintenance". Management Science, 43(12), pp. 1709-1725.
Bellman, S., Lohse, G. L. and Johnson, E. J. (1999). "Predictors of Online Buying Behavior". Communications of the ACM, 42(12), pp. 32-38.
Benbasat, I., Dexter, A. S. and Todd, P. A. (1986). "An Experimental Program Investigating Color-Enhanced and Graphical Information Presentation: An Integration of the Findings". Communications of the ACM, 29(11), pp. 1094-1105.
Berry, M. J. A. and Linoff, G. (1997). Data Mining Techniques for Marketing, Sales, and Customer Support. New York, NY: John Wiley and Sons.
Bevan, N., Barnum, C., Cockton, G., Nielsen, J., Spool, J. M. and Wixon, D. (2003). "Panel: The 'Magic Number 5': Is It Enough for Web Testing?" Proceedings of the 2003 ACM Conference on Human Factors in Computing Systems (CHI 2003), Ft. Lauderdale, FL, ACM Press, New York, NY.
Bloch, P. H., Sherrell, D. L. and Ridgway, N. M. (1986). "Consumer Search: An Extended Framework". Journal of Consumer Research, 13(1), pp. 119-126.
Brockett, P. L. and Golany, B. (1996). "Using Rank Statistics for Determining Programmatic Efficiency Differences in Data Envelopment Analysis". Management Science, 42(3), pp. 466-472.
Brooks, R. (1999, January 7). "Alienating Customers Isn't Always a Bad Idea, Many Firms Discover". Wall Street Journal, pp. A1, A12.
Charnes, A. and Cooper, W. W. (1980). "Management Science Relations for Evaluation and Management Accountability". Journal of Enterprise Management, 2(2), pp. 160-162.
Charnes, A., Cooper, W. W. and Rhodes, E. (1978). "Measuring Efficiency of Decision-Making Units". European Journal of Operational Research, 2(6), pp. 428-449.
Charnes, A., Cooper, W. W. and Rhodes, E. (1981). "Evaluating Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through". Management Science, 27(6), pp. 668-697.
Chase, R. B. (1978). "Where Does the Customer Fit in a Service Operation?" Harvard Business Review, 56(6), pp. 138-139.
Chase, R. B. and Tansik, D. A. (1984). "The Customer Contact Model for Organization Design". Management Science, 29(9), pp. 1037-1050.
"Measuring Switching Costs and the Determinants of Customer Retention in Internet-Enabled Businesses: A Study of the Online Brokerage Industry". Information Systems Research, 13(3), pp. 255-274. Chidamber, S. R., Darcy, D. P. and Kemerer, C. F. (1998). "Managerial Use of Metrics for Object Oriented Software: An Exploratory Analysis". IEEE Transactions on Software Engineering, 24(8), pp. 629-639. Cooley, R., Mobasher, B. and Srivastava, J. (1999). "Data Preparation for Mining World Wide Web Browsing Patterns". Journal of Knowledge and Information Systems, 1(1), pp. 5-32. 32 Dalton, J. P., Hagen, P. R. and Drohan, H. (2001). The Cost of Selling Online (Forrester Research Report). Cambridge, MA: Forrester Research Inc., July 2001. Frei, F. X. and Harker, P. T. (1999). "Measuring the Efficiency of Service Delivery Processes: An Application to Retail Banking". Journal of Service Research, 1(4), pp. 300-312. Goodhue, D. L. and Thompson, R. L. (1995). "Task-Technology Fit and Individual Performance". MIS Quarterly, 19(2), pp. 213-236. Hahn, J., Kauffman, R. J. and Park, J. (2002). "Designing for ROI: Toward a Value-Driven Discipline for E-Commerce Systems Design". Proceedings of the 35th Hawaii International Conference on System Sciences (HICSS 35), Big Island, HI, January 7-10, IEEE Computer Society Press, Los Alamitos, CA. Ivory, M. Y. and Hearst, M. A. (2001). "The State of the Art in Automating Usability Evaluation of User Interfaces". ACM Computing Surveys, 33(4), pp. 470-516. Kriebel, C. H. and Raviv, A. (1980). "An Economics Approach to Modeling the Productivity of Computer Systems". Management Science, 26(3), pp. 297-311. Lovelock, C. H. and Young, R. F. (1979). "Look to Consumers to Increase Productivity". Harvard Business Review, 57(3), pp. 168-178. Meuter, M. L., Ostrom, A. L., Roundtree, R. I. and Bitner, M. J. (2000). "Self-Service Technologies: Understanding Customer Satisfaction with Technology Based Service Encounters". Journal of Marketing, 64(3), pp. 50-64. Mills, P. K. and Morris, J. H. (1986). "Clients as "Partial" Employees of Service Organizations: Role Development in Client Participation". Academy of Management Review, 11(4), pp. 726-735. Nielsen, J. (1994). Top 10 Heuristics for Usability, [the Internet]. UseIT: Jakob Nielsen's Website, Nielsen/Norman Group, Fremont, CA. available: www.useit.com/papers/heuristic/heuristic_list.html. Nielsen, J. and Mack, R. L. (Eds.). (1994). Usability Inspection Methods. New York, NY: John Wiley and Sons. Pirolli, P. L. T. and Card, S. K. (1999). "Information Foraging". Psychological Review, 106(4), pp. 643675. Rajgopal, S., Venkatachalam, M. and Kotha, S. (2001). Does the Quality of Online Customer Experience Create a Sustainable Competitive Advantage for E-commerce Firms? (Working Paper). Seattle, WA: School of Business Administration, University of Washington, April 2001. Rehman, A. (2000). Holiday 2000 E-Commerce: Avoiding $14 Billion in "Silent Losses" (Research Report). New York, NY: Creative Good, October 2000. Rizzuti, K. and Dickinson, J. (2000). Satisfying the Experienced On-Line Shopper: Global E-Shopping Survey (Research Report). London: A.T. Kearney 2000. Schubert, P. and Selz, D. (1999). "Web Assessment: Measuring the Effectiveness of Electronic Commerce Sites Going Beyond Traditional Marketing Paradigms". Proceedings of the 32nd Hawaii International Conference on System Sciences (HICSS 32), Maui, HI, January 5-8, Los Alamitos, CA: IEEE Computer Society Press. Souza, R., Manning, H., Sonderegger, P., Roshan, S. and Dorsey. 
Souza, R., Manning, H., Sonderegger, P., Roshan, S. and Dorsey. (2001). Get ROI From Design (Forrester Research Report). Cambridge, MA: Forrester Research Inc., June 2001.
Spool, J. M., Scanlon, T., Schroeder, W., Snyder, C. and DeAngelo, T. (1999). Web Site Usability: A Designer's Guide. San Francisco, CA: Morgan Kaufmann Publishers.
Spool, J. M. and Schroeder, W. (2001). "Testing Web Sites: Five Users Is Nowhere Near Enough". Proceedings of the 2001 ACM Conference on Human Factors in Computing Systems (CHI 2001, Extended Abstracts), Seattle, WA, March 31-April 5, ACM Press, pp. 285-286.
Srivastava, J., Cooley, R., Deshpande, M. and Tan, P.-N. (2000). "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data". SIGKDD Explorations, 1(2), pp. 12-23.
Straub, D. W., Hoffman, D. L., Weber, B. W. and Steinfield, C. (2002a). "Measuring e-Commerce in Net-Enabled Organizations: An Introduction to the Special Issue". Information Systems Research, 13(2), pp. 115-124.
Straub, D. W., Hoffman, D. L., Weber, B. W. and Steinfield, C. (2002b). "Toward New Metrics for Net-Enhanced Organizations". Information Systems Research, 13(3), pp. 227-238.
Straub, D. W. and Watson, R. T. (2001). "Transformational Issues in Researching IS and Net-Enabled Organizations". Information Systems Research, 12(4), pp. 337-345.
Underhill, P. (1999). Why We Buy: The Science of Shopping. New York, NY: Touchstone.
Varian, H. R. (1992). Microeconomic Analysis (3rd ed.). New York, NY: Norton.
Varianini, V. and Vaturi, D. (2000). "Marketing Lessons from E-Failures". The McKinsey Quarterly, 2000(4), pp. 86-97.
Vessey, I. (1991). "Cognitive Fit: A Theory-Based Analysis of Graphs Versus Tables Literature". Decision Sciences, 22(2), pp. 219-240.
Wallach, S. L. (2001, July 9). "Usability Improvements Payoff for Web Site". ITworld.com. Available: www.itworld.com/nl/sup_mgr/07092001.
Walley, P. and Amin, V. (1994). "Automation in a Customer Contact Environment". International Journal of Operations and Production Management, 14(5), pp. 86-100.
Wolfinbarger, M. and Gilly, M. C. (2001). "Shopping Online for Freedom, Control and Fun". California Management Review, 43(3), pp. 34-55.
Xue, M. and Harker, P. T. (2002). "Customer Efficiency: Concept and Its Impact on E-Business Management". Journal of Service Research, 4(4), pp. 253-267.
Zeithaml, V. A., Parasuraman, A. and Berry, L. L. (1990). Delivering Quality Service: Balancing Customer Perceptions and Expectations. New York, NY: Free Press.
Zeithaml, V. A., Rust, R. T. and Lemon, K. N. (2001). "The Customer Pyramid: Creating and Serving Profitable Customers". California Management Review, 43(4), pp. 118-142.
Zona Research. (1999). Shop Until You Drop? A Glimpse into Internet Shopping Success (Zona Assessment Paper). Redwood City, CA: Zona Research, 1999.