Time Changes Everything, Even Our Coefficient Estimates: An Examination and Application of Time-Varying Coefficients in E-Commerce Research

By: Eric Overby and Benn Konsynski, Goizueta Business School, Emory University

Academic research has provided significant insight into e-commerce phenomena, including why Internet users buy, how firms set prices, and which products are best suited for exchange. However, the empirical relationships uncovered by researchers are likely to be quite dynamic as the electronic marketplace continues to evolve. New technologies, new legislation, and an increasingly experienced user population are some of the reasons that relationships among variables in e-commerce research vary over time. For example, the influence of variables such as product diagnosticity or seller trustworthiness on the price of an e-commerce transaction is likely to change over time, as individuals and firms become more experienced with e-commerce and/or as new business models and legislation provide greater assurance for online trading.

This paper examines several of the statistical methods available for testing whether e-commerce relationships change over time, in other words, whether coefficients are time-varying. We describe how the structure of e-commerce data sets creates challenges when investigating time-based effects, and we illustrate the methods using data from the wholesale automotive market. Thus, we address both methodological and substantive issues.

E-commerce researchers often have data sets containing observations that span time, permitting an examination of how relationships among variables evolve. However, the structure of these data sets often precludes the use of some of the more popular statistical methods for investigating time-based effects. In particular, an entire set of statistical methods designed for panel and classical time series data is often ill suited for analyzing the types of data used by e-commerce researchers.
This is because classical time series or panel data sets are composed of multiple observations of the same units at different times, whereas many e-commerce data sets are composed of multiple observations of different units at different times. Table 1 illustrates this distinction. Consider an e-commerce transaction data set, such as one scraped from eBay. E-commerce researchers often focus on the individual user as the unit of analysis in these data sets in order to analyze that user's behavior. For example, an e-commerce researcher might be interested in how informational features such as seller reputation score affect a user's willingness to pay. Because it is uncommon for the same user to appear multiple times in the same data set, the data typically cannot be represented as a time series or as a panel. This means that e-commerce transaction data sets often look more like the right-hand side of Table 1 than the left.

Panel Data / Classical Time Series Data: same units measured at different times, e.g., stock price data, brand sales data, etc.

         T1   T2   T3   T4   T5   T6
Unit 1   x    x    x    x    x    x
Unit 2   x    x    x    x    x    x
Unit 3   x    x    x    x    x    x

Data Often Used in E-Commerce Research: different units measured at different times, e.g., e-commerce transaction / clickstream data. (In the right-hand panel, marks appear in only a few of the unit-by-period cells for T1 to T6, rather than in every cell.)

Table 1: Frequent Distinction Between Panel / Time Series Data and E-Commerce Data.

The result is that several statistical models used to investigate time-based effects, such as ARIMA and ARCH models, are often inappropriate in e-commerce research. However, there are other methods that can be used to investigate time-varying coefficients in e-commerce data, including the Chow test, rolling regression, and "parameterizing" the coefficients by modeling them using process functions. We examine each of these methods and illustrate them using data from the wholesale automotive market.
In this market, institutional buyers (e.g., licensed automotive dealers) and institutional sellers (e.g., rental car firms, fleet operators) exchange used vehicles, which are then resold to the consumer public. The data set consists of 13,794 transactions between buyers and sellers in this market, which occurred across a 15-month span from 2003 to 2005. Two variables are of particular interest: 1) whether the buyer purchased a vehicle while physically attending the market facility or while accessing the market remotely via the Internet, referred to as the BUYERACCESS variable, and 2) whether the buyer purchased a vehicle that was physically presented at the market facility or was presented electronically via digital photos and textual information, referred to as the VEHICLEPRESENTATION variable. The coefficients reflecting these variables' effects on price are likely to change over time as market participants become accustomed to the electronic mechanisms, which are relatively new.

The Chow test can be used to determine whether the coefficients in a data set vary at discrete points in time. Rolling regression allows for more continuous modeling of how coefficients vary across time. Parameterizing the coefficients allows a coefficient to be modeled explicitly as a function of time, thereby permitting an investigation of the process by which the coefficient varies over time. This function is referred to as the process function.

Figure 1 illustrates the dynamism over time in the BUYERACCESS coefficient. It suggests that buyers initially discounted what they paid for vehicles when accessing the market via the Internet as opposed to physically attending, but that this effect became insignificant over time, perhaps as buyers became more comfortable with this new market access method.
Interestingly, the VEHICLEPRESENTATION coefficient did not change over time in any meaningful way, suggesting that buyers may adapt more readily to new market access mechanisms than to new product representation mechanisms.

[Figure 1 panels. Chow Test: BUYERACCESS coefficient of -0.03*** for the first 33% of transactions and -0.01 (n/s) for the last 33% of transactions (*** p-value < 0.01). Rolling Regression: regions labeled "Significant 76% of the time" and "Significant < 1% of the time". Parameterizing the Coefficients: process function $\alpha_0 + \alpha_1 t + \alpha_2 t^2$, with $\beta_{\text{BuyerAccess}}(t) = -0.07 + 0.0001t$ plotted against time.]

Figure 1: Different statistical methods for detecting whether the BUYERACCESS coefficient varies over time. (To provide a sense of scale, the mean of the dependent variable is 0.94.)

The paper describes the methods in detail, discussing their advantages and disadvantages and illustrating their application in the empirical context. Several questions arise when using these methods, such as how to specify data regimes for the Chow test, how to set the window size for rolling regression, and how to specify the process function when parameterizing a coefficient. When these issues are handled appropriately, these and related statistical methods can help researchers investigate the dynamism inherent in e-commerce research.
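
The three methods discussed above (the Chow test, rolling regression, and a parameterized coefficient) can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the helper names (`ols`, `chow_f`, `rolling_coef`, `parameterized_coef`) are hypothetical, and the two-regime split at the midpoint, the window of 500 observations, and the linear process function are choices made only for the example.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def chow_f(X, y, split):
    """Chow F statistic for a break at row index `split` (rows sorted by time)."""
    n, k = X.shape
    _, rss_pooled = ols(X, y)             # one set of coefficients for all rows
    _, rss_1 = ols(X[:split], y[:split])  # first regime
    _, rss_2 = ols(X[split:], y[split:])  # second regime
    rss_split = rss_1 + rss_2
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

def rolling_coef(X, y, window, col):
    """Coefficient on column `col`, re-estimated on each window of consecutive rows."""
    return np.array([ols(X[i:i + window], y[i:i + window])[0][col]
                     for i in range(len(y) - window + 1)])

def parameterized_coef(t, d, y):
    """Fit y = a + (alpha0 + alpha1 * t) * d + e and return (alpha0, alpha1)."""
    # Substituting the process function turns alpha1 into the coefficient
    # on the interaction term d * t.
    X = np.column_stack([np.ones_like(t), d, d * t])
    beta, _ = ols(X, y)
    return beta[1], beta[2]

# Synthetic transactions, already sorted by time (assumed values throughout).
rng = np.random.default_rng(0)
n = 3000
t = np.arange(n, dtype=float)                 # time index of each transaction
d = rng.integers(0, 2, size=n).astype(float)  # stand-in for a 0/1 variable such as BUYERACCESS
true_beta = -0.07 + 0.00002 * t               # assumed linear drift toward zero
y = 0.94 + true_beta * d + rng.normal(0.0, 0.05, size=n)
X = np.column_stack([np.ones(n), d])

chow = chow_f(X, y, split=n // 2)
roll = rolling_coef(X, y, window=500, col=1)
alpha0, alpha1 = parameterized_coef(t, d, y)

print(f"Chow F statistic: {chow:.2f}")
print(f"Rolling coefficient, first vs last window: {roll[0]:.3f} vs {roll[-1]:.3f}")
print(f"Process function estimate: {alpha0:.3f} + {alpha1:.6f} * t")
```

Each helper mirrors one of the open questions raised above: `split` is the regime specification for the Chow test, `window` is the rolling-regression window size, and the columns built in `parameterized_coef` encode the chosen process function. Adding a `d * t**2` column would give a quadratic process function.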