Time Changes Everything, Even Our Coefficient Estimates: An Examination and
Application of Time-Varying Coefficients in E-Commerce Research
By: Eric Overby and Benn Konsynski, Goizueta Business School, Emory University
Academic research has provided significant insight into e-commerce phenomena, including why
Internet users buy, how firms set prices, and which products are best suited for exchange.
However, the empirical relationships uncovered by researchers are likely to be quite dynamic as
the electronic marketplace continues to evolve. New technologies, new legislation, and an ever
more experienced user population are some of the reasons that relationships among variables in
e-commerce research vary over time. For example, the influence of variables such as product
diagnosticity or seller trustworthiness on the price of an e-commerce transaction is likely to
change over time, as individuals and firms become more experienced with e-commerce and/or as
new business models and legislation provide greater assurance for online trading. This paper
examines several of the statistical methods available for testing whether e-commerce
relationships change over time, in other words, whether coefficients are time-varying. We
describe how the structure of e-commerce data sets creates challenges when investigating
time-based effects, and we illustrate the methods using data from the wholesale automotive market.
Thus, we address both methodological and substantive issues.
E-commerce researchers often have data sets containing observations that span time, permitting
an examination of how relationships among variables evolve. However, the structure of these
data sets often precludes the use of some of the more popular statistical methods for investigating
time-based effects. In particular, an entire set of statistical methods designed for panel and
classical time series data are often ill suited for analyzing the types of data used by e-commerce
researchers. This is because classical time series and panel data sets consist of multiple
observations of the same units at different times, whereas many e-commerce data sets
consist of multiple observations of different units at different times. Table 1 illustrates this
distinction. Consider an e-commerce transaction data set, such as one scraped from eBay.
E-commerce researchers often take the individual user as the unit of analysis in such data sets
in order to analyze user behavior. For example, an e-commerce researcher might be interested
in how informational features such as seller reputation score affect a user’s willingness to pay.
Because the same user rarely appears multiple times in the same data set,
the data cannot be represented as a time series or as a panel. This means that e-commerce
transaction data sets often look more like the right-hand side of Table 1 than the left.
Panel Data / Classical Time Series Data (left-hand side): same units measured at different times,
e.g., stock price data, brand sales data, etc.

          T1   T2   T3   T4   T5   T6
Unit 1    x    x    x    x    x    x
Unit 2    x    x    x    x    x    x
Unit 3    x    x    x    x    x    x

Data Often Used in E-Commerce Research (right-hand side): different units measured at different
times, e.g., e-commerce transaction / clickstream data. Here each of Units 1 to 3 is observed at
only some of T1 to T6, so most cells of the unit-by-time grid are empty.

Table 1: Frequent Distinction Between Panel / Time Series Data and E-Commerce Data.
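The distinction in Table 1 can be checked on a data set before choosing a method. The following is a minimal sketch, not part of the paper: the function name `repeat_share` and the two lists of unit identifiers are hypothetical. It reports the share of observations that belong to a unit observed more than once; a value near 1 suggests panel-like data, and a value near 0 suggests the different-units-at-different-times structure.

```python
from collections import Counter

def repeat_share(unit_ids):
    """Share of observations belonging to a unit that is observed more than once."""
    counts = Counter(unit_ids)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(unit_ids)

# Hypothetical unit identifiers, one per observation (illustrative only).
panel_like = ["unit1", "unit2", "unit3"] * 6          # same three units in six periods
ecommerce_like = ["a", "b", "c", "d", "e", "a"]       # nearly every unit appears once

print(repeat_share(panel_like))
print(repeat_share(ecommerce_like))
```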
The result is that several statistical models used to investigate time-based effects, such as
ARIMA and ARCH models, are often inappropriate in e-commerce research. However, there are
other methods that can be used to investigate time-varying coefficients in e-commerce data,
including the Chow test, rolling regression, and “parameterizing” the coefficients by modeling
them using process functions. We examine each of these methods and illustrate them using data
from the wholesale automotive market. In this market, institutional buyers (e.g., licensed
automotive dealers) and institutional sellers (e.g., rental car firms, fleet operators) exchange used
vehicles, which are then resold to the consumer public. The data set consists of 13,794
transactions between buyers and sellers in this market, which occurred across a 15-month span
from 2003 to 2005. Two variables are of particular interest: 1) whether the buyer purchased a
vehicle while physically attending the market facility or while accessing the market remotely via
the Internet (the BUYERACCESS variable), and 2) whether the purchased vehicle
was physically presented at the market facility or was presented electronically via
digital photos and textual information (the VEHICLEPRESENTATION variable).
The coefficients reflecting these variables’ effect on price are likely to change over time, as
market participants become accustomed to the electronic mechanisms, which are relatively new.
The Chow test can be used to determine if the coefficients in a data set vary at discrete points in
time. Rolling regression allows for more continuous modeling of how coefficients vary across
time. Parameterizing the coefficients allows a coefficient to be modeled explicitly as a function
of time, thereby permitting an investigation of the process by which the coefficient varies over
time. This function is referred to as the process function. Figure 1 illustrates the dynamism over
time in the BUYERACCESS coefficient, suggesting that buyers initially discounted what they paid
for vehicles when accessing the market via the Internet as opposed to physically attending, but
that this effect became insignificant over time, perhaps as buyers became more comfortable with
this new market access method. Interestingly, the VEHICLEPRESENTATION coefficient did not
change over time in any meaningful way, suggesting that perhaps buyers can more readily adapt
to new market access mechanisms than to new product representation mechanisms.
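As a rough illustration of the Chow test idea, the sketch below compares a pooled regression with separate regressions for an early and a late regime. It is a sketch under stated assumptions, not the paper's specification: the data are simulated, the model has a single 0/1 regressor standing in for a variable such as BUYERACCESS, and the intercept, slopes, and noise level are arbitrary. The statistic is the standard Chow F, ((RSS_pooled - RSS_split) / k) / (RSS_split / (n1 + n2 - 2k)), with k = 2 parameters per regime.

```python
import random

def ols_rss(x, y):
    """Residual sum of squares from the simple regression y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for equal coefficients in regime 1 and regime 2."""
    rss_pooled = ols_rss(x1 + x2, y1 + y2)          # one set of coefficients
    rss_split = ols_rss(x1, y1) + ols_rss(x2, y2)   # separate coefficients
    dof = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)

def simulate(n, slope, rng):
    """Simulated regime: 0/1 regressor, arbitrary intercept and noise (assumptions)."""
    x = [float(i % 2) for i in range(n)]
    y = [0.95 + slope * xi + rng.gauss(0, 0.01) for xi in x]
    return x, y

rng = random.Random(0)
x_early, y_early = simulate(200, -0.03, rng)   # hypothetical early regime
x_late, y_late = simulate(200, -0.01, rng)     # hypothetical late regime, smaller effect
x_same, y_same = simulate(200, -0.03, rng)     # regime with the same effect as early

f_break = chow_f(x_early, y_early, x_late, y_late)
f_stable = chow_f(x_early, y_early, x_same, y_same)
print(f"F when the coefficient shifts: {f_break:.1f}")
print(f"F when it does not shift:      {f_stable:.1f}")
```

The F statistic would be compared against an F distribution with k and n1 + n2 - 2k degrees of freedom. Where to split the sample into regimes is a choice the analyst must make; the equal split used here is only for illustration.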
Figure 1: Different statistical methods for detecting whether the BUYERACCESS coefficient
varies over time. (To provide a sense of scale, the mean of the dependent variable is 0.94.)
Chow Test panel: coefficient of -0.03*** for the first 33% of transactions and -0.01 (n/s) for the last 33% of transactions (*** p-value < 0.01; n/s = not significant).
Rolling Regression panel: coefficient plotted against time; significant 76% of the time in the early part of the sample and significant < 1% of the time in the late part.
Parameterizing the Coefficients panel: $\beta_{BuyerAccess}(t) = \alpha_0 + \alpha_1 t + \alpha_2 t^2 = -0.07 + 0.0001t$, plotted against time.
The paper describes the methods in detail, discussing their advantages and disadvantages and illustrating
their application in the empirical context. Several questions arise when using these methods,
such as how to specify data regimes for the Chow test, how to set the window size for rolling
regression, and how to specify the process function when parameterizing a coefficient. When
these issues are handled appropriately, these and related statistical methods can help researchers
investigate the dynamism inherent in e-commerce research.
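The window-size and process-function choices can be explored on simulated data. The sketch below is illustrative only and rests on assumptions that are not from the paper: observations are ordered in time, t is rescaled to the interval [0, 1], the regressor is a 0/1 dummy, and the true coefficient follows a hypothetical linear process function beta(t) = -0.04 + 0.03t. Rolling regression re-estimates the slope in overlapping windows, while the parameterized model estimates alpha_0 and alpha_1 directly by adding the interaction x*t as a regressor.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y via Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def rolling_slopes(x, y, window, step):
    """Slope of y on x re-estimated in overlapping windows of time-ordered data."""
    slopes = []
    for start in range(0, len(x) - window + 1, step):
        X = [[1.0, xi] for xi in x[start:start + window]]
        slopes.append(ols(X, y[start:start + window])[1])
    return slopes

# Simulated, time-ordered transactions (all values below are assumptions).
rng = random.Random(1)
n = 600
t = [i / (n - 1) for i in range(n)]              # time rescaled to [0, 1]
x = [float(i % 2) for i in range(n)]             # 0/1 regressor
y = [0.95 + (-0.04 + 0.03 * ti) * xi + rng.gauss(0, 0.01)
     for ti, xi in zip(t, x)]

# Rolling regression: the window and step sizes are tuning choices.
slopes = rolling_slopes(x, y, window=200, step=100)
print([round(s, 3) for s in slopes])

# Parameterized coefficient: y = a + (alpha0 + alpha1 * t) * x + error.
X_param = [[1.0, xi, xi * ti] for xi, ti in zip(x, t)]
a_hat, alpha0_hat, alpha1_hat = ols(X_param, y)
print(round(alpha0_hat, 3), round(alpha1_hat, 3))
```

A smaller window tracks change more closely but gives noisier estimates, and a larger window does the opposite. A quadratic process function could be tried in the same way by adding x*t**2 as a further column.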