

PRODUCT VARIETY, ACROSS-MARKET DEMAND HETEROGENEITY,

AND THE VALUE OF ONLINE RETAIL

Thomas W. Quan

Department of Economics

University of Georgia

Kevin R. Williams

School of Management

Yale University

November 2015

Abstract

Online retail gives consumers access to an astonishing variety of products. However, the value of this variety depends on the extent to which local demand can be captured by local retailers. We quantify the gains from increasing variety with new, rich data for the online shoe industry. Despite observing millions of transactions, many products have zero market share at the local level. We propose a modification to Berry (1994) and Berry, Levinsohn, Pakes (1995) that addresses sparsity in local sales due to sampling. Our results indicate that products face substantial heterogeneity in demand across geographic markets, which greatly mitigates the consumer gains of increasing variety when brick-and-mortar retailers localize assortments. This suggests the large consumer heterogeneity needed to rationalize the long revenue tail in national level data is likely the result of aggregating over local demand, where consumers have correlated preferences.

JEL: C13, L67, L81

Keywords: product variety, demand estimation, errors in shares, long revenue tail

∗ quan@uga.edu

† kevin.williams@yale.edu

We thank Judy Chevalier, Chris Conlon, Amit Gandhi, Phil Haile, Thomas Holmes, Kyoo il Kim, Amil Petrin, Ben Shiller, Catherine Tucker, Joel Waldfogel, and the seminar and conference participants at Dartmouth-Tuck Winter IO, NU, WUSTL, MSU, DOJ, BLS, FCC, UGA, UCLA, NBER, SMU, NUS, Stanford, IIOC, Maryland, Toronto, NYU Stern IO, and QME. We also thank the Minnesota Supercomputing Institute (MSI) for providing computational resources and the online retailer for providing us with data.

1 Introduction

There is widespread recognition that as economies have advanced, consumers have benefited from an increasing access to variety. Several strands of the economics literature have examined the value of new products and increases in variety either theoretically or empirically, e.g. in trade (Krugman 1979), macroeconomics (Romer 1994), and industrial organization (Lancaster 1966, Dixit and Stiglitz 1977). The internet has given consumers access to an astonishing level of variety. Consider shoe retail. A large traditional brick-and-mortar shoe retailer offers at most a few thousand distinct varieties of shoes. However, as we will see, an online retailer may offer over 50,000 distinct varieties. How does such a dramatic increase in variety contribute to welfare?

The central idea of this paper is that gains from online retail depend critically on the extent to which demand varies across geography and on how traditional brick-and-mortar stores respond to local tastes (Waldfogel 2010). For example, the addition of 5,000 different kinds of winter boots online will be of little value to consumers living in Florida just as the addition of 5,000 different kinds of sandals online will be of little consequence to consumers in Alaska. If Alaskan retailers offer a large selection of boots that captures the majority of local demand, only consumers with niche tastes – possibly those who want sandals – will benefit from the variety offered by online retail. Therefore, in order to quantify the gains from variety due to online retail, it is critical to estimate the extent to which demand varies across locations.

We have collected an extremely detailed data set consisting of point-of-sale, product review, and inventory data from a large online retailer. One of the product categories the retailer sells is footwear, and we observe over 13.5 million shoe sales across thousands of products. In addition, we bring in data on local shoe assortments for a few large retail chains. This data provides us with direct evidence that firms are responding to across-market heterogeneity, as product assortments vary significantly across stores within the same chain. Additionally, the assortment similarity across stores within a chain decreases with distance. With our transactions data, we document large differences in demand for specific products across geographic markets. Since prices, product characteristics, and choice sets are the same for all geographic markets, these differences can only be rationalized by differences in local demand. To formally test for differences in demand across markets, we use multinomial tests that overwhelmingly reject the null hypothesis that consumers across markets have the same demand for shoes.

After showing that the data is inconsistent with a model devoid of across-market demand heterogeneity, we turn to estimating the gains from online variety. Our modeling approach closely follows the discrete choice literature with an emphasis on allowing tastes to vary across locations. The importance of flexibly modeling heterogeneity in discrete choice setups has been well documented in the literature, e.g. Berry, Levinsohn, and Pakes (1995), Petrin (2002), Song (2007). The model allows for the fact that, for example, the removal of a popular sandal will be much more costly for markets in Florida than for markets in Alaska.

Employing our data at the level of narrowly defined products and at narrow geographic detail, however, also presents us with an empirical challenge. Despite the fact that we observe millions of transactions, most products have local market shares equal to zero.

For example, even at the annual-state level, 64.26% of products have zero sales. The zeros are problematic for two reasons. First, using standard demand techniques creates a selection bias in the demand estimates (Berry, Linton, and Pakes 2004, Gandhi, Lu, and Shi 2013, Gandhi, Lu, and Shi 2014), leading to incorrect estimates of the gains from increased variety. Second, the zeros suggest a small sample problem. This is particularly problematic for us because if uncorrected, we would overstate the degree of heterogeneity across markets (Ellison and Glaeser 1997) and understate the gains from increasing variety.

For example, suppose we observe the sale of a red shoe in market 1 and the sale of a blue shoe in market 2. If we ignore the fact that this is an extremely small sample, we would conclude that these two markets demand completely different products. Market 1 would not benefit from access to the blue shoe, and market 2 would not benefit from access to the red shoe.
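The small sample point can be illustrated with a short simulation. The sketch below is not the paper's procedure: it assumes every market has identical, uniform demand over products (the function name and parameters are hypothetical) and reports how many product-market cells still show zero sales purely because few purchases are observed.

```python
import numpy as np

def simulated_zero_share(n_products, n_sales, n_markets, seed=0):
    """Share of product-market cells with zero sales under identical demand.

    Assumption for illustration: every market draws n_sales purchases from
    the same uniform distribution over n_products, so any zeros are due to
    sampling alone, not to differences in tastes.
    """
    rng = np.random.default_rng(seed)
    shares = np.full(n_products, 1.0 / n_products)
    # counts has shape (n_markets, n_products)
    counts = rng.multinomial(n_sales, shares, size=n_markets)
    return float((counts == 0).mean())

# Two products and one observed sale per market, as in the red/blue shoe
# example: half of all cells are zero even though demand is identical.
few_sales = simulated_zero_share(n_products=2, n_sales=1, n_markets=1000)
# With many sales per market the zeros disappear.
many_sales = simulated_zero_share(n_products=5, n_sales=10_000, n_markets=50)
```

Reading these zeros as true absence of demand would attribute to taste differences what is only sampling noise.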

A contribution of our paper is to propose solutions to these issues in the discrete choice demand framework. Our estimation strategy exploits the structure of the demand model to separate the problem into two parts. At the aggregate level, our approach effectively mimics the standard approach and we are able to pin down the price coefficient and other parameters common across markets. We use the main data problem – local level zeros – to identify across-market demand heterogeneity. Variance in the demand for products predicts the distribution of local sales. For each product, we match the model’s prediction of the product’s proportion of locations with zero sales, holding fixed the total number of local sales, to our local level data. These micro moments (Petrin 2002) account for the small sample problem. Our approach allows us to identify the variance of the product-market level unobservables (random effects) instead of recovering them as fixed effects.

Our results indicate that the demand for specific products varies significantly across markets and we show that accounting for this heterogeneity is necessary for rationalizing the distribution of local sales. Using our model estimates, we first simulate demand when consumers have access to a nationally standardized assortment and find large gains (17.3%), consistent with Brynjolfsson, Hu, and Smith (2003). However, as we have recovered variation in preferences across locations, we conduct a similar counterfactual, allowing each location to receive a tailored assortment. In this counterfactual, we find the consumer gains from increasing variety to be 21.2% lower. Put another way, if local stores cater to the local demand, then the value of online markets is relatively small because the average consumer already has access to the products he or she wants to purchase.

One of our robustness exercises highlights the importance of accounting for the sampling problem. Using the traditional approach, small sample sizes generate significant spurious heterogeneity across markets, resulting in estimated gains from variety equal to zero.


Finally, our results suggest a new interpretation of the long tail phenomenon observed in online retail (Anderson 2004). The prevailing view is that the long tail pattern has emerged because niche products, only available online, better satisfy consumer tastes (Brynjolfsson, Hu, and Smith 2003). In contrast, we show a long tail is created simply by aggregating over local market demand when consumers within a location have correlated preferences. That is, by adding the boot sales in Alaska to the sandal sales in Florida, and so forth, the aggregated (national level) demand heterogeneity increases. While we still find significant gains to consumers from access to the huge online choice set, they are smaller if local retailers tailor to local demand.

1.1 Literature

This paper relates to a large body of literature that has highlighted across-market differences in both supply and demand,¹ and the long tail phenomenon in online retail.² Our work bridges these two literatures, emphasizing the importance of accounting for across-market heterogeneity when analyzing the welfare benefits of online retail and the composition of the long tail.

While discrete choice techniques developed by Berry (1994) and Berry, Levinsohn, and Pakes (1995) allow for flexible substitution patterns reflective of across-market demand heterogeneity, the small sample problem prohibits us from using these standard techniques. Gandhi, Lu, and Shi (2014) propose new methodology to estimate demand without dropping the zeros, but with over 95% local zeros, as we have in our data, the technique is ineffective.³ Our approach relates to Chintagunta and Dube (2005), who use

1 See: Waldfogel (2003, 2004, 2008); Bronnenberg, Dhar, and Dube (2009); Choi and Bell (2011); Bronnenberg, Dube, and Gentzkow (2012).

2 See: Chellappa, Konsynski, Sambamurthy, and Shivendu (2007); Brynjolfsson, Hu, and Rahman (2009); Brynjolfsson, Hu, and Smith (2010); Brynjolfsson, Hu, and Simester (2011); Tan and Netessine (2009). The long tail literature also relates to blockbuster vs. niche products as well as search both empirically (Tucker and Zhang 2011, Hinz, Eckert, and Skiera 2011) and theoretically (Bar-Isaac, Caruana, and Cuñat 2012).

3 The correction results in adjusting all local shares observed to be zero up by the same "optimal" amount. Given the large number of local zeros, this would cause an overstatement in local within-market heterogeneity.


store level market shares to pin down mean utilities augmented with household scanner data to identify the distribution of consumer heterogeneity. Finally, we show our proposed technique provides a justification for the crowding penalty proposed by Ackerberg and Rysman (2005). We show that our estimates of across-market heterogeneity correspond to their crowding term, which they model as the number of retail outlets per product.

The rest of the paper is organized as follows. Section 2 discusses our data and presents preliminary evidence of across-market heterogeneity. In section 3, we present the model and estimation procedure. Results and counterfactuals are in sections 4 and 5, respectively. Section 6 discusses the robustness of our findings, and section 7 concludes the paper.

2 Data

We create several original data sets for this study. The main data set consists of detailed point-of-sale, product review, and inventory data that we collected from a large online retailer. With this data, we observe over $1 billion worth of online shoe transactions between 2012 and 2013. We augment this with a snapshot of shoe availability for a few large brick-and-mortar retailers. We begin by summarizing our data sets (Section 2.1). Then we provide evidence of across-market heterogeneity using the brick-and-mortar assortment data (Section 2.2) and with the online retail data (Section 2.3). Finally, we document the small sample problem in the sales data – in particular, the zeros problem – and discuss aggregation as a means to address the issue (Section 2.4).

2.1 Data Summary

Online Retailer Data

The main data set for this study was collected and compiled with permission from a large online retailer. This online retailer sells a wide variety of product categories, including footwear, which will be the focus of our analysis. Each transaction in the point-of-sale (POS) database contains the timestamp of the sale, the 5-digit shipping zip code, price paid, and a wealth of information about the shoe, including model and style information. The transaction identifier allows us to see if a customer purchased more than a single pair of shoes. Finally, we download a picture of each shoe and image process it to create color covariates.

We merge in product review and inventory data. The review data contain the time series of reviews and ratings for each shoe. In the inventory data, we track daily inventory for every shoe.⁴ Importantly, this data allows us to infer the complete set of shoes in the consumer’s choice set, even when the sale of a particular shoe is not observed (Conlon and Mortimer 2013). While the inventory data is size specific, the sales data does not include size. We concede that this, in general, will cause us to understate the gains of online variety because consumers with unusual foot sizes may greatly benefit from online shopping if traditional retailers do not typically stock unusual sizes.⁵

We observe over 13.5 million shoe transactions during the collection period, with two-thirds of transactions being women’s shoes. The price of shoes varies substantially both across gender and within gender – for example, dress shoes tend to be more expensive than walking shoes. The distribution of transaction size per order is heavily skewed to the left. Only a small fraction of orders contain several pairs of shoes. Additionally, of the transactions containing multiple purchases, less than a quarter contain the same shoe, suggesting concern over resellers is negligible in our data set. This also implies there are few consumers buying multiple sizes of the same shoe in a single transaction. Overall, we believe this supports our decision to model consumers as solving a discrete choice problem.

We observe over 580,000 reviews of products and record the consumer response to a

4 Initially this data was not collected daily, but for the last seven months of data collection, each shoe inventory was tracked daily.

5 However, one store manager we spoke to indicated his retailer sets assortments based not only on styles, but also on sizes. With our brick-and-mortar data, we can test for this. Using our Macy’s data, we reject the null hypothesis that the mean assortment shoe size is constant across stores.


few questions regarding the fit and look of the product. The metrics we include in the demand system are ratings for comfort, look, and overall appeal, where 1 is the lowest rating, and 5 is the highest rating. The reviews are heavily skewed towards favorable ratings.

An important feature of the data is the number of products the online retailer offers. The average daily assortment size is over 50,000 products, and over the span of data collection, over 100,000 varieties of shoes were offered for sale. This suggests there is significant turnover in the choice set, with some products being offered over the entire sample and others appearing for brief periods of time.

Brick-And-Mortar Data

In addition to the retail data, we collect a snapshot of shoe availability from Macy’s and Payless ShoeSource during August and September of 2014. We first collected all the shoe SKUs each retailer sold, and then for each SKU, we used the firm’s "check in store" web feature to see if the product was currently available at each location. The firms’ websites do not list how many shoes are in stock, just whether a shoe is in stock or not. Since each query was for a specific shoe size, we then aggregate across all sizes to have a measure of product availability consistent with our product definition. This also corrects for the possibility that a particular size may be temporarily sold out at a given store. We cannot merge this brick-and-mortar data with our online sales data as the collection periods do not overlap and the firms utilize different product identifiers.

Table 1 presents summary information on the assortments of 649 Macy’s locations and 3,141 Payless locations. In September 2014, we observe 7,844 different styles available at Macys.com, of which about 35% are online exclusives. At Payless.com, we observe 1,430 distinct styles, with about 19% being online exclusives. Average in-store assortment sizes are similar across retail chains – 624.9 and 513.0 for Macy’s and Payless, respectively. However, there is greater variance in Macy’s store size. Figure 1 highlights these differences in the form of histograms of the assortment sizes at Macy’s and Payless locations. Unsurprisingly, we find that the stores with larger assortments tend to be located around larger population centers.

2.2 Localization of Brick-And-Mortar Retailers

The premise of this paper is that there may exist significant differences in consumer demand across geographic markets. This has significant implications for the value of increased variety if traditional brick-and-mortar retailers localize assortments. For large national retailers, there are trade-offs to localizing assortments. On the one hand, catering to local demand may greatly increase revenues, but on the other hand, there are cost advantages from economies of scale through standardization. In this subsection, we present evidence from the product assortment choices of Macy’s and Payless. While there could be supply-side reasons for localized assortments, such as different transportation costs across brands, we find this to be an unlikely rationale since the vast majority of products are produced overseas. We interpret the facts below as evidence of firms responding to demand.

We want to measure how assortments vary by store. Figure 2 graphs the histogram of the percentage of locations carrying a shoe style for Macy’s and Payless. For example, if all shoes were available at all stores, the density would collapse at 1 (100%). This analysis excludes online-only shoes, and products are aggregated over sizes.⁶ For Macy’s, the density is concentrated primarily to the left, meaning the vast majority of products are sold at only a few stores. The Payless distribution is more bimodal – at a few stores and at almost all stores. In recent years, Macy’s has made a concerted effort to better localize its product assortments through a program called "My Macy’s."⁷ The strikingly

6 A product is stocked at a store if any size of that product is in stock.

7 "We continued to refine and improve the My Macy’s process for localizing merchandise assortments by store location, as well as to maximize the effectiveness and efficiency of the extraordinary talent in our My Macy’s field and central organization. We have re-doubled the emphasis on precision in merchandise size, fit, fabric weight, style and color preferences by store, market and climate zone. In addition, we are better


low prevalence of products across stores is likely reflective of this program. Payless, on the other hand, produces and partners with other brands to provide exclusive products for its retail chain. The bimodal distribution for Payless may be reflective of these partnerships.

Figure 2 shows heterogeneity in assortments but does not control for geography. Next, we want to measure how assortments change moving away from a particular store. To calculate this measure, we take the network of stores and create all possible links. Then for each pair of stores with assortment sets $(A, B)$, we calculate

$$\text{Assortment Overlap} = \frac{\#(A \cap B)}{\min\{\#A, \#B\}}.$$

This measure is bounded between zero and one. We use the minimum cardinality, rather than the cardinality of the union in the denominator, because we want this measure to capture differences in the composition of each store’s inventory, not differences in assortment size. To further isolate differences in variety from differences in assortment size, we directly compare only locations with similar sizes. Figure 3 plots Lowess fitted values of this exercise for Macy’s and Payless as a function of distance between stores A and B. We can see that the assortment overlap has a decreasing relationship with distance, suggesting these retailers localize their product assortments. We also note that as distance approaches zero, assortment similarity does not converge to one. This is likely reflective of a strategy to increase variety within a geographic area, in addition to locations where retailers created separate men’s and women’s stores.
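As a minimal sketch, the overlap measure for a single pair of stores can be computed as follows. The function name and the SKU labels are hypothetical, and the treatment of empty assortments (raising an error) is an assumption the text does not address.

```python
def assortment_overlap(store_a, store_b):
    """Share of the smaller assortment that is also stocked by the other store.

    store_a, store_b: iterables of product identifiers (one per style,
    already aggregated over sizes).
    """
    a, b = set(store_a), set(store_b)
    if not a or not b:
        # Assumption: the measure is undefined for an empty assortment.
        raise ValueError("both stores must stock at least one product")
    return len(a & b) / min(len(a), len(b))

# Hypothetical assortments: two of the smaller store's three styles overlap.
small_store = {"sku1", "sku2", "sku3"}
large_store = {"sku2", "sku3", "sku4", "sku5"}
overlap = assortment_overlap(small_store, large_store)  # 2 / 3
```

Because the denominator is the smaller assortment, a small store whose products are all carried by a larger store has an overlap of one, so the measure reflects composition rather than size.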

2.3 Across-Market Demand Heterogeneity in Online Data

With our online retail data, prices, product characteristics, and choice sets are the same for all markets, suggesting differences in observed local market shares can only be rationalized by differences in local demand.⁸ In Table 2, we present the local and national share of

understanding and serving the specific needs of multicultural consumers who represent an increasingly large proportion of our customers." https://www.macysinc.com/macys/m.o.m.-strategies/default.aspx

8 They could also be driven by sampling, which we address shortly.


revenue generated by the top 500 products ranked within a local market. For example, suppose we defined a market as a combined statistical area plus the remaining parts of the states (CSA + state).⁹ At the CSA + state-month level, we observe 213 local markets over 14 time periods. On average, the top 500 products at this level make up 67.05% of local revenue. If we take the same 500 products and calculate their revenue share at the national level, we find they make up only 7.19% of national revenue. If demand were homogeneous across markets, we would expect the share of revenue accruing to these products to be the same locally and nationally. The extent to which they differ provides evidence that people in different locations demand different products.
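The comparison can be sketched for a single market as follows. This is an illustrative reading of the calculation, not the paper's code: the function name, the record layout `(market, product, revenue)`, and the toy numbers are assumptions.

```python
from collections import defaultdict

def top_k_revenue_shares(sales, market, k):
    """Local and national revenue shares of a market's top-k products.

    sales: iterable of (market, product, revenue) records.
    Returns (local_share, national_share) for the k products with the
    highest revenue in `market`.
    """
    local = defaultdict(float)
    national = defaultdict(float)
    for sale_market, product, revenue in sales:
        national[product] += revenue
        if sale_market == market:
            local[product] += revenue
    # Rank products by revenue within the local market only.
    top = sorted(local, key=local.get, reverse=True)[:k]
    local_share = sum(local[p] for p in top) / sum(local.values())
    national_share = sum(national[p] for p in top) / sum(national.values())
    return local_share, national_share

# Toy data in the spirit of the boots-versus-sandals example.
toy_sales = [
    ("FL", "sandal", 80.0), ("FL", "boot", 20.0),
    ("AK", "boot", 90.0), ("AK", "sandal", 10.0),
]
fl_local, fl_national = top_k_revenue_shares(toy_sales, "FL", k=1)
```

In the toy data the top Florida product accounts for a much larger share of Florida revenue than of national revenue, which is the pattern the table summarizes across markets.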

Table 2 shows that for most definitions of the local market, there are large differences between the local market revenue share and the national revenue share. This suggests that the commonality of popular products is quite small across markets. However, although we chose a small cutoff (500, or 1% of products) to single out popular products, the differences in revenue share may be driven by sampling.¹⁰ To disentangle sampling from heterogeneity, we conduct statistical tests.

We formally test for across-market demand heterogeneity, controlling for local sample size, using multinomial tests comparing local market shares ($s_{\ell j}$) to national market shares ($s_j$). Define $s = \{s_j\}_{j=1}^{J}$ and $s_\ell = \{s_{\ell j}\}_{j=1}^{J}$; then the null hypothesis is $H_0 : s = s_\ell$. The last column in Table 2 presents the rejection rates for various levels of aggregation. We can see that the null hypothesis is overwhelmingly rejected at all levels of aggregation. However, the tests reveal effects coming from both zeros and aggregation. At more disaggregated levels, zeros become more prevalent, reducing the power of the multinomial tests (e.g. zip5 rejection rate < zip3 rejection rate). At the other end of the spectrum, aggregating up to Census Regions greatly obscures heterogeneity across markets leading to a slight

9 There are 165 CSAs, which are composed of adjacent metropolitan and micropolitan statistical areas. We then define states as the portion of a state not contained in a CSA. This adds an additional 48 markets. All of Rhode Island and New Jersey are contained in a CSA.

10 We have conducted this analysis with a cutoff ranging from one to over 50,000 and find intuitive results. For small cutoffs the difference in percent terms is very large but decreases as the cutoff increases between 3,000-5,000.


reduction in rejection rates when compared to the state level (94% vs. 92%).
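A test of this kind for one local market can be sketched as below. The text does not state which test statistic is used, so the Pearson statistic and the Monte Carlo reference distribution here are assumptions, as are the function name and the toy counts; simulating the null keeps the test usable when many expected counts are small.

```python
import numpy as np

def multinomial_mc_test(local_counts, national_shares, n_sim=2000, seed=0):
    """Monte Carlo multinomial test of H0: local shares equal national shares.

    Assumption: Pearson's statistic with a simulated null distribution that
    conditions on the local sample size. Products with a national share of
    zero must be dropped beforehand.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(local_counts, dtype=float)
    p = np.asarray(national_shares, dtype=float)
    if np.any(p <= 0):
        raise ValueError("national shares must be strictly positive")
    p = p / p.sum()
    n = int(counts.sum())
    expected = n * p

    def pearson(c):
        return ((c - expected) ** 2 / expected).sum(axis=-1)

    observed_stat = pearson(counts)
    # Draw local sales under the null, holding the local sample size fixed.
    simulated = rng.multinomial(n, p, size=n_sim)
    exceed = np.sum(pearson(simulated) >= observed_stat)
    p_value = (1 + exceed) / (1 + n_sim)
    return float(observed_stat), float(p_value)

national = [0.5, 0.3, 0.2]
stat_same, p_same = multinomial_mc_test([50, 30, 20], national)
stat_diff, p_diff = multinomial_mc_test([5, 15, 80], national)
```

The first market matches the national shares exactly and is not rejected; the second departs sharply from them and is rejected at any conventional level.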

Some differences in demand across markets occur for obvious reasons. Take our earlier example of boots versus sandals. Figure 4 plots the predicted values from a regression of a state’s average annual temperature on the share of state revenue captured by boots and sandals. As expected, boots make up a greater share of revenue in colder states and a smaller share in warmer states. Conversely, the opposite relationship holds for sandals.¹¹

Other differences in demand across markets occur for less obvious reasons. In Figure 5, we map the consumption pattern of a popular brand by national revenue. Local revenue share at the 3-digit zip code level is mapped for the eastern United States. While this brand is extremely popular when measured by national sales, we can see a clear preference for this brand in the Northeast. In Florida this brand makes up less than 2.5% of sales, while in parts of New York, New Jersey, and Massachusetts it makes up over 6% of sales. We will exploit this variation to help us identify across-market demand heterogeneity.

2.4 Aggregation and the Zeros Problem

If we define local market shares given our observed sales, most products would have shares equal to zero. Observed zeros are problematic from both a theoretical and empirical point of view. The distributional assumptions in many demand models imply no products will have zero sales, and empirical techniques often require taking the natural log of sales or market shares. The log, of course, is undefined when zero sales are observed, and common solutions to this problem lead to biased demand estimates.¹²

Table 3 shows the severity of the zeros problem in our data. At fine levels of geography, such as defining a market at the zip code-month level, 99.96% of products have zero sales. What is astonishing is that even at highly aggregated levels, such as state-month, 85.25%

11 This also demonstrates that consumers do not shop online just for products that are not available in traditional brick-and-mortar stores. For example, boots – rather than sandals – make up a sizable share of revenue in Alaska.

12 An in-depth discussion of these issues can be found in Berry, Linton, and Pakes (2004); Gandhi, Lu, and Shi (2013); and Gandhi, Lu, and Shi (2014).


of products have zero sales.

The results of Table 2 and Table 3 give important insights for modeling. Table 3 shows that as geography is aggregated, the zeros problem becomes less severe. For example, at the Census Region-month level, only 33.70% of products have zero sales. While aggregation helps with the zeros, Table 2 shows that for this level of aggregation, the revenue share comparison for the top 500 products is 16.36% versus 14.76%. Aggregation has effectively smoothed over all of the heterogeneity in the data we are interested in exploring.
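How aggregation mechanically reduces the share of zeros can be sketched as follows. The function name, the record layout, and the zip-to-state mapping are hypothetical, and the sketch counts zero product-market cells, which is one plausible reading of how the percentages are tabulated.

```python
def zero_share(sales, products, market_of):
    """Fraction of product-market cells with no observed sale.

    sales:     iterable of (location, product) records, one per transaction.
    products:  collection of all products in the choice set.
    market_of: dict mapping each location to its market under the chosen
               level of aggregation (assumed layout for illustration).
    """
    markets = set(market_of.values())
    cells_with_sales = {(market_of[loc], prod) for loc, prod in sales}
    total_cells = len(markets) * len(set(products))
    return 1.0 - len(cells_with_sales) / total_cells

toy_transactions = [("zip_1", "boot"), ("zip_2", "sandal")]
toy_products = ["boot", "sandal"]

# Each zip code is its own market: half of the cells are empty.
by_zip = zero_share(toy_transactions, toy_products,
                    {"zip_1": "zip_1", "zip_2": "zip_2"})
# Both zip codes pooled into one state: no empty cells remain, but the
# difference between the two zip codes is no longer visible.
by_state = zero_share(toy_transactions, toy_products,
                      {"zip_1": "state_A", "zip_2": "state_A"})
```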

Similarly, we could also aggregate over product space. Table 4 shows the percentage of zeros and the local and national revenue shares of the top local products ranked by market for products at the SKU, shoe model, brand-category, and brand levels.¹³ Since aggregating to the brand-category and brand levels greatly reduces the number of products, we adjust the benchmark to the top 10 "products" rather than the top 500. We see a similar result: aggregation helps but does not solve the zeros problem, and heterogeneity across markets is smoothed.

Instead of choosing between decreasing the number of zeros and smoothing over heterogeneity, in the next section we present a model and estimation routine that allows for a significant number of local level zeros (sampling problem), while retaining information on the across-market demand heterogeneity.

3 Model and Estimation

In this section, we present our discrete choice demand model, introduce our proposed estimation technique, and discuss identification. Our modeling approach closely follows the discrete choice literature, taking the form of a nested logit model.¹⁴ We then show how to use local sales to identify across-market demand heterogeneity as a product-market

13 A SKU is a shoe model + style combination. For example, a shoe model could be a particular Nike running shoe. The SKU corresponds to that shoe model with a particular style, such as blue on white.

14 A previous version of this paper included a simple logit version of the model. Results are available upon request.


level random effect.

3.1 Model

Each consumer solves a discrete choice utility maximization problem: consumer $i$ in location $\ell$ will purchase a product $j$ if and only if the utility derived from product $j$ is greater than the utility derived from any other product, $u_{i\ell j} \geq u_{i\ell j'}, \forall j' \in J \cup \{0\}$, where $J$ denotes the choice set of the consumer and 0 denotes the option of not purchasing a product. We suppress the time subscript $t$. For a product $j$, the utility of a consumer $i \in I_\ell$ in location $\ell \in L$ is given by

$$u_{i\ell j} = \delta_j + \nu_{i\ell j},$$

where $\delta_j$ is the mean utility of product $j$ for the (national) population of consumers and $\nu_{i\ell j}$ is a random utility component that is heterogeneous across consumers and locations. The mean utility of product $j$ is linear in product characteristics and can be written as

$$\delta_j = x_j \beta - \alpha p_j + \xi_j,$$

where $x_j$ is a vector of product characteristics, $p_j$ is the price of product $j$, and $\xi_j$ is the unobserved (national) product quality for product $j$. These preferences are constant across locations, and characteristics do not change across locations. The outside good has utility normalized to zero, i.e. $\delta_0 = 0$.

We pursue a nested demand system where products can be grouped into mutually exclusive and exhaustive sets. Let $c$ denote a nest, and note that every product $j$ implicitly belongs to some nest $c$ and the outside good belongs to its own nest. The category of shoe is defined to be the nesting variable (e.g. boots, sandals, sneakers, etc.). We assume the random utility component can be decomposed as

$$\nu_{i\ell j} = \eta_{\ell j} + \zeta_{ic} + (1 - \lambda)\varepsilon_{i\ell j},$$

where $\varepsilon_{i\ell j}$ is drawn i.i.d. from a Type-1 extreme value distribution and, for consumer $i$, $\zeta_{ic}$ is common to all products in the same category and has a distribution that depends on the nesting parameter $\lambda$, $0 \leq \lambda < 1$. $\zeta_{ic} + (1 - \lambda)\varepsilon_{i\ell j}$ is then also Type-1 extreme value distributed, leading to the frequently used nested logit demand model. $\lambda$ determines the within-category correlation of utilities. When $\lambda \to 1$, consumers will only substitute to products within the same group, and when $\lambda = 0$, the model collapses to the simple logit case.

The terms entering $\nu_{i\ell j}$ decompose the heterogeneity in the random utility among consumers into an "across-market" effect, $\eta_{\ell j}$, and a "within-market" effect, $\zeta_{ic} + (1 - \lambda)\varepsilon_{i\ell j}$. When $\eta_{\ell j} = 0$ for all $\ell \in L$, $j \in J$, then the model reduces to a standard "love of variety" nested logit model, where there is no distinction between local and national preferences.

Integrating over the Type-1 extreme value error terms forms location-specific choice probabilities. These choice probabilities are a function of location-specific mean utilities, $\delta_j + \eta_{\ell j}$, as well as the substitution parameter $\lambda$. The mean utilities are inclusive of the unobservables ($\xi_j$, $\eta_{\ell j}$). Similar to Berry (1994), the choice probabilities have the following analytic expression:

$$\pi_{\ell j} = \pi_{\ell c} \cdot \pi_{\ell j/c} = \frac{\left(\sum_{j' \in c} \exp\{(\delta_{j'} + \eta_{\ell j'})/(1-\lambda)\}\right)^{1-\lambda}}{1 + \sum_{c' \in C}\left(\sum_{j' \in c'} \exp\{(\delta_{j'} + \eta_{\ell j'})/(1-\lambda)\}\right)^{1-\lambda}} \cdot \frac{\exp\{(\delta_j + \eta_{\ell j})/(1-\lambda)\}}{\sum_{j' \in c} \exp\{(\delta_{j'} + \eta_{\ell j'})/(1-\lambda)\}}, \tag{3.1}$$

where $\pi_{\ell c}$ is the choice probability of purchasing any product in $c$ and $\pi_{\ell j/c}$ is the choice probability of purchasing product $j$ conditional on choosing category $c$. Aggregating these

14

probabilities across the distribution of locations yields the national choice probability

π j

=

Z

L

π

` j dF

ω =

L

X

ω

`

π

` j

,

` =

1 where dF

ω is the density of location population shares and, in discrete notation,

ω

` is the population share of location

`

.
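As an illustration of Equation 3.1 (not the authors' code), the following minimal Python sketch computes nested logit choice probabilities at one location and aggregates them with population weights. The function names, the dictionary-based data layout, and any inputs passed to it are assumptions made for this example.

```python
import math


def nested_logit_probs(delta, eta, nests, lam):
    """Local choice probabilities for one location, following Equation 3.1.

    delta : dict product -> national mean utility delta_j
    eta   : dict product -> local deviation eta_{lj} (hypothetical inputs)
    nests : dict category -> list of products in that category
    lam   : nesting parameter, 0 <= lam < 1
    Returns (dict product -> pi_{lj}, outside-good probability pi_{l0}).
    """
    # D_c = sum over products in c of exp((delta_j + eta_lj) / (1 - lam))
    D = {c: sum(math.exp((delta[j] + eta[j]) / (1.0 - lam)) for j in js)
         for c, js in nests.items()}
    denom = 1.0 + sum(d ** (1.0 - lam) for d in D.values())
    probs = {}
    for c, js in nests.items():
        p_cat = D[c] ** (1.0 - lam) / denom  # pi_{lc}
        for j in js:
            # pi_{lj/c}: share of product j within its category
            p_within = math.exp((delta[j] + eta[j]) / (1.0 - lam)) / D[c]
            probs[j] = p_cat * p_within
    return probs, 1.0 / denom


def national_probs(delta, eta_by_loc, omega, nests, lam):
    """Population-weighted aggregate: pi_j = sum_l omega_l * pi_{lj}."""
    agg = {j: 0.0 for j in delta}
    for w, eta in zip(omega, eta_by_loc):
        probs, _ = nested_logit_probs(delta, eta, nests, lam)
        for j, p in probs.items():
            agg[j] += w * p
    return agg
```

Setting `lam = 0` in this sketch reproduces simple logit probabilities, which matches the limiting case described in the text.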

We could invert Equation 3.1, as shown in Berry (1994) and Berry, Levinsohn, and Pakes (1995), for each location $\ell$ and proceed with linear instrumental variable methods to obtain estimates of the preference parameters, as well as the product fixed effects $\xi_j$. The local level residuals would then form estimates of $\eta_{\ell j}$. However, the sparsity of individual product sales within locations would lead to attenuation bias. To circumvent this problem, we use a random-effects specification, where $\eta_{\ell j}$ is drawn independently from a normal distribution, $N(0, \sigma_j^2)$. Instead of attempting to recover $\eta_{\ell j}$ directly, we estimate the variance of its distribution, $\sigma_j^2$. We maintain the fixed effects assumption for $\xi_j$ and, since our online retailer sets prices at the national level, we allow for the possibility that prices are correlated with $\xi_j$, but we assume characteristics, including prices, are exogenous with respect to $\eta$.

3.2 Inverting the Market Share

We now show that the inverse of our market shares takes a convenient analytical form, which will simplify the simulation of our local choice probabilities.

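Proposition 1. For any set of $\{\eta_\ell\}_{\ell=1}^{L}$, the market share inversion takes the following analytic form: $\forall j \in J$,

$$\delta_j = (1-\lambda)\left[\log\pi_j - \log\left(\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\}\right)\right]. \tag{3.2}$$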

Proof. In the nested logit case, we will find it convenient to write shares as a fraction of the category share. By Bayes' rule,

$$\pi_j(\eta_\ell;\delta,\lambda) = \Pr\nolimits_\ell\{c\}\cdot\Pr\nolimits_\ell\{j\mid c\} = \pi_{\ell c}\cdot\frac{\exp\{(\delta_j+\eta_{\ell j})/(1-\lambda)\}}{\sum_{j'\in c}\exp\{(\delta_{j'}+\eta_{\ell j'})/(1-\lambda)\}}.$$

Aggregated choice probabilities are then

$$\pi_j = \sum_{\ell=1}^{L}\omega_\ell\,\pi_j(\eta_\ell;\delta,\lambda) = \sum_{\ell=1}^{L}\omega_\ell\,\pi_{\ell c}\,\frac{\exp\{(\delta_j+\eta_{\ell j})/(1-\lambda)\}}{\sum_{j'\in c}\exp\{(\delta_{j'}+\eta_{\ell j'})/(1-\lambda)\}}.$$

Next, define

$$D_{\ell c} = \sum_{j'\in c}\exp\left\{\frac{\delta_{j'}+\eta_{\ell j'}}{1-\lambda}\right\}.$$

We normalize the utility of the outside good, both in terms of product characteristics and the unobserved taste preference across locations. This means the probability of choosing the outside good at location $\ell$ is equal to

$$\pi_{\ell 0} = \frac{1}{1+\sum_{c'\in C} D_{\ell c'}^{1-\lambda}},$$

and note that the probability of choosing a good in category $c$ at location $\ell$ is equal to

$$\pi_{\ell c} = \frac{D_{\ell c}^{1-\lambda}}{1+\sum_{c'\in C} D_{\ell c'}^{1-\lambda}},$$

thus

$$D_{\ell c} = \left(\frac{\pi_{\ell c}}{\pi_{\ell 0}}\right)^{\frac{1}{1-\lambda}}.$$

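Plugging into the aggregate choice probabilities gives

$$\pi_j = \sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\delta_j+\eta_{\ell j}}{1-\lambda}\right\} = \exp\left\{\frac{\delta_j}{1-\lambda}\right\}\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\}.$$

Finally, taking logs we have our inversion,

$$\log\pi_j = \frac{\delta_j}{1-\lambda} + \log\left(\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\}\right),$$

or

$$\delta_j = (1-\lambda)\left[\log\pi_j - \log\left(\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\}\right)\right].$$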

Equation 3.2 relates $\delta_j$ to the aggregated share data, $\pi_j$, local population shares, $\omega_\ell$, local outside good and category shares, $\pi_{\ell 0}$ and $\pi_{\ell c}$, which we assume are known from the data, as well as our across-market effects, $\eta_{\ell j}$, which are unknown. We believe it is reasonable to assume that category shares are well estimated in the data because, for each location, we observe hundreds of sales spread across just 8-10 categories.

Additionally, note that this inversion reduces to the inversion found in Berry (1994) when $\eta_{\ell j} = 0$, $\forall \ell \in L$, $j \in J$.$^{15}$
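If the $\eta_{\ell j}$ were known, the inversion in Equation 3.2 could be checked numerically. The sketch below is an illustrative Python example under that assumption, not the estimation code used in the paper: it simulates a handful of hypothetical locations, builds local and national shares from a chosen $\delta$, and then applies Equation 3.2 to recover $\delta$. All names, parameter values, and the equal population weights are assumptions for the example.

```python
import math
import random


def local_shares(delta, eta, nests, lam):
    """Return (pi_lj, pi_lc, pi_l0) for one location under nested logit."""
    D = {c: sum(math.exp((delta[j] + eta[j]) / (1 - lam)) for j in js)
         for c, js in nests.items()}
    denom = 1 + sum(d ** (1 - lam) for d in D.values())
    pi_c = {c: D[c] ** (1 - lam) / denom for c in nests}
    pi_j = {j: pi_c[c] * math.exp((delta[j] + eta[j]) / (1 - lam)) / D[c]
            for c, js in nests.items() for j in js}
    return pi_j, pi_c, 1 / denom


def invert_delta(j, c, pi_j_nat, omega, pi_c, pi_0, eta, lam):
    """Equation 3.2 for product j in category c, with eta treated as known."""
    s = sum(w * pc[c] * (p0 / pc[c]) ** (1 / (1 - lam))
            * math.exp(e[j] / (1 - lam))
            for w, pc, p0, e in zip(omega, pi_c, pi_0, eta))
    return (1 - lam) * (math.log(pi_j_nat) - math.log(s))


# Hypothetical toy economy: three products, two categories, five locations.
random.seed(0)
lam = 0.5
nests = {"boots": ["a", "b"], "sandals": ["c"]}
delta = {"a": -1.0, "b": -2.0, "c": -1.5}
L = 5
omega = [1 / L] * L
eta = [{j: random.gauss(0, 0.8) for j in delta} for _ in range(L)]

shares = [local_shares(delta, e, nests, lam) for e in eta]
pi_c = [s[1] for s in shares]
pi_0 = [s[2] for s in shares]
pi_nat = {j: sum(w * s[0][j] for w, s in zip(omega, shares)) for j in delta}

# Applying the inversion to the simulated shares should return delta.
recovered = {j: invert_delta(j, c, pi_nat[j], omega, pi_c, pi_0, eta, lam)
             for c, js in nests.items() for j in js}
```

In this toy setting `recovered` agrees with `delta` up to floating point error, because the inversion is an exact algebraic identity once the local category and outside shares are taken as given.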

However, since $\eta_{\ell j}$ is unknown, unlike Berry (1994), we cannot simply recover mean utilities from observables. In the next subsection, we describe how we use our inversion to estimate the demand parameters. While the individual $\eta_{\ell j}$s cannot be consistently estimated due to the small sample problem, we appeal to the law of large numbers in locations to recover the summation term of our inversion, given an estimate of the distribution of $\eta_{\ell j}$. We then introduce a set of micro moments created from information on local level sales to identify their distribution. Finally, we can integrate out this distribution according to our inversion to obtain the mean utilities, $\delta_j$, from the data ($\pi_j$, $\pi_{\ell 0}$, $\pi_{\ell c}$). This allows us to use standard estimation techniques at the aggregate level.

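15 Suppose $\eta_{\ell j} = 0$, $\forall \ell \in L$, $j \in J$; then $\pi_{\ell 0} = \pi_0$ and $\pi_{\ell c} = \pi_c$, and

$$\begin{aligned}
\delta_j &= (1-\lambda)\left[\log\pi_j - \log\left(\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\}\right)\right] \\
&= (1-\lambda)\left[\log\pi_j - \log\left(\sum_{\ell\in L}\omega_\ell\,\pi_c^{-\frac{\lambda}{1-\lambda}}\,\pi_0^{\frac{1}{1-\lambda}}\right)\right] \\
&= (1-\lambda)\log\pi_j - \log\pi_0 + \lambda\log\pi_c \\
&= \log\pi_j - \log\pi_0 - \lambda\log\frac{\pi_j}{\pi_c} \\
&= \log\pi_j - \log\pi_0 - \lambda\log\pi_{j|c}.
\end{aligned}$$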

3.3 Estimation

Suppose we knew, or had an estimate for, $\sigma = \{\sigma_j\}_{j=1}^{J}$. To recover $\delta$, we exploit the structure of the model. By the law of large numbers, with $\eta_{\ell j} \sim N(0, \sigma_j^2)$, the aggregate share $\pi_j = \sum_{\ell=1}^{L}\omega_\ell\,\pi_{\ell j}(\eta_\ell;\delta,\lambda)$ is well approximated by its expectation over the draws of $\eta_\ell$, so long as the number of locations $L$ is sufficiently large. Thus, aggregated choice probabilities only depend on the variance of the across-market heterogeneity, $\sigma$, rather than on the individual effects, $\eta_\ell$, themselves. Therefore, national demand can be expressed as

$$\pi_j = \pi_j(\delta,\lambda;\sigma), \quad j = 1,\ldots,J,$$

which is a system of equations that can, in general, be inverted (Berry, Gandhi, and Haile 2013) to yield

$$\delta_j(\pi,\lambda,\sigma) = x_j\beta - \alpha p_j + \xi_j.$$

Following Berry, Levinsohn, and Pakes (1995), for a fixed $\sigma$, we can use linear instrumental variables $z_j$, such that $E[z_j\xi_j] = 0$ and $E[z_j'(p_j, x_j)]$ has full rank, to identify $(\alpha, \beta)$ as a function of $\sigma$. However, the existing instruments used in the literature$^{16}$ typically provide little to no identifying power for the non-linear parameter $\sigma$ (Gandhi and Houde 2015). Instead, we use the disaggregated information in our data to augment the instrumental variable conditions with an additional set of micro moments that provide direct information on $\sigma$.

16 For example, BLP instruments.


Micro Moments

Let $P0_{\ell j}(\sigma;\delta,\lambda)$ be the probability that a product $j$ has zero sales given the $N_\ell$ consumers observed to purchase a shoe in location $\ell$. We then define

$$P0_j(\sigma;\delta,\lambda) = \frac{1}{L}\sum_{\ell=1}^{L} P0_{\ell j}(\sigma;\delta,\lambda)$$

to be the fraction, or proportion, of markets that the model predicts will have zero sales for product $j$. Observe that this fraction depends on model parameters where we have concentrated out $\delta$ as $\delta(\pi,\lambda,\sigma)$. The empirical analogue is

$$\widehat{P0}_j = \frac{1}{L}\sum_{\ell=1}^{L}\mathbf{1}\{s_{\ell j} = 0\},$$

where $s_{\ell j}$ is the observed location level market share for product $j$. Our micro moment then identifies $\sigma$ by matching the model's prediction to the empirical analogue, i.e.

$$mm(\sigma;\delta,\lambda) = \frac{1}{J}\sum_{j=1}^{J}\left[P0_j(\sigma;\delta,\lambda) - \widehat{P0}_j\right].$$

It is important to point out that $P0$ is just one such micro moment that can be used to estimate across-market demand heterogeneity. Other moments include $P1$, $P2$, etc., as well as the variance in sales across markets. Note that $P0$ remains valid as the number of locations increases. This is because we assume a finite population for a given market, which implies that as $L \to \infty$, a positive proportion of locations may experience zero sales for a given product.$^{17}$

17 In Monte Carlo studies, we have found adding additional micro moments does not greatly affect the estimates. Also, the logit structure implies $P0$ is no longer valid when assuming large $N$ for all locations since then each product will have positive local share. We estimate these models as well.

Estimation Procedure

Having laid the foundation of our methodology, we turn to detailing the computational mechanics of the estimation. The model can be estimated using two-step feasible generalized method of moments (GMM). We start with the implementation of our micro moments.

Note that local level mean utilities can then be written as

$$\delta_{\ell j} = \delta_j + \eta_{\ell j} = \delta_j + \sigma_j\bar\eta_{\ell j},$$

where $\bar\eta_{\ell j}$ is an i.i.d. draw from a standard normal distribution. With the assumptions on the individual level unobservable (Type 1 extreme value), the nested logit structure implies local level product choice probabilities are equal to

$$\pi_{\ell j} = \pi_{\ell c}\cdot\frac{\exp\{(\delta_j+\sigma_j\bar\eta_{\ell j})/(1-\lambda)\}}{\sum_{j'\in c}\exp\{(\delta_{j'}+\sigma_{j'}\bar\eta_{\ell j'})/(1-\lambda)\}}. \tag{3.3}$$

The local level choice probabilities can be used to simulate consumer purchases at each location, holding the number of observed purchases, $N_\ell$, fixed. In particular, using Equation 3.3, the probability a product is observed to have zero sales at location $\ell$ is $P0_{\ell j}(\sigma;\delta,\lambda) = (1-\hat\pi_{\ell j})^{N_\ell}$, i.e. the probability we observe $N_\ell$ sales, none of which are good $j$. Alternatively, as we take local category shares as given, another micro moment is to match the probability of zero sales within category, $P0_{\ell j/c}(\sigma;\delta,\lambda) = (1-\hat\pi_{\ell j/c})^{N_{\ell c}}$, where $N_{\ell c}$ is the number of purchases within category $c$ at location $\ell$. Averaging over locations allows us to match the proportion of locations observing zero sales of $j$. These are our micro moments, $mm(\cdot)$.

This is computationally fast and avoids the problems posed by simulating individual purchase decisions.
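The zero-sales moment described above can be sketched in a few lines of Python. This is an illustrative example only: the function names are hypothetical, and the inputs (`pi_lj`, the probability that a given purchase at a location is product $j$, and `n_l`, the number of observed purchases there) are assumed to have been computed elsewhere, for instance from Equation 3.3.

```python
def predicted_zero_prob(pi_lj, n_l):
    """P0_{lj} = (1 - pi_lj)^{N_l}: none of the N_l purchases is product j."""
    return (1.0 - pi_lj) ** n_l


def predicted_zero_share(pi_by_loc, n_by_loc):
    """P0_j: average of P0_{lj} over locations."""
    vals = [predicted_zero_prob(p, n) for p, n in zip(pi_by_loc, n_by_loc)]
    return sum(vals) / len(vals)


def empirical_zero_share(sales_by_loc):
    """Fraction of locations with zero observed sales of the product."""
    return sum(1 for s in sales_by_loc if s == 0) / len(sales_by_loc)


def micro_moment(predicted, observed):
    """Average gap between predicted and observed zero shares over products."""
    gaps = [p - o for p, o in zip(predicted, observed)]
    return sum(gaps) / len(gaps)
```

Because each term is a closed-form power of a choice probability, no individual purchase decisions need to be simulated, which is the computational advantage the text refers to.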

With a candidate solution of $\sigma$ and $\lambda$, the structure we have placed on the $\eta$s allows us to integrate them out according to the market share inversion in Equation 3.2. This recovers $\delta_j$. However, the inversion can be further simplified by using the moment generating function of the normal distribution, as shown in Proposition 2.

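Proposition 2. Applying the law of large numbers in $L$ and integrating out over $\eta_{\ell j}$ gives

$$\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\} = \sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{1}{2}\frac{\sigma_j^2}{(1-\lambda)^2}\right\}. \tag{3.4}$$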

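Proof. By the strong law of large numbers,

$$\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\} \to \sum_{\ell\in L}E\left[\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{\eta_{\ell j}}{1-\lambda}\right\}\right] \quad\text{as } L \to \infty.$$

Define $\tilde\eta_{\ell j} = \frac{\eta_{\ell j}}{1-\lambda}$ and $\tilde\sigma_j = \frac{\sigma_j}{1-\lambda}$; then

$$\begin{aligned}
\sum_{\ell\in L}E\left[\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\{\tilde\eta_{\ell j}\}\right]
&= \sum_{\ell\in L}\int\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\{\tilde\eta_{\ell j}\}\,f(\tilde\eta_{\ell j}\mid\tilde\sigma_j)\,d\tilde\eta_{\ell j} \\
&= \sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\int\exp\{\tilde\eta_{\ell j}\}\,f(\tilde\eta_{\ell j}\mid\tilde\sigma_j)\,d\tilde\eta_{\ell j} \\
&= \sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}E\left[\exp\{\tilde\eta_{\ell j}\}\right].
\end{aligned}$$

From the moment generating function of the normal distribution, we have that

$$E\left[\exp\{\tilde\eta_{\ell j}t\}\right] = M_x(t) = \exp\left\{\tilde\mu t + \frac{1}{2}\tilde\sigma_j^2 t^2\right\},$$

and for $t = 1$, with mean $\tilde\mu = 0$,

$$E\left[\exp\{\tilde\eta_{\ell j}\}\right] = \exp\left\{\frac{1}{2}\frac{\sigma_j^2}{(1-\lambda)^2}\right\}.$$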
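The moment generating function step can be verified numerically. The Python sketch below is an illustration under the normality assumption stated in the text: it compares the closed form $\exp\{\sigma_j^2 / (2(1-\lambda)^2)\}$ with a simple midpoint-rule approximation of $E[\exp\{\eta_{\ell j}/(1-\lambda)\}]$. The function names, grid size, and integration width are choices made for this example.

```python
import math


def heterogeneity_term(sigma, lam):
    """Closed form from Proposition 2: exp(sigma^2 / (2 (1 - lam)^2))."""
    return math.exp(0.5 * sigma ** 2 / (1.0 - lam) ** 2)


def heterogeneity_term_numeric(sigma, lam, n=20000, width=12.0):
    """Midpoint-rule value of E[exp(eta / (1 - lam))] for eta ~ N(0, sigma^2).

    Requires sigma > 0. The integral is taken over the rescaled variable
    x = eta / (1 - lam), which is normal with standard deviation s.
    """
    s = sigma / (1.0 - lam)
    lo, hi = -width * s, width * s
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        density = math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        total += math.exp(x) * density * h
    return total
```

For moderate values such as `sigma = 0.5` and `lam = 0.5`, the two functions agree to many decimal places, which is what the proof predicts.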


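Combining Equation 3.2 and Equation 3.4, we can rewrite our inversion as

$$\delta_j = (1-\lambda)\left[\log\pi_j - \log\left(\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}\exp\left\{\frac{1}{2}\frac{\sigma_j^2}{(1-\lambda)^2}\right\}\right)\right].$$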

Thus, after specifying $\sigma$ and $\lambda$, mean national level utilities are recovered as

$$\delta_j = x_j\beta - \alpha p_j + \xi_j. \tag{3.5}$$

Hence, we obtain a linear estimating equation where instrumental variable methods can be used to control for price endogeneity. Let $Z$ be the matrix of instruments to identify $\beta, \alpha$. We add $m_1 = E[Z'\xi]$ to the set of included moments.
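To make the combined inversion concrete, here is a small Python sketch that evaluates $\delta_j$ from an aggregate share, a candidate $(\sigma_j, \lambda)$, and local category and outside shares. It is an illustration of the formula obtained by combining Equations 3.2 and 3.4, not the authors' implementation; the function name and argument layout are assumptions.

```python
import math


def invert_delta_re(pi_j, sigma_j, lam, omega, pi_c, pi_0):
    """delta_j implied by the random-effects inversion.

    pi_j    : national share of product j
    sigma_j : candidate standard deviation of eta_{lj}
    lam     : candidate nesting parameter, 0 <= lam < 1
    omega   : list of location population shares
    pi_c    : list of local shares of the category containing j
    pi_0    : list of local outside-good shares
    """
    s = sum(w * pc * (p0 / pc) ** (1.0 / (1.0 - lam))
            for w, pc, p0 in zip(omega, pi_c, pi_0))
    # The normal MGF term does not vary with location, so it factors out.
    s *= math.exp(0.5 * sigma_j ** 2 / (1.0 - lam) ** 2)
    return (1.0 - lam) * (math.log(pi_j) - math.log(s))
```

With `sigma_j = 0` and identical local shares, the function returns the Berry (1994) nested logit inversion, and a positive `sigma_j` lowers the implied $\delta_j$ by $\sigma_j^2 / (2(1-\lambda))$.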

The last complication to address is how to identify the nesting parameter. In the Berry (1994) nested logit inversion, within category shares are also correlated with the unobserved product quality, creating an endogeneity problem. A similar issue arises in our inversion. Note that, with $\delta$ as defined in Equation 3.2,

$$E\left[\frac{\partial\delta_j(\pi,\lambda,\sigma)}{\partial\lambda}\cdot\xi_j\right] \neq 0$$

because $\xi_j$ enters the aggregate product share, $\pi_j$, and the local level category shares, $\pi_{\ell c}$. Berry (1994) solves this problem by employing an instrument, $z_{j|c}$, that is correlated with the within category share, but uncorrelated with the unobserved product quality.$^{18}$ The same instrument can be employed here, since $z_{j|c}$ is correlated with $\partial\delta_j(\pi,\lambda,\sigma)/\partial\lambda$ through the local level category shares, but still uncorrelated with the unobserved product quality. Thus, if $z_{j|c}$ is a valid and relevant instrument when estimating the nested logit model using the Berry (1994) inversion, it is a valid and relevant instrument for our inversion. These moments are $m_2$.

18 For example, a combination of the product characteristics of competing products within the same category or nest.


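Stacking moments, where $\theta = (\sigma, \lambda, \beta, \alpha)$, we have

$$G(\theta;\cdot) = \begin{bmatrix} mm \\ m_1 \\ m_2 \end{bmatrix}$$

and the GMM criterion is $G(\theta;\cdot)^T\,W\,G(\theta;\cdot)$, with weighting matrix $W$. We first take $W = I$ and then use $\hat W = \left[G(\hat\theta^{(1)};\cdot)\,G(\hat\theta^{(1)};\cdot)^T\right]^{-1}$ in the second step. Our final estimates are $\hat\theta^{(2)}$.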

Included in $x$ are product ratings for comfort, look, and overall appeal and fixed effects for color, top brands, and time. We instrument for price using the number of available styles (color combinations) for that product's shoe model. We believe this to be a proxy for the costs to manufacture a product. For example, products that come in many different color combinations are cheaper, mass produced products. At the same time, the number of available colors for a shoe model should not affect the utility derived from consuming a particular color of that shoe model.

Additionally, we include the average overall rating$^{19}$ within a brand's own products and these same instruments by category. That is, let: $B$ denote the set of brands; $J_b$ denote the set of products manufactured by brand $b \in B$; $c_b$ denote the set of shoes manufactured by brand $b \in B$ in category $c \in C$. For each time period, our additional instruments are

$$\sum_{j'\neq j,\; j'\in J_b} x_{j'}, \qquad \sum_{j'\neq j,\; j'\in c_b} x_{j'}.$$

These will aid us in identifying the nesting parameter, $\lambda$.

Finally, we parametrize $\sigma$ as

$$\sigma_j = h(\text{category}_j) = \gamma_c,$$

meaning $mm(\cdot)$ contains $C$ moments.$^{20}$

19 Ratings for overall, look, and comfort are highly collinear. We use only the average rating to increase the precision of the weighting matrix.

3.4 Identification

The variance of our local level random effect, $\eta_{\ell j}$, is identified through differences in local market shares. If we had a large sample of sales and there were no across-market demand heterogeneity, each product's local market shares would be the same in every market and our variance would be zero. However, differences in local market shares may arise due to sampling. Therefore, in our construction of the micro moments, we are careful to account for the number of sales in each market.

For each product, we use the proportion of locations where zero sales occur to form our micro moments. To understand the intuition behind this, consider a world with a single inside good. If demand were homogeneous across markets, at the disaggregated level, we would expect to see similar local shares. In particular, if this good were very popular at the aggregate level, we would expect to observe few, if any, local markets with zero sales.

Instead, suppose we observe wildly different shares across markets with a significant portion of markets having zero sales. This suggests the product faces heterogeneous demand across markets. Assuming a normal distribution, as we do, the variance of this heterogeneity is pinned down by the proportion of observed zeros. If a large number of zeros is observed, this suggests a large number of markets drew low valuations for the good (a low draw of $\eta_{\ell j}$). This suggests a higher variance for the heterogeneity. Conversely, few observed zeros suggests there are few markets with low draws of $\eta_{\ell j}$ and, hence, a lower variance.
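The intuition can be illustrated with a stylized calculation. The Python sketch below assumes a single inside good, a logit choice between that good and the outside option, and a fixed number of consumers per market; these simplifications, the function name, and the parameter values used with it are assumptions for illustration and are not taken from the paper's estimation. It computes the expected share of markets with zero sales, $E[(1-\pi(\eta))^N]$ with $\eta \sim N(0, \sigma^2)$, by midpoint-rule integration.

```python
import math


def expected_zero_share(delta, sigma, n_consumers, n_grid=4001, width=8.0):
    """Expected fraction of markets with zero sales of a single inside good.

    Each of n_consumers buys the good with probability
    pi = exp(delta + eta) / (1 + exp(delta + eta)), eta ~ N(0, sigma^2).
    """
    if sigma == 0.0:
        pi = math.exp(delta) / (1.0 + math.exp(delta))
        return (1.0 - pi) ** n_consumers
    h = 2.0 * width / n_grid
    total = 0.0
    for k in range(n_grid):
        z = -width + (k + 0.5) * h  # grid point for a standard normal draw
        weight = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) * h
        u = delta + sigma * z
        pi = math.exp(u) / (1.0 + math.exp(u))
        total += weight * (1.0 - pi) ** n_consumers
    return total
```

In this stylized setting, for a reasonably popular good (for example `delta = -3.0` with 100 consumers per market), the expected share of zero-sales markets rises as `sigma` increases, which is the channel through which the observed proportion of zeros is informative about the variance.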

Parameters within $\delta_j$ are identified through the standard channels: in the cross-section through variation in aggregate sales given characteristics, $(x_j, p_j)$, and across time periods through time varying characteristics and variation in the choice set $J$. Identification of the nesting parameter, $\lambda$, is driven by changes to the category shares as the utilities of products within the category change or the set of products within the category changes.

20 In addition to category, we have estimated the model using a parametric function of product rank, as well as interacting rank and category fixed effects. The results are similar to what we present here. We have noted more complicated functions, such as polynomials of rank interacted with category information, are too computationally burdensome.

3.5 Commentary on Modeling Approach

The random effects assumption we pursue allows us to estimate across-market heterogeneity given observed local level zeros. This comes at the expense of not being able to recover the exact realizations of the local level unobserved product quality. For example, we are able to estimate that sandals have large variation in demand, but we cannot recover the realization of the unobservable ($\eta_{\ell j}$) at particular locations, such as Alaska and Florida.

Post-estimation, we conduct a number of exercises to calculate the gains from increasing variety. By adding a large number of products, standard demand systems typically predict large gains due to the full support assumption on the individual unobserved demand shock (the $\varepsilon$s). Ackerberg and Rysman (2005) propose introducing a crowding penalty term, $R_j$, to relax restrictions on the unobservable characteristic space of discrete choice models and show that introducing this term reduces bias in the estimated elasticities and welfare. Their adjusted share for nested logit is

$$\pi_j = \frac{\left(\sum_{j'\in c}R_{j'}\exp\{\delta_{j'}/(1-\lambda)\}\right)^{1-\lambda}}{1+\sum_{c'\in C}\left(\sum_{j'\in c'}R_{j'}\exp\{\delta_{j'}/(1-\lambda)\}\right)^{1-\lambda}}\cdot\frac{R_j\exp\{\delta_j/(1-\lambda)\}}{\sum_{j'\in c}R_{j'}\exp\{\delta_{j'}/(1-\lambda)\}}. \tag{3.6}$$

They operationalize the crowding penalty term by making assumptions on the number of retail outlets per product.

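In our application, Equation 3.4 suggests taking

$$R(\sigma_j) = \exp\left\{\frac{1}{2}\frac{\sigma_j^2}{(1-\lambda)^2}\right\}.$$

Since $R(\sigma_j)$ is not indexed by $\ell$, the share equation can be rearranged to yield

$$\pi_j = R(\sigma_j)\exp\left\{\frac{\delta_j}{1-\lambda}\right\}\sum_{\ell\in L}\omega_\ell\,\pi_{\ell c}\left(\frac{\pi_{\ell 0}}{\pi_{\ell c}}\right)^{\frac{1}{1-\lambda}}.$$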

Expanding this equation results in Equation 3.6 found in Ackerberg and Rysman (2005).$^{21}$ That is, our random effect can be interpreted as an application of Ackerberg and Rysman (2005) at the national level, where we motivate and discipline the crowding penalty term through across-market demand heterogeneity. This relationship to Ackerberg and Rysman (2005) suggests our estimates of the increase in consumer welfare due to an increase in the size of the choice set may be smaller than expected because of the incorporation of crowding parameters ($\sigma$).

4 Results

In this section, we discuss our demand estimates and the fit of the model. We restrict our attention to adult shoes and estimate the demand for men’s and women’s shoes separately.

We define our time horizons to be at the monthly level and our geographic locations to be composed of 213 local markets (165 CSAs plus 48 states).$^{22}$ Our market sizes are defined as the adult population for men and women, respectively.

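21 To see this, note that

$$R(\sigma_j)\exp\left\{\frac{\delta_j}{1-\lambda}\right\} = \exp\left\{\frac{\delta_j + (1-\lambda)\log R(\sigma_j)}{1-\lambda}\right\}.$$

Let $\tilde\delta_j = \delta_j + (1-\lambda)\log R(\sigma_j)$. Plugging this into the expanded nested logit share equation gives

$$\pi_j = \frac{\left(\sum_{j'\in c}\exp\{\tilde\delta_{j'}/(1-\lambda)\}\right)^{1-\lambda}}{1+\sum_{c'\in C}\left(\sum_{j'\in c'}\exp\{\tilde\delta_{j'}/(1-\lambda)\}\right)^{1-\lambda}}\cdot\frac{\exp\{\tilde\delta_j/(1-\lambda)\}}{\sum_{j'\in c}\exp\{\tilde\delta_{j'}/(1-\lambda)\}}.$$

Substituting $\tilde\delta_j = \delta_j + (1-\lambda)\log R(\sigma_j)$ back in gives us Equation 3.6.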

22 We find at finer levels of geography, such as zip code, the nearly 100% local zeros cause the micro moments to lose identifying power. We have confirmed this with Monte Carlo exercises, some of which appear in the Appendix. We choose CSA + state, compared to just CSA, since a large percentage of observed sales occur outside CSAs. For example, if we pursued the CSA market definition, we would drop all sales to consumers in Alaska. Results dropping states are available on request.

We compare the estimates of our approach with a number of alternative models. For ease of exposition, we define these approaches now:

Local RE: Location-product level random effect model (our approach)

Local FE: Traditional nested logit model at the local level

National FE: Traditional nested logit model with aggregated (national) data

Market-by-Market: Traditional nested logit model run for each location separately

For all these approaches (except Local RE), we estimate two versions. The first treats observed shares as true shares, which we call empirical shares or "ES." The second approach adjusts aggregate zeros using the correction proposed by Gandhi, Lu, and Shi (2014), which we call "AS." Local RE is only estimated using AS shares. The important distinction is that a model estimated using ES will drop the zeros, whereas AS shares bring the zeros off the bound. A discussion of the correction procedure can be found in Appendix B. For all models estimated, we include fixed effects for top brands, color (top 25), category, and time.

4.1 Demand Parameters Constant Across Markets

We begin by discussing the demand parameters that are constant across locations. A summary of our main demand estimates is presented in Tables 5 and 6 for men's and women's shoes, respectively. Within each table, there are three sets of estimates, corresponding to: (1) Local FE-AS; (2) National FE-AS; and (3) Local RE.

Specification (1), Local FE-AS, illustrates the selection bias generated by the severity of the zeros problem, even when employing adjusted shares. With this model, each observation is a product-location specific share; thus, the number of observations is 213 times greater (number of products times 213 CSA + state markets) than the other specifications.

Unfortunately, at this level of disaggregation, about 95% of the observations have zero sales. Of particular concern for us are the price coefficients, which are an order of magnitude smaller for men and about a third the size for women compared to (2) and (3). We also find the nesting parameter is attenuated for women, at roughly 0.2 compared to 0.65 and 0.45 for (2) and (3), respectively. The attenuation bias implies price elasticities that are much too inelastic. The bottom panels of each table report the mean and standard deviation of the estimated product level price elasticities. For the Local FE-AS model, we obtain elasticities of -0.7 for men, which differs by a factor of about four from the other models. The bias is even larger for women.

Specifications (2) and (3) directly compare the results estimated using the standard approach on national level data and the results estimated using our procedure for the nested logit model. Unsurprisingly, the results for these specifications are similar. Notably, the estimates of the price coefficient and nesting parameter are similar and significant, which yields average product elasticities of (-3.6, -3.0) and (-2.0, -1.5) for men and women, respectively, across specifications.$^{23}$ The reasonably high estimates of the nesting parameter suggest substitution within category is important. While (2) and (3) offer similar estimates, the advantage of utilizing the Local RE model is that it retains information on the distribution of heterogeneity across locations. The importance of this distinction will be highlighted in the following section when we perform counterfactual analyses.

Turning to the coefficients on our review variables, we can see that the overall rating has the expected sign, with higher ratings having positive effects on demand. The estimates are statistically significant. Look and comfort have much smaller effects, with only comfort being statistically significant in the women's specification. The review ratings are highly correlated so it is likely that after controlling for overall appeal, the estimates for look and comfort suffer from collinearity. Meanwhile, our indicator for no reviews takes on positive signs for both men's and women's shoes. This variable largely captures the demand for new products before there has been an opportunity to review them. New products often benefit from additional promotion and advertising, and it is likely that the positive effect of having no review actually reflects the additional promotion, rather than a desire to purchase shoes that have not been reviewed.

23 The empirical literature on shoe demand is limited. Roberts, Xu, Fan, and Zhang (2012) look at imports of Chinese footwear and find elasticities in the range that we find. For the US, their elasticities are smaller; however, their definition of a shoe is broader than our study.

For comparison with (1), Table 8 reports demand estimates using empirical shares at the local and national level (Local FE-ES, National FE-ES). Two patterns emerge. The first is that at the local level, using empirical shares also results in significant attenuation bias, although surprisingly, mean elasticities are larger in magnitude compared to using local adjusted shares. This is likely driven by the sheer number of zeros providing little guidance on how to adjust shares at the local level. Second, we find empirical shares yield mixed results at the national level, where less than 10% of the sample is dropped. For men, we find more inelastic demand with empirical shares, which is consistent with Gandhi, Lu, and Shi (2013). On the other hand, elasticities are again larger in magnitude for women using national empirical shares.

Another approach to retaining local heterogeneity is to have location-specific parameters, which we operationalize by estimating demand market-by-market.$^{24}$ Summary results of these models appear in Table 9. While there is substantial variation in estimates across markets, our general finding is these models perform poorly with both empirical and adjusted shares. For example, the average product elasticity for men's shoes is -0.8 and -1.7 for ES and AS, respectively, which is half the magnitude we obtain under National FE and Local RE. For women, we again find mean elasticities nearly half of what we find under National FE and Local RE.

4.2 Across-Market Heterogeneity

The results in the previous subsection suggest the Local RE model provides comparable estimates to the National FE-AS model. However, the Local RE model also provides estimates of across-market demand heterogeneity to rationalize the distribution of local demand. These parameters are estimated as

$$\sigma_j = h(\text{category}_j) = \gamma_c$$

by matching the proportion of local zeros with the observed percentage of locations that have zero sales. Our estimates for the across-market demand heterogeneity in the Local RE model are presented in Table 7.

24 With 32 and 82 million observations for men and women respectively, we found it too computationally intensive to estimate all markets simultaneously.

For men's shoes, we find five of eight categories with statistically significant estimates of across-market demand heterogeneity: boat, clogs, sandals, slippers, and sneakers. For women's shoes, we find only three of ten (boat, loafers, and sneakers) are statistically significant. The size of the statistically significant across-market demand heterogeneity parameters is quite large. For example, the smallest statistically significant $\sigma_c$ for men's shoes is sandals at 0.462. To put this number in perspective, a one standard deviation increase in a sandal's draw of $\eta_{\ell j}$ is equivalent to a decline in price of $33. Similarly, for women's shoes, loafers have the smallest statistically significant $\sigma_c$ at 0.321, which implies a one standard deviation increase in a loafer's draw of $\eta_{\ell j}$ is equivalent to a decline in price of almost $46. This suggests the demand for these products varies greatly across markets.

To show how the estimates of $\sigma_c$ rationalize the distribution of local sales in the data, Figure 6 plots the proportion of local zero market shares across products in the sneaker category for three scenarios: (i) the observed data, (ii) results of the Local RE model, and (iii) results of the Local RE model for the case of homogeneous demand across markets, i.e. when $\sigma_j = 0$ for all shoes $j \in J$. The left panel is the plot for men's sneakers and the right panel is for women's sneakers. For women's sneakers in particular, we see the Local RE model closely follows the data, which makes sense since the micro moments are used to match local zeros. However, we see the homogeneous demand line systematically understates the degree of zeros. Put another way, the gap between the data line and homogeneous demand line suggests that if demand were homogeneous across markets, we would expect to see far fewer zeros among popular and mid-ranked products.


Figure 6 also suggests that there is a limit to the level of disaggregation the Local RE model can handle. In particular, it becomes increasingly difficult to identify the variance in demand when markets have too few observed sales because any pattern of zeros can be rationalized through the mean aggregate utilities. For example, in the extreme, we could define our local markets as individual consumers. However, we would expect to estimate no "across-market heterogeneity" because it would already be captured by the mean utilities and the logit error.

5 Analysis of the Estimated Model

With our demand estimates, we now conduct a series of counterfactual analyses to quantify the gains from online variety (Section 5.1). We compare our estimated consumer surplus and retail revenue under the large (observed) choice set to the counterfactual surplus obtained under a limited assortment of products (Section 5.1). This mimics a world in which consumers do not have access to online retail. We consider two scenarios: (1) where local assortments are tailored to local demand and (2) where local assortments are standardized, which is analogous to the counterfactuals found in the existing literature. Finally, we revisit the phenomenon of the long tail and show that aggregation of sales over markets with different tastes is a key driver of the long tail in our online retail data (Section 5.2).

5.1 The Gains from Increasing Access to Variety

The objective of our main counterfactual is to quantify the increase in consumer surplus and retail revenue from increasing access to variety in the presence of across-market demand heterogeneity. Mechanically, to compute our counterfactuals, we draw a set of $\eta_{\ell}$'s for each location. Using these taste draws, along with the recovered national mean utilities, products are then ranked in each location by their location-specific market shares. Products with the highest local shares are included in the counterfactual choice set; these products make up the "pre-internet choice set." For each counterfactual choice set, location-level choice probabilities are then recalculated according to Equation 3.3. Using these probabilities, we simulate location-level purchases, which then allows us to compute counterfactual consumer welfare and retail revenue.
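The ranking step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes normally distributed taste draws with a product-specific scale `sigma`, ranks products by local utility (which coincides with ranking by local share only in a plain logit without nests), and takes the assortment sizes as given. The function name and its arguments are hypothetical.

```python
import numpy as np


def local_choice_sets(delta, sigma, assortment_sizes, seed=0):
    """Pick the top products in each location after adding local taste draws.

    delta: national mean utilities, shape (J,)
    sigma: scale of the across-market taste draw per product, shape (J,)
    assortment_sizes: number of products stocked in each location, length L
    Normal draws and ranking by utility are illustrative assumptions.
    """
    delta = np.asarray(delta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    rng = np.random.default_rng(seed)
    n_locations = len(assortment_sizes)
    # One taste draw eta[l, j] per location-product pair.
    eta = rng.standard_normal((n_locations, delta.size)) * sigma
    local_utility = delta + eta
    # Sort products from most to least preferred within each location.
    order = np.argsort(-local_utility, axis=1, kind="stable")
    choice_sets = [order[l, :k] for l, k in enumerate(assortment_sizes)]
    return choice_sets, eta
```

Setting `sigma` to zero removes across-market heterogeneity, so every location receives the top of the national ranking; this corresponds to the standardized-assortment scenario.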

We utilize our local retailer data to set local assortment cutoffs. While we cannot directly match our online sales data and our brick-and-mortar assortment data, we can use the counts as a guide for our selection of the local level assortment sizes. Each market receives a limited assortment from the predicted values of

$$\ln(a_{\ell}) = \beta_0 + \beta_1 \ln(\mathrm{pop}_{\ell}) + \varepsilon_{\ell},$$

where $a_{\ell}$ is the assortment number we see in our Macy’s and Payless data and $\mathrm{pop}_{\ell}$ is location population. We examine the robustness of our results for a range of thresholds in the following section. Location level consumer welfare is defined as

$$CS_{\ell} = \frac{M \omega_{\ell}}{\alpha} \log\left(1 + \sum_{c \in C} \left(\sum_{j \in c} \exp\left(\frac{\delta_j + \eta_{\ell j}}{1 - \lambda}\right)\right)^{1 - \lambda}\right)$$

and retail revenue is defined as $r_{\ell j} = p_j M \omega_{\ell} \pi_{\ell j}$, where $M$ is the size of the national population (for men and women, respectively).
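As a rough illustration of the welfare and revenue definitions above, the following sketch evaluates nested logit choice probabilities, consumer surplus, and product revenue for a single location. It is written under stated assumptions: `alpha` is taken to be the positive magnitude of the price coefficient, `market_size` stands for $M \omega_{\ell}$, and the function name and the `in_set` mask are hypothetical conveniences for switching between the full and the limited choice set.

```python
import numpy as np


def nested_logit_location(v, nest, price, lam, alpha, market_size, in_set=None):
    """Choice probabilities, consumer surplus, and revenue for one location.

    v: delta_j + eta_lj for every product, shape (J,)
    nest: integer nest (category) label per product, shape (J,)
    price: product prices, shape (J,)
    lam: nesting parameter in [0, 1); alpha: positive price sensitivity
    market_size: M * omega_l, the number of potential buyers in the location
    in_set: optional boolean mask of products available in the choice set
    """
    v = np.asarray(v, dtype=float)
    price = np.asarray(price, dtype=float)
    if in_set is None:
        available = np.ones(v.size, dtype=bool)
    else:
        available = np.asarray(in_set, dtype=bool)
    # exp((delta_j + eta_lj) / (1 - lambda)) for available products, 0 otherwise.
    e = np.where(available, np.exp(v / (1.0 - lam)), 0.0)
    _, nest_index = np.unique(np.asarray(nest), return_inverse=True)
    nest_index = nest_index.ravel()
    inclusive = np.bincount(nest_index, weights=e)    # sum over j in nest c
    denom = 1.0 + np.sum(inclusive ** (1.0 - lam))     # the 1 is the outside good
    own_nest = inclusive[nest_index]
    safe = np.where(own_nest > 0.0, own_nest, 1.0)     # avoid 0 ** (-lam)
    prob = e * safe ** (-lam) / denom
    consumer_surplus = market_size / alpha * np.log(denom)
    revenue = price * market_size * prob
    return prob, consumer_surplus, revenue
```

Calling the function once with the full choice set and once with a mask for the limited assortment, then differencing the two consumer surplus values, gives the location-level gain from access to the larger set.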

Table 10 summarizes our main findings and compares estimates under the Local RE model with the Local FE-ES model. Using our proposed technique, we find consumer welfare increases by $16.8 million, or 9.9%, for men and $106 million, or 15.4%, for women by going from localized assortments to the large, online choice set. Our interpretation is that these numbers are significant, but modest.

Regardless of the interpretation, these numbers are in stark contrast to the gains found under the local fixed effects model. Consistent with Ellison and Glaeser (1997), ignoring the local level small sample problem exaggerates estimated heterogeneity across markets. By assuming products without an observed sale are completely unwanted at that particular location, the customized counterfactual choice set satisfies most consumer demand. This leads to an understatement of the consumer welfare increase due to having access to the large online choice set, and in fact, we estimate the gains to be almost nonexistent. For men’s shoes, we find the consumer gain to be approximately 0% and for women, 0.1%.

To get a sense of the role that across-market demand heterogeneity and localized assortments play in the counterfactual, we now compare these results to the scenario where each location draws from the same, standardized ranking of products. Each location receives a number of products based on its population, as in the previous counterfactuals, but the selection of products is derived from the top products at the aggregate, or national, level. Under a standardized national ranking of products, access to online variety increases consumer welfare by $23.3 million, or 14.3%, for men and $121.7 million, or 18.1%, for women. This suggests that abstracting from the fact that retailers cater to local demand has sizable consequences for the measured benefit of variety: the dollar gain is overstated by 38.7% and 14.8% for men and women, respectively, and the percentage gain is overstated by 44.4% and 17.5% for men and women, respectively. The overstatement occurs because the baseline welfare (pre-internet) of consumers is lower when choice sets are determined by national preferences than when they are locally targeted. For example, if the national ranking highly rates sneakers and sandals, there will be too few boots for consumers in Alaska.
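The overstatement figures can be reproduced directly from the welfare gains quoted in the text. The short check below is only an arithmetic illustration; the helper name is hypothetical, and the inputs are the reported gains under the standardized and localized assortments.

```python
def overstatement(standardized_gain, localized_gain):
    """Relative amount by which the standardized-assortment gain exceeds the localized one."""
    return standardized_gain / localized_gain - 1.0


# Dollar gains (in millions) and percentage gains as reported in the text.
men_dollar = overstatement(23.3, 16.8)
women_dollar = overstatement(121.7, 106.0)
men_percent = overstatement(14.3, 9.9)
women_percent = overstatement(18.1, 15.4)

for label, value in [
    ("men, dollar gain", men_dollar),
    ("women, dollar gain", women_dollar),
    ("men, percentage gain", men_percent),
    ("women, percentage gain", women_percent),
]:
    print(f"{label}: overstated by {100 * value:.1f}%")
```

The four printed values are 38.7%, 14.8%, 44.4%, and 17.5%, matching the numbers in the paragraph above.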

Our results also have implications concerning assortments at brick-and-mortar retailers. By comparing the results of the nationally standardized assortment with localized assortments, we find revenue is 3% higher under the latter. This suggests that there may be an incentive for local stores to cater to local demand depending on the potential diseconomies of scale due to localization.


5.2 Long Tail Analysis

Our counterfactual results suggest that "shorter" revenue tails at the local level underlie the long tail at the national level. Using the raw sales data, Figure 7 illustrates how local level "short" tails can aggregate to a national level long tail. It plots the cumulative share of revenue going to the top K products (x-axis) for the median market (by number of monthly sales), middle 10% (p45-p55), middle 50% (p25-p75), and national level markets. For the median market, we can see that there is an extremely short tail with fewer than 2,000 products making up total local revenue.

The next line (p45-p55) aggregates the sales data for the middle 10% of markets. Since the popularity of products varies across geographic markets, aggregating over markets increases the number of different varieties sold and decreases the density of sales among the top ranked products. Sales become less concentrated among the top products, producing a lengthening effect of the revenue tail. Using the middle 50% of markets (p25-p75), the lengthening effect of the tail is very pronounced. Hence, simply aggregating over markets creates a long tail even though each individual market demands far fewer varieties of shoes.
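A toy example makes the aggregation mechanism concrete. The two markets and their revenue figures below are invented purely for illustration; the point is only that when markets favor different products, the top product's revenue share is lower in the aggregate than in either market.

```python
import numpy as np


def top_k_revenue_share(revenue, k):
    """Share of total revenue earned by the k best-selling products."""
    r = np.sort(np.asarray(revenue, dtype=float))[::-1]
    return r[:k].sum() / r.sum()


# Two hypothetical markets, three products; each market favors a different product.
market_a = np.array([90.0, 10.0, 0.0])
market_b = np.array([0.0, 10.0, 90.0])
aggregate = market_a + market_b

local_top1 = top_k_revenue_share(market_a, 1)      # 90 / 100
national_top1 = top_k_revenue_share(aggregate, 1)  # 90 / 200
```

Here the top product earns 90% of revenue in each local market but only 45% after aggregation, so the aggregated revenue curve is flatter and its tail longer.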

The small sample problem in the raw data presents us with a skewed perspective in that it suggests an implausibly short tail at the local level. However, we can correct for the small sample problem in our long tail analysis by utilizing the results of our model and simulating a large number of purchases for each of our local markets. Figure 8 contains the same median market (data) revenue curve along with the national revenue curve found in Figure 7. We add a line called "Median (Simulated)" which removes the small sample problem for that location. As expected, we find that the local tail is actually quite a bit longer than suggested by the raw data. This implies that there is a long tail effect as described in the existing literature, but it is shorter than the national tail would suggest.

Table 11 further illustrates the effect of small sample sizes on the local tail. It presents the average share of revenue accruing to products outside of the top 3,000 products at both the local and national level. At the national level, more than 50% of revenue comes from products ranked outside the top 3,000. At the local level, if we were to rely on the raw data, we would find that only 3.6% of revenue comes from products ranked outside the top 3,000. In other words, 96.4% of demand could be satisfied with just 3,000 well-targeted products in each local market. Simulating our model with the same small number of sales observed in each local market yields a distribution of sales that is similar to the data.[25] However, by simulating a large number of sales in each local market, we find that there is in fact significant demand for niche products at the local level. On average, 47.0% of sales come from products outside the top 3,000 products. Consistent with Figure 8, however, the local percentage mean is less than in the national level data.
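The small sample effect on the measured tail can be illustrated with a simple simulation. The sketch below is not the estimated model: it assumes a hypothetical Zipf-like set of true product shares, equal prices (so sales counts stand in for revenue), and arbitrary sample sizes of 500 and 1,000,000 purchases.

```python
import numpy as np


def tail_revenue_share(sales, top_k):
    """Share of sales going to products ranked outside the top_k."""
    s = np.sort(np.asarray(sales, dtype=float))[::-1]
    total = s.sum()
    return 0.0 if total == 0 else s[top_k:].sum() / total


rng = np.random.default_rng(0)
n_products, top_k = 10_000, 3_000

# Hypothetical true shares: proportional to 1 / rank.
true_shares = 1.0 / np.arange(1, n_products + 1)
true_shares /= true_shares.sum()

few = rng.multinomial(500, true_shares)          # small local sample
many = rng.multinomial(1_000_000, true_shares)   # large simulated sample

tail_few = tail_revenue_share(few, top_k)    # zero: at most 500 products can sell
tail_many = tail_revenue_share(many, top_k)  # positive: niche demand becomes visible
```

With only 500 purchases, no more than 500 products can record a sale, so the measured share outside the top 3,000 is mechanically zero even though the underlying demand for those products is positive.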

6 Robustness

6.1 Welfare and Counterfactual Choice Sets

In this subsection, we examine the robustness of our findings to the size of the counterfactual choice set. While we find that the absolute size of the overstatement is sensitive to the counterfactual assortment size, the percentage overstatement is fairly robust across a wide range of threshold sizes and in line with our findings from the previous section. Table 12 presents the change in consumer welfare and the size of the overstatement resulting from various thresholds of the counterfactual choice set. For comparison, we also include our baseline results from the previous section.

Unsurprisingly, as the size of the counterfactual choice set increases, the gain consumers derive from access to the remaining products decreases. This decrease occurs faster under locally-customized assortments than under a nationally standardized assortment. As a result, the percentage overstatement tends to increase in the assortment size, despite the absolute size of the overstatement decreasing. This pattern is illustrated in Figure 9, which can be read as the estimated consumer welfare overstatement when assuming no local assortment customization, measured in millions of dollars (solid) and as a percentage (dash).

[25] Given the small number of sales at the local level, this result is unsurprising. For example, suppose fewer than 3,000 sales are observed in a local market. Then, of course, the share of revenue going to products outside the top 3,000 is zero.

Table 13 presents the retail revenue at various thresholds of the counterfactual choice set. With retail revenue we find that as assortment sizes increase, the gain from customizing assortments to local demand decreases in size. However, a typical large brick-and-mortar shoe retailer stocks, at most, a few thousand varieties. Our results imply that a national retailer stocking 3,000 products in each store could increase its revenue by 5.1% by moving to a locally-customized inventory from a nationally standardized one. This suggests that there may be significant incentives for large national brick-and-mortar shoe retailers to customize their assortments to local demand.

Figure 10 graphs the increase in retail revenue due to local customization of assortments, measured in millions of dollars (solid) and as a percentage (dash). The percentage increase monotonically decreases with assortment size. The graph shows that when assortment sizes are extremely limited, brick-and-mortar retailers can significantly boost revenue by maintaining locally-customized product assortments.

6.2 Small Sample Sizes and the Long Tail

We may be concerned that the long tail observed in our aggregated data is actually due to small sample sizes at the local level, rather than driven by across-market demand heterogeneity. Figure 11 graphs the cumulative share of revenue going to the top K products for the median CSA, middle 10% (p45-p55), middle 50% (p25-p75), and the national level markets across four panels (solid lines). To test how sampling impacts the revenue curve, we remove all products in which only a single local sale occurs (dashed lines).

As expected, we find that removing single sale products shortens the revenue tail. For the median market (a), the already extremely short tail shortens further. For the middle 10% of markets (b) the shortening is quite large, but this effect diminishes substantially with aggregation to the middle 50% (c). In particular, at the national level (d) we still obtain a long tail pattern, even with all of the single sale products removed at the local level. This suggests that aggregation does, in fact, average out the effects of small sample sizes and gives us confidence that our long tail results are not driven by one-off purchases.
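One way to implement the filter described above is sketched below. It assumes the filter is applied cell by cell, that is, any location-product pair with exactly one observed sale is set to zero before aggregating; the sales matrix and the function name are hypothetical.

```python
import numpy as np


def drop_single_sales(local_sales):
    """Zero out location-product cells with exactly one sale, then aggregate.

    local_sales: integer array of shape (locations, products).
    Returns national sales per product before and after the filter.
    """
    local_sales = np.asarray(local_sales)
    filtered = np.where(local_sales == 1, 0, local_sales)
    return local_sales.sum(axis=0), filtered.sum(axis=0)


# Three hypothetical locations and three products.
sales = np.array([[5, 1, 0],
                  [4, 1, 1],
                  [6, 2, 0]])
before, after = drop_single_sales(sales)
```

In this example the third product disappears from the national data after filtering, while the first product is unaffected; comparing revenue curves built from `before` and `after` mirrors the solid and dashed lines in Figure 11.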

7 Conclusion

In this paper, we quantify the effect of increased access to variety due to online retail on consumer welfare and firm profitability. To perform this analysis, we develop new methodology that allows us to confront the severe small sample problem in our data while retaining the across-market heterogeneity of interest to us. Our estimates suggest products face substantial heterogeneity in demand across markets, and that this heterogeneity helps explain the distribution of sales we see in the data. The presence of across-market demand heterogeneity has important implications for both consumer welfare and firm strategy.

On one hand, differences in local demand may create an incentive for retailers to customize assortments, and our brick-and-mortar data suggest that brick-and-mortar shoe stores are reacting to these incentives. Our results suggest local retailers may generate 3% additional revenue by localizing assortments. On the other hand, our calculations suggest that abstracting from across-market demand heterogeneity overestimates the consumer welfare gain due to online markets by about 20%. This is important since there are several potential avenues through which online retail can benefit consumers. For example, the entry of online firms suggests local brick-and-mortar retailers now face increased competition, which may lead to a reduction in prices and an increase in consumer welfare. The variety channel is another avenue through which online retail can increase consumer welfare: the large online choice set allows for better product matching compared to the limited selection available at physical stores. Our results suggest this channel may be less important than previously thought due to correlated local preferences.

Finally, our results suggest a new interpretation of the long tail phenomenon. With our data, we show that simply aggregating local demand creates a long tail at the national level. We confirm that substantial heterogeneity exists at the national level, but the key driver is across-market demand heterogeneity.

Although we bring in new, rich data and propose new methodology to estimate demand with 95% local zeros, both the data and methodology have limitations. With our data, we do not observe consumer search, which may further decrease the consumer surplus gains of the large online choice set. However, recommendation and search tools available on most online retail websites decrease these search costs and may provide additional benefit compared to physically searching (shopping) at brick-and-mortar stores. Incorporating this feature of the market may be important given the size of the choice set.

Like the existing literature, we lack data on pre-internet assortments and resort to counterfactual exercises that assume the stocking decisions of brick-and-mortar retailers have been unaffected by the advent of online retail. It may be that the internet has shaped how retailers choose assortments, which in turn has impacted how consumers differentially shop online versus at stores. Additionally, we assume that brick-and-mortar retailers are able to perfectly predict consumer demand. To the extent that buyers for the brick-and-mortar retailers are unable to perfectly forecast local demand, our counterfactuals will understate the gains from online variety.[26] However, as long as there was some degree of local customization in brick-and-mortar assortments before the Internet and as long as brick-and-mortar retailers are not stocking products at random today, our main conclusion holds. That is, it is important to account for across-market demand heterogeneity when estimating the gain from online variety.

On the methodological side, our locations are simply characterized by random draws from the distribution of across-market heterogeneity. This means we cannot conduct location-specific analyses. A potentially interesting area of future research is addressing sampling in more flexible demand systems incorporating location-level or individual-level data.

[26] Aguiar and Waldfogel (2015) explore the benefit of having a long tail of products when quality is not perfectly predictable.

References

Ackerberg, D. A., and M. Rysman (2005): “Unobserved Product Differentiation in Discrete-Choice Models: Estimating Price Elasticities and Welfare Effects,” RAND Journal of Economics, 36(4), 771–788.

Aguiar, L., and J. Waldfogel (2015): “Quality Predictability and the Welfare Benefits from New Products: Evidence from the Digitization of Recorded Music,” Working paper.

Anderson, C. (2004): “The Long Tail,” Wired Magazine, 12(10), 170–177.

Bar-Isaac, H., G. Caruana, and V. Cuñat (2012): “Search, Design, and Market Structure,” American Economic Review, 102(2), 1140–1160.

Berry, S., A. Gandhi, and P. Haile (2013): “Connected substitutes and invertibility of demand,” Econometrica, 81(5), 2087–2111.

Berry, S., J. Levinsohn, and A. Pakes (1995): “Automobile Prices in Market Equilibrium,” Econometrica, 63(4).

Berry, S., O. B. Linton, and A. Pakes (2004): “Limit theorems for estimating the parameters of differentiated product demand systems,” The Review of Economic Studies, 71(3), 613–654.

Berry, S. T. (1994): “Estimating Discrete-Choice Models of Product Differentiation,” The RAND Journal of Economics, 25(2), 242–262.

Bronnenberg, B. J., S. K. Dhar, and J.-P. H. Dube (2009): “Brand History, Geography, and the Persistence of Brand Shares,” Journal of Political Economy, 117(1), 87–115.

Bronnenberg, B. J., J.-P. H. Dube, and M. Gentzkow (2012): “The Evolution of Brand Preferences: Evidence from Consumer Migration,” American Economic Review, 102(6), 2472–2508.

Brynjolfsson, E., Y. J. Hu, and M. S. Rahman (2009): “Battle of the Retail Channels: How Product Selection and Geography Drive Cross-Channel Competition,” Management Science, 55(11), 1755–1765.


Brynjolfsson, E., Y. J. Hu, and D. Simester (2011): “Goodbye Pareto Principle, Hello Long Tail: The Effect of Search Costs on the Concentration of Product Sales,” Management Science, 57(8), 1373–1386.

Brynjolfsson, E., Y. J. Hu, and M. D. Smith (2003): “Consumer Surplus in the Digital Economy: Estimating the Value of Increased Product Variety at Online Booksellers,” Management Science, 49(11), 1580–1596.

Brynjolfsson, E., Y. J. Hu, and M. D. Smith (2010): “Long Tails Versus Superstars: The Effect of IT on Product Variety and Sales Concentration Patterns,” Information Systems Research, 21(4), 736–747.

Chellappa, R., B. Konsynski, V. Sambamurthy, and S. Shivendu (2007): “An empirical study of the myths and facts of digitization in the music industry,” in Presentation 2007 Workshop Information Systems Economics (WISE), Montreal.

Chen, W.-C. (1980): “On the Weak Form of Zipf’s Law,” Journal of Applied Probability, 17(3), 611–622.

Chintagunta, P. K., and J.-P. Dube (2005): “Estimating a Stockkeeping-Unit-Level Brand Choice Model That Combines Household Panel Data and Store Data,” Journal of Marketing Research, 42(3), 368–379.

Choi, J., and D. R. Bell (2011): “Preference minorities and the Internet,” Journal of Marketing Research, 48(4), 670–682.

Conlon, C. T., and J. H. Mortimer (2013): “Demand Estimation under Incomplete Product Availability,” American Economic Journal: Microeconomics, 5(4), 1–30.

Dixit, A. K., and J. E. Stiglitz (1977): “Monopolistic competition and optimum product diversity,” The American Economic Review, 67(3), 297–308.

Ellison, G., and E. L. Glaeser (1997): “Geographic Concentration in U.S. Manufacturing Industries: A Dartboard Approach,” Journal of Political Economy, 105(5), 889–927.

Gandhi, A., and J.-F. Houde (2015): “Measuring Substitution Patterns in Differentiated Products Industries - The Missing Instruments,” Working paper.

Gandhi, A., Z. Lu, and X. Shi (2013): “Estimating Demand for Differentiated Products with Error in Market Shares,” Working paper.

Gandhi, A., Z. Lu, and X. Shi (2014): “Demand Estimation with Scanner Data: Revisiting the Loss-Leader Hypothesis,” Working paper.

Hinz, O., J. Eckert, and B. Skiera (2011): “Drivers of the Long Tail Phenomenon: An Empirical Analysis,” Journal of Management Information Systems, 27(4), 43–69.


Krugman, P. R. (1979): “Increasing returns, monopolistic competition, and international trade,” Journal of International Economics, 9(4), 469–479.

Lancaster, K. J. (1966): “A New Approach to Consumer Theory,” Journal of Political Economy, 74(2), 132–157.

Petrin, A. (2002): “Quantifying the Benefits of New Products: The Case of the Minivan,” Journal of Political Economy, 110(4), 705–729.

Roberts, M. J., D. Y. Xu, X. Fan, and S. Zhang (2012): “A structural model of demand, cost, and export market selection for Chinese footwear producers,” Working paper.

Romer, P. (1994): “New goods, old theory, and the welfare costs of trade restrictions,” Journal of Development Economics, 43(1), 5–38.

Song, M. (2007): “Measuring Consumer Welfare in the CPU Market: An application of the pure-characteristics demand model,” The RAND Journal of Economics, 38(2), 429–446.

Tan, T. F., and S. Netessine (2009): “Is Tom Cruise Threatened? Using Netflix Prize Data to Examine the Long Tail of Electronic Commerce,” Working paper.

Tucker, C., and J. Zhang (2011): “How Does Popularity Information Affect Choices? A Field Experiment,” Management Science, 57(5), 828–842.

Waldfogel, J. (2003): “Preference Externalities: An Empirical Study of Who Benefits Whom in Differentiated-Product Markets,” RAND Journal of Economics, 34(3), 557–568.

Waldfogel, J. (2004): “Who Benefits Whom in Local Television Markets?,” in Brookings-Wharton Papers on Urban Economics, ed. by J. R. Pack, and W. G. Gale, pp. 257–305. Brookings Institution Press, Washington DC.

Waldfogel, J. (2008): “The Median Voter and the Median Consumer: Local Private Goods and Population Composition,” Journal of Urban Economics, 63(2), 567–582.

Waldfogel, J. (2010): “Who Benefits Whom in the Neighborhood? Demographics and Retail Product Geography,” in Agglomeration Economics, ed. by E. L. Glaeser, pp. 181–209. University of Chicago Press, Chicago.


A Tables and Figures

Figure 1: Histogram of shoe assortment size across stores

(a) Macy’s (b) Payless

Note: Data aggregated over sizes. A shoe is marked as available if any size of a particular shoe is in stock. Data covers 649 Macy’s stores and 3,141 Payless stores.

Figure 2: Histogram of percent of stores carrying individual shoes

(a) Macy’s (b) Payless

Note: Data aggregated over sizes. A shoe is marked as available if any size is in stock. If all shoes were available at all stores, the densities would collapse to 1. Online exclusives are excluded from this analysis.


Figure 3: Assortment Overlap by Distance

(a) Macy’s (b) Payless

Note: Lowess fitted values of assortment overlap across stores in the network. Analysis split across stores with similar assortment sizes.

Figure 4: Boots vs. Sandals Revenue by Temperature

Note: Linear regression fitted values of state revenue share of boots and sandals as a function of average annual state temperature.


Figure 5: Sales Share of a Popular Brand Across Zip3s

Note: Map of Eastern US Zip3s (the first 3 digits of a 5-digit zip code). The color of the Zip3 corresponds to the local revenue share of a popular brand in the data set. Sales of the brand are concentrated in the Eastern US.


Figure 6: Goodness of Fit: Percentage of Location Level Zeros (Sneakers)

Notes: (left) Men’s (right) Women’s. For each product, percentage of locations with zero sales in the data (solid), in our estimation with across-market heterogeneity (long-dash), and with homogeneous demand across markets (dash).

Figure 7: Aggregating to the Long Tail

Note: For varying levels of aggregation, the cumulative share of revenue going to the top products.


Figure 8: Local Tail: Correcting for Small Samples

Note: For the median local market (CSA + state, by number of monthly sales), the cumulative share of revenue going to the top products, as seen in the data (dot) and simulated using our estimated demand system (dash-dot). For comparison, we also include the national level (solid).


Figure 9: Overestimation of Consumer Welfare

Note: The overstatement in consumer surplus gains, by counterfactual assortment size, when assuming a nationally standardized assortment vs. a locally customized assortment measured in dollars (red) and percentage (black).


Figure 10: Increase in Retail Revenue from Local Assortments

Note: The gain in retail revenue, by local retailer assortment size, when moving from a nationally standardized assortment to a locally customized assortment measured in dollars (red) and percentage (black).


Figure 11: Simulation of demand aggregation where single sale observations are dropped.

(a) Median Market (b) 45th-55th Percentile Markets

(c) 25th-75th Percentile Markets (d) Aggregation of All Markets (National)

Note: For varying levels of aggregation, the cumulative share of revenue going to the top products as seen in the data (solid) and after dropping all local market level single sales (dash-dot).


Table 1: Summary of Brick-and-Mortar Data

| | Number of stores | Number of products | Percent online exclusive | Avg. assortment size |
|---|---|---|---|---|
| Macy’s | 649 | 7,844 | 34.8% | 624.9 (299.3) |
| Payless Shoes | 3,141 | 1,430 | 19.2% | 513.0 (58.4) |

Notes: Data collected through macys.com and payless.com. For every shoe-size combination, we check to see if the product is in stock. $N_{\text{Macy's}} = 93{,}481{,}515$, $N_{\text{Payless}} = 69{,}451{,}866$.

Table 2: Local-National Revenue Share Comparison and Multinomial Tests

| Market Definition | Number of Markets | Market Top 500: Market | Market Top 500: National | Multinomial Tests: Rejection Rates (%) |
|---|---|---|---|---|
| 5-Digit Zip Code | 35,279 | 99.96 | 4.69 | 40.14 |
| 3-Digit Zip Code | 894 | 85.12 | 6.28 | 65.02 |
| CSA + State | 213 | 67.05 | 7.23 | 78.30 |
| Combined Statistical Area (CSA) | 165 | 70.31 | 7.19 | 88.05 |
| State (plus DC) | 51 | 30.04 | 9.86 | 94.26 |
| Census Region | 4 | 16.36 | 14.76 | 92.86 |
| National | 1 | 15.54 | 15.54 | |

Multinomial tests: Define $s = \{s_1, \ldots, s_J\}$ and $s_{\ell} = \{s_{\ell 1}, \ldots, s_{\ell J}\}$, then the null hypothesis is $H_0: s = s_{\ell}$. CSA + State includes the 165 CSAs and 48 States. NJ, RI, and DC are dropped as sales in these states are assigned to CSAs.


Table 3: Data Disaggregation: The Zeros Problem

| Market Definition | Number of Markets | Percent of Zero Sales: Week | Month | Quarter | Annual |
|---|---|---|---|---|---|
| 5-Digit Zip Code | 35,279 | 99.99 | 99.96 | 99.91 | 99.78 |
| 3-Digit Zip Code | 894 | 99.57 | 98.57 | 97.07 | 94.09 |
| CSA + State | 213 | 98.43 | 95.54 | 91.98 | 86.12 |
| Combined Statistical Area (CSA) | 165 | 98.50 | 95.80 | 92.53 | 87.15 |
| State (plus DC) | 51 | 94.23 | 85.25 | 76.27 | 64.26 |
| Census Region | 4 | 59.83 | 33.70 | 21.72 | 12.17 |
| National | 1 | 28.30 | 9.27 | 4.50 | 1.01 |

Percent of products observed to have zero sales, where a product is a SKU.

Table 4: Revenue Share of Top Products with Product Aggregation

| Product Definition | Percent of Zero Sales | Market Top 500: Market | Market Top 500: National |
|---|---|---|---|
| SKU (shoe + style) | 95.54 | 67.05 | 7.23 |
| Shoe | 93.10 | 73.39 | 19.07 |

| Product Definition | Percent of Zero Sales | Market Top 10: Market | Market Top 10: National |
|---|---|---|---|
| SKU (shoe + style) | 95.54 | 7.59 | 0.50 |
| Shoe | 93.10 | 9.10 | 2.18 |
| Brand | 59.27 | 33.91 | 25.48 |

Time horizon fixed at monthly level and geography aggregated to the CSA-State level. Brand and Brand-Category information only includes data for which brand and shoe category information are available.


Table 5: Demand Estimates with Adjusted Shares - Men’s

| | Local FE (1) | National FE (2) | Local RE (3) |
|---|---|---|---|
| Price | −0.003*** (0.000) | −0.012*** (0.001) | −0.014*** (0.003) |
| Comfort | 0.010*** (0.002) | 0.012 (0.007) | 0.013 (0.010) |
| Look | 0.002 (0.002) | 0.032*** (0.009) | 0.036 (0.282) |
| Overall | 0.055*** (0.003) | 0.185*** (0.014) | 0.236*** (0.034) |
| No Review | 0.281*** (0.014) | 0.982*** (0.077) | 1.214 (1.550) |
| Intercept | −13.876*** (0.254) | −11.575*** (0.272) | −13.167*** (2.075) |
| λ | 0.523*** (0.015) | 0.633*** (0.025) | 0.525*** (0.006) |
| σ | | | Table 7 |
| Fixed Effects: Category | X | X | X |
| Fixed Effects: Brand | X | X | X |
| Fixed Effects: Color | X | X | X |
| Fixed Effects: Month | X | X | X |
| N | 38.2mil | 179,416 | 179,416 |
| Zeroes | 95.03% → 0% | 8.23% → 0% | 8.23% → 0% |
| Price Elast.: Mean | −0.784 | −3.599 | −3.002 |
| Price Elast.: Std. Dev. | (0.582) | (2.673) | (2.526) |

Notes: Estimated at the monthly level. “Local FE” (1) estimates nested logit demand at the CSA-State level with Gandhi, Lu, and Shi (2014) adjusted shares, creating local-product level fixed effects. “National FE” (2) estimates nested logit demand at the national level with Gandhi, Lu, and Shi (2014) adjusted shares, creating national-product level fixed effects. Finally, “Local RE” (3) estimates the nested logit model using our estimation technique to allow for across-market heterogeneity in the form of a location-product level random effect. Standard errors in parentheses. The σ estimates for across-market heterogeneity in specification (3) are in Table 7.


Table 6: Demand Estimates with Adjusted Shares - Women’s

| | Local FE (1) | National FE (2) | Local RE (3) |
|---|---|---|---|
| Price | −0.002*** (0.000) | −0.006*** (0.000) | −0.007*** (0.000) |
| Comfort | 0.016*** (0.002) | 0.007 (0.005) | 0.020*** (0.007) |
| Look | −0.007*** (0.002) | −0.008 (0.005) | −0.003 (0.025) |
| Overall | 0.047*** (0.002) | 0.103*** (0.007) | 0.158*** (0.008) |
| No Review | 0.195*** (0.010) | 0.500*** (0.029) | 0.772*** (0.120) |
| Intercept | −19.731*** (0.165) | −11.614*** (0.112) | −14.640*** (0.159) |
| λ | 0.183*** (0.010) | 0.654*** (0.010) | 0.453*** (0.006) |
| σ | | | Table 7 |
| Fixed Effects: Category | X | X | X |
| Fixed Effects: Brand | X | X | X |
| Fixed Effects: Color | X | X | X |
| Fixed Effects: Month | X | X | X |
| N | 82.6mil | 387,745 | 382,747 |
| Zeroes | 95.17% → 0% | 9.41% → 0% | 9.41% → 0% |
| Price Elast.: Mean | −0.250 | −2.062 | −1.547 |
| Price Elast.: Std. Dev. | (0.220) | (1.814) | (1.360) |

Notes: Estimated at the monthly level. “Local FE” (1) estimates nested logit demand at the CSA-State level with Gandhi, Lu, and Shi (2014) adjusted shares, creating local-product level fixed effects. “National FE” (2) estimates nested logit demand at the national level with Gandhi, Lu, and Shi (2014) adjusted shares, creating national-product level fixed effects. Finally, “Local RE” (3) estimates the nested logit model using our estimation technique to allow for across-market heterogeneity in the form of a location-product level random effect. Standard errors in parentheses. The σ estimates for across-market heterogeneity in specification (3) are in Table 7.


Table 7: Parameter Estimates of Across-Market Heterogeneity: $\sigma_j = h(\cdot)$

Men (1), Women (2)

Boat, Boots, Clogs, Flats, Heels, Loafers, Oxfords, Sandals, Slippers, Sneakers

0.758*** (0.024)
0.001 (6.638)
1.419*** (0.031)
0.009 (0.761)
0.017 (0.383)
0.462*** (0.016)
1.066*** (0.025)
0.555*** (0.009)
0.916*** (0.023)
0.002 (1.163)
0.020 (0.445)
0.002 (4.008)
0.003 (1.497)
0.321*** (0.029)
0.001 (11.425)
0.002 (1.264)
0.123 (0.115)
0.904*** (0.011)

Notes: Parameter estimates correspond to σ, column 3, in Table 5 and Table 6, respectively. Parameters estimated jointly, by sex, with robust standard errors in parentheses. There are no products classified as men’s flats or men’s heels in the data sample.


Table 8: Demand Estimates with Empirical Shares

| | Men: Local (1) | Men: National (2) | Women: Local (3) | Women: National (4) |
|---|---|---|---|---|
| Price | −0.004*** (0.000) | −0.010*** (0.001) | −0.009*** (0.000) | −0.016*** (0.001) |
| Comfort | 0.025*** (0.002) | 0.027*** (0.006) | 0.053*** (0.002) | 0.065*** (0.008) |
| Look | 0.001 (0.002) | 0.031*** (0.007) | −0.020*** (0.003) | 0.008 (0.009) |
| Overall | 0.062*** (0.002) | 0.134*** (0.009) | 0.104*** (0.003) | 0.232*** (0.013) |
| No Review | 0.356*** (0.012) | 0.789*** (0.051) | 0.532*** (0.013) | 1.311*** (0.061) |
| Intercept | −10.643*** (0.047) | −11.182*** (0.199) | −12.490*** (0.039) | −15.008*** (0.224) |
| λ | 0.562*** (0.006) | 0.658*** (0.019) | 0.279*** (0.006) | 0.287*** (0.022) |
| Fixed Effects: Category | X | X | X | X |
| Fixed Effects: Brand | X | X | X | X |
| Fixed Effects: Color | X | X | X | X |
| Fixed Effects: Month | X | X | X | X |
| N | 1,899,674 | 164,647 | 3,989,579 | 351,259 |
| Zeroes | 95.03% | 8.23% | 95.17% | 9.41% |
| Price Elast.: Mean | −0.936 | −3.139 | −1.293 | −2.741 |
| Price Elast.: Std. Dev. | (0.528) | (2.193) | (0.848) | (2.213) |

Notes: Estimated at the monthly level using empirical (observed) shares. Columns (1) and (3) use local data and (2) and (4) use national shares. Zeros indicate percent of products dropped from the sample by using empirical shares. Robust standard errors in parentheses.


Table 9: Demand Estimates Market-by-Market

Price

Comfort

Look

Overall

No Review

Intercept

λ

Men Women

Empirical Shares Adjusted Shares Empirical Shares Adjusted Shares

(1) (2) (3) (4)

− 0 .

004

[ − 0 .

150 , 0 .

039]

31 .

0% ∗∗

[ −

0

.

022

0

.

588

,

2

.

852]

13 .

1% ∗∗

0

.

066

[ − 1

.

153

,

5

.

155]

8

.

0%

∗∗

− 0 .

007

[ − 5

.

667

,

1

.

623]

19

.

2% ∗∗

[ −

[ −

[ −

0

0

.

.

− 0 .

010

82

90

.

.

.

002

, 0

6%

61 .

0%

.

∗∗

∗∗

− 0

.

003

33

.

8%

0 040

002

,

0

6%

.

∗∗

∗∗

000]

0

.

010

0

.

009

,

0

.

034]

[ − 0

.

030

,

0

.

034]

175]

[

[ −

017]

[ − 0

.

450

,

1

.

914]

[

0

0

.

012

0

.

436

,

0

.

424]

0

.

.

033

36

17 .

4%

0

.

014

10

.

8%

0

2%

484

28

0

.

.

.

.

003

, 0

033

,

0

2%

.

∗∗

∗∗

∗∗

.

∗∗

469]

[

[

[ −

0

0

.

.

93

0

000

94

0

.

.

.

.

004

001

,

,

0

9%

037

0

8%

.

∗∗

0

.

013

0

.

025

,

0

.

045]

70 .

9% ∗∗

− 0

.

009

[ − 0

.

047

,

0

.

021]

46

.

0%

∗∗

.

∗∗

000]

086]

                (1)                   (2)                   (3)                   (4)
No Review       0.402                 0.185                 0.239                 0.135
                [−2.534, 16.907]      [−0.022, 0.826]       [−1.262, 8.215]       [−0.036, 0.364]
                24.9%**               79.8%**               31.5%**               92.5%**
Intercept       −10.345               −15.602               −12.742               −20.479
                [−70.830, 159.316]    [−51.493, −3.872]     [−19.807, −7.695]     [−60.767, −8.640]
                74.2%**               100.0%**              99.5%**               100.0%**
λ               0.557                 0.444                 0.069                 0.160
                [−3.848, 15.910]      [−0.170, 1.032]       [−1.440, 1.161]       [−0.102, 0.878]
                41.8%**               87.8%**               18.8%**               63.8%**

Fixed E ff ects

Category

Brand

Color

Month

N

Zeroes

Price Elast.

Mean

Std. Dev.

[127

[41

.

,

X

X

X

X

104

88%

,

99

,

− 0

.

831

(0

.

583)

.

271]

93%]

179

X

X

X

X

,

416

− 1

.

730

(1

.

303)

[400

[43

.

,

46%

X

X

X

X

219

,

99

,

− 0

.

791

(0

.

568)

.

214]

90%]

387

X

X

X

X

,

745

− 0

.

400

(0

.

352)

Notes: Estimated at the monthly level, market-by-market. Estimate rows are: mean parameter estimate across locations (unweighted), range of estimates, and percent of estimates significant at 5%. Columns (1) and (3) use empirical (observed) shares and (2) and (4) use Gandhi, Lu, and Shi (2014) adjusted shares.


Table 10: Gains From Increasing Variety

CS

Rev.

Assortment

Localized

National

Localized

National

Surplus Increase

$mil

%

$mil

%

$mil

%

$mil

%

Men Women

Local FE Local RE Local FE Local RE

0

.

2

0 .

0% 9

16

.

.

8

9% 0 .

0

.

5

1%

106

15 .

.

0

4%

64

.

4

9

.

5%

0

.

1

0

.

0%

14

14

23

.

3%

29

.

.

.

3

1

2%

101

16

0

.

.

0

.

.

7

4%

5

1%

121

18

19

.

.

.

7

1%

94

.

9

3%

33

.

1

12

.

8%

38

.

1

19

.

4%

103

.

4

18

.

4%

105

.

9

22

.

0%

Results based on parameter estimates in Table 5 and Table 6. Local assortment size is specified as the predicted values of $\ln(a_\ell) = \beta_0 + \beta_1 \ln(p_\ell) + \epsilon_\ell$, where $a$ is the assortment size found in the Macy’s and Payless data, and $p$ is local population.

Table 11: Average Revenue Share of Products Outside of the Top 3,000

                      Small Sample           Large Sample
         Market       Data       Model       Model
Men      National     27.7%      26.5%       27.6%
         Local        0.4%       0.1%        19.9%
Women    National     44.8%      44.5%       45.2%
         Local        1.9%       1.6%        40.4%
Total    National     53.7%      53.4%       53.8%
         Local        3.6%       3.2%        47.0%

Results based on the Local RE parameter estimates in Table 5 and Table 6. The small sample simulation of the model holds fixed the number of purchases at each location to that observed in the data. The large sample simulation of the model is calculated using the simulated local shares, effectively assuming a continuum of consumers at each location.


Table 12: Robustness: Overstatement of Consumer Welfare Increase

                                             Threshold
Assortment Size                  Baseline    Mean Baseline   3,000    6,000    12,000   24,000
Percent Increase
  Loc.                           14.3        24.5            29.0     14.3     4.6      0.3
  Nat.                           17.3        30.2            35.2     17.2     5.6      0.4
  %                              21.2        23.0            21.4     20.5     22.0     40.2
Absolute Increase ($ Millions)
  Loc.                           122.8       193.2           220.6    122.5    43.3     3.0
  Nat.                           145.0       227.3           255.5    143.9    52.3     4.2
  %                              18.1        17.7            15.8     17.5     20.8     40.0

Results based on the Local RE parameter estimates in Table 5 and Table 6. The baseline assortment size is specified as the predicted values of $\ln(a_\ell) = \beta_0 + \beta_1 \ln(p_\ell) + \epsilon_\ell$, where $a$ is the assortment size found in the Macy’s and Payless data, and $p$ is local population. The threshold assortment sizes impose the same assortment size in every local market.

Table 13: Robustness: Retail Revenue

                                             Threshold
Assortment Size                  Baseline    Mean Baseline   3,000    6,000    12,000   24,000
Percent Increase
  Loc.                           14.5        25.4            27.5     14.6     5.2      0.5
  Nat.                           17.2        30.5            32.6     17.3     6.3      0.6
  %                              18.9        20.2            18.7     18.7     19.6     38.0
Absolute Increase ($ Millions)
  Loc.                           124.0       198.7           211.5    124.7    48.8     4.5
  Nat.                           144.0       229.4           241.3    144.6    57.8     6.2
  %                              16.1        15.5            14.1     16.0     18.4     37.8

Results based on the Local RE parameter estimates in Table 5 and Table 6. The baseline assortment size is specified as the predicted values of $\ln(a_\ell) = \beta_0 + \beta_1 \ln(p_\ell) + \epsilon_\ell$, where $a$ is the assortment size found in the Macy’s and Payless data, and $p$ is local population. The threshold assortment sizes impose the same assortment size in every local market.


B An Empirical Bayesian Estimator of Shares

As mentioned in the Data section, our data exhibits a high percentage of zero observations.

To account for this we implement a new procedure proposed by Gandhi, Lu, and Shi (2014).

This estimator is motivated by a Laplace transformation of the empirical shares,

$$ s^{lp}_j = \frac{M \cdot s_j + 1}{M + J + 1}. $$

Note that using $s^{lp}_j$ results in a consistent estimator of $\delta$ as the market size $M \to \infty$, as long as $s_j \xrightarrow{p} \pi_j$. However, instead of simply adding a sale to each product, they “propose an optimal transformation that minimizes a tight upper bound of the asymptotic mean squared error of the resulting $\beta$ estimator.”
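The Laplace transformation can be illustrated with a short sketch. This is only a reading of the formula above, not the authors' code: the function name `laplace_shares`, the example sales vector, and the treatment of the outside good (one pseudo-sale, like every inside product) are assumptions made for illustration.

```python
def laplace_shares(sales, market_size):
    """Laplace-adjusted shares s_lp_j = (M * s_j + 1) / (M + J + 1).

    sales[j] is the number of units of inside product j sold (M * s_j);
    market_size is M, the number of consumers including non-buyers.
    """
    J = len(sales)
    denom = market_size + J + 1
    inside = [(q + 1) / denom for q in sales]
    # Assumption: the outside good also receives one pseudo-sale,
    # so the J + 1 adjusted shares sum to one.
    outside = (market_size - sum(sales) + 1) / denom
    return inside, outside


# Hypothetical local market: four products, two of them with zero sales.
sales = [3, 0, 1, 0]
inside, outside = laplace_shares(sales, market_size=100)
print(inside, outside)
```

Every adjusted share is strictly positive, so the log-share inversion is defined even for products with no observed sales.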

The key is to back out the conditional distribution of choice probabilities, $\pi_t$, given empirical shares and market size, $(s, M)$. Denote this conditional distribution $F_{\pi|s,M}$. According to Bayes' rule,

$$ F_{\pi|s,M}(p \mid s, M) = \frac{\int_{x \le p} f_{s|\pi,M}(s \mid x, M)\, dF_{\pi|M,J}(x \mid M, J)}{\int_{x} f_{s|\pi,M}(s \mid x, M)\, dF_{\pi|M,J}(x \mid M, J)}. $$

Thus, $F_{\pi|s,M}$ can be estimated if the following two distributions are known or can be estimated:

1. $F_{s|\pi,M}$: the conditional distribution of $s$ given $(\pi, M)$;

2. $F_{\pi|M,J}$: the conditional distribution of $\pi$ given $(M, J)$.

$F_{s|\pi,M}$ is known from observed sales: $M \cdot s$ is drawn from a multinomial distribution with parameters $(\pi, M)$,

$$ M \cdot s \sim MN(\pi, M). \tag{B.1} $$

$F_{\pi|M,J}$ is not generally known and must be inferred. Gandhi, Lu, and Shi (2014) note that sales can often be described by Zipf’s law, which, citing Chen (1980), can be generated if $\pi/(1-\pi_0)$ follows a Dirichlet distribution. It is then assumed that

$$ \frac{\pi}{1-\pi_0} \,\Big|\, J, M, \pi_0 \sim Dir(\vartheta 1_J), \tag{B.2} $$

for an unknown parameter $\vartheta$. Equations B.1 and B.2 then imply

$$ \frac{s}{1-s_0} \,\Big|\, J, M, s_0 \sim DCM(\vartheta 1_J, M(1-s_0)), $$

where $DCM(\cdot)$ denotes a Dirichlet compound multinomial distribution.

$\vartheta$ can then be estimated by maximum likelihood, since $J$, $M$, and $s_0$ are observed. This estimator can be interpreted as an empirical Bayesian estimator of the choice probabilities $\pi$, with a Dirichlet prior and multinomial likelihood,

$$ F_{\frac{\pi}{1-s_0} \mid s, M} \sim Dir(\vartheta + M \cdot s). $$
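As a rough illustration of the maximum likelihood step, the sketch below evaluates the log-likelihood of a symmetric Dirichlet compound multinomial and picks $\vartheta$ on a grid. It is a simplified stand-in, not the procedure of Gandhi, Lu, and Shi (2014): the function names, the grid search (used here in place of a proper optimizer), and the simulated sales data are all assumptions made for illustration.

```python
import math
import random


def dcm_loglik(theta, markets):
    """Log-likelihood (up to a constant) of a symmetric DCM with parameter theta.

    markets is a list of count vectors; each holds the inside-good sales
    M * s_j of the J products in one market.
    """
    total = 0.0
    for counts in markets:
        J, n = len(counts), sum(counts)
        total += math.lgamma(J * theta) - math.lgamma(n + J * theta)
        total += sum(math.lgamma(c + theta) for c in counts) - J * math.lgamma(theta)
    return total


def estimate_theta(markets, lo=1e-3, hi=1e2, points=400):
    """Crude maximum likelihood: search a log-spaced grid of candidate values."""
    grid = [lo * (hi / lo) ** (i / (points - 1)) for i in range(points)]
    return max(grid, key=lambda t: dcm_loglik(t, markets))


def simulate_counts(theta, J, n, rng):
    """Draw pi ~ Dir(theta * 1_J), then n purchases from Multinomial(pi)."""
    g = [rng.gammavariate(theta, 1.0) for _ in range(J)]
    total = sum(g)
    pi = [x / total for x in g]
    counts = [0] * J
    for j in rng.choices(range(J), weights=pi, k=n):
        counts[j] += 1
    return counts


# Hypothetical data: 200 markets, 20 products, 50 purchases each, theta = 0.5.
rng = random.Random(0)
markets = [simulate_counts(0.5, J=20, n=50, rng=rng) for _ in range(200)]
theta_hat = estimate_theta(markets)
print(round(theta_hat, 3))
```

The multinomial coefficient is omitted from the log-likelihood because it does not depend on $\vartheta$ and therefore does not affect the maximizer.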

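For any random vector $X = (X_1, \ldots, X_J) \sim Dir(\vartheta)$,

$$ E[\log(X_j)] = \psi(\vartheta_j) - \psi(\vartheta' 1_{d_\vartheta}), $$

where $\psi$ is the digamma function. Thus,

$$ E\left[\log \frac{\pi_j}{1-s_0}\right] = E[\log \pi_j] - E[\log(1-s_0)] = \psi(\vartheta + M \cdot s_j) - \psi((\vartheta + M \cdot s)' 1_{d_\vartheta}), $$

which implies

$$ \hat{\delta}_j = \log(\hat{\pi}_j) - \log(\hat{\pi}_0) = E[\log \pi_j] - E[\log(\pi_0)] = \psi(\vartheta + M \cdot s_j) - \psi(M \cdot s_0). $$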
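The final expression is simple to evaluate once $\vartheta$ is in hand. The sketch below is a minimal illustration under stated assumptions: `digamma` is a hand-rolled approximation (recurrence plus a standard asymptotic series) so the example needs no external library, and `delta_hat`, the sales vector, and the value $\vartheta = 0.5$ are hypothetical.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        # psi(x) = psi(x + 1) - 1 / x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def delta_hat(sales, outside_sales, theta):
    """Mean utilities psi(theta + M * s_j) - psi(M * s_0), as in the last equation.

    sales[j] is M * s_j and outside_sales is M * s_0 (assumed positive).
    """
    base = digamma(outside_sales)
    return [digamma(theta + q) - base for q in sales]


# Hypothetical market: 100 consumers, 96 choose the outside good.
deltas = delta_hat([3, 0, 1, 0], outside_sales=96, theta=0.5)
print([round(d, 3) for d in deltas])
```

Products with zero sales receive a finite mean utility, and products with more sales receive a higher one, which is the property that lets the zero-share observations stay in the estimation sample.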

The nested logit model also requires an estimate of the choice probability conditional on nest,

$$ \log(\hat{\pi}_j) - \log(\hat{\pi}_c) = E[\log(\pi_j)] - E[\log(\pi_c)] = \psi(\vartheta + M \cdot s_j) - \psi\Big(\sum_{j \in c} (\vartheta + M \cdot s_j)\Big). $$


C Monte Carlo Analysis

We conduct a Monte Carlo study of our estimator. We start by specifying the data generating process of a nested logit demand system and then create synthetic data sets from this process. Finally, we estimate the structural parameters using 2-step GMM.

The true model specifies consumer utility as

$$ \begin{aligned} u_{i\ell j} &= \underbrace{\beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \xi_j}_{\delta_j} + \eta_{\ell j} + \zeta_{ic} + (1-\lambda)\varepsilon_{i\ell j} \\ &= -4 + .75 x_{1j} + .75 x_{2j} + \xi_j + \eta_{\ell j} + \zeta_{ic} + (1 - .5)\varepsilon_{i\ell j}. \end{aligned} $$

The normalized outside good gives utility $u_{i\ell 0} = \zeta_{i0} + (1-\lambda)\varepsilon_{i\ell 0}$. Here we assume both characteristics are exogenous with respect to the unobservable $\xi$; however, given real data, instrumental variables can be used for these characteristics. We assign distributions on the data generating process according to Table 14 below.

Table 14: Data generating process for Monte Carlo study

Definition                     Variable          Specification
Characteristic 1               $x_1$             $N(0, 1)$
Characteristic 2               $x_2$             $N(0, 1.5^2)$
National Unobservable          $\xi$             $N(0, 1)$
Local Unobservable             $\eta$            $N(0, \sigma_c = 1)$
Individual Unobservable        $\varepsilon$     T1EV
Local-Category Product Size    $J_c$             175
Num. of Categories             $C$               3
Num. of Periods                $T$               10
Market Population              $M$               2000000
Num. of Local Markets          $L$               200
Population Distribution        $\omega_\ell$     $1/L$
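To make the data generating process concrete, the sketch below simulates sales for a single local market and period using the standard nested logit choice probabilities. It is a scaled-down illustration, not the authors' simulation code: the function name, the smaller values of $J_c$ and $M$, and the reading of $N(0, 1.5^2)$ as a standard deviation of 1.5 are assumptions, and $\xi$ is drawn inside the function only because a single market is simulated (across markets it would be drawn once and shared).

```python
import math
import random


def simulate_market(rng, J_c=5, C=3, M=2000, lam=0.5, sigma_c=1.0,
                    beta=(-4.0, 0.75, 0.75)):
    """Simulate one local market-period under the nested logit DGP.

    Returns (probs, counts); the last entry of each refers to the outside good.
    """
    nests = []
    for _ in range(C):
        terms = []
        for _ in range(J_c):
            x1 = rng.gauss(0.0, 1.0)       # characteristic 1
            x2 = rng.gauss(0.0, 1.5)       # characteristic 2
            xi = rng.gauss(0.0, 1.0)       # national unobservable
            eta = rng.gauss(0.0, sigma_c)  # local unobservable
            delta = beta[0] + beta[1] * x1 + beta[2] * x2 + xi
            terms.append(math.exp((delta + eta) / (1.0 - lam)))
        nests.append(terms)
    # Inclusive-value terms D_c; the outside good is normalized to D_0 = 1.
    D = [sum(terms) for terms in nests]
    denom = 1.0 + sum(d ** (1.0 - lam) for d in D)
    # P(j) = P(nest c) * P(j | nest c)
    probs = [(D[c] ** (1.0 - lam) / denom) * (t / D[c])
             for c, terms in enumerate(nests) for t in terms]
    probs.append(1.0 / denom)
    # Finite-sample sales: M consumers each make one multinomial choice.
    counts = [0] * len(probs)
    for k in rng.choices(range(len(probs)), weights=probs, k=M):
        counts[k] += 1
    return probs, counts


rng = random.Random(0)
probs, counts = simulate_market(rng)
zero_share = sum(c == 0 for c in counts[:-1]) / (len(counts) - 1)
print(f"inside products with zero sales: {zero_share:.0%}")
```

Because the intercept is strongly negative, most consumers choose the outside good, and with a finite number of consumers some products with positive choice probability record no sales.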

The parameters to be estimated are: $\beta_0 = -4$, $\beta_1 = .75$, $\beta_2 = .75$, $\sigma_c = 1$, $\lambda = .5$. The following steps are used to compute the estimator:

0. Initialize values of $\sigma, \lambda$,

1. Recover $\delta^{(k)}_j$ using the inversion (Equation 3.2),

2. Given $\delta^{(k)}_j$, calculate the GMM objective using micro moments and orthogonality conditions on $\xi^{(k)}_j$, $m(\cdot)$,

3. Select $\sigma^{(k')}, \lambda^{(k')}$ and repeat 1-2 until the GMM objective is minimized,

4. Given parameter estimates $\hat{\theta}_1$, calculate the weighting matrix $\left( m(\hat{\theta}_1; Z)^{T} m(\hat{\theta}_1; Z) \right)^{-1}$,

5. With this weighting matrix, repeat steps 0-3 to obtain $\hat{\theta}_2$, the two-step feasible GMM estimator.

We use the Nelder-Mead method to minimize the GMM objective, where $Z = [X, z_1, z_2]$, and $z_k$ is the mean characteristic of competing products within category for characteristic $k$.
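The two-step logic in steps 4 and 5 can be seen in a much simpler setting. The sketch below is a toy, not the paper's estimator: it swaps the nested logit model and Nelder-Mead search for a linear model $y = \theta x + e$ with two instruments, where each GMM step has a closed-form minimizer. All names and the simulated data are assumptions made for illustration.

```python
import random


def two_step_gmm(y, x, z1, z2):
    """Two-step GMM for y = theta * x + e with moments E[z_k * e] = 0, k = 1, 2."""
    n = len(y)
    a = [sum(z[i] * x[i] for i in range(n)) / n for z in (z1, z2)]  # mean of z * x
    b = [sum(z[i] * y[i] for i in range(n)) / n for z in (z1, z2)]  # mean of z * y

    def solve(W):
        # Closed-form minimizer of (b - theta * a)' W (b - theta * a).
        aWb = sum(a[r] * W[r][c] * b[c] for r in range(2) for c in range(2))
        aWa = sum(a[r] * W[r][c] * a[c] for r in range(2) for c in range(2))
        return aWb / aWa

    # Step 1: identity weighting matrix.
    theta1 = solve([[1.0, 0.0], [0.0, 1.0]])
    # Step 2: weighting matrix (m' m / n)^{-1}, where m stacks the moment
    # contributions evaluated at the first-step estimate.
    m = [(z1[i] * (y[i] - theta1 * x[i]), z2[i] * (y[i] - theta1 * x[i]))
         for i in range(n)]
    s11 = sum(r[0] * r[0] for r in m) / n
    s12 = sum(r[0] * r[1] for r in m) / n
    s22 = sum(r[1] * r[1] for r in m) / n
    det = s11 * s22 - s12 * s12
    W = [[s22 / det, -s12 / det], [-s12 / det, s11 / det]]
    return theta1, solve(W)


# Hypothetical data with an endogenous regressor and true theta = 2.
rng = random.Random(1)
n = 5000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]
x = [z1[i] + 0.5 * z2[i] + v[i] for i in range(n)]
y = [2.0 * x[i] + 0.5 * v[i] + rng.gauss(0, 1) for i in range(n)]
theta1, theta2 = two_step_gmm(y, x, z1, z2)
print(round(theta1, 3), round(theta2, 3))
```

In the nonlinear case the closed-form `solve` step has no analogue, which is why a derivative-free search such as Nelder-Mead is run once per weighting matrix.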

Table 15: Monte Carlo Results

Parameter     True Value    Bias      MSE      Reject. Rates (%)
$\beta_0$     −4            0.065     0.998    6.122
$\beta_1$     .75           −0.016    0.022    6.122
$\beta_2$     .75           0.015     0.022    6.122
$\sigma_1$    1             0.053     0.044    5.102
$\sigma_2$    1             0.052     0.043    5.102
$\sigma_3$    1             0.054     0.043    5.102
$\lambda$     .5            0.074     0.018    6.122

The last column tests $H_0: \hat{\theta}_k = \theta_0$ against $H_1: \neg H_0$. The rejection rates are at the 5% level.

Table 15 presents the results for our Monte Carlo exercises, using 100 synthetic data

sets to construct the bias, mean-squared error, and rejection rates. The data generating process yields roughly 75% local zeros and 10% aggregate zeros.
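The three summary statistics can be computed from the replication output as sketched below. The function name, the example numbers, and the use of a two-sided normal-approximation test with critical value 1.96 are assumptions; the text states only that the rejection rates are at the 5% level.

```python
def summarize(estimates, std_errors, true_value, crit=1.96):
    """Bias, mean-squared error, and rejection rate (%) across replications."""
    R = len(estimates)
    bias = sum(estimates) / R - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / R
    # Count replications where the t-test of H0: theta = true_value rejects.
    rejects = sum(abs(e - true_value) / se > crit
                  for e, se in zip(estimates, std_errors))
    return bias, mse, 100.0 * rejects / R


# Hypothetical estimates of lambda (true value .5) from four replications.
bias, mse, reject_rate = summarize([0.4, 0.6, 0.5, 0.9], [0.1] * 4, true_value=0.5)
print(bias, mse, reject_rate)
```

A well-sized test should reject a true null in roughly 5% of replications, which is the benchmark against which the last column of Table 15 is read.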


Figure 12: Histogram of Monte Carlo parameter estimates. Panels: (e) $\beta_0$, (f) $\beta_1$, (g) $\beta_2$, (h) $\sigma_0$, (i) $\sigma_1$, (j) $\sigma_2$, (k) $\lambda$.
