Zoom in iOS Clones: Examining the Impact of Copycat
Apps on Original App Downloads
Beibei Li, Param Vir Singh, Quan Wang
Carnegie Mellon University
beibeili@andrew.cmu.edu, psidhu@andrew.cmu.edu, quanw@andrew.cmu.edu
Nov 2014
Abstract
With the rapid growth of the mobile app market, a large number of mobile app developers are
encouraged to invest in app innovation. However, this growth inevitably invites other developers to imitate
the design and appearance of innovative and original apps. In this paper, we examine the
prevailing copycat app phenomenon and its impact. Using a combination of machine learning
techniques such as natural language processing, latent semantic analysis, network-based
clustering, and image analysis, we are able to detect two types of copycats: deceptive and non-deceptive. Based on the detection results, we conduct econometric analysis to understand the
impact of copycats on the downloads of original apps. Our analysis is validated on a unique dataset
containing detailed information about 10,100 action game apps by 5,141 developers released on
the iOS App Store over five years. Our final results indicate significant heterogeneity in the
interactions between copycats and original apps over time. In particular, our findings suggest that
the copycat apps can be either friends or foes of the original apps. Specifically, high quality
copycats tend to compete with the original app, especially if the copycat apps are non-deceptive.
Interestingly, for low quality copycats, we find a significant and positive effect from the
deceptive copycats on the original app downloads, suggesting a potential positive spillover
effect.
Introduction
Mobile commerce has grown immensely around the world in recent years. Much of this
industry revolution is driven by mobile software applications, or apps. According to app
analytics provider Flurry, the average US consumer spends 2 hours and 42 minutes per
day on mobile devices, 86% of which is spent on mobile apps (Flurry 2014). In terms of
monetary expenditure, Apple announced that its users spent $10 billion in the app store in 2013,
with $1 billion coming in December alone (Bloomberg Businessweek 2014). The tremendous
demand for mobile apps has created significant financial incentives and business opportunities
for app developers. As a result, over 1,000,000 mobile apps have been created and offered on each of
the two leading platforms, Apple iOS and Google Android. A top-10 ranked grossing iOS game
can easily make $47,000 in daily revenue and over $17 million a year (Distimo 2013).
While this enormous economic opportunity attracts an increasing number of mobile developers
to invest in app innovation, it inevitably invites copycat developers to imitate the design and
appearance of original apps. A study by The Guardian in 2012 showed that Apple's App Store
was flooded with copycat gaming applications (The Guardian 2012). Only days after the
original game Flappy Bird hit the market, "Flappy Bird" copycats rocketed up the charts and occupied
four of the top five free apps in the iOS App Store. Today one can easily find dozens of "Flappy Bird"
knock-offs on any major mobile platform (NYTimes 2014). Flappy Bird is not alone in being cloned.
Many successful apps are closely followed by copycats (Technobuffalo 2014). The copycat
phenomenon has become so ubiquitous that the "derivative" works do not just come from smaller
developers; large firms imitate smaller ones as well. Large gaming studios
like Zynga have used their ample engineering resources to imitate smaller games like "Tiny
Tower" (Huffington Post 2013). Large companies also love to parrot other large companies to
maintain feature parity (for an example, see "Instagram" vs. "Vine," Huffington Post 2013).
Several unique features of the mobile app market have contributed to the prevalence of copycat
apps. First of all, the barrier to copycat entry is almost zero. Compared with other creative
industries such as film and TV, the cost of copying is substantially lower and the protection of
intellectual property is weaker in the mobile app market. Second, the success of innovation is
unpredictable, so the cost of innovation is high relative to the expected revenue. Even
large companies and superstar app developers are not guaranteed to generate successful
innovations continuously. Third, original app developers usually lack the resources to
educate consumers about which app is authentic. As a result, brand loyalty in the mobile app market is
weak. In addition to these factors, which potentially displace demand and supply from
original apps to copycats, other notable features of the mobile app market imply a potential shift in the
opposite direction.
The uniqueness of the copycat app phenomenon provides a great opportunity for researchers to
examine various issues such as the economic impact of copycat apps on original apps, the optimal
combating strategy, the trade-off between innovating and imitating, innovation and
growth, the optimal copyright protection policy, etc. However, very little knowledge has been
generated so far due to a few notable challenges. The first challenge comes from the definition of
copycat apps. Imitation can happen along multiple dimensions regarding the functions and
appearance of the app. A copycat app may mimic the authentic app along
functional dimensions, through its packaging strategy, or both. Different degrees of imitation can
imply different types of original-copycat interactions. Second, due to the large number of
existing apps and the budget constraints of small firms, tracking copycat behavior tends to be
too costly for many app developers. As a result, the market has accumulated very little knowledge about
the actual behavior of copycat apps, which makes copycat detection highly exploratory and
unsupervised. The third challenge comes from the lack of theoretical support. The imitation issue
remains understudied in the economics and management literature, while the theories of
counterfeiting are often context dependent. For example, Cho et al. (2013) contrast the
combating strategies for different types of counterfeits. They highlight the trade-offs among
different strategies in different contexts. Therefore, it is crucial to take the unique features of the
mobile app market into consideration when examining the copycat app phenomenon.
Keeping these important managerial and policy questions in mind, we have two major objectives
in this paper. The first objective is the introduction of a functionality-based copycat detection
method that achieves accurate results using publicly available data. This method should be fast
and scalable to accommodate the large number and great variety of available apps. Using this
method to obtain the copycat and original identifiers is highly beneficial to both practitioners and
researchers. For practitioners, they can better monitor the behavior of the followers or the
original innovator and hence respond accordingly. For researchers, detecting copycat apps and
original apps can enable much interesting research and advance the understanding of this
booming market.
Our second objective is to empirically analyze the interplay between original apps and copycat
apps in terms of economic outcomes. In particular, this paper focuses on the potential sales
cannibalization of copycat apps on the original app. We are especially interested in:
- What is the impact of the demand for copycat apps on the demand for original apps? Specifically, under what conditions does the copycat app impede the sales of the original app, and when does it facilitate the sales of the original app?
To achieve these goals, we first draw on the literature on counterfeiting and imitation to formally
define the copycat apps and original apps. We specify two types of copycat apps: deceptive and
non-deceptive, based on the difference in appearance. Then we collect panel data for all the
action gaming apps from the iOS app store and AppAnnie.com for over five years. The data
contains both structured data such as the download rank and characteristics of the apps as well as
unstructured data such as textual and graphical descriptions of the apps. We combine various
machine learning techniques to analyze the unstructured textual and graphical data to detect the
copycat apps. Using surveys on Amazon Mechanical Turk as the benchmark, we evaluate the
accuracy of the proposed detection approach.
We then examine the economic outcomes of copycat apps with panel data and econometric
models. Our results provide solid empirical evidence that the interactions between original apps
and copycat apps are highly heterogeneous. Copycat apps can be either friends or foes of
authentic apps, depending on the type of copycat. The non-deceptive copycats are a special kind
of competitor and will cannibalize the authentic sales. In particular, high quality non-deceptive
copycats can steal the sales that would have gone to the original apps. However, if the copycat is
deceptive, it can have both positive spillover effects and negative cannibalization on the original
sales. In particular, for high quality deceptive copycats, the negative cannibalization dominates.
Interestingly, for low quality deceptive copycats, we find a significant and positive effect from
them on the original app downloads, suggesting a positive spillover effect.
Our answers to these research questions provide useful guidance to both industry managers and
policy makers, and also contribute to the growing academic literature on mobile commerce.
To the best of our knowledge, our paper is the first study to focus on the copycat phenomenon in
the mobile app setting. Our proposed copycat detection approach allows practitioners as
well as researchers to obtain the copycat identifiers and investigate this emerging area.
Applying econometric methodologies, we are able to examine the major economic impacts of the
copycat phenomenon from a causal perspective. Our analysis can be a first step towards
understanding the drivers of technological innovation in the mobile app market.
Related Literature
There are only a handful of studies, emerging in recent years, on the economic and social aspects of
the mobile app market. Carare (2012) measures the ranking effect of app popularity on
consumers' willingness to pay. Garg and Telang (2013) present a method to calibrate the sales-rank relationship for the app market using public data. Liu et al. (2012) examine the impact of
the freemium pricing model on the sales volume and revenue of paid apps. Ghose and Han
(2014) estimate consumer preferences towards attributes of mobile apps in the iOS and Android
markets. However, the extant literature has left the issue of copycat apps largely unexplored.
Our study is related to studies of imitation and innovation. Nelson and Winter (1982)
define innovation, in particular technical innovation, as the implementation of a design for a
new product, or of a new way to produce a product. In other words, an innovator works with an
extremely sparse set of clues about the details to solve a revolutionary problem independently. In
contrast, imitators will borrow heavily from what has been produced, although they can offer
incremental improvement on other aspects of the product, such as user interface and detailed
competences. Previous studies have examined whether imitation works as a driving force for or a
hindrance to innovation, and the results are mixed. Conventional wisdom holds that imitating an
innovation breaks the monopoly power of the original producer so that the price is driven down
to the marginal cost by competition. However, a recent study (Bessen and Maskin 2009) argues
that imitation promotes innovation in certain industries such as software and computers. This is
because the innovation in such high tech industries is both sequential and complementary. In
particular, each successive invention builds on the preceding one. Also, each potential innovator
takes a different research line and thereby enhances the overall probability that a particular goal
is reached. Our definitions of copycat apps and original apps are built upon this stream of
literature.
Our project is also related to the literature on counterfeits – unauthorized copies that infringe the
trademark or copyright of the original product. Several analytical studies in this stream focus on
the optimal combating strategies, but the predictions and implications vary from context to
context. Grossman and Shapiro (1988a) propose a vertical differentiation model to describe the
interaction between the domestic brand name company and a foreign deceptive counterfeiter.
They show that the domestic company may raise or lower the quality to battle counterfeits,
depending on whether the counterfeiters’ entry can be effectively deterred. In another early
paper, Grossman and Shapiro (1988b) study the status goods market where consumers are not
deceived by the foreign counterfeiter. They show that government enforcement and increased
tariff can be effective combating strategies. Cho et al. (2013) contrast the optimal combating
strategies for both deceptive and non-deceptive counterfeits. They conclude that the effectiveness
of the strategies depends on the type of the counterfeiters that the brand-name company faces.
A few empirical studies in marketing have tested the market outcomes of counterfeiting. For
example, Qian (2008) observes that Chinese shoe manufacturers would improve quality, raise
prices, and integrate with downstream distributors to reduce counterfeit sales. In a follow-up
study, Qian (2011) finds that counterfeits have both advertising effects and substitution effects
on original products. She concludes that the advertising effects have a commanding influence for
high-end original product sales while the substitution effects prevail for low-end product sales.
Nevertheless, the copycat issue in the mobile app market differs from the counterfeit issue in
several ways. These differences make it hard to generalize the insights on counterfeits to the
mobile app copycat problem.
Finally, our study is related to the piracy issue of information goods, particularly how piracy has
impacted legitimate sales. The empirical findings are mixed. For instance, Oberholzer-Gee and
Strumpf (2007) find that file sharing has an effect on the legal sales of music that is
statistically indistinguishable from zero. Smith and Telang (2009) found that the availability of
pirated content has no significant effect on post-broadcast DVD sales. While those studies
conclude no effect of piracy and file sharing, Danaher et al. (2014) and several other empirical
studies find that piracy has a significant negative impact on authentic sales. However, piracy
differs from the copycat phenomenon in at least the following aspects. First, the content of the pirated goods is
almost the same as that of the authentic goods, which may not hold in the copycat setting. Second,
consumers are very likely to be aware of the authentic goods, even before the release of the
product, whereas mobile users may never have heard of the original app. Third, the
traditional digital goods are released in multiple channels, while both original and copycat apps
are accessible through the same app store. Therefore, the insight from the piracy studies may not
be generalizable to the mobile app settings.
Research Context and Data
The research context of this study is the iOS app store, a digital distribution platform for mobile
applications developed and maintained by Apple, Inc. Launched as the first mobile app store
ever, the iOS App Store has grown explosively since it opened in July 2008. Starting with 500 apps in
2008, over 1 million apps were available in 24 categories by Dec 2013. More than 30 billion
apps had been downloaded, generating a total of $10 billion in developer revenue over the year
2013 (Apple Press Info 2013).
Apps can be priced either as paid or free. For paid apps, developers can freely choose a
download price at a multiple of $1 minus one cent (e.g., $0.99, $1.99). Free apps are offered to users at no
charge. The store provides three top charts to help users browse apps: the top free download chart,
the top paid download chart, and the top grossing revenue chart. The
top charts help consumers discover the latest popular apps. Meanwhile, they help
developers monitor other apps in the marketplace.
The dataset used in this study is publicly available app information on the U.S. iOS store for
iPhone. It consists of a random sample of 10,100 action game apps by 5,141 developers released
between July 2008 and Dec 2013. We focus on the action game category for two reasons. First,
games contribute the largest share of revenue in the mobile app market (approximately 75% of the
revenue on iOS and around 90% of the revenue on Google Play, according to appannie.com 2014),
and action games are among the largest game genres. Second, action games are among the most
innovative genres on the platform. Many novel and famous mobile games such as "Angry
Birds", "Fruit Ninja", and "Clash of Clans" belong to this genre. Compared with more traditional
genres such as card and casino games, action games exhibit large variation in app
originality. We randomly choose 10,100 apps from the population of 31,159 action game apps.
Accounting for roughly one third of the population, this random sample should be unbiased and representative.
Our data contains cross-sectional descriptions of the app landing pages on the iOS website in
Dec 2013. In addition to numerical characteristics such as price, file size, and consumer rating,
unstructured information such as the image of the app, the textual description of the app, and the
user reviews is also carefully collected. Our data also includes panel data on the daily
download ranks, daily grossing ranks, download prices, and version updates of those apps since
the release of each app. For the rank tables, we observe the top 1,500 apps on the top charts in the
genre of action games. We calibrate the daily download quantity from the daily rank data using
the method proposed by Garg and Telang (2013);[1] a sketch of this calibration appears after this paragraph.
For version updates, we observe the date of each version update. For download prices, we observe
when the price changes and the values before and after the change. We find 16,757 version updates
and 32,523 price changes in our dataset. Finally, our data contains a panel of Google search trends
for app titles on the Web. This data will be used as a control variable in our main analysis.
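For concreteness, the calibration amounts to assuming a log-linear (power-law) relationship between download rank and download quantity; the sketch below illustrates the idea, where the intercept term (scale) is a purely illustrative placeholder rather than a value reported by Garg and Telang (2013).

```python
import numpy as np

def downloads_from_rank(rank, scale=4000.0, shape=-0.9996):
    """Illustrative rank-to-download calibration in the spirit of
    Garg and Telang (2013): log(downloads) is assumed linear in log(rank).

    `scale` is a hypothetical intercept used only for illustration; `shape`
    is the elasticity (our estimate is -0.9996, close to their -0.944).
    """
    rank = np.asarray(rank, dtype=float)
    return scale * rank ** shape

# Example: implied daily downloads at chart ranks 1, 10, and 100.
print(downloads_from_rank([1, 10, 100]))
```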
Table 1 presents the summary statistics for the major variables in the cross-sectional data. We
see that 48% of the apps in our sample are paid apps; the remaining 52% are free apps. The
average download price is $0.78. The consumer rating, which is on an ordinal scale of 1 through 5, has
an average score of 3.43. The age of the app, measured in months since release, has an
average value of 26.60. The game center dummy indicates whether the app is connected to the
iOS Game Center, which lets users play and share games with their friends; 38% of the apps are
connected to the Game Center. 6,918 apps have sibling apps that belong to the same developer,
while 3,184 apps are the only app published by their developers.
Variable | # Obs | Mean | Std. dev. | Min | Max
Download price | 10,100 | 0.7815 | 3.8655 | 0 | 349.99
Paid dummy | 10,100 | 0.4810 | 0.4997 | 0 | 1
App age | 10,100 | 26.5970 | 13.5489 | 1 | 65
Rating | 10,100 | 3.4268 | 1.0041 | 1 | 5
Game center dummy | 10,100 | 0.3823 | 0.4860 | 0 | 1
# Apps by the developer | 10,100 | 7.7479 | 16.4381 | 1 | 113
# Characters in description | 10,100 | 872.2314 | 676.4216 | 0 | 3994
# Screenshots | 10,100 | 3.6557 | 1.7782 | 0 | 6

Table 1. Summary Statistics
[1] Our estimated shape parameter is -0.9996, similar to their reported parameter of -0.944.
Copycat Detection
To distinguish copycat apps from original apps, we propose and verify a functionality-based
detection architecture that is able to provide accurate app identification. Before introducing
our copycat detection strategy, we formally define copycat apps and original apps based on
the literature on innovation and imitation.
Original apps refer to apps whose developers have invested a significant amount of resources to implement
original ideas and create innovative apps. These apps offer unique functionality and gameplay
that are fundamentally different from existing apps. In contrast, copycat apps refer to
apps that have borrowed heavily from one or more existing apps in terms of functionality and
gameplay. However, copycat apps can make modest adaptations to the original apps. It is possible
that copycat apps improve the user experience and app competence instead of simply
replicating the original idea. To investigate the heterogeneity of copycats more deeply,
and consistent with theory (Grossman and Shapiro, 1988a, 1988b), we define two types of
copycats: deceptive and non-deceptive. Deceptive copycats refer to apps that are designed to
deceive consumers through the appearance of the app. When consumers purchase deceptive
copycats, they are likely to believe they have purchased the original. In contrast, non-deceptive
copycats refer to apps that mimic the original app's functionality but maintain a
distinctive appearance. The developers of non-deceptive copycat apps make efforts to
differentiate themselves from the original app, and consumers can easily tell apart the original
and non-deceptive copycat apps.
In our proposed copycat detection framework, a mobile app is modeled as a collection of
different functionalities and appearances. We first partition all apps into collections of apps that
have similar functionality. We then define the original apps and copycat apps in each collection.
We finally determine whether the copycat app is deceptive or non-deceptive.
We achieve our goal by answering three important questions. First, given an app, we need to know
what functionalities it provides. Although the functionality is often explicitly stated in the textual
description of the app on the landing page, the description is usually a short paragraph that may not give a
comprehensive overview of the app's gameplay. However, we notice that the functionality is often
repeatedly mentioned in consumer reviews. Therefore, we conduct text mining on both the
descriptions and the consumer reviews to extract app functionality.
Second, we need to partition the apps based on functionality. Although there are limited
types of functionality in the app market, the number of possible combinations of functionality aspects
can be huge. For example, one app can be specified as "endless running" plus "in-app-purchase"
plus “tilt”, while another similar app has features such as “endless running” plus “touch screen”.
To reduce the dimension of functionality, we conduct latent semantic analysis. Then we cluster
the apps based on the similarity between apps.
Third, we need to determine the level of imitation. Copycat apps that use a name or image
similar to the original app are identified as deceptive; other apps that are
explicitly differentiated from the original apps in appearance are identified as non-deceptive. In the remainder of this section, we discuss in detail how we process the publicly
available but highly unstructured data to detect the different types of copycat apps.
Step 1: Detecting Similarity in App Textual Descriptions and Reviews Using NLP
The main purpose of this step is to first map each individual app to a collection of functional
features, and then measure the app similarity at the feature level. To achieve the goal, we first
obtain the features of each app by processing textual information. We follow Hu and Liu (2004) to
combine user reviews with textual descriptions. Noticing that the user reviews contain noisy
information that is not directly related to app features, we filter the review content with the
following strategy. First, we combine all the descriptions of apps as a bag of words. We perform
text preprocessing, including tokenization, removing stop words, and Part-of-Speech tagging.
Second, we keep the unique nouns and verbs in the bag of words to create the dictionary of app
features. We only keep the nouns and verbs because we believe they are more relevant to app
features than other word categories. Third, given an app, we compute the term weights of the app
features that are included in its preprocessed textual descriptions and the most useful user
reviews. The term weights are calculated using the standard TF-IDF (Salton and McGill 1983)
scheme. By doing so, we map each app to a vector with weighted frequencies of app features.
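As a rough illustration of the preprocessing just described, the sketch below uses NLTK to tokenize text, drop stop words, and keep only nouns and verbs; the example sentence and the specific NLTK resources are illustrative assumptions rather than the exact pipeline used in the paper.

```python
import nltk
from nltk.corpus import stopwords

# One-time downloads of the required NLTK resources (assumed resource names).
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("averaged_perceptron_tagger")

def extract_candidate_features(text):
    """Keep nouns and verbs from an app description, as in Step 1."""
    tokens = [t.lower() for t in nltk.word_tokenize(text) if t.isalpha()]
    tokens = [t for t in tokens if t not in stopwords.words("english")]
    tagged = nltk.pos_tag(tokens)
    # Penn Treebank tags starting with NN (nouns) or VB (verbs).
    return [word for word, tag in tagged if tag.startswith(("NN", "VB"))]

print(extract_candidate_features(
    "Slice fruit with your blade and unlock new arcade modes!"))
```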
However, using TF-IDF alone can be problematic (Aggarwal and Zhai 2012). First, the
dimensionality of the text representation is very large (in our case there are 26,642 unique stem
words for 10,100 documents), but the underlying data is sparse. Second, the TF-IDF scheme
assumes the words are independent of each other; it ignores synonymy, polysemy, and the
underlying correlations between words. To address these potential issues, we conduct latent semantic
analysis (LSA). In particular, we apply the singular value decomposition (SVD) method
(Landauer et al. 1998). SVD is widely used in large-scale data mining to reduce the
dimension of the feature vectors while preserving the similarity structure among them. Hence, we
apply SVD to the TF-IDF vectors.
Finally, we apply the cosine similarity function to calculate the pairwise app similarities. The
cosine similarity is a value between 0 and 1; a larger value indicates that the pair of apps share a
stronger functional similarity based on their textual descriptions.
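A minimal sketch of this similarity computation with standard scikit-learn components is shown below; the toy documents and the number of latent dimensions are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Toy app "documents" (description terms plus filtered review terms).
docs = [
    "bird angry crash level push power slingshot",
    "angry bird crash level slingshot physics",
    "fruit blade slice juice arcade mode",
]

# TF-IDF representation over the feature dictionary.
tfidf = TfidfVectorizer().fit_transform(docs)

# Latent semantic analysis via truncated SVD (dimension chosen for illustration).
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Pairwise cosine similarities between apps in the latent space.
print(cosine_similarity(lsa).round(3))
```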
Step 2: Network-Based Clustering Using Markov Clustering Algorithm
The output of step one can be viewed as an undirected probabilistic graph. In this graph, a node
represents an app, and a nonzero pairwise similarity between two apps defines an undirected arc
whose weight represents how likely the two connected apps are to be functionally the same.
Our goal is therefore to cluster this graph based on the structure of the network. The expected
output of step two is a set of clusters of apps, where apps in the same cluster are very similar in terms
of functionality and gameplay, and apps in different clusters are divergent.
To achieve this goal, we apply a network-based clustering method. It is an unsupervised learning
method that allows us to leverage the network structure to extract groups of similar items. In
particular, we use the Markov clustering algorithm (MCL) to cluster our app network (Dongen
2000). Compared with distance-based clustering algorithms such as k-means (MacQueen 1967)
and hierarchical clustering (Eisen et al. 1998), MCL has a few merits (Satuluri et al. 2010). First,
unlike k-means-based algorithms, which converge to one of numerous local minima, MCL is
insensitive to the initial starting conditions. Second, it does not take a default number of
clusters as input; instead, the algorithm allows the internal structure of the network to determine
the granularity of the clusters. Third, compared with many state-of-the-art network clustering
algorithms, it is more noise-tolerant as well as effective at discovering cluster structure
(Brohee and Helden 2006). The method has been widely applied in bioinformatics (Satuluri et al.
2010) and converges at a speed linear in the size of the matrix (Dongen 2000).
The basic intuition of the algorithm is based on random walk. The probability of visiting a
connected node is proportional to the weight on the arc. In other words, the random walk will
stabilize inside the dense regions of the network after many steps. The stabilized regions shape
the clustering output and reflect the intrinsic structure of the network. Once we extract the
clusters of similar apps, the next step is to distinguish the original apps from the followers. We
use the app release date as our criterion in this study: if an app is the first app released in a
cluster, it is labeled as original; otherwise it is labeled as a copycat app. However, if the original
developer releases several apps in the same cluster, e.g., "Angry Birds" and "Angry Birds Space,"
these differentiated apps are also labeled as original.
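The expansion/inflation iteration at the heart of MCL can be sketched compactly as follows; the inflation parameter, the toy similarity matrix, and the cluster-extraction shortcut are illustrative, and production use would typically rely on an optimized MCL implementation.

```python
import numpy as np

def markov_cluster(sim, inflation=2.0, iters=50):
    """Minimal Markov Clustering (Dongen 2000) sketch.

    sim: symmetric nonnegative similarity matrix (self-loops added here).
    Returns a list of clusters, each a list of node indices.
    """
    m = sim + np.eye(len(sim))            # add self-loops
    m = m / m.sum(axis=0)                 # column-normalize -> transition matrix
    for _ in range(iters):
        m = np.linalg.matrix_power(m, 2)  # expansion: simulate random walk steps
        m = m ** inflation                # inflation: strengthen intra-cluster flow
        m = m / m.sum(axis=0)             # re-normalize columns
    # Rows with non-negligible mass act as attractors; each defines a cluster.
    clusters = []
    for row in np.unique((m > 1e-6).astype(int), axis=0):
        members = list(np.nonzero(row)[0])
        if members:
            clusters.append(members)
    return clusters

# Toy network: apps 0-2 are mutually similar, apps 3-4 form a second group.
sim = np.array([
    [0, .9, .8, 0, 0],
    [.9, 0, .7, 0, 0],
    [.8, .7, 0, 0, .1],
    [0, 0, 0, 0, .9],
    [0, 0, .1, .9, 0],
], dtype=float)
print(markov_cluster(sim))
```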
Step 3a: Detecting Similarity in App Titles Using String Soft Matching
According to theory, there are mainly two types of copycats (e.g., Grossman and Shapiro, 1988a,
1988b), deceptive and non-deceptive, depending on the nature of imitation. In our case, we
identify an app as deceptive if either its title is similar to the original app’s title, or its icon looks
similar to the original app’s icon. On the other hand, we identify the app as non-deceptive if it is
identified as copycat from the previous step (i.e., similar in its textual descriptions), but neither
the title nor the icon is similar to the original app. Under this definition, we can empirically
identify the deceptive copycat apps from the non-deceptive ones by conducting further analyses
to extract the similarity in app titles and icon images. We achieve our goal by conducting two
separate analyses using string soft matching (Step 3a) and image matching analysis (Step 3b).
To extract the similarity in app titles, we apply string soft matching using an edit distance metric
to compare app names. The edit distance between two strings is the minimum number of edit
operations needed to convert one string into the other (Elmagarmid et al. 2007). There are three
kinds of edit operations: insertion, deletion, and replacement. Each edit operation has cost 1. For
each copycat app in a cluster, we compute the pairwise distance between the copycat app name and
the original app name; a smaller distance indicates higher similarity. For normalization, we transform
the computed distance values into a similarity score between 0 and 1. Using a rule-of-thumb cutoff
value of 0.7 (Kim and Lee, 2012), we are able to find the app pairs with similar titles. Hence, we
label a copycat as deceptive if its normalized title similarity to the original app in its cluster is above 0.7.
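A minimal sketch of this title-matching rule is shown below, using a small Levenshtein implementation; the length-based normalization and the example titles are illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and replacements cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # replacement
        prev = cur
    return prev[-1]

def title_similarity(a, b):
    """Map edit distance to a 0-1 similarity score (1 = identical titles)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b)) or 1
    return 1.0 - edit_distance(a, b) / longest

# Label a copycat as deceptive if its title similarity exceeds the 0.7 cutoff.
print(title_similarity("Angry Birds", "Angry Birdz"))  # ~0.91, above 0.7
```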
Step 3b: Detecting Similarity in App Icons Using Image Matching Analysis
To detect the imitation of app icons, we need an image matching algorithm that is invariant to
image scale, rotation, changes in illumination, etc. This is because copycat developers may not
take the exact same image as the original; it is very likely that they rescale, rotate, or add noise
to the original icon. To address this challenge, we employ the Scale-Invariant Feature Transform
(SIFT) algorithm proposed by Lowe (1999). The algorithm is one of the most robust and widely
used image matching algorithms based on local features in the field of computer vision
(Mikolajczyk and Schmid 2005). It extracts a core set of features from an image that reflect the
most important and distinctive information from local regions of the image. After we represent
the image by this core set of features, we can match it with another image, a part of the other
image, or a subset of the core features extracted from the other image. Therefore, SIFT is able to
detect graphically similar patterns between images, even when the images have gone through
structural transformations.
We conduct the matching among all copycats’ icons against the original app’s icon in each
cluster. The SIFT method will compute a matching score which captures the level of similarity
between each copycat and the original app in a cluster. We label a copycat as deceptive if the
image matching score exceeds a threshold. Figure 1 reveals two examples of the matching
results. The first image is the icon of “angry birds”, which is a famous original game. The second
image is the icon of “cut the birds”, which has very similar appearance to “angry birds” but is
produced by an unrelated developer. The SIFT algorithm recognizes them as similar images. The
third image is from “plants vs. zombies”, which is also a featured original game. The last image
is from “cut the zombies,” which looks like the original game but is offered by a different
producer. The image matching algorithm also recognizes them as similar images. Overall, the
image matching process reports 473 authentic-copycat pairs of similar images.
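The icon-matching step can be sketched with OpenCV's SIFT implementation, using the number of ratio-test matches as the similarity score; the file names, the ratio, and the score threshold are illustrative assumptions rather than the exact procedure used in the paper.

```python
import cv2

def icon_match_score(path_a, path_b, ratio=0.75):
    """Count SIFT keypoint matches that pass Lowe's ratio test."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    if img_a is None or img_b is None:
        return 0
    sift = cv2.SIFT_create()
    _, desc_a = sift.detectAndCompute(img_a, None)
    _, desc_b = sift.detectAndCompute(img_b, None)
    if desc_a is None or desc_b is None:
        return 0
    matches = cv2.BFMatcher().knnMatch(desc_a, desc_b, k=2)
    good = [m for m in matches
            if len(m) == 2 and m[0].distance < ratio * m[1].distance]
    return len(good)

# Hypothetical icon files; a pair is flagged as deceptive when the score
# exceeds a chosen threshold (the value 20 here is illustrative).
score = icon_match_score("angry_birds_icon.png", "cut_the_birds_icon.png")
print(score, score > 20)
```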
Figure 1. Examples of Original Icon vs. Deceptive Icon
In summary, we apply an automated approach to detect different types of copycat apps using
several machine learning techniques, including natural language processing, latent semantic
analysis, network-based clustering, string soft matching, and image matching analysis. Table 2
provides an overview of the goal, data, and methods used in each of the steps above.
Goals | Data | Methods
Extract app functional similarity based on textual descriptions and reviews | App textual description, user reviews | Part-of-Speech tagging; TF-IDF; Latent Semantic Analysis (Singular Value Decomposition); Cosine Similarity
Cluster apps based on functional similarity | Derived textual similarity scores | Markov Cluster Algorithm
Identify original apps vs. copycats | Release date, developer ID | ---
Identify deceptive vs. non-deceptive copycats | App title, app icon image | String Soft Matching (Edit Distance); Image Matching Analysis (Scale-Invariant Feature Transform)

Table 2. Summary of Different Methods for Copycat Detection
Main Findings
The feature extraction process is based on 35,996 unique nouns and verbs from the description of
10,100 apps. We report the results of feature extraction for five popular apps in Table 3. The
table shows that most of the extracted features are meaningful and specific to the app content.
The good quality of the feature extraction has laid the foundations for effective copycat
detection.
App Name | Features
Angry Birds | bird, angry, crash, level, push, power, new, up, love
Contract Killer | weapon, contract, mission, killer, gun, crash, graphic, iPad
Despicable Me: Minion Rush | minion, gameloft, Christmas, rush, run, keep, cute, addict
Doodle Jump | doodle, jump, monster, tilt, worth, multiplay, score, rock
Fruit Ninja | fruit, mode, halfbrick, juice, blade, hit, arcade, kid, unlock

Table 3. Examples of Extracted App Features
(1) Evolution of Mobile App Copycats.
From our network-based clustering analysis in Step 2, we acquire 4,080 clusters, among which
1,791 clusters contain more than one app. To explore the evolution of the copycats, we take
snapshots of the 50 largest clusters over time, as shown in Figure 2. From left to right, the
sub-graphs represent the structures of the clusters in (1a) Dec 2009, (1b) Dec 2011, and (1c) Dec
2013, respectively. In these sub-graphs, a node refers to an app and an arc represents a nonzero
similarity score between the two connected apps. Figure 2 indicates that (1) the density of the clusters
has grown rapidly from 2009 to 2013; (2) different clusters grow at different paces; (3) many
recently released apps are not original; and (4) some original apps are more likely to attract copycats
than others.
(2) Release of New (Original) Apps.
To verify the first observation, we plot the number of apps released per month and the time trend
of the percentage of original apps. Our findings are intriguing: although the mobile app market is
growing, with more new apps released over time, the proportion of original apps is in fact
dropping rapidly. For example, Figure 3 shows that the number of action games released per month has increased by
more than 7 times, from fewer than 50 per month in 2008 to about 400 per month in 2013. Nevertheless,
Figure 4 shows that the percentage of original apps among newly released apps has dropped
steadily from over 90% in Dec 2008 to around 45% in late 2013. In other words, for every two
apps released in late 2013, one is a clone of an existing app.
Figure 2. Clusters of Mobile App Copycats Over Time: (1a) Dec 2009, (1b) Dec 2011, (1c) Dec 2013

Figure 3. Count of New App Releases Over Time

Figure 4. Rate of Original App Releases Over Time
(3) Comparison between types of apps.
After the copycat detection process, we obtain the sets of original apps, deceptive copycats, and
non-deceptive copycats. We find a significantly higher number of non-deceptive copycats than
deceptive copycats. In our dataset of over 10,000 apps, 4.84% are deceptive copycats, 36.17% are
non-deceptive copycats, and the remaining 58.98% are original apps.
Moreover, we compare the means of a few cross-sectional variables for the three types of apps in
Table 4. The joint tests for the group means are also reported in the last column. The table shows
that the characteristics of the original apps and copycats are significantly different in various
aspects. First, the estimated daily download and price are highest for the original apps on
average, followed by the non-deceptive copycats, then by the deceptive copycats. Second, the
proportion of paid apps is higher for the original apps than for copycats. Third, the user rating
and the number of screenshots tend to be higher for the copycats than for the original apps.
For the original apps, their characteristics also vary with the number of copycats they have.
Table 5 reports the means of variables for two groups of original apps. The first group has no
copycat apps at all, and the second group has at least one copycat app. The download quantity is
higher on average for the second group of apps. Also, the percentage of paid apps, number of
apps by the same developer, app age, and number of ratings per month are higher for the second
group. Interestingly, the average user rating is lower for the second group than for the first group.
Variable | Original | Deceptive | Non-deceptive | F-test / Chi-square test
Est daily download quantity | 39.0546 | 38.6382 | 38.8713 | 0.0022
Paid dummy | 0.4954 | 0.4867 | 0.4566 | 0.0000
# Apps by developer | 11.8096 | 5.4412 | 6.3134 | 0.0000
Price | 0.8616 | 0.7231 | 0.6357 | 0.0000
Rating | 3.3847 | 3.3916 | 3.4611 | 0.0000
App age | 30.7677 | 23.2638 | 23.3022 | 0.0000
# Ratings per month | 56.0764 | 13.3219 | 40.9430 | 0.3095
# Screenshots | 3.5041 | 3.5910 | 3.6844 | 0.0001
# Characters in description | 969.8589 | 726.7996 | 869.0194 | 0.0000

Table 4. Summary Statistics by App Type
Variable | Original w/o copycats | Original w/ copycats | T-test / Chi-square test
Est daily download quantity | 38.6640 | 39.2928 | 0.0001
Paid dummy | 0.4696 | 0.5116 | 0.0000
Cluster size | 1 | 5.4383 | 0.0000
# Apps by developer | 3.6522 | 11.8096 | 0.0000
Price | 0.7458 | 0.8616 | 0.0000
Rating | 3.4465 | 3.3847 | 0.0000
App age | 25.8847 | 30.7677 | 0.0000
# Ratings per month | 20.6849 | 56.0764 | 0.0047
# Screenshots | 3.8685 | 3.5041 | 0.0000
# Characters in description | 751.1429 | 969.8589 | 0.0000

Table 5. Comparison within Original Apps
External Evaluation of Clustering
The accuracy of the proposed copycat detection method is of vital importance as it will be used
as input to the subsequent economic analysis. Therefore, we carefully evaluate the accuracy of
the proposed method by conducting a survey on Amazon Mechanical Turk. The results of the
survey serve as an external benchmark that can be regarded as a gold standard for evaluation,
allowing us to assess the closeness of our proposed approach to this benchmark
classification. Our final results show that the proposed copycat detection approach is
able to correctly identify whether apps are similar over 91.9% of the time.
Amazon Mechanical Turk is a crowdsourcing web service that coordinates the supply and
demand of tasks that require human intelligence to complete. It is an online labor market where
workers are recruited by requesters for the execution of well-defined simple tasks such as image
tagging, sentiment judgment, and survey completion. In machine learning and related areas, it
has been heavily used for evaluating the performance of unsupervised methods, and the quality
of the work rivals that of highly paid, domain-specific experts (Heer and Bostock 2010).
Specifically, we structure the external evaluation in the following four steps, through which we
are able to decompose the complicated evaluation task into a series of smaller tasks. First, 1,250
pairs of apps are sampled from all possible combinations of apps. Due to the sparsity of
similar pairs, these 1,250 pairs are carefully sampled in a two-step manner; we introduce this
particular sampling strategy subsequently. Second, 250 questionnaires are created and published
on Amazon MTurk. Each questionnaire asks workers to compare 5 distinct pairs of apps based on
name, image, and gameplay, and each questionnaire is answered by 3 independent Amazon MTurk
workers. Third, a majority vote is employed to determine whether the apps are similar and in which
aspects they are similar. The results of the majority vote serve as the gold standard for the external
evaluation. Finally, the quality of the copycat detection method is measured by the commonly
used Rand measure and F-measure (e.g., Larsen and Aone 1999).
As briefly mentioned above, the apps for the MTurk evaluation are sampled carefully in two steps. This
particular sampling strategy is needed because the sparsity of similar app pairs makes a naïve
random sampling strategy unattractive. With sparse similar pairs, very few pairs in a naïve random
sample would be similar pairs, so the accuracy measures would be insensitive to false positives and
false negatives; it is possible for the accuracy measures to look great while the algorithm works
poorly. To solve the issue introduced by sparse similar pairs, we sample the 1,250 pairs such that
the proportion of similar pairs is substantially high. We do so by temporarily treating the
machine learning results as if they were ground truth: we first generate a random similar app for
each app (according to the machine learning similarity), similarly generate a random
unrelated app for each app, and then draw random samples from these two pools to form the 1,250 pairs.
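A minimal sketch of this two-pool sampling idea is given below; the 50/50 split and the toy pools are illustrative assumptions, not the exact proportions used in the evaluation.

```python
import random

random.seed(0)

def sample_evaluation_pairs(similar_pool, dissimilar_pool, n_pairs=1250):
    """Draw MTurk evaluation pairs, oversampling machine-predicted similar
    pairs so the sample is not dominated by trivially dissimilar pairs.
    The 50/50 split below is an illustrative choice."""
    n_similar = n_pairs // 2
    pairs = random.sample(similar_pool, n_similar)
    pairs += random.sample(dissimilar_pool, n_pairs - n_similar)
    random.shuffle(pairs)
    return pairs

# Toy pools; in practice each pool holds one (app, matched app) pair per app,
# built from the machine learning similarity scores.
similar_pool = [(f"app_{i}", f"app_{i + 1}") for i in range(0, 2000, 2)]
dissimilar_pool = [(f"app_{i}", f"app_{i + 1000}") for i in range(1000)]
print(len(sample_evaluation_pairs(similar_pool, dissimilar_pool)))  # 1250
```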
The primary results of the MTurk survey are four counts: true positives (TP), true
negatives (TN), false positives (FP), and false negatives (FN). In our context, TP refers to the
case that both the human evaluation and the copycat detection method report the pair of apps to
be similar. TN refers to the case that both evaluations determine the pair to be very different. FP
refers to the case that the human evaluation reports the apps to be different in all aspects while
the machine learning results show them to be similar. And FN refers to the case that the human
evaluation reports the apps to be similar while the algorithm reports them as different.
The external evaluation supports the accuracy of our proposed copycat detection method. Among
the 1,250 tested pairs, there are 501 true positive (TP) pairs, 648 true negative (TN) pairs, 67 false
positive (FP) pairs, and 34 false negative (FN) pairs. Therefore, the Rand measure of our method
is 0.919, the precision is 0.882, the recall is 0.936, and the F-measure is 0.908. In terms of the
quality of the survey data, we find that (1) 96.9% of the workers answered our pre-test
questionnaire correctly, (2) the demographics of the workers are quite diverse and representative,
and (3) for 90.16% of the pairs, the answers from the three independent workers are all the same.
These results dovetail well with our machine learning output, suggesting that our copycat
detection method can capture the true imitation relationships between apps.
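The reported accuracy figures follow directly from these four counts, as the following quick check shows.

```python
# Confusion counts from the Mechanical Turk evaluation.
tp, tn, fp, fn = 501, 648, 67, 34

rand = (tp + tn) / (tp + tn + fp + fn)   # Rand measure (accuracy over pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)

print(round(rand, 3), round(precision, 3), round(recall, 3), round(f_measure, 3))
# -> 0.919 0.882 0.936 0.908, matching the values reported above.
```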
Model
Consequences of Copycats
In this section, we exploit the panel structure of the data to analyze whether copycat apps
cannibalize the sales of original apps and, if so, under what conditions the cannibalization is
statistically and economically significant.
Sales cannibalization from copycat apps may be particularly concerning in the mobile app
industry, as the copycat apps compete with the original apps by providing a similar product
with almost the same functionality, a lower price, and a good user experience. Consumers who
would otherwise purchase the original product are now attracted by the copycat competitors. In
this view, the original app as the sole seller would have obtained a higher monopoly profit than
in the competitive world. Therefore, to stimulate innovation, policies that protect the original
apps may be justified.
However, the above intuition might be incomplete, because it is possible that the copycat apps
have no effect on original sales, or even stimulate original sales. In the “stimulate sales” view,
being imitated can be a strategic advantage for the original firm under certain conditions. One
such condition is the presence of network externalities in the technology (Conner 1995). In
Conner’s framework, the imitators bring lower-valuing buyers into the market. They collectively
increase the user base of the type of technology. Due to the positive network externality, the
expanded size of user base increases the perceived quality of the product. Consequently, the
innovator’s product becomes more attractive to high- and medium-valuing buyers. When the
benefit of added user base surpasses the sales lost to the clones, the innovator earns higher
payoffs than as a monopolist. In this case, imitation works as a reward for innovation. Moreover,
Conner finds that the returns from imitation depend centrally on the magnitude of the network
externality and the degree of consumer-perceived quality differences between the innovator’s
and the imitator’s products. When the size of the network externality is large, or when the gap
between the perceived quality is large, the benefit from imitation is increased.
In other independent studies, researchers have also found that copying can increase a firm's profit,
lead to better quality products, and increase social welfare even when there are no network effects
and the market is saturated (Jain 2008). This is because copying can lead to reduced price
competition by allowing price-sensitive consumers to buy copies (Gu and Mahajan 2005, Jain
2008).
In the mobile app context, especially for action games, the magnitude of network externality
might not be dominant. However, it is possible that there is a significant spillover effect from the
copycat app, especially if the copycat app is associated with the original app through a similar
appearance. A typical mobile app shopper faces the following decisions: which apps to consider,
and which app she is willing to buy. We argue that the enormous number of available apps, the
nontrivial search cost, and the limited brand effect of the developers, among other factors, make
it very difficult for individual apps to stand out in the consideration stage. However, if the
mobile user observes many similar apps of one type, she may infer that this type of app
has large demand among early buyers. The perceived quality of this type of app goes up.
Consequently, a potential consumer who would have ignored the entire group of apps is now
aware of both the copycat apps and the original ones. Then, in the purchase stage, she can
compare the group of apps and choose the optimal one. As the group of apps receives more
awareness due to imitation, the demand for the original app might go up. In particular, such a spillover
effect should be more salient when the copycat app is more easily associated with the
original app, for example through a similar appearance. Therefore, the deceptive copycat apps are
more likely to have a spillover effect than the non-deceptive copycats. Also, such a spillover effect
should be more significant when the quality of the original app is much better than the quality of
the copycat app; low quality copycat apps are more likely to have a spillover effect than
high quality copycat apps.
To test the above competition and stimulation points of view, we conduct panel data analysis as
follows. For each original app, we take a snapshot of the list of copycat apps and the download
rank as well as the time-varying characteristics, at the end of each day. We focus our analysis on
a monthly basis because we have a long panel of five years, and a month is arguably the most salient
unit of measurement when app developers make decisions. Nevertheless, the main results are
robust when analyzed on a weekly basis.
We denote the log-transformed sum of downloads that original app i receives during month t by
y_it, where t = 1, …, T. Similarly, we denote the log-transformed sum of copycat downloads for
original app i during month t by x_it. A naïve test of sales cannibalization would be to look for a
sequential correlation in download performance, such that the sales of the original app are
correlated with the sales of its copycat apps. This test translates into a regression in which the
dependent variable is y_it, and the independent variables include x_it, the time-varying attributes
of the original app D_it, the time-invariant characteristics λ_i, and the time trend of the market φ_t:

    y_it = α x_it + D_it β_1 + λ_i + φ_t + ε_it,    (1)

where t = 2, …, T.
The time-varying characteristics D_it include the following variables: Log Original Price, the
log-transformed monthly average download price of the original app; Original Version, the monthly
count of new version releases; App Age, the age of the app in the current month; App Age^2, the
square of the app age; Log Developer Download, the log-transformed sum of monthly downloads
of other apps by the same developer; and Dev Version, the count of version updates of other apps by
the same developer. The scalar α and the vector β_1 are parameters to be estimated.
The available data are unlikely to capture every source of heterogeneity across apps. For
example, an original app is different from other original apps in terms of imitability,
functionality, popularity, etc., which are very likely to be correlated with both the demand for
copycat apps and the demand for original apps. Statistically, the absence of such variables
results in omitted variable bias in the estimate of α. Fortunately, the panel structure of the data
allows us to capture correlations between unobserved app features and market demand. In this
model, we have ruled out the time-invariant unobserved heterogeneity across apps, as we
decompose λ_i from the error term using fixed effects. The identification assumption is that the
unobservable heterogeneity of original apps, λ_i, is time invariant. This assumption is plausible in
the mobile app setting because the major functional characteristics of an app are unlikely to
change over the life cycle. Based on this assumption, we report the ordinary least square (OLS)
estimation results of Equation (1) with standard errors clustered at the developer level in Column
(1) of Table 6. The correlation between Log Copycat Apps and Log Original is positive but
statistically insignificant. However, this result can be attributed to the following mechanisms: the
time-varying trend in demand, the selection of copycats, and the net effect of competition and
sales stimulation. Below we discuss the empirical strategy employed to disentangle these
mechanisms.
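For reference, a two-way fixed effects regression along the lines of Equation (1), with standard errors clustered at the developer level, could be estimated roughly as follows using the linearmodels package; the file name and column names are illustrative assumptions about the data layout.

```python
import pandas as pd
from linearmodels.panel import PanelOLS

# Hypothetical long-format panel: one row per original app i and month t;
# the file name and column names are illustrative, not the actual dataset.
panel = pd.read_csv("original_app_panel.csv")
panel["month"] = pd.to_datetime(panel["month"])
panel = panel.set_index(["app_id", "month"])

controls = ["log_original_price", "original_version", "app_age", "app_age_sq",
            "log_developer_download", "dev_version",
            "log_search"]            # Google search trend control (column 2 of Table 6)

mod = PanelOLS(
    dependent=panel["log_original_download"],         # y_it
    exog=panel[["log_copycat_download"] + controls],  # x_it and D_it
    entity_effects=True,                              # lambda_i (app fixed effects)
    time_effects=True,                                # phi_t (month fixed effects)
)
res = mod.fit(cov_type="clustered", clusters=panel["developer_id"])
print(res.summary)
```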
Identification of the Consequence Model
Correlated purchase decisions may also result from time-varying factors among
consumers. In particular, mobile app users are influenced by the marketing mix of the products,
the word-of-mouth of their friends, etc. Consider two original apps that are otherwise identical:
the one that has received more marketing support may still be more desirable.
The marketing mix is likely to change over time, and it is correlated with both the demand for
copycat apps and the demand for original apps. To account for such variation, we decompose the
unobserved time trends into two parts: the trends that affect all apps in the store, and the trends
that affect the specific type of copycat and original apps. The time trends that affect all apps
(such as the holiday season, price changes of the mobile phone, etc.) are already captured in the time
dummy φ_t in the fixed effect model. For the time trends that affect the specific type of apps, we
combine two strategies: using Google Search Trend as a proxy for the unobserved trend, and
finding appropriate instrumental variables for copycat sales. In particular, Google Search
Trend works as an index of the search volume for the title of the original app on the Web.
Although it may not be a perfect measure of time-varying demand shocks, it should be
highly correlated with them. To operationalize this idea, we augment the model by explicitly
including the Google Search Trend. We expect the coefficient α to be less biased after controlling
for the Google Search Trend.
To further capture such time-varying unobserved heterogeneity that drives the changes of both
copycat downloads and original app downloads, we introduce two different types of instrumental
variables for the copycat downloads. A valid instrumental variable should be correlated with the
sales of copycat apps but uncorrelated with the time-varying unobserved error term (which might
be correlated with the original app’s downloads). In the panel data setting, a valid instrumental
variable should also have variations over time. Following the literature of using lagged price as
the instrumental variable for current period price, we use the lagged copycat download as an
instrument for current period download. In particular, we use the lagged terms in three
successive months as the set of instrumental variables. The underlying assumption is that the
lagged period copycat sales are uncorrelated with current period common shocks. However, if
this assumption is violated, we will have underestimated or overestimated coefficients,
depending on the correlation between the unobserved time trend and the original download as
well as the correlation between the unobserved time trend and the copycat download. To address
this concern, we propose a second type of instrumental variable, the average file size of
the copycat apps in the cluster. The file size of copycat apps should be significantly correlated
with the copycat downloads, which is verified in our data. Intuitively, the file size is associated
with copycat demand as it reflects the richness of the content. Moreover, it is reasonable to
assume that the file size of copycat apps is uncorrelated with the time-varying unobserved shocks to
the original downloads (such as the marketing mix, word-of-mouth, etc.). Statistically, with two
different types of instrumental variables, we are able to increase the power of the analysis by
conducting an over-identification test for these instruments.
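A rough sketch of how the instruments can be constructed and used in a two-stage least squares estimation is given below; the column names, the subset of controls, and the within-demeaning shortcut for the fixed effects are illustrative assumptions rather than the paper's exact specification.

```python
import pandas as pd
from linearmodels.iv import IV2SLS

# Hypothetical file and column names; only a subset of controls is used for brevity.
panel = pd.read_csv("original_app_panel.csv").sort_values(["app_id", "month"])

# Instruments: copycat downloads lagged one to three months, plus the average
# file size of the copycat apps in the cluster.
g = panel.groupby("app_id")["log_copycat_download"]
for lag in (1, 2, 3):
    panel[f"lag{lag}_log_copycat_download"] = g.shift(lag)
panel = panel.dropna()

# Within-transform (demean by app) as a shortcut for the app fixed effects;
# the full specification would also absorb month fixed effects.
cols = ["log_original_download", "log_copycat_download", "log_original_price",
        "original_version", "app_age", "log_copycat_filesize",
        "lag1_log_copycat_download", "lag2_log_copycat_download",
        "lag3_log_copycat_download"]
dm = panel.groupby("app_id")[cols].transform(lambda s: s - s.mean())

res = IV2SLS(
    dependent=dm["log_original_download"],
    exog=dm[["log_original_price", "original_version", "app_age"]],
    endog=dm["log_copycat_download"],
    instruments=dm[["lag1_log_copycat_download", "lag2_log_copycat_download",
                    "lag3_log_copycat_download", "log_copycat_filesize"]],
).fit(cov_type="clustered", clusters=panel["developer_id"])
print(res)  # res.first_stage and res.sargan give weak-IV and over-identification diagnostics
```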
Variable | (1) Fixed effect | (2) With Google search trend | (3) With lagged terms as IV | (4) With lagged terms and file size as IV
Log Copycat Apps | 0.0029 (0.0135) | 0.0027 (0.0135) | -0.0165 (0.0166) | -0.0173 (0.0166)
Log Original Price | 0.0139 (0.0110) | 0.0135 (0.0110) | 0.0141 (0.0106) | 0.0141 (0.0106)
Original Version | 0.6624*** (0.0408) | 0.6609*** (0.0406) | 0.5705*** (0.0456) | 0.5705*** (0.0456)
Log Developer Download | 0.2251*** (0.0193) | 0.2248*** (0.0193) | 0.2091*** (0.0201) | 0.2092*** (0.0201)
Dev Version | -0.0221*** (0.0064) | -0.0223*** (0.0064) | -0.0196*** (0.0061) | -0.0196*** (0.0061)
App Age | -0.0277*** (0.0034) | -0.0279*** (0.0034) | -0.0244*** (0.0034) | -0.0244*** (0.0034)
App Age^2 | 0.0002*** (0.0000) | 0.0002*** (0.0000) | 0.0002** (0.0000) | 0.0002** (0.0000)
Log Search | | 0.1103*** (0.0492) | 0.1036** (0.0409) | 0.1037** (0.0409)
Individual Fixed Effect | Yes | Yes | Yes | Yes
Time Fixed Effect | Yes | Yes | Yes | Yes
Adjusted/pseudo-R^2 | 0.2209 | 0.2306 | 0.1302 | 0.1301
Weak Instrument F test | | | 6239.4 | 4106.8
Over-identification J test | | | 3.159 | 7.622
Number of individuals | 3,667 | 3,667 | 3,667 | 3,667
Number of observations | 109,166 | 109,166 | 109,166 | 109,166

Note: Std. Err. in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

Table 6. Overall Effect in the Consequence Model
The results in models (1) to (4) consistently show that the overall effect of copycat app sales on
original sales is statistically insignificant. This implies that the competition point of view is
incomplete in understanding the interplay between copycat apps and original apps. The next
question is: what is the complete view? There are two possibilities: there is no sales
cannibalization by copycat apps, or the insignificant result is the net effect of countervailing forces.
We explore these possibilities by splitting Log Copycat Apps into Log Highly-rated Copycat and
Log Lowly-rated Copycat as two independent variables in the main regression equation. Highly
rated copycat apps refer to imitators that have higher aggregate consumer ratings than the
original app, whereas lowly rated copycat apps refer to imitators that have lower aggregate
consumer ratings than the original app. If the "no effect" view holds, we should expect the
heterogeneity of copycat ratings not to change the regression coefficients. However, if the
insignificant main result is a pooled outcome of conflicting effects, we should expect the coefficients
for these two subgroups of copycats to differ. As consumer ratings can approximate the perceived
quality of apps, we expect that copycat apps with lower ratings are more likely to help the
sales of original apps, while copycat apps with higher ratings are more likely to eat into
the sales of original apps.
Also, we hypothesize that app similarity positively contributes to the helpfulness of copycat apps.
This is because similarity in appearance makes it easy to associate the copycat app with the
original app. Mobile users arriving at a copycat app are more likely to pay attention to, and
engage with, similar apps (which may include the original) at the exploration stage. Therefore, it
is possible that non-deceptive copycat apps have a dominant substitution effect, while deceptive
apps have a dominant positive spillover effect. To examine this heterogeneity in the appearance
of copycats, we break down the sales of copycat apps into sales of deceptive copycat apps and
sales of non-deceptive copycat apps. Lastly, we examine the interaction between deceptiveness
and consumer rating. To operationalize this idea, we split the copycat apps into four subgroups:
highly rated deceptive copycats, lowly rated deceptive copycats, highly rated non-deceptive
copycats, and lowly rated non-deceptive copycats.
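A minimal sketch of how these sub-group download variables could be constructed from a copycat-level panel is shown below; the file and column names are hypothetical, and this is not the authors' data pipeline.

```python
import numpy as np
import pandas as pd

# Hypothetical copycat-level panel: one row per copycat app x month
cc = pd.read_csv("copycat_panel.csv")

# Classify each copycat by rating relative to its original and by deceptiveness
high = cc["copycat_rating"] > cc["original_rating"]
cc["group"] = np.where(high,
                       np.where(cc["deceptive"] == 1, "high_deceptive", "high_nondeceptive"),
                       np.where(cc["deceptive"] == 1, "low_deceptive", "low_nondeceptive"))

# Sum copycat downloads per original app, month, and sub-group, then take log(1 + x)
panel = (cc.groupby(["original_id", "month", "group"])["downloads"]
           .sum()
           .unstack("group", fill_value=0))
log_panel = np.log1p(panel).add_prefix("log_dl_")
print(log_panel.head())
```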
Results
Table 7 Column 1 presents the estimation results when the effects of highly rated and lowly rated
copycats are examined separately. Log High Ratings Copycat has a significant and negative main
effect, which confirms the competition point of view: copycat apps with higher perceived quality
than the original do hurt the demand for the original app. This cannibalization result is analogous
to the finding of Danaher et al. (2014) on the effect of piracy on music sales. The point estimate
of -0.0486 implies that a 10% increase in the sales of relatively high quality copycats results in an
average 0.486% decrease in the original app's sales. More interestingly, Log Low Ratings
Copycat has a significant and positive effect, which supports our hypothesis that copycat apps do
help the original ones under certain conditions. It appears that low quality copycat apps increase
awareness of the group of apps with similar functionality; because the copycat is of relatively
poor quality, consumers prefer the original to the copycat. In particular, a 10% increase in the
sales of relatively low quality copycats results in an average 0.953% increase in the original app's
sales. The R² statistic increases to 14.79% after we split copycat apps by consumer ratings,
compared with 13.01% in Table 6 column 4.
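For readers less familiar with log-log specifications, the percentage interpretation above follows the standard elasticity approximation applied to the reported coefficients:

```latex
% Log-log elasticity reading of the Table 7, Column 1 point estimates
\%\Delta D_{\text{original}} \;\approx\; \hat{\beta}\,\times\,\%\Delta D_{\text{copycat}}
\quad\Longrightarrow\quad
-0.0486 \times 10\% \approx -0.486\%,
\qquad
0.0953 \times 10\% \approx +0.953\%.
```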
The coefficients on the control variables remain similar to those in Table 6. Intuitively, a version
update of the original app is positively associated with the original download, while a version
update of the sibling apps by the same developer is negatively associated with it. The download
of sibling apps is positively associated with the original download, perhaps because
cross-promotion is likely between sibling apps. The download quantity decreases as the app ages,
but the rate of decline slows as the app gets older. Finally, the model shows that the Google
Search Trend has a positive and significant association with download performance.
Table 7 column 2 presents the estimated effects for copycat apps with different levels of
deceptiveness. As expected, non-deceptive copycats tend to compete directly with the original
apps for demand. The point estimate of -0.0561 for Log Non-deceptive Copycat indicates that a
10% increase in the non-deceptive copycats' download quantity results in an average 0.56%
decrease in the original app's download quantity. The coefficient on Log Deceptive Copycat, by
contrast, is positive but statistically insignificant, so the hypothesized spillover effect is not
confirmed at this level of aggregation. This result nevertheless supports our hypothesis that
different types of copycats have different effects on the sales of the original app. In particular, if
the copycat app is non-deceptive, it tends to compete with the original app in all dimensions so
that the substitution effect dominates, whereas deceptive copycat apps may have a mixed effect
or no effect.
To further explore the heterogeneity within deceptive and non-deceptive copycats, we consider
the interaction between copycat quality and copycat type. We thus specify four sub-categories:
high quality deceptive copycats, low quality deceptive copycats, high quality non-deceptive
copycats, and low quality non-deceptive copycats. Consistent with the model in Table 7 column 1,
we define a high quality copycat app as one whose consumer ratings are higher than those of the
corresponding original app, and a low quality copycat app as one whose consumer ratings are
lower. The estimation results are reported in Table 7 Column 3.
Interestingly, we find a significant negative effect, a significant positive effect, and insignificant
effects across the different subgroups of copycat apps. This again suggests that conflicting forces
shape the consequences of different copycats. In particular, high quality non-deceptive copycats
have a negative effect on the original app's sales, mainly due to the substitution effect. This result
is consistent with the hypothesis that high quality apps are more competitive and that
non-deceptive copycat apps are more competitive. However, the negative effect of high quality
deceptive copycats is not statistically different from zero. Again, this result is consistent with the
hypothesis that similarity in appearance can generate spillover demand for the entire group of
similar apps. When the substitution effect is offset by the spillover effect, we would expect high
quality deceptive copycat apps to have almost no net impact.
In contrast, among low quality copycat apps, deceptive copycats are associated with an increase
in the original app's sales. The point estimate suggests that a 10% increase in low quality
deceptive copycat downloads results in an average 0.948% increase in the original app's
downloads. For low quality non-deceptive copycat apps, however, the spillover effect is
statistically insignificant. This result is also consistent with our earlier hypothesis that original
apps are less likely to benefit from copycat apps with a distinct appearance, such as a different
title and icon.
The above analysis highlights the importance of quality differences and deceptiveness in
determining the impact of copycat sales. How can original apps respond to copycat apps in
return? If the developers of original apps want to maximize the benefit of copycat apps and avoid
competition, the original apps should lead the market not only by providing the first app of this
type, but also by offering high quality products. This may require the original apps to improve the
product continuously, especially when many high quality copycat apps are entering the market.
On the other hand, the original apps may not need to worry about the deceptive copycat apps, as
the appearance similarity tends to benefit the original apps rather than hurt them.
                                          (1) Rating      (2) Deceptive-   (3) Rating and
                                          differences     ness             deceptiveness
Log High Ratings Copycat                  -0.0486***
                                          (0.0156)
Log Low Ratings Copycat                   0.0953***
                                          (0.0337)
Log Deceptive Copycat                                     0.0564
                                                          (0.0480)
Log Non-deceptive Copycat                                 -0.0561***
                                                          (0.0167)
Log High Ratings Deceptive Copycat                                         -0.0641
                                                                           (0.0815)
Log Low Ratings Deceptive Copycat                                          0.0948***
                                                                           (0.0345)
Log High Ratings Non-deceptive Copycat                                     -0.0892***
                                                                           (0.0156)
Log Low Ratings Non-deceptive Copycat                                      0.0125
                                                                           (0.0607)
Log Original Price                        0.0160          0.0160           0.0161
                                          (0.0111)        (0.0113)         (0.0111)
Original Version                          0.6518***       0.6696***        0.6516***
                                          (0.0407)        (0.0435)         (0.0407)
Log Developer Download                    0.2140***       0.2353***        0.2140***
                                          (0.0192)        (0.0207)         (0.0191)
Dev Version                               -0.0233***      -0.02110***      -0.0233***
                                          (0.0062)        (0.0065)         (0.0062)
App Age                                   -0.0333***      -0.0301***       -0.0334***
                                          (0.0031)        (0.0034)         (0.0031)
App Age²                                  0.0003***       0.0002***        0.0003**
                                          (0.0000)        (0.0000)         (0.0000)
Log Search                                0.1060**        0.0982**         0.1060**
                                          (0.0436)        (0.0471)         (0.0436)
Individual Fixed Effect                   Yes             Yes              Yes
Time Fixed Effect                         Yes             Yes              Yes
Adjusted/pseudo-R²                        0.1479          0.1217           0.1480
Weak Instrument F test                    5632.4          6532.1           2651.0
Overidentification J test                 7.023           4.586            11.159
Number of individuals                     3,667           3,667            3,667
N                                         109,166         109,166          109,166
Std. Err. in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01
Table 7. Results of Consequence Model
Robustness Checks
One restriction of the panel analysis is that the panel cannot include original apps that have never
been copied by any imitator, because the independent variable would then have no variation.
Ideally, we would like to randomly assign whether an original app is followed by zero or at least
one copycat app, and then compare the demand for original apps with and without the copycat
treatment. However, such an experiment is prohibitively expensive and practically infeasible for
researchers. A compromise strategy is to conduct propensity score matching in a separate
cross-sectional analysis. The goal of the matching process is to generate a matched sample that
mimics a randomized experiment in which whether an original app is followed by copycat apps is
randomly assigned. We then compare the performance of original apps in the treatment group
and the control group. Matching original apps in this way should substantially reduce any
remaining selection bias.
To conduct propensity score matching, we need to identify a set of observable covariates X that
simultaneously influence the treatment decision and app performance. The goal is to balance the
distribution of covariates so that the difference between the two groups can be attributed to the
treatment only. The selection criteria for covariates are that (1) they affect both treatment and
outcome, and (2) they are not affected by the treatment decision or by anticipation of it. In our
context, the qualifying covariates include most of the observable app features: download price,
length of description, number of supported devices, number of screenshots, file size, whether the
app is connected to the game center, the level of the content advisory, and the number of clusters
the developer belongs to. We use a standard logit regression and the radius matching method to
match original apps, and the treatment is defined as whether the original app has been followed
by at least one copycat imitator within the first two years since release. By doing so, we create an
artificial treatment group and an artificial control group with balanced propensity scores (Figure
5). Finally, the treatment effect is estimated as the mean difference between the treatment group
and the control group. We expect the propensity score matching analysis to provide insights
similar to the panel data results. A minimal sketch of this procedure is shown below.
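The following sketch illustrates the two-step procedure under stated assumptions: the file and column names are hypothetical, the caliper value is arbitrary, and this is not the authors' exact implementation.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical cross-section of original apps with a binary treatment flag
apps = pd.read_csv("original_apps.csv")
covariates = ["price", "description_length", "n_devices", "n_screenshots",
              "file_size", "game_center", "content_advisory", "n_dev_clusters"]

# Step 1: logit model of treatment (followed by >= 1 copycat within two years of release)
logit = LogisticRegression(max_iter=1000).fit(apps[covariates], apps["treated"])
apps["pscore"] = logit.predict_proba(apps[covariates])[:, 1]

# Step 2: radius matching -- each treated app is compared with all control apps
# whose propensity score falls within the caliper
caliper = 0.01
treated = apps[apps["treated"] == 1]
control = apps[apps["treated"] == 0]

diffs = []
for _, t in treated.iterrows():
    matched = control[(control["pscore"] - t["pscore"]).abs() <= caliper]
    if len(matched):
        diffs.append(t["monthly_downloads"] - matched["monthly_downloads"].mean())

# Average treatment effect on the treated: mean outcome gap over matched pairs
print(f"ATT: {np.mean(diffs):.1f} downloads per month")
```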
[Figure: distributions of propensity scores for the untreated and treated groups of original apps]
Figure 5. Propensity Score Matching for the Existence of Copycat Apps
The propensity score matching analysis shows that the existence of a copycat app reduces an
original app's downloads by 367 per month on average. When highly rated copycat apps exist, the
average treatment effect is a decrease of 605 downloads per month, whereas for lowly rated
copycat apps it is an increase of 571 downloads per month. Comparing the propensity score
matching results with the results of the main model, we find consistent evidence that the
heterogeneity of copycat apps matters for the direction of the impact.
Finally, in a series of robustness checks, we verify whether the finding of heterogeneous copycat
impact is robust to a set of alternative specifications. First, we alter the shape and scale
parameters in the rank-to-download projection. The result is reported in Table 8 Column 1;
compared with Table 7 Column 3, the results are qualitatively the same. Second, we exclude the
reviews from the copycat detection process and re-run both the copycat detection and the
econometric analysis. Interestingly, the level of clustering granularity changes slightly, while the
direction of the effects remains similar (Table 8 Column 2). Third, we aggregate the data
biweekly instead of monthly. The results are reported in Table 8 Column 3.
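For illustration only, the rank-to-download projection varied in the first check can be written in the generic power-law (Pareto-type) form commonly used in this literature; the scale and shape values below are placeholders, not the calibrated parameters used in the paper.

```python
import numpy as np

def downloads_from_rank(rank, scale=1.0e5, shape=0.9):
    """Project downloads from a top-chart rank via a power law: D = scale * rank**(-shape)."""
    return scale * np.power(rank, -shape)

# Placeholder parameters: higher ranks (worse chart positions) map to fewer downloads
for r in (1, 10, 100, 500):
    print(r, round(downloads_from_rank(r)))
```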
                                          (1) Different    (2) No reviews    (3) Biweekly
                                          download         in copycat        panel
                                          projection       detection
Log High Ratings Deceptive Copycat        -0.0233          -0.0957***        -0.0226
                                          (0.0230)         (0.0246)          (0.0228)
Log Low Ratings Deceptive Copycat         -0.0296          0.0444*           -0.0273
                                          (0.0249)         (0.0213)          (0.0291)
Log High Ratings Non-deceptive Copycat    -0.0590***       -0.0391*          -0.0514***
                                          (0.0067)         (0.0162)          (0.0065)
Log Low Ratings Non-deceptive Copycat     0.0239***        0.0351            0.0448***
                                          (0.0077)         (0.0303)          (0.0077)
Log Original Price                        0.0523***        0.0191            0.0344
                                          (0.0125)         (0.0118)          (0.0209)
Original Version                          0.9320***        0.1800***         0.6598***
                                          (0.0445)         (0.0425)          (0.0430)
Log Developer Download                    0.2396***        0.2050***         0.2294***
                                          (0.0199)         (0.0206)          (0.0204)
Dev Version                               -0.0243***       -0.0126***        -0.0207***
                                          (0.0064)         (0.0039)          (0.0062)
App Age                                   -0.0492***       -0.0043***        -0.0334***
                                          (0.0032)         (0.0012)          (0.0025)
App Age²                                  0.0005***        0.0000            0.0003**
                                          (0.0000)         (0.0000)          (0.0000)
Log Search                                0.0371           0.0199***         0.0425**
                                          (0.0257)         (0.0031)          (0.0156)
Individual Fixed Effect                   Yes              Yes               Yes
Time Fixed Effect                         Yes              Yes               Yes
Adjusted/pseudo-R²                        0.1229           0.2284            0.1340
Weak Instrument F test                    11051.8          144.20            11793.5
Overidentification J test                 10.8             8.28              5.6
Number of individuals                     3,667            3,667             3,667
N                                         109,166          109,166           109,166
Std. Err. in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01
Table 8. Robustness Checks
Conclusion
Although mobile apps now penetrate many aspects of our lives, research on this field remains
scarce. Despite the fact that the copycat issue has been emerging for a while, little is known about
its causes and effects. How are copycat apps defined and identified? What economic impact do
copycats have on the original apps? From the original app's point of view, understanding the
causes and effects of copycats can help identify better combating or accommodating strategies.
From the platform's point of view, the sustainable and healthy development of the app store
depends on an appropriate policy on whether to regulate copycat followers.
This motivates the need for a deeper understanding of the copycat phenomenon in the setting of
mobile apps. One interesting observation is that the degree of imitation varies across different
types of copycats. While some copycat apps retain a distinctive appearance, others closely mimic
the name and look of the original apps. Therefore, non-deceptive copycats are likely to be
differentiated versions of the original product, whereas deceptive copycats are likely to mislead
consumers and free-ride on the popularity of the original app. For a given original app, some
copycats mimic the functionality very well or even improve on the quality and user experience,
whereas others are merely shady imitators.
In this paper, we propose an automatic machine learning approach to identify copycats and
original apps using unstructured textual and graphical information. We verify the accuracy of the
proposed method using Amazon Mechanical Turk as a gold-standard external benchmark. We
find strong evidence that the proportion of copycat apps among new app releases has increased
dramatically in the action game genre over the past five years. When we further divide the
copycat apps into deceptive and non-deceptive, we find that the two groups differ systematically
along multiple dimensions. For example, non-deceptive apps have, on average, higher downloads,
higher consumer ratings, a larger number of consumer ratings, a higher share of free apps, lower
prices, longer textual descriptions, and more screenshots than deceptive ones.
We then conduct an economic analysis of the consequences of copycat apps by analyzing the
potential sales cannibalization. In particular, we examine two countervailing effects that copycat
apps might have. On one hand, copycat apps compete with the original apps by providing similar
functionality, and perhaps a lower priced product. On the other hand, copycat apps can help
increase the awareness of the group of similar apps so that the original apps are considered more
often. Interestingly, we find that both the relative quality and the appearance similarity determine
which effect dominates. Highly rated copycat apps tend to compete with the original ones and
divert consumers to their own products, whereas lowly rated copycat apps tend to stimulate the
sales of the original apps. Similarly, comparing deceptive and non-deceptive copycat apps, we
find that non-deceptive ones have a dominant, significant cannibalization effect on the original
sales. This cannibalization effect comes mainly from high quality non-deceptive copycat apps
rather than low quality non-deceptive ones. Moreover, for deceptive copycat apps, the effect
tends to go in both directions. High quality deceptive copycats are likely to substitute for the
downloads of original apps, but for low quality deceptive copycats we find a positive association
between copycat downloads and original downloads. This once again supports the possibility of a
positive spillover effect of copycat sales.
Our paper also has some limitations, which could act as fruitful areas for future research. First,
although we have considered the causes and effects of copycat apps, we have not explored the
optimal combating strategy or the overall welfare impact of copycat apps. These questions are
also important to the original developers and the platform owners. Second, we have not answered
whether imitation hinders or encourages subsequent innovation in this market; exploring the
dynamic relationship between imitation and innovation would be worthwhile. Third, ideally we
would use the actual download quantities for the analysis. Due to data limitations on actual
download quantities, however, we calibrate the download quantity from the download rank and
therefore have to impose a few assumptions, such as that the overall market size of the action
game market is constant over time, which may not hold in reality. Despite these limitations, our
paper is the first study to combine machine learning with economic analysis to examine the
behavior of copycats and original apps in the context of mobile apps, which helps both
practitioners and researchers better understand this rapidly growing industry. We hope our work
can pave the way for future research in this important area.
References
Aggarwal, C., Zhai, C., 2012. A survey of text clustering algorithms. Mining Text Data. 77-128.
Appannie.com, 2014, App Annie index – market Q1 2014: revenue soars in the United States
and China, http://blog.appannie.com/app-annie-index-market-q1-2014/
Appfreak, 2014, The ultimate guide to App Store optimization in 2014,
http://www.appfreak.net/ultimate-guide-app-store-optimization-2014/
Apple Press info, 2013, App Store sales top $10 billion in 2013,
http://www.apple.com/pr/library/2014/01/07App-Store-Sales-Top-10-Billion-in
2013.html?sr=hotnews.rss
Apptamin, 2013, App Store Optimization (ASO): improve your app description,
http://www.apptamin.com/blog/app-store-optimization-app-description/
Angrist,J., and Krueger., A. 2001. Instrumental variables and the search for identification: from
supply and demand to natural experiments. Journal of Economic Perspectives, 15(4): 69-85.
Archak, N., Ghose, A., Ipeirotis, P., 2011. Deriving the pricing power of product features by
mining consumer reviews. Management Science. 57(8). 1485-1509.
Bessen. J. and Maskin E. 2009. Sequential innovation, patents, and imitation. RAND Journal of
Economics. Vol 40, No. 4. Pp. 611-635.
Biais, B. and E. Perotti. 2008, Entrepreneurs and new ideas, RAND Journal of Economics, Vol.
39, No. 4, Winter 2008 pp. 1105-1125.
Bloomberg Business, 2014, Apple Users Spent $10 Billion on Apps in 2013,
http://www.businessweek.com/articles/2014-01-07/apple-users-spent-10-billion-on-apps-in-2013
Bluecloud, 2013, App Store Optimization 101: the ultimate checklist,
http://www.bluecloudsolutions.com/articles/app-store-optimization-101-ultimate-checklist/
Brohee, S., Helden, J. 2006. Evaluation of clustering algorithms for protein-protein interaction
networks. BMC Bioinformatics. 7.
Burtch. G., Ghose. A., Wattal. S. 2013. An empirical examination of the antecedents and
consequences of contribution patterns in crowd-funded markets. 24(3). 499-519.
Carare, O. 2012. The impact of bestseller rank on demand: evidence from the app market.
International Economic Review. 53(3), 717-742.
Castro, J., D. Balkin, and D. Shepherd., 2008, Can entrepreneurial firms benefit from product
piracy? Journal of Business Venturing, 23 (1): 75-90.
Cho, S., Fang, X., Tayur, S. 2013. Combating strategic counterfeiters in licit and illicit supply
chains. Working paper.
Conner. K., 1995. Obtaining Strategic Advantage from Being Imitated: When Can Encouraging
“Clones” Pay? Management Science. 41(2): 209-225.
Danaher, B., Smith, M., Telang, R., Chen, S., 2014. The effect of graduated response anti-piracy
laws on music sales: evidence from an event study in France. Journal of Industrial
Economics. Forthcoming.
Distimo, 2013. What Is Needed For Top Positions In The App Stores?
http://www.distimo.com/publications
Dongen, S., 2000. Graph clustering by flow simulation. PhD thesis. University of Utrecht.
Eisen, M. B., Spellman, P. T., Brown, P. O. and Botstein, D. 1998. Cluster analysis and display
of genome-wide expression patterns. Proceedings of the National Academy of Sciences, 95,
14863–14868.
Elmagarmid, A., Ipeirotis, P., Verykios, V., 2007. Duplicate record detection: a survey. IEEE
Transactions on Knowledge and Data Engineering. 19(1).
Flurry, 2014, Apps Solidify Leadership Six Years into the Mobile Revolution,
http://www.flurry.com/bid/109749/Apps-Solidify-Leadership-Six-Years-into-the-Mobile-Revolution#.VCuYffkapjc.
Garg, R., Telang, R. 2013. Estimating app demand from publicly available data. MIS Quarterly.
37(4). 1253-1264.
Gartner 2013. Gartner says mobile app stores will see annual downloads reach 102 billion in
2013, http://www.gartner.com/newsroom/id/2592315.
Ghose, A., Ipeirotis, P., Li, B., 2012. Designing ranking systems for hotels on travel search
engines by mining user-generated and crowdsourced content. Marketing Science. 31(3). 493-520.
Ghose, A., Han, S. 2014. Estimating demand for mobile applications in the new economy.
Management Science, Forthcoming.
Gosline, R. 2010. Counterfeit labels: good for luxury brands? Forbes, 2/12/2010.
Grossman, G., Shapiro, C., 1988a. Counterfeit-product Trade. American Economic Review.
78(1).59-75.
Grossman, G., Shapiro, C., 1988b. Foreign counterfeiting of status goods. The Quarterly Journal
of Economics. 103(1).79-100.
Gu, B., V. Mahajan. 2005. How much anti-piracy effort is too much? A study of the global
software industry. Working paper.
Hausman, J. A. 1996. Valuation of new goods under perfect and imperfect competition, in T. F.
Bresnahan and R. Gordon, eds., The Economics of New Goods, Studies in Income and
Wealth. 58. NBER.
Heer, J. and Bostock, M. 2010. Crowdsourcing Graphical Perception: Using Mechanical Turk to
Assess Visualization Design. Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems. pp. 203-212.
Hu, M., and Liu, B. 2004. “Mining and summarizing customer reviews,” Proceedings of the
tenth ACM SIGKDD international conference on knowledge discovery and data mining:
ACM, pp. 168-177.
Huffington Post, 2013. The Evolution Of An Underground Copycat App Environment.
http://www.huffingtonpost.com/himanshu-sareen/post_5236_b_3647228.html
Jain. S. 2008. Digital Piracy: A Competitive Analysis. Marketing Science. 27(4): 610-626.
Kim, J., Lee, H., 2012. Efficient exact similarity searches using multiple token orderings. IEEE
Landauer, T., Laham, D., Foltz, P., 1998. Learning human-like knowledge by singular value
decomposition: a progress report. Advances in Neural Information Processing Systems.
Larsen, B., and Aone, C. 1999. “Fast and effective text mining using linear-time document
clustering”. Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge
Discover and Data Mining. Pp. 16-22.
Liu, C., Au, Y., Choi, H. 2012. An empirical study of the freemium strategy for mobile apps:
evidence from the Google Play market. ICIS.
Lowe, D., 1999. Object recognition from local scale-invariant features. The Proceedings of the
Seventh IEEE International Conference on Computer Vision. Vol2. 1150-1157.
MacQueen, J., 1967. Some methods for classification and analysis of multivariate observations.
Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability.
University of California Press. Pp 281-297.
Mikolajczyk. K., Schmid. C., 2005. A performance evaluation of local descriptors. IEEE
Transactions on Pattern Analysis and Machine Intelligence. Vol 27. 1615-1630.
Nelson and Winter. 2009. An Evolutionary Theory of Economic Change. Harvard University
Press.
Nguyen 2013. Evolving competitive dynamics of the global mobile telecommunication industry
in 2012 and beyond. Stanford business case.
NYTimes. 2014. Flappy Bird copycats keep on flapping.
http://bits.blogs.nytimes.com/2014/02/24/flappy-bird-copycats-keep-on-flapping/?_php=true&_type=blogs&_r=0
Oberholzer-Gee, F., Strumpf, K. 2007 The effect of file sharing on record sales: an empirical
analysis. Journal of Political Economy. 115(1). 1-42.
PandoDaily, 2014, Why copycats are the best thing to happen to your company,
http://pando.com/2014/02/19/why-copycats-are-the-best-thing-to-happen-to-your-company
Qian, Y. 2008. Impacts of entry by counterfeiters. The Quarterly Journal of Economics.
123(4).1577-1609.
Qian, Y. 2011. Counterfeiters: foes or friends? How do counterfeits affect different product
quality tiers? Working paper.
Satuluri, V., Parthasarathy. S., Ucar. D., 2010. Markov clustering of protein interaction networks
with improved balance and scalability. ACM-BCB. pp 247-256.
Siegfried. J., Evans. L, 1994. Empirical studies of entry and exit: a survey of the evidence.
Review of Industrial Organization. 9(2). 121-155.
Smith. M., Telang. R., 2009. Competing with free: the impact of movie broadcasts on DVD sales
and internet piracy. MIS Quarterly 33(2). 321-338.
TheGuardian, 2012. Should Apple take more action against march of the iOS clones?
http://www.theguardian.com/technology/appsblog/2012/feb/03/apps-apple?newsfeed=true
Technobuffalo, 2014, This is the biggest problem with mobile gaming today.
http://www.technobuffalo.com/2014/04/23/biggest-problem-with-mobile-gaming/
Villas-Boas, J., Winer, R. 1999. Endogeneity in brand choice models. Management Science.
45(10). Pp 1324-1338.