Corporate search:
Exploration and exploitation in multiple markets
Thorbjørn Knudsen & Nils Stieglitz
Strategic Organization Design
Department of Marketing & Management
University of Southern Denmark
nst@sam.sdu.dk
Abstract
A firm that operates in more than one industry or market is engaged in corporate search when it
collects information to consider trade-offs between new and existing market opportunities. Corporate
search is a continuum spanned by the extreme poles of exploration and exploitation. Little is known
about the conditions that allow firms to benefit from exploration of multiple markets as opposed to
refinement of an existing business serving a particular market. This paper develops a modelling
structure that provides useful insights about this important problem. Our results suggest that
organizational commitments hold the key to benefiting from corporate exploration by gradually
forcing a shift from broad search among multiple markets to narrow search for improvement within
a particular market. We show how organizational commitments can be operationalized as either
financial constraints or time constraints.
Keywords: corporate search, exploration and exploitation, NK model
1. Introduction
How do firms benefit from exploring multiple markets? The canonical example of this problem is
corporate search. A firm that operates in more than one industry or market is engaged in corporate
search when it collects information to consider trade-offs between new and existing market
opportunities. Corporate search is a continuum spanned by the extreme poles of exploration and
exploitation. Corporate exploration involves searching for business opportunities among multiple
markets or industries. Its contrast, corporate exploitation, is the refinement of an existing business
serving a particular market. Little is known about the conditions for successful corporate search. This
is because prior work has primarily focussed on search in single markets.
Prior work has advanced our knowledge about the relation between organizational search and
performance at the business unit level. We now have a fairly good understanding of the way
organizations must design search so it leads to superior outcomes in a single task environment. The
extant work on organizational search is thereby useful for analysis of business units that operate in a
single industry or market. It has spawned many detailed insights for this situation (see Sorensen 2002;
Gupta et al. 2006 for reviews) as well as the general insight that organizational search can be
characterized as trade-offs between mechanisms that are variance enhancing (exploration) and
mechanisms that are mean enhancing (exploitation).
The exploration-exploitation trade-off has been identified as one of the most important
challenges for organizational search (March 1991). This trade-off exists because the mean enhancing
mechanisms associated with exploitation reinforce positive, proximate, and predictable rewards
while the variance enhancing mechanisms that characterize exploration reinforce more uncertain,
distant and often negative rewards (Levinthal & March 1993; Denrell & March 2001; Benner &
Tushman 2003; He & Wong 2004). Exploitation ignores possible distant gains while exploration
ignores immediate incremental gains. Prior formal models have provided detailed insights on the
exploration-exploitation trade-off that correspond to a single business unit, operating in a single
market (e.g. March 1991; Carley & Svoboda 1996; Siggelkow & Levinthal 2003; Siggelkow &
Rivkin 2005). An important finding is the advantage of increasing the level of variance in alternative
generation when many policy attributes are interdependent (Rivkin & Siggelkow 2003; Knudsen &
Levinthal 2007). As further explained in the following section’s review of the literature, search at
the corporate level adds the challenge of balancing exploration and exploitation across multiple
markets.
As far as we are aware this challenge has not yet been addressed in the literature on
organizational search (cf. Gupta et al. 2006). There are no workable models that facilitate systematic
analysis of the trade-offs that occur when firms operate in more than one industry or market. This is
important because most firms in the economy are active in more than one market (Montgomery 1994;
Villalonga 2004). While prior formal models on organizational search have primarily focussed on
single-unit businesses, this type of firm accounts for only about 20% of the number of publicly traded
firms in the US economy (Villalonga 2004). We therefore lack a valid model of organizational search
for about 80% of public firms in the economy. That is, corporate search is an important unexplored
topic in the literature on organizational search. We aim to take a first step in the direction of a
systematic analysis of this topic.
A defining feature of corporate search is the interrelatedness of search at two distinct levels of
organization. Firms both search for new markets and for improvements that can strengthen
competitive positions in existing markets (e.g. Tripsas 1997; Winter 2000; Bhardwaj et al. 2006).
These two levels of organizational search are interrelated, as Levinthal and March (1993: 100-101,
their emphasis) point out, “an organization learns which market to enter and how to function
effectively in several alternative markets.” Learning about a new market also reveals valuable
information on how to compete in it. What is more, search at one level may substitute for search at
another level. A firm could settle in a less attractive industry, but nevertheless stake out a superior
competitive position that allows it to earn above-average rents (Rumelt 1991; Porter & McGahan
1997). In contrast, it is quite possible to fail in an attractive industry. These issues cannot be addressed
by confining the analysis to just one level of organizational search and adaptation. Rather, we need to
consider two distinct levels of search – within and among markets – in order to understand corporate
search in multiple markets.
Even though most firms are active in multiple markets, it is not clear how firms can design
search processes so they benefit from exploring multiple markets. In order to fill this gap, we offer a
platform for analysis of search in multiple markets that combines two well-established models from
the organization literature. We draw on the NK model of adaptation on rugged landscapes (e.g.
Levinthal 1997; Fleming & Sorensen 2001; Gavetti 2005) to characterize search within a particular
market. Our modeling choice reflects a certain caution in that we simply extend prior NK-models by
considering choice among multiple performance landscapes. We use well-understood
characterizations of search within landscapes. The added ingredient, then, is a characterization of
search between landscapes. The obvious choice here is to model search among markets as an instance
of the multi-armed bandit problem (e.g. March 1991; March 2003; Fang & Levinthal 2009). This
class of model is well-understood and ideally suited for trade-offs between possible improvement
processes in multiple markets. That is, the problem of corporate search is a specific version of the
more generic problem of searching through multiple spaces, or markets, each of which offers distinct
improvement paths.
Our study provides insights that can help understand the relation between corporate search
and corporate performance. We identify conditions for successful corporate search and show that
consideration of multiple markets, each of which offers specific opportunities, requires multi-stage
search procedures. We find that the idea of identifying and maintaining a fixed balance between
corporate exploration (search between markets) and corporate exploitation (search within markets) is
misplaced. At the very least, corporate search requires a two-stage policy where exploration is
followed by exploitation, a procedure that seems intuitive and consistent with available evidence
(Winter & Szulanski 2001; Burgelman 2002; Siggelkow & Levinthal 2003). But when should the
firm shift to exploitation mode? If the manager can compute an optimal sampling strategy, the answer
is given, but this is rarely the case. We identify a hard budget constraint as a much more realistic
alternative that achieves similar results. The answer, then, to solving the exploration-exploitation
trade-off at the corporate level lies in commitments that ensure a shift to exploitation after a sufficient
period of exploration. We identify conditions that ensure the exploration period is sufficiently long
and show that results do not vary dramatically when managers do not get it exactly right. Put
differently, our findings show how corporate search can be designed so that managers avoid two
well-known option traps (Adner & Levinthal 2004). The first is the tendency to search for new markets
when things are not going well, rather than improving things in the existing market (failure trap). The
second is the tendency to improve things in the existing market rather than abandon a market with
little potential (success trap).
2. Corporate search: Exploration and exploitation in multiple markets
While the majority of firms in the economy perform corporate search where they consider exploration
and exploitation in multiple markets, prior work has mostly been limited to the base-case of a single
business unit searching for improvements in a single task environment, or market. A large body of
empirical and theoretical literature has studied the exploration and exploitation trade-off, yet prior
research has not systematically considered organizational search on multiple levels. Most theoretical
studies have focussed on just a single level of organizational search (e.g. Carley & Svoboda 1996;
Levinthal 1997; Siggelkow & Rivkin 2003; Fang & Levinthal 2008). The more challenging problem
of searching multiple performance landscapes is now beginning to attract attention. Gavetti (2005)
studied organizational search in the context of a multi-divisional firm. After an initial period of
activity in a single market, an organization continues to operate in the original market, but also enters
a new industry. He finds that knowledge spillovers from the old to the new business unit may be
beneficial when market characteristics are similar. Likewise, Gavetti, Levinthal & Rivkin (2005)
invoke multiple markets to examine the role of managerial analogies for strategy-making. An
important result is that a well-informed analogy is a powerful guide in extending search from a known
market context to a new one. However, these contributions do not examine how firms search for and
select markets, and how search among and within markets interact with each other.1 Our aim is to use
a similar multi-level perspective to understand corporate search for new markets.
1 A notable exception is Holmqvist’s (2004) case study of exploration and exploitation in the context of product
development. He documented that exploration and exploitation must be understood at multiple levels of
organization (within and between firms that are partners in an alliance).
While we lack a workable model of corporate search, the literature on corporate strategy is,
by definition, concerned with multiple business units that are located in distinct markets. The choice
of a firm’s market is a central problem of corporate strategy. Research on corporate strategy has
extensively studied the motivation for entering new markets (e.g. Montgomery 1994), the rate and
direction of diversification (e.g. Mosakowski 1997) as well as the organizational structure of
diversified companies (e.g. Chandler 1962; Eisenmann & Bower 2000). But only very limited
attention has been directed to the problem of how firms actually search for and decide whether to
enter new markets. From a managerial perspective this problem is quite challenging because the value
of leveraged resources in a new market is hard to assess (Denrell, Fang & Winter 2003). Mosakowski
(1997) and Matsusaka (2001) analyze how a firm engages in experimental diversification to identify
new markets. An example illustrates the point. In a case study of DuPont’s diversification strategy,
Bhardwaj, Camillus and Hounshell (2006) report how the company explored various markets like
synthetic organic chemicals, inorganic chemicals, motor cars, and many more. Each market offered
numerous specific business possibilities that could be pursued further. For example, within synthetic
organic chemicals, DuPont engaged in the development of dyes, drugs, food preservatives, and
perfumes. Over a period of 20 years, the company considered a vast array of business possibilities in
many markets, and then gradually focused corporate resources on just a few of them.
Searching in multiple markets is challenged by the difficulty of balancing exploration and
exploitation at the corporate level, a difficulty rooted in the way feedback from the two search
activities becomes antagonistic. Exploitation of a market is associated with positive, proximate, and
predictable rewards that reinforce refinement and lead organizations into a competency or success trap
(Denrell & March 2001; Benner & Tushman 2003; He & Wong 2004). In contrast, the more
uncertain, distant and often negative rewards from exploring new markets tend to lead firms into a
failure trap where negative feedback reinforces a drive towards further experimentation and change
(Levinthal & March 1993). Our concern is how such pathologies relating to corporate search may be
cured. On the one hand, a company needs to ensure attractive markets are identified by corporate
exploration. This requirement calls for serious assessment of many new markets, as illustrated in the
DuPont example. On the other hand, excessive corporate exploration may undercut the identification,
refinement, and strengthening of valuable business possibilities within a market. Our model captures
the essential features of corporate search in multiple markets.
We identify three logically distinct approaches to the design of trade-offs between higher-level
search (market choice) and lower-level search (business refinement). The first approach commits
the decision-maker to a fixed balance between corporate exploration and exploitation under the
assumption that such a balance exists and can be identified. The second strategy splits the search
process into two distinct stages. In the first stage, the firm engages in corporate exploration where it
gains information about opportunities in a number of new markets. In the second stage, it then
switches to exploitation and focuses on improving its position within a single market. A two-stage
approach is based on the premise that it is useless to strike a fixed balance between exploitation and
exploration. Instead, it assumes that it is roughly possible to identify the point in time where the
firm should switch from exploration to exploitation. The third approach assumes even less knowledge
about the search problem. It simply allocates a fixed budget to search and decreases search effort as
the budget is depleted. This dynamic adjustment approach allows for gradually adjusting the balance
based on performance feedback and the remaining budget for corporate search. The decision-maker
gradually homes in on an attractive market, progressively shifting the balance toward corporate
exploitation. The following section explains how these three versions of corporate search are
represented in our model of search in multiple markets.
3. A model of search in multiple markets
Our model structure has three building blocks. The first establishes the nature of the task environment
that a firm operates in, i.e. the market or industry; the second specifies how agents search within a
particular task-environment. The third building block establishes the strategies for searching among
markets. Table 1 gives an overview of the parameters of the model. Below we elaborate on each
building block and associated parameter specifications.
Building block          Parameter  Description                                   Parameter values
Generation of task      d          Number of markets                             4
environments (markets)  Ni         Number of policy attributes within task       1a) Ni = 10
                                   environment i                                 1b) Ni = 50
                                                                                 2b) Ni = 50
                        Ki         Number of interactions among policy           1a) Ki = 0, 1, 5, 9
                                   attributes in task environment i              1b) Ki = 0, 1, 25, 49
                                                                                 2b) Ki = 25, 25, 25, 25
Search within a market  Broad_S    Agents engage in local (Broad_S = 0) or       0, 1
                                   broad search (Broad_S = 1)
Search among markets    τ          Propensity to explore among markets           0.005, 0.1, 0.5, 1
                        α          Tolerance level in optimal sampling           0.2, 0.5
                        δ          Confidence level in optimal sampling          0.05, 0.2
                        β          Adjustment rate in dynamic adjustment         0.5, 1, 5
                        B          Financial budget                              50
                        C          Entry costs                                   0, 0.1, 0.2
Table 1: Overview of model parameters
Nature of task environments
We generate a set of performance landscapes that represent separate task environments. Each task
environment can be thought of as a particular market. Task environments may differ in the number of
relevant policy attributes, complexity and mean performance.2 A task environment is characterized as
a space of alternatives. In each task environment, an alternative consists of N binary attributes that
represent choices regarding the business strategy within that market. The policy attributes may relate
to sourcing, production, sales, support functions, etc. (e.g. Rivkin 2001). Each attribute can take on
two states, so there are 2^N different business policy configurations in each landscape. The
performance values of each of the N attributes are determined by random draws from a uniform
distribution over the unit interval. The overall performance of a policy configuration in one landscape
is the average of the values assigned to each of the N attributes.3
2 The specification of more than one market therefore allows for a more penetrating analysis of exploration and
exploitation as landscapes may differ in the global optimum. Even if agents have reached the global optimum in
one landscape, they may still be outperformed by agents in landscapes with a higher mean performance. This
essential property of exploration cannot be captured by studying just one landscape.
The attributes of a policy configuration may be more or less interdependent. Attributes are
interdependent if the value of each of the N individual attributes depends on both the state of that
attribute itself and the states of K other attributes. If K = 0, attributes are independent. As K increases,
more and more attributes of a configuration become interdependent, with K = N – 1 being the case of
interdependence among all attributes. The number of interdependencies given by K determines the
surface of the performance landscape. With higher values of K, there are more local peaks and
performance differences among neighboring configurations differing only in a single policy attribute
become relatively more pronounced (Kauffman 1993).
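As a minimal sketch of how such a task environment could be generated, the Python fragment below builds a single NK landscape with random interdependencies. The function name make_nk_landscape, the seed argument, and the lazy drawing of contribution values (so that landscapes with N = 50 stay tractable) are illustrative assumptions; the fragment is not a description of the code used for the reported simulations.

```python
import random


def make_nk_landscape(n, k, seed=None):
    """Return a function that maps a tuple of n binary attributes to V(A)."""
    rng = random.Random(seed)
    # For each attribute i, choose k other attributes at random that affect it.
    neighbours = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    # Contribution values are drawn lazily, the first time a pattern is seen,
    # so that large landscapes (e.g. n = 50) never need a full lookup table.
    tables = [dict() for _ in range(n)]

    def performance(config):
        total = 0.0
        for i in range(n):
            key = (config[i],) + tuple(config[j] for j in neighbours[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()  # uniform draw from the unit interval
            total += tables[i][key]
        return total / n  # average over the n attribute contributions

    return performance
```

Calling make_nk_landscape(10, 5, seed=1) would give one market of the kind listed under specification 1a) in Table 1. Generating d such functions with different seeds gives a set of markets that differ in their peaks.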
Search within a market
In their search for a viable business strategy within a market, firms may either incrementally adapt
policy configurations or try to more radically alter current positions. We associate the former with
local search, the latter with broad search.4
Since the properties of local search are well-known (Levinthal 1997), we use it as our baseline
for search within a landscape. With local search, agents revise policy configurations by changing a
randomly chosen single attribute and examining the outcome.5 If the result is improvement of
performance, the revision is implemented; if not, it is rejected. In addition to local search, we analyze
a combination of local and distant search that we refer to as broad search. When firms are engaged in
broad search, they widen their search perimeter to include more distant configurations when local
search no longer uncovers performance improvement. This captures the idea of problemistic search in
the behavioral theory of the firm (Cyert & March 1963). Initially, agents engage in local search by
changing a single, randomly chosen attribute. If an agent’s performance does not increase, the agent
broadens the scope of search by incrementally increasing the number of attributes that are changed.6
An agent thus begins to consider progressively more distant alternatives within a task environment
when local search fails to improve performance. If a better configuration is found, the agent reverts
to local search. This combination of local and broad search corresponds to empirical
observations of search behavior (Fleming & Sorensen 2004; Bhardwaj et al. 2006).
3 More formally, we specify alternatives within a particular market as consisting of N policy attributes, a_1, …, a_N.
For simplicity, it is assumed that each attribute can take on two states. A performance landscape is a mapping of
any possible vector of policy attributes A = (a_1, a_2, …, a_N) to performance values V(A). The value of each
individual policy attribute a_i is affected by both the state of that attribute itself and the states of a number of
other attributes a_{-i}. Denote the value of attribute a_i by v_i(a_i, a_{-i}). For each generated landscape, the particular
value of an attribute, v_i, is determined by drawing randomly from a uniform distribution over the unit interval.
Each landscape is seeded differently. The value of a given set of alternatives A is then given by
V(A) = [v_1(a_1, a_{-1}) + v_2(a_2, a_{-2}) + … + v_N(a_N, a_{-N})]/N. The identity of a_{-i}, i.e., the set of
attributes that affect each attribute a_i, is
given by the interaction structure of the firm’s decision problem (i.e. the variable K). We assume random
interdependencies among attributes. Since our studies include rather large performance landscapes (N = 50), we
report raw values from the NK runs rather than normalized results.
4 Prior research has studied broader search within a market as a sequence of distant and local search (e.g.
Siggelkow & Levinthal 2003). The performance properties of this combination are well-known, and
implementing a similar approach would increase the complexity and parameter space of our model.
5 We limit our study to perfect evaluation and do not consider the elaborations introduced by Knudsen &
Levinthal (2007) relating to imperfect evaluation.
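The two search modes within a market can be sketched roughly as follows. This is a simplified illustration, not the simulation code: the function names are hypothetical, and the sketch widens the search scope after a single failed local step. It also uses one fixed increment for the mutation probability rather than the multi-pass schedule (0.1, then 0.05, then 0.03, and so on) described in footnote 6. The performance argument can be any function that maps a tuple of binary attributes to a number.

```python
import random


def local_step(config, performance, rng):
    """Flip one randomly chosen attribute; keep the change only if it improves performance."""
    i = rng.randrange(len(config))
    candidate = config[:i] + (1 - config[i],) + config[i + 1:]
    if performance(candidate) > performance(config):
        return candidate, True
    return config, False


def broad_step(config, performance, rng, p_mutate):
    """Flip each attribute with probability p_mutate; keep the change only if it improves performance."""
    candidate = tuple(1 - a if rng.random() < p_mutate else a for a in config)
    if performance(candidate) > performance(config):
        return candidate, True
    return config, False


def search_within_market(config, performance, steps, broad=False, increment=0.1, seed=None):
    """Run local search (broad=False) or problemistic broad search (broad=True)."""
    rng = random.Random(seed)
    p_mutate = 0.0  # 0.0 stands for plain local search
    for _ in range(steps):
        if p_mutate == 0.0:
            config, improved = local_step(config, performance, rng)
        else:
            config, improved = broad_step(config, performance, rng, p_mutate)
        if improved:
            p_mutate = 0.0  # revert to local search after any improvement
        elif broad:
            p_mutate = min(1.0, p_mutate + increment)  # widen the search scope
    return config
```

Because a candidate is adopted only when it strictly improves performance, the performance of the returned configuration can never be lower than that of the starting configuration, in either mode.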
Search among markets
Faced with multiple markets, agents need to decide on whether they want to explore a new market or
focus search on an existing market. Essentially, they need to identify the market with the highest
expected performance and refine the policy configuration within that market.7 Thus, the conflicting
demands of exploring markets and of improving an existing business compete for corporate
resources. We capture this trade-off by limiting organizational search to a fixed time period and a
financial budget. The time constraint captures how agents allocate search efforts. For example, if an
agent decides to search for opportunities in market X, it foregoes searching within market Y. The
financial budget represents resources for explorative search (Mosakowski 1997; Adner & Levinthal
2004). Agents invest a non-recoverable cost out of the budget B every time they (re-)enter and sample
a market.
6 Specifically, broad search is initially very coarse-grained and then becomes more fine-grained. The scope of
search is slowly widened as long as performance does not increase. In the first pass of broad search, the
probability that an attribute is mutated is initially set at 0.1 (the threshold for useful mutations when N=10). If
performance does not increase, the search scope is then increased by increments of 0.1 until it reaches the limit
of 1. In the next pass, the probability that an attribute is mutated is initially set at 0.05. It is then increased by
increments of 0.05 until it reaches the limit of 1. In the third pass, the incremental mutation probability is set to
0.03, etc. (with 4 passes, this procedure has reached the threshold of 0.02 for useful mutations when N=50).
Broad search thereby captures three salient features of problemistic search (Cyert & March 1963, p. 121-22).
First, search is motivated by the lack of performance increases. Second, it is simple-minded. If local search does
not increase performance, the organization uses increasingly distant search. Third, search is biased, as it is
constrained by the present location within a landscape.
7 Our conceptualization of exploration and exploitation is grounded in decision theory and the adaptive systems
approach, which often consider the N-armed bandit problem as the canonical representation of the exploration
and exploitation trade-off (Robbins 1952; Holland 1975; March 2003). A gambler with a finite budget faces the
problem of maximizing her total payoffs. When pulled, each lever of the bandit provides a reward drawn from a
distribution associated with that specific lever. Initially, the gambler has no knowledge about the levers, but
through repeated trials (exploration), she can identify and focus on the most rewarding lever (exploitation).
We assume that an agent samples one market in each time step. Initial configurations in a
market are randomly determined while subsequent refinements benefit from prior search. That is,
whenever an agent enters a new market, the initial configuration is randomly determined. And when
an agent returns from an excursion to another market, local or broad search is resumed from the
configuration that was used before departure from this market.
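The bookkeeping implied by these assumptions can be illustrated with a short sketch. The class name MarketPortfolio and its methods are hypothetical. The sketch further assumes that the entry cost C is charged only when the agent moves to a market other than the one sampled in the previous time step, and that a move the remaining budget cannot cover is simply not made. Neither detail is fixed by the description above.

```python
import random


class MarketPortfolio:
    """Tracks the budget and the last configuration used in each market."""

    def __init__(self, n_attributes, budget, entry_cost, seed=None):
        self.n = n_attributes
        self.budget = budget
        self.entry_cost = entry_cost
        self.rng = random.Random(seed)
        self.configs = {}    # market index -> configuration to resume from
        self.current = None  # market sampled in the previous time step

    def enter(self, market):
        """Move to a market and return the configuration from which search starts."""
        if market != self.current:
            if self.budget < self.entry_cost:
                return None  # assumption: an unaffordable move is not made
            self.budget -= self.entry_cost  # non-recoverable cost of (re-)entry
            self.current = market
        if market not in self.configs:
            # First visit: the initial configuration is drawn at random.
            self.configs[market] = tuple(self.rng.randint(0, 1) for _ in range(self.n))
        return self.configs[market]

    def store(self, market, config):
        """Remember the refined configuration so that search can resume from it later."""
        self.configs[market] = config
```

With B = 50 and C = 0.1, for example, an agent that samples markets 0, 1, and 0 in turn pays the entry cost three times and resumes in market 0 from whatever configuration it had stored there.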
The mean performance of each market is initially unknown, but agents are able to form
subjective beliefs about the relative performance of markets (Luce & Raiffa 1957). An agent forms
beliefs on the basis of knowledge generated through prior search within a market. To model this
relationship between past search efforts and the formation of beliefs, we turn to research on
reinforcement learning (e.g. Lave & March 1975; Denrell & March 2001). Agents sample a market
by engaging in the search modes described above. How much they sample a particular market
depends on the performance feedback in the past. Agents spend more time in markets they, rightly or
wrongly, find attractive. Sampling a more attractive market therefore gets reinforced over time. The
choice of a market is also influenced by how much weight an agent puts on corporate exploration, i.e.
searching for opportunities in new markets. We use the Softmax algorithm of reinforcement learning
to capture the relationship between performance feedback, sampling, and the propensity to explore
markets.
The Softmax algorithm, attributed to Luce (1959), provides a straightforward way to model
the formation of beliefs (Sutton & Barto 1998; Denrell & Le Mens 2007; Fang & Levinthal 2009).8
The Softmax algorithm makes the probability of sampling a particular market i at time step t
dependent on the observed mean performance x_i of market i, the observed mean performances of all d
markets, and the propensity for corporate exploration τ:
$$ p_{i,t} = \frac{e^{x_i/\tau}}{\sum_{j=1}^{d} e^{x_j/\tau}} \qquad (1) $$
The parameter τ influences the degree to which an agent adheres to prior beliefs. A lower value of τ
increases the probability that an agent selects a market believed to be the most attractive. Corporate
exploration is curbed as the agent places a much higher emphasis on exploitation of a particular
market. A higher τ downplays the role of prior beliefs and increases the probability that an agent
samples and explores a different market. Hence, we label this parameter the propensity to explore.
Note that agents continually update the estimates of mean performances. The more an agent explores
a market, the better the estimates of mean performances and of the relative attractiveness of markets.9
8 Empirical findings in psychology and neuroscience offer some evidence that the repeated choices of human
agents in uncertain environments correspond to this algorithm (Yechiam & Busemeyer 2005; Daw et al. 2006).
In the Softmax algorithm, the parameter τ encapsulates the propensity to engage in corporate
exploration among markets. The critical question then is how agents set τ. As previously explained,
we specify three alternative approaches to the determination of τ, each representing a different
balancing of corporate exploration and exploitation. The first approach aims to achieve a fixed
balance of corporate exploration and exploitation. The second approach considers a more
sophisticated two-stage sequence of corporate exploration and exploitation, with the length of the
exploration period depending on the number of possible markets. The third approach is inspired by
models of adaptive search and allows agents to continually adapt the balance between corporate
exploration and exploitation.
9 A numerical example might help to illustrate the Softmax algorithm. Assume that a decision-maker has to
choose among three markets. Based on prior search efforts in each market, the agent estimates the means of the
three markets, x_i, as 0.4, 0.5, and 0.6. The decision-maker has set τ to 1, corresponding to a fairly high
propensity to engage in corporate exploration. The (rounded) probabilities of sampling a specific market in the
next period are 0.30, 0.33, and 0.37 respectively. In contrast, if the agent sets τ to 0.1 (low propensity to
explore), the probabilities are 0.09, 0.24, and 0.67. The probability of choosing the most attractive market is
therefore much higher with a low τ. Suppose the agent samples the second market, leading to a new estimate of
the mean of 0.52. In the next time step, based on the new information, the probabilities of selecting the markets
are 0.30, 0.34, and 0.36 (with τ = 1) and 0.09, 0.28, and 0.63 (with τ = 0.1).
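The sampling rule in equation (1) can be written in a few lines of Python, which also makes it easy to recompute the numerical example in footnote 9. The function name softmax_probabilities is an illustrative choice. Subtracting the largest estimate before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged; it is not part of the model itself.

```python
import math


def softmax_probabilities(mean_estimates, tau):
    """Probability of sampling each market, given estimated means and propensity to explore tau."""
    # Shifting by the maximum keeps the exponentials well-behaved for small tau
    # and cancels out in the ratio, so the probabilities match equation (1).
    top = max(mean_estimates)
    weights = [math.exp((x - top) / tau) for x in mean_estimates]
    total = sum(weights)
    return [w / total for w in weights]


# Estimated market means from the numerical example, for a high and a low tau.
for tau in (1.0, 0.1):
    probs = softmax_probabilities([0.4, 0.5, 0.6], tau)
    print(tau, [round(p, 2) for p in probs])
```

Lowering τ concentrates probability on the market with the highest estimated mean, and raising it spreads probability more evenly across the markets.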
Fixed balance of corporate exploration and exploitation
When corporate search proceeds on the basis of a fixed balance between exploration and exploitation,
decision makers aim to identify and maintain the right weight between mean-enhancing and
variance-enhancing search. We also refer to this search as one-stage balancing of exploration and exploitation,
because decision makers once and for all set a parameter (propensity to explore τ) that determines a
fixed balance between exploration and exploitation. The higher this parameter, the more the balance is
pushed towards corporate exploration – agents spread search efforts across multiple markets instead
of focusing on one. The lower the propensity to explore, the more the balance is pushed towards
corporate exploitation – search within markets dominates search among markets.
Two-stage sequencing of corporate exploration and exploitation
With two-stage sequencing of corporate exploration and exploitation, the propensity to explore (τ)
determines the mode of search. When τ is set to high values, the search mode is exploration while low
values of τ are associated with exploitation (see Table 1 for specification of actual values). Two-stage
search is a process where the corporation switches from exploration of multiple markets to
exploitation of a single market. We capture this by switching from a high to a low value of τ. The
decision maker first takes a number of samples in exploration mode (high τ) and then firmly shifts to
exploitation mode (low τ). In exploration mode, τ is set so high that search among landscapes
dominates search within landscapes. By contrast, in exploitation mode, the propensity to explore is set
so low that search within landscapes dominates search among landscapes.
The effective sequencing of corporate exploration and exploitation depends largely on the
number of markets to be searched. As the number of prospective markets increases, more samples
need to be taken before the estimates become reliable. The length of the exploration period is
determined by the optimal sample size. Following Even-Dar et al. (2002), the α-optimal sample size S
with probability 1 – δ is given by
$$S = \frac{d}{\alpha^{2}} \log\left(\frac{d}{\delta}\right), \qquad (2)$$
with d the number of performance landscapes. The parameter α represents the tolerance of the agent,
and δ the confidence level. The two parameters, α and δ, are the agent’s choice parameters. If a more
precise and reliable estimate is desired, the agent must spend more time on corporate exploration. The
sample size S determines the time step at which an agent switches from the exploration stage (high τ)
to the exploitation stage (low τ). In essence, agents build up a broad stock of knowledge about the task
environments (corporate exploration with high τ) and then proceed to refine this knowledge (corporate
exploitation with low τ).
How exactly is the sample size S determined? Agents choose a tolerance level α and a
confidence parameter δ. The tolerance level captures the acceptable deviation from the best possible
market. With a tolerance of α = 0.20, decision makers accept a market whose performance is up to 20%
below that of the (unknown) best possible market. The parameter δ captures the risk of accepting a bad
alternative that falls below the threshold set by the tolerance level, so the confidence level is 1 – δ.
With δ = 0.05 (a confidence level of 0.95), decision makers will accept a bad alternative with
probability 0.05. That is, with a tolerance of α = 0.20, 5% of the accepted alternatives will have
performance that deviates more than 20% from the best possible.
These two sensitivity parameters (α and δ) and the number of performance landscapes to be searched
(d) jointly determine the total number of samples S that the decision maker allocates to corporate
exploration. The lower the tolerance level α, the lower δ (that is, the higher the confidence level
1 – δ), and the more landscapes d, the larger the sample size.
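The sample-size rule and the resulting switch between stages can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the logarithm is assumed to be natural (this reproduces the 438 samples reported in the Results section), and the τ values 1 and 0.005 are the ones used there.

```python
import math

def optimal_sample_size(d, alpha, delta):
    """Equation (2): S = d / alpha^2 * log(d / delta), natural logarithm assumed."""
    return d / alpha ** 2 * math.log(d / delta)

def two_stage_tau(t, switch_step, tau_explore=1.0, tau_exploit=0.005):
    """Propensity to explore at time step t: high before the switch, low afterwards."""
    return tau_explore if t < switch_step else tau_exploit

S = optimal_sample_size(d=4, alpha=0.2, delta=0.05)
switch_step = int(S)  # truncation is an assumption; the text gives no rounding rule
print(switch_step, two_stage_tau(0, switch_step), two_stage_tau(switch_step, switch_step))
```

Tightening the tolerance is costly because α enters quadratically, whereas δ and the second occurrence of d enter only through the logarithm.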
Dynamic adjustment of corporate exploration and exploitation
Alternatively, firms may dynamically adjust the balance of corporate exploration and exploitation
over time. In actual practice, there are a number of reasons why firms may prefer such a strategy. The
optimal sample size may be unrealistically high from a practical perspective, or the number of
alternatives may be unknown. In addition, agents may benefit from continually updating the balance
instead of sticking to a fixed sequence of exploration and exploitation. We consider a simple
behavioral rule that agents use to adjust the propensity to explore τ in the Softmax algorithm. The
behavioral rule is inspired by models of adaptive organizational search (Levinthal & March 1981;
Greve 2003). It allows agents to adapt the balance between exploration and exploitation based on
performance feedback. They decrease exploration when they have been successful in the past
(cumulated performance W) and increase exploration if they have a large budget B:
$$\tau = \beta \, \frac{B}{W}. \qquad (3)$$
As the budget gets depleted, the propensity to explore τ is reduced and corporate exploration
is gradually toned down. The parameter β is a scale-parameter that tempers the exploration phase. A
higher value of β gives more pronounced corporate exploration, i.e. firms search more among
markets. Thus, a high β induces a dynamic that mimics a two-stage sequencing of exploration and
exploitation. With lower values of β, the initial stage of exploration will endure longer, but be less
pronounced. As β approaches zero, agents become trapped in the market to which they
were initially assigned at random.
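The behavioral rule in equation (3) is simple enough to state in a few lines. The sketch below is illustrative only: the function name and the cap `tau_max`, used when no performance has been accumulated yet (W = 0), are assumptions that the text does not specify.

```python
def dynamic_tau(beta, budget, cumulated_performance, tau_max=1e6):
    """Equation (3): tau = beta * (B / W)."""
    if cumulated_performance <= 0:
        # W = 0 before any payoff has been earned; capping tau is an assumption.
        return tau_max
    return beta * budget / cumulated_performance

print(dynamic_tau(1.0, 50.0, 10.0))   # large budget, little accumulated performance
print(dynamic_tau(1.0, 25.0, 100.0))  # depleted budget, more accumulated performance
print(dynamic_tau(0.1, 50.0, 10.0))   # same state as the first call, smaller beta
```

Both forces work in the same direction over time: B can only fall and W typically grows, so τ declines without the agent having to choose a switching point in advance.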
4. Results
For the simulation runs, we have restricted our analysis to four markets (d = 4) that a firm may locate
in.¹⁰ We study both relatively simple task environments with few important policy attributes (N = 10)
and more challenging markets with a large number of important policy attributes (N = 50). In markets
with few policy attributes, the manager may choose among 4096 alternatives distributed across the
four landscapes (1024 alternatives in each). In markets with many policy attributes, managers face the
daunting task of considering about 4.5 × 10^15 (4 × 2^50) alternatives. We consider two principal
scenarios. In the first scenario, the markets differ only in terms of complexity (the K parameter).¹¹ For
simple task environments (N = 10), we examine complexity levels with K = 0, 1, 5, 9. For markets with
many policy attributes (N = 50), we use values of K = 0, 1, 25, 49. In the second scenario, the markets
differ in mean performance, while complexity is the same across markets.¹² We consider markets of
medium complexity, with K = 5 (for N = 10) and K = 25 (for N = 50). Each market is randomly
seeded. The results report the average of 100 simulation runs, each with 100 individual agents and
1000 time steps. Unless specified otherwise, we assume a budget of 50 and entry costs of 0.1.¹³
¹⁰ Our results may be extrapolated to any number of markets by adjusting the sample size in the two-stage
sequencing according to equation (2). For dynamic adjustment, the effect can be gauged by reference to
equation (3).
¹¹ The comparison of markets differing in complexity is attractive, since performance landscapes systematically
differ depending on the value of the K parameter (Kauffman 1993). With K = 0, only a single performance peak
exists that corresponds to the global optimum. As K increases, the number of local peaks increases
exponentially. The mean performance of local peaks is highest for low levels of complexity. However,
maximally complex markets (K = N - 1) contain the highest performing peak.
¹² The spread in mean performances among landscapes is implemented by adding a fixed value (0.1, 0.2, 0.3,
respectively) to the raw fitness value generated by the standard NK model.
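The setup above can be made concrete with a minimal sketch of an NK landscape. It follows the standard construction (performance is the average of N contributions, each depending on one attribute and K others) and should not be read as the authors' code: the random choice of interaction partners, the lazily filled contribution tables, the `shift` argument for the fixed mean offset, and all function names are assumptions.

```python
import itertools
import random

def make_nk_landscape(n, k, shift=0.0, seed=0):
    """Return a fitness function for a random NK landscape with binary attributes."""
    rng = random.Random(seed)
    # Each attribute interacts with k other attributes chosen at random (assumption).
    neighbors = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    tables = [dict() for _ in range(n)]  # contribution tables, filled on first use

    def fitness(config):
        total = 0.0
        for i in range(n):
            key = (config[i],) + tuple(config[j] for j in neighbors[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()  # contribution drawn uniformly from [0, 1)
            total += tables[i][key]
        return total / n + shift  # shift adds a fixed value to the raw fitness
    return fitness

def local_search(fitness, config):
    """Flip one attribute at a time, keeping a flip only if performance improves."""
    config = list(config)
    improved = True
    while improved:
        improved = False
        for i in range(len(config)):
            candidate = config.copy()
            candidate[i] = 1 - candidate[i]
            if fitness(candidate) > fitness(config):
                config, improved = candidate, True
    return config

print(4 * 2 ** 10, 4 * 2 ** 50)  # number of alternatives for N = 10 and N = 50
simple = make_nk_landscape(n=10, k=0, seed=1)
peak = local_search(simple, [0] * 10)
best = max(simple(c) for c in itertools.product([0, 1], repeat=10))
print(simple(peak) == best)  # with K = 0, local search ends at the single peak
```

Filling the tables lazily keeps the sketch usable for N = 50 and K = 49, where a full contribution table would be far too large to store.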
We study the effectiveness of the three approaches to corporate exploration outlined in the
modeling section for these two scenarios. We allow for both local and broad search within a market.
We first compare a fixed balance with the two-stage sequencing for markets that differ in complexity.
We then evaluate the effectiveness of the dynamic adjustment of corporate exploration for the same
scenario. Having established the basic properties of the three approaches, we proceed to analyze them
for the second scenario in which markets differ in mean performance.
Fixed balance and two-stage sequencing compared
We examine four representative strategies with a fixed balance between corporate exploration and
exploitation. The first is pushed towards the extreme pole of exploitation with parameter τ set to a
very low value of 0.005. In the second, τ is set to 1, so that exploration dominates. The remaining
strategies with a fixed balance fall between these two extremes (τ = 0.1 and 0.5). In two-stage search,
the propensity to explore is again set to 1 during the exploration stage and then, after a total of S
samples, the agent shifts to corporate exploitation by sharply reducing the propensity to explore to
0.005. The values of α (tolerance level) and δ (confidence level) were set to 0.2 and 0.05, respectively.
That is, the agent attempts to locate in a market that has at most 20 percent less performance than the
best possible. The probability of accepting a market that does not meet this threshold is just 5 percent
at the end of the exploration stage. For optimal sampling (equation 2), the agent therefore allocates
438 samples to corporate exploration and then switches to exploitation within the chosen market
(sensitivity and robustness are discussed below).
¹³ The absolute value of these parameters is unimportant. The entry costs set a scale for the budget, so it is the
calibration of entry costs relative to the budget that matters. The actual values were chosen on the basis of
comprehensive additional simulations. These simulations also showed that our results are not a knife-edge
property. They hold for a wide range of parameter values.
[Figure 1 plot: mean performance (vertical axis, 0.5 to 0.8) over time (horizontal axis, 0 to 1000). Series: 1-stage exploitation with broad search; 1-stage exploration with broad search; 2-stage sequence with broad search; 2-stage sequence with local search.]
Figure 1: Two-stage search compared to one-stage search for markets with few policy attributes
(d = 4, N = 10, K = 0, 1, 5, 9, B = 50, entry cost = 0.1, α = 0.2, δ = 0.05, τ = 0.005, 1)
Figure 1 compares the average performance of one-stage search and two-stage search for
environments with few policy attributes (N = 10). Pure exploitation in one-stage search is
characterized by rapid initial performance increases. Firms that exclusively focus on corporate
exploitation tend to become stuck in the first market they decided to locate in and then
single-mindedly focus on improving their policy configuration. However, after the initial period of rapid
improvements, gains from single-minded exploitation level off. In contrast, with pure exploration in
one-stage search, performance improves steadily as agents slowly focus search efforts on more
attractive markets. Explorative search starts to exhaust itself after 650 time steps, as agents deplete
their budget and are forced to firmly locate in one market. The result is a small increase in mean
performance, since agents are then able to locate in more attractive markets. The average performance
of the two intermediate one-stage search strategies (τ = 0.1 and 0.5, not reported in Figure 1) falls
between these two extremes. Overall, a fixed balance pushed toward exploitation shows better
performance than one-stage search favoring more exploration. Furthermore, the model exhibits the
common bias towards exploitation (March 1991), as the initial performance gains from exploitation
within a market are much higher than gains from exploration.
The two-stage search strategy produces a much smaller initial improvement than one-stage
search in the exploitation mode. However, after the exploration stage has been concluded, the shift
towards exploitation of one market is associated with a substantial increase in performance. This
dramatic gain is achieved because of the knowledge acquired in the exploration stage about the
relative attractiveness of markets. In stage two, agents use this knowledge to locate in the market
perceived to be the most attractive one. Within that market, they start from the policy configuration
they have so far identified as the best.
When the refinement of a configuration in the two-stage strategy is achieved by local search,
most of the gains are already realized in the exploration stage. After settling in the most attractive
market, there are no further gains in performance during the exploitation stage. If agents are capable
of broad search, performance continues to increase in the exploitation stage. What is striking is that
the two-stage model, even if it is limited to local search, clearly outperforms the one-stage search
strategies based on a combination of local and distant search. More generally, Figure 1 demonstrates
that the two-stage sequencing of exploration and exploitation clearly dominates a fixed balance in
terms of both cumulated and achieved mean performance at the end of the simulation run.
The same general result holds for markets with many policy attributes (N = 50). The relative
advantages of two-stage search become even more pronounced (Figure 2). A perhaps surprising result
is that broad search within a market decreases performance when compared to local search.¹⁴ That is,
the ability to search more distant configurations has a negative impact on the effectiveness of search
within a market where a firm has to get many policy configurations right. The reasons for this are
two-fold. First, as markets differ with respect to complexity (interdependencies among policy
attributes), the majority of agents locate in simple landscapes (K = 0). In that case, local search leads
agents towards the global optimum. As can be readily observed in Figure 2, mean performance
continues to increase slowly at the end of the simulation run. Second, the number of possible policy
configurations is so huge that it takes a long time to reach a local peak even in a complex landscape
(K > 0). Agents must go through many variations of the policy configuration before getting stuck.
¹⁴ We do not report the performance properties of the fixed balance strategies with local search. They are
slightly lower with local search vis-à-vis the same fixed balance with broad search and significantly lower
vis-à-vis the two-stage sequencing with local search. The results are available from the authors upon request.
[Figure 2 plot: mean performance (vertical axis, 0.5 to 0.75) over time (horizontal axis, 0 to 1000). Series: 1-stage exploitation with broad search; 1-stage exploration with broad search; 2-stage sequence with broad search; 2-stage sequence with local search.]
Figure 2: Two-stage search compared to one-stage search for markets with many policy
attributes (d = 4, N = 50, K = 0, 1, 25, 49, B = 50, entry cost = 0.1, α = 0.2, δ = 0.05, τ = 0.005, 1)
Broad search, on the other hand, distracts agents. They start to broaden search as soon as local
search does not increase performance. This does not, however, mean that they already have reached a
local peak. In markets where firms must consider a large number of policy configurations, broad
search leads agents astray as the probability of finding a more distant configuration with higher
performance is substantially lower than that of finding one in the proximate neighborhood early in the
search process. Agents engaged in broader search thus spend more time searching for a
performance-increasing configuration than decision-makers firmly focused on locally improving a policy
configuration. This is problematic if the period of corporate exploitation is relatively short. Mean
performance thereby decreases during the exploration stage. In general, these two effects imply that
broader search becomes more attractive when there are fewer important policy attributes. When it is
possible to focus on relatively few policy attributes (ten in the present analysis), broad search can be
advantageous. This happens when most policy attributes are interdependent and if the firm allocates a
good deal of time to broad search within each market.
To conclude, the findings provide powerful support for the conjecture that two-stage
sequencing of corporate exploration and exploitation dominates a fixed balance. One-stage search
pushed towards exploitation is a high-risk strategy that may benefit a few lucky agents, but the
majority gets stuck in inferior markets. In contrast, one-stage search in exploration mode foregoes the
potential performance benefits of focusing search efforts on a particular market. A two-stage search
strategy combines the advantages of these two extremes, while excluding the liabilities. Agents first
form expectations about relative performance based on an optimal sample size during corporate
exploration. They then proceed to corporate exploitation by firmly focusing on the market identified
as the most attractive one.
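The mechanism behind this comparison can be illustrated with a deliberately stripped-down sketch. It is not the simulation model used here: each market is reduced to a noisy payoff with a fixed mean, so there are no NK landscapes, no search within markets, and no budget. The market means, the noise level, the zero initial beliefs, and all function names are assumptions made for illustration only.

```python
import math
import random

def softmax_choice(estimates, tau, rng):
    """Pick a market with probability proportional to exp(estimate / tau)."""
    top = max(estimates)
    weights = [math.exp((m - top) / tau) for m in estimates]
    return rng.choices(range(len(estimates)), weights=weights)[0]

def simulate(true_means, tau_schedule, steps=1000, noise=0.05, seed=0):
    """Return the sequence of markets sampled by a single agent."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # assumed initial beliefs
    counts = [0] * len(true_means)
    choices = []
    for t in range(steps):
        market = softmax_choice(estimates, tau_schedule(t), rng)
        payoff = rng.gauss(true_means[market], noise)
        counts[market] += 1
        # Running average of the payoffs observed in the sampled market.
        estimates[market] += (payoff - estimates[market]) / counts[market]
        choices.append(market)
    return choices

means = [0.5, 0.6, 0.7, 0.8]  # assumed market means
fixed = simulate(means, lambda t: 0.005)                          # one-stage exploitation
two_stage = simulate(means, lambda t: 1.0 if t < 438 else 0.005)  # switch after S samples
late = two_stage[438:]
print("markets visited under fixed exploitation:", sorted(set(fixed)))
print("share of the best market after the switch:", late.count(3) / len(late))
```

In this toy setting the exploiting agent stays in whichever market it happens to sample first, while the two-stage agent settles on the best market once the exploration stage has produced reliable estimates.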
Robustness of two-stage sequencing: Lock-ins and adjustments of the exploration stage
The effective balancing of corporate exploration and exploitation in a two-stage search process
critically depends on aligning the budget with sample size. If the budget does not support a
sufficiently long period of exploration, agents get locked into a market prematurely. Performance
suffers. Still, the impact of a lock-in is not as pronounced as one might expect and, despite the lock-in,
agents perform better than with a comparable one-stage search strategy. With broad search, a longer
period of sustained corporate exploitation within a market compensates for insufficient corporate
exploration. If agents can only engage in local search, the performance implications of a lock-in
critically depend on market complexity. In less complex markets, local search provides ample
opportunities for improving performance. In more complex environments, however, a lock-in reduces
performance more sharply, since agents quickly reach a local peak and do not benefit from
more extended exploitation within a market.
[Figure 3 plot: mean performance (vertical axis, 0.5 to 0.75) over time (horizontal axis, 0 to 1000). Series: N = 10, δ = 0.05; N = 10, δ = 0.20; N = 50, δ = 0.05; N = 50, δ = 0.20.]
Figure 3: Two-stage search with different confidence levels (d = 4, N = 10/50, local search, B = 50,
entry costs = 0.1; α = 0.2, δ = 0.05, 0.2, τ = 0.005, 1)
To prevent a lock-in, agents may reduce the sample size in the exploration stage by adjusting
the confidence or the tolerance level. Reducing the confidence level to 80% (δ = 0.2) decreases the
length of the exploration stage to under 300 time steps. However, agents are still able to differentiate
among markets and achieve large performance gains after completion of the exploration stage, even if
they only engage in local search within a market (Figure 3).
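The lengths of the two exploration stages follow directly from equation (2). As a worked check, assuming the logarithm is natural (an assumption that reproduces the 438 samples reported earlier) and using d = 4 and α = 0.2:

```latex
\begin{align*}
S_{\delta = 0.05} &= \frac{4}{0.2^{2}} \ln\frac{4}{0.05} = 100 \ln 80 \approx 438, \\
S_{\delta = 0.2}  &= \frac{4}{0.2^{2}} \ln\frac{4}{0.2}  = 100 \ln 20 \approx 299.6.
\end{align*}
```

Because δ enters only through the logarithm, quadrupling the accepted error probability shortens the exploration stage by roughly a third rather than by a factor of four.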
With less time allocated to corporate exploration, agents may compensate during the
subsequent exploitation stage by further improving their position within a particular market. The
overall effect critically depends on the performance differences among markets and the relative
contribution of exploitation within a market. In markets with few important policy attributes (N = 10),
most of the performance gains are already realized in the exploration stage. Figure 3 reports that a
longer, more reliable exploration stage (barely) outperforms a shorter period of corporate exploration.
However, the small increase in mean performance during the exploitation stage comes at the huge
price of foregone performance gains during the extended exploration stage. Markets with many policy
attributes (N = 50) exhibit more improvement opportunities, a property that might justify a shorter,
less reliable exploration stage, since agents can effectively compensate by sustained improvements
within a market. These markets offer so many possibilities that it is apparently better to briefly take
aim and then go for the most promising market. The same basic effects may be observed when the
duration of corporate exploration is adjusted by the tolerance level α. Yet, curtailing the exploration
stage too much significantly decreases performance.¹⁵ As we found before, exploitation is only an
imperfect substitute for exploration, even if markets just differ in complexity. Overall, the viability of
two-stage search depends on calibration of the budget, the tolerance, and the confidence level. It
outperforms a fixed balance one-stage search strategy across a wide range of parameter values. As we
shall see, dynamic adjustment can match a well-calibrated two-stage search procedure, but with far
fewer demands on getting the parameters “right”.
Two-stage sequencing compared to dynamic adjustment
In the following, we compare two-stage sequencing with dynamic adjustment of corporate
exploration. Two-stage sequencing is characterized by a first stage of intensive exploration among
markets followed by a second stage of focused exploitation of a market. In contrast, dynamic
adjustment gradually reduces the propensity to explore based on performance feedback and the
remaining budget. It thereby keeps the door open to higher levels of corporate exploration in later
time periods.
¹⁵ The results are available from the authors upon request.
[Figure 4a plot: mean performance (vertical axis, 0.5 to 0.8) over time (horizontal axis, 0 to 1000). Series: dynamic adjustment with broad search; dynamic adjustment with local search; 2-stage sequencing with broad search; 2-stage sequencing with local search.]
Figure 4a: Two-stage sequencing compared to dynamic adjustment in markets with few policy
attributes (d = 4, N = 10, K = 0, 1, 5, 9; B = 50, entry costs = 0.1; α = 0.2, δ = 0.05, β = 1, τ = 0.005, 1)
[Figure 4b plot: mean performance (vertical axis, 0.5 to 0.7) over time (horizontal axis, 0 to 1000). Series: dynamic adjustment with broad search; dynamic adjustment with local search; 2-stage sequencing with broad search; 2-stage sequencing with local search.]
Figure 4b: Two-stage sequencing compared to dynamic adjustment in markets with many
policy attributes (d = 4, N = 50, K = 0, 1, 25, 49; B = 50, entry costs = 0.1, α = 0.2, δ = 0.05, β = 1, τ = 0.005, 1)
Figure 4 compares the performance characteristics in markets with few (N = 10, Panel a) and
many (N = 50, Panel b) policy attributes. It is evident that the two strategies have very different
characteristics. In general, mean performance grows faster with dynamic adjustment, but two-stage
sequencing catches up with dramatic gains in the transition to the exploitation stage. The most striking
observation relates to the effectiveness of the two-stage search strategy. Two-stage sequencing based on
optimal sampling does not systematically outperform the simple behavioral rule underlying dynamic
adjustment. Rather, the mean performance at the end of the simulation is only slightly higher with
two-stage search. This marginal increase in final mean performance, however, comes at the price of
much lower performance gains during the exploration stage. Hence, dynamic adjustment often
outperforms two-stage sequencing in terms of accumulated performance over the entire simulation
run. This effect is particularly pronounced in markets with few policy attributes (Figure 4a). When
there are many policy attributes to be considered (Figure 4b), two-stage sequencing is superior,
especially in terms of accumulated performance. With dynamic adjustment, “keeping the door open”
to exploration distracts from realizing incremental gains in a huge space of possibilities, leading to
lower performance in the mid-term of the search process. The firm commitment to sustained
exploitation in two-stage sequencing here helps agents to realize incremental performance gains by
focusing on improving a policy configuration within one market.
The performance of both search strategies also critically depends on whether agents engage in
local or broader search. First, engaging in broad search comes at the cost of lower mean performance
at the beginning of the search process, regardless of the chosen search strategy. This finding applies to
markets with few (N = 10) and many (N = 50) policy attributes. Second, over the longer run, broad
search leads to higher mean performance in markets where few policy attributes must be considered,
but it decreases mean performance and accumulated performance when there are many policy
attributes. These results substantiate our finding that broad search may be a mixed blessing, since it
depresses mean performance during corporate exploration. In corporate exploitation, when a
multitude of policy configurations offers vast opportunities, it distracts agents from incrementally
improving a configuration and realizing more proximate gains.
Robustness of dynamic adjustment: Budget adjustments
Overall, the performance characteristics of the dynamic adjustment of corporate exploration are quite
remarkable when compared to those for two-stage search. Compared to two-stage search, the dynamic
adjustment model is more robust to changes in the parameters.¹⁶
[Figure 5 plot: mean performance (vertical axis, 0.5 to 0.75) over time (horizontal axis, 0 to 1000). Series: N = 10, budget constraint; N = 10, no budget constraint; N = 50, budget constraint; N = 50, no budget constraint.]
Figure 5: Dynamic adjustment and budget commitment (d = 4, N = 10/50, local search, B = 50,
entry costs = 0, 0.1, β = 1)
¹⁶ The results for the dynamic adjustment model were quite robust to variations of parameter β. With a high
value of β, firms begin with intensive exploration among markets. As the value of β is lowered, this tendency
weakens. The results for the dynamic adjustment model reported here assume a value of β set to 1. We tested the
robustness of these results by setting β to 0.1, 0.5 and 5. With 0.1, agents do not spend enough time searching
among markets, leading to a decline in overall performance. No performance differences were observed when β
was varied between 0.5 and 1. Even with a very high β (= 5) performance did not decline significantly. In
addition, we also tested the impact of budget size by reducing the budget to 25 (50% cut). Performance varied
only slightly, and the lower budget sometimes even outperformed the higher one. Obviously, the budget does
have to be sufficiently high to allow for some corporate exploration and low enough to prevent excessive
exploration. Overall, performance was remarkably robust to changes in β and the size of the budget.
The success of dynamic adjustment critically depends on a firm commitment to a fixed budget
that is not replenished during the search process. Otherwise, agents may engage in excessive
exploration. To study the effects of a flexible budget, we lifted the budget constraint by setting entry
costs to zero. The balance between corporate exploration and exploitation is still adjusted by
performance feedback, but the adjustment is now more gradual as exploration is not lowered by a
declining budget. The impact on the effectiveness of search is quite pronounced (Figure 5). When the
budget constraint is lifted from the dynamic adjustment model, there is a dramatic decline in
performance as agents engage in excessive exploration among markets. Thus, a hard budget constraint
is a precondition for the effectiveness of the dynamic adjustment of corporate exploration.
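Why lifting the budget constraint slows the decline of exploration can be seen from a small sketch of the rule τ = β(B/W). The sketch rests on strong simplifications that are assumptions, not part of the model description: the agent enters a market in every time step and pays the entry cost each time, and it earns a constant payoff per step. The function name `tau_path` is hypothetical.

```python
def tau_path(steps, budget, entry_cost, beta=1.0, payoff_per_step=0.6):
    """Trace tau = beta * B / W for an agent that enters a market in every step."""
    taus = []
    cumulated = 0.0
    for _ in range(steps):
        budget = max(budget - entry_cost, 0.0)  # assumed: each entry is paid from the budget
        cumulated += payoff_per_step            # assumed: constant payoff per step
        taus.append(beta * budget / cumulated)
    return taus

constrained = tau_path(1000, budget=50.0, entry_cost=0.1)
unconstrained = tau_path(1000, budget=50.0, entry_cost=0.0)
for t in (0, 99, 499, 999):
    print(t + 1, round(constrained[t], 4), round(unconstrained[t], 4))
```

With entry costs of zero, B stays at its initial value and τ falls only as W accumulates, so the propensity to explore remains positive throughout, whereas the depleting budget drives it to zero.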
Search among markets that differ in mean performance
[Figure 6 plot: mean performance (vertical axis, 0.64 to 0.8) over time (horizontal axis, 0 to 1000). Series: 1-stage search (τ = 0.005); 1-stage search (τ = 0.1); 2-stage sequencing; dynamic adjustment.]
Figure 6: Comparison of three approaches when markets differ in mean performance (d = 4, N
= 50, K = 25, local search, entry costs = 0.1, α = 0.2, δ = 0.05, β = 1, τ = 0.005, 0.1, 1)
So far, we have only considered markets that differ in the number of relevant policy attributes
and in the number of interdependencies among those attributes (complexity). But we have not yet
considered markets that differ in mean performance. We turn to this important case now. The effects
on corporate exploration are two-fold. On the one hand, corporate exploration becomes more critical.
Since performance differences among markets are more pronounced, it is even more important to
locate in an attractive market where firms are rewarded with higher revenue. Agents in a
poor-performing market can only partially compensate by more intensive corporate exploitation. On the
other hand, large performance differences make corporate exploration less demanding. It becomes
easier to differentiate among markets and to separate the wheat from the chaff. For the following
analysis, we consider markets that have the same complexity, but differ sharply in mean performance.
Specifically, we consider markets with a mean performance of 0.5, 0.6, 0.7 and 0.8, respectively.
Figure 6 reports the effectiveness of all three approaches to corporate exploration in complex
markets (K = 25) with many relevant policy attributes (N = 50). For the fixed balance, we show the
results for low propensities to explore (τ = 0.005 and τ = 0.1), the latter being the best-performing
strategy with a fixed balance. It is evident that eschewing corporate exploration entirely and just
randomly picking a market is a much less effective strategy under the condition of large performance
differences among markets. Again, the remarkable result here is the performance of the dynamic
adjustment strategy that clearly outperforms two-stage search in terms of accumulated performance
and matches it in terms of final performance.¹⁷ In markets with few policy attributes (N = 10, K = 5),
we find similar results, with a slight advantage of two-stage sequencing in terms of final performance.
5. Discussion and conclusions
A classic problem in strategic decision-making is the allocation of resources to the two conflicting
demands of diversifying into new markets (corporate exploration) and of focusing on existing lines of
business (corporate exploitation). Corporate exploration takes a firm into new markets. In contrast,
corporate exploitation is concerned with the refinement and strengthening of the competitive position
within a market. Firms therefore search for competitive advantages on two distinct, but interrelated
levels. Searching a market also reveals useful information about how to compete. The inherent
problem of balancing corporate exploration and exploitation is avoiding the pitfalls of the success trap
and of the failure trap. Both traps are rooted in the antagonistic forces of immediate feedback from
corporate exploration and exploitation. Initial positive feedback drives out exploration, while negative
feedback undercuts exploitation. We built an agent-based simulation model to capture the salient
features of this decision problem and analyze how to benefit from corporate exploration.
¹⁷ For robustness we also tested the influence of broad search. Again, broad search depressed mean performance
during corporate exploration. In markets with few attributes, broad search outperforms local search toward the
end of the simulation run.
We analyzed three different approaches to searching among multiple markets. The first
approach commits the decision-maker to a fixed balance between corporate exploration and
exploitation. The second strategy splits the search process into two stages. The firm initially engages
in corporate exploration and then firmly switches to the exploitation of one market. The
decision-maker commits to a fixed period of intensive corporate exploration among markets and then to
sustained exploitation of one market in the second stage. Third, dynamic adjustment allows for
gradually adjusting the balance based on performance feedback and the remaining budget for
explorative search. Here, the commitment to a fixed budget is critical. The firm gradually homes in on
an attractive market, progressively shifting the balance toward exploitation. Yet, dynamic adjustment
allows for more intensive exploration among markets even in the late stages of organizational search.
Table 2 summarizes our results. Our main finding firmly establishes the poor performance
characteristics of a fixed balance between corporate exploration and exploitation. The two-stage
strategy and the dynamic adjustment approach consistently outperform all variants of the fixed
balance. This baseline result is far from trivial, since a fixed balance seems to hold some advantages.
Especially a balance pushed toward corporate exploitation allows for an intensive period of
exploration early on, since there are no apparent differences among markets. As soon as performance
differences among markets become apparent, the most attractive market is chosen with a high
probability and firms only seldom venture into new markets to reinforce their beliefs. Our results
demonstrate that this approach fails to tame the antagonistic forces of exploration and exploitation.
By contrast, the two-stage search strategy copes with the antagonistic forces by dividing the
search process into two distinct stages. The commitment to an extended period of exploration prevents
agents from falling prey to the success trap, that is, from settling in a market too early in the search
process. The commitment to a switch toward sustained exploitation counters the allure of the failure
trap and protects agents from excessive exploration. In the dynamic adjustment approach, the
commitment to a fixed budget combined with performance feedback allows for intensive exploration
early on to prevent the success trap, while progressively adjusting the balance toward exploitation to
escape the failure trap. Thus, both approaches systematically tame the antagonistic forces by entering
organizational commitments, although they achieve this very differently.
More strikingly, the simple behavioral rule underlying the dynamic adjustment matches and
sometimes even outperforms the two-stage sequencing based on optimal sampling. Two-stage
sequencing often does marginally better in terms of final performance at the end of the simulation run, but
this small gain comes at the cost of much lower performance during the extended exploration stage.
Essentially, this brings into sharp contrast the problems inherent in the two-stage model. Effective
two-stage sequencing puts much higher demands on the cognitive abilities of the decision-maker. For
effective sampling in the exploration stage, she must decide on reasonable confidence and tolerance
levels and find the right propensity to explore for both stages of organizational search.
| | Few policy attributes (N = 10) | Many policy attributes (N = 50) |
|---|---|---|
| Markets differ in complexity, not in mean performance | a) Dynamic adjustment > two-stage > fixed balance; b) Broad search > local search | a) Two-stage > dynamic adjustment > fixed balance; b) Local search > broad search |
| Markets differ in mean performance, not in complexity | a) Dynamic adjustment > two-stage > fixed balance; b) Broad search > local search | c) Dynamic adjustment > two-stage > fixed balance; d) Local search > broad search |

Table 2: Summary of main results in terms of accumulated performance at the end of the
simulation run
Even though the results are fairly robust to variations in the propensity to explore, changes in
tolerance and confidence levels have significant performance implications. If set too low, exploration
may fail to identify a lucrative market. If set too high, the organization spends too much time
exploring markets, for very little gain. In addition, the budget must be tuned to the exploration stage
to prevent a premature lock-in to a market. The dynamic adjustment model is more robust. The
manager specifies a hard budget constraint for explorative search and stimulates a high initial level of
corporate exploration. If in doubt, the initial level of corporate exploration should be exaggerated.
Our second set of results relates to the interactions between search on two levels and the
performance properties of local and broad search within markets. First, sustained exploitation within a
market is, on average, an imperfect substitute for exploration among markets. Second, the ability to
search more broadly is a mixed blessing. If firms cannot readily tell whether they have reached a local
peak, broad search may distract them from making proximate gains by local search. Performance
suffers, since firms start to broaden search as soon as initial attempts at local improvements turn out to
be unsuccessful. Thus, broad search becomes more attractive the smaller and the more complex a
market is and the more time an agent spends within a market. Third, independent of the characteristics
of the markets, broad search leads to substantial losses in mean performance during corporate
exploration. Engaging in more explorative search on both the corporate and the business level
confounds learning experiences. A possible solution is to manage local and broad search more
carefully. During corporate exploration, firms should constrain business development within a market
to local improvements, even if performance does not increase immediately. In corporate exploitation,
broadening search should only be attempted if local improvements have consistently failed to increase
performance (as the latter reinforces the impression of having reached a local peak).
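The recommendation above can be written as a simple decision rule for how many policy attributes to change in a given period. The sketch below is only illustrative: the function name and the parameters `patience` (the number of consecutive failed local attempts that counts as "consistent" failure) and `broad_width` (the number of attributes changed in a broad move) are hypothetical and do not come from the model.

```python
def attributes_to_change(corporate_exploration: bool,
                         failed_local_attempts: int,
                         patience: int = 5,
                         broad_width: int = 3) -> int:
    """Return how many policy attributes to alter in the next search step.

    Assumed rule: local search (one attribute) at all times during
    corporate exploration; during corporate exploitation, switch to broad
    search only after `patience` consecutive failed local attempts.
    """
    if corporate_exploration:
        return 1  # stay local even if performance does not improve
    if failed_local_attempts >= patience:
        return broad_width  # repeated failure suggests a local peak
    return 1
```

For example, `attributes_to_change(False, 5)` returns `3` under the default values, while the same failure count during corporate exploration still yields a local move.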
The two-stage sequencing and the dynamic adjustment approach have very different
organizational implications. Two aspects seem to be especially relevant: managing the transition from
corporate exploration to exploitation, and organizational commitments. Regarding the first aspect,
prior research shows that exploration and exploitation call for different organizational structures (e.g.
Tushman & O’Reilly 1997). The two-stage sequencing approach assumes a clear and swift
organizational transition from an organization geared toward exploration to one squarely focused on
exploitation. This appears to stand in stark contrast to research highlighting the problems of radical
organizational change and to prior literature pointing to a gradual shift toward exploitation (e.g.
March 1991; Levinthal & March 1993). The latter point addresses the issue of maintaining a high
level of exploration through various organizational instruments, while the former point may be
addressed by the organizational separation of exploration and exploitation. That is, the organization
needs to consist of two specialized organizational units for exploration and exploitation (Tushman &
O’Reilly 1997). The units must be ordered hierarchically: After the completion of corporate
exploration, the exploration unit directs the exploitation unit toward the most attractive market and
policy configuration therein. Thus, two-stage sequencing essentially calls for a more centralized
approach toward corporate exploration. The dynamic adjustment approach, on the other hand, does
not suffer from the same limitations. The gradual decline in corporate exploration implies incremental
organizational change toward a more stable organizational structure. Thus, corporate exploration and
exploitation are managed within the same business unit. This corresponds to a more decentralized
solution, in which the business unit is delegated a fixed budget for explorative search among markets.
The second aspect concerns organizational commitment to either a fixed time period or a
budget for explorative search. This raises the problem of establishing credible managerial
commitments not to intervene in a delegated task (Baker, Gibbons & Murphy 1999; Foss 2003). If the
task of corporate exploration is delegated to a subordinate, the principal can always intervene or
retract the delegated decision from the agent. Incentive problems may ensue. This seems to be a
relevant organizational concern for implementing the two-stage and the dynamic adjustment
approach. Without offering a fully-fledged analysis, a few remarks can be derived from our results.
The two-stage approach appears to be more prone to incentive conflicts than the dynamic adjustment
model. First, two-stage sequencing offers more room to renege on the agreed time period and the budget,
making it harder for outsiders to detect the breach of an informal contract. Second, mean performance
stays flat during much of the exploration stage. This makes performance-based incentives a blunt tool
for motivating the agent for exploration, a task notoriously hard to measure. Providing incentives
based on performance after the completion of exploration also creates problems, as the firm might
bring in a new manager for the exploitation stage. This suggests that delegating the task of corporate
exploration is harder to accomplish, again calling for a more centralized solution for two-stage search.
Dynamic adjustment provides a more straightforward way to manage organizational
commitments. First, the informal contract only relates to a fixed budget, usually highly visible
throughout the organization. Second, dynamic adjustment is characterized by steadily increasing
performance, so that the incentives for an agent may be based on a combination of current and long-term performance. Again, dynamic adjustment tends to favor a decentralized approach to corporate
exploration. These aspects – how incentives influence organizational search and adaptation – need
more attention in future research (cf. Nickerson & Zenger 2004; Siggelkow & Rivkin 2005; Ethiraj &
Levinthal 2009).
Our model admits various limitations that may inspire further research. A limitation of the
current model is that agents may locate only in a single market. An interesting extension of the model
could allow firms to spread their investments across more or less related industries through corporate
diversification. The model can therefore be extended to study the evolution of corporate
diversification more fully. Another avenue of research might consider the dynamic emergence of new
markets or introduce relative performance shifts. This could, in a stylized way, capture the essence of
industry dynamics and technological development, since markets would evolve through a period of
growth and stagnation. An interesting extension would be to introduce competitive dynamics among
firms (cf. Lenox, Rockart & Lewin 2006; Knudsen, Levinthal & Winter 2009).
To conclude, our results suggest that firms may benefit from corporate exploration by making
organizational commitments. In strategy-making, corporate exploration and exploitation unfold on
two distinct, but interrelated levels. On the level of corporate strategy, firms search among markets
(corporate exploration and exploitation). Within a market, firms explore and exploit competitive
positions (business-level exploration and exploitation). On the corporate level, a firm needs to commit
itself to a fixed time period of sustained exploration or to a hard budget constraint when searching
prospective new markets. On the business level, performance gains may be achieved by a
commitment to local improvements during the early stages of organizational search. We hope this
contribution will benefit both research and practice by directing attention to these issues, providing a
modeling structure with which they can be examined in a systematic way, and providing a first set of
robust results that point to the advantages of two-stage sequencing and the dynamic adjustment of
corporate exploration and exploitation. Corporate search is an important unexplored topic, and we
hope to have stimulated consideration of it in our theories of organizational search.
6. References
Adner, R., D.A. Levinthal. 2004. What is not a real option: Considering boundaries for the application
of real options to business strategy. Acad. Man. Rev. 29(1) 74–85.
Baker, G., R. Gibbons, K.J. Murphy. 1999. Informal authority in organizations. J. of Law, Econom.,
and Organ. 15(1) 56-87.
Bhardwaj, G., et al. 2006. Continual corporate entrepreneurial search for long-term growth. Man. Sci.
52(2) 248-261.
Benner, M. J., M. L. Tushman. 2003. Exploitation, exploration, and process management: The
productivity dilemma revisited. Acad. Man. Rev. 28(2) 238–256.
Burgelman, R.A. 2002. Strategy as Vector and the Inertia of Coevolutionary Lock-in. Admin. Sci.
Quart. 47 325-357.
Carley, K.M., M. Svoboda 1996. Modeling organizational adaptation as a simulated annealing
process. Sociological Methods & Res. 25(1) 138-168.
Chandler, A.D. 1962. Strategy and structure: chapters in the history of the industrial enterprise. MIT
Press: Cambridge.
Cyert, R.M., J.G. March. 1963. A Behavioral Theory of the Firm. Prentice-Hall: Englewood Cliffs.
Daw, N.D., J.P. O’Doherty, P. Dayan, B. Seymour, R.J. Dolan. 2006. Cortical substrates for exploratory
decisions in humans. Nature 441 876-879.
Denrell, J., J.G. March. 2001. Adaptation as Information Restriction: The Hot Stove Effect. Organ.
Sci. 12(5) 523-538.
Denrell, J., C. Fang, S.G. Winter. 2003. The Economics of Strategic Opportunity. Strat. Man. J.
24(10) 977-990.
Denrell, J., G. Le Mens. 2007. Interdependent Sampling and Social Influence. Psyc. Rev. 114(2) 398-422.
Eisenmann, T.R., J.L. Bower 2000. The entrepreneurial M-form: Strategic integration in global media
firms. Organ. Sci. 11(4) 348-355.
Ethiraj, S.K., D.A. Levinthal 2009. Hoping for A to Z While Rewarding Only A: Complex
Organizations and Multiple Goals. Organ. Sci. 20(1) 4-21.
Even-Dar, E., S. Mannor, Y. Mansour. 2002. PAC Bounds for Multi-Armed Bandit and Markov
Decision Processes. 15th Con. on Comp. Learning Theory (COLT) 255-270.
Fang, C., D.A. Levinthal. 2009. Near-Term Liability of Exploitation: Exploration and Exploitation in
Multistage Problems. Organ. Sci. 20(3) 538-551.
Fleming, L., O. Sorenson. 2001. Technology as a complex adaptive system: evidence from patent
data. Res. Policy 30(7) 1019-1039.
Fleming, L., O. Sorenson. 2004. Science as a map in technological search. Strat. Man. J. 25(8) 909-928.
Foss, N.J. 2003. Selective Intervention and Internal Hybrids: Interpreting and Learning from the Rise
and Decline of the Oticon Spaghetti Organization. Organ. Sci. 14(3) 331-349.
Gavetti, G. 2005. Cognition and Hierarchy: Rethinking the Microfoundations of Capabilities’
Development. Organ. Sci. 16(6) 599-617.
Gavetti, G., D. A. Levinthal, J.W. Rivkin. 2005. Strategy making in novel and complex worlds: the
power of analogy. Strat. Man. J. 26(8) 691-712.
Greve, H.R. 2003. Organizational Learning from Performance Feedback. Cambridge University
Press: Cambridge
Gupta, A.K., K.G. Smith, C.E. Shalley. 2006. The Interplay between Exploration and Exploitation.
Acad. of Man. J. 49(4) 693–706.
He, ZL., PK. Wong. 2004. Exploration vs. Exploitation: An Empirical Test of the Ambidexterity
Hypothesis. Organ. Sci. 15(4) 481-494.
Holland, J.H. 1975. Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan
Press.
Holmqvist, M. 2004. Experiential Learning Processes of Exploitation and Exploration within and
between Organizations: An Empirical Study of Product Development. Organ. Sci. 15 70-81.
Kauffman, S.A. 1993. The Origins of Order. New York: Oxford University Press
Knudsen, T., D.A. Levinthal. 2007. Two faces of search: Alternative generation and alternative
evaluation. Organ. Sci. 18 39-54.
Knudsen, T., D.A. Levinthal & S.G. Winter. 2009. The Role of Scale Adjustment in Industry
Dynamics. DRUID conference paper.
Lave, C.A., J. G. March. 1975. An introduction to models in the social sciences. New York: Harper &
Row.
Lenox, M.J., S.F. Rockart, A.Y. Lewin. 2006. Interdependency, competition, and the distribution of
firm and industry profits. Man. Sci. 52 757-772.
Levinthal, D.A. 1997. Adaptation on rugged landscapes. Man. Sci. 43 934-951.
Levinthal, D.A., J.G. March. 1981. A model of adaptive organizational search. J. of Econom.
Behavior and Organ. 2 307-333.
Levinthal, D.A., J.G. March. 1993. The myopia of learning. Strat. Man. J. 14(S2) 95-112.
Luce, R.D., H. Raiffa. 1957. Games and Decisions. Wiley: New York.
Luce, R.D. 1959. Individual Choice Behavior. Wiley: New York.
March, J.G. 1991. Exploration and exploitation in organizational learning. Organ. Sci. 2 71-87.
March, J.G. 2003. Understanding organisational adaptation. Soc. and Econom. 25(1) 1-10.
Matsusaka, J.G. 2001. Corporate diversification, value maximization, and organizational capabilities.
J. of Bus. 74(3) 409-431.
McGahan, A.M., M.E. Porter. 1997. How much does industry matter, really? Strat. Man. J. 18(S1) 15-30.
Mosakowski, E. 1997. Strategy making under causal ambiguity: Conceptual issues and empirical
evidence. Organ. Sci. 8 414-442.
Montgomery, C.A., 1994. Corporate Diversification. J. of Econom. Persp. 8(3) 163-178.
Nickerson, J.A., T.R. Zenger. 2004. A Knowledge-Based Theory of the Firm: The Problem-Solving
Perspective. Organ. Sci. 15(6) 617-632.
Rivkin, J.W. 2001. Imitation of Complex Strategies. Man. Sci. 46(6) 824-844.
Rivkin, J.W., N. Siggelkow. 2003. Balancing Search and Stability: Interdependencies among
Elements of Organizational Design. Man. Sci. 49(3) 290-311.
Robbins, H. 1952. Some Aspects of the Sequential Design of Experiments. Bulletin of the Amer.
Math. Soc. 58 527–535.
Rumelt, R.P. 1991. How much does industry matter? Strat. Man. J. 12(3) 167-185.
Siggelkow, N., D.A. Levinthal. 2003. Temporarily Divide to Conquer: Centralized, Decentralized,
and Reintegrated Organizational Approaches to Exploration and Adaption. Organ. Sci. 14(6) 650-669.
Siggelkow, N., J.W. Rivkin. 2005. Speed and Search: Designing Organizations for Turbulence and
Complexity. Organ. Sci. 16(2) 101-122.
Sorenson, O. 2002. Interorganizational complexity and computation. J.A. Baum, ed. The Blackwell
Companion to Organizations. Blackwell, Oxford, 664-685.
Sutton, R.S., A.G. Barto. 1998. Reinforcement learning. Cambridge, Ma.: MIT Press.
Tripsas, M. 1997. Unraveling the Process of Creative Destruction: Complementary Assets and
Incumbent Survival in the Typesetter Industry. Strat. Man. J. 18(S1) 119-142.
Tushman, M.L., C.A. O’Reilly. 1997. Ambidextrous Organizations: Managing Evolutionary and
Revolutionary Change. California Management Rev. 38(4) 8-30.
Villalonga, B. 2004. Diversification discount or premium? New evidence from the business
information tracking series. J. of Finance 59(2) 479-506.
Winter, S.G. 2000. The Satisficing Principle in Capability Learning. Strat. Man. J. 21(10-11) 981-996.
Winter, S.G., G. Szulanski. 2001. Replication as Strategy. Organization Science 12 730-743.
Yechiam, E., J.R. Busemeyer. 2005. Comparison of Basic Assumptions Embedded in Learning Models
for Experience-Based Decision Making. Psych. Bull. & Rev. 12(3) 387-402.