Corporate search: Exploration and exploitation in multiple markets

Thorbjørn Knudsen & Nils Stieglitz
Strategic Organization Design
Department of Marketing & Management
University of Southern Denmark
nst@sam.sdu.dk

Abstract

A firm that operates in more than one industry or market is engaged in corporate search when it collects information to consider trade-offs between new and existing market opportunities. Corporate search is a continuum spanned by the extreme poles of exploration and exploitation. Little is known about the conditions that allow firms to benefit from exploration of multiple markets as opposed to refinement of an existing business serving a particular market. This paper develops a modelling structure that provides useful insights about this important problem. Our results suggest that organizational commitments hold the key to benefiting from corporate exploration by gradually forcing a shift from broad search among multiple markets to narrow search for improvement within a particular market. We show how organizational commitments can be operationalized as either financial constraints or time constraints.

Keywords: corporate search, exploration and exploitation, NK model

1. Introduction

How do firms benefit from exploring multiple markets? The canonical example of this problem is corporate search. A firm that operates in more than one industry or market is engaged in corporate search when it collects information to consider trade-offs between new and existing market opportunities. Corporate search is a continuum spanned by the extreme poles of exploration and exploitation. Corporate exploration involves searching for business opportunities among multiple markets or industries. Its counterpart, corporate exploitation, is the refinement of an existing business serving a particular market. Little is known about the conditions for successful corporate search. This is because prior work has primarily focussed on search in single markets.
Prior work has advanced our knowledge about the relation between organizational search and performance at the business unit level. We now have a fairly good understanding of the way organizations must design search so it leads to superior outcomes in a single task environment. The extant work on organizational search is thereby useful for analysis of business units that operate in a single industry or market. It has spawned many detailed insights for this situation (see Sorensen 2002; Gupta et al. 2006 for reviews) as well as the general insight that organizational search can be characterized as trade-offs between mechanisms that are variance-enhancing (exploration) and mechanisms that are mean-enhancing (exploitation). The exploration-exploitation trade-off has been identified as one of the most important challenges for organizational search (March 1991). This trade-off exists because the mean-enhancing mechanisms associated with exploitation reinforce positive, proximate, and predictable rewards while the variance-enhancing mechanisms that characterize exploration reinforce more uncertain, distant and often negative rewards (Levinthal & March 1993; Denrell & March 2001; Benner & Tushman 2003; He & Wong 2004). Exploitation ignores possible distant gains while exploration ignores immediate incremental gains. Prior formal models have provided detailed insights on the exploration-exploitation trade-off that correspond to a single business unit operating in a single market (e.g. March 1991; Carley & Svoboda 1996; Siggelkow & Levinthal 2003; Siggelkow & Rivkin 2005). An important finding is the advantage of increasing the level of variance in alternative generation when many policy attributes are interdependent (Rivkin & Siggelkow 2003; Knudsen & Levinthal 2007).
As further explained in the following section’s review of the literature, search at the corporate level adds the challenge of considering exploration and exploitation within the context of multiple markets. As far as we are aware, this challenge has not yet been addressed in the literature on organizational search (cf. Gupta et al. 2006). There are no workable models that facilitate systematic analysis of the trade-offs that occur when firms operate in more than one industry or market. This is important because most firms in the economy are active in more than one market (Montgomery 1994; Villalonga 2004). While prior formal models of organizational search have primarily focussed on single-unit businesses, this type of firm accounts for only about 20% of the publicly traded firms in the US economy (Villalonga 2004). We therefore lack a valid model of organizational search for about 80% of public firms in the economy. That is, corporate search is an important unexplored topic in the literature on organizational search. We aim to take a first step in the direction of a systematic analysis of this topic.

A defining feature of corporate search is the interrelatedness of search at two distinct levels of organization. Firms search both for new markets and for improvements that can strengthen competitive positions in existing markets (e.g. Tripsas 1997; Winter 2000; Bhardwaj et al. 2006). These two levels of organizational search are interrelated, as Levinthal and March (1993: 100-101, their emphasis) point out, “an organization learns which market to enter and how to function effectively in several alternative markets.” Learning about a new market also reveals valuable information on how to compete in it. What is more, search at one level may substitute for search at another level.
A firm could settle in a less attractive industry, but nevertheless stake out a superior competitive position that allows it to earn above-average rents (Rumelt 1991; Porter & McGahan 1997). In contrast, it is quite possible to fail in an attractive industry. These issues cannot be addressed by confining the analysis to just one level of organizational search and adaptation. Rather, we need to consider two distinct levels of search, within and among markets, in order to understand corporate search in multiple markets.

Even though most firms are active in multiple markets, it is not clear how firms can design search processes so they benefit from exploring multiple markets. In order to fill this gap, we offer a platform for analysis of search in multiple markets that combines two well-established models from the organization literature. We draw on the NK model of adaptation on rugged landscapes (e.g. Levinthal 1997; Fleming & Sorensen 2001; Gavetti 2005) to characterize search within a particular market. Our modeling choice reflects a certain caution in that we simply extend prior NK models by considering choice among multiple performance landscapes. We use well-understood characterizations of search within landscapes. The added ingredient, then, is a characterization of search between landscapes. The obvious choice here is to model search among markets as an instance of the multi-armed bandit problem (e.g. March 1991; March 2003; Fang & Levinthal 2009). This class of models is well understood and ideally suited for trade-offs between possible improvement processes in multiple markets. That is, the problem of corporate search is a specific version of the more generic problem of searching through multiple spaces, or markets, each of which offers distinct improvement paths.

Our study provides insights that can help understand the relation between corporate search and corporate performance.
We identify conditions for successful corporate search and show that consideration of multiple markets, each of which offers specific opportunities, requires multi-stage search procedures. We find that the idea of identifying and maintaining a fixed balance between corporate exploration (search between markets) and corporate exploitation (search within markets) is misplaced. At the very least, corporate search requires a two-stage policy where exploration is followed by exploitation, a procedure that seems intuitive and consistent with available evidence (Winter & Szulanski 2001; Burgelman 2002; Siggelkow & Levinthal 2003). But when should the firm shift to exploitation mode? If the manager can compute an optimal sampling strategy the answer is given, but this is rarely the case. We identify a hard budget constraint as a much more realistic alternative that achieves similar results. The answer, then, to solving the exploration-exploitation trade-off at the corporate level lies in commitments that ensure a shift to exploitation after a sufficient period of exploration. We identify conditions that ensure the exploration period is sufficiently long and show that results do not vary dramatically when managers do not get it exactly right. Put differently, our findings show how corporate search can be designed so that managers avoid two well-known option traps (Adner & Levinthal 2004). The first is the tendency to search for new markets when things are not going well, rather than improving things in the existing market (failure trap). The second is the tendency to improve things in the existing market rather than abandon a market with little potential (success trap).

2. Corporate search: Exploration and exploitation in multiple markets

While the majority of firms in the economy perform corporate search where they consider exploration and exploitation in multiple markets, prior work has mostly been limited to the base case of a single business unit searching for improvements in a single task environment, or market. A large body of empirical and theoretical literature has studied the exploration and exploitation trade-off, yet prior research has not systematically considered organizational search on multiple levels. Most theoretical studies have focussed on just a single level of organizational search (e.g. Carley & Svoboda 1996; Levinthal 1997; Siggelkow & Rivkin 2003; Fang & Levinthal 2008).

The more challenging problem of searching multiple performance landscapes is now beginning to attract attention. Gavetti (2005) studied organizational search in the context of a multi-divisional firm. After an initial period of activity in a single market, an organization continues to operate in the original market, but also enters a new industry. He finds that knowledge spillovers from the old to the new business unit may be beneficial when market characteristics are similar. Likewise, Gavetti, Levinthal & Rivkin (2005) invoke multiple markets to examine the role of managerial analogies for strategy-making. An important result is that a well-informed analogy is a powerful guide in extending search from a known market context to a new one. However, these contributions do not examine how firms search for and select markets, and how search among and within markets interact with each other.1 Our aim is to use a similar multi-level perspective to understand corporate search for new markets.

While we lack a workable model of corporate search, the literature on corporate strategy is, by definition, concerned with multiple business units that are located in distinct markets. The choice of a firm’s market is a central problem of corporate strategy.
Research on corporate strategy has extensively studied the motivation for entering new markets (e.g. Montgomery 1994), the rate and direction of diversification (e.g. Mosakowski 1997) as well as the organizational structure of diversified companies (e.g. Chandler 1962; Eisenmann & Bower 2000). But only very limited attention has been directed to the problem of how firms actually search for and decide whether to enter new markets. From a managerial perspective this problem is quite challenging because the value of leveraged resources in a new market is hard to assess (Denrell, Fang & Winter 2003). Mosakowski (1997) and Matsusaka (2001) analyze how a firm engages in experimental diversification to identify new markets.

1 A notable exception is Holmqvist’s (2004) case study of exploration and exploitation in the context of product development. He documented that exploration and exploitation must be understood at multiple levels of organization (within and between firms that are partners in an alliance).

An example can illustrate. In a case study of DuPont’s diversification strategy, Bhardwaj, Camillus and Hounshell (2006) report how the company explored various markets like synthetic organic chemicals, inorganic chemicals, motor cars, and many more. Each market offered numerous specific business possibilities that could be pursued further. For example, within synthetic organic chemicals, DuPont engaged in the development of dyes, drugs, food preservatives, and perfumes. Over a period of 20 years, the company considered a vast array of business possibilities in many markets, and then gradually focused corporate resources on just a few of them.

Searching in multiple markets is challenged by the difficulty of balancing exploration and exploitation at the corporate level, a difficulty rooted in the way feedback from the two search activities becomes antagonistic.
Exploitation of a market is associated with positive, proximate, and predictable rewards that reinforce refinement and lead organizations into a competency or success trap (Denrell & March 2001; Benner & Tushman 2003; He & Wong 2004). In contrast, the more uncertain, distant and often negative rewards from exploring new markets tend to lead firms into a failure trap where negative feedback reinforces a drive towards further experimentation and change (Levinthal & March 1993). Our concern is how such pathologies relating to corporate search may be cured. On the one hand, a company needs to ensure attractive markets are identified by corporate exploration. This requirement calls for serious assessment of many new markets, as illustrated in the DuPont example. On the other hand, excessive corporate exploration may undercut the identification, refinement, and strengthening of valuable business possibilities within a market. Our model captures the essential features of corporate search in multiple markets.

We identify three logically distinct approaches to the design of trade-offs between higher-level search (market choice) and lower-level search (business refinement). The first approach commits the decision-maker to a fixed balance between corporate exploration and exploitation under the assumption that such a balance exists and can be identified.

The second approach splits the search process into two distinct stages. In the first stage, the firm engages in corporate exploration where it gains information about opportunities in a number of new markets. In the second stage, it then switches to exploitation and focuses on improving its position within a single market. A two-stage approach is based on the premise that it is useless to strike a fixed balance between exploitation and exploration. Instead, it assumes that it is roughly possible to identify the point in time where the firm should switch from exploration to exploitation.
The third approach assumes even less knowledge about the search problem. It simply allocates a fixed budget to search and decreases search effort as the budget is depleted. This dynamic adjustment approach allows for gradually adjusting the balance based on performance feedback and the remaining budget for corporate search. The decision-maker gradually homes in on an attractive market, progressively shifting the balance toward corporate exploitation. The following section explains how these three versions of corporate search are represented in our model of search in multiple markets.

3. A model of search in multiple markets

Our model structure has three building blocks. The first establishes the nature of the task environment that a firm operates in, i.e. the market or industry; the second specifies how agents search within a particular task environment; the third establishes the strategies for searching among markets. Table 1 gives an overview of the parameters of the model. Below we elaborate on each building block and associated parameter specifications.

Parameter   Description                                        Parameter values

Generation of task environments (markets)
d           Number of markets                                  4
N_i         Number of policy attributes within                 1a) N_i = 10
            task environment i                                 1b) N_i = 50
                                                               2b) N_i = 50
K_i         Number of interactions among policy                1a) K_i = 0, 1, 5, 9
            attributes in task environment i                   1b) K_i = 0, 1, 25, 49
                                                               2b) K_i = 25, 25, 25, 25

Search within a market
Broad_S     Agents engage in local (Broad_S = 0)               0, 1
            or broad search (Broad_S = 1)

Search among markets
τ           Propensity to explore among markets                0.005, 0.1, 0.5, 1
α           Tolerance level in optimal sampling                0.2, 0.5
δ           Confidence level in optimal sampling               0.05, 0.2
β           Adjustment rate in dynamic adjustment              0.5, 1, 5
B           Financial budget                                   50
C           Entry costs                                        0, 0.1, 0.2

Table 1: Overview of model parameters

Nature of task environments

We generate a set of performance landscapes that represent separate task environments.
Each task environment can be thought of as a particular market. Task environments may differ in the number of relevant policy attributes, complexity and mean performance.2

2 The specification of more than one market therefore allows for a more penetrating analysis of exploration and exploitation as landscapes may differ in the global optimum. Even if agents have reached the global optimum in one landscape, they may still be outperformed by agents in landscapes with a higher mean performance. This essential property of exploration cannot be captured by studying just one landscape.

A task environment is characterized as a space of alternatives. In each task environment, an alternative consists of N binary attributes that represent choices regarding the business strategy within that market. The policy attributes may relate to sourcing, production, sales, support functions, etc. (e.g. Rivkin 2001). Each attribute can take on two states, so there are 2^N different business policy configurations in each landscape. The performance values of each of the N attributes are determined by random draws from a uniform distribution over the unit interval. The overall performance of a policy configuration in one landscape is the average of the values assigned to each of the N attributes.3

The attributes of a policy configuration may be more or less interdependent. Attributes are interdependent if the value of each of the N individual attributes depends on both the state of that attribute itself and the states of K other attributes. If K = 0, attributes are independent. As K increases, more and more attributes of a configuration become interdependent, with K = N – 1 being the case of interdependence among all attributes. The number of interdependencies given by K determines the surface of the performance landscape.
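The landscape construction just described can be sketched in a few lines of Python. This is a minimal illustration under assumptions the text leaves open, not the authors' implementation: the K interacting attributes are drawn at random for each attribute, contribution values are drawn lazily the first time a combination of states is evaluated (so a landscape is fixed only up to the order of first evaluation), and the names make_nk_landscape and count_local_peaks are hypothetical.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Return a function mapping n binary attributes to a performance in [0, 1]."""
    rng = random.Random(seed)
    # Assumption: attribute i depends on itself and on k other attributes chosen at random.
    influences = [
        [i] + rng.sample([j for j in range(n) if j != i], k) for i in range(n)
    ]
    # One lookup table per attribute, filled lazily so that N = 50 stays cheap.
    tables = [dict() for _ in range(n)]

    def performance(config):
        total = 0.0
        for i in range(n):
            key = tuple(config[j] for j in influences[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()  # uniform draw on the unit interval
            total += tables[i][key]
        return total / n  # average of the N attribute values

    return performance


def count_local_peaks(n, performance):
    """Count configurations that no single-attribute change can improve."""
    peaks = 0
    for config in itertools.product((0, 1), repeat=n):
        value = performance(config)
        neighbours = (
            config[:i] + (1 - config[i],) + config[i + 1:] for i in range(n)
        )
        if all(performance(other) <= value for other in neighbours):
            peaks += 1
    return peaks


if __name__ == "__main__":
    for k in (0, 7):
        landscape = make_nk_landscape(n=8, k=k, seed=1)
        print("K =", k, "local peaks:", count_local_peaks(8, landscape))
```

Enumerating a small landscape (N = 8) gives a feel for how K shapes the surface: with K = 0 the attributes are independent and there is a single peak, whereas K = 7 typically yields many.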
With higher values of K, there are more local peaks and performance differences among neighboring configurations differing only in a single policy attribute become relatively more pronounced (Kauffman 1993).

3 More formally, we specify alternatives within a particular market as consisting of N policy attributes, a_1, …, a_N. For simplicity, it is assumed that each attribute can take on two states. A performance landscape is a mapping of any possible vector of policy attributes A = (a_1, a_2, …, a_N) to performance values V(A). The value of each individual policy attribute a_i is affected by both the state of that attribute itself and the states of a number of other attributes a_{–i}. Denote the value of attribute a_i by v_i(a_i, a_{–i}). For each generated landscape, the particular value of an attribute, v_i, is determined by drawing randomly from a uniform distribution over the unit interval. Each landscape is seeded differently. The value of a given set of alternatives A is then given by V(A) = [v_1(a_1, a_{–1}) + v_2(a_2, a_{–2}) + … + v_N(a_N, a_{–N})]/N. The identity of a_{–i}, i.e., the set of attributes that affect each attribute a_i, is given by the interaction structure of the firm’s decision problem (i.e. the variable K). We assume random interdependencies among attributes. Since our studies include rather large performance landscapes (N = 50), we report raw values from the NK runs rather than normalized results.

Search within a market

In their search for a viable business strategy within a market, firms may either incrementally adapt policy configurations or try to more radically alter current positions. We associate the former with local search, the latter with broad search.4 Since the properties of local search are well known (Levinthal 1997), we use it as our baseline for search within a landscape. With local search, agents revise policy configurations by changing a randomly chosen single attribute and examining the outcome.5 If the result is an improvement of performance, the revision is implemented; if not, it is rejected.

4 Prior research has studied broader search within a market as a sequence of distant and local search (e.g. Siggelkow & Levinthal 2003). The performance properties of this combination are well known, and implementing a similar approach would increase the complexity and parameter space of our model.

5 We limit our study to perfect evaluation and do not consider the elaborations introduced by Knudsen & Levinthal (2007) relating to imperfect evaluation.

In addition to local search, we analyze a combination of local and distant search that we refer to as broad search. When firms are engaged in broad search, they widen their search perimeter to include more distant configurations when local search no longer uncovers performance improvement. This captures the idea of problemistic search in the behavioral theory of the firm (Cyert & March 1963). Initially, agents engage in local search by changing a single, randomly chosen attribute. If an agent’s performance does not increase, the agent broadens the scope of search by incrementally increasing the number of attributes that are changed.6 An agent thus begins to consider progressively more distant alternatives within a task environment when local search fails to improve performance. If a better configuration is found, the agent reverts to local search. This combination of local and broad search corresponds to empirical observations of search behavior (Fleming and Sorensen 2004; Bhardwaj et al. 2006).

6 Specifically, broad search is initially very coarse-grained and then becomes more fine-grained. The scope of search is slowly widened as long as performance does not increase. In the first pass of broad search, the probability that an attribute is mutated is initially set at 0.1 (the threshold for useful mutations when N = 10). If performance does not increase, the search scope is then increased by increments of 0.1 until it reaches the limit of 1. In the next pass, the probability that an attribute is mutated is initially set at 0.05. It is then increased by increments of 0.05 until it reaches the limit of 1. In the third pass, the incremental mutation probability is set to 0.03, etc. (with 4 passes, this procedure has reached the threshold of 0.02 for useful mutations when N = 50). Broad search thereby captures three salient features of problemistic search (Cyert & March 1963, pp. 121-22). First, search is motivated by the lack of performance increases. Second, it is simple-minded: if local search does not increase performance, the organization uses increasingly distant search. Third, search is biased, as it is constrained by the present location within a landscape.

Search among markets

Faced with multiple markets, agents need to decide whether they want to explore a new market or focus search on an existing market. Essentially, they need to identify the market with the highest expected performance and refine the policy configuration within that market.7 Thus, corporate resources are subject to the conflicting demands of exploring markets and of improving an existing business. We capture this trade-off by limiting organizational search to a fixed time period and a financial budget. The time constraint captures how agents allocate search efforts. For example, if an agent decides to search for opportunities in market X, it forgoes searching within market Y. The financial budget represents resources for explorative search (Mosakowski 1997; Adner & Levinthal 2004). Agents invest a non-recoverable cost out of the budget B every time they (re-)enter and sample a market. We assume that an agent samples one market in each time step.

7 Our conceptualization of exploration and exploitation is grounded in decision theory and the adaptive systems approach, which often consider the N-armed bandit problem as the canonical representation of the exploration and exploitation trade-off (Robbins 1952; Holland 1975; March 2003). A gambler with a finite budget faces the problem of maximizing her total payoffs. When pulled, each lever of the bandit provides a reward drawn from a distribution associated with that specific lever. Initially, the gambler has no knowledge about the levers, but through repeated trials (exploration), she can identify and focus on the most rewarding lever (exploitation).

Initial configurations in a market are randomly determined while subsequent refinements benefit from prior search. That is, whenever an agent enters a new market, the initial configuration is randomly determined. And when an agent returns from an excursion to another market, local or broad search is resumed from the configuration that was used before departure from this market.

The mean performance of each market is initially unknown, but agents are able to form subjective beliefs about the relative performance of markets (Luce & Raiffa 1957). An agent forms beliefs on the basis of knowledge generated through prior search within a market.
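The bookkeeping behind these visits can be sketched as follows. This is a hedged Python illustration, not the authors' code, and it rests on several assumptions: the entry cost C is charged on every sampled time step, the belief about a market is the running average of the performance observed there, and a K = 0 landscape with one-bit local search stands in for the NK landscapes and search modes of the model. The names Market and visit are hypothetical.

```python
import random


class Market:
    """One task environment with its own stored configuration and feedback."""

    def __init__(self, n, rng):
        self.n = n
        self.rng = rng
        # Stand-in landscape (assumption): independent attribute values, i.e. K = 0.
        self.values = [(rng.random(), rng.random()) for _ in range(n)]
        self.config = None   # no configuration until the first entry
        self.observed = []   # performance feedback gathered so far

    def performance(self, config):
        return sum(self.values[i][bit] for i, bit in enumerate(config)) / self.n

    def sample(self):
        """Spend one time step in this market and record the resulting performance."""
        if self.config is None:
            # First entry: the initial configuration is random.
            self.config = [self.rng.randint(0, 1) for _ in range(self.n)]
        else:
            # Return visit: resume from the stored configuration and try one flip.
            i = self.rng.randrange(self.n)
            trial = list(self.config)
            trial[i] = 1 - trial[i]
            if self.performance(trial) > self.performance(self.config):
                self.config = trial
        result = self.performance(self.config)
        self.observed.append(result)
        return result

    def believed_mean(self):
        """Subjective belief: average observed performance, or None if never sampled."""
        if not self.observed:
            return None
        return sum(self.observed) / len(self.observed)


def visit(market, budget, entry_cost):
    """Pay the non-recoverable cost, sample the market, return the remaining budget."""
    if budget < entry_cost:
        raise ValueError("budget exhausted")
    market.sample()
    return budget - entry_cost


if __name__ == "__main__":
    rng = random.Random(0)
    markets = [Market(10, rng) for _ in range(4)]  # d = 4 markets, N = 10
    budget = 50.0
    for _ in range(40):
        budget = visit(rng.choice(markets), budget, entry_cost=0.1)
    print(round(budget, 2), [m.believed_mean() for m in markets])
```

The demonstration loop picks markets uniformly at random only to exercise the bookkeeping; how the model actually chooses among markets is the subject of the text that follows.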
To model this relationship between past search efforts and the formation of beliefs, we turn to research on reinforcement learning (e.g. Lave & March 1975; Denrell & March 2001). Agents sample a market by engaging in the search modes described above. How much they sample a particular market depends on the performance feedback in the past. Agents spend more time in markets they, rightly or wrongly, find attractive. Sampling a more attractive market therefore gets reinforced over time. The choice of a market is also influenced by how much weight an agent puts on corporate exploration, i.e. searching for opportunities in new markets.

We use the Softmax algorithm of reinforcement learning to capture the relationship between performance feedback, sampling, and the propensity to explore markets. The Softmax algorithm, attributed to Luce (1959), provides a straightforward way to model the formation of beliefs (Sutton & Barto 1998; Denrell & Le Mens 2007; Fang & Levinthal 2009).8 The Softmax algorithm makes the probability of sampling a particular market i at time step t dependent on the observed mean performance x_i of that market, the observed mean performances of all d markets, and the propensity for corporate exploration τ:

\[ p_{i,t} = \frac{e^{x_i/\tau}}{\sum_{j=1}^{d} e^{x_j/\tau}} \tag{1} \]

8 Empirical findings in psychology and neuroscience offer some evidence that the repeated choices of human agents in uncertain environments correspond to this algorithm (Yechiam & Busemeyer 2005; Daw et al. 2006).

The parameter τ influences the degree to which an agent adheres to prior beliefs. A lower value of τ increases the probability that an agent selects a market believed to be the most attractive. Corporate exploration is curbed as the agent places a much higher emphasis on exploitation of a particular market. A higher τ downplays the role of prior beliefs and increases the probability that an agent samples and explores a different market. Hence, we label this parameter the propensity to explore.
Note that agents continually update the estimates of mean performances. The more an agent explores a market, the better the estimates of mean performances and of the relative attractiveness of markets.9

In the Softmax algorithm, the parameter τ encapsulates the propensity to engage in corporate exploration among markets. The critical question then is how agents set τ. As previously explained, we specify three alternative approaches to the determination of τ, each representing a different balancing of corporate exploration and exploitation. The first approach aims to achieve a fixed balance of corporate exploration and exploitation. The second approach considers a more sophisticated two-stage sequence of corporate exploration and exploitation, with the length of the exploration period depending on the number of possible markets. The third approach is inspired by models of adaptive search and allows agents to continually adapt the balance between corporate exploration and exploitation.

9 A numerical example might help to illustrate the Softmax algorithm. Assume that a decision-maker has to choose among three markets. Based on prior search efforts in each market, the agent estimates the means of the three markets, x_i, as 0.4, 0.5, and 0.6. The decision-maker has set τ to 1, corresponding to a fairly high propensity to engage in corporate exploration. The (rounded) probabilities of sampling a specific market in the next period are 0.30, 0.33, and 0.37 respectively. In contrast, if the agent sets τ to 0.1 (low propensity to explore), the probabilities are 0.09, 0.24, and 0.67. The probability of choosing the most attractive market is therefore much higher with a low τ. Suppose the agent samples the second market, leading to a new estimate of the mean of 0.52. In the next time step, based on the new information, the probabilities of selecting the markets are 0.30, 0.34, and 0.36 (with τ = 1) and 0.09, 0.28, and 0.63 (with τ = 0.1).
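The numerical example can be checked with a short Python sketch of the Softmax rule in Equation (1). The function name softmax_probabilities is illustrative, and subtracting the largest mean before exponentiating is a standard numerical safeguard added here, not something the paper specifies; it leaves the probabilities unchanged.

```python
import math


def softmax_probabilities(means, tau):
    """Probability of sampling each market given observed means and propensity tau."""
    # Shifting by the maximum avoids overflow when tau is very small (e.g. 0.005).
    top = max(means)
    weights = [math.exp((m - top) / tau) for m in means]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for means in ([0.4, 0.5, 0.6], [0.4, 0.52, 0.6]):
        for tau in (1.0, 0.1):
            probs = softmax_probabilities(means, tau)
            print(means, tau, [round(p, 2) for p in probs])
```

With means of 0.4, 0.5 and 0.6, the function gives roughly 0.30, 0.33 and 0.37 for τ = 1, and 0.09, 0.24 and 0.67 for τ = 0.1, which shows how a low τ concentrates sampling on the market believed to be best.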
Fixed balance of corporate exploration and exploitation

When corporate search proceeds on the basis of a fixed balance between exploration and exploitation, decision makers aim to identify and maintain the right weight between mean-enhancing and variance-enhancing search. We also refer to this search as one-stage balancing of exploration and exploitation, because decision makers once and for all set a parameter (propensity to explore τ) that determines a fixed balance between exploration and exploitation. The higher this parameter, the more the balance is pushed towards corporate exploration: agents spread search efforts across multiple markets instead of focusing on one. The lower the propensity to explore, the more the balance is pushed towards corporate exploitation: search within markets dominates search among markets.

Two-stage sequencing of corporate exploration and exploitation

With two-stage sequencing of corporate exploration and exploitation, the propensity to explore (τ) determines the mode of search. When τ is set to high values, the search mode is exploration while low values of τ are associated with exploitation (see Table 1 for specification of actual values). Two-stage search is a process where the corporation switches from exploration of multiple markets to exploitation of a single market. We capture this by switching from a high to a low value of τ. The decision maker first takes a number of samples in exploration mode (high τ) and then firmly shifts to exploitation mode (low τ). In exploration mode, τ is set so high that search among landscapes dominates search within landscapes. By contrast, in exploitation mode, the propensity to explore is set so low that search within landscapes dominates search among landscapes.

The effective sequencing of corporate exploration and exploitation depends largely on the number of markets to be searched.
As the number of prospective markets increases, more samples need to be taken before the estimates become reliable. The length of the exploration period is determined by the optimal sample size. Following Even-Dar et al. (2002), the α-optimal sample size S with probability 1 − δ is given by

$$ S = \frac{d}{\alpha^{2}} \log\left(\frac{d}{\delta}\right), \qquad (2) $$

with d the number of performance landscapes and log the natural logarithm. The parameter α represents the tolerance of the agent, and δ the confidence level, expressed as the probability of error (so that 1 − δ is the confidence attained). The two parameters, α and δ, are the agent’s choice parameters. If a more precise and reliable estimate is desired, the agent must spend more time on corporate exploration. The sample size S determines the time step at which an agent switches from the exploration stage (high τ) to the exploitation stage (low τ). In essence, agents build up a broad stock of knowledge about the task environments (corporate exploration with high τ) and then proceed to refine this knowledge (corporate exploitation with low τ).

How exactly is the sample size S determined? Agents choose a tolerance level α and a confidence level δ. The tolerance level captures the deviation from the best possible market. With a tolerance of α = 0.20, decision makers accept a market that has 20% less performance than the (unknown) best possible market. The confidence level δ captures the risk of accepting a bad alternative that falls below the threshold set by the tolerance level. With δ = 0.05, that is, a confidence of 95%, decision makers will accept a bad alternative with probability 0.05. That is, the alternatives they accept will be worse than the limit set by the tolerance level. With a tolerance of α = 0.20, 5% of the accepted alternatives will have performance that deviates more than 20% from the best possible. These two sensitivity parameters (α and δ) and the number of performance landscapes to be searched (d) jointly determine the total number of samples S that the decision maker allocates to corporate exploration.
The lower the tolerance level α, the higher the confidence (that is, the lower δ), and the more landscapes d, the larger the sample size.

Dynamic adjustment of corporate exploration and exploitation

Alternatively, firms may dynamically adjust the balance of corporate exploration and exploitation over time. In actual practice, there are a number of reasons why firms may prefer such a strategy. The optimal sample size may be unrealistically high from a practical perspective, or the number of alternatives may be unknown. In addition, agents may benefit from continually updating the balance instead of sticking to a fixed sequence of exploration and exploitation. We consider a simple behavioral rule that agents use to adjust the propensity to explore τ in the Softmax algorithm. The behavioral rule is inspired by models of adaptive organizational search (Levinthal & March 1981; Greve 2003). It allows agents to adapt the balance between exploration and exploitation based on performance feedback. They decrease exploration when they have been successful in the past (cumulated performance W) and increase exploration if they have a large budget B:

$$ \tau = \beta \, \frac{B}{W}. \qquad (3) $$

As the budget gets depleted, the propensity to explore τ is reduced and corporate exploration is gradually toned down. The parameter β is a scale parameter that tempers the exploration phase. A higher value of β gives more pronounced corporate exploration, i.e. firms search more among markets. Thus, a high β induces a dynamic that mimics a two-stage sequencing of exploration and exploitation. With lower values of β, the initial stage of exploration will endure longer, but be less pronounced. As β approaches zero in the limit, agents become trapped in the market to which they were initially assigned at random.

4. Results

For the simulation runs, we have restricted our analysis to four markets (d = 4) that a firm may locate in.¹⁰ We study both relatively simple task environments with few important policy attributes (N = 10) and more challenging markets with a large number of important policy attributes (N = 50). In markets with few policy attributes, the manager may choose among 4096 alternatives distributed across the four landscapes (2^10 = 1024 alternatives in each). In markets with many policy attributes, managers face the daunting task of considering about 4.5 × 10^15 (4 × 2^50) alternatives. We consider two principal scenarios. In the first scenario, the markets only differ in terms of complexity (the K parameter).¹¹ For simple task environments (N = 10), we examine complexity levels with K = 0, 1, 5, 9. For markets with many policy attributes (N = 50), we use values of K = 0, 1, 25, 49. In the second scenario, the markets differ in mean performance, while complexity is the same across markets.¹² We consider markets of medium complexity, with K = 5 (for N = 10) and K = 25 (for N = 50).

¹⁰ Our results may be extrapolated to any number of markets by adjusting the sample size in the two-stage sequencing according to equation (2). For dynamic adjustment, the effect can be gauged by reference to equation (3).

¹¹ The comparison of markets differing in complexity is attractive, since performance landscapes systematically differ depending on the value of the K parameter (Kauffman 1993). With K = 0, only a single performance peak exists, which corresponds to the global optimum. As K increases, the number of local peaks increases exponentially. The mean performance of local peaks is highest for low levels of complexity. However, maximally complex markets (K = N − 1) contain the highest performing peak.

¹² The spread in mean performances among landscapes is implemented by adding a fixed value (0.1, 0.2, 0.3, respectively) to the raw fitness value generated by the standard NK model.
Each market is randomly seeded. The results report the average of 100 simulation runs with 100 individual agents and for 1000 time steps. Unless specified otherwise, we assume a budget of 50 and entry costs of 0.1.¹³ We study the effectiveness of the three approaches to corporate exploration outlined in the modeling section for these two scenarios. We allow for both local and broad search within a market. We first compare a fixed balance with the two-stage sequencing for markets that differ in complexity. We then evaluate the effectiveness of the dynamic adjustment of corporate exploration for the same scenario. Having established the basic properties of the three approaches, we proceed to analyze them for the second scenario in which markets differ in mean performance.

Fixed balance and two-stage sequencing compared

We examine four representative strategies with a fixed balance between corporate exploration and exploitation. The first is pushed towards the extreme pole of exploitation, with parameter τ set to a very low value of 0.005. In the second, τ is set to 1, so that exploration dominates. The remaining strategies with a fixed balance fall between these two extremes (τ = 0.1 and 0.5). In two-stage search, the propensity to explore is again set to 1 during the exploration stage and then, after a total of S samples, the agent shifts to corporate exploitation by sharply reducing the propensity to explore to 0.005. The values of α (tolerance level) and δ (confidence level) were set to 0.2 and 0.05, respectively. That is, the agent attempts to locate in a market that has at most 20 percent less performance than the best possible. The probability of accepting a market that does not meet this threshold is just 5 percent at the end of the exploration stage. For optimal sampling (equation 2), the agent therefore allocates 438 samples to corporate exploration and then switches to exploitation within the chosen market (sensitivity and robustness are discussed below).
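The figure of 438 samples can be reproduced directly from equation (2). The sketch below assumes that the logarithm in equation (2) is the natural logarithm, which is the reading under which d = 4, α = 0.2 and δ = 0.05 give 438; the function name `exploration_sample_size` is illustrative and not taken from the paper.

```python
import math

def exploration_sample_size(d, alpha, delta):
    """Equation (2): S = (d / alpha^2) * ln(d / delta)."""
    return (d / alpha ** 2) * math.log(d / delta)

# Baseline calibration used in the simulations: d = 4, alpha = 0.2, delta = 0.05.
s_baseline = exploration_sample_size(d=4, alpha=0.2, delta=0.05)
print(round(s_baseline))

# A looser confidence requirement (delta = 0.2) shortens the exploration stage.
s_relaxed = exploration_sample_size(d=4, alpha=0.2, delta=0.2)
print(round(s_relaxed, 1))
```

Because S grows with 1/α² but only logarithmically in 1/δ, tightening the tolerance lengthens the exploration stage far more than tightening the confidence level does.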
¹³ The absolute value of these parameters is unimportant. The entry costs set a scale for the budget, so it is the calibration of entry costs relative to the budget that matters. The actual values were chosen on the basis of comprehensive additional simulations. These simulations also showed that our results are not a knife-edge property. They hold for a wide range of parameter values.

Figure 1: Two-stage search compared to one-stage search for markets with few policy attributes (d = 4, N = 10, K = 0, 1, 5, 9, B = 50, entry cost = 0.1, α = 0.2, δ = 0.05, τ = 0.005, 1). [Plot of mean performance against time (0 to 1000). Series: 1-stage exploitation with broad search; 1-stage exploration with broad search; 2-stage sequence with broad search; 2-stage sequence with local search.]

Figure 1 compares the average performance of one-stage search and two-stage search for environments with few policy attributes (N = 10). Pure exploitation in one-stage search is characterized by rapid initial performance increases. Firms that exclusively focus on corporate exploitation tend to become stuck in the first market they decided to locate in and then single-mindedly focus on improving their policy configuration. However, after the initial period of rapid improvements, gains from single-minded exploitation level off. In contrast, with pure exploration in one-stage search, performance improves steadily as agents slowly focus search efforts on more attractive markets. Explorative search starts to exhaust itself after 650 time steps, as agents deplete their budget and are forced to firmly locate in one market. The result is a small increase in mean performance, since agents are then able to locate in more attractive markets. The average performance of the two intermediate one-stage search strategies (τ = 0.1 and 0.5, not reported in Figure 1) falls between these two extremes.
Overall, a fixed balance pushed toward exploitation shows better performance than one-stage search favoring more exploration. Furthermore, the model exhibits the common bias towards exploitation (March 1991), as the initial performance gains from exploitation within a market are much higher than gains from exploration.

The two-stage search strategy produces a much smaller initial improvement than one-stage search in the exploitation mode. However, after the exploration stage has been concluded, the shift towards exploitation of one market is associated with a substantial increase in performance. This dramatic gain is achieved because of the knowledge acquired in the exploration stage about the relative attractiveness of markets. In stage two, agents use this knowledge to locate in the market perceived to be the most attractive one. Within that market, they start from the policy configuration they have so far identified as the best. When the refinement of a configuration in the two-stage strategy is achieved by local search, most of the gains are already realized in the exploration stage. After settling in the most attractive market, there are no further gains in performance during the exploitation stage. If agents are capable of broad search, performance continues to increase in the exploitation stage. What is striking is that the two-stage model, even if it is limited to local search, clearly outperforms the one-stage search strategies based on a combination of local and distant search. More generally, Figure 1 demonstrates that the two-stage sequencing of exploration and exploitation clearly dominates a fixed balance in terms of both cumulated performance and mean performance achieved at the end of the simulation run.

The same general result holds for markets with many policy attributes (N = 50). The relative advantages of two-stage search become even more pronounced (Figure 2).
A perhaps surprising result is that broad search within a market decreases performance when compared to local search.¹⁴ That is, the ability to search more distant configurations has a negative impact on the effectiveness of search within a market where a firm has to get many policy configurations right. The reasons for this are two-fold. First, as markets differ with respect to complexity (interdependencies among policy attributes), the majority of agents locate in simple landscapes (K = 0). In that case, local search leads agents towards the global optimum. As can be readily observed in Figure 2, mean performance continues to increase slowly at the end of the simulation run. Second, the number of possible policy configurations is so huge that it takes a long time to reach a local peak even in a complex landscape (K > 0). Agents must go through many variations of the policy configuration before getting stuck.

¹⁴ We do not report the performance properties of the fixed balance strategies with local search. They are slightly lower with local search vis-à-vis the same fixed balance with broad search and significantly lower vis-à-vis the two-stage sequencing with local search. The results are available from the authors upon request.

Figure 2: Two-stage search compared to one-stage search for markets with many policy attributes (d = 4, N = 50, K = 0, 1, 25, 49, B = 50, entry cost = 0.1, α = 0.2, δ = 0.05, τ = 0.005, 1). [Plot of mean performance against time (0 to 1000). Series: 1-stage exploitation with broad search; 1-stage exploration with broad search; 2-stage sequence with broad search; 2-stage sequence with local search.]

Broad search, on the other hand, distracts agents. They start to broaden search as soon as local search does not increase performance. This does not, however, mean that they have already reached a local peak.
In markets where firms must consider a large number of policy configurations, broad search leads agents astray, because early in the search process the probability of finding a more distant configuration with higher performance is substantially lower than that of finding one in the proximate neighborhood. Agents engaged in broader search thus spend more time searching for a performance-increasing configuration than decision-makers firmly focused on locally improving a policy configuration. This is problematic if the period of corporate exploitation is relatively short. Mean performance thereby decreases during the exploration stage. In general, these two effects imply that broader search becomes more attractive when there are fewer important policy attributes. When it is possible to focus on relatively few policy attributes (ten in the present analysis), broad search can be advantageous. This happens when most policy attributes are interdependent and if the firm allocates a good deal of time to broad search within each market.

To conclude, the findings provide powerful support for the conjecture that two-stage sequencing of corporate exploration and exploitation dominates a fixed balance. One-stage search pushed towards exploitation is a high-risk strategy that may benefit a few lucky agents, but the majority gets stuck in inferior markets. In contrast, one-stage search in exploration mode forgoes the potential performance benefits of focusing search efforts on a particular market. A two-stage search strategy combines the advantages of these two extremes, while excluding the liabilities. Agents first form expectations about relative performance based on an optimal sample size during corporate exploration. They then proceed to corporate exploitation by firmly focusing on the market identified as the most attractive one.
Robustness of two-stage sequencing: Lock-ins and adjustments of the exploration stage

The effective balancing of corporate exploration and exploitation in a two-stage search process critically depends on aligning the budget with sample size. If the budget does not support a sufficiently long period of exploration, agents get locked into a market prematurely. Performance suffers. Still, the impact of a lock-in is not as pronounced as one might expect and, despite the lock-in, agents perform better than with a comparable one-stage search strategy. With broad search, a longer period of sustained corporate exploitation within a market compensates for insufficient corporate exploration. If agents can only engage in local search, the performance implications of a lock-in critically depend on market complexity. In less complex markets, local search provides ample opportunities for improving performance. In more complex environments, however, a lock-in reduces performance more sharply, since agents quickly reach a local peak and do not benefit from more extended exploitation within a market.

Figure 3: Two-stage search with different confidence levels (d = 4, N = 10/50, local search, B = 50, entry costs = 0.1, α = 0.2, δ = 0.05, 0.2, τ = 0.005, 1). [Plot of mean performance against time (0 to 1000). Series: N = 10, confidence level (δ) = 0.05; N = 10, confidence level (δ) = 0.20; N = 50, confidence level (δ) = 0.05; N = 50, confidence level (δ) = 0.20.]

To prevent a lock-in, agents may reduce the sample size in the exploration stage by adjusting the confidence or the tolerance level. Reducing the confidence level to 80% (δ = 0.2) decreases the length of the exploration stage to under 300 time steps. However, agents are still able to differentiate among markets and achieve large performance gains after completion of the exploration stage, even if they only engage in local search within a market (Figure 3).
With less time allocated to corporate exploration, agents may compensate during the subsequent exploitation stage by further improving their position within a particular market. The overall effect critically depends on the performance differences among markets and the relative contribution of exploitation within a market. In markets with few important policy attributes (N = 10), most of the performance gains are already realized in the exploration stage. Figure 3 reports that a longer, more reliable exploration stage (barely) outperforms a shorter period of corporate exploration. However, the small increase in mean performance during the exploitation stage comes at the huge price of forgone performance gains during the extended exploration stage. Markets with many policy attributes (N = 50) exhibit more improvement opportunities, a property that might justify a shorter, less reliable exploration stage, since agents can effectively compensate by sustained improvements within a market. These markets offer so many possibilities that it is apparently better to briefly take aim and then go for the most promising market.

The same basic effects may be observed when the duration of corporate exploration is adjusted by the tolerance level α. Yet, curtailing the exploration stage too much significantly decreases performance.¹⁵ As we found before, exploitation is only an imperfect substitute for exploration, even if markets just differ in complexity. Overall, the viability of two-stage search depends on calibration of the budget, the tolerance, and the confidence level. It outperforms a fixed balance one-stage search strategy across a wide range of parameter values. As we shall see, dynamic adjustment can match a well-calibrated two-stage search procedure, but with far fewer demands on getting the parameters “right”.

Two-stage sequencing compared to dynamic adjustment

In the following, we compare two-stage sequencing with dynamic adjustment of corporate exploration.
Two-stage sequencing is characterized by a first stage of intensive exploration among markets followed by a second stage of focused exploitation of a market. In contrast, dynamic adjustment gradually reduces the propensity to explore based on performance feedback and the remaining budget. It thereby keeps the door open to higher levels of corporate exploration in later time periods.

¹⁵ The results are available from the authors upon request.

Figure 4a: Two-stage sequencing compared to dynamic adjustment in markets with few policy attributes (d = 4, N = 10, K = 0, 1, 5, 9, B = 50, entry costs = 0.1, α = 0.2, δ = 0.05, β = 1, τ = 0.005, 1). [Plot of mean performance against time (0 to 1000). Series: dynamic adjustment with broad search; dynamic adjustment with local search; 2-stage sequencing with broad search; 2-stage sequencing with local search.]

Figure 4b: Two-stage sequencing compared to dynamic adjustment in markets with many policy attributes (d = 4, N = 50, K = 0, 1, 25, 49, B = 50, entry costs = 0.1, α = 0.2, δ = 0.05, β = 1, τ = 0.005, 1). [Plot of mean performance against time (0 to 1000). Series as in Figure 4a.]

Figure 4 compares the performance characteristics in markets with few (N = 10, Panel a) and many (N = 50, Panel b) policy attributes. It is evident that the two strategies have very different characteristics. In general, mean performance grows faster with dynamic adjustment, but two-stage sequencing catches up with dramatic gains in the transition to the exploitation stage. The most striking observation relates to the effectiveness of the two-stage search strategy. Two-stage sequencing based on optimal sampling does not systematically outperform the simple behavioral rule underlying dynamic adjustment.
Rather, the mean performance at the end of the simulation is only slightly higher with two-stage search. This marginal increase in final mean performance, however, comes at the price of much lower performance gains during the exploration stage. Hence, dynamic adjustment often outperforms two-stage sequencing in terms of accumulated performance over the entire simulation run. This effect is particularly pronounced in markets with few policy attributes (Figure 4a). When there are many policy attributes to be considered (Figure 4b), two-stage sequencing is superior, especially in terms of accumulated performance. With dynamic adjustment, “keeping the door open” to exploration distracts from realizing incremental gains in a huge space of possibilities, leading to lower performance in the mid-term of the search process. The firm commitment to sustained exploitation in two-stage sequencing here helps agents to realize incremental performance gains by focusing on improving a policy configuration within one market.

The performance of both search strategies also critically depends on whether agents engage in local or broader search. First, engaging in broad search comes at the cost of lower mean performance at the beginning of the search process, regardless of the chosen search strategy. This finding applies to markets with few (N = 10) and many (N = 50) policy attributes. Second, over the longer run, broad search leads to higher mean performance in markets where few policy attributes must be considered, but it decreases mean performance and accumulated performance when there are many policy attributes. These results substantiate our finding that broad search may be a mixed blessing, since it depresses mean performance during corporate exploration. In corporate exploitation, when a multitude of policy configurations offers vast opportunities, it distracts agents from incrementally improving a configuration and realizing more proximate gains.
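The behavioral rule behind dynamic adjustment, equation (3), can be illustrated with a short sketch. The following Python code is not the authors' simulation: it assumes that B is the remaining budget, that an entry cost is deducted whenever the agent enters a market, and that W is the sum of payoffs received so far. The constant payoff of 0.6 and the assumption that the agent enters a market in every period are hypothetical choices made only to show how τ falls over time.

```python
def propensity_to_explore(beta, budget, cumulated_performance):
    """Equation (3): tau = beta * (B / W)."""
    if cumulated_performance <= 0:
        # Assumption: the rule is only evaluated once some payoff has accrued.
        raise ValueError("cumulated performance W must be positive")
    return beta * budget / cumulated_performance

def tau_trajectory(beta, budget, entry_cost, payoffs, entries):
    """Track tau period by period for given payoffs and market-entry decisions."""
    taus = []
    cumulated = 0.0
    for payoff, entered in zip(payoffs, entries):
        if entered:
            # Hypothetical accounting: each market entry draws on the budget.
            budget = max(budget - entry_cost, 0.0)
        cumulated += payoff
        taus.append(propensity_to_explore(beta, budget, cumulated))
    return taus

# Illustrative run: budget 50, entry cost 0.1, beta = 1, constant payoff 0.6,
# and an agent that enters a market in every one of 600 periods.
taus = tau_trajectory(beta=1.0, budget=50.0, entry_cost=0.1,
                      payoffs=[0.6] * 600, entries=[True] * 600)
print(round(taus[0], 2), round(taus[99], 2), round(taus[-1], 2))
```

Under these assumptions τ starts high, falls as performance accumulates and the budget shrinks, and reaches zero once the budget is exhausted, which is the sense in which the budget acts as a commitment device.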
Robustness of dynamic adjustment: Budget adjustments

Overall, the performance characteristics of the dynamic adjustment of corporate exploration are quite remarkable when compared to those for two-stage search. In particular, the dynamic adjustment model is more robust to changes in the parameters.¹⁶

Figure 5: Dynamic adjustment and budget commitment (d = 4, N = 10/50, local search, B = 50, entry costs = 0, 0.1, β = 1). [Plot of mean performance against time (0 to 1000). Series: N = 10, budget constraint; N = 10, no budget constraint; N = 50, budget constraint; N = 50, no budget constraint.]

The success of dynamic adjustment critically depends on a firm commitment to a fixed budget that is not replenished during the search process. Otherwise, agents may engage in excessive exploration. To study the effects of a flexible budget, we lifted the budget constraint by setting entry costs to zero. The balance between corporate exploration and exploitation is still adjusted by performance feedback, but the adjustment is now more gradual, as exploration is not lowered by a declining budget. The impact on the effectiveness of search is quite pronounced (Figure 5). When the budget constraint is lifted from the dynamic adjustment model, there is a dramatic decline in performance as agents engage in excessive exploration among markets. Thus, a hard budget constraint is a precondition for the effectiveness of the dynamic adjustment of corporate exploration.

¹⁶ The results for the dynamic adjustment model were quite robust to variations of parameter β. With a high value of β, firms begin with intensive exploration among markets. As the value of β is lowered, this tendency weakens. The results for the dynamic adjustment model reported here assume a value of β set to 1. We tested the robustness of these results by setting β to 0.1, 0.5 and 5. With 0.1, agents do not spend enough time searching among markets, leading to a decline in overall performance. No performance differences were observed when β was varied between 0.5 and 1. Even with a very high β (= 5), performance did not decline significantly. In addition, we also tested the impact of budget size by reducing the budget to 25 (a 50% cut). Performance varied only slightly, and the smaller budget sometimes even outperformed the larger one. Obviously, the budget does have to be sufficiently high to allow for some corporate exploration and low enough to prevent excessive exploration. Overall, performance was remarkably robust to changes in β and the size of the budget.

Search among markets that differ in mean performance

Figure 6: Comparison of three approaches when markets differ in mean performance (d = 4, N = 50, K = 25, local search, entry costs = 0.1, α = 0.2, δ = 0.05, β = 1, τ = 0.005, 0.1, 1). [Plot of mean performance against time (0 to 1000). Series: 1-stage search (τ = 0.005); 1-stage search (τ = 0.1); 2-stage sequencing; dynamic adjustment.]

So far, we have only considered markets that differ in the number of relevant policy attributes and in the number of interdependencies among those attributes (complexity). We have not yet considered markets that differ in mean performance. We turn to this important case now. The effects on corporate exploration are two-fold. On the one hand, corporate exploration becomes more critical. Since performance differences among markets are more pronounced, it is even more important to locate in an attractive market where firms are rewarded with higher revenue. Agents in a poor-performing market can only partially compensate by more intensive corporate exploitation. On the other hand, large performance differences make corporate exploration less demanding. It becomes easier to differentiate among markets and to separate the wheat from the chaff.

For the following analysis, we consider markets that have the same complexity, but differ sharply in mean performance.
Specifically, we consider markets with a mean performance of 0.5, 0.6, 0.7 and 0.8, respectively. Figure 6 reports the effectiveness of all three approaches to corporate exploration in complex markets (K = 25) with many relevant policy attributes (N = 50). For the fixed balance, we show the results for low propensities to explore (τ = 0.005 and τ = 0.1), the latter being the best-performing strategy with a fixed balance. It is evident that eschewing corporate exploration entirely and just randomly picking a market is a much less effective strategy under the condition of large performance differences among markets. Again, the remarkable result here is the performance of the dynamic adjustment strategy, which clearly outperforms two-stage search in terms of accumulated performance and matches it in terms of final performance.¹⁷ In markets with few policy attributes (N = 10, K = 5), we find similar results, with a slight advantage of two-stage sequencing in terms of final performance.

5. Discussion and conclusions

A classic problem in strategic decision-making is the allocation of resources to the two conflicting demands of diversifying into new markets (corporate exploration) and of focusing on existing lines of business (corporate exploitation). Corporate exploration takes a firm into new markets. In contrast, corporate exploitation is concerned with the refinement and strengthening of the competitive position within a market. Firms therefore search for competitive advantages on two distinct, but interrelated levels. Searching a market also reveals useful information about how to compete. The inherent problem of balancing corporate exploration and exploitation is avoiding the pitfalls of the success trap and of the failure trap. Both traps are rooted in the antagonistic forces of immediate feedback from corporate exploration and exploitation. Initial positive feedback drives out exploration, while negative feedback undercuts exploitation.
We built an agent-based simulation model to capture the salient features of this decision problem and analyze how to benefit from corporate exploration.

¹⁷ For robustness we also tested the influence of broad search. Again, broad search depressed mean performance during corporate exploration. In markets with few attributes, broad search outperforms local search toward the end of the simulation run.

We analyzed three different approaches to searching among multiple markets. The first approach commits the decision-maker to a fixed balance between corporate exploration and exploitation. The second strategy splits the search process into two stages. The firm initially engages in corporate exploration and then firmly switches to the exploitation of one market. The decision-maker commits to a fixed period of intensive corporate exploration among markets and then to sustained exploitation of one market in the second stage. Third, dynamic adjustment allows for gradually adjusting the balance based on performance feedback and the remaining budget for explorative search. Here, the commitment to a fixed budget is critical. The firm gradually homes in on an attractive market, progressively shifting the balance toward exploitation. Yet, dynamic adjustment allows for more intensive exploration among markets even in the late stages of organizational search. Table 2 summarizes our results.

Our main finding firmly establishes the poor performance characteristics of a fixed balance between corporate exploration and exploitation. The two-stage strategy and the dynamic adjustment approach consistently outperform all variants of the fixed balance. This baseline result is far from trivial, since a fixed balance seems to hold some advantages. In particular, a balance pushed toward corporate exploitation allows for an intensive period of exploration early on, since there are no apparent differences among markets.
As soon as performance differences among markets become apparent, the most attractive market is chosen with a high probability and firms only seldom venture into new markets to reinforce their beliefs. Our results demonstrate that this approach fails to tame the antagonistic forces of exploration and exploitation. By contrast, the two-stage search strategy copes with the antagonistic forces by dividing the search process into two distinct stages. The commitment to an extended period of exploration prevents agents from falling prey to the success trap, that is, from settling in a market too early in the search process. The commitment to a switch toward sustained exploitation counters the allure of the failure trap and protects agents from excessive exploration. In the dynamic adjustment approach, the commitment to a fixed budget combined with performance feedback allows for intensive exploration early on to prevent the success trap, while progressively adjusting the balance toward exploitation to escape the failure trap. Thus, both approaches systematically tame the antagonistic forces by entering organizational commitments, although they achieve this very differently.

More strikingly, the simple behavioral rule underlying the dynamic adjustment matches and sometimes even outperforms the two-stage sequencing based on optimal sampling. Two-stage sequencing often does marginally better in terms of final performance at the end of the simulation run, but this small gain comes at the cost of much lower performance during the extended exploration stage. Essentially, this brings into sharp contrast the problems inherent in the two-stage model. Effective two-stage sequencing puts much higher demands on the cognitive abilities of the decision-maker. For effective sampling in the exploration stage, she must decide on reasonable confidence and tolerance levels and find the right propensity to explore for both stages of organizational search.
| | Few policy attributes (N = 10) | Many policy attributes (N = 50) |
|---|---|---|
| Markets differ in complexity, not in mean performance | a) Dynamic adjustment > two-stage > fixed balance; b) Broad search > local search | a) Two-stage > dynamic adjustment > fixed balance; b) Local search > broad search |
| Markets differ in mean performance, not in complexity | a) Dynamic adjustment > two-stage > fixed balance; b) Broad search > local search | c) Dynamic adjustment > two-stage > fixed balance; d) Local search > broad search |

Table 2: Summary of main results in terms of accumulated performance at the end of the simulation run

Even though the results are fairly robust to variations in the propensity to explore, changes in tolerance and confidence levels have significant performance implications. If set too low, exploration may fail to identify a lucrative market. If set too high, the organization spends too much time exploring markets for very little gain. In addition, the budget must be tuned to the exploration stage to prevent a premature lock-in to a market. The dynamic adjustment model is more robust. The manager specifies a hard budget constraint for explorative search and stimulates a high initial level of corporate exploration. If in doubt, the initial level of corporate exploration should be exaggerated.

Our second set of results relates to the interactions between search on two levels and the performance properties of local and broad search within markets. First, sustained exploitation within a market is, on average, an imperfect substitute for exploration among markets. Second, the ability to search more broadly is a mixed blessing. If firms cannot readily tell whether they have reached a local peak, broad search may distract them from making proximate gains by local search. Performance suffers, since firms start to broaden search as soon as initial attempts at local improvements turn out to be unsuccessful.
Thus, broad search becomes more attractive the smaller and the more complex a market is and the more time an agent spends within a market. Third, independent of the characteristics of the markets, broad search leads to substantial losses in mean performance during corporate exploration. Engaging in more explorative search on both the corporate and the business level confounds learning experiences. A possible solution is to manage local and broad search more carefully. During corporate exploration, firms should constrain business development within a market to local improvements, even if performance does not increase immediately. In corporate exploitation, broadening search should only be attempted if local improvements consistently failed to increase performance (as the latter reinforces the impression of having reached a local peak). The two-stage sequencing and the dynamic adjustment approach have very different organizational implications. Two aspects seem to be especially relevant: managing the transition from corporate exploration to exploitation, and organizational commitments. Regarding the first aspect, prior research shows that exploration and exploitation call for different organizational structures (e.g. Tushman & O’Reilly 1997). The two-stage sequencing approach assumes a clear and swift organizational transition from an organization geared toward exploration to one squarely focused on exploitation. This appears to stand in stark contrast to research highlighting the problems of radical organizational change and to prior literature pointing to a gradual shift toward exploitation (e.g. March 1991; Levinthal & March 1993). The latter point addresses the issue of maintaining a high level of exploration through various organizational instruments, while the former point may be addressed by the organizational separation of exploration and exploitation. 
That is, the organization needs to consist of two specialized organizational units for exploration and exploitation (Tushman & O’Reilly 1997). The units must be ordered hierarchically: after the completion of corporate exploration, the exploration unit directs the exploitation unit toward the most attractive market and policy configuration therein. Thus, two-stage sequencing essentially calls for a more centralized approach toward corporate exploration. The dynamic adjustment approach, on the other hand, does not suffer from the same limitations. The gradual decline in corporate exploration implies incremental organizational change toward a more stable organizational structure. Thus, corporate exploration and exploitation are managed within the same business unit. This corresponds to a more decentralized solution, in which the business unit is delegated a fixed budget for explorative search among markets.

The second aspect concerns organizational commitment to either a fixed time period or a budget for explorative search. This raises the problem of establishing credible managerial commitments not to intervene in a delegated task (Baker, Gibbons & Murphy 1999; Foss 2005). If the task of corporate exploration is delegated to a subordinate, the principal can always intervene or retract the delegated decision from the agent. Incentive problems may ensue. This seems to be a relevant organizational concern for implementing the two-stage and the dynamic adjustment approach. Without offering a fully-fledged analysis, a few remarks can be derived from our results. The two-stage approach appears to be more prone to incentive conflicts than the dynamic adjustment model. First, two-stage sequencing offers more room to renege on the agreed time period and the budget, making it harder for outsiders to detect the breach of an informal contract. Second, mean performance stays flat during much of the exploration stage.
This makes performance-based incentives a blunt tool for motivating the agent for exploration, a task notoriously hard to measure. Providing incentives based on performance after the completion of exploration also creates problems, as the firm might bring in a new manager for the exploitation stage. This suggests that delegating the task of corporate exploration is harder to accomplish, again calling for a more centralized solution for two-stage search. Dynamic adjustment provides a more straightforward way to manage organizational commitments. First, the informal contract only relates to a fixed budget, usually highly visible throughout the organization. Second, dynamic adjustment is characterized by steadily increasing performance, so that the incentives for an agent may be based on a combination of current and long-term performance. Again, dynamic adjustment tends to favor a decentralized approach to corporate exploration. These aspects (how incentives influence organizational search and adaptation) need more attention in future research (cf. Nickerson & Zenger 2003; Siggelkow & Rivkin 2005; Ethiraj & Levinthal 2008).

Our model admits various limitations that may inspire further research. A limitation of the current model is that agents may locate only in a single market. An interesting extension of the model could allow firms to spread their investments across more or less related industries through corporate diversification. The model can therefore be extended to study the evolution of corporate diversification more fully. Another avenue of research might consider the dynamic emergence of new markets or introduce relative performance shifts. This could, in a stylized way, capture the essence of industry dynamics and technological development, since markets would evolve through a period of growth and stagnation. An interesting extension would be to introduce competitive dynamics among firms (cf.
Lenox, Rockart & Lewin 2006; Knudsen, Levinthal & Winter 2009).

To conclude, our results suggest that firms may benefit from corporate exploration by making organizational commitments. In strategy-making, corporate exploration and exploitation unfold on two distinct but interrelated levels. On the level of corporate strategy, firms search among markets (corporate exploration and exploitation). Within a market, firms explore and exploit competitive positions (business-level exploration and exploitation). On the corporate level, a firm needs to commit itself to a fixed time period of sustained exploration or to a hard budget constraint when searching prospective new markets. On the business level, performance gains may be achieved by a commitment to local improvements during the early stages of organizational search. We hope this contribution will benefit both research and practice by directing attention to these issues, providing a modeling structure with which they can be examined in a systematic way, and providing a first set of robust results that point to the advantages of two-stage sequencing and the dynamic adjustment of corporate exploration and exploitation. Corporate search is an important unexplored topic, and we hope to have stimulated consideration of it in our theories of organizational search.

6. References

Adner, R., D.A. Levinthal. 2004. What is not a real option: Considering boundaries for the application of real options to business strategy. Acad. Man. Rev. 29(1) 74-85.
Baker, G., R. Gibbons, K.J. Murphy. 1999. Informal authority in organizations. J. of Law, Econom., and Organ. 15(1) 56-87.
Bhardwaj, G., et al. 2006. Continual corporate entrepreneurial search for long-term growth. Man. Sci. 52(2) 248-261.
Benner, M.J., M.L. Tushman. 2003. Exploitation, exploration, and process management: The productivity dilemma revisited. Acad. Man. Rev. 28(2) 238-256.
Burgelman, R.A. 2002. Strategy as Vector and the Inertia of Coevolutionary Lock-in. Admin. Sci. Quart. 47 325-357.
Carley, K.M., M. Svoboda. 1996. Modeling organizational adaptation as a simulated annealing process. Sociological Methods & Res. 25(1) 138-168.
Chandler, A.D. 1962. Strategy and structure: chapters in the history of the industrial enterprise. MIT Press: Cambridge.
Cyert, R.M., J.G. March. 1963. A Behavioral Theory of the Firm. Prentice-Hall: Englewood Cliffs.
Daw, N.D., J.P. O'Doherty, P. Dayan, B. Seymour, R.J. Dolan. 2006. Cortical substrates for exploratory decisions in humans. Nature 441 876-879.
Denrell, J., J.G. March. 2001. Adaptation as Information Restriction: The Hot Stove Effect. Organ. Sci. 12(5) 523-538.
Denrell, J., C. Fang, S.G. Winter. 2003. The Economics of Strategic Opportunity. Strat. Man. J. 24(10) 977-990.
Denrell, J., G. Le Mens. 2007. Interdependent Sampling and Social Influence. Psych. Rev. 114(2) 398-422.
Eisenmann, T.R., J.L. Bower. 2000. The entrepreneurial M-form: Strategic integration in global media firms. Organ. Sci. 11(4) 348-355.
Ethiraj, S.K., D.A. Levinthal. 2009. Hoping for A to Z While Rewarding Only A: Complex Organizations and Multiple Goals. Organ. Sci. 20(1) 4-21.
Even-Dar, E., S. Mannor, Y. Mansour. 2002. PAC Bounds for Multi-Armed Bandit and Markov Decision Processes. 15th Conf. on Comp. Learning Theory (COLT) 255-270.
Fang, C., D.A. Levinthal. 2009. Near-Term Liability of Exploitation: Exploration and Exploitation in Multistage Problems. Organ. Sci. 20(3) 538-551.
Fleming, L., O. Sorenson. 2001. Technology as a complex adaptive system: evidence from patent data. Res. Policy 30(7) 1019-1039.
Fleming, L., O. Sorenson. 2004. Science as a map in technological search. Strat. Man. J. 25(8) 909-928.
Foss, N.J. 2003. Selective Intervention and Internal Hybrids: Interpreting and Learning from the Rise and Decline of the Oticon Spaghetti Organization. Organ. Sci. 14(3) 331-349.
Gavetti, G. 2005. Cognition and Hierarchy: Rethinking the Microfoundations of Capabilities’ Development. Organ. Sci. 16(6) 599-617.
Gavetti, G., D.A. Levinthal, J.W. Rivkin. 2005. Strategy making in novel and complex worlds: the power of analogy. Strat. Man. J. 26(8) 691-712.
Greve, H.R. 2003. Organizational Learning from Performance Feedback. Cambridge University Press: Cambridge.
Gupta, A.K., K.G. Smith, C.E. Shalley. 2006. The Interplay between Exploration and Exploitation. Acad. of Man. J. 49(4) 693-706.
He, Z.L., P.K. Wong. 2004. Exploration vs. Exploitation: An Empirical Test of the Ambidexterity Hypothesis. Organ. Sci. 15(4) 481-494.
Holland, J.H. 1975. Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press.
Holmqvist, M. 2004. Experiential Learning Processes of Exploitation and Exploration within and between Organizations: An Empirical Study of Product Development. Organ. Sci. 15 70-81.
Kauffman, S.A. 1993. The Origins of Order. New York: Oxford University Press.
Knudsen, T., D.A. Levinthal. 2007. Two faces of search: Alternative generation and alternative evaluation. Organ. Sci. 18 39-54.
Knudsen, T., D.A. Levinthal, S.G. Winter. 2009. The Role of Scale Adjustment in Industry Dynamics. DRUID conference paper.
Lave, C.A., J.G. March. 1975. An introduction to models in the social sciences. New York: Harper & Row.
Lenox, M.J., S.F. Rockart, A.Y. Lewin. 2006. Interdependency, competition, and the distribution of firm and industry profits. Man. Sci. 52 757-772.
Levinthal, D.A. 1997. Adaptation on rugged landscapes. Man. Sci. 43 934-951.
Levinthal, D.A., J.G. March. 1981. A model of adaptive organizational search. J. of Econom. Behavior and Organ. 2 307-333.
Levinthal, D.A., J.G. March. 1993. The myopia of learning. Strat. Man. J. 14(S2) 95-112.
Luce, R.D., H. Raiffa. 1957. Games and Decisions. Wiley: New York.
Luce, R.D. 1959. Individual Choice Behavior. Wiley: New York.
March, J.G. 1991. Exploration and exploitation in organizational learning. Organ. Sci. 2 71-87.
March, J.G. 2003. Understanding organisational adaptation. Soc. and Econom. 25(1) 1-10.
Matsusaka, J.G. 2001. Corporate diversification, value maximization, and organizational capabilities. J. of Bus. 74(3) 409-431.
McGahan, A.M., M.E. Porter. 1997. How much does industry matter, really? Strat. Man. J. 18(S1) 15-30.
Mosakowski, E. 1997. Strategy making under causal ambiguity: Conceptual issues and empirical evidence. Organ. Sci. 8 414-442.
Montgomery, C.A. 1994. Corporate Diversification. J. of Econom. Persp. 8(3) 163-178.
Nickerson, J.A., T.R. Zenger. 2004. A Knowledge-Based Theory of the Firm: The Problem-Solving Perspective. Organ. Sci. 15(6) 617-632.
Rivkin, J.W. 2001. Imitation of Complex Strategies. Man. Sci. 46(6) 824-844.
Rivkin, J.W., N. Siggelkow. 2003. Balancing Search and Stability: Interdependencies among Elements of Organizational Design. Man. Sci. 49(3) 290-311.
Robbins, H. 1952. Some Aspects of the Sequential Design of Experiments. Bulletin of the Amer. Math. Soc. 55 527-535.
Rumelt, R.P. 1991. How much does industry matter? Strat. Man. J. 12(3) 167-185.
Siggelkow, N., D.A. Levinthal. 2003. Temporarily Divide to Conquer: Centralized, Decentralized, and Reintegrated Organizational Approaches to Exploration and Adaption. Organ. Sci. 14(6) 650-669.
Siggelkow, N., J.W. Rivkin. 2005. Speed and Search: Designing Organizations for Turbulence and Complexity. Organ. Sci. 16(2) 101-122.
Sorenson, O. 2002. Interorganizational complexity and computation. J.A. Baum, ed. The Blackwell Companion to Organizations. Blackwell, Oxford, 664-685.
Sutton, R.S., A.G. Barto. 1998. Reinforcement learning. Cambridge, MA: MIT Press.
Tripsas, M. 1997. Unraveling the Process of Creative Destruction: Complementary Assets and Incumbent Survival in the Typesetter Industry. Strat. Man. J. 18(S1) 119-142.
Tushman, M.L., C.A. O’Reilly. 1997. Ambidextrous Organizations: Managing Evolutionary and Revolutionary Change. California Management Rev. 38(4) 8-30.
Villalonga, B. 2004. Diversification discount or premium? New evidence from the business information tracking series. J. of Finance 59(2) 479-506.
Winter, S.G. 2000. The Satisficing Principle in Capability Learning. Strat. Man. J. 21(10-11) 981-996.
Winter, S.G., G. Szulanski. 2001. Replication as Strategy. Organ. Sci. 12 730-743.
Yechiam, E., J.R. Busemeyer. 2005. Comparison of Basic Assumptions Embedded in Learning Models for Experience-Based Decision Making. Psych. Bull. & Rev. 12(3) 387-402.