A Model of Information Processing Aspects of Organizational Learning and Performance

Aris M. Ouksel, University of Illinois at Chicago, Dept. of Info. & Decision Sciences (M/C 294), Dept. of Computer Science, Chicago, IL 60607-7124. Contact: aris@uic.edu

Kathleen Carley, Department of Social and Decision Sciences, Carnegie Mellon University, Pittsburgh, PA 15213

Abstract

We propose a mathematical model to provide a platform for the study of the information processing aspects of organizational learning as they apply to a two-valued decision making task, and the relation of such aspects to organizational structure. Our primary contribution is to further understanding of the use and power of formal mathematical models in this area. In particular, we demonstrate that the model can capture many of the important features of inductive learning in a decision-making environment where the decision tasks are interrelated by complex propositional logic expressions. We begin with a simple model which assumes independent, equally weighted tasks, and examine its assumptions and practical implications. We then extend the model into a more formal, mathematical context. Improving the robustness of the model is investigated through the relaxation of assumptions. Additional questions that are addressed concern the impact of data interdependencies on organizational learning and the conditions under which inconclusive decision rules may occur. Future research directions are also discussed.

I. Introduction

The motivation for this research can be seen in the confluence of two major trends in the business world of the 1990s and beyond. The first of these is the increasing importance of organizational learning as a corporate survival skill (Garvin, 1993). Intense competitive pressures have been brought on by the ever accelerating rate of technological change and the accompanying globalization of our economies. The second major trend is the movement towards flatter, non-hierarchical organizational structures (Tomasko, 1993; Johansen & Swigart, 1994). The popular media are full of reports of corporate restructuring, with "down-sizing" and the more euphemistic "right-sizing" being the overwhelming themes. Advances in information technology (IT) are typically credited (or blamed, depending on the perspective) with enabling such dramatic corporate restructuring to occur. The overall thesis of our research is that the technology-driven trend towards flatter organizational structures has run up against a formidable barrier, commonly referred to as "information overload" (Davenport, 1997). Paradoxically, the availability of volumes of unfiltered information leads to a decrease in productivity, suggesting that flatter organizations are not a panacea for streamlining workflows in all situations. In fact, as we have shown in (Mihavics & Ouksel, 1996a), no organizational structure is best for all scenarios; rather, the ideal structure is contingent on the environment (Lawrence & Lorsch, 1967), the complexity of the task (Burton and Obel, 1998), and the cognitive architecture of the agents (Carley, Prietula and Lin, 1998). It has also been suggested that the hierarchy will begin to come back into vogue as information overload increases, except that this time many of the nodes will be intelligent software agents acting as information filters (Davenport, 1997; Carley, 2001).
This conclusion has been validated in our research for simple decision functions (Ouksel & Vyhmeister, 2000), where it is shown that hierarchies outperform flat structures once the volume of information exceeds a threshold. The results were, however, preliminary. A thorough analysis will require more sophisticated decision functions and learning. The modeling in this article represents the first step towards capturing more complex relationships in the environmental data. This research, then, is aimed at increasing our understanding of organizational learning and its relation to organization structure. Organization structure provides the framework within which various coordination and control mechanisms can be brought to bear on organizational learning (Arrow, 1974; Williamson, 1975). Such mechanisms are in fact types of information processing systems used in an attempt to continually improve decision making performance in successful organizations (Simon, 1976). Thus the primary determinant of organizational structure can be seen in terms of its information needs, especially as they relate to the resolution of uncertainty within the organization's environment (Stinchcombe, 1990; Galbraith, 1973; Simon, 1981; Carley, 2003). One goal in this article is to design a mathematical model to serve as a platform for the investigation of more robust results in organizational learning. One of the most crippling obstacles to progress in the field of organizational learning stems from the lack of precise, explicit definitions of terms (Huber, 1991). In the past decade, the wealth of research in this area has laid bare some fundamental lines along which to discriminate aspects of organizational learning. To begin with, the literature generally distinguishes between individual and organizational learning (Fiol and Lyles, 1985; Levitt and March, 1988; Carley, 1992; Attewell, 1992; Argote, 1993; Anderson, Baum and Miner, 1999), both of which can take place within the actor (individual or organization) or among actors. Individual learning is typically demonstrated by changes in actions or the associated performance scores and by changes in mental models. Organizational learning is more difficult to demonstrate and has been measured in terms of changes in routines, rules, performance, and efficiency made possible by both the accumulation of individual learning and the existence of structural learning (Argote, 1999). In both cases, learning occurs at different levels and by different means; e.g., it may occur incrementally through experiential or exploitation procedures in a single-loop fashion, or by leaps based on discovery in a double-loop fashion (Argyris and Schön, 1978; March, 1996). Learning among actors has been characterized as structural learning (Carley and Hill, 2001) and is seen in changes in the underlying network of relations among actors and/or the transfer of technology or personnel among organizations. Learning need not lead to improved performance, as when individual learning fails to become embedded in organizational routines (Attewell, 1992), when types of learning clash (Carley, 1999), or when "superstitious learning" occurs (Fiol and Lyles, 1985). In this paper, we follow the neo-information processing tradition (Carley, 1990, 1992; Levitt and March, 1988) and treat learning in terms of changes in routines through the accumulation and embedding of individual experience.
As such, learning consists of "encoding inferences from history into routines that guide future behavior" (Levitt and March, 1988). This definition will allow us to objectively quantify the amount of learning that occurs within each organizational structure and thereby to compare both amounts and rates of learning across various structures. Note there are caveats to this view. In particular, it does not deal with novel situations where historic routines may be useless, with clashes between levels of learning, or with structural learning. It is important to note that the view of learning that we have adopted builds on three classical observations of organizational behavior: 1) behavior within organizations is based on routines (Cyert & March, 1963; Nelson & Winter, 1982), which can be coded as rules (Levinthal and March, 1993); 2) organizational actions are history-dependent (Lindblom, 1959; Steinbruner, 1974; Lin and Carley, 2003); and 3) organizations are engaged in tasks that may have specific goals (Simon, 1955; Siegel, 1957), such that the goals and the tasks constrain the set of available actions (Carley and Newell, 1994). The first point means that most organizational decisions do not involve attempts to calculate various projected outcomes and then select the optimal scenario. Such calculations are typically too complex to be practically useful. Instead, an inventory of routines is developed. The second point is that organizational actions are by and large determined by matching previously used procedures (routines) to types of situations that have been encountered in the past. The third point is that organizations are goal-oriented and control their behavior through feedback involving regular measurements of actual results against targeted aspiration levels. The term "structure" is similarly troublesome, having no generally agreed upon definition in the literature. For example, organization structure has been defined to include: 1) "those aspects of the pattern of behavior in the organization that are relatively stable and change only slowly" (March and Simon, 1958); 2) "myth and ceremony" (Myer and Rowan, 1977); 3) "informal networks of influence" (Granovetter, 1974); 4) those channels through which "power" is exercised (Pfeffer, 1981); and 5) the network of relations connecting people, knowledge, resources and tasks that constrain and enable organizational behavior (Carley, 2002). In this study, we use the term "structure" in a narrow, operational sense (Carley, 1992; 2002) to refer to a) the formal lines of communication and management reporting linkages in the organization and b) who has access to what basic information or "evidence". In this sense, we are using both the personnel network and the knowledge network cells in the meta-matrix of organizational design (Carley, 2002). Illustrative organizational structures one might study are discussed in Section II. In using such simple, operational definitions there is of course the danger that some important aspects of organizational learning and organizational design will be missed. Admittedly, a formal reporting structure is only a relatively superficial notion of how things really get done within organizations. For example, informal networks certainly play an important role in organizations (Granovetter, 1974), but they are not captured in our models at the moment. The remaining sections proceed as follows. Section II provides the research background. Section III describes and defines the formal mathematical model.
Section IV discusses the use of this new model and the types of problems it may be able to provide insights on. Section V elaborates on the importance of data interdependencies and their impact on organizational learning. Section VI addresses the problems introduced by inconclusive decision rules, and finally a conclusion and discussion of future research directions is included in Section VII.

II. Research Background

As previously noted, the view of organizational learning we follow is based on the theory that learning depends in large measure on paying careful attention to events that have already taken place. Thereafter, generalizations can be made based upon this past data, and these generalizations can in turn be applied to future situations that are deemed appropriately similar (Stinchcombe, 1990; Cohen, 1991; March, Sproull, and Tamuz, 1991). Our research focuses on this type of data-driven, or "experiential" learning (Carley, 1992), which is inductive or "bottom-up" in nature. This is in stark contrast to the view of learning embodied in typical corporate training programs, which centers on domain-based learning and is deductive or "top-down" (e.g., knowledge transfer from those who know to those who don't) (Brown and Duguid, 1991). As in situated learning (Hutchins, 1999), the networks in which the decision maker is embedded limit what and when they learn; but, unlike that literature, we do not focus on the knowledge creation aspect, in which new knowledge is created as individuals interact, communicate, make offers, and have them accepted by others in the network (Tyre and Von Hippel, 1997). In addition, the model will only capture the information processing aspects of organizational learning, in that decision making is characterized as a process of information gathering and processing, especially as it relates to the resolution of uncertainty (Stinchcombe, 1990; Galbraith, 1973; Carley, 2002). Other, more humanistic or psychological aspects of learning such as motivation, power, politics, trust, and affective issues, while undeniably important, are nevertheless outside the boundaries of the initial model. Moreover, in our model we treat the decision makers as being both "boundedly rational" (Simon, 1955; Simon, 1976) and cooperative (Carley and Svoboda, 1996), such that each decision maker makes decisions on the basis of known information, does not optimize, and provides information on request that is accurate and up to date. This approach to learning has proven fruitful in increasing our understanding of the factors that impact organizational learning. In this sense, our study follows the long stream of research that views organizational decision making as being composed of the boundedly rational decision making behaviors of constituent individual decision makers (Simon, 1976; March and Simon, 1958; Cyert and March, 1963; Steinbruner, 1974; March and Olsen, 1975; Padgett, 1980; Carley, 1986; Mihavics and Ouksel, 1996a; Ouksel and Vyhmeister, 2000; Lin and Carley, 2003). It is also motivated by the contingency theory approach, since we show that there is no best organizational design in any absolute sense, but rather that organizational design should be matched to the context in which the organization exists (Ashby, 1968; Lawrence and Lorsch, 1967; Burton and Obel, 1998).
In particular, the approach used is an extension of Carley's initial research framework for examining organizational decision making performance under various design constraints and operating conditions (Carley, 1992; Mihavics and Ouksel, 1996a; Ouksel and Vyhmeister, 2000). The goal is to model more realistic decision making functions, which capture complex data interdependencies and will eventually allow us to investigate subunit task complexity and interdependence among subunits (Tushman and Nadler, 1978; Mihavics and Ouksel, 1996b). Two aspects of organizational design will be addressed: the personnel network, defined as the management reporting structure or "who reports to whom", and the knowledge network, defined by the task decomposition scheme or "who has access to what information" (Cohen, March, and Olsen, 1972; Carley, 1990). Note that in this paper all the information that personnel have access to must be brought to bear to solve the task; thus defining who knows what defines how the task has been broken into parts. Finally, the actors in our models can usefully be thought of as decision making units: a collection of one or more people and associated machines that act, from an organizational perspective, as a single actor. We do not distinguish between individuals with and without computational assistance. The agents are heterogeneous only in that they have a diversity of experience owing to their position in the organizational structure, and not due to their personal information processing characteristics such as access to technology, intelligence, etc. The other general assumptions of the model include: 1) organizational decision making behavior is historically based; 2) organizational learning depends on the boundedly rational decision making behaviors of the individual decision makers which make up the organization; 3) subordinates condense their input data into output recommendations to their superiors, and this information compression is lossy, i.e., uncertainty absorption (March and Simon, 1958) occurs at each node in the structure; 4) overall organizational decisions do not require that a consensus be reached (e.g., a legitimate policy might be to let the majority opinion rule); 5) the organizational decision is two-valued (e.g., go / no go); and 6) the organization faces quasi-repetitive integrated decision making tasks. "Quasi-repetitive" means that the tasks are typically similar, although not identical, to previous tasks. "Integrated" means that the task is too complex for a single decision maker to handle alone, so the sub-decisions of multiple agents must be combined in some fashion (depending on the organizational design) to reach an overall organizational decision. The tasks of interest here are assumed to be non-decomposable, meaning that combining the correct solutions to each subtask may not always yield the correct solution to the overall task. Each of these decision tasks that the organization faces is represented by a binary string of N bits, where each bit of evidence is denoted xi. The organization is represented by a number of "agents" (the sub-decision makers) who each have access to a subset of the N-bit string, or task: xi, xi+1, ... , xj where 1 ≤ i < j ≤ N and (j−i) < N. Each agent examines his/her local memory of prior tasks (bit patterns) and the corresponding past decision outcomes in an attempt to learn what their decision on the current task ought to be. That is, they try to learn from their past experience.
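To make this experiential mechanism concrete, the following minimal Python sketch (our own illustration, not part of the formal model) shows an agent that memorizes the past decision outcomes associated with each locally observed bit pattern and recommends the majority outcome seen so far, guessing when its history is uninformative. The class name ExperientialAgent and its interface are hypothetical.

```python
from collections import defaultdict
import random

class ExperientialAgent:
    """An agent that sees a fixed subset of the task's evidence bits and
    learns, from feedback, which overall decision each local pattern predicts."""

    def __init__(self, bit_indices):
        self.bit_indices = bit_indices               # which bits of the task this agent sees
        self.history = defaultdict(lambda: [0, 0])   # pattern -> [count of outcome 0, count of outcome 1]

    def observe(self, task):
        return tuple(task[i] for i in self.bit_indices)

    def recommend(self, task):
        zeros, ones = self.history[self.observe(task)]
        if ones > zeros:
            return 1
        if zeros > ones:
            return 0
        return random.choice([0, 1])                 # uninformative history: guess

    def feedback(self, task, correct_decision):
        self.history[self.observe(task)][correct_decision] += 1

# Example: an agent that sees bits 0-2 of a 9-bit task.
agent = ExperientialAgent([0, 1, 2])
task = [1, 1, 0, 0, 1, 0, 1, 1, 0]
agent.feedback(task, 1)          # told that the correct overall decision was 1
print(agent.recommend(task))     # now recommends 1 whenever it sees the pattern (1, 1, 0)
```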
It is further assumed that the organization initially knows nothing about the bits of evidence that comprise each task other than the fact that each bit is two-valued (0 or 1) and that the overall decision to be reached is similarly binary. The model used by Carley is referred to here as the "uniform model", since she assumed that each bit of input evidence was of equal importance and independent of the other bits. It is our objective to extend this model and improve its robustness by relaxing such restrictive assumptions. For example, one conclusion drawn from the uniform model was that "teams in general learn faster and better than hierarchies" (Carley, 1992). It is our contention that the generalizability of these results depends on the constraints placed upon the inherent decision functions employed in the model, which are contingent on the environmental constraints (Lawrence & Lorsch, 1967). The decision function used within the uniform model can be expressed as:

(1) sign(z) = 1 if z ≥ 0, and 0 otherwise, where z = f(b1, ... , bn) = b1 + ... + bn − n/2.

For example, an employee (node) within the organizational structure with 5 input bits of data (e.g., five direct subordinates) would, over time, learn to associate an input bit pattern containing a preponderance of 1 values with the correct overall organizational decision of "1". Thus through inductive pattern matching he would learn to send a "1" recommendation to his superior in such situations. Our first extension to this basic model is thus to relax the assumption of uniform weights of evidence, and study the impact on organizational performance and learning (Mihavics & Ouksel, 1996b; Ouksel & Vyhmeister, 2000). This can be accomplished directly with a simple extension to the decision function so that:

(2) z = f(b1, ... , bn) = Σj aj·bj − (Σj aj)/2

where aj is the weight of evidence bj, and bj has value "1" if the evidence is available and "0" otherwise, for 1 ≤ j ≤ n. This function is positive if the sum of the weights of the available evidence is more than half the sum of all the weights. In other words, if there is a preponderance of evidence, particularly the more important pieces, then the function is positive. This is of course a much more natural representation of the types of decision making situations which real organizations face. In real life there is a rather low probability that each of the N bits of evidence gathered to help make some decision is equally important. For example, if a stock analyst were to study say ten indicators of a stock's future growth potential, is it likely that none of the ten indicators is any more important than any of the others? This does not seem reasonable. Indeed, the now widely recognized need for some form of "data quality measures" implies that data is typically not of uniform reliability in predicting some outcome (Dillard, 1992). Even so, the above representation is not sufficient to capture the complexity of decision-making in organizations (see the example in section V). Our mathematical model is intended to extend the realism of these functions with complex logical relationships between the features of evidence.

III. A Formal Mathematical Model

3.1 Modeling the decision function

Formally, this model assumes that each class of input "events" (decision tasks) is fully defined by the set of features, which we enumerate from 1 to n.
Thus to begin we let Bn = { b = (b1, ... , bn) | bi ∈ {0,1} } be the set of binary sequences of length n (i.e., Bn represents the set of binary vectors with n elements). For b ∈ Bn let bi denote the value of the i-th element of the vector b. We assume that the vector b consists of the results of independent random trials. Also let pi be the probability that bi = 1, and let qi be the probability that bi = 0. Obviously then pi = 1 − qi. Thus b ∈ Bn denotes the result of a multiple random experiment. A simplifying assumption that pi = qi = .50 is used in the initial simulation model (Carley, 1992).¹ Intuitively, the set Bn of vectors b = (b1, ... , bn) can be viewed as a sequence of values of the features which fully describe the class of possible events. The value of bi describes whether the i-th feature is present or not. For example, if the 6th feature is "adverse weather" then b6 = 0 means adverse weather conditions were not present. The probability pi represents the a priori probability of the appearance of feature i in the experiment. We'll call b ∈ Bn an event. To model a "go / no go" type of decision we need a mapping from the set of input data vectors, Bn, to the set {0, 1}. This mapping induces a partition of the set of all events into two classes: those for which the correct decision is 0 and those for which it is 1. Thus, the space of classes is simply {{0},{1}}. Each class is fully defined by the classification function Γ, which is defined as follows:

(3) Γ : Bn |→ {0,1} :: b |→ Γ(b) = sign(f(b))

where f is a polynomial function with rational coefficients defined over the set of binary vectors Bn, and b is a variable over Bn. The rational coefficients are simply the weights of each bit of evidence in b. Let z denote the function f(b) in equation (3). Then we define sign(z) as follows:

(4) Γ(b) = sign(z) = 1 if z ≥ 0, and 0 otherwise

sign(z) is "1" if there is a preponderance of evidence and "0" otherwise. The set of events is thus partitioned into two classes, which we denote as Class 1 (or as {1}) and Class 0 (or as {0}), and which we define as follows: i) Class 1 = { b | b ∈ Bn, Γ(b) = 1 }; ii) Class 0 = { b | b ∈ Bn, Γ(b) = 0 } = the complement of Class 1. In order to accommodate the general polynomial, we define the function f such that:

(5) f(b1, ... , bn) = Σv av · b1^v1 · ... · bn^vn, where v = (v1, ... , vn), vi ∈ {0,1}, av ∈ K

where each b1^v1 · ... · bn^vn is a monomial, K represents the set of rational numbers, and the coefficients av are in fact equal to zero for almost all of the 2^n possible strings v. For example, z = f(b1, ... , b5) = b1 + 2b2 + b3 + 2b4 + b3·b5 − 4 is equivalent, in monomial form, to z = b1^1·b2^0·b3^0·b4^0·b5^0 + 2·b1^0·b2^1·b3^0·b4^0·b5^0 + b1^0·b2^0·b3^1·b4^0·b5^0 + 2·b1^0·b2^0·b3^0·b4^1·b5^0 + b1^0·b2^0·b3^1·b4^0·b5^1 − 4, where features 2 and 4 are twice as important as the others in determining the correct decision (i.e., sign(z)), each bi is independent, and b3 and b5 contribute together towards the value of z through an AND relationship. This is a significant generalization of the uniform case, which would model a 5-bit decision task as z = f(b1, ... , b5) = b1 + b2 + b3 + b4 + b5 − 2.5. Additional concrete examples will be given later to support the significance of capturing logical relationships such as AND within the model.

¹ Carley (1990) does explore the impact of a biased decision task, where bias is defined as the extent to which pi diverges from .50.
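To illustrate, here is a minimal Python sketch (our own illustration) of the classification function Γ for the example above. The polynomial f follows the monomial form of equation (5), with the AND relationship between b3 and b5 captured by the product term.

```python
# Classification function gamma(b) = sign(f(b)) for the example
# z = b1 + 2*b2 + b3 + 2*b4 + b3*b5 - 4, in which features 2 and 4 are
# doubly weighted and b3, b5 contribute jointly through the product term.

def f(b):
    b1, b2, b3, b4, b5 = b
    return b1 + 2*b2 + b3 + 2*b4 + b3*b5 - 4

def sign(z):
    return 1 if z >= 0 else 0

def gamma(b):
    return sign(f(b))

print(gamma((0, 1, 1, 0, 1)))  # 2 + 1 + 1 - 4 = 0  -> decision 1
print(gamma((1, 0, 1, 0, 1)))  # 1 + 1 + 1 - 4 = -1 -> decision 0
```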
3.2 Modeling the organization structure

Next we turn our attention to the formal definition of the decision structures. The goal here is to use a directed graph to represent the flow of decision making information through the organizational structure. Each decision maker will be represented as a node in the graph, and the input data to his decisions will be represented as the decision values of all of the nodes immediately subordinate to him. His learning process, then, is to determine the best polynomial function for predicting the correct overall organizational decision, given his input data bits. One "control knob" that is available to him is to adjust the weight coefficients, aj, so as to accurately reflect the relative importance of each input data bit in influencing the decision outcome. Let G = < V, E > be a directed graph (Harary, 1969) where V is a set of nodes and E is the set of edges. A tree (T) is a directed graph G with a distinguished node vT (the root) such that for each v ∈ V with v ≠ vT there exists a unique path vT, ... , v from vT to v. Let Ev denote the set of edges emanating from node v, and |Ev| the cardinality of this set. Thus E = ∪v∈V Ev. Let the edges in Ev be numbered 1, ... , |Ev|. For a given tree T, let Leaf(T) be the set of leaves of T and |Leaf(T)| = n its cardinality. Let g : Leaf(T) |→ {1, ... , |Leaf(T)|} be the function enumerating the leaves of tree T, and let D be a decision function such that:

(6) D(v) = bg(v) if v ∈ Leaf(T), and D(v) = f(y1, ... , y|Ev|) otherwise

That is, if the vertex is a leaf node then the decision for that node is based upon a single input data point (bi); otherwise the decision is based upon the decision recommendations reported up the hierarchy from each subordinate (i.e., the vertices at the end of each arc emanating downward from that vertex). The notation used here is that each bi represents an observed bit of evidence, whereas each yi represents the recommended solution of some decision maker. The function f is specific to each node of the decision structure. The notation used above indicates that f can be any function of the input data bits. In this specific research, we assume it to be a rational polynomial function, whose arguments are the decision values (y1, ... , y|Ev|) of its children. The form of each polynomial function f represents that decision maker's estimate as to how best to evaluate his data. Figure 1 below shows a sample decision structure with 3 different decision functions, one for each distinct decision maker within the tree. This also represents an extension of previous models, in that all decision makers do not perform equally. Notice that the observed evidence (bi) is located at the leaf nodes of the tree.

Figure 1. A Simple Hierarchical Decision Structure. [The figure shows a three-level tree: at level 2 the CEO computes Y = f(Y1, Y2, b6); at level 1 Analyst 1 computes Y1 = f'(b1, b2) and Analyst 2 computes Y2 = f''(b3, b4, b5); the evidence bits b1, ... , b6 occupy level 0.]
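As an illustration of how such a structure evaluates, the following Python sketch (our own, with hypothetical polynomial choices for f, f', and f'') computes the Figure 1 hierarchy bottom-up: each analyst applies his own polynomial to his evidence bits, and the CEO applies hers to the analysts' recommendations plus her own bit.

```python
# Bottom-up evaluation of the Figure 1 hierarchy. The three decision
# functions below are placeholders; any rational polynomial would do.

def sign(z):
    return 1 if z >= 0 else 0

def analyst1(b1, b2):                 # Y1 = f'(b1, b2)
    return sign(b1 + b2 - 1)

def analyst2(b3, b4, b5):             # Y2 = f''(b3, b4, b5)
    return sign(b3 + b4 + b5 - 1.5)

def ceo(y1, y2, b6):                  # Y = f(Y1, Y2, b6)
    return sign(2*y1 + y2 + b6 - 2)   # e.g., the CEO weights Analyst 1 doubly

b1, b2, b3, b4, b5, b6 = (1, 0, 1, 1, 0, 0)
decision = ceo(analyst1(b1, b2), analyst2(b3, b4, b5), b6)
print(decision)   # 1: both analysts recommend 1, so the CEO decides 1
```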
We'll call the triplet

(7) DS = < T, D, g >

the decision structure on the set of observed data Bn. We say that the decision structure DS admits the set of observational data Bn if |Leaf(T)| = n (i.e., there is a leaf corresponding to each input bit of evidence). For each node v of the tree T we define the decision rule recursively, where y represents the decision recommendation propagated up the structure from any given node, as follows:

(8) y = D(v) = bg(v) if v ∈ Leaf(T), and y = D(v) = sign( f( D(v1), ... , D(v|Ev|) ) ) otherwise

where {v1, ... , v|Ev|} are the end nodes of edges emanating from v. This means the decision rule for node v is to take the sign of the decision function f applied to the decision outcomes of each node underneath v in the hierarchy.² We will define the decision rule of the entire decision structure to be DvT, where vT is the root of the tree T. Since the decision rule of the decision structure is dependent on the input variables at the leaf nodes, the complete way to express this rule is DvT(b). Thus we have defined the map DvT : Bn → {0,1} :: (b1, ... , bn) |→ DvT(b1, ... , bn), which is the decision map of the given decision structure. In terms of notation, we may represent the decision rule at any given node v by Dv, so long as the context of the discussion makes it clear which node is being referred to. In summary, we can say that the model of the experiment M is defined by the set of observational data Bn, the classification rule Γ, and the decision structure DS which admits the set of observational data Bn. We'll denote this model as M = < Bn, DS, Γ : Bn → {0,1} >.

² Hierarchies cover a large range of organizational structures, but bidirectional directed graphs are necessary to define matrix or partial matrix structures. In the tree model, this is partially achieved through data and task dependencies.

IV. Uses of the Formal Model in Organizational Learning

One of the most immediate uses for this new formal approach is to compare the organizational learning potential (i.e., the limit decision making performance, expressed as a percentage of correct decisions) of various organizational models. To accomplish this we must define the error of classification for a given model of an organization M. Let Err = { b | b ∈ Bn, DvT(b) ≠ Γ(b) } be the set of events for which the value of the classification map and the decision rule do not coincide. This means that the elements of this set are misclassified by the decision rule. With this notation fully defined we are now ready to address three separate problems: 1) how to determine the error of classification; 2) how to compare the performance of two models; and 3) how to define an inconclusive decision rule.

Definition 1: For the given model of experiment M = < Bn, DS, Γ : Bn → {0,1} > the error of classification by the decision rule DvT relative to the classification function Γ is the probability of misclassification, computed by the formula:

(9) ξM = Σb∈Err ( Πj: bj=1 pj ) · ( Πj: bj=0 qj )

To compare learning potential within different organizational structures, we consider two models of an experiment over the same observational space Bn, denoted M = < Bn, DS, Γ : Bn → {0,1} > and M' = < Bn, DS', Γ : Bn → {0,1} >:

Definition 2: The model M is said to be better than model M' if ξM ≤ ξM'. It is said to be strictly better than model M' if ξM < ξM'.

This definition simply indicates that a model of an organization is better if, on average, it perceives the business environment more accurately (i.e., has a lower probability of misclassification). In real life, while it is important to avoid models that constantly misclassify the environment, a small number of errors may be vital for an organization to learn and to evolve. A perfect model may only lull an organization into complacency and prevent it from observing unusual or novel situations. An interesting research question, beyond the scope of this paper, is what a permissible misclassification threshold might be for a learning organization.
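Equation (9) can be evaluated by brute force for small n. The sketch below (our own illustration; the two decision rules are arbitrary examples) enumerates Bn, compares a decision map against the true classification Γ, and accumulates the probability mass of the misclassified events. With pi = 0.5 for all i this reduces to |Err| / 2^n.

```python
from itertools import product

def classification_error(n, p, gamma, decision_map):
    """Equation (9): total probability of the events misclassified by the
    decision map relative to the true classification function gamma."""
    err = 0.0
    for b in product((0, 1), repeat=n):
        if decision_map(b) != gamma(b):
            prob = 1.0
            for i, bit in enumerate(b):
                prob *= p[i] if bit == 1 else 1 - p[i]
            err += prob
    return err

# Example: the true rule is a weighted majority, but the organization
# has learned only a plain (uniform) majority rule.
gamma = lambda b: 1 if b[0] + 2*b[1] + b[2] - 2 >= 0 else 0
org   = lambda b: 1 if sum(b) - 1.5 >= 0 else 0
print(classification_error(3, [0.5]*3, gamma, org))   # 0.125: one event in eight is misclassified
```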
Thirdly, a definition of inconclusive decision rules is needed.

Definition 3: An inconclusive decision rule at any given node v is a decision rule such that for at least one input vector b ∈ Bn we have Dv(b) = undefined.

Further discussion of inconclusive decision rules is included in section VI below. To help clarify the use of this formal approach, let's consider the class of models of experiments referred to as "voting teams" (Carley, 1992). The voting team structure from Carley's model can be characterized by k analysts, each of whom analyzes r features of the decision task, and where the overall group decision is determined by voting (see Figure 2). The decision rule for each analyst is a simple majority classification rule. Let < Bn, (p1, ... , pn) > be the probabilistic space of the experiment. As above, we continue to denote qi = 1 − pi. The true decision function in Carley's model (with uniform weights), Γ : Bn → {0,1}, is given by the formula:

(10) Γ(b) = sign(f(b)) = sign( Σi=1..n bi − n/2 )

This is a special case where: 1) each node employs the same decision function (after training); 2) these decision functions are linear, with all coefficients set to 1; and 3) the classification function is normalized about zero (i.e., equal probability for outcome 0 or 1). Next let's formally define the decision structure on the set of evidence Bn that will simulate the decision process of a group of k analysts, each of whom has access to r bits of evidence. Let us define a tree Tk,r with k·(r+1) + 1 nodes and k·(r+1) arcs, as shown in Figure 2. Let k and r be natural numbers such that k·r = n and k ≠ 1. For a given k, r let V = (v0, a1, ... , ak, c1,1, ... , ck,r) be the set of nodes, with |V| = k·(r+1) + 1. We now define an arc numbering for the given tree as follows:

(11) For the node v0: Ev0 = { (v0, ai) }i=1..k and fv0 : Ev0 → {1, ... , k} :: (v0, ai) |→ i

(12) For a node ai: Eai = { (ai, ci,j) }j=1..r and fai : Eai → {1, ... , r} :: (ai, ci,j) |→ j

Let E = Ev0 ∪ ∪i=1..k Eai be the set of arcs, with the arc-numbering functions as defined above (i.e., E consists of the arcs connecting the root to each analyst and each analyst to his evidence). Now let g : Leaf(T) → {1, ... , n} :: ci,j |→ (i−1)·r + j. This gives a specific means for numbering each leaf node. Next we define the decision rules for each node.

Figure 2. Sample Voting Team Decision Structure. [The figure shows the root vote node v0 connected by arcs numbered 1, ... , k to analysts a1, ... , ak; each analyst ai is connected by arcs numbered 1, ... , r to his evidence nodes ci,1, ... , ci,r, which carry leaf numbers (i−1)·r + 1 through (i−1)·r + r.]

Recall from the discussion of our general model that we introduced a function Dv so that the particular form of the decision function f utilized by the decision maker at each node could be unique. However, in Carley's model, in the limit (after training) all decision makers come to behave as though they are using the same decision function: the simple "majority classification rule" (Carley, 1992). Thus at any node v we have:

(13) Dv = bg(ci,j) for any leaf, v = ci,j
     Dv = sign( Σj=1..k yj − k/2 ) for the root, v = vT
     Dv = sign( Σj=1..r yj − r/2 ) for an arbitrary node of level 1, v = ai

Notice here that the notation yj refers to a sub-decision which is propagated up the tree by a subordinate decision maker.
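A minimal sketch of this voting-team rule (our own illustration): k analysts each take the majority of their r bits, and the root takes the majority of the k analyst votes, as in equation (13).

```python
def sign(z):
    return 1 if z >= 0 else 0

def voting_team_decision(bits, k, r):
    """Equations (13)/(14): each analyst votes the majority of his r bits;
    the root votes the majority of the k analyst sub-decisions."""
    assert len(bits) == k * r
    analyst_votes = [
        sign(sum(bits[i*r:(i+1)*r]) - r/2)    # analyst i's majority rule
        for i in range(k)
    ]
    return sign(sum(analyst_votes) - k/2)     # root's majority rule over analysts

# 3 analysts, 3 bits each (n = 9):
print(voting_team_decision([1,1,0, 0,0,1, 1,0,1], k=3, r=3))  # analysts vote 1, 0, 1 -> 1
```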
Then from Equation 8 we see that the decision at any leaf node is simply given by the value of that bit of evidence, bg(ci,j); at the root we use the majority rule on the sub-decisions of the analysts; and at the analysts' level we use the majority rule on the bits of evidence underneath them. When applied to a specific input vector b of Bn, the above formula expands to:

(14) Dv = b(i−1)·r+j for any leaf, v = ci,j
     Dv = sign( Σi=1..k [ sign( Σj=1..r b(i−1)·r+j − r/2 ) ] − k/2 ) for the root, v = vT
     Dv = sign( Σj=1..r b(i−1)·r+j − r/2 ) for an arbitrary node of level 1, v = ai

These well-defined decision formulas allow for the development of analytical means for determining the learning performance potential of given organizational structures. We have developed and used an algorithm based on these formulas, utilizing prior work with "Young diagrams" (Knuth, 1973), to confirm the simulation results obtained via Carley's model (Mihavics and Ouksel, 1996a; Mihavics and Ouksel, 1996b).

V. Modeling Interdependencies and Their Implications for Learning

5.1 The Nature of Interdependent Data

As mentioned earlier, organizations act as information processing systems and attempt to cope with uncertainty within their environments (Galbraith, 1973). In this paper, we consider three sources of uncertainty: unstable subunit task environments, subunit task complexity, and interdependence among subunits (Tushman and Nadler, 1978). The original model (Carley, 1990) assumed that all input bits of evidence to the organization are of equal importance and independent of one another, and to this point our primary focus has been on extending the model to include weights of evidence. Next the model is further extended to include the possibility of subunit interdependencies. Due to space limitations, this issue is only briefly discussed in this paper. A thorough presentation of uncertainty, based on established results (Mihavics & Ouksel, 1996b), will be reported in a subsequent publication. Recall that each bit of evidence can be seen to correspond to the presence (value = 1) or absence (value = 0) of some feature of an external entity or "experiment" that is of interest to the organization. For example, consider an organization that manages a trust fund. One of its primary investment goals with regard to this trust fund might well be "preservation of capital". In other words, it wants to invest these trust funds in such a way as to minimize risk to principal. Keeping this goal in mind, it might decide that investment opportunities which require a relatively large capital outlay demand a relatively low level of risk. On the other hand (perhaps as a hedge against inflation), it might prefer to accept a greater level of risk in certain other investments, so long as the amount of capital at risk is rather small. This particular investment philosophy can be summarized as shown in the decision table below:

Table 1. Sample Data Interdependencies

Large Capital Outlay?   Low Risk Exposure?   Invest?
Yes                     Yes                  Yes
Yes                     No                   No
No                      Yes                  No
No                      No                   Yes

Now if we label the three columns x1, x2, x3 and let the value 0 = "no" and 1 = "yes", we can denote the above interdependency as: x1 XOR x2 ⇒ ¬x3. The point of this example is threefold: 1) such interdependencies are quite common and natural in real life; 2) they can be readily incorporated into our model; and 3) few previous models capture this type of complex interdependency.
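As a quick check (our own illustration), the following sketch confirms that the rule x1 XOR x2 ⇒ ¬x3 reproduces the Invest column of Table 1:

```python
from itertools import product

# Table 1 encoded with 1 = "yes", 0 = "no": (large outlay, low risk) -> invest
table = {(1, 1): 1, (1, 0): 0, (0, 1): 0, (0, 0): 1}

for x1, x2 in product((0, 1), repeat=2):
    invest = 0 if (x1 ^ x2) else 1     # x1 XOR x2 implies "do not invest"
    assert invest == table[(x1, x2)]   # matches every row of Table 1
print("rule reproduces Table 1")
```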
Another significant use of this new model is that its precise formulation allows one to employ a variety of other powerful and well-defined mathematical techniques. For example, the fact that the decision making functions are based on the general polynomial immediately enables a natural extension of the model to include the power of propositional logic. Thus the model is no longer constrained by the unnatural assumption of data independence, since the fundamental logical operators of AND, OR, and NOT can all readily be modeled as shown below:

(15) The AND relationship is modeled as: xi·xj

(16) The OR relationship is modeled as: xi + xj − xi·xj

(17) The NOT relationship is modeled as: 1 − xi

With the incorporation of these three simple logical operators, complex data relationships can now be modeled. Thus the power of the model is significantly increased. In addition, we hypothesize that the introduction of such data dependencies should heighten the need for coordination within an organizational structure. We are currently researching this area, and our preliminary results tend to confirm this notion.
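The encodings in (15)-(17), together with the XOR form used in section 5.2 below, are easy to verify exhaustively over {0,1}; a minimal sketch (our own illustration):

```python
from itertools import product

AND = lambda x, y: x * y               # equation (15)
OR  = lambda x, y: x + y - x * y       # equation (16)
NOT = lambda x: 1 - x                  # equation (17)
XOR = lambda x, y: x + y - 2 * x * y   # the form used in section 5.2

for x, y in product((0, 1), repeat=2):
    assert AND(x, y) == (x and y)
    assert OR(x, y)  == (x or y)
    assert XOR(x, y) == (x ^ y)
    assert NOT(x)    == (1 if not x else 0)
print("polynomial encodings agree with the boolean operators")
```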
5.2 General Implications for the Model

Once introduced into the model, data interdependencies can be manipulated in such a way as to test the effect of subunit interdependency on organizational learning across different organization structures. To begin our study of such effects we chose to focus first on a particular type of data dependency, the "exclusive or" (denoted XOR). The XOR relationship is given by: xi + xj − 2·xi·xj. Notice that in this expression, if evidence xi is analyzed in isolation from xj, then xi loses all of its informational value and thus will contribute nothing towards organizational learning. The same can of course be said of xj in relation to xi. Our first hypothesis in this area, then, is simply that:

H1: Data interdependencies can significantly affect organizational learning.

To test this hypothesis the simulation model was run under six different organizational learning scenarios. Each of the three organizational structures (Hierarchy, Expert Team, and Majority Team) was tested with input tasks that either did or did not include data interdependencies. The interdependencies used here were of the XOR variety. More will be said in a moment on other types of interdependencies. The results are shown in Table 2, and 99% confidence intervals are graphically displayed in Figure 3. Each experiment included 10,000 input tasks during each of 50 simulation runs. Each input task consisted of 27 bits of evidence. Note that the structures (Hierarchy, Expert Team, and Majority Team) used in the analysis are all captured by directed graphs. We omit their precise definitions here (see Mihavics & Ouksel, 1996a), as they are intuitive and not essential to the understanding of our results below:

Table 2. Average Final Performance (with std. dev.)

                 No Interdependencies   With Interdependencies
Majority Team    85.6 (1.1)             62.5 (0.8)
Expert Team      84.2 (1.3)             65.7 (0.7)
Hierarchy        80.7 (1.0)             64.9 (0.8)

Figure 3. Interdependencies Can Affect Learning. [The figure plots 99% confidence intervals of learning proficiency for the Expert Team, Hierarchy, and Majority Team, with and without data interdependencies.]

From Figure 3 it is clear that there are significant reductions in the amount of learning that occurred when data interdependencies were introduced to each of the three organizational structures. T-test results confirmed this: for Majority Teams we find T = −56.04 and p = 0.00, which clearly rejects the null hypothesis that the means are equal. Similarly, we reject the null hypothesis that the means are equal for both Expert Teams (T = −40.0, p = 0.00) and Hierarchies (T = −37.9, p = 0.00).

VI. Inconclusive Data

We observe many situations in the real world where the data provides no indication as to the correct solution to some question. Thus in our model we may want to treat the presence of ambiguous evidence as a separate case. For instance, we might introduce the option for decision makers to "abstain" from a decision, or perhaps "don't know" could be a valid decision outcome. Instances of inconclusiveness are in fact all too common. The pervasive nature of this type of problem has led to the acceptance of common rules of organization, such as forming committees with an odd number of members. In Definition 3 (section IV) we explicitly recognized the fact that there are occasions when the input evidence does not indicate whether one decision is more likely to be correct than the other. Rather than having the decision maker merely guess at the correct decision, as was our simplifying assumption back in Equation (4) when we set sign(z) = 1 for z = 0, here we propose a slightly different function sign'(z):

(18) sign'(z) = 1 if z > 0; 0 if z < 0; undefined if z = 0

The significance of this development is that it once again demonstrates the ease of extensibility, and thus the real power, of this new model. Inconclusive data (uncertainty) is clearly something organizations often go to great lengths to avoid. Indeed, for many researchers "uncertainty avoidance" is the sine qua non of organization theory (March & Simon, 1958; Thompson, 1967; Galbraith, 1973; Tushman & Nadler, 1978). Our model can capture this desire of organizations to avoid inconclusive data. The phenomenon can easily be seen in the case where each employee (node) has access to two bits of evidence. If the bits are of uniform weight (e.g., 5 and 5) then the probability that the agent will see completely contradictory data is 50% (in 2 of the 4 possible patterns). If the bits are of non-uniform weights, and the weights can range from 0 through 9, then the probability of completely inconclusive evidence drops to only 5.5%. Furthermore, if one employee had bits with weights of 5 & 5 and another had bits of say 8 & 8, then the overall learning potential of the organization can be improved by having these two employees swap one bit of data. The 5.5% figure above can be readily verified, since only when the weights coincide can a complete contradiction occur. Thus there are 10 cases to consider: weights for the two bits = 0 0, 1 1, 2 2, 3 3, ... , or 9 9. In the case where both weights are zero there are 4 bit-value patterns which yield no information (equivalent to a complete contradiction): 0 0, 1 0, 0 1, or 1 1. The other 9 weight patterns each yield a complete contradiction for 2 bit-value patterns: 1 0 or 0 1. Thus there are 4 + 9 × 2 = 22 total instances of no informational value to the agent, out of a total of 10 × 10 × 4 = 400 possible instances (combinations of 2 weights ranging from 0 through 9, with 4 possible bit-value patterns for each combination). So the probability of the agent being confronted with no informational value from his 2 bits of evidence is 22/400, or 5.5%.³ The rest of the possible scenarios, for different numbers of bits per agent and for the uniform versus non-uniform weights alternatives, are given in Table 3 below.

³ This is of course over the long run, assuming that the weights are randomly distributed within a given range of values (0 through 9 in our example) and that initially the bits of evidence are randomly assigned to the agents.
Table 3. Probability of Completely Inconclusive Evidence

Number of bits seen by each employee (n)   Uniform weights (all weights equal, from 0-9)   Non-uniform weights (weights range from 0-9)
2                                          55.0%                                           5.5%
3                                          10.0%                                           4.2%
4                                          43.75%                                          3.6%
5                                          10.0%                                           3.2%
n                                          1/(M+1) if n odd; see (19) if n even            use Equation (22)

If n is even and weights are uniform:

(19) L(M, n) = [ C(n, n/2)·M + 2^n ] / [ 2^n·(M+1) ]

where C(n, n/2) is the binomial coefficient, M is the maximum weight for a single bit of evidence, and n is the number of bits. To explain how the general formula for the non-uniform case was derived, we must return to our formal model. Assume we have a class of acceptable decision functions, where f denotes a function of degree n with rational coefficients. Also assume that the set of acceptable decision rules is of the form:

(20) sign'(f(x1, x2)) = 1 if f(x) > 0; 0 if f(x) < 0; undefined if f(x) = 0

where f(x1, x2) = a1x1 + a2x2 + a0 and a0, a1, a2 ∈ Q. Now if, for example, a1 = a2 = −a0 then our decision maker will face an uncomfortable decision in two out of four cases (i.e., when x1 ≠ x2 the decision maker sees completely contradictory data). But how likely is such a scenario? To answer this question we need to develop a measure of the number of inconclusive decision rules which exist within our class of acceptable decision functions. Suppose now that a1, a2 ∈ {0, 1, 2, ... , M}. This assumption is reasonable in light of the fact that "weights" are often assigned by business decision makers in terms of scales of discrete values. Furthermore, all ai ∈ Q can be transformed into ai' ∈ I+ by multiplying our function f by the least common denominator of {a1, ... , an}, and by taking the complement of any variable whose weight would otherwise require a negative value (e.g., if xi = 1 represents "adverse weather present" and adverse weather is positively correlated with a correct decision outcome of 0, then we could simply redefine xi = 1 to represent "adverse weather not present"). Suppose now that we fix a0 = −((a1 + a2)/2). This is done so that f(x1, x2) models a simple "majority classification" rule (i.e., so that the range of f(x1, x2) is centered about zero). Our measure of inconclusive decision rules is then simply (M+1)/(M+1)², since there are M+1 ways for a1 to equal a2 and (M+1)² total possible combinations for (a1, a2). Next we extend the case above to include three bits of evidence rather than just two. Suppose we have a decision maker, Bob, who has three bits of evidence on which to base his recommendation to his boss. Assume each bit of evidence has an integer weight of importance in the range 0 - M. Here we need to count the number of ways in which 3 coefficients can cancel each other out (i.e., ai + aj = ak). To that end we can fix one of the 3 coefficients, ak, at each possible value w ∈ {0, 1, 2, ... , M} and then count the ways the other two coefficients, ai and aj, can sum to that value w. Thus we now define a new function f(n, w), which computes the number of ways in which n bits can have their weights sum to w. Next we develop the general formula for calculating our function.
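A minimal memoized sketch of this counting function (our own illustration; it implements an equivalent form of the recursion stated in Proposition V.1 below, with M the maximum weight):

```python
from functools import lru_cache

M = 9  # maximum weight of a single bit of evidence

@lru_cache(maxsize=None)
def f(n, w):
    """Number of ways n weights, each in {0, ..., M}, can sum to w."""
    if n == 1:
        return 1 if 0 <= w <= M else 0
    # Terms with w - i beyond the reachable range contribute zero automatically.
    return sum(f(n - 1, w - i) for i in range(0, M + 1) if w - i >= 0)

# 55 of the 10^3 three-bit rules admit a cancellation a_i + a_j = a_k:
print(sum(f(2, w) for w in range(M + 1)))   # -> 55
print(f(3, 19))                              # ways 3 weights can sum to 19 -> 45
```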
Proposition V.1: Consider the set of vectors of non-negative integer coefficients of length n whose sum is equal to some fixed value w:

A_w^n = { (a1, ... , an) | Σi=1..n ai = w, ai ∈ I+ }

The cardinality of this set of vectors is computed by the formula:

(21) f(n, w) = 1 for 0 ≤ w ≤ M and n = 1
     f(n, w) = 0 for w > M and n = 1
     f(n, w) = 1 for w = 0 and n > 0
     f(n, w) = Σi=0..w f(n−1, w−i) for w ≤ M and n > 1
     f(n, w) = Σi=max(0, w−(n−1)·M)..M f(n−1, w−i) for w > M and n > 1

The lower bound for i in the last case of Equation (21) (namely i = max(0, w − (n−1)·M)) is necessary in situations where (n−1)·M < w. For example, consider counting the ways in which 3 bits can have their weights sum to 19 if each weight can range from 0 to 9 (i.e., M = 9). The correct formulation in this case is:

f(3, 19) = Σi=1..9 f(2, 19−i)

where we start with i = 1 instead of i = 0 because the third bit must have a weight of at least 1, since the first 2 bits can at most have their weights sum to 18 (not 19). Returning now to our befuddled decision maker, Bob, and assuming M = 9, we see that given 3 bits of input data there are Σw=0..9 f(2, w) = 55 decision rules which admit the possibility of completely contradictory data. This is out of 10³ total possible decision rules. See Table 4 for these calculations.

Table 4. Values of f(n, w)

w       0   1   2   3   4   5   6   7   8   9
n = 1   1   1   1   1   1   1   1   1   1   1
n = 2   1   2   3   4   5   6   7   8   9   10

Upon further reflection, we also realize that not all of these potentially contradictory decision rules are equally troublesome. For example, the decision rule f(x1, x2, x3) = 9x1 + 4x2 + 5x3 − 9 yields contradictory evidence in only 2 out of 8 cases (i.e., (1,0,0) and (0,1,1)), whereas the decision rule f(x1, x2, x3) = 5x1 + 0x2 + 5x3 − 5 yields contradictory evidence in 4 out of 8 cases (i.e., (1,0,0), (1,1,0), (0,0,1), and (0,1,1)). Indeed, the rule f(x1, x2, x3) = 0x1 + 0x2 + 0x3 yields contradictory evidence (actually "no information" in this instance) in all cases. Thus we can introduce a more refined measure with respect to our contradictory data problem. This new measure counts the number of ways that the values of bits of evidence can combine with decision coefficients (i.e., "weights") to yield an inconclusive decision situation. Let us now formalize this notion. Consider the set Fn of all linear decision functions of dimension n (i.e., on the vector space Bn) with coefficients lying in the set M = {0, ... , M} and free coefficient (a0) lying in the set {−M, ... , 0}. There is a natural bijection between Fn and M^{n+1} defined by g : Fn → M^{n+1} :: a1x1 + ... + anxn + a0 |→ (a1, ... , an, a0). Now let S = M^{n+1} × Bn be the categorical product of the functional domain and the binary vector space. Then we define a function App : M^{n+1} × Bn → Q :: (a1, ... , an, a0, b1, ... , bn) |→ g⁻¹(a1, ... , an, a0)(b1, ... , bn), which gives the value of the particular function g⁻¹(a1, ... , an, a0) = a1x1 + ... + anxn + a0 on the binary vector (b1, ... , bn). Now let us define the set I ⊂ S, where I = { (x1, ... , xn, xn+1, ... , x2n+1) ∈ M^{n+1} × Bn | App(x1, ... , x2n+1) = 0 }. Essentially this gives us I as the set of pairs of the form (decision function f, binary vector b) such that f(b) = 0 (i.e., the decision rule f applied to evidence vector b yields an inconclusive outcome). We then define L(M, n) = |I| / |M^{n+1} × Bn|. Thus L(M, n) is a volume type of measure on M^{n+1} × Bn.
This leads to the following proposition:

Proposition V.2: For the class of linear decision functions of dimension n with coefficients in the set M = {0, ... , M} and free coefficient (a0) lying in the set {−M, ... , 0}, the measure of inconclusive outcomes is given by the formula:

(22) L(M, n) = [ 2^n + 2·Σi=1..⌊(n−1)/2⌋ ( C(n, i) · Σw=1..i·M f(i, w)·f(n−i, w) ) + (1 − ⌊(n+1)/2⌋ + ⌊n/2⌋) · C(n, n/2) · Σw=1..n·M/2 f(n/2, w)² ] / [ (M+1)^n · 2^n ]

Note that the term (1 − ⌊(n+1)/2⌋ + ⌊n/2⌋) is zero for all odd n and 1 for even n. For example, in our case where n = 3 and M = 9 we have from Equation (22):

L(9, 3) = [ 2³ + 2·Σi=1..⌊2/2⌋ ( C(3, i) · Σw=1..i·9 f(i, w)·f(3−i, w) ) ] / [ (9+1)³·2³ ]
        = ( 8 + 2·3·[ f(1,1)·f(2,1) + f(1,2)·f(2,2) + f(1,3)·f(2,3) + f(1,4)·f(2,4) + f(1,5)·f(2,5) + f(1,6)·f(2,6) + f(1,7)·f(2,7) + f(1,8)·f(2,8) + f(1,9)·f(2,9) ] ) / 8000
        = 332 / 8000 = 4.15%

Thus Bob has a 4.2% chance of seeing completely inconclusive evidence. Applying the above formulas in a number of cases (n = 2, 3, 4, 5) where ai ∈ {0, 1, ... , 9} indicates that as n increases our measure of inconclusive data decreases, as shown in Table 3. Proofs of these formulas are included in the Appendix.

VII. Conclusion

This article represents a continuation of the stream of research which attempts to capture many of the fundamental aspects of organizational learning within mathematical models. We began with Carley's (1992) model of this process, extending it into a more formal, mathematical context. In this regard we provided a formal definition of the key components of the model: organizational learning, organization structure, and organizational decision functions. The power of this model is that it is grounded in formal mathematics and is thereby more easily extensible. After examining its assumptions, we improved the robustness of the model through the relaxation of two of these constraints: uniform weights of evidence and data independence. These changes result in a more generalizable model that can be used to compute the error of classification between alternative models, calculate a measure of inconclusive decision rules, locate the optimal decision function for any node in a given organizational structure, and locate the optimal organizational structure from within a given class of structures, i.e., those defined by the directed graphs in our mathematical model. Furthermore, the use of binary decision variables offers a firm foundation for the direct implementation of propositional logic and the construction of more complex relationships. Researchers have identified several sources of uncertainty in organizations. These include, but are not limited to, task complexity (Perrow, 1984), interdependencies among knowledge or resources (Vaughan, 1996), organizational politics (Sagan, 1993), human factors (Kohn, Corrigan, and Donaldson, 2000), and emergent technology (Aldrich and Mueller, 1982). In this paper, we have focused on data interdependencies, elaborating on their importance in organizations and on their impact on organizational learning. The other forms of uncertainty could be fruitfully explored using the approach presented here. Finally, the identification of the effects of inconclusive data and the introduction of data dependencies via propositional logic both indicate the importance of some form of coordination across the organizational structure.
Uncertainty avoidance can be modeled in terms of bit swapping, which reduces the probability of encountering inconclusive decision rules. Degrees of task decomposability can be captured through the introduction of data dependencies. For example, the presence of data bits related by XOR operations that span employees renders the overall task less decomposable (i.e., having only one bit of an XOR pair provides absolutely no informational value). In such cases the coordination mechanism would attempt to recognize where these XOR'ed bit pairs occur and reorganize them under a single employee.

Acknowledgement: We would like to thank the anonymous referees for their insightful comments. They have helped both in improving the presentation and in clarifying the contributions.

References

Aldrich, H. and S. Mueller, 1982, The Evolution of Organizational Forms: Technology, Coordination and Control, Research in Organizational Behavior, 4: 33-87.

Anderson, P., J. Baum and A. Miner (Eds.), 1999, Advances in Strategic Management: Population-Level Learning and Industry Change, Vol. 16, Elsevier Science Ltd., Amsterdam, The Netherlands.

Argote, L., 1993, Group and Organizational Learning Curves: Individual System and Environmental Components, British Journal of Social Psychology, 32: 31-51.

Argote, L., S. Beckman and D. Epple, 1990, The Persistence and Transfer of Learning in Industrial Settings, Management Science, 36: 140-154.

Argote, L., 1999, Organizational Learning: Creating, Retaining and Transferring Knowledge, Kluwer Academic Publishers, Boston, MA.

Argyris, C. and Schön, D., 1978, Organizational Learning, Addison-Wesley, Reading, MA.

Arrow, K. J., 1974, The Limits of Organization, W. W. Norton & Co., Inc., New York, NY.

Ashby, W. R., 1968, Principles of Self-Organizing Systems, in Modern Systems Research for the Behavioral Scientist, W. Buckley (Ed.), Aldine, Chicago, IL.

Attewell, P., 1992, Technology Diffusion and Organizational Learning: The Case of Business Computing, Organization Science, 3: 1-19.

Brown, J.S., and P. Duguid, 1991, Organizational Learning and Communities of Practice: Toward a Unified View of Working, Learning, and Innovation, Organization Science, Feb.: 40-56.

Burton, R. M. and B. Obel, 1998, Strategic Organizational Design: Developing Theory for Application, Kluwer Academic Publishers, Norwell, MA.

Carley, K. M., 1986, Efficiency in a Garbage Can: Implications for Crisis Management, in J. March and R. Weissinger-Baylon (Eds.), Ambiguity and Command: Organizational Perspectives on Military Decision Making, Pitman, Boston, MA, Ch. 8: 195-231.

Carley, K. M., 1990, Coordinating for Success: Trading Information Redundancy for Task Simplicity, in Proceedings of the 23rd Annual Hawaii International Conference on Systems Sciences, Kona, Hawaii, Jan.

Carley, K.M., 1992, Organizational Learning and Personnel Turnover, Organization Science, Feb.: 20-46.

Carley, K.M., M.J. Prietula and Z. Lin, 1998, Design versus Cognition: The Interaction of Agent Cognition and Organizational Design on Organizational Performance, Journal of Artificial Societies and Social Simulation, 1(3), paper 4, 30 June 1998, <http://www.soc.surrey.ac.uk/JASSS/>.

Carley, K.M., 1999, On the Evolution of Social and Organizational Networks, in Research in the Sociology of Organizations, Vol. 16 (special issue on Networks In and Around Organizations), S.B. Andrews and D. Knoke (Eds.), JAI Press, Stamford, CT, pp. 3-30.
Acknowledgement: We would like to thank the anonymous referees for their insightful comments, which helped improve the presentation and clarify the contributions.

References

Aldrich, H. and S. Mueller, 1982, The Evolution of Organizational Forms: Technology, Coordination and Control, Research in Organizational Behavior, 4: 33-87.
Anderson, P., J. Baum and A. Miner (Eds.), 1999, Advances in Strategic Management: Population-Level Learning and Industry Change, Vol. 16, Elsevier Science Ltd., Amsterdam, The Netherlands.
Argote, L., 1993, Group and Organizational Learning Curves: Individual, System and Environmental Components, British Journal of Social Psychology, 32: 31-51.
Argote, L., S. Beckman and D. Epple, 1990, The Persistence and Transfer of Learning in Industrial Settings, Management Science, 36: 140-154.
Argote, L., 1999, Organizational Learning: Creating, Retaining and Transferring Knowledge, Kluwer Academic Publishers, Boston, MA.
Argyris, C. and D. Schön, 1978, Organizational Learning, Addison-Wesley, Reading, MA.
Arrow, K.J., 1974, The Limits of Organization, W. W. Norton & Co., Inc., New York, NY.
Ashby, W.R., 1968, Principles of Self-Organizing Systems, in Modern Systems Research for the Behavioral Scientist, W. Buckley (Ed.), Aldine, Chicago, IL.
Attewell, P., 1992, Technology Diffusion and Organizational Learning: The Case of Business Computing, Organization Science, 3: 1-19.
Brown, J.S. and P. Duguid, 1991, Organizational Learning and Communities of Practice: Toward a Unified View of Working, Learning, and Innovation, Organization Science, Feb.: 40-56.
Burton, R.M. and B. Obel, 1998, Strategic Organizational Design: Developing Theory for Application, Kluwer Academic Publishers, Norwell, MA.
Carley, K.M., 1986, Efficiency in a Garbage Can: Implications for Crisis Management, in J. March and R. Weissinger-Baylon (Eds.), Ambiguity and Command: Organizational Perspectives on Military Decision Making, Pitman, Boston, MA, Ch. 8: 195-231.
Carley, K.M., 1990, Coordinating for Success: Trading Information Redundancy for Task Simplicity, in Proceedings of the 23rd Annual Hawaii International Conference on Systems Sciences, Kona, Hawaii, Jan.
Carley, K.M., 1992, Organizational Learning and Personnel Turnover, Organization Science, Feb.: 20-46.
Carley, K.M., M.J. Prietula and Z. Lin, 1998, Design versus Cognition: The Interaction of Agent Cognition and Organizational Design on Organizational Performance, Journal of Artificial Societies and Social Simulation, 1(3), 30 June 1998, <http://www.soc.surrey.ac.uk/JASSS/>.
Carley, K.M., 1999, On the Evolution of Social and Organizational Networks, in Networks In and Around Organizations, S.B. Andrews and D. Knoke (Eds.), special issue (Vol. 16) of Research in the Sociology of Organizations, JAI Press, Stamford, CT, pp. 3-30.
Carley, K.M., 2002, Smart Agents and Organizations of the Future, in The Handbook of New Media, L. Lievrouw and S. Livingstone (Eds.), Sage, Thousand Oaks, CA, Ch. 12: 206-220.
Carley, K.M., 2003, Computational Organizational Science and Organizational Engineering, Simulation Modeling Practice and Theory, 10(5-7): 253-269.
Carley, K. and A. Newell, 1994, The Nature of the Social Agent, Journal of Mathematical Sociology, 19(4): 221-262.
Carley, K.M. and D.M. Svoboda, 1996, Modeling Organizational Adaptation as a Simulated Annealing Process, Sociological Methods and Research, 25(1): 138-168.
Carley, K.M. and V. Hill, 2001, Structural Change and Learning Within Organizations, in Dynamics of Organizations: Computational Modeling and Organizational Theories, A. Lomi and E.R. Larsen (Eds.), MIT Press/AAAI Press/Live Oak, Boston, MA, Ch. 2: 63-92.
Cohen, M.D., 1991, Individual Learning and Organizational Routines: Emerging Connections, Organization Science, 135-139.
Cohen, M.D., J.G. March and J.P. Olsen, 1972, A Garbage Can Model of Organizational Choice, Administrative Science Quarterly, March: 1-25.
Cyert, R.M. and J.G. March, 1963, A Behavioral Theory of the Firm, Prentice-Hall, Englewood Cliffs, NJ.
Daft, R.L. and R.H. Lengel, 1986, Organizational Information Requirements, Media Richness, and Structural Design, Management Science, 32: 554-571.
Daft, R.L. and K.E. Weick, 1984, Toward a Model of Organizations as Interpretation Systems, Academy of Management Review, 284-295.
Davenport, T.H., 1997, Information Ecology, Oxford University Press, pp. 91-95.
Dillard, R.A., 1992, Using Data Quality in Decision Making Algorithms, IEEE Expert, Dec.: 63-72.
Drucker, P.F., 1988, Tomorrow's Restless Managers, Industry Week, April: 18, 25.
Fiol, C.M. and M.A. Lyles, 1985, Organizational Learning, Academy of Management Review, Oct.
Galbraith, J., 1973, Designing Complex Organizations, Addison-Wesley, Reading, MA.
Garvin, D.A., 1993, Building a Learning Organization, Harvard Business Review, July-August: 78-91.
Granovetter, M.S., 1973, The Strength of Weak Ties, American Journal of Sociology, 78: 1360-1380.
Harary, F., 1969, Graph Theory, Addison-Wesley, Reading, MA.
Huber, G.P., 1991, Organizational Learning: The Contributing Processes and the Literatures, Organization Science, Feb.
Hutchins, E., 1999, Cognition in the Wild, MIT Press, Cambridge, MA.
Johansen, R. and R. Swigart, 1994, Upsizing the Individual in the Downsized Organization, Addison-Wesley, Reading, MA.
Kim, D.H., 1993, The Link between Individual and Organizational Learning, Sloan Management Review, Fall: 37-50.
Knuth, D., 1973, The Art of Computer Programming, Vol. 3, Addison-Wesley, Reading, MA.
Kohn, L.T., J.M. Corrigan and M.S. Donaldson (Eds.), 2000, To Err Is Human: Building a Safer Health System, Institute of Medicine, National Academy Press, Washington, DC.
Lawrence, P.R. and J. Lorsch, 1967, Organization and Environment: Managing Differentiation and Integration, Harvard University Press, Boston, MA.
Levitt, B. and J.G. March, 1988, Organizational Learning, Annual Review of Sociology, 14: 319-340.
Levinthal, D.A. and J.G. March, 1993, The Myopia of Learning, Strategic Management Journal, 14: 95-112.
Lin, Z. and K.M. Carley, 2003, Designing Stress Resistant Organizations: Computational Theorizing and Crisis Applications, Kluwer, Boston, MA.
Lindblom, C.E., 1959, The Science of Muddling Through, Public Administration Review, 19: 78-88.
March, J.G. and H.A. Simon, 1958, Organizations, John Wiley, New York, NY.
March, J.G., 1996, Exploration and Exploitation in Organizational Learning, in Organizational Learning, M.D. Cohen and L.S. Sproull (Eds.), Sage, Thousand Oaks, CA.
March, J.G., L.S. Sproull and M. Tamuz, 1991, Learning from Samples of One or Fewer, Organization Science, Feb.: 1-13.
Mihavics, K. and A. Ouksel, 1996a, Learning to Align Organizational Design and Data, Computational and Mathematical Organization Theory, 1(2): 143-155.
Mihavics, K. and A. Ouksel, 1996b, A Model for the Study of the Effects of Organizational Structure on Organizational Learning, Ph.D. Dissertation, University of Illinois at Chicago, Chicago, IL.
Meyer, J.W. and B. Rowan, 1977, Institutionalized Organizations: Formal Structure as Myth and Ceremony, American Journal of Sociology, 83: 340-363.
Nelson, R. and S. Winter, 1982, An Evolutionary Theory of Economic Change, Belknap Press, Cambridge, MA.
Ouksel, A.M. and J. Watson, 1998, The Need for Adaptive Workflow and What Is Currently Available on the Market: Perspectives from an Ongoing Industry Benchmarking Initiative, Workshop on Computer-Supported Cooperative Work, Seattle, WA.
Ouksel, A.M. and R. Vyhmeister, 2000, Impact of Organizational Design on Organizational Learning: Robustness of Prior Results, Computational and Mathematical Organization Theory, 6: 395-410.
Padgett, J.F., 1980, Managing Garbage Can Hierarchies, Administrative Science Quarterly, 25(4): 583-604.
Perrow, C., 1984, Normal Accidents: Living with High-Risk Technologies, Basic Books, New York, NY.
Pfeffer, J., 1981, Power in Organizations, Ballinger Publishing Company, Cambridge, MA.
Sagan, S.D., 1993, The Limits of Safety: Organizations, Accidents, and Nuclear Weapons, Princeton University Press, Princeton, NJ.
Siegel, S., 1957, Level of Aspiration and Decision Making, Psychological Review, 64: 253-262.
Simon, H.A., 1955, A Behavioral Model of Rational Choice, Quarterly Journal of Economics, 69: 99-118.
Simon, H.A., 1976, Administrative Behavior, 3rd ed., The Free Press, New York, NY.
Simon, H.A., 1981, The Sciences of the Artificial, 2nd ed., MIT Press, Cambridge, MA.
Steinbruner, J.D., 1974, The Cybernetic Theory of Decision Processes, Princeton University Press, Princeton, NJ.
Stinchcombe, A.L., 1990, Information and Organizations, University of California Press, Berkeley, CA.
Thompson, J.D., 1967, Organizations in Action, McGraw-Hill, New York, NY.
Tyre, M.J. and E. von Hippel, 1997, The Situated Nature of Adaptive Learning in Organizations, Organization Science, 8: 71-84.
Tomasko, R.M., 1993, Rethinking the Corporation: The Architecture of Change, AMACOM, New York, NY.
Tushman, M. and D. Nadler, 1978, Information Processing as an Integrating Concept in Organizational Design, Academy of Management Review, July: 613-624.
Vaughan, D., 1996, The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA, University of Chicago Press, Chicago, IL.
Williamson, O.E., 1975, Markets and Hierarchies: Analysis and Antitrust Implications, Free Press, New York, NY.