From: AAAI-86 Proceedings. Copyright ©1986, AAAI (www.aaai.org). All rights reserved.

COOPERATION WITHOUT COMMUNICATION

Michael R. Genesereth, Matthew L. Ginsberg, and Jeffrey S. Rosenschein*

Logic Group, Knowledge Systems Laboratory, Computer Science Department, Stanford University, Stanford, California 94305

*This research has been supported by the Office of Naval Research under grant number N00014-81-K-0004 and by DARPA under grant numbers N00039-83-C-0136 and N00039-86-C-0033.

ABSTRACT

Intelligent agents must be able to interact even without the benefit of communication. In this paper we examine various constraints on the actions of agents in such situations and discuss the effects of these constraints on their derived utility. In particular, we define and analyze basic rationality; we consider various assumptions about independence; and we demonstrate the advantages of extending the definition of rationality from individual actions to decision procedures.

I Introduction

The affairs of individual intelligent agents can seldom be treated in isolation. Their actions often interact, sometimes for better, sometimes for worse. In this paper we discuss ways in which cooperation can take place in the face of such interaction.

A. Previous Work in Distributed AI

In recent years, a sub-area of artificial intelligence called distributed artificial intelligence (DAI) has arisen. Researchers have attempted to address the problems of interacting agents, either so as to increase efficiency (by harnessing multiple reasoners to solve problems in parallel) or as necessitated by the distributed nature of the problem domain (e.g., distributed air traffic control [29,30]). Issues to be addressed include those of communication, synchronization, efficient help, and destructive (inadvertent) interference.

Smith and Davis' work on the contract net [6] produced a tentative approach to cooperation using a contract-bid metaphor to model the assignment of tasks to processors. Lesser and Corkill have made empirical analyses of distributed computation, trying to discover cooperation strategies that lead to efficient problem solutions for a network of nodes [3,4,7,21]. Georgeff has attacked the problem of assuring noninterference among distinct agents' plans [12,13]; he has made use of operating system techniques to identify and protect critical regions within plans, and has developed a general theory of action for these plans. Lansky has adapted her work on a formal, behavioral model of concurrent action towards the problems of planning in multi-agent domains [20].

These DAI efforts have made some headway in constructing cooperating systems; the field as a whole has also benefited from research into the formalisms necessary for one agent to reason about another's knowledge and beliefs. Of note are the efforts of Appelt [1], Moore [24], Konolige [19,18], Levesque [22], and Halpern and Moses [8,16].

Previous DAI work has assumed for the most part that agents are mutually cooperative through their designers' fiat; there is built-in "agent benevolence." Work has focused on how agents can cooperatively achieve their goals when there are no conflicts of interest. The agents have identical or compatible goals and freely help one another.

B. Overview of this paper

1. True conflicts of interest

The research that this paper describes discards the benevolent agent assumption.
We no longer assume that there is a single designer for all of the interacting agents, nor that they will necessarily help one another. Rather, we examine the question of how high-level, autonomous, independently-motivated agents ought to interact with each other so as to achieve their goals. In a world in which we get to design only our own intelligent agent, how should it interact with other intelligent agents?

There are a number of domains in which autonomous, independently-motivated agents may be expected to interact. Two examples are resource management applications (such as an automated secretary [15]) and military applications (such as an autonomous land vehicle). These agents must represent the desires of their designers in an environment that includes other intelligent agents with potentially conflicting goals.

Our model of agent interaction thus allows for true conflicts of interest. As special cases, it includes pure conflict (i.e., zero sum) and conflict-free (i.e., common goal) encounters. By allowing conflict of interest interactions, we can address the question of why rational agents would choose to cooperate with one another, and how they might coordinate their actions so as to bring about mutually preferred outcomes.

2. No communication

Although communication is a powerful instrument for accommodating interaction (and has been examined in previous work [28]), in our analysis here we consider only situations in which communication between the agents is impossible. While this might seem overly restrictive, such situations do occur, e.g., as a result of communications equipment failure or in interactions between agents without a common communications protocol. Furthermore, the results are valuable in the analysis of cooperation with communication [28,27].

Despite the lack of communication, we make the strong assumption that sufficient sensory information is available for the agents to deduce at least partial information about each other's goals and rationality. For example, an autonomous land vehicle in the battlefield may perceive the actions of another autonomous land vehicle and use plan recognition techniques [9] to deduce its destination or target, even in the absence of communication.

3. Study of constraints

In this paper we examine various constraints on the actions of agents in such situations and discuss the effects of these constraints on the utility derived by agents in an interaction. For example, we show that it can be beneficial for one agent to exploit information about the rationality of another agent with which it is interacting. We show that it can also be beneficial for an agent to exploit the similarity between itself and other agents, except in certain symmetric situations where such similarity leads to indeterminate or nonoptimal action.

The study of such constraints and their consequences is important for the design of intelligent, independently motivated agents expected to interact with other agents in unforeseeable circumstances. Without such an analysis, a designer might overlook powerful principles of cooperation or might unwittingly build in interaction techniques that are nonoptimal or even inconsistent.

Section 2 of this paper provides the basic framework for our analysis. The subsequent sections analyze progressively more complicated assumptions about interactions between agents. Section 3 discusses the consequences of acting rationally and exploiting the rationality of other agents in an interaction; section 4 analyzes dependence and independence in decision making; and section 5 explores the consequences of rationality across situations. The concluding section discusses the coverage of our analysis.

II Framework

Throughout the paper we make the assumption that there are exactly two agents per interaction and exactly two actions available to each agent. This assumption substantially simplifies our analysis, while retaining the key aspects of the general case. Except where indicated to the contrary, all results hold in general [10,11,14,27].

The essence of interaction is the dependence of one agent's utility on the actions of another.
We can characterize this dependence by defining the payoff for each agent i in an interaction s as a function p_i^s that maps every joint action into a real number designating the resulting utility for i. Assuming that M and N are the sets of possible moves for the two agents (respectively), we have

    p_i^s : M × N → R.

In describing specific interactions, we present the values of this function in the form of payoff matrices [23], like the one shown in figure 1. The number in the lower left hand corner of each box denotes the payoff to agent J if the agents perform the corresponding actions, and the number in the upper right hand corner denotes the payoff to K. For example, if agent J performs action a in this situation and agent K performs action c, the result will be 4 units of utility for J and 1 unit for K.

Figure 1: A payoff matrix

Although the utilities present in a payoff matrix can generally take on any value, we will only need the ordering of outcomes in our analysis. Therefore, we will only be using the numbers 1 through 4 to denote the utility of outcomes. Each agent is interested in maximizing its own utility.

An agent's job in such a situation is to decide which action to perform. We characterize the decision procedure for agent i as a function W_i from situations (i.e., particular interactions) to actions. If S is the set of possible interactions, we have

    W_i : S → M.

The difficulty in selecting an action stems from lack of information about what the other agent will do. If such information were available, the agent could easily decide what action to perform. In the remainder of the paper we take the viewpoint of agent J.

III Basic Rationality

We begin our analysis by considering the consequences of constraining agent J so that it will not perform an action that is basically irrational. Let R_i^s denote a unary predicate over moves that is true if and only if its argument is rational for agent i in situation s. Then agent J is basically rational if its decision procedure does not generate irrational moves, i.e.,

    ¬R_J^s(m) ⊃ W_J(s) ≠ m.

Note that W_J here is a function that designates which actions are performed by J in each situation, as described above. In order to use this definition to judge rationality, we need to further define the rationality predicate R_J^s.

An action m' dominates an action m for agent J in situation s (written D_J^s(m', m)) if and only if the payoff to J of performing action m' is greater than the payoff of performing action m (the definition for agent K is analogous). Let the term A_K^s(m) denote the action that agent K will perform in situation s if agent J performs action m, so that W_K(s) = A_K^s(W_J(s)); in what follows we call A_K^s the reaction function for K. Then the formal definition of dominance is

    D_J^s(m', m) ⟺ p_J^s(m', A_K^s(m')) > p_J^s(m, A_K^s(m)).

We can now define the rationality predicate. An action is basically irrational if there is another action that dominates it:

    ¬R_J^s(m) ⟺ ∃m' D_J^s(m', m).

Even if J knows nothing about K's decision procedure, this constraint guarantees the optimality of a decision rule known as dominance analysis. According to this rule, an action is forbidden if there is another action that yields a higher payoff for every action of the other agent, i.e.,

Theorem. Basic rationality implies dominance analysis:

    (∃m' ∀n ∀n' p_J^s(m, n) < p_J^s(m', n')) ⊃ W_J(s) ≠ m.

Proof: A straightforward application of the definition of rationality. □

As an example of dominance analysis, consider the matrix in figure 2. In this case, it is clearly best to perform action a, no matter what K does (4 and 3 are both better than 2 and 1). There is no way that J can get a better payoff by performing action b.

Figure 2: Row Dominance
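To make these definitions concrete, the following sketch is our own illustration, not part of the paper: the dictionary encoding, the function names, and K's payoffs in the figure 2 matrix are assumptions. It represents a two-by-two interaction as a map from joint actions to payoff pairs and applies the dominance-analysis rule: an action m is forbidden for J when some alternative m' yields a higher payoff than m for every pair of moves by K, cross terms included.

```python
# A hedged sketch of dominance analysis; the encoding, the function names,
# and K's payoffs below are our own illustrative assumptions.

# A situation maps each joint action (J's move, K's move) to (payoff to J,
# payoff to K).  J's payoffs follow the figure 2 discussion in the text
# (4 and 3 under a, 2 and 1 under b); K's payoffs are made up.
FIGURE_2 = {
    ('a', 'c'): (4, 2), ('a', 'd'): (3, 1),
    ('b', 'c'): (2, 4), ('b', 'd'): (1, 3),
}

def moves(situation, agent):
    """Moves available to agent 0 (J) or agent 1 (K)."""
    return sorted({joint[agent] for joint in situation})

def dominance_forbidden(situation, m):
    """True if some m' yields J a higher payoff than m for every pair of K
    moves (cross terms included), i.e., the rule of dominance analysis."""
    k_moves = moves(situation, 1)
    worst_alternative = lambda mp: min(situation[(mp, n)][0] for n in k_moves)
    best_for_m = max(situation[(m, n)][0] for n in k_moves)
    return any(worst_alternative(mp) > best_for_m
               for mp in moves(situation, 0) if mp != m)

def dominance_analysis(situation):
    """J's moves that survive dominance analysis."""
    return [m for m in moves(situation, 0)
            if not dominance_forbidden(situation, m)]

if __name__ == '__main__':
    print(dominance_analysis(FIGURE_2))   # ['a']: action b is basically irrational
```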
Of course, dominance analysis does not always apply. As an example, consider the payoff matrix in figure 3. In this situation, an intelligent agent J would probably select action a. However, the rationale for this decision requires an assumption about the rationality of the other agent in the interaction.

Figure 3: Column Dominance Problem

In dealing with another agent it is often reasonable to assume that the agent is also basically rational. The formalization of this assumption of mutual rationality is analogous to that for basic rationality:

    ¬R_K^s(m) ⊃ W_K(s) ≠ m.

Using this assumption one can prove the optimality of a technique called iterated dominance analysis.

Theorem. Basic rationality and mutual rationality imply iterated dominance analysis.

Proof: For this proof, and those of several of the following theorems, see [10] and [11]. □

Iterated dominance analysis handles the column dominance problem in figure 3. Using the basic rationality of K, we can show that action d is irrational for K. Therefore, neither ad nor bd is a possible outcome, and J need not consider them.² Of the remaining two possible outcomes, ac dominates bc (from J's perspective), so action b is irrational for J.

²We write mn to describe the situation where J has chosen action m and K has chosen action n; we call this a joint action.

IV Dependence and Independence

Unfortunately, there are situations that cannot be handled by the basic rationality assumptions alone. Their weakness is that they in no way account for dependencies between the actions of interacting agents. This section offers several different, but inconsistent, approaches to dealing with this deficiency.

The simplest case is complete independence. The independence assumption states that each agent's choice of action is independent of the other's; in other words, each reaction function yields the same value for every one of the other agent's actions. For all m, m', n, and n',

    A_K^s(m) = A_K^s(m')  and  A_J^s(n) = A_J^s(n').

The main consequence of independence is a decision rule commonly known as case analysis. If for every "fixed move" of K, one of J's actions is superior to another, then the latter action is forbidden. The difference between case analysis and dominance analysis is that case analysis allows J to compare two possible actions for each action by K without considering any "cross terms."

Theorem. Basic rationality and independence imply case analysis.

As an example, consider the payoff matrix in figure 4. Given independence of actions, a utility-maximizing J should perform action a: if K performs action c, then J gets 4 units of utility rather than 3, and if K performs action d, then J gets 2 units of utility rather than 1. Dominance analysis does not apply in this case, since the payoff (for J) of the outcome ad is less than the payoff of bc.

Figure 4: Case Analysis

By combining the independence assumption with mutual rationality, we can also show the correctness of an iterated version of case analysis.

Theorem. Mutual rationality and independence imply iterated case analysis.

As an example of iterated case analysis, consider the situation in figure 5. J cannot use dominance analysis, iterated dominance analysis, nor case analysis to select an action. However, using case analysis and mutual rationality, K can exclude one of its actions; with this information J can exclude action a.

Figure 5: Iterated Case Analysis Problem
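A second sketch, again our own and with illustrative payoffs wherever the text specifies only J's side of a matrix, contrasts case analysis (the fixed-move comparison licensed by independence) with iterated dominance analysis (repeated deletion of dominated moves, using the other agent's basic rationality).

```python
# Hedged sketches of case analysis and iterated dominance analysis.  The
# encoding is ours; where the text gives only J's payoffs (figures 3 and 4),
# K's payoffs are illustrative assumptions chosen to match the discussion.

FIGURE_3 = {  # column dominance problem: K's action d is dominated for K
    ('a', 'c'): (4, 4), ('a', 'd'): (1, 2),
    ('b', 'c'): (3, 3), ('b', 'd'): (2, 1),
}
FIGURE_4 = {  # case analysis: a beats b for each fixed K move, yet ad < bc for J
    ('a', 'c'): (4, 1), ('a', 'd'): (2, 2),
    ('b', 'c'): (3, 4), ('b', 'd'): (1, 3),
}

def moves(situation, agent):
    return sorted({joint[agent] for joint in situation})

def joint(agent, own, other):
    """Build a joint action from an agent's own move and the other's move."""
    return (own, other) if agent == 0 else (other, own)

def dominates(situation, agent, m_prime, m):
    """Cross-term dominance: every payoff under m' exceeds every payoff under m."""
    other = moves(situation, 1 - agent)
    return (min(situation[joint(agent, m_prime, n)][agent] for n in other) >
            max(situation[joint(agent, m, n)][agent] for n in other))

def case_analysis(situation, agent=0):
    """Keep moves not beaten for every fixed move of the other agent (the
    comparison licensed by the independence assumption)."""
    ms, other = moves(situation, agent), moves(situation, 1 - agent)
    beats = lambda mp, m: all(situation[joint(agent, mp, n)][agent] >
                              situation[joint(agent, m, n)][agent] for n in other)
    return [m for m in ms if not any(beats(mp, m) for mp in ms if mp != m)]

def iterated_dominance(situation):
    """Repeatedly delete, for either agent, any move with a dominator."""
    remaining, changed = dict(situation), True
    while changed:
        changed = False
        for agent in (0, 1):
            for m in moves(remaining, agent):
                if any(dominates(remaining, agent, mp, m)
                       for mp in moves(remaining, agent) if mp != m):
                    remaining = {j: p for j, p in remaining.items() if j[agent] != m}
                    changed = True
                    break
    return sorted(remaining)

if __name__ == '__main__':
    print(case_analysis(FIGURE_4))       # ['a']: a is better for both c and d
    print(iterated_dominance(FIGURE_3))  # [('a', 'c')]: d falls for K, then b for J
```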
Note that, if the two decision procedures are not independent, the independence assumption can lead to nonoptimal results. As an example, consider the following well-known "paradox." An alien approaches you with two envelopes, one marked "$" and the other marked "¢". The first envelope contains some number of dollars, and the other contains the same number of cents. The alien is prepared to give you the contents of either envelope. The catch is that the alien, who is omniscient, is aware of the choice you will make. In an attempt to discourage greed on your part, he has decided to put one unit of currency in the envelopes if you pick the envelope marked "$" but one thousand units if you pick the envelope marked "¢". Bearing in mind that the alien has decided on the contents of the envelopes before you pick one, which envelope should you select?

The payoff matrix for this situation is shown in figure 6. Since the payoff for "$" is greater than that for "¢" for either of the alien's options, case analysis dictates choosing the "$" envelope. Assuming that the alien's omniscience is accurate, this leads to a payoff of $1.00. While selecting the envelope marked "¢" violates case analysis, it leads to a rather better payoff of $10.00.

Figure 6: Omniscient Alien

We can easily solve this problem by abandoning the independence assumption and by describing the alien's reaction function with the appropriate axioms A_K^s($) = 1 and A_K^s(¢) = 1000. These constraints limit J's attention to the lower left hand corner and the upper right hand corner of the matrix. Since the payoff for selecting the "¢" envelope is better than the payoff for selecting the "$" envelope, a rational agent will choose "¢". Although the example given here is whimsical, there are real-world encounters where the assumption of independence is unwarranted, and where the effect illustrated above must enter the rational agent's analysis.
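The role of the reaction function in the alien example can be shown in a few lines of code. The sketch below is our own illustration; the names and encoding are assumptions, while the dollar amounts follow the story above (one unit deposited if you will pick "$", one thousand if you will pick "¢").

```python
# Hedged sketch of the omniscient-alien example (figure 6).  The encoding is
# ours; the payoffs follow the text: picking "$" yields $1.00, picking the
# cent envelope yields 1000 cents, i.e., $10.00.

# J's payoff in dollars for (J's pick, number of units the alien deposited).
def payoff(pick, units):
    return float(units) if pick == 'dollars' else units / 100.0

PICKS = ('dollars', 'cents')
UNITS = (1, 1000)

def case_analysis_choice():
    """Independence assumption: compare picks for each *fixed* alien move.
    'dollars' wins in every column, so case analysis dictates it."""
    best = {u: max(PICKS, key=lambda p: payoff(p, u)) for u in UNITS}
    assert all(b == 'dollars' for b in best.values())
    return 'dollars'

def reaction(pick):
    """The alien's reaction function: A('dollars') = 1, A('cents') = 1000."""
    return 1 if pick == 'dollars' else 1000

def rational_choice():
    """Drop independence and maximize along the reaction function instead."""
    return max(PICKS, key=lambda p: payoff(p, reaction(p)))

if __name__ == '__main__':
    print(case_analysis_choice(), payoff('dollars', reaction('dollars')))  # dollars 1.0
    print(rational_choice(), payoff('cents', reaction('cents')))           # cents 10.0
```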
Another interesting example of action dependence is common behavior. The definition requires that we consider not only the current situation s but also the permuted situation s' in which the positions of the interacting agents are reversed. An agent J and an agent K have common behavior if and only if the action that K will take in any situation s is the same as that of agent J in the permuted situation s':³

    ∀s  W_K(s) = W_J(s').

³This constraint is similar to the similar bargainer assumption in [28,27].

While the assumption of common behavior is a strong one, and may be insupportable in general, it is reasonable for interaction among artificial agents, especially those built from the same design. Unfortunately, it is not as strong as we would like, except when combined with general rationality.

V General Rationality

General rationality is a stronger version of basic rationality; the primary difference is that the rationality predicate applies to decision procedures rather than to individual actions. We introduce a new set of relations and functions over procedures to define general rationality.

Recall that W_J is a function that designates the action performed by J in each situation. Let R_i denote a unary predicate over decision procedures that is true if and only if its argument is rational for agent i. A generally rational agent can use a procedure only if it is rational:

    ¬R_J(P) ⊃ W_J ≠ P.

In order to use this definition we need to further define the rationality predicate over procedures. One procedure dominates another if and only if it yields at least as good a payoff in every game and a better payoff in at least one game. Let the term d_J^s(P) denote the payoff to agent J in situation s if agent J uses decision procedure P, i.e., d_J^s(P) = p_J^s(P(s), A_K^s(P(s))). Then the formal definition of dominance for procedures is

    D_J(P', P) ⟺ (∀s d_J^s(P') ≥ d_J^s(P)) ∧ (∃s d_J^s(P') > d_J^s(P)).

The definition of rationality for a decision procedure is analogous to that for individual actions. A procedure is irrational if there is another procedure that dominates it:

    ¬R_J(P) ⟺ ∃P' D_J(P', P).

The advantage of general rationality is that, together with common behavior (defined in the last section), it allows us to eliminate joint actions that are dominated for all agents by other joint actions, a technique called dominated case elimination.

Theorem. General rationality and common behavior imply dominated case elimination.

Proof: Let s be a situation with joint actions uv and xy such that p_J^s(u, v) > p_J^s(x, y) and p_K^s(u, v) > p_K^s(x, y), and let P be a decision procedure such that P(s) = x and P(s') = y (where s' is the permuted situation, with the agents' positions reversed). Let Q be a decision procedure that is identical to P except that Q(s) = u and Q(s') = v. Under the common behavior assumption, J and K have the same decision procedure; Q dominates P for both agents and, therefore, P is generally irrational. □

In other words, if a joint action is disadvantageous for all agents in an interaction, at least one of them will perform a different action. This conclusion has an analog in the arguments of [5] and [17].
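The following sketch is ours, not the paper's: the helper names are assumptions, and the payoff matrix anticipates the ordinal prisoner's dilemma of figure 8 below. It illustrates dominated case elimination: any joint action that is strictly worse for both agents than some alternative joint action is discarded, and in a symmetric situation common behavior then restricts attention to the mirror-image joint actions that survive.

```python
# Hedged sketch of dominated case elimination under common behavior.  The
# function and encoding are ours; the payoffs are the standard ordinal
# prisoner's dilemma used in figure 8 below (moves a/b for J, c/d for K;
# ac gives each agent 3, bd gives each agent 2).
PRISONERS_DILEMMA = {
    ('a', 'c'): (3, 3), ('a', 'd'): (1, 4),
    ('b', 'c'): (4, 1), ('b', 'd'): (2, 2),
}

def dominated_case_elimination(situation):
    """Discard any joint action that is strictly worse for *both* agents than
    some other joint action (such a joint action will not be played by a
    generally rational, commonly behaving pair)."""
    def dominated(xy):
        return any(p[0] > situation[xy][0] and p[1] > situation[xy][1]
                   for uv, p in situation.items() if uv != xy)
    return {xy: p for xy, p in situation.items() if not dominated(xy)}

def symmetric_choice(situation):
    """With common behavior in a symmetric situation the agents play mirror-
    image moves, so only the a/c-style diagonal joint actions are candidates;
    dominated case elimination then picks among them."""
    surviving = dominated_case_elimination(situation)
    return [xy for xy in surviving if xy in (('a', 'c'), ('b', 'd'))]

if __name__ == '__main__':
    print(sorted(dominated_case_elimination(PRISONERS_DILEMMA)))
    # bd (2, 2) is eliminated by ac (3, 3); with common behavior the agents
    # therefore play a and c and each receives 3 units of utility.
    print(symmetric_choice(PRISONERS_DILEMMA))   # [('a', 'c')]
```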
situation, in figure Neverthe- the resolution 9 remains of a undetermined by we have introduced. ac- will perform rationality to non-symmetric less, in a no-communication VI and Conclusions behavior). 1 C d 4 J 1133 Figure of this approach There are 144 distinct interactions between two agents with two moves and no duplicated payoffs. Of these, the 2 4 b ] 2 IE a Coverage A. K techniques presented remaining 27 cases here cover 117. are unclear, e.g., The solutions to the the situation in fig. 10. 7: Best K Plan Generlll rationality and common behavior also handle difkult rituationr like the prisoner’s dilemma [2,512ti] pica tured in figure 8. Since the situation is symmetric (i.e., s=a ‘, using our earlier notation), common behavior requires that they both perform the same action; general rationality eliminates the joint action bd since it is dominated by ac. The agents perform each receives dictates that to a payoff actions a and c respectively, 3 units of utility. By contrast, case analysis the agents perform actions b and d, leading of only 2 units Figure for each. For a discussion be used to handle contrasting theory, K Unfortunately, ior are not the battle uation both general always agents actions consistent. of the sexes is symmetric, perform on the ac/bd rationality problem and common As an example, in figure and common the same action. diagonal are forbidden behavconsider 9. Again behavior the sit- dictates However, that both joint by dominated case elimination. This stricting 56 inconsistency the by occasionally use of general rationality reand approachs with paper’s analysis of interactions assumptions. have common knowledge of actions of the interaction and their ing utilities). the interaction Third, is given the effects current there be effective must (otherwise, there choices (i.e., a vari- might first, and the new situation to includ- Second, there there are no miss- is viewed to future in isolation interactions have on them). simultaneity are issues matrix, outcomes. in the matrix no consideration presupposes First, the agents are assumed is no incompleteness ( i.e., those used in game of this approach ety of strong and Fourth, in the agents’ actions concerning which that then confronts agent moves the second agent). Admittedly, can be eliminated simultaneous / SCIENCE all of these ing choices Dilemma of a variety of other techniques that can these situations, as well as a discussion Suitability This J 8: Prisoner’s situation see [27]. B. Figure 10: Anomalous and some situations example these are serious where two ALVs they assumptions, are satisfied. approaching opposite but there are Consider as an ends of a narrow tunnel, each having ing one of several the choice alternate of using routes. the tunnel to assume that. they have common knowledge other’s approach (e.g., through reconnaisance). unreasonable to assume of one another’s route navigation, that the agents might be no concern and the decisions counters, The types of analysis to use in deciding For most above to be done domains, restrictive work in [28,27] rently, action and assumptions more steps in that on the question needs in turn. Cur- matrices will inevitably The existence need by these to be handled is routinely and their telligent handled automated by humans. extensions agents need to interact flexibly of conflicting goals will should The agents, results successfully for multi-agent Decision In Proceedings procedures. Workshop, 1985. Between Ira P. G o Id s t ein. 
B. Suitability of this approach

This paper's analysis of interactions with matrices (i.e., those used in game theory) presupposes a variety of strong assumptions. First, the agents are assumed to have common knowledge of the interaction (i.e., there is no incompleteness in the agents' choices and their resulting utilities). Second, there are no missing choices or outcomes in the matrix. Third, the interaction is viewed in isolation (no consideration is given to the effects that current choices have on future interactions). Fourth, there are issues of simultaneity: the matrix presupposes that the agents' actions are effectively simultaneous (otherwise, first one agent moves and the new situation, including the effects of that move, then confronts the second agent).

Admittedly, these assumptions are serious and limiting, but there are some situations where they are satisfied. Consider as an example two ALVs approaching opposite ends of a narrow tunnel, each having the choice of using the tunnel or of using one of several alternate routes. It is not unreasonable to assume that they have common knowledge of one another's approach (e.g., through reconnaissance). Nor is it unreasonable to assume that the agents have knowledge of one another's utility functions. Finally, in the domain of route navigation, the choices are often few and well-defined, there might be no concern over future encounters, and the decisions are effectively simultaneous.

The types of analysis presented in this paper are, of course, far too restrictive for most domains; work needs to be done on the question of what assumptions are an appropriate tool to use in deciding what action to take. The work in [28,27] represents more steps in that direction. Research is clearly needed in developing this approach so that each of the assumptions listed above can be removed in turn. Currently, work on incomplete information is being pursued, so that the type of conflict analysis presented in this paper can be applied to interactions with incomplete information [26]. Future work will focus on issues arising from multiple encounters, such as retaliation and future compensation for present loss.

Intelligent agents will inevitably need to interact flexibly with other entities. The existence of conflicting goals will need to be handled by these automated agents, just as it is routinely handled by humans. The results in this paper and their extensions should be of use in the design of intelligent agents able to function successfully in the face of such conflict.

REFERENCES

[1] D. E. Appelt. Planning Natural Language Utterances to Satisfy Multiple Goals. PhD thesis, Stanford University, 1981.

[2] Robert Axelrod. The Evolution of Cooperation. Basic Books, Inc., New York, 1984.

[3] D. Corkill. A Framework for Organizational Self-Design in Distributed Problem-Solving Networks. PhD thesis, University of Massachusetts, Amherst, MA, 1982.

[4] Daniel D. Corkill and Victor R. Lesser. The use of meta-level control for coordination in a distributed problem solving network. IJCAI-83, pp. 748-756.

[5] L. Davis. Prisoners, paradox and rationality. American Philosophical Quarterly, 14, 1977.

[6] Randall Davis and Reid G. Smith. Negotiation as a metaphor for distributed problem solving. Artificial Intelligence, 20(1):63-109, 1983.

[7] Edmund H. Durfee, Victor R. Lesser, and Daniel D. Corkill. Increasing coherence in a distributed problem solving network. IJCAI-85, pp. 1025-1030.

[8] R. Fagin and J. Y. Halpern. Belief, awareness, and limited reasoning: preliminary report. IJCAI-85, pp. 491-501.

[9] Michael R. Genesereth. The role of plans in automated consultation. IJCAI-79, pp. 311-319.

[10] Michael R. Genesereth, Matthew L. Ginsberg, and Jeffrey S. Rosenschein. Cooperation without Communication. Report HPP 84-36, Heuristic Programming Project, Computer Science Department, Stanford University, September 1984.

[11] Michael R. Genesereth, Matthew L. Ginsberg, and Jeffrey S. Rosenschein. Solving the Prisoner's Dilemma. Report No. STAN-CS-84-1032 (HPP-84-41), Computer Science Department, Stanford University, November 1984.

[12] Michael Georgeff. Communication and interaction in multi-agent planning. AAAI-83, pp. 125-129.

[13] Michael Georgeff. A theory of action for multi-agent planning. AAAI-84, pp. 121-125.

[14] M. L. Ginsberg. Decision procedures. In Proceedings of the Distributed Artificial Intelligence Workshop, pages 43-65, AAAI, Sea Ranch, CA, December 1985.

[15] Ira P. Goldstein. Bargaining Between Goals. A. I. Working Paper 102, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 1975.
[16] Joseph Y. Halpern and Yoram Moses. Knowledge and Common Knowledge in a Distributed Environment. Research Report IBM RJ 4421, IBM Research Laboratory, San Jose, California, October 1984.

[17] D. R. Hofstadter. Metamagical themas: computer tournaments of the prisoner's dilemma suggest how cooperation evolves. Scientific American, 248(5):16-26, May 1983.

[18] Kurt Konolige. A computational theory of belief introspection. IJCAI-85, pp. 502-508.

[19] Kurt Konolige. A Deduction Model of Belief and its Logics. PhD thesis, Stanford University, 1984.

[20] A. L. Lansky. Behavioral Specification and Planning for Multiagent Domains. Technical Note 360, SRI International, Menlo Park, California, November 1985.

[21] V. Lesser and D. Corkill. The distributed vehicle monitoring testbed: a tool for investigating distributed problem solving networks. AI Magazine, 4(3):15-33, Fall 1983.

[22] Hector J. Levesque. A logic of implicit and explicit belief. AAAI-84, pp. 198-202.

[23] R. Duncan Luce and Howard Raiffa. Games and Decisions, Introduction and Critical Survey. John Wiley and Sons, New York, 1957.

[24] R. Moore. A formal theory of knowledge and action. In J. R. Hobbs and R. C. Moore, editors, Formal Theories of the Commonsense World, Ablex Publishing Co., 1985.

[25] D. Parfit. Reasons and Persons. Clarendon Press, Oxford, 1984.

[26] Jeffrey S. Rosenschein. Cooperation among Rational Agents in the Presence of Incomplete Information. Technical Report, Knowledge Systems Laboratory, Computer Science Department, Stanford University, 1986. In preparation.

[27] Jeffrey S. Rosenschein. Rational Interaction: Cooperation among Intelligent Agents. PhD thesis, Stanford University, 1986. Also published as Report STAN-CS-85-1081 (KSL-85-40), Department of Computer Science, Stanford University, 1985.

[28] Jeffrey S. Rosenschein and Michael R. Genesereth. Deals among rational agents. IJCAI-85, pp. 91-99.

[29] R. Steeb, S. Cammarata, F. Hayes-Roth, and R. Wesson. Distributed intelligence for air fleet control. Technical Report WD-839-ARPA, The Rand Corporation, December 1980.

[30] R. G. Smith. A Framework for Problem Solving in a Distributed Processing Environment. PhD thesis, Stanford University, 1978.