Scaffolding and the Insufficiency of the Intentional Stance as a Conceptual Underpinning for Multiagent Systems

Albert Esterline
Dept. of Computer Science, North Carolina A&T State University
1601 East Market Street
Greensboro, NC 27411
esterlin@ncat.edu

Abstract

The intentional stance, which has been a conceptual underpinning of much work on multiagent systems, cannot provide an account of coordinated behavior in terms that can apply equally to humans and to agents. We suggest that one must also include social scaffolding in the sense in which "scaffolding" is used by Andy Clark. Critical aspects of the relevant social scaffolding can be formulated in terms of notions familiar from formalisms that have been applied to software systems and to humans alike.

Introduction

The key feature unifying agent research is viewing computational entities as human-like. The theoretical step justifying this is known as adopting the intentional stance. This paper suggests, however, that the intentional stance cannot provide an account of coordinated behavior in terms that can apply equally to humans and to agents. What is further required is what Andy Clark has called scaffolding (Clark 1997): a complex world of physical and social structures on which the coherence and analytic power of human activity depends. Of interest here is social scaffolding, critical aspects of which can be formulated in terms of notions familiar from formalisms that have been applied to software systems and to humans alike.

The next section reviews Dennett's formulation of the intentional stance, and the following section presents Bratman's position on intentions and plans and considers the contrasting findings of ethnomethodology. For scaffolding, we first consider obligations, suggesting that speech acts, the normal way of establishing directed obligations, be seen as joint actions modeled as handshakes (in the process-algebraic sense). The consequences for a theory of multiagent systems of the fact that specifications establish obligations are investigated in the following section; then common "knowledge" is identified as another aspect of social scaffolding. After a brief caveat on using modal logics in these contexts comes the conclusion.

Space restrictions confine discussion to representative positions. We consider the criticisms of ethnomethodology since they present a clear attack on the foundations of multiagent research by questioning the role of plans, taken in a very broad sense, in coordination. And we consider Bratman's analysis of plans since it is in clear contrast with that of ethnomethodology, exposes the conceptual relations among the notion of plans and other basic notions, and has been endorsed by much of the multiagent community.

The Intentional Stance

The intentional stance finds its classical exposition in the work of the philosopher Dennett (Dennett 1987). The intentional strategy (adopting the intentional stance), according to Dennett, is one of several strategies (such as the physical strategy and the design strategy) we use to predict the behavior of systems. When even the design stance is impractical, there is still the intentional stance:

… first you decide to treat the object … as a rational agent; then you figure out what beliefs that agent ought to have, given its place in the world and its purpose. Then you figure out what desires it ought to have, on the same considerations, and finally you predict that this rational agent will act to further its goals in the light of its beliefs. (p. 17)

Practical reasoning then lets you decide what the agent ought to do, which is what you predict.
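As a toy illustration (not from the original paper), Dennett's predictive recipe can be sketched as attributing beliefs and desires to a system and then running a naive practical-reasoning step. All names, actions, and payoffs below are hypothetical.

```python
# A minimal sketch of Dennett's intentional strategy: attribute beliefs and
# desires to a system, assume rationality, and predict its action.

def predict_action(beliefs, desires, actions):
    """Predict the act a rational agent would choose.

    beliefs: dict mapping (action, outcome) pairs to believed likelihood
    desires: dict mapping outcomes to desirability
    actions: list of available actions
    """
    def expected_value(action):
        return sum(likelihood * desires.get(outcome, 0)
                   for (act, outcome), likelihood in beliefs.items()
                   if act == action)
    # Rationality assumption: the agent acts to further its goals in
    # light of its beliefs, i.e., picks the highest expected value.
    return max(actions, key=expected_value)

# A chess-playing program viewed from the intentional stance (toy data):
beliefs = {("advance_pawn", "win_material"): 0.2,
           ("capture_queen", "win_material"): 0.9,
           ("capture_queen", "lose_king_safety"): 0.1}
desires = {"win_material": 10, "lose_king_safety": -50}
print(predict_action(beliefs, desires, ["advance_pawn", "capture_queen"]))
```

The point of the sketch is that nothing in it depends on the system's internal constitution: the stance works from attributed attitudes plus a rationality assumption alone.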
The intentional stance gives shallow accounts of simple artifacts but, for complex, versatile systems (e.g., chess-playing programs and, paradigmatically, humans), it significantly constrains the internal constitution attributed to the intentional system.

Intentions, Plans, and Criticisms of the AI Planning Model

Critical components of this internal constitution, for both humans and agents, are plans. Bratman (Bratman 1990), whose work inspired BDI agent architectures, distinguishes between a plan as a sort of recipe (as in traditional AI) and a plan in the sense of having a plan, a mental state that is essentially an intention. Bratman distinguishes between two kinds of pro-attitudes. Pro-attitudes that are merely potential influencers of conduct include such things as desires, but intentions (plans) are conduct controlling. They provide consistency constraints, as they must be both internally consistent and consistent with the agent's beliefs. They should also have means-end coherence, being elaborated enough for me to do what I now plan. Intentions thus pose problems for deliberation and hence determine the relevance of options. They are stable yet (for rationality) not irrevocable. So future-directed intentions allow deliberation in advance of conduct and support coordination by supporting expectations of their successful execution. A feature of plans/intentions characteristic of limited agents is that they are partial: details are left for later deliberation.

(Copyright © 2007, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.)

Suchman (Suchman 1987) criticized the notion, predominant in classical AI, of plans as formal structures generating actions.
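Bratman's picture of intentions as conduct-controlling, partial, and consistency-constrained can be given a toy sketch before turning to the criticisms. The options, the time-slot consistency test, and all data below are invented for illustration.

```python
# Toy sketch of intentions as conduct-controlling pro-attitudes: adopted
# intentions screen out options inconsistent with them, while remaining
# partial (details are settled by later deliberation). Data hypothetical.

def consistent(option, intentions):
    # Crude internal-consistency test: no time slot may be claimed by
    # both an existing intention and the candidate option.
    return all(option["slot"] != i["slot"] for i in intentions)

def deliberate(options, intentions):
    """Admit only options consistent with current intentions (Bratman's
    consistency constraint); admitted options become intentions."""
    for opt in options:
        if consistent(opt, intentions):
            intentions.append(opt)
    return intentions

intentions = [{"act": "teach", "slot": "10:00"}]
options = [{"act": "attend_meeting", "slot": "10:00"},   # filtered out
           {"act": "write_review", "slot": "11:00"}]     # admitted
print([i["act"] for i in deliberate(options, intentions)])
```

Note how the adopted intention determines the relevance of later options rather than being recomputed from scratch, which is what gives intentions their stability.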
Face-to-face human conversation is taken as the baseline for assessing the state of human-computer interaction; such conversation is found to be at odds with the traditional planning model since it is not so much an alternating series of actions as a joint activity continuously engaging the participants. Suchman maintains that a plan is not a generative mechanism but an artifact of reasoning about actions, an abstraction over actions; a plan is a resource for action, used retrospectively to justify a course of action and before the fact to orient us. Suchman's work heralded a body of naturalistic research by social scientists into interaction in technology-supported activities. Ethnomethodology and conversation analysis, the main orientations here, "examine the ways in which participants reflexively, and ongoingly, constitute the sense of intelligibility of the 'scene' from within the activities in which they are engaged" (Heath and Luff 2001, p. 19). It is generally found that "the accomplishment of complex tasks … is 'ongoingly' co-ordinated with the actions of others" (Heath and Luff 2001, p. 20).

Bratman's position, in contrast, is that a plan as a resource is a recipe, which, if embraced by intention, becomes a plan effective of action. In fact, the notion of expectations based on intentions comes closer to the core of multiagent systems and group activity than does that of an inert recipe. An intention, however, is an attitude, hence private; for multiagent systems, we consider communication acts, which are public. So we shift from intentions to obligations. "Obligation" and "commitment" are related, but the latter can also describe an attribute of a person (a resolve), so we prefer "obligation."

Obligations

Obligations, like intentions, are normally stable, and they are defeasible (overridable). There is interplay between the consistency constraint on obligations and their defeasibility. We may have several prima facie obligations that clash in that not all can be discharged. For example, I might be obligated to teach at 10:00 AM and also be obligated to appear in court at 10:00 AM. One of the prima facie obligations defeats the others, becoming the actual obligation. And what was once an actual obligation may be defeated by a new obligation.

Whereas having an intention/plan and its reconsideration are up to the agent, being under an obligation and which prima facie obligations defeat others are objective. Thus, one can have an obligation to A yet not intend to A; conversely, one can have an intention without the corresponding obligation. Yet there is a close relation between intentions and obligations since normally one intends to discharge one's obligations; otherwise, obligations would be pointless. Thus, obligations serve much the same functions Bratman identifies for intentions. They support coordination by supporting expectations of their successful discharge because they are normally stable, and they drive means-end reasoning. Imposing an obligation on oneself (e.g., by promising) or on another (e.g., by commanding) allows deliberation in advance of conduct. But obligations have a more meager requirement of means-end coherence, being more abstract and one more step away from conduct. They are also conduct controllers, again one more step away from conduct, although they are not attitudes. Focusing on obligations gives an angle on the intentional stance that emphasizes ideal behavior.

Speech Acts, Handshakes, and Process-algebraic Plans

Focusing on obligations emphasizes speech acts, the normal way to establish directed obligations, where there is an obligor (subject to the obligation) and an obligee (to whom the obligation is owed). For example, a promise establishes the speaker as the obligor and the addressee as the obligee, while a command reverses these roles. As contemporary analysis of face-to-face conversation emphasizes the active role of addressees (e.g., nods), we view speech acts as joint actions. Since the agents involved in a joint action must time their contributions so that each contributes only when all are prepared, a good formal model for a speech act is a handshake in a process algebra (e.g., the π-calculus (Milner 1999)): a joint communication action that happens only when both parties are prepared. In a process algebra, terms denote processes, and combinators apply to processes to form more complex processes. Combinators typically include alternative and parallel composition as well as a prefix combinator that forms a process from a given process and a name. Names come in complementary pairs, and parallel processes may handshake only if they have alternatives with complementary prefixes, that is, only when both are prepared. A handshake results in an action identified by the prefix of the selected alternative, and the resulting process consists of only the selected alternative with its prefix removed. A process can evolve only by handshaking, which synchronizes components' behaviors.

Plans as process-algebraic terms, then, provide an analysis in line with conversation and are appropriate for multiagent systems. Such terms can represent the structure of a joint activity. What is effective of the activity, however, are the intentions of the agents to participate and, at a more abstract level, the obligations that spawn these intentions and that arise as a result of communication actions within the activity itself.

Specifications and Obligations

Khosla and Maibaum (Khosla and Maibaum 1987) point out that a specification of a software system establishes obligations on the behavior of the specified system. When the system is an agent, the aspects of the specification that relate to its behavior within a multiagent system are the constraints on the sequences of communication acts it may perform. These can be seen as sequences of speech acts establishing and discharging obligations. Violation warrants a sanction, such as taking a compensating action. The expectations agents have of the behavior of other agents derive from the others' specifications. So we view the general notion of an agent at the specification level. That this is the appropriate level is also suggested by the fact that there are many ways to realize a handshake, and picking out the handshakes is part of characterizing the situation. (For this point regarding adjacency pairs in conversation analysis, see (Heritage 1984, p. 302).) The important thing for us is that we can specify that certain handshakes are required and obligate other handshakes, no matter how these handshakes are realized. As a specification can be implemented any number of times, the agent abstraction is for a role. A role can be seen as a resource since behavior is stated declaratively, but once the role is assumed, obligations are assumed, and it becomes effective of action. A role can introduce special obligations, with an obligor but no obligee, which tend to be more abstract than directed obligations. The protocol specified by the environment introduces general obligations, with neither an obligor nor an obligee, which are conditions for any agent to be in the multiagent system. So obligations, being public, are part of the social scaffolding needed for cooperation. Obligations persist, some (like many directed obligations) until they are discharged, while others (special obligations) attach to a role or hold generally of anyone or anything participating in the given society; but almost all can be defeated in some circumstances by other obligations.

Common Knowledge

While obligations are effective of coordinated activity, common knowledge is a necessary condition for it. To explain this concept (Fagin et al. 2003), let G be a group of n agents, denoted by ordinals, G = {1, 2, …, n}. We introduce n modal operators Ki, 1 ≤ i ≤ n, where Ki φ is read "agent i knows that φ". EG φ, read as "everyone (in G) knows that φ," is defined as K1 φ ∧ K2 φ ∧ … ∧ Kn φ. Let EGk be the EG operator iterated k times. Then "it is common knowledge in group G that φ," in symbols CG φ, is defined by the infinite conjunction EG1 φ ∧ EG2 φ ∧ … ∧ EGi φ ∧ …, that is, everyone knows that φ, everyone knows that everyone knows that φ, and so on, for arbitrarily deep nestings of "everyone knows that."

Common knowledge is a prerequisite for coordinated action. For example, traffic lights would not work unless it were common knowledge that green means go, that red means stop, and that lights for opposite directions show different colors. If this were not so, we would not confidently drive through a green light. In a standard epistemic logic (such as S5) augmented with the operators defined above, it is easy to show that, if everyone in G agrees that φ, then the agreement is common knowledge (Fagin et al. 2003). It can also be shown formally that coordination implies common knowledge.

Characterizing common knowledge in terms of an infinite conjunction is labeled the iterate approach by Barwise (Barwise 1989), who identifies two other approaches. In the fixed-point approach (which eliminates the infinite conjunction), we view CG φ as a fixed point of the function (Fagin et al. 2003) f(x) = EG(φ ∧ x). Specifically, in augmented S5, we can derive CG φ ↔ EG(φ ∧ CG φ). The third approach (which we formulate following (Clark and Carlson 1982)) is the shared-situation approach. For this, where A and B are rational, we may infer common knowledge between A and B that φ if
1. A and B know that some situation V holds.
2. V indicates to both A and B that both A and B know that V holds.
3. V indicates to both A and B that φ.
Barwise concludes that the fixed-point approach (essentially implied by the shared-situation approach) is the correct analysis of common knowledge, and that common knowledge generally arises via shared situations. It is tempting to view the iterate approach as characterizing how common knowledge is used, but progress through the ever deeper nestings of "everyone knows that" is blocked by doubt at some level.

Barwise points out that common knowledge is not properly knowledge. Knowing that φ is stronger than carrying the information that φ since it relates to the ability to act. Also (Fagin et al. 2003), if we consider common knowledge as a disposition of individuals, we face the paradox of logical omniscience, since the standard possible-worlds semantics of epistemic logic requires that we know all logical consequences of what we know. And a thesis of standard epistemic logic states that information becomes shared in the required sense at the same time for all agents sharing it, no surprise since all the agents are involved in the circularity. Barwise concludes that common knowledge (as per the fixed-point approach) is a necessary but not sufficient condition for action and is useful only when arising in a shared situation that, if maintained, "provides a stage for maintaining common knowledge."

Common knowledge, rather than being a disposition mysteriously shared by a group, is a feature of the social scaffolding. If we analyze situations rigorously, then fixed points seem appropriate, and logical omniscience and simultaneity requirements regarding the scaffolding are acceptable. Barwise addresses situations, but the scaffolding and the common "knowledge" it supports are much more embracing. H. H. Clark and Carlson (Clark and Carlson 1982) identified three "co-presence heuristics" giving rise to different kinds of shared "situations".
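As an aside, the fixed-point approach can be made concrete on a finite Kripke model: CG φ holds at a world exactly when φ holds at every world reachable, in one or more steps, through any agent's accessibility relation, which is the greatest fixed point of f(x) = EG(φ ∧ x). The two-agent model below is a hypothetical illustration, not an example from the paper.

```python
# Sketch: evaluating E_G and C_G on a finite Kripke model. Worlds are
# strings; each agent's accessibility relation is a dict from a world to
# the set of worlds that agent considers possible there. A "fact" is the
# set of worlds at which the proposition phi holds.

def everyone_knows(worlds, relations, target):
    """E_G target: worlds where, for every agent, all accessible worlds
    lie inside target."""
    return {w for w in worlds
            if all(set(r[w]) <= target for r in relations)}

def common_knowledge(worlds, relations, fact):
    """C_G fact as the greatest fixed point of f(x) = E_G(fact & x):
    start from all worlds and iterate until the set stabilizes."""
    x = set(worlds)
    while True:
        new = everyone_knows(worlds, relations, fact & x)
        if new == x:
            return x
        x = new

worlds = {"u", "v"}
fact = {"u"}                        # phi holds only at u
r1 = {"u": {"u"}, "v": {"v"}}       # agent 1's accessibility relation
r2 = {"u": {"u"}, "v": {"u", "v"}}  # agent 2's accessibility relation
print(common_knowledge(worlds, [r1, r2], fact))
```

Since f is monotone and the iteration descends from the full set of worlds, it terminates on any finite model, mirroring the elimination of the infinite conjunction that the fixed-point approach provides.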
Two of these, physical co-presence and linguistic co-presence, properly relate to situations, but the third, community membership, is not temporally or spatially restricted, and they suggest that the other two heuristics presuppose it. The scaffolding in fact allows us to escape spatial and temporal bounds and to overcome simultaneity requirements, as the use of writing notably attests.

A Note on Modal Logics

Although the aspects of the social scaffolding we address relate to concepts in some sense captured by modal logics, these logics do not tell us how the concepts apply to reality. We have suggested that epistemic logic, at least regarding common "knowledge", is really concerned with the scaffolding. Deontic logics purportedly capture the notions of obligation, permission, and prohibition. Standard deontic logics relate to "ought to be," while "ought to do" is perhaps more natural, and deontic modalities relating to agency, as in "ought to bring it about that," are perhaps more revealing. Yet, while there is no one correct way to analyze deontic notions, the alternatives and the attempts to dispel the paradoxes plaguing deontic logics sharpen our analytic tools.

Conclusion

To provide an account of coordinated behavior, one must augment the intentional stance with social scaffolding, critical aspects of which can be formulated in terms of notions familiar from formalisms that have been applied to software systems and to humans alike. We have reviewed Bratman's position on intentions and plans and considered the contrasting findings of ethnomethodology, that complex tasks are accomplished by ongoingly coordinating with the actions of others. For scaffolding, we first considered obligations, which provide most of the support for activity attributed to intentions but, being part of the scaffolding, are effective of coordinated activity. We suggested that speech acts, the normal way of establishing directed obligations, be seen as joint actions modeled as handshakes; then plans as process-algebraic terms provide an analysis in line with conversation that is appropriate for multiagent systems. We suggested that the general notion of an agent is at the specification level and relates to roles, which come with obligations. Finally, we addressed common knowledge as a prerequisite for coordinated action; rather than something dispositional, it is seen as a feature of the social scaffolding, a view that mollifies several paradoxes.

For multiagent systems, much of the scaffolding consists of protocols that are common knowledge among participants and guarantee establishment of the required common knowledge. This scaffolding is so transparent that it can be mistaken for the activities it supports. For example, a syntactically correct message with a performative is a common characterization of a speech act but, in fact, is not an act of any kind. Indeed, synchronous communication (akin to human conversation) is conceptually more primitive than asynchronous communication, although the latter is more tractable given the physical scaffolding supplied by the computational infrastructure. A penetrating analysis of multiagent systems can be achieved with a proper view of the relation between the scaffolding and the coordinated activity it supports.

References

Barwise, J. 1989. On the Model Theory of Common Knowledge. In The Situation in Logic, chapter 9. Stanford, CA: CSLI.

Bratman, M. 1990. What Is Intention? In Cohen, P. R., Morgan, J., and Pollack, M. E., eds., Intentions in Communication. Cambridge, MA: The MIT Press. 15-31.

Clark, A. 1997. Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: The MIT Press.

Clark, H. H., and Carlson, T. B. 1982. Speech Acts and Hearers' Beliefs. In Smith, N. V., ed., Mutual Knowledge. New York: Academic Press. 1-37.

Dennett, D. C. 1987. True Believers: The Intentional Strategy and Why It Works. In The Intentional Stance. Cambridge, MA: The MIT Press. 13-35.

Fagin, R., Halpern, J. Y., Moses, Y., and Vardi, M. Y. 2003. Reasoning About Knowledge. Cambridge, MA: The MIT Press.

Heath, C., and Luff, P. 2001. Technology in Action. Cambridge, UK: Cambridge University Press.

Heritage, J. 1984. Garfinkel and Ethnomethodology. Cambridge, UK: Polity Press.

Khosla, S., and Maibaum, T. S. E. 1987. The Prescription and Description of State Based Systems. In Banieqbal, B., Barringer, H., and Pnueli, A., eds., Temporal Logic in Specification. Berlin: Springer-Verlag. 243-294.

Milner, R. 1999. Communicating and Mobile Systems: The π-Calculus. Cambridge, UK: Cambridge University Press.

Suchman, L. A. 1987. Plans and Situated Actions: The Problem of Human-Machine Communication. New York: Cambridge University Press.