Proceedings, Eleventh International Conference on Principles of Knowledge Representation and Reasoning (2008) On Notions of Causality and Distributed Knowledge∗ Ron van der Meyden University of New South Wales meyden@cse.unsw.edu.au Abstract by saying that anything that u knows about the outside world must be distributed knowledge to Iu . In previous work (van der Meyden 2007), we have used this formulation of the intuition to argue that intransitive noninterference, a notion of causality from the computer security literature, is flawed, and have proposed a number of other definitions of causality in response to this failure. We justifed these new definitions in that work primarily by showing that they lead to a more satisfactory account of the classical proof theory and applications for intransitive noninterference. In this paper, we show that these new notions in fact satisfy the intuition much more generally than in the single example considered in (van der Meyden 2007), and give a refined statement of how this is the case. In doing so, we argue that distributed knowledge, as it has been defined in the epistemic logic literature, is not the only useful way to formalise the intuitive notion of what a group of agents would know if they were to combine their information. We develop several new distributed-knowledge like modalities for our application. However, we also attempt to do more than show that the new notions of distributed knowledge satisfy the intuition concerning an agent’s knowledge and the distributed knowledge of its interferers. We also consider the converse relationship between knowledge and causality. That is, we ask whether it is the case that when the expected relationship between the knowledge of an agent “about other agents” and the distributed knowledge of its interferers holds, we can conclude that the system has the expected causal structure. We show that this converse relationship does in fact hold, provided one uses appropriate formulations of distributed knowledge and what it means for an agent to know something “about” other agents. Some of the details are subtle, and give fresh insight into how information flows in the class of systems we consider. The notion of distributed knowledge is used to express what a group of agents would know if they were to combine their information. The paper considers the application of this notion to systems in which there are constraints on how an agent’s actions may cause changes to another agent’s observations. Intuitively, in such a setting, one would like that anything an agent knows about other agents must be distributed knowledge to the agents that can causally affect it. In prior work, we have argued that the definition of intransitive noninterference — a notion of causality used in the literature on computer security — is flawed because it fails to satisfy this property, and have proposed alternate definitions of causality that we have shown to be better behaved with respect to the theory of intransitive noninterference. In this paper we refine this understanding, and show that in order for the converse of the property to hold, one also needs a novel notion of distributed knowledge, as well as a new notion of what it means for a proposition to be “about” other agents. Introduction It is a commonly held intuition that information flows along causal lines: where there is no causal relationship, there will be no flow of information. In this paper, we attempt to give a precise characterization of this intuition using notions drawn from the literature on computer security (where lack of causality is called noninterference, and is considered in dealing with information flow security) and the literature on epistemic logic. In particular, we start with the following idea. Suppose that a system is structured so that the only way that an agent u may be causally affected by the outside world is through the activity of a set of “interfering” agents Iu . Then any information that u has about the outside world, it must have obtained by somehow combining the information that it received from agents Iu . The notion of distributed knowledge is used in the epistemic logic literature to capture what could be deduced if we were to combine all the information available to a group of agents. Thus, we may express our intuition Intransitive Noninterference We begin by recalling some notions of causality used in the computer security literature. The definitions are cast in terms of a formal semantic model for multi-agent systems. Several different models have been used in the literature; we follow the stateobserved machine formulation of (Rushby 1992). This ∗ Thanks to the Computer Science Department, Stanford University for hosting a sabbatical visit during which some of this research was conducted. Work of the author supported by an Australian Research Council Discovery grant. c 2008, Association for the Advancement of Artificial Copyright Intelligence (www.aaai.org). All rights reserved. 209 consisting of a High security agent H, a low security agent L, and a downgrader agent D, whose role is to make declassification decisions that release High security information to L. Here the policy is H D L.2 P-security says that L can learn about D actions, but will never know anything about H actions. Thus, in a P-secure system, L will not know about H activity even if D has chosen to downgrade it. For example, if h, d are actions of H, D, respectively, then purgeL (hdh) = d = purgeL (d). Thus, according this definition, L cannot distinguish the sequences hdh and d, so D is unable to downgrade the fact that action h has occurred. To avoid this problem, Haigh and Young (Haigh and Young 1987) generalised the definition of the purge function to intransitive policies as follows. Intuitively, the intransitive purge of a sequence of actions with respect to a domain u is the largest subsequence of actions that could form part of a causal chain of effects (permitted by the policy) ending with an effect on domain u. More formally, the definition makes use of a function sources : A∗ × D ⇒ P(D) defined inductively by sources(, u) = {u} and model consists of deterministic machines of the form S, s0 , A, step, obs, dom, where S is a set of states, s0 ∈ S is the initial state, A is a set of actions, dom : A → D associates each action to an element of the set D of agents, step : S × A → S is a deterministic transition function, and obs : S × D → O maps states to an observation in some set O, for each agent. The nomenclature D and dom for agents arises from the fact that, in the security literature, D is though of as the set of security domains. For example, in a multi-level secure system, a security domain is a pair consisting of of a security level (e.g. Low, High, Secret) together with a set of classes (e.g. Nuclear, NATO, FBI) used to restrict information access on a need-to-know basis. Several agents might be assigned to such a security domain in this interpretation. For our purposes in this paper, we will think of each element of D as a single agent, since this better fits the perspective that we apply from the logic of knowledge. We write s · α for the state reached by performing the sequence of actions α ∈ A∗ from state s, defined inductively by s · = s, and s · αa = step(s · α, a) for α ∈ A∗ and a ∈ A. Here denotes the empty sequence. Permitted causal relationships are expressed in the security literature using noninterference policies, wich are relations ⊆ D × D, with u v intuitively meaning that “actions of agent u are permitted to interfere with agent v”, or “information is permitted to flow from agent u to agent v”. Since, intuitively, an agent should be allowed to interfere with, or have information about, itself, this relation is assumed to be reflexive.1 Noninterference was given a formal semantics for transitive noninterference policies (which arise naturally from partially ordered security levels) by Goguen and Meseguer (Goguen and Meseguer 1982), using a definition based on a “purge” function. Given a set E ⊆ D of agents and a sequence α ∈ A∗ , we write α E for the subsequence of all actions a in α with dom(a) ∈ E. Given a policy , we define the function purge : A∗ × D → A∗ by purge(α, u) = α {v ∈ D | v u}. (For clarity, we may use subscripting of agent arguments of functions, writing e.g., purge(α, u) as purgeu (α).) sources(aα, u) = sources(α, u)∪ {dom(a) | ∃v ∈ sources(α, u)(dom(a) v)} for a ∈ A and α ∈ A∗ . Intuitively, sources(α, u) is the set of domains v such that there exists a sequence of permitted interferences from v to u within α. The intransitive purge function ipurge : A∗ ×D → A∗ is then defined inductively by ipurge(, u) = and ipurge(aα, u) a · ipurge(α, u) = ipurge(α, u) if dom(a) ∈ sources(aα, u) otherwise for a ∈ A and α ∈ A∗ . The intransitive purge function is then used in place of the purge function in Haigh and Young’s definition: Definition 2: M is IP-secure with respect to a policy if for all u ∈ D and all sequences α, α ∈ A∗ with ipurgeu (α) = ipurgeu (α ), we have obsu (s0 · α) = obsu (s0 · α ). Definition 1: A system M is P-secure with respect to a policy if for all agents u and for all sequences α, α ∈ A∗ such that purgeu (α) = purgeu (α ), we have obsu (s0 · α) = obsu (s0 · α ). It can be seen that ipurgeu (α) = purgeu (α) when is transitive, so IP-security is in fact a generalisation of the definition of security for transitive policies. This definition solves the problem noted above, since now we have ipurgeL (hdh) = hd. Here we see that L can now learn of the first occurrence of h. (The second h remains invisible to L. This is in accordance with the intent of the definition - L should only know of H events that have been explicitly downgraded. ) This can be understood as saying that agent u’s observation depends only on the sequence of interfering actions that have been performed. This definition is appropriate when the noninterference policy is transitive, but it has been considered to be inappropriate for the intransitive case. An example of this is systems Knowledge and Distributed Knowledge 1 We follow the terminology historically used in the area even though it is peculiar: note that the relation specifies permitted interferences rather than required noninterferences, and when we speak of “intransitive noninterference”, we mean that noninterference policies are not assumed to be transitive (rather than assumed to be not transitive). We have recently presented an argument against the definition of intransitive noninterfere (van der Meyden 2007), in 2 When presenting policies we list only the nonreflexive relations; the policy should be taken to be the reflexive closure of the facts given. 210 which we exploit intuitions from the literature on the logic of knowledge (Fagin et al. 1995). In this section, we recall the relevant background from the latter area. Let Prop be a set of atomic propositions. We define a propositional modal logic based on a set Op of monadic modal operators (to be introduced below), with formulas defined as follows: if p ∈ Prop then p is formula, and if φ and ψ are formulas and X ∈ Op is a modal operator, then ¬φ, φ ∨ ψ and Xφ are formulas. We use standard boolean abbreviations such as φ ⇒ ψ for ¬φ ∨ ψ. If u ∈ D is an agent and G ⊆ D is a nonempty set of agents, the set Op will contain the operators Ku and DG . Intuitively, the formula Ku φ expresses that agent u knows φ, and DG φ expresses that φ is distributed knowledge to the group G, which means that the group G, collectively, knows φ. (We introduce additional operators below.) We take the atomic propositions to be interpreted over sequences of actions of a system M with actions A, by means of an interpretation function π : Prop → P(A∗ ). Formulas φ are then interpreted as being satisfied at a sequence of actions3 α ∈ A∗ by means of a relation M, π, α |= φ. For atomic propositions p ∈ Prop, this relation is defined by M, π, α |= p if α ∈ π(p). The semantics of the operators for knowledge is defined using the following notion of view. The definition uses an absorbtive concatenation function ◦, defined over a set X by, for s ∈ X ∗ and x ∈ X, by s ◦ x = s if x is equal to the final element of s (if any), and s ◦ x = s · x (ordinary concatenation) otherwise. The view of agent u with respect to a sequence α ∈ A∗ is captured using the function viewu : A∗ → (A ∪ O)∗ (where O is the set of observations in the system), defined by This is essentially the definition of knowledge used in the literature on reasoning about knowledge (Fagin et al. 1995) for an agent with asynchronous perfect recall . The notion of distributed knowledge is defined in the literature as follows. First, define the relations ∼G on sequences of actions by α ∼G α if α ∼u α for all u ∈ G. The operators DG are then given semantics by the clause M, π, α |= DG φ if M, π, α |= φ for all sequences α such that α ∼G α . Intuitively, this definition says that a fact is distributed knowledge to the set of agents G if it could be deduced after combining all the information that these agents have. Note that combination of the agents’ information is here being modelled as the intersection of their “information sets” — we will argue below that there are alternatives to this definition that are both reasonable and useful. The Key Intuition Suppose that a system is secure with respect to a noninterference policy . Given an agent u, let Iu = {v ∈ D | v u, v = u} be the set of all other agents that are permitted to interfere with u. Let p be a proposition that expresses a fact about some agent other than u, and suppose u knows p. Intuitively, if the system respects the noninterference policy, then since p is not “local information”, the only way that u should be able to learn that p holds is by receiving information from the agents Iu . However, u may have deduced p by combining facts received from several sources. Since we have assumed agents have perfect recall, those sources should also know those facts. Hence, if we combine the information known to agents in Iu , then we should also be able to deduce p. Thus, we expect that if the system M satisfies the policy and p is interpreted by π as being about agents other than u, then viewu () = obsu (s0 ), and viewu (αa) = (viewu (α) · b) ◦ obsu (s0 · α), M, π, α |= Ku p ⇒ DIu p where b = a if dom(a) = u and b = otherwise. That is, viewu (α) is the sequence of all observations and actions of domain u in the run generated by α, compressed by the elimination of stuttering observations. Intuitively, viewu (α) is the complete record of information available to agent u in the run generated by the sequence of actions α. The reason we apply the absorbtive concatenation is to capture that the system is asynchronous, with agents not having access to a global clock. Thus, two periods of different length during which a particular observation obtains are not distinguishable to the agent. Using the notion of view, we may define for each agent u an equivalence relation ∼u on sequences of actions by α ∼u α if viewu (α) = viewu (α ). The semantics for the knowledge operators may then be given by (1) ∗ for all α ∈ A . In (van der Meyden 2007), we presented an argument against IP-security on the grounds that this intuition can be false when we interpret “the system M satisfies the policy ” as saying that M is IP-secure with respect to . The essential reason for this is that the intransitive purge ipurgeL (α) preserves not just certain actions from the sequence α, but also their order. This allows L to “know” this order in situations where an intuitive reading of the policy would suggest that it ought not to know this order. The following reproduces the example that we used in (van der Meyden 2007) to show that IP-security does not satisfy the intuition concerning the relationship between causality and distributed knowledge. M, π, α |= Ku φ if M, π, α |= φ for all sequences α such that α ∼u α . Example 1: Consider the intransitive policy given by H1 D1 , H2 D2 , D1 L and D2 L. Intuitively, H1 , H2 are two High security domains, D1 , D2 are two downgraders, and L is an aggregator of downgraded information. For this policy, we have IL = {D1 , D2 } so (1) requires that 3 The reader may have expected to see propositions interpreted more generally at sequences alternating states and actions. We remark that there is in fact no loss of generality, because the assumption that actions are deterministic means that the sequence of states in a run is uniquely determined by the sequence of actions. M, π, α |= KL p ⇒ D{D1 ,D2 } p 211 (2) for all α ∈ A∗ . We show that if security is interpreted as IP-security, then this can be false. Define the system M with actions A = {h1 , h2 , d1 , d2 , l} with domains H1 , H2 , D1 , D2 , and L, respectively. The set of states of M is the set of all strings in A∗ . The transition function is defined by concatenation, i.e. for a state α ∈ A∗ and an action a ∈ A, step(α, a) = αa. The observation functions are defined using the ipurge function associated to the above policy: obsu (α) = [ipurge(α, u)]. (Here we put brackets around the sequence of actions when it is interpreted as an observation, to distinguish such occurrences from the actions themselves as they occur in a view.) It is plain that M is IP-secure. For, if ipurge(α, u) = ipurge(α , u) then obsu (s0 · α) = [ipurge(α, u)] = [ipurge(α , u)] = obsu (s0 · α ). We show that it does not satisfy condition (2). Consider the sequences of actions α1 = h1 h2 d1 d2 and α2 = h2 h1 d1 d2 . Note that these differ in the order of the events h1 , h2 . Let the atomic proposition p be interpreted by π as asserting that there is an occurrence of h1 before an occurrence of h2 . That is, π(p) = {αh1 βh2 γ | α, β, γ ∈ A∗ }. Then we have obsL (α1 ) = [ipurge(α1 , L)] = [h1 h2 d1 d2 ]. Hence, for any sequence α with α1 ∼L α , we have ipurgeL (α ) = h1 h2 d1 d2 , so α ∈ π(p). Thus M, π, α1 |= KL p, i.e., in α1 agent L knows the ordering of the events h1 , h2 . We demonstrate that α2 is a witness showing that it is not the case that M, π, α1 |= D{D1 ,D2 } p, i.e., we have α1 ∼{D1 ,D2 } α2 and α2 ∈ π(p). The latter is trivial. For the former, note an agent’s observation may not give it more than this maximal amount of information. The definitions differ in the modelling of the maximal information, but all take the view that an agent increases its information either by performing an action or by receiving information transmitted by another agent. In the first model of the maximal information, what is transmitted when an agent performs an action is information about the actions performed by other agents. The following definition expresses this in a weaker way than the ipurge function. Given sets X and A, let the set T (X, A) be the smallest set containing X and such that if x, y ∈ T and z ∈ A then (x, y, z) ∈ T . Intuitively, the elements of T (X, A) are are binary trees with leaves labelled from X and interior nodes labelled from A. Given a policy , define, for each agent u ∈ D, the function tau : A∗ → T ({}, A) inductively by tau () = , and, for α ∈ A∗ and a ∈ A, 1. if dom(a) u, then tau (αa) = tau (α), 2. if dom(a) u, (tau (α), tadom(a) (α), a). then tau (αa) = Intuitively, tau (α) captures the maximal information that agent u may, consistently with the policy , have about the past actions of other agents. (The nomenclature is intended to be suggestive of transmission of information about actions.) Initially, an agent has no information about what actions have been performed. The recursive clause describes how the maximal information tau (α) permitted to u after the performance of α changes when the next action a is performed. If a may not interfere with u, then there is no change, otherwise, u’s maximal permitted information is increased by adding the maximal information permitted to dom(a) at the time a is performed (represented by tadom(a) (α)), as well the fact that a has been performed. Thus, this definition captures the intuition that an agent may only transmit information that it is permitted to have, and then only to agents with which it is permitted to interfere. viewD1 (α1 ) = [] ◦ [h1 ] ◦ [h1 ] ◦ d1 ◦ [h1 d1 ] ◦ [h1 d1 ] = [] ◦ [] ◦ [h1 ] ◦ d1 ◦ [h1 d1 ] ◦ [h1 d1 ] = viewD1 (α2 ) i.e., α1 ∼D1 α2 . By symmetry, we also have α1 ∼D2 α2 , hence α1 ∼{D1 ,D2 } α2 . This means that D1 and D2 do not have distributed knowledge of the ordering of the events h1 , h2 , even with respect to the asynchronous perfect recall intepretation of knowledge, in which they reason based on everything that they learn in the run. Thus, L has acquired information that cannot have come from the two sources D1 and D2 that are supposed to be, according to the policy, its only sources of information. Definition 3: A system M is TA-secure with respect to a policy if for all agents u and all α, α ∈ A∗ such that tau (α) = tau (α ), we have obsu (s0 ·α) = obsu (s0 ·α ). Intuitively, this says that each agent’s observations provide the agent with no more than the maximal amount of information that may have been transmitted to it, as expressed by the functions ta. Like IP-security, the definition of TA-security views it as permissible for an agent to cause the transmission of information that it is permitted to have, but has not itself observed. For example, an agent may forward an email attachment without inspecting it. For at least some applications (e.g., the downgrading example mentioned above) this is undesirable. To prohibit such behaviour, a second alternative definition of causality from (van der Meyden 2007) restricts the transmitted information to that which has been observed. This alternative is defined as follows. Given a policy , for each domain u ∈ D, define the function Alternate Definitions of Causality In response to the example of the previous section, we have defined in (van der Meyden 2007) several alternative notions of causality/security that are not only better behaved than IPsecurity with respect to the example, but also prove to be a much better fit to proof techniques and applications that had been developed for intransitive noninterference. We will not go into the latter here, but confine ourselves to stating the definitions of our alternative notions of causality. All the alternatives are based on a concrete model of the maximal amount of information that an agent may have after some sequence of actions has been performed, and state that 212 tou : A∗ → T (O(A ∪ O)∗ , A) by tou () = obsu (s0 ) and tou (α) if dom(a) u, tou (αa) = (tou (α), viewdom(a) (α), a) otherwise. Intuitively, this definition takes the model of the maximal information that an action a may transmit after the sequence α to be the fact that a has occurred, together with the information that dom(a) actually has, as represented by its view viewdom(a) (α). By contrast, TA-security uses in place of this the maximal information that dom(a) may have. (The nomenclature ‘to’ is intended to be suggestive of transmission of information about observations.) We may now base the definition of security on either the function to or ito rather than ta. The Example Revisited - A First Concern We have presented Example 1 in terms of the notion of distributed knowledge as it is usually defined in the literature. We now note that it is reasonable to object to the use of this notion in the present context. Note that whereas L is, intuitively, permitted by the policy to observe the ordering of actions in the domains D1 , D2 (and actually does observe this order in the example), the way that the definition of distributed knowledge combines the information available to D1 and D2 does not take into account the relative ordering of the actions in these domains. Indeed, the following example illustrates that to ask that information known to L be distributed knowledge to D1 and D2 may be too strong. Example 2: Let the (transitive) policy be defined by L1 H and L2 H. Consider a system with actions A = {l1 , l2 , h} of domains L1 , L2 , H, respectively, and let the states and transitions of M be given by S = A∗ and next(α, a) = α · a as in Example 1, but define the observations by obsLi (α) = purgeLi (α) and obsH (α) = α. This system is intuitively secure, and is easily seen to be P-secure (hence secure for all the other definitions) but, taking p to mean “there is an occurrence of l1 before l2 ”, we have M, π, l1 l2 |= KH p but not M, π, l1 l2 |= D{L1 ,L2 } p, since l1 l2 ∼{L1 ,L2 } l2 l1 . Definition 4: The system M is TO-secure with respect to if for all domains u ∈ D and all α, α ∈ A∗ with tou (α) = tou (α ), we have obsu (s0 · α) = obsu (s0 · α ). The following result shows how these definitions are related: Theorem 1 (van der Meyden 2007) For state-observed systems, with respect to a given policy , P-security implies TO-security implies TA-security implies IP-security. Examples showing that all these notions are distinct are presented in (van der Meyden 2007). This completes the background for the contributions of the present paper. In what follows, we show that these new definitions of security can in fact be shown to be closely related to our intuitions about distributed knowledge in such settings, provided we also develop some new notions of distributed knowledge. In order to capture common structure in our results, it is useful to take a more abstract perspective on the above definitions. Define a local-state assignment in a system M to be a function Y : A∗ × D → L, where L is some set. As usual, we write Yu (α) for Y (α, u). The functions view, ta and to are all examples of local-state assignments. Let X and Y be local-state assignments. We say that X is a full-information local-state assignment based on Y and if the following holds: Xu () = and for all agents u ∈ D, sequences α ∈ A∗ and a ∈ A, we have Xu (α) if dom(a) u Xu (αa) = (Xu (α), Ydom(a) (α), a) otherwise. That is, according to the local state assignment X, the effect of an action a after a sequence α is to transmit to all domains u with dom(a) u the information that the action a has been performed, as well as all the information in the local state Ydom(a) (α). Such domains u add this new information to the information Xu (α) already collected. The action has no effect on domains with which it is not permitted to interfere. We can now identify some common structure in the above definitions by noting that to is a full-information local-state assignment with respect to view and , and ta is a fullinformation local-state assignment with respect to ta and . The appropriate diagnosis here seems to be that distributed knowledge is too strong a notion for our present purposes. We are therefore motivated to define a variant, that takes into account H’s observational powers in this example. Define the equivalence relation ∼pG on A∗ by α ∼pG α if α ∼G α and α G = α G. Intuitively, this relation combines the information in views of the agents G, not just by intersecting the information sets, but also taking into account the ordering of the actions in these views in the actual run. Extend the modal language by adding an operator DpG , with semantics given by M, π, α |= DpG φ if M, π, α |= φ for all α ∈ A∗ with α ∼pG α . Note that DG φ ⇒ DpG φ, so this is a weaker notion of distributed knowledge. Example 3: Reconsidering Example 2, we see that l1 l2 ∼p{L1 ,L2 } α iff α ∈ {h}∗ · l1 · {h}∗ · l2 · {h}∗ , hence M, π, l1 l2 |= Dp{L1 ,L2 } p. Thus, with this new interpretation of distributed knowledge, we recover the desired intuition in this example. Using this variant notion of distributed knowledge, the problem identified in the example from (van der Meyden 2007) can be shown to persist. Example 4: Let the system M , the interpretation π of the proposition p and the sequences α1 and α2 be as in Example 1. Since α1 {D1 , D2 } = d1 d2 = α2 {D1 , D2 }, we have α1 ∼p{D1 ,D2 } α2 , so M, π, α1 |= KL p but not M, π, α1 |= Dp{D1 ,D2 } p. We therefore conclude that although the concern raised 213 above is valid, the problem we have identified with the definition of IP-security is real. Nevertheless, we would like to have a better understanding concerning the intuition than can be obtained from a single example. We pursue this in the following sections. We may now express the key intuition abstractly using the following definition. Definition 5: A system M is confined with respect to a local-state assignment Y and noninterference policy if for all sequences of actions α ∈ A∗ and all agents u, if π interprets p as depending only on D \ u, then M, π, α |= Ku p ⇒ DYIu p, where Iu = {v ∈ D | u = v, v u}. From Causality to Distributed Knowledge Based on a problem identified in Example 1, we have constructed some alternative definitions for security with respect to intransitive policies, reflecting intuitions about the transmission of information in a concrete setting. These definitions were justified in (van der Meyden 2007) on the grounds that they lead to a better account of the classical proof theory and applications for intransitive noninterference. We now consider what these new definitions say about our motivating example. Indeed, we show that they support our intuition not just in this example, but more generally. As discussed above, our key intuition is that if the system is secure, then any information that u has about other agents must be deducible from the information passed to it by the agents Iu , so should be distributed knowledge, in some sense, to Iu . We now check how each of our definitions fares with respect to this intuition. One qualification will be required: note that if u is able to have an effect on the agents Iu , then it may combine its knowledge of these effects with the distributed knowledge of Iu to deduce facts that are not distributed knowledge to Iu alone. We therefore require that u not be permitted to interfere, directly or indirectly, with Iu . This may be captured by the requirement that u not be part of any nontrivial cycle in . To formalise the intuitive notion of “information about other agents”, we use the following notion. For F ⊆ D a set of agents, say that a property Π ⊆ A∗ depends only on F if for all α, α ∈ A∗ with α F = α F we have α ∈ Π iff α ∈ Π. Similarly, say that an atomic proposition p is interpreted by π as depending only on F if π(p) depends only on F . This expresses more precisely the fact that the proposition p is “about the agents F ”. As already noted in the previous section, the appropriate notion of distributed knowledge to capture our intuition in the current semantic setting is one that takes into account the interleaving of the actions of the group. As different definitions of causality require different notions of distributed knowledge to capture the intuition, we generalize the operator Dp as follows. Suppose that Y is a local-state assignment in M and G is a set of agents. Define the relation ∼YG on sequences of actions in M by α ∼YG α if α G = α G and Yv (α) = Yv (α ) for all v ∈ G, Then we define the new modal operator DYG with semantics We would like to have that if a system is secure for a given notion of security, then it is confined with respect to an appropriate local-state assignment. In order to identify these local state assignments, note that, given the way that information is transmitted in the definitions of ta and to, an agent v ∈ Iu may have acquired new information after it last transmitted information to u. Whereas u is not expected to have this information, the distributed knowledge of Iu takes this into account. Thus, in some sense, the relation ∼Iu takes more information from Iu into account than is required. We therefore develop a construct that helps to identify the latest point at which an agent may have transmitted information to other agents. For each domain u define the mapping mu : A∗ → A∗ so that mu (α) is the prefix of α up to but excluding the rightmost action of agent u. More precisely, the definition is given inductively by mu () = and mu (αa) = mu (α) if dom(a) = u and mu (αa) = α otherwise. Intuitively, mu (α) expresses the history of system at the point that the latest action of u occurs in α. If Y is a local-state assignment, we write Y ◦ m for the local-state assignment defined by (Y ◦ m)u (α) = Yu (mu (α)). Consider now the relation ∼TGO and the operator DTGO , defined to be ∼YG and DYG φ, respectively, where Y = view ◦ m. Intuitively, DTGO φ says that φ is deducible by an agent that has all the information that was available to agents in G — at the time that they performed their latest action — together with the ordering of all the actions that have been performed by these agents. It is easily seen that this is a stronger notion of distributed knowledge than DpG , in the sense that the formula DTGO φ ⇒ DpG φ is valid. This notion supports our key intuition: Theorem 2 Suppose that is acyclic. If M is TO-secure with respect to then M is confined with respect to the local-state assignment view ◦ m and . In order to obtain a similar result for TA-security, we need to take into account that TA-security is consistent with the transmission of information that is not contained in the sender’s view. Thus, the best that we can expect in this case is that information known to u must be permitted to be distributed knowledge to the agents that may transmit information to u. As with the notion DTGO , we also restrict the distributed knowledge to information that may have been transmitted, rather than that information currently held. Thus, we consider the notion of distributed knowledge DTGA , defined as DYG where Y = ta ◦ m. Using this, we can again support the key intuition: M, π, α |= DYG φ if for all α ∈ A∗ such that α ∼YG α we have M, π, α |= φ. Intuitively, this is just the notion of distributed knowledge DpG , except that instead of combining the information in the local states viewv (α) for v ∈ G relative to the interleaving of G actions in α, we combine the information in the local states Yv (α). Indeed, it is easily seen that DpG φ is equivalent to Dview G φ. 214 a sequence β such that (1) viewv (α) = viewv (β ) and (2) β D \ {v} = β D \ {v}. It then follows using the fact that M, π, α |= Kv p and (1) that M, π, β |= p, hence using (2) and the fact that π interprets p as depending only on D \ {v} that M, π, β |= p. We consider several cases for v, with subcases for α: Case 1: v = L. Here we have that π interprets p as depending only on {H, D}, IL = {D}, and α ∼TILO β implies α D = β D. Theorem 3 Suppose that is acyclic. If M is TA-secure with respect to then M is confined with respect to ta ◦ m and . Theorems 2 and 3 show that for both TO-security and TAsecurity, we are able to support the intuitive relationship between causal structure and distributed knowledge. From Distributed Knowledge to Causality In the preceeding, we have shown that the causal structure of a system implies a relationship between an agent’s knowledge and the distributed knowledge of agents that may interfere with it. We now consider the converse direction. Suppose that security of a system implies that it is confined with respect to a local-state assignment Y and . Is it also the case that if a system is confined with respect to Y and then it is secure? The following example shows that it does not. 1. Suppose vL (α) = 0(l0)k . Then there is no occurrence of d in α, and consequently, no such occurrence in β. Here we take β = (β H)lk which can easily be seen to satisfy (1) and (2). 2. Suppose vL (α) = 0(l0)k 1(l1)j . Then we can write α = α0 dα1 where the occurrence of d is the first in α, α0 L = lk and the first action in α0 is not l, and α1 L = lj . Since α D = β D, there is also an occurrence of d in β and we may write β = β0 dβ1 , where β0 contains at least one h action. Define Example 5: We show that it is possible for a system to be confined with respect to view ◦ m, yet still be TO-insecure. Thus, the converse to Theorem 2 does not hold. (A similar example may be constructed for ta ◦ m and TA-security, we leave this as an exercise for the reader.) Consider a system with agents H, D, L, each of which has a single action h, d, l, respectively. Thus D = {H, D, L} and A = {h, d, l}. We take the set of states to be A∗ , the initial state to be and state transitions to be defined by concatenation: α · a = αa. Let the observations for agent H be given by obsH (α) = 0, the observations for agent D be given by obsD (α) = α {H, D}, and those for L be given by β = (β0 H)lk d(β1 {H, D})lj . This can be seen to satisfy (1) and (2). 3. Suppose vL (α) = 0(l0)k 2(l2)j . Then we can write α = α0 dα1 where the occurrence of d is the first in α, α0 L = lk and the first action in α0 is l, and α1 L = lj . Since α D = β D, there is also an occurrence of d in β and we may write β = β0 dβ1 , where β0 does not contain d. Define β = l(β0 H)lk−1 d(β1 {H, D})lj . This can be seen to satisfy (1) and (2). Case 2: v = D. Here we have that π interprets p as depending only on {H, L}, ID = {H}, and α ∼TIDO β implies α H = β H. Here viewH (α) has the form 0(h0)k . Using the fact that α H = β H, we can construct β such that α {H, D} = β {H, D} and α {H, L} = β {H, L}. This yields (1) and (2). Case 3: v = H. Here we have that π interprets p as depending only on {D, L}, IH = ∅, and α ∼TIHO β is trivially true for all β. We take β = (β {D, L})(α H), which can be seen to satisfy (1) and (2). 1. obsL (α) = 0 if α does not contain a d action, 2. obsL (α) = 1 if α contains a d action and the first action of α is not l, 3. obsL (α) = 2, otherwise. Let the policy be given by H D L. Then, if we take α1 = hld and α2 = lhd, we have that toL (α1 ) = = = = (toL (hl), viewD (hl), d) ((toL (h), viewL (h), l), [] ◦ [h] ◦ [h], d) (toL (), 0 ◦ 0, l), [][h], d) ((, 0, l), [][h], d) The appropriate diagnosis for this example appears to be that security of a system talks not just about what an agent knows about the actions of other agents, but also about how the agent’s own actions are interleaved with those of other agents. Because we have formulated the notion of a proposition that “depends only on other agents” in a way that is not sensitive to such interleavings, we we are not able to express security of the system using our present formulation of the intuition. We now set about developing a variant formulation that does take such interleavings into account. We will show that this can be done in such as way as to completely characterize security in terms of the relationship between an agent’s knowledge and the distributed knowledge of its interferers. First, we relativize the notion of dependency to an agent. Let F ⊆ D be a set of agents and let u ∈ D be an agent. For a sequence α ∈ A∗ , define α u F to be the sequence in and toL (α2 ) = (toL (lh), viewD (lh), d) = (toL (l), [] ◦ [] ◦ [h], d) = ((toL (), viewL (), l), [] ◦ [] ◦ [h], d) = ((, 0, l), [][h], d), where we include sequences in square braces for clarity. Thus, toL (α1 ) = toL (α2 ). On the other hand, we have that obsL (s0 · α1 ) = obsL (α1 ) = 1 but obsL (s0 · α2 ) = obsL (α2 ) = 2. Thus, this system is not TO-secure. We now show that, on the other hand, this system is confined with respect to the local state assignment view◦m. For this, we show that for each agent v, if M, π, α |= Kv p and π interprets p as depending only on D \ {v} then M, π, α |= DITvO p. For this, we show that if α ∼TIvO β then there exists 215 A∗ ∪ {⊥u } otained from α by deleting occurences of action a with dom(a) ∈ F , and replacing each occurrence of an action a with dom(a) = u by ⊥u . We then say that a proposition Π ⊆ A∗ is about F relative to u if for all α, α ∈ A∗ , if α u F = α u F then α ∈ Π iff α ∈ Π. We will similarly relativize the notions of distributed knowledge defined above. Suppose that Y is a local-state assignment in M and u is an agent. Then we define the new modal operator DYG,u with semantics Subject to the condition that the local-state assignment Y respects , we obtain from the following that DYG,u and DYG are equivalent on the set of propositions of interest for confinement. Lemma 2 Suppose that Y respects in M and let u ∈ G ↓. Then if π interprets p as depending only on D \ u, we have M, π |= DYG,u p ⇒ DYG p. Proof: Assume that M, π, α |= DYG,u p. We show that M, π, α |= DYG p. Let α ∈ A∗ be a sequence such that α G = α G and Yv (α) = Yv (α ) for all v ∈ G. We need to show that M, π, α |= p. Define β to be a sequence obtained from α by first deleting actions a with dom(a) = u and then inserting such actions in such a way that β u G = α u G. (This is possible because α G = α G.) Since Y respects and G ↓⊆ D \ u, it follows from this that also Yv (α) = Yv (β) for all v ∈ G. Thus, by the assumption that M, π, α |= DYG,u p, we have M, π, β |= p. Since we also have that β D\u = α D\u, we obtain using that fact that π interprets p as depending only on D \ u that M, π, α |= p, as required. M, π, α |= DYG,u φ if for all α ∈ A∗ such that α u G = α u G and Yv (α) = Yv (α ) for all v ∈ G, we have M, π, α |= φ. Intuitively, this definition expresses that the group G is able to deduce φ from the information in the local states Y , given information about the actions of the group G and how they were interleaved, and the points in that interleaving at which agent u performed an action (but not the details of u’s acO A tions). In particular, we define the operators DTG,u and DTG,u by taking Y to be view ◦ m and ta ◦ m in this definition, respectively. It is easily seen from the definitions and the fact that α u G = α u G implies α G = α G that DYG φ ⇒ DYG,u φ is valid. Thus DYG,u is a weaker notion of distributed Y . knowledge than DG Using these new notions of dependence and distributed knowledge, we consider the following formulation of our intuition concerning the relationship between the knowledge of an agent u and the distributed knowledge of its interferers Iu : From Lemma 2 we obtain the following. Corollary 1 If M is relatively confined with respect to Y and , and Y respects in M , then M is confined with respect to Y and . Combining this result with with and Lemma 1, we obtain: Corollary 2 If M is relatively confined with respect to , then M is confined with respect to ta ◦ m and . If M is TO-secure and relatively confined with respect to , then M is confined with respect to view ◦ m and . Thus, the notion of relative confinement is stronger than the notion of confinement in the cases we considered above. We now establish that this stronger notion actually holds in these cases. We obtain this as a consequence of a more general result, for which we first need some technical definitions and results. Say that a local-state assignment Y is locally cumulative if it satisfies the following conditions, for all agents u ∈ D, actions a, b ∈ A and sequences of actions α, β ∈ A∗ : [LC1.] If Yu (αa) = Yu () then dom(a) = u and Yu (α) = Yu (). [LC2.] If dom(a) = u and dom(b) = u and Yu (αa) = Yu (βb) then either Yu (α) = Yu (βb) or Yu (αa) = Yu (β) or Yu (α) = Yu (β). [LC3.] If dom(a) = u and dom(b) = u and Yu (αa) = Yu (βb) then Yu (α) = Yu (βb). [LC4.] If dom(a) = u and dom(b) = u and Yu (αa) = Yu (βb) then Yu (α) = Yu (β). Some of the local-state assignments of interest to us have this property. Definition 6: A system M is relatively confined with respect to a local-state assignment Y and a noninterference policy if for all sequences of actions α ∈ A∗ , for all agents u and all π that interpret p as depending only on D \ u relative to u, we have M, π, α |= Ku p ⇒ DYIu ,u p, where Iu = {v ∈ D | u = v, v u}. In many cases of interest, this is a stronger statement than our previous formulation of the intuition using DYG . This is not obvious, since while it is plain that if Π depends only on D \ u then Π depends only on D \ u relative to u, in order to obtain our previous formulation we would also need to have DYG,u φ ⇒ DYG φ which is the converse of the fact that DYG φ ⇒ DYG,u φ noted above. However, the desired implication in fact often holds, because of the following. Let be a noninterference relation. If G is a set of agents, we write G ↓ for the set {v | v ∗ u ∈ G}, and write u ↓ for {u} ↓. Say that a local-state assignment Y respects in M if for all agents u, if α (u ↓) = β (u ↓) then Yu (α) = Yu (β). Several of the local-state assignments we have introduced respect . Lemma 1 1. If α u ↓= β u ↓ then mu (α) u ↓= mu (β) u ↓. 2. The local-state assignments ta and ta ◦ m respect . 3. If M is TO-secure then the local-state assignments to, to ◦ m, view and view ◦ m respect . Lemma 3 The local-state assignments view and ta are locally cumulative. The following results gives some technical properties that will be of use to us below. 216 Admittedly, the added complexity of the operator DYG,u used in the definition of local confinement makes this result somewhat less intuitive. However, the benefit that we obtain from this complexity is the ability to state the following converse to Theorem 4: Theorem 5 Let be acyclic, and let X be a fullinformation local-state assignment based on Y and . If M is relatively confined with respect to Y ◦ m and then M is X-secure. Proof: We need to show that for all sequences α, α and agents u, if Xu (α) = Xu (α ) then obsu (s0 · α) = obsu (s0 · α ). We proceed by induction on the combined length of α and α . The base case of α = α = is trivial, so we consider the case of sequences αa, α , with Xu (αa) = Xu (α ), where a ∈ A. Case 1: dom(a) u. Then Xu (α) = Xu (αa) = Xu (α ), so by the induction hypothesis, we have obsu (s0 · α) = obsu (s0 · α ). We would like to show that obsu (s0 · αa) = obsu (s0 · α a). We suppose not, and obtain a contradiction. Note that it follows from the assumption and the conclusion above that obsu (s0 · αa) = obsu (s0 · α). Define the interpretation π on p by M, π, γ |= p iff γ u D \ u = αa u D \ u. Then plainly π interprets p as depending only on D \ u relative to u. We show that M, π, α |= Ku p. Suppose that viewu (α) = viewu (γ) and M, π, γ |= ¬p. From viewu (α) = viewu (γ) it follows that α u = γ u, and from M, π, γ |= ¬p we have γ u D \ u = αa u D \ u. It follows that γ = αa. However, since obsu (s0 · αa) = obsu (s0 · α), we have viewu (γ) = viewu (αa) = viewu (α), a contradiction. Thus, viewu (α) = viewu (γ) implies M, π, γ |= p. This shows that M, π, α |= Ku p. Since π interprets p as depending only on D \ u relative to u and M is relatively confined with respect to Y ◦ m and , it follows that M, π |= DYIu◦m ,u p. However, since dom(a) u, we have dom(a) ∈ Iu ∪ {u}, so αa u Iu = α u Iu . Moreover, for v ∈ Iu , we have mv (αa) = mv (α), so Yv (mv (αa)) = Yv (mv (α)). Plainly M, π, αa |= ¬p. Thus, we have M, π |= ¬DYIu◦m ,u p This is a contradiction. Case 2: dom(a) u. Then Xu (αa) = (Xu (α), Ydom(a) (α), a). Since Xu (αa) = Xu (α ), we cannot have α = , so write α = βb. If dom(b) u, then we can switch the roles of αa and βb and apply the previous case. Thus, without loss of generality dom(b) u, and we have Xu (βb) = (Xu (β), Ydom(a) (β), b). It follows that a = b, Xu (α) = Xu (β) and Ydom(a) (α) = Ydom(a) (β). Suppose that obsu (s0 · αa) = obsu (s0 · βa). By the induction hypothesis, we obtain from Xu (α) = Xu (β) that obsu (s0 · α) = obsu (s0 · β). Define π on p by M, π, γ |= p iff γ u D \ u = βa u D \ u. We show that M, π, αa |= Ku p. Suppose that viewu (γ) = viewu (αa) and M, π, γ |= ¬p. Then γ u = αa u and γ u D \ u = βa u D \ u. Now from Xu (α) = Xu (β) and the fact that X is a full-information local-state assignment, we obtain that α u = β u. Hence γ u = αa u = βa u. Combining this with γ u D \ u = βa u D \ u we obtain γ = βa. This implies that viewu (γ) = viewu (αa), a contradiction, Lemma 4 If Y is locally cumulative, then for all sequences α, β ∈ A∗ and agents u, Yu (α) = Yu (β) implies Yu (mu (α)) = Yu (mu (β)). Lemma 5 Let X be a full-information local-state assignment based on Y and . Then we have the following. 1. If M is X-secure with respect to then for all α, β ∈ A∗ and agents u, if Xu (α) = Xu (β) then viewu (α) = viewu (β). 2. If Xu (α) = Xu (β) and v u, then Yv (mv (α)) = Yv (mv (β)). 3. Suppose Y is locally cumulative. If α Iu ∪ {u} = β Iu ∪ {u} and Yv (mv (α)) = Yv (mv (β)) for all v ∈ Iu , then Xu (α) = Xu (β). Lemma 6 Suppose that Y respects in M and is acyclic. Then for all α, β ∈ A∗ , if α u D \ u = β u D \ u then for all v ∈ Iu , we have Yv (mv (α)) = Yv (mv (β)). We can now state a result that generalizes Theorem 2 and Theorem 3. Theorem 4 Let be acyclic, and let X be a full information local state assignment based on Y and . Suppose that Y respects in M and is locally cumulative. Then if M is X-secure then M is relatively confined with respect to Y ◦ m and . Proof: Suppose π interprets p as depending only on D \ u relative to u. We need to show that for all α ∈ A∗ we have M, π, α |= Ku p ⇒ DYIu◦m ,u p. We assume M, π, α |= p, and show that M, π, α |= ¬Ku p. ¬DYIu◦m ,u Since M, π, α |= ¬DYIu◦m p, there exists a sequence β ∈ ,u ∗ A such that α u Iu = β u Iu and Yv (mv (α)) = Yv (mv (β)) for all v ∈ Iu and M, π, β |= ¬p. Since α u Iu = β u Iu , the sequences α and β have the same number k of occurrences of actions of agent u, and α Iu = β Iu . Let γ be the sequence constructed from β by replacing the i-th occurrence of an action of agent u by the i-th action of agent u in α, for i = 1 . . . k. Then we have that β u D \ u = γ u D \ u and α Iu ∪ {u} = γ Iu ∪ {u}. By Lemma 6, since β u D \ u = γ u D \ u, we have that Yv (mv (β)) = Yv (mv (γ)) for all v ∈ Iu . It follows that Yv (mv (α)) = Yv (mv (γ)) for all v ∈ Iu . Since also α Iu ∪ {u} = γ Iu ∪ {u} and Y is locally cumulative, we obtain using Lemma 5(3) that Xu (α) = Xu (γ). By X-security of M and Lemma 5(1), we conclude that viewu (α) = viewu (γ). However, since M, π, β |= ¬p, β u D \ u = γ u D \ u and π interprets p as depending only on D \ u relative to u, we have that M, π, γ |= ¬p. Thus M, π, α |= ¬Ku p. Using Lemma 3 and Lemma 1 , we obtain the following result, which can be seen to strengthen Theorem 2 and Theorem 3 by Corollary 2. Corollary 3 If M is TA-secure with respect to then M is relatively confined with respect to ta ◦ m and . If M is TO-secure with respect to then M is relatively confined with respect to view ◦ m and . 217 since obsu (s0 · αa) = obsu (s0 · βa). This shows that M, π, αa |= Ku p. It is obvious that π interprets p as depending only on D \ u relative to u, so by the assumption, we have M, π, αa |= DYIu◦m ,u p. By an induction on the definition of Xu , we can see that Xu (αa) = Xu (βa) implies αa u Iu ∪ {u} = βa u Iu ∪ {u}. By Lemma 5(2) we also have Yv (mv (αa)) = Yv (mv (βa)) for all agents v with v ∈ Iu . Since M, π, βa |= ¬p, it follows that M, π, αa |= ¬DYIu◦m ,u p, a contradiction. used in the literature on computer security. We have shown that it is possible to express these definitions precisely in terms of how information flows in the system. There are several directions that could be explored in future research. We note that our definitions are suited to deterministic systems in which actions are interleaved and have immediate effects: different formulations may be required to capture similar intuitions in asynchronous message passing systems, or systems with simultaneous actions, such as the model used in (Fagin et al. 1995). It would also be of interest to consider nondeterministic systems, and systems in which actions have probabilistic effects. Our semantic model and notions of causality originate in the computer security literature and have a richer temporal structure than is generally considered in KR work in the area (Pearl 2000; Halpern and Pearl 2001), but it would be interesting to investigate the relationships. We remark also that we have confined our attention to acyclic noninterference policies. It remains to formulate generalizations of these results that encompass systems with causal cycles. In particular, combining this result with Corollary 3 we obtain the following. Corollary 4 M is relatively confined with respect to ta ◦ m iff M is TA-secure. M is relatively confined with respect to view ◦ m iff M is TO-secure. That is, we are able to give a complete characterization of the causal notions of TO-security and TA-security that is stated entirely in epistemic terms. The equivalence gives us a reduction of causal notions to epistemic notions. References Conclusion Fagin, R.; Halpern, J.; Moses, Y.; and Vardi, M. 1995. Reasoning About Knowledge. MIT-Press. Goguen, J., and Meseguer, J. 1982. Security policies and security models. In Proc. IEEE Symp. on Security and Privacy, 11–20. Haigh, J., and Young, W. 1987. Extending the noninterference version of MLS for SAT. IEEE Trans. on Software Engineering SE-13(2):141–150. Halpern, J. Y., and Pearl, J. 2001. Causes and explanations: A structural-model approach - Part II: Explanations. In Nebel, B., ed., IJCAI, 27–34. Morgan Kaufmann. Moses, Y., and Bloom, B. 1994. Knowledge, timed precedence and clocks (preliminary report). In Proc. ACM Symp. on Principles of Distributed Computing, 294–303. Pearl, J. 2000. Cauality: Models, reasoning and inference. Cambridge University Press. Rushby, J. 1992. Noninterference, transitivity, and channel-control security policies. Technical Report CSL92-02, SRI International. van der Meyden, R. 2007. What, indeed, is intransitive noninterference (extended abstract). In Biskup, J., and Lopez, J., eds., Proc. European Symp. on Research in Computer Security (ESORICS), volume 4734 of Springer LNCS, 235–250. Full paper at http://www.cse.unsw.edu.au/∼meyden. While the notion of distributed knowledge has been studied in the literature on reasoning about knowledge since the 1980’s, relatively little concrete use has been made of it. Notably, one of the few relates distributed knowledge to the notion of of Lamport causality, which is appropriate in asynchronous message passing systems (see (Fagin et al. 1995) Proposition 4.4.3). Our work provides a study of a slightly different nature, dealing with distributed knowledge and information flow in causally constrained asynchronous systems. Specifically, we have clarified the intuition that the causal structure of a system constrains agents to acquire only external knowledge that is distributed knowledge to the agents that may causally affect it. In doing so, we have argued that the classical definition of distributed knowledge needs to be adapted in order to capture the intuition. We have shown that, besides intersection of information sets, there are other ways that a group of agents might combine their information. In particular, we have argued that such combination needs to be relativized to an agent to which the group is communicating its information in order to fully capture our intuitions concerning information flow and causal structure. We believe our new notions of distributed knowledge are intuitive and may find application in other contexts. We remark that the need for such variants has previously been argued by Moses and Bloom (Moses and Bloom 1994), who show that distributed knowledge it is too weak for certain applications, and propose a stronger notion IG called inherent knowledge, such that IG φ ⇒ DG φ is valid. This is a strengthening whereas we propose weakenings, and the context in this work is rather different from ours: systems in which communication is by asynchronous message passing, rather than the asynchronously operating systems with a synchronous communication mechanism, as in our definition of systems. Finally, we believe that the perspective from the logic of knowledge provides fresh insight into the definitions of causality being 218