Simple On-the-fly Automatic Verification of Linear Temporal Logic R. Gerth Technical University Eindhoven Den Dolech 2, Eindhoven, The Netherlands D. Peled AT&T Bell Laboratories 600 Mountain Avenue, Murray Hill, NJ 07974, USA M. Y. Vardi Rice University Department of Computer Science, Houston TX 77251, USA P. Wolper1 Université de Liège Institut Montefiore, B28, 4000 Liège, Belgium Abstract We present a tableau-based algorithm for obtaining an automaton from a temporal logic formula. The algorithm is geared towards being used in model checking in an “on-the-fly” fashion, that is the automaton can be constructed simultaneously with, and guided by, the generation of the model. In particular, it is possible to detect that a property does not hold by only constructing part of the model and of the automaton. The algorithm can also be used to check the validity of a temporal logic assertion. Although the general problem is PSPACE-complete, experiments show that our algorithm performs quite well on the temporal formulas typically encountered in verification. While basing linear-time temporal logic model-checking upon a transformation to automata is not new, the details of how to do this efficiently, and in “on-the-fly” fashion have never been given. Keywords Automatic Verification, Linear Temporal Logic, Büchi Automata, Concurrency, Specification. 1 Introduction Checking automatically that a protocol, especially a concurrent one with many parallel activities, satisfies its specification has gained a lot of attention during the last 15 years. The main 1 The work of this author was supported by the Esprit BRA action REACT and by the Belgian Incentive Program “Information Technology” - Computer Science of the future, initiated by the Belgian State - Prime Minister’s Office - Science Policy Office. The scientific responsibility is assumed by its authors. dichotomy between approaches to automated protocol verification can be characterized as logicbased versus state-space based methods. The former type of methods proceed by translating both the protocol and its specification into formulas in some formal logic and by showing logical implication of the specification by the protocol formula. In contrast, state-space based methods proceed by analyzing the possible configurations the protocol can be in, i.e. its state space, and how the protocol evolves from one configuration to another. None of these methods offer a uniform advantage; both have strengths and weaknesses when compared to the other. This paper concentrates on a class of state-space based methods, often called “model checking”. The idea of model checking is to view verification as checking whether the graph representing the state space of the protocol satisfies (is a model of) the property to be checked. Specifically, we focus on model checking for linear-time temporal logic formulas [9]. In this context, what one actually checks is that all infinite execution sequences that can be extracted from the state-space graph satisfy (are models of) the temporal logic formula, or equivalently, that none of these sequences falsifies the formula. A classical approach to solving this problem [12] is to proceed as follows. One first constructs the state spaces for both the protocol to be verified and for the negation of the property, the latter state space thus comprises all execution sequences (models) on which the property is violated. The two state spaces are then analyzed for the existence of a common execution sequence; finding one means that the property can be violated by the protocol. Given that one is interested in the infinite sequences that can be generated by the two state spaces, these can be interpreted as automata over infinite words, i.e., as ! -automata [11]. The analysis to be done thus amounts to the standard problem of checking if the language accepted by the (synchronous) product of the automata is empty or not. A general approach for solving this problem proceeds by checking for strongly connected components as is done in [8], but one can also reduce the problem to a simpler cycle detection for which simpler algorithms can be used [6, 4]. The model-checking problem as well as the validity problem for linear temporal logic are PSPACE-complete [10]. In practice, applications of model-checking methods face two complexity related limits: 1. The size of the automata, both for the protocol and for the property, since the execution time is proportional to the product of the number of nodes in the automata; 2. The size of that part of the product automaton that has to be kept in memory in order to check for emptiness, since available memory sets a firm bound on the size of the problems that can be treated. As to the latter problem, the cycle detection approach of [6, 4] uses a simple depth-firstsearch (DFS) strategy and, in contrast with [8], only needs a small part of the product automaton to be in main memory at any one time: the part corresponding to the computation that the depthfirst-search is currently exploring. It implies that the protocol automaton may be constructed on-the-fly, i.e. as is needed, while checking for its emptiness. This means that, if the property does not hold, the algorithm can detect so after constructing and visiting only a small part of the state space The automaton corresponding to the property can have as many as 2 O(n) nodes where n is the number of subformulas in the property formula [13]. Thus, the size of the product automaton, which determines the overall complexity of the method is proportional to N 2O(n), where N is the number of (reachable) protocol states. It is clearly desirable to keep property automata small and to avoid the exponential blowup that can occur in their construction whenever possible. The standard automaton construction for a temporal logic property [13] (see also [16, 8]) is a global one and starts by generating a node for each (maximally consistent) set of subformulas of the property. While this is a simple way to describe the construction, it is clearly not a reasonable way to implement it, since it immediately realizes the worst case exponential complexity. A subsequent construction, proposed as a basis for an implementation [7], starts with a two state automaton that is repeatedly ‘refined’ until all models of the property are realized. Although the worst case remains exponential, this construction often achieves a substantial reduction in the number of generated nodes. On the other hand, the algorithm cannot be used on-thefly during a depth-first search, as it repeatedly inspects the whole graph and “corrects” it by removing and adding edges and nodes. Moreover, the emptiness check proceeds by determining and inspecting the strongly connected components of the automaton and is thus less easily applicable to verifying whether a protocol satisfies a property. It should be said that the authors of [7] were not so much interested in protocol verification as in checking validity of a formula that include past operators. In this paper we present, and describe experiments with, a pragmatic algorithm for constructing an automaton from a temporal logic formula. Though having its roots in the construction of [13], our algorithm is designed to yield small automata whenever possible and to be simple to implement. Furthermore, it proceeds on-the-fly in the sense that the automaton is only generated as needed during the verification process. Technically, the algorithm translates a propositional linear temporal logic formula into a Generalized Büchi automaton [4] using a very simple depthfirst search. The interesting point is that, even though the algorithm produces a Generalized Büchi automaton, a simple transformation of this automaton yields a classical Büchi automaton for which the emptiness check can be done using a simple cycle detection scheme as in [4]. The result is that we obtain a protocol verification algorithm in which both the protocol and the property automata (and, hence, the product automaton) are constructed on-the-fly during a depth-first search that checks for emptiness. The rest of the paper starts with some preliminaries defining temporal logic and its interpretations. Section 3 presents the basic algorithm, discusses optimizations and its application to model checking. The correctness proof occupies Section 4. In Section 5 we make some more detailed comparisons with existing constructions. The paper finishes with some experimental results and conclusions in Sections 6. 2 Preliminaries The set of well-formed linear temporal logic (LTL) are constructed from a set of atomic propositions, the standard Boolean operators, and the temporal operators X and U. Precisely, given a finite set of propositions , formulas are defined inductively as follows: P every member of P is a formula, if ' and are formulas, then so are :', ' ^ , ' _ , X' and ' U . An interpretation for a linear-time temporal logic formula is an infinite word = x0 x1 over the alphabet 2P , i.e. a mapping from the naturals to 2P . As made precise below, the elements of 2P are interpreted as assigning truth values to the elements of : elements in the set are assigned true, elements not in the set are assigned false. We write i for the suffix of starting at xi . The semantics of LTL is then the following. P j= q iff q 2 x0, for q 2 P , j= :' iff not j= ', j= ' ^ iff j= ' and j= , j= ' _ iff j= ' or j= , j= X' iff 1 j= ', j= ' U iff there is an i 0 such that i j= and j j= ' for all 0 j < i. _: : :: :: : p, and F as an abbreviation for T. We also introduce We introduce T as an abbreviation for p additional temporal operators as abbreviations: F' = T U ', G' = F '. Finally, we also use the temporal operator V which is defined as the dual of U: 'V = ( ' U ). 3 A Tableau Construction Our goal is to build an automaton (transition system) that generates all infinite sequences satisfying a given temporal logic formula '. The automata we build are generalized Büchi automata, namely Büchi automata with multiple sets of accepting states, as opposed to simple Büchi automata that have only one set of accepting states [11]. A h ! ! Fi A generalized Büchi automaton [4] is a quadruple = Q; I; ; , where Q is a finite set of states, I Q is the set of initial states, Q Q is the transition relation, and Q 22 is a set of sets of accepting states = F1; F2 ; : : :Fn . Notice that can be empty. F F f g F An execution of A is an infinite sequence = q0 q1 q2 : : : such that q0 2 I and, for each i 0, qi ! qi 1 . An accepting execution is an execution such that, for each acceptance set Fi 2 F , there exists at least one state q 2 Fi that appears infinitely often in . + The automata we have defined so far have no input, and hence do not define any sequences. We thus need to add labels to our automata. The most common approach is to add labels to transitions. Here, we proceed slightly differently and add labels to states. A labeled generalized Büchi automaton, or LGBA for short, is a triple ; ; , where is a generalized Büchi automaton, is some finite domain, and : Q 2D is a labeling function from the states of to subsets of the domain (a state has a set of labels from ). An LGBA accepts a word = x0 x1 x2 : : : from ! iff there exists an accepting execution = q0 q1 q2 : : : of such that for each i 0, xi (qi ). We also say that the execution accepts . A D D 2L D L hA D Li ! A D A The central part of the automaton construction algorithm is a tableau-like procedure related to the ones described in [14, 15]. The tableau procedure builds a graph, which will define the states and transitions of the automaton. The nodes of the graph are labeled by sets of formulas and are obtained by decomposing formulas according to their Boolean structure, and by expanding the temporal operators in order to separate what has to be true immediately from what has to be true from the next state on. The fundamental identity used to this is U ( X( U )). Before describing the graph construction algorithm, we introduce the data structured used to represent the graph nodes. _ ^ 3.1 The Data Structure The data structure we use for representing graph nodes contains sufficient information for the graph construction algorithm to be able to operate in a DFS order. A graph node contains the following fields: Name A string that is the name of the node. Incoming The incoming edges represented by the names of the nodes with an outgoing edge leading to the current node. A special name, init is used to mark initial nodes. init is not the name of any node, hence does not represent a real edge. New A set of temporal properties (formulas) that must hold at the current state and have not yet been processed. Old The properties that must hold in the node and have already been processed. Eventually, New will become empty, leaving all the obligations in Old. Next Temporal properties that must hold in all states that are immediate successors of states satisfying the properties in Old. Father During the construction, nodes will be split. This field will contain the name of the node from which the current one has been split. This field is used for reasoning about the correctness of the algorithm only, and is not important for the construction. We keep a list of nodes Nodes Set whose construction was completed, each having the same fields as above. We denote the field New of the node q by New (q ), etc.. 3.2 The algorithm To simplify the representation of the algorithm, we assume first that the given formula ' for which the automaton should be built does not contain the Nextime operator ‘X’. We will show later how to lift this restriction. Without loss of generality, we may further assume that the formula does not contain the operators ‘ F’ and ‘G’, and that all the negations are pushed inside until they only precede propositional variables. That is, the formula is first transformed to contain only the operators U and V. In fact, the operator V, which is the dual of the operator ‘U’, was specifically introduced in order to allow pushing the negations without causing an exponential blowup in the size of the translated formula. The line numbers in the following description refer to the algorithm that appears in Figure 1. The algorithm for translating the formula ' starts with a single node (lines 34–35). This node has a single (dummy) incoming edge, labeled init, to mark the fact that it is an initial node. Thus, by the end of the construction, a node will be initial iff it contains this label in its list of incoming nodes. It has initially one new obligation in New, namely, ', and the sets Old and Next are initially empty. For example, the upper node in Figure 2 is the one with which the algorithm starts for constructing the automaton for p U q . With the current node N, the algorithm checks if there are unprocessed obligations left in New (line 4). If not, the current node is fully processed and ready to be added to Nodes Set. If there already is a node in Nodes Set with the same obligations in both its Old and Next fields (line 5), the copy that already exists needs only to be updated w.r.t. its set of incoming edges; the set of edges incoming to the new copy are added to the ones of the old copy in Nodes Set (line 6). If no such node exists in Nodes Set, then the current node is added to this list, and a new current node is formed for its successor as follows (lines 8–10): There is initially one edge from N to the new current node. The set New is set initially to the Next field of N . The sets Old and Next of the new current node are initially empty. When processing the current node, a formula in New is removed from this list. In the case that is a proposition or the negation of a proposition (a literal), then, if is in Old (we identify with ), the current node is discarded, as it contains a contradiction (lines 16–17). Otherwise, is added to Old (if it is not already there). :: : When is not a literal, the current node can be split into two (lines 21–26) or not split (lines 29–31), and new formulas can be added to the fields New and Next (lines 22–23,25– 26,30–31). The exact actions depend on the form of and are the following: = ^ Then, both and are added to New as the truth of both formula is needed to make hold. = _ Then, the node is split, adding to New of one copy, and to the other. These nodes correspond to the two ways in which can be made to hold. = U Again, the node is split: for the first copy, is added to New and U to Next. For the other copy, is added to New. This splitting is explained by observing that U is equivalent to _ ( ^ X( U )). This is depicted in Figure 2. = V Then, the node is split: is added to New of both copies, is added to New of one copy, and V is added to Next of the other. This splitting is explained by observing that V is equivalent to ^ ( _ X(V )). The copies are processed in DFS order, i.e., when expansion of the current node and its successors are finished, the expansion of the second copy and its successors is started. The algorithm is listed in Figure 1 in a pseudo-code language. The function new name() generates a new string for each successive call. The function Neg, is defined as follows: Neg(Pn )= Pn , Neg( Pn )=Pn , and similarly for the boolean constants T and F. The functions New1( ), New2( ) and Next1( ) are defined in the following table: : : U V _ New1( ) Next1( ) New2( ) fg f g fg f U g f g fV g f; g ; f g 3.3 Using the Automaton for Automatic Protocol Verification The graph constructed by the algorithm in Section 3.2 can now be used to define an LGBA accepting the infinite words satisfying the formula. The set of states Q will be the nodes returned by the algorithm. Notice that only nodes for which New is empty are placed in this set. In other words, only fully expanded nodes are returned. The initial states I are those nodes q such that init Incoming (q ). The transitions p q are exactly those satisfying that p Incoming (q). 2 2 ! D The domain is 2P and the label of a node q is all sets in 2P that are compatible with Old (q ). Indeed, a node of the graph does not necessarily assign truth values to all atomic propositions, and the label of a node can be any element of 2P that agrees with the literals that appear in Old (q ). Precisely, let Pos (q ) be Old (q ) and Neg (q ) be Old (q) , i.e., Pos (q ) and Neg (q ) are the positive and negative occurrences of the propositions in q , respectively. Then, (q ) = X X X Pos (q) X Neg (q) = . \P L f j P^ ^ \ f j: 2 ;g ^ 2 Pg Finally, we have to impose accepting conditions. Indeed, observe that not every maximal path = q0 q1 in the graph determines models of the formula: the construction allows some node to contain U while none of the successor nodes contain . This is solved by imposing the generalized Büchi acceptance conditions. For each subformula of ' of the type U , there will be a set F which includes the nodes q Q such that either U Old (q ), or Old (q ). 2F 2 2 62 Let us show that, with these acceptance conditions, one can no longer accept a sequence in which U appears from some node qi onwards without occurring later. First, notice that from the construction, if U Old (qi) and qi and Old (qi+1 ), then U qi+1 , then U Old (qi+1). Thus, in the above scenario, U propagates from qi onwards, since never occurs. Let F be the accepting subset that is associated with U . Then, none of the states with index greater or equal to i can be in F . But then the sequence does not contain infinitely many occurrences of any state from F , and is not accepting. 2 2 62 2 62 2F As explained in the introduction, a protocol is verified w.r.t. a property by constructing an automaton for the negation of the property, and by exploring the synchronous product of the protocol and the property automaton for emptiness. Since the automaton representing the 1 record graph node = [Name:string, Father:string, Incoming:set of string, 2 New:set of formula, Old:set of formula, Next:set of formula]; 3 function expand (Node, Nodes Set) 4 if New(Node)= then 5 if ND Nodes Set with Old(ND)=Old(Node) and Next(ND)=Next(Node) 6 then Incoming(ND) = Incoming(ND) Incoming(Node); 7 return(Nodes Set); 8 else return(expand([Name Father new name(), Name(Node) , New Next(Node), 9 Incoming 10 Old , Next ], Node Nodes Set)) 11 else 12 let New; 13 New(Node) := New(Node) ; 14 case of 15 = Pn , or Pn or =T or =F=> 16 if =F or Neg( ) Old(Node) (* Current node contains a contradiction *) 17 then return(Nodes Set) (* Discard current node *) 18 else Old(Node):=Old(Node) ; 19 return(expand(Node, Nodes Set)); 20 = U , or V , or => 21 Node1:=[Name new name(), Father Name(Node), Incoming Incoming(Node), 22 New New(Node) ( New1( ) Old(Node)), 23 Old Old(Node) , Next=Next(Node) Next1( ) ]; 24 Node2:=[Name new name(), Father Name(Node), Incoming Incoming(Node), 25 New New(Node) ( New2( ) Old(Node)), 26 Old Old(Node) , Next Next(Node)]; 27 return(expand(Node2, expand(Node1, Nodes Set))); = => 28 29 return(expand([Name Name(Node), Father Father(Node), Old(Node)), 30 Incoming Incoming(Node), New New(Node) ( ; 31 Old Old(Node) , Next=Next(Node)], Nodes Set)) 32 end expand; 9 2 ; ( (f g ( ; ( ; f g[ 2 nf g : 2 ^ ( ( ( ( _ [f [f g ( [f [f g [ ( ( [f g ( gn ( ( ( ( [f g 33 function create graph (') 34 return(expand([Name Father ' , Old 35 New , Next 36 end create graph; ( (f g (; gn ( ( ( [f ( g [ f gn (new name(), Incoming(finitg, ( ; , ;)) ] Figure 1: The algorithm ( ( Name: Node1 Father: Node1 Incoming: init Current New: Current Old: Next: f U g ; ; split Name: Node2 Father: Node1 Incoming: init Current New: Current Old: U Next: U f Name: Node3 Father: Node1 Incoming: init Current New: Current Old: U Next: fg f g g f g f g ; Figure 2: Splitting a node protocol has an empty acceptance condition ( accepting sets of the property automaton. F = ;), the product automaton simply inherits the Checking for emptiness can be done on-the-fly, i.e., during the generation of the product. For a simple Büchi automaton (one for which is a singleton), one only needs to find a reachable accepting state that is also reachable from itself. An algorithm for doing this is described in [4]. Furthermore, that paper also shows how generalized Büchi conditions can also be handled. The idea is to transform a generalized Büchi automaton into a simple one. This is done by using a counter: each state becomes a pair q; i where i is a counter. The counter is initialized to 0 and counts modulo n, where n = . It is updated from i to i + 1 whenever one reaches an element of the ith set Fi . One then only needs one set of accepting states, for instance F0 0 . F jFj 2F h i f g 3.4 Improvements to the Basic Algorithm Adding Nextime Formulas All that is needed to be able to handle formulas involving the Nextime operator (X) is to add an extra case to the algorithm. = X => ( ( return(expand([Name Name(Node), Father Father(Node), Incoming Incoming(Node), New New(Node), Old Old(Node) Next Next(Node) ], Nodes Set)) ( ( [f g ( ( [fg, Pure “On-the-fly” Construction. The algorithm presented here generates an LGBA that can be used for model-checking or checking the validity of a temporal formula. However, one does not have to complete the construction of this automaton in order to do the model-checking. Construction of nodes can be done “on-demand”, while intersecting them with the protocol automaton. Then, when the successors of a node in the property automaton are constructed, one does not immediately continue to construct their own successors, and so forth. Instead, one chooses the successors that can match the current state of the protocol. Thus, it is possible that a violation of the checked property will be discovered before generating the entire property automaton. Improving the Efficiency. The algorithm as presented here was written in such a way that its proof of correctness will be simplified. Therefore, it contains some redundancies. The following improvements can be made: The field Father is not needed, except for the proof of correctness. When splitting a node (lines 21–26), there is no need to generate two new nodes; instead one can update one of them with additional information, and after generating all its descendents, create the other one. This is also true when adding the conjuncts to a node (lines 28–30). An eventuality of the form U T does not generate a set F equivalent to T. 2 F . Indeed, such a formula is Inconsistencies are only detected at the level of atomic propositions so that nodes that are semantically inconsistent may still appear in the automaton. Certain inconsistencies can be detected earlier using syntactic means. For instance, before adding a formula to a node one can ‘compute’ (by pushing the negation inside) and check whether it already occurs. If it occurs, the current node is abandoned. : Every processed formula is currently stored in the Old field. This is not always necessary. For instance, after a conjunction 1 2 has been analyzed, it need not be added to the Old field because both 1 and 2 will be added, and the presence of these formula tells us that the conjunction will also be true in this node. Note however that, if 1 2 is the righthand argument of an Until subformula U , it must still be stored, since it is used to define the acceptance conditions. Similar observations apply to disjunctions, U and V formulas, but care must be also taken to retain the information needed for identifying the acceptance conditions, i.e. the righthand arguments of U formulas. As a consequence, the generated automata may become smaller, since nodes that differed previously might become identical. ^ ^ In the case of treating at line 20 a subformula of the type U , if already appears in New (q ) Old (q ), then there is no need to split the node q into two. It is then sufficient to move the subformula U from New (q ) to Old (q ). The same holds when treating a formula of the type V , and both and are in New (q ) Old (q ). [ [ 4 Proof of Correctness In this section, the proof of correctness will be sketched. The main theorem is the following: Theorem 4.1 The automaton over (2P )! that satisfy '. A constructed for a property ' accepts exactly the sequences Proof. The two directions are proved in Lemma 4.8 and Lemma 4.9 below. Let ∆(q ) denote the value of Old (q ) at the point where the construction of the node q is V finished, i.e. when it is added to Nodes Set, at line 10 of the algorithm., Let Ξ denote the conjunction of a set of formulas Ξ, the conjunction of the empty set being taken equal to T. = x0x1 x2 : : : be a propositional sequence, i.e., a sequence over (2P )! , and let = q0q1q2 : : : be a sequence of states of A such that for each i 0, qi ! qi 1 . Recall that i denotes the suffix of the sequence , i.e., xi xi 1 xi 2 : : :. Lemma 4.1 Let be an execution of A, and let U 2 ∆(q0). Then one of the following holds: Let + + 1. 2. + 8i 0 : ; U 2 ∆(qi) and 62 ∆(qi). 9j 0 8i 0 i < j : ; U 2 ∆(qi) and 2 ∆(qj ). Proof. Follows directly from the construction. Lemma 4.2 When a node q is split during the construction in lines 21–26 into two nodes and q2, the following holds: q1 V Old(q) ^ V New(q) ^ X V Next(q)) ! V V V V V V (( Old(q ) ^ New(q ) ^ X Next(q )) _ ( Old(q ) ^ New(q ) ^ X Next(q ))) ( 1 1 1 2 2 2 Similarly, when a node q is updated to become a new node q 0, as in lines 28–31, the following holds: ( ^ Old(q ) ^ ^ New(q ) ^X ^ Next(q )) ^ !( Old(q 0) ^ ^ New(q 0) ^X ^ Next(q 0)) Proof. Directly from the algorithm and the definition of LTL. Using the field Father we can link each node to the one from which it was split. This defines an ancestor relation R, where (p; q ) R iff Father(q ) = Name(p). Let R be the transitive closure of R. Nodes q such that Father (q ) = Name (q ), i.e., (p; p) R are called rooted. A rooted node p can be one of the following two: 2 2 1. p is the initial node with which the search started at lines 34–35. Thus, it has New (p) = f'g. 2. p is obtained at lines 8–9 from some node q whose construction is finished. have New (p) set to Next (q ). Let rst (q ) be the node p such that (p; q ) 2 R, and (p; p) 2 R. Thus, we Lemma 4.3 Let p be a rooted node, and q1; q2; : : :qn be all its same-time descendant nodes, i.e. the nodes qi such that (p; qi ) R . Let Ξ be the set of formulas that are in New(p), when it is created. Let Next(qi) be the values of the fields Next for qi at the end of the construction. Then, the following holds: 2 ^ j W Ξ ! _ ^ 1 in ( ∆(qi ) ^ V V ^X ^ Next(qi )) Moreover, if = 1in ( ∆(qi ) X Next(qi )), then there exists some 1 i n such that = V ∆(qi) X V Next(qi) such that for each U ∆(qi) with = , is also in ∆(qi). j ^ 2 j Proof. By induction on the construction, using Lemma 4.2. j V ^ ^ V Lemma 4.4 Let be a propositional sequence such that = ∆(q ) X Next(q ). Then, there exists a transition q q0 in such that 1 = V ∆(q0) X V Next(q0). Moreover, let Γ = U ∆(q ) and ∆(q ) and 1 = , then in particular there exists a transition q q0 such that q0 satisfies also that Γ ∆(q0). f j ! ! 62 2 A j g j Proof. When the construction of node q was finished, a node r with New (r) = was generated. Then, Lemma 4.3 guarantees that a successor as required exists. Lemma 4.5 For every initial state q have ' ∆(q ). 2 Next (q ) = Ξ 2 I of an automaton A generated from the formula ', we Proof. Immediately from the construction. Lemma 4.6 Let A be an automaton constructed for the LTL property '. Then ' $ _ (^ ∆(q) ^ X ^ Next(q)) : q 2I Proof. From Lemma 4.3, since Ξ in that Lemma is initially A f'g. Lemma 4.7 Let = q0q1q2 : : : be a run of that accepts the propositional sequence when q0 is taken to be an initial state. Then = V ∆(q0 ). j Proof. By induction on the size of the formulas. The base case is for formulas of the form We will show only the case of U ∆(q0 ). Then, according to Lemma 4.1 there are two cases: P; :P , where P 2 P . 1. 2. 2 8i 0 : ; U 2 ∆(qi ) and 62 ∆(qi). 9j 0 8i 0 i < j : ; U 2 ∆(qi) and 2 ∆(qj ). A Since satisfies the acceptance conditions of , only case 2 is possible. But then, by the induction hypothesis, j = and for each 0 i < j , i = . Thus, by the semantic definition of LTL, = U . The other cases are treated similarly. j j Lemma 4.8 Let be an execution of the automaton propositional sequence . Then = '. j j A, constructed for ', that accepts the Proof. The node q0 is now an initial state, i.e., in I . From Lemma 4.7 it follows that By Lemma 4.5, if q0 I then ' ∆(q0 ). Thus, = '. 2 j 2 j= V ∆(q0). j= '. Then there exists an execution of A that accepts . V V Proof. First, by Lemma 4.6, there exists a node q0 2 I such that j= ∆(q0 ) ^ X Next (q0). Lemma 4.9 Let Now, one can construct the propositional sequence by repeatedly using Lemma 4.4. Namely, V V X Next (qi ), then choose qi+1 to be a successor of qi that satisfies i+1 = Vif ∆i(q= ) ∆(XqiV) Next (qi+1 ). Furthermore, Lemma 4.4 also guarantees that we can choose qi+1 i+1 such that if for an U subformula U in ∆(qi ), holds in i+1 , then ∆(qi+1 ). We also know from Lemma 4.1 that U will propagate to the successors of qi unless holds. Since i = U , there must be some minimal j i such that j = . hence by the above, ∆(qj ). j j ^ ^ j 2 j 2 5 Comparison with Previous Work The first translation from an LTL formula ' to a Büchi automaton was by Wolper, Vardi and Sistla [16, 13]. It is based on constructing the intersection of two automata. The first automaton takes care of the state-to-state consistency of the runs, and is called the local automaton. The other automaton, called the eventuality automaton, takes care that the eventualities i.e., subformulas of the type U , will be satisfied. The set of formulas cl(') are the subsets of '. Then, each state A of the local automaton consists of the formulas from cl('), either negated, or non-negated. The transitions of the local automaton reflect consistency conditions. E.g., if p q, i.e., q is a possible successor of p, and XP belongs to node p, then P must belong to node q. The edges of this automaton are labeled identically to the nodes from which they emanate. The initial states of the local automaton are the ones that contain the formula ' itself. ! The second automaton’s states consists of a subset of U subformulas of ' . These are the set of goals that need to be satisfied along the execution sequence. The edges are labeled as in the local automaton. Once the righthand subformula of a U formula (i.e., in ' U ) appears on an edge, the U formula is removed from the set of goals (i.e., does not appear in the set of formulas of the next state). When all the goals are achieved, one starts with a new set of goals accumulated in the labels of the edge (which will later be linked to the goals accumulated in the state of the local automaton). The eventuality automaton accepts a word whenever all the goals are achieved infinitely often. The combination of the two automata is done by taking the Cartesian product of the node sets, and coordinating the edges. The acceptance condition of the product is fixed by the eventuality automaton: any node that has (in its second component) an empty set of goals is accepting. This construction was meant first of all to show the theoretical connection between LTL and Büchi automata and establish its correctness. It was also designed to be applicable to temporal logic extended with operators defined by finite automata [14]. Applied blindly, it systematically leads to an automaton with a state set of exponential size for the following reasons. 1. Each node in the local automaton of this construction is maximal. Namely, it contains each subformula either negated or non-negated. Thus the number of nodes is exponential in the size of the formula (or equivalently in the number of its subformulas, cl(')). This is unnecessary since many of theses nodes are often unreachable. Furthermore, this approach does not allow nodes that only differ on locally irrelevant members of cl(') to be merged. 2. The eventuality automaton has states that consist of sets of U subformulas. Thus, it is exponential in the number of U subformulas. This is needed to handle extended temporal logics, but is not necessary for the logic we consider here. Indeed, since U formulas propagate unmodified until their righthand side argument is satisfied, one can, as we did here, directly write the requirement that U subformulas are satisfied as a generalized Büchi acceptance condition. Furthermore, converting this generalized Büchi acceptance condition to a simple one can de done with an increase in the size of the automaton that is linear in the number of U subformulas, rather than exponential in this number as in the eventuality automaton approach. (A similar observation is independently and implicitly made in [2].) 3. The nodes are generated in a “global” manner: first, all possible nodes are generated for both automata. Then, edges are constructed between pairs of nodes if they satisfy some consistency conditions. Finally, the product automaton is taken. Only at the end it is possible to check which nodes are really reachable from the initial states. This requires an additional search. An improved tableau construction for temporal logic was given in [7]. It constructs a graph (the goal of that paper was checking satisfiability rather than using the translation for model-checking), but can similarly create the Büchi automaton that corresponds to a temporal property. This construction indeed uses the above observations to reduce the number of states and edges. It is also claimed that it operates “on-the-fly”, as it starts with the property that needs to be translated, creating an initial graph, and then refining this graph until it corresponds to the appropriate translation. Thus, it constructs nodes and edges only “when needed”. This construction globally checks pairs of adjacent nodes in the graph. If they do not satisfy the tableau consistency conditions, one of these nodes is refined: it is replaced by a set of nodes that satisfy the consistency conditions. The algorithm continues to refine nodes until all the edges satisfy the consistency conditions. This involves replacing old nodes by new ones, and adding and removing edges accordingly. With this algorithm, the construction of the automaton needs to be finished before it can be used for model-checking. Our construction starts with the checked formula ', constructs a node for it and continue to generate the graph in a depth-first-search order. The only cases where a node is discarded are where it is already found in the list of existing nodes, or when it contains a propositional contradiction. Moreover, it can be used on-the-fly. Thus, avoiding the need to construct the entire automaton if a violation of the checked property was found during its intersection with the protocol. 6 Experimental Results and Conclusions The following table compares the global construction described in [13] and the algorithm described in Section 3. Both were implemented in Standard ML of New Jersey. Here, Fp abbreviates T U p and Gp abbreviates F p. :: Num. 1 2 3 4 5 6 7 Formula p1 U p2 p1 U (p2 U p3 ) :(p1 U (p2 U p3)) GFp1 ! GFp2 Fp1 U Gp2 Gp1 U p2 :(FFp1 $ Fp1) Global Construction Nodes Transitions 8 34 26 240 26 240 114 763 56 337 13 63 - New Construction Nodes Transitions Accepts 3 4 1 4 6 2 7 15 0 9 15 2 8 15 2 5 6 1 22 41 2 The rightmost column represents the number of pairs in the acceptance table of the constructed automaton. Notice that for the safety property 3, there are no U subformulas satisfy. Yet, for the automaton to be nonempty, it has to contain a reachable cycle. The formulas that were used in the experiments are the following. GFp1 ! GFp2 This formula can describe a fairness condition: p1 expresses the enabledness of some element (e.g., a process, a transition), and p2 the execution of that element. Such a formula can be exploited when one wants to check some property under a fairness condition which is not already implemented in the model-checker. p1 U (p2 U p3 ) and :(p1 U (p2 U p3 )) The purpose of these examples is to show that the construction does not impose an exponential blowup when negating a formula. :(FFp1 $ Fp1) $ This can be used to verify that (FFp1 Fp1 ) is a tautology. Unfortunately there was insufficient memory for the ML program for the global construction to complete. It is evident from the table that the exponential blowup occurs much faster using the global construction. This will not only be reflected in the memory and time that it takes to complete the construction, but also during the emptiness check, which takes time (linearly) proportional to the size of the constructed automaton. In model-checking the size of the constructed property automaton is more critical, since one has to take the product of this automaton with the one representing the state space. Given that the size of the state space is itself also often a problem, it is all the more important that the property automaton be as small as possible. For the same reason, the fact that the algorithm is on-the-fly is important. It means that the algorithm can often given an answer before the full state space and property automaton have been constructed. Thus, we feel that the algorithm in this paper is a promising and potentially practical approach to both model-checking and validity checking: it is simple, it appears to produce reasonable sized automata and it operates on-the-fly. Acknowledgment. The second author likes to thank Elsa Gunter for helping him with debugging the ML program. References [1] Y. Choueka, Theories of automata on ! -Tapes: a simplified approach, Journal of Computer and System Science 8 (1974), 117-141. [2] G. Bhat, R. Cleaveland, O. Grumberg, Efficient on-the-fly model checking for CTL*, Proceedings of the 10th Symposium on Logic in Computer Science, 1995, San Diego, CA, To appear. [3] J.R. Burch, E.M. Clarke, K.L. McMillan, D.L.Dill, J. Hwang, Symbolic model checking: 1020 states and beyond, Information and Computation, 98(1992), 142–170. [4] C. Courcoubetis, M. Vardi, P. Wolper, M. Yannakakis, Memory-efficient algorithms for the verification of temporal properties, Formal methods in system design 1 (1992) 275–288. [5] O. Coudert, C. Berthet, J.C. Madre, Verification of synchronous sequential machines based on symbolic execution, Automatic Verification Methods for Finite State Systems, Grenoble, France, LNCS 407, Springer–Verlag, 1989, 365–373. [6] G.J. Holzmann, Design and Validation of Computer Protocols, Prentice Hall, 1992. [7] Y. Kesten, Z. Manna, H. McGuire, A. Pnueli, A decision algorithm for full propositional temporal logic, CAV’93, Elounda, Greece, LNCS 697, Springer–Verlag, 97-109. [8] O. Lichtenstein, A. Pnueli, Checking that finite-state concurrent programs satisfy their linear specification, 11th ACM POPL, 1984, 97–107. [9] A. Pnueli, The temporal logic of programs, Proceedings of the 18th IEEE Symposium on Foundation of Computer Science, 1977, 46-57. [10] A. P. Sistla, E. M. Clarke, The Complexity of propositional linear temporal logics, Journal of the ACM, 32(1985), 733–749. [11] W. Thomas, Automata on infinite objects, Handbook of theoretical computer science, 1990, 165–191. [12] M.Y. Vardi, P. Wolper, An automata-theoretic approach to automatic program verification, Proceedings of the 1st Symposium on Logic in Computer Science, 1986, Cambridge, England, 322–331. [13] M.Y. Vardi, P. Wolper, Reasoning about infinite computations, Information and Computation, 115(1994), 1–37. [14] P. Wolper, Temporal logic can be more expressive, Information and Control, 56(1983), 72–99. [15] P. Wolper, The tableau method for temporal logic: an overview, Logique et Analyse, 110–111(1985), 119–136. [16] P. Wolper, M.Y. Vardi, A.P. Sistla, Reasoning about infinite computation paths, Proceedings of 24th IEEE symposium on foundation of computer science, Tuscan, 1983, 185–194.