Tools for Automated Verification of Web Services
Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara

Web Services
• Loosely coupled
• Standardized data transmission via XML
• Asynchronous messaging
• Platform independent (.NET, J2EE)
[Figure: web service standards stack. Interaction: BPEL4WS, WSCI; Service: WSDL; Message: SOAP; Type: XML Schema; Data: XML. Implementation platforms: Microsoft .NET, Sun J2EE.]

Motivation
• Challenges in both specification and verification
  – Distributed nature, no central control
    • How do we model the global behavior?
    • How do we specify the global properties?
  – Asynchronous messaging introduces undecidability in analysis
    • How do we check the global behavior?
    • How do we enforce the global behavior?
  – XML data manipulation
    • How do we specify XML messages?
    • How do we verify properties related to data?

Outline
• Web Service Composition Model
• Capturing Global Behaviors
  – Conversations
• Top-Down Specification and Verification
  – Realizability
• Bottom-Up Specification and Verification
  – Synchronizability
• Web Service Analysis Tool
• Conclusions and Future Work

Collaborators: Xiang Fu, Jianwen Su, Rick Hull

Web Service Composition
• A composite web service is a tuple S = (P, M) [Bultan, Fu, Hull, Su WWW’03]
  – P: finite set of peers (web services)
  – M: finite set of message classes
[Figure: example composition with peers Traveler, Agency, Hotel, and Airline and message classes Req1, Req2, Booking1, Booking2.]

Communication Model
• Reliable
• Asynchronous
• Queues are FIFO and unbounded
[Figure: Agency and Airline with message Req1 and queue contents R2, R2.]
• This model is similar to industry efforts
  – JMS (Java Message Service)
  – MSMQ (Microsoft Message Queuing Service)

Message Classes
• Messages are classified into classes
• Each message class is associated with one sender and one receiver
[Figure: Agency, Req2, Airline.]
• Two models for messages:
  – No content, just classes
    • This model can represent messages with content as long as domains are finite
  – Messages with content
    • XML messages

Finite State Peers
• Peer: finite state automaton + one FIFO queue
• Extensions
  – Reactive services: Büchi automata
  – Message contents: guarded automata
[Figure: Airline peer automaton with input messages (?Req3), output messages (!Booking3), and a transition annotation of the form [ ... := ... ].]

Executing Web Service Composition
[Figure: Traveler, Agency, Hotel, and Airline automata exchanging request messages R1, R2, R3 and booking messages B1, B2, B3 through their input queues.]
An execution is a complete run if
• Each sent message is eventually consumed
• Each peer visits its final states infinitely often

Conversations
• Watcher: “records” the messages as they are sent
[Figure: a watcher observing Traveler, Agency, Hotel, and Airline records the message sequence R1 R2 B2 R3 B3 B1.]
• A conversation is a sequence of messages the watcher sees in a complete run
• Conversation set: the set of all possible conversations of a service S, written C(S)

Properties of Conversations
• The notion of conversation enables us to reason about temporal properties of the web service composition
• LTL framework extends naturally to conversations
  – LTL temporal operators: X (neXt), U (Until), G (Globally), F (Future)
  – Atomic properties: predicates on message classes (or contents)
  – Example: $G(R1 \Rightarrow F\,B1)$
• Model checking problem: given an LTL property, does the conversation set C(S) satisfy the property?

Question
• Given a web service composition S, is the language C(S) always regular?
• If it is regular, finite state model checking techniques can be used for verification

Answer
• Conversation sets are not always regular, even without message contents
• Example: $C(S) = \{\, w \mid w \in (r \mid a)^{*} \text{ and, for each prefix } w' \text{ of } w,\ |r|_{w'} \ge |a|_{w'} \,\}$
[Figure: peer P1 with transitions !r and ?a; peer P2 with transitions ?r and !a.]
• Causes: asynchronous communication with unbounded queues
• With bounded queues or synchronous communication, the conversation set is always regular

Bottom-Up vs. Top-Down
Bottom-up approach
• Specify the behavior of each peer
• The global communication behavior (conversation set) is implicitly defined based on the composed behavior of the peers
• Global communication behavior is hard to understand and analyze
Top-down approach
• Specify the global communication behavior (conversation set) explicitly as a protocol
• Ensure that the conversations generated by the peers obey the protocol

[Figure: (a) conversation schema with Peer A, Peer B, and Peer C: msg1 from A to B, msg2 and msg6 from B to A, msg3 and msg5 from B to C, msg4 from C to B; (b) conversation protocol over A→B:msg1, B→A:msg2, B→C:msg3, C→B:msg4, B→C:msg5, B→A:msg6; (c) peer automata for A, B, and C with input queues and a virtual watcher. Both the protocol and the composed peers are checked against the LTL property $G(msg1 \Rightarrow F(msg3 \lor msg5))$.]

Conversation Protocols
• Conversation protocol:
  – An automaton that accepts the desired conversation set
  – For reactive protocols with infinite message sequences we use:
    • Büchi automata
    • Accept infinite strings
  – For specifying message contents, we use:
    • Guarded automata
    • Guards are constraints on the message contents
• A conversation protocol is a contract agreed by all peers
  – Each peer must act according to the protocol

Model Checking
• Protocols without message contents
  – Finite state model checking techniques and tools
• Protocols with finite domain message contents
  – Finite state model checking techniques and tools
• Protocols with infinite domain message contents
  – Infinite state model checking techniques and tools

Synthesize Peer Implementations
• Conversation protocol specifies the global communication behavior
  – How do we implement the peers?
• How do we obtain the contracts that peers have to obey from the global contract specified by the conversation protocol?
• Project the global protocol to each peer
  – By dropping unrelated messages for each peer

Interesting Question
• Are the conversations specified by the conversation protocol the same as the conversations generated by the composed behavior of the projected services?
• Are there conditions which ensure the equivalence?

Realizability Problem
• Not all conversation protocols are realizable!
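The projection step and the realizability question can be tried out on a tiny protocol. The following is a minimal sketch, not the talk's tool: it assumes a protocol with four peers in which A sends a to B and then C sends b to D, and the encoding (tuples for messages, the names `project` and `join_of_projections`) is hypothetical. It projects the protocol to each peer by dropping unrelated messages, then collects every message order that all peers accept locally.

```python
from itertools import permutations

# Hypothetical encoding: a message is (sender, receiver, name), and a
# conversation is a tuple of messages in the order they are sent.
A_TO_B = ("A", "B", "a")
C_TO_D = ("C", "D", "b")
PEERS = ("A", "B", "C", "D")

# Protocol with a single allowed conversation: a is sent, then b.
PROTOCOL = {(A_TO_B, C_TO_D)}


def project(conversation, peer):
    """Drop every message that `peer` neither sends nor receives."""
    return tuple(m for m in conversation if peer in (m[0], m[1]))


def join_of_projections(protocol, peers):
    """Reorderings of protocol conversations that every peer accepts locally.

    Candidates are limited to permutations of single conversations, which is
    exact for a one-conversation protocol and only a subset in general.
    """
    local = {p: {project(c, p) for c in protocol} for p in peers}
    candidates = {perm for c in protocol for perm in permutations(c)}
    return {w for w in candidates
            if all(project(w, p) in local[p] for p in peers)}


JOIN = join_of_projections(PROTOCOL, PEERS)
EXTRA = JOIN - PROTOCOL  # orders the peers allow but the protocol does not

for conv in sorted(EXTRA):
    print("not in protocol:", " ".join(name for _, _, name in conv))
```

Because peer C never observes message a, nothing in its projected behavior forces it to wait for a before sending b, so the order b, a passes every local check even though the protocol only allows a, b.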
[Figure: conversation protocol with A→B: a followed by C→D: b, and the projection of the conversation protocol to the peers: Peer A: !a, Peer B: ?a, Peer C: !b, Peer D: ?b.]
• Conversation “ba” will be generated by any legal peer implementation which follows the protocol

Realizability Problem
• Three sufficient conditions for realizability (contentless messages) [Fu, Bultan, Su, CIAA’03]
  – Lossless join
    • Conversation set should be equivalent to the join of its projections to each peer
  – Synchronous compatible
    • When the projections are composed synchronously, there should not be a state where a peer is ready to send a message while the corresponding receiver is not ready to receive
  – Autonomous
    • Each peer should be able to make a deterministic decision on whether to send or to receive or to terminate

Realizability for Guarded Protocols
• One natural conjecture:
  – Drop all guards and message contents to get the “skeleton” of the conversation protocol
  – Check realizability of the skeleton
• Conjecture fails because there exist
  – Nonrealizable guarded protocols with realizable skeletons, and
  – Realizable guarded protocols with nonrealizable skeletons

Examples
• Skeleton is realizable, but guarded protocol is not
• Guarded protocol is realizable, but its skeleton is not
[Figure: example guarded protocols with transitions D→B: d(1), A→B: a(1), C→D: c(2), D→A: e(1), C→D: c(1), D→A: e(2), D→B: d(2), A→B: a(2), and A→B: a, B→A: b. Annotation: c(1) a(2) is a conversation of the projected peers.]

Realizability for Guarded Protocols
• A fourth condition
  – Deterministic guards
    • If we determinize the projection of the conversation protocol to each peer, all the guards that map to a state should be identical
• If a guarded conversation protocol satisfies the above property
  – and if its skeleton satisfies the three conditions we discussed before,
    • then it is realizable

Guarded Protocols
• If the realizability conditions are not met we can still try exhaustive state space exploration
  – Treat each valuation of message contents as a new message class and get a standard conversation protocol without contents
  – Accurate, but costly
• Future work: developing symbolic verification techniques for conversation protocols

Bottom-Up Approach
• We know that analyzing conversations of composite web services is difficult due to asynchronous communication
• The question is, can we identify composite web services where asynchronous communication does not create a problem?
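One way to see why asynchronous communication complicates the analysis is to enumerate the global states of a small composition under a queue bound. The sketch below is an illustration under stated assumptions, not the analysis tool from the talk: the two peers (a requester that may send r or consume a at any time, and a server that consumes r and then sends a), the rule that a send blocks when the receiver's queue is full, and the names `REQUESTER`, `SERVER`, `successors`, and `count_reachable` are all hypothetical.

```python
from collections import deque

# Hypothetical peers: state -> list of (action, message, next_state),
# where "!" is a send and "?" is a receive from the peer's own input queue.
REQUESTER = {0: [("!", "r", 0), ("?", "a", 0)]}
SERVER = {0: [("?", "r", 1)], 1: [("!", "a", 0)]}
PEERS = (REQUESTER, SERVER)
RECEIVER = {"r": 1, "a": 0}  # message class -> index of the receiving peer


def successors(state, bound):
    """Yield global states reachable in one step; queues hold at most `bound`."""
    locs, queues = state
    for i, peer in enumerate(PEERS):
        for action, msg, nxt in peer.get(locs[i], []):
            new_locs = locs[:i] + (nxt,) + locs[i + 1:]
            if action == "!":
                j = RECEIVER[msg]
                if len(queues[j]) < bound:  # assumption: send blocks when full
                    grown = queues[j] + (msg,)
                    yield new_locs, queues[:j] + (grown,) + queues[j + 1:]
            elif queues[i] and queues[i][0] == msg:  # FIFO: consume the head
                yield new_locs, queues[:i] + (queues[i][1:],) + queues[i + 1:]


def count_reachable(bound):
    """Breadth-first search over global states (peer states + queue contents)."""
    start = ((0, 0), ((), ()))
    seen = {start}
    frontier = deque([start])
    while frontier:
        for nxt in successors(frontier.popleft(), bound):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


for bound in range(1, 6):
    print(bound, count_reachable(bound))
```

For these two peers every combination of server state and queue lengths is reachable, so the count grows quadratically with the bound and never stabilizes. A composition whose queues stay bounded in every execution would instead reach a fixed count once the bound is large enough.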
Three Examples

Example 1
[Figure: requester and server automata; the requester sends r1, r2, and e and receives a1 and a2, and the server receives r1, r2, and e and sends a1 and a2.]
• Conversation set is regular: (r1 a1 | r2 a2)* e
• During all the executions queues are bounded

Example 2
[Figure: requester and server automata over the same messages r1, r2, e, a1, a2.]
• Conversation set is not regular
• Queues are not bounded

Example 3
[Figure: requester and server automata over messages r1, r2, e, r, and a.]
• Conversation set is regular: (r1 | r2 | r a)* e
• Queues are not bounded

Three Examples
[Figure: number of states (in thousands, 0 to 1600) versus queue length (1 to 13) for Example 1, Example 2, and Example 3.]
• Verification of Examples 2 and 3 is difficult even if we bound the queue length
• How can we distinguish Examples 1 and 3 (with regular conversation sets) from 2?
  – Synchronizability analysis

Synchronizability Analysis
• A composite web service S is synchronizable if its conversation set C(S) does not change
  – when asynchronous communication is replaced with synchronous communication
• A composite web service is synchronizable if it satisfies the synchronous compatible and autonomous conditions [Fu, Bultan, Su WWW’04]

Are These Conditions Too Restrictive?

Problem Set                             Size                     Synchronizable?
Source                      Name        #msg  #states  #trans
ISSTA’04                    SAS            9       12      15    yes
IBM Conv. Support Project   CvSetup        4        4       4    yes
IBM Conv. Support Project   MetaConv       4        4       6    no
IBM Conv. Support Project   Chat           2        4       5    yes
IBM Conv. Support Project   Buy            5        5       6    yes
IBM Conv. Support Project   Haggle         8        5       8    no
IBM Conv. Support Project   AMAB           8       10      15    yes
BPEL spec                   shipping       2        3       3    yes
BPEL spec                   Loan           6        6       6    yes
BPEL spec                   Auction        9        9      10    yes
Collaxa.com                 StarLoan       6        7       7    yes
Collaxa.com                 Cauction       5        7       6    yes

Web Service Analysis Tool
[Figure: tool architecture. Front end: web services in BPEL (bottom-up) go through a BPEL to GFSA translator, and conversation protocols (top-down) go through a GFSA parser; both produce the intermediate representation, guarded automata. Analysis: synchronizability analysis and realizability analysis, with fail, success, and skip outcomes. Back end: GFSA to Promela translations for a guarded automaton with synchronous communication, with bounded queues, or as a single process with no communication. Verification language: Promela.]

Guarded Automata Model
• Uses XML messages
• Uses MSL for declaring message types
  – MSL (Model Schema Language) is a compact formal model language which captures most features of XML Schema
• Uses XPATH expressions for guards
  – XPATH is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes

An XML Message
<Register>
  <investorID> 1234 </investorID>
  <requestList>
    <stockID> AAAA </stockID>
    <stockID> BBBB </stockID>
  </requestList>
  <payment>
    <accountNum> 56 </accountNum>
  </payment>
</Register>
[Figure: the same message as a tree. Register has children investorID (1234), requestList (stockID AAAA, stockID BBBB), and payment (accountNum 56).]

MSL Type Declaration
Register[
  investorID[xsd:int] ,
  requestList[
    stockID[xsd:string]{1,50}
  ] ,
  payment[
    creditCardNum[xsd:int] |
    accountNum[xsd:int]
  ]
]

XPATH Queries
[Figure: tree of the Register message with nodes investorID (1234), requestList (stockID AAAA, stockID BBBB), and payment (accountNum 56).]
• //payment/* returns the node labeled accountNum
• /Register/requestList/stockID/string returns the nodes labeled AAAA and BBBB
• //stockID[string=AAAA]/string returns the node labeled AAAA

The Guarded Automata Model
[Figure: guarded automaton with transitions ?a1, ?a2, !r1, !r2, and !e, annotated with the declarations below.]
// XML Schema Type Decl.
request[ id[int] ]
// messages
r2: request
// local variables
last: request
Guard{ r2/id = last/id r2/id := last/id + 1 }

Guarded Automata to Promela
• Restrictions:
  – Bound all the domains
  – Only ordered lists
• Map MSL types to Promela type system
• Translate XPATH expressions to Promela
  – Example: $request // stockID = $register // stockID [int()>5] [position() = last()]

Model Checking Using Promela
• Subtle errors in an example
  – SAS: Stock Analysis Service [Fu, Bultan, Su ISSTA’04]
  – 3 peers: Investor, Broker, ResearchDept.
  – Investor → Broker: a registerList of stockIDs
  – Broker → ResearchDept.:
    • relay request (1 stockID per request)
    • find the stockID in the latest request, send its subsequent stockID in registerList
  – Repeating stockID will cause error.
  – Only discoverable by analysis of XPATH expressions

Related Work
• Conversation specification
  – IBM Conversation support project
  – Conversation support for business process integration [Hanson, Nandi, Kumaran EDOC’02]
  – Orchestrating computations on the world-wide web [Choi, Garg, Rai, Misra, Vin EuroPar’02]
• Verification of web services
  – Simulation, verification, composition of web services [Narayanan, McIlraith WWW’02]
• Realizability problem
  – Realizability of Message Sequence Charts (MSC) [Alur, Etessami, Yannakakis ICSE’00, ICALP’01]

Current and Future Work
• More analysis tools are necessary for guarded protocols with infinite domains
  – Symbolic analysis
  – Abstraction
• Extending the source and target languages
• Tools for model checking web services
  – Finite state vs. infinite-state
  – Message contents, local variables

Current and Future Work
[Figure: planned tool architecture. Front end: web service specification languages (BPEL, DAML-S, WSCI, conversation protocols, ...) pass through a translator for bottom-up specifications or a translator for top-down specifications. Intermediate representation: guarded automata, with automated abstraction. Analysis: synchronizability analysis and realizability analysis, with fail, success, and skip outcomes. Back end: translation with synchronous communication, translation with bounded queue, or translation with single process and no communication. Verification languages: Promela, SMV, Action Language, ...]
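The three XPATH queries on the Register message shown earlier can be approximated with Python's standard `xml.etree.ElementTree` module. This is only a sketch under assumptions: ElementTree supports a limited XPath subset and does not model the MSL-style `string` leaf nodes, so element text stands in for them, paths are written relative to the root element, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

REGISTER_XML = """
<Register>
  <investorID>1234</investorID>
  <requestList>
    <stockID>AAAA</stockID>
    <stockID>BBBB</stockID>
  </requestList>
  <payment>
    <accountNum>56</accountNum>
  </payment>
</Register>
"""

root = ET.fromstring(REGISTER_XML)  # root is the <Register> element

# //payment/* : every child element of a payment node
payment_children = [elem.tag for elem in root.findall(".//payment/*")]

# /Register/requestList/stockID/string : text of each stockID in requestList
stock_ids = [elem.text.strip()
             for elem in root.findall("./requestList/stockID")]

# //stockID[string=AAAA]/string : stockID nodes whose text is AAAA
matching = [elem.text.strip() for elem in root.iter("stockID")
            if elem.text.strip() == "AAAA"]

print(payment_children, stock_ids, matching)
```

The filter in the last query is written as a list comprehension rather than an XPath predicate to stay within the subset that ElementTree has supported across Python 3 versions.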