WSAT: A Tool for Formal Analysis of Web Services

Xiang Fu, Tevfik Bultan, Jianwen Su
Department of Computer Science
University of California, Santa Barbara
{fuxiang,bultan,su}@cs.ucsb.edu

Web Services

[Figure: Web service standards stack, with interaction across implementation platforms (Microsoft .NET, Sun J2EE)]
  Composition:  WSCI, BPEL4WS
  Service:      WSDL
  Message:      SOAP
  Type:         XML Schema
  Data:         XML

• Loosely coupled, interaction through standardized interfaces
• Standardized data transmission via XML
• Asynchronous messaging
• Platform independent (.NET, J2EE)

Challenges in Verification of Web Services
• Distributed nature, no central control
  – How do we model the global behavior?
  – How do we specify the global properties?
• Asynchronous messaging introduces undecidability in analysis
  – How do we check the global behavior?
  – How do we enforce the global behavior?
• XML data manipulation
  – How do we specify XML messages?
  – How do we verify properties related to data?

Outline
• Web Service Composition Model
  – Conversations: Capturing Global Behaviors
• Top-Down vs. Bottom-Up Specification and Verification
  – Realizability vs. Synchronizability
• XML messaging
  – MSL, XPath
  – Translation to Promela
• Web Service Analysis Tool
• Conclusions and Future Work

Composite Web Services

[Figure: Composite web service with three peers (Investor, Stock Broker Firm, Research Dept.), each a state machine with send (!) and receive (?) transitions over the messages register, accept, reject, request, report, ack, bill, cancel, and terminate]
[Figure, continued: a Watcher records the messages exchanged by the peers: reg acc req rep ack bil ter]

Conversation Protocols
• A conversation is a sequence of messages the watcher sees during an execution [Bultan, Fu, Hull, Su WWW’03]
• Conversation Protocol: An automaton that accepts the desired conversation set

[Figure: SAS conversation protocol, an automaton with states 1 to 12 whose transitions are labeled register, reject, accept, request, report, ack, cancel, bill, and terminate]

[Figure: Conversation Schema with Peer A, Peer B, and Peer C (A sends msg1 to B; B sends msg2 and msg6 to A; B sends msg3 and msg5 to C; C sends msg4 to B), and a Conversation Protocol with transitions A→B:msg1, B→A:msg2, B→C:msg3, C→B:msg4, B→C:msg5, B→A:msg6. Does the protocol satisfy the LTL property?]
LTL property: G(msg1 ⇒ F(msg3 ∨ msg5))

Composite Web Service

[Figure: Peer A, Peer B, and Peer C as state machines with input queues and send (!) and receive (?) transitions over msg1 to msg6; a virtual watcher records the conversation. Does the composition satisfy the LTL property?]
LTL property: G(msg1 ⇒ F(msg3 ∨ msg5))

Top-Down Approach
• Conversation protocol specifies the global communication behavior
  – How do we implement the peers?
• Project the global protocol to each peer
  – By dropping unrelated messages for each peer

  Conversations specified by the conversation protocol
    =?
  Conversations generated by the composed behavior of the projected services

Are there conditions which ensure the equivalence?

Realizability Problem
• Not all conversation protocols are realizable!
[Figure: Conversation protocol with transitions A→B: m1 and C→D: m2, and its projection to the peers: Peer A: !m1, Peer B: ?m1, Peer C: !m2, Peer D: ?m2]

Conversation “m2 m1” will be generated by any legal peer implementation which follows the protocol.
This protocol fails the lossless join condition.

Another Non-Realizable Protocol

[Figure: Peers A, B, and C with a Watcher; the protocol contains the transitions A→B: m1, B→A: m2, and A→C: m3; the watcher records m2 m1 m3]

This protocol fails the autonomous condition.

Yet Another Non-Realizable Protocol

[Figure: Peers A, B, and C with a Watcher; the protocol contains the transitions A→B: m1 and C→A: m2; the watcher records m2 m1]

This protocol fails the synchronous compatible condition.

Realizability Problem
• Three sufficient conditions for realizability [Fu, Bultan, Su, CIAA’03, TCS]
  – Lossless join: Conversation set should be equivalent to the join of its projections to each peer
  – Synchronous compatible: When the projections of the conversation protocol are executed with synchronous communication semantics, there should not be a state where a peer is ready to send a message while the corresponding receiver is not ready to receive
  – Autonomous: Each peer should be able to make a deterministic decision on whether to send or to receive or to terminate

Bottom-Up Approach
• We know that analyzing conversations of composite web services is difficult due to asynchronous communication
• The question is, can we identify composite web services where asynchronous communication does not create a problem?
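The projection step of the top-down approach and the lossless join condition can be illustrated on the first non-realizable protocol (A→B: m1, then C→D: m2). The following is a minimal Python sketch, not part of WSAT. It assumes a finite conversation set given explicitly as tuples, whereas the actual condition is stated for the regular language accepted by the protocol automaton. The names `SCHEMA`, `PROTOCOL`, `project`, `join_of_projections`, and `is_lossless_join` are hypothetical and chosen for this example only.

```python
from itertools import product

# Hypothetical encoding: each message name maps to (sender, receiver).
SCHEMA = {"m1": ("A", "B"), "m2": ("C", "D")}
# The protocol from the slide accepts exactly one conversation: m1 then m2.
PROTOCOL = {("m1", "m2")}


def project(word, peer, schema):
    """Drop every message that `peer` neither sends nor receives."""
    return tuple(m for m in word if peer in schema[m])


def join_of_projections(protocol, schema):
    """All words whose projection to each peer is a projected conversation."""
    peers = {p for pair in schema.values() for p in pair}
    projected = {p: {project(w, p, schema) for w in protocol} for p in peers}
    # Each message is seen by exactly two peers (sender and receiver),
    # so half the sum of the longest projections bounds the word length.
    bound = sum(max(len(w) for w in ws) for ws in projected.values()) // 2
    joined = set()
    for n in range(bound + 1):
        for word in product(sorted(schema), repeat=n):
            if all(project(word, p, schema) in projected[p] for p in peers):
                joined.add(word)
    return joined


def is_lossless_join(protocol, schema):
    """True if the conversation set equals the join of its projections."""
    return join_of_projections(protocol, schema) == set(protocol)
```

For the protocol above, the join also contains the conversation (m2, m1), so the check fails, which matches the slide. A two-peer protocol in which A sends m1 to B and B then replies with m2 passes this particular check. Exhaustive enumeration is used only to keep the sketch short; it is practical for toy examples only.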
Three Examples, Example 1

[Figure: requester and server automata. Requester: !r1, !r2, ?a1, ?a2, !e. Server: ?r1, ?r2, !a1, !a2, ?e. Messages r1, r2, e flow from requester to server; a1, a2 flow from server to requester]

• Conversation set is regular: (r1a1 | r2a2)* e
• During all the executions queues are bounded

Example 2

[Figure: requester and server automata. Requester: !r1, !r2, ?a1, ?a2, !e. Server: ?r1, ?r2, !a1, !a2, ?e. Messages r1, r2, e flow from requester to server; a1, a2 flow from server to requester]

• Conversation set is not regular
• Queues are not bounded

Example 3

[Figure: requester and server automata. Requester: !r1, !r2, !r, ?a, !e. Server: ?r1, ?r2, ?r, !a, ?e]

• Conversation set is regular: (r1 | r2 | r a)* e
• Queues are not bounded

Three Examples

[Figure: number of states (in thousands, 0 to 1600) versus queue length (1 to 13) for Example 1, Example 2, and Example 3]

• Verification of Examples 2 and 3 is difficult even if we bound the queue length
• How can we distinguish Examples 1 and 3 (with regular conversation sets) from 2?
  – Synchronizability Analysis

Synchronizability Analysis
• A composite web service is synchronizable if its conversation set does not change when asynchronous communication is replaced with synchronous communication
• A composite web service is synchronizable if it satisfies the synchronous compatible and autonomous conditions [Fu, Bultan, Su WWW’04]

Are These Conditions Too Restrictive?

  Problem Set                            Size                        Synchronizable?
  Source                      Name       #msg   #states   #trans.
  ISSTA’04                    SAS          9      12        15       yes
  IBM Conv. Support Project   CvSetup      4       4         4       yes
  IBM Conv. Support Project   MetaConv     4       4         6       no
  IBM Conv. Support Project   Chat         2       4         5       yes
  IBM Conv. Support Project   Buy          5       5         6       yes
  IBM Conv. Support Project   Haggle       8       5         8       no
  IBM Conv. Support Project   AMAB         8      10        15       yes
  BPEL spec                   shipping     2       3         3       yes
  BPEL spec                   Loan         6       6         6       yes
  BPEL spec                   Auction      9       9        10       yes
  Collaxa.com                 StarLoan     6       7         7       yes
  Collaxa.com                 Cauction     5       7         6       yes

Web Service Analysis Tool (WSAT)

[Figure: WSAT architecture]
  Front End (Web Services → Intermediate Representation):
    BPEL (bottom-up) → BPEL to GFSA → Guarded automata
    Conversation Protocol (top-down) → GFSA parser → Guarded automaton
  Analysis:
    Synchronizability Analysis (on guarded automata)
    Realizability Analysis (on the guarded automaton)
    Outcome labels in the figure: success, fail, skip
  Back End (Intermediate Representation → Verification Languages):
    GFSA to Promela (synchronous communication)
    GFSA to Promela (bounded queue)
    GFSA to Promela (single process, no communication)
    Target: Promela

Demonstration: Saturday, or anytime you find me with my laptop

Guarded Automata Model
• Uses XML messages
• Uses MSL for declaring message types
  – MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
• Uses XPath expressions for guards
  – XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes

SAS Guarded Automata

Topdown {
  Schema{
    PeerList{ Investor, Broker, ResearchDept },
    TypeList{
      Register ...
      Accept ...
    },
    MessageList{
      register{ Investor -> Broker : Register },
      accept{ Broker -> Investor : Accept },
      ...
    }
  },
  GProtocol{
    States{ s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12 },
    InitialState{ s1 },
    FinalStates{ s4 },
    TransitionRelation{
      t1{ s1 -> s2 : register, Guard{ true } },
      t2{ s2 -> s5 : accept,
          Guard{ true => $accept[//orderID := $register//orderID] } },
      ...
    }
  }
}

An XML Document and Its Tree

<Register>
  <investorID> VIP01 </investorID>
  <requestList>
    <stockID> 0001 </stockID>
    <stockID> 0002 </stockID>
  </requestList>
  <payment>
    <accountNum> 0425 </accountNum>
  </payment>
</Register>

Register
  investorID
    VIP01
  requestList
    stockID
      0001
    stockID
      0002
  payment
    accountNum
      0425

An MSL Type Declaration and an Instance

Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]

<Register>
  <investorID> VIP01 </investorID>
  <requestList>
    <stockID> 0001 </stockID>
    <stockID> 0002 </stockID>
  </requestList>
  <payment>
    <accountNum> 0425 </accountNum>
  </payment>
</Register>

MSL to Promela Example

Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]

typedef t1_investorID{ mtype stringvalue; }
typedef t2_stockID{ int intvalue; }
typedef t3_requestList{
  t2_stockID stockID[3];
  int stockID_occ;
}
typedef t4_accountNum{ int intvalue; }
typedef t5_creditCard{ int intvalue; }
mtype {m_accountNum, m_creditCard}
typedef t6_payment{
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;
}
typedef Register{
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}

XPath Expressions

[Figure: the Register tree from the slide "An XML Document and Its Tree"]

  //payment/*                          returns the node labeled accountNum
  /Register/requestList/stockID/int    returns the nodes labeled 0001 and 0002
  //stockID[int > 1]/int               returns the node labeled 0002

XPath to Promela

$register // stockID / [int()>5] / [position() = last()] / int()

[Figure: the XPath expression is translated step by step into a sequence of intermediate statements (SET, FOR, IF, INC, EMPTY), including SET(i2,1), FOR(i1,1,3), IF(cond), SET(bRes1,1), IF(i2==i3), IF(bRes1), IF(bRes2), and INC(i2), where cond is v_register.requestlist.stockID[i1] > 5]

$request//stockID=$register//stockID[int()>5][position()=last()]

/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp.
*/
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;
i2=1;
/* pre-calculate the value of last(), store in i3 */
i4=0; i5=1; i3=0;
do
:: i4 < v_register.requestList.stockID_occ ->
   /* compute first predicate */
   bRes3 = false;
   if
   :: v_register.requestList.stockID[i4].intvalue>5 -> bRes3 = true
   :: else -> skip
   fi;
   if
   :: bRes3 -> i5++; i3++;
   :: else -> skip
   fi;
   i4++;
:: else -> break;
od;

$request//stockID=$register//stockID[int()>5][position()=last()]

i1=0;
do
:: i1 < v_register.requestList.stockID_occ ->
   bRes1 = false;
   if
   :: v_register.requestList.stockID[i1].intvalue>5 -> bRes1 = true
   :: else -> skip
   fi;
   if
   :: bRes1 ->
      bRes2 = false;
      if
      :: (i2 == i3) -> bRes2 = true;
      :: else -> skip
      fi;
      if
      :: bRes2 ->
         if
         :: (v_request.stockID.intvalue == v_register.requestList.stockID[i1].intvalue) -> bResult = true;
         :: else -> skip
         fi
      :: else -> skip
      fi;
      i2++;
   :: else -> skip
   fi;
   i1++;
:: else -> break;
od;

Model Checking Using Promela
• Error in SAS conversation protocol

  t14{ s8 -> s12 : bill,
       Guard{ $request//stockID = $register//stockID [position() = last()]
              => $bill[ //orderID := $register//orderID ] } }

• A repeated stockID will cause an error
• One can only discover these kinds of errors by analysis of XPath expressions

Related Work
• Conversation specification
  – IBM Conversation support project http://www.research.ibm.com/convsupport/
  – Conversation support for business process integration [Hanson, Nandi, Kumaran EDOC’02]
• Realizability problem
  – Realizability of Message Sequence Charts (MSC) [Alur, Etessami, Yannakakis ICSE’00, ICALP’01]

Related Work (continued)
• Verification of web services
  – Simulation, verification, composition of web services using a Petri net model [Narayanan, McIlraith WWW’02]
  – Using MSC to model BPEL web services which are translated to labeled transition systems and verified using model checking [Foster, Uchitel, Magee, Kramer ASE’03]
  – Model checking Web Service Flow Language specifications using SPIN [Nakajima
    ICWE’04]
  – BPEL verification using a process algebra model and Concurrency Workbench [Koshkina, van Breugel TAVWEB’04]

Future Work
• Other input languages in the front end
  – WSCI, OWL-S
• Other verification tools at the back end
  – SMV, Action Language Verifier
• Symbolic representations for XML data
• Abstraction for XML data and XML data manipulation

Future Work

[Figure: planned WSAT architecture]
  Front End (Web Service Specification Languages → Intermediate Representation):
    BPEL, WSCI, ... → Translator for bottom-up specifications → Guarded automata
    Conversation Protocols → Translator for top-down specifications → Guarded automaton
  Analysis:
    Synchronizability Analysis
    Realizability Analysis
    Automated Abstraction
    Outcome labels in the figure: success, fail, skip
  Back End (Intermediate Representation → Verification Languages):
    Translation with synchronous communication
    Translation with bounded queue
    Translation with single process, no communication
    Targets: Promela, Action Language, SMV, ...
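
Worked example for the "XPath to Promela" slides: the generated Promela code evaluates stockID[int()>5][position()=last()] in two passes, first counting the matching elements to obtain last(), then scanning again while tracking position(). The Python sketch below mirrors that two-pass structure on a plain list of integers. It is an illustration only and not the translation WSAT performs; the function names `select_last_match` and `guard_holds` and the `threshold` parameter are hypothetical.

```python
def select_last_match(stock_ids, threshold=5):
    """Evaluate stockID[int()>threshold][position()=last()] on a list of ints."""
    # Pass 1: pre-calculate last(), the number of elements that satisfy
    # the first predicate (the role of i3 in the Promela code).
    last = sum(1 for value in stock_ids if value > threshold)

    # Pass 2: scan again, tracking position() among the elements that
    # satisfy the first predicate (the role of i2 in the Promela code).
    position = 1
    selected = []
    for value in stock_ids:
        if value > threshold:        # first predicate: [int()>5]
            if position == last:     # second predicate: [position()=last()]
                selected.append(value)
            position += 1
    return selected


def guard_holds(request_stock_id, register_stock_ids, threshold=5):
    """$request//stockID = $register//stockID[int()>5][position()=last()]."""
    return request_stock_id in select_last_match(register_stock_ids, threshold)
```

With register stock IDs 3, 7, 9, only the last match, 9, is selected, so a request for 9 satisfies the guard and a request for 7 does not. The same two-pass idea is why the Promela version needs separate counters for last() and position().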