Tools for Automated Verification of Web Services

Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara
bultan@cs.ucsb.edu
http://www.cs.ucsb.edu/~bultan/
Characteristics of Web Services
Web services: Web accessible software applications which
interact with each other through the Internet
Goals
• Platform independent (.NET, J2EE)
• Dynamic service discovery
• Loosely coupled
• Tolerate pauses in availability and slow data transmission
Approach
• Standardized data transmission: XML
• Interaction through standardized interfaces: WSDL
• Asynchronous messaging
Web Service Standards
[Figure: the web service standards stack, with side labels "Interaction" and "Implementation Platforms"]
• Composition: WSCI, BPEL4WS
• Service: WSDL
• Message: SOAP
• Type: XML Schema
• Data: XML
• Implementation platforms: Microsoft .NET, Sun J2EE
Challenges in Verification of Web Services
• Distributed nature, no central control
– How do we model the global behavior?
– How do we specify the global properties?
• Asynchronous messaging introduces undecidability in
analysis
– How do we check the global behavior?
– How do we enforce the global behavior?
• XML data manipulation
– How do we specify the XML messages?
– How do we verify properties related to data?
Outline
• Web Service Composition Model
• Capturing Global Behaviors
– Conversations
• Top-Down vs. Bottom-Up Specification and Verification
– Realizability vs. Synchronizability
• XML messaging
– MSL, XPath
– Translation to Promela
• Web Service Analysis Tool
• Conclusions and Future Work
Collaborators: Xiang Fu, Jianwen Su, Rick Hull
An Example: Stock Analysis Service
Three peers: Investor (Inv), Stock Broker (SB), and Research
Department (RD)
• Inv initiates the stock analysis service by sending a
register message to the SB
• The SB may accept or reject the registration
• If the registration is accepted, the SB sends an analysis
request to the RD
• RD sends the results of the analysis directly to the Inv as
a report
• After receiving a report the Inv can either send an ack
to the SB or cancel the service
• Then, the SB either sends the bill for the services to the
Inv, or continues the service with another analysis
request
An Example: Stock Analysis Service (SAS)
• SAS is a composite web service
– a finite set of peers: Inv, SB, RD
– and a finite set of message classes: register, ack,
cancel, accept, reject, bill, request,
terminate, report
[Figure: the three peers and the message classes they exchange]
– Inv → SB: register, ack, cancel
– SB → Inv: accept, reject, bill
– SB → RD: request, terminate
– RD → Inv: report
Communication Model
• We assume that the messages among the peers are
exchanged using reliable and asynchronous messaging
– FIFO and unbounded message queues
[Figure: the Stock Broker (SB) sends req messages into the FIFO input queue of the Research Dept. (RD)]
• This model is similar to industry efforts such as
– JMS (Java Message Service)
– MSMQ (Microsoft Message Queuing Service)
Composite Web Service Execution
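[Figure: state machines of the three peers and the message queues between them]
– Investor: !register, ?accept, ?reject, ?report, !ack, !cancel, ?bill
– Stock Broker Firm: ?register, !accept, !reject, !request, ?ack, ?cancel, !bill, !terminate
– Research Dept.: ?request, !report, ?terminate
– Queue contents shown: reg, acc, req, rep, ack, bil, ter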
Conversations
• A virtual watcher records the messages as they are sent
[Figure: the watcher observes the messages sent among Inv, SB, and RD and records the sequence: reg acc req rep ack bil ter]
• A conversation is a sequence of messages the
watcher sees during an execution
[Bultan, Fu, Hull, Su WWW’03]
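The watcher idea can be sketched in a few lines of Python. This is only an illustration, not part of any tool mentioned in these slides: the peer encoding (one linear run per peer) and the `run` helper are assumptions made for this example. Each peer has a FIFO input queue, and the watcher appends a message to the conversation at the moment it is sent.

```python
from collections import deque

# Hypothetical encoding of one linear run of each SAS peer:
# ("!", msg, receiver) is a send, ("?", msg) is a receive.
PEERS = {
    "Inv": [("!", "register", "SB"), ("?", "accept"), ("?", "report"),
            ("!", "ack", "SB"), ("?", "bill")],
    "SB":  [("?", "register"), ("!", "accept", "Inv"), ("!", "request", "RD"),
            ("?", "ack"), ("!", "bill", "Inv"), ("!", "terminate", "RD")],
    "RD":  [("?", "request"), ("!", "report", "Inv"), ("?", "terminate")],
}

def run(peers):
    """Round-robin execution with one FIFO input queue per peer.
    Returns the conversation: the messages in the order they were sent."""
    queues = {p: deque() for p in peers}
    pc = {p: 0 for p in peers}          # next step of each peer
    conversation = []                   # what the watcher records
    progress = True
    while progress:
        progress = False
        for p, steps in peers.items():
            if pc[p] == len(steps):
                continue                # this peer has finished
            step = steps[pc[p]]
            if step[0] == "!":
                _, msg, receiver = step
                queues[receiver].append(msg)   # asynchronous send never blocks
                conversation.append(msg)       # recorded at send time
            elif queues[p] and queues[p][0] == step[1]:
                queues[p].popleft()            # receive the head of the queue
            else:
                continue                       # receive not enabled yet
            pc[p] += 1
            progress = True
    return conversation

print(run(PEERS))
```

With this particular scheduler the recorded conversation is register, accept, request, report, ack, bill, terminate; other schedulers may interleave the sends differently.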
Effects of Asynchronous Communication
• Question: Given a composite web service, is the set of
conversations a regular set?
• Even when messages do not have any content and the
peers are finite state machines the conversation set may
not be regular
• Reason: asynchronous communication with
unbounded queues
• Bounded queues or synchronous communication
⇒ Conversation set always regular
Properties of Conversations
• The notion of conversation enables us to reason about
temporal properties of the composite web services
• LTL framework extends naturally to conversations
– LTL temporal operators
X (neXt), U (Until), G (Globally), F (Future)
– Atomic properties
Predicates on message classes (or contents)
Example: G ( accept ⇒ F bill )
• Model checking problem: Given an LTL property, does
the conversation set satisfy the property?
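For intuition only, here is a small Python sketch of what the example property means on a single finite conversation. The finite-trace reading is an assumption of this sketch (the slides treat conversations in a general LTL setting and use a model checker), and the function name is hypothetical.

```python
def always_eventually_follows(conversation, trigger, response):
    """Finite-trace reading of G(trigger => F response):
    every occurrence of trigger is followed, at or after that
    position, by an occurrence of response."""
    return all(response in conversation[i:]
               for i, msg in enumerate(conversation) if msg == trigger)

# A complete SAS conversation and a hypothetical truncated one.
good = ["register", "accept", "request", "report", "ack", "bill", "terminate"]
bad = ["register", "accept", "request", "report", "cancel"]

print(always_eventually_follows(good, "accept", "bill"))
print(always_eventually_follows(bad, "accept", "bill"))
```

Checking the property for a whole conversation set, rather than one conversation, is the model checking problem stated above.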
Bottom-Up vs. Top-Down
Bottom-up approach
• Specify the behavior of each peer
• The global communication behavior (conversation set) is
implicitly defined based on the composed behavior of the
peers
• Global communication behavior is hard to understand and
analyze
Top-down approach
• Specify the global communication behavior (conversation
set) explicitly as a protocol
• Ensure that the conversations generated by the peers
obey the protocol
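[Figure: the two approaches on a conversation schema with three peers]
– Conversation schema: Peer A → Peer B: msg1; Peer B → Peer A: msg2, msg6; Peer B → Peer C: msg3, msg5; Peer C → Peer B: msg4
– Top-down: a conversation protocol with transitions A→B:msg1, B→A:msg2, B→C:msg3, C→B:msg4, B→C:msg5, B→A:msg6 is checked against the LTL property
– Bottom-up: peers A, B, and C are given as state machines with input queues (transitions such as !msg1, ?msg1, !msg3, ?msg3), and the conversations seen by the virtual watcher are checked against the same LTL property
LTL property: G(msg1 ⇒ F(msg3 ∨ msg5))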
Conversation Protocols
• Conversation Protocol:
– An automaton that accepts the desired conversation set
• A conversation protocol is a contract agreed by all peers
– Each peer must act according to the protocol
• For reactive protocols with infinite message sequences we
use:
– Büchi automata which accept infinite strings
• For specifying message contents, we use:
– Guarded automata
– Guards are constraints on the message contents
SAS Conversation Protocol
• This conversation protocol specifies the set of
conversations for the SAS
[Figure: conversation protocol automaton for the SAS with states 1 to 12 and transitions labeled register, accept, reject, request, report, ack, cancel, bill, terminate]
Synthesize Peer Implementations
• Conversation protocol specifies the global communication
behavior
– How do we implement the peers?
• How do we obtain the contracts that peers have to
obey from the global contract specified by the
conversation protocol?
• Project the global protocol to each peer
– By dropping unrelated messages for each peer
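The projection step can be sketched as follows. The transition encoding and the names `PROTOCOL`, `project`, and `visible_labels` are assumptions made for this illustration, and the protocol is only a linear fragment of the SAS example, not the full automaton.

```python
# Hypothetical encoding: (source state, sender, receiver, message, target state).
PROTOCOL = [
    (1, "Inv", "SB", "register", 2),
    (2, "SB", "Inv", "accept", 3),
    (3, "SB", "RD", "request", 4),
    (4, "RD", "Inv", "report", 5),
]

def project(protocol, peer):
    """Keep the peer's own sends (!m) and receives (?m). Transitions that
    do not involve the peer keep their states but lose their label (None),
    which corresponds to dropping the unrelated message."""
    projected = []
    for src, sender, receiver, msg, dst in protocol:
        if peer == sender:
            label = "!" + msg
        elif peer == receiver:
            label = "?" + msg
        else:
            label = None
        projected.append((src, label, dst))
    return projected

def visible_labels(projected):
    """The labels that remain after the unrelated messages are dropped."""
    return [label for _, label, _ in projected if label is not None]

for peer in ("Inv", "SB", "RD"):
    print(peer, visible_labels(project(PROTOCOL, peer)))
```

A full implementation would also remove the unlabeled moves and determinize the result; that step is omitted here.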
Interesting Question
Conversations specified by the conversation protocol
=?
Conversations generated by the projected services
If this equality holds the conversation protocol is realizable
Are there conditions which ensure the equivalence?
Realizability Problem
• Not all conversation protocols are realizable!
AB: m1
!m1
?m1
!m2
?m2
CD: m2
Peer A
Conversation
protocol
Peer B
Peer C
Projection of the conversation
protocol to the peers
Conversation “m2 m1” will be generated by all peer
implementations which follow the protocol
Peer D
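To see why this protocol is not realizable, one can enumerate every interleaving of the four projected peers under asynchronous FIFO messaging. The Python sketch below does this by exhaustive search; the encoding and the `conversations` helper are assumptions made for this example.

```python
# Projected peers of the protocol "A -> B: m1, then C -> D: m2".
PEERS = {
    "A": [("!", "m1", "B")],
    "B": [("?", "m1")],
    "C": [("!", "m2", "D")],
    "D": [("?", "m2")],
}

def conversations(peers):
    """Explore every interleaving of the peers' steps with FIFO queues and
    collect the send order (the watcher's view) of each complete run."""
    names = sorted(peers)
    results = set()

    def explore(pc, queues, sent):
        if all(pc[p] == len(peers[p]) for p in names):
            results.add(tuple(sent))
            return
        for p in names:
            if pc[p] == len(peers[p]):
                continue
            step = peers[p][pc[p]]
            new_pc = dict(pc, **{p: pc[p] + 1})
            if step[0] == "!":
                _, msg, receiver = step
                new_queues = dict(queues, **{receiver: queues[receiver] + (msg,)})
                explore(new_pc, new_queues, sent + [msg])
            elif queues[p][:1] == (step[1],):
                # the expected message is at the head of this peer's queue
                new_queues = dict(queues, **{p: queues[p][1:]})
                explore(new_pc, new_queues, sent)

    explore({p: 0 for p in names}, {p: () for p in names}, [])
    return results

print(sorted(conversations(PEERS)))
```

The search finds both "m1 m2" and "m2 m1": peers A and C never communicate, so nothing forces C to wait for m1.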
Another Non-Realizable Protocol
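[Figure: a conversation protocol over peers A, B, and C with transitions A → B: m1, B → A: m2, and A → C: m3, together with the peers' message queues during an execution in which the watcher records B → A: m2, A → B: m1, A → C: m3]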
Generated conversation: m2 m1 m3
Realizability Conditions
Three sufficient conditions for realizability (no message
content) [Fu, Bultan, Su, CIAA’03, TCS’04]
• Lossless join
– Conversation set should be equivalent to the join of its
projections to each peer
• Synchronous compatible
– When the projections are composed synchronously,
there should not be a state where a peer is ready to
send a message while the corresponding receiver is not
ready to receive
• Autonomous
– At any state, each peer should be able to do only one
of the following: send, receive or terminate
(a peer can still choose among multiple messages)
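As an informal illustration of the autonomous condition, the Python sketch below checks a single peer automaton. It is a simplification: the dictionary encoding and the function name are assumptions of this example, and the precise definition in the cited papers is stated on the projected automata of the protocol.

```python
def is_autonomous(transitions, final_states):
    """transitions maps a state to its outgoing labels such as "!m1" or "?m2".
    A peer is treated as autonomous here if no state offers more than one
    of the three options: send, receive, terminate."""
    states = set(transitions) | set(final_states)
    for state in states:
        labels = transitions.get(state, [])
        options = [
            any(label.startswith("!") for label in labels),  # can send
            any(label.startswith("?") for label in labels),  # can receive
            state in final_states,                           # can terminate
        ]
        if sum(options) > 1:
            return False
    return True

# Hypothetical peers: the first may choose between two sends (allowed),
# the second may either send or receive in state 0 (not allowed).
chooser = {0: ["!m1", "!m3"]}
mixed = {0: ["!m1", "?m2"]}

print(is_autonomous(chooser, final_states={1}))
print(is_autonomous(mixed, final_states={1}))
```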
Realizability Conditions
• The following protocols each fail one of the three conditions but satisfy the other two
[Figure: three protocols, labeled "Not lossless join", "Not synchronous compatible", and "Not autonomous"; their transitions use the labels A → B: m1, B → A: m2, C → D: m2, C → A: m2, A → C: m3]
Bottom-Up Approach
• We know that analyzing conversations of composite web
services is difficult due to asynchronous communication
• The question is:
– Can we identify the composite web services where
asynchronous communication does not create a
problem?
Three Examples, Example 1
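[Figure: requester and server automata]
– Messages from requester to server: r1, r2, e; from server to requester: a1, a2
– Requester transitions: !r1, !r2, ?a1, ?a2, !e
– Server transitions: ?r1, ?r2, !a1, !a2, ?e
• Conversation set is regular: (r1a1 | r2a2)* e
• During all executions the message queues are bounded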
Example 2
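[Figure: requester and server automata]
– Messages from requester to server: r1, r2, e; from server to requester: a1, a2
– Requester transitions: !r1, !r2, ?a1, ?a2, !e
– Server transitions: ?r1, ?r2, !a1, !a2, ?e
• Conversation set is not regular
• Queues are not bounded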
Example 3
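[Figure: requester and server automata]
– Channel labels shown: r1, r2, e (requester to server); a1, a2 (server to requester)
– Requester transitions: !r1, !r2, !r, ?a, !e
– Server transitions: ?r1, ?r2, ?r, !a, ?e
• Conversation set is regular: (r1 | r2 | ra)* e
• Queues are not bounded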
State Spaces of the Three Examples
[Figure: number of states (in thousands, 0 to 1600) versus queue length (1 to 13) for Example 1, Example 2, and Example 3]
• Verification of Examples 2 and 3 is difficult even if
we bound the queue length
• How can we distinguish Examples 1 and 3 (with
regular conversation sets) from 2?
– Synchronizability Analysis
Synchronizability Analysis
• A composite web service is synchronizable, if its
conversation set does not change
– when asynchronous communication is replaced with
synchronous communication
• A composite web service is synchronizable, if it satisfies
the synchronous compatible and autonomous conditions
[Fu, Bultan, Su WWW’04]
Are These Conditions Too Restrictive?
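Problem set, size (#msg, #states, #trans.), and result of the synchronizability analysis:

Source                      Name       #msg  #states  #trans.  Synchronizable?
ISSTA’04                    SAS        9     12       15       yes
IBM Conv. Support Project   CvSetup    4     4        4        yes
IBM Conv. Support Project   MetaConv   4     4        6        no
IBM Conv. Support Project   Chat       2     4        5        yes
IBM Conv. Support Project   Buy        5     5        6        yes
IBM Conv. Support Project   Haggle     8     5        8        no
IBM Conv. Support Project   AMAB       8     10       15       yes
BPEL spec                   shipping   2     3        3        yes
BPEL spec                   Loan       6     6        6        yes
BPEL spec                   Auction    9     9        10       yes
Collaxa.com                 StarLoan   6     7        7        yes
Collaxa.com                 Cauction   5     7        6        yes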
Web Service Analysis Tool (WSAT)
[Figure: WSAT architecture]
– Front end (web services): BPEL (bottom-up) is translated by "BPEL to GFSA" into guarded automata; a conversation protocol (top-down) is read by the GFSA parser into a guarded automaton
– Intermediate representation: guarded automata (GFSA)
– Analysis: synchronizability analysis and realizability analysis, each with a success and a fail outcome; an analysis can also be skipped
– Back end (verification languages): GFSA to Promela with synchronous communication, with bounded queues, or as a single process with no communication
http://www.cs.ucsb.edu/~su/WSAT/
[Fu, Bultan, Su CAV’04]
Guarded Automata Model
• Uses XML messages
• Uses MSL for declaring message types
– MSL (Model Schema Language) is a compact formal
model language which captures core features of XML
Schema
• Uses XPath expressions for guards
– XPath is a language for writing expressions (queries)
that navigate through XML trees and return a set of
answer nodes
The Guarded Automata Model
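//type declaration
request [
  id [int]
]
// message declaration
r2: request
// local variable declaration
last: request
[Figure: requester automaton with transitions !r1, !r2, ?a1, ?a2, !e; the guard below is attached to one of its transitions]
Guard{
  a2/id = last/id =>
    r2/id := last/id + 1,
    last/id := last/id + 1
}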
XML (eXtensible Markup Language)
• XML is a markup language like HTML
• Similar to HTML, XML tags are written as
<tag> followed by </tag>
• HTML vs. XML
– In HTML, tags are used to describe the appearance of
the data
<b> </b> <i> </i> <br> <p> ...
– In XML, tags are used to describe the content of the
data rather than the appearance
<date> </date> <address> </address>
An XML Document and Its Tree
<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>
[Figure: the corresponding tree]
Register
  investorID: VIP01
  requestList
    stockID: 0001
    stockID: 0002
  payment
    accountNum: 0425
• XML documents can be modeled as trees where each internal node corresponds to a
tag and leaf nodes correspond to basic types
XML Schema
• XML provides a standard way to exchange data over the
Internet.
• However, the parties which exchange XML documents still
have to agree on the type of the data
– What are the tags that will appear in the document, in
what order, etc.
• XML Schema is a language for defining XML data types
• MSL (Model Schema Language) is a compact formal
model language which captures core features of XML
Schema
MSL (Model Schema Language)
• Basic MSL syntax
g → ε | b | t[g] | g{m,n}
  | g,g | g&g | g|g
g is an XML type (i.e., an MSL type expression)
ε is the empty sequence
b is a basic type such as string, boolean, int, etc.
t is a tag
m and n are positive integers
[ ] { } & , | are MSL type constructors
MSL Semantics
• t [ g ] denotes a type with root node labeled t with
children of type g
• g { m , n } denotes a sequence of size at least m and at
most n where each member is of type g
• g1 , g2 denotes an ordered sequence where the first
member is of type g1 and the second member is of type g2
• g1 & g2 denotes an unordered sequence where one
member is of type g1 and the other member is of type g2
• g1 | g2 denotes a choice between type g1 and type g2, i.e.,
either type g1 or type g2, but not both
An MSL Type Declaration and an Instance
Register[
investorID[string] ,
requestList[
stockID[int]{1,3}
] ,
payment[
creditCardNum[int] |
accountNum[int]
]
]
<Register>
<investorID>
VIP01
</investorID>
<requestList>
<stockID>
0001
</stockID>
<stockID>
0002
</stockID>
</requestList>
<payment>
<accountNum>
0425
</accountNum>
</payment>
</Register>
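The relation between the type declaration and the instance can be made concrete with a small checker. The Python sketch below is an illustration under stated assumptions, not MSL tooling: types and trees are encoded as tuples invented for this example, the unordered constructor & is left out, and an int leaf is simply a string of digits.

```python
# Hypothetical encodings.
# Types: ("empty",), ("basic", name), ("tag", t, g), ("seq", g1, g2),
#        ("choice", g1, g2), ("rep", g, m, n).
# Trees: a node is (tag, [children]); a leaf value is a str.

def ends(g, items, i):
    """Return every index j such that items[i:j] has type g."""
    kind = g[0]
    if kind == "empty":
        return {i}
    if kind == "basic":
        ok = i < len(items) and isinstance(items[i], str) and (
            g[1] != "int" or items[i].isdigit())
        return {i + 1} if ok else set()
    if kind == "tag":
        if i < len(items) and isinstance(items[i], tuple) and items[i][0] == g[1]:
            children = items[i][1]
            if len(children) in ends(g[2], children, 0):
                return {i + 1}
        return set()
    if kind == "seq":
        return {k for j in ends(g[1], items, i) for k in ends(g[2], items, j)}
    if kind == "choice":
        return ends(g[1], items, i) | ends(g[2], items, i)
    if kind == "rep":
        _, inner, m, n = g
        result, frontier = set(), {i}
        for count in range(n + 1):
            if count >= m:
                result |= frontier       # positions after `count` repetitions
            frontier = {k for j in frontier for k in ends(inner, items, j)}
        return result
    raise ValueError(kind)

def valid(g, node):
    return 1 in ends(g, [node], 0)

REGISTER = ("tag", "Register", ("seq",
    ("tag", "investorID", ("basic", "string")), ("seq",
    ("tag", "requestList", ("rep", ("tag", "stockID", ("basic", "int")), 1, 3)),
    ("tag", "payment", ("choice",
        ("tag", "creditCardNum", ("basic", "int")),
        ("tag", "accountNum", ("basic", "int")))))))

def register(stock_ids, payment):
    return ("Register", [("investorID", ["VIP01"]),
                         ("requestList", [("stockID", [s]) for s in stock_ids]),
                         ("payment", payment)])

instance = register(["0001", "0002"], [("accountNum", ["0425"])])
print(valid(REGISTER, instance))
```

The instance from the slide is accepted; a variant with four stockID elements, or with both payment alternatives, is rejected.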
Translating Guarded Automata to Promela
• We used the SPIN model checker to verify the properties
of conversations
• SPIN is a finite state model checker
– we restricted XML message contents to finite domains
• We translate guarded automata models to Promela (input
language of the SPIN model checker)
– First, translate MSL type declarations to Promela type
declarations
– Then, translate XPath expressions to Promela code
Mapping MSL types to Promela
• Basic types
– integer and boolean types are mapped to Promela
basic types int and bool
– We only allow constant string values and strings are
mapped to enumerated type (mtype) in Promela
• Other type constructors are handled using
– structured types (declared using typedef) in Promela
– or arrays
Mapping MSL type constructors to Promela
• t [ g ] is translated to a typedef declaration
• g { m , n } is translated to an array declaration
• g1 , g2 is translated to a sequence of type declarations
• g1 | g2 is translated to a sequence of type declarations
and an enumerated variable which is used to record which
type is chosen
• g1 & g2 is not handled! We do not handle unordered type
sequence (it can cause state-space explosion)
Example
Register[
investorID[string] ,
requestList[
stockID[int]{1,3}
] ,
payment[
creditCardNum[int] |
accountNum[int]
]
]
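/* investorID[string]: constant strings become mtype values */
typedef t1_investorID {
  mtype stringvalue;
}
/* stockID[int] */
typedef t2_stockID { int intvalue; }
/* requestList[ stockID[int]{1,3} ]: array plus occurrence count */
typedef t3_requestList {
  t2_stockID stockID[3];
  int stockID_occ;
}
typedef t4_accountNum { int intvalue; }
typedef t5_creditCard { int intvalue; }
/* payment[ ... | ... ]: choice records which alternative is used */
mtype {m_accountNum, m_creditCard}
typedef t6_payment {
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;
}
typedef Register {
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}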
XPath
• In order to write specifications or programs that
manipulate XML documents we need:
– an expression language to access values and nodes in
XML documents
• XPath is a language for writing expressions (queries) that
navigate through XML trees and return a set of answer
nodes
• An XPath query defines a function which
– takes an XML tree and a context node (in the same
tree) as input and
– returns a set of nodes (in the same tree) as output
XPath Syntax
Basic XPath syntax:
q → . | .. | b | t | *
  | /q | //q | q / q | q // q
  | q [ q ] | q [ exp ]
q is an XPath query
exp denotes a predicate on basic types, i.e., on the leaf
nodes of the XML tree
b denotes a basic type such as string, boolean, int, etc.
t denotes a tag
XPath Semantics
Given an XML tree and a node n as a context node
. returns n
.. returns the parent of n
Given an XML tree and a set of nodes
* returns all the nodes
b returns the nodes that are of basic type b
t returns the nodes which are labeled with tag t
XPath Semantics Contd.
Starting at the context node
• /q returns the nodes that match q
• //q returns the nodes that match q starting at any descendant
• q1 / q2 returns each node which matches q2 starting at a child of a node which matches q1
• q1 // q2 returns each node which matches q2 starting at a descendant of a node which matches q1
• q1 [ q2 ] applies q2 to the children of the nodes which match q1
• q [ exp ] returns the nodes that match q and have children for which the expression exp evaluates to true
Examples
[Figure: XML tree]
Register
  investorID: VIP01
  requestList
    stockID: 0001
    stockID: 0002
  payment
    accountNum: 0425
//payment/* returns the node labeled accountNum
/Register/requestList/stockID/int returns the
nodes labeled 0001 and 0002
//stockID[int > 1]/int returns the node labeled 0002
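These three queries can be tried out with Python's standard `xml.etree.ElementTree` module. This is only an approximation of the XPath fragment defined in the slides: ElementTree supports a limited XPath subset, it has no node test for basic types such as int, so leaf values are read from the `text` attribute, and the numeric predicate is applied in Python rather than inside the path expression.

```python
import xml.etree.ElementTree as ET

DOC = ("<Register><investorID>VIP01</investorID>"
       "<requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>"
       "<payment><accountNum>0425</accountNum></payment></Register>")
root = ET.fromstring(DOC)   # root is the Register element

# //payment/* : the children of every payment element
payment_children = [e.tag for e in root.findall(".//payment/*")]

# /Register/requestList/stockID/int : the leaf values below each stockID
stock_values = [e.text for e in root.findall("./requestList/stockID")]

# //stockID[int > 1]/int : the numeric filter is applied in Python
large_values = [e.text for e in root.iter("stockID") if int(e.text) > 1]

print(payment_children, stock_values, large_values)
```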
XPath to Promela
• Generate code that evaluates the XPath expression
[Fu, Bultan, Su ISSTA’04]
• Traverse the XPath expression from left to right
– Code generated in each step is inserted into the
BLANK spaces left in the code from the previous step
– A tree representation of the MSL type is used to keep
track of the context of the generated code
• Uses two data structures
– Type tree shows the structure of the corresponding
MSL type
– Abstract statements which are mapped to Promela
code
Statement     Promela Code
IF(v)         if
              :: v -> BLANK
              :: else -> skip
              fi
FOR(v,l,h)    v = l - 1
              do
              :: v < h -> BLANK
                 v++
              :: else -> break
              od
EMPTY         BLANK
INC(v)        v++
SET(v,a)      v = a
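The BLANK mechanism can be imitated with plain string templates. The Python sketch below is a toy illustration, not the WSAT code generator: the `TEMPLATES` table, the `emit` and `insert` helpers, and the statement separators added to the FOR template are assumptions of this example, and the type tree is ignored.

```python
# Promela text for each abstract statement, with BLANK as the hole
# that later statements are inserted into.
TEMPLATES = {
    "IF":    "if\n:: {0} -> BLANK\n:: else -> skip\nfi",
    "FOR":   "{0} = {1} - 1;\ndo\n:: {0} < {2} -> BLANK;\n   {0}++\n:: else -> break\nod",
    "EMPTY": "BLANK",
    "INC":   "{0}++",
    "SET":   "{0} = {1}",
}

def emit(name, *args):
    """Instantiate the template of one abstract statement."""
    return TEMPLATES[name].format(*args)

def insert(outer, inner):
    """Fill the first BLANK of the outer code with the inner code."""
    return outer.replace("BLANK", inner, 1)

# FOR(i1,1,3) containing IF(cond) containing SET(bRes1,1)
code = insert(emit("FOR", "i1", 1, 3),
              insert(emit("IF", "cond"), emit("SET", "bRes1", 1)))
print(code)
```

Nesting the three statements this way yields a loop over i1 whose body tests cond and sets bRes1, which mirrors the shape of the generated Promela code in the slides.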
Type Tree
Register[
investorID[string] &
requestList[
stockID[int]{1,3}
] &
payment[
creditCardNum[int] |
accountNum[int]
]
]
[Figure: type tree for the Register type; nodes are numbered]
1 Register
  2 investorID
    3 string
  4 requestList
    5 stockID (idx: i1)
      6 int
  7 payment
    8 creditCard
      9 int
    10 accountNum
      11 int
Generated Statements
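$register // stockID / [int()>5] / [position() == last()] / int()
[Figure: the abstract statements generated for this query, each annotated with a type tree node (1, 5, or 6) and linked by "Sequence" and "Insert" edges; the statements shown are EMPTY, FOR(i1,1,3), SET(bRes1,0), IF(cond), SET(bRes1,1), IF(bRes1), SET(bRes2,0), IF(i2==i3), SET(bRes2,1), IF(bRes2), EMPTY]
cond ≡ v_register.requestlist.stockID[i1] > 5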
$request//stockID=$register//stockID[int()>5][position()=last()]
/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;
i2=1;
/* pre-calculate the value of last(), store in i3 */
i4=0; i5=1; i3=0;
do
:: i4 < v_register.requestList.stockID_occ ->
     /* compute first predicate */
     bRes3 = false;
     if
     :: v_register.requestList.stockID[i4].intvalue>5 -> bRes3 = true
     :: else -> skip
     fi;
     if
     :: bRes3 -> i5++; i3++;
     :: else -> skip
     fi;
     i4++;
:: else -> break;
od;
$request//stockID=$register//stockID[int()>5][position()=last()]
i1=0;
do
:: i1 < v_register.requestList.stockID_occ ->
     bRes1 = false;
     if
     :: v_register.requestList.stockID[i1].intvalue>5 -> bRes1 = true
     :: else -> skip
     fi;
     if
     :: bRes1 ->
          bRes2 = false;
          if
          :: (i2 == i3) -> bRes2 = true;
          :: else -> skip
          fi;
          if
          :: bRes2 ->
               if
               :: (v_request.stockID.intvalue ==
                   v_register.requestList.stockID[i1].intvalue)
                    -> bResult = true;
               :: else -> skip
               fi
          :: else -> skip
          fi;
          i2++;
     :: else -> skip
     fi;
     i1++;
:: else -> break;
od;
Model Checking Using Promela
• Found subtle errors in an example
– SAS: Stock Analysis Service [Fu, Bultan, Su ISSTA’04]
– 3 peers: Investor, Broker, ResearchDept.
– Investor → Broker: a registerList of stockIDs
– Broker → ResearchDept.:
• relay request (1 stockID per request)
• find the stockID in the latest request, send its
subsequent stockID in registerList
– A repeated stockID causes an error.
– Only discoverable by analysis of XPath expressions
Related Work
• Conversation specification
– IBM Conversation support project
http://www.research.ibm.com/convsupport/
– Conversation support for business process integration
[Hanson, Nandi, Kumaran EDOC’02]
– Orchestrating computations on the world-wide web
[Choi, Garg, Rai, Misra, Vin EuroPar’02]
• Realizability problem
– Realizability of Message Sequence Charts (MSC) [Alur,
Etessami, Yannakakis ICSE’00, ICALP’01]
Related Work
• Verification of web services
– Simulation, verification, composition of web services
using a Petri net model [Narayanan, McIlraith
WWW’02]
– BPEL verification using a process algebra model and
Concurrency Workbench [Koshkina, van Breugel TAVWEB’03]
– Using MSC to model BPEL web services which are
translated to labeled transition systems and verified
using model checking [Foster, Uchitel, Magee, Kramer
ASE’03]
– Model checking Web Service Flow Language
specifications using SPIN [Nakajima ICWE’04]
Current and Future Work
• Extending the source and target languages
• Symbolic analysis
[Fu, Bultan, Su ICWS’04]
• Abstraction
• Design for verification for web services
[Betin-Can, Bultan ’04]
Current and Future Work
[Figure: planned extension of the WSAT architecture]
– Front end (web service specification languages): BPEL, DAML-S, WSCI, conversation protocols, ...
– Translators for bottom-up and for top-down specifications produce the intermediate representation (guarded automata)
– Analysis: automated abstraction, synchronizability analysis, realizability analysis (an analysis can be skipped)
– Translation with synchronous communication, with bounded queue, or with single process and no communication
– Back end (verification languages): Promela, SMV, Action Language, ...