Tools for Automated Verification of Web Services

Modeling Interactions of Web Software
Analyzing Conversations of Web Services
Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara
bultan@cs.ucsb.edu
http://www.cs.ucsb.edu/~bultan
Joint work with
Xiang Fu, Georgia Southwestern State University
Jianwen Su, University of California, Santa Barbara
Going to Lunch at UCSB
• Before Xiang graduated from UCSB, Xiang, Jianwen and I
were using the following protocol for going to lunch:
– Sometime around noon one of us would call another
one by phone and tell him where and when we would
meet for lunch.
– The receiver of this first call would call the remaining
peer and pass the information.
• Let’s call this protocol the First Caller Decides (FCD)
protocol.
Implementation of the FCD Protocol
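[Figure: state machines for the three peers Tevfik, Xiang, and Jianwen. Transition labels per peer:]
– Tevfik: !tx1, !tj1, !tx2, !tj2, ?xt1, ?jt1, ?xt2, ?jt2
– Xiang: !xt1, !xj1, !xt2, !xj2, ?tx1, ?jx1, ?tx2, ?jx2
– Jianwen: !jt1, !jx1, !jt2, !jx2, ?tj1, ?xj1, ?tj2, ?xj2
Message Labels:
– ! means send and ? means receive
– In a label such as tx1, the first letter is the sender (t: from Tevfik), the second letter is the receiver (x: to Xiang), and the digit is the position (1: 1st message)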
FCD Protocol does not Work with Voicemail
• When the university installed a voicemail system, the FCD
protocol started causing problems
– We were showing up at different restaurants at different
times!
• Example scenario: tx1, jx1, xj2
The messages jx1 and xj2 are not consumed
– Note that this scenario is not possible without
voicemail!
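The voicemail scenario can be reproduced with a small exhaustive exploration. The sketch below is only an illustration: the peer state machines are an assumption reconstructed from the textual description of the FCD protocol (a peer either makes a first call, or receives a first call and relays it, or receives a relayed call), and the function names are hypothetical. With `synchronous=False` each peer has a FIFO input queue (the voicemail); with `synchronous=True` a call succeeds only if the receiver can pick it up immediately.

```python
PEERS = "txj"

def third(a, b):
    """The peer that is neither a nor b."""
    return next(p for p in PEERS if p not in (a, b))

def sends(p, state):
    """Messages peer p may send in a local state, paired with the next state."""
    if state == "init":
        return [(p + q + "1", "done") for q in PEERS if q != p]
    if state.startswith("relay:"):
        return [(p + state[-1] + "2", "done")]
    return []

def receive(p, state, msg):
    """Next state of p after consuming msg, or None if p cannot consume it."""
    if state != "init" or msg[1] != p:
        return None
    return "relay:" + third(msg[0], p) if msg[2] == "1" else "done"

def conversations(synchronous):
    """All maximal message sequences seen by a watcher that records sends."""
    found = set()

    def explore(states, queues, sent):
        moved = False
        for p in PEERS:
            for msg, nxt in sends(p, states[p]):
                r = msg[1]
                if synchronous:
                    after = receive(r, states[r], msg)
                    if after is None:
                        continue  # receiver not ready: the call cannot happen
                    explore({**states, p: nxt, r: after}, queues, sent + (msg,))
                else:
                    queued = {**queues, r: queues[r] + (msg,)}
                    explore({**states, p: nxt}, queued, sent + (msg,))
                moved = True
            if not synchronous and queues[p]:
                after = receive(p, states[p], queues[p][0])
                if after is not None:
                    rest = {**queues, p: queues[p][1:]}
                    explore({**states, p: after}, rest, sent)
                    moved = True
        if not moved:
            found.add(sent)

    explore({p: "init" for p in PEERS}, {p: () for p in PEERS}, ())
    return found

sync_set = conversations(synchronous=True)
async_set = conversations(synchronous=False)
print(("tx1", "jx1", "xj2") in sync_set, ("tx1", "jx1", "xj2") in async_set)
```

Under these assumptions the synchronous exploration yields only two-message conversations, while the queued version also admits tx1, jx1, xj2, where jx1 and xj2 stay in the queues unconsumed.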
A Different Lunch Protocol
• Jianwen suggested that we change our lunch protocol as
follows:
– As the most senior researcher among us Jianwen
would make the first call to either Xiang or Tevfik and
tell when and where we would meet for lunch.
– Then, the receiver of this call would pass the
information to the other peer.
– Let’s call this protocol the Jianwen Decides (JD)
protocol
Implementation of the JD Protocol
[Figure: state machines for the three peers. Transition labels per peer:]
– Tevfik: ?jt, !tx, ?xt
– Xiang: ?jx, !xt, ?tx
– Jianwen: !jt, !jx
• JD protocol works fine with voicemail!
Conversation Protocols
• The FCD and JD protocols specify a set of conversations
• The implementations I showed are supposed to generate
the set of conversations specified by these protocols
• We can specify the set of conversations without showing
how the peers implement them; we call such a
specification a conversation protocol
FCD and JD Conversation Protocols
[Figure: each protocol drawn as an automaton whose transitions are labeled with the messages listed in its conversation set.]
FCD Protocol
Conversation set:
{(tx1, xj2), (tj1, jx2), (xt1, tj2),
(xj1, jt2), (jt1, tx2), (jx1, xt2)}
JD Protocol
Conversation set:
{(jt, tx), (jx, xt)}
Observations & Questions
• The implementation of the FCD protocol behaves
differently with synchronous and asynchronous
communication whereas the implementation of the JD
protocol behaves the same.
– Can we find a way to identify such implementations?
• The implementation of the FCD protocol does not obey the
FCD protocol if asynchronous communication is used,
whereas the implementation of the JD protocol obeys the
JD protocol even if asynchronous communication is used.
– Given a conversation protocol can we figure out if there
is an implementation which generates the same
conversation set?
Synchronizability and Realizability Analyses
• We formalized these observations and questions using
synchronizability and realizability analyses
– The implementation of the JD protocol is
synchronizable but the implementation of the FCD
protocol is not synchronizable
– The JD protocol is realizable but the FCD protocol is
not realizable
Outline
• Web Service Composition Model
• Capturing Global Behaviors
– Conversations
• Top-Down vs. Bottom-Up Specification and Verification
– Realizability vs. Synchronizability
• XML messaging
– MSL, XPath
– Translation to Promela
• Web Service Analysis Tool
• Conclusions and Future Work
Characteristics of Web Services
Web Service Standards
– Interaction: WS-CDL
– Behavior: BPEL4WS
– Interface: WSDL
– Message: SOAP
– Type: XML Schema
– Data: XML
Implementation Platforms
– Microsoft .Net, Sun J2EE
• Loosely coupled, interaction through standardized
interfaces
• Standardized data transmission via XML
• Asynchronous messaging
• Platform independent (.NET, J2EE)
Challenges in Verification of Web Services
• Distributed nature, no central control
– How do we model the global behavior?
– How do we specify the global properties?
• Asynchronous messaging introduces undecidability in
analysis
– How do we check the global behavior?
– How do we enforce the global behavior?
• XML data manipulation
– How do we specify the XML messages?
– How do we verify properties related to data?
A Model for Composite Web Services
• A composite web service consists of
– a finite set of peers
• Lunch example: T, X, J
– and a finite set of message classes
• Lunch example (JD protocol): jt, tx, jx, xt
[Figure: Peer T sends tx to Peer X, Peer X sends xt to Peer T, and Peer J sends jt to Peer T and jx to Peer X.]
Communication Model
• We assume that the messages among the peers are
exchanged using reliable and asynchronous messaging
– FIFO and unbounded message queues
[Figure: tx messages sent by Peer T wait in the FIFO queue of Peer X.]
• This model is similar to industry efforts such as
– JMS (Java Message Service)
– MSMQ (Microsoft Message Queuing Service)
Conversations
• A virtual watcher records the messages as they are sent
[Figure: Peers T, X, and J exchange messages such as jt and tx; the Watcher records each message as it is sent.]
• A conversation is a sequence of messages the
watcher sees during an execution
[Bultan, Fu, Hull, Su WWW’03]
Effects of Asynchronous Communication
• Question: Given a composite web service, is the set of
conversations a regular set?
• Even when messages do not have any content and the
peers are finite state machines the conversation set may
not be regular
• Reason: asynchronous communication with
unbounded queues
• Bounded queues or synchronous communication
⇒ Conversation set is always regular
Properties of Conversations
• The notion of conversation enables us to reason about
temporal properties of the composite web services
• LTL framework extends naturally to conversations
– LTL temporal operators
X (neXt), U (Until), G (Globally), F (Future)
– Atomic properties
Predicates on message classes (or contents)
Example: G ( payment ⇒ F receipt )
• Model checking problem: Given an LTL property, does
the conversation set satisfy the property?
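As a rough illustration of what such a property says about a single conversation, the sketch below checks G ( trigger ⇒ F response ) over a finite message sequence. This is a simplification: the talk's framework is LTL over conversation sets (with infinite sequences handled separately), whereas here the finite-trace reading, the function name, and the "order" message are assumptions made only for the example.

```python
def holds_globally_eventually(conversation, trigger, response):
    """Finite-trace reading of G(trigger => F response).

    Every occurrence of `trigger` must be followed, at the same or a later
    position, by an occurrence of `response`.
    """
    return all(
        response in conversation[i:]
        for i, message in enumerate(conversation)
        if message == trigger
    )

good = ["order", "payment", "receipt"]
bad = ["payment", "receipt", "payment"]
print(holds_globally_eventually(good, "payment", "receipt"))
print(holds_globally_eventually(bad, "payment", "receipt"))
```

A model checker answers the same question for every conversation in the set at once, rather than for one recorded sequence.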
Bottom-Up vs. Top-Down
Bottom-up approach
• Specify the behavior of each peer
• The global communication behavior (conversation set) is
implicitly defined based on the composed behavior of the
peers
• Global communication behavior is hard to understand and
analyze
Top-down approach
• Specify the global communication behavior (conversation
set) explicitly as a protocol
• Ensure that the conversations generated by the peers
obey the protocol
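[Figure: the two verification settings. In one, a conversation schema made of Peers T, X, and J (transitions ?jt, !tx, ?xt, !jt, !jx, ?jx, ?tx, !xt, with an input queue per peer) is observed by the virtual watcher and checked against the LTL property G(tx ⇒ F(xt)). In the other, a conversation protocol over the messages jt, jx, tx, and xt is checked against the same LTL property.]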
Conversation Protocols
• Conversation Protocol:
– An automaton that accepts the desired conversation set
• A conversation protocol is a contract agreed by all peers
– Each peer must act according to the protocol
• For reactive protocols with infinite message sequences we
use:
– Büchi automata which accept infinite strings
• For specifying message contents, we use:
– Guarded automata
– Guards are constraints on the message contents
Synthesize Peer Implementations
• Conversation protocol specifies the global communication
behavior
– How do we implement the peers?
• How do we obtain the contracts that peers have to
obey from the global contract specified by the
conversation protocol?
• Project the global protocol to each peer
– By dropping unrelated messages for each peer
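Projection can be sketched on the JD protocol. The example below is a simplification: it projects a finite conversation set rather than the protocol automaton, and the helper name `project` is hypothetical. Message names follow the talk's convention (first letter sender, second letter receiver).

```python
def project(conversation, peer):
    """Keep only the messages the peer sends (!) or receives (?)."""
    local = []
    for msg in conversation:
        sender, receiver = msg[0], msg[1]
        if sender == peer:
            local.append("!" + msg)
        elif receiver == peer:
            local.append("?" + msg)
        # messages between the other two peers are dropped
    return tuple(local)

JD = {("jt", "tx"), ("jx", "xt")}
projections = {p: {project(c, p) for c in JD} for p in "txj"}
for peer, behaviors in projections.items():
    print(peer, sorted(behaviors))
```

The resulting local behaviors use the same labels as the JD implementation shown earlier: Tevfik either receives jt and sends tx or receives xt, and Jianwen only sends.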
Interesting Question
Conversations specified by the conversation protocol
= ?
Conversations generated by the projected services
If this equality holds the conversation protocol is realizable
Are there conditions which ensure the equivalence?
Realizability Problem
• Not all conversation protocols are realizable!
Conversation protocol: A→B: m1 followed by C→D: m2
Projection of the conversation protocol to the peers:
– Peer A: !m1
– Peer B: ?m1
– Peer C: !m2
– Peer D: ?m2
Conversation “m2 m1” will be generated by all peer
implementations which follow the protocol
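The reason this protocol is not realizable can be seen by running the four projected peers with queues. The sketch below assumes each peer is just a fixed list of actions (A: !m1, B: ?m1, C: !m2, D: ?m2) and enumerates every interleaving; the names `PEERS`, `RECEIVER`, and `conversations` are hypothetical.

```python
PROTOCOL = {("m1", "m2")}
PEERS = {"A": ["!m1"], "B": ["?m1"], "C": ["!m2"], "D": ["?m2"]}
RECEIVER = {"m1": "B", "m2": "D"}

def conversations():
    """All send orders a watcher can observe when the peers run asynchronously."""
    found = set()

    def explore(pos, queues, sent):
        moved = False
        for peer, actions in PEERS.items():
            if pos[peer] == len(actions):
                continue  # this peer has finished
            kind, msg = actions[pos[peer]][0], actions[pos[peer]][1:]
            step = {**pos, peer: pos[peer] + 1}
            if kind == "!":
                r = RECEIVER[msg]
                explore(step, {**queues, r: queues[r] + (msg,)}, sent + (msg,))
                moved = True
            elif queues[peer][:1] == (msg,):
                explore(step, {**queues, peer: queues[peer][1:]}, sent)
                moved = True
        if not moved:
            found.add(sent)

    explore({p: 0 for p in PEERS}, {p: () for p in PEERS}, ())
    return found

generated = conversations()
print(sorted(generated), generated == PROTOCOL)
```

Peer C has no way to know whether A has already sent m1, so nothing prevents it from sending m2 first.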
Another Non-Realizable Protocol
[Figure: a conversation protocol among peers A, B, and C with transitions A→B: m1, B→A: m2, and A→C: m3, together with its projections to the peers.]
Watcher records: B→A: m2, A→B: m1, A→C: m3
Generated conversation: m2 m1 m3
Realizability Conditions
Three sufficient conditions for realizability (no message
content) [Fu, Bultan, Su, CIAA’03, TCS’04]
• Lossless join
– Conversation set should be equivalent to the join of its
projections to each peer
• Synchronous compatible
– When the projections are composed synchronously,
there should not be a state where a peer is ready to
send a message while the corresponding receiver is not
ready to receive
• Autonomous
– At any state, each peer should be able to do only one
of the following: send, receive or terminate
(a peer can still choose among multiple messages)
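Of the three conditions, the autonomous condition is the simplest to check on a single peer. The sketch below assumes a hypothetical encoding in which each state maps to a pair (is_final, list of transition labels), with labels written as in the talk (!m for send, ?m for receive). Both example peers are illustrative assumptions, not taken from the figures.

```python
def is_autonomous(peer):
    """True if every state offers only sends, only receives, or only termination."""
    for is_final, labels in peer.values():
        kinds = {label[0] for label in labels}  # '!' or '?'
        if is_final:
            kinds.add("final")
        if len(kinds) > 1:
            return False
    return True

# Assumed shape of Tevfik in the JD protocol: wait for jt or xt, relay jt as tx.
tevfik_jd = {
    "s0": (False, ["?jt", "?xt"]),  # a choice among several receives is fine
    "s1": (False, ["!tx"]),
    "s2": (True, []),
}

# Hypothetical peer that may either send m1 or receive m2 in its initial state.
mixed = {
    "s0": (False, ["!m1", "?m2"]),
    "s1": (True, []),
}

print(is_autonomous(tevfik_jd), is_autonomous(mixed))
```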
Realizability Conditions
• The following protocols each fail one of the three conditions but
satisfy the other two
[Figure: three example protocols built from the transitions A→B: m1, B→A: m2, C→D: m2, C→A: m2, and A→C: m3, labeled “Not lossless join”, “Not synchronous compatible”, and “Not autonomous”.]
Bottom-Up Approach
• We know that analyzing conversations of composite web
services is difficult due to asynchronous communication
– Model checking for conversation properties is
undecidable even for finite state peers
• The question is:
– Can we identify the composite web services where
asynchronous communication does not create a
problem?
Three Examples, Example 1
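[Figure: requester and server state machines. The requester sends r1, r2, and e to the server; the server sends a1 and a2 to the requester.]
– requester transitions: !r1, !r2, ?a1, ?a2, !e
– server transitions: ?r1, ?r2, !a1, !a2, ?e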
• Conversation set is regular: (r1a1 | r2a2)* e
• During all executions the message queues are bounded
Example 2
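[Figure: requester and server state machines. The requester sends r1, r2, and e to the server; the server sends a1 and a2 to the requester.]
– requester transitions: !r1, !r2, ?a1, ?a2, !e
– server transitions: ?r1, ?r2, !a1, !a2, ?e
• Conversation set is not regular
• Queues are not bounded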
Example 3
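[Figure: requester and server state machines; the channel labels in the figure read r1, r2, e and a1, a2.]
– requester transitions: !r1, !r2, !r, ?a, !e
– server transitions: ?r1, ?r2, ?r, !a, ?e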
• Conversation set is regular: (r1 | r2 | ra)* e
• Queues are not bounded
State Spaces of the Three Examples
[Figure: number of states in thousands (0 to 1600) plotted against queue length (1 to 13) for Example 1, Example 2, and Example 3.]
• Verification of Examples 2 and 3 is difficult even if
we bound the queue length
• How can we distinguish Examples 1 and 3 (with
regular conversation sets) from 2?
– Synchronizability Analysis
Synchronizability Analysis
• A composite web service is synchronizable, if its
conversation set does not change
– when asynchronous communication is replaced with
synchronous communication
• If a composite web service is synchronizable we can
check the properties about its conversations using
synchronous communication semantics
– For finite state peers this is a finite state model
checking problem
Synchronizability Analysis
• A composite web service is synchronizable, if it satisfies
the synchronous compatible and autonomous conditions
[Fu, Bultan, Su WWW’04, TSE]
• Connection between realizability and synchronizability:
– A conversation protocol is realizable if its projections to
peers are synchronizable and the protocol itself
satisfies the lossless join condition
Are These Conditions Too Restrictive?
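Problem Set                               Size                        Pass?
Source                      Name          #msg   #states   #trans.
ISSTA’04                    SAS              9        12        15    yes
IBM Conv. Support Project   CvSetup          4         4         4    yes
IBM Conv. Support Project   MetaConv         4         4         6    no
IBM Conv. Support Project   Chat             2         4         5    yes
IBM Conv. Support Project   Buy              5         5         6    yes
IBM Conv. Support Project   Haggle           8         5         8    no
IBM Conv. Support Project   AMAB             8        10        15    yes
BPEL spec                   shipping         2         3         3    yes
BPEL spec                   Loan             6         6         6    yes
BPEL spec                   Auction          9         9        10    yes
Collaxa.com                 StarLoan         6         7         7    yes
Collaxa.com                 Cauction         5         7         6    yes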
Web Service Analysis Tool (WSAT)
[Figure: WSAT architecture.]
– Front End: Web Services in BPEL (bottom-up) go through a BPEL to GFSA translation; a Conversation Protocol (top-down) goes through a GFSA parser
– Intermediate Representation: guarded automata (bottom-up) or a guarded automaton (top-down)
– Analysis: Synchronizability Analysis and Realizability Analysis (edges labeled success, fail, and skip)
– Back End: GFSA to Promela (synchronous communication), GFSA to Promela (bounded queue), GFSA to Promela (single process, no communication)
– Verification Languages: Promela
http://www.cs.ucsb.edu/~su/WSAT/
[Fu, Bultan, Su CAV’04]
Guarded Automata Model
• Uses XML messages
• Uses MSL for declaring message types
– MSL (Model Schema Language) is a compact formal
model language which captures core features of XML
Schema
• Uses XPath expressions for guards
– XPath is a language for writing expressions (queries)
that navigate through XML trees and return a set of
answer nodes
The Guarded Automata Model
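[Figure: requester state machine with transitions !r1, !r2, ?a1, ?a2, and !e, annotated with the declarations and guard below.]
// type declaration
request [
id [int]
]
// message declaration
r2: request
// local variable declaration
last: request
Guard{
a2/id = last/id =>
r2/id := last/id + 1,
last/id := last/id + 1
}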
XML (eXtensible Markup Language)
• XML is a markup language like HTML
• Similar to HTML, XML tags are written as
<tag> followed by </tag>
• HTML vs. XML
– In HTML, tags are used to describe the appearance of
the data
<b> </b> <i> </i> <br> <p> ...
– In XML, tags are used to describe the content of the
data rather than the appearance
<date> </date> <address> </address>
An XML Document and Its Tree
<Register>
<investorID>
VIP01
</investorID>
<requestList>
<stockID>
0001
</stockID>
<stockID>
0002
</stockID>
</requestList>
<payment>
<accountNum>
0425
</accountNum>
</payment>
</Register>
[Figure: the tree of this document. The root Register has children investorID, requestList, and payment; investorID has the leaf VIP01; requestList has two stockID children with leaves 0001 and 0002; payment has the child accountNum with leaf 0425.]
• XML documents can be modeled as trees
where each internal node corresponds to a
tag and leaf nodes correspond to basic types
XML Schema
• XML provides a standard way to exchange data over the
Internet.
• However, the parties which exchange XML documents still
have to agree on the type of the data
– What are the tags that will appear in the document, in
what order, etc.
• XML Schema is a language for defining XML data types
• MSL (Model Schema Language) is a compact formal
model language which captures core features of XML
Schema
MSL (Model Schema Language)
• Basic MSL syntax
g → ε | b | t[g] | g{m,n}
| g,g | g&g | g|g
g is an XML type (i.e., an MSL type expression)
ε is the empty sequence
b is a basic type such as string, boolean, int, etc.
t is a tag
m and n are positive integers
[ ] { } & , | are MSL type constructors
MSL Semantics
• t [ g ] denotes a type with root node labeled t with
children of type g
• g { m , n } denotes a sequence of size at least m and at
most n where each member is of type g
• g1 , g2 denotes an ordered sequence where the first
member is of type g1 and the second member is of type g2
• g1 & g2 denotes an unordered sequence where one
member is of type g1 and the other member is of type g2
• g1 | g2 denotes a choice between type g1 and type g2, i.e.,
either type g1 or type g2, but not both
An MSL Type Declaration and an Instance
Register[
investorID[string] ,
requestList[
stockID[int]{1,3}
] ,
payment[
creditCardNum[int] |
accountNum[int]
]
]
<Register>
<investorID>
VIP01
</investorID>
<requestList>
<stockID>
0001
</stockID>
<stockID>
0002
</stockID>
</requestList>
<payment>
<accountNum>
0425
</accountNum>
</payment>
</Register>
Translating Guarded Automata to Promela
• We used the SPIN model checker to verify the properties
of conversations
• SPIN is a finite state model checker
– we restricted XML message contents to finite domains
• We translate guarded automata models to Promela (input
language of the SPIN model checker)
– First, translate MSL type declarations to Promela type
declarations
– Then, translate XPath expressions to Promela code
Mapping MSL types to Promela
• Basic types
– integer and boolean types are mapped to Promela
basic types int and bool
– We only allow constant string values and strings are
mapped to enumerated type (mtype) in Promela
• Other type constructors are handled using
– structured types (declared using typedef) in Promela
– or arrays
Mapping MSL type constructors to Promela
• t [ g ] is translated to a typedef declaration
• g { m , n } is translated to an array declaration
• g1 , g2 is translated to a sequence of type declarations
• g1 | g2 is translated to a sequence of type declarations
and an enumerated variable which is used to record which
type is chosen
• g1 & g2 is not handled! We do not handle unordered type
sequences (they can cause state-space explosion)
Example
Register[
investorID[string] ,
requestList[
stockID[int]{1,3}
] ,
payment[
creditCardNum[int] |
accountNum[int]
]
]
typedef t1_investorID {
  mtype stringvalue;            /* strings are mapped to mtype constants */
}
typedef t2_stockID { int intvalue; }
typedef t3_requestList {
  t2_stockID stockID[3];        /* stockID[int]{1,3}: at most 3 entries */
  int stockID_occ;              /* number of stockID entries in use */
}
typedef t4_accountNum { int intvalue; }
typedef t5_creditCard { int intvalue; }
mtype { m_accountNum, m_creditCard }
typedef t6_payment {
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;                 /* records which alternative is chosen */
}
typedef Register {
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}
XPath
• In order to write specifications or programs that
manipulate XML documents we need:
– an expression language to access values and nodes in
XML documents
• XPath is a language for writing expressions (queries) that
navigate through XML trees and return a set of answer
nodes
• An XPath query defines a function which
– takes an XML tree and a context node (in the same
tree) as input and
– returns a set of nodes (in the same tree) as output
XPath Syntax
Basic XPath syntax:
q → . | .. | b | t | *
| /q | //q | q / q | q // q
| q [ q ] | q [ exp ]
q is an XPath query
exp denotes a predicate on basic types, i.e., on the leaf
nodes of the XML tree
b denotes a basic type such as string, boolean, int, etc.
t denotes a tag
XPath Semantics
Given an XML tree and a node n as a context node
. returns n
.. returns the parent of n
Given an XML tree and a set of nodes
* returns all the nodes
b returns the nodes that are of basic type b
t returns the nodes which are labeled with tag t
XPath Semantics Contd.
Starting at the context node
• /q returns the nodes that match q
• //q returns the nodes that match q starting at any descendant
• q1 / q2 returns each node which matches q2 starting at a child of a node which matches q1
• q1 // q2 returns each node which matches q2 starting at a descendant of a node which matches q1
• q1 [ q2 ] applies q2 to the children of the nodes which match q1
• q [ exp ] returns the nodes that match q and have children for which the expression exp evaluates to true
Examples
[Figure: XML tree. The root Register has children investorID (leaf VIP01), requestList (two stockID children with leaves 0001 and 0002), and payment (child accountNum with leaf 0425).]
//payment/* returns the node labeled accountNum
/Register/requestList/stockID/int returns the nodes labeled 0001 and 0002
//stockID[int > 1]/int returns the node labeled 0002
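The three queries can be approximated with Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. The sketch below is not the talk's XPath fragment: ElementTree has no nodes for basic types, so the `int` step is replaced by reading element text, and the `int > 1` predicate is written as an ordinary Python filter.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<Register>"
    "<investorID>VIP01</investorID>"
    "<requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>"
    "<payment><accountNum>0425</accountNum></payment>"
    "</Register>"
)
root = ET.fromstring(DOC)  # root is the Register element

# //payment/*  : every child of any payment node
payment_children = [n.tag for n in root.findall(".//payment/*")]

# /Register/requestList/stockID/int  : the values under each stockID
stock_values = [n.text for n in root.findall("./requestList/stockID")]

# //stockID[int > 1]/int  : values of the stockID nodes whose integer exceeds 1
big_stock_values = [n.text for n in root.findall(".//stockID") if int(n.text) > 1]

print(payment_children, stock_values, big_stock_values)
```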
XPath to Promela
• Generate code that evaluates the XPath expression
[Fu, Bultan, Su ISSTA’04]
• Traverse the XPath expression from left to right
– Code generated in each step is inserted into the
BLANK spaces left in the code from the previous step
– A tree representation of the MSL type is used to keep
track of the context of the generated code
• Uses two data structures
– Type tree shows the structure of the corresponding
MSL type
– Abstract statements which are mapped to Promela
code
Abstract statements and the Promela code they are mapped to:
– IF(v):
if
:: v -> BLANK
:: else -> skip
fi
– FOR(v,l,h):
v = l - 1
do
:: v < h -> BLANK
v++
:: else -> break
od
– EMPTY: BLANK
– INC(v): v++
– SET(v,a): v = a
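The BLANK mechanism amounts to template filling. The sketch below is a minimal illustration, not WSAT's code generator: the `TEMPLATES` table paraphrases the statement table above (with statement separators added as an assumption), and the helper names `emit` and `insert` are hypothetical.

```python
TEMPLATES = {
    "IF": "if\n:: {0} -> BLANK\n:: else -> skip\nfi",
    "FOR": "{0} = {1} - 1;\ndo\n:: {0} < {2} -> BLANK;\n   {0}++\n:: else -> break\nod",
    "EMPTY": "BLANK",
    "INC": "{0}++",
    "SET": "{0} = {1}",
}

def emit(name, *args):
    """Instantiate an abstract statement as Promela text (may contain BLANK)."""
    return TEMPLATES[name].format(*args)

def insert(outer, inner):
    """Fill the first BLANK left in `outer` with the code in `inner`."""
    return outer.replace("BLANK", inner, 1)

# IF(bRes1) whose body is SET(bRes2,0)
code = insert(emit("IF", "bRes1"), emit("SET", "bRes2", "0"))
print(code)
```

Each traversal step would call `insert` on the code produced so far, which is why statements such as IF and FOR leave a BLANK behind while SET and INC do not.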
Type Tree
Register[
investorID[string] &
requestList[
stockID[int]{1,3}
] &
payment[
creditCardNum[int] |
accountNum[int]
]
]
[Figure: type tree with nodes numbered 1 to 11: 1 Register; 2 investorID; 3 string; 4 requestList; 5 stockID (idx: i1); 6 int; 7 payment; 8 creditCard; 9 int; 10 accountNum; 11 int.]
Generated Statements
$register // stockID / [int()>5] / [position() == last()] / int()
[Figure: tree of generated statements, annotated with type-tree node numbers (1, 5, 6) and connected by Sequence and Insert edges: EMPTY, FOR(i1,1,3), SET(bRes1,0), IF(cond), SET(bRes1,1), IF(bRes1), SET(bRes2,0), IF(i2==i3), SET(bRes2,1), IF(bRes2), EMPTY.]
cond ≡ v_register.requestlist.stockID[i1] > 5
$request//stockID=$register//stockID[int()>5][position()=last()]
/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;
i2=1;
/* pre-calculate the value of last(), store in i3 */
i4=0; i5=1; i3=0;
do
:: i4 < v_register.requestList.stockID_occ ->
     /* compute first predicate */
     bRes3 = false;
     if
     :: v_register.requestList.stockID[i4].intvalue>5 -> bRes3 = true
     :: else -> skip
     fi;
     if
     :: bRes3 -> i5++; i3++;
     :: else -> skip
     fi;
     i4++;
:: else -> break;
od;
$request//stockID=$register//stockID[int()>5][position()=last()]
i1=0;
do
:: i1 < v_register.requestList.stockID_occ ->
     bRes1 = false;
     if
     :: v_register.requestList.stockID[i1].intvalue>5 -> bRes1 = true
     :: else -> skip
     fi;
     if
     :: bRes1 ->
          bRes2 = false;
          if
          :: (i2 == i3) -> bRes2 = true;
          :: else -> skip
          fi;
          if
          :: bRes2 ->
               if
               :: (v_request.stockID.intvalue ==
                   v_register.requestList.stockID[i1].intvalue)
                    -> bResult = true;
               :: else -> skip
               fi
          :: else -> skip
          fi;
          i2++;
     :: else -> skip
     fi;
     i1++;
:: else -> break;
od;
Model Checking Using Promela
• Found subtle errors in an example
– SAS: Stock Analysis Service [Fu, Bultan, Su ISSTA’04]
– 3 peers: Investor, Broker, ResearchDept.
– Investor → Broker: a registerList of stockIDs
– Broker → ResearchDept.:
• relay request (1 stockID per request)
• find the stockID in the latest request, send its
subsequent stockID in registerList
– A repeated stockID causes an error.
– Only discoverable by analysis of XPath expressions
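One plausible reading of the repeated-stockID error can be sketched in a few lines. This is an assumption made for illustration, not the SAS specification itself: the Broker is modeled as looking up the first occurrence of the latest stockID in the list and sending the entry after it. The function names and the `max_requests` cutoff are hypothetical.

```python
def next_stock_id(register_list, latest):
    """Return the stockID that follows `latest` in the list, or None at the end."""
    i = register_list.index(latest)  # index of the FIRST occurrence
    return register_list[i + 1] if i + 1 < len(register_list) else None

def relay_all(register_list, max_requests=10):
    """Relay one stockID per request until the list ends (or the cutoff is hit)."""
    sent = [register_list[0]]
    while len(sent) < max_requests:
        nxt = next_stock_id(register_list, sent[-1])
        if nxt is None:
            break
        sent.append(nxt)
    return sent

print(relay_all([1, 2, 3]))
print(relay_all([1, 2, 1, 3]))
```

With distinct stockIDs every entry is relayed once. With the repeated entry, the lookup keeps returning to the first occurrence, so under this reading the last stockID is never sent.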
Related Work
• Conversation specification
– IBM Conversation support project
http://www.research.ibm.com/convsupport/
– Conversation support for business process integration
[Hanson, Nandi, Kumaran EDOC’02]
– Orchestrating computations on the world-wide web
[Choi, Garg, Rai, Misra, Vin EuroPar’02]
• Realizability problem
– Realizability of Message Sequence Charts (MSC) [Alur,
Etessami, Yannakakis ICSE’00, ICALP’01]
Related Work
• Verification of web services
– Simulation, verification, composition of web services
using a Petri net model [Narayanan, McIlraith
WWW’02]
– BPEL verification using a process algebra model and
Concurrency Workbench [Koshkina, van Breugel TAVWEB’03]
– Using MSC to model BPEL web services which are
translated to labeled transition systems and verified
using model checking [Foster, Uchitel, Magee, Kramer
ASE’03]
– Model checking Web Service Flow Language
specifications using SPIN [Nakajima ICWE’04]
Current and Future Work
• Extending the source and target languages
• Symbolic analysis
[Fu, Bultan, Su ICWS’04, JWSR]
• Abstraction
• Design for verification for web services
[Betin-Can, Bultan WWW’05, ICWS’05]
Current and Future Work
[Figure: planned tool architecture.]
– Web Service Specification Languages: BPEL, DAML-S, WS-CDL, Conversation Protocols, ...
– Front End: translator for bottom-up specifications; translator for top-down specifications
– Intermediate Representation: guarded automata (bottom-up) or a guarded automaton (top-down), with Automated Abstraction
– Analysis: Synchronizability Analysis and Realizability Analysis (edges labeled success, fail, and skip)
– Back End: translation with synchronous communication; translation with bounded queue; translation with single process, no communication
– Verification Languages: Promela, SMV, Action Language, ...
THE END