
Model Checking XML Manipulating Software
Xiang Fu
Tevfik Bultan
Jianwen Su
Department of Computer Science
University of California, Santa Barbara
{fuxiang,bultan,su}@cs.ucsb.edu
Web Services
• Web service standards (figure: layered stack)
– Composition / Interaction: WSCI, BPEL4WS
– Service: WSDL
– Message: SOAP
– Type: XML Schema
– Data: XML
• Implementation platforms: Microsoft .Net, Sun J2EE
• Loosely coupled, interaction through standardized
interfaces
• Standardized data transmission via XML
• Asynchronous messaging
• Platform independent (.NET, J2EE)
Outline
• An Example: Stock Analysis Service
• Capturing Global Behaviors
– Conversations, Conversation Protocols
• Web Service Analysis Tool
• XML Messaging
– XML data, MSL types, XPath expressions
• Model Checking Conversation Protocols
– Translation to Promela
• Conclusions and Future Work
An Example: Stock Analysis Service (SAS)
• SAS is a composite web service
– a finite set of peers: Investor (Inv), Stock Broker (SB),
and Research Department (RD)
– and a finite set of message classes: register, ack,
cancel, accept, ...
[Figure: the three peers Investor (Inv), Stock Broker (SB), and Research Dept. (RD), connected by arrows labeled with the message classes: register, ack, cancel; accept, reject, bill; request, terminate; report]
Communication Model
• We assume that the messages among the peers are
exchanged through reliable and asynchronous
messaging
– FIFO and unbounded message queues
[Figure: two req messages waiting in the message queue between Stock Broker (SB) and Research Dept. (RD)]
• This model is similar to industry efforts such as
– JMS (Java Message Service)
– MSMQ (Microsoft Message Queuing Service)
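The communication model above can be sketched in a few lines of Python. This is only an illustration, not part of the tool: the `Peer` class and its `send` and `receive` methods are hypothetical names, and each unbounded FIFO queue is modeled with `collections.deque`.

```python
from collections import deque


class Peer:
    """A peer with one unbounded FIFO input queue (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # unbounded: no maxlen is set

    def send(self, receiver, message):
        # Asynchronous send: the sender never blocks; the message is
        # appended to the tail of the receiver's queue.
        receiver.queue.append(message)

    def receive(self):
        # Reliable FIFO delivery: messages leave in arrival order.
        return self.queue.popleft()


sb = Peer("SB")
rd = Peer("RD")
sb.send(rd, "request")
sb.send(rd, "terminate")
received = [rd.receive(), rd.receive()]
```

Because a send only appends to a queue, the sender can run ahead of the receiver by any number of messages, which is what makes the state space of the composition unbounded.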
Conversations
• A virtual watcher records the messages as they are sent
[Figure: Investor (Inv), Stock Broker (SB), and Research Dept. (RD) exchange messages such as register, accept, ack, and bill; the Watcher records the sequence reg acc req rep ack bil ter]
• A conversation is a sequence of messages the
watcher sees during an execution
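A minimal sketch of the watcher idea follows. The `Watcher` class and the `send` function are hypothetical, and the receiver assigned to each message in the sample execution is an assumption made for illustration.

```python
from collections import deque


class Watcher:
    """Records every message at the moment it is sent (hypothetical helper)."""

    def __init__(self):
        self.conversation = []

    def record(self, message):
        self.conversation.append(message)


def send(watcher, queues, receiver, message):
    # The watcher observes the send event, not the later receive,
    # so queueing delays do not change the recorded order.
    watcher.record(message)
    queues[receiver].append(message)


watcher = Watcher()
queues = {"Inv": deque(), "SB": deque(), "RD": deque()}

# One execution of SAS; the receiver of each message is an assumption.
for receiver, message in [
    ("SB", "register"), ("Inv", "accept"), ("RD", "request"),
    ("Inv", "report"), ("SB", "ack"), ("Inv", "bill"), ("RD", "terminate"),
]:
    send(watcher, queues, receiver, message)

conversation = watcher.conversation
```

The recorded list is the conversation for this one execution; the conversation set of the service is the set of all such lists over all executions.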
Conversation Protocols
• Conversation Protocol: An automaton that accepts the
desired conversation set
SAS conversation protocol
[Figure: finite automaton with states 1 to 12 and transitions labeled register, accept, reject, request, report, ack, cancel, bill, and terminate]
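To illustrate what "an automaton that accepts the desired conversation set" means, here is a small acceptance check. The transition relation below is a hypothetical, simplified one with its own state names; it is not the 12-state SAS protocol from the figure.

```python
# Hypothetical, simplified transition relation (NOT the 12-state SAS protocol).
TRANSITIONS = {
    ("q0", "register"): "q1",
    ("q1", "reject"): "q_final",
    ("q1", "accept"): "q2",
    ("q2", "request"): "q3",
    ("q3", "report"): "q4",
    ("q4", "ack"): "q2",
    ("q2", "bill"): "q5",
    ("q5", "terminate"): "q_final",
}
INITIAL = "q0"
FINAL = {"q_final"}


def accepts(conversation):
    """Return True if the deterministic automaton accepts the message sequence."""
    state = INITIAL
    for message in conversation:
        state = TRANSITIONS.get((state, message))
        if state is None:  # no transition for this message: reject
            return False
    return state in FINAL
```

A conversation belongs to the protocol's conversation set exactly when `accepts` returns `True` for it.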
Properties of Conversations
• The notion of conversation enables us to reason about
temporal properties of the composite web services
• LTL framework extends naturally to conversations
– LTL temporal operators
X (neXt), U (Until), G (Globally), F (Future)
– Atomic properties
Predicates on message classes (or contents)
Example: G ( accept ⇒ F bill )
• Model checking problem: Given an LTL property, does
the conversation set satisfy the property?
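The example property can be checked on a single conversation as sketched below. This is only an illustration under a finite-trace reading, which is an assumption; a model checker such as SPIN checks the property over all conversations of the protocol rather than one trace. The function name is hypothetical.

```python
def globally_accept_implies_eventually_bill(conversation):
    """Check G(accept => F bill) on one finite conversation.

    Finite-trace reading (an assumption): "F bill" at position i holds
    if some position j >= i carries the message "bill".
    """
    for i, message in enumerate(conversation):
        if message == "accept" and "bill" not in conversation[i:]:
            return False
    return True
```

A conversation that contains no accept message satisfies the property vacuously.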
Web Service Analysis Tool (WSAT)
[Figure: WSAT architecture]
– Front End (web services): BPEL (bottom-up) is translated by "BPEL to GFSA" into guarded automata; a conversation protocol (top-down) is read by the GFSA parser into a guarded automaton
– Intermediate Representation: guarded automata (GFSA)
– Analysis: Synchronizability Analysis and Realizability Analysis, with outcomes success, fail, or skip
– Back End: GFSA to Promela (synchronous communication), GFSA to Promela (bounded queue), GFSA to Promela (single process, no communication)
– Verification Languages: Promela
• Friday 4:00pm, tool presentation at CAV
• Demonstration Saturday (or anytime you find me with
my laptop)
SAS Guarded Automata
Topdown {
  Schema{
    PeerList{ Investor, Broker, ResearchDept },
    TypeList{ Register ... Accept ... },
    MessageList{
      register{ Investor -> Broker : Register },
      accept{ Broker -> Investor : Accept }, ... }
  },
  GProtocol{
    States{ s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12 },
    InitialState{ s1 }, FinalStates{ s4 },
    TransitionRelation{
      t1{ s1 -> s2 : register, Guard{ true } },
      t2{ s2 -> s5 : accept,
        Guard{ true =>
          $accept[//orderID := $register//orderID] } },
      ...
    }
  }
}
XML (eXtensible Markup Language)
• XML is a markup language like HTML
• Similar to HTML, XML tags are written as
<tag> followed by </tag>
• HTML vs. XML
– In HTML, tags are used to describe the appearance of
the data
<b> </b> <i> </i> ...
– In XML, tags are used to describe the content of the
data rather than the appearance
<date> </date> <address> </address>
• XML documents can be modeled as trees where each
internal node corresponds to a tag, and leaf nodes
correspond to basic types
An XML Document and Its Tree
<Register>
  <investorID>
    VIP01
  </investorID>
  <requestList>
    <stockID>
      0001
    </stockID>
    <stockID>
      0002
    </stockID>
  </requestList>
  <payment>
    <accountNum>
      0425
    </accountNum>
  </payment>
</Register>

Register
  investorID
    VIP01
  requestList
    stockID
      0001
    stockID
      0002
  payment
    accountNum
      0425
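The tree view of the document can be reproduced with Python's standard `xml.etree.ElementTree` module. The `to_tree` helper is a hypothetical name; it represents an internal node as `(tag, children)` and a leaf as `(tag, text value)`.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>
"""


def to_tree(element):
    """Convert an element into (tag, children); a leaf is (tag, text value)."""
    children = list(element)
    if not children:
        return (element.tag, element.text.strip())
    return (element.tag, [to_tree(child) for child in children])


tree = to_tree(ET.fromstring(DOCUMENT.strip()))
```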
MSL (Model Schema Language)
• MSL is a language for defining XML data types
– MSL captures core features of XML Schema
• Basic MSL syntax
g → ε | b | t[g] | g{m,n} | g,g | g&g | g|g
g is an XML type (i.e., an MSL type expression)
ε is the empty sequence
b is a basic type such as string, boolean, int, etc.
t is a tag
m and n are positive integers
[ ] { } & , | are MSL type constructors
MSL Semantics
t[g ]
denotes a type with root node labeled t with children of
type g
g{m,n }
denotes a sequence of size at least m and at most n
where each member is of type g
g1 , g2
denotes an ordered sequence where the first member is
of type g1 and the second member is of type g2
g1 & g2
denotes an unordered sequence where one member is
of type g1 and the other member is of type g2
g1 | g2
denotes a choice between type g1 and type g2, i.e., either
type g1 or type g2, but not both
An MSL Type Declaration and an Instance
Register[
  investorID[string] ,
  requestList[
    stockID[int]{1,3}
  ] ,
  payment[
    creditCardNum[int] |
    accountNum[int]
  ]
]

<Register>
  <investorID>
    VIP01
  </investorID>
  <requestList>
    <stockID>
      0001
    </stockID>
    <stockID>
      0002
    </stockID>
  </requestList>
  <payment>
    <accountNum>
      0425
    </accountNum>
  </payment>
</Register>
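The MSL semantics can be made concrete with a small conformance checker. This is a sketch under stated assumptions, not the tool's implementation: types are encoded as hypothetical Python tuples, elements as `(tag, children)`, leaves as plain Python values, and the unordered constructor `&` is left out.

```python
BASIC = {"string": str, "int": int, "boolean": bool}


def ends(g, nodes, i):
    """Yield every j such that nodes[i:j] is a sequence of type g."""
    kind = g[0]
    if kind == "empty":
        yield i
    elif kind == "basic":
        if i < len(nodes) and type(nodes[i]) is BASIC[g[1]]:
            yield i + 1
    elif kind == "tag":  # t[g]
        _, tag, inner = g
        if i < len(nodes) and isinstance(nodes[i], tuple) and nodes[i][0] == tag:
            children = nodes[i][1]
            if len(children) in ends(inner, children, 0):
                yield i + 1
    elif kind == "repeat":  # g{m,n}
        _, inner, low, high = g
        frontier = {i}  # positions reachable after `count` members
        for count in range(high + 1):
            if count >= low:
                yield from frontier
            frontier = {j for k in frontier for j in ends(inner, nodes, k)}
    elif kind == "seq":  # g1 , g2
        for k in ends(g[1], nodes, i):
            yield from ends(g[2], nodes, k)
    elif kind == "choice":  # g1 | g2
        yield from ends(g[1], nodes, i)
        yield from ends(g[2], nodes, i)


def conforms(g, node):
    """True if the single tree `node` is an instance of type g."""
    return 1 in ends(g, [node], 0)


REGISTER = ("tag", "Register", ("seq",
    ("tag", "investorID", ("basic", "string")),
    ("seq",
        ("tag", "requestList",
            ("repeat", ("tag", "stockID", ("basic", "int")), 1, 3)),
        ("tag", "payment", ("choice",
            ("tag", "creditCardNum", ("basic", "int")),
            ("tag", "accountNum", ("basic", "int")))))))

instance = ("Register", [
    ("investorID", ["VIP01"]),
    ("requestList", [("stockID", [1]), ("stockID", [2])]),
    ("payment", [("accountNum", [425])]),
])
```

Here `conforms(REGISTER, instance)` holds for the instance shown above, while a `requestList` with four `stockID` children would violate the `{1,3}` bound.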
Mapping MSL types to Promela
• Restrictions: no unbounded or unordered sequences, no
string manipulation
• Basic types
– integer and boolean types are mapped to Promela
basic types int and bool
– strings are mapped to enumerated type (mtype) in
Promela
• we only allow constant string values
• Type constructors are handled using
– structured types (declared using typedef) in Promela
– or arrays
Example
Register[
  investorID[string] ,
  requestList[
    stockID[int]{1,3}
  ] ,
  payment[
    creditCardNum[int] |
    accountNum[int]
  ]
]

typedef t1_investorID{
  mtype stringvalue;}
typedef t2_stockID{int intvalue;}
typedef t3_requestList{
  t2_stockID stockID [3];
  int stockID_occ;
}
typedef t4_accountNum{int intvalue;}
typedef t5_creditCard{int intvalue;}
mtype {m_accountNum, m_creditCard}
typedef t6_payment{
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;
}
typedef Register{
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}
XPath
• In order to write specifications or programs that
manipulate XML documents we need:
– an expression language to access values and nodes in
XML documents
• XPath is a language for writing expressions (queries) that
navigate through XML trees and return a set of answer
nodes
• An XPath query defines a function which
– takes an XML tree and a context node (in the same
tree) as input and
– returns a set of nodes (in the same tree) as output
XPath Syntax
Basic XPath syntax:
q → . | .. | b | t | * | q / q | q // q | q [ exp ]
q
is an XPath query
exp
denotes a predicate on basic types, i.e., on the leaf
nodes of the XML tree
b
denotes a basic type such as string, boolean, int, etc.
t
denotes a tag
XPath Semantics
XPath expressions are evaluated from left to right
Given an XML tree and a node n as a context node
. returns n
.. returns the parent of n
Given an XML tree and a set of nodes
* returns all the nodes
b returns the nodes that are of basic type b
t returns the nodes which are labeled with tag t
XPath Semantics Contd.
Starting at the context node:
q1 / q2
returns each node which matches q2 starting at
a child of a node which matches q1
q1 // q2
returns each node which matches q2 starting at
a descendant of a node which matches q1
(if q1 is missing, then start at the root)
q [ exp ]
returns the nodes that match q and with
children for which exp evaluates to true
Examples
Register
  investorID
    VIP01
  requestList
    stockID
      0001
    stockID
      0002
  payment
    accountNum
      0425
//payment/* returns the node labeled accountNum
/Register/requestList/stockID/int returns the
nodes labeled 0001 and 0002
//stockID[int > 1]/int returns the node labeled 0002
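The three example queries can be approximated with Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. In this sketch the basic-type leaf (`int`) is modeled as the element text, and the numeric predicate is applied in Python because ElementTree has no comparison predicates; both choices are assumptions of the illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Register>"
    "<investorID>VIP01</investorID>"
    "<requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>"
    "<payment><accountNum>0425</accountNum></payment>"
    "</Register>"
)

# //payment/* : children of any payment node.
payment_children = [node.tag for node in root.findall(".//payment/*")]

# /Register/requestList/stockID/int : root is already the Register element,
# and the int leaf is modeled as the element text.
stock_values = [node.text for node in root.findall("requestList/stockID")]

# //stockID[int > 1]/int : ElementTree has no numeric comparison predicate,
# so the filter is applied in Python.
filtered = [node.text for node in root.iter("stockID") if int(node.text) > 1]
```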
XPath to Promela
• Generate code that evaluates the XPath expression
– Restrictions: no ancestor axis, no string expressions
• Uses two data structures
– Type tree shows the structure of the corresponding
MSL type
– Abstract statements which are mapped to Promela
code
• Traverse the XPath expression from left to right
– Statements generated in each step are inserted into the
BLANK spaces left in the code from the previous step
– The type tree is used to keep track of the context of the
generated code
Statement     Promela Code
IF(c)         if
              :: c -> BLANK
              :: else -> skip
              fi
FOR(v,l,h)    v = l - 1
              do
              :: v < h -> BLANK
                 v++
              :: else -> break
              od
EMPTY         BLANK
INC(v)        v++
SET(v,a)      v = a
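The BLANK-filling scheme can be sketched as plain string templates. This is an illustration only: `TEMPLATES`, `expand`, and `insert` are hypothetical names, the statement separators are added for readability, and inserted code is not re-indented.

```python
# Abstract statements mapped to Promela text; BLANK marks the open slot.
TEMPLATES = {
    "IF": "if\n:: {0} -> BLANK\n:: else -> skip\nfi",
    "FOR": "{0} = {1} - 1;\ndo\n:: {0} < {2} -> BLANK;\n   {0}++\n:: else -> break\nod",
    "EMPTY": "BLANK",
    "INC": "{0}++",
    "SET": "{0} = {1}",
}


def expand(name, *args):
    """Instantiate one abstract statement with its arguments."""
    return TEMPLATES[name].format(*args)


def insert(code, statement):
    """Fill the first BLANK left in `code` with `statement`."""
    return code.replace("BLANK", statement, 1)


# Build FOR(i1,1,3) containing IF(cond) containing SET(bRes1,1).
code = expand("FOR", "i1", 1, 3)
code = insert(code, expand("IF", "cond"))
code = insert(code, expand("SET", "bRes1", 1))
```

Each step consumes the BLANK left by the previous one, which mirrors the left-to-right traversal of the XPath expression described above.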
Type Tree
Register[
  investorID[string] &
  requestList[
    stockID[int]{1,3}
  ] &
  payment[
    creditCardNum[int] |
    accountNum[int]
  ]
]

Type tree (node numbers in parentheses):
Register (1)
  investorID (2)
    string (3)
  requestList (4)
    stockID (5, idx: i1)
      int (6)
  payment (7)
    creditCard (8)
      int (9)
    accountNum (10)
      int (11)
$register // stockID / [int()>5] / [position() = last()] / int()
[Figure: the sequence of abstract statements generated for this expression, each inserted into the BLANK left by the previous step: SET(i2,1), EMPTY, SET(bRes2,0), SET(bRes1,0), FOR(i1,1,3), IF(cond), SET(bRes1,1), IF(i2==i3), SET(bRes2,0), IF(bRes1), IF(bRes2), EMPTY, INC(i2)]
cond ≡ v_register.requestlist.stockID[i1] > 5
$request//stockID=$register//stockID[int()>5][position()=last()]
/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;
i2=1;
/* pre-calculate the value of last(), store in i3 */
i4=0; i5=1; i3=0;
do
:: i4 < v_register.requestList.stockID_occ ->
   /* compute first predicate */
   bRes3 = false;
   if
   :: v_register.requestList.stockID[i4].intvalue>5 -> bRes3 = true
   :: else -> skip
   fi;
   if
   :: bRes3 -> i5++; i3++;
   :: else -> skip
   fi;
   i4++;
:: else -> break;
od;
i1=0;
do
:: i1 < v_register.requestList.stockID_occ -> bRes1 = false;
   if
   :: v_register.requestList.stockID[i1].intvalue>5 -> bRes1 = true
   :: else -> skip
   fi;
   if
   :: bRes1 -> bRes2 = false;
      if
      :: (i2 == i3) -> bRes2 = true;
      :: else -> skip
      fi;
      if
      :: bRes2 ->
         if
         :: (v_request.stockID.intvalue ==
             v_register.requestList.stockID[i1].intvalue)
            -> bResult = true;
         :: else -> skip
         fi
      :: else -> skip
      fi;
      i2++;
   :: else -> skip
   fi;
   i1++;
:: else -> break;
od;
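For readers less familiar with Promela, the following Python function is intended to compute the same result as the generated code above. It is a reading aid rather than tool output: the function name and the flat list representation of the stockID values are assumptions, and the comments point back to the Promela variables.

```python
def guard_holds(request_stock_id, register_stock_ids):
    """Evaluate $request//stockID = $register//stockID[int()>5][position()=last()]."""
    # First pass: last() is the number of stockIDs satisfying int() > 5 (i3).
    last = sum(1 for value in register_stock_ids if value > 5)

    result = False  # bResult
    position = 1  # i2: position() among nodes that passed the first predicate
    for value in register_stock_ids:  # loop over index i1
        if value > 5:  # bRes1
            if position == last:  # bRes2
                if request_stock_id == value:
                    result = True
            position += 1
    return result
```

For example, with register stockIDs 3, 7, 9 only the request stockID 9 satisfies the expression, since 9 is the last value greater than 5.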
Model Checking Using Promela
• Error in SAS conversation protocol
t14{ s8 -> s12 : bill,
  Guard{
    $request//stockID = $register//stockID [position() = last()]
    =>
    $bill[ //orderID := $register//orderID ]
  }
}
• Repeating stockID will cause error
• One can only discover these kinds of errors by analysis of
XPath expressions
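One plausible reading of this error, stated here as an assumption rather than taken from the slides, is that the guard compares stockID values instead of positions. If the protocol issues one request per stockID in list order, a value that repeats the last entry makes the bill guard true too early. Both function names below are hypothetical.

```python
def bill_guard(request_stock_id, register_stock_ids):
    """$request//stockID = $register//stockID[position() = last()]"""
    return request_stock_id == register_stock_ids[-1]


def requests_before_bill(register_stock_ids):
    """Count requests issued before the bill guard first becomes true.

    Assumes (hypothetically) one request per stockID, in list order.
    """
    for count, stock_id in enumerate(register_stock_ids, start=1):
        if bill_guard(stock_id, register_stock_ids):
            return count
    return len(register_stock_ids)
```

With distinct stockIDs 1, 2, 3 the guard fires after all three requests; with 1, 2, 1 it already fires after the first request, so the remaining stocks would never be requested.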
Related Work
• Verification of web services
– Simulation, verification, composition of web services
using a Petri net model [Narayanan, McIlraith
WWW’02]
– Using MSC to model BPEL web services which are
translated to labeled transition systems and verified
using model checking [Foster, Uchitel, Magee, Kramer
ASE’03]
– Model checking Web Service Flow Language
specifications using SPIN [Nakajima ICWE’04]
– BPEL verification using a process algebra model and
Concurrency Workbench [Koshkina, van Breugel TAVWEB’04]
Related Work
• Conversation specification
– IBM Conversation support project
http://www.research.ibm.com/convsupport/
– Conversation support for business process integration
[Hanson, Nandi, Kumaran EDOC’02]
Future Work
• Other input languages in the front end
– WSCI, OWL-S
• Other verification tools at the back end
– SMV, Action Language Verifier
• Symbolic representations for XML data
• Abstraction for XML data and XML data manipulation
Current and Future Work
[Figure: extended WSAT architecture]
– Front End: web service specification languages (BPEL, WSCI, ...) and conversation protocols; a translator for bottom-up specifications yields guarded automata, and a translator for top-down specifications yields a guarded automaton
– Intermediate Representation: guarded automata
– Analysis: Automated Abstraction, Synchronizability Analysis, and Realizability Analysis, with outcomes success, fail, or skip
– Back End: translation with synchronous communication, translation with bounded queue, translation with single process and no communication
– Verification Languages: Promela, Action Language, SMV, ...