Data-driven web services

advertisement
Data-driven web services:
specification and verification
Victor Vianu
UC San Diego
Web service: service hosted on the Web
• Interactive, often data-driven:
accesses an underlying database and interacts with
users/programs according to explicit or implicit workflow
• Complex services: Web service compositions
peers communicating asynchronously
• Complexity of workflow leads to bugs:
see the public database of Web site bugs (Orbitz bug)
• Static analysis required
-behavior of individual peers
-protocols of communication between peers
-global properties
Focus of this course
Automatic Verification
Understand when verification is possible,
describe techniques and algorithms
Our target: data-aware Web services
Home page(HP)
Namepage(HP)
Home
passwd
Name
passwd
login
login
Error message page(MP)
Error message page(MP)
back
back
input
cancel
cancel
Customer page(CP)
Customer page(CP)
Desktop
My order
Desktop
laptop
My order
laptop
Desktop Search(SP)
Past Order (POP)
Past Order
Past
Order(POP)
Past Order
database
laptop Search(SP)
Desktop search
Desktop
Search(SP)
Desktop search
laptop
Search(SP)
Desktop search
Ram:
Desktop search
Ram:
Ram:
Hdd:
Ram:
Hdd:
Hdd:
Hdd:
Display:
search
update
state
Display:
search
search
search
output
query
Order status(OSP)
Order status(OSP)
Order
status
Order status
Cancel confirmation
page(CCP)
Cancel confirmation
page(CCP)
Product index page(PIP)
Product index page(PIP)
Matching products
Matching products
Product detail page(PP)
Product
detail
page(PP)
Product
detail
Product detail
buy
buy
Confirmation page(CoP)
Confirmation
Order detailpage(CoP)
Order detail
Input
Input
Input
Triggers state
update and
transition to
new page
Input
Home Page(HP)
Message Page
(MP)
NAME:
PASSWD:
back
cancel
login
Message
state update
Customer Page(CP)
laptop
Laptop Search (LSP)
desktop
DB
Desktop Search (DSP)
RAM:
CPU:
SCREEN:
RAM:
CPU:
submit
submit
Matching products
Confirmation
Details
buy
Product Index (PIP)
Product Detail
(PDP)
print
Confirmation
(CoP)
output
Compositions of Web services
Buyer Login page(BP)
Name
passwd
login
cancel
Category choice page(CP)
My order
Desktop
laptop
Desktop Search(SP)
Past Order (POP)
Past Order
User payment(UPP)
laptop Search(SP)
Desktop search
Ram:
Hdd:
Desktop search
Ram:
Hdd:
Display:
search
Credit
Verification
Payment
CC No:
Expire date
search
M
submit
Order status(OSP)
Order status
Cancel confirmation
page(CCP)
Product index page(PIP)
Matching products
Product detail page(PP)
Product detail
buy
Confirmation page(CoP)
Order detail
Examples of Desirable Properties
• Semantic properties
– “no product is delivered before payment in the right
amount is received"
– “no user can cancel an order that has already been
shipped”
• Basic soundness of specification
– “conditions guarding transition to next Web page are
mutually exclusive”
• Navigational properties
– “the shopping cart page is reachable from any page”
Expressed using temporal logics
“An order is rejected in the next step only if it has
already been ordered but not paid correctly in the
current input”
always
at next step
x G [ X reject-order(x)
(past-order(x)   y (pay(x,y)  price(x,y)))]
Outline
•
•
•
•
Review of model checking
Finite-state abstractions
Data-aware abstractions
The XML angle
Disclaimer: a lot not covered here!
•
•
•
•
•
Process algebras, pi-calculus
Situation calculus
Petri nets
Transaction logic
Use of ontologies, description logics
Finite-state abstractions of Web services*
• Black box: input/output signature
order
bill
Supplier
delivery
* curtesy of Rick Hull
payment
Finite-state abstractions of Web services
• White box: internal logic
order
bill
?o !b
delivery
?p !d
payment
Simplest: finite state Mealy machines
Finite-state abstractions of Web services
• White box: internal logic
order
bill
?o
delivery
payment
Simplest: finite state Mealy machines
• Interacting Web services: compositions
 P : finite set of Web
services (peers)
store
authorize
ok
bank
 C : finite set of peer-topeer channels
 M : (finite) set of
mesage classes
supplier1
supplier2
Combining Peer and Composition Models
store
!a
bank
?k
supplier1
?o1 !b1
...
...
?a
supplier2
?o2
• Peer fsa’s begin in their start states
!k
...
...
Executing a Mealy Composition (cont.)
store
!a
?k
supplier1
?o1 !b1
...
...
a
bank
?a
supplier2
?o2
!k
...
...
• STORE produces letter a and sends to BANK
Executing a Mealy Composition (cont.)
store
!a
bank
?k
supplier1
?o1 !b1
...
...
• BANK consumes letter a
?a
supplier2
?o2
!k
...
...
store
!a
r2
o1
?k
supplier1
?o1 !b1
...
...
b2 b1 bank
?a
o2 o2 r2
!k
supplier2
?o2
...
...
• Important parameters:
--bounded or unbounded queues
--open or closed system
• Execution successful if all queues are empty and
fsa’s in final state
Verifying Temporal Properties of Mealy
Compositions
store
!a
bank
?k
...
“line-ofcredit
warehouse1
available”
?o1
!b1
...
?a
!k
“shipment
just
made”
...
warehouse2
?o2
...
• Label states with propositions
• Express temporal formulas in LTL, e.g.,
– “shipment just made” only after “line-of-credit avail”
Results on Temporal Verification
• Long history, see [Clarke et.al. ’00]
Example: one fsa and propositional LTL
– PSPACE in size of formula + fsa
– linear time in size of fsa
• Mealy compositions
– Bounded queues
• Composition can be simulated as Mealy machine
• Verification is decidable
• Standard techniques to reduce cost
– Unbounded queues
• In general, undecidable [Brand & Zafiropulo 83]
Alternative: temporal property of
sequence of exchanged messages
warehouse1
payment2
bank
bill2
order1
receipt1
store
authorize
ok
warehouse2
“conversation”
a k o1 o2 b1 p1 r1 r2 b2 p2
LTL properties: Every authorize
followed by some bill?
Examples:
G(( [?o]   [ !b])  X [ !b])
“if an order has been received but a bill not yet
sent, then in the next state a bill has been sent”
G( [ ?o]  F( [ !b] ))
“if an order has been received then eventually a
bill will be sent”
Conversation languages
• Conversation: sequence of exchanged
messages
• Conversation language: set of all
conversations between Mealy peers
• Bounded queues: regular language
• Unbounded queues
!a
?b
a
?a
b
!b
p2
p1
• Conversation language L is not regular:
L  a*b* = { anbn | n  0 }
• Conversation languages are context sensitive
in fact, they are the quasi-realtime languages
[Bultan+Fu+Hull+Su 03]
Quasi-realtime languages
• Accepted by non-deterministic multi-tape TM in linear
time [Book+Greibach]
• Also, smallest AFL containing the CFGs and closed
under intersection
AFL: families of languages containing the finite
languages and closed under union, concatenation, +,
non-erasing homomorphism, inverse homomorphism,
intersection with regular languages
Synthesis of compositions
• Given a set of peers and
communication channels with message
names, and a constraint Φ on the
conversation language, find Mealy
automata for peers so that the
constraint is satisfied.
Bounded queues
• Closed systems:
PSPACE for LTL
PTIME for ω-regular sets given by an automaton
• Open systems:
“game” against environment
synthesis = finding winning strategy
undecidable for arbitrary topology
hierarchical topology: decidable [Kupferman + Vardi 01]
but non-elementary even in linear case [Pnueli+Rosner]
[
Unbounded queues
• Open systems: undecidable
• Closed systems: open
Practical alternative: synthesizing hierarchical composition
from “library” of services
Travel
Service
Templates
Air Travel
Templates
Airport
Transfer
Hotel
Reservation
Customized
Travel
Service
More work on compositions
• The Roman model
sequence of papers by Berardi, Calvanese, Da
Giacomo, Hull, Lenzerini, Mecella
also Bultan, Dang, Fu, Hull, Ibara, Su
see references.
• Use of description logics and logic programming
Beyond fsa: workflow+data
Home page(HP)
Namepage(HP)
Home
passwd
Name
passwd
login
login
Error message page(MP)
Error message page(MP)
back
back
input
cancel
cancel
Customer page(CP)
Customer page(CP)
Desktop
My order
Desktop
laptop
My order
laptop
Desktop Search(SP)
Past Order (POP)
Past Order
Past
Order(POP)
Past Order
database
laptop Search(SP)
Desktop search
Desktop
Search(SP)
Desktop search
laptop
Search(SP)
Desktop search
Ram:
Desktop search
Ram:
Ram:
Hdd:
Ram:
Hdd:
Hdd:
Hdd:
Display:
search
update
state
Display:
search
search
search
output
query
Order status(OSP)
Order status(OSP)
Order
status
Order status
Cancel confirmation
page(CCP)
Cancel confirmation
page(CCP)
Product index page(PIP)
Product index page(PIP)
Matching products
Matching products
Product detail page(PP)
Product
detail
page(PP)
Product
detail
Product detail
buy
buy
Confirmation page(CoP)
Confirmation
Order detailpage(CoP)
Order detail
• Abstract model: relational transducer
input
transducer
state
control
db
output
Control: (input, state, db)  (output, state)
History:
• Relational transducers
[Abiteboul+Vianu+Fordham+Yesha 98]
• ASM relational transducer
[Spielmann 00]
• Extended ASM transducer
[Deutsch+Liying+Vianu 04]
• Communicating transducers
[Deutsch+Liying+Vianu+Zhou 06]
History:
• Relational transducers
[Abiteboul+Vianu+Fordham+Yesha 98]
• ASM relational transducer
[Spielmann 00]
• Extended ASM transducer
[Deutsch+Liying+Vianu 04]
• Communicating transducers
[Deutsch+Liying+Vianu+Zhou 06]
Target: “business models”
in e-commerce
High-level descriptions of protocols of interaction
Transducer - SHORT
% schema: relational signatures
database:
price: 2
input:
order: 1, pay: 2
state:
past-order: 1, past-pay: 2
output:
send-bill: 2, deliver: 1
% state rules
past-order(X) +:- order(X)
past-pay(X,Y) +:- pay(X,Y)
% output rules
send-bill(X,Y) :- order(X), price(X,Y),  past-pay(X,Y)
deliver(X)
:- past-order(X), price(X,Y), pay(X,Y),
 past-pay(X,Y)
Run of a transducer
input/output sequence
input1
input2
input3
output1
output2
output3
Price
Input Sequence
order(Time)
order(Newsweek)
order(People)
order(Le Monde)
pay(Time,55)
pay(Newsweek,48)
send-bill(Time,55)
send-bill(Newsweek,45)
Output Sequence
Newsweek
Time
Le Monde
45
55
350
pay(Newsweek,45)
deliver(Time)
deliver(Newsweek)
send-bill(Le Monde,350)
What to verify:
• Goal reachability
is it possible to reach a goal (output)?
Example: is it possible to receive
a tax refund?
• Temporal properties of all runs
Example: no product can be delivered unless it
has been paid
• Comparing transducers
-- Equivalence: same runs
-- Containment: every run of T1 is
a run of T2
• Interacting transducers
Overall consistency
Example: T1 requires
payment before delivery
T2 requires
delivery before payment
deadlock situations!
• Generally, undecidable
Tractable restriction:
Spocus Transducer
Semi-Positive Output, CUmulative State
• State rules:
past-R( x ) + R(x)
• Output rules: A0
A1 … Ai … An
Rinput
A0 : output relation R(x)
Ai : input, database, or state literals, safe negation
% schema
database:
input:
state:
output:
% state rules
past-order(X)
past-pay(X,Y)
% output rules
send-bill(X,Y)
deliver(X)
Transducer - SHORT
price
order, pay
past-order, past-pay
send-bill, deliver
+:- order(X)
+:- pay(X,Y)
:- order(X), price(X,Y),  past-pay(X,Y)
:- past-order(X), price(X,Y), pay(X,Y),
 past-pay(X,Y)
What can be verified
• Goal reachability
goal: x ( A1  …  An)
Ai : () R( y ) where R is an output relation,
safe negation
Example: is it possible to reach deliver(x)?
• Temporal properties of runs
class T of sentences: x φ( x )
where φ is a Boolean combination
of output, state, and database literals
Example:
x y [(deliver(x)  price(x,y))
past-pay(x,y)]
• Complexity: NEXPTIME
Proof: Reduction to finite satisfiability of FO
sentences in the Bernays-Schönfinkel
prefix class ** FO
Comparing Spocus transducers
• Motivation: customization, negotiation, etc.
• Need to distinguish significant events from
“syntactic sugaring”
• Notion of log: semantically significant
input and output relations
% schema
database:
input:
state:
output:
Transducer SHORT
price
order, pay
past-order, past-pay
send-bill, deliver
% state rules
past-order(X) +:- order(X)
past-pay(X,Y) +:- pay(X,Y)
% output rules
send-bill(X,Y) :- order(X), price(X,Y),  past-pay(X,Y)
deliver(X)
:- past-order(X), price(X,Y), pay(X,Y),
 past-pay(X,Y)
Price
Input Sequence
order(Time)
order(Newsweek)
order(People)
order(Le Monde)
pay(Time,55)
pay(Newsweek,48)
send-bill(Time,55)
send-bill(Newsweek,45)
Output Sequence
Newsweek
Time
Le Monde
45
55
350
pay(Newsweek,45)
deliver(Time)
deliver(Newsweek)
send-bill(Le Monde,350)
% schema
database:
input:
state:
output:
Transducer Friendly
price
order, pay, pending-bills
past-order, past-pay, past-pending-bills
send-bill, deliver,
reject-pay, already-paid, re-bill
% some additional output rules
re-bill(X,Y) :- pending-bills(X), past-order(X), price(X,Y),
 past-pay(X,Y)
already-paid(X) :- pay(X,Y), past-pay(X,Y)
reject-pay(X) :- pay(X,Y),  past-order(X)
% schema
database:
input:
state:
output:
log:
Transducer Friendly
price
order, pay, pending-bills
past-order, past-pay, past-pending-bills
send-bill, deliver,
reject-pay, already-paid, re-bill
order, pay, send-bill, deliver
% some additional output rules
re-bill(X,Y) :- pending-bills(X), past-order(X), price(X,Y),
 past-pay(X,Y)
already-paid(X) :- pay(X,Y), past-pay(X,Y)
reject-pay(X) :- pay(X,Y),  past-order(X)
• Log of a run: restriction to logged events
• Valid log of a transducer: log of a run
• Transducer containment relative to a log:
every valid log of T1
is also a valid log of T2
Containment of Spocus transducers: undecidable.
Proof: reduction of implication of FDs and IDs
• Useful special case: customization:
“T2 is an extension of T1”
-- input(T1)  input(T2)
-- input(T1)  log(T1) = log(T2)
• Faithful extension T2 of T1 : T2 contained in T1
Example: Short vs. Friendly
• Decidable for Spocus tranducers in NEXPTIME
Proof: Bernays-Schönfinkel
Controlling Input Sequences
• Example:
order(x) must be input before pay(x)
• Use a distinguished error output:
error
pay(x), not past-order(x)
Valid runs: error free
Power of error-free runs
• Propositional output transducer:
-- propositional outputs
-- one proposition output at each step
• Generror-free(T): words output by T on finite runs
Theorem: A language L equals Generror-free(T)
for some propositional-output Spocus transducer
T iff L is a prefix-closed r.e. language
Temporal properties of error-free runs
• A useful class can be enforced
class Terror-free of sentences:
x [φ(state,db,in)(x)
ψ(state,db,in)(x)]
-- φ(state,db,in)(x): conjunction of literals
with safe negation
-- ψ(state,db,in)(x): positive formula
Examples
• xy [(past-order(x)  price(x,y) 
past-pay(x,y))
(pay(x,y)  cancel(x))]
• xy [pay(x,y)
• x [cancel(x)
(price(x,y)  past-order(x))]
past-order(x)]
Theorem: for every formula φ in Terror-free
there exists a Spocus transducer T
whose error-free runs are precisely
those satisfying φ.
Verification:
• Undecidable if all error-free runs of a Spocus
transducer T satisfy given φ in Terror-free
• Decidable for transducers T without negative
state literals in error-generating rules
Similar results for containment:
• Undecidable if every error-free run of T1 is
also an error-free run of T2
• Decidable if T1 and T2 have same schema
and full log, and no negative state literals
in error-generating rules
Conclusion on Spocus transducers
• Reasonable for simple business models
• Too restricted for more intricate Web service
workflows
• Limited class of temporal properties verified
• No compositions
Target: more sophisticated Web services
• Specified using high-level tools
Example: WebML
• Single peers and compositions
Input
Input
Input
Triggers state
update and
transition to
new page
Input
Home Page(HP)
Message Page
(MP)
NAME:
PASSWD:
Message
back
High-level
WebML-style
statetools
update
specification
cancel
login
Customer Page(CP)
laptop
Laptop Search (LSP)
desktop
DB
Desktop Search (DSP)
RAM:
CPU:
SCREEN:
RAM:
CPU:
submit
submit
Matching products
Confirmation
Details
buy
Product Index (PIP)
Product Detail
(PDP)
print
Confirmation
(CoP)
output
Model for data-driven Web services
WebML-style
A web service W is a set of web page
schemas as well as
– global database D (fixed during a session)
– global state relations S (updatable)
– input relations I and output relations O
– home page W0
– error page We
Webpage Schema
• Input options (user may pick at most one)
Query(DB, State, Prev-input)
• Output and Transitions (triggered by user input)
Output:
Next Webpage:
Query(DB, State, Input, Prev-input)
Query(DB, State, Input, Prev-input)
• State updates (triggered by user input)
Insertions/deletions:
Query(DB, State, Input, Prev-Input)
“Query”: First Order Logic formula (core SQL)
Details on Modeling the Input
• input options
as provided by menus, pull-down lists, buttons, HTTP
links: user must choose one
• input constants
model text input boxes:
name, password, etc.
• input I at previous step: prev-I
can be viewed as special state
Home Page(HP)
Message Page
(MP)
NAME:
PASSWD:
login
Message
back
input
constant
cancel
Customer Page(CP)
input
laptop
constant
desktop
Home page(HP)
Laptop Search (LSP)
Desktop Search (DSP)
Input:
name, password , clickbutton(x)
RAM:
RAM:
CPU:
SCREEN:
CPU:
Input options: clickbutton(x)  (x= “login” or x = “cancel”)
submit
submit
State update: error("bad user")  not users(name,password) and
clickbutton("login")
state table
DB table
Matching
productsrules:
Page
Transition
CP  users(name,password
) and
clickbutton(“login”)
Confirmation
Details
MP  not users(name,buy
password) and print
clickbutton("login")
next page
Product Index (PIP)
Product Detail
(PDP)
Confirmation
(CoP)
Home page(HP)
Name
passwd
login
cancel
Customer page(CP)
Error message page(MP)
back
My order
Desktop Search(SP)
Past Order (POP)
Past Order
Desktop
laptop
laptop Search(SP)
Desktop search
Desktop search
Ram:
Ram:
Page: Past Order (POP)
Hdd:
Hdd:
Display:
search
search
Input:
Input
Rules:
Order status(OSP)
Order status
State Rules:
Cancel confirmation
page(CCP)
Clickbutton(x), Opick(oid, pid, price, status)
Opick(pid,price,status) 
Product index page(PIP)
order-db(name, pid, price, status);
Matching products
Clickbutton(x)  x="view cart" or x="logout" or
x="continue shopping" or x="back"
Userorderpick(name,pid,price,
status)  Opick(pid,price,status)
Product detail page(PP)
Product detail
Target Webpages: OSP, CP, HP,CC
buy
Target Rules:
Confirmation
page(CoP)
OSP 
Opick(pid,price,status);
HP  Clickbutton("logout");
Order detail
...........
Home Page(HP)
Product
(PIP)
Message Info
Page PageNAME:
PASSWD:
(MP)
Input: pick(pid,price)
Message
Input options:
back
cancel
login
Customer Page(CP)
pick(pid,price)  ram
laptop cpu
desktop
prev-search(ram,cpu)  catalog(pid,ram,cpu,price)
Laptop Search (LSP)
RAM:
previous
CPU:
SCREEN:
Desktop Search (DSP)
input
RAM:
CPU:
db table
submit
submit
Matching products
Confirmation
Details
buy
Product Index (PIP)
Product Detail
(PDP)
print
Confirmation
(CoP)
Web service run
• Fixed database
• Sequence of inputs, states and outputs
• Input constants are given values
throughout the run
Inconsistencies
error page
Home Page(HP)
Message Page
(MP)
NAME:
PASSWD:
Message
back
High-level
WebML-style
state update
specification
cancel
login
Customer Page(CP)
laptop
Laptop Search (LSP)
desktop
Desktop Search (DSP)
RAM:
CPU:
SCREEN:
RAM:
CPU:
submit
submit
Matching products
Confirmation
Details
buy
Product Index (PIP)
Product Detail
(PDP)
print
Confirmation
(CoP)
Web application
code
Examples of Desirable Properties
• Semantic properties
– “no product is delivered before payment in the right
amount is received"
– “no user can cancel an order that has already been
shipped”
• Basic soundness of specification
– “conditions guarding transition to next Web page are
mutually exclusive”
• Navigational properties
– “the shopping cart page is reachable from any page”
Approach to verification
• Define an extended relational transducer
• Show that every Web service spec and temporal
property to be verified can be translated efficiently
into an extended relational transducer spec and a
corresponding property
Main result
restrictions leading to PSPACE
decidability of LTL properties
Several versions:
• Relational transducers
[Abiteboul+Vianu+Fordham+Yesha 98]
• ASM relational transducer
[Spielmann 00]
• Extended ASM transducer
[Deutsch+Liying+Vianu 04]
• Communicating transducers
[Deutsch+Liying+Vianu+Zhou 06]
Extended relational transducer:
reactive system with FO control
input
state
control:
FO
Single peer
db
output
Control: (input, state, db)  (output, state)
input
state
control:
FO
Single peer
db
output
Control: (input, state, db)  (output, state)
Single peer
state
FO query
db
Input options
user choice
Single peer
state
FO query
db
Input options
user choice
Single peer
state
FO
queries
db
output
Technical point: queries can also refer to k previous inputs
Configurations and runs
input
Configuration
state
db
output
Run: infinite sequence of consecutive configurations
Language for properties of runs:
LTL-FO
FO + LTL operators + Boolean operators
• Start with FO formulas referring to the states, db, inputs,
top and last message of queues in current configuration
FO components
• Apply Boolean and LTL operators: X, U, F, G, B
• All remaining free variables are universally quantified
x (x)
Example of LTL-FO formula
“An order is rejected in the next step only if it has already been
ordered but not paid correctly in the current input”
x G [ X reject-order(x)
(past-order(x)   y (pay(x,y)  price(x,y)))]
FO components
p: reject-order(x)
q: (past-order(x)   y (pay(x,y)  price(x,y)))
LTL formula over p, q:
G[Xp
q]
Example Property
“any shipped product must be previously paid for”
pid, uname,price [(pid, uname, price) B
Ship(uname, pid)]
output
Where (pid, uname, price) is the formula
input
pay(price)picked(uname, pid, price) 
prod-price(pid, price)
state
database
The Verification Problem
Given transducer M and LTL-FO property 
Decide if every run of M satisfies  .
If not, exhibit a counterexample run.
Challenge: infinite-state system!
Typical approaches in Software Verification
are unsatisfactory:
– Model checking: developed for finite-state systems
described by propositional states. More expressive
specifications first abstracted to propositional ones.
Unsatisfactory: can check that some payment
occurred before some shipment, but not that it
involved the correct amount and product.
– Theorem proving: no completeness guarantees, not
autonomous. Prover requires expert guidance .
Alternative approach: identify a restricted but reasonably
expressive class of transducers that can be verified
Main restrictions for decidability
guarded quantification
guarded quantification: quantified variables
must appear in input (or prev-I) atoms
“input boundedness”
earlier variant: Spielmann
pick(pid,price)  ram cpu prev-search(ram,cpu)
 catalog(pid,ram,cpu,price)
Input-bounded transducer
• State, action, and transition rules use FO
conditions with “input-bounded” quantification:
x ( input(- x- )  φ( x ))
x ( input(- x- )
φ( x ))
state atoms have no quantified variables
prev-I can also serve as guard
• Input options definitions:
*FO (db, prev-input, ground state atoms)
Home page(HP)
Name
passwd
login
cancel
Customer page(CP)
Error message page(MP)
back
My order
Desktop Search(SP)
Home page(HP)
Past Order (POP)
Past Order
Desktop
laptop
laptop Search(SP)
Desktop search
Desktop search
Ram:
Ram:
Hdd:
Hdd:
Input: name, password , clickbutton(x)
search
Display:
search
Input options: clickbutton(x)  (x= “login” or x = “cancel”)
Product index page(PIP) user")  not users(name,password) and
State update: error("bad
Order status(OSP)
clickbutton("login")
Order status
Matching products
Transition rules:
HP  clickbutton(“cancel”)
Cancel confirmation
page(CCP)
Product detail
page(PP)
CP  user(
name,password
) and clickbutton(“login”)
Product detail
MP  not users(name, password) and
clickbutton("login")
buy
Confirmation page(CoP)
Order detail
Home page(HP)
Name
passwd
login
cancel
Customer page(CP)
Error message page(MP)
back
My order
Desktop Search(SP)
Past Order (POP)
Past Order
Desktop
laptop
laptop Search(SP)
Desktop search
Desktop search
Ram:
Ram:
Page: Past Order (POP)
Hdd:
Hdd:
Display:
search
search
Input:
Input
Rules:
Order status(OSP)
Order status
State Rules:
Cancel confirmation
page(CCP)
Clickbutton(x), Opick(oid, pid, price, status)
Opick(pid,price,status) 
Product index page(PIP)
order-db(name, pid, price, status);
Matching products
Clickbutton(x)  x="view cart" or x="logout" or
x="continue shopping" or x="back"
Userorderpick(name,pid,price,
status)  Opick(pid,price,status)
Product detail page(PP)
Product detail
Target Webpages: OSP, CP, HP,CC
buy
Target Rules:
Confirmation
page(CoP)
OSP 
Opick(pid,price,status);
HP  Clickbutton("logout");
Order detail
...........
Home Page(HP)
Product
(PIP)
Message Info
Page PageNAME:
PASSWD:
(MP)
Input: pick(pid,price)
Message
Input options:
back
cancel
login
Customer Page(CP)
pick(pid,price)  ram
laptop cpu
desktop
prev-search(ram,cpu)  catalog(pid,ram,cpu,price)
Laptop Search (LSP)
RAM:
previous
CPU:
SCREEN:
Desktop Search (DSP)
input
RAM:
CPU:
db table
submit
submit
Matching products
Confirmation
Details
buy
Product Index (PIP)
Product Detail
(PDP)
print
Confirmation
(CoP)
Input-bounded LTL-FO property:
FO components are input bounded
“An order is rejected in the next step only if it has
already been ordered but not paid correctly in the
current input”
x G [ X reject-order(x)
(past-order(x)   y (pay(x,y)  price(x,y)))]
Main verification result
Theorem: It is decidable whether an inputbounded extended relational transducer
satisfies an input-bounded LTL-FO property.
Complexity: PSPACE-complete for bounded arity
schemas, EXPSPACE otherwise
Tightness: even small extensions
lead to undecidability
• Relaxing the requirement that state atoms must be
ground in formula defining the input options.
Reduction: Does TM halt on input epsilon?
• Lifting the input-bounded requirement by allowing
state projection.
Reduction: Implication for FDs and IDs
• Allowing Prev-I to record all previous input to I rather
than the most recent one.
Reduction: Trakhtenbrot’s Theorem
• Extend the FO-LTL formulas with path quantification.
Reduction: validity of **FO formulas
Expressivity of input-bounded specs
Significant parts of the following Web
applications could be modeled:
•
•
•
•
Dell-like computer shopping website
Expedia
Barnes&Noble
GrandPrix motor sports Web site
See demo site http://www.db.ucsd.edu
PSPACE verification:
outline for single peer
To check that M satisfies ,
verify that there is no run satisfying  
Recall model checking approach (finite-state):
• Build Büchi automaton B( ) for  
• Build Kripke model K for all runs
• Check that there is no counterexample run:
emptiness of K  B( )
Our case: infinite-state system
Same idea: build B( ), then search for
counterexample runs accepted by B( )
But: no finite Kripke model K for the runs!
Problem in searching for counterexample runs:
infinite runs
infinitely many underlying databases
How to limit the search space?
Infinite search space for runs
number of underlying DBs
..
.
...
...
...
...
...
...
...
...
...
length of run
Bounding the search for counterexample runs
double-exponentially many DBs
number of underlying DBs
...
...
...
Sufficient to consider only
DBs over a fixed domain
of cardinality exponential
in size of spec + prop
...
...
Finite search space
yields decidability
of verification
...
...
...
Periodic runs suffice:
 counterexample iff
 periodic one
...
doubly-exponential length in size of spec+prop
length of run
Key insight for PSPACE complexity
• No need to explicitly materialize entire configuration:
• Instead, at each step construct only those portions of DB,
states and outputs which can affect property.
• Call them pseudoconfigurations.
Pseudoconfigurations
C = a set of relevant constants extracted from the spec. and prop.
+
a fixed number of variables
restriction of outputs to
constants in C
input picked from C
input
restriction of states to
constants in C
output
S
restriction of DB to C
Size polynomial in spec + prop
Pseudoruns
pseudorun
...
DB
DB
DB
 counterexample run iff  counterexample pseudorun
Pseudoruns
pseudorun
...
DB
DB
DB
• Can compute next possible pseudoconfigurations from current one
Pseudoruns
pseudorun
...
DB
DB
DB
• Can compute next possible pseudoconfigurations from current one
• Never construct entire DB, just “slide” poly window over it
PSPACE verification algorithm
“Relevant” constants in C
• constants explicitly used in spec
• witnesses for existentially quantified variables in  
Example:
=
x G [ X reject-order(x)
(past-order(x)   y (pay(x,y)  price(x,y)))]
  =  x  G [ X reject-order(x)
(past-order(x)   y (pay(x,y)  price(x,y)))]
Replace x by non-deterministically chosen constant c,
continue with new formula
 G [ X reject-order(c)
(past-order(c)   y (pay(c,y)  price(c,y)))]
Variables in C
Input values that are not “relevant constants”
Why don’t we need infinitely many?
Variables in C
Input values that are not “relevant constants”
k
Can use just 2k variables because only k values
can be passed via prev-input between configurations
Implementation so far
WAVE: verifier for single Web service peer
[SIGMOD’05]
• Essentially implements search for a
counterexample pseudorun
• Many tricks and heuristics to achieve good
verification times
Some techniques
• Dataflow analysis to identify all constants to which a DB
attribute may be compared (directly or indirectly).
Limits the relevant combinations of constants when constructing
partial DBs. Spectacular reduction: for the computer shopping
website, from 2^(17,270,412,688) partial DBs to 8 !
• Internal representation of pseudoconfigs to
– Efficiently detect loop in periodic run
– Efficiently evaluate queries
• Early pruning of pseudoruns
Experimental Evaluation of WAVE Tool
• Online Demo at http://www.db.ucsd.edu/
• Evaluated experimentally on 4 Web applications:
– Dell-like computer shopping
– Part of Expedia, Barnes&Noble, GrandPrix
• Verification times for a battery of properties:
all within seconds, below one minute.
• Here, report only Dell experiment. All others are similar.
Some of the Verified Properties
Property type
Property name
Time (seconds)
Sequence pBq
P5 (true)
P7 (true)
4
2
Session
Gpafter
 Gq
Shipment
only
proper
payment
Correlation
Fp  Fq
P9 (true)
1
P10 (true)
P11 (false)
P12 (true)
P13 (false)
0.23
0.29
0.6
0.44
Response p  Fq
P14 (false)
0.19
Reachability Gp or Fq
P2 (true)
P3 (false)
0.9
0.37
Recurrence G(Fp)
P17 (false)
0.15
Strong non-progress F(Gp)
P15 (false)
0.26
Weak non-progress G(pXp)
P6 (false)
0.49
Guarantee Fp
P1 (true)
P8 (false)
0.02
0.11
Failure of Classical Tools
• SPIN model checker
Abstraction is unsatisfactory.
Alternative trick:
Try to use SPIN to verify pseudoruns.
The resulting SPIN input is too large to handle.
• PVS theorem prover
Not guaranteed to find a counterexample.
Gets stuck during search, asks for guidance from
expert user.
Branching-time temporal properties
homepage
Current
state
Need path quantifiers
“At any point in a run, there is a way
to return to the home page”
Verification results for CTL(*)
• Propositional Web services:
--states and actions are propositional
--prev-I atoms are disallowed
• CTL* formulas using state, action,
input, and Web page symbols interpreted as
propositions
Examples
Navigational properties
• “One can reach the home page from any page”
AGEF( Home-Page)
• “After login, the user can reach a page where he can
authorize payment”
AG(Home-Page  button(“login”)
EF(button(“authorize payment”)))
• Verification of CTL(*) formulas for propositional Web
services:
--CO-NEXPTIME for CTL
--EXPSPACE for CTL*
Proof idea:
(i) show that there is a bound
on the databases that need to be considered in
order to detect a violation;
(ii) for a fixed database, reduce checking violation
to model checking for a Kripke structure
generated from the database.
Getting down to PSPACE:
• Navigational properties: CTL* formulas
using only Web page symbols
• Fully propositional Web services:
inputs are also propositional
Proof technique: highly efficient model-checking technique
of Kupferman, Vardi, Wolper using hesitant alternating
tree automata (HAA). Reduce to checking non-emptiness
of a one-letter word HAA.
Alternative restriction:
Web services with “input-driven search”
• Propositional states and actions
• Inputs are monadic, propagated
to next Web page as a parameter
using prev-I atoms
Example: allows conducting an user-driven search
going through consecutive stages of refinement
(each choice triggers new set of options)
For Web services with input-driven search:
CTL formulas can be verified in EXPTIME
CTL* formulas can be verified in 2-EXPTIME
EXPTIME for fixed out-degree of input choice
Proof: reduce to satisfiability of CTL(*) formulas
by a Kripke structure
Compositions of Web services
Buyer Login page(BP)
Name
passwd
login
cancel
Category choice page(CP)
My order
Desktop
laptop
Desktop Search(SP)
Past Order (POP)
Past Order
User payment(UPP)
laptop Search(SP)
Desktop search
Ram:
Hdd:
Desktop search
Ram:
Hdd:
Display:
search
Credit
Verification
Payment
CC No:
Expire date
search
M
submit
Order status(OSP)
Order status
Cancel confirmation
page(CCP)
Product index page(PIP)
Matching products
Product detail page(PP)
Product detail
buy
Confirmation page(CoP)
Order detail
Examples of Composition Properties
• “every payment request by a user results eventually in
an approval or denial output to the user”
• “the answer to every credit check request message for
a user is a credit rating message poor, fair, or good, for
the same user”
• “for every two consecutive credit rating messages for
the same user user there exists an intermediate credit
request message for that user.”
• Communicating peers: composition
• channels between peers
• message: finite relation (set or singleton)
• one FIFO queue at recipient of each channel
More on messages
•
•
•
•
Flat message: single tuple
Nested message: finite set of tuples
Messages queued at recipient
Message contents:
!M(x) :- query(db, state, input, in-messages)
Flat messages: query may generate several
tuples, choose non-deterministically one to be sent
Peers with messages
input
incoming
messages
Single peer
FO
control
state
output
db
outgoing
messages
Control: (input, in-messages, state, db)  (output, out-messages, state)
Configurations and runs
incoming
message
queues
input
Configuration
of a single peer
state
db
output
Configuration of a composition: member peer configurations
Transitions: one peer at a time
Configuration of a composition: member peer configurations
Transitions: one peer at a time
Configuration of a composition: member peer configurations
Run:Transitions:
infinite sequence
one peer
of consecutive
at a time configurations
Verification of compositions:
main restrictions for decidability
input boundedness of rules
+ bounded queues, lossy channels
Input-bounded compositions
• State, output, and nested message rules use FO
formulas with guarded quantification:
x ( guard(- x- )  φ( x ))
x ( guard(- x- )
φ( x ))
where guard is an input or flat message atom
and state and nested message atoms in φ have
no quantified variables
• Input options and flat message definitions:
*FO formulas with ground state and
nested message atoms
Verification of compositions
Reduce to single peer verification
• Reduction applies to input-bounded compositions
with bounded, lossy channels
• Flat message queues simulated by inputs
• Nested message queues simulated by states
• Non-deterministic choice of peer at each
transition simulated with additional input
• Some tricky timing issues in translation of property
PTIME reduction preserving input boundedness
Main verification result
Theorem: It is decidable whether an inputbounded composition with bounded queues
and lossy channels satisfies an input-bounded
LTL-FO property.
Complexity: PSPACE-complete for bounded arity
schemas, EXPSPACE otherwise
Examples of extensions
leading to undecidability
• Allowing perfect flat queues
(perfect nested queues are ok)
• Disallowing non-deterministic choice for flat
messages
Reduction: Post Correspondence Problem
Additional verification problems
• Conversation protocols
sequences of messages observed in runs
data-agnostic: message parameters ignored
data-aware: parameters taken into account
• Modular verification
specs of some peers not available
information limited to input/output behavior
Verification of conversation protocols
• data-agnostic protocol: Büchi automaton
over alphabet of message names
• Possible semantics with lossy channels:
observer-at-recipient
observer-at-source
Theorem:
Theorem: ItItisisPSPACE-complete
undecidable if an input-bounded
if an input-bounded
composition
composition with
with bounded,
bounded, lossy
lossy channels
channels satisfies
satisfies
aadata-agnostic
data-agnostic conversation protocol with
observer-at-recipient
observer-at-source semantics
semantics
Verification of conversation protocols
• Similar results for data-aware protocols:
formalized as Büchi automaton whose
alphabet is a finite set of FO formulas on
message relations
G( get-rating(x) B rating(x,y) )
Theorem: It is PSPACE-complete if an input-bounded
composition with bounded, lossy channels satisfies
a data-aware conversation protocol with
observer-at-recipient semantics
Modular verification
?
?
Black box peers: input-output behavior
Modular verification
Environment
Environment specification:
LTL-FO description of input and output messages
Properties under given environment
Composition C satisfies LTL-FO property 
under environment specification :
every run of C in which messages to/from the
environment satisfy  and use values from
some finite domain, satisfies 
Verification under given environment
Additional restriction needed for decidability
LTL-FO property  is strictly input-bounded if
its FO components have no free variables
Example:
G ssn [ ?getRating(ssn) 
(!rating(ssn, “poor”)  !rating(ssn, “fair”)  !rating(ssn, “good”))]
Verification under given environment
Theorem: It
It is
is PSPACE-complete
undecidable if an input-bounded
Theorem:
if an input-bounded
composition C
C with
with bounded
bounded queues
queues and
and lossy
lossy channels
channels
composition
satisfies an
an input-bounded
input-bounded LTL-FO
LTL-FO property
property 
 under
under
satisfies
an input-bounded but not strictly-input-bounded
a strictly-input-bounded environment specification 
environment specification 
Putting the pieces together
WebML-style spec of
Web service composition
PTIME
peer composition spec
PTIME
single peer spec
The XML angle
• Black box: input/output signature
order
bill
Supplier
delivery
payment
The XML angle
• Black box: input/output signature
XML type
order
bill
XML type
Supplier
XML type
delivery
payment
XML type
• Interacting Web services
store
supplier1
authorize
ok
bank
supplier2
• Interacting Web services
XML types
authorize
ok
store
XML types
bank
XML types
supplier1
XML types
supplier2
Active XML
• Full integration of XML with Web service calls
• Service calls embedded in XML documents
• Static analysis: typechecking, type casting,
optimization, materialization policy
• More later!
Quick XML Review
• XML document (ignoring PC data):
labeled, unranked, ordered tree
root
section
section
intro section conc
intro section
intro
section conc
intro conc intro conc
conc
• XML type: XML Schema
• XML query language: XQuery
Tools in static analysis: connections to
tree automata and tree transducers
Simple XSchema:
Document Type Definition (DTD)
: alphabet of element names, root  
set of rules:
e
element name
r
regular expression
over 
Documents satisfying a DTD
root
….
e
e
e1
….
r
ek
r
Set of trees satisfying DTD d: Sat(d)
Example
A DTD and a tree satisfying it:
root
section
section*;
intro, section*,conclusions;
root
section
section
intro section conc
intro conc
intro section section conc
intro conc intro conc
Essential enhancement of XML Schema:
specialization
dealer
used
new
car
car
model
year
model
car has different structure in different contexts
Specialization
dealer
used
carused
model
year
new
carnew
model
car has different structure in different contexts
Specialized DTD
dealer
used
new
carused
carnew
used new
(carused )*
(carnew)*
model year
model (year | )
dealer
dealer
used
new
used
new
carused
carnew
car
car
model
year
model
model
year
model
Tool in static analysis:
Powerful connection to tree automata!
Tree automata
States: p,q,r, …
Start state: q0
Final state: qf
root
a
a
c
b
c
c
b
c
a a
Transitions:
b
a
b
p
a
a
b
Tree automata
States: p,q,r, …
Start state: q0
Final state: qf
root
a
a
c
b
c
c
b
c
a a
Transitions:
b
a
b
p
a
q
r
Tree automaton computation
root
a
a
c
b
c
c
b
c
a a
b
a
b
a
Tree automaton computation
q0
a
a
c
b
c
c
b
c
a a
b
a
b
a
Tree automaton computation
q0
p
a
c
q
c
c
b
c
a a
b
a
b
a
Tree automaton computation
q0
p
r
c
q
p
c
b
q
a a
p
a
b
a
Tree automaton computation
q0
p
r
Acceptance: there is a
computation such that
all leaves are labeled qf
q
p
q
p
qf qf qf qf qf qf qf
qf
Set of accepted trees: regular tree language
Tree automata variants
• Top-down nondeterministic
• Bottom-up (non)-deterministic
• Top-down deterministic (weaker)
Examples
• regular: set of trees representing Boolean circuits
evaluating to true



0


1 1

0 0

1 0
1
• not regular: equal numbers of a and b nodes
Regular: definable in Monadic Second-Order Logic
Theorem: XMLSchemas define
regular languages of unranked trees
Benefits:
• Algorithms for validation wrt XMLSchemas,
testing inclusion of XMLSchemas, etc.
• Static analysis
Query languages for XML
• Standard: XQuery
integrates previous paradigms
• One paradigm (also XML-QL, Lorel)
– extract bindings for variables using patterns
– construct answer from bindings
• Another paradigm (also XSLT, UnQL)
– structural recursion
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
root
pattern:
p
X
p,q,r,s: regular
path expressions
q
Y
r
Z
s
T
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
root
pattern:
p
X
p,q,r,s: regular
path expressions
q
Y
r
Z=5
s
T
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
root
pattern:
p
X
r
Z=5
=
p,q,r,s: regular
path expressions
q
Y
s
T
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
answer:
f(X), g(X,Y)
Skolem functions
root
…
f(X) …
g(X,Y)
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
answer:
f(X), g(X,Y)
Skolem functions
root
…
f(X) …
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
answer:
f(X), g(X,Y)
Skolem functions
root
f(X1) …f(Xi) …f(Xn)
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
answer:
f(X), g(X,Y)
Skolem functions
root
…
f(X) …
g(X,Y)
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
answer:
f(X), g(X,Y)
Skolem functions
root
…
f(X) …
g(X,Y1)…g(X,Yk)
Example: art exhibits
root
exhibit
title
museum
DTD:
exhibit*
title. museum*. review*
artist*
address. dates._*
WHERE
CONSTRUCT
root
exhibit
X1
title
museum
X2
artist
X4 = Vermeer
X3
Vermeer’s exhibits
title(X2)
museum(X2,X3)
*
X5
X5(X2,X3,X5)
Q(X2)
Nested query Q(X2):
WHERE
root
exhibit
Y1
title
review
X2
Y2
CONSTRUCT
Vermeer’s-reviews(X2)
review(X2 ,Y2)
A Different Paradigm: Structural Recursion
• Example: change all name tags occurring under
person nodes to pname
a
• Notation: a(t1,t2) denotes the tree
t1
• Transformation defined by f:
f(x(t1,t2)) = if x = person then x (g(t1), g(t2))
else x (f(t1), f(t2))
g(x(t1,t2)) = if x = name then pname (g(t1), g(t2))
else x (g(t1), g(t2))
t2
Connection to tree transducers
k-pebble transducers
[Milo+Suciu+V.-]
• subsume core of XQuery
• useful for static analysis
k-pebble transducers
– States, pebbles,
transition rules
– Control: alternating
k pebbles:
1
2
k
k-pebble transducers
stack discipline for pebbles
i
2
1
k-pebble transducers
stack discipline for pebbles
3 pebbles:
1
2
3
k-pebble transducers
1
stack discipline for pebbles
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
2
stack discipline for pebbles
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
2
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
2
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
3
stack discipline for pebbles
2
3
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
3
2
3
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
23
3
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
2
3
1
2
3
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
23
3
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
3
2
3
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
3
stack discipline for pebbles
2
3
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
2
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
2
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
2
stack discipline for pebbles
2
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
stack discipline for pebbles
1
1
3 pebbles:
1
2
3
k-pebble transducers
1
stack discipline for pebbles
1
3 pebbles:
1
2
3
Output-producing transitions
root
Output-producing transitions
root
2
1
3
2
1
3
Output-producing transitions
root
2
2
3
1
1
3
Output-producing transitions
root
32
1
2
1
Output-producing transitions
root
2
3
1
2
1
Output-producing transitions
root
32
1
1
2
Output-producing transitions
root
a
32
32
1
2
1
1
Output-producing transitions
root
a
b
2
3
1
1
2
Output-producing transitions
root
a
c
b
2
1
3
Output-producing transitions
root
a
b
c
a
Output-producing transitions
root
a
c
b
a
b
Output-producing transitions
root
a
c
b
a
a
b
Output-producing transitions
root
a
c
b
a
a
b
c
a
Output-producing transitions
root
a
c
b
c
a
a
b
c
a
a
XML Query languages
vs. k-pebble transducers
k-pebble transducers subsume the
tree manipulation core of XQuery
Application: typechecking
output type β
Tree transducer T
XML query
output type
Web service A
output type α
Web service B
output type
Web service C
Need to check: T(α)  β
Equivalently: α  T-1(β)
Theorem: T-1(β) is a regular tree language
[Milo+Suciu+V.-]
Typechecking:
1. Compute from T and β the
tree automaton for T-1(β)
2. Check that α  T-1(β)
Caveat: no data joins
Example: application to matching
for service composition
Web service A
output type α
Can output of A be restructured to fit the input type β of B?
Web service B
input type β
Key: describe allowed restructurings by
a nondeterministic transducer T
• Given output I of service A, check whether
T(I)  β ≠ 
by constructing a tree automaton for T(I)
• If yes, produce as side effect a minimal
restructuring of I that satisfies β,
witnessing the nonempty intersection
Static version:
Can every output of A be restructured
so as to satisfy the input type of B?
• Key: {I / T(I)  β ≠ } is regular
if T is k-pebble transducer
• Enough to check that
α  {I / T(I)  β ≠ }
Going all the way: Active XML
newspaper
XML+
embedded
service calls
title
GetTemp
date
city
“06/10/2003”
“Le Monde”
GetEvents
“Exhibits”
“Paris”
Y!
 Materialization: replacing a service call by its result.
 It’s a recursive process.
[Milo,Abiteboul,Amann,Benjelloun,Ngoc – SIGMOD03]
Going all the way: Active XML
newspaper
XML+
embedded
service calls
title
temp
date
“06/10/2003”
GetEvents
“Exhibits”
“16°C”
“Le Monde”
T!
 Materialization: replacing a service call by its result.
 It’s a recursive process.
[Milo,Abiteboul,Amann,Benjelloun,Ngoc – SIGMOD03]
Going all the way: Active XML
newspaper
XML+
embedded
service calls
title
temp
date
“06/10/2003”
“16°C”
exhibits
GetExhibits
City
“Le Monde”
“Paris”
 Materialization: replacing a service call by its result.
 It’s a recursive process.
[Milo,Abiteboul,Amann,Benjelloun,Ngoc – SIGMOD03]
• Context: peer-to-peer Web services
• Each peer
– Repository of intensional (AXML)
documents
– Server: provides Web services (XQuery)
– Client: when invoking the embedded
service calls
Extended type
• Restriction on where data and service calls
occur in tree
newspaper
GetEvents
title
date
GetTemp
“Exhibits”
city
“06/10/2003”
“Le Monde”
“Paris”
• Service call input/output signatures
• Service call definitions
input parameters: XQueries on client data
output: XQuery on parameters and server data
Basic typechecking problem
Given AXML types and service signatures and
definitions for all peers, check if:
• all AXML documents resulting
from calls among peers are valid
• all service inputs and outputs satisfy the
signatures
Can be checked using transducers
(if no data joins)
Controlling expansion policy by typing
newspaper
GetEvents
title
date
GetTemp
“Exhibits”
city
“06/10/2003”
“Le Monde”
“Paris”
Y!
 Materialization can be performed
 by the sender, before sending a document…
 or by the receiver, after receiving it.
Controlling expansion policy by typing
newspaper
GetEvents
title
date
temp
“06/10/2003”
“16°C”
“Le Monde”
 Materialization can be performed
 by the sender, before sending a document…
 or by the receiver, after receiving it.
“Exhibits”
Controlling expansion policy by typing
newspaper
Y!
GetEvents
title
date
GetTemp
“Exhibits”
city
“06/10/2003”
“Le Monde”
“Paris”
 Materialization can be performed
 by the sender, before sending a document…
 or by the receiver, after receiving it.
Controlling expansion policy by typing
newspaper
Y!
GetEvents
title
date
temp
“06/10/2003”
“16°C”
“Le Monde”
 Materialization can be performed
 by the sender, before sending a document…
 or by the receiver, after receiving it.
“Exhibits”
Why control the materialization of calls?
• For added functionality, e.g.
– Intensional data allows to get up-to-date information.
• For security reasons or capabilities, e.g.
–
–
–
–
I don’t trust this Web service/domain,
I don’t have the right credentials to invoke it,
It costs money,
Maybe the receiver doesn’t know Active XML!
• For performance reasons, e.g.
– A proxy can invoke services on behalf of a PDA.
Example scenario
• Client allows only certain service calls,
specified by its type
• Can server always force its answer to satisfy
the clients schema by appropriate expansions?
• Game between server and invoked services
Example: word case
• Client type: regular language R
• Input: word a1 ... an
each ai represents a Web service call
• Output type of service a: regular language Ra
• Game: Bob chooses a in the current word,
Alice responds with word in Ra to replace a
• Bob wins if resulting word is in R
Does Bob have a winning strategy on a1 ... an ?
Undecidable
[Segoufin,Schwentick,Muscholl – STACS’04]
even if limited to “context-free games”: Ra consists of
just finitely many words
Decidable under restrictions
[Segoufin,Schwentick,Muscholl – STACS’04]
[Milo,Abiteboul,Amann,Benjelloun,Ngoc – SIGMOD’03]
complexity: PSPACE to EXPSPACE
Mille Grazie!
Download