The Web Service Discovery Architecture (WSDA)
in Comparison with OGSA
Input for improvements of OGSA
Wolfgang.Hoschek@cern.ch
European DataGrid Data Management Work Package (EDG WP2)
OGSA Workshop, Argonne, May 31, 2002

How to Reach Mentioned Objectives?
• Retain and wrap all four WWW pillars (URI, HTTP, MIME, HTML) “as is”
– yet allow for flexible extensions in terms of identification, retrieval and caching of content
• Judiciously combine the…
– four WWW pillars
– Dynamic Data Model (DDM)
– Web Service Discovery Architecture (WSDA)
– Hyper Registry
– the Unified Peer-to-Peer Database Framework (UPDF)
– and its Peer Database Protocol (PDP)

Objectives of Work
• Define how to…
– bootstrap, query and publish to a dynamic information space maintained by self-describing network interfaces
• Show how to support…
– expressive general-purpose queries for service discovery
– over a view that integrates autonomous dynamic database nodes
– from a wide range of distributed system topologies

Ph.D. and Papers
1. A Unified Peer-to-Peer Database Framework for XQueries over Dynamic Distributed Content and its Application for Scalable Service Discovery
   Ph.D. Thesis, Tech. University of Vienna (submitted), 2002.
2. A Data Model and Query Language for Service Discovery
3. A Database for Dynamic Distributed Content and its Application for Service and Resource Discovery
4. The Web Service Discovery Architecture
5. A Unified Peer-to-Peer Database Framework and its Application for Scalable Service Discovery
6. A Unified Peer-to-Peer Database Protocol
– See http://cern.ch/grid-data-management/publications.html

World Wide Web Architecture
• T. Berners-Lee designed the WWW as a
– consistent interface to a flexible and changing heterogeneous information space
– for use by CERN's staff, the High Energy Physics community, and, of course, the world at large
• WWW architecture rests on four simple and orthogonal pillars:
– URIs as identifiers
– HTTP for retrieval of content pointed to by identifiers
– MIME for flexible content encoding
– HTML as the primus-inter-pares (MIME) content type

Tuple from Dynamic Data Model
• A WSDA tuple is an annotated multi-purpose soft state data container that may contain a piece of arbitrary MIME content and allows for refresh of that content at any time (default content-type is XML)
Tuple := Link Type Context Timestamps Metadata Content (optional)
Semantics: HTTP GET(tuple.link) --> tuple.content
           type(HTTP GET(tuple.link)) --> tuple.type
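
A minimal Python sketch of the tuple structure and its semantics may help make the definition concrete. The class name `WsdaTuple`, the field layout and the stand-in `fake_http_get` function are assumptions made for illustration; the slides only name the components (Link, Type, Context, Timestamps, Metadata, Content) and the GET semantics.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple as Pair


@dataclass
class WsdaTuple:
    """Sketch of a WSDA tuple: Link, Type, Context, Timestamps, Metadata, Content."""
    link: str                       # content link (an HTTP(S) URL)
    type: str                       # type of the content the link points to
    context: str                    # publication context
    timestamps: Dict[str, int] = field(default_factory=dict)  # e.g. TS1, TC, TS2, TS3
    metadata: Optional[str] = None  # optional, arbitrary shape
    content: Optional[str] = None   # optional embedded copy of the content

    def refresh(self, http_get: Callable[[str], Pair[str, str]]) -> None:
        """Apply the stated semantics: a GET on the link yields content and type."""
        self.type, self.content = http_get(self.link)


def fake_http_get(url: str) -> Pair[str, str]:
    # Hypothetical stand-in for a real HTTP GET; returns (type, content).
    return "service", f"<service>description fetched from {url}</service>"


t = WsdaTuple(link="http://sched001.cern.ch/getServiceDescription",
              type="service", context="parent",
              timestamps={"TS1": 10, "TC": 15, "TS2": 20, "TS3": 30})
t.refresh(fake_http_get)
print(t.content)
```

Passing the fetch function in as a parameter keeps the sketch runnable without network access; a real client would substitute an actual HTTP GET.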

Tuple Set from Dynamic Data Model
<tupleset>
  <tuple link="http://sched001.cern.ch/getServiceDescription"
         type="service" ctx="parent" TS1="10" TC="15" TS2="20" TS3="30">
    <content>
      <service> service description A goes here </service>
    </content>
    <metadata>
      <owner name="http://cms.cern.ch"/>
    </metadata>
  </tuple>
  <tuple link="http://repcat.cern.ch/pub/getServiceDescription?id=4711"
         type="service" ctx="child" TS1="30" TC="0" TS2="40" TS3="50">
  </tuple>
</tupleset>

Discovery Query (2)
• Find all CMS replica catalogs and return their physical file names (PFNs) for a given logical file name (LFN); suppress PFNs not starting with "ftp://".
LET $repcat := "http://gridforum.org/interface/ReplicaCatalog-1.0"
FOR $tuple in /tupleset/tuple[@type="service"]
LET $s := $tuple/content/service
WHERE
  SOME $op IN $s/interface[@type = $repcat]/operation SATISFIES
    ($op/name="XML getPFNs(String LFN)" AND $op/bindhttp/@verb ="GET"
     AND contains($op/allow, "http://cms.cern.ch/everybody"))
RETURN
  FOR $pfn IN invoke($s, $repcat, "XML getPFNs(String LFN)",
                     "http://myhost.cern.ch/myFile")/tupleset/PFN
  WHERE starts-with($pfn, "ftp://")
  RETURN $pfn
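
As a rough illustration, the tuple set and the final filtering step of Discovery Query (2) can be mimicked in Python with the standard library. This is only a sketch: the remote `invoke(...)` call is not reproduced, and the PFN values and helper names (`service_links`, `keep_ftp`) are made up for the example.

```python
import xml.etree.ElementTree as ET

# The example tuple set, with the second tuple written as an empty element.
TUPLESET = """
<tupleset>
  <tuple link="http://sched001.cern.ch/getServiceDescription"
         type="service" ctx="parent" TS1="10" TC="15" TS2="20" TS3="30">
    <content><service>service description A goes here</service></content>
    <metadata><owner name="http://cms.cern.ch"/></metadata>
  </tuple>
  <tuple link="http://repcat.cern.ch/pub/getServiceDescription?id=4711"
         type="service" ctx="child" TS1="30" TC="0" TS2="40" TS3="50"/>
</tupleset>
"""


def service_links(xml_text):
    """Rough equivalent of /tupleset/tuple[@type="service"]: return the links."""
    root = ET.fromstring(xml_text)
    return [t.get("link") for t in root.findall("tuple[@type='service']")]


def keep_ftp(pfns):
    """Rough equivalent of WHERE starts-with($pfn, "ftp://")."""
    return [p for p in pfns if p.startswith("ftp://")]


links = service_links(TUPLESET)
# Hypothetical PFNs, standing in for the result of the remote getPFNs call.
pfns = keep_ftp(["ftp://host-a.example.org/data/myFile",
                 "http://host-b.example.org/data/myFile"])
print(links)
print(pfns)
```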

Query Support
• Simplest possible query support? --> MinQuery interface
– “Select all”-style!
– Return all tuples (including or excluding cached content)
  • XML getTuples()
  • XML getLinks()
• Powerful query support? --> XQuery interface
– XQuery Language!
– Everything that can be done in SQL can be done in XQuery. But XQuery is even more powerful (e.g. hierarchical navigation)
  • XML query(XQuery)

WSDA Interfaces
Interface | Operations | Responsibility
Presenter | HTTP(S) GET on HTTP(S) URL or MIME getServiceDescription() | Retrieve service description. Default MIME content-type: XML
Consumer | (TS4,TS5) publish(XML tupleset) | A content provider can publish a dynamic pointer (content link), which in turn enables the consumer (e.g. hyper registry) to retrieve the current content.
MinQuery | XML getTuples(), XML getLinks() | Simplest possible query support (“select all”)
XQuery | XML query(XQuery) | Powerful query over tuple set
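
The Consumer and MinQuery operations can be sketched as a small in-memory registry. The class name `TinyRegistry` and its storage layout are assumptions; the sketch reads "including or excluding cached content" as the difference between getTuples() and getLinks(), does not model the (TS4,TS5) return value of publish, and leaves out the XQuery interface, for which the Python standard library has no engine.

```python
import xml.etree.ElementTree as ET


class TinyRegistry:
    """Hypothetical in-memory registry offering Consumer and MinQuery operations."""

    def __init__(self):
        self._tuples = {}  # content link -> <tuple> element

    def publish(self, tupleset_xml):
        """Consumer.publish: store or refresh every tuple, keyed by its link."""
        for tup in ET.fromstring(tupleset_xml).findall("tuple"):
            self._tuples[tup.get("link")] = tup

    def get_tuples(self):
        """MinQuery.getTuples: return all tuples, embedded content included."""
        root = ET.Element("tupleset")
        root.extend(list(self._tuples.values()))
        return ET.tostring(root, encoding="unicode")

    def get_links(self):
        """MinQuery.getLinks: return all tuples without their embedded content."""
        root = ET.Element("tupleset")
        for link, tup in self._tuples.items():
            ET.SubElement(root, "tuple", {"link": link, "type": tup.get("type", "")})
        return ET.tostring(root, encoding="unicode")


reg = TinyRegistry()
reg.publish('<tupleset><tuple type="service" '
            'link="http://repcat.cern.ch/pub/getServiceDescription?id=4711">'
            '<content><service/></content></tuple></tupleset>')
print(reg.get_links())
print(reg.get_tuples())
```

Keying the store by content link means that republishing a tuple simply overwrites the previous entry, which matches the "publish & refresh" idea of soft state.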

Discovery Query (1)
• Find all services that implement a replica catalog service interface, that CMS members are allowed to use, and that have an HTTP binding for the replica catalog operation “XML getPFNs(String LFN)”.
LET $repcat := "http://gridforum.org/interface/replicaCatalog-1.0"
FOR $tuple IN /tupleset/tuple[@type="service"]
WHERE SOME $op IN $tuple/content/service/interface[@type = $repcat]/operation
      SATISFIES ($op/name="XML getPFNs(String LFN)" AND
                 $op/bindhttp/@verb="GET" AND
                 contains($op/allow, "http://cms.cern.ch/everybody"))
RETURN $tuple

Client and WSDA Interfaces
[Figure: a remote client invokes the four interfaces Presenter (HTTP GET or getSrvDesc()), Consumer (publish(...)), MinQuery (getTuples(), getLinks()) and XQuery (query(...)), which operate on the tuples T1 ... Tn. Tuple 1 ... Tuple N each hold a content link to Content 1 ... Content N, served by Presenter 1 ... Presenter N. Legend: Interface, Invocation, Content Link.]
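
For readers less familiar with XQuery, the SOME ... SATISFIES condition of Discovery Query (1) can be approximated in Python. This is a sketch under assumptions: the tuple set below is invented, with element names chosen to follow the paths used in the query (interface/@type, operation/name, bindhttp/@verb, allow), and the second service is purely hypothetical.

```python
import xml.etree.ElementTree as ET

REPCAT = "http://gridforum.org/interface/replicaCatalog-1.0"

# Hypothetical tuple set; element names follow the paths used in the query.
TUPLESET = f"""
<tupleset>
  <tuple link="http://repcat.cern.ch/pub/getServiceDescription?id=4711" type="service">
    <content><service>
      <interface type="{REPCAT}">
        <operation>
          <name>XML getPFNs(String LFN)</name>
          <bindhttp verb="GET"/>
          <allow>http://cms.cern.ch/everybody</allow>
        </operation>
      </interface>
    </service></content>
  </tuple>
  <tuple link="http://sched001.cern.ch/getServiceDescription" type="service">
    <content><service>
      <interface type="http://example.org/interface/Scheduler-1.0"/>
    </service></content>
  </tuple>
</tupleset>
"""


def matches(op):
    """The SATISFIES clause: operation name, HTTP GET binding, CMS allowed."""
    bind = op.find("bindhttp")
    return (op.findtext("name") == "XML getPFNs(String LFN)"
            and bind is not None and bind.get("verb") == "GET"
            and "http://cms.cern.ch/everybody" in (op.findtext("allow") or ""))


def discover(xml_text):
    """Return the links of all service tuples with at least one matching operation."""
    hits = []
    for tup in ET.fromstring(xml_text).findall("tuple[@type='service']"):
        ops = [op
               for itf in tup.findall("content/service/interface")
               if itf.get("type") == REPCAT
               for op in itf.findall("operation")]
        if any(matches(op) for op in ops):  # SOME $op ... SATISFIES ...
            hits.append(tup.get("link"))
    return hits


print(discover(TUPLESET))
```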

OGSA vs. WSDA (1)
Concept | WSDA | OGSA
Interfaces | Presenter, Consumer, MinQuery, XQuery, TriggerXQuery (tbd.) | GridService, HandleMap, Registry, NotificationSink, NotificationSource, Factory, PrimaryKey
Service identifier | Service link (i.e. content link) = HTTP(S) URL; need not be unique | Grid Service Handle (GSH) = HTTP(S) URL with restrictions; must be unique
Service description | Service description (e.g. WSDL) | Grid Service Reference (GSR) (e.g. WSDL)
Service description retrieval | via HTTP(S) GET or Presenter.getServiceDescription() | via HTTP(S) GET or HandleMap.findByHandle(GSH)

OGSA vs. WSDA (2)
Concept | WSDA | OGSA
Multi-purpose data container | Tuple | Service Data Element
Set of data containers | Tuple set | Collection of service data elements
Query capability | MinQuery.getLinks(), MinQuery.getTuples(), XQuery.query(XQuery) | GridService.FindServiceData(XML query)
Data publication | (TS4,TS5) Consumer.publish(XML tupleset) | Registry.RegisterService(handle), NotificationSink.deliverNotification(sdata)
Mandatory interfaces | none | GridService

Tuple vs. Service Data Element (1)
• Service Data Element…
– is a named multi-purpose soft state data container that may contain a piece of arbitrary XML content (value). May contain an arbitrary extensibility element as content.
– Attributes added in May spec
  • + Global name (i.e. QName), + Type
• WSDA Tuple…
– is an annotated multi-purpose soft state data container that may contain a piece of arbitrary MIME content and allows for refresh of that content at any time (default content-type is XML)
– Has as attributes a content link, a type, a context, four soft state time stamps, and
– (optionally) two arbitrary-shaped extensibility elements, namely metadata and content.

Tuple vs. Service Data Element (2)
Concept | WSDA Tuple | OGSA Service Data Element
Dynamic Pointer / ID | Content link = dynamic pointer | globalName (+May spec.) = ID
What? Content-type | Type | Type (+May spec.)
Why? How? Publication purpose/usage | Context | Name (?)
When? Lifetime | 4 timestamps | 3 timestamps
More annotations | Metadata (optional) | Not available
Embedded data | Content (optional) | Content (optional)

Soft State Time Stamps
Time Stamp | WSDA | OGSA
TS1 / goodFrom | Time content provider last modified content | Time from which the value of the SDE carried in its extensibility element is said to be valid.
TC | Time embedded tuple content was last modified (e.g. by an intermediary) | Not available
TS2 / goodUntil | Expected time while current content at provider is at least valid | Time until which the value of the SDE in its extensibility elements is said to be valid.
TS3 / avail.Until | Expected time while content link at provider is at least valid (alive) | Time until which this named SDE is expected to be available.

Hyper Registry vs. MDS
[Figure: two panels. a) Content Provider and Hyperlink Registry: remote clients query a registry backed by a DB; the content provider (publisher, presenter, mediator, content source) (re)publishes a content link without content or with content (push) via HTTP POST, and the registry retrieves content (pull) via HTTP GET. b) Content Provider and GRIS: remote clients query a GRIS with a cache, which retrieves content (pull) via execution of a local program (executable, content source) at the content provider.]

Unified Peer-to-Peer Database Framework (UPDF)
• Q: Can we devise a unified P2P database framework …
– for general-purpose query support
– in large heterogeneous distributed systems
– spanning many administrative domains?
• Q: Can we devise a framework that allows specific applications to be expressed for a wide range of …
– data types (typed or untyped XML, any MIME type)
– node topologies (e.g. ring, tree, graph)
– query languages (e.g. XQuery, SQL, LDAP)
– query response modes (e.g. Routed, Direct and Referral Response)
– neighbor selection policies (e.g. in the form of an XQuery)
– pipelining characteristics, timeout and other scope options?
• Answer: Yes --> Unified P2P DB Framework

Example Content Providers
[Figure: four example content providers, each with "publish & refresh" and "retrieve" arrows. Provider labels: cron job, Apache, XML file(s); monitor thread, servlet; cron job, Perl HTTP; java mon, servlet. Content source labels (converted "to XML"): Replica catalog service(s); RDBMS or LDAP; cat /proc/cpuinfo, uname, netstat; (re)compute service description(s).]

Query Response Modes
[Figure: six panels (a to f) showing numbered message flows between originator, agent node and nodes. Legend: Node, Agent Node, Originator, Query, Result set, Invitation, Data Query, Data. Panel titles: Routed Response (RR); Direct Response without Invitation; Direct Response with Invitation (DR); Routed Response with Metadata (RRM); Direct Metadata Response without Invitation; Direct Metadata Response with Invitation (DRM).]

Soft State Transitions
[Figure: state diagram with the states UNKNOWN, NOT CACHED and CACHED. Transition labels: Publish with content (push); Retrieve (pull); currentTime > TS2; TS1 > TC.]
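
One plausible reading of the soft state diagram, combined with the definitions of TS1, TC, TS2 and TS3 from the Soft State Time Stamps table, is sketched below. The exact transitions are not recoverable from the slide text, so the rules here are assumptions: in particular, treating `now > TS3` as a return to UNKNOWN is inferred from the meaning of TS3 rather than stated, and the function name `classify` is made up.

```python
from enum import Enum


class CacheState(Enum):
    UNKNOWN = "UNKNOWN"
    NOT_CACHED = "NOT CACHED"
    CACHED = "CACHED"


def classify(now, ts1, tc, ts2, ts3, has_content):
    """Classify a tuple's soft state (an assumed interpretation of the diagram).

    ts1: time the provider last modified the content
    tc:  time the embedded (cached) content was last modified
    ts2: time until which the current content is expected to be valid
    ts3: time until which the content link is expected to be alive
    """
    if now > ts3:
        return CacheState.UNKNOWN      # assumption: link no longer expected alive
    if not has_content:
        return CacheState.NOT_CACHED   # link known, content neither pushed nor pulled
    if now > ts2 or ts1 > tc:
        return CacheState.NOT_CACHED   # cached copy expired or older than provider's
    return CacheState.CACHED


# Values taken from the first tuple of the example tuple set.
print(classify(now=12, ts1=10, tc=15, ts2=20, ts3=30, has_content=True))
print(classify(now=25, ts1=10, tc=15, ts2=20, ts3=30, has_content=True))
```

Under this reading, a registry would pull the content again (or wait for a push) whenever a tuple falls back to NOT CACHED.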

Tuples Partitioned over Registry Nodes - Topology
[Figure: tuples partitioned over registry nodes.]

Response Mode Switches and Shifts
• No need to mandate a single response mode globally
• Response modes can be permuted arbitrarily
• For autonomy, scalability, availability, performance, security, etc.
[Figure: three panels showing query and result set flow between originator, agent node and nodes: a) RR --> DR Switch; b) DR --> RR Switch; c) DR --> DR Shift.]

Template Execution Plan
• Any query can be answered by appropriate substitutions into the template
[Figure: operator tree of the template execution plan with the nodes A, SEND, M, U, L, RECEIVE1 ... RECEIVE k and N.]
A ... Agent Plan
L ... Local Query
M ... Merge Query
N ... Neighbor Query
Q ... User Query
U ... Unionizer Operator

Permitted Message Exchanges
• MSG_QUERY --> RPY_OK | ERR
• MSG_RECEIVE --> RPY_SEND | (ANS_SEND [0:N], NULL) | ERR
• MSG_INVITE --> RPY_OK | ERR
• MSG_CLOSE --> RPY_OK | ERR
• Supports synchronous (pull) and asynchronous (push)
• Supports batched iterators
– RECEIVE/SEND batches of at least N and at most M results from the (remainder of the) result set
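
The permitted message exchanges and the batched iterator idea can be sketched as follows. This is an illustration only, not the PDP implementation: the function names are invented, "(ANS_SEND [0:N], NULL)" is read as zero or more ANS_SEND messages terminated by NULL, and the batching helper models only the upper bound M, not the lower bound N.

```python
# Replies that may answer a request as a single message.
SINGLE_REPLIES = {
    "MSG_QUERY":   {"RPY_OK", "ERR"},
    "MSG_RECEIVE": {"RPY_SEND", "ERR"},
    "MSG_INVITE":  {"RPY_OK", "ERR"},
    "MSG_CLOSE":   {"RPY_OK", "ERR"},
}


def is_valid_reply(request, replies):
    """Check a list of reply messages against the permitted exchanges."""
    if request not in SINGLE_REPLIES:
        return False
    if len(replies) == 1 and replies[0] in SINGLE_REPLIES[request]:
        return True
    # MSG_RECEIVE may also be answered by ANS_SEND [0:N] followed by NULL.
    if request == "MSG_RECEIVE" and replies and replies[-1] == "NULL":
        return all(r == "ANS_SEND" for r in replies[:-1])
    return False


def batches(results, max_size):
    """Yield the result set in successive batches of at most max_size results."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for start in range(0, len(results), max_size):
        yield results[start:start + max_size]


print(is_valid_reply("MSG_RECEIVE", ["ANS_SEND", "ANS_SEND", "NULL"]))
for batch in batches(["pfn1", "pfn2", "pfn3", "pfn4", "pfn5"], max_size=2):
    print(batch)
```

Each yielded batch would correspond to one SEND; the consumer keeps issuing RECEIVE until the remainder of the result set is exhausted.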

Peer Database Protocol (PDP)
• Messaging model and network protocol that supports the UPDF framework and the XQuery interface
• Fully based on the BEEP IETF standard
• Transaction
– consists of one or more discrete message exchanges related to the same query
• Messages
– QUERY, RECEIVE, SEND, INVITE, CLOSE

Node State Transitions
[Figure: state diagram with the states UNKNOWN, OPEN and CLOSED. Transition labels: QUERY; Loop timeout; and the numbered conditions below. Trigger actions: Forward QUERY to neighbors; Forward CLOSE to dependents.]
1. CLOSE received
2. SEND exhausts result set
3. INVITE not accepted (Direct Response, non-empty result set)
4. True (Direct Response, empty local result set)
5. Various errors
6. Abort timeout
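
The node state transitions can be written down as a small transition table. Because the diagram itself is not recoverable from the slide text, the wiring below is an assumed reading: QUERY moves a node from UNKNOWN to OPEN, any of the six numbered conditions moves it from OPEN to CLOSED, and the loop timeout returns it to UNKNOWN. The pairing of trigger actions with transitions and the shortened event names are likewise assumptions.

```python
# Shortened names for the six numbered closing conditions (hypothetical labels).
CLOSING_EVENTS = [
    "CLOSE received",
    "SEND exhausts result set",
    "INVITE not accepted",
    "empty local result set",
    "error",
    "abort timeout",
]

# (state, event) -> (next state, trigger action or None)
TRANSITIONS = {
    ("UNKNOWN", "QUERY"): ("OPEN", "forward QUERY to neighbors"),
    ("CLOSED", "loop timeout"): ("UNKNOWN", None),
}
for event in CLOSING_EVENTS:
    TRANSITIONS[("OPEN", event)] = ("CLOSED", "forward CLOSE to dependents")


def step(state, event):
    """Return (next_state, action); unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, None))


state = "UNKNOWN"
for event in ["QUERY", "SEND exhausts result set", "loop timeout"]:
    state, action = step(state, event)
    print(event, "->", state, "| action:", action)
```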

PDP Properties
• Low latency, pipelining, early and/or partial result set retrieval
– due to synchronous pull, and result set delivery in one or more variable-sized batches.
• Efficient
– due to asynchronous push with delivery of multiple results per batch.
• Resource consumption and flow control on a per-query basis
– due to the use of a distinct channel per transaction.
• Scalable
– due to application multiplexing, which allows for very high query concurrency and very low latency, even in the presence of secure TCP connections.

Summary: WSDA in One Slide
[Figure: a remote client invokes the four interfaces Presenter (HTTP GET or getSrvDesc()), Consumer (publish(...)), MinQuery (getTuples(), getLinks()) and XQuery (query(...)), which operate on the tuples T1 ... Tn. Tuple 1 ... Tuple N each hold a content link to Content 1 ... Content N, served by Presenter 1 ... Presenter N. Legend: Interface, Invocation, Content Link.]

Questions?
• Input for improvements of OGSA
• More information
– http://cern.ch/grid-data-management/
– http://www.edg.org
• Contacts
– wolfgang.hoschek@cern.ch
– peter.kunszt@cern.ch