Best Practices in Distributed Workflows
and Web Services
Rob Allan, Asif Akram, David Meredith
CCLRC Daresbury Laboratory
WOSE
19th October 2006, NeSC, Edinburgh
WOSE Project
• Workflow Optimisation for Services in e-Science
• EPSRC-funded project
• In collaboration with Imperial College and Cardiff University
• CCLRC is investigating user requirements
• Developing use cases based on existing e-Science applications
• e-HTPX: An e-Science Resource for High Throughput Protein Crystallography
http://www.grids.ac.uk/WOSE
Presenter Name
Facility Name
WOSE Workshop, 19th October 2006, National e-Science Centre, Edinburgh
Best Practices for Workflows
• Best practices for interoperable Web Services
• Modular workflow design
• Hierarchical workflow exception handling
• Compensation mechanisms
• Build adaptivity and flexibility into workflow
• Account for workflow management
Best Practices for Workflows
• Best practices for interoperable Web Services
  a) Selection of WS style
  b) Abstraction of data model
  c) When to use ‘loosely’ or ‘strongly’ typed Web services
  d) Approach to writing Web services (code first / WSDL first)
• Modular workflow design
• Hierarchical workflow exception handling
• Compensation mechanisms
• Build adaptivity and flexibility into workflow
• Account for workflow management
a) Selection of WS style
• RPC/encoded: deprecated; use only for legacy purposes.
• RPC/literal: WS-I compliant, but limited for validating complex data (due to namespace issues).
• Document/literal (wrapped): the best choice (100% WS-I compliant). Supports modelling and validation of complex data from different namespaces in plain XML instance documents.
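To make the wrapped convention concrete, the following is a minimal sketch of a document/literal (wrapped) request built with Python's standard library. The operation name `submitSample`, its parameters, and the namespace `http://example.org/crystallography` are hypothetical and do not come from the e-HTPX service. The point it illustrates is that the SOAP body carries exactly one child element, named after the operation, with the parameters nested inside it.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/crystallography"          # hypothetical application namespace


def build_wrapped_request(operation, params):
    """Build a SOAP envelope whose body holds a single wrapper element."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # Wrapped style: one child of Body, named after the operation.
    wrapper = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(wrapper, f"{{{APP_NS}}}{name}")
        child.text = str(value)
    return envelope


request = build_wrapped_request("submitSample", {"sampleId": "S-001", "priority": 2})
body = request.find(f"{{{SOAP_NS}}}Body")
print(ET.tostring(request, encoding="unicode"))
```

Because the whole body is a single plain XML element, it can in principle be validated against a schema element of the same name, which is what makes this style convenient for complex data.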
b) Abstraction of data model
Separate the XML Schema elements and complex types defined within the <wsdl:types> section of the WSDL file into separate files.
This is recommended because data can then be modelled in different documents according to namespace and domain / data requirements.
WSDL file:
  <wsdl:definitions …>
    <wsdl:types>
      <xsd:schema …>
        <xsd:import namespace="…" schemaLocation="BulkMRFinal.xsd"/>
      </xsd:schema>
    </wsdl:types>
    ..
  </wsdl:definitions>

Schema file (combines further schema files):
  <schema …>
    <xsd:include schemaLocation="schema1.xsd"/>
    <xsd:import schemaLocation="schema2.xsd"/>
    …
  </schema>

Referenced schema files: Schema1.xsd, Schema2.xsd
b) Abstraction of Data Model (Benefits)
• Separation of roles: the data model can be developed in isolation from the WSDL.
• Schema / data model re-usability: share and re-use schema rather than re-designing schema for each new WS.
• Isolation of changing components: the data model can be subject to change. Its isolation limits the impact on other WS components such as the concrete WSDL file.
• The data model avoids dependencies on WSDL and SOAP namespaces: good data model design.
• Full Schema functionality: XML Schema is powerful (e.g. xsd patterns / regular expressions, optional elements, enumerations, type restrictions, etc.).
• Extensible modelling: a schema can be extended without breaking software that uses the original schema, through the use of xsd:any / xsd:anyType (wildcard / placeholder for extending the schema where necessary).
b) Abstraction of Data Model
Abstract the XML Schema(s) used in the <wsdl:types> section of the WSDL file into separate files:
1) Model data logically in separate schema files, with different namespaces as required by the data model.
2) Combine the separate schema files as required using <xsd:include> and <xsd:import>.
3) Import the schema file(s) into the WSDL file.
4) This reduces the size and complexity of the WSDL file.
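The outcome of these steps can be checked mechanically. The sketch below, using only Python's standard library, parses a small WSDL document and reports which schema files it pulls in and how many type definitions remain inline. The WSDL text, the file name `data-model.xsd`, and the `http://example.org/...` namespaces are hypothetical placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical WSDL whose <wsdl:types> section only imports an external schema.
WSDL_TEXT = """<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/service">
  <wsdl:types>
    <xsd:schema>
      <xsd:import namespace="http://example.org/data"
                  schemaLocation="data-model.xsd"/>
    </xsd:schema>
  </wsdl:types>
</wsdl:definitions>"""


def external_schemas(wsdl_text):
    """Return (kind, namespace, schemaLocation) for each import/include."""
    root = ET.fromstring(wsdl_text)
    found = []
    for schema in root.iter(f"{{{XSD_NS}}}schema"):
        for kind in ("import", "include"):
            for ref in schema.findall(f"{{{XSD_NS}}}{kind}"):
                found.append((kind, ref.get("namespace"), ref.get("schemaLocation")))
    return found


def inline_definition_count(wsdl_text):
    """Count element and type definitions still written inline in the WSDL."""
    root = ET.fromstring(wsdl_text)
    count = 0
    for schema in root.iter(f"{{{XSD_NS}}}schema"):
        for kind in ("element", "complexType", "simpleType"):
            count += len(schema.findall(f"{{{XSD_NS}}}{kind}"))
    return count


print(external_schemas(WSDL_TEXT))
print(inline_definition_count(WSDL_TEXT))
```

An inline count of zero indicates that the data model lives entirely in the external schema files, which is the state the steps above aim for.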
c) When to use ‘loose’ or ‘strong’ typing
A strongly typed WS:
Defines a WSDL with a complete definition of a WS operation’s input and output messages in XML Schema, with tight constraints on the allowed values.
• xsd:pattern, xsd:restriction, xsd:enumeration, xsd:sequence, etc.
Advantages:
• Well-defined service interface: all information needed to invoke the service is encapsulated within the WSDL (client and automation friendly)
• Strong control over the data that enters the business logic of the service
Disadvantages:
• Requires a working knowledge of XML Schema
• Resistant to change in the data model
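The effect of strong typing can be imitated in code. The sketch below is not XML Schema validation; it emulates an xsd:pattern and an xsd:enumeration with Python's standard library so that invalid data is rejected before it reaches any business logic. The request type `SubmitSampleRequest`, the identifier format `S-` followed by three digits, and the priority values are all hypothetical.

```python
import re
from dataclasses import dataclass

# Emulates an xsd:pattern facet. XSD patterns match the whole value,
# so fullmatch() is used below rather than search().
SAMPLE_ID_PATTERN = re.compile(r"S-\d{3}")
# Emulates an xsd:enumeration facet.
ALLOWED_PRIORITIES = {"low", "normal", "high"}


@dataclass(frozen=True)
class SubmitSampleRequest:
    """Hypothetical strongly typed input message."""
    sample_id: str
    priority: str

    def __post_init__(self):
        if not SAMPLE_ID_PATTERN.fullmatch(self.sample_id):
            raise ValueError(f"sample_id {self.sample_id!r} does not match the pattern")
        if self.priority not in ALLOWED_PRIORITIES:
            raise ValueError(f"priority {self.priority!r} is not an allowed value")


def try_build(sample_id, priority):
    """Return the request, or the rejection message as a string."""
    try:
        return SubmitSampleRequest(sample_id, priority)
    except ValueError as exc:
        return str(exc)


print(try_build("S-001", "high"))
print(try_build("sample one", "high"))
print(try_build("S-001", "urgent"))
```

The disadvantage listed above is visible here too: adding a new priority value means changing the type definition and redistributing it to every client.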
c) When to use ‘loose’ or ‘strong’ typing
A loosely typed WS:
Uses generic data types in the WSDL to ‘wrap’ data of other formats:
• String: can wrap markup fragments (e.g. XML, name-value pairs, SQL)
• xsd:any / xsd:anyType: placeholders for embedding arbitrary XML
• SOAP attachments (used to send data of any format, e.g. binary)
Advantages:
• Flexible: a single WS can handle multiple types of message and message data
• Easy to develop
Disadvantages:
• Incomplete WSDL interface: requires manual negotiation between client and service to establish the format of the data wrapped by the loose type.
• Prone to message exceptions (requires the WS to be very tolerant in what it accepts)
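A loosely typed operation can be sketched as a function whose interface promises nothing more than a string. In the minimal Python example below, the message kinds `sample` and `query` and the function name `handle_loose_message` are hypothetical; the format of the wrapped data is exactly the part that client and service would have to agree on outside the WSDL.

```python
import xml.etree.ElementTree as ET


def handle_loose_message(payload: str) -> dict:
    """Accept any string and work out at runtime what it contains."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as exc:
        # The service must tolerate malformed input rather than crash.
        return {"status": "rejected", "reason": f"not well-formed XML: {exc}"}
    if root.tag == "sample":
        return {"status": "accepted", "kind": "sample", "id": root.get("id")}
    if root.tag == "query":
        return {"status": "accepted", "kind": "query", "text": (root.text or "").strip()}
    return {"status": "rejected", "reason": f"unknown message type: {root.tag}"}


print(handle_loose_message('<sample id="S-001"/>'))
print(handle_loose_message("<query> list samples </query>"))
print(handle_loose_message("<order/>"))
print(handle_loose_message("not xml at all"))
```

One operation handles several message types, which is the flexibility advantage, but all checking has moved out of the interface and into hand-written code inside the service.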
d) Approach to writing WS
Two Choices:
a) Code first (‘bottom up’)
b) WSDL first (‘contract driven’)
a) Code first (generating the WSDL from the WS implementation)
Advantages:
• Simple
Disadvantages:
• WSDL files created from source code are always loosely typed (e.g. a source-code String is just a String; xsd:pattern or xsd:restriction constraints cannot be imposed when starting from a String).
• WSDL auto-generation tools can introduce technical dependencies on the implementation language.
• Not all language-specific data types can be mapped into interoperable XML (‘the Object – XML mismatch’).
d) Approach to writing WS
b) WSDL first or ‘contract driven’ approach (generating the WS operation from the WSDL file)
Advantages:
• The authored WSDL can be strongly or loosely typed where necessary
• Interoperability issues between different languages are avoided by starting with plain XML
• Service implementation classes can be auto-generated from the WSDL
Disadvantages:
• The developer requires a reasonable knowledge of XML Schema and WSDL
Modular Design
• Modules operate as “object oriented black boxes” in the overall workflow, with their own:
  – variables,
  – computational logic,
  – dependency constraints,
  – event handlers.
• Group similar or related activities
• Module components should be scalable and replaceable (to serve as repeatable units)
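One way to picture such a module is as a small object that bundles the four ingredients listed above. The following Python sketch is an illustration only, not taken from any workflow engine; the class `Module`, the runner `run_workflow`, and the `collect` / `process` steps are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Module:
    """A workflow module treated as a black box."""
    name: str
    logic: Callable[[dict], dict]                    # computational logic
    depends_on: Tuple[str, ...] = ()                 # dependency constraints
    variables: dict = field(default_factory=dict)    # module-private variables
    events: List[str] = field(default_factory=list)  # record of handled events

    def on_event(self, event: str) -> None:
        """Event handler: here it only records the event."""
        self.events.append(event)

    def run(self, inputs: dict) -> dict:
        self.on_event("started")
        self.variables["last_inputs"] = dict(inputs)
        outputs = self.logic(inputs)
        self.on_event("completed")
        return outputs


def run_workflow(modules: List[Module], inputs: dict) -> dict:
    """Run modules in order, refusing to run one whose dependencies are unmet."""
    done = set()
    data = dict(inputs)
    for module in modules:
        missing = [dep for dep in module.depends_on if dep not in done]
        if missing:
            raise RuntimeError(f"{module.name} is missing dependencies: {missing}")
        data.update(module.run(data))
        done.add(module.name)
    return data


collect = Module("collect", lambda d: {"frames": d["crystal"] * 10})
process = Module("process", lambda d: {"dataset": d["frames"] + 1}, depends_on=("collect",))
result = run_workflow([collect, process], {"crystal": 3})
print(result)
```

Because the runner only relies on `name`, `depends_on`, and `run`, any module can be swapped for another with the same name and outputs without touching the rest of the workflow.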
Exception Handling
Types of exception:
• “expected exceptions” (“variations”)
• “unexpected exceptions”
• Exception handlers can be defined at different hierarchical levels:
  – Global (unexpected)
    – Scoped (expected)
      – Inline (expected)
• Exceptions should be handled at the lowest available level; unrecognized exceptions are passed to a higher level.
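The three levels map naturally onto nested try/except blocks. The sketch below uses plain Python rather than a workflow language, and the exception classes `ExpectedVariation` and `ScopeFault` are hypothetical. Each level catches only what it recognises, and anything else propagates upwards until the global handler takes it.

```python
class ExpectedVariation(Exception):
    """An expected exception that the invoking activity can absorb."""


class ScopeFault(Exception):
    """An expected exception that only the enclosing scope can handle."""


handled_at = []  # records which level handled each fault


def invoke(fault=None):
    """Innermost level: the inline handler deals with known variations."""
    try:
        if fault is not None:
            raise fault
        return "ok"
    except ExpectedVariation:
        handled_at.append("inline")
        return "default value"


def scope(fault=None):
    """Middle level: the scoped handler catches faults the invoke let through."""
    try:
        return invoke(fault)
    except ScopeFault:
        handled_at.append("scope")
        return "scope recovered"


def process(fault=None):
    """Outermost level: the global handler catches anything unexpected."""
    try:
        return scope(fault)
    except Exception:
        handled_at.append("global")
        return "aborted"


print(process())
print(process(ExpectedVariation()))
print(process(ScopeFault()))
print(process(KeyError("unplanned")))
print(handled_at)
```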
Exception Handling
[Diagram: nested fault handlers. The process has global fault handlers (catch blocks); a scope inside the process has scoped fault handlers (catch blocks); an invoke activity inside the scope has inline fault handlers (catch blocks).]
Compensation Mechanism
• “Undoing steps that have already completed successfully”
• Reversing the effects of successful activities in the abandoned workflow
• In a sequence A + B + C, if C fails the workflow may need to roll back to A
• Providing compensation is the responsibility of the service.
• Long-running processes require some form of compensation.
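A common way to organise compensation is to pair every step with an undo action and, on failure, run the undo actions of the completed steps in reverse order. The Python sketch below shows that idea under simplifying assumptions: the function `run_with_compensation` and the steps A, B, C are hypothetical, the undo actions stand in for compensation operations that the partner services would provide, and every completed step is undone (a real workflow might stop at an earlier consistent point).

```python
def run_with_compensation(steps):
    """Run (name, action, compensate) triples; undo completed steps on failure."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"{name} failed")
            # Compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undid {done_name}")
            return False, log
        log.append(f"did {name}")
        completed.append((name, compensate))
    return True, log


def succeed():
    pass


def fail():
    raise RuntimeError("service unavailable")


ok, log = run_with_compensation([
    ("A", succeed, succeed),
    ("B", succeed, succeed),
    ("C", fail, succeed),
])
print(ok, log)
```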
Adaptive Workflow
• A workflow can be static or dynamic, depending on whether changes are accommodated at design time or at runtime.
• Adaptivity and flexibility cater for adjustments and modifications without breaking a workflow.
• Provide “place holders” for future extensions, e.g. replaceable “empty activities”.
• Externalize changeable data, e.g. partner service endpoints specified in configuration files.
• Use generic / loose data types (rather than strongly typed data).
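Two of these points, place holders and externalized endpoints, can be sketched together. In the Python example below the configuration is a JSON string standing in for a configuration file; the class `AdaptiveWorkflow`, the slot names, and the endpoint URL are hypothetical, and no network call is made.

```python
import json

# Stands in for a configuration file; the endpoint URL is a made-up example.
CONFIG_TEXT = '{"endpoints": {"partner": "http://example.org/services/partner"}}'


def empty_activity(data):
    """Place holder: does nothing, but reserves a slot for a future extension."""
    return data


class AdaptiveWorkflow:
    def __init__(self, config_text):
        # Endpoints come from configuration, not from the workflow definition.
        self.endpoints = json.loads(config_text)["endpoints"]
        self.slots = {
            "pre": empty_activity,
            "main": self._call_partner,
            "post": empty_activity,
        }

    def _call_partner(self, data):
        # A real workflow would invoke the partner service here.
        return {**data, "sent_to": self.endpoints["partner"]}

    def replace(self, slot, activity):
        """Swap the activity in an existing slot without changing the flow."""
        if slot not in self.slots:
            raise KeyError(slot)
        self.slots[slot] = activity

    def run(self, data):
        for activity in self.slots.values():
            data = activity(data)
        return data


workflow = AdaptiveWorkflow(CONFIG_TEXT)
before = workflow.run({"id": 1})
workflow.replace("post", lambda d: {**d, "audited": True})
after = workflow.run({"id": 1})
print(before)
print(after)
```

Pointing the workflow at a different partner service only requires editing the configuration, and filling the `post` slot does not disturb the activities around it.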
Management of Workflow
• Workflow management involves:
  – Breakpoints: arbitrary locations crucial to the process
  – Steering points: e.g. reset, re-schedule, restart, undo, abort, complete, recover, ignore or jump operations
  – Persistence: state recorded at breakpoints and steering points (allows re-execution of the workflow from a prior state, and recording / inspection of prior state data values)
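Persistence at breakpoints can be sketched as saving a serialized copy of the workflow state after selected steps, so that a later run can resume from there. The Python example below keeps checkpoints in memory as JSON strings; a real system would write them to durable storage. The class `CheckpointedWorkflow` and the step names are hypothetical.

```python
import json


class CheckpointedWorkflow:
    def __init__(self, steps, breakpoints):
        self.steps = steps                   # ordered list of (name, function)
        self.breakpoints = set(breakpoints)  # step names after which state is saved
        self.checkpoints = {}                # step name -> serialized state

    def run(self, state, start_after=None):
        """Run all steps, or only those after `start_after` when resuming."""
        started = start_after is None
        for name, step in self.steps:
            if not started:
                started = (name == start_after)
                continue
            state = step(state)
            if name in self.breakpoints:
                self.checkpoints[name] = json.dumps(state)
        return state

    def restart_from(self, name):
        """Steering operation: re-execute from the state recorded at `name`."""
        state = json.loads(self.checkpoints[name])
        return self.run(state, start_after=name)


workflow = CheckpointedWorkflow(
    steps=[
        ("load", lambda s: {**s, "n": 1}),
        ("double", lambda s: {**s, "n": s["n"] * 2}),
        ("add", lambda s: {**s, "n": s["n"] + 3}),
    ],
    breakpoints=["double"],
)
final = workflow.run({})
saved = json.loads(workflow.checkpoints["double"])  # inspect prior state values
resumed = workflow.restart_from("double")
print(final, saved, resumed)
```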