Grid Data Services (GDS), The Engine & Activities
Tom Sugden (EPCC), James Magowan (IBM)
Overview
• This presentation discusses the low-level aspects of OGSA-DAI R2, including:
– The GDS (Part 1)
– The Engine (Part 1)
– Perform Documents (Part 1)
– Activities (Part 2)
– Role Mapping (Part 2)
– Implementing an Activity (Part 2)
• Starting point:
– The GDSF has created a GDS
Part 1 - GDS, The Engine, Perform Documents
The GDS
[Figure: a Perform Document is sent over the Transport to the Grid Data Service (GDS); the GDS accesses the Database and returns a Response Document.]
Components of the GDS
[Figure: the GDS contains the Engine; the Engine manages Activity Handlers, each managing an Activity; the Activities access the Database.]
The Engine
• The Engine is the core component of a GDS
– Receives a perform document from the GDS
– Identifies the required Activity implementations
– Executes the Activities in the correct sequence
– Combines the results to form a response document
– Returns the response document to the GDS
Perform Document
…
<xActivity name="x">…</xActivity>
<yActivity name="y">…</yActivity>
…
Activity Implementations
XActivity
YActivity
Response Document
…
<result name="x">…</result>
<result name="y">…</result>
…
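The Engine's steps on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not OGSA-DAI code: the class `EngineSketch`, its `register` and `parse` helpers, the `<response>` wrapper element and the use of plain functions in place of Activity classes are all hypothetical, and activities are simply run in document order.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the Engine; none of these names are OGSA-DAI API.
class EngineSketch {
    // Activity element name -> code producing that activity's result text.
    private final Map<String, Function<Element, String>> activities = new LinkedHashMap<>();

    void register(String elementName, Function<Element, String> activity) {
        activities.put(elementName, activity);
    }

    // Walks the perform document, runs the matching activity for each element
    // (in document order, a simplification) and combines the results.
    String invoke(Document performDoc) {
        StringBuilder response = new StringBuilder("<response>");
        NodeList children = performDoc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and comments
            }
            Element element = (Element) node;
            Function<Element, String> activity = activities.get(element.getTagName());
            if (activity == null) {
                throw new IllegalArgumentException("Unknown activity: " + element.getTagName());
            }
            response.append("<result name=\"").append(element.getAttribute("name")).append("\">")
                    .append(activity.apply(element))
                    .append("</result>");
        }
        return response.append("</response>").toString();
    }

    // Parses an XML string into a DOM Document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse perform document", e);
        }
    }
}
```

Registering one function per element name mirrors the element-name-to-implementation lookup; the real Engine instantiates Activity classes and delegates to Activity Handlers instead.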
Introduction to Activities
• An Activity dictates an action to be performed by a GDS
– e.g. query a database, update a collection, deliver some results
• Each type of Activity has a corresponding:
– XML element type (e.g. sqlQueryStatement)
– XSD schema (e.g. sql_query_statement.xsd)
– Java implementation class (e.g. SQLQueryStatementActivity)
• These details are specified in the Activity Map file
<gdsf:activityMap name="sqlQueryStatement"
class="uk.org.ogsadai…SQLQueryStatementActivity"
schemaFileName="…sql_query_statement.xsd">
</gdsf:activityMap>…
Engine Construction
• When a GDS is created, it instantiates its own Engine
• The Engine constructor takes a Context object (known as the Engine Context) that encapsulates part of the GDS configuration:
– The Activity Map
– The Schema Map
– The Data Resource Implementation Map
– The default Data Resource name
The Activity and Schema Maps
• The Activity Map provides a map of the Activity implementations available to the GDS
– Activity element name → Java implementation class
• The Schema Map provides a map of the corresponding XSD schemas
– Activity element name → XSD schema
Activity Map file:
<gdsf:activityMap name="sqlQueryStatement"
class="uk.org.ogsadai…SQLQueryStatementActivity"
schemaFileName="…sql_query_statement.xsd">…
<validXPathScheme>//gdsf:queryLanguage[@name="SQL"]…
<validXPathLocation>//gdsf:dataManager[@productType="MySQL"]…
GDSF Config file:
…<gdsf:queryLanguage name="SQL" version="SQL92"/>…
…<gdsf:dataManager name="jdbcDBMS" productType="MySQL"/>…
The Data Resource Implementation Map
• The Data Resource Implementation Map manages a collection of Data Resource Implementations
– Data Resource name → DataResourceImplementation instance
• Constructed by the GDSF using the GDSFConfig & DataResourceImplementationMap files
…<gdsf:dataResource name="myDataResource">…
…<gdsf:dataResourceImplementation dataResourceName="myDataResource"
class="uk.org.ogsadai...SimpleJDBCDataResourceImplementation"/>…
Engine Invocation
• When the perform operation of the GridDataService port type is called, the GDS calls the Engine’s invoke method:
invoke(Document performDoc, Map invocationContext)
• The perform document describes the actions for the GDS to perform
• The invocation context contains the distinguished name from the user certificate
Perform Documents
• A perform document contains a series of activities for a GDS to perform
• A client sends a perform document to a GDS via the perform operation of the GridDataService port type
• The root element of a perform document is the <gridDataServicePerform> element
• The top-level elements are:
– <request> defines a named request that should be stored by the GDS for future execution
– <execute> indicates that a specific stored request should be executed
– <terminate> terminates a specified request
Example Perform Document
<gridDataServicePerform
xmlns="http://ogsadai.org.uk/P2R2/schemas/gds">
<request name="myRequest">
<documentation>
Simple select statement, executed and response delivered.
</documentation>
<sqlQueryStatement name="statement">
<dataResource>myDataResource</dataResource>
<expression>select * from littleblackbook</expression>
<webRowSetStream name="myOutput"/>
</sqlQueryStatement>
<deliverToResponse name="d1">
<fromLocal from="myOutput"/>
</deliverToResponse>
</request>
<execute name="myExecute" requestName="myRequest" />
<terminate name="myTerminate" executeName="anotherRequest" />
</gridDataServicePerform>
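To see the shape of such a document programmatically, the following sketch parses an abbreviated copy of the example with the standard JAXP DOM API and lists its top-level elements. The class `PerformDocumentInspector` and its method are hypothetical helpers written for this illustration; they are not part of OGSA-DAI.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class PerformDocumentInspector {
    // Abbreviated copy of the example perform document (documentation element omitted).
    static final String EXAMPLE =
        "<gridDataServicePerform xmlns=\"http://ogsadai.org.uk/P2R2/schemas/gds\">"
      + "<request name=\"myRequest\">"
      + "<sqlQueryStatement name=\"statement\">"
      + "<dataResource>myDataResource</dataResource>"
      + "<expression>select * from littleblackbook</expression>"
      + "<webRowSetStream name=\"myOutput\"/>"
      + "</sqlQueryStatement>"
      + "<deliverToResponse name=\"d1\"><fromLocal from=\"myOutput\"/></deliverToResponse>"
      + "</request>"
      + "<execute name=\"myExecute\" requestName=\"myRequest\"/>"
      + "<terminate name=\"myTerminate\" executeName=\"anotherRequest\"/>"
      + "</gridDataServicePerform>";

    // Returns "<local element name> <name attribute>" for each child of the root.
    static List<String> topLevelElements(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the document uses a default namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> summary = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element element = (Element) node;
                    summary.add(element.getLocalName() + " " + element.getAttribute("name"));
                }
            }
            return summary;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse perform document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(topLevelElements(EXAMPLE));
    }
}
```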
The <request> Element
• The <request> element contains a series of activities to be performed when the request is executed, for example:
– Query a data resource
– Update a data resource
– Parameterise another activity
– Deliver results in a response document
– Deliver results by FTP
• The <request> element must contain a name attribute
• OGSA-DAI supports many such activities and provides an extensibility framework allowing developers to create their own.
The <execute> element
• The <execute> element initiates the execution of a named request
– The request may be contained in the same perform document or in a perform document sent to the GDS previously
– Parameter values for stored activities may be specified within the <execute> element using the optional <withParameter> sub-element
…
<execute name="myExecute" requestName="findCDs">
<withParameter name="artist">Death by Milkfloat</withParameter>
</execute>
…
The <terminate> element
• Having set a request in motion, we can use a <terminate> element to stop the running request
…
<terminate name="myTerminate" executeName="myExecute" />
…
• The value of the executeName attribute must match the name of a previously issued execution (<execute> element).
Validating and Parsing the Perform Document
• When the Engine receives a perform document, it validates the document using the schema map contained in the Engine context
• After validation, the perform document is parsed and, for each element of a known type that it encounters, the Engine:
1. instantiates the corresponding Activity class (specified in the activity map) using the element and the context
2. creates an ActivityHandler to manage the Activity
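Step 1 above, looking up a class name by element name and instantiating it, could be done with Java reflection along the following lines. This is a sketch under assumptions: `MappedActivity`, `EchoActivity` and `ActivityFactorySketch` are hypothetical names, the map is filled by hand rather than read from the Activity Map file, and the context is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical base class standing in for the abstract Activity class.
abstract class MappedActivity {
    protected final Element mElement;

    protected MappedActivity(Element element) {
        mElement = element;
    }
}

// Hypothetical concrete activity used only to demonstrate instantiation.
class EchoActivity extends MappedActivity {
    public EchoActivity(Element element) {
        super(element);
    }
}

class ActivityFactorySketch {
    // Activity element name -> fully qualified implementation class name.
    private final Map<String, String> activityMap = new HashMap<>();

    void map(String elementName, String className) {
        activityMap.put(elementName, className);
    }

    // Finds the class mapped to the element's name and calls its
    // single-argument (Element) constructor reflectively.
    MappedActivity create(Element element) {
        String className = activityMap.get(element.getTagName());
        if (className == null) {
            throw new IllegalArgumentException("No activity mapped for " + element.getTagName());
        }
        try {
            return (MappedActivity) Class.forName(className)
                    .getConstructor(Element.class)
                    .newInstance(element);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    // Convenience for the example: creates a detached DOM element.
    static Element newElement(String name) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .newDocument().createElement(name);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, after `factory.map("echo", EchoActivity.class.getName())`, calling `factory.create(ActivityFactorySketch.newElement("echo"))` yields an `EchoActivity`.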
Engine Processing
• The Engine determines the correct sequence
• Conclusion
Part 2 - Activities
Overview
• Introduction to Activities
• Activity Mapping
• Activity Handlers
• Accessing Data Resources
• Activities provided with OGSA-DAI
• Implementing an Activity
• Example implementation of XPathQueryActivity
Activities
• An Activity dictates an action to be taken by the GDS
• Activities correspond to certain element types in the perform document
– sqlQueryStatement → SQLQueryStatementActivity
– xPathStatement → XPathStatementActivity
– deliverToResponse → DeliverToResponseActivity
• The mapping from element type to Activity class is expressed in the Activity Map file
• The Activity Map file also associates an XSD schema with each activity element type
Example Activity Map File
<gdsf:activityMap name="sqlQueryStatement"
class="uk.org.ogsadai.porttype.gds.activity.sql.SQLQueryStatementActivity"
schemaFileName="http://somewhere/schema/sql_query_statement.xsd">
<namespaces>{http://ogsadai.org.uk/P2R2/schemas/gdsf}gdsf</namespaces>
<validXPathScheme>
//gdsf:queryLanguage[@name="SQL"]
</validXPathScheme>
<validXPathLocation>
//gdsf:dataManager[
@productType="MySQL"|
@productType="DB"|
@productType="Oracle"]
</validXPathLocation>
</gdsf:activityMap>
Activity Handlers
• The Engine uses Activity Handlers to manage Activities
• This design allows activities to be used in different ways without the activity implementer or the engine needing to know the specifics
– SimpleHandler: generates output only when it is required
– RunAheadHandler: can generate output before it is required
• The main jobs of a handler are to manage the inputs and outputs for the activity and to call the processBlock method when necessary
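The difference between the two handler styles can be illustrated with a small sketch. All names here (`BlockActivitySketch`, `SimpleHandlerSketch`, `RunAheadHandlerSketch`, the `lookAhead` parameter) are hypothetical and the design is an assumption made for the example; the real handlers work with BlockReaders and BlockWriters.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical activity: each processBlock() call puts at most one block on the output.
class BlockActivitySketch {
    private final Iterator<String> blocks;
    int calls = 0; // how many times processBlock has run

    BlockActivitySketch(List<String> data) {
        blocks = data.iterator();
    }

    boolean isComplete() {
        return !blocks.hasNext();
    }

    void processBlock(Queue<String> output) {
        calls++;
        if (blocks.hasNext()) {
            output.add(blocks.next());
        }
    }
}

// Produces a block only at the moment a consumer asks for one.
class SimpleHandlerSketch {
    private final BlockActivitySketch activity;
    private final Queue<String> output = new ArrayDeque<>();

    SimpleHandlerSketch(BlockActivitySketch activity) {
        this.activity = activity;
    }

    // Returns the next block, or null when the activity has finished.
    String nextBlock() {
        if (output.isEmpty() && !activity.isComplete()) {
            activity.processBlock(output);
        }
        return output.poll();
    }
}

// Keeps up to lookAhead blocks buffered before they are requested.
class RunAheadHandlerSketch {
    private final BlockActivitySketch activity;
    private final int lookAhead;
    private final Queue<String> output = new ArrayDeque<>();

    RunAheadHandlerSketch(BlockActivitySketch activity, int lookAhead) {
        this.activity = activity;
        this.lookAhead = lookAhead;
    }

    // Tops up the buffer, then returns the oldest block (null when finished).
    String nextBlock() {
        while (output.size() < lookAhead && !activity.isComplete()) {
            activity.processBlock(output);
        }
        return output.poll();
    }
}
```

Neither handler requires any change to the activity itself, which is the point of the design: the activity only knows how to produce one block at a time.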
The Activity Context
• The Activity context is constructed by merging parts of the Engine context with the invocation context, and adding the inputs and outputs
Activity Context
– Engine Context: Data Resource Implementation Map, Default Data Resource Name
– Invocation Context: User Credentials
– Inputs: BlockReaders
– Outputs: BlockWriters
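As a rough sketch of that merge, assuming the contexts behave like simple key/value maps: the class `ActivityContextSketch` and the key strings below are hypothetical and are not the constants OGSA-DAI actually uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of building an Activity context from its parts.
class ActivityContextSketch {
    // Assumed key names, chosen for this example only.
    static final String DATA_RESOURCE_IMPLEMENTATION_MAP = "dataResourceImplementationMap";
    static final String DEFAULT_DATA_RESOURCE = "defaultDataResource";
    static final String PIPES = "pipes.";

    static Map<String, Object> build(Map<String, Object> engineContext,
                                     Map<String, Object> invocationContext,
                                     Map<String, Object> pipes) {
        Map<String, Object> context = new HashMap<>();
        // Copy only the parts of the Engine context an Activity needs.
        for (String key : new String[] { DATA_RESOURCE_IMPLEMENTATION_MAP, DEFAULT_DATA_RESOURCE }) {
            if (engineContext.containsKey(key)) {
                context.put(key, engineContext.get(key));
            }
        }
        // Add the per-invocation data, such as the user credentials.
        context.putAll(invocationContext);
        // Add the input and output pipes under a common key prefix.
        for (Map.Entry<String, Object> pipe : pipes.entrySet()) {
            context.put(PIPES + pipe.getKey(), pipe.getValue());
        }
        return context;
    }
}
```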
Accessing Data Resources
• Activities often interact with data resources
– Query a database, update a table row, etc.
• Data resources often require user validation
– User ID and password
• An Activity can use the information contained in its Context to access and interact with a data resource
– Data Resource Implementation Map
– Default Data Resource name
– Data Resource Implementation
– User Credentials
Activity to Data Resource Sequence Diagram
Participants: :Activity, :Context, :DataResourceImplementationMap, :DataResourceImplementation
1. Get user credentials
2. Get reference to data resource implementation map
3. Get reference to data resource
4. Get connection using user credentials
5. Do something exciting with the connection
6. Return connection
Data Resource to Role Mapper Sequence Diagram
Participants: :Activity, :DataResourceImplementation, :RoleMapper, role:DatabaseRole
1. Get connection( user credentials )
2. role := map( database name, user credentials )
3. Get user ID
4. Get password
5. Open connection using user ID and password
6. Do things with collection
7. Return collection
Note: the role mapper encapsulates the database name to user ID/password mappings contained in the Role Map file.
Role Mapping
• The Role Mapper loads the Role Map file referenced from the GDSFConfig
• This file maps X509 Certificate User Credentials to username and password combinations
– An X509 Certificate is a type of digital document used in Web Services to attest the identity of an individual or other entity.
…
<Database name="MyDatabase">
<User dn="J.Smith@somewhere.co.uk,OU=Group,O=Org,O=Another"
userid="jsmith" password="carrotcake" />
</Database>
…
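The lookup the Role Mapper performs can be pictured as a two-part key (database name plus distinguished name) resolving to a user ID and password. The sketch below is an assumption about the shape of that lookup, not the OGSA-DAI implementation: `RoleMapperSketch` and `DatabaseRoleSketch` are hypothetical classes, and entries are added by hand rather than loaded from the Role Map file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value object holding the database login for one user.
final class DatabaseRoleSketch {
    final String userId;
    final String password;

    DatabaseRoleSketch(String userId, String password) {
        this.userId = userId;
        this.password = password;
    }
}

// Hypothetical role mapper: (database name, certificate DN) -> database role.
class RoleMapperSketch {
    private final Map<String, DatabaseRoleSketch> roles = new HashMap<>();

    void add(String database, String dn, String userId, String password) {
        roles.put(key(database, dn), new DatabaseRoleSketch(userId, password));
    }

    // Resolves the role for a user, refusing access when no mapping exists.
    DatabaseRoleSketch map(String database, String dn) {
        DatabaseRoleSketch role = roles.get(key(database, dn));
        if (role == null) {
            throw new SecurityException("No role for " + dn + " on " + database);
        }
        return role;
    }

    // A newline cannot appear in either part, so it is a safe separator.
    private static String key(String database, String dn) {
        return database + "\n" + dn;
    }
}
```

Using the values from the Role Map excerpt, `map("MyDatabase", "J.Smith@somewhere.co.uk,OU=Group,O=Org,O=Another")` would return the role with user ID `jsmith`.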
Provided Activities
• OGSA-DAI R2 provides many Activity implementations for Relational and XML database access:
– SQLQueryStatementActivity
– SQLStoredProcedureActivity
– SQLUpdateActivity
– RelationalResourceManagementActivity
– XPathStatementActivity
– XUpdateStatementActivity
– XMLCollectionManagementActivity
– XMLResourceManagementActivity
Implementing an Activity
• An Activity implementation must extend the abstract Activity class
Activity
# mContext: Context
# mElement: Element
# mInputs: String[]
# mOutputs: String[]
+ Activity( element: Element )
+ setContext( context: Context ) : void
# setStatus( status: int ) : void
+ getStatus() : int
+ processBlock() : void
The constructor
• Activity( org.w3c.dom.Element element )
– The element parameter corresponds to an activity element from the perform document, e.g. xPathStatement, sqlQueryStatement
• Calls the super constructor: super( element );
• Determines the inputs and outputs of the Activity and makes their names available through the getInputs and getOutputs methods
• The Element will often be parsed and the inherited mInputs[] and mOutputs[] instance variables set.
String inputName = parseInputName( element );
String outputName = parseOutputName( element );
mInputs = new String[] { inputName };
mOutputs = new String[] { outputName };
The setContext method
• setContext( Context context )
• Sets the inherited mContext instance member using super.setContext( context )
• The Engine guarantees that this context contains BlockReaders and BlockWriters for the inputs and outputs that were set in the Activity's constructor
• Context-dependent initialisation may be performed in the setContext method
– Obtaining references to the inputs, outputs, and data resource
Retrieving Objects from the Context
• Objects can be retrieved from the Context using the get method:
…
BlockReader myInput = (BlockReader) context.get(
    EngineImpl.PIPES + mInputs[0] );
BlockWriter myOutput = (BlockWriter) context.get(
    EngineImpl.PIPES + mOutputs[0] );
DataResourceImplementationMap map =
    (DataResourceImplementationMap) mContext.get(
        OGSADAIConstants.DATA_RESOURCE_IMPLEMENTATION_MAP );
…
• Key constants are stored in
– uk.org.ogsadai.service.OGSADAIConstants
– uk.org.ogsadai.porttype.gds.engine.EngineImpl
The processBlock method
• The main work of an Activity is done through the processBlock() method.
• A call to processBlock is a request from the Engine for the Activity to provide a block of output.
• In many cases this will involve the Activity reading a block from an input, performing some processing and then putting a block onto an output.
• The Activity Handler checks the status of an Activity prior to a call to processBlock to ensure the Activity has not terminated
Activity Status & the setStatus method
• An Activity must track its own status using the setStatus method
• There are four states:
– UNSTARTED: before the processBlock method has been invoked.
– PROCESSING: set the first time the processBlock method is invoked and remains set until the processing is complete or there is an error.
– COMPLETE: set when the processing is complete and there are no more blocks to output.
– ERROR: set when there is a problem of some kind during the processing of a block.
• These states are stored as public static final ints in uk.org.ogsadai.porttype.gds.engine.StatusMessage
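The status lifecycle can be followed in a small self-contained sketch. Everything here is hypothetical: `StatusActivitySketch` stands in for the abstract Activity class, the integer values of the four states are assumptions (the real ones live in StatusMessage), and `StatusDemo.run` plays the role of a handler that calls processBlock until the Activity finishes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the abstract Activity class; status values are assumed.
abstract class StatusActivitySketch {
    static final int UNSTARTED = 0;
    static final int PROCESSING = 1;
    static final int COMPLETE = 2;
    static final int ERROR = 3;

    private int status = UNSTARTED;

    protected void setStatus(int newStatus) {
        status = newStatus;
    }

    public int getStatus() {
        return status;
    }

    public abstract void processBlock();
}

// Emits one upper-cased block per call, tracking its own status as it goes.
class UpperCaseActivitySketch extends StatusActivitySketch {
    private final List<String> source;
    private final List<String> output;
    private int next = 0;

    UpperCaseActivitySketch(List<String> source, List<String> output) {
        this.source = source;
        this.output = output;
    }

    @Override
    public void processBlock() {
        try {
            if (getStatus() == UNSTARTED) {
                setStatus(PROCESSING); // first call
            }
            if (next < source.size()) {
                output.add(source.get(next++).toUpperCase());
            } else {
                setStatus(COMPLETE); // no more blocks to output
            }
        } catch (RuntimeException e) {
            setStatus(ERROR); // e.g. a null block in the source
        }
    }
}

class StatusDemo {
    // Calls processBlock until the activity stops, as a handler would;
    // returns the final status and fills output with the produced blocks.
    static int run(List<String> source, List<String> output) {
        StatusActivitySketch activity = new UpperCaseActivitySketch(source, output);
        while (activity.getStatus() != StatusActivitySketch.COMPLETE
                && activity.getStatus() != StatusActivitySketch.ERROR) {
            activity.processBlock();
        }
        return activity.getStatus();
    }

    public static void main(String[] args) {
        List<String> output = new ArrayList<>();
        System.out.println(run(java.util.Arrays.asList("a", "b"), output) + " " + output);
    }
}
```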
XPathStatementActivity
• Excerpt from a perform document:
…
<xPathStatement name="statement">
<dataResource>myXMLDBDataResource</dataResource>
<collection>musicians/folksingers</collection>
<namespace prefix="c">http://ogsadai.org.uk/contacts</namespace>
<expression>/c:entry/c:address</expression>
<sequenceStream name="statementOutput"/>
</xPathStatement>
…
• The xPathStatement element is passed to the XPathStatementActivity constructor.
XPathStatementActivity Constructor
• Parses the xPathStatement element
– Finds the collection name, data resource name, sequence stream (output) name, resource ID, namespace bindings and XPath query expression.
• Sets up the Activity input and output names
…
String sequenceStream = mParser.getString(
    XMLDBStatementParser.SEQUENCE_STREAM );
mInputs = new String[0]; // no inputs to this type of activity
mOutputs = new String[] { sequenceStream }; // one output
…
XPathStatementActivity setContext method
• Invokes super.setContext( context );
• Retrieves a reference to the output Block Writer: mOutput
• Retrieves a reference to the Data Resource Implementation: mDataResource
String dataResourceName =
    mParser.getString( XMLDBStatementParser.DATA_RESOURCE );
DataResourceImplementationMap map =
    (DataResourceImplementationMap) mContext.get(
        OGSADAIConstants.DATA_RESOURCE_IMPLEMENTATION_MAP );
mDataResource = (XMLDBDataResourceImplementation)
    map.get( dataResourceName );
XPathStatementActivity processBlock method
• The first time the method is invoked
– The Activity status is set to PROCESSING
– The XPath expression is executed, generating a ResourceIterator for the results, referenced by mResults
  • The Data Resource Implementation is used to retrieve an open Collection for the underlying XMLDB database.
– The first resource from mResults is put onto the output.
• Each subsequent invocation (until complete)
– Checks whether there are any more resources
  • If so, performs the next iteration of mResults and puts the resulting resource onto the output, mOutput
  • Otherwise, returns the open Collection to the Data Resource Implementation and sets the Activity status to COMPLETE
– If any exceptions are generated, the Collection is returned and the Activity status is set to ERROR
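XPathStatementActivity processBlock method cont.
public void processBlock() {
    try {
        // First call: mark the Activity as running and execute the XPath query.
        if ( getStatus() == StatusMessage.UNSTARTED ) {
            setStatus( StatusMessage.PROCESSING );
            performStatement();
        }
        // Each call puts at most one resource onto the output.
        if ( mResults.hasMoreResources() ) {
            mOutput.put( resourceToXML( mResults.nextResource() ) );
        } else {
            // No more results: return the Collection and finish.
            close();
            setStatus( StatusMessage.COMPLETE );
        }
    }
    catch ( Exception e ) {
        // On any failure, return the Collection and record the error.
        close();
        setStatus( StatusMessage.ERROR, e );
    }
}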
Conclusion
• The Engine is the core of a GDS
• The Engine uses Activities to perform actions
• Activities can use Data Resource Implementations to access data resources
• OGSA-DAI R2 includes many activities for querying, updating and managing relational and XML databases
– MySQL, DB2, Xindice 1.0
• Additional Activities and Data Resource Implementations can be developed