Grid Data Services (GDS), The Engine & Activities

Tom Sugden (EPCC), James Magowan (IBM)

Overview

* This presentation discusses the low-level aspects of OGSA-DAI R2, including:
  – The GDS
  – The Engine
  – Perform Documents
  – Activities
  – Role Mapping
  – Implementing an Activity
* Starting point:
  – The GDSF has created a GDS

Part 1 - GDS, The Engine, Perform Documents

The GDS

[Diagram: Transport, Perform Document, Grid Data Service (GDS), Database, Response Document]

Components of the GDS

[Diagram: GDS, Engine, Activity Handlers, Activities, Database]

The Engine

* The Engine is the core component of a GDS
  – Receives a perform document from the GDS
  – Identifies the required Activity implementations
  – Executes the Activities in the correct sequence
  – Combines the results to form a response document
  – Returns the response document to the GDS

Perform Document:

    …
    <xActivity name="x">… </xActivity>
    <yActivity name="y">… </yActivity>
    …

Activity Implementations: XActivity, YActivity

Response Document:

    …
    <result name="x">… </result>
    <result name="y">… </result>
    …

Introduction to Activities

* An Activity dictates an action to be performed by a GDS
  – query a database, update a collection, deliver some results
* Each type of Activity has a corresponding:
  – XML element type (e.g. sqlQueryActivity)
  – XSD schema (e.g. sql_query_statement.xsd)
  – Java implementation class (e.g. SQLQueryStatementActivity)
* These details are specified in the Activity Map file

    <gdsf:activityMap name="sqlQueryStatement"
        class="uk.org.ogsadai…SQLQueryStatementActivity"
        schemaFileName="…sql_query_statement.xsd">
    </gdsf:activityMap>…

Engine Construction

* When a GDS is created, it instantiates its own Engine
* The Engine constructor takes a Context object (known as the Engine Context) that encapsulates part of the GDS configuration:
  – The Activity Map
  – The Schema Map
  – The Data Resource Implementation Map
  – The default Data Resource name

The Activity and Schema Maps

* The Activity Map provides a map of the Activity implementations available to the GDS
  – Activity element name → Java implementation class
* The Schema Map provides a map of the corresponding XSD schemas
  – Activity element name → XSD schema

Activity Map file:

    <gdsf:activityMap name="sqlQueryStatement"
        class="uk.org.ogsadai…SQLQueryStatementActivity"
        schemaFileName="…sql_query_statement.xsd">…
    <validXPathScheme>//gdsf:queryLanguage[@name="SQL"]…
    <validXPathLocation>//gdsf:dataManager[@productType="MySQL"]…

GDSF Config file:

    …<gdsf:queryLanguage name="SQL" version="SQL92"/>…
    …<gdsf:dataManager name="jdbcDBMS" productType="MySQL"/>…

The Data Resource Implementation Map

* The Data Resource Implementation Map manages a collection of Data Resource Implementations
  – Data Resource name → DataResourceImplementation instance
* Constructed by the GDSF using the GDSFConfig & DataResourceImplementationMap files

    …<gdsf:dataResource name="myDataResource">…
    …<gdsf:dataResourceImplementation dataResourceName="myDataResource"
        class="uk.org.ogsadai...SimpleJDBCDataResourceImplementation"/>…

Engine Invocation

* When the GridDataService port type perform operation is called, the GDS calls the Engine's invoke method:

    invoke(Document performDoc, Map invocationContext)

* The perform document describes the actions for the GDS to perform
* The invocation context contains the distinguished name from the user certificate

Perform Documents

* A perform document contains a series of activities for a GDS to perform
* A client sends a perform document to a GDS via the perform operation of the GridDataService port type
* The root element of a perform document is the <gridDataServicePerform> element
* The top-level elements are:
  – <request> defines a named request that should be stored by the GDS for future execution.
  – <execute> indicates that a specific stored request should be executed.
  – <terminate> terminates a specified request.

Example Perform Document

    <gridDataServicePerform xmlns="http://ogsadai.org.uk/P2R2/schemas/gds">
      <request name="myRequest">
        <documentation>
          Simple select statement, executed and response delivered.
        </documentation>
        <sqlQueryStatement name="statement">
          <dataResource>myDataResource</dataResource>
          <expression>select * from littleblackbook</expression>
          <webRowSetStream name="myOutput"/>
        </sqlQueryStatement>
        <deliverToResponse name="d1">
          <fromLocal from="myOutput"/>
        </deliverToResponse>
      </request>
      <execute name="myExecute" requestName="myRequest" />
      <terminate name="myTerminate" executeName="anotherRequest" />
    </gridDataServicePerform>

The <request> Element

* The <request> element contains a series of activities to be performed when the request is executed.
  – Query a data resource
  – Update a data resource
  – Parameterise another activity
  – Deliver results in a response document
  – Deliver results by FTP
* The <request> element must contain a name attribute
* OGSA-DAI supports many such activities and provides an extensibility framework allowing developers to create their own.
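
To make the structure of a perform document concrete, here is a minimal sketch that parses a trimmed copy of the example perform document with the standard Java DOM API and lists the top-level elements and the children of the <request> element. It is not OGSA-DAI code: the class name PerformDocumentSketch and its helper methods are hypothetical, and the real Engine also validates the document against the Schema Map, which this sketch does not attempt.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not part of OGSA-DAI.
final class PerformDocumentSketch {

    // Trimmed copy of the example perform document from the slides.
    static final String SAMPLE =
        "<gridDataServicePerform xmlns=\"http://ogsadai.org.uk/P2R2/schemas/gds\">"
      + "<request name=\"myRequest\">"
      + "<documentation>Simple select statement.</documentation>"
      + "<sqlQueryStatement name=\"statement\">"
      + "<dataResource>myDataResource</dataResource>"
      + "<expression>select * from littleblackbook</expression>"
      + "<webRowSetStream name=\"myOutput\"/>"
      + "</sqlQueryStatement>"
      + "<deliverToResponse name=\"d1\"><fromLocal from=\"myOutput\"/></deliverToResponse>"
      + "</request>"
      + "<execute name=\"myExecute\" requestName=\"myRequest\"/>"
      + "</gridDataServicePerform>";

    /** Parses the XML text and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the perform document uses a default namespace
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed perform document", e);
        }
    }

    /** Returns "elementName:nameAttribute" for each direct child element. */
    static List<String> childElements(Element parent) {
        List<String> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element e = (Element) n;
                // getAttribute returns "" when the attribute is absent
                result.add(e.getLocalName() + ":" + e.getAttribute("name"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE);
        System.out.println("top level: " + childElements(root));
        Element request = (Element) root.getElementsByTagNameNS("*", "request").item(0);
        System.out.println("in request: " + childElements(request));
    }
}
```

Walking the direct children in document order mirrors the description of a request as a named series of activity elements; each child's element name is what the Activity Map would be keyed on.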

The <execute> element

* The <execute> element initiates the execution of a named request
  – The request may be contained in the same perform document or in a perform document sent to the GDS previously
  – Parameter values for stored activities may be specified within the <execute> element using the optional <withParameter> sub-element

    …
    <execute name="myExecute" requestName="findCDs">
      <withParameter name="artist">Death by Milkfloat</withParameter>
    </execute>
    …

The <terminate> element

* Having set a request in motion, we can use a <terminate> element to stop the running request

    …
    <terminate name="myTerminate" executeName="myExecute" />
    …

* The value of the executeName attribute must match the name of a previously issued execution (<execute> element).

Validating and Parsing the Perform Document

* When the Engine receives a perform document, it is validated using the schema map contained in the Engine context
* After validation, the perform document is parsed and, for each known element type encountered, the Engine:
  1. instantiates the corresponding Activity class (specified in the activity map) using the element and the context
  2. creates an ActivityHandler to manage the Activity

Engine Processing

* The Engine determines the correct sequence
* Conclusion

Part 2 - Activities

Overview

* Introduction to Activities
* Activity Mapping
* Activity Handlers
* Accessing Data Resources
* Activities provided with OGSA-DAI
* Implementing an Activity
* Example implementation of XPathQueryActivity

Activities

* An Activity dictates an action to be taken by the GDS
* Activities correspond to certain element types in the perform document
  – sqlQueryStatement → SQLQueryStatementActivity
  – xPathStatement → XPathStatementActivity
  – deliverToResponse → DeliverToResponseActivity
* The mapping from element type to Activity class is expressed in the Activity Map file
* The Activity Map file also associates an XSD schema with each activity element type

Example Activity Map File

    <gdsf:activityMap name="sqlQueryStatement"
        class="uk.org.ogsadai.porttype.gds.activity.sql.SQLQueryStatementActivity"
        schemaFileName="http://somewhere/schema/sql_query_statement.xsd">
      <namespaces>{http://ogsadai.org.uk/P2R2/schemas/gdsf}gdsf</namespaces>
      <validXPathScheme>
        //gdsf:queryLanguage[@name="SQL"]
      </validXPathScheme>
      <validXPathLocation>
        //gdsf:dataManager[ @productType="MySQL"|
                            @productType="DB"|
                            @productType="Oracle"]
      </validXPathLocation>
    </gdsf:activityMap>

Activity Handlers

* The Engine uses Activity Handlers to manage Activities
* This design allows activities to be used in different ways without the activity implementer or the engine needing to know the specifics
  – SimpleHandler: generates output only when it is required
  – RunAheadHandler: can generate output before it is required
* The main jobs of a handler are to manage the inputs and outputs for the activity and to call the processBlock method when necessary

The Activity Context

* The Activity context is constructed by merging parts of the Engine context with the invocation context, and adding the inputs and outputs

[Diagram: the Activity Context contains
  – from the Engine Context: Data Resource Implementation Map, Default Data Resource Name
  – from the Invocation Context: User Credentials
  – Inputs: BlockReaders
  – Outputs: BlockWriters]

Accessing Data Resources

* Activities often interact with data resources
  – Query a database, update a table row, etc.
* Data resources often require user validation
  – User ID and password
* An Activity can use the information contained in its Context to access and interact with a data resource
  – Data Resource Implementation Map
  – Default Data Resource name
  – Data Resource Implementation
  – User Credentials

Activity to Data Resource Sequence Diagram

Participants: :Activity, :Context, :DataResourceImplementationMap, :DataResourceImplementation

  1. Get user credentials
  2. Get reference to data resource implementation map
  3. Get reference to data resource
  4. Get connection using user credentials
  5. Do something exciting with the connection
  6. Return connection

Data Resource to Role Mapper Sequence Diagram

Participants: :Activity, :DataResourceImplementation, :RoleMapper, role:DatabaseRole

  1. Get connection( user credentials )
  2. role := map( database name, user credentials )
  3. Get user ID
  4. Get password
  5. Open connection using user ID and password
  6. Do things with collection
  7. Return collection

The role mapper encapsulates the database name to user ID/password mappings contained in the Role Map file.

Role Mapping

* The Role Mapper loads the Role Map file referenced from the GDSFConfig
* This file maps X509 Certificate User Credentials to username and password combinations
  – An X509 Certificate is a type of digital document used in Web Services to attest the identity of an individual or other entity.

    …
    <Database name="MyDatabase">
      <User dn="J.Smith@somewhere.co.uk,OU=Group,O=Org,O=Another"
            userid="jsmith" password="carrotcake" />
    </Database>
    …

Provided Activities

* OGSA-DAI R2 provides many Activity implementations for Relational and XML database access:
  – SQLQueryStatementActivity
  – SQLStoredProcedureActivity
  – SQLUpdateActivity
  – RelationalResourceManagementActivity
  – XPathStatementActivity
  – XUpdateStatementActivity
  – XMLCollectionManagementActivity
  – XMLResourceManagementActivity

Implementing an Activity

* An Activity implementation must extend the abstract Activity class

    Activity
      # mContext: Context
      # mElement: Element
      # mInputs: String[]
      # mOutputs: String[]
      + Activity( element: Element )
      + setContext( context: Context ) : void
      # setStatus( status: int ) : void
      + getStatus() : int
      + processBlock() : void

The constructor

* Activity( org.w3c.dom.Element element )
  – The element parameter corresponds to an activity element from the perform document, e.g. xPathStatement, sqlQueryStatement
* Calls the super constructor: super( element );
* Determines the inputs and outputs of the Activity and makes their names available through the getInputs and getOutputs methods
* The Element will often be parsed and the inherited mInputs[] and mOutputs[] instance variables set.

    String inputName = parseInputName( element );
    String outputName = parseOutputName( element );
    mInputs = new String[] { inputName };
    mOutputs = new String[] { outputName };

The setContext method

* setContext( Context context )
* Sets the inherited mContext instance member using super.setContext( context )
* The Engine guarantees that this context contains BlockReaders and BlockWriters for the inputs and outputs that were set in the Activity's constructor
* Context-dependent initialisation may be performed in the setContext method
  – Obtaining references to the inputs, outputs, and data resource

Retrieving Objects from the Context

* Objects can be retrieved from the Context using the get method:

    …
    BlockReader myInput =
        (BlockReader) context.get( EngineImpl.PIPES + mInputs[0] );
    BlockWriter myOutput =
        (BlockWriter) context.get( EngineImpl.PIPES + mOutputs[0] );
    DataResourceImplementationMap map =
        (DataResourceImplementationMap) mContext.get(
            OGSADAIConstants.DATA_RESOURCE_IMPLEMENTATION_MAP );
    …

* Key constants are stored in
  – uk.org.ogsadai.service.OGSADAIConstants
  – uk.org.ogsadai.porttype.gds.engine.EngineImpl

The processBlock method

* The main work of an Activity is done through the processBlock() method.
* A call to processBlock is a request from the Engine for the Activity to provide a block of output.
* In many cases this will involve the Activity reading a block from an input, performing some processing and then putting a block onto an output.
* The Activity Handler checks the status of an Activity prior to a call to processBlock to ensure the Activity has not terminated

Activity Status & the setStatus method

* An Activity must track its own status using the setStatus method
* There are 4 states:
  – UNSTARTED: before the processBlock method has been invoked.
  – PROCESSING: set the first time the processBlock method is invoked and remains set until the processing is complete or there is an error.
  – COMPLETE: set when the processing is complete and there are no more blocks to output.
  – ERROR: set when there is a problem of some kind during the processing of a block.
* These states are stored as public static final ints in uk.org.ogsadai.porttype.gds.engine.StatusMessage

XPathStatementActivity

* Excerpt from a perform document:

    …
    <xPathStatement name="statement">
      <dataResource>myXMLDBDataResource</dataResource>
      <collection>musicians/folksingers</collection>
      <namespace prefix="c">http://ogsadai.org.uk/contacts</namespace>
      <expression>/c:entry/c:address</expression>
      <sequenceStream name="statementOutput"/>
    </xPathStatement>
    …

* The xPathStatement element is passed to the XPathStatementActivity constructor.

XPathStatementActivity Constructor

* Parses the xPathStatement element
  – Finds the collection name, data resource name, sequence stream (output) name, resource ID, namespace bindings and XPath query expression.
* Sets up the Activity input and output names

    …
    String sequenceStream =
        mParser.getString( XMLDBStatementParser.SEQUENCE_STREAM );
    mInputs = new String[0];                      // no inputs to this type of activity
    mOutputs = new String[] { sequenceStream };   // one output
    …

XPathStatementActivity setContext method

* Invokes super.setContext( context );
* Retrieves a reference to the output Block Writer: mOutput
* Retrieves a reference to the Data Resource Implementation: mDataResource

    String dataResourceName =
        mParser.getString( XMLDBStatementParser.DATA_RESOURCE );
    DataResourceImplementationMap map =
        (DataResourceImplementationMap) mContext.get(
            OGSADAIConstants.DATA_RESOURCE_IMPLEMENTATION_MAP );
    mDataResource =
        (XMLDBDataResourceImplementation) map.get( dataResourceName );

XPathStatementActivity processBlock method

* The first time the method is invoked
  – The Activity status is set to PROCESSING
  – The XPath expression is executed, generating a ResourceIterator for the results, referenced by mResults
    • The Data Resource Implementation is used to retrieve an open Collection for the underlying XMLDB database.
  – The first resource from mResults is put onto the output.
* Each subsequent invocation (until complete)
  – Checks whether there are any more resources
    • If so, performs the next iteration of mResults and puts the resulting resource onto the output, mOutput
    • Otherwise, returns the open Collection to the Data Resource Implementation and sets the Activity status to COMPLETE
  – If any exceptions are generated then the Collection is returned and the Activity status is set to ERROR

XPathStatementActivity processBlock method cont.

    public void processBlock() {
        try {
            if ( getStatus() == StatusMessage.UNSTARTED ) {
                setStatus( StatusMessage.PROCESSING );
                performStatement();
            }
            if ( mResults.hasMoreResources() ) {
                mOutput.put( resourceToXML( mResults.nextResource() ) );
            } else {
                close();
                setStatus( StatusMessage.COMPLETE );
            }
        } catch ( Exception e ) {
            close();
            setStatus( StatusMessage.ERROR, e );
        }
    }

Conclusion

* The Engine is the core of a GDS
* The Engine uses Activities to perform actions
* Activities can use Data Resource Implementations to access data resources
* OGSA-DAI R2 includes many activities for querying, updating and managing relational and XML databases
  – MySQL, DB2, Xindice 1.0
* Additional Activities and Data Resource Implementations can be developed
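
As a closing illustration of the Activity status lifecycle (UNSTARTED, PROCESSING, COMPLETE, ERROR) and the one-block-per-call contract of processBlock, the sketch below uses stand-in classes only. Everything in it is an assumption made for illustration: ActivityLifecycleSketch, IteratingActivity and drive are hypothetical names, the status constants are local stand-ins for those in StatusMessage, a plain List stands in for a BlockWriter, and the loop in drive only approximates what a handler might do. It does not use or reproduce the OGSA-DAI classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of OGSA-DAI.
final class ActivityLifecycleSketch {

    // Local stand-ins for the StatusMessage constants (values are arbitrary).
    static final int UNSTARTED = 0;
    static final int PROCESSING = 1;
    static final int COMPLETE = 2;
    static final int ERROR = 3;

    /** Stand-in activity that emits one item per processBlock call. */
    static final class IteratingActivity {
        private final List<String> source;   // stand-in for a query result set
        private final List<String> output;   // stand-in for a BlockWriter
        private Iterator<String> results;
        private int status = UNSTARTED;

        IteratingActivity(List<String> source, List<String> output) {
            this.source = source;
            this.output = output;
        }

        int getStatus() {
            return status;
        }

        void processBlock() {
            try {
                if (status == UNSTARTED) {
                    status = PROCESSING;
                    results = source.iterator(); // stand-in for executing the statement
                }
                if (results.hasNext()) {
                    output.add(results.next());  // put exactly one block onto the output
                } else {
                    status = COMPLETE;           // nothing left to output
                }
            } catch (RuntimeException e) {
                status = ERROR;                  // any failure ends the activity
            }
        }
    }

    /** Calls processBlock until the activity terminates; returns the call count. */
    static int drive(IteratingActivity activity) {
        int calls = 0;
        while (activity.getStatus() != COMPLETE && activity.getStatus() != ERROR) {
            activity.processBlock();
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        IteratingActivity activity = new IteratingActivity(List.of("a", "b"), out);
        int calls = drive(activity);
        System.out.println(out + " after " + calls + " calls, status " + activity.getStatus());
    }
}
```

Note that, as in the XPathStatementActivity example, the call that discovers there are no more results produces no output and only moves the status to COMPLETE, so two results take three calls.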