Optimising the OGSA-DAI Enactment Model Konstantinos Karasavvas Research Associate

advertisement
Optimising the OGSA-DAI
Enactment Model
Konstantinos Karasavvas
Research Associate
NeSC, University of Edinburgh
Overview
● Background: OGSA-DAI
 Workflow distinctive features
● Enactment Model optimisation
● Reconstruction Scenarios
 And future work
● Summary
eSI - WODE, 19 Oct. 06
2
OGSA-DAI in a nutshell
● An extensible framework for data access and integration
● Expose heterogeneous data resources to a grid through
web services
● Interact with data resources




Queries and updates
Data transformation / compression
Data delivery
Application specific functionality
● A base for higher-level services
 Federation, data mining, visualisation
● Reduce development cost of data centric grid
applications by providing useful functionality readily
packaged
eSI - WODE, 19 Oct. 06
3
OGSA-DAI Request/Response
Request
OGSA-DAI
Data
Service
DB
Response
3rd Party
Activity A
eSI - WODE, 19 Oct. 06
Activity B
Activity C
4
Example OGSA-DAI Request
sqlQueryStatement
deliverFromURL
•SELECT * FROM Bands
WHERE name = Bangles;
•http://www.someplace.org/styl
esheets/webRowSetToHTML.xsl
ResultSet
XSL
sqlResultsToXML
WebRowSet
xslTransform
HTML
deliverToURL
•ftp://www.musicplace.org/bands/Bangles.html
eSI - WODE, 19 Oct. 06
5
Workflow Distinctive Features (1)
● Keep everything local
 We have homogeneity
 Avoid data movement
• Call with whole result
• Separate calls
 E.g. we can pass object references
● Activities are configured at
service deployment
sqlQuery
RS
transform
 Cannot use any “service” you want in
your workflow
sqlQuery
Service
eSI - WODE, 19 Oct. 06
SOAP/XML
transform
Service
7
Workflow Distinctive Features (2)
● Simple, efficient workflow language
 Sequence, flow
 Fits our data processing needs
 Other constructs can be used at the activity level (e.g.
exclusive choice – a.k.a. if-split)
A
if-split
begin
if-split
end
x
pipe
closes
eSI - WODE, 19 Oct. 06
B
y
pipe
closes
8
Workflow Distinctive Features (3)
● Streaming model
 Large quantities of data
 Parallel processing (pipelining)
 Implicit iteration via streaming
eSI - WODE, 19 Oct. 06
9
Enactment Model
● A.k.a. Activity
Framework
● Core component of
OGSA-DAI
● It is responsible for
performing tasks
(activities) and
streaming data
eSI - WODE, 19 Oct. 06
DB Query
block
Produces data
in blocks
Stores and
provides access to
data blocks
Pipe
block
Consumes
data blocks
Delivery
10
Activity Processing (Current Release)
● Processing of blocks (and therefore
activities) is controlled by the pipe – from
outside the activity
 processBlock() is called many times until
processing is complete
 Usually consumes and produces a single
block per call
eSI - WODE, 19 Oct. 06
11
Activity Processing (New)
● Processing of blocks will be controlled by
the activity
 process() is called exactly once
 Consumes and produces blocks as necessary
 Each activity in a pipeline processes within
each own thread
 Pipes receive and may buffer blocks until they
are requested
eSI - WODE, 19 Oct. 06
12
Current Model
Activity A
processBlock()
block
processBlock
Pipe
Called
repeatedly
block
getBlock
processBlock
Activity B
processBlock()
eSI - WODE, 19 Oct. 06
Request
Processor
13
New Model
Called
Once
Activity A
process()
putBlock
process
Processing
Service
Pipe
initialise
block
as result
getBlock
process
Activity B
process()
eSI - WODE, 19 Oct. 06
14
Improvements to the Enactment Model
● Each activity is run in its own thread
 Rather than one thread per chain
● Each activity is called to process only once
 Rather than once per block
● Each activity sends its result to the pipe when
ready
 Rather than wait to be called
● The pipe now buffers
 Rather than delegate the request for new block
● Performance improvements
eSI - WODE, 19 Oct. 06
15
Mobile Code
● Allows to ‘embed’ functionality in your workflow
 Allows dynamic inclusion of new code
 Extends the workflow functionality without calling to a
remote service
 Simple but ‘prepared’ code
Activity A
Activity B
Mobile Code
eSI - WODE, 19 Oct. 06
16
Workflow Reconstruction
● Be able to identify patterns
 Describe workflow nodes/edges
● Reconstruct part of the workflow graph
 Performance
● Some example scenarios
eSI - WODE, 19 Oct. 06
17
Scenario (1)
sqlQuery
RS
RSToWebRowSet
WRS
WebRowSetProjection
WRS
sqlQuery
RS
resultsetProjection
RS
RSToWebRowSet
WRS
eSI - WODE, 19 Oct. 06
18
Scenario (2)
filename
A
file
FTP
URL
A
Split
A
A
eSI - WODE, 19 Oct. 06
filename
file
URL
filename
file
URL
filename
file
URL
FTP
FTP
FTP
19
Thoughts on Reconstruction
● Currently recommendations
 Not automated
● How and what do we describe?
 Scenario 1
• Semantics of projection
• Activities’ I/O typing
 Scenario 2
• Can be applied in many scenarios
• Rules needed
 When are two workflows equivalent?
• Models of enactment and activities
 How do we estimate/predict the cost of a workflow?
● Future work
 Need to investigate the above
eSI - WODE, 19 Oct. 06
20
Thoughts on Parallelisation
● Parallelising the execution of OGSA-DAI
 already exploit thread parallelism
● Multiple nodes/JVM running activities
● Requires serialisation of objects passing
between activities
 When and where do we serialise?
● Future work
 Need to investigate the above
eSI - WODE, 19 Oct. 06
21
Summary
● Overview of OGSA-DAI workflow
 Motivation
 Distinctions
● Enactment optimisations
 Improved pipeline processing
● Workflow reconstruction and
parallelisation
eSI - WODE, 19 Oct. 06
22
Questions?
www.ogsadai.org.uk
kostas@nesc.ac.uk
Download