CGW09_abstract_pattern

advertisement
Development and Execution of Collaborative Application
on the ViroLab Virtual Laboratory
Tomasz Gubala (1,2), Marek Kasztelnik (3), Maciej Malawski (1), Marian Bubak (1,3)
(1) Institute of Computer Science AGH, Krakow, Poland
(2) Informatics Institute, University of Amsterdam, The Netherlands
(3) ACC CYFRONET AGH, Krakow, Poland
Modern practices of science in such area as investigation of HIV virus drug resistance, requires
collaborative sharing, processing and analyzing virological, immunological, clinical and experimental
data, as well as advanced tools for statistical analysis, visualization, modelling and simulation [1]. An
experiment in a virtual laboratory combines data and activities which are available on the distributed
Web- and Grid-based infrastructure and it needs to orchestrate them in possibly complex scenarios. As
existing solutions to the problem of experiment orchestration, there are many scientific workflow
systems available for the Grid, such as Pegasus [2], Triana [3] and K-WfGrid [4] systems. They are
intended to assist nonprogrammer users in developing applications, however in the case of workflows
with many components and complex interactions, they can also become difficult to use.
An experiment plan is a Ruby script, which features the concise and clear syntax combined with a
full set of control structures. To relieve the experiment developer from the sophisticated details of the
underlying grid infrastructure, we introduce a high-level object-oriented API, which allows requesting
"which" computational functionality is required, without a need to specify "how" to access it with
available middleware. If a developer needs to retain a full control over the experiment plan, it is possible
to specify all the technical details on the lower level of abstraction. In order to support development and
execution in this high-level scripting paradigm, a set of tools in the virtual laboratory has been
developed. An experiment developer uses an Experiment Planning Environment based on Eclipse
platform. The developer can use the semantic-web based Domain Ontology Store graphical browser to
discover the available data and computational services, coupled with Grid Resources Registry which
provides the available operations which can be invoked directly from a script. A script can be executed
from the EPE, which is integrated with the GridSpace Engine (GSEngine) being a core of the runtime
system. GSEngine includes the Grid Operation Invoker [2], which translates high-level operations
specified in the script into concrete invocations on computational resources using appropriate
technologies. The process of experiment planning and execution is collaborative in the sense that the
virtual laboratory supports cooperation of multiple experiment developers and users. When a developer
prepares an experiment script, it can be published in the experiment repository and thus made available
to others. Then, a scientist can access the virtual laboratory through a portal, and execute the published
experiments using Web browser, providing only input data when necessary. The repository can be
shared and reused. Provenance data on the experiment is also recorded and available for queries,
making the results more reliable, reproducible and scientifically relevant.
Using a scripting language enables to define even complex experiments easily, still remaining on a
high-level of abstraction and concealing the details of underlying grid middleware. By providing a set of
user friendly tools, both advanced experiment developers and domain scientists can productively
collaborate and conduct their research in modern highly distributed environment.
Acknowledgements. This work was supported by EU project ViroLab IST-027446 with the related Polish
grant SPUB-M and by the Foundation for Polish Science.
References
1. Peter M.A. Sloot, Ilkay Altintas, Marian Bubak, Charles A. Boucher: From Molecule to Man: Decision
Support in Individualized E-Health; IEEE Computer Society, vol 39, no.11, pp. 40-46, Nov., 2006.
2. Ewa Deelman, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Sonal Patil, Mei-Hui
Su, Karan Vahi, and Miron Livny. Pegasus: Mapping scientific workflows onto the grid. In Grid
Computing: Second European Across Grids Conference, AxGrids, LNCS 3165, pp. 11- 20, Springer,
2004.
3. Ian Taylor, Matthew Shields, Ian Wang, and Andrew Harrison. Visual grid workflow in Triana.
Journal of Grid Computing, 3(3-4):153-169, September 2005.
4. Tomasz Gubala and Andreas Hoheisel. Highly dynamic workflow orchestration for scientific
applications. CoreGRID Intergation Workshop 2006, pp. 309-320, ACC CYFRONET AGH, 2006.
5. Tomasz Bartynski, Marian Bubak, Tomasz Gubala, Maciej Malawski: Universal Grid Client: Grid
Operation Invoker; Proceedings of PPAM'07, Gdansk, September 2007, LNCS (to appear).
Download