Pegasus: Planning for Execution in Grids
Ewa Deelman
Information Sciences Institute
University of Southern California
Pegasus Acknowledgements

 Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui Su, Karan Vahi (ISI)
 James Blythe, Yolanda Gil (ISI)

http://pegasus.isi.edu

Research funded as part of the NSF GriPhyN, NVO, and SCEC projects.
Outline

 General Scientific Workflow Issues on the Grid
 Mapping complex applications onto the Grid
 Pegasus
 Pegasus Application Portal
    LIGO: gravitational-wave physics
    Montage: astronomy
 Incremental Workflow Refinement
 Futures
Grid Applications

 Increasing level of complexity
 Use of individual application components
 Reuse of individual intermediate data products (files)
 Description of data products using metadata attributes
 Execution environment is complex and very dynamic
    Resources come and go
    Data is replicated
    Components can be found at various locations or staged in on demand
 Separation between
    the application description
    the actual execution description
Application Development and Execution Process

[Figure: in the application domain, abstract workflow generation (application component selection, e.g. an FFT component) produces an abstract workflow; concrete workflow generation (resource selection, data replica selection, transformation instance selection) maps it onto the execution environment, e.g. "transfer filea from host1://home/filea to host2://home/file1" followed by "/usr/local/bin/fft /home/file1" (data transfer between host1 and host2). Failure recovery methods include retrying, picking different resources, and specifying a different workflow.]
Why Automate Workflow Generation?

 Usability: limit the user's necessary Grid knowledge
    Monitoring and Directory Service
    Replica Location Service
 Complexity:
    The user needs to make choices
       Alternative application components
       Alternative files
       Alternative locations
    The user may reach a dead end
    Many different interdependencies may occur among components
 Solution cost:
    Evaluate the alternative solution costs (sketched below)
       Performance
       Reliability
       Resource usage
 Global cost:
    Minimizing cost within a community or a virtual organization requires reasoning about individual users' choices in light of other users' choices
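The cost bullets above can be read as a weighted ranking over candidate mappings. A minimal Python sketch under that reading; the CandidateSite fields, the weights, and the score function are illustrative assumptions, not Pegasus's actual cost model:

    from dataclasses import dataclass

    @dataclass
    class CandidateSite:
        name: str
        expected_runtime_s: float   # performance estimate (e.g. from monitoring data)
        reliability: float          # fraction of past jobs that succeeded, 0..1
        cpu_hours_charged: float    # resource usage

    def score(site: CandidateSite, w_perf=1.0, w_rel=100.0, w_usage=0.1) -> float:
        # Lower is better: penalize slow, unreliable, and expensive sites.
        return (w_perf * site.expected_runtime_s
                + w_rel * (1.0 - site.reliability)
                + w_usage * site.cpu_hours_charged)

    def pick_site(candidates: list[CandidateSite]) -> CandidateSite:
        return min(candidates, key=score)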
Executable Workflow Construction

 Chimera builds an abstract workflow based on VDL descriptions
 Pegasus takes the abstract workflow and produces an executable workflow for the Grid
 Condor's DAGMan executes the workflow

[Figure: Chimera → abstract workflow → Pegasus → concrete workflow → DAGMan → jobs]
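For concreteness in the sketches that follow, the abstract workflow passed between these tools can be modeled as a DAG of logical jobs with logical file inputs and outputs. This Python sketch is an illustrative assumption, not Chimera's or Pegasus's actual data model:

    from dataclasses import dataclass, field

    @dataclass
    class AbstractJob:
        job_id: str                                       # e.g. "d2"
        transformation: str                               # logical component name, e.g. "fft"
        inputs: list[str] = field(default_factory=list)   # logical file names
        outputs: list[str] = field(default_factory=list)

    @dataclass
    class AbstractWorkflow:
        jobs: dict[str, AbstractJob] = field(default_factory=dict)
        children: dict[str, set[str]] = field(default_factory=dict)  # data dependencies

        def add_dependency(self, parent: str, child: str) -> None:
            self.children.setdefault(parent, set()).add(child)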
Pegasus: Planning for Execution in Grids

 Maps from abstract to concrete workflow
    Algorithmic and AI-based techniques
 Automatically locates physical locations for both components (transformations) and data
    Uses Globus RLS and the Transformation Catalog
 Finds appropriate resources to execute
    via Globus MDS
 Reuses existing data products where applicable
 Publishes newly derived data products
    Chimera virtual data catalog
Chimera is developed at ANL by I. Foster, M. Wilde, and J. Voeckler.

[Figure: system architecture. Virtual Data Language descriptions go to Chimera, which produces the abstract workflow. The Request Manager performs workflow planning, workflow reduction, and data management, consulting the Globus Monitoring and Discovery Service (available resources, dynamic information), the Globus Replica Location Service (replica location), the Transformation Catalog, and a replica and resource selector (information and models). The resulting concrete workflow is passed to the submission and monitoring system and the workflow executor (DAGMan), which runs the tasks on the Grid; monitoring information and data publication (including raw data from the detector) flow back into the system.]
Example Workflow Reduction

 Original abstract workflow: a → d1 → b → d2 → c
 If "b" already exists (as determined by a query to the RLS), the workflow can be reduced to: b → d2 → c
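A minimal sketch of this reduction step in Python, reusing the AbstractWorkflow sketch above; the rls_exists callable stands in for an RLS query and is an assumption, not the actual Pegasus interface:

    from typing import Callable

    def reduce_workflow(wf: AbstractWorkflow,
                        rls_exists: Callable[[str], bool]) -> AbstractWorkflow:
        """Drop jobs whose outputs are all already registered in the RLS."""
        reduced = AbstractWorkflow()
        for job in wf.jobs.values():
            # Keep the job only if at least one of its outputs is still missing.
            if not all(rls_exists(f) for f in job.outputs):
                reduced.jobs[job.job_id] = job
        # Keep only the dependencies between surviving jobs.
        for parent, kids in wf.children.items():
            if parent in reduced.jobs:
                for child in kids & reduced.jobs.keys():
                    reduced.add_dependency(parent, child)
        return reduced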
Mapping from abstract to concrete

 Query the RLS, MDS, and TC; schedule computation and data movement

[Figure: the reduced workflow b → d2 → c becomes a concrete workflow: move b from A to B; execute d2 at B; move c from B to U; register c in the RLS.]
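A hedged Python sketch of this concretization step, reusing the AbstractJob sketch above; the catalog lookups are stubbed out as parameters, and the ConcreteJob shape is an illustrative assumption, not Pegasus's submit-file format:

    from dataclasses import dataclass

    @dataclass
    class ConcreteJob:
        kind: str          # "transfer", "execute", or "register"
        description: str

    def concretize(job: AbstractJob, select_site, find_replica,
                   find_executable, output_site: str) -> list[ConcreteJob]:
        site = select_site(job)                    # resource choice, e.g. informed by MDS
        steps: list[ConcreteJob] = []
        for f in job.inputs:                       # stage in inputs
            src = find_replica(f)                  # replica lookup, e.g. via the RLS
            steps.append(ConcreteJob("transfer", f"move {f} from {src} to {site}"))
        exe = find_executable(job.transformation, site)  # e.g. via the Transformation Catalog
        steps.append(ConcreteJob("execute", f"run {exe} at {site}"))
        for f in job.outputs:                      # stage out and register outputs
            steps.append(ConcreteJob("transfer", f"move {f} from {site} to {output_site}"))
            steps.append(ConcreteJob("register", f"register {f} in the RLS"))
        return steps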
Simplified View of SC 2003 Portal

[Figure: the user authenticates to the portal via MyProxy. LIGO-specific and Montage-specific interfaces supply VDL and metadata; the Metadata Catalog Service and Chimera produce the abstract workflow and associated metadata, which Pegasus (drawing information from Globus MDS, Globus RLS, and the Transformation Catalog) turns into a concrete workflow. DAGMan submits the jobs to the Grid, and execution records and metadata flow back to the portal.]
LIGO Scientific Collaboration

 Continuous gravitational waves are expected to be produced by a variety of celestial objects
 Only a small fraction of potential sources are known
 Need to perform blind searches, scanning the regions of the sky where we have no a priori information about the presence of a source
    Wide-area, wide-frequency searches
 The search is performed for potential sources of continuous periodic waves near the Galactic Center and the galactic core
 The search is very compute and data intensive
 The LSC used the occasion of SC2003 to initiate a month-long production run with science data collected during 8 weeks in the Spring of 2003
Additional resources used: Grid3 iVDGL resources
LIGO Acknowledgements

 Bruce Allen, Scott Koranda, Brian Moe, Xavier Siemens, University of Wisconsin Milwaukee, USA
 Stuart Anderson, Kent Blackburn, Albert Lazzarini, Dan Kozak, Hari Pulapaka, Peter Shawhan, Caltech, USA
 Steffen Grunewald, Yousuke Itoh, Maria Alessandra Papa, Albert Einstein Institute, Germany
 Many others involved in the testbed
 www.ligo.caltech.edu
 www.lsc-group.phys.uwm.edu/lscdatagrid/
 http://pandora.aei.mpg.de/merlin/
 LIGO Laboratory operates under NSF cooperative agreement PHY-0107417
Montage (NASA and NVO)

 Deliver science-grade custom mosaics on demand
 Produce mosaics from a wide range of data sources (possibly in different spectra)
 User-specified parameters of projection, coordinates, size, rotation, and spatial sampling

[Figure: mosaic created by Pegasus-based Montage from a run on M101 galaxy images on the TeraGrid.]
Small Montage Workflow

[Figure: a small Montage workflow of ~1200 nodes.]
Montage Acknowledgments

 Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC
 Joseph C. Jacob, Daniel S. Katz, JPL
 http://montage.ipac.caltech.edu/
 Testbed for Montage: Condor pools at USC/ISI, UW Madison, and TeraGrid resources at NCSA, PSC, and SDSC
 Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.
Other Applications Using Chimera and Pegasus

 Other GriPhyN applications:
    High-energy physics: Atlas, CMS (many)
    Astronomy: SDSS (Fermi Lab, ANL)
 Astronomy:
    Galaxy morphology (NCSA, JHU, Fermi, many others, NVO-funded)
 Biology:
    BLAST (ANL, PDQ-funded)
 Neuroscience:
    Tomography (SDSC, NIH-funded)
Current System

[Figure: current Pegasus: the original abstract workflow is given to Pegasus(Abstract Workflow), which produces the concrete workflow; DAGMan(CW) then carries out the workflow execution.]
Workflow Refinement and Execution

[Figure: workflow refinement proceeds over time across levels of abstraction: from the user's request and application-level knowledge, to the relevant components, to logical tasks (the full abstract workflow), to tasks bound to resources and sent for execution by a task matchmaker, guided by policy information. Refinement is interleaved with partial execution: at any point part of the workflow has executed and part has not yet executed, and workflow repair responds to problems.]
Incremental Refinement

 Partition the abstract workflow into partial workflows (a sketch of one possible partitioning strategy follows below)

[Figure: a particular partitioning of the original abstract workflow yields a new abstract workflow whose nodes are the partial workflows PW A, PW B, and PW C.]
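As an illustration only, one simple way to partition a workflow DAG is by level (depth from the root jobs), so that each level becomes a partial workflow. A hedged Python sketch reusing the AbstractWorkflow type from the earlier sketch; it is not Pegasus's actual partitioner:

    def partition_by_level(wf: AbstractWorkflow) -> list[list[str]]:
        """Group job ids by depth from the roots; each level is one partial workflow."""
        parents: dict[str, set[str]] = {j: set() for j in wf.jobs}
        for parent, kids in wf.children.items():
            for child in kids:
                parents[child].add(parent)

        level: dict[str, int] = {}
        def depth(job_id: str) -> int:
            if job_id not in level:
                level[job_id] = 1 + max((depth(p) for p in parents[job_id]), default=-1)
            return level[job_id]

        for job_id in wf.jobs:
            depth(job_id)
        partitions = [[] for _ in range(max(level.values(), default=-1) + 1)]
        for job_id, lvl in level.items():
            partitions[lvl].append(job_id)
        return partitions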
Meta-DAGMan

 Pegasus(X): Pegasus generates the concrete workflow and the submit files for partition X, denoted Su(X)
 DAGMan(Su(X)): DAGMan executes the concrete workflow for X

[Figure: a meta-workflow in which each partition A, B, C is handled by a Pegasus(X) node that produces Su(X), followed by a DAGMan(Su(X)) node, ordered according to the dependencies among the partitions.]
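A hedged sketch of how such a meta-DAG might be emitted in DAGMan's .dag syntax (JOB and PARENT/CHILD are standard DAGMan keywords); the submit-file names and the simple chain ordering of partitions are illustrative assumptions:

    def write_meta_dag(partitions: list[str], path: str = "meta.dag") -> None:
        """Emit a meta-DAG: for each partition X, a planning node Pegasus(X)
        followed by an execution node DAGMan(Su(X))."""
        lines = []
        for p in partitions:
            # Hypothetical submit files: one runs Pegasus on partition p,
            # the other runs DAGMan on the submit files Su(p) that Pegasus produced.
            lines.append(f"JOB plan_{p} pegasus_plan_{p}.sub")
            lines.append(f"JOB run_{p} dagman_su_{p}.sub")
            lines.append(f"PARENT plan_{p} CHILD run_{p}")
        for a, b in zip(partitions, partitions[1:]):
            # Assume a simple chain A -> B -> C among partitions for illustration.
            lines.append(f"PARENT run_{a} CHILD plan_{b}")
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")

    # Example: write_meta_dag(["A", "B", "C"])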
Future Directions

 Incorporate AI-planning technologies in production software (Virtual Data Toolkit)
 Investigate various scheduling techniques
 Investigate fault tolerance issues
    Selecting resources based on their reliability
    Responding to failures

Download: http://pegasus.isi.edu