gUSE_EGEE_UF_2

Enabling Grids for E-sciencE
gUSE: grid User Support Environment
Peter Kacsuk, Krisztian Karoczkai, Andras Schnautigel, Istvan Marton, Gabor Herman
MTA SZTAKI
www.lpds.sztaki.hu
www.eu-egee.org
EGEE-II INFSO-RI-031688
EGEE and gLite are registered trademarks
Content
• Motivations
– Lessons learnt from P-GRADE portal
– Lessons learnt from accessing production Grid infrastructures
– Lessons learnt from providing multi-grid service
• The service-oriented architecture of gUSE
• Services in gUSE
• Workflow concept of gUSE
• Parameter sweep support of gUSE
– CancerGrid
• Usage of gUSE
– EDGeS
• Conclusions
3rd EGEE User Forum
Lessons learnt from P-GRADE portal
• Popular because it provides
– Easy-to-use but powerful workflow system (graphical editor, wf
manager, etc.)
– Easy-to-use parameter sweep concept support
– Easy-to-use MPI program execution support
– Multi-grid/multi-VO access mechanism (job submission grid
interoperability at workflow level) for LCG-2, gLite and GT2
• Its extension with GEMLCA enables
– The usage of legacy codes as grid-enabled services
– The usage of service/job repository
– Access to SRB and OGSA-DAI
– Multi-grid/multi-VO access mechanism for LCG-2, gLite, GT2
and GT4
– Data management level of grid interoperability
Popularity of P-GRADE portal
• It has been used in many EGEE and EGEE-related VOs:
– GILDA, VOCE, SEE-GRID, BalticGrid, BioInfoGrid, EGRID, etc.
• It has been used in many national grids:
– UK NGS (a GT2-based grid), Grid-Ireland, Turkish Grid, Croatian
Grid, Ukrainian Grid, etc.
• It has been used as the GIN VO Resource Testing
Portal
• It became open-source software (OSS) at the beginning of January 2008:
https://sourceforge.net/projects/pgportal/
Download of OSS P-GRADE portal
130 downloads
within a month
Limitations of P-GRADE portal
• Restricted workflow capabilities
– No cycle construct, no if-then-else, no embedding
• Static parameter sweep capabilities
– Parameter sweep (PS) cannot be used inside a workflow
• Single user view
– Too simple for IT people
– Too complicated for end-users
• Lack of collaborative tools supporting user
communities
• Monolithic architecture and, as a result, problems with
– Scalability:
  ▪ number of simultaneous jobs in the range of 100s
  ▪ number of simultaneous users in the range of 30-50
– Adaptivity: difficult to adapt to new grid services
Lessons learnt from accessing
production Grid infrastructures
• Production Grids do not let you modify anything; you can only
use their services (no matter whether they are good or bad)
• Usually they provide basic grid services
• The user has to construct higher-level services
• However, if you do not want to be locked into one
particular grid, the user-written service should be
interoperable with the basic grid services provided
by different grids
Motivations of creating gUSE
• We wanted
– to overcome the limitations of the current P-GRADE portal
– to create a set of high-level grid services that can be used with
many different grids
• Therefore we have defined a new service-oriented grid
layer that can be deployed
– on a single machine
– on a cluster
– on different grid sites as Web Services
• Performance comparison (number of simultaneously handled jobs)
– P-GRADE portal, monolithic architecture: 100-200 jobs
– WS-PGRADE/gUSE, SOA architecture: 10,000 jobs
Monolithic architecture of P-GRADE portal
[Figure: monolithic architecture of the P-GRADE portal. A single Web container on a single computer hosts the Web UI, the workflow and file storage (WFS, special file formats), the Workflow Engine and the Grid clients (built-in Grid API plus hacks for non-supported APIs). The components exchange workflows through special protocols: workflow save/read, workflow submit, and read workflow to run.]
gUSE architecture
[Figure: gUSE architecture in three layers. Top: the graphical user interface WS-PGRADE (Gridsphere portlets). Middle: gUSE autonomous services forming the high-level middleware service layer: workflow storage, workflow engine, broker/meta-broker, gUSE information system, submitters, file storage, application repository, logging. Bottom: gLite resources, Globus resources and Web services forming the low-level middleware service layer.]
Generic service communication
scheme in gUSE
[Figure: generic service communication scheme in gUSE. Function definitions consist of the definition of client functions (Client Interface) and the definition of server functions (Service Interface). On the client side, the Client Implementation holds the concrete implementation of the service calls and sends a service request via RPC. On the server side, the Service Front-end (RPC server front-end implementation) receives the request, and the Service Back-end holds the function implementations (Service Logic).]
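The split between client interface, client implementation, service front-end and service back-end named on this slide can be illustrated with a small sketch. This is a hypothetical Python example, not gUSE code: the names `StorageService`, `StorageBackend`, `StorageFrontEnd`, `StorageClient` and the functions `save`/`read` are assumptions made for illustration, and the RPC transport is replaced by an in-process JSON round trip.

```python
import json
from abc import ABC, abstractmethod


class StorageService(ABC):
    """Function definitions shared by client and server (hypothetical)."""

    @abstractmethod
    def save(self, name, content): ...

    @abstractmethod
    def read(self, name): ...


class StorageBackend(StorageService):
    """Service back-end: holds the actual service logic."""

    def __init__(self):
        self._items = {}

    def save(self, name, content):
        self._items[name] = content
        return True

    def read(self, name):
        return self._items[name]


class StorageFrontEnd:
    """Service front-end: decodes a request and dispatches it to the back-end."""

    def __init__(self, backend):
        self._backend = backend

    def handle(self, request):
        message = json.loads(request)
        if message["function"] not in ("save", "read"):
            return json.dumps({"error": "unknown function"})
        function = getattr(self._backend, message["function"])
        return json.dumps({"result": function(*message["args"])})


class StorageClient(StorageService):
    """Client implementation: turns each call into a service request."""

    def __init__(self, transport):
        # transport: callable taking a request string, returning a response string
        self._transport = transport

    def _call(self, function, *args):
        request = json.dumps({"function": function, "args": list(args)})
        return json.loads(self._transport(request))["result"]

    def save(self, name, content):
        return self._call("save", name, content)

    def read(self, name):
        return self._call("read", name)


# The front-end's handle method stands in for the network here.
front_end = StorageFrontEnd(StorageBackend())
client = StorageClient(front_end.handle)
client.save("wf1", "<workflow/>")
print(client.read("wf1"))
```

Because the caller only sees `StorageService`, the transport passed to `StorageClient` could be swapped for a real remote call without changing the calling code, which is the point of separating the interface from its implementations.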
Distributed SOA architecture
[Figure: distributed SOA architecture. The Web UI, WF Storage (WFS, special file formats inside), Workflow Executor (WFE), File Storage (special file formats inside) and Grid clients (Grid API) appear as separate components, with a Web container label. Numbered arrows 1-8 show the message flow between them; the arrow labels are: workflow list and config descriptor, workflow submit, workflow descriptor, files needed for wf execution, job submit, job info, and status back.]
Application developers’ view
• Users of gUSE can be either
– grid application developers
– or end-users.
• Application developers can develop sophisticated
workflow applications where
– workflows can be embedded into each other at any depth
– recursive workflows are allowed
– gUSE supports the following workflow types
  ▪ graphs (abstract workflows)
  ▪ workflow templates
  ▪ concrete workflows
  ▪ workflow instances
• Parameter sweep nodes and normal nodes can be
mixed freely.
Collaboration support between user
communities
• Application developers can
– publish
  ▪ incomplete wf applications (projects) and wf parts (templates, graphs,
concrete wfs, wf instances) into a workflow repository for use
by other developers
  ▪ ready-to-run wf applications for end-users
– import workflows from the repository and continue working
on them even if they were published by other developers
• End-users can
– import ready-to-run wf applications from the repository
– execute the imported ready-to-run wf applications through
a simplified portal interface that hides grid
details
• The grid is exposed only to application developers.
User activities
[Figure: user activities. Entities and their contents: Graph (jobs, edges, ports), Template (constraints, comments, form generators), Concrete Workflow (algorithms, resource references, inputs), Workflow Instance (running state, outputs) and Repository Item (applications, projects, workflow parts: G, T, CW, WI). Actions shown: New; Edit, Copy, Delete on graphs; Edit on templates; Configure, Copy, Delete on concrete workflows; Submit, which leads from a concrete workflow to a workflow instance; Observe, Download, Suspend, Delete on workflow instances; Export to and Import from the repository.]
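The four workflow types and the contents listed for each on this slide can be pictured as a small data model. The sketch below is hypothetical and not the gUSE data model: the class and field names, the link from every type back to its graph, and the check performed in `submit` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Abstract workflow: jobs connected by edges."""
    jobs: list
    edges: list  # (producer job, consumer job) pairs


@dataclass
class Template:
    """Graph plus constraints and comments (meaning of constraints assumed)."""
    graph: Graph
    constraints: dict = field(default_factory=dict)
    comments: str = ""


@dataclass
class ConcreteWorkflow:
    """Graph with everything needed to run it."""
    graph: Graph
    algorithms: dict  # job -> executable
    resources: dict   # job -> resource reference
    inputs: dict      # job -> input files

    def submit(self) -> "WorkflowInstance":
        # Assumed rule: every job needs an algorithm before submission.
        missing = [job for job in self.graph.jobs if job not in self.algorithms]
        if missing:
            raise ValueError(f"jobs without algorithm: {missing}")
        return WorkflowInstance(self)


@dataclass
class WorkflowInstance:
    """One submitted run of a concrete workflow."""
    workflow: ConcreteWorkflow
    state: str = "submitted"
    outputs: dict = field(default_factory=dict)


graph = Graph(jobs=["prepare", "simulate"], edges=[("prepare", "simulate")])
workflow = ConcreteWorkflow(
    graph,
    algorithms={"prepare": "prep.sh", "simulate": "sim.sh"},
    resources={"prepare": "gLite", "simulate": "GT4"},
    inputs={"prepare": ["data.in"]},
)
instance = workflow.submit()
print(instance.state)
```

Keeping the graph separate from the configuration lets the same graph be reused by several templates and concrete workflows, and lets one concrete workflow be submitted many times, each submission yielding its own instance.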
The workflow concept of gUSE
• The workflow concept of gUSE is much more flexible
than that of the P-GRADE portal and many other workflow
systems
• Its DAG topology is extended with
– embedded WFs
– recursive embedded WFs
– parameter sweep nodes
– conditional control mechanism
– special workflow starting control mechanisms based on
  ▪ external events or
  ▪ periodic timing
Workflow Graph: Overview
[Figure: the Workflow Editor as it appears for the user. Labelled elements: input port, output port, and node (job, service call (WS, legacy), or wf).]
Configuring the Workflow: Overview
[Figure: annotated workflow graph with the labels m, n, h, *K and 1. Legend: cross product, dot product.]
• Determine the number of accepted files on external input ports (m, n, h)
• A generator job produces multiple data items (*K) on its output port within one job submission step
• Determine the dot or cross product relation of the input ports to define the number of job submissions
• Determine a job to be a collector by defining a gathering input port. The job execution is postponed until all input files for that port have arrived
Animation: the number of generated output files
[Figure: animation of the workflow graph showing the file counts on the ports: m, n and h on the external inputs, then m*n, h*K, m*n*h*K, S and 1 further downstream.]
• In case of a generator job, the number of job submissions may differ from the number of files on the output ports
• In case of dot product, the job is submitted with input files having a common index number on each input port: S = max(m*n, h*K)
• In case of cross product, an individual job submission is generated for each possible input file combination
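The counting rules above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function names are assumptions, and the way `dot_product_inputs` pairs files when the ports hold different numbers of files (reusing the last file of the shorter port) is an assumption that the slide does not state. Only the counts m*n, h*K, m*n*h*K and S = max(m*n, h*K) come from the slide.

```python
from itertools import product
from math import prod


def cross_product_submissions(file_counts):
    """One submission per combination of input files across the ports."""
    return prod(file_counts)


def dot_product_submissions(file_counts):
    """One submission per common index; the slide gives S = max(...)."""
    return max(file_counts)


def generator_output_files(submissions, files_per_submission):
    """A generator job emits K files in each of its submissions."""
    return submissions * files_per_submission


def cross_product_inputs(file_counts):
    """File index tuples used by each cross-product submission."""
    return list(product(*(range(count) for count in file_counts)))


def dot_product_inputs(file_counts):
    """File index tuples used by each dot-product submission.

    Assumption (not on the slide): a port with fewer files reuses
    its last file once its own indices run out.
    """
    return [tuple(min(i, count - 1) for count in file_counts)
            for i in range(max(file_counts))]


m, n, h, K = 3, 2, 4, 5
first = cross_product_submissions([m, n])          # m*n = 6
second = generator_output_files(h, K)              # h*K = 20
print(cross_product_submissions([first, second]))  # m*n*h*K = 120
print(dot_product_submissions([first, second]))    # S = max(m*n, h*K) = 20
```

With m = 3, n = 2, h = 4 and K = 5, the cross product of the two intermediate ports yields 120 submissions while the dot product yields 20, which shows how quickly the choice of relation changes the load placed on the grid.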
An example CancerGrid workflow
N = 20,000-30,000, M = 100 => very large number of executions and files (N x M = 2-3 million)
[Figure: CancerGrid workflow graph. Nodes are annotated with their execution counts x1, xN and NxM; two of the nodes are generator jobs.]
Interoperability support
• gUSE supports:
– grid interoperability
– workflow interoperability
• gUSE can easily be connected to any known grid
middleware. It is already connected to GT2, GT4, LCG-2,
gLite and WS-based grid systems
• gUSE can also be connected to local systems like
clusters or supercomputers
• It contains a built-in grid broker that can automatically
distribute the jobs of a workflow among the
connected grids
• It can also use other grid brokers like the gLite broker or
GridWay
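How a broker might spread the jobs of a workflow over several connected grids can be sketched as follows. This is a hypothetical Python illustration, not the gUSE broker: the classes `Submitter` and `RoundRobinBroker` are invented names, and round-robin is an assumed placeholder policy, since the slides do not describe how the built-in broker chooses a grid.

```python
from itertools import cycle


class Submitter:
    """Hypothetical submitter for one middleware (for example gLite or GT4)."""

    def __init__(self, middleware):
        self.middleware = middleware
        self.submitted = []

    def submit(self, job):
        # A real submitter would call the middleware's client here.
        self.submitted.append(job)
        return f"{self.middleware}:{job}"


class RoundRobinBroker:
    """Hands jobs to the connected grids in turn (assumed policy)."""

    def __init__(self, submitters):
        if not submitters:
            raise ValueError("at least one submitter is required")
        self._next = cycle(submitters)

    def submit(self, job):
        return next(self._next).submit(job)


broker = RoundRobinBroker([Submitter("gLite"), Submitter("GT4")])
handles = [broker.submit(job) for job in ("job-a", "job-b", "job-c")]
print(handles)  # ['gLite:job-a', 'GT4:job-b', 'gLite:job-c']
```

Since the broker only relies on the `submit` method, adding another middleware or a local cluster means adding another submitter object, without touching the workflow that produces the jobs.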
Interoperability support: EDGeS
• EDGeS: Enabling Desktop Grids for e-Science
• Goal: to integrate EGEE with Desktop Grids (DGs)
• gUSE can provide transparent access to EGEE
and DGs
[Figure: WS-PGRADE and the application repository sit on top of gUSE, which connects to the EGEE service grid and to desktop grids: several LocalDEG instances, a University DG, and a GlobalDEG volunteer DG.]
Family of user support products
[Figure: 1st generation: P-GRADE portal and P-GRADE/GEMLCA portal; 2nd generation: WS-PGRADE portal.]
• P-GRADE portal and gUSE/WS-PGRADE represent a
family of user support products
• They support the whole range of user types:
– Novice application developers: 1st generation P-GRADE portals
– Advanced application developers: 2nd generation WS-PGRADE
portal developer view
– End-users without grid knowledge: 2nd generation WS-PGRADE portal end-user view
Family of P-GRADE products
and their use
• P-GRADE
– Parallelizing applications for clusters and grids
• P-GRADE portal
– Creating simple workflow and parameter sweep applications for
grids
• P-GRADE/GEMLCA portal
– Creating workflow applications using legacy codes and
community codes from repository
• gUSE/WS-PGRADE
– Creating complex workflow and parameter sweep applications
for clusters, service grids and desktop grids
– Creating workflow applications using embedded workflows,
legacy codes and community workflows from workflow repository
Conclusions / Future plans
• gUSE overcomes all the listed limitations of the P-GRADE
portal:
– The implementation of gUSE is highly scalable; it can be distributed on
a cluster or even on different grid sites.
– Stress tests show that it can simultaneously serve thousands of
jobs
– Its workflow concept is much more expressive than that of the P-GRADE
portal (recursive wf, generic PS support, etc.)
– Its user interface, WS-PGRADE, provides a graphical
workflow editor that is much faster than the one in the P-GRADE
portal
– gUSE provides a workflow repository for use by end-users
and application developers
– gUSE solves grid interoperability at workflow level
  ▪ among service grids
  ▪ between service grids and desktop grids (see EDGeS project)
Roadmap of gUSE
• First version was demonstrated at SC’07
• First version will be released in March 2008 with full
support for EGEE, GT2 and GT4
• Second version will be released in July 2008 with full
support for desktop grids
• Third version solving interoperability between EGEE
and desktop grids will be released by SC’08