Template to gather information about existing portal / middleware

Template to gather information about existing portal / middleware and infrastructure
solutions (for WP4 and WP5)
We intend to create three lists (portals, dispatchers, computing resources), linked to
one another.
Partner number and name: UU – Partner 8
● Portal solution(s)
○ Name
CS-ROSETTA3
○ URL
http://haddock.science.uu.nl/enmr/services/CS-ROSETTA3
○ Institution - who is hosting/responsible for the portal itself
UU
○ Description, applications supported
Chemical-shift-based NMR structure calculations using Rosetta.
Requires several applications to run, some on the grid and some only locally:
- CSRosetta2009_05
- DBScore
- ProfitV3.1
- csrosetta3
- nmrPipe
- rosetta3.5 (the only one really required on the grid)
- talosplus
-…
Altogether about 30 GB of software and data are required.
○ Implementation framework (PHP, Tomcat, Django, …)
The server front end consists of HTML pages (some with PHP). The CGI scripts are written in Python.
○ User AuthN/Z
■ Authentication mechanism - certificates, dedicated username/password,
identity federations …
Registration is required from the WeNMR site (the username and password
combination is used for submission). Access to the portal is via the
WeNMR SSO module (only granted with a valid X509 certificate
registered with the enmr.eu VO).
■ Authorization mechanism - VOMS, registration procedures, membership
renewal
Connection to the WeNMR SSO module. The server itself uses a robot
certificate for submission to the grid.
○ Details about size of datasets
■ How much data are uploaded by the user on job submission?
Typically a few MB to tens of MB
■ How much data are downloaded as the result?
Results are presented on a web page. The full result archive (gzipped tar
archive) can be up to several GB, depending on the system size.
■ Are any “background” data used, e.g. PDB referred by id?
No
○ essential statistics
■ number of active users
50 registered (WeNMR stats)
■ number of jobs per year
~50 runs, translating into 60-70 thousand individual grid jobs
■ average job length and number of CPU cores per job
The jobs dispatched to the grid as part of the complex workflow have an
average runtime of 1 ½ hours (EGI accounting portal stats), but this varies
greatly depending on the system size.
○ which job dispatcher is used (from the list in the next section)
Torque batch commands on the local resources, gLite WMS
○ computing resources accessed (from the list below)
local clusters (for pre- and post-processing) and EGI grid resources supporting
the enmr.eu VO
○ (Luna question) Do the compute jobs require a shared drive mount? Are they multi-node jobs (e.g. requiring MPI)?
No
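The usage figures above imply a rough per-run and per-year workload. The following back-of-the-envelope sketch derives it from the stated numbers only (~50 runs, 60-70 thousand grid jobs, 1 ½ h average runtime, single-CPU jobs). The variable names are illustrative, and since runtime varies strongly with system size the result is an order-of-magnitude estimate, not an accounting figure.

```python
# Rough workload estimate from the figures reported in this form.
RUNS_PER_YEAR = 50                       # "~50 runs"
GRID_JOBS_PER_YEAR = (60_000, 70_000)    # "60-70 thousand individual grid jobs"
AVG_JOB_HOURS = 1.5                      # average grid job runtime
CORES_PER_JOB = 1                        # single-CPU jobs


def estimate(jobs_per_year):
    """Return (grid jobs per run, CPU hours per year) for a yearly job count."""
    jobs_per_run = jobs_per_year / RUNS_PER_YEAR
    cpu_hours_per_year = jobs_per_year * AVG_JOB_HOURS * CORES_PER_JOB
    return jobs_per_run, cpu_hours_per_year


low = estimate(GRID_JOBS_PER_YEAR[0])
high = estimate(GRID_JOBS_PER_YEAR[1])
print(f"grid jobs per run: {low[0]:.0f}-{high[0]:.0f}")
print(f"CPU hours per year: {low[1]:.0f}-{high[1]:.0f}")
```

Under these assumptions a single run fans out into roughly 1200-1400 grid jobs, for about 90000-105000 CPU hours per year.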
● Job dispatcher
○ Overall description of architecture
Complex Python workflow that creates individual jobs and sends them to local resources or the
grid. Grid submission is handled by separate grid scripts (mostly csh scripts
running as cron daemons).
○ Supported backends (local batch system, gLite WMS/CREAM, Dirac, OCCI, …)
Local batch system (Torque/Maui), gLite WMS
○ Standard solution vs. proprietary
■ Interfaces, API
○ dispatcher vs. computing resource AuthN - user proxy certificates (using
MyProxy?), robotic certificates, …
Grid submission makes use of a robot certificate (in the name of Alexandre
Bonvin)
○ Application software distribution - VM images, Docker, xroot, ...
- see above
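The dispatcher described above routes each job either to the local Torque cluster or to the grid via the gLite WMS. A minimal sketch of that routing decision is given below. It is not the actual workflow code: the `Job` class, the `needs_grid` flag and all file names are hypothetical, and the sketch only builds the command line (Torque `qsub`, gLite `glite-wms-job-submit` with automatic proxy delegation) without executing anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    needs_grid: bool   # hypothetical flag: True for steps sent to the grid
    script: str = ""   # batch script, used for local Torque submission
    jdl: str = ""      # JDL file, used for grid submission


def submit_command(job: Job) -> List[str]:
    """Return the submission command for a job; nothing is executed here."""
    if job.needs_grid:
        # gLite WMS: -a delegates the proxy automatically,
        # -o writes the returned job identifier to a file.
        return ["glite-wms-job-submit", "-a", "-o", f"{job.name}.jobid", job.jdl]
    # Local Torque: -N sets the job name.
    return ["qsub", "-N", job.name, job.script]


jobs = [
    Job("preprocess", needs_grid=False, script="preprocess.sh"),
    Job("rosetta_0001", needs_grid=True, jdl="rosetta_0001.jdl"),
]
for job in jobs:
    print(" ".join(submit_command(job)))
```

Keeping command construction separate from execution makes it easy to log or dry-run submissions before anything is handed to the batch system or the grid.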
● Local computing resources
○ Operating system
Scientific Linux (SL5.X)
○ Number of CPUs, Memory, Storage (total / used by portals)
One cluster with ~180 cores and 4 TB of storage space. Results are only
stored for 2 weeks to limit storage requirements. Software and portal account for
~30 GB of disk space. Current results storage used is 3.5 GB.
○ Dispatcher / batch system
Torque
○ Interfaces, APIs
■ Not clear what is asked here. Seems redundant with the above.
● Remote computing resources
○ Provider
EGI
○ local to specific portal vs. external (accessed remotely)
All sites with the software tag VO-enmr.eu-ROSETTA3.3 added
○ Number of CPUs (or jobs), Memory, Storage (total / used by portals)
The portal sends single-CPU jobs to the grid (~50000 per year). Job+data size is
typically less than 20 MB. Only local temporary storage is used on the grid.
Results are recovered via the gLite WMS.
○ Interfaces, APIs
■ grid (CREAM, …)
○ Fixed, dynamic and/or opportunistic resources
Opportunistic resources from sites supporting the enmr.eu VO with the proper
software tag added
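Restricting jobs to sites that publish the software tag is typically expressed as a JDL `Requirements` clause. The sketch below generates such a JDL for a single-CPU job. The tag is the one quoted above, while the executable name, sandbox file names and the helper `make_jdl` are hypothetical placeholders. The attributes actually used by the portal's own grid scripts are not documented in this form.

```python
SOFTWARE_TAG = "VO-enmr.eu-ROSETTA3.3"  # software tag quoted in this form


def make_jdl(executable, input_files, output_files, tag=SOFTWARE_TAG):
    """Build a minimal gLite JDL description for one single-CPU job."""
    def quoted_list(items):
        return "{" + ", ".join(f'"{item}"' for item in items) + "}"

    lines = [
        f'Executable = "{executable}";',
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        f"InputSandbox = {quoted_list(input_files)};",
        f"OutputSandbox = {quoted_list(output_files + ['job.out', 'job.err'])};",
        # Only match sites that advertise the software tag.
        f'Requirements = Member("{tag}", '
        "other.GlueHostApplicationSoftwareRunTimeEnvironment);",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names, for illustration only.
print(make_jdl("run_rosetta.sh", ["run_rosetta.sh", "input.tgz"], ["result.tgz"]))
```

Because inputs and outputs travel in the sandboxes, this fits the description above: small job+data size, only local temporary storage on the worker node, and results retrieved through the WMS.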
● Software deployment / management
○ How is the software deployed? (e.g. sent with the jobs, remotely installed, other)
Software is deployed and managed by us. Initially this was done using the software manager
role in the enmr.eu VO, installing the software in the local software directory on
each grid site. This has now been replaced in most cases by CVMFS.
○ Licensing scheme
Free for non-profit users, but a license form is required
● Storage solutions
○ Local and network storage requirements
An active run on the server can generate several GB of data (even >10 GB) while
running. At most 10 concurrent runs are allowed (and typically at most
5 per user).
○ How are the results returned to the users?
Users are notified by email and can access their results on a web page.
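The storage figures given in this form (up to 10 concurrent runs, 10 GB or more per active run in the worst case, results kept for 2 weeks) can be turned into a simple capacity check and retention rule. The sketch below is illustrative only: the 10 GB per-run value is taken as a nominal worst case, and the function `is_expired` with its completion timestamp is an assumption, not a description of the server's actual cleanup procedure.

```python
from datetime import datetime, timedelta

MAX_CONCURRENT_RUNS = 10          # server-wide limit stated in this form
WORST_CASE_GB_PER_RUN = 10        # nominal value; some runs exceed 10 GB
RETENTION = timedelta(weeks=2)    # results are stored for 2 weeks

# Peak scratch space if every slot holds a worst-case run.
peak_scratch_gb = MAX_CONCURRENT_RUNS * WORST_CASE_GB_PER_RUN
print(f"peak scratch space: ~{peak_scratch_gb} GB")


def is_expired(finished_at, now):
    """True when a result set is older than the retention period."""
    return now - finished_at > RETENTION


# Arbitrary example dates, for illustration only.
now = datetime(2015, 6, 30)
print(is_expired(datetime(2015, 6, 1), now))
print(is_expired(datetime(2015, 6, 20), now))
```

With these nominal values the peak scratch requirement is on the order of 100 GB, well within the 4 TB of cluster storage reported above.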