www.see-grid.eu
SEE-GRID-2 Training Event,
Tirana, Albania, 02 April 2008
Emanouil Atanassov
Institute for Parallel Processing, BAS
emanouil@parallel.bas.bg

Antun Balaz
WP3 Leader
Institute of Physics, Belgrade
antun@phy.bg.ac.yu
The SEE-GRID-2 initiative is co-funded by the European Commission under the FP6 Research Infrastructures contract no. 031775
Decisions about site configuration
OS installation, NTP, firewall issues
Java installation
Shared storage
Middleware installation
Configuration
Adding MPI support
APEL, Accounting configuration
SEE-GRID-2 Training Event, Tirana, Albania, 2 April 2008 2
Operating system: SL 4.x recommended; SLC, CentOS, and RHEL are compatible.
32- or 64-bit – 64-bit is the future.
Required service nodes – CE, SE (DPM recommended), MON.
WNs can be 32- or 64-bit, with a preference for 64-bit.
Virtualization is the recommended way to combine several service nodes on one physical machine.
An SL5 host running Xen with SL4 para- or fully-virtualized guests is a usable configuration.
The UI must be close to the user.
Storage – 1TB for the SE
RAM – at least 1 GB RAM per job/core
Internal networking – the goal should be to have all WNs on a single 1 Gbps switch.
External networking – the more the better
Firewalls
Avoid NAT worker nodes
Service nodes MUST have public IPs, and DNS resolution MUST work for them both ways
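A quick way to check the both-ways DNS requirement on a service node (a sketch; it assumes `hostname -f` returns the node's FQDN):

```shell
# Check that forward and reverse DNS agree for this service node
fqdn=$(hostname -f)
ip=$(getent hosts "$fqdn" | awk '{print $1}')
name=$(getent hosts "$ip" | awk '{print $2}')
echo "$fqdn -> $ip -> $name"   # the last name should equal the FQDN
```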
Install the latest SL 4.x (for some node types 3.0.x).
Keep the WNs homogeneous (cloning).
Be generous: install development packages and compilers.
Yum is recommended over apt-get because of its multiarch (x86_64/i386) support.
Locate a reliable, nearby NTP server for time synchronization!
Enable the DAG repository.
Do not allow automatic upgrades for the middleware repositories
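On SL4-family systems the stock yum package ships a nightly update cron job; one way to comply (assuming the standard SL4 setup) is to switch it off and run middleware updates manually:

```shell
# disable the nightly automatic yum update (SL/CentOS 4 style)
service yum stop
chkconfig yum off
```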
Use the latest Java 1.5. Follow the advice from: https://twiki.cern.ch/twiki/bin/view/EGEE/GLite31JPackage or: http://wiki.egee-see.org/index.php/SL4_WN_glite-3.1
Add the CA repository as shown in: http://grid-deployment.web.cern.ch/grid-deployment/yaim/repos/lcg-CA.repo
Install with yum install lcg-CA
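The two steps above amount to (URL as given above):

```shell
cd /etc/yum.repos.d
wget http://grid-deployment.web.cern.ch/grid-deployment/yaim/repos/lcg-CA.repo
yum install lcg-CA
```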
Pick the right repository https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310#Updates
Install the SEE-GRID VOMS server RPMs (and any additional RPMs for additional VOs). Currently: http://www.irb.hr/users/vvidic/seegrid/seegrid-0.5-1.noarch.rpm
http://www.grid.auth.gr/services/voms/SEE/GridAUTH-vomscert-1.2-5.noarch.rpm
Install gLite middleware with yum, using the right target:
CE: lcg-CE glite-TORQUE_utils glite-TORQUE_server
BDII_site
SE (DPM): glite-SE_dpm_mysql
MON: glite-MON
WN: glite-WN glite-TORQUE_client
UI: glite-UI
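Spelled out as commands, one per node type (run each on the corresponding machine):

```shell
yum install lcg-CE glite-TORQUE_utils glite-TORQUE_server   # on the CE
yum install glite-SE_dpm_mysql                              # on the SE (DPM)
yum install glite-MON                                       # on the MON box
yum install glite-WN glite-TORQUE_client                    # on each WN
yum install glite-UI                                        # on the UI
```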
Configuration consists mostly of editing several configuration files, which you can
“steal” from http://glite.phy.bg.ac.yu/GLITE-3/AEGIS/
Ideally these files should be the same on all your nodes.
Be careful with MON_HOST and REG_HOST. MON_HOST is the FQDN of your MON box. REG_HOST should be gserv1.ipp.acad.bg for SEE-GRID only sites, but lcgic01.gridpp.rl.ac.uk for EGEE sites.
Configuration is done with a single command:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n <node_type1> -n <node_type2> …
where you list node types, as in:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n lcg-CE -n TORQUE_server -n TORQUE_utils
IMPORTANT: More than one service node on the same logical computer is not supported and may result in a severe headache.
Sometimes the resulting configuration has known problems; then we manually patch the holes.
Examples: pool users must be able to ssh from a WN to the CE without a password (and without annoying warning messages).
The info provider on the CE uses the maui diagnose command; this is better avoided.
Timeouts in the info provider may have to be increased.
On WNs, /var/spool/pbs/mom_priv/config should contain:
$pbsserver CEFQDN
$restricted CEFQDN
$logevent 255
For PBS, a line can be added to use NFS (a local copy) instead of scp for staging output.
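With shared home directories, the NFS-instead-of-scp change is a `$usecp` line in the same mom config; the paths below assume homes are mounted at /home on all nodes:

```
# append to /var/spool/pbs/mom_priv/config on each WN
$usecp *:/home /home
```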
Recommendation: hard limits on the queues are best imposed with the qmgr command:
Qmgr> set queue seegrid max_running=21
Necessary changes in qmgr for MPI support – with the default limit of 48 hours max CPU time, an MPI job whose total CPU time (summed over all processes) exceeds 48 hours will be aborted. We suggest setting the CPU time limit much higher than 48 hours and relying on wall-clock time to impose a reasonable limit. Example:
Qmgr> set queue seegrid resources_max.cput=17280000 to enable MPI jobs taking up to 200 CPU-days, but:
Qmgr> set queue seegrid resources_max.walltime=259200 to allow up to 3 days of wall-clock usage.
In maui you can make reservations for specific groups (= VOs) or users (by DN – if you have added the patched torque submitter script, which also improves MPI support – see later).
Necessary changes in maui for MPI support:
ENABLEMULTIREQJOBS TRUE
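In maui.cfg this looks like the fragment below; the standing reservation for the seegrid group is only an illustration (group name and task count are assumptions):

```
# maui.cfg fragment
ENABLEMULTIREQJOBS    TRUE
# example standing reservation for the seegrid group (= VO)
SRCFG[seegrid]        GROUPLIST=seegrid TASKCOUNT=4 PERIOD=INFINITY
```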
Outbound connectivity – unlimited.
Inbound usually required on TCP ports 20000-25000.
For the MON box – 8443, 8088, 2135 (or 2170), 2136.
For the CE – 2170, 2135, 2119, 2811.
For the SE – 2170, 2811, 8443…
If you suspect a firewall problem on a service node, look at: netstat -anlp | grep LISTEN
to determine which ports have daemons listening.
UI can be under NAT, but this prevents some useful commands from working.
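For an iptables-based firewall with a default-deny INPUT policy (an assumption about your setup), the CE rules would look roughly like:

```shell
# open the standard CE ports (sketch – adapt to your firewall layout)
iptables -A INPUT -p tcp --dport 2119 -j ACCEPT          # GRAM gatekeeper
iptables -A INPUT -p tcp --dport 2811 -j ACCEPT          # GridFTP control
iptables -A INPUT -p tcp --dport 2170 -j ACCEPT          # BDII/info provider
iptables -A INPUT -p tcp --dport 2135 -j ACCEPT          # MDS
iptables -A INPUT -p tcp --dport 20000:25000 -j ACCEPT   # GLOBUS_TCP_PORT_RANGE
```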
Outbound connectivity – unlimited.
Public inbound connectivity usually required on TCP ports 20000-25000.
For the MON box – 8443, 8088, 2135, 2136.
For the CE – 2170, 2119, 2811.
For the SE – 2170, 2811, 8443…
Intra-cluster – difficult to manage effectively.
If you suspect a firewall problem, look at: netstat -anlp | grep LISTEN to determine which ports have daemons listening.
Control ssh access – penetration usually happens because of weak admin or user passwords. Ideally, replace password access with private-key access. Teach users not to use unprotected private keys – attackers are looking for these.
UIs are a security nightmare (but can be installed on an SL4.x VMware virtual machine behind NAT – on the user's laptop).
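A minimal sshd hardening along these lines (a sketch of /etc/ssh/sshd_config; restart sshd afterwards):

```
# /etc/ssh/sshd_config
Protocol 2
PasswordAuthentication no
PermitRootLogin without-password
```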
Home directories must be shared between the CE and WNs for good MPI support (read: if you are serious about MPI, they must be shared).
The /opt/exp_soft directory must be exported to the worker nodes (all sites). Appropriate permissions must be set (read/write for SGM accounts, read for user accounts).
Other POSIX-compliant shared filesystems are also possible.
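With plain NFS, the exports on the CE (or a dedicated NFS server) could look like this; the subnet is an example, and the SGM-write/user-read split is enforced with ordinary group permissions on /opt/exp_soft itself:

```
# /etc/exports on the NFS server (192.168.0.0/24 = your WN subnet, an example)
/home          192.168.0.0/24(rw,sync,no_root_squash)
/opt/exp_soft  192.168.0.0/24(rw,sync,no_root_squash)
```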
MPI support usually requires pool users to be able to ssh between WNs without a password. mpiexec can avoid that, but users have problems with mpiexec.
A cron script that kills runaway processes (processes run by users that do not have an active job on that node) must be in place.
Jobs run with mpiexec produce correct accounting (but are killed by the batch system if they go above the max CPU time limit for the queue). Solution – set max CPU time much higher than wall-clock time.
Jobs run with mpich2 also result in correct accounting, and can be run across sites (tested in SEEGRID)!
For WAN MPI support some new protocols are more promising.
MPI can be based on SCTP instead of TCP (some success, but requires some changes in site configuration).
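The runaway-killer logic mentioned above can be sketched as a small shell helper: given the users that have active jobs on the node and the users that own processes, it lists the owners whose processes should be killed (account names are made up; a real script would get these lists from the batch system and from ps):

```shell
#!/bin/sh
# find_orphans "users with active jobs" "users owning processes"
# prints pool users whose processes are runaways (no active job on this node)
find_orphans() {
    active=" $1 "
    for u in $2; do
        case "$active" in
            *" $u "*) ;;           # user has an active job – leave alone
            *) echo "$u" ;;        # runaway – candidate for pkill -u "$u"
        esac
    done
}

# example: seegrid002 owns processes but has no job on this node
find_orphans "seegrid001 seegrid003" "seegrid001 seegrid002 seegrid003"
```

A real cron job would feed in `ps -eo user=` output filtered to pool accounts and pkill the resulting users.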
EGEE moving towards standardized use of mpi-start:
http://glite.web.cern.ch/glite/packages/R3.1/deployment/glite-MPI_utils/3.1.1-0/glite-MPI_utils-3.1.1-0-update.html
http://www.grid.ie/mpi/wiki/YaimConfig
[glite-MPI_utils]
name=glite 3.1 MPI
enabled=1
gpgcheck=0
baseurl=http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-MPI_utils/sl4/i386/

yum install glite-MPI_utils
The configuration step becomes:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n MPI_CE
/opt/glite/yaim/bin/yaim -c -s site-info.def -n MPI_WN -n glite-WN -n TORQUE_client
Use submit filter for torque:
Edit /var/spool/pbs/torque.cfg
and add
SUBMITFILTER /var/spool/pbs/submit_filter.pl
steal mine from ce002.ipp.acad.bg: globus-url-copy gsiftp://ce002.ipp.acad.bg/var/spool/pbs/submit_filter.pl file:///tmp/submit_filter.pl
My advice – do not allow LAM for grid jobs
Changes in site-info.def:
jobmanager=pbs
CE_BATCH_SYS=pbs
(not torque!)
Add:
MPI-START
MPICH
MPICH-1.2.7
MPICH2
MPICH2-1.0.4
OPENMPI
OPENMPI-1.1
just after R-GMA in the GlueHostSoftwareEnvironment list.
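In site-info.def terms, the changes above look like the illustrative excerpt below (any tags beyond the slide's list are assumptions):

```shell
# site-info.def excerpt for an MPI-enabled CE
JOB_MANAGER=pbs
CE_BATCH_SYS=pbs      # not torque!
# GlueHostSoftwareEnvironment tags, MPI entries just after R-GMA:
CE_RUNTIMEENV="LCG-2 R-GMA MPI-START MPICH MPICH-1.2.7 MPICH2 MPICH2-1.0.4 OPENMPI OPENMPI-1.1"
```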
Install the MON box on SL 3.0.x.
Make sure the LcgRecords table has the correct format (on old installations some fields need to be made wider).
If using pbs (shared home dirs), change pbs.pm on the CE: http://glite.phy.bg.ac.yu/GLITE-3/AEGIS/pbs.pm
Make sure ports 8088 and 8443 are open.
After installation and configuration of MON box,