SEE-GRID Site Installation and Configuration


SEE-GRID-2


www.see-grid.eu

SEE-GRID-2 Training Event,

Tirana, Albania, 02 April 2008

Emanouil Atanassov

Institute for Parallel Processing, BAS
emanouil@parallel.bas.bg

Antun Balaz

WP3 Leader

Institute of Physics, Belgrade antun@phy.bg.ac.yu

The SEE-GRID-2 initiative is co-funded by the European Commission under the FP6 Research Infrastructures contract no. 031775

Outline

Decisions about site configuration

OS installation, NTP, firewall issues

Java installation

Shared storage

Middleware installation

Configuration

Adding MPI support

APEL, Accounting configuration


Decisions about site configuration

Operating system: SL 4.x recommended (SLC, CentOS, and RHEL are also compatible)

32 or 64 bit – 64 bit is the future

Required service nodes – CE, SE (DPM recommended), MON.

WNs can be 32 or 64 with preference for 64

Virtualization is the recommended solution for combining more nodes on one physical node.

An SL5 host running Xen with SL4 para- or fully-virtualized guests is a usable configuration

UI must be close to the user


Decisions about site configuration

Storage – 1TB for the SE

RAM – at least 1 GB RAM per job/core

Internal networking – the goal should be to have all WNs on a single 1 Gbps switch.

External networking – the more the better

Firewalls

Avoid NAT worker nodes

Service nodes MUST have public IPs, and DNS resolution MUST work for them both ways


OS installation

Install the latest SL 4.x (for some node types 3.0.x)

Keep the WNs homogeneous (cloning)

Be generous: install development packages and compilers.

Yum is recommended over apt-get because of its multiarch (x86_64/i386) support.

Locate a reliable close NTP server for time synchronization!

Enable the dag repository

Do not allow automatic upgrades for the middleware repositories
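Keeping middleware packages out of unattended upgrades can be done with an exclude line in the repository definition; a sketch (the file name, baseurl, and package globs are illustrative, not the exact SEE-GRID values):

```
# /etc/yum.repos.d/glite-generic.repo (illustrative fragment)
[glite-generic]
name=gLite 3.1 generic repository
baseurl=http://glitesoft.cern.ch/EGEE/gLite/R3.1/generic/sl4/i386/
enabled=1
gpgcheck=0
# keep middleware out of unattended "yum update" runs:
exclude=glite-* lcg-*
```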


Java installation

Use latest Java 1.5. Follow advice from: https://twiki.cern.ch/twiki/bin/view/EGEE/GLite31JPackage or: http://wiki.egee-see.org/index.php/SL4_WN_glite-3.1


Middleware installation

Add the CA repository as shown in: http://grid-deployment.web.cern.ch/grid-deployment/yaim/repos/lcg-CA.repo

Install with yum install lcg-CA

Pick the right repository https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310#Updates

Install SEE-GRID VOMS server rpms (and any additional rpms for additional VOs). Currently: http://www.irb.hr/users/vvidic/seegrid/seegrid-0.5-1.noarch.rpm

http://www.grid.auth.gr/services/voms/SEE/GridAUTH-vomscert-1.2-5.noarch.rpm

Install glite middleware with yum, using the right target:

CE: lcg-CE glite-TORQUE_utils glite-TORQUE_server

BDII_site

SE (DPM): glite-SE_dpm_mysql

MON: glite-MON

WN: glite-WN glite-TORQUE_client

UI: glite-UI


Middleware configuration

Configuration consists mostly of editing several configuration files, which you can “steal” from http://glite.phy.bg.ac.yu/GLITE-3/AEGIS/

Ideally these files should be the same on all your nodes.

Be careful with MON_HOST and REG_HOST. MON_HOST is the FQDN of your MON box. REG_HOST should be gserv1.ipp.acad.bg for SEE-GRID only sites, but lcgic01.gridpp.rl.ac.uk for EGEE sites.

Configuration is done with one single command:

/opt/glite/yaim/bin/yaim -c -s site-info.def -n <node_type1> -n <node_type2> …

where you list the node types, as in:

/opt/glite/yaim/bin/yaim -c -s site-info.def -n lcg-CE -n TORQUE_server -n TORQUE_utils

IMPORTANT: More than one service node on the same logical computer is not supported and may result in severe headache.
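A minimal site-info.def fragment covering the variables discussed above (the example.org hostnames and site name are placeholders; only the REG_HOST values come from this slide):

```
# site-info.def (fragment, illustrative)
SITE_NAME=MY-SITE
CE_HOST=ce.example.org
MON_HOST=mon.example.org            # FQDN of your MON box
REG_HOST=gserv1.ipp.acad.bg         # SEE-GRID-only sites
# REG_HOST=lcgic01.gridpp.rl.ac.uk  # EGEE sites
```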


Patching the configuration

Sometimes the resulting configuration has known problems. Then we manually patch the holes.

Examples: pool users must be able to ssh from WN to CE without password (and without annoying warning messages)

The info provider on the CE uses the maui diagnose command; this is better avoided.

Timeouts in the infoprovider may have to be increased

On WNs cat /var/spool/pbs/mom_priv/config should return:

$pbsserver CEFQDN

$restricted CEFQDN

$logevent 255

For pbs a line could be added to use NFS instead of scp.
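For torque that line is the $usecp directive in the mom configuration; a sketch (CE name and paths are placeholders, assuming /home is the shared mount):

```
# /var/spool/pbs/mom_priv/config (additional line, illustrative)
$usecp ce.example.org:/home /home
```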


Maui and qmgr configuration

Recommendation: Hard limits on the queues are best imposed with qmgr command:

 Qmgr> set queue seegrid max_running=21

Necessary changes in qmgr for MPI support – with the default setting of max 48 hours CPU time, MPI jobs taking more than 48 hours of total CPU time will be aborted. We suggest setting the CPU time limit much higher than 48 hours and relying on wall clock time to impose a reasonable limit. Example:

Qmgr> set queue seegrid resources_max.cput=17280000 to enable MPI jobs taking up to 200 CPU days, but:

Qmgr> set queue seegrid resources_max.walltime=259200 to allow up to 3 days wall clock usage.

In maui you can make reservations for specific groups (=VOs), users (by DN - if you have added the patched torque-submitter script, which also improves MPI support – see later).

Necessary changes in maui for MPI support:

 ENABLEMULTIREQJOBS TRUE
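In maui.cfg this, together with an example standing reservation for a VO group, could look like the following sketch (the reservation name, group, and task count are illustrative):

```
# /var/spool/maui/maui.cfg (fragment, illustrative)
ENABLEMULTIREQJOBS      TRUE
# example: reserve 4 job slots for the seegrid group (= VO)
SRCFG[seegrid]          TASKCOUNT=4
SRCFG[seegrid]          GROUPLIST=seegrid
SRCFG[seegrid]          PERIOD=INFINITY
```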


iptables configuration

Outbound connectivity – unlimited.

Inbound connectivity usually required on TCP ports 20000-25000.

For MON box – 8443, 8088,2135 (or 2170), 2136.

For CE – 2170, 2135, 2119, 2811

For SE – 2170, 2811, 8443…

If you suspect a firewall problem on a service node, look at: netstat -anlp | grep LISTEN to determine on which ports daemons are listening.

UI can be under NAT, but this prevents some useful commands from working.
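As a sketch, the inbound rules for a CE might look like this in /etc/sysconfig/iptables (the chain name and exact port set depend on your setup; 20000-25000 is the usual Globus TCP port range):

```
# /etc/sysconfig/iptables (fragment, illustrative)
-A RH-Firewall-1-INPUT -p tcp --dport 20000:25000 -j ACCEPT  # Globus port range
-A RH-Firewall-1-INPUT -p tcp --dport 2119 -j ACCEPT         # gatekeeper
-A RH-Firewall-1-INPUT -p tcp --dport 2811 -j ACCEPT         # GridFTP control
-A RH-Firewall-1-INPUT -p tcp --dport 2170 -j ACCEPT         # BDII
-A RH-Firewall-1-INPUT -p tcp --dport 2135 -j ACCEPT         # MDS
```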


iptables configuration

Outbound connectivity – unlimited.

Public inbound connectivity usually required on TCP ports 20000-25000.

For MON box – 8443, 8088,2135, 2136.

For CE – 2170, 2119, 2811

For SE – 2170, 2811, 8443…

Intra-cluster – difficult to manage effectively.

If you suspect a firewall problem, look at: netstat -anlp | grep LISTEN to determine on which ports daemons are listening.

Control ssh access – usually penetration happens because of weak admin or user passwords. Ideally replace password access with private key access. Teach users not to use unprotected private keys – hackers are looking for these.

UIs are a security nightmare (but can be installed on an SL4.x VMware virtual machine behind NAT – on the user's laptop).


Shared filesystem

Home directories must be shared between the CE and WNs for good MPI support (read: if you are serious about MPI, they must be shared).

/opt/exp_soft directory must be exported to the worker nodes (all sites). Appropriate permissions must be set (read/write for SGM accounts, read for user accounts).

Other POSIX-compliant shared filesystems are also possible.
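A sketch of the corresponding NFS exports on the CE (or a dedicated file server); the network range and options are placeholders to adapt:

```
# /etc/exports (illustrative)
/home          192.168.0.0/24(rw,sync,no_root_squash)
/opt/exp_soft  192.168.0.0/24(rw,sync,no_root_squash)
```

On the server, per-VO directories under /opt/exp_soft would then be made group-writable for the corresponding SGM account and world-readable for ordinary pool users.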


MPI support

MPI support usually requires pool users to be able to ssh between WNs without a password. Mpiexec can avoid that, but users have problems with mpiexec.

A cron script that kills runaway processes (processes run by users that do not have an active job on that node) must be in place.
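The core of such a check can be sketched as follows (a minimal, hypothetical illustration; the two user lists are hard-coded here, whereas a real script would build them from getent and the batch system):

```shell
#!/bin/bash
# Sketch of a runaway-process check for a WN (hypothetical helper, not part of gLite).
# In production these lists would come from the system and from torque, e.g.
#   active_users=$(qstat -n ... )   # depends on the local batch setup
pool_users="seegrid001 seegrid002 seegrid003"   # pool accounts on this WN
active_users="seegrid002"                       # users with an active job here

candidates=""
for u in $pool_users; do
    case " $active_users " in
        *" $u "*) ;;                            # has an active job: leave alone
        *) candidates="$candidates $u" ;;       # runaway candidate
    esac
done
echo "runaway candidates:$candidates"
# a real cron script would now kill the candidates' processes, e.g. pkill -9 -u <user>
```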

Jobs run with mpiexec produce correct accounting (but are killed by the batch system if they go above the max CPU time limit for the queue). Solution – set max CPU time much higher than wall clock time.

Jobs run with mpich2 also result in correct accounting, and can be run across sites (tested in SEE-GRID)!

For WAN MPI support some new protocols are more promising.

MPI can be based on SCTP instead of TCP (some success, but requires some changes in site configuration).


MPI support

EGEE is moving towards standardized use of mpi-start:

 http://glite.web.cern.ch/glite/packages/R3.1/deployment/glite-MPI_utils/3.1.1-0/glite-MPI_utils-3.1.1-0-update.html

 http://www.grid.ie/mpi/wiki/YaimConfig

Repository definition and installation:

[glite-MPI_utils]
name=glite 3.1 MPI
enabled=1
gpgcheck=0
baseurl=http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-MPI_utils/sl4/i386/

yum install glite-MPI_utils

Configuration becomes:

 /opt/glite/yaim/bin/yaim -c -s site-info.def -n MPI_CE

/opt/glite/yaim/bin/yaim -c -s site-info.def -n MPI_WN -n glite-WN -n TORQUE_client 

Use submit filter for torque:

Edit /var/spool/pbs/torque.cfg

and add

SUBMITFILTER /var/spool/pbs/submit_filter.pl

steal mine from ce002.ipp.acad.bg: globus-url-copy gsiftp://ce002.ipp.acad.bg/var/spool/pbs/submit_filter.pl file:///tmp/submit_filter.pl

My advice – do not allow LAM for grid jobs


MPI support

Changes in site-info.def:

 jobmanager=pbs

 CE_BATCH_SYS=pbs

(not torque!)

 Add:

MPI-START

MPICH

MPICH-1.2.7

MPICH2

MPICH2-1.0.4

OPENMPI

OPENMPI-1.1

Just after R-GMA for the GlueHostSoftwareEnvironment.
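In yaim these tags go into the CE_RUNTIMEENV list in site-info.def; a sketch (the elided lines stand for whatever tags your site already publishes):

```
# site-info.def (fragment)
CE_RUNTIMEENV="
    ...
    R-GMA
    MPI-START
    MPICH
    MPICH-1.2.7
    MPICH2
    MPICH2-1.0.4
    OPENMPI
    OPENMPI-1.1
"
```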


Accounting issues

Install the MON box on SL 3.0.x

Make sure LcgRecords table has the correct format (on old installations some fields need to be made wider).

If using pbs (shared home dirs) change pbs.pm on the CE http://glite.phy.bg.ac.yu/GLITE-3/AEGIS/pbs.pm

Make sure ports 8088 and 8443 are open.

After installation and configuration of MON box,

