MPI support in gLite
Enol Fernández
CSIC
MPI on the Grid
• Submission/Allocation (CREAM/WMS)
  – Definition of job characteristics
  – Search and select adequate resources
  – Allocate (or co-allocate) resources for the job
• Execution (MPI-Start)
  – File distribution
  – Batch system interaction
  – MPI implementation details
Allocation / Submission
• Process count is specified with the CPUNumber attribute:

Type          = "Job";
CPUNumber     = 23;
Executable    = "my_app";
Arguments     = "-n 356 -p 4";
StdOutput     = "std.out";
StdError      = "std.err";
InputSandbox  = {"my_app"};
OutputSandbox = {"std.out", "std.err"};
Requirements  = Member("OPENMPI",
    other.GlueHostApplicationSoftwareRunTimeEnvironment);
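A minimal sketch of how such a JDL might be submitted with the gLite WMS command-line tools (the JDL file name and delegation identifier are illustrative, not part of the slides):

$ glite-wms-job-delegate-proxy -d mydeleg
$ glite-wms-job-submit -d mydeleg -o jobids.txt my_app.jdl
$ glite-wms-job-status -i jobids.txt
$ glite-wms-job-output -i jobids.txt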
MPI-Start
• Specify a unique interface to the upper layer for running an MPI job
• Allow support for new MPI implementations without modifications to the Grid middleware
• Support "simple" file distribution
• Provide some support to help the user manage their data
[Diagram: MPI-Start sits between the Grid middleware and the MPI implementations on the resources]
MPI-Start Design Goals
• Portable
  – The program must be able to run under any supported operating system
• Modular and extensible architecture
  – Plugin/component architecture
• Relocatable
  – Must be independent of absolute paths, to adapt to different site configurations
  – Remote "injection" of mpi-start along with the job
• "Remote" debugging features
MPI-Start Architecture
[Diagram: the MPI-Start CORE with its plugin families: Scheduler plugins (PBS/Torque, SGE, LSF), Execution plugins (Open MPI, MPICH2, LAM, PACX), and Hooks (File Dist., Compiler, User, Local)]
Using MPI-Start (I)

JobType       = "Normal";
CpuNumber     = 4;
Executable    = "starter.sh";
InputSandbox  = {"starter.sh"};
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
Requirements  = Member("MPI-START",
    other.GlueHostApplicationSoftwareRunTimeEnvironment)
  && Member("OPENMPI",
    other.GlueHostApplicationSoftwareRunTimeEnvironment);

$ cat starter.sh
#!/bin/sh
# This is a script to call mpi-start
# Set environment variables needed
export I2G_MPI_APPLICATION=/bin/hostname
export I2G_MPI_APPLICATION_ARGS=
export I2G_MPI_TYPE=openmpi
export I2G_MPI_PRECOMMAND=time
# Execute mpi-start
$I2G_MPI_START

stdout:
Scientific Linux CERN SLC release 4.5 (Beryllium)
Scientific Linux CERN SLC release 4.5 (Beryllium)
lflip30.lip.pt
lflip31.lip.pt

stderr:
real    0m0.731s
user    0m0.021s
sys     0m0.013s
Using MPI-Start (II)

…
CpuNumber    = 4;
Executable   = "mpi-start-wrapper.sh";
Arguments    = "userapp OPENMPI some app args…";
InputSandbox = {"mpi-start-wrapper.sh"};
Environment  = {"I2G_MPI_START_VERBOSE=1", …};
...

#!/bin/bash
MY_EXECUTABLE=$1
shift
MPI_FLAVOR=$1
shift
export I2G_MPI_APPLICATION_ARGS=$*
# Convert flavor to lowercase for passing to mpi-start.
MPI_FLAVOR_LOWER=`echo $MPI_FLAVOR | tr '[:upper:]' '[:lower:]'`
# Pull out the correct paths for the requested flavor.
eval MPI_PATH=`printenv MPI_${MPI_FLAVOR}_PATH`
# Ensure the prefix is correctly set. Don't rely on the defaults.
eval I2G_${MPI_FLAVOR}_PREFIX=$MPI_PATH
export I2G_${MPI_FLAVOR}_PREFIX
# Setup for mpi-start.
export I2G_MPI_APPLICATION=$MY_EXECUTABLE
export I2G_MPI_TYPE=$MPI_FLAVOR_LOWER
# Invoke mpi-start.
$I2G_MPI_START
MPI-Start Hooks (I)
• File distribution methods
  – Copy the files needed for execution using the most appropriate method (shared filesystem, scp, mpiexec, …)
• Compiler flag checking
  – Checks correctness of compiler flags for 32/64 bits, changes them accordingly
• User hooks:
  – build applications
  – data staging
MPI-Start Hooks (II)

#!/bin/sh
pre_run_hook () {
    # Compile the program.
    echo "Compiling ${I2G_MPI_APPLICATION}"
    # Actually compile the program.
    cmd="mpicc ${MPI_MPICC_OPTS} -o ${I2G_MPI_APPLICATION} ${I2G_MPI_APPLICATION}.c"
    $cmd
    if [ ! $? -eq 0 ]; then
        echo "Error compiling program. Exiting..."
        exit 1
    fi
    # Everything's OK.
    echo "Successfully compiled ${I2G_MPI_APPLICATION}"
    return 0
}

…
InputSandbox = {…, "myhooks.sh", …};
Environment  = {…, "I2G_MPI_PRE_HOOK=myhooks.sh"};
…
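The same hooks file can also handle the data staging mentioned on the previous slide; a minimal sketch, assuming a post-hook variable that mirrors the pre-hook above (the variable name I2G_MPI_POST_HOOK, the file names and the packing step are illustrative, not part of the slides):

#!/bin/sh
post_run_hook () {
    # Illustrative data staging: pack the run output so it can be
    # returned via the OutputSandbox or copied to storage afterwards.
    echo "Packing results of ${I2G_MPI_APPLICATION}"
    tar czf results.tar.gz output.dat
    if [ ! $? -eq 0 ]; then
        echo "Error packing results."
        return 1
    fi
    return 0
}

…
Environment = {…, "I2G_MPI_POST_HOOK=myhooks.sh"};
…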
MPI-Start: more features
• Remote injection
  – mpi-start can be sent along with the job
    • Just unpack, set environment and go!
• Interactivity
  – A pre-command can be used to "control" the mpirun call:
    $I2G_MPI_PRECOMMAND mpirun …
  – This command can:
    • Redirect I/O
    • Redirect network traffic
    • Perform accounting
• Debugging
  – 3 different debugging levels:
    • VERBOSE: basic information
    • DEBUG: internal flow information
    • TRACE: set -x at the beginning; full trace of the execution
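A minimal sketch of how these levels can be switched on from the JDL Environment; I2G_MPI_START_VERBOSE appears earlier in this talk, while the DEBUG and TRACE variable names are assumed here to follow the same pattern:

Environment = {"I2G_MPI_START_VERBOSE=1",
               "I2G_MPI_START_DEBUG=1",
               "I2G_MPI_START_TRACE=1"};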
Future work (I)
• New JDL description for parallel jobs (proposed by the EGEE MPI TF):
  – WholeNodes (True/False):
    • whether or not full nodes should be reserved
  – NodeNumber (default = 1):
    • number of nodes requested
  – SMPGranularity (default = 1):
    • minimum number of cores per node
  – CPUNumber (default = 1):
    • number of job slots (processes/cores) to use
• CREAM team working on how to support them
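An illustrative sketch of a JDL fragment using the proposed attributes listed above (they were still a proposal at this point, so the exact syntax and the values shown are assumptions):

JobType        = "Normal";
WholeNodes     = True;
NodeNumber     = 2;
SMPGranularity = 4;
CPUNumber      = 8;
Executable     = "mpi-start-wrapper.sh";
…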
Future work (II)
• Management of non-MPI jobs
  – New execution environments (OpenMP)
  – Generic parallel job support
• Support for new schedulers
  – Condor and SLURM support
• Explore support for new architectures:
  – FPGAs, GPUs, …
More Info…
• gLite MPI PT:
  – https://twiki.cern.ch/twiki/bin/view/EMI/GLiteMPI
• MPI-Start trac:
  – http://devel.ifca.es/mpi-start
  – contains user, admin and developer docs
• MPI Wiki @ TCD:
  – http://www.grid.ie/mpi/wiki