Platform MPI Background

IBM Systems & Technology Group
Platform MPI 9.1.2
Introduction to Platform MPI
Dec, 2013
© 2013 IBM Corporation
Topics
Introduction
Key Concepts in Parallel Computing
MPI Background
Platform MPI Background
Installing Platform MPI
Building & Running MPI Applications
Who am I?
Stan Graves <gravess@us.ibm.com>
Platform MPI Software Developer
• Joined HP-MPI team ~7 years ago
• Primary customer support contact
• Customer facing technical trainer
ClusterPack Project Manager & Developer
• ~3 years Developer, ~2 years Project Manager
• Scale Out Cluster Management Solution
• Parallel system administration tools
• System image creation & distribution
What is Parallel Computing?
“Parallel Computing” is an overloaded term
Parallelism at bit level, instruction level, data level, and task level can all claim to be “parallel computing”
“Parallel machines” can be some combination of:
• “Scale Up”: Multiple processing elements within a single machine (multiple sockets per machine and/or multiple cores per socket)
• “Scale Out”: Multiple computers in a cluster (or grid or cloud)
For purposes of this discussion:
• Parallel computing is the coordination of work across many individual processes running concurrently (i.e. “in parallel”) on one or more computers to solve a single problem.
What is Parallel Computing?
Parallel Programming Opportunities
Clusters can be scaled to meet problem size
Commercial-Off-The-Shelf (COTS) hardware components to build clusters are readily available
Many possible cluster styles depending on customer needs
• High Availability
• Load Balancing
• Computation Clusters
• Grid Computing
• Cloud Computing
What is Parallel Computing?
Parallel Programming Difficulties
The coordination & synchronization of multiple processes running concurrently introduces complexity as the number of processes increases
Common software bugs with parallel computing
• Race conditions
• Deadlocks
• Synchronization errors
• Parallel slowdown
Interconnect bandwidth & latency are often limiting factors for overall parallel program throughput
Interconnect fabrics are increasing in complexity (e.g. torus configurations, multi-card & multi-port networks)
What is Parallel Computing?
Parallel Programming Difficulties
There is no “Silver Bullet” (Fred Brooks, “The Mythical Man-Month”)
• “Accidental complexity” is complexity that is incidental and non-essential to the problem that needs to be solved.
• “Essential complexity” is inherently part of the problem that needs to be solved.
“Amdahl’s Law” gives the theoretical speedup of a task when only a part of it is improved. It is often used to estimate the expected effect of adding processors to a parallel application.
What is MPI?
Message Passing Interface (MPI)
API specification for the interchange of messages between individual computing processes
De facto standard for parallel programming
Formalized topology, synchronization, and communication for multiple process jobs
Language independent, platform independent
Many different implementations
• Typically not ABI compatible
MPI Forum
• http://www.mpi-forum.org/
General overview at Wikipedia
• http://en.wikipedia.org/wiki/Message_Passing_Interface
What is a rank?
The MPI Standard uses the term “rank” to describe a process that is part of an MPI job
Ranks have two unique identifiers
• Global Rank ID
• Local Rank ID
A rank “is a” process (for “almost all” implementations)
• All threads in a process are part of the same rank
• The standard does NOT define a rank this way
Generally, MPI jobs are run with one rank per core
MPI Key Concepts
MPI-1 Standard
Communicator
Communicators are objects connecting groups of processes in the MPI session. Every MPI job includes the default MPI_COMM_WORLD communicator, which contains all the ranks that are part of the job.
Point-to-point Operations
Point-to-point communication involves sending and receiving messages between two processes. A commonly used pair is MPI_Send and MPI_Recv.
Collective Operations
Applications may require coordinated operations among multiple processes. For example, all ranks need to cooperate to sum sets of numbers.
Derived Datatypes
MPI functions take a datatype argument that describes the data being sent. Predefined MPI datatypes include MPI_INT, MPI_CHAR, and MPI_DOUBLE; for example, if every rank holds an array of ints, all ranks can send their arrays to the root with MPI_Gather and MPI_INT. Derived datatypes extend this to structured or non-contiguous data built from the predefined types.
MPI Key Concepts
MPI-2 Standard
One-Sided Communication
Defines three operations: Put, Get, and Accumulate, which are respectively a write to remote memory, a read from remote memory, and a reduction operation on the same memory across a number of tasks. Also defined are global, pairwise, and remote locks.
Collective Extensions
Various extensions to collective operations (e.g. MPI_Reduce_local).
Dynamic Process Management
The key aspect of this MPI-2 feature is "the ability of an MPI process to participate in the creation of new MPI processes or to establish communication with MPI processes that have been started separately" (e.g. MPI_Comm_spawn).
Parallel I/O
A collection of functions designed to allow the difficulties of managing I/O on distributed systems to be abstracted away to the MPI library, as well as allowing files to be easily accessed in a patterned fashion using the existing derived datatype functionality.
MPI Key Concepts
MPI-3 Standard
Non-blocking Collectives
Non-blocking versions of all collective calls. Allows the overlapping of collective communication with application computation.
Neighborhood Collectives
Neighborhood (aka sparse) collective operations. Allows the user to define specific topology patterns for sparse communication in collective operations.
One-sided communication operations
Fortran 2008 Bindings
Clarifications to existing parts of the MPI 2.2 Standard

Platform MPI 9 - Market Leadership
Auto-Detect Interconnect
• Runtime detection or specification of interconnect protocol
• Broad range of interconnects
Improved Automated Benchmarking of selected Collective Operations
• Improved dynamic collective algorithm selection
• Improved user benchmarking for better algorithm selection
LSB (Linux Standard Base)
• Abstracts system calls
• Allows one MPI library to run with multiple libc versions
Linux/Windows Support
• Shared source base for both OSs makes them “bug” compatible: a fix made once applies to all OSs
Scheduler Neutral
• LSF, PE-POE, PBS/Torque, Slurm, MS-HPC
TCP performance
• Optimized TCP point-to-point communications
• Runtime tunable to optimize large messages for TCP
Improved Collective Algorithms
• Improved Reduce/AllReduce, Allgather, Alltoall
What is Platform MPI?
• Platform MPI is a proprietary implementation of the V1.2, V2.2 and (partially) V3.0 MPI Standards.
• More ISVs have standardized on Platform MPI and distribute Platform MPI than any other commercial MPI
ISV applications shown:
• Abaqus
• ANSYS, CFX, FLUENT
• RADIOSS
• Molpro (University of Cardiff)
• NX Nastran
• AMLS
Major Product Milestones
Timeline from 1997 to 2013 (years marked: 1997, 2003, 2006, 2009, 2010, 2012, 2013). Milestones shown:
• HP-MPI 1.1 for HP-UX: first release
• HP-MPI 2.0 for Linux: first release
• HP-MPI 1.0 for Windows: first release
• HP-MPI 2.3.1: last release
• Platform MPI 7.1: first release
• Platform MPI 8.0: first “simultaneous” release of Linux & Windows; first release to incorporate Platform MPI 5.6.x features (ScaliMPI)
• IBM Platform MPI 8.3: first release
• Platform MPI 9.1: performance & scalability enhancements
• Platform MPI 9.1.2: Community Edition
Platform MPI Advantage
Proven scalability to 128K ranks
Supports broad set of interconnects
• InfiniBand: SDR, DDR, QDR, FDR
• Mellanox (OFED/IBV)
• QLogic (PSM)
• 10G
• Myrinet (MX)
Multiple protocol support
• Interconnect ‘native’ protocols (IBV, PSM, MX, ...)
• uDAPL
• TCP
• Shared memory
Automatic interconnect detection/selection at runtime
• Develop, Debug, Test with TCP and run with IBV.
Platform MPI Advantage
Supports a wide variety of schedulers
• LSF for Linux and Windows
• PE-POE
• Windows HPC
• SLURM
• PBS Pro
MPI 2.2 Standard Compliance for all products
Preview of selected MPI 3.0 Features
• Non-blocking collectives
• Fortran 2008 Bindings
MPI 3.1 Draft High Availability APIs (Standard Edition Only)
Installation and Upgrades
Platform MPI
• All files are located under the MPI_ROOT installation location
• The installation directory is relocatable
• Uses InstallAnywhere on Linux & Windows
• Installation files can be removed after installation
• Product can be installed one time and copied or included in installation images
What is MPI_ROOT?
Platform MPI can have multiple installed versions on the same cluster.
MPI_ROOT is an environment variable that should be set to the installation root of the desired version.
If MPI_ROOT is not set, “mpirun” will try to “intuit” it. This process is “good” but NOT “foolproof.”
Linux: $MPI_ROOT/bin/mpirun …
Windows: %MPI_ROOT%\bin\mpirun …
Platform MPI Editions
Edition Overview
Platform MPI 9.1.2 Community Edition
• Free download from IBM developerWorks (registration required)
• Limited to 4096 ranks per job
• Limited support for “Cluster Test Tools”, no support for HA Features
• developerWorks Forum for support (best effort basis only)
• ONLY “full versions” will be released on the developerWorks portal
Community Edition + Support
• Limited to 8192 ranks per job
• Limited support for “Cluster Test Tools”, no support for HA Features
• Entitled to Support and FixPacks
Standard Edition
• No limits on job size (64k+ ranks)
• Full access to “Cluster Test Tools” and HA Features
• Entitled to Support and FixPacks
“Hello World”
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int  rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks in the job */
    MPI_Get_processor_name(name, &len);     /* name of the host running this rank */

    printf("Hello world! I'm %d of %d on %s\n", rank, size, name);

    MPI_Finalize();                         /* shut down the MPI runtime */
    exit(0);
}
Compiling with MPI
For each language, the compiler wrappers search for compilers in a specific order:
1. If the user has established a preference with the environment variables MPI_CC, MPI_CXX, MPI_F77, MPI_F90, that takes precedence.
2. Otherwise, the wrapper searches the language-specific compiler list from left to right and uses the first compiler that can be found in the user’s PATH.
We recommend Option #1.
Use -show to view the compiler wrapper’s options and command line.
% $MPI_ROOT/bin/mpicc -show example.c
Compiling with Platform MPI
The compiler wrappers are provided as a “template”
The “-show” option should be used to examine the compile & link commands
• ONLY shared libraries are shipped & supported
• Build commands can be included in a more complex build environment
• Support will be “limited” and on a best effort basis
Building an example application
Best Practice: Set MPI_ROOT in the environment to the installed location of Platform MPI.
Multiple versions of Platform MPI can be installed on the same machine.
Platform MPI should be installed in a shared file system location, or locally on each machine in the cluster.
Platform MPI ships with several example applications that can be used for testing.
The “demo” directory is on a shared file system across all the nodes in the cluster. The Platform MPI version is installed locally on each node.
Running with Platform MPI
Setting “MPI_ROOT” and invoking the appropriate mpirun command is sufficient to start the MPI job
• $MPI_ROOT/bin/mpirun …
In most cases it is NOT appropriate to add Platform MPI directories to system level environment variables
• PATH
• LD_LIBRARY_PATH
• LD_PRELOAD
Stan Graves (gravess@us.ibm.com)
QUESTIONS?