High Productivity MPI: Grid, Multi-cluster and Embedded Systems Extensions
Presented by Dr. Anthony Skjellum, Chief Software Architect, Verari Systems Software, Inc.
September 30, 2004, HPEC Workshop

IMPI
• IMPI: Interoperable Message Passing Interface
• Developed and proposed by NIST
• A standard for interoperation across multiple
  – Implementations (IMPI, IMPI-2)
  – Architectures
  – Networks
  – Operating systems

Client, Host Concept
• MPI processes are spread across multiple clients
• A client represents the MPI processes belonging to a single implementation
• Hosts act as gateways for the processes of a client
• An IMPI application may have two or more clients
• A client may have one or more hosts
• Each host serves as a gateway for one or more MPI processes

Typical Scenario: Multi-vendor MPI
• Three clusters, each running a different MPI implementation (MPI/Pro, MPICH, LAM); each cluster makes one client
  – Client 1: 2 hosts, 2 MPI processes
  – Client 2: 1 host, 3 MPI processes
  – Client 3: 2 hosts, 6 MPI processes

MPI/Pro 1.7.0
• MPI/Pro 1.7.0 provides the first complete implementation of IMPI
• Enables interoperation between
  – Windows, Linux, and Mac OS X operating systems
  – 32-bit and 64-bit architectures
  – TCP, GM, and VAPI networks
  – Any combination of the above

Extensions
• IMPI does not address issues such as
  – Private IP addresses
  – Owner domains
  – Faults
  – Interacting dynamic groups
• These issues play a vital role on the Grid
• Verari proposed and implemented a novel method to address the issue of private IP addresses

Case Study: Private-IP-Enabled IMPI

Typical Cluster Setup
(Figure: compute nodes with private IP addresses behind a single head node)
• Compute nodes have private IP addresses
• External communication passes through a single head node or gateway
• Unsuitable for multi-cluster, multi-site communication

Network Address Translation (NAT)
(Figure: two clusters; private-IP compute nodes behind public-IP front ends)
• Firewalls block incoming connections
• NAT is used to serve requests generated within the cluster

NAT-based IMPI
(Figure: two clusters of private-IP compute nodes whose public-IP head nodes communicate through the IMPI server)
• Use NAT to generate dynamic mappings between the head node and the compute nodes
• Disseminate the dynamic mapping information through the IMPI server
• Release the mapped ports on the head node when the application completes

Performance
(Figure: IMPI bandwidth, MB/s vs. message length in KBytes, 0-1024 KB, for MPI/Pro and MPI/Pro with IMPI)

  Configuration          Latency (us)
  MPI/Pro without IMPI   142.45
  MPI/Pro with IMPI      147.35

Performance: IMPI using NAT
(Figure: bandwidth and streaming bandwidth, MB/s vs. message length in KBytes, 0-1024 KB)

Proposed Extensions
• IMPI extensions for MPI-2
• Open protocol-based initialization, such as SOAP
• Adaptation to the Grid
• Reduced user involvement
• Performance optimization
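Illustrative note on the client/host model: IMPI interoperation is handled below the MPI API, so the application itself is ordinary MPI code. Each client's processes are built and launched with their own vendor's MPI library, and the launch details (IMPI server address, client numbering) are implementation-specific and omitted here. A minimal sketch in C, assuming nothing beyond standard MPI-1 calls:

    /* Plain MPI program: under IMPI the same source runs unchanged even when
     * the ranks are split across different implementations (clients); the
     * cross-implementation routing happens below the MPI API. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char name[MPI_MAX_PROCESSOR_NAME];
        long local, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &len);

        /* A collective that necessarily crosses the client/host gateways
         * when ranks live in different clusters. */
        local = rank;
        MPI_Allreduce(&local, &total, 1, MPI_LONG, MPI_SUM, MPI_COMM_WORLD);

        printf("rank %d of %d on %s: sum of ranks = %ld\n",
               rank, size, name, total);

        MPI_Finalize();
        return 0;
    }

In the multi-vendor scenario above, the MPI_Allreduce traverses the host gateways between clients, yet the program is identical to a single-cluster MPI job.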
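To make the NAT-based scheme concrete, the sketch below models the head-node port-mapping lifecycle described on the NAT-based IMPI slide: create a dynamic mapping per compute-node endpoint, publish the mapping table through the IMPI server, and release the mapped ports when the application completes. All names (nat_mapping, install_mapping, publish_via_impi_server, release_mapping) and the addresses/ports are hypothetical; the real implementation installs NAT rules and exchanges the table during IMPI startup, which the stubs below only stand in for.

    /* Conceptual sketch of the head-node port-mapping lifecycle.
     * Not part of IMPI or MPI/Pro; helper names are illustrative. */
    #include <stdio.h>

    typedef struct {
        char private_ip[16];  /* compute node address inside the cluster */
        int  private_port;    /* port the MPI process listens on         */
        int  public_port;     /* port opened on the head node's public IP */
    } nat_mapping;

    /* Step 1: head node picks a free public port and adds a forwarding rule. */
    static void install_mapping(const nat_mapping *m)
    {
        printf("NAT: head:%d -> %s:%d\n",
               m->public_port, m->private_ip, m->private_port);
    }

    /* Step 2: the mapping table is sent to the IMPI server so remote clients
     * connect to head:public_port instead of an unreachable private address. */
    static void publish_via_impi_server(const nat_mapping *table, int n)
    {
        printf("publishing %d mappings through the IMPI server\n", n);
    }

    /* Step 3: mapped ports are released when the application completes. */
    static void release_mapping(const nat_mapping *m)
    {
        printf("releasing head-node port %d\n", m->public_port);
    }

    int main(void)
    {
        nat_mapping table[2] = {
            { "10.0.0.11", 4711, 35001 },
            { "10.0.0.12", 4711, 35002 },
        };
        for (int i = 0; i < 2; i++) install_mapping(&table[i]);
        publish_via_impi_server(table, 2);
        /* ... application runs; IMPI traffic flows through the mapped ports ... */
        for (int i = 0; i < 2; i++) release_mapping(&table[i]);
        return 0;
    }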
References
• Velusamy, V., et al. Communication Strategies for Private-IP-Enabled Interoperable Message Passing across Grid Environments. First International Workshop on Networks for Grid Applications, 2004.
• MPI Forum. MPI: A Message-Passing Interface Standard. 1994. http://www.mpi-forum.org/docs.
• IMPI Steering Committee. IMPI Draft Standard. 1999. http://impi.nist.gov/IMPI/.
• Gropp, W., et al. A High-Performance, Portable Implementation of the Message-Passing Interface. Parallel Computing, 22, 1996, pp. 789-828.
• Skjellum, A., and McMahon, T. Interoperability of Message-Passing Interface Implementations: A Position Paper. http://www.verarisoft.com/publications/files/interop121897.pdf