Implementation and Optimization of MPI point-to-point communications on SMP-CMP clusters with RDMA capability
Department of Computer Science at Florida State

MPI point-to-point communication
• Pairing MPI_Send with MPI_Recv, or MPI_Isend/MPI_Irecv/MPI_Wait
• There is an implicit synchronization
  – Receiver can complete only after sender performs the send; the communication operation cannot complete until both sender and receiver are ready.

MPI point-to-point communication
• Use different protocols for large and small messages
  o Eager protocol for small messages
    • Low-latency communication
    • Sender does not depend on the receiver
  o Rendezvous protocols for large messages
    • No message copy

Eager protocol

Rendezvous protocol

Existing RDMA-based small message channel – the MVAPICH design [Liu03]

Our improved design – eliminating persistent buffer association

Further improvement – node-shared small message channels

Optimizing Rendezvous protocol – ideal rendezvous protocol
• SS – Send start, SW – Send wait, RS – Receive start, RW – Receive wait.
• When both sender and receiver have initiated the communication, data transfer should start.

Optimizing Rendezvous protocol – the problem
• Poor progress

Optimizing Rendezvous protocol – the problem
• The performance is heavily affected by the timing of the events.
• Is it possible to have near-optimal performance for all timing situations?

How to use these protocols
• Dynamic protocol selection – design a mega-protocol that combines several of these protocols.
• Profile-guided optimization – use profiling to determine the timing information, and use the timing information to select the protocol.
• Compiler-assisted optimization – use compiler analysis to determine the timing information, and use the timing information to select the best-performing protocol.
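
To make the timing discussion concrete, here is a minimal Python sketch of two ideas from the slides: choosing eager or rendezvous by message size, and computing when an ideal rendezvous transfer would start given profiled SS, SW, RS and RW times. All names, the 16 KiB threshold, and the stall model (time blocked in a wait call) are assumptions made for illustration; none of them come from the slides or from any MPI library.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the slides give no value, and real MPI
# libraries expose their own tunable threshold.
EAGER_THRESHOLD_BYTES = 16 * 1024


def select_protocol(message_bytes, threshold=EAGER_THRESHOLD_BYTES):
    """Pick eager for small messages, rendezvous for large ones."""
    return "eager" if message_bytes <= threshold else "rendezvous"


@dataclass
class Timing:
    """Profiled event times for one message, in arbitrary time units."""
    ss: float  # Send start
    sw: float  # Send wait
    rs: float  # Receive start
    rw: float  # Receive wait


def ideal_rendezvous(t, transfer_time):
    """Ideal case: the transfer starts once both sides have initiated.

    Stall is modelled (as an assumption) as the time a process spends
    blocked in its wait call before the transfer finishes.
    """
    start = max(t.ss, t.rs)
    done = start + transfer_time
    return {
        "transfer_start": start,
        "transfer_done": done,
        "sender_stall": max(0.0, done - t.sw),
        "receiver_stall": max(0.0, done - t.rw),
    }


if __name__ == "__main__":
    profile = Timing(ss=0.0, sw=10.0, rs=2.0, rw=5.0)
    print(select_protocol(1024))
    print(select_protocol(1 << 20))
    print(ideal_rendezvous(profile, transfer_time=4.0))
```

A profile-guided or compiler-assisted scheme could compare such ideal figures against the measured behaviour of each candidate protocol and pick the one that comes closest for the observed timing pattern.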