Software Overhead in Messaging Layers
Pitch Patarasuk

Messaging Layer
• Bridge between the hardware functionality and the user communication requirements
• Network hardware features
  – Arbitrary delivery order
  – Finite buffering
  – Limited fault handling
• User communication requirements
  – In-order delivery
  – End-to-end flow control
  – Reliable transmission

Messaging Layer (cont.)

Software overhead
• Main communication cost = hardware cost + software cost
• Software cost dominates hardware cost
• 50-70% of the software cost comes directly from the gap between the network features and the user requirements

Software overhead in messaging layers
• Analyze the costs of communication functionality and network features
• What overhead might be reduced if the underlying network provided a higher level of service?

CM-5 active message layer (CMAM)
• CM-5
  – Parallel machine with up to a few thousand nodes connected in an incomplete fat-tree topology
• Active message
  – Communication mechanism intended to expose the full hardware flexibility and performance of modern networks
  – Each message contains the address of a user-level handler, which is executed on message arrival with the message body as its argument
  – The handler extracts the data from the network and integrates it into the ongoing computation
• CMAM vs Send/Recv
  – The user has direct access to the network with no OS involvement; the data goes directly to the user-space computation

Software overhead cost analysis
• Consider implementations of 3 protocols
  – Single-packet delivery
  – Finite sequence, multi-packet delivery
  – Indefinite sequence, multi-packet delivery
• Use instruction counts for measurement

Single-packet delivery
  Description: Call/Return | NI Setup | Write to NI | Read from NI | Check NI status | Control Flow
  Source: 3, 2, 7, 3, 20
  Destination: 10, 3, 12, 2, 27

Finite sequence, multi-packet delivery
Indefinite sequence, multi-packet delivery
[Figure panels: message size = 16 words; message size = 1024 words]

Messaging layer with high-level network feature
• Given that CMAM is considered to be very efficient, there are 2 choices to reduce software overhead
  – Lower the user requirements
  – Raise the level of service provided by the network
• Compressionless Routing (CR)
  – Order-preserving transmission
  – Deadlock freedom independent of packet acceptance guarantees
  – Fault-tolerant transmission at the packet level

Single-packet delivery
• Has the same cost as the previous CMAM case
• However, it is guaranteed to be fault-free, with no deadlock or buffer overflow

Finite sequence, multi-packet delivery
Indefinite sequence, multi-packet delivery
[Figure panels: message size = 16 words; message size = 1024 words]

Discussion
• Larger packet sizes
  – Reduce overhead, especially for the indefinite-sequence protocol
• Improved network interfaces and DMA hardware
  – Network interface:
    • Only makes the basic cost faster, not the protocol cost in the messaging layer
    • Makes it more important for the messaging layer to be efficient
  – DMA:
    • Only reduces the cost of moving large amounts of data

Discussion (cont.)
• Implications for network design
  – Improving routing performance may increase software cost, e.g. out-of-order delivery
• Reducing communication features
  – Fault tolerance, in-order delivery
  – Puts the burden on the parallel software programmer
• Use instruction counts for measurement instead of latency
  – Latency is hard to measure in a portable fashion

Conclusion
• Software overhead is much larger than routing time, and 50-70% of it is messaging layer overhead
• Software overhead in the messaging layer is the cost of the gap between the network features and the user requirements
• This cost can be reduced to zero if the underlying network provides the functionality that the user requires

Questions?
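
Backup: active message dispatch (sketch)

The active-message mechanism described on the CMAM slide can be illustrated with a small Python sketch. This is not CMAM code: `Packet`, `Node`, `xfer_handler`, and `send_finite` are hypothetical names, the network is modeled as a shuffled list, and a Python function reference stands in for the handler address carried by each message. The sketch also assumes that each packet of a finite-sequence transfer carries its own offset, which is one way a messaging layer can rebuild a message when the network delivers packets in arbitrary order.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Packet:
    # Stands in for the address of the user-level handler carried by the message.
    handler: Callable[..., None]
    # Message body, handed to the handler as its arguments on arrival.
    body: Tuple


class Node:
    """Hypothetical receiving node; poll() models draining the network interface."""

    def __init__(self) -> None:
        self.buffers: Dict[int, List] = {}    # msg_id -> destination buffer
        self.remaining: Dict[int, int] = {}   # msg_id -> packets still expected
        self.completed: List[int] = []        # msg_ids that have fully arrived

    def poll(self, network: List[Packet]) -> None:
        # No OS involvement in this model: each arriving packet directly
        # invokes the handler it names, with the packet body as arguments.
        for pkt in network:
            pkt.handler(self, *pkt.body)


def xfer_handler(node: Node, msg_id: int, offset: int, word) -> None:
    # Integrate the data into the ongoing computation: store the word at its
    # offset, so the arrival order of packets does not matter.
    node.buffers[msg_id][offset] = word
    node.remaining[msg_id] -= 1
    if node.remaining[msg_id] == 0:
        node.completed.append(msg_id)


def send_finite(node: Node, msg_id: int, words: List, rng: random.Random) -> List[Packet]:
    # The length is known up front (finite sequence), so the destination
    # buffer can be allocated before any data packet is sent.
    node.buffers[msg_id] = [None] * len(words)
    node.remaining[msg_id] = len(words)
    packets = [Packet(xfer_handler, (msg_id, i, w)) for i, w in enumerate(words)]
    rng.shuffle(packets)  # model a network with arbitrary delivery order
    return packets


if __name__ == "__main__":
    rng = random.Random(0)
    node = Node()
    node.poll(send_finite(node, 7, [10, 20, 30, 40], rng))
    print(node.buffers[7], node.completed)
```

The per-packet bookkeeping in `xfer_handler` (offset store, counter decrement, completion check) is the kind of work the slides count as messaging layer overhead; a network that preserved order would let a handler drop the offset and simply append.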