Using Heterogeneous Paths for Inter-process Communication in a Distributed System

Vimi Puthen Veetil
Instructor: Pekka Heikkinen, M.Sc. (Tech.), Nokia Siemens Networks
Supervisor: Professor Raimo Kantola

Agenda
1. Background: What is Performance Based Path Determination?
2. Objectives
3. Performance Tests: delay measurements, throughput measurements
4. Applying PBPD

Background (1/4)
- The performance of inter-process communication in distributed systems is of utmost importance: it influences the performance of the entire system.
- Processes running in distributed processing systems have heterogeneous communication needs: some events require low-latency communication, while others need high bandwidth.
- Several different communication networks, such as ATM, HIPPI, and FDDI, are available today. Each was originally developed for a different application domain, and each performs differently for different types of communication.
- Processing nodes with built-in support for multiple heterogeneous networks are becoming common, yet the simultaneous use of multiple networks for performance improvement is not in wide use.
- "No single network can provide the best performance for all types of communication within a single application!"

Background (2/4)
What is Performance Based Path Determination?
- A method of utilizing the multiple heterogeneous networks available in a distributed system to enhance the performance of each type of communication between processes.
- It utilizes the differences in the performance characteristics of the different paths.
- Two techniques:
  - Performance Based Path Selection (PBPS)
  - Performance Based Path Aggregation (PBPA)
[Figure: latency vs. message size for Networks A and B in three cases (X, Y, Z), with crossover points a and b]

Background (3/4)
Performance Based Path Selection
- Applicable when one of the networks exhibits better performance than the other(s) in one situation, while another is better in another situation.
- With latency y modeled as a linear function of message size x for each network:
    f1(x) = a1*x + l1
    f2(x) = a2*x + l2
- Dynamically select the appropriate communication path for a given communication event, using message parameters such as the message size, the type of communication, etc.:
    fPBPS(m1) = Best[f1(m1), f2(m1)]
[Figure: two latency lines f1 and f2 (slopes a1, a2) crossing; f1(m1) = t1, f2(m2) = t2]

Background (4/4)
Performance Based Path Aggregation
- Applicable when two co-existing networks show similar characteristics.
- Two identical networks are aggregated into a single virtual network.
- Each message is divided into submessages that are transferred over the available networks simultaneously.
- Segmentation into submessages and reassembly at the destination can add substantial overhead.
- Submessage sizes are calculated so that both transfers finish at the same time:
    f1(x)|x=m1 = t1 = a1*m1 + l1
    f2(x)|x=m2 = t2 = a2*m2 + l2
    t1 = t2,  m = m1 + m2
  Solving:
    m1 = (a2*m + l2 - l1) / (a1 + a2)
    m2 = (a1*m + l1 - l2) / (a1 + a2)
[Figure: aggregated latency line fPBPA(x) below f1 and f2; message m split into m1 and m2]

Objectives
- Measure and compare the performance of ATM and Fast Ethernet as node interconnects in a distributed system.
- Investigate the possibility of achieving performance enhancement in inter-process communication in an experimental distributed system, using the techniques of Performance Based Path Determination, when ATM and Ethernet co-exist.

Performance Tests
- Tests done on a mini-network: two Intel PCs (i586 & i686) connected back to back, with independent Fast Ethernet and ATM connections.
- Operating system: Red Hat Linux (kernel 2.4.18).
- Ethernet NIC: 3COM 3c905C-TX; ATM NIC: ENI 155P-MF.
- Measurements taken from the perspective of application programs.
- UDP/IP and TCP/IP protocols used over Ethernet.
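As a minimal sketch (not from the thesis), the PBPA submessage-size formulas above can be computed directly. The example coefficients are only loosely derived from the measurements reported later (roughly 80 µs and 51 µs of transmission delay per 1000 bytes for Ethernet and ATM, and their minimal-message one-way delays); they are illustrative, not exact fits.

```python
def pbpa_split(m, a1, l1, a2, l2):
    """Split an m-byte message into submessages m1 + m2 = m so that both
    transfers, modeled as f_i(x) = a_i*x + l_i, finish at the same time."""
    m1 = (a2 * m + l2 - l1) / (a1 + a2)
    m2 = (a1 * m + l1 - l2) / (a1 + a2)
    return m1, m2

# Hypothetical coefficients: a in microseconds/byte, l in microseconds.
m1, m2 = pbpa_split(2000, a1=0.080, l1=63.0, a2=0.051, l2=82.5)
# Both paths now take the same modeled time: a1*m1 + l1 == a2*m2 + l2.
print(m1, m2)
```

Note that for very small m (or very different fixed latencies l1, l2) one of the solutions can go negative, which simply means the whole message should take the faster path instead of being split.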
- AAL5 used over ATM.
- No load in the network or on the CPU.
[Figure: test setup; the two PCs connected by a cross-over cable between the Ethernet NICs and multimode fiber between the ATM NICs]

Tested Parameters
- Delay as a function of message size.
- Throughput as a function of message size.

Delay measurements
- Round-trip delay measured; message sizes from 5 bytes to 60 kbytes.
- Results:
  - For a minimal-size message, the one-way delay is 63 µs with UDP/IP, 84.5 µs with TCP/IP, and 82.5 µs with ATM.
  - Below 200 bytes, Ethernet with UDP has the lower delay (about 25% better).
  - For bigger messages, ATM performed better; e.g., for 2000 bytes ATM showed about a 50% improvement.

Delay (contd.)
Delay breakdown:
- Propagation delay: negligible in our test results.
- Transmission delay: significant as message size increases. E.g., for 1000 bytes it is 80 µs for Ethernet and 51 µs for ATM, a significant portion of the total delays of 150 µs and 123 µs, respectively.
- Nodal processing: the major contributor. For 1000-byte messages over Ethernet with UDP, it was found to be 29 µs for sending and 65 µs for receiving. The most expensive parts are interrupt handling and copying data to user space.

Throughput measurements
- Receiving throughput measured; the faster processor was used as the receiver.
- Results:
  - Ethernet reached a maximum throughput of 93.3 Mbps with TCP and 95.64 Mbps with UDP.
  - ATM reached a maximum throughput of 135 Mbps.
  - Protocol overhead reduces the maximum achievable throughput.
  - Below 1000 bytes, Ethernet offers the higher bandwidth!

Delay measurements with background traffic
- Delay measured with load in the network and on the CPU: multiple applications communicating, with similar traffic in the background.
- Results:
  - The performance of Ethernet degrades.
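The round-trip measurement described above can be sketched as follows. This is an illustrative loopback version (a single process using UDP over localhost), not the thesis measurement code, so the absolute numbers reflect local-socket overhead rather than NIC and link behavior:

```python
import socket
import threading
import time

def measure_rtt(msg_size, port=15000):
    """Time one UDP round trip of a msg_size-byte message over loopback.
    One-way delay is then estimated as rtt / 2, as in the tests above."""
    echo = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    echo.bind(("127.0.0.1", port))

    def echo_once():
        data, addr = echo.recvfrom(65535)
        echo.sendto(data, addr)  # reflect the message back to the sender

    t = threading.Thread(target=echo_once)
    t.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"x" * msg_size
    start = time.perf_counter()
    client.sendto(payload, ("127.0.0.1", port))
    reply, _ = client.recvfrom(65535)
    rtt = time.perf_counter() - start

    t.join()
    client.close()
    echo.close()
    assert reply == payload
    return rtt

print(f"RTT for 1000 bytes: {measure_rtt(1000) * 1e6:.1f} us")
```

A real measurement would average many repetitions and discard warm-up rounds, since a single round trip is dominated by cache and scheduling noise.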
  - ATM shows no significant impact on delay.
  - The available throughput dropped.
[Figure: delay (microseconds, 0 to 800) vs. message size (bytes, up to 2500) for TCP, UDP, and ATM, at 20% and 40% background load]

Performance Tests
Conclusions from the tests:
- ATM has higher throughput and smaller delay for bigger messages.
- For small messages, ATM's delay is comparable to Ethernet's (depending on the transport protocol).
- ATM performs better on a loaded network; overall, ATM gave the better performance.
- More information is still needed, e.g., goodput ratio, connection set-up time, reliability, etc.
- Suitability to a system depends on the application domain, the nature of the system's inter-process messaging, etc.

Applying PBPD to the test network
Performance Based Path Selection
- Delay: for messages smaller than 200 bytes, a 20 µs improvement if Ethernet is used with UDP; above 200 bytes, ATM behaves better. By dynamically selecting the appropriate path, delay performance improves. Applicability in real systems depends on the system.
- Throughput: below 1000 bytes Ethernet offered the higher throughput; for bigger messages ATM is better.
- Yes! An excellent possibility for improvement!

Applying PBPD to the test network
Performance Based Path Aggregation
- Implemented a user process that can segment messages and send them over the two paths using two threads, with a similar process for receiving. Pre-determined submessage sizes were used for test purposes.
- Delay: performance degrades. Segmentation and reassembly add substantial overhead, and other factors make sending and receiving slower than over a single network.
- Throughput: significant improvement. Better than Ethernet alone over the entire tested message-size range, and better than ATM above 2500 bytes.

Applying PBPD
Conclusions:
- PBPD can offer performance improvement in some systems.
- It is not a surefire solution for all performance problems; the outcome depends on many factors, including the processing power of the computing nodes, the protocols used, etc.

Future Work
Numerous possibilities!
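The dynamic selection described above can be sketched as a simple dispatch on message size. The 200-byte crossover and the two paths come from the delay results reported earlier; the path names are hypothetical labels, and a real system would derive the threshold from the fitted latency models f1 and f2 rather than hard-coding it:

```python
# Crossover observed in the delay tests: below ~200 bytes Ethernet with
# UDP has the lower delay; above it, ATM does.
DELAY_CROSSOVER_BYTES = 200

def select_path(message: bytes) -> str:
    """Performance Based Path Selection: pick the path whose measured
    latency curve is lower for this message size."""
    if len(message) < DELAY_CROSSOVER_BYTES:
        return "ethernet-udp"
    return "atm-aal5"

print(select_path(b"x" * 50))    # small message -> "ethernet-udp"
print(select_path(b"x" * 2000))  # large message -> "atm-aal5"
```

The same dispatch could key on other message parameters mentioned in the slides, such as the type of communication, by adding further branches.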
- Cost effectiveness of PBPD: whether the additional hardware and R&D costs are justified by the performance improvement.
- Goodput ratio and connection set-up time for ATM and Ethernet.

Thank you!