The research group on Networks & Distributed systems (ND)

ND activities
• ICON – Interconnection Networks
  – Interconnection networks are tightly coupled, short-distance networks with extreme demands on bandwidth, latency, and delivery
  – Problem areas: effective routing/topologies, fault tolerance/dynamic reconfiguration, and Quality of Service
• VINNER – End-to-end Internet communications
  – Problem area: network resilience – a set of methods and techniques that improve the user's perception of network robustness and reliability

ND activities
• QuA – Support of Quality of Service in component architectures
  – Problem area: how to develop QoS-sensitive applications on a component architecture platform, and how dynamic QoS management and adaptation can be supported
• Relay – Resource utilization in time-dependent distributed systems
  – Problem area: reduce the effects of resource limitations and geographical distances in interactive distributed applications, through a toolkit of kernel extensions, programmable subsystems, protocols and decision methods

Assessment of Data Path Implementations for Download and Streaming
Pål Halvorsen¹,², Tom Anders Dalseng¹ and Carsten Griwodz¹,²
¹Department of Informatics, University of Oslo, Norway
²Simula Research Laboratory, Norway

Overview
• Motivation
• Existing mechanisms in Linux
• Possible enhancements
• Summary and conclusions

Delivery Systems
[Figure: servers and clients connected through a network and internal bus(es)]
[Figure: the application resides in user space; the file system and the communication system reside in kernel space, attached to the devices by bus(es)]
[Figure: data path through the Intel Hub Architecture – Pentium 4 processor (registers, caches), memory controller hub, RDRAM banks holding separate file system, communication system and application buffers, I/O controller hub, and PCI slots attaching the disk and the network card – several in-memory data movements and context switches]

Motivation
• Data copy operations are expensive
  – consume CPU, memory, hub, bus and interface resources (proportional to data size)
  – profiling shows that ~40% of CPU time is consumed by copying data between user and kernel space
  – the gap between memory and CPU speeds increases
  – different access times to different memory banks
• System calls cause a lot of switches between user and kernel space

Zero-Copy Data Paths
[Figure: the file system and the communication system in kernel space exchange only a data_pointer; no data is copied through the application in user space]

Motivation (revisited)
• A lot of research has been performed in this area
• BUT, what is the status of commodity operating systems today?

Existing Linux Data Paths

Content Download
[Figure: data moves from the file system through the application to the communication system]

Content Download: read / send
[Figure: data is DMA-transferred from disk into the page cache, copied by read() into the application buffer, copied by send() into the socket buffer, and DMA-transferred to the network card]
• 2n copy operations
• 2n system calls

Content Download: mmap / send
[Figure: mmap() maps the page cache into the application's address space; send() copies the data into the socket buffer for the DMA transfer to the network card]
• n copy operations
• 1 + n system calls

Content Download: sendfile
[Figure: sendfile() appends a descriptor to the socket buffer; a gather DMA transfer moves the data directly from the page cache to the network card]
• 0 copy operations
• 1 system call

Content Download: Results
• Tested transfer of a 1 GB file on Linux 2.6
• Both UDP (with enhancements) and TCP
[Figure: measured results for UDP and TCP]

Streaming
[Figure: data moves from the file system through the application to the communication system]

Streaming: read / send
[Figure: as for content download – read() copies data from the page cache into the application buffer, send() copies it into the socket buffer]
• 2n copy operations
• 2n system calls

Streaming: read / writev
[Figure: read() copies data from the page cache into the application buffer; writev() copies the RTP header and the payload into the socket buffer]
• 3n copy operations – one copy more than the previous solution
• 2n system calls
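The read/send download path above can be sketched as a short user-space loop; this is a minimal illustration, not the authors' code – the helper name and the 4 KB block size are ours, and out_fd would be a TCP socket in the download scenario (any writable descriptor works):

```c
#include <errno.h>
#include <unistd.h>

/* Classic read/send data path: each block is copied from the page cache
 * into a user-space buffer by read(), then copied back into a kernel
 * (socket) buffer by write()/send() -- 2n copies and 2n system calls
 * for n blocks. */
ssize_t copy_read_send(int in_fd, int out_fd)
{
    char buf[4096];                       /* user-space staging buffer */
    ssize_t n, total = 0;
    while ((n = read(in_fd, buf, sizeof buf)) > 0) {  /* copy 1: kernel -> user */
        ssize_t off = 0;
        while (off < n) {                 /* copy 2: user -> kernel */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Every byte crosses the user/kernel boundary twice, which is exactly the overhead the mmap- and sendfile-based paths below avoid.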
Streaming: mmap / send
[Figure: mmap() maps the page cache into the application's address space; per packet, cork, a send() of the RTP header, a send() of the payload, and uncork; both are copied into the socket buffer]
• 2n copy operations
• 1 + 4n system calls

Streaming: mmap / writev
[Figure: mmap() maps the page cache; one writev() per packet copies the RTP header and the payload into the socket buffer]
• 2n copy operations
• 1 + n system calls – three calls less than the previous solution

Streaming: sendfile
[Figure: per packet, cork, a send() of the RTP header (copied into the socket buffer), a sendfile() that appends a descriptor for a gather DMA transfer of the payload, and uncork]
• n copy operations
• 4n system calls

Streaming: Results
• Tested streaming of a 1 GB file on Linux 2.6
• RTP over UDP
• Compared to not sending an RTP header over UDP, we get an increase of 29% (additional send call)
• More copy operations and system calls required → potential for improvements
[Figure: results, with TCP sendfile (content download) shown for comparison]

Enhanced Streaming Data Paths

Enhanced Streaming: mmap / msend
[Figure: per packet, cork, a send() of the RTP header, an msend() of the payload, and uncork; msend() appends a descriptor for a gather DMA transfer]
• msend allows sending data from an mmap'ed file without a copy
• n copy operations – one copy less than the previous solution
• 1 + 4n system calls

Enhanced Streaming: mmap / rtpmsend
[Figure: one rtpmsend() per packet copies the RTP header into the socket buffer and appends a descriptor for the payload]
• RTP header copy integrated into the msend system call
• n copy operations
• 1 + n system calls – three calls less than the previous solution

Enhanced Streaming: mmap / krtpmsend
[Figure: one krtpmsend() per packet; an RTP engine in the kernel adds the RTP headers; a gather DMA transfer moves headers and payload to the network card]
• An RTP engine in the kernel adds RTP headers
• 0 copy operations – one copy less than the previous solution
• 1 system call – one call less than the previous solution
Enhanced Streaming: rtpsendfile
[Figure: per packet, one rtpsendfile() copies the RTP header into the socket buffer and appends a descriptor for a gather DMA transfer of the payload]
• RTP header copy integrated into the sendfile system call
• n copy operations
• n system calls – the existing solution requires three more calls per packet

Enhanced Streaming: krtpsendfile
[Figure: one krtpsendfile() per packet; an RTP engine in the kernel adds the RTP headers; a gather DMA transfer moves headers and payload to the network card]
• An RTP engine in the kernel adds RTP headers
• 0 copy operations – one copy less than the previous solution
• 1 system call – one call less than the previous solution

Enhanced Streaming: Results
• Tested streaming of a 1 GB file on Linux 2.6
• RTP over UDP
[Figure: results for the mmap-based and the sendfile-based mechanisms]

Conclusions
• Current commodity operating systems still pay a high price for streaming services
• However, small changes in the system call layer might be sufficient to remove most of the overhead
• In conclusion, commodity operating systems still have potential for improvement with respect to streaming support
• What can we hope to be supported?
• Road ahead: optimize the code, make a patch and submit it to kernel.org

Questions??
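As a closing sketch, the existing per-packet sendfile streaming sequence that the proposed rtpsendfile/krtpsendfile calls would collapse into one system call can be approximated as below; the helper name is ours, sendfile(2) is Linux-specific, and the TCP_CORK/uncork bracketing used on TCP is omitted:

```c
#include <sys/sendfile.h>   /* Linux-specific */
#include <sys/socket.h>
#include <unistd.h>

/* Per-packet streaming over sendfile: send() copies the RTP header from
 * user space into the socket buffer, then sendfile() appends a descriptor
 * so the payload moves from the page cache to the network card by gather
 * DMA -- one copy and two system calls per packet. */
ssize_t stream_rtp_sendfile(int sock, const void *rtp_hdr, size_t hdrlen,
                            int file_fd, off_t *offset, size_t paylen)
{
    ssize_t h = send(sock, rtp_hdr, hdrlen, 0);           /* copy: header */
    if (h < 0)
        return -1;
    ssize_t p = sendfile(sock, file_fd, offset, paylen);  /* no copy: payload */
    if (p < 0)
        return -1;
    return h + p;   /* bytes queued for this packet */
}
```

Folding the header copy into the kernel call (rtpsendfile), or generating the header in a kernel RTP engine (krtpsendfile), removes the remaining per-packet copy and system call.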