Publish/Subscribe Internetworking
Transport Abstractions & Congestion Control
Somaya Arianfar & Pasi Sarolahti
26.9.2011

Agenda (for the general transport part)

• Traditional transport abstractions
  – Reflections from how things are done today
• Rethinking the transport concepts (layering etc.)
• Reliable Multicast Transport
• Peer-to-Peer Transport
• HTTP as a Pub/Sub Transport
• Open Issues

Publish/Subscribe Internetworking Transport & Congestion Control © Pasi Sarolahti 2011

The OSI Layer Model

[Figure: program data passing through the layers, top to bottom: Application, Presentation, Session, Transport, Network, Data link, Physical]

PURSUIT Functional Model

[Figure: PURSUIT functional model]

Current Internet Transport Protocols

Protocol  API                   Functions
TCP       Reliable byte stream  Segmentation, flow multiplexing, retransmissions, congestion control
UDP       Unreliable datagrams  Flow multiplexing
DCCP      Unreliable datagrams  Flow multiplexing, congestion control
SCTP      Reliable messages     Flow multiplexing, retransmissions, congestion control, lots of features…, extensible

All four run on top of the Internet Protocol.

IETF and the Transport Area

• Congestion control and related transport protocols: DCCP, PCN, LEDBAT, CONEX
• Interconnecting Content Distribution Networks: CDNI
• Network File System, network storage: NFSV4, STORM
• Peer-to-peer communication: PPSP, DECADE, ALTO
• TCP extensions and maintenance: TCPM, MPTCP
• Reliable multicast transport: RMT
• QoS, new transport protocols, traffic engineering: TSVWG

Classical Example: TCP

• Service model: Reliable byte stream
  – No message boundaries
    • Send(1024 bytes) may result in Receive(512 B) + Receive(512 B)
  – One-to-one association
• Segmentation to packets
  – Depends on network path MTU
• Reliable: retransmissions based on acknowledgments and timer
• Congestion control: based on RTT measurements and packet loss events
• Much intelligence at sender, some intelligence at receiver
  – No intelligence in network

Another Example: DCCP (Datagram Congestion Control Protocol)

• Service model: Unreliable datagrams
  – Send(1024 B) results in Receive(1024 B)
  – or no Receive at all
  – One-to-one association
• Unreliable: no retransmissions
• Congestion control: based on RTT measurements and packet loss events
  – Requires acknowledgment mechanism
  – …and connection state at the sender
• Feature negotiation (e.g., for congestion control algorithm)
• Much intelligence at sender, some intelligence at receiver
  – Very different from UDP (for unreliable datagrams)

Connection-oriented vs. Connectionless

• Two ends of the connection need to agree on
  – Protocol parameters
    • e.g., type of acknowledgments, maximum segment size
  – Shared “secrets”
    • e.g., TCP’s initial sequence number
• Reliable connection establishment method needed
  – Typically: three-way handshake
• Connection-oriented is easy with 1-to-1 connections
  – How to do it between multiple senders and receivers?

Middleboxes and Connectivity

• NATs, firewalls and other types of intermediaries have become common
  – These operate on packets’ transport headers (or deeper)
  – Typically support only the most common protocols
    • TCP and UDP
    • One difficulty: pseudoheader checksum calculation and NATs
• Difficult challenge for deployment of new protocols
  – Unknown protocols are often lost in transit
    • Common trick is to encapsulate packets inside UDP
• Additional problem: short lease times
  – Idle periods lead to dropped connections

Rethinking Layering

• Transport Layer
  – Communication abstractions, reliability, etc.
• Flow Regulation Layer
  – Congestion control, etc.
  – Adaptability to special algorithms and path conditions
  – Multipath communication
• Endpoint Layer
  – Logical endpoints (i.e., ports)
  – Extension of network layer

B. Ford and J. Iyengar, “Breaking Up the Transport Logjam”, in Proc. of ACM HotNets-VII, October 2008.

Structured Streams

• Addresses some of TCP’s problems
  – Head-of-line blocking
  – Integrated congestion control
• Channel Protocol
  – Congestion control
  – Security associations
• Stream Protocol
  – Hierarchical model
    • E.g., “www” root stream distributed into streams of web pages and subobjects
  – Stream is associated with channel

B. Ford, “Structured Streams: a New Transport Abstraction”, in Proc. of ACM SIGCOMM ’07, August 2007.

Host-Centric Networking vs. Content-Centric Networking

• Host networking
  – Resources are located based on topological address
    • IP addresses
    • Routing is “easy”
  – DNS is used to map host names to IP addresses
  – Communication associations are between hosts
    • E.g. TCP connections
  – Security model usually based on hosts
• Content networking
  – Resources are located by content-based name
    • Often flat, topology-independent namespace
    • Routing is difficult
    • Sometimes host identities are not even known
  – No host-based communication abstractions
    • Content distribution and content migration become easier
  – Security model based on content

But… How do we implement transport for publish/subscribe networks?

Picture from: http://www.clker.com

Multicast Transport

• Special IP address range reserved for “channels”
  – UDP packets
  – Receivers join and leave channels using the IGMP or MLD protocols
  – Source-specific multicast limits the allowed sources of data
• Multicast not globally deployed
  – But in limited use, e.g. in IPTV distribution
• Many channels do not need reliability
  – Real-time A/V streaming, etc.
• How to guarantee reliability in a multicast session?
  – In a session of 100 subscribers, what if one receiver misses data?
    • What is the feedback mechanism?
    • What to retransmit, when and where?

Reliable Multicast Transport

• Designated receivers (RMTP)
  J.C. Lin and S. Paul, “RMTP: A Reliable Multicast Transport Protocol”, in Proc. of IEEE Infocom ’96, March 1996.
  – Separate acknowledgments, aggregated over a tree of designated receivers
• Network coding
  – Add redundancy alongside the data
  – At the price of overhead, individual packets can be reconstructed
• Data Carousel
  – Repeated transmission of the same data

A. Dimakis, K. Ramchandran, Y. Wu and C. Suh, “A Survey on Network Codes for Distributed Storage”, in Computing Research Repository, April 2010. http://arxiv.org/abs/1004.4438

Peer-to-Peer Transport (BitTorrent)

• Peers exchange 256 KB chunks
  – Controlled by tracker
  – Exchanged in “random” order
• Old version: on top of TCP
  – Each peer runs multiple TCP connections
  – TCP congestion control
• New version: dedicated transport protocol
  – Delay-based congestion control
  – Yields to other traffic

[Figure: tracker]

Content-orientedness and TCP
(Warning: research in progress…)

• Target: get some of the benefits of content-centric networking to TCP
  – mainly caching
• New TCP option
  – Content label – identifies content in TCP payload
• Benefits:
  – Packet-based caching in current infrastructure
• Challenges
  – Interactions with some TCP details

[Figure: sender, cache, receiver]

HTTP as a Pub/Sub Transport

• Much Internet traffic uses HTTP
  – Accessible from most locations behind middleboxes
  – Lots of apps: real-time video, email, version control, file storage
• Content-centric protocol methods
  – GET: fetch a resource from a server
  – PUT: push content to a server
  – HTTP chunking enables datagram-like service (e.g., for video)
• Pub/Sub could be implemented with a variation of GET
  – S-GET: Remember the GET request at the HTTP server (like a named pipe)

L. Popa, A. Ghodsi, I. Stoica, “HTTP as the Narrow Waist of the Future Internet”, in Proc. of ACM HotNets-IX, October 2010.

Open(?) Issues with Publish/Subscribe Transport

• How to know the Maximum Transmission Unit?
  – What is the packet size?
• When to retransmit (if retransmissions are needed)?
• Request per packet vs. request per “flow”?
  – How to do flow control?
• How to build a feedback channel?
• Reliability? Buffering?
• Reliable transport on top of pub/sub, or pub/sub over reliable transport?
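
Example: repairing a loss without a feedback channel

The Reliable Multicast Transport slide mentions adding redundancy so that individual packets can be reconstructed, and the open issues ask how reliability could work when a feedback channel is hard to build. The sketch below shows the simplest version of that idea. Note that it is a single XOR parity packet, i.e. basic forward error correction, and not a network code in the sense of the cited survey. The function names, packet contents and packet sizes are assumptions made for illustration only.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: list[bytes]) -> bytes:
    """Build one parity packet as the XOR of all data packets."""
    return reduce(xor_bytes, packets)


def recover_missing(received: dict[int, bytes], parity: bytes, total: int) -> dict[int, bytes]:
    """Rebuild at most one missing packet from the others plus the parity packet."""
    missing = [i for i in range(total) if i not in received]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    repaired = dict(received)
    if missing:
        # XOR of the parity with every received packet leaves the missing one.
        repaired[missing[0]] = reduce(xor_bytes, received.values(), parity)
    return repaired


# Hypothetical example: four 4-byte packets published to a group.
packets = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = make_parity(packets)

# One subscriber misses packet 2 but does receive the parity packet.
received = {i: p for i, p in enumerate(packets) if i != 2}
repaired = recover_missing(received, parity, total=len(packets))
```

The overhead here is one extra packet per group of four, and each subscriber can repair a different single loss without sending any acknowledgment to the publisher. Two losses in the same group cannot be repaired this way; that case would need more redundancy or a retransmission request.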