If a Clean Slate is the solution, what was the problem?
Van Jacobson
Stanford 'Clean Slate' Seminar
February 27, 2006

A Brief History of Networking
• Generation 1: the phone system - focus on the wires.
• Generation 2: the Internet - focus on the machines connected to the wires.
• Generation 3? Dissemination - focus on the data flowing between the machines connected to the wires.

The Phone System is not about phones, it's about connecting wires to other wires
• The utility of the system depends on running a pair of wires to every home & office.
• Those pairs are the dominant cost.
• Revenue comes from dynamically constructing a path from caller to callee.

• For a telco, a call is not the conversation you have with your mom in Phoenix; it's a path between two end-office line cards.
• A phone number is not the 'name' of your mom's phone; it's a program for the end-office switch fabric to build a path to the destination line card.

Some ways to build paths
• Sequential switch sequencing (Strowger stepper)
• Switchboard coordinate (AT&T operators, 1959)

Structural problems with the phone system
• Path building is non-local and encourages centralization / monopoly.
• A call fails if any element in the path fails, so reliability goes down exponentially as the system scales up.
• Data can't flow until the path is set up, so efficiency decreases as setup time or bandwidth increases, or as holding time decreases.

The next step: packet switching (Don Davies, 1963; Paul Baran, 1964)
• Change the point of view to focus on endpoints rather than paths.
• Data is sent in independent chunks, and each chunk contains the name of the final destination.
• If an arriving chunk is for a different destination, the node tries to forward it (using static configuration and/or a distributed routing computation).

Packet switching used the existing wires, it just used them differently. In 1964 these ideas were 'lunatic fringe' - anyone who knew anything about communications knew this could never work.
The ARPAnet, September 1971
[Map of the ARPAnet, September 1971: Lincoln, MIT, Case, Utah, BBN, CMU, SRI, Stanford, UCLA, UCSB, Harvard, Illinois, Ames, Mitre, SDC, RAND, Burroughs]

The ARPAnet was built on top of the existing phone system.
‣ It needed cheap, ubiquitous wires.
‣ It needed a digital signaling technology (but not anything like the state of the art).
➡ At the outset, the new network just looked like an inefficient way to use the old network.
➡ The research community was putting enormous effort into the details of circuit-switched data. In the end it didn't matter.

The CATENET and TCP/IP (Bob Kahn, 1973; Vint Cerf, 1973)
• Packet switching worked so well that by 1973 everyone was building a network.
• Each was done as a clean slate, so they didn't interoperate.
• Since Paul had already abstracted out all the topological details, Vint realized that a common encapsulation & addressing structure was all that was needed to glue together arbitrary networks.

Multinetwork Demonstration, 1977

TCP/IP advantages
• Adaptive routing lets the system repair failures and hook itself up initially.
• Reliability increases exponentially with system size.
• No call setup means high efficiency at any bandwidth, holding time or scale.
• Distributed routing supports any topology and tends to spread load and avoid a hierarchy's hot spots.

TCP/IP issues
• "Connected" is a binary attribute: you're either part of the internet and can talk to everything, or you're isolated.
• Becoming part of the internet requires a globally unique, globally known IP address that's topologically stable on routing time scales (minutes to hours).
‣ Connecting is a heavyweight operation.
‣ The net doesn't like things that move.

Like the phone system before it, TCP/IP solved the problems it set out to solve so well that even today it's hard to conceive of an alternative. TCP/IP's issues don't reflect an architectural failing but rather its massive success in creating a world rich in information & communication.
When TCP/IP was invented there were few machines and many users per machine. Today there are many machines and many machines per user, all with vast amounts of data to be synchronized & shared. And that creates an entirely new class of problem . . .

• The raison d'être of today's networking, both circuit switched and TCP/IP, is to allow two entities to have a conversation.
• The overwhelming use (>99% according to most measurements) of today's networks is for an entity to acquire or distribute named chunks of data (like web pages or email messages).
Acquiring named chunks of data is not a conversation; it's a dissemination (the computer equivalent of "Does anybody have the time?").

In a dissemination (e.g., getting a web page) the data matters, but not who gives it to you. It's possible to disseminate via conversation and accomplish the user's goal as a side effect, but:
‣ Security is an afterthought. Channels are secured, not data, so there's no way to know if what you got is complete, consistent, or even what you asked for.
‣ It's inefficient (hot spots, poor reliability, poor utilization).
‣ Users have to do the translation between their goal & its realization, and manually set up the plumbing to make things happen.

Proposal: Base the next generation on dissemination
• Data is requested, by name, using any and all means available (normal IP, VPN tunnels, local zeroconf addresses, Rendezvous/SLP, proxies, multicast, etc.).
• Anything that hears the request and has a valid copy of the data can respond.
• The returned data is signed, and optionally secured, so its integrity & association with the name can be validated.

Advantages
• Trust & data integrity are the foundation of the design, not an add-on. Phishing, pharming, spam, etc., can easily be made impossible.
• Trust is associated with user-level objects, not irrelevant abstractions like an SSL connection.
• It's hard for an adversary to disrupt a network that uses any thing, any time, any where to communicate.

More Advantages
• The network transacts in content, not conversations, so popular content won't generate congestion.
• The request/response model gives the user fine-grained control over incoming traffic (may obviate many QoS concerns).
• The user communicates intent to the network, so the network can do more on the user's behalf.

Yet More Advantages
• There's no distinction between bits in a memory & bits on a wire.
• Since nodes don't need names, wireless & sensor nets can use simple, local protocols (e.g., proximity, diffusion).
• Data can be cached by any node, so intermittent operation doesn't preclude communication. Delay is not only tolerated, it's irrelevant.
• Can use opportunistic transport (e.g., planes overhead, car-roadway-car, fellow travelers).

Stuff to figure out

Stuff that might not be hard
• Protocols for checking the integrity of data and of the name-data binding (start with the PEM & PGP web-of-trust model).
• An operational model for directories and repositories (creating content should stay as easy as it is today, so some of the data-integrity generation has to be automatic).
• Data location (URL, DHT, epidemics, directed diffusion, filtered "small world").

Stuff about names & naming
• Augment names with time/version to create cacheable, stable references. E.g., "today's New York Times" becomes "New York Times of 2/27/2006", with a certified relationship between the generic & specific name.
• Integrity-preserving data segmentation, so all responses can be idempotent & small.
• Nicknames and intentional names ("all the open doors in building A").

Digression on Implicit vs. Explicit Information
• The nice properties of packet switching result from moving the source & destination information implicit in a circuit switch's time-slot assignments into explicit addresses in the packet header. (But it's easy to do this wrong, e.g., ATM.)
• The nice properties of dissemination result from making the time & sequence information implicit in a conversation be explicit in a fully qualified name.

Stuff that might be harder
• Incentive structure (flow & congestion control, sharing & redistribution incentives).
• Miscreant & freeloader detection, and its interaction with anonymity.
• Redistribution (content routing, storage replacement strategies, liability issues).

Conclusion
• IP rescued us from plumbing at the wire level, but we still have to do it at the data level. A dissemination-based architecture would fix this.
• Many ad hoc dissemination overlays have been created (Akamai CDN, BitTorrent p2p, Sonos mesh, Apple Rendezvous), so there's clearly a need.
• If we're going to have a future, we need to rescue some grad students from re-inventing the past.
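The core mechanic of the dissemination proposal (request data by name, let any holder respond, and validate the name-data binding on receipt) can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not part of any real protocol: a SHA-256 digest embedded in the name stands in for the signatures the talk calls for, and the "nodes" are plain dictionaries standing in for caches reachable by any means.

```python
import hashlib

# Toy model of dissemination: independent nodes hold copies of named chunks.
# The name carries a digest of the content (a self-certifying name), so a
# requester can check the name-data binding no matter which node answered.

def make_name(publisher, title, version, data):
    """Build a fully qualified, versioned name bound to its content."""
    return "/%s/%s/%s/%s" % (publisher, title, version,
                             hashlib.sha256(data).hexdigest())

def binding_valid(name, data):
    """Does the content hash to the digest embedded in the name?"""
    return name.rsplit("/", 1)[-1] == hashlib.sha256(data).hexdigest()

def fetch(name, nodes):
    """Ask every reachable node for the name; accept the first copy that
    validates, regardless of who supplied it."""
    for store in nodes:
        data = store.get(name)
        if data is not None and binding_valid(name, data):
            return data
    return None

# Publish one chunk; one node holds a corrupted copy, another a valid one.
data = b"All the news that's fit to print"
name = make_name("nytimes", "front-page", "2006-02-27", data)
nodes = [{name: b"tampered copy"}, {name: data}]

assert fetch(name, nodes) == data  # the corrupt copy is rejected, the valid one accepted
```

A real design would use public-key signatures (e.g., the PEM & PGP web-of-trust model mentioned above) rather than a bare digest, so that the generic name ("today's New York Times") could also be certifiably linked to the specific versioned one.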