Advanced Computer Networks (COS 561) Jennifer Rexford Tuesdays/Thursdays 1:30pm-2:50pm

http://www.cs.princeton.edu/courses/archive/fall08/cos561/
Focus of the Course: Network Architecture
• Network architecture
– Definition and placement of functions
– Types of nodes, and information they exchange
– Not measuring or redesigning individual protocols
• Revisiting the functions inside the network
– Naming and addressing
– Routing and forwarding
– Virtualization and programmability
• To address critical challenges
– Performance, scalability, security, manageability,…
– Interactive applications, content services, …
Goals of the Course
• Understand the Internet architecture
– Reading and discussing the classic papers
– Considering the strengths and limitations
• Critically study new architectural alternatives
– Reading and discussing recent research papers
– Considering how well they address the limitations,
and emerging challenges
• Create and evaluate new architectural ideas
– Learning tools for experimental systems research
– Completing a systems-oriented research project
Reading Research Papers
• Classic papers
– For the first few weeks of the course
– Lectures (quickly) reviewing today’s architecture
• Recent research papers
– Emphasis on new architectural ideas
– Some full-length papers with thorough evaluation
(e.g., from SIGCOMM and NSDI conferences)
– And some short papers selling a new idea (e.g.,
from HotNets and other workshops)
– Lectures reviewing the related limitations of
today’s architecture
In-Class Discussion
• Big part of each class is devoted to discussion
– Focused on the research papers
– Everyone is expected to participate. (Really!)
• To prepare for the discussions
– Critically read the assigned paper(s)
– Consider how you would summarize:
• The main idea and contributions
• Strengths of the paper
• Weaknesses of the paper
• Directions for future work
– No need to submit a formal written review
Homework Assignments
• Learning tools for experimental systems work
– Routers: Click and Quagga
– Evaluation facilities: Emulab
– Measurement data: RouteViews and Netflow
• Three assignments
– Two in the first half of the course
– One in the second half
• You can work in pairs on the assignments
– Each person should complete their own write-up
– Include the name of the person you worked with
Course Project
• Research project
– Capstone for the semester
– Design and evaluate a new networking idea
– Can work in groups, if you like
• Due dates
– Before fall break: short proposal (1-2 pages)
• Must discuss the topic with me ahead of time
– Dean’s date: final report (10 two-column pages)
• Paper format listed on the course Web site
– During exam period: oral presentations
• To be scheduled later…
Grading Breakdown
• Class participation: 30%
• Homework assignments: 30%
– That is, 10% for each assignment
• Course project: 40%
– Includes both the report and presentation
• Students auditing the class
– Do not need to complete the homework
assignments and course project…
For You To Do (See Class Web Site for Details)
• Join the class mailing list
– For follow-up discussions and pointers
– For questions about the homework assignments
• Create an account on Emulab
– Needed for the first homework assignment
– Requires time for approval, so do it right away
• Read the “how to read a paper” tips
– Two short write-ups linked from today’s class
– To help you read the research papers efficiently
• Start reading assignment for Tuesday’s class
The Internet: The Good,
The Bad, and The Ugly
What is the Internet?
The Internet is the worldwide, publicly accessible
network of interconnected computer networks that
transmit data by packet switching using the
standard Internet Protocol (IP). It is a "network of
networks" that consists of millions of smaller
domestic, academic, business, and government
networks, which together carry various information
and services, such as electronic mail, online chat,
file transfer, and the interlinked Web pages and
other documents of the World Wide Web.
http://en.wikipedia.org/wiki/Internet
The Internet: A Remarkable Story
• Tremendous success
– A research experiment that truly
escaped from the lab
• The brilliance of under-specifying
– Best-effort packet-delivery service
– Key functionality at programmable end hosts
• Enabled massive growth and innovation
– Ease of adding hosts & links, & new technologies
– Ease of adding new services (Web, P2P, VoIP, …)
Idea #1: Functionality at the
(Programmable) Edge of the Network
Telephone Network: Dumb Edge, Smart Core
• Dumb phones
– Dial a number
– Speak and listen
• Smart switches
– Set up and tear down a circuit
– Forward audio along the path
• Limited services
– Audio
– Later, fax, caller-id, …
• A monopoly for a long time
Internet: Smart Edge, Dumb Core
End-to-End Principle
Whenever possible, communications protocol
operations should be defined to occur at the
end-points of a communications system.
Programmability
With programmable end hosts, new network
services can be added at any time, by anyone.
And then end hosts became powerful and ubiquitous….
Programmability
• Architectural decision with profound effects
– Where you place programmability in the system
determines who gets to innovate
– And what kinds of innovations can happen
• Today’s Internet
– Programmable hosts → innovation in applications
– Non-programmable routers → more control by
standards bodies, router vendors, and carriers
• Democratizing Innovation
– Interesting book by Eric von Hippel
– http://web.mit.edu/evhippel/www/democ1.htm
Idea #2: Best-Effort Packet Switching
Internet Protocol (IP) Packet Switching
• Like the postal system
– Divide information into letters
– Stick them in envelopes
– Deliver them independently
– And sometimes they get there
• What’s in an IP packet?
– The data you want to send
– A header with the “from”
and “to” addresses
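As a rough illustration of the "header plus data" structure, the Python sketch below packs a simplified IPv4-style header in front of a payload. The field layout follows the 20-byte IPv4 header without options, but the checksum is left at zero, and the names `build_packet` and `parse_addresses` are hypothetical, chosen only for this example.

```python
import socket
import struct
from typing import Tuple

def build_packet(src: str, dst: str, payload: bytes) -> bytes:
    """Prepend a simplified 20-byte IPv4 header to the payload."""
    version_ihl = (4 << 4) | 5           # IPv4, header length of five 32-bit words
    total_length = 20 + len(payload)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,                       # type of service
        total_length,
        0,                       # identification
        0,                       # flags and fragment offset
        64,                      # time to live
        17,                      # protocol number (17 = UDP)
        0,                       # header checksum (not computed in this sketch)
        socket.inet_aton(src),   # the "from" address
        socket.inet_aton(dst),   # the "to" address
    )
    return header + payload

def parse_addresses(packet: bytes) -> Tuple[str, str]:
    """Read the source and destination addresses back out of the header."""
    src, dst = struct.unpack("!4s4s", packet[12:20])
    return socket.inet_ntoa(src), socket.inet_ntoa(dst)
```

A router needs only the destination address from this header to forward the packet; the payload is opaque to it.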
Why Packets?
• Packets can be delivered by most anything
– Serial link, fiber optic link, coaxial cable, wireless, birds
• Data traffic is bursty
– Logging in to remote machines, exchanging e-mail
• Don’t waste bandwidth
– No traffic exchanged during idle periods
• Better to allow multiplexing
– Different transfers share access to same links
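The multiplexing argument can be made concrete with a small calculation. The sketch below assumes a simple binomial model in which each user is independently active with some probability; the specific numbers (a link that fits 10 simultaneous senders, 35 users, each active 10% of the time) are hypothetical and not from the slides.

```python
from math import comb

def prob_more_than(k: int, n: int, p: float) -> float:
    """Probability that more than k of n independent users are active at once,
    when each user is active with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))

# Hypothetical scenario: the link can carry 10 active users at full rate.
overload = prob_more_than(10, 35, 0.10)
print(f"P(more than 10 of 35 users active) = {overload:.6f}")
```

Under these assumptions the chance of overload is well under one percent, so packet switching can serve 35 bursty users on a link where dedicated circuits would admit only 10.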
Best-Effort Packet-Delivery Service
• Best-effort delivery
– Packets may be lost
– Packets may be corrupted
– Packets may be delivered out of order
[Figure: a source and a destination host connected through an IP network]
Why Best-Effort?
• Simpler network
– No error detection and correction
– Don’t remember from one packet to next
– Don’t reserve bandwidth and memory
– Transient disruptions are okay during failover
• … but, applications do want efficient, accurate
transfer of data in order, in a timely fashion
• Fortunately, the end hosts take care of that!
End Host Can Take Care of Requirements
• No error detection or correction
– Higher-level protocol can provide error checking
• Successive packets may not follow same path
– No problem as long as packets reach destination
• Packets can be delivered out-of-order
– Receiver can put packets back in order (if needed)
• Packets may be lost or arbitrarily delayed
– Sender can send the packets again (if desired)
• No reaction to congestion, beyond “drop”
– Sender can slow down in response to loss or delay
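The division of labor above can be sketched in a few lines of Python. This is a toy model, not a real transport protocol: `best_effort_channel` and `reliable_transfer` are hypothetical names, the 30% loss rate is an arbitrary assumption, and the sender is assumed to learn which sequence numbers are still missing (standing in for acknowledgments).

```python
import random

def best_effort_channel(packets, rng, loss_rate=0.3):
    """Deliver packets the way a best-effort network might: drop some, reorder the rest."""
    delivered = [p for p in packets if rng.random() >= loss_rate]
    rng.shuffle(delivered)
    return delivered

def reliable_transfer(chunks, rng, max_rounds=100):
    """Resend missing sequence numbers until the receiver holds every chunk."""
    received = {}                        # receiver state: sequence number -> data
    for _ in range(max_rounds):
        missing = [(seq, data) for seq, data in enumerate(chunks) if seq not in received]
        if not missing:
            break
        for seq, data in best_effort_channel(missing, rng):
            received[seq] = data         # arrival order does not matter here
    if len(received) != len(chunks):
        raise RuntimeError("transfer did not complete")
    # The receiver restores the original order using the sequence numbers.
    return [received[seq] for seq in range(len(chunks))]
```

The network in this model keeps no per-connection state; all of the bookkeeping (sequence numbers, retransmission, reordering) lives at the two end hosts.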
Idea #3: Layering and the IP
Hourglass Model
Layering: A Modular Approach
• Sub-divide the problem
– Each layer relies on services from layer below
– Each layer exports services to layer above
• Interface between layers defines interaction
– Hides implementation details
– Layers can change without disturbing other layers
[Figure: layer stack, top to bottom: application; application-to-application channels; host-to-host connectivity; link hardware]
The Narrow Waist of IP
[Figure: the hourglass model. Applications (FTP, HTTP, NV, TFTP) sit on top of TCP and UDP, which run over IP at the waist, which runs over data-link and physical networks NET1, NET2, …, NETn.]
The waist facilitates interoperability
Above and Below the Waist
• IP over anything
– Internetworking protocol that runs on anything
– Accommodate innovation in link technology
– … and heterogeneity throughout the network
• Anything over IP
– Variety of transport protocols can be built
– Though, in practice, mainly just TCP and UDP
• TCP: ordered, reliable stream of bytes
• UDP: simple (unreliable) message delivery
– And any applications on top of that
End-to-End IP
[Figure: two hosts communicating through two routers. Each host runs HTTP, TCP, IP, and an Ethernet interface; each router runs only IP over its link interfaces. HTTP messages and TCP segments are exchanged host to host; IP packets travel hop by hop, carried in an Ethernet frame, then a SONET frame, then an Ethernet frame.]
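The layering on this slide can be mimicked with a toy encapsulation sketch in Python. The bracketed text headers and the function names are hypothetical stand-ins for real binary headers; the point is only that hosts touch every layer while routers stop at IP and swap the link-layer framing.

```python
def encapsulate(layer: str, payload: bytes) -> bytes:
    """Wrap a payload in a toy header naming the layer (real headers are binary)."""
    return f"[{layer}]".encode() + payload

def decapsulate(layer: str, data: bytes) -> bytes:
    """Strip the toy header for the given layer, checking that it is present."""
    prefix = f"[{layer}]".encode()
    if not data.startswith(prefix):
        raise ValueError(f"expected a {layer} header")
    return data[len(prefix):]

def host_send(message: bytes) -> bytes:
    """Sending host: HTTP message -> TCP segment -> IP packet -> Ethernet frame."""
    segment = encapsulate("TCP", message)
    packet = encapsulate("IP", segment)
    return encapsulate("Ethernet", packet)

def router_forward(frame: bytes, in_link: str, out_link: str) -> bytes:
    """Router: remove the incoming frame, then re-frame the same IP packet."""
    packet = decapsulate(in_link, frame)
    return encapsulate(out_link, packet)

def host_receive(frame: bytes) -> bytes:
    """Receiving host: unwrap every layer to recover the HTTP message."""
    return decapsulate("TCP", decapsulate("IP", decapsulate("Ethernet", frame)))
```

Forwarding a frame from Ethernet to SONET and back to Ethernet leaves the IP packet, and everything inside it, unchanged.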
Idea #4: Decentralized Control
Benefits of Decentralization: Scalability
• Scalability
– Limit amount of state, and frequency of updates
• Addressing
– Internet routers only need to know how to reach
blocks of addresses (e.g., 12.0.0.0/8)
• Routing
– Link failure in one network is typically not visible in
another
• Naming
– Look-up of www.cnn.com doesn’t go to same
server as look-up of www.princeton.edu
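To illustrate how routers get by with knowing only blocks of addresses, here is a minimal longest-prefix-match sketch using Python's standard `ipaddress` module. The forwarding table is hypothetical (only 12.0.0.0/8 appears on the slide), and a linear scan is used for clarity rather than the trie or hardware lookups real routers rely on.

```python
import ipaddress

def longest_prefix_match(table, address):
    """Return the next hop for the most specific prefix containing the address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

# Hypothetical forwarding table: prefix -> outgoing link.
table = {
    "12.0.0.0/8": "link-A",
    "12.34.0.0/16": "link-B",
    "0.0.0.0/0": "default-link",
}

print(longest_prefix_match(table, "12.1.2.3"))
```

Three entries cover the entire address space here; the router never needs a separate entry per host.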
Benefits of Decentralization: Autonomy
• Autonomy
– Allow different parties to manage different parts
of the system, and apply their own policies
• Addressing
– ARIN delegates address space to AT&T, who
delegates smaller blocks to its customers
• Routing
– AT&T controls flow of traffic through its backbone
• Naming
– CNN controls addresses for www.cnn.com
Problems Lurking
Challenges Tied to Early Design Decisions
• Power of programmable end hosts
– Easy to spoof IP addresses, e-mail addresses, …
– Incentives for users to violate congestion control
– Malicious users launching Denial-of-Service attacks
• Best-effort packet-delivery service
– Inefficient in high-loss environments (wireless)
– Poor performance for interactive applications
– Expensive per-packet handling on high-speed links
Challenges Tied to Early Design Decisions
• Layering and the IP narrow waist
– Low efficiency due to many layers of headers
– Poor visibility into underlying shared risks
– Complex network management due to multiple
interconnected protocols and systems
• Decentralized control
– Hierarchical addressing makes mobility difficult,
and requires careful configuration
– Autonomy makes measurement (and
troubleshooting and accountability) hard
– Autonomy makes protocol changes difficult
Recurring Challenges
• Security
– Weak notions of identity that are easy to spoof
– Protocols that rely on good behavior
– Incomplete or non-existent registries, keys, …
• Mobility and disconnected operation
– Hierarchical addressing closely tied with routing
– Presumption that hosts are connected
• Network management
– Many coupled, decentralized control loops
– Limited visibility across layers and networks
• Application performance requirements
– Real-time, interactive applications
– Throughput sensitive vs. delay-sensitive
Internet is Not Standing Still
• Partial solutions to these problems
– Often as “add ons” or “extensions”
– Hampered by need to be backwards compatible,
and work when only partially deployed
– Rather than complete architectural solutions
• Solutions create problems of their own
– Violations of architectural assumptions
– Unexpected interactions with applications
– Adding complexity to an already complex system
Example: Middleboxes
• Middleboxes are intermediaries
– Interposed in-between the communicating hosts
– Often without knowledge of one or both parties
• Examples
– Network address translators
– Firewalls
– Traffic shapers
– Intrusion detection systems
– Transparent Web proxy caches
– Application accelerators
Middleboxes Address Practical Challenges
• Host mobility
– Relaying traffic to a host in motion
• IP address depletion
– Allowing multiple hosts to share a single address
• Security concerns
– Discarding suspicious or unwanted packets
– Detecting suspicious traffic
• Performance concerns
– Controlling how link bandwidth is allocated
– Storing popular content near the clients
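As one example, address sharing can be sketched as a toy network address translator in Python. The class name `Nat`, its methods, and the starting port number are hypothetical; real translators also track protocols, timeouts, and remote endpoints.

```python
class Nat:
    """Toy network address translator: many private hosts share one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}     # (private_ip, private_port) -> public_port
        self.back = {}    # public_port -> (private_ip, private_port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of an outgoing packet, allocating a public port if needed."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, dst_port: int):
        """Map a reply back to the private host, or None if no mapping exists."""
        return self.back.get(dst_port)
```

Note that `inbound` returns None for a port with no mapping: a host behind the translator cannot be reached until it has sent traffic out, and the translator rewrites packet headers in flight.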
Middleboxes Violate Network-Layer Principles
• Globally unique identifiers
– Each node has a unique, fixed IP address
– … reachable from everyone and everywhere
• Simple packet forwarding
– Network nodes simply forward packets
– … rather than modifying or filtering them
[Figure: a source and a destination host connected through an IP network]
Two Views of Middleboxes
• An abomination
– Violation of layering
– Causes confusion in reasoning about the network
– Responsible for many subtle bugs
• A practical necessity
– Solving real and pressing problems
– Needs that are not likely to go away
• Would they arise in any edge-empowered
network, even if redesigned from scratch?
Clean-Slate Network Architecture
• Clean-slate architecture
– Without constraints of today’s artifacts
– To have a stronger intellectual foundation
– And move beyond the incremental fixes
• Still, some constraints inevitably remain
– Ignore today’s artifacts, but not necessarily all reality
• Such as…
– Resource limitations (CPU, memory, bandwidth)
– Time delays between nodes
– Independent economic entities
– Malicious parties
– The need to evolve over time
Conclusions
• Internet architecture is a huge success
– Functionality at programmable edge nodes
– Best-effort packet-delivery service
– Layering and the IP hourglass model
– Decentralized control of the global system
• These very features are causing problems
– Security, mobility, manageability, performance,
reliability, …
• Rethinking the network architecture
– For a strong intellectual foundation
– And long-term improvements to the Internet