CPS110: Networks March 20, 2008 Landon Cox

Network hardware reality
 Lots of different network interface cards (NICs)
 3Com/Intel, Ethernet/802.11x
 Each NIC has a fixed hardware address
 MAC address: 01:10:C6:CE:8E:42
 Send packet to LAN by specifying MAC address
 Max packet size is 1500 bytes
 Packets can be reordered, corrupted, dropped
 Anyone can sniff packets from the network
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Device independence | Many types of NICs
Route across networks | Deliver only on LAN
Symbolic host names | MAC addresses
Large messages | Small messages
Process to process | NIC to NIC
Ordered messages | Unordered messages
Reliable messaging | Unreliable messaging
Byte streams | Distinct messages
Secure transmission | Insecure transmission
Procedure calls | Messages
Distributed computing
 Try to make multiple computers look like one
 We won’t really cover
 Take CPS 214
 Distributed shared memory
 Distributed file systems
 Parallelizing compilers
 Process migration
Protocol layers
 Applications: NFS (files), HTTP (web), SMTP (email), SSH (login), RPC
 (abstraction boundary)
 UDP, TCP
 (abstraction boundary)
 IP
 (abstraction boundary)
 Hardware: Ethernet, ATM, PPP
OSI model
 Open Systems Interconnection
 Layer 7: Application
 Layer 6: Presentation
 Layer 5: Session
 Layer 4: Transport
 Layer 3: Network
 Layer 2: Data Link
 Layer 1: Physical
Network layers (the stack)
 Build higher-level services on simpler ones
 IP over Ethernet
 TCP over IP
 HTTP over TCP
 Why build in layers?
 Could have 0 layers (build directly on top of HW)
 What would happen?
 Have to build from scratch each time HW changes
 E.g. one Firefox for wired NICs, one for wireless NICs
 Could have 1 layer (OS provides single layer)
 What would happen?
 Better to let applications choose functionality they need
 Unneeded features usually cost something (performance)
 E.g. would you ever not need reliable communication?
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Route across networks | Deliver only on LAN
Routing
 HW lets us send to neighbor on same LAN
 Single-hop route
 Want to send to computer on another LAN
 Multi-hop route
 IP (Internet Protocol) handles this
Local-area network
 Typically, switched Ethernet
[Figure: computers connected to an Ethernet switch]
 Messages delivered using Ethernet MAC address
 E.g. 00:0D:56:1E:AD:BB
 Unique to physical card (like a serial number)
 Switch knows all connected computers’ MAC addresses
Routing
 Can’t put all computers on one switch!
 Think of the wiring logistics
 Want to connect two LANs together
 Use a machine that straddles two networks
 Called a router or gateway or bridge
 LANs and routers form the Internet
Internet graph
[Figure: graph with nodes A, B, C, D, E, F, G]
Each letter is a router, possibly with a LAN connected to it.
Internet graph
Each node is an Autonomous System (AS). Can think of one as an ISP.
Internet graph
[Figure: graph with nodes A, B, C, D, E, F, G]
How does D know how to get to router G?
Should it send messages to E, C, or F?
Internet routing is imprecise
 Internet has no centralized state
 Makes it (supposedly) more fault-tolerant
 Routing is hard when a network is
 Large (a lot to track)
 Dynamic (connections change quickly)
 Subject to incentives to lie (make money by accepting traffic)
 The Internet exhibits all three
 Basic idea
 Routers propagate info about the graph to each other
 BGP (Border Gateway Protocol)
Traceroute example
 www.kernel.org
 Unix traceroute utility
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Symbolic host names | MAC addresses
Naming other computers
 Low-level interface
 Provide the destination MAC address
 00:13:20:2E:1B:ED
 Middle-level interface
 Provide the destination IP address
 152.3.140.183
 High-level interface
 Provide the destination hostname
 crocus.cs.duke.edu
Translating hostname to IP addr
 Hostname → IP address
 Performed by the Domain Name System (DNS)
 Used to be a central server
 Hosts file (the ancestor of /etc/hosts) maintained at SRI
 What’s wrong with this approach?
 Doesn’t scale to the global Internet
DNS
 Centralized naming doesn’t scale
 Server has to learn about all changes
 Server has to answer all lookups
 Instead, split up data
 Use a hierarchical database
 Hierarchy allows local management of changes
 Hierarchy spreads lookup work across many computers
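A minimal sketch of the hierarchical idea, not of the real DNS protocol: each nested dictionary stands in for a separate name server that only knows about its own children. The `ROOT` data and the `resolve` helper are hypothetical; the only address taken from these slides is the one for crocus.cs.duke.edu.

```python
# Toy hierarchical name database. Each nested dict plays the role of one
# name server responsible for a single level of the hierarchy.
ROOT = {
    "edu": {
        "duke": {
            "cs": {
                "crocus": "152.3.140.183",
            },
        },
    },
}


def resolve(hostname, root=ROOT):
    """Walk the hierarchy from the rightmost label to the leftmost.

    Returns (address, labels_visited) so the caller can see that each
    level answered for only one piece of the name.
    """
    node = root
    visited = []
    for label in reversed(hostname.split(".")):
        if not isinstance(node, dict) or label not in node:
            raise KeyError(f"unknown name: {hostname}")
        visited.append(label)
        node = node[label]
    if isinstance(node, dict):
        raise KeyError(f"{hostname} names a domain, not a host")
    return node, visited


if __name__ == "__main__":
    print(resolve("crocus.cs.duke.edu"))
```

A change under `cs` only touches the `cs` dictionary, which is the "local management of changes" point above.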
Example: www.cs.duke.edu
 nslookup in interactive mode
Translating IP to MAC addrs
 IP address → MAC address
 Performed by ARP
 Only done after you get to the right LAN
 How does a router know the MAC address of 152.3.140.183?
 ARP (Address Resolution Protocol)
 If it doesn’t know the mapping, broadcast through switch
 “Whoever has this IP address, please tell me your MAC address”
 Cache the mapping
 “/sbin/arp”
 Why is broadcasting over a LAN ok?
 Number of computers connected to a switch is relatively small
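The broadcast-then-cache behavior can be sketched as a small simulation. The `Lan` and `ArpCache` classes are hypothetical stand-ins, not a real ARP implementation, and pairing the example IP address with the example MAC address from earlier slides is an assumption made only for illustration.

```python
class Lan:
    """Hypothetical LAN: knows which MAC address owns each attached IP."""

    def __init__(self, hosts):
        self.hosts = dict(hosts)
        self.broadcasts = 0  # how many "who has?" questions were broadcast

    def who_has(self, ip):
        # Every host on the LAN hears the question; only the owner answers.
        self.broadcasts += 1
        return self.hosts.get(ip)


class ArpCache:
    """Remembers IP -> MAC answers so later lookups skip the broadcast."""

    def __init__(self, lan):
        self.lan = lan
        self.table = {}

    def lookup(self, ip):
        if ip not in self.table:
            mac = self.lan.who_has(ip)
            if mac is None:
                raise LookupError(f"no host on this LAN has {ip}")
            self.table[ip] = mac
        return self.table[ip]


if __name__ == "__main__":
    # Assumed pairing of the slides' example IP and MAC addresses.
    lan = Lan({"152.3.140.183": "00:13:20:2E:1B:ED"})
    cache = ArpCache(lan)
    print(cache.lookup("152.3.140.183"), lan.broadcasts)
    print(cache.lookup("152.3.140.183"), lan.broadcasts)
```

The second lookup is answered from the cache, so the broadcast count does not grow.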
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Large messages | Small messages
Message sizes
 Hardware interface
 Max Ethernet message size is 1500 bytes
 Application interface
 IP maximum packet size is 64 KB
 What if the route narrows?
 Start at Ethernet max of 1500 bytes
 Could traverse ATM w/ max of 53 bytes
Message sizes
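 IP layer fragments packets when the route goes from a larger MTU to a smaller MTU
 MTU: maximum transmission unit, the largest packet a link can carry
[Figure: Computer 1 (IP over Ethernet), Router (IP over Ethernet on one side, ATM on the other), Computer 2 (IP over ATM)]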
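A simplified sketch of fragmentation and reassembly, assuming each fragment carries only its byte offset. Real IP fragments also carry headers and flags, which this ignores; `fragment` and `reassemble` are hypothetical helper names.

```python
def fragment(payload: bytes, mtu: int):
    """Split payload into (offset, chunk) pieces of at most mtu bytes."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    return [(off, payload[off:off + mtu])
            for off in range(0, len(payload), mtu)]


def reassemble(fragments):
    """Rebuild the payload; fragments may arrive in any order."""
    return b"".join(chunk for _, chunk in sorted(fragments))


if __name__ == "__main__":
    data = bytes(4000)
    pieces = fragment(data, 1500)          # Ethernet-sized pieces
    print([len(chunk) for _, chunk in pieces])
    print(reassemble(reversed(pieces)) == data)
```

The offset is what lets the destination put the pieces back together even if the network reorders them.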
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Process-to-process | NIC-to-NIC
Processes vs machines
 IP is machine-to-machine
 E.g. crocus.cs.duke.edu → www.kernel.org
 Process abstraction
 Each app thinks it has its own machine
 Give each process multiple virtual NICs
Processes vs machines
 Hardware interface
 One network endpoint per machine
 Application interface
 Multiple network endpoints per machine
 Sockets
 Software endpoints for communication
 Like virtual network cards
Sockets
 Another example of virtualized hardware
 Thread → virtual processor
 Address space → virtual memory
 Endpoint/socket → virtual NIC
 NIC and socket both have unique identifiers
 NIC: MAC address
 Socket: ‹hostname, port number›
 bind() assigns a port number to a host’s socket
Sockets
 OS allows apps to program sockets
 E.g. BSD sockets
 WinSock has pretty much same interface
 Processes name each other via sockets
 Each message includes a destination ‹host, port›
 Tells routers which computer gets message
 Tells dst computer which process gets message
Sockets
 OS can multiplex multiple connections over one NIC
 Kinds of sockets: UDP (datagrams), TCP (ordered, reliable)
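As a rough illustration of several endpoints sharing one machine, the sketch below opens three UDP sockets on the loopback interface and lets the OS pick the port numbers. The helper names `make_endpoint` and `demo` are hypothetical; the socket calls are from Python's standard `socket` module.

```python
import socket


def make_endpoint():
    """Create a UDP socket bound to loopback on an OS-chosen port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))   # port 0 asks the OS to assign a free port
    s.settimeout(2.0)          # avoid blocking forever if a datagram is lost
    return s


def demo():
    a, b, sender = make_endpoint(), make_endpoint(), make_endpoint()
    try:
        # Same host address, different ports: the port picks the socket.
        sender.sendto(b"for a", a.getsockname())
        sender.sendto(b"for b", b.getsockname())
        got_a, _ = a.recvfrom(1024)
        got_b, _ = b.recvfrom(1024)
        distinct_ports = a.getsockname()[1] != b.getsockname()[1]
        return got_a, got_b, distinct_ports
    finally:
        for s in (a, b, sender):
            s.close()


if __name__ == "__main__":
    print(demo())
```

Each datagram names a ‹host, port› pair, and the OS uses the port to hand it to the right socket.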
Course administration
 Exam regrades back on Tuesday
 Project 2 also due on Tuesday
 Four groups have submitted
 Any questions?
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Ordered messages | Unordered messages
Reliable messages | Unreliable messages
Byte streams | Distinct messages
Ordered messages
 Networks can re-order IP messages
 E.g. Send: A, B. Arrive: B, A
 How should we fix this?
 Assign sequence numbers (0, 1, 2, 3, 4, …)
Ordered messages
 Do what for a message that arrives out of order?
 (0, 1, 3, 2, 4)
a. Save #3 and deliver after #2 is delivered (this is what TCP does)
b. Drop #3, deliver #2, deliver #4
c. Deliver #3, drop #2, deliver #4
b. and c. are ordered, but not reliable (messages are dropped).
They rely on the reliability layer to handle lost messages.
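Option (a) can be sketched as a small reorder buffer: hold any message that arrives early and release it once the gap before it is filled. This is a simplified model, not TCP's actual code, and `deliver_in_order` is a hypothetical name.

```python
def deliver_in_order(arrivals):
    """Take (sequence_number, data) pairs in arrival order and return
    the data in sequence order, buffering anything that arrives early."""
    pending = {}      # early arrivals, keyed by sequence number
    next_seq = 0      # next sequence number the application should see
    delivered = []
    for seq, data in arrivals:
        if seq < next_seq:
            continue  # already delivered; ignore the stale copy
        pending[seq] = data
        # Release every message that is now contiguous with what was delivered.
        while next_seq in pending:
            delivered.append(pending.pop(next_seq))
            next_seq += 1
    return delivered


if __name__ == "__main__":
    arrivals = [(0, "m0"), (1, "m1"), (3, "m3"), (2, "m2"), (4, "m4")]
    print(deliver_in_order(arrivals))
```

With the arrival order (0, 1, 3, 2, 4) from the slide, #3 waits in the buffer until #2 shows up.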
Ordered messages
 For a notion of order, first need “connections”
 Why?
 Must know which messages are related to each other
 Idea in TCP
 Open a connection
 Send a sequence of messages
 Close the connection
 Opening a connection ties two sockets together
 Connection is socket-to-socket unique: only these sockets can use it
 Sequence numbers are connection specific
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Ordered messages | Unordered messages
Reliable messages | Unreliable messages
Byte streams | Distinct messages
Reliable messages
 Usually paired with ordering
 TCP provides both ordering and reliability
 Hardware interface
 Network drops messages
 Network duplicates messages
 Network corrupts messages
 Application interface
 Every message is delivered exactly once
Detecting and fixing drops
 How to fix a dropped message?
 Have sender re-send it
 How does sender know it’s been dropped?
 Have receiver tell the sender
 Receiver may not know it’s been sent
 Like asking in the car,
 “If we left you at the theater, speak up.”
Detecting and fixing drops
 Have receiver acknowledge each message
 Called an “ACK”
 If sender doesn’t get an ACK
 Assume message has been dropped
 Resend original message
 Is this ok for the sender to assume?
 No. ACKs can be dropped too (or delayed)
Detecting and fixing drops
 Possible outcomes
 Message is delayed or dropped
 ACK is delayed or dropped
 Strategy
 Deal with all as though message was dropped
 Worst case if message wasn’t dropped after all?
 Need to deal with duplicate messages
 How to detect and fix duplicate messages?
 Easy. Just use the sequence number and drop duplicate.
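The strategy above can be sketched as a deterministic simulation: the sender resends until it sees an ACK, and the receiver uses sequence numbers to drop duplicates. This is a one-message-at-a-time model with made-up names (`Receiver`, `send_all`, and the `lose` callback that decides which transmissions get lost); real TCP keeps many messages in flight and uses timers.

```python
class Receiver:
    def __init__(self):
        self.expected = 0      # next sequence number not yet delivered
        self.delivered = []

    def on_message(self, seq, data):
        """Deliver new messages, drop duplicates, ACK anything delivered."""
        if seq == self.expected:
            self.delivered.append(data)
            self.expected += 1
        # A duplicate (seq < expected) is dropped but still ACKed, because
        # the sender is resending only because our earlier ACK was lost.
        return seq if seq < self.expected else None


def send_all(messages, receiver, lose):
    """Send each message until it is ACKed; return total transmissions.

    lose(kind, seq, attempt) returns True if that "msg" or "ack" is lost.
    """
    transmissions = 0
    for seq, data in enumerate(messages):
        attempt = 0
        while True:
            attempt += 1
            transmissions += 1
            if lose("msg", seq, attempt):
                continue       # no ACK arrives: assume a drop and resend
            ack = receiver.on_message(seq, data)
            if lose("ack", seq, attempt):
                continue       # looks identical to a dropped message
            if ack == seq:
                break
    return transmissions


if __name__ == "__main__":
    lost = {("msg", 0, 1), ("ack", 1, 1)}
    rx = Receiver()
    count = send_all(["a", "b", "c"], rx, lambda k, s, a: (k, s, a) in lost)
    print(rx.delivered, count)
```

The lost ACK for message 1 causes a resend, which the receiver recognizes by its sequence number and discards.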
What about corruption?
 Messages can also be corrupted
 Bits get flipped, etc
 Especially true over wireless networks
 How to deal with this?
 Add a checksum (a little redundancy)
 Simple checksum = sum of the message’s data (e.g. its bytes or 16-bit words)
 Drop corrupted messages
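A toy version of the checksum idea, assuming a 2-byte checksum that is just the sum of the payload bytes modulo 65536. This is not the checksum TCP or IP actually uses, and the function names are hypothetical.

```python
def checksum(data: bytes) -> int:
    """Sum of all payload bytes, kept to 16 bits."""
    return sum(data) % 65536


def make_packet(payload: bytes) -> bytes:
    """Prefix the payload with its 2-byte checksum."""
    return checksum(payload).to_bytes(2, "big") + payload


def receive_packet(packet: bytes):
    """Return the payload, or None if the packet looks corrupted.

    Returning None models dropping the message, which turns corruption
    into a drop for the layers above.
    """
    if len(packet) < 2:
        return None
    expected = int.from_bytes(packet[:2], "big")
    payload = packet[2:]
    return payload if checksum(payload) == expected else None


if __name__ == "__main__":
    pkt = bytearray(make_packet(b"hello"))
    print(receive_packet(bytes(pkt)))
    pkt[3] ^= 0x01                      # flip one bit in the payload
    print(receive_packet(bytes(pkt)))
```

A plain sum catches any single flipped bit but misses some errors, such as two bytes swapping places, which is why real protocols use stronger codes.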
What about corruption?
 Dropping corrupted messages is elegant
 Transforms problem into a dropped message
 We already know how to deal with drops
 Common technique
 Solve one problem by transforming it into another
1. Corruption → drops
2. Drops → duplicates
3. Drop any duplicate messages (very simple)
Virtual/physical interfaces
Applications (above the OS) | Hardware (below the OS)
Ordered messages | Unordered messages
Reliable messages | Unreliable messages
Byte streams | Distinct messages
Byte streams
 Hardware interface
 Send information in discrete messages
 Application interface
 Send data in a continuous stream
 Like reading/writing from/to a file
Byte streams
 Many apps think about info in distinct messages
 What if you want to send more data than fits?
 UDP max message size is 64 KB
 What if data never ends?
 Streamed media
 TCP provides “byte streams” instead of messages
Byte streams
 Sender writes messages of arbitrary size
 TCP breaks up the stream into fragments
 Reassembles the fragments at destination
 Receiver sees a byte stream
 Fragments are not visible to either process
 Programming the receiver
 Must loop until certain number of bytes arrive
 Otherwise, might get first fragment and return
Byte streams
 UDP makes boundaries visible
 TCP makes boundaries invisible
 (loop until you get everything you need)
 How to know # of bytes to receive?
1. Size is contained in header
2. Read until you see a pattern (sentinel)
3. Sender closes connection
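Option 1 (size in a header) combined with the "loop until you get everything" rule might look like the sketch below. The 4-byte length prefix and the names `recv_exact`, `send_message`, and `recv_message` are assumptions for illustration; the socket and struct calls are from Python's standard library.

```python
import socket
import struct


def recv_exact(sock, n):
    """Loop until exactly n bytes have arrived on a stream socket."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)   # may return fewer bytes than asked
        if not chunk:
            raise ConnectionError("connection closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload: bytes):
    """Write a 4-byte big-endian length header, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock):
    """Read the header to learn the size, then read that many bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


if __name__ == "__main__":
    a, b = socket.socketpair()         # connected pair of stream sockets
    send_message(a, b"hello")
    send_message(a, b"world!")
    print(recv_message(b), recv_message(b))
    a.close()
    b.close()
```

Because the stream has no built-in boundaries, the header is what lets the receiver tell where one message ends and the next begins.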
Sentinels
 Idea: message is done when special pattern arrives
 Example: C strings
 How do we know the end of a C string?
 When you reach the null-termination character (‘\0’)
 Ok, now say we are sending an arbitrary file
 Can we use ‘\0’ as a sentinel?
 No. The data payload may contain ‘\0’ chars
 What can we do then?
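The slide leaves this as an open question. One common answer, offered here only as a possibility, is to escape the sentinel wherever it appears inside the data. The byte values and the `encode`/`decode` names below are arbitrary choices for this sketch.

```python
SENTINEL = 0x00       # marks the end of a message
ESC = 0xFF            # escape byte
ESC_SENTINEL = 0x01   # ESC followed by this stands for a literal 0x00
ESC_ESC = 0x02        # ESC followed by this stands for a literal 0xFF


def encode(payload: bytes) -> bytes:
    """Escape sentinel and escape bytes, then append the sentinel."""
    out = bytearray()
    for b in payload:
        if b == SENTINEL:
            out += bytes([ESC, ESC_SENTINEL])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(SENTINEL)
    return bytes(out)


def decode(stream: bytes):
    """Return (payload, rest_of_stream) for the first message in stream."""
    out = bytearray()
    i = 0
    while i < len(stream):
        b = stream[i]
        if b == SENTINEL:
            return bytes(out), stream[i + 1:]
        if b == ESC:
            i += 1
            if i >= len(stream):
                raise ValueError("stream ends in the middle of an escape")
            code = stream[i]
            if code == ESC_SENTINEL:
                out.append(SENTINEL)
            elif code == ESC_ESC:
                out.append(ESC)
            else:
                raise ValueError("unknown escape code")
        else:
            out.append(b)
        i += 1
    raise ValueError("no sentinel found")


if __name__ == "__main__":
    wire = encode(b"a\x00b\xffc")
    print(wire)
    print(decode(wire + b"next"))
```

After encoding, the only 0x00 on the wire is the real end-of-message marker, so arbitrary file contents can be sent safely.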