L7 - IP Intro

CPS-356: Computer Networks
Class 7: Switching Continued + Network Layer
Theophilus Benson
Based partly on lecture notes by Rodrigo Fonseca, David Mazières, Phil Levis, John Jannotti
Today’s Lecture
• Switching (Take II)
– Ethernet (datagram)
• Spanning-Tree
– ATM (Virtual Circuits)
• Network layer: Internet Protocol (v4)
• Forwarding
– Addressing
– Fragmentation
– ARP
– DHCP
– NATs
Ethernet Switching
• Hosts come preconfigured with IDs
– Each host has a MAC-address
• Network automatically determines routes
– Flood to discover who is connected.
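A minimal sketch of the flood-and-learn behaviour described above, assuming a bridge that records the port each source address was last seen on. The class name, port labels, and host names are illustrative, not taken from any real switch.

```python
class LearningBridge:
    """Toy learning bridge: learns source addresses, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def receive(self, src, dst, in_port):
        """Return the list of ports the frame is sent out of."""
        self.table[src] = in_port            # learn where the sender lives
        out = self.table.get(dst)
        if out is None:                      # unknown destination: flood
            return [p for p in self.ports if p != in_port]
        if out == in_port:                   # destination is on the arrival segment
            return []
        return [out]


bridge = LearningBridge(ports=["A", "B"])
first = bridge.receive(src="alice", dst="bob", in_port="A")  # Bob not yet known
reply = bridge.receive(src="bob", dst="alice", in_port="B")  # Alice already learned
print(first, reply)
```

The first frame is flooded because Bob has not been seen yet; the reply is sent only to the port where Alice was learned.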
Drawbacks of Flooding
[Figure, animated over several slides: hosts Alice and Bob on LAN 2, LAN 3, and LAN 4, interconnected by Bridge1, Bridge3, Bridge4, and Bridge5 (each with ports A and B). The per-bridge tables B1, B3, B4, and B5 fill in with port entries for Bob and then Alice as frames are flooded.]
Drawbacks of Flooding
• Cannot deal with loops
– Solution: Spanning Tree
• Cannot scale to a large number of devices
– Solution: VLANs
Spanning-Tree
• Exchange BPDU messages
– BPDU = Bridge Protocol Data Unit
• Discover a routing topology free of loops
– Eliminates redundancy: wastes extra links
• BPDU fields
– root ID: root bridge (what the sender thinks it is)
– cost: root path cost for sending bridge
– bridge ID: identifies the sending bridge
– port ID: identifies the sending port
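The BPDU exchange can be sketched as a tuple comparison. This is a simplified illustration, not the full protocol. It assumes integer bridge IDs, a link cost of 1, and that a lower (root ID, cost, bridge ID, port ID) tuple wins. The names `BPDU`, `better`, and `update` are hypothetical.

```python
from collections import namedtuple

# Field names follow the slide: root ID, cost, bridge ID, port ID.
BPDU = namedtuple("BPDU", ["root_id", "cost", "bridge_id", "port_id"])


def better(a, b):
    """True if BPDU a beats b: lower root ID first, then lower cost, then lower sender IDs."""
    return tuple(a) < tuple(b)


def update(my_id, current, heard, link_cost=1):
    """Return this bridge's view after hearing `heard` from a neighbour."""
    candidate = BPDU(heard.root_id, heard.cost + link_cost, my_id, current.port_id)
    return candidate if better(candidate, current) else current


# Bridge 3 starts out believing it is the root (cost 0).
view = BPDU(root_id=3, cost=0, bridge_id=3, port_id=0)
view = update(3, view, BPDU(1, 0, 1, 0))  # hears bridge 1 claim to be root
view = update(3, view, BPDU(1, 1, 4, 0))  # hears bridge 4 relay the same root at cost 1
print(view)
```

After the first message bridge 3 adopts bridge 1 as root at cost 1. The second message offers the same root at a higher cost, so the view does not change.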
Building a Spanning Tree:
Time 0: Everyone thinks they are ‘root’
• B1 thinks B1 is root, B3 thinks B3 is root, B4 thinks B4 is root, B5 thinks B5 is root
[Figure: Bridge1, Bridge3, Bridge4, and Bridge5 (ports A and B) connect LAN 2, LAN 3, and LAN 4, with hosts Alice and Bob; each bridge advertises itself as root with cost 0.]
Building a Spanning Tree:
Time 1: Everyone heard from B1.
B1 has the lowest bridge ID, so it must be the root.
• B1, B3, B4, and B5 all think B1 is root
[Figure: same topology; Bridge3, Bridge4, and Bridge5 each have a port marked R.]
Building a Spanning Tree:
Time 2: Tell each other B1 is root
• B1, B3, B4, and B5 all think B1 is root
[Figure: same topology; BPDUs naming B1 as root are exchanged with costs 0 and 1; Bridge3, Bridge4, and Bridge5 each have a port marked R.]
Building a Spanning Tree:
Time 3: Discover Duplication
Turn off ports.
• B1, B3, B4, and B5 all think B1 is root
[Figure: same topology; B3, B4, and B5 now advertise B1 as root with cost 1 where they previously advertised themselves with cost 0. Ports are marked R, D, or B.]
The Spanning Tree
[Figure: resulting tree over Bridge1, Bridge3, Bridge4, and Bridge5 connecting LAN 2, LAN 3, and LAN 4 (hosts Alice and Bob); ports are marked R, D, or B. Legend: sent packets, packets not sent.]
Port Type   Rules
R           Accept & forward flood traffic. Don’t forward BPDU
D           Accept & forward flood traffic. Forward BPDU
TWO WASTED LINKS IN THIS TOPOLOGY
Drawbacks of Flooding
• Cannot deal with loops
– Solution: Spanning Tree
• Cannot scale to a large number of devices
– Solution: VLANs
Virtual LANs
[Figure: hosts Alice and Bob on LAN 2, LAN 3, and LAN 4, interconnected by Bridge1, Bridge3, Bridge4, and Bridge5 (ports A and B)]
• Assign switch ports to a VLAN ID (color)
– Isolate traffic: only same color
– Trunk links may belong to multiple VLANs
– Encapsulate packets: add 12-bit VLAN ID
• Easy to change, no need to rewire
Other Uses for VLANs (Virtual LANs)
[Figure: LAN 2, LAN 3, and LAN 4 interconnected by Bridge1, Bridge3, Bridge4, and Bridge5 (ports A and B), with segments labelled "Finance: 1" and "Professors"]
• Company network, A and B departments
– May not want traffic between the two
departments
– Topology has to mirror physical locations
– What if employees move between offices?
What Do Switches Look Like?
Generic Switch Architecture
• Goal: deliver packets from input to output
ports
• Potential performance concerns:
– Throughput in bytes/second
– Throughput in packets/second
– Latency
Shared Memory Switch
• 1st Generation – like a regular PC
– NIC DMAs packet to memory over I/O bus
– CPU examines header, sends to destination NIC
– I/O bus is serious bottleneck
– For small packets, CPU may be limited too
– Typically < 0.5 Gbps
Shared Bus Switch
• 2nd Generation
– NIC has own processor, cache of forwarding
table
– Shared bus, doesn’t have to go to main
memory
– Typically limited to bus bandwidth
• (Cisco 5600 has a 32Gbps bus)
Point to Point Switch
• 3rd Generation: overcomes single-bus bottleneck
• Example: Cross-bar switch
– Any input-output permutation
– Multiple inputs to same output requires trickery
– Cisco 12000 series: 60Gbps
Cut through vs. Store and Forward
• Two approaches to forwarding a packet
– Receive a full packet, then send to output port
– Start retransmitting as soon as you know output
port, before full packet
• Cut-through routing can greatly decrease latency
• Disadvantage
– Can waste transmission (classic optimistic approach)
• CRC may be bad
• If Ethernet collision, may have to send runt packet on
output link
[Figure: cut through vs. store and forward]
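A rough sketch of why cut-through can lower latency, under stated assumptions: every link runs at the same rate, propagation and processing delays are ignored, and a cut-through switch starts forwarding once it has read a 14-byte header. The function names and example numbers are illustrative.

```python
def store_and_forward_latency(frame_bytes, rate_bps, switches):
    """Each of the (switches + 1) links carries the whole frame before the next one starts."""
    return (switches + 1) * frame_bytes * 8 / rate_bps


def cut_through_latency(frame_bytes, header_bytes, rate_bps, switches):
    """Each switch waits only for the header; the rest of the frame is pipelined."""
    return (frame_bytes + switches * header_bytes) * 8 / rate_bps


# Assumed example: 1500-byte frame, 1 Gbps links, 3 switches on the path.
sf = store_and_forward_latency(1500, 1e9, 3)
ct = cut_through_latency(1500, 14, 1e9, 3)
print(f"store and forward: {sf * 1e6:.3f} us, cut through: {ct * 1e6:.3f} us")
```

The gap widens with more hops, since store and forward pays a full frame time per hop and cut through pays only a header time.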
Buffering
• Buffering of packets can happen at input
ports, fabric, and/or output ports
• Queuing discipline is very important
• Consider FIFO + input port buffering
– Only one packet per output port at any time
– If multiple packets arrive for port 2, they may
block packets to other ports that are free
– Head-of-line blocking: can limit throughput to ~58% under some reasonable conditions*
[Figure: input queues at Port 1 (frame for output 2) and Port 2 (frames for outputs 1 and 2)]
* For independent, uniform traffic, with same-size frames
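The throughput limit can be explored with a small simulation. This is a sketch under the footnote's assumptions (independent, uniformly distributed destinations, same-size frames) plus two of its own: every input is always backlogged, and each output picks one contending input at random per time slot. The function name and parameters are hypothetical.

```python
import random


def hol_throughput(n_ports, slots, seed=0):
    """Fraction of output capacity used with FIFO input queues under saturation."""
    rng = random.Random(seed)
    # Destination of the frame at the head of each input queue.
    head = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(head):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)             # one frame per output per slot
            head[winner] = rng.randrange(n_ports)   # next frame moves to the head
            delivered += 1
        # Inputs that lost stay blocked, even if the frames behind them
        # are destined for idle outputs.
    return delivered / (slots * n_ports)


print(hol_throughput(n_ports=32, slots=5000))
```

With a few dozen ports the measured fraction comes out a little above the ~58% figure; the exact value depends on the port count and the random seed.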
Head-of-Line Blocking
[Figure: input queues at Port 1 (frame for output 2) and Port 2 (frames for outputs 1 and 2)]
• Solution: Virtual Output Queueing
– Each input port has n FIFO queues, one for each
output
– Switch using matching in a bipartite graph
– Shown to achieve 100% throughput*
*MCKEOWN et al.: ACHIEVING 100% THROUGHPUT IN AN INPUT-QUEUED SWITCH, 1999
ATM Cells
• Fixed-size packets
– 5-byte header
– 48-byte payload
• If payload smaller than 48B, uses padding
• If greater than 48B, splits it across multiple cells
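A minimal sketch of splitting data into fixed-size cells, assuming zero bytes are used as padding. It ignores any adaptation-layer trailer a real ATM stack would add. The function names are illustrative.

```python
CELL_PAYLOAD = 48  # bytes of payload per cell (from the slide)
CELL_HEADER = 5    # bytes of header per cell (from the slide)


def to_cells(data: bytes):
    """Split data into 48-byte cell payloads, zero-padding the last one."""
    cells = []
    for i in range(0, len(data), CELL_PAYLOAD):
        chunk = data[i:i + CELL_PAYLOAD]
        cells.append(chunk.ljust(CELL_PAYLOAD, b"\x00"))
    return cells


def efficiency(data_len: int) -> float:
    """Useful bytes divided by bytes on the wire, for non-empty data."""
    n_cells = -(-data_len // CELL_PAYLOAD)  # ceiling division
    return data_len / (n_cells * (CELL_PAYLOAD + CELL_HEADER))


cells = to_cells(b"x" * 100)
print(len(cells), efficiency(100), efficiency(48))
```

A payload that is an exact multiple of 48 bytes reaches the 48/53 maximum; anything else loses additional capacity to padding.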
Why small, fixed-length packets?
• Cons: maximum efficiency 48/53=90.6%
• Pros:
– Suitable for high-speed hardware implementation
– Many switching elements doing the same thing in
parallel
– Reducing priority packet latency
• Good for QoS
– Reducing transmission latency
• Reducing preemption latency
• Reduce queuing latency
– Transmission + propagation + queuing
Why 48 bytes
• It’s from the telephone technology
• Thought data would be mostly voice
• A compromise
– US: 64 bytes
– Europe: 32 bytes
– (64+32)/2 = 48 bytes
Virtual paths
• 24-bit virtual circuit identifiers (VCIs)
– Discussed in our previous lecture
• Two levels of VCIs
– 8-bit virtual path, 16-bit VCI
– Virtual paths shared by multiple connections
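The two-level split can be illustrated with bit operations. This sketch assumes the 8-bit virtual path occupies the high-order bits of the 24-bit identifier; the function names are hypothetical.

```python
def split_identifier(ident24):
    """Split a 24-bit identifier into (8-bit virtual path, 16-bit VCI)."""
    vpi = (ident24 >> 16) & 0xFF
    vci = ident24 & 0xFFFF
    return vpi, vci


def join_identifier(vpi, vci):
    """Inverse of split_identifier."""
    return ((vpi & 0xFF) << 16) | (vci & 0xFFFF)


vpi, vci = split_identifier(0x12ABCD)
print(hex(vpi), hex(vci))
```

A switch that forwards on the virtual path alone only needs to look at `vpi`, so all connections sharing that path share one table entry.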
Internet Protocol Goal
• How to connect everybody?
– New global network or connect existing networks?
• Glue lower-level networks together:
– allow packets to be sent between any pair of hosts
• Wasn’t this the goal of switching?
[Figure: Le Theo Net (ATM) and Le Duke Net (Token Ring)]
Internetworking Challenges
• Heterogeneity
– Different addresses
– Different service models
– Different allowable packet sizes
• Scaling
• Congestion control
How would you design such a
protocol?
• Circuits or packets (datagram)?
– Predictability
• Service model
– Reliability, timing, bandwidth guarantees
• Any-to-any
– Finding nodes: naming, routing
– Maintenance (join, leave, add/remove links,…)
– Forwarding: message formats
IP’s Decisions
• Packet switched
– Unpredictability, statistical multiplexing
• Service model
– Lowest common denominator: best effort,
connectionless datagram
• Any-to-any
– Common message format
– Separated routing from forwarding
– Naming: uniform addresses, hierarchical organization
– Routing: hierarchical, prefix-based (longest prefix matching)
– Maintenance: delegated, hierarchical
A Bit of History
• Packet switched networks: Arpanet’s IMPs
– Late 1960’s
– RFC 1, 1969!
– Segmentation, framing, routing, reliability,
reassembly, primitive flow control
• Network Control Program (NCP)
– Provided connections, flow control
– Assumed reliable network: IMPs
– Used by programs like telnet, mail, file transfer
• Wanted to connect multiple networks
– Not all reliable, different formats, etc…
TCP/IP Introduced
• Vint Cerf, Robert Kahn
• Replace NCP
• Initial design: single protocol providing a
unified reliable pipe
– Could support any application
• Different requirements soon emerged, and
the two were separated
– IP: basic datagram service among hosts
– TCP: reliable transport
– UDP: unreliable multiplexed datagram service
An excellent read
David D. Clark, “The Design Philosophy of the DARPA
Internet Protocols”, 1988
• Primary goal: multiplexed utilization of existing
interconnected networks
• Other goals (works and works):
– Communication continues despite loss of networks or
gateways
– Support a variety of communication services
– Accommodate a variety of networks
– Permit distributed management of its resources
– Be cost effective
– Low effort for host attachment
– Resources must be accountable
Still an excellent read
David D. Clark, “The Design Philosophy of
the DARPA Internet Protocols”, 1988
• Primary goal: multiplexed utilization of existing
interconnected networks
• Non-goals (other real-world issues):
– Security
– Privacy
– Flow of money
Internet Protocol
• IP Protocol running on all hosts and routers
• Routers are present in all networks they join
• Uniform addressing
• Forwarding/Fragmentation
• Complementary:
– Routing, Error Reporting, Address Translation
[Figure labels: Routing; Switch (diff framing)]
IP Protocol
• Provides addressing and forwarding
– Addressing is a set of conventions for naming nodes
in an IP network
• e.g. your name: Theo
– Forwarding is a local action by a router: passing a
packet from input to output port
• e.g. how to get to Theo
• IP forwarding finds output port based on
destination address (based on mapping)
– Also defines certain conventions on how to handle
packets (e.g., fragmentation, time to live)
• Contrast with routing (defines mapping)
– Routing is the process of determining how to map
packets to output ports (topic of next two lectures)
Service Model
• Connectionless (datagram-based)
• Best-effort delivery (unreliable service)
– packets may be lost
– packets may be delivered out of order
– duplicate copies of packets may be delivered
– packets may be delayed for a long time
• It’s the lowest common denominator
– A network that delivers no packets fits the
bill!
– All these can be dealt with above IP (if
probability of delivery is non-zero…)
Format of IP addresses
• Globally unique (or made to seem that way)
– 32-bit integers, read in groups of 8 bits: 128.148.32.110
• Hierarchical: network + host
• Originally, routing prefix embedded in address
– Class A (8-bit prefix), B (16-bit), C (24-bit)
– Routers need only know route for each network
[Figure: networks labelled 128.*.*.* (Class A), 128.62.*.* (Class B), and 128.12.*.* (Class B)]
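A short sketch of reading a dotted-quad address as a 32-bit integer and deriving its historical class from the leading bits. The helper names are illustrative, and classes D and E are lumped together for brevity.

```python
def to_int(dotted):
    """Convert 'a.b.c.d' to a 32-bit integer."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def address_class(dotted):
    """Classful category, decided by the leading bits of the first octet."""
    first = to_int(dotted) >> 24
    if first < 128:
        return "A"    # leading bit 0: 8-bit network prefix
    if first < 192:
        return "B"    # leading bits 10: 16-bit network prefix
    if first < 224:
        return "C"    # leading bits 110: 24-bit network prefix
    return "D/E"      # multicast and reserved ranges


print(hex(to_int("128.148.32.110")), address_class("128.148.32.110"))
```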
Forwarding Tables
• Exploit hierarchical structure of addresses:
need to know how to reach networks, not
hosts
Network        Next Address
212.31.32.*    0.0.0.0
18.*.*.*       212.31.32.5
128.148.*.*    212.31.32.4
Default        212.31.32.1
• Keyed by network portion, not entire address
• Next address should be local: router knows
how to reach it directly* (we’ll see how soon)
Classed Addresses
• Hierarchical: network + host
– Saves memory in backbone routers (no default routes)
– Originally, routing prefix embedded in address
– Routers in same network must share network part
• Inefficient use of address space
– Class C with 2 hosts (2/255 = 0.78% efficient)
– Class B with 256 hosts (256/65535 = 0.39% efficient)
– Shortage of IP addresses
– Makes address authorities reluctant to give out class B’s
• Still too many networks
– Routing tables do not scale
• Routing protocols do not scale
Subnetting
• Add another level to address/routing hierarchy
• Subnet mask defines variable portion of host part
• Subnets visible only within site
• Better use of address space
Scaling: Supernetting
• Problem: routing table growth
• Idea: assign blocks of contiguous networks to
nearby networks
• Called CIDR: Classless Inter-Domain Routing
• Represent blocks with a single pair
– (first network address, count)
• Restrict block sizes to powers of 2
• Use a bit mask (CIDR mask) to identify block size
• Address aggregation: reduce routing tables
CIDR Forwarding Table
Network           Next Address
212.31.32/24      0.0.0.0
18/8              212.31.32.5
128.148/16        212.31.32.4
128.148.128/17    212.31.32.8
0/0               212.31.32.1
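A lookup against this table can be sketched as longest prefix matching, using Python's standard `ipaddress` module. The entries are copied from the table with short prefixes written out in full; the linear scan is for clarity only, and the name `next_hop` is illustrative.

```python
import ipaddress

# (network, next address) pairs from the CIDR forwarding table.
TABLE = [
    ("212.31.32.0/24", "0.0.0.0"),
    ("18.0.0.0/8", "212.31.32.5"),
    ("128.148.0.0/16", "212.31.32.4"),
    ("128.148.128.0/17", "212.31.32.8"),
    ("0.0.0.0/0", "212.31.32.1"),
]
ROUTES = [(ipaddress.ip_network(net), hop) for net, hop in TABLE]


def next_hop(dst):
    """Return the next address for dst, preferring the longest matching prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, hop) for net, hop in ROUTES if addr in net]
    return max(matches)[1]  # 0/0 matches everything, so matches is never empty


for dst in ["128.148.200.1", "128.148.1.1", "18.26.0.1", "8.8.8.8"]:
    print(dst, "->", next_hop(dst))
```

An address in 128.148.128/17 also matches 128.148/16 and 0/0; the /17 entry wins because it is the most specific.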
Example
H1-> H2: H2.ip & H1.mask != H1.subnet => no direct path
R1’s Forwarding Table
Network          Subnet Mask        Next Address
128.96.34.0      255.255.255.128    128.96.34.1
128.96.34.128    255.255.255.128    128.96.34.130
128.96.33.0      255.255.255.0      128.96.34.129
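The mask test and R1's lookup can be sketched as follows. The host addresses for H1 and H2 are assumptions chosen to fit the table (the slide's figure is not available), and the function names are illustrative.

```python
import ipaddress


def ip(s):
    """Dotted quad to integer."""
    return int(ipaddress.ip_address(s))


def same_subnet(my_ip, my_mask, dst_ip):
    """The H1 test: dst & mask == my subnet means direct delivery is possible."""
    return ip(dst_ip) & ip(my_mask) == ip(my_ip) & ip(my_mask)


# Rows from R1's forwarding table: (network, subnet mask, next address).
R1_TABLE = [
    ("128.96.34.0", "255.255.255.128", "128.96.34.1"),
    ("128.96.34.128", "255.255.255.128", "128.96.34.130"),
    ("128.96.33.0", "255.255.255.0", "128.96.34.129"),
]


def r1_lookup(dst):
    """Return the next address for dst, or None if no row matches."""
    for network, mask, nxt in R1_TABLE:
        if ip(dst) & ip(mask) == ip(network):
            return nxt
    return None


H1, H1_MASK = "128.96.34.15", "255.255.255.128"  # assumed host address
H2 = "128.96.34.139"                             # assumed host address
print(same_subnet(H1, H1_MASK, H2), r1_lookup(H2))
```

Here H2 falls in the upper half of 128.96.34/24, so H1 cannot deliver directly and hands the packet to R1, which matches the second row.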
IP v4 packet format
IP header details
• Forwarding based on destination address
• TTL (time-to-live) decremented at each hop
– Originally was in seconds (no longer)
– Mostly prevents forwarding loops
– Other cool uses…
• Fragmentation possible for large packets
– Fragmented in network if crossing link w/ small frame
– MF: more fragments for this IP packet
– DF: don’t fragment (returns error to sender)
• Following IP header is “payload” data
– Typically beginning with TCP or UDP header
Other fields
• Version: 4 (IPv4) for most packets, there’s
also 6
• Header length: in 32-bit units (>5 implies
options)
• Type of service (won’t go into this)
• Protocol identifier (TCP: 6, UDP: 17, ICMP: 1,
…)
• Checksum over the header
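The fields above can be pulled out of a raw 20-byte header with the standard `struct` module. This is a sketch that ignores options; the sample header is built by hand for illustration and its addresses are arbitrary.

```python
import struct


def checksum(header: bytes) -> int:
    """16-bit one's-complement sum over the header, complemented."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse(header: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, length, ident, flags_frag,
     ttl, proto, csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", header[:20])
    ihl_bytes = (ver_ihl & 0x0F) * 4          # header length is in 32-bit units
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": ihl_bytes,
        "total_length": length,
        "ident": ident,
        "df": bool(flags_frag & 0x4000),
        "mf": bool(flags_frag & 0x2000),
        "frag_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": proto,
        # A valid header, summed with its own checksum field, yields 0.
        "checksum_ok": checksum(header[:ihl_bytes]) == 0,
        "src": ".".join(map(str, src)),
        "dst": ".".join(map(str, dst)),
    }


# Sample header: version 4, IHL 5, DF set, TTL 64, protocol 6 (TCP).
raw = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                  bytes([128, 148, 32, 110]), bytes([18, 26, 0, 1]))
raw = raw[:10] + struct.pack("!H", checksum(raw)) + raw[12:]
fields = parse(raw)
print(fields)
```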
Fragmentation & Reassembly
• Each network has maximum transmission
unit (MTU)
• Strategy
– Fragment when necessary (MTU < size of
datagram)
– Source tries to avoid fragmentation (why?)
– Re-fragmentation is possible
– Fragments are self-contained datagrams
– Delay reassembly until destination host
– No recovery of lost fragments
Fragmentation Example
• Ethernet MTU is 1,500 bytes
• PPP MTU is 576 bytes
– R2 must fragment IP packets to forward them
Fragmentation Example (cont)
• IP addresses plus ident field identify fragments of same packet
• MF (more fragments bit) is 1 in all but last fragment
• Fragment offset multiple of 8 bytes
– Multiply offset by 8 for fragment position in original packet
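The offset and MF rules can be worked through in a few lines. This sketch assumes a 20-byte header with no options and takes the 1,500-byte and 576-byte MTUs from the example; the function name and tuple layout are illustrative.

```python
def fragment(total_len, mtu, header_len=20):
    """Return (offset in 8-byte units, fragment total length, MF bit) per fragment."""
    if total_len <= mtu:
        return [(0, total_len, 0)]
    payload = total_len - header_len
    per_frag = (mtu - header_len) // 8 * 8   # payload per fragment, a multiple of 8
    frags = []
    sent = 0
    while sent < payload:
        size = min(per_frag, payload - sent)
        mf = 1 if sent + size < payload else 0
        frags.append((sent // 8, size + header_len, mf))
        sent += size
    return frags


frags = fragment(total_len=1500, mtu=576)
for offset, length, mf in frags:
    print(f"offset={offset} (byte {offset * 8}), length={length}, MF={mf}")
```

Each fragment carries its own copy of the header, so the three fragments together put more bytes on the wire than the original datagram.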
Translating IP to lower level addresses
or… How to reach these local
addresses?
• Map IP addresses into physical addresses
– E.g., Ethernet address of destination host
– or Ethernet address of next hop router
• Techniques
– Encode physical address in host part of IP address
(IPv6)
– Each network node maintains lookup table (IP->phys)
ARP – address resolution protocol
• Dynamically builds table of IP to physical
address bindings for a local network
• Broadcast request if IP address not in table
• All learn IP address of requesting node
(broadcast)
• Target machine responds with its physical
address
• Table entries are discarded if not refreshed
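A minimal sketch of the table behaviour described above: bindings are learned, looked up, and discarded if not refreshed. The 60-second timeout and the class and method names are assumptions; time is passed in explicitly so the example stays deterministic.

```python
class ArpCache:
    """Toy IP -> physical address table with entry expiry."""

    def __init__(self, timeout=60.0):
        self.timeout = timeout   # seconds an entry stays valid (assumed value)
        self.entries = {}        # IP address -> (physical address, time learned)

    def learn(self, ip, mac, now):
        """Record or refresh a binding, e.g. from a request or reply seen on the LAN."""
        self.entries[ip] = (mac, now)

    def lookup(self, ip, now):
        """Return the physical address, or None if the caller must broadcast a request."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned = entry
        if now - learned > self.timeout:
            del self.entries[ip]   # stale: discard and ask again
            return None
        return mac


cache = ArpCache()
cache.learn("128.148.32.110", "00:11:22:33:44:55", now=0.0)
fresh = cache.lookup("128.148.32.110", now=30.0)
stale = cache.lookup("128.148.32.110", now=100.0)
print(fresh, stale)
```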
ARP Ethernet frame format
• Why include source hardware address?
Obtaining Host IP Addresses - DHCP
• Networks are free to assign addresses within block
to hosts
• Tedious and error-prone: e.g., laptop going from CIT
to library to coffee shop
• Solution: Dynamic Host Configuration Protocol
– Client: DHCP Discover to 255.255.255.255 (broadcast)
– Server(s): DHCP Offer to 255.255.255.255 (why
broadcast?)
– Client: choose offer, DHCP Request (broadcast, why?)
– Server: DHCP ACK (again broadcast)
• Result: address, gateway, netmask, DNS server
Obtaining IP Addresses
• Blocks of IP addresses allocated hierarchically
– ISP obtains an address block, may subdivide
ISP:      128.35.16/20  10000000 00100011 00010000 00000000
Client 1: 128.35.16/22  10000000 00100011 00010000 00000000
Client 2: 128.35.20/22  10000000 00100011 00010100 00000000
Client 3: 128.35.24/21  10000000 00100011 00011000 00000000
• Global allocation: ICANN, /8’s (ran out!)
• Regional registries: ARIN, RIPE, APNIC, LACNIC, AFRINIC
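The subdivision above can be checked with Python's standard `ipaddress` module. This is only a sanity check of the example blocks, with the short prefixes written out in full.

```python
import ipaddress
from itertools import combinations

isp = ipaddress.ip_network("128.35.16.0/20")
clients = {
    "Client 1": ipaddress.ip_network("128.35.16.0/22"),
    "Client 2": ipaddress.ip_network("128.35.20.0/22"),
    "Client 3": ipaddress.ip_network("128.35.24.0/21"),
}

# Every client block must sit inside the ISP block and not overlap another client.
inside = all(net.subnet_of(isp) for net in clients.values())
overlap = any(a.overlaps(b) for a, b in combinations(clients.values(), 2))
used = sum(net.num_addresses for net in clients.values())

print(inside, overlap, used, isp.num_addresses)
```

The two /22 blocks and the /21 block together cover exactly the addresses of the /20.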
Network Address Translation (NAT)
• Despite CIDR, it’s still difficult to allocate
addresses (2^32 is only about 4 billion)
• We’ll talk about IPv6 later
• NAT “hides” entire network behind one address
• Hosts are given private addresses
• Routers map outgoing packets to a free
address/port
• Router reverse maps incoming packets
• Problems?
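The mapping and reverse mapping can be sketched with two dictionaries. This is a simplified illustration: a real NAT would also key on protocol and remote endpoint and would expire idle entries. The class name, starting port, and addresses are assumptions.

```python
class Nat:
    """Toy port-translating NAT behind a single public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (private IP, private port) -> public port
        self.back = {}   # public port -> (private IP, private port)

    def outgoing(self, src_ip, src_port):
        """Rewrite the source of an outgoing packet, allocating a free port if needed."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def incoming(self, dst_port):
        """Reverse map an incoming packet; None means no mapping, so it is dropped."""
        return self.back.get(dst_port)


nat = Nat("203.0.113.7")
a = nat.outgoing("10.0.0.2", 5000)
b = nat.outgoing("10.0.0.3", 5000)
print(a, b, nat.incoming(b[1]), nat.incoming(12345))
```

The last lookup hints at one of the problems: a host outside cannot start a connection to a private host, because no mapping exists until the private host sends something first.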
Internet Control Message Protocol (ICMP)
• Echo (ping)
• Redirect
• Destination unreachable (protocol, port, or host)
• TTL exceeded
• Checksum failed
• Reassembly failed
• Can’t fragment
• Many ICMP messages include part of packet that triggered them
• See http://www.iana.org/assignments/icmpparameters
ICMP message format
Example: Time Exceeded
• Code usually 0 (TTL exceeded in transit)
• Discussion: traceroute
Example: Can’t Fragment
• Sent if DF=1 and packet length > MTU
• What can you use this for?
• Path MTU Discovery
– Can do binary search on packet sizes
– But better: base algorithm on most common
MTUs
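The "most common MTUs" idea can be sketched as trying a short list of candidate sizes from largest to smallest. The candidate list is illustrative, not an authoritative table, and `probe` is a hypothetical stand-in for sending a DF=1 packet and checking whether a Can’t Fragment error comes back.

```python
COMMON_MTUS = [1500, 1492, 1006, 576, 296, 68]  # illustrative candidate sizes


def discover(probe, candidates=COMMON_MTUS):
    """Return the largest candidate size that gets through, or None if none do."""
    for size in sorted(candidates, reverse=True):
        if probe(size):
            return size
    return None


def make_path(link_mtus):
    """Simulated path: a DF packet passes only if it fits every link."""
    def probe(size):
        return size <= min(link_mtus)
    return probe


found = discover(make_path([1500, 576, 1500]))
print(found)
```

The result is a lower bound on the true path MTU, since sizes between two candidates are never tried. That is the trade-off against a binary search, which needs more probes.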