Internetworking
Lecture 2: Basic routing, ARP, and basic IP
• Literature:
– Forouzan, TCP/IP Protocol Suite: Ch 6 - 8

Basic Routing
Delivery, Forwarding, and Routing of IP packets
lecture_2

Connection-oriented vs Connectionless
• Connection-Oriented Services
– The network layer establishes a connection between a source and a destination
– Packets are sent along the connection.
– The decision about the route is made once at connection establishment
– Routers/switches in connection-oriented networks are stateful
• Connectionless Services
– The network layer treats each packet independently
– Route lookup for each packet (routing table)
– IP is connectionless
– IP routers are stateless

Direct vs Indirect delivery
• Direct delivery
– The final destination is connected to the same physical network as the sender.
– The destination IP address and the local interface have the same network prefix (netid) under the interface's netmask
– Map IP address to physical address: ARP
• Indirect delivery
– From router to router, last delivery is direct
– Destination address and routing table: Routing
[Figure: indirect delivery from A through routers R1, R2 and R3, with the last hop to B as direct delivery; direct delivery between A and B on the same network]
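
The direct/indirect decision above can be sketched in a few lines. This is only an illustration: the function name and the addresses are assumptions chosen for the example, and Python's standard ipaddress module does the prefix comparison.

```python
import ipaddress


def is_direct_delivery(local_interface: str, destination: str) -> bool:
    """Return True if the destination is on the sender's own network.

    local_interface is the sender's address with its prefix length,
    e.g. "192.168.15.7/24" (an illustrative value).
    """
    iface = ipaddress.ip_interface(local_interface)
    # Same network prefix under the interface's netmask => direct delivery.
    return ipaddress.ip_address(destination) in iface.network


# Same /24 as the interface: resolve the destination itself with ARP.
print(is_direct_delivery("192.168.15.7/24", "192.168.15.42"))
# Different network: hand the packet to a next-hop router instead.
print(is_direct_delivery("192.168.15.7/24", "130.237.15.1"))
```

With direct delivery the sender resolves the destination's own MAC address; with indirect delivery it resolves the MAC address of the next-hop router.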

Next-hop Routing
• How do you hold information about the route from A to all other hosts?
– A → R1 → R2 → R3 → B
• Store a table of host/network address and next-hop in every node
[Figure: networks N1-N4, routers R1-R4 and hosts A-F; each node holds a table of (network, next-hop) entries such as "N1, R1" or "N4, -"]

Routing Table Search - Classful
• Determine class from destination address
• Search within class
• Routing table often divided into "buckets"
[Figure: the destination IP address selects the Class A bucket, the Class B bucket or the Class C bucket]
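
A minimal sketch of the classful search, assuming one dictionary ("bucket") per class keyed by netid. The bucket contents and router names are made up for the example.

```python
# Number of leading octets that form the netid in each unicast class.
CLASS_NETID_OCTETS = {"A": 1, "B": 2, "C": 3}


def address_class(address: str) -> str:
    """Determine the class from the leading bits of the first octet."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return "A"  # leading bit 0
    if first_octet < 192:
        return "B"  # leading bits 10
    if first_octet < 224:
        return "C"  # leading bits 110
    if first_octet < 240:
        return "D"  # leading bits 1110 (multicast)
    return "E"      # leading bits 1111 (reserved)


def classful_lookup(buckets: dict, address: str):
    """Pick the bucket for the address class, then search within it."""
    cls = address_class(address)
    octets = CLASS_NETID_OCTETS.get(cls)
    if octets is None:
        return None  # class D/E addresses have no netid bucket
    netid = ".".join(address.split(".")[:octets])
    return buckets[cls].get(netid)


# Hypothetical table: netid -> next hop, one bucket per class.
buckets = {
    "A": {"10": "R1"},
    "B": {"130.237": "R2", "140.252": "R3"},
    "C": {"192.168.15": "R4"},
}
print(classful_lookup(buckets, "140.252.13.34"))  # R3
```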

Routing Table Search - Classless
• Longest prefix first
• Conceptually: divide the table into 33 "buckets", one for each netmask length (/0 to /32), and match the destination against the longest prefixes first
• SW algorithms: trees, binary trees, tries (different data structures)
• HW support: TCAMs – Ternary Content Addressable Memory
[Figure: the destination IP address is matched against one Netid bucket per Masklen, from 32 down to 0]

Routing Tables
• The basic idea with IP addressing (and CIDR) is to aggregate addresses
• The ideal situation is to have domains publishing (exporting) only a small set of prefixes
• More aggregation leads to smaller routing tables
– more specific networks (with longer prefixes) → less specific networks (with shorter prefixes)
– Effective address assignment policy
• Some mechanisms lead to increased fragmentation
– # of available addresses decreasing → distribution of long prefixes (/24)
– Multihoming - sites having several subnetworks – from different providers
• Current routing tables hold ~150000 entries (~60% are /24 prefixes)
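
The conceptual longest-prefix-first search with one bucket per mask length can be sketched as below. This mirrors the conceptual model only, not the tries or TCAMs a real router would use; the class name, prefixes and next-hop names are assumptions for the example.

```python
import ipaddress


class BucketTable:
    """Routing table with one bucket per mask length (0..32)."""

    def __init__(self):
        # buckets[masklen] maps a masked network address (as int) to a next hop.
        self.buckets = [dict() for _ in range(33)]

    def add(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.ip_network(prefix)
        self.buckets[net.prefixlen][int(net.network_address)] = next_hop

    def lookup(self, destination: str):
        dst = int(ipaddress.ip_address(destination))
        # Try the longest prefixes first; the first hit is the best match.
        for masklen in range(32, -1, -1):
            mask = (0xFFFFFFFF << (32 - masklen)) & 0xFFFFFFFF
            hit = self.buckets[masklen].get(dst & mask)
            if hit is not None:
                return hit
        return None


table = BucketTable()
table.add("0.0.0.0/0", "R-default")    # default route
table.add("130.237.0.0/16", "R1")      # aggregated prefix
table.add("130.237.15.0/24", "R2")     # more specific prefix

print(table.lookup("130.237.15.1"))    # R2 (the /24 wins over the /16)
print(table.lookup("130.237.99.1"))    # R1
print(table.lookup("8.8.8.8"))         # R-default
```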

Routing Table – Common Fields
[Table columns: Mask, Network Address, Next-hop Address, Interface, Flags, Reference count, Use]
• Mask – netmask applied for the entry [255.255.255.0]
• Network address – destination network [192.168.15.0]
• Next-hop address – next router [130.237.15.1]
• Interface – outgoing interface [eth0]
• Flags – status/info [U(p), G(ateway), H(ost-specific)...]
• Reference count – # of users using this route
• Use – # of packets transmitted for this destination

IP Router Model
[Figure: Router with a Control Plane (IP Routing, RIB: Routing Information Base) above a Data Plane (IP Forwarding, FIB: Forwarding Information Base) that connects an Ethernet Interface and an FDDI Interface]
• A router can be partitioned into a data plane and a control plane
– The data plane is fast and special purpose – handles packet forwarding in real time
– The control plane is general purpose – handles routing in the background

IP Forwarding
• A router switches packets between network interfaces
• Extracts header information from the incoming datagram
– Destination IP address
• Makes a lookup in the forwarding information base by making a match against networks
– Next-Hop IP address,
– Outgoing interface,...
• Modifies datagram header
• Sends on outgoing interface
• But a router performs much more than IPv4 lookup
– Access lists, filtering
– Traffic management
– Other protocols: Bridging, MPLS, IPv6, ...

ARP
Mapping between logical IP addresses and physical addresses

Logical and Physical Addresses
• A host's network interface card (NIC) has:
– a hardcoded, physical MAC address
• e.g., 48-bit Ethernet address
– a configured, logical IP address
– a configured name

Name    MAC addr          IP addr
bsdi    0:0:c0:6f:2d:40   140.252.13.35
sun     8:0:20:3:f6:42    140.252.13.33
svr4    0:0:c0:c2:9b:26   140.252.13.34

Communicating with a next-hop
[Figure: bsdi, sun and svr4 on the same network, with the MAC and IP addresses listed above]
• Problem: bsdi wants to send an IP packet to svr4
– No routers between sender and receiver – directly connected host
• Getting the IP address of svr4
– Static configuration
– DNS: Name → Address (Later lectures)
• Getting the MAC address of svr4
– Static configuration
– Dynamic Address Resolution - ARP

ARP - Address Resolution Protocol
• Problem: we are to send a packet to an interface on a directly attached network - we know the IP-address of the destination but not the MAC address.
• Idea: Broadcast a request - "On which MAC address can IP-address X be reached?".
– ARP request
• The host/router with the destination replies with its MAC address
– ARP reply
• This is the basic functionality of ARP

ARP Example
bsdi intends to send an IP datagram to svr4 (140.252.13.34)
1. Send an ARP request on broadcast to all stations:
– who has 140.252.13.34?
2. svr4 identifies it as its own address and sends an ARP reply on unicast back to bsdi
– I have 140.252.13.34 and its MAC address is 0:0:c0:c2:9b:26
3. bsdi sends the datagram to svr4 using the resolved MAC address
4. Note that sun and svr4 can update their ARP caches with bsdi!
[Figure: bsdi, sun and svr4 on one network; (1) broadcast ARP request from bsdi, (2) unicast ARP reply from svr4, (3) datagram from bsdi to svr4]
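
The exchange in the ARP example can be imitated with a small in-memory model. This is a sketch under simplifying assumptions: the Host class and resolve function are invented for the illustration, no real frames are sent, and only the host names and addresses come from the example.

```python
class Host:
    def __init__(self, name: str, mac: str, ip: str):
        self.name, self.mac, self.ip = name, mac, ip
        self.arp_cache = {}  # IP address -> MAC address

    def receive_request(self, sender_ip: str, sender_mac: str, target_ip: str):
        # Every station hears the broadcast and can learn the sender's binding.
        self.arp_cache[sender_ip] = sender_mac
        if target_ip == self.ip:
            return (self.ip, self.mac)  # ARP reply, sent back on unicast
        return None  # not our address: stay silent


def resolve(sender: Host, target_ip: str, lan: list):
    """Resolve target_ip for sender, broadcasting a request if needed."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    for host in lan:
        if host is sender:
            continue
        reply = host.receive_request(sender.ip, sender.mac, target_ip)
        if reply is not None:
            reply_ip, reply_mac = reply
            sender.arp_cache[reply_ip] = reply_mac
    return sender.arp_cache.get(target_ip)


bsdi = Host("bsdi", "0:0:c0:6f:2d:40", "140.252.13.35")
sun = Host("sun", "8:0:20:3:f6:42", "140.252.13.33")
svr4 = Host("svr4", "0:0:c0:c2:9b:26", "140.252.13.34")
lan = [bsdi, sun, svr4]

print(resolve(bsdi, "140.252.13.34", lan))  # 0:0:c0:c2:9b:26
print(sun.arp_cache)                        # sun has learned bsdi's binding
```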

ARP Packet
• Two length fields
– Hardware (Ethernet address length: 6)
– Protocol (IP address length: 4)
• Sender Ethernet and IP address
• Target Ethernet and IP address
• ARP is encapsulated directly into a data link frame (e.g., Ethernet)

Field                      Size (bytes)
hw type                    2
prot type                  2
hw len (hardware size)     1
prot len (protocol size)   1
op                         2
sender Ethernet addr       6
sender IP addr             4
target Ethernet addr       6
target IP addr             4

ARP Optimizations
• ARP cache
– Resolved addresses are saved in a cache.
– Works because of correlations in use of addresses
– Limits ARP traffic
• Entries in the ARP cache time out
• Network is snooped
– Since the sender's Internet-to-Physical address binding is in every ARP broadcast, (all) receivers update their caches before processing an ARP packet
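
The ARP packet layout above (2+2+1+1+2+6+4+6+4 = 28 bytes for Ethernet and IPv4) maps directly onto a struct format string. The sketch below builds a request; the helper names are assumptions, and the constants (hardware type 1 for Ethernet, protocol type 0x0800 for IPv4, op 1 for a request) are the standard ARP values rather than something stated on the slides.

```python
import socket
import struct

# hw type, prot type, hw len, prot len, op,
# sender Ethernet, sender IP, target Ethernet, target IP (network byte order)
ARP_FORMAT = "!HHBBH6s4s6s4s"


def mac_to_bytes(mac: str) -> bytes:
    """Convert a colon-separated MAC such as 0:0:c0:6f:2d:40 to 6 bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    return struct.pack(
        ARP_FORMAT,
        1,                            # hw type: Ethernet
        0x0800,                       # prot type: IPv4
        6,                            # hw len: Ethernet address length
        4,                            # prot len: IP address length
        1,                            # op: ARP request
        mac_to_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        bytes(6),                     # target Ethernet addr: not yet known
        socket.inet_aton(target_ip),
    )


packet = build_arp_request("0:0:c0:6f:2d:40", "140.252.13.35", "140.252.13.34")
print(len(packet))  # 28
```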

ARP Timeouts
• If there is no reply to an ARP request
– The machine is down or not responding
– Request was lost, therefore retry (but not too often)
– Eventually give up (When?)
• ARP cache timeouts
– completed entry in 20 minutes (BSD Unix)
– incomplete entry in 3 minutes (BSD Unix)

Indirect/Direct Delivery and ARP
• A sends an IP packet to B through router R
• Ethernet links to connect A and B to R
[Figure: A (IP A, MAC a) connects to router R (IP R, MAC r1 towards A, MAC r2 towards B), which connects to B (IP B, MAC b)]

Hop                          IP Header        Ethernet Header
A → R (Indirect delivery)    Src: A, Dst: B   Src: a, Dst: r1
R → B (Direct delivery)      Src: A, Dst: B   Src: r2, Dst: b
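
The point of the A-to-B example is that the IP header stays the same end to end while the Ethernet header is rewritten on every hop. A small sketch with symbolic addresses (the Frame type and send_hop function are hypothetical, introduced only for this illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    eth_src: str  # link-level source, changes per hop
    eth_dst: str  # link-level destination, changes per hop
    ip_src: str   # network-level source, unchanged end to end
    ip_dst: str   # network-level destination, unchanged end to end


def send_hop(ip_src: str, ip_dst: str, out_mac: str, next_mac: str) -> Frame:
    """Wrap the same IP packet in a new Ethernet header for one hop."""
    return Frame(eth_src=out_mac, eth_dst=next_mac, ip_src=ip_src, ip_dst=ip_dst)


# Hop 1 (indirect delivery): A resolves the router's MAC r1 with ARP.
first = send_hop("A", "B", out_mac="a", next_mac="r1")
# Hop 2 (direct delivery): R resolves B's MAC b with ARP and sends from r2.
second = send_hop(first.ip_src, first.ip_dst, out_mac="r2", next_mac="b")

print(first)
print(second)
```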

Proxy ARP (RFC 1027)
• Proxy ARP - someone responds to ARP requests on someone else's behalf
• Example: sun is hidden behind netb: netb responds on behalf of sun.
• Allows sub-networks to be hidden
[Figure: gemini sends an arp request for 140.252.1.129; netb (140.252.1.183) sends the arp reply; sun (140.252.1.129) sits behind netb on a slip link]

Gratuitous ARP
• Host sends an ARP request for its own address
– Generally done at boot time to inform other machines of its address (possibly a new address) - they get a chance to update their cache entries immediately
– Lets hosts check to see if there is another machine claiming the same address ⇒ "duplicate IP address sent from Ethernet address a:b:c:d:e:f"
• As noted before, hosts have paid the price by servicing the broadcast, so they can cache this information - this is one of the ways the proxy ARP server could know the mapping
• Note that faking that you are another machine can be used to provide failover for servers

RARP: Reverse Address Resolution Protocol (RFC 903)
• How to get your own IP address, when all you know is your link address
• Necessary if you don't have a disk or other stable storage
• RARP request - broadcast to every host on the network (i.e., EtherDST=0xFFFFFFFFFFFF), TYPE=0x8035
• RARP server: "I know that address!" and sends an RARP reply
• Source host - receives the RARP reply, and now knows its own IP addr
• RARP packet has exactly the same format as ARP packet
• BOOTP/DHCP is a more powerful alternative to RARP

RARP Server
• Someone has to know the mappings - quite often this is in the file "/etc/ethers"
• Since this information is generally in a file, RARP servers are generally implemented as user processes, unlike ARP responses, which are generally part of the TCP/IP implementation (often part of the kernel)
• How does the process get the packets - since they aren't IP and won't come across a socket?
– PCAP – Packet Capture (used by Tcpdump/Ethereal)
– BPF – Berkeley Packet Filter (older)
• RARP requests are sent as hardware-level broadcasts and are therefore not forwarded across routers

IP
Basic functionality and the IP packet header

Issues in IP
• Following the end2end argument, only the absolutely necessary functionality is in IP
– Best Effort Service: Unreliable and Connectionless
– Application or Transport layer handles reliability
• How to deliver datagrams over multiple links (hops) in an internetwork?
– Addressing
– Best-effort delivery service
• Forwarding of packets from one link to another
– Error handling

IPv4 Header – RFC 791
• Version
• HLEN – Header Length
• Type of Service
• Total Length
– Header + Payload
• Fragmentation
– ID, Flags, Offset
• TTL – Time To Live
– Limits lifetime
• Protocol
– Higher level protocol
• Header checksum
• IP Addresses
– Source, Destination
• Options
[Figure: IPv4 header layout. ©The McGraw-Hill Companies, Inc., 2000]

The Version Field
• Version 3 (IEN 21)
– Stems from when TCP was being split into one component handling hop-by-hop communication (IP) and one component handling end-to-end communication (TCP). IEN 21, 1 February 1978.
• Version 4 (RFC 791)
– IPv4
• Version 5 (RFC 1190)
– ST-II - Multimedia streaming protocol
• Version 6 (RFC 2460)
– IPv6

The Length Fields
• Header Length (4 bits)
– Size of IPv4 header including options.
– Expressed in number of 32-bit words (4-byte words)
– Min is 5 words (=20 bytes)
– Max is 15 words (=60 bytes) – limited size → limited use
• Total Length (16 bits)
– Total length of datagram including header.
– If datagram is fragmented: length of fragment.
– Expressed in bytes.
• Max: 65535 bytes. (This is IP's length limit)
• Many systems only accept 8K bytes.

The Type of Service Field
• Type of Service (ToS): 8 bits
• Intended as a field for specifying Quality of Service on a per-packet basis.
• Few applications set the ToS field.
– Unless there is an added cost/policy check/… associated with usage of a precedence level, it is very likely going to be abused.
• Long history of experimental use
– RFC 791 – original
– RFC 1122, 1349, 1455 modified the meaning of the ToS field
– Current proposal: RFC 2474
• Differentiated Services
– Explicit Congestion Notification (ECN): RFC 2481, 3168
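
As a sketch of how the Header Length and Total Length fields are read: both sit in the first four bytes of the IPv4 header, with the version and header length sharing the first byte. The function name and the sample bytes are assumptions for the illustration.

```python
import struct


def parse_lengths(header: bytes):
    """Return (version, header length in bytes, payload length in bytes)."""
    # First 4 bytes: version/HLEN (1 byte), ToS (1 byte), Total Length (2 bytes).
    version_ihl, _tos, total_length = struct.unpack("!BBH", header[:4])
    version = version_ihl >> 4          # upper 4 bits
    header_words = version_ihl & 0x0F   # lower 4 bits, in 32-bit words
    if header_words < 5:
        raise ValueError("header length below the 5-word (20-byte) minimum")
    header_bytes = header_words * 4
    return version, header_bytes, total_length - header_bytes


# 0x45: version 4, HLEN 5 words; Total Length 0x05DC = 1500 bytes.
print(parse_lengths(bytes([0x45, 0x00, 0x05, 0xDC])))  # (4, 20, 1480)
```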

The ToS Byte – Original proposal
[Figure: ToS byte from Bit 0 to Bit 7: Precedence, then TOS]
• Original Proposal – RFC 791
• Bits 0-2: Precedence
– Defines priority e.g., when packets must be dropped
• Bits 3-5: TOS
– Bit 3: 0 = Normal Delay, 1 = Low Delay
– Bit 4: 0 = Normal Throughput, 1 = High Throughput
– Bit 5: 0 = Normal Reliability, 1 = High Reliability.

DSField – Current Proposal
[Figure: DS field from Bit 0 to Bit 7: DSCP, then ECN]
• Differentiated Services (DiffServ) proposes to use 6 of these bits to provide 64 priority levels - calling it the Differentiated Service (DS) field
– RFC 2474
– Bits 0-5: Differentiated Services CodePoint (DSCP)
• The DSCP is set when entering an area and determines the QoS handling of the IP datagram in the routers within that area
– Scheduling
– Shaping
– Queue Dropping
• Explicit Congestion Notification (ECN)
– ECN Capable Transport (ECT)
– Congestion Experienced (CE)
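
Both interpretations of the same byte can be extracted with shifts and masks. Note that the slides number bits from the most significant end (bit 0 is the leftmost bit), so bits 0-2 are the top three bits of the byte. The function names and the sample value are assumptions for the illustration.

```python
def split_original_tos(tos: int) -> dict:
    """Interpret the byte as in the original proposal (RFC 791)."""
    return {
        "precedence": tos >> 5,                 # bits 0-2
        "low_delay": bool(tos & 0x10),          # bit 3
        "high_throughput": bool(tos & 0x08),    # bit 4
        "high_reliability": bool(tos & 0x04),   # bit 5
    }


def split_ds_field(tos: int) -> dict:
    """Interpret the byte as a DS field: 6-bit DSCP followed by 2-bit ECN."""
    return {"dscp": tos >> 2, "ecn": tos & 0x03}


value = 0xB8  # binary 1011 1000
print(split_original_tos(value))
print(split_ds_field(value))  # {'dscp': 46, 'ecn': 0}
```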

Fragmentation – MTU
• Physical networks have a maximum frame size
– MTU: Maximum Transmission Unit.
• A host or router transmitting a datagram larger than the MTU of the link must divide it into smaller pieces - fragments.
• Both hosts and routers may fragment
– But only the destination host reassembles!
– Each fragment is routed separately as an independent datagram

Fragmentation cont'd
• If the IP datagram is larger than the MTU of the link layer, it must be divided into several pieces to fit the MTU – this is called fragmentation
• In effect, only datagram services (e.g. UDP) are affected
– TCP uses 576 byte MTU or path MTU discovery
• Three fields of the IP header concern fragmentation

The Fragmentation Fields
• Identification: 16 bits
– ID + src IP addr uniquely identifies each datagram sent by a host
– The ID is copied to all fragments of a datagram upon fragmentation
• Flags: 3 bits
– RF (Reserved Fragment) – for future use (set to 0)
– DF (Don't Fragment).
• Set to 1 if datagram should not be fragmented.
• If set and fragmentation needed, datagram will be discarded and an error message will be returned to the sender
– MF (More Fragments)
• Set to 1 for all fragments, except the last.
• Fragmentation Offset: 13 bits
– 8-byte units: (ip->ip_frag << 3)
– Shows relative position of a fragment with respect to the whole datagram

Fragmentation Example – Offset
[Figure: fragmentation offset example]
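
A minimal sketch of how the MF flag and the offset (in 8-byte units) follow from the payload size and the MTU. The function name is an assumption, the header is taken to be 20 bytes without options, and the DF flag is ignored. The sample call uses a 1481-byte IP payload (an 8-byte UDP header plus 1473 bytes of data) over a link with MTU 1500.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20) -> list:
    """Return a list of (offset in 8-byte units, data bytes, MF flag)."""
    # Data per fragment must be a multiple of 8 (except in the last fragment),
    # because the offset field counts 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload_len:
        data_len = min(max_data, payload_len - sent)
        more_fragments = sent + data_len < payload_len
        fragments.append((sent // 8, data_len, more_fragments))
        sent += data_len
    return fragments


for offset, data_len, mf in fragment(1481, 1500):
    print(f"off={offset} bytes={data_len} MF={int(mf)}")
# off=0 bytes=1480 MF=1
# off=185 bytes=1 MF=0
```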

Fragmentation Example – Detailed
MTU = 1500 bytes

Original datagram:
IPv4 hdr (20 bytes; id=n, DF=0) | UDP hdr (8 bytes) | Data (1473 bytes)

Fragment 1:
IPv4 hdr (20 bytes; id=n, DF=0, MF=1, off=0) | UDP hdr (8 bytes) | Data (1472 bytes)

Fragment 2:
IPv4 hdr (20 bytes; id=n, DF=0, MF=0, off=185) | Data (1 byte)

Offset = 185 → 185x8 = 1480 bytes

The TTL field
• TTL - Time To Live: 8 bits
• Limit the lifetime of a datagram - avoid infinite loops
• A router receiving a TTL>1 decrements the TTL and forwards it
• A TTL <= 1 shall not be forwarded
– ICMP "time exceeded" is returned to the sender (later slide)
• Recommended value is 64
• Should really be called Hop Limit (as in IPv6)
– Historically: Every router holding a datagram for more than 1 second should decrement the TTL by the number of seconds.
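
The TTL rules above can be traced along a path of routers with a short sketch. The function name and return values are assumptions for the illustration; a real router would also send the ICMP message rather than just report it.

```python
def forward_through(ttl: int, router_count: int):
    """Follow a datagram across router_count routers.

    Returns ("delivered", remaining TTL) or
    ("time exceeded", index of the router that dropped it).
    """
    for index in range(1, router_count + 1):
        if ttl <= 1:
            # Not forwarded; ICMP "time exceeded" goes back to the sender.
            return ("time exceeded", index)
        ttl -= 1  # TTL > 1: decrement and forward
    return ("delivered", ttl)


print(forward_through(64, 3))  # ('delivered', 61)
print(forward_through(2, 3))   # ('time exceeded', 2)
```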

The Protocol Field
• Demultiplexing to higher layers
• Assigned by IANA
– Internet Assigned Numbers Authority
• A subset (out of 134) assigned

decimal   keyword   protocol
1         ICMP      Internet Control Message
4         IP        IP in IP (encapsulation)
6         TCP       Transmission Control
17        UDP       User Datagram
41        IPv6      IPv6 in IPv4
46        RSVP      Reservation Protocol

Header Checksum
• Ensures integrity of header fields
– Hop-by-hop (not end-to-end)
– The header fields must be correct for proper and safe processing.
– The payload is not covered.
• Other checksums
– Link-level CRC. IP assumes a strong L2 checksum/CRC. Hop-by-hop.
– L4 checksums, e.g. TCP/ICMP/UDP checksums, cover the payload. End-to-end.
• Internet Checksum Algorithm, RFC 1071
– Treat header as sequence of 16-bit integers.
– Add them together using one's complement arithmetic (carries are added back in)
– Take the one's complement of the result.
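
The three checksum steps above translate into a short function. This is a straightforward sketch rather than an optimized implementation; the function name and the sample header bytes are assumptions for the illustration.

```python
def internet_checksum(data: bytes) -> int:
    """Internet checksum over data (the checksum field must be zero on input)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    # Treat the data as a sequence of 16-bit big-endian integers and add them.
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # One's complement addition: fold carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    # Take the one's complement of the result.
    return ~total & 0xFFFF


# A 20-byte sample header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
print(hex(internet_checksum(header)))  # 0xb861
```

A receiver can verify a header by running the same function over it with the checksum field filled in; a correct header gives 0.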
Summary
• Basic Routing
– Connectionless, next-hop routing
– Routing tables: RIBs and FIBs
– Longest prefix match
• Address resolution
– ARP
– RARP
• IP – Internet Protocol
– Basic functionality
– Header fields