Lecture 1.0:
Introduction to Ethernet
Giuseppe Bianchi
Ethernet Ancestors
Late 1960s: ALOHA network
Norman Abramson, University of Hawaii
Application: radio network among islands
Distributed, uncoordinated network!
First random access mechanism
(Pure aloha / Slotted aloha)
Birth of Ethernet
May 22, 1973: Ethernet memo
Bob Metcalfe (Xerox Palo Alto Research Center)
Carrier Sense Multiple Access with Collision Detection and exponential backoff
3 Mbps speed
Original Metcalfe drawing
June 1976 presentation at
National Computer Conference
US Patent 4,063,220 “Multipoint Data Communication System with Collision Detection” (end 1977)
1978: US Patent for Ethernet Repeater
Ethernet Standardization
1979: Metcalfe start-up - 3COM
1980: DIX Ethernet Standard
DIX = Digital-Intel-Xerox vendor consortium
Interoperable products from the three founding
companies
1982: Xerox relinquishes the “Ethernet” trademark
1985: IEEE 802.3
Ethernet becomes an IEEE 802 standard
10 Mbps (10BASE5 thick coaxial, Thick-RG213)
802.3 supplement a (1985):
» 10BASE2 thin coax (Thin-RG58)
Minor modifications vs DIX standard
Path towards worldwide interoperability
Ethernet standard: the world’s FIRST open, multi-vendor standard!
Quoting Metcalfe: “the invention of Ethernet as an open, non-proprietary,
industry-standard local network was perhaps even more significant
than the invention of Ethernet technology itself”
Connection to coaxial cable (historical)
Thick Cable
Needs external TAP (transceiver)
Transceiver cable from the transceiver to the Controller, with 15-pin AUI connector
Thin Cable
Controller may support Internal TAP (on board transceiver)
A note on Ethernet terminology
Name = speed + signal method + medium
EXAMPLE: 100Base-T, 1000Base-LX, …
Speed
10, 100, 1000, 10G
Signal method
Base, broad
Broad = RF modulated on coax
» only one case: 10BROAD36, now obsolete
Medium
Old notation: 2, 5 = 200/500 m (thin/thick coax)
More recent notation: T, TX, T4, T2, FX, X, CX, SX, LX
Depends on the specific twisted pair category & fibre category;
Different labels (e.g. T, TX, T4, T2) account for different encoding
details
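The speed / signal method / medium naming scheme can be sketched as a small parser. This is only an illustration: the function name `parse_ethernet_name` and the returned dictionary keys are hypothetical, and the regular expression covers just the simple forms shown on this slide.

```python
import re


def parse_ethernet_name(name):
    """Split a name such as '100Base-T' into speed, signal method and medium.

    Hypothetical helper for illustration; it accepts only the simple
    <speed><BASE|BROAD>[-]<medium> pattern described on the slide.
    """
    match = re.fullmatch(r"(\d+G?)(BASE|BROAD)-?(\w+)", name.upper())
    if match is None:
        raise ValueError(f"not an Ethernet name: {name!r}")
    speed, signal, medium = match.groups()
    # Speeds are in Mbps unless they carry a trailing G (Gbps).
    mbps = int(speed[:-1]) * 1000 if speed.endswith("G") else int(speed)
    return {"speed_mbps": mbps, "signal": signal, "medium": medium}


if __name__ == "__main__":
    for example in ("100Base-T", "1000Base-LX", "10BASE5", "10BROAD36"):
        print(example, parse_ethernet_name(example))
```

For the old notation the medium field is the digit (2 or 5); for the recent notation it is the letter code (T, TX, LX, ...).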
Ethernet and OSI
Ethernet and PHY
(selected+simplified)
IEEE 802 project
LAN / MAN Standards Committee (LMSC)
Unified interface with network layer
NETWORK LAYER
DATA LINK LAYER
LLC: 802.2 Logical Link Control
802.1 Bridging
MAC:
802.3 CSMA/CD
802.5 TOKEN RING
802.11 WLAN
802.15 WPAN
PHYSICAL LAYER
IEEE 802 standards
ACTIVE WORKING & TECHNICAL ADVISORY GROUPS
802.1 High Level Interface (HILI)
802.3 CSMA/CD
802.11 Wireless LAN (WLAN)
802.15 Wireless Personal Area Network (WPAN)
802.16 Broadband Wireless Access (BBWA)
802.17 Resilient Packet Ring (RPR)
802.18 Radio Regulatory Technical Advisory Group
802.19 Coexistence Technical Advisory Group
802.20 Mobile Wireless Access
802.21 Media Independent Handover
HIBERNATING WORKING GROUPS (standards published, but inactive)
802.2 Logical Link Control (LLC)
802.5 Token Ring
802.12 Demand Priority
DISBANDED WORKING GROUPS (all standards withdrawn or did not publish
a standard)
802.4 Token Bus
802.6 Metropolitan Area Network (MAN)
802.7 BroadBand Technical Adv. Group (BBTAG)
802.8 Fiber Optics Technical Adv. Group (FOTAG)
802.9 Integrated Services LAN (ISLAN) Working Group
802.10 Standard for Interoperable LAN Security (SILS) Working Group
802.14 Cable-TV Based Broadband Communication Network Working Group
Traditional Ethernet topology: bus
Multiple Access shared transmission medium
thick / thin coaxial cable
[Figure: stations A to F attached to a shared bus; a second arrangement of the same stations is labelled “wrong!”]
Twisted Pair revolution
1990: 802.3i
10BASE-T twisted pair
Invented by SynOptics Communications
Reuse structured cabling system standards
Overcomes management and installation problems from coaxial cabling
Ethernet market takes off!!
Alternatives
UTP (Unshielded)
FTP (Foiled): 1 shield for all the cable
STP (Shielded): one shield per pair
Twisted Pair: star topology
(no TAP allowed)
Initially: HUB
broadcasts signal on all links
Logically behaves as a bus
Only one tx at a time
Then: SWITCH
Repeats signal on specifically addressed link
Bridging function
Many tx-rx pairs at a time
More bandwidth!
[Figure: stations A to F attached in a star to a HUB, and to a SWITCH]
Same topological issues for fiber optic links…
Fiber Optics enters into play
1987: FOIRL (802.3d)
Fiber Optic Inter-Repeater Link
Point-to-point segment to link remote ethernet
segments (via repeaters)
No direct PC-Ethernet connection until
10BASE-F
1993: 10BASE-F (802.3j)
three specifications
10BASE-FB for active fiber hubs
» scarce success
10BASE-FP for passive fiber hubs
» never built!
10BASE-FL extends FOIRL specification
» the only one deployed
Ethernet for higher speeds
Many people thought Ethernet could not go faster than 10 Mbps… instead:
1995: 100BASE-T Fast Ethernet (802.3u)
100 Mbps on twisted pair
As well as on any other media
Auto-Negotiation capabilities
10/100 products
1997: full duplex standard (802.3x)
Simultaneously transmit and receive (2x speed increase)
1998: 1000BASE-X Gigabit Ethernet (802.3z)
Over fiber and short copper cable
1999: 1000BASE-T Gigabit Ethernet (802.3ab)
Over Twisted Pair
10/100/1000 auto-negotiation
2002 (July): 10 Gigabit Ethernet (802.3ae)
Ethernet Evolution at a glance
Lecture 1.1:
Ethernet Basics
Ethernet/802.3 frame
Field                     Size
preamble                  8 bytes
Ethernet/802.3 header     14 bytes
  Destination address     6 bytes
  Source address          6 bytes
  Length or type          2 bytes
LLC/DATA                  46-1500 bytes
FCS                       4 bytes
Frame (excl. preamble)    64-1518 bytes

Main Differences
              ETHERNET                  802.3
Frame type    type                      Length or type
payload       Data (≥46 but no PAD)     LLC+data (explicit PAD)
Preamble
8 bytes
7 bytes preamble
7 x (10101010)
Last byte: SFD (Start Frame Delimiter)
(10101011)
Devised for 10 Mbps systems
For synchronization
[Figure: SFD bit sequence 1 0 1 0 1 0 1 1 shown with Manchester Encoding (10 Mbps)]
Not useful for 100/1000 systems
Maintained for backward compatibility
Frame Check Sequence
FCS
4 bytes = 32 bits CRC
Order: [x31…x0]
calculated on frame only (not on preamble…)
Of course!!
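A minimal sketch of the FCS computation, assuming the Ethernet CRC-32 is the same polynomial implemented by Python's `zlib.crc32` and that the 4 FCS bytes appear in little-endian order in a captured frame (both are stated here as assumptions, not taken from the slide). The helper names `append_fcs` and `fcs_ok` are hypothetical.

```python
import struct
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Return the frame followed by its 32-bit CRC.

    The CRC covers the frame only (addresses, length/type, payload),
    never the preamble. Little-endian byte order is an assumption.
    """
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", fcs)


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over the frame body and compare with the trailer."""
    body, trailer = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == trailer


if __name__ == "__main__":
    frame = append_fcs(bytes(60))           # 60 zero bytes + 4 bytes FCS = 64
    print(len(frame), fcs_ok(frame))
    damaged = bytes([frame[0] ^ 0x01]) + frame[1:]
    print(fcs_ok(damaged))                  # a single flipped bit is detected
```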
48 bit addresses
Typically referred to as
Interface address
Hardware address
MAC address
First bit:
0 = physical address of an interface
Unicast address
1 = group address
Second bit:
0 = globally administered address
Assigned by the manufacturer
1 = locally administered address
Can be configured through driver
First 24 bits: OUI
(Organizationally Unique Identifier)
(unique for each vendor)
Typically written in hex
e.g.: F0-11-00-4F-A2-1C
Each byte transmitted
from LSB to MSB
0000.1111.1000.1000.0000.0000.
1111.0010.0100.0101.0011.1000
mcast addresses: start with 1
(first octet LSB!)
Why destination first? A station whose address does not match the destination may ignore the rest of the frame!
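The address rules above can be checked with a few lines of code. This is a sketch with hypothetical helper names (`parse_mac`, `wire_bits`, `is_group`, `is_local`); since each byte is sent LSB first, the "first bit" and "second bit" on the wire are the two least significant bits of the first octet.

```python
def parse_mac(text: str) -> bytes:
    """Parse a hex address written as F0-11-00-4F-A2-1C."""
    octets = bytes(int(part, 16) for part in text.split("-"))
    if len(octets) != 6:
        raise ValueError("a MAC address has 6 octets")
    return octets


def wire_bits(mac: bytes) -> str:
    """Bits in transmission order: each byte from LSB to MSB."""
    return " ".join(format(octet, "08b")[::-1] for octet in mac)


def is_group(mac: bytes) -> bool:
    """First transmitted bit: 1 = group (multicast/broadcast) address."""
    return bool(mac[0] & 0x01)


def is_local(mac: bytes) -> bool:
    """Second transmitted bit: 1 = locally administered address."""
    return bool(mac[0] & 0x02)


if __name__ == "__main__":
    mac = parse_mac("F0-11-00-4F-A2-1C")
    print(wire_bits(mac))
    print("group:", is_group(mac), "local:", is_local(mac), "OUI:", mac[:3].hex("-"))
```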
Examples
Individual unicast: xy-xx-xx-xx-xx-xx y multiple of 4
802.3 & 802.4: transmitted from LSB to MSB
802.5 & FDDI: transmitted from MSB to LSB
Length/Type
2 bytes
In original Ethernet: frame type
Used for demultiplexing upper layer protocols
E.g.: 0x0800 = IP
In 802.3: length OR type
If > 1500 (more precisely, ≥ 0x0600 = 1536): frame type
Else: LLC payload size (≤ 1500)
Demultiplexing provided by LLC
If < 46, remaining octets are PAD (padding)
Ethernet and 802.3 frames may (do!) coexist on the same network.
Recognized via length / frame type field.
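A receiver can tell the two frame formats apart from the 2-byte field alone. The sketch below uses a hypothetical function `classify_length_type`; values between 1501 and 1535 are reported as undefined because the slide assigns them to neither case.

```python
import struct


def classify_length_type(frame: bytes):
    """Read the 14-byte header and interpret the Length/Type field.

    `frame` starts at the destination address (no preamble).
    Returns (kind, value) where kind is 'type', 'length' or 'undefined'.
    """
    _dst, _src, value = struct.unpack("!6s6sH", frame[:14])
    if value >= 0x0600:      # 1536 and above: Ethernet frame type
        return "type", value
    if value <= 1500:        # 802.3: size of the LLC payload
        return "length", value
    return "undefined", value


if __name__ == "__main__":
    header = bytes(12)  # dummy destination and source addresses
    print(classify_length_type(header + struct.pack("!H", 0x0800)))  # IP
    print(classify_length_type(header + struct.pack("!H", 100)))     # 802.3 length
```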
LLC header
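[Frame layout: preamble (8 bytes) | Ethernet/802.3 header (14 bytes) | LLC/DATA (46-1500 bytes) | FCS (4 bytes); 64-1518 bytes excluding preamble]
LLC/DATA field layout:
DSAP (1 byte) | SSAP (1 byte) | ctrl (1 byte) | Protocol OUI (3 bytes) | Protocol (2 bytes) | data (38-1492 bytes)
Protocol OUI + Protocol = SNAP (SubNetwork Access Protocol)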
DSAP=SSAP (typically)
Control: depends on service type. Typically:
service type = connectionless unreliable
ctrl=0x03 (unnumbered information)
Demultiplexing:
DSAP, SSAP used only for ISO-OSI standards
other protocols (including IP!) require SNAP addresses
LLC Header for an IP packet
DSAP (1 byte) | SSAP (1 byte) | ctrl (1 byte) | Protocol OUI (3 bytes) | Protocol (2 bytes) | data (38-1492 bytes)
0xAA          | 0xAA          | 0x03          | 0x000000               | 0x0800             | ………
DSAP=SSAP:
0xAA (use SNAP extension)
Control:
0x03 (unnumbered information)
Protocol OUI (Organizationally Unique Identifier):
0x000000 (Internet – IETF protocols)
Non-zero values for Novell, IBM, Digital, Apple, etc protocols
Protocol
0x0800 (IP)
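The 8 header bytes listed above can be assembled as follows. A sketch only: `snap_header` is a hypothetical helper name, with the slide's values as defaults.

```python
import struct


def snap_header(oui: int = 0x000000, protocol: int = 0x0800) -> bytes:
    """Build the LLC + SNAP header placed in front of the payload.

    DSAP = SSAP = 0xAA selects the SNAP extension, ctrl = 0x03 is
    unnumbered information; defaults describe an IP packet.
    """
    llc = struct.pack("!BBB", 0xAA, 0xAA, 0x03)
    snap = oui.to_bytes(3, "big") + struct.pack("!H", protocol)
    return llc + snap


if __name__ == "__main__":
    print(snap_header().hex(" "))
```

With this header the payload left for the IP packet shrinks by 8 bytes, which is where the 1492-byte upper limit comes from.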
Medium Access Control Protocol
Carrier Sense Multiple Access
with Collision Detection
CSMA/CD
Role of MAC (CSMA/CD)
three functions:
Transmit/receive data frames
Decode data frames and check them for
valid addresses
before passing them to the upper layers of
the OSI model
Detect errors within data frames or on the
network
Carrier Sense Multiple Access
1. Listen before talking
Step 1. Station ready to transmit a frame
Step 2. Listen for at least an Inter Frame Spacing (IFS); channel must be idle meanwhile
Step 3. Transmit frame
Ethernet Notation = Inter Packet Gap (IPG)
802.3 Notation = Inter Frame Spacing (IFS)
Minimum: 96 bits (@ 10 Mbps = 9.6 µs)
Carrier Sense Multiple Access
2. If channel detected busy: defer
Step 1. Frame ready
Step 2. Listen
Step 3. Busy detected
Step 4. Defer
Step 5. Listen for ≥ IFS
(similar defer situation if channel immediately busy)
Collision Detection
3. Listen while talking
If collision detected:
Continue transmitting for another 32 bits of signal
(Collision Enforcement Jam Signal)
If detected during the preamble, continue transmitting the preamble
AND another 32 jam bits
End transmission
Generate backoff interval, after which retry transmission
Backoff: r x 51.2 µs, 0 ≤ r < 2^k, k = min(10, n), n = retry #
Slot-time = 64 bytes = 512 bits (@ 10 Mbps = 51.2 µs)
Abort tx after 16 unsuccessful retries
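The backoff rule above can be sketched in a few lines. The function name `backoff_delay_us` and the choice to raise an exception at the 16th retry are illustrative assumptions; the slot time is the 10 Mbps value from the slide.

```python
import random

SLOT_TIME_US = 51.2   # 512 bits at 10 Mbps
MAX_RETRIES = 16      # the frame is dropped at the 16th unsuccessful retry


def backoff_delay_us(n: int, rng=random) -> float:
    """Truncated binary exponential backoff for retry number n (1-based).

    Draws r uniformly in 0 <= r < 2**k with k = min(10, n) and returns
    r slot times expressed in microseconds.
    """
    if n < 1:
        raise ValueError("retry number starts at 1")
    if n >= MAX_RETRIES:
        raise RuntimeError("too many collisions: abort transmission")
    k = min(10, n)
    r = rng.randrange(2 ** k)
    return r * SLOT_TIME_US


if __name__ == "__main__":
    rng = random.Random(1)
    for retry in (1, 2, 3, 10, 15):
        print(retry, backoff_delay_us(retry, rng))
```

After the first collision a station waits either 0 or 51.2 µs; from the 10th retry onwards the window stops growing at 1024 slots.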
Summary of operation
Source: Cisco CCNA
Collision detection in practice
Media dependent
On fiber or twisted pair:
Point-to-point links
Collision detected by the simultaneous occurrence
of activity on both transmit and receive paths
On Coax:
Monitor average DC voltage
When more than 1 station transmits, voltage gets
greater than given threshold
Why do collisions occur?
Two stations at distance d (m): propagation delay d/200 µs
Speed of EM signal in cable: ~200,000 km/s = 200 m/µs
Collision occurs if stations access the channel
at instants of time which differ by less than their
propagation delay
[Figure: time diagram; each station waits an IFS, starts tx, then detects the collision]
Network diameter
Stations placed at opposite network edges
Essential condition:
a station must be able to detect a collision
Otherwise lots of problems:
station would think the frame to be successfully delivered…
Shortest possible frame:
6+6+2+46+4 = 64 bytes = 512 bits (excl. preamble)
Condition on network diameter:
a collision MUST be detected on the shortest possible frame
Bound on maximum RTT:
RTT = 2 x prop must be lower than 512 bit times
@ 10 Mbps: 512/10 [µs] = 2d [m] / 200 [m/µs], hence d = 5120 [m]
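The bound can be reproduced numerically. This sketch (hypothetical function `max_diameter_m`) uses only the two figures given in the slides, 512 bits and 200 m/µs, and ignores jam time, repeater delay and processing time.

```python
PROPAGATION_M_PER_US = 200.0   # signal speed in cable
MIN_FRAME_BITS = 512           # 64-byte shortest frame


def max_diameter_m(rate_mbps: float, min_frame_bits: int = MIN_FRAME_BITS) -> float:
    """Largest one-way distance d such that RTT = 2d/200 fits in the
    transmission time of the shortest frame (idealised bound)."""
    max_rtt_us = min_frame_bits / rate_mbps    # 1 Mbps = 1 bit per µs
    return max_rtt_us * PROPAGATION_M_PER_US / 2


if __name__ == "__main__":
    for rate in (10, 100, 1000):
        print(f"{rate} Mbps: {max_diameter_m(rate):.1f} m")
```

The same formula shows why higher speeds are a problem: keeping a 512-bit minimum frame at 10 times the rate shrinks the idealised diameter by a factor of 10.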
Backoff slot-time
Set to 512 bits
as minimum frame size
As maximum RTT
guarantees that a transmitting
station in previous backoff slot will be
ALWAYS detected
A station transmitting for 512 bits
will acquire for sure the channel
No “late collisions” possible
Unless misconfigurations occur…
How does backoff work?
After a collision (jam), each station extracts a value in {0,1}:
Extracts 0: immediately reschedules tx (after the IFS)
Extracts 1: waits for a 51.2 µs slot-time (then the IFS)
More on network diameter
Safe condition to allow
collision detection
add 32 bits jam time
extra time for processing
From standard:
Maximum RTT=46.38 µs
Phy media have max len
Fiber: 2000m
Coax: 500m
Thin coax: 185m
drop cable: 50m
Transceiver cable
Fiber to coax
repeaters introduce delay
Etc……
RESULT:
2800 m max diameter
3 x 500 m coax
+ 1000 m total fiber
+ 6 x 50 m drop cable
Giga Ethernet – Carrier Extension
Minimum frame size set equal to
larger slot time of giga-ethernet
512 bytes = 4096 bits
Extension achieved with external padding
Frame structure left unchanged for backward
compatibility
Giga Ethernet – Frame bursting
Optional feature: Burst Mode
transmit series of frames without relinquishing control of the
transmission medium.
Achieves collision-free transmission for frames following the first
one
Transmitting station fills the interframe spacing interval with
extension bits
readily distinguished from data bits at the receiving stations
maintain the detection of carrier in the receiving stations (does
not allow the medium to assume an idle condition between
frames)
Necessary condition for bursting: first frame has been successfully
transmitted
Upper bound: burstLimit = 65536 bits
Channel Capture Effect / 1
For simplicity: we are neglecting detailed timing issues
Assumption: station with “many” frames in the tx buffer
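[Figure: timeline. First collision on P1: both stations draw a backoff in (0,1); station A draws 0 and transmits P1, station B draws 1. Station A's new frame P2 is sent with no backoff and collides with station B's retry of its own frame. Station A then draws a backoff in (0,1), station B in (0,3)]
After second collision, station B will be at the SECOND retry (due to its backoff choice), and will
compete with station A at FIRST retry.
Psucc(A) = ½ * ¾ + ½ * ½ = 5/8
Psucc(B) = ¼ * ½ = 1/8
A gets unfair advantage!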
Channel Capture Effect / 2
Unlucky stations get more and more unlucky!!
Following previous example:
P(win at first try) = 1/4 vs 1/4
P(win at second try) = 1/8 vs 5/8
P(win at third try) = 1/16 vs 13/16
P(win at fourth try) = 1/32 vs 29/32
… !!! …
Result: if you start losing collisions, you will
end up losing all the remaining ones
At the 16th retry, the frame will be dropped
ONLY AT THIS POINT station will restart with no backoff
Again fair competition
Consequence: extremely high access delay variance
Packet Starvation Effect
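The probabilities in this example can be recomputed exactly. The sketch below assumes the simplified model of the slides (timing details neglected): the losing station is at retry n and draws uniformly in 0..2^n - 1, the capturing station is always at its first retry and draws in 0..1, and the smaller draw wins. The function name `win_probabilities` is hypothetical.

```python
from fractions import Fraction


def win_probabilities(loser_retry: int, winner_retry: int = 1):
    """Return (P(losing station wins), P(capturing station wins)).

    A tie in the drawn backoff values means another collision, so the
    two probabilities do not add up to 1.
    """
    loser_window = 2 ** min(10, loser_retry)
    winner_window = 2 ** min(10, winner_retry)
    p = Fraction(1, loser_window * winner_window)   # each pair is equally likely
    pairs = [(a, b) for a in range(winner_window) for b in range(loser_window)]
    p_loser = sum((p for a, b in pairs if b < a), Fraction(0))
    p_winner = sum((p for a, b in pairs if a < b), Fraction(0))
    return p_loser, p_winner


if __name__ == "__main__":
    for retry in range(1, 5):
        loser, winner = win_probabilities(retry)
        print(f"try {retry}: {loser} vs {winner}")
```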
Numerical results
From Whetten et al, http://www.ethermanage.com/ethernet/papers.html
Solution to channel capture:
Some solutions proposed (e.g. BLAM – Binary Logarithmic Arbitration Method)
Download: http://www.ethermanage.com/ethernet/papers.html
Not standardized, despite proposals
Adds complexity
Backward compatibility
Concerns: is capture a practical issue at all? (e.g. in normal load)
Lecture 1.2:
Repeaters, Hubs, Switches
Repeater
Physical layer device
Provides the “3-R” functions:
Re-Shaping
Restores the proper signal waveform
Re-Timing
Restores the proper impulse duration
Re-Transmitting
Retransmits collisions, too
Actually, regenerates (extending them to 96 bits) 010101… jam
sequences
Automatic “partitioning”
Protect network from faulty segments
If 30+ consecutive frame tx failures detected, disconnect the link
Multiport Repeaters (Hubs)
Slang name: HUBS
Essential for BASE-T and BASE-F
Star / tree topology
But logically acts as a bus!
No loops allowed (rings)
Otherwise signal would travel forever!
Collision domain
Maximum propagation distance between
end nodes
Star == Bus
Repeaters and preamble
Part of preamble needed for synchronization
Ethernet repeater:
“consumed” part of preamble is NOT regenerated
limit on number of repeaters crossed
[Figure: two frames crossing a repeater R]
802.3 repeater:
Preamble fully restored
But this adds extra delay (up to 16 bits per repeater)
Moreover, since synchronization delay is NOT constant (a second frame might
synchronize faster than a first one), the IFS can shrink
IFS illegal below 47 bits
5-4-3 rule
Ethernet/IEEE 802.3 rule on the number of
Segments
Repeaters
Populated (user) segments vs unpopulated (link) segments
Link segments used to connect 2 repeaters
Rule: between any two nodes on the
network, there can only be
a maximum of five segments
connected through four repeaters
only three of the five segments may contain user
connections.
Stackable Repeaters
Special connector
(approx up to 30 cm)
Stacked Repeaters
act as a SINGLE repeater
device!
Modular hubs (chassis)
Expand by adding more boards in slots
available
Minor issue: must buy from same vendor
Major issue: power failure implies failure for all ports (many!!)
Photos
8, 16, 24 10/100 ports (stackable) Hubs
Modular chassis hub
42 10 Base-T RJ45 Port
2 Fiber Ports.
Bridges & Switches
Bridge vs Switch
Functional differences:
None!
Switch = Bridge
Marketing issues
Bridge: traditional name; may give the flavour of:
Very low number of ports (typically 2)
Goal: interconnect LANs
Switch: more appealing name; gives the flavour of
Many ports (goal: to “switch” between end-user links)
May support many more functions than “just” bridging
Implementation issues
Bridge:
store & Forward operation
Software implementation
Switch:
may use cut-through operation (faster)
Hardware switching operation implementation
Difference: basically a marketing/implementation issue
For us: BRIDGE == (Layer 2) SWITCH
Bridging in the 80’s
goal: limit collision domain
[Figure: LAN 1 and LAN 2, with stations A to E, interconnected by a Bridge]
Bridge: terminates a collision domain!
LANs: not necessarily Ethernet
“Transparent” bridging
Bridging interfaces are NOT directly addressed at MAC level
they are intermediary
Bridging and network extension
[Figure: a BRIDGE interconnects Segment 1 (HUB 1, 10 Mbit/s shared), Segment 2 (HUB 2, 10 Mbit/s shared) and a 10 Mbit/s dedicated link; stations S1 to S7]
Protocol stack
[Figure: each end station runs layers 3-7, LLC, MAC and PHY; the bridge implements a MAC Relay between MAC1/PHY1 on LAN 1 and MAC2/PHY2 on LAN 2]
Operate at OSI layer 2 (datalink)
Higher layers unaware
May interconnect LANs with different PHY and MAC
Bridging specified in 802.1D
Bridging not specific for 802.3 (common for all 802)
LLC: 802.2 Logical Link Control (ISO 8802.2)
802.1D Bridging
MAC:
802.3 (ISO 8802.3): CSMA/CD
802.5 (ISO 8802.5): TOKEN RING
FDDI (ISO 9314): FDDI
802.11 (ISO 8802.11): Wireless
802.12 (ISO 8802.12): AnyLAN
Bridge in the 90’s and 00’s
Collapsed Backbone
Backbone collapsed into center
device
star/tree topology
Versus shared bus
Suitable for structured cabling
Two links per port
2 x twisted pair (or fiber): transmit; receive
HUB
Multi-port repeater
Shared bandwidth
SWITCH
Per-pair dedicated bandwidth
Broadcast domain vs collision domain
Without switching (HUB):
the LAN is one Collision Domain and one Broadcast Domain
With switching (Switch):
one Collision Domain per switch port
still a single Broadcast Domain
Micro-segmentation
Bridge segments network
into distinct parts
Low number
Switched LAN
Many more segments
Limit: one segment per user
The most frequent case!
Incoming frame switched to
appropriate output line
Unused lines can switch other
traffic
More than one station can
transmit at a time
Multiply capacity of LAN
Switch technical features
Autonegotiation
802.3-2002-part2, clause 28
Formerly 802.3u, drafted in 1994
Original specification for 10/100 Mbps
More recently extended for 1000 Mbps
What is autonegotiation
Mechanism run independently at each link end
1. Detect various modes that exist in the device on the other end of the wire
2. Advertise to the other end device its own abilities
3. Goal: automatically configure the highest performance mode of interoperation.
o Speed
o Line Coding
o Half/Full duplexing
o Extras
NLP and FLP bursts
Normal Link Pulses (NLP)
10Base-T idea;
In the absence of data, periodically transmit link integrity pulses to
run-time determine if the link is operational
1 NLP every 16 +/- 8 ms
Fast Link Pulses (FLP)
Instead of a single NLP pulse, transmit a 16-bit codeword
Coded with pulse position
Carrying the information about the device capabilities
Once negotiation completed, get back to NLP
transmission
[Figure: NLP pulses and FLP bursts, both repeated every 16 +/- 8 ms]
FLP wording
[Figure: FLP burst carrying the bits 1 1 0 1 0 …]
17 clock pulses; in the 16 intermediate spaces:
- pulse = bit 1
- no pulse = bit 0
FLP coding
Selector field (5 bits)
00001 for 802.3
Other sequences for other standards
Technology ability field (8 bits)
Specify capabilities (e.g. 10Base-T, 100Base-T4, 100BaseTx+fullduplex, etc)
RF = Remote Fault (1 bit)
Signals that a fault occurred on the other side
Fault can be specified in “Next Page”
Ack (1 bit)
Notifies that a device has successfully received the FLP
NP = Next Page
Notifies that a device is “next page” capable, i.e. it wishes to exchange additional
data
Each following page transmitted until explicitly ack-ed
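A sketch of how the 16-bit codeword could be decoded, assuming bit 0 is the first bit transmitted and the fields follow in the order listed above (5 + 8 + 1 + 1 + 1 bits). The mapping of individual technology bits to names mirrors the usual advertisement-register layout and is an assumption, as is the helper name `decode_base_page`.

```python
# Assumed meaning of the first technology-ability bits (not from the slide).
ABILITY_NAMES = [
    "10BASE-T",
    "10BASE-T full duplex",
    "100BASE-TX",
    "100BASE-TX full duplex",
    "100BASE-T4",
]


def decode_base_page(word: int) -> dict:
    """Split a 16-bit FLP codeword into its fields."""
    ability = (word >> 5) & 0xFF
    return {
        "selector": word & 0x1F,            # 00001 = 802.3
        "abilities": [name for bit, name in enumerate(ABILITY_NAMES)
                      if ability & (1 << bit)],
        "remote_fault": bool(word & (1 << 13)),
        "ack": bool(word & (1 << 14)),
        "next_page": bool(word & (1 << 15)),
    }


if __name__ == "__main__":
    # Selector 1 with the four 10/100 half/full duplex abilities set.
    print(decode_base_page(0x01E1))
```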
Negotiation process
If both link partners capable of autonegotiation:
Select the best technology among those available
First 100Base-T full duplex
…
Last 10BaseT half duplex
If only one link partner capable of
auto-negotiation:
Adapt to available technology on the other side
This process is called “parallel detection”
Switch advantages: full-duplexing
(optional feature)
Bus, hubs: shared medium
Require only one station to transmit at a time
Half-duplex
CSMA/CD operation
Switch: dedicated connection
A connection is dedicated
between two switching ports
Between PC and switch port
Point-to-point transmission media
Obvious extension: move to full duplexing (802.3x)!
Transmit and receive on two separate links
Which can operate IN PARALLEL!!
Double the link capacity
No more need for CSMA/CD
No collision possible, as no more stations to collide with…!
No more limits on maximum segment length (just technical limits)
Hub disadvantage (solved by switch)
must downgrade to lowest supported rate
in full analogy with bus situation!
[Figure: on a bus and on a HUB, 100 Mbps stations together with a 10 Mbps station: obviously impossible!]
Possible with 10/100 switches
Bridge/Switch operation
(following discussion limited to Ethernet Bridges/Switches)
Bridge/Switch operation
[Frame layout: Preamble + SFD | DEST | SRC | LEN or type | Data | PAD | FCS; the lookup uses DEST]
Store & Forward:
read frame (memorize into onboard buffer)
Check CRC
Discard frame if
» CRC fails
» too short (<64 bytes, “runt”)
» too long (>1518 bytes, “giant”)
Look up destination into forwarding (switching)
table
Forward packet to outgoing port
Cut-through
Just read first few bytes (until destination
address)
Don’t do any check
Look up forwarding table and select destination
forward frame while receiving it
Store & forward vs cut-through
latency
1518 bytes frame
Assume full 8 bytes preamble received
S&F @ 10 Mbps ≥ 1526*8/10 µs ≈ 1221 µs
C-T @ 10 Mbps ≥ 14*8/10 µs = 11.2 µs
S&F @ 100 Mbps ≥ 122 µs
C-T @ 100 Mbps ≥ 1.1 µs
Not a real problem at high rate
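The latency figures can be recomputed with a short sketch. The function names are hypothetical; the model is the one used above: store & forward waits for the 8-byte preamble plus the whole frame, cut-through only for the preamble plus the 6-byte destination address.

```python
PREAMBLE_BYTES = 8
DEST_ADDRESS_BYTES = 6


def store_and_forward_us(frame_bytes: int, rate_mbps: float) -> float:
    """Minimum latency: the whole frame must be received first."""
    return (PREAMBLE_BYTES + frame_bytes) * 8 / rate_mbps


def cut_through_us(rate_mbps: float) -> float:
    """Minimum latency: forwarding starts once the destination is read."""
    return (PREAMBLE_BYTES + DEST_ADDRESS_BYTES) * 8 / rate_mbps


if __name__ == "__main__":
    for rate in (10, 100, 1000):
        print(f"{rate} Mbps: S&F {store_and_forward_us(1518, rate):.1f} µs, "
              f"C-T {cut_through_us(rate):.2f} µs")
```

The absolute gap between the two modes shrinks in proportion to the link rate, which is why the difference matters less on fast links.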
S&F vs C-T: adaptive feature
(typically configurable – example: Intel “Express” switch)
[Figure: adaptive mode threshold settings (Max=1000; 0.6%, 0.4%, 5%, 10%)]
S&F vs CT: Fragment-free mode
Compromise between cut-through and store-and-forward
Reads first 64 bytes
includes the frame (+LLC) header
Then starts sending the packet
before the entire data field is read and the FCS is checked.
Advantages:
Verify reliability of header information (addresses, frame type, LLC header
information)
Detects & discards runts & collisions
[Figure: frame layout (Preamble + SFD | DEST | SRC | LEN or type | Data | PAD | FCS) marking where forwarding starts: Cut-through after DEST, Fragment-free after the first 64 bytes, Store & Forward after FCS]
Further issues with C-T
Cut-through possible only if source
and destination ports have same bit
rate
Symmetric switching.
Different rates: buffering necessary
S&F only
Asymmetric switching
Asymmetric switching typical in
client/server environments
More bandwidth dedicated to the server port to
prevent a bottleneck
Forwarding database
Mapping between MAC addresses and ports
Ports: module/port-#
Static entries:
Configured by sysadmin
Permanent database
Dynamic entries:
“Learned”
Expire after ageing process reaches upper value
E.g. 300 seconds, configurable

Dest MAC Address     Ports   Age
-----------------    -----   ---
00-00-08-11-aa-01    1/1     1
00-b0-8d-13-1a-f1    1/7     4
a8-11-06-00-0b-b4    2/3     0
08-01-00-00-a7-64    2/4     1
00-ff-08-10-44-01    2/6     5
A note on technical
implementation - CAM
A forwarding Database is typically realized in
hardware for maximum speed/scalability
Technology of choice: Content Addressable Memories
(CAMs)
Used also in current high-range routers for very fast & scalable address
lookup
Software-based lookup (search): O(log n);
Hardware-based CAM lookup: O(1)
Massively parallel comparison circuitry added to every cell of the
hardware memory
Search result in just 1 memory cycle!!
For details refer to:
http://www.eecg.toronto.edu/~pagiamt/cam/references.html
Address Learning /1
Frame arrives at port X
Hence it has come from LAN attached to port X
SRC address used to update forwarding DB: SRC MAC → Port

Station   MAC address          Port
STA 1     00-11-22-33-44-01    P1
STA 2     08-55-66-77-88-02    P1
STA 3     08-aa-bb-cc-dd-03    P2
STA 4     08-01-02-f1-f2-04    P3
Address Learning /2
Incoming frame whose SRCaddr not in forwarding DB:
Create new entry
Ageing-time=0
Incoming frame whose SRCaddr already in forwarding DB:
Refresh ageing-time
Ageing-time=0

MAC address          Port   Age
00-11-22-33-44-01    P1     5     (STA 1)
08-55-66-77-88-02    P1     7     (STA 2)
08-aa-bb-cc-dd-03    P2     0     (STA 3, refreshed)
08-01-02-f1-f2-04    P3     6     (STA 4)
08-00-0f-cc-cc-a2    P1     0     (new entry)
Address Learning /3
Incoming frame whose SRCaddr already in forwarding DB but associated to a different port:
Update associated port
Refresh ageing time

MAC address          Port   Age
00-11-22-33-44-01    P1     5     (STA 1)
08-55-66-77-88-02    P1     7     (STA 2)
08-aa-bb-cc-dd-03    P2     4     (STA 3)
08-01-02-f1-f2-04    P3     6     (STA 4)
08-00-0f-cc-cc-a2    P1     2     (old entry, replaced)
08-00-0f-cc-cc-a2    P2     0     (updated entry)
Frame forwarding
Very first operation performed by the bridge/switch upon
frame reception
Before learning
[Frame layout: Preamble + SFD | DEST | SRC | LEN or type | Data | PAD | FCS; frame enters on Port X and leaves on Port Y]
1. Frame OK?
CRC check
Only for Store & Forward
2. Incoming port enabled (in forwarding state)?
Switch port may be disabled
e.g. to isolate malfunctioning stations/LANs
3. If DEST is NOT in forwarding DB
broadcast frame (flooding)
forward frame to all ports EXCEPT incoming one
4. If DEST is in forwarding DB
Check whether DEST port = incoming port
If YES, discard packet (dest on same LAN as src)
If NO, forward packet to output port
» Unless output port blocked
Flooding occurs also for broadcast frames (obvious) and for multicast frames (unless more sophisticated policies are set)
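Forwarding, filtering, learning and ageing fit in a small simulation. This is a sketch, not a description of any real switch: the class `LearningBridge`, its method names and the one-second ageing tick are illustrative assumptions, and CRC checks, port states and multicast policies are left out.

```python
class LearningBridge:
    """Toy transparent bridge: forward first, then learn the source."""

    def __init__(self, ports, max_age=300):
        self.ports = list(ports)
        self.max_age = max_age      # seconds before a learned entry expires
        self.table = {}             # MAC address -> [port, age]

    def receive(self, src, dst, in_port):
        """Return the list of ports the frame is sent out on."""
        entry = self.table.get(dst)
        if entry is None:
            # Unknown destination: flood on all ports except the incoming one.
            out_ports = [p for p in self.ports if p != in_port]
        elif entry[0] == in_port:
            # Destination on the same LAN as the source: filtering.
            out_ports = []
        else:
            out_ports = [entry[0]]
        # Learning: create, refresh or move the entry for the source.
        self.table[src] = [in_port, 0]
        return out_ports

    def tick(self, seconds=1):
        """Advance ageing timers and drop expired entries."""
        for mac in list(self.table):
            self.table[mac][1] += seconds
            if self.table[mac][1] >= self.max_age:
                del self.table[mac]


if __name__ == "__main__":
    bridge = LearningBridge(["P1", "P2", "P3"])
    print(bridge.receive("00-11-22-33-44-01", "00-aa-bb-cc-dd-02", "P1"))  # flooding
    print(bridge.receive("00-aa-bb-cc-dd-02", "00-11-22-33-44-01", "P3"))  # one port
    print(bridge.table)
```

Broadcast frames are flooded without special handling in this sketch, because the broadcast address never appears as a source and so is never learned.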
Example / 1
start-up
[Figure: bridge with ports P1, P2, P3]
Initial state: forwarding DB = empty
Example / 2
STA 1 → STA 2
[Figure: bridge with ports P1, P2, P3; STA 1 (00-11-22-33-44-01) attached to P1]

MAC address          Port   Age
00-11-22-33-44-01    P1     0
STA 1 transmits frame to STA 2
Flooding occurs (STA2 not registered in DB)
Bridge learns STA1=P1
Example / 3
STA 2 → STA 1
[Figure: STA 1 (00-11-22-33-44-01) attached to P1, STA 2 (00-aa-bb-cc-dd-02) attached to P3]

MAC address          Port   Age
00-11-22-33-44-01    P1     2
00-aa-bb-cc-dd-02    P3     0

STA 2 may respond
depends on involved protocol/app rules (e.g. TCP handshake)
transmits frame to STA 1
Destination selected
Bridge learns STA2=P3
Example / 4
STA 3 → STA 1
[Figure: bridge with ports P1, P2, P3; stations STA 1 (00-11-22-33-44-01), STA 2 (00-aa-bb-cc-dd-02), STA 3 (08-80-f0-00-ff-03)]

MAC address          Port   Age
00-11-22-33-44-01    P1     12
00-aa-bb-cc-dd-02    P3     10
08-80-f0-00-ff-03    P1     0
STA 3 on LAN 1 transmits to STA 1
Frame arrives to STA1 on LAN 1
But arrives also to Bridge
Bridge discards frame (STA1 on same port of incoming
frame)
This operation is referred to as FILTERING FUNCTION
Bridge learns STA3=P1
Example / 5
STA 1 moves; STA 1 → STA 3
[Figure: bridge with ports P1, P2, P3; stations STA 1 (00-11-22-33-44-01), STA 2 (00-aa-bb-cc-dd-02), STA 3 (08-80-f0-00-ff-03)]

MAC address          Port   Age
00-11-22-33-44-01    P1     13    (previous entry, deleted)
00-aa-bb-cc-dd-02    P3     11
08-80-f0-00-ff-03    P1     1
00-11-22-33-44-01    P2     0     (new entry)
STA 1 moves on LAN 2
Then transmits to STA 3
Frame arrives to Bridge on P2, and forwarded to P1
According to forwarding DB information
Bridge learns that STA 1 moved
Deletes previous entry with P1
Adds new entry with P2
Example / 6
STA 2 moves; STA 1 → STA 2 ???
[Figure: bridge with ports P1, P2, P3; stations STA 1 (00-11-22-33-44-01), STA 2 (00-aa-bb-cc-dd-02), STA 3 (08-80-f0-00-ff-03)]

MAC address          Port   Age
00-aa-bb-cc-dd-02    P3     13
08-80-f0-00-ff-03    P1     3
00-11-22-33-44-01    P2     2

STA 2 moves to LAN 1
STA 1 transmits frame to STA 2
Frame forwarded on old port P3!!
Bridge will learn only when STA2 transmits its first frame
OR when the ageing time expires
and the STA2 → P3 entry is removed from the forwarding DB
Why should a station move?
Fault-tolerant architectures!
[Figure: server attached to the switch through Link 1 and Link 2, on ports P1 and P2]
As link 1 fails, server switches to link 2 → server MOVES from original port P2 to new port P1 !!
(need to reduce ageing time, but a trade-off is required: too short an ageing time puts too much burden on the switch)
(effective solution: i) periodically send “advertisement” frames ii) send a frame after switching to link 2)