Lecture 1.0: Introduction to Ethernet
Giuseppe Bianchi

Ethernet Ancestors
- Late 1960s: ALOHA network
  - Norman Abramson, University of Hawaii
  - Application: radio network among islands
  - Distributed, uncoordinated network!
  - First random access mechanism (Pure ALOHA / Slotted ALOHA)

Birth of Ethernet
- May 22, 1973: Ethernet memo
  - Bob Metcalfe (Xerox Palo Alto Research Center)
  - Carrier Sense Multiple Access with Collision Detection and exponential backoff
  - 3 Mbps speed
- June 1976: presentation at the National Computer Conference
  [Figure: original Metcalfe drawing]
- End 1977: US Patent 4,063,220 "Multipoint Data Communication System with Collision Detection"
- 1978: US Patent for Ethernet repeater

Ethernet Standardization
- 1979: Metcalfe start-up: 3Com
- 1980: DIX Ethernet Standard
  - DIX = Digital-Intel-Xerox vendor consortium
  - Interoperable products from the three founding companies
- 1982: Xerox relinquishes the "Ethernet" trademark
- 1985: IEEE 802.3
  - Ethernet becomes an IEEE 802 standard
  - 10 Mbps (10BASE5 thick coaxial, RG213)
  - 802.3 supplement a (1985): 10BASE2 thin coax (RG58)
  - Minor modifications vs DIX standard
  - Path towards worldwide interoperability
- Ethernet standard: the world's FIRST open, multi-vendor standard!
- Quoting Metcalfe: "the invention of Ethernet as an open, non-proprietary, industry-standard local network was perhaps even more significant than the invention of Ethernet technology itself"

Connection to coaxial cable (historical)
[Figure: attachment to thick and thin cable; labels: transceiver, transceiver cable, 15-pin AUI connector, controller, wall]
- Thick cable: needs external TAP (transceiver), reached through a transceiver cable and the 15-pin AUI connector
- Thin cable: controller may support internal TAP (on-board transceiver)

A note on Ethernet terminology
- Name structure: <speed> <signal method> <medium>
- Examples: 100BASE-T, 1000BASE-LX, …
- Speed: 10, 100, 1000, 10G
- Signal method: BASE (baseband), BROAD (broadband)
  - BROAD = RF modulated on coax
  - Only one case: 10BROAD36, now obsolete
- Medium
  - Old notation: 2, 5 = 200/500 m (thin/thick coax)
  - More recent notation: T, TX, T4, T2, FX, X, CX, SX, LX
  - Depends on the specific twisted pair category and fibre category
  - Different labels (e.g. T, TX, T4, T2) account for different encoding details

Ethernet and OSI
[Figure]

Ethernet and PHY (selected + simplified)
[Figure]

IEEE 802 project
- LAN / MAN Standards Committee (LMSC)
- Unified interface with the network layer
- Layering:
  NETWORK LAYER
  DATA LINK LAYER
    LLC: 802.2 Logical Link Control
    802.1 Bridging
    MAC: 802.3 CSMA/CD | 802.5 Token Ring | 802.11 WLAN | 802.15 WPAN
  PHYSICAL LAYER

IEEE 802 standards
ACTIVE WORKING & TECHNICAL ADVISORY GROUPS
- 802.1 High Level Interface (HILI)
- 802.3 CSMA/CD
- 802.11 Wireless LAN (WLAN)
- 802.15 Wireless Personal Area Network (WPAN)
- 802.16 Broadband Wireless Access (BBWA)
- 802.17 Resilient Packet Ring (RPR)
- 802.18 Radio Regulatory Technical Advisory Group
- 802.19 Coexistence Technical Advisory Group
- 802.20 Mobile Wireless Access
- 802.21 Media Independent Handover
HIBERNATING WORKING GROUPS (standards published, but inactive)
- 802.2 Logical Link Control (LLC)
- 802.5 Token Ring
- 802.12 Demand Priority
DISBANDED WORKING GROUPS (all standards withdrawn, or no standard published)
- 802.4 Token Bus
- 802.6 Metropolitan Area Network (MAN)
- 802.7 BroadBand Technical Adv. Group (BBTAG)
- 802.8 Fiber Optics Technical Adv. Group (FOTAG)
- 802.9 Integrated Services LAN (ISLAN) Working Group
- 802.10 Standard for Interoperable LAN Security (SILS) Working Group
- 802.14 Cable-TV Based Broadband Communication Network Working Group

Traditional Ethernet topology: bus
- Multiple access
- Shared transmission medium: thick / thin coaxial cable
[Figure: stations A-F attached to a single bus; a second arrangement of the same stations is marked "wrong!"]

Twisted Pair revolution
- 1990: 802.3i 10BASE-T twisted pair
  - Invented by SynOptics Communications
  - Reuses structured cabling system standards
  - Overcomes management and installation problems of coaxial cabling
  - Ethernet market takes off!!
- Alternatives
  - UTP (Unshielded)
  - FTP (Foiled): one shield for the whole cable
  - STP (Shielded): one shield per pair

Twisted Pair: star topology (no TAP allowed)
- Initially: HUB
  - Broadcasts the signal on all links
  - Logically behaves as a bus
  - Only one tx at a time
- Then: SWITCH
  - Repeats the signal only on the specifically addressed link
  - Bridging function
  - Many tx-rx pairs at a time
  - More bandwidth!
- Same topological issues for fiber optic links…
[Figure: stations A-F attached to hubs, then to switches]

Fiber Optics enters into play
- 1987: FOIRL (802.3d)
  - Fiber Optic Inter-Repeater Link
  - Point-to-point segment to link remote Ethernet segments (via repeaters)
  - No direct PC-Ethernet connection until 10BASE-F
- 1993: 10BASE-F (802.3j), three specifications
  - 10BASE-FB for active fiber hubs: scarce success
  - 10BASE-FP for passive fiber hubs: never built!
  - 10BASE-FL extends the FOIRL specification: the only one deployed

Ethernet for higher speeds
Many people thought Ethernet could not go faster than 10 Mbps… instead:
- 1995: 100BASE-T Fast Ethernet (802.3u)
  - 100 Mbps on twisted pair, as well as on other media
  - Auto-negotiation capabilities
  - 10/100 products
- 1997: full duplex standard (802.3x)
  - Simultaneously transmit and receive (2x capacity increase)
- 1998: 1000BASE-X Gigabit Ethernet (802.3z)
  - Over fiber and short copper cable
- 1999: 1000BASE-T Gigabit Ethernet (802.3ab)
  - Over twisted pair
  - 10/100/1000 auto-negotiation
- 2002 (July): 10 Gigabit Ethernet (802.3ae)

Ethernet Evolution at a glance
[Figure]

Lecture 1.1: Ethernet Basics

Ethernet/802.3 frame
  Field                  Size
  ---------------------  -------------
  Preamble               8 bytes
  Destination address    6 bytes
  Source address         6 bytes
  Length or type         2 bytes
  LLC/DATA               46-1500 bytes
  FCS                    4 bytes
- Ethernet/802.3 header (destination + source + length/type): 14 bytes
- Frame size, preamble excluded: 64-1518 bytes

Main differences
                 ETHERNET                 802.3
  Frame type     type                     length or type
  Payload        data (≥46 but no PAD)    LLC + data (explicit PAD)

Preamble
- 8 bytes
  - 7 bytes preamble: 7 x (10101010)
  - Last byte: SFD (Start Frame Delimiter) (10101011)
  - SFD bit sequence: 1 0 1 0 1 0 1 1
- Devised for 10 Mbps systems
  - For synchronization
  - Manchester encoding (10 Mbps)
- Not useful for 100/1000 systems
  - Maintained for backward compatibility

Frame Check Sequence
- FCS: 4 bytes = 32-bit CRC
- Order: [x31 … x0]
- Calculated on the frame only (not on the preamble, of course!)

48 bit addresses
- Typically referred to as interface address, hardware address, or MAC address
- First bit:
  - 0 = physical address of an interface (unicast address)
  - 1 = group address
- Second bit:
  - 0 = globally administered address (assigned by the manufacturer)
  - 1 = locally administered address (can be configured through the driver)
- First 24 bits: OUI (Organizationally Unique Identifier), unique for each vendor
- Typically written in hex, e.g.: F0-11-00-4F-A2-1C
- Each byte transmitted from LSB to MSB:
  0000.1111.1000.1000.0000.0000.1111.0010.0100.0101.0011.1000
- Multicast addresses: start with 1 (first octet LSB!)
- Why destination first? A station whose address does not match the destination may ignore the rest of the frame!

Examples
- Individual (unicast), globally administered: xy-xx-xx-xx-xx-xx, with y a multiple of 4 (the two low-order bits of the first octet are both 0)
- 802.3 & 802.4: transmitted from LSB to MSB
- 802.5 & FDDI: transmitted from MSB to LSB

Length/Type
- 2 bytes
- In original Ethernet: frame type
  - Used for demultiplexing the upper layer protocol
  - E.g.: 0x0800 = IP
- In 802.3: length OR type
  - If > 1500 (more precisely, ≥ 0x0600 = 1536): frame type
  - Else: LLC payload size (≤ 1500)
    - Demultiplexing provided by LLC
    - If < 46, the remaining octets are PAD (padding)
- Ethernet and 802.3 frames may (do!) coexist on the same network. They are recognized via the length / frame type field.

LLC header
  Field                    Size
  -----------------------  ----------------------------------
  DSAP                     1 byte
  SSAP                     1 byte
  ctrl                     1 byte
  Protocol OUI (SNAP)      3 bytes
  Protocol (SNAP)          2 bytes
  Data                     38-1492 bytes (the 46-1500 byte LLC/DATA field minus the 8-byte LLC/SNAP header)
- SNAP = SubNetwork Access Protocol
- DSAP = SSAP (typically)
- Control: depends on service type. Typically: service type = connectionless unreliable, ctrl = 0x03 (unnumbered information)
- Demultiplexing:
  - DSAP, SSAP used only for ISO-OSI standards
  - Other protocols (including IP!) require SNAP addresses

LLC Header for an IP packet
  DSAP   SSAP   ctrl   Protocol OUI   Protocol   Data
  0xAA   0xAA   0x03   0x000000       0x0800     ………
- DSAP = SSAP: 0xAA (use SNAP extension)
- Control: 0x03 (unnumbered information)
- Protocol OUI (Organizationally Unique Identifier): 0x000000 (Internet / IETF protocols)
  - Non-zero values for Novell, IBM, Digital, Apple, etc. protocols
- Protocol: 0x0800 (IP)

Medium Access Control Protocol
Carrier Sense Multiple Access with Collision Detection (CSMA/CD)

Role of MAC (CSMA/CD)
Three functions:
- Transmit/receive data frames
- Decode data frames and check them for valid addresses before passing them to the upper layers of the OSI model
- Detect errors within data frames or on the network

Carrier Sense Multiple Access
1. Listen before talking
   [Figure, timeline: 1. station ready to transmit a frame; 2. listen for at least an Inter Frame Spacing (channel must be idle meanwhile); 3. transmit frame]
   - Ethernet notation = Inter Packet Gap (IPG)
   - 802.3 notation = Inter Frame Spacing (IFS)
   - Minimum: 96 bits (@ 10 Mbps = 9.6 µs)
2. If channel detected busy: defer
   [Figure, timeline: 1. frame ready; 2. listen; 3. busy detected; 4. defer; 5. listen for ≥ IFS]
   - (similar defer situation if the channel is immediately busy)

Collision Detection
3. Listen while talking
If a collision is detected:
- Continue to transmit another 32 bits of signal (Collision Enforcement Jam Signal)
  - If detected during the preamble, finish transmitting the preamble AND then the 32 jam bits
- End transmission
- Generate a backoff interval, after which retry transmission
  - Backoff: r x 51.2 µs, 0 ≤ r < 2^k, k = min(10, n), n = retry #
  - Slot-time = 64 bytes = 512 bits (@ 10 Mbps = 51.2 µs)
- Abort tx after 16 unsuccessful retries

Summary of operation
[Figure; source: Cisco CCNA]

Collision detection in practice
- Media dependent
- On fiber or twisted pair:
  - Point-to-point links
  - Collision detected by the simultaneous occurrence of activity on both transmit and receive paths
- On coax:
  - Monitor the average DC voltage
  - When more than one station transmits, the voltage exceeds a given threshold

Why do collisions occur?
[Figure, space-time diagram: two stations at distance d (m), propagation delay d/200 µs; each starts tx after IFS and later detects the collision]
- A collision occurs if stations access the channel at instants that differ by less than their propagation delay
- Speed of EM signal in cable: ~200,000 km/s = 200 m/µs

Network diameter
- Stations placed at opposite network edges
- Essential condition: a station must be able to detect a collision
  - Otherwise lots of problems: the station would think the frame was successfully delivered…
- Shortest possible frame: 6+6+2+46+4 = 64 bytes = 512 bits (excl. preamble)
- RTT = 2 x propagation delay must be lower than 512 bit times
- Condition on network diameter: a collision MUST be detected within the shortest possible frame
- Bound on maximum RTT @ 10 Mbps: 512/10 [µs] = 2d [m] / 200 [m/µs], hence d = 5120 m

Backoff slot-time
- Set to 512 bits
  - As the minimum frame size
  - As the maximum RTT: guarantees that a station that started transmitting in a previous backoff slot will ALWAYS be detected
- A station that has been transmitting for 512 bits has surely acquired the channel
- No "late collisions" possible
  - Unless misconfigurations occur…

How does backoff work?
[Figure, two timelines after a collision (jam, then IFS)]
- Extracts 0 in (0,1): immediately reschedules tx
- Extracts 1 in (0,1): waits for one 51.2 µs slot-time

More on network diameter
- Safe condition to allow collision detection
  - Add 32 bits of jam time
  - Extra time for processing
  - From the standard: maximum RTT = 46.38 µs
- Physical media have a maximum length
  - Fiber: 2000 m
  - Coax: 500 m
  - Thin coax: 185 m
  - Drop cable (transceiver cable): 50 m
- Fiber-to-coax repeaters introduce delay
- Etc.
- RESULT: 2800 m max diameter
  - 3 x 500 m coax + 1000 m total fiber + 6 drop cables (6 x 50 m)

Giga Ethernet - Carrier Extension
- Minimum frame size set equal to the larger slot time of Gigabit Ethernet: 512 bytes = 4096 bits
- Extension achieved with external padding
  - Frame structure left unchanged for backward compatibility

Giga Ethernet - Frame bursting
- Optional feature: Burst Mode
  - Transmit a series of frames without relinquishing control of the transmission medium
  - Achieves collision-free transmission for the frames following the first one
- The transmitting station fills the interframe spacing interval with extension bits, which
  - are readily distinguished from data bits at the receiving stations
  - maintain the detection of carrier in the receiving stations (the medium is never allowed to become idle between frames)
- Necessary condition for bursting: the first frame has been successfully transmitted
- Upper bound: burstLimit = 65536 bits

Channel Capture Effect / 1
- For simplicity: we are neglecting detailed timing issues
- Assumption: stations with "many" frames in the tx buffer
[Figure, timelines of stations A and B: both collide on their first frame P1 and draw a backoff B in (0,1); A draws B=0 and transmits P1 after jam and IFS, B draws B=1; A's new frame P2 starts with no backoff and collides with B's retry of P1; A then draws B in (0,1), B draws B in (0,3)]
- After the second collision, station B is at its SECOND retry (due to its backoff choice) and competes with station A at its FIRST retry
- Psucc(A) = ½ * ¾ + ½ * ½ = 5/8: A gets an unfair advantage!
- Psucc(B) = ¼ * ½ = 1/8

Channel Capture Effect / 2
- Unlucky stations get more and more unlucky!!
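The success probabilities in the capture example above can be checked by brute-force enumeration of the backoff draws. The sketch below is only an illustration under the same simplification the example makes (detailed timing neglected): it assumes the station with the smaller draw always transmits first and wins, and that equal draws collide again. The function names `backoff_window` and `win_probabilities` are made up for this sketch.

```python
from fractions import Fraction
from itertools import product


def backoff_window(retry: int) -> int:
    """Number of possible backoff draws at a given retry: 2^k, k = min(10, retry)."""
    return 2 ** min(10, retry)


def win_probabilities(retry_a: int, retry_b: int):
    """Exact (P(A wins), P(B wins), P(collide again)) for two contending stations.

    Assumption of this sketch: each station draws r uniformly in [0, 2^k);
    the smaller draw transmits first and wins, equal draws collide again.
    """
    window_a = backoff_window(retry_a)
    window_b = backoff_window(retry_b)
    p_pair = Fraction(1, window_a * window_b)  # every (ra, rb) pair is equally likely
    a_wins = b_wins = tie = Fraction(0)
    for ra, rb in product(range(window_a), range(window_b)):
        if ra < rb:
            a_wins += p_pair
        elif rb < ra:
            b_wins += p_pair
        else:
            tie += p_pair
    return a_wins, b_wins, tie


if __name__ == "__main__":
    # Station A always restarts at retry 1; station B keeps accumulating retries.
    for retry_b in range(1, 5):
        a, b, tie = win_probabilities(1, retry_b)
        print(f"B at retry {retry_b}: P(A wins)={a}  P(B wins)={b}  P(collision)={tie}")
```

With A at retry 1 and B at retry 2 the enumeration gives 5/8 and 1/8, the two values derived by hand in the example.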
- Following the previous example (losing station vs winning station):
  - P(win at first try) = 1/4 vs 1/4
  - P(win at second try) = 1/8 vs 5/8
  - P(win at third try) = 1/16 vs 13/16
  - P(win at fourth try) = 1/32 vs 29/32
  - …
- Result: once you start losing collisions, you become more and more likely to lose all the remaining ones
  - At the 16th retry, the frame will be dropped
  - ONLY AT THIS POINT will the station restart with no backoff
  - Again fair competition
- Consequence: extremely high access delay variance
  - Packet Starvation Effect

Numerical results
[Figure] From Whetten et al., http://www.ethermanage.com/ethernet/papers.html

Solution to channel capture
- Some solutions proposed (e.g. BLAM, Binary Logarithmic Arbitration Method)
  - Download: http://www.ethermanage.com/ethernet/papers.html
- Not standardized, despite proposals
  - Adds complexity
  - Backward compatibility
  - Concerns: is capture a practical issue at all? (e.g. under normal load)

Lecture 1.2: Repeaters, Hubs, Switches

Repeater
- Physical layer device
- Provides the "3-R" functions:
  - Re-Shaping: restores the proper signal waveform
  - Re-Timing: restores the proper pulse duration
  - Re-Transmitting
- Retransmits collisions, too
  - Actually, regenerates them as 010101… jam sequences (extended to 96 bits)
- Automatic "partitioning"
  - Protects the network from faulty segments
  - If 30+ consecutive frame tx failures are detected, the link is disconnected

Multiport Repeaters (Hubs)
- Slang name: HUBS
- Essential for BASE-T and BASE-F
  - Star / tree topology
  - But logically acts as a bus!
- No loops (rings) allowed
  - Otherwise the signal would travel forever!
- Collision domain
  - Maximum propagation distance between end nodes

Star == Bus
[Figure]

Repeaters and preamble
- Part of the preamble is needed for synchronization
- Ethernet repeater: the "consumed" part of the preamble is NOT regenerated
  - Hence a limit on the number of repeaters crossed
- 802.3 repeater: preamble fully restored
  - But this adds extra delay (up to 16 bits per repeater)
  - Moreover, since the synchronization delay is NOT constant (a second frame might synchronize faster than a first one), the IFS between two frames can shrink
  - An IFS below 47 bits is illegal
[Figure: two frames crossing a repeater R, with the gap between them reduced at the output]

5-4-3 rule
- Ethernet/IEEE 802.3 rule on the number of
  - Segments
  - Repeaters
- Populated (user) segments vs unpopulated (link) segments
  - Link segments are used to connect two repeaters
- Rule: between any two nodes on the network there can be at most
  - five segments,
  - connected through four repeaters,
  - and only three of the five segments may contain user connections.

Stackable Repeaters
- Special connector (approx. up to 30 cm)
- Stacked repeaters act as a SINGLE repeater device!

Modular hubs (chassis)
- Expand by adding more boards in the available slots
- Minor issue: must buy from the same vendor
- Major issue: a power failure implies failure of all ports (many!!)

Photos
[Photos: stackable hubs with 8, 16, 24 10/100 ports; modular chassis hub with 42 10BASE-T RJ45 ports and 2 fiber ports]

Bridges & Switches

Bridge vs Switch
- Functional differences: None!
  - Switch = Bridge
- Marketing issues
  - Bridge: traditional name; may give the flavour of:
    - Very low number of ports (typically 2)
    - Goal: interconnect LANs
  - Switch: more appealing name; gives the flavour of:
    - Many ports (goal: to "switch" between end-user links)
    - May support many functions in addition to "just" bridging
- Implementation issues
  - Bridge: store & forward operation; software implementation
  - Switch: may use cut-through operation (faster); hardware switching implementation
- Difference: basically a marketing/implementation issue
  - For us: BRIDGE == (Layer 2) SWITCH

Bridging in the 80's
- Goal: limit the collision domain
[Figure: stations A-E on LAN 1 and LAN 2, interconnected by a bridge]
- Bridge: terminates a collision domain!
- LANs: not necessarily Ethernet
- "Transparent" bridging
  - Bridging interfaces are NOT directly addressed at MAC level: they are intermediaries

Bridging and network extension
[Figure: a bridge joining Segment 1 and Segment 2, each a 10 Mbit/s shared segment built from hubs with stations S1-S7, plus a 10 Mbit/s dedicated link]

Protocol stack
[Figure: end stations run layers 3-7 over LLC, MAC and PHY; the bridge relays between MAC1/PHY1 on LAN 1 and MAC2/PHY2 on LAN 2]
- Operates at OSI layer 2 (data link)
- Higher layers unaware
- May interconnect LANs with different PHY and MAC

Bridging specified in 802.1D
- Bridging is not specific to 802.3 (common to all 802)
- Layering:
  802.2 Logical Link Control (ISO 8802.2 LLC)
  802.1D Bridging
  MAC:
    IEEE      ISO           Name
    --------  ------------  ----------
    802.3     ISO 8802.3    CSMA/CD
    802.5     ISO 8802.5    Token Ring
    FDDI      ISO 9314      FDDI
    802.11    ISO 8802.11   Wireless
    802.12    ISO 8802.12   AnyLAN

Bridge in the 90's and 00's
- Collapsed backbone
  - Backbone collapsed into a center device
  - Star/tree topology, versus shared bus
  - Suitable for structured cabling
- Two links per port
  - 2 x twisted pair (or fiber): transmit; receive
- HUB
  - Multi-port repeater
  - Shared bandwidth
- SWITCH
  - Per-pair dedicated bandwidth

Broadcast domain vs collision domain
[Figure: without switching (hubs only), the whole LAN is a single collision domain; with switching, each switch port is its own collision domain and all of them belong to one broadcast domain]

Micro-segmentation
- Bridge: segments the network into distinct parts
  - Low number
- Switched LAN
  - Many more segments
  - Limit: one segment per user (the most frequent case!)
- Incoming frame switched to the appropriate output line
  - Unused lines can switch other traffic
  - More than one station can transmit at a time
  - Multiplies the capacity of the LAN

Switch technical features

Autonegotiation
- 802.3-2002 part 2, clause 28
  - Formerly 802.3u, drafted in 1994
  - Original specification for 10/100 Mbps
  - More recently extended to 1000 Mbps
- What is autonegotiation: a mechanism run independently at each link end
  1. Detect the various modes supported by the device at the other end of the wire
  2. Advertise its own abilities to the device at the other end
  3. Goal: automatically configure the highest performance mode of interoperation
     o Speed
     o Line coding
     o Half/full duplex
     o Extras

NLP and FLP bursts
- Normal Link Pulses (NLP)
  - 10BASE-T idea: in the absence of data, periodically transmit link integrity pulses to determine at run time whether the link is operational
  - 1 NLP every 16 +/- 8 ms
- Fast Link Pulses (FLP)
  - Instead of a single NLP pulse, transmit a 16-bit codeword
  - Coded with pulse position
  - Carries the information about the device capabilities
  - Once negotiation is completed, go back to NLP transmission
[Figure: NLP pulses and FLP bursts, both spaced 16 +/- 8 ms apart]

FLP wording
- 17 clock pulses; in the 16 intermediate spaces:
  - pulse = bit 1
  - no pulse = bit 0
[Figure: example burst encoding 1 1 0 1 0 …]

FLP coding
- Selector field (5 bits)
  - 00001 for 802.3
  - Other sequences for other standards
- Technology ability field (8 bits)
  - Specifies capabilities (e.g. 10BASE-T, 100BASE-T4, 100BASE-TX + full duplex, etc.)
- RF = Remote Fault (1 bit)
  - Signals that a fault occurred on the other side
  - The fault can be specified in a "Next Page"
- Ack (1 bit)
  - Notifies that a device has successfully received the FLP
- NP = Next Page (1 bit)
  - Notifies that a device is "next page" capable, i.e. it wishes to exchange additional data
  - Each following page is transmitted until explicitly acked

Negotiation process
- If both link partners are capable of autonegotiation:
  - Select the best technology among those available
  - First: 100BASE-T full duplex
  - …
  - Last: 10BASE-T half duplex
- If only one link partner is capable of autonegotiation:
  - Adapt to the technology available on the other side
  - This process is called "parallel detection"

Switch advantages: full-duplexing (optional feature)
- Bus, hubs: shared medium
  - Require that only one station transmits at a time
  - Half-duplex CSMA/CD operation
- Switch: dedicated connection
  - A connection is dedicated between two switching ports
  - Between PC and switch port
  - Point-to-point transmission media
- Obvious extension: move to full duplex (802.3x)!
  - Transmit and receive on two separate links
  - Which can operate IN PARALLEL!!
  - Doubles the link capacity
- No more need for CSMA/CD
  - No collision possible, as there are no other stations to collide with…!
  - No more limits on maximum segment length (just technical limits)

Hub disadvantage (solved by switch)
- Must downgrade to the lowest supported rate
  - In full analogy with the bus situation!
[Figure: bus and hub, each with 100 Mbps stations and one 10 Mbps station; mixing rates is obviously impossible]
- Possible with 10/100 switches

Bridge/Switch operation
(following discussion limited to Ethernet bridges/switches)

Bridge/Switch operation
  Frame: Preamble + SFD | DEST | SRC | LEN or type | Data | PAD | FCS   (lookup on DEST)
- Store & Forward:
  - Read the frame (store it in an on-board buffer)
  - Check the CRC
  - Discard the frame if
    » CRC fails
    » too short (<64 bytes, "runt")
    » too long (>1518 bytes, "giant")
  - Look up the destination in the forwarding (switching) table
  - Forward the packet to the outgoing port
- Cut-through
  - Just read the first few bytes (until the destination address)
  - Don't do any check
  - Look up the forwarding table and select the destination
  - Forward the frame while receiving it

Store & forward vs cut-through latency
- 1518-byte frame
- Assume the full 8-byte preamble is received
- S&F @ 10 Mbps ≥ 1526*8/10 µs ≈ 1221 µs
- C-T @ 10 Mbps ≥ 14*8/10 µs = 11.2 µs
- S&F @ 100 Mbps ≥ 122 µs
- C-T @ 100 Mbps ≥ 1.1 µs
- Not a real problem at high rate

S&F vs C-T: adaptive feature
(typically configurable; example: Intel "Express" switch)
[Figure: adaptive switching thresholds; recoverable labels: Max=1000, 0.6%, 0.4%, 5%, 10%]

S&F vs C-T: Fragment-free mode
- Compromise between cut-through and store-and-forward
  - Reads the first 64 bytes, which include the frame (+LLC) header
  - Then starts sending the packet, before the entire data field is read and the FCS is checked
- Advantages:
  - Verifies the reliability of the header information (addresses, frame type, LLC header information)
  - Detects and discards runts and collision fragments
[Figure: forwarding decision points along the frame: cut-through after DEST, fragment-free after the first 64 bytes, store & forward after the FCS]

Further issues with C-T
- Cut-through is possible only if source and destination ports have the same bit rate
  - Symmetric switching.
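The latency figures quoted above follow from simple arithmetic on the number of bytes a switch must receive before it can start forwarding. The sketch below reproduces them under the slide's assumptions (full 8-byte preamble received, cut-through decision taken right after the 6-byte destination address); the function names are made up for this illustration, and real switches add processing time on top of these lower bounds.

```python
PREAMBLE_BYTES = 8   # 7 bytes of preamble + 1 byte SFD
DEST_ADDR_BYTES = 6  # cut-through only waits for the destination address


def store_and_forward_latency_us(frame_bytes: int, rate_mbps: float) -> float:
    """Lower bound: preamble plus the whole frame must be received first.

    Bits divided by Mbit/s gives microseconds directly.
    """
    return (PREAMBLE_BYTES + frame_bytes) * 8 / rate_mbps


def cut_through_latency_us(rate_mbps: float) -> float:
    """Lower bound: only preamble plus destination address must be received."""
    return (PREAMBLE_BYTES + DEST_ADDR_BYTES) * 8 / rate_mbps


if __name__ == "__main__":
    for rate in (10, 100):
        sf = store_and_forward_latency_us(1518, rate)
        ct = cut_through_latency_us(rate)
        print(f"{rate} Mbps: S&F >= {sf:.1f} us, C-T >= {ct:.2f} us")
```

Note that the cut-through bound does not depend on the frame length, while the store-and-forward bound grows linearly with it.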
- Different rates: buffering is necessary, hence S&F only
  - Asymmetric switching
- Asymmetric switching is typical in client/server environments
  - More bandwidth dedicated to the server port, to prevent a bottleneck

Forwarding database
- Mapping between MAC addresses and ports
  - Ports: module/port-#
- Static entries:
  - Configured by the sysadmin
  - Permanent database
- Dynamic entries:
  - "Learned"
  - Expire when the ageing process reaches its upper value (e.g. 300 seconds, configurable)

  Dest MAC Address     Ports   Age
  -----------------    -----   ---
  00-00-08-11-aa-01    1/1     1
  00-b0-8d-13-1a-f1    1/7     4
  a8-11-06-00-0b-b4    2/3     0
  08-01-00-00-a7-64    2/4     1
  00-ff-08-10-44-01    2/6     5

A note on technical implementation - CAM
- A forwarding database is typically realized in hardware for maximum speed/scalability
- Technology of choice: Content Addressable Memories (CAMs)
  - Also used in current high-end routers for very fast and scalable address lookup
- Software-based lookup (search): O(log n); hardware-based CAM lookup: O(1)
  - Massively parallel comparison circuitry added to every cell of the hardware memory
  - Search result in just 1 memory cycle!!
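To make the O(log n) versus O(1) contrast concrete, the sketch below looks up the forwarding database entries from the table above in two ways: a binary search over sorted keys, and a hash-table lookup. The hash table is only a software analogy chosen for this illustration (expected constant time); it is not how a CAM works internally, since a CAM compares all cells in parallel in hardware. All function names are made up for the sketch.

```python
from bisect import bisect_left

# Entries copied from the forwarding database table: MAC address -> module/port
ENTRIES = [
    ("00-00-08-11-aa-01", "1/1"),
    ("00-b0-8d-13-1a-f1", "1/7"),
    ("a8-11-06-00-0b-b4", "2/3"),
    ("08-01-00-00-a7-64", "2/4"),
    ("00-ff-08-10-44-01", "2/6"),
]


def mac_to_int(mac: str) -> int:
    """Turn 'aa-bb-cc-dd-ee-ff' into a 48-bit integer key."""
    return int(mac.replace("-", ""), 16)


# Software table: sorted keys, searched with O(log n) comparisons.
SORTED_TABLE = sorted((mac_to_int(mac), port) for mac, port in ENTRIES)
SORTED_KEYS = [key for key, _ in SORTED_TABLE]


def lookup_binary_search(mac: str):
    """Return the port for mac, or None if the address is unknown."""
    key = mac_to_int(mac)
    index = bisect_left(SORTED_KEYS, key)
    if index < len(SORTED_KEYS) and SORTED_KEYS[index] == key:
        return SORTED_TABLE[index][1]
    return None


# Hash table: expected O(1) lookups (software stand-in for the CAM behaviour).
HASH_TABLE = {mac_to_int(mac): port for mac, port in ENTRIES}


def lookup_hash(mac: str):
    """Return the port for mac, or None if the address is unknown."""
    return HASH_TABLE.get(mac_to_int(mac))


if __name__ == "__main__":
    for mac in ("00-b0-8d-13-1a-f1", "00-00-00-00-00-99"):
        print(mac, lookup_binary_search(mac), lookup_hash(mac))
```

An unknown address returns `None` in both variants, which corresponds to the case where a bridge has to flood the frame.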
- For details refer to: http://www.eecg.toronto.edu/~pagiamt/cam/references.html

Address Learning /1
[Figure: bridge with ports P1, P2, P3; STA 1 (00-11-22-33-44-01) and STA 2 (08-55-66-77-88-02) on P1, STA 3 (08-aa-bb-cc-dd-03) on P2, STA 4 (08-01-02-f1-f2-04) on P3]
- A frame arrives at port X
  - Hence it has come from the LAN attached to port X
- The SRC address is used to update the forwarding DB

  SRC MAC              Port
  -----------------    ----
  00-11-22-33-44-01    P1
  08-55-66-77-88-02    P1
  08-aa-bb-cc-dd-03    P2
  08-01-02-f1-f2-04    P3

Address Learning /2
[Figure: same network, with a new station 08-00-0f-cc-cc-a2]

  SRC MAC              Port   Age
  -----------------    ----   ---
  00-11-22-33-44-01    P1     5
  08-55-66-77-88-02    P1     7
  08-aa-bb-cc-dd-03    P2     0
  08-01-02-f1-f2-04    P3     6
  08-00-0f-cc-cc-a2    P1     0

- Incoming frame whose SRC address is not in the forwarding DB:
  - Create a new entry
  - Ageing-time = 0
- Incoming frame whose SRC address is already in the forwarding DB:
  - Refresh the ageing time
  - Ageing-time = 0

Address Learning /3
[Figure: same network; station 08-00-0f-cc-cc-a2 now transmits from P2]

  SRC MAC              Port   Age
  -----------------    ----   ---
  00-11-22-33-44-01    P1     5
  08-55-66-77-88-02    P1     7
  08-aa-bb-cc-dd-03    P2     4
  08-01-02-f1-f2-04    P3     6
  08-00-0f-cc-cc-a2    P1     2    (old entry, replaced)
  08-00-0f-cc-cc-a2    P2     0    (new entry)

- Incoming frame whose SRC address is already in the forwarding DB but associated with a different port:
  - Update the associated port
  - Refresh the ageing time

Frame forwarding
- Very first operation performed by the bridge/switch upon frame reception
  - Before learning
  Frame: Preamble + SFD | DEST | SRC | LEN or type | Data | PAD | FCS
[Figure: frame entering at Port X and leaving at Port Y]
1. Frame OK?
   - CRC check
   - Only for Store & Forward
2. Incoming port enabled (in forwarding state)?
   - A switch port may be disabled, e.g. to isolate malfunctioning stations/LANs
3. If DEST is NOT in the forwarding DB
   - Broadcast the frame (flooding): forward it to all ports EXCEPT the incoming one
4. If DEST is in the forwarding DB
   - Check whether DEST port = incoming port
   - If YES, discard the packet (destination on the same LAN as the source)
   - If NO, forward the packet to the output port
     » Unless the output port is blocked
- Flooding also occurs for broadcast frames (obviously) and for multicast frames (unless more sophisticated policies are set)

Example / 1: start-up
[Figure: bridge with ports P1, P2, P3]
- Initial state: forwarding DB = empty

Example / 2: STA 1 -> STA 2
[Figure: STA 1 (00-11-22-33-44-01) on P1]

  MAC                  Port   Age
  00-11-22-33-44-01    P1     0

- STA 1 transmits a frame to STA 2
- Flooding occurs (STA 2 not registered in the DB)
- Bridge learns STA 1 = P1

Example / 3: STA 2 -> STA 1
[Figure: STA 1 on P1, STA 2 (00-aa-bb-cc-dd-02) on P3]

  MAC                  Port   Age
  00-11-22-33-44-01    P1     2
  00-aa-bb-cc-dd-02    P3     0

- STA 2 may respond
  - Depends on the rules of the protocol/application involved (e.g. TCP handshake)
  - Transmits a frame to STA 1
- Destination selected (the frame is forwarded on P1 only)
- Bridge learns STA 2 = P3

Example / 4: STA 3 -> STA 1
[Figure: STA 1 and STA 3 (08-80-f0-00-ff-03) on P1, STA 2 on P3]

  MAC                  Port   Age
  00-11-22-33-44-01    P1     12
  00-aa-bb-cc-dd-02    P3     10
  08-80-f0-00-ff-03    P1     0

- STA 3 on LAN 1 transmits to STA 1
- The frame reaches STA 1 directly on LAN 1
- But it also arrives at the bridge
  - The bridge discards the frame (STA 1 is on the same port as the incoming frame)
  - This operation is referred to as the FILTERING FUNCTION
- Bridge learns STA 3 = P1

Example / 5: STA 1 moves; STA 1 -> STA 3
[Figure: STA 3 on P1, STA 2 on P3, STA 1 now on P2]

  MAC                  Port   Age
  00-11-22-33-44-01    P1     13   (deleted)
  00-aa-bb-cc-dd-02    P3     11
  08-80-f0-00-ff-03    P1     1
  00-11-22-33-44-01    P2     0    (new)

- STA 1 moves to LAN 2
- Then it transmits to STA 3
- The frame arrives at the bridge on P2 and is forwarded to P1
  - According to the forwarding DB information
- The bridge learns that STA 1 moved
  - Deletes the previous entry with P1
  - Adds a new entry with P2

Example / 6: STA 2 moves; STA 1 -> STA 2 ???
[Figure: STA 2 now on P1 together with STA 3, STA 1 on P2]

  MAC                  Port   Age
  00-aa-bb-cc-dd-02    P3     13
  08-80-f0-00-ff-03    P1     3
  00-11-22-33-44-01    P2     2

- STA 2 moves to LAN 1
- STA 1 transmits a frame to STA 2
- The frame is forwarded on the old port P3!!
- The bridge will learn the new location only when STA 2 transmits its first frame, OR when the ageing time expires and the STA 2 = P3 entry is removed from the forwarding DB

Why should a station move?
- Fault-tolerant architectures!
[Figure: server attached through Link 1 to port P2 and through Link 2 to port P1]
- When link 1 fails, the server switches to link 2
  - The server MOVES from original port P2 to new port P1!!
- (need to reduce the ageing time, but a trade-off is required: too short an ageing time puts too much burden on the switch)
- (effective solution: i) periodically send "advertisement" frames; ii) send a frame right after switching to link 2)
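The forwarding and learning rules walked through in Examples 1 to 6 can be condensed into a few lines. The following is a minimal sketch, not a description of any real switch: the class name `LearningBridge` and its `receive` method are hypothetical, and CRC checks, port states, ageing timers and explicit broadcast/multicast handling are all left out on purpose.

```python
class LearningBridge:
    """Toy transparent bridge: forwarding decision first, then address learning."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # forwarding database: MAC address -> port

    def receive(self, in_port, src, dst):
        """Process one frame and return the list of ports it is sent out on."""
        known_port = self.fdb.get(dst)
        if known_port is None:
            # Unknown destination: flood on every port except the incoming one.
            out_ports = [p for p in self.ports if p != in_port]
        elif known_port == in_port:
            # Destination on the same LAN as the source: filter (discard).
            out_ports = []
        else:
            out_ports = [known_port]
        # Learning: create the entry, refresh it, or move it to a new port.
        self.fdb[src] = in_port
        return out_ports


if __name__ == "__main__":
    STA1, STA2, STA3 = "00-11-22-33-44-01", "00-aa-bb-cc-dd-02", "08-80-f0-00-ff-03"
    bridge = LearningBridge(["P1", "P2", "P3"])
    steps = [
        ("Example 2: STA 1 -> STA 2", "P1", STA1, STA2),
        ("Example 3: STA 2 -> STA 1", "P3", STA2, STA1),
        ("Example 4: STA 3 -> STA 1", "P1", STA3, STA1),
        ("Example 5: STA 1 (moved to P2) -> STA 3", "P2", STA1, STA3),
        ("Example 6: STA 1 -> STA 2 (moved to P1)", "P2", STA1, STA2),
        ("STA 2 finally transmits from P1", "P1", STA2, STA1),
    ]
    for label, in_port, src, dst in steps:
        print(label, "->", bridge.receive(in_port, src, dst))
```

Replaying the examples shows the stale-entry problem of Example 6: the frame for STA 2 still goes out on P3 until STA 2 transmits from its new port (or, in a real bridge, until the entry ages out).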