Packet Switch Architectures

High Performance Switching and Routing Telecom Center Workshop: Sept 4, 1997.

The following are (sometimes modified and rearranged) slides from an ACM SIGCOMM 99 Tutorial by Nick McKeown and Balaji Prabhakar, Stanford University. Slides used with permission from authors. © 1999-2000. All rights reserved by authors.

Outline
• Introduction: What is a Packet Switch?
• Packet Lookup and Classification: Where does a packet go next?
• Switching Fabrics: How does the packet get there?

Copyright 1999. All Rights Reserved

Introduction
What is a Packet Switch?
• Basic Architectural Components
• Some Example Packet Switches

Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (each input consults a Forwarding Table)
2. Interconnect
3. Output Scheduling

Where high performance packet switches are used
- Carrier Class Core Router
- ATM Switch
- Frame Relay Switch
[Figure labels: The Internet Core; Edge Router; Enterprise WAN access & Enterprise Campus Switch]

Some Example Packet Switches
• Packet switches exist for different networking technologies
  – Internet: IP protocol suite
  – Ethernet: Ethernet switches
  – ATM (Asynchronous Transfer Mode): ATM switch
  – MPLS (Multiprotocol Label Switching): MPLS switch
• There are many similarities in the architecture of the switches

Packet Lookup
Where does a packet go next?
• ATM and MPLS switches
  – Direct Lookup
• Bridges and Ethernet switches
  – Associative Lookup
  – Hashing
• IP Routers
  – Patricia trees/tries

Lookup in an ATM Switch
• Lookup cell VCI/VPI in VC table.
• Replace old VCI/VPI with new.
• Forward cell to outgoing interface.
• Transmit cell onto link.
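The ATM lookup steps can be sketched as a direct-indexed table. This is a minimal illustration, not the tutorial's implementation: the names `make_vc_table` and `switch_cell` are hypothetical, the table is indexed by the 16-bit VCI only (the VPI and the per-input-port tables are left out for brevity), and the entries are made-up values.

```python
from typing import List, Optional, Tuple

VCI_BITS = 16                 # width of the ATM VCI field
VcEntry = Tuple[int, int]     # (output port, new VCI); hypothetical layout


def make_vc_table() -> List[Optional[VcEntry]]:
    """One slot per possible VCI, so the VCI itself is the memory address."""
    return [None] * (1 << VCI_BITS)


def switch_cell(vc_table: List[Optional[VcEntry]], vci: int, payload: bytes):
    """Look up the cell's VCI, rewrite it, and pick the outgoing interface.

    Returns (out_port, new_vci, payload), or None if no VC is set up.
    """
    entry = vc_table[vci]             # direct lookup: a single memory read
    if entry is None:
        return None                   # unknown VC; this sketch just drops it
    out_port, new_vci = entry
    return out_port, new_vci, payload # cell leaves carrying the new VCI


# Illustrative entry (made-up values): VCI 42 maps to port 3, VCI 77.
vc_table = make_vc_table()
vc_table[42] = (3, 77)
forwarded = switch_cell(vc_table, 42, b"cell payload")
```

The cost of the lookup is one memory access regardless of how many VCs are set up, which is why the table can be a plain array rather than a search structure.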

Lookup in an Ethernet Switch
• Lookup frame DA in forwarding table.
  – If known, forward to correct port.
  – If unknown, broadcast to all ports.
• Learn SA of incoming frame.
• Forward frame to outgoing interface.
• Transmit frame onto link.

Lookup in an IP Router
• Lookup packet DA in forwarding table.
  – If known, forward to correct port.
  – If unknown, drop packet.
• Decrement TTL, update header checksum.
• Forward packet to outgoing interface.
• Transmit packet onto link.

ATM and MPLS Switches
Direct Lookup
[Figure: the VCI addresses a memory, which returns (Port, VCI)]

Bridges and Ethernet Switches
Associative Lookups
[Figure: associative memory or CAM; the search data is the 48-bit network address; outputs are the associated data, Hit?, and the address (log2 N bits)]
Advantages:
• Simple
Disadvantages:
• Slow
• High Power
• Small
• Expensive

Bridges and Ethernet Switches
Hashing
[Figure: the 48-bit search data passes through a hashing function to form a 16-bit memory address; the memory returns the associated data, Hit?, and the address (log2 N bits)]

Lookups Using Hashing
An example
[Figure: the 48-bit search data is hashed with CRC-16 into a 16-bit index; entries that collide (#1, #2, #3, #4) are kept in linked lists; outputs are the associated data, Hit?, and the address (log2 N bits)]

IP Router Lookup
[Figure: the destination address from the incoming packet header enters the forwarding engine; the next hop computation consults the forwarding table (Destination, Next Hop) and returns the next hop]
IPv4 unicast destination address based lookup

IP Routers Lookup
• Longest Prefix Matching

Example lookup address: 128.9.16.14

Prefix          Port
65/8            3
128.9/16        5
128.9.16/20     2
128.9.19/24     7
128.9.25/24     10
128.9.176/20    1
142.12/19       3

• Lookup time
• Storage space
• Update time
• Preprocessing time
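Longest prefix matching on the example table can be sketched as follows. This is only an illustration of the matching rule: it uses a linear scan with Python's standard `ipaddress` module, which is an assumption made for clarity and not how a router would implement the lookup. The function name `lookup` is hypothetical, and the abbreviated prefixes (for example 128.9/16) are written out in full.

```python
import ipaddress

# The example table, with abbreviated prefixes expanded to full form.
FORWARDING_TABLE = [
    ("65.0.0.0/8", 3),
    ("128.9.0.0/16", 5),
    ("128.9.16.0/20", 2),
    ("128.9.19.0/24", 7),
    ("128.9.25.0/24", 10),
    ("128.9.176.0/20", 1),
    ("142.12.0.0/19", 3),
]


def lookup(address: str):
    """Return the port of the longest matching prefix, or None if no match."""
    addr = ipaddress.ip_address(address)
    best_len, best_port = -1, None
    for prefix, port in FORWARDING_TABLE:
        net = ipaddress.ip_network(prefix)
        # Among all prefixes containing the address, keep the longest one.
        if addr in net and net.prefixlen > best_len:
            best_len, best_port = net.prefixlen, port
    return best_port


port = lookup("128.9.16.14")   # matches 128.9/16 and 128.9.16/20
```

The address 128.9.16.14 falls inside both 128.9/16 and 128.9.16/20; the /20 is longer, so the packet goes to port 2. An address with no matching prefix returns None, which corresponds to the "If unknown, drop packet" step.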

Ternary CAMs
Associative Memory

Value       Mask               Next Hop
10.0.0.0    255.0.0.0          R1
10.1.0.0    255.255.0.0        R2
10.1.1.0    255.255.255.0      R3
10.1.3.0    255.255.255.0      R4
10.1.3.1    255.255.255.255    R4

[Figure: matching entries feed a priority encoder, which selects the next hop]

Binary Tries
[Figure: binary trie rooted at a node with branches 0 and 1; leaves a to j]
Example Prefixes
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000

Patricia Tree
[Figure: Patricia tree over the same example prefixes a to j, with a Skip=5 annotation on the branch leading to j]
Example Prefixes
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000

Switching Fabrics:
How does the packet get there?
• Output and Input Queueing
• Output Queueing
• Input Queueing
• Other non-blocking fabrics

Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (each input consults a Forwarding Table)
2. Interconnect
3. Output Scheduling

Interconnects
Two basic techniques
• Input Queueing: usually a non-blocking switch fabric (e.g. crossbar)
• Output Queueing: usually a fast bus

Interconnects
Output Queueing
• Individual Output Queues: Memory b/w = (N+1).R
• Centralized Shared Memory: Memory b/w = 2N.R
[Figure: ports 1, 2, ..., N for each arrangement]

Output Queueing
How fast can we make centralized shared memory?
[Figure: ports 1, 2, ..., N attached over a 200 byte bus to a shared memory built from 5ns SRAM]
• 5ns per memory operation
• Two memory operations per packet
• Therefore, up to 160Gb/s
• In practice, closer to 80Gb/s

Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
  – Scheduling algorithms
  – Other non-blocking fabrics
  – Combining input and output queues
  – Multicast traffic

Input Queueing with Crossbar
Memory b/w = 2R
[Figure: crossbar with data in and data out; a scheduler sets the crossbar configuration]
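The memory bandwidth figures from the output queueing slides can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: each memory operation is taken to move one full 200 byte bus word, a packet needs one write and one read, and the function names are hypothetical. The "closer to 80Gb/s" practical figure is not derived here.

```python
def shared_memory_throughput_gbps(bus_width_bytes: int,
                                  access_time_ns: float,
                                  ops_per_packet: int = 2) -> float:
    """Peak throughput of a shared memory, in Gb/s.

    Bits per nanosecond equals Gb/s, so no unit conversion is needed.
    """
    bits_per_word = bus_width_bytes * 8
    ns_per_packet = ops_per_packet * access_time_ns   # one write plus one read
    return bits_per_word / ns_per_packet


def output_queue_memory_bw(n_ports: int, line_rate: float) -> float:
    """Individual output queue: N writes and 1 read per cell time, (N+1).R."""
    return (n_ports + 1) * line_rate


def shared_memory_bw(n_ports: int, line_rate: float) -> float:
    """Centralized shared memory: N writes and N reads per cell time, 2N.R."""
    return 2 * n_ports * line_rate


peak = shared_memory_throughput_gbps(200, 5)   # 1600 bits every 10 ns
```

With a 200 byte bus and 5ns SRAM, 1600 bits move every 10 ns, which gives the 160Gb/s upper bound quoted on the slide.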

Input Queueing
Head of Line Blocking
[Figure: delay versus load for input queueing with head of line blocking; load axis marked at 58.6% and 100%]

Head of Line Blocking
[Figure: Head of Line Blocking]

Input Queueing
Virtual output queues
[Figure: Virtual output queues]

Input Queues
Virtual Output Queues
[Figure: delay versus load with virtual output queues; load axis marked at 100%]

Input Queueing
Memory b/w = 2R
[Figure: input queues and a scheduler]
Scheduler: can be quite complex!

Input Queueing
Scheduling
[Figure: scheduling example between inputs 1, 2 and outputs 1, 2]

Wave Front Arbiter
Scheduling Algorithm
[Figure: requests and the resulting match between inputs 1 to 4 and outputs 1 to 4]

Wave Front Arbiter
[Figure: Requests and Match]

Other Non-Blocking Fabrics
Clos Network
[Figure: Clos Network]

Other Non-Blocking Fabrics
Clos Network
Expansion factor required = 2 - 1/N (but still blocking for multicast)

Other Non-Blocking Fabrics
Self-Routing Networks
[Figure: 8-port self-routing network with inputs and outputs labeled 000 to 111]

Other Non-Blocking Fabrics
Self-Routing Networks
The Non-blocking Batcher Banyan Network
[Figure: an 8-port Batcher sorter followed by a self-routing network, with outputs labeled 000 to 111]
• Fabric can be used as scheduler.
• Batcher-Banyan network is blocking for multicast.
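The idea behind the Batcher-Banyan combination can be sketched in a few lines. The assumptions here are loud ones: the self-routing network is modeled as an 8-port omega (shuffle-exchange) network, which is one member of the banyan family and not necessarily the topology drawn on the slide; Python's `sorted` stands in for the Batcher sorting network; and the function names are hypothetical. Each stage shuffles the packet's position and then uses one destination bit, most significant first, to choose the switch output.

```python
def route_omega(packets: dict, n_bits: int = 3) -> bool:
    """Self-route packets through an omega network.

    packets maps input port -> destination port.
    Returns True if every packet arrives without an internal conflict.
    """
    mask = (1 << n_bits) - 1
    state = list(packets.items())              # (position, destination) pairs
    for stage in range(n_bits):
        bit_index = n_bits - 1 - stage         # most significant bit first
        taken = set()
        next_state = []
        for pos, dest in state:
            # Perfect shuffle: rotate the position left by one bit.
            shuffled = ((pos << 1) | (pos >> (n_bits - 1))) & mask
            # 2x2 switch: the destination bit selects upper (0) or lower (1).
            out = (shuffled & ~1) | ((dest >> bit_index) & 1)
            if out in taken:
                return False                   # two packets want the same link
            taken.add(out)
            next_state.append((out, dest))
        state = next_state
    return all(pos == dest for pos, dest in state)


def batcher_banyan(destinations: list) -> bool:
    """Sort by destination, pack onto inputs 0, 1, 2, ..., then self-route."""
    ordered = sorted(destinations)             # stand-in for the Batcher sorter
    return route_omega({port: dest for port, dest in enumerate(ordered)})


sorted_ok = batcher_banyan([6, 0, 5, 3, 2])    # sorted and packed: no conflict
unsorted_ok = route_omega({0: 0, 4: 1})        # inputs 0 and 4 collide at stage 1
```

Without the sorter, inputs 0 and 4 share the first-stage switch and both need its upper output, so the network blocks. After sorting and packing the packets onto consecutive inputs, the same kind of traffic passes cleanly, which is the role the Batcher sorter plays in front of the self-routing stage.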