Packet Switch Architectures
High Performance
Switching and Routing
Telecom Center Workshop: Sept 4, 1997.
The following slides (some modified and rearranged) are from
an ACM SIGCOMM 99 Tutorial by Nick McKeown and Balaji
Prabhakar, Stanford University.
Slides used with permission from authors.
© 1999-2000. All rights reserved by authors.
Outline
• Introduction:
What is a Packet Switch?
• Packet Lookup and Classification:
Where does a packet go next?
• Switching Fabrics:
How does the packet get there?
Copyright 1999. All Rights Reserved
Introduction
What is a Packet Switch?
• Basic Architectural Components
• Some Example Packet Switches
Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (each with its Forwarding Table)
2. Interconnect
3. Output Scheduling
Where high performance packet switches are used
• The Internet Core
– Carrier Class Core Router
– ATM Switch
– Frame Relay Switch
• Edge Router
• Enterprise WAN access & Enterprise Campus Switch
Some Example Packet Switches
• Packet switches exist for different networking technologies
– Internet: IP protocol suite
– Ethernet: Ethernet switches
– ATM (Asynchronous Transfer Mode): ATM switch
– MPLS (Multiprotocol label switching): MPLS switch
• There are many similarities in the architecture of the switches
Packet Lookup
Where does a packet go next?
• ATM and MPLS switches
– Direct Lookup
• Bridges and Ethernet switches
– Associative Lookup
– Hashing
• IP Routers
– Patricia trees/tries
Lookup in an ATM Switch
• Lookup cell VCI/VPI in VC table.
• Replace old VCI/VPI with new.
• Forward cell to outgoing interface.
• Transmit cell onto link.
Lookup in an Ethernet Switch
• Lookup frame DA in forwarding table.
– If known, forward to correct port.
– If unknown, broadcast to all ports.
• Learn SA of incoming frame.
• Forward frame to outgoing interface.
• Transmit frame onto link.
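The Ethernet switch steps above can be sketched as a small learning table. This is a minimal illustration, not a real switch: the class name, the port numbering, and the choice to flood on every port except the arrival port are assumptions.

```python
class LearningSwitch:
    """Toy model of DA lookup plus SA learning (hypothetical names)."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port, src, dst):
        """Return the list of ports the frame is transmitted on."""
        out_port = self.table.get(dst)   # lookup frame DA
        self.table[src] = in_port        # learn SA of incoming frame
        if out_port is None:
            # Unknown DA: send on all ports (assumed: except the arrival port).
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            return []                    # destination is on the arrival segment
        return [out_port]                # known DA: forward to the correct port
```

For example, a frame from A to an unknown B is flooded; once B has replied, frames to B use a single port.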
Lookup in an IP Router
• Lookup packet DA in forwarding table.
– If known, forward to correct port.
– If unknown, drop packet.
• Decrement TTL, update header checksum.
• Forward packet to outgoing interface.
• Transmit packet onto link.
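The TTL and checksum step can be sketched as follows. This is a minimal illustration that recomputes the whole IPv4 header checksum rather than updating it incrementally; the function names are hypothetical, and dropping a packet whose TTL would reach zero is an assumption not stated on the slide.

```python
import struct

def header_checksum(header: bytes) -> int:
    """One's-complement sum of the 16-bit header words, complemented."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def decrement_ttl(header: bytearray) -> bool:
    """Decrement TTL (byte 8) and refresh the checksum (bytes 10-11) in place.

    Returns False when the packet should be dropped instead of forwarded.
    """
    if header[8] <= 1:
        return False                         # assumed: expired packets are dropped
    header[8] -= 1
    header[10:12] = b"\x00\x00"              # checksum field is zero while summing
    header[10:12] = struct.pack("!H", header_checksum(bytes(header)))
    return True
```

A header that carries a valid checksum sums to zero under `header_checksum`, which gives a quick self-check before and after the update.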
ATM and MPLS Switches
Direct Lookup
[Figure: the VCI is the address into a memory; the data read out is (Port, VCI).]
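Direct lookup can be sketched as an array indexed by the incoming label. This is a minimal illustration: the 16-bit label width, the table contents, and the function name are assumptions, and the VPI is left out for brevity.

```python
VCI_BITS = 16                             # assumed label width
vc_table = [None] * (1 << VCI_BITS)       # one memory word per possible VCI

# Hypothetical connections: incoming VCI -> (outgoing port, new VCI)
vc_table[32] = (2, 77)
vc_table[33] = (0, 45)

def switch_cell(vci, payload):
    """Read the table at address `vci`, swap the label, pick the output port."""
    entry = vc_table[vci]                 # a single memory read, no search
    if entry is None:
        return None                       # no connection set up for this label
    out_port, new_vci = entry
    return out_port, new_vci, payload
```

The lookup cost is one memory access regardless of how many connections exist; the price is a table sized for every possible label.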
Bridges and Ethernet Switches
Associative Lookups
[Figure: associative memory or CAM. The 48-bit network address is the search data; the outputs are the associated data, a Hit? signal, and the matching address (log2N bits).]
Advantages:
• Simple
Disadvantages:
• Slow
• High Power
• Small
• Expensive
Bridges and Ethernet Switches
Hashing
[Figure: the 48-bit search data passes through a hashing function to form a 16-bit memory address; the memory returns the stored data and associated data, with a Hit? signal and address (log2N bits).]
Lookups Using Hashing
An example
[Figure: the 48-bit search data is hashed with CRC-16 into a 16-bit memory address; entries that hash to the same address (#1, #2, #3, #4) are kept in linked lists; the outputs are the associated data, a Hit? signal, and address (log2N bits).]
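A hashed lookup with chaining can be sketched as below. This is a minimal illustration: `binascii.crc_hqx` is a 16-bit CRC with the CCITT polynomial, which may not be the CRC-16 variant the slide has in mind, Python lists stand in for the linked lists, and the function names are hypothetical.

```python
import binascii

# Sparse stand-in for a 2^16-entry memory: address -> chain of (mac, port)
buckets = {}

def hash16(mac: bytes) -> int:
    """Map a 48-bit address to a 16-bit memory address."""
    return binascii.crc_hqx(mac, 0)

def insert(mac: bytes, port: int) -> None:
    chain = buckets.setdefault(hash16(mac), [])
    for i, (stored, _) in enumerate(chain):
        if stored == mac:
            chain[i] = (mac, port)        # address already present: update it
            return
    chain.append((mac, port))             # collision or empty slot: extend chain

def lookup(mac: bytes):
    """Walk the chain at the hashed address; None means a miss."""
    for stored, port in buckets.get(hash16(mac), []):
        if stored == mac:                 # compare the full 48-bit search data
            return port
    return None
```

Lookup time depends on chain length, so it is fast on average but not bounded in the worst case.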
IP Router
Lookup
[Figure: the destination address is taken from the header of the incoming packet and passed to the forwarding engine; the next hop computation consults the forwarding table (Destination, Next Hop) and returns the next hop.]
IPv4 unicast destination address based lookup
IP Routers
Lookup
• Longest Prefix Matching
Example address: 128.9.16.14
Prefix          Port
65/8            3
128.9/16        5
128.9.16/20     2
128.9.19/24     7
128.9.25/24     10
128.9.176/20    1
142.12/19       3
• Lookup time
• Storage space
• Update time
• Preprocessing time
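Longest prefix matching on the slide's table can be sketched with a linear scan. This is a minimal illustration that favors clarity over lookup time: the prefixes are written out in full dotted form, and the function name is hypothetical.

```python
import ipaddress

# Prefix/port pairs from the slide, expanded to full dotted notation
TABLE = [
    ("65.0.0.0/8", 3),
    ("128.9.0.0/16", 5),
    ("128.9.16.0/20", 2),
    ("128.9.19.0/24", 7),
    ("128.9.25.0/24", 10),
    ("128.9.176.0/20", 1),
    ("142.12.0.0/19", 3),
]
ROUTES = [(ipaddress.ip_network(prefix), port) for prefix, port in TABLE]

def longest_prefix_match(address: str):
    """Return the port of the longest matching prefix, or None if none match."""
    addr = ipaddress.ip_address(address)
    best_len, best_port = -1, None
    for network, port in ROUTES:
        if addr in network and network.prefixlen > best_len:
            best_len, best_port = network.prefixlen, port
    return best_port
```

The address 128.9.16.14 matches both 128.9/16 and 128.9.16/20; the /20 is longer, so the lookup selects port 2. The scan costs one comparison per prefix, which is why the criteria listed above matter for real tables.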
Ternary CAMs
Associative Memory
Value       Mask              Next Hop
10.0.0.0    255.0.0.0         R1
10.1.0.0    255.255.0.0       R2
10.1.1.0    255.255.255.0     R3
10.1.3.0    255.255.255.0     R4
10.1.3.1    255.255.255.255   R4
Priority Encoder
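A ternary CAM lookup can be modeled in software as below. This is a minimal sketch: the loop stands in for comparisons that the hardware performs on all rows at once, and the rule that the priority encoder picks the last matching row (the rows are listed from shortest to longest mask) is an assumption.

```python
import ipaddress

def to_int(dotted: str) -> int:
    return int(ipaddress.ip_address(dotted))

# (value, mask, next hop) rows from the slide, shortest mask first
ENTRIES = [
    (to_int("10.0.0.0"), to_int("255.0.0.0"), "R1"),
    (to_int("10.1.0.0"), to_int("255.255.0.0"), "R2"),
    (to_int("10.1.1.0"), to_int("255.255.255.0"), "R3"),
    (to_int("10.1.3.0"), to_int("255.255.255.0"), "R4"),
    (to_int("10.1.3.1"), to_int("255.255.255.255"), "R4"),
]

def tcam_lookup(address: str):
    key = to_int(address)
    # A row matches when the key equals the value on every unmasked bit.
    hits = [i for i, (value, mask, _) in enumerate(ENTRIES)
            if (key & mask) == (value & mask)]
    if not hits:
        return None
    return ENTRIES[max(hits)][2]          # priority encoder: last row wins
```

With the rows ordered by mask length, picking the highest-numbered hit yields the longest prefix match.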
Binary Tries
[Figure: binary trie with branches labeled 0 and 1 and nodes a through j.]
Example Prefixes
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000
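A binary trie over the example prefixes can be sketched as follows. This is a minimal illustration: addresses are given as bit strings, and the class and function names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = [None, None]      # child for bit 0, child for bit 1
        self.label = None                 # set when a prefix ends at this node

PREFIXES = {
    "a": "00001", "b": "00010", "c": "00011", "d": "001", "e": "0101",
    "f": "011", "g": "100", "h": "1010", "i": "1100", "j": "11110000",
}

def build(prefixes):
    root = TrieNode()
    for label, bits in prefixes.items():
        node = root
        for bit in bits:                  # one trie level per prefix bit
            index = int(bit)
            if node.children[index] is None:
                node.children[index] = TrieNode()
            node = node.children[index]
        node.label = label
    return root

def lookup(root, address_bits):
    """Walk one bit at a time, remembering the last prefix passed."""
    node, best = root, None
    for bit in address_bits:
        node = node.children[int(bit)]
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best
```

Each lookup visits at most one node per address bit, so long prefixes such as j cost up to eight steps here.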
Patricia Tree
[Figure: Patricia tree with branches labeled 0 and 1, nodes a through j, and a Skip=5 annotation.]
Example Prefixes
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000
Switching Fabrics:
How does the packet get there?
• Output and Input Queueing
• Output Queueing
• Input Queueing
• Other non-blocking fabrics
Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (each with its Forwarding Table)
2. Interconnect
3. Output Scheduling
Interconnects
Two basic techniques
• Input Queueing: usually a non-blocking switch fabric (e.g. crossbar)
• Output Queueing: usually a fast bus
Interconnects
Output Queueing
• Individual Output Queues: Memory b/w = (N+1).R
• Centralized Shared Memory: Memory b/w = 2N.R
[Figure: both organizations, each with ports 1, 2, ..., N.]
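The two memory bandwidth expressions can be evaluated for a concrete switch. This is a minimal sketch: the port count and line rate in the usage note are hypothetical, and the comments state the usual reasoning behind each formula.

```python
def individual_output_queue_bw(n_ports: int, line_rate: float) -> float:
    """Each output memory may take N writes and 1 read per cell time."""
    return (n_ports + 1) * line_rate

def shared_memory_bw(n_ports: int, line_rate: float) -> float:
    """One memory takes N writes and N reads per cell time."""
    return 2 * n_ports * line_rate
```

For a hypothetical 16-port switch at 10 Gb/s per port, each individual output queue needs 170 Gb/s of memory bandwidth, while a single shared memory needs 320 Gb/s.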
Output Queueing
How fast can we make centralized shared memory?
[Figure: ports 1, 2, ..., N connected by a 200 byte bus to a shared memory built from 5ns SRAM.]
• 5ns per memory operation
• Two memory operations per packet
• Therefore, up to 160Gb/s
• In practice, closer to 80Gb/s
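The 160Gb/s figure follows from the bus width and the memory cycle time. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def shared_memory_throughput_gbps(bus_bytes, op_time_ns, ops_per_packet):
    """Peak rate when each packet costs one write and one read of the full bus."""
    bits_per_transfer = bus_bytes * 8
    time_per_packet_ns = op_time_ns * ops_per_packet
    return bits_per_transfer / time_per_packet_ns   # bits per ns equals Gb/s

peak = shared_memory_throughput_gbps(bus_bytes=200, op_time_ns=5, ops_per_packet=2)
```

Here 200 bytes is 1600 bits, moved once every 10 ns, which gives 160 Gb/s.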
Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
– Scheduling algorithms
– Other non-blocking fabrics
– Combining input and output queues
– Multicast traffic
Input Queueing with Crossbar
Memory b/w = 2R
[Figure: input queues feed a crossbar from Data In to Data Out; a scheduler sets the crossbar configuration.]
Input Queueing
Head of Line Blocking
[Figure: delay versus load. Delay grows without bound as load approaches 58.6%, short of 100%.]
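The 58.6% limit is the known large-switch value 2 - sqrt(2) for FIFO input queues under uniform random traffic. The simulation below is a minimal sketch under assumed conditions: every input always has a cell waiting, destinations are uniform, and each output picks one contending input at random. For small switches the result sits somewhat above the limit.

```python
import math
import random

def hol_saturation_throughput(n_ports, slots, seed=0):
    """Fraction of output slots used when every FIFO input is always backlogged."""
    rng = random.Random(seed)
    # Destination of the cell at the head of each input queue
    head = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(head):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)            # one cell per output per slot
            head[winner] = rng.randrange(n_ports)  # next cell reaches the head
            delivered += 1
        # Losing inputs keep the same head cell: this is head of line blocking.
    return delivered / (n_ports * slots)

asymptotic_limit = 2 - math.sqrt(2)
```

Cells behind a blocked head cell cannot leave even when their own outputs are idle, which is what caps the throughput.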
Head of Line Blocking
Input Queueing
Virtual output queues
Input Queues
Virtual Output Queues
[Figure: delay versus load, with the load axis extending to 100%.]
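Virtual output queues can be sketched as one FIFO per output at each input. This is a minimal illustration with hypothetical class and method names.

```python
from collections import deque

class InputPort:
    """One input of the switch, holding a separate queue per output."""

    def __init__(self, num_outputs):
        self.voq = [deque() for _ in range(num_outputs)]

    def enqueue(self, cell, output):
        self.voq[output].append(cell)

    def requests(self):
        """Outputs this input can send to right now (non-empty queues)."""
        return [out for out, queue in enumerate(self.voq) if queue]

    def dequeue(self, output):
        return self.voq[output].popleft()
```

With a single FIFO, a cell for output 1 queued behind a cell for output 0 would have to wait; here both outputs appear in `requests()`, so the scheduler may serve either.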
Input Queueing
Memory b/w = 2R
Scheduler: can be quite complex!
Input Queueing
Scheduling
[Figure: bipartite graphs between numbered inputs and outputs.]
Wave Front Arbiter
Scheduling Algorithm
[Figure: 4x4 arrays of Requests and the resulting Match, with rows and columns numbered 1 to 4.]
Wave Front Arbiter
[Figure: Requests and the resulting Match.]
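A wave front arbiter can be sketched as a sweep over the diagonals of the request matrix. This is a minimal software model under assumptions: the sweep always starts from the top-left corner (hardware versions typically vary the starting point for fairness), and the request matrix in the usage note is hypothetical, since the slide's values are not recoverable.

```python
def wave_front_arbiter(requests):
    """Return (input, output) pairs forming a maximal match.

    requests[i][j] is truthy when input i has a cell for output j.
    """
    n = len(requests)
    row_free = [True] * n
    col_free = [True] * n
    match = []
    for wave in range(2 * n - 1):
        # Cells on one anti-diagonal share no row or column, so hardware
        # can decide all of them at the same time.
        for i in range(n):
            j = wave - i
            if 0 <= j < n and requests[i][j] and row_free[i] and col_free[j]:
                row_free[i] = False
                col_free[j] = False
                match.append((i, j))
    return match

example_requests = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
example_match = wave_front_arbiter(example_requests)
```

The result is maximal: no request remains whose input and output are both still unmatched.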
Other Non-Blocking Fabrics
Clos Network
Other Non-Blocking Fabrics
Clos Network
Expansion factor required = 2-1/N (but still blocking for multicast)
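The expansion factor can be tied to Clos's condition for a three-stage network: with n inputs on each first-stage switch, strict-sense non-blocking operation for unicast needs at least 2n - 1 middle-stage switches. The sketch below assumes that the slide's N denotes this per-switch input count; the function names are hypothetical.

```python
def min_middle_switches(inputs_per_first_stage_switch: int) -> int:
    """Middle-stage switches needed for strict-sense non-blocking unicast."""
    return 2 * inputs_per_first_stage_switch - 1

def expansion_factor(inputs_per_first_stage_switch: int) -> float:
    """Middle-stage links per input link, i.e. (2n - 1) / n = 2 - 1/n."""
    n = inputs_per_first_stage_switch
    return min_middle_switches(n) / n
```

For n = 4 this gives 7 middle switches and an expansion factor of 1.75; the factor approaches 2 as n grows.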
Other Non-Blocking Fabrics
Self-Routing Networks
[Figure: 8x8 self-routing network with inputs and outputs labeled 000 through 111.]
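Self-routing can be sketched with an omega network, one member of the banyan family; the slide does not say which topology it draws, so this choice is an assumption. At each stage the links are shuffled and the switch sets the lowest position bit to the next destination bit, most significant bit first.

```python
def self_route(source: int, destination: int, n_bits: int = 3):
    """Return the positions a cell occupies from the input through each stage."""
    size = 1 << n_bits
    position = source
    path = [position]
    for stage in range(n_bits):
        bit = (destination >> (n_bits - 1 - stage)) & 1
        # The shuffle moves the position bits left by one; the 2x2 switch
        # then selects its upper (0) or lower (1) output from the header bit.
        position = ((position << 1) & (size - 1)) | bit
        path.append(position)
    return path
```

After as many stages as there are address bits, every source bit has been shifted out, so the cell arrives at its destination with no central lookup.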
Other Non-Blocking Fabrics
Self-Routing Networks
The Non-blocking Batcher Banyan Network
[Figure: cells labeled with destinations 0 through 7 pass through a Batcher sorter and then a self-routing network to outputs 000 through 111.]
• Fabric can be used as scheduler.
• Batcher-Banyan network is blocking for multicast.
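The effect of sorting before self-routing can be checked in a small model. This is a minimal sketch under assumptions: an omega network stands in for the banyan, Python's `sorted` stands in for the Batcher sorter, cells enter on consecutive inputs starting at 0, and all destinations are distinct (unicast).

```python
def conflict_free(destinations, n_bits=3):
    """True if cells entering inputs 0, 1, 2, ... never share a link.

    destinations[i] is the output wanted by the cell on input i.
    """
    size = 1 << n_bits
    positions = list(range(len(destinations)))
    for stage in range(n_bits):
        shift = n_bits - 1 - stage
        positions = [((p << 1) & (size - 1)) | ((d >> shift) & 1)
                     for p, d in zip(positions, destinations)]
        if len(set(positions)) != len(positions):
            return False                  # two cells need the same link
    return True

unsorted_cells = [6, 1, 4, 0]             # hypothetical arrival order
blocked = not conflict_free(unsorted_cells)
sorted_ok = conflict_free(sorted(unsorted_cells))
```

The unsorted arrivals collide inside the network, while the same cells presented in sorted order on adjacent inputs pass through without conflict.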