High Performance Switches and Routers:
Theory and Practice
Sigcomm 99
August 30, 1999
Harvard University
Nick McKeown
Balaji Prabhakar
Departments of Electrical Engineering and Computer Science
nickm@stanford.edu
balaji@isl.stanford.edu
Tutorial Outline
• Introduction:
What is a Packet Switch?
• Packet Lookup and Classification:
Where does a packet go next?
• Switching Fabrics:
How does the packet get there?
• Output Scheduling:
When should the packet leave?
Copyright 1999. All Rights Reserved
Introduction
What is a Packet Switch?
• Basic Architectural Components
• Some Example Packet Switches
• The Evolution of IP Routers
Basic Architectural Components
Control: Admission Control, Policing, Congestion Control, Routing, Reservation, Output Scheduling
Datapath: per-packet processing
Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (per-port Forwarding Table)
2. Interconnect
3. Output Scheduling
Where high performance packet
switches are used
• The Internet Core: carrier-class core router, ATM switch, Frame Relay switch
• Edge router
• Enterprise WAN access & enterprise campus switch
Introduction
What is a Packet Switch?
• Basic Architectural Components
• Some Example Packet Switches
• The Evolution of IP Routers
ATM Switch
• Lookup cell VCI/VPI in VC table.
• Replace old VCI/VPI with new.
• Forward cell to outgoing interface.
• Transmit cell onto link.
Ethernet Switch
• Lookup frame DA in forwarding table.
– If known, forward to correct port.
– If unknown, broadcast to all ports.
• Learn SA of incoming frame.
• Forward frame to outgoing interface.
• Transmit frame onto link.
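The learn/forward/flood loop above can be sketched in a few lines. This is a minimal illustration, not the tutorial's implementation; the class name, frame fields, and port numbering are assumptions.

```python
# Sketch of an Ethernet learning switch: learn the source address,
# forward to a known port, flood when the destination is unknown.

class EthernetSwitch:
    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.table = {}                        # MAC address -> port

    def handle_frame(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.table[src_mac] = in_port          # learn SA of incoming frame
        out = self.table.get(dst_mac)
        if out is not None and out != in_port:
            return [out]                       # DA known: forward to one port
        # DA unknown (or looped back): flood to all ports except the input
        return [p for p in range(self.num_ports) if p != in_port]

sw = EthernetSwitch(4)
print(sw.handle_frame(0, "aa", "bb"))   # "bb" unknown: flood to [1, 2, 3]
print(sw.handle_frame(1, "bb", "aa"))   # "aa" was learned on port 0: [0]
```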
IP Router
• Lookup packet DA in forwarding table.
– If known, forward to correct port.
– If unknown, drop packet.
• Decrement TTL, update header Cksum.
• Forward packet to outgoing interface.
• Transmit packet onto link.
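The TTL decrement and header checksum update can be sketched directly from the IPv4 header layout (TTL is byte 8, the checksum occupies bytes 10-11). A full checksum recompute is shown here for clarity; real routers often use the incremental update instead.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words; caller zeroes the checksum field."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward(header: bytes) -> bytearray:
    """Decrement TTL and recompute the header checksum."""
    header = bytearray(header)
    header[8] -= 1                           # TTL is byte 8
    header[10:12] = b"\x00\x00"              # zero the checksum field
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
    return header
```

A header with a valid checksum sums (one's complement) to 0xFFFF, which makes the update easy to verify.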
Introduction
What is a Packet Switch?
• Basic Architectural Components
• Some Example Packet Switches
• The Evolution of IP Routers
First-Generation IP Routers
Shared backplane: a single CPU with shared buffer memory; every packet is DMA-transferred across the bus from its line interface (MAC) to the CPU for forwarding, then back out.
Second-Generation IP Routers
Each line card adds local buffer memory, so most packets are DMA-transferred across the shared bus directly from line card to line card; the central CPU and buffer memory handle the remaining traffic.
Third-Generation Switches/Routers
Switched backplane: a non-blocking fabric replaces the shared bus; line cards with local buffer memory connect through it, and the CPU card becomes just another port.
Fourth-Generation Switches/Routers
Clustering and Multistage
Multiple switch elements are clustered into a multistage fabric, e.g. a 32-port system built from smaller interconnected stages.
Packet Switches
References
• J. Giacopelli, M. Littlewood, W.D. Sincoskie “Sunshine: A
high performance self-routing broadband packet switch
architecture”, ISS ‘90.
• J. S. Turner “Design of a Broadcast packet switching
network”, IEEE Trans Comm, June 1988, pp. 734-743.
• C. Partridge et al. “A Fifty Gigabit per second IP Router”,
IEEE Trans Networking, 1998.
• N. McKeown, M. Izzard, A. Mekkittikul, W. Ellersick, M.
Horowitz, “The Tiny Tera: A Packet Switch Core”, IEEE
Micro Magazine, Jan-Feb 1997.
Tutorial Outline
• Introduction:
What is a Packet Switch?
• Packet Lookup and Classification:
Where does a packet go next?
• Switching Fabrics:
How does the packet get there?
• Output Scheduling:
When should the packet leave?
Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (per-port Forwarding Table)
2. Interconnect
3. Output Scheduling
Forwarding Decisions
• ATM and MPLS switches
– Direct Lookup
• Bridges and Ethernet switches
– Associative Lookup
– Hashing
– Trees and tries
• IP Routers
– Caching
– CIDR
– Patricia trees/tries
– Other methods
• Packet Classification
ATM and MPLS Switches
Direct Lookup
The incoming VCI directly indexes a memory whose entry gives the outgoing (Port, VCI).
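Direct lookup is a single memory reference. A minimal sketch, with made-up table contents:

```python
# The VCI indexes a table whose entry holds the outgoing (port, new VCI).

VC_TABLE = [None] * 4096            # one slot per possible incoming VCI
VC_TABLE[42] = (3, 77)              # VCI 42 -> output port 3, new VCI 77

def switch_cell(vci, payload):
    entry = VC_TABLE[vci]           # single memory reference
    if entry is None:
        return None                 # unknown VC: drop the cell
    out_port, new_vci = entry       # replace old VCI with new
    return (out_port, new_vci, payload)

print(switch_cell(42, b"cell"))     # (3, 77, b'cell')
```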
Forwarding Decisions
• ATM and MPLS switches
– Direct Lookup
• Bridges and Ethernet switches
– Associative Lookup
– Hashing
– Trees and tries
• IP Routers
– Caching
– CIDR
– Patricia trees/tries
– Other methods
• Packet Classification
Bridges and Ethernet Switches
Associative Lookups
The 48-bit network address is presented as search data to an associative memory (CAM); all entries are compared in parallel, and a hit returns the associated data (a log2N-bit address).

Advantages:
• Simple

Disadvantages:
• Slow
• High power
• Small
• Expensive
Bridges and Ethernet Switches
Hashing
A hashing function reduces the 48-bit search data to a 16-bit memory address; the stored entry is compared with the key (hit?) and the associated data (log2N bits) is returned.
Lookups Using Hashing
An example
Example: a CRC-16 hashing function selects one of 2^16 buckets in memory; entries that collide (#1, #2, #3, #4) are kept on linked lists and searched sequentially until a hit.
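The bucketed scheme above can be sketched as follows. CRC-16 is named on the slide; the standard library's CRC-32 truncated to 16 bits stands in for it here, which is an assumption of this sketch.

```python
import binascii

NUM_BUCKETS = 1 << 16                  # 16-bit hash index

def bucket(mac: bytes) -> int:
    # stand-in hash: CRC-32 truncated to 16 bits
    return binascii.crc32(mac) & 0xFFFF

table = [[] for _ in range(NUM_BUCKETS)]   # linked lists -> Python lists

def insert(mac: bytes, port: int):
    table[bucket(mac)].append((mac, port))

def lookup(mac: bytes):
    for entry_mac, port in table[bucket(mac)]:   # walk the chain
        if entry_mac == mac:
            return port
    return None
```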
Lookups Using Hashing
Performance of simple example

E_R = (1/2) [ 1 + 1 / (1 − (1 − 1/N)^M) ]

Where:
E_R = Expected number of memory references
M = Number of memory addresses in table
N = Number of linked lists
ρ = M/N
Lookups Using Hashing
Advantages:
• Simple
• Expected lookup time can be small
Disadvantages
• Non-deterministic lookup time
• Inefficient use of memory
Trees and Tries
Binary Search Tree
A binary search tree over N entries is log2N deep, branching on < / > comparisons at each node. A binary search trie instead branches on successive bits (0/1) of the key; e.g., entries 010 and 111 sit at the leaves.
Trees and Tries
Multiway tries
16-ary Search Trie
Each node holds 16 entries of the form (nibble, pointer); e.g., 000011110000 is looked up nibble by nibble, following the 0000, then 1111, then 0000 entries, and 111111111111 follows the 1111 entries.
Trees and Tries
Multiway tries
E_n = 1 + Σ_{i=1}^{L−1} D^i [ 1 − (1 − D^{−i})^N ]
E_w = D·E_n − (E_n − 1) − N

Where:
L = Number of layers/references
N = Number of entries in table
D = Degree of tree
E_n = Expected number of nodes
E_w = Expected amount of wasted memory (entries allocated minus pointers and entries in use)

Degree of  # Mem        # Nodes   Total Memory  Fraction
Tree       References   (x10^6)   (Mbytes)      Wasted (%)
2          48           1.09      4.3           49
4          24           0.53      4.3           73
8          16           0.35      5.6           86
16         12           0.25      8.3           93
64         8            0.17      21            98
256        6            0.12      64            99.5

Table produced from 2^15 randomly generated 48-bit addresses
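The node-count expression can be evaluated numerically for the table's parameters (D = 2, L = 48, N = 2^15). The wasted-memory accounting below (D allocated entries per node, minus child pointers and leaf entries in use) is this sketch's reading of the formula, not a quote of the slide.

```python
# Expected nodes in a D-ary trie of depth L holding N random keys:
# a level-i node exists iff at least one key falls under it.

def expected_nodes(D, L, N):
    return 1 + sum(D**i * (1 - (1 - D**-i)**N) for i in range(1, L))

def wasted_fraction(D, L, N):
    en = expected_nodes(D, L, N)
    used = (en - 1) + N            # child pointers + leaf entries in use
    return (D * en - used) / (D * en)

print(expected_nodes(2, 48, 2**15))    # roughly 1.09 million nodes
print(wasted_fraction(2, 48, 2**15))   # roughly 0.49, as in the table
```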
Forwarding Decisions
• ATM and MPLS switches
– Direct Lookup
• Bridges and Ethernet switches
– Associative Lookup
– Hashing
– Trees and tries
• IP Routers
– Caching
– CIDR
– Patricia trees/tries
– Other methods
• Packet Classification
Caching Addresses
Fast path: the line card's local cache of recently used addresses answers most lookups. Slow path: on a cache miss, the packet goes across the bus to the CPU and the full forwarding table in shared buffer memory.
Caching Addresses
LAN: average flow < 40 packets
WAN: huge number of flows

Chart: cache hit rate (0–100%) with a cache holding 10% of the full table.
IP Routers
Class-based addresses
The IP address space is divided into Classes A, B, C (and D); the routing-table lookup is an exact match on the class network, e.g., 212.17.9.4 matches entry 212.17.9.0 → Port 4.
IP Routers
CIDR
Class-based: the 0 … 2^32−1 space is carved into fixed A/B/C/D regions.
Classless (CIDR): prefixes of arbitrary length, e.g., 65/8, 128.9/16 (2^16 addresses), 142.12/19; address 128.9.16.14 lies inside 128.9/16.
IP Routers
CIDR
Prefixes nest: 128.9.16/20 and 128.9.176/20 sit inside 128.9/16, and 128.9.19/24 and 128.9.25/24 inside 128.9.16/20. For 128.9.16.14 the most specific route wins: the "longest matching prefix".
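"Most specific wins" can be sketched with a linear scan over the example prefixes. Real routers use tries or TCAMs; this only illustrates the rule.

```python
import ipaddress

PREFIXES = ["128.9.0.0/16", "128.9.16.0/20", "128.9.176.0/20",
            "128.9.19.0/24", "128.9.25.0/24"]
NETS = [ipaddress.ip_network(p) for p in PREFIXES]

def longest_prefix_match(addr: str):
    a = ipaddress.ip_address(addr)
    matches = [n for n in NETS if a in n]          # all covering prefixes
    return max(matches, key=lambda n: n.prefixlen, default=None)

print(longest_prefix_match("128.9.16.14"))   # 128.9.16.0/20
```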
IP Routers
Metrics for Lookups
Example: look up 128.9.16.14 in the table

Prefix        Port
65/8          3
128.9/16      5
128.9.16/20   2
128.9.19/24   7
128.9.25/24   10
128.9.176/20  1
142.12/19     3

• Lookup time
• Storage space
• Update time
• Preprocessing time
IP Router
Lookup
The forwarding engine extracts the destination address from the header of each incoming packet and performs the next-hop computation against the forwarding table (destination → next hop). This is IPv4 unicast destination address based lookup.
Need more than IPv4 unicast
lookups
• Multicast
– PIM-SM: longest prefix matching on the source and group address; try (S,G), then (*,G), then (*,*,RP); check the incoming interface
– DVMRP: incoming interface check followed by (S,G) lookup
• IPv6
– 128-bit destination address field
– Exact address architecture not yet known
Lookup Performance Required
Line    Line Rate   Pkt size = 40B   Pkt size = 240B
T1      1.5 Mb/s    4.68 Kpps        0.78 Kpps
OC3     155 Mb/s    480 Kpps         80 Kpps
OC12    622 Mb/s    1.94 Mpps        323 Kpps
OC48    2.5 Gb/s    7.81 Mpps        1.3 Mpps
OC192   10 Gb/s     31.25 Mpps       5.21 Mpps

Gigabit Ethernet (84B packets): 1.49 Mpps
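The table's arithmetic is simply line rate divided by packet size in bits:

```python
# Packets per second required at a given line rate and packet size.

def pps(rate_bps, pkt_bytes):
    return rate_bps / (8 * pkt_bytes)

print(pps(2.5e9, 40))    # OC48, 40B packets: ~7.81 Mpps
print(pps(10e9, 40))     # OC192, 40B packets: 31.25 Mpps
print(pps(1e9, 84))      # GigE, 84B minimum frame incl. overhead: ~1.49 Mpps
```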
Size of the Routing Table
Source: http://www.telstra.net/ops/bgptable.html
Ternary CAMs
Associative memory with a mask per entry:

Value      Mask             Next Hop
10.0.0.0   255.0.0.0        R1
10.1.0.0   255.255.0.0      R2
10.1.1.0   255.255.255.0    R3
10.1.3.0   255.255.255.0    R4
10.1.3.1   255.255.255.255  R4

All entries are compared in parallel; a priority encoder selects the highest-priority matching entry.
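A software model of the ternary match, using the table's entries. The "priority encoder" returns the first hit, so the entries are stored most-specific first here — an ordering assumption of this sketch.

```python
import ipaddress

TCAM = [  # (value, mask, next hop), most specific first
    ("10.1.3.1", "255.255.255.255", "R4"),
    ("10.1.3.0", "255.255.255.0",   "R4"),
    ("10.1.1.0", "255.255.255.0",   "R3"),
    ("10.1.0.0", "255.255.0.0",     "R2"),
    ("10.0.0.0", "255.0.0.0",       "R1"),
]

def tcam_lookup(addr: str):
    a = int(ipaddress.ip_address(addr))
    for value, mask, hop in TCAM:          # priority encoder: first hit wins
        v = int(ipaddress.ip_address(value))
        m = int(ipaddress.ip_address(mask))
        if a & m == v:
            return hop
    return None

print(tcam_lookup("10.1.1.9"))    # R3
print(tcam_lookup("10.2.0.1"))    # R1
```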
Binary Tries
A binary trie branches one bit per level; the prefixes a–j below are stored at the corresponding nodes.

Example Prefixes
a) 00001   b) 00010   c) 00011   d) 001   e) 0101
f) 011     g) 100     h) 1010    i) 1100  j) 11110000
Patricia Tree
The same prefixes in a Patricia tree: chains of one-way branches are path-compressed (e.g., Skip=5 on the path to j = 11110000).

Example Prefixes
a) 00001   b) 00010   c) 00011   d) 001   e) 0101
f) 011     g) 100     h) 1010    i) 1100  j) 11110000
Patricia Tree
Disadvantages
• Many memory accesses
• May need backtracking
• Pointers take up a lot of
space
Advantages
• General Solution
• Extensible to wider
fields
Avoid backtracking by storing the intermediate-best matched prefix.
(Dynamic Prefix Tries)
40K entries: 2MB data structure with 0.3-0.5 Mpps [O(W)]
Binary search on trie levels
Binary search over trie levels (e.g., levels 0, 8, …, 29): probe a middle level for a matching prefix P, then move deeper on a hit or shallower on a miss.
Binary search on trie levels
Store a hash table for each prefix length
to aid search at a particular trie level.
Length   Hash table
8        10
16       10.1, 10.2
24       10.1.1, 10.1.2, 10.2.3

Example Prefixes: 10.0.0.0/8, 10.1.0.0/16, 10.1.1.0/24, 10.1.2.0/24, 10.2.3.0/24
Example Addresses: 10.1.1.4, 10.4.4.3, 10.2.3.9, 10.2.4.8
Binary search on trie levels
Disadvantages
• Multiple hashed memory
accesses.
• Updates are complex.
Advantages
• Scalable to IPv6.
33K entries: 1.4MB data structure with 1.2-2.2 Mpps [O(log W)]
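The scheme can be sketched on the slide's examples: one hash table per prefix length, with "marker" entries at shorter lengths that carry the best matching prefix (bmp) found so far, so a miss at a long level can fall back correctly. This is a simplified sketch of the idea, not the paper's full algorithm.

```python
import ipaddress

PREFIXES = ["10.0.0.0/8", "10.1.0.0/16", "10.1.1.0/24",
            "10.1.2.0/24", "10.2.3.0/24"]
LENGTHS = [8, 16, 24]
nets = [ipaddress.ip_network(p) for p in PREFIXES]
tables = {l: {} for l in LENGTHS}           # length -> {truncated addr: net}

def trunc(value, l):
    return value >> (32 - l)

for n in sorted(nets, key=lambda m: m.prefixlen):
    tables[n.prefixlen][trunc(int(n.network_address), n.prefixlen)] = n
    for l in LENGTHS:                       # leave markers at shorter levels
        k = trunc(int(n.network_address), l)
        if l < n.prefixlen and k not in tables[l]:
            addr = ipaddress.ip_address(k << (32 - l))
            bmp = max((m for m in nets if m.prefixlen <= l and addr in m),
                      key=lambda m: m.prefixlen, default=None)
            tables[l][k] = bmp              # marker carrying the bmp

def lookup(addr):
    a = int(ipaddress.ip_address(addr))
    lo, hi, best = 0, len(LENGTHS) - 1, None
    while lo <= hi:                         # binary search on prefix length
        mid = (lo + hi) // 2
        hit = tables[LENGTHS[mid]].get(trunc(a, LENGTHS[mid]))
        if hit is not None:
            best, lo = hit, mid + 1         # matched: try longer prefixes
        else:
            hi = mid - 1                    # missed: try shorter prefixes
    return best

print(lookup("10.1.1.4"))    # 10.1.1.0/24
```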
Compacting Forwarding Tables
A trie level is encoded as a bit vector (e.g., 1 0 0 0 1 1 0 0 0 1 1 1 1 1) marking which nodes exist, replacing per-node pointers with counting of set bits.
Compacting Forwarding Tables
The bit vector 10001010 11100010 10000010 10110100 is divided into chunks. A codeword array holds (next-hop record, offset) pairs — (R1,0), (R2,3), (R3,7), (R4,9), (R5,0) — and a base index array (0, 13) locates each group, so the next-hop index is computed from base, offset, and a count of set bits in the chunk.
Compacting Forwarding Tables
Disadvantages
• Scalability to larger
tables?
• Updates are complex.
Advantages
• Extremely small data
structure - can fit in
cache.
33K entries: 160KB data structure with average 2Mpps [O(W/k)]
Multi-bit Tries
16-ary Search Trie
Each node holds 16 entries of the form (nibble, pointer); e.g., 000011110000 is looked up nibble by nibble, following the 0000, then 1111, then 0000 entries.
Compressed Tries
Only 3 memory accesses
Strides of 8 bits: levels L8, L16 and L24.
Routing Lookups in Hardware
Chart: number of prefixes vs. prefix length. Most prefixes are 24 bits or shorter.
Routing Lookups in Hardware
Prefixes up to 24-bits
Prefixes up to 24 bits go in a single table of 2^24 = 16M entries: the top 24 bits of the address (e.g., 142.19.6 of 142.19.6.14) index the next hop directly — one memory reference.
Routing Lookups in Hardware
Prefixes up to 24-bits
For prefixes above 24 bits, the first-table entry (e.g., for 128.3.72) holds a pointer instead: a base into a second table, offset by the low 8 bits of the address (44 in 128.3.72.44) to reach the next hop — a second memory reference.
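A minimal model of the two-table scheme, with a dict standing in for the 16M-entry memory; table contents are illustrative.

```python
# First table: 2^24 entries indexed by the top 24 address bits; an entry
# either holds a next hop or points at a 256-entry second-stage block.

TBL24 = {}          # dict stands in for the 16M-entry memory
TBLlong = []        # second-stage blocks, 256 entries each

def add_short(prefix24, hop):
    TBL24[prefix24] = ("hop", hop)

def add_long_block(prefix24, default_hop):
    base = len(TBLlong)
    TBLlong.extend([default_hop] * 256)
    TBL24[prefix24] = ("ptr", base)
    return base

def lookup(addr: int):
    kind, val = TBL24.get(addr >> 8, ("hop", None))
    if kind == "hop":
        return val                       # one memory reference
    return TBLlong[val + (addr & 0xFF)]  # second reference, >24-bit prefixes

# 128.3.72.0/24 -> "A", except host route 128.3.72.44/32 -> "B"
base = add_long_block((128 << 16) | (3 << 8) | 72, "A")
TBLlong[base + 44] = "B"
print(lookup((128 << 24) | (3 << 16) | (72 << 8) | 44))   # B
print(lookup((128 << 24) | (3 << 16) | (72 << 8) | 45))   # A
```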
Routing Lookups in Hardware
Prefixes up to n-bits
2n entries:
0
i
N
i

m
2
 entries
j
Prefixes
longer than
N+M bits
Next Hop
N+M
Copyright 1999. All Rights Reserved
56
Routing Lookups in Hardware
Disadvantages
• Large memory required
(9-33MB)
• Depends on prefix-length
distribution.
Advantages
• 20Mpps with 50ns
DRAM
• Easy to implement in
hardware
Various compression schemes can be employed to decrease the
storage requirements: e.g. employ carefully chosen variable length
strides, bitmap compression etc.
IP Router Lookups
References
• A. Brodnik, S. Carlsson, M. Degermark, S. Pink. “Small Forwarding
Tables for Fast Routing Lookups”, Sigcomm 1997, pp 3-14.
• B. Lampson, V. Srinivasan, G. Varghese. “IP lookups using multiway
and multicolumn search”, Infocom 1998, pp 1248-56, vol. 3.
• M. Waldvogel, G. Varghese, J. Turner, B. Plattner. “Scalable high
speed IP routing lookups”, Sigcomm 1997, pp 25-36.
• P. Gupta, S. Lin, N. McKeown. “Routing lookups in hardware at
memory access speeds”, Infocom 1998, pp 1241-1248, vol. 3.
• S. Nilsson, G. Karlsson. “Fast address lookup for Internet routers”,
IFIP Intl Conf on Broadband Communications, Stuttgart, Germany,
April 1-3, 1998.
• V. Srinivasan, G.Varghese. “Fast IP lookups using controlled prefix
expansion”, Sigmetrics, June 1998.
Forwarding Decisions
• ATM and MPLS switches
– Direct Lookup
• Bridges and Ethernet switches
– Associative Lookup
– Hashing
– Trees and tries
• IP Routers
– Caching
– CIDR
– Patricia trees/tries
– Other methods
• Packet Classification
Providing Value-Added Services
Some examples
• Differentiated services
– Regard traffic from Autonomous System #33 as `platinum-grade’
• Access Control Lists
– Deny udp host 194.72.72.33 194.72.6.64 0.0.0.15 eq snmp
• Committed Access Rate
– Rate limit WWW traffic from subinterface#739 to 10Mbps
• Policy-based Routing
– Route all voice traffic through the ATM network
Packet Classification
The forwarding engine matches each incoming packet's header against a classifier (policy database) of predicate → action rules, and applies the action of the matching rule.
Multi-field Packet Classification
         Field 1            Field 2           …   Field k   Action
Rule 1   152.163.190.69/21  152.163.80.11/32  …   UDP       A1
Rule 2   152.168.3.0/24     152.163.0.0/16    …   TCP       A2
…        …                  …                 …   …         …
Rule N   152.168.0.0/16     152.0.0.0/8       …   ANY       An
Given a classifier with N rules, find the action associated
with the highest priority rule matching an incoming
packet.
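Sequential evaluation of such a table is the simplest scheme (and the baseline of the comparison slides that follow). A sketch using field values from the example rows; the proto field and default action are assumptions.

```python
import ipaddress

RULES = [  # (src prefix, dst prefix, proto, action), priority = position
    ("152.163.190.69/21", "152.163.80.11/32", "UDP", "A1"),
    ("152.168.3.0/24",    "152.163.0.0/16",   "TCP", "A2"),
    ("152.168.0.0/16",    "152.0.0.0/8",      "ANY", "An"),
]

def matches(rule, src, dst, proto):
    r_src, r_dst, r_proto, _ = rule
    return (ipaddress.ip_address(src) in ipaddress.ip_network(r_src, strict=False)
            and ipaddress.ip_address(dst) in ipaddress.ip_network(r_dst, strict=False)
            and r_proto in ("ANY", proto))

def classify(src, dst, proto):
    for rule in RULES:                 # highest-priority matching rule wins
        if matches(rule, src, dst, proto):
            return rule[3]
    return "default"

print(classify("152.168.3.5", "152.163.1.1", "TCP"))   # A2
```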
Geometric Interpretation in 2D
Each rule defines a rectangle in (Field #1, Field #2) space: a prefix pair such as (144.24/16, 64/24) is a box, while (128.16.46.23, *) is a line. A packet is a point (P1, P2); classification finds the highest-priority rectangle among R1–R7 containing it.
Proposed Schemes
Sequential Evaluation
  Pros: Small storage; scales well with number of fields
  Cons: Slow classification rates

Ternary CAMs
  Pros: Single-cycle classification
  Cons: Cost, density, power consumption

Grid of Tries (Srinivasan et al [Sigcomm 98])
  Pros: Small storage requirements and fast lookup rates for two fields; suitable for big classifiers
  Cons: Not easily extendible to more than two fields
Proposed Schemes (Contd.)
Crossproducting (Srinivasan et al [Sigcomm 98])
  Pros: Fast accesses; suitable for multiple fields
  Cons: Large memory requirements; suitable without caching only for classifiers with fewer than 50 rules

Bit-level Parallelism (Lakshman and Stiliadis [Sigcomm 98])
  Pros: Suitable for multiple fields
  Cons: Large memory bandwidth required; comparatively slow lookup rate; hardware only
Proposed Schemes (Contd.)
Hierarchical Intelligent Cuttings (Gupta and McKeown [HotI 99])
  Pros: Suitable for multiple fields; small memory requirements; good update time
  Cons: Large preprocessing time

Tuple Space Search (Srinivasan et al [Sigcomm 99])
  Pros: Suitable for multiple fields; the basic scheme has good update times and memory requirements
  Cons: Classification rate can be low; requires perfect hashing for determinism

Recursive Flow Classification (Gupta and McKeown [Sigcomm 99])
  Pros: Fast accesses; suitable for multiple fields; reasonable memory requirements for real-life classifiers
  Cons: Large preprocessing time and memory requirements for large classifiers
Grid of Tries
A trie over dimension 1 (e.g., destination prefix) whose nodes point into tries over dimension 2 (source prefix) storing rules R1–R7; precomputed switch pointers let the search proceed without backtracking.
Grid of Tries
Disadvantages
• Static solution
• Not easy to extend to
higher dimensions
Advantages
• Good solution for two
dimensions
20K entries: 2MB data structure with 9 memory accesses [at most 2W]
Classification using Bit Parallelism
Each field lookup returns a bit vector with one bit per rule (R1–R4); intersecting the per-field vectors yields the rules matching on all fields, and the highest-priority one wins.
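The bit-vector intersection can be sketched in a few lines; the rule ranges below are made-up examples, not from the tutorial.

```python
# Each field lookup yields a bitmap of the rules whose range covers the
# value; AND-ing the per-field bitmaps and taking the lowest set bit
# (highest priority) gives the matching rule.

RULES = ["R1", "R2", "R3", "R4"]
FIELD1 = [(0, 50), (0, 100), (60, 90), (0, 100)]    # per-rule range, field 1
FIELD2 = [(10, 20), (0, 40), (0, 100), (50, 60)]    # per-rule range, field 2

def field_bitmap(intervals, value):
    bm = 0
    for i, (lo, hi) in enumerate(intervals):
        if lo <= value <= hi:
            bm |= 1 << i                            # bit i = rule i matches
    return bm

def classify(v1, v2):
    bm = field_bitmap(FIELD1, v1) & field_bitmap(FIELD2, v2)
    if bm == 0:
        return None
    return RULES[(bm & -bm).bit_length() - 1]       # lowest set bit wins

print(classify(30, 15))    # field1 -> {R1,R2,R4}, field2 -> {R1,R2,R3}: R1
```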
Classification using Bit Parallelism
Disadvantages
• Large memory
bandwidth
• Hardware optimized
Advantages
• Good solution for
multiple dimensions
for small classifiers
512 rules: 1Mpps with single FPGA and 5 128KB SRAM chips.
Classification Using Multiple Fields
Recursive Flow Classification
The S = 128 header bits are reduced to a T = 12-bit action identifier in stages: chunks of the header (F1 … Fn) index small memories, and partial results are recursively combined (2^128 → 2^64 → 2^24 → 2^12).
Packet Classification
References
• T.V. Lakshman, D. Stiliadis. “High speed policy based packet
forwarding using efficient multi-dimensional range matching”,
Sigcomm 1998, pp 191-202.
• V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel. “Fast and
scalable layer 4 switching”, Sigcomm 1998, pp 203-214.
• V. Srinivasan, G. Varghese, S. Suri. “Fast packet classification using
tuple space search”, to be presented at Sigcomm 1999.
• P. Gupta, N. McKeown, “Packet classification using hierarchical
intelligent cuttings”, Hot Interconnects VII, 1999.
• P. Gupta, N. McKeown, “Packet classification on multiple fields”,
Sigcomm 1999.
Tutorial Outline
• Introduction:
What is a Packet Switch?
• Packet Lookup and Classification:
Where does a packet go next?
• Switching Fabrics:
How does the packet get there?
• Output Scheduling:
When should the packet leave?
Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
– Scheduling algorithms
– Combining input and output queues
– Other non-blocking fabrics
– Multicast traffic
Basic Architectural Components
Datapath: per-packet processing
1. Forwarding Decision (per-port Forwarding Table)
2. Interconnect
3. Output Scheduling
Interconnects
Two basic techniques
• Input queueing: usually a non-blocking switch fabric (e.g. crossbar)
• Output queueing: usually a fast bus
Interconnects
Output Queueing
Individual output queues (1 … N): memory b/w = (N+1)·R per queue.
Centralized shared memory: memory b/w = 2N·R.
Output Queueing
The “ideal”
Cells arriving for the same output (labeled 1, 2) are queued at that output and depart in arrival order; no cell ever waits behind a cell bound elsewhere.
Output Queueing
How fast can we make centralized shared memory?
With 5ns SRAM and a wide (200-byte) shared bus:
• 5ns per memory operation
• Two memory operations per packet
• Therefore, up to 160Gb/s
• In practice, closer to 80Gb/s
Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
– Scheduling algorithms
– Other non-blocking fabrics
– Combining input and output queues
– Multicast traffic
Interconnects
Input Queueing with Crossbar
Each input buffers its packets locally (memory b/w = 2R); a scheduler computes the crossbar configuration carrying data in to data out.
Input Queueing
Head of Line Blocking

Chart: delay vs. offered load. With a single FIFO per input, head-of-line blocking saturates throughput at 58.6% (uniform traffic).
Head of Line Blocking
Input Queueing
Virtual output queues
Input Queues
Virtual Output Queues

Chart: delay vs. offered load. With virtual output queues, the load axis extends to 100%.
Input Queueing
Memory b/w is still 2R, but the scheduler can be quite complex!
Input Queueing
Scheduling
Model: input i (1 … m) receives arrivals A_i(t), split per output into VOQs Q(i,j) with arrivals A_{i,j}(t); each slot the scheduler picks a matching M, and output j (1 … n) delivers departures D_j(t).
Input Queueing
Inputs 1–4 request outputs 1–4 with queue weights (7, 2, 4, 2, 5, 2 on the edges of the request graph); the scheduler selects a bipartite matching (shown: weight = 18).

Question: Maximum weight or maximum size?
Input Queueing
Scheduling
• Maximum Size
– Maximizes instantaneous throughput
– Does it maximize long-term throughput?
• Maximum Weight
– Can clear most backlogged queues
– But does it sacrifice long-term throughput?
Input Queueing
Scheduling
Example: a 2×2 request pattern in which inputs 1 and 2 each hold cells for outputs 1 and 2.
Input Queueing
Longest Queue First or
Oldest Cell First
Weight = queue length (LQF) or waiting time (OCF). Example: requests between inputs 1–4 and outputs 1–4 with weights {1, 10, 10, 1}; the maximum-weight match serves the long/old queues and sustains 100% throughput.
Input Queueing
Why is serving long/old queues better than
serving maximum number of queues?
Charts: average VOQ occupancy per VOQ #, under uniform and non-uniform traffic.
• When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
• When traffic is non-uniform, some queues become longer than others.
• A good algorithm keeps the queue lengths matched, and services a large number of queues.
Input Queueing
Practical Algorithms
• Maximal Size Algorithms
– Wave Front Arbiter (WFA)
– Parallel Iterative Matching (PIM)
– iSLIP
• Maximal Weight Algorithms
– Fair Access Round Robin (FARR)
– Longest Port First (LPF)
Wave Front Arbiter
A wavefront sweeps the matrix of requests (inputs 1–4 × outputs 1–4) diagonal by diagonal; cells on one diagonal lie in distinct rows and columns, so they are resolved in parallel, and a request is matched if its row and column are still free.
Wave Front Arbiter
Implementation
An N×N array of combinational logic blocks, one per (input, output) pair — (1,1) … (4,4) — implements the diagonal sweep directly in hardware.
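The diagonal sweep can be sketched in software as follows (a sequential model of what the hardware does in parallel per diagonal):

```python
# Wave Front Arbiter sweep: requests on the same anti-diagonal are in
# distinct rows and columns, so each diagonal is resolved independently.

def wavefront(requests, N):
    match = {}
    row_used, col_used = set(), set()
    for d in range(2 * N - 1):             # 2N-1 diagonals, top-left first
        for i in range(N):
            j = d - i
            if (0 <= j < N and (i, j) in requests
                    and i not in row_used and j not in col_used):
                match[i] = j
                row_used.add(i)
                col_used.add(j)
    return match

print(wavefront({(0, 1), (1, 1), (1, 0), (2, 2)}, 4))   # {0: 1, 1: 0, 2: 2}
```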
Wave Front Arbiter
Wrapped WFA (WWFA)
Wrapping the diagonals around lets the match complete in N steps instead of 2N−1.
Input Queueing
Practical Algorithms
• Maximal Size Algorithms
– Wave Front Arbiter (WFA)
– Parallel Iterative Matching (PIM)
– iSLIP
• Maximal Weight Algorithms
– Fair Access Round Robin (FARR)
– Longest Port First (LPF)
Parallel Iterative Matching
Random Selection

Iteration #1 — Requests: each unmatched input requests every output it has cells for. Grant: each output randomly selects one of its requests. Accept/Match: each input randomly accepts one of its grants.
Iteration #2: repeat among the inputs and outputs left unmatched.
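The request/grant/accept rounds above can be sketched as follows; the data representation (a set of (input, output) request pairs) is an assumption of this sketch.

```python
import random

def pim(requests, rng=random.Random(0)):
    """Parallel Iterative Matching: request -> random grant -> random accept,
    iterated until no unmatched input/output pair with a request remains."""
    match = {}                                    # input -> output
    while True:
        free_in = {i for i, o in requests if i not in match}
        used_out = set(match.values())
        grants = {}                               # input -> [granting outputs]
        for o in {o for _, o in requests if o not in used_out}:
            cands = [i for i, oo in requests if oo == o and i in free_in]
            if cands:                             # output grants one request
                grants.setdefault(rng.choice(cands), []).append(o)
        if not grants:
            return match                          # maximal: nothing to add
        for i, outs in grants.items():            # input accepts one grant
            match[i] = rng.choice(outs)

print(sorted(pim({(0, 0), (0, 1), (1, 0), (2, 2)}).items()))
```

Depending on the random choices, this can terminate with a maximal match smaller than the maximum one, which is exactly the point of the next slide.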
Parallel Iterative Matching
Maximal is not Maximum
Example: a request pattern for which PIM's random choices can converge to a maximal match that is smaller than the maximum possible match.
Parallel Iterative Matching
Analytical Results
Number of iterations to converge:
E[U_i] ≤ N² / 4^i
E[C] ≤ log N

Where:
C = number of iterations required to resolve connections
N = number of ports
U_i = number of unresolved connections after iteration i
Input Queueing
Practical Algorithms
• Maximal Size Algorithms
– Wave Front Arbiter (WFA)
– Parallel Iterative Matching (PIM)
– iSLIP
• Maximal Weight Algorithms
– Fair Access Round Robin (FARR)
– Longest Port First (LPF)
iSLIP
Round-Robin Selection

Like PIM, but grant and accept use round-robin pointers instead of random selection: iteration #1 — Requests; Grant (each output grants the next requesting input at or after its grant pointer); Accept/Match (each input accepts the next granting output at or after its accept pointer). Iteration #2 repeats among unmatched ports.
108
iSLIP
Properties
• Random under low load
• TDM under high load
• Lowest priority to MRU (most recently used)
• 1 iteration: fair to outputs
• Converges in at most N iterations; on average ≤ log2N
• Implementation: N priority encoders
• Up to 100% throughput for uniform traffic
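A sketch of one cell time of iSLIP, with the pointer-update rule (pointers move only on first-iteration matches) that produces the desynchronization behavior listed above:

```python
def islip_schedule(requests, N, grant_ptr, accept_ptr, iterations=None):
    """One cell time of iSLIP. requests: set of (input, output) pairs;
    grant_ptr/accept_ptr are the persistent round-robin pointers."""
    iterations = iterations or N
    match = {}                                   # input -> output
    for it in range(iterations):
        grants = {}                              # input -> [granting outputs]
        for o in range(N):                       # grant phase
            if o in match.values():
                continue
            for k in range(N):                   # next requester at/after ptr
                i = (grant_ptr[o] + k) % N
                if (i, o) in requests and i not in match:
                    grants.setdefault(i, []).append(o)
                    break
        newly = False
        for i, outs in grants.items():           # accept phase
            for k in range(N):                   # next granter at/after ptr
                o = (accept_ptr[i] + k) % N
                if o in outs:
                    match[i] = o
                    if it == 0:                  # update only on iteration 1
                        grant_ptr[o] = (i + 1) % N
                        accept_ptr[i] = (o + 1) % N
                    newly = True
                    break
        if not newly:
            break
    return match
```

Under full uniform load the pointers desynchronize after a few cell times, and the schedules rotate like TDM.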
iSLIP
Implementation: N programmable priority encoders — a grant arbiter per output and an accept arbiter per input, each holding log2N bits of pointer state updated by the grant/accept decisions.
Input Queueing
References
• M. Karol et al. “Input vs Output Queueing on a Space-Division Packet
Switch”, IEEE Trans Comm., Dec 1987, pp. 1347-1356.
• Y. Tamir, “Symmetric Crossbar arbiters for VLSI communication
switches”, IEEE Trans Parallel and Dist Sys., Jan 1993, pp.13-27.
• T. Anderson et al. “High-Speed Switch Scheduling for Local Area
Networks”, ACM Trans Comp Sys., Nov 1993, pp. 319-352.
• N. McKeown, “The iSLIP scheduling algorithm for Input-Queued
Switches”, IEEE Trans Networking, April 1999, pp. 188-201.
• C. Lund et al. “Fair prioritized scheduling in an input-buffered switch”,
Proc. of IFIP-IEEE Conf., April 1996, pp. 358-69.
• A. Mekkitikul et al. “A Practical Scheduling Algorithm to Achieve
100% Throughput in Input-Queued Switches”, IEEE Infocom 98, April
1998.
Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
– Scheduling algorithms
– Other non-blocking fabrics
– Combining input and output queues
– Multicast traffic
Other Non-Blocking Fabrics
Clos Network
Expansion factor required = 2-1/N (but still blocking for multicast)
Other Non-Blocking Fabrics
Self-Routing Networks
A self-routing (banyan) network: each stage routes on one bit of the destination port address (000–111), so a cell finds its own path to the output.
Other Non-Blocking Fabrics
Self-Routing Networks
The Non-blocking Batcher Banyan Network
A Batcher sorting network first sorts the cells by destination address; the sorted set then traverses the banyan self-routing network to outputs 000–111 without internal blocking.
• Fabric can be used as scheduler.
• Batcher-Banyan network is blocking for multicast.
Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
– Scheduling algorithms
– Other non-blocking fabrics
– Combining input and output queues
– Multicast traffic
Speedup
• Context
– input-queued switches
– output-queued switches
– the speedup problem
• Early approaches
• Algorithms
• Implementation considerations
Speedup: Context
A generic switch: buffer memory can sit at the inputs and/or the outputs of the fabric. The placement of memory gives
- Output-queued switches
- Input-queued switches
- Combined input- and output-queued switches
Output-queued switches
Best delay and throughput performance
- Possible to erect “bandwidth firewalls” between sessions
Main problem
- Requires high fabric speedup (S = N)
Unsuitable for high-speed switching
Input-queued switches
Big advantage
- Speedup of one is sufficient
Main problem
- Can’t guarantee delay due to input contention
Overcoming input contention: use higher speedup
A Comparison
Memory speeds for 32x32 switch
Line Rate   Output-queued              Input-queued
            Memory BW   Access Time    Memory BW   Access Time
                        per cell
100 Mb/s    3.3 Gb/s    128 ns         200 Mb/s    2.12 µs
1 Gb/s      33 Gb/s     12.8 ns        2 Gb/s      212 ns
2.5 Gb/s    82.5 Gb/s   5.12 ns        5 Gb/s      84.8 ns
10 Gb/s     330 Gb/s    1.28 ns        20 Gb/s     21.2 ns
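The table follows from two rules — output queueing needs (N+1)·R of memory bandwidth (N writes plus 1 read per cell time), input queueing needs 2R — with access times computed for 53-byte ATM cells (an assumption consistent with the numbers):

```python
CELL_BITS = 53 * 8          # one ATM cell

def memory_numbers(N, line_rate_bps):
    oq_bw = (N + 1) * line_rate_bps        # N writes + 1 read per cell time
    iq_bw = 2 * line_rate_bps              # 1 write + 1 read
    return oq_bw, CELL_BITS / oq_bw, iq_bw, CELL_BITS / iq_bw

oq_bw, oq_t, iq_bw, iq_t = memory_numbers(32, 100e6)
print(oq_bw / 1e9, oq_t * 1e9)   # 3.3 Gb/s, ~128 ns per access
print(iq_bw / 1e6, iq_t * 1e6)   # 200 Mb/s, ~2.12 us per access
```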
The Speedup Problem
Find a compromise: 1 < Speedup << N
- to get the performance of an OQ switch
- close to the cost of an IQ switch
Essential for high speed QoS switching
Some Early Approaches
Probabilistic Analyses
- assume traffic models (Bernoulli, Markov-modulated,
non-uniform loading, “friendly correlated”)
- obtain mean throughput and delays, bounds on tails
- analyze different fabrics (crossbar, multistage, etc)
Numerical Methods
- use actual and simulated traffic traces
- run different algorithms
- set the “speedup dial” at various values
The findings
Very tantalizing ...
- under different settings (traffic, loading, algorithm, etc)
- and even for varying switch sizes
A speedup of between 2 and 5 was sufficient!
Using Speedup
With a speedup of S the fabric runs S times faster than the line rate, moving up to S cells per slot between the input queues (1, 2, …) and the output queues.
Intuition
Bernoulli IID inputs
Speedup = 1
Fabric throughput = .58
Bernoulli IID inputs
Speedup = 2
Fabric throughput = 1.16
I/p efficiency, η = 1/1.16
Ave I/p queue = 6.25
Intuition (continued)
Bernoulli IID inputs
Speedup = 3
Fabric throughput = 1.74
Input efficiency = 1/1.74
Ave I/p queue = 1.35
Bernoulli IID inputs
Speedup = 4
Fabric throughput = 2.32
Input efficiency = 1/2.32
Ave I/p queue = 0.75
Issues
Need hard guarantees
- exact, not average
Robustness
- realistic, even adversarial, traffic
not friendly Bernoulli IID
The Ideal Solution
An OQ switch (speedup = N) vs. the goal: the same input–output behavior at speedup << N.
Question: Can we find
- a simple and good algorithm
- that exactly mimics output-queueing
- regardless of switch size and traffic pattern?
What is exact mimicking?
Apply same inputs to an OQ and a CIOQ switch
- packet by packet
Obtain same outputs
- packet by packet
Algorithm - MUCF (Most Urgent Cell First)
Key concept: urgency value
- urgency = departure time - present time
MUCF
The algorithm
- Outputs try to get their most urgent packets
- Inputs grant to output whose packet is most
urgent, ties broken by port number
- Losing outputs try for their next most urgent packet
- Algorithm terminates when no more matchings
are possible
Stable Marriage Problem
MUCF is a stable-marriage computation: outputs play the men (Bill, John, Pedro), inputs the women (Hillary, Monica, Maria).
An example
Observation: Only two reasons a packet doesn’t get to its output
- Input contention, Output contention
- This is why speedup of 2 works!!
What does this get us?
Speedup of 4 is sufficient for exact emulation of FIFO
OQ switches, with MUCF
What about non-FIFO OQ switches?
E.g. WFQ, Strict priority
Other results
To exactly emulate an NxN OQ switch
- Speedup of 2 - 1/N is necessary and sufficient
(Hence a speedup of 2 is sufficient for all N)
- Input traffic patterns can be absolutely arbitrary
- Emulated OQ switch may use any “monotone”
scheduling policy
- E.g.: FIFO, LIFO, strict priority, WFQ, etc
What gives?
Complexity of the algorithms
- Extra hardware for processing
- Extra run time (time complexity)
What is the benefit?
- Reduced memory bandwidth requirements
Tradeoff: Memory for processing
- Moore’s Law supports this tradeoff
Implementation - a closer look
Main sources of difficulty
- Estimating urgency, etc - info is distributed
(and communicating this info among I/ps and O/ps)
- Matching process - too many iterations?
Estimating urgency depends on what is being emulated
- Like taking a ticket to hold a place in a queue
- FIFO, Strict priorities - no problem
- WFQ, etc - problems
Implementation (contd)
Matching process
- A variant of the stable marriage problem
- Worst-case number of iterations for SMP = N²
- Worst-case number of iterations in switching = N
- With high probability, and on average, approximately log(N)
Other Work
Relax stringent requirement of exact emulation
- Least Occupied Output First Algorithm (LOOFA)
Keeps outputs busy whenever packets are present
By time-stamping packets, it can also exactly mimic an OQ switch
- Disallow arbitrary inputs
E.g. leaky-bucket constrained
Obtain worst-case delay bounds
References for speedup
- Y. Oie et al, “Effect of speedup in nonblocking packet switch’’, ICC 89.
- A.L. Gupta, N.D. Georganas, “Analysis of a packet switch with input
and output buffers and speed constraints”, Infocom 91.
- S-T. Chuang et al, “Matching output queueing with a combined input
and output queued switch”, IEEE JSAC, vol 17, no 6, 1999.
- B. Prabhakar, N. McKeown, “On the speedup required for combined input
and output queued switching”, Automatica, vol 35, 1999.
- P. Krishna et al, “On the speedup required for work-conserving crossbar
switches”, IEEE JSAC, vol 17, no 6, 1999.
- A. Charny, “Providing QoS guarantees in input buffered crossbar switches
with speedup”, PhD Thesis, MIT, 1998.
Switching Fabrics
• Output and Input Queueing
• Output Queueing
• Input Queueing
– Scheduling algorithms
– Other non-blocking fabrics
– Combining input and output queues
– Multicast traffic
Multicast Switching
• The problem
• Switching with crossbar fabrics
• Switching with other fabrics
Multicasting
[Figure: a cell arriving at one input of a 6-port switch is delivered to several outputs]
Crossbar fabrics: Method 1
Copy network + unicast switching
Copy networks
Increased hardware, increased input contention
Method 2
Use copying properties of crossbar fabric
No fanout-splitting: Easy, but low
throughput
Fanout-splitting: higher
throughput, but not as simple.
Leaves “residue”.
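One slot of fanout-splitting can be sketched as follows (a hypothetical illustration; the grants are assumed to come from some output scheduler, and the port numbers are made up):

```python
def serve_one_slot(hol_fanouts, grant):
    """One time slot of fanout-splitting at a crossbar.
    hol_fanouts: input -> set of outputs requested by its head-of-line
    multicast cell. grant: output -> input it chose this slot (each
    output serves at most one cell). Returns the residue left at each
    input: the part of the fanout not yet delivered."""
    residues = {}
    for inp, fanout in hol_fanouts.items():
        won = {out for out, g in grant.items() if g == inp and out in fanout}
        residues[inp] = fanout - won   # empty set => cell fully delivered
    return residues

# Two inputs contend for outputs 2 and 3; outputs 1 and 2 grant to
# input 0, output 3 grants to input 1.
hol = {0: {1, 2, 3}, 1: {2, 3}}
grants = {1: 0, 2: 0, 3: 1}
print(serve_one_slot(hol, grants))  # input 0 keeps {3}, input 1 keeps {2}
```

Without fanout-splitting, a cell would only depart once its entire fanout could be served in one slot, which is what depresses throughput.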
The effect of fanout-splitting
[Figure: performance of an 8x8 switch with and without fanout-splitting under uniform IID traffic]
Placement of residue
Key question: How should outputs grant requests?
(and hence decide placement of residue)
Residue and throughput
Result: Concentrating residue brings more new work
forward. Hence leads to higher throughput.
But, there are fairness problems to deal with.
This and other problems can be looked at in a unified
way by mapping the multicasting problem onto a
variation of Tetris.
Multicasting and Tetris
[Figure: residue from input ports 1-5 falling, Tetris-like, toward output ports 1-5]
Multicasting and Tetris
[Figure: the same residue, now concentrated on a few input ports]
Replication by recycling
Main idea: Make two copies at a time using a binary tree
with the input at the root and all possible destination outputs at
the leaves.
[Figure: a binary copy tree spanning outputs a-e, with intermediate copy stages x and y]
Replication by recycling (cont’d)
[Figure: receive → output-table lookup → transmit, with a recycle path back through the network and resequencing at the output]
Scaleable to large fanouts. Needs resequencing at outputs and
introduces variable delays.
References for Multicasting
• J. Hayes et al. “Performance analysis of a multicast
switch”, IEEE Trans. on Communications, vol 39, April
1991.
• B. Prabhakar et al. “Tetris models for multicast switches”,
Proc. of the 30th Annual Conference on Information
Sciences and Systems, 1996
• B. Prabhakar et al. “Multicast scheduling for input-queued
switches”, IEEE JSAC, 1997
• J. Turner, “An optimal nonblocking multicast virtual
circuit switch”, INFOCOM, 1994
Tutorial Outline
• Introduction:
What is a Packet Switch?
• Packet Lookup and Classification:
Where does a packet go next?
• Switching Fabrics:
How does the packet get there?
• Output Scheduling:
When should the packet leave?
Output Scheduling
• What is output scheduling?
• How is it done?
• Practical Considerations
Output Scheduling
Allocating output bandwidth
Controlling packet delay
[Figure: per-flow queues feeding a scheduler at the output link]
Output Scheduling
[Figure: a single FIFO queue vs. per-flow queues served by Fair Queueing]
Motivation
• FIFO is natural but gives poor QoS
– bursty flows increase delays for others
– hence cannot guarantee delays
• Need round-robin scheduling of packets:
– Fair Queueing
– Weighted Fair Queueing, Generalized Processor
Sharing
Fair queueing: Main issues
• Level of granularity
– packet-by-packet? (favors long packets)
– bit-by-bit? (ideal, but very complicated)
• Packet Generalized Processor Sharing (PGPS)
– serves packet-by-packet
– and imitates bit-by-bit schedule within a tolerance
How does WFQ work?
[Figure: three flows sharing one output link with weights WR = 1, WG = 5, WP = 2]
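The core idea can be sketched with per-flow virtual finish times F = F_prev + length/weight, serving packets in increasing F order (this is a self-clocked simplification of the GPS virtual-time computation, not full PGPS; the packet lengths below are invented, the weights follow the slide):

```python
import heapq

def wfq_order(packets, weights):
    """packets: list of (flow, length) in arrival order, all arriving
    at time 0. Each packet gets a virtual finish time
    F = F_prev(flow) + length / weight; serve smallest F first."""
    last_finish = {f: 0.0 for f in weights}
    heap = []
    for seq, (flow, length) in enumerate(packets):
        f = last_finish[flow] + length / weights[flow]
        last_finish[flow] = f
        heapq.heappush(heap, (f, seq, flow))  # seq breaks ties fairly
    order = []
    while heap:
        _, _, flow = heapq.heappop(heap)
        order.append(flow)
    return order

weights = {"R": 1, "G": 5, "P": 2}
pkts = [("R", 100), ("G", 100), ("P", 100), ("G", 100)]
# Heavily weighted G finishes its "bit-by-bit" service soonest,
# so both G packets go before P, and P before R.
print(wfq_order(pkts, weights))
```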
Delay guarantees
• Theorem
If flows are leaky bucket constrained and all nodes
employ GPS (WFQ), then the network can
guarantee worst-case delay bounds to sessions.
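For a single node, the bound takes a simple closed form (a sketch of the Parekh-Gallager single-node result, assuming a (σ, ρ) leaky-bucket constrained session served by GPS at guaranteed rate g ≥ ρ):

```latex
% Worst-case delay for a (\sigma,\rho)-constrained session
% served by GPS at guaranteed rate g \ge \rho (single node):
D^{*} \;\le\; \frac{\sigma}{g}
```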
Practical considerations
• For every packet, the scheduler needs to
– classify it into the right flow queue and maintain a
linked-list for each flow
– schedule it for departure
• Complexities of both are O(log [# of flows])
– first is hard to overcome
– second can be overcome by DRR
Deficit Round Robin
[Figure: per-flow queues of packet lengths, a deficit counter per flow, and a quantum size of 500]
Good approximation of FQ
Much simpler to implement
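The scheme can be sketched in a few lines (an illustrative Python version of Shreedhar and Varghese's idea; the queue contents and quantum are example values):

```python
from collections import deque

def drr(queues, quantum):
    """Deficit Round Robin sketch. queues: flow -> deque of packet
    lengths. Each round, every backlogged flow's deficit counter grows
    by `quantum`; a packet is sent only when the counter covers its
    length, and the counter is decremented by that length."""
    deficit = {f: 0 for f in queues}
    sent = []
    while any(queues.values()):
        for f, q in queues.items():
            if not q:
                deficit[f] = 0          # idle flows keep no credit
                continue
            deficit[f] += quantum
            while q and q[0] <= deficit[f]:
                pkt = q.popleft()
                deficit[f] -= pkt
                sent.append((f, pkt))
    return sent

# Flow A's 700-byte packet must wait a round to accumulate credit.
qs = {"A": deque([700, 200]), "B": deque([500, 500])}
print(drr(qs, 500))
```

Because each flow does only O(1) work per round, DRR sidesteps the O(log n) sorted-departure computation of WFQ.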
But...
• WFQ is still very hard to implement
– classification is a problem
– needs to maintain too much state information
– doesn’t scale well
Strict Priorities and Diff Serv
• Classify flows into priority classes
– maintain only per-class queues
– perform FIFO within each class
– avoid “curse of dimensionality”
Diff Serv
• A framework for providing differentiated QoS
– set Type of Service (ToS) bits in packet headers
– this classifies packets into classes
– routers maintain per-class queues
– condition traffic at network edges to conform to
class requirements
May still need queue management inside the network
References for O/p Scheduling
- A. Demers et al, “Analysis and simulation of a fair queueing algorithm”,
ACM SIGCOMM 1989.
- A. Parekh, R. Gallager, “A generalized processor sharing approach to
flow control in integrated services networks: the single node
case”, IEEE Trans. on Networking, June 1993.
- A. Parekh, R. Gallager, “A generalized processor sharing approach to
flow control in integrated services networks: the multiple node
case”, IEEE Trans. on Networking, August 1993.
- M. Shreedhar, G. Varghese, “Efficient Fair Queueing using Deficit Round
Robin”, ACM SIGCOMM, 1995.
- K. Nichols, S. Blake (eds), “Differentiated Services: Operational Model
and Definitions”, Internet Draft, 1998.
Active Queue Management
• Problems with traditional queue management
– tail drop
• Active Queue Management
– goals
– an example
– effectiveness
Tail Drop Queue Management
[Figure: a queue at its maximum length, with one flow locking out the others]
Tail Drop Queue Management
• Drop packets only when queue is full
– long steady-state delay
– global synchronization
– bias against bursty traffic
Global Synchronization
[Figure: the queue reaches its maximum length; many flows lose packets and back off at once]
Bias Against Bursty Traffic
[Figure: a burst arriving at a nearly full queue loses several packets in a row]
Alternative Queue Management
Schemes
• Drop from front on full queue
• Drop at random on full queue
– both solve the lock-out problem
– both have the full-queues problem
Active Queue Management
Goals
• Solve lock-out and full-queue problems
– no lock-out behavior
– no global synchronization
– no bias against bursty flows
• Provide better QoS at a router
– low steady-state delay
– lower packet dropping
Active Queue Management
• Problems with traditional queue management
– tail drop
• Active Queue Management
– goals
– an example
– effectiveness
Random Early Detection (RED)
[Figure: drop probability rising from P1 at minth to Pk at maxth as a function of qavg]
if qavg < minth: admit every packet
else if qavg <= maxth: drop an incoming
packet with p = (qavg - minth)/(maxth - minth)
else if qavg > maxth: drop every incoming
packet
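The drop rules above can be sketched directly (an illustrative version; the EWMA weight w is an assumption, not from the slides, and max_p generalizes the Pk ceiling in the figure):

```python
import random

def red_drop(qavg, minth, maxth, max_p=1.0):
    """Return True if the arriving packet should be dropped,
    following the RED rules on the slide."""
    if qavg < minth:
        return False                           # admit every packet
    if qavg <= maxth:
        p = max_p * (qavg - minth) / (maxth - minth)
        return random.random() < p             # drop probabilistically
    return True                                # qavg > maxth: drop all

def update_qavg(qavg, qlen, w=0.002):
    """EWMA of the instantaneous queue length; small w lets
    short bursts through while tracking sustained congestion."""
    return (1 - w) * qavg + w * qlen
```

Using qavg rather than the instantaneous queue length is what permits transient bursts, and the randomness is what desynchronizes the flows.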
Effectiveness of RED: Lock-Out
• Packets are randomly dropped
• Each flow has the same probability of
being discarded
Effectiveness of RED: Full-Queue
• Drop packets probabilistically in
anticipation of congestion (not when queue
is full)
• Use qavg to decide packet dropping
probability: allow instantaneous bursts
• Randomness avoids global synchronization
What QoS does RED Provide?
• Lower buffer delay: good interactive service
– qavg is controlled to be small
• Given responsive flows: packet dropping is
reduced
– early congestion indication allows traffic to throttle
back before congestion
• Given responsive flows: fair bandwidth
allocation
Unresponsive or aggressive flows
• Don’t properly back off during congestion
• Take away bandwidth from TCP
compatible flows
• Monopolize buffer space
Control Unresponsive Flows
• Some active queue management schemes
– RED with penalty box
– Flow RED (FRED)
– Stabilized RED (SRED)
identify and penalize unresponsive flows
with a bit of extra work
Active Queue Management
References
• B. Braden et al. “Recommendations on queue management
and congestion avoidance in the internet”, RFC2309, 1998.
• S. Floyd, V. Jacobson, “Random early detection gateways
for congestion avoidance”, IEEE/ACM Trans. on
Networking, 1(4), Aug. 1993.
• D. Lin, R. Morris, “Dynamics of random early detection”,
ACM SIGCOMM, 1997
• T. Ott et al. “SRED: Stabilized RED”, INFOCOM 1999
• S. Floyd, K. Fall, “Router mechanisms to support end-to-end congestion control”, LBL technical report, 1997
Tutorial Outline
• Introduction:
What is a Packet Switch?
• Packet Lookup and Classification:
Where does a packet go next?
• Switching Fabrics:
How does the packet get there?
• Output Scheduling:
When should the packet leave?
Basic Architectural Components
[Figure: control plane (routing, admission control, congestion control, reservation) sitting above the datapath (policing, switching, output scheduling), which performs per-packet processing]
Basic Architectural Components
Datapath: per-packet processing
[Figure: at each input port, 1. forwarding decision (forwarding table lookup), 2. interconnect, 3. output scheduling]