Talk for Monica/Earl - High Performance Networking Group

Processing packets in packet switches
CS343
May 7th 2003
High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.
Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University
nickm@stanford.edu
www.stanford.edu/~nickm
Contents
1. What processing is done where?
2. What does a packet switch look like?
   – Examples of packet switches
   – What does a packet switch do?
   – Typical packet switch architecture
   – Evolution of high performance packet switch architecture
3. Trends and consequences
4. Technology options for processing packets
   – General purpose CPU
   – Network processors
   – FPGA
   – ASIC
5. My 2c
The Network Layer View of the Internet
[Figure labels: End hosts, Routers.]
Hierarchical arrangement
A crude approximation
End hosts
Edge Routers
Core Routers
Core routers: Maximum capacity, minimum function.
Typically: 16 ports of 10Gb/s. Capacity 160Gb/s, 200Mpps. Price $1M.
Edge routers: Medium capacity, maximum flexibility and function.
Typically: 16 ports of 2.5Gb/s. Capacity 20-30 Gb/s, 10-20Mpps. Price $200k.
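The capacity and packet-rate figures above are linked by simple arithmetic. The sketch below is illustrative only: it assumes every port runs at full line rate, and the 100-byte packet size is an assumption chosen because it is consistent with the 160Gb/s and 200Mpps figures quoted for a core router. The function names are hypothetical.

```python
def aggregate_capacity_gbps(ports: int, port_rate_gbps: float) -> float:
    """Aggregate capacity if every port runs at full line rate."""
    return ports * port_rate_gbps


def packet_rate_mpps(capacity_gbps: float, packet_bytes: int) -> float:
    """Packets per second (in millions) at a given packet size."""
    bits_per_packet = packet_bytes * 8
    return capacity_gbps * 1e9 / bits_per_packet / 1e6


core = aggregate_capacity_gbps(16, 10)       # 16 ports of 10Gb/s
print(core, packet_rate_mpps(core, 100))     # assumed 100-byte packets
```

Smaller packets raise the packet rate for the same capacity, which is why per-packet processing, not raw bandwidth, tends to be the harder constraint.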
Hierarchical arrangement
[Figure: End hosts (1000s per mux), access multiplexer, edge routers and core routers at a Point of Presence (POP); POPs joined by 10Gb/s “OC192” links.]
POP: Point of Presence. Richly interconnected by mesh of long-haul links.
Typically: 40 POPs per national network operator; 10-40 core routers per POP.
Autonomous Systems
[Figure: the POPs of four autonomous systems (AT&T, Worldcom, Global Crossing, Sprint), interconnected at “peering points”.]
How we connect
Corporate/campus Environment
Ethernet switch. Typically: 100 ports of 100Mb/s Ethernet.
Building-wide router, e.g. gates-rtr.stanford.edu. Typically: 16 ports of 1Gb/s Ethernet.
Campus or company-wide router, e.g. border-rtr.stanford.edu. Typically: mixture of 2.5Gb/s “OC48” and Gb/s Ethernet.
[Figure labels: i/f, POP, 10Gb/s “OC192” links between POPs.]
How we connect
Home modem/DSL environment
DSL Router/NAT. Typically: 10/100Mb/s.
Telephone switch with DSL line interface at your local Central Office.
[Figure labels: i/f, Point of Presence (POP), 10Gb/s “OC192” links between POPs.]
What a High Performance Router Looks Like
Cisco GSR 12416: 19” wide, 6ft high, 2ft deep. Capacity: 160Gb/s. Power: 4.2kW.
Juniper M160: 19” wide, 3ft high, 2.5ft deep. Capacity: 80Gb/s. Power: 2.6kW.
Other packet switches
Cisco 7500 “edge” routers
Lucent GX550 Core ATM switch
D-Link DSL router
Wiring closet in Packard building
The IP Datagram
Header fields, in order (one 32-bit row per line):
vers | HLen | TOS | Total Length (<=64 KBytes)
ID | Flags | FRAG Offset (offset within original packet)
TTL (hop count) | Protocol | checksum
SRC IP Address
DST IP Address
(OPTIONS) | (PAD)
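As a rough illustration of the header layout above, the following sketch unpacks the fixed 20-byte part of an IPv4 header with Python's standard `struct` module. The function name, the dictionary keys and the sample bytes are illustrative assumptions, not part of the talk, and options are ignored.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header (options ignored)."""
    (vers_hlen, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "vers": vers_hlen >> 4,
        "hlen_bytes": (vers_hlen & 0x0F) * 4,   # HLen counts 32-bit words
        "tos": tos,
        "total_length": total_length,
        "id": ident,
        "flags": flags_frag >> 13,              # top 3 bits
        "frag_offset": flags_frag & 0x1FFF,     # in units of 8 bytes
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical sample header: a UDP datagram from 192.168.0.1 to 192.168.0.199.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(parse_ipv4_header(sample))
```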
Forwarding in an IP Router
1. Lookup packet DA in forwarding table.
   – If known, forward to correct port.
   – If unknown, drop packet.
2. Decrement TTL, update header checksum.
3. Forward packet to outgoing interface.
4. Transmit packet onto link.
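The forwarding steps above can be sketched in software as follows. This is a minimal sketch under stated assumptions, not how a line card implements it: the table is a plain dictionary scanned linearly for the longest matching prefix, the header has no options, the checksum is recomputed in full rather than updated incrementally, and a packet whose TTL reaches zero is silently dropped. The names `checksum16`, `forward` and `table` are hypothetical.

```python
import ipaddress
import struct


def checksum16(header: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def forward(header: bytes, table: dict):
    """Return (out_port, new_header), or None if the packet is dropped."""
    dst = ipaddress.IPv4Address(header[16:20])
    # Step 1: longest-prefix match on the destination address.
    matches = [net for net in table if dst in net]
    if not matches:
        return None                       # unknown destination: drop
    port = table[max(matches, key=lambda net: net.prefixlen)]
    # Step 2: decrement TTL, then recompute the header checksum.
    ttl = header[8] - 1
    if ttl <= 0:
        return None                       # assumption: expired packets are dropped
    hdr = bytearray(header)
    hdr[8] = ttl
    hdr[10:12] = b"\x00\x00"              # checksum field is zero while summing
    hdr[10:12] = struct.pack("!H", checksum16(bytes(hdr)))
    # Steps 3-4: the caller hands the packet to the chosen outgoing interface.
    return port, bytes(hdr)


table = {
    ipaddress.ip_network("192.168.0.0/16"): 1,
    ipaddress.ip_network("192.168.0.128/25"): 2,
}
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
port, new_header = forward(sample, table)
print(port, new_header.hex())
```

A correctly checksummed header sums to zero when the checksum field is included, which gives a quick way to validate the rewritten header.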
Ethernet Frame Format
Field (bytes): Preamble (7) | SFD (1) | DA (6) | SA (6) | Type (2) | Data (0-1500) | Pad (0-46) | CRC (4)
1. Preamble: trains clock-recovery circuits
2. Start of Frame Delimiter: indicates start of frame
3. Destination Address: 48-bit globally unique address assigned by manufacturer.
   1b: unicast/multicast
   1b: local/global address
4. Type: Indicates protocol of encapsulated data (e.g. IP = 0x0800)
5. Pad: Zeroes used to ensure minimum frame length
6. Cyclic Redundancy Check: check sequence to detect bit errors.
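To make the field layout concrete, here is a small sketch that splits the address and type fields out of a frame. It assumes the preamble, SFD and CRC have already been stripped, as is usual by the time software sees a frame; the function name and the sample addresses are hypothetical.

```python
import struct

ETHERTYPE_IP = 0x0800


def parse_ethernet_header(frame: bytes) -> dict:
    """Split DA, SA and Type from a frame (preamble, SFD and CRC already removed)."""
    da, sa, eth_type = struct.unpack("!6s6sH", frame[:14])
    return {
        "da": da.hex(":"),
        "sa": sa.hex(":"),
        "type": eth_type,
        "multicast": bool(da[0] & 0x01),   # 1b: unicast/multicast
        "local": bool(da[0] & 0x02),       # 1b: local/global address
        "payload": frame[14:],
    }


# Hypothetical broadcast frame carrying 46 bytes of payload.
frame = bytes.fromhex("ffffffffffff" "00005e005301" "0800") + b"\x00" * 46
info = parse_ethernet_header(frame)
print(info["da"], info["sa"], hex(info["type"]))
```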
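Encapsulation
IP packet: IP Header | IP Data
Ethernet frame: Preamble | SFD | DA | SA | Type = IP | Data | Pad | CRC
[Figure: the IP header and IP data are carried in the Data field of the Ethernet frame.]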
Generic Router Architecture
[Figure: a packet (Data, Hdr) passes through Header Processing and is then queued.]
Header Processing: Lookup IP Address, Update Header.
Address Table: maps IP Address to Next Hop; ~1M prefixes; off-chip DRAM.
Queue Packet: Buffer Memory; ~1M packets; off-chip DRAM.
Generic Router Architecture
[Figure: three parallel datapaths, each with Header Processing (Lookup IP Address, Update Header), an Address Table, a Buffer Manager and Buffer Memory.]
First Generation Routers
[Figure: a shared backplane connects a CPU, with Route Table and Buffer Memory, to three Line Interfaces, each with a MAC.]
Typically <0.5Gb/s aggregate capacity
Second Generation Routers
[Figure: a CPU with Route Table and Buffer Memory, plus three Line Cards, each with its own Buffer Memory, Fwding Cache and MAC.]
Typically <5Gb/s aggregate capacity
Third Generation Routers
[Figure: a switched backplane connects a CPU Card, holding the Routing Table, to Line Cards, each with Local Buffer Memory, a Fwding Table and a MAC.]
Typically <50Gb/s aggregate capacity
Fourth Generation Routers
[Figure: Linecards connected to a Switch Core by optical links, 100s of metres long.]
160Gb/s - 20Tb/s routers in development
Trends in Technology, Routers & Traffic
[Figure: normalized growth since 1980, log scale from 1 to 1,000,000, years 1980-2001.]
Line Capacity: 2x / 7 months
User Traffic: 2x / 12 months
Router Capacity: 2.2x / 18 months
Moore’s Law: 2x / 18 months
DRAM Random Access Time: 1.1x / 18 months
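The growth rates on this slide use different periods, which makes them hard to compare by eye. The sketch below converts each to a per-year factor by compounding; the rates are the ones listed on the slide, while the function name and dictionary are illustrative assumptions.

```python
def growth_over(factor: float, period_months: float, years: float) -> float:
    """Total growth after `years` if a quantity multiplies by `factor` every `period_months`."""
    return factor ** (years * 12 / period_months)


# Rates as listed on the slide: (factor, period in months).
RATES = {
    "line capacity": (2.0, 7),
    "user traffic": (2.0, 12),
    "router capacity": (2.2, 18),
    "Moore's Law": (2.0, 18),
    "DRAM random access time": (1.1, 18),
}

for name, (factor, months) in RATES.items():
    print(f"{name}: {growth_over(factor, months, 1):.2f}x per year")
```

Dividing the user-traffic result by the router-capacity result over the same number of years gives a rough measure of how far traffic pulls ahead of router capacity.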
Trends and Consequences
[Chart 1: CPU instructions per minimum length packet, log scale from 1 to 1000, plotted from 1996.]
[Chart 2: Disparity between traffic and router growth; normalized growth (0 to 600) of traffic and of router capacity, showing a 5-fold disparity.]
Consequences:
1. Packet processing is getting harder, and eventually network
processors will be used less for high performance routers.
2. (Much) bigger routers will be developed.
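One way to see why packet processing gets harder is to count how many instructions a processor can execute in the time one minimum-length packet takes to arrive. The sketch below is a back-of-the-envelope model only: the 40-byte minimum packet, the 1GHz clock, the one-instruction-per-cycle default and the list of line rates are all assumptions for illustration, not figures from the talk.

```python
def instructions_per_packet(line_rate_gbps: float,
                            clock_ghz: float,
                            instr_per_cycle: float = 1.0,
                            min_packet_bytes: int = 40) -> float:
    """Instructions a CPU can execute in one minimum-length packet time."""
    packet_time_ns = min_packet_bytes * 8 / line_rate_gbps   # bits / (Gb/s) = ns
    return packet_time_ns * clock_ghz * instr_per_cycle


for rate in (0.155, 0.622, 2.5, 10.0):   # Gb/s; illustrative line rates
    print(rate, round(instructions_per_packet(rate, clock_ghz=1.0)))
```

Because line rates grow faster than clock rates, the budget shrinks with each generation unless more processing is done in parallel or in dedicated hardware.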
Trends and Consequences (2)
[Chart 3: Power consumption will exceed POP limits; power (kW, 0 to 6, approximate), years 1990-2002.]
[Chart 4: Disparity between line-rate and memory access time; normalized growth rate, log scale from 1 to 1,000,000, years 1980-1998.]
Consequences:
3. Multi-rack routers will spread power over multiple racks.
4. It will get harder to build packet buffers for linecards.
Technology Options
General purpose processor
   – MIPS
   – PowerPC
   – Intel
Network processor
   – Intel IXA and IXP processors
   – IBM Rainier
   – Control plane processors: SiByte (Broadcom), QED (PMC-Sierra).
FPGA
ASIC
Network Processors
Load-balancing
[Figure: a Dispatch CPU feeds several parallel CPUs, each with its own cache and dedicated HW support (e.g. lookups), all sharing off-chip memory.]
Incoming packets dispatched to:
1. Idle processor, or
2. Processor dedicated to packets in this flow (to prevent mis-sequencing).
3. Processor for processing needed by packet, e.g. security, transcoding, application-level processing.
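Option 2 above, pinning each flow to one processor, is commonly done by hashing the flow identifier. The sketch below is one simple way to do it, assuming a 5-tuple flow key and a CRC32 hash; neither the key format nor the hash is taken from the talk, and the names are hypothetical.

```python
import zlib


def dispatch(flow: tuple, num_cpus: int) -> int:
    """Pick a CPU from the flow identifier so one flow always lands on one CPU."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_cpus


# Hypothetical 5-tuple: source, destination, protocol, source port, destination port.
flow_a = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)
cpu = dispatch(flow_a, 4)
print(cpu)
```

Packets of the same flow are processed in order by one CPU, at the cost of uneven load when a few flows carry most of the traffic.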
Network Processors
Pipelining
[Figure: a chain of CPUs, each with its own cache and dedicated HW support (e.g. lookups), all sharing off-chip memory.]
Processing broken down into (hopefully balanced) steps,
Each processor performs one step of processing.
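The "hopefully balanced" caveat matters because a pipeline runs at the pace of its slowest stage. The sketch below illustrates this with a simple model that ignores queueing between stages; the stage times are made-up numbers and the function names are hypothetical.

```python
def pipeline_rate_mpps(stage_times_ns: list) -> float:
    """Packets per second (millions): the slowest stage sets the pace."""
    return 1e3 / max(stage_times_ns)


def pipeline_latency_ns(stage_times_ns: list) -> float:
    """Time for one packet to traverse every stage (ignoring queueing)."""
    return sum(stage_times_ns)


balanced = [25, 25, 25, 25]      # hypothetical per-stage times in ns
unbalanced = [10, 10, 10, 70]    # same total work, badly partitioned
print(pipeline_rate_mpps(balanced), pipeline_rate_mpps(unbalanced))
print(pipeline_latency_ns(balanced), pipeline_latency_ns(unbalanced))
```

Both partitions do the same total work per packet, but the unbalanced one sustains a lower packet rate.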
Network Processors
Pros
   – Flexibility: Protocols change, features are added.
   – Reduced development time: In principle, should be quicker to develop software than design a custom chip.
   – Reduces time-to-market, development costs, …
Cons
   – Less efficient: slower than custom chip, more power.
   – Usually designed using standard processor cores, not optimized for stream processing.
   – Generally about 10x slower than general purpose CPU.
   – Unusual development environments; hard to program.
   – Often hard to partition functions over processors.
General Observations
Up until about 1998,
   – Low-end packet switches used general purpose processors,
   – Mid-range packet switches used FPGAs for datapath, general purpose processors for control plane.
   – High-end packet switches used ASICs for datapath, general purpose processors for control plane.
More recently,
   – 3rd party network processors now used in many low- and mid-range datapaths.
   – Home-grown network processors used in mid- and high-end.
My 2c on network processors
Is it clear that multiple small parallel processors are needed?
   – When are 10 processors at speed 1 better than 1 processor at speed 10?
Network processors make sense if:
   – Application is parallelizable into multiple threads/contexts.
   – Uniprocessor performance is limited by load-latency.
If general purpose processors evolve anyway to:
   – Contain multiple processors per chip,
   – Support hardware multi-threading,
…then perhaps they are better suited because:
   – Greater development effort means faster general purpose processors,
   – Existing well-known development environments.
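The question of 10 processors at speed 1 versus 1 processor at speed 10 can be explored with a toy model. The sketch below assumes single-threaded processors, work that parallelizes perfectly across packets, and a fixed memory stall per packet that a faster processor does not shorten; all of these are simplifying assumptions, and the names and numbers are hypothetical.

```python
def packet_rate(num_cpus: int, speed: float,
                compute_work: float, stall_time: float) -> float:
    """Aggregate packets per unit time for single-threaded CPUs.

    Each packet needs `compute_work` units of computation (divided by `speed`)
    plus `stall_time` waiting on memory, which a faster CPU does not shrink.
    """
    time_per_packet = compute_work / speed + stall_time
    return num_cpus / time_per_packet


for stall in (0.0, 1.0, 10.0):
    fast = packet_rate(1, 10.0, compute_work=10.0, stall_time=stall)
    slow = packet_rate(10, 1.0, compute_work=10.0, stall_time=stall)
    print(stall, fast, slow)
```

In this model the two designs tie when there are no memory stalls, and the ten slow processors pull ahead as load latency grows, which matches the condition that uniprocessor performance be limited by load-latency. Hardware multi-threading on the fast processor would change the comparison by overlapping the stalls.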
My 2c on network processors
The nail: packets (Data, Hdr) and Context
Characteristics:
1. Stream processing.
2. Multiple flows.
3. Most processing on header, not data.
4. Two sets of data: packets, context.
5. Packets have no temporal locality, and special spatial locality.
6. Context has temporal and spatial locality.
The hammer: Data cache(s)
Characteristics:
1. Shared in/out bus.
2. Optimized for data with spatial and temporal locality.
3. Especially optimized for register accesses.
A network uniprocessor
[Figure labels: Off-chip FIFOs and On-chip FIFO (on both input and output), Head/tail Mailbox registers, Context memory hierarchy, Data cache(s), Off chip Memory.]
Add hardware support for multiple threads/contexts.