The Cisco ASR 9000
Architecture
Sebastián Maulén
Service Provider – Systems Engineer
semaulen@cisco.com
Agenda
• Hardware Architecture Overview
  – Chassis
  – RSP
  – Line Card
• Switch Fabric Architecture and Fabric/System QoS
• Multicast Architecture
• QoS Overview
ASR 9000 At a Glance
• Optimized for Aggregation of Dense 10GE and 100GE
• Designed for Longevity: Scalable up to 400 Gbps of Bandwidth per Slot
• Based on IOS-XR for Nonstop Availability and Manageability
• Market focus:
  – CE L2 Business VPN
  – Residential Triple Play
  – Mobile Backhaul
  – Advanced Video Services
ASR 9010 and ASR 9006 Chassis
[Figure: ASR 9010 and ASR 9006 chassis views. Callouts: integrated cable management with cover, system fan trays, front-to-back and side-to-back airflow, RSP slots (0-1), line card slots (0-3 on the ASR 9006; 0-3 and 4-7 on the ASR 9010), three modular power supplies (ASR 9006) and six modular power supplies (ASR 9010).]
ASR 9010 and 9006 Chassis with Door
• Optional door with lock
• ¼ rack: 17.38"w x 17.35"h x 28"d (ASR 9006)
• Half rack: 17.38"w x 36.75"h x 28"d (ASR 9010)
Chassis Overview
ASR-9000 10-slot system
• 10 slots: 8x linecards + 2x RSP
• Half rack: 17.38"w x 36.75"h x 28"d
• Bandwidth (initial)
  – 400 Gbps backplane
  – 180 Gbps fabric → 400 Gbps
  – 40G/80G linecards → Nx100G
• Carrier-class hardware redundancy
• AC & DC systems
  – Pay-as-you-grow, modular power
  – Green emphasis throughout
Chassis Overview
ASR-9000 6-slot system
• 6 slots: 4x linecards + 2x RSP
• ¼ rack: 17.38"w x 17.35"h x 28"d
• Bandwidth
  – 400 Gbps backplane
  – 180 Gbps fabric → 400 Gbps
  – 40G/80G linecards → Nx100G
• Carrier-class hardware redundancy
• AC & DC systems
  – Pay-as-you-grow, modular power
  – Green emphasis throughout
ASR 9000 System Scalability
Outlasting the Future
                        2009                                        Future
                        10-slot chassis       6-slot chassis        18-slot chassis
Linecards per Chassis   8 LC + 2 RSP          4 LC + 2 RSP          16 LC + 2 RSP
Linecard Density        40 → 80 → 200 Gbps    40 → 80 → 200 Gbps    80 → 200 Gbps
Bandwidth per Slot      180 → 400 Gbps        180 → 400 Gbps        400 Gbps
Bandwidth per Chassis   2.8 → 6.4 Terabits    1.4 → 3.2 Terabits    12.8 Terabits
The ASR 9000 Chassis - Built on a Green Foundation
• Longevity: Linecard 3D Space (L x W x H), Power, Cooling, & Signal Integrity Designed for Growth to 400 Gbps per Slot
• "Green" Efficiency: Low Wattage per Gbps of Capacity
• "Pay as you Grow": Modular Power Supplies with 50 Amp DC Input or 16 Amp AC for Easy CO Install
• Variable-Speed Fans for Low Noise Output, with Reduced Power, for NEBS + OSHA Compliance
Power Distribution (DC N:1 protection)
[Figure: 10-slot chassis power distribution – six modular supplies (PS 0-5) feed a single power distribution bus; each supply is wired to both the 'A' and 'B' feeds; the bus powers the line cards, RSPs, and fan trays in shelf 0 (top) and shelf 1 (bottom).]
• Single power zone, one distribution bus
• All modules load share
• 2 kW and 1.5 kW supplies
• Each power supply is wired to both the 'A' and 'B' feeds
• A feed failure doubles the draw on the remaining feed
• A supply failure increases the draw on the remaining supplies
Power Distribution (AC 1:1 protection)
[Figure: 10-slot chassis power distribution – six modular AC supplies (PS 0-5) feed a single power distribution bus; the 'A' feed serves one power shelf and the 'B' feed the other; the bus powers the line cards, RSPs, and fan trays.]
• Single power zone, one distribution bus
• All modules load share
• AC power supplies are rated at 3 kW each
• 'A' feed wired to the top power shelf
• 'B' feed wired to the bottom power shelf
10-slot chassis: 6 kW redundant power
AC "1:1" protection vs. DC "N:1" protection
[Figure: side-by-side 6 kW power configurations for the 10-slot chassis – AC: four 3 kW supplies split across the two power shelves, with the 'A' feed on one shelf and the 'B' feed on the other; DC: four 2 kW supplies, each wired to both the 'A' and 'B' feeds; in both cases the supplies feed a single power distribution bus serving the system load.]
• For 6 kW AC you need 4x 3 kW power supplies (a feed failure takes down two of them; a supply failure kills one).
• For 6 kW DC you need 4x 2 kW power supplies (a feed failure has no impact, so we protect only against a supply failure).
Power Distribution (DC N:1 protection)
6-slot chassis
[Figure: 6-slot chassis power distribution – three modular supplies (PS 0-2) in the power entry shelf feed a single power distribution bus serving the line cards, RSPs, and fan trays; each supply is wired to both the 'A' and 'B' feeds.]
• Single power zone, one distribution bus
• All modules load share
• 2 kW and 1.5 kW supplies
• Each power supply is wired to both the 'A' and 'B' feeds
Power Distribution (AC 2+1 protection)
6-slot chassis
[Figure: 6-slot chassis power distribution – three modular AC supplies (PS 0-2) in the power entry shelf feed a single power distribution bus serving the line cards, RSPs, and fan trays.]
• Single power zone, one distribution bus
• All modules load share
• 3 kW supplies
• 6 kW maximum power per chassis; the additional 3 kW is used for protection
Power Check and Rules
• Available power is checked when:
  – An LC is inserted
  – An LC is powered up via the CLI
  – An LC is reset via "hw-mod reload"
• If the system does not have enough available power to accommodate the LC, the LC becomes "UNPOWERED"
• Installing new power supplies will not automatically power up any UNPOWERED line cards. The user can force a recheck using "hw-mod reload loc <>" (see the sketch after this list)
• RSP and fan tray cards are given priority allocation of the power budget
• The LC power budget is checked in slot-numeric order until it is exhausted.
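For illustration, a hedged sketch of forcing that recheck from IOS XR admin mode after adding supplies; the line card location 0/1/CPU0 is purely illustrative, and keyword abbreviations and output formats vary by release.

    ! Check the installed supplies first (keyword spelling may differ per release)
    RP/0/RSP0/CPU0:asr9k# show environment power-supply

    ! Force the power recheck for an UNPOWERED line card (admin EXEC mode);
    ! this is the unabbreviated form of the slide's "hw-mod reload loc <>"
    RP/0/RSP0/CPU0:asr9k# admin
    RP/0/RSP0/CPU0:asr9k(admin)# hw-module location 0/1/CPU0 reload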
RSP Engine
• Performs control plane and management functions
• Dual-core CPU with 4 GB DRAM
• 2 MB NVRAM, 2 GB internal bootdisk, 2 external compact flash slots
• Dual out-of-band 10/100/1000 management interfaces
• Console & auxiliary serial ports
• USB ports for file transfer
• Hard drive: 40 GB HDD
[Figure: RSP front panel – console port, AUX port, management Ethernet, BITS, alarm, compact flash slots, status LEDs.]
RSP Engine Architecture
[Figure: RSP block diagram – CPU complex (CPU, 4 GB memory, NVRAM, boot flash, 4 GB CF, HDD, external CF card slots) with an I/O FPGA to the front panel (console, aux, alarm, and dual management Ethernet via an Ethernet switch), a punt FPGA, and a timing domain (BITS clock, time FPGA); plus the fabric complex: two crossbar fabric ASICs with central arbitration, the fabric interface, and EOBC/control-plane GE toward the line cards.]
FCS Line Card Support
• 40 Gbps line rate
• Scalable architecture
• FCS line card variants: 4xTenGE (A9K-4T), 8xTenGE oversubscribed (A9K-8T/4), 40xGE (A9K-40GE)
• Flexible, microcode-based architecture
• Base & extended memory options
  – Additional memory → higher scale
  – Medium Queue (-B) and High Queue (-E) variants
• Common architecture enabling feature parity across all variants
• L2 & L3 feature coexistence on the same line card and chassis
• Advanced IP software licence (e.g. L3VPN)
[Figure: A9K-8T/4-E line card – 8-port TenGigE, extended memory option, 40 Gbps line-rate network processors.]
FCS Line Card Hardware Architecture – Example1
Example: A9K-4T-B
[Figure: A9K-4T-B block diagram – four XFP ports (XFP 0-3), each through a 10GE PHY to its own NPU (NPU 0-3); NPU pairs connect to Bridge FPGA 0/1, which feed the fabric interface ASIC; the fabric interface connects via the backplane to the crossbar fabric ASICs and central arbitration on RSP0 and RSP1. The LC CPU (2 GB memory, 2 GB flash) attaches via the GigE EOBC, and network clocking is distributed to the PHYs.]
Line Card Hardware Components
• Fabric interface ASIC
  – Provides the data connection to the switch fabric crossbar ASICs
  – Each fabric interface ASIC has one fabric channel (23 Gbps bi-directional) to each crossbar
  – With a dual-RSP system, each line card therefore has 4x23 Gbps = 92 Gbps of bi-directional fabric bandwidth
  – Has hardware queues and virtual output queues for system-level QoS (see the system QoS part for more information)
  – Has a multicast replication table and performs multicast replication in hardware towards the two Bridge FPGAs
• Bridge FPGA
  – Connects the NPUs and the fabric interface ASIC, converting between the NPU header and the fabric header
  – Has hardware queues for system-level QoS
  – Has a multicast replication table and performs multicast replication in hardware towards the two NPUs
• "Trident" NPU (network processor unit)
  – Main forwarding engine: L2 and L3 lookups, features, and multicast replication are done by the NPU (see the next slides for more information)
• CPU
  – Same type of CPU as the RP
  – Some control plane protocols are distributed to the local line card CPU for more scale, e.g. BFD, CFM, ARP
  – A local SW process receives the FIB table from the RP and programs the hardware forwarding tables in the NPU, Bridge FPGA, and fabric interface (a couple of inspection commands are sketched below)
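As a practical aside, a hedged pair of IOS XR show commands for looking at these line-card NPUs from the CLI; the location 0/1/CPU0 is illustrative, and availability and output format vary by release.

    ! Per-NPU to front-panel port mapping on a line card
    RP/0/RSP0/CPU0:asr9k# show controllers np ports all location 0/1/CPU0

    ! Per-NPU forwarding/feature counters (useful to see what the NPU
    ! microcode is doing with traffic)
    RP/0/RSP0/CPU0:asr9k# show controllers np counters np0 location 0/1/CPU0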
FCS Line Card Hardware Architecture – Example 2
Example: A9K-8T/4-B (oversubscribed ~1.5:1)
[Figure: A9K-8T/4-B block diagram – eight XFP ports (XFP 0-7), two 10GE PHYs per NPU (NPU 0-3); each NPU handles ~15 Gbps / 29 Mpps unidirectional, hence the oversubscription. NPU pairs feed Bridge FPGA 0/1 over 2x30 Gbps (60 Gbps) links, each Bridge FPGA feeds the fabric interface at 30 Gbps, and the fabric interface has 4x23 Gbps of fabric bandwidth to the crossbar fabric ASICs and central arbitration on RSP0/RSP1 via the backplane. The LC CPU (2 GB flash, 2 GB memory) attaches via the GigE EOBC, and network clocking is distributed to the PHYs.]
Trident Network Process Unit (NPU)
Main Forwarding Engine
[Figure: Trident NPU with its four associated memories – TCAM, FIB/MAC lookup memory, stats QDR memory, and frame memory.]
• Multi-stage, microcode-based architecture; feature-rich 10 Gbps bi-directional line rate
• Each NPU has four main associated memories: TCAM, search/lookup memory, frame/buffer memory, and statistics memory
  – TCAM is used for VLAN tag, QoS, and ACL classification
  – Lookup memory is used for storing the FIB tables, MAC address table, and adjacencies
  – Stats QDR memory is used for all interface statistics, forwarding statistics, etc.
  – Frame memory is the buffer memory for queues
• FCS line cards come in two versions – base and extended – depending on memory size
  – Search memory is the same across base and extended cards, so line cards can be mixed without impacting system-level scale (routing table, multicast, MAC address table)
  – TCAM, QDR, and frame memory are smaller on base cards, giving lower scale for QoS queues and sub-interfaces at the line card level
Hardware Subsystem: Linecard
Synchronous Ethernet support on existing HW
• The RSP has BITS input, DTI, and centralized clock distribution hardware
• Full support for L1 Sync-E on linecards (XR 3.9)
• Flexible time sourcing: line cards can recover a clock and send it to the RSP, and can receive a transmit clock from the RSP
• Future hardware capable of IEEE 1588v2
Hardware Subsystem: Linecard
Optics and SFP support
• Gigabit SFPs (40xGE card)
  – SFP-GE-T= (supports 10M, 100M and 1000M on Cat5/6 cable)
  – SFP-GE-S=
  – SFP-GE-L=
  – SFP-GE-Z=
• 10GE XFPs (all Nx10GE cards)
  – XFP-10GLR-OC192SR=
  – XFP-10GZR-OC192LR=
  – XFP-10GER-192IR+=
• CWDM SFPs: 1470nm to 1610nm
• DWDM XFPs & SFPs: most wavelengths from 1530-1561nm, ITU 100GHz C-band spacing
New Line Cards in Release 3.9.x
• 8x10GE line-rate line cards
• Combo 2x10GE + 20xGE line cards
• Low-Queue line cards
ASR 9000 Ethernet Linecards
Capability comparison            Low Queue*    Medium Queue    High Queue
Common metrics
  MAC Addresses                  512k          512k            512k
  IPv4 Routes                    1M            1M              1M
  VRFs                           4k            4k              4k
  MPLS Labels                    128k          128k            128k
  L3 Subif/Port                  4k            4k              4k
  Bridge Domains                 8k            8k              8k
Different metrics
  EFPs                           4k            16k             32k
  Egress Queues                  8/port        64k             256k
  Policers                       8k            128k            256k
  Packet Buffer                  50ms          50ms            150ms

* Low Queue linecards are targeted for 3.9
Hardware Subsystem: Linecard
IPoDWDM / G.709 / WAN-PHY support
• Available on the 2x10GE+20xGE and 8xTenGE cards: these cards have a separate Ethernet MAC chip that provides WAN-PHY & G.709 support
• WAN-PHY
  – Provides Ethernet framing over the SONET/SDH OC-192 signalling rate (9.953 vs 10.0 Gbps)
• IPoDWDM / G.709:
  – Uses forward error correction (FEC) coding to improve the signal-to-noise ratio
  – Extends fiber span distances without regeneration
  – Two variants of FEC on ASR 9000 linecards:
    "standard" FEC (a.k.a. "G-FEC"): compatible with most vendors' L2/L3 equipment and MSTP
    "enhanced" FEC: proprietary; compatible with the 7600 and other ASR 9000s, but not the CRS-1
• Software licencing for G.709:
  – WAN-PHY is configurable without any additional licence
  – G.709 / FEC / EFEC requires an additional licence per linecard
Cisco ASR 9000 8x10GE Line-Rate LC Testimonials
"LR and EANTC had a first look at a linecard with eight 10Gigabit Ethernet ports … This shows that the device could handle delivering 160 Gbit/s of multicast and unicast traffic, or 80 Gbit/s in each direction."

"The ASR 9010 was able to deliver high-priority traffic, such as VoIP calls, even when the network was under an unusual traffic load and fending off a simulated denial-of-service attack."

European Advanced Networking Test Center AG (EANTC)'s test of the Cisco ASR 9000, commissioned by Light Reading:
http://www.lightreading.com/document.asp?doc_id=177356&page_number=9
Agenda
• Hardware Architecture Overview
  – Chassis
  – RSP
  – Line Card
• Switch Fabric Architecture and Fabric/System QoS
• Multicast Architecture
• QoS Overview
Fabric Overview
• The fabric is logically separate from the LCs/RSPs
• It physically resides on the RSP
• Separate data and arbitration paths
• Each LC/RSP has a fabric interface ASIC (80 Gbps line-rate LCs have two fabric interface ASICs)
[Figure: fabric connectivity – each RSP (RSP0/RSP1) carries two crossbar fabric ASICs plus central arbitration. Each fabric channel runs at 23 Gbps: a single-fabric-interface 40G linecard gets 4x23 Gbps = 92 Gbps with dual RSPs (2x23 Gbps = 46 Gbps with a single RSP), and a dual-fabric-interface 80G linecard gets 8x23 Gbps = 184 Gbps with dual RSPs (4x23 Gbps = 92 Gbps with a single RSP).]
Fabric Load Sharing – Unicast
[Figure: unicast frames 1-4 from the ingress fabric interface/VOQ are sprayed across the available fabric links through the crossbar fabric ASICs on both RSPs and re-sequenced at the egress fabric interface.]
• Unicast traffic is sent across the first available fabric link to the destination (maximizes efficiency)
• Each frame (or superframe) contains sequencing information
• All destination fabric interface ASICs have re-sequencing logic
• The additional re-sequencing latency is measured in nanoseconds
Fabric Load Sharing – Multicast
[Figure: multicast flows A, B, and C are hashed onto fabric links per (S,G), so each flow stays on one link and flows exit in order at the egress fabric interface.]
• Multicast traffic is hashed based on (S,G) information to maintain flow integrity
• The very large set of multicast destinations precludes re-sequencing
• Multicast traffic is not arbitrated – it is sent across a different fabric plane
Fabric Super-framing Mechanism
• Multiple unicast frames to/from the same destination are aggregated into one super-frame
• A super-frame is created whenever frames are waiting in the queue: up to 32 frames, or as many as are present once the minimum threshold is met, are aggregated into one super-frame
• Super-framing applies only to unicast, not to multicast
• Super-framing significantly improves total fabric throughput
[Figure: super-framing examples – a single waiting packet is sent as-is (no super-framing); once the minimum threshold is met the waiting packets are combined into a super-frame; aggregation stops at the maximum (up to 32 frames / jumbo super-frame size).]
Access to Fabric Bandwidth – Arbitration
• Access to the fabric is controlled by central arbitration, performed by a high-speed arbitration ASIC on the RSP. At any time a single arbiter is responsible for arbitration
• The arbitration algorithm is QoS-aware and ensures that P1 classes have preference over P2 classes, both of which have preference over non-priority classes
• Arbitration ensures bandwidth fairness among the line cards
• Arbitration is performed relative to a given egress 10G complex (NPU)
• Fabric capacity on egress modules is represented by virtual output queues (VOQs) at the ingress to the fabric
Fabric Arbitration
[Figure: fabric arbitration sequence – 1: fabric request from the ingress fabric interface/VOQ; 2: arbitration by the central arbiter on the RSP; 3: fabric grant; 4: load-balanced transmission across the fabric links; 5: credit return.]
ASR 9000 Advanced System QoS (1) –
User priority is mapped to system priority → end-to-end priority
[Figure: end-to-end system QoS (T = Trident NPU, B = Bridge FPGA) – user (interface-level) ingress queues on the ingress NPU (P1/P2/BE), system queues in the Bridge FPGA (Hi and Lo), system queues in the fabric interface ASIC (P1, P2, 2xBE), the switch fabric, then egress-side system queues and user (interface-level) egress queues with hierarchy and priority propagation. Example: 5 Gbps of high-priority plus 5 Gbps of low-priority traffic toward a congested egress – low-priority packets are dropped during the congestion while the non-blocking receive queues preserve the high-priority traffic.]
User QoS to System QoS mapping
• The ASR 9000 supports traffic differentiation at all relevant points within the system
  – P1/P2/BE differentiation or HP/LP differentiation is supported throughout the system
  – Classification into these priorities is based on the user QoS classification on the ingress linecard into P1, P2, or other queues
  – Once a packet is classified into a P1 class on ingress, it is mapped to the PQ1 queue along the system QoS path
  – Once a packet is classified into a P2 class on ingress, it is mapped to the PQ2 queue along the system QoS path. If a system component has only one priority queue, both user P1 and P2 are mapped to the same system PQ
  – Once a packet is classified into a non-PQ1/2 class on ingress, it is mapped to the LP queue along the system QoS path
• Note: the marking is implicit once you assign a packet to a given queue on ingress; it sets the fabric header priority bits on the packet – no specific "set" action is required (see the classification sketch below)
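To make the mapping concrete, a minimal sketch of an ingress classification policy in standard IOS XR MQC syntax; the class names, match criteria, interface, and rates are illustrative only. Traffic placed into the level-1 and level-2 priority classes rides the system PQ1/PQ2 queues; everything else goes to the low-priority system queues.

    class-map match-any USER-P1
     match dscp ef
    class-map match-any USER-P2
     match dscp af41
    !
    ! P1/P2 classes are carried in the system PQ1/PQ2 queues; class-default
    ! rides the low-priority system queues. No explicit "set" is needed.
    policy-map INGRESS-CLASSIFY
     class USER-P1
      priority level 1
      police rate percent 10
     class USER-P2
      priority level 2
      police rate percent 30
     class class-default
    !
    interface TenGigE0/1/0/0
     service-policy input INGRESS-CLASSIFY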
ASR 9000 Advanced System QoS (2) –
Back pressure and VoQ → avoid head-of-line blocking (HoLB)
[Figure: T = Trident NPU, B = Bridge FPGA – two ingress flows (e.g. 6 Gbps and 4 Gbps from different ingress NPUs) destined to the same congested 10 Gbps egress NPU are held in that destination's VoQs (e.g. Slot 2/T0, Slot 2/T1) and scheduled by DRR, without blocking traffic to other egress NPUs.]
• The ingress LC fabric interface has virtual output queues (VoQs) per egress NPU for all LCs in the system. Packets to different egress NPUs sit in different VoQs, so there is no head-of-line blocking
• The egress side backpressures the fabric interface ASIC if its high-priority queues become congested
What Are VOQs?
• Virtual output queues (VOQs) on ingress modules represent fabric capacity on egress modules
• If a VOQ is available at the ingress to the fabric, capacity exists at the egress module to receive the traffic from the fabric
  – The central arbiter determines whether a VOQ is available for a given packet
• A VOQ is "virtual" because it represents EGRESS capacity but resides on INGRESS modules
  – It is still a PHYSICAL buffer where packets are stored
• Note: a VOQ is NOT equivalent to an ingress or egress port buffer or queue
  – It relates ONLY to the ASICs at the ingress and egress of the fabric
Fabric Interface ASIC VOQs
[Figure: ingress fabric interface ASIC with per-destination VoQs (Slot 0/Port 0 … Slot 9/Port 3, plus multicast) feeding a DRR ingress scheduler toward the crossbar fabric ASICs and central arbitration on RSP0/RSP1; egress fabric interface ASIC with per-port egress fabric queues (Port 0-3, plus multicast) behind a DRR egress scheduler.]
136 ingress VoQs used:
  8 dest LCs x 4 10G ports/LC x 4 classes/port = 128 VoQs for the LCs
  2 dest RSPs x 1 10G port/RSP x 4 classes/port = 8 VoQs for the RSPs
  plus 4 multicast queues
20 egress fabric queues:
  4 classes/port x 4 ports/LC (unicast) = 16
  4 multicast classes = 4
VOQ Destinations
• For every "destination" on other modules in the system, each ingress module has a corresponding VOQ with four priority levels
• One VOQ with four priority levels serves one "destination", which is one NPU complex on an egress module
• One NPU complex can have:
  – one front-panel 10G port (non-blocking line card), or
  – two front-panel 10G ports (oversubscribed line card), or
  – ten front-panel 10/100/1000 ports
[Figure: a VOQ "destination" is one NPU complex on the egress line card – NPU 0-3 sit behind Bridge 0/1 and the fabric interface.]
ASR 9000 Advanced System QoS (3) –
Multicast and unicast separation → unicast:multicast bandwidth fairness
[Figure: T = Trident NPU, B = Bridge FPGA – unicast and multicast streams (e.g. 6 Gbps and 4 Gbps) toward a congested 10 Gbps egress NPU; the guaranteed unicast-to-multicast bandwidth ratio is maintained under congestion.]
• Unicast and multicast use separate switch fabric planes
• Unicast and multicast have separate system queues (high and low priority)
• Guaranteed unicast and multicast bandwidth ratio under congestion
Advanced System QoS Summary
• Central arbitration for fabric access
  – Ensures fair access to bandwidth for multiple ingress ports transmitting to one egress port
  – The central arbiter ensures all traffic sources get appropriate access to fabric bandwidth, even with traffic sources on different modules
• System queue priority
  – Ensures priority traffic takes precedence over best-effort traffic across system components
  – User priority is mapped to system priority automatically
• Flow control and VoQ
  – Prevents congested egress ports from blocking ingress traffic destined to other ports
  – Mitigates head-of-line blocking by providing a dedicated buffer (VOQ) for individual destinations across the fabric
• Unicast and multicast queue/fabric separation
  – Guaranteed unicast and multicast bandwidth ratio across fabric and system components under congestion
Agenda
• Hardware Architecture Overview
  – Chassis
  – RSP
  – Line Card
• Switch Fabric Architecture and Fabric/System QoS
• Multicast Architecture
• QoS Overview
ASR 9000 Multicast Architecture Overview
Clean, Simple, and Scalable Architecture for Bandwidth Efficiency and Guaranteed QoS
[Figure: replication points at different system components – a multicast stream from the source enters on LC1, is replicated by the switch fabric toward the egress LCs (LC2, LC3), and is then replicated within each egress LC by the fabric interface, Bridge FPGA, and NPUs toward the ports with IGMP joins.]
• Clean and scalable multicast architecture
  – Replication always happens in the fabric and on the egress line card
  – Line-rate multicast replication, independent of scale
• Bandwidth-efficient multicast replication
  – The packet is replicated in the most optimized way
• Guaranteed system QoS
  – Separate unicast and multicast high- and low-priority system-level queues; separate switch fabric planes
  – Simple and predictable; guaranteed priority
  – Guaranteed unicast and multicast bandwidth ratio
ASR 9000 Multicast Architecture Overview
• Control plane
  – All multicast control protocols run on the RP
  – The local line card CPU handles exception multicast packets (SW-switched packets, multicast signaling, etc.)
• Distributed forwarding plane
  – Multicast forwarding is fully distributed on the LCs
• Two-stage forwarding
  – The ingress LC lookup determines the destination egress LCs and NPUs
  – The egress LC lookup determines the destination egress ports
• Optimal HW-based multicast packet replication
  – Multicast packets are replicated on the switch fabric and the egress LC; the ingress LC never replicates multicast packets
  – Multicast packets are replicated by HW chips in an efficient way
  – Line-rate multicast with a fully loaded chassis
• L2 vs. L3 multicast
  – L2 and L3 multicast have separate control planes
  – L2 and L3 multicast have a uniform data forwarding plane
L3 Multicast Control Plane
(T: Trident NPU; B: Bridge FPGA; MGID: Multicast Group ID; FGID: Fabric Group ID)
[Figure: (1) IGMP/PIM packets arriving on a line card are punted across the switch fabric to PIM/IGMP on the RP; (2) the protocols feed MRIB on the RP; (3) MRIB updates MFIB running on the LC CPU; (4) MFIB programs the hardware tables in the NPUs, Bridge FPGAs, and fabric interface.]
• Incoming IGMP and PIM packets are punted directly to the RP, bypassing the LC CPU
• The protocols (PIM/IGMP) send their route/OLIST information to the MRIB process
• MRIB sends the multicast state information (mroute, OLIST) to MFIB, a process running on the LC CPU
  – MRIB assigns an FGID to each mroute, indicating the slots that should receive a multicast copy (based on the OLIST information)
  – MRIB assigns a globally unique MGID to each mroute
• MFIB programs the HW forwarding tables in the NPU, Bridge FPGA, and fabric interface (see the sketch below)
  – It programs the MGID table in the fabric interface ASIC to indicate which Bridge FPGAs should receive a multicast copy (based on the OLIST information)
  – It programs the MGID table in the Bridge FPGA, similar to the above
  – It programs the FIB table in the NPU, similar to IPv4 unicast
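For orientation, a hedged IOS XR sketch of turning on this control plane and checking the resulting state; the group address and line card location are illustrative, and show-command output formats vary by release.

    ! Enable IPv4 multicast routing; PIM and IGMP run on the RP, while
    ! MFIB is programmed on each line card CPU
    multicast-routing
     address-family ipv4
      interface all enable
    !
    ! Verify: the RP's MRIB view of an mroute, then the line card's MFIB copy
    RP/0/RSP0/CPU0:asr9k# show mrib route 239.1.1.1
    RP/0/RSP0/CPU0:asr9k# show mfib route 239.1.1.1 location 0/1/CPU0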
L2 Multicast Control Plane
(T: Trident NPU; B: Bridge FPGA; MGID: Multicast Group ID; FGID: Fabric Group ID)
[Figure: (1) IGMP packets arriving on a line card are punted across the switch fabric to the IGMP snooping process on the RP; (2) IGMP snooping feeds L2FIB on the RP; (3) L2FIB updates its counterpart running on the LC CPU; (4) the LC L2FIB programs the hardware tables in the NPUs, Bridge FPGAs, and fabric interface.]
• Incoming IGMP packets are punted directly to the RP, bypassing the LC CPU
• The IGMP snooping process sends its information to the L2FIB process
• L2FIB sends the multicast state information (mroute, OLIST) to the L2FIB process running on the LC CPU
  – L2FIB assigns an FGID to each mroute, indicating the slots that should receive a multicast copy (based on the OLIST information)
  – L2FIB assigns a globally unique MGID to each mroute
• L2FIB programs the HW forwarding tables in the NPU, Bridge FPGA, and fabric interface (see the sketch below)
  – It programs the MGID table in the fabric interface ASIC to indicate which Bridge FPGAs should receive a multicast copy (based on the OLIST information)
  – It programs the MGID table in the Bridge FPGA, similar to the above
  – It programs the FIB table in the NPU, similar to L3 multicast
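A minimal sketch of enabling this L2 multicast control plane on a bridge domain (IGMP snooping), assuming standard IOS XR L2VPN configuration; the profile, bridge group/domain names, and interfaces are illustrative, and the sub-interfaces are assumed to already exist as l2transport EFPs.

    ! Define an IGMP snooping profile and attach it to a bridge domain;
    ! snooped joins drive the L2FIB/MGID programming described above
    igmp snooping profile SNOOP-BASIC
    !
    l2vpn
     bridge group VIDEO
      bridge-domain BD100
       igmp snooping profile SNOOP-BASIC
       interface GigabitEthernet0/1/0/0.100
       interface TenGigE0/2/0/3.100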
Data Plane – Packet Replication
Efficient Multicast Packet Replication
1. Fabric replication → the switch fabric replicates the packet to the egress LCs based on the FGID & FPOE table
2. Fabric interface replication → the fabric interface ASIC replicates the packet to the Bridge FPGAs based on its MGID table
3. Bridge FPGA replication → the Bridge FPGA replicates the packet to the NPUs based on its MGID table
4. NPU replication → the NPU replicates the packet to the egress ports
[Figure: a multicast packet from the source on LC1 is replicated by the switch fabric toward LC2 and LC3, then within each egress LC by the fabric interface, Bridge FPGA, and NPUs toward the ports with IGMP joins.]
(FGID – Fabric Group ID; MGID – Multicast Group ID; FPOE – Fabric Point of Exit)
Agenda
• Hardware Architecture Overview
  – Chassis
  – RSP
  – Line Card
• Switch Fabric Architecture and Fabric/System QoS
• Multicast Architecture
• QoS Overview
ASR 9000 QoS Capability Overview
• Very scalable SLA enforcement
  – Up to 3 million queues per system (with extended linecards)
  – Up to 2 million policers per system (with extended linecards)
• Hierarchical scheduling support
  – Four-layer scheduling hierarchy → port, subscriber group, subscriber, class
  – Egress & ingress
• Dual-priority scheduling with priority propagation for minimum latency and jitter
• Flexible & granular classification
  – Full Layer 2, full Layer 3/4 IPv4, IPv6
• Robust implementation
  – System-level QoS on fabric and LC system components
  – H-QoS uses dedicated, purpose-built traffic manager HW
4 Layer Hierarchy QoS Overview
[Figure: four-layer scheduling hierarchy – L1 port level → L2 subscriber-group level (S-VLAN EFP) → L3 subscriber level (C-VLAN) → L4 class level.]
Note: hierarchies are counted as follows:
  – 4-layer hierarchy = 3-level nested policy-map (see the sketch below)
  – 3-layer hierarchy = 2-level nested policy-map
  – The L1 (port) level is not configurable; it is implicitly assumed
  – The hierarchy levels used are determined by how many nested levels the policy-map applied to a given sub-interface is configured with
  – Maximum of 8 classes per subscriber level
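To illustrate the counting, a minimal sketch of a 3-level nested policy-map (a 4-layer hierarchy once the implicit port level is included) in standard IOS XR MQC syntax; the class names, rates, and interface are illustrative, and a real deployment would typically distinguish subscribers by matching inner C-VLANs rather than using class-default at the subscriber level.

    class-map match-any VOICE
     match dscp ef
    !
    ! L4 class level: per-subscriber classes, including a priority class
    policy-map SUBSCRIBER-CLASSES
     class VOICE
      priority level 1
      police rate 1 mbps
     class class-default
    !
    ! L3 subscriber level: shape the subscriber, nest the class policy
    policy-map SUBSCRIBER
     class class-default
      shape average 20 mbps
      service-policy SUBSCRIBER-CLASSES
    !
    ! L2 subscriber-group level: shape the S-VLAN aggregate, nest again
    policy-map SUBSCRIBER-GROUP
     class class-default
      shape average 500 mbps
      service-policy SUBSCRIBER
    !
    ! Applied to a sub-interface; the physical port is the implicit L1 level
    interface TenGigE0/1/0/0.100
     service-policy output SUBSCRIBER-GROUP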
H-QOS – Priority & Priority Propagation
• Priority level 1 & 2 support
  – The level-1 high-priority queue is scheduled at strict priority, i.e. as long as it has not exceeded its configured maximum bandwidth, determined by policing
  – The level-2 high-priority queue is scheduled at relative strict priority after PQ level 1 has been scheduled, i.e. as long as it (PQ L2) has not exceeded its configured maximum bandwidth, determined by policing or shaping
• Priority propagation
  – Means that strict-priority scheduling (latency/priority behavior) is applied throughout all layers of the hierarchy in case of congestion at any of the levels
  – The latency assurance of a child class is automatically carried to the parent/grandparent levels for traffic in that class; e.g. under congestion at the parent/grandparent levels, traffic in this class is serviced first
• Unshaped priority traffic for lowest latency:
  – If level-1 priority traffic is scheduled into a parent shaper it will NOT actually be shaped, but scheduled at line rate
  – It is only accounted for at the parent scheduler so that the shapers will not be overrun
www.cisco.com/go/asr9000