SPRING-OPEN
SDN based WAN Control of
Open Segment Routers
An ONF TAG Project
Saurav Das
Project Lead & ONF Consultant
Ciena talk, Oct 23rd, 2014
Outline
• Motivation & Project Goals
• Project Description
• Progress
• What next?
Motivation: ONF Point of View
SDN/OpenFlow successful
• in DataCenters
• with Software Switches
• and Overlay networks
But when it comes to Hardware switches,
misconceptions abound
• OpenFlow is not mature
• OpenFlow does not work with current hardware
• OpenFlow does not scale
• SDN/OpenFlow is about centralized control
OpenFlow has evolved towards production readiness.
• 1.0 (Q4 ‘09)
  – state: flows, ports
  – interface: single message queue w/optional barriers
  – behavior: forward {0, 1, n}; match Eth, VLAN, IP, L4
• 1.1 (Q1 ‘11)
  – + Group Tables
  – + Multiple Tables/Pipelines
  – + forward 1-in-n (ECMP)
  – + match QinQ, MPLS, SCTP
  – + match virtual ports
• 1.2 (Q4 ‘11)
  – + IPv6
  – + multiple controllers
  – + extensible match
  – + extensible actions
• 1.3 (Q2 ‘12)
  – + per-flow metering
  – + tunnel-id
  – + multiple channels (auxiliary connections)
• 1.4 (Q4 ‘13)
  – + optical ports
  – + synchronized tables
  – + bundle messages
ONF TAG Project Goals
1. Demonstrate maturity and scale of the ONF
work product in hardware readily available today
using the latest stable versions of ONF protocols
– eg. OF 1.3.4.
2. Provide feedback to ONF WGs on their work
product from an implementation of the chosen
networking scenario.
3. Promote adoption by creating a core-kernel that
is extensible for value-add towards deployment,
interoperability and differentiation.
Non-Goals
1. Not creating GA product; no QA; will not be ready
for production nor interoperate with other
networks and network control planes. Will support
some elements helpful for productization (eg.
config, troubleshooting/OAM, visibility etc.)
2. Not delivering a specific service like Bandwidth-TE
/VPN/NFV. Instead supporting core-capabilities to
build such services on top (extensibility options)
3. Not a plugfest – data and control plane choices will
be made; however choices should be replaceable
by other parts, both commercial and open-source
as long as they conform to the requirements
Network Scenario: SDN based WAN Control
[Figure: a Controller System (Routing Service, Discovery Service, Forwarding Service) receives service requests and handles routing, recovery and label imposition. It controls the Open Segment Routers (OSR) over OpenFlow.]
• SR labels imposed by controller
• OSR FIB built by controller
One Way to Implement SR
• Controller: OpenDaylight or Cisco ONE or Juniper NorthStar
• PCE: eg. Cariden/Cisco or WANDL/Juniper
• PCEP for tunnel req & label imposition
• BGP-LS for topology info
• Routers run OSPFv2, OSPFv3 or ISIS for routing, recovery and label distribution (new in SR)
• IETF working on extending all of these protocols for Segment Routing
Controller/PCE not required for certain use cases - just
configure routers for SR via CLI
Why Segment Routing
Segment Routing (SR) or SPRING (IETF name)
– Source Packet Routing In NetworkinG
• Eliminates label distribution protocols – LDP and RSVP-TE
• Thereby eliminates synchronization and state management complexities
• Label distribution via OSPF or ISIS with suitable extensions (see IETF drafts)
• Source routing via ‘segments’
• maps to ‘labels’ in MPLS data plane;
• MPLS data plane unchanged – SR operations PUSH, NEXT, CONTINUE map to
MPLS operations PUSH, POP, SWAP (with same label) resp.
• Introduces globally significant labels - node segments
• retains locally significant labels – adjacency segments
• can use ECMP shortest-paths and Explicit Paths (loose, strict);
• can be used for TE/VPN/PBR/Service-chains
Think of Segment Routing as giving new meaning to labels allowing
different network operations and a simpler control plane without
changing the data plane!
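The mapping of SR operations onto MPLS operations can be illustrated with a small label-stack sketch. This is only an illustration: the function name `apply_sr_op` and the list-based stack are assumptions, not part of SPRING-OPEN.

```python
# Illustrative mapping of SR operations onto MPLS label-stack operations,
# as listed on the slide. All names here are hypothetical.
SR_TO_MPLS = {"PUSH": "PUSH", "NEXT": "POP", "CONTINUE": "SWAP"}


def apply_sr_op(stack, op, label=None):
    """Return a new label stack (index 0 is the top) after one SR operation."""
    if op == "PUSH":
        return [label] + stack
    if op == "NEXT":
        return stack[1:]
    if op == "CONTINUE":
        # SWAP with the same label: the top label is rewritten to itself.
        return [stack[0]] + stack[1:]
    raise ValueError(f"unknown SR operation: {op}")


stack = apply_sr_op([], "PUSH", 106)     # ingress imposes node segment 106
stack = apply_sr_op(stack, "CONTINUE")   # transit router: segment still active
after_transit = list(stack)
stack = apply_sr_op(stack, "NEXT")       # segment completed: label popped
print(after_transit, stack)
```

The transit step leaves the stack untouched, which is why a globally significant node segment looks the same at every hop.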
Outline
• Motivation & Project Goals
• Project Description
• Progress
• What next?
ONF TAG Project Core Requirements
• Must work on Hardware +
• Must use ONF Protocols +
• Must use Available Commodity Parts +
• Provide Feedback to Standards +
• Diversity of Solutions +
• Must be Extensible
Project Deliverables
1. Open Segment Router on 1 hardware platform
2. WAN Controller
• Supports Discovery and Routing Services
• Label imposition for segment-routing/stitching
• GUI/CLI, troubleshooting, stats
3. System Prototype & Demonstration
• Segment routed island
• Demonstrate discovery & several routing scenarios
• Extensible towards deployment & interoperability
4. Feedback
• What was not implemented and why?
• Gaps/inefficiencies in protocol
• HW requirements
Project Milestones & Timeline
Milestones: Open Segment Router (OSR); WAN Controller; Controller-OSR Integration; System Prototype & Demonstration
Timeline markers: June 1st, Aug 1st, Oct 1st, Dec 1st
Routing Service: Scenario # 1
Default Routing using Node Segments, ECMP and PHP
[Figure: routers 101 to 106 with subnets 10.10.1.0/24, 10.10.3.0/24, 10.10.4.0/24 and 10.10.6.0/24; the ECMP paths and the PHP point are marked.]
• Global label 106 imposed on pkts dst. to the 10.10.6 subnet
• Label is still 106 further along the path
101, 102 … 106 are Node Segments allocated out of the SRGB,
and bound to the router loopback addresses.
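A controller could derive the per-router entries for one node segment roughly as follows. This is a minimal sketch: the link list is a hypothetical two-plane topology (the exact links of the figure are not recoverable), hop count stands in for the link metric, and `node_segment_fib` is an illustrative name.

```python
from collections import deque

# Hypothetical topology: two planes between routers 101 and 106.
LINKS = [(101, 102), (101, 103), (102, 104), (103, 105), (104, 106), (105, 106)]


def neighbors(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def node_segment_fib(links, dst):
    """For each router, list (next_hop, action) entries for node segment dst."""
    adj = neighbors(links)
    dist = {dst: 0}
    queue = deque([dst])
    while queue:  # BFS gives the hop-count distance of every router to dst
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    fib = {}
    for node in adj:
        if node == dst:
            continue
        # ECMP: every neighbor that is one hop closer to dst is a next hop.
        hops = sorted(n for n in adj[node] if dist[n] == dist[node] - 1)
        # PHP: the router just before dst pops the label instead of keeping it.
        fib[node] = [(n, "POP" if n == dst else "CONTINUE") for n in hops]
    return fib


fib = node_segment_fib(LINKS, 106)
print(fib[101])
print(fib[104])
```

In this topology router 101 gets two equal-cost next hops that keep label 106, while 104 and 105 pop it before delivering to 106.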
Routing Service: Scenario # 2
Policy Routing
[Figure: routers 101 to 106 with subnets 10.10.1.0/24, 10.10.3.0/24, 10.10.4.0/24 and 10.10.6.0/24; link X is marked; packets are shown with label stacks 102, 106 and 104, 106; Anycast Node Segment 999.]
• Policy#1 – Traffic from .3 to .6 should avoid link X
• Policy#2 – Flow ‘f’ from .1 to .6 should stay in upper plane
Routing Service: Scenario # 3
TE Support: Load-balancing among non-ECMP Paths
[Figure: routers 101 to 106 with subnets 10.10.1.0/24, 10.10.3.0/24, 10.10.4.0/24 and 10.10.6.0/24; packets carry label stack 12009, 106 over non-ECMP paths.]
Same adjacency segment 12009 assigned to both outgoing links for load-balancing at 102, to 104 or 103
Once at 104 or 103, it’s just SPF to 106
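Binding one adjacency segment to two links can be sketched as a per-flow hash over the candidate links. The table contents, the flow-key string and the function name `pick_next_hop` are assumptions for illustration only.

```python
import zlib

# Hypothetical forwarding state at router 102: adjacency segment 12009 is
# bound to both outgoing links (towards 104 and towards 103).
ADJACENCY_SEGMENTS = {12009: [104, 103]}


def pick_next_hop(label, flow_key):
    """Pick one of the links bound to an adjacency segment, per flow."""
    links = ADJACENCY_SEGMENTS[label]
    # A stable hash keeps every packet of a flow on the same link.
    return links[zlib.crc32(flow_key.encode()) % len(links)]


hop = pick_next_hop(12009, "10.10.1.5->10.10.6.7:tcp:80")
print(hop)
```

In an OpenFlow 1.3 switch the same idea would likely be expressed as a select-type group with one bucket per link, rather than as software hashing.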
Routing Service: Scenario # 4
TE Support: Explicit Routing
[Figure: routers 101 to 106 with subnets 10.10.1.0/24, 10.10.4.0/24 and 10.10.6.0/24.]
Desired Explicit Path requires label stack: 102, 103, 105, 104, 106
Stitching Segments: Pop 105, then Push 104, 106
Deep-stacks can cause problems in merchant silicon
1) Cannot push many labels all at once
2) Can cause loss of entropy if hw cannot read down to L3/L4 headers
Solution: use Segment stitching
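Segment stitching can be sketched as splitting the explicit-path label stack into shorter tunnels. This is a minimal sketch under two assumptions: every label is a node segment, so the last label of a tunnel names the router where the next tunnel is pushed, and the hardware pushes at most `max_depth` labels at once. The function name `stitch` and the `"ingress"` placeholder are hypothetical.

```python
def stitch(label_stack, max_depth=3):
    """Split a label stack into tunnels of at most max_depth labels.

    Returns (router, labels_to_push) pairs. The first pair applies at the
    ingress router; each later pair applies at the stitching router where
    the previous tunnel ends.
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    plan = []
    router = "ingress"
    for start in range(0, len(label_stack), max_depth):
        chunk = label_stack[start:start + max_depth]
        plan.append((router, chunk))
        router = chunk[-1]  # the tunnel ends at the node owning the last label
    return plan


plan = stitch([102, 103, 105, 104, 106])
print(plan)
```

For the path on this slide the plan pushes 102, 103, 105 at the ingress and 104, 106 at router 105, matching the Pop 105 / Push 104, 106 step in the figure.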
Routing Service: Scenario # 5
Service Chaining
[Figure: routers 101 to 106 with subnets 10.10.1.0/24, 10.10.4.0/24 and 10.10.6.0/24; the desired chain visits services (Firewall, DPI) reached from routers 103 and 105 via Adjacency Segments 9002 and 16555.]
Label stack: 103, 9002, 105, 16555, 106
Note: Could have used segment-stitching or label-swapping to avoid deep label stack
SPRING-OPEN Data Plane Requirements
Supporting Processes
OS
Distribution
CPU
Bare-metal Hardware
OF Client
Gluework
SDK
ASIC
SPRING-OPEN Control Plane Requirements
• Routing Service
  – Policy Routing Manager – ACL, TE Support, Service-Chains
  – Default Routing Manager
  – Recovery Services
• Discovery Service
  – Network Snapshot Manager
  – Stats/OAM Manager
  – Config. Manager
  – Resource Manager
  – Link/Nbr. Disc.
• Forwarding Service
  – Consistent Update Manager
  – C2D Sync Manager
  – C2C Sync Manager
• Controller System
  – Visibility/Debug Fwk.
  – Conn. Mgr. / Event Engine
  – HA Manager
  – Dist. DB
SPRING-OPEN Control Plane Requirements
Routing Service
Components: Policy Routing Manager – ACL, TE Support, Service-Chains; Default Routing Manager; Recovery Services
• SPF Routing with Node Segments
• Use of ECMP and PHP
• Convergence
• Protection
• Connectivity management – ACL policies
• Avoiding links, nodes
• TE support – explicit strict paths
• Load balancing over non-equal-cost paths
• Service chaining
SPRING-OPEN Control Plane Requirements
Discovery Service
Components: Network Snapshot Manager; Stats/OAM Manager; Config. Manager; Resource Manager; Link/Nbr. Disc.
• Network wide view of topology, traffic, capabilities and resource limits
• Maintains API for requests from routing, forwarding services & external req.
• Provides versioning
• LLDP based distributed Link/Neighbor Discovery
• Data Plane Stats
• Data Plane Troubleshooting
• Node/link characteristics, capabilities & constraints (eg. table-types, bw etc.)
• Scope of identifiers, namespaces & association with nodes/intfs
• Verifying configuration vs. discovered resources
• Proxy edge services – eg. ARP, ICMP
SPRING-OPEN Control Plane Requirements
Forwarding Service
Components: Consistent Update Manager; C2D Sync Manager; C2C Sync Manager
• Responsible for consistency requirements when updating
  – Multiple entries in a table
  – Multiple tables in a switch
  – Multiple switches in a network
• Responsible for syncing controller-to-controller forwarding state
SPRING-OPEN Control Plane Requirements
Controller System
[Figure: the Routing, Discovery and Forwarding Services sit on top of the Controller System.]
• Visibility/Debug Fwk.: debugCounters, debugEvents
• Conn. Mgr. / Event Engine: New Handshake – better error hd, better SM
• HA Manager: Leader Election; Support for EQUALS
• Dist. DB: Dist. key-value store; Persistence; Notifications
• Typed Table Abstraction: TTP1, TTP2, TTP3
• Interfaces: REST; OpenFlow 1.3
• GUI, CLI, Dashboard, Tseries, Config, RT view
Outline
• Motivation & Project Goals
• Project Description
• Progress
• What next?
Project Members
Committed
Considering
Switch
Development
NTT (Lagopus)
Dell (FTOS)
Intel
Broadcom
Controller
Development
ON.Lab (ONOS)
ONF
Switch
Contribution
Delta
Dell
NTT
Advisory,
Engineering
Testbed
Verizon
NTT
Google
Tencent
Intel
Broadcom
ON.Lab Involvement
SPRING-OPEN
IPv4 unicast routing
using MPLS labels,
following Segment
Routing rules
A platform for multiple services:
Multi-layer
Overlay
Security
ONOS
Typed Table
Hardware
A platform for multiple switch
types:
Software Switches
Un-typed tabled hardware
Optical Switches
ONOS System Architecture, v0.1.5 (current)
[Figure, top to bottom:]
• Control Applications
• ONOS Graph API
• Network Graph (Eventually consistent global view)
• Instance 1, Instance 2, Instance 3: each with Intent F/W, Topology Replica and OpenFlow Manager+ (+Floodlight Drivers)
• Event Notifications: Hazelcast
• Persistence: RAMCloud, low-latency k/v store (Strongly Consistent)
• Coordination: Zookeeper, Distributed Registry (Strongly Consistent)
• Hosts attached to the network
Progress
mid-May
master
onos13
1st June
-- OF 1.3 support
-- Driver Manager
-- I/O State Machine
-- Role management
-- Debug framework
onos13integration
1st July
25
26
8th August
27
1st Sept
mid Oct
end Nov
-- Unit tests
-- Manual Integration
New Changes (1.3 switches)
- new OF Library (Loxigen)
- new support for different
switches - DriverManager
- support for Role. EQUAL
- simultaneous support for 1.0 and
1.3 switches
- prototyping
Test & Integration
- integration with master
- unit test coverage > master
- ensured nightly tests are passing
- ensured global context and app
functionality
- reviewed and merged to master
New
ONOS
(1.0 + 1.3)
Old ONOS (1.0 switches)
- old state machine
(or lack thereof)
- old switch/port handling
- registry service (zookeeper)
- role management/changer
- ONOS storage + upper
functionality
- old controller
modified
modified
modified
modified
Newer Floodlight (1.0 switches)
- new I/O state machine
- new switch/port handling
- new role management
- new debug framework
- new storage/sync-manager
- new controller
- switch manager
- role manager
Progress
mid-May
master
onos13
1st June
-- OF 1.3 support
-- Driver Manager
-- I/O State Machine
-- Role management
-- Debug framework
onos13integration
1st July
25
-- Unit tests
-- Manual Integration
26
8th August
27
1st Sept
mid Oct
end Nov
-- Prototyping
-- CPqD13
-- OVS13
-- Dell13
SPRING-OPEN Hardware Abstraction
[Figure: OpenFlow table pipeline. Pkt. + Meta-Data + Action Set {} are carried from table to table.]
Incoming Packet → Ingress Port
→ VLAN Flow Table [0]
→ Termination MAC Flow Table [10]
→ Unicast IPv4 Routing Flow Table [20] / MPLS Forwarding Flow Table [30]
→ ACL Policy Flow Table [50]
→ Apply Actions (push/pop mpls, TTL, Set, Output, Group)
→ Outgoing Packet → Egress Port or Group
Group Table Entries:
• L3 Unicast
• MPLS Unicast
• ECMP
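The table pipeline on this slide can be sketched as a lookup order per packet type. The table ids come from the slide; the rule that the Termination MAC table branches on EtherType (IPv4 to table 20, MPLS to table 30) is an assumption, and `table_path` is a hypothetical helper.

```python
# Table ids as shown on the slide.
TABLES = {
    0: "VLAN",
    10: "Termination MAC",
    20: "Unicast IPv4 Routing",
    30: "MPLS Forwarding",
    50: "ACL Policy",
}

ETH_IPV4 = 0x0800  # EtherType for IPv4
ETH_MPLS = 0x8847  # EtherType for MPLS unicast


def table_path(eth_type):
    """Return the ordered table ids a packet of this EtherType would visit."""
    path = [0, 10]
    if eth_type == ETH_IPV4:
        path.append(20)
    elif eth_type == ETH_MPLS:
        path.append(30)
    else:
        return path  # assumption: other traffic is not routed any further
    path.append(50)
    return path


for eth in (ETH_IPV4, ETH_MPLS):
    print(hex(eth), [TABLES[t] for t in table_path(eth)])
```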
Progress
mid-May
master
onos13
1st June
-- OF 1.3 support
-- Driver Manager
-- I/O State Machine
-- Role management
-- Debug framework
onos13integration
1st July
25
-- Unit tests
-- Manual Integration
26
8th August
27 -- Network
1st Sept
mid Oct
end Nov
Config
Manager
-- Prototyping
-- CPqD13
-- OVS13
-- Dell13
ONOS NetworkConfigManager
Channel
Config
file
Network
Config
Mgr.
Startup
Config
Config Service
CLI/
REST
Running
Config
Topology Publisher
host
s
Running
Config
ONOS
Instance
Channel
Instance
ONOS
Startup
Config
Startup
Config
switches
links
ONOS
Instance
Startup
Config
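Filtering Logic
Restrict switches?
• Yes → Default Deny (Allow list)
  – Has Config? No → DENY
  – Has Config? Yes → Allowed? No → DENY
  – Has Config? Yes → Allowed? Yes → ACCEPT & ADD
• No → Default Allow (Deny list)
  – Has Config? No → ACCEPT
  – Has Config? Yes → Allowed? No → DENY
  – Has Config? Yes → Allowed? Yes → ACCEPT & ADD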
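The filtering flowchart can be read as a small decision function. This is a sketch of one plausible reading of the chart, not the ONOS NetworkConfigManager code: the function name `admit_switch`, the `config` dictionary shape and the switch ids are all hypothetical.

```python
def admit_switch(dpid, restrict, config):
    """Decide whether a connecting switch is accepted.

    restrict: True selects default-deny, False selects default-allow.
    config:   dict mapping switch id -> {"allowed": bool} (hypothetical shape).
    Returns "ACCEPT", "ACCEPT & ADD" or "DENY".
    """
    entry = config.get(dpid)
    if entry is None:
        # No configuration for this switch: fall back to the default policy.
        return "DENY" if restrict else "ACCEPT"
    # Configured switches are admitted (and added) only if marked allowed.
    return "ACCEPT & ADD" if entry["allowed"] else "DENY"


config = {"00:01": {"allowed": True}, "00:02": {"allowed": False}}
for dpid in ("00:01", "00:02", "00:03"):
    print(dpid, admit_switch(dpid, True, config), admit_switch(dpid, False, config))
```

Under this reading the two modes differ only for switches that have no configuration.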
Progress
mid-May
master
onos13
1st June
-- OF 1.3 support
-- Driver Manager
-- I/O State Machine
-- Role management
-- Debug framework
ntt
onos13integration
1st July
cli
gui
25
-- Unit tests
-- Manual Integration
dell
26
-- Prototyping
8th August
27 -- Network
1st Sept
Config
Manager
-- CPqD13
-- OVS13
-- Dell13
onos-spring
SR Prototype
-- Saurav (ONF)
-- Sangho (ON.Lab)
-- Srikanth (Ericsson/ON.Lab partner)
mid Oct
end Nov
Dell Switch Progress
Delivered two switches with pre-alpha
software for integration with controller
Demo
• Default Segment Routing with MPLS (node-segments)
and ECMP shortest-paths - Communication between
subnets across the SR WAN works
• ARP/ICMP handling, subnet-configuration, pinging
router-IPs (normal router behavior) works
• Link and Switch failure recovery works
• Policy routing works for one use-case - creating an SR
tunnel and assigning flow(s) to it
• Segment stitching works (where tunnel requires pushing
more than 3 labels, and so we stitch-segments of the
tunnel to get around hardware limitations)
Demo
192.168.0.2
192.168.0.5
102
105
7.7.7.0/24
10.0.1.0/24
h1
h6
101
106
192.168.0.1
192.168.0.6
103
104
192.168.0.3
192.168.0.4
Outline
• Motivation & Project Goals
• Project Description
• Progress
• What next?
Options for Extensibility
• Extend the controller for hierarchical, geographically distributed control
SDN WAN Architecture
Global
Controllers
Local
Controllers
WAN links
WAN links
Google’s B4 Architecture
Gateway
Gateway
Quagga
Quagga
Quagga
RAP TE-AGENT
OFC
paxos
Paxos
Servers
Site A
OFA
Switch
OFA
Switch
OFA
Switch
OFA
Switch
Global
TE
Central
CentralTE
TE
Servers
Servers
Site B
Controllers
Site C
Controllers
Servers
Servers
Switch
hardware
B4 WAN
iBGP, ISIS
Site B
Site
controllers
Site C
eBGP
Data Center
© 2013 SDN Academy, LLC™. All Rights Reserved.
Data
Center
Data
Center
Microsoft’s SWAN Architecture
Network agent
Datacenter
Switch
Datacenter
Inter-DC
WAN
Service host
Service broker
SWAN controller
Figure 5: Architecture of SWAN.
Options for Extensibility
• Extend the controller for hierarchical, geographically distributed control
• Add E-BGP on the controller for exchanging reachability information, route selection and more
Options for Extensibility
• Extend the controller for hierarchical, geographically distributed control
• Add E-BGP on the controller for exchanging reachability information, route selection and more
• Provide L3VPN/VPLS/VPWS services
• Provide full blown TE solution with bandwidth optimization, calendaring etc.
• Extend control plane to work with optical switches / networks
• Interoperability with traditional LDP/IGP control plane
IP Routing without an IGP
[Figure: topology of routers 100 to 110, shown in three successive states.]
Consistent updates – loop free updates
Segment Stitching
[Figure: routers 101 to 106 with subnets 10.10.1.0/24, 10.10.4.0/24 and 10.10.6.0/24.]
Desired Explicit Path requires label stack: 102, 103, 105, 104, 106
Stitching Segments: Pop 105, then Push 104, 106
Deep-stacks can cause problems in merchant silicon
1) Cannot push many labels all at once
2) Can cause loss of entropy if hw cannot read down to L3/L4 headers
Solution: use Segment stitching
B4’s In-Place Replacement Model
Gateway
Gateway
Quagga
Quagga
Quagga
RAP TE-AGENT
OFC
paxos
Paxos
Servers
Site A
OFA
Switch
OFA
Switch
OFA
Switch
OFA
Switch
Global
TE
Central
CentralTE
TE
Servers
Servers
Site B
Controllers
Site C
Controllers
Servers
Servers
Switch
hardware
B4 WAN
iBGP, ISIS
Site B
Site
controllers
Site C
eBGP
Data Center
Data
Center
Data
Center
SPRING-OPEN’s Parallel Nw Model
SDN
Fabric
Traditional
Network
Parallel Network
• parallel SDN fabric, interacts with traditional network and outside world using E-BGP
• small number of sites
• low volume of production traffic
• as confidence is gained, grow users at site, increase footprint to more sites
Options for Extensibility
• Extend the controller for hierarchical, geographically distributed control
• Add E-BGP on the controller for exchanging reachability information, route selection and more
• Provide L3VPN/VPLS/VPWS services
• Provide full blown TE solution with bandwidth optimization, calendaring etc.
• Extend control plane to work with optical switches / networks
• Interoperability with traditional LDP/IGP control plane
• In-band control
• Add FRR to data plane recovery
• Deeper buffers & QoS in white-box platform
• Scale-out Segment Routers with white-boxes
• More OAM / troubleshooting features
• Security features
• Multicast/IPv6 … and much more
Summary
• Motivation & Project Goals
• Demonstrate maturity & scale of ONF work product
• Promote adoption by creating core-kernel
• Project Description
• SDN based WAN control of Open Segment Routers
• Controllers, Bare-metal, merchant-Si, MPLS, OF1.3
• Prototype & Demonstrate several Segment Routing
scenarios in 6 months – multi-member-company effort
• Progress
• Prototyping with software switches using OF1.3
• Integration with Dell hardware switch beginning Nov
• Next
• Lots of extensibility options for value-add,
interoperability and deployment