Presentation

advertisement
Platform-agnostic
Behavioral Forwarding and Stateful
Flow Processing at Wire Speed
Giuseppe Bianchi
CNIT / University of Roma Tor Vergata
Joint work with M. Bonola, A. Capone, C. Cascone, S. Pontarelli, D. Sanvito
Supported by EU ICT-05 grant:
Beba
BEhavioural BAsed forwarding
Giuseppe Bianchi
Outline
èMotivation and goal
èBackground and state of the art
èOpenState: beyond OpenFlow
èBeyond OpenState?
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Why flow processing?
èYesterday’s networks:
ðIncreasebandwidth,decreaselatency
èToday’s networks:
ðIncreasebandwidth,decreaselatency
ðAdd(very)specializedprocessingfornetwork
performanceimprovements
àLoadbalancing,TCPacceleration,trafficpolicing,etc
ðImprovenetworksecurity
àTrafficclassificationandfiltering,anomalydetection,DDoS
detection,pattern/signaturerecognition,etc
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Why programmable flow processing?
èYesterday’s networks:
ðIncreasebandwidth,decreaselatency
èToday’s networks:
ðIncreasebandwidth,decreaselatency
ðAdd(very)specializedprocessingfornetwork
performanceimprovements
àLoadbalancing,TCPacceleration,trafficpolicing,etc
ðImprovenetworksecurity
àTrafficclassificationandfiltering,anomalydetection,DDoS
detection,pattern/signaturerecognition,etc
è Flexible approaches/techniques for different needs!
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Why flow processing at wire speed?
è Softwarisation does NOT necessarily imply x86
ð Generalpurpose network/flow processors(domain-specific)canbetter
fithighrateprocessingandretainVNFgoals(repurposability,ondemandplacement,etc)
à Generalpurpose,butdomainspecific,HWisstill2-3ordersofmagnitudebetterthan
SW(thinktoTCAM’swildcardmatching)
è Packet-level processing needs
ð …if(pck==SYN) thenforwardremainingpacketsto…
Often «stateful»: depends on
ð …if(#pck >X)then…
what happened before…
ð …ifpck_fragment #1followedbyfragment#lastthen…
ð …set(forwardingtable=MAC_SRCà in-port)
è Flexibility @ packet-level time scale à VERY hard!
ð @100Gbpslink:64Bpacket=5.12nanoseconds
ð Hard(euphemism)torelyonCPUà confinedtoslowpath
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Where is the problem?
Plenty of network processors…
Yes, but… ALL proprietary:
àproprietary programming model
and languages (APIs)
à Flexibility up to the vendor
Yes but… what about P4? More later!
Giuseppe Bianchi
Platform-agnostic
Node behavior
Description
(formal)
Network
Entity
e.g. Switch
Any vendor, any size, any HW/SW platform…
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Platform agnostic HW
configuration: SDN foundation
Software-Defined Networking
Traditional networking
smart, slow, (logically) centralized
Programmable
switch
Controlplane
Control-plane
Data-plane
Control-plane
Switch
Data-plane
Control-plane
Data-plane
Data-plane
API to the
data plane
(e.g., OpenFlow)
Data-plane
Problem!! As of now, API to the data plane (OpenFlow)
only for «static» forwarding rules
Giuseppe Bianchi
Data-plane
dumb, fast
What we’d like to do?
OpenFlow / SDN
Our view / SDN
SMART!
Controlplane
OpenFlow
switch
ß Forwarding rules
Central control à
still decides how
switches shall
«behave»
Extended
switch
Data-plane
Data-plane
DUMB!
SMART!
SMART!
Controlplane
ß Forwarding behavior:
ß Forwarding rules AND
how they should change
or adapt to «events»
Smart switches à
can dynamically
update flow tables
at wire speed
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
What we’d like to do?
OpenFlow / SDN
Our view / SDN
Central control à
still decides how
aswitches
nutshell:shall
«behave»
SMART!
BehavioralSMART!
Forwarding in
ControlControlDynamic forwarding
rules/states
à
plane
plane
some control tasks back (!) into the switch
OpenFlow
switch
Extended
Forwarding
rules
ß Forwarding behavior:
switch
(hard ß
part:
via platform-agnostic
abstractions)
Data-plane
Data-plane
DUMB!
SMART!
ß Forwarding rules AND
how they should change
or adapt to «events»
Smart switches à
can dynamically
update flow tables
at wire speed
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Outline
èMotivation and goal
èBackground and state of the art
èOpenState: beyond OpenFlow
èBeyond OpenState?
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
The emergence of Software
Defined Networking
è Traditional concern: how to configure and manage multi-vendor
networks
ð Configurationinterfacesnotonlydifferfordifferent vendors/devicesbuteven
varyacrossdifferentfirmwareversionsofsamedevice!
è Before 2008: many ideas/proposals, no real world impact
è 2008 breakthrough: OpenFlow
ð Firstplatform-agnosticAPItotheswitchHW– key:pragmaticcompromise!
à “Before OpenFlow, the ideas underlying SDN faced a tension between the vision of fully
programmable networks and pragmatism that would enable real-world deployment. OpenFlow
struck a balance between these two goals by enabling more functions than earlier route controllers
and building on existing switch hardware, through the increasing use of merchant-silicon chipsets
in commodity switches”. [quote from Feamster, Rexford, Zegura, 2014]
è Since then:
ð Excellentworkonnetwork-wideprogramming abstractions&controllers
ð Excellentworkon“softwarization”(virtualization)ofnetworkfunctions
ð Butverylimited(significant)workonbeyondOpenFlow HWabstractions
àAtleastupto2013/14;ThenPOF,P4,ourOpenState…morelater
Giuseppe Bianchi
SDN key concept
Central view
Lots of advantages…. simpler centralized control, immediate deployment, cheaper HW,
uniform interface to the switch, high level abstractions and programming models,
etc, etc etc
Giuseppe Bianchi
SDN enabler: the OpenFlow compromise
[original quotes: from OF 2008 paper]
èBest approach: “persuadecommercialname-brand
equipmentvendorstoprovideanopen,programmable,
virtualizedplatformontheirswitchesandrouters”
ðPlainlyspeaking:openthebox!!Noway…
èViable approach: “compromiseongeneralityandseek
adegreeofswitchflexibilitythatis
ðHighperformanceandlowcost
ðCapableofsupportingabroadrangeofresearch
ðConsistent with vendors’ need for closed
platforms.
Giuseppe Bianchi
OpenFlow match/action abstraction
Programmable logic
Vendor-implemented
Matching
Rule
Action
1. FORWARD TO PORT
2. ENCAPSULATE&FORWARD
3. DROP
4. …
Extensible
Switch MAC
Port
src
MAC
dst
Eth
type
VLAN
ID
IP
Src
IP
Dst
IP
Prot
Pre-implementedmatching engine
Giuseppe Bianchi
TCP
TCP
sport dport
Example
Descripti
on
Port
MAC
src
MAC
dst
Eth
type
VLAN
ID
IP Src
IP
Dest
TCP
sport
TCP
dport
Action
L2
switching
*
*
00:1f:.
.
*
*
*
*
*
*
Port6
L3 routing
*
*
*
*
*
*
5.6.*.*
*
*
Port6
Micro-flow
handling
3
00:20.
.
00:1f..
0x80
0
Vlan1
1.2.3.
4
5.6.7.
8
4
17264
Port4
Firewall
*
*
*
*
*
*
*
*
22
Drop
VLAN
switching
*
*
00:1f..
*
Vlan1
*
*
*
*
Port6,
port7,
port8
Readily implemented in legacy TCAM
Ternary Content Addressable Memory
Giuseppe Bianchi
How to populate flow states?
Controller
No match
Packet F1
Giuseppe Bianchi
How to populate flow states?
Take «smart»
decision
Controller
Encapsulate &
forward
Giuseppe Bianchi
How to populate flow states?
Controller
Packet F1
Giuseppe Bianchi
Centralization: not a panacea!
èCentral view of the network
àNetwork as a “whole”
àNetwork states
àMulti-node coordination
èSignalling & latency!
ðO(100 ms)
à100ms = 20M packets lost @ 100 gbps
Great idea for networkwide states and «big
picture» decisions
Poor idea for local
states/decision, (way!)
better handled locally
(less delay, less load)
A considered «solution»: proactive flow states: pre-populate flow tables. Yes, but how to?!
Giuseppe Bianchi
Distributed controllers
the «common» way to address such cons
Proprietary controller extensions?
Back to Babel?
A non-solution!
still slow path latency!!
«true» fast path solution: update
forwarding rules in 1 packet
time – 5 ns @ 64B x 100 Gbps
3 ns = 60cm signal propagation…
Giuseppe Bianchi
OpenFlow evolutions
è Pipelined tables from v1.1
ð Overcomes TCAM size
limitation
ð Multiple matches natural
àIngress/egress, ACL, sequential L2/L3 match, etc.
è Extension of matching capapilities
ð More header fields
ð POF (Huawei, 2013): complete matching flexibility!
è Openflow «patches» for (very!) specific processing
needs and states
ð Group tables, meters, synchronized tables, bundles, typed tables
(sic!), etc
ð Not nearly clean, hardly a «first principle» design strategy
ð A sign of OpenFlow structural limitations?
Giuseppe Bianchi
Programming the data plane:
The P4 initiative
è SIGCOMM CCR 2014. Bosshart,
McKeown, et al. P4: Programming
protocol-independent packet processors
ð Dramatic flexibility improvements in packet processing
pipeline
ETH
VLAN
L2S
àConfigurable packet parser à parse graph
àTarget platform independence à compiler maps
onto switch details
àReconfigurability à change match/process fields L2D
during pipeline
IPV4
IPV6
TCP
UDP
è Feasible with HW advances
ð Reconfigurable Match Tables, SIGCOMM 2013
ACL
TM
ð Intel’s FlexPipe architectures
è P4.org: Languages and compilers
ð Further support for «registry arrays» and counters meant Table Graph
to persist across multiple packets
àThough no HW details, yet
Giuseppe Bianchi
Programming the data plane:
The P4 initiative
è SIGCOMM CCR 2014. Bosshart,
McKeown, et al. P4: Programming
protocol-independent packet processors
ð Dramatic flexibility improvements in packet processing
pipelineOpenFlow 2.0 proposal
ETH
VLAN
L2S
àConfigurable packet parser à parse graph
àTarget
platform
independence
à compiler
maps
Stateful
processing,
but only
«inside»
a
onto switch details
packet processing pipeline!
àReconfigurability à change match/process fields L2D
during pipeline
IPV4
IPV6
TCP
UDP
è FeasibleNot
with
advances
yet HW
(clear)
support for stateful
ð Reconfigurable
Match«across»
Tables, SIGCOMM
2013 packets
processing
subsequent
ACL
TM
ð Intel’s FlexPipe
architectures
in the flow
è P4.org: Languages and compilers
ð Further support for «registry arrays» and counters meant Table Graph
to persist across multiple packets
àThough no HW details, yet
Giuseppe Bianchi
Outline
èMotivation and goal
èBackground and state of the art
èOpenState: beyond OpenFlow
èBeyond OpenState?
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
OpenState, 2014
è Our group, SIGCOMM CCR 2014; surprising finding:
an OpenFlow switch can «already» support stateful
evolution of the forwarding rules
ð With almost marginal (!) architecture modification
OpenFlow / SDN
OpenState / SDN
SMART!
Controlplane
OpenFlow
switch
ß Forwarding rules
Central control à
still decides how
switches shall
«behave»
OpenState
switch
Data-plane
Data-plane
DUMB!
SMART!
Giuseppe Bianchi
SMART!
Controlplane
ß Forwarding behavior:
ß Forwarding rules AND
how they should change
or adapt to «events»
Smart switches à
can dynamically
update flow tables
at wire speed
Our findings at a glance
èAny control program that can be
described by a Mealy (Finite State)
Machine is already (!) compliant with
OF1.3
èMM + Bidirectional flow state
handling requires minimal hardware
extensions to OF1.1+
Details in G. Bianchi, M. Bonola, A. Capone, C. Cascone,
“OpenState: programming platform-independent stateful
OpenFlow applications inside the switch”, ACM SIGCOMM Computer Communication Review,
vol. 44, no. 2, April 2014.
Giuseppe Bianchi
Remember OF match/action API
Programmabile logic
Vendor-implemented
Matching
Rule
Action
1. FORWARD TO PORT
2. ENCAPSULATE&FORWARD
3. DROP
4. …
Extensible
Pre-implemented matching engine
Switch MAC
Port
src
MAC
dst
Eth
type
Giuseppe Bianchi
VLAN
ID
IP
Src
IP
Dst
IP
Prot
TCP
TCP
sport dport
What is the OF abstraction,
formally?
è Packet header match = “Input Symbol” in a finite set
I={i1, i2, …, iM}.
ð One input symbol = any possible header match
ð Possible matches pre-implemented; cardinality depends on match
implementation
ð Theoretically, it is irrelevant how the Input Symbols’ set I is established
ài.e. each input symbol = Cartesian combination of multiple header field
matches, further including “wildcard” matches;
àE.s. incoming packet destination port = 5238 AND source IP address is
160.80.82.1, and the VLAN tag is 1111, etc.
è OpenFlow actions = “Output Symbols” in finite set
O={o1, o2, …, oN}
ð Pre-implemented actions
è OpenFlow’s match/action abstraction: a map T : I à O
ð all what the third party programmer can specify!
Giuseppe Bianchi
Reinterpreting (and extending) the
OpenFlow abstraction
èOpenFlow map is trivially recognized to be a
very special and trivial case of a Mealy Finite
State Machine
èT : {default-state}× I à{default-state}× O,
èi.e. a Finite State Machine with output, where
we only have one single (default) state!
èBy adding (per-packet) retrieval and update of
states, OpenFlow can be turned it into a Mealy
machine executor!!
Giuseppe Bianchi
If an application can be «abstracted»
in terms of a mealy Machine…
Port!=5123
Port=5123
Port=6234
Port=7345
Port=8456
Port=22
Drop()
Drop()
Drop()
Drop()
Drop()
Forward()
DEFA
ULT
Stage
1
Stage
2
Stage
3
Port!=6234
Port!=7345
Port!=8456
Drop()
Drop()
Drop()
OPEN
Port!=22
Drop()
Example: Port Knocking firewall
knock «code»: 5123, 6234, 7345, 8456 à then open Port 22
Giuseppe Bianchi
… it can be transformed in a Flow Table!
Ipsrc: ??
MATCH: <state, port> à ACTION: <drop/forward, state_transition>
Plus a state lookup/update
State DB
Metadata:
State-label
IPsrc
Port
Match fields
Actions
state
event
action
Next-state
DEFAULT
Port=5123
drop
STAGE-1
STAGE-1
Port=6234
drop
STAGE-2
STAGE-2
Port=7345
drop
STAGE-3
STAGE-3
Port=8456
drop
OPEN
OPEN
Port=22
forward
OPEN
OPEN
Port=*
drop
OPEN
*
Port=*
drop
DEFAULT
Giuseppe Bianchi
IpsrcàOPEN
State DB
Putting all together
1) State lookup
IPsrc=1.2.3.4
2) XFSM state transition
Port=8456
STAGE-3
IPsrc=1.2.3.4
State Table
Flow key
IPsrc= … …
Ipsrc= … …
IPsrc=1.2.3.4
IPsrc=5.6.7.8
IPsrc= … …
IPsrc= no match
state
… … …
… … …
Write:
STAGE-3
OPEN
OPEN
… … …
DEFAULT
write
Port=8456
Match fields
XFSM Table
Actions
state
event
action
Next-state
DEFAULT
STAGE-1
STAGE-2
STAGE-3
OPEN
OPEN
*
Port=5123
Port=6234
Port=7345
Port=8456
Port=22
Port=*
Port=*
drop
drop
drop
drop
forward
drop
drop
STAGE-1
STAGE-2
STAGE-3
OPEN
OPEN
OPEN
DEFAULT
3) State update
OPEN
IPsrc=1.2.3.4
Port=8456
1 «program» XFSM table for all flows
(same knocking sequence)
N states, one per (active) flow
Giuseppe Bianchi
Cross-flow state handling
è Yes but what about MAC learning, multi-port
protocols (e.g., FTP), reverse path forwarding,
bidirectional flow handling, etc?
MACdst
MACsrc
State Table
lookup
Flow key
48 bit MAC addr
state
Port #
XFSM Table
MACdst
MACsrc
update
state
event
action
Next-state
Port#
*
forward
In-port
State Table
Flow key
48 bit MAC addr
DIFFERENT lookup/update scope
Field 1
Field 2
Flowkey selector
Giuseppe Bianchi
Field N
Read/write signal
state
Port #
(HW) OpenState Architecture
update
extractor
update
next state
D-left
Hash
Table
lookup+state
Look-up
extractor
Mixe
r
lookup
delay queue
TCAM
1
TCAM
2
RAM2
RAM1
action
Action
Block
XFSM table
Flow table
egress queues
Ingress queues
èHW implementation details:
ð First TCAM only for static states (e.g. packets belonging to a given subnet)
ð 5 clock (2 SRAM read + 2 TCAM + 1 SRAM write)
ð 10 Gbps just requires 156 MHz clock TCAM, trivial
HW implementation Details in HPSR 2015, Pontarelli, Bonola, Bianchi, Capone, Cascone
Giuseppe Bianchi
Outline
èMotivation and goal
èBackground and state of the art
èOpenState: beyond OpenFlow
èBeyond OpenState?
Platform-agnostic Behavioral Forwarding and Stateful Flow Processing at Wire Speed
Giuseppe Bianchi
Mealy Machine: nice but
insufficient!
è«true» Flow processing requires memory,
registries, counters, etc
ðState alone is insufficient
è«true» flow processing requires operations
(compare, add, shift, etc)
ðOpenFlow (forwarding) actions are insufficient
è«true» flow processing requires… «processing»
ðProcessing = CPU: cannot afford any ordinary CPUs at ns time
scales wire speed!
Can we further evolve OpenState into an architecture equivalent to a
“full” CPU (Without using any CPU?)
AND CAPABLE OF EXECUTING A PLATFORM AGNOSTIC
ABSTRACTION?
Giuseppe Bianchi
Extended finite state
machines: much more general!
è Mealy Machines: 4-tuple
ð I, O, S
ð T:S×IàS×O
Input symbol
Check Conditions on D
State 1
Output symbol
update D
è XFSM: 7-tuple
ð I, O, S (Input symbols, output symbols, states)
àAs before, S = User-defined
ð D=D1×…×Dn n-dimensional linear space
àRegistries!!! Global or (user-defined) per flow!!
ð F = set of enabling functions fi:Dà{0,1}
àBoolean Conditions on registries!!!
ð U = set of update functions ui:DàD
àUpdate of the registry values!
ð T:S×I×FàS×O×U the actual XFSM transition
àA mapà can be implemented by the TCAM!
Giuseppe Bianchi
State 2
Extended finite state
machines: much more general!
è Mealy Machines: 4-tuple
ð I, O, S
ð T:S×IàS×O
Input symbol
Check Conditions on D
State 1
Output symbol
update D
Question to CS formal theorists in the
can an XFSM execute any arbitrary
è XFSM:room:
7-tuple
ð I, O, algorithm?
S (Input symbols, output symbols, states)
àAs before, S = User-defined
ð D=DIf
linearto:
space
not, we
are very close
Abstract State
1×…×D
n n-dimensional
àRegistries!!!
Global orslight
(user-defined)
per flow!!
Machines (further
generalization)
ð F = set
of enabling
functions fi:Dà{0,1}
indeed
can [Gurevich]
àBoolean Conditions on registries!!!
ð U = set of update functions ui:DàD
àUpdate of the registry values!
ð T:S×I×FàS×O×U the actual XFSM transition
àA mapà can be implemented by the TCAM!
Giuseppe Bianchi
State 2
Towards an Open Flow Processor
èHW architecture «executing» an XFSM
ðSeems feasible, via appropriate extension of OpenState
èDetails to be presented in a follow up
presentation
Giuseppe Bianchi
An OFP program = a platform
agnostic abstract XFSM!
(example: a TCP SYN scan detection+mitigation)
NEW_TCP_FLOW
<D0 = 0>
<D1 = pkt.ts>
[OUTPUT 1]
DEFAUL
T
D0: TCP SYN rate
D1: last packet timestamp
D2: DROP state expiration
timestamp
G0: rate threshold (global)
G1: DROP duration (global)
IDLE_TIMEOUT_EXPIRED
<REMOVE_FLOW_ENTRY>
ANY_PACKET
If D2 < pkt.ts
<D0 = 0>
<D1 = pkt.ts >
[OUTPUT 1]
DROP
ANY_PACKET
if D2 > pkt.ts
[DROP]
NEW_TCP_FLOW
if D0 >= G0
<D2 = pkt.ts + G1>
[DROP]
Giuseppe Bianchi
MONITO
R
NEW_TCP_FLOW
if (D0 < G0)
<D0 = rate(D0, D1)>
<D1 = pkt.ts >
[OUTPUT 1]
An OFP program = a platform
agnostic abstract XFSM!
(example: a TCP SYN scan detection+mitigation)
NEW_TCP_FLOW
<D0 = 0>
<D1 = pkt.ts>
[OUTPUT 1]
DEFAUL
T
D0: TCP SYN rate
D1: last packet timestamp
D2: DROP state expiration
timestamp
G0: rate threshold (global)
G1: DROP duration (global)
IDLE_TIMEOUT_EXPIRED
<REMOVE_FLOW_ENTRY>
DROP
ANY_PACKET
if D2 > pkt.ts
[DROP]
ANY_PACKET
If D2 < pkt.ts
<D0 = 0>
<D1 = pkt.ts >
[OUTPUT 1]
MONITO
R
Note: guaranteed to run at wire speed!
(does not rely on any CPU: it is directly
NEW_TCP_FLOW
«executed»
byGthe architecture, with the
if D >=
<D = pkt.ts + G
>
TCAM performing
state
transitions!
[DROP]
0
2
Giuseppe Bianchi
0
1
NEW_TCP_FLOW
if (D0 < G0)
<D0 = rate(D0, D1)>
<D1 = pkt.ts >
[OUTPUT 1]
Conclusions
èPlatform-agnostic programming of control
intelligence inside devices’ fast path (!) seems
not only possible but even viable
ðViable on today’s HW (implemented!)
ðTCAM as «State Machine processor»
ðSupport for full XFSM, beyond OpenState’s Mealy Machine
èRethinking control-data plane SDN
separation?
ðBack to active networking 2.0? (but now with a clearcut
abstraction in mind)
Giuseppe Bianchi
Download