Proposal for OpenFlow 1.1 Extension Protocol

James Kempf

Introduction

This document sketches a proposal for a comprehensive extension protocol for OpenFlow 1.1. Because the OpenFlow 1.1 protocol messages have not yet been defined, the proposal is written as a set of options for extending the existing OpenFlow 1.0 protocol, and it will need to be modified once the exact format of the OpenFlow 1.1 messages is determined. The proposal was informed by the design of the X Window System extension mechanism [1], though the design is not exactly the same. Though over 20 years old, and therefore well along into hoary old age as information technology goes, the X Window System protocol is a successful example of how a well-designed extension mechanism can accommodate a wide variety of innovative hardware without fostering vendor lock-in.

Note that in some cases the protocol messages will be variably sized, depending on the contents of the extension. Expressing this in C, the preferred OpenFlow specification language, is somewhat difficult, so the cases where variably sized fields are needed are noted in the comments. This would be easier to specify in a more flexible protocol specification language that is not tied to a particular programming language.

Identifying What Extensions the Switch Supports

Extensions are identified by a 16-bit code. The extension space is partitioned as follows:

– Reserved: 0x0000
– Experimental extension: 0x0001:0x1000
– Standardized extension number allocated by the OpenFlow Consortium, IANA, or some other body: 0x1001:0xffff

Extensions start off numbered in the informal, non-allocated experimental space, which is loosely coordinated among developers. They move to the allocated extension space if they are either fairly specific to particular hardware (for example, the Stanford SONET/SDH extension) or not incorporated into a standardized version of OpenFlow but still of interest to a broader user community.
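
The partitioning of the 16-bit code space can be sketched as a small C helper. This is only an illustration, assuming the partition reads reserved 0x0000, experimental 0x0001:0x1000, and standardized 0x1001:0xffff, which is consistent with the example codes used elsewhere in this proposal. The names ofp_code_class and ofp_classify_code are hypothetical and are not part of OpenFlow or of this proposal.

```c
#include <stdint.h>

/* Hypothetical classification of a 16-bit extension code.
 * The ranges follow the partition assumed in the text above;
 * the identifiers are illustrative only. */
enum ofp_code_class {
    OFP_CODE_RESERVED,      /* 0x0000 */
    OFP_CODE_EXPERIMENTAL,  /* 0x0001:0x1000, informally coordinated */
    OFP_CODE_STANDARDIZED   /* 0x1001:0xffff, allocated by a registry */
};

/* Map a code to the range of the code space it falls in. */
static enum ofp_code_class ofp_classify_code(uint16_t code)
{
    if (code == 0x0000)
        return OFP_CODE_RESERVED;
    if (code <= 0x1000)
        return OFP_CODE_EXPERIMENTAL;
    return OFP_CODE_STANDARDIZED;
}
```

Because the same three ranges are proposed for flow types, action types, and message types, a controller could in principle reuse one such helper for all four code spaces.
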
An extension that is standardized and given an assigned number can then deprecate its experimental number. This provides a means whereby needed and tested new features can be incorporated into the protocol.

Extensions are described by an extension header:

    struct ofp_extension {
        uint16_t id;       /* Extension identifier. */
        uint32_t vendor;   /* Implementor ID:
                            * - MSB 0: low-order bytes are IEEE OUI.
                            * - MSB != 0: defined by OpenFlow
                            *   consortium. */
        uint16_t version;  /* Extension version number, format defined
                            * by the extension. */
        uint16_t length;   /* Length of the extension description,
                            * including this header. */
        /* Extension-specific data in here. */
    };

A switch returns a list of extension headers describing the extensions it supports. There are two ways the controller could query the switch for the supported extensions: as part of the OFPT_FEATURES_REPLY message, or in response to a new message, OFP_EXTENSION_REQUEST.

At the beginning of every OpenFlow session, the controller issues an OFPT_FEATURES_REQUEST message to determine the features supported by the switch. In addition to the base switch features, the reply could also return a list of extensions supported by the switch. This would require that the ofp_phy_port ports array be explicitly sized, which is why an n_ports field is added. The following illustrates the modified OFPT_FEATURES_REPLY C struct:

    struct ofp_switch_features {
        struct ofp_header header;
        uint64_t datapath_id;   /* Datapath unique ID. The lower 48 bits are
                                   for a MAC address, while the upper 16 bits
                                   are implementer-defined. */
        uint32_t n_buffers;     /* Max packets buffered at once. */
        uint8_t n_tables;       /* Number of tables supported by datapath. */
        uint8_t pad[3];         /* Align to 64 bits. */

        /* Features. */
        uint32_t capabilities;  /* Bitmap of supported "ofp_capabilities". */
        uint32_t actions;       /* Bitmap of supported "ofp_action_type"s. */

        /* Port info. */
        uint16_t n_ports;       /* Number of ports. */
        struct ofp_phy_port ports[0];  /* Port definitions. Size is determined
                                          by number of ports. */
        /* The extension descriptors follow the ports array on the wire. */
        struct ofp_extension extensions[0];  /* Extension definitions. Total
                                                size of extension descriptor
                                                array determined by length in
                                                header. */
    };

Alternatively, a new message pair, OFP_EXTENSION_REQUEST/OFP_EXTENSION_REPLY, could be added to allow the controller to query for extensions. The request message consists simply of an ofp_header, while the reply message returns the following:

    struct ofp_extension_reply {
        struct ofp_header header;
        struct ofp_extension extensions[0];  /* Extension definitions. Total size
                                              * of extension descriptor array
                                              * determined by length in header. */
    };

The controller initially queries the switch for the extensions it supports, and then dynamically links the code for an extension if that code is not built into the controller. The extension library contains all the data type definitions and code needed to communicate the extended capabilities with the switch.

Should Flow Definitions be Extensible?

OpenFlow 1.0 has a fixed, non-extensible flow definition, ofp_match. In the process of doing the Ericsson MPLS extension, we had to extend the flow definition to include the MPLS labels. But matches on MPLS flows must also match other header fields besides the MPLS labels. Defining a new flow descriptor was not possible unless all the non-MPLS field definitions were duplicated. As a result, ofp_match was redefined to include the MPLS fields.

Under this proposal, flow types are self-describing through a type code. Again, the type code space is partitioned to allow experimental extensions and standardized extensions:

– Reserved: 0x0000
– Experimental extension flow type: 0x0001:0x1000
– Standardized flow type number allocated by the OpenFlow Consortium, IANA, or some other body: 0x1001:0xffff

One possibility is to define the flow descriptor more modularly, with Ethernet, IPv4, and transport header rules in separate C struct definitions.
For example:

    /* Ethernet header fields */
    struct ofp_enet_rule {
        uint8_t dl_src[OFP_ETH_ALEN];  /* Ethernet source address. */
        uint8_t dl_dst[OFP_ETH_ALEN];  /* Ethernet destination address. */
        uint16_t dl_vlan;              /* Input VLAN id. */
        uint8_t dl_vlan_pcp;           /* Input VLAN priority. */
        uint8_t pad1[1];               /* Align to 64 bits. */
        uint16_t dl_type;              /* Ethernet frame type. */
    };

    /* IPv4 header fields */
    struct ofp_ipv4_rule {
        uint8_t nw_tos;    /* IP ToS (actually DSCP field, 6 bits). */
        uint8_t nw_proto;  /* IP protocol or lower 8 bits of
                            * ARP opcode. */
        uint8_t pad2[2];   /* Align to 64 bits. */
        uint32_t nw_src;   /* IP source address. */
        uint32_t nw_dst;   /* IP destination address. */
    };

    /* Transport layer header fields */
    struct ofp_L4_rule {
        uint16_t tp_src;  /* TCP/UDP source port. */
        uint16_t tp_dst;  /* TCP/UDP destination port. */
    };

    /* Canonical OF flow descriptor */
    struct ofp_match {
        uint16_t type;     /* Flow type identifier. */
        uint16_t in_port;  /* Input switch port. */
        uint16_t length;   /* Length of flow match, including this struct.
                              Allows variable sized extensions in standard
                              messages. */
    };

    /* IPv4 flow type. This is the only standardized flow type for now. */
    enum ofp_flow_type {
        OFP_FLOW_TYPE_RESERVED = 0x0000,
        OFP_FLOW_TYPE_IPV4 = 0x1001
    };

    /* IPv4 flow descriptor, matches OF 1.0 flows */
    struct ofp_ipv4_match {
        struct ofp_match match;
        uint32_t wildcards;         /* Wildcard fields for L2, IPv4, and L4. */
        struct ofp_enet_rule enet;
        struct ofp_ipv4_rule ipv4;
        struct ofp_L4_rule l4;
    };

This would simplify defining a new flow type with a new descriptor:

    const uint8_t OFP_MAX_MPLS_LABEL_STACK = 2;
    const uint16_t OFP_FLOW_TYPE_IPV4_MPLS = 0x0001;  /* Experimental for now */

    /* MPLS header. Match up to 2 labels */
    struct ofp_mpls_rule {
        uint8_t mpls_wildcards;   /* Wildcards for MPLS. */
        uint8_t n_labels;         /* Number of labels to match in stack. */
        uint32_t label_stack[0];  /* Label stack, size determined by n_labels. */
    };

    /* MPLS flow descriptor */
    struct ofp_mpls_match {
        struct ofp_match match;
        uint32_t wildcards;         /* Wildcard fields for L2, IPv4, and L4. */
        struct ofp_enet_rule enet;
        struct ofp_mpls_rule mpls;  /* The MPLS header to match. Variably
                                       sized on the wire. */
        struct ofp_ipv4_rule ipv4;
        struct ofp_L4_rule l4;
    };

The ofp_match struct has a length field so that the ofp_flow_mod message and other messages taking flow specs can be extended without having to duplicate it.

Note that this extension mechanism will work if the new flow descriptors correspond to well-understood, standardized headers, with fixed-size fields at specified locations in the header, whose definitions are documented and available to both the switch writer and the extension writer. It will not work for on-the-fly negotiation of variable header fields between the switch and controller, because the controller will not have the code available to specify the headers.

What about Ports and Queues?

While the parameters describing ports and queues may change, it seems unlikely that completely different types of ports and queues will be required. For example, the size of an output queue will change from one type of switch to another, but the basic definition of the queue should remain the same. Consequently, these object types do not need to be extensible.

Extending Flow Actions

OpenFlow 1.0 already has a fairly complete action extension mechanism, but there is no formalized way to partition the action type space into standardized and experimental action types.
The action extension types need to be formalized in a manner similar to extensions and flows:

– Reserved: 0x0000
– Experimental extension action type: 0x0001:0x1000
– Standardized action number allocated by the OpenFlow Consortium, IANA, or some other body: 0x1001:0xffff

Standardized actions are:

    enum ofp_action_type {
        OFPAT_OUTPUT       = 0x1001,  /* Output to switch port. */
        OFPAT_SET_VLAN_VID = 0x1002,  /* Set the 802.1q VLAN id. */
        OFPAT_SET_VLAN_PCP = 0x1003,  /* Set the 802.1q priority. */
        OFPAT_STRIP_VLAN   = 0x1004,  /* Strip the 802.1q header. */
        OFPAT_SET_DL_SRC   = 0x1005,  /* Ethernet source address. */
        OFPAT_SET_DL_DST   = 0x1006,  /* Ethernet destination address. */
        OFPAT_SET_NW_SRC   = 0x1007,  /* IP source address. */
        OFPAT_SET_NW_DST   = 0x1008,  /* IP destination address. */
        OFPAT_SET_NW_TOS   = 0x1009,  /* IP ToS (DSCP field, 6 bits). */
        OFPAT_SET_TP_SRC   = 0x100a,  /* TCP/UDP source port. */
        OFPAT_SET_TP_DST   = 0x100b,  /* TCP/UDP destination port. */
        OFPAT_ENQUEUE      = 0x100c   /* Output to queue. */
    };

Action message objects are modularized:

    struct ofp_action_header {
        uint16_t type;  /* Action type code. */
        uint16_t len;   /* Length of action, including this header. */
    };

    struct ofp_action_output {
        struct ofp_action_header header;
        uint16_t port;     /* Output port. */
        uint16_t max_len;  /* Max length to send to controller. */
    };

New actions are defined with codes from the experimental or standardized ranges, for example:

    const uint16_t OFP_MPLS_POP_ACTION = 0x0001;  /* Experimental for now. */

    /* Pop top label off of MPLS stack */
    struct ofp_mpls_action_pop {
        struct ofp_action_header header;
        /* Other pop action-specific data. */
    };

This procedure is not much different from the OpenFlow 1.0 action extension procedure.

Extension Messages

Messages are extended in the same way.
The message type code space is extended from 8 to 16 bits and divided up as with extensions, flows, and actions:

– Reserved: 0x0000
– Experimental extension message type: 0x0001:0x1000
– Standardized message number allocated by the OpenFlow Consortium, IANA, or some other body: 0x1001:0xffff

Standardized messages have codes within the allocated range:

    enum ofp_type {
        /* Immutable messages. */
        OFPT_HELLO        = 0x1001,  /* Symmetric message */
        OFPT_ERROR        = 0x1002,  /* Symmetric message */
        OFPT_ECHO_REQUEST = 0x1003,  /* Symmetric message */
        OFPT_ECHO_REPLY   = 0x1004,  /* Symmetric message */
        OFPT_VENDOR       = 0x1005,  /* Symmetric message */

        /* Switch configuration messages. */
        OFPT_FEATURES_REQUEST   = 0x1006,  /* Controller/switch message */
        OFPT_FEATURES_REPLY     = 0x1007,  /* Controller/switch message */
        OFPT_GET_CONFIG_REQUEST = 0x1008,  /* Controller/switch message */
        OFPT_GET_CONFIG_REPLY   = 0x1009,  /* Controller/switch message */
        OFPT_SET_CONFIG         = 0x100a,  /* Controller/switch message */

        /* Asynchronous messages. */
        OFPT_PACKET_IN    = 0x100b,  /* Async message */
        OFPT_FLOW_REMOVED = 0x100c,  /* Async message */
        OFPT_PORT_STATUS  = 0x100d,  /* Async message */

        /* Controller command messages. */
        OFPT_PACKET_OUT = 0x100e,  /* Controller/switch message */
        OFPT_FLOW_MOD   = 0x100f,  /* Controller/switch message */
        OFPT_PORT_MOD   = 0x1010,  /* Controller/switch message */

        /* Statistics messages. */
        OFPT_STATS_REQUEST = 0x1011,  /* Controller/switch message */
        OFPT_STATS_REPLY   = 0x1012,  /* Controller/switch message */

        /* Barrier messages. */
        OFPT_BARRIER_REQUEST = 0x1013,  /* Controller/switch message */
        OFPT_BARRIER_REPLY   = 0x1014,  /* Controller/switch message */

        /* Queue configuration messages. */
        OFPT_QUEUE_GET_CONFIG_REQUEST = 0x1015,  /* Controller/switch message */
        OFPT_QUEUE_GET_CONFIG_REPLY   = 0x1016   /* Controller/switch message */
    };

    /* Header on all OpenFlow packets. */
    struct ofp_header {
        uint16_t version;  /* OFP_VERSION. */
        uint16_t type;     /* Type code for messages. */
        uint16_t length;   /* Length including this ofp_header. */
        uint32_t xid;      /* Transaction id associated with this packet.
                              Replies use the same id as was in the request
                              to facilitate pairing. */
    };

Standard messages are defined as previously, for example:

    /* Reply to an OFPST_FLOW request. */
    struct ofp_flow_stats {
        struct ofp_header header;
        uint8_t table_id;        /* ID of table flow came from. */
        uint8_t pad;
        struct ofp_match match;  /* Description of fields. */
        uint32_t duration_sec;   /* Time flow has been alive in seconds. */
        uint32_t duration_nsec;  /* Time flow has been alive in nanoseconds
                                    beyond duration_sec. */
        uint16_t priority;       /* Priority of the entry. Only meaningful
                                    when this is not an exact-match entry. */
        uint16_t idle_timeout;   /* Number of seconds idle before expiration. */
        uint16_t hard_timeout;   /* Number of seconds before expiration. */
        uint8_t pad2[6];         /* Align to 64 bits. */
        uint64_t cookie;         /* Opaque controller-issued identifier. */
        uint64_t packet_count;   /* Number of packets in flow. */
        uint64_t byte_count;     /* Number of bytes in flow. */
        struct ofp_action_header actions[0];  /* Actions. */
    };

Extension messages are defined with a new type code and message object, for example:

    const uint16_t OFPT_MPLS_FAILOVER_STATS = 0x0001;  /* Experimental for now */

    struct ofp_mpls_failover_stats {
        struct ofp_header header;
        struct ofp_mpls_match match;  /* Fields to match, MPLS extension. */
        uint64_t cookie;              /* Opaque controller-issued identifier. */
        uint32_t failovers;           /* Number of times LSP has failed over
                                         and restored. */
    };

References

[1] "X Window System protocols and architecture," Wikipedia, http://en.wikipedia.org/wiki/X_Window_System_protocols_and_architecture