March 2012 Interoperability Event Technical Paper
Version 1.0
April 19, 2012

CONTACT: ONF Testing-Interoperability Working Group
Michael Haugh, Chair ([email protected])
Rob Sherwood, Vice-Chair ([email protected])
Ron Milford ([email protected])
Open Networking Foundation Technical Paper

Copyright © 2012 Open Networking Foundation

Disclaimer

Contact the Open Networking Foundation at www.opennetworking.org for information on specification licensing through membership agreements.

Any marks and brands contained herein are the property of their respective owners.

Contents

1 Introduction
2 Definitions
3 Issues
  3.1 OFTest
    3.1.1 OF message split across IP packets
    3.1.2 OFTest in a VM
    3.1.3 Multipart OFPT_STATS_REPLY messages
  3.2 Default PACKET_IN behavior
    3.2.1 Discarding non-matched packets
    3.2.2 Hybrid Switch L2 forwarding non-matched packets
  3.3 Packet_Out
    3.3.1 Source Port
    3.3.2 Switch Bug
  3.4 Feature Request/Reply
    3.4.1 Destination Port 0
    3.4.2 Current Features Flag set to all Zeros
  3.5 FLOW_MOD
    3.5.1 Send frame out ingress port
  3.6 Control Channel
    3.6.1 Mixed control and data network traffic
    3.6.2 Management and Control traffic on same network
    3.6.3 Vendor specific messages
    3.6.4 Barrier messages
  3.7 LLDP
  3.8 Controller Connection Failure Modes
    3.8.1 L2 forwarding
    3.8.2 Flows stay active until they timeout
  3.9 Controller Specific
    3.9.1 Topology not updated when control channel (TCP session) lost to switch
    3.9.2 Topology not updated when port status changes
    3.9.3 Controller discovers links in only one direction
  3.10 FlowVisor
    3.10.1 Controller crashing FlowVisor
4 Additional Comments
5 Conclusions
Appendix A: References
Appendix B: Revision History

1 Introduction

The ONF held an OpenFlow interoperability event March 5 – 9 at the Ixia iSimCity Lab in Santa Clara, CA. Please refer to the event whitepaper for descriptions of the event, tests and participants. This document's purpose is to present many of the issues encountered during the event. This is not a complete or comprehensive list, as not all issues were reported. Wherever possible, we have attempted to present a detailed description of each issue, the resolutions or temporary workarounds used during the event to overcome it, and recommendations for further action to resolve it. Due to the temporary nature of the test bed network and time constraints, we were not able to fully debug and isolate the cause of every issue, but those issues are presented to increase awareness and encourage further investigation at future events. Vendor names were generally left out of the issue descriptions to help protect the confidentiality of the participants.

2 Definitions

DUT – Device Under Test

OpenFlow Switch – non-hybrid OpenFlow device

Hybrid Switch – Device that can operate in OpenFlow Mode or as a standard L2 switch

3 Issues

3.1 OFTest

3.1.1 OF message split across IP packets

Description: Issues occurred with OFTest when one switch sent OF messages split across multiple IP packets to the controller. The first IP packet contained a complete OF message and a portion of a second OF message. The second IP packet contained the remainder of the second OF message. OFTest did not correctly retain partial OF messages across read() calls. Historically, this hasn't been a problem because most switches call write() once per message, so the TCP PSH flag meant that, in practice, a single read() returned only complete OF messages.

Resolution: This was identified as a bug in OFTest. A fix was implemented during the event (if read() returns a partial message, buffer it until subsequent read() calls return the complete message), and the switch in question was retested.
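The buffering approach described in the resolution can be sketched as follows. This is a minimal illustration rather than the actual OFTest patch; the class name MessageBuffer and its feed() method are hypothetical. It relies only on the fact that every OpenFlow message begins with an 8-byte header whose 16-bit length field (at byte offset 2) gives the total message size.

```python
import struct

OFP_HEADER_LEN = 8  # version(1) + type(1) + length(2) + xid(4)


class MessageBuffer:
    """Reassembles OpenFlow messages from arbitrary TCP read() chunks.

    Hypothetical helper for illustration; not taken from OFTest.
    """

    def __init__(self):
        self._pending = b""

    def feed(self, chunk):
        """Add the bytes from one read() and return all complete messages."""
        self._pending += chunk
        messages = []
        while len(self._pending) >= OFP_HEADER_LEN:
            # The 16-bit length field covers the header plus the body.
            (length,) = struct.unpack_from("!H", self._pending, 2)
            if length < OFP_HEADER_LEN:
                raise ValueError("invalid OpenFlow length field")
            if len(self._pending) < length:
                break  # partial message: keep it for the next read()
            messages.append(self._pending[:length])
            self._pending = self._pending[length:]
        return messages
```

A caller would pass each chunk returned by the socket to feed() and dispatch whatever complete messages come back, regardless of how the switch split them across TCP segments.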

3.1.2 OFTest in a VM

Description: Issues appeared when running OFTest within a VM for tests not related to the control plane. Any virtual machine hypervisor has an implicit virtual switch between the VMs and the physical NICs, so OFTest's packets passed through that virtual switch before reaching the physical NIC. The virtual switch has logic that blocks some of OFTest's packets, which prevents certain tests (e.g., the packet-out test) from succeeding. Further, most vswitches do not allow promiscuous mode listening by default and do MAC learning (as a security feature) to ensure that a VM does not change or spoof its MAC address.

Resolution: Reinstalled Linux and OFTest on a bare metal server.

Recommendations: While each of these points has a potential workaround (directly mapping the physical NICs to the OFTest VM, allowing promiscuous mode, allowing multiple MACs on those ports), running OFTest in a VM ultimately adds an extra source of uncertainty and complicates debugging. We consider it a best practice not to run OFTest in a VM.

3.1.3 Multipart OFPT_STATS_REPLY messages

Description: The OpenFlow 1.0 specification (section 5.3.5 Read State Messages) reads:

The switch responds to a OFPT_STATS_REQUEST message with one or more OFPT_STATS_REPLY messages:

    struct ofp_stats_reply {
        struct ofp_header header;
        uint16_t type;    /* One of the OFPST_* constants. */
        uint16_t flags;   /* OFPSF_REPLY_* flags. */
        uint8_t body[0];  /* Body of the reply. */
    };
    OFP_ASSERT(sizeof(struct ofp_stats_reply) == 12);

“The only value defined for flags in a reply is whether more replies will follow this one - this has the value 0x0001. To ease implementation, the switch is allowed to send replies with no additional entries. However, it must always send another reply following a message with the more flag set.”

The stats_reply OpenFlow message encapsulates zero, one, or many stat entries (e.g., statistics for a specific port), and a single stats_request returns one or more stats_reply messages. If a single stats_request returns multiple stats_reply messages, every reply except the last should have the OFPSF_REPLY_MORE flag set. All stats_reply messages should have the transaction ID field set to match the stats_request.

A typical implementation of a stats reply encapsulates as many stat entries as will fit and only uses multiple stats_reply messages if the statistics exceed the OpenFlow 64 KB message limit. In a second type of implementation, each stats_reply encapsulates exactly one stat entry, so given N stat entries, the switch replies with N stats_reply messages, where messages 1 through N-1 have the OFPSF_REPLY_MORE bit set and the Nth message does not.

At the interop event, one vendor interpreted the specification in a third way, which broke OFTest. Given N stat entries, the switch replied with N+1 stats_reply messages. Messages 1 through N each included a single stat entry and had OFPSF_REPLY_MORE set, but message N+1 had no stat entry and did not set OFPSF_REPLY_MORE.

Resolution: None

Recommendations: After reading the exact wording of the specification, it seems like this is a valid

literal interpretation of the spec, but does arguably violate the spirit of the design. Rather than push the

third interpretation, the switch authors are investigating moving to the second interpretation.

A bug report should be submitted for OFTest to allow it to support this interpretation.

Subsequent versions of the specification should clarify this.
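A receiver that tolerates all three interpretations can be sketched as below. This is an illustration under stated assumptions, not OFTest code: the function names collect_stats and make_reply are hypothetical, and the message layout is the 12-byte ofp_stats_reply header quoted above. The key point is that an empty body is accepted and the loop ends only on a reply without OFPSF_REPLY_MORE.

```python
import struct

OFPT_STATS_REPLY = 17
OFPSF_REPLY_MORE = 0x0001
# ofp_header (version, type, length, xid) followed by stats type and flags.
STATS_REPLY_HDR = struct.Struct("!BBHIHH")  # 12 bytes


def make_reply(xid, flags, entry=b"", stats_type=4):
    """Build a stats_reply for demonstration (stats_type 4 is OFPST_PORT)."""
    length = STATS_REPLY_HDR.size + len(entry)
    return STATS_REPLY_HDR.pack(1, OFPT_STATS_REPLY, length, xid, stats_type, flags) + entry


def collect_stats(replies, xid):
    """Concatenate reply bodies for one request until the 'more' flag clears.

    Returns (body_bytes, complete). complete is False if the final reply,
    the one without OFPSF_REPLY_MORE, has not been seen yet.
    """
    body = b""
    for msg in replies:
        _ver, msg_type, length, msg_xid, _stype, flags = STATS_REPLY_HDR.unpack_from(msg)
        if msg_type != OFPT_STATS_REPLY or msg_xid != xid:
            continue  # unrelated message; a real client would dispatch it elsewhere
        body += msg[STATS_REPLY_HDR.size:length]  # an empty body is legal
        if not flags & OFPSF_REPLY_MORE:
            return body, True
    return body, False
```

With this shape, the N+1 pattern (N single-entry replies followed by an empty terminator) yields the same result as a single reply carrying all N entries.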

3.2 Default PACKET_IN behavior

3.2.1 Discarding non-matched packets

Description: The default behavior for packets in the OpenFlow pipeline that do not match an existing

flow rule (table miss) is specified in section 4.1.2 of the OpenFlow 1.0.0 specification.

Packet-in: For all packets that do not have a matching flow entry, a packet-in event is sent to the

controller …

Some OpenFlow switches discarded packets that did not match a flow table rule. One example of this

was the unsolicited discovery packets used by many controllers. They did not match an existing flow rule

so they were discarded. This resulted in the failure of link discovery or the discovery of a unidirectional

link. Another example was ARP packets not being forwarded to the controller. This resulted in failure to

discover hosts and install the correct flows for traffic destined to those hosts.

Resolution: In some cases the switch vendor put a lowest priority default match all rule to send packets

that would otherwise result in a table miss to the controller. In other cases, the controller, using a

flow_mod message, injected either a default lowest priority match all rule or a specific match rule for

the exact packet type it was looking for. This was a workaround.

Recommendations: It is recommended that switches supporting the 1.0.0 specification support the

default packet_in behavior as indicated in the specification.
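The controller-side workaround mentioned in the resolution, a lowest-priority match-all rule that sends packets to the controller, might be encoded as in the sketch below. The function name table_miss_flow_mod and the max_len default of 128 bytes are assumptions for illustration; the field layout follows the OpenFlow 1.0 ofp_flow_mod, ofp_match and ofp_action_output structures.

```python
import struct

OFP_VERSION = 0x01
OFPT_FLOW_MOD = 14
OFPFC_ADD = 0
OFPFW_ALL = (1 << 22) - 1   # wildcard every match field
OFPP_CONTROLLER = 0xFFFD
OFPP_NONE = 0xFFFF
OFPAT_OUTPUT = 0
NO_BUFFER = 0xFFFFFFFF


def table_miss_flow_mod(xid, max_len=128):
    """Build a priority-0, match-all flow_mod that outputs to the controller."""
    # ofp_match (40 bytes): wildcards, then 36 bytes of fields that are ignored.
    match = struct.pack("!I36x", OFPFW_ALL)
    # ofp_action_output (8 bytes): type, len, port, max_len.
    action = struct.pack("!HHHH", OFPAT_OUTPUT, 8, OFPP_CONTROLLER, max_len)
    # cookie, command, idle_timeout, hard_timeout, priority, buffer_id, out_port, flags
    body = struct.pack("!QHHHHIHH", 0, OFPFC_ADD, 0, 0, 0, NO_BUFFER, OFPP_NONE, 0)
    length = 8 + len(match) + len(body) + len(action)
    header = struct.pack("!BBHI", OFP_VERSION, OFPT_FLOW_MOD, length, xid)
    return header + match + body + action
```

Both timeouts are zero so the rule does not expire, and priority 0 keeps it below any more specific wildcard rule the controller installs later.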

3.2.2 Hybrid Switch L2 forwarding non-matched packets

Description: Some hybrid switches defaulted to normal Layer 2 pipeline processing for frames that did not appear to the switch to belong in the OpenFlow pipeline. An example is an LLDP frame modified by the NOX controller to use a multicast destination MAC address. If this packet arrives on a hybrid interface and does not match an OpenFlow rule, the switch sends the frame to the normal Layer 2 processing pipeline and broadcasts it out all ports. Since STP was not running in most of the test bed, this was a potential cause of loops within the network or, at the least, of incorrect topology discovery.

Resolution: In some cases the switch vendor put a lowest priority default match all rule to send packets

that would otherwise result in a table miss to the controller. In other cases, the controller, using a

flow_mod message, injected either a default lowest priority match all rule or a specific match rule for

the exact packet type it was looking for.

Recommendations: It is recommended that switches supporting the 1.0.0 specification support the

default packet_in behavior as indicated in the specification.

3.3 Packet_Out

3.3.1 Source Port

Description: Controllers varied on which in_port value they set in packet_out messages. They used one

of the following values:

    OFPP_CONTROLLER = 0xfffd, /* Send to controller. */
    OFPP_NONE = 0xffff /* Not associated with a physical port. */

The 1.0.0 specification (section 5.3.6 Send Packet Message) specifies using OFPP_NONE for the port

value when sending packet_out messages from the controller. Some switches did not recognize the

value OFPP_CONTROLLER and rejected the packet_out messages.

Resolution: None

Recommendation: OpenFlow 1.0.0 Controllers should use OFPP_NONE as the in_port of packet_out

messages.
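As an illustration of the recommendation, the sketch below builds an OpenFlow 1.0 packet_out whose in_port defaults to OFPP_NONE. The function name packet_out and its parameters are hypothetical; the layout follows the ofp_packet_out structure (buffer_id, in_port, actions_len, then actions and data).

```python
import struct

OFPT_PACKET_OUT = 13
OFPP_NONE = 0xFFFF
OFPAT_OUTPUT = 0
NO_BUFFER = 0xFFFFFFFF


def packet_out(xid, out_port, frame, in_port=OFPP_NONE):
    """Build a packet_out that sends the raw Ethernet `frame` out `out_port`."""
    # ofp_action_output: type, len, port, max_len (max_len unused here).
    action = struct.pack("!HHHH", OFPAT_OUTPUT, 8, out_port, 0)
    # buffer_id, in_port, actions_len
    body = struct.pack("!IHH", NO_BUFFER, in_port, len(action))
    length = 8 + len(body) + len(action) + len(frame)
    header = struct.pack("!BBHI", 0x01, OFPT_PACKET_OUT, length, xid)
    return header + body + action + frame
```

Because the frame originates at the controller rather than on a switch port, leaving in_port at its default avoids the rejection seen with switches that did not accept OFPP_CONTROLLER in this field.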

3.3.2 Switch Bug

Description: One switch was not correctly processing packet_out messages.

Resolution: This turned out to be a bug introduced in a code upgrade after testing with OFTest. The bug

was reported and fixed the same day.

3.4 Feature Request/Reply

3.4.1 Destination Port 0

Description: Some switches used a port number of 0 in features_reply messages. The OpenFlow 1.0.0 specification (section 5.2.1 Port Structures) reads:

The port numbers use the following conventions:

    /* Port numbering. Physical ports are numbered starting from 1. */
    enum ofp_port {
        /* Maximum number of physical switch ports. */
        OFPP_MAX = 0xff00,

The spec also indicates a starting port of 1 in Table 3 on page 4.

However, the 1.0.0 specification is somewhat ambiguous. On pages 28, 32 and 33 it states:

"… out_port must be set to OFPP_NONE, since 0 is a valid port id."

Resolution: None

Recommendations: Switches should begin indexing ports at 1.

The 1.0.0 specification should be reviewed to determine if the language on pages 28, 32 & 33 should be

corrected or clarified.

3.4.2 Current Features Flag set to all Zeros

Description: The OpenFlow 1.0.0 specification describes the ofp_port_features bitmap as follows:

    /* Features of physical ports available in a datapath. */
    enum ofp_port_features {
        OFPPF_10MB_HD    = 1 << 0,  /* 10 Mb half-duplex rate support. */
        OFPPF_10MB_FD    = 1 << 1,  /* 10 Mb full-duplex rate support. */
        OFPPF_100MB_HD   = 1 << 2,  /* 100 Mb half-duplex rate support. */
        OFPPF_100MB_FD   = 1 << 3,  /* 100 Mb full-duplex rate support. */
        OFPPF_1GB_HD     = 1 << 4,  /* 1 Gb half-duplex rate support. */
        OFPPF_1GB_FD     = 1 << 5,  /* 1 Gb full-duplex rate support. */
        OFPPF_10GB_FD    = 1 << 6,  /* 10 Gb full-duplex rate support. */
        OFPPF_COPPER     = 1 << 7,  /* Copper medium. */
        OFPPF_FIBER      = 1 << 8,  /* Fiber medium. */
        OFPPF_AUTONEG    = 1 << 9,  /* Auto-negotiation. */
        OFPPF_PAUSE      = 1 << 10, /* Pause. */
        OFPPF_PAUSE_ASYM = 1 << 11  /* Asymmetric pause. */
    };

In the features_reply message from at least one switch, the ofp_port_features bits were set to all zeros.

The controller did not understand this response and set the link and duplex of the ports for that switch

to “unknown”.

Resolution: None

Recommendations: The switch should respond with the correct set of available speed and duplex option

bits set.
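How a controller might decode this bitmap, and why an all-zero value leaves it with nothing to report, can be sketched as follows. The function names and the SPEED_MBPS table are hypothetical; the bit positions are those of the ofp_port_features enum above.

```python
# Index i corresponds to bit (1 << i) in ofp_port_features.
OFPPF_NAMES = [
    "10MB_HD", "10MB_FD", "100MB_HD", "100MB_FD", "1GB_HD", "1GB_FD",
    "10GB_FD", "COPPER", "FIBER", "AUTONEG", "PAUSE", "PAUSE_ASYM",
]

SPEED_MBPS = {"10MB": 10, "100MB": 100, "1GB": 1000, "10GB": 10000}


def decode_port_features(bitmap):
    """Return the names of the OFPPF_* bits set in `bitmap`."""
    return [name for i, name in enumerate(OFPPF_NAMES) if bitmap & (1 << i)]


def speed_and_duplex(curr):
    """Return (Mb/s, 'full' or 'half') for the fastest mode, or None if no rate bit is set."""
    best = None
    for name in decode_port_features(curr):
        rate, _, duplex = name.partition("_")
        if rate in SPEED_MBPS:
            mode = (SPEED_MBPS[rate], duplex == "FD")
            if best is None or mode > best:
                best = mode
    if best is None:
        return None  # e.g. an all-zero bitmap: speed and duplex are unknown
    return best[0], "full" if best[1] else "half"
```

A bitmap of 0 decodes to an empty list, which is the "unknown" case the controller reported at the event.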

3.5 FLOW_MOD

3.5.1 Send frame out ingress port

Description: Some switches do not support flow_mod messages containing an OFPAT_OUTPUT action whose port is set to IN_PORT.

The OpenFlow 1.0.0 specification (Section 3.3 Actions) specifies required actions.

Required Action: Forward. OpenFlow switches must support forwarding the packet to physical ports and the following virtual ones:

- ALL: Send the packet out all interfaces, not including the incoming interface.
- CONTROLLER: Encapsulate and send the packet to the controller.
- LOCAL: Send the packet to the switch’s local networking stack.
- TABLE: Perform actions in flow table. Only for packet-out messages.
- IN_PORT: Send the packet out the input port.

Resolution: None

Recommendations: Further investigation is required by vendors to determine if this is a hardware

limitation. It is recommended that this feature be supported in order to be compliant with the 1.0.0,

1.1.0 and future specifications.
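In OpenFlow 1.0 a packet is sent back out its ingress port only when the output action names the virtual port OFPP_IN_PORT explicitly. A controller-side helper that makes this substitution might look like the sketch below; the function name output_action is hypothetical.

```python
import struct

OFPAT_OUTPUT = 0
OFPP_IN_PORT = 0xFFF8


def output_action(in_port, out_port):
    """Build an ofp_action_output, using OFPP_IN_PORT when the packet
    must leave through the port it arrived on."""
    port = OFPP_IN_PORT if out_port == in_port else out_port
    # type, len, port, max_len (max_len only matters for OFPP_CONTROLLER)
    return struct.pack("!HHHH", OFPAT_OUTPUT, 8, port, 0)
```

Switches that reject this action would need the controller to avoid such flows entirely, which is why the recommendation asks vendors to check whether the limitation is in hardware.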

3.6 Control Channel

3.6.1 Mixed control and data network traffic

Description: While traffic load was generated by the Ixia and Spirent systems on the data plane, we saw packet_in messages on the control channel that contained several layers of nested packet_in messages. This appeared to be caused by control messages (packet_in) leaking from the control plane back into the data plane, where each one generated a new packet_in encapsulating the previous packet. This could have been a cause of the TCP retransmissions seen on the control channel.

Resolution: None, behavior stopped before we could debug it further.

Recommendations: The control channel should not be connected to the data plane in any way. Vendors should test that switches never leak control-channel data onto the data plane and vice versa. We recommend bringing this issue to the control committee to investigate ways to check for these types of loops.

3.6.2 Management and Control traffic on same network

Description: There were numerous instances where we saw significant TCP retransmissions on the

control channel. Code upgrades, package downloads and other normal management traffic could have

contributed to congestion and packet loss on the control channel.

Having the OpenFlow control channel and the device management connections on the same IP interface

led to increased difficulty in isolating issues quickly.

Resolution: None

Recommendations: For future interoperability events, where possible, use different IP Addresses and

physical interfaces for Control-channel traffic vs controller and switch management traffic to ease

troubleshooting. We should consider establishing separation of control and management traffic as a

best practice in production networks.

3.6.3 Vendor specific messages

Description: Some controllers sent vendor-specific messages to the switches and expected one of two responses: a vendor-supported or a vendor-unsupported message. Instead, some switches responded with an unrecognized error. If the controller did not receive one of the expected responses, it reset the TCP session. After a configured backoff timer expired, the switch would reestablish the connection and the same behavior would repeat, causing the control channel connection to bounce.

Resolution: None

Recommendations: Switches supporting the 1.0.0 specification should respond with the correct error

message(s) if they do not support the specified vendor extensions. Controllers should be able to handle

exceptions like this in a more graceful manner.
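One way a controller could handle these replies more gracefully is to treat any error for the vendor request as "extension unsupported" and keep the session up. The sketch below is illustrative only: the function name classify_vendor_reply and its return strings are hypothetical, and treating every non-error reply with the same xid as "supported" is an assumption. The error type and codes are those defined for OFPET_BAD_REQUEST in OpenFlow 1.0.

```python
import struct

OFPT_ERROR = 1
OFPET_BAD_REQUEST = 1
OFPBRC_BAD_TYPE = 1      # ofp_header.type not supported
OFPBRC_BAD_VENDOR = 3    # vendor not supported
OFPBRC_BAD_SUBTYPE = 4   # vendor subtype not supported
EXPECTED_CODES = (OFPBRC_BAD_TYPE, OFPBRC_BAD_VENDOR, OFPBRC_BAD_SUBTYPE)


def classify_vendor_reply(reply, request_xid):
    """Interpret a switch's answer to a vendor message without resetting TCP.

    Returns None for messages belonging to another transaction.
    """
    _ver, msg_type, _length, xid = struct.unpack_from("!BBHI", reply)
    if xid != request_xid:
        return None
    if msg_type != OFPT_ERROR:
        return "supported"  # assumption: any non-error reply means success
    err_type, err_code = struct.unpack_from("!HH", reply, 8)
    if err_type == OFPET_BAD_REQUEST and err_code in EXPECTED_CODES:
        return "unsupported"
    # Unrecognized error: still disable the extension rather than drop the session.
    return "unsupported (nonstandard error)"
```

Either "unsupported" result lets the controller disable the extension for that switch and continue, which avoids the connection bounce described above.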

3.6.4 Barrier messages

Description: The OpenFlow 1.0.0 specification (Section 5.3.7 Barrier Messages) reads:

When the controller wants to ensure message dependencies have been met or wants to receive notifications for completed operations, it may use an OFPT_BARRIER_REQUEST message. This message has no body. Upon receipt, the switch must finish processing all previously-received messages before executing any messages beyond the Barrier Request. When such processing is complete, the switch must send an OFPT_BARRIER_REPLY message with the xid of the original request.

Some controllers were using barrier messages as indicated by the specification to ensure they did not overwhelm the switch with too many messages. Some switches did not respond with a barrier_reply message, but with an error message of Type 1 (Request not understood), Code 1 (ofp_header.type not supported). This caused the controller to reset the TCP session. After a configured backoff timer expired, the switch would reestablish the connection and the same behavior would repeat, causing the control channel connection to bounce.

Resolution: Some controllers were able to disable this functionality for the purposes of the test case.

We were able to continue with the testing without the capability to rate limit controller messages to the

switches.

Recommendations: Barrier request messages are mandatory in 1.0.0 and 1.1.0 and should continue to

be in future specifications. In order to be compliant, switch vendors should implement barrier messages

as indicated by the specification and generate valid replies. Nevertheless controllers should be robust

enough to handle error messages, and/or probe scenarios like this in a startup phase.
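The startup probe suggested in the recommendations could take roughly the following form: send one barrier request after the handshake and decide from the reply whether barriers can be used on that connection. The function names barrier_request and barrier_supported are hypothetical.

```python
import struct

OFPT_ERROR = 1
OFPT_BARRIER_REQUEST = 18
OFPT_BARRIER_REPLY = 19


def barrier_request(xid):
    """A barrier request is only an ofp_header; the message has no body."""
    return struct.pack("!BBHI", 0x01, OFPT_BARRIER_REQUEST, 8, xid)


def barrier_supported(reply, probe_xid):
    """Evaluate a message received after sending the startup probe.

    True:  barrier_reply carrying the probe's xid.
    False: an error for that xid, so fall back to running without barriers
           instead of resetting the TCP session.
    None:  unrelated message, keep waiting.
    """
    _ver, msg_type, _length, xid = struct.unpack_from("!BBHI", reply)
    if xid != probe_xid:
        return None
    if msg_type == OFPT_BARRIER_REPLY:
        return True
    if msg_type == OFPT_ERROR:
        return False
    return None
```

A controller that records the result per switch can skip barrier-based pacing for non-compliant switches while still flagging them as out of specification.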

3.7 LLDP

Description: Normal switches and hybrid OpenFlow switches have the capability of sending their own LLDP messages. When an OpenFlow 1.0 switch receives one of these messages, it will correctly send a packet_in message to the controller if it does not have a matching rule that indicates another action. In some cases these messages are simply ignored by the controller, since they are not seen as originating from a controller packet_out message on another switch. In other cases, the controller may use these messages for topology discovery of non-OpenFlow switches.

When a hybrid switch receives one of these LLDP messages, it may behave in one of several ways. It may recognize the message as LLDP, bypass the OpenFlow pipeline and send the message to its processor to update the internal switch topology table, or it may forward it to the controller in a packet_in message.

This causes problems for OFTest when it receives one of these unsolicited LLDP messages.

Resolution: Hybrid switches turned off LLDP processing on their OpenFlow enabled interfaces.

Some controllers change the format of the LLDP packets they use as probes. Instead of using a bridge-filtered multicast MAC address (01:80:C2:00:00:0E), they use a normal multicast MAC address (01:23:00:00:00:01). This has been unofficially called OpenFlow Discovery Protocol (OFDP). Since a frame must match on both the destination MAC address and the ethertype (0x88cc) to conform to the LLDP standard, the modified message will not be recognized as LLDP by the OpenFlow or hybrid switch and will instead be forwarded to the controller in a packet_in message, or broadcast out all ports. This allows controllers to discover links between OpenFlow switches that traverse non-OpenFlow switches.

In some cases the controller changed the ethertype from the standard 0x88cc to keep switches from

intercepting the LLDP messages. This was not necessary in the case of the NOX controller as the LLDP

specification indicates both the dest mac address and the ethertype must match in order for the packet

to be recognized as LLDP by the switch. The NOX controller was already modifying the dest mac address.

For those switches that were discarding the packets or forwarding them on the normal L2 path, flow

rules were inserted to forward these messages to the controller. (See Default PACKET_IN behavior)

Recommendations: LLDP should be reevaluated as the topology discovery mechanism of choice on

OpenFlow networks.
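The recognition rule discussed in this section, that a frame is treated as LLDP only when both the destination MAC address and the ethertype match, can be sketched as follows. The function names are hypothetical, the check covers only the destination address quoted in the text, and untagged Ethernet frames are assumed.

```python
import struct

LLDP_ETHERTYPE = 0x88CC
LLDP_NEAREST_BRIDGE = bytes.fromhex("0180c200000e")  # 01:80:C2:00:00:0E
OFDP_MULTICAST = bytes.fromhex("012300000001")       # address quoted in the text


def ethernet_header(dst, src, ethertype):
    """Build a 14-byte untagged Ethernet header."""
    return dst + src + struct.pack("!H", ethertype)


def is_standard_lldp(frame):
    """True only when both the destination MAC and the ethertype match LLDP."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    return frame[:6] == LLDP_NEAREST_BRIDGE and ethertype == LLDP_ETHERTYPE
```

Under this rule, changing the destination address alone is enough to keep a probe out of a switch's LLDP handling, which matches the observation that changing the ethertype as well was unnecessary for NOX.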

3.8 Controller Connection Failure Modes

OpenFlow 1.0.0 specifies that a switch that loses its TCP connection to the controller must go into emergency mode:

If some number of attempts to contact a controller (zero or more) fail, the switch must enter “emergency mode” and immediately reset the current TCP connection. In emergency mode, the matching process is dictated by the emergency flow table entries (those marked with the emergency bit when added to the switch). All normal entries are deleted when entering emergency mode.

3.8.1 L2 forwarding

Description: When the TCP session to the controller was lost, some hybrid switches defaulted to normal Layer 2 forwarding mode. Since STP was not running in the network, this had the potential to cause loops.

Resolution: Some switches were able to change their failure mode to drop all traffic.

Recommendations: There are no known implementations of “emergency mode” which indicates it may

not be a viable option. While switches should support “emergency mode” if controller vendors do begin

implementing it, we would also recommend supporting the failure modes described in 1.1.0 and later

(fail-standalone, fail-secure). This seems to be a more common practice even in 1.0.0 implementations.

In the absence of fail-standalone and fail-secure mode support, and if the controller has not inserted

emergency rules, the switch should still delete all flow entries. Switches should never fail to L2

forwarding mode unless explicitly configured.

3.8.2 Flows stay active until they timeout

Description: When the TCP session to the controller was lost, some switches maintained existing flow rules until they expired due to hard or soft timeouts. All non-matching traffic was discarded. Since we were testing OpenFlow 1.0.0 only, this was unexpected behavior.

Resolution: Some switches were able to change their failure mode.

Recommendations: This behavior is more consistent with fail-secure mode as described in the

OpenFlow 1.1.0 specification. This is a common practice instead of implementing “emergency mode” as

indicated in 1.0.0. While 1.0.0 switches should support “emergency mode” if controller vendors do

begin implementing it, we would also recommend supporting the failure modes described in 1.1.0 and

later (fail-standalone, fail-secure).
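The three failure behaviors discussed in this section can be compared with a small model. This is a simplified sketch, not switch code: FlowEntry and flows_after_disconnect are hypothetical names, and the model only answers which flow entries are still consulted for matching once the controller connection is lost.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    name: str
    emergency: bool = False  # marked with the emergency bit when added (1.0.0)


def flows_after_disconnect(flows, mode):
    """Return the entries still consulted for matching after controller loss.

    `mode` is "emergency" (OpenFlow 1.0.0), or "fail-secure" or
    "fail-standalone" (OpenFlow 1.1.0 and later).
    """
    if mode == "emergency":
        # All normal entries are deleted; only emergency entries match.
        return [f for f in flows if f.emergency]
    if mode == "fail-secure":
        # Existing entries keep matching until their timeouts expire;
        # emergency entries are a 1.0.0 concept and are not consulted.
        return [f for f in flows if not f.emergency]
    if mode == "fail-standalone":
        # Packets are handed to the normal (non-OpenFlow) pipeline instead.
        return []
    raise ValueError(f"unknown failure mode: {mode}")
```

The model makes the difference visible: the behavior observed in 3.8.2 corresponds to the "fail-secure" branch, while the L2 fallback in 3.8.1 corresponds to "fail-standalone".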

3.9 Controller Specific

3.9.1 Topology not updated when control channel (TCP session) lost to switch

Description: When the TCP control channel connection to a switch on the active datapath was lost,

some controllers did not switch to an alternative path or update their topology DB accordingly.

Resolution: The topology DB in the controller had to be updated manually.

Recommendations: OpenFlow 1.0.0 controllers should detect a lost control channel and treat it as a device-down situation, removing the switch from the active topology and actively directing traffic to alternative paths.
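The recommended reaction to a lost control channel, dropping the switch from the topology and recomputing paths, can be sketched as below. The function names and the representation of the topology as a set of undirected switch pairs are assumptions for illustration; a real controller would also reinstall flows along the new path.

```python
from collections import deque


def remove_switch(links, switch):
    """Return the link set without any link that touches `switch`."""
    return {(a, b) for (a, b) in links if switch not in (a, b)}


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a node list or None."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    queue, parent = deque([src]), {src: None}
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, ())):  # sorted for deterministic results
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None
```

Calling remove_switch when the TCP session to a switch drops, then recomputing affected paths, replaces the manual topology update that was needed during the event.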

3.9.2 Topology not updated when port status changes

Description: When a cable is unplugged and a port goes down along the data path, the switch should generate ofp_port_status messages and send them to the controller. The controller should update its topology DB and reroute traffic to an alternate path. In testing, switching to a secondary path was not successful.

Resolution: None, due to time constraints, we were unable to fully troubleshoot this issue.

Recommendations: Repeat testing

3.9.3 Controller discovers links in only one direction

Description: At least one controller would accept a link as being discovered if it received a topology

probe packet in one direction. It installed the link in its topology database and it appeared as a fully

functional bi-directional link when it was only functional in a single direction. This caused confusion and

additional difficulty in troubleshooting issues.

Resolution: None

Recommendation: A link should only be recognized and accepted into the topology if it is discovered in both directions. Otherwise, it should be designated in some way as a unidirectional link.
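A topology database that follows this recommendation could track each probe direction separately and report a link's status accordingly. The sketch below is illustrative; the class name LinkTable and its methods are hypothetical.

```python
class LinkTable:
    """Tracks discovery probes and reports whether a link is confirmed
    in one direction or in both."""

    def __init__(self):
        self._seen = set()  # (src_switch, src_port, dst_switch, dst_port)

    def probe_received(self, src, src_port, dst, dst_port):
        """Record a probe sent from src:src_port and received on dst:dst_port."""
        self._seen.add((src, src_port, dst, dst_port))

    def status(self, a, a_port, b, b_port):
        forward = (a, a_port, b, b_port) in self._seen
        reverse = (b, b_port, a, a_port) in self._seen
        if forward and reverse:
            return "bidirectional"
        if forward or reverse:
            return "unidirectional"
        return "unknown"
```

Only links reported as "bidirectional" would be used for path computation; a "unidirectional" status gives the operator an explicit hint about where to start troubleshooting.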

3.10 FlowVisor

3.10.1 Controller crashing FlowVisor

Description: One application and controller combination caused FlowVisor to crash. The exact cause was not determined.

Resolution: None, due to time constraints we were unable to troubleshoot further.

Recommendations: Additional testing is recommended.

4 Additional Comments

There were several additional topics of conversation during the event that are worth mentioning.

When a switch loses its connection to the controller, the switch has to rely on the OpenFlow echo

request/reply messages to detect the failure. Some vendors suggested that the time taken to detect the

failure was too long.

Not all controllers had built in topology visualization tools. Those that did helped troubleshooting

significantly. This is a highly recommended feature for all controller applications participating in future

interoperability events.

There are very few existing tools available to troubleshoot OpenFlow networks. Work needs to be done

to develop troubleshooting tools and methodologies for OpenFlow networks.

5 Conclusions

It was generally agreed that the interoperability event was extremely useful for identifying issues

with individual devices, controllers and applications, but it was just a beginning. The limited amount of

time made it impossible to troubleshoot and find causes for all of the issues encountered. One notion

that was confirmed during the week is that passing the entire OFTest suite of “compliance” tests is no

guarantee that an OF device will perform as expected in an interoperability test, but failing portions of

the OFTest suite usually leads to failure in interoperability testing. One possible way to increase the amount

of time available at future events would be to require vendors to run specified test tools like OFTest

against their equipment in their own lab prior to the event and provide the results. This would help to

identify some issues ahead of time and give a better idea of what to expect during interoperability

testing.

The most important issues to come out of this event were probably the diverse ways of handling

unmatched packets, inconsistent failure modes and issues around the use of LLDP as a discovery tool.

The OpenFlow 1.0.0 specification is clear on sending a packet_in message to the controller when a

received packet does not match an existing flow rule, but there still seems to be much debate amongst

vendors as to the best way to handle unmatched packets. While controllers may be able to overcome

these differences by inserting specific default action rules instead of relying on expected default

behavior, the ONF needs to be very careful that future versions of the specification take into account

these potentially divergent methodologies. It seems critical that the specification settle on an acceptable

set of behaviors and add the ability for an OF device to indicate its supported set of behaviors to

controllers, perhaps in feature_reply messages. Or the controller could query and potentially configure

the switch behavior through the use of the OF-CONFIG protocol.
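
The workaround mentioned above, inserting an explicit default rule rather than relying on each switch's behavior for unmatched packets, can be sketched with a toy flow table. The FlowRule and FlowTable classes, the miss_behavior field, and the action strings are hypothetical modeling devices and are not OpenFlow message structures or any controller's API.

```python
from dataclasses import dataclass, field


@dataclass
class FlowRule:
    priority: int
    match: dict    # field name -> required value; an empty dict matches any packet
    actions: list  # hypothetical action labels

    def matches(self, packet):
        return all(packet.get(k) == v for k, v in self.match.items())


@dataclass
class FlowTable:
    rules: list = field(default_factory=list)
    miss_behavior: str = "drop"  # stands in for vendor-specific miss handling

    def install(self, rule):
        # Keep rules ordered so the highest priority is checked first.
        self.rules.append(rule)
        self.rules.sort(key=lambda r: r.priority, reverse=True)

    def lookup(self, packet):
        for rule in self.rules:
            if rule.matches(packet):
                return rule.actions
        return [self.miss_behavior]


def install_default_rule(table):
    """Install a lowest-priority wildcard rule that sends packets to the controller."""
    table.install(FlowRule(priority=0, match={}, actions=["output:controller"]))
```

Two tables created with different miss_behavior values return different results for an unmatched packet until install_default_rule() is applied to both, after which both return ["output:controller"]. The sketch shows only why an explicit rule removes the dependence on default behavior; it does not address whether a given switch supports such a rule.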

The variety of failure modes also caused many issues. It is important for the controller to fully

understand and pre-determine the behavior of an OpenFlow device if the control-channel TCP session is

lost. In some cases, OF 1.0.0 controllers may be able to overcome these differences by taking advantage

of “emergency mode” and inserting emergency flow rules that define the failure mode (e.g., an

emergency rule that drops all packets), but other desired behaviors may be difficult or impossible to

influence with emergency rules. In later specifications, there is no concept of “emergency mode”, but

the set of acceptable failure modes is well defined. The controller, however, still seems to have no

knowledge of how the switch is configured to fail. This seems to be another case where failure modes

could be communicated to controllers as part of feature_reply messages, or the controller could query

and potentially configure the switch behavior through the use of the OF-CONFIG protocol.

Topology discovery may not be within the scope of the OpenFlow specification, but it turned out to be

a very important issue at the interoperability event. The question remains whether LLDP should

continue to be the topology discovery protocol of choice in OpenFlow networks. Since LLDP is used

natively by some Hybrid switches, it may not interact as expected on all OF networks. We also must

think about interoperability with non-OF devices at the edges or within the OF network.

Overall, the experience highlighted the importance of events like this and continual interoperability

testing.

Appendix A: References

OpenFlow Specification 1.0.0 - http://www.openflow.org/documents/openflow-spec-v1.0.0.pdf

OpenFlow Specification 1.1.0 - http://www.openflow.org/documents/openflow-spec-v1.1.0.pdf

OpenFlow Specification 1.2 -

https://www.opennetworking.org/images/stories/downloads/openflow/openflow-spec-v1.2.pdf

OpenFlow Configuration and Management Protocol -

https://www.opennetworking.org/images/stories/downloads/openflow/OF-Config1dot0-final.pdf

ONF-Testing-Interop-March-2012-Whitepaper (will be posted on the opennetworking.org site)

Appendix B: Revision History

Version | Date | Notes
0.1 | 3/29/2012 | Initial Draft – Indiana University, Ron Milford
0.2 | 4/4/2012 | Title Page & Index; Updates From – Big Switch, HP, IU, Ixia, NEC
0.3 | 4/6/2012 | Introduction, Definitions; Updates From – HP, IU
0.4 | 4/10/2012 | Additional Comments, Conclusions; Updates From – Big Switch, HP, IU, NEC, Stanford
0.5 | 4/12/2012 | Updates From – HP, IU, NEC
0.6 | 4/19/2012 | References & Revision History – IU
1.0 (final) | 4/19/2012 | Minor updates and finalization of the document – Ixia