On “VMware’s SDN Dilemma: VXLAN or Nicira?”

Some commentary on a blog post published by Network Computing titled “VMware’s SDN Dilemma: VXLAN or Nicira?”

VMware has a technology problem: it’s backing two competing standards for overlay networks: Nicira’s STT and the IETF draft standard VXLAN.

Nonsense.  As of right now, STT tunneling provides the best performance for network virtualization (wire speed).  If and when VXLAN (or some derivative) becomes the better option, it’s just a matter of adding VXLAN as another tunneling protocol the system can be configured to use – if it isn’t there already.  That’s not a “technology problem”.  That’s providing the right tools at the right time – facilitating a transition from one generation to the next (from early adopters to widespread deployment).

… limited entropy in the STT header means it doesn’t balance loads evenly over Ethernet port bundles in network backbones. Depending on your network design, this could be a significant limitation.

This is just factually incorrect.  The TCP source port in the STT outer header is derived from a hash of the inner frame’s headers.  Individual flows carried by STT will therefore have different TCP source ports in the outer header.  This provides maximum flow-level granularity (entropy) for optimal load balancing over ECMP/LAG paths on standard hardware in the physical network.  This is discussed in section 2.5 of the STT informational draft.  By the way, this is the same method employed by VXLAN (which hashes the inner headers into the outer UDP source port).
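To make the mechanism concrete, here is a minimal sketch of the idea: hash the inner flow’s identifying headers and map the result into the ephemeral port range for the outer TCP source port. The hash function and port range below are illustrative choices, not what the STT draft mandates.

```python
import hashlib
import struct

def outer_source_port(inner_headers: bytes,
                      port_min: int = 49152, port_max: int = 65535) -> int:
    """Derive a stable outer TCP source port from a hash of the inner
    frame's headers (illustrative; the STT draft does not mandate a
    specific hash function)."""
    digest = hashlib.sha256(inner_headers).digest()
    (value,) = struct.unpack_from(">H", digest)
    # Map into the ephemeral range so the value looks like a normal
    # TCP source port to the physical network's ECMP/LAG hashes.
    return port_min + value % (port_max - port_min + 1)

# All packets of one inner flow get the same outer port (so a flow is
# never reordered across paths), while distinct flows almost always
# get different ports, giving the fabric per-flow entropy to hash on.
flow_a = b"10.0.0.1|10.0.0.2|tcp|5001|80"
print(outer_source_port(flow_a))
```

Because the port is a pure function of the inner headers, every packet of a flow follows one physical path, while different flows spread across the available paths.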

NVGRE is the tunneling protocol (pushed by Microsoft) with poor handling of flow level granularity.  Section 4.8 of the NVGRE draft states that “NVGRE-Aware” network devices would be required to realize the best flow level entropy and optimal load balancing on ECMP/LAG paths. Perhaps the author confused STT with NVGRE?


  1. says


    I agree that changing the underlying tunneling protocol is simply a matter of configuration (Open vSwitch also supports GRE) and that it makes sense to have a choice of tunneling protocols so that you can pick the one that best suits your requirements.
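With Open vSwitch, switching tunnel types really is just configuration: the tunnel is a port on the bridge whose `type` selects the encapsulation. A sketch (bridge name, interface names, and the remote IP are made up for illustration):

```shell
# Create a bridge and attach a GRE tunnel to a remote hypervisor.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 tun0 -- set interface tun0 \
    type=gre options:remote_ip=192.168.1.2

# Moving to a different encapsulation is a one-line change of the
# interface type (assuming the OVS build supports it), e.g.:
ovs-vsctl set interface tun0 type=vxlan
```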

    On the ECMP/LAG point, I think there are basic problems with relying on hashing to distribute data center workloads. Much of the traffic between VMs consists of long-lived flows, which can lead to poorly balanced ECMP/LAG paths even if you use inner header information to increase entropy. In addition, improvements in hypervisor networking performance mean that VM-to-VM traffic can saturate a 10G link, making effective load balancing critical to managing QoS.

    SDN/OpenFlow holds promise as a way to more intelligently place long-lived flows on ECMP/LAG paths by explicitly controlling their forwarding paths. Short-lived flows are well balanced by existing hash-based techniques, so a hybrid approach is probably optimal.
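The commenter’s point can be illustrated with a small simulation: a per-flow hash (a stand-in for a switch’s ECMP/LAG hash; the function and flow tuples below are invented for the example) spreads many small flows almost perfectly, but it is oblivious to flow size.

```python
import hashlib
from collections import Counter

def ecmp_link(five_tuple, n_links: int) -> int:
    """Pick an ECMP/LAG member link by hashing the flow's 5-tuple
    (a stand-in for a switch's per-flow hash)."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links

# Ten thousand short flows spread evenly over a 4-link bundle:
flows = [("10.0.0.1", f"10.0.{i // 250}.{i % 250}", "tcp", 40000 + i, 80)
         for i in range(10_000)]
spread = Counter(ecmp_link(f, 4) for f in flows)
print(spread)  # each link carries roughly a quarter of the flows

# But the hash sees only headers, not throughput: two 10G elephant
# flows that happen to hash to the same link will saturate it no
# matter how much entropy the hash has -- which is the case for
# steering long-lived flows individually (e.g. via SDN/OpenFlow).
```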



  2. Joe Smith says


    Sounds like you’re getting comfortable with the new sales speak at VMware. :-) Good stuff. Quick – perhaps stupid – question: Do NVOs require OpenFlow? I thought I read once that Nicira’s NVP leverages OpenFlow between the OVS and the Controller… All this SDN stuff is pretty murky unless you have the opportunity to be engaged in relevant discussions with the industry leaders who are delivering this technology to market. Otherwise, you have to read sketchy overviews and marketing slides that sometimes deliberately paint that murky picture to hide the uglier, less sexy details.

    Can you give us a high-level intro on the different SDN/NVO architectures that are being developed?

    Kind Regards

