On optimizing traffic for network virtualization

The era of network virtualization and software overlays is coming (read: VXLAN, OpenFlow, SDN, etc.), and with it the role of the physical network, and what we define as “the network”, is about to change.  How does this change the way application flows map to traffic on the network and servers? How does this change the way we design and optimize the network?  I don’t intend to answer all of these questions here, but instead to simply observe that the conversation is taking place, and provide my perspective on this, for what it’s worth.

Here, I’ll pay particular attention to the perceived issue of excessive server-to-server hops and traffic “trombones” as a result of network virtualization.  How should we optimize for it? Or should we even bother?

Martín Casado wrote about Defining “Fabric” in the Era of Overlays, an excellent starting point for understanding the fundamental shift taking place, and the terminology.

Ivan Pepelnjak describes a “total spaghetti mess” in his post VXLAN termination on physical devices.

Scott Lowe asked: “How do we handle the horseshoe routing issue? It would seem to me the only way … would be to support LISP” in his post Revisiting VXLAN and Layer 3 connectivity.

Here, I’m going to expand on a comment I left in Scott’s post where I basically said that this “horseshoe routing issue” or “total spaghetti mess” is NOT a big issue. Or at least, not the issue we should be focusing on.

Scott’s reply to me was this:

Brad, thanks for your perspective. I’m curious—other networking experts seem to make a really big deal out of the traffic trombone/horseshoe, but you claim it’s not really an issue. May I ask *why* you are not concerned? I’m not necessarily disagreeing with you (I don’t have enough knowledge and experience to disagree), I’m just curious to know your thought processes on this matter. Thanks!

I used to think this was a big deal too.  It’s the natural reaction of an experienced networking professional. It just doesn’t look right on paper. We like things to look nice, orderly, and symmetrical. Minimal hops, etc. That’s what we’ve always done. We engineer “the network”. However, the reality that we face today is that with network virtualization and overlays, the definition of “the network” is starting to change. Perhaps the way we think about the network, and the way we engineer it, needs to change along with it.

Think about it this way… When you have a switch, packets go in Port-A, stuff happens inside the switch, and the packets come out another port, Port-B. We don’t need to engineer the switch. That has already been done for us. The switch has an internal I/O *fabric* constructed with ASICs and electrical traces, all engineered by the switch vendor to certain externally visible specifications (latency, throughput, etc). The switch also has a supervisor engine that controls the internal forwarding logic to get the packets from Port-A to Port-B. We don’t worry too much about how this internal fabric topology or forwarding logic is constructed, or the consequential number of hops, or “trombones” inside the switch. Thank heavens. Can you imagine how difficult the job would be if we had to engineer that too? After all, such things have no bearing on the topology exposed to the applications. We simply expect that packets will enter and exit the switch ports within certain parameters of latency and throughput.

From there, we engineer “the network” formed by linking multiple switches together, and the configuration of that physical network defines the topology exposed to the applications. We engineer for the fewest physical hops, because, well, we can. It’s something we can easily control and make better, demonstrating some additional professional value.

All of *that* has begun to change. In this new era of network virtualization and overlays, the role of the physical network has changed from providing the topology, to providing the I/O fabric. The physical network now provides a *fabric* to one expansive data center wide software virtual switch. To construct the internal *fabric* for this virtual switch, the physical switches themselves provide the role of the ASIC, and the cables between the switches provide the role of the electrical traces. The servers provide the role of the linecard with virtual ports. And finally the software that controls the virtual switch provides the role of the supervisor engine, implementing the forwarding logic that gets the packets from virtual-Port-A to virtual-Port-B.

The application topology is now vastly simplified because it’s constructed by a single logical “switch”, a software virtual switch. The vast underlying physical network is hidden inside the virtual sheet metal of this single logical switch.

The engineering of the physical network takes on a new perspective, similar to the engineering of a single physical switch. You build a topology of ASICs (switches) and traces (cables) that provide the virtual switch with certain scalability and performance specifications. In the same way that we didn’t care too much about the consequential hops or trombones within a physical switch, we now have the luxury of not caring about consequential hops or traffic trombones between servers within the physical *network*, with the understanding that the network has been engineered to certain specifications of latency and throughput to satisfy what the applications can extract from the network in terms of performance. I know, I know, a little easier said than done.

To that end, because the role of the physical network has changed, we can certainly expect that it’s time to rethink the requirements and features we need from the network, and how it’s constructed. For example, rather than focusing on the typical mix of Layer 2 and Layer 3 requirements, we can instead simply focus on Layer 3. Evaluating a switch based on its Layer 2 protocol capabilities is no longer a concern here.

In this new era, when you try to apply your usual skills of optimizing east-west application hops on the physical network, inside this new virtual switch, you can quickly find yourself on a wild goose chase. Bear in mind that you’re taking a skill of optimizing a topology, and trying to apply that same skill to optimizing the internal workings of a “switch”, albeit a logical switch. Square peg, round hole.

First of all, it’s easy to get frustrated. Unlike before, there is not much you can really do about it, just as there isn’t much you can really do about the internal workings of a switch. Instead, you’ll find yourself asking for new traffic engineering features from the virtual switch (LISP), and new capabilities from the existing features (a distributed implementation of vShield Edge). All of this requires significant time, engineering effort, and cost. It’s not something we can easily affect ourselves in a few hours at the whiteboard and CLI.

Secondly, it’s easy to come to the conclusion that all of this would largely be a futile effort anyway. Let’s say you get your LISP in the virtual switch and your distributed VSE, you’ve implemented it, and … (drum roll) Presto! You now have one or two fewer hops inside the virtual switch for some flows. Did anybody notice? Probably not.
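
To put rough numbers on “did anybody notice?”, here is a back-of-the-envelope sketch in Python. Every figure in it is an assumption I picked for illustration (a per-hop switch latency of a couple of microseconds and an application response time of about a millisecond), not a measurement from any real network, and the function name is hypothetical. Plug in your own numbers.

```python
# Back-of-the-envelope only: all figures below are illustrative assumptions,
# not measurements from any particular switch or application.
PER_HOP_LATENCY_US = 2.0        # assumed latency added by one physical switch hop
BASELINE_RESPONSE_US = 1000.0   # assumed application response time without the extra hops


def added_latency_share(extra_hops,
                        per_hop_us=PER_HOP_LATENCY_US,
                        baseline_us=BASELINE_RESPONSE_US):
    """Return the fraction of total response time spent in the extra hops."""
    extra_us = extra_hops * per_hop_us
    return extra_us / (baseline_us + extra_us)


if __name__ == "__main__":
    for hops in (1, 2):
        share = added_latency_share(hops)
        print(f"{hops} extra hop(s): {share:.2%} of total response time")
```

Under those assumptions the extra hops account for well under one percent of what the application sees. If your per-hop latency or your application’s sensitivity is very different, the answer may be different too, which is exactly why the numbers are parameters.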

Finally, this is by no means an attempt to minimize the importance of understanding how and where traffic is flowing on your physical network, and why. Quite the contrary in fact. Understanding the application behavior on the network, both virtual and physical, is probably more important now than ever before. Such knowledge directly translates into how well you can anticipate growth, where in the infrastructure you focus your efforts, and how well you can evaluate the relevance of new technologies to your environment.

For example, when a switch vendor boasts about VXLAN capabilities in their new hardware — is that a big deal? Or not? What about a distributed network services platform like Embrane? What kind of impact will that have on your applications and infrastructure? Only somebody who knows exactly how the applications overlay the physical and logical network will be able to efficiently answer these questions. And *that* is the optimization you and I should be focusing on. It is a more effective path to demonstrating our professional value in this new virtual era.



  1. Adam Webb says

    I agree with your assertion it’s not a big deal. We need to build the physical as simply as possible, more so now than ever before, so that whatever we overlay remains easy to manage. We’ve all worked in operations at some point, and packets “tromboning” or taking increasingly strange paths to reach a destination is just part of dealing with outages and failures. Planning for this behavior, and expecting it, isn’t unreasonable.

  2. says

    While there is a certain appeal to treating the network as a single logical switch, the physics of data transmission is inescapable: shorter network paths provide higher bandwidth and lower latency. Instead of trying to hide the network topology from applications, making the topology visible to distributed applications allows them to take advantage of the information and optimally place workloads.


    • says

      Hi Peter,
      The “single logical switch” is the view presented to the infrastructure consumer, and yes, the applications the consumer instantiates.

      The network provisioning component, on the other hand, can certainly still leverage topology awareness for consumer workload placement. For example, the “supervisor engine” of the software virtual switch, with its tentacles in the hypervisor kernel, has visibility into which customer VMs are on which servers. It wouldn’t be a stretch to then create groupings of servers based on physical placement in the network (same rack). This additional level of detail could easily be used for VM mobility and placement decisions.

      In a nutshell, I don’t see network virtualization inherently breaking the NUMA-like provisioning concepts you describe.


      • says

        Hi Brad,

        To properly orchestrate resources you need a single logical point of control. You can take a network centric view and have the network supervisor engine drive decisions like VM location, or an application centric view in which the network is one of the many resources under the control of the hypervisor/application.

        Ultimately, I think that application centered control will win since software is far more malleable than networking hardware and the application developer has a much wider set of options (for example caching data that would be expensive to retrieve over the network). Software developers that understand how best to use network resources will be able to deliver applications that are more scalable, perform better and make more efficient use of resources. A good example is topology awareness in Hadoop (where it tries to move computation close to the data, either on the same server or to a server on the same physical switch).

        The single logical switch notion tends to get in the way and isn’t necessary to provide the automated switch configuration, equal cost shortest path bridging, congestion control, etc. that are features of an Ethernet fabric, although many network vendors seem to be linking the two.


        • says

          It might help to get more specific about the word “application”. In the era of software defined networking it’s going to be one of those nebulous words that can confuse people. Kinda like “cloud”.

          When I refer here to “application”, I’m talking about IaaS and the customer-owned application loaded into that environment that will consume the infrastructure. I get the feeling you’re talking about infrastructure software, the stuff that sets up an environment for the infrastructure consumer (e.g. Hadoop, or hypervisor). Two very different things. In that sense, I’m in full agreement with you that infrastructure software provides the optimal point of control — hence, *Software* Defined Networking (SDN). The “single logical switch” I refer to here is the product of such infrastructure software. It’s purely a *software* constructed element.

  3. Nathan says

    Hi Brad,

    A great post which is very relevant at the moment.

    I have ended up in several head scratching moments recently where clients want their dual data centres to be active / active with complete workload portability, all servers to use the local data centre gateway (both inbound and outbound) and for no traffic tromboning across the DCI….help!

    I think the *virtual switch* analogy makes sense, but as you say the network needs to be designed to meet the required performance and latency requirements (potential commercial issue here), and it also needs to function in a deterministic fashion.

    I’m keen to run this concept past a few customers.

    • says

      Hi Nathan,

      In this post I was primarily addressing the server-to-server tromboning within a single data center, although I did not make that explicitly clear.

      If your customer is uncomfortable with traffic tromboning between two “active/active” data centers (I wouldn’t like that either), then they should not implement an architecture that would lead to those circumstances. Traffic tromboning between data centers would certainly lead to application performance problems, because the DCI link is typically not engineered with the capacity and (low) latency to provide intra- or inter-DC workload placement transparency.

      When it comes to “active/active” data centers, my general recommendation is to have each data center act as “active” for a given application, and while that data center is “healthy”, all layers of that application stack stay in that data center. From there you can use global site load balancing (GSLB) to (1) detect application failures, and (2) move user traffic for that application to the other data center.

  4. says

    Thanks for the article. I will be sharing it with customers as I think most people are still not convinced that “virtualisation creates paradigm shift in network”. I think this topic deserves a book (or some blog entries!)

    Have a great 2012!

  5. Joe Smith says

    Network virtualization, per se, is nothing new. VLANs, VRFs, virtual interfaces (think dialer profiles, SVIs, etc) have been around for years. I’m still trying to wrap my head around SDN. Why is it considered so revolutionary? What is so attractive about a centralized brain/control plane? Unknown unicasts are still flooded by the SDN controller, so don’t you still need a loop prevention mechanism?

    And pardon what may seem like a stupid question, but isn’t all networking “software defined”? The technologies and protocols that define the control, data and management planes are all implemented in a switch/router’s OS….

    I’m sure there is something…not sure what

    • Naveen says

      Joe, here Software ‘Defined’ means that software will define and provision the underlying network dynamically, as per requirements. In traditional networking, net admins configure and provision the network, and this has certain limitations when it comes to cloud-based needs that require dynamic and instant network setup. Software, via programming, will send commands to the network devices that will then ‘set up’ the network, so the era of the network ‘software’ programmer is fast approaching. If you look at this blog post and at other blogs, SDN is being worked on and experimented with by software pros who have a networking background (mostly old-hat CCIEs). You won’t see conventional software app developers with zero knowledge of networking working on this subject.

