Network Virtualization: a next generation modular platform for the data center virtual network

“What will my next generation data center networking platform look like?”

“How do I describe this platform to IT managers and begin to wrap my arms around it?”

This post attempts to provide a framework for that discussion, in which I’ll argue that the platform for the next generation data center network has already taken shape.   It’s called Network Virtualization, and it looks a lot like the networking platforms we’re already familiar with.

Over the last 15 years the networking industry has been challenged with tremendous growth in all areas of the infrastructure, but none more challenging than the data center.  As networking professionals we have built data centers to grow from tens, to hundreds, to thousands of servers – all while undergoing a transition to server virtualization and cloud computing.  How on earth did we manage to make it this far? Platforms.

More specifically: flexible, scalable, robust, adaptable, modular switching platforms.

A platform for data center networks

As networking professionals we have relied on these modular switching platforms as a foundation to build any network, connect to any network and any host, meet any requirement, and build an architecture that is both scalable and easy to manage with L2-L4 services.  This is evidenced by the phenomenal success of the modular chassis switch, a marvel of network engineering for architecting the physical data center network.

There have been many different modular switching platforms over the years, each with their own differentiating features – but the fundamental baseline architecture is always the same.  There is a supervisor engine which provides a single point of configuration and forwarding policy control (the management and control plane).  There are linecards (the data plane) with access ports that enact the forwarding policy prescribed by the supervisor engine.  And finally there are fabric modules that provide the forwarding bandwidth from one linecard to the next.

In general, we see this very same architecture in almost all network switch platforms (virtual or physical), which boils down to three basic components.

  1. Control point (switch CPU, or supervisor engine)
  2. Edge forwarding components (port ASIC, or linecards)
  3. Fabric (switch ASIC, or fabric modules)

The scale at which this architecture is realized varies based on the implementation of the switch (e.g. fixed, or modular).

Why has this architecture been so successful?  I can think of several reasons:

Consistency – A single forwarding policy is defined at one control point, the supervisor engine, which automatically deploys that policy to the appropriate linecards.  The supervisor engine ensures that each linecard configuration is correct and consistent with policy.  This consistency model also applies to the forwarding tables.  As each linecard learns the MAC addresses connected to its ports, the supervisor engine ensures that all other linecards have the same consistent forwarding table.

Simplicity – Each linecard locally implements a forwarding lookup and enforces a policy (security, QoS, etc.) upon receiving traffic — determines a destination linecard — and relies upon the fabric modules to provide the non-blocking bandwidth to the destination linecard.  There is no need to re-implement the forwarding policy again on the fabric modules.  Additionally, there is no need to populate the full forwarding tables on the fabric modules either, because forwarding in the fabric is based on information added by the source linecard that identifies the destination linecard.  In total, this architecture simplifies both the fabric module design and the overall implementation of the modular chassis switch.
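To make the consistency and simplicity properties concrete, here is a toy model of the chassis architecture in Python.  All class and variable names are illustrative; this is a sketch of the concept, not any vendor's actual design.

```python
# Toy model of a modular chassis switch: one supervisor (control plane),
# several linecards (data plane), and a fabric that forwards only on a
# destination-linecard tag. Names are illustrative, not a real product.

class Linecard:
    def __init__(self, card_id):
        self.card_id = card_id
        self.fib = {}            # MAC -> destination linecard id (synced by supervisor)
        self.local_macs = set()  # MACs learned on this card's own ports

    def learn(self, mac):
        self.local_macs.add(mac)

    def forward(self, dst_mac):
        # Local lookup only; the fabric needs just the destination card id.
        return self.fib.get(dst_mac)  # None means unknown destination

class Supervisor:
    def __init__(self, linecards):
        self.linecards = linecards

    def sync(self):
        # Consistency: build one global table and push it to every card.
        global_fib = {}
        for card in self.linecards:
            for mac in card.local_macs:
                global_fib[mac] = card.card_id
        for card in self.linecards:
            card.fib = dict(global_fib)

class Fabric:
    def deliver(self, dst_card_id, linecards):
        # Simplicity: no policy, no MAC table, just the destination card id.
        return next(c for c in linecards if c.card_id == dst_card_id)

cards = [Linecard(1), Linecard(2)]
cards[0].learn("aa:aa")
cards[1].learn("bb:bb")
Supervisor(cards).sync()

dst_id = cards[0].forward("bb:bb")       # ingress lookup on linecard 1
egress = Fabric().deliver(dst_id, cards)
print(egress.card_id)  # 2
```

Note how the fabric never inspects the MAC table: the source linecard's lookup already identified the destination linecard, which is exactly the simplification described above.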

Scale – The switch architecture in its entirety represents a single logical forwarding pipeline of input ports, a policy or service to implement, and output ports.  After all, this is the very essence of what a network should do:

  1. Receive traffic
  2. Apply a service or policy
  3. Forward to a final destination

The configuration complexity required to implement that basic pipeline across a network of discrete devices impacts the overall scalability of the network.  Traffic steering is one of those complexities, where network teams must weave together an intricate tapestry of VLANs, VRFs, VDCs, appliance contexts, etc. across many devices just to establish the logical pipeline with isolation and multi-tenancy.  Inside a single switch however, traffic steering is handled by the switch architecture, not the operator of the switch.  Hence scalable switch architectures such as a modular chassis switch work to reduce the configuration touch points required to implement traffic steering and the forwarding pipeline across the overall physical network.
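For reference, the basic receive/policy/forward pipeline described above can be sketched in a few lines; the toy ACL and forwarding table here are made up for illustration:

```python
# Minimal sketch of the logical forwarding pipeline:
# 1. receive traffic, 2. apply a service or policy, 3. forward.

def pipeline(frame, policy, fib):
    # Step 1: receive traffic (the frame argument).
    if not policy(frame):             # Step 2: apply a service or policy.
        return None                   # Dropped by policy.
    return fib.get(frame["dst"])      # Step 3: forward to final destination.

fib = {"bb:bb": "linecard-2"}
allow_web = lambda f: f.get("port") == 80   # a toy ACL

print(pipeline({"dst": "bb:bb", "port": 80}, allow_web, fib))  # linecard-2
print(pipeline({"dst": "bb:bb", "port": 22}, allow_web, fib))  # None
```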

Setting aside for a moment the impracticality of massive sheet metal, cabling, silicon, and vendor lock-in:  If it were possible to build and install one large all-encompassing chassis switch for the entire data center, imagine the simplicity it would afford in implementing the basic forwarding pipeline.  We would have a single control point for all ports, consistency in policy and forwarding, and no complex traffic steering.

Indeed the massive hardware chassis switch is just a fantasy for the physical network.  However the majority of endpoints in the data center are now virtual – attached to a virtual network made up of virtual switches.  And unlike a physical switch, the scope of Network Virtualization is not constrained by hardware elements such as sheet metal, cabling, and silicon.  Instead, a network virtualization platform is only constrained by software, standard transport interfaces (IP), and open control protocols (API).

Next we’ll explore how the VMware/Nicira network virtualization platform provides a common logical switching architecture at an all-encompassing scale for the data center virtual network.

A platform for Network Virtualization

Now let’s look at how this very same proven and familiar modular switch architecture has manifested itself once again to become a next-generation platform for the data center network.  Remember our three basic architecture components: 1) Control point, 2) Edge forwarding, and 3) Fabric.  All of these still apply, only now at a much larger, more encompassing scale.

Before we begin, let’s first recognize what it is we are trying to accomplish.  Remember the essence of what a network should do: 1) receive traffic, 2) apply a service or policy, 3) forward to the final destination.  This, again, is the essence of what we want to accomplish – implement a logical forwarding pipeline for the virtual network – with all the properties of consistency, simplicity, and scale.


Let’s begin with the Edge.  This is where traffic is first received on the virtual network – the insertion point of ingress policy in our forwarding pipeline.  And this of course is what we know today to be the virtual switch present in hypervisor hosts.  Two obvious examples of the virtual Edge are the VMware vSwitch and Open vSwitch (OVS).  The virtual edge is effectively the “Linecard” of the network virtualization platform (NVP).  And these edge Linecard devices are “wired” to each other as needed with tunnels, configured dynamically by the Controller.

One notable difference from physical switch architecture is that our network virtualization platform is not limited to a small subset of vendor specific linecards, or vendor specific fabric modules.  This is because the logical chassis is constructed with open source software at the edge (OVS), linked together with soft cabling (STT, VXLAN, GRE tunnels), over any network fabric, and controlled with open APIs such as OpenFlow and OVSDB.  This creates a platform ripe for an ecosystem.  For example, in addition to a virtual switch, other possibilities for a virtual Edge linecard include 3rd party Top of Rack switches (for connecting physical hosts to the virtual network), and 3rd party network services appliances for attaching specialized network services to the virtual network and forwarding pipeline.
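As a rough illustration of the “soft cabling,” tunnel ports between edge linecards can be created with standard Open vSwitch commands like the following.  The bridge name and IP addresses are made up, and in NVP the Controller programs these tunnels automatically over OVSDB rather than an operator typing them by hand:

```shell
# Create an integration bridge on hypervisor A (192.0.2.10),
# then "wire" it to two other hypervisors with tunnel ports.
ovs-vsctl add-br br-int

# An STT tunnel to hypervisor B ...
ovs-vsctl add-port br-int stt0 -- set interface stt0 \
    type=stt options:remote_ip=192.0.2.20

# ... and a VXLAN tunnel to hypervisor C (GRE is type=gre).
ovs-vsctl add-port br-int vx0 -- set interface vx0 \
    type=vxlan options:remote_ip=192.0.2.30
```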


Similar to a supervisor engine of a modular chassis switch, the virtual edge linecards are programmed with a forwarding policy from a central controller – specifically, a scale-out software-defined networking (SDN) controller cluster made up of x86 machines capable of managing close to a thousand virtual edge linecards.  Just as a supervisor engine has a management interface supporting protocols such as SSH and SNMP, the SDN controller cluster has an API interface for configuring the virtual network and supporting any upstream cloud management platform (CMP) such as OpenStack, VMware vCloud, or CloudStack.

Similar to the supervisor engine, the SDN controller ensures consistent policy and forwarding tables across all virtual linecards.  For example, when a virtual machine is powered on or migrated, all linecards requiring knowledge of this event are updated and configured by the controller.  Similar to a Linecard in a modular chassis switch, the forwarding table of the virtual linecard maps a destination endpoint to a destination Linecard.  In this case, the destination linecard is identified by its IP address in a tunnel header.  The controller has a global view of the virtual network.  It knows the location and network policy of each virtual machine and is able to program that view when and where needed.
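A rough sketch of this control behavior might look like the following Python.  It is purely illustrative, with made-up names; it is not the NVP controller's actual logic or API.

```python
# Sketch: on VM power-on or migration, the controller updates its global
# view and pushes MAC -> tunnel-endpoint-IP mappings to every edge vswitch.

class EdgeVSwitch:
    def __init__(self, tunnel_ip):
        self.tunnel_ip = tunnel_ip
        self.fib = {}  # VM MAC -> tunnel IP of the hypervisor hosting it

class Controller:
    def __init__(self):
        self.edges = {}        # tunnel_ip -> EdgeVSwitch
        self.vm_location = {}  # VM MAC -> tunnel_ip of its hypervisor

    def register_edge(self, edge):
        self.edges[edge.tunnel_ip] = edge

    def vm_event(self, mac, tunnel_ip):
        # A VM powered on or migrated: record its location, then
        # reprogram every edge so forwarding tables stay consistent.
        self.vm_location[mac] = tunnel_ip
        for edge in self.edges.values():
            edge.fib = dict(self.vm_location)

ctrl = Controller()
a, b = EdgeVSwitch("10.0.0.1"), EdgeVSwitch("10.0.0.2")
ctrl.register_edge(a)
ctrl.register_edge(b)

ctrl.vm_event("aa:aa", "10.0.0.1")   # VM boots on hypervisor A
ctrl.vm_event("aa:aa", "10.0.0.2")   # ...then migrates to hypervisor B
print(a.fib["aa:aa"])  # 10.0.0.2 -- every edge now tunnels toward B
```

The destination “linecard” here is identified by its IP address, which is exactly what lands in the tunnel header on the wire.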


And finally we have the Fabric.  In the modular chassis switch, the fabric is made up of fabric modules supplied by the switch vendor providing the forwarding bandwidth between linecards.  In a network virtualization platform, the fabric is the physical network – which itself could be constructed with modular chassis switches, or perhaps a distributed architecture of fixed switches.  Either way, the physical network provides forwarding bandwidth between all of the virtual Edge linecards.  And the fabric for network virtualization can be supplied by any switching vendor – similar to how hardware for server virtualization can be supplied by any server vendor.

Similar to the fabric modules of a chassis switch, the physical network fabric is not configured with the same forwarding policy and forwarding tables as deployed in the virtual edge linecards.  Consider that fabric modules of a chassis switch have no awareness of linecard configurations such as QoS, VLANs, ACLs, VRFs, NAT, etc. — the same is true for network virtualization.  Any network configuration that implements the forwarding pipeline and virtual network viewed by a virtual machine is only necessary at the Edge, and programmed automatically by the Controller.

As a result, network teams do not need to configure the multitude of physical switches with the traffic steering and network configurations that construct the virtual network, such as VLANs, VRFs, VDCs, QoS, and ACLs.  Consequently, the physical network is free to evolve independently of the virtual network, and can be designed around criteria of scale, throughput, and robust network architecture (Layer 3 ECMP).

The VMware/Nicira Network Virtualization Platform

The network virtualization platform (NVP) from VMware/Nicira is the first solution to deliver full network virtualization, and it is deployed in production at some of the largest service providers and enterprises.  NVP is a standalone L2-L7 data center networking platform designed to work on any network fabric, work with any hypervisor, connect to any external network, and deploy with any cloud management platform (CMP).


Through full network virtualization, NVP is able to create a complete multilayer network abstraction exposing logical network devices such as logical switches, logical routers, and more.  These logical devices can be configured with security and monitoring policies, and attached to each other in any arbitrary topology through the NVP API.  The NVP Controller programs the logical topology at the virtual edge.  With this programmatic control, the logical network has the speed of configuration and operational model similar to a virtual machine – create, start, stop, clone, snapshot, audit, migrate, etc.
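One way to picture this VM-like operational model is to treat the logical topology as pure data that can be cloned and modified independently.  The object names below are hypothetical illustrations, not NVP's actual object model:

```python
# Sketch: a logical network as data, with a VM-style "clone" operation.
import copy

topology = {
    "switches": {"ls-web": {"ports": 32}},
    "routers":  {"lr-edge": {"uplinks": 2}},
    "links":    [("ls-web", "lr-edge")],
}

def clone(topo):
    # Clone a whole logical network the way you'd clone a VM:
    # a deep copy, so changes to the clone never touch the original.
    return copy.deepcopy(topo)

staging = clone(topology)
staging["switches"]["ls-web"]["ports"] = 64   # modify the clone only
print(topology["switches"]["ls-web"]["ports"])  # 32
```

Because the whole topology is just data behind an API, operations like snapshot, audit, and migrate fall out of the same pattern.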

Looking ahead, NVP will serve as a platform for ecosystem partners to plug physical (or virtual) devices such as Top of Rack switches and Network Services appliances into the architecture like a “Linecard”, based on protocols and APIs such as VXLAN and OVSDB.  Network architects will be able to present these 3rd party switches and services as logical devices in the logical network, while NVP systematically implements any necessary traffic steering with tunnels to abstract a simplified view of the forwarding pipeline.

Just as a modular chassis switch can connect to any external network, NVP Gateways provide an edge that connects to any standard Layer 2 or Layer 3 external physical network.  Network architects can attach the external physical networks anywhere in the logical network through the NVP API.  Gateways can also extend logical networks to a remote site using secure IP tunnels (IPSec + STT).  And multiple NVP Gateways can be deployed for scale-out performance and high availability.

In addition to NVP Gateways providing a connection to any external network, NVP Service Nodes provide a connection on any network.  Service Nodes are x86 machines managed by the NVP Controller dedicated to performing additional CPU intensive packet processing services such as handling broadcast, unknown unicast, multicast (BUM) frames, and encryption (IPSec) — offloading that work from hypervisor hosts.  The handling of BUM frames by a scale-out cluster of Service Nodes provides scalable network virtualization on any network, without requiring the limited scale and complexity of an IP multicast deployment in the physical network.
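A toy sketch of this hash-and-replicate behavior, with made-up names (not NVP's implementation):

```python
# Sketch: the source hypervisor hashes each BUM frame to one Service
# Node; that node replicates copies only to hypervisors that host VMs
# on the same logical network.
import hashlib

def pick_service_node(frame_bytes, service_nodes):
    # Hash-based selection spreads load across the scale-out cluster.
    h = int(hashlib.sha256(frame_bytes).hexdigest(), 16)
    return service_nodes[h % len(service_nodes)]

def replicate(logical_net, src_hv, membership):
    # membership: logical network -> set of hypervisor tunnel IPs.
    # Copies go only where a receiver exists, never back to the source.
    return sorted(membership[logical_net] - {src_hv})

nodes = ["sn1", "sn2"]
membership = {"green": {"hv1", "hv2", "hv3"}, "blue": {"hv4"}}

node = pick_service_node(b"broadcast-frame", nodes)
copies = replicate("green", "hv1", membership)
print(node in nodes, copies)  # True ['hv2', 'hv3']
```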

Eschew tradeoffs of Good Design vs. Speed and Flexibility

A network switch is useless until it’s been provided with a configuration, and a network virtualization platform is no different.  However, there is one important difference.  Physical network switches have always been designed under the assumption that once a switch has been configured, the configuration is not going to change that often, because the physical network topology is assumed to stay stable once established, and physical servers are added and moved infrequently.  As such, the CLI has been a suitable interface for configuration changes that happen over longer time scales.

The virtual network, on the other hand, is completely different.  Topology and configuration changes are happening all the time – virtual machines are frequently added, removed, and migrated – and the virtual network configuration must move at a similar time scale.  If not, overall provisioning speed and accuracy are bottlenecked by the slowest component: the physical network switches, each with its own CLI.

Before network virtualization, the physical network needed to play a role in constructing the end-to-end virtual network used by the virtual machines.  A virtual machine was just another host on the physical network.  Traffic steering with VLANs, VRF, VDC, ACL, NAT, etc. needed to be configured by hand with a CLI on numerous switches — a time-consuming process prone to error and inconsistency.

As a result, the significant delta in provisioning speed between virtual machines and virtual networks brought about a contentious tradeoff:  You can have faster network provisioning with a precarious network design (such as all VLANs preemptively flooded on every port and large Layer 2 domains).  Or, you can have a good network design but with slow and limited provisioning (such as virtual machine networks limited to certain racks and services anchored to a physical network chokepoint).  You can’t have both good network design and service provisioning speed.  Not until you’ve decoupled virtual and physical network configuration through a network virtualization platform.

With the virtual network fully abstracted from physical switch hardware, through network virtualization, we are now free to use a configuration mechanism specifically for the virtual layer that’s better suited to the faster time scale of virtual networks – the API.  And the physical switch configurations need only provide a topology to deliver forwarding bandwidth that doesn’t need to change that often — for which the existing CLI is well suited.  As such the manner in which network operators configure the physical network today need not change with network virtualization.

The era of software centric networking platforms

The next generation network virtualization platforms such as NVP closely resemble the switch architectures we’ve deployed over the years to build highly scalable and robust physical networks.  What’s different is that the primary elements enabling this platform are software driven, such as SDN controllers and virtual switches, connected together via standard transports (tunnels) and controlled via standard API interfaces (OpenFlow and OVSDB).

At VMware we believe that virtualization software providers are best equipped to deliver and package a software driven networking platform for the virtual network.  For example, NVP was the first network virtualization platform, and it is already in production at many service provider and enterprise data centers.  While we do expect network vendors to deliver similar “network virtualization” platforms aimed at the virtual network, their execution is likely to come with caveats, such as requiring their own physical network hardware.

For the same reason that it makes sense to support server virtualization on any server hardware, network virtualization should provide the same basic principle of deployment on any network hardware.  Otherwise, it’s not really virtualization.


Brad Hedlund
Engineering Architect, Virtual Networking
VMware, Inc.


  1. Jim Zhang says

    Great view from the Nicira perspective on network virtualization. It is really a thorough virtualization of the network from a technology perspective. However, some network vendors will likely not follow this direction, just as you mentioned in your last point. In my opinion, network virtualization is a bit different from server virtualization, because chassis and fixed switches have unique operating systems tightly coupled with their chipsets, which is not the case for standard x86 servers.

    • says

      That’s why it’s called *Network* virtualization — and not *Switch* virtualization. We don’t care about an individual physical switch and its OS, there’s no need to change that. Instead, we take a step back, look at the broader Network, and virtualize that. The edge of that Network is on x86 standard servers (the hypervisor), and that is where NVP inserts itself.

  2. says

    Thank you for the excellent description of NVP. It matches where I think SDN is headed. A post on BUM processing would be useful. I think there are ways to minimize its impact.

    • says

      Thanks for dropping by and leaving a comment! The BUM processing is really straightforward so that should be an easy/quick post.
      In a nutshell, we take BUM traffic in the virtual network and convert it to known unicasts (tunnels) on the physical network. And because the NVP Controller has a global view of where each endpoint exists, the traffic only goes where it needs to.


      • says

        Hi Brad.

        First allow me to wish you all the best on your new endeavor at VMware. Excellent analysis of this complex subject. Thank you.

        On the subject of handling BUM traffic, I’m guessing that the controller has a “learning mechanism” whereby it learns all the MAC-to-egress interface associations. With that knowledge it then tunnels the traffic accordingly. In that case unknown destination packets have to be flooded until a valid MAC-to-egress interface association is registered. Broadcast and Multicast traffic can by definition only be forwarded over a multicast tunnel. This is almost identical to the classic L2 learning mechanism in 802.1D. This function is so fundamental that almost all modern protocols aiming to replace classic 802.1D utilize it in one form or another. TRILL uses this learning mechanism for BUM traffic to set up MAC-to-egress interface associations, VXLAN in its current form uses it to handle BUM traffic, and even A-VPLS uses this!

        This is a well understood mechanism that IMHO is a common denominator of any overlay network technology. One important differentiator between overlay network technologies is whether the MAC-to-egress interface associations are stored in a central database or distributed across the (tunneling) end-points of the overlay network. Essentially what this boils down to is the emerging controversy of centralized vs. distributed control plane in network virtualization. Do you believe that this is the fundamental issue at hand, and which approach do you advocate?


      • says


        Here’s an extremely abbreviated explanation:

        Broadcast, Unknown unicast, and Multicast (BUM) frames from a hypervisor are sent to NVP Service Nodes. You can have multiple Service Nodes, and each hypervisor has a tunnel established to each one. For every BUM frame the hypervisor selects one Service Node, based on a hash, and sends one copy of the frame to that node. The Service Node receiving the Broadcast/Multicast frame will replicate it and send copies only to each hypervisor that needs to see one. This is because the NVP Controller has programmed the Service Node with the location of each virtual machine and the logical networks to which they belong. NVP has authoritative knowledge of all virtual machines connected at the edge hypervisors. Unknown frames will be dropped by Service Nodes, because there are no unknown hosts (unless the Service Node is providing a connection to another NVP domain, in which case Unknown frames will be flooded to the other NVP domain).

        Hope that helps.


        • John says

          If you are saying every BUM packet must be sent to the Service Node, then you can’t compare NVP to a Modular Chassis, unless you put a very “big caveat” in your blog. Would you agree? A modular chassis replicates the packet locally without the need to have a “Service Node”.

          I may be using the wrong comparison but NVP looks closer to LANE than a modular switch. For example:
          1) Service Node provides the same functionality as the Broadcast and Unknown Server (BUS).
          2) NVP Controller provides the same functionality as the LECS/LES
          3) Edge is similar to the LEC



          • says

            Not necessarily. A modular chassis has to replicate the frame with a replication engine ASIC, or a replication function within a single all purpose ASIC.
            You can think of the Service Nodes as the replication engine providing that same replication function.

          • John says

            I don’t think you should compare the Service Node to a replication engine. Sure, the main function of both is to replicate packets, but:

            1) Replication engines (ASICs) can replicate packets with higher efficiency than a service node. Would you agree?

            2) For example, if I have 2 multicast receivers and 1 source on the same edge switch, it looks like I have to send the traffic to the service node. If I have a switch and/or modular chassis, the multicast packets stay local and I would not be wasting bandwidth in my network. Agree?

            LANE is probably closer than anything

            Thanks for your help in understanding this technology.


          • says


            1) No. I don’t agree with that.

            2) The edge hypervisor vswitch will locally forward broadcast/multicast to local receivers on the same logical network. Service Nodes facilitate replication to other hypervisors with virtual machines on the same logical network.

            The LANE analogy falls apart when you look at the operational model. LANE was still a network composed of many individual switches configured by hand/CLI. The NVP Controller Cluster provides a single point to configure the virtual network, across all hypervisors and gateways, programmatically with an API.

  3. Donny says

    This really gets interesting when you begin to really investigate the architecture and realise the classic network (IP) is becoming a service bus in a virtualized environment. With the intelligence, flexibility, and rapid configuration of SDN, I simply need a transport medium between hosts.

    I believe we will soon see the next generation in which there is an IP “gateway” to the legacy network. All virtual systems behind the gateway will use a simple high speed, low latency interconnect for transport only. Think along the lines of consolidated infrastructure (like Nutanix) with an IB fabric. The entire transport and logical control can be software defined and operate at PCI bus speed.

    • says

      I definitely agree, and what you describe sounds a lot like what this post was about — software defined edge, IP transport network, and interfacing to external networks via Gateways.
      The transport network certainly could have its own controller — perhaps to build a fabric that’s easier to manage and configure.

  4. Marc Abel says

    I am curious about your statement that QoS is needed only at the edge. QoS is generally needed end to end to have the proper effect. Can you expound on this point and how it is handled?

    • says

      The heavy lifting of QoS for virtual networks is best done at the edge — traffic classification, shaping, limiting, minimums, maximums, and marking. The physical fabric switches can and should certainly provide local QoS on marked packets, but can only do so much with their limited number of hardware queues. But as we see the cost per port of 10G/40G switches continue to plummet and density rise, building a non-blocking fabric (where QoS is largely irrelevant) becomes more of a reality.

  5. Donny says

    “…precarious network design (such as all VLANs preemptively flooded on every port and large Layer 2 domains). ”

    Could you expand on this? For the last 3 years, all virtualized environments my team has deployed have used 4 or fewer network interfaces with consolidated VLANs trunked to the host. I would like to understand the “precariousness” you reference.

    Thank you…

    • says

      When virtual network = VLAN, the result is building large Layer 2 domains and forwarding a swath of VLANs on all host facing ports. Would you design a network that way if you didn’t have to? If the answer is Yes then I can’t help you — just keep doing what you’re doing.

      • Donny says

        VLANs should only be forwarded to hosts where workloads for those segments are processed.

        Would I collapse the network (logical segmentation) onto as few interfaces as possible? Yes. But I fear there is still a disconnect which I want to ferret out. I am proposing minimizing physical links while presenting only the required logical paths. VLANs may be the logical segmentation method, but they need not be forwarded to hosts which do not process those workloads. I believe the point of contention is “all hosts”.

      • Donny says

        Sorry for the redundant posts, but I just had another thought.

        “…swath of VLANs on all host facing ports…” How many ports on switches are utilized for VLAN trunks? Would we forward only 1 VLAN per switch interconnect? Why are we treating the VLAN within the virtual switch different than the VLAN within the physical switch?

        • says

          Yes, when the virtual network is fully decoupled from the physical network, through network virtualization, the virtual network is not dependent on VLAN configurations in the physical network switches. Your physical switch port configuration does not need to change as virtual networks are changed, added, or removed. Your physical switch ports provide the host with enough access to transmit/receive IP traffic. Virtual networks can exist on any server, in any rack, reachable by IP — no limitations imposed by the reach of physical network VLANs.

          Why do this? Speed, accuracy, and efficiency. The edge vswitch layer can have its virtual networks established programmatically, in a matter of seconds — without waiting for the physical network to be configured with a VLAN at multiple switches, by hand — a process that’s both slow and prone to error.

          • Donny says

            On that I agree whole-heartedly. Programmatic control of both switch planes is powerful and anticipated. I love the foreshadowing of programmatic ARP tables (reduced broadcast) due to the fact that Nicira knows where the VM resides at all times.

            However, I believe our discussion was over previous generation SDN solutions (VSS/VDS) where VLAN configuration must be coordinated with legacy infrastructure. I am attempting to understand your presentation of the precarious configuration from the presentation of VLANs and broad L2 domains. If we are to sell “those network guys” on the validity of vSwitch solutions, we must treat them as “just another switch”.

            The VLAN being forwarded to virtual hosts (especially ESXi) should be viewed by network engineers as just another switch trunk port. The access layer essentially becomes embedded in the virtual host, with trunk links to the second tier or core of the network. There is still the risk of configuration deviation and process coordination between the teams.

            I know you are more knowledgeable and aware of these matters than I am, and this makes me want to understand difference between our views. Thank you for the discussion.

  6. Steve says

    Great post! So far this is the best one I’ve seen on NVP/SDN topic. I have two questions and see if you can share your thoughts with.
    1) After the decoupling of the virtual network (like NVP) from the physical one, what would become the important criteria to consider when building up the physical network? And what kind of topology would be a better choice in terms of L2 domain size, VLAN usage model, and L3 separation point selection?
    2) You mentioned using service nodes for BUM traffic handling. Actually I’m more interested to know about the soft switch (like OVS) on the hypervisor itself. Let’s say the soft switch needs to handle around 10Gbps of aggregate traffic including the tunneling (STT/NVGRE/VXLAN); do you have any experience data on what kind of CPU resources it would take, e.g., # of cores, utilization ratio?

    • says

      Great description of network virtualization – I really like the diagrams!

      There is a discussion of virtual switch performance on Martin Casado’s Network Heresy blog.

      As far as physical network topology goes, while network virtualization creates a logically abstract view of the network, fully achieving the abstraction requires that the cloud orchestration software manage the physical network resources in order to optimally place workloads and hide the resource management task from virtualized network applications – in the same way that server virtualization requires that the hypervisor manage physical compute, memory and IO resources in order to deliver the virtual machine abstraction.

      To facilitate automation, physical network topologies need to be simple and modular – again, in the same way that clusters/pools of servers simplifies the management of compute resources.


  7. Victor Lama says

    Brad, my man! Great post. Informative, clear, concise, logical and necessarily simplified for us legacy networking knuckleheads. :-)

  8. Mohan says


    Thanks for a very informative article. I have a question about VMware/Nicira NVP. On the Nicira website, I read that it’s a complete software solution. However, in your article I see that NVP claims to virtualize network elements like LB, IPS, FW, etc. Don’t these elements need to be upgraded to work with NVP?
    As I understand it, to virtualize a FW, the FW should be able to process the encapsulated packet, decode it, and apply policy as per the virtual network. But existing FWs may not support this!


    • says

      The last diagram in this post depicts a Load Balancer connected directly to NVP logical networks with VXLAN tunnels. That is a forward looking illustration.
      That said, today you can have service appliances as virtual machines with NVP, or as physical appliances outside of NVP which are connecting to NVP logical networks through NVP Gateways.

  9. Naveen says


    Thanks for the post…I want to ask:

    Who is better suited to handle and manage virtual (SDN) networks?

    Network teams or VM teams?

    • says

      The VM teams usually just want to easily consume networking — even if that means taking on more manual network configuration tasks at the hypervisor. The last decade of vswitch deployments have shown us that.

      With network virtualization, consuming network is just a matter of API calls, which can be done programmatically without the manual configuration tasks.
      The network virtualization platform, such as NVP, that provides the API and logical network is probably going to be set up by the cloud architect or network architect, or both. The network ops team continues to deploy and operate the physical network with their existing management tools, or perhaps some other SDN controller specifically for the physical network; either way works.
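To make the “just a matter of API calls” point concrete, here is a minimal sketch of how an orchestration script might request a logical switch from a controller. The endpoint path and payload fields are hypothetical illustrations, not the actual NVP API:

```python
import json
import urllib.request

def build_logical_switch_request(controller, name, tenant_id):
    """Build an HTTP request to create a logical switch on a (hypothetical)
    network virtualization controller. The URL and JSON fields are
    illustrative only; a real controller API will differ in detail."""
    url = f"https://{controller}/api/logical-switches"
    body = json.dumps({"display_name": name, "tenant_id": tenant_id}).encode()
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"})

# A cloud management system would issue calls like this per tenant network,
# with no per-switch manual configuration by the VM team.
req = build_logical_switch_request("nvp.example.com", "web-tier", "tenant-42")
print(req.full_url)
```

The point of the sketch is the workflow: provisioning a network becomes an idempotent API call the orchestration layer can script, rather than a ticket to the network team.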

        • says

          I’m not going to say that. What I am saying is, at a high level, there is and always will be a role for the network architect and network ops — with or without network virtualization.
          Did server jobs go away after server virtualization? I would argue server virtualization catapulted the better/smarter server guys/gals into much better jobs — such as virtualization admin, and cloud architect.

          The same can and should happen with network jobs and network virtualization.

  10. Joe Smith says

    Brad – hope you get back to your blog soon, as the question queue is building up. :-)

    MAC-in-IP tunnels work to provide L2 adjacency across an L3 fabric… with this, you probably don’t need VLANs either. But what if there is an L2 network that has no problem spanning a VLAN, and therefore the client is using VLANs and maybe vPC?

    Are NVP’s tunnels needed? How will it interact and interoperate in a legacy network environment?


    • says

      NVP tunneling will work fine on L2 or L3 data center networks. The hypervisor hosts just need IP reachability to each other.
      That said, NVP does support a mode called “bridged mode”, in which tunnels are not used. In bridged mode, the hypervisor switch connects VMs to a VLAN on the physical network, which is presumed to be already configured by the network admin; much like traditional virtual networking of the past decade.
      Bridged mode is there as a choice for customers who need higher scalability limits than the default tunneling mode supports — even if it means losing the operational benefits of full network virtualization enabled by tunneling. Not many customers use bridged mode, but some do.
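To illustrate what “MAC-in-IP” tunneling looks like on the wire, here is a minimal sketch of the VXLAN header, one of the encapsulations mentioned above: an 8-byte header carrying a 24-bit VNI, placed after an outer IP/UDP header (UDP destination port 4789) and before the inner Ethernet frame. This is standard VXLAN framing, not anything NVP-specific:

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned UDP destination port for VXLAN

def vxlan_header(vni):
    """Build the 8-byte VXLAN header: flags byte with the I bit set
    (0x08), 24 reserved bits, the 24-bit VNI, then 8 reserved bits.
    The inner (tunneled) Ethernet frame follows this header."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack(">II", 0x08000000, vni << 8)

hdr = vxlan_header(5000)
print(hdr.hex())  # 0800000000138800
```

The 24-bit VNI is why tunneling scales past the 4094-VLAN limit: roughly 16 million logical segments can ride over the same IP fabric, needing only IP reachability between hypervisors.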


  11. says

    Hi Brad,

    this is quite an interesting post! While I’m a network admin and am all about VRFs, VLANs, and VDCs, the concept of a simplified forwarding pipeline appeals to me. However, this would also mean that the WAN part of the solution needs to be simplified. Today we terminate specific WAN connections in a VRF, VLAN, or VDC in order to connect the customer to its virtual environment in the data center. The same is true with the public internet, where specific customers are mapped to specific public IP spaces. I wonder how this part fits into the solution.


    • says

      Excellent question Dieter. Yes, connecting virtual networks within NVP to VPNs you may have outside of NVP is done through NVP Gateway nodes.
      The gateway nodes have one NIC inside NVP running tunnels, and one NIC outside NVP connecting to traditional 802.1Q VLANs. From the NVP API you can map logical switches within NVP to an 802.1Q VLAN tag on a gateway outside NIC. That VLAN could be mapped to VRF-lite or MPLS VPNs on the network switches/routers outside of NVP.
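The gateway mapping described above can be modeled as a simple table from logical switch to (outside NIC, 802.1Q VLAN tag). A minimal sketch, with made-up names; the real NVP API expresses this mapping differently:

```python
# Hypothetical model of gateway mappings: logical switch -> (outside NIC, VLAN tag).
# Each VLAN tag can in turn be mapped to a VRF-lite or MPLS VPN on the
# switches/routers outside the gateway.
gateway_mappings = {}

def map_logical_switch(logical_switch, nic, vlan_tag):
    """Map a logical switch to an 802.1Q VLAN tag on a gateway's
    outside NIC. Valid 802.1Q tags are 1-4094."""
    if not 1 <= vlan_tag <= 4094:
        raise ValueError("802.1Q VLAN tag must be 1-4094")
    gateway_mappings[logical_switch] = (nic, vlan_tag)

map_logical_switch("tenant-a-web", "eth1", 100)  # VLAN 100 feeds tenant A's VPN
```

The key design point is that the gateway is the only place where logical networks touch VLAN numbering; inside NVP, tenant segments are identified by tunnel IDs, not tags.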

  12. Atrey Parikh says


    In one of the responses you mentioned a “software defined edge, IP transport network, and interfacing to external networks via Gateways”. Are there any design-specific dependencies on the IP transport network or the underlying physical network at all? You could have your software-defined edge run on a flat L2 transport, or L3 ECMP, or some type of L2/L3 combination; is this correct? From a pure networking perspective, what would be the configuration of the physical port connected to the hypervisor hosting the Gateway service or acting as a gateway: access port or point-to-point? Are gateways responsible for advertising the IP space behind this software-defined edge?


    • Atrey Parikh says

      Ha… VMware NSX is up on the VMware Executive blog, so I guess I can officially ask questions. I attended the VMware executive round-table yesterday in NYC, but only stayed for Martin’s presentation and the beginning of Hatem’s presentation; I figured I would get much better network-related insight from you anyway. Now I can correlate this blog post with NSX, and I realize you gave a hint via this blog post back in January. Anyway, to my previous question about the configuration of the port connected to a Gateway: as I guessed, between access and point-to-point, it would have to be routed, hence point-to-point, and then run IP routing between the physical network and the NSX-based virtual network. And to my next question: it seems obvious that because Gateways run IP routing, they would advertise the IP space behind NSX. Okay, so a question about Service Nodes: what happens when you have a “virtualized” multicast source with receivers sitting on the physical network, or on the other side of NSX? Finally, I would still like to know: if the underlying physical network is NOT L3 ECMP, or if it is not a flat L2, would the whole NSX concept still work without any changes to the physical world?

      Thank you.

