Cisco UCS and Nexus 1000V design diagram with Palo adapter

Filed in Cisco UCS, Design Diagrams, FCoE, Featured, Nexus, NIV, QoS, VMware on August 11, 2009

This is a follow-up and enhancement of a previous design diagram in which I showed Cisco UCS running the standard VMware vSwitch.  In this post I am once again showing Cisco UCS utilizing the Cisco (Palo) virtualized adapter with an implementation of VMware vSphere 4.0; however, in this design we are running ESXi and the Cisco Nexus 1000V distributed virtual switch (vDS).

The Cisco adapter on the UCS B-200 blade is using its Network Interface Virtualization (NIV) capabilities and presenting (4) virtual Ethernet NICs and (2) virtual Fibre Channel HBAs to the operating system, vSphere 4.0 ESXi.  The vSphere 4.0 hypervisor sees the virtual adapters as unique physical adapters and identifies them as VMNICs and VMHBAs.  The vSphere VMNICs are then associated to the Cisco Nexus 1000V software switch to be used as uplinks.  The NIV capabilities of the Cisco adapter allow the designer to use a familiar VMware multi-NIC design on a server that in reality has (2) 10GE physical interfaces, with complete Quality of Service, bandwidth sharing, and VLAN portability among the virtual adapters.
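
For illustration, here is roughly how you could confirm what the hypervisor sees from the ESXi Tech Support Mode shell (or with the equivalent vSphere CLI vicfg-* commands). Treat this as a sketch, not a verbatim capture; the exact output columns vary by build. The first command lists the virtual Ethernet adapters as vmnic0 through vmnic3, and the second lists the two virtual Fibre Channel HBAs among the vmhba adapters.

~ # esxcfg-nics -l
~ # esxcfg-scsidevs -a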

Aside from visualizing how all the connectivity works, this diagram is also intended to illustrate some key concepts and capabilities.

Cisco Virtualization Adapter preserving familiar ESX multi-NIC network designs

In this design we have used the NIV capabilities of the Cisco “Palo” adapter to present multiple adapters to the vSphere hypervisor in an effort to preserve the familiar and well known (4) NIC design where (2) adapters are dedicated to VMs, and (2) adapters are dedicated to management connections. The vSphere hypervisor scans the PCIe bus and sees what it believes to be (4) discrete physical adapters, when in reality there is only (1) physical dual-port 10GE adapter. Just as we would with a server with (4) physical NICs, we can dedicate (2) virtual Ethernet adapters to the virtual machine traffic by creating a port profile called “VM-Uplink” and associating it to the Cisco adapter vNIC1 and vNIC2. Similarly we can dedicate (2) virtual Ethernet adapters to the management traffic by creating a port profile called “System-Uplink” and associating it to the Cisco adapter vNIC3 and vNIC4.

We will configure the “VM-Uplink” port profile to only forward VLANs belonging to VMs, and configure the “System-Uplink” port profile to only forward VLANs belonging to management traffic.

Creating separate uplink Port Profiles for VMs and Management

Nexus1000V# config
Nexus1000V(config)# port-profile System-Uplink
Nexus1000V(config-port-prof)# capability uplink
Nexus1000V(config-port-prof)# vmware port-group
Nexus1000V(config-port-prof)# switchport mode trunk
Nexus1000V(config-port-prof)# switchport trunk allowed vlan 90,100,260-261
Nexus1000V(config-port-prof)# no shutdown
Nexus1000V(config-port-prof)# state enabled

Nexus1000V(config)# port-profile VM-Uplink
Nexus1000V(config-port-prof)# capability uplink
Nexus1000V(config-port-prof)# vmware port-group
Nexus1000V(config-port-prof)# switchport mode trunk
Nexus1000V(config-port-prof)# switchport trunk allowed vlan 10,20
Nexus1000V(config-port-prof)# no shutdown
Nexus1000V(config-port-prof)# state enabled

The VMware administrator will now be able to associate vmnic0 and vmnic1 to the “VM-Uplink” port group; additionally, vmnic2 and vmnic3 can be associated to the “System-Uplink” port group. This action puts those NICs under the control of the Nexus 1000V, which assigns each of them a physical interface number: Eth1/1 for vmnic0, Eth1/2 for vmnic1, and so on.
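
As a quick sanity check (a sketch; the exact output differs by Nexus 1000V release), the assignments can be verified from the VSM:

Nexus1000V# show module
! Each ESXi host's VEM appears as a module
Nexus1000V# show interface brief
! The vmnics appear as Ethernet interfaces (Eth1/1, Eth1/2, and so on)
Nexus1000V# show port-profile name VM-Uplink
! Shows the profile settings and which interfaces inherit it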

Nexus 1000V VSM running on top of one of its own VEMs

In this diagram the UCS blade is running the Nexus 1000V VSM in a virtual machine connected to a VEM managed by the VSM itself.  Sounds like a chicken and egg brain twister, doesn’t it?  So how does that work?  Well, pretty simple actually.  We use the ‘system vlan’ command on the uplink port profile “System-Uplink”.  This allows the VLANs stated in this command to be up and forwarding prior to connecting with the VSM, for ‘critical connections’ such as those needed to reach the VSM and other critical VMware management ports such as the VMkernel.  We can also use the same ‘system vlan’ command on the port profiles facing the locally hosted VSM on this blade.

Nexus1000V# config
Nexus1000V(config)# port-profile System-Uplink
Nexus1000V(config-port-prof)# capability uplink
Nexus1000V(config-port-prof)# system vlan 90,100,260-261
! These VLANs forward on the uplink prior to locating the VSM

Nexus1000V(config)# port-profile VMKernel
Nexus1000V(config-port-prof)# switchport mode access
Nexus1000V(config-port-prof)# switchport access vlan 100
Nexus1000V(config-port-prof)# system vlan 100
! This allows access to the VMkernel if the VSM is down

Nexus1000V(config)# port-profile N1K-Control
Nexus1000V(config-port-prof)# switchport mode access
Nexus1000V(config-port-prof)# switchport access vlan 260
Nexus1000V(config-port-prof)# system vlan 260
! Allows the VSM’s vNICs to be up prior to connecting to the VSM itself
! Do the same for the N1K-Packet port profile

Virtual Port Channel “Host Mode” on the Nexus 1000V VEM uplinks (vPC-HM)

In this design the uplink port profiles “System-Uplink” and “VM-Uplink” are each establishing a single logical port channel interface to two separate upstream switches.  The two separate upstream switches in this case are (Fabric Interconnect LEFT) and (Fabric Interconnect RIGHT).  While the server adapter is physically wired to the UCS “Fabric Extenders” (aka IOM), the fabric extender is simply providing a remote extension of the upstream master switch (the Fabric Interconnect); therefore the server adapter and Nexus 1000V VEM see themselves as being connected directly to the two Fabric Interconnects.  Having said that, the two Fabric Interconnects are not vPC peers, which would normally be required for them to share a single port channel facing a server or upstream switch.  So how does the Nexus 1000V form a single port channel across two separate switches not enabled for vPC?  This is done with a simple configuration on the Nexus 1000V called vPC-HM.

The Nexus 1000V VEM learns via CDP that Eth 1/1 and Eth 1/2 are connected to separate physical switches and creates a “Sub Group” unique to each physical switch.  If there are multiple links to the same physical switch they will be added to the same Sub Group.  When a virtual machine is sending network traffic the Nexus 1000V will first pick a Sub Group and pin that VM to it.  If there are multiple links within the chosen Sub Group the Nexus 1000V will load balance traffic across those links on a per-flow basis.

Enabling vPC-HM on Nexus 1000V

Nexus1000V# config
Nexus1000V(config)# port-profile VM-Uplink
Nexus1000V(config-port-prof)# channel-group auto mode on sub-group cdp
Nexus1000V(config)# port-profile System-Uplink
Nexus1000V(config-port-prof)# channel-group auto mode on sub-group cdp

With this configuration the Nexus 1000V will automatically create two port channel interfaces and associate them with the chosen port profiles:

Nexus1000V# show run
! unnecessary output omitted
interface port-channel1
  inherit port-profile VM-Uplink
interface port-channel2
  inherit port-profile System-Uplink
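
To confirm that the VEM discovered both Fabric Interconnects via CDP and built the two port channels with their Sub Groups, something along these lines can be used (a sketch; output formats differ by release):

Nexus1000V# show cdp neighbors
Nexus1000V# show port-channel summary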

Cisco Virtualization Adapter per vNIC Quality of Service

Our multi-NIC design is enhanced by the fact that Cisco UCS can apply different Quality of Service (QoS) levels to each individual vNIC on any adapter. In this design, the virtual adapters vNIC3 and vNIC4 dedicated to management connections are given the QoS profile “Gold”. The “Gold” QoS setting can, for example, define a minimum guaranteed bandwidth of 1Gbps.  This works out nicely because it matches the VMware best practice of providing at least 1Gbps of guaranteed bandwidth to the VMkernel interface. Similarly, the “Best Effort” QoS profile assigned to the NICs used by VMs can also be given a minimum guaranteed bandwidth.

It is important to understand that this is NOT rate limiting. Interface rate limiting is an inferior and suboptimal approach that results in wasting unused bandwidth. Rather, if the VMkernel wants 10G of bandwidth it will have access to all 10G of bandwidth if it is available. If the VMs happen to be using all 10G of bandwidth and the VMkernel needs the link, the VMkernel will get its minimum guarantee of 1Gbps and the VMs will be able to use the remaining 9Gbps, and vice versa. The net result is that Cisco UCS provides fair sharing of available bandwidth combined with minimum guarantees.
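
On Cisco UCS these guarantees are expressed as weights in the UCS Manager QoS System Classes and tied to each vNIC with a QoS policy, so there is no switch CLI to type. Purely for illustration, here is a rough sketch of the same bandwidth-sharing idea in NX-OS MQC syntax on a standalone Nexus 5000; the policy name is made up and the percentages are examples, not a recommendation:

N5K(config)# policy-map type queuing uplink-queuing
N5K(config-pmap-que)# class type queuing class-fcoe
N5K(config-pmap-c-que)# bandwidth percent 50
! FCoE class keeps its default 50% bandwidth guarantee
N5K(config-pmap-c-que)# class type queuing class-default
N5K(config-pmap-c-que)# bandwidth percent 50
! Either class may burst above its guarantee when the other is idle;
! this is bandwidth sharing, not rate limiting
N5K(config)# system qos
N5K(config-sys-qos)# service-policy type queuing output uplink-queuing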

QoS policies for the individual adapters are defined and applied centrally in the UCS Manager GUI.

Read the Cisco.com UCS Manager QoS configuration example for more information.

True NIV goes both ways: (Server and Network)

Obtaining true NIV requires virtualizing the adapter towards both the Server and the Network.  In this design we are providing NIV to the Server by means of SR-IOV based PCIe virtualization, which fools the server into seeing more than one adapter, all from a single physical adapter.  So the virtual adapters vNIC1, vNIC2, and so on, are identifying and distinguishing themselves to the server system with PCIe mechanisms.  This accomplishes the goal of adapter consolidation and virtualization from the Server perspective.

The next challenge is differentiating the virtual adapters towards the Network.  Remember that more than one virtual adapter is sharing the same physical cable with other virtual adapters.  In this case vNIC1 and vNIC3 are sharing the same 10GE physical cable.  When traffic is received by the adapter on this shared 10GE cable, how does the physical adapter know to which vNIC the traffic belongs?  Furthermore, when a vNIC transmits traffic towards the Network, how does the upstream network know which vNIC the traffic came from and apply a unique policy to it, such as our “Gold” QoS policy?

Cisco UCS and Nexus 5000 solve this problem with the use of a unique tag dedicated for NIV identification purposes, shown here as a VNTag.  Each virtual adapter has its own unique tag# assigned by UCS Manager.  When traffic is received by the physical adapter on the shared 10GE cable, it simply looks at the NIV tag# to determine which vNIC the traffic belongs to.  When a vNIC is transmitting traffic towards the network it applies its unique NIV tag#, and the upstream switch (Fabric Interconnect) is able to identify which vNIC the traffic was received from and apply a unique policy to it.

Not all implementations of NIV adequately address the Network side of the equation, and as a result can impose some surprising restrictions on the data center designer.  A perfect example of this is Scott Lowe’s discovery that HP Virtual Connect Flex-10 FlexNICs cannot have the same VLAN present on two virtual adapters (FlexNICs) sharing the same LOM.  Because HP did not adequately address the Network side of NIV (such as implementing an NIV tag), HP is forcing the system to use the existing VLAN tag as the means to determine which FlexNIC is receiving or sending traffic on a shared 10GE cable, resulting in the limitation Scott Lowe discovered and wrote about on his blog.  Furthermore, HP’s Flex-10 requires rate limiting, which hard-partitions bandwidth and results in waste and inefficiency.  Each FlexNIC must be given a not-to-exceed rate limit, and the sum of those limits must not exceed 10Gbps.  For example, I could have (4) FlexNICs sharing one 10GE port and give each FlexNIC 2.5Gbps of maximum bandwidth.  However, even if the link were otherwise idle, FlexNIC #1 could not transmit any faster than 2.5Gbps (wasted bandwidth).

Cisco UCS addresses NIV from both the Server side and the Network side, and provides actual Quality of Service with fair sharing of bandwidth secured by minimum guarantees (not maximum limits).  As a result there are no VLAN or bandwidth limitations.  In the design shown here with Cisco UCS and Nexus 1000V, any VLAN can be present on any number of vNICs on any port, and any vNIC can use the full 10GE of link bandwidth, giving the Data Center Architect tremendous virtualization design flexibility and simplicity.

I hope you enjoyed this post.  Feel free to submit any questions or feedback in the comments below.

Other related posts:
Cisco UCS and VMWare vSwitch design with Cisco 10GE Virtual Adapter
Nexus 1000V with FCoE CNA and VMWare ESX 4.0 deployment diagram

###

Disclaimer: This is not an official Cisco publication.  The views and opinions expressed are solely those of the author as a private individual and do not necessarily reflect those of the author’s employer (Cisco Systems, Inc.).  This is not an official Cisco Validated Design.  Contact your local Cisco representative for assistance in designing a data center solution that meets your specific requirements.

About the Author

Brad Hedlund is an Engineering Architect with the CTO office of VMware’s Networking and Security Business Unit (NSBU), focused on network & security virtualization (NSX) and the software-defined data center. Brad’s background in data center networking begins in the mid-1990s and includes experience as an IT customer, a systems integrator, architecture and technical strategy roles at Cisco and Dell, and speaking at industry conferences. CCIE Emeritus #5530.

Comments (46)


  1. Brilliant. Keep it up, Brad! The troops are hungry for this kind of manna :-)

  2. Rodos says:

    Brad, love your work as always. I like the slight change to the storage from the last diagram, or maybe I just understand it better now.

    You use the label Fabric Extender, often referred to as the FEX. As I now understand it the FEX has been officially renamed as the IO Module, or IOM for short. Some of the docs still refer to it as the FEX, but that’s to be updated. As I suspect this diagram will be well used, it could be good to see it use the new name to avoid confusion.

    Great stuff, I look forward to seeing more.

    Rodos

  3. Duncan says:

    Great article Brad!

  4. Mike says:

    Brad, when you talk about QoS, is it also possible to give Fibre Channel a guaranteed minimum bandwidth, and what happens if not enough bandwidth is available because it is needed for networking (pause the FCoE traffic)?

    thx for info,
    Mike

    • Brad Hedlund says:

      Mike,
      In both UCS and Nexus 5000, Fibre Channel has QoS on by default. The default setting for FCoE traffic is a guarantee of 50% link bandwidth (5Gbps) and no packet drops.

      Cheers,
      Brad

  5. HP says:

    Your article is a little biased. It is easy to stress some HP Flex-10 limitations, but the same is true for Cisco:
    * HP Flex-10 bandwidth divisions are fixed today. That is true. But at least it is available now. That can’t be said of the Cisco Palo NIC. HP Flex-10 divisions will be dynamic in future releases also.
    * You go through a lot of effort to set up everything redundantly. However, the hardware is not redundant to chip-level. You mention it yourself: “The vSphere hypervisor scans the PCIe bus and sees what it believes to be (4) discrete physical adapters, when in reality there is *** only (1) physical *** dual-port 10GE adapter” and “….which fools the server into seeing more than one adapter, all from *** a single physical adapter ***”. In a half-width server, the CNA remains a SPOF.
    HP at least provided a fully redundant system up to chip-level for half-height servers with two *** physically separated *** Flex10-10GE interfaces.

    • Brad Hedlund says:

      Calling this article biased is quite funny coming from “HP”, as you proceed to espouse Flex-10 with bias. Furthermore, you are stating the obvious. Any intelligent reader knows that a Cisco employee writing about a Cisco product has bias, and expects it.

      In a half-width server, the CNA remains a SPOF.

      Uh, yeah, that’s obvious. Any customer briefed on UCS understands that. There’s no secret there. More adapters for the sake of redundancy translates into higher cost. The customer will make the judgement call if leveraging the HA capabilities of the virtualization or clustering software can afford savings in infrastructure costs.

      p.s. Don’t be ashamed to use a real name.

      Cheers,
      Brad

  6. Burg Rahja says:

    Brad

    Thanks for this article I’ve learned a lot from reading it.

    I have a follow-up question. How is the live migration of VMs handled so that the minimum bandwidth guarantees are enforced when the VM moves to another host?

    Thx
    Burg

    • Brad Hedlund says:

      Burg,
      An excellent question that highlights the innovation and value of UCS and Nexus 1000V. Any minimum bandwidth guarantees as they existed on the source machine would be preserved at the destination machine provided the destination system was identically configured and the QoS policies followed the VM during vMotion.

      How is UCS and Nexus 1000V special in enabling this?

      Destination system identically configured:
      The complete server and network configuration of this system, from the server blade settings itself, to its Palo adapter, and all of the LAN/SAN settings provisioned on the Fabric Interconnect for this server, are captured in a UCS Service Profile. This Service Profile could be made into a Service Profile Template. Any new blades I bring into the environment can be provisioned with a Service Profile that was cloned from the template. Following this behavior ensures that my configuration is consistent among all blades. The configuration of the Palo adapter and all its QoS settings, and the LAN/SAN settings and QoS on the Fabric Interconnect, are all the same with no configuration drift or inconsistencies.

      QoS policies following the VM:
      This is where the Nexus 1000V shines. When a VM is migrated via vMotion to another system within the Nexus 1000V domain, any QoS settings specific to that VM are migrated along with it, resulting in consistent QoS behavior and policies regardless of the VM’s actual location. This automated migration of network QoS policies is something that was never possible prior to the Nexus 1000V.
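
      As a rough sketch of what that can look like on the Nexus 1000V (the names here are examples, not taken from the design above; check the Nexus 1000V QoS configuration guide for exact syntax), a marking policy attached to the VM-facing port profile travels with the VM’s vEth port during vMotion:

      Nexus1000V(config)# policy-map type qos Mark-Gold
      Nexus1000V(config-pmap-qos)# class class-default
      Nexus1000V(config-pmap-c-qos)# set cos 4
      Nexus1000V(config)# port-profile VM-Production
      Nexus1000V(config-port-prof)# service-policy type qos input Mark-Gold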

      Hope that helps.

      Cheers,
      Brad

  7. Brad, this is a great overview. Thanks.

    Would you elaborate on the “per-flow” hashing that Nexus 1000V performs in vPC-HM? What aspects of a packet/frame are used in identifying a “flow” and how accurately does this result in load-balanced traffic across the redundant virtual adapters?

    I’d also be interested to hear your thoughts on the pros/cons of Nexus 1000V attaching to the Palo NIV devices as you’ve described here, versus PCIe device “pass-through” in the VMM to expose NIV devices directly to each VM instance. I gather that local switching between VMs would be impacted on the one hand, though perhaps hardware-assist features in the NIC would be impacted conversely. I’m not sure how overall management and scale would be affected. What other considerations come to mind?

    Thanks again,
    -Benson

    • Brad Hedlund says:

      Benson,
      The Nexus 1000V has tons of options for hashing what constitutes a flow:
      http://www.cisco.com/en/US/docs/switches/datacenter/nexus1000/sw/4_0/command/reference/n1000v_cmds_p.html#wp1284857

      The more granular your hashing algo is, the more likely you are to get even Steven load balancing. However, before you pick a granular method such as source & dest TCP ports, your member links should be landing on the same physical switch, or a single “logical” switch created by vPC, VSS, or StackWise.

      The pro’s of hypervisor bypass are better I/O performance and lower latency for the VM’s (more like bare metal), nice for high I/O VM’s such as Oracle or Exchange etc. The tradeoff of hypervisor bypass is scalability, as the # of vNIC’s you can provision on your physical adapter (Palo in this case) is hardware limited (128, with realistic numbers in the 50 range or less). The software based approach with Nexus 1000V has no hardware limits and scalability in terms of # of VM’s per blade is much higher.

      Cheers,
      Brad

  8. scott owens says:

    How does the impact of 10Gb with Jumbo frames fit into this?
    The 7K & 5K both support jumbos; should we expect greater performance increases between backup servers and targets, along with iSCSI improvements too?
    Also… does the Palo have iSCSI offloading along with Ethernet offload?

    thanks

    • Brad Hedlund says:

      Scott,

      Jumbo frames can easily be enabled on the Fabric Interconnect, and doing so can certainly only help iSCSI and vMotion performance, for example.
      The Palo adapter does TCP segmentation offloading but does not do any special HW offloading for iSCSI-specific payloads, nor does it support iSCSI booting.

  9. Got on this thread by chance. Interesting stuff.

    BTW

    >HP at least provided a full redundant system up to chip-level for
    >half-height servers with two *** physically separated *** Flex10-10GE
    >interfaces.

    Well that’s what we (IBM) say about our BladeCenter Vs the HP BladeSystem (i.e. we have a redundant backplane while you do not bla bla bla).

    Funny. I guess the glass is always half full for vendors…. isn’t it?

    Massimo.

  10. Josh says:

    My thought would be that most implementations would have either a CNA/1000V or the Palo/bypass arrangement. Does that sound right to you Brad?

    Also, you must be in switching mode with the 6100’s and are you statically pinning with separate VLAN’s per 6100? Why not use the native switching mode and keep the VLAN configs in the 6100 consistent? This config just seems much more involved than it needs to be. Am I missing something?

  11. Jonathan Butz says:

    I see that you have separate uplinks, system-uplink and vm-uplink, which then require dedicated physical links on the host. This is what the Cisco docs recommend.

    I have found that it is not necessary to do this and have provisioned Nexus1000v infrastructure with a single uplink profile that attaches to all of the host physical nics.

    What is Cisco’s perspective on a single uplink?
    If I am building hosts with two 10Gb nics, it seems so silly to dedicate one to management traffic…

  12. VMAdmin says:

    This is brilliant and new stuff to me.

    Really amazing

  13. push bhatkoti says:

    Great article. It has cleared up my mystery!

    -Push

  14. James S says:

    Great article!

    Should we use vPC-HM with mac-pinning instead of sub-group cdp?

    • Brad Hedlund says:

      James,
      Nexus 1000V “MAC Pinning” is the mode that is easiest to configure and “just works”. VPC-HM is the mode that allows for more granular control of steering certain traffic types out of the NICs you want. If you’re not too picky about that, I would recommend MAC Pinning.
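
      For reference, enabling it on an uplink port profile is a one-liner (a sketch, used in place of the ‘sub-group cdp’ option shown in the post above):

      Nexus1000V(config)# port-profile System-Uplink
      Nexus1000V(config-port-prof)# channel-group auto mode on mac-pinning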

      Cheers,
      Brad

  15. IPK says:

    Hi Brad,
    Just a simple query from a design perspective. I want to compare the following designs;

    1) Nexus 1000v with Palo adapter [VN-Link in SW]
    2) Palo adapter with Passthru switching with FI acting as the vDS [VN-Link in HW]

    in a VMware virtualization environment where there is 10:1 VM density across three B-Series HH blades. Workloads are fairly I/O intensive hence I preferred the second approach. Any limitation in terms of extending this design across B-Series blade systems?

    IPK

    • Brad Hedlund says:

      IPK,
      With design #2 (VN-Link in HW) you can have the Palo adapter provide 54 dynamic vNICs per host, and therefore 54 virtual machines per host. Should be no problem with your 10:1 consolidation ratio.
      As far as extending this across multiple B-Series blades, the Fabric Interconnect is acting as a vDS from the perspective of vCenter and therefore has the same configuration maximums as a normal software-based vDS. According to the new vSphere 4.1 Configuration Maximums Guide you can have 350 hosts in one vDS. http://www.vmware.com/pdf/vsphere4/r41/vsp_41_config_max.pdf

      Having said that, the theoretical maximum number of blades under the auspices of a Fabric Interconnect cluster is 320. The current supported number of chassis in one cluster is (14), so the actual maximum as of today would be 112 blades in one HW VN-Link DVS.
      For HA/DRS clusters the VMware maximum is still 32 hosts.

      Cheers,
      Brad

      • rhino says:

        Hello Brad,

        First of all, thank you for your web site.

        I’m also trying to compare these two solutions.

        With the solution #2, the Fabric Interconnect acts as the vDS. I’m ok with that. So what are the advantages of using the Nexus 1000v? Do we have the same functionalities between the Nexus 1000v and the “vDS” FI?

        Maybe if you can provide us with the advantages of each solution, it would help me to understand.

        Rhino

        • rhino says:

          Maybe one difference is the vPath capability of the Nexus 1000v used for “Cisco Unified Network Services”?

          Thank you

          Rhino

  16. LAB says:

    Brad –

    After these last few posts, I’m coming to the conclusion that it is not necessary to purchase and implement 1000V if one plans on solely using Palo and hypervisor bypass.

    Is this correct?

    – LAB

    • Brad Hedlund says:

      LAB,
      Correct, the Nexus 1000V and Hypervisor Bypass solutions are mutually exclusive per ESX host. You certainly could have one cluster of hosts running Nexus 1000V, and another cluster of hosts running Hypervisor Bypass (HW VN-Link).

      Cheers,
      Brad

  17. Morgandechile says:

    Hello Brad

    GREAT JOB …THANKS !!!!

    talking about LAB’s question… are they (Nexus and VN-Link) necessarily mutually exclusive for any ESX host?

    is it possible to still use, from the perspective of the ESX server, some NICs on Nexus and some other NICs on VN-Link, in the same physical server?

    what would be the limitations of that?
    what could be the benefits?

    thanks in advance ….

    Gustavo.

  18. Morgandechile says:

    could you be so kind as to give me some advice?

    I already have two clusters of 8 blade servers (each),
    and a third cluster with 2 blades for admin purposes, and even a fourth cluster of 2 servers for experimental purposes, all with Palo adapters.
    Supposedly this will be very loaded.
    What would be your suggestion for this environment?

    1 cluster with Nexus and 1 cluster with VN-Link, for example? How many NICs do you recommend?

    thanks a lot in advance

  19. Rafael says:

    Morgandechile
    It depends: is it a lab or is it going to be a production environment? If it’s a lab, okay, you can mix it, but in a production environment I wouldn’t mix the technologies; you have to decide on one: Nexus 1000V in software mode, or PTS mode (VN-Link in hardware, Nexus 1000V in HW mode).
    The usages are different. PTS mode you would use for low density (56 vNICs max) deployments and specific network requirements; for example, if you have a VM with 3 NICs (backup, mgmt, production) you could create only 18 VMs per host (56/3 = 18.6), and you also need all 8 uplinks from the chassis to the FI to get the 56 vNICs.
    The Nexus 1000V in software mode, on the other hand, gives you the same limit (VMs per host) as a common DVS switch. It is also network-admin friendly (NX-OS CLI), so you have a management point your network admins are familiar with; NX-OS permits integration with your existing switch monitoring and management tools, and the Nexus 1000V software switch also works with other types of servers (IBM, HP, Dell, etc.). Normally in a datacenter you will have a single type of virtual switch.
    My personal opinion: go for the Nexus 1000V in software mode; PTS is only for very specific, network high-performance use cases.
    For the Nexus 1000V in software mode I use a maximum of 10 vNICs:
    2 for service console and VMkernel, 2 for backup, 2 for Nexus 1000V control, 2 for NFS (if it’s the case), 2 for VM production.

  20. Victor says:

    Brad, perhaps you can clarify something for me…

    Cisco uses VN-Tag to tag VM traffic for the purpose of enabling upstream devices, such as the 5000 or the 6100, to identify the source of the traffic and apply whatever policies have been configured in the 5000 or 6100.

    You mention that HP would like to track VM traffic based on the VLAN to which it belongs.

    Why can’t VM traffic be tracked based on the MAC address of the VM’s vNIC, as switches do under normal circumstances? So basically, why not associate a VM’s MAC to a vEth port on the 5000 or 6100 and be done with it?

    Thanks

    • Brad Hedlund says:

      Victor,
      Good question! That _could_ be done, but there are some challenges with that approach:
      How would the MAC address be populated in the switch? Having the network guy type them all in will simply not work.
      What happens when the VM moves?
      What if the VM is hosting additional MAC addresses unknown to vCenter?
      How do you manage multicast traffic? How do you prevent bothering VMs with multicast traffic they’re not interested in?
      How do you manage broadcast traffic for VMs on different VLANs? How do you prevent VMs from being exposed to all broadcasts on all VLANs?

      When you replicate the familiar model of a server (VM) being connected to a switch port (vEth) with a cable (VN-Tag or N1K port profile), you don’t have to worry about any of these issues because you haven’t fundamentally changed the provisioning model; rather, you have simply taken the same provisioning model and adapted it for virtualization.

      Cheers,
      Brad

  21. victor says:

    Brad, thank you for your time and response. I know some people come on here to challenge/argue/debate you. I’m not here to do any of those things. I just want to learn. So, before I go on, let me inform you that I personally couldn’t care less whether the IEEE adopts a tagged or untagged approach. I have no vested interest. I don’t work for HP or Cisco. As long as I get my paycheck, I’m fine! :-)

    That having been said, let me respond real quick.

    You ask: “How would the MAC address be populated in the switch? Having the network guy type them all in will simply not work.”

    I must be missing something (and I don’t mean that sarcastically). Which switch are you referring to, the vDS/Nexus 1000V or the physical switch that the server is connected to? If it’s the vSwitch, I would say that it would have authoritative knowledge of a VM’s vNIC MAC address. Correct? And as for the physical switch, why can’t it learn the MAC address of an attached VM as it learns the MAC address of any physical server?

    Thank you

    • Brad Hedlund says:

      Victor,
      I think we’re talking about two different things here, hence the confusion.
      When the VM connects to a hypervisor software switch (vSwitch, vDS, N1K), you are right, there is authoritative knowledge from vCenter of the VM’s MAC address which can be associated to a vEth port on the N1K. Cisco refers to this as “Software VN-Link”.

      The other approach is to have the VM logically connected to a physical switch, bypassing the hypervisor for network I/O, and associated to a vEth port on the physical switch. This is the scenario I thought you were asking about. When you asked why a VM can’t be associated to a vEth port simply based on its MAC address, I listed the challenges with that (under the premise that the VM connects to a vEth on a physical switch). Cisco refers to this as “Hardware VN-Link”.

      Does that clarify?

      Cheers,
      Brad

  22. Matt Mc Auley says:

    Brad you are a god!

    I am just starting a number of deployment projects for a large integrator and your posts are making my life much easier!

    • Brad Hedlund says:

      Matt,
      Thanks for the nice feedback. Good luck with your projects and be sure to let me know if there are any gray areas needing clarification. I’m always thinking of new things to write about.

  23. Derek says:

    If you have a multi-node ESX cluster, say 4-8 hosts, and you are using the 1000v on all hosts do you recommend hosting the two VSMs on that same cluster? This is assuming you properly configure the system VLANs. Or should you go the Nexus 1010 route, or configure a 2-node ESX cluster using a standard vSwitch that just hosts the VSMs which in turn manage the VEMs in the other cluster?

    Basically, what’s the current advice on the chicken and the egg issue with VSMs and VEMs?

    Lastly, if your recommendation is that the VSM/VEMs can be located on the same cluster, should all pNICs be assigned to the 1000v or should a couple critical VLANs (say the ESXi console and the VSM management interface) be on a standard vSwitch?

  24. libing says:

    HI

    Thanks for the great information, but from the Cisco official introduction of VN-Tag(http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns892/ns894/white_paper_c11-525307_ps9902_Products_White_Paper.html), the multi-channel NIV is deployed without Nexus1000v, and all the switching work will be done at the upstream switches, which is similar to a VEPA style.

    So, which is the recommended deployment for VN-Tag? thanks!

  25. Thank you very very much for this fantastic summary Brad, I just used it with a customer to illustrate the benefits of MAC pinning and the ins and outs of Nexus 1000v forwarding. Linking it at my site right now!!

    P.

  26. Gabe says:

    Hi Brad,

    Your posts are very informative! I would love to have some of your diagrams blown up and used on a wall at work. Do you make any of these available for such a purpose?

  27. JamesH says:

    Heya Brad,
    So, I just inherited three UCS platforms in three different DC’s that have been built by a guy who really knew what he was doing. He built out most of the last one, and me and the infrastructure guys built out the last of it to include the N1k. I’ve been hammering at the documentation and going over the config’s in this environment, yanno…spending all my free time trying to really ‘learn’ it. I read your posts religiously ( the guy that left recommended them highly), and my comprehension on this is getting better, but I have a lot of work to do.

    Anyway, in your post about the system uplink and vm uplink creating a single port channel… I’m lost. We are using two channels in our environment (four NICs), one for uplink and the other for customer traffic. Based on what I know about this, when the admin selects the profiles for each, they channel based on the configs that are part of the port-profile (we are using mac-pinning in our channel command), and the service profile in UCS is calling the vNIC template that is basically the same config as the port profile on the 1000v. In the UCS portion of this I can see that each set of NICs (vNIC) is pinned to a different FI, and unless I’m mistaken, that ‘pinning’ is done as part of the service profile. So in your scenario, when you say that “The Nexus 1000V VEM learns via CDP that Eth 1/1 and Eth 1/2 are connected to separate physical switches and creates a ‘Sub Group’ unique to each physical switch,” is this something that you configured in UCS (like with the vNIC template scenario I spoke of)?

    The problem is that the documentation for this is horribly confusing (http://www.cisco.com/en/US/docs/switches/datacenter/nexus1000/sw/4_0_4_s_v_1_3/port_profile/configuration/guide/n1000v_portprof_5channel.html#wp1119384) as it says that ‘mac-pinning’ is used when you are connecting to a switch that doesn’t support port channels….first off, the FI’s do support port channels, and in what world today is there a switch that doesn’t support port channels?

    I mean i’m just completely lost on this one :)

    Hopefully, I haven’t just confused the issue, i’m just wondering why in your scenario you are specifying host mode as the solution (because the virtual cables are terminating at different FI’s) and why ours are also terminating at different FI’s but we are using mac-pinning

  28. Manoj says:

    Excellent explanation of the QoS policy and the difference b/w HP FlexNIC and UCS Fabric Interconnect. Simple and straightforward explanation.
