Feb 09 2010

HP Flex-10 versus Nexus 5000 & Nexus 1000V with 10GE passthrough

Published by Brad Hedlund at 10:26 am under Data Center, FCoE, Nexus, QoS

I had an interesting discussion with a customer the other day where both Cisco (myself included) and HP account teams were on the same call to discuss Flex-10, Nexus 1000V, or other approaches that may work better. Yeah, awkward.

Anyway, for most of the time we (the Cisco team) focused on Flex-10’s total lack of QoS capabilities. Flex-10 requires the customer to carve up bandwidth on the Flex-NICs exposed to the vSphere ESX kernel.

For example: (1) FlexNIC set at 2Gbps for Service Console, another FlexNIC set at 2Gbps for vMotion, and a 3rd FlexNIC set at 6Gbps for VM data. Even if the 10GE link is completely idle, a FlexNIC doesn’t get any more bandwidth than what’s been manually assigned to it.

The question then becomes … Why not just feed the server a 10G interface and let vMotion or VMs grab as much bandwidth as is available, with minimum guarantees provided by QoS?
The Nexus 1000V can apply the QoS markings and the Nexus 5000 can act upon them. However, with Flex-10 as a man-in-the-middle the QoS model breaks.
What value do the FlexNIC bandwidth assignments have with a 10G attached server, other than being a workaround for a lack of QoS intelligence in Flex-10?
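
To make the contrast concrete, here is a rough sketch in Python. It is not how Flex-10 or the Nexus 5000 schedule traffic internally; it only models the two policies described above. The 2/2/6 Gbps caps come from the example in this post, while the 20/20/60 percent guarantees and the traffic demands are assumptions chosen for illustration.

```python
LINK_GBPS = 10.0  # one 10GE server link

def static_carve(demand, caps):
    """FlexNIC-style carving: each function is capped at its fixed rate,
    and idle capacity is never lent to a busier neighbor."""
    return {name: min(demand[name], caps[name]) for name in caps}

def guaranteed_share(demand, min_pct, link=LINK_GBPS):
    """QoS-style sharing: each class is first served up to its guaranteed
    minimum, then spare capacity is split among classes that still want
    more, in proportion to their guarantees."""
    alloc = {n: min(demand[n], link * min_pct[n] / 100.0) for n in min_pct}
    hungry = {n for n in min_pct if demand[n] - alloc[n] > 1e-9}
    spare = link - sum(alloc.values())
    while spare > 1e-9 and hungry:
        weight = sum(min_pct[n] for n in hungry)
        if weight == 0:
            break  # only zero-guarantee classes are left; nothing to weight by
        granted = 0.0
        for n in sorted(hungry):
            take = min(spare * min_pct[n] / weight, demand[n] - alloc[n])
            alloc[n] += take
            granted += take
            if demand[n] - alloc[n] <= 1e-9:
                hungry.discard(n)
        spare -= granted
    return alloc

# Hypothetical demand in Gbps: vMotion wants to burst while the link is mostly idle.
demand = {"console": 0.1, "vmotion": 8.0, "vm_data": 1.0}
caps = {"console": 2.0, "vmotion": 2.0, "vm_data": 6.0}   # from the FlexNIC example
min_pct = {"console": 20, "vmotion": 20, "vm_data": 60}   # assumed guarantees

print(static_carve(demand, caps))         # vMotion is held at its 2 Gbps cap
print(guaranteed_share(demand, min_pct))  # vMotion borrows the idle capacity
```

When every class is saturated, both functions return the same 2/2/6 split; they only differ when part of the link is idle, which is the situation the question above is about.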

Moving on to operations, with Nexus 1000V, the server team is handing over the VM networking configuration and responsibility to the network team. The Nexus 5000 is obviously managed by the network team as well …. so … who manages Flex-10? Server team or Network team?
If the Server team manages it, there is no continuity in troubleshooting from Nexus 5000 to Nexus 1000V because you have this other object in the path (Flex-10) managed by a separate team. Everybody seems to agree that’s not an ideal situation.
If the Network team manages Flex-10, where is the CLI? Where’s the QoS? How do you upgrade the code? Do you need to reboot servers or interact with the server team for every config change? Nobody is quite sure.

The HP people on the call kept interjecting … “but, but, but … with Flex-10 you can move or replace the server without reconfiguring MAC/WWNs.” That was all they could really say.
The customer didn’t seem to give this argument much weight because plenty of failover capacity would be built into the VMware environment. If a server fails, VMs will be immediately started on another host, and how fast you can reconfigure a physical server becomes less important.
It would take (2) or more servers failing at the exact same time (highly unlikely) for the speed of physical server provisioning to become the bottleneck in restoring capacity to the vSphere environment.

It became quite clear towards the end of the meeting that 10GE pass-through to Nexus 5000 was a solid approach without much downside … and VMware vSphere + Nexus 1000V had a lot to do with the customer coming to that conclusion.
The nice migration path to FCoE with 10GE pass-through to Nexus 5000 is also a huge plus. The HP 10GE pass-through module does support FCoE today. So, if HP is “going to announce FCoE soon”, then why buy a Flex-10 switch today that doesn’t support FCoE? Why not buy a pass-through device that supports FCoE now, and spend less money doing so?

More information on the HP 10GE passthrough module can be found here on HP’s website:
http://h18004.www1.hp.com/products/blades/components/ethernet/10gbe-p/

###

17 Responses to “HP Flex-10 versus Nexus 5000 & Nexus 1000V with 10GE passthrough”

  1. Network Engineer says:

    Brad, few points of note here:

    (1) You are assuming that the network team deploys & manages Nexus 1000V soft switches. N1K switches are clearly not a requirement in an ESX virtualized environment. They are more of a nice-to-have for those who for some reason need to do more than ESX “native” soft switches can offer.

    In our organization we decided to draw the line at the N5K aggregation layer, and let our Server team manage everything below that (which includes Flex-10 devices and all ESX software components). Flex-10 is no longer “in the middle”, because there is no Cisco piece at the bottom.

    (2) I can argue that the FlexNIC bandwidth allocation method is superior to QoS. QoS is much more complex to provision and manage at large scale. And N5K doesn’t exactly offer good QoS capabilities in its current form (I can’t speak for N1K as we haven’t tried it). FlexNIC bandwidth carving is bullet-proof and it works with minimal provisioning complexity.

    (3) Your point about Flex-10’s lack of network-friendly management capabilities is absolutely valid. I don’t think it was developed as a true “networking” device, but rather as a “server NIC on steroids”.

    • Brad Hedlund says:

      Hi “Network Engineer”,

      Thanks for the comments. Allow me to respond to a couple of things you said:

      “N5K doesn’t exactly offer good QoS capabilities in its current form”

      I couldn’t disagree with you more on this. The Nexus 5000 QoS capabilities stand out among all other 10GE L2 switches. Nexus 5000 can classify traffic based on COS, IP, UDP/TCP ports, IP Precedence, DSCP, and Protocol Type. From there you can classify traffic into 8 different service levels called “QoS Groups”. Each QoS group can then be given a % of guaranteed bandwidth, strict priority, and no-drop service if you want. Each port has dedicated 480KB buffers for queuing — compare that to Flex-10’s 2MB buffer shared across all ports.

      “I can argue that FlexNIC bandwidth allocation method is superior to QoS … FlexNIC bandwidth carving is bullet-proof and it works with minimal provisioning complexity.”

      So your argument is that carving bandwidth at the FlexNIC is superior because it’s just ‘easier’, and because of that you are willing to limit vMotion bandwidth to 2Gbps (for example), versus having access to all 10GE? I’m sorry, but that sounds lazy! Especially when you find out how easy it is to enable QoS on the Nexus 1000V and Nexus 5000. Minimal provisioning complexity? I don’t know about you, but configuring just (2) 10GE ports per server sounds a lot easier than configuring (8) FlexNICs per server. Maybe that’s just me though.

      Here’s how easy the configuration can be for QoS on Nexus 1000V and Nexus 5000. This is a simple example of guaranteeing 10% bandwidth to vMotion with no rate limiting, so all 10G stays accessible.

      Easy Nexus 1000V QoS Configuration:

      policy-map type qos Mark-VMotion
        class class-default
          set cos 4

      port-profile VMotion
        service-policy type qos input Mark-VMotion

      With that config we have marked the vMotion traffic with CoS 4. That’s all you need to do on the Nexus 1000V. Now let’s go to the Nexus 5000 and have it act upon the classification made by the Nexus 1000V…

      Easy Nexus 5000 QoS Configuration:

      class-map type qos VMotion
        match cos 4

      policy-map type qos Classify-VMware-Traffic
        class VMotion
          set qos-group 2

      class-map type queuing VMotion
        match qos-group 2

      policy-map type queuing VMware-Bandwidth-Settings
        class type queuing VMotion
          bandwidth percent 10
        class type queuing class-default
          bandwidth percent 40

      Note: FCoE already has a 50% bandwidth setting on by default

      system qos
        service-policy type qos input Classify-VMware-Traffic
        service-policy type queuing input VMware-Bandwidth-Settings
        service-policy type queuing output VMware-Bandwidth-Settings

      That’s it! You’re done! And you only need to configure that once. Every ESX Host connected to the Nexus 1000V and Nexus 5000 will have the benefit of this intelligent QoS. With this simple configuration, your VMware vSphere ESX servers attached to Nexus 5000 can use as much of the 10GE bandwidth as is available for vMotion, with a guaranteed minimum of 10% (1Gbps). Now that’s efficiency.
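
As a quick sanity check on those numbers, here is a small Python sketch that turns the bandwidth percentages from the queuing policy into guaranteed minimums on a 10GE link. The percentages are the ones quoted in this comment (10 for vMotion, 40 for class-default, 50 for FCoE by default); the dictionary keys are just labels for this sketch and the function is not part of any Cisco tooling.

```python
LINK_GBPS = 10  # a single 10GE port

# Percentages quoted in the comment above; the keys are labels only.
bandwidth_percent = {"VMotion": 10, "class-default": 40, "fcoe": 50}

def guaranteed_gbps(policy, link_gbps=LINK_GBPS):
    """Convert per-class bandwidth percentages into guaranteed minimums.

    These are floors under congestion, not caps: a class may use more
    when other classes leave bandwidth idle.
    """
    total = sum(policy.values())
    if total != 100:
        raise ValueError(f"percentages add up to {total}, expected 100")
    return {cls: link_gbps * pct / 100 for cls, pct in policy.items()}

for cls, gbps in guaranteed_gbps(bandwidth_percent).items():
    print(f"{cls}: at least {gbps} Gbps")
```

The check that the percentages add up to 100 is only a guard for this sketch; how the switch itself validates a policy is not something this example claims to reproduce.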

      On the other hand, with Flex-10, since it does not have this intelligent QoS capability, you will need to configure a FlexNIC for vMotion and give it a very low bandwidth setting, such as 1 or 2Gbps, and it will never be able to go higher than that, even if the bandwidth is available. What a waste!

      Cheers,
      Brad

  2. Etherealmind says:

    The QoS debate is important to networking people, but in my experience, Server people have no concept of oversubscription, QoS, or ordered allocation. To wit, the debate on allowing VMware to oversubscribe memory even with resource controls. Most Server Admins cannot conceive of flexible resource allocation.

    Until that happens, I suspect that HP is selling “what the customer wants” not “what the customer needs”.

    The largest concern with HP Flex-10 is the operational risk to the network. The HP sales pitch that Flex-10 saves time on network provisioning is seductive but false. I wonder how many networks will suffer overloading outages, or security breaches, or VLAN exhaustion as the Server Admins treat the network as a static but infinite pool of resources. That is, the “just keep configuring until it breaks” deployment plan.

    Adding servers without consideration for backbones, or dynamic load movement, or security boundaries is a serious risk that is currently being ignored.

    Ah, well. Once again, Networks will pick up the pieces later while the Server folks just keep doing what they always did.

    Etherealmind

  3. quaich says:

    @Etherealmind: I totally agree with you on all points. I often have the feeling that the network gets loaded and loaded with more intelligence, but on the other side the upper-layer intelligence, and the folks administering those upper layers, stay on the same step or even worse …

    quaich

  4. net@work says:

    I totally agree with you on all points.
    We have N5K in place in our company, but we have no chance to use FCoE, because the server guys give us no chance. Even worse, we are swapping all our FC switches this month without having a look at FCoE. Of course FCoE is not the solution for all systems, but …
    The big problem is that server administrators gather more and more network tasks, but they do not really understand what they are doing. In a few years we will not see any networkers within the datacenter. To me it seems that server people will win this game. Networkers have no chance to argue, because they use a language which is not spoken in the world of server guys. And security, availability and reliability, who cares about this?

  5. Social comments and analytics for this post…

    This post was mentioned on Twitter by nash_j: RT @bradhedlund: HP Flex-10 versus 10GE Passthrough when using Nexus 5000 & Nexus 1000V: http://bit.ly/ciH4v0...

  6. serverguy says:

    You know what’s really depressing?

    I am a server guy (Unix & more recently VMware) and yet I find I can’t argue with the other comments regarding server guys.

    Obviously I exclude myself :)

    Too often though I work with server people with little clue of how to run on a contended/shared infrastructure, let alone how to manage one themselves. The concepts of risk & impact analysis, capacity management, availability, failure domains, etc. seem meaningless and incomprehensible to the majority of server people these days. Particularly those whose experience has been centered on Linux or Windows.

    What depresses me most though, and it’s a cliche, is when network and server (and even storage) people are unable to work together or even just talk the same language.

  7. thehevy says:

    Great discussion and posts.

    We have been looking at network bandwidth and QoS on 10G Ethernet in our labs. While we can use benchmarking and load generation tools to get 9.5+G of throughput on a 10G port, the question that we keep asking ourselves is, will we see this in the real world?

    I am sitting in a VMware vSphere Fast Track course with 20+ Admins that still run their network connections on 1G. They are running fewer than 10 VMs per host with 8 1G ports of connectivity. With that few VMs, and since each 1G port is limited to 1G of traffic, the likelihood of a host actually being able to max out a pair of 10G ports seems pretty low.

    So in most situations, why would we want to spend the additional time and overhead of setting up QoS policies? I would say set up monitoring of the network, watch for bandwidth issues, and only set up QoS if needed.

    I put more of these thoughts in a white paper…
    http://download.intel.com/support/network/sb/10gbe_vsphere_wp_final.pdf

    With all that said, I believe in two 10G ports that are wide open for all traffic types to share the bandwidth, UNLESS you have a real need to manage the bandwidth of a specific traffic type. Having an Admin manually carve a 10G port into 4 segments that don’t share unused bandwidth doesn’t seem to match the new paradigm of dynamically changing virtualized data centers.

    I look forward to reading about the areas / use cases where QoS is really needed and is more efficient than just adding another 10G port to the host. The reality is 10G is not that expensive, especially when compared to paying for the additional operational expenses to implement and manage QoS on every host in a data center.

  8. Sri says:

    Brad,
    I totally agree with you that VC and Flex-10 have no value proposition. I also agree with thehevy that 10Gig is more than what you need today, and there is no point in managing traffic when there is no traffic congestion. So, in fact, there is no need for carving or QoS either. In virtualized environments, at most 8 1Gig NICs handle today’s needs, so having 2 10Gig pipes from one server means 2+ times more bandwidth will be available. You could always add a few more 10G ports.

    All this can be done with VMware and their native vSwitches. In addition, VMware supports traffic shaping with a rate limiting feature, just in case you still want to limit bandwidth per traffic/port group. So I see no need for the customer to spend money on the Nexus 1000V soft switch. No value proposition, similar to Flex-10.

    Again, I agree that pass thru is the best option today, leaving a path to FCoE support, which comes with the industry-standard, DCB-based Converged Enhanced Ethernet in the later part of 2010. So I would say the customer should wait on buying any convergence gear until the standards-based products hit the market by the end of 2010. That in fact means waiting until the Nexus 5K and UCS 6100 FI move to DCB standards-based CEE. These products need to move on from Cisco proprietary DCE … and that will happen later in the year along with the rest of the industry.

    There is no point in rushing into all these products that are based on pre-standard technologies….

    Don’t you agree? If not, they may end up with expensive paper weights….
    Cheers
    Sri

  9. Juan Lage says:

    Leaving QoS aside, what about scaling the solution and network management? A solution with ESX and Flex-10 effectively has three network layers, all of which are managed differently in interfaces, tools and even nomenclature. At the ESX layer you have to manage the vSwitch, at the server chassis level you need to manage Flex-10, and then you need to manage your network layer. If you have workloads that need to exist beyond a single server chassis, this means that for VM networking policy you need to touch at least those three components for (perhaps) every change.

    Say for instance you need to have a group of VMs on a PVLAN and they need to live on blades which span multiple chassis (say more than 4, which is the limit on VC stacking today). Leave aside the fact that PVLAN may or may not be supported in all three mentioned layers, and if it is supported, it will certainly work in different ways. Still, you need to configure the network on all three layers, think about it. If you are on vSphere Ent, with no distributed switches, you also need to touch every vSwitch on every host, plus every Flex-10 module, plus the network. Let’s add VLAN limitations in Flex-10 to make it more appealing … In large scale environments it can become a nightmare. In small scale, a burden.

    If you go for a solution like the one outlined above by Brad (N1K + Passthru + N5K) you can EtherChannel every host to the N5K with a trunk. PVLAN required? It works the same everywhere (NX-OS) and you configure it at a single point for up to 256 hosts: the N1KV VSM. A lot simpler. Simple = reliable. Not to mention the savings in Ops costs.

    My 0.2 cents.

    Juan

  10. Sri says:

    Brad,
    My regards and compliments to you for all this great wonderful work you do with great technical details and creative blogging. You definitely are a good technology blogger’s role model and inspiration…I wish you the very best :) .

    Keep up the good work!
    Cheers
    Sri

  11. Brad-

    Morning! Couldn’t you do something similar using one distributed virtual switch and traffic shaping? I was under the impression that with a DVS you can do both egress and ingress traffic shaping.

    I appreciate your comments.

    Disclaimer: Not a network guy

  12. thehevy says:

    The traffic shaping in vSphere works, but it only provides shaping of traffic between the VM and a port in a port group. You can limit the amount of traffic a specific port can tx or rx, but you have to keep in mind that each port in the port group gets the same traffic shaping. That means that if you have shaping limited to 1G per port and 20 VMs that use that port group end up on the same host, you can have up to 20G of traffic on that host.
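
The arithmetic behind that last point, as a tiny Python sketch. The function name is made up for this illustration; the 1G limit and the 20 VMs are the figures from the paragraph above.

```python
def worst_case_host_gbps(per_port_limit_gbps, vms_on_host):
    """Per-port shaping caps each VM port individually, not the port group
    as a whole, so the host-level ceiling grows with the number of VMs."""
    return per_port_limit_gbps * vms_on_host

# 1G shaping per port, 20 VMs from the same port group on one host
print(worst_case_host_gbps(1, 20))
```

In other words, the shaping limit says nothing about the aggregate a host can offer to its uplinks; that depends on where the VMs happen to land.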

    Traffic shaping based on a port has limited use. Traffic shaping based on a traffic type in conjunction with guaranteed minimum bandwidth allocations is where we need to get so all traffic types can take advantage of a large pipe. This will allow traffic to burst when no one else is using it while allowing higher priority traffic to get bandwidth when it needs it in a congested network.

    I can see using a port group based traffic shaping model for a backup connection in VMs, or in a DMZ to match other connections that are slower.

    One quick way to see your traffic is to look at esxtop or resxtop’s networking screen during vMotion and under high traffic. I don’t think you will see anywhere near the 18Gb+ bandwidth that dual 10GbE ports are capable of…

    I still say, keep the network model as simple as possible, apply good monitoring and management tools and only implement traffic shaping and QoS if needed.

  13. Dan Robinson says:

    So I have a few questions about what seem to be holes in this theory that Nexus solves all problems.

    1) If you don’t have N1K -OR- don’t have N5K, does QoS still work? As for your point about the HP approach being lazy, it seems to me that it’s just meant to be more flexible and work in numerous environments, including those not under the Cisco thumb.

    2) The Nexus series doesn’t properly support FCoE today, because the CEE standard is not entirely ratified yet. So to say that FCoE works on the N5K when you will have to turn around and REPLACE some of the hardware down the line seems a little premature. You might argue that Cisco will push the standard their way, but it’s not a guarantee. And for those who say it will only be a software upgrade, ask your Cisco rep to put that in writing.

    3) Your argument about Pass Through 10GB confuses me.
    3a) The HP proposition is that Flex-10 is natively supported by the LOM on the G6/G7 blades. So put pass through there (Interconnect 1 and 2) and you get 10GB Ethernet. How do you upgrade this to FCoE? Oh, that’s right, you have to buy a whole new server or put a CNA in a mezz slot and then move your 10GB Pass Through down to match. This might seem like a knock on HP, but see my point above that FCoE isn’t ratified yet.
    3b) 10GB ports on the N5K are free then? Because you’re asking people to use 32 ports per chassis on your N5K. Using Flex-10 (or any 10GB switch) would allow you to aggregate your port usage a little. As mentioned before, you don’t really need 10Gbx2 per server, so not having a 1:1 server:uplink ratio doesn’t really matter. And for those who would argue this, does that mean your N5K has 30+ 10GB uplinks to your core? I didn’t think so.

    Lastly, keep in mind as per #1 above that not all Cisco customers have Nexus across the board. Most still have CAT6K and cannot move to an entirely Nexus solution.
    Using Flex-10, at the last place I worked, we were able to have 4 chassis tied together with a 30Gbps private network for vMotion/HA/FT and then only using 4 x 10Gb uplinks to Cat6K Distribution layer for Management and VMs.

    • Brad Hedlund says:

      Hmm. Another HP employee posing as a customer?

      “If you don’t have N1K -OR- don’t have N5K, does QoS still work?”

      Sure it will, if you have the vSwitch connected to a physical switch that has QoS capabilities, which isn’t Flex-10, that’s for sure.

      “The Nexus series doesn’t properly support FCoE today…”

      Really? Tell that to EMC, NetApp, and oh by the way, HP, all of whom have certified, support, and resell the Nexus 5000 for FCoE deployments.

      “…because the CEE standard is not entirely ratified yet”

      CEE isn’t a standard. CEE (also known as DCB) is a collection of standards designed to enhance the data center network. Not all standards in CEE/DCB are required for FCoE.
      The standards that are required for FCoE to work (IEEE 802.1Qbb, IEEE 802.1Qaz, IEEE 802.1AB) are all silicon ready and have been for some time. The other standards in CEE/DCB that are not as final, such as IEEE 802.1Qau and TRILL, are not at all required for FCoE.

      “So to say that FCoE works on the N5K when you will have to turn around and REPLACE some of the hardware down the line seems a little premature.”

      Actually, No! There will be no need for a customer to “REPLACE” the Nexus 5000 for any reason other than normal life cycling. FCoE works on the Nexus 5000 today. Funny you should use the word “REPLACE”, because that’s exactly what the HP Flex-10 customer will need to do in order to implement FCoE or what HP calls the “Converged Infrastructure”.

      “And for those who say it will only be a software upgrade, ask your Cisco rep to put that in writing.”

      Again, the Nexus 5000 doesn’t *need* any hardware upgrade. Can you say the same for Flex-10?

      “This might seem like a knock on HP, but see my point above that FCoE isn’t ratified yet.”

      Uh, Wrong! You might want to get more informed about the status of FCoE. Take a look at http://www.fcoe.com and you will see that FCoE has been a standard since June 2009.

      “10GB ports on the N5K are free then? Because you’re asking people to use 32 ports per chassis on your N5K. Using Flex-10 (or any 10GB switch) would allow you to aggregate your port usage a little.”

      10GB ports on the Flex-10 are free then? Of course not. This is a ridiculous argument. You need to pay for the 10GB port a server connects to whether it’s on the Flex-10 or the Nexus 5000.

      “As mentioned before, you don’t really need 10Gbx2 per server”

      That’s a bit presumptuous of you to say that.

      “so not having a 1:1 server:uplink ratio doesn’t really matter. And for those who would argue this, does that mean your N5K has 30+ 10GB uplinks to your core? I didn’t think so.”

      No. Customers are not asking for a 1:1 ratio of server to core. There will be oversubscription at the Nexus 5000 just like there is oversubscription at the Flex-10.
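
For anyone who wants to put a number on that oversubscription, here is a rough Python sketch. It borrows figures mentioned elsewhere in this thread (32 server-facing 10G ports per chassis, 4 chassis, 4 x 10G uplinks) purely as an illustration; the function is hypothetical and the result is not a measurement of any particular design.

```python
from fractions import Fraction

def oversubscription(server_ports, server_port_gbps, uplinks, uplink_gbps):
    """Ratio of server-facing bandwidth to uplink bandwidth."""
    return Fraction(server_ports * server_port_gbps, uplinks * uplink_gbps)

# Illustrative figures: 4 chassis x 32 ports at 10G, 4 x 10G uplinks
ratio = oversubscription(server_ports=4 * 32, server_port_gbps=10,
                         uplinks=4, uplink_gbps=10)
print(f"{ratio.numerator}:{ratio.denominator}")
```

The same function applies whether the aggregation point is a Flex-10 module or a Nexus 5000; only the port counts change.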

      “not all Cisco customers have Nexus across the board. Most still have CAT6K and cannot move to an entirely Nexus solution.”

      Yeah, and the good news is that the HP Passthrough Module supports both 1G and 10G, which can connect to existing Catalyst switches that support 1G or 10G. In most cases when there is a project to enable 10G servers, it also coincides with a project to add 10G networking, and that’s a perfect time to look at Nexus. And Nexus integrates nicely with existing Catalyst 6500 switches.

      “Using Flex-10, at the last place I worked, we were able to have 4 chassis tied together with a 30Gbps private network for vMotion/HA/FT and then only using 4 x 10Gb uplinks to Cat6K Distribution layer for Management and VMs.”

      Sweet. However, it’s too bad that network has no FCoE readiness, no QoS, no In Service Software Upgrades (ISSU), no vPC, no network security, no network visibility, rate limited vMotion bandwidth, etc. All the things you can have with the HP Passthrough 10G module connected to Nexus.

      For example, using that same scenario, now you can connect those 4 chassis to (4) Nexus 2232 10G Fabric Extenders with the HP 10G Passthrough module. Connect the Nexus 2232’s to a pair of Nexus 5000’s, and connect the Nexus 5000’s with 4 x 10G uplinks to the Catalyst 6500.

      This design would provide you with the aforementioned FCoE readiness, QoS, full 10G bandwidth available to vMotion, In Service Software Upgrades (ISSU), 20G active/active bandwidth to each server with virtual Port Channels (vPC), proper network security features, and full traffic visibility. All of that even without the Nexus 1000V, this is just what the N5K/N2K brings. With the Nexus 1000V you have a comprehensive and consistent network operations model from the VM all the way through to the core of the network.

      Cheers,
      Brad

  14. Sam says:

    “For example, using that same scenario, now you can connect those 4 chassis to (4) Nexus 2232 10G Fabric Extenders with the HP 10G Passthrough module. Connect the Nexus 2232’s to a pair of Nexus 5000’s, and connect the Nexus 5000’s with 4 x 10G uplinks to the Catalyst 6500.

    This design would provide you with the aforementioned FCoE readiness, QoS, full 10G bandwidth available to vMotion, In Service Software Upgrades (ISSU), 20G active/active bandwidth to each server with virtual Port Channels (vPC), proper network security features, and full traffic visibility.”

    At the cost of ~$150,000 for all that new network infrastructure…vs $1000 for some 10Gbe-CX4 cables.

    • Brad Hedlund says:

      Sam,
      Let’s take a look at the math in each scenario based on (4) HP c7000 chassis. Let’s start with the HP switches.

      Each HP c7000 will need (2) Flex-10 modules at $12,000 list price each, and (2) Virtual Connect FC modules at $9,000 list price each. That’s $42,000 in networking costs per HP c7000, and with (4) HP c7000’s that’s $168,000.

      Now let’s look at the Cisco Nexus solution with HP 10GE Pass-through modules. Each HP c7000 will need (2) 10GE passthrough modules at $5,000 list price each. So that’s $10,000 in networking costs per HP c7000, and with (4) HP c7000’s that’s $40,000. Now we need to connect those c7000’s to (4) Nexus 2232 Fabric Extenders at $10,000 list price each, totaling $40,000. And then we will connect the Nexus 2232’s to (2) Nexus 5010’s with FCoE licenses at $30,000 each, so that’s another $60,000.

      So in summary:

      HP networking solution = $168,000
      Cisco Nexus + 10GE Passthrough = $140,000

      Winner: Cisco Nexus + 10GE Passthrough
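
The tally can be re-checked with a few lines of Python. The quantities and list prices are the ones quoted in this comment; the dictionary layout and function name are just one way to organize the sum.

```python
CHASSIS = 4  # HP c7000 chassis in the scenario

# (quantity, list price in USD), as quoted in the comment above
hp_design = {
    "Flex-10 module": (2 * CHASSIS, 12_000),
    "Virtual Connect FC module": (2 * CHASSIS, 9_000),
}
nexus_design = {
    "10GE pass-through module": (2 * CHASSIS, 5_000),
    "Nexus 2232 Fabric Extender": (4, 10_000),
    "Nexus 5010 with FCoE license": (2, 30_000),
}

def total_list_price(design):
    """Sum quantity x unit list price across all line items."""
    return sum(qty * price for qty, price in design.values())

print(f"HP networking solution = ${total_list_price(hp_design):,}")
print(f"Cisco Nexus + 10GE Passthrough = ${total_list_price(nexus_design):,}")
```

Changing `CHASSIS` rescales only the per-chassis modules; the Nexus 2232 and Nexus 5010 quantities are fixed at the counts given for this four-chassis scenario.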

      Cheers,
      Brad
