Cisco UCS intelligent QoS vs. HP Virtual Connect rate limiting

Filed in Cisco UCS, QoS on August 16, 2010 · 27 Comments

This article is a simple examination of the fundamental differences in how server bandwidth is handled by the Cisco UCS approach of QoS (quality of service) and by the HP Virtual Connect Flex-10 / FlexFabric approach of Rate Limiting. I created two simple Flash animations, shown below, to make the comparison.

[Two Flash animations comparing Cisco UCS QoS and HP Virtual Connect rate limiting; each requires Flash Player 9]

The animations above each show four virtual adapters sharing a single 10GE physical link to the upstream network switch. In the case of Cisco UCS, the virtual adapters are called vNICs and can be provisioned on the Cisco UCS virtual interface card (aka “Palo”). For HP Virtual Connect, the virtual adapters are called FlexNICs. In either case, each virtual adapter is provisioned for a certain type of traffic on a VMware host and shares a single 10GE physical link to the upstream network. This is a very common design element for 10GE implementations with VMware and blade servers.

When you have multiple virtual adapters sharing a single physical link, the immediate challenge lies in how you guarantee each virtual adapter will have access to physical link bandwidth.  The virtual adapters themselves are unaware of the other virtual adapters, and as a result they don’t know how to share available bandwidth resources without help from a higher level system function, a referee of sorts, that does know about all the virtual adapters and the physical resources they share.  The system referee can define and enforce the rules of the road, making sure each virtual adapter gets a guaranteed slice of the physical link at all times.

There are two approaches to this challenge: Quality of Service (as implemented by Cisco UCS); and Rate Limiting (as implemented by HP Virtual Connect Flex-10 or FlexFabric).

The Cisco UCS QoS approach is based on the concept of minimum guarantees with no maximums, where each virtual adapter has an insurance policy that says it will always get a certain minimum percentage of bandwidth under the worst-case scenario (heavy congestion). Under normal conditions, the virtual adapter is free to use as much bandwidth as it possibly can, all 10GE if it's available, for example when the other virtual adapters are not using the link or are using very little. However, if two or more virtual adapters together try to use more than 10GE of bandwidth at any time, the minimum guarantee will be enforced and each virtual adapter will get its minimum guaranteed bandwidth, plus any additional bandwidth that may be available.
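The sharing rule just described can be sketched as a small bandwidth calculation. This is only a simplified model of minimum guarantees with no maximums (a weighted "water-filling" allocation), not a description of how the UCS hardware actually schedules frames. The function name `share_link`, the adapter names, and the guarantee percentages are all hypothetical values chosen for illustration.

```python
# Simplified model of "minimum guarantee, no maximum" sharing on one 10GE link.
# Assumption: unused capacity is redistributed in proportion to each adapter's
# guaranteed percentage. Names and numbers are illustrative only.

MIN_PCT = {"vmotion": 20, "vm_data": 40, "ip_storage": 30, "mgmt": 10}


def share_link(demands, min_pct, link_gbps=10.0):
    """Return the Gbps each adapter receives for the given demands (in Gbps)."""
    alloc = {name: 0.0 for name in demands}
    # Adapters that still want bandwidth and have not been fully served yet.
    active = {name for name in demands if demands[name] > 0}
    remaining = link_gbps

    while active and remaining > 1e-9:
        total_weight = sum(min_pct[name] for name in active)
        # Offer each unserved adapter a slice proportional to its guarantee.
        offers = {n: remaining * min_pct[n] / total_weight for n in active}
        # Adapters whose whole demand fits inside their offer are fully served.
        satisfied = {n for n in active if demands[n] <= offers[n] + 1e-9}

        if not satisfied:
            # Congestion: everyone left wants more than their slice,
            # so each one gets exactly its weighted slice of what remains.
            for n in active:
                alloc[n] = offers[n]
            break

        for n in satisfied:
            alloc[n] = demands[n]
            remaining -= demands[n]
        active -= satisfied  # leftover capacity is re-offered to the rest

    return alloc


if __name__ == "__main__":
    quiet = {"vmotion": 8.0, "vm_data": 1.0, "ip_storage": 0.5, "mgmt": 0.1}
    congested = {name: 10.0 for name in MIN_PCT}
    print(share_link(quiet, MIN_PCT))
    print(share_link(congested, MIN_PCT))
```

With these sample numbers, the vMotion adapter receives the full 8 Gbps it asks for while the other adapters are quiet, and falls back to its 2 Gbps guarantee only when all four adapters try to saturate the link at once.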

Cisco UCS provides a 10GE highway where each traffic class is given road signs that designate which lanes are guaranteed to be available for that class of traffic.  Between each lane is a spray painted dotted line that allows traffic to merge into other lanes if those lanes are free and have room for driving.  There is one simple rule of the road on the Cisco UCS highway: If you are driving in a lane not marked for you, and that lane becomes congested, you must go to another available lane or go back to your designated lane.

The HP Virtual Connect approach of Rate Limiting does somewhat the opposite. With HP, the system referee gives each virtual adapter a maximum possible bandwidth that cannot be exceeded, and then ensures that the sum of the maximums does not exceed the physical link speed. For example, four FlexNICs could each be given a maximum bandwidth of 2.5 Gbps. If FlexNIC #1 needed to use the link, it would only be able to use 2.5 Gbps even if the other 7.5 Gbps of the physical link were unused.
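The rate-limiting rule can be sketched the same way. This is a minimal model of the behavior described above (a fixed per-adapter cap, with the caps summing to no more than the link speed), not HP's actual implementation. The function name `rate_limit` and the adapter names are hypothetical.

```python
# Simplified model of per-adapter rate limiting on one 10GE link.
# Assumption: each adapter simply gets min(demand, configured maximum).
# Names and numbers are illustrative only.

def rate_limit(demands, max_gbps, link_gbps=10.0):
    """Return the Gbps each adapter may send, given fixed per-adapter caps."""
    if sum(max_gbps.values()) > link_gbps:
        raise ValueError("sum of maximums exceeds the physical link speed")
    # An adapter never exceeds its cap, no matter how idle the link is.
    return {name: min(demand, max_gbps[name]) for name, demand in demands.items()}


if __name__ == "__main__":
    caps = {"flexnic1": 2.5, "flexnic2": 2.5, "flexnic3": 2.5, "flexnic4": 2.5}
    demands = {"flexnic1": 10.0, "flexnic2": 0.0, "flexnic3": 0.0, "flexnic4": 0.0}
    sent = rate_limit(demands, caps)
    print(sent)                      # flexnic1 is held to 2.5 Gbps
    print(10.0 - sum(sent.values())) # 7.5 Gbps of the link sits idle
```

The cap applies regardless of what the other adapters are doing, which is why the idle 7.5 Gbps in this example cannot be borrowed.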

HP Virtual Connect provides a 10GE highway where lanes are designated for each virtual adapter, and each lane is divided from the other lanes by cement barriers.  There could be massive congestion in Lane #1, and as the driver stuck in that congestion you might be able to look over the cement barrier and see that Lane #2 is wide open, but you would not be able to do anything about it.  How frustrating would that be?

The HP rate limiting approach does the basic job of providing each virtual adapter guaranteed access to link bandwidth, but does so in a way that results in massively inefficient use of the available network I/O bandwidth. Not all bandwidth is available to each virtual adapter from the start, even under normal non-congested conditions. As the administrator of HP Virtual Connect, you need to define the maximum bandwidth for traffic such as VMotion, VM data, IP storage, etc. (something less than 10GE). From the very start that traffic will not be able to transmit any faster than its configured maximum, so the consequence is immediate.
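To put a rough number on that consequence, the sketch below estimates how long a bulk transfer takes at a capped rate versus the full link rate. The 16 GB transfer size is a hypothetical figure (for example, the memory of a VM being moved), the helper name `transfer_seconds` is made up for this illustration, and the calculation ignores protocol overhead.

```python
# Back-of-the-envelope transfer time: size in gigabytes, rate in gigabits/s.
# Assumption: decimal units (1 GB = 8 Gb) and no protocol overhead.

def transfer_seconds(size_gigabytes, rate_gbps):
    """Seconds needed to move size_gigabytes at a sustained rate_gbps."""
    return size_gigabytes * 8 / rate_gbps


if __name__ == "__main__":
    size_gb = 16  # hypothetical amount of data to move
    print(transfer_seconds(size_gb, 2.5))   # capped at 2.5 Gbps: about 51.2 s
    print(transfer_seconds(size_gb, 10.0))  # full 10GE available: about 12.8 s
```

On an otherwise idle link, the capped transfer takes four times as long as it would if the adapter could use all 10GE.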

The Cisco UCS approach allows efficient use of all available bandwidth with intelligent QoS. All bandwidth is available to all virtual adapters from the start, while each virtual adapter still has a minimum bandwidth guarantee. As the Cisco UCS administrator you define the minimum guarantees for each virtual adapter through a QoS Policy. Traffic such as VMotion, VM data, IP storage, etc. will have immediate access to all 10GE of bandwidth, so the benefit of maximum bandwidth is immediate. Only under periods of congestion will the QoS policy be enforced.

UPDATE: Follow-up post: VMware 10GE QoS Design Deep Dive with Cisco UCS, Nexus

About the Author

Brad Hedlund is an Engineering Architect with VMware’s Networking and Security Business Unit (NSBU), focused on network & security virtualization (NSX) and the software-defined data center. Brad’s background in data center networking begins in the mid-1990s with a variety of experience in roles such as IT customer, systems integrator, architecture and technical strategy roles at Cisco and Dell, and speaker at industry conferences. CCIE Emeritus #5530.

Comments (27)

  1. Sean McGee says:

    Great post, Brad!

    Couple of things I’d like to add.

    1. HP’s rate limiting is for transmit traffic only. HP doesn’t provide any bandwidth control for receive traffic. In other words, customers think they have dedicated bidirectional bandwidth for each Virtual Connect FlexNIC, but they don’t. On receive, all FlexNICs have to fight for bandwidth – all frames inbound to any FlexNIC on the same Flex-10 port are handled as FIFO (first in first out) with no ability to prevent one FlexNIC from starving the other FlexNICs. So, you really need two Flash animations for HP – one showing transmit and one showing receive.

    2. Cisco also provides the rate limiting feature (a la HP) if customers really want it. It’s their choice. In addition, Cisco fully implements rate limiting for both transmit and receive (unlike HP).

  2. @niketown588 says:

    QoS is based on the switch. You can implement QoS at the adapter but the switch has the final say.
    Are you saying that the UCS uses better switches (I agree with) or better HBAs which I disagree with?

    • Brad Hedlund says:


      The intelligent QoS advantage with Cisco UCS is the result of better switches and adapters. If you think the adapters in Cisco UCS are not any better than HP, go ahead and make your case.


  3. Ian Erikson says:

    Great Blog!

    2 questions –

    1) If you are in a HP blade environment- with Flex-10 – do you recommend creating 1 pipe in flex-10 and installing a nexus 1000v and doing QoS/Rate Limiting there? (Assuming you are also using UCS with N1KV already, so there is support/experience available)

    2) HP Blade options – the HP pass through module is interesting, but having (8) or (16) copper twinax 10Gb cables for 8 or 16 blades seems like overkill – in bandwidth. That’s also a bulk of cabling we are trying to reduce.

    • Ian Erikson says:

      Oh, I wanted to mention that with the Flex-10/HP pass-through options we are not using them for storage traffic – we will still be using separate VC FC modules.

    • Brad Hedlund says:


      1) If you have Virtual Connect Flex-10 or FlexFabric I absolutely recommend using the Nexus 1000V to gain the per virtual machine network visibility. With Flex-10, the Nexus 1000V can classify and mark traffic leaving the host, the first important step in intelligent QoS. However, the second important step to intelligent QoS is enforcing the QoS policy, and this is something the Flex-10 switch is simply not capable of doing. You can classify and mark traffic all day long with Nexus 1000V; however, it won’t make any difference in how the Flex-10 switch manages traffic within and through the c7000 chassis. This is why I recommend the HP 10GE pass-through module, which allows you to connect the blade servers directly to a more intelligent switch, such as the Nexus 5000 or Nexus 2232. With the Nexus 1000V connected to the Nexus 5000/2000 switches via 10GE pass-through, you have the intelligent QoS capabilities shown here with your HP c7000 blade chassis and servers.

      2) The extra cables from the HP 10GE pass-through design all stay within the rack, and the CX1 SFP+ copper cables are very thin and flexible. The number of cables leaving the rack is the same as if you had used chassis switches such as Flex-10. Yes, there are extra cables inside the rack; however, the one-time investment of additional cables (inside the rack only) during initial setup continually pays off in the long run with a more intelligent end-to-end virtualization-aware network from the virtual machine all the way through to the network core.

      Or, better yet, consider Cisco UCS where you can get both the intelligent network and the reduced cabling.


  4. Derek says:

    But the “extra cables” require extra switch ports. And Nexus switch ports are not cheap. So instead of having 2-4 10Gb uplinks per chassis, I now need up to 32. That’s a drastic increase in switching costs.

    ESX 4.1 now has more intelligent network I/O control, and it’s bi-directional. So why not skip the Flex-10 FlexNIC feature, configure the adapters as a single 10Gb connection, and rely on ESX 4.1 network IOC to help with QoS?

    • Brad Hedlund says:

      The number of “switch ports” stays the same. You are removing the Flex-10 “switch” from the chassis and replacing it with a Nexus 2232 inside the rack. The Cisco list price of a Nexus 2232 is $10,000 for (40) 10GE ports, that’s just $250 per port. The Flex-10 switch on the other hand is $12,000 for (24) 10GE ports, that’s $500 per port. So, contrary to what you believe, the “extra cables” are actually saving you money and providing a more intelligent network.

      As for ESX 4.1 “network I/O control”, that’s just another example of how you can always burden software to do the job of hardware. It might work, but at the cost of higher server CPU utilization, and what are the real performance impacts of that? Hardware will always provide robust and reliable performance while offloading the server CPU to do what it should be doing, executing the business applications and hosting more virtual machines. Furthermore, implementing QoS at the server NIC (be it with hardware or software) is only half of the solution. The network switch connected to the NIC needs to be intelligent enough to enforce the QoS policy you have defined. So, you could configure ESX 4.1 “network I/O control” all day long on your HP blade servers, but if those blade servers are linked to HP Virtual Connect Flex-10 (which has no QoS enforcement capabilities) you have accomplished nothing.


      • Adam says:

        (Disclaimer I work for HP)

        “So, contrary to what you believe, the “extra cables” are actually saving you money and providing a more intelligent network.”

        But those cables cost money and they also draw power so actually to be fair in your comparison you need to include the cost of the cables. Also Ian made a valid point that most customers have been trying to replace the cables and use trunking instead, jumping to pass through is a step backwards! Anyone who’s tried to cable pass through will know the difficulties involved (not on a Visio I mean actually in a data center, we can all sit here and draw pretty diagrams!)

        One of the real advantages of Virtual Connect is the ability to keep a lot of “non-essential” traffic away from the network (like vmotion) and inside the chassis where bandwidth is cheap and plentiful.

        I also have a serious point to make about this increasing example of “overpowering” CPU’s by getting them to do actions that can be delivered in the hardware (I agree in principle that hardware implementation will always be more efficient than software).

        Do you honestly believe that CPU’s are the bottleneck in a datacenter? Honestly? Because let me tell you they are not! Speak to any customer running production workloads and the two things they will always ask for more of is memory and network I/O (in terms of I/O per virtual machine) CPU’s were a problem 2 years ago but there are now more than enough spare cycles to take care of the minor overhead involved in performing QoS at the hypervisor. CPU’s are not the issue any more (and please don’t throw benchmarks around let’s look at actual production for a change!).

        Memory is no longer an issue with the advent of Nehalem-EX (although the smart money is on the AMD architecture particularly where price/performance is concerned) so that leaves I/O, if a customer is serious about sticking lots of VM’s on a box (ignoring the question of all eggs in one basket) and running them all at near full capacity then he will need LOTS of I/O, and based on some recent conversations customers are asking for at least 20Gb (active) and some are even considering 40Gb per box (on the larger Nehalem machines). My understanding is that today you are limited to 10Gb active per 2 socket box?

        Don’t get me wrong QoS has its place, but it shouldn’t be the primary bandwidth control mechanism in the network, it should be a tool of last resort for when it’s all gone hideously wrong and you just need to keep operating at a reduced level. It’s far better to have lots of bandwidth (you can always add more mezz cards in a HP blade!) and not run into network bottlenecks.

        • Brad Hedlund says:

          Thanks for the “comment”. Allow me to respond to a few of your statements, in no particular order.

          But those cables cost money and they also draw power so actually to be fair in your comparison you need to include the cost of the cables.

          OK. Fine, let’s add in the cost of the cables. Cisco list price for a 3 meter SFP+ copper cable is $210. Since we have already established the pass-through design saves $250 per port, if we add in the cable it’s a $40 per port savings. At the end of the day it’s still cheaper while providing a more intelligent network for the customer. Furthermore, the 10GE pass-through design is capable of FCoE, so no additional FC blade switch is necessary to provide FC connectivity to the HP blade servers. Given that each HP Virtual Connect FC module is roughly $9,000 – the 10GE pass-through design can remove around $18,000 per c7000 chassis in FC networking costs. I can assure you the cable costs are nowhere near $18,000 per chassis!

          One of the real advantages of Virtual Connect is the ability to keep a lot of “non-essential” traffic away from the network (like vmotion) and inside the chassis where bandwidth is cheap and plentiful.

          We have already established HP Virtual Connect isn’t “cheap” and the key point made in this article is that the bandwidth is NOT “plentiful”. Vmotion traffic is a perfect example. Because HP Virtual Connect has no intelligent QoS capabilities, the VC administrator must carve up the 10GE link and decide how much bandwidth vmotion will get (something less than 10GE). Any bandwidth given to vmotion will not be available to other traffic types, such as VM data or IP storage. In many cases this results in the customer allocating 2 Gbps for vmotion. Compare that to Cisco UCS or the HP 10GE pass-through design, where vmotion can get all 10GE of bandwidth if it’s available without making any compromises. That’s what I call “plentiful” bandwidth.

          Do you honestly believe that CPU’s are the bottleneck in a datacenter? … CPU’s were a problem 2 years ago but there are now more than enough spare cycles to take care of the minor overhead involved in performing QoS at the hypervisor.

          I never said server CPU’s were a bottleneck. What I said is that server CPU’s should be doing the job they were purchased for, to execute business applications. A server CPU cycle spent doing networking functions reduces the efficiency of the server. Furthermore, I think you missed the point completely. You can do QoS at the hypervisor all day long but if the switch port connected to that server is not capable of enforcing the QoS policy (as is the case with HP Virtual Connect) you have accomplished nothing.

          Memory is no longer an issue with the advent of Nehalem-EX

          Memory capacity was no longer an issue with Nehalem-EP processors, actually, thanks to the Cisco UCS B250 blade, where you have (48) DIMM slots available on a (2) socket Nehalem-EP board. There is also a Westmere-EP version of Cisco UCS B250. Sure, the Nehalem-EX architecture has more memory DIMMs and more CPU power, but at a significantly higher cost. If the customer’s applications are more memory bound, they have a great choice in getting the needed memory footprint at a lower cost with Cisco UCS B250 blades, which are unique in the market in providing lots of memory with lower priced, lower power processors.

          My understanding is that today you are limited to 10Gb active per 2 socket box?

          If you are talking about Cisco UCS, you are definitely wrong there. The Cisco UCS B250 blade I discussed above is a (2) socket blade with 40 Gb active I/O bandwidth delivered from (2) dual-port 10GE mezzanine adapters.

          Don’t get me wrong QoS has its place, but it shouldn’t be the primary bandwidth control mechanism in the network, it should be a tool of last resort

          If you read this article you would have learned that I agree with you and that’s exactly how intelligent QoS works … it only enforces a policy when there is congestion. HP Virtual Connect, on the other hand, is the exact opposite. The rate-limiting requirement of FlexNICs is controlling and limiting bandwidth at all times, even when there is no congestion!

          Thanks for stopping by!


  5. David Coulthart says:


    To achieve the benefits of the Cisco UCS implementation’s “intelligent QoS” is it necessary to connect the UCS to a physical switch that supports the PFC & ETS components of the DCB standards (i.e., the Nexus line) or will it work with a switch that only supports “classic” Ethernet CoS (e.g., Catalyst 6500)?


    • Brad Hedlund says:

      The “intelligent QoS” described here for the server I/O network is inherent to the Cisco UCS platform regardless of what make/model 10GE switch you connect it to upstream.


  6. Jase Machado says:


    I think you’ve clearly stated what’s going on with the different HP and Cisco approaches. Cisco is more efficient with smaller amounts of bandwidth whereas HP does not manage the bandwidth as well but has lots of raw bandwidth. Somewhere in the middle are VMware architects trying to get to 10GbE with or without FCoE.

    In our environments the FCoE is nice but not a real requirement since a full end-to-end FCoE vision ROI is years out. With that said, we’re focusing on how to chop up 10GE connections for all the things VMware hosts need. These include console, vMotion, and Fault Tolerance.

    One thing I’m concerned about with the UCS approach is that maybe there is not enough raw bandwidth coming down to the enclosure. If there are 8 x 10GE ports then that’s 80GE total possible. I understand that Cisco does 100 oversubscription and each “half-wide” blade thinks it has 20 GE all the while doing intelligent QoS. This is all banking on the hope that not all blades need 20 GE each all at once. It wouldn’t happen. Each blade would max out at 10 GE per what the max bandwidth is off the aggregate (8) 10 GE connections.
    This is all fine and I think this scenario would never happen with regular network loads. But wait, we’re running FCoE in there too! So my HP half-height blades right now have (2) 4Gb HBAs. If those were maxed (possible with many VMs) then that cuts my max simultaneous possible bandwidth on each half-wide from 10 to 2 Gb left (10 – 8 = 2).

    So my point is that although HP is not managing bandwidth as nicely as Cisco with oversubscribing and nice QoS minimums, they are solving the problem with raw bandwidth and options to continue using FC aside from 10 GE. *Also, the Flex-10 inter-enclosure link to isolate vMotion and FT traffic is very slick!


    • Brad Hedlund says:

      I do find it quite funny that making excuses for HP requires throwing out these totally unrealistic Frankenstein bandwidth scenarios (such as all servers in a chassis sending max Eth and FC traffic all at once). Frankenstein had impressive brute strength but unfortunately he didn’t adapt too well to the real world.

      Speaking of the real world, you said: “we’re focusing on how to chop up 10GE connections for all the things VMware hosts need”. That is a real world problem. How are you going to decide the _max_ limit of bandwidth for Virtual Machines, Fault Tolerance, NFS, vMotion etc? Which one is going to get the shortest straw? Why?
      You see, because Cisco UCS is designed to address real world problems, these difficult decisions are vastly minimized because with UCS you are sorting out _min_ bandwidth guarantees only enforced during congestion.

      Why is creating a second network that can only string between (4) chassis “slick”? I don’t get that. While you’re at it, why stop at a second network? Why not create a separate network for every possible traffic type? Oh, wait, that’s starting to sound like a data center circa 2003 😉

      Thanks for reading and leaving a comment!


      • Jase Machado says:

        So now, understanding both solutions’ functionality, I’m starting to think we really need to analyze/trend what is the max amount of bandwidth we currently use and what we will need in the future. If the total max bandwidth needs are below the max UCS can provide then maybe UCS is best. But if we require more than 10Gb bandwidth on each blade concurrently then maybe we need more brute strength. The trend toward denser virtualization will surely require some brute strength in the future. I’m confident that the next version of UCS will deliver twice the current bandwidth 😉 . That appears to be the only risk of this solution when planning for long-term scalability in regards to functionality.

        Aside from the functionality contest, there is also this whole “Do more with less” theme these days in IT. With that said, the Cisco UCS solution as a whole has a high cost problem. UCS is very expensive, more so than any other solution per blade and per VM. Is that arguable? HP’s argument against UCS shows a lot of extra UCS hardware. 38:2? (That’s a lot of extra hardware and associated costs. Almost a Frankenstein inventory of hardware. 😉)

        If not just measuring pure hardware costs, we could also measure the higher operational costs of maintaining more devices and the costs of implementing new operational support procedures. UCS will require shops to rethink and create new processes for provisioning, monitoring, and supporting hardware. This may be necessary if the overall ROI is there and better than competitors. Again, we must analyze the total costs of these solutions as a whole.

        I’m impressed with Cisco’s introduction into the market. I think they may soon dominate with a couple more revs and lower costs. Although my commentary sounds biased to HP, it is not. I’m indifferent to each product and really interested in this dialogue to better understand each product as we move forward with our formal bake-off and planning of the next generation of VMware. This is going to be a good bake-off!


        • Brad Hedlund says:

          Any time there is a new and more efficient means of doing something, it often results in a “rethink”. When the automobile was invented we had to “rethink” transportation infrastructure, from dirt/gravel roads for horse-powered buggies to paved streets & highways.

          You’re right, with UCS it will be the same kind of “rethink” in how to do things more efficiently.

          For example, your current operations procedure to install an HP chassis probably goes something like this:

            1) Connect Power
            2) Configure IP address & login for HP Onboard Administrator
            3) Load firmware for HP Onboard Administrator
            4) Connect HPOA to separate out of band management network
            5) Configure IP address & login for Virtual Connect Flex-10
            6) Load software for Flex-10
            7) Configure Flex-10 vNets
            8) Connect Ethernet Flex-10 uplinks
            9) Configure Flex-10 uplinks
            10) Check with Network guy that upstream network is properly configured (VLANs, etc.)
            11) Configure Virtual Connect Server Profiles
            12) Load software for Virtual Connect FC switch
            13) Connect Virtual Connect FC uplinks
            14) Check with SAN guy that upstream FC ports are ready (VSAN, zoning, etc.)
            15) Ensure server blades have firmware compatible with HPOA firmware
            16) Ensure server Ethernet adapters have firmware compatible with Flex-10
            17) Ensure server FC adapters have firmware compatible with VC FC
            18) Add HPOA to HP Systems Insight Manager
            19) Add chassis to HP Virtual Connect Enterprise Manager
            20) Ensure server BIOS revision levels and settings are correct

          [ If installing another chassis, repeat steps 1-20 ]

          With Cisco UCS, you’ll have to “rethink” the procedure to something like this:

            1) Connect power
            2) Connect Ethernet
            3) Walk away

          [ If installing another chassis, repeat steps 1-3 ]


          • Jase Machado says:

            Ha! That link is great!
            Very funny. Although maybe a little one-sided, it’s still a good satirical rebuttal to the HP link I posted above.

            As for your procedural setup steps outlined above, that is not accurate. To say UCS has 3 steps to HP’s 20 is obviously untrue. Cisco UCS has way more initial configuration settings. If UCS Manager just needs power and configures itself then that is awesome! From my experience, configuring the UCS 6140XP and the Cisco UCS Manager is a lot of work. However, it may be true that adding additional chassis may be easier.

            BTW- Another good HP Virtual Connects vs Cisco UCS dialogue:


          • Brad Hedlund says:

            Of course with Cisco UCS there is initial configuration required, in one management platform, UCS Manager. With HP, on the other hand, there is also initial configuration required, but with several different management platforms (SIM, VCEM, Insight Dynamics VSE, RDP, LAN, SAN, etc). Regardless, in either case, the initial management platform(s) will need to be set up. So let’s call that part a wash.

            Having said that. Again, the (3) steps to add your FIRST, 2nd … 14th UCS chassis is as follows:

            1) Connect power
            2) Connect Ethernet
            3) Walk Away

            What about replacing a failed Networking module in the chassis? With UCS it would go something like this:

            1) Remove failed FEX
            2) Insert replacement FEX
            3) Connect Ethernet
            4) Walk away

            Compare that to HP. How many steps would it take to replace a Flex-10 module?

            In a nutshell, all of the chassis, networking, blades, & adapters (everything underneath UCS Manager) are completely stateless. This allows IT to respond faster, with more accuracy, and to focus less on operational tasks and more on optimizing SLAs. The whole “do more with less” thing. 😉


          • John Smith says:

            Surely you’re oversimplifying the ease of installation here. If what you say is true, the intelligence in the UCS chassis is quite advanced. I’m not sure I want to be responsible for an Extinction Level Event by initiating the singularity when I plug this thing into my network. (I hope you recognize the attempt at humor here.)

            Can you point to a document that tells me that all I have to do to install a UCS chassis is plug it into power and network?

        • Michael Smith says:

          I agree with you Jase. We had looked at using HP BladeSystem, with Brocade TOR switch and HP Networking hardware, and Virtual Connect vs. Cisco UCS, Nexus switches, and Cisco networking hardware.

          The difference was outrageous. We presented both solutions to our client. Both solutions backed up with the appropriate documentation. NO FUD! Either way we win. The client wanted to know why we wasted their time with the Cisco UCS presentation. We told them that we would not be doing our jobs if we did not. The amount of management, hardware, power, and data center floorspace required for Cisco UCS compared to HP is quite high. Not to mention all of the licensing costs associated with Cisco hardware.

          I am glad IBM with their PureSystems and Cisco with their UCS are in the converged data center market. With these big players competing against HP, everybody wins!
