Cisco UCS criticism and FUD: Answered

One of my readers recently submitted a comment asking me to respond to some criticisms he frequently hears about Cisco UCS.  This is a pretty typical request I get from partners and prospective customers, and it’s a list of stuff I’ve seen many times before, so I thought it would be fun to address these and other common criticisms and FUD against Cisco UCS in one consolidated post.  We’ll start with the list submitted by the reader and let the discussion continue in the comments section.  Sounds like fun, right? :-)


I regularly hear a few specific arguments critiquing the UCS that I would like you to respond to, please.

1. The Cisco UCS system is a totally proprietary and closed system, meaning:

a) the Cisco UCS chassis cannot support other vendor’s blades. For example, you can’t place an HP, IBM or Dell blade in a Cisco UCS 5100 chassis.

b) The Cisco UCS can only be managed by the Cisco UCS manager – no 3rd party management tool can be leveraged.

c) Two Cisco 6100 Fabric Interconnects can indeed support 320 server blades (as Cisco claims), but only with an unreasonable amount of oversubscription. The more accurate number is two 6100s for every four (4) 5100 UCS chassis (32 servers), which will yield a more reasonable oversubscription ratio of 4:1.

d) A maximum of 14 UCS chassis can be managed by the UCS manager, which resides in the 6100 Fabric Interconnects. Therefore, this creates islands of management domains, especially if you are planning on managing 40 UCS chassis (320 servers) with the same pair of Fabric Interconnects.

e) The UCS blade servers can only use Cisco NIC cards (Palo).

f) Cisco Palo cards use a proprietary version of interface virtualization and cannot support the open SR-IOV standard.

I would really appreciate it if you can give us bulleted responses in the usual perspicacious Brad Hedlund fashion. :-)


This is a good list to start with.  But before we begin, let’s define what constitutes valid criticism in the context of this discussion.

Criticism: something pointed out as lacking or deficient when compared to what is typically found and expected in other comparable and “acceptable” solutions.  For example, if my new commuter car didn’t have anti-lock brakes this would be a valid criticism, as anti-lock brakes are a feature commonly found and expected in most newer commuter cars today.  However, if my car didn’t transform into a jet plane and fly with the press of a button, is that a valid criticism? No.  This is not a capability typically expected of any automobile.  Such a “criticism” is pointless.

OK, let’s get started…

1) “Cisco UCS chassis cannot support other vendor’s blades”

This is one of my favorites.  If someone brings this up you know right away you’re dealing with someone who is either A) joking, or B) has no idea what they’re talking about.  Anybody who has set foot in a data center in the last 7 years knows that Vendor X’s blade chassis are only populated with Vendor X’s blade servers, and … <GASP> yes! Cisco UCS chassis are only populated with Cisco UCS blades. Shame on Cisco! LOL.

Before the IBM guys jump out of their seats: yes, I am aware that 3rd party blade servers can be made to fit into an IBM blade chassis.  While that’s a cute little check box to have on your data sheet, the actual implementation of this is extremely rare.  Why? It just doesn’t make any sense to do this, especially with commodity x86 hardware.

When was the last time you saw Vendor X’s blade server in Vendor Y’s blade chassis?  Exactly.  This is not a valid criticism.  Case closed.

2) “Cisco UCS can only be managed by the Cisco UCS manager – no 3rd party management tool can be leveraged.”

If “managed” means the basic baseboard level management of the blade itself (BIOS settings, firmware, iLO, KVM, virtual media, etc.), in other words, everything needed to get the blade up and functionally booting an OS, then yes, this of course is true, and again it’s no different than the other market leading vendors.  For example, the HP c7000 chassis requires that you have at least one HP management module present in every chassis to manage the blades (HP Onboard Administrator).  Furthermore, to aggregate management across multiple c7000 chassis you are required to have HP management software performing that function as well, HP Systems Insight Manager.  This is true of the other blade vendors as well (DELL, IBM).  You have their management software and modules managing their hardware.  So help me understand, how is this a valid criticism against Cisco UCS?

If “managed” means a higher level capability set such as auditing, provisioning, historical statistics, life cycle management, alerts and monitoring, etc., then this is actually where Cisco UCS sets itself apart from the other vendors in being more “open” and eco-system friendly.  Unlike the others, Cisco UCS provides an extremely powerful and open XML API that any 3rd party developer can customize their solution to.  Consider the fact that the UCS Manager GUI is just a browser based front-end to the same XML API that 3rd party developers are interfacing with.  It’s entirely possible to provision and manage an entire UCS system with 3rd party software, without ever using the UCS Manager GUI.
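To make the XML API point a little more concrete, here is a minimal Python sketch of what a 3rd party tool’s conversation with UCS Manager might look like. It assumes the API takes XML documents posted to a /nuova URL on the UCS Manager address, and that the aaaLogin and configResolveClass methods and the computeBlade class behave as described in the comments. The host name, credentials, and sample responses are placeholders written by hand for illustration, and the sketch only builds and parses XML without touching a real system.

```python
import xml.etree.ElementTree as ET

# Assumption: UCS Manager accepts XML documents POSTed to this kind of URL.
# The host is a placeholder; nothing in this sketch opens a network connection.
UCSM_URL = "https://ucsm.example.com/nuova"


def build_login(name, password):
    """Return an aaaLogin request body."""
    elem = ET.Element("aaaLogin", {"inName": name, "inPassword": password})
    return ET.tostring(elem, encoding="unicode")


def build_resolve_class(cookie, class_id):
    """Return a configResolveClass query for every object of one class."""
    elem = ET.Element(
        "configResolveClass",
        {"cookie": cookie, "inHierarchical": "false", "classId": class_id},
    )
    return ET.tostring(elem, encoding="unicode")


def parse_cookie(login_response):
    """Pull the session cookie out of an aaaLogin response."""
    return ET.fromstring(login_response).attrib["outCookie"]


def parse_dns(resolve_response):
    """List the distinguished name (dn) of each object returned."""
    root = ET.fromstring(resolve_response)
    return [obj.attrib["dn"] for obj in root.findall("./outConfigs/*")]


# Hand-written sample responses, for illustration only (not captured from a real system).
SAMPLE_LOGIN = '<aaaLogin response="yes" outCookie="example-cookie" />'
SAMPLE_BLADES = (
    '<configResolveClass response="yes" classId="computeBlade"><outConfigs>'
    '<computeBlade dn="sys/chassis-1/blade-1" />'
    '<computeBlade dn="sys/chassis-1/blade-2" />'
    "</outConfigs></configResolveClass>"
)

cookie = parse_cookie(SAMPLE_LOGIN)
print(build_login("admin", "secret"))
print(build_resolve_class(cookie, "computeBlade"))
print(parse_dns(SAMPLE_BLADES))
```

A real tool would POST those request bodies over HTTPS and feed the replies into the same parsing functions. The GUI is doing nothing more exotic than that.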

There are many examples of this open XML API management integration with Cisco UCS, but here are just a few:

Why isn’t there an iPhone app yet to manage HP BladeSystem, or DELL, or IBM? Answer: Without a consolidated and open API to interface with this would be a tremendously complex effort. Compare that to the Cisco UCS iPhone app that was developed by just one Cisco SE (Tige Phillips) in his spare time!

If an amateur programmer can write an iPhone app to manage Cisco UCS in his spare time, imagine what a team of cloud savvy programmers can accomplish?  Example: Check out what newScale is doing with Cisco UCS.

So, as for the claim: “… no 3rd party management tool can be leveraged.”?  We can dismiss that one as being totally false.

3) “Two Cisco 6100 Fabric Interconnects can indeed support 320 server blades (as Cisco claims), but only with an unreasonable amount of oversubscription. The more accurate number is two 6100s for every four (4) 5100 UCS chassis (32 servers), which will yield a more reasonable oversubscription ratio of 4:1”

The statement “unreasonable amount of oversubscription” is pure speculation.  Oversubscription requirements will vary per customer deployment depending on factors of desired scale, bandwidth, and cost.  The trade off between bandwidth and scale is no secret and is simply a fact of life with any vendor solution; it’s not something unique to UCS.  More bandwidth means more network ports, more switches, higher cost.  More oversubscription means higher scale at lower costs.

Next, what does “oversubscription” really mean anyway?  For some, it might be very blade chassis centric where they calculate the ratio of total bandwidth provisioned to the servers in a chassis compared to the total uplink bandwidth available to that chassis.  In these calculations, each Cisco UCS chassis of 8 servers can be provisioned for a max of 80 Gbps, or a minimum of 20 Gbps.  When you provision the minimum 20 Gbps of uplink bandwidth per chassis you can in theory*** achieve the scale of 320 servers per pair of Fabric Interconnects (40 chassis dual homed to a pair of 40-port fabric interconnects).

Example: If I have a modest provisioning of 10 Gbps per server (that’s a lot, actually), and the minimum of 20 Gbps of chassis uplink bandwidth — that’s a <GASP> “more reasonable” 4:1 oversubscription ratio for 320 servers! 😉
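If you want to check that chassis-level arithmetic yourself, here is a rough Python sketch. The function and variable names are mine, and the scale figure assumes one 10GE link from each chassis to each of two 40-port fabric interconnects, with no ports set aside for uplinks (hence “in theory”).

```python
def oversubscription(servers, gbps_per_server, uplink_gbps):
    """Ratio of bandwidth provisioned to servers versus uplink bandwidth available."""
    return (servers * gbps_per_server) / uplink_gbps


SERVERS_PER_CHASSIS = 8
GBPS_PER_SERVER = 10

# Minimum wiring: 20 Gbps of uplink per chassis.
print(oversubscription(SERVERS_PER_CHASSIS, GBPS_PER_SERVER, 20))  # 4.0
# Maximum wiring: 80 Gbps of uplink per chassis.
print(oversubscription(SERVERS_PER_CHASSIS, GBPS_PER_SERVER, 80))  # 1.0

# Theoretical scale at minimum wiring: every fabric interconnect port
# goes to a chassis, one port per chassis per fabric.
FI_PORTS = 40
LINKS_PER_CHASSIS_PER_FI = 1  # 10 Gbps to each fabric = 20 Gbps per chassis
max_chassis = FI_PORTS // LINKS_PER_CHASSIS_PER_FI
print(max_chassis * SERVERS_PER_CHASSIS)  # 320
```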

For others, “oversubscription” might mean not just the bandwidth a server must share to exit the chassis, but the total amount of bandwidth each server shares to reach the Layer 3 core switch.  Again, this is a universal bandwidth/scale/cost design trade-off across all vendors, not just Cisco.  This kind of exercise requires taking a look at the total solution including servers, chassis, access switches, core switches, LAN and SAN.

Here’s a simple example of achieving 4:1 oversubscription from every server to the LAN core*, and 8:1 to the SAN core**.  You could have (8) UCS chassis, each with 8 servers provisioned for 10 Gbps of LAN bandwidth and 4 Gbps of SAN bandwidth.  Each chassis is wired for the maximum of 80 Gbps, providing 1:1 at the chassis uplink level.  So, now we have 64 servers at 1:1 that we need to uplink to the SAN and LAN core.  To get 4:1 to the LAN core*, and 8:1 to the SAN core**, we need to have (16) 10GE uplinks and (8) 4G FC uplinks from each Fabric Interconnect.  We’ll take those uplinks and connect them to 1:1 non-oversubscribed ports at the SAN and LAN core.

The result: 64 servers each provisioned for 10GE with 4:1 oversubscription to the LAN core*, and 8:1 to the SAN core**.  All of this fits into a single pair of UCS 6140 Fabric Interconnects.  You could treat this as a discrete “Pod”.  As you need to scale out more servers at similar oversubscription, you stamp out more similarly equipped pods.
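Here is the pod math as a quick Python sketch, measured against the uplinks of a single Fabric Interconnect as quoted above. The variable names are mine; the inputs are the ones from the example.

```python
SERVERS = 8 * 8                      # (8) chassis x 8 servers
LAN_GBPS_PER_SERVER = 10
SAN_GBPS_PER_SERVER = 4

lan_uplink_gbps_per_fi = 16 * 10     # (16) 10GE uplinks from each Fabric Interconnect
san_uplink_gbps_per_fi = 8 * 4       # (8) 4G FC uplinks from each Fabric Interconnect

lan_ratio = SERVERS * LAN_GBPS_PER_SERVER / lan_uplink_gbps_per_fi
san_ratio = SERVERS * SAN_GBPS_PER_SERVER / san_uplink_gbps_per_fi
print(f"LAN {lan_ratio:g}:1, SAN {san_ratio:g}:1")  # LAN 4:1, SAN 8:1
```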

Want 4:1 to both the LAN and SAN core?  Scale back to (6) UCS chassis and (48) servers per Fabric Interconnect, and provision more FC uplinks to the SAN core.  It’s the classic scale vs. bandwidth design trade off applicable to any vendor solution.
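You can also run the trade-off in the other direction: pick the ratio you want and work out how many uplinks it takes. A small Python sketch, with a helper name I made up, counting uplinks per Fabric Interconnect the same way as the 8-chassis example:

```python
import math


def uplinks_needed(servers, gbps_per_server, target_ratio, link_gbps):
    """Smallest number of uplinks that keeps oversubscription at or below target_ratio."""
    demand = servers * gbps_per_server
    return math.ceil(demand / (target_ratio * link_gbps))


servers = 6 * 8  # (6) chassis x 8 servers

print(uplinks_needed(servers, 10, 4, 10))  # 10GE uplinks for 4:1 to the LAN core -> 12
print(uplinks_needed(servers, 4, 4, 4))    # 4G FC uplinks for 4:1 to the SAN core -> 12

# For comparison, holding 64 servers at 4:1 to the SAN core would take this many:
print(uplinks_needed(64, 4, 4, 4))         # -> 16
```

Fewer servers and more FC uplinks (12 instead of 8) is exactly the scale-back described above.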

*Side note: The LAN oversubscription from Server to Core is actually 2:1 with both fabrics available, and 8:1 with one fabric completely offline.  For the sake of discussion let’s just average it out to 4:1.

**Side note: The SAN oversubscription from Server to Core is actually 4:1 with both fabrics available, and 8:1 with one fabric completely offline.

***Side note: The current hardware architecture of Cisco UCS can fit 40 chassis and 320 servers underneath a single pair of 6140 fabric interconnects. However, the number of chassis per fabric interconnect officially supported by Cisco at this time is 20. This number started at 5 and continues to go up with each new major firmware release.  Reaching 40 supported chassis is only a matter of time.

4) “A maximum of 14 UCS chassis can be managed by the UCS manager, which resides in the 6100 Fabric Interconnects. Therefore, this creates islands of management domains, especially if you are planning on managing 40 UCS chassis (320 servers)”

Correction: as of the most recent UCS Manager 1.4 release you can now manage a maximum of 20 chassis.

This one always cracks me up because it tries to say that a single point of management for 14 or even 20 chassis is somehow a BAD thing? LOL! 😀

What’s the alternative?  With HP, IBM, or DELL, (20) chassis is exactly (20) islands of management, and each island has multiple things you need to manage on it (chassis switches and management modules).  What about the LAN and SAN access switches connecting the (20) chassis? Yep, you need to manage those too.

Compare that to the (1) management island per (20) chassis from a single interface and single data set managing settings and policies for all of the servers including LAN & SAN. 😉

5) “The UCS blade servers can only use Cisco NIC cards (Palo)”

This is simply not true.  From the very beginning customers have had the choice of several non-Cisco adapters.  In fact, the Cisco adapter (Palo) wasn’t available for almost a year after the initial release of Cisco UCS.  As of the recent UCS Manager 1.4 release, several more adapters have been added to the portfolio of choices.

The adapters Cisco UCS customers can choose from:

Case closed.

6) “Cisco Palo cards use a proprietary version of interface virtualization and cannot support the open SR-IOV standard”

The Cisco Palo card accomplishes interface virtualization in a way that’s completely transparent to the OS.  This is done through simple standards based PCIe.  There’s nothing proprietary happening here at all.  When installed into the server, the Cisco Palo card appears to the system like a PCIe riser hosting multiple standard PCIe adapters.  In other words, Cisco has effectively obsoleted the need for SR-IOV with the design of the Cisco VIC (Palo).  There’s nothing stopping any other vendor from using the same transparent PCIe based approach to interface virtualization.

With SR-IOV, on the other hand, the OS needs to be SR-IOV aware.  You need to have the proper SR-IOV drivers and extensions loaded, etc.  Why complicate the solution with additional complexity when you can achieve the same goal (interface virtualization) in a way that’s completely transparent to the OS and adapter drivers?  This obviates the need for any additional “standard” layered into the solution.

By the way, there’s nothing preventing you from using an SR-IOV adapter with Cisco UCS.  For example, the new Intel 82599 adapter for UCS supports PCI SIG SR-IOV.  If you want SR-IOV really bad, use that adapter.

OK! That was a good round. Now go ahead and hit me with your best shot in the comments section. Please, keep your comments to one or two concise paragraphs. If you have a whole bunch of stuff to throw at me, break it up into multiple comments if you can.  If you submit a really good one, I’ll promote it into the article content.

For my HP, IBM, and DELL friends out there (you know who you are): guys, there’s no need to submit comments pretending to be a disappointed customer.  Just cite your real name and vendor disclosure and let’s have an honest and forthright discussion like the gentlemen we all are.  No need for games.

Also, please keep in mind that I am employed by Cisco Systems, so I do need to exercise discretion in what I can and cannot say.  I’m sure you can understand.

Cheers & Happy New Year!

Disclaimer:  The views and opinions expressed are those of the author, and not necessarily the views and opinions of the author’s employer.  The author is not an official media spokesperson for Cisco Systems, Inc.  For design guidance that best suits your needs, please consult your local Cisco representative.


  1. says

    Nice collection, Brad. My favourite is the 1st one. It’s so demagogic it always makes me smile :) People who come with this are so serious they make me think they just saw an HP chassis running IBM blades.

    • Sunil says

      Some of your own customers to be, where you have given away an enclosure free might have asked that question. No one working in a decent IT Department will raise that question. We have been selling (I really mean selling) HP Enclosures & Servers all over the world for so long, there is not a single soul who does not know about blade and blade enclosures.

      Just making up some questions so you can find an easy answer is not fun. And then guys like you with pseudo names act like real guys showing interest!!! Cisco must be really spending (wasting) a lot of money on this.

      • Matt says

        Sunil, fyi; the point that Cisco enclosures do not support other vendor blades has been brought up in our shop before. I assure you there are many people out there who see this point (out of context) and count it as a mark against Cisco. That being the case, I have to agree with Brad when he said “…you’re dealing with someone who is either A) joking, or B) has no idea what they’re talking about.”

        Right now, UCS is a game changer for server hosting (and didn’t disappear after a year as a certain executive prophesied in early 2010). 😉 That isn’t to say UCS is the only one out there. HP/IBM/Dell/Vendor XYZ are all innovating and marketing to deal with the same set of issues. IMHO though, it does not bode well for any vendor that spends their cycles releasing press releases bashing their competition (sometimes falsely) rather than actually innovating.

        Disclosure: I am not an employee of Cisco or of one of their partners. I am working in a shop with HP equipment.

        • Pratik says

          I agree with Matt. Any IT manager having common sense knows not to mix vendors inside an integrated IT system. Why would I have a multi-vendor environment when there are so many cons?
          -Interoperability – restricted features or performance compared to native vendor.
          -Support nightmare – no clear ownership of who is responsible if an issue happens
          -Having to manage multiple points of contact for sales/support

  2. Joe Smith says

    Thanks, Brad. Good stuff.

    The SR-IOV dialogue leads to a comment and another question or 2.

    The manner in which Cisco virtualizes the Palo is indeed proprietary. One may argue it is a simple approach and precludes the need for SR-IOV, etc, but it is nonetheless proprietary because it is not based on an open standard. That doesn’t mean that’s a bad thing in and of itself, but it is what it is.

    That having been said, Cisco offers VN-Link technology through software (1000v) or through hardware using VN-Tag. And VN-Link is supported by both Nexus ToRs and the UCS FIC. Before I ask my question, is all this correct?

    So, what if a shop decides that they do not want to take the Cisco proprietary UCS VIC-VN-Link approach but instead use the open standards SR-IOV-VEPA approach? For example, say a client would like to use a QLogic SR-IOV-enabled CNA and then leverage VEPA (when it’s ratified) to map an SR-IOV VF to a vEth port on the adjacent bridge (or FIC, as in the case of UCS). Will Cisco’s UCS allow this design approach? Moreover, will the FIC and the Nexus switches support VEPA?

    And this is a legitimate concern, because another criticism, which I forgot to list before, is that the Cisco UCS system necessitates the removal of any existing ToR switch – even if it’s a Cisco Nexus. Now this is true whether the Nexus will support VEPA in the future or not, but if a client chooses another vendor for their blade chassis solution, it certainly would be nice for them to be able to keep their Nexus ToR in place and simply upgrade it to support VEPA.

    • says

      When it comes to network interface virtualization (NIV) there are two main problems to solve:
      1) how the server views the single physical adapter as multiple virtual adapters.
      2) how the upstream network connects each virtual adapter to a unique virtual network interface.

      As for #1, the Cisco VIC uses *standards based* PCIe technology to do this. This is exactly why the server, OS, and adapter drivers do not need any special awareness of NIV — The standard interface with the PCIe bus is all that is needed. Again, there is nothing proprietary. Nothing is stopping any other vendor from taking a similar approach.

      As for #2, the two main approaches to this problem are VN-Tag, and VEPA, both of which are *standards based* — 802.1Qbh is the standard based on VN-Tag, and 802.1Qbg is the standard based on VEPA.

      Both Cisco and HP have stated they will support both 802.1Qbh and 802.1Qbg in the products where it makes sense, when the standards are final.

  3. Doron Chosnek says

    That’s a great collection, Brad. Another common FUD item I
    hear is that UCS requires a forklift upgrade to Nexus and MDS
    throughout your datacenter. This is unequivocally incorrect. The
    UCS 6120/6140 interface to any IEEE compliant LAN and any ANSI T11
    compliant SAN. Cisco has added features to UCS that enable it to
    work *better* with other Cisco products (example: FC port
    channeling and VSAN trunking added in UCS 1.4) by taking advantage
    of cool Cisco features… but customers are not *required* to use
    those features. Disclaimer: I work for Cisco.

    • says

      Doron, UCS may not require a Nexus upgrade, but it may very well require a Nexus rip-out-and-replace. For example, organizations that have IBM, Dell, or HP blade chassis deployed in their data center are able to use ToR access switches from a variety of vendors. The Dell m1000e blade chassis, for example, can be uplinked to a Dell, Brocade, Juniper or Cisco ToR switch, including the Nexus 5000. That’s not so with the UCS that requires the 6120/6140 at the ToR. So to migrate to UCS, any existing ToR switches, including any N5K, will have to go.

      The effect of deploying UCS is less disruptive in environments that leverage an EoR architecture, where the EoR switch can be redeployed as a core switch.

      • says

        Cisco UCS is more than just chassis with servers linked to some networking stuff; that’s the old school HP/IBM/DELL approach. If you think of the UCS architecture as just more of the same, no different than the others, it’s typical to ask questions like: “Why can’t I link the UCS chassis to my same old top of rack switch?”.

        Cisco UCS is a new architecture of network and computing in a single integrated system. Applying the same old thinking to a new architecture will get you confused pretty quickly. For example, linking the UCS chassis to the same old top of rack switch (a separate system) would negate many of the automated provisioning and single point of management benefits the unified architecture provides. Most customers understand this and look to deploy UCS as a new pod of servers and network, linked up to their existing LAN and SAN core as Doron pointed out.

        Having said that, linking the Cisco UCS 6120/40 fabric interconnects to a new or existing Nexus 5000 or other 10GE top of rack switch can certainly be done, and some customers do this today for migration purposes or other reasons.

        • says


          I can appreciate exactly what the UCS is and I am not confused at all. For one, in my example, I was not referring to the case of linking the UCS to the “same old top of rack switch.” or some generic “networking stuff.” A Nexus 5000 that is deployed as a ToR is hardly an unexciting, legacy switch, yet it would have to be ripped out and replaced to deploy the UCS architecture. Brocade’s brand new, revolutionary VDX switch would also not work with the UCS. It is what it is, no matter what kind of spin you want to put on it.

          By the way, I also don’t see the value in needlessly adding another layer of switching just to prevent the situation in which the N5K would have to be decommissioned.

          All that having been said, I am not knocking the UCS. I can appreciate its innovation and the fact that Cisco, as usual, has beaten the standards bodies to the punch. It’s an impressive foray into the world of server networking.

          • says


            I don’t think you are understanding the UCS architecture. The fabric interconnects (FI) are the core (brains) of the UCS system, and they are much more than just a switch. Everything that is UCS is contained within the FIs. I think you’re thinking of old blade systems where all of the intelligence is bundled within the chassis, and the chassis is connected to an upstream switch. For UCS, the intelligence is located within the FI’s, and the FI’s combined with the Chassis hardware form the complete UCS system. So, the FI’s connect to your existing ToR, EoR, etc. network infrastructure so you are free to use Brocade’s VDX, Nexus 5K, etc.


          • Scott Wilson says

            Joe – stop thinking about the Fabric Interconnects replacing your ToR switches. The Fabric Interconnects (FI) should be considered *PART* of the UCS platform. You can’t use them without UCS, and you can’t use UCS without them. They just happen to be an element that sits outside the chassis, if you will, to provide the architecture some scalability. All the UCS chassis that sit under the FI are really viewed as one manageable *pod.*

            The FI then require connectivity into the rest of your domain. That could be a Nexus product. It could be any old ToR switch (although I’d question the logic of that b/c they probably don’t have the power to legitimately handle the throughput), or it could be the fancy Brocade switch you mention. Whatever you want.

            Part of the benefit of the UCS platform is that it allows you to GET RID OF all those ToR switches in the first place! You needed them in the past to handle all your discrete rackmount servers. If those are now blades and VMs consolidated to and running on UCS, you no longer need them. You can keep them if you want, but to what purpose?

            (Disclaimer – I work for Cisco)

          • JoseA says

            This is an old post, but I think something is completely muddled by Joe’s post. The FI’s are not “switches” but more of a management and IO aggregation point for the UCS. What your blade switch and FC IO modules in the chassis do, the FI’s do, and much much more. There are uplinks that come from the blade chassis that are basically dumb IO uplinks and then go to the FI’s. The FI’s then go to your SAN and LAN switches. What makes this approach so great IMO is that you can mix and match according to your IO profile. Have a VMAX and need 128gb of total FC throughput? Dedicate 8 ports per FI to Fibre Channel. Need only 40gb of TCP/IP throughput? 2 ports per FI in LACP and you’re good. If you need to run 20gb to your DMZ, 40gb to your core, and 20gb to test dev, you can separate all that traffic, dedicate the ports to the networks you want/need, set your VLANs, and off to the races. This is a better system than any other blade product I’ve ever worked with. More importantly, from that point forward provisioning a new system to those resources is just a few clicks.

  4. says


    I fully understand everything you just wrote. I understand that the FIs are not traditional switches and that they are indeed the brains of the outfit and that the UCSM resides there, etc…etc. That is precisely my point. With the FI at the ToR, what do you do with the existing ToR (Brocade, Juniper, Nexus 5K, etc)? You rip it out. Unless you’re suggesting that we have 2 ToR appliances/switches: an FI and a Brocade… or Nexus… or whatever. That would be preposterous. Why in Sam Hill would I do that? I wouldn’t. Again, I don’t think this is necessarily a show-stopper since many data centers are still leveraging EoR designs at the access layer that they can repurpose if they choose to deploy UCS.

    By the way, that brings up another point I’m not too crazy about: the fact that the UCS chassis does not provide distributed intelligence, which forces intra-chassis traffic to get forwarded to the FEX, then up to the FI, and then right back down again. The reason is the FEX, of course, which basically uses a pinning methodology to forward traffic out its uplinks. The UCS’s I/O is somewhat rigid, as opposed to, say, the Dell m1000e, which offers 3 redundant fabrics that can support 1G/10G Ethernet, FCoE, 2/4/8G FC, 1G/10G Ethernet pass-through, and Infiniband.

    As I said earlier, overall, the UCS represents an innovative approach to data center server networking, mostly because of the UCSM and VN-Link, which allows the administrator to manage VM policies with much more flexibility and uniformity.

    • says


      Just a quick response to your comment:

      UCS chassis does not provide distributed intelligence, which forces intra-chassis traffic to get forwarded to the FEX, then up to the FI, and then right back down again.

      This again goes back to the old school thinking of seeing the blade chassis as a discrete object with intelligence to manage. UCS takes a completely different approach: the chassis in UCS is really just sheet metal providing a convenience for racking and powering up the blades. And the FEX provides all the servers in all the chassis with the logical architecture of being wired to one switch, just like the familiar design of rack mount servers wired to a big end of row switch (except without all the cabling mess). This is how you can add more servers and more chassis while keeping the management effort a constant.

      Localizing intra-chassis traffic might be important if your data center is nothing more than (1) chassis, and there isn’t much bandwidth between your chassis and the network, but that’s simply not the case anymore. Once you add a second chassis you now have inter-chassis traffic that has a different performance profile than the intra-chassis traffic. That’s not the case with UCS. Whether you have (1) chassis, (2) chassis, or (20) chassis, all traffic patterns are consistent and predictable. This makes it easier to keep a consistent SLA in a virtual data center with VM’s moving from one chassis to the next. With UCS, the location of the workload is of little significance.

      By the way, Joe, thanks for all the great comments. And most of all, THANK YOU for inspiring the content of this article!


      • Jeff says

        Wow. I keep reading from the Cisco employees that, “this again goes back to old school thinking”. The fact that two servers in the same chassis must go up to the FI and back down again in order to communicate is “old school” in my book. The latency associated with this and the additional inter-chassis traffic placed on the 6140 will cause performance issues for high traffic volume installations.

        Add to that the fact that the uplinks from the 6140 to the core network are significantly fewer than the links from the chassis to the 6140s. We call that a “bottleneck” in old school language.

        Then add the fact that the XEPs only have 3 qos queues – drop/no-drop/control, compared to the 8 queues on the server and on the 6140, which means there is NO mapping available for critical applications as they go back and forth from one server on the same chassis to a different server on the same chassis. Such traffic is either put in the “no-drop” queue in the FEX (along with FCoE and iSCSI) to fight for its life, or it gets put in the “drop” queue to die when necessary. This “old school” approach to not providing consistent QoS mappings hails back to the ToS/IP Precedence/DSCP wars of yesteryear.

        Finally, when one of the 6100’s fails, you get to double up your bandwidth until it is replaced.

        Really? UCS is all about bottlenecks, poorly implemented QoS, failover to only one available path?

        That, gentlemen, is OLD SCHOOL!

        • JoseA says

          First let me start out saying I know these are old posts but I’ve been here a dozen times and every time I get annoyed reading some of the comments. The FUD that persists on the internet for years is IMO one of the downsides of the persistent nature of the internet. I don’t work for Cisco I’m simply an engineer that works with pretty much every blade infrastructure on the market on a pretty regular basis. In the interest of full disclosure I love UCS because it’s awesome to work with.

          So have you ever measured the latency from the FEX to the FI and back down? You say it can cause performance issues for high demand applications, I’ve not seen that in real-world scenarios ever. If there is a specific case of this measured latency I’d love to see it because I’ve deployed UCS and many other blade chassis and simply never seen this with the UCS. Secondly the “bottleneck” you speak of is one you decide on. If you want 1:1 bandwidth it’s possible. I’ve deployed very large database applications on UCS and never seen what you describe. If I know the IO needs of my server are 40GB I can account for that in the design and implementation of the UCS. You have 40 ports to mix between different traffic types. FEX to FI, FI to core, and FI to SAN. If I want to deploy 4 double-width blades I can use 8 port IOM’s to get a total of 16 10GB unified ports to the FI, and I still have 32 on each FI to assign as ethernet uplinks or FC uplinks, 64 in total. I can scale this up or out as I see fit, I don’t know of another blade chassis that can do this. I would argue at this point if your application demands more than 40gbps of total IO throughput per blade or hits “bottlenecks” as you like to call them in the “old school” at 40gbps of total IO then you’re simply not doing it right or need to be running it on specialized hardware.

    • Scott Wilson says

      I guess I’m confused on why the commitment to the ToR switches. They’re no longer needed. If you want to keep them, keep them, but why? Isn’t that typically a good thing? I mean, one of the things people were thrilled about with VMs is that they could consolidate numerous different servers onto fewer, more powerful servers. They didn’t get upset that they then needed to get rid of all those old servers. They practically danced and skipped their way to the recycling center. The same concept holds true for the ToR switches…

  5. Rob says

    Great post and I agree with you on all your rebuttals to the FUD out there about the UCS. However, the reason my company doesn’t use the UCS and still sticks with HP was not on the list.

    The reason we don’t use it and won’t look at it is because it is a relatively new product line offered by Cisco. Cisco is new to the blade/server market…minus the switches they make for them. HP is a server company and they make money selling servers. It is their bread and butter. Cisco on the other hand is a router and switch company.

    No doubt that the UCS is a good product. BUT…if Cisco doesn’t make enough money…if they get new leadership…if they get too big (overreach)…if they fall into tough economic times…what are they gonna do? They’re going to drop the extra side business and do what they do best… switches and routers. They’ll refocus their R&D, money, efforts, time…what have you on their bread and butter.

    It happens all the time with companies. Unfortunately you can’t predict the future. 10 years from now Cisco may be still selling UCS and we’ll be wrong…but I bet HP will still be selling servers and that is the reason we’re sticking with them.

    I wish Cisco still made switches for the back of the c7000 to keep our networking team happy, but since they don’t, we have to use the FlexFabric…and I think you made a great case in your other posts as to why the Cisco solution is technically better…it’s just not the right one for us.

    Now there are reasons as to why we choose HP blades over IBM and Dell… it’s because they make better servers with better management products. That is quantifiable. We have IBM in our shop…due to vendor mandates (we’re healthcare) and they have so many more problems than our HP equipment.

    • Scott Wilson says

      I think that that is a valid argument to make. However, at one point even HP was considered something else. One might argue it’s a printer company or an ink cartridge company :)

      But if you look at the financials related to Cisco and the UCS platform, I think you’ll notice that it has already gained critical mass. Cisco is no more likely to walk away from this market at this juncture than it is likely to walk away from routers. It’s a new (less than 5 years) endeavor for Cisco, but it’s been over a year now and the customer adoption and run rates have been astounding. UCS is past the “what if they don’t sell it” hump. And during the last economic “slow down” which most people agree was pretty brutal – that’s in the middle of when Cisco launched UCS. You couldn’t have picked a worse time to do it – and at the same time you couldn’t have picked a better time because it showed that Cisco has a powerful offering that is taking market share from other players and can grow even in down cycles. It’s a safe bet.

  6. Craig Bruenderman says

    I’m an SE with a Gold Partner in Louisville, KY, focusing on the data center portfolio. I love UCS as much as the next guy. This isn’t so much a criticism as an open question. I’ve heard various marketing and technical folks make the claim that UCS’s flexibility makes it a good fit for virtual, bare-metal, and HPC environments. I think the first two claims are pretty well supported, but I spent several years in academic HPC and I don’t see how a system without the ability to run low-latency message-passing interconnects for MPI/PVM (like IB, Myrinet, Quadrics, etc.) can be an HPC contender. Further, I’ve not heard anyone provide any details as to an actual use case for UCS in typical HPC applications.

    Cisco’s bread and butter is enterprise computing, not necessarily research computing, although many areas overlap. Enterprise differs from research in many ways, though; the problems are simply different. Just because we do a bang-up job running NLB for Exchange doesn’t mean we’re well suited for Gaussian elimination. If we’re going to be spouting off about how UCS is appropriate for HPC, we should be able to reference a typical application deployment, computational chemistry for instance, and a customer. To my knowledge, doesn’t contain a single system based on UCS, which is pretty telling of the HPC story.

    Can anyone shed more light on this (or does anyone care :) ) ? Am I missing something? Is 10GbE actually being used in place of IB these days? I can’t imagine how.

    PS, this only pertains to B-Series. I’m sure a 2000 node cluster of C210s with PCIe IB cards would sequence genes nicely.

    • Ian Erikson says


      As someone who has implemented InfiniBand and 10GbE clusters and blade chassis, I think I can offer some perspective. I think each may have its role for now, but Ethernet is more versatile IMO. I also don’t think I would look first and only to a UCS B-Series for an HPC cluster.

      I think a good point of comparison for IB is Ethernet implemented with 1/3/5m twinax SFP+ cables. The 1m and 3m cables can be very cheap (<$100), and are very thin and flexible compared to IB cables. I have worked with IB cables; they can be bulky and difficult to manage, not to say it can’t be done. The latency on a twinax SFP+ is “0.1 μs for Twinax with SFP+ versus 1.5 to 2.5 μs for current 10GBASE-T specification.” It also uses much less power than traditional twisted-pair Ethernet. I feel that perhaps IB is the better stack for speed, but you will find way more applications with Ethernet. There is some good comparison discussion here

      • Jim H says

        I wish they had an edit function as well :-) I looked at the latest results from a recent test (I will remain H/W neutral): using RoCE as opposed to iWARP, a large financial company was able to achieve 0.8 usec across 10Gb Ethernet. In addition, the edge “transceiver” also has support for 40Gb at the same latency. Obviously this is closing the gap, as typical IB latencies are higher than 0.1 usec; they “typically” hover around 0.5 to 2.0 usec, very dependent on the NIC, i.e. myriad vs. others (again trying to remain product agnostic). Sorry for the hijack/off-topic, I just wanted to provide additional data for Ian.

    • Jim H says

      I would like to hear Brad’s response on HPC; this is an excellent question. I do understand that HPC and ULL are very niche, however I work specifically with market data, HPT, and other various financial market technologies. This may be more of a switch discussion (Voltaire / Arista / Cisco Nexus), however I would like to understand Cisco’s take on where UCS plays in HPC and ULL; to be candid, HPC and ULL are typically a market that Cisco has not focused on nor had a deep understanding of. Finally, the Palo card, as I understood it, did reference an I/O and RDMA offload capability, which is similar to what myriad did for HPC environments. I wonder if GPU scaling is also a future plan for UCS, again a technology specifically designed for HPC and ULL architecture. I do not work for any hardware vendor at all, however I am often involved in the final decisions for large infrastructure solution purchases, as we focus on the architecture and design of ULL solutions.

  7. says

    Craig, I can’t speak for what Cisco’s vision is for UCS and HPC applications, but I do think that the future of IB is in danger and the only IB switch maker, Voltaire, knows it. This is why they are developing 10G Ethernet switches that promise to import some IB-specific technologies, like RDMA (RoCEE), as well as IB’s flat topology into CEE Ethernet. I think they recognize that Ethernet is the hands-down winner in the data center, given its ubiquity and plug-and-play familiarity, and that Ethernet has been enhanced and evolved to a point where it can benefit from some of IB’s innovations.

    This is not unlike Brocade that has adapted FC technology to operate in an Ethernet environment. With their new VDX switches and Virtual Cluster Switching technology, they have imported the goodness of FC (intelligent fabrics, auto configuration, flat topology) into an enhanced Ethernet network.

    Just my 2 cents…

    • Jim H says

      I could not agree more regarding your take on the future of IB. In fact, I have seen (customer-driven) test reports that indicate Arista as a market leader for ULL Ethernet switching that rivals current IB switching vendors like Voltaire in the typical use case. However, to date, in my humble opinion, IB can still be tweaked and optimized to outperform an Ethernet architecture (again, tweaked being the key word). This gap is closing quickly, though, and I fully expect IB to be a thing of the past by the end of 2012. This however is off topic, as ULL switching is not exactly part of the FUD for UCS :-) but I would love to trade notes, as so few engineers understand the convolutions of HPC and ULL switching needs.

  8. says

    I wish this blog had an edit feature :-(

    I meant that Voltaire is the main IB switch manufacturer, not the only. I believe they do indeed have the largest market share.

  9. Dani says

    The only thing I cannot understand is why, on a UCS system, I cannot remove a VLAN from an uplink (please tell me if I’m wrong).
    In some circumstances I do need to do it. It may be useful in switch mode, but I think also in EHM.
    I don’t know if I’m wrong, but maybe, without trunking all VLANs on all interfaces, we could avoid the packet loss that occurs when EHM has several uplinks to different physical infrastructures.

    Thanks a lot for the good work!

    • Ian Erikson says

      Why wouldn’t you want to trunk every VLAN you are going to use with the chassis on every uplink of your UCS system? Brad has some great uplink scenario videos, but perhaps you are talking about wanting to join 2 separate L2 networks in EHM?

      • Dani says

        Hi Ian,

        With separate L2 networks in switch mode, trunking all VLANs could cause an STP issue if someone on the other end also trunks all VLANs and a VLAN overlaps across the two separate infrastructures.

      • Cy says

        I can think of one reason why you wouldn’t trunk every VLAN to your upstream. The Heartbeat of an Oracle RAC cluster.

    • says

      Filtering VLANs on the uplinks in EHM would not fix the problem with upstream disjointed L2 domains. The problem stems from the FI arbitrarily choosing 1 uplink to be a broadcast listener. This behavior will be enhanced in the next software release to be more aware of disjointed L2 domains.

  10. Cy says

    A big reason we are moving away from traditional blade chassis to UCS is that I don’t have to install any management agents at all to get a consolidated view of our high-density environment. Add the high-density B230 blade and I can replace entire Dell m1000e chassis with a few blades, get a massive improvement to networking (design and traffic consistency), and dramatically reduce the amount of cabling required for the same bandwidth, and it is stateless hardware.

    One of the Cisco guys we deal with notes at regular intervals: just look at the back of the UCS chassis, it’s cable porn!

    • says

      Exactly. Great feedback. The others, IBM/Dell/HP, ask you to build a house of cards of management agents supporting multiple management software stacks. Compare that to UCS, where comprehensive management is literally built in to the solution out of the box.


  11. anitha says


    Good Explanations !!

    Can you share some insight on the virtual VLAN port vs. VLAN count limitation in UCS? It looks like even with support for 1000 VLANs, the virtual VLAN port limit is only 6000.

    Assume I have a Palo adapter and am trying to deploy a cloud solution using VMware Standard edition. If my customer VM definition has 4 NICs on it, all on the same VLAN ID but different subnets, then I consume 4 virtual VLAN ports. Now if I want to preconfigure all customer VLAN IDs, then I hit the limit.

    As per the release notes below, VLAN virtual port limit in release 1.4(1) is 6000.

    The Release Notes for Cisco UCS Software, Release 1.4(1) says, “The VLAN virtual port limit in release 1.4(1) is 6000. When this system wide limit is reached, if you try to add more VLANs through service profiles the service profile association will fail. In a layer 2 topology, logical instances of a VLAN are created on each interface for each active VLAN. These instances are referred to as Spanning-tree active logical ports or virtual ports. For example if you define 900 VLANs and 2 uplinks, this would consume 1800 of these virtual VLAN ports.”

    How are other cloud providers scaling VLANs?
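
The counting rule in the quoted release note (one logical instance per active VLAN per interface) can be sketched in a few lines. This is a simplified model built only from that quote; treating each vNIC as one more interface, as the comment above describes, is an assumption, and the function names are hypothetical.

```python
VIRTUAL_PORT_LIMIT = 6000  # system-wide limit quoted from the 1.4(1) release notes


def virtual_ports(vlans_per_interface):
    """Sum VLAN instances across interfaces: one instance per VLAN per interface."""
    return sum(vlans_per_interface)


def headroom(vlans_per_interface, limit=VIRTUAL_PORT_LIMIT):
    """Virtual VLAN ports still available before the system-wide limit is hit."""
    return limit - virtual_ports(vlans_per_interface)


if __name__ == "__main__":
    uplinks = [900, 900]               # 900 VLANs trunked on each of 2 uplinks
    print(virtual_ports(uplinks))      # 1800, matching the release-note example
    print(headroom(uplinks))           # 4200
    # Assumption: four vNICs that each carry the same single VLAN add 4 more.
    print(virtual_ports(uplinks + [1, 1, 1, 1]))  # 1804
```

The point of the model is that the limit is consumed by VLAN-interface pairs, not by distinct VLAN IDs, which is why preconfiguring every customer VLAN on every interface runs out quickly.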

  12. tom says

    I have a point of confusion and would appreciate it if you could help me clear it up.
    I have a half-width blade with one Menlo card in it. If I use the simple wizard, I can configure failover, which means that one port of the Menlo card goes to fabric A and is active (I configure it that way) and the second 10G port of the Menlo card is backup (failover). That means I am only using 10 Gig of the 20 Gig available from the Menlo mezzanine card.

    Now if I use the expert wizard, it allows me to set up the interfaces as trunks and I can use both 10G ports of the mezzanine card, which gives me 20 Gig on every server but no failover. Is this correct?

    Also, if I get 20 Gig on every server, I will have an oversubscription of 2:1 for every server in the chassis, as I have 8 servers and 80 Gig of uplink bandwidth. If every server uses 20 Gig without failover, I get 160 Gig but I only have 80 Gig going to the interconnect. How does Cisco address this issue?

    • says

      You can use both 10G ports with fabric failover. You would simply have (2) vNICs (or more) with fabric failover enabled, each using a separate fabric for its primary path.
      In a situation where you have congestion at the chassis uplinks, the chassis IO module will send PFC pause messages back to the adapters, which will cause the server adapter to slow its sending rate until the congestion is cleared.
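
The oversubscription arithmetic in tom's question can be written out as a quick check. This is only a sketch of the ratio using the figures from his comment (8 servers, 10G or 20G per server, 80G of chassis uplink); the function name is hypothetical.

```python
from fractions import Fraction


def oversubscription(servers, gbps_per_server, uplink_gbps):
    """Ratio of potential server demand to chassis uplink capacity."""
    return Fraction(servers * gbps_per_server, uplink_gbps)


if __name__ == "__main__":
    both_ports = oversubscription(8, 20, 80)   # both 10G adapter ports active
    one_port = oversubscription(8, 10, 80)     # one active port per server
    print(f"{both_ports.numerator}:{both_ports.denominator}")  # 2:1
    print(f"{one_port.numerator}:{one_port.denominator}")      # 1:1
```

The 2:1 figure is a worst case in which every server drives both ports at line rate at the same moment; the PFC behavior described in the reply is what handles the moments when demand actually exceeds the uplinks.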


  13. vicl2010v2 says

    Now that you are no longer at Cisco, maybe you can answer this (and this is relevant to DELL use of memory in high memory loading situations as well).

    How would you compare the large memory loading problem as addressed by Cisco UCS vs. LRDIMMs (currently Inphi is the only one sourcing the buffer chipsets)?

    As described in the thread below, CSCO UCS seems to allow more DIMMs to be used, but then the latency penalty is an additional 6 ns.
    Compare that with the 5 ns latency penalty with LRDIMMs.
    Re: Let the truth be revealed on IPHI CC .. CSCO UCS and LRDIMMs 12 second(s) ago

    Also if you have any insight into how Netlist’s HyperCloud memory can make inroads into this area.

    NLST’s HyperCloud memory has the SAME 1 cycle latency as RDIMMs. (Its “load reduction” and “rank multiplication” are allegedly being infringed by Inphi, and by Google for that matter, since Google runs its own memory division, it seems.)

    HyperCloud in addition is interoperable with other memory (LRDIMM can only be used with LRDIMM), and it requires no BIOS modification (unlike LRDIMM which does require that).

    Do you think people are only just realizing what LRDIMMs will actually deliver (5 ns additional latency)? And why is Intel so intent on pushing LRDIMMs when HyperCloud is available (probably on RAND terms to JEDEC)?

  14. vicl2010v2 says

    Here is a slightly more technical examination for why LRDIMM has such high latency (their use of a 628-pin chip – with possibly asymmetric data lines):
    LR-DIMM Significant Drawbacks relative to HyperCloud 30-Sep-11 01:49 am
    Currently Intel is pushing LRDIMMs; previously they were pushing MetaRAM (don’t know if people here remember them, the outfit of Fred Weber, the former AMD CTO). MetaRAM eventually conceded to NLST (NLST vs. MetaRAM) just prior to going bankrupt. MetaRAM even said in court docs that they had destroyed the infringing hardware (!).

    At DDR4’s higher speeds, there may be serious issues getting LRDIMMs to perform well.

  15. Paul Riker says

    Question – has anyone else had their whole data center taken down by UCS? I have seen this twice, at two different companies. Same problem both times: the UCS system has “bad” firmware, the UCS systems take a power hit, and the UCS blades do not come back online correctly. I believe this is a major flaw in the UCS system. Sure, power should never go out, but &(&#@ happens, and when it does, watch out. If you have VMware in place, you would be very wise to have some other vendor’s systems in operation.

    Also – in a co-lo – for the amount of power we were given, we could run either 3 UCS chassis or 2 HP c7000s. The difference: 24 UCS blades vs 32 HP blades. So, if space is tight, I will take HP’s 33% advantage any day.
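
The 33% figure in Paul's comment follows from simple blade counts, sketched below. The per-chassis blade counts (8 and 16) are inferred here from his totals of 24 and 32 and are an assumption; his power budget itself is not stated, so the sketch only reproduces the ratio.

```python
def blades(chassis_count, blades_per_chassis):
    """Total blades that fit in the given number of chassis."""
    return chassis_count * blades_per_chassis


def advantage_pct(a, b):
    """Percentage by which a exceeds b."""
    return (a - b) / b * 100


if __name__ == "__main__":
    ucs = blades(3, 8)    # 3 UCS chassis, assumed 8 blades each
    hp = blades(2, 16)    # 2 HP c7000 chassis, assumed 16 blades each
    print(ucs, hp)
    print(f"{advantage_pct(hp, ucs):.1f}%")
```

Note that this compares blade counts within a fixed power envelope, not compute capacity per blade, which the comment does not address.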

  16. vicl2010v2 says

    Check out this post for a more detailed examination of LRDIMMs and comparison with HyperCloud for the upcoming Romley platform rollout (Spring 2012):
    01/13/2012 at 7:33 am

    Title: High memory loading LRDIMMs on Romley – An Introduction to Next-Gen Memory for Romley
    Date: January 10, 2012

    See the section entitled “Inphi LRDIMM vs. Netlist HyperCloud:” for NLST’s PR on the direct comparison of LRDIMM with HyperCloud.

    See the section entitled “Netlist and Cisco UCS:” for NLST comparison with Cisco UCS (regarding the memory latency).

  17. eric bar says

    I don’t know much about UCS and just googled it to learn a bit. I read this bit with interest:

    “This one always cracks me up because it somehow tries to say that a single point of management for 14 or even 20 chassis is somehow a BAD thing?”

    To my mind anything that is a “single point” is not enterprise worthy. What happens when/if this goes down? You won’t be able to manage your environment? Or have I misread this?

    I also try to avoid SPOF (single points of failure) when I design solutions.


    • Andy says

      Eric, UCS has a single point of management, but this isn’t a single point of failure. You manage the UCS chassis from the Fabric Interconnects, which are deployed in an HA clustered pair. When managing the UCS environment you open a connection to a shared VIP. One of the FIs owns this VIP until there is either a failure or a manual cluster failover. If the primary FI (primary for management) were to fail, then the secondary FI would detect this and take over the shared VIP, maintaining management access.

  18. Nico says

    Some of this is out of date. Palo is not only already released, but Intel, Emulex, QLogic, and Cisco all have a version of DCE/FCoE cards. Cisco does follow the standards; if you have a chance take a look at Cisco assisted in co-authoring many of the 10 gig Ethernet standards. As a matter of fact, Cisco came out with it before everyone else. Brocade bought McData and Foundry to maintain its competitive edge. Juniper followed suit not long after Brocade did. Much of this is easily found on the internet among many of the news sources on technology: theStork, Yahoo, MSNBC, etc. Definitely not telling you anything you don’t already know.

  19. vicl2012v says

    I will be posting a couple of articles on the choices for the new Romley series of servers – for virtualization.

    They will mostly concern high loading memory (for virtualization etc.) at 3 DPC, 2 DPC on modern servers, and the limitations.

    For now, here is an article on memory choices for the HP DL360p and DL380p virtualization servers.

    I’ll get to the IBM System x3630 M4 server shortly.
    May 24, 2012
    Installing memory on 2-socket servers – memory mathematics

    For HP:
    May 24, 2012
    Memory options for the HP DL360p and DL380p servers – 16GB memory modules
    May 24, 2012
    Memory options for the HP DL360p and DL380p servers – 32GB memory modules

  20. Bill says

    Ok, I’m new to UCS and ask the question, why would I buy UCS? I’m an IT guy who looks after everything (server/storage/networking). At no time in my life have I ever needed formal training, but now with UCS, I do. And I’ve just finished the official course and I’m still blown away by how hard it is to do simple things. The management interface is atrocious, and there is nothing intuitive about any step. Every other person on my course agreed. This is a game changer: it’s now going to be really hard to do simple things, so why bother? Just stick with your existing vendor and spend your time on things that count.

    • Joseph Greco says

      I’m also an IT guy who takes care of a full datacenter. When we first purchased UCS, it came with QuickStart training. I was able to get the system up and running before Cisco set foot in the door. The training just clarified some stuff and made it look nice. The Cisco UCS management system is GUI based, and with a couple of quick searches on the net any IT admin should be able to get the system up and running in a couple of hours. We replaced all our Dell and IBM servers with UCS. Just the cable saving is unreal; the backs of my racks are clean and uncluttered. Since we have had the system we have had just 2 HW failures, and both were on PSUs. Since the system has quad PSUs, we had 0 downtime.

    • Ben says

      Bill, I couldn’t agree more. I work for a storage company and the complexity of UCS is unreal; layer upon layer of virtualization.

      What UCS customers need to ask: How do you troubleshoot?

      It’s pretty tough; you have to go to Cisco support (unlike when you have a non-branded 1U PC in a rack..). Cisco has UCS expertise.. less Fabric Interconnect expertise, and virtually no expertise with Fibre Channel. God forbid your $500,000 array, based on tried and true Fibre Channel, suffers a performance issue with UCS as the initiator.

      I recommend customers ask tough questions about UCS support; ask about the support issues with NPV through the Fabric Interconnect, and how all of the virtualization and proprietary tech performs as all of these servers are funneled into fewer fabric ports. How do you follow the path? How do you readily check for physical layer errors?

      Once you see how Brocade does it, it becomes painful working with Cisco. One word: porterrshow. OK, one more: portlogdump. Cisco, get with it, FC customers still exist and FC is not IP.

      I’m here trying to help yet another UCS customer with performance problems. Always performance problems with UCS, and there are those who think 20 ms is actually good; I’m supporting SSD-based devices. UCS has a helluva time driving an FC SSD-based array. UCS customers get 10 ms, at best, while all other customers routinely achieve and maintain < 1 ms latency with outstanding bandwidth.

      Sure, oversubscription is not a problem with low end iSCSI attached disk arrays… I'll give you that. Indeed, UCS problems must be masked to a great degree with low end storage.

      Regarding this blog post, UCS is oversubscribed in my opinion; how many actual 8Gb fabric ports does UCS ever actually get access to? Mind you, many Cisco customers still use 4Gb pipes for FC (older MDS switches are a bargain). If you do go UCS, and want to connect SSD storage, opt for a 16Gb Brocade switch, or get a 16Gb Cisco switch. It will help with the UCS NPV density.

      Are all storage people as frustrated with Cisco? In my company we are. I sure would like to hear from other storage people supporting UCS.

  21. Alex says

    Is there any benefit in running 30x UCS rack-mount servers (C220 M3) through the Cisco Fabric Interconnects and then to a pair of Nexus switches, instead of just configuring Intel 10GbE NICs on each host and connecting to a Brocade 10GbE VCS Fabric?

  22. Alex says

    To add more context: I am comparing fabric technologies for a new setup, and I see that on the 1U rack C220 servers I can configure 2x Intel 10GbE copper adapters for a total of 4x 10GbE ports, and the Brocade VCS supports 4 links in their multi-chassis vLAGs, so I would have 40 Gb of bandwidth per host. I’m not sure if I could achieve the same with the C220 through UCS (and would I still need the Fabric Interconnects?)

    • Chris says

      (Interesting read Brad, thanks)

      Alex, I am also configuring up a C-based solution atm. I am going to use the Cisco VIC card and go straight to Nexus 5k (no 6200’s). (Also running the 1000V essentials on VMware). This should give me Adapter-FEX from the VIC card, and can use vPC across the pair of N5Ks to get redundant uplinks from the ESXi hosts.

      Maybe Brad can correct me here if I’m wrong, but the main advantage I can see with C-series and 6200s is the UCS management side of things.

      I am also going to be connecting up ESXi hosts running Enterprise edition with standard vSwitch. I am hoping not to run into any problems with vPC here.


      • says

        Hey Chris,
        I’m not a big fan of running vPC down to the ESXi hosts. It doesn’t provide much in the way of added value when you look at the complexity it adds, and the simple redundancy options that are already built in to the vSwitch.

  23. says

    Hi Brad,

    We have 3 FlexPods in our environment, and in one of them memory utilization is above 77%. Can you please help me manage this memory problem?
    Below is the output from the VSM:-
    sh system resources
    Load average: 1 minute: 0.00 5 minutes: 0.00 15 minutes: 0.00
    Processes : 297 total, 1 running
    CPU states : 0.0% user, 0.0% kernel, 100.0% idle
    Memory usage: 2075792K total, 1616704K used, 459088K free
    69724K buffers, 952300K cache

    Waiting for the reply.

    Thanks In Advance
    Abhinav Singh
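
The 77% figure can be reproduced from the pasted output, along with a second figure that leaves out buffers and cache. This is a sketch under one loud assumption: that, as on many Linux-based systems, the "used" value in this output includes buffers and page cache, which the kernel can reclaim. Whether that holds for this VSM is not confirmed by the post. The parser and function names are hypothetical.

```python
import re

# Memory lines copied from the output pasted in the comment above.
SAMPLE = """\
Memory usage: 2075792K total, 1616704K used, 459088K free
69724K buffers, 952300K cache
"""


def parse_memory(text):
    """Extract '<number>K <label>' pairs into a dict of label -> value in K."""
    return {label: int(value) for value, label in re.findall(r"(\d+)K (\w+)", text)}


def utilization(mem):
    """Return (raw used %, used % excluding buffers and cache).

    Assumption: 'used' includes buffers and cache, which are reclaimable.
    """
    raw = mem["used"] / mem["total"] * 100
    reclaimable = mem["buffers"] + mem["cache"]
    adjusted = (mem["used"] - reclaimable) / mem["total"] * 100
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = utilization(parse_memory(SAMPLE))
    print(f"raw: {raw:.1f}%  excluding buffers/cache: {adjusted:.1f}%")
```

If the assumption holds, most of the reported usage is cache rather than process memory, and the idle CPU and zero load average in the same output would be consistent with that reading.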

