One of my readers recently submitted a comment asking me to respond to some criticisms he frequently hears about Cisco UCS. This is a pretty typical request I get from partners and prospective customers, and it’s a list of stuff I’ve seen many times before, so I thought it would be fun to address these and other common criticisms and FUD against Cisco UCS in one consolidated post. We’ll start with the list submitted by the reader and let the discussion continue in the comments section. Sounds like fun, right?
I regularly hear a few specific arguments critiquing the UCS that I would like you to respond to, please.
1. The Cisco UCS system is a totally proprietary and closed system, meaning:
a) the Cisco UCS chassis cannot support other vendor’s blades. For example, you can’t place an HP, IBM or Dell blade in a Cisco UCS 5100 chassis.
b) The Cisco UCS can only be managed by the Cisco UCS manager – no 3rd party management tool can be leveraged.
c) Two Cisco 6100 Fabric Interconnects can indeed support 320 server blades (as Cisco claims), but only with an unreasonable amount of oversubscription. The more accurate number is two 6100s for every four (4) 5100 UCS chassis (32 servers), which will yield a more reasonable oversubscription ratio of 4:1.
d) A maximum of 14 UCS chassis can be managed by the UCS manager, which resides in the 6100 Fabric Interconnects. Therefore, this creates islands of management domains, especially if you are planning on managing 40 UCS chassis (320 servers) with the same pair of Fabric Interconnects.
e) The UCS blade servers can only use Cisco NIC cards (Palo).
f) Cisco Palo cards use a proprietary version of interface virtualization and cannot support the open SR-IOV standard.
I would really appreciate it if you can give us bulleted responses in the usual perspicacious Brad Hedlund fashion.
This is a good list to start with. But before we begin, let’s define what constitutes valid criticism in the context of this discussion.
Criticism: something pointed out as lacking or deficient when compared to what is typically found and expected in other comparable and “acceptable” solutions. For example, if my new commuter car didn’t have anti-lock brakes this would be a valid criticism, as anti-lock brakes are a feature commonly found and expected in most newer commuter cars today. However, if my car didn’t transform into a jet plane and fly with the press of a button, is that a valid criticism? No. This is not a capability typically expected of any automobile. Such a “criticism” is pointless.
OK, let’s get started…
1) “Cisco UCS chassis cannot support other vendor’s blades”
This is one of my favorites. If someone brings this up you know right away you’re dealing with someone who is either A) joking, or B) has no idea what they’re talking about. Anybody who has set foot in a data center in the last 7 years knows that Vendor X’s blade chassis are only populated with Vendor X’s blade servers, and … <GASP> yes! Cisco UCS chassis are only populated with Cisco UCS blades. Shame on Cisco! LOL.
Before the IBM guys jump out of their seats: yes, I am aware that 3rd party blade servers can be made to fit into an IBM blade chassis. While that’s a cute little check box to have on your data sheet, the actual implementation of this is extremely rare. Why? It just doesn’t make any sense to do this, especially with commodity x86 hardware.
When was the last time you saw Vendor X’s blade server in Vendor Y’s blade chassis? Exactly. This is not a valid criticism. Case closed.
2) “Cisco UCS can only be managed by the Cisco UCS manager – no 3rd party management tool can be leveraged.”
If “managed” means the basic baseboard-level management of the blade itself (BIOS settings, firmware, iLO, KVM, virtual media, etc.), in other words, everything needed to get the blade up and functionally booting an OS, then yes, of course this is true, and again it’s no different from the other market-leading vendors. For example, the HP c7000 chassis requires at least one HP management module (HP Onboard Administrator) in every chassis to manage the blades. Furthermore, to aggregate management across multiple c7000 chassis you are required to have HP management software performing that function as well: HP Systems Insight Manager. This is true of the other blade vendors as well (DELL, IBM). You have their management software and modules managing their hardware. So help me understand, how is this a valid criticism against Cisco UCS?
If “managed” means a higher-level capability set such as auditing, provisioning, historical statistics, life cycle management, alerts and monitoring, etc., this is actually where Cisco UCS sets itself apart from the other vendors in being more “open” and ecosystem friendly. Unlike the others, Cisco UCS provides an extremely powerful and open XML API that any 3rd party developer can integrate their solution with. Consider the fact that the UCS Manager GUI is just a browser based front-end to the same XML API that 3rd party developers are interfacing with. It’s entirely possible to provision and manage an entire UCS system with 3rd party software without ever using the UCS Manager GUI.
There are many examples of this open XML API management integration with Cisco UCS, but here are just a few:
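To give a feel for what a 3rd party tool talking to that XML API might look like, here is a minimal Python sketch. It is not taken from any of the products mentioned in this post. The method names (`aaaLogin`, `configResolveClass`), their attributes, the `computeBlade` class and the `/nuova` endpoint path are written from memory of the UCS Manager XML API documentation, so treat them as assumptions and check them against the programmer's guide. The host name and credentials are purely hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET


def login_request(user: str, password: str) -> str:
    """Build the XML body for a login call (assumed method name: aaaLogin)."""
    elem = ET.Element("aaaLogin", inName=user, inPassword=password)
    return ET.tostring(elem, encoding="unicode")


def resolve_class_request(cookie: str, class_id: str) -> str:
    """Build a query for all objects of one class (assumed: configResolveClass)."""
    elem = ET.Element(
        "configResolveClass",
        cookie=cookie,
        classId=class_id,
        inHierarchical="false",
    )
    return ET.tostring(elem, encoding="unicode")


def cookie_from(login_response: str) -> str:
    """Pull the session cookie out of a login response (assumed attribute: outCookie)."""
    return ET.fromstring(login_response).attrib["outCookie"]


def post(url: str, body: str) -> str:
    """POST an XML document and return the response body as text."""
    req = urllib.request.Request(
        url, data=body.encode("utf-8"), headers={"Content-Type": "text/xml"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Usage against a real system (hypothetical host, left commented out):
# url = "https://ucs.example.com/nuova"
# cookie = cookie_from(post(url, login_request("admin", "secret")))
# blades_xml = post(url, resolve_class_request(cookie, "computeBlade"))
```

The point of the sketch is only that the whole exchange is plain XML over HTTP, which is why anything from an enterprise suite to a phone app can drive it.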
- EMC Ionix UIM
- Microsoft: SCOM 2007 Cisco UCS Management Pack
- BMC: BladeLogic Server Automation, ProactiveNet Performance Manager, Cloud Lifecycle Manager
- IBM: Tivoli Monitoring for Cisco UCS (health scripts)
- HP: Operations Manager support for Cisco UCS
- CA Automation Suite for Cisco UCS
- iPhone App for Cisco UCS management
Why isn’t there an iPhone app yet to manage HP BladeSystem, or DELL, or IBM? Answer: Without a consolidated and open API to interface with this would be a tremendously complex effort. Compare that to the Cisco UCS iPhone app that was developed by just one Cisco SE (Tige Phillips) in his spare time!
If an amateur programmer can write an iPhone app to manage Cisco UCS in his spare time, imagine what a team of cloud savvy programmers can accomplish? Example: Check out what newScale is doing with Cisco UCS.
So, as for the claim: “… no 3rd party management tool can be leveraged.”? We can dismiss that one as being totally false.
3) “Two Cisco 6100 Fabric Interconnects can indeed support 320 server blades (as Cisco claims), but only with an unreasonable amount of oversubscription. The more accurate number is two 6100s for every four (4) 5100 UCS chassis (32 servers), which will yield a more reasonable oversubscription ratio of 4:1”
The statement “unreasonable amount of oversubscription” is pure speculation. Oversubscription requirements will vary per customer deployment depending on factors of desired scale, bandwidth, and cost. The trade-off between bandwidth and scale is no secret and is simply a fact of life with any vendor solution; it’s not something unique to UCS. More bandwidth means more network ports, more switches, higher cost. More oversubscription means higher scale at lower costs.
Next, what does “oversubscription” really mean anyway? For some, it might be very blade chassis centric where they calculate the ratio of total bandwidth provisioned to the servers in a chassis compared to the total uplink bandwidth available to that chassis. In these calculations, each Cisco UCS chassis of 8 servers can be provisioned for a max of 80 Gbps of uplink bandwidth, or a minimum of 20 Gbps. When you provision the minimum 20 Gbps of uplink bandwidth per chassis you can in theory*** achieve the scale of 320 servers per pair of Fabric Interconnects (40 chassis dual-homed to a pair of 40-port Fabric Interconnects).
Example: If I have a modest provisioning of 10 Gbps per server (that’s a lot, actually), and the minimum of 20 Gbps of chassis uplink bandwidth — that’s a <GASP> “more reasonable” 4:1 oversubscription ratio for 320 servers! 😉
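The chassis-level arithmetic above is easy to check. Here is a small Python sketch of it; the function name `oversubscription` is just an illustrative helper, and the numbers are the ones already used in this section (8 servers per chassis, 10 Gbps per server, 20 or 80 Gbps of chassis uplink).

```python
def oversubscription(servers: int, gbps_per_server: float, uplink_gbps: float) -> float:
    """Bandwidth provisioned to the servers divided by the uplink bandwidth they share."""
    if uplink_gbps <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return servers * gbps_per_server / uplink_gbps


SERVERS_PER_CHASSIS = 8

# Minimum chassis uplink bandwidth: 20 Gbps
at_minimum = oversubscription(SERVERS_PER_CHASSIS, 10, 20)
# Maximum chassis uplink bandwidth: 80 Gbps
at_maximum = oversubscription(SERVERS_PER_CHASSIS, 10, 80)

print(f"{at_minimum:g}:1")  # 4:1
print(f"{at_maximum:g}:1")  # 1:1
```

Because the ratio is computed per chassis, it stays at 4:1 no matter how many identically wired chassis you add, which is why it holds for all 320 servers.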
For others, “oversubscription” might mean the ratio of bandwidth a server must share not just to exit the chassis, but all the way to the Layer 3 core switch. Again, this is a universal bandwidth/scale/cost design trade-off across all vendors, not just Cisco. This kind of exercise requires taking a look at the total solution including servers, chassis, access switches, core switches, LAN and SAN.
Here’s a simple example of achieving 4:1 oversubscription from every server to the LAN core*, and 8:1 to the SAN core**. You could have (8) UCS chassis each with 8 servers provisioned for 10 Gbps of LAN bandwidth, 4 Gbps of SAN bandwidth. Each chassis is wired for the maximum of 80 Gbps providing 1:1 at the chassis uplink level. So, now we have 64 servers at 1:1 that we need to uplink to the SAN and LAN core. To get 4:1 to the LAN core*, and 8:1 to the SAN core**, we need to have (16) 10GE uplinks and (8) 4G FC uplinks from each Fabric Interconnect. We’ll take those uplinks and connect them to 1:1 non-oversubscribed ports at the SAN and LAN core.
The result: 64 servers each provisioned for 10GE with 4:1 oversubscription to the LAN core*, and 8:1 to the SAN core**. All of this fits into a single pair of UCS 6140 Fabric Interconnects. You could treat this as a discrete “Pod”. As you need to scale out more servers at similar oversubscription, you stamp out more similarly equipped pods.
Want 4:1 to both the LAN and SAN core? Scale back to (6) UCS chassis and (48) servers per Fabric Interconnect, and provision more FC uplinks to the SAN core. It’s the classic scale vs. bandwidth design trade-off applicable to any vendor solution.
*Side note: The LAN oversubscription from Server to Core is actually 2:1 with both fabrics available, and 8:1 with one fabric completely offline. For the sake of discussion let’s just average it out to 4:1.
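The pod numbers can be sketched the same way. The Python below assumes the headline ratios are measured against the uplinks of one Fabric Interconnect, which is how the 4:1 and 8:1 figures fall out of (16) 10GE and (8) 4G FC uplinks. The helper names `ratio` and `uplinks_for` are illustrative, and the uplink count it prints for the 6-chassis pod is just what the arithmetic gives, not a figure stated in this post.

```python
import math


def ratio(servers: int, gbps_per_server: float, uplinks: int, gbps_per_uplink: float) -> float:
    """Server bandwidth divided by the uplink bandwidth of one Fabric Interconnect."""
    return servers * gbps_per_server / (uplinks * gbps_per_uplink)


def uplinks_for(servers: int, gbps_per_server: float, gbps_per_uplink: float, target: float) -> int:
    """Smallest number of uplinks that keeps the ratio at or below the target."""
    return math.ceil(servers * gbps_per_server / (target * gbps_per_uplink))


# 8-chassis pod: 64 servers, 10 Gbps LAN and 4 Gbps SAN each
pod_servers = 8 * 8
lan = ratio(pod_servers, 10, uplinks=16, gbps_per_uplink=10)
san = ratio(pod_servers, 4, uplinks=8, gbps_per_uplink=4)
print(f"LAN {lan:g}:1, SAN {san:g}:1")  # LAN 4:1, SAN 8:1

# 6-chassis pod: 48 servers, aiming for 4:1 to the SAN core
fc_uplinks = uplinks_for(6 * 8, 4, gbps_per_uplink=4, target=4)
print(fc_uplinks)  # 12
```

Swap in your own server counts, link speeds and target ratio to see where a given design lands on the scale vs. bandwidth curve.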
**Side note: The SAN oversubscription from Server to Core is actually 4:1 with both fabrics available, and 8:1 with one fabric completely offline.
***Side note: The current hardware architecture of Cisco UCS can fit 40 chassis and 320 servers underneath a single pair of 6140 fabric interconnects. However, the number of chassis per fabric interconnect officially supported by Cisco at this time is 20. This number started at 5 and continues to go up with each new major firmware release. Reaching 40 supported chassis is only a matter of time.
4) “A maximum of 14 UCS chassis can be managed by the UCS manager, which resides in the 6100 Fabric Interconnects. Therefore, this creates islands of management domains, especially if you are planning on managing 40 UCS chassis (320 servers)”
Correction: as of the most recent UCS Manager 1.4 release you can now manage a maximum of 20 chassis.
This one always cracks me up because it tries to say that a single point of management for 14 or even 20 chassis is somehow a BAD thing? LOL! 😀
What’s the alternative? With HP, IBM, or DELL, (20) chassis is exactly (20) islands of management, and each island has multiple things you need to manage on it (chassis switches and management modules). What about the LAN and SAN access switches connecting the (20) chassis? Yep, you need to manage those too.
Compare that to the (1) management island per (20) chassis from a single interface and single data set managing settings and policies for all of the servers including LAN & SAN. 😉
5) “The UCS blade servers can only use Cisco NIC cards (Palo)”
This is simply not true. From the very beginning customers have had the choice of several non-Cisco adapters. In fact, the Cisco adapter (Palo) wasn’t available for almost a year after the initial release of Cisco UCS. As of the recent UCS Manager 1.4 release, several more adapters have been added to the portfolio of choices.
The adapters Cisco UCS customers can choose from:
- Intel: 82598 Gen1, 82599 Gen2
- Emulex: Gen1 CNA, Gen2 CNA
- QLogic: Gen1 CNA, Gen2 CNA
- Broadcom w/ TOE & iSCSI HBA
- Cisco Virtual Interface Card (VIC) (Palo)
6) “Cisco Palo cards use a proprietary version of interface virtualization and cannot support the open SR-IOV standard”
The Cisco Palo card accomplishes interface virtualization in a way that’s completely transparent to the OS, using simple standards-based PCIe. There’s nothing proprietary happening here at all. When installed into the server, the Cisco Palo card appears to the system like a PCIe riser hosting multiple standard PCIe adapters. In other words, Cisco has effectively obsoleted the need for SR-IOV with the design of the Cisco VIC (Palo). There’s nothing stopping any other vendor from using the same transparent PCIe based approach to interface virtualization.
With SR-IOV, on the other hand, the OS needs to be SR-IOV aware. You need to have the proper SR-IOV drivers and extensions loaded, etc. Why complicate the solution when you can achieve the same goal (interface virtualization) in a way that’s completely transparent to the OS and adapter drivers? This obviates the need for any additional “standard” layered into the solution.
By the way, there’s nothing preventing you from using an SR-IOV adapter with Cisco UCS. For example, the new Intel 82599 adapter for UCS supports PCI SIG SR-IOV. If you want SR-IOV really bad, use that adapter.
OK! That was a good round. Now go ahead and hit me with your best shot in the comments section. Please, keep your comments to one or two concise paragraphs. If you have a whole bunch of stuff to throw at me, break it up into multiple comments if you can. If you submit a really good one, I’ll promote it into the article content.
For my HP, IBM, and DELL friends out there (you know who you are): guys, there’s no need to submit comments pretending to be a disappointed customer. Just cite your real name and vendor disclosure and let’s have an honest and forthright discussion like the gentlemen we all are. No need for games.
Also, please keep in mind that I am employed by Cisco Systems, so I do need to exercise discretion in what I can and cannot say. I’m sure you can understand.
Cheers & Happy New Year!