Jun 22 2010
Cisco UCS Networking Best Practices (in HD)
This is a presentation I developed covering networking best practices for Cisco UCS, now recorded in High Definition for your viewing pleasure! Sweet!
This presentation assumes familiarity with basic networking and server VNIC concepts in UCS, and familiarity with virtual port channels.
This version of the presentation (v2.5) focuses primarily on the Ethernet uplinks. SAN uplinks and VMware networking scenarios are briefly discussed but not covered extensively. Those topics and others such as QoS, the Cisco VIC, and vNIC fabric failover may be included in future versions of this presentation.
Stay tuned for updates! RSS feed: http://bradhedlund.com/feed/
Part 1 – Cisco UCS Networking Overview
In Part 1 we start with a baseline overview of Cisco UCS Networking. At the heart of the system is the Fabric Interconnect (6100) “the Brains of UCS” which provides 10GE & FC networking for all the compute nodes in its domain and also serves as the central configuration, management, and policy engine for all automated server and network provisioning.
Part 2 – Switch Mode vs. End Host Mode
Part 2 is an examination of the two different switching modes supported by the Fabric Interconnect, “Switch Mode” and “End Host Mode”. With “Switch Mode”, the Fabric Interconnect behaves like a normal Layer 2 switch on all server ports and uplinks, and therefore attaches to the upstream data center network as a spanning tree enabled “Switch”.
“End Host Mode”, on the other hand, while still providing local Layer 2 switching on the server ports, does not behave like a normal Layer 2 switch on its uplinks. Instead, server NICs are “pinned” to a specific uplink, and no local switching happens from uplink to uplink. This allows “End Host Mode” to attach to the network like a “Host” without spanning tree, and all uplinks forwarding on all VLANs.
End Host Mode is the preferred mode, and it’s enabled by default.
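To make the pinning idea concrete, here is a minimal Python sketch of the End Host Mode forwarding rules described above. It is only a toy model: the round-robin pinning, the function names (`pin_vnics`, `egress`) and the port names are my own assumptions for illustration, not how the Fabric Interconnect actually selects uplinks.

```python
from itertools import cycle

def pin_vnics(vnics, uplinks):
    """Pin each server vNIC to one uplink (round-robin is an assumption)."""
    return dict(zip(vnics, cycle(uplinks)))

def egress(ingress, dst, pinning, uplinks):
    """Return the port a frame leaves on, or None if the frame is dropped."""
    if dst in pinning:
        return dst             # destination is a local server port
    if ingress in uplinks:
        return None            # no switching from uplink to uplink
    return pinning[ingress]    # server traffic exits on its pinned uplink

# Hypothetical port names, for illustration only.
uplinks = ["uplink-1", "uplink-2"]
pinning = pin_vnics(["vnic-a", "vnic-b", "vnic-c"], uplinks)

print(egress("vnic-a", "vnic-b", pinning, uplinks))       # local switching
print(egress("vnic-a", "remote-host", pinning, uplinks))  # out the pinned uplink
print(egress("uplink-1", "remote-host", pinning, uplinks))  # dropped (None)
```

Because a frame arriving on one uplink is never sent out another uplink, the model cannot form a loop through the Fabric Interconnect, which is why spanning tree is not needed on the uplinks.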
Part 3 – End Host Mode – Individual Uplinks
In Part 3 we take a look at how the individual uplinks behave in End Host Mode, and how the system reacts to uplink failures. When an uplink fails, the Fabric Interconnect will move the server NICs to a new uplink in under a second without causing any disruption to the server NIC. This uplink failover process is called dynamic re-pinning.
After the dynamic re-pinning process, the Fabric Interconnect will send Gratuitous ARP messages for all of the MAC addresses that were previously using the failed uplink. This GARP process helps the upstream network quickly learn the new location of the affected MAC addresses now using the new uplink.
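The failover sequence can be sketched in a few lines of Python. This is an illustrative model under my own assumptions: the function name `repin_on_failure`, the MAC and uplink labels, and the way survivors are chosen (simple rotation) are hypothetical. The real system decides the new uplink internally.

```python
def repin_on_failure(pinning, failed, uplinks):
    """Move MACs off a failed uplink and list the GARPs to send.

    Returns (new_pinning, garps), where garps is a list of
    (mac, new_uplink) pairs, one per re-pinned MAC address.
    """
    survivors = [u for u in uplinks if u != failed]
    if not survivors:
        raise RuntimeError("no surviving uplinks to re-pin to")
    new_pinning = dict(pinning)
    garps = []
    for mac, uplink in sorted(pinning.items()):
        if uplink == failed:
            # Rotating through survivors is an assumption of this sketch.
            target = survivors[len(garps) % len(survivors)]
            new_pinning[mac] = target
            garps.append((mac, target))  # GARP announces the MAC's new location
    return new_pinning, garps

# Hypothetical labels, for illustration only.
before = {"mac-1": "uplink-1", "mac-2": "uplink-2", "mac-3": "uplink-1"}
after, garps = repin_on_failure(before, "uplink-1",
                                ["uplink-1", "uplink-2", "uplink-3"])
print(after)
print(garps)
```

Only the MAC addresses that were on the failed uplink move, and only those generate a Gratuitous ARP. Anything pinned to a healthy uplink is left alone.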
Part 4 – Port Channel Uplinks
Here we take a look at the benefits of using Port Channel uplinks with Cisco UCS. The key advantages of port channel uplinks are the minimal impact of a physical link failure and the potential for better overall uplink load balancing. When an individual physical link fails, fewer moving parts are required to provide a fast recovery. For example, Gratuitous ARP messages and dynamic re-pinning are not required when an individual physical member link fails in a port channel uplink. Port Channel uplinks are definitely recommended whenever possible.
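Here is a rough Python sketch of why a member link failure is a smaller event with a port channel. The server NIC is pinned to the logical uplink, so the pinning never changes; only the flow-to-member hashing is recomputed over the surviving links. The names (`Po1`, `eth1/1`, `member_for_flow`, `fail_member`) are hypothetical, and CRC32 is just a stand-in for whatever hash the hardware really uses.

```python
import zlib

def member_for_flow(flow, members):
    """Pick a member link for a flow (CRC32 is a stand-in hash)."""
    return members[zlib.crc32(flow.encode()) % len(members)]

def fail_member(port_channels, po, member):
    """Return a copy of port_channels with one member link removed."""
    remaining = [m for m in port_channels[po] if m != member]
    return {**port_channels, po: remaining}

# Hypothetical names: vNICs are pinned to the logical uplink, not a member.
pinning = {"vnic-a": "Po1", "vnic-b": "Po1"}
port_channels = {"Po1": ["eth1/1", "eth1/2", "eth1/3", "eth1/4"]}

degraded = fail_member(port_channels, "Po1", "eth1/2")
flow = "00:25:b5:00:00:01->00:25:b5:00:00:02"
print(pinning["vnic-a"])                       # still Po1, no re-pinning
print(member_for_flow(flow, degraded["Po1"]))  # one of the surviving members
```

Since the logical uplink is still up, the upstream switch still sees the same MAC addresses on the same port channel, so there is nothing to re-learn and no GARP to send.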
Part 5 – Virtual Port Channel Uplinks (vPC)
Part 5 covers the advantages of using virtual port channel (vPC) uplinks with Cisco UCS. With vPC uplinks, both physical link failures and upstream switch failures have minimal impact. With more physical member links in one larger logical uplink, there is the potential for even better overall uplink load balancing and better high availability than with the standard Port Channel uplink discussed in Part 4. Using a virtual port channel uplink is highly recommended if you have vPC capabilities present in your upstream network switches.
Part 6 – Connecting Cisco UCS to separate networks
In Part 6 we discuss the scenario of connecting a single Cisco UCS system in End Host Mode to separate Layer 2 networks. When the system is in End Host Mode, it expects and assumes that all uplinks are connected to the same common Layer 2 domain. If some uplinks are connected to physically separate networks, you will have connectivity problems. The Fabric Interconnect will randomly pick one of its uplinks to process broadcast messages for all VLANs. As a result, only servers associated with the chosen network will be able to see and process broadcast messages on their network. The solution is to create a common Layer 2 network for the Fabric Interconnect in End Host Mode and each of the separate networks to attach to, or to use Switch Mode. If creating a common Layer 2 network or using Switch Mode is not an option, you can always deploy a unique Cisco UCS system per separate network to preserve the existing “silos”.
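The broadcast problem is easy to see in a small Python sketch. Everything here is a made-up example: the network names (`prod`, `dmz`), the server names, and the function `servers_receiving_broadcasts` are hypothetical, and the chosen broadcast uplink is passed in as a parameter rather than modeled as a random pick.

```python
def servers_receiving_broadcasts(uplink_network, server_network, broadcast_uplink):
    """List servers that still see broadcasts from their own network.

    uplink_network: uplink name -> Layer 2 network it attaches to
    server_network: server name -> Layer 2 network the server belongs to
    broadcast_uplink: the single uplink chosen to process broadcasts
    """
    reachable = uplink_network[broadcast_uplink]
    return sorted(s for s, net in server_network.items() if net == reachable)

# Hypothetical topology: two physically separate Layer 2 networks.
uplink_network = {"uplink-1": "prod", "uplink-2": "dmz"}
server_network = {"db-1": "prod", "db-2": "prod", "web-1": "dmz"}

print(servers_receiving_broadcasts(uplink_network, server_network, "uplink-1"))
print(servers_receiving_broadcasts(uplink_network, server_network, "uplink-2"))
```

Whichever uplink is chosen, the servers on the other network miss their broadcasts (ARP requests, for example). If every uplink attached to one common Layer 2 network, the choice of uplink would no longer matter.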
Part 7 – Inter Fabric Traffic Examples
This is a brief look at some of the common types of traffic that may flow between Fabric-A and Fabric-B within a single Cisco UCS system. With this understanding, the subsequent material will make more sense.
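As a simple illustration, the path a frame takes between two servers depends on which fabric each server's active NIC is attached to. The sketch below assumes End Host Mode and uses hypothetical labels (`FI-A`, `FI-B`, `upstream network`) and a hypothetical function name (`traffic_path`).

```python
def traffic_path(src_fabric, dst_fabric):
    """Return the hops between two server NICs, given each NIC's fabric."""
    if src_fabric == dst_fabric:
        # Same fabric: switched locally on that Fabric Interconnect.
        return [f"FI-{src_fabric}"]
    # Different fabrics: the traffic must cross the upstream network.
    return [f"FI-{src_fabric}", "upstream network", f"FI-{dst_fabric}"]

print(traffic_path("A", "A"))  # ['FI-A']
print(traffic_path("A", "B"))  # ['FI-A', 'upstream network', 'FI-B']
```

The second case is the inter-fabric traffic this part is about: it leaves one Fabric Interconnect on an uplink and comes back in through the other.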
Part 8 – Don’t: Connect Cisco UCS to vPC domains without vPC uplinks
This is a fairly extensive look at the scenario of attaching UCS to upstream switches configured for vPC, without using vPC uplinks. Here we will show that this scenario doesn’t make much sense and in fact can cause some unwanted traffic black holes under some failure scenarios. This is a prelude to Part 9 where we illustrate that if your upstream network is configured for virtual port channel capability (vPC), you should always attach UCS with vPC uplinks.
Part 9 – Do: Connect Cisco UCS to vPC domains with vPC uplinks
This section shows that if you have virtual port channel capabilities in your upstream switches, you have everything to gain and nothing to lose by connecting Cisco UCS with vPC uplinks. You will gain the benefit of the upstream switch locally switching all Fabric-A to Fabric-B traffic, and achieve more bandwidth scalability for inter-fabric traffic because all inter-fabric traffic will travel on the vPC uplinks, rather than on less abundant inter-switch links. Additionally, you will avoid the potential black hole failure scenarios discussed in Part 8, if vPC is already present in the upstream network switches.
Part 10 – Connecting Cisco UCS without vPC
While there are certainly advantages to uplinking Cisco UCS with virtual port channels, vPC is certainly not required. Cisco UCS easily and efficiently connects to any data center network environment with or without vPC. This section discusses best practices for connecting UCS to networks without vPC. The key best practice here is to always dual attach each Fabric Interconnect to two upstream network switches, whether it’s with vPC uplinks or multiple individual uplinks. Another suggested practice is to avoid attaching Cisco UCS to a second tier Layer 2 switch with spanning tree blocking links. A better approach is to either have vPC capabilities at the second tier Layer 2 switch, or connect Cisco UCS directly to the tier 1 switch, avoiding the traffic bottlenecks induced by spanning tree.
17 responses so far
Great stuff here Brad. Thanks for educating us on this stuff, and keep up the good work!
[...] **Update** Brad has posted his UCS Networking Best Practices Post I was hinting at above. It’s a fantastic video blog in HD, check it out here: http://bradhedlund.com/2010/06/22/cisco-ucs-networking-best-practices/ [...]
Brad,
Wow these are great videos! Thanks for taking the time to make all of them and put them on your site.
Brad….amazing videos and great blog…thanks!
Brad,
I really appreciate this post. It has been extremely valuable in our configuration and evaluation of the UCS platform.
One question I would like clarified about UCS best practices has to do with 1000v. I was wondering if you might be able to shed some light. In the “Best Practices in Deploying Cisco Nexus 1000V Series Switches on Cisco UCS B Series Blade Servers” guide, found here: http://cisco.biz/en/US/prod/collateral/switches/ps9441/ps9902/white_paper_c11-558242.html there is a statement regarding the placement and connectivity of VSMs in a 1000v implementation on UCS. It gives several options “listed in order of recommended customer use” but says little to justify the order of preference. I am looking for any information that would help me understand the reasoning behind the recommendation. Also, I am wondering why VSM Inside the Cisco Unified Computing System on the Cisco Nexus 1000V Series VEM is not recommended, or at least not mentioned at all.
Jason
Brad,
I have a configuration question. When connecting a Fabric Interconnect running in End Host Mode to a Nexus 5k vPC, what spanning-tree port type should the 5k port channel be? i.e. spanning-tree port type edge or spanning-tree port type network?
thanks!
Andy,
The Fabric Interconnect in End Host Mode connects to the upstream switch like a “Host”, not as a “Switch”. Therefore you should configure the upstream switch port (or port channel in your case) just as you would if it were a server attaching to it. Therefore the best configuration on the 5K, in your case, would be: ‘spanning-tree port type edge trunk’
Cheers,
Brad
Hi Brad,
These videos are really informative and clear. It is great to have a place to review topics.
I would love to see something similar where the subject is “How Nexus platform works with UCS, in the data center”.
Joe
Brad,
Awesome post… I was just having a tech session with your Cisco peer Steve Fishman… good stuff… I’d like to see the same for BP on FCoE to FC from the 6100 to MDS91xx… and how to scale FC to large SAN env from UCS.
Awesome videos. It greatly helped me to understand the connectivity and traffic flow path in the UCS system. Thank you
Brad:
Awesome vids.
I give this an A+ for the presentation of very useful information.
I found this very helpful but the delivery and tone…. knocked me out. I could hardly stay awake.
This is an A++ as a sedative. Dude you should probably consider moonlighting as a hypnotist.
JM
Great job though. Keep up the good work.
Hello,
In watching video#2 and looking at the difference between switch mode and end host mode I don’t see where / how I would team nics to a host. In your example vlan 10 is only on 6100A and vlan 20 is only on 6100B which suggests that if we lost 6100A we would lose vlan 10 completely.
Wouldn’t a better approach be to put vlans 10 & 20 on both 6100s and trunk and team them to the servers via LACP so that a failure would allow the traffic to still have a path to server ?
Robert,
Yes, in most cases the same VLANs will exist on both 6100s. You would not use LACP for NIC teaming because that would require the 6100s to be clustered together in a vPC domain, and that is not supported. For NIC teaming you could use standard active/standby teaming, the unique vNIC “fabric failover” capability in UCS, or in the case of a VMware host you can also use active/active NIC load balancing based on vPort-ID or Mac Pinning.
Brad
Hi Brad,
I enjoyed these great videos, thanks!
To your statement “For NIC teaming you could use standard active/standby teaming,”
assuming active/standby alternating (server1 active to FEX1, server 2 active to FEX2), how is the traffic flow between server 1 and 2 ?
Thanks,
Matthew
Matthew,
In this scenario:
(server1 active to FEX1, server 2 active to FEX2), how is the traffic flow between server 1 and 2 ?
Traffic between Server 1 and Server 2 would travel through the upstream network. This is another example of “Inter-Fabric Traffic” discussed in Part 7.
Cheers,
Brad
Excellent information, brief and straight to the point info on very relevant concepts for SEs pushing this solution to clients and needing to know all the facts (not just the cool marketing stuff).
Many thanks,
Paul