Architecting Data Center Networks in the era of Big Data and Cloud

Earlier this month I was thrilled to have a speaking session at Interop Las Vegas 2012.

Session title: Architecting Data Center Networks in the era of Big Data and Cloud

Session abstract: Big Data clusters and SDN enabled clouds invite a new approach to data center networking. This session for data center architects will explore the transition from traditional scale-up chassis based Layer 2 centric networking, to the next generation of scale-out Layer 3 Leaf/Spine CLOS based fabrics of fixed switches.

Below is the HD video recording of the session, and the slides. Enjoy!

 

Slides PDF Download

 

Cheers,
Brad

Comparing fabric efficiencies of Fixed vs Chassis based designs, non-blocking

Recently I made the observation that Fixed switches continue to outpace Chassis switches in both power and space efficiency.  Simply put, with Fixed switches you can cram more ports into fewer RUs, and each port will draw less power when compared to Chassis switches.  Those are the indisputable facts.

A common objection when these facts are presented is that ultimately when you go to build a *fabric* of Fixed switches, that fabric will consume more total power, more total RU, and leave you with a lot more switches and cables to manage when compared to a single Chassis *switch*.  For example,  one 384 port line rate chassis switch (Arista 7508) consumes less power and RU when compared to the (10) Dell Force10 Z9000 fixed switches you would need to build a 384 port fabric.  While that is true, this is purely academic with no relevance to the real world.  Who in their right mind runs their 384 port non-blocking fabric on *one* switch? Nobody.  To carry this flawed logic out a bit further, the largest fabric you could have would be equal to the largest chassis you can find.  Nobody wants that kind of scalability limit.

In the real world we build scalable fabrics with more than one switch.  So with that in mind, let's look at various non-blocking fabric sizes, starting at 384 ports and going up to 8192 ports.  For each fabric size, let's compare the total power and RU of the network switches.  I'll make the observation that when you actually construct a real world non-blocking fabric, designs with all fixed switches consume less power and less space than comparable designs with chassis switches.  Another interesting observation is that the non-blocking fabrics constructed with the fixed switches available today result in fewer switches and cables to manage – compared to designs a chassis vendor might propose.

The chassis based design uses a typical 1RU fixed switch as the Leaf (Arista 7050S-64), connecting to a Chassis switch Spine layer (Arista 7508 or 7504), something Arista would likely propose.

The all fixed switch design uses a 2RU switch at both the Leaf and Spine layers, the Dell Force10 Z9000.  The intention here is not to pick on Arista – quite the contrary – I'm using Arista as an example because, of the current crop of monstrous power sucking chassis switches, Arista's are the most efficient (sucking less).  Kudos to them for that.

To get straight to the point, let's first look at the overall summary charts.  The individual designs and data will follow for those interested in nit-picking.

Fabric Power Efficiency

The chart above shows that fully constructed non-blocking fabrics of all fixed switches are more power efficient than the typical design likely proposed by a Chassis vendor.  As the fabric grows the efficiency gap widens.  Given we already know that fixed switches are more power efficient than chassis switches, this data should make sense.

Fabric Space Efficiency

Again, the chart above shows a very similar pattern with space efficiency.  A fully constructed non-blocking fabric of all fixed switches consumes less data center space than the typical design of Chassis switches aggregating fixed switches.

Designs & Data

As you look at the designs below, notice that non-blocking fabrics with fixed switches actually have fewer switches and cables to manage than non-blocking fabrics with Chassis switches — contrary to the conventional wisdom.

Above: (6) Leaf fixed switches, (4) Spine fixed switches interconnected with 40G and providing 384 line rate 10G access ports at the Leaf layer, and 96 inter-switch links.  (10) switches total, each with a max rated power consumption of 800W.
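
If you want to reproduce the port math behind the all fixed switch designs, here's a minimal sizing sketch.  It assumes a Z9000-style leaf/spine switch with 32 x 40G ports, where each leaf dedicates half its ports to 4 x 10G breakout access and half to 40G uplinks, and it picks the smallest spine count that keeps each leaf's uplinks evenly spread for uniform ECMP.  The function name and structure are my own illustration, not any vendor tool:

```python
# Minimal two-tier non-blocking leaf/spine sizing sketch (illustrative only).
LEAF_40G_PORTS = 32                                    # Z9000-class fixed switch
UPLINKS_PER_LEAF = LEAF_40G_PORTS // 2                 # 16 x 40G toward the spine
ACCESS_10G_PER_LEAF = (LEAF_40G_PORTS - UPLINKS_PER_LEAF) * 4   # 64 x 10G via breakout
SPINE_40G_PORTS = 32                                   # same fixed switch used as the spine

def size_fixed_fabric(target_10g_ports):
    """Return (leaves, spines, 40G inter-switch links) for a target access port count."""
    leaves = -(-target_10g_ports // ACCESS_10G_PER_LEAF)   # ceiling division
    isl_40g = leaves * UPLINKS_PER_LEAF
    # smallest spine count that divides the per-leaf uplinks (uniform ECMP)
    # and still fits within each spine's 40G port count
    spines = next((n for n in range(1, UPLINKS_PER_LEAF + 1)
                   if UPLINKS_PER_LEAF % n == 0 and isl_40g / n <= SPINE_40G_PORTS),
                  None)  # None: needs 10G interconnects for a wider spine (see the larger designs)
    return leaves, spines, isl_40g

print(size_fixed_fabric(384))    # (6, 4, 96)    -- matches the design above
print(size_fixed_fabric(2048))   # (32, 16, 512) -- matches the 2048 port design below
```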

Above: (12) Leaf fixed switches, (2) Spine chassis switches interconnected with 10G.  Each Leaf switch at 220W max power has 32 x 10G uplink, and 32 x 10G downlink for 384 line rate access ports, and 384 inter-switch links (ISL).  The (2) chassis switches are 192 x 10G port Arista 7504 each rated at 7RU and 2500W max power.
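
And a quick sanity check on the chassis-spine build above, assuming the 7050S-64 leaf is treated as 64 usable 10G ports split 32 down and 32 up:

```python
# Sanity check for the 384 port chassis-spine design described above.
LEAVES, SPINES = 12, 2
DOWN_PER_LEAF, UP_PER_LEAF = 32, 32      # non-blocking split on a 64-port leaf
SPINE_10G_PORTS = 192                    # Arista 7504 as populated here

access_ports = LEAVES * DOWN_PER_LEAF    # 384 line rate 10G access ports
isl_count    = LEAVES * UP_PER_LEAF      # 384 inter-switch links
per_spine    = isl_count // SPINES       # 192 links land on each chassis, filling it exactly
assert per_spine <= SPINE_10G_PORTS
assert UP_PER_LEAF % SPINES == 0         # 16 uplinks per leaf to each chassis -> uniform ECMP
print(access_ports, isl_count, per_spine)   # 384 384 192
```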

 

Above: (32) Leaf fixed switches, (16) Spine fixed switches interconnected with 40G providing 2048 line rate 10G access ports at the Leaf layer, and 512 inter-switch links.  (48) switches total, each with a max rated power consumption of 800W.

 

Above: (64) Leaf fixed switches each at 220W max power and 1RU, with 32 x 10G inter-switch links, and 32 x 10G non-blocking fabric access ports.  (8) Arista 7508 Spine chassis each with (6) 48-port 10G linecards for uniform ECMP.  Because each 11RU chassis switch is populated with 6 linecards of 8 possible, I’ve factored down the power from the documented max of 6600W, down to 5000W max. (72) total switches.
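
A side note on the 5000W figure: it looks like a straight proration of the 7508's documented 6600W maximum by the fraction of linecard slots in use.  That proration is my assumption about the derating; the real number also depends on supervisor and fan tray draw.  The same kind of check confirms the port counts of this build:

```python
# Check the 2048 port chassis-spine build, plus the prorated chassis power figure.
LEAVES, SPINES = 64, 8
UP_PER_LEAF = 32
CARDS_PER_CHASSIS, PORTS_PER_CARD = 6, 48

isl = LEAVES * UP_PER_LEAF                               # 2048 inter-switch links
per_spine = isl // SPINES                                # 256 links per chassis
assert per_spine <= CARDS_PER_CHASSIS * PORTS_PER_CARD   # 256 <= 288, fits

# Rough proration: 6600W max with 6 of 8 linecard slots populated (an assumption,
# since supervisors and fans don't scale linearly with linecards).
print(6600 * 6 / 8)                                      # 4950.0, rounded to ~5000W in this post
```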

Above: (64) Leaf fixed switches, (32) Spine fixed switches interconnected with 10G providing 4096 line rate 10G access ports at the Leaf layer, and 4096 inter-switch links.  (96) switches total, each with a max rated power consumption of 800W.

Above: (128) Leaf fixed switches each at 220W max power and 1RU, with 32 x 10G inter-switch links, and 32 x 10G non-blocking fabric access ports.  (16) Arista 7508 Spine chassis each with (6) 48-port 10G linecards for uniform ECMP.  Because each 11RU chassis switch is populated with 6 linecards of 8 possible, I’ve factored down the power from the documented max of 6600W, down to 5000W max. (144) total switches.

Above: (128) Leaf fixed switches, (64) Spine fixed switches interconnected with 10G providing 8192 line rate 10G access ports at the Leaf layer, and 8192 inter-switch links.  (192) switches total, each with a max rated power consumption of 800W.

Above: (256) Leaf fixed switches each at 220W max power and 1RU, with 32 x 10G inter-switch links, and 32 x 10G non-blocking fabric access ports.  (32) Arista 7508 Spine chassis each with (6) 48-port 10G linecards for uniform ECMP.  Because each 11RU chassis switch is populated with 6 linecards of 8 possible, I’ve factored down the power from the documented max of 6600W, down to 5000W max. (288) total switches.
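
To tie the individual designs back to the summary charts, here's a small sketch that totals max power and RU for each build using the per-switch figures quoted above (Z9000 at 800W and 2RU, 7050S-64 at 220W and 1RU, 7504 at 2500W and 7RU, 7508 prorated to 5000W at 11RU).  It simply sums the numbers already given, nothing more:

```python
# Total max power (W) and space (RU) per design, from the figures quoted in this post.
Z9000      = (800, 2)      # fixed leaf/spine switch: (watts, RU)
LEAF_7050  = (220, 1)      # fixed leaf used in the chassis-spine designs
SPINE_7504 = (2500, 7)
SPINE_7508 = (5000, 11)    # prorated for 6 of 8 linecards

designs = {
    # fabric size: (all-fixed build, chassis-spine build) as lists of (switch, count)
    384:  ([(Z9000, 10)],  [(LEAF_7050, 12),  (SPINE_7504, 2)]),
    2048: ([(Z9000, 48)],  [(LEAF_7050, 64),  (SPINE_7508, 8)]),
    4096: ([(Z9000, 96)],  [(LEAF_7050, 128), (SPINE_7508, 16)]),
    8192: ([(Z9000, 192)], [(LEAF_7050, 256), (SPINE_7508, 32)]),
}

def totals(build):
    watts = sum(w * n for (w, ru), n in build)
    space = sum(ru * n for (w, ru), n in build)
    return watts, space

for ports, (fixed, chassis) in designs.items():
    print(f"{ports} ports  fixed: {totals(fixed)}  chassis-spine: {totals(chassis)}")
```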

Conclusion

When building a non-blocking fabric, a design of all fixed switches scales with better power and space efficiency, and with fewer switches and cables (if not the same), when compared to designs with chassis switches.

Something wrong with my data, designs, or assumptions?  Chime in with a comment.


Cheers,
Brad

Comparing efficiencies of Fixed vs Chassis switches

When building a fabric for a cluster of servers and storage in the data center, how should you architect the network?  There are several ways to approach this.  The architecture you choose probably depends on your preconceived notions of what a network should look like, and on the things that you care about.  For example, does power and space efficiency matter at all?  It usually does, given these are often finite and costly resources.  The more power and space given to the network, the less power and space left for storage and compute — the stuff that actually provides a return.

With that in mind, which architecture and switch platforms might work best when space and power are taken into consideration?  Let's take a quick look at the power and space efficiency of Fixed switches vs. Chassis switches.  I will make the observation that fixed switches continually outpace chassis switches in power and space efficiency.

Power efficiency

The line graph above shows the maximum rated power per line rate L2/L3 port.  We are looking at the most dense platform for each year, and from the data sheet we divide the max power by the number of ports.  For the Fixed switches, I could have used data from the lower power single-chip platforms, but to be extra fair we are looking at the higher power multi-chip platforms (e.g. Dell Force10 Z9000).  I did not include 2008 because the chassis data for that year was so high that it skewed visibility for the remaining years.  Chassis switches got significantly better in 2010 thanks to some new players in that market (namely Arista).

Note: Projections for 2014 are based on the trend from previous years.

Space efficiency

The line graph above shows the line rate L2/L3 port density of the most dense platform each year, Chassis vs. Fixed.  Pretty straightforward.  We take the number of ports and divide it by the required rack units (RU).  While each platform is getting better, the port density of fixed switches continually outpaces chassis switches with no end in sight.

Conclusion

Fixed platforms are more power and space efficient than chassis platforms, by a significant margin, year after year.

Some might say: “Yes, Brad, this is obvious. But comparing chassis vs fixed is not a fair comparison, and it’s silly.  You can’t build a scalable fabric with fixed switches.”

My response to that: Think again.  Perhaps it’s time to question the preconceived notions of what a network architecture should look like, and the form factors we instinctively turn to at each layer in the network.  Ask yourself a very basic question:  “Why do I want so many ports shoved into one box?”  Are you building a scalable network? Or are you designing around arcane Layer 2 network protocols?

What would an efficient and scalable network look like if we could eschew the premise of arcane Layer 2 protocols (STP), and instead build the network with new alternatives such as TRILL, OpenFlow, or Layer 3 fabric underlays with network virtualization overlays (VXLAN, NVGRE, STT)?

What would that network look like? 😉


Cheers,
Brad

Data:
Chassis density (line rate ports per RU)
2008 – 3 (Nexus 7010 w/ 64 @ 21RU) *M1-32 linecard
2010 – 34 (Arista 7508 w/ 384 @ 11RU)
2012 – no change, Arista 7508 still most dense
2014 – anticipated 96pt per slot w/ current chassis

Fixed density (line rate ports per RU)
2008 – 24 (Arista 7124, Force10 S2410) @ 1RU
2010 – 48 (Arista 7148) @ 1RU
2012 – 64 (Broadcom Trident) @ 1RU
2014 – anticipated 128pt @ 1RU

Chassis power
2008 – Nexus 7010 w/ 8 x M1-32 = 8400W max (64 ports line rate), 131W / line rate port
2010 – Arista 7508 = 6600W max / 384 ports = 17W
2012 – Nexus 7009 w/ 7 x F2 = 4595W max / 336 = 13.6W
2014 – Anticipated 25% decrease = 10.2 (based on a 25% decrease from prior 2 years)

Fixed power
2008 – Arista 7124SX – 210W / 24 ports = 8.75 W / line rate port (single chip)
2010 – Arista 7148SX – 760W / 48 ports = 15.8 W / line rate port (multi chip)
2012 – Broadcom Trident+ based platforms – 789W (Dell Force10 Z9000) / 128 line rate ports (multi chip) = 6.1W
2014 – Anticipated 60% decrease = 2.4W (based on a 60% decrease from prior 2 years)
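
For completeness, here's a short sketch that reproduces the two ratios plotted in the line graphs (max watts per line rate port, and line rate ports per RU) from the 2008-2012 data above; the 2014 points are projections and are left out:

```python
# Reproduce the plotted ratios from the data above (2014 projections omitted).

# year: (max watts, line rate ports) for the power series
chassis_power = {2008: (8400, 64), 2010: (6600, 384), 2012: (4595, 336)}
fixed_power   = {2008: (210, 24),  2010: (760, 48),   2012: (789, 128)}

# year: (line rate ports, rack units) for the density series
chassis_density = {2008: (64, 21), 2010: (384, 11), 2012: (384, 11)}
fixed_density   = {2008: (24, 1),  2010: (48, 1),   2012: (64, 1)}

for year in (2008, 2010, 2012):
    cw, cp = chassis_power[year]
    fw, fp = fixed_power[year]
    print(f"{year}: chassis {cw/cp:.1f} W/port vs fixed {fw/fp:.1f} W/port")

for year in (2008, 2010, 2012):
    cp, cr = chassis_density[year]
    fp, fr = fixed_density[year]
    print(f"{year}: chassis {cp//cr} ports/RU vs fixed {fp//fr} ports/RU")
```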