Video: Basic introduction to the Leaf/Spine data center networking fabric design

This video is a snippet from a presentation I gave to a Dell audience covering a basic introduction to the Leaf/Spine Layer 3 data center networking fabric design with a Dell Networking point of view.




  1. Atrey Parikh says

    L3 fabric? How do you solve application dependencies on L2 between racks? VXLAN and NVGRE are great if encap/decap is done in hardware instead of in the hypervisor on x86.

    • says

      Yes it’s an L3 fabric — which is the ideal network architecture for Network Virtualization such as VMware + Nicira, Microsoft + NVGRE, OpenStack + Midokura, etc.

      As for the best place to encap/decap the overlay, I’ll disagree with you there…

      For a virtual machine environment, the best place to provision the virtual network for those virtual machines is at the edge, at the hypervisor vSwitch connected to the VMs, and hence the best place to terminate/originate the overlay is at the edge hypervisor.

      For non-virtual workloads like a distributed storage cluster running Linux on bare metal (example: Ceph), there is perhaps an argument for overlay encap/decap in pSwitch hardware.

      On the other hand, one could also argue that even with Linux on bare metal you can run Open vSwitch on those machines too, and still encap/decap at the edge software layer, which is arguably more programmable and easier to coordinate with the edge virtual hosts performing the same encap/decap at the same x86 Linux software layer.

      In a nutshell, I think overlay tunnel termination in pSwitch hardware is a niche one-off at best. It's probably not going to be the predominant deployment.


    • says

      Yes. It’s a /30 subnet on each Leaf-Spine link, OSPF point-to-point.
      In Dell Fabric Manager you provide an IP range for these interlinks, say a /24, and DFM will chop that up into /30s and assign one to each interlink during the auto-deployment.
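      To illustrate the idea, here is a minimal sketch of carving a /24 interlink range into point-to-point /30s using Python’s standard ipaddress module. The 10.0.0.0/24 range and the spine-first/leaf-second assignment are illustrative assumptions, not DFM’s actual algorithm.

      ```python
      import ipaddress

      # Hypothetical interlink range handed to the fabric tool.
      interlink_range = ipaddress.ip_network("10.0.0.0/24")

      # A /24 yields 64 point-to-point /30 subnets (2 usable hosts each).
      p2p_subnets = list(interlink_range.subnets(new_prefix=30))
      print(len(p2p_subnets))  # 64

      # Assumed convention: on each Leaf-Spine link, the Spine takes the
      # first usable address and the Leaf takes the second.
      spine_ip, leaf_ip = p2p_subnets[0].hosts()
      print(spine_ip, leaf_ip)  # 10.0.0.1 10.0.0.2
      ```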

      • Mohammed says

        You mentioned that as you add spine switches, you can increase the number of leaf switches. Isn’t that false? I would assume that the number of leaf switches depends on the number of ports on the spine switch, not on the total number of spine switches.

        Can you explain this a bit more in detail?

        • says

          Here’s a simple example. Let’s say you have Leaf switches with 4 x 40G uplinks, because that’s the bandwidth you want.
          Now let’s suppose you have (2) Spine switches with only 4 x 40G ports each. Each Leaf switch will have 2 x 40G to each Spine switch, and you can only have (2) Leaf switches.

          There are two dimensions with which to scale this. Yes, the first obvious dimension is to increase the port count on the Spine switch. You replace the 4-port Spine switches with 8-port Spine switches and you can have (4) Leafs in the design above. Scaling up the port count of your two “Core” switches (running STP) has been the scaling approach to data center networking for the last 15 years, because it’s been the only way. One-dimensional scaling ushered in the era of monstrous, power-sucking chassis switches.

          What is different here is that with the Leaf/Spine architecture you now have a second dimension to work with: adding more Spine switches. Because we are not running STP, we can have more than two “Core” (Spine) switches. You can have as many Spine switches as you have uplinks from your Leafs.

          So from our example above, we can take those same two 8-port Spine switches and just add two more, leaving us with four Spine switches, each with 8 x 40G ports. As a result, the number of Leafs we can have doubles again, from 4 Leafs to 8 Leafs, because each Leaf now connects 1 x 40G to each Spine switch, and with 8 ports on each Spine switch that means we can have 8 Leafs.

          The second dimension is adding more Spine/Core switches and evenly distributing the Leaf connections among them. The result is an architecture that can scale to very large fabric requirements with smaller, more efficient switches. The fabric is also more robust in the sense that the failure of a Spine switch only represents a fraction of fabric bandwidth, inversely proportional to the number of Spine switches you have deployed.

          100 / (number of Spines) = % of fabric bandwidth per Spine
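          The arithmetic above can be sketched in a few lines of Python. The port counts are the hypothetical values from this example, not real switch models; the helper names are my own.

          ```python
          def max_leafs(spine_ports: int, spines: int, leaf_uplinks: int) -> int:
              # Leaf uplinks are spread evenly across the Spines, so each Leaf
              # consumes (leaf_uplinks / spines) ports on every Spine switch.
              links_per_spine = leaf_uplinks // spines
              return spine_ports // links_per_spine

          def bw_share_per_spine(spines: int) -> float:
              # 100 / (number of Spines) = % of fabric bandwidth per Spine
              return 100 / spines

          print(max_leafs(spine_ports=4, spines=2, leaf_uplinks=4))  # 2
          print(max_leafs(spine_ports=8, spines=2, leaf_uplinks=4))  # 4
          print(max_leafs(spine_ports=8, spines=4, leaf_uplinks=4))  # 8
          print(bw_share_per_spine(4))  # 25.0
          ```

          Note how doubling the Spine count (second dimension) doubles the Leaf count without any larger switch, while losing one of four Spines costs only 25% of fabric bandwidth.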


  2. Amit says

    Conceptually, this looks similar to Cisco’s FabricPath, except for the IP addressing and OSPF part. Auto-deployment using DFM sounds like a very good selling point.

    Any reason not to connect anything other than the Leaf switches to the Spine switches?

  3. Dushyant says

    Nice video Brad! As always very informative. One small question: the number of spine switches depends on the ECMP factor of the switches as well, doesn’t it?

  4. says

    Hi Brad,

    I have 2 Spine switches and 6 Leaf switches, and I am using a flat network with a single VLAN. How can I configure the Spine and Leaf switches, where should the L3 and L2 VLANs be created, and what protocol should be used in this scenario?

    Please help me.

    Thanks in advance,

