In the past I’ve written about the basics of Hadoop network traffic, 10GE Hadoop clusters, Leaf/Spine fabrics, and talked about how you might construct a fabric for Big Data and private cloud. In this post I’ll continue on that theme with a high-level discussion on linking these environments to the rest of the world via the existing monolithic IT data center network.
The conventional wisdom and default thinking usually runs along one of two lines:
- Expand the existing monolithic IT data center network to absorb the new big data or private cloud environment.
- Build a unique fabric for the new environment, then link that fabric to the existing IT network via switch-to-switch connections and networking protocols.
Rather than bore you with the conventional wisdom, I’ll break from that and instead discuss a third approach, one where we treat the big data / cloud fabric and its machines collectively as an “appliance”. To link this appliance to the outside world we use x86 Linux based gateway machines. This is an approach I’ve come across several times now in my travels, listening to customers talk about their plans or production environments.
In the appliance model we recognize that not all of the machines in our big data cluster or private cloud need to have a direct network path to the outside world. Many of these machines are worker “nodes” holding and processing sensitive data, or virtual machines. A worker node communicates directly with other worker nodes or master coordination nodes in the cluster – this is where we observe so-called “East-West” traffic.
We also recognize that access from the outside to the data or VMs residing on the worker nodes is often facilitated through another machine providing a gateway function for the cluster. Two examples of such a gateway machine are the Client machines (sometimes called “Edge Nodes”) in Hadoop, and the Quantum networking machines providing L3/NAT services in OpenStack. Another example might be machines in a VMware vCloud running instances of vShield Edge.
The gateway machines are the interface point and pathway to our “appliance”, having one leg (NIC) on the inside connected to the cluster fabric, and another leg (NIC) on the outside connected to the existing data center network. Outside of the cluster we have users who want to access the data or applications inside our “appliance”, a typical Client/Server relationship – this is where we observe so-called “North-South” traffic.
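To make the two traffic patterns concrete, here is a minimal Python sketch that classifies a flow by its endpoints. The subnet, the gateway address, and the function name `classify_flow` are all assumptions made up for illustration; none of them come from a real deployment.

```python
import ipaddress

# Hypothetical addressing, chosen only for this illustration.
CLUSTER_FABRIC = ipaddress.ip_network("10.10.0.0/16")      # inside legs and worker nodes
GATEWAY_OUTSIDE = {ipaddress.ip_address("192.168.1.10")}   # outside leg of a gateway


def classify_flow(src: str, dst: str) -> str:
    """Label a flow according to the appliance model described above."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    s_inside = s in CLUSTER_FABRIC
    d_inside = d in CLUSTER_FABRIC

    if s_inside and d_inside:
        # Node-to-node traffic that never leaves the cluster fabric.
        return "east-west"
    if s_inside != d_inside:
        # One end inside, one end outside: the model provides no such path.
        return "no direct path"
    if s in GATEWAY_OUTSIDE or d in GATEWAY_OUTSIDE:
        # A user on the data center network talking to a gateway's outside leg.
        return "north-south"
    return "outside the appliance"


print(classify_flow("10.10.1.20", "10.10.2.30"))     # worker to worker
print(classify_flow("192.168.5.20", "192.168.1.10")) # user to gateway
print(classify_flow("192.168.5.20", "10.10.1.20"))   # user to worker
```

The third call is the interesting one: a user trying to reach a worker node directly has no path, and has to go through a gateway instead.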
How does this differ from the aforementioned conventional approaches?
The connection point between our cluster “appliance” and the existing IT data center network is like that of any other server. The IT data center network administrator need only provide the appropriate server access connectivity to the “outside” leg of the Gateway machine(s). There are no IP routing protocols or spanning-tree connections to engineer between the East-West cluster fabric and the North-South data center network. Even more interesting, given that the East-West cluster fabric is purpose-built and isolated from the main data center network, the cluster administrator can implement tools that orchestrate and simplify the cluster network switch provisioning specific to the needs of the cluster nodes – one step closer to a more self-contained, appliance-like deployment model.
With no direct network path between the cluster fabric and the outside world, access to our cluster resources must traverse x86 Linux based machines (the gateways), which have a lot more security controls available than a typical network switch. Mandatory access control in the kernel (SELinux) and host firewalls (iptables) are some examples of freely available and well understood Linux based security. Additionally, any disruption or instability event in the main data center network will not cascade (via network protocols) into the cluster fabric.
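As a rough illustration of the firewall side of this, the Python sketch below only generates (it does not run) a small default-deny iptables rule set for a gateway. It assumes an edge-node style gateway that terminates user sessions and does not route between its two legs; a gateway that provides L3/NAT services would need forwarding and NAT rules instead. The interface names `eth0` and `eth1`, the allowed port, and the helper name `gateway_rules` are hypothetical.

```python
def gateway_rules(outside_if="eth0", inside_if="eth1", allowed_tcp_ports=(22,)):
    """Return iptables commands for a default-deny gateway (illustrative only)."""
    rules = [
        # Always allow loopback and replies to connections already established.
        "iptables -A INPUT -i lo -j ACCEPT",
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
        # Assumption: the inside leg faces only trusted cluster nodes.
        f"iptables -A INPUT -i {inside_if} -j ACCEPT",
    ]
    for port in allowed_tcp_ports:
        # Admit new sessions from the data center network on named ports only.
        rules.append(
            f"iptables -A INPUT -i {outside_if} -p tcp --dport {port} "
            "-m conntrack --ctstate NEW -j ACCEPT"
        )
    # Set the default policies last so the ACCEPT rules are in place first.
    rules.append("iptables -P INPUT DROP")
    # No forwarding: nothing is routed between the outside and inside legs.
    rules.append("iptables -P FORWARD DROP")
    return rules


for rule in gateway_rules():
    print(rule)
```

Treat the output as a starting point to review, not a finished policy. Which ports to open on the outside leg depends entirely on the services the gateway exposes.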
With our cluster fabric purpose-built and isolated, the cluster administrator is now free to design and scale the fabric specific to the needs of the cluster, independently of the main data center network. A network design that works well for scaling the cluster fabric, such as Leaf/Spine, may not be the design that works well for the existing applications on the main data center network. A purpose-built and isolated cluster fabric allows each administrator to make the best network design choice for their specific environment.
This is about placing the right features, at the right cost, at the right places in the network. With our cluster fabric designed independently from the data center network, we can choose networking equipment with the feature set and performance that best meets the needs of our cluster fabric designed for East-West traffic (Leaf/Spine) – nothing more, nothing less. On the other hand, our main data center network usually needs to accommodate a more complex feature set suited for the North-South traffic (such as VRF, MPLS, LISP, etc.) – so it makes sense to pay for those features where you need them, and not where you don’t.
Enabling maximum automation
With our cluster fabric and its machines, including the x86 Linux gateway machines, encapsulated into a unique administrative boundary, we are ready to introduce automation tools specifically designed for our particular deployment of big data or cloud, with comprehensive and coordinated control over the servers and network settings of our cluster “appliance”.
Maybe you’re worried about not having the necessary in-house skills in, say, Hadoop, VMware vCloud, or OpenStack? Perhaps this is where you hand that administrative boundary (and its necessary skills & responsibilities) over to a third party – perhaps the vendor of a comprehensive big data or cloud turn-key “solution”. On the other hand, you might have uber-skilled folks in-house capable of building homegrown cluster deployment tools. In that case the maximum innovation potential will be achieved with an authoritative domain that includes both the cluster servers and the cluster network, with a well-defined, secure interface to the North-South data center network domain via x86 gateway machines.