Regaining Efficiency in the Modern Datacenter

At the start of the virtual revolution, efficiency was the essential driver that made admins revisit how datacenters were architected. The ability to run a business on two or three servers instead of ten or twenty was something of a sea change. The ability to spin up new workloads in minutes instead of days was both profound and immediately impactful, showing how business could be done moving into the 21st century. The efficiency gained by running many workloads in parallel on the same resources (rather than wasting the CPU, disk, and RAM that sat idle on single-application server boxes) brought a fundamental change to the datacenter. Suddenly, CapEx and OpEx could be tamed in the client/server x86 world without resorting to old-school big iron (mainframes). It was an organic change to how x86 servers could be implemented.

Enter the Virtual Server

This was all very good stuff, but it brought its own suite of new problems. One of them (and a big one): when a physical server died, it no longer took out just a database or just a file server; it took out several workloads at once. Exchange, SQL, file and print, and an app server or two likely went with it (and stayed gone until repairs to the server could be effected). All of this was caused by using the disks internal to the server to house all of the VMs' virtual hard drives. A solution to this need for shareable block-level storage had to be found before the next major steps in the virtual revolution could take place.

Enter the SAN

The SAN brought flexibility and portability to the virtual infrastructure. By moving the virtual hard drives off of the internal disks and out to a network-shared, RAID-based enclosure, workloads could be quickly restarted in the face of hardware faults. However, it did this at a cost: a cost in complexity and overhead. Introducing the network as a carrier for block-level disk IO to external disk enclosures brought with it the overhead of TCP (or FCP), the overhead of new storage protocols (iSCSI and FC) or new uses for old storage protocols (NFS), new moving parts such as cabling and dedicated switching, and the complexity of networking and security needed to support running your entire IO path over a network, all layered on top of RAID penalties. An organic solution, and one that lost efficiency, but it covered the basics of restoring availability when a box died.
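The "RAID penalty" mentioned above can be quantified with simple arithmetic. Here is a minimal sketch using the standard write-penalty figures (2, 4, and 6 backend IOs per frontend write for RAID 1, 5, and 6 respectively); the function name and the example disk numbers are illustrative, not taken from any particular product:

```python
# Standard backend IOs generated per frontend write for common RAID levels.
RAID_WRITE_PENALTY = {"raid1": 2, "raid5": 4, "raid6": 6}

def effective_iops(raw_iops, write_fraction, raid_level):
    """Frontend IOPS a RAID set can sustain, given the raw IOPS of its disks.

    Each read costs 1 backend IO; each write costs `penalty` backend IOs.
    """
    penalty = RAID_WRITE_PENALTY[raid_level]
    return raw_iops / ((1 - write_fraction) + write_fraction * penalty)

# Illustrative example: 8 disks x 150 IOPS = 1200 raw IOPS, 30% writes, RAID 5.
print(round(effective_iops(1200, 0.30, "raid5")))  # -> 632
```

A 30% write mix on RAID 5 roughly halves the usable IOPS of the disk set, which is the efficiency loss the paragraph above alludes to.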

Enter the Killer App

This series of organic changes and solutions to the problems presented also enabled what could arguably be called the virtualization "killer app": live migration of a running virtual server between physical boxes, a capability historically provided only by mainframes. Killer app indeed; it drove the virtual revolution out of the shadows and to the forefront of datacenter implementations for the enterprise.

The problem with this servers-plus-switches-plus-SAN-plus-storage-protocols-plus-virtualization-software approach lay in the tortured, organically grown (rather than purpose-built) architecture used to get there. While it worked, the cost and complexity of the solution left it unapproachable for a majority of the SMB market. The benefits of live migration and fault tolerance were huge, but the Rube Goldberg-like machine used to get there redefined complexity and cost in the datacenter.

Enter the Clean Sheet Approach

Clearly, it was time to rethink the approach to virtualization: to keep its goals, but eliminate the inefficiencies introduced through organic problem solving by taking a "clean sheet" approach. The question became how high availability could be achieved without the losses to complexity, cost, and advanced training that made the now-"legacy" virtualization approach unreachable for many in the marketplace.


Two different schools of thought emerged on how best to simplify the architecture while maintaining the benefits of virtualization.

The VSA/Controller VM approach – Simply virtualize the SAN and its controllers, also known as pulling the SAN into the servers. The VSA, or Virtual SAN Appliance, approach was developed to move the SAN up into the host servers through the use of a virtual machine. This did in fact simplify implementation and management by eliminating the separate SAN. However, it did little to simplify the data path or regain efficiency. The VSA consumed significant host resources (CPU and RAM), still used storage protocols, and complicated the path to disk by turning the IO path from application->RAM->disk into application->RAM->hypervisor->RAM->SAN controller VM->RAM->hypervisor->RAM->write-cache SSD->disk. Often, this approach consumes so much resource that one could run an entire SMB datacenter on just the CPU and RAM allocated to these VSAs. For the SMB, this approach tends to lack the efficiency that the sub-100-VM datacenter really needs.

Figure: VSA data path

The HES (Hypervisor Embedded Storage) clustered approach – Eliminate the dedicated storage servers, storage protocol overhead, resource consumption, multi-layer object files and filesystem nesting, and associated gear by moving the hypervisor directly into the OS of a clustered storage platform as a set of kernel modules, thereby simplifying the architecture dramatically while regaining the efficiency originally promised by virtualization.


It is up to you to decide which approach is most efficient for your individual needs based on the requirements of your datacenter. While the first approach brings some features that may make sense to the admin accustomed to an enterprise-scale fleet of servers, switches, and multiple big-iron SAN implementations, it does so at a resource cost that just doesn't make sense to the efficiency-minded SMB and mid-market systems administrator, who has far too many other responsibilities to spend time worrying about complexity in the architecture.
