The Dinosaurs Are Dying


Having spent almost a decade in IBM research inventing bleeding-edge technologies for storage systems, and proudly making them the biggest and baddest creatures in storage-land, I now feel that the beginning of the end for SAN-like storage systems is hurtling towards us. The last few T. rex might be the most vicious, but their roar will soon be forgotten as the Jurassic age of computing comes to an end.

Virtualization has ushered a slew of fundamental changes into the computing landscape. With an ever-increasing demand for IT processing, data centers have been witnessing unprecedented sprawl. Virtual machines have allowed hundreds of traditional servers to be consolidated onto a few powerful physical servers. From virtual desktops to software-as-a-service to anything cloud, virtual machines have become a fundamental tenet of the data center. This extreme concentration of computing logic onto a handful of data center servers gives rise to an unprecedented demand for concentrated storage performance. The SAN vendors coined terms like ‘boot storms’ in a subliminal attempt to shift the blame back to the consolidated servers, while actually exposing the reality that SANs were never designed for virtualized servers. Most storage vendors today provide an ensemble of yesteryear’s technologies glued together with a serious amount of IT dollars. At IBM storage research, I realized that it was time, once again, to think outside the storage box.

Applications that deal with large amounts of data, like Google’s search, Facebook’s inbox, and Amazon’s services, have all developed their own storage architectures because even the scale-out SAN architectures were seriously inadequate. The fundamental problem is the bottleneck in the networking fabric that SAN (or NAS) architectures create by separating the consolidated servers from the consolidated storage. The sequential bandwidth of hard disks, and the spectacular performance of recent SSDs, have encouraged the creation of data-hungry applications and made the problem even more acute. To work around the networking meltdown, the Googles and Facebooks of the world hired a bunch of PhDs to rewrite the entire application layer so that logic is shipped to run on the hardware where the data resides (a.k.a. MapReduce). This is diametrically opposed to the assumptions of traditional SAN architectures, in which data is shipped across the network to the application. At IBM, I saw such non-SAN storage as complementary and non-threatening to the traditional SAN systems where my research was focused. But something else was brewing on the horizon.

The explosion of virtual machines and server consolidation in the data center has completely shaken the SAN world. Most traditional applications (which constitute the majority of the software wealth of mankind) have been designed with the assumption that data is shipped to the application. From this contradiction arises the third option, where logic and data are in fact co-located. Even in the human brain, which arguably is the most evolved form of computing, logic and data inextricably and sometimes indistinguishably co-reside. This emergence of what might be called NoSAN seems both logical and inevitable. I believe this ‘third option’ is the species that will not only run rampant in the green fields of Big Data but will also push the SAN dinosaurs into a corner on their home turf. Data will need to be primarily stored on the same hardware as the applications, or close to it, for the best scale-out performance, while being backed up elsewhere for reliability.

A couple of years ago, the founders of Nutanix had the vision and fortitude to embark on a mission to realize this third option, one that would deliver seamless mobility, scalability, performance and reliability in a harmonious marriage of storage and server technologies. The dawn of a new era in computing is now upon us as Nutanix unveils the data center of the future. This is not to say that SANs will cease to exist; rather, just like RDBMSs, they will play a supporting role going forward. It is time to embrace Not-only-SAN, Not-only-SQL, and Not-only-what-used-to-be-cool.