Rearchitecting Unstructured Storage  

By Devon Helms


Have you ever considered what would happen if you took an architecture that was meant to store terabytes of data and increased the requirements by three to four orders of magnitude? It doesn’t take a data scientist to understand why things would break down under such conditions. The rapid growth of unstructured data on legacy infrastructure is creating just such a problem. It’s time to rethink unstructured storage and design architectures for the new realities of massive scale and automated management.

In a recent report on the rise of unstructured data, analyst firm ESG discusses how legacy architectures are ill-equipped to manage, host, and provide access to critical data stored on legacy file and object systems. The report presents the key qualities modern unstructured storage solutions must possess to help administrators manage this data deluge while providing greater access to the users who need the data to drive business success.

The main problem is rapidly growing silos of unstructured data on legacy architectures. Unstructured data stored on file and object systems nearly doubles every couple of years as more users and endpoint machines create ever more primary and copy data. This isn’t garbage data; it’s often critical to the success of the business.
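To put that growth rate in perspective, a rough back-of-the-envelope calculation shows how long steady doubling takes to add three to four orders of magnitude. The sketch below assumes a doubling period of exactly two years, which is an illustrative reading of "every couple of years" rather than a figure from the ESG report; the function name is likewise hypothetical.

```python
import math


def years_to_grow(factor: float, doubling_period_years: float = 2.0) -> float:
    """Years needed for data to grow by `factor`, assuming it doubles
    once every `doubling_period_years` (an assumed, illustrative rate)."""
    if factor <= 0 or doubling_period_years <= 0:
        raise ValueError("factor and doubling period must be positive")
    # Number of doublings needed is log2(factor); each takes one period.
    return math.log2(factor) * doubling_period_years


if __name__ == "__main__":
    # Three and four orders of magnitude: 1,000x and 10,000x.
    for exponent in (3, 4):
        years = years_to_grow(10 ** exponent)
        print(f"10^{exponent}x growth: about {years:.0f} years")
```

Under that assumption, a thousandfold increase takes roughly 20 years and a ten-thousandfold increase roughly 27, which is well within the service life of many architectures originally sized for terabytes.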

Unstructured data typically consists of data that is not easily searchable, including such formats as audio, video, and social media postings. It has an internal structure but is not organized according to pre-defined data models. It may be textual or non-textual, and human- or machine-generated. It is typically stored on file and object storage.

Providing continuous, reliable access to the data on storage with the right performance capabilities is yet another responsibility of the IT team. As the storage footprint grows to accommodate the data, both IT teams and the business at large feel the strain.

  • Management complexity increases as legacy tools struggle to keep up. 
  • Mean time to resolution increases as older troubleshooting tools slow the diagnostic process. 
  • System maintenance windows for software upgrades and hardware refreshes lengthen and downtime increases as disruptive update and rip-and-replace processes become more frequent and impactful.

And, of course, IT teams are expected to continue to manage this growth with flat budgets and limited headcount increases.

IT leaders see digital transformation projects as a key factor in providing greater access to data while reducing maintenance overhead. According to ESG, 86 percent of surveyed respondents say that they will be less competitive if they don’t embrace these digital transformations. 

ESG identifies three key qualities of a modern “data-centric unstructured storage” system that businesses need to address massive data growth and support digital transformation efforts. 

First, the solution must be simple. It should reduce the management overhead for administrators, freeing them to do more valuable activities. The solution should reduce the learning curve, enabling a broader set of users to self-serve their own data access needs. It should also use automation to standardize processes and prevent self-service provisioning from violating proper usage policies. And it must provide API and programmable management at every point so that developers and applications can directly address and manage the storage resources. 
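As a rough illustration of how automation might keep self-service provisioning within usage policy, the sketch below validates a request before any storage would be allocated. Every name here (`ProvisioningPolicy`, `ShareRequest`, `check_request`, the size limit, and the protocol list) is hypothetical and does not come from ESG or from any particular product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvisioningPolicy:
    """Hypothetical usage policy set by administrators."""
    max_share_gib: int
    allowed_protocols: frozenset


@dataclass(frozen=True)
class ShareRequest:
    """Hypothetical self-service request submitted by a user."""
    owner: str
    size_gib: int
    protocol: str


def check_request(request: ShareRequest, policy: ProvisioningPolicy) -> list[str]:
    """Return a list of policy violations; an empty list means the request passes."""
    violations = []
    if request.size_gib <= 0:
        violations.append("size must be positive")
    if request.size_gib > policy.max_share_gib:
        violations.append(
            f"size {request.size_gib} GiB exceeds limit of {policy.max_share_gib} GiB"
        )
    if request.protocol not in policy.allowed_protocols:
        violations.append(f"protocol '{request.protocol}' is not permitted")
    return violations


if __name__ == "__main__":
    policy = ProvisioningPolicy(max_share_gib=1024,
                                allowed_protocols=frozenset({"nfs", "smb"}))
    request = ShareRequest(owner="analytics", size_gib=2048, protocol="s3")
    for problem in check_request(request, policy):
        print("rejected:", problem)
```

Returning every violation at once, rather than stopping at the first, gives a self-service user enough information to correct the request without involving an administrator.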

Second, the solution must be flexible. The popularity of public cloud derives mainly from its flexibility. A modern solution should adopt that same flexibility, allowing administrators choice of platform and hardware—no more lock-in. It should allow administrators to pay for what they use with cloud-like consumption models, freeing administrators from resource planning uncertainty. It should eliminate silos, allowing file, object, or block storage on the same platform, as well as expansion from the on-prem platform to the public cloud. And it should be capable of scaling on demand to support the increasing workloads and data with minimal disruption. 

Finally, the solution must be intelligent enough to automate routine management tasks while maintaining control of the data. It should provide automated self-healing and self-tuning capabilities to reduce both downtime and the time spent on maintenance activities. It should provide visibility into the data being stored so that data placement decisions are more accurate. And it should provide machine learning-based automated compliance and control to ensure data is secure and protected from unwanted access and use.

An “intelligent” cloud-like architecture is, in many ways, the ideal model: simple enough for anyone to request and access data services, flexible enough to mask the underlying hardware while also providing optimal data placement, and automated enough for any user to own and manage with confidence. For many workloads, though, public cloud is not the right solution. The economics rarely make sense at large scale, and performance often suffers when data must be accessed over long distances. For these workloads, an on-prem solution that includes these qualities will be the best choice and should be the goal for IT teams creating modern data-centric unstructured storage.

While the struggle to manage the exponential growth of unstructured data continues to impede enterprise datacenters, storage organizations are looking for solutions. To learn more about this solution and others, check out ESG’s research and analysis of the needs of unstructured storage solutions. You can read more at  

© 2019 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and the other Nutanix products and features mentioned herein are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned herein are for identification purposes only and may be the trademarks of their respective holder(s).