Nutanix and Microsoft Private Cloud: We begin our journey with Hyper-V (Part I)


By Tim Isaacs

Over the last few years, Hyper-V has gained great traction with IT organizations across numerous use cases, geographies and customer segments. According to IDC, Hyper-V has grown to capture nearly one-third of the hypervisor market over the last five years, as seen in the chart below (source: IDC WW Quarterly Server Virtualization Tracker, Dec 2013).

Hyper-V’s market share is expected to grow faster than virtualization overall, driven by several factors, including the upcoming end of support for Windows Server 2003. According to Microsoft, roughly 15 million instances of Windows Server 2003, most of them physical, will need to be upgraded within 12 months to avoid any lapse in support.

We are watching this market dynamic unfold as our Hyper-V customer base rapidly grows. The number of Nutanix systems running Hyper-V reached triple digits within just the first few months of our releasing support for the hypervisor. Our healthy and growing pipeline of opportunities around VDI, remote office, business-critical applications and private cloud deployments further supports the trend.

The Nutanix OS 3.5.2 release late last year marked the beginning of our journey with Hyper-V. We released support for the full spectrum of Hyper-V features, including native SMB 3.0, ODX, TRIM, Performance and Resource Optimization (PRO), Failover Clustering and Live Migration. Most storage vendors claim similar support. However, the quality of the integration lies in the details of the implementation and in the ability of the vendor’s architecture to take full advantage of the hypervisor feature set.

Nutanix brings to deployments of Windows Server 2012 R2 with Hyper-V the same advantages we deliver in ESXi and KVM environments, and more. Here are some of the top areas that make us unique:

  • Eliminates Storage Complexity: We remove the management burden and complexity of networked storage by presenting a single large container for virtual disk storage, doing away with volumes, LUNs, RAID groups and the storage area network.

In the context of Hyper-V, we present that same large container as an SMB 3.0 share where Hyper-V virtual machines store their virtual disks. Cluster Shared Volumes (CSV) are not required. As compute and storage needs grow, new Nutanix systems (the scale unit is a single server node) can simply be added to the existing Nutanix cluster, resulting in linear scale-out of both storage and compute without needing to tweak the cluster.

Contrast this with a traditional storage scenario, where admins have to provision a LUN within an appropriate group of disks (for example, a RAID group) offering the performance and fault tolerance to meet defined storage objectives. If existing disk groups are near capacity, admins have to pay close attention to the remaining space, keeping a safe margin for future provisioning and carving out new disk groups as needed. Remember, traditional storage is a scale-up model, requiring either a rip and replace of existing infrastructure with a larger storage system or deploying a completely new silo of storage. Once system limits are reached, admins face a long wait before another system becomes operational; think of the end-to-end procurement process, including pre-sales sizing, purchasing, shipping and post-sales installation, to name just a few steps.

After provisioning storage and creating the LUN, admins must set authorization through zoning and masking; ensure adequate network connectivity, resiliency and performance; and then attach the LUN to the server. From the server, admins format the LUN with a file system, and only beyond this point is the storage ready to hold virtual disks. We are simplifying for the sake of brevity, but as one can imagine, there are plenty of intermediary steps not discussed here.

  • Enables efficient Copy & Zeroing operations: The Hyper-V ODX (Offloaded Data Transfers) feature allows Windows to offload data copy and zeroing operations to the Nutanix CVM, resulting in as much as 10x better performance than equivalent client-level operations.

As a bonus, enabling inline deduplication for the performance tier means we simply map the copy to the original data blocks rather than making an outright copy. ‘Zeroing’ offloads are treated as lightweight metadata operations requiring minimal system resources, and are also leveraged to zero out fixed-size VHDXs.
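The idea can be sketched in a few lines of Python. The `ExtentStore` class below is a hypothetical toy model, not Nutanix code: with inline deduplication enabled, an offloaded copy becomes a reference-count bump on an already-fingerprinted block, and a zeroing offload is a pure metadata update — no data is read or written.

```python
import hashlib

class ExtentStore:
    """Toy block store: offloaded copies and zeroes are metadata operations."""
    def __init__(self):
        self.blocks = {}      # fingerprint -> block data (stored once)
        self.refcount = {}    # fingerprint -> number of extents referencing it
        self.extents = {}     # (file, offset) -> fingerprint, or "ZERO"

    def write(self, file, offset, data):
        fp = hashlib.sha1(data).hexdigest()   # fingerprint computed on write
        if fp not in self.blocks:
            self.blocks[fp] = data
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        self.extents[(file, offset)] = fp

    def offload_copy(self, src, dst, offset):
        # ODX-style copy: no data moves, just bump the source block's refcount
        fp = self.extents[(src, offset)]
        self.refcount[fp] += 1
        self.extents[(dst, offset)] = fp

    def offload_zero(self, file, offset):
        # Zeroing offload: mark the extent zero-filled in metadata only
        self.extents[(file, offset)] = "ZERO"
```

An offloaded clone of a virtual disk in this model costs one dictionary update per extent, regardless of how much data the extent holds.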

  • Enables efficient Thin provisioning: The Hyper-V TRIM capability allows us to reclaim capacity the moment it is freed, with minimal impact to system resources, making thin provisioning more efficient.

TRIM’s usefulness depends on how efficiently the storage system responds to client-generated events. For example, when a file is deleted within a VM, we respond simply by flagging the blocks in question for future garbage collection. To admins, this means capacity figures are immediately up to date, while the actual reclamation work is deferred and carries minimal impact to system resources.
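The deferred-reclamation pattern can be sketched in Python. This is a toy model, not our actual implementation: TRIM merely flags blocks, free capacity is reported immediately, and a background pass does the real cleanup later.

```python
class ThinVolume:
    """Toy thin-provisioned volume: TRIM flags blocks for later collection."""
    def __init__(self, total_blocks):
        self.total = total_blocks
        self.allocated = set()
        self.trimmed = set()   # flagged for future garbage collection

    def write(self, block):
        self.trimmed.discard(block)
        self.allocated.add(block)

    def trim(self, block):
        # Respond to the client's TRIM by flagging the block; no I/O yet
        self.allocated.discard(block)
        self.trimmed.add(block)

    def free_blocks(self):
        # Free capacity is visible the moment the TRIM lands
        return self.total - len(self.allocated)

    def garbage_collect(self):
        # A background task performs the actual reclamation later
        reclaimed = len(self.trimmed)
        self.trimmed.clear()
        return reclaimed
```

The point of the sketch is the ordering: `free_blocks()` reflects the deletion before `garbage_collect()` has ever run.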

  • Complements Failover Clustering: Failover clustering is integral to application availability and requires storage to be shared across all hosts. Our architecture is based on shared storage pooled from the SSDs and HDDs local to the nodes in the cluster. All hosts are presented with a single, scale-out, distributed file system, giving virtual machines access to their storage from any host within the cluster.

Supporting failover clustering with traditional storage requires presenting a large LUN to all hosts. However, the performance and capacity of that LUN are limited by either the storage controller or the underlying disks. Traditional storage models break when one of the two becomes a bottleneck, warranting a rip and replace of the system.

Compare this to our architecture based on shared storage, where all hosts within the cluster contribute to cluster capacity and performance. When capacity or performance is exhausted, admins simply add new Nutanix systems to the cluster with no downtime, resulting in linear scale-out of capacity and performance. Admins who want to balance virtual machine (VM) workloads across cluster hosts can leverage the Performance and Resource Optimization (PRO) capability in System Center Virtual Machine Manager (SCVMM), where load-balancing rules are configured. These rules determine when and where virtual machines move within the cluster and govern how physical resources are distributed across VMs. VM data moves within the cluster along with the VM it belongs to, without any manual intervention.
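The shape of such a load-balancing rule can be sketched in Python. This is a deliberately simplified, hypothetical model — real PRO/SCVMM placement considers far more than CPU — but it illustrates the decision the rules encode: when a host crosses a threshold, pick a VM and a less-loaded target for it.

```python
def plan_migrations(hosts, cpu_threshold=0.8):
    """Toy PRO-style rule: hosts maps host -> {vm: cpu_fraction}.
    Returns a list of (vm, source_host, target_host) moves."""
    moves = []
    # Current aggregate CPU load per host
    load = {h: sum(vms.values()) for h, vms in hosts.items()}
    for host, vms in hosts.items():
        if load[host] > cpu_threshold and vms:
            vm = max(vms, key=vms.get)          # move the busiest VM...
            target = min(load, key=load.get)    # ...to the least-loaded host
            if target != host:
                moves.append((vm, host, target))
                load[host] -= vms[vm]
                load[target] += vms[vm]
    return moves
```

On Nutanix, the interesting part is the last sentence of the paragraph above: a move planned this way requires no storage rezoning, because every host already sees the same distributed file system.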

  • Quick Installation with automated Setup: Installing Windows Server 2012 R2 with Hyper-V and the Nutanix Operating System on multiple hosts completes in less than an hour. Setup is fully automated, including joining hosts to Active Directory, creating a failover cluster and optionally registering hosts with SCVMM.

Traditional infrastructure involving standalone servers and networked storage typically takes days to install and set up through expensive, vendor-provided professional services. Some of those complexities were discussed earlier. Here is a picture telling the story:

In summary, Hyper-V when combined with Nutanix creates an excellent full-stack virtualized platform for all workloads and environments. The Nutanix OS 3.5.2 release marks only the beginning of this journey with plenty more to come.

Why the buzz around scale-out (or web-scale)?

Customers appreciate the scale-out value of starting small and growing as data and workload demands grow. However, effective scale-out implies that system resources scale linearly: if one node in a cluster contributes X IOPS and Y TB of storage, then 50 nodes should contribute 50X IOPS and 50Y TB. This engineering problem is optimally solved only through distributed architectures like Nutanix, designed from the start for highly distributed scale-out (we call this web scale). Contrast this with traditional storage vendors and their monolithic scale-up architectures. When those architectures are retrofitted with scale-out, there is a penalty to be paid in the form of complexity, bottlenecks and, more importantly, non-linear scaling.
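The arithmetic is trivial, but it is worth writing down. The sketch below introduces a hypothetical `efficiency` factor of our own to model the penalty: ideal linear scale-out sits at 1.0, while retrofitted scale-up architectures land somewhere below it.

```python
def cluster_capacity(nodes, iops_per_node, tb_per_node, efficiency=1.0):
    """Aggregate IOPS and TB for a cluster of identical nodes.
    efficiency=1.0 models ideal linear scale-out; values below 1.0
    model the non-linear scaling of retrofitted architectures."""
    return (nodes * iops_per_node * efficiency,
            nodes * tb_per_node * efficiency)
```

With 50 nodes at X = 1,000 IOPS and Y = 4 TB each, linear scaling means the cluster delivers exactly 50,000 IOPS and 200 TB; anything less is the retrofit penalty showing up.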

Distributed architectures further lend themselves to value-added data management activities such as deduplication, compression, storage analytics, erasure coding and more. Take the example of deduplication. Once fingerprints are computed (on write), all nodes participate in removing duplicates through MapReduce-initiated processes, linearly scaling deduplication power as the number of nodes in the cluster increases. The scope of deduplication as a result spans the entire cluster. Compare this to traditional storage vendors, where deduplication relies on the resources of a single controller. If you are not careful, deduplication can easily interfere with client performance. While price-insensitive customers may oversize and buy several expensive controllers to mitigate this problem, the scope of deduplication remains small, restricted to only a portion of the controller silo (e.g., a volume). A small scope implies fewer duplicates and, as a result, lower savings.
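The map/reduce split described above can be sketched in Python. This is an illustrative toy, not the Nutanix implementation: the map phase fingerprints blocks on each node independently (and can therefore run on every node in parallel), and the reduce phase groups locations by fingerprint, surfacing duplicates across the whole cluster rather than within a single controller silo.

```python
import hashlib
from collections import defaultdict

def map_fingerprints(node_blocks):
    # Map phase: each node fingerprints its own blocks independently
    return [(hashlib.sha1(block).hexdigest(), (node, i))
            for node, blocks in node_blocks.items()
            for i, block in enumerate(blocks)]

def reduce_duplicates(pairs):
    # Reduce phase: group block locations by fingerprint across the
    # whole cluster; any group larger than one is a duplicate
    groups = defaultdict(list)
    for fp, location in pairs:
        groups[fp].append(location)
    return {fp: locs for fp, locs in groups.items() if len(locs) > 1}
```

Because the reduce phase sees fingerprints from every node, a block written on node 1 deduplicates against an identical block on node 2 — something per-controller, per-volume dedup can never do.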

We will soon be releasing part II of this blog series discussing the continued evolution of our story with Hyper-V. You will read about new product features, improvements, touch points with the Azure cloud and why our solution is unique. Stay tuned!