VDI Series: Part 3 – Incremental Scalability

In my previous blog post, I talked about how the Nutanix architecture, built from the ground up for virtualization, is a perfect fit for the unique performance needs of VDI. In this post, I'd like to cover another pain point of VDI: incremental scalability. Many VDI deployments stall when they try to scale beyond roughly 200 desktops. I'll cover some of the reasons for this failure, and then go into how the Nutanix architecture is designed to scale rapidly while maintaining high performance, enabling organizations to grow their VDI deployments, as needed, one node at a time.

Let’s look at some of the reasons why VDI deployments fail to scale when using the standard server + SAN approach:

Many VDI deployments do not have a dedicated storage array. Typically, the SAN is a common resource to all virtualization initiatives in an organization. From the get-go, this approach can be a recipe for disaster. SAN resources are shared by VDI workloads as well as server workloads, which have different performance profile requirements. There is no effective way to guarantee any kind of QoS for the initial VDI deployment, let alone scale the VDI deployment effectively.

If a VDI deployment does get the budget for its own SAN, the SAN is sometimes under-provisioned because it is sized for the initial number of planned desktops. When the deployment needs to scale, the SAN's performance and capacity are no longer sufficient, and another storage array needs to be wheeled in. Adding a storage array not only makes the CAPEX go through the roof, but also increases the management complexity of the VDI deployment.

In the most common case, SAN resources are over-provisioned. The idea here is that the SAN is a one-time purchase, and the organization can add servers to the fabric as the deployment scales. This approach leads to two common problems:

– The organization is forced to take a huge upfront CAPEX hit, which then drives up the overall cost of the project and the time it takes to realize the ROI
– The interconnect becomes the bottleneck. The array typically sits on the other side of the core switch, which is heavily oversubscribed, so the path between the servers and the storage becomes a huge choke point.

The real solution to delivering incremental scalability is to keep the compute right next to the storage, which is key to the Nutanix approach for achieving higher performance with greater simplicity. A single Nutanix block houses four nodes, and organizations can add single nodes to their Nutanix cluster to grow their VDI deployment by 50-100 desktops per node, depending on the user profile (task worker, knowledge worker, or power user).
Some of the key attributes of the Nutanix architecture that deliver this incremental scalability are described below:

Shared Nothing Distributed Architecture: The Nutanix Cluster is a pure distributed system. This means that the compute gets its storage locally, and does not need to traverse the network. All the problems around network, interconnect and core switch bottlenecks are avoided completely. IT organizations can add single Nutanix nodes or blocks of four nodes, as needed to achieve linear scaling in performance and capacity.

Distributed Metadata: A big problem affecting scalability in a traditional SAN is metadata access. Most storage arrays have 1-4 controller heads, and all metadata access needs to go through these heads. This causes contention, and performance drops as more servers try to access the same storage array.

In the Nutanix Cluster, metadata is maintained in a truly distributed and decentralized fashion. Each of the nodes in the Nutanix Cluster maintains a part of the global metadata, which means that there is no single bottleneck for metadata access.
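To make this idea concrete, here's a toy sketch of key-range partitioning. It's purely illustrative — the class name, the hashing scheme, and the node names are mine, not Nutanix internals — but it shows how hashing metadata keys onto a ring lets every node own a slice of the global metadata, so no single node serves every lookup:

```python
import bisect
import hashlib


class MetadataRing:
    """Toy consistent-hash ring: each node owns slices of the metadata
    key space, so lookups are spread across the whole cluster.
    Illustrative only; not Nutanix's actual implementation."""

    def __init__(self, nodes, vnodes=64):
        # Place several virtual points per node for an even spread.
        self._ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def owner(self, metadata_key):
        """Return the node responsible for this metadata key."""
        h = self._hash(metadata_key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]


ring = MetadataRing(["node-a", "node-b", "node-c", "node-d"])
```

Because ownership is determined by hashing rather than by a central directory, any node can compute which peer to ask for a given key, and adding a node only reassigns the slices adjacent to its new ring positions.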

Distributed Metadata Cache: Most storage arrays do a lousy job of maintaining metadata caches. In arrays that do have a metadata cache, the cache lives on the limited number of controller heads, so access to it is subject to the network, interconnect, and switch bottlenecks discussed above.

In the Nutanix Cluster, metadata is cached on each of the controller VMs. Most metadata access is served up by cache lookups. Each controller VM maintains its own cache.  This means that however large the Nutanix Cluster grows as you continue to add nodes, the cost of metadata access stays the same.
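A minimal sketch of such a per-node cache, assuming a simple LRU policy (the class name, API, and policy are my own illustration, not Nutanix code): hot lookups are served locally, so the cost of a cache hit stays constant no matter how many nodes join the cluster.

```python
from collections import OrderedDict


class NodeMetadataCache:
    """Illustrative per-node LRU metadata cache. A hit is a purely
    local operation; only a miss falls back to a remote lookup."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch):
        """Return cached metadata, calling `fetch` (e.g. a lookup
        against the node that owns the key) only on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._entries[key] = fetch(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recent
        return value
```

The point of the sketch: the cache's data structures live with the node doing the lookup, so hit latency never involves the interconnect.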

Lock-free Concurrency Model: The standard approach to ensuring correctness for metadata access is locking. Unfortunately, locking is tricky in a distributed system: coarse-grained locking ensures correctness, but causes performance to drop like a rock.

The Nutanix Cluster implements an innovative lock-free concurrency model for metadata access, which ensures metadata correctness while maintaining high performance.
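The general technique behind lock-free update loops is optimistic, compare-and-swap-style concurrency. Here's a sketch of that general pattern (my own illustration of the technique, not Nutanix's implementation, which isn't public): writers never hold a lock while computing — they read a version, build the new value, and commit only if the version hasn't changed, retrying otherwise.

```python
import threading


class VersionedRecord:
    """Illustrative versioned record with compare-and-swap updates.
    A tiny lock stands in for an atomic CAS primitive (which plain
    Python lacks); it guards only the instant of the swap, never the
    work of computing the new value."""

    def __init__(self, value=0):
        self._version = 0
        self._value = value
        self._atomic = threading.Lock()

    def read(self):
        with self._atomic:
            return self._version, self._value

    def compare_and_swap(self, expected_version, new_value):
        with self._atomic:
            if self._version != expected_version:
                return False  # another writer committed first: caller retries
            self._version += 1
            self._value = new_value
            return True


def optimistic_increment(record):
    """Retry loop instead of blocking: read, compute, attempt commit."""
    while True:
        version, value = record.read()
        if record.compare_and_swap(version, value + 1):
            return
```

Under low contention, updates commit on the first attempt with no waiting; under high contention, losers simply retry against the fresh value rather than queueing behind a long-held lock.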

Distributed MapReduce for data/metadata consistency: For a large-scale deployment, consistency checking of data and metadata becomes a challenge. The Nutanix Complete Cluster implements a fully distributed MapReduce algorithm to ensure data and metadata consistency. The distributed nature of the MapReduce jobs ensures that there is no single bottleneck in the system. MapReduce has been shown by Google to scale to thousands of nodes, and is a key ingredient in the incremental scalability of the Nutanix Cluster.
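To illustrate the shape of such a consistency scrub (a hypothetical sketch — the shard layout, the "meta"/"data" tags, and the function names are mine, not Nutanix's): each node maps over only its local shards, emitting (extent_id, side) pairs, and the reduce step groups by extent to flag metadata with no backing data and data no metadata points to.

```python
from collections import defaultdict


def map_extents(node_shards):
    """Map phase: every node scans only its local metadata and data
    shards, emitting (extent_id, side) pairs. Each shard is a dict
    like {"meta": [...ids...], "data": [...ids...]}."""
    for shard in node_shards:
        for side, extent_ids in shard.items():
            for extent_id in extent_ids:
                yield extent_id, side


def reduce_extents(pairs):
    """Reduce phase: group by extent_id and flag mismatches —
    dangling metadata (no backing data) and unreferenced data."""
    sides = defaultdict(set)
    for extent_id, side in pairs:
        sides[extent_id].add(side)
    dangling = sorted(e for e, s in sides.items() if s == {"meta"})
    unreferenced = sorted(e for e, s in sides.items() if s == {"data"})
    return dangling, unreferenced
```

Because the map work is embarrassingly parallel over local shards and the reduce work can itself be partitioned by extent ID, the scrub spreads across the cluster instead of funneling through one checker.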

Distributed Extent Cache: Caching is not a new concept for storage arrays. The challenge, however, is that the caches are located on the limited number of storage controllers in the array. Not only does the limited number of storage controllers cause contention, but because these caches live in the storage array, access to cached data still needs to traverse the core switch, adding the network, interconnect, and switch latencies discussed above.

In my previous blog post, I discussed the extent cache, which caches the data served up by the controller VM. The key aspect of this cache is that it lives on the controller VM. This means the compute tier can access the cache locally, without hopping across the network and the core switch. This approach allows the Nutanix Cluster to scale incrementally with ease, while maintaining high performance.

My next blog post will focus on manageability. With large VDI deployments, it is crucial to let organizations focus on managing virtual desktops, rather than worrying about compute and storage. In a traditional server + SAN model, organizations need trained personnel to manage servers, storage arrays, switches, zones, etc. The Nutanix Cluster removes the complexity of managing a virtualized datacenter, and allows an organization to focus on applying VDI to its core business.