VDI Series: Part 2 – Addressing Performance

By Partha Ramachandran

At VMworld back in August, Nutanix was honored with TechTarget’s Best of VMworld – Desktop Virtualization award.

With the show being our first public appearance, coming just two weeks after our launch, this honor surprised many people in the desktop virtualization and broader virtualization community. In this post (continuing our VDI series), I'd like to go into some detail on why we believe the Nutanix Complete Cluster is a great fit for desktop virtualization use cases. If you're in a hurry and want the CliffsNotes version, you can take a detour to the VMworld video interview @ For those with a longer attention span, let's get into the nitty-gritty.

Three key areas can make or break a VDI deployment: performance, scale-out, and manageability. This post focuses on performance as it relates to VDI; I will cover scale-out and manageability in subsequent posts.

Let’s start with a baseline. This article from Citrix does a good job of describing the IO profile for virtual desktops. Even though the article is specific to Citrix XenDesktop, the analysis applies equally to VMware View or any other VDI management solution. Here’s a summary of the key points from that article for our baseline:

- Boot Storms – 300 IOPS, 90/10 R/W ratio
- Steady State – 10/90 R/W ratio; IOPS vary by user type:
  - Light: single application, no web browsing – 6 IOPS
  - Normal: a few applications, minimal web browsing – 10 IOPS
  - Power: multiple applications running concurrently, heavy browsing – 25 IOPS
  - Heavy: compiling code, videos – 50 IOPS

Now that we know what virtual desktops demand performance-wise, let’s examine how a typical SAN holds up under these circumstances. Let’s take a 1,000-desktop deployment as an example.

Random Steady State IO: In steady state, desktop users open applications, send email, browse the web, and so on. Each of these actions translates to a small IO request to the storage layer. Each desktop is independent of the others, and a SAN can’t tell one desktop from another. Therefore, all IO coming into a SAN at steady state is completely random.

What kills a SAN in steady state is that each random IO costs a spindle head movement. With an average of 20 IOPS per desktop, the total random IO demanded of the SAN is 20,000 IOPS. This translates to roughly 300 spindle disks before accounting for RAID. With RAID 5 or 6 write penalties, the number of disks required rises to 600-800 just to support the steady-state random IO coming from these 1,000 virtual desktops.
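The steady-state arithmetic above can be sanity-checked with a quick sketch. The per-spindle figure here is my assumption (a typical 7,200 RPM SATA disk delivers roughly 65-70 random IOPS), not a number from the Citrix article:

```python
# Back-of-the-envelope check of the steady-state math.
desktops = 1000
iops_per_desktop = 20      # average steady-state IOPS per desktop
spindle_iops = 67          # assumed random IOPS per 7,200 RPM SATA spindle

total_iops = desktops * iops_per_desktop
spindles_no_raid = total_iops / spindle_iops

print(total_iops)                 # 20000
print(round(spindles_no_raid))    # ~299 disks, before any RAID overhead
```

RAID 5 or 6 multiplies the back-end cost of each front-end write, which is why the real disk count lands in the 600-800 range once protection is factored in.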

Boot Storms: Booting a virtual desktop requires that the key OS bits be loaded from the SAN, separately for each desktop. There’s no simple way for a SAN to load this data more intelligently, since that intelligence has to live at an upper layer. Booting 1,000 desktops translates to 300,000 IOPS, which means many deployments must overprovision storage to meet the performance requirement, driving the TCO way up.
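Extending the same sketch to the boot-storm case shows why overprovisioning hurts so much: the peak demand is an order of magnitude above steady state, yet the SAN must be sized for it.

```python
# Boot-storm peak vs. steady-state demand for the same 1,000 desktops.
desktops = 1000
boot_iops_per_desktop = 300    # 90/10 R/W during boot (per the baseline)
steady_iops_per_desktop = 20   # average steady-state figure

peak_boot_iops = desktops * boot_iops_per_desktop
steady_iops = desktops * steady_iops_per_desktop
overprovision_factor = peak_boot_iops / steady_iops

print(peak_boot_iops)          # 300000
print(overprovision_factor)    # 15.0 -> SAN sized 15x above steady state
```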

Now let’s get back to why we believe the Nutanix architecture provides a great alternative to traditional server + SAN approaches for VDI. How does the Nutanix Complete Cluster, with its converged compute + storage architecture, stack up in the face of these performance challenges?

Fast Random IO: All write IO in the Nutanix Complete Cluster goes first to the HOT (Heat-Optimized Tiering) Cache. This cache is backed by a Fusion-io ioDrive on each node in the cluster, and writes are acknowledged to the guest OS immediately. The Nutanix Scale-Out Converged Storage layer, which stitches the local storage from each node into one global fabric, then provides persistent storage in the form of either the DiskStore (direct-attached SATA HDDs) or the FlashStore (PCIe-attached Fusion-io ioMemory), depending on the tiering policies in place. Because the IO is written to the Fusion-io ioDrive first, there is no spindle disk in the picture here at all. It is all Fusion-io flash, which means microsecond latencies and high throughput.
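The write path above can be sketched as a write-back staging tier: writes land in a fast flash log and are acknowledged right away, and a background pass destages them to the persistent tier later. This is a minimal illustration of the pattern, not Nutanix code; all class and attribute names here are invented for the example.

```python
# Toy sketch of a write-back staging tier, as described above.
class TieredWritePath:
    def __init__(self):
        self.flash_log = []    # stands in for the HOT Cache on ioDrive
        self.persistent = {}   # stands in for DiskStore / FlashStore

    def write(self, block_id, data):
        # Fast path: land the write on flash, ack the guest OS at once.
        self.flash_log.append((block_id, data))
        return "ack"           # no spindle seek on the write path

    def destage(self):
        # Slow path, run asynchronously: drain flash to the persistent tier.
        while self.flash_log:
            block_id, data = self.flash_log.pop(0)
            self.persistent[block_id] = data

path = TieredWritePath()
path.write("blk-1", b"os bits")   # acknowledged immediately
path.destage()                    # happens later, off the critical path
```

The key design point is that the guest’s write latency is decoupled from the persistent tier’s latency.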

No more Boot Storms: Each controller VM in the Nutanix architecture functions similarly to a traditional storage controller, except that there is one per node instead of a limited number shared across a large SAN, which can become a bottleneck. Each controller VM has a cache called the “Extent Cache” that holds data the controller has served up. Frequently accessed data stays resident in the cache, so once the OS bits are cached, they are served from memory, eliminating the need for disk/flash IO.
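In spirit, this behaves like an LRU read cache in front of the storage tier: the first read of a block misses and goes to storage, repeat reads are served from memory. The sketch below is illustrative only, with invented names, and is not Nutanix’s implementation:

```python
from collections import OrderedDict

# Toy LRU read cache in the spirit of the Extent Cache described above.
class ExtentCacheSketch:
    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing = backing_store
        self.cache = OrderedDict()   # LRU order: oldest entries first
        self.hits = self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.hits += 1
            self.cache.move_to_end(block_id)  # keep hot data resident
            return self.cache[block_id]
        self.misses += 1
        data = self.backing[block_id]         # disk/flash IO on a miss
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data

store = {"os-boot-extent": b"kernel bits"}
cache = ExtentCacheSketch(capacity=128, backing_store=store)
cache.read("os-boot-extent")   # miss: goes to backing storage
cache.read("os-boot-extent")   # hit: served from memory, no storage IO
```

During a boot storm, every desktop after the first reads largely the same OS blocks, so the hit rate climbs quickly and the storm never reaches the disks.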

By rethinking how infrastructure should be built for virtualization, Nutanix’s approach inherently solves the two biggest performance pain points for VDI. Nutanix delivers fast random IO as well as high sequential bandwidth, giving desktop users a great experience both in steady state and in the face of boot storms.

In the next post, I’ll go into another key pain point for VDI: incremental scalability. It’s commonly known that many VDI deployments fail when they try to scale beyond 200 desktops. We’ll look at how Nutanix is built from the ground up to rapidly scale while maintaining high performance, enabling IT organizations to grow their VDI deployments, as needed, one node at a time.

One last note to keep in mind: though the Nutanix solution is great for VDI, it was designed for virtualization use cases as a whole, from general server virtualization in the core datacenter, to DR sites, to test and dev, to branch offices. We’ll elaborate on more of these use cases in future blog posts, so stay tuned.