How Nutanix and VSAN EVO: RAIL differ on management and your designs

 

By Dwayne Lessner

Nutanix operates in the user space while Virtual SAN (which includes EVO:RAIL, of course) is embedded in the hypervisor kernel. Each design offers unique challenges and choices when designing, deploying and managing hyperconverged systems.

Design

VSAN is directly plumbed into the compute cluster. The storage available for the virtual machines running on VSAN hosts is inherently limited to the locally attached storage of the hosts that make up the vSphere compute cluster. For example, you can’t extend the VSAN container to other compute clusters, and you can only have one container per compute cluster.

Depending on vSphere and application licensing (e.g., databases), VSAN can potentially create silos of capacity and performance.

Nutanix, on the other hand, is not bound to a vSphere compute cluster. So while you could architect a solution similar to VSAN, you have the choice to create one large storage cluster, regardless of how the compute layer is licensed.

To illustrate, let’s use the above example of 9 hosts, each with 2 TB of usable capacity, for both VSAN and Nutanix. We’ll also architect for a fault tolerance of 1.

With VSAN clusters you would have 3 separate containers of 4 TB each, accounting for the space needed for a node to fail and rebuild. Nutanix can aggregate all of the nodes, yielding 1 large container of 16 TB, also accounting for the space needed for a node to fail and rebuild.
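The arithmetic behind those numbers can be sketched in a few lines of Python. This is only an illustration of the simplified model used in this post (reserve one node's worth of capacity per storage cluster), not a sizing tool. The function name and the assumption that the 9 hosts are split into three 3-host vSphere clusters are mine, inferred from the "3 separate containers" figure.

```python
def usable_capacity_tb(nodes, tb_per_node, node_failures=1):
    """Capacity left after reserving room for `node_failures` nodes to fail and rebuild.

    Simplified model from the post: one node's capacity is held back per
    tolerated failure. Replication overhead is deliberately ignored.
    """
    if nodes <= node_failures:
        raise ValueError("cluster is too small to tolerate that many failures")
    return (nodes - node_failures) * tb_per_node


TB_PER_NODE = 2

# Assumption: the 9 hosts form three vSphere compute clusters of 3 hosts each.
vsan_clusters = [3, 3, 3]

# VSAN: one container per compute cluster, each sized on its own.
vsan_containers = [usable_capacity_tb(n, TB_PER_NODE) for n in vsan_clusters]

# Nutanix: one storage cluster spanning all 9 hosts.
nutanix_container = usable_capacity_tb(sum(vsan_clusters), TB_PER_NODE)

print(vsan_containers)    # [4, 4, 4]
print(nutanix_container)  # 16
```

Note that the three VSAN containers each hold back a full node, while the single Nutanix container holds back one node in total, which is where the extra usable capacity comes from.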

Normally you only need 3 nodes with either solution to get your cluster up and running. If you want the ability to lose more than one node at any one time, the minimum starting number of nodes in the cluster needs to be higher. Fault tolerance (FT) of 2 means your cluster can lose two nodes at any one time and avoid an outage. It wouldn’t be possible to achieve FT=2 with the smaller VSAN clusters because you would need at least 5 nodes in one cluster to survive the loss of two nodes. Nutanix also needs a minimum of 5 nodes for FT=2, but Nutanix would have access to all 9 nodes across the multiple vSphere compute clusters, so it would be possible to configure this.
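As a rough check of the node-count reasoning above, here is a small Python sketch. The `2 * FT + 1` rule is an assumption on my part: it matches the two data points given here (3 nodes for FT=1, 5 nodes for FT=2), but treat it as illustrative rather than as either vendor's official sizing formula. The 3-host cluster layout is the same assumption as in the capacity example.

```python
def min_nodes_for_ft(ft):
    """Minimum nodes in one storage cluster for a given fault tolerance.

    Assumption: 2 * FT + 1, which reproduces the 3-node (FT=1) and
    5-node (FT=2) minimums mentioned in the text.
    """
    if ft < 1:
        raise ValueError("fault tolerance must be at least 1")
    return 2 * ft + 1


def supports_ft(nodes_in_storage_cluster, ft):
    """True if a storage cluster of this size can be configured for `ft`."""
    return nodes_in_storage_cluster >= min_nodes_for_ft(ft)


# Assumption: three 3-host vSphere clusters, as in the capacity example.
vsan_clusters = [3, 3, 3]
nutanix_cluster = sum(vsan_clusters)  # one storage cluster across all 9 hosts

print([supports_ft(n, 2) for n in vsan_clusters])  # [False, False, False]
print(supports_ft(nutanix_cluster, 2))             # True
```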

Also, having a larger cluster with Nutanix means you can more easily share the most expensive resource, flash. If flash is available, all 9 nodes will have access to the flash.

Also, having only 1 container for your virtual machines limits you today to 2048 machines if you want to enable HA for them. With Nutanix it is very much possible to create multiple containers and still share the resources.

Create multiple containers for HA or to apply policies like compression.
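To put the 2048-machine limit in planning terms, a quick sketch like the following works out how many containers a given number of HA-protected VMs would need. The constant comes from the limit quoted above. The function name and the sample VM counts are hypothetical.

```python
import math

# Limit quoted above: HA-protected VMs per container.
MAX_HA_VMS_PER_CONTAINER = 2048


def containers_needed(vm_count):
    """Smallest number of containers that keeps each at or under the HA limit."""
    if vm_count < 0:
        raise ValueError("vm_count cannot be negative")
    # Always plan for at least one container, even for an empty cluster.
    return max(1, math.ceil(vm_count / MAX_HA_VMS_PER_CONTAINER))


for vms in (500, 2048, 2049, 5000):  # hypothetical VM counts
    print(vms, containers_needed(vms))
```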

Management

Because Nutanix operates in user space, we do have a VM, the CVM, that acts as the storage controller on each node. VSAN, being embedded in the kernel, doesn’t use VMs for base functionality.

To effectively manage VSAN you need to add some additional VMs. They’re optional, but they do come installed with EVO:RAIL. Today’s vCenter monitoring of VSAN is limited. For example, vCenter doesn’t have a way of grouping latency, performance and throughput from a cluster perspective. You can, however, grab individual host stats and some VM performance stats, and you can see the health of the components.

Most people will use VSAN Observer and vCOPS. Today, VSAN Observer is the main tool. Best practice is to run a second vCenter for VSAN Observer because of the Ruby Console that runs the webserver. You do not have to license the 2nd vCenter according to Duncan Epping.

Also, by default VSAN Observer requires access to the Internet, which would be an issue for a lot of dark sites. There are now steps to get around the Internet requirement. Those steps can be found here.

VSAN Observer, however, is only intended for short-term monitoring. VSAN Observer runs in memory, so after a couple of hours it could take up GBs of RAM. There is no user authentication to the VSAN Observer website, and it’s kind of a pain in the butt to start all the time. Here are the steps to run VSAN Observer: http://vinfrastructure.it/2014/05/vmware-virtual-san-observer/.

vCOPS today really doesn’t have a concept of VSAN either, and there is no way to keep historical stats for VSAN. I am guessing there will be some type of management pack introduced in conjunction with VMworld Europe, since the October show tends to be heavy on the management side. Cormac Hogan has a nice article on the state of the union with vCOPS and VSAN.

In contrast, Nutanix Prism is a distributed application purpose-built to scale along with the cluster.

You can get per-VM statistics that include:
• CPU usage
• Memory usage
• Host that it runs on
• VM configuration
• IOPS
• Latency

What makes this unique is how the data collection is handled. Each CVM keeps track of statistics for the host it serves and stores them for 90 days, no manual pruning required! The ability to click on a graph, bring it into the analysis section and start to correlate events and alerts with performance metrics is really powerful. I would encourage people to see a demo of this in action.

The Prism UI doesn’t need any additional VMs, third-party databases or Java to run. Point your web browser to the IP address of one of the virtual storage controllers and you’re good to go.

Another feature that is included in all editions of Nutanix is Cluster Health.

There are over 55 tests in Cluster Health that can be used for troubleshooting. It will alert when the virtual storage controller CPU load is too high, inform you of any network latency issues between storage controllers, and perform HDD diagnostics and more.

When talking to customers about the differences between VSAN and Nutanix and trying to relate it back to traditional storage, people tend to forget about support. How does your existing storage array work? Does it phone home for a failed drive? Can your support team get on to check the system? Nutanix does this natively today, and it looks like EVO:RAIL will add this in as another VM to run on VSAN. I think this is a chicken-and-egg problem, but time will tell how it pans out.

Nutanix is ahead in management and flexibility today. Yes, VSAN gaps may be addressed in future releases, but Nutanix isn’t sitting around either. It will be interesting times as Nutanix continues to partner with VMware on solutions while both VSAN and Nutanix compete for rack space in the data center.

@dlink7