
The Basics: Snapshots

In my IT career I have never had shared storage that didn’t have integrated snapshots. I considered them fundamental if I was going to put all my eggs in one basket. Performance, ease of use and reliability were all deciding factors in selecting a solution. There are lots of different solutions on the market today, and some are really good, but it’s hard to get all three deciding factors to line up if snapshots were not integrated from the ground up.

Performance

It’s great to have a feature, but if the performance impact of using it adversely affects the environment, then what’s the point? Snapshots provide a point-in-time copy of data so that you can roll back in case of corruption, recover files, or support a larger business continuity plan that makes use of replication.

It should be possible to snapshot running applications without causing any performance impact. Nutanix’s VM-centric snapshot architecture definitely helps in this regard. If you have a LUN with 100 server workloads and take a hardware-based snapshot, then you are taking a snapshot of every one of those workloads. Cache and metadata resources are wasted, because all of the workloads now have to be tracked and maintained as part of the snapshot process.
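To make the granularity point concrete, here is a minimal sketch of how many workloads end up being tracked under each approach. It is a toy model only: the `snapshot_scope` function and the `vm007` style names are hypothetical and do not reflect any vendor's actual implementation.

```python
def snapshot_scope(lun_workloads, wanted, vm_centric):
    """Return the set of workloads whose changes must be tracked after a snapshot.

    Hypothetical toy model: a LUN-level snapshot cannot tell workloads apart,
    while a VM-centric snapshot only captures the VMs that were asked for.
    """
    on_lun = set(lun_workloads)
    wanted_on_lun = set(wanted) & on_lun
    if vm_centric:
        return wanted_on_lun
    # LUN-level: if anything we care about lives on the LUN, everything on it is captured.
    return on_lun if wanted_on_lun else set()


lun = [f"vm{i:03d}" for i in range(100)]  # 100 server workloads on one LUN
print(len(snapshot_scope(lun, ["vm007"], vm_centric=False)))  # 100
print(len(snapshot_scope(lun, ["vm007"], vm_centric=True)))   # 1
```

Protecting a single VM on the shared LUN drags the other 99 into the snapshot, which is where the wasted cache and metadata come from.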

Take a look at the video below. 32 virtual machines are being snapped at the same time via Nutanix Command Line (NCLI). There are two protection domains, each with 16 virtual machines placed inside it. All of the virtual machines are running 4K 100% random write tests with Iometer on an NX-2450. The virtual machines are writing to raw disks, so NTFS is not being used for caching. You can see that Nutanix keeps performance consistent throughout the whole process.

Video: 32 VMs being snapped

This is not possible with hypervisor-based snapshots today. Both VMware and Hyper-V have performance problems with their hypervisor-based snapshots. VMware even has a KB article stating that VMware-based snapshots “Negatively impacts the performance of a virtual machine”.

Since Nutanix snapshots are based on a redirect-on-write implementation, there is no performance impact from keeping snapshots lying around. Nutanix OS is always optimizing the index or metadata associated with the snapshots in the background for performance and capacity.
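For readers who haven't met redirect-on-write before, the idea can be sketched in a few lines. This is a simplified illustration of the general technique, not Nutanix's actual implementation; the `RowVolume` class and its methods are made up for the example.

```python
class RowVolume:
    """Toy redirect-on-write volume (illustrative only).

    A block map points logical blocks at physical extents that are never
    overwritten. A snapshot freezes a copy of the map, not the data.
    """

    def __init__(self):
        self.extents = []     # append-only "physical" storage
        self.block_map = {}   # logical block -> extent index (live view)
        self.snapshots = {}   # snapshot name -> frozen block map

    def write(self, block, data):
        # New data is redirected to a fresh extent. The old extent stays in
        # place, so nothing is copied to protect an existing snapshot.
        self.extents.append(data)
        self.block_map[block] = len(self.extents) - 1

    def snapshot(self, name):
        # Metadata-only operation: no data blocks are read or moved.
        self.snapshots[name] = dict(self.block_map)

    def read(self, block, snapshot=None):
        view = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.extents[view[block]]


vol = RowVolume()
vol.write(0, b"v1")
vol.snapshot("before-upgrade")
vol.write(0, b"v2")
print(vol.read(0))                    # b'v2'
print(vol.read(0, "before-upgrade"))  # b'v1'
```

The write path does the same amount of work whether or not a snapshot exists, which is the property that matters here. A copy-on-write design, by contrast, has to copy the old block out of the way before the first overwrite after each snapshot.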

Ease of Use

For me, ease of use for snapshots and replication boils down to scheduling. If your system doesn’t have snapshots and you have to rely on the hypervisor, you will have to implement some form of scripting. Scripting isn’t bad, but it’s another homegrown thing that you need to maintain, which is easier said than done. This is one of the reasons that VMware Site Recovery Manager is great: verified, supportable and repeatable.
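To give a feel for what that homegrown scripting has to own, here is a minimal sketch of just the schedule and retention bookkeeping. The function names, the hourly interval, the dates and the keep-last-4 policy are all assumptions for illustration, and the part that actually calls the hypervisor to take or delete a snapshot is deliberately left out.

```python
from datetime import datetime, timedelta


def due_snapshots(start, interval, now):
    """Return the times a fixed-interval schedule should have fired by `now`."""
    times = []
    t = start
    while t <= now:
        times.append(t)
        t += interval
    return times


def apply_retention(snapshot_times, keep_last):
    """Split snapshots into (kept, expired), keeping the newest `keep_last`."""
    ordered = sorted(snapshot_times)
    if keep_last <= 0:
        return [], ordered
    return ordered[-keep_last:], ordered[:-keep_last]


# Arbitrary example values: hourly snapshots, keep the four most recent.
start = datetime(2024, 1, 1, 0, 0)
now = datetime(2024, 1, 1, 5, 30)
taken = due_snapshots(start, timedelta(hours=1), now)
kept, expired = apply_retention(taken, keep_last=4)
print(len(taken), len(kept), len(expired))  # 6 4 2
```

Even this small piece needs testing and an owner, and it doesn't yet cover failures, missed runs or replication, which is the maintenance burden I'm talking about.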

Reliability

Reliability is somewhat tied to performance. You want that same constant performance without your world crashing down. I think reliability comes from having a strong link to your metadata. This strong link enables features like VAAI (vStorage APIs for Array Integration) to limit the impact of tasks such as full file copies. Most virtual admins would say that VAAI support is a must-have for operating a virtual environment. Without VAAI you have to endure blunt force trauma from full file copy tasks, which can cause issues delivering service levels. Nutanix supports VAAI, including View Composer API for Array Integration (VCAI) support in VMware Horizon View environments. For XenDesktop users, our shadow clone technology delivers similar benefits regardless of the hypervisor.

Blunt force trauma

Without VAAI support, using a product even for Test & Dev is very hard. You’ll be waiting for your workloads to finish copying, and the additional overhead in the environment might skew your test results.

Snapshots need to be a core part of the overall architecture. Not having snapshots points to a weakness at the metadata layer: it cannot get granular enough to track such changes. Without the ability to track changes, snapshots will be one of many features that are not possible. If you want a converged solution that has cloning, replication, compression and inline dedupe, it will only be possible when snapshots are first on the list.