Highlighting Key New Capabilities for major AHV update in AOS 6.7 

By Bob Ball

September 7, 2023

The Nutanix AOS™ 6.7 release includes a new major version of the Nutanix AHV® hypervisor (20230302.207), which has many exciting core improvements. Before we dig into some of the changes in the underlying AHV version, I wanted to highlight a key, highly requested AHV feature added in AHV 20230302.207 and AOS 6.7 - On Demand Cross Cluster Live Migration!

Migrating VMs Across Clusters

Many Nutanix customers will know that the ability to live migrate VMs to another cluster, and indeed to a cluster in another Prism Central™ instance, already existed as part of the planned failover feature in the Nutanix Disaster Recovery™ solution. This feature has enabled a large number of VM movements between clusters since AOS 5.19, and we have always been keen on streamlining the user experience to make this key functionality more accessible, with as few as two steps to migrate a set of VMs.

With AOS 6.7, we’ve provided a new option right from within a VM’s Action drop down, “Migrate Across Clusters”, which can be triggered for just one VM or for a batch of VMs that need moving to another cluster.

VM’s Action drop down - Migrate Across Clusters

This option allows you to move a VM within the local Availability Zone or to another Availability Zone that has been linked with the local Prism Central instance. As expected, you can then choose any of the clusters in the destination Availability Zone to be the destination of the virtual machine. Depending on the VM configuration, the migration to the destination cluster could be completed in as few as three steps: selecting the destination, running a few checks that confirm there is enough space on the destination, and then starting the migration itself.

Migration to the destination cluster

Under the hood, the cross-cluster migration functionality is backed by the heavily proven syncrep technology also used by Nutanix Disaster Recovery, with the migration occurring in three primary phases:

Migration Checks

Ensures there are sufficient resources (e.g. CPU, memory, storage, GPU) on the destination cluster and warns if any anomalies are detected.

Synchronize Storage

Ensures all VM disk writes are synchronously mirrored to the destination cluster while copying the VM storage to the destination cluster in the background.

VM Migration

Migrates the remaining VM state (e.g. memory, CPU, and GPU) to the destination cluster, in the same way as ADS (Acropolis Dynamic Scheduler) does for all in-cluster migrations.
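To make the first phase more concrete, here is a minimal Python sketch of a pre-migration capacity check. It is an illustration only, not how AHV or Prism Central implements the check: the `Resources` class, its field names and the `migration_check` function are all hypothetical, and real checks cover more than these four quantities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource summary; field names are illustrative only."""
    vcpus: int
    memory_gib: int
    storage_gib: int
    gpus: int = 0


def migration_check(vm: Resources, free: Resources) -> list[str]:
    """Return the resources the destination is short of (empty list means pass)."""
    shortfalls = []
    for name in ("vcpus", "memory_gib", "storage_gib", "gpus"):
        needed, available = getattr(vm, name), getattr(free, name)
        if needed > available:
            shortfalls.append(f"{name}: need {needed}, have {available}")
    return shortfalls


# A VM that needs a GPU and 500 GiB of storage, checked against a destination
# cluster with no free GPU and only 400 GiB of free storage.
vm = Resources(vcpus=8, memory_gib=64, storage_gib=500, gpus=1)
destination_free = Resources(vcpus=32, memory_gib=256, storage_gib=400)
print(migration_check(vm, destination_free))
```

Returning a list of shortfalls rather than a single pass/fail flag mirrors the idea of warning about each anomaly before any storage is synchronized.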

We really hope you like how much simpler it now is to access and use cross-cluster live migration. Because these steps are built on technologies like syncrep and AHV’s live migration flows, they also benefit from the improvements we are making in those areas without any further integration needed.

Faster AHV upgrades with High Speed Networks

During an AHV upgrade, the system needs to live migrate virtual machines from the upgrading node to another node so that the VMs do not experience any interruption during the node upgrade process. The final phase of each of these live migrations moves the VM state between the nodes, with the bulk of this state being the VM memory, which needs to be transferred over the network while the VM is still running. Unlike the storage migration described above, where every disk write is mirrored in real time to the destination cluster, a different approach is needed to migrate VM memory, because memory writes are typically several orders of magnitude faster than the latencies observed over a network. Each AHV live migration therefore uses an iterative process to migrate the VM memory, with each iteration copying the memory that has been changed by the running VM since the previous one. This means that the rate at which a virtual machine modifies memory is a key factor in how many times we need to copy the difference in memory, and therefore in how long the upgrade may take.

The second key factor is how fast we can transfer memory across the network to the new node, and that’s where high speed networks come in. If the available link between the two nodes is 20GB/s or more, the migrations may be able to benefit from using a larger number of parallel connections between the source node and the destination node. Memory can then be transferred faster in each iteration, which reduces the impact of the rate at which a virtual machine is modifying memory.
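The interaction between these two factors can be illustrated with a small simulation. The Python sketch below is a simplified model under stated assumptions, not AHV's actual algorithm: the function name, the constant dirty rate, the stop threshold at which the remainder is considered small enough to finish, and the round limit are all hypothetical, and the units (GiB and GiB/s) are chosen only for illustration.

```python
def precopy_rounds(memory_gib, dirty_gib_per_s, link_gib_per_s,
                   stop_threshold_gib=0.1, max_rounds=30):
    """Simulate iterative memory copy for one live migration.

    Returns (rounds, total_copy_seconds, converged).
    """
    remaining = memory_gib
    total_seconds = 0.0
    for round_no in range(1, max_rounds + 1):
        # Time needed to send everything that is currently outstanding.
        copy_seconds = remaining / link_gib_per_s
        total_seconds += copy_seconds
        # Memory the running VM modified while that copy was in flight,
        # capped at the VM's total memory size.
        remaining = min(memory_gib, dirty_gib_per_s * copy_seconds)
        if remaining <= stop_threshold_gib:
            return round_no, total_seconds, True
    return max_rounds, total_seconds, False


# Same 64 GiB VM dirtying 0.5 GiB/s, over a slower and a faster link.
slow = precopy_rounds(64, dirty_gib_per_s=0.5, link_gib_per_s=1.0)
fast = precopy_rounds(64, dirty_gib_per_s=0.5, link_gib_per_s=2.5)
# If the VM dirties memory as fast as the link can carry it, the copy never catches up.
stuck = precopy_rounds(64, dirty_gib_per_s=1.0, link_gib_per_s=1.0)
print(slow, fast, stuck)
```

In this model each round shrinks the outstanding memory by the ratio of dirty rate to link throughput, so raising throughput cuts both the time per round and the number of rounds.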

The end result of this improvement is that, during an upgrade (or any other operation that places a node into maintenance mode), the length of time each VM migration takes may be substantially reduced, providing much faster upgrades.

Major update to internal components

AHV is based on proven virtualization components such as the open source KVM hypervisor, with substantial optimizations and extensions to produce an enterprise-grade hypervisor. This major release of AHV includes an update to many of these internal components, bringing the latest performance improvements, bug fixes and opportunities for feature development to the platform.

With this major update to internal components, we were keen to include a new upgrade approach which is much more resilient to potential changes or customizations which may occur on the node. Such changes aren’t designed to be preserved across an upgrade, but have been historically problematic during the upgrade process. We’ve therefore introduced a new method to upgrade from AHV ‘8’ releases (AHV versions 20220304.x) to AHV ‘9’ (AHV versions 20230302.x): re-image based upgrades.

As the name implies, this re-image based upgrade approach performs a full re-imaging of the hypervisor during the upgrade process. In order to preserve various configurations of the system across an upgrade, however, we do make a copy of key configuration settings, such as the hostname of the node, and re-apply them after the upgrade. The upgrade of each node proceeds in four stages:

Migrate VMs

Moves all VMs off the node to other nodes in the cluster so the node can be upgraded without affecting user VMs.

Backup Configuration

Preserves the hostname, network setup, CVM setup and other node-specific configuration details.

Reimage Hypervisor

Reboots the node into a reimaging environment and performs the hypervisor reimage process.

Restore Configuration

Re-applies the backed-up configuration and then reboots the node into the newly imaged AHV version.
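The four stages above can be modeled in a few lines of Python. This is a toy sketch to show why only explicitly preserved settings survive a re-image, not the actual upgrade code: the dictionary layout, the key names in `PRESERVED_KEYS` and every function name are hypothetical.

```python
import copy

# Hypothetical keys standing in for the settings the upgrade preserves
# (the stages above name hostname, network setup and CVM setup).
PRESERVED_KEYS = ("hostname", "network", "cvm")


def backup_configuration(node: dict) -> dict:
    """Copy only the settings that are meant to survive the upgrade."""
    return {key: copy.deepcopy(node[key]) for key in PRESERVED_KEYS if key in node}


def reimage(version: str) -> dict:
    """Model a fresh hypervisor image: nothing from the old node carries over."""
    return {"ahv_version": version, "vms": []}


def reimage_upgrade(node: dict, cluster_peers: list, target_version: str) -> dict:
    # 1. Migrate VMs: hand every VM to another node (here, simply the first peer).
    cluster_peers[0]["vms"].extend(node["vms"])
    node["vms"] = []
    # 2. Backup configuration.
    saved = backup_configuration(node)
    # 3. Reimage hypervisor.
    fresh = reimage(target_version)
    # 4. Restore configuration on top of the fresh image.
    fresh.update(saved)
    return fresh


node = {
    "ahv_version": "20220304.x",
    "hostname": "node-a",
    "network": {"bridge": "br0"},
    "cvm": {"memory_gib": 32},
    "vms": ["vm1", "vm2"],
    "manual_tweak": "edited by hand",  # an unsupported change made on the node
}
peer = {"vms": []}
upgraded = reimage_upgrade(node, [peer], "20230302.207")
print(upgraded)
```

In this model the `manual_tweak` entry is absent from the upgraded node because it is not in the preserved set, while the hostname, network and CVM settings are carried over.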

It is important to make all modifications to AHV nodes using the supported methods so that they are preserved over upgrades, as both re-image based upgrades and in-place upgrades will restore certain node configurations to the values configured by the supported methods.

Next generation APIs for AHV Virtual Machine Management

AOS 6.7 also brings an update to the Early Access (EA) of the v4 next generation of APIs. Following the initial EA release of the APIs in AOS 6.6 we made some minor changes, and the latest API documentation can be found on developers.nutanix.com. The v4 APIs mark a substantial improvement over previous API versions, with more and more products enabled and an update from the intent-based v3 APIs to REST APIs in v4. We are aware that movements from the v3 APIs to v4 will require changes to existing systems, so we’re not deprecating our v3 APIs yet. We’d also love any and all feedback you have on the new APIs to keep moving forward and continue enabling more use cases.
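As a rough illustration of the difference in request style, the Python sketch below builds (but does not send) a "list VMs" request in each style. Treat every path here as an assumption to be checked against the documentation on developers.nutanix.com: the v3 path and body, the v4 `vmm` namespace path, the `$limit` query parameter, the version string and the host name are not taken from this post, and the v4 version segment in particular changes between Early Access and later releases, which is why it is a parameter.

```python
import json


def v3_list_vms_request(base_url: str, length: int = 20) -> dict:
    """v3 style (assumed): POST a JSON body naming the kind of entity to list."""
    return {
        "method": "POST",
        "url": f"{base_url}/api/nutanix/v3/vms/list",
        "body": json.dumps({"kind": "vm", "length": length}),
    }


def v4_list_vms_request(base_url: str, api_version: str, limit: int = 20) -> dict:
    """v4 style (assumed): GET a resource collection, paging via query parameters."""
    return {
        "method": "GET",
        "url": f"{base_url}/api/vmm/{api_version}/ahv/config/vms?$limit={limit}",
        "body": None,
    }


BASE_URL = "https://prism-central.example.com:9440"  # hypothetical host
EXAMPLE_VERSION = "v4.0.a1"  # placeholder; confirm the current version string in the API docs

v3 = v3_list_vms_request(BASE_URL, length=5)
v4 = v4_list_vms_request(BASE_URL, EXAMPLE_VERSION, limit=5)
print(v3["method"], v3["url"])
print(v4["method"], v4["url"])
```

Building the requests as plain dictionaries keeps the sketch independent of any HTTP client or authentication scheme, both of which an existing v3 integration would already have.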

That’s just the top four things to talk about in this exciting new AHV release, which also includes thousands of other changes, bug fixes, performance improvements and additional features. Please check the release notes for more details of what else is in store when you upgrade.

© 2023 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product, feature and service names mentioned herein are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. Other brand names mentioned herein are for identification purposes only and may be the trademarks of their respective holder(s). This post may contain links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such a site. Certain information contained in this post may relate to or be based on studies, publications, surveys and other data obtained from third-party sources and our own internal estimates and research. While we believe these third-party studies, publications, surveys and other data are reliable as of the date of this post, they have not been independently verified, and we make no representation as to the adequacy, fairness, accuracy, or completeness of any information obtained from third-party sources.

This post may contain express and implied forward-looking statements, which are not historical facts and are instead based on our current expectations, estimates and beliefs. The accuracy of such statements involves risks and uncertainties and depends upon future events, including those that may be beyond our control, and actual results may differ materially and adversely from those anticipated or implied by such statements. Any forward-looking statements included herein speak only as of the date hereof and, except as required by law, we assume no obligation to update or otherwise revise any of such forward-looking statements to reflect subsequent events or circumstances.