Design Overview: AHV Metro on Nutanix Cloud Clusters (NC2) on AWS

By Dwayne Lessner

This design for AHV Metro on the Nutanix Cloud Clusters (NC2) solution on AWS consists of two NC2 clusters running in the same region but in different availability zones. A third single-node NC2 cluster, located in a different availability zone within the same region, is deployed to run the witness. The clusters are connected using an AWS Transit Gateway. 

The Nutanix Flow Virtual Networking software-defined network virtualization solution is required to advertise the subnets to the rest of the environment and any connected private datacenters. In this design, Flow Virtual Networking is enabled to allow the same CIDR blocks to be present in both locations; however, a stretched layer 2 is not needed. This is an active/passive design per subnet, allowing different active subnets to run on either side simultaneously.

A key piece of the design is using components of the Nutanix Cloud Manager (NCM) platform. Both NCM Self Service for script execution and NCM Intelligent Operations for the use of Playbooks are required to:

  • Run the script based on witness failover events.
  • Automatically change the external routable prefix for the subnet that is failing over.
  • Change the static route in the Transit Gateway to point to the active AWS VPC.
  • Clear the failover alert so the environment is ready to fail back when the original site is healthy.

Healthy Site Configuration 

Healthy site configuration of NC2 with Metro Availability

Prerequisite

To provide the permissions required for managing Transit Gateways, the following AWS IAM policy actions must be granted to the Cluster IAM role:

ec2:DescribeTransitGatewayAttachments
ec2:DescribeTransitGatewayRouteTables
ec2:SearchTransitGatewayRoutes
ec2:ReplaceTransitGatewayRoute
ec2:CreateTransitGatewayRoute

The Cluster IAM role is created when the cloud account is first set up. For existing NC2 clusters, the role must be updated to include these actions.
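As a rough illustration, the five actions listed above could be expressed as an IAM policy document like the one built below. The statement ID and the wildcard resource are assumptions made for this sketch; scope the resource according to your own security requirements.

```python
import json

# The five Transit Gateway actions listed in the prerequisite above.
TGW_ACTIONS = [
    "ec2:DescribeTransitGatewayAttachments",
    "ec2:DescribeTransitGatewayRouteTables",
    "ec2:SearchTransitGatewayRoutes",
    "ec2:ReplaceTransitGatewayRoute",
    "ec2:CreateTransitGatewayRoute",
]


def build_tgw_policy(sid="NC2MetroTransitGateway", resource="*"):
    """Return an IAM policy document granting the Transit Gateway actions.

    The Sid and the "*" resource are illustrative assumptions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Action": list(TGW_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON that could be pasted into an inline policy.
    print(json.dumps(build_tgw_policy(), indent=2))
```

The printed JSON can be compared against the policy attached to the Cluster IAM role to confirm that nothing is missing.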

Initial setup

  1. The witness runs as a VM on a single-node NC2 cluster in a third, separate AZ. The witness is registered to both Prism Central (PC) multicluster managers.
  2. All VPCs are connected to the AWS Transit Gateway. The initial configuration has a static route for the active subnet CIDR that points to the active VPC.

AWS transit gateway

You will have to edit the transit gateway route table to route traffic to each AZ.

AWS transit gateway route table

A. The witness
B. AZ A
C. AZ B
D. The static route for the active Flow Virtual Networking (FVN) user subnet. In this example, it starts in AZ A.
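To see why the static route in D decides where traffic for the user subnet goes, here is a minimal sketch of most-specific-prefix route selection, which is how a transit gateway route table picks a route. All CIDRs and attachment labels are hypothetical placeholders, not values from this design.

```python
import ipaddress

# Hypothetical route table: (destination CIDR, attachment label).
ROUTES = [
    ("10.0.0.0/16", "witness-vpc-attachment"),   # A. The witness
    ("10.1.0.0/16", "az-a-vpc-attachment"),      # B. AZ A
    ("10.2.0.0/16", "az-b-vpc-attachment"),      # C. AZ B
    ("192.168.10.0/24", "az-a-vpc-attachment"),  # D. Active FVN user subnet
]


def select_attachment(destination_ip, routes=ROUTES):
    """Return the attachment of the most specific route matching the IP,
    or None when no route matches."""
    ip = ipaddress.ip_address(destination_ip)
    best = None
    for cidr, attachment in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, attachment)
    return best[1] if best else None


if __name__ == "__main__":
    print(select_attachment("192.168.10.25"))  # az-a-vpc-attachment
    print(select_attachment("10.2.4.7"))       # az-b-vpc-attachment
```

Changing the attachment on the last entry is all it takes to send the user subnet's traffic to the other site, which is what the failover later does.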

3. You must edit the User Management security group for each cluster to allow traffic between the clusters. The UVM security groups must also be edited so that the witness can contact both PCs. The ports needed for AHV Metro are listed here: AHV – DR – Metro - Ports.

4. Port 9440 (TCP) must be open in both directions between all PCs and the witness. SSH port 22 must also be open in both directions between the AZ A PC and the AZ B PC.
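The port requirements in step 4 can be written down as a small flow list, which is handy when reviewing security group rules. The endpoint labels are hypothetical, and the sketch assumes that "between all PCs and the witness" means every pair among the two PCs and the witness. It covers only the ports named in this step, not the full AHV Metro port list.

```python
from itertools import permutations

# Hypothetical labels for the three management endpoints.
PCS = ["pc-az-a", "pc-az-b"]
WITNESS = "witness"


def required_flows():
    """Return the set of (source, destination, protocol, port) flows
    described in step 4."""
    flows = set()
    # TCP 9440 in both directions between every pair of endpoints
    # (assumption: the PC-to-PC pair is included).
    for src, dst in permutations(PCS + [WITNESS], 2):
        flows.add((src, dst, "tcp", 9440))
    # SSH (TCP 22) in both directions between the two PCs only.
    for src, dst in permutations(PCS, 2):
        flows.add((src, dst, "tcp", 22))
    return flows


def missing_flows(allowed):
    """Return the required flows that are absent from the allowed set."""
    return required_flows() - set(allowed)
```

Passing the flows that your security groups currently allow to `missing_flows` returns whatever still needs to be opened.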

On both the AZ A and AZ B PCs, you need to set up the failover script. The setup has two parts.

  1. In NCM Self Service, set up the eScript supplied by Nutanix (KB-19323). Set the type to Execute and the script type to eScript. If you don’t have a project, you can create a new one. The project doesn’t need authentication set up.

NCM Self Service

  2. In NCM Intelligent Operations, create the playbook that is triggered by the alert policy Automatic Failover triggered by Witness.

Automatic Failover triggered by Witness using a playbook

Playbook configuration

Resolve alert configuration

5. When you set the externally routable prefix (ERP) in PC for the active side, an ENI is automatically added on the FVN subnet. The ERP advertises the subnet to the rest of the AWS VPC, and the ENI handles all northbound and southbound traffic for the FVN subnet. A route for the FVN subnet is also added to the local VPC route table. The old ERP must be manually cleared on the source cluster.

Externally routable prefix configuration

ENI added after the ERP configuration

6. Once you set up your protection policies and recovery plans, you are ready for a DR event. No layer 2 stretch is needed, but the user VPC and user subnet names must be identical on both sides for the failover script to work.
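Because the failover script depends on identical user VPC and user subnet names, a quick comparison before a DR event can catch mismatches early. The sketch below is a plain comparison of two name inventories and does not call any Nutanix API. The input shape (a mapping of VPC name to subnet names) and the exact, case-sensitive matching are assumptions for illustration.

```python
def name_mismatches(site_a, site_b):
    """Compare {vpc_name: set_of_subnet_names} mappings from two sites.

    Returns a sorted list of "vpc" or "vpc/subnet" entries that exist on
    only one side. An empty list means the names line up.
    """
    problems = []
    for vpc in sorted(set(site_a) | set(site_b)):
        if vpc not in site_a or vpc not in site_b:
            problems.append(vpc)
            continue
        # Symmetric difference: subnet names present on exactly one side.
        for subnet in sorted(set(site_a[vpc]) ^ set(site_b[vpc])):
            problems.append(f"{vpc}/{subnet}")
    return problems


if __name__ == "__main__":
    az_a = {"user-vpc": {"app-subnet", "db-subnet"}}
    az_b = {"user-vpc": {"app-subnet", "DB-subnet"}}
    print(name_mismatches(az_a, az_b))
    # ['user-vpc/DB-subnet', 'user-vpc/db-subnet']
```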

Setting up a protection policy in PC to use synchronous replication

Setting up the recovery plan in PC with identical subnets

Failover event has been triggered.

1. With communication down between the witness and AZ A and between AZ A and AZ B, the Automatic Failover triggered by Witness alert is raised, which executes the playbook. You can read more about witness response behaviors here.

Note: If the DR site (the sync destination) is not reachable, synchronous replication is automatically paused on the source to ensure business continuity. The administrator is then expected to manually resume sync from PC.
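The two behaviors described above can be summarized as a small decision function. This is only a sketch of the cases named in this article, not the witness implementation. The function name, the outcome labels, and the "no_change" default for cases the article does not describe are assumptions.

```python
def source_site_outcome(witness_to_source_up, source_to_dest_up):
    """Map link states to the outcome described in the text.

    witness_to_source_up: witness can reach the source site (AZ A).
    source_to_dest_up:    source site can reach the sync destination (AZ B).
    """
    if not witness_to_source_up and not source_to_dest_up:
        # Step 1: both links are down, so the witness triggers failover.
        return "failover_to_destination"
    if not source_to_dest_up:
        # Note: destination unreachable, sync is paused on the source.
        return "pause_sync_on_source"
    # Remaining cases are not covered by this article (assumption).
    return "no_change"


if __name__ == "__main__":
    print(source_site_outcome(False, False))  # failover_to_destination
    print(source_site_outcome(True, False))   # pause_sync_on_source
```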

2. With the playbook triggered, the static route in the AWS transit gateway will be updated to point to AZ B.

AWS transit gateway updated to point to AZ B for the failed over subnet
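The route change in step 2 maps naturally onto the search, replace, and create permissions from the prerequisites. The sketch below is not the KB-19323 script. It assumes an object that behaves like a boto3 EC2 client, and the route table ID, CIDR, and attachment IDs are hypothetical. A small in-memory stand-in is included so the logic can be tried without touching AWS.

```python
def point_route_at(ec2, route_table_id, cidr, attachment_id):
    """Create or replace the static route for cidr so that it targets
    attachment_id. Returns "created" or "replaced".

    ec2 is expected to behave like a boto3 EC2 client.
    """
    found = ec2.search_transit_gateway_routes(
        TransitGatewayRouteTableId=route_table_id,
        Filters=[{"Name": "route-search.exact-match", "Values": [cidr]}],
    )
    params = {
        "DestinationCidrBlock": cidr,
        "TransitGatewayRouteTableId": route_table_id,
        "TransitGatewayAttachmentId": attachment_id,
    }
    if found.get("Routes"):
        ec2.replace_transit_gateway_route(**params)
        return "replaced"
    ec2.create_transit_gateway_route(**params)
    return "created"


class FakeEc2:
    """In-memory stand-in for the three client calls (dry runs only)."""

    def __init__(self):
        self.routes = {}  # destination CIDR -> attachment ID

    def search_transit_gateway_routes(self, TransitGatewayRouteTableId, Filters):
        cidr = Filters[0]["Values"][0]
        hit = cidr in self.routes
        return {"Routes": [{"DestinationCidrBlock": cidr}] if hit else []}

    def create_transit_gateway_route(self, DestinationCidrBlock,
                                     TransitGatewayRouteTableId,
                                     TransitGatewayAttachmentId):
        self.routes[DestinationCidrBlock] = TransitGatewayAttachmentId

    # Replacing a route has the same effect on this in-memory table.
    replace_transit_gateway_route = create_transit_gateway_route


if __name__ == "__main__":
    ec2 = FakeEc2()
    # Initial setup: the user subnet points at AZ A.
    print(point_route_at(ec2, "tgw-rtb-example", "192.168.10.0/24", "tgw-attach-az-a"))  # created
    # Failover: the same route is moved to AZ B.
    print(point_route_at(ec2, "tgw-rtb-example", "192.168.10.0/24", "tgw-attach-az-b"))  # replaced
```

Checking for an existing route first keeps the function safe to call both at initial setup and during failover or failback.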

3. The ERP has been added to AZ B and removed from AZ A.

4. When the script updated the ERP for AZ B, the ENI was automatically created so that traffic can route locally to the FVN user subnet.

Private route table in AZ B. The route for the failed-over subnet is now advertised in AZ B.

5. The protected VMs have failed over and will be protected again once AZ A is healthy. Generally, after an unplanned failover, there are two instances of each VM: one stale instance on AZ A and the new instance on AZ B. The Nutanix DR stack automatically deletes the stale VM on AZ A during reverse protection.

VMs are now running on AZ B

Conclusion

The AHV Metro architecture on NC2 on AWS provides a robust, cloud-native solution for high availability and disaster recovery across multiple availability zones. By leveraging Nutanix Cloud Manager, Flow Virtual Networking, and AWS Transit Gateway, this implementation is designed to provide seamless failover, consistent security policies, and minimal disruption during outages.

Learn more about the value of NC2: Maximizing License Efficiency When Migrating to Public Cloud With Nutanix Cloud Clusters

©2025 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s). The example implementation here is provided AS IS and is not guaranteed to be complete, accurate, or up-to-date. Nutanix makes no representations or warranties of any kind, express or implied, as to the operation or content of any code samples or snippet. Nutanix expressly disclaims all other guarantees, warranties, conditions and representations of any kind, either express or implied, and whether arising under any statute, law, commercial use or otherwise, including implied warranties of merchantability, fitness for a particular purpose, title and non-infringement therein. Results, benefits, savings, or other outcomes described depend on a variety of factors including use case, individual requirements, and operating environments, and this publication should not be construed as a promise or obligation to deliver specific outcomes.