Disaster Recovery
Done Right

USING THE CLOUD FOR DR & HOW NUTANIX CAN HELP

Top 10 questions answered

ABOUT THIS EBOOK

This eBook explains how to extend your private IT infrastructure to the public cloud as part of a modern business continuity and disaster recovery (BCDR) plan. You will learn how to use Nutanix Clusters to bolster the resiliency of your IT environment at a far lower cost than traditional DR approaches by taking advantage of the public cloud’s on-demand, elastic capabilities and the Nutanix Clusters unified management plane and automation.

DISASTER PLANNING IN THE DIGITAL ERA

As organizations continue to raise their digital profiles, the impact of unplanned downtime on their businesses grows more serious. A 2021 study by 451 Research, for example, found that 61% of enterprises surveyed had experienced at least one recent outage that cost them more than $100,000.

All companies with IT operations need a solid DR strategy. However, DR planning is notoriously one of the most difficult aspects of IT infrastructure management. It involves a careful mix of best practices, regular IT resource backup, and frequent DR plan testing.

At the heart of DR management is a system for replicating and maintaining IT hardware, software, and data in geographically diverse locations. Backing up and continually synchronizing IT resources protects your business operations if a catastrophic event renders server, storage, and network resources in your datacenter unusable. With a remotely mirrored IT site, you can fail over to the redundant infrastructure and continue to serve customers and employees in the event of a local or regional IT outage. Public clouds can be a huge help in this aspect of your DR plan as they inherently offer capabilities not available to on-premises infrastructure deployments.

Keep reading to learn how you can use Nutanix Clusters to integrate DR capabilities in your Nutanix on-premises infrastructure with the public cloud, modernize your DR plan for fast recovery, and minimize business loss due to unplanned outages.

1 451 Research, 2021: Voice of the Enterprise: Storage, Data Management, and Disaster Recovery

[Infographic: percentage of recent outages caused by software failure (451 Research’s Voice of the Enterprise: Storage, Data Management and Disaster Recovery 2021)]

CLOUD CONSIDERATIONS

WHY SHOULD I CONSIDER USING PUBLIC CLOUD INFRASTRUCTURE AS A DR PLATFORM?

Public cloud resources are nearly unlimited, span many geographic locations, and are available over the Internet in a variety of consumption models. Consumed as a usage-based, pay-as-you-go service, public cloud infrastructure integrates with your Nutanix on-premises cloud in a hybrid or multicloud setup, making cloud-based DR a comparatively low-cost and simple option.

> No capital costs. Cloud DR eliminates the large capital expense of building and maintaining a fully redundant datacenter and IT infrastructure. Instead, you get mirrored cloud storage, compute, and networking resources dedicated to your DR environment.

> Automated operations. Cloud DR uses automated replication processes to keep up to date without requiring operations staff involvement.

HOW DOES A PUBLIC CLOUD DR SOLUTION COMPARE TO BUILDING MY OWN DR SITE?

The evolution of DR solutions and the simplicity of hyperconverged infrastructure (HCI) have created modern, efficient options for protecting and recovering business applications and data. However, the traditional approach of creating a near-duplicate secondary “hot” site for IT failover still requires all the investment in time, effort, and cost associated with designing, building, and operating the primary site.

> Unified management plane: By integrating the datacenter HCI architecture with the application programming interfaces (APIs) of leading public cloud providers, Nutanix can offer a unified management and control plane across both on-premises and public cloud-based environments.

> No more idle resources or outdated backups. An on-premises backup infrastructure you build might never be used if you are lucky enough to avoid a catastrophic event. Yet it requires substantial investment and operational effort to maintain and synchronize it with your production systems.

> Cost and operational relief. The cloud DR model relieves you from the capital expenditure associated with equipment ownership for an IT stack that is rarely, if ever, used. Cloud usage fees can be minimized to apply only during data replication and the infrequent outage when you fail over or test your cloud-based DR infrastructure.

> Simple on-premises/public cloud integration. Nutanix Cloud Platform runs on bare-metal instances in the public cloud, each with its own compute, memory, local storage, and networking resources. Grouped together and combined with the same on-premises Nutanix HCI software, you create and manage secondary backup infrastructures without the complexity of building and maintaining that secondary on-premises site yourself.

Recovery Point Objective
RPO is a standard you set for the interval of time between data backups, calculated based on how long your business can survive without data being updated. Can you tolerate a week’s worth of out-of-date data? A day’s worth?

Recovery Time Objective
RTO is the standard you set for how quickly your business must be back online and fully functioning following an outage. You might have different RTOs for different workloads, applications, and data, based on their criticality. RTOs are generally measured in seconds, minutes, hours, or days.
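
To make the two definitions concrete, here is a minimal sketch that checks a single incident against an RPO and an RTO. The function names, timestamps, and targets are hypothetical and chosen only for illustration; they do not come from any Nutanix tool.

```python
from datetime import datetime, timedelta


def meets_rpo(last_replica: datetime, failure: datetime, rpo: timedelta) -> bool:
    """Data written after the last replica is lost; that window must fit the RPO."""
    return failure - last_replica <= rpo


def meets_rto(failure: datetime, restored: datetime, rto: timedelta) -> bool:
    """Time from the outage to full service must fit the RTO."""
    return restored - failure <= rto


# Hypothetical incident timeline for one workload.
failure = datetime(2024, 1, 10, 12, 0)
last_replica = datetime(2024, 1, 10, 11, 45)  # 15 minutes of data at risk
restored = datetime(2024, 1, 10, 14, 30)      # 2.5 hours of downtime

rpo_ok = meets_rpo(last_replica, failure, rpo=timedelta(hours=1))
rto_ok = meets_rto(failure, restored, rto=timedelta(hours=2))
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this made-up timeline the workload stays within its one-hour RPO but misses its two-hour RTO, which shows that the two objectives have to be evaluated separately.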

THE NUTANIX ROLE

WHAT ARE THE BENEFITS OF NUTANIX CLUSTERS AS THEY RELATE TO DR?

Nutanix Clusters allow you to be more selective and granular in your DR policies and design. You can decide where and how to protect workloads and where to recover them, whether that is in an on-prem or other private datacenter or in one or more public clouds. The DR design decisions you make can therefore be based on solution cost, recovery-time requirements, the criticality of the workload or application, compliance requirements, and other factors.

There are three primary strategies for enabling DR solution flexibility and minimizing cost, beyond the general elimination of the cost to build and maintain a second physical datacenter.

> DR tiering. Native DR capabilities allow you to create custom protection levels for your applications and data based on their specific recovery requirements. Protect business-critical applications with the shortest RPO and quickest RTO, while assigning lower-tier apps and data to less-expensive public cloud DR tiers with slightly longer and slower RPO/RTOs. For example, replicate business-critical, Tier 0 data to private infrastructure but replicate Tier 1 or 2 data to the public cloud.

> Elastic DR. The “elastic” nature of the public cloud is its ability to scale out and scale back IT infrastructure on-demand. You can create a small “pilot light” DR cluster with a minimum of three nodes and sufficient storage to duplicate everything in your primary environment. If a failover event occurs, your Nutanix DR protection policies automatically trigger the cluster to expand on-demand. This enables significant cost savings because you do not pay for hot cloud resources that fully mirror your primary infrastructure but remain unused most of the time.

> Cluster hibernation. A DR alternative is to create a full, mirrored DR cluster (storage, compute and networking) and “hibernate” it. This option shuts down the entire cluster when not in use or when data is not being replicated to it. No bare-metal instance charges are incurred during these periods. If an outage occurs, the clusters automatically wake from hibernation, and data is pulled from public cloud storage to restore applications and workloads on the designated infrastructure.
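
The cost reasoning behind these strategies can be sketched with some simple arithmetic. Everything numeric below is an assumption made up for illustration (node price, storage price, cluster size, data volume, and the hours a hibernated cluster spends awake for replication); only the three-node minimum comes from the text. Substitute your own provider pricing before drawing conclusions.

```python
HOURS_PER_MONTH = 730  # average hours in a month

# Hypothetical prices and sizes, for illustration only.
NODE_RATE = 5.0          # USD per bare-metal node per hour (assumed)
STORAGE_RATE = 25.0      # USD per TB per month of cloud storage (assumed)
FULL_CLUSTER_NODES = 12  # nodes needed to run everything after failover (assumed)
PILOT_LIGHT_NODES = 3    # minimum cluster size named in the text
DATA_TB = 50             # size of the protected data set (assumed)
REPLICATION_HOURS = 20   # hours per month a hibernated cluster is awake (assumed)


def node_cost(nodes: int, hours: float) -> float:
    """Instance charges for a number of nodes running for a number of hours."""
    return nodes * hours * NODE_RATE


def steady_state_costs() -> dict:
    """Monthly cost of each option while no disaster is in progress."""
    return {
        "hot mirror": node_cost(FULL_CLUSTER_NODES, HOURS_PER_MONTH),
        "elastic (pilot light)": node_cost(PILOT_LIGHT_NODES, HOURS_PER_MONTH),
        "hibernated": node_cost(FULL_CLUSTER_NODES, REPLICATION_HOURS)
        + DATA_TB * STORAGE_RATE,
    }


costs = steady_state_costs()
for option, usd in costs.items():
    print(f"{option:<22} ${usd:>10,.2f} per month")
```

The model deliberately ignores failover and test periods, when all three options would run a full-size cluster, so it compares only the idle months that dominate a DR budget.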

ELASTIC DR

CAN I USE ELASTIC DR FOR SOME WORKLOADS BUT NOT OTHERS TO BALANCE COST WITH RESTORAL SPEED?

Yes. Identifying the required recovery policies for each workload or application type is key to defining which DR options to use. You can use elastic DR to create the small hot-standby infrastructure to protect apps and data to meet your desired RPO. However, your RTO with elastic DR is longer than it would be if all resources were available immediately, because it takes up to one hour to build out your full Nutanix Clusters environment upon failover. Matching DR policies to different DR capabilities or sites can provide the best solution to match your recovery times and budget.
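
One way to picture this matching exercise is the sketch below. It applies a deliberately conservative rule: a workload qualifies for elastic DR only if its RTO can absorb the full build-out time of up to one hour. The workload names, their RTOs, and the "always-on" label for capacity that is already running are hypothetical.

```python
from datetime import timedelta

# "Up to one hour" to expand the cluster on failover, per the text above.
ELASTIC_BUILD_OUT = timedelta(hours=1)


def dr_option_for(rto: timedelta) -> str:
    """Conservative rule: assume the workload waits for the full build-out."""
    if rto >= ELASTIC_BUILD_OUT:
        return "elastic"
    return "always-on"


# Hypothetical workloads and the RTO each one requires.
workload_rtos = {
    "order-processing": timedelta(minutes=15),
    "reporting": timedelta(hours=4),
    "intranet-wiki": timedelta(hours=8),
}

plan = {name: dr_option_for(rto) for name, rto in workload_rtos.items()}
for name, option in plan.items():
    print(f"{name:<18} -> {option}")
```

A real plan would also account for application start-up time and for workloads that can restart on the nodes already running in the small standby cluster.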

CONSEQUENCES OF AN OUTAGE

  • Lost worker productivity
  • Lost data
  • Lost revenue
  • Damaged reputation
  • Lost customer loyalty
  • Compliance penalties

(451 Research’s Voice of the Enterprise: Storage, Data Management and Disaster Recovery 2021)

APPS AND SECURITY

I RUN VIRTUAL DESKTOP INFRASTRUCTURE (VDI) ON NUTANIX HCI. DO I HAVE TO BUY ADDITIONAL NUTANIX CAPACITY TO SUPPORT MY VDI ENVIRONMENT WITH CLOUD DR?

The Nutanix Clusters infrastructure requires a minimum of three nodes. However, cluster sizes will depend on your workload requirements for storage, compute and networking. VDI typically requires minimal storage due to its use of workload imaging, so you can configure your VDI environment to use minimal cloud infrastructure with the elastic DR cluster on-demand option.

MY ORGANIZATION IS CONCERNED ABOUT CYBERATTACKS SUCH AS RANSOMWARE. HOW CAN NUTANIX AND THE CLOUD HELP ME PREVENT OR QUICKLY RECOVER FROM A BREACH?

It is important to realize that no single action completely protects against a cyberattack. The best solution is a multi-layered security strategy that combines best practices and tools for prevention, detection, and recovery. Because it is impossible to prevent all breaches, you need to be prepared to quickly detect them and recover from them when they occur.

> Prevention. Following best practices can mitigate the potential impact of cyberattacks. This includes enforcing strong password policies, network segmentation, end-user education about phishing and other hacker break-in tricks, role-based access control (RBAC) with zero-trust security policies, and write once, read many (WORM) object storage practices. The Nutanix Cloud Platform is hardened and secured using these best practices, allowing you to create policies for all these aspects of security and enforce them consistently, through a unified interface, across both your private Nutanix environment and your public cloud DR clusters.

> Detection. It is critical to use intrusion detection tools that monitor anomalous behavior, such as repeated failed authentication attempts and huge increases in network traffic or application and file access activity. An organization that monitors its backup process regularly can detect costly attacks such as ransomware in their earliest stages and limit their impact.

Nutanix offers help through our support for policy-based service insertion of network security and threat awareness tools from our ecosystem partners. In addition, our Prism Ops management software provides insights and analytics that alert you to resource utilization anomalies, while the Nutanix Files intelligent analytics engine provides insight into suspicious file-share activity.

> Recovery. Recovery plans should take a layered approach based on your required RPO/RTOs. In the case of ransomware where an attacker encrypts your data for ransom, a clean snapshot from a time just before the infection occurred will provide the quickest option to recover data. When snapshots are not available, restoration from the last backup cycle is the next logical option. It’s important to make sure backups have not been corrupted. The following steps help you achieve these goals:

  • Replicate data to two or more locations as a part of your DR plan.
  • Follow the 3-2-1 rule for backup: This calls for three copies of data (one production and two backup copies) stored on two different media types (e.g., disk, tape, cloud storage), with one copy stored in an off-site location.
  • Create immutable storage: You can use Nutanix Mine, with WORM capabilities, to “lock” backup copies for a specified period to prevent deletion or encryption of backups. A properly designed WORM system would ensure that no matter what, all data would be protected for a set period of time against all write, update, and delete requests, regardless of who is requesting that operation.
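
The 3-2-1 rule above lends itself to a quick automated check. The sketch below is one possible reading of the rule, not a Nutanix feature: the `DataCopy` type, its fields, and the sample inventories are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str
    media: str               # e.g. "disk", "tape", "cloud"
    offsite: bool
    production: bool = False


def satisfies_3_2_1(copies: list) -> bool:
    """Three copies (two of them backups), two media types, one backup off-site."""
    backups = [c for c in copies if not c.production]
    media_types = {c.media for c in copies}
    return (
        len(copies) >= 3
        and len(backups) >= 2
        and len(media_types) >= 2
        and any(c.offsite for c in backups)
    )


# Hypothetical inventories of where one data set lives.
compliant = [
    DataCopy("primary cluster", media="disk", offsite=False, production=True),
    DataCopy("local backup", media="disk", offsite=False),
    DataCopy("cloud replica", media="cloud", offsite=True),
]
non_compliant = [
    DataCopy("primary cluster", media="disk", offsite=False, production=True),
    DataCopy("local backup A", media="disk", offsite=False),
    DataCopy("local backup B", media="disk", offsite=False),
]

print(satisfies_3_2_1(compliant))
print(satisfies_3_2_1(non_compliant))
```

The second inventory fails because all three copies sit on the same media type in the same location, so a single site-wide event or infection could reach every one of them.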

[Infographic: percentage of recent outages caused by cyberattacks (451 Research’s Voice of the Enterprise: Storage, Data Management and Disaster Recovery 2021)]

CLOUD DR STRATEGIES

HOW CAN I OPTIMIZE THE EFFECTIVENESS OF MY CLOUD DR?

Start by defining a workload replication and recovery plan if you have not already. Group your workloads, applications, and data by level of criticality to your business so you can establish an RPO (how much data you can afford to lose: minutes, hours, or days?) and an RTO for each group. It might be tempting to group all apps and workloads into a single recovery tier to simplify DR management. However, this approach is usually more costly because it forces all resources to be covered by your highest (Tier 0) RPO and RTO levels. It usually pays to define a few logical protection groupings and map each of them to your corresponding internal business SLAs to determine RPO and RTO. For less critical resources with longer acceptable RPOs or slower RTOs, the elastic DR or the hibernate options can help you balance requirements with cost.
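
The cost of a single recovery tier is easiest to see with numbers. In the sketch below, each protection group must honor the strictest RPO and RTO among its members. The workload names, their objectives, and the tier labels are hypothetical.

```python
from collections import defaultdict
from datetime import timedelta

# Hypothetical workloads: name -> (RPO, RTO) the business actually needs.
needs = {
    "payments-db": (timedelta(minutes=5), timedelta(minutes=30)),
    "crm": (timedelta(hours=1), timedelta(hours=4)),
    "file-archive": (timedelta(hours=24), timedelta(hours=48)),
}


def group_objectives(grouping: dict) -> dict:
    """A group must honor the strictest RPO and RTO among its members."""
    members = defaultdict(list)
    for workload, group in grouping.items():
        members[group].append(needs[workload])
    return {
        group: (min(rpo for rpo, _ in reqs), min(rto for _, rto in reqs))
        for group, reqs in members.items()
    }


single_tier = group_objectives({w: "all" for w in needs})
three_tiers = group_objectives(
    {"payments-db": "tier0", "crm": "tier1", "file-archive": "tier2"}
)

for label, result in (("single tier", single_tier), ("three tiers", three_tiers)):
    for group, (rpo, rto) in result.items():
        print(f"{label:<12} {group:<6} RPO={rpo}  RTO={rto}")
```

With one group, the file archive inherits the five-minute RPO of the payments database; with three groups, it keeps its own 24-hour RPO and can move to a less expensive DR option.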

SHOULD I CHANGE MY DATA PROTECTION PLAN IF I MOVE TO THE PUBLIC CLOUD?

Adding the Nutanix Clusters infrastructure does not necessarily require changes to your existing data protection plan. If your DR objectives are changing as a component of your cluster deployment project, then existing data protection plans should be updated accordingly. This might involve adapting protection policies to support elastic DR or leveraging new DR tiering options that were previously not available. Note that Nutanix Clusters for DR gives you the ability to use public cloud IT resources in regions where you do not currently run infrastructure. This provides you with the option to amend your protection policies to split your resource redundancy across two or more different standby DR sites for additional diversity and protection, which could prompt changes to your plan.

REAL-WORLD RESULTS

CAN YOU SHARE ANY CASE STUDIES ABOUT HOW ENTERPRISES ARE USING NUTANIX CLUSTERS FOR DR?

Since the launch of Nutanix Clusters, cloud DR has been a popular way for enterprises to get experience with hybrid clouds. Penn National Insurance is one such customer. Before running Nutanix, the company was challenged by DR and infrastructure management complexity. It backed up its data to tape, which it stored off-site, a DR solution that was unable to keep pace as data growth started skyrocketing.

The restoral process would take several days, leaving the company offline too long to meet its RPO/RTOs. Penn National’s updated hybrid cloud environment consists of on-premises Nutanix HCI and Nutanix Clusters in the Amazon Web Services (AWS) cloud. According to Craig Wiley, Senior Infrastructure Systems Architect at the company, “By moving our DR to the AWS cloud, our recovery time has dropped from several days to under two hours. If there is a disaster, we can quickly spin up the Nutanix Cluster on AWS and bring up the replicated data in the cloud.”

You can learn more about the Penn National Insurance use case here.

LEARN MORE

HOW CAN I GET A PRACTICAL FEEL FOR HOW EXPANDING MY NUTANIX INFRASTRUCTURE INTO A PUBLIC CLOUD SERVICE FOR DR WOULD WORK?

There are several ways to learn more about extending your Nutanix private IT infrastructure to the public cloud to strengthen your resilience to unplanned downtime. Follow these links to discover more about using Nutanix Clusters and the public cloud together, seamlessly, for enhanced DR capabilities: