What is Disaster Recovery?
Disaster recovery (DR) is a first-line insurance strategy that protects a datacenter from the effects of a natural or man-made catastrophe. A disaster recovery plan (DRP) ensures a business can either quickly resume operations or maintain mission-critical functions during or after a disaster. The DR process includes planning and testing, and typically involves a separate physical site for restoring operations.
To better understand DR, we must first define a disaster in terms of business continuity. A disaster, in the simplest terms, is anything that puts an organization's operations at risk. This can be a cyberattack, a data breach, an equipment failure, a natural disaster, or even rats chewing through cables. In IT specifically, any of the following can create a disaster: data loss, human error, malware and viruses, network and internet interruptions, hardware or software failure, severe weather, flooding from natural causes or burst pipes, and office vandalism or damage.
When a disaster strikes, the goal of any DR plan is to ensure operations run as normally as possible. While the business will be aware of the crisis, ideally, its customers and end-users should not be affected.
Many businesses also opt for a disaster recovery-as-a-service (DRaaS) strategy, a model that allows companies to duplicate and host servers in a separate datacenter through a third-party provider. Some cloud vendors offer a native DRaaS solution, which simplifies the installation and onboarding processes. Once onboarded, companies enjoy the immediate benefits of DR protection. And since this service is cloud-based, it is elastic, able to accommodate the growing or shrinking needs of the client.
What are the Types of Disaster Recovery?
Disaster Recovery-as-a-Service (DRaaS): A fully managed disaster recovery solution in which organizations don’t need to maintain and manage their disaster recovery plan themselves. With DRaaS, the managed service provider (in our case, Nutanix) manages the infrastructure and software for them. This solution is best suited to enterprise-grade organizations that have neither the infrastructure nor the manpower to maintain and manage a disaster recovery strategy.
Virtualization: This is what we do at Nutanix! Organizations can be up and running extremely fast by recovering servers, apps, and operating systems from backup over the internet.
Cloud-based disaster recovery: This is a disaster recovery solution that is hosted in either the public or private cloud. Customers can choose where to host their DR solution based on how much control and overhead they wish to maintain. With the flexibility of cloud-based disaster recovery, organizations can scale as they grow while optimizing efficiency and costs.
For example, at Nutanix, we have two main cloud-based offerings: Nutanix Disaster Recovery on NC2 (AWS), which allows customers to run DR in the public cloud with the help of Nutanix Cloud Clusters, and Nutanix DRaaS (Private Cloud-Nutanix).
On-premises disaster recovery: This disaster recovery strategy is for organizations in highly regulated industries (federal, finance, etc.) or organizations that have the resources and capital to maintain their own secondary datacenter. The DR environment is hosted at the secondary datacenter, and operations are recovered from there in the face of a disruption.
Benefits of a Quality Disaster Recovery Strategy
On top of eliminating the risks associated with poor disaster recovery, there are several major benefits of ensuring your business has a well-established, easy-to-execute DR strategy in place.
- RTO and RPO: A disaster recovery solution makes the restoration of systems, services, and applications predictable. The industry-standard service level agreement (SLA) metrics are recovery time objective (RTO), a measure of how long a business can tolerate a loss of business operations, and recovery point objective (RPO), a measure of how much data loss is acceptable.
- Limit losses: Since a disaster recovery solution restores business operations quickly, revenue losses and costs associated with damage caused by lengthy downtime are minimized.
- Protect operations: Not all business applications are created equal; therefore, a well-architected DR plan will enforce SLAs on a per-application basis. This ensures business-critical applications are highly protected.
- Protect reputation: A good DR strategy is a competitive differentiator. If clients/customers of a business see the resiliency of a business during a disaster, their overall confidence in the company will likely increase.
- Performance improvements: Hosting DR failover targets in a physically separate datacenter (possibly multiple remote datacenters or service providers) avoids performance degradation caused by localized disasters.
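To make the RTO and RPO definitions above concrete, the following minimal Python sketch compares one incident (or DR test) against per-application objectives. The application name, the target values, and the timestamps are hypothetical and chosen only for illustration; real targets come out of each organization's own risk assessment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjective:
    """Per-application targets (the values used below are illustrative)."""
    app: str
    rto: timedelta  # longest tolerable loss of business operations
    rpo: timedelta  # largest tolerable window of lost data


def evaluate(obj, last_recovery_point, outage_start, service_restored):
    """Compare one incident (or DR test) against an application's objectives."""
    downtime = service_restored - outage_start
    data_loss_window = outage_start - last_recovery_point
    return {
        "app": obj.app,
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= obj.rto,
        "rpo_met": data_loss_window <= obj.rpo,
    }


# Hypothetical business-critical application with tight targets.
orders = RecoveryObjective(
    "order-processing", rto=timedelta(minutes=15), rpo=timedelta(minutes=5)
)

result = evaluate(
    orders,
    last_recovery_point=datetime(2024, 1, 1, 9, 57),  # last replicated snapshot
    outage_start=datetime(2024, 1, 1, 10, 0),
    service_restored=datetime(2024, 1, 1, 10, 20),
)
print(result["rto_met"], result["rpo_met"])  # False True
```

In this example the 3-minute gap since the last recovery point is within the 5-minute RPO, but the 20-minute outage exceeds the 15-minute RTO, which is the kind of shortfall a per-application SLA is meant to surface.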
What Should a Disaster Recovery Plan Include?
There is no universal disaster recovery plan that can fit the unique needs of all businesses. And while the following criteria are meant to be guidelines for establishing a disaster recovery strategy, customization is expected.
- Inventory infrastructure: It’s important to inventory hardware and software at the outset of developing a DR plan. Businesses operating on mostly software will find this task easier, as they won’t need to account for physical components in their datacenter. This process includes capturing each vendor’s technical support contact information for all hardware and applications.
- Perform risk assessment: Determine how much downtime and data loss the business can handle. While zero downtime and zero data loss are the ideal, not all businesses can afford a disaster recovery solution that can support that goal. Businesses that are highly reliant on IT, such as e-commerce sites, simply cannot withstand much, if any, downtime. This exercise is also an important opportunity to determine an acceptable RPO and RTO for each class of application.
- Establish a communication plan: An effective communication plan keeps employees informed during a disaster, ensuring they understand how to access systems needed to continue business operations during a DR event. This plan also includes an established base of operations location during the disaster.
- Establish SLAs: Many businesses outsource technologies to service providers or host their systems in a separate datacenter or facility. Ensure that the contracted service-level agreements (SLAs) with those providers cover disaster scenarios.
- Regular disaster recovery testing: Companies must regularly test the readiness of their DR solution. Even the most robust DR plans must be tested to satisfy auditors, whether internal or external.
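The guidelines above can be tracked as a simple checklist. The Python sketch below is one possible way to record a plan and list its open gaps; it is not a standard format. The class names, field names, asset names, and the 180-day test interval are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Asset:
    name: str
    kind: str  # "hardware" or "software"
    vendor_support_contact: Optional[str] = None


@dataclass
class DRPlan:
    assets: List[Asset] = field(default_factory=list)
    objectives_defined: bool = False             # RTO/RPO set per application class
    communication_plan_documented: bool = False
    slas_cover_disasters: bool = False
    last_test: Optional[date] = None
    test_interval: timedelta = timedelta(days=180)  # assumed cadence, not a standard

    def gaps(self, today: date) -> List[str]:
        """Return the guideline items that are still unmet."""
        found = []
        if not self.assets:
            found.append("no infrastructure inventory")
        for asset in self.assets:
            if not asset.vendor_support_contact:
                found.append(f"missing vendor contact: {asset.name}")
        if not self.objectives_defined:
            found.append("RTO/RPO not defined")
        if not self.communication_plan_documented:
            found.append("no communication plan")
        if not self.slas_cover_disasters:
            found.append("SLAs do not cover disasters")
        if self.last_test is None or today - self.last_test > self.test_interval:
            found.append("DR test overdue")
        return found


# Hypothetical plan: one asset lacks a vendor contact and the last test is old.
plan = DRPlan(
    assets=[
        Asset("core-switch", "hardware", "support@example.com"),
        Asset("billing-db", "software"),
    ],
    objectives_defined=True,
    communication_plan_documented=True,
    slas_cover_disasters=True,
    last_test=date(2024, 1, 10),
)
open_gaps = plan.gaps(today=date(2024, 9, 1))
print(open_gaps)  # ['missing vendor contact: billing-db', 'DR test overdue']
```

Running a check like this on a schedule is one way to keep the inventory and testing guidelines from drifting out of date between audits.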
The Risks of Not Implementing a Disaster Recovery Plan
Regardless of the size of an organization, IT is an integral part of any business; in fact, for an increasing number of companies, it is the very lifeline of the business. Protecting IT assets and mission-critical operations is at the top of the priority list. A sound DR solution does more than simply protect hardware; nowadays, software attacks are more commonplace and can affect websites, the ability to fulfill orders, and the ability to perform other business-critical tasks.
Without a disaster recovery strategy in place, a company faces operational, financial, and reputational risks. From a business continuity perspective, if a disaster impedes a business’s ability to operate effectively, its employees will be unable to do their jobs, and its customers may be impacted by the operational slowdown and may even choose to buy products or services from a competitor.
Perhaps the most obvious and immediate risk arising from a disaster is massive revenue loss. While nearly all disasters create some kind of financial loss, a slow response and recovery means the business is likely to lose considerably more. Unfortunately, the cost of slow recovery is rising. In fact, the average cost of IT downtime can be as high as $17,000 per minute. Because not all companies are well-equipped to cover that expense, many won’t recover after being hit by one significant disaster.
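As a rough illustration of how recovery time drives cost, the short Python sketch below multiplies outage length by the per-minute figure quoted above. It assumes a constant cost per minute and uses the $17,000 upper-end number purely as an example; the two recovery times are hypothetical, and actual downtime costs vary widely by business.

```python
def downtime_cost(minutes_down: float, cost_per_minute: float) -> float:
    """Simple linear estimate: outage length multiplied by cost per minute."""
    return minutes_down * cost_per_minute


# Upper-end figure quoted above; an assumption for illustration only.
UPPER_COST_PER_MINUTE = 17_000

fast = downtime_cost(15, UPPER_COST_PER_MINUTE)    # hypothetical 15-minute recovery
slow = downtime_cost(240, UPPER_COST_PER_MINUTE)   # hypothetical 4-hour recovery

print(f"15-minute recovery: ${fast:,.0f}")  # 15-minute recovery: $255,000
print(f"4-hour recovery: ${slow:,.0f}")     # 4-hour recovery: $4,080,000
```

Under these assumptions, shortening recovery from four hours to fifteen minutes reduces the estimated loss from about $4.08 million to $255,000.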
Finally, companies that are unable to quickly and efficiently recover after a disaster are at risk of losing their reputation as a secure, trustworthy business. All good companies know their customers are what keeps them in business, and reputational damage can hamper future investments, turn away valuable employees, and for some businesses, eliminate any chance of returning to the market. This is among the chief reasons why businesses often fail after being hit by a disaster.