Resiliency Redefined: The Cloud-Scale Platform for Business Continuity
BY GENE KNAUER
Every CIO is aware of the costs of downtime to the business. You know it’s a nightmare best avoided. The most recent stats from Gartner put the average cost of IT downtime at $5,600 per minute, but for some businesses it’s as high as $540,000 per minute. For stock trading platforms, the fallout from downtime can run into the billions of dollars. That doesn’t even include the cost of diverting IT staff from their regular work to deal with a downtime emergency. The damage to brand reputation, together with legal penalties, can deal a mortal blow.
It’s important to understand that if you’re using multiple clouds, the potential for downtime increases. Why? Because there are more environments in the mix, often built on different technologies, and because moving workloads between them can be error-prone. Even the most skilled IT administrator is still occasionally subject to a uniquely human failing: We make mistakes.
Now, however, the management complexity that companies with many different clouds experience has been abstracted and simplified. Automated management platforms are available to decrease the potential for downtime in whatever type of cloud you use, whether public, private, multi-cloud, or hybrid cloud.
Business continuity requires resilient storage, network, compute, and data environments. Cloud services have changed the paradigm of how to achieve that. In particular, the traditionally rigid, monolithic compute and data environments can now be adapted to the automated, distributed nature of the cloud. Compute and data resources can be abstracted and managed through a single pane of glass to deliver more resilient cloud, hybrid cloud, and multi-cloud environments.
The compute stack is always evolving, but it is still a very rigid architecture. With a cloud management platform, abstracted compute stack clusters can be deployed to multiple geographies and individually scaled up and down on demand. Continuous monitoring of the compute infrastructure identifies and anticipates problems and remediates them in real time.
For example, a global retailer needs a 1 TB database during a holiday period, but the network pipe is too small to support the anticipated number of transactions. The risk is that people won’t be able to complete those transactions. Not good. What to do? With a cloud-scale management platform, the database capacity can be increased with a few clicks, whereas reconfiguring the traditional compute stack would be difficult and time-consuming. Fast, simplified provisioning, along with management features that span the distributed cloud-scale architecture, helps ensure that the retailer can keep transactions flowing smoothly, whatever the volume and wherever the transactions originate.
High demand for data resources is another resiliency hurdle, because it can cause cloud and on-premises systems to shut down and transactions to be aborted. Angry customers have long memories for terrible customer experiences.
With the distributed, web-scalable architecture of a cloud management platform, if excess demand on one server causes that server to go down, the system immediately moves the workload to a new server cluster, along with all of the data and transactions. Failover is seamless. Everything is automated.
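The failover behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: the Cluster and FailoverRouter classes, the ClusterDown exception, and the fixed capacity limit are all hypothetical names invented for the example. The sketch only reroutes new transactions; replicating the data itself is assumed to be handled elsewhere.

```python
from dataclasses import dataclass, field


class ClusterDown(Exception):
    """Raised by a (hypothetical) cluster that can no longer accept work."""


@dataclass
class Cluster:
    name: str
    capacity: int                 # hypothetical limit on transactions held
    healthy: bool = True
    processed: list = field(default_factory=list)

    def submit(self, tx: str) -> None:
        # Excess demand takes the cluster down, as in the scenario above.
        if not self.healthy or len(self.processed) >= self.capacity:
            self.healthy = False
            raise ClusterDown(self.name)
        self.processed.append(tx)


class FailoverRouter:
    """Sends each transaction to the first healthy cluster in order."""

    def __init__(self, clusters):
        self.clusters = list(clusters)

    def submit(self, tx: str) -> str:
        for cluster in self.clusters:
            if not cluster.healthy:
                continue
            try:
                cluster.submit(tx)
                return cluster.name
            except ClusterDown:
                continue          # fail over to the next cluster
        raise RuntimeError("no healthy cluster available")


if __name__ == "__main__":
    primary = Cluster("primary", capacity=2)
    standby = Cluster("standby", capacity=10)
    router = FailoverRouter([primary, standby])
    for i in range(3):
        print(f"tx{i} handled by", router.submit(f"tx{i}"))
```

The caller never sees the failure: the third transaction overloads the primary cluster, and the router hands it to the standby without any manual step.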
Artificial intelligence algorithms enable the system to detect when a server farm reaches 80% of its capacity and then, if necessary, to bring in another one to help carry the load. Machine learning guards against false alarms. For example, an adaptive machine learning algorithm can learn from the environment that server memory will be maxed out during certain times of the month. It will monitor server capacity based on specific instructions. If there are unusual spikes in demand, as opposed to normal spikes such as those on accounting servers during monthly reconciliation, the learning algorithm recognizes them as unusual. If the spike persists, it notifies IT.
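To make the idea concrete, here is a minimal Python sketch of the two checks described above: a fixed capacity threshold, and a learned baseline that separates expected spikes from unusual ones. It is a simplified stand-in for a real machine learning model, using a per-day mean and standard deviation. The names needs_offload and SpikeMonitor, and the tolerance, persistence, and minimum-spread values, are assumptions made up for illustration; only the 80% figure comes from the text.

```python
from collections import defaultdict
from statistics import mean, pstdev

CAPACITY_THRESHOLD = 0.80   # the 80% figure from the text


def needs_offload(utilization: float) -> bool:
    """True when a server farm is at or above the capacity threshold."""
    return utilization >= CAPACITY_THRESHOLD


class SpikeMonitor:
    """Learns normal utilization per day of month and flags persistent outliers."""

    def __init__(self, tolerance: float = 3.0, persistence: int = 3):
        self.history = defaultdict(list)  # day of month -> past samples
        self.tolerance = tolerance        # assumed: allowed deviations above baseline
        self.persistence = persistence    # assumed: consecutive spikes before alerting
        self.streak = 0

    def learn(self, day: int, utilization: float) -> None:
        self.history[day].append(utilization)

    def is_unusual(self, day: int, utilization: float) -> bool:
        samples = self.history[day]
        if len(samples) < 2:
            return False                  # too little history; stay quiet
        spread = max(pstdev(samples), 0.01)   # floor avoids a zero-width band
        return utilization > mean(samples) + self.tolerance * spread

    def observe(self, day: int, utilization: float) -> bool:
        """Record a sample; return True when IT should be notified."""
        if self.is_unusual(day, utilization):
            self.streak += 1
        else:
            self.streak = 0
            self.learn(day, utilization)  # only normal samples update the baseline
        return self.streak >= self.persistence


if __name__ == "__main__":
    monitor = SpikeMonitor()
    for u in (0.90, 0.92, 0.91):          # month-end reconciliation is always busy
        monitor.learn(30, u)
    for u in (0.30, 0.32, 0.31):          # mid-month is normally quiet
        monitor.learn(10, u)
    print(monitor.observe(30, 0.93))      # busy, but expected on day 30
    print([monitor.observe(10, 0.93) for _ in range(3)])
```

The same 93% reading is treated as routine at month end and as an anomaly mid-month, and IT is notified only after the anomaly persists across several samples.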
Cloud-Scale Business Continuity
Automated, cloud-scale management platforms with intelligence and distributed capabilities are here to help do a better job of providing business continuity across diverse cloud architectures. It isn’t easy for machines, let alone humans. But with the intelligence of AI, the reach of cloud, the power of automated lifecycle management, and the simplicity of a single-pane-of-glass dashboard, business continuity has become a lot more dependable and worry-free.