The Ultimate Guide to Managing Hybrid Cloud Performance

Organizations need to choose services, devices and management tools correctly to drive scalability, agility and cost savings with the hybrid cloud.

By Dipti Parmar February 16, 2022

The hybrid cloud is the IT operating model of choice for 9 out of 10 enterprises.

A full 94% of global IT leaders believe hybrid clouds are critical for meeting their business needs, as per an NTT report. And they’re putting their money where their mouth is – 61% of them are already using or piloting a hybrid cloud infrastructure.

Scalability, agility, and cost containment are the core benefits – real, perceived, and hoped-for – of a hybrid cloud model. It gives organizations the versatility and power of the public cloud with the security of the private cloud. No wonder respondents to the Nutanix Enterprise Cloud Index report were unanimous in their reason for dropping their current IT deployment models in favor of the hybrid cloud: business outcomes.

While a successful hybrid cloud deployment is not without its challenges, it is just the tip of the iceberg. Managing, monitoring, and optimizing a hybrid infrastructure is a whole new ball game. Let’s examine various aspects involved in improving hybrid cloud performance.

Know the Problem Areas

The hybrid cloud is a multifaceted environment that runs complex applications with faster release cycles and manifold dependencies. Workflows are not easy to analyze or optimize. A variety of underlying factors – both hardware and software – can cause sluggishness or outages in business-critical apps.

End-user devices: In the age of remote work, Bring Your Own Device, and DaaS-powered End User Computing (EUC), the variety of mobile and desktop devices that connect to one or more of the cloud systems or on-prem VDI is staggering. It is near-impossible to pin an issue on a specific device, OS, browser, or telecom carrier.

Network: Bandwidth and connection issues can affect cloud performance in multiple ways – it could be data transfer between a public and private cloud, connectivity between a cloud and an on-prem data center, security of a public cloud system, or simply a traffic spike from a particular geographic region. Data- and I/O-intensive applications can trigger problems such as network congestion, latency, jitter, packet loss, and excessive retransmission.
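To make the network symptoms above concrete, here is a minimal Python sketch that summarizes a series of probe results into packet loss, average latency, and jitter. The function name `summarize_probes` and the input shape (a list of round-trip times in milliseconds, with `None` for a probe that got no reply) are assumptions made for illustration. Jitter is approximated as the mean absolute difference between consecutive replies, which is one common simplification rather than the only definition.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize probe results; None marks a probe that got no reply."""
    if not rtts_ms:
        raise ValueError("need at least one probe")
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    # Simplified jitter: mean absolute change between consecutive replies.
    deltas = [abs(b - a) for a, b in zip(replies, replies[1:])]
    return {
        "loss_pct": 100.0 * lost / len(rtts_ms),
        "avg_latency_ms": mean(replies) if replies else None,
        "jitter_ms": mean(deltas) if deltas else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical samples: four probes, one of them lost.
    print(summarize_probes([20.0, 30.0, None, 25.0]))
```

Running the same summary separately for each link (public to private cloud, cloud to on-prem data center, a given region) helps show which path is actually degrading.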

Cloud services: Disruptions in the public cloud components of the hybrid cloud – in IaaS, PaaS, or SaaS services – impact the performance of the entire workload. If only a part of the stack has been migrated to the cloud, it becomes even more difficult to diagnose the problem, given the limited services controlled by the organization. Performance monitoring, enforcement of SLAs, and consistent user experience all become more complex as a result.

Servers: Servers in hosted or on-prem private clouds can be affected by configuration errors, hardware malfunction, outdated OS or applications, inadequate load balancing, and so on. With web servers, HTTP errors such as “page not found” (404) and “internal server error” (500) degrade web application or website performance.

Application code: Organizations using cloud-native apps built on microservice architecture using DevOps practices have to deal with complexities introduced by increased internal dependencies and faster release cycles. Non-standard coding could lead to bad data calls, microservice failures, authentication errors, memory leaks, and other faults within the application.

Understand What and How to Test

Identifying various metrics and creating routine testing procedures is the first step in keeping all parts of the hybrid cloud moving smoothly. Admins need to stay on top of indicators relating to file system performance, read/write speed of storage systems, caching and autoscaling. The performance checks that need to be performed routinely include:

  • Capacity testing – benchmarking the maximum traffic or operations the infrastructure can handle at any given time or for defined periods of time
  • Load testing – how applications and workloads perform when hundreds or thousands of users are logged in simultaneously
  • Stress testing – the stability, reliability, and responsiveness of cloud resources when they’re under high-to-extreme loads 
  • Soak testing – what happens when user or resource load is maintained for extended periods of time, typically done to evaluate continuity and failover in a production environment
  • Failover testing – checks what happens when systems fail partially or fully, when operations are interrupted by resource failure, and whether additional network bandwidth, storage space, memory, or processing power is automatically added when predefined thresholds are crossed
  • Browser testing – compatibility of applications across various browsers on mobile and laptop devices
  • Latency testing – the time needed to send and receive a unit of data from one point to another in the network or environment
  • Application testing – how each application component or layer performs on its own when isolated and whether it generates any system errors
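As an illustration of the load and latency checks in the list above, the following Python sketch simulates a number of concurrent users with a thread pool and reports median and 95th-percentile response times. Everything named here is hypothetical: `request_fn` stands in for whatever call exercises the workload under test, and the nearest-rank percentile is just one simple way to summarize the timings. Dedicated load-testing tools do far more, so treat this as a sketch of the idea only.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def percentile(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile of an already sorted list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def run_load_test(request_fn: Callable[[], None], users: int,
                  requests_per_user: int) -> dict:
    """Run request_fn from `users` concurrent threads and time every call."""

    def one_user(user_index: int):
        timings, errors = [], 0
        for _ in range(requests_per_user):
            start = time.perf_counter()
            try:
                request_fn()
            except Exception:
                errors += 1  # count the failure but keep its timing
            timings.append(time.perf_counter() - start)
        return timings, errors

    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(one_user, range(users)))

    timings = sorted(t for per_user, _ in results for t in per_user)
    return {
        "requests": len(timings),
        "errors": sum(errors for _, errors in results),
        "p50_s": percentile(timings, 50),
        "p95_s": percentile(timings, 95),
    }


if __name__ == "__main__":
    # Placeholder workload: a short sleep instead of a real request.
    print(run_load_test(lambda: time.sleep(0.01), users=20, requests_per_user=5))
```

Raising `users` step by step until the 95th percentile or the error count climbs sharply turns the same harness into a rough stress test, and leaving it running for hours approximates a soak test.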

In case of an outage or slowdown in a cloud workload, typical actions that IT teams can take to identify the root cause include:

  • Isolating the problem to the code, network, or infrastructure layer
  • Analyzing the performance of the application in question across devices, OS versions, geographic regions, and carriers to spot a pattern, if any
  • Tracing transaction paths from the user, through the network, and into the backend and back
  • Reconstructing incidents across services, containers, cloud platforms, and data centers
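The second step above, spotting a pattern across devices, OS versions, regions, and carriers, can be sketched in a few lines of Python. The record layout (dictionaries with an `ok` flag plus one key per dimension) and the `factor` threshold are assumptions for illustration, not a prescribed schema. The idea is simply to flag any dimension value whose error rate is well above the overall rate.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def error_rate_by(records: List[dict], dimension: str) -> Dict[str, float]:
    """Error rate for each distinct value of one dimension (e.g. 'os')."""
    totals, failures = defaultdict(int), defaultdict(int)
    for rec in records:
        key = rec[dimension]
        totals[key] += 1
        if not rec["ok"]:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


def suspects(records: List[dict], dimensions: Iterable[str],
             factor: float = 2.0) -> List[Tuple[str, str, float]]:
    """Flag values whose error rate is at least `factor` times the overall rate."""
    if not records:
        return []
    overall = sum(1 for r in records if not r["ok"]) / len(records)
    if overall == 0:
        return []
    flagged = []
    for dim in dimensions:
        for value, rate in error_rate_by(records, dim).items():
            if rate >= factor * overall:
                flagged.append((dim, value, rate))
    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical request log: failures cluster on one OS, not on a region.
    log = [
        {"os": "android", "region": "eu", "ok": False},
        {"os": "android", "region": "us", "ok": False},
        {"os": "ios", "region": "eu", "ok": True},
        {"os": "ios", "region": "us", "ok": True},
        {"os": "ios", "region": "eu", "ok": True},
        {"os": "ios", "region": "us", "ok": True},
    ]
    print(suspects(log, ["os", "region"]))
```

With small sample sizes a single failure can look like a pattern, so in practice a minimum request count per value is worth adding before acting on the result.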

Focus on Cloud-Specific Issues

Given the complexities and multiple technologies that come together, performance might not even be a priority when deploying a hybrid cloud, for a host of reasons (lack of IT expertise, for one). Organizations shifting to hybrid, multicloud infrastructures primarily focus on areas such as business functionality, data privacy, security and compliance, resource availability, scalability, and improved mobility (both data and app mobility).

There are a few areas that ultimately influence the overall performance of the cloud environment, especially a hybrid model. Cloud architects and managers need to identify and fine-tune their operational best practices to better address these areas.

Smartly allocate resources or instances

While the public cloud thrives on resource sharing, a hybrid cloud can improve performance by reducing the number of shared resources and hosts. It offers single-tenancy resource management, where dedicated instances are physically isolated at the host level from instances that belong to other tenants.

Depending on the budget, businesses can stabilize workload performance by buying dedicated hosts, instances, or bandwidth. Further, with a long-term commitment to the provider, they can realize significant cost savings through offerings such as Amazon EC2 Reserved Instances.
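Whether a long-term commitment actually saves money depends on how heavily the instance is used. The Python sketch below works through that arithmetic under a deliberately simple assumption: the reserved capacity is paid for every hour of the year whether or not it is used, while on-demand capacity is paid only for hours used. The hourly rates in the example are made up for illustration and are not real provider prices.

```python
HOURS_PER_YEAR = 8760


def break_even_utilization(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Fraction of the year an instance must run for the reservation to pay off."""
    if on_demand_hourly <= 0:
        raise ValueError("on-demand rate must be positive")
    return reserved_hourly / on_demand_hourly


def yearly_costs(utilization: float, on_demand_hourly: float,
                 reserved_hourly: float) -> tuple:
    """Return (on_demand_cost, reserved_cost) for one instance over a year."""
    on_demand = utilization * HOURS_PER_YEAR * on_demand_hourly
    # Assumption: the reservation is billed for every hour, used or not.
    reserved = HOURS_PER_YEAR * reserved_hourly
    return on_demand, reserved


if __name__ == "__main__":
    # Hypothetical rates: 0.10 per hour on demand, 0.06 per hour effective reserved.
    print(break_even_utilization(0.10, 0.06))
    print(yearly_costs(0.9, 0.10, 0.06))
    print(yearly_costs(0.3, 0.10, 0.06))
```

With these made-up rates the break-even sits at roughly 60% utilization: a steady, always-on workload favors the reservation, while a bursty workload that runs only a third of the time is cheaper on demand.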

Enable end-to-end visibility into application performance

Whenever business operations are affected, the issue becomes obvious to the user only via the application, regardless of the infrastructure layer at which the problem resides. This is why application monitoring and correlation of every transaction across public and private clouds and on-prem environments is so important. An app-centric approach to tracking performance helps businesses achieve end-to-end visibility throughout the infrastructure.
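One way to picture this kind of transaction correlation is a shared identifier that travels with each request through every environment. The Python sketch below assumes each environment emits events carrying a hypothetical `txn_id`, an `env` label, and a timestamp `ts` in seconds on a common clock; these field names are invented for the example. It groups events by transaction, orders them in time, and reports the slowest hop.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def reconstruct(events: List[dict]) -> Dict[str, List[dict]]:
    """Group events by transaction id and order each group by timestamp."""
    by_txn = defaultdict(list)
    for event in events:
        by_txn[event["txn_id"]].append(event)
    return {txn: sorted(evs, key=lambda e: e["ts"]) for txn, evs in by_txn.items()}


def slowest_hop(path: List[dict]) -> Optional[Tuple[str, str, float]]:
    """Return (from_env, to_env, seconds) for the largest gap in one path."""
    gaps = [(a["env"], b["env"], b["ts"] - a["ts"]) for a, b in zip(path, path[1:])]
    return max(gaps, key=lambda gap: gap[2]) if gaps else None


if __name__ == "__main__":
    # Hypothetical events collected from three environments, out of order.
    events = [
        {"txn_id": "t1", "env": "public-cloud", "ts": 0.0},
        {"txn_id": "t1", "env": "on-prem", "ts": 0.4},
        {"txn_id": "t1", "env": "private-cloud", "ts": 0.1},
    ]
    path = reconstruct(events)["t1"]
    print([e["env"] for e in path])
    print(slowest_hop(path))
```

The sketch relies on synchronized clocks across environments, which is itself an assumption; distributed tracing systems typically record parent-child span relationships so that ordering does not depend on timestamps alone.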

Application performance directly impacts customer experience (CX). Cloud admins need to be able to correlate the UX of the application with the hybrid infrastructure. Monitoring tools that provide insights into how customers interact with applications across browsers and devices are the need of the hour. Mapping user journey hotspots and UX glitches to application flows enables better identification and diagnosis of anomalies across all cloud services.

Put in Place a Hybrid Cloud Management Strategy

While it is easy to see the business benefits of a hybrid cloud setup, effective cloud infrastructure management is essential for sustained performance, availability, resilience and on-demand scaling of resources.

An in-depth understanding of business goals, user requirements, industry regulations, data governance, and security is critical to successful hybrid cloud deployment and administration. This calls for a comprehensive cloud management strategy that accounts for:

Unified management – Relying on multiple interfaces and dashboards for different components or environments within a hybrid cloud is a recipe for disaster. Implement a single management and control plane that aggregates data from all cloud and on-prem systems within the hybrid infrastructure. A centralized system for monitoring and reporting makes cloud management much simpler.

Further, a “single pane of glass” management interface enables real-time orchestration and automatic provisioning of cloud resources while abstracting the complexity of the underlying platform and application/database stacks.
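The aggregation idea behind unified management can be sketched as a thin layer that polls every environment through the same interface and applies one set of thresholds to the combined result. In the Python sketch below, each "collector" is a hypothetical function returning a dictionary of metrics for one environment; the metric names and thresholds are invented for illustration and do not correspond to any particular product's API.

```python
from typing import Callable, Dict, List, Optional, Tuple

Collector = Callable[[], Dict[str, float]]


def unified_view(collectors: Dict[str, Collector]) -> Dict[str, dict]:
    """Poll every environment and merge the results into one structure."""
    view = {}
    for env, collect in collectors.items():
        try:
            view[env] = {"reachable": True, "metrics": collect()}
        except Exception as exc:
            # An unreachable environment is itself a finding, not a crash.
            view[env] = {"reachable": False, "error": str(exc)}
    return view


def breaches(view: Dict[str, dict],
             thresholds: Dict[str, float]) -> List[Tuple[str, str, Optional[float]]]:
    """List (environment, metric, value) for every threshold that is exceeded."""
    found = []
    for env, entry in view.items():
        if not entry["reachable"]:
            found.append((env, "unreachable", None))
            continue
        for name, limit in thresholds.items():
            value = entry["metrics"].get(name)
            if value is not None and value > limit:
                found.append((env, name, value))
    return found


if __name__ == "__main__":
    def broken_collector() -> Dict[str, float]:
        raise ConnectionError("timeout")

    # Hypothetical collectors standing in for real monitoring integrations.
    collectors = {
        "public-cloud": lambda: {"cpu_pct": 91.0, "p95_ms": 120.0},
        "on-prem": lambda: {"cpu_pct": 40.0},
        "private-cloud": broken_collector,
    }
    print(breaches(unified_view(collectors), {"cpu_pct": 85.0}))
```

The design choice worth noting is that thresholds live in one place rather than in each environment's own dashboard, which is what keeps reporting consistent across the hybrid infrastructure.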

Security and governance – A hybrid cloud necessitates homogeneous security interfaces across multiple environments with frameworks such as Identity and Access Management (IAM) and Zero Trust Architecture (ZTA).

Workload suitability – Figuring out which workloads run well in public clouds and which run better on-prem is key to fulfilling the very goals of implementing a hybrid cloud environment. Admins need to evaluate dozens of factors that impact performance, including mapping out applications, estimating the business value they deliver over time, constructing private-public interfaces, monitoring latencies, meeting user requirements, and so on.

“Exploiting cloud successfully and safely requires multiple domains to coordinate and develop a business-driven decision framework and best-practice IT operational models,” said David Cearley, Research Vice President and Fellow at Gartner.

“This helps to standardize cloud strategy across an organization, while allowing for an approach that will meet the unique needs of different use cases and business units,” Cearley said.

A Brave, New Hybrid World

Cloud computing is one of the major driving forces behind digital transformation initiatives today. Hybrid clouds performing at optimum levels give organizations key competitive advantages, the agility to operate in new markets, the leverage to try new business models, and the ability to serve customers better.

The primary benefit for businesses in improving hybrid cloud performance is not cost savings, faster time-to-market or even operational efficiency.

“A managed approach to hybrid cloud provides enterprises with the ability to ‘offload’ some of their IT operations and focus more on their business objectives,” said Michael Ritchken, Principal Consultant, Cloud Computing Services at Dimension Data.

“At the end of the day, it’s ultimately about delivering business value.”

Dipti Parmar is a marketing consultant and contributing writer to Nutanix. She’s a columnist for major tech and business publications such as IDG’s CIO.com, Adobe’s CMO.com, Entrepreneur Mag, and Inc. Follow Dipti on Twitter @dipTparmar or connect with her on LinkedIn for little specks of gold-dust-insights.

© 2022 Nutanix, Inc. All rights reserved.