How to Build a Sustainable, Energy Efficient IT Infrastructure for AI: 6 Core Principles

By Andrea Osika, Sr. PMM, Sustainability, Nutanix

May 19, 2026 8:26 pm

Modernizing for the Inference Era: Balancing Performance and Energy Constraints

The AI wave is maturing as organizations move from experimental pilots to business-critical production. This shift has moved the focus from resource-heavy model training to the high-volume demands of AI inference, which is projected to account for roughly two-thirds of all enterprise AI compute in 2026. Because inference is persistent, distributed, and resides closer to the user, the sustainability challenge has fundamentally changed. The bar for success has moved from training a model efficiently in one-time runs to sustaining continuous, system-level efficiency across the entire hybrid multicloud stack.

In response, the industry is moving toward a more intentional, "AI-Smart" approach to infrastructure. This means treating AI not as an isolated side project, but as a core capability requiring the same resiliency, Day-2 operational discipline, and integrated security as any other mission-critical system. This transition is critical since siloed, legacy infrastructure was not engineered for the high power density or the persistent uptime requirements of modern, distributed AI logic. In fact, these constraints are pushing hyperscalers to redesign their sites.

As energy-intensive AI services scale, infrastructure efficiency is shifting from a secondary utility concern to a primary constraint. A rigid IT infrastructure design risks creating operational bottlenecks and an unpredictable total cost of ownership (TCO) as environments struggle to adapt to rising energy costs or power limitations. Efficiency should be factored into IT modernization alongside performance, sovereignty, and security. Sustainable IT infrastructure centers on a flexible, unified platform that remains adaptable for the next generation of AI execution.

The Six Core Principles of Sustainable, Energy Efficient AI Infrastructure

A sustainable IT architecture is the result of intentional design that reduces waste, improves utilization, and adapts as workloads evolve.

1. Consolidate via HCI to Minimize “stranded watts”

Poor utilization is the biggest driver of energy waste. Efficiently scaling AI won’t work if the legacy tech debt underneath it isn’t addressed. Sustainable AI starts with consolidation. By using Hyperconverged Infrastructure (HCI), enterprises can keep infrastructure “tightly packed” and scale resources only when demand rises. This can help eliminate the “stranded watts” found in legacy infrastructure, i.e., power allocated to underutilized or “zombie” servers, and provide the resiliency needed to absorb constant hardware churn (GPUs/CPUs) without overprovisioning.
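As a rough illustration of the “stranded watts” idea, the sketch below sums the power drawn by servers whose utilization falls under a threshold. The server names, wattages, and the 10% threshold are hypothetical values chosen for illustration; they are not Nutanix data or a standard definition.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    avg_power_watts: float  # measured average power draw
    avg_cpu_util: float     # average CPU utilization, 0.0 to 1.0


def stranded_watts(servers, util_threshold=0.10):
    """Sum the power drawn by servers running below the utilization threshold."""
    return sum(s.avg_power_watts for s in servers if s.avg_cpu_util < util_threshold)


# Hypothetical fleet; every number here is an illustrative assumption.
fleet = [
    Server("legacy-db-01", 350.0, 0.04),
    Server("legacy-app-02", 300.0, 0.08),
    Server("hci-node-01", 450.0, 0.62),
]

total = sum(s.avg_power_watts for s in fleet)
stranded = stranded_watts(fleet)
print(f"Stranded: {stranded:.0f} W of {total:.0f} W ({stranded / total:.0%})")
```

In practice the threshold and the utilization signal (CPU, GPU, memory, or a blend) would need to reflect the workloads in question.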

Evidence-Based Impact: On average, customers that shared their experiences using the Nutanix Cloud Infrastructure (NCI) solution in the Nutanix Cloud Platform (NCP) reported a 50% reduction in energy consumption and an approximately 66% decrease in physical footprint after replacing legacy systems.*

RELATED RESOURCES: Sustainable IT Solutions, Smart Datacenter Strategies: Consolidation for Cost Savings and Efficiency

* These space or energy savings claims are average results based on case studies of representative Nutanix customers that are publicly available on the Nutanix website as of December 10, 2025, and were initially published between Jan 1, 2023, and Dec 10, 2025. Because potential customer outcomes depend on a variety of factors including their use case, individual requirements, and operating environments, these accounts should not be construed to be a promise or obligation to deliver specific outcomes. We invite you to contact Nutanix here to discuss how we may be able to provide an optimal solution for your specific circumstances.

2. Optimize Day-Two Operations with Kubernetes

Running AI reliably on "Day 2" is where sustainability is won or lost. Using cloud-native practices like virtualization and containers to achieve enterprise efficiency allows for automated scaling designed to better align power usage with demand. This approach maximizes utilization by packing workloads onto fewer systems and autoscaling only when needed. The objective is less idle capacity, lower power and cooling demand, and faster scaling for AI workloads moving into production. This targets complexity and "AI Sprawl", where forgotten models continue to unnecessarily consume energy and create security gaps. For many workloads, running containers inside a platform can be an efficient course for the business because:

Consolidation and shared resources.
Predictability: Using platforms that leverage virtualization can offer the ability to snapshot, clone, and move workloads instantly.

RELATED RESOURCES: How to Apply AI in Cloud Native to Accelerate Intelligent Workloads, Nutanix Turns AI Ambition into Enterprise Control and Customer Delight, AI-ready infrastructure that delivers performance, scale, and flexibility
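The idea of packing workloads onto fewer systems, described in this section, can be sketched as a bin-packing problem. The example below uses a first-fit-decreasing heuristic as a simplified illustration; it is not a description of how the Kubernetes scheduler or any Nutanix product places workloads, and the core counts are hypothetical.

```python
def pack_first_fit_decreasing(demands, host_capacity):
    """Place workloads (requested CPU cores) onto hosts, largest first.

    Returns a list of hosts, each a list of the demands placed on it.
    """
    hosts = []  # demands placed on each host
    free = []   # remaining capacity per host
    for demand in sorted(demands, reverse=True):
        if demand > host_capacity:
            raise ValueError(f"workload of {demand} cores exceeds host capacity")
        for i, remaining in enumerate(free):
            if demand <= remaining:
                hosts[i].append(demand)
                free[i] -= demand
                break
        else:
            # No existing host has room, so power on another one.
            hosts.append([demand])
            free.append(host_capacity - demand)
    return hosts


workloads = [8, 2, 4, 6, 2, 10, 4]  # hypothetical core requests
placement = pack_first_fit_decreasing(workloads, host_capacity=16)
print(len(placement), "hosts:", placement)
```

With these inputs the seven workloads fit on three 16-core hosts instead of seven dedicated machines, which is the utilization gain that consolidation aims for.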

3. Operationalize Integrated Security and Energy Observability

You cannot manage what you cannot see. Integrated observability provides real-time visibility into the relationship between performance and power draw, often expressed as performance-per-watt. This enables IT teams to measure and monitor energy demand dynamically, aligning with the World Economic Forum’s call for "energy-aware" compute. Because AI introduces new, distributed data surfaces, security and power monitoring should be built into the foundation of the platform. By maintaining real-time visibility into the entire stack, IT can intelligently allocate energy resources to mission-critical models, helping ensure that power is used where it has the highest business impact and the lowest risk.

A unified monitoring solution can help organizations understand their entire infrastructure environment, including storage, compute, and networking resources, with useful metrics like CPU utilization, memory usage, and storage IOPS alongside power metrics.

A View of the Nutanix Prism Dashboard with Power Usage Highlighted.

The Nutanix Prism dashboard allows organizations to monitor the power usage alongside key system metrics such as CPU usage, memory usage, disk I/O and more.
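One way to relate performance to power is a simple performance-per-watt ratio. The sketch below assumes throughput and power samples have already been collected by some monitoring tool; the cluster labels and numbers are hypothetical, and no Prism API is used or implied.

```python
def perf_per_watt(requests_per_second, power_watts):
    """Inference throughput delivered per watt of measured power."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return requests_per_second / power_watts


# Hypothetical samples: (label, requests per second, watts)
samples = [
    ("cluster-a", 1200.0, 800.0),
    ("cluster-b", 900.0, 450.0),
]

# Rank clusters from most to least efficient.
ranked = sorted(samples, key=lambda s: perf_per_watt(s[1], s[2]), reverse=True)
for label, rps, watts in ranked:
    print(f"{label}: {perf_per_watt(rps, watts):.2f} req/s per watt")
```

Here the cluster with lower raw throughput ranks first because it does more work per watt, which is the kind of trade-off a combined performance and power view makes visible.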

Additionally, features like Token-Based Guardrails, which implement rate limiting and granular cost controls, can help prevent "bill shock" and resource exhaustion. Max-Step Execution Caps implemented at the infrastructure or gateway layer can also help manage the operational overhead of autonomous agents. By terminating workflows after a predefined number of steps, this feature can help mitigate the risk of recursive "hallucination cycles" and potentially reduce the energy waste often associated with unconstrained, non-terminating reasoning.
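The two guardrails described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of any Nutanix feature: `TokenBudget`, `run_agent`, and `StepLimitExceeded` are hypothetical names, and a production gateway would also need per-tenant state, time-based windows, and concurrency control.

```python
class TokenBudget:
    """Fixed token allowance per accounting window (hypothetical design)."""

    def __init__(self, max_tokens_per_window):
        self.max_tokens = max_tokens_per_window
        self.used = 0

    def try_consume(self, tokens):
        """Record usage and return True if the request fits the remaining budget."""
        if self.used + tokens > self.max_tokens:
            return False
        self.used += tokens
        return True

    def reset(self):
        """Call at the start of each accounting window."""
        self.used = 0


class StepLimitExceeded(RuntimeError):
    """Raised when an agent workflow hits its step cap without finishing."""


def run_agent(step_fn, max_steps):
    """Call step_fn(step_index) until it returns a result or the cap is reached."""
    for step in range(max_steps):
        result = step_fn(step)
        if result is not None:
            return result
    raise StepLimitExceeded(f"stopped after {max_steps} steps")


budget = TokenBudget(max_tokens_per_window=1000)
print(budget.try_consume(600))  # fits within the budget
print(budget.try_consume(600))  # would exceed it, so it is rejected

try:
    run_agent(lambda step: None, max_steps=5)  # a workflow that never finishes
except StepLimitExceeded as exc:
    print(exc)
```

Rejecting the request or raising at the cap gives the caller an explicit signal, rather than letting an agent loop consume compute indefinitely.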

RELATED RESOURCES: Approaches to Monitoring Power Consumption, Nutanix Cloud Manager, Unified Cloud Operating Model

4. Right-Size Your Silicon: Use CPUs for Inference

One of the most impactful ways to reduce operational energy consumption is to match the right processor to each task.

CPUs/NPUs: Modern CPUs can offer a practical balance of energy efficiency and potential for a lower TCO for a significant majority of enterprise inference tasks, such as embedding and reranking in retrieval-augmented generation (RAG) pipelines.
GPUs/TPUs: Reserve high-wattage accelerators specifically for large-scale tuning and heavy-duty LLM inference, such as reasoning workloads.

The Strategy: Use CPUs when you can, and GPUs only when you must, to maximize performance-per-watt. Leverage a software-defined infrastructure that uses automation to right-size and place workloads efficiently, which can help reduce waste.

Feature	Traditional Hardware	Software-Defined Infrastructure
GPU Usage	1 GPU = 1 Model	Fractional GPU Sharing (Higher utilization)
Compute	Buy new GPUs for all AI	Uses existing CPUs
Waste	Provision for peak capacity	Right-Sizing (Match supply to demand)
Networking	CPU handles all traffic	DPU Offloading (Free CPU for core tasks)
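The "CPUs when you can, GPUs only when you must" rule can be expressed as a small placement function. The task names and the 3-billion-parameter cutoff below are hypothetical assumptions for illustration; real thresholds depend on the model, latency targets, and the hardware available.

```python
CPU_FRIENDLY_TASKS = {"embedding", "reranking"}
GPU_ONLY_TASKS = {"fine_tuning", "reasoning"}


def choose_accelerator(task, model_params_billions, cpu_max_params_billions=3.0):
    """Return "cpu" or "gpu" for a workload, preferring the CPU where plausible."""
    if task in GPU_ONLY_TASKS:
        return "gpu"
    if task in CPU_FRIENDLY_TASKS or model_params_billions <= cpu_max_params_billions:
        return "cpu"
    return "gpu"


for task, size in [("embedding", 0.3), ("chat", 1.0), ("chat", 70.0), ("reasoning", 1.0)]:
    print(f"{task} ({size}B params) -> {choose_accelerator(task, size)}")
```

Keeping the rule in one function makes the policy easy to audit and to adjust as measured performance-per-watt data comes in.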

RELATED RESOURCES: Practical Guide to Optimizing LLM Inference on Nutanix, Nutanix AI, The Handprint and the Footprint of Artificial Intelligence, AI-ready Infrastructure Solutions: Cost-Effective Options

5. Deploy Carbon-Aware Workload Placement

Recent research suggests that running AI in a low-carbon location can, in some scenarios, significantly reduce emissions, often a more impactful move than a hardware upgrade alone. A hybrid multicloud architecture can provide the necessary mobility to shift heavy compute to where the grid is cleanest.

Cloud for Intensity: Hyperscalers are increasingly powered by renewable PPAs and high Carbon-Free Energy (CFE) percentages. Their optimized Power Usage Effectiveness (PUE) makes them the ideal default for energy-intensive training and high-volume LLM workloads.
Edge for Efficiency: When data must stay local for sovereignty or latency, adopt the "efficiency playbook" at the source. Running inference on the edge using NPUs¹, ARM architectures², or AI PCs reduces backhaul and can cut the total energy required per query³.

By combining renewable-rich cloud regions with localized edge inference, enterprises can deliver AI execution that is both carbon-aware and high-performance.
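A carbon-aware placement decision can be sketched as a filter followed by a minimum: keep the sites that satisfy latency and sovereignty constraints, then pick the one with the lowest grid carbon intensity. The site names and figures below are hypothetical, and a real system would pull carbon intensity from a live data source rather than constants.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    carbon_intensity: float  # grams CO2e per kWh (illustrative values)
    latency_ms: float        # round-trip latency to users
    in_jurisdiction: bool    # True if data may legally reside here


def pick_site(sites, max_latency_ms, require_jurisdiction=False):
    """Return the lowest-carbon site that meets the constraints, or None."""
    eligible = [
        s for s in sites
        if s.latency_ms <= max_latency_ms
        and (s.in_jurisdiction or not require_jurisdiction)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda s: s.carbon_intensity)


sites = [
    Site("cloud-north", 40.0, 90.0, False),
    Site("cloud-central", 300.0, 30.0, True),
    Site("edge-local", 450.0, 5.0, True),
]

print(pick_site(sites, max_latency_ms=100).name)                             # batch job
print(pick_site(sites, max_latency_ms=100, require_jurisdiction=True).name)  # sovereign data
print(pick_site(sites, max_latency_ms=10).name)                              # latency-critical
```

The three calls show the pattern in the text: unconstrained heavy compute goes to the cleanest grid, while sovereignty or latency requirements pull workloads toward local or edge sites.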

A View of the Nutanix Carbon and Power Estimator.

RELATED RESOURCES: Carbon and Power Estimator, Hybrid Multicloud Sustainability, How Companies Can Reduce Emissions by Moving Workloads, Nutanix Cloud Platform, Edge/ROBO on Nutanix

¹ https://www.mdpi.com/2079-8954/13/9/797

² https://www.arm.com/markets/cloud-ai

³ https://www.rapidus.inc/en/tech/te0017/

6. Embrace Circularity and Hardware Repurposing

Sustainability includes the embodied carbon of the hardware itself. This highlights the importance of choosing energy-efficient, EPEAT-registered systems and extending asset life through modular upgrades. One of the most overlooked sustainability levers is repurposing aging GPUs for lighter inference tasks rather than replacing them prematurely. Another lever is repurposing existing hardware such as servers and external storage arrays. This approach aligns with circular economy principles, reducing e-waste and conserving resources while preserving financial investments.
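The effect of extending asset life on embodied carbon can be shown with simple arithmetic: the same manufacturing footprint spread over more service years yields a lower annual figure. The 1,500 kg CO2e value below is a hypothetical placeholder, not a measured footprint for any product.

```python
def annualized_embodied_carbon(embodied_kg_co2e, service_years):
    """Embodied carbon attributed to each year of service."""
    if service_years <= 0:
        raise ValueError("service_years must be positive")
    return embodied_kg_co2e / service_years


EMBODIED_KG = 1500.0  # hypothetical manufacturing footprint of one server

baseline = annualized_embodied_carbon(EMBODIED_KG, service_years=4)
extended = annualized_embodied_carbon(EMBODIED_KG, service_years=6)
print(f"4-year life: {baseline:.0f} kg CO2e per year")
print(f"6-year life: {extended:.0f} kg CO2e per year")
print(f"Reduction: {1 - extended / baseline:.0%}")
```

This ignores operational energy, which can favor newer hardware, so a full comparison would weigh both embodied and operational emissions.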

RELATED RESOURCES: Building an IT Sustainability Framework, Nutanix Expands Support for vSAN Ready Nodes, Practical Guide to Optimizing LLM Inference on Nutanix, Nutanix Expands Flexibility by Building out External Storage

A Sustainable Way to Run AI

A sustainable way to run AI is through a unified, hybrid platform that treats energy as a finite resource. By moving from a fragmented "AI-first" mindset to a disciplined, AI-smart architecture, organizations can work toward reducing energy demand, maximizing utilization, and scaling responsibly.

Sustainable AI FAQ:

Why does AI inference require different infrastructure than training?

Unlike model training, which is a localized, high-intensity batch process, inference is a continuous, real-time workload. It requires infrastructure that prioritizes resiliency and performance-per-watt to avoid unmanageable energy costs during 24/7 production.

How does Nutanix help with AI sustainability and energy efficiency?

Nutanix provides a unified platform that can reduce the physical footprint in a data center, which in turn can lead to savings in power and cooling.

Can I run AI inference on CPUs?

Yes, depending on the use case. For many Small Language Models (SLMs), modern CPUs with built-in accelerators can offer high energy efficiency and lower TCO compared to dedicated GPUs. Rule of thumb: Use CPUs when appropriate and GPUs when you must.

How does Software-Defined Networking (SDN) factor into energy efficiency?

Traditional networking often relies on a sprawl of dedicated hardware appliances that stay powered on 24/7. Virtualizing these functions with solutions like Nutanix Flow Virtual Networking creates an opportunity to eliminate redundant hardware switches and routers. This could lower data center energy consumption by reducing the "vampire power" draw from idle, standalone gear.

©2026 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. Kubernetes is a registered trademark of The Linux Foundation in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s). Certain information contained in this content may link or refer to, or be based on, studies, publications, surveys, and other data obtained from third-party sources. While we believe these third-party studies, publications, surveys, and other third-party data are reliable as of the date of publication, they have not been independently verified unless specifically stated, and we make no representation as to the adequacy, fairness, accuracy, or completeness of any information obtained from a third party. Our decision to publish, link to or reference third-party data should not be considered an endorsement of any such content.


