Nutanix Turns AI Ambition into Enterprise Control and Customer Delight

By Luke Congdon, Tuhina Goel

For organizations starting or scaling their AI initiatives, the transition from experimentation to production-ready deployment can feel like crossing a chasm. Enterprises are realizing that success is not just about model accuracy; it’s about building in control, observability, security, and scale from day one. 

That is where the Nutanix Enterprise AI (NAI) solution comes in. NAI is a simple, intuitive infrastructure platform purpose-built for generative AI. It delivers robust security, observability, auditing, and logging right out of the box, so teams can move from experimentation to production with confidence. 

With the latest release of Nutanix Enterprise AI 2.5, we have doubled down on GPU resource management, hardened security, deeper observability, and a broader model ecosystem. Here is how NAI 2.5 strengthens security, maximizes efficiency, and empowers your platform teams. 

Enterprise-Grade Security: Centralized Control, Trusted Access

Every enterprise has sensitive data they must protect, such as proprietary research, HR records, source code, customer data, and more. Protecting sensitive data is a critical part of an enterprise AI infrastructure strategy. NAI 2.5 deepens enterprise security with the following enhancements:  

  1. Active Directory (AD) and Single Sign-On (SSO) via LDAP and SAML: This means IT teams can now centrally manage AI platform access using the same controls and existing directory services that secure their data centers and cloud environments. It’s the difference between another isolated tool and an enterprise-ready platform. 
  2. Model Access Control: Fine-grained permissions for which models and catalogs (Hugging Face, NVIDIA NIM, etc.) teams can deploy, expanding governance over what runs in your environment. 
  3. Forward Proxy Support: Secure model downloads through corporate firewalls without configuration headaches. 

Together, these controls allow CIOs to scale GenAI adoption without creating new identity silos while preserving data security. 
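For the forward-proxy piece, the mechanics are familiar to anyone who has routed tooling through a corporate firewall. NAI configures this within the platform; purely as a general, hypothetical illustration of how download traffic traverses a forward proxy (the proxy URL below is made up):

```python
import urllib.request

# Hypothetical corporate proxy -- NAI sets its forward proxy through the
# platform itself, not through client code like this.
PROXY_URL = "http://proxy.corp.example:3128"

def proxied_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build a urllib opener that routes HTTP(S) traffic through a forward proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

opener = proxied_opener(PROXY_URL)
# opener.open("https://huggingface.co/...")  # model downloads would traverse the proxy
```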

Deeper Observability and Logging   

When moving AI applications into production, observability is key to rapid root cause analysis (RCA). With NAI 2.5, teams gain enhanced traceability that enables them to diagnose performance issues, optimize resource utilization, and validate SLAs, all without leaving the AI platform. 

  1. Endpoint Logging and rsyslog Integration: When an API endpoint isn’t performing as expected, the first step toward resolution is visibility. With NAI 2.5, administrators can now access endpoint-specific logs directly from within the NAI console, eliminating the dependency on external Kubernetes® teams for root cause analysis. Logs can be viewed or downloaded from the endpoint summary screen. Audit logs can be exported via rsyslog to remote syslog servers for centralized, long-term storage and compliance auditing.
  2. LLM metrics dashboard: When running large language model (LLM) endpoints, visibility into usage and performance is critical. NAI 2.5 delivers granular analytics that display request and token trends over time, sortable by API key for clear attribution. The enhanced performance dashboard tracks latency, active and queued requests, time to first token (TTFT), time per output token (TPOT), and tokens generated per second, giving teams a comprehensive view of endpoint efficiency and user demand. Designed for shared or multi-team setups, it helps admins monitor usage, link performance to workloads, and allocate compute resources with precision. 
LLM Metrics Dashboard
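The dashboard’s latency metrics have precise definitions, and it can be useful to compute them client-side when validating an endpoint against an SLA. A minimal sketch, assuming you record the request start time and an arrival timestamp for each streamed token (the function and field names here are ours, not NAI’s):

```python
from dataclasses import dataclass

@dataclass
class StreamStats:
    ttft_s: float        # time to first token (prefill latency)
    tpot_s: float        # mean time per output token, after the first
    tokens_per_s: float  # overall generation throughput

def stream_stats(request_start: float, token_times: list[float]) -> StreamStats:
    """Compute TTFT/TPOT/throughput from per-token arrival timestamps (seconds)."""
    if not token_times:
        raise ValueError("no tokens received")
    ttft = token_times[0] - request_start
    total = token_times[-1] - request_start
    n = len(token_times)
    # TPOT excludes the first token, which is dominated by prefill latency.
    tpot = (token_times[-1] - token_times[0]) / (n - 1) if n > 1 else 0.0
    return StreamStats(ttft, tpot, n / total)

# Four tokens arriving 0.5s, 0.6s, 0.7s, and 0.8s after the request:
# TTFT = 0.5s, TPOT ~= 0.1s, throughput = 5 tokens/s
stats = stream_stats(0.0, [0.5, 0.6, 0.7, 0.8])
```

The same quantities, aggregated per API key over time, are what the NAI dashboard surfaces for attribution across teams.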

Advanced Resource Controls: Smarter, Efficient Scaling

As AI adoption accelerates, so does the demand for GPU efficiency. NAI 2.5 brings advanced hardware optimization and elasticity to help enterprises right-size compute resources, balance performance, and achieve higher overall GPU utilization across AI workloads. 

  1. New GPU Support: NAI has added support for powerful new GPUs, including the NVIDIA H200 and RTX PRO 6000.  
  2. Multi-Instance GPU (MIG) and vGPU Support: To maximize utilization, NAI 2.5 introduces support for NVIDIA Multi-Instance GPU (MIG), a hardware-level segmentation feature that divides a single physical GPU into multiple, isolated instances. Each slice is presented to Kubernetes and AI applications as an independent GPU, enabling multiple endpoints or smaller models to share the same hardware. This is valuable for embedding workloads and small language models (SLMs) that don’t require a full GPU.
    NAI 2.5 also supports vGPU, a software-based approach that allows GPU resources to be shared dynamically across endpoints for smaller or concurrent models.  
  3. Endpoint Scaling: NAI 2.5 introduces manual scaling for endpoints, allowing administrators to scale compute instances up or down in response to real-time traffic and utilization. 
  4. Intel AMX Support: Not every inference workload requires a GPU. For embedding models and models with fewer than 10 billion parameters, NAI can use Intel Advanced Matrix Extensions (AMX), available on 4th Gen and later Intel Xeon Scalable processors, to accelerate inference on CPUs without additional GPU overhead. This built-in optimization lets teams leverage existing CPU infrastructure for cost-effective AI, freeing GPU resources for larger, compute-intensive models while maintaining fast, reliable performance. 
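To make the MIG model concrete: with NVIDIA’s Kubernetes device plugin, each MIG profile is advertised as its own extended resource, and a workload requests a slice instead of a whole GPU. A hypothetical pod spec, sketched here as a Python dict (the profile name, image, and pod names are illustrative and depend on how the cluster’s device plugin is configured):

```python
import json

# Illustrative only: a pod that requests one MIG slice rather than a full GPU.
# With the NVIDIA device plugin's "mixed" strategy, each MIG profile (here,
# a hypothetical 1g.10gb slice) appears as its own extended resource name.
mig_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "embedder"},
    "spec": {
        "containers": [{
            "name": "embedding-model",
            "image": "registry.example.com/embedder:latest",  # hypothetical image
            "resources": {
                # One MIG slice instead of "nvidia.com/gpu": 1
                "limits": {"nvidia.com/mig-1g.10gb": 1},
            },
        }],
    },
}

manifest = json.dumps(mig_pod, indent=2)
```

Several such pods can then land on the same physical GPU, which is exactly the sharing behavior NAI 2.5 exploits for embedding and SLM endpoints.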

Developer Delight and Rapid Innovation

AI infrastructure shouldn’t slow down innovation; it should accelerate it. Say hello to NAI Labs! This is your playground for simplified validation of workflows, fast feedback loops, and the confidence that your generative AI environment is production-ready from day one. 

1. Two New Sample Applications: NAI 2.5 introduces two sample apps, a conversational chatbot and a RAG (“Talk-to-My-Data”) app, to help users quickly validate their end-to-end AI infrastructure. Users can select deployed LLM, embedding, and reranker endpoints, provide API keys, and instantly verify live responses, with no scripts, configuration, or external tools required. These built-in apps deliver a fast, frictionless validation experience, enabling teams to test performance, collaborate efficiently, and move to production with confidence. 

LLM chatbot

2. Automation and Model Expansion: NAI 2.5 simplifies automation and broadens model choice. With publicly available NAI APIs, teams can automate endpoint creation, scaling, and monitoring directly through their CI/CD pipelines. Each release also expands the catalog of validated LLMs and NVIDIA NIM microservices, giving enterprises a continuously growing library of trusted, production-ready models to deploy with confidence. See the NAI Admin Guide on the Nutanix portal for the latest list of validated models. 
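As a hypothetical illustration of the CI/CD pattern, the sketch below builds, but does not send, the kind of request a pipeline step might issue to create or scale an endpoint. The URL, paths, payload fields, and token are invented for illustration; consult the NAI API documentation on the Nutanix portal for the actual schema.

```python
import json
import urllib.request

# All values below are hypothetical placeholders, not real NAI API paths.
NAI_API = "https://nai.example.com/api/v1"
API_TOKEN = "REPLACE_ME"

def endpoint_payload(name: str, model: str, instances: int) -> dict:
    """Build a request body for creating or scaling an inference endpoint."""
    if instances < 1:
        raise ValueError("instances must be >= 1")
    return {"name": name, "model": model, "instances": instances}

def create_endpoint_request(payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP request a CI/CD job would issue."""
    return urllib.request.Request(
        f"{NAI_API}/endpoints",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {API_TOKEN}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = create_endpoint_request(endpoint_payload("chat-prod", "llama-3.1-8b-instruct", 2))
# urllib.request.urlopen(req)  # a pipeline step would send it here
```

The same pattern, parameterized per environment, lets a pipeline promote an endpoint from staging to production without touching the console.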

Ready to Begin Your AI Journey?  

With NAI 2.5, Nutanix delivers the enterprise-grade infrastructure platform for AI. The latest release combines security, control, and resource efficiency with the simplicity developers love. From proof of concept to production, NAI empowers teams to deploy swiftly, operate smarter, and innovate with confidence across every stage of their generative AI journey.  

  ©2025 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s).