Enterprise AI teams face two key pressures: 1) squeezing maximum value out of expensive GPUs and 2) delivering new GenAI experiences faster than competitors. Too often, their ability to deliver is blocked by a familiar set of infrastructure and workflow headaches.
The new solution, ClearML running on the Nutanix Kubernetes Platform (NKP), is designed to tackle every one of these headaches. Below, we unpack each layer of the stack and explain what it is, why it matters, and how it helps you ship AI quickly and cost-efficiently*.
The Nutanix Kubernetes Platform (NKP) stack simplifies platform engineering by stamping out operational complexity and establishing consistency across any environment – exactly what enterprises need to run demanding AI workloads at scale without adding overhead.
ClearML builds on this by adding a hardware-agnostic Infrastructure Control Plane that, with the NKP solution, enables a multi-tenant environment built for collaboration, resource efficiency, and speed. With ClearML’s GPU-as-a-Service, AI builders can instantly access AI accelerators through Jupyter, VSCode, SSH, or remote desktop – turning every GPU into a shared, fully governed service.
To tackle the constant struggle with underutilized GPUs and team-level contention, ClearML enables dynamic fractional GPU allocation and resource quotas with controlled bursting – so teams can share GPUs efficiently while still getting the performance they need. Add usage-based billing and cost attribution, and organizations finally get the visibility required to scale AI across departments without chaos or waste.
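The quota-plus-bursting model can be sketched in a few lines. This is a hypothetical allocator for illustration only, not ClearML's scheduler; the team names, quotas, and preemption-free policy are assumptions:

```python
# Hypothetical sketch of quota-based GPU sharing with controlled bursting.
# All names and numbers are illustrative; this is NOT ClearML's implementation.

class GpuQuotaPool:
    """Tracks fractional GPU allocations per team against a shared pool."""

    def __init__(self, total_gpus: float):
        self.total = total_gpus
        self.used = {}          # team -> fraction of the pool currently held
        self.quota = {}         # team -> guaranteed share
        self.burst_limit = {}   # team -> hard cap when the pool has slack

    def register(self, team: str, quota: float, burst_limit: float):
        self.quota[team] = quota
        self.burst_limit[team] = burst_limit
        self.used[team] = 0.0

    def request(self, team: str, amount: float) -> bool:
        """Grant up to the burst limit while physical capacity remains."""
        free = self.total - sum(self.used.values())
        if amount > free:
            # No capacity left; a real scheduler could preempt bursting teams
            # to honor another team's guaranteed quota.
            return False
        if self.used[team] + amount <= self.burst_limit[team]:
            self.used[team] += amount
            return True
        return False

pool = GpuQuotaPool(total_gpus=8.0)
pool.register("research", quota=4.0, burst_limit=6.0)
pool.register("prod", quota=4.0, burst_limit=4.0)

print(pool.request("research", 3.5))   # True: within guaranteed quota
print(pool.request("research", 2.0))   # True: bursts to 5.5 using idle capacity
print(pool.request("prod", 4.0))       # False: the burst consumed the slack
```

The sketch shows the trade-off the post describes: bursting soaks up idle capacity, while quotas and usage accounting keep teams from starving each other permanently.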
With Nutanix at its core, ClearML centralizes the full AI/ML workflow, which includes:
Every experiment is automatically tracked, so AI teams can focus on building better models – not managing logs or recreating past work.
Development workloads can run directly on the NKP stack, enabling consistent performance and seamless resource provisioning across environments.
When it’s time to operationalize AI workloads, the ClearML GenAI App Engine becomes a central component of the stack. At its core is the ClearML Application Gateway, which manages all ingress and egress networking for containers running on NKP Kubernetes. It acts as the secure entry point for AI services, enforcing authentication and resource-aware routing, and adding a layer of role-based access control (RBAC) around deployed models and endpoints.
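To make the gateway's role concrete, here is a minimal, hypothetical sketch of RBAC-style routing in front of model endpoints. The tokens, roles, and paths are invented for illustration; the real Application Gateway integrates with platform authentication rather than a lookup table:

```python
# Toy sketch of gateway-style RBAC enforcement around model endpoints.
# Tokens, roles, and endpoint paths are hypothetical examples.

ROLE_PERMISSIONS = {
    "ml-engineer": {"/v1/models/llm/infer", "/v1/models/llm/deploy"},
    "analyst": {"/v1/models/llm/infer"},
}

TOKENS = {"tok-alice": "ml-engineer", "tok-bob": "analyst"}

def route(token: str, path: str) -> int:
    """Return an HTTP-style status: 401 unknown token, 403 no permission, 200 ok."""
    role = TOKENS.get(token)
    if role is None:
        return 401          # authentication failed
    if path not in ROLE_PERMISSIONS.get(role, set()):
        return 403          # authenticated, but role lacks access
    return 200              # request is forwarded to the model container

print(route("tok-bob", "/v1/models/llm/infer"))   # 200
print(route("tok-bob", "/v1/models/llm/deploy"))  # 403
print(route("tok-eve", "/v1/models/llm/infer"))   # 401
```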
Beyond secure access, the GenAI App Engine also supports RAG workflows through a built-in vector database, enabling use cases like question answering, semantic search, and content generation. AI builders can index, query, and serve embeddings – all within the same platform, without stitching together third-party tools. Add to this ClearML’s orchestration pipelines, and teams can automate the entire RAG flow – from ingestion to inference – using native building blocks that run directly on NKP-managed infrastructure.
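The retrieval half of a RAG flow can be illustrated with a toy example, where a bag-of-words "embedding" and brute-force cosine search stand in for a real embedding model and vector database:

```python
# Toy sketch of the retrieval step in a RAG flow: embed, index, query.
# A bag-of-words "embedding" stands in for a real model and vector database.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: a sparse term-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "fractional gpu allocation shares one gpu across teams",
    "the application gateway enforces role based access control",
    "vector databases serve embeddings for semantic search",
]
index = [(d, embed(d)) for d in docs]

def retrieve(query: str, k: int = 1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

print(retrieve("semantic search with embeddings"))
```

In a production pipeline the same shape holds, only the embedding comes from a model and the index lives in the platform's vector database; the retrieved passages are then fed to the model at inference time.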
To ensure production reliability, ClearML includes a centralized monitoring dashboard for deployed endpoints. Teams get real-time visibility into key metrics such as latency, request rate, and resource usage, with the ability to drill down into each individual service for performance analysis and troubleshooting. Whether you're tracking usage patterns or investigating anomalies, everything is in one place, purpose-built for operational AI.
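The kind of metrics such a dashboard surfaces can be computed from raw request timings. The sample latencies and the nearest-rank percentile method below are assumptions made for the example:

```python
# Illustrative endpoint metrics from raw request timings: request rate plus
# latency percentiles, which expose tail latency that averages hide.
import math

def percentile(samples, pct):
    """Nearest-rank percentile over a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies = [12, 15, 11, 240, 14, 13, 16, 12, 15, 14]  # ms, one per request
window_s = 5.0                                         # observation window

print(f"rate: {len(latencies) / window_s:.1f} req/s")  # rate: 2.0 req/s
print(f"p50: {percentile(latencies, 50)} ms")          # p50: 14 ms
print(f"p95: {percentile(latencies, 95)} ms")          # p95: 240 ms (the outlier)
```

The single 240 ms outlier barely moves the median but dominates the p95, which is why per-service drill-down on percentile latency matters for troubleshooting.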
Skip the spreadsheets and status-meeting chaos. Spin up fractional GPUs, train a model, deploy a GenAI endpoint, and watch real-time cost dashboards, all in a single session. See how the ClearML on Nutanix Kubernetes Platform solution turns your GPUs into a high-speed, fully accountable AI factory: book a live demo now at clear.ml/demo or take Nutanix for a test drive at https://www.nutanix.com/products/kubernetes-management-platform.
*ClearML internal research