Composable Data Centers That Power Enterprise AI

In this video interview, Liqid President and Chief Strategy Officer Sumit Puri explains why dynamic IT infrastructure that taps into pools of GPUs and scale-up memory can quickly and efficiently run virtual machines, containers and AI workloads on-premises and at the edge.

July 17, 2025

Data centers, once siloed and monolithic, have evolved rapidly into scalable, hyperconverged and disaggregated systems that connect across owned and rented IT infrastructure. With surging demand for artificial intelligence (AI) and machine learning (ML) capabilities, Colorado-based Liqid, a Nutanix partner, has created software that allows IT teams to dynamically "compose" bare-metal servers on the fly, creating custom configurations tailored to the exact needs of a specific workload. This can help optimize the use of high-cost, power-hungry resources like GPUs.

In this video, Liqid President and Chief Strategy Officer Sumit Puri explains how IT teams can plug their data centers into a shared pool of GPUs to run specific workloads on premises or at the edge. When the AI tasks are done, the GPUs are disengaged and remain on standby for the next AI workload that comes along. He describes how this approach can make it easier and faster for enterprises to embrace AI to power their business and operations.

“Let's put a centralized pool of whatever device type we need (e.g., GPUs from various vendors) and pick the right tool for the right job at the right time,” Puri told The Forecast at the 2025 .NEXT conference in Washington, D.C.

Puri explains how Liqid's software enables intelligent orchestration and management of infrastructure by abstracting the physical hardware and presenting it as a fluid, dynamic resource pool. He said this approach helps maximize IT resource efficiency and delivers agility, scalability, cost and sustainability benefits without locking customers into a single GPU vendor.

This addresses the growing need to run AI inference workloads efficiently, allowing companies to process data on-premises without moving it to the cloud.

“What Liqid focuses on is very much on building power-efficient, cost-efficient solutions for (AI) inference,” Puri said.

“Whether it's on-prem or it's in the cloud, a lot of the data, which is what a lot of this AI is driven on, lives on-prem. Eighty-three percent of the data is actually on-prem. And so one of two things must happen. We either must move the data into the cloud, or we must bring the GPUs on-prem. We think there are a lot of customers who are not willing to wholesale move their data to public cloud, and so, therefore, we want to build efficient solutions to allow them to process their data on-prem.”

This approach can make AI inference more accessible and scalable, according to Puri. He said it helps IT teams focus on metrics like tokens per dollar and tokens per watt, which matter most to enterprises with limited power and budgets.
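
To make those metrics concrete, here is a minimal sketch of how an IT team might compare two inference deployments on tokens per dollar and tokens per watt. The figures are illustrative placeholders of our own, not Liqid or Nutanix benchmarks.

```python
# Illustrative inference-efficiency metrics. All numbers are hypothetical
# examples for demonstration, not vendor measurements.

def tokens_per_dollar(tokens_per_second: float, cost_per_hour: float) -> float:
    """Tokens generated per dollar of infrastructure cost."""
    return tokens_per_second * 3600 / cost_per_hour

def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Tokens generated per second, per watt of power draw."""
    return tokens_per_second / power_watts

# Hypothetical deployment A: fixed 8-GPU server, partially utilized.
print(tokens_per_dollar(tokens_per_second=2_000, cost_per_hour=40.0))  # 180,000 tokens per dollar
print(tokens_per_watt(tokens_per_second=2_000, power_watts=5_000))     # 0.4 tokens/s per watt

# Hypothetical deployment B: composed GPUs sized to the workload.
print(tokens_per_dollar(tokens_per_second=2_000, cost_per_hour=25.0))  # 288,000 tokens per dollar
print(tokens_per_watt(tokens_per_second=2_000, power_watts=3_200))     # 0.625 tokens/s per watt
```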

In a related video demonstration of AI/ML-driven workloads with Liqid and Kubernetes, an IT manager dynamically attaches 11 GPUs to six servers in under 30 seconds and updates hardware configurations in real time so all pods have access to the necessary GPU resources.

Video transcript:

Jason Lopez: Liqid is a software-defined infrastructure company that enables more efficient use of high-cost, high-power resources like GPUs in data centers. Instead of installing GPUs directly into each server, Liqid creates centralized GPU pools and dynamically allocates them to servers based on workload demands. This composable approach maximizes GPU utilization, reducing waste and improving performance.

Sumit Puri: We saw this vision of GPUs being important in the data center many years back, and then all of a sudden this thing called ChatGPT burst upon the scene and made this front and center in everyone's mind. And so we've been focusing on pooling and sharing these resources for a long time, and now all of a sudden AI is forcing itself into the mainstream, especially in the areas that we focus on, things like the enterprise, and now all of it's kind of coming to market for us in a very, very good way. And so we ended up building the product, and the market ended up coming our way.

Jason Lopez: Liqid addresses the growing demand for AI inference workloads, especially for enterprises that want to keep their data on-prem instead of moving to the cloud.

Sumit Puri: What Liqid focuses on is very much on building power-efficient, cost-efficient solutions for inference. And it's interesting to think whether it's on-prem or it's in the cloud, a lot of the data, which is what a lot of this AI is driven on, it lives on-prem. Eighty-three percent of the data is actually on-prem. And so one of two things must happen. We either must move the data into the cloud, or we must bring the GPUs on-prem. We think there's a lot of customers who are not willing to wholesale move their data to public cloud, and so therefore we want to build efficient solutions to allow them to process their data on-prem.

Jason Lopez: There are advantages to pooling and sharing GPUs instead of deploying them directly in each server.

Sumit Puri: There's three primary benefits why somebody takes this journey of pooling and sharing the GPUs. One is around performance. I need that server to have a lot of GPUs because I need it to run very fast. We're not limited by two or four or eight in a box. We can compose 30 GPUs to a server and give you the fastest servers on the planet. That's one reason. The other is cost. If I have to deploy 30 GPUs, do I want to buy four servers, put eight GPUs in every server, buy a bunch of networking? Or do I want to buy a single server, deploy 30 GPUs, reduce my cost, reduce my power, and have a more efficient way of deploying these resources? And the third reason is agility. It's very hard to predict, will my workload need one? Will it need two? Will it need four? Will it need eight? Do I use an H100? Do I use an A100? Do I use an L40S? There's too many choices, and so we try to take the guesswork out of it. Let's put a centralized pool of whatever device type we need and pick the right tool for the right job at the right time.
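
As a back-of-the-envelope illustration of the cost argument in that answer, the sketch below compares deploying 30 GPUs across four conventional 8-GPU servers against composing the same 30 GPUs to a single host. All prices are hypothetical placeholders, not Liqid or vendor pricing; the saving comes from buying fewer servers and less networking, and the same logic extends to power.

```python
# Back-of-the-envelope cost comparison. Every price below is a made-up
# placeholder for illustration, not real hardware or Liqid pricing.

gpu_cost = 25_000                 # per GPU, identical in both designs
server_cost = 15_000              # per host (chassis, CPUs, RAM)
network_cost_per_server = 3_000   # NICs and switch ports per host
gpus_needed = 30

# Conventional design: four 8-GPU servers to hold 30 GPUs.
conventional = 4 * (server_cost + network_cost_per_server) + gpus_needed * gpu_cost

# Composable design: one host plus an expansion fabric, GPUs pulled from a pool.
fabric_cost = 20_000              # hypothetical PCIe/CXL expansion and switching
composable = (server_cost + network_cost_per_server + fabric_cost
              + gpus_needed * gpu_cost)

print(f"conventional: ${conventional:,}")  # conventional: $822,000
print(f"composable:   ${composable:,}")    # composable:   $788,000
```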

Jason Lopez: The AI infrastructure landscape is evolving from training to inference, especially for companies that are not building foundational models.

Sumit Puri: The first chapter of this entire AI journey was very much focused on training and building these foundational models. And the way that we see that going forward, there's probably only going to be 10 companies on the planet that can afford to build these massively large 100,000, 200,000 GPU clusters to build the foundational model. The other 100,000 customers that are out there, they're going to take these models, open-source models like Llama, they're going to bring them on-prem, they're going to fine-tune that model, they're going to do RAG, they're going to do inference, and that part of the journey now is just starting. We think by the end of the decade, inference actually is going to be a larger piece of the overall AI pie than is something like the training portion of it.

Jason Lopez: Liqid makes AI inference more accessible and efficient for enterprises, especially with Kubernetes and model deployments.

Sumit Puri: So NVIDIA has a big push for something called NIMS, NVIDIA Inference Microservices, which is basically a container. And what they've said, it's very, very difficult for people to get all the layers of the stack perfectly right to deploy these models, and so we're going to containerize these models, and that is the way that enterprises are going to go off and deploy this. We have a plugin for our solution where what we do is we suck the container in, we probe the container, we figure out what type of GPU and the quantity of GPU in the backend, and we connect that GPU resource to the specific server in the Kubernetes cluster, then we deploy that container on that machine, so you have this perfect matching of hardware to container, and we automated the entire process. We're at a point now where we have one-click deployment of inference. You say, give me Llama7b go, and we will automate the entire process on the backend, and within two minutes, give you a model that you can speak with. When you're done with it, you hit the delete button, we'll rip those GPUs off, put them back into a centralized pool so that next container, that next model that you're looking to deploy has resources to use. We think that's how you get there, is you have to make it easy for enterprises to deploy these things. They don't have the large armies of data scientists to do this on their own, so the more that we can automate this, the easier it is for those companies to consume, the more that we can make it more efficient, and the way that we think about efficiency is tokens per dollar and tokens per watt, because that's what the enterprises are limited by, they're limited by power and money. If you're a hyperscaler and you have unlimited power, unlimited money, we can't help you. But if you're an enterprise, that is the metric you need to think about. Tokens per dollar, tokens per watt, automation, ease of use, those are the things that are needed to get AI to scale.
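
For readers who want to picture the workflow described in that answer, the pseudocode below sketches the probe-compose-deploy-release loop. The class, function and API names are hypothetical illustrations written for this article; they are not Liqid's actual plugin, CLI or Kubernetes interfaces.

```python
# Hypothetical sketch of the probe -> compose -> deploy -> release loop
# described above. Names and APIs are illustrative only; they are not
# Liqid's actual plugin, CLI or Kubernetes integration.

from dataclasses import dataclass, field

@dataclass
class GpuRequest:
    model: str   # e.g. "H100", "A100", "L40S"
    count: int

@dataclass
class GpuPool:
    """A toy stand-in for a centralized, composable GPU pool."""
    free_gpus: dict = field(default_factory=lambda: {"H100": 8, "L40S": 16})

    def allocate(self, node: str, request: GpuRequest) -> list[str]:
        assert self.free_gpus.get(request.model, 0) >= request.count
        self.free_gpus[request.model] -= request.count
        return [f"{request.model}-{i}@{node}" for i in range(request.count)]

    def release(self, gpus: list[str]) -> None:
        for gpu in gpus:
            model = gpu.split("-")[0]
            self.free_gpus[model] += 1

def probe_container(image: str) -> GpuRequest:
    """Inspect the container to learn what GPUs the model expects.
    In practice this would read image labels or a model manifest."""
    return GpuRequest(model="L40S", count=2)

def deploy(image: str, node: str) -> None:
    print(f"scheduling {image} on {node}")

def one_click_inference(pool: GpuPool, image: str, node: str) -> list[str]:
    request = probe_container(image)
    gpus = pool.allocate(node, request)   # hot-attach GPUs from the pool
    deploy(image, node)
    return gpus                           # kept so they can be released on teardown

pool = GpuPool()
gpus = one_click_inference(pool, image="llama-7b-inference:latest", node="k8s-node-1")
pool.release(gpus)                        # GPUs go back to the pool for the next model
```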

Jason Lopez: The company supports virtualization and dynamic GPU allocation in enterprise environments.

Sumit Puri: We are a platform for a variety of different applications. One of the applications we are very well suited for is virtualization. If we think about VMs for a second, it's very difficult for enterprises to predict which VM they're going to deploy at what time and what resources that VM might need. We deploy infrastructure for three to five years and making that long-term prediction is very difficult. We've partnered with Nutanix where we can put a centralized pool of GPUs inside of a Nutanix cluster and depending on the requirements of a specific VM, we can match the GPUs on the fly dynamically, hot-plugging GPUs into servers to meet the requirements of the VMs on Nutanix.

Ken Kaplan is Editor in Chief for The Forecast by Nutanix. Find him on X @kenekaplan and LinkedIn.

Jason Lopez contributed to this video. He is executive producer of Tech Barometer, the podcast outlet for The Forecast. He’s the founder of Connected Social Media. Previously, he was executive producer at PodTech and a reporter at NPR.
