Shadow AI: The Data Privacy Risk Lurking in Your Enterprise

By Luke Congdon, Senior Director, Product Management

There’s a big challenge that all enterprises need to overcome to take advantage of generative AI (GenAI). Today’s large language models (LLMs) know everything about dogs, cats, and cities of the world, but they know nothing about what's going on inside your company.

This poses a potential risk when using public LLM services like ChatGPT. Doing so inherently exposes private data in order to get results that are relevant to your company – a potentially dangerous proposition.

Providers like OpenAI may offer various options to protect data that is exposed – including VPNs, virtual private clouds, and private APIs – but your data is still in the cloud and out of your control. Can you trust this process?

Some enterprise legal departments view this approach as untrustworthy and restrict the use of cloud-based LLM inference services due to data risk. But people don't always seek guidance from the legal department or consider the risk.

Well-meaning employees are probably already copying your company’s data into a wide variety of cloud AI services. This is Shadow AI, reminiscent of the Shadow IT that was rampant 15 years ago at the start of the cloud era.

To combat Shadow AI, you have to find another way to deliver critical GenAI capabilities to your workforce while minimizing the risk of data privacy breaches. Broadly, that means deploying your own LLM either on-premises or in the cloud. Let’s look at the pros and cons of each approach.

How Are Enterprises Approaching AI?

Over the last year and a half, I’ve talked to many enterprise customers about their AI plans. Most customers are starting by considering what they want to accomplish. Do you want a document summarization tool? Do you want a chat tool or a customer service tool?

Nobody buys a load of bricks and then decides if they want to build a house or a shed. Decide what to build, then buy the bricks. The first infrastructure decision to make after defining your GenAI goal is whether to build it in the cloud or on-premises.

Deploying Your LLM On-Premises

My experience has been that most customers have an idea of where they want to implement GenAI, and most enterprises I have encountered want to be on-premises. Although building your AI solution on-premises might seem daunting, doing it yourself doesn't mean you’re starting from zero.

LLMs are like high-performance engines – they're very powerful and raring to go. But unlike high-performance engines, many models are free to use under permissive licenses such as Apache 2.0 or Meta’s Llama license. With a pre-trained LLM, much of the work has been done for you, but you still need to implement your application on top.

You might need a few more skills in-house versus deploying a turnkey cloud solution, but many customers who are concerned about security are willing to accept a bit more upfront investment to build their AI chatbot or customer service tool on-premises, either by themselves or with the help of a partner.

Deploying Your LLM in the Cloud

Deploying the LLM of your choice on cloud infrastructure can be a viable option if you have cloud infrastructure experience and are confident about your cloud security. Just be mindful of the ever-running cloud cost meter. Over time, on-demand infrastructure costs may be unexpectedly high versus on-premises.
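To make the "cloud cost meter" point concrete, here is a back-of-the-envelope break-even sketch. Every dollar figure is a hypothetical placeholder for illustration only – substitute quotes from your own vendors before drawing any conclusions:

```python
# Rough break-even sketch: on-demand cloud GPU spend vs. an on-premises purchase.
# All prices below are hypothetical placeholders, not real vendor quotes.

def breakeven_months(on_prem_capex, on_prem_monthly_opex, cloud_monthly_cost):
    """Months after which cumulative cloud spend exceeds total on-prem cost.

    Returns None if cloud is cheaper per month under these assumptions,
    in which case there is no break-even point.
    """
    monthly_savings = cloud_monthly_cost - on_prem_monthly_opex
    if monthly_savings <= 0:
        return None
    return on_prem_capex / monthly_savings

# Example: a $120,000 GPU server purchase with $2,000/month power and support,
# versus $8,000/month for comparable on-demand cloud instances.
months = breakeven_months(120_000, 2_000, 8_000)
print(f"Break-even after ~{months:.0f} months")  # ~20 months
```

After the break-even point, every additional month of on-demand usage costs more than owning the hardware would have – the longer the workload runs, the more the cloud meter matters.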

Whether you’re comfortable with the public cloud is up to you. Many people start in the cloud, perhaps with a proof-of-concept test, and then migrate on-premises for production. On-premises might be your best bet if you want full control over your critical data.

Protecting Your Enterprise Data

If the LLM is a high-performance engine, your data is the fuel that makes it run. Your data needs to be protected using all the resources at your disposal, including your security team, security perimeters and firewalls.

To illustrate, you wouldn't go to the middle of Times Square in New York City, leave your wallet on the ground and walk away. Clearly, there’s a high probability that your credit cards and other personal data might be stolen and misused. You should think about your company data in the same way.

Businesses spend millions of dollars on security, intrusion detection solutions, and backup and recovery to safeguard company data. Despite this, some organizations are willing to risk making their data accessible to public LLMs.

Remember Data Privacy and Compliance Capabilities

Because all the GenAI and related software stacks are relatively new, it’s especially important to think about their inherent features. There are several vital capabilities that should be part of the enterprise-grade AI solution you deploy, including:

  • User management. The processes and tools that control who has access to your GenAI application, including user authentication.
  • Role-based access control (RBAC). Controlled access to your GenAI application or system based on roles. RBAC allows users to only access the resources necessary for their jobs.
  • Auditability. The ability to track important details such as who accessed a GenAI system and what actions were performed – for example, configuration changes, new user additions, and password changes.
  • Observability. The ability to monitor, measure, and understand the internal state and behavior of your GenAI systems. This can help you understand the health of your API endpoints and infrastructure.
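The RBAC and auditability capabilities above can be sketched in a few lines. This is a minimal illustration only – the role names, permissions, and function names are assumptions invented for this example, not any real product's API:

```python
# Minimal sketch of RBAC plus audit logging for a GenAI service.
# Roles, permissions, and function names are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Each role maps to the minimum set of actions needed for the job.
ROLE_PERMISSIONS = {
    "admin": {"chat", "configure", "manage_users"},
    "analyst": {"chat"},
}

@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, user, action, allowed):
        # Record every attempt, allowed or not, with a UTC timestamp.
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "allowed": allowed,
        })

def check_access(user, role, action, log):
    """Return True if the role grants the action; always audit the attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    log.record(user, action, allowed)
    return allowed

log = AuditLog()
check_access("alice", "admin", "configure", log)   # allowed
check_access("bob", "analyst", "configure", log)   # denied, but still audited
```

The key design point is that the audit entry is written whether or not access is granted – denied attempts are often the most important events to be able to review later.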

Without these capabilities, enterprises should be hesitant to deploy AI. On your own, it can take a long time to implement these capabilities and bring them to an acceptable level of maturity. In some cases, finding the right partner and starting point can help you reach your goals with minimal effort while optimizing data privacy and security.

Enterprise AI at Nutanix

Nutanix has been in the infrastructure business for 15 years, and we’ve spent the last year and a half building a new software infrastructure tool for GenAI, designed to deliver the capabilities enterprises need to speed up AI deployments while reducing risk.

The Nutanix Enterprise AI solution provides an enterprise-class software infrastructure platform with shared agentic AI services and all the enterprise capabilities you need to embrace and power GenAI applications with confidence, while giving you choice and flexibility in the areas that matter:

  • Location. Deploy in the datacenter, public cloud or at the edge. 
  • Hardware. Choose the hardware you prefer – including a range of GPU-enabled solutions for inference tasks – from vendors like Dell, Cisco and HPE.
  • Scale. Start small if you need to and scale quickly.
  • AI models. Deploy validated models from partners like Hugging Face and NVIDIA.

Learn more by downloading the free guide on How to Deploy Enterprise AI today.

©2025 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s).