They (EMC) Think You Are Stupid!

By Sudheesh Nair
The boy was just 14 years old when he found out that his father had killed and eaten his pet dog. That was the last straw for the poor Filipino boy; he ran away from home and became a street dweller. He slept in cardboard boxes and ate whatever he could find. When his close friend, who was a boxer, suddenly died, he was inspired to get into the ring.

The year was 1995, and the 16-year-old wanted to get into professional boxing because he realized that it was the only way out of the extreme poverty that had been his permanent companion from the day he was born. When an opportunity arose to join a professional boxing series, he was all set to jump in with both feet. He knew he couldn’t miss this opportunity. He did, however, have one big problem: he was terribly underweight. He was only 4’11” and 98 pounds, 7 pounds below the minimum requirement. He made a potentially life-or-death decision: to cheat his way into the series. He carried steel rings in his pocket during the weigh-in and qualified for the fight.

The boy fought 12 fights in his first year and won 11 of them!

Manny Pacquiao didn’t stop taking risks even after becoming one of the richest and best pound-for-pound boxers in the world. All through his illustrious career, he constantly searched for worthy opponents in his weight class. When there weren’t any left in his class, he gained weight and moved up a class, where he took on opponents with longer reach and heavier punches. He beat them with speed, agility, intelligence, and rigorous training.

Punching above your class

You may be wondering why I am writing a biography of Manny Pacquiao on a technology blog. To me, the story is less about boxing and more about human willpower and what really separates winners from losers. Punching above their weight class is what winners routinely do.

Last week we were invited by EMC to punch high above our weight class when we got hold of their secret FUD (fear, uncertainty, and doubt) sheet against Nutanix.

For the uninitiated, a FUD sheet is something vendors provide to their sales teams to compete against other vendors. A recipe for a good FUD sheet includes a good amount of truth: the more truth you add, the better the FUD sheet is. However, when better technology isn’t on your side, FUD sheets will be a little thinner on facts.

Just a few years ago, before the emergence of social media and YouTube, the power and reach of a behemoth like EMC would have propagated these types of unsubstantiated FUD sheets to all corners of the world and given an emerging technology company like Nutanix a lot of trouble. Unfortunately for the EMCs of the world, the power of instant communication through Twitter and blogs levels the playing field and allows us to respond immediately. So, with the help of a few people from my engineering team, I am going to take each of the salient points and respond to it with real facts. (The mighty EMC’s FUD sheet against the humble Nutanix is reproduced in full at the end of this blog entry.)

Let the facts speak for themselves

Says EMC:

Enterprise storage is first and foremost about protecting the data and ensuring that is available. That starts with an architecture with no single points of failure, but goes beyond that to include data integrity and all points in the system. Even with the best architecture, failures happen. Thus enterprise storage must be backed up with great service and support. Otherwise failures can cascade and bring down applications or lose data. Nutanix doesn’t make these a priority. Data integrity and availability processes are considered low-priority background tasks … These create times when the cluster may be at risk to data unavailability or data loss.

Yes, we all love our mothers and apple pie. At Nutanix, our engineers and architects come from Google, VMware, Data Domain, Nicira, IBM, Splunk and, yes, EMC. Rest assured that we place data availability and integrity above everything else. We are keenly aware that in a connected world, news of a bad customer support experience or lost data will be propagated across the domain within no time. When Google built the Google File System, they knew that components would fail. They built the system to recover from failures and provide reliable, uninterrupted, uncorrupted data access to their clients. The Nutanix system is built with the same top priorities in mind because we are running customers’ primary server applications and storage. Our customers will probably jump in to vouch for our customer support and reliability of our architecture.

Data Integrity: Data integrity in ensuring that data is not corrupted either in-transit or on drives. Nutanix checks for data integrity only for data at rest as a low priority background process. This means that sources of corruption that are outside of the drive would not be detected. In addition, drive corruption may not be detected until the Curator process gets around to it, allowing corrupt data to be accessed or replicated.

Someone in EMC’s billion-dollar competitive research team delivered the real goods here by inserting the word “Curator”. We do have a process called Curator; that much is true. (Dwayne wrote about this recently here in his blog.) What they missed is that we also have a process called Stargate. This highly intelligent process checksums the data before writing and verifies it upon reading.

We didn’t stop there; we added a feature to Stargate that continuously monitors data integrity by verifying the checksums of random extent groups. So even if a disk sector were to go bad after a successful I/O, Stargate’s scrubber operation would detect it and then create new replicas as necessary. So you see, we did think through these kinds of things before putting our product out there.
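For readers who want to see the general idea in code, here is a minimal Python sketch of checksum-on-write, verify-on-read, and a scrub pass. It is an illustration only, not Stargate's implementation: the `ChecksummedStore` class, its method names, the in-memory dictionary, and the choice of CRC32 are all assumptions made for the example.

```python
import zlib


class ChecksummedStore:
    """Toy block store that keeps a CRC32 next to every block (illustrative only)."""

    def __init__(self):
        self._blocks = {}  # block_id -> (data, checksum)

    def write(self, block_id, data):
        # Compute the checksum before the data is stored.
        self._blocks[block_id] = (bytes(data), zlib.crc32(data))

    def read(self, block_id):
        data, expected = self._blocks[block_id]
        # Verify on every read so corrupt data is never returned silently.
        if zlib.crc32(data) != expected:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data

    def scrub(self):
        # Background pass: return the ids of blocks whose data no longer matches.
        return [bid for bid, (data, expected) in self._blocks.items()
                if zlib.crc32(data) != expected]

    def corrupt(self, block_id):
        # Test helper: flip one bit to simulate a sector going bad after the write.
        data, checksum = self._blocks[block_id]
        damaged = bytes([data[0] ^ 0x01]) + data[1:]
        self._blocks[block_id] = (damaged, checksum)
```

In a real system the scrub result would trigger re-replication from a healthy copy; the sketch only reports which blocks need it.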

Non-Deterministic Fault Recovery: Nutanix uses a Map-Reduce based low-priority background process called Curator for data replication in case of node or drive failures. Map-Reduce is a batch process, not real-time, and the time to completion is gated by the slowest node participating in the map-reduce cluster for each stage of the process. Thus while efficient for large clusters, it can take a long time to complete and protection operation. This opens the system to possible additional failures in the cluster losing data.

When one of our architects, who also happens to be a lead brain behind Google’s second generation file system, GFS2, read this one, he was practically livid. He commented that EMC is making Curator sound like a non-reactive and dumb process. In actuality, Curator will start a partial scan proactively whenever a node/disk failure is detected. It is also 100% incorrect to imply that the slowest node is a limiting factor: part of the beauty of the MapReduce architecture is that it allows us to break the work into small units and distribute them across all nodes. If a node is slow, other nodes can compensate for it.
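The point about small work units can be shown with a toy model. The Python sketch below compares two ways of handing out equal-sized tasks: nodes pulling the next task whenever they are free, and a fixed equal split decided up front. It is a simplified simulation written for this post's argument, not Curator's scheduler; the function names and the node speeds are hypothetical.

```python
import heapq


def dynamic_makespan(num_tasks, task_cost, node_speeds):
    """Nodes pull the next small task whenever they become free."""
    # Heap of (time_when_free, node_index); the earliest free node is served first.
    free_at = [(0.0, i) for i in range(len(node_speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_tasks):
        t, i = heapq.heappop(free_at)
        done = t + task_cost / node_speeds[i]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


def static_makespan(num_tasks, task_cost, node_speeds):
    """Every node gets an equal share up front, so the slowest one gates the job."""
    share = num_tasks / len(node_speeds)
    return max(share * task_cost / s for s in node_speeds)


# Three healthy nodes and one running at a quarter of their speed.
speeds = [1.0, 1.0, 1.0, 0.25]
print(static_makespan(120, 1.0, speeds))   # equal split up front
print(dynamic_makespan(120, 1.0, speeds))  # small units pulled on demand
```

In this toy run the equal split takes 120 time units because the slow node holds everything up, while the pull-based version finishes in 37: the healthy nodes simply take more of the small units.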

In fact, the line about MapReduce being gated by the slowest node is so fundamentally false that we thought about offering to sponsor a MapReduce 101 class for EMC at no cost. Then we thought better of it because they may have confused Nutanix with their own Isilon clustering architecture. Unlike an Isilon cluster, a MapReduce cluster is absolutely decoupled from individual node performance because it’s an asymmetric clustering model.

(Having said that, we are still open to providing the aforementioned MapReduce 101 class to EMC’s competitive research team at a substantially discounted price. Please reach out to us at sales@nutanix.com.)

If EMC will take us on, we are willing to do a challenge on the failure rebuild performance of our clustering technology against anything they have in their portfolio of hundreds of products.

Data at Rest Encryption: Nutanix does not support encryption for data at rest, an important feature for many government, financial, and healthcare accounts.

We don’t, and we won’t until we’ve lost enough deals. That is called iterative product development, based on market feedback. And once we decide to implement encryption, we’ll throw a challenge to EMC to beat us on encryption speeds in a single system. The number of cores we can throw at the problem can’t be beaten by any single system out there; such is our belief in the system’s scalability.

Disaster Recovery: Nutanix suggests using VMware’s backup facility, VADP, as a disaster recovery mechanism. This would give a RTO of days from any site wide disaster such as a regional power outage. In addition, backup processes using VADP are not continuous and can have scalability problems, resulting in a poor Recovery Point Objective (RPO). Nutanix does not support VMware Site Recovery Manager (SRM). Taken together, Nutanix does not have a viable disaster recovery strategy for most customers’ requirements.

I’ve a confident glow on my face right now. They ain’t seen nothing yet! My lips are sealed, and I’ll let Marketing have its prerogative of talking about stuff that is about to be unveiled. ‘Nuff said.

Support: Nutanix support has 3 levels with Platinum being the premium 24×7. However it is not 365 as they exclude holidays from the support period. Platinum parts replacement is next business day, so there may be a period of unprotected storage even with their top support level.

I wrote about how Nutanix views support in a previous blog entry. We take great pride in our customer support. What EMC wrote about our customer support was true. During the build-up of our support infrastructure, we didn’t want to inflate our capabilities. Had they gone to our website and refreshed our support page recently, they would have seen that we have already rolled out our 24x7x365 support and same-day parts replacement capabilities.

As an emerging technology company, we aren’t going to win the we-have-a-bigger-support-team-than-you game. But, just like everything else in the enterprise data center, support is also evolving. Nutanix is providing personalized support combined with machine-intelligence-based analytics for our customers. If EMC’s billion-dollar competitive research groups want proof, all they have to do is look at what our customers are saying about us in our Twitter feed. Just to highlight a few here:

Robert Sciaraffo, Senior Systems Engineer at the Riverside Company (one of our customers), says, “Everyone has been very responsive. The technical support experience I get at Nutanix is more personalized…” Another customer, William Tomlinson, IT Director at Leitner, Williams, Dooley & Napolitan, PLLC, is extremely happy with our support and comments that it is outstanding and the best he has ever dealt with. Mark Winterton, Business Development Manager at United Data Technologies (a reseller), echoes William’s sentiments and says, “Support before the sales, during the sales and after the sales has been magnificent.”

Nutanix also lacks support for data reduction and their tiering software and application support are limited. This raises the costs of deploying a Nutanix based solution.

Data Reduction: Nutanix does not support either compression or deduplication for data. This means that Nutanix requires substantially more storage for their VDI solutions or for the local snapshot based backups than competing solutions, increasing cost and power requirements.

Sermons about data reduction coming from a company that still creates copy-on-write snapshots should be funny. Nutanix’s snapshot technology uses intelligent metadata pointers and ensures that data is not unnecessarily duplicated. With unlimited snapshots, cloning, and thin provisioning, we already provide a few key data-reduction features. As for compression and dedup, you ain’t seen nothing yet. I’ve a confident look yet again. (I shouldn’t be so confident, but our engineering is that good and I am very proud of them.)
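To make the idea of pointer-based snapshots concrete, here is a small Python sketch. It assumes one common approach, in which new writes go to fresh blocks and a snapshot is just a frozen copy of the pointer map; that design, the `Volume` class, and its field names are assumptions for illustration and not a description of the Nutanix internals.

```python
class Volume:
    """Toy volume whose snapshots share blocks through metadata pointers."""

    def __init__(self):
        self.store = {}      # block_id -> bytes, shared by the volume and its snapshots
        self.live = {}       # logical offset -> block_id for the live volume
        self.snapshots = {}  # snapshot name -> frozen copy of the pointer map
        self._next_id = 0

    def write(self, offset, data):
        # New data always lands in a fresh block; old blocks are never overwritten,
        # so snapshots that point at them stay valid without any data copy.
        self.store[self._next_id] = data
        self.live[offset] = self._next_id
        self._next_id += 1

    def snapshot(self, name):
        # Only the pointer map is copied, not the data blocks.
        self.snapshots[name] = dict(self.live)

    def read(self, offset, snapshot=None):
        pointers = self.live if snapshot is None else self.snapshots[snapshot]
        return self.store[pointers[offset]]
```

Taking a snapshot here costs one small dictionary copy regardless of how much data the volume holds, and only blocks written afterwards consume new space.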

Slow Tiering: The Nutanix Curator process is also responsible for promotion/demotion of hot data. This process runs at indeterminate times and may react poorly to changing workload patterns, for example changing from heavy OLTP to background batch or backup processes. This would be inefficient use of expensive SSD resources.

We had a laugh about this at Nutanix HQ. To correct EMC’s misapprehension, Curator does not promote data; Stargate does this, which means it’s reactive (almost immediate). Regarding “background batch processes”, our system has enough intelligence to sense whether an I/O stream is sequential or batch and handle it without hitting the PCIe Flash tier at all. When data is promoted, it is smart enough to avoid unnecessarily flooding PCIe Flash with infrequently accessed data. Oh, and by the way, Nutanix tiering happens at a finer grain, allowing finer-grain decision making for mixed workloads, which are very common in virtualization workloads.
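As a rough illustration of what sensing a sequential stream can look like, the Python sketch below routes requests to flash until it has seen a run of contiguous offsets, then sends the rest of that run straight to disk. The `TierRouter` class, the tier labels, and the threshold of four requests are hypothetical choices for this example, not how Stargate actually makes the decision.

```python
class TierRouter:
    """Toy router: random I/O goes to flash, long sequential runs bypass it."""

    def __init__(self, sequential_threshold=4):
        self.sequential_threshold = sequential_threshold  # hypothetical tuning knob
        self._next_expected = None  # offset right after the previous request
        self._run_length = 0        # contiguous requests seen so far

    def route(self, offset, length):
        # A request that starts where the last one ended extends the current run.
        if offset == self._next_expected:
            self._run_length += 1
        else:
            self._run_length = 1
        self._next_expected = offset + length
        # Long runs look like batch or backup streams, so keep them off flash.
        if self._run_length >= self.sequential_threshold:
            return "hdd"
        return "flash"
```

A backup job reading 4 KB blocks back to back would be diverted after its fourth request, while a random read arriving afterwards resets the run and goes to flash again.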

No Microsoft SMB Support: Nutanix does not support SMB file shares, a common protocol in many IT environments and solutions, including common workgroup shares or VDI installations.

This is true. Nutanix currently supports NFS and iSCSI for the VMs that we run. A few of our customers who want to access storage through SMB run CIFS shares through one of the host VMs. We can’t be all things to all people in just 3 years.

We at Nutanix don’t believe that we have built a product that does everything for everyone. It wasn’t our intention to do that anyway. What we wanted was to build a product that brings Google-like scale out, reliability, manageability, performance, and availability for enterprise data centers. We focused on three use cases to begin with:

  1. End-user computing
  2. Private Cloud for core and edge, including for DR sites
  3. Big-data analytics

The thousands of servers deployed within one year on the market prove that our customers get it. That’s where our focus is going to be.

Times have changed; even small companies now have a voice

Whether you agree with the politics of William Jefferson Clinton or not, you have to acknowledge that the man is a master of boiling complex subject matter down to a few choice words. During this year’s Democratic National Convention, he used a very common but powerful phrase when addressing some of the GOP’s talking points. He said, “They think you are stupid.” He was referring to the fact that jaded political consultants overestimate the number of uninformed voters and customize the message to appeal to the lowest common denominator.

When EMC throws around terms like non-deterministic fault recovery, checksums, and Curator, they are probably betting that all of you out there are uninformed IT buyers (which to them means stupid) and that Nutanix as a small company won’t have the voice or the medium to set the record straight. That doesn’t work for politics anymore; it certainly is not going to work for today’s highly connected, high-information IT decision makers. We are going to set the record straight and do our best to punch well above our weight class.

Please follow me @sudheenair

———–

Here is a reproduction of the EMC FUD sheet in its entirety, courtesy of the channel:

“Enterprise storage is first and foremost about protecting the data and ensuring that is available. That starts with an architecture with no single points of failure, but goes beyond that to include data integrity and all points in the system. Even with the best architecture, failures happen. Thus enterprise storage must be backed up with great service and support. Otherwise failures can cascade and bring down applications or lose data.

Nutanix doesn’t make these a priority. Data integrity and availability processes are considered low-priority background tasks, and even their Platinum support doesn’t include times inconvenient to Nutanix support workers, such as holidays. These create times when the cluster may be at risk to data unavailability or data loss.

• Data Integrity: Data integrity in ensuring that data is not corrupted either in-transit or on drives. Nutanix checks for data integrity only for data at rest as a low priority background process. This means that sources of corruption that are outside of the drive would not be detected. In addition, drive corruption may not be detected until the Curator process gets around to it, allowing corrupt data to be accessed or replicated.

• Non-Deterministic Fault Recovery: Nutanix uses a Map-Reduce based low-priority background process called Curator for data replication in case of node or drive failures. Map-Reduce is a batch process, not real-time, and the time to completion is gated by the slowest node participating in the map-reduce cluster for each stage of the process. Thus while efficient for large clusters, it can take a long time to complete and protection operation. This opens the system to possible additional failures in the cluster losing data.

• Data at Rest Encryption: Nutanix does not support encryption for data at rest, an important feature for many government, financial, and healthcare accounts.

• Disaster Recovery: Nutanix suggests using VMware’s backup facility, VADP, as a disaster recovery mechanism. This would give a RTO of days from any site wide disaster such as a regional power outage. In addition, backup processes using VADP are not continuous and can have scalability problems, resulting in a poor Recovery Point Objective (RPO). Nutanix does not support VMware Site Recovery Manager (SRM). Taken together, Nutanix does not have a viable disaster recovery strategy for most customers’ requirements.

• Support: Nutanix support has 3 levels with Platinum being the premium 24×7. However it is not 365 as they exclude holidays from the support period. Platinum parts replacement is next business day, so there may be a period of unprotected storage even with their top support level.

Nutanix also lacks support for data reduction and their tiering software and application support are limited. This raises the costs of deploying a Nutanix based solution.

• Data Reduction: Nutanix does not support either compression or deduplication for data. This means that Nutanix requires substantially more storage for their VDI solutions or for the local snapshot based backups than competing solutions, increasing cost and power requirements.

• Slow Tiering: The Nutanix Curator process is also responsible for promotion/demotion of hot data. This process runs at indeterminate times and may react poorly to changing workload patterns, for example changing from heavy OLTP to background batch or backup processes. This would be inefficient use of expensive SSD resources.

• No Microsoft SMB Support: Nutanix does not support SMB file shares, a common protocol in many IT environments and solutions, including common workgroup shares or VDI installations.”