Raising the Bar and Pushing the Envelope on Performance

March 3, 2016

The release of Acropolis 4.6 illustrates how the constant drive for innovation at Nutanix is paying off in performance. The formal announcement, included here, details the new products and major features. Acropolis 4.6 delivers a step-function increase in system performance: compared to the previous release, it boasts up to a 4X improvement in small random IO reads and writes, the pattern typical of tier-1 transactional databases running on SQL Server and Oracle. Software optimizations alone account for the improvement, so customers can alleviate performance bottlenecks with a simple, non-disruptive, 1-click software upgrade: hardware stays unchanged and applications take no downtime. The benefits of improved software arrive with no effort from IT, not unlike the public cloud, where refreshes to backend IaaS infrastructure go unnoticed by end-user developers and IT teams.

Figure 1 below captures the performance increase since the last release:

[Figure 1: Performance increase from the previous release to Acropolis 4.6]

Reversing the “inverse relationship” between number of features and system performance

Rapidly introducing products and features adds a considerable amount of feature-oriented code, and the typical consequence is a drag on system performance. Exactly the opposite is true at Nutanix: system performance has continued to improve in the midst of rapid development of new products and major features. Pushing the envelope requires Nutanix to think years ahead of the industry; an innovative culture, coupled with heavy investment in performance engineering, is already paying large dividends.

Compared to Industry Competitors

Our second-generation All Flash platform, the NX-9460-G4 with 4 server nodes and 24 attached SSDs, can drive enterprise-grade SSDs close to their specified limits.

Figure 2 below compares the NX-9460 to similar configurations from leading All Flash (AF) vendors on two key metrics:

  • $ per IOP, capturing the price-performance of the system (green bars)
  • IOPs per RU, capturing performance density, or performance efficiency (blue line)

[Figure 2: $ per IOP (green bars) and IOPs per RU (blue line), Nutanix vs. leading All Flash vendors]

Comparison summary:

  • Nutanix is up to ~ 3X better, or 65% less, on price-performance ($ per IOP)
  • Nutanix is up to ~ 6X better on performance efficiency (IOPs per RU)

Comparison is apples-to-apples:

  • Normalized on raw flash:
    • Configurations all use the same amount of raw flash
    • Compares street prices, extracted from independent sources
  • Normalized on r/w mix and IO size:
    • Nutanix uses a 70/30 read/write mix and an IO size representative of typical databases
    • Compares to published performance specifications; we extrapolate when necessary to normalize
  • Comparison remains true across appliance types: Entry, Mid, or Large systems
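
To make the two metrics concrete, the short Python sketch below computes $ per IOP and IOPs per RU for two systems. Every number in it is a hypothetical placeholder, not a figure behind Figure 2.

```python
# Illustrative computation of the two comparison metrics.
# All input numbers below are hypothetical placeholders, not measured data.

def dollars_per_iop(street_price_usd: float, iops: float) -> float:
    """Price-performance: lower is better (green bars in Figure 2)."""
    return street_price_usd / iops

def iops_per_ru(iops: float, rack_units: int) -> float:
    """Performance density: higher is better (blue line in Figure 2)."""
    return iops / rack_units

# Hypothetical hyperconverged appliance vs. hypothetical array + servers
systems = {
    "HCI appliance":     {"price": 300_000, "iops": 300_000, "ru": 2},
    "AFA + servers/net": {"price": 500_000, "iops": 250_000, "ru": 9},
}

for name, s in systems.items():
    print(f"{name:18s} ${dollars_per_iop(s['price'], s['iops']):.2f}/IOP, "
          f"{iops_per_ru(s['iops'], s['ru']):>8,.0f} IOPs/RU")
```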

Other points of note regarding competitors:

RACKSPACE
  • Leading All Flash competitors: require separate servers and a storage (FCP/Ethernet) network taking up 4U or more of rackspace, in addition to the all flash array (4U-6U), bloating the total rackspace requirement to between 8U and 10U (see Figure 2 above)
  • Nutanix: the NX-9460-G4 sits in 2U of compact rackspace, which also includes 4 servers to run your applications

CONTROLLER BOTTLENECKS
  • Leading All Flash competitors: with dual-head controllers, sometimes operating in Active-Passive mode, the controller can become the bottleneck as storage is added to the system
  • Nutanix: every node in a Nutanix cluster has a virtual controller; adding nodes to the cluster adds storage and controllers in tandem

NETWORK BOTTLENECKS
  • Leading All Flash competitors: prone to bottlenecks as servers access fast storage over a network
  • Nutanix: low probability of bottlenecks, since storage accesses are predominantly local. See why data locality is important

NETWORK LATENCY
  • Leading All Flash competitors: network latency, however small, adds latency to storage accesses, and the impact grows with next-generation storage technologies. See why latency is important
  • Nutanix: storage accesses are predominantly local in a Nutanix cluster, minimizing the average latency seen by the application

PERFORMANCE WITH FAILURES
  • Leading All Flash competitors: with dual-head HA models, expect a 50% degradation in performance when a controller fails; alternatively, if the dual heads operate in Active-Passive mode, one controller lies unutilized in steady state
  • Nutanix: a node failure even in a small 4-6 node cluster degrades performance by only 17-25%, and the larger the cluster, the smaller the degradation (see the sketch below)

COMPONENTS TO MANAGE
  • Leading All Flash competitors: servers, network, and storage are separate, which means more things to manage and more points of failure
  • Nutanix: hyperconvergence means you are managing just server nodes

DEPENDENCIES TO TRACK
  • Leading All Flash competitors: incorrectly upgrading one component can adversely affect others
  • Nutanix: upgrade non-disruptively in 1-click; all dependencies are automatically accounted for

SCALING
  • Leading All Flash competitors: controller heads can become a bottleneck, entailing an upgrade to a bigger head even though the current head has not reached end of life
  • Nutanix: scale out the cluster with a quick and easy node add operation
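
To illustrate the failure-degradation arithmetic referenced above, here is a small Python sketch. It assumes aggregate performance is proportional to the number of active controllers, a simplification of real-world behavior:

```python
# Simplified failure model: assume aggregate performance scales with the
# number of active controllers, so losing one of n leaves (n - 1) / n.
# Real systems will deviate from this; the model is illustrative only.

def degradation_pct(controllers: int, failed: int = 1) -> float:
    """Percent of performance lost when `failed` controllers go down."""
    return 100.0 * failed / controllers

print(f"dual-head array:  {degradation_pct(2):.0f}% loss")   # 50%
for nodes in (4, 6, 16):
    # In a hyperconverged cluster every node carries a controller.
    print(f"{nodes:2d}-node cluster:  {degradation_pct(nodes):.0f}% loss")
```

Under this model a dual-head array loses 50% when a head fails, while 4- and 6-node clusters lose 25% and roughly 17%, matching the range cited above.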


Compared to AWS

As shown in Figure 3 below, AWS customers pay 6.5 cents per IOP per month for EBS Provisioned IOPS (SSD) volumes. Over the typical hardware life of storage infrastructure (48 months), this works out to $3.12 per IOP.

[Figure 3: Price per IOP over 48 months, Nutanix vs. AWS EBS Provisioned IOPS]

Nutanix customers pay as low as 35 cents per IOP over 48 months. Adding the cost of operating the Nutanix cluster (power, rackspace, cooling, people, technical support) at ~20% per year brings the total to 63 cents per IOP over 48 months.

Therefore, Nutanix is up to 5X better, or 80% less, than AWS on price-performance.
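
The arithmetic behind these figures is simple enough to check; here is a minimal Python sketch using the per-IOP numbers quoted above:

```python
# Check the price-performance arithmetic over a 48-month hardware life,
# using the per-IOP figures quoted in the text above.

MONTHS = 48

# AWS EBS Provisioned IOPS (SSD): $0.065 per provisioned IOP per month
aws = 0.065 * MONTHS                        # $3.12 per IOP

# Nutanix: $0.35 per IOP over 48 months, plus ~20% per year to operate
nutanix = 0.35 * (1 + 0.20 * MONTHS / 12)   # $0.63 per IOP

print(f"AWS:     ${aws:.2f} per IOP over {MONTHS} months")
print(f"Nutanix: ${nutanix:.2f} per IOP over {MONTHS} months")
print(f"Nutanix is {aws / nutanix:.1f}X better "
      f"({100 * (1 - nutanix / aws):.0f}% less)")
```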

Improvements in Acropolis 4.6

Several improvements and optimizations touching the core of the distributed storage fabric were responsible for the step-function jump in performance. Small random read and write IO saw the largest improvements; transactional tier-1 database workloads therefore have the most to gain.

Top 3 improvements include:

  • CPU Efficiency: CPU cycles required per IO operation have been reduced. Keeping CPU unchanged, we can now process more random IO
  • Write buffer (opLog) improvements: Write buffer memory has been optimized and recovery accelerated, ensuring that the write buffer can better adapt to continuous write IO (sustained write IO)
  • Log structured design for single vdisk writes: Optimal batches of write IO are maintained, allowing a single virtual disk to perform within 80% of multiple-virtual-disk write performance when normalized for end-user outstanding IO (see the sketch below)
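
As a rough illustration of the log-structured idea in the last item, the toy Python sketch below coalesces small random writes into sequential log appends, with an index tracking the latest copy of each block. It sketches the generic technique only, not the actual Acropolis implementation; the class and method names are hypothetical.

```python
# Toy illustration of a log-structured write path: small random writes
# are coalesced into batches and appended sequentially to a log, and an
# index maps each logical offset to its latest position in the log.
# Generic technique only; names are hypothetical, not the Acropolis code.

class LogStructuredVDisk:
    def __init__(self, batch_size=8):
        self.log = []        # append-only log of (offset, data) records
        self.index = {}      # logical offset -> position of latest write
        self.batch = []      # writes buffered until the next flush
        self.batch_size = batch_size

    def write(self, offset, data):
        self.batch.append((offset, data))
        if len(self.batch) >= self.batch_size:
            self.flush()

    def flush(self):
        # One sequential append per batch instead of many random writes.
        for offset, data in self.batch:
            self.index[offset] = len(self.log)
            self.log.append((offset, data))
        self.batch.clear()

    def read(self, offset):
        pos = self.index.get(offset)
        return self.log[pos][1] if pos is not None else None

vdisk = LogStructuredVDisk(batch_size=4)
for off in (7, 3, 11, 3, 42, 5, 9, 1):   # scattered "random" offsets
    vdisk.write(off, f"block-{off}".encode())
vdisk.flush()                             # drain any partial batch
print(vdisk.read(3))                      # b'block-3' (latest version wins)
```

Batching turns many scattered small writes into a few large sequential appends, which is what lets a single vdisk's write path approach the throughput of writes spread across many vdisks.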

Future proofing for faster, higher throughput storage

Continuously improving performance is a journey, not a destination. Keeping ahead of the curve to take advantage of next-generation memory-class solid state technology (NVMe, 3D XPoint, etc.) keeps us up at night. Such storage operates at near bus speeds, offering lower latencies and higher throughput than enterprise SSDs. Its advantages are best exploited by an architecture where compute and storage are collocated, allowing compute to access storage at near bus speeds; in other words, hyperconvergence.

Our architectural advantage, combined with a focus on continuous performance improvement, positions us very well for the coming world of memory-class storage. Indeed, it's a simple matter of continuing to do what we are doing.

Upgrade and see for yourself

The proof is in the pudding! We encourage customers experiencing performance bottlenecks to upgrade their clusters to Acropolis 4.6 and see these improvements for themselves. Please share your feedback, thoughts, and questions on the Nutanix NEXT community.

Learn More