How Nutanix Is Able to Run on So Many Platforms, Both On-Premises and in the Public Cloud
Over the past five years, Nutanix has gone from offering a single, integrated hardware and software appliance to providing flexible subscription software qualified to run on many hardware platforms from a variety of server vendors, giving users dramatically more hardware choice. Users no longer have to choose hardware a priori; they can purchase Nutanix software and run it for the duration of the subscription on any of the qualified systems they have available. With an extensive range of qualified systems, this effectively removes server and component selection as a major impediment. This is another example of Nutanix pursuing invisible infrastructure, all the way to the lowest levels of the stack: the individual hardware components (such as SSDs, NICs, and HBAs) used in the hyper-converged infrastructure (HCI) platforms running our software, whether on-premises or in the cloud.
Underpinning this transformation, growth, and flexibility is the engineering work of Nutanix's platform group. About two years ago, we began the FleX project to coordinate our efforts related to hardware component and platform support in order to take them to the next level.
Goals and Approaches
The overarching goal of FleX is to allow Nutanix's software to run on as wide a range of hardware platforms, with as wide a range of hardware components, as possible.
In pursuit of that goal, FleX focuses on three areas:
- Establishing a unified, streamlined set of tools and processes for enabling, integrating, and qualifying all hardware with Nutanix software.
- Decoupling Nutanix software from the underlying hardware by developing a complete hardware abstraction layer (HAL).
- Allowing hardware vendors and other partners to bring up and qualify new hardware components and platforms on their own using the FleX tools.
Challenges and Solutions
Hardware qualification, support, and compatibility lists are not new. So what makes Nutanix's approach to them worth discussing? There are three areas worth considering:
Nutanix began by building an integrated hardware and software appliance, and created the HCI category in the process. Unlike others who retrofitted HCI capabilities onto their existing stack, we were born in HCI. As a result, we have native, world-class expertise building, qualifying, and supporting HCI systems.
Tightly integrated and qualified systems, like the Apple iPhone and the Nutanix NX appliance, offer superior quality, performance, and user experience. These benefits come at the cost of hardware flexibility and choice for the user. Open ecosystems, like Android, provide flexibility and choice, but can't directly ensure quality. As Nutanix began allowing integrations between our software and third-party vendors' hardware, the challenge arose: how do we maintain quality, performance, serviceability, and all of the other properties that made the experience of using the integrated NX appliance exceptional for users?
The answer resides in FleX, through which we are making available to hardware vendors and other partners the tools, tests, and processes used within Nutanix for integrating and qualifying Nutanix’s software on hardware platforms. In fact, FleX drove the standardization and development of most of those tools, tests, and processes, and it is now used for all platforms on which the Nutanix software runs (including the Nutanix NX appliances). Coupled with robust checking of the integration and qualification results by our subject matter experts (SMEs), we are able to ensure the same level of quality from Android-style integrations as we do from our in-house NX appliance.
Hardware / Software Integrations
The next interesting area arises from the depth of Nutanix hardware and software integrations, including bare metal installation through Foundation; BIOS, BMC, and component firmware upgrade support through Lifecycle Manager (LCM); hardware health checks through the Nutanix Cluster Check (NCC); and passthrough of data disks to the Controller VM (CVM). Each of these benefits the user, but increases both what our software must know about the underlying hardware and how much it must interact with it. As a result, developing and testing these hardware and software integrations is nontrivial and must be done for each platform.
We have addressed this in FleX with the development of a full hardware abstraction layer, through which all interactions with the hardware must pass (see below for more information on the HAL). Further, we are making available software development kits (SDKs) for each area of the HAL (for example, Foundation, NCC, and LCM). These SDKs allow any user (internal or external to Nutanix) to develop the integrations necessary to enable a new platform or hardware component.
Quality of the Tools
The final area that makes FleX different is the quality and richness of the toolset we provide. As mentioned above, FleX is core to the development and qualification of all platforms on which Nutanix software runs, which are at the foundation of Nutanix's business. That level of importance requires a significant level of quality and robustness. Therefore, the FleX tools are developed by the core platform and infrastructure engineering teams as first-class projects, with the same quality of engineering, Nutanix-styled UIs, depth of QA, and support as other Nutanix products.
FleX provides a holistic, integrated approach to enablement and qualification. Rather than requiring the user to navigate many disparate programs and use many different tools, we offer a single program with just a few tools that cover the complete lifecycle, from hardware proof-of-concept to final platform qualification and release to the field.
All user-visible FleX capabilities are encapsulated within and available through two primary tools: the Elevate Portal and Metis.
The Elevate Portal is the gateway to FleX for all users. It contains:
- Downloads of all FleX tools.
- Documentation of the engineering and qualification practices.
- A ticketing system for requesting support or submitting qualification results for review.
Currently, only partners enrolled through the Elevate Alliances program and approved by Nutanix to perform qualifications have access to the Elevate Portal.
Metis is a rich, UI-driven tool that is downloaded and run in the user's environment and used for on-site engineering and qualification. Because most platforms are developed and tested in labs that are often isolated from the internet, Metis can run as an offline tool. Metis offers numerous features, including:
- A single pane of glass for all engineering and qualification activities to be performed on the hardware.
- Detailed hardware enumeration and checking of the hardware against Nutanix’s hardware-firmware compatibility list (HFCL).
- Automated tests, manual test descriptions, and test suites for qualifying hardware components and platforms.
- One-click uploads of test results to the Elevate Portal for review.
- Engineering tools for deeply integrating Nutanix’s software on new hardware platforms.
- A rich API, through which the usage of Metis can be automated and integrated into each user’s environment, test management system, etc.
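To make the enumeration-and-check feature more concrete, here is a minimal sketch of how an enumerated inventory might be compared against a compatibility list. It is an illustration only, not Metis code: the `Component` type, the layout of the `HFCL` dictionary, the status strings, and all model names and firmware versions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One enumerated hardware component (hypothetical shape)."""
    kind: str       # for example "SSD", "NIC", or "HBA"
    model: str
    firmware: str


# Hypothetical compatibility list: (kind, model) -> qualified firmware versions.
HFCL = {
    ("SSD", "ExampleDrive-1T"): {"1.0.2", "1.0.3"},
    ("NIC", "ExampleNIC-25G"): {"4.2"},
}


def check_component(component, hfcl):
    """Return a qualification status for a single component."""
    qualified_firmware = hfcl.get((component.kind, component.model))
    if qualified_firmware is None:
        return "unknown-model"
    if component.firmware not in qualified_firmware:
        return "unqualified-firmware"
    return "qualified"


def check_inventory(inventory, hfcl):
    """Map every enumerated component to its qualification status."""
    return {component: check_component(component, hfcl) for component in inventory}


# Made-up inventory standing in for the result of a hardware enumeration.
inventory = [
    Component("SSD", "ExampleDrive-1T", "1.0.3"),
    Component("NIC", "ExampleNIC-25G", "4.0"),
    Component("HBA", "ExampleHBA-12G", "2.1"),
]

report = check_inventory(inventory, HFCL)
for component, status in report.items():
    print(f"{component.kind:4} {component.model:16} fw={component.firmware:6} -> {status}")
```

Keying the list on component kind and model, with a set of firmware versions as the value, lets the check distinguish a component that is unknown altogether from one that is known but running firmware that has not been qualified.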
Tying it All Together
Establishing support for new hardware requires completing the engineering and qualification “cycles.” The initial bring-up and testing must be completed using Metis and Elevate by the FleX user. The results then flow to Nutanix, where they are reviewed and integrated into our code base and the hardware-firmware compatibility list (HFCL) by the appropriate subject matter experts (SMEs). The work of these SMEs is standardized and optimized by the internal Concord tool and the hardware abstraction layer (HAL), both detailed below. Finally, builds of our software containing the updated hardware and platform support can be published to the world for broader use.
If we view Metis and Elevate as the frontend of FleX, then we can consider Concord and HAL to be the backend. Let’s investigate them in more detail:
Concord is an internal tool that combines a framework for analyzing qualification results, a UI dashboard for reviewing those results, and a database for storing HFCL data.
All hardware operations pass through the hardware abstraction layer (HAL). The HAL provides a single interface to higher layers of Nutanix’s software, while internally encoding and abstracting the details of interaction with the variety of possible underlying hardware. New hardware support can be added by simply writing a plugin for the HAL. The HAL also contains the HFCL, allowing it to provide qualification status for each component in the system (for example, during an enumeration of the system’s complete hardware).
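The plugin idea can be sketched in a few lines. This is a simplified illustration of the general pattern, not the actual HAL: the class names, the `register` and `enumerate_disks` methods, and the platform identifier are all hypothetical.

```python
from abc import ABC, abstractmethod


class HalPlugin(ABC):
    """Hypothetical interface that each hardware-specific plugin implements."""

    platform = ""  # identifier of the platform this plugin supports

    @abstractmethod
    def enumerate_disks(self):
        """Return identifiers for the data disks on this platform."""


class HardwareAbstractionLayer:
    """Single entry point for higher layers; dispatches to the right plugin."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        # Adding support for new hardware means registering one more plugin.
        self._plugins[plugin.platform] = plugin

    def enumerate_disks(self, platform):
        try:
            plugin = self._plugins[platform]
        except KeyError:
            raise LookupError(f"no HAL plugin registered for {platform!r}") from None
        return plugin.enumerate_disks()


class ExampleVendorPlugin(HalPlugin):
    """Made-up plugin for a made-up platform; real ones would query hardware."""

    platform = "example-vendor-2u"

    def enumerate_disks(self):
        return ["disk0", "disk1"]


hal = HardwareAbstractionLayer()
hal.register(ExampleVendorPlugin())
print(hal.enumerate_disks("example-vendor-2u"))
```

In this sketch, callers only ever talk to `HardwareAbstractionLayer`, so the platform-specific details stay inside the plugin and higher layers do not change when a new platform is added.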
For more information on the hardware platform choices available for Nutanix customers, please visit https://www.nutanix.com/products/hardware-platforms.
© 2020 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and the other Nutanix products and features mentioned on this post are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned on this post are for identification purposes only and may be the trademarks of their respective holder(s). This post may contain links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site.