Modern, Fast, & Application-Consistent Data Protection for MongoDB Sharded Clusters Powered by NDB Time Machine and MongoDB Ops Manager
By Saravana Selvaraj, Staff Engineer
and Anand Chandak, Group Product Manager NDB
MongoDB sharded clusters power mission-critical applications across industries, from FinTech to e-commerce to global SaaS platforms. These environments demand not just scale but enterprise-class data protection: fast backups, consistent restores, low overhead, and full automation across a distributed database architecture.
To meet these needs, Nutanix Database Service (NDB) extends its advanced data-protection framework, Time Machine, to support MongoDB sharded clusters. This integration combines the deep MongoDB-native intelligence of MongoDB Ops Manager with the snapshot performance and storage efficiencies of the Nutanix platform, delivering a solution designed for modern, large-scale MongoDB workloads.
This blog explores how NDB orchestrates application-consistent online backups via cluster-consistent snapshots, coarse-grained oplog catchups, and cluster-wide coordinated point-in-time restores at per-second granularity.
Sharded MongoDB databases are distributed by design—multiple shards, each composed of replica sets, alongside a config‑server replica set. Protecting such an environment requires:
Note: Traditional backup tools like mongodump aren’t designed for this complexity.
First, Ops Manager is onboarded into NDB, and then Time Machine is enabled to provide the integrated backup and restore capabilities.
Reference: To learn how to onboard MongoDB Ops Manager in Nutanix NDB, refer here.
MongoDB Ops Manager exposes third-party backup APIs that allow storage and data-protection vendors to integrate their own snapshot technologies while Ops Manager maintains MongoDB-level consistency.
These APIs give backup vendors such as Nutanix a means to coordinate the overall backup and restore process. In this model, the vendor triggers and orchestrates the complete snapshot and restore workflows, while MongoDB Ops Manager controls the MongoDB-native operations required to achieve consistency across shards.
A well-defined, state-transition-driven model helps both integrating parties stay coordinated throughout the process, producing consistent behavior and results.
NDB leverages this integration to achieve:
This partnership enables NDB to capture backup data quickly and in strict coordination with MongoDB’s internal state.
Backup cursors are a MongoDB feature exposed through Ops Manager that provide a consistent view of data in each replica set.
NDB Time Machine consists of three core protection pillars:
Snapshot backups represent complete, consistent restore points of the entire sharded cluster: all shards plus the config server. The process is designed to be fully online, so that MongoDB transactions continue uninterrupted throughout.
Nutanix snapshots leverage a redirect-on-write algorithm that makes data protection lightweight by design. When a snapshot is triggered, Nutanix creates a new vDisk with read/write access; both the original snapshotted vDisk and the new vDisk reference the same underlying block map, and no new data blocks are created at snapshot time. Because this is predominantly a metadata operation, snapshot creation is near-instantaneous and largely independent of dataset size: a 50 TB dataset and a 500 GB dataset complete their snapshots in comparable time. The I/O overhead is minimal, with negligible impact on the running MongoDB workload during the snapshot window. For deeper technical details on the underlying mechanism, refer to the Snapshots and Clones section of the Nutanix Bible.
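To make the redirect-on-write idea concrete, here is a minimal toy model in Python. It is only a sketch and not the Nutanix implementation: the names BlockStore, VDisk, and snapshot are hypothetical, and the toy copies the small block map at snapshot time where a real system would share metadata more cleverly. What it does show is that a snapshot allocates no data blocks, and that later writes are redirected to fresh blocks so the snapshot's view stays intact.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """Toy block store (hypothetical): block id -> bytes."""
    blocks: dict = field(default_factory=dict)
    next_id: int = 0

    def allocate(self, data: bytes) -> int:
        block_id = self.next_id
        self.blocks[block_id] = data
        self.next_id += 1
        return block_id


@dataclass
class VDisk:
    """A vDisk modeled as a block map: logical offset -> block id."""
    store: BlockStore
    block_map: dict = field(default_factory=dict)
    read_only: bool = False

    def write(self, offset: int, data: bytes) -> None:
        if self.read_only:
            raise PermissionError("snapshot vDisks are immutable")
        # Redirect-on-write: new data lands in a fresh block and the map is
        # repointed; the block the snapshot references is never modified.
        self.block_map[offset] = self.store.allocate(data)

    def read(self, offset: int) -> bytes:
        return self.store.blocks[self.block_map[offset]]


def snapshot(live: VDisk) -> tuple[VDisk, VDisk]:
    """Freeze `live` and return (snapshot, new_live). Only metadata is touched."""
    live.read_only = True
    new_live = VDisk(store=live.store, block_map=dict(live.block_map))
    return live, new_live
```

In this toy, the cost of snapshot depends on the size of the block map rather than on the amount of data stored, which is the intuition behind snapshot time being largely independent of dataset size.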
For each shard and for the config server, the data node that is backed up is known as the Eligible & Available (EA) node, and it is chosen according to the configured backup policy. The backup policy lets the user specify which data node to snapshot: Primary or Secondary.
Below is the expanded workflow, explained in phases:
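The policy-driven choice of an EA node can be pictured with a short Python sketch. Everything here is an assumption made for illustration: the DataNode fields, the pick_ea_node helper, and the use of a single availability flag are hypothetical and do not reflect NDB's internal data model. The post does not describe what happens when no node matches the policy, so the sketch simply returns None in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataNode:
    name: str
    role: str        # "PRIMARY" or "SECONDARY"
    available: bool  # hypothetical health/eligibility flag


def pick_ea_node(nodes: list[DataNode], preferred_role: str) -> Optional[DataNode]:
    """Return the first available node whose role matches the policy, else None."""
    for node in nodes:
        if node.available and node.role == preferred_role:
            return node
    return None


# Example: one replica set (a shard) with a SECONDARY-preferring policy.
shard0 = [
    DataNode("shard0-a", "PRIMARY", True),
    DataNode("shard0-b", "SECONDARY", False),  # down, so not eligible
    DataNode("shard0-c", "SECONDARY", True),
]
ea_node = pick_ea_node(shard0, "SECONDARY")
```

The same selection would run once per shard and once for the config-server replica set, yielding one EA node per replica set.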
A snapshot may be triggered by a schedule, or on demand.
When initiated:
The orchestrating NDB Agent on Mongos:
This is where Nutanix’s infrastructure shines:
Resiliency: Every task is designed to be idempotent and re-entrant, so retries behave predictably. The system is tuned with configurable retries at each interaction point, maximizing the chance of success in environments that involve multiple subsystems, each of which is vulnerable to failure.
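The pairing of idempotent tasks with bounded retries can be sketched as follows. This is an illustrative Python pattern, not NDB code: run_with_retries, SnapshotRegistry, and the parameter names are hypothetical, and the failure is simulated. The point is that when a step is idempotent, running it again after a transient error leaves the same end state as running it once.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(task: Callable[[], T], max_attempts: int = 3,
                     delay_seconds: float = 0.0) -> T:
    """Run `task` until it succeeds or `max_attempts` is exhausted."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


class SnapshotRegistry:
    """Hypothetical idempotent step: registering the same id twice is a no-op."""

    def __init__(self) -> None:
        self.registered: set[str] = set()
        self.calls = 0

    def register(self, snapshot_id: str) -> str:
        self.calls += 1
        self.registered.add(snapshot_id)  # set.add is naturally idempotent
        if self.calls < 3:
            # Simulate a transient failure after the side effect has happened.
            raise ConnectionError("simulated transient failure")
        return snapshot_id
```

Here the first two calls fail after the side effect has already been applied, yet a retry is safe because repeating the registration does not create a duplicate.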
Oplog backups enable continuous data protection, allowing restores to any precise second within the retention window.
Time Machine coordinates this with Ops Manager to achieve consistency and reliability.
The depiction below describes the workflow in more detail:
Restore operations in sharded environments require precise orchestration—every shard must be restored to the same logical point, even though they are backed up independently.
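One common way to reason about a point-in-time restore is to pick the newest snapshot taken at or before the target time and then replay oplog entries up to the target second. The Python sketch below illustrates that planning step only. It assumes this general approach rather than describing NDB's or Ops Manager's actual logic, and the plan_pitr function is hypothetical.

```python
from datetime import datetime
from typing import Optional


def plan_pitr(snapshot_times: list[datetime],
              target: datetime) -> Optional[tuple[datetime, datetime]]:
    """Return (base_snapshot_time, replay_until), or None if no snapshot
    was taken at or before the target."""
    candidates = [t for t in snapshot_times if t <= target]
    if not candidates:
        return None
    base = max(candidates)
    # Per-second granularity: drop any sub-second part of the target.
    return base, target.replace(microsecond=0)


snapshots = [
    datetime(2026, 1, 1, 0, 0),
    datetime(2026, 1, 1, 6, 0),
    datetime(2026, 1, 1, 12, 0),
]
plan = plan_pitr(snapshots, datetime(2026, 1, 1, 7, 30, 15, 500000))
```

For a sharded cluster, the same target would be applied to every shard and to the config server, so that all of them converge on one logical point.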
NDB Time Machine’s in-place restore workflow supports:
The restore process is best visualized in phases:
Ops Manager requires that all nodes of a shard recover from identical data.
So NDB:
Note: When the nodes of a shard are hosted across multiple Nutanix clusters, NDB leverages Nutanix snapshot-based replication to make identical data available on each node quickly.
If a restore operation fails:
This design helps reduce the risk of restore operations corrupting backup chains, much like the way forward-only database upgrades behave.
The health of a MongoDB sharded cluster database is the combined health of all its shards and config servers. Both the database and Time Machine present a single view of the entire sharded cluster, so Time Machine capabilities and operations apply to the cluster as a whole.
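A forward-only restore can be pictured as a small state machine with no rollback edge. The Python sketch below is an assumption-laden illustration: only RESTORE_FAILED is a state name that appears in this post, while READY and RESTORING are hypothetical names, and the rule that a failed restore can only be retried once the issue is resolved is one reading of "forward-only", not a documented NDB transition table.

```python
from enum import Enum


class DbState(Enum):
    READY = "READY"                    # hypothetical name
    RESTORING = "RESTORING"            # hypothetical name
    RESTORE_FAILED = "RESTORE_FAILED"  # state named in this post


# Allowed transitions (assumed). A failed restore has no edge back to READY:
# the only way forward is to resolve the issue and run the restore again.
ALLOWED = {
    DbState.READY: {DbState.RESTORING},
    DbState.RESTORING: {DbState.READY, DbState.RESTORE_FAILED},
    DbState.RESTORE_FAILED: {DbState.RESTORING},
}


def transition(current: DbState, new: DbState) -> DbState:
    """Return `new` if the move is allowed, otherwise raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new
```

Encoding the transitions as data makes the forward-only property easy to check: there is simply no path from RESTORE_FAILED to READY that skips a new restore attempt.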
NDB extends its powerful Time Machine framework to MongoDB sharded clusters, delivering consistent, storage‑efficient, and application‑aware data protection.
By integrating with MongoDB Ops Manager's third-party backup APIs, NDB orchestrates cluster-wide snapshots, oplog backups, and PITR workflows while Ops Manager enables MongoDB-native coordination across shards and config servers.
Snapshots are near-instantaneous and captured in parallel across the cluster. Log catchups provide continuous protection via oplog extraction, storage, and retention on the Nutanix object store.
Restore workflows combine NDB's snapshot intelligence with Ops Manager's recovery engine to enable point-in-time or snapshot-based recovery across all shards. Unlike other engines, sharded cluster restores are forward-only: a failure places the database in a RESTORE_FAILED state until the issue is resolved. Once Time Machine is enabled, all backup, log, and restore operations run at a unified cluster scope, giving users a single, consistent operational experience for large-scale MongoDB deployments.
Feature Availability: The integrated solution for application-consistent backups and restores of MongoDB sharded cluster databases is available from NDB release 2.10 and MongoDB Ops Manager release 8.0.19 onwards.
©2026 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s).