
Veritas NetBackup for MongoDB Administrator's Guide

Last Published: 2020-02-11

Product(s): NetBackup (8.2)

Backing up MongoDB data

MongoDB data is backed up in parallel streams: the MongoDB data nodes stream data blocks simultaneously to multiple backup hosts.

The following diagram provides an overview of the backup flow:

Figure: Backup flow

As illustrated in the above diagram:

A scheduled backup job is triggered from the master server.
The backup job for MongoDB data is a compound job. When the backup job is triggered, a discovery job runs first.
During discovery, the backup host deploys a transient thin client (mdbserver) on the configuration server and obtains the details of the shards in the MongoDB cluster. The thin client also stops the balancing across the nodes in a replica set.
After receiving the information about the cluster, the backup host deploys a thin client on the secondary node of a replica set in the MongoDB cluster.
The thin client discovers the database paths dynamically, quiesces the secondary nodes, and then either takes snapshots (for full backups) or captures the oplog (for incremental backups).
Individual child jobs run for each backup stream and data is backed up.
Data blocks are streamed simultaneously from different secondary nodes to multiple backup hosts.
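
The parallel streaming step can be pictured with a small sketch. The guide does not describe how NetBackup pairs secondary nodes with backup hosts, so the round-robin policy below is purely an assumption for illustration, and the function name assign_streams and all node and host names are hypothetical.

```python
from itertools import cycle


def assign_streams(secondary_nodes, backup_hosts):
    """Pair each secondary node's stream with a backup host.

    Assumption: a simple round-robin policy. The actual NetBackup
    distribution logic is not documented in this section.
    """
    if not backup_hosts:
        raise ValueError("at least one backup host is required")
    hosts = cycle(backup_hosts)
    # Each secondary node gets the next backup host in rotation.
    return {node: next(hosts) for node in secondary_nodes}


# Hypothetical cluster: three secondary nodes, two backup hosts.
plan = assign_streams(
    ["shard1-secondary", "shard2-secondary", "shard3-secondary"],
    ["backuphost-a", "backuphost-b"],
)
for node, host in plan.items():
    print(f"{node} -> {host}")
```

With more secondary nodes than backup hosts, some hosts receive more than one stream in this sketch; every stream still has exactly one destination.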

Once the backup operation is completed, the thin client is removed from the servers.

The compound backup job does not complete until all of its child jobs have completed. After the child jobs finish, NetBackup cleans up all the snapshots on the secondary nodes. The compound backup job is marked complete only after this cleanup activity finishes.
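
The ordering described above (child jobs first, then snapshot cleanup, then completion of the compound job) can be sketched as follows. This is a minimal model of the sequencing only, not NetBackup code: run_compound_job, fake_backup, and the stream names are hypothetical, and failure handling is deliberately left out because this section does not describe it.

```python
from concurrent.futures import ThreadPoolExecutor


def run_compound_job(streams, backup_stream, cleanup_snapshots):
    """Run one child job per stream, then clean up, then finish.

    Assumption: child jobs are modeled as threads. Error handling is
    not modeled, because the guide does not describe it here.
    """
    events = []
    with ThreadPoolExecutor(max_workers=max(1, len(streams))) as pool:
        futures = [pool.submit(backup_stream, s) for s in streams]
        # result() blocks, so this line returns only after every
        # child job has completed.
        results = [f.result() for f in futures]
    events.append("children_done")

    # Cleanup starts only after all child jobs are done.
    cleanup_snapshots(streams)
    events.append("cleanup_done")

    # The compound job is considered complete only at this point.
    events.append("compound_done")
    return results, events


def fake_backup(stream):
    # Stand-in for a real child backup job.
    return f"{stream}: backed up"


cleaned = []
results, events = run_compound_job(
    ["stream-1", "stream-2"], fake_backup, cleaned.extend
)
print(results)
print(events)
```

The point of the sketch is that the parent job's completion is gated on two things in sequence: the slowest child job, and then the cleanup step.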