Veritas NetBackup™ for MongoDB Administrator's Guide

Product(s): NetBackup (9.0.0.1, 9.0)

Known limitations for MongoDB protection using NetBackup

The following table lists the known limitations for MongoDB protection using NetBackup:

Table: Known limitations

Limitation

Workaround

In a sharded MongoDB cluster with high availability that contains multiple mongos processes, before starting a restore and recover operation, only the mongos process on the restore destination for the Config Server Replica Set (CSRS) image should be running.

Manually stop any other mongos processes in the cluster before starting a restore and recover operation.

After recovery, reconfigure the mongos services to point to the recovered cluster.

If the mongos process is not shut down on all nodes except one, the additional mongos processes might conflict with the restore and recover operation, and the restored data might be inaccessible through a connection to mongos.

If the mongos processes are not shut down before the restore and recovery starts, then after recovery you must manually shut down the stale mongos processes and restart all the recovered mongod and mongos processes in the cluster.

You must start the MongoDB processes with an absolute path to the configuration files. You must also specify absolute paths for the certificate files, including the CA file, the PEM file, and the key files.

N/A

If the authentication type that was present during backup changes and you run a recovery job that requires a different authentication, the recovery process might fail.

Ensure that the authentication type during recovery remains the same as the type used during the backup.

If you rename the volume group or the logical volume after running a backup, the subsequent backup might fail.

N/A

During recovery, ensure that you select only one full backup image and its relevant subsequent incremental images. If you select more than one image, the recovery may fail as the restored data could be corrupted.

N/A

After you recover the MongoDB cluster, the cluster information for only the restored node is available.

After the recovery process is complete, manually add the secondary nodes to the cluster.

For more information, refer to the following article: add-members-to-the-replica-set

During the backup process, if the MongoDB import operation is running, it can become unresponsive. Avoid the MongoDB import operation during the backup or restore process.

N/A

During the restore process, the message "The restore was successfully initiated" is displayed in a pop-up, but the restore job does not start.

This issue occurs when you enter the Application Server in both the Source Client and Destination Client in the BAR UI.

Ensure that Source Client and Destination Client are entered correctly. The Source Client must be the Application Server and the Destination Client must be the backup host.

If your environment has DNAT, ensure that the backup host or the restore host and all the MongoDB nodes are in the same private network.

N/A

The NetBackup for MongoDB plug-in does not support the command line bprestore options -w and -print_jobid.

N/A

MongoDB restores are not supported from the backup hosts. All the restore operations for MongoDB must be initiated from the NetBackup master.

N/A

If your restore job submission is not displaying the restore job, check if your destination node has a MongoDB plug-in installed on it.

N/A

If you restore the MongoDB database to a non-LVM location and then try to take a backup from this non-LVM location, the backup fails.

Restore the data to an LVM location and then try to take a backup of the restored data.

The NetBackup for MongoDB plug-in does not support hard or soft links in the data path folders. Do not add any hard or soft links that point to locations in a different logical volume or a non-logical volume.

NetBackup cannot ensure that the data is consistent at the time of backups if you have hard or soft links in the data path folder. During the restore process, the hard or soft links are created as folders and not links.

N/A

When you cancel a child restore job during the MongoDB restore and recovery process, the thin client (mdbserver) is not removed immediately. The thin client is removed after the next restore operation.

N/A

MongoDB restore fails and displays error 2850.

Ensure that the destination host and port are valid and that their credentials are configured using the tpconfig command and the credentials file. For more information, refer to the tar logs.

After recovery, the MongoDB shard node fails to restart manually and the following error is seen in the MongoDB logs:

NoSuchKey: Missing expected field "configsvrConnectionString"

On the MongoDB shard where the problem occurs, start MongoDB in the maintenance mode and run the following method on the system.version collection in the admin database:

use admin
db.system.version.deleteOne( { _id: "minOpTimeRecovery" } )

In a restore and recover operation containing one or more replica sets, replica set members are restored to the replica set using the default "cfg.members[#].host" value provided by rs.config().

If this value was previously updated from the default value, after the restore and recover completes, this value may need to be updated (for example, from shortname to FQDN), to match the original configuration.

Workaround:

  1. Log in to the replica set MongoDB cluster

  2. Use the following command to check the configuration:

    rs.conf()

  3. Use the following command to update the configuration for replica set:

    Update configuration for replica set member 0:

    cfg = rs.conf();
    cfg.members[0].host = '<hostname.domain.com>:<port-number>';
    rs.reconfig(cfg)

  4. Verify the changes using the following command:

    rs.conf()

  5. Repeat the steps for the other replica sets and the members, or just the replica set members.

Backup jobs fail and the following error codes are displayed:

  • (50) client process aborted

  • (1) The requested operation was partially successful

  • (112) no files specified in the file list

Ensure that the backup windows for incremental backups are different for the same MongoDB cluster. The backup windows must not overlap each other for incremental backups for the same MongoDB cluster.

Ensure that permissions are in place for the mdbserver location, oplog location, and snapshot mount location. For more information, see Using a non-root user as a host user.

In a sharded MongoDB cluster environment, a 112 error can indicate that the mongos process is not running on the client defined in the backup policy.

A 112 error can also indicate that the same host name was added for multiple backup hosts in the BigData policy. Use unique host names for multiple backup hosts that are running the backup operations.
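The duplicate host name check can be sketched in shell. This is a minimal illustration, not a NetBackup utility: the function name duplicate_hosts is hypothetical, and you supply the backup host names from your own BigData policy.

```shell
# Sketch: print each host name that appears more than once in the argument
# list. The comparison is case-insensitive because DNS names are.
# (duplicate_hosts is a hypothetical helper, not part of NetBackup.)
duplicate_hosts() {
    printf '%s\n' "$@" | tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Example with placeholder host names:
# duplicate_hosts bh1.example.com bh2.example.com BH1.example.com
```

An empty result means that every backup host name in the list is unique.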

After a restore and recovery operation, if you try to stop and restart the mongod or mongos services (service mongod stop or service mongod restart), the commands fail.

This error occurs when the mongod or mongos processes are launched as a service using the service or systemctl commands and not using a direct command.

Workaround:

Stop the mongod or mongos services using alternative methods. For example, mongod -f /etc/mongod.conf --shutdown or kill <PID>. After stopping the services, you can use the service or systemctl commands again.

Note:

When you stop the services after restore and recovery, the .pid or .sock files remain after the mongod or mongos processes shut down. If the mongod or mongos services do not start afterward, you must delete these files.

The default location of the .sock files is /tmp

The default location of the .pid files is /var/run/mongodb/
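Before deleting a .pid file, it can help to confirm that the process it names is no longer running. The following is a minimal sketch for Linux hosts; the function name pid_file_is_stale and the file name in the example are assumptions, not part of NetBackup or MongoDB.

```shell
# Sketch: succeed (exit 0) when a .pid file is stale, that is, when the
# process ID it contains does not belong to a running process.
# (pid_file_is_stale is a hypothetical helper.)
pid_file_is_stale() {
    local pid_file=$1 pid
    [ -f "$pid_file" ] || return 1            # no file, nothing to clean up
    pid=$(tr -d '[:space:]' < "$pid_file")
    case $pid in
        ''|*[!0-9]*) return 0 ;;              # empty or malformed: treat as stale
    esac
    # kill -0 sends no signal; it only tests whether the process can be
    # signalled. The /proc check covers processes owned by another user.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1                              # process is still running
    fi
    return 0
}

# Example (the file name is an assumption; adjust it to your deployment):
# pid_file_is_stale /var/run/mongodb/mongod.pid && rm -f /var/run/mongodb/mongod.pid
```

Run the check only after you have stopped the mongod or mongos process, so that a file belonging to a live process is never removed.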

Backup operation fails if a command that generates output in .bashrc is added.

Backup fails with error 6646 and displays the following error:

Error: Unable to communicate with the server.

Ensure that .bashrc generates no output (through echo or any other output-generating command). When the shell is non-interactive, .bashrc must not write anything to STDOUT or STDERR.
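One common way to keep .bashrc silent for non-interactive sessions is to guard every output-producing command behind an interactivity check. The following is a minimal sketch under that assumption; the function name shell_is_interactive and the welcome message are hypothetical.

```shell
# Sketch for the host user's ~/.bashrc (function name and message are
# hypothetical). The special parameter "$-" lists the current shell's option
# flags and contains "i" only when the shell is interactive.
shell_is_interactive() {
    # $1 (optional): flag string to inspect; defaults to the current flags.
    case "${1:-$-}" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print the banner only in interactive shells, so that non-interactive
# sessions produce no output on STDOUT or STDERR.
if shell_is_interactive; then
    echo "Welcome, $USER"
fi
```

Any other command in .bashrc that prints text can be moved inside the same if block.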

When you select two full backup images and try to restore to a point-in-time image that is between the two full backup images, the latest full backup image is restored.

Workaround:

During the restore and recovery operation, do not select more than one full backup image.

For an effective point-in-time recovery, ensure that you run differential incremental backups of shorter duration.

Unable to see the restore job progress in the Activity Monitor.

Workaround:

For compound restore jobs that use a non-master server as the restore host, you must use the Update Task List button to display the restore job progress in the Activity Monitor.

Backup fails with the following error:

(6625) The backup host is either unauthorized to complete the operation or it is unable to establish a connection with the application server.

Workaround:

On the server where MongoDB is installed, ensure that PasswordAuthentication is not disabled in the /etc/ssh/sshd_config file.

Run the sudo service sshd restart command.
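The PasswordAuthentication check can be sketched as a small shell function. This is an illustration only: the function name password_auth_disabled is hypothetical, and the sketch deliberately ignores Match blocks and Include files, which can override the setting.

```shell
# Sketch: succeed (exit 0) when the first active PasswordAuthentication
# directive in the given file is "no". sshd uses the first value that it
# obtains for each keyword, and keywords are case-insensitive.
# Match blocks and Include files are NOT handled here.
# (password_auth_disabled is a hypothetical helper.)
password_auth_disabled() {
    local config=${1:-/etc/ssh/sshd_config} value
    value=$(awk 'tolower($1) == "passwordauthentication" { print tolower($2); exit }' "$config")
    [ "$value" = "no" ]
}

# Example:
# password_auth_disabled && echo "PasswordAuthentication is disabled"
```

If the function reports that password authentication is disabled, correct the setting in the sshd_config file before you restart the sshd service.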

Backup fails with the following error:

(6646) Unable to communicate with the server.

Workaround:

Ensure that the backup host can access the port that is defined in the mongodb.conf file or the default mdbserver_port (11000).

There could be an error while copying the thin client files on the MongoDB server because of the following:

  • Connectivity issues with the MongoDB server

  • User does not have permissions to the location for copying the thin client files
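To check connectivity from the backup host to the mdbserver port, a quick TCP probe can be sketched as follows. This is a minimal sketch that assumes bash and the coreutils timeout command are available; the function name port_is_open and the host name in the example are hypothetical.

```shell
# Sketch: succeed (exit 0) when a TCP connection to host:port can be opened
# within 3 seconds. Requires bash and the coreutils "timeout" command.
# (port_is_open is a hypothetical helper.)
port_is_open() {
    local host=$1 port=$2
    # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (placeholder host name; 11000 is the default mdbserver_port):
# port_is_open mongodb-node.example.com 11000 && echo "port reachable"
```

A refused connection can also mean that no process is listening on the port at that moment, so treat a failure as a hint to investigate rather than as proof of a firewall problem.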

The following error is displayed in the mdbserver logs:

error-sudo: sorry, you must have a tty to run sudo

Workaround:

  • To disable the requiretty option globally, in the sudoers file, replace Defaults requiretty with Defaults !requiretty. This action changes the global sudo configuration.

  • You can change the sudo configuration for the user, group, or command. On the server where MongoDB is installed, add the host user, or group, or command in the sudoers file.

    Add Defaults!/path/to/my/bin !requiretty

    Add Defaults:<host_user> !requiretty

The nbaapireq_handler log folder is not created on a Flex Container, even after running the mklogdir command.

Workaround:

If a Flex Appliance is upgraded from version 8.1.2 to 8.2 and the Flex media server is used as the backup host, manually create the nbaapireq_handler folder in the /usr/openv/netbackup/logs/ directory to capture the MongoDB plug-in restore logs.

MongoDB Restore fails with the error 2850

The target database path does not exist or the non-root user has insufficient permissions on it.

Workaround:

Ensure that the target database path exists and there are sufficient permissions for the non-root user.

The snapshot size, as described by the free_space_percentage_snapshot parameter, must be set according to the MongoDB cluster size and must be large enough. If these criteria are not met, the backup fails and displays the following error:

invalid command parameter (20)

Validate the free_space_percentage_snapshot value with the MongoDB cluster.

Backup fails with the following error:

(13) file read failed for Media

Ensure that the:

  • NetBackup version on the master server is the latest.

  • NetBackup version on the media server is the same as the master server but newer than the NetBackup client version on the backup host.

  • NetBackup client version on the backup host is the same as or older than the media server.

The mdb_progress_loglevel parameter is missing from the MongoDB configuration tool.

To modify the mdb_progress_loglevel parameter, update the mongodb.conf file after it is created by the MongoDB configuration tool.

For more information, refer to the MongoDB Administrator's Guide.

Snapshots are not deleted and stale mdbserver instances are seen. This scenario might cause Cannot lstat errors during backup and partially successful backups.

Change the configuration settings for the following parameters in the mongodb.conf file:

  • cleanup_time_in_min

  • mdbserver_timeout_min

Set the values such that the stale snapshots and stale instances of mdbserver are cleared before the next full or incremental backup schedule.

If the backup host runs a NetBackup version earlier than 8.3 and the master and media servers run the latest version of NetBackup, the following invalid error codes might be displayed in various scenarios:

13302, 13303, 13304, 13305, 13306, 13307, 13308, 13309, 13310, 13311, 13312, 13313, 13314, 13315

Workaround:

If you see one of the invalid error codes, refer to the following list to find the corresponding actual error code and its message:

  • Invalid error code: 13302

    Actual error: 6724

    Message: Restore node count is invalid.

  • Invalid error code: 13303

    Actual error: 6725

    Message: Unable to find information about the MongoDB replica set.

  • Invalid error code: 13304

    Actual error: 6704

    Message: Restoring multiple MongoDB nodes on one replica set is invalid.

  • Invalid error code: 13305

    Actual error: 6705

    Message: Restoring MongoDB data on an arbiter node is invalid.

  • Invalid error code: 13306

    Actual error: 6706

    Message: A discovered shard was found in drain state, cannot proceed with backup.

  • Invalid error code: 13307

    Actual error: 6707

    Message: An unsupported MongoDB storage engine is detected.

  • Invalid error code: 13308

    Actual error: 6708

    Message: Unable to parse command output

  • Invalid error code: 13309

    Actual error: 6709

    Message: Unable to run the command.

  • Invalid error code: 13310

    Actual error: 6710

    Message: Pre-check for recovery has failed as WiredTiger log files are present at the database path.

  • Invalid error code: 13311

    Actual error: 6711

    Message: Unable to backup MongoDB configuration file.

  • Invalid error code: 13312

    Actual error: 6712

    Message: Unable to find operation log for previous backup.

  • Invalid error code: 13313

    Actual error: 6713

    Message: Operations log roll-over detected.

  • Invalid error code: 13314

    Actual error: 6714

    Message: Error while collection was iterated.

  • Invalid error code: 13315

    Actual error: 6715

    Message: Operation log verification error.

For detailed information and recommended actions, refer to the NetBackup Status Codes Reference Guide.
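The mapping in the list above can be captured in a small lookup function, for example when you scan logs from an older backup host. This is a minimal sketch; the function name actual_error_code is hypothetical, and the code pairs are copied from the list above.

```shell
# Sketch: translate an invalid error code that an older backup host reports
# into the actual NetBackup status code, using the list above.
# Returns a non-zero status for codes that are not in the list.
# (actual_error_code is a hypothetical helper.)
actual_error_code() {
    case $1 in
        13302) echo 6724 ;;
        13303) echo 6725 ;;
        13304) echo 6704 ;;
        13305) echo 6705 ;;
        13306) echo 6706 ;;
        13307) echo 6707 ;;
        13308) echo 6708 ;;
        13309) echo 6709 ;;
        13310) echo 6710 ;;
        13311) echo 6711 ;;
        13312) echo 6712 ;;
        13313) echo 6713 ;;
        13314) echo 6714 ;;
        13315) echo 6715 ;;
        *) return 1 ;;
    esac
}

# Example:
# actual_error_code 13310
```

The explicit table is used instead of arithmetic because the first two codes (13302 and 13303) do not follow the same offset as the rest.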

The Restore button in the NetBackup BAR UI might be disabled for imported MongoDB backup images.

Workaround:

If you import the images to the same NetBackup master server that was originally used to back them up, use either of the following methods:

  • Perform the restore operation using the bprestore command.

  • Restore the catalog backup that enables the restore button in the BAR UI and then restore the images.

If you import the images to a different NetBackup master server than the one that was originally used to back them up, use the bprestore command to run the restore operation.

Recovery operation fails on an alternate, sharded MongoDB cluster. The following error is displayed:

Unable to find the configuration parameter. (6661)

This issue occurs during an alternate cluster recovery because the pre-recovery check is unable to find the mongos port for the alternate cluster in the mongodb.conf file. The cause is the way the MongoDB configuration tool creates the mongodb.conf file when you add the alternate MongoDB cluster details using the Update option in the tool.

Workaround:

Before you start the recovery process, update the mongodb.conf file to separate the alternate cluster from the original cluster.

For example:

Existing mongodb.conf file:

"application_servers": {
  "original.mongodb.cluster.com:26050": {
    "alternate_config_server": [
      {
        "hostname:port": "alt.mongodb.cluster.com:26000",
        "mongos_port": "26001"
      }
    ],
    "mongos_port": "26051"
  }
}

Suggested update to the mongodb.conf file:

"application_servers": {
  "original.mongodb.cluster.com:26050": {
    "mongos_port": "26051"
  },
  "alt.mongodb.cluster.com:26000": {
    "mongos_port": "26001"
  }
}

The MUI tool displays the following error:

Unable to delete configuration.

Recommended Action:

  • Check that the <hostname-port>.conf file still exists in the /usr/openv/var/global directory.

  • Refer to the tpconfig logs and check for error:

    Translating EMM_ERROR_MachineNotExist(2000000) to 88 in the Device Config context.

Workaround:

Delete the <hostname-port>.conf file manually from /usr/openv/var/global.