NetBackup™ for Hadoop Administrator's Guide
Configuring distribution algorithm and golden ratio for backup hosts
To enhance backup performance, you can configure the distribution algorithm and golden ratio based on the tunable parameters. Performance is fine-tuned through the combination of distribution algorithm and golden ratio.
To decide the distribution algorithm and golden ratio, consider the following:
- If you have a small number of large-sized files in your data set: Use distribution algorithm 1. A change in golden ratio is not honored.
- If you have a large number of small-sized files in your data set: Use distribution algorithm 2. A change in golden ratio is not honored.
- If you have a small number of very large-sized files and a large number of small-sized files in your data set: Use distribution algorithm 4 or 5 and a golden ratio that fits your deployment. The supported golden ratio range is from 1 to 100. If no value is provided, the default of 75 is used.
Note: Adjusting the golden ratio can change performance drastically.
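The selection guidance above can be sketched as a small Python helper. This is only an illustration and is not part of NetBackup: the profile labels ("few_large", "many_small", "mixed") and the function name are hypothetical, and the choice between algorithm 4 and 5 for mixed data sets is left to the caller because the guidance does not say how to pick between them.

```python
DEFAULT_GOLDEN_RATIO = 75  # default the guide states when no value is provided


def suggest_settings(profile, mixed_algo=4, golden_ratio=None):
    """Map a data-set profile to the settings recommended in the guidance.

    profile      -- "few_large", "many_small", or "mixed" (hypothetical labels)
    mixed_algo   -- 4 or 5; only used for the "mixed" profile
    golden_ratio -- 1 to 100; only used for the "mixed" profile
    """
    if profile == "few_large":
        # Small number of large files: algorithm 1, golden ratio not honored.
        return {"distro_algo": 1}
    if profile == "many_small":
        # Large number of small files: algorithm 2, golden ratio not honored.
        return {"distro_algo": 2}
    if profile == "mixed":
        if mixed_algo not in (4, 5):
            raise ValueError("mixed data sets use distribution algorithm 4 or 5")
        ratio = DEFAULT_GOLDEN_RATIO if golden_ratio is None else golden_ratio
        if not 1 <= ratio <= 100:
            raise ValueError("golden ratio must be between 1 and 100")
        return {"distro_algo": mixed_algo, "golden_ratio": ratio}
    raise ValueError(f"unknown profile: {profile!r}")


if __name__ == "__main__":
    print(suggest_settings("mixed", mixed_algo=5, golden_ratio=80))
```

Because the golden ratio can change performance drastically, it may be safer to change it in small steps and compare backup durations between runs.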
To update the hadoop.conf file for configuring algorithm and golden ratio
- Update the hadoop.conf file with the following parameters: { "distro_algo": distribution_algorithm, "golden_ratio": golden_ratio }
- Copy this file to the following location on the backup host: /usr/openv/var/global/
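As a rough sketch of the two steps above, the following Python script merges the parameters into a staged copy of hadoop.conf and then copies the file into place. It assumes that hadoop.conf is a JSON file and that "distro_algo" and "golden_ratio" are top-level keys, as the parameter snippet suggests; this section does not show the full layout of the file, so adjust the key placement if your file nests them differently. The function names are hypothetical, and the demo at the end writes only to a temporary directory.

```python
import json
import shutil
import tempfile
from pathlib import Path


def update_hadoop_conf(conf_path, distro_algo, golden_ratio=None):
    """Merge the two tunables into a JSON file, preserving any other keys.

    Assumption: the keys live at the top level of the JSON object.
    """
    path = Path(conf_path)
    conf = json.loads(path.read_text()) if path.exists() else {}
    conf["distro_algo"] = distro_algo
    if golden_ratio is not None:
        if not 1 <= golden_ratio <= 100:
            raise ValueError("golden ratio must be between 1 and 100")
        conf["golden_ratio"] = golden_ratio
    path.write_text(json.dumps(conf, indent=2) + "\n")
    return conf


def copy_to_backup_host_dir(conf_path, dest_dir="/usr/openv/var/global/"):
    """Copy the staged file into the directory named in the guide.

    Assumption: the script runs on the backup host with write access there.
    """
    return shutil.copy(conf_path, dest_dir)


if __name__ == "__main__":
    # Demo against a temporary directory so no real configuration is touched.
    with tempfile.TemporaryDirectory() as work:
        staged = Path(work) / "hadoop.conf"
        staged.write_text('{"existing_key": "kept"}')
        print(update_hadoop_conf(staged, distro_algo=4, golden_ratio=80))
```

Staging the file first and copying it afterward keeps a half-written configuration from ever appearing in the live directory.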