NetBackup™ Backup Planning and Performance Tuning Guide
- NetBackup capacity planning
- Primary server configuration guidelines
- Size guidance for the NetBackup primary server and domain
- Factors that limit job scheduling
- More than one backup job per second
- Stagger the submission of jobs for better load distribution
- NetBackup job delays
- Selection of storage units: performance considerations
- About file system capacity and NetBackup performance
- About the primary server NetBackup catalog
- Guidelines for managing the primary server NetBackup catalog
- Adjusting the batch size for sending metadata to the NetBackup catalog
- Methods for managing the catalog size
- Performance guidelines for NetBackup policies
- Legacy error log fields
- Media server configuration guidelines
- NetBackup hardware design and tuning considerations
- About NetBackup Media Server Deduplication (MSDP)
- Data segmentation
- Fingerprint lookup for deduplication
- Predictive and sampling cache scheme
- Data store
- Space reclamation
- System resource usage and tuning considerations
- Memory considerations
- I/O considerations
- Network considerations
- CPU considerations
- OS tuning considerations
- MSDP tuning considerations
- MSDP sizing considerations
- Cloud tier sizing and performance
- Accelerator performance considerations
- Media configuration guidelines
- About dedicated versus shared backup environments
- Suggestions for NetBackup media pools
- Disk versus tape: performance considerations
- NetBackup media not available
- About the threshold for media errors
- Adjusting the media_error_threshold
- About tape I/O error handling
- About NetBackup media manager tape drive selection
- How to identify performance bottlenecks
- Best practices
- Best practices: NetBackup SAN Client
- Best practices: NetBackup AdvancedDisk
- Best practices: Disk pool configuration - setting concurrent jobs and maximum I/O streams
- Best practices: About disk staging and NetBackup performance
- Best practices: Supported tape drive technologies for NetBackup
- Best practices: NetBackup tape drive cleaning
- Best practices: NetBackup data recovery methods
- Best practices: Suggestions for disaster recovery planning
- Best practices: NetBackup naming conventions
- Best practices: NetBackup duplication
- Best practices: NetBackup deduplication
- Best practices: Universal shares
- NetBackup for VMware sizing and best practices
- Best practices: Storage lifecycle policies (SLPs)
- Best practices: NetBackup NAS-Data-Protection (D-NAS)
- Best practices: NetBackup for Nutanix AHV
- Best practices: NetBackup Sybase database
- Best practices: Avoiding media server resource bottlenecks with Oracle VLDB backups
- Best practices: Avoiding media server resource bottlenecks with MSDPLB+ prefix policy
- Best practices: Cloud deployment considerations
- Measuring Performance
- Measuring NetBackup performance: overview
- How to control system variables for consistent testing conditions
- Running a performance test without interference from other jobs
- About evaluating NetBackup performance
- Evaluating NetBackup performance through the Activity Monitor
- Evaluating NetBackup performance through the All Log Entries report
- Table of NetBackup All Log Entries report
- Evaluating system components
- About measuring performance independent of tape or disk output
- Measuring performance with bpbkar
- Bypassing disk performance with the SKIP_DISK_WRITES touch file
- Measuring performance with the GEN_DATA directive (Linux/UNIX)
- Monitoring Linux/UNIX CPU load
- Monitoring Linux/UNIX memory use
- Monitoring Linux/UNIX disk load
- Monitoring Linux/UNIX network traffic
- Monitoring Linux/Unix system resource usage with dstat
- About the Windows Performance Monitor
- Monitoring Windows CPU load
- Monitoring Windows memory use
- Monitoring Windows disk load
- Increasing disk performance
- Tuning the NetBackup data transfer path
- About the NetBackup data transfer path
- About tuning the data transfer path
- Tuning suggestions for the NetBackup data transfer path
- NetBackup client performance in the data transfer path
- NetBackup network performance in the data transfer path
- NetBackup server performance in the data transfer path
- About shared memory (number and size of data buffers)
- Default number of shared data buffers
- Default size of shared data buffers
- Amount of shared memory required by NetBackup
- How to change the number of shared data buffers
- Notes on number data buffers files
- How to change the size of shared data buffers
- Notes on size data buffer files
- Size values for shared data buffers
- Note on shared memory and NetBackup for NDMP
- Recommended shared memory settings
- Recommended number of data buffers for SAN Client and FT media server
- Testing changes made to shared memory
- About NetBackup wait and delay counters
- Changing parent and child delay values for NetBackup
- About the communication between NetBackup client and media server
- Processes used in NetBackup client-server communication
- Roles of processes during backup and restore
- Finding wait and delay counter values
- Note on log file creation
- About tunable parameters reported in the bptm log
- Example of using wait and delay counter values
- Issues uncovered by wait and delay counter values
- Estimating the effect of multiple copies on backup performance
- Effect of fragment size on NetBackup restores
- Other NetBackup restore performance issues
- About shared memory (number and size of data buffers)
- NetBackup storage device performance in the data transfer path
- Tuning other NetBackup components
- When to use multiplexing and multiple data streams
- Effects of multiplexing and multistreaming on backup and restore
- How to improve NetBackup resource allocation
- Encryption and NetBackup performance
- Compression and NetBackup performance
- How to enable NetBackup compression
- Effect of encryption plus compression on NetBackup performance
- Information on NetBackup Java performance improvements
- Information on NetBackup Vault
- Fast recovery with Bare Metal Restore
- How to improve performance when backing up many small files
- How to improve FlashBackup performance
- Veritas NetBackup OpsCenter
- Tuning disk I/O performance
Example of using wait and delay counter values
Suppose you wanted to analyze a local backup that has a 30-minute data transfer that is baselined at 5 MB per second. The backup involves a total data transfer of 9,000 MB. Because a local backup is involved, bptm is the data consumer. The data producer depends on the type of data that is backed up.
See Processes used in NetBackup client-server communication.
See Roles of processes during backup and restore.
Find the wait and delay values for the appropriate data producer process and for the consumer process (bptm) from the following:
For this example, suppose those values are the following:
Table: Examples for wait and delay
Process | Wait | Delay |
|---|---|---|
bpbkar (Linux/UNIX) bpbkar32 (Windows) | 29364 | 58033 |
bptm | 95 | 105 |
These values reveal that bpbkar (or bpbkar32) is forced to wait by a bptm process that cannot move data out of the shared buffer fast enough.
Next, you can determine time lost due to delays by multiplying the delay counter value by the parent or child delay value, whichever applies.
In this example, the bpbkar (or bpbkar32) process uses the child delay value, while the bptm process uses the parent delay value. (The defaults for these values are 10 milliseconds for child delay and 15 milliseconds for parent delay.)
You can use the following equations to determine the amount of time lost due to these delays:
Table: Example delays
Process | Delay |
|---|---|
bpbkar (Linux/UNIX) bpbkar32 (Windows) | 58033 delays x 0.010 seconds = 580.33 seconds = 9 minutes 40 seconds |
bptm | 105 x 0.015 seconds = 1.6 seconds |
Use these equations to determine if the delay for bpbkar (or bpbkar32) is significant. In this example, if this delay is removed, the resulting transfer time is:
30 minutes original transfer time - 9 minutes 40 seconds = 20 minutes 20 seconds (1220 seconds)
A transfer time of 1220 seconds results in the following throughput value:
9000 MB / 1220 seconds = 7.38 MB per second
7.38 MB per second is a significant increase over 5 MB per second. With this increase, you should investigate how the tape or disk performance can be improved.
You should interpret the number of delays within the context of how much data was moved. As the amount of moved data increases, the significance threshold for counter values increases as well.
Again, for a total of 9,000 MB of data being transferred, assume a 64-KB buffer.
You can determine the total number of buffers to be transferred using the following equation:
Number of kilobytes | 9,000 x 1024 = 9,216,000 KB |
Number of buffers | 9,216,000 / 64 = 144,000 |
You can now express the wait counter value as a percentage of the total number of buffers:
bpbkar (Linux/UNIX), or bpbkar32 (Windows) | 29364 / 144,000 = 20.39% |
bptm | 95 / 144,000 = 0.07% |
In the 20 percent of cases where bpbkar (or bpbkar32) needed an empty shared data buffer, bptm has not yet emptied the shared data buffer. A value of this size indicates a serious issue. You should investigate as to why the data consumer (bptm) cannot keep up.
In contrast, the delays that bptm encounters are insignificant for the amount of data transferred.
You can also view the delay and wait counters as a ratio:
bpbkar (Linux/UNIX) bpbkar32 (Windows) | = 58033 delays / 29364 waits = 1.98 |
In this example, on average bpbkar (or bpbkar32) had to delay twice for each wait condition that was encountered. If this ratio is large, increase the parent or child delay to avoid checking for a shared data buffer in the correct state too often.
See Changing parent and child delay values for NetBackup.
Conversely, if this ratio is close to 1, reduce the applicable delay value to check more often, which may increase your data throughput performance. Keep in mind that the parent and child delay values are rarely changed in most NetBackup installations.
The preceding information explains how to determine if the values for wait and delay counters are substantial enough for concern.
Note:
The wait and delay counters are related to the size of the data transfer. A value of 1,000 may be extreme when only 1 megabyte of data is moved. The same value may indicate a well-tuned system when gigabytes of data are moved. The final analysis must determine how these counters affect performance.
More Information