NetBackup™ Backup Planning and Performance Tuning Guide

Last Published: 2024-04-16

Product(s): NetBackup (10.4, 10.3.0.1, 10.3, 10.2.0.1, 10.2, 10.1.1, 10.1, 10.0.0.1, 10.0, 9.1.0.1, 9.1, 9.0.0.1, 9.0, 8.3.0.2, 8.3.0.1, 8.3)

Example of using wait and delay counter values

Suppose you want to analyze a local backup whose data transfer takes 30 minutes at a baseline rate of 5 MB per second, for a total of 9,000 MB transferred. Because the backup is local, bptm is the data consumer. The data producer depends on the type of data that is backed up.

See Processes used in NetBackup client-server communication.

See Roles of processes during backup and restore.

Find the wait and delay values for the appropriate data producer process and for the consumer process (bptm). See Finding wait and delay counter values.

For this example, suppose those values are the following:

Table: Examples for wait and delay

Process	Wait	Delay
bpbkar (Linux/UNIX) bpbkar32 (Windows)	29364	58033
bptm	95	105

These values reveal that bpbkar (or bpbkar32) is forced to wait by a bptm process that cannot move data out of the shared buffer fast enough.

Next, you can determine time lost due to delays by multiplying the delay counter value by the parent or child delay value, whichever applies.

In this example, the bpbkar (or bpbkar32) process uses the child delay value, while the bptm process uses the parent delay value. (The defaults for these values are 10 milliseconds for child delay and 15 milliseconds for parent delay.)

You can use the following equations to determine the amount of time lost due to these delays:

Table: Example delays

Process	Delay
bpbkar (Linux/UNIX) bpbkar32 (Windows)	58033 delays x 0.010 seconds = 580.33 seconds = 9 minutes 40 seconds
bptm	105 delays x 0.015 seconds = 1.575 seconds (about 1.6 seconds)
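
The delay arithmetic above can be reproduced with a short script. This is a minimal sketch, not a NetBackup tool: the function and constant names are hypothetical, and the intervals are the default child and parent delay values quoted in the text.

```python
# Default delay intervals quoted in the text, converted to seconds.
# The names below are illustrative only and are not NetBackup identifiers.
CHILD_DELAY_S = 0.010   # 10 ms, used by bpbkar / bpbkar32 in this example
PARENT_DELAY_S = 0.015  # 15 ms, used by bptm in this example


def time_lost_seconds(delay_count, delay_interval_s):
    """Return the time lost to delays: delay counter x delay interval."""
    return delay_count * delay_interval_s


def as_minutes_seconds(total_seconds):
    """Split a duration into whole minutes and seconds (rounded to 1 s)."""
    return divmod(round(total_seconds), 60)


bpbkar_lost = time_lost_seconds(58033, CHILD_DELAY_S)
bptm_lost = time_lost_seconds(105, PARENT_DELAY_S)

minutes, seconds = as_minutes_seconds(bpbkar_lost)
print(f"bpbkar: {bpbkar_lost:.2f} s ({minutes} min {seconds} s)")
print(f"bptm:   {bptm_lost:.3f} s")
```

If the parent or child delay values have been changed from their defaults, substitute the configured values for the two constants.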

Use these equations to determine if the delay for bpbkar (or bpbkar32) is significant. In this example, if this delay is removed, the resulting transfer time is:

30 minutes original transfer time - 9 minutes 40 seconds = 20 minutes 20 seconds (1220 seconds)

A transfer time of 1220 seconds results in the following throughput value:

9000 MB / 1220 seconds = 7.38 MB per second
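
The adjusted transfer time and throughput can be checked the same way. The sketch below uses a hypothetical helper name and the figures from this example (9,000 MB, 30 minutes, and a delay of 9 minutes 40 seconds).

```python
def throughput_without_delay(total_mb, original_seconds, delay_seconds):
    """Return (adjusted transfer time in seconds, throughput in MB/s)
    for a transfer with the given delay time removed."""
    adjusted_seconds = original_seconds - delay_seconds
    if adjusted_seconds <= 0:
        raise ValueError("delay time must be shorter than the transfer time")
    return adjusted_seconds, total_mb / adjusted_seconds


original_s = 30 * 60        # 30-minute transfer
delay_s = 9 * 60 + 40       # 9 minutes 40 seconds lost by bpbkar

adjusted_s, mb_per_s = throughput_without_delay(9000, original_s, delay_s)
baseline = 9000 / original_s

print(f"baseline:  {baseline:.2f} MB/s")
print(f"adjusted:  {adjusted_s} s, {mb_per_s:.2f} MB/s")
```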

A throughput of 7.38 MB per second is a significant increase over the 5 MB per second baseline. A potential gain of this size justifies investigating how the tape or disk performance can be improved.

You should interpret the number of delays within the context of how much data was moved. As the amount of moved data increases, the significance threshold for counter values increases as well.

For the same transfer of 9,000 MB, assume a 64-KB buffer.

You can determine the total number of buffers to be transferred using the following equations:

Number of kilobytes	9,000 x 1024 = 9,216,000 KB
Number of buffers	9,216,000 / 64 = 144,000

You can now express the wait counter value as a percentage of the total number of buffers:

bpbkar (Linux/UNIX), or bpbkar32 (Windows)	29364 / 144,000 = 20.39%
bptm	95 / 144,000 = 0.07%
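
As a sketch of the buffer and percentage calculations above, the following script derives the number of buffers from the transfer size and expresses each wait counter as a percentage of that number. The helper names are hypothetical, and the 64-KB buffer size is the assumption made in this example.

```python
def buffer_count(total_mb, buffer_kb):
    """Number of shared data buffers needed to move total_mb of data."""
    total_kb = total_mb * 1024
    return total_kb // buffer_kb


def wait_percentage(wait_count, buffers):
    """Wait counter expressed as a percentage of buffers transferred."""
    return 100.0 * wait_count / buffers


buffers = buffer_count(9000, 64)
bpbkar_pct = wait_percentage(29364, buffers)
bptm_pct = wait_percentage(95, buffers)

print(f"buffers: {buffers}")
print(f"bpbkar waits: {bpbkar_pct:.2f}%")
print(f"bptm waits:   {bptm_pct:.2f}%")
```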

In about 20 percent of the cases where bpbkar (or bpbkar32) needed an empty shared data buffer, bptm had not yet emptied the buffer. A value of this size indicates a serious issue. You should investigate why the data consumer (bptm) cannot keep up.

In contrast, the waits and delays that bptm encounters are insignificant for the amount of data transferred.

You can also view the delay and wait counters as a ratio:

bpbkar (Linux/UNIX), or bpbkar32 (Windows)	58033 delays / 29364 waits = 1.98

In this example, bpbkar (or bpbkar32) had to delay about twice, on average, for each wait condition that it encountered. If this ratio is large, increase the parent or child delay, whichever applies, so that the process does not check too often for a shared data buffer in the correct state.
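
The ratio can be computed as follows. This is only an illustrative sketch with a hypothetical function name; the text does not define a numeric threshold for a "large" ratio, so none is encoded here.

```python
def delays_per_wait(delay_count, wait_count):
    """Average number of delays incurred per wait condition."""
    if wait_count == 0:
        raise ValueError("no wait conditions were recorded")
    return delay_count / wait_count


ratio = delays_per_wait(58033, 29364)
print(f"bpbkar delays per wait: {ratio:.2f}")
```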

See Changing parent and child delay values for NetBackup.

Conversely, if this ratio is close to 1, reduce the applicable delay value to check more often, which may increase your data throughput performance. Keep in mind that the parent and child delay values are rarely changed in most NetBackup installations.

The preceding information explains how to determine if the values for wait and delay counters are substantial enough for concern.

Note:

The wait and delay counters are related to the size of the data transfer. A value of 1,000 may be extreme when only 1 megabyte of data is moved. The same value may indicate a well-tuned system when gigabytes of data are moved. The final analysis must determine how these counters affect performance.
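
To illustrate the note, the sketch below compares the same counter value of 1,000 against two transfer sizes. The 64-KB buffer size and the 10-GB transfer are assumptions chosen for illustration, and the helper name is hypothetical.

```python
def waits_per_buffer(wait_count, total_mb, buffer_kb=64):
    """Average number of waits per buffer moved (64-KB buffers assumed)."""
    buffers = (total_mb * 1024) / buffer_kb
    return wait_count / buffers


small = waits_per_buffer(1000, 1)          # 1 MB moved
large = waits_per_buffer(1000, 10 * 1024)  # 10 GB moved (assumed size)

print(f"1 MB transfer:  {small:.1f} waits per buffer")
print(f"10 GB transfer: {large:.4f} waits per buffer")
```

The same counter value amounts to many waits per buffer on the small transfer and a small fraction of one wait per buffer on the large one.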

More Information

Finding wait and delay counter values