NetBackup™ Backup Planning and Performance Tuning Guide
- NetBackup capacity planning
- Primary server configuration guidelines
- Media server configuration guidelines
- NetBackup hardware design and tuning considerations
- About NetBackup Media Server Deduplication (MSDP)
- MSDP tuning considerations
- MSDP sizing considerations
- Accelerator performance considerations
- Media configuration guidelines
- How to identify performance bottlenecks
- Best practices
- Best practices: NetBackup AdvancedDisk
- Best practices: NetBackup tape drive cleaning
- Best practices: Universal shares
- NetBackup for VMware sizing and best practices
- Best practices: Storage lifecycle policies (SLPs)
- Measuring Performance
- Table of NetBackup All Log Entries report
- Evaluating system components
- Tuning the NetBackup data transfer path
- NetBackup network performance in the data transfer path
- NetBackup server performance in the data transfer path
- About shared memory (number and size of data buffers)
- About the communication between NetBackup client and media server
- Effect of fragment size on NetBackup restores
- Other NetBackup restore performance issues
- Tuning other NetBackup components
- How to improve NetBackup resource allocation
- How to improve FlashBackup performance
- Tuning disk I/O performance
Predictive and sampling cache scheme
Beginning with NetBackup 10.1, a new fingerprint (FP) cache lookup scheme was introduced. The new scheme splits the previous maximum cache size, MaxCacheSize, into two components: the predictive cache (P-cache) and the sampling cache (S-cache).
The P-cache is used to cache the fingerprints that are most likely to be used in the immediate future.
The S-cache is used to cache a percentage of the fingerprints from each backup; a sampled subset of each backup's fingerprints is inserted into the S-cache. The P-cache is searched first for duplicates; when lookup misses reach a threshold, the S-cache is searched for possible matches. If a match is found, the predicted relevant fingerprints are loaded from disk into the P-cache for deduplication.
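The lookup flow can be pictured with a short sketch. This is a simplified illustration of the two-tier scheme described above, not the actual MSDP implementation; the class name, the miss threshold, and the on-disk mapping are all assumptions.

```python
# Simplified sketch of a two-tier fingerprint (FP) lookup, loosely modeled
# on the P-cache/S-cache scheme. Illustrative only; names and thresholds
# are assumptions, not the actual MSDP implementation.

MISS_THRESHOLD = 64  # hypothetical: P-cache misses before consulting S-cache

class TwoTierFPCache:
    def __init__(self, sampled_fps, fp_store):
        self.p_cache = set()        # fingerprints likely needed soon
        self.s_cache = sampled_fps  # sampled subset of FPs from prior backups
        self.fp_store = fp_store    # on-disk map: sample FP -> related FP group
        self.misses = 0

    def is_duplicate(self, fp):
        if fp in self.p_cache:      # fast path: P-cache hit
            self.misses = 0
            return True
        self.misses += 1
        if self.misses >= MISS_THRESHOLD and fp in self.s_cache:
            # A sample hit predicts which prior backup region this data
            # comes from; load its related fingerprints into the P-cache.
            self.p_cache.update(self.fp_store.get(fp, ()))
            self.misses = 0
            return True
        return False                # treat as new data to be stored
```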
For more information about P-cache and S-cache, refer to the NetBackup Deduplication Guide for 10.1 or later.
With NetBackup 10.1, the P-and-S-cache is the default FP lookup scheme for cloud LSUs (logical storage units), while local LSU volumes still default to using MaxCacheSize. The configuration changes and default values for the P-and-S-cache are listed in the following table:
Table: Configuration change and default value for P-and-S-cache
| Configuration | Default value |
|---|---|
| MaxCacheSize | 512 MB |
| MaxPredictiveCacheSize | 40% in NetBackup 10.1; 10% in NetBackup 10.1.1 |
| MaxSamplingCacheSize | 10% |
| EnableLocalPredictiveSamplingCache | true |
| MaxCloudCacheSize | Deprecated and replaced with Max P-cache size and Max S-cache size |
With the above change, the pre-10.1 formula becomes the following, which ensures that memory remains available for uploads:
MaxCacheSize + MaxPredictiveCacheSize + MaxSamplingCacheSize + MaxCloudCacheSize (the cloud in-memory upload cache size) must be less than or equal to the value of UsableMemoryLimit.
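As a quick check, the budget can be verified with a few lines of Python. All values here are hypothetical examples rather than recommended settings; read the real ones from contentrouter.cfg on your server.

```python
# Minimal check of the memory budget formula above (all values in GB).
usable_memory_limit_gb = 64     # UsableMemoryLimit (assumed example)
max_cache_size_gb = 0.5         # MaxCacheSize (512 MB default)
max_predictive_cache_gb = 6.4   # MaxPredictiveCacheSize resolved to GB
max_sampling_cache_gb = 6.4     # MaxSamplingCacheSize resolved to GB
cloud_upload_cache_gb = 12      # cloud in-memory upload cache size

total = (max_cache_size_gb + max_predictive_cache_gb
         + max_sampling_cache_gb + cloud_upload_cache_gb)
assert total <= usable_memory_limit_gb, (
    f"cache budget {total} GB exceeds UsableMemoryLimit "
    f"{usable_memory_limit_gb} GB")
```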
With the P-and-S-cache in NetBackup 10.2, local and cloud LSUs share the same P-and-S-cache, and the previous MaxCacheSize can be ignored. Set the P-and-S-cache sizes carefully: setting them too high wastes memory, while setting them too low leads to a poor deduplication ratio and degrades backup performance.
In general, the S-cache size should be proportional to the backend storage size, while the P-cache size is determined by the maximum number of concurrent jobs. Use the following rules of thumb for P-and-S-cache tuning (a sizing sketch follows the list):
- For each 10 TB of backend storage, allocate 1 GB of RAM for the S-cache.
- For each backup stream, allocate 250 MB of RAM for the P-cache, so the total P-cache allocation should be 250 MB * (maximum number of concurrent jobs).
To leave enough memory for the other processes running on the system, the P-cache and S-cache sizes together should not exceed the MaxUsableMemory value.
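A minimal sizing sketch in Python, assuming a hypothetical 96 TB backend and 40 concurrent jobs; substitute your own values.

```python
# Rule-of-thumb P/S-cache sizing from the guidelines above.
backend_storage_tb = 96    # example MSDP backend storage size
max_concurrent_jobs = 40   # example peak concurrent backup streams

s_cache_gb = backend_storage_tb / 10 * 1.0     # 1 GB RAM per 10 TB of storage
p_cache_gb = max_concurrent_jobs * 250 / 1024  # 250 MB RAM per backup stream

print(f"S-cache ~{s_cache_gb:.1f} GB, P-cache ~{p_cache_gb:.1f} GB")
# Together these must stay below MaxUsableMemory so that the operating
# system, NetBackup processes, spad, mtstrmd, and the spooler have room.
```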
Other processes that also need memory include:
- The basic operating system, with NetBackup if the node runs as a media server
- NetBackup processes, if NetBackup runs on the same node
- The spad cache for the opt-dup source
- The mtstrmd cache for the backup source
- The spooler cache
The NetBackup cloud tier allows each media server to create one or more cloud logical storage units (LSUs). Note that for each cloud LSU created, roughly 1 TB of the MSDP storage pool is reserved for use as the LSU's cloud disk cache.
Starting with NetBackup 10.2, this reserved disk cache can be configured from the NetBackup web UI during LSU creation. The default disk cache size for upload is 12 GB, set by the parameter UploadCacheGB, while the default disk cache size for cloud download is 1 TB, set by the parameters DownloadDataCacheGB and DownloadMetaCacheGB. The default values for these parameters are set in contentrouter.cfg with CloudUploadCacheSize, CloudDataCacheSize, and CloudMetaCacheSize, respectively.
As mentioned earlier, the disk caches occupy space in the MSDP pool. For an MSDP pool with limited storage, the reserved disk cache can consume too much space, leaving little usable space for regular backup jobs. If jobs fail with error codes 129 and 84, it may indicate that there is no space left on the device, even though the MSDP pool still appears to have plenty of space according to df -h and dsstat. In this kind of case, we recommend:
- Limit the number of cloud LSUs created per media server instance, especially if the storage pool is relatively small.
- Reduce the default CloudDataCacheSize and CloudMetaCacheSize values.
- If there is enough memory for uploads to go through the memory cache, UploadCacheGB can be set to (maximum number of concurrent streams * MaxFileSizeMB * 2) in the cloud.json file. For example, if the maximum number of concurrent streams is 100, UploadCacheGB can be set to about 12 GB (100 * 64 MB * 2 = 12.8 GB with the default 64 MB container size).
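A minimal sketch of that calculation, assuming the 64 MB default container size for MaxFileSizeMB and an example stream count:

```python
# UploadCacheGB sizing per the formula above:
# concurrent streams * MaxFileSizeMB * 2.
max_concurrent_streams = 100   # example value
max_file_size_mb = 64          # matches the 64 MB container size noted below

upload_cache_gb = max_concurrent_streams * max_file_size_mb * 2 / 1024
print(f"UploadCacheGB ~= {upload_cache_gb:.1f} GB")  # ~12.5, so about 12 GB
```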
The DownloadDataCacheGB and DownloadMetaCacheGB values, used for the restore and opt-dup download cache, can be as small as a few GB and still function. A larger download disk cache can improve restore and opt-dup performance because it helps avoid downloading the same data object more than once. Tuning the DownloadDataCacheGB and DownloadMetaCacheGB values requires knowing the maximum number of concurrent download streams. In most cases, restoring from the cloud requires downloading the entire data container (64 MB): the container created at backup time usually consists of data from a single client, and MSDP-C downloads the whole container so that the same container is fetched only once during a restore.
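The following estimate is illustrative only; the one-container-per-stream assumption and the safety factor are not documented values, just a way to turn the container size and stream count above into a starting point.

```python
# Illustrative download cache estimate (hypothetical sizing rule).
max_download_streams = 20  # example peak concurrent restore/opt-dup streams
container_mb = 64          # MSDP data container size
safety_factor = 4          # assumed headroom so containers stay cached

download_cache_gb = max_download_streams * container_mb * safety_factor / 1024
print(f"download cache ~{download_cache_gb:.1f} GB")  # a few GB suffices
```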
The default values of the parameters are set in <storage>/etc/puredisk/contentrouter.cfg, and those defaults are used for all LSUs created in the future. The parameters in cloud.json set the values used for each already created LSU; that file is found at <storage>/etc/puredisk/cloud.json.