Veritas NetBackup™ Deduplication Guide
About variable-length deduplication on NetBackup clients
NetBackup deduplication currently uses a fixed-length method: data streams are chunked into fixed-length segments (128 KB) and then processed for deduplication. Fixed-length deduplication is fast and consumes fewer computing resources, and it handles most kinds of data streams efficiently. In some cases, however, it can result in low deduplication ratios.
If your data was modified in a shifting mode, that is, if data was inserted in the middle of a file, variable-length deduplication lets you achieve higher deduplication ratios when you back up the data. Variable-length deduplication reduces backup storage, improves the backup performance, and lowers the overall cost of data protection.
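The effect of a shifting modification on fixed-length segments can be illustrated with a small sketch. It is not NetBackup code: the 8-byte segment size, the helper names, and the sample data are assumptions chosen to keep the example readable (the guide's fixed segments are 128 KB). Appending data leaves the existing segment boundaries in place, whereas inserting a single byte at the front moves every boundary, so no earlier segment can be reused.

```python
import hashlib

SEGMENT_SIZE = 8  # illustrative only; the guide's fixed segments are 128 KB


def fixed_segments(data: bytes, size: int = SEGMENT_SIZE) -> list:
    """Cut data into consecutive fixed-length segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprints(segments: list) -> set:
    """Fingerprint each segment with SHA-256 (a SHA-2 family hash)."""
    return {hashlib.sha256(s).hexdigest() for s in segments}


original = bytes(range(64))          # 64 distinct bytes, 8 segments
appended = original + b"\xff" * 8    # change at the end: boundaries stay put
inserted = b"\xff" + original        # one byte at the front: every boundary shifts

base = fingerprints(fixed_segments(original))
reused_after_append = len(base & fingerprints(fixed_segments(appended)))
reused_after_insert = len(base & fingerprints(fixed_segments(inserted)))

print(reused_after_append, "of", len(base), "segments reused after append")
print(reused_after_insert, "of", len(base), "segments reused after insert")
```

In this toy data set all eight original segments are reused after the append and none after the insert, which is the situation that variable-length deduplication is meant to address.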
Note:
Use variable-length deduplication for data types that do not show a good deduplication ratio with the current MSDP intelligent deduplication algorithm and affiliated streamers. Enabling variable-length deduplication might improve the deduplication ratio, but it might also affect CPU performance.
In variable-length deduplication, every segment has a variable size with configurable size boundaries. The NetBackup client examines the variable-length segments of the data and applies a secure hash algorithm (SHA-2) to them. Each data segment is assigned a unique ID, and NetBackup evaluates whether a data segment with the same ID already exists in the backup. If it does, the segment data is not stored again.
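The following sketch shows the general idea of content-defined segmentation combined with a fingerprint lookup. It is an illustration under stated assumptions, not the MSDP algorithm: the size boundaries, the window, the mask, and the function names are hypothetical, and a CRC recomputed over a small window stands in for whatever boundary detection NetBackup actually uses. SHA-256 is used as one member of the SHA-2 family.

```python
import hashlib
import random
import zlib

MIN_SIZE = 32   # hypothetical lower size boundary, in bytes
MAX_SIZE = 256  # hypothetical upper size boundary, in bytes
WINDOW = 16     # bytes examined when testing for a boundary
MASK = 0x3F     # boundary when the low 6 bits of the window checksum are zero


def variable_segments(data: bytes) -> list:
    """Split data at content-defined boundaries, within the size limits."""
    segments, start = [], 0
    for end in range(1, len(data) + 1):
        size = end - start
        if size < MIN_SIZE:
            continue
        at_boundary = (zlib.crc32(data[end - WINDOW:end]) & MASK) == 0
        if at_boundary or size >= MAX_SIZE:
            segments.append(data[start:end])
            start = end
    if start < len(data):
        segments.append(data[start:])  # trailing remainder
    return segments


def backup(data: bytes, store: dict) -> int:
    """Store only segments whose ID is not yet known; return how many were new."""
    new = 0
    for segment in variable_segments(data):
        segment_id = hashlib.sha256(segment).hexdigest()  # SHA-2 fingerprint as ID
        if segment_id not in store:
            store[segment_id] = segment
            new += 1
    return new


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20_000))
modified = original[:10_000] + b"inserted bytes" + original[10_000:]

store = {}
first = backup(original, store)    # everything is new
repeat = backup(original, store)   # identical data: nothing new to store
shifted = backup(modified, store)  # segments before the insertion are reused
print(first, repeat, shifted)
```

Because each boundary depends only on the bytes just before it, the segments that precede the insertion point are identical in both backups, and segmentation after the insertion tends to fall back onto the same boundaries. Only the segments around the change need to be stored again.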
Warning:
If you enable compression for the backup policy, variable-length deduplication does not work even when you configure it.
The following table describes the effect of variable-length deduplication on the data backup:
Table: Effect of variable-length deduplication
Effect on the deduplication ratio | Variable-length deduplication is beneficial if the data file is modified in a shifting mode, that is, when data is inserted, removed, or modified at a binary level. When such modified data is backed up again, variable-length deduplication achieves a higher deduplication ratio, so the second and subsequent backups have higher deduplication ratios. |
Effect on the CPU | Variable-length deduplication achieves a better deduplication ratio at the cost of being more resource-intensive than fixed-length deduplication. It needs more CPU cycles to compute segment boundaries, and the backup might take longer than with the fixed-length method. |
Effect on data restore | Variable-length deduplication does not affect the data restore process. |
By default, variable-length deduplication is disabled on a NetBackup client. You can enable it by adding parameters to the pd.conf file. To enable the same settings for all NetBackup clients or policies, you must specify all the clients or policies in the pd.conf file.
In a deduplication load balancing scenario, you must upgrade the media servers to NetBackup 8.1.1 or later and modify the pd.conf file on all the media servers. If a backup job selects an older media server (earlier than NetBackup 8.1.1) from the load balancing pool, fixed-length deduplication is used instead of variable-length deduplication. Avoid configuring media servers with different NetBackup versions in a load balancing scenario. The data segments generated by variable-length deduplication differ from those generated by fixed-length deduplication, so load balancing across media servers with different NetBackup versions results in a low deduplication ratio.