NetBackup™ Backup Planning and Performance Tuning Guide
Fingerprint lookup for deduplication
The SHA-2 hashing algorithm is used to generate the fingerprints of the data segments from backup streams. A unique SHA-2 fingerprint represents a unique data segment and is compared to a set of fingerprints representing data segments already in a data store. A lookup match means the data segment is already stored in the system; a lookup miss means the system does not have it and the corresponding data segment needs to be stored.
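The match-or-miss logic above can be illustrated with a minimal sketch. It is not the MSDP implementation: the fixed-size segmentation, the choice of SHA-256 as the SHA-2 variant, the in-memory dictionary standing in for the data store, and the names `fingerprint` and `store_stream` are all assumptions made for illustration.

```python
import hashlib


def fingerprint(segment: bytes) -> str:
    """Return the SHA-256 hex digest of one data segment (assumed SHA-2 variant)."""
    return hashlib.sha256(segment).hexdigest()


def store_stream(stream: bytes, store: dict, segment_size: int = 4) -> tuple:
    """Split a stream into fixed-size segments and store only the unseen ones.

    `store` maps fingerprint -> segment and stands in for the data store.
    Returns (matches, misses) for this stream.
    """
    matches = misses = 0
    for offset in range(0, len(stream), segment_size):
        segment = stream[offset:offset + segment_size]
        fp = fingerprint(segment)
        if fp in store:
            matches += 1          # lookup match: segment is already stored
        else:
            store[fp] = segment   # lookup miss: segment must be stored
            misses += 1
    return matches, misses
```

For example, calling `store_stream(b"AAAABBBBAAAA", {})` sees three segments, two of which are identical, so only two segments end up in the store.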
The set of fingerprints in memory, also known as the fingerprint cache, contains two sets of fingerprints for a given backup job:
- The global fingerprint cache, which is indexed for fast queries and maintained on the deduplication server for as long as the deduplication service is running.
- The job-based fingerprint cache, which is also indexed. It is created on the deduplication client at the beginning of the job and released at the end of the job.
The fingerprints of the last image (by default the last full backup, optionally together with its subsequent incrementals) are fetched from the MSDP server to the OST pdplugin at the beginning of the job. Whether deduplication can be resolved on the OST pdplugin alone depends on whether the client-side cache is big enough to hold all the fingerprints from the last image. Any fingerprint lookup that misses the client-side cache is forwarded to the MSDP server, even though the fingerprint may not exist on the server either.
This two-level fingerprint cache provides high-performance lookups and reduces the memory footprint required on the server side, such as on a NetBackup appliance.
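The two-level lookup described above can be sketched as follows. This is a simplified model under stated assumptions, not the pdplugin or MSDP server code: the class names `ServerCache` and `JobCache`, the `capacity` parameter, and the `lookups` counter are hypothetical and exist only to show when a lookup stays on the client and when it goes to the server.

```python
import hashlib


def fp(data: bytes) -> str:
    """Fingerprint helper: SHA-256 hex digest (assumed SHA-2 variant)."""
    return hashlib.sha256(data).hexdigest()


class ServerCache:
    """Hypothetical stand-in for the global, server-side fingerprint cache."""

    def __init__(self, fingerprints=()):
        self.fingerprints = set(fingerprints)  # set gives fast membership tests
        self.lookups = 0                       # counts lookups sent by clients

    def lookup(self, fingerprint: str) -> bool:
        self.lookups += 1
        return fingerprint in self.fingerprints


class JobCache:
    """Hypothetical job-based cache, built at job start and dropped at job end."""

    def __init__(self, server: ServerCache, last_image: list, capacity: int):
        self.server = server
        # Preload the last image's fingerprints, keeping only what fits.
        self.local = set(last_image[:capacity])

    def is_stored(self, fingerprint: str) -> bool:
        if fingerprint in self.local:
            return True  # resolved on the client side, no server lookup
        # Client-side miss: ask the server, even if it turns out not to have it.
        return self.server.lookup(fingerprint)
```

In this model, if the last image has three fingerprints but `capacity` is 2, a lookup for the third fingerprint misses the job cache and costs one server lookup. A fingerprint that is new to the system also costs one server lookup before it is reported as not stored.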