Veritas NetBackup™ Troubleshooting Guide
Troubleshooting Auto Image Replication
Auto Image Replication replicates the backups that are generated in one NetBackup domain to another media server in one or more NetBackup domains.
Note:
Although Auto Image Replication supports replication across different master server domains, the Replication Director does not.
Auto Image Replication operates like any duplication job except that its job contains no write side. The job must consume a read resource from the disk volume on which the source images reside. If no media server is available, the job fails with status 800.
The Auto Image Replication job operates at the disk volume level. Within the storage unit that the storage lifecycle policy specifies for the source copy, some disk volumes may not support replication. Use the Disk Pools interface of the NetBackup Administration Console to verify that the image is on a disk volume that supports replication. If the interface shows that the disk volume is not a replication source, update or refresh the disk volume(s) in the disk pool. If the problem persists, check your disk device configuration.
The action to take on the automatic replication job depends on several conditions as shown in the following table.
Condition | Action |
---|---|
AIR replication jobs have not started | Verify the following: |
AIR replication jobs are queued but have not started | No media server or I/O stream is available. |
AIR replication jobs fail, for example with status 191 | Check the job details for more information about the failure. For more details, review the |
The following procedure is based on NetBackup that operates in an OpenStorage configuration. This configuration communicates with a Media Server Deduplication Pool (MSDP) that uses Auto Image Replication.
To troubleshoot Auto Image Replication jobs
- Display the storage server information by using the following command:
# bpstsinfo -lsuinfo -stype PureDisk -storage_server storage_server_name
Example output:
    LSU Info:
        Server Name: PureDisk:ss1.acme.com
        LSU Name: PureDiskVolume
        Allocation : STS_LSU_AT_STATIC
        Storage: STS_LSU_ST_NONE
        Description: PureDisk storage unit (/ss1.acme.com#1/2)
        Configuration:
        Media: (STS_LSUF_DISK | STS_LSUF_ACTIVE | STS_LSUF_STORAGE_NOT_FREED | STS_LSUF_REP_ENABLED | STS_LSUF_REP_SOURCE)
        Save As : (STS_SA_CLEARF | STS_SA_OPAQUEF | STS_SA_IMAGE)
        Replication Sources: 0 ( )
        Replication Targets: 1 ( PureDisk:bayside:PureDiskVolume )
        ...
This output shows the logical storage unit (LSU) flags STS_LSUF_REP_ENABLED and STS_LSUF_REP_SOURCE for PureDiskVolume. PureDiskVolume is enabled for Auto Image Replication and is a replication source.
- To verify that NetBackup recognizes these two flags, run the following command:
# nbdevconfig -previewdv -stype PureDisk -storage_server storage_server_name -media_server media_server_name -U
    Disk Pool Name :
    Disk Type : PureDisk
    Disk Volume Name : PureDiskVolume
    ...
    Flag : ReplicationSource
    ...
The ReplicationSource flag confirms that NetBackup recognizes the LSU flags.
- To display the replication targets by using the raw output, run the following command:
# nbdevconfig -previewdv -stype PureDisk -storage_server storage_server_name -media_server media_server_name
    V_5_ DiskVolume < "PureDiskVolume" "PureDiskVolume" 46068048064 46058373120 0 0 0 16 1 >
    V_5_ ReplicationTarget < "bayside:PureDiskVolume" >
The display shows that the replication target is a storage server called bayside and the LSU (volume) name is PureDiskVolume.
- To ensure that NetBackup captured this configuration correctly, run the following command:
# nbdevquery -listdv -stype PureDisk -U
    Disk Pool Name : PDpool
    Disk Type : PureDisk
    Disk Volume Name : PureDiskVolume
    ...
    Flag : AdminUp
    Flag : InternalUp
    Flag : ReplicationSource
    Num Read Mounts : 0
    ...
This listing shows that disk volume PureDiskVolume is configured in disk pool PDpool, and that NetBackup recognizes the replication capability on the source side. A similar nbdevquery command on the target side should display ReplicationTarget for its disk volume.
- If NetBackup does not recognize the replication capability, run the following command:
# nbdevconfig -updatedv -stype PureDisk -dp PDpool
- To ensure that you have a storage unit that uses this disk pool, run the following command:
# bpstulist
    PDstu 0 _STU_NO_DEV_HOST_ 0 -1 -1 1 0 "*NULL*" 1 1 51200 *NULL* 2 6 0 0 0 0 PDpool *NULL*
The output shows that storage unit PDstu uses disk pool PDpool.
- Check the settings on the disk pool by running the following command:
nbdevquery -listdp -stype PureDisk -dp PDpool -U
    Disk Pool Name : PDpool
    Disk Pool Id : PDpool
    Disk Type : PureDisk
    Status : UP
    Flag : Patchwork
    ...
    Flag : OptimizedImage
    Flag : ReplicationTarget
    Raw Size (GB) : 42.88
    Usable Size (GB) : 42.88
    Num Volumes : 1
    High Watermark : 98
    Low Watermark : 80
    Max IO Streams : -1
    Comment :
    Storage Server : ss1.acme.com (UP)
Max IO Streams is set to -1, which means the disk pool has unlimited input-output streams.
- To check the list of media servers that are credentialed to access the storage servers and their disk pools, run the following command:
# tpconfig -dsh -all_hosts
    ==============================================================
    Media Server: ss1.acme.com
    Storage Server: ss1.acme.com
    User Id: root
    Storage Server Type: BasicDisk
    Storage Server Type: SnapVault
    Storage Server Type: PureDisk
    ==============================================================
This disk pool has only one media server, ss1.acme.com. You have completed the storage configuration validation.
- The last phase of validation is the storage lifecycle policy configuration. To run Auto Image Replication, the source copy must be on storage unit PDstu. Run the following command (for example):
nbstl woodridge2bayside -L
    Name: woodridge2bayside
    Data Classification: (none specified)
    Duplication job priority: 0
    State: active
    Version: 0
    Destination 1
        Use for: backup
        Storage: PDstu
        Volume Pool: (none specified)
        Server Group: (none specified)
        Retention Type: Fixed
        Retention Level: 1 (2 weeks)
        Alternate Read Server: (none specified)
        Preserve Multiplexing: false
        Enable Automatic Remote Import: true
        State: active
        Source: (client)
        Destination ID: 0
    Destination 2
        Use for: 3 (replication to remote master)
        Storage: Remote Master
        Volume Pool: (none specified)
        Server Group: (none specified)
        ...
        Preserve Multiplexing: false
        Enable Automatic Remote Import: false
        State: active
        Source: Destination 1 (backup:PDstu)
        Destination ID: 0
To troubleshoot the Auto Image Replication job flow, use the same command lines as you use for other storage lifecycle policy managed jobs. For example, to list the images that have been duplicated to remote master, run the following:
nbstlutil list -copy_type replica -U -copy_state 3
To list the images that have not been duplicated to remote master (either pending or failed), run the following:
nbstlutil list -copy_type replica -U -copy_incomplete
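The two listings above differ only in their final option, so a small wrapper can keep them in one place when you script the check. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that nbstlutil is on the PATH of the host where the script runs. The options themselves are the ones shown in the commands above.

```python
import subprocess

# Assumption: nbstlutil is on PATH; otherwise set this to the full path.
NBSTLUTIL = "nbstlutil"


def replica_list_args(incomplete=False):
    """Build the argument list for listing replica copies.

    incomplete=False selects copies in state 3 (complete);
    incomplete=True selects copies that are pending or failed.
    """
    args = [NBSTLUTIL, "list", "-copy_type", "replica", "-U"]
    args += ["-copy_incomplete"] if incomplete else ["-copy_state", "3"]
    return args


def run_replica_list(incomplete=False):
    """Run the listing and return its text output (needs a NetBackup host)."""
    result = subprocess.run(
        replica_list_args(incomplete),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Only `replica_list_args` can be exercised without a NetBackup installation; `run_replica_list` raises an exception if the command is missing or exits with a nonzero status.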
- To show the status for completed replication copies, run the following command:
nbstlutil repllist -U
    Image:
        Master Server : ss1.acme.com
        Backup ID : woodridge_1287610477
        Client : woodridge
        Backup Time : 1287610477 (Wed Oct 20 16:34:37 2010)
        Policy : two-hop-with-dup
        Client Type : 0
        Schedule Type : 0
        Storage Lifecycle Policy : woodridge2bayside2pearl_withdup
        Storage Lifecycle State : 3 (COMPLETE)
        Time In Process : 1287610545 (Wed Oct 20 16:35:45 2010)
        Data Classification ID : (none specified)
        Version Number : 0
        OriginMasterServer : (none specified)
        OriginMasterServerID : 00000000-0000-0000-0000-000000000000
        Import From Replica Time : 0 (Wed Dec 31 18:00:00 1969)
        Required Expiration Date : 0 (Wed Dec 31 18:00:00 1969)
        Created Date Time : 1287610496 (Wed Oct 20 16:34:56 2010)
    Copy:
        Master Server : ss1.acme.com
        Backup ID : woodridge_1287610477
        Copy Number : 102
        Copy Type : 3
        Expire Time : 1290288877 (Sat Nov 20 15:34:37 2010)
        Expire LC Time : 1290288877 (Sat Nov 20 15:34:37 2010)
        Try To Keep Time : 1290288877 (Sat Nov 20 15:34:37 2010)
        Residence : Remote Master
        Copy State : 3 (COMPLETE)
        Job ID : 25
        Retention Type : 0 (FIXED)
        MPX State : 0 (FALSE)
        Source : 1
        Destination ID :
        Last Retry Time : 1287610614
    Replication Destination:
        Source Master Server: ss1.acme.com
        Backup ID : woodridge_1287610477
        Copy Number : 102
        Target Machine : bayside
        Target Info : PureDiskVolume
        Remote Master : (none specified)
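Several steps in this procedure come down to reading a flag or a target name out of command output. The following Python sketch shows one way to automate that check against saved output. It is not part of NetBackup: the function names are hypothetical, the sample text is copied from the examples in this procedure, and the parser assumes that the -U output keeps the "Key : Value" layout shown there.

```python
import re
from collections import defaultdict

# Sample text copied from the nbdevquery -listdv example in this procedure.
SAMPLE_LISTDV = """\
Disk Pool Name   : PDpool
Disk Type        : PureDisk
Disk Volume Name : PureDiskVolume
...
Flag             : AdminUp
Flag             : InternalUp
Flag             : ReplicationSource
Num Read Mounts  : 0
"""

# Sample text copied from the raw nbdevconfig -previewdv example.
SAMPLE_RAW = (
    'V_5_ DiskVolume < "PureDiskVolume" "PureDiskVolume" '
    "46068048064 46058373120 0 0 0 16 1 >\n"
    'V_5_ ReplicationTarget < "bayside:PureDiskVolume" >\n'
)


def parse_u_listing(text):
    """Collect 'Key : Value' lines into a dict of lists.

    Values are lists because keys such as Flag repeat.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line, for example "..."
        fields[key.strip()].append(value.strip())
    return dict(fields)


def has_flag(fields, flag):
    """Return True if the parsed listing contains the given Flag value."""
    return flag in fields.get("Flag", [])


def parse_replication_targets(raw_text):
    """Return (storage_server, volume) pairs from raw ReplicationTarget records."""
    targets = []
    pattern = r'ReplicationTarget\s*<\s*"([^"]+)"\s*>'
    for match in re.finditer(pattern, raw_text):
        server, _, volume = match.group(1).partition(":")
        targets.append((server, volume))
    return targets


if __name__ == "__main__":
    volume = parse_u_listing(SAMPLE_LISTDV)
    print("Replication source:", has_flag(volume, "ReplicationSource"))
    print("Targets:", parse_replication_targets(SAMPLE_RAW))
```

Because the parser splits each line at the first colon only, values that contain colons, such as the timestamps in the nbstlutil repllist output, stay intact. To use it on a live system, capture the command output to a string and pass that string in place of the sample text.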