Product Documentation

Last Published: 2019-11-21

Product(s): Resiliency Platform & CloudMobility (3.4)

Predefined risks in Resiliency Platform

Table: Predefined risks lists the predefined risks available in Resiliency Platform. These risks are reflected in the current risk report and the historical risk report.

Table: Predefined risks

Risks	Description	Risk detection time	Risk type	Affected operation	Fix if violated
Veritas Infoscale Operations Manager disconnected	Checks for Veritas Infoscale Operations Manager to Resiliency Manager connection state	1 minute	Error	All operations	Check Veritas Infoscale Operations Manager reachability Try to reconnect Veritas Infoscale Operations Manager
vCenter Password Incorrect	Checks if vCenter password is incorrect	15 minutes	Error	On primary site: start or stop operations On secondary site: migrate or takeover operations	In case of a password change, resolve the password issue and refresh the vCenter configuration
VM tools not installed	Checks if VM tools are not installed. This may affect IP customization and virtual machine shutdown	5 minutes	Error	Migrate Stop	In case of VMware, install VMware Tools. In case of Hyper-V, install Hyper-V Integration Tools
Snapshot reverted on Virtual Machine	Checks if snapshot has been reverted on virtual machine	5 minutes	Error	Resiliency Platform Data Mover replication	Perform the Resync operation on the resiliency group.
Resiliency Platform Data Mover daemon crashed	Resiliency Platform Data Mover filter is not able to connect to its counterpart in ESX. The replication process has stopped and is at risk	5 minutes	Error	Resiliency Platform Data Mover replication	To continue the replication, you can move (VMotion) the virtual machine to a different ESX node in the cluster. Troubleshoot the issue with this ESX node or raise a support case with Veritas.
DataMover virtual machine in no-op mode	Checks if VM Data Mover filter is not able to connect to its counterpart in ESX	5 minutes	Error	Resiliency Platform Data Mover replication	In order to continue the replication, you can move (VMotion) the VM to a different ESX node in the cluster and either troubleshoot the issue with this ESX node or raise a support case with Veritas
Veritas Replication policy has been detached	Veritas Replication policy has been detached from the disk associated with virtual machine.	5 minutes	Error	Migrate	Perform Resync operation on the affected resiliency group.
Asset disk configuration changed	Checks if disk configuration of any of the assets in the resiliency group has changed.	30 minutes	Error	Migrate Rehearsal	Refresh the respective hosts, vCenter servers or Hyper-V servers and the cloud discovery. After refresh, probe the risk. If the risk still exists after you perform these steps, edit the resiliency group to first remove the impacted virtual machine and then add it back.
Asset NIC configuration changed	Checks if NIC configuration of any of the assets in the resiliency group has changed.	30 minutes	Error	Migrate Resync	If the resiliency group is online on the target data center, either revert the NIC changes made on the virtual machines or suppress the risk to be able to migrate the assets back to the source data center. If the resiliency group is online on the source data center, edit the resiliency group with the Edit Configuration or Customize Network option to update the NIC configuration.
Invalid NIC Configuration	One or more NICs on the host are not configured properly.	Real time, while creating resiliency group	Error	Create resiliency group	Ensure that the keys NAME, DEVICE and HWADDR have appropriate values as per the details of each NIC in its configuration file.
Global user deleted	Checks if there are no global users. In this case, the user will not be able to customize the IP for Windows machines in VMware environment	Real time	Warning	Migrate Takeover	Edit the resiliency group or add a Global user
Failure to validate Windows Global User credentials for IP customization	This risk is raised if: Windows Global User is not configured. Windows Global User does not have appropriate credentials. Virtual machine is offline while configuring resiliency group for disaster recovery.	After the resiliency group is configured for disaster recovery	Warning	Rehearsal Migrate Takeover	Add Windows Global Users with appropriate credentials. Edit the resiliency group using the Network Customization option to resolve the risk.
Missing heartbeat from Resiliency Manager	Checks for heartbeat failure from a Resiliency Manager	5 minutes	Error	All	Fix the Resiliency Manager connectivity issue
Infrastructure Management Server disconnected	Checks for Infrastructure Management Server (IMS) to Resiliency Manager (RM) connection state	1 minute	Error	All	Check IMS reachability Try to reconnect IMS
Storage Discovery Host down	Checks if the discovery daemon is down on the storage discovery host	15 minutes	Error	Migrate	Resolve the discovery daemon issue
DNS removed	Checks if DNS is removed from the resiliency group where DNS customization is enabled	Real time	Warning	Migrate Takeover	Edit the resiliency group and disable DNS customization
IOTap driver not configured	Checks if the IOTap driver is not configured	2 hours	Error	None	Configure the IOTap driver This risk is removed when the workload is configured for disaster recovery.
VMware Discovery Host Down	Checks if the discovery daemon is down on the VMware Discovery Host	15 minutes	Error	Migrate	Resolve the discovery daemon issue
VM restart is pending	Checks if the virtual machine has not been restarted after add host operation	2 hours	Error	Create resiliency group	Restart the virtual machine after add host operation
New virtual machine added to replication storage	Checks if a virtual machine that is added to a Veritas Replication Set on a primary site, is not a part of the resiliency group	5 minutes	Error	Migrate Takeover Rehearsal	Add the virtual machine to the resiliency group
Replication lag exceeding RPO	Checks if the replication lag exceeds the thresholds defined for the resiliency group. This risk affects the SLA for the services running on your production data center	5 minutes	Warning	Migrate Takeover	Check if the replication lag exceeds the RPO that is defined in the Service Objective
Replication state broken/critical	Checks if the replication is not working or is in a critical condition for each resiliency group	5 minutes	Error	Migrate Takeover	Contact the enclosure vendor. In case of Resiliency Platform Data Mover, see Admin Wait state codes or raise a support case with Veritas
Remote mount point already mounted	Checks if the mount point is not available for mounting on target site for any of the following reasons: Mount point is already mounted Mount point is being used by other assets	Native (ext3, ext4, NTFS): 30 minutes Virtualization (VMFS, NFS): 6 hours	Warning	Migrate Takeover	Unmount the mount point that is already mounted or is being used by other assets. The risk is resolved after 30 minutes if a cleanup rehearsal, migrate, or takeover operation is performed successfully and VMware vCenter is refreshed within 30 minutes.
Disk utilization critical	Checks if at least 80% of the disk capacity is being utilized. The risk is generated for all the resiliency groups associated with that particular file system	Native (ext3, ext4, NTFS): 30 minutes Virtualization (VMFS, NFS): 6 hours	Warning	Migrate Takeover Rehearsal	Delete or move some files or uninstall some non-critical applications to free up some disk space
ESX not reachable	Checks if the ESX server is in a disconnected state	5 minutes	Error	On primary site: start or stop operations On secondary site: migrate or takeover operations	Resolve the ESX server connection issue
vCenter Server not reachable	Checks if the virtualization server is unreachable or if the password for the virtualization server has changed	5 minutes	Error	On primary site: start or stop operations On secondary site: migrate or takeover operations	Resolve the virtualization server connection issue In case of a password change, resolve the password issue
Insufficient compute resources on failover target	Checks if there are insufficient CPU resources on failover target in a virtual environment	6 hours	Warning	Migrate Takeover	Reduce the number of CPUs assigned to the virtual machines on the primary site to match the available CPU resources on failover target
Host not added on recovery data center	Checks if the host is not added to the IMS on the recovery data center	30 minutes	Error	Migrate	Check the following and fix: Host is up on recovery data center Host is accessible from recovery datacenter IMS Time is synchronized between host and recovery datacenter IMS
NetBackup Notification channel disconnected	Checks for NetBackup Notification channel connection state	5 minutes	Error	Restore	Check if the NetBackup Notification channel is added to the NetBackup master server
Backup image violates the defined RPO	Checks if the backup image violates the defined RPO	30 minutes	Warning	No operation	Check the connection state of NetBackup Notification channel Check for issues due to which backup images are not available
NetBackup master server disconnected	Checks if NetBackup master server is disconnected or not reachable	5 minutes	Error	Restore	Check if IMS is added as an additional server to the NetBackup master server
Assets do not have copy policy	Checks if the assets do not have a copy policy	3 hours	Warning	No operation	Set up copy policy and then refresh the NetBackup master server
Target replication is not configured	Checks if the target replication is not configured	3 hours	Warning	No operation	Configure target replication and then refresh the NetBackup master server
Disabled NetBackup Policy	Checks if NetBackup policy associated with the virtual machine is disabled	3 hours	Warning	No operation	Fix the disabled policy
Replication block tracking disk not found	Checks for the replication block tracking (RBT) disk. If the replication block tracking disk is not found, the virtual machine does not get configured for remote recovery and the replication stops	30 minutes	Error	Migrate	Ensure that the RBT disk is attached to the virtual machine. After the risk gets resolved, reboot the virtual machine and then perform the resync operation to avoid disk corruption during migrate or migrate back. If you are not able to locate the RBT disk, perform the following steps in the order listed: Remove the virtual machine from the resiliency group. Add it again to a resiliency group to ensure that the virtual machine is protected.
Members are manually deleted from network groups	Network group goes into faulted state when a member is manually removed. The risk is circulated to resiliency group	Immediate	Warning	Migrate, Rehearse	Edit the network group by adding the missing member and then edit the resiliency group details
Members deleted from network groups	Network group goes into faulted state when a discovered member gets deleted from IMS. The risk is circulated to resiliency group	5 minutes	Warning	Migrate, Rehearse	Edit the network group by adding the missing member and then edit the resiliency group details
Virtual machine configuration not backed up	Unable to take a backup of virtual machine configuration file.	Immediate	Error	Create resiliency group Migrate Rehearse	Check the state of the IMS and its corresponding assets such as the hypervisors and vCenter servers. Perform edit resiliency group operation.
Unable to backup latest Virtual machine configuration	Unable to take a backup of the latest configuration file of the virtual machine.	Immediate	Warning	Edit resiliency group Migrate Rehearse	Check the state of the IMS and its corresponding assets such as the hypervisors and vCenter servers. Perform edit resiliency group operation.
Datastore for disk has changed to X, this datastore is not part of resiliency group	If virtual disk is moved to a non-compliant datastore. Applicable for 3rd party replication technology	5 to 15 minutes	Error	All operations except start and stop resiliency group	Edit the resiliency group or move the disk to a datastore which is part of the resiliency group.
Datastore for configuration file has changed to X, this datastore is not part of resiliency group. Previous datastore was Y.	If the virtual machine configuration file is moved to a non-compliant datastore. Applicable for 3rd party replication technology	5 to 15 minutes	Error	All operations except start and stop resiliency group	Edit the resiliency group or move the disk to a datastore which is part of the resiliency group.
Disk path has changed	Displayed when virtual machine snapshot is taken. Risk is resolved automatically after updating the blob.	5 to 15 minutes	Error	All operations	Risk is automatically resolved.
New datastore added to the consistency group is not part of resiliency group	New datastore added to consistency group Applicable for 3rd party replication technology	6 hours	Error	Migrate Takeover Resync	Edit the resiliency group
Datastore removed from resiliency group	Datastore removed from consistency group Applicable for 3rd party replication technology	6 hours	Error	Migrate Takeover Resync	Edit the resiliency group
Veritas Replication VIB upgrade pending	Checks if the latest version of the Veritas Replication VIB is installed on the ESXi cluster.	6 hours	Error	None	Upgrade the Veritas Replication VIB to the latest version.
Veritas Replication VIB is in partial state.	Checks if the Veritas Replication VIB installation on ESXi cluster is in partial or unknown state.	6 hours	Error	If the risk is on the target ESXi cluster then block the migrate and rehearsal operations. If the risk is on the source ESXi cluster then block the resync operation.	Perform Resolve and Verify operation on the ESXi cluster to fix the installation issues.
Insufficient privileges on vCenter server	Operations on the resiliency group may fail because of missing privileges on vCenter server data centers.	6 hours	Warning	One or more operations on resiliency group may fail because of missing privileges on vCenter server data center.	Ensure that appropriate privileges are configured on vCenter server data center before invoking any operation. Refer to the documentation for the required privileges.
Infrastructure Management Server data reporting disabled	Infrastructure Management Server cannot report data to Resiliency Manager due to version incompatibility	As soon as IMS connects to the Resiliency Manager after the Resiliency Manager upgrade	Error	All	Upgrade IMS to the latest version that is specified in the risk message
DRS Datastore Is Added Or Removed	New datastore is added to the cluster or is removed from the cluster	6 Hours	Warning	None	Edit the resiliency group
Datastore Cluster Deleted	Datastore cluster is deleted from the data center	6 Hours	Error	Rehearsal Migrate Resync	Edit the resiliency group
SNMP Trap Receiver Not Added Or Deleted	SNMP trap receiver is either not added or is deleted	6 Hours	Error	Start Stop Rehearsal Takeover Migrate Resync	Add the SNMP trap receiver
vCloud Director discovery failed	Checks whether vCloud Director assets can be discovered using the vCloud Director configuration	10 minutes	Error	None	Check the user privileges and then refresh the discovery for vCloud Director. If the password has changed, edit the cloud configuration to update the new password.
All the hosts on the applications are not reachable	All the hosts for the application are not reachable	15 minutes	Error	None	Check the connectivity with the application hosts
Application host is disconnected due to change in MAC address	Application Host is in Disconnected state	15 minutes	Error	Rehearsal Migrate	Retry Add Host operation
Assets does not have copy policy	Checks if the assets do not have a copy policy	When vrp_host is not associated with a copy policy	Warning	None	Check if any asset has no copy policy
Backup image violates the defined RPO	Checks if the backup image violates the defined RPO	Immediate	Warning		Check the connection state of NetBackup Notification channel. Check for issues due to which backup images are not available.
CPU Usage Critical	Available compute capacity on the recovery site may be inadequate for recovering this application. This risk affects the recoverability of the services running on your production data center.	6 hours	Warning	None	Reduce the number of CPUs assigned to the virtual machines on the primary site to match the available CPU resources on failover target
Incorrect .Net version is installed	The expected .NET version is not installed or it is not compatible with the PowerShell version	2 hours	Error	On Primary site: migrate and takeover operations	Ensure that the .NET version is installed with its compatible PowerShell version. Refer to the HSCL for compatible versions of .NET and PowerShell.
Editing the resiliency group is required	Resiliency group needs an upgrade or an Edit operation.	Immediate	Warning	None	Edit the resiliency group using the Edit Configuration intent. Ensure that the resiliency group is online on the source datacenter before performing the edit operation
Evacuation plan for data center has been invalidated.	Evacuation plan for data center has been invalidated due to adding, deleting, or updating a resiliency group or a VBS	Immediate	Error	None	Regenerate the evacuation plan.
Host reboot is pending after upgrade	The OS is not rebooted after upgrade operation	Immediate	Warning	None	Reboot the virtual machine after the upgrade operation
Mount point is deleted	Checks if the mount point on which the assets of the resiliency group are configured is deleted or renamed	6 hours	Error	Migrate Rehearsal	Remount using the same mount point; otherwise, edit the resiliency group
PowerShell is not initialized	PowerShell is not initialized	2 hours	Error	On Secondary site: migrate and rehearsal operations	Check PowerShell Initialization on host
PowerShell is not installed	PowerShell is not installed	2 hours	Error	On Secondary site: migrate and rehearsal operations	Install PowerShell (version > 2.0) on host
Powershell Version is incorrect	Expected PowerShell version not found	2 hours	Error	On Secondary site: migrate and takeover operations	Install PowerShell version 2.0 or later
Registry Parameter LSI_SAS is not set	Registry Parameter LSI_SAS is not set	2 hours	Error	On Secondary site: migrate and rehearsal operations	Change the value for registry parameter LSI_SAS->Start to 0 and refresh host discovery
Replication Gateway is not reachable	The Replication Gateway is down or not reachable from the IMS	15 minutes	Error	None	Make sure the replication gateway appliance is running and is reachable from the IMS
Replication state synchronizing	Data synchronization is in progress.	5 minutes	Warning	None	Wait for synchronization to complete. The replication state should be Active (Connected | Consistent)
Resync operation is pending on a resiliency group	Resync operation is pending on current resiliency group	Immediate	Error	On Secondary site: migrate operation	Execute Resync operation on current resiliency group
Resiliency group configuration drift	Disk configuration for asset(s) in the resiliency group is changed. This is a configuration drift.	2 minutes	Error	On Primary site: rehearsal operation On Secondary site: migrate, resync, and rehearsal operations	Refresh the respective hosts, vCenter servers or Hyper-V servers and the cloud discovery. After refresh, probe the risk. If the risk still exists, remove the virtual machine from the resiliency group and re-add using the Edit operation
Resiliency group configuration error	The disk size of the virtual machine in the resiliency group has changed. This is a configuration error	2 hours	Error	On Secondary site: migrate and resync operation	Editing the size of a disk is not supported. Restore the disk size. For a resiliency group that has multiple virtual machines, edit the resiliency group to remove the affected hosts and then add them again to re-protect. For a resiliency group that has only one virtual machine, delete the resiliency group and create it again.
Resiliency group outage in datacenter	Outage has been declared for the resiliency group in the datacenter	Immediate	Error	None	Perform remediation steps to clear outage in the specified datacenter. Run a Resync or Clear outage operation (as applicable) to indicate that the outage has been cleared
Data sync failed between Resiliency Manager and database.	Data sync failed between Resiliency Manager and database.	As soon as the vrp_rm vertex gets updated with property db_status as value "Data sync failed"	Error	None	Perform Resync operation for Resiliency Manager
SAN Policy Offline Shared	SAN policy on the Windows host is Offline Shared	2 hours	Warning	None	Change the SAN policy on the Windows host to Online Shared and refresh the host discovery information
Stale configuration :: Object Deleted	Asset is unavailable	As soon as discovery reports deletion of addressable objects.	Error	On Primary site: start, stop, migrate, takeover, rehearsal, resync, and restore operations On Secondary site: start, stop, migrate, and resync operations	Reconfigure the asset
Stale configuration :: Object Unreachable	Asset is unreachable	As soon as discovery reports DISCONNECTED or NOT REACHABLE fault for addressable objects.	Error	On Primary site: start, stop, migrate, takeover, rehearsal, resync, and restore operations On Secondary site: start, stop, migrate, and resync operations	Check the connectivity of the asset.
The migrated virtual machine is not added to the target IMS.	The migrated virtual machine is not added to the target IMS.	45 minutes	Error	On Secondary site: migrate and resync operation	Refer to the documentation to know the possible reasons for failure of add host operation
Unable to get VMX	Unable to backup virtual machine configuration file	Immediate	Error	On Primary site: rehearsal, migrate and takeover operations	Check the state of IMS, its corresponding assets such as the hypervisors and vCenter servers. Perform edit resiliency group operation.
Unable to update virtual machine configurations file	Unable to backup latest virtual machine configuration	Immediate	Error	None	Check the state of IMS, its corresponding assets such as the hypervisors and vCenter servers. Perform edit resiliency group operation.
vCenter server is removed from IMS	vCenter server is removed from IMS	Immediate	Error	On Primary site: start, stop, rehearsal, and migrate operations On Secondary site: start, stop, rehearsal, and migrate operations	Add the vCenter server to the IMS.
VCS Servicegroup Faulted	VCS Servicegroup is in Faulted state	1 hour	Error	None	Resolve the fault on VCS Servicegroup
Insufficient quota on target vCloud Director	Sufficient quota (CPUs/Memory/Storage) is not available on target vCloud Director.	5 minutes	Error	None	Ensure that sufficient quota is available on the target vCloud Director
Virtual machine is deleted	One or more virtual machines are deleted or unregistered. The virtual machines belong to a resiliency group that is configured for remote recovery. This affects the recoverability of the resiliency group.	6 hours	Error	On Secondary site: migrate operation	Edit the resiliency group to remove the virtual machines that are deleted or unregistered.
Virtual machine is not protected	Virtual machine is not configured for remote recovery	Immediate	Error	None	If the virtual machine is in production data center then configure the virtual machine for remote recovery. If the virtual machine is in vCloud data center then ensure that disk.EnableUUID property is set to TRUE on the VRP_VAPP_TEMPLATE virtual machine as well as on the migrated virtual machine. After the risk is resolved, perform the Resync operation to avoid disk corruption during migrate or migrate back operation.
VMware discovery failed	VMware discovery has failed	6 hours	Error	None	In case of a password change, resolve the password issue and refresh the vCenter server configuration
IO Filter is not replicating the IOs from the virtual machine	IO Filter has encountered a fatal error	When IMS receives a NOOP SNMP event.	Error	On Primary site: migrate resync deepstart (Perform start operation after reverse replication is complete) On Secondary site: migrate resync deepstart (Perform start operation after reverse replication is complete)	If IO filter has encountered errors, either invoke the edit resiliency group workflow to remove and re-add the asset from the resiliency group or delete the resiliency group and create it again
Cloud discovery failed	Cloud discovery has failed.	After 5 minutes	Error	On primary site: migrate, takeover, rehearsal, cleanup rehearsal, and resync operations. On secondary site: start, stop, migrate, and resync operations	Edit the cloud configuration to resolve the issue. If risk persists contact Veritas Support.
Cloud authentication failed	Cloud credentials are incorrect	After 5 minutes	Error	On primary site: migrate, takeover, rehearsal, cleanup rehearsal, and resync operations. On secondary site: start, stop, migrate, and resync operations	Edit cloud configuration and provide correct credentials to resolve the issue. In case of AWS, check the IAM role with proper privileges is attached to IMS.
Cloud connection timeout	Connection timed out fetching information about cloud resources.	After 5 minutes	Error	On primary site: migrate, takeover, rehearsal, cleanup rehearsal, and resync operations. On secondary site: start, stop, migrate, and resync operations	Resolve network connectivity between IMS and cloud data center and then refresh the cloud configuration.
NTP Time Sync Failed	NTP time skew. Time skew must be less than 3 seconds.	5 minutes	Warning	None	Synchronize with NTP server.
NTP Time Unsynchronized	Not able to synchronize with the NTP server.	5 minutes	Warning	None	Synchronize with NTP server.
NTP Time Indeterminate	NTP status indeterminate	5 minutes	Warning	None	Synchronize with NTP server.
Resiliency Group Configuration Drift for Network changed of some of the assets in the Resiliency Group	This risk is raised if the network of some of the assets in the Resiliency Group is changed after the Resiliency Group is created. The change can be in the VLAN, vSwitch or cloud network settings.		Error		The risk is resolved when the deleted network is discovered again in Veritas Resiliency Platform, or after the Resiliency Group is edited successfully.
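
Several of the risks in the table are simple threshold checks. The following minimal Python sketch illustrates three of them using only the values stated in the table: disk utilization of at least 80%, NTP time skew that must be less than 3 seconds, and replication lag that exceeds the RPO. It is not Resiliency Platform code. The `Risk` class, the function names, and the input parameters are hypothetical and exist only for illustration, and treating the lag check as a strict comparison against the RPO is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted from the table; all names below are hypothetical.
DISK_UTILIZATION_THRESHOLD = 0.80  # "at least 80% of the disk capacity"
MAX_NTP_SKEW_SECONDS = 3.0         # "Time skew must be less than 3 seconds"


@dataclass
class Risk:
    name: str       # value from the Risks column
    risk_type: str  # value from the Risk type column


def check_disk_utilization(used_bytes: int, capacity_bytes: int) -> Optional[Risk]:
    """Return the risk when at least 80% of the capacity is in use."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    if used_bytes / capacity_bytes >= DISK_UTILIZATION_THRESHOLD:
        return Risk("Disk utilization critical", "Warning")
    return None


def check_ntp_skew(skew_seconds: float) -> Optional[Risk]:
    """Return the risk when the absolute skew is not less than 3 seconds."""
    if abs(skew_seconds) >= MAX_NTP_SKEW_SECONDS:
        return Risk("NTP Time Sync Failed", "Warning")
    return None


def check_replication_lag(lag_seconds: float, rpo_seconds: float) -> Optional[Risk]:
    """Return the risk when the lag exceeds the RPO (strict comparison assumed)."""
    if lag_seconds > rpo_seconds:
        return Risk("Replication lag exceeding RPO", "Warning")
    return None
```

Each function returns `None` when the condition is not violated, so a caller could collect the non-`None` results to build something similar to a current risk report.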