InfoScale VxVM 7.4.2.3402 dmp_native_support rpool fault handling change to import ZFS zpools using -N to avoid ZFS nested mount conflicts
Problem
If Veritas Volume Manager (VxVM) detects a fault with the ZFS rpool during the Solaris boot sequence, all ZFS zpools are exported and imported, which can lead to ZFS nested mount conflicts later in the boot sequence.
The SMF service "svc:/system/filesystem/local:default" will attempt to perform ZFS related mounts later in the boot sequence and can fail due to missing ZFS rpool related mounts.
Veritas Volume Manager (VxVM) provides DMP support for the ZFS root pool (rpool) and other ZFS zpools via the DMP tunable "dmp_native_support".
To verify the value of the dmp_native_support tunable, use the following command:
# vxdmpadm gettune dmp_native_support
Tunable Current Value Default Value
-------------------------- ------------- ---------------
dmp_native_support on off
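The check above can be scripted. The following is a minimal sketch, not part of the Veritas tooling: the function name dmp_native_support_value is hypothetical, and it assumes the output layout shown above, with the tunable name in the first column and the current value in the second.

```shell
# Hypothetical helper: read `vxdmpadm gettune dmp_native_support` output on
# stdin and print the current value (second column of the tunable's row).
# Header and separator rows are skipped because their first field differs.
dmp_native_support_value() {
    awk '$1 == "dmp_native_support" { print $2 }'
}

# On a live system (assumes vxdmpadm is in PATH):
#   vxdmpadm gettune dmp_native_support | dmp_native_support_value
```

A result of "on" indicates that the boot-time behavior described in this article may apply to the host.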
The dmp_native_support tunable automatically enables or disables DMP support for the ZFS root pool along with other zpools.
DMP support for ZFS root pool requires Solaris 11.1 or later.
Solaris MPxIO (multi-pathing) is not supported in Solaris LDOM (Oracle VM for SPARC) environments, I/O domains & LDOM guests. MPxIO is not supported when Veritas DMP_NATIVE_SUPPORT is enabled.
Error Message
The SMF service "svc:/system/filesystem/local:default" is reported in "maintenance" mode.
# svcs svc:/system/filesystem/local:default
STATE STIME FMRI
maintenance 2023-02-08T12:02:30 svc:/system/filesystem/local:default
Sample output
# tail -f /var/svc/log/system-filesystem-local:default.log
cannot mount 'rpool/export' on '/export': directory is not empty
cannot mount 'rpool/export' on '/export': directory is not empty
cannot mount 'rpool/export/home' on '/export/home': failure mounting parent dataset
WARNING: /usr/sbin/zfs mount -a failed: one or more file systems failed to mount
[ 2023 Jan 18 11:40:32 Method "start" exited with status 95. ]
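To pick the affected datasets out of a longer log, the "cannot mount" lines can be filtered. This is a minimal sketch that assumes the message format shown in the sample output above; the function name failed_zfs_mounts is hypothetical.

```shell
# Hypothetical helper: read an SMF service log on stdin and print each
# dataset that ZFS reported as unmountable, once per dataset.
failed_zfs_mounts() {
    sed -n "s/^cannot mount '\([^']*\)' on .*/\1/p" | sort -u
}

# On a live system:
#   failed_zfs_mounts < /var/svc/log/system-filesystem-local:default.log
```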
Cause
There are two main SMF services of interest: "svc:/system/vxvm/vxvm-startup2:default" for Veritas, and "svc:/system/filesystem/local:default" for Oracle (Solaris).
Veritas
# svcs svc:/system/vxvm/vxvm-startup2:default
STATE STIME FMRI
online 2023-02-08T12:02:27 svc:/system/vxvm/vxvm-startup2:default
Oracle
# svcs svc:/system/filesystem/local:default
STATE STIME FMRI
maintenance 2023-02-08T12:02:30 svc:/system/filesystem/local:default
When Veritas detects an issue with the ZFS rpool during the boot sequence, the following script attempts to clear the fault, and then exports and re-imports any ZFS zpools reported as FAULTED.
File: /lib/svc/method/vxvm-startup2
<snippet>
# During boot, if mirror devices under rpool are
# unavailable as vxconfigd is not yet started at this
# early boot. Then devices go into unavailable state.
# So clear the flag after vxconfigd is started to bring
# online all those mirror devices
#
for rpool in `echo $rpools`
do
    $ZPOOL clear $rpool
done
for pool in `$ZPOOL list -H | grep "FAULTED" | awk '{print $1}'`
do
    for rpool in `echo $rpools`
    do
        if [ "X$pool" = "X$rpool" ]; then
            continue;
        fi
    done
    $ZPOOL export $pool > /dev/null 2>&1
    $ZPOOL import $pool > /dev/null 2>&1
done
<snippet>
By default, when a ZFS pool is imported, its ZFS file systems are mounted at the same time.
This can result in nested ZFS mounts being mounted out of sequence and reporting ZFS mount errors later in the boot sequence.
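To see which datasets were left unmounted after such an out-of-sequence import, the "mounted" property can be inspected. This is a minimal sketch; the function name unmounted_datasets is hypothetical, and it assumes tab-separated input as produced by "zfs list -H -o name,mountpoint,mounted".

```shell
# Hypothetical helper: read `zfs list -H -o name,mountpoint,mounted` output
# on stdin (tab-separated) and print datasets that have a real mount point
# but are not currently mounted.
unmounted_datasets() {
    awk -F'\t' '$3 == "no" && $2 != "none" && $2 != "legacy" && $2 != "-" {
        print $1 " -> " $2
    }'
}

# On a live system:
#   zfs list -H -o name,mountpoint,mounted -t filesystem | unmounted_datasets
```

A parent dataset such as rpool/export appearing in this list while the /export directory already contains files is consistent with the "directory is not empty" error.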
The ZFS mounts will be managed by the "/lib/svc/method/fs-local" script:
File: /lib/svc/method/fs-local
<snippet>
# Mount all local filesystems.
#
# ignore first set of errors from mountall. If it fails
# we will retry it at the end of fs-local
# it most likely will fail due to syntax error in
# /etc/vfstab or you are trying to mount something on top
# of a file system that isn't mounted yet.
#
cd /; /usr/sbin/mountall -l >/dev/null 2>&1
mountall_rc=$?
# Mount all ZFS filesystems. Mount any shadowed file systems standby,
# their source file systems are likely not available yet.
if [ -x /usr/sbin/zfs ]; then
    /usr/sbin/zfs mount -vaS 2>&1 | tee /dev/msglog
    rc=${PIPESTATUS[0]}
    if [ $rc -ne 0 ]; then
        msg="WARNING: /usr/sbin/zfs mount -a failed: one or more "
        msg="$msg file systems failed to mount"
        echo $msg
        echo "$SMF_FMRI:" $msg >/dev/msglog
        result=$SMF_EXIT_ERR_FATAL
    fi
fi
<snippet>
Solution
The issue is specific to Solaris 11.x SPARC environments.
A supported VxVM Private hotfix (vm-sol11_sparc-HotFix-7.4.2.3402.tar.gz) is currently only available for InfoScale 7.4.2.x.
Note: This private hotfix has not yet gone through extensive QA testing. Please contact Veritas Technical Support to obtain this private hotfix.
Please note that Veritas Technologies LLC reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests. Veritas plans are subject to change and any action taken by you based on the above information or your reliance upon the above information is made at your own risk.
Consequently, if you are not adversely affected by this problem and have a satisfactory temporary workaround in place, we recommend that you wait for the next public patch. Please contact your Veritas Sales representative or the Veritas Sales group for upgrade information including upgrade eligibility to the release containing the resolution for this issue.
Veritas will release a public fix in a future patch update. The fix will also be ported to InfoScale 8.0.x for Solaris SPARC only.
Part #1
When Veritas detects an issue with the ZFS rpool during the boot sequence, the revised "/lib/svc/method/vxvm-startup2" script attempts to recover the ZFS rpool and exports the ZFS zpools as before, but now imports the zpools using the "-N" option.
File: /lib/svc/method/vxvm-startup2
<snippet>
# During boot, if mirror devices under rpool are
# unavailable as vxconfigd is not yet started at this
# early boot. Then devices go into unavailable state.
# So clear the flag after vxconfigd is started to bring
# online all those mirror devices
#
for rpool in `echo $rpools`
do
    $ZPOOL clear $rpool
done
for pool in `$ZPOOL list -H | grep "FAULTED" | awk '{print $1}'`
do
    for rpool in `echo $rpools`
    do
        if [ "X$pool" = "X$rpool" ]; then
            continue;
        fi
    done
    $ZPOOL export $pool > /dev/null 2>&1
    #fs-local will mount all zfs later
    $ZPOOL import -N $pool > /dev/null 2>&1
done
<snippet>
NOTE: The "-N" option added to the zpool import command imports the required ZFS zpools, but does not mount any of their ZFS file systems.
The ZFS mount requests will be handled later in the boot sequence by SMF service "svc:/system/filesystem/local:default".
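The import-without-mount pattern can be sketched for a single data pool as follows. This is an illustration only and not the hotfix code: the function name reimport_without_mount is hypothetical, and the ZPOOL default path is an assumption that can be overridden (for example with "echo zpool") for a dry run. It should not be run against the root pool.

```shell
# Assumption: ZPOOL defaults to the Solaris binary; override it for a dry run,
# for example: ZPOOL="echo zpool"
ZPOOL=${ZPOOL:-/usr/sbin/zpool}

# Hypothetical helper: export one non-root pool and import it again without
# mounting its datasets (-N). Mounting is left to a later `zfs mount -a`,
# which the fs-local method performs during boot.
reimport_without_mount() {
    pool=$1
    $ZPOOL export "$pool" || return 1
    $ZPOOL import -N "$pool"
}

# Example (the pool name "datapool" is a placeholder):
#   reimport_without_mount datapool && /usr/sbin/zfs mount -a
```

Deferring the mount step keeps all ZFS mounts in one place, so parent datasets are mounted before their children.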
Part #2
The second change concerns the Solaris Service Management Facility (SMF). In conjunction with Oracle Engineering, Veritas has added a "dependent" relationship between "/system/vxvm/vxvm-startup2" and "/system/filesystem/local".
File: /lib/svc/manifest/system/vxvm/vxvm-startup2.xml
By adding the "dependent" definition between the SMF services, the "/system/filesystem/local" service waits for "/system/vxvm/vxvm-startup2" to start before attempting to perform the ZFS mounts.
Change to vxvm-startup2.xml:
<dependent
    name='vxvm-startup2_fs-local'
    grouping='require_all'
    restart_on='none'>
    <service_fmri value='svc:/system/filesystem/local' />
</dependent>
# svcs -D svc:/system/vxvm/vxvm-startup2:default
STATE STIME FMRI
online 2023-02-08T12:02:27 svc:/system/vxvm/vxvm-reconfig:default
online 2023-02-08T12:02:29 svc:/system/dump:config
online 2023-02-08T12:02:30 svc:/system/filesystem/local:default
online 2023-02-08T12:02:31 svc:/system/swap:default
# svcs -d svc:/system/filesystem/local:default
STATE STIME FMRI
online 2023-02-08T12:02:05 svc:/system/boot-archive-update:default
online 2023-02-08T12:02:10 svc:/network/npiv_config:default
online 2023-02-08T12:02:17 svc:/network/iscsi/initiator:default
online 2023-02-08T12:02:27 svc:/system/vxvm/vxvm-reconfig:default
online 2023-02-08T12:02:27 svc:/system/vxvm/vxvm-startup2:default
online 2023-02-08T12:02:29 svc:/milestone/single-user:default
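The presence of the new dependency can also be checked in a script. This is a minimal sketch that assumes the three-column "svcs -d" output shown above; the function name has_online_dependency is hypothetical.

```shell
# Hypothetical helper: read `svcs -d <service>` output on stdin and succeed
# (exit status 0) only if the FMRI given as $1 is listed in state "online".
has_online_dependency() {
    awk -v fmri="$1" '
        $1 == "online" && $3 == fmri { found = 1 }
        END { exit !found }
    '
}

# On a live system:
#   svcs -d svc:/system/filesystem/local:default |
#       has_online_dependency svc:/system/vxvm/vxvm-startup2:default &&
#       echo "fs-local waits for vxvm-startup2"
```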
To verify that the SMF service "svc:/system/filesystem/local:default" started without issues, type:
# svcs -l svc:/system/filesystem/local:default
fmri svc:/system/filesystem/local:default
name local file system mounts
enabled true
state online
next_state none
state_time 2023-02-08T12:02:30
logfile /var/svc/log/system-filesystem-local:default.log
restarter svc:/system/svc/restarter:default
manifest /lib/svc/manifest/system/filesystem/local-fs.xml
manifest /lib/svc/manifest/network/npiv_config.xml
manifest /lib/svc/manifest/system/boot-archive-update.xml
manifest /lib/svc/manifest/system/vxvm/vxvm-reconfig.xml
manifest /lib/svc/manifest/system/vxvm/vxvm-startup2.xml
dependency optional_all/none svc:/system/boot-archive-update (online)
dependency optional_all/none svc:/network/iscsi/initiator:default (online)
dependency require_all/none svc:/network/npiv_config (online)
dependency require_all/none svc:/milestone/single-user (online)
dependency require_all/none svc:/system/vxvm/vxvm-reconfig (online)
dependency require_all/none svc:/system/vxvm/vxvm-startup2 (online)