Problem
Minimum O/S ulimit settings detected during a NetBackup install or upgrade.
The ulimit settings are most crucial on primary and media servers that execute many hundreds of simultaneous backup or restore jobs.
Settings that are too low will cause application faults and job failures once the concurrent job load exceeds the resources allowed to be consumed by the processes. A host may run without issue for months or years, but then begin to fail once the load increases sufficiently.
The warning below can be ignored on most NetBackup client hosts unless the ulimit value is less than 1024 or the concurrent job/stream count is greater than 10.
Error Message
The following check may fail during either NetBackup install or NetBackup upgrade:
not ok ulimit_nofiles: nofiles ulimit <value> is too low.
NetBackup Master and Media Server processes may run slower if they are
limited to fewer than 8000 open file descriptors. This test runs
'ulimit -n' and checks that the result is at least 8000 on NetBackup
servers. See
https://www.veritas.com/support/en_US/article.TECH75332
for more information.
The following error message may be displayed or logged at other times:
Resource temporarily unavailable
Solution
The operating system (O/S) must be configured to allow NetBackup programs to utilize sufficient O/S resources to run the configured job load. The error above is specific to:
- Open files per process (nofile); recommended to be at least 8192 (soft) and 65536 (hard), and never set to unlimited.
Related file and process resources can also be reviewed and adjusted at the same time.
- Number of processes and threads per user (nproc); recommended to be at least 65536, for nbwmc across Flex containers.
- Maximum file size (fsize); recommended to be unlimited, never less than 500 GB.
Note: These resource limits are relevant to the root user, and also to the WEBSVC_USER (NetBackup 8.1+), and also to the SERVICE_USER (NetBackup 9.1+).
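The recommended nofile values above can be compared against the current shell with a small helper. This is a minimal sketch, not a NetBackup tool: the function name check_limit and its output format are hypothetical, and only the thresholds (8192 soft, 65536 hard, never unlimited for nofile) come from the text above.

```shell
#!/bin/sh
# check_limit NAME CURRENT MINIMUM
# Hypothetical helper: compares one limit value against a minimum.
# "unlimited" is accepted for most limits, but flagged for nofile because
# the recommendation above is to never set nofile to unlimited.
check_limit() {
    name=$1; current=$2; minimum=$3
    if [ "$current" = "unlimited" ]; then
        case $name in
            nofile*) echo "WARN $name=unlimited (not recommended)"; return 1 ;;
            *)       echo "ok   $name=unlimited"; return 0 ;;
        esac
    fi
    if [ "$current" -ge "$minimum" ] 2>/dev/null; then
        echo "ok   $name=$current (minimum $minimum)"
    else
        echo "LOW  $name=$current (minimum $minimum)"
        return 1
    fi
}

# Report on the shell that runs this snippet; "|| true" continues after a finding.
check_limit nofile_soft "$(ulimit -S -n)" 8192  || true
check_limit nofile_hard "$(ulimit -H -n)" 65536 || true
```

Run it as root and again as the WEBSVC_USER and SERVICE_USER logins, since each login can have different limits.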
When are the ulimit settings for a process determined?
At process startup, soft and hard resource limits are granted by the operating system. Thus, changing the limits involves changing the runtime environment before a process is started. Accordingly, be very aware of both ‘when’ and ‘how’ NetBackup processes are started on the host.
- From the system boot or reboot environment.
- From a clustering technology using a cluster, node, or resource on-line script, such as Veritas Cluster Server.
- Via an independent service group (Linux systemctl) or in association with a resource group (Solaris project).
- From a user login shell environment, by manual execution of a startup script:
/usr/openv/netbackup/bin/goodies/netbackup
/usr/openv/netbackup/bin/bp.start_all
- From a user login shell environment, by manual execution of an individual program: initbprd, initbpdbm, bpcd -standalone, etc.
Note: After startup, processes can change their current allocation up to the hard limit imposed by the operating system but may not use additional resources. Processes can also change their current allocation to a lower value, but thereafter cannot raise it. NetBackup services that require significant resources will, upon startup, request to set nofile to 8192 or 65536 or another value as appropriate for their needs. The hard limit must be high enough to allow the request to succeed, or the process may subsequently fail if the resource limit is reached.
Note: Linux (prlimit) and Solaris (plimit) allow some limits to be increased after a process is started, but many processes check the limits at startup to pre-configure and optimize operations. Later changes to the limits will not have any effect because the code has already executed. Do not adjust the ulimit settings for NetBackup processes on-the-fly.
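The inheritance behavior described above can be observed with a throwaway test. This is an illustrative sketch only: the value 256 is arbitrary and assumes the hard limit of the current shell is at least 256. No NetBackup process is involved.

```shell
#!/bin/sh
# Limits are inherited at process start: a child started from an environment
# whose soft nofile limit is 256 reports 256, while the parent shell, which
# was never changed, keeps its original value.
parent_before=$(ulimit -S -n)
child_limit=$( ulimit -S -n 256 && sh -c 'ulimit -S -n' )
parent_after=$(ulimit -S -n)

echo "parent before: $parent_before"
echo "child:         $child_limit"
echo "parent after:  $parent_after"
```

The same principle is why a limit change made in one terminal has no effect on NetBackup daemons that were started from boot scripts or another session.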
How are ulimit settings for a process configured?
All operating systems provide methods to temporarily set or constrain the limits for the various environments that they host, and for persisting changes through a host reboot. The exact methods and default values depend on the operating system version and the features it provides.
Review the operating system vendor documentation for complete details; sysctl, security/limits.d, systemctl/system.d/system, ulimit, PAM, etc.
Examples of some of the more common configuration methods for default environments are shown below. If an environment has already been customized, there may be overriding values configured in other places such as /etc/profile, /etc/bashrc, /opt/VRTSvcs/bin/vcsenv, etc. Seek assistance from the operating system administrator or vendor as needed.
For Linux, do sections A, B, C, and D.
For Solaris, do sections A, B, and E.
For AIX, do sections B and F.
For HP-UX, do sections B and G.
A) [Linux/Solaris] Verifying the current ulimit settings of running processes.
Obtain a short list of PIDs for NetBackup processes that simultaneously use thousands of file descriptors/handles and many process threads.
$ /usr/openv/netbackup/bin/bpps | egrep 'nbwmc|beam.smp|vnetd.*inbound|bpjobd' | cut -c1-100
nbsvcusr 5089 1 0 20:42 ? 00:00:00 /usr/openv/netbackup/bin/vnetd -proxy inbound_proxy
nbwebsvc 5487 1 46 20:43 ? 00:01:24 /usr/openv/java/jre/bin/java -Dnop -Djava.util.loggi
nbwebsvc 5977 1 3 20:43 ? 00:00:04 /opt/openv/mqbroker/erlang/erts-11.1/bin/beam.smp -W
nbsvcusr 6830 6819 0 20:43 pts/0 00:00:00 /usr/openv/netbackup/bin/bpjobd
Note: Starting with NetBackup 9.1, bpjobd and vnetd -proxy inbound are owned by the SERVICE_USER which should be a non-root login.
Check the ulimit settings for each of the displayed PIDs, for example.
Linux$ prlimit --pid=5089 | egrep -i '^RESOURCE|nofile|nproc|fsize'
RESOURCE DESCRIPTION SOFT HARD UNITS
FSIZE max file size 2147483648 2147483648 blocks
NOFILE max number of open files 8192 8192
NPROC max number of processes 27147 27147
Solaris$ plimit 5089 | egrep -i 'nofile|file.*blocks'
file(blocks) unlimited unlimited
nofiles(descriptors) 8192 8192
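On Linux hosts where prlimit is not installed, the same values can usually be read from the /proc filesystem. The sketch below assumes a Linux kernel that exposes /proc/PID/limits; the function name show_limits is hypothetical.

```shell
#!/bin/sh
# Print the header plus the nofile, nproc and fsize rows from /proc/PID/limits.
# Linux only; reading another user's process normally requires root.
show_limits() {
    pid=$1
    limits_file="/proc/$pid/limits"
    [ -r "$limits_file" ] || { echo "cannot read $limits_file" >&2; return 1; }
    grep -E '^(Limit|Max open files|Max processes|Max file size)' "$limits_file"
}

# Example: inspect the current shell. Replace $$ with the PIDs reported by bpps.
show_limits "$$" || true
```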
Note: On Linux, the total system-wide number of concurrently open files actively in use across all processes should also be verified. This limit is set by the fs.file-max setting. Veritas recommends that this number be at least 65536 for all NetBackup versions prior to 10.2, or at least 131072 for versions 10.2 and above. It is likely that the current value is already higher than these minimum values, and if that is the case it should not be changed.
This value must accommodate all applications on the host, and Veritas cannot make a recommendation for the number required by other applications.
Previous configuration changes to reduce fs.file-max or applications holding exceptionally large numbers of open files may cause the system to hit this file limit. If the limit is reached, active primary servers will encounter job failures with status 800, and the syslog will show that the “file-max limit” has been reached.
Check/verify both the number of open files (8032) and the maximum open files (688307). The latter value should not be suspiciously small; it is typically one million or higher.
Linux$ sysctl fs.file-nr
fs.file-nr = 8032 0 688307
Linux$ sysctl fs.file-max
fs.file-max = 688307
To increase the value, first edit 'fs.file-max' in the /etc/sysctl.conf file, then run 'sysctl -p' to apply the changes.
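To judge how close a host is to the system-wide limit, the three fs.file-nr fields can be reduced to a usage percentage. This is a small sketch: the function name file_nr_summary is hypothetical, and the sample numbers are the ones from the output above.

```shell
#!/bin/sh
# Summarize an fs.file-nr triple: allocated handles, unused handles, maximum.
# The percentage is truncated to a whole number.
file_nr_summary() {
    echo "$1" | awk '{ printf "open=%d max=%d used=%d%%\n", $1, $3, ($1 * 100) / $3 }'
}

# Values from the example output above.
file_nr_summary "8032 0 688307"

# On a live Linux host the same triple can be read from /proc.
if [ -r /proc/sys/fs/file-nr ]; then
    file_nr_summary "$(cat /proc/sys/fs/file-nr)"
fi
```

For the example values this prints open=8032 max=688307 used=1%.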
Note: Solaris does not have a system-wide limit to the number of open files by all processes. It is constrained only by available memory.
B) Checking/Changing ulimit settings within the login shell environment before manual command-line startup of processes
Confirm the current user ID, and then review the current soft (-S) and hard (-H) limits for the maximum number of open files per process (nofile), the maximum file size (fsize), and the maximum number of process threads per user (nproc).
$ id -a
uid=0(root) gid=0(root) groups=0(root)…
$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size (blocks, -f) 1097152 <== Lower than recommended value
open files (-n) 1024 <== Lower than recommended value
max user processes (-u) 2048 <== Lower than recommended value
$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size (blocks, -f) unlimited
open files (-n) 65535
max user processes (-u) 65536
Note: HP-UX does not support ulimit -u (nproc).
If any specific soft limit is less than the recommended value, increase it to the recommended value. If the hard limit is also too low, the limits used to configure the terminal shell will need to first be adjusted by the system administrator using one of the sections below or another technique.
$ ulimit -f unlimited
$ ulimit -n 8192
$ ulimit -u 65536
Note: Do not decrease limits that are already set higher than the recommended values.
Always verify the expected changes were successful.
$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size (blocks, -f) unlimited <== Raised by ulimit -f unlimited
open files (-n) 8192 <== Raised by ulimit -n 8192
max user processes (-u) 65536 <== Raised by ulimit -u 65536
$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size (blocks, -f) unlimited
open files (-n) 8192 <== Lowered by ulimit -n 8192
max user processes (-u) 65536
Note: Do not change limits needlessly. Running ulimit without -S or -H sets both the soft and the hard limit, so the hard limit is constrained immediately. Notice that the hard limit for nofile was lowered from 65535 to 8192.
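One way to avoid lowering the hard limit is to pass -S so that only the soft limit changes. The sketch below demonstrates the difference in a subshell so the calling shell is not affected. It assumes bash or dash behavior and a hard limit of at least 256; the value 256 and the function name are for illustration only.

```shell
#!/bin/sh
# Runs in a subshell so the limits of the calling shell are left untouched.
demo_soft_vs_both() {
    (
        ulimit -S -n 256                       # -S changes only the soft limit
        echo "after -S:    hard=$(ulimit -H -n)"
        ulimit -n 256                          # no flag: soft and hard both change
        echo "after plain: hard=$(ulimit -H -n)"
    )
}
demo_soft_vs_both
```

Applied to the recommended values, this would mean using 'ulimit -S -n 8192' instead of 'ulimit -n 8192' when the hard limit is already 65536 or higher.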
Then stop NetBackup processes, confirm they are down, and restart.
$ /usr/openv/netbackup/bin/goodies/netbackup stop
$ /usr/openv/netbackup/bin/bpps -a
$ /usr/openv/netbackup/bin/goodies/netbackup start
or
$ /usr/openv/netbackup/bin/bp.start_all
For Linux and Solaris, verify the expected ulimit values are in use after process restart. See section A.
C) [Linux] Checking/Changing ulimit settings for future login shell environments.
The exact method for persisting changes for future login shells varies by O/S distribution and version but is generally controlled by PAM entries in the /etc/security/limits.conf file.
Note: Do not perform this step on a Flex-based instance. Appropriate ulimit settings are set for Flex instances. Please contact Support for assistance if adjustments are necessary.
Any current non-default settings will generally be located in these files.
$ egrep -i 'no*file|no*proc|fi*l*e*size' /etc/security/limits* /etc/security/limits.d/*.conf 2>/dev/null
… snip …
/etc/security/limits.d/20-nproc.conf:* soft nproc 4096
/etc/security/limits.d/20-nproc.conf:root soft nproc unlimited
/etc/security/limits.conf:# - fsize - maximum filesize (KB)
/etc/security/limits.conf:# - nofile - max number of open file descriptors
/etc/security/limits.conf:# - nproc - max number of processes
Also review these next outputs to confirm existing or default soft and hard limits while running as the root user.
$ id -a
uid=0(root) gid=0(root) groups=0(root)
$ ulimit -a -S | egrep "\-n|\-f|\-u"
$ ulimit -a -H | egrep "\-n|\-f|\-u"
Then repeat for the other non-root users that own NetBackup processes.
Replace ‘nbwebsvc’ with the configured login name for the WEBSVC_USER used in NetBackup 8.1+.
$ su nbwebsvc
nbwebsvc$ ulimit -a -S | egrep "\-n|\-f|\-u"
nbwebsvc$ ulimit -a -H | egrep "\-n|\-f|\-u"
nbwebsvc$ exit
Replace ‘nbsvcusr’ with the configured login name for the SERVICE_USER used in NetBackup 9.1+.
$ su nbsvcusr
nbsvcusr$ ulimit -a -S | egrep "\-n|\-f|\-u"
nbsvcusr$ ulimit -a -H | egrep "\-n|\-f|\-u"
nbsvcusr$ exit
If any outputs above were less than the recommended values, add or change appropriate entries to either a /etc/security/limits.d/*.conf file if it exists, or to the /etc/security/limits.conf file. On some O/S versions, the asterisk (*) in the user column matches only non-root users, and it is necessary to add entries specifically for the root user. If the site does not want these settings applied to all (*) users, then add rows specific to the NetBackup login names for the SERVICE_USER and WEBSVC_USER.
* soft nofile 8192
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
* soft fsize unlimited
* hard fsize unlimited
Note: Do not decrease limits that are already set higher than the recommended values.
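For sites that prefer per-user rows over the asterisk (*) entries, the rows can be generated rather than typed. This is a sketch under assumptions: the function name netbackup_limits is hypothetical, and the login names root, nbwebsvc and nbsvcusr are the example names used in this article and must be replaced with the names configured on the host.

```shell
#!/bin/sh
# Emit limits.conf-style rows with the recommended values for each login name.
netbackup_limits() {
    for user in "$@"; do
        printf '%s soft nofile 8192\n'     "$user"
        printf '%s hard nofile 65536\n'    "$user"
        printf '%s soft nproc 65536\n'     "$user"
        printf '%s hard nproc 65536\n'     "$user"
        printf '%s soft fsize unlimited\n' "$user"
        printf '%s hard fsize unlimited\n' "$user"
    done
}

# Print the rows for review before placing them in a limits file.
netbackup_limits root nbwebsvc nbsvcusr
```

Review the output and remove any row that would lower an existing higher value before adding it to /etc/security/limits.conf or a file under /etc/security/limits.d/.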
Confirm PAM is enabled appropriately. One of the ‘/etc/pam.d/*’ files should contain a line similar to ‘session required pam_limits.so’. The specific pathname for the shared object library will vary by O/S release and version.
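The PAM check can be scripted as a simple search. This is a sketch only: the function name pam_limits_files is hypothetical, and the pattern assumes the common 'session ... pam_limits.so' layout, which may differ on some distributions.

```shell
#!/bin/sh
# List files in a PAM directory that contain an active (uncommented)
# session line loading pam_limits.so. Defaults to /etc/pam.d.
pam_limits_files() {
    dir=${1:-/etc/pam.d}
    grep -lE '^[[:space:]]*-?session[[:space:]].*pam_limits\.so' "$dir"/* 2>/dev/null
}

pam_limits_files /etc/pam.d || echo "no active pam_limits.so session line found"
```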
The new limits will take effect for new login sessions. To verify, see section B after starting a new terminal shell.
D) [Linux] Checking/Changing ulimit settings for systemd controlled services, both NetBackup and Veritas Cluster Server.
NetBackup versions through at least 10.1 are not formal systemctl/systemd service units. However, some versions of systemctl still allow the system administrator to start, query the status of, and stop NetBackup, and that may cause a different configuration of ulimit values to be used.
Because NetBackup could be started by systemd, check for pre-existing configuration that constrains nofile, fsize, or nproc.
$ systemctl show netbackup | egrep -i 'nofile|nproc|fsize'
LimitFSIZE=2097152
LimitNOFILE=1024
LimitNPROC=2048
If NetBackup is constrained to values less than the recommended values, then configure the recommended values to avoid problems should systemd be used to start NetBackup. Use ‘infinity’ to specify ‘unlimited’ when relevant. On some platforms it may be necessary to use ‘systemctl edit --force netbackup’.
Note: Edit this next command to remove key=value pairs that should not be lowered from higher values that already exist, including the preceding newline (\n), as necessary.
$ echo -e "[Service]\nLimitNOFILE=8192\nLimitFSIZE=infinity\nLimitNPROC=65536" | SYSTEMD_EDITOR="tee" systemctl edit netbackup
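The preceding note about removing key=value pairs can be sketched as a filter that emits only the keys currently below the recommended values. This is a hypothetical helper, not part of NetBackup or systemd: the function name build_override is an assumption, and it expects KEY=VALUE lines in the form printed by 'systemctl show'.

```shell
#!/bin/sh
# Read "systemctl show" style KEY=VALUE lines on stdin and print a [Service]
# drop-in that contains only the limits that need to be raised.
build_override() {
    awk -F= '
        BEGIN { min["LimitNOFILE"] = 8192; min["LimitNPROC"] = 65536; print "[Service]" }
        # fsize: recommend infinity unless it is already infinity
        $1 == "LimitFSIZE" && $2 != "infinity" { print "LimitFSIZE=infinity" }
        # nofile and nproc: raise only when the current value is lower
        ($1 in min) && $2 != "infinity" && ($2 + 0) < min[$1] { print $1 "=" min[$1] }
    '
}

# Example using the values shown earlier in this section.
printf 'LimitFSIZE=2097152\nLimitNOFILE=1024\nLimitNPROC=2048\n' | build_override
```

On a live host the input would come from 'systemctl show netbackup', and the reviewed output could then be supplied to 'systemctl edit netbackup' in the same way as the echo command above.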
Confirm the settings were saved to the service unit specific file and are available to be used upon subsequent restart of the service.
$ systemctl show netbackup | egrep -i 'nofile|nproc|fsize'
LimitFSIZE=infinity
LimitNOFILE=8192
LimitNPROC=65536
$ egrep -I -r -i 'nofile|nproc|fsize' /etc/sys* /usr/lib/sys* /run/sys* | grep netbackup
/etc/systemd/system/netbackup.service.d/override.conf:LimitNOFILE=8192
/etc/systemd/system/netbackup.service.d/override.conf:LimitFSIZE=infinity
/etc/systemd/system/netbackup.service.d/override.conf:LimitNPROC=65536
Use systemctl to stop the service, confirm all processes are down, ensure systemctl has loaded the configuration changes, and then restart the service. This should pick up the config changes performed above.
$ systemctl stop netbackup
$ /usr/openv/netbackup/bin/bpps -a
$ systemctl daemon-reload
$ systemctl start netbackup
Verify the expected ulimit values are in use after the restart; see section A.
Note: If NetBackup is under the control of a clustering technology, also check whether the cluster service unit is under systemctl control. This example is for Veritas Cluster Server; use the appropriate service unit name for other clustering software.
$ systemctl status vcs
$ systemctl show vcs | egrep -i 'nofile|nproc|fsize'
Note: Edit this next command to remove key=value pairs that should not be lowered from higher values that already exist, including the preceding newline (\n), as necessary.
$ echo -e "[Service]\nLimitNOFILE=8192\nLimitFSIZE=infinity\nLimitNPROC=65536" | SYSTEMD_EDITOR="tee" systemctl edit vcs
$ systemctl show vcs | egrep -i 'nofile|nproc|fsize'
$ egrep -I -r -i 'nofile|nproc|fsize' /etc/sys* /usr/lib/sys* /run/sys* | grep vcs
E) [Solaris] Checking/Changing persistent ulimit settings for future login shell environments or project programs.
Oracle recommends implementing ulimit changes via projects, but the deprecated technique of updating the /etc/system and/or /etc/system.d/* file(s) is still permitted.
Review the current project and non-default system settings.
$ projects -l -v
$ egrep -v '^\*|^$' /etc/system /etc/system.d/* 2>/dev/null
The following commands either adjust the resources for processes running under a project named ‘netbackup’ or append entries to the system file(s). If values already exist in system file(s), edit the existing entries and change the values instead of appending additional conflicting entries. Updates to the system file(s) require a reboot to take effect.
Note: Do not decrease limits that are already set higher than the recommended values.
Increasing the hard and soft limits for nofile:
$ projmod -s -K "process.max-file-descriptor=(priv,65536,deny)" netbackup
$ projmod -s -K "process.max-file-descriptor=(basic,8192,deny)" netbackup
or
echo 'set rlim_fd_max=65536' >> /etc/system
echo 'set rlim_fd_cur=8192' >> /etc/system
Increasing the hard and soft limits for fsize:
$ projmod -s -K "process.max-file-size=(priv,107374182400,deny)" netbackup
$ projmod -s -K "process.max-file-size=(basic,107374182400,deny)" netbackup
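The process.max-file-size value is expressed in bytes, so it can help to compute it rather than type it. The sketch below assumes binary units (1 GiB = 1024 * 1024 * 1024 bytes); the function name gib_to_bytes is hypothetical. The value used in the commands above corresponds to 100 GiB, so a site that wants to apply the 500 GB floor from the Solution section may prefer the larger figure.

```shell
#!/bin/sh
# Convert a size in GiB to bytes for use in process.max-file-size values.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 100   # 107374182400, the value used in the commands above
gib_to_bytes 500   # 536870912000
```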
Increasing the hard and soft limits for nproc:
$ projmod -s -K "project.max-processes=(priv,65536,deny)" netbackup
$ projmod -s -K "project.max-processes=(basic,65536,deny)" netbackup
or
echo 'set maxuprc=65536' >> /etc/system
Note: Be sure to apply the project to the scripts or processes that start the NetBackup application.
Verify the expected ulimit values are in use after restarting NetBackup under control of the project or rebooting to pick up the system file(s) changes. See section A.
F) [AIX] Checking/Changing persistent ulimit settings for future login shell environments.
The exact method for persisting changes for future login shells varies by O/S version but is generally configured by entries in the /etc/security/limits file.
$ egrep -i 'no*file|no*proc|fi*l*e*size' /etc/security/limits* /etc/security/limits.d/*.conf 2>/dev/null
/etc/security/limits:* fsize - soft file size in blocks
/etc/security/limits:* nofiles - soft file descriptor limit
/etc/security/limits:* fsize_hard - hard file size in blocks
/etc/security/limits:* nofiles_hard - hard file descriptor limit
/etc/security/limits:* fsize_hard set to fsize
/etc/security/limits:* nofiles_hard -1
/etc/security/limits: fsize = -1
/etc/security/limits: nofiles = 2000
Review these outputs to confirm existing or default soft and hard limits while running as the root user. Then repeat for other non-root users that own NetBackup processes.
$ id -a
uid=0(root) gid=0(root) groups=0(root)
$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
Replace ‘nbwebsvc’ with the configured login name for WEBSVC_USER used on NetBackup 8.1+ primary servers.
$ su nbwebsvc
nbwebsvc$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
nbwebsvc$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
nbwebsvc$ exit
If any outputs above were less than the recommended values, add or change appropriate entries in the /etc/security/limits file. In this example, an unlimited fsize is already the default.
Note: Do not decrease limits that are already set higher than the recommended values.
default:
fsize = -1
...snip...
nofiles = 2000
nbwebsvc:
nofiles = 8192
nofiles_hard = 65536
nproc = 65536
Note: Some versions of AIX use ‘-1’ in place of ‘unlimited’.
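Because a user stanza inherits any attribute it does not set from the default stanza, it can be useful to resolve the effective value for a login name. This is a minimal sketch that assumes the stanza layout shown above; the function name stanza_value is hypothetical, and it is not a replacement for the AIX user administration commands.

```shell
#!/bin/sh
# stanza_value FILE USER ATTR
# Print ATTR for USER from an AIX-style stanza file, falling back to the
# "default" stanza when the user stanza does not set the attribute.
stanza_value() {
    file=$1; user=$2; attr=$3
    awk -v user="$user" -v attr="$attr" '
        /^\*/ || /^[ \t]*$/ { next }                  # comments and blank lines
        /^[^ \t].*:[ \t]*$/ {                         # stanza header such as default:
            sub(/:[ \t]*$/, ""); stanza = $0; next
        }
        {
            n = split($0, kv, "=")
            if (n < 2) next                           # not an attribute line
            key = kv[1]; val = kv[2]
            gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
            if (key == attr) seen[stanza] = val
        }
        END {
            if (user in seen) print seen[user]
            else if ("default" in seen) print seen["default"]
        }
    ' "$file"
}

# Example: resolve nofiles for the example login name, if the file exists.
if [ -r /etc/security/limits ]; then
    stanza_value /etc/security/limits nbwebsvc nofiles
fi
```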
The new limits will take effect for new login sessions. To verify, see section B after starting a new terminal shell.
G) [HP-UX] Checking/Changing persistent ulimit settings for future login shell environments.
Please review available options with the system administrator and/or the O/S vendor.
To verify, see section B after starting a new terminal shell or rebooting the host as needed.