Isilon: Event notification: The var partition is near capacity, Event ID: 100010001

Summary: This article describes how to free space on the /var partition when it nears capacity.


Symptoms

Event
One of the following event notifications is issued:

The /var partition is near capacity (95% used)

The /var partition is near capacity (85% used)

The /var partition is near capacity (75% used)

Details
When the /var partition reaches 75%, 85%, or 95% of capacity, an event is logged and an alert is sent.
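
The threshold logic can be sketched as a small shell function. This is only an illustration of the 75%, 85%, and 95% levels named above; the function name var_threshold is hypothetical and is not part of OneFS.

```shell
#!/bin/sh
# Map a /var usage percentage to the highest alert threshold it has reached.
# The thresholds (75, 85, 95) come from the event text above.
# var_threshold is a hypothetical helper name, not a OneFS command.
var_threshold() {
    pct=$1
    if [ "$pct" -ge 95 ]; then
        echo 95
    elif [ "$pct" -ge 85 ]; then
        echo 85
    elif [ "$pct" -ge 75 ]; then
        echo 75
    else
        echo 0   # below every alert threshold
    fi
}
```

For example, var_threshold 90 prints 85, the highest threshold that a partition at 90% has reached.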

Cause

The /var folder contains numerous logs, diagnostic files, configuration data, and temp files for various functions of the cluster. Over time, various extra files may accumulate within the /var folder and cause it to fill up.

The /var/log/wtmp file and its rollover files, /var/log/wtmp.0 and /var/log/wtmp.1, for example, can grow to over 10 MB, and sometimes to 150 MB. The /var/log/wtmp file is a binary log file that records login and logoff data. The log manager configuration file, /etc/newsyslog.conf, does not archive this file the same way it archives other log files, so /var/log/wtmp can grow and fill the /var partition.

Resolution

NOTE: For liability reasons, Isilon Support advises that the user carry out any moving or deleting of customer data. If the user has any questions, Dell can help answer them.

Below is the default content of a /var partition and a brief description of the more relevant sub-directories. Unless otherwise stated, the content and data within /var and its sub-directories should not be altered or removed. 
ps9500x3-2# cd /var
ps9500x3-2# ls
.snap           at              backups         db              ifs             lib             patch           spool
account         audit           cache           empty           journal         log             preserve        tmp
agentx          authpf          crash           games           journal-peer    mail            run             unbound
apache2         backup          cron            heimdal         krb5kdc         msgs            rwho            yp


 .snap          Snapshots. Do not touch.
 account        Account information. Do not touch.
 agentx         Empty but preserved for the Agent Extensibility (AgentX) protocol.
 apache2        Apache files. Do not touch.
 at             Variable data. Do not touch.
 audit          Audit files. Do not touch.
 authpf         Authentication gateway. Do not touch.
 backup         System configuration backup files. Do not touch.
 backups        Group configuration backups. Do not touch.
 cache          System cache. Do not touch.
 crash          Crash files. Older files can be deleted if needed.
 cron           Cron jobs. Do not touch.
 db             Database files. Do not touch.
 empty          Do not touch.
 games          Empty but preserved.
 heimdal        Kerberos 5 protocol. Do not touch.
 ifs            Do not touch unless directed by support.
 journal        System journal database.
 journal-peer   System journal-peer database.
 krb5kdc        Kerberos KDC (Key Distribution Center).
 lib            Likewise database files. Do not touch.
 log            Various system log files. Can be cleared, but doing so zeroes out the system logs.
 mail           Mail subsystem files.
 msgs           Message logs.
 patch          System patch database. Do not touch.
 preserve       Do not touch.
 run            Do not touch.
 rwho           Do not touch.
 spool          System spool files. Do not touch.
 tmp            Healthcheck items and vi recovery files. Do not touch.
 unbound        Do not touch.
 yp             Do not touch.

The two directories to focus on are /var/crash and /var/log, as these can grow and consume most of the disk space in the /var partition.

Older crash files in /var/crash can be removed if they are no longer needed.

The logs in /var/log can be zeroed out and reset if they become too large. Keep in mind that once logs are reset, they can no longer be used to troubleshoot and research past issues.

Review df output for the /var partition. Depending on the output, perform one or more of the following tasks:
 

ps9500x3-2# df
Filesystem            1K-blocks    Used     Avail Capacity  Mounted on
/dev/mirror/root0       1957292  871082    929628    48%    /
devfs                         1       1         0   100%    /dev
/dev/mirror/var0         978604   51394    848922     6%    /var
/dev/mirror/var-crash   2946284      10   2710572     0%    /var/crash
/dev/mirror/keystore      61228      46     56284     0%    /keystore
/dev/md0                  61166    2158     54116     4%    /tmp/ufp
/dev/md1.uzip            435751  406426     -5535   101%    /base
OneFS                 246327840 2362592 173903776     1%    /ifs
ps9500x3-2#
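
To read the Capacity value for one mount point without scanning the whole table, the df output can be filtered with awk. The following is a minimal sketch; the function name mount_capacity is hypothetical, and it assumes the column layout shown above, where the mount point is the last field and Capacity is the field before it.

```shell
#!/bin/sh
# Read df output on stdin and print the Capacity value (without "%")
# for the file system mounted on the given mount point (default: /var).
# mount_capacity is a hypothetical helper name, not a OneFS command.
mount_capacity() {
    awk -v mp="${1:-/var}" 'NF >= 2 && $NF == mp {
        cap = $(NF-1)        # Capacity column, for example "6%"
        sub(/%/, "", cap)    # strip the percent sign
        print cap
    }'
}
```

A typical invocation would be df | mount_capacity /var. Because the match is on the exact mount point, the separate /var/crash file system is not counted as /var.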
 


 

Rotate logs:

Detailed instructions on how to rotate logs are in KB article 20315, Isilon: OneFS-How to rotate system logs for a node.
Command to rotate the logs:
newsyslog -f

If the /var partition returns to a normal usage level, review the list of recently written logs to determine if a specific log is rotating frequently. Rotation can resolve the full-partition issue by compressing or removing large logs and old logs, thereby automatically reducing partition usage.
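
One way to review recently written logs is to list the files under /var/log that changed within a recent time window. This is a sketch only: the function name recent_logs and the 60-minute default are assumptions chosen for illustration.

```shell
#!/bin/sh
# List regular files under a log directory that were modified within the
# last N minutes, newest first within each ls batch.
# recent_logs and its defaults (/var/log, 60 minutes) are illustrative assumptions.
recent_logs() {
    dir=${1:-/var/log}
    minutes=${2:-60}
    # -mmin -N matches files modified less than N minutes ago.
    find "$dir" -type f -mmin "-$minutes" -exec ls -lt {} +
}
```

Running recent_logs /var/log 30 shortly after a rotation shows which logs are being rewritten most often; a log whose numbered rollover files all appear in the list is a candidate for frequent rotation.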
 

Check the percentage of free inodes:

Open an SSH connection to the node that reported the error and log in using the "root" account.
Run the following command:
df -i |grep var |grep -v crash
Output similar to the following appears:
Filesystem            1K-blocks      Used       Avail Capacity iused       ifree %iused  Mounted on
/dev/mirror/var0          1013068   49160      882864      5%   1650      139276  100%   /var
If the %iused value is 90% or higher, reduce the number of files in the /var partition using one of the methods described below.
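
The 90% check can also be scripted. The sketch below reads df -i output and reports any mount point at or above a limit; the function name inodes_over_limit is hypothetical, and it assumes the column order shown above, with %iused as the field immediately before the mount point.

```shell
#!/bin/sh
# Read `df -i` output on stdin and print mount points whose %iused value
# is at or above a limit (default 90, per the step above).
# Assumes the column order: ... iused ifree %iused Mounted-on.
# inodes_over_limit is a hypothetical helper name, not a OneFS command.
inodes_over_limit() {
    awk -v limit="${1:-90}" 'NF >= 2 {
        pct = $(NF-1)
        sub(/%/, "", pct)
        # Skip the header row and any line without a numeric percentage.
        if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0)
            print $NF ": " pct "% of inodes used"
    }'
}
```

For example, df -i | grep var | grep -v crash | inodes_over_limit 90 prints nothing when inode usage on /var is below the limit.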
 

Identify files that do not belong in the /var partition:

NOTE: Do not move or delete any files under /var/patch as they are critical for the patch system on the node.
  1. On the node that generated the alert, run the following command to list files in the /var partition that are greater than 5 MB:
find -x /var -type f -size +10000 -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
  2. In the output, look for files that do not typically belong in the /var partition, for example a OneFS installer file, log gathers, or a user-created file.
  3. Remove the files or move them to the /ifs directory.
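
When many files exceed 5 MB, sorting them by size makes the largest offenders easier to spot. The following is a sketch under stated assumptions: the function name largest_files is hypothetical, and it uses the -xdev primary in place of the -x option so that the same command also runs on systems whose find lacks -x.

```shell
#!/bin/sh
# List files larger than about 5 MB under a directory, largest first.
# -size counts 512-byte blocks, so +10000 matches the threshold used above.
# -xdev keeps find on a single file system, so separate mounts are skipped.
# largest_files is a hypothetical helper name, not a OneFS command.
largest_files() {
    dir=${1:-/var}
    count=${2:-20}
    # du -k prints "size-in-KB<TAB>path"; sort numerically, biggest first.
    find "$dir" -xdev -type f -size +10000 -exec du -k {} + |
        sort -rn | head -n "$count"
}
```

Running largest_files /var 10 prints up to ten entries. The sizes reported by du reflect disk usage in kilobytes, so they can differ slightly from the byte sizes that ls -l shows.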


Manually remove files from the /var Partition:

Once the extra files are identified, the commands needed to clean up the /var directory usually involve Make Directory (mkdir), Copy (cp), Move (mv), and Remove (rm). Users should be familiar with these basic UNIX/Linux commands before proceeding.

Always make a backup copy of files prior to deleting or moving them from their original location.

Create a directory to hold the backup data, where <dest> is the destination directory. Copy every file that is to be deleted into this directory first.

# mkdir /ifs/data/Isilon_Support/<dest>

Copy, move, or delete files as appropriate:

To copy a file:

# cp  <file>   /ifs/data/Isilon_Support/<dest>

To recursively copy a directory:

# cp  -R <directory>   /ifs/data/Isilon_Support/<dest>

To move a file or directory:

# mv <file>  /ifs/data/Isilon_Support/<dest>

# mv <directory>   /ifs/data/Isilon_Support/<dest>

To remove/delete a file:

# rm <file>
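
The copy-then-delete sequence above can be wrapped in a helper that refuses to delete the original unless the backup copy is verified. This is a minimal sketch; the function name backup_then_remove is hypothetical, and it handles regular files only.

```shell
#!/bin/sh
# Copy a file to a backup directory, confirm the copy is byte-identical,
# and only then remove the original.
# backup_then_remove is a hypothetical helper name, not a OneFS command.
# Note: an existing file with the same name in the backup directory is overwritten.
backup_then_remove() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "not a regular file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    copy="$dest_dir/$(basename "$src")"
    cp -p "$src" "$copy" || return 1
    # cmp -s exits 0 only when both files have identical contents.
    cmp -s "$src" "$copy" || { echo "verification failed: $copy" >&2; return 1; }
    rm "$src"
}
```

A call takes the file to remove and the backup directory created earlier as its two arguments. If the copy or the comparison fails, the original file is left in place.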

 

Determine if a process is holding a large file open:

Use the fstat command to list the open files on a node or directory, or to list the files that a process has opened. A list of the open files can help you monitor the processes that are writing large files. For instructions, see article 21402, Isilon: How to use the fstat command to list the open files on a node.
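
As a rough illustration, fstat output can be sorted by file size to surface the processes holding the largest files open. The sketch below assumes the default FreeBSD fstat column layout (USER CMD PID FD MOUNT INUM MODE SZ|DV R/W); the layout on a given OneFS release may differ, and the function name largest_open_files is hypothetical.

```shell
#!/bin/sh
# Read fstat output on stdin and print the largest open files first,
# as "size command pid mount".
# Assumes the column layout: USER CMD PID FD MOUNT INUM MODE SZ|DV R/W
# largest_open_files is a hypothetical helper name, not a OneFS command.
largest_open_files() {
    # Keep only rows whose 8th field is a plain number (a file size);
    # this skips the header row and device entries.
    awk '$8 ~ /^[0-9]+$/ { print $8, $2, $3, $5 }' |
        sort -rn | head -n "${1:-10}"
}
```

On FreeBSD, fstat -f /var restricts the listing to files open on the file system that contains /var, so a typical invocation would be fstat -f /var | largest_open_files 10.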

If none of the above tasks resolves the issue, go to the following solution:


Limit the rollover file size and compress the file

  1. Open an SSH connection on any node in the cluster and log in using the "root" account.
  2. Run the following commands to create a backup of the /etc/newsyslog.conf file:
cp /etc/newsyslog.conf /ifs/newsyslog.conf
cp /etc/newsyslog.conf /etc/newsyslog.bak
  3. Open the /ifs/newsyslog.conf file in a text editor.
  4. Locate the following line:
/var/log/wtmp 644 3 * @01T05 B
  5. Change the line to:
/var/log/wtmp 644 3 10000 @01T05 ZB
These changes instruct the system to roll over the /var/log/wtmp file when it reaches 10 MB and to compress the file with gzip.
  6. Save and close the /ifs/newsyslog.conf file.
  7. Run the following command to copy the updated file to all nodes on the cluster:
isi_for_array 'cp /ifs/newsyslog.conf /etc/newsyslog.conf'
  8. Log files then rotate automatically, if necessary, through a cron job that runs on the hour and half hour (defined in /etc/crontab):
#minute hour    mday    month   wday    who     command
#
# rotate log files every half-hour, if necessary
0,30    *       *       *       *       root    newsyslog
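
The edit to the wtmp line can also be made without a text editor. The following is a sketch, not a supported procedure: the function name patch_wtmp_rule is hypothetical, and it assumes the wtmp line has the six-field layout shown above (file, mode, count, size, when, flags).

```shell
#!/bin/sh
# Rewrite the /var/log/wtmp rule read from stdin: set the size field to
# 10000 and add the Z (compress) flag if it is missing.
# All other lines pass through unchanged.
# Assumes the six-field layout: file mode count size when flags.
# patch_wtmp_rule is a hypothetical helper name, not a OneFS command.
patch_wtmp_rule() {
    awk '$1 == "/var/log/wtmp" {
        $4 = "10000"
        if ($6 !~ /Z/) $6 = "Z" $6
    }
    { print }'
}
```

For example, patch_wtmp_rule < /etc/newsyslog.conf > /ifs/newsyslog.conf writes the modified copy to /ifs, after which grep wtmp /ifs/newsyslog.conf can confirm the result before the file is distributed. Rewriting the fields collapses the whitespace on the wtmp line to single spaces, which does not change how the fields are read.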

If other logs are rotating frequently, or if the preceding steps do not resolve the issue, contact Dell Technical Support for assistance.

Affected Products

Isilon

Products

Isilon, PowerScale OneFS
Article Properties
Article Number: 000169344
Article Type: Solution
Last Modified: 12 May 2025
Version:  18