Isilon: Node fails to boot with multiple errors including 'This system has 0 formatted boot disk.' and 'UnbootableBootdiskException: 5'.

Summary: When a node affected by the BMC/CMC hangs issue described in KB article 466373 is rebooted without first disconnecting both power cords and waiting for remaining power to drain off, it may fail to boot with multiple errors including 'This system has 0 formatted boot disk.' and 'UnbootableBootdiskException: 5'. IPMI-related errors are also often seen in this case. ...


Symptoms


When a node affected by the BMC/CMC hangs issue described in KB article 466373 is rebooted without first disconnecting both power cords and waiting for remaining power to drain off, it may fail to boot with a set of errors similar to the following:
 
<isi_rc> Executing script isi_bootdisk_init
python: Unable to open /var/run/mlx4_core0.vpd for writing: Read-only file system
Executing GEOM bootdisk startup...
python: Unable to open /var/run/mlx4_core0.vpd for writing: Read-only file system
This system has 0 formatted boot disk.
UnbootableBootdiskException: 5: Exception caught in startup attempt 1
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/site-packages/isi/sys/bootdisk.py", line 1831, in startup
  File "/usr/local/lib/python2.6/site-packages/isi/sys/bootdisk.py", line 1741, in _startup
  File "/usr/local/lib/python2.6/site-packages/isi/sys/bootdisk.py", line 1657, in handle_bootdisk_ids
  File "/usr/local/lib/python2.6/site-packages/isi/sys/bootdisk.py", line 1580, in zero_bootdisks
UnbootableBootdiskException: 5
The system is unbootable.
python: Unable to open /var/run/mlx4_core0.vpd for writing: Read-only file system
2016-01-02T14:00:24-07:00 python: dbay_localbm: baymap unknown for chas 0 dskctl 8 portcount 8
2016-01-02T14:00:24-07:00 python: dbay_chascache_init: drive_bay doesn't know chassis Unknown, portcount 8
drive_bay doesn't know chassis Unknown, portcount 8
GEOM start failed

If this boot failure is caused by a hung BMC, an earlier part of the boot sequence shows that the kernel was not able to initialize the ipmi0 device correctly. Under normal conditions, the last line below reports 4 channels instead of 0:

ipmi0: Clear flags illegal
ipmi0: Number of channels 0 
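
The following is a minimal sketch, not an Isilon or OneFS tool, of how saved console output could be checked for the symptom strings quoted above. It assumes the boot messages have already been captured as text (for example from a serial console session). The function name classify_boot_log and the returned dictionary keys are hypothetical and chosen only for illustration. The matched strings are copied from the output shown in this article.

```python
import re

# Signature strings copied from the console output quoted in this article.
BOOTDISK_SIGNATURES = (
    "This system has 0 formatted boot disk.",
    "UnbootableBootdiskException: 5",
)

# Matches the kernel line that reports how many IPMI channels were found.
IPMI_CHANNELS_RE = re.compile(r"ipmi0: Number of channels (\d+)")


def classify_boot_log(text):
    """Report which of the article's symptoms appear in captured console text.

    Hypothetical helper: the name and return format are illustrative only.
    """
    bootdisk_hits = [s for s in BOOTDISK_SIGNATURES if s in text]
    match = IPMI_CHANNELS_RE.search(text)
    channels = int(match.group(1)) if match else None
    return {
        "bootdisk_errors": bootdisk_hits,
        "ipmi_channels": channels,
        # Per the article, 0 channels (instead of the normal 4) together
        # with the boot disk errors points at a hung BMC.
        "bmc_hang_suspected": bool(bootdisk_hits) and channels == 0,
    }


if __name__ == "__main__":
    sample = "\n".join([
        "ipmi0: Clear flags illegal",
        "ipmi0: Number of channels 0",
        "This system has 0 formatted boot disk.",
        "UnbootableBootdiskException: 5",
    ])
    print(classify_boot_log(sample))
```

A result with bmc_hang_suspected set to True only indicates that the captured text matches the pattern described in this article. It does not replace the diagnosis steps in KB article 466373.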

Cause

A known BMC firmware problem on HD400, S210, X210, X410, and NL410 nodes sometimes causes the node's BMC to hang. When the node's BMC is not responding, OneFS is not able to read the EEPROM attached to the CMC during boot to determine what kind of chassis it is running on.  When OneFS cannot determine the chassis type, it cannot determine how to properly access the boot drives in the affected node, and the boot attempt fails.

Resolution

An updated BMC firmware is available to help prevent future occurrences of this issue; however, the hang condition must be cleared before the new firmware can be applied. Shut down the node (shutdown -p now), remove both power cords, wait one minute, plug the power cords back in, and bring the node back up. You may need to repeat this power cycle up to three times to clear the hang (unresponsive BMC) condition.

Once the unresponsive BMC condition has been cleared, apply the BMC firmware update process detailed in KB article 466373, "S210, X210, X410, NL410 or HD400 shows event: 'Node's Baseboard Management Controller (BMC) and/or Chassis Management Controller (CMC) are unresponsive'". This update mitigates the underlying problem that causes this issue.

ATTENTION: If the AC power cycle procedure described in the KB mentioned above still does not resolve the issue after three attempts, contact EMC Isilon Technical Support and reference this KB article.

Affected Products

Isilon, Isilon HD400, Isilon NL410, PowerScale OneFS, Isilon S210, Isilon X210, Isilon X410

Article Properties

Article Number: 000052205
Article Type: Solution
Last Modified: 28 Jun 2023
Version:  6