VNX1 Series: VNX Data mover Failover did not occur after a hardware failure of CS
Summary: VNX1 Series: VNX Data mover Failover did not occur after a hardware failure of CS (User Correctable)
Symptoms
Hardware failure in a single Control Station array.
A VNX Control Station failure occurred, and before the hardware was replaced, a fault occurred that would normally have triggered a Data Mover failover. No Data Mover failover happened.
Cause
In an array with a single Control Station, a hardware failure can leave the Control Station unbootable or unable to run the NAS Control Station services that manage the array. When that happens, any subsequent event that would normally trigger a Data Mover failover will not do so. The NAS Control Station and its management services are required to perform a Data Mover failover. A Control Station that is inoperable, or whose NAS services are in a stopped state, cannot trigger a Data Mover failover.
In a dual Control Station configuration, a failure of the primary Control Station services or hardware results in the standby peer Control Station forcibly taking over the role of primary Control Station. The takeover is triggered when the peer Control Station receives no responses to its management heartbeats, or when the heartbeat responses exceed a timeout value.
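The timeout condition described above can be pictured with a small shell sketch. This is a conceptual illustration only, not the Control Station's actual takeover logic: the function name `heartbeat_expired`, its arguments, and the 60-second timeout are hypothetical.

```shell
#!/bin/sh
# Conceptual illustration only: NOT the Control Station's real failover code.
# The function name, its arguments, and the 60-second timeout are hypothetical.
heartbeat_expired() {
    # $1 = epoch seconds of the last heartbeat reply
    # $2 = current epoch seconds
    # $3 = timeout in seconds
    # Returns 0 (true) when the last reply is older than the timeout.
    last_reply="$1"; now="$2"; timeout="$3"
    [ $((now - last_reply)) -gt "$timeout" ]
}

now=$(date +%s)
# Simulate a primary whose last reply arrived 90 seconds ago.
if heartbeat_expired "$((now - 90))" "$now" 60; then
    echo "peer unresponsive: standby would take over as primary"
else
    echo "peer healthy: no takeover"
fi
```

In a single Control Station array there is no peer to run a check of this kind, which is why the failure goes unhandled.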
Resolution
For a Control Station that is online, run the "nas_checkup" command to confirm whether any hardware or software faults are reported. If there are hardware faults, VNX Support should be engaged to resolve them. A warning for a software issue may be possible to resolve using the Dell Knowledgebase https://support.emc.com/
Whenever possible, run Collect Support Materials on the Control Station before making any changes, to capture the logs and the current state so that these can be analyzed if required.
To check specifically for a hardware fault, the commands below can be used. For enclosure_status, the Data Mover enclosure number is specified after the -e option.
$ nas_inventory -tree
$ /nas/sbin/enclosure_status -e 0 -v
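The checks above can be wrapped so that their output is saved for a support case. The following is a minimal sketch, assuming it is run on the Control Station as nasadmin; the `run_check` helper, the `OUT_DIR` location, and the output file names are illustrative assumptions and not part of the VNX tooling. Only the three commands themselves come from this article.

```shell
#!/bin/sh
# Sketch: save the output of the health checks named in this article.
# OUT_DIR, run_check, and the file names are assumptions for illustration.
OUT_DIR="${OUT_DIR:-/tmp/cs_checks}"

run_check() {
    # $1 = label used for the output file; remaining arguments = command to run
    label="$1"; shift
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "SKIP  $label: $1 not found"
        return 127
    fi
    mkdir -p "$OUT_DIR" || return 1
    out="$OUT_DIR/${label}.txt"
    rc=0
    "$@" >"$out" 2>&1 || rc=$?
    echo "DONE  $label: exit $rc, output in $out"
    return "$rc"
}

# Commands as given in the article; -e 0 selects Data Mover enclosure 0.
# "|| :" lets the remaining checks run even if one fails or is unavailable.
run_check nas_checkup nas_checkup || :
run_check inventory nas_inventory -tree || :
run_check enclosure0 /nas/sbin/enclosure_status -e 0 -v || :
```

Each check writes to its own file, so a failing command does not hide the output of the others.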
Additional Information
More References:
The procedure to generate the Collect Support Materials diagnostic Zip file on the VNX Control Station is below:
[Collect Support Materials]
- To generate a Collect Support Materials (diagnostic bundle) from the VNX NAS, run the following script on the Control Station when connected over SSH and logged in as nasadmin.
$ /nas/tools/collect_support_materials
- When the script completes, a Zip file is generated, and the name and location of this file are displayed on screen.
- An SCP client such as WinSCP is necessary to download the file from the Control Station to your workstation. The default location on the Control Station where the Collect Support Materials file is generated is /nas/var/emcsupport.
Note: Older Collect Support Materials files are automatically deleted to free space in /nas/var/emcsupport if required.
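As a rough aid for the download step, the sketch below finds the newest Zip file in the default bundle directory and prints an scp command to run from a workstation. The `latest_bundle` function name and the `CS_HOST` placeholder are hypothetical; the directory is the default named above, and the actual bundle file name is whatever the script reports on screen.

```shell
#!/bin/sh
# Sketch: locate the most recent support bundle on the Control Station.
# latest_bundle and CS_HOST are illustrative assumptions; the directory is
# the default location named in this article.
BUNDLE_DIR="${BUNDLE_DIR:-/nas/var/emcsupport}"

latest_bundle() {
    # Print the newest .zip (by modification time) in the given directory.
    dir="${1:-$BUNDLE_DIR}"
    newest=$(ls -t "$dir"/*.zip 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

if bundle=$(latest_bundle); then
    echo "Newest bundle: $bundle"
    # Replace CS_HOST with the Control Station address before running.
    echo "From your workstation: scp nasadmin@CS_HOST:$bundle ."
else
    echo "No .zip bundle found in $BUNDLE_DIR"
fi
```

Because older bundles may be deleted automatically, copying the file off the Control Station soon after it is generated is the safer choice.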
Celerra: How to increase the Control Station failover timeout value.
https://support.emc.com/kb/331802 (A Dell Support account is required to view this article)
The procedure of Celerra and VNX File Data Mover failover (A Dell Support account is required to view this article)