PowerPath for AIX causes a host to stop responding or reboot after installation of PowerPath 7.0 or PowerPath 7.0 P01

Summary: PowerPath for AIX causes an unexpected host reboot every day around midnight after an installation of (or an upgrade to) PowerPath 7.0 or PowerPath 7.0 P01.

Symptoms

Environment:
Dell EMC SW: PowerPath 7.0 for AIX or PowerPath 7.0 P01 for AIX
Non-Dell EMC SW: Oracle instance installed on Dell EMC legacy storage (Dell storage managed by PowerPath: VMAX arrays with microcode below 5978, Unity, VNX, VPLEX, PowerStore, XtremIO, and so forth)
A line such as "0 0 * * * /etc/emc/bin/oracleinstance" is found in crontab /var/spool/cron/crontabs/root.
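
The crontab condition above can be checked without opening the file by hand. The following is a minimal sketch, not a PowerPath tool: has_oracleinstance_entry is a hypothetical helper name, and the default path is the root crontab location given in this article. It only reads the file and never runs the script.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with PowerPath): succeeds when the given
# crontab file contains an active (uncommented) line that references
# /etc/emc/bin/oracleinstance. The file is only read, never modified.
has_oracleinstance_entry() {
    crontab_file=${1:-/var/spool/cron/crontabs/root}
    # Drop comment lines first, then look for the script path in what remains.
    grep -v '^[[:space:]]*#' "$crontab_file" | grep -q '/etc/emc/bin/oracleinstance'
}

# Example use, as root:
#   if has_oracleinstance_entry; then
#       echo "Active oracleinstance entry found in the root crontab"
#   fi
```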

Symptoms:
Running /etc/emc/bin/oracleinstance manually causes the host to stop responding or to reboot. (Do not try it!)
The host stops responding and reboots unexpectedly every day around midnight. Analysis of the dump shows that the failure occurs after the MpxSetDevOrainstMap function is invoked.
The dump shows the following:

CRASH INFORMATION:
CPU 16 CSA F0xxxxxxxxxxxxxx at time of crash, error code for LEDs: 30000000
pvthread+194300 STACK:
[0000F434]___memmove64+000034 ()
[F1000000C049EE38]MpxSetDevOrainstMap+000138 (F1000C0310327E80, 0000000000000060)
[F1000000C04A90C8]MpxIocmd+0004C8 (0000015300000153, F1000C0310327E80,
   0000006000000060)
[F1000000C041517C]EmcpIocmd+0001FC (F00000002FF46B88, 0000015300000153,
   F1000C0310327E80, 0000006000000060)
[F1000000C042086C]power_ioctl+0003AC (8000000F00000000, 0000000400000004,
   000000002FF22998, 0000000000000003, 0000000000000000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[006A5D38]rdevioctl+0000B8 (??, ??, ??, ??, ??, ??)
[008E3F2C]spec_ioctl+00008C (??, ??, ??, ??, ??, ??)
[00704658]vnop_ioctl+000058 (??, ??, ??, ??, ??, ??)
[0071E774]vno_ioctl+0001B4 (??, ??, ??, ??, ??)
[007CF1F4]common_ioctl+000114 (??, ??, ??, ??)
[0000394C]syscall+000244 ()
[kdb_get_virtual_memory] no real storage @ 2FF228A8
[D011CA6C]D011CA6C ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF60F0
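
To compare a stack trace from another host against the signature above, a simple text search is usually enough. The sketch below assumes the trace has already been saved to a text file; how it is extracted from the dump is not covered here. stack_has_signature is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints every frame that mentions MpxSetDevOrainstMap
# in a saved stack trace and succeeds only when at least one frame matches.
stack_has_signature() {
    trace_file=$1
    grep 'MpxSetDevOrainstMap' "$trace_file"
}

# Example use (the file name is an assumption):
#   stack_has_signature /tmp/stack.txt && echo "Matches the signature in this article"
```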

Cause

A script, /etc/emc/bin/oracleinstance, was added to PowerPath for AIX 7.0 to handle the "PowerPath Device in Use Reporting" feature. This script runs every day at midnight because of an entry in the root crontab. The purpose of the script is to build a list of devices used by Oracle. This list is then transmitted to the array so that a higher priority can be given to these devices. This feature is supported by PowerMax arrays with microcode 5978 and later.

This list is built even when no PowerMax array is attached to the host. Because of a defect in PowerPath 7.0 and 7.0 P01, the host stops responding when the attached array does not support the feature.

Resolution

Workaround:
An easy workaround is to remove or comment out the crontab entry for /etc/emc/bin/oracleinstance in /var/spool/cron/crontabs/root. As root, run crontab -e root. This opens the root crontab file in vi by default, and the entry can then be deleted. See man crontab for more details.

Note: Removing this entry has no adverse effect, especially if there is no PowerMax storage (with microcode 5978 and later) supporting the Oracle instance. If the entry is removed while Oracle is using PowerMax devices, the Oracle devices have the same performance as any other device in the array.
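
For hosts where an interactive edit is not practical, the same change can be prepared with a short script. This is a sketch of an alternative to the crontab -e procedure described in the workaround, not a Dell-provided tool: comment_out_oracleinstance is a hypothetical helper name, and the temporary file names in the usage comments are assumptions. The function only transforms a copy of the crontab. Installing the result is a separate, deliberate step.

```shell
#!/bin/sh
# Hypothetical helper: copies a crontab file, prefixing '#' to every active
# line that references /etc/emc/bin/oracleinstance. All other lines,
# including lines that are already commented out, are copied unchanged.
# The input and output files must be different files.
comment_out_oracleinstance() {
    in_file=$1
    out_file=$2
    awk '!/^[ \t]*#/ && /\/etc\/emc\/bin\/oracleinstance/ { print "#" $0; next }
         { print }' "$in_file" > "$out_file"
}

# Example use, as root (file names are assumptions):
#   crontab -l > /tmp/root.cron.orig
#   comment_out_oracleinstance /tmp/root.cron.orig /tmp/root.cron.new
#   diff /tmp/root.cron.orig /tmp/root.cron.new   # review the change first
#   crontab /tmp/root.cron.new                    # install the edited crontab
```

Keeping the original copy makes it easy to restore the entry after upgrading to a fixed release.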

Resolution:
PowerPath for AIX 7.0 P02 and later releases address this issue.
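
When checking several hosts, the "7.0 P02 and later" rule can be expressed as a small comparison. The sketch below assumes the version is supplied as a string in the form used in this article (for example "7.0", "7.0 P01", or "7.0 P02"). It does not parse the output of any PowerPath command, and is_fixed_release is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given PowerPath for AIX version
# string is 7.0 P02 or later. Assumed input format: "MAJOR.MINOR" optionally
# followed by " Pnn", as written in this article.
is_fixed_release() {
    version=$1
    base=${version%% *}                 # "7.0 P02" -> "7.0"
    case $version in
        *" P"*) patch=${version##* P} ;;   # "7.0 P02" -> "02"
        *)      patch=0 ;;                 # no patch level given
    esac
    major=${base%%.*}
    case $base in
        *.*) minor=${base#*.}; minor=${minor%%.*} ;;
        *)   minor=0 ;;
    esac
    if [ "$major" -gt 7 ]; then return 0; fi
    if [ "$major" -lt 7 ]; then return 1; fi
    if [ "$minor" -gt 0 ]; then return 0; fi
    # Same base release as 7.0: fixed from patch level 2 onward.
    [ "$patch" -ge 2 ]
}

# Example use:
#   is_fixed_release "7.0 P01" || echo "Affected release: apply the workaround or upgrade"
```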

Additional Information

Here is what is found in the release notes for PowerPath 7.0 P02 for AIX:

Problem number | Problem summary | Found in version | Fixed in version
PPAI-783 | Avoid host crash, and display warning message in case configuration exceeds maximum paths that are supported per device. | 7.0 | 7.0 P02
PPEE-711 | During an AIX LPM, we see "E9595B51 0914221120 I S powerpath0 CONTROL POINT FAILURE" | 7.0 | 7.0 P02
PPAI-683 | AIX host crash caused by Oracle Instance Name functionality | 7.0 | 7.0 P02
PPAI-671 | All pprootdev commands fail with "/usr/sbin/pprootdev[15]: (requiredSpaceavaliableSpace)* 2: 0403-009 The specified number is not valid for this command." | 7.0 | 7.0 P02
PPEE-583 | PowerPath management console loses communication to the appliance every day. | 7.0 | 7.0 P02
PPAI-618 | PowerPath: powerdd: MpxPeriodicCallbackDaemon caused AIX failed and restarted. | 6.4 | 7.0

Article Properties
Article Number: 000172441
Article Type: Solution
Last Modified: 03 Jul 2024
Version: 4