PowerPath is missing 1 path to the devices
Summary: Although the operating system correctly discovers all paths to the device, PowerPath is missing one path.
Symptoms
Environment:
OS : AIX (any release)
DELL EMC SW : PowerPath for AIX
DELL EMC array : any
Non DELL EMC SW : ASM
The operating system discovers all paths to the device, but PowerPath configures all of them except one.
The load appears unbalanced across the HBAs, with more I/O on the HBA that serves the path missing from PowerPath.
For example, list all the paths to Symmetrix device 104 (the output is similar for other array types):
(from inq)
/dev/rhdisk44 :EMC :SYMMETRIX :5978 :2500104000 : 262145280
/dev/rhdisk121 :EMC :SYMMETRIX :5978 :2500104000 : 262145280
/dev/rhdisk198 :EMC :SYMMETRIX :5978 :2500104000 : 262145280
/dev/rhdisk313 :EMC :SYMMETRIX :5978 :2500104000 : 262145280
Checking the pseudo-device shows:
# powermt display dev=26
Pseudo name=hdiskpower26
Symmetrix ID=000xxxxxxx25
Logical device ID=00104
Device WWN=60000970000xxxxxxx25533030313034
state=alive; policy=SymmOpt; queued-IOs=0
==============================================================================
--------------- Host ---------------   - Stor -  -- I/O Path --   -- Stats ---
###  HW Path               I/O Paths    Interf.  Mode     State   Q-IOs Errors
==============================================================================
   1 fscsi1                   hdisk198  FA  6d:09 active   alive      0      0
   0 fscsi0                   hdisk121  FA  3d:04 active   alive      0      0
   0 fscsi0                   hdisk44   FA  4d:04 active   alive      0      0
We are missing hdisk313.
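The comparison above can be automated. The following is a minimal sketch, not a Dell EMC tool: `missing_paths` is a hypothetical helper name, and it assumes the inq lines for one device and the `powermt display dev=N` output have been saved to two files laid out like the samples in this article (hdisk name in the third column of each path row).

```shell
# Hypothetical helper: list hdisks seen by the OS (inq) but absent from PowerPath.
# $1 = file holding the inq lines for one device
# $2 = file holding the "powermt display dev=N" output for the same device
missing_paths() {
    # OS view: take the first field and strip the "/dev/r" prefix (rhdisk44 -> hdisk44)
    os_paths=$(awk '{ sub("^/dev/r", "", $1); print $1 }' "$1" | sort)
    # PowerPath view: path rows carry the hdisk name in the third column
    pp_paths=$(awk '$3 ~ /^hdisk[0-9]+$/ { print $3 }' "$2" | sort)
    # Print every OS path that PowerPath does not list
    printf '%s\n' "$os_paths" | while read -r p; do
        [ -n "$p" ] || continue
        printf '%s\n' "$pp_paths" | grep -qx "$p" || echo "$p"
    done
}

# Example use (file names are placeholders):
#   missing_paths inq_104.txt powermt_26.txt
```

With the sample output shown in this article, the helper would report hdisk313.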
Cause
ASM is used on the host and is incorrectly configured. When an ASM device node is created, its major/minor numbers must match those of the corresponding PowerPath pseudo-device.
If we look at the major/minor numbers of the raw pseudo-device, we find:
# ls -l -Ralsi /dev/rhdiskpower26
87478 0 crw-rw---- 1 root system 36, 26 Jul 26 11:11 rhdiskpower26
The ASM device should have been created with major 36 and minor 26. In other words, we should see:
# ls -Ralsi /dev/ASM_104
87518 0 crw-rw-r-- 1 grid oinstall 36, 26 Jul 26 11:13 ASM_104
In our case, we find:
# ls -Ralsi /dev/ASM_104
87518 0 crw-rw-r-- 1 grid oinstall 19,313 Jul 26 11:13 ASM_104
Searching /dev for this major/minor pair gives:
# ls -Ralsi /dev | grep 19,313
87518 0 crw-rw-r-- 1 grid oinstall 19,313 Jul 26 11:13 ASM_104
87438 0 brw------- 1 root system 19,313 Jul 26 11:11 hdisk313
87439 0 crw------- 1 root system 19,313 Jul 26 11:11 rhdisk313
This is hdisk313, our missing path.
When PowerPath configures the pseudo-device, it needs to open all the paths. For hdisk313 the open fails because the device is held by ASM, so PowerPath cannot pick it up. As a result, the ASM device uses a single path that is not managed by PowerPath, with no load balancing. If that path fails, the application fails too, because there is no failover. This also explains why the load on the HBA serving the missing path is much higher than on the other HBA (visible in fcstat fcsX).
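The search for nodes that share a device number can be made stricter than a plain grep, which would also match pairs such as 119,313 or 19,3130. The following is a rough sketch under stated assumptions: `nodes_with_devno` is a hypothetical helper, and it expects input in the `ls -Ralsi` layout used in this article (inode and block count first, device number in the seventh column).

```shell
# Hypothetical helper: read "ls -Ralsi /dev" style output on stdin and print
# the names of block/character nodes whose device number is exactly MAJOR,MINOR.
# $1 = major number, $2 = minor number
nodes_with_devno() {
    # Normalize "36, 26" to "36,26" so the pair is always a single field,
    # then keep only device nodes (mode starts with b or c) with an exact match.
    sed 's/, */,/' |
        awk -v want="$1,$2" '$3 ~ /^[bc]/ && $7 == want { print $NF }'
}

# Example use:
#   ls -Ralsi /dev | nodes_with_devno 19 313
```

Any ASM node listed next to a native hdisk/rhdisk pair, rather than next to an rhdiskpower device, points to the misconfiguration described here.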
Resolution
Recreate the ASM device node with the major/minor numbers of the pseudo-device:
# rm /dev/ASM_104
# mknod /dev/ASM_104 c 36 26 <<<<< the node is created in "raw" (character) mode, with major 36 and minor 26, which are the major/minor numbers of hdiskpower26.
# chown grid:oinstall /dev/ASM_104
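To avoid copying the numbers by hand, the mknod command can be derived from the listing of the raw pseudo-device. This is only a sketch: `mknod_cmd_from_ls` is a hypothetical helper, it assumes the `ls -Ralsi` column layout shown in this article, and it only prints the command so that it can be reviewed before being run.

```shell
# Hypothetical helper: read one "ls -Ralsi" line for a raw pseudo-device on stdin
# and print (not execute) the matching mknod command.
# $1 = path of the ASM node to create
mknod_cmd_from_ls() {
    # Turn "36, 26" or "19,313" into two separate fields (major, minor),
    # then build the command from the first character-device line found.
    sed 's/, */ /' |
        awk -v node="$1" '$3 ~ /^c/ { print "mknod", node, "c", $7, $8; exit }'
}

# Example use:
#   ls -Ralsi /dev/rhdiskpower26 | mknod_cmd_from_ls /dev/ASM_104
```

After recreating the node, the major/minor numbers of /dev/ASM_104 and /dev/rhdiskpower26 should be identical in the ls output.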