I need help with an issue I am encountering in the field.
I have migrated data from DMX3 volumes assigned to a two-node Linux cluster to a VMAX running Enginuity 5875.135, using Open Replicator Hot Pull. I used the "FrontEndZeroDetect" option, which was supposed to improve performance and also avoid zero-space reclamation as a post-migration activity. Unfortunately, a bug in the Enginuity code caused the DA request buffer to be incorrectly marked as in use, which crashed two other servers (which were not part of the ongoing migration). EMC Support was called in and a severity 2 case was opened to find the root cause. Cleanup was performed, and the Enginuity pack upgrade to fix the bug is scheduled for tonight. Meanwhile, the migration completed successfully, and the system admin confirmed at the time that everything was running fine. Now, four days later, he has come back to me with the following note:
This is the error I’m seeing when trying to activate the Volume Groups, perhaps EMC has seen this issue…
[root@cvoinfs1 ~]# service clvmd start
Activating VG(s): Not activating cvoinfs1/lvpublic since it does not pass activation filter.
Not activating cvoinfs1/lvarchive since it does not pass activation filter.
Not activating cvoinfs1/lvinfinys_root since it does not pass activation filter.
0 logical volume(s) in volume group "cvoinfs1" now active
5 logical volume(s) in volume group "VolGroup00" now active
Not activating cvoinfs2/lvhome since it does not pass activation filter.
Not activating cvoinfs2/lvapps since it does not pass activation filter.
Not activating cvoinfs2/lvlanding since it does not pass activation filter.
Not activating cvoinfs2/lvinfinys_data since it does not pass activation filter.
0 logical volume(s) in volume group "cvoinfs2" now active
I do not understand where the problem could be. I verified the devices (all in the RW state and bound to the pool, and the pool has sufficient free capacity). I do not know what caused this issue to appear suddenly.
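For what it's worth, clvmd's "does not pass activation filter" message points at the host-side LVM configuration rather than the array: on RHEL, an LV is activated only if its VG name, LV name, or a matching @tag appears in activation/volume_list in /etc/lvm/lvm.conf (the effective value can be checked with `lvm dumpconfig activation/volume_list`). A minimal sketch of that check, using a hypothetical volume_list value in which cvoinfs1 is missing:

```shell
# Hypothetical lvm.conf value: VolGroup00 is listed, cvoinfs1 is not,
# so clvmd would skip every LV in cvoinfs1 with the message seen above.
volume_list='[ "VolGroup00" ]'
vg="cvoinfs1"

if printf '%s' "$volume_list" | grep -q "\"$vg\""; then
    echo "activating $vg"
else
    echo "Not activating $vg since it does not pass activation filter."
fi
```

If the VG name (or the node's tag) really is listed, a harmless next check after a migration is rescanning for stale metadata with `vgscan` and `pvscan` on each node.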
Has anyone out there run into similar issues and found a solution?
The Primus solution says that a Linux cluster requires SCSI3_persistent_reserv to be enabled on the devices. But would it be a problem if it is enabled in cases where it is not required?
The following is a Primus(R) eServer solution:
Solution Class: 3.X Compatibility
Goal What applications require that the SCSI3 persistent reservation volume flag be set on Symmetrix disks?
Fact Application SW: Sun Cluster 3.0
Fact Application SW: VERITAS DBED/AC for Oracle 9i RAC 3.5
Fact Application SW: TruCluster 5.x
Fact Application SW: Microsoft Cluster Server for Windows 2008 (Failover Cluster)
Symptom The VERITAS utility vxfentsthdw fails with the following error:
Registrations to disk failed on node XXX.
Disk is not SCSI-3 compliant on node XXX.
Execute the utility vxfentsthdw again and if failure persists contact the vendor for support in enabling SCSI-3 persistent reservations.
Fix The SCSI3 persistent reservation bit is required for Sun Cluster 3.0 and above, VERITAS Database Edition/Advanced Cluster for Oracle 9i RAC 3.5, and HP TruCluster 5.x.
The setting is also required for RHEL cluster. See the RedHat Technical Note: https://access.redhat.com/kb/docs/DOC-30004
Note The SCSI3 persistent reservation (PER) flag is set at the Symmetrix volume level. It is not a director flag setting and should not be confused with the SCSI3 director flag. For information on how to set the PER bit, see Setting Device Characteristics in the Solutions Enabler Symmetrix Configuration Change CLI manual. The output of symdev show <device> will display the setting for the specified device.
Information on this can be found in the latest version of the Symmetrix Array Controls Manual, in the section on Setting Device Attributes, found in the Chapter, on Managing Configuration Changes.
There are a few restrictions that should be noted, also covered in the manual.
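As the note says, `symdev show` displays the setting. A hedged sketch of checking it from a host with Solutions Enabler installed (the SID and device number below are placeholders, and the exact label of the reservation flag line can vary by Solutions Enabler version, so grep loosely):

```shell
# Placeholders: 000194901234 is a hypothetical array SID, 0ABC a hypothetical device.
symdev -sid 000194901234 show 0ABC | grep -i "reserv"
```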
I don't believe this should be a problem; the application/OS simply does not request a lock even though the device is capable of it. One time I had a couple of devices that came from a Windows 2008 cluster (which requires SCSI-3 reservations), and they were presented to a Red Hat RAC cluster with no issues. We were getting ready for a migration and noticed it by accident.
We don't recommend enabling the PER bit for devices where it is not needed. We have seen issues at some customer sites, so we don't recommend it.
It turned out to be an OS-specific issue. The LVs were misconfigured, which is why they had problems.
He has redone everything now, and the server is up and running.
I do hear mixed answers on SCSI-3 reservations, though. I think Development Engineering decided to enable SCSI-3 reservations by default on the devices because they believed it would not cause any problems. I believe the same: having SCSI-3 reservations enabled on the devices should not be a problem, since the application or OS simply does not request a lock even though the device is capable. Or does it?
From the log, it does indeed sound like an LVM configuration issue.
What FS type are you using (ext3/GFS)?
Are the VGs cluster-aware, or are you using the LV activation/deactivation approach with lvm.sh?
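To check the cluster-aware part of that question: whether a VG is clustered shows up as a 'c' in the sixth character of its vg_attr string (e.g. from `vgs -o vg_name,vg_attr`), and `vgchange -cy`/`-cn` toggles it. A small sketch parsing a hypothetical attr string:

```shell
# Hypothetical 'vgs -o vg_attr --noheadings' output for one VG;
# the 6th attribute character is 'c' when the VG is clustered.
attr="wz--nc"

if [ "$(printf '%s' "$attr" | cut -c6)" = "c" ]; then
    echo "$attr: clustered VG (needs clvmd running on every node)"
else
    echo "$attr: local VG (activation handled per node, e.g. via lvm.sh)"
fi
```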