Lost connection issue on VMware
All
I have a lost communication error showing for a server on VMware 5.1 against a VMAX.
Checks seem to indicate a PDL (Permanent Device Loss) issue for the device.
The articles below seem to describe fixes, but I have tried powering off the affected machine, re-mounting, and rescanning, and none of it seems to work. I cannot log in to the actual ESXi host because I lack permissions, so I am not able to try a LUN reset.
http://cormachogan.com/2012/09/07/vsphere-5-1-storage-enhancements-part-4-all-paths-down-apd/
http://www.resole.nl/vmware-how-to-reset-a-lun-connection-without-outage/
Is there anything else I can try to fix this lost communication error on VMware?
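For anyone trying to confirm the PDL diagnosis from logs, here is a minimal Python sketch that checks whether a log line carries one of the SCSI sense codes VMware documents as signalling PDL. It assumes the line contains a fragment of the form "Valid sense data: 0x5 0x25 0x0"; the helper name `is_pdl_sense` is hypothetical, and the code list should be checked against the VMware KB for your build.

```python
import re

# (sense key, ASC, ASCQ) triples commonly documented by VMware as PDL indicators.
# Assumption: verify this list against the PDL/APD KB article for your ESXi version.
PDL_SENSE_CODES = {
    (0x5, 0x25, 0x00),  # ILLEGAL REQUEST: logical unit not supported
    (0x4, 0x4C, 0x00),  # HARDWARE ERROR: logical unit failed self-configuration
    (0x4, 0x3E, 0x03),  # HARDWARE ERROR: logical unit failed self-test
    (0x4, 0x3E, 0x01),  # HARDWARE ERROR: logical unit failure
}

# Assumed fragment layout: "Valid sense data: <key> <asc> <ascq>", all in hex.
SENSE_RE = re.compile(
    r"Valid sense data:\s*(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
)


def is_pdl_sense(line: str) -> bool:
    """Return True if the line carries one of the PDL sense code triples."""
    match = SENSE_RE.search(line)
    if match is None:
        return False
    triple = tuple(int(value, 16) for value in match.groups())
    return triple in PDL_SENSE_CODES
```

Feeding each line of a saved log through `is_pdl_sense` gives a quick yes/no on whether the array is reporting the device as permanently gone, as opposed to an all-paths-down condition, which does not return these codes.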
michael_churchi
April 22nd, 2014 08:00
Eventually we had to get VMware in to sort this out. They made some registry and other changes to the cluster, which we were not made aware of, and these fixed the error.
Beyond this, we found nothing other than the articles listed that might resolve the issue.
So the answer is that putting a call in to VMware seems to be the only way to get a fix.
michael_churchi
March 11th, 2014 09:00
I am wondering if this could have been related to PowerPath. This was a P2V migration, and as part of it we had to uninstall PowerPath, as it is not supported on ESXi.
We tried manually removing any locks on the array for the devices, and we also checked that the SCSI flags were set correctly. We had already done two earlier migrations that worked with the same configuration and the same front-end ports.
We also looked at possible Linux commands for removing the reservations manually:
http://itdoc.hitachi.co.jp/manuals/3000/30009140H0e/D1400194.HTM
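As a generic illustration of inspecting and clearing SCSI persistent reservations from Linux, here is a small Python sketch that builds `sg_persist` command lines (from the sg3_utils package). This is not necessarily what the linked page describes and is not what fixed our problem. The function names, the device path, and the key value in the usage note are placeholders, and the commands are only constructed here, not executed.

```python
import shlex


def read_keys_cmd(device: str) -> list[str]:
    """Command to list the reservation keys registered on a device."""
    return ["sg_persist", "--in", "--read-keys", f"--device={device}"]


def read_reservation_cmd(device: str) -> list[str]:
    """Command to show the current reservation holder and type, if any."""
    return ["sg_persist", "--in", "--read-reservation", f"--device={device}"]


def clear_cmd(device: str, key: int) -> list[str]:
    """Command to clear the reservation and all registrations.

    The key is passed as hex digits; the target only accepts it if that key
    is registered through the path the command is sent on.
    """
    return [
        "sg_persist", "--out", "--clear",
        f"--param-rk={key:x}", f"--device={device}",
    ]


def as_shell(cmd: list[str]) -> str:
    """Render a command list as a copy-pasteable shell line."""
    return shlex.join(cmd)
```

For example, `as_shell(read_keys_cmd("/dev/sdX"))` renders the read-only query for a placeholder device. Clearing reservations on a LUN that a live cluster is using is destructive, so the `--clear` form should only be run once the read-only commands have been reviewed.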
However, the fix seems to have been in the cluster.conf / fencing aspects of the setup. By enabling SCSI fencing there, we were able to start the databases correctly. This differed from the other two P2V migrations we did on the same day, which had the fencing entries removed from their cluster.conf files.
In effect, we reinserted the fence_scsi option into the cluster configuration file. That, rather than clearing the actual reservations, allowed the packages and systems to start.
http://manpages.ubuntu.com/manpages/lucid/man8/fence_scsi.8.html
http://linuxnode1.blogspot.co.uk/2013/06/using-scsi-persistent-reservations-with.html
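To compare a working node's configuration with a failing one, a rough Python sketch like the following can report whether a cluster.conf defines a fence_scsi device and which nodes reference it. The sample XML is illustrative only: the cluster, node, and device names are made up, and the layout follows the usual fence_scsi examples, not our actual file.

```python
import xml.etree.ElementTree as ET

# Illustrative cluster.conf; all names here are hypothetical placeholders.
SAMPLE_CLUSTER_CONF = """\
<cluster name="dbcluster" config_version="2">
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <device name="scsi_fence"/>
        </method>
      </fence>
      <unfence>
        <device name="scsi_fence" action="on"/>
      </unfence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_scsi" name="scsi_fence"/>
  </fencedevices>
</cluster>
"""


def fence_scsi_device_names(conf_text: str) -> list[str]:
    """Names of all fence devices that use the fence_scsi agent."""
    root = ET.fromstring(conf_text)
    return [
        dev.get("name")
        for dev in root.iter("fencedevice")
        if dev.get("agent") == "fence_scsi"
    ]


def nodes_using(conf_text: str, device_names: list[str]) -> list[str]:
    """Cluster nodes whose fence or unfence sections reference the devices."""
    root = ET.fromstring(conf_text)
    wanted = set(device_names)
    nodes = []
    for node in root.iter("clusternode"):
        referenced = {dev.get("name") for dev in node.iter("device")}
        if referenced & wanted:
            nodes.append(node.get("name"))
    return nodes
```

Running both functions against the cluster.conf from each migrated system would show at a glance which ones still carry the fence_scsi entries and which had them stripped out.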
This is now fixed and can be ignored.