March 6th, 2014 04:00

Lost Connection issue VMware

All

I have a lost communication error showing for a server on VMware 5.1 attached to a VMAX.

Checks seem to indicate a PDL (Permanent Device Loss) issue for the device.

The articles below describe possible fixes, but I have tried powering off the affected machine, re-mounting, and rescanning, and none of it seems to work. I cannot log in to the actual ESXi host because I lack permissions, so I am not able to try a LUN reset.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684

http://cormachogan.com/2012/09/07/vsphere-5-1-storage-enhancements-part-4-all-paths-down-apd/

http://www.resole.nl/vmware-how-to-reset-a-lun-connection-without-outage/

Is there anything else I can try to fix this lost communication error on the VMware side?

April 22nd, 2014 08:00

Eventually we had to get VMware in to sort this. They made some registry and other changes to the cluster, which we were not made aware of, and that fixed the error.

Beyond this, we found no way other than the articles listed to try to resolve this issue.

So the answer is that putting a call in to VMware seems to be the only way to get a fix.

March 11th, 2014 09:00


I am wondering if this could have been related to PowerPath. This was a P2V migration, and as part of it we had to uninstall PowerPath as it is not supported on ESXi.

We tried manually removing any locks on the array for the devices, and we also checked that the SCSI flags were set correctly. We had already done two earlier migrations that worked with the same configuration and the same front-end ports.

We also looked at possible Linux commands for removing the reservations manually:

http://itdoc.hitachi.co.jp/manuals/3000/30009140H0e/D1400194.HTM
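
Before clearing anything, it can help to just read the reservation state on the device. The following is only a rough sketch, assuming the sg3_utils package (which provides sg_persist) is installed on the Linux guest. The helper name sg_persist_read_cmd and the device path /dev/sdX are placeholders for illustration, not something we actually ran in this case. It only builds read-only commands and never clears a reservation.

```python
import shlex


def sg_persist_read_cmd(device, what="keys"):
    """Build a read-only sg_persist command line (hypothetical helper).

    what="keys"        -> list the registered reservation keys
    what="reservation" -> show the current reservation holder, if any
    """
    flags = {"keys": "--read-keys", "reservation": "--read-reservation"}
    if what not in flags:
        raise ValueError("what must be 'keys' or 'reservation'")
    # --in selects the PERSISTENT RESERVE IN service actions, which only read state.
    return ["sg_persist", "--in", flags[what], "--device=" + device]


# /dev/sdX is a placeholder; substitute the real device before running the command.
for what in ("keys", "reservation"):
    print(shlex.join(sg_persist_read_cmd("/dev/sdX", what)))
```

The printed commands can be pasted into a root shell on the guest, or passed to subprocess.run if you want to script the check.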

However, the fix seems to have been in the cluster.conf / fencing aspects of the setup. By enabling SCSI fencing we were able to start the databases correctly. This was different from the other two P2V migrations we did on the same day, which had the fencing configuration removed from their cluster.conf files.

In effect, we reinserted the fence_scsi option into the cluster configuration file, which, without clearing the actual reservations, allowed the packages and systems to start.

http://manpages.ubuntu.com/manpages/lucid/man8/fence_scsi.8.html

http://linuxnode1.blogspot.co.uk/2013/06/using-scsi-persistent-reservations-with.html
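
For anyone comparing configurations between migrated hosts, a quick way to see which nodes still reference fence_scsi is to parse cluster.conf. This is a minimal sketch, assuming the usual cluster.conf layout with fencedevice entries and per-node fence/method/device references. The SAMPLE content, the node names, and the device name scsi_fence are made up for illustration and are not our actual configuration.

```python
import xml.etree.ElementTree as ET

# Illustrative cluster.conf content only; names and values are assumptions.
SAMPLE = """<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <device name="scsi_fence"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_scsi" name="scsi_fence"/>
  </fencedevices>
</cluster>"""


def nodes_using_fence_scsi(xml_text):
    """Map each cluster node name to True if it references a fence_scsi device."""
    root = ET.fromstring(xml_text)
    # Names of all fence devices whose agent is fence_scsi.
    scsi_devices = {
        fd.get("name")
        for fd in root.iter("fencedevice")
        if fd.get("agent") == "fence_scsi"
    }
    result = {}
    for node in root.iter("clusternode"):
        # Only look at devices listed under the node's fence methods.
        devices = node.findall("./fence/method/device")
        result[node.get("name")] = any(d.get("name") in scsi_devices for d in devices)
    return result


print(nodes_using_fence_scsi(SAMPLE))
```

To check a real host, read the file contents (typically /etc/cluster/cluster.conf) and pass them to the function in place of SAMPLE.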

This is now fixed and can be ignored
