8 Krypton

ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Hi all,

I was hoping someone may have seen the problem we have encountered over the last 2 weeks.

Our VMs are shutting down unexpectedly. Our Exchange server has rebooted twice today 😞

The timing of the shutdowns matches entries we are seeing in the EqualLogic logs:

Severity  Date and Time          Member    ID                        Message
--------  ---------------------  --------  ------------------------  -------
 INFO     26/07/2013 9:35:19 AM  APCEQLM1  7.2.15 | 7.2.24 | 7.2.29  iSCSI session to target '192.168.43.12:3260, iqn.2001-05.com.equallogic:4-52aed6-c7ca4349e-87b00217be35059e-lon-prod-vm-1' from initiator '192.168.43.31:58797, iqn.1998-01.com.vmware:apcesxhost1-24c73c18' was closed. | iSCSI initiator connection failure. | No response on connection for 6 seconds.
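
For anyone wanting to correlate these events per host, here is a minimal Python sketch that pulls the target, initiator, and timeout out of a message like the one above. The regular expressions are my own guess at the message layout based on this single entry, and the function name is hypothetical; other EqualLogic event types may be formatted differently.

```python
import re

# Message text copied from the log entry above.
MESSAGE = (
    "iSCSI session to target '192.168.43.12:3260, "
    "iqn.2001-05.com.equallogic:4-52aed6-c7ca4349e-87b00217be35059e-lon-prod-vm-1' "
    "from initiator '192.168.43.31:58797, "
    "iqn.1998-01.com.vmware:apcesxhost1-24c73c18' was closed. "
    "| iSCSI initiator connection failure. "
    "| No response on connection for 6 seconds."
)

# Assumed layout: target 'IP:PORT, IQN' from initiator 'IP:PORT, IQN'
SESSION = re.compile(
    r"target '(?P<target_ip>[\d.]+):(?P<target_port>\d+), (?P<target_iqn>[^']+)' "
    r"from initiator '(?P<init_ip>[\d.]+):(?P<init_port>\d+), (?P<init_iqn>[^']+)'"
)
TIMEOUT = re.compile(r"No response on connection for (\d+) seconds")


def parse_session_closed(message):
    """Return a dict of the session endpoints and timeout, or None if no match."""
    match = SESSION.search(message)
    if match is None:
        return None
    info = match.groupdict()
    timeout = TIMEOUT.search(message)
    info["timeout_seconds"] = int(timeout.group(1)) if timeout else None
    return info


if __name__ == "__main__":
    for key, value in parse_session_closed(MESSAGE).items():
        print(f"{key}: {value}")
```

Grouping the parsed entries by initiator IP should show whether one host or both are losing sessions.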

 

We have one PS4100X running the latest EQL firmware, 6.0.5, and two stacked PowerConnect 6224 switches. Our array and switches were configured by Dell support.

We have followed Dell's recommendations for connecting vSphere to iSCSI storage. As far as I can see, our vSwitches are configured correctly.

Does anyone have any ideas why we're seeing this behavior? I've logged this with Dell support, but thought I'd ask on the forums in case anyone has experienced this issue.

Thanks in advance!

Lee

Moderator

Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

First, ensure you can ping and traceroute from the ESX host(s) to every eth interface on the array:

ping -I sourceIP destinationIP

The source is one of the interfaces on the ESX host; the destination is each eth interface on the array(s). You need to test all possible combinations!

Then do the same ping test from each array member to each iSCSI interface on your ESX hosts (telnet to an eth interface on the member; do this for all members and all possible combinations). The traceroute command on the array is a bit different:

support traceroute -s sourceIP destinationIP

Here the source would be each eth interface on the array, and the destination would be each iSCSI interface on all ESX hosts connected to the array group.
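
To avoid missing a combination, it can help to generate the list of tests up front. Below is a minimal Python sketch that only prints the commands; it does not run them. The addresses 192.168.43.31 and 192.168.43.12 come from the log entry in the original post, while 192.168.43.32 and 192.168.43.13 are made-up placeholders, so substitute your real host and array interface addresses.

```python
from itertools import product

# 192.168.43.31 and 192.168.43.12 appear in the log entry in this thread.
# The second address in each list is a hypothetical placeholder.
ESX_ISCSI_IPS = ["192.168.43.31", "192.168.43.32"]
ARRAY_ETH_IPS = ["192.168.43.12", "192.168.43.13"]


def ping_matrix(sources, destinations):
    """Return one 'ping -I source destination' command per source/destination pair."""
    return [f"ping -I {src} {dst}" for src, dst in product(sources, destinations)]


if __name__ == "__main__":
    for command in ping_matrix(ESX_ISCSI_IPS, ARRAY_ETH_IPS):
        print(command)
```

Swapping the two lists gives the set of tests to run in the other direction, from the array members to the hosts.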

Ensure you are using the GroupIP for the iSCSI discovery address on your storage adapter

Don has written several posts on configuration, and some of the details and how-tos are covered in this forum (see Don's comments):

en.community.dell.com/.../20008239.aspx

Configure iSCSI with MPIO using this PDF:  

www.dellstorage.com/.../DownloadAsset.aspx

If you are convinced that you have done all these, I would open a support case.

-Joe

Social Media and Community Professional
#IWork4Dell
Get Support on Twitter - @dellcarespro

Follow me on Twitter: @joesatdell 

Moderator

Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Also, since you stated that this started two weeks ago, you might find that your network topology changed (an array failover to the secondary controller is a typical example). This might indicate an issue with the way the ESX hosts and/or the array's secondary controller (now the active one) are cabled or configured to the switches, or the inter-switch connection could be an issue too. The ping test should identify where the problem is.

-Joe

8 Krypton

Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Hi Joe,

Thanks very much for the suggestions. I'll try the troubleshooting steps you mentioned, and hopefully they will point me in the right direction.

Thanks

Lee

8 Krypton

Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Hi Joe,

I've run ping tests across all combinations from both hosts to the array, and all were successful. The traceroute was also successful, although I received a warning about multiple interfaces being found. I believe this is because we have 2 vmnics assigned to our management network, one of which is on standby.

From the array, the ping tests were also successful across all combinations, with no packet loss. The traceroute from the array failed, though.

Our interfaces are on the 192.168.43.x network, and the default gateway configured on the array is 192.168.43.1, which doesn't exist. The Dell engineer who set this up explained that the gateway was not relevant, so we accepted this. I'm not sure if that is the cause of the problem. The ping tests work fine.

Stupidly, I forgot to mention that we had some power issues recently due to a faulty UPS, and the EQL rebooted. I don't know if the array is now using a different controller. However, the array is cabled according to Dell best practices for redundancy, so I'm not sure if it makes any difference.

The only other thing I noticed was that the MTU for the iSCSI vSwitch on my second host was still at 1500. I changed that to 9000. I'm not sure if that would cause this problem, but either way it needed to be changed.
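
For anyone checking the same thing, here is a small Python sketch of the arithmetic behind a jumbo-frame ping test and a consistency check across hosts. The host names and MTU values in the example dictionary are hypothetical stand-ins for my setup before the change. It assumes plain IPv4 with a 20-byte IP header and an 8-byte ICMP header, which is why 8972 is the payload size commonly used to test a 9000-byte MTU with fragmentation disabled.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented frame."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def mtu_mismatches(mtus):
    """Return entries whose MTU differs from the largest MTU in the set."""
    expected = max(mtus.values())
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}


if __name__ == "__main__":
    # Hypothetical names and values, for illustration only.
    iscsi_vswitch_mtus = {"host1": 9000, "host2": 1500}
    print("payload for MTU 9000:", max_ping_payload(9000))
    print("mismatched:", mtu_mismatches(iscsi_vswitch_mtus))
```

Every device in the iSCSI path (vSwitch, vmkernel port, physical switch ports, and array interfaces) needs to agree on the MTU, so the same check applies beyond the two hosts.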

I've logged this with DELL so hopefully they can shed some light on the issue.

Thanks.

Lee


Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

What's the FW on the 6224 switches?  

A 6-second timeout means that the server failed to respond to keepalive packets. This is very similar to a ping test and is part of the iSCSI spec: each side periodically pings the other, and when that fails, the iSCSI session is dropped. When you reboot a server you will see these errors as well.
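
As a rough illustration of that timeout logic, here is a simplified Python model. It is not the array's actual implementation: the function name and the idea of feeding it a list of reply timestamps are my own, and only the 6-second limit comes from the log message.

```python
def session_drop_time(reply_times, timeout=6.0, horizon=60.0):
    """Return the time at which the session would be dropped, or None.

    reply_times: seconds (from session start at t=0) at which the
    initiator answered a keepalive. The session is dropped once more
    than `timeout` seconds pass without any reply. `horizon` is the
    end of the observation window.
    """
    last_reply = 0.0
    for t in sorted(reply_times):
        if t - last_reply > timeout:
            # The gap was already too long before this reply arrived.
            return last_reply + timeout
        last_reply = t
    if horizon - last_reply > timeout:
        return last_reply + timeout
    return None


if __name__ == "__main__":
    healthy = [float(t) for t in range(1, 61)]  # a reply every second
    stalled = [2.0, 4.0, 6.0, 8.0, 10.0]        # host goes quiet after t=10
    print(session_drop_time(healthy))
    print(session_drop_time(stalled))
```

The point of the model is that the array does not need the host to be down, only silent on that connection for longer than the limit, which is why a stalled network path and a rebooted server produce the same log message.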

8 Krypton

Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Hi Don,

Firmware on the switches is 3.3.4.1

When you say the server failed to respond to keepalive packets, do you mean the physical server or the VM? Just making sure I understand.

We didn't reboot the physical server or the VM, except of course when it shut down unexpectedly.

Thanks

Lee


Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Re: Keepalive. Whoever is in charge of the iSCSI session handles this. If it's a VM using VMDKs, then it's ESX; if the VM is using its native iSCSI adapter, then the VM is responsible for those sessions.

Re: Switch. That firmware needs to be updated.

Are the ESX servers and EQL array connecting to the same NTP time source? If not, they should be.

8 Krypton

Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Hi Don,

Thanks for confirming re: keepalive. We use VMDKs, so ESX it is. I'll aim to get the firmware updated on the switches ASAP.

Regarding the ESX servers and EQL array, neither uses an NTP time source. Again, I will look to rectify this. Is there an NTP service that Dell recommends?

Thanks for the advice.

Lee


Re: ISCSI Connection Failure Causing ESXi 5.1 VMs to shutdown

Re: NTP. Not really. There are some public ones out there, pool.ntp.org for example. Setting up NTP is important; the clocks on the ESX servers commonly drift, especially under heavy CPU load.

Your AD servers, Exchange, and SQL should all be on the same NTP servers.
