Origin3k
4 Operator
2.4K Posts
August 11th, 2014 01:00
Hi Ryan,
I'm seeing the same during tests. Sometimes it works, and sometimes one or a few hosts lose connection. A host reboot is needed to get access to the LUN/datastore again.
An SR is already open, but we are facing some more issues, and maybe they are related to the infrastructure.
Regards,
Joerg
ryannw
20 Posts
August 11th, 2014 01:00
Hi Don,
Thanks for your information. I have already opened a case and am waiting for support's reply.
I am using 7.0.5 on all members. The interlink speed is 10G and latency is low (e.g. several ms). All settings seem fine according to the PDF (except jumbo frames, since our network infrastructure does not allow them).
In fact, when the problem occurred, I noticed the following weird things:
1. According to the EQL Group Manager, the new connections were established successfully.
2. On the ESXi host, although all the devices and the datastore are greyed out, all the corresponding paths are established and shown as active.
3. In the vmkernel.log file, the following SCSI sense code is shown:
H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0
According to this link, some sort of timeout occurred.
Do you know which SCSI timeout it's referring to, and is there any way to configure the timeout value?
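For anyone trying to read status lines like the one above, here is a minimal Python sketch that pulls the H/D/P fields and the three sense bytes out of a log line. The name table is an assumption borrowed from the Linux SCSI host-byte (DID_*) values, and `parse_status` and `HOST_STATUS` are names made up for this sketch, not part of any VMware tool.

```python
import re

# Host-status names follow the Linux SCSI host-byte (DID_*) convention.
# Treating the ESXi "H:" field the same way is an assumption of this sketch.
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIME_OUT",
    0x5: "ABORT",
    0x7: "ERROR",
    0x8: "RESET",
}

HEX = r"(0x[0-9a-fA-F]+)"
STATUS_RE = re.compile(
    r"H:" + HEX + r"\s+D:" + HEX + r"\s+P:" + HEX
    + r"(?:.*?sense data:\s*" + HEX + r"\s+" + HEX + r"\s+" + HEX + r")?"
)


def parse_status(line):
    """Return the decoded status fields of a log line, or None if absent."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    host, device, plugin = (int(g, 16) for g in m.groups()[:3])
    sense = None
    if m.group(4) is not None:
        # Order in the log line: sense key, ASC, ASCQ
        sense = tuple(int(g, 16) for g in m.groups()[3:])
    return {
        "host": host,
        "host_name": HOST_STATUS.get(host, "UNKNOWN"),
        "device": device,
        "plugin": plugin,
        "sense": sense,
    }


if __name__ == "__main__":
    line = "H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0"
    print(parse_status(line))
```

Under that assumed table, 0x1 maps to a no-connect condition and 0x3 to a timeout, so it may be worth confirming the mapping against VMware's own documentation before drawing conclusions from the host status.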
BTW, by trial and error, increasing the ESXi option Disk.DiskDelayPDLHelper from 10 to 20 secs (a reboot is required for it to take effect) seems to fix the problem. However, since there's not much information available about this option, I'm not sure whether such a change has any adverse effect.
Thanks again.
Ryan
ryannw
20 Posts
August 11th, 2014 02:00
Hi Joerg,
Have you tried increasing the Disk.DiskDelayPDLHelper option (a reboot is required)? It seems to have fixed the problem. I have tested at least 30 times and the problem seems to be gone.
According to its description, it delays the execution of the PDL helper, which I believe to be the culprit.
However, since there's not much information about this option on the web, I need to test further and ask VMware if necessary.
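In case it helps anyone repeat the test, here is a small Python sketch that builds the esxcli command lines for inspecting and changing this advanced option. It assumes the usual mapping from the client-side name (Disk.DiskDelayPDLHelper) to the esxcli option path (/Disk/DiskDelayPDLHelper). The helper names are hypothetical, and the sketch only prints the commands rather than running them.

```python
def esxcli_option_path(name):
    """Convert 'Disk.DiskDelayPDLHelper' to '/Disk/DiskDelayPDLHelper'."""
    return "/" + name.replace(".", "/")


def build_list_cmd(name):
    """Command that shows the current and default value of the option."""
    return ["esxcli", "system", "settings", "advanced", "list",
            "-o", esxcli_option_path(name)]


def build_set_cmd(name, value):
    """Command that sets an integer advanced option."""
    if not isinstance(value, int) or value < 0:
        raise ValueError("value must be a non-negative integer")
    return ["esxcli", "system", "settings", "advanced", "set",
            "-o", esxcli_option_path(name), "-i", str(value)]


if __name__ == "__main__":
    option = "Disk.DiskDelayPDLHelper"
    print(" ".join(build_list_cmd(option)))
    # 20 is the value tried in this thread, not a recommendation
    print(" ".join(build_set_cmd(option, 20)))
```

The printed lines are meant to be run in an ESXi shell. As noted above, a reboot is needed before the new value takes effect.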
Regards,
Ryan
Origin3k
4 Operator
2.4K Posts
August 11th, 2014 02:00
Hi Ryan,
Not yet. But if this kind of setting is necessary, I would like to see Dell come forward with such important information. I'll keep this in mind and ask our escalation engineer about it.
We are currently on non-public FW 7.0.6 and seeing general problems with our EQL setup and the VMware vSphere clusters.
Regards,
Joerg
Origin3k
4 Operator
2.4K Posts
August 19th, 2014 10:00
I applied 7.0.7 on 2 groups (11 members) on Saturday.
I am not sure if it's an EQL problem. Currently I am more interested in a stable production environment, so there is no time for tests. I have to work with Dell support on my current issues, so I don't want to solve more problems ;) Feel free to open a case and let us know what they say!
Regards,
Joerg
ryannw
20 Posts
August 19th, 2014 10:00
Hi Joerg,
I just noticed that FW 7.0.7 was released. Does the release include a fix for this problem?
Thanks in advance
Ryan
ryannw
20 Posts
August 19th, 2014 10:00
Thanks Joerg.
I have already opened a case and am still waiting for Dell support's reply. I will do the test after the upgrade.
Regards,
Ryan
ryannw
20 Posts
August 21st, 2014 21:00
Hi Don,
Not yet. Since I will upgrade the FW, I will test the problem again on the new FW and re-approach Dell Support if required. I will post the result here.
Thanks and Regards,
Ryan
ryannw
20 Posts
August 29th, 2014 00:00
Just tested with FW 7.0.7 and the problem still persists. BTW, after several days of testing, it turned out that adjusting Disk.DiskDelayPDLHelper did not solve the problem.
I am now working with Dell support to solve the problem.