Unsolved

July 14th, 2017 01:00

H730 With ESXi 6.5 - disk timeout issues

Hello,
I have a problem with disk timeouts on one particular server. The server has 9 local disks attached to a Dell PERC H730 controller. I am running the latest ESXi version (VMware ESXi 6.5.0, build 5224529) with the latest firmware for the PERC controller (25.5.0.0018) and the latest LSI driver (6.910.18.00). Every few hours, all the datastores time out at the same time. Dell iDRAC shows that all the disks and statuses are OK.

Does anyone have an idea?

Thanks

Moderator

8.5K Posts

July 14th, 2017 10:00

Hi,

Can you reboot to the Lifecycle Controller and run the hardware diagnostics? Are the drives Dell drives? What RAID level are you using?

5 Posts

July 15th, 2017 07:00

I have two ST300MM0008

Three ST1000NM0033 9ZM

Three SSDSC2BB960G7R

All of them have the latest firmware. 

The consistency check failed once for the ST300MM0008 virtual disk, but it passed on the second run. All the other virtual disks passed. Rebooting the server might be a problem. Is there anything I can do for now without a reboot?

Edit:

I ran full diagnostics (including the long DST). Everything passed.

Moderator

8.5K Posts

July 17th, 2017 09:00

The consistency check may have helped. Otherwise, scheduling a reboot may be the best option. If the diagnostics passed, it probably isn’t a drive causing the latency.

5 Posts

July 17th, 2017 10:00

Even after the consistency check and reboot the problem persists. 

I see a lot of these in the log:

2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21029349
2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T01:36:12.061Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T01:36:12.061Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21029349
2017-07-17T01:36:12.061Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T01:36:12.061Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T01:39:33.495Z cpu2:66279)ScsiDeviceIO: 2948: Cmd(0x43950a7287c0) 0x1a, CmdSN 0x8980 from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x2
2017-07-17T01:44:38.669Z cpu15:65600)ScsiDeviceIO: 2948: Cmd(0x439d0091d840) 0x1a, CmdSN 0x89bd from world 0 to dev "mpx.vmhba40:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-17T01:49:13.449Z cpu5:66279)ScsiDeviceIO: 2948: Cmd(0x43950a9b0840) 0x1a, CmdSN 0x89ec from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x2
2017-07-17T01:49:38.189Z cpu29:65614)ScsiDeviceIO: 2948: Cmd(0x439d00808240) 0x1a, CmdSN 0x89f7 from world 0 to dev "mpx.vmhba39:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2017-07-17T01:58:08.936Z cpu5:66279)ScsiDeviceIO: 2948: Cmd(0x43950a92f040) 0x1a, CmdSN 0x8a58 from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x2
2017-07-17T01:59:38.659Z cpu5:65590)ScsiDeviceIO: 2948: Cmd(0x439d008364c0) 0x1a, CmdSN 0x8a6f from world 0 to dev "mpx.vmhba40:C0:T0:L1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-17T02:03:58.371Z cpu2:66279)ScsiDeviceIO: 2948: Cmd(0x43950a9ee340) 0x1a, CmdSN 0x8ac5 from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x2
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21273153
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21273153
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21273153
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T02:06:23.420Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T02:06:23.420Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21273153
2017-07-17T02:06:23.420Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-17T02:06:23.420Z cpu28:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-17T02:06:23.420Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-17T02:06:23.420Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21273153
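To see how often the controller is being reset, and for which device, the lsi_mr3 lines above can be tallied. The following is only a rough sketch in Python, assuming the log has been copied off the host and that every line follows the format quoted above. The function name `count_resets` and the regular expression are illustrative, not part of any VMware or Dell tool.

```python
import re
from collections import Counter

# Matches the "Processing taskMgmt virt reset" lines quoted in this post.
# The pattern is an assumption based on those lines only.
RESET_RE = re.compile(
    r"^(?P<ts>\S+) cpu\d+:\d+\)lsi_mr3: mfi_TaskMgmt:\d+: "
    r"Processing taskMgmt virt reset for device: (?P<dev>\S+)"
)

def count_resets(lines):
    """Count reset requests per (timestamp truncated to seconds, device)."""
    counts = Counter()
    for line in lines:
        m = RESET_RE.match(line.strip())
        if m:
            second = m.group("ts").split(".")[0]  # drop the milliseconds
            counts[(second, m.group("dev"))] += 1
    return counts

sample = [
    "2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0",
    "2017-07-17T01:36:12.060Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 21029349",
    "2017-07-17T01:36:12.061Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0",
    "2017-07-17T02:06:23.419Z cpu28:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0",
]

for (second, dev), n in sorted(count_resets(sample).items()):
    print(second, dev, n)
```

In the excerpt above, every reset targets the same device (vmhba2:C2:T1:L0), and the bursts are about 30 minutes apart (01:36 and 02:06).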

Moderator

8.5K Posts

July 17th, 2017 10:00

Are there any errors in these logs?

/var/log/syslog.log

/var/log/vmkernel.log

You may want to try the steps in this KB article: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021187

5 Posts

July 28th, 2017 07:00

I see this in vmkernel around the event:

2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 28234198
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: fusionWaitForOutstanding:2898: megasas: [ 0]waiting for 0 commands to complete
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba2:C2:T1:L0
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 28234198
2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset

2017-07-27T22:02:41.272Z cpu1:66275)ScsiDeviceIO: 2948: Cmd(0x43950ab11b40) 0x1a, CmdSN 0xab48 from world 0 to dev "naa.61866da052050b001f1a0f1c1334b711" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-27T22:03:01.609Z cpu1:66275)ScsiDeviceIO: 2948: Cmd(0x43950abdc0c0) 0x1a, CmdSN 0xab6a from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-27T22:08:35.186Z cpu11:65596)ScsiDeviceIO: 2948: Cmd(0x439d0452bb80) 0x1a, CmdSN 0xabb8 from world 0 to dev "mpx.vmhba35:C0:T0:L1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-27T22:10:14.512Z cpu0:66275)ScsiDeviceIO: 2948: Cmd(0x43950cec6380) 0x1a, CmdSN 0xabd3 from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-27T22:18:02.442Z cpu1:66275)ScsiDeviceIO: 2948: Cmd(0x43950e5c4dc0) 0x1a, CmdSN 0xac38 from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-07-27T22:24:41.999Z cpu6:66275)ScsiDeviceIO: 2948: Cmd(0x43950e540140) 0x1a, CmdSN 0xaca4 from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
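The ScsiDeviceIO lines can be decoded field by field. The sketch below assumes the three values after "Valid sense data:" are the sense key, ASC, and ASCQ, and it maps only the codes that appear in this thread, using their standard SCSI meanings. The lookup tables are deliberately incomplete, and the function name `decode` is illustrative.

```python
import re

# Lookup tables limited to the codes seen in this thread (standard SCSI values).
OPCODES = {0x1A: "MODE SENSE(6)"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
}

LINE_RE = re.compile(
    r'Cmd\(0x[0-9a-f]+\) (?P<op>0x[0-9a-f]+), .* to dev "(?P<dev>[^"]+)" failed '
    r"H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+) "
    r"Valid sense data: (?P<key>0x[0-9a-f]+) (?P<asc>0x[0-9a-f]+) (?P<ascq>0x[0-9a-f]+)"
)

def decode(line):
    """Return a readable summary of one ScsiDeviceIO failure line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None  # e.g. a line whose sense data was cut off when pasted
    op, d, key, asc, ascq = (
        int(m.group(g), 16) for g in ("op", "d", "key", "asc", "ascq")
    )
    return {
        "device": m.group("dev"),
        "command": OPCODES.get(op, hex(op)),
        "status": DEVICE_STATUS.get(d, hex(d)),
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }

line = (
    '2017-07-27T22:03:01.609Z cpu1:66275)ScsiDeviceIO: 2948: Cmd(0x43950abdc0c0) 0x1a, '
    'CmdSN 0xab6a from world 0 to dev "naa.61866da052050b001f191db244e3acc8" failed '
    'H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.'
)
print(decode(line))
```

Read this way, these lines show the device rejecting a MODE SENSE(6) request as an illegal request, not a command timing out. Whether that is related to the datastore timeouts cannot be determined from these lines alone.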

I also see this in the syslog:

2017-07-27T22:01:01Z crond[66601]: crond: USER root pid 100159 cmd /sbin/auto-backup.sh
2017-07-27T22:01:02Z backup.sh.100185: Locking esx.conf
2017-07-27T22:01:02Z backup.sh.100185: Creating archive
2017-07-27T22:01:02Z backup.sh.100185: Unlocking esx.conf
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2017-07-27T22:02:41Z smartd: libsmartsata: SG_IO ioctl ret:0 status:2 host_status:0 driver_status:0
2017-07-27T22:02:41Z smartd: libsmartsata: Not an ATA SMART device:naa.61866da052050b001f1a0f1c1334b711
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2017-07-27T22:02:41Z smartd: libsmartsata: SG_IO ioctl ret:0 status:2 host_status:0 driver_status:0
2017-07-27T22:02:41Z smartd: libsmartsata: Not an ATA SMART device:naa.61866da052050b001f191db244e3acc8
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2017-07-27T22:02:41Z smartd: libsmartsata: SG_IO ioctl ret:0 status:2 host_status:0 driver_status:0
2017-07-27T22:02:41Z smartd: libsmartsata: Not an ATA SMART device:naa.61866da052050b002000ef5c3db61daa
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2017-07-27T22:02:41Z smartd: libsmartsata: SG_IO ioctl ret:0 status:2 host_status:0 driver_status:0
2017-07-27T22:02:41Z smartd: libsmartsata: Not an ATA SMART device:mpx.vmhba35:C0:T0:L1
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2017-07-27T22:02:41Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2017-07-27T22:05:01Z crond[66601]: crond: USER root pid 100393 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2017-07-27T22:05:01Z syslog[100396]: starting hostd probing.
2017-07-27T22:10:01Z crond[66601]: crond: USER root pid 100437 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2017-07-27T22:10:01Z syslog[100440]: starting hostd probing.
2017-07-27T22:15:01Z crond[66601]: crond: USER root pid 100473 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2017-07-27T22:15:01Z syslog[100476]: starting hostd probing.

I don't see any of the log lines mentioned in your KB article.

The timeout event occurred at 22:06.
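To line up both logs against the 22:06 event, the entries can be filtered to a window around that time. This is a minimal sketch, assuming the 22:06 time is in UTC like the log timestamps (this is not confirmed above) and that each line begins with an ISO timestamp, with or without milliseconds. The helper names `parse_ts` and `around` are illustrative.

```python
from datetime import datetime, timedelta, timezone

def parse_ts(line):
    """Parse the leading timestamp of a vmkernel or syslog line as UTC."""
    parts = line.split(maxsplit=1)
    if not parts:
        return None
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(parts[0], fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    return None

def around(lines, event, minutes=5):
    """Keep only the lines stamped within +/- `minutes` of `event`."""
    window = timedelta(minutes=minutes)
    kept = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and abs(ts - event) <= window:
            kept.append(line)
    return kept

# Assumption: the reported 22:06 is UTC, matching the "Z" suffix in the logs.
event = datetime(2017, 7, 27, 22, 6, tzinfo=timezone.utc)

sample = [
    "2017-07-27T21:59:18.241Z cpu15:65866)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset",
    "2017-07-27T22:02:41Z smartd: libsmartsata: SG_IO ioctl ret:0 status:2 host_status:0 driver_status:0",
    "2017-07-27T22:08:35.186Z cpu11:65596)ScsiDeviceIO: 2948: Cmd(0x439d0452bb80) 0x1a, CmdSN 0xabb8 from world 0",
    "2017-07-27T22:18:02.442Z cpu1:66275)ScsiDeviceIO: 2948: Cmd(0x43950e5c4dc0) 0x1a, CmdSN 0xac38 from world 0",
]

for line in around(sample, event):
    print(line)
```

With the default five-minute window, the lsi_mr3 reset burst at 21:59:18 falls just outside the window, so widening `minutes` may be needed to see it next to the event.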

Thank you for the help

Moderator

8.5K Posts

July 28th, 2017 10:00

It still seems like some sort of timeout. Can you try reseating the controller and cables with the system powered off?

5 Posts

August 5th, 2017 02:00

The issue was resolved by using this driver instead of the default one:

https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI65-DELL-LSI_MR3-77005000-1OEM&productId=614

51 Posts

December 9th, 2017 13:00

I think I am having very similar issues. I have a Dell T630 with an H730 and 18 internal drives, assigned to 4 different RAID sets. I never had any issues with this setup until I added an LSI 9286CV-8E card. I am trying to copy a large amount of data inside a Windows 2012 virtual server from one drive, which lives on the H730, to another LUN, which lives on the 9286CV-8E. The copy works for anywhere from 5 to 45 minutes, and then the entire virtual machine locks up. I was hoping that installing the newest lsi_mr3 driver listed in the previous post would solve the issue, but it is still happening. I made sure that both the H730 and the 9286CV-8E have the latest firmware. I don't see any obvious errors in the hostd.log or syslog.log files in ESXi. ESXi is also at 6.5 Update 1, build 6765664. I'm not sure where to go from here. I also purchased a Dell H830 in the hope that it resolves this issue, and also because the iDRAC will be able to correctly manage and monitor the H830.

Any ideas?

1 Message

August 5th, 2018 23:00

We had exactly the same issue with lsi_mr3 driver version 7.700.50.00-1OEM. Dell and VMware advised us to upgrade to 7.703.18.00-1OEM and PERC controller firmware version 25.5.4.0006. However, we still encountered the same issue after the upgrade. Now Dell and VMware are telling us to upgrade to ESXi 6.5 U2 with the latest lsi_mr3 driver and controller firmware versions listed in the HCL.
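Since several lsi_mr3 driver versions have come up in this thread, it can help to order them numerically instead of as strings. This is a small sketch that assumes the versions always have the dotted form shown above, with an optional suffix such as "-1OEM". The function name `driver_version_key` is illustrative.

```python
def driver_version_key(version):
    """Turn a string like '7.703.18.00-1OEM' into (7, 703, 18, 0)."""
    numeric = version.split("-", 1)[0]  # drop an optional '-1OEM' suffix
    return tuple(int(part) for part in numeric.split("."))

# Driver versions mentioned in this thread.
versions = ["7.700.50.00-1OEM", "6.910.18.00", "7.703.18.00-1OEM"]

for v in sorted(versions, key=driver_version_key):
    print(v)
```

Comparing the tuples avoids the pitfalls of plain string comparison, which does not order multi-digit components correctly in general.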
