Unsolved
April 30th, 2021 07:00
Dell ME4024 and Linux KVM/Qemu
Hello,
I've tried to set up a Dell ME4024 storage array with a KVM/Qemu hypervisor (PowerEdge R6515). I first tried with Proxmox, then with plain Qemu, different versions, same issue. The array is configured with Virtual disks and ADAPT RAID. It is connected to the server with SAS HBAs.
I followed the guidelines of the "Dell ME4 and Linux" document and created one LVM Volume Group on top of a LUN.
I create a VM on the server with local storage and add a virtual SCSI drive on the array with discard=unmap enabled to support thin provisioning.
Inside the VM, I copy some files to the virtual disk and then delete them. I can see that thin provisioning works because the allocated space decreases. When I copy the files again, the virtual drive gets corrupted: "bad bitmap checksum" errors on Ubuntu VMs and chkdsk errors on Windows VMs.
Thanks for the help.
DELL-Josh Cr
Moderator
April 30th, 2021 12:00
Hi,
Did you do physical volume data alignment? See page 26: https://dell.to/3345nFy
Do you have any issues with corruption if you don’t make a thin provisioned lun? Are you using xfs or ext4? Let us know if you have any other questions.
Ddx9
May 1st, 2021 02:00
Hi,
Yes, I create the LVM physical volume with --dataalignment=1M.
With the Linux VMs I use ext4.
The volume doesn't seem to have corruption when I don’t use the discard/unmap option, but the performance becomes much worse. I don’t know if I can create a LUN which is not thin provisioned from the ME4 interface when I use virtual storage.
DELL-Chris H
Moderator
May 3rd, 2021 09:00
Thank you for the details.
Would you confirm which model drives you're using?
Let us know.
Thanks.
Ddx9
May 3rd, 2021 09:00
Toshiba AL15SEB18EQY and Seagate DL2400MM0159, mixed in an ADAPT RAID.
2 x Toshiba KPM5XRUG960G as SSD cache.
DELL-Josh Cr
Moderator
May 3rd, 2021 10:00
Is the drive firmware up to date? https://dell.to/3xChgjQ
https://dell.to/3aYJmMA
Ddx9
May 4th, 2021 00:00
Yes, I think so: EF06 firmware for the Toshiba HDDs, ST58 for the Seagate HDDs and B01C for the Toshiba SSDs.
DELL-Charles R
Moderator
May 4th, 2021 06:00
The hard drives do look current on their firmware.
What SAS HBA do you use? Is it current on firmware and driver?
Is the EMM up to date?
https://www.dell.com/support/home/en-us/product-support/product/powervault-me4024/drivers
Ddx9
May 4th, 2021 07:00
I use the Dell 405-AAEV, the firmware is 16.17.01.00. On Linux the driver is mpt3sas version 36.100.00.00. I also tried with version 35.101.00.00.
EMM has firmware 526E.
DELL-Josh Cr
Moderator
May 4th, 2021 09:00
Thanks. Is the device under warranty? This may be something we need to have phone support look at or escalate.
Ddx9
May 5th, 2021 02:00
Yes, the device is very recent and under warranty. I called Dell Support here in France. They told me that it looks like a configuration issue, that I must subscribe to Dell ProDeploy if I want help, and they recommended looking at the official documentation. That wasn't very helpful because I had already read it. It is strange to get corruption with rather standard settings.
I would be grateful if there is something you can do about it.
DELL-Josh Cr
Moderator
May 5th, 2021 09:00
Can you private message me the service tag? I can look into it, but I may not be able to overrule what they said.
Ddx9
May 5th, 2021 09:00
I did new testing today. I removed the ADAPT RAID and created RAID6 storage with only the 1.8 TB Toshiba drives and one SSD cache, and the problem still happened. I think it is a software or firmware issue because the drives seem good. I compared with local storage, where the issue didn't happen.
Ddx9
May 6th, 2021 03:00
Also, I tested with Linear Storage and the corruption did not happen. But without virtual storage I cannot use SSD cache, snapshots, etc.
DELL-Josh Cr
Moderator
May 6th, 2021 11:00
I am still looking into it, but I haven’t been able to find any known issues or any reason why this is happening. What connection method is used from the server to the storage?
Ddx9
May 6th, 2021 12:00
I use SAS cables. I configured multipath.conf by copying and pasting from the "Dell ME4 and Linux" document. When I run multipath -ll, the multipaths show up correctly. I create a new PV with pvcreate --dataalignment=1m, then a new VG, and finally a new LV for KVM/Qemu.
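For readers who want to follow the PV, VG and LV steps above, here is a minimal sketch that only assembles and prints the LVM commands; it does not run them. The multipath device `/dev/mapper/mpatha`, the names `vg_me4` and `vm_disk1`, and the 100G size are hypothetical placeholders, not values from this thread. The final `pvs` call is an extra check, not something the post mentions, that shows where the first physical extent starts.

```python
import shlex


def lvm_setup_commands(multipath_dev, vg_name, lv_name, lv_size):
    """Return the LVM commands for one PV, VG and LV on a multipath device."""
    return [
        # Align the start of the PV data area to 1 MiB, as in the post.
        ["pvcreate", "--dataalignment", "1m", multipath_dev],
        ["vgcreate", vg_name, multipath_dev],
        ["lvcreate", "--size", lv_size, "--name", lv_name, vg_name],
        # Extra check (assumption): print the first physical extent offset.
        ["pvs", "-o", "+pe_start", multipath_dev],
    ]


if __name__ == "__main__":
    # Hypothetical names: adjust the device, VG, LV and size to your setup.
    for cmd in lvm_setup_commands("/dev/mapper/mpatha", "vg_me4", "vm_disk1", "100G"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run on any machine; the commands themselves need root and a real multipath device.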
I add the storage as raw, discard=unmap, detect_zeroes=unmap (otherwise TRIM doesn't seem to be working).
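As an illustration of the drive settings above, the following sketch builds the corresponding arguments for a plain QEMU command line. It rests on several assumptions: the post does not say which SCSI controller is used, so `virtio-scsi-pci` is a guess, and the LV path is a hypothetical placeholder. Note that on the QEMU command line the option is spelled `detect-zeroes` with a hyphen, while front ends such as libvirt use `detect_zeroes`.

```python
def scsi_drive_args(block_dev, drive_id="drive0"):
    """Build QEMU arguments for a raw SCSI disk that passes discards through."""
    drive_opts = ",".join([
        f"file={block_dev}",
        "format=raw",
        "if=none",
        f"id={drive_id}",
        "discard=unmap",        # forward guest TRIM/UNMAP to the host device
        "detect-zeroes=unmap",  # turn guest writes of zeroes into unmaps
    ])
    return [
        # Assumed controller type; the thread only says "virtual SCSI drive".
        "-device", "virtio-scsi-pci,id=scsi0",
        "-drive", drive_opts,
        "-device", f"scsi-hd,drive={drive_id},bus=scsi0.0",
    ]


if __name__ == "__main__":
    # Hypothetical LV path; replace with the LV created on the array.
    print(" ".join(scsi_drive_args("/dev/vg_me4/vm_disk1")))
```

Building the arguments as a list makes it easy to toggle `discard` or `detect-zeroes` individually when narrowing down which setting triggers the corruption.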
The storage gets detected by the VM. I copy dummy files with dd to fill the virtual disk, delete the files, and recreate them. After 1 or 2 iterations the disk gets corrupted. If I lower the RAM, the corruption seems to occur even faster. The server uses ECC RAM, with no errors detected.
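The write, delete and rewrite cycle described above could be scripted inside the VM roughly as follows. This is a sketch, not the procedure actually used in the thread (which relied on dd): it writes random files, records a SHA-256 digest for each, reads them back and then deletes them. Two caveats apply. A read right after a write may be served from the guest page cache, so a real test would need to drop caches or remount first. Deleting files only sends discards to the array if the filesystem is mounted with the discard option or fstrim is run. Filesystem metadata errors such as "bad bitmap checksum" would still need fsck to detect.

```python
import hashlib
import os
import tempfile


def write_files(directory, count, size_bytes):
    """Write `count` random files and return {path: sha256 hex digest}."""
    digests = {}
    for i in range(count):
        path = os.path.join(directory, f"dummy_{i}.bin")
        data = os.urandom(size_bytes)
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data out of the guest page cache
        digests[path] = hashlib.sha256(data).hexdigest()
    return digests


def verify_files(digests):
    """Return the paths whose content no longer matches the recorded digest."""
    bad = []
    for path, expected in digests.items():
        with open(path, "rb") as f:
            if hashlib.sha256(f.read()).hexdigest() != expected:
                bad.append(path)
    return bad


def run_cycles(directory, cycles=3, count=4, size_bytes=1 << 20):
    """Repeat write, verify, delete; return every mismatching path seen."""
    bad = []
    for _ in range(cycles):
        digests = write_files(directory, count, size_bytes)
        bad.extend(verify_files(digests))
        for path in digests:
            os.remove(path)
    return bad


if __name__ == "__main__":
    # Demo on a temporary directory; point this at the mounted virtual disk.
    with tempfile.TemporaryDirectory() as target:
        print(run_cycles(target, cycles=2, count=2, size_bytes=4096))
```

An empty list means every file read back intact during that run; any returned path marks a file whose content changed between write and read.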
The corruption occurs with Ubuntu, Debian and Proxmox. I am currently running tests with Windows Server and Hyper-V. It seems much slower, so maybe the speed I see with Linux is not the real one.