I am having some weird issues with a pair of RAID controllers that I am setting up. Each controller handles 8 drives. All 8 drives are SSDs set up in a JBOD configuration (no need for RAID in this particular setup).
Each disk is dedicated to a single VM, and the host itself runs libvirt on Ubuntu.
The issue I am having is that every so often, without warning, the PERC controllers fail a drive, causing read/write errors in the VM and on the underlying host.
The drives come back online immediately after I clear the error and remount them using PERCCLI. I have reviewed the SMART data looking for failures and see nothing strange, and I have reviewed the configuration, but I can find no good reason for these drives to go offline.
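As a complement to the SMART review, one further check is whether the kernel logged I/O errors against the same device each time a drive dropped. The following is only a minimal Python sketch of that idea. It assumes the common kernel message form `I/O error, dev sdX, sector N`; the function name `io_errors_by_device` and the sample log are hypothetical and not taken from the affected host.

```python
import re
from collections import defaultdict

# Assumed kernel message shape: "... I/O error, dev sdb, sector 123456 ..."
IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def io_errors_by_device(log_text):
    """Return {device: [sector, ...]} for every matching I/O error line."""
    errors = defaultdict(list)
    for line in log_text.splitlines():
        match = IO_ERROR_RE.search(line)
        if match:
            device, sector = match.group(1), int(match.group(2))
            errors[device].append(sector)
    return dict(errors)


# Hypothetical sample log, for illustration only.
sample_log = """\
[1234.5] blk_update_request: I/O error, dev sdc, sector 2048
[1234.6] blk_update_request: I/O error, dev sdc, sector 2056
[1300.1] EXT4-fs (sdc1): mounted filesystem
[1400.2] blk_update_request: I/O error, dev sdf, sector 4096
"""

if __name__ == "__main__":
    for device, sectors in sorted(io_errors_by_device(sample_log).items()):
        print(f"{device}: {len(sectors)} I/O error(s), first sector {sectors[0]}")
```

In practice the input would be saved `dmesg` or syslog output from the host. If the errors cluster on one device or one controller, that may help narrow down whether a drive, a slot, or the controller is involved.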
Does anyone have any ideas on what to look at next?
Do you have a pair of H710s in one server? Which server model are you using? And do you happen to know the DPN# of both H710s? Are the SSDs Dell-certified drives or a third-party brand? Drive issues are frequently encountered with drives that do not run Dell firmware.
Are the server's firmware, BIOS and LCC versions all up to date? By the way, Ubuntu is not tested and verified on Dell servers; could you try ESXi to check whether the issue persists?
Do let me know if you have any further questions.
Thanks,
Joey Chong
Dell EMC Enterprise Support Services