S2D & Dell PowerEdge 730xd - BSOD Memory_management

I’m building a hyper-converged three-node Storage Spaces Direct (S2D) cluster with brand-new Dell PowerEdge 730xd servers, each of which came with a Mellanox ConnectX-3 Pro NIC directly from Dell.
I have attached the PowerShell script used to build it. Now to the problem: when I enable Storage Spaces Direct in my cluster, RDMA traffic starts to flow, and I get the expected performance when I run tests on my storage. So far so good, until I reboot any of my three nodes. I then get a blue screen with Memory_management, and the node just keeps looping, so I’m not able to access the server again. Before rebooting I checked that no storage jobs were running, and I also drained the roles from the node.
I have figured out that if I set the Cluster Service to manual start and stop the service before rebooting, I can reboot the node without a problem and then start the Cluster Service once the node is back up.
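For reference, the workaround amounts to roughly the following sequence. This is a sketch only, not the commands from the attached script: it assumes the FailoverClusters and Storage PowerShell modules are installed, that it is run in an elevated session on the node being rebooted, and `$node` is just a placeholder variable.

```powershell
# Sketch of the reboot workaround; run elevated on the node that will be rebooted.
# $node is a placeholder for that node's name.
$node = $env:COMPUTERNAME

# List storage jobs; nothing should still be running before the node is touched.
Get-StorageJob

# Drain the roles from the node and wait for the drain to finish.
Suspend-ClusterNode -Name $node -Drain -Wait

# Stop the Cluster Service from starting automatically at boot, then stop it now.
Set-Service -Name ClusSvc -StartupType Manual
Stop-Service -Name ClusSvc

Restart-Computer -Force

# Once the node is back up, start the service again and resume the node:
# Start-Service -Name ClusSvc
# Resume-ClusterNode -Name $env:COMPUTERNAME
```

This only avoids the crash on reboot; it does not explain why the blue screen happens when the Cluster Service is left running.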
I have tested both the drivers that come with the Dell Driver OS Pack and the latest drivers from Mellanox (MLNX_VPI_WinOF-5_35_All_win2016_x64.exe), with the same result.
Attachment: S2D-Cluster-script.zip
Replies (2)

Hi,

A possible solution to this error is described in this thread: https://community.mellanox.com/thread/3593. It suggests turning off IO Non Posted Prefetching in the UEFI.


Thanks,
DELL-Josh Cr
Dell EMC Enterprise Support Services

Hi,

That didn't work :(

I did some testing this weekend: I took the Mellanox cards out of the servers and used the two built-in 10 Gbit interfaces on the motherboard instead. I reinstalled everything and built an S2D solution without RDMA. I could reboot the hosts a couple of times without a BSOD, but then I suddenly got a BSOD with Memory_Management again :( . So this means that the problem is not related to the Mellanox cards.
