Unsolved

April 21st, 2017 02:00

S2D & Dell PowerEdge 730xd - BSOD Memory_management

I’m building a three-node hyper-converged Storage Spaces Direct (S2D) cluster on brand-new Dell PowerEdge 730xd servers, each fitted with a Mellanox ConnectX-3 Pro NIC supplied directly by Dell.
I have attached the PowerShell script I used to build it. Now to the problem: when I enable Storage Spaces Direct on the cluster, RDMA traffic starts to flow and I get the expected performance when I run tests against the storage. So far so good, until I reboot any of the three nodes. The node then blue-screens with Memory_management and keeps looping, so I can no longer access the server. Before rebooting I checked that no storage jobs were running, and I drained the roles from the node.
I have found that if I set the Cluster Service to manual and stop the service before rebooting, I can reboot the node without problems and then start the Cluster Service once the node is back up.
I have tested both the drivers that come with the Dell Driver OS Pack and the latest drivers from Mellanox (MLNX_VPI_WinOF-5_35_All_win2016_x64.exe), with the same result.
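For anyone who wants to try the same workaround, the steps above could be scripted roughly like this. This is only a sketch and is not taken from the attached script: the node name `S2D-NODE1` is a placeholder, and the commands are assumed to run in an elevated PowerShell session on the node being rebooted.

```powershell
# Placeholder node name (an assumption); replace with the node you intend to reboot.
$node = "S2D-NODE1"

# Check that no storage jobs are still running before touching the node.
Get-StorageJob

# Drain the roles from the node and wait for the drain to finish.
Suspend-ClusterNode -Name $node -Drain -Wait

# Set the Cluster Service (ClusSvc) to manual start and stop it before the reboot.
Set-Service -Name ClusSvc -StartupType Manual
Stop-Service -Name ClusSvc

# Reboot the node.
Restart-Computer -Force

# --- Run the following once the node is back up ---
# Start the Cluster Service again and bring the node out of the paused state.
Start-Service -Name ClusSvc
Resume-ClusterNode -Name "S2D-NODE1"
```

The last two commands have to be run in a new session after the reboot, since `Restart-Computer` ends the current one. This only sidesteps the blue screen; it does not explain it.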
Attachment: S2D-Cluster-script.zip

Moderator

8.5K Posts

April 24th, 2017 14:00

Hi,

There is a possible solution to this error in this thread: https://community.mellanox.com/thread/3593. It suggests turning off I/O Non-Posted Prefetching in the UEFI settings.

2 Posts

June 12th, 2017 02:00

Hi,

That didn't work :(

I did some testing this weekend. I took the Mellanox cards out of the servers and used the two built-in 10 Gbit interfaces on the motherboard instead, reinstalled everything, and built an S2D solution without RDMA. I could reboot the hosts a couple of times without a BSOD, but then I suddenly got a BSOD with Memory_management again :( So the problem is not related to the Mellanox cards.
