September 30th, 2013 17:00

Customer is complaining about drive stalls on an Isilon cluster


If a customer is complaining about drive stalls, you can use this command to count how many times each drive on a given
node has stalled.  This is only a rough estimate of actual activity.

Drive stalls

echo "Count Status"; grep -E "gmp-drive-up.*group change.*stalled" /var/log/messages | awk -F"stalled:" '{print $2}' | sed "s/.* (/(/" | sort | uniq -c

If a drive is found to be stalling frequently and has not yet been replaced, performing a proactive
SmartFail of that drive may be beneficial.  As always, work with support for further instructions.
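
To pick out the drives that stall frequently, the per-drive counts from the command above can be filtered against a threshold. The following is only a sketch: the function names `stall_counts` and `frequent_stalls` are made up for illustration, and the threshold is an arbitrary placeholder you choose yourself, not a value recommended by support.

```shell
#!/bin/sh
# Hypothetical helper: count stall events per drive in a log file.
# Uses the same pipeline as the command above; only the log path
# is turned into a parameter (default: /var/log/messages).
stall_counts() {
    log="${1:-/var/log/messages}"
    grep -E "gmp-drive-up.*group change.*stalled" "$log" \
        | awk -F"stalled:" '{print $2}' \
        | sed "s/.* (/(/" \
        | sort | uniq -c
}

# Hypothetical helper: print only the entries whose count is
# greater than the given threshold (an arbitrary number you pick).
frequent_stalls() {
    threshold="$1"
    log="${2:-/var/log/messages}"
    stall_counts "$log" | awk -v t="$threshold" '$1 > t'
}
```

For example, `frequent_stalls 50` would list only drives with more than 50 stall entries in /var/log/messages on the node where it is run. Like the original command, this is a rough estimate, since it only sees what is still in the current log file.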

January 31st, 2017 10:00

We are seeing stall messages on a number of drives, but the health status shows all of them as healthy.  Do we need to take any extra action for the drives whose stall counts are above 120?

February 1st, 2017 00:00

OneFS: Introduction to drive stalls (https://support.emc.com/kb/466391)

Quick note: On older clusters the sysctl value for hw.disk_event.thresh.slowacc_usec might be lower than recommended in this document, and you might want to change it. Just be very careful to get the number of zeros right... double-check even if you copy-paste the command. The recommended threshold is 3500000 usec (microseconds) = 3.5 seconds.
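
Since the zeros are easy to get wrong, it can help to convert the value to seconds and compare it with the recommended 3500000 usec before changing anything. A minimal sketch, assuming a plain POSIX shell; the helper name `check_slowacc` is made up for illustration, and the commented usage line assumes `sysctl -n` prints only the bare value, as it does on FreeBSD-based systems.

```shell
#!/bin/sh
RECOMMENDED_USEC=3500000   # 3.5 seconds, the value quoted above

# Hypothetical helper: take a threshold in microseconds, show it in
# seconds, and say whether it is below the recommended value.
check_slowacc() {
    value="$1"
    # Reject anything that is not a plain non-negative integer.
    case "$value" in
        ''|*[!0-9]*) echo "invalid: not a plain integer"; return 2 ;;
    esac
    secs=$(awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000000 }')
    if [ "$value" -lt "$RECOMMENDED_USEC" ]; then
        echo "$value usec = $secs s: below recommended $RECOMMENDED_USEC usec"
        return 1
    fi
    echo "$value usec = $secs s: ok (at or above recommended)"
}

# Usage on a node (assumption: sysctl -n prints just the number):
# check_slowacc "$(sysctl -n hw.disk_event.thresh.slowacc_usec)"
```

Running the same check on the number you are about to set, for example `check_slowacc 3500000`, is a cheap way to catch a missing or extra zero before it reaches the cluster.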

hth

-- Peter

April 4th, 2018 02:00

The link https://support.emc.com/kb/466391 is no longer valid... could someone please point me to the new location of the document "OneFS: Introduction to drive stalls"?

Thanks

April 5th, 2018 09:00

The link is still OK; as usual, there is a cascade of redirects before landing at the actual document somewhere at emcservice.force.com.

Try deep refreshing / flushing browser caches, deleting old cookies etc.

June 10th, 2018 23:00

The link does not work for me.

I tried with Chrome, Firefox and IE multiple times, deleted cookies and caches to no avail.

This happens not just with this article but also with another one I really need to read:

https://support.emc.com/kb/454399

This support portal is really totally messed up. It does not help solve problems but creates new ones instead.
