andry_kondratie

9 Posts

2298

July 18th, 2014 04:00

Get smartctl status for failed drive on vnxe 3100

Hello, does anyone know how I can get the smartctl status for a failed disk?

I log in over SSH, but I don't see the VNXe disks in /dev/. Where are they located?

In df -h I see only some system disks:

service@(none) spa:~> df -h
Filesystem      Size   Used  Avail  Use%  Mounted on
/dev/sda2       5.0G   3.0G  1.7G   64%   /
tmpfs           725M   2.8M  722M   1%    /dev/shm
udev            469M   824K  468M   1%    /dev
/dev/sda4       4.1G   539M  3.4G   14%   /cores
/dev/tmpfs      32M    11M   22M    33%   /tmp
/dev/mirrora4   17G    1.5G  14G    10%   /EMC/backend/service
/dev/c4nasdba2  1019M  31M   937M   4%    /EMC/backend/CEM
/dev/c4nasdba1  1015M  38M   927M   4%    /nbsnas
/dev/c4loga2    653M   22M   599M   4%    /EMC/backend/perf_stats
/dev/c4loga1    3.4G   240M  3.0G   8%    /EMC/backend/log_shared

However, the smartctl tool is present in the VNXe's Linux, which I assume is SUSE.

Another question: if I install the failed disk in a third-party server, for example a Dell, will that server recognize it?

Responses(6)

Shardul1

33 Posts

0

July 18th, 2014 06:00

Hello,

smartctl cannot be used to check status of data disks. The disks are not really attached to SLES, but to the Flare (EMC OE for block devices) container running on top of it.

However, there is a suite of scripts available on the SPs. These can be seen by running svc_help on an SP.

For disk status, there are mainly 2 scripts that can be useful:

1. For disk hardware status (can be different on each SP): svc_diag --state=spinfo

2. For the backend's (Flare) view of the disks: svc_storagecheck -b

I would trust the status reported by the 2nd script over the first one.
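The two checks listed above can be captured together so that the output from each SP can be compared later. This is only a minimal sketch, assuming a POSIX shell on the SP and a directory the service account can write to. The function name collect_disk_status and the output file names are made up for illustration; only the two svc_ commands come from this thread.

```shell
# Hypothetical helper: save the output of both disk-status scripts to files.
# Only the two svc_ commands are from the thread; the rest is illustrative.
collect_disk_status() {
    out_dir="${1:-.}"                    # where to write the reports (assumed)
    stamp="$(date +%Y%m%d-%H%M%S)"

    # Stop early if the scripts are missing (they only exist on an SP).
    for tool in svc_diag svc_storagecheck; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found; run this on a VNXe SP" >&2
            return 1
        fi
    done

    # 1. Hardware view of the disks (can differ between SPs).
    svc_diag --state=spinfo > "$out_dir/spinfo-$stamp.txt" 2>&1
    # 2. Backend (Flare) view of the disks.
    svc_storagecheck -b > "$out_dir/storagecheck-$stamp.txt" 2>&1

    # Print the paths of the two reports that were written.
    echo "$out_dir/spinfo-$stamp.txt"
    echo "$out_dir/storagecheck-$stamp.txt"
}
```

Running it once on each SP gives two pairs of files that can be compared side by side.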

Disks provided by EMC for specific hardware are only supposed to be used on the specified EMC hardware. Even a good disk may or may not work on 3rd party hardware.

Regards,

Shardul

andry_kondratie

9 Posts

0

July 18th, 2014 09:00

Thank you very much. Do you happen to know what the following really means?

##
Dead Reason: NEO_DISK_DEAD_REASON_DH_REQUESTED

Shardul1

33 Posts

0

July 22nd, 2014 00:00

The disk is certainly dead, and the 'DH requested' indicates that the box tried to open a Dial-Home SR with EMC for the Disk replacement.

NEO_DISK_DEAD_REASON_DH_REQUESTED doesn't give us much information about why the disk is dead. It is probable that the disk has 'media' errors. To determine the exact cause, we would need to check the logs.

If you have not received a disk replacement, please log in to your online account and place an order for the part replacement:

https://support.emc.com/servicecenter/orderPart/

andry_kondratie

9 Posts

0

July 22nd, 2014 00:00

Hello, unfortunately I have no warranty or support on this device. I ordered a new one, but it hasn't been delivered yet.

So I want to understand what happened to the disk. It's the second failed disk this month on this VNXe. I assume it is because this VNXe is more than 3 years old, but I want to see smartctl output or something similar to get the real cause of the disk failure.

I don't know when the new VNXe will be delivered, whether the old one will survive until then, or whether I need to order new disks first.

Could you please point me to the logs I need to check to understand the reason for the disk failure? And is there perhaps a command to check the health of the other disks?

Shardul1

33 Posts

0

July 22nd, 2014 03:00

You can continue using svc_storagecheck -b to check the status of all disks. If the disk has a temporary problem, it is always a good idea to try reseating it. If the problem is recoverable, a reseat should most likely fix it.

As for the root cause, you can look into flare logs. You might get something useful by running the following command on both SPs:

zgrep -i eventlog /EMC/C4Core/log/c4_ccsx_ktrace.log*

Something like a 'Soft SCSI Bus Error' or a 'Soft Media Error' would indicate unrecoverable hardware-level problems.
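To get a quick tally instead of reading every line, the zgrep output can be filtered for the two error strings mentioned above. This is a rough sketch, not an EMC tool: the function name count_soft_errors is hypothetical, and it assumes each event sits on a single log line that contains both "eventlog" and the error text.

```shell
# Hypothetical helper: count event-log lines mentioning each error string.
# Arguments: one or more log files, plain or gzip-compressed.
# Assumption: "eventlog" and the error text appear on the same line.
count_soft_errors() {
    for pattern in 'Soft SCSI Bus Error' 'Soft Media Error'; do
        # zgrep reads plain and compressed files; -h drops file name prefixes.
        n=$(zgrep -i -h eventlog "$@" 2>/dev/null | grep -c -i "$pattern")
        printf '%s: %s\n' "$pattern" "$n"
    done
}
```

On an SP it could be called as count_soft_errors /EMC/C4Core/log/c4_ccsx_ktrace.log* and run on both SPs, as with the zgrep command itself. A count of zero only means the strings were not found in the logs that are still present.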

DELL-Leo

Community Manager

7.1K Posts

0

July 22nd, 2014 19:00

Cool sharing~
