In addition to those commands I also look at the output of these commands:
mycluster-1# isi_for_array -s isi devices list
mycluster-3: Lnn  Location  Device     Lnum  State    Serial
mycluster-3: -----------------------------------------------------------
mycluster-3: 3    Bay 1     /dev/da1   27    L3       0RY8MNRA
mycluster-3: 3    Bay 2     /dev/da2   26    HEALTHY  Z1ZBVKR10000C728DLA4
mycluster-3: 3    Bay 3     /dev/da19  13    HEALTHY  Z1ZBWGF20000C728JA4A
mycluster-3: 3    Bay 4     /dev/da20  38    HEALTHY  Z1Z423X00000C441BFK7
mycluster-3: 3    Bay 5     /dev/da3   25    HEALTHY  Z1ZBVP0D0000C728DL2M
mycluster-3: 3    Bay 6     /dev/da21  11    HEALTHY  Z1ZBWDGV0000C72726N3
mycluster-3: 3    Bay 7     /dev/da22  10    HEALTHY  Z1ZBVKLF0000C728DJNZ
mycluster-3: 3    Bay 8     /dev/da23  9     HEALTHY  Z1Z5XMNY0000W5119JXB
mycluster-3: 3    Bay 9     /dev/da4   24    REPLACE  Z1ZBW42L0000C728AWYZ
mycluster-3: 3    Bay 10    /dev/da24  36    HEALTHY  Z1ZBVP1J0000C728DKSJ
mycluster-3: 3    Bay 11    /dev/da25  7     HEALTHY  Z1ZBW3ZN0000C727E87X
mycluster-3: 3    Bay 12    /dev/da26  6     HEALTHY  Z1Z7VATR0000R528SV1N
mycluster-3: 3    Bay 13    /dev/da5   23    HEALTHY  Z1ZBW39G0000C728AWA9
Notice that the disk in bay 9 is in the "REPLACE" state. That disk was proactively smartfailed by the cluster and is now ready to be physically replaced with a new drive. Unfortunately, we have had a couple of instances where the cluster did not dial home, so DellEMC support did not dispatch anyone to replace the drive. That is why we have scripts in place that monitor for these types of failures.
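For anyone wanting a starting point for that kind of monitoring script, here is a minimal Python sketch that scans captured `isi_for_array -s isi devices list` output and reports drives whose state is not considered normal. It is an illustration only, not the script we actually run. The names `find_problem_drives`, `OK_STATES` and `SAMPLE` are made up for this example, treating HEALTHY and L3 as the only normal states is an assumption, and the line layout is taken from the output shown in this post, so lines with a different shape are simply skipped.

```python
import re

# Assumption: these states are treated as "nothing to do"; adjust for your cluster.
OK_STATES = {"HEALTHY", "L3"}

# Matches data lines shaped like the output in this post, for example:
# "mycluster-3: 3 Bay 9 /dev/da4 24 REPLACE Z1ZBW42L0000C728AWYZ"
LINE_RE = re.compile(
    r"^(?P<node>\S+):\s+(?P<lnn>\d+)\s+Bay\s+(?P<bay>\d+)\s+"
    r"(?P<device>\S+)\s+(?P<lnum>\d+)\s+(?P<state>\S+)\s+(?P<serial>\S+)\s*$"
)


def find_problem_drives(output, ok_states=OK_STATES):
    """Return one dict per drive whose state is not in ok_states."""
    problems = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            # Header, separator, or a line with an unexpected layout.
            continue
        if match.group("state").upper() not in ok_states:
            problems.append(match.groupdict())
    return problems


# A few lines copied from the output above, used as test input.
SAMPLE = """\
mycluster-3: Lnn Location Device Lnum State Serial
mycluster-3: -----------------------------------------------------------
mycluster-3: 3 Bay 1 /dev/da1 27 L3 0RY8MNRA
mycluster-3: 3 Bay 8 /dev/da23 9 HEALTHY Z1Z5XMNY0000W5119JXB
mycluster-3: 3 Bay 9 /dev/da4 24 REPLACE Z1ZBW42L0000C728AWYZ
"""

if __name__ == "__main__":
    for drive in find_problem_drives(SAMPLE):
        print(
            f"{drive['node']} bay {drive['bay']} ({drive['device']}): "
            f"{drive['state']}, serial {drive['serial']}"
        )
```

In practice you would feed it the real command output (for example, captured from cron on one node) and send an email or alert whenever the returned list is not empty, so a missed dial-home does not go unnoticed.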
dynamox
9 Legend
20.4K Posts
July 16th, 2019 04:00
Mazaffar,
On Generation 5 nodes we also used to monitor internal drives for wear. Here is a great article that explains the topic:
https://community.emc.com/community/products/isilon/blog/2016/10/13/how-to-ssd-wear-level-on-isilon-onefs
Hopefully it helps.
mazaff
2 Posts
July 24th, 2019 03:00
Thank you
cadencep45
3 Apprentice
318 Posts
July 30th, 2019 04:00
Why is there also this document?
https://community.emc.com/docs/DOC-71575
It seems a health check process is already defined there. Is the monitoring described above in addition to that? If so, shouldn't EMC be made aware that their health check process has gaps?
Phil.Lam
3 Apprentice
636 Posts
September 23rd, 2019 22:00
http://raghuramnaidu.blogspot.com/2018/07/isilon-useful-commands.html