8 Posts

August 10th, 2009 21:00

Hotspare disk shows failed

symdisk list -failed shows the spare disk as failed.

Neither the replace spare drive procedure nor the replace drive procedure detects the failed drive automatically. The failed disk has dropped to NR.

I tried to replace the drive with the spare drive replacement procedure and selected the drive manually. Is it okay to continue? It asked about the secure erase option, so I just quit the procedure.

But why does the procedure not detect the failed spare drive automatically? I also see invalid tracks.

Healthcheck shows:

Step 13: Hot Spares Exist = YES
 * Step 13: Hot Spares Invoked (dir/Ifc/target) = 001E,0000,00000017
Test CS_Verify_Spares finished.


Any help is really appreciated.
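
For anyone who wants to pull the invoked spare's location out of a saved health check log, here is a minimal Python sketch. It only relies on the "Hot Spares Invoked (dir/Ifc/target)" line quoted in this post. The field names (director, interface, target) are my reading of the "dir/Ifc/target" label, and the function and class names are hypothetical, not part of any EMC tool. It also tolerates terminal color-code residue such as "[00;31m" in front of the line.

```python
import re
from typing import NamedTuple, Optional

# Color-code residue such as "[00;31m", with or without the leading ESC byte.
ANSI_RESIDUE = re.compile(r"(?:\x1b)?\[\d{1,2}(?:;\d{1,2})*m")

# The line format is taken from the health check output quoted in this thread.
INVOKED = re.compile(
    r"Hot Spares Invoked \(dir/Ifc/target\)\s*=\s*"
    r"([0-9A-Fa-f]+),([0-9A-Fa-f]+),([0-9A-Fa-f]+)"
)


class InvokedSpare(NamedTuple):
    # Field names are an assumption based on the "dir/Ifc/target" label.
    director: str
    interface: str
    target: str


def parse_invoked_spare(line: str) -> Optional[InvokedSpare]:
    """Return the three comma-separated fields, or None if the line does not match."""
    cleaned = ANSI_RESIDUE.sub("", line)
    match = INVOKED.search(cleaned)
    if match is None:
        return None
    return InvokedSpare(*match.groups())
```

Feeding it the line from this post gives back the three fields as strings, which can then be compared against the location reported by symdisk list -failed.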

6 Operator

 • 

2.8K Posts

August 11th, 2009 00:00

Hi DiwakarEMC .. It looks like you are an EMC customer and you are replacing a failed drive in a DMX .. something a customer usually won't ever do.

Can you please explain better what's going on at your site ?? :-)

8 Posts

August 11th, 2009 07:00

It is something like this:

1) When we ran the health check, it showed a hot spare invoked but did not show any failed disk.
2) When I ran symdisk list -failed, it showed one failed disk. The failed disk shows as a hot spare. The invoked hot spare shows 114 hypers and the rest of the hot spares show 0.
3) When I select the procedure "can we replace drive safely", it does not detect the failed drive automatically. Neither does the replace spare drive procedure.
4) I see this disk has the NR bit set.
5) My question is: if there is a failed disk, shouldn't the procedure find it automatically?
6) A7 shows nothing.

When I try to replace it using the replace spare drive procedure, it asks me to select the drive manually. Is that normal?

6 Operator

 • 

2.8K Posts

August 11th, 2009 12:00

Diwakar the main issue here is that it looks like you are a "regular" customer. And customers usually don't even open DMX doors. We know you worked at EMC once upon a time. Now let's answer your questions, if possible. As usual before answering I need more details .. Please answer the following questions, you'll help me in helping you :-)

What code are you running ?? What box are you playing with ??
Do you know if you are using Permanent Sparing ??

6 Operator

 • 

2.8K Posts

August 11th, 2009 12:00

"By the way how did you know that I worked in EMC earlier? :)"


Hmmm I can't say more on this ;-)

"I just want to know how to troubleshoot the problem and find the real cause. I know that permanent sparing will mark the failed disk as HS after sync, but the million dollar question is: when I try to replace it using the spare disk replacement script, the script does not find the failed disk."


Since you have permanent sparing, as you noted, the old (broken) drive is replaced with a brand new hotspare and the old drive becomes a broken hotspare.
It MAY happen that the HS replacement script can't find the broken HS. That's why our CEs usually receive details in the SR text describing the drive to be replaced. That's also why you can manually choose the drive to be replaced. However, in my experience it has always worked fine (at least with 5771 and 5671).

Are you sure you are running HS replacement script ??

8 Posts

August 11th, 2009 12:00

Yes, you are right that we are a regular customer and a big EMC shop... name any EMC product and we have it in our environment, including V-MAX etc. So sometimes we play as well. By the way, how did you know that I worked at EMC earlier? :)

Now the answers to your questions:

Yes, permanent sparing is on. Microcode: 5772.85.77, DMX-3

I just want to know how to troubleshoot the problem and find the real cause. I know that permanent sparing will mark the failed disk as HS after sync, but the million dollar question is: when I try to replace it using the spare disk replacement script, the script does not find the failed disk.

Thanks again for your help...

8 Posts

August 11th, 2009 13:00

Yes, we select replace spare drive... Then it asks us to enter the drive location manually.

EMC support always says not to replace an HS manually. If the bad disk is replaced with an HS and the bad disk then becomes a failed HS, why does the hot spare show as invoked? One HS shows 114 hypers, the rest show zero. Correct me if I am confused... an HS does not contain any data.

Let's say I have 7 HS and one disk fails and is replaced with an HS. Now we have 6 HS and one failed HS. When the failed disk's data is synced to the HS and the HS becomes a data disk, why does the hot spare still show as invoked? I assume it is invoked for the failed HS?

Appreciate your help!!!
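
The bookkeeping in this example can be written down as a toy Python model. This is only a sketch of the role swap as it is described in this thread (the spare takes over as a data drive and the failed drive ends up as a broken hot spare). It is not how the array's microcode implements sparing, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Drive:
    name: str
    role: str            # "data" or "spare"
    healthy: bool = True


def permanent_spare(drives: List[Drive], failed_name: str) -> None:
    """Swap roles between a failed data drive and the first healthy spare."""
    failed = next(d for d in drives if d.name == failed_name)
    spare = next(d for d in drives if d.role == "spare" and d.healthy)
    failed.healthy = False
    # After the sync, the spare is the data drive and the failed
    # drive is what remains in the spare pool (as a broken spare).
    spare.role, failed.role = "data", "spare"


def spare_counts(drives: List[Drive]) -> Tuple[int, int]:
    """Return (healthy spares, failed spares)."""
    healthy = sum(1 for d in drives if d.role == "spare" and d.healthy)
    failed = sum(1 for d in drives if d.role == "spare" and not d.healthy)
    return healthy, failed
```

Starting from one data drive and 7 spares, calling permanent_spare on the data drive moves the counts from 7 healthy and 0 failed spares to 6 healthy and 1 failed, which matches the numbers above. The model says nothing about why a spare would still be reported as invoked afterwards.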

6 Operator

 • 

2.8K Posts

August 11th, 2009 15:00

I've seen a few glitches with permanent sparing and symcli. I'm still digging, but it looks like there are issues when the code moves the drives in the backend (when the spare replaces the failed drive). Symcli doesn't like the change and keeps on thinking (and saying) that the old drive is still available and that the spare is still a spare, even if it is in a different DG and has devices on its surfaces. IMHO symcli MAY be confused by permanent sparing (but it's only an opinion right now since I can't reproduce the issue).

I guess it's far better if you open a Service Request and ask for a check from PSElab. I guess someone will come to your location and replace the bad drive.

And possibly keep your hands away from the SP :-) since now you are "simply" a customer, even if with a lot of experience ;-)

8 Posts

August 11th, 2009 17:00

Thank you!! I was expecting a technical discussion... Nevertheless, thank you very much for your valuable time.

6 Operator

 • 

2.8K Posts

August 12th, 2009 00:00

Unfortunately I can't dig further for a lot of different reasons..