155 Posts
0
3436
August 4th, 2008 23:00
fatal alert
I got the following error message on 30 July. I just want to make sure there are no issues with that director now. How can I verify that?
symevent -sid xxx list -fatal
Symmetrix ID: xxx
Detection time           Dir    Src  Category     Severity     Error Num
------------------------ ------ ---- ------------ ------------ ----------
Wed Jul 30 12:52:29 2008 DF-6C  Symm Director     Fatal        0x0040
A Symmetrix Director is not responding
No Events found!
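For anyone who wants to check this kind of output from a script, here is a minimal Python sketch that pulls the event rows out of text laid out like the listing above. It only parses text that has already been captured. The names `SymEvent`, `EVENT_ROW`, `parse_symevent` and `SAMPLE` are made up for this example, and it assumes each column is a single token and that the description sits on the line right after the row, as in the sample. Real output may differ.

```python
import re
from typing import List, NamedTuple


class SymEvent(NamedTuple):
    detected: str
    director: str
    source: str
    category: str
    severity: str
    error_num: str
    description: str


# Matches a row such as:
#   Wed Jul 30 12:52:29 2008 DF-6C Symm Director Fatal 0x0040
# Assumption: every column is one whitespace-free token.
EVENT_ROW = re.compile(
    r"^(?P<detected>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(?P<director>\S+)\s+(?P<source>\S+)\s+(?P<category>\S+)\s+"
    r"(?P<severity>\S+)\s+(?P<error_num>0x[0-9A-Fa-f]+)\s*$"
)


def parse_symevent(text: str) -> List[SymEvent]:
    """Return the event rows found in captured event-listing text."""
    lines = [ln.strip() for ln in text.splitlines()]
    events = []
    for i, line in enumerate(lines):
        match = EVENT_ROW.match(line)
        if not match:
            continue
        # Assumption: the description is the next line, unless that
        # line is itself another event row.
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        description = "" if EVENT_ROW.match(nxt) else nxt
        events.append(SymEvent(description=description, **match.groupdict()))
    return events


# Sample text copied from the listing in this thread.
SAMPLE = """Symmetrix ID: xxx
Detection time Dir Src Category Severity Error Num
------------------------ ------ ---- ------------ ------------ ----------
Wed Jul 30 12:52:29 2008 DF-6C Symm Director Fatal 0x0040
A Symmetrix Director is not responding
"""

if __name__ == "__main__":
    for event in parse_symevent(SAMPLE):
        print(event.detected, event.director, event.severity, event.description)
```

Text that contains no event rows (for example just "No Events found!") gives an empty list, so a script can treat "no rows" as "nothing logged in that window". That alone does not prove the director is healthy now; it only tells you what was logged.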


MikeMac1
2 Intern
•
292 Posts
0
August 6th, 2008 06:00
MikeMac1
2 Intern
•
292 Posts
0
August 5th, 2008 06:00
xe2sdc
6 Operator
•
2.8K Posts
0
August 5th, 2008 07:00
MikeMac1
2 Intern
•
292 Posts
0
August 5th, 2008 08:00
wishtobeMrEMC
155 Posts
0
August 6th, 2008 00:00
xe2sdc
6 Operator
•
2.8K Posts
0
August 6th, 2008 00:00
xe2sdc
6 Operator
•
2.8K Posts
0
August 22nd, 2008 12:00
However, before replacing any single hardware part, the CE has to carefully read all SR notes from the PSE. And sometimes the CE can't even touch the hardware unless a PSE is logged in to the box, issuing commands with a very special account (usually your CE will log on to SymmWin using the "CE" account, while the PSE uses a different and more powerful user) and checking command output before giving directions to your local friendly CE.
Just as an example: when you have 2 broken drives, the CE can't simply change them. He needs a PSE to check the box status and tell him which drive to replace first and when to replace the second drive.
And before running any script, the CE has to run the beloved KTS (Key To Success) script, which does a long and boring check of many different aspects. And trust me, KTS will report a DF that isn't working.
In a word, there are a lot of people checking each other so that a single mistake can't bring down your business.
RRR
6 Operator
•
5.7K Posts
0
August 22nd, 2008 12:00
One question comes to my mind: if another DF goes offline, which causes a DF on this particular card (the one with the first failing DF) to take over, the system is still running fine, but replacing either of the 2 failing DF boards causes some disks to lose connectivity. How is this handled?
LBM99
1 Rookie
•
119 Posts
0
August 22nd, 2008 13:00
The director replacement script will check to see if any directors other than the one specified for replacement are in a failed state. If any are found, the script will post a warning. Many scripts (if not all) check for newly failed directors throughout the script.
The drive replacement script has one or more steps that check that all of the volumes on the drive to be replaced have ready and valid mirror/RAID members.
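To make the two pre-checks above a bit more concrete, here is a rough Python sketch of the logic as described. This is not the actual replacement script, just an illustration: the data structures, the function names (`other_failed_directors`, `volumes_blocking_drive_replacement`) and the director, drive and volume names are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Member:
    """One mirror/RAID member of a volume (hypothetical model)."""
    drive: str
    ready: bool
    valid: bool


def other_failed_directors(director_ok: Dict[str, bool], target: str) -> List[str]:
    """Directors other than `target` that are currently in a failed state."""
    return sorted(d for d, ok in director_ok.items() if d != target and not ok)


def volumes_blocking_drive_replacement(
    volumes: Dict[str, List[Member]], drive: str
) -> List[str]:
    """Volumes on `drive` that have no other ready and valid member."""
    blocked = []
    for name, members in volumes.items():
        if not any(m.drive == drive for m in members):
            continue  # this volume does not live on the drive being replaced
        if not any(m.drive != drive and m.ready and m.valid for m in members):
            blocked.append(name)
    return sorted(blocked)


# Hypothetical state: DF-6C is the board to replace, DF-5C has also failed.
DIRECTORS = {"DF-6C": False, "DF-5C": False, "DF-11C": True}

# Hypothetical layout: vol_b's only other member is not valid.
VOLUMES = {
    "vol_a": [Member("drive1", True, True), Member("drive2", True, True)],
    "vol_b": [Member("drive1", True, True), Member("drive3", True, False)],
    "vol_c": [Member("drive2", True, True), Member("drive3", True, True)],
}

if __name__ == "__main__":
    warnings = other_failed_directors(DIRECTORS, "DF-6C")
    if warnings:
        print("WARNING: other failed directors:", warnings)
    blocked = volumes_blocking_drive_replacement(VOLUMES, "drive1")
    if blocked:
        print("Do not pull drive1 yet, unprotected volumes:", blocked)
```

In this sketch a non-empty result from either function means "stop and warn", which matches the behaviour described above: the replacement only proceeds when nothing else is failed and every affected volume still has a good copy elsewhere.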
In response to RRR's question about having to replace a DF dual initiator pair, each with a failed processor: This would be considered a rare and very serious condition. High level PSE and / or development engineering assistance would be required to determine the best course of action. This would be determined on a case-by-case basis depending on the cause of the failures, so I cannot say exactly how the problem would be corrected. All measures would be taken to recover the situation without disruption.
Mike
xe2sdc
6 Operator
•
2.8K Posts
0
August 25th, 2008 13:00
xe2sdc
6 Operator
•
2.8K Posts
0
August 31st, 2008 12:00
RRR
6 Operator
•
5.7K Posts
0
August 31st, 2008 12:00
RRR
6 Operator
•
5.7K Posts
0
August 31st, 2008 13:00
I guess you didn't understand what I was saying:
What if 2 DAs are dead, each holding half of several disks? If you now replace 1 DA, some disks no longer have a working DA, and if you replace the other DA, the other disks don't have a working DA. Talk about a dilemma...
xe2sdc
6 Operator
•
2.8K Posts
0
August 31st, 2008 14:00
I think you are talking about replacing both DAs in a single DA pair. It looks scary, but the code enforces rules to avoid problems in this very specific situation (and in a lot of other "corner cases").
If you are talking about 2 different DAs in 2 different DA pairs, it's even easier. When you unplug a DA, the other DA (in the DA pair) will jump in and give access to ALL devices.
To make a complex thing easy: each DA pair grants access to the low-level disks, while logical volume protection protects your data against physical drive unavailability. And a drive may be unavailable because it has failed or because both DAs in its DA pair are faulty.
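The two layers described above can be written down as a toy model. This is only a sketch of the reasoning, not how the microcode actually works: the DA, drive and pair names are hypothetical, and it assumes the mirrors of a volume sit behind different DA pairs, which is an assumption of this example.

```python
from typing import Dict, Iterable, Set, Tuple


def reachable_drives(
    drive_to_pair: Dict[str, str],
    pair_members: Dict[str, Tuple[str, str]],
    working_das: Set[str],
) -> Set[str]:
    """A drive is reachable while at least one DA of its DA pair works."""
    return {
        drive
        for drive, pair in drive_to_pair.items()
        if any(da in working_das for da in pair_members[pair])
    }


def volume_available(
    mirror_drives: Iterable[str], reachable: Set[str], healthy: Set[str]
) -> bool:
    """A volume is available while one mirror sits on a healthy, reachable drive."""
    return any(d in reachable and d in healthy for d in mirror_drives)


# Hypothetical layout: two DA pairs, one drive behind each pair.
PAIRS = {"pair1": ("DA-1", "DA-16"), "pair2": ("DA-2", "DA-15")}
DRIVES = {"d1": "pair1", "d2": "pair2"}
ALL_DAS = {"DA-1", "DA-16", "DA-2", "DA-15"}
HEALTHY = {"d1", "d2"}
MIRRORS = ("d1", "d2")  # assumption: mirrors behind different DA pairs

if __name__ == "__main__":
    # One DA unplugged: its partner still reaches every drive.
    one_out = reachable_drives(DRIVES, PAIRS, ALL_DAS - {"DA-1"})
    print(sorted(one_out), volume_available(MIRRORS, one_out, HEALTHY))

    # Both DAs of pair1 dead: d1 is unreachable, the mirror on d2 still serves data.
    pair_out = reachable_drives(DRIVES, PAIRS, {"DA-2", "DA-15"})
    print(sorted(pair_out), volume_available(MIRRORS, pair_out, HEALTHY))
```

Seen this way, losing both DAs of one pair is treated the same as losing the drives behind them: the volume protection layer has to cover it, which is why the checks before any replacement matter so much.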
xe2sdc
6 Operator
•
2.8K Posts
0
September 10th, 2008 13:00
As soon as you realize how it works, it's relaxing