4 Operator

5.7K Posts

May 7th, 2008 05:00

Yup ;)

Drill down in the Physical section and expand every + you can see. I suspect an SPS, fan or PS has an issue.

Please remember that each SPS will do a self test each Sunday morning at around 2AM.

Of course, you can also check the event log (right-click SPA/SPB and choose Event log).

If in doubt, you can open a case with EMC and have them check the machine.
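To rule the weekly SPS self test in or out as the cause of an alert, the event timestamp can be compared against that Sunday 2AM window. The following is only a rough sketch: the function name, the one-hour slack around 02:00, and the sample timestamps are all assumptions made for illustration, and it presumes the timestamp is already in the array's local time.

```python
from datetime import datetime, timedelta


def in_sps_self_test_window(when, slack=timedelta(hours=1)):
    """Return True if `when` is a Sunday within `slack` of 02:00.

    The one-hour slack is an assumed tolerance, not a documented value.
    """
    if when.weekday() != 6:  # Monday is 0, Sunday is 6
        return False
    two_am = when.replace(hour=2, minute=0, second=0, microsecond=0)
    return abs(when - two_am) <= slack


# Hypothetical timestamps, written as "MM/DD/YYYY HH:MM:SS AM/PM".
fmt = "%m/%d/%Y %I:%M:%S %p"
sunday_night = datetime.strptime("05/04/2008 02:10:00 AM", fmt)
weekday_evening = datetime.strptime("05/07/2008 09:48:40 PM", fmt)

print(in_sps_self_test_window(sunday_night))
print(in_sps_self_test_window(weekday_evening))
```

An SPS event outside that window is less likely to be explained by the routine self test alone.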

40 Posts

May 7th, 2008 06:00

Also, if you right-click on the array and select faults, it should tell you if something is faulted, like a fan or SPS as RRR mentions.

Also, you could use naviseccli -h <SP address> faults -list, which will tell you the same information.

If "the array is operating normally" yet the array is still faulted, I would open a case with support. You may just need to restart the management server on each SP.
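If you want to script that check, something along these lines might work. It is a minimal sketch, not an EMC-provided tool: the helper names are made up for illustration, the SP address is a placeholder, and it assumes naviseccli is on the PATH and needs no extra credential options in your environment. It only looks for the "operating normally" phrase quoted above and does not try to parse individual fault lines.

```python
import subprocess


def build_faults_command(sp_address):
    """Build the argument list for `naviseccli -h <SP> faults -list`."""
    return ["naviseccli", "-h", sp_address, "faults", "-list"]


def reports_normal(output):
    """True if the output contains the 'operating normally' phrase."""
    return "operating normally" in output.lower()


def check_array(sp_address):
    """Run the faults listing against one SP and return (is_normal, raw_output).

    Assumes naviseccli is installed and can reach the SP; any credential
    options your site needs would have to be added to the command.
    """
    result = subprocess.run(
        build_faults_command(sp_address),
        capture_output=True,
        text=True,
        check=False,
    )
    return reports_normal(result.stdout), result.stdout


# Example with canned text instead of a live array (illustrative strings only).
print(reports_normal("The array is operating normally.\n"))
print(reports_normal("Bus 0 Enclosure 0 SPS B : Removed\n"))
```

If the CLI says the array is normal while Navisphere Manager still shows a fault, that mismatch is the case where restarting the management server or calling support comes in.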

4 Operator

1.5K Posts

May 8th, 2008 08:00

The nice replies and explanations from Rob and Mike might have provided the answers you were looking for. If so, please mark the question as answered and select the replies as "Correct" and/or "Helpful". If you need any more details, please feel free to come back to this forum.

Cheers,
Sandip

35 Posts

May 8th, 2008 08:00

What version of FLARE are you running on the CX500? There are some known bugs in older versions of FLARE that kept setting off false alerts in our environment until we upgraded to FLARE 16 or better.

Mike

18 Posts

May 9th, 2008 04:00

(this is 159deka, formerly known as 159eka)
Thank you all for the responses. I had a small issue with my account, but am back live & kicking.

1- On expanding, NO ITEM was detected as faulty!
2- After business hours the whole system was rebooted. The fault was then mirrored to both Navisphere windows, an SPS was identified as faulty and then turned to "T", and the system then went into normal operation mode without any further errors.
3- Since then (2200 hrs. Tuesday) the system has been behaving itself.
4- However, we have taken the precaution of logging a call, and the event log was collected. A filtered event log, pertaining to the error only, is copied below.
5- Flare version is 2.19

thanks again for the kind interest

=============

1.
Date:05/07/2008
Time:09:51:35 PM
Event Code:0xfd5
Description:WSAGetLastError() returned error: An address incompatible with the requested protocol was used.
Subsystem:CK200061400543
Device:N/A
SP:N/A
Host:DFCC-SPB
Source:TlntSvr
Category:NT Application Log
Log:NT Application Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Error


2.
Date:05/07/2008
Time:09:51:09 PM
Event Code:0x36d
Description:There was error [DATABASE OPEN FAILED] processing the driver database. 00 00 00 00 02 00 64 00 00 00 00 00 6d 03 00 c0 00 00 00 00 6d 03 00 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Subsystem:CK200061400543
Device:N/A
SP:N/A
Host:DFCC-SPB
Source:Application Popup
Category:NT System Log
Log:NT System Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Error


3.
Date:05/07/2008
Time:09:51:09 PM
Event Code:0x4
Description:Dynamic strings:AMLI0xcfc0xcf8 - 0xcff 00 00 00 00 04 00 52 00 00 00 00 00 04 00 05 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Subsystem:CK200061400543
Device:N/A
SP:N/A
Host:DFCC-SPB
Source:ACPI
Category:NT System Log
Log:NT System Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Error


4.
Date:05/07/2008
Time:09:51:09 PM
Event Code:0x5
Description:Dynamic strings:AMLI0xcf80xcf8 - 0xcff 00 00 00 00 04 00 52 00 00 00 00 00 05 00 05 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Subsystem:CK200061400543
Device:N/A
SP:N/A
Host:DFCC-SPB
Source:ACPI
Category:NT System Log
Log:NT System Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Error


5.
Date:05/07/2008
Time:09:51:09 PM
Event Code:0x36d
Description:There was error [DATABASE NOT LOADED] processing the driver database. 00 00 00 00 02 00 64 00 00 00 00 00 6d 03 00 c0 00 00 00 00 6d 03 00 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Subsystem:CK200061400543
Device:N/A
SP:N/A
Host:DFCC-SPB
Source:Application Popup
Category:NT System Log
Log:NT System Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Error


6.
Date:05/07/2008
Time:09:48:43 PM
Event Code:0x904
Description:VSC Shutdown/Removed
Subsystem:CK200061400543
Device:Enclosure 0 Power A
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x0
Ext Code1:0x0
Ext Code2:0x4
Type:Error


7.
Date:05/07/2008
Time:09:48:40 PM
Event Code:0x941
Description:Battery Online
Subsystem:CK200061400543
Device:Enclosure 0 SPS A
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x0
Ext Code1:0x0
Ext Code2:0x1
Type:Error


8.
Date:05/07/2008
Time:09:48:39 PM
Event Code:0x941
Description:Battery Online
Subsystem:CK200061400543
Device:Enclosure 0 SPS B
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x0
Ext Code1:0x0
Ext Code2:0x1
Type:Error


9.
Date:05/07/2008
Time:09:48:39 PM
Event Code:0x908
Description:Fault - Cache Disabling
Subsystem:CK200061400543
Device:SP A
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x0
Ext Code1:0x0
Ext Code2:0x0
Type:Error


10.
Date:05/02/2008
Time:04:59:27 PM
Event Code:0x2580
Description:Storage Array Faulted Bus 0 Enclosure 0 : Faulted Bus 0 Enclosure 0 SPS B : Removed SP B : Removed
Subsystem:CK200061400543
Device:N/A
SP:N/A
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Application
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Error


11.
Date:05/02/2008
Time:04:59:21 PM
Event Code:0x944
Description:Hard Peer Bus Error
Subsystem:CK200061400543
Device:SP A
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x2
Ext Code1:0x0
Ext Code2:0x0
Type:Error


12.
Date:05/02/2008
Time:04:59:21 PM
Event Code:0x944
Description:Hard Peer Bus Error
Subsystem:CK200061400543
Device:SP A
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x1
Ext Code1:0xaa975cf4
Ext Code2:0x0
Type:Error


13.
Date:05/02/2008
Time:04:59:12 PM
Event Code:0x908
Description:Fault - Cache Disabling
Subsystem:CK200061400543
Device:SP A
SP:SPA
Host:DFCC-SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x0
Ext Code1:0x0
Ext Code2:0x0
Type:Error

Message was edited by:
159deka

4 Operator

4.5K Posts

May 9th, 2008 11:00

The key entry is the message:

Description:Hard Peer Bus Error

This generally indicates that the SP reporting the error could not contact its peer SP - this could be caused by a number of things but most likely there was a reboot of one of the SPs.
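To pull entries like this out of a long pasted log, a small parser can help. The sketch below assumes the log is laid out the way the filtered log earlier in this thread is (a numbered line such as "11." followed by Key:Value lines); the function names are invented for the example, and the sample text is an abridged copy of two entries from that log.

```python
import re


def parse_event_log(text):
    """Split a pasted event log into a list of dicts, one per numbered entry."""
    events = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if re.fullmatch(r"\d+\.", line):
            # A line like "11." starts a new entry.
            current = {}
            events.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, so "Time:04:59:21 PM" stays intact.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return events


def find_events(events, description):
    """Return entries whose Description contains the given text (case-insensitive)."""
    wanted = description.lower()
    return [e for e in events if wanted in e.get("Description", "").lower()]


sample = """\
11.
Date:05/02/2008
Time:04:59:21 PM
Event Code:0x944
Description:Hard Peer Bus Error
Device:SP A
SP:SPA
Sense Key:0x2

13.
Date:05/02/2008
Time:04:59:12 PM
Event Code:0x908
Description:Fault - Cache Disabling
Device:SP A
SP:SPA
Sense Key:0x0
"""

events = parse_event_log(sample)
hits = find_events(events, "Hard Peer Bus Error")
for e in hits:
    print(e["Date"], e["Time"], e["Event Code"], e["Device"])
```

Sorting the matches by date and time makes it easier to see which SP reported the peer bus error first.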

Service will be able to tell what caused this - it may be a patch-level issue.

regards,

glen kelley

18 Posts

May 12th, 2008 21:00

A case was opened, and the requested SP collects have been sent.

Awaiting further details/instructions

18 Posts

May 13th, 2008 00:00

The latest development is that SP B needs to be replaced. And we plan to do it this evening.

However, we have two concerns before we do that.
We have implemented soft zoning, so the zoning configuration will break the moment we have a new SP with two new WWNs, since soft zoning is based on WWNs. The easiest remedy I can see is to edit the alias names to hold the new WWNs. Can that be done?

The second concern: can this be done online, as we plan to? Will there be an inherent disabling/enabling of zone configurations which would hinder work?

A response ASAP is appreciated.

4 Operator

5.7K Posts

May 13th, 2008 01:00

The WWNs do not change with a new SP!!! So don't worry about that.

The replacement can be done online as long as all hosts are HA connected (each host with 2 HBAs, connected to both SPs, and with failover software implemented, such as VMware or PowerPath).

4 Operator

1.5K Posts

May 13th, 2008 08:00

Replacing an SP will not change the WWN. The World Wide Name is associated with the Storage Processor Enclosure, so the new SP will get the same WWN. No worries at all.

The activity is online; however, SPB will be removed from the system, which means all LUNs owned by SPB will be trespassed to SPA. Ensure all connected hosts are running proper failover software. It may be a good idea to do this activity during a low-I/O period.

The EMC Customer Engineer who will be doing this activity can guide you properly.

Finally, we are all glad to see that your post on this forum helped to successfully identify the issue.

Cheers,
Sandip

18 Posts

May 13th, 2008 21:00

Thanks Sandeep.
Yes, the SP was replaced last evening. However, it has not yet come online, so we are working with EMC to see whether anything else is wrong. It has been at the "POST" level for the last 13 hours or so.

Generally, how long does it take to update the new SP and bring it online?

4 Operator

5.7K Posts

May 14th, 2008 00:00

Minutes, not even 1 hour.

9 Legend

20.4K Posts

May 14th, 2008 03:00

I had an instance where Dell shipped two replacement SPs and both were DOA.

4 Operator

5.7K Posts

May 15th, 2008 00:00

if Georgia then
  if Atlanta then
    if no Jamaica then
      sp := fail;
    end_if
  end_if
end_if

9 Legend

20.4K Posts

May 15th, 2008 04:00

ahahaha ...you need to stop hanging out with Stefano :D