December 30th, 2013 19:00

No alert notification during power outage!

Hi,

We are running a CX4-120 SAN. Due to a power failure at our site, the SAN went down (not gracefully, judging from the log file) and restarted after power was restored.

We did not receive any notification from the SAN. Is that expected? If not, what notification message should we have received?

Thanks

Message was edited by: TonyJK

474 Posts

January 2nd, 2014 16:00

The array will send several emails for the power down, and you may see the “too many alerts” message in addition to several others. However, the “too many alerts” message was most likely generated during the power up, not the power down. If the SMTP server is unavailable during the power down, those alerts will not be received, because the array has no way to send them. In a full datacenter power outage you probably won’t see many alerts: the network switches, routers, servers, etc. all go down immediately, while the array keeps reacting and powering down for a few minutes after the power goes out. There’s simply no way for the emails to make it out.
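Since everything here hinges on whether the SMTP relay can be reached, a quick check run from a host on the same network as the array can help rule the mail path in or out. The following is only a minimal sketch using Python's standard smtplib; the function name `smtp_reachable` and the host name `smtp.example.com` mentioned below are placeholders I made up, not anything from the array or Navisphere.

```python
import smtplib


def smtp_reachable(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if an SMTP server answers a NOOP on host:port.

    Hypothetical helper: it only shows that a relay accepts connections,
    not that the array's own notification settings are correct.
    """
    try:
        # Connecting reads the server greeting; leaving the block sends QUIT.
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            code, _message = server.noop()
            return code == 250
    except (OSError, smtplib.SMTPException):
        # Connection refused, timeout, DNS failure, or a protocol error.
        return False
```

For example, `smtp_reachable("smtp.example.com")` would return False while the relay is down or unreachable, which is the situation in which the array's alert emails cannot get out.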

812 Posts

January 1st, 2014 11:00

I am not sure of the exact message, but you should have received at least the power failure alerts at the moment the SPS lost its AC input. Was the email notification working properly?

388 Posts

January 1st, 2014 15:00

The SMTP server is at a remote site (the one at HQ is not used, because of email spoofing). When the power surge occurred, it appears that the main SMTP server at HQ went down as well.

When the SMTP server was started the next morning, we received the following error messages:

1) Description The storage system received too many events in this poll cycle to display here. Contact your service provider to determine if any storage system error events occurred that you should address.

2) Description LOCAL DMI LOG: Found a warning during POST:  WARNING: Error reading PSA1 resume PROM (0x01C2)

Thanks

812 Posts

January 1st, 2014 18:00

I would suggest opening a support ticket for spcollect analysis, if you have not done so already. Support will be able to tell you what went wrong with your system; it may be a bugcheck.

If the SMTP server was down, that would be the reason why you did not receive the alerts.

474 Posts

January 2nd, 2014 09:00

What did you see in the log file that makes you think it was not graceful?

474 Posts

January 2nd, 2014 15:00

Those are normal entries for a power down when the array detects a power outage. All disks, LCCs, etc. that are NOT in bus 0, enclosure 0 lose power immediately when the site power is lost. The array then vaults cache to the vault disks in bus 0, enclosure 0, slots 0-4. After the vault process is complete, the array powers off.

In short: None of those entries indicate anything abnormal, assuming that the rack/datacenter lost power.
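To put a number on how long the array kept logging, you can subtract the first and last timestamps from the log excerpt posted in this thread (21:29:50 and 21:33:53). This is just a rough sketch; the helper name `elapsed` is made up, and treating the last log entry as roughly the moment the array powered off is an assumption.

```python
from datetime import datetime, timedelta


def elapsed(start: str, end: str) -> timedelta:
    """Time between two HH:MM:SS log timestamps (hypothetical helper)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        # The end timestamp is past midnight relative to the start.
        delta += timedelta(days=1)
    return delta


# First "AC Fail" entry and last entry before the log goes quiet.
window = elapsed("21:29:50", "21:33:53")
print(window)  # 0:04:03
```

That is about four minutes of activity after the AC input failed.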

388 Posts

January 2nd, 2014 15:00

We get the following error messages:

21:29:50 The AC Fail state has changed for this power supply and no SPS test is in progress. This indicates a problem with the AC input to this power supply.

21:29:56 Battery Online

21:30:09 K10_DGSSP FlushRegistry   76 00 01 00

21:30:09 K10_DGSSP FlushSystemDisk   76 00 01 00

21:30:12 VSC Shutdown/Removed

21:30:13 SPS Removed

21:31:26 Standby Power Supply (Enclosure SPE SPS A) is faulted. See alerts for details.

21:31:26 Power Supply (Enclosure SPE Power A0) is faulted. See alerts for details.

21:31:26 Power Supply (Enclosure SPE Power B0) is faulted. See alerts for details.

21:31:26 Disk Array Enclosure (Bus 1 Enclosure 1) is faulted. Servers may have lost access to disk drives in this enclosure. See alerts for details.

21:31:26 Disk Array Enclosure (Bus 0 Enclosure 0) is faulted. Servers may have lost access to disk drives in this enclosure. See alerts for details.

21:31:26 Disk Processor Enclosure (Enclosure SPE) is faulted. Servers may have lost access to disk drives in this storage system. See alerts for details.

Then we get a lot of:

CRU Powered Down

Unit Shutdown

Trespass Failed LUN ....

21:33:49 Fault - Cache Disabling

21:33:53 The AC Fail state has changed for this power supply and no SPS test is in progress. This indicates a problem with the AC input to this power supply.

After that, there is nothing in the log file until power was restored.

Thanks

388 Posts

January 2nd, 2014 16:00

Dear Rich,

Many thanks for confirming that the SPS is working properly.

In that case, the missing alert notifications must come down to the SMTP server.

My manager would like to know: if the SMTP server had been working properly, would we have received alerts other than

"Description The storage system received too many events in this poll cycle to display here. Contact your service provider to determine if any storage system error events occurred that you should address"?

Or is this the message we would normally get during a power outage (because there are too many error messages)?

Thanks again

