7 Posts

March 30th, 2010 13:00

Standby Power Supply Failure

I asked this before, but it was obviously on the wrong forum as this is regarding a Clariion system (CX-500).

You can ignore most of the following text; at the end of the day I need answers to a couple of questions that I just can't seem to find anywhere:

1)  How can I verify a standby power supply (SPS) is working properly?

AND/OR

2)  How can I test the SPSs?  This CX-500 has two of them.  Is there a way to test these SPSs?

PS.  I know you can set a test time, but how can you see the results?  I have gone through tons of logs after doing an SP Collect, but I have yet to find anything.  Besides, I'm also looking into an actual real-world test (say, unplugging the SPS from its power source).

Below are some details I had posted in the other forum if you need them.  I just want to know what the normal behavior of a CX-500 is when power to both SPSs goes down.  Thanks.

>>>>>>>>>>>>>>>>>>>>>>>>>>>

Hi there, hopefully I'm in the right forum.  I have several things going on that I'm unsure about.  I'm trying to test a Standby Power Supply (SPS).  We had a faulted SPS, so we got a new one.  The new one is strangely light compared to the original.  It looks like it is missing some batteries inside.  It is the proper model, etc.

I installed it anyway.  I do not get any errors when viewing the SAN through Navisphere, but I'd still like to test it.  I know how to set a test time, but what exactly does the test do?  The only thing I've found from EMC is this:

"You can now set the battery test time, so that each week the SP runs a
battery self-test to ensure that the monitoring circuitry is working in
each standby power supply (SPS) "

I can see in the event log that the test was started, but how do I check the results, if any?

Also, the CX-500 SAN has two SPSs.  I tested them by unplugging one storage processor (SP) from its SPS, so that only one SPS was in use.  I then unplugged that SPS from its power source (so at this point, that one SPS alone was holding up the two storage processors).  The two storage processors then shut off within 5-10 seconds of unplugging the power source going to the SPS.

Is this normal?  What should the normal behaviour be in cases like this?  I read in some documentation that the storage processors should shut off within 2 minutes, but 5-10 seconds seems to be at the very low end of that range.

Any help would be appreciated.

Thanks.

6 Operator

2.1K Posts

March 31st, 2010 11:00

So, I'll try to make this as short as possible to address your questions:

1> The array itself will verify that the SPS works properly. This happens once a week. If you don't see any results then everything is good. You only find out about it if there is a problem.

2> Once again, the SPSs get tested once a week. If you want to do a manual test, you would plan a time when you can shut down the array. Make sure all your hosts are shut down, then pull the power on one SPS, wait a few minutes, then pull the power from the second SPS. It should not take long for the array to shut everything down since it can skip the step of dumping the write cache to the vault drives. To test the other SPS you would remove the power in the opposite order.

Keep in mind that this is a very disruptive test as you have to have all your hosts shut down to do it safely.

7 Posts

March 31st, 2010 22:00

Thanks a lot for the reply Allen,

I ran the test.  It took a mere 5-10 seconds to shut off after unplugging the second SPS.  Why do you say "it can skip the step of dumping the write cache to the vault drives"?  Is that because all the hosts would be down? Otherwise, wouldn't that mean possible loss of data in a real-world power outage?

And would you consider those 5-10 seconds to be normal?

Thanks.  I'm probably making this a bit more complicated than it actually is, but we replaced an SPS, and the replacement is strangely light compared to the originals.  I can actually tell it's missing some batteries inside the metal casing, so that got us wondering whether it would actually do its job in a real scenario.

8 Posts

April 1st, 2010 00:00

Hello, guys

Sorry for the stupid question, but what is the function of the SPS? I checked one of the EMC NAS servers (NS350) in my work environment and have a strong impression that one SP (storage processor) is connected to an SPS and the other one is connected directly to an external UPS. Is that a dangerous situation?

On the Celerra NS350, the SPs are located in the DPE, i.e. the disk processor enclosure (DPE) houses the array and the storage processors (SPs).

Evgeniy

7 Posts

April 1st, 2010 09:00

As far as I know, the SPS is there to allow the array to write out the cache in case of power failure.  But I don't know the exact behavior, and then Allen mentioned something about skipping the cache writing.

I don't know the exact NAS you are talking about, but at least on the CX-500 SAN, the array can work fine with just one SPS, or with one storage processor active.  I don't know if performance is degraded when only one SP is working (my guess would be yes).  Oh well, I guess I really don't have an answer to your question, haha, sorry.

8 Posts

April 1st, 2010 21:00

Hi, LEBATO

It's very nice that you answered me. I posted two messages in this forum and nobody answered, which was frustrating :-)

And about your initial question: as I understand it, the SPS is one big battery, like a UPS battery, and your Clariion system can work with one SPS.

You can connect your system to one good, checked SPS, disconnect the system from its power source and measure the time (actually, you said it was 5-10 seconds until the two storage processors shut off). After that you can do the same with the second SPS.

Did you try it (running the system with only the newly bought SPS vs. running it with only the old, known-good SPS)? Was the time until the storage processors shut off the same?

Evgeniy

P.S. I hope such a test will not damage your system. I am going on common sense here and don't have proper EMC knowledge; I am just reading the documentation and trying to understand how things work.

7 Posts

April 2nd, 2010 08:00

Hey, no worries, I think we are in the same boat, just trying to learn really.

I did try the test.  It took 5 seconds with the new SPS, and 10 seconds with the old SPS.  I only had a few VMs (3) running, but they were really doing nothing.  In fact, I think only two of them were running off the SAN.  This is not a production SAN (although soon it will be again, hence the testing).  So I suppose those 5-10 seconds would be normal.

Again, I'm just wondering, since I really don't know exactly what it is supposed to be doing.

EDIT:  I'm a bit disappointed in EMC's online documentation really.  It's practically impossible to find all the documentation for a CX-500, for example.  I've found most of my stuff through Google (which ends up digging through the EMC site and finding *something*).  Navigation is pretty bad.  Being used to sites like Cisco's really spoils you, I suppose.

7 Posts

July 7th, 2010 05:00

Hello Lebato,

The SPS is just there to protect the write cache. I am saying "just" because this component is not a UPS at all. If there is a power failure, it will allow the cache to be flushed to the vault disks, which keeps data from being lost or corrupted. This process takes only a few minutes.  Your CX500 must have 2 SPSs, one per Storage Processor.

By default the SPS test will run every Sunday at midnight (00:00). You can change this setting in Navisphere. During the SPS test the write cache will be disabled. And yes, you should be able to see that in the logs (right-click on the SPA and choose View Events). You can verify that the SPS is OK through Navisphere (right-click on the SPS, then Properties), by visual inspection (there's an LED column on the right of the SPS) or by CLI command.

As for documentation, you can register at http://powerlink.emc.com
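
As a rough illustration of that schedule, here is a minimal Python sketch that works out when the next weekly test window would start, assuming the default of Sunday at 00:00 has not been changed. The function name `next_sps_test` and its parameters are hypothetical and are not part of Navisphere or any EMC tool; it only does date arithmetic, so you can plan around the period when the write cache is disabled.

```python
from datetime import datetime, timedelta


def next_sps_test(now, weekday=6, hour=0, minute=0):
    """Return the next weekly test start time strictly after `now`.

    `weekday` follows datetime.weekday(): Monday=0 ... Sunday=6.
    The defaults (Sunday, 00:00) mirror the default schedule described
    in this thread; they are an assumption if the setting was changed.
    """
    # Start from today's date at the scheduled time of day.
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the scheduled day of the week (0 to 6 days ahead).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment is now or already past, the next run is a week later.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print(next_sps_test(datetime.now()))
```

Pass a different `weekday`, `hour` or `minute` if the test time on your array has been changed from the default.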

Hope that helps

AW

March 4th, 2011 13:00

I have an NS600 and one of the standby power supplies has gone bad. Can I replace it while the EMC is live and running, or does it have to be completely shut down?

6 Operator

8.6K Posts

March 7th, 2011 08:00

I would suggest using the SUPPORT forums (ECN -> support) for Clariion or Unified Storage for technical questions like this. They are read and answered by a lot more people than ECN Connect.
