Got an issue with this server. I believe the server has 3 146GB HDs set up in a RAID 5. Third drive has an amber blinking light, which I'm guessing means the drive has failed.
I have a replacement drive, but need some help:
Thanx in advance for any responses.
The blinking amber light probably means your drive is bad, but not necessarily.
To confirm, check the drive's status in OpenManage Server Administrator (OMSA). This software does not come installed, but it can easily be installed after the fact: download and run the package to extract it, then run C:\Openmanage\windows\setup.exe. Choose Custom and make sure that Storage Management is installed.
If the drives are accessible from the outside of the machine, they are hot-swappable. Once you swap in the replacement, the controller will probably begin a rebuild automatically. A few things can prevent the rebuild from starting automatically, including having that feature turned off. If it doesn't start, you can begin a rebuild manually in the CTRL-M BIOS utility (Objects, Physical Drives, Rebuild) or in OMSA (Storage, PERC, Enclosure/Connector, Physical Drives, then Rebuild from the drop-down menu if the drive shows as FAILED, or Assign as Hot Spare if it shows as READY).
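For anyone wondering what a rebuild actually does on a RAID 5: the controller recreates the missing drive's contents from the surviving drives using XOR parity. Here is a minimal Python sketch of that idea for a single stripe on a three-drive array. The function names and sample data are made up for illustration, and this is a toy model of the principle, not how the PERC firmware is implemented.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks):
    """Recreate the failed drive's block for one stripe from the survivors."""
    return xor_blocks(surviving_blocks)

# One stripe on a three-drive RAID 5: two data blocks plus their parity.
data_a = b"NEWS"
data_b = b"CLIP"
parity = xor_blocks([data_a, data_b])

# Pretend the drive holding data_b failed; rebuild it from the other two.
recovered = rebuild_missing([data_a, parity])
print(recovered)  # b'CLIP'
```

The same XOR has to be repeated for every stripe on the array, and every surviving drive has to be read end to end, which is why a rebuild takes hours and why a second failing drive during a rebuild is a serious problem.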
Also, your original drive may not actually be bad. I would attempt to test the drive with PowerEdge Diagnostics (from Windows) - the Quick Test is sufficient, and you can run the Extended test if it passes to be sure. If the drive is not available for testing, you can reseat it or just replace it. Just know ... it might not be bad. Your firmware may just need to be updated:
Ah, thanx for all the info! At first, I was getting the light on the front/back of the server blinking, indicating one of a multitude of possible internal errors. Without the proper software, I had no idea what that was. Then, the HD light came on.
I will give those a shot on Monday when I'm physically in front of the server again.
To install OMSA, I had to upgrade the firmware. Did that and ran Diagnostics as well. Diagnostics didn't find anything. OMSA said the drive was bad. Replaced it with a refurbished one I bought online. When I put that one in, the lights on the front flashed green quickly, then went to blinking amber. In OMSA, there were no options for the drive; it showed up as 0.0 GB.
I decided to put the original drive back in. This time, the light was green and the server internal health light went from amber to blue. In OMSA, the drive looked good and it started to rebuild itself, which it completed overnight.
SO, it looks good, but I'm wondering about the health of the drive and the backplane. This server is over 6 years old now. My main concern is that something is still wrong. I use this for a proprietary news audio program for 6 radio stations I work for. This morning, connectivity with and saving to a network share on this array has been inconsistent, and transfer speed seems diminished.
That can happen if the drive is rebuilding ... had it completed? I would run a Consistency Check on your array (OMSA - Storage, PERC, Virtual Disks).
I have yet to do that. I assume it takes a lot of time/resources, and that people shouldn't be using the server, really, correct?
Also, about the 'bad' drive I put back in, which the array then rebuilt... performance has been quite degraded since. Is it OK to pull that drive and see what happens? Do I need to stop it in OMSA first?
Thanx again for all the help.
I hope this is a different server than in your other thread ... otherwise, we may have offered suggestions based on incorrect (incomplete at best) information.
You should stop the drive first (Prepare to Remove/Force Offline), and then you can try removing it. If the drive is bad but still in a usable condition, it can take much longer to respond properly to commands, writes, reads, etc. If the drive is bad, you should replace it. If your replacement did not rebuild, you should figure out why - this can be a sign of bigger problems.
Consistency Checks take several hours and some resources, but the controller allocates the same amount of resources (and it takes about the same amount of time) as a rebuild. There may be some performance hit while running a CC, just like with a rebuild. It can be run safely while in use, but the more people using it, the more slow down they will experience.
One and the same.
Well, my co-worker just decided to pull the drive without stopping it. Server seems to still be working. Turns out we may have database/software issues at this point. We'll put the drive back in, rebuild and will do a Consistency Check. What exactly will this tell us?
A consistency check does for a RAID array what a CHKDSK does for the Windows file system - it remaps damaged data blocks to safe locations and attempts to check and fix RAID parity information. If the CC completes with errors, you may have permanently damaged data/files.
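To make the parity part concrete, here is a rough Python sketch of what checking and fixing parity means for a RAID 5: recompute the parity for each stripe from its data blocks and compare it with what is stored. The data layout (a list of dicts), the function names, and the `repair` flag are all assumptions for illustration. A real controller works on raw disk sectors and also has to deal with unreadable blocks, which this toy version ignores.

```python
def stripe_parity(data_blocks):
    """XOR the data blocks of one stripe to get the expected parity."""
    parity = bytearray(len(data_blocks[0]))
    for block in data_blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def consistency_check(stripes, repair=True):
    """Return indexes of stripes with wrong stored parity; optionally rewrite it."""
    bad = []
    for index, stripe in enumerate(stripes):
        expected = stripe_parity(stripe["data"])
        if stripe["parity"] != expected:
            bad.append(index)
            if repair:
                stripe["parity"] = expected
    return bad

stripes = [
    {"data": [b"\x01\x02", b"\x03\x04"], "parity": b"\x02\x06"},  # consistent
    {"data": [b"\x10\x20", b"\x01\x02"], "parity": b"\x00\x00"},  # stale parity
]
first_pass = consistency_check(stripes)
second_pass = consistency_check(stripes)
print(first_pass, second_pass)  # [1] []
```

Note that a parity mismatch only says the stripe is inconsistent. It cannot say which block is the wrong one, so rewriting the parity makes the array consistent again without guaranteeing that the data in that stripe is what the application originally wrote.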