Unsolved

4 Posts

July 16th, 2019 21:00

PowerEdge 2970 - Disk in Perc6/i RAID5 showing "Unknown"

Hello,

I have inherited a number of older, out-of-warranty Dell servers, including this one. I recently had a drive in the three-disk RAID5 array fail. I reseated it and it came back up, but the array was degraded. I replaced it with a new Dell-certified drive and all seemed well. OpenManage showed three online physical disks.

A few weeks later, I checked OpenManage and the disk I replaced is now showing up as "Unknown". Going back through the logs, I see some errors from a few days ago:

Command timeout on physical disk

OpenManage shows green on the Physical Disks and the Virtual Disk, so the fact that this one disk is now "Unknown" is not showing as an issue in OpenManage. But I'm still concerned I'm looking at a future failure.

I'm not a sysadmin and I'm operating without a safety net. I did some research and saw that this issue can be caused by older drivers/firmware, but I'm having trouble figuring out what needs updating and don't want to do something in the wrong order and brick the system. Hoping the community can help:

PowerEdge 2970
BIOS: 4.2.1
Baseboard Management Controller: 2.43.00
Processors: Two Quad-Core AMD Opteron(tm) Processor 2378
OpenManage: 7.4.0
PERC 6/i Integrated: 6.2.0-0013 (latest version reported is 6.3.3-0002)
Disks are ST3500414SS - one has revision KS65, one has revision KS68, and I can't tell which one the "unknown" one has

Also probably worth noting that this is not the original mainboard - a year or so ago I miraculously managed to replace the mainboard with a refurbished one. That was fun considering all I had was the hardware manual and Youtube...

I haven't gotten physical access to the machine yet to see if there are any lights or warnings on the console - I'll get to see it tomorrow. Hoping someone in the community can give some guidance on whether this is something I should be concerned about and what to do about it.

Thanks!

Moderator

8.4K Posts

July 17th, 2019 05:00

Bjctsysen,

It was smart to reach out before venturing on. I also agree with you that timeouts like this are normally due to older firmware. From what I see, you are on older versions of most of the listed firmware.

I don't see the BIOS version you are on available for download, so it may have been pulled. The latest version I see is 2.0.2 (I know the version number is not what you would expect), and that update is listed as Urgent.

Next, you can address the BMC/ESM; the update needed there is version 2.50.

The latest PERC 6/i firmware, as you stated, is 6.3.3-0002.

I don't see an available update for that specific drive.

I would run those updates in the order they are listed above, but I also always recommend backing up the data beforehand.
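As a rough illustration of the comparison described above, here is a minimal Python sketch that checks each installed version quoted in this thread against the version listed as latest, in the same order (BIOS, BMC/ESM, PERC 6/i). The helper names (parse_version, classify, report) are made up for this example and are not part of OpenManage or any Dell tool, and comparing versions purely by their numbers is an assumption that may not hold when two firmware packages belong to different release lines.

```python
import re


def parse_version(text):
    """Turn a version string such as '6.2.0-0013' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


# (component, installed, listed as latest), in the order given in the thread.
# The version strings are the ones quoted in the posts; nothing is queried live.
COMPONENTS = [
    ("BIOS", "4.2.1", "2.0.2"),
    ("BMC/ESM", "2.43.00", "2.50"),
    ("PERC 6/i", "6.2.0-0013", "6.3.3-0002"),
]


def classify(installed, listed):
    """Compare two version strings numerically, field by field."""
    a, b = parse_version(installed), parse_version(listed)
    # Pad the shorter tuple with zeros so '2.50' and '2.50.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    if a < b:
        return "update available"
    if a > b:
        return "installed is newer than listed; verify before flashing"
    return "up to date"


def report(components=COMPONENTS):
    """Return (component, status) pairs in the same order as the input."""
    return [(name, classify(inst, latest)) for name, inst, latest in components]


if __name__ == "__main__":
    for name, status in report():
        print(f"{name}: {status}")
```

A purely numeric check like this flags the BIOS entry as the odd one out, because the installed 4.2.1 sorts above the listed 2.0.2. That is a reason to confirm which package actually applies to the system before flashing anything.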

Let me know how it goes.

4 Posts

July 17th, 2019 08:00

Thanks Chris.

I just got a look at the physical server - it looks fine - no blinking orange lights or errors on the console. I installed Windows Updates last night, so I'm going to reboot it later today and see what happens.

As for the BIOS - yes, I was very confused by the version numbers on Dell's support page for the 2970 as well. When I looked at the link you sent for 2.0.2, I noticed it said it was for Opteron 2200 series processors. I have Opteron 2300 processors - if I Google "2970 Opteron 2300 BIOS", I get this:

https://www.dell.com/support/home/us/en/04/drivers/driversdetails?driverid=17hxw

You'll see it's version 4.1.1. Looking at the release notes, the date is 11/11/2009. The date on the 4.1.2 BIOS is 2010. Looking at the release notes for the 2.0.2 version, the date is 9/2/2008, despite the fact that the update said it was last updated in 2015. So I'm hesitant to install 2.0.2 based on the Opteron 2200/2300 discrepancy and the fact that it looks older...

If you can clear up the confusion on this, I'd appreciate it. I can't afford to brick this server because, while I have some backups, I don't have another server (or the budget) to set up another one and try to restore said backups. Thanks!

Moderator

8.4K Posts

July 17th, 2019 08:00

I understand. I would suggest skipping the BIOS for now and starting with the BMC/ESM update, followed by the others. I will look into the BIOS for you.

4 Posts

July 17th, 2019 10:00

OK, thanks. After the reboot, OpenManage is showing the disk as "Online" again. However, I did notice that OpenManage sees the disk as a completely different disk from the one I ordered and from what is on the label. I ordered the same disk as the other two (and the one that I was replacing), but this one is showing a different manufacturer and twice the capacity! So I'm not sure if I got a mislabeled drive or if the out-of-date firmware is causing the drive to be misidentified.

Either way, I'll plan to do the other firmware upgrades on my next maintenance window. There wouldn't be any issue with upgrading the firmware before the BIOS? I know with some posts I encountered (https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Dell-Hardware-Event-ID-129-2405-2905-on-power-edge-R710/td-p/5094610) people had to upgrade the BIOS and other components in a certain order.

Thanks for looking into the BIOS!

Moderator

8.4K Posts

July 18th, 2019 04:00

If the system were further behind on the BIOS, I would update it first, but your BIOS is only about one update behind, so there shouldn't be an issue.

4 Posts

July 24th, 2019 04:00

Hi Chris,

I just got nudged to accept a solution, so I wanted to check if you'd gotten any clarification regarding the BIOS version numbering weirdness and whether the one you sent for Opteron 2200 series processors is OK for Opteron 2300 series processors.

Since the reboot, the disk that was showing as "Unknown" is still showing as "Online". I haven't had a maintenance window yet to try the firmware updates.

Thanks!
