Unsolved

6 Posts

July 4th, 2019 02:00

Disks failing during initialisation in r510

Hi

I've tried to use SAS disks from my VNX5300 in my r510, but the disks keep failing about 10 seconds after the PERC RAID utility reports that the VD has been set up and should be initialised.

The disks are presented as 'ready', and I've run the extended tests on all components with no faults found.

I've tried with both a 3TB Hitachi (HUS723030ALS640) and a Seagate 600GB (ST3600057SS) with exactly the same result. I've upgraded the PERC H700 to the latest version 12.10.7.0001 (and the BIOS and even the iDRAC). I've tried all types of RAID configs, added the disks as hot spares, fiddled with the advanced settings, booted via UEFI, you name it.

My Dell disks (146GB, ST3146356SS) work without any problems. I've intentionally not elaborated on everything I've tried, as it would become an impenetrable essay, but I'm happy to do so, and if I can figure out how to attach pictures, I'll do that as well. I'm really at my wits' end, so any help would be much appreciated.

Cheers,
Dan

Moderator

8.8K Posts

July 5th, 2019 05:00

Dpetzen,

Normally when you have an issue using drives from a storage device it is due to the firmware on the drives being specific to the storage device. If you download and run this firmware update on the drives, does it alleviate any of the previous issues? 

HUS723030ALS640

ST3600057SS

Let me know what you see after the update.

 

6 Posts

July 5th, 2019 20:00

Update and correction: I can (kind of) see the disks.

I installed smartctl (smartmontools) and I can see the disks through SMART:

sudo smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device

sudo smartctl -i /dev/bus/0 -d megaraid,3
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-957.21.2.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: STE60005 CLAR600
Revision: ES0E
User Capacity: 585,399,659,520 bytes [585 GB]
Logical block size: 520 bytes
Rotation Rate: 15000 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50047f283bf
Serial number: 6SL34J1D 0000N22803W3
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Sat Jul 6 15:41:58 2019 NZST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported

I know that seeing the disks via SMART isn't the same thing as block access, but it feels like I'm tantalisingly close to getting this to work.
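One detail in the output above is worth checking: the drive reports a logical block size of 520 bytes rather than 512. Below is a minimal sketch, assuming the `smartctl -i` text has been captured into a string, that pulls that field out and flags anything other than 512 or 4096. The function names and the set of "common" sizes are assumptions made for illustration, not part of smartmontools.

```python
import re

# Block sizes assumed "usual" for this sketch; adjust as needed.
COMMON_BLOCK_SIZES = {512, 4096}

def logical_block_size(smartctl_info):
    """Return the 'Logical block size' in bytes from `smartctl -i` text, or None."""
    match = re.search(r"^Logical block size:\s*(\d+)\s*bytes",
                      smartctl_info, re.MULTILINE)
    return int(match.group(1)) if match else None

def is_unusual_block_size(size):
    """True when a size was found and it is not in COMMON_BLOCK_SIZES."""
    return size is not None and size not in COMMON_BLOCK_SIZES

# Lines copied from the smartctl output in this thread.
sample = """\
Vendor: SEAGATE
Product: STE60005 CLAR600
User Capacity: 585,399,659,520 bytes [585 GB]
Logical block size: 520 bytes
"""

size = logical_block_size(sample)
print(size, is_unusual_block_size(size))
```

Running the same check against one of the working Dell disks would show whether the block size is what differs between the two sets of drives.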

6 Posts

July 5th, 2019 20:00

Hi Chris and thanks for helping!

I've already tried to perform a firmware upgrade of the Hitachi drive, so I downloaded the driver you linked to and compared md5 checksums to make sure it was the same, and it was.
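For reference, a checksum comparison like the one described above can also be done in a few lines of Python. This is only a sketch: `md5_of_stream` and `md5_of_file` are hypothetical helper names, and the file paths in the usage note are placeholders.

```python
import hashlib
import io

def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary file-like object, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def md5_of_file(path):
    """Return the hex MD5 digest of the file at `path`."""
    with open(path, "rb") as fh:
        return md5_of_stream(fh)

# In-memory demonstration so the sketch runs without any firmware file present.
print(md5_of_stream(io.BytesIO(b"hello")))
```

Two downloads are the same file when `md5_of_file("first.BIN") == md5_of_file("second.BIN")`.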

I don't think I've tried to upgrade the Seagate, so I located the RHEL6 version of that binary (I'm running CentOS 7 on the host) and tried with that, but with the same result:

sudo ./SAS-Drive_Firmware_V38WK_LN_ES68_A07_01.BIN
Collecting inventory...
...
Running validation...

This Update Package is not compatible with your system configuration.

The reason I gave up on this was that I thought the drives weren't visible to the OS and/or the binaries. They certainly aren't visible to fdisk, for example.

I don't have any of the supported Windows versions available to install, but I guess I can install RHEL6 if you think this is a binary/OS mismatch issue.

4 Operator

1.9K Posts

July 7th, 2019 20:00

Does the PERC support drives with 520-byte sectors, which are normally used by storage systems, or only 512/4K?

Regards
Joerg
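To put numbers on the 520-byte question: the smartctl output earlier in the thread reports 585,399,659,520 bytes at 520 bytes per logical block for the Seagate. The sketch below works out the block count and what the same number of blocks would hold at 512 bytes each. It assumes the extra 8 bytes per block are used by the array for its own purposes, which the thread does not confirm; `block_count` is a hypothetical helper name.

```python
def block_count(capacity_bytes, block_size):
    """Number of logical blocks, requiring the capacity to divide evenly."""
    if capacity_bytes % block_size != 0:
        raise ValueError("capacity is not a whole number of blocks")
    return capacity_bytes // block_size

# Values taken from the smartctl output in this thread.
capacity = 585_399_659_520
blocks = block_count(capacity, 520)

print(blocks)        # 1125768576 blocks of 520 bytes
print(blocks * 512)  # 576393510912 bytes if each block carried 512 bytes
```

The capacity divides exactly by 520, which is consistent with the drive really being formatted with 520-byte blocks.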

6 Posts

July 8th, 2019 01:00

Re: sector size

I'm not sure, Joerg, but I managed to get megacli installed last night, so I now have an amazing amount of control over the PERC controller. I'll see if I can figure out the sector size etc.

megacli can see all drives and they are all happy, so I decided to try to set up a RAID1 using megacli:

megacli -CfgLdAdd -r1 [32:2,32:3] -a0

scsi-rescan

...boom! (dmesg):

[ 2882.417681] scsi 0:2:1:0: Direct-Access DELL PERC H700 2.10 PQ: 0 ANSI: 5
[ 2882.428907] sd 0:2:1:0: [sdd] Spinning up disk...
[ 2882.429192] sd 0:2:1:0: Attached scsi generic sg5 type 0
[ 2882.473555] megaraid_sas 0000:02:00.0: scanning for scsi0...
[ 2882.473895] megaraid_sas 0000:02:00.0: 27405 (615889784s/0x0001/CRIT) - VD 01/1 is now DEGRADED
[ 2882.515553] megaraid_sas 0000:02:00.0: scanning for scsi0...
[ 2882.515618] megaraid_sas 0000:02:00.0: 27408 (615889785s/0x0001/FATAL) - VD 01/1 is now OFFLINE
[ 2883.429552] ....................................................................................................not responding...
[ 2982.528813] sd 0:2:1:0: [sdd] 5769265152 512-byte logical blocks: (2.95 TB/2.68 TiB)
[ 2982.528885] sd 0:2:1:0: [sdd] Write Protect is off
[ 2982.528889] sd 0:2:1:0: [sdd] Mode Sense: 1f 00 00 08
[ 2982.528935] sd 0:2:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 2982.529369] sd 0:2:1:0: [sdd] Spinning up disk...
[ 3002.841712] ata3: hard resetting link
[ 3003.297842] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 3003.339004] ata3.00: configured for UDMA/100
[ 3003.340177] ata3: EH complete
[ 2983.529767] ....................................................................................................not responding...
[ 3082.629162] sd 0:2:1:0: [sdd] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 3082.629170] sd 0:2:1:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[ 3082.629173] blk_update_request: I/O error, dev sdd, sector 0
[ 3082.629220] Buffer I/O error on dev sdd, logical block 0, async page read

...and so on.
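The controller's own view of the failure is buried among the kernel messages. Below is a minimal sketch for pulling out just the megaraid_sas event lines (the ones with a sequence number and a severity such as CRIT or FATAL) from captured dmesg text. The regular expression is written against the lines shown above and is an assumption about their format, not a documented driver interface.

```python
import re

# Matches lines such as:
# [ 2882.473895] megaraid_sas 0000:02:00.0: 27405 (615889784s/0x0001/CRIT) - VD 01/1 is now DEGRADED
EVENT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+megaraid_sas\s+\S+:\s+(?P<seq>\d+)\s+"
    r"\((?P<when>\d+)s/(?P<cls>0x[0-9a-fA-F]+)/(?P<level>\w+)\)\s+-\s+(?P<msg>.*)"
)

def controller_events(dmesg_text):
    """Return (kernel timestamp, severity, message) for each controller event line."""
    events = []
    for line in dmesg_text.splitlines():
        m = EVENT_RE.search(line)
        if m:
            events.append((float(m["ts"]), m["level"], m["msg"].strip()))
    return events

# Lines copied from the dmesg output in this thread.
sample = """\
[ 2882.473555] megaraid_sas 0000:02:00.0: scanning for scsi0...
[ 2882.473895] megaraid_sas 0000:02:00.0: 27405 (615889784s/0x0001/CRIT) - VD 01/1 is now DEGRADED
[ 2882.515553] megaraid_sas 0000:02:00.0: scanning for scsi0...
[ 2882.515618] megaraid_sas 0000:02:00.0: 27408 (615889785s/0x0001/FATAL) - VD 01/1 is now OFFLINE
"""

for ts, level, msg in controller_events(sample):
    print(ts, level, msg)
```

On the sample, this leaves only the DEGRADED and OFFLINE transitions, which arrive within a fraction of a second of each other right after the VD is created.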

The drives have failed:

megacli -PDInfo -PhysDrv '[32:2]' -a0 | egrep 'Firmware|Inquiry'
Firmware state: Failed
Device Firmware Level: C1D6
Inquiry Data: HITACHI HUS72303CLAR3000C1D6YHHNNEGA

I may have the same problem as before, but at least I'm getting information out of the system now rather than a blank screen.

I think I need new firmware. I'll start Googling, but any suggestions would be very welcome.

6 Posts

July 8th, 2019 03:00

...alright, I have to admit defeat for tonight, but I'm not giving up.

I've extracted what I think is the firmware file:

./SAS-Drive_Firmware_DGY1G_LN_M440_A04.BIN --extract .
md5sum payload/M440.fwh
2dc55f4ecc1a9c516ec192a08e450dee payload/M440.fwh

The problem is that the firmware upgrade fails:

megacli -PdFwDownload -PhysDrv '[32:2]' -f M440.fwh -a0
Flashing firmware image size 0x8000 (0x0 0x80 0x0). Please wait...
Flashing firmware image size 0x8000 (0x0 0x80 0x0). Please wait...
...
FW error description:
The requested operation could not be completed as device is busy or unexpected error occurred.

Exit Code: 0x2d

The exit code means:

0x2d SCSI command done, but non-GOOD status was received - see mf.hdr.extStatus for SCSI_STATUS

I think the problem is that the drive needs to be offline, but I can't get it offline:

megacli -PDOffline -PhysDrv '[32:2]' -a0
Adapter: 0: Failed to change PD state at EnclId-32 SlotId-2.

The disk appears to be happy otherwise:

megacli -PDInfo -PhysDrv '[32:2]' -a0

Enclosure Device ID: 32
Slot Number: 2
Enclosure position: N/A
Device Id: 2
WWN: 5000CCA01A5DEE8B
Sequence Number: 7
Media Error Count: 0
Other Error Count: 11
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 2.686 TB [0x157f0e02e Sectors]
Non Coerced Size: 2.686 TB [0x157e0e02e Sectors]
Coerced Size: 2.686 TB [0x157e00000 Sectors]
Sector Size: 0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: C1D6
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca01a5dee89
SAS Address(1): 0x0
Connected Port Number: 3(path0)
Inquiry Data: HITACHI HUS72303CLAR3000C1D6YHHNNEGA
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :37C (98.60 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No

Any ideas would be much appreciated.
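For comparing drives, it can help to turn the `-PDInfo` text into key/value pairs. The sketch below splits each line on the first colon and decodes the hexadecimal sector counts; `parse_pdinfo` and `sector_count` are hypothetical helper names, and keeping only the first occurrence of a repeated key (such as "Port status") is a simplification.

```python
import re

def parse_pdinfo(text):
    """Parse 'Key: Value' lines into a dict; the first occurrence of a key wins."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields

def sector_count(size_field):
    """Extract the sector count from e.g. '2.686 TB [0x157f0e02e Sectors]'."""
    m = re.search(r"\[(0x[0-9a-fA-F]+)\s+Sectors\]", size_field)
    return int(m.group(1), 16) if m else None

# Lines copied from the megacli output in this thread.
sample = """\
Other Error Count: 11
Raw Size: 2.686 TB [0x157f0e02e Sectors]
Coerced Size: 2.686 TB [0x157e00000 Sectors]
Sector Size: 0
Firmware state: Unconfigured(good), Spun Up
Drive Temperature :37C (98.60 F)
"""

info = parse_pdinfo(sample)
print(info["Firmware state"], info["Sector Size"], info["Other Error Count"])
print(sector_count(info["Coerced Size"]))
```

The coerced size of 0x157e00000 sectors decodes to 5769265152, the same figure the kernel printed for the virtual disk ("5769265152 512-byte logical blocks") in the earlier dmesg output.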

January 30th, 2020 20:00

Any updates on this issue? I am having a similar issue trying to use Hitachi drives that were previously in an EMC storage array. The drives show a state of "Failed" after creating a VD on the H700 controller in my R510. If anyone has a solution to this issue, I would be very appreciative. I have tried updating the firmware of the drives using a WindowsPE USB and the EXE provided at the link a previous poster shared.

Moderator

3.4K Posts

January 31st, 2020 02:00

Hi @idreamofjeepy

 

You may be facing an incompatibility issue on the system. The RAID card needs to communicate with the drive and the drive firmware. EMC storage array drives have their own storage-specific firmware, and it may not be possible to flash server drive firmware onto them.