Having to import raid on reboots
Hello,
On a client's Dell R410, when we reboot we are often greeted with "foreign config found on adapter". I have to import the config for the virtual disk to show, and then it starts rebuilding.
We have noticed a few times that disk 3 on the server has an amber flashing light. I've tried reseating the disk and physically replacing it. After a replacement the issue goes away for weeks but then comes back.
Example logs from OpenManage (Logs > Alerts) are below.
Server config:
Virtual disk 0 (OS) 2 x 480G SSD (works fine no issues)
Virtual Disk 1 (disk2) 2 x 1TB SATA (problematic)
Command timeout on physical disk: Physical Disk 0:0:3 Controller 0, Connector 0
Many of these in October, then eventually:
Device failed: Physical Disk 0:0:3 Controller 0, Connector 0
Then it logs the replacement disk on 1st November:
Physical device removed: Physical Disk 0:0:3 Controller 0, Connector 0
Physical disk Rebuild started: Physical Disk 0:0:3 Controller 0, Connector 0
Device returned to normal: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
Physical disk online: Physical Disk 0:0:3 Controller 0, Connector 0
Physical disk Rebuild completed: Physical Disk 0:0:3 Controller 0, Connector 0
Redundancy normal: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
The logs are then fairly quiet until 10 days later, on 11th November:
Command timeout on physical disk: Physical Disk 0:0:3 Controller 0, Connector 0
Unexpected sense. SCSI sense data: Sense key: 6 Sense code: 29 Sense qualifier: 0: Physical Disk 0:0:3 Controller 0, Connector 0
Device failed: Physical Disk 0:0:3 Controller 0, Connector 0
Redundancy lost: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
So again the same disk has failed. The next day, on 12th November, the other disk in the array (disk 2) gives a similar error:
Command timeout on physical disk: Physical Disk 0:0:2 Controller 0, Connector 0
Unexpected sense. SCSI sense data: Sense key: 6 Sense code: 29 Sense qualifier: 0: Physical Disk 0:0:2 Controller 0, Connector 0
The controller write policy has been changed to Write Through.: Battery 0 Controller 0
The virtual disk cache policy has changed.: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
The client then rebooted Windows as it was having issues, although he confirms he could access both disks in Windows and navigate them without issues.
I imported the config back to the RAID card at 10:10am. Here are the logs from this morning.
9:06am:
Controller has preserved cache.: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
Virtual disk failed: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
Device failed: Physical Disk 0:0:2 Controller 0, Connector 0
The virtual disk cache policy has changed.: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
Then nothing until the server came back up at 10:09am, with these logs:
Virtual disk degraded: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)
Controller event log: Package version 12.10.6-0001: Controller 0 (PERC H700 Integrated)
Controller event log: VD 01/1 is now DEGRADED: Controller 0 (PERC H700 Integrated)
Controller event log: State change on PD 03(e0x20/s3) from OFFLINE(10) to REBUILD(14): Controller 0 (PERC H700 Integrated)
Controller event log: Rebuild automatically started on PD 03(e0x20/s3): Controller 0 (PERC H700 Integrated)
So last month it was disk 3, this month disk 2.
I am wondering if the issue may be something else, maybe the RAID card or the backplane?
We last ran LCC for firmware upgrades in July, so most things are up to date.
Today I've ordered 2 x 1TB Enterprise disks from Dell to replace both disks. However, if it happens again, what else should I replace?
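To see at a glance which physical disk is logging which events, the alert lines quoted above can be tallied per disk. This is only a rough sketch in Python: it assumes the alerts have been copied out as plain text, one per line, in the same "event: Physical Disk x:y:z ..." shape shown in this post. The function name `tally_disk_events` is made up for illustration and is not part of OpenManage.

```python
import re
from collections import Counter, defaultdict

# Matches alert lines shaped like the ones quoted in this post, e.g.
# "Device failed: Physical Disk 0:0:3 Controller 0, Connector 0"
# Lines about virtual disks or the battery do not match and are skipped.
ALERT_RE = re.compile(r"^(?P<event>.+?):\s*Physical Disk (?P<disk>\d+:\d+:\d+)\b")


def tally_disk_events(lines):
    """Return {disk_id: Counter({event_text: count})} for physical-disk alerts."""
    per_disk = defaultdict(Counter)
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if match:
            per_disk[match.group("disk")][match.group("event")] += 1
    return per_disk


# A few alert lines copied from this post.
alerts = [
    "Command timeout on physical disk: Physical Disk 0:0:3 Controller 0, Connector 0",
    "Device failed: Physical Disk 0:0:3 Controller 0, Connector 0",
    "Redundancy lost: Virtual Disk 1 (disk2) Controller 0 (PERC H700 Integrated)",
    "Command timeout on physical disk: Physical Disk 0:0:2 Controller 0, Connector 0",
]

for disk, events in sorted(tally_disk_events(alerts).items()):
    for event, count in events.items():
        print(f"{disk}  {count:>3}  {event}")
```

If timeouts and failures show up on more than one slot over time, as they do here for 0:0:2 and 0:0:3, that at least supports asking whether something shared (drive model, controller, backplane) is involved rather than a single bad disk.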
theflash1932
7 Technologist
December 13th, 2016 05:00
Pull a controller log (OMSA, PERC, Information/Configuration, Export Log from Available Tasks dropdown menu) and search it for "puncture". Your array may be damaged.
Are you using Dell-certified drives, or "off the shelf"/generic drives?
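If the exported log is large, the search can be scripted instead of done by eye. A minimal sketch in Python, assuming the export is a plain-text file; the function names and the file name in the usage note are illustrative only and are not anything provided by OMSA.

```python
def find_keyword(lines, keyword="puncture"):
    """Return (line_number, text) pairs for lines containing keyword, ignoring case."""
    needle = keyword.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


def search_log_file(path, keyword="puncture"):
    """Open a text log and return the matching lines with their line numbers."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_keyword(handle, keyword)
```

For example, `search_log_file("controller_log.txt")` (a hypothetical file name) returns an empty list when the word never appears, which matches a "searched and found nothing" result.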
chrisduk112
December 13th, 2016 07:00
TheFlash1932,
Firstly, thank you very much for replying to my thread.
I have downloaded the log and searched for "puncture" but couldn't find anything. I have, however, attached the log to this post if you want to take a quick peek.
These disks are not Dell ones. Today I have ordered 2 x Dell Enterprise 1TB SATA disks to replace both of them. I am thinking I should do disk 2 first (based on my first post) and wait for the rebuild to complete, then do disk 3. Once the second rebuild is done, I'll test a reboot to check that the RAID config has stayed in place with the new drives.
1 Attachment
lsi_1213.zip
theflash1932
December 13th, 2016 08:00
It is most likely from incompatible drives.