Unsolved
August 26th, 2018 08:00
MD3000 in reboot loop, dbm::versionConvert fail to commit
I was reconfiguring an MD3000 and foolishly swapped a couple of drives that were still in the process of initializing. MDSM showed several errors. I turned the array off, swapped a couple more drives, and now when I power it up it is stuck in an endless reboot loop, and I can't connect to the array with MDSM. There are 2 controllers; the behavior is unchanged if I run with either one of them on its own.
Can anyone help?
I've attached a serial cable, and am getting the following output from power on until the first reboot:
-=<###>=-
Attaching interface lo0... done
Adding 9766 symbols for standalone.
Error
08/26/18-15:19:44 (GMT) (tRootTask): NOTE: I2C transaction returned 0x0423fe00
Reset, Power-Up Diagnostics - Loop 1 of 1
3600 Processor DRAM
01 Data lines Passed
02 Address lines Passed
3300 NVSRAM
01 Data lines Passed
5900 Ethernet 91c111 #1
01 Register read Passed
02 Register test Passed
3A00 NAND Flash
06 Bad Blocks Test Passed
2310 Application Accelerator Unit
01 AAU Register Test Passed
6D00 LSI SAS 1068 IOC--Base Board
01 IOC Register Read Test Passed
02 IOC Register Address Lines Test Passed
03 IOC Register Data Lines Test Passed
6D01 LSI SAS 1068 IOC--Host Card
01 IOC Register Read Test Passed
02 IOC Register Address Lines Test Passed
03 IOC Register Data Lines Test Passed
3900 Real-Time Clock
01 RT Clock Tick Passed
Diagnostic Manager exited normally.
Current date: 08/26/18 time: 05:19:03
Send for Service Interface or baud rate change
08/26/18-15:20:02 (GMT) (tRAID): NOTE: Set Powerup State
08/26/18-15:20:02 (GMT) (tRAID): NOTE: SOD Sequence is Normal, 0
08/26/18-15:20:02 (GMT) (tRAID): NOTE: Turning on tray summary fault LED
08/26/18-15:20:04 (GMT) (tRAID): NOTE: SYMBOL: SYMbolAPI registered.
08/26/18-15:20:04 (GMT) (tRAID): NOTE: lost persistent dq data because buffer was modified or size changed.
esmc0: LinkUp event
08/26/18-15:20:06 (GMT) (tNetCfgInit): NOTE: Network Ready
08/26/18-15:20:08 (GMT) (tRAID): NOTE: Initiating Drive channel: ioc:0 bringup
08/26/18-15:20:10 (GMT) (tRAID): NOTE: IOC Firmware Version: 00-24-63-00
08/26/18-15:20:20 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:1 phy:0 prevNumActivePhys:2 numActivePhys:2
08/26/18-15:20:20 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:1 phy:1 prevNumActivePhys:2 numActivePhys:2
08/26/18-15:20:20 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:0 phy:2 prevNumActivePhys:2 numActivePhys:2
08/26/18-15:20:20 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:0 phy:3 prevNumActivePhys:2 numActivePhys:2
08/26/18-15:20:20 (GMT) (tSasCfg017): NOTE: Alt Controller path up - chan:1 phy:18 itn:1
08/26/18-15:20:20 (GMT) (tSasCfg021): NOTE: Alt Controller path up - chan:0 phy:16 itn:2
08/26/18-15:20:21 (GMT) (IOSched): NOTE: New Initiator: 1 - channel: 1,devHandle: x15, SAS Address: 500221948c113601
08/26/18-15:20:21 (GMT) (IOSched): NOTE: New Initiator: 2 - channel: 0,devHandle: x21, SAS Address: 500221948c113600
08/26/18-15:20:21 (GMT) (tRAID): NOTE: IonMgr: Drive Interface Enabled
08/26/18-15:20:22 (GMT) (tRAID): NOTE: SOD: Instantiation Phase Complete
08/26/18-15:20:22 (GMT) (tRAID): NOTE: Inter-Controller Communication Channels Opened
08/26/18-15:20:22 (GMT) (tSasDiscCom): NOTE: SAS Discovery complete task spawned
08/26/18-15:20:22 (GMT) (iacTask1): NOTE: fbm:altValidateSubModelID: Sub-Model IDs Validated
08/26/18-15:20:22 (GMT) (tRAID): NOTE: LockMgr Role is Slave
08/26/18-15:20:22 (GMT) (sasCheckExpanderSet): NOTE: Expander Firmware Version: 0116-e05c
08/26/18-15:20:22 (GMT) (sasCheckExpanderSet): NOTE: Expander SAS address: Hi = x50022194 Low = x8dfda410
08/26/18-15:20:22 (GMT) (tRAID): NOTE: spmEarlyData: No data available
08/26/18-15:20:35 (GMT) (tSasDiscCom): WARN: SAS: Initial Discovery Complete Time: 30 seconds
08/26/18-15:20:35 (GMT) (tRAID): NOTE: WWN baseName 00040022-198c1136 (valid==>AltMacMatch)
08/26/18-15:20:38 (GMT) (tRAID): NOTE: Initiating Host channel: ioc:1 bringup
08/26/18-15:20:41 (GMT) (tRAID): NOTE: IOC Firmware Version: 00-24-63-00
08/26/18-15:20:51 (GMT) (tRAID): NOTE: sasEnableInterface: Enabling Host Interface for channel 2
08/26/18-15:20:51 (GMT) (tRAID): NOTE: sasEnableInterface: Enabling Host Interface for channel 3
08/26/18-15:20:51 (GMT) (tRAID): NOTE: IonMgr: Host Interface Enabled
08/26/18-15:20:51 (GMT) (tRAID): NOTE: SOD: Pre-Initialization Phase Complete
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab06
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab08
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab14
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab0c
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab12
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab76
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab0a
08/26/18-15:20:52 (GMT) (tRAID): NOTE: I2C transaction returned 0x0421ab77
08/26/18-15:20:52 (GMT) (tRAID): WARN: BID: initialize(): Power latched!
08/26/18-15:20:52 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:3 phy:6 prevNumActivePhys:0 numActivePhys:1
08/26/18-15:20:52 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: IOCPort: chan:3, FAILED to DEGRADED
08/26/18-15:20:52 (GMT) (tSasEvtWkr): WARN: sasIocPhyUp: Initializing Channel 3: Attached SAS Address: 5001c231cc77d504
08/26/18-15:20:53 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:3 phy:7 prevNumActivePhys:1 numActivePhys:2
08/26/18-15:20:55 (GMT) (ssmTimer): NOTE: I2C transaction returned 0x0421ab06
08/26/18-15:20:55 (GMT) (ssmTimer): NOTE: I2C transaction reporting disabled.
08/26/18-15:21:02 (GMT) (tRAID): PANIC: dbm::versionConvert fail to commit
0 events found


DELL-Sam L
Moderator
August 27th, 2018 09:00
Hello Wellywell1,
The I2C bus involves the following components: the backplane, the controllers, and the power supplies. You will need to find out which component is causing the issue. Take the MD3000 down to 1 controller and 1 power supply connected to the backplane. The controller needs to be in slot 0 and the power supply needs to be on the left-hand side. Remove all HDDs from the backplane as well. Then connect via serial cable and boot the system to see whether you get the I2C errors. If the error still happens, try moving the power supply to the right side. You can also try swapping the controllers to see whether the error happens on both of them. Once you have found the faulty part, you can replace it to get your MD3000 back up and running.
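To keep track of which combinations have been tried, the isolation steps above can be written out as a small test matrix. This is only a sketch: the controller labels "A" and "B" are hypothetical names for the two physical controllers, and the result of each boot still has to be read from the serial console by hand.

```python
from itertools import product

# Hypothetical labels for the two physical controllers in the enclosure.
CONTROLLERS = ("A", "B")
# Power supply positions as seen in the steps above.
PSU_POSITIONS = ("left", "right")


def isolation_matrix():
    """List every single-controller, single-PSU, no-drive configuration."""
    return [
        {
            "controller_in_slot_0": controller,
            "psu_position": psu,
            "drives_installed": 0,  # all HDDs removed while isolating
        }
        for controller, psu in product(CONTROLLERS, PSU_POSITIONS)
    ]


if __name__ == "__main__":
    for number, config in enumerate(isolation_matrix(), start=1):
        print(f"Test {number}: {config}")
```

Noting for each row whether the I2C errors appear makes it easier to see whether they follow a controller or a power supply position.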
Please let us know if you have any other questions.
Wellywell1
August 27th, 2018 18:00
Sam - thanks for the quick reply.
When you refer to the I2C errors, I assume that the very first one at the top of the log is expected, as I can never get that to go away. I've found that one of the controllers produces the 8 I2C errors in a row, and the other does not. With 1 or both power supplies in, and the "good" controller in slot 0, I only see that first I2C error; the tray boots and I'm able to access it through MDSM. Note, however, that with either or both controllers, and no disks, it boots fine and can be managed through MDSM.
If I try running 1 or 2 power supplies, and the good controller in slot 0 with disks 0 and 1 installed, I get the panic I was getting earlier. Here's a log of what that looks like:
-=<###>=-
Attaching interface lo0... done
Adding 9766 symbols for standalone.
Error
08/28/18-01:34:35 (GMT) (tRootTask): NOTE: I2C transaction returned 0x0423fe00
Reset, Power-Up Diagnostics - Loop 1 of 1
3600 Processor DRAM
01 Data lines Passed
02 Address lines Passed
3300 NVSRAM
01 Data lines Passed
5900 Ethernet 91c111 #1
01 Register read Passed
02 Register test Passed
3A00 NAND Flash
06 Bad Blocks Test Passed
2310 Application Accelerator Unit
01 AAU Register Test Passed
6D00 LSI SAS 1068 IOC--Base Board
01 IOC Register Read Test Passed
02 IOC Register Address Lines Test Passed
03 IOC Register Data Lines Test Passed
6D01 LSI SAS 1068 IOC--Host Card
01 IOC Register Read Test Passed
02 IOC Register Address Lines Test Passed
03 IOC Register Data Lines Test Passed
3900 Real-Time Clock
01 RT Clock Tick Passed
Diagnostic Manager exited normally.
Current date: 08/27/18 time: 16:28:18
Send for Service Interface or baud rate change
08/28/18-01:34:53 (GMT) (tRAID): NOTE: Set Powerup State
08/28/18-01:34:53 (GMT) (tRAID): NOTE: SOD Sequence is Normal, 0
08/28/18-01:34:53 (GMT) (tRAID): NOTE: Turning on tray summary fault LED
08/28/18-01:34:54 (GMT) (tRAID): NOTE: SYMBOL: SYMbolAPI registered.
08/28/18-01:34:54 (GMT) (tRAID): NOTE: lost persistent dq data because buffer was modified or size changed.
esmc0: LinkUp event
08/28/18-01:34:57 (GMT) (tNetCfgInit): NOTE: Network Ready
08/28/18-01:34:58 (GMT) (tRAID): NOTE: Initiating Drive channel: ioc:0 bringup
08/28/18-01:35:01 (GMT) (tRAID): NOTE: IOC Firmware Version: 00-24-63-00
08/28/18-01:35:09 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:0 phy:0 prevNumActivePhys:2 numActivePhys:2
08/28/18-01:35:09 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:0 phy:1 prevNumActivePhys:2 numActivePhys:2
08/28/18-01:35:19 (GMT) (tRAID): NOTE: IonMgr: Drive Interface Enabled
08/28/18-01:35:19 (GMT) (tRAID): NOTE: SOD: Instantiation Phase Complete
08/28/18-01:35:19 (GMT) (tRAID): WARN: No attempt made to open Inter-Controller Communication Channels
08/28/18-01:35:19 (GMT) (tRAID): NOTE: LockMgr Role is Master
08/28/18-01:35:19 (GMT) (tRAID): WARN: FBM:validateSubModel: Exception - Alt controller not ready
08/28/18-01:35:19 (GMT) (tSasDiscCom): NOTE: SAS Discovery complete task spawned
08/28/18-01:35:19 (GMT) (tRAID): NOTE: spmEarlyData: No data available
08/28/18-01:35:19 (GMT) (sasCheckExpanderSet): NOTE: Expander Firmware Version: 0116-e05c
08/28/18-01:35:19 (GMT) (sasCheckExpanderSet): NOTE: Expander SAS address: Hi = x50022194 Low = x8c113610
08/28/18-01:35:25 (GMT) (tSasDiscCom): WARN: SAS: Initial Discovery Complete Time: 30 seconds
08/28/18-01:35:25 (GMT) (tRAID): NOTE: WWN baseName 00080022-198c1136 (valid==>NoPrevAlt)
08/28/18-01:35:28 (GMT) (tRAID): NOTE: Initiating Host channel: ioc:1 bringup
08/28/18-01:35:31 (GMT) (tRAID): NOTE: IOC Firmware Version: 00-24-63-00
08/28/18-01:35:41 (GMT) (tRAID): NOTE: sasEnableInterface: Enabling Host Interface for channel 2
08/28/18-01:35:41 (GMT) (tRAID): NOTE: sasEnableInterface: Enabling Host Interface for channel 3
08/28/18-01:35:42 (GMT) (tRAID): NOTE: IonMgr: Host Interface Enabled
08/28/18-01:35:42 (GMT) (tRAID): NOTE: SOD: Pre-Initialization Phase Complete
08/28/18-01:35:42 (GMT) (tRAID): WARN: BID: initialize(): Power latched!
08/28/18-01:35:48 (GMT) (tRAID): PANIC: dbm::versionConvert fail to commit
-=<###>=-
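For anyone comparing captures like the two above, a short script can pull out the WARN and PANIC entries and count the I2C notes per boot. This is a minimal sketch that assumes the timestamped line layout seen in these logs (date-time, "(GMT)", task name in parentheses, severity, message); the function names are my own and not part of any Dell or MDSM tool.

```python
import re
from collections import Counter

# Matches lines such as:
# 08/28/18-01:35:48 (GMT) (tRAID): PANIC: dbm::versionConvert fail to commit
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2})-(?P<time>\d{2}:\d{2}:\d{2}) \(GMT\) "
    r"\((?P<task>[^)]+)\): (?P<level>[A-Z]+): (?P<msg>.*)$"
)


def parse_capture(text):
    """Return one dict per timestamped line; other lines are skipped."""
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def summarize(entries):
    """Count severities and I2C notes, and collect every non-NOTE entry."""
    levels = Counter(entry["level"] for entry in entries)
    i2c_count = sum(
        1 for entry in entries
        if entry["msg"].startswith("I2C transaction returned")
    )
    notable = [entry for entry in entries if entry["level"] != "NOTE"]
    return levels, i2c_count, notable


# A few lines copied from the capture above, used as sample input.
SAMPLE = """\
08/28/18-01:34:35 (GMT) (tRootTask): NOTE: I2C transaction returned 0x0423fe00
Diagnostic Manager exited normally.
08/28/18-01:35:42 (GMT) (tRAID): WARN: BID: initialize(): Power latched!
08/28/18-01:35:48 (GMT) (tRAID): PANIC: dbm::versionConvert fail to commit
"""

if __name__ == "__main__":
    levels, i2c_count, notable = summarize(parse_capture(SAMPLE))
    print("Severity counts:", dict(levels))
    print("I2C notes:", i2c_count)
    for entry in notable:
        print(entry["time"], entry["task"], entry["level"], entry["msg"])
```

Running it over a full capture saved to a text file (read the file and pass its contents to `parse_capture`) gives a quick side-by-side of what differs between the two controllers.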
DELL-Sam L
Moderator
August 28th, 2018 07:00
Hello Wellywell1,
Does your controller reboot after it displays “PANIC: dbm::versionConvert fail to commit”? I am trying to see whether it displays a lockdown status or provides more information on the panic. I can send you a private message; if you can send me the full serial capture, I can look at it to see what is causing the panic.
Please let us know if you have any other questions.