whchenlong
1 Nickel

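VNX controller replacement problem

After replacing a controller on a VNX5100, it fails to boot normally and the blue LED blinks slowly.

OE R51.502

The bootlog_spb is as follows:

The spcollect_spa is as follows:

Please help analyze this. Thanks.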

Catch69D5(01-04-20-16-39).jpg

8 Replies
whchenlong
1 Nickel

Re: VNX controller replacement problem

I have uploaded the bootlog and the SP collect to Baidu Cloud:

http://pan.baidu.com/s/1pKKM69P

Roger_Wu
5 Tungsten

Re: VNX controller replacement problem

The bootlog shows the following messages:

DDBS: First (primary) disk needs rebuild.

DDBS: Second disk is valid for boot.

According to the KB article "Interpreting DDBS messages when a CLARiiON CX-Series array boots" (https://support.emc.com/kb/338122), these messages are explained as follows:

When a CX-Series array tries to boot, SP-A first looks for a good image on drive 0-0, its primary OS drive. When monitoring the boot process from HyperTerminal, you will see messages like the following when a serial cable is inserted into the SP:

DDBS: MDDE (Rev 2) on disk 0
DDBS: MDDE (Rev 2) on disk 2
DDBS: MDB read from both disks.
DDBS: Chassis and disk WWN seeds match.
DDBS: First (primary) disk needs rebuild.  <=== Indicates the image is bad.
DDBS: Second disk is valid for boot.         <=== Indicates the image is good.
NT FLARE image (0x00400007) located at sector LBA 0x0002284B

When the array fails to find a good image on drive 0-0, it fails the boot process. If you were to pull drive 0-0, the array will look to secondary drive 0-2 for a good image. If it finds a good image on that drive, it will boot up. In this example, the image on the secondary drive is valid for the array to boot. If a valid image is not found on either drive, the boot-up fails and the drives must be re-imaged.

SP-A uses 0-0 and 0-2 as primary and secondary boot drives in that order.

SP-B uses 0-1 and 0-3 as primary and secondary boot drives in that order.
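
As a rough illustration of the KB logic above, here is a minimal Python sketch that scans bootlog text for the two DDBS messages quoted in this thread and reports the state of the primary and secondary boot drives. The function names (`ddbs_status`, `summarize`) are hypothetical, the patterns cover only the message wordings shown here, and applying the CX-Series drive mapping to a VNX5100 is an assumption, so treat it as a reading aid rather than a diagnostic tool.

```python
import re

# Boot drive order per SP, as stated in the KB excerpt above (CX-Series).
# Assumption: the same mapping is used here purely for labeling.
BOOT_DRIVES = {"SP-A": ("0-0", "0-2"), "SP-B": ("0-1", "0-3")}

# Patterns copied from the bootlog lines quoted in this thread.
# Other DDBS wordings may exist; they are simply ignored here.
_PATTERN = re.compile(
    r"DDBS: (First \(primary\)|Second) disk (needs rebuild|is valid for boot)\."
)


def ddbs_status(bootlog_text):
    """Return the state of each boot disk: 'valid', 'needs rebuild', or None if not seen."""
    status = {"primary": None, "secondary": None}
    for which, state in _PATTERN.findall(bootlog_text):
        key = "primary" if which.startswith("First") else "secondary"
        status[key] = "valid" if state == "is valid for boot" else "needs rebuild"
    return status


def summarize(sp, bootlog_text):
    """Build a one-line summary for the given SP (hypothetical helper)."""
    primary, secondary = BOOT_DRIVES[sp]
    status = ddbs_status(bootlog_text)
    return (
        f"{sp}: primary {primary} -> {status['primary']}, "
        f"secondary {secondary} -> {status['secondary']}"
    )


if __name__ == "__main__":
    sample = (
        "DDBS: First (primary) disk needs rebuild.\n"
        "DDBS: Second disk is valid for boot.\n"
    )
    print(summarize("SP-B", sample))
```

With the two messages from this bootlog, the sketch reports the primary disk as needing a rebuild and the secondary disk as valid, which matches the KB annotations above.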


If the "slow blue blink" is at 1/4 Hz, the symptom also matches the description in the KB article "VNX How can we troubleshoot a faulted SP ?" (https://support.emc.com/kb/464621):

(1/4 Hz Amber and/or Blue)
Invalid Master Boot Record (Corrupt Image)
Corrupted Data Directory Boot Service (DDBS)
SP is panicking and reboot count is getting cleared on every boot.
SP is panicking even after the reboot count trips.

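Also, was the log collection incomplete? After I ran TRiiAGE, TRiiAGE_SPlogs.txt was empty, so I cannot find out why SP B was replaced in the first place. If you have logs collected at another time, please upload those as well.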

whchenlong
1 Nickel

Re: VNX controller replacement problem

Hello, in my case the second disk is valid, so a re-image should not be necessary, right?

zhangjia
2 Iron

Re: VNX controller replacement problem

Stop the workload, pull out SP A, boot SP B on its own, and reinsert SP A once SP B is up.

whchenlong
1 Nickel

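Re: VNX controller replacement problem

That is what I was thinking too. I ran into a similar problem before: I disabled the SP A write cache, booted SP B on its own, and reinserted SP A once SP B was running normally. I plan to try that here.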

Severity : Warning

System : FCN00114300207

Domain : Local

Created : Jan 5, 2017 6:37:42 PM

Message : Valid SP-Cache is for Storage Processor SP B.

Full Description : Valid SP-Cache is for Storage Processor SP B.

Recommended Action : Valid SP-Cache is for Storage Processor SP B.

Event Code : 0x7241

Roger_Wu
5 Tungsten

Re: VNX controller replacement problem

Follow the KB and pull out the primary disk. If the SP boots, the array should still be usable for now.

whchenlong
1 Nickel

Re: VNX controller replacement problem

OK, I will try pulling out the primary disk the next time I am on site.

Roger_Wu
5 Tungsten

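Re: VNX controller replacement problem

Back up the critical data first. These are system disks after all, and a re-image takes quite a while if it comes to that.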
