Unsolved
November 15th, 2017 15:00
MD3620f, one controller not booting. Can't access or VXWORKS interface
Hi
I have 2 controllers, 1 works with access to both and VXWORKS interface, so my cable etc are known good.
No data needs to be kept.
The faulty controller boots to, then hangs, with no access to either and VXWORKS interface :
-=<###>=-
Instantiating /ram as rawFs, device = 0x1
Formatting /ram for DOSFS
Instantiating /ram as rawFs, device = 0x1
Formatting...Retrieved old volume params with %38 confidence:
Volume Parameters: FAT type: FAT32, sectors per cluster 0
0 FAT copies, 0 clusters, 0 sectors per FAT
Sectors reserved 0, hidden 0, FAT sectors 0
Root dir entries 0, sysId (null) , serial number 10000
Label:" " ...
Disk with 1024 sectors of 512 bytes will be formatted with:
Volume Parameters: FAT type: FAT12, sectors per cluster 1
2 FAT copies, 1010 clusters, 3 sectors per FAT
Sectors reserved 1, hidden 0, FAT sectors 6
Root dir entries 112, sysId VXDOS12 , serial number 10000
Label:" " ...
Instantiating /ram as rawFs, device = 0x1
OK.
RTC Error: Real-time clock device is not working
Adding 13888 symbols for standalone.
I've tried doing a lemClearLockdown with these results (from the good controller):
-> lemClearLockdown
-=<###>=-
Instantiating /ram as rawFs, device = 0x1
Formatting /ram for DOSFS
Instantiating /ram as rawFs, device = 0x1
Formatting...Retrieved old volume params with %38 confidence:
Volume Parameters: FAT type: FAT32, sectors per cluster 0
0 FAT copies, 0 clusters, 0 sectors per FAT
Sectors reserved 0, hidden 0, FAT sectors 0
Root dir entries 0, sysId (null) , serial number 10000
Label:" " ...
Disk with 1024 sectors of 512 bytes will be formatted with:
Volume Parameters: FAT type: FAT12, sectors per cluster 1
2 FAT copies, 1010 clusters, 3 sectors per FAT
Sectors reserved 1, hidden 0, FAT sectors 6
Root dir entries 112, sysId VXDOS12 , serial number 10000
Label:" " ...
Instantiating /ram as rawFs, device = 0x1
OK.
RTC Error: Real-time clock device is not working
Adding 13888 symbols for standalone.
Length: 0x13c Bytes
Version ver03.0A
Current date: 11/15/17 time: 15:00:28
Send <BREAK> for Service Interface or baud rate change
11/15/17-22:21:16 (tRAID): SOD Sequence is Normal, 0 on controller B
11/15/17-22:21:16 (tRAID): NOTE: Turning on tray summary fault LED
11/15/17-22:21:16 (tRAID): NOTE: Installed Protocols:
11/15/17-22:21:16 (tRAID): NOTE: Required Protocols:
11/15/17-22:21:16 (tRAID): NOTE: loading flash file: Fibre
11/15/17-22:21:18 (tRAID): NOTE: DSM: Current revision 7
11/15/17-22:21:18 (tRAID): NOTE: SYMBOL: SYMbolAPI registered.
11/15/17-22:21:19 (tRAID): NOTE: RCBBitmapManager total RPA size = 1778384896
11/15/17-22:21:19 (tRAID): NOTE: init: ioc: 0, PLVersion: 11-075-20-00
11/15/17-22:21:20 (tRAID): WARN: MLM: Failed creating m_MelNvsramLock
11/15/17-22:21:20 (tRAID): WARN: MLM: Failed creating m_MelEventListLock
11/15/17-22:21:20 (tRAID): NOTE: CrushMemoryPoolMgr: platform and memory (CPU MemSize:2048), adjusted allocating size to 262144 for CStripes
11/15/17-22:21:22 (tRAID): SOD: Instantiation Phase Complete
11/15/17-22:21:21 (IOSched): NOTE: SAS Expander Added: expDevHandle:x11 enclHandle:x2 numPhys:37 port:2 ioc:0 channel:1 numEntries:22
11/15/17-22:21:21 (IOSched): NOTE: SAS Expander Added: expDevHandle:x12 enclHandle:x3 numPhys:37 port:3 ioc:0 channel:0 numEntries:22
11/15/17-22:21:22 (tSasEvtWkr): NOTE: Alt controller path up on channel:1 devH:x13 expDevH:x11 phy:28 itn:2
11/15/17-22:21:22 (tSasExpChk): NOTE: Local Expander Firmware Version: 00.00.80.10
11/15/17-22:21:22 (tRAID): NOTE: Inter-Controller Communication Channels Opened
11/15/17-22:21:22 (tRAID): NOTE: LockMgr Role is Master
11/15/17-22:21:23 (tSasEvtWkr): NOTE: Alt controller path up on channel:0 devH:x14 expDevH:x12 phy:24 itn:4
11/15/17-22:21:23 (tSasExpChk): NOTE: Alternate Expander Firmware Version: 00.00.80.10
11/15/17-22:21:25 (tHckReset): NOTE: HealthCheck: Alt Ctl: 1 Reset_Failure, state: 0 Start
11/15/17-22:21:29 (tSasDiscCom): NOTE: SAS: Initial Discovery Complete Time: 35 seconds since last power on/reset, 10 seconds since sas instantiated
11/15/17-22:21:5 (tRAID): NOTE: WWN baseName 0004f01f-afd74520 (valid==>SigMatch)
11/15/17-22:22:01 (tRAID): SOD: Pre-Initialization Phase Complete
11/15/17-22:22:01 (utlTimer): NOTE: fcnChannelReport ==> -2 -3 -4 -5
11/15/17-22:22:06 (utlTimer): NOTE: fcnChannelReport ==> =2 =3 =4 -5
11/15/17-22:22:07 (utlTimer): NOTE: fcnChannelReport ==> =2 =3 =4 =5
11/15/17-22:22:12 (tRAID): NOTE: ACS: Icon ping to alternate failed: -2, resp: 0
11/15/17-22:22:12 (tRAID): NOTE: ACS: autoCodeSync(): Process start. Comm Mode: 0, Status: 0
11/15/17-22:22:12 (tRAID): WARN: ACS: autoCodeSync(): Skipped since alt not communicating.
11/15/17-22:22:13 (tRAID): SOD: Code Synchronization Initialization Phase Complete
11/15/17-22:22:13 (NvpsPersistentSyncM): NOTE: NVSRAM Persistent Storage updated successfully
11/15/17-22:22:13 (tRAID): NOTE: SAFE: Process new features
11/15/17-22:22:13 (tRAID): NOTE: SAFE: Process legacy features
11/15/17-22:22:14 (bdbmBGTask): ERROR: sendFreezeStateToAlternate: Exception IconSendInfeasibleException Error
11/15/17-22:22:15 (tRAID): NOTE: SPM acquireObjects exception: IconSendInfeasibleException Error
11/15/17-22:22:15 (tRAID): NOTE: sas: SOD Bad Check-in: IconSendInfeasibleException Error
11/15/17-22:22:15 (tRAID): NOTE: fcn: SOD Bad Check-in: IconSendInfeasibleException Error
11/15/17-22:22:15 (tRAID): NOTE: ion: SOD Bad Check-in: IconSendInfeasibleException Error
11/15/17-22:22:15 (tRAID): NOTE: CacheMgr::cacheOpenMirrorDevice:: mirror device 0xf00011
11/15/17-22:22:15 (tRAID): NOTE: PSTOR: PstorRecordManager::readRecord data block not found
11/15/17-22:22:15 (tRAID): WARN: CacheManager::adjustSyncStartStop cache not configured!
11/15/17-22:22:16 (tRAID): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:17 (tRAID): NOTE: CCM: sodClearMOSIntentsAlt(), failure clearing MOS intents on alt
11/15/17-22:22:17 (tRAID): NOTE: doRecovery: myMemory:1, IORCB:0, NEW:0, OLD:0
11/15/17-22:22:17 (tRAID): WARN: CacheManager::adjustSyncStartStop cache not configured!
11/15/17-22:22:17 (tRAID): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:18 (tRAID): WARN: CCM: initComplete() - isRestoreInProgressAlt caught IconSendInfeasibleException Error
11/15/17-22:22:18 (tRAID): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:18 (tRAID): WARN: arvm::AsyncMirrorManager::initialize caught IconSendInfeasibleException Error
11/15/17-22:22:19 (tRAID): NOTE: Exception caught sending size to alt FCM initialize IconSendInfeasibleException Error
11/15/17-22:22:22 (IOSched): NOTE: New Initiator: ioc: 0, port: 3, devHandle: x14, sasAddress: 5f01faf4d7452008
11/15/17-22:22:22 (tSasInitWkr): NOTE: New Initiator: 2 - channel:0, devHandle:x14, SAS Address:5f01faf4d7452008
11/15/17-22:22:22 (tRAID): NOTE: DiagVolManager::initialize: Exception - Alt controller not ready
11/15/17-22:22:26 (tRAID): SOD: Initialization Phase Complete
==============================================
Title: Disk Array Controller
Copyright 2008-2013 NetApp, Inc. All Rights Reserved.
Name: RC
Version: 07.84.47.60
Date: 05/28/2013
Time: 13:16:16 CDT
Models: 2660
Manager: devmgr.v1084api04.Manager
==============================================
11/15/17-22:22:26 (tRAID): sodMain Normal sequence finished, elapsed time = 70 seconds
11/15/17-22:22:26 (tRAID): sodMain complete
11/15/17-22:22:23 (ProcessHandlers): WARN: CCM: backupStorageAvailable() caught IconSendInfeasibleException Error
11/15/17-22:22:23 (ProcessHandlers): NOTE: vdm::CrushTaskCoordinationManager::handleEvent(1) - Exception N4domi27IconSendInfeasibleExceptionE - IconSendInfeasibleException Error
11/15/17-22:22:23 (ProcessHandlers): NOTE: vdm::CrushTaskCoordinationManager::handleEvent(2) - Exception N4domi27IconSendInfeasibleExceptionE - IconSendInfeasibleException Error
11/15/17-22:22:23 (ccmEventTask): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:23 (ccmEventTask): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:23 (tPdmSync): NOTE: getDriveTemperaturePollingIntervalOnAlt, caught IconSendInfeasibleException Error
11/15/17-22:22:23 (ccmEventTask): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:24 (continueActivateAft): NOTE: informPrimaryAlternateFinishedReset::Domi Exception caught in - IconSendInfeasibleException Error
11/15/17-22:22:24 (ccmEventTask): WARN: CCM Failed to notify the alternate to update Cache Store in pstor, writeCacheStoreToPstor() caught IconSendInfeasibleException Error
11/15/17-22:22:24 (ProcessHandlers): NOTE: SYMbol available
11/15/17-22:22:27 (ProcessHandlers): SOD: sodComplete Notification Complete
11/15/17-22:22:24 (IOSched): NOTE: New Initiator: ioc: 0, port: 2, devHandle: x13, sasAddress: 5f01faf4d745200c
11/15/17-22:22:24 (tSasInitWkr): NOTE: New Initiator: 3 - channel:1, devHandle:x13, SAS Address:5f01faf4d745200c
11/15/17-22:22:59 (utlTimer): WARN: Extended Link Down Timeout on channel 2
11/15/17-22:23:00 (utlTimer): WARN: Extended Link Down Timeout on channel 3
11/15/17-22:23:00 (utlTimer): WARN: Extended Link Down Timeout on channel 4
11/15/17-22:23:00 (utlTimer): WARN: Extended Link Down Timeout on channel 5
11/15/17-22:23:23 (ProcessEvents): WARN: RAIDVolumeManager::updateAltMountStates, caught IconSendInfeasibleException Error


DELL-Sam L
Moderator
November 16th, 2017 08:00
Hello Wysedata,
The controller that is failing to communicate with your working controller looks to have failed. Based on the serial output you posted, the working controller is reporting that, because it can't sync the cache so that both controllers match, it will take the information from the master controller and use it. If you can, swap the controllers in your MD3620f to confirm that the issue follows the controller. If it does, you will need to replace the controller.
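One rough way to check this against a saved serial capture is to count how many logged messages point at the alternate controller. The sketch below is a minimal Python example, not a Dell tool. The function name `summarize` and the list of marker strings in `ALT_HINTS` are assumptions, picked from the messages in the log posted above.

```python
import re
from collections import Counter

# Matches controller log lines such as:
#   11/15/17-22:22:16 (tRAID): WARN: CCM Failed to notify the alternate ...
# The severity tag is optional because some lines (e.g. "sodMain complete")
# carry none. Seconds may be one digit, as in one line of the posted log.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2}-\d{2}:\d{2}:\d{1,2}) "
    r"\((?P<task>[^)]+)\): "
    r"(?:(?P<level>NOTE|WARN|ERROR): )?(?P<msg>.*)$"
)

# Assumption: substrings taken from the posted log that suggest the
# alternate controller is not responding. Extend as needed.
ALT_HINTS = (
    "IconSendInfeasibleException",
    "alt not communicating",
    "Alt controller not ready",
    "Icon ping to alternate failed",
)


def summarize(lines):
    """Return (count per severity, messages that mention the alternate)."""
    levels = Counter()
    alt_failures = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # boot banner, RAM disk formatting, blank lines, etc.
        levels[match.group("level") or "OTHER"] += 1
        msg = match.group("msg")
        if any(hint in msg for hint in ALT_HINTS):
            alt_failures.append((match.group("ts"), match.group("task"), msg))
    return levels, alt_failures


if __name__ == "__main__":
    # "serial_capture.txt" is a hypothetical file name for the saved session.
    with open("serial_capture.txt", encoding="utf-8", errors="replace") as fh:
        levels, alt_failures = summarize(fh)
    print(dict(levels))
    for ts, task, msg in alt_failures:
        print(ts, task, msg)
```

If most WARN and ERROR lines land in the second list, that fits the picture of a healthy controller that cannot reach its partner. It does not say why the partner is down.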
Please let us know if you have any other questions.
Wysedata
November 16th, 2017 14:00
Hi
Thanks for your help Sam :)
Actually, I have managed to make some progress, but still not resolved. I now have access to the management IP, but when I add the controller to the MDSM, it shows as unrecognized. I can't telnet or SSH in, but I guess that needs to be enabled.
I was able to get it past the 'Adding 13888 symbols for standalone.' stage, and it came up with PCI errors. It has since reverted to being stuck at the same point, so I can't get a copy of the serial session to post. Given the random nature of its behaviour, I guess I'll have to accept that it is dead.
Thanks again for your help. It is appreciated.