The switch is Unavailable Due to Kernel Panic on MDS 9148S and MDS 9250i

Summary: A 'sysmgr' service crash results in a kernel panic that leaves the switch unavailable. The crash may be triggered by events such as a high availability (HA) policy reset or other system faults, but it is not a routine event. ...

Symptoms

This issue is specific to the MDS 9148S and MDS 9250i platforms and impacts only MDS releases 8.1(x) through 8.3(x).
After the reload, the output of the following command shows the reason for the reload:
show system reset-reason
----- reset reason for module 1 (from Supervisor in slot 1) ---
1) At 374295 usecs after Sun Aug  6 19:05:46 2023
    Reason: Kernel Panic
    Service: 
    Version: 8.3(2)
2) At 554068 usecs after Wed May 10 01:10:25 2023
    Reason: Kernel Panic
    Service: 
    Version: 8.3(2)
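
When this output has been captured from many switches, scanning it by hand gets tedious. The following is a minimal Python sketch, not a vendor tool, that pulls the entries out of saved `show system reset-reason` text and keeps those whose reason is a kernel panic. The entry layout is assumed from the sample output above, and the helper names (`parse_reset_reasons`, `kernel_panic_entries`) are hypothetical.

```python
import re

# Layout assumed from the sample output in this article:
#   "N) At <usecs> usecs after <date>" followed by Reason/Service/Version lines.
ENTRY_RE = re.compile(
    r"^\s*(?P<index>\d+)\)\s+At\s+(?P<usecs>\d+)\s+usecs\s+after\s+(?P<when>.+?)\s*$"
)
FIELD_RE = re.compile(r"^\s*(?P<key>Reason|Service|Version):\s*(?P<value>.*?)\s*$")


def parse_reset_reasons(text):
    """Return one dict per numbered entry in saved reset-reason output."""
    entries = []
    current = None
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            current = {
                "index": int(m.group("index")),
                "when": m.group("when"),
                "Reason": "",
                "Service": "",
                "Version": "",
            }
            entries.append(current)
            continue
        m = FIELD_RE.match(line)
        if m and current is not None:
            current[m.group("key")] = m.group("value")
    return entries


def kernel_panic_entries(text):
    """Keep only the entries whose Reason field is 'Kernel Panic'."""
    return [e for e in parse_reset_reasons(text) if e["Reason"].lower() == "kernel panic"]


# Sample copied from the output shown in this article.
SAMPLE = """\
----- reset reason for module 1 (from Supervisor in slot 1) ---
1) At 374295 usecs after Sun Aug  6 19:05:46 2023
    Reason: Kernel Panic
    Service:
    Version: 8.3(2)
2) At 554068 usecs after Wed May 10 01:10:25 2023
    Reason: Kernel Panic
    Service:
    Version: 8.3(2)
"""

if __name__ == "__main__":
    for entry in kernel_panic_entries(SAMPLE):
        print(entry["index"], entry["when"], entry["Version"])
```

An empty Service field, as in both entries here, is consistent with the symptom described in this article: the panic leaves no service name in the reset reason.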
    
cpp_di_si=0
vip_cpu_srvc_init_kspace: Called
**** TOTAL PORTS = 48 *****
[sched_delayed] sched: RT throttling activated
sock: process `snmpd' is using obsolete setsockopt SO_BSDCOMPAT
vip_cpu_srvc_init_kspace: Called
vip_cpu_srvc_trigger_kspace: Called
in viperk_cpu_mts_init *mts_q:0xd46642c0 mts_q :0xd3989bd8
During ISSU MTS init handler for Cpu is loaded
mts_q ::0xd46642c0
vip_cpu_srvc_trigger_kspace: Called
smhb_mod_hb_params: Sysmgr modifying HB params, hb_intvl 2 max_hb_loss 8
smhb_enable_disable_wd: do nothing on lc_on_hybrid_sup
ps (10188) used greatest stack depth: 4432 bytes left
ps (11005) used greatest stack depth: 4208 bytes left
libphy: mdio@fff726520:00 - Link is Down
libphy: mdio@fff726520:00 - Link is Up - 1000/Full
Kernel stack overflow in process dbfdc9b0, r1=d64a9afc
Kernel panic - not syncing: kernel stack overflow
KGDB: Waiting for remote debugger
Start stack dumping
Moving to kernel stack
Done stack dumping
Start register dumping
Done register dumping
Done all dumping 4053 8196

**** KERNEL PANIC OCCURED*******
Writing reset reason.
Irqs 1
Writing stack trace
Writing kernel traces
Starting dump of trace events
Unable to handle kernel paging request for data at address 0x9596a008
Faulting instruction address: 0xc00b9804

2023 Aug  6 21:03:30 SGMCISWXXXX1 %SYSMGR-5-MODULE_ONLINE: System Manager has received notification of local module becoming online.

show logging log
2023 Aug  6 21:02:52 SGMCISWXXXX1 %SYSLOG-2-SYSTEM_MSG : Syslogs wont be logged into logflash until logflash is online 
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-5-SYSTEM_MSG: [    0.280269] SCSI subsystem initialized - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [    0.348776] pci 0001:03:00.2: EHCI: unrecognized capability 00 - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [    0.409937] bounce pool size: 64 pages - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [    0.689291] kworker/u4:0 (824) used greatest stack depth: 6400 bytes left - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-5-SYSTEM_MSG: [    0.804272] physmap platform flash device: 02000000 at f0000000 - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-3-SYSTEM_MSG: [    0.804302] physmap-flash physmap-flash.0: Could not reserve memory region - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [    0.886657] physmap-flash: probe of physmap-flash.0 failed with error -12 - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [    1.052036] mpc85xx_mc_err_probe: No ECC DIMMs discovered - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-0-SYSTEM_MSG: [    1.074666] Enabling all PCI devices - kernel

2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-3-SYSTEM_MSG: [   16.171032] CMOS: Module initialized - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [   46.307515] ICMPv6: process `sysctl' is using deprecated sysctl (syscall) net.ipv6.neigh.default.base_reachable_time - use net.ipv6.neigh.default.base_reachable_time_ms instead - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [   46.335132] nr_pdflush_threads exported in /proc is scheduled for removal - kernel
2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-4-SYSTEM_MSG: [   46.336491] sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case.  If you have one, please send an email to linux-mm@kvack.org. - kernel

2023 Aug  6 21:03:24 SGMCISWXXXX1 %FCS-5-API_FAIL: %$VSAN 120%$ pm_send_get_ports_in_vsans() failed: fu ha standby message queued 
2023 Aug  6 21:03:24 SGMCISWXXXX1 %FCS-5-API_FAIL: %$VSAN 150%$ pm_send_get_ports_in_vsans() failed: fu ha standby message queued 
2023 Aug  6 21:03:27 SGMCISWXXXX1 %FCDOMAIN-5-DOMAIN_TYPE_IS_PREFERRED: The domain ID type is currently configured as preferred in all the existing VSANs
2023 Aug  6 21:03:28 SGMCISWXXXX1 %DAEMON-3-SYSTEM_MSG: sendto(10.XX.X.48): Network is unreachable - ntpd[3589]
2023 Aug  6 21:03:28 SGMCISWXXXX1 %DAEMON-3-SYSTEM_MSG: sendto(10.XX.X.49): Network is unreachable - ntpd[3589]
2023 Aug  6 21:03:30 SGMCISWXXXX1 %MODULE-5-ACTIVE_SUP_OK: Supervisor 1 is active (Serial number: JAE200XXXXV)
2023 Aug  6 21:03:30 SGMCISWXXXX1 %PLATFORM-5-MOD_STATUS: Module 1 current-status is MOD_STATUS_ONLINE/OK
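
To sift a long post-reload log like the one above, it can help to split each line into its timestamp, facility, severity, and mnemonic and then filter by severity. The following is a rough Python sketch under the assumption that every line follows the `<date> <host> %FACILITY-SEVERITY-MNEMONIC: <text>` shape seen in this article; the regular expression and the `parse_line` helper are hypothetical and may need adjusting for other message layouts. Month names are parsed with `%b`, which assumes an English locale.

```python
import re
from datetime import datetime

# Line shape assumed from the log excerpt in this article.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+)\s*:\s*"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    record = m.groupdict()
    record["severity"] = int(record["severity"])
    # Collapse the padding before single-digit days ("Aug  6") before parsing.
    record["ts"] = datetime.strptime(" ".join(record["ts"].split()), "%Y %b %d %H:%M:%S")
    return record


# Three lines copied from the excerpt above.
SAMPLE_LINES = [
    "2023 Aug  6 21:02:52 SGMCISWXXXX1 %SYSLOG-2-SYSTEM_MSG : Syslogs wont be logged into logflash until logflash is online ",
    "2023 Aug  6 21:02:53 SGMCISWXXXX1 %KERN-3-SYSTEM_MSG: [    0.804302] physmap-flash physmap-flash.0: Could not reserve memory region - kernel",
    "2023 Aug  6 21:03:30 SGMCISWXXXX1 %PLATFORM-5-MOD_STATUS: Module 1 current-status is MOD_STATUS_ONLINE/OK",
]

records = [r for r in (parse_line(l) for l in SAMPLE_LINES) if r is not None]
# Lower numbers are more severe; keep severity 3 (error) and below.
severe = [r for r in records if r["severity"] <= 3]

if __name__ == "__main__":
    for r in severe:
        print(r["ts"], r["facility"], r["severity"], r["message"])
```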

Cause

A 'sysmgr' service crash occurs on MDS 9148S and MDS 9250i platforms running MDS releases 8.1(x) through 8.3(x). The crash leads to a kernel panic that makes the entire switch unavailable, and the switch must be forcefully reloaded to restore functionality. Because of the kernel panic, the exact reason the 'sysmgr' service terminated cannot be determined.
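
The scope stated above (two platforms, releases 8.1(x) through 8.3(x)) can be turned into a quick inventory check. The following is a minimal Python sketch that assumes release strings look like `8.3(2)` or `8.4(2f)` and that platform names are written exactly as in this article; `parse_release` and `is_affected` are hypothetical helper names.

```python
import re

# Platforms named in this article.
AFFECTED_PLATFORMS = {"MDS 9148S", "MDS 9250i"}

# Assumed release format: major.minor(maintenance[letters]), e.g. "8.3(2)" or "8.4(2f)".
RELEASE_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\((?P<maint>\d+)(?P<suffix>[a-z]*)\)$"
)


def parse_release(release):
    """Split a release string into (major, minor, maintenance, suffix)."""
    m = RELEASE_RE.match(release.strip())
    if m is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    return int(m.group("major")), int(m.group("minor")), int(m.group("maint")), m.group("suffix")


def is_affected(platform, release):
    """True if the platform and release fall in the range this article describes."""
    major, minor, _, _ = parse_release(release)
    return platform in AFFECTED_PLATFORMS and (8, 1) <= (major, minor) <= (8, 3)


if __name__ == "__main__":
    print(is_affected("MDS 9148S", "8.3(2)"))
    print(is_affected("MDS 9250i", "8.4(2f)"))
```

The check only compares the major and minor numbers, because the article gives the affected range at that granularity.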

Resolution

Workaround:
There is no known workaround that prevents this issue. To recover a switch from the unavailable state, reload it.

Resolution:
Cisco TAC recommends upgrading the switch code to version 8.4(2f).

Additional Information

Known Affected Releases: 8.3(2)
Cisco Issue ID: CSCvu16450 and CSCvp13486
Cisco TAC case: 695961883

Affected Products

Connectrix MDS 9148S

Article Properties
Article Number: 000216949
Article Type: Solution
Last Modified: 23 Nov 2023
Version:  4