ECS: The system has detected a switch issue

Summary: What to check when an email alert reports that the system has detected a switch issue.

Instructions

If the switch reported in the alert is a default Dell switch that has been replaced with a custom switch, respond to the form in the email and request assistance with filtering the replaced switch out of xDoctor alerting.

Gen2 default switches are Turtle, Rabbit, and Hare.
Gen3 default switches are Rabbit, Hare, Fox, and Hound.

 

If the switch has not been replaced, proceed with the following checks.

  1. Attempt to ping the switch reported in the alert. The ping should succeed. In the example below, however, the ping fails.

    admin@node1:~> ping -c 1 rabbit.rack
    PING rabbit.rack (xxx.xxx.xxx.xxx) 56(84) bytes of data.
    From provo.rack (xxx.xxx.xxx.xxx) icmp_seq=1 Destination Host Unreachable
    
    --- rabbit.rack ping statistics ---
    1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
  2. Attempt to ssh to the switch reported in the alert. If ssh works, a password prompt appears. In the example below, however, ssh fails.

    admin@node1:~> ssh rabbit.rack
    ssh: connect to host rabbit.rack port 22: No route to host
  3. Check that the switch appears as a Link Layer Discovery Protocol (LLDP) neighbor.

    Assuming there are no custom switches:
    A Gen2 system should have Turtle, Rabbit, and Hare switches.
    A Gen3 system should have Rabbit, Hare, Fox, and Hound switches.

    The example below is from a Gen2 system where the rabbit switch is missing.

    admin@node1:~> sudo lldpcli show neighbors
    -------------------------------------------------------------------------------
    LLDP neighbors:
    -------------------------------------------------------------------------------
    Interface:    private, via: LLDP, RID: 1, Time: 35 days, 16:09:52
      Chassis:
        ChassisID:    mac xx:xx:xx:xx:xx:xx
        SysName:      turtle
        SysDescr:     Arista Networks EOS version 4.15.6M running on an Arista Networks DCS-7048T-A
        MgmtIP:       xxx.xxx.xxx.xxx
        Capability:   Bridge, on
        Capability:   Router, off
      Port:
        PortID:       ifname Ethernet1
        PortDescr:    Nile Node01 (Data)
        TTL:          120
    -------------------------------------------------------------------------------
    Interface:    slave-1, via: LLDP, RID: 2, Time: 35 days, 16:09:48
      Chassis:
        ChassisID:    mac xx:xx:xx:xx:xx:xx
        SysName:      hare
        SysDescr:     Arista Networks EOS version 4.16.6M running on an Arista Networks DCS-7150S-24
        MgmtIP:       xxx.xxx.xxx.xxx
        Capability:   Bridge, on
        Capability:   Router, off
      Port:
        PortID:       ifname Ethernet9
        PortDescr:    MLAG group 1
        TTL:          120
    -------------------------------------------------------------------------------
    1. On Gen2 systems, turtle is the management switch. If it is possible to ssh to turtle, check the connection status to the rabbit and hare switches by running the three commands below.

      # ssh turtle.rack
      # en
      # show interfaces status | grep Mgmt

      Both switches should be marked as connected. In the example below, however, one of the connections is marked as notconnect.

      admin@node1:~> ssh turtle.rack
      Password:
      Last login: Wed Nov 27 23:08:48 2019 from xxx.xxx.xxx.xxx
      turtle>en
      turtle#show interfaces status | grep Mgmt
      Et49       Mgmt Port-Secondary 10Ge switch connected    2        a-full a-1G   1000BASE-T
      Et50       Mgmt Port-Primary 10Gbe switch  notconnect    2       auto   auto   1000BASE-T
    2. On Gen3 systems, fox and hound are both management switches, but fox manages the management links to rabbit and hare. If it is possible to ssh to fox, check the connection status to the rabbit and hare switches by running the two commands below.

      # ssh fox.rack
      # show interface status | grep MGMT

      Both switches should be marked as up. In the example below, however, the hare connection is down.

      admin@node1:~> ssh fox.rack
      
      fox# show interface status | grep MGMT
      Eth 1/1/33      Rabbit MGMT     up       1000M    full     A    2    -
      Eth 1/1/35      Hare MGMT       down     0        full     A    2    -
  4. If any of the above checks fail, respond to the form in the email that assistance is required, and include the outputs gathered above.
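
The first two checks can be combined into a small helper so that their results are captured together for the email form. The following is a minimal bash sketch, not part of ECS or xDoctor: the function names check_ping, tcp_open, check_ssh_port, and check_switch are hypothetical, and probing TCP port 22 is used here only as a non-interactive stand-in for reaching the ssh password prompt.

```shell
# Sketch only (requires bash). All function names are hypothetical helpers.

# Check 1: send one ICMP echo request, as in "ping -c 1 rabbit.rack".
check_ping() {
    local host=$1
    if ping -c 1 "$host" >/dev/null 2>&1; then
        echo "PASS ping $host"
    else
        echo "FAIL ping $host"
        return 1
    fi
}

# Try to open a TCP connection with bash's /dev/tcp redirection.
# Assumption: the coreutils "timeout" command is available on the node.
tcp_open() {
    local host=$1 port=$2
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Check 2 (approximation): an open port 22 suggests ssh would reach a
# password prompt; a closed or unreachable port matches "No route to host".
check_ssh_port() {
    local host=$1
    if tcp_open "$host" 22; then
        echo "PASS ssh $host"
    else
        echo "FAIL ssh $host"
        return 1
    fi
}

# Run both checks and return non-zero if either one fails.
check_switch() {
    local host=$1 rc=0
    check_ping "$host" || rc=1
    check_ssh_port "$host" || rc=1
    return "$rc"
}
```

For example, running `check_switch rabbit.rack` from a node prints one PASS or FAIL line per check. A FAIL line should still be followed up with the manual commands above so that the full output can be included in the response.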

 

Failure states for these checks are:

  1. Ping does not work.
  2. ssh does not work.
  3. The switch is missing from LLDP.
  4. Management switch reports a notconnect/down connection.
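
Failure states 3 and 4 can also be spotted by filtering the command output rather than reading it by eye. The sketch below is illustrative only: expected_switches, missing_switches, and failed_mgmt_links are hypothetical helper names, the expected switch lists are the default ones given in this article, and the parsing assumes the output layouts shown in the examples above (a SysName line per neighbor, and a link state of notconnect or down as its own column).

```shell
# Sketch only (requires bash and awk). All function names are hypothetical.

# Default switch names per hardware generation, as listed in this article.
expected_switches() {
    case $1 in
        2) echo "turtle rabbit hare" ;;
        3) echo "rabbit hare fox hound" ;;
        *) echo "unknown generation: $1" >&2; return 2 ;;
    esac
}

# Read "lldpcli show neighbors" output on stdin and print each expected
# switch that has no SysName entry. Any domain suffix on SysName is dropped.
missing_switches() {
    local gen=$1 expected seen name
    expected=$(expected_switches "$gen") || return 2
    seen=$(awk '$1 == "SysName:" { split($2, part, "."); print tolower(part[1]) }')
    for name in $expected; do
        grep -qx -- "$name" <<<"$seen" || echo "$name"
    done
}

# Read the management switch "status | grep" output on stdin and print the
# lines whose link state column is notconnect (Gen2) or down (Gen3).
failed_mgmt_links() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "notconnect" || $i == "down") { print; break }
    }'
}
```

On a Gen2 node, `sudo lldpcli show neighbors | missing_switches 2` would print the names of any default switches that LLDP does not report. Output copied from the turtle or fox session can be pasted into `failed_mgmt_links` in the same way. Systems with custom switches would need a different expected list.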

 

If all checks pass, the alert may be false or may have been caused by an expected event such as site maintenance. If the alert repeats while all checks still pass, respond to the form in the email that assistance is required with an intermittent switch alert.

 

Affected Products

ECS
Article Properties
Article Number: 000227348
Article Type: How To
Last Modified: 30 Jul 2024
Version:  1