ECS: OBS: xDoctor: RAP025: One or more network interfaces are down or missing

Summary: xDoctor reports a warning error indicating that one of the ECS interfaces is down to the switches.

Affected Products

This article applies to This article does not apply to This article is not tied to any specific product. Not all product versions are identified in this article.

Check out other resources

Symptoms

Beginning with ECS xDoctor v4.8‑109.0 and ObjectScale xDoctor v5.1‑109.0, RAP025 (One or more network interfaces are down or missing) is implemented as an auto‑healer. When switch connectivity issues exceed the configured Error or Critical severity threshold, xDoctor raises a RAP025 alert and automatically initiates its integrated repair orchestration workflow. This workflow performs the required remediation actions, if xDoctor auto‑healers are enabled.

Note: If your environment is running an xDoctor version earlier than ECS xDoctor v4.8‑109.0 or ObjectScale xDoctor v5.1‑109.0, the RAP025 auto‑healer functionality is not available. In these versions, remediation must be performed using the AutoPilot process described below or by following the manual remediation steps outlined in the Resolution section.

Alert That Triggers RAP025 Auto‑Healing

The RAP025 auto‑healing workflow is triggered when node interface failure is detected crossing the configured Error or Critical severity threshold. Once this threshold is exceeded, xDoctor generates a RAP025 alert, which serves as the trigger for the automated repair process.

Alert Output Example

Note: In xDoctor versions prior to ECS xDoctor v4.8‑109.0 and ObjectScale xDoctor v5.1‑109.0, this condition results in an alert only. Automatic remediation is not performed.

----------------------------------------------
INFO - Auto Healer for interface_down disabled
----------------------------------------------
Extra     = Auto Healer for interface_down disabled
Timestamp = 2026-03-24_205634
PSNT      = CKM00190800120 @ 4.8-109.0

-------------------------------------------------------------------------------------
ERROR - (Cached) One or more network interfaces are down or missing
-------------------------------------------------------------------------------------
Node      = Nodes
Extra     = {"Nodes": {"169.254.1.4": ["pslave-1"]}}
RAP       = RAP025
Solution  = KB 503814
Timestamp = 2026-04-01_190513
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

Auto Healer Remediation (Example)

When auto‑healers are enabled, xDoctor automatically initiates remediation for detected node interface failure by applying common corrective actions outlined in this knowledge base.

----------------------------------------
FIXED - Auto Healer fixed interface_down
----------------------------------------
Node      = Nodes
Extra     = {"Nodes": {"169.254.1.1": ["pslave-0"]}}
Timestamp = 2026-04-01_190513
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

Auto Healer Requirement

The xDoctor auto‑healer feature must be enabled for this remediation to occur. Auto‑healers can be enabled either during installation or post‑installation by following the steps outlined in:

KB: ECS: xDoctor: How to Enable xDoctor Auto Healer after tool installation

Manual Validation:

Check interface link status on the cluster. In the below example, node 4 has NO-CARRIER which means no connection to the secondary-0 interface.

Command:

# viprexec -i 'ip link show | egrep "slave-|public"'

Example:

admin@node1:~> viprexec -i 'ip link show | egrep "slave-|public"'

Output from host : 1xx.1xx.219.1
bash: public: command not found
3: slave-0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default

Output from host : 1xx.1xx.219.2
bash: public: command not found
3: slave-0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default

Output from host : 1xx.1xx8.219.3
bash: public: command not found
4: slave-0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default

Output from host : 1xx.1xx.219.4
bash: public: command not found
2: slave-0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master public state DOWN mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default

Confirm that there is a detected link by the individual interfaces:

Command:

# viprexec -i "echo slave-0;sudo ethtool slave-0 | grep 'Link detected:';echo slave-1;sudo ethtool slave-1 | grep 'Link detected:'"

Example:

admin@node1:~> viprexec -i "echo slave-0;sudo ethtool slave-0 | grep 'Link detected:';echo slave-1;sudo ethtool slave-1 | grep 'Link detected:'"

Output from host : 1xx.1xx.219.1
slave-0
        Link detected: yes
slave-1
        Link detected: yes

Output from host : 1xx.1xx.219.2
slave-0
        Link detected: yes
slave-1
        Link detected: yes

Output from host : 1xx.1xx.219.3
slave-0
        Link detected: yes
slave-1
        Link detected: yes

Output from host : 1xx.1xx.219.4
slave-0
        Link detected: no
slave-1
        Link detected: yes

Confirm if the switches are Arista on Gen 1 or Gen 2, and Dell (or other model) on Gen 3 environments.

NOTE: If the switches are not Arista or Dell switches, they are user-managed switches. We must notify the user that they have a connection down to remediate.

Command:

# viprexec -i "sudo lldpcli show neigh | grep SysDescr"

Example:
Gen 1 or Gen 2

admin@ecssh121:~> viprexec -i "sudo lldpcli show neigh | grep SysDescr"

Output from host : 1xx.1xx.219.1
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.15.6M running on an Arista Networks DCS-7048T-A

Output from host : 1xx.1xx.219.2
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.15.6M running on an Arista Networks DCS-7048T-A

Output from host : 1xx.1xx.219.3
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.15.6M running on an Arista Networks DCS-7048T-A

Output from host : 1xx.1xx.219.4
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.11.7.2 running on an Arista Networks DCS-7150S-24
    SysDescr:     Arista Networks EOS version 4.15.6M running on an Arista Networks DCS-7048T-A

Example:
Gen 3

admin@ecsnode1:~> viprexec -i "sudo lldpcli show neigh | grep SysDescr"

Output from host : 1xx.1xx.219.1
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 1xx.1xx.219.2
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 1xx.1xx.219.3
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 1xx.1xx.219.4
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 1xx.1xx.219.5
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 1xx.1xx.219.6
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 1xx.1xx.219.7
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Output from host : 11xx.1xx.219.8
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10
    SysDescr:     OS10

Check public interface configuration for bonding secondary-0 and secondary-1 for public on the node showing no link that is detected from the command in step 2:

Command:

# sudo cat /etc/sysconfig/network/ifcfg-public

Example:

admin@node4:~> sudo cat /etc/sysconfig/network/ifcfg-public
BONDING_MASTER=yes
BONDING_MODULE_OPTS="miimon=100 mode=4 xmit_hash_policy=layer3+4"
BONDING_SLAVE0=slave-0
BONDING_SLAVE1=slave-1
BOOTPROTO=static
IPADDR=xxx.xxx.xxx.xxx/xx
MTU=1500
STARTMODE=auto

Confirm the correct Link aggregation Control Protocol (LACP) parameter is set on the nodes. (IEEE 802.3ad Dynamic link aggregation is the default setting and is expected.)

Command:

# viprexec -i "grep Mode /proc/net/bonding/public"

Example:

admin@node1:~> viprexec -i "grep Mode /proc/net/bonding/public"

Output from host : 1xx.1xx.219.1
Bonding Mode: IEEE 802.3ad Dynamic link aggregation

Output from host : 1xx.1xx.219.2
Bonding Mode: IEEE 802.3ad Dynamic link aggregation

Output from host : 1xx.1xx.219.3
Bonding Mode: IEEE 802.3ad Dynamic link aggregation

Output from host : 1xx.1xx.219.4
Bonding Mode: IEEE 802.3ad Dynamic link aggregation

Confirm that the connections are consistent to the ports and the interfaces are connected by running the following:

ECS Gen 1 or Gen 2:

Gen 1 or Gen 2 ECS mappings for the nodes are as follows to the corresponding switch ports:

Ethernet9 = node 1
Ethernet10 = node 2
Ethernet11 = node 3
Ethernet12 = node 4
Ethernet13 = node 5
Ethernet14 = node 6
Ethernet15 = node 7
Ethernet16 = node 8

Command:

# viprexec -i "sudo lldpcli show neigh | egrep 'SysName|PortID'"

Example (Example from Gen 1 or Gen 2 ECS):

admin@ecssh121:~> viprexec -i "sudo lldpcli show neigh | egrep 'SysName|PortID'"                                                                                                                                                     
Output from host : 1xx.1xx.219.1
    SysName:      rabbit
    PortID:       ifname Ethernet9
    SysName:      hare
    PortID:       ifname Ethernet9
    SysName:      turtle
    PortID:       ifname Ethernet1

Output from host : 1xx.1xx.219.2
    SysName:      rabbit
    PortID:       ifname Ethernet10
    SysName:      hare
    PortID:       ifname Ethernet10
    SysName:      turtle
    PortID:       ifname Ethernet2

Output from host : 1xx.1xx.219.3
    SysName:      hare
    PortID:       ifname Ethernet11
    SysName:      rabbit
    PortID:       ifname Ethernet11
    SysName:      turtle
    PortID:       ifname Ethernet3

Output from host : 1xx.1xx.219.4
    SysName:      hare
    PortID:       ifname Ethernet12
    SysName:      turtle
    PortID:       ifname Ethernet4

NOTE: In the above example, the connection to Ethernet12 rabbit is missing for node 4. With all the information collected so far, we now know that port 12 is on rabbit switch is connected to secondary-0 interface of node 4.

ECS Gen 3:
Gen 3 ECS mappings for the nodes are as follows to the corresponding switch ports:

ethernet1/1/1 = node 1
ethernet1/1/2 = node 2
ethernet1/1/3 = node 3
ethernet1/1/4 = node 4
ethernet1/1/5 = node 5
ethernet1/1/6 = node 6
ethernet1/1/7 = node 7
ethernet1/1/8 = node 8

Confirm that the connections are up for the ECS connecting to the four ECS Gen 3 switches frontend rabbit and hare backend fox and hound.

Command:

# viprexec -i "sudo lldpcli show neigh | egrep 'SysName|PortID'"

Example:

admin@ecsnode1:~> viprexec -i "sudo lldpcli show neigh | egrep 'SysName|PortID'"

Output from host : 1xx.1xx.219.1
    SysName:      rabbit
    PortID:       ifname ethernet1/1/1
    SysName:      hare
    PortID:       ifname ethernet1/1/1
    SysName:      fox
    PortID:       ifname ethernet1/1/1
    SysName:      hound
    PortID:       ifname ethernet1/1/1

Output from host : 192.168.219.2
    SysName:      rabbit
    PortID:       ifname ethernet1/1/2
    SysName:      hare
    PortID:       ifname ethernet1/1/2
    SysName:      fox
    PortID:       ifname ethernet1/1/2
    SysName:      hound
    PortID:       ifname ethernet1/1/2

Output from host : 1xx.1xx.219.3
    SysName:      rabbit
    PortID:       ifname ethernet1/1/3
    SysName:      hare
    PortID:       ifname ethernet1/1/3
    SysName:      fox
    PortID:       ifname ethernet1/1/3
    SysName:      hound
    PortID:       ifname ethernet1/1/3

Output from host : 1xx.1xx.219.4
    SysName:      rabbit
    PortID:       ifname ethernet1/1/4
    SysName:      hare
    PortID:       ifname ethernet1/1/4
    SysName:      fox
    PortID:       ifname ethernet1/1/4
    SysName:      hound
    PortID:       ifname ethernet1/1/4

Output from host : 1xx.1xx.219.5
    SysName:      rabbit
    PortID:       ifname ethernet1/1/5
    SysName:      hare
    PortID:       ifname ethernet1/1/5
    SysName:      fox
    PortID:       ifname ethernet1/1/5
    SysName:      hound
    PortID:       ifname ethernet1/1/5

Output from host : 1xx.1xx.219.6
    SysName:      rabbit
    PortID:       ifname ethernet1/1/6
    SysName:      hare
    PortID:       ifname ethernet1/1/6
    SysName:      fox
    PortID:       ifname ethernet1/1/6
    SysName:      hound
    PortID:       ifname ethernet1/1/6

Output from host : 1xx.1xx.219.7
    SysName:      rabbit
    PortID:       ifname ethernet1/1/7
    SysName:      hare
    PortID:       ifname ethernet1/1/7
    SysName:      fox
    PortID:       ifname ethernet1/1/7
    SysName:      hound
    PortID:       ifname ethernet1/1/7

Output from host : 1xx.1xx.219.8
    SysName:      rabbit
    PortID:       ifname ethernet1/1/8
    SysName:      hare
    PortID:       ifname ethernet1/1/8
    SysName:      fox
    PortID:       ifname ethernet1/1/8
    SysName:      hound
    PortID:       ifname ethernet1/1/8

Connect to the rabbit with admin credentials from and check interface status.

Command:
ECS Gen 1 or Gen 2

# ssh <switch name>
# show interfaces show interfaces Ethernet<Interface Number>

Example:

ECS Gen 1 or Gen 2

admin@ecs1:~> ssh rabbit
Password:
Last login: Tue Sep  5 11:13:30 2017 from 1xx.1xx.219.1
rabbit>show interfaces Ethernet12
Ethernet12 is down, line protocol is notpresent (notconnect)
  Hardware is Ethernet, address is 444c.a8de.8f83 (bia 444c.a8de.8f83)
  Description: MLAG group 4
  Member of Port-Channel4
  Ethernet MTU 9214 bytes , BW 10000000 kbit
  Full-duplex, 10Gb/s, auto negotiation: off, uni-link: n/a
  Loopback Mode : None
  0 link status changes since last clear
  Last clearing of "show interface" counters never
  5 minutes input rate 0 bps (0.0% with framing overhead), 0 packets/sec
  5 minutes output rate 0 bps (0.0% with framing overhead), 0 packets/sec
     0 packets input, 0 bytes
     Received 0 broadcasts, 0 multicast
     0 runts, 0 giants
     0 input errors, 0 CRC, 0 alignment, 0 symbol, 0 input discards
     0 PAUSE input
     0 packets output, 0 bytes
     Sent 0 broadcasts, 0 multicast
     0 output errors, 0 collisions
     0 late collision, 0 deferred, 0 output discards
     0 PAUSE output

Command:
ECS Gen 3

# ssh <switch name>
# show interfaces ethernet 1/1/<Interface Number>

Example:
ECS Gen 3

rabbit# show interface ethernet 1/1/4
Ethernet 1/1/9 is up, line protocol is down
Description: VLT Group 9
Port is part of Port-channel 9
Hardware is Dell EMC Eth, address is d8:9e:f3:c6:31:09
    Current address is d8:9e:f3:c6:31:09
Pluggable media not present

Interface index is 17305276
Internet address is not set
Mode of IPv4 Address Assignment: not set
Interface IPv6 oper status: Enabled
MTU 9216 bytes, IP MTU 9184 bytes
LineSpeed 0, Auto-Negotiation off
FEC is auto
Flowcontrol rx off tx off
ARP type: ARPA, ARP Timeout: 60
Last clearing of "show interface" counters: 6 days 17:39:09
Queuing strategy: fifo
Input statistics:
     0 packets, 0 octets
     0 64-byte pkts, 0 over 64-byte pkts, 0 over 127-byte pkts
     0 over 255-byte pkts, 0 over 511-byte pkts, 0 over 1023-byte pkts
     0 Multicasts, 0 Broadcasts, 0 Unicasts
     0 runts, 0 giants, 0 throttles
     0 CRC,  overrun, 0 discarded
Output statistics:
     0 packets, 0 octets
     0 64-byte pkts, 0 over 64-byte pkts, 0 over 127-byte pkts
     0 over 255-byte pkts, 0 over 511-byte pkts, 0 over 1023-byte pkts
     0 Multicasts, 0 Broadcasts, 0 Unicasts
     0 throttles, 0 discarded, 0 Collisions, 0 wreddrops
Rate Info(interval 299 seconds):
     Input 0 Mbits/sec, 0 packets/sec, 0% of line rate
     Output 0 Mbits/sec, 0 packets/sec, 0% of line rate
Time since last interface status change: 6 days 17:39:10

Check for possible network interface (NIC) cards issues:
Another component to check if the nodes message buffer (dmesg) for recent NIC issues on the affected node. In the below example, the NIC loops between being up and down status.

If there is a NIC alert in the dmesg, it may require replacing.

If ECS Gen 1 or Gen 2 - Provide output to ECS support hardware engineer to analysis.
If ECS Gen 3 - Provide output to the Dell hardware team who would be expert in EX-series hardware.

Command:

# sudo dmesg -T

Example:

admin@node1:~> sudo dmesg -T
[Tue Jul  6 09:40:56 2021] public: link status definitely up for interface slave-1, 25000 Mbps full duplex
[Tue Jul  6 09:40:56 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Down
[Tue Jul  6 09:40:56 2021] public: link status definitely down for interface slave-1, disabling it
[Tue Jul  6 09:40:57 2021] public: link status definitely down for interface slave-1, disabling it
[Tue Jul  6 09:41:09 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
[Tue Jul  6 09:41:09 2021] bnxt_en 0000:04:00.1 slave-1: FEC autoneg off encodings: None
[Tue Jul  6 09:41:09 2021] public: link status definitely up for interface slave-1, 25000 Mbps full duplex
[Tue Jul  6 09:41:10 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Down
[Tue Jul  6 09:41:10 2021] bnxt_en 0000:04:00.1 slave-1: speed changed to 0 for port slave-1
[Tue Jul  6 09:41:11 2021] public: link status definitely down for interface slave-1, disabling it
[Tue Jul  6 09:41:20 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
[Tue Jul  6 09:41:20 2021] bnxt_en 0000:04:00.1 slave-1: FEC autoneg off encodings: None
[Tue Jul  6 09:41:20 2021] public: link status definitely up for interface slave-1, 25000 Mbps full duplex
[Tue Jul  6 09:41:22 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Down
[Tue Jul  6 09:41:22 2021] bnxt_en 0000:04:00.1 slave-1: speed changed to 0 for port slave-1
[Tue Jul  6 09:41:22 2021] public: link status definitely down for interface slave-1, disabling it
[Tue Jul  6 09:41:31 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
[Tue Jul  6 09:41:31 2021] bnxt_en 0000:04:00.1 slave-1: FEC autoneg off encodings: None
[Tue Jul  6 09:41:32 2021] public: link status definitely up for interface slave-1, 25000 Mbps full duplex
[Tue Jul  6 09:41:33 2021] bnxt_en 0000:04:00.1 slave-1: NIC Link is Down

Cause

Possible Causes

This issue may occur due to one or more of the following conditions:

Improperly seated SFP
The Small Form‑Factor Pluggable (SFP) module may not have been fully seated during installation.
User‑managed switch changes
Changes made on customer‑managed switches outside of Dell control—such as firmware upgrades, configuration changes, or maintenance activities—can cause network links to go down and then renegotiate, as described in the Resolution section.
Failed ECS networking hardware, including:
- Network Interface Card (NIC)
- Network cable
- Small Form‑Factor Pluggable (SFP) module
- Network switch or switch port

Resolution

xDoctor Auto Healer: ObjectScale xDoctor v5.1‑109.0 / ECS xDoctor v4.8‑109.0 or later

To manually trigger the Enabled auto healer, run the following command on the master.rack node. This initiates the rack analyzers, which will validate and auto‑heal the nodes one at a time.

Command:

# sudo xdoctor --rap=RAP025

Example:

admin@ecsnode1:~> admin@ecsnode1:~> sudo xdoctor --rap=RAP025
2026-04-01 19:05:13,590: xDoctor_4.8-109.0 - INFO    : Initializing xDoctor v4.8-109.0 ...
2026-04-01 19:05:14,143: xDoctor_4.8-109.0 - INFO    : Removing orphaned session - session_1775070174.306
2026-04-01 19:05:14,146: xDoctor_4.8-109.0 - INFO    : Starting xDoctor session_1775070313.036 ... (SYSTEM)
2026-04-01 19:05:14,146: xDoctor_4.8-109.0 - INFO    : Primary Node Control Check ...
2026-04-01 19:05:14,336: xDoctor_4.8-109.0 - INFO    : xDoctor Composition - Analyzer(s):ac_interface_check
2026-04-01 19:05:14,352: xDoctor_4.8-109.0 - INFO    : Session limited to 0:40:00
2026-04-01 19:05:15,410: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 19:05:15,410: xDoctor_4.8-109.0 - INFO    : ECS Version: 3.8.1.4
2026-04-01 19:05:15,410: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 19:05:15,525: xDoctor_4.8-109.0 - INFO    : xDoctor Pre Features
2026-04-01 19:05:15,526: xDoctor_4.8-109.0 - INFO    : Cron Activation
2026-04-01 19:05:15,526: xDoctor_4.8-109.0 - INFO    : xDoctor already active ...
[... Truncated Output ...]
2026-04-01 19:05:21,987: xDoctor_4.8-109.0 - INFO    : ------------------------------
2026-04-01 19:05:21,992: xDoctor_4.8-109.0 - INFO    : Analyzing collected data ...
2026-04-01 19:05:21,992: xDoctor_4.8-109.0 - INFO    : Analysis Limit: 0:13:00
2026-04-01 19:05:21,993: xDoctor_4.8-109.0 - INFO    : ANALYZER [ac_interface_check]
2026-04-01 19:05:22,291: xDoctor_4.8-109.0 - INFO    : Autohealing interface_down on node 169.254.1.4 ...
2026-04-01 19:08:39,599: xDoctor_4.8-109.0 - INFO    : Autohealing interface_down on node 169.254.1.1 ...
2026-04-01 19:11:19,874: xDoctor_4.8-109.0 - INFO    : All data analyzed in 0:05:57
2026-04-01 19:11:20,709: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 19:11:20,709: xDoctor_4.8-109.0 - INFO    : Diagnosis Summary
2026-04-01 19:11:20,709: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 19:11:20,709: xDoctor_4.8-109.0 - INFO    : PSNT: CKM00190800120
2026-04-01 19:11:20,709: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : FIXED             =  1
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : CRITICAL          =  0
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : CRITICAL (CACHED) =  0
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : ERROR             =  0
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : ERROR (CACHED)    =  1
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : WARNING           =  0
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : INFO              =  0
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : VERBOSE           =  0
2026-04-01 19:11:20,710: xDoctor_4.8-109.0 - INFO    : REPORT            =  0
2026-04-01 19:11:20,826: xDoctor_4.8-109.0 - INFO    : ---------------------
2026-04-01 19:11:20,826: xDoctor_4.8-109.0 - INFO    : xDoctor Post Features
2026-04-01 19:11:20,826: xDoctor_4.8-109.0 - INFO    : ----------------
2026-04-01 19:11:20,826: xDoctor_4.8-109.0 - INFO    : Data Combiner
2026-04-01 19:11:20,826: xDoctor_4.8-109.0 - INFO    : -------------
2026-04-01 19:11:20,827: xDoctor_4.8-109.0 - INFO    : Created a Data Collection Report (data.xml)
2026-04-01 19:11:20,827: xDoctor_4.8-109.0 - INFO    : ------
2026-04-01 19:11:20,827: xDoctor_4.8-109.0 - INFO    : SysLog
2026-04-01 19:11:20,827: xDoctor_4.8-109.0 - INFO    : ------
2026-04-01 19:11:20,828: xDoctor_4.8-109.0 - INFO    : Using Fabric as Syslog Server
2026-04-01 19:11:20,828: xDoctor_4.8-109.0 - INFO    : Syslog triggered on Warning
2026-04-01 19:11:20,828: xDoctor_4.8-109.0 - INFO    : Forwarded 1 event(s) to syslog
2026-04-01 19:11:20,828: xDoctor_4.8-109.0 - INFO    : ----
2026-04-01 19:11:20,828: xDoctor_4.8-109.0 - INFO    : SNMP
2026-04-01 19:11:20,828: xDoctor_4.8-109.0 - INFO    : ----
2026-04-01 19:11:20,829: xDoctor_4.8-109.0 - INFO    : Using 10.118.165.48:162 as SNMP server
2026-04-01 19:11:20,829: xDoctor_4.8-109.0 - INFO    : SNMP triggered on Warning
2026-04-01 19:11:26,905: xDoctor_4.8-109.0 - INFO    : Triggered 1 SNMPv2c event(s) and the SNMP server acknowledged 0 event(s)
2026-04-01 19:11:26,905: xDoctor_4.8-109.0 - INFO    : ------------
2026-04-01 19:11:26,905: xDoctor_4.8-109.0 - INFO    : ProcComplete
2026-04-01 19:11:26,905: xDoctor_4.8-109.0 - INFO    : ------------
2026-04-01 19:11:26,905: xDoctor_4.8-109.0 - WARNING : ProcComplete is disabled, please re-enable it (xdoctor --config)
2026-04-01 19:11:27,023: xDoctor_4.8-109.0 - INFO    : ----------------
2026-04-01 19:11:27,023: xDoctor_4.8-109.0 - INFO    : Session Archiver
2026-04-01 19:11:27,023: xDoctor_4.8-109.0 - INFO    : ----------------
2026-04-01 19:11:27,033: xDoctor_4.8-109.0 - INFO    : Session Stored in folder      - /usr/local/xdoctor/archive/other/2026-04-01_190513
2026-04-01 19:11:27,033: xDoctor_4.8-109.0 - INFO    : Session Archived as tar       - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_190513.tgz
2026-04-01 19:11:27,034: xDoctor_4.8-109.0 - INFO    : --------------------------
2026-04-01 19:11:27,034: xDoctor_4.8-109.0 - INFO    : Session Report                - sudo xdoctor --report --archive=2026-04-01_190513
2026-04-01 19:11:27,034: xDoctor_4.8-109.0 - INFO    : ---------------
2026-04-01 19:11:27,034: xDoctor_4.8-109.0 - INFO    : Session Cleaner
2026-04-01 19:11:27,034: xDoctor_4.8-109.0 - INFO    : ---------------
2026-04-01 19:11:27,046: xDoctor_4.8-109.0 - INFO    : Removing folder  (count limit) - /usr/local/xdoctor/archive/other/2026-04-01_180540
2026-04-01 19:11:27,047: xDoctor_4.8-109.0 - INFO    : Removing archive (count limit) - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_180540.tgz
2026-04-01 19:11:27,051: xDoctor_4.8-109.0 - INFO    : Cleaned 2 archived session(s)
2026-04-01 19:11:27,051: xDoctor_4.8-109.0 - INFO    : -------
2026-04-01 19:11:27,051: xDoctor_4.8-109.0 - INFO    : Emailer
2026-04-01 19:11:27,051: xDoctor_4.8-109.0 - INFO    : -------
2026-04-01 19:11:27,052: xDoctor_4.8-109.0 - INFO    : Using Dedicated Server (25:25) as SMTP Server ...
2026-04-01 19:11:27,052: xDoctor_4.8-109.0 - INFO    : Email Type = Individual Events
2026-04-01 19:11:27,052: xDoctor_4.8-109.0 - INFO    : ------------------------------
2026-04-01 19:11:27,053: xDoctor_4.8-109.0 - INFO    : xDoctor session_1775070313.036 finished in 0:06:14
2026-04-01 19:11:27,075: xDoctor_4.8-109.0 - INFO    : Successful Job:1775070313 Exit Code:200

Run the session report to review the results of the manual auto‑heal execution.

Command:

Note: Use the Session Report command provided in the xDoctor example above to review the results of the manual auto‑heal execution.

# sudo xdoctor --report --archive=<session report>

Example:

admin@ecsnode1:~> sudo xdoctor --report --archive=2026-04-01_180344

xDoctor 4.8-109.0
CKMXXXXXXXXXXX - ECS 3.8.1.4

Displaying xDoctor Report (2026-04-01_180344) Filter:[] ...

----------------------------------------
FIXED - Auto Healer fixed interface_down
----------------------------------------
Node      = Nodes
Extra     = {"Nodes": {"169.254.1.1": ["pslave-0"]}}
Timestamp = 2026-04-01_190513
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

In the case you have a failure, open an SR for investigation.
Example of failure:

-------------------------------------------------------------------------------------
ERROR - (Cached) Auto fix failed - One or more network interfaces are down or missing
-------------------------------------------------------------------------------------
Node      = Nodes
Extra     = {"Nodes": {"169.254.1.4": ["pslave-1"]}}
RAP       = RAP025
Solution  = KB 503814
Timestamp = 2026-04-01_190513
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

xDoctor Auto Pilot:

IMPORTANT: A new feature has been released in xDoctor 4-8.104.0 and later. This knowledge base (KB) is now automated with xDoctor to quickly attempt to recover a node interface by checking both node and switch details. If the script is unable to recover the interface, it provides a detailed summary of its findings.

To set up the automation repository with xDoctor 4-8.104.0 and later, follow ECS: ObjectScale: How to run KB Automation Scripts (Auto Pilot)

Run the automation command from a node with xDoctor 4-8.104.0 and above.

Note: --target-node is supported for this action. This script was included with the xDoctor 4-8.104.0 release, which features version 3.0 of the Ansible automation scripts.

Command:

# sudo xdoctor autopilot --kb 50341 --target-node <Node NAN IP>

Example:

admin@ecsnode1:~> sudo xdoctor autopilot --kb 50341 --target-node 169.254.1.1
Checking for existing screen sessions...
Starting screen session 'autopilot_kb_50341_20250627_124206'...
Screen session 'autopilot_kb_50341_20250627_124206' started successfully.
Attaching to screen session 'autopilot_kb_50341_20250627_124206'...

Review automation summary:

Example:

>TASK [Print all summaries] ***************************************************************************************************************************************************************************
ok: [169.254.1.7] => {
    "msg": [
        "Host: 169.254.1.7",
        "*******************************************************************************",
        "Node interface summary:",
        "*******************************************************************************",
        "Interface status for switch rabbit and node interface(slave-0):",
        "                           Node Interface: UP",
        "                           Node physical link detected: yes",
        "                           Node SFP Vendor: DELL",
        "                           Node SFP PN: VXFJY",
        "                           Node SFP Type: GBIC/SFP defined by 2-wire interface ID",
        "                           Node connection detected: ethernet1/1/7",
        "                           Switch Interface: ethernet1/1/7",
        "                           Switch Speed: 25G",
        "                           Switch Status: up",
        "                           Switch SFP Model: 25GBASE-CR-3.0M",
        "                           Switch SFP Qualified: true",
        "Interface status for switch hare and node interface(slave-1):",
        "                           Node Interface: UP",
        "                           Node physical link detected: yes",
        "                           Node SFP Vendor: DELL",
        "                           Node SFP PN: VXFJY",
        "                           Node SFP Type: GBIC/SFP defined by 2-wire interface ID",
        "                           Node connection detected: ethernet1/1/7",
        "                           Switch Interface: ethernet1/1/7",
        "                           Switch Speed: 25G",
        "                           Switch Status: up",
        "                           Switch SFP Model: 25GBASE-CR-3.0M",
        "                           Switch SFP Qualified: true",
        "Interface status for switch fox and node interface(pslave-0):",
        "                           Node Interface: UP",
        "                           Node physical link detected: yes",
        "                           Node SFP Vendor: DELL",
        "                           Node SFP PN: VXFJY",
        "                           Node SFP Type: GBIC/SFP defined by 2-wire interface ID",
        "                           Node connection detected: ethernet1/1/7",
        "                           Switch Interface: ethernet1/1/7",
        "                           Switch Speed: 25G",
        "                           Switch Status: up",
        "                           Switch SFP Model: 25GBASE-CR-3.0M",
        "                           Switch SFP Qualified: true",
        "Interface status for switch hound and node interface(pslave-1): ",
        "                           Node Interface:UP",
        "                           Node physical link detected:yes",
        "                           Node SFP Vendor: DELL",
        "                           Node SFP PN: VXFJY",
        "                           Node SFP Type: GBIC/SFP defined by 2-wire interface ID",
        "                           Node connection detected: ethernet1/1/7",
        "                           Switch Interface: ethernet1/1/7",
        "                           Switch Speed: 25G",
        "                           Switch Status: up",
        "                           Switch SFP Model: 25GBASE-CR-3.0M",
        "                           Switch SFP Qualified: true",
        "*******************************************************************************",
        "Actions taken and Reccomendations:",
        "*******************************************************************************",
        "Validated node(s) to switch connection are correct: PASS: Node(s) are correctly connected on all reachable Dell managed switches",
        "Validated SFP on the node(s): PASS: SFP detected on all interfaces",
        "Validated interface status on the node(s): PASS: All interfaces are up",
        "Validated SFP on the switches are qualified: PASS: All SFPs are qualified.",
        "Validated Frontend switch(es): PASS: All frontend switches appear to be Dell managed switches.",
        "Validated Backend switch(es): PASS: All backend switch(es) are reachable",
        "*******************************************************************************"
    ]
}

TASK [Fail if validation fails] **********************************************************************************************************************************************************************
skipping: [169.254.1.7] => {"changed": false, "skip_reason": "Conditional result was False", "skipped": true}

PLAY RECAP *******************************************************************************************************************************************************************************************
169.254.1.7                : ok=70   changed=6    unreachable=0    failed=0
localhost                  : ok=8    changed=1    unreachable=0    failed=0

Playbook execution successful
20241112 22:28:58.932: | PASS (2 min 44 sec)
================================================================================
Status: PASS
Time Elapsed: 3 min 1 sec
Debug log: /opt/emc/caspian/service-console/log/20241112_222531_run_KB_Fix/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20241112_222531_run_KB_Fix/log.html
================================================================================

[screen is terminating]

Note: If the switches are not Arista or Dell switches, they are user-managed switches. We must notify the user that they have a connection down to remediate. Once confirmed, follow article ECS: How to filter switch alerts from xDoctor to disable analysis to the user-managed switches.

Manual Steps below:

Before doing any work: In cases where user-managed switches get maintained, like firmware upgrades, links can go down and may get renegotiated. The example below is for secondary-0 down after a firmware upgrade done from the user on a user-managed FE switch. The following command is used to renegotiate the interface:

ethtool -r <i/f name>

admin@ecs-n-1:~> sudo ethtool slave-0
Settings for slave-0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseX/Full 
                                10000baseSR/Full 
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseX/Full 
                                10000baseSR/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x0000000f (15)
                               drv probe link timer
        Link detected: no                   
admin@ecs-n-1:~>

admin@ecs-n-1:~> sudo ethtool -r slave-0
admin@ecs-n-1:~> 

admin@ecs-n-1:~> sudo ethtool slave-0
Settings for slave-0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseX/Full 
                                10000baseSR/Full 
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseX/Full 
                                10000baseSR/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x0000000f (15)
                               drv probe link timer
        Link detected: yes

Note: If the interface is still down and cannot renegotiate, a Dell representative must be onsite to inspect and reseat the SFP and cable. If this does not work, it is recommended to replace the SFP and cable connection from the switch to the node.

There are different SFP models that use either copper or fiber. Also, Back-end and Front-End switches may use different types of SFP wires. Confirm which part is required before replacement.

Additional Information

Extract from ECS Hardware Guide Gen 1 or Gen 2:

Network cabling
The network cabling diagrams apply to U-Series, D-Series, or C-Series ECS Appliance in a Dell or customer provided rack.

To distinguish between the three switches, each switch has a nickname:

Hare: 10 GbE public switch is at the top of the rack in a U- or D-Series or the top switch in a C-Series segment.
Rabbit: 10 GbE public switch is located below the hare in the top of the rack in a U- or D-Series or below the hare switch in a C-Series segment.
Turtle: 1 GbE private switch that is located below rabbit in the top of the rack in a U-Series or below the hare switch in a C-Series segment.

U- and D-Series network cabling
The following figure shows a simplified network cabling diagram for an eight-node rack configuration for a U- or D-Series ECS configured by Dell or customer. Following this figure, other detailed figures and tables provide port, label, and cable color information.

See below table for matching switches and ports and the figure for also showing the appropriate switch port numbers.

Switch 1 = Rabbit = Bottom switch
Switch 2 = Hare = Top switch

Node ports:
Secondary-0 = P01 = right port - connects to Switch 1 / Rabbit/Bottom switch.
Secondary-1 = P02 = left port - connects to Switch 2 / Hare/Top switch.

Affected Products

ECS Appliance

Products

ECS Appliance Gen 1, ECS Appliance Gen 2, ECS Appliance Gen 3, ECS Appliance Hardware Gen1 U-Series, Elastic Cloud Storage

Article Number: 000050341

Article Type: Solution

Last Modified: 03 أبريل 2026

Version: 11

Check if your device is covered by Support Services.

ECS: OBS: xDoctor: RAP025: One or more network interfaces are down or missing

Summary: xDoctor reports a warning error indicating that one of the ECS interfaces is down to the switches.

Symptoms

Cause

Resolution

Additional Info

Affected Products

Symptoms

Alert That Triggers RAP025 Auto‑Healing

Alert Output Example

Auto Healer Remediation (Example)

Auto Healer Requirement

Manual Validation:

Cause

Possible Causes

Resolution

xDoctor Auto Pilot:

Additional Information

Affected Products

Products

Article Properties

Find answers to your questions from other Dell users

Support Services

Article Properties

Find answers to your questions from other Dell users

Support Services

ECS: OBS: xDoctor: RAP025: One or more network interfaces are down or missing

Summary: xDoctor reports a warning error indicating that one of the ECS interfaces is down to the switches.

Detailed Article

Symptoms

Cause

Resolution

Additional Info

Affected Products

Symptoms

Alert That Triggers RAP025 Auto‑Healing

Alert Output Example

Auto Healer Remediation (Example)

Auto Healer Requirement

Manual Validation:

Cause

Possible Causes

Resolution

xDoctor Auto Pilot:

Additional Information

Affected Products

Products

Article Properties

Find answers to your questions from other Dell users

Support Services

Article Properties

Find answers to your questions from other Dell users

Support Services