ECS: OBS: xDoctor: RAP073/208: Switch Connection Failure Detected

Summary: This knowledge base article explains how to handle the Switch Connection Failure Detected alert.


Symptoms

Starting with ECS xDoctor v4.8-109.0 and ObjectScale xDoctor v5.1-109.0, RAP208 (Switch Connection Failure Detected) is implemented as an auto-healer. When switch connectivity problems exceed the configured Error or Critical severity threshold, xDoctor raises a RAP208 alert and automatically starts its built-in repair orchestration workflow. This workflow performs the necessary remediation actions, provided the xDoctor Auto Healers are enabled.
 

 

Note: If your environment runs an xDoctor version earlier than ECS xDoctor v4.8-109.0 or ObjectScale xDoctor v5.1-109.0, the RAP208 auto-heal functionality is not available. On those versions, remediate the issue with the Autopilot process described below or by following the manual remediation steps in the Resolution section.
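The version cutoff in the note above can be checked with a short script. The following is a minimal Python sketch, assuming version strings follow the `major.minor-build.patch` pattern seen in this article (for example `4.8-109.0`). The function names and the `MIN_AUTOHEAL` table are illustrative and are not part of xDoctor.

```python
import re

# Minimum xDoctor versions with RAP208 auto-healing, as stated in this article.
# The dictionary itself is a hypothetical helper, not an xDoctor setting.
MIN_AUTOHEAL = {"ECS": "4.8-109.0", "ObjectScale": "5.1-109.0"}


def parse_version(text):
    """Turn '4.8-109.0' (optionally prefixed with 'v') into (4, 8, 109, 0)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)-(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized xDoctor version: {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_rap208_autoheal(product, version):
    """Return True if the version is at or above the cutoff for the product."""
    return parse_version(version) >= parse_version(MIN_AUTOHEAL[product])


if __name__ == "__main__":
    print(supports_rap208_autoheal("ECS", "4.8-109.0"))  # True
    print(supports_rap208_autoheal("ECS", "4.8-104.0"))  # False
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where `4.8-99.0` would sort after `4.8-109.0`.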

 

Alert that triggers the RAP208 auto-heal

The RAP208 auto-heal workflow is triggered when switch connectivity failures exceed the configured Error or Critical severity threshold. When this threshold is exceeded, xDoctor generates a RAP208 alert, which serves as the trigger for the automated repair process.
 

Example alert output

 

Note: In xDoctor versions earlier than ECS xDoctor v4.8-109.0 and ObjectScale xDoctor v5.1-109.0, this condition results in an alert only. No automatic remediation is performed.
 
--------------------------------------------------------
INFO - Auto Healer for dell_switch_connectivity disabled
--------------------------------------------------------
Extra     = Auto Healer for dell_switch_connectivity disabled
Timestamp = 2026-04-01_180132
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

----------------------------------------------------
ERROR - (Cached) Switch Connection Failure detected.
----------------------------------------------------
Node      = 169.254.1.1
Extra     = {"169.254.1.1": ["hare"]}
RAP       = RAP208
Solution  = KB 39838
Timestamp = 2026-04-01_180132
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0
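For readers who collect these reports in scripts, the alert layout shown above (a severity header between dashed rules, followed by `Key = Value` lines) can be parsed into dictionaries. This is a minimal Python sketch based only on the sample output in this article. The function names are illustrative, and the assumption that `Extra` holds JSON for RAP208 alerts comes from this one sample.

```python
import json
import re

# Severity labels are taken from the Diagnosis Summary shown in this article.
HEADER = re.compile(r"^(FIXED|CRITICAL|ERROR|WARNING|INFO|VERBOSE|REPORT) - (.+)$")
FIELD = re.compile(r"^(\w+)\s*=\s*(.*)$")

SAMPLE = """\
--------------------------------------------------------
INFO - Auto Healer for dell_switch_connectivity disabled
--------------------------------------------------------
Extra     = Auto Healer for dell_switch_connectivity disabled
Timestamp = 2026-04-01_180132
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

----------------------------------------------------
ERROR - (Cached) Switch Connection Failure detected.
----------------------------------------------------
Node      = 169.254.1.1
Extra     = {"169.254.1.1": ["hare"]}
RAP       = RAP208
Solution  = KB 39838
Timestamp = 2026-04-01_180132
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0
"""


def parse_events(report_text):
    """Split report text into one dict per event (severity, message, fields)."""
    events = []
    for raw in report_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            events.append({"severity": header.group(1), "message": header.group(2)})
            continue
        field = FIELD.match(line)
        if field and events:
            events[-1][field.group(1)] = field.group(2)
    return events


def rap208_alerts(events):
    """Keep only RAP208 events at Error or Critical severity."""
    return [e for e in events
            if e.get("RAP") == "RAP208" and e["severity"] in ("ERROR", "CRITICAL")]


def affected_switches(event):
    """Decode the Extra field; assumed to be JSON mapping node IP to switch names."""
    return json.loads(event["Extra"])


if __name__ == "__main__":
    for alert in rap208_alerts(parse_events(SAMPLE)):
        print(alert["Node"], affected_switches(alert))
```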

 

Auto Healer remediation (example)

When the Auto Healers are enabled, xDoctor automatically starts remediating the detected switch connectivity problems by applying the common corrective actions described in this knowledge base article.

--------------------------------------------------------
FIXED - Auto Healer fixed Dell switch connectivity issue
--------------------------------------------------------
Node      = Nodes
Extra     = {"Nodes": ["169.254.1.1"]}
Timestamp = 2026-04-01_180344
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0 

 

Auto Healer requirement

The xDoctor auto-heal feature must be enabled for this remediation to occur. Auto Healers can be enabled during installation or afterwards by following the steps described in:

KB: ECS: xDoctor: How to enable the xDoctor Auto Healer after installing the tool

Cause

After a switch replacement, the SSH host keys used to authenticate to the switch can change, or the management interface that connects to the switch can be administratively shut down. In some cases, the password configured in xDoctor does not match the current password on the affected switch and must be updated accordingly.

The xDoctor automation and auto-heal workflows do not remediate switch passwords. Instead, xDoctor detects authentication-related failures and raises the appropriate alert, directing the user to the relevant knowledge base article that describes how to configure xDoctor to use the password set on the switches.

Resolution

xDoctor Auto Healer: ObjectScale xDoctor v5.1-109.0/ECS xDoctor v4.8-109.0 or later

 

  1. To manually trigger the auto-heal (when Auto Healers are enabled), run the following command on the master.rack node. This starts the rack analyzers, which validate and automatically fix the nodes one at a time.
Command:
# sudo xdoctor --rap=RAP208

Example:

admin@ecsnode1:~> sudo xdoctor --rap=RAP208
2026-04-01 18:03:45,441: xDoctor_4.8-109.0 - INFO    : Initializing xDoctor v4.8-109.0 ...
[... Truncated Output ...]
2026-04-01 18:05:01,725: xDoctor_4.8-109.0 - INFO    : ANALYZER [ac_dell_switch_connectivity]
2026-04-01 18:05:02,063: xDoctor_4.8-109.0 - INFO    : Autohealing switch_connectivity on node 169.254.1.1 ...
2026-04-01 18:08:57,494: xDoctor_4.8-109.0 - INFO    : All data analyzed in 0:03:55
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : Diagnosis Summary
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : PSNT: CKMXXXXXXXXXXX
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : FIXED             =  1
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : CRITICAL          =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : CRITICAL (CACHED) =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : ERROR             =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : ERROR (CACHED)    =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : WARNING           =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : INFO              =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : VERBOSE           =  0
2026-04-01 18:08:58,531: xDoctor_4.8-109.0 - INFO    : REPORT            =  0
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO    : ---------------------
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO    : xDoctor Post Features
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO    : ----------------
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO    : Data Combiner
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO    : -------------
2026-04-01 18:08:58,647: xDoctor_4.8-109.0 - INFO    : Created a Data Collection Report (data.xml)
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : ------
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : SysLog
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : ------
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : Using Fabric as Syslog Server
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : Not triggered ... no WARNING, ERROR, nor CRITICAL
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : ----
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : SNMP
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO    : ----
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO    : Using 10.118.165.48:162 as SNMP server
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO    : Not triggered .. no WARNING, ERROR nor CRITICAL
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO    : ------------
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO    : ProcComplete
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO    : ------------
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - WARNING : ProcComplete is disabled, please re-enable it (xdoctor --config)
2026-04-01 18:08:58,767: xDoctor_4.8-109.0 - INFO    : ----------------
2026-04-01 18:08:58,767: xDoctor_4.8-109.0 - INFO    : Session Archiver
2026-04-01 18:08:58,768: xDoctor_4.8-109.0 - INFO    : ----------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : Session Stored in folder      - /usr/local/xdoctor/archive/other/2026-04-01_180344
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : Session Archived as tar       - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_180344.tgz
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : --------------------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : Session Report                - sudo xdoctor --report --archive=2026-04-01_180344
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : ---------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : Session Cleaner
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : ---------------
2026-04-01 18:08:58,789: xDoctor_4.8-109.0 - INFO    : Removing folder  (count limit) - /usr/local/xdoctor/archive/other/2026-04-01_170120
2026-04-01 18:08:58,790: xDoctor_4.8-109.0 - INFO    : Removing archive (count limit) - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_170120.tgz
2026-04-01 18:08:58,793: xDoctor_4.8-109.0 - INFO    : Cleaned 2 archived session(s)
2026-04-01 18:08:58,793: xDoctor_4.8-109.0 - INFO    : -------
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO    : Emailer
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO    : -------
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO    : Using Dedicated Server (25:25) as SMTP Server ...
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO    : Email Type = Individual Events
2026-04-01 18:08:58,795: xDoctor_4.8-109.0 - INFO    : ------------------------------
2026-04-01 18:08:58,795: xDoctor_4.8-109.0 - INFO    : xDoctor session_1775066624.943 finished in 0:05:13
2026-04-01 18:08:58,813: xDoctor_4.8-109.0 - INFO    : Successful Job:1775066624 Exit Code:192
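The run log above carries two pieces of information that the next step needs: the Diagnosis Summary counters and the Session Report command. The following minimal Python sketch extracts both, assuming the log line layout matches the sample in this article. `summarize_run` and the regular expressions are illustrative helpers, not part of xDoctor.

```python
import re

# Matches summary lines such as "... INFO    : CRITICAL (CACHED) =  0".
COUNT = re.compile(r":\s+([A-Z]+(?: \(CACHED\))?)\s+=\s+(\d+)\s*$")
# Matches the line that prints the ready-to-run report command.
REPORT = re.compile(r"Session Report\s+-\s+(sudo xdoctor --report --archive=\S+)")

SAMPLE_LOG = """\
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : PSNT: CKMXXXXXXXXXXX
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO    : FIXED             =  1
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : CRITICAL          =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : CRITICAL (CACHED) =  0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO    : ERROR             =  0
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO    : Session Report                - sudo xdoctor --report --archive=2026-04-01_180344
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO    : Email Type = Individual Events
"""


def summarize_run(log_text):
    """Return (severity counters, session report command or None)."""
    counts, report_cmd = {}, None
    for line in log_text.splitlines():
        count = COUNT.search(line)
        if count:
            counts[count.group(1)] = int(count.group(2))
        report = REPORT.search(line)
        if report:
            report_cmd = report.group(1)
    return counts, report_cmd


if __name__ == "__main__":
    counts, report_cmd = summarize_run(SAMPLE_LOG)
    print(counts)
    print(report_cmd)
```

The counter pattern only accepts upper-case labels, so lines such as `Email Type = Individual Events` are not mistaken for summary counters.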

 

  2. Run the session report to review the results of the manual auto-heal run.
Note: Use the Session Report command printed in the xDoctor example output above.
Command:
# sudo xdoctor --report --archive=<session report>

Example:

admin@ecsnode1:~> sudo xdoctor --report --archive=2026-04-01_180344

xDoctor 4.8-109.0
CKMXXXXXXXXXXX - ECS 3.8.1.4

Displaying xDoctor Report (2026-04-01_180344) Filter:[] ...

--------------------------------------------------------
FIXED - Auto Healer fixed Dell switch connectivity issue
--------------------------------------------------------
Node      = Nodes
Extra     = {"Nodes": ["169.254.1.1"]}
Timestamp = 2026-04-01_180344
PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0

  3. If the auto-heal fails, open an SR for investigation.
    Example of a failure:
    ----------------------------------------------------
    ERROR - (Cached) Auto fix failed - Switch Connection Failure detected.
    ----------------------------------------------------
    Node      = 169.254.1.1
    Extra     = {"169.254.1.1": ["hare"]}
    RAP       = RAP208
    Solution  = KB 39838
    Timestamp = 2026-04-01_180132
    PSNT      = CKMXXXXXXXXXXX @ 4.8-109.0
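The outcomes described in this section (fixed, auto fix failed, alert with the Auto Healer disabled) can be told apart by the event header text. The sketch below is one possible reading of this article expressed as a small Python decision function. The returned labels are hypothetical, and the matched strings are copied from the sample outputs in this article, so other xDoctor versions may word them differently.

```python
def next_action(report_text):
    """Map xDoctor report text to a suggested next step (labels are illustrative)."""
    # Check the failure case first: its header also contains the plain alert text.
    if "Auto fix failed" in report_text:
        return "open_sr"
    if "Auto Healer fixed" in report_text:
        return "none"
    if "Switch Connection Failure detected" in report_text:
        if "Auto Healer for dell_switch_connectivity disabled" in report_text:
            return "enable_auto_healer_or_use_autopilot"
        return "run_manual_heal"
    return "none"


FAILED = "ERROR - (Cached) Auto fix failed - Switch Connection Failure detected."
FIXED = "FIXED - Auto Healer fixed Dell switch connectivity issue"
DISABLED = (
    "INFO - Auto Healer for dell_switch_connectivity disabled\n"
    "ERROR - (Cached) Switch Connection Failure detected."
)

if __name__ == "__main__":
    for text in (FAILED, FIXED, DISABLED):
        print(next_action(text))
```

Here `run_manual_heal` stands for the `sudo xdoctor --rap=RAP208` step above, and `open_sr` for opening an SR as described in the last step.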

 

xDoctor Autopilot:

This knowledge base (KB) article is now automated with xDoctor Autopilot, which addresses most issues without the need for support involvement.

This feature is native to xDoctor 4.8-104.0 and later. For syntax and usage questions, see ECS: ObjectScale: How to run KB automation scripts (Autopilot).
 

To locate the rack master node:

Command:

ssh master.rack

 

To locate the NAN IP, use the IP identified in the alert or look it up with getrackinfo:

Command:

admin@ecsnode1:~> getrackinfo
Node private      Node              Public                                BMC
Ip Address        Id       Status   Mac                 Ip Address        Mac                 Ip Address        Private.4(NAN)    Node Name
===============   ======   ======   =================   ===============   =================   ===============   ===============   =========
192.168.219.1     1        MA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.101   169.254.1.1       provo-red
192.168.219.2     2        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.102   169.254.1.2       sandy-red
192.168.219.3     3        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.103   169.254.1.3       orem-red
192.168.219.4     4        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.104   169.254.1.4       ogden-red
192.168.219.5     5        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.105   169.254.1.5       layton-red
192.168.219.6     6        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.106   169.254.1.6       logan-red
192.168.219.7     7        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.107   169.254.1.7       lehi-red
192.168.219.8     8        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.108   169.254.1.8       murray-red
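The `Private.4(NAN)` column in the getrackinfo output above is the address that appears in the alert's `Node` field. As a minimal sketch, the table can be parsed in Python to map NAN IPs to node names, assuming every data row has the nine whitespace-separated columns shown in this sample. `parse_getrackinfo` and the dictionary keys are illustrative names.

```python
import ipaddress

SAMPLE = """\
Node private      Node              Public                                BMC
Ip Address        Id       Status   Mac                 Ip Address        Mac                 Ip Address        Private.4(NAN)    Node Name
===============   ======   ======   =================   ===============   =================   ===============   ===============   =========
192.168.219.1     1        MA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.101   169.254.1.1       provo-red
192.168.219.2     2        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.102   169.254.1.2       sandy-red
"""


def parse_getrackinfo(output):
    """Return one dict per node row; header and separator lines are skipped."""
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue
        try:
            # Data rows start with the node's private IP address.
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue
        nodes.append({
            "private_ip": fields[0],
            "id": int(fields[1]),
            "status": fields[2],
            "nan_ip": fields[7],
            "name": fields[8],
        })
    return nodes


if __name__ == "__main__":
    by_nan = {node["nan_ip"]: node["name"] for node in parse_getrackinfo(SAMPLE)}
    print(by_nan.get("169.254.1.1"))  # provo-red
```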

 

  1. Run the automation command from the master node with xDoctor 4.8-104.0 or later.
Note: --target-rack is supported for this action.
Command:
# sudo xdoctor autopilot --kb 39838 --target-rack <rack_colour>
Example:
admin@ecsnode1:~>  sudo xdoctor autopilot --kb 39838 --target-rack red
Checking for existing screen sessions...
Starting screen session 'autopilot_kb_39838_20250626_112318'...
Screen session 'autopilot_kb_39838_20250626_112318' started successfully.
Attaching to screen session 'autopilot_kb_39838_20250626_112318'...

Using /etc/ansible/ansible.cfg as config file
VERSION: 3.0
Playbook tasks: 47
Role tasks: 97
Total tasks: 144 across 1 host(s)

PLAY [red] ******************************************************************************************************************************************************************
Detected 8 hosts for this play.

TASK [target_check : set_fact] **********************************************************************************************************************************************
ok: [169.254.1.1 -> localhost] => {"ansible_facts": {"allowed_targets": "Please use: --target-rack", "target_node_check": false, "target_rack_check": true, "target_vdc_check": false}, "changed": false}

TASK [target_check : context] ***********************************************************************************************************************************************
skipping: [169.254.1.1] => {"changed": false, "false_condition": "node_script == false and target_node_check == true or rack_script == false and target_rack_check == true or vdc_script == false and target_vdc_check == true", "skip_reason": "Conditional result was False"}

...truncated
 
  2. Review the analysis summary:

Example:

TASK [Print all summaries] **************************************************************************************************************************************************
ok: [169.254.1.1] => {
    "msg": [
        "*******************************************************************************",
        "Switch xDoctor 'RAP073' password and SSH summary:",
        "*******************************************************************************",
        "Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.",
        "Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.",
        "Validated Backend management connections: PASS: Management connections are up and connected to the frontend switches.",
        "*******************************************************************************",
        "Validated ssh keys to switch(es): PASS: All ssh keys are valid and nothing was corrected.",
        "Validated xDoctor alert: PASS: Alert RAP073 was not present in xDoctor.",
        "*******************************************************************************"
    ]
}

TASK [Set fact for context] *************************************************************************************************************************************************
ok: [169.254.1.1 -> localhost] => {"ansible_facts": {"context": " Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838., Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838."}, "changed": false}

TASK [Fail if validation fails] *********************************************************************************************************************************************
fatal: [169.254.1.1]: FAILED! => {"changed": false, "msg": "Review the summary above for recommendations."}

NO MORE HOSTS LEFT **********************************************************************************************************************************************************

PLAY RECAP ******************************************************************************************************************************************************************
169.254.1.1                : ok=65   changed=13   unreachable=0    failed=1    skipped=73   rescued=0    ignored=1
169.254.1.2                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
169.254.1.3                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
169.254.1.4                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
169.254.1.5                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
169.254.1.6                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
169.254.1.7                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
169.254.1.8                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

=============================================================================================================================================================================
Status: FAIL
Time Elapsed: 0h 1m 25s
Debug log: /tmp/autopilot/log/autopilot_39838_20250626_113201.log
Message:  Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838., Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.
=============================================================================================================================================================================
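When the Autopilot output is captured to a file, the overall `Status:` line and the standard Ansible `PLAY RECAP` counters are enough to see which host failed. The following Python sketch parses both from text shaped like the example above. The function names are illustrative, and the sample is shortened to two hosts.

```python
import re

# A recap line looks like: "<host>  : ok=65  changed=13  unreachable=0  failed=1 ..."
RECAP = re.compile(r"^(\S+)\s+:\s+((?:\w+=\d+\s*)+)$")

SAMPLE = """\
PLAY RECAP *********************************************************************
169.254.1.1                : ok=65   changed=13   unreachable=0    failed=1    skipped=73   rescued=0    ignored=1
169.254.1.2                : ok=4    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

================================================================================
Status: FAIL
Time Elapsed: 0h 1m 25s
"""


def parse_recap(output):
    """Return (overall status or None, {host: {counter: value}})."""
    status, hosts = None, {}
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Status:"):
            status = line.split(":", 1)[1].strip()
            continue
        match = RECAP.match(line)
        if match:
            pairs = (item.split("=") for item in match.group(2).split())
            hosts[match.group(1)] = {key: int(value) for key, value in pairs}
    return status, hosts


def failed_hosts(hosts):
    """List hosts with at least one failed task or that were unreachable."""
    return [host for host, counters in hosts.items()
            if counters.get("failed", 0) or counters.get("unreachable", 0)]


if __name__ == "__main__":
    status, hosts = parse_recap(SAMPLE)
    print(status, failed_hosts(hosts))
```

A `FAIL` status still requires reading the printed summary, since in this example the failure is a password mismatch that the automation reports but does not correct.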

 

  3. Update the switch password configured in xDoctor:
admin@ecsnode7:~> sudo xdoctor -c --expert

xDoctor Configuration Menu
--------------------------
[Expert Mode Active]

(1)  Overview
(2)  Scheduling
(3)  Archiving
(5)  Repository



(9)  Miscellaneous

(0)  Exit

Please make a choice: 9

xDoctor Miscellaneous
---------------------


(3)  Switches
(4)  Remove Hardware Alerting Timestamp

(0)  Main menu

Please make a choice: 3

xDoctor Switch Settings
---------------------
Enable Switch Analysis?  [Yes]:
Switches [hare,rabbit,fox,hound]:
Username [admin]:
Password [*****]:

[New Switch Settings]
Enabled = Yes
Switches = hare,rabbit,fox,hound
Username = admin
Password = *****

> Issue new settings?  [No]: yes
2024-11-20 16:03:53,702: xDoctor_4.8-100.0 - INFO    : Settings saved and distributed ...

xDoctor Miscellaneous
---------------------


(3)  Switches
(4)  Remove Hardware Alerting Timestamp

(0)  Main menu

 

Base KB automation:
ECS: xDoctor: RAP073: Switch Connection Failure Detected

Additional KB consolidated into this automation:
ECS: xDoctor reports switch connection failure due to RSA key in known_hosts

Affected Products

Elastic Cloud Storage

Products

Elastic Cloud Storage
Article Properties
Article Number: 000039838
Article Type: Solution
Last Modified: 02 April 2026
Version:  10