ECS: OBS: xDoctor: RAP073/208: Switch Connection Failure Detected
Summary: This knowledge base article explains how to handle an alert reporting that a switch connection failure was detected.
Symptoms
Starting with ECS xDoctor v4.8-109.0 and ObjectScale xDoctor v5.1-109.0, RAP208 (Switch Connection Failure Detected) is implemented as an auto healer. When switch connectivity problems exceed the configured error or critical severity threshold, xDoctor raises a RAP208 alert and automatically starts the integrated repair orchestration workflow. This workflow performs the necessary remediation actions if xDoctor auto healers are enabled.
NOTE: If your environment runs an xDoctor version older than ECS xDoctor v4.8-109.0 or ObjectScale xDoctor v5.1-109.0, the RAP208 auto-healing feature is not available. In these versions, remediation must be performed with the Autopilot process described below, or by following the manual remediation steps described in the Resolution section.
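The version requirement in the note above can be checked with a small script. The following is a minimal sketch, assuming version strings have the form seen in this article (for example `4.8-109.0`) and assuming that major version 4 denotes ECS xDoctor and major version 5 denotes ObjectScale xDoctor. The helper names `parse_version` and `supports_rap208_autoheal` are hypothetical and are not part of xDoctor.

```python
import re

# Minimum versions that include the RAP208 auto healer, taken from this article.
# Assumption: major 4 = ECS xDoctor, major 5 = ObjectScale xDoctor.
MIN_AUTOHEAL = {4: (4, 8, 109, 0), 5: (5, 1, 109, 0)}


def parse_version(text):
    """Turn a string such as '4.8-109.0' or 'v5.1-109.0' into a tuple of ints."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)-(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized xDoctor version: {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_rap208_autoheal(version_text):
    """Return True when the version is at or above the documented minimum."""
    version = parse_version(version_text)
    minimum = MIN_AUTOHEAL.get(version[0])
    if minimum is None:
        return False  # unknown product line: treat as unsupported
    return version >= minimum


if __name__ == "__main__":
    for candidate in ("4.8-104.0", "4.8-109.0", "5.1-109.0"):
        print(candidate, supports_rap208_autoheal(candidate))
```

The version string itself can be read from the `PSNT = ... @ <version>` line of any xDoctor report entry shown in this article.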
Alert that triggers RAP208 automatic repair
The RAP208 automatic repair workflow is triggered when switch connection failures exceed the configured error or critical severity threshold. Once this threshold is exceeded, xDoctor generates a RAP208 alert, which serves as the trigger for the automated repair process.
Example alert output
NOTE: In xDoctor versions earlier than ECS xDoctor v4.8-109.0 and ObjectScale xDoctor v5.1-109.0, this condition results in an alert only. No automatic remediation is performed.
--------------------------------------------------------
INFO - Auto Healer for dell_switch_connectivity disabled
--------------------------------------------------------
Extra = Auto Healer for dell_switch_connectivity disabled
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
----------------------------------------------------
ERROR - (Cached) Switch Connection Failure detected.
----------------------------------------------------
Node = 169.254.1.1
Extra = {"169.254.1.1": ["hare"]}
RAP = RAP208
Solution = KB 39838
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
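When many report entries need to be reviewed, the alert blocks can be read programmatically. The sketch below assumes the layout shown in the example above (a title line between dashed rulers, followed by `Key = Value` lines) and assumes that the `Extra` field of a RAP208 alert is JSON mapping a node IP to the affected switch names. The function names `parse_events` and `affected_switches` are hypothetical.

```python
import json

SAMPLE = """\
----------------------------------------------------
ERROR - (Cached) Switch Connection Failure detected.
----------------------------------------------------
Node = 169.254.1.1
Extra = {"169.254.1.1": ["hare"]}
RAP = RAP208
Solution = KB 39838
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
"""


def parse_events(report_text):
    """Split report text into events with a severity, a title, and fields."""
    events = []
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"-"}:
            continue  # skip blank lines and dashed rulers
        key, sep, value = line.partition(" = ")
        if sep and " " not in key:
            if current is not None:
                current["fields"][key] = value
            continue
        # Anything else is treated as a title line: "<SEVERITY> - <title>"
        severity, _, title = line.partition(" - ")
        current = {"severity": severity, "title": title, "fields": {}}
        events.append(current)
    return events


def affected_switches(event):
    """Return {node_ip: [switch, ...]} from the Extra field when it is JSON."""
    try:
        extra = json.loads(event["fields"].get("Extra", ""))
    except json.JSONDecodeError:
        return {}  # Extra can also be free text, as in the INFO entry
    return extra if isinstance(extra, dict) else {}


if __name__ == "__main__":
    for event in parse_events(SAMPLE):
        print(event["severity"], event["fields"].get("RAP"), affected_switches(event))
```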
Auto Healer remediation (example)
When automatic repair is enabled, xDoctor automatically starts remediation of detected switch connectivity problems by applying the common corrective actions described in this knowledge base article.
--------------------------------------------------------
FIXED - Auto Healer fixed Dell switch connectivity issue
--------------------------------------------------------
Node = Nodes
Extra = {"Nodes": ["169.254.1.1"]}
Timestamp = 2026-04-01_180344
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
Auto Healer requirements
The xDoctor auto healer feature must be enabled for this remediation to take place. Auto healers can be enabled either during installation or after installation by following the steps described in:
KB: ECS: xDoctor: How to enable xDoctor Auto Healer after tool installation
Cause
After a switch replacement, the SSH host keys used to authenticate the switch can change, or the management interface that connects to the switch can be administratively shut down. In some cases, the password configured in xDoctor does not match the current password on the affected switch and must be updated accordingly.
The xDoctor automation and auto healer workflows do not remediate switch passwords. Instead, xDoctor detects authentication-related failures, raises the corresponding alert, and directs the user to the knowledge base article that describes how to configure xDoctor to use the password set on the switches.
Resolution
xDoctor Auto Healer: ObjectScale xDoctor v5.1-109.0/ECS xDoctor v4.8-109.0 or later
- To trigger the enabled Auto Healer manually, run the following command on the master.rack node. This starts the rack analyzers, which validate and automatically repair the nodes one at a time.
# sudo xdoctor --rap=RAP208
Example:
admin@ecsnode1:~> sudo xdoctor --rap=RAP208
2026-04-01 18:03:45,441: xDoctor_4.8-109.0 - INFO : Initializing xDoctor v4.8-109.0 ...
[... Truncated Output ...]
2026-04-01 18:05:01,725: xDoctor_4.8-109.0 - INFO : ANALYZER [ac_dell_switch_connectivity]
2026-04-01 18:05:02,063: xDoctor_4.8-109.0 - INFO : Autohealing switch_connectivity on node 169.254.1.1 ...
2026-04-01 18:08:57,494: xDoctor_4.8-109.0 - INFO : All data analyzed in 0:03:55
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : Diagnosis Summary
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : PSNT: CKMXXXXXXXXXXX
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : FIXED = 1
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL (CACHED) = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR (CACHED) = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : WARNING = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : INFO = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : VERBOSE = 0
2026-04-01 18:08:58,531: xDoctor_4.8-109.0 - INFO : REPORT = 0
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : ---------------------
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : xDoctor Post Features
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : ----------------
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : Data Combiner
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : -------------
2026-04-01 18:08:58,647: xDoctor_4.8-109.0 - INFO : Created a Data Collection Report (data.xml)
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ------
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : SysLog
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ------
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : Using Fabric as Syslog Server
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : Not triggered ... no WARNING, ERROR, nor CRITICAL
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ----
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : SNMP
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ----
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : Using 10.118.165.48:162 as SNMP server
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : Not triggered .. no WARNING, ERROR nor CRITICAL
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : ------------
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : ProcComplete
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : ------------
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - WARNING : ProcComplete is disabled, please re-enable it (xdoctor --config)
2026-04-01 18:08:58,767: xDoctor_4.8-109.0 - INFO : ----------------
2026-04-01 18:08:58,767: xDoctor_4.8-109.0 - INFO : Session Archiver
2026-04-01 18:08:58,768: xDoctor_4.8-109.0 - INFO : ----------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Stored in folder - /usr/local/xdoctor/archive/other/2026-04-01_180344
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Archived as tar - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_180344.tgz
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : --------------------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Report - sudo xdoctor --report --archive=2026-04-01_180344
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : ---------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Cleaner
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : ---------------
2026-04-01 18:08:58,789: xDoctor_4.8-109.0 - INFO : Removing folder (count limit) - /usr/local/xdoctor/archive/other/2026-04-01_170120
2026-04-01 18:08:58,790: xDoctor_4.8-109.0 - INFO : Removing archive (count limit) - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_170120.tgz
2026-04-01 18:08:58,793: xDoctor_4.8-109.0 - INFO : Cleaned 2 archived session(s)
2026-04-01 18:08:58,793: xDoctor_4.8-109.0 - INFO : -------
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Emailer
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : -------
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Using Dedicated Server (25:25) as SMTP Server ...
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Email Type = Individual Events
2026-04-01 18:08:58,795: xDoctor_4.8-109.0 - INFO : ------------------------------
2026-04-01 18:08:58,795: xDoctor_4.8-109.0 - INFO : xDoctor session_1775066624.943 finished in 0:05:13
2026-04-01 18:08:58,813: xDoctor_4.8-109.0 - INFO : Successful Job:1775066624 Exit Code:192
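The Diagnosis Summary counters in the run output above are a quick way to see whether the manual auto-heal run left anything unresolved. The following is a minimal sketch that extracts those counters from captured console output. It assumes the `<timestamp>: xDoctor_<version> - INFO : NAME = N` layout shown in the example; the function names `diagnosis_summary` and `needs_attention` are hypothetical, and which counters warrant follow-up is an illustrative choice, not a rule stated by xDoctor.

```python
import re

# Matches summary counters such as "INFO : FIXED = 1" or "INFO : ERROR (CACHED) = 0".
SUMMARY_ENTRY = re.compile(r"INFO\s*:\s*([A-Z]+(?: \(CACHED\))?) = (\d+)")

SAMPLE = """\
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : Diagnosis Summary
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : PSNT: CKMXXXXXXXXXXX
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : FIXED = 1
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL (CACHED) = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR (CACHED) = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : WARNING = 0
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Email Type = Individual Events
"""


def diagnosis_summary(log_text):
    """Return {counter_name: count} for every summary counter found in the log."""
    return {name: int(count) for name, count in SUMMARY_ENTRY.findall(log_text)}


def needs_attention(counts):
    """True when any warning, error, or critical counter is non-zero."""
    watched = ("CRITICAL", "CRITICAL (CACHED)", "ERROR", "ERROR (CACHED)", "WARNING")
    return any(counts.get(name, 0) > 0 for name in watched)


if __name__ == "__main__":
    counts = diagnosis_summary(SAMPLE)
    print(counts)
    print("follow-up needed:", needs_attention(counts))
```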
- Run the session report to review the results of the manual auto-heal run.
# sudo xdoctor --report --archive=<session report>
Example:
admin@ecsnode1:~> sudo xdoctor --report --archive=2026-04-01_180344
xDoctor 4.8-109.0
CKMXXXXXXXXXXX - ECS 3.8.1.4
Displaying xDoctor Report (2026-04-01_180344) Filter:[] ...
--------------------------------------------------------
FIXED - Auto Healer fixed Dell switch connectivity issue
--------------------------------------------------------
Node = Nodes
Extra = {"Nodes": ["169.254.1.1"]}
Timestamp = 2026-04-01_180344
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
- If the report shows a failure, open an SR for investigation.
Example failure:
----------------------------------------------------
ERROR - (Cached) Auto fix failed - Switch Connection Failure detected.
----------------------------------------------------
Node = 169.254.1.1
Extra = {"169.254.1.1": ["hare"]}
RAP = RAP208
Solution = KB 39838
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
xDoctor Autopilot:
This knowledge base (KB) article is now automated with xDoctor Auto Pilot, which resolves most issues without support involvement.
This feature is native to xDoctor 4.8-104.0 and later. For syntax and usage questions, refer to ECS: ObjectScale: How to run KB automation scripts (Auto Pilot).
To find the rack master node:
Command:
ssh master.rack
To find the NAN IP, use the IP identified in the alert or take it from getrackinfo:
Command:
admin@ecsnode1:~> getrackinfo
Node private      Node              Public                                BMC
Ip Address        Id       Status   Mac                 Ip Address        Mac                 Ip Address        Private.4(NAN)    Node Name
===============   ======   ======   =================   ===============   =================   ===============   ===============   =========
192.168.219.1     1        MA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.101   169.254.1.1       provo-red
192.168.219.2     2        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.102   169.254.1.2       sandy-red
192.168.219.3     3        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.103   169.254.1.3       orem-red
192.168.219.4     4        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.104   169.254.1.4       ogden-red
192.168.219.5     5        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.105   169.254.1.5       layton-red
192.168.219.6     6        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.106   169.254.1.6       logan-red
192.168.219.7     7        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.107   169.254.1.7       lehi-red
192.168.219.8     8        SA       00:00:00:00:00      0.0.0.0           00:00:00:00:00      192.168.219.108   169.254.1.8       murray-red
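To look up the NAN IP for the node named in an alert, or to match an alert IP back to a node name, the getrackinfo table can be parsed. This is a minimal sketch under the assumption that each data row has the nine whitespace-separated fields shown above; the dictionary keys and the function names `parse_rack_rows` and `node_for_nan_ip` are hypothetical.

```python
import ipaddress

FIELDS = ("private_ip", "node_id", "status", "public_mac", "public_ip",
          "bmc_mac", "bmc_ip", "nan_ip", "node_name")

SAMPLE = """\
Ip Address Id Status Mac Ip Address Mac Ip Address Private.4(NAN) Node Name
=============== ====== ====== ================= =============== ================= =============== =============== =========
192.168.219.1 1 MA 00:00:00:00:00 0.0.0.0 00:00:00:00:00 192.168.219.101 169.254.1.1 provo-red
192.168.219.2 2 SA 00:00:00:00:00 0.0.0.0 00:00:00:00:00 192.168.219.102 169.254.1.2 sandy-red
"""


def parse_rack_rows(text):
    """Return one dict per node row; header and ruler lines are ignored."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # header lines have a different number of tokens
        try:
            ipaddress.ip_address(parts[0])
        except ValueError:
            continue  # ruler line made of '=' characters
        rows.append(dict(zip(FIELDS, parts)))
    return rows


def node_for_nan_ip(rows, nan_ip):
    """Return the node name that owns the given NAN IP, or None."""
    for row in rows:
        if row["nan_ip"] == nan_ip:
            return row["node_name"]
    return None


if __name__ == "__main__":
    rows = parse_rack_rows(SAMPLE)
    print(node_for_nan_ip(rows, "169.254.1.1"))
```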
- Run the automation command from the master node with xDoctor 4.8-104.0 or later.
Note:
--target-rack is supported for this operation.
# sudo xdoctor autopilot --kb 39838 --target-rack <rack_colour>
admin@ecsnode1:~> sudo xdoctor autopilot --kb 39838 --target-rack red
Checking for existing screen sessions...
Starting screen session 'autopilot_kb_39838_20250626_112318'...
Screen session 'autopilot_kb_39838_20250626_112318' started successfully.
Attaching to screen session 'autopilot_kb_39838_20250626_112318'...
Using /etc/ansible/ansible.cfg as config file
VERSION: 3.0
Playbook tasks: 47
Role tasks: 97
Total tasks: 144 across 1 host(s)
PLAY [red] ******************************************************************************************************************************************************************
Detected 8 hosts for this play.
TASK [target_check : set_fact] **********************************************************************************************************************************************
ok: [169.254.1.1 -> localhost] => {"ansible_facts": {"allowed_targets": "Please use: --target-rack", "target_node_check": false, "target_rack_check": true, "target_vdc_check": false}, "changed": false}
TASK [target_check : context] ***********************************************************************************************************************************************
skipping: [169.254.1.1] => {"changed": false, "false_condition": "node_script == false and target_node_check == true or rack_script == false and target_rack_check == true or vdc_script == false and target_vdc_check == true", "skip_reason": "Conditional result was False"}
...truncated
- Review the summary:
Example:
TASK [Print all summaries] **************************************************************************************************************************************************
ok: [169.254.1.1] => {
"msg": [
"*******************************************************************************",
"Switch xDoctor 'RAP073' password and SSH summary:",
"*******************************************************************************",
"Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.",
"Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.",
"Validated Backend management connections: PASS: Management connections are up and connected to the frontend switches.",
"*******************************************************************************",
"Validated ssh keys to switch(es): PASS: All ssh keys are valid and nothing was corrected.",
"Validated xDoctor alert: PASS: Alert RAP073 was not present in xDoctor.",
"*******************************************************************************"
]
}
TASK [Set fact for context] *************************************************************************************************************************************************
ok: [169.254.1.1 -> localhost] => {"ansible_facts": {"context": " Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838., Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838."}, "changed": false}
TASK [Fail if validation fails] *********************************************************************************************************************************************
fatal: [169.254.1.1]: FAILED! => {"changed": false, "msg": "Review the summary above for recommendations."}
NO MORE HOSTS LEFT **********************************************************************************************************************************************************
PLAY RECAP ******************************************************************************************************************************************************************
169.254.1.1 : ok=65 changed=13 unreachable=0 failed=1 skipped=73 rescued=0 ignored=1
169.254.1.2 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.3 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.4 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.5 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.6 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.7 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.8 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
=============================================================================================================================================================================
Status: FAIL
Time Elapsed: 0h 1m 25s
Debug log: /tmp/autopilot/log/autopilot_39838_20250626_113201.log
Message: Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838., Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.
=============================================================================================================================================================================
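When the Autopilot output is saved to a file, the PLAY RECAP section shows which hosts failed. The sketch below parses recap lines in the standard Ansible `host : key=value ...` form shown above. The function names `parse_recap` and `failed_hosts` are hypothetical, and treating `failed` or `unreachable` greater than zero as a failure is an illustrative choice.

```python
import re

# One recap line: "<host> : ok=65 changed=13 unreachable=0 failed=1 ..."
RECAP_LINE = re.compile(r"^(\S+)\s*:\s*((?:\w+=\d+(?:\s+|$))+)$")

SAMPLE = """\
PLAY RECAP ***********************************************
169.254.1.1 : ok=65 changed=13 unreachable=0 failed=1 skipped=73 rescued=0 ignored=1
169.254.1.2 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
==========================================================
Status: FAIL
Time Elapsed: 0h 1m 25s
"""


def parse_recap(text):
    """Return {host: {counter: value}} for every recap line in the text."""
    recap = {}
    for line in text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match is None:
            continue
        pairs = (item.split("=") for item in match.group(2).split())
        recap[match.group(1)] = {key: int(value) for key, value in pairs}
    return recap


def failed_hosts(recap):
    """List hosts that report at least one failed or unreachable task."""
    return sorted(
        host for host, counters in recap.items()
        if counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0
    )


if __name__ == "__main__":
    print(failed_hosts(parse_recap(SAMPLE)))
```

In the example run above, only 169.254.1.1 reports `failed=1`, which matches the `Status: FAIL` line and the password message in the summary.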
- Update the xDoctor password:
admin@ecsnode7:~> sudo xdoctor -c --expert
xDoctor Configuration Menu
--------------------------
[Expert Mode Active]
(1) Overview
(2) Scheduling
(3) Archiving
(5) Repository
(9) Miscellaneous
(0) Exit
Please make a choice: 9
xDoctor Miscellaneous
---------------------
(3) Switches
(4) Remove Hardware Alerting Timestamp
(0) Main menu
Please make a choice: 3
xDoctor Switch Settings
---------------------
Enable Switch Analysis? [Yes]:
Switches [hare,rabbit,fox,hound]:
Username [admin]:
Password [*****]:
[New Switch Settings]
Enabled = Yes
Switches = hare,rabbit,fox,hound
Username = admin
Password = *****
> Issue new settings? [No]: yes
2024-11-20 16:03:53,702: xDoctor_4.8-100.0 - INFO : Settings saved and distributed ...
xDoctor Miscellaneous
---------------------
(3) Switches
(4) Remove Hardware Alerting Timestamp
(0) Main menu
Base KB for this automation:
ECS: xDoctor: RAP073: Switch Connection Failure Detected
Additional KBs consolidated into this automation:
ECS: xDoctor reports switch connection failure due to RSA key in known_hosts