ECS: OBS: xDoctor: RAP073/208: Switch Connection Failure Detected
Summary: This knowledge base article explains how to handle the alert that is raised when a switch connection failure is detected.
Symptoms
Starting with ECS xDoctor v4.8-109.0 and ObjectScale xDoctor v5.1-109.0, RAP208 (Switch Connection Failure Detected) is implemented as an auto healer. When switch connectivity issues exceed the configured error or critical severity level, xDoctor raises a RAP208 alert and automatically starts its built-in repair orchestration workflow. This workflow performs the necessary remediation actions, provided that xDoctor auto healers are enabled.
NOTE: If your environment runs an xDoctor version older than ECS xDoctor v4.8-109.0 or ObjectScale xDoctor v5.1-109.0, the RAP208 auto healer functionality is not available. In these versions, remediation must be performed with the Auto Pilot process described below, or by following the manual remediation steps described in the Resolution section.
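The version requirement in the note can be checked programmatically. The following is a minimal sketch, not part of xDoctor: the function names are hypothetical, and it assumes that xDoctor version strings such as 4.8-109.0 can be ordered by comparing their numeric components from left to right.

```python
import re

# Minimum xDoctor versions that include the RAP208 auto healer, as stated in this article.
MIN_AUTO_HEALER = {
    "ECS": (4, 8, 109, 0),
    "ObjectScale": (5, 1, 109, 0),
}


def parse_version(text):
    """Turn '4.8-109.0' or 'v4.8-109.0' into the tuple (4, 8, 109, 0)."""
    parts = re.findall(r"\d+", text)
    if not parts:
        raise ValueError(f"no version digits in {text!r}")
    return tuple(int(part) for part in parts)


def auto_healer_available(product, version):
    """Return True if the version is at or above the RAP208 auto healer threshold.

    Assumption: component-wise numeric comparison reflects xDoctor release order.
    """
    return parse_version(version) >= MIN_AUTO_HEALER[product]


print(auto_healer_available("ECS", "4.8-109.0"))
print(auto_healer_available("ECS", "4.8-104.0"))
```

With this ordering, the first call reports that the auto healer is available and the second reports that the Auto Pilot or manual path applies.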
Alert that triggers the RAP208 auto healer
The RAP208 auto healer workflow is triggered when switch connection failures exceed the configured error or critical severity threshold. Once this threshold is exceeded, xDoctor generates a RAP208 alert, which acts as the trigger for the automated repair process.
Example alert output
NOTE: In xDoctor versions earlier than ECS xDoctor v4.8-109.0 and ObjectScale xDoctor v5.1-109.0, this condition results in an alert only. No automatic remediation is performed.
--------------------------------------------------------
INFO - Auto Healer for dell_switch_connectivity disabled
--------------------------------------------------------
Extra = Auto Healer for dell_switch_connectivity disabled
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
----------------------------------------------------
ERROR - (Cached) Switch Connection Failure detected.
----------------------------------------------------
Node = 169.254.1.1
Extra = {"169.254.1.1": ["hare"]}
RAP = RAP208
Solution = KB 39838
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
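Alert blocks like the ones above have a regular shape: a severity header between dashed rules, followed by key = value fields. The sketch below shows one way to read them into dictionaries, for example when scanning saved report text for RAP208. It is an illustration only: the function names are hypothetical, the severity list is taken from the Diagnosis Summary counters shown in this article, and the reading of the Extra field as a JSON map from node IP to switch names is an assumption based on the sample output.

```python
import json
import re

# Severity names as they appear in the xDoctor Diagnosis Summary in this article.
HEADER = re.compile(
    r"^(?P<severity>FIXED|CRITICAL|ERROR|WARNING|INFO|VERBOSE|REPORT) - (?P<message>.+)$"
)
FIELD = re.compile(r"^(?P<key>[A-Za-z ]+?) = (?P<value>.*)$")
RULE = re.compile(r"^-{3,}$")


def parse_alerts(report_text):
    """Split xDoctor report text into one dict per alert block."""
    alerts = []
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        if not line or RULE.match(line):
            continue  # skip blank lines and the dashed rules around headers
        header = HEADER.match(line)
        field = FIELD.match(line)
        if header:
            current = {"severity": header["severity"], "message": header["message"]}
            alerts.append(current)
        elif field and current is not None:
            current[field["key"]] = field["value"]
    return alerts


def failed_switches(alert):
    """Decode Extra when it is JSON (assumed: node IP -> switch names), else {}."""
    try:
        extra = json.loads(alert.get("Extra", ""))
    except json.JSONDecodeError:
        return {}
    return extra if isinstance(extra, dict) else {}


sample = """\
----------------------------------------------------
ERROR - (Cached) Switch Connection Failure detected.
----------------------------------------------------
Node = 169.254.1.1
Extra = {"169.254.1.1": ["hare"]}
RAP = RAP208
"""
for alert in parse_alerts(sample):
    if alert.get("RAP") == "RAP208":
        print(alert["severity"], failed_switches(alert))
```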
Auto Healer Remediation (example)
When auto healers are enabled, xDoctor automatically starts remediation for detected switch connectivity issues, applying the common corrective actions described in this knowledge base article.
--------------------------------------------------------
FIXED - Auto Healer fixed Dell switch connectivity issue
--------------------------------------------------------
Node = Nodes
Extra = {"Nodes": ["169.254.1.1"]}
Timestamp = 2026-04-01_180344
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
Auto Healer Requirements
The xDoctor auto healer feature must be enabled for this remediation to take place. Auto healers can be enabled either during installation or afterwards by following the steps described in:
KB: ECS: xDoctor: How to enable xDoctor Auto Healer after tool installation
Cause
After a switch replacement, the SSH host keys used to authenticate the switch may change, or the management interface that connects to the switch may be administratively shut down. In some cases, the password configured in xDoctor does not match the current password on the affected switch and must be updated accordingly.
The xDoctor automation and auto healer workflows do not remediate switch passwords. Instead, xDoctor detects authentication-related failures, raises the appropriate alert, and directs the user to the relevant knowledge base article, which describes how to configure xDoctor to use the password that is set on the switches.
Resolution
xDoctor Auto Healer: ObjectScale xDoctor v5.1-109.0 / ECS xDoctor v4.8-109.0 or later
- To manually trigger the enabled auto healer, run the following command on the master.rack node. This starts the rack analyzers, which validate and automatically repair the nodes, one at a time.
# sudo xdoctor --rap=RAP208
Example:
admin@ecsnode1:~> sudo xdoctor --rap=RAP208
2026-04-01 18:03:45,441: xDoctor_4.8-109.0 - INFO : Initializing xDoctor v4.8-109.0 ...
[... Truncated Output ...]
2026-04-01 18:05:01,725: xDoctor_4.8-109.0 - INFO : ANALYZER [ac_dell_switch_connectivity]
2026-04-01 18:05:02,063: xDoctor_4.8-109.0 - INFO : Autohealing switch_connectivity on node 169.254.1.1 ...
2026-04-01 18:08:57,494: xDoctor_4.8-109.0 - INFO : All data analyzed in 0:03:55
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : Diagnosis Summary
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : PSNT: CKMXXXXXXXXXXX
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : --------------------
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : FIXED = 1
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL (CACHED) = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR (CACHED) = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : WARNING = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : INFO = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : VERBOSE = 0
2026-04-01 18:08:58,531: xDoctor_4.8-109.0 - INFO : REPORT = 0
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : ---------------------
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : xDoctor Post Features
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : ----------------
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : Data Combiner
2026-04-01 18:08:58,646: xDoctor_4.8-109.0 - INFO : -------------
2026-04-01 18:08:58,647: xDoctor_4.8-109.0 - INFO : Created a Data Collection Report (data.xml)
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ------
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : SysLog
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ------
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : Using Fabric as Syslog Server
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : Not triggered ... no WARNING, ERROR, nor CRITICAL
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ----
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : SNMP
2026-04-01 18:08:58,648: xDoctor_4.8-109.0 - INFO : ----
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : Using 10.118.165.48:162 as SNMP server
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : Not triggered .. no WARNING, ERROR nor CRITICAL
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : ------------
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : ProcComplete
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - INFO : ------------
2026-04-01 18:08:58,649: xDoctor_4.8-109.0 - WARNING : ProcComplete is disabled, please re-enable it (xdoctor --config)
2026-04-01 18:08:58,767: xDoctor_4.8-109.0 - INFO : ----------------
2026-04-01 18:08:58,767: xDoctor_4.8-109.0 - INFO : Session Archiver
2026-04-01 18:08:58,768: xDoctor_4.8-109.0 - INFO : ----------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Stored in folder - /usr/local/xdoctor/archive/other/2026-04-01_180344
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Archived as tar - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_180344.tgz
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : --------------------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Report - sudo xdoctor --report --archive=2026-04-01_180344
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : ---------------
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : Session Cleaner
2026-04-01 18:08:58,777: xDoctor_4.8-109.0 - INFO : ---------------
2026-04-01 18:08:58,789: xDoctor_4.8-109.0 - INFO : Removing folder (count limit) - /usr/local/xdoctor/archive/other/2026-04-01_170120
2026-04-01 18:08:58,790: xDoctor_4.8-109.0 - INFO : Removing archive (count limit) - /usr/local/xdoctor/archive/other/xDoctor-CKMXXXXXXXXXXX-2026-04-01_170120.tgz
2026-04-01 18:08:58,793: xDoctor_4.8-109.0 - INFO : Cleaned 2 archived session(s)
2026-04-01 18:08:58,793: xDoctor_4.8-109.0 - INFO : -------
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Emailer
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : -------
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Using Dedicated Server (25:25) as SMTP Server ...
2026-04-01 18:08:58,794: xDoctor_4.8-109.0 - INFO : Email Type = Individual Events
2026-04-01 18:08:58,795: xDoctor_4.8-109.0 - INFO : ------------------------------
2026-04-01 18:08:58,795: xDoctor_4.8-109.0 - INFO : xDoctor session_1775066624.943 finished in 0:05:13
2026-04-01 18:08:58,813: xDoctor_4.8-109.0 - INFO : Successful Job:1775066624 Exit Code:192
- Run the session report to review the results of the manually triggered auto healer run.
# sudo xdoctor --report --archive=<session report>
Example:
admin@ecsnode1:~> sudo xdoctor --report --archive=2026-04-01_180344
xDoctor 4.8-109.0
CKMXXXXXXXXXXX - ECS 3.8.1.4
Displaying xDoctor Report (2026-04-01_180344) Filter:[] ...
--------------------------------------------------------
FIXED - Auto Healer fixed Dell switch connectivity issue
--------------------------------------------------------
Node = Nodes
Extra = {"Nodes": ["169.254.1.1"]}
Timestamp = 2026-04-01_180344
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
- If a failure is reported, open an SR for investigation.
Example of a failure:
----------------------------------------------------
ERROR - (Cached) Auto fix failed - Switch Connection Failure detected.
----------------------------------------------------
Node = 169.254.1.1
Extra = {"169.254.1.1": ["hare"]}
RAP = RAP208
Solution = KB 39838
Timestamp = 2026-04-01_180132
PSNT = CKMXXXXXXXXXXX @ 4.8-109.0
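The Diagnosis Summary counters printed by a manual RAP208 run give a quick indication of whether anything is left unresolved. The sketch below extracts those counters from captured console output. The function names are hypothetical, and treating any non-zero CRITICAL or ERROR counter as a reason to review the session report is an assumption made for illustration, not a rule stated by xDoctor.

```python
import re

# Matches summary entries such as "INFO : FIXED = 1" or "INFO : ERROR (CACHED) = 0".
COUNT = re.compile(r"INFO\s*:\s*(?P<name>[A-Z]+(?: \(CACHED\))?) = (?P<count>\d+)")
PROBLEM_KEYS = ("CRITICAL", "CRITICAL (CACHED)", "ERROR", "ERROR (CACHED)")


def diagnosis_summary(log_text):
    """Collect the Diagnosis Summary counters from xDoctor console output."""
    return {m["name"]: int(m["count"]) for m in COUNT.finditer(log_text)}


def has_unresolved_findings(summary):
    """True when any CRITICAL or ERROR counter (cached or not) is non-zero."""
    return any(summary.get(key, 0) > 0 for key in PROBLEM_KEYS)


log = """\
2026-04-01 18:08:58,529: xDoctor_4.8-109.0 - INFO : FIXED = 1
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : CRITICAL = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR = 0
2026-04-01 18:08:58,530: xDoctor_4.8-109.0 - INFO : ERROR (CACHED) = 0
"""
summary = diagnosis_summary(log)
print(summary, has_unresolved_findings(summary))
```

Because the pattern is not anchored to line boundaries, it also works on output that was captured as a single wrapped line.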
xDoctor Auto Pilot:
This knowledge base (KB) article is now automated with xDoctor Auto Pilot, which addresses most issues without requiring support involvement.
This feature is native to xDoctor 4.8-104.0 and later. For syntax and usage questions, see ECS: ObjectScale: How to run KB automation scripts (Auto Pilot).
To find the master node of the rack:
Command:
ssh master.rack
To find the NAN IP, use the IP address identified in the alert or the output of getrackinfo:
Command:
admin@ecsnode1:~> getrackinfo
Node private    Node          Public                            BMC
Ip Address      Id     Status Mac               Ip Address      Mac               Ip Address      Private.4(NAN)  Node Name
=============== ====== ====== ================= =============== ================= =============== =============== =========
192.168.219.1   1      MA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.101 169.254.1.1     provo-red
192.168.219.2   2      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.102 169.254.1.2     sandy-red
192.168.219.3   3      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.103 169.254.1.3     orem-red
192.168.219.4   4      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.104 169.254.1.4     ogden-red
192.168.219.5   5      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.105 169.254.1.5     layton-red
192.168.219.6   6      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.106 169.254.1.6     logan-red
192.168.219.7   7      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.107 169.254.1.7     lehi-red
192.168.219.8   8      SA     00:00:00:00:00    0.0.0.0         00:00:00:00:00    192.168.219.108 169.254.1.8     murray-red
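When the getrackinfo output has been saved to a file or captured by a script, the NAN IP of each node can be looked up by node name. The following is a small sketch under the assumption that data rows keep the nine whitespace-separated fields shown above, with Private.4(NAN) in the eighth position and the node name in the ninth. The function name is hypothetical.

```python
def nan_ips_by_node(getrackinfo_output):
    """Map node name -> Private.4 (NAN) IP from getrackinfo data rows."""
    result = {}
    for line in getrackinfo_output.splitlines():
        fields = line.split()
        # Data rows: nine fields, first one an IPv4 address, second one the node id.
        # Header and separator rows do not satisfy these checks and are skipped.
        if len(fields) == 9 and fields[0].count(".") == 3 and fields[1].isdigit():
            result[fields[8]] = fields[7]
    return result


sample = """\
192.168.219.1 1 MA 00:00:00:00:00 0.0.0.0 00:00:00:00:00 192.168.219.101 169.254.1.1 provo-red
192.168.219.2 2 SA 00:00:00:00:00 0.0.0.0 00:00:00:00:00 192.168.219.102 169.254.1.2 sandy-red
"""
print(nan_ips_by_node(sample))
```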
- Run the automation command from the master node with xDoctor 4.8-104.0 or later.
Note:
--target-rack is supported for this action.
# sudo xdoctor autopilot --kb 39838 --target-rack <rack_colour>
admin@ecsnode1:~> sudo xdoctor autopilot --kb 39838 --target-rack red
Checking for existing screen sessions...
Starting screen session 'autopilot_kb_39838_20250626_112318'...
Screen session 'autopilot_kb_39838_20250626_112318' started successfully.
Attaching to screen session 'autopilot_kb_39838_20250626_112318'...
Using /etc/ansible/ansible.cfg as config file
VERSION: 3.0
Playbook tasks: 47
Role tasks: 97
Total tasks: 144 across 1 host(s)
PLAY [red] ******************************************************************************************************************************************************************
Detected 8 hosts for this play.
TASK [target_check : set_fact] **********************************************************************************************************************************************
ok: [169.254.1.1 -> localhost] => {"ansible_facts": {"allowed_targets": "Please use: --target-rack", "target_node_check": false, "target_rack_check": true, "target_vdc_check": false}, "changed": false}
TASK [target_check : context] ***********************************************************************************************************************************************
skipping: [169.254.1.1] => {"changed": false, "false_condition": "node_script == false and target_node_check == true or rack_script == false and target_rack_check == true or vdc_script == false and target_vdc_check == true", "skip_reason": "Conditional result was False"}
...truncated
- Review the summary:
Example:
TASK [Print all summaries] **************************************************************************************************************************************************
ok: [169.254.1.1] => {
"msg": [
"*******************************************************************************",
"Switch xDoctor 'RAP073' password and SSH summary:",
"*******************************************************************************",
"Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.",
"Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.",
"Validated Backend management connections: PASS: Management connections are up and connected to the frontend switches.",
"*******************************************************************************",
"Validated ssh keys to switch(es): PASS: All ssh keys are valid and nothing was corrected.",
"Validated xDoctor alert: PASS: Alert RAP073 was not present in xDoctor.",
"*******************************************************************************"
]
}
TASK [Set fact for context] *************************************************************************************************************************************************
ok: [169.254.1.1 -> localhost] => {"ansible_facts": {"context": " Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838., Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838."}, "changed": false}
TASK [Fail if validation fails] *********************************************************************************************************************************************
fatal: [169.254.1.1]: FAILED! => {"changed": false, "msg": "Review the summary above for recommendations."}
NO MORE HOSTS LEFT **********************************************************************************************************************************************************
PLAY RECAP ******************************************************************************************************************************************************************
169.254.1.1 : ok=65 changed=13 unreachable=0 failed=1 skipped=73 rescued=0 ignored=1
169.254.1.2 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.3 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.4 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.5 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.6 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.7 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
169.254.1.8 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
=============================================================================================================================================================================
Status: FAIL
Time Elapsed: 0h 1m 25s
Debug log: /tmp/autopilot/log/autopilot_39838_20250626_113201.log
Message: Validated Frontend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838., Validated Backend switch(es): FAIL: The passwords for the Dell managed switch(es) are incorrect and need to be configured in the xDoctor settings according to KB 39838.
=============================================================================================================================================================================
- Update the xDoctor password:
admin@ecsnode7:~> sudo xdoctor -c --expert
xDoctor Configuration Menu
--------------------------
[Expert Mode Active]
(1) Overview
(2) Scheduling
(3) Archiving
(5) Repository
(9) Miscellaneous
(0) Exit
Please make a choice: 9
xDoctor Miscellaneous
---------------------
(3) Switches
(4) Remove Hardware Alerting Timestamp
(0) Main menu
Please make a choice: 3
xDoctor Switch Settings
---------------------
Enable Switch Analysis? [Yes]:
Switches [hare,rabbit,fox,hound]:
Username [admin]:
Password [*****]:
[New Switch Settings]
Enabled = Yes
Switches = hare,rabbit,fox,hound
Username = admin
Password = *****
> Issue new settings? [No]: yes
2024-11-20 16:03:53,702: xDoctor_4.8-100.0 - INFO : Settings saved and distributed ...
xDoctor Miscellaneous
---------------------
(3) Switches
(4) Remove Hardware Alerting Timestamp
(0) Main menu
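One way to check an Auto Pilot run, such as the one shown earlier in this section, is to read the Ansible PLAY RECAP lines and list the hosts that reported failed tasks or were unreachable. The sketch below assumes the recap format shown in the example output (host, colon, then name=count pairs). The function names are hypothetical and not part of xDoctor or Ansible.

```python
import re

# One recap line: "<host> : ok=65 changed=13 unreachable=0 failed=1 ..."
RECAP = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<stats>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Map each host in an Ansible PLAY RECAP to its counters."""
    hosts = {}
    for line in output.splitlines():
        match = RECAP.match(line.strip())
        if match:
            pairs = (item.split("=") for item in match["stats"].split())
            hosts[match["host"]] = {key: int(value) for key, value in pairs}
    return hosts


def hosts_needing_review(recap):
    """Hosts with at least one failed task or that were unreachable."""
    return sorted(
        host for host, stats in recap.items()
        if stats.get("failed", 0) > 0 or stats.get("unreachable", 0) > 0
    )


sample = """\
169.254.1.1 : ok=65 changed=13 unreachable=0 failed=1 skipped=73 rescued=0 ignored=1
169.254.1.2 : ok=4 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
"""
print(hosts_needing_review(parse_recap(sample)))
```

A host listed here only indicates where to look. The summary messages in the Auto Pilot output remain the source for the actual recommendation, such as configuring the switch password in the xDoctor settings.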
Base KB automation:
ECS: xDoctor: RAP073: Switch Connection Failure Detected
Additional KB consolidated into this automation:
ECS: xDoctor reports switch connection failure due to RSA key in known_hosts