IDPA: Node Event Service Is in a Degraded State Due to a Duplicate Route for 169.254.0.1
Summary: The cached response with Node Event Service is disabled, and Node Event Service is in a degraded state because of a duplicate route for 169.254.0.1.
Symptoms
RUCK Precheck Failure firmware_readiness:
[ERROR] Firmware pre-upgrade checks failed. [ 10.100.XX.XXX ]: The cached response with Node Event Service is disabled.
Node Event Service is in a degraded state,iDRAC Service Module is not available/active at this time.
Check iDRAC Service Module/iDRAC status.
When the following command is run, the agent and idraccache status values show Error/degraded:
[root@idpa-esx1:~] /opt/dell/DellPTAgent/tools/pta_call get agent/info
Request sent to DellPTAgent @ https://192.168.100.101:8086
{
"libstorelibit.so": "07.05",
"uptime": "177266 seconds ( 2 days 1 hour 14 minutes 26 seconds )",
"host_pass_thru_ip": "169.254.0.2",
"ism_version": "3.6.0",
"model": "R640 IDPA",
"os_version": "6.7.0 build-17700523",
"process_id": "2101654",
"libstorelibir-3.so": "15.03-0",
"default_server_cert": "true",
"TPM Present": "false",
"MarvellLibraryVersion": "5.0.13.1109",
"servicetag": "D3WCCD3",
"mfr": "Dell Inc.",
"system_uuid": "62dcf7ee-334e-8f96-f507-78ac4426a310",
"status": {
"iSM": "N/A",
"agent": "Error/degraded",
"idracConnection": "OK",
"idraccache": "Error/degraded"
The iSM is running but cannot communicate with iDRAC:
PTAgent debug log (/scratch/log/pta_debug.log):
2023/10/23 14:29:33[UTC] [19394369:246258496] WARN - WSManClient::isValidResponse: Http request to host: 169.254.0.1, failed with status code: -4
2023/10/23 14:29:34[UTC] [19394369:246786880] WARN - WSManClient::isValidResponse: Http request to host: 169.254.0.1, failed with status code: -4
2023/10/23 14:29:34[UTC] [19394369:246258496] WARN - ISMMonitor::isISMServiceRunning: Command to check iSM status failed with error code <2>. <ism is active (running limited functionality)
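The log excerpt above can be summarized with a short script. The following is a minimal sketch, not a Dell-provided tool: the function name count_wsman_failures and the regular expression are illustrative assumptions modeled only on the warning lines quoted above, and other PTAgent versions may log a different format.

```python
import re
from collections import Counter

# Pattern modeled on the WSManClient warning lines quoted in this article.
# Assumption: other PTAgent versions may word this message differently.
WSMAN_FAILURE = re.compile(
    r"WSManClient::isValidResponse: Http request to host: (?P<host>[\d.]+), "
    r"failed with status code: (?P<code>-?\d+)"
)


def count_wsman_failures(lines):
    """Return a Counter keyed by (host, status_code) for failed WSMan requests."""
    failures = Counter()
    for line in lines:
        match = WSMAN_FAILURE.search(line)
        if match:
            failures[(match.group("host"), int(match.group("code")))] += 1
    return failures


# Sample lines copied from the log excerpt in this article.
SAMPLE_LOG = """\
2023/10/23 14:29:33[UTC] [19394369:246258496] WARN - WSManClient::isValidResponse: Http request to host: 169.254.0.1, failed with status code: -4
2023/10/23 14:29:34[UTC] [19394369:246786880] WARN - WSManClient::isValidResponse: Http request to host: 169.254.0.1, failed with status code: -4
2023/10/23 14:29:34[UTC] [19394369:246258496] WARN - ISMMonitor::isISMServiceRunning: Command to check iSM status failed with error code <2>. <ism is active (running limited functionality)
"""

if __name__ == "__main__":
    for (host, code), count in count_wsman_failures(SAMPLE_LOG.splitlines()).items():
        print(f"{host}: {count} failed request(s), status code {code}")
```

To scan a real log, pass an open file object for /scratch/log/pta_debug.log instead of the sample lines. Repeated failures against 169.254.0.1 are consistent with the routing problem described in this article, but they do not prove it on their own.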
Cause
The extra route to 169.254.0.1 is created at the host OS (ESXi) layer, based on the existing network configuration. This behavior is commonly seen when multiple static routes are defined in the ESXi configuration.
One possible cause is the use of an internal SSH tunnel from ESXi to iDRAC.
iSM does not create the additional route, nor does it detect or remove extra routes that have been defined.
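Why a single extra entry breaks the iSM-to-iDRAC path can be illustrated with a small route-selection sketch. It assumes ESXi follows standard longest-prefix-match routing, which is how IP routing generally works but is not stated in this article. The route list is copied from the example routing table in the Additional Information section, and select_route is a hypothetical helper, not an ESXi or Dell API.

```python
import ipaddress

# Routes taken from the example routing table in this article, written in
# CIDR form. The masked vmk0 gateway is not needed for this illustration.
ROUTES = [
    ("0.0.0.0/0", "vmk0"),
    ("10.100.10.0/24", "vmk0"),
    ("169.254.0.0/24", "vmk2"),
    ("169.254.0.1/32", "vmk0"),  # the extra route described in this article
    ("192.168.100.96/27", "vmk1"),
]


def select_route(destination, routes):
    """Return the (network, interface) pair with the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(net), iface)
        for net, iface in routes
        if dest in ipaddress.ip_network(net)
    ]
    return max(candidates, key=lambda item: item[0].prefixlen)


if __name__ == "__main__":
    net, iface = select_route("169.254.0.1", ROUTES)
    print(f"With the extra route: {net} via {iface}")

    without_extra = [r for r in ROUTES if r[0] != "169.254.0.1/32"]
    net, iface = select_route("169.254.0.1", without_extra)
    print(f"Without the extra route: {net} via {iface}")
```

Under that assumption, the /32 entry is more specific than the /24 entry on vmk2, so traffic for 169.254.0.1 is sent through vmk0 instead of the interface that carries the 169.254.0.0 network.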
Resolution
To check PTAgent, run:
./goidpa esx ptagent check
To fix issues with PTAgent, run:
./goidpa esx ptagent fix
Additional Information
Remove the extra 169.254.0.1 route from vmk0 only. Removing any other route from vmk0, or removing a 169.254.xx.xx route from any other vmk interface, can cause other routing issues.
- Confirm that the ESXi routing table contains an invalid entry that prevents iSM (169.254.0.2) from communicating with iDRAC (169.254.0.1).
An example environment with this issue is shown below. The routing table contains an extra route for network 169.254.0.1/32 through a gateway on the vmk0 interface.
[root@idpa-esxi:~] esxcli network ip route ipv4 list
Network Netmask Gateway Interface Source
-------------- --------------- ------------- --------- ------
default 0.0.0.0 10.100.10.254 vmk0 MANUAL
10.100.10.0 255.255.255.0 0.0.0.0 vmk0 MANUAL
169.254.0.0 255.255.255.0 0.0.0.0 vmk2 MANUAL
169.254.0.1 255.255.255.255 10.100.xx.xxx vmk0 MANUAL
192.168.100.96 255.255.255.224 0.0.0.0 vmk1 MANUAL
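The check in the listing above can also be done programmatically. The following is a minimal sketch that parses text in the same five-column layout and flags any /32 route to 169.254.0.1 that uses a different interface than the broader route covering that address. The function names and the flagging rule are illustrative assumptions, not part of esxcli or of any Dell tool.

```python
import ipaddress

IDRAC_PASS_THRU_IP = "169.254.0.1"  # iDRAC end of the pass-through link in this article

# Listing copied from the example in this article (gateway masked as shown there).
SAMPLE_LISTING = """\
Network         Netmask          Gateway        Interface  Source
--------------  ---------------  -------------  ---------  ------
default         0.0.0.0          10.100.10.254  vmk0       MANUAL
10.100.10.0     255.255.255.0    0.0.0.0        vmk0       MANUAL
169.254.0.0     255.255.255.0    0.0.0.0        vmk2       MANUAL
169.254.0.1     255.255.255.255  10.100.xx.xxx  vmk0       MANUAL
192.168.100.96  255.255.255.224  0.0.0.0        vmk1       MANUAL
"""


def prefix_length(netmask):
    """Convert a dotted-quad netmask such as 255.255.255.0 to a prefix length."""
    return bin(int(ipaddress.ip_address(netmask))).count("1")


def parse_routes(listing):
    """Parse route listing text into a list of dicts, skipping non-data lines."""
    routes = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows have five columns and a dotted-quad netmask in column two;
        # this skips the shell prompt, the header, and the dashed separator.
        if len(fields) != 5 or fields[1].count(".") != 3:
            continue
        network, netmask, gateway, interface, source = fields
        if network == "default":
            network = "0.0.0.0"
        cidr = f"{network}/{prefix_length(netmask)}"
        routes.append({
            "network": ipaddress.ip_network(cidr, strict=False),
            "gateway": gateway,  # kept as text because it may be masked
            "interface": interface,
            "source": source,
        })
    return routes


def find_conflicting_host_routes(routes, target=IDRAC_PASS_THRU_IP):
    """Return /32 routes to target whose interface differs from the broader,
    non-default route that covers the same address. If no broader route
    exists, every /32 route to target is returned."""
    addr = ipaddress.ip_address(target)
    expected = {
        r["interface"] for r in routes
        if addr in r["network"] and 0 < r["network"].prefixlen < 32
    }
    return [
        r for r in routes
        if r["network"].prefixlen == 32
        and addr in r["network"]
        and r["interface"] not in expected
    ]


if __name__ == "__main__":
    for route in find_conflicting_host_routes(parse_routes(SAMPLE_LISTING)):
        print(f"Conflicting route: {route['network']} via {route['gateway']} on {route['interface']}")
```

The sketch only reports candidates. Any route it flags should still be reviewed by hand before removal, because removing the wrong route can cause other routing issues.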
- Remove the entry to fix the routing issue between iSM and iDRAC.
esxcli network ip route ipv4 remove -g 10.100.xx.xxx -n 169.254.0.1/32
- Confirm that the entry has been removed:
[root@idpa-esxi:~] esxcli network ip route ipv4 list
Network Netmask Gateway Interface Source
-------------- --------------- ------------- --------- ------
default 0.0.0.0 10.100.10.254 vmk0 MANUAL
10.100.10.0 255.255.255.0 0.0.0.0 vmk0 MANUAL
169.254.0.0 255.255.255.0 0.0.0.0 vmk2 MANUAL
192.168.100.96 255.255.255.224 0.0.0.0 vmk1 MANUAL
- Once the extra route is removed, restart PTAgent and the iSM service.
/etc/init.d/DellPTAgent restart
/etc/init.d/dcism-netmon-watchdog restart
- Run the firmware readiness check and confirm that the result is SUCCESS.
DP4400:
curl -k -i -H "Content-Type:application/json" -X POST https://localhost:8039/dpatools/api/v1/firmware/readinesscheck -d '{"idpaVersion": "2.7", "isRack": false, "hostList": [{"hostIP": "192.168.100.101", "esxiUser": "root", "esxiPassword": "IDPAPASSWORD"}]}'
Replace IDPAPASSWORD with the IDPA common password.
Example:
curl -k -i -H "Content-Type:application/json" -X POST https://localhost:8039/dpatools/api/v1/firmware/readinesscheck -d '{"idpaVersion": "2.7", "isRack": false, "hostList": [{"hostIP": "192.168.100.101", "esxiUser": "root", "esxiPassword": "Idpa_12345"}]}'
Output:
HTTP/1.1 200
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, PUT, GET, OPTIONS, DELETE
Access-Control-Max-Age: 3600
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
Content-Type: application/json
Transfer-Encoding: chunked
Date: Sun, 05 Nov 2023 01:38:41 GMT
[{"id":814402494717874004,"hostIP":"192.168.100.101","result":"SUCCESS","agentIdracDataCached":true,"agentIsOverallReady":true,"agentIsRunning":true,"agentVersionIsOk":true,"idracInRecoveryMode":false,"idracIsAvailable":true,"idracIsReady":true,"idracLcJobQueueIsClear":true,"ismIsEnabled":true,"ismIsReady":true,"firmwareIsValid":true,"twoHopIsRequired":false,"invalidFirmwareList":[],"messages":"The IDPA system is ready for firmware update.","links":[]}]
DP5x00/DP8x00:
for i in {1..3}; do curl -k -i -H "Content-Type:application/json" -X POST https://localhost:8039/dpatools/api/v1/firmware/readinesscheck -d '{"idpaVersion": "2.7", "isRack": true, "hostList": [{"hostIP": "192.168.100.10'${i}'", "esxiUser": "root", "esxiPassword": "IDPAPASSWORD"}]}'; done
Replace IDPAPASSWORD with the IDPA common password.
Example:
for i in {1..3}; do curl -k -i -H "Content-Type:application/json" -X POST https://localhost:8039/dpatools/api/v1/firmware/readinesscheck -d '{"idpaVersion": "2.7", "isRack": true, "hostList": [{"hostIP": "192.168.100.10'${i}'", "esxiUser": "root", "esxiPassword": "Idpa_12345"}]}'; done
Output:
HTTP/1.1 200
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, PUT, GET, OPTIONS, DELETE
Access-Control-Max-Age: 3600
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
Content-Type: application/json
Transfer-Encoding: chunked
Date: Sun, 05 Nov 2023 01:39:05 GMT
[{"id":2337625064092707360,"hostIP":"192.168.100.101","result":"SUCCESS","agentIdracDataCached":true,"agentIsOverallReady":true,"agentIsRunning":true,"agentVersionIsOk":true,"idracInRecoveryMode":false,"idracIsAvailable":true,"idracIsReady":true,"idracLcJobQueueIsClear":true,"ismIsEnabled":true,"ismIsReady":true,"firmwareIsValid":false,"twoHopIsRequired":false,"invalidFirmwareList":[{"name":"Integrated Remote Access Controller","currentVersion":"3.36.103.36","minimumRequiredVersion":"4.40.10.00"}],"messages":"The IDPA system is ready for firmware update.","links":[]}]HTTP/1.1 200
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, PUT, GET, OPTIONS, DELETE
Access-Control-Max-Age: 3600
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
Content-Type: application/json
Transfer-Encoding: chunked
Date: Sun, 05 Nov 2023 01:39:44 GMT
[{"id":1389413431580575643,"hostIP":"192.168.100.102","result":"SUCCESS","agentIdracDataCached":true,"agentIsOverallReady":true,"agentIsRunning":true,"agentVersionIsOk":true,"idracInRecoveryMode":false,"idracIsAvailable":true,"idracIsReady":true,"idracLcJobQueueIsClear":true,"ismIsEnabled":true,"ismIsReady":true,"firmwareIsValid":true,"twoHopIsRequired":false,"invalidFirmwareList":[],"messages":"The IDPA system is ready for firmware update.","links":[]}]HTTP/1.1 200
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, PUT, GET, OPTIONS, DELETE
Access-Control-Max-Age: 3600
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
Content-Type: application/json
Transfer-Encoding: chunked
Date: Sun, 05 Nov 2023 01:40:19 GMT
[{"id":1184184063038749065,"hostIP":"192.168.100.103","result":"SUCCESS","agentIdracDataCached":true,"agentIsOverallReady":true,"agentIsRunning":true,"agentVersionIsOk":true,"idracInRecoveryMode":false,"idracIsAvailable":true,"idracIsReady":true,"idracLcJobQueueIsClear":true,"ismIsEnabled":true,"ismIsReady":true,"firmwareIsValid":true,"twoHopIsRequired":false,"invalidFirmwareList":[],"messages":"The IDPA system is ready for firmware update.","links":[]}]
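Because curl -i prints headers and body together, and consecutive responses from the loop can run together on one line, the readiness results are easier to review once the JSON is extracted. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the sample text is abbreviated from the output above, and the set of flags treated as healthy when false is inferred only from the successful responses shown in this article.

```python
import json

# Assumption: in the SUCCESS responses shown in this article these two flags
# are false, and every other boolean flag is normally true.
HEALTHY_WHEN_FALSE = {"idracInRecoveryMode", "twoHopIsRequired"}


def extract_readiness_results(raw_output):
    """Collect the entries of every JSON array found in 'curl -i' output."""
    decoder = json.JSONDecoder()
    results, pos = [], 0
    while True:
        start = raw_output.find("[{", pos)
        if start == -1:
            break
        try:
            payload, end = decoder.raw_decode(raw_output, start)
        except json.JSONDecodeError:
            pos = start + 2  # not valid JSON here; keep scanning
            continue
        results.extend(payload)
        pos = end
    return results


def summarize(result):
    """Return (hostIP, result, sorted names of flags with an unexpected value)."""
    unexpected = sorted(
        key for key, value in result.items()
        if isinstance(value, bool) and value != (key not in HEALTHY_WHEN_FALSE)
    )
    return result["hostIP"], result["result"], unexpected


# Abbreviated from the DP5x00/DP8x00 output above; note the fused "]HTTP/1.1 200".
SAMPLE_OUTPUT = (
    'HTTP/1.1 200\n'
    'Content-Type: application/json\n'
    '[{"hostIP":"192.168.100.101","result":"SUCCESS","idracInRecoveryMode":false,'
    '"ismIsReady":true,"firmwareIsValid":false,"twoHopIsRequired":false,'
    '"invalidFirmwareList":[{"name":"Integrated Remote Access Controller",'
    '"currentVersion":"3.36.103.36","minimumRequiredVersion":"4.40.10.00"}]}]HTTP/1.1 200\n'
    'Content-Type: application/json\n'
    '[{"hostIP":"192.168.100.102","result":"SUCCESS","idracInRecoveryMode":false,'
    '"ismIsReady":true,"firmwareIsValid":true,"twoHopIsRequired":false,'
    '"invalidFirmwareList":[]}]'
)

if __name__ == "__main__":
    for entry in extract_readiness_results(SAMPLE_OUTPUT):
        host, result, unexpected = summarize(entry)
        print(host, result, "unexpected flags:", unexpected or "none")
```

In the sample, host 192.168.100.101 reports SUCCESS while firmwareIsValid is false, which matches the output above. The result field remains the value to confirm for this procedure; the flag list is only a convenience for spotting details.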
Check the cache status by running the following command:
/opt/dell/DellPTAgent/tools/pta_call get agent/info
Output:
Request sent to DellPTAgent @ https://192.168.100.101:8086
{
"mfr": "Dell Inc.",
"servicetag": "D3WCCD3",
"host_epoch_time": "1699148602.56383 (secs.usecs)",
"uptime": "204355 seconds ( 2 days 8 hours 45 minutes 55 seconds )",
"system_uuid": "62dcf7ee-334e-8f96-f507-78ac4426a310",
"model": "R640 IDPA",
"process_id": "2101654",
"libstorelibir-3.so": "15.03-0",
"domain": "esx1-5800-crk.dp.ce.gslabs.lab.emc.com",
"name": "esx1-5800-crk",
"ptagentversion": "2.4.1-3",
"idrac_ethernet_ip": "192.168.100.110",
"os_version": "6.7.0 build-17700523",
"ism_version": "3.6.0",
"MarvellLibraryVersion": "5.0.13.1109",
"libstorelib.so": "07.07",
"host_pass_thru_ip": "169.254.0.2",
"default_server_cert": "true",
"status": {
"iSM": "N/A",
"agent": "OK",
"idraccache": "OK",
"idracConnection": "OK"
},
"idrac_pass_thru_ip": "169.254.0.1",
"api_blocking_enabled": "false",
"os": "VMWare ESXi",
"rest_endpoints": "https://192.168.100.101:8086",
"TPM Present": "false",
"libstorelibit.so": "07.05"
}
Response: status: 200 [OK], size: 1063 bytes, latency: 0.144 seconds.
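The status block in the agent/info output can be checked with a few lines of Python. This is a minimal sketch, not a Dell-provided tool: parse_agent_info and degraded_components are hypothetical names, the sample text is abbreviated from the output above, and treating "N/A" as acceptable is an assumption based on the healthy output showing "iSM": "N/A".

```python
import json


def parse_agent_info(raw_output):
    """Decode the JSON object embedded in pta_call output, ignoring the
    'Request sent to ...' line before it and the 'Response: ...' line after it."""
    start = raw_output.index("{")
    info, _ = json.JSONDecoder().raw_decode(raw_output, start)
    return info


def degraded_components(info):
    """Return status entries whose value is neither "OK" nor "N/A"."""
    # Assumption: "N/A" is acceptable, as seen for iSM in the healthy output.
    return {
        name: state
        for name, state in info.get("status", {}).items()
        if state not in ("OK", "N/A")
    }


# Abbreviated from the healthy output shown in this article.
SAMPLE_HEALTHY = """\
Request sent to DellPTAgent @ https://192.168.100.101:8086
{
"host_pass_thru_ip": "169.254.0.2",
"status": {
"iSM": "N/A",
"agent": "OK",
"idraccache": "OK",
"idracConnection": "OK"
},
"idrac_pass_thru_ip": "169.254.0.1"
}
Response: status: 200 [OK], size: 1063 bytes, latency: 0.144 seconds.
"""

if __name__ == "__main__":
    problems = degraded_components(parse_agent_info(SAMPLE_HEALTHY))
    if problems:
        for name, state in problems.items():
            print(f"{name}: {state}")
    else:
        print("All status entries are OK or N/A")
```

Applied to the status values from the Symptoms section, the same function would report agent and idraccache as Error/degraded.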