ECS: svc_tools are showing ERROR: "Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown."

Summary: The following ERROR is displayed in various svc_tools such as "svc_gc", "svc_replicate", "svc_rg", "svc_task", and "svc_vdc" after a successful node evacuation: "Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown. Error was: DT search redirected to an unknown node, with IP 169.254.2.1"

This article is not tied to any specific product. Not all product versions are identified in this article.

Symptoms

This ERROR is displayed in various svc_tools such as "svc_gc", "svc_replicate", "svc_rg", "svc_task", and "svc_vdc" after a successful node evacuation:
"Unexpected type 'exceptions.RuntimeError' error while accessing Unknown. Error was: DT search redirected to an unknown node, with IP 169.254.2.1"

admin@ecsnode3:~> svc_gc config list
svc_gc v3.7.0 (svc_tools v2.20.0)                 Started 2024-12-11 09:21:45

Local node ECS Object Version: 3.8.1.0-140092.463d649a0d3.0.1.bugfix_release_ecs_3_8_1_0_GA_Customer (3.8.1.0 Isolated Patch 0.1 (Customer))
ERROR     Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown.  Error was: DT search redirected to an unknown node, with IP 169.254.2.1
ERROR     Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown.  Error was: DT search redirected to an unknown node, with IP 169.254.2.1
WARNING   Failed to discover VDC info from dtquery, falling back to REST.  Error was: dtQueryCmdFailure - Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown.  Error was: DT search redirected to an unknown node, with IP 169.254.2.1
Local VDC: urn:storageos:VirtualDataCenterData:12345678-abcd-1212-3434-abcde123456 vdc_ecs_01
ERROR     Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown.  Error was: DT search redirected to an unknown node, with IP 169.254.2.1
ERROR     Unexpected <type 'exceptions.RuntimeError'> error while accessing Unknown.  Error was: DT search redirected to an unknown node, with IP 169.254.2.1

Current Param values:

Type     Param                                                          Default              Configure(active)
Repo     com.emc.ecs.chunk.gc.repo.enabled                              true                 true                
Repo     com.emc.ecs.chunk.gc.repo.verification.enabled                 true                 true                
Repo     com.emc.ecs.chunk.gc.repo.reclaimer.no_recycle_window          78 hours             78 hours            

BTREE_L1 com.emc.ecs.chunk.gc.btree.enabled                             true                 true                
BTREE_L1 com.emc.ecs.chunk.gc.btree.scanner.verification.enabled        true                 true                
BTREE_L1 com.emc.ecs.chunk.gc.btree.scanner.copy.enabled                true                 true                
BTREE_L1 com.emc.ecs.chunk.gc.btree.occupancyScanner.enabled            true                 true                

BTREE_L2 com.emc.ecs.chunk.gc.btree.reclaimer.level2.enabled            true                 true                
BTREE_L2 com.emc.ecs.chunk.gc.btree.occupancyScanner.level2.enabled     true                 true                
BTREE_L2 com.emc.ecs.chunk.gc.btree.scanner.level2.copy.enabled         true                 true                
BTREE_L2 com.emc.ecs.chunk.gc.btree.scanner.level2.verification.enabled true                 true                

Partial  com.emc.ecs.chunk.gc.repo.partial.enabled                      true                 true                
Partial  com.emc.ecs.chunk.gc.repo.partial.merge_chunk_threshold        89478400             89478400            
Partial  com.emc.ecs.chunk.gc.repo.partial.merge_old_chunk_threshold    89478400             89478400            

Journal  com.emc.ecs.chunk.gc.journal.enabled                           true                 true                
Journal  com.emc.ecs.prtable.gc.enabled                                 true                 true                
Journal  com.emc.ecs.prtable.gc.record_expiration                       14 days              14 days             
Journal  com.emc.ecs.chunk.gc.journal.protection_period                 14 days              14 days             

CAS      com.emc.ecs.objectgc.cas.enabled                               true                 true                
CAS      com.emc.ecs.objectgc.cas.process_update.enabled                true                 true                
CAS      com.emc.ecs.objectgc.cas.process_object.enabled                true                 true                
CAS      com.emc.ecs.objectgc.cas.process_audit.enabled                 true                 true                
CAS      com.emc.ecs.objectgc.cas.consistency_scanner.enabled           true                 true                
CAS      com.emc.ecs.objectgc.cas.process_object.dry_run                false                false               


====> List of the Parameters Not Default: 

Type   Param (com.emc.ecs)   Default   Configure(active)   MTime   Reason   Description
---------------------------------------------------------------------------------------
< No result data >

admin@ecsnode3:~> 

Cause

For example, a node evacuation of rack 2, which included nodes R2N1 - R2N5 (private.4 IPs 169.254.2.1 - 169.254.2.5), was done recently.
The dtquery service still held these private.4 IPs in its cache.

The same applies to evacuations of other racks.
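
As an optional quick check, and assuming rack 2 was the evacuated rack as in this example, the evacuated private.4 IPs should no longer answer on the network. From any remaining node:

Command: ping -c 1 -W 2 169.254.2.1

100% packet loss is expected here; a reply would indicate the rack is still reachable and this article does not apply.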

Resolution

Check whether a recent node evacuation took place: review the latest Service Console logs across all nodes and look for "run_Node_Evacuation".

Command: svc_exec "ls -altrd /opt/emc/caspian/service-console/log/*"

Example:

admin@ecsnode3:~> svc_exec "ls -altrd /opt/emc/caspian/service-console/log/*"
svc_exec v1.0.8 (svc_tools v2.20.0)                 Started 2024-12-11 13:05:08

Output from node: r1n1                                retval: 0
drwxr-xr-x 2 root root 122 Apr  6  2018 /opt/emc/caspian/service-console/log/runClusterConfig_20180406_070816_0
drwxr-xr-x 2 root root 122 Apr  6  2018 /opt/emc/caspian/service-console/log/runHealthCheck_20180406_070956_0
<...>
drwxr-xr-x 2 root root  78 Nov 26 15:52 /opt/emc/caspian/service-console/log/20241126_153841_run_Node_Evacuation
drwxr-xr-x 2 root root  78 Nov 26 16:11 /opt/emc/caspian/service-console/log/20241126_160456_run_Node_Evacuation
drwxr-xr-x 2 root root  78 Nov 26 22:10 /opt/emc/caspian/service-console/log/20241126_212407_run_Node_Evacuation
drwxr-xr-x 2 root root  78 Nov 27 11:20 /opt/emc/caspian/service-console/log/20241127_095641_run_Node_Evacuation     <--------------------
drwxr-xr-x 2 root root  78 Nov 27 11:36 /opt/emc/caspian/service-console/log/20241127_113633_run_Node_Evacuation
drwxr-xr-x 2 root root  78 Nov 27 11:43 /opt/emc/caspian/service-console/log/20241127_114041_run_Cluster_Config
drwxr-xr-x 2 root root  78 Dec  4 09:42 /opt/emc/caspian/service-console/log/20241204_094234_run_Node_Maintenance_Enter
<...>
drwxr-xr-x 2 root root  78 Dec 11 08:20 /opt/emc/caspian/service-console/log/20241211_081946_run_Node_Maintenance_List

Output from node: r1n2                                retval: 0
drwxr-xr-x 2 root root 139 Nov 25  2018 /opt/emc/caspian/service-console/log/20181125_091403_run_OS_and_Node_Upgrade
drwxr-xr-x 2 root root 139 Feb 23  2019 /opt/emc/caspian/service-console/log/20190223_084218_run_OS_and_Node_Upgrade
drwxr-xr-x 2 root root 139 Nov 23  2019 /opt/emc/caspian/service-console/log/20191123_105303_run_Upgrade
drwxr-xr-x 2 root root  78 Feb  7  2021 /opt/emc/caspian/service-console/log/20210207_122312_run_Upgrade_To_35
<...>

Output from node: r1n3                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r1n4                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r1n5                                retval: 0
drwxr-xr-x 2 root root 78 Dec 22  2023 /opt/emc/caspian/service-console/log/20231222_094024_run_Cluster_Config
drwxr-xr-x 2 root root 62 Dec 22  2023 /opt/emc/caspian/service-console/log/20231222_094644_run_Node_Maintenance_Enter
drwxr-xr-x 2 root root 78 Dec 22  2023 /opt/emc/caspian/service-console/log/20231222_094716_run_Node_Maintenance_Enter

Output from node: r1n6                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r1n7                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r1n8                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n1                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n2                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n3                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n4                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n5                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n6                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n7                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


Output from node: r3n8                                retval: 2
ls: cannot access '/opt/emc/caspian/service-console/log/*': No such file or directory


admin@ecsnode3:~>
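
To narrow the listing to evacuation runs only, the same command can use a more specific glob. This assumes the directory naming shown above (a timestamp followed by run_Node_Evacuation):

Command: svc_exec "ls -altrd /opt/emc/caspian/service-console/log/*run_Node_Evacuation*"

Nodes without matching directories return "No such file or directory" (retval 2), as in the full listing above.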

 

Confirm that the rack has been removed. In this example, rack 2 (IP 169.254.2.1) is no longer listed:

admin@ecsnode3:~> getclusterinfo 

Registered Racks
================

Ip Address        epoxy   seg mac             seg color    seg id    NAN Hostname
===============   =====   =================   ==========   =======   ============
169.254.1.1       False   28:99:3a:12:34:56   red          1         provo-red.nanlocal
169.254.3.1       False   28:99:3a:78:90:12   blue         3         provo-blue.nanlocal
admin@ecsnode3:~>
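
To check this without reading the whole table, filter the output for the evacuated rack's private.4 prefix; an empty result confirms the rack is no longer registered. The prefix 169.254.2. matches this example and must be adapted to the rack that was evacuated:

Command: getclusterinfo | grep "169.254.2."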


Each time a svc_tools command returns the error, it is also logged in dtquery.log on the node where the command was issued:

admin@ecsnode3:~> svc_log -f "169.254.2.1" -sr all -start 1m -sn -sf
svc_log v1.0.33 (svc_tools v2.20.0)                 Started 2024-12-11 11:18:29

Running on nodes:                        <All nodes>
Time range:                              2024-12-11 11:17:30 - 2024-12-11 11:18:30
Filter string(s):                        '169.254.2.1'
Service(s) to search:                    zk-fabric,cm,am,metering,resourcesvc,nvmeengine,casaccess,vnest,upgrade,ecsportalsvc,blobsvc,coordinatorsvc,authsvc,atlas,rm,eventsvc,dataheadsvc,stat,dm,accesslog,metering-georeplayer,zk-object,objcontrolsvc,ssm,nginx,nvmetargetviewer,messages-object,dtsm,provisionsvc,lifecycle,dataheadsvc-access,georeceiver,sr,transformsvc,dtquery,storageserver,datahead-cas-access
Show filename(s):                        True
Show nodename(s):                        True
Log type(s) to search for each service:  <Main Logs>

169.254.1.1 dtquery.log 2024-12-11T11:18:05,800 [qtp877323851-243362]  INFO  DtQueryService.java (line 3749) redirecting to http://169.254.2.1:9101/urn:storageos:OwnershipInfo:3a6bc46a-8551-4df9-a140-5e6b9774f2cb__RT_58_128_0:/REP_GROUP_KEY/?maxkeys=1000&showvalue=gpb&rgId=urn%3Astorageos%3AReplicationGroupInfo%3A00000000-0000-0000-0000-000000000000%3Aglobal
169.254.1.1 dtquery.log 2024-12-11T11:18:05,804 [qtp877323851-243014]  INFO  DtQueryService.java (line 3749) redirecting to http://169.254.2.1:9101/urn:storageos:OwnershipInfo:3a6bc46a-8551-4df9-a140-5e6b9774f2cb__RT_114_128_0:/REP_GROUP_KEY/?maxkeys=1000&showvalue=gpb&rgId=urn%3Astorageos%3AReplicationGroupInfo%3A00000000-0000-0000-0000-000000000000%3Aglobal

admin@ecsnode3:~> 
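
Optionally, the search can be restricted to the dtquery service, since the redirect messages originate there. This assumes -sr accepts a single service name from the service list printed above:

Command: svc_log -f "169.254.2.1" -sr dtquery -start 1m -sn -sf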

 

The timestamp of the last dtquery restart predates the most recent node evacuation; the service cache therefore still holds the evacuated nodes' private.4 IPs.
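
One way to compare the two timestamps, assuming dtquery runs as a regular process visible to ps on each node (this may differ on containerized deployments), is to list its process start time across all nodes:

Command: svc_exec "ps -eo pid,lstart,args | grep [d]tquery"

If the reported start time is older than the newest run_Node_Evacuation directory found above, the stale cache explains the errors.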

Open a Service Request with Dell Technical Support and reference KBA 000259052 to have the dtquery service restarted.

A restart of dtquery does not impact any frontend I/O, as the service is only used internally by ECS.

Affected Products

ECS

Products

ECS Appliance
Article Properties
Article Number: 000259052
Article Type: Solution
Last Modified: 16 Dec 2024
Version:  1