Metro node: Password-less SSH to a node or director fails
Summary: This article addresses an issue where you are prompted for a password when attempting to access any metro node node or director over SSH. This is not expected behavior, because metro node supports password-less SSH between the nodes/director(s) when using the service account. ...
Symptoms
Impacted metro node Hardware:
Dell Hardware: Metro Node-mn114/mn215
Dell Hardware: Metro node - Local or Metro
Impacted metro node OS versions:
Dell Software: Metro node OS 7.x
Dell Software: Metro node OS 8.x
Issues:
When attempting to SSH to any metro node node or director with the service account, you are prompted for a password, as follows:
Example: (The output below shows that node-B is unable to perform password-less SSH to node-A)
service@director-1-1-b:~> ssh 128.221.252.35
Password:
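To test for this symptom without waiting at an interactive prompt, one option is the OpenSSH client's `BatchMode` option, which disables password prompting so that the command fails instead of asking. The following is a minimal sketch, assuming a standard OpenSSH client is in use; the function name `check_passwordless_ssh` is hypothetical, and the address is the one from the example above. A failure can also indicate a network problem, so treat the result as a hint rather than a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if key-based (password-less) SSH to the
# given host works, non-zero otherwise.
check_passwordless_ssh() {
    local host="$1"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    # ConnectTimeout=5 avoids a long wait if the host is unreachable.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
        echo "password-less SSH to $host: OK"
    else
        echo "password-less SSH to $host: FAILED"
        return 1
    fi
}

# Example (address taken from the output above):
# check_passwordless_ssh 128.221.252.35
```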
Symptoms:
The 'cluster status' output reports "director-x-y-z : Systemd IO error: Cannot determine the state of system services" for the affected director(s), as follows:
Example:
VPlexcli:/> cluster status
WARNING: There are unreachable directors: director-1-1-A. Connectivity may still have errors even if none are reported.
Cluster cluster-1
operational-status: degraded
transitioning-indications: disk(s) not visible from all directors,meta data problem
transitioning-progress:
health-state: degraded
health-indications: director-1-1-A : Systemd IO error: Cannot determine the state of system services
local-com: connectivity: NONE
LC-00 ports - FAIL - Failed to determine expected connectivity.
LC-01 ports - FAIL - Failed to determine expected connectivity.
Cluster Name Port Group MTU Connectivity Status
------------ ---------- ---- ------------ -----------
cluster-1 LC-00 1500 fail all-enabled
LC-01 1500 fail all-enabled
cluster-2 LC-00 1500 fail all-enabled
LC-01 1500 fail all-enabled
man-com: connectivity: NONE
MC-01 ports - FAIL - Failed to determine expected connectivity.
MC-00 ports - FAIL - Failed to determine expected connectivity.
Cluster Name Port Group MTU Connectivity Status
------------ ---------- ---- ------------ -----------
cluster-1 MC-00 1500 fail all-enabled
MC-01 1500 fail all-enabled
cluster-2 MC-00 1500 fail all-enabled
MC-01 1500 fail all-enabled
Cluster cluster-2
operational-status: ok
transitioning-indications:
transitioning-progress:
health-state: ok
health-indications:
local-com: ok
man-com: ok
wan-com: ok
Cause
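This issue may occur if the permissions on the service user's home directory (/home/service) are changed to full access (777: read, write, and execute for owner, group, and others). When a home directory is writable by group or others, the OpenSSH server typically refuses the account's authorized keys (its StrictModes check) and falls back to password authentication.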
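The condition described above can be checked by testing the group-write and other-write bits of the directory mode. The following is a minimal sketch, assuming GNU `stat` (for `-c '%a'`) is available on the node; the function name `is_group_or_other_writable` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (returns 0) if the directory is writable by
# group or others; returns 1 if it is not, and 2 if stat fails.
is_group_or_other_writable() {
    local dir="$1" mode
    mode="$(stat -c '%a' "$dir")" || return 2   # e.g. 777 or 750 (GNU stat)
    # The leading 0 makes the shell read the mode as octal; 022 masks the
    # group-write (020) and other-write (002) bits.
    [ $(( 0$mode & 022 )) -ne 0 ]
}

# Example:
# is_group_or_other_writable /home/service && echo "too permissive"
```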
Resolution
Workaround Steps:
- Log in to the affected node as the service user and enter the service maintenance window by running the following command from the service (Linux) prompt:
sudo /usr/sbin/service-maintenance-window -f
- Check and change the permissions defined for the service directory on the affected node as follows:
Example: (Affected metro node node-A, as per the Symptoms section)
From:
director-1-1-a:/home # ll
total 28
drwxr-x--- 6 admin users 4096 Aug 20 2021 admin
drwxr-x--- 6 eseservice users 4096 Aug 20 2021 eseservice
drwx------ 2 root root 16384 Mar 3 2022 lost+found
drwxrwxrwx 11 service users 4096 May 20 05:08 service
To:
director-1-1-a:/home # chmod 750 service
director-1-1-a:/home # ll
total 28
drwxr-x--- 6 admin users 4096 Aug 20 2021 admin
drwxr-x--- 6 eseservice users 4096 Aug 20 2021 eseservice
drwx------ 2 root root 16384 Mar 3 2022 lost+found
drwxr-x--- 11 service users 4096 May 20 05:08 service
- SSH back to the affected node as the service user and confirm that you are no longer prompted for a password, as follows:
Example: (The output below shows that node-B can perform password-less SSH to node-A successfully)
service@director-1-1-b:~> ssh 128.221.252.35
Last login: Mon May 20 05:32:04 2024 from 10.107.104.132
service@director-1-1-a:~>
- Run the cluster status command again to confirm that no errors are reported:
VPlexcli:/> cluster status
Cluster cluster-1
operational-status: ok
transitioning-indications:
transitioning-progress:
health-state: ok
health-indications:
local-com: ok
man-com: ok
Cluster cluster-2
operational-status: ok
transitioning-indications:
transitioning-progress:
health-state: ok
health-indications:
local-com: ok
man-com: ok
wan-com: ok
Additional Information
*Using chmod in absolute mode*
In the absolute mode, permissions are represented in numeric form (octal system to be precise). In this system, each file permission is represented by a number.
r (read) = 4
w (write) = 2
x (execute) = 1
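- (no permission) = 0
Adding these values together gives a single digit that represents a full permission set for one class of user. A mode such as 750 has three digits: the first applies to the owner, the second to the group, and the third to others. For example, 750 means rwx (7) for the owner, r-x (5) for the group, and no access (0) for others.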
| Number | Permission |
|---|---|
| 0 | --- |
| 1 | --x |
| 2 | -w- |
| 3 (i.e. 2+1) | -wx |
| 4 | r-- |
| 5 (i.e. 4+1) | r-x |
| 6 (i.e. 4+2) | rw- |
| 7 (i.e. 4+2+1) | rwx |
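The table can be applied digit by digit to translate a numeric mode into the symbolic form shown by `ll`. The following is an illustrative sketch only; the function name `octal_to_symbolic` is hypothetical, and it handles only plain modes with digits 0 to 7 (no setuid, setgid, or sticky bits).

```shell
#!/bin/sh
# Hypothetical helper: convert an octal mode (e.g. 750) into its symbolic
# form (e.g. rwxr-x---), one digit at a time.
octal_to_symbolic() {
    local mode="$1" out="" digit
    while [ -n "$mode" ]; do
        digit="${mode%"${mode#?}"}"   # first character of the mode
        mode="${mode#?}"              # remaining characters
        # Test each bit: read = 4, write = 2, execute = 1.
        if [ $(( digit & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
        if [ $(( digit & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
        if [ $(( digit & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    done
    printf '%s\n' "$out"
}

octal_to_symbolic 750   # prints rwxr-x---
octal_to_symbolic 777   # prints rwxrwxrwx
```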