NetWorker: Identify Clients that Require Clearing Peer Information "Error-SSL protocol failure"
Summary: The NetWorker server's /nsr/logs/daemon.raw is flooded with "Unable to complete SSL handshake with nsrexecd on host 'CLIENT_NAME': An error occurred as a result of an SSL protocol failure." Aside from a possible connection issue, this makes parsing the logs for any other troubleshooting difficult. This article highlights steps that can be followed to clear this issue from both the server and client-side connection. ...
Instructions
In some situations, the NetWorker server's daemon.raw may be flooded with Generic Security Service (GSS) authentication connection errors between two NetWorker systems:
MM/DD/YYYY HH:MM:SS 5 13 9 3635926784 26586 0 NSR_HOSTNAME nsrexecd SSL critical Unable to complete SSL handshake with nsrexecd on host 'CLIENT_NAME': An error occurred as a result of an SSL protocol failure. To complete this request, ensure that the certificate attributes for CLIENT_NAME and NSR_NAME match in the NSRLA database on each host.
Or
MM/DD/YYYY HH:mm:SS 5 12 10 11256 2900 0 NSR_NAME nsrexecd GSS critical An authentication request from CLIENT_NAME was denied. The 'NSR peer information' provided did not match the one stored by NSR_NAME. To accept this request, delete the 'NSR peer information' resource with the following attributes from NSR_NAME's NSRLA database: name: CLIENT_NAME; NW instance ID: CLIENT_ID; peer hostname: CLIENT_NAME
MM/DD/YYYY HH:mm:SS 0 0 0 6384 6380 0 NSR_NAME nsrd NSR info Authentication Warning: Conflicting NSR peer information resources detected for host 'CLIENT_NAME'. Please check server daemon log for more information.
From the NetWorker server, run the following command from a root or Administrator command prompt:
nsradmin -C -y -p nsrexecd "nsr peer information"
See: NetWorker: How to clear NSR peer information mismatches automatically using nsradmin -C
This command checks each peer certificate resource in the NetWorker server's nsrladb and attempts to correct it. This operation must also be run on the clients reporting this issue. When many clients are affected, it becomes difficult to identify every host that requires correction.
The following process can be used to determine which systems require running nsradmin -C -y or may require manual peer info deletion.
Linux Hosts:
- Render the daemon.raw:
nsr_render_log -S "1 weeks ago" /nsr/logs/daemon.raw > /nsr/logs/daemon.out 2>&1
NOTE: This example only renders the last 1 week of messages. This avoids checking for peer issues which may no longer be occurring. Other filters are explained in: NetWorker: How to use nsr_render_log
- Create a file containing only the GSS authentication connection errors:
cat /nsr/logs/daemon.out | grep "SSL handshake" > GSS_error.out
Or:
cat /nsr/logs/daemon.out | grep "NSR peer information" > GSS_error.out
NOTE: Depending on the specific GSS authentication error observed, change the filter used by grep to collect the required output.
- Create a file containing only the client names from the GSS output file:
cat GSS_error.out | awk '{print $24}' | sort > client.out
This command uses awk to print only the field containing the client name from the full SSL connection error message. The position of the client name depends on the grep filter used and on the layout of the rendered log, so change the field number ($24) if the above example does not return the expected results.
- Review the sorted file using the uniq command to output only one instance of each client reporting this problem:
cat client.out | uniq
Example:
[root@nsrserver logs]# nsr_render_log daemon.raw > daemon.out 2>&1
[root@nsrserver logs]# cat daemon.out | grep "SSL handshake" > GSS_error.out
[root@nsrserver logs]# cat GSS_error.out | awk '{print $24}' | sort > client.out
[root@nsrserver logs]# cat client.out | uniq
'client1':
'client2':
'client3':
'client4':
'client5':
'client6':
The hostnames above have been changed for this example. Instead of the hundreds of entries in daemon.raw, the output lists one entry for each client reporting this behavior.
- Connect to the client systems reported using SSH or Remote Desktop Protocol (RDP) and use a root/Administrative command prompt to run:
nsradmin -C -y -p nsrexecd "nsr peer information"
Running this command on both the server and client should ensure that the nsrladb on each system contains the correct peer certificate information. If a mismatch is detected, the certificate is deleted and the next connection attempt between the server and client should generate a new one.
The nsradmin output shows which hosts have a mismatch and what action was taken.
Manually deleting peer information is detailed in article NetWorker: Fixing inconsistent NSR peer information
- The output files can be deleted once no longer needed:
rm -rf filename
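The awk field number in the steps above depends on the layout of the rendered log. As an alternative, the following minimal sketch matches the client name by the text around it instead of by column position. It assumes the rendered lines use the message wording shown earlier in this article (`on host 'CLIENT_NAME':` and `authentication request from CLIENT_NAME was denied`). The function names `list_peer_error_clients` and `sample_log`, and the sample hosts, are illustrative only and are not part of NetWorker.

```shell
#!/usr/bin/env bash
# Sketch: list each client named in SSL handshake / peer information errors,
# together with the number of matching messages.
# Matching is done on the message text, not on a column number.
# Assumption: the rendered log uses the message wording shown in this article.
list_peer_error_clients() {
    # sed -n with the p flag prints only lines where a substitution succeeded,
    # replacing the whole line with the captured client name.
    sed -n \
        -e "s/.*SSL handshake with nsrexecd on host '\([^']*\)'.*/\1/p" \
        -e "s/.*authentication request from \([^ ]*\) was denied.*/\1/p" \
        | sort | uniq -c | sort -rn
}

# Illustrative sample lines (hypothetical hosts) standing in for daemon.out.
sample_log() {
cat <<'EOF'
01/02/2024 10:00:01 5 13 9 3635926784 26586 0 nsrserver nsrexecd SSL critical Unable to complete SSL handshake with nsrexecd on host 'client1': An error occurred as a result of an SSL protocol failure.
01/02/2024 10:00:05 5 12 10 11256 2900 0 nsrserver nsrexecd GSS critical An authentication request from client1 was denied. The 'NSR peer information' provided did not match the one stored by nsrserver.
01/02/2024 10:01:00 5 13 9 3635926784 26586 0 nsrserver nsrexecd SSL critical Unable to complete SSL handshake with nsrexecd on host 'client2': An error occurred as a result of an SSL protocol failure.
01/02/2024 10:02:00 0 0 0 6384 6380 0 nsrserver nsrd NSR info unrelated message
EOF
}

sample_log | list_peer_error_clients
```

Against a real log, the input would be the rendered file, for example `list_peer_error_clients < /nsr/logs/daemon.out`. Clients with the most errors are listed first.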
Windows Hosts:
- Open a Windows PowerShell prompt as Administrator.
- Change directories to the NetWorker log directory:
cd "C:\Program Files\EMC NetWorker\nsr\logs"
The example assumes that the default install location is used. If you installed NetWorker in another location, modify the command accordingly.
- Render the daemon.raw:
nsr_render_log -S "1 weeks ago" daemon.raw > daemon.out
NOTE: This example only renders the last 1 week of messages. This avoids checking for peer issues which may no longer be occurring. Other filters are explained in: NetWorker: How to use nsr_render_log
- Create a file containing only the GSS authentication connection errors:
Select-String -Path .\daemon.out -pattern "SSL handshake" > GSS_error.out
Or:
Select-String -Path .\daemon.out -pattern "NSR peer information" > GSS_error.out
NOTE: Depending on the specific GSS authentication error observed, change the pattern used by Select-String to collect the required output.
- Generate output showing unique systems reporting the GSS authentication errors:
Get-Content .\GSS_error.out | %{ $_.Split(' ')[9]; } | Sort | Unique
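Depending on the pattern used, change the index number ([9]) to output the client names if the above example does not return the expected results. Entries in the output that are not hostnames, such as the numeric values in the example below, come from other log columns and can be ignored.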
Example:
PS C:\Program Files\EMC NetWorker\nsr\logs> Get-Content .\GSS_error.out | %{ $_.Split(' ')[9]; } | Sort | Unique
13120
13932
2808
2828
2856
2900
2920
2956
5716
6088
6328
6380
6772
6852
8196
9388
networker-mc.emclab.local
redhat.emclab.local
winsrvr.emclab.local
- Connect to the client systems reported using SSH or RDP and use a root/Administrative command prompt to run:
nsradmin -C -y -p nsrexecd "nsr peer information"
Running this command on both the server and client should ensure that the nsrladb on each system contains the correct peer certificate information. If a mismatch is detected, the certificate is deleted and the next connection attempt between the server and client should generate a new one.
The nsradmin output shows which hosts have a mismatch and what action was taken.
Manually deleting peer information is detailed in article NetWorker: Fixing inconsistent NSR peer information
- The output files can be deleted once no longer needed.
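Both procedures end by running the same nsradmin command on every reported client. Where those clients accept SSH connections from a Linux host, the commands can be generated from the client list rather than typed one by one. This is an assumption: Windows clients typically need RDP or another remote method instead. The sketch below only prints the commands for review and does not execute anything. The function name `print_peer_fix_commands`, the `root@` login, and the sample names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print (not execute) the nsradmin command for each affected client.
# Assumptions: clients are reachable over SSH as root, and the input has one
# name per line, optionally in the 'name': form produced by the awk step.
print_peer_fix_commands() {
    local raw client
    while IFS= read -r raw; do
        # Strip single quotes and a trailing colon: 'client1': -> client1
        client=$(printf '%s\n' "$raw" | tr -d "'")
        client=${client%:}
        [ -n "$client" ] || continue    # skip blank lines
        printf 'ssh root@%s %s\n' "$client" \
            "'nsradmin -C -y -p nsrexecd \"nsr peer information\"'"
    done
}

# Example input in the same form as the client list shown in this article.
printf "%s\n" "'client1':" "'client2':" | print_peer_fix_commands
```

Printing the commands first allows the list of hosts to be checked before any peer information is changed. On a Linux server the input could be the list created earlier, for example `uniq client.out | print_peer_fix_commands`.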