September 20th, 2017 18:00

Does anyone know the meaning of "isi_stats_d: connect() to devid 0 : Connection reset by peer"?

I often see the following message in /var/log/messages of some nodes.


>isi_stats_d[1893]: connect() to devid 0 192.168.0.7: Connection reset by peer

I know that "192.168.0.7" is the internal failover IP of node 7.


>nfs06-data-kks-7:  lo0: flags=8049 metric 0 mtu 16384

>nfs06-data-kks-7:  inet6 ::1 prefixlen 128 zone 1

>nfs06-data-kks-7:  inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5 zone 1

>nfs06-data-kks-7:  inet 127.0.0.1 netmask 0xff000000 zone 1

>nfs06-data-kks-7:  inet 192.168.0.7 netmask 0xffffff00 zone 1


Does it indicate a hardware or software failure?

If anyone knows what this message means, please let me know.


The cluster consists of 13 IQ108NL nodes running OneFS 7.1.1.11.

9 Posts

October 23rd, 2017 10:00

This message comes from the routine in which isi_stats_d connects to a remote socket. To keep cluster statistics up to date, isi_stats_d on each node connects to isi_stats_d on every other node. It makes these connections by node name, which resolves to the node's failover IP via /etc/hosts. The failover IP is then reached through either the int-a or int-b IP, depending on the entry in the routing table. If you see this error, it means the connect() call that isi_stats_d made to the remote socket failed (returned a non-zero value). This can have many causes, but the two most likely are:

  • The remote process is down or otherwise busy (isi_stats_d on the node being connected to may not be available to accept the incoming connection).
  • Backend communication failed for a period of time (due to congestion or some other type of failure).
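To illustrate how a failed connect() surfaces as a message like the one in the log, here is a minimal Python sketch. It is not isi_stats_d's code. The function names `classify_connect_error` and `probe` are made up for this example, and the port you pass in is up to you (this thread does not say which port isi_stats_d listens on). The suggested causes in the strings are only rough hints along the lines of the two situations above.

```python
import errno
import socket


def classify_connect_error(err_no):
    """Map the errno of a failed connect() to a rough, non-authoritative hint."""
    if err_no == errno.ECONNRESET:
        # This errno is reported as "Connection reset by peer".
        return "connection reset by peer (remote process busy or restarting?)"
    if err_no == errno.ECONNREFUSED:
        return "connection refused (nothing listening on that port?)"
    if err_no in (errno.EHOSTUNREACH, errno.ENETUNREACH):
        return "host or network unreachable (backend network problem?)"
    return "other error (errno=%s)" % err_no


def probe(host, port, timeout=2.0):
    """Attempt one TCP connection and return (ok, description)."""
    try:
        # create_connection() performs the connect() call under the hood.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        return False, "timed out (congestion or packet loss?)"
    except OSError as exc:
        return False, classify_connect_error(exc.errno)
```

For example, `probe("192.168.0.7", port)` with a port of your choosing returns a pair such as `(True, "connected")` on success, or `False` plus one of the hints above when the connection attempt fails.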

6 Posts

December 26th, 2017 01:00

jepstein-san,

Sorry for my late response. It's a very detailed explanation and I understand it. Thank you!
