Customer question about IB traffic?
Hi all,
The customer’s question is about ib0 (int-a) and ib1 (int-b) throughput on their cluster. IB traffic should normally be balanced across the cluster, but the customer reports that ib0/ib1 traffic on node 1 is much lower than on node 2 or 3.
# isi statistics history --stats=node.net.iface.bytes.in.rate.0,node.net.iface.bytes.out.rate.0
# isi statistics history --stats=node.net.iface.bytes.in.rate.1,node.net.iface.bytes.out.rate.1
I have just connected to the cluster, and node 1 is reporting unbalanced rates:
ISI00-EMAD11-1% sudo isi statistics query --nodes=all --stats=node.net.iface.name.0
Password:
NodeID node.net.iface.name.0
1 int-a
2 int-a
3 int-a
ISI00-EMAD11-1% sudo isi statistics query --nodes=all --stats=node.net.iface.name.1
NodeID node.net.iface.name.1
1 int-b
2 int-b
3 int-b
ISI00-EMAD11-1% sudo isi statistics query --nodes=all --stats=node.net.iface.bytes.in.rate.0
NodeID node.net.iface.bytes.in.rate.0
1 193.6
2 42249.8
3 139942.8
average 60795.4
ISI00-EMAD11-1% sudo isi statistics query --nodes=all --stats=node.net.iface.bytes.in.rate.1
NodeID node.net.iface.bytes.in.rate.1
1 558.0
2 165.0
3 1503.2
average 742.1
ISI00-EMAD11-1% sudo isi statistics query --nodes=all --stats=node.net.iface.bytes.out.rate.0
NodeID node.net.iface.bytes.out.rate.0
1 559.4
2 1135796.8
3 949409.4
average 695255.2
ISI00-EMAD11-1% sudo isi statistics query --nodes=all --stats=node.net.iface.bytes.out.rate.1
NodeID node.net.iface.bytes.out.rate.1
1 1025.2
2 0.0
3 652.2
average 559.1
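To make the imbalance easier to see, the figures above can be compared against the per-interface mean. This is a minimal Python sketch, assuming the output is pasted in as text in the two-column form shown above; `parse_rates` and `share_of_mean` are hypothetical helper names and not part of OneFS.

```python
def parse_rates(text):
    """Parse 'NodeID value' rows from pasted `isi statistics query` output."""
    rates = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # Keep only rows whose first field is a numeric node ID;
        # this skips the header row and the 'average' row.
        if len(parts) == 2 and parts[0].isdigit():
            rates[int(parts[0])] = float(parts[1])
    return rates


def share_of_mean(rates):
    """Return each node's rate as a fraction of the mean across nodes."""
    mean = sum(rates.values()) / len(rates)
    return {node: (rate / mean if mean else 0.0) for node, rate in rates.items()}


# int-a outbound sample, copied from the output above
OUT_RATE_0 = """
NodeID node.net.iface.bytes.out.rate.0
1 559.4
2 1135796.8
3 949409.4
average 695255.2
"""

rates = parse_rates(OUT_RATE_0)
for node, share in sorted(share_of_mean(rates).items()):
    print(f"node {node}: {rates[node]:>12.1f}  ({share:.2%} of mean)")
```

For this sample, node 1 comes out at less than 0.1% of the mean outbound rate on int-a, while nodes 2 and 3 are both above the mean.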
In addition, I have attached an Excel sheet with samples taken over one day; it includes three graphs showing that traffic on node 1 is unbalanced.
Is there a known explanation or a reported bug for this?
Thank you
Daniel
Peter_Sero
October 9th, 2015 00:00
We have seen these consistently inconsistent IB throughput statistics on OneFS 6.5,
as well as on 7.0, 7.1 and 7.2...
So apparently nobody cares (enough)... good luck
-- Peter
carlilek
October 10th, 2015 04:00
This seems normal to me. Unless the data is laid out completely evenly and is all accessed evenly, the traffic is going to differ between nodes. A critical question is whether all nodes have external networking connected and whether the connections are balanced there.
Peter_Sero
October 10th, 2015 04:00
The data usually /is/ laid out quite nicely, and so is the disk throughput per node.
The IB reports simply don't match (they don't even sum up correctly)
between the distributed back-end disk traffic and the NAS front-end traffic;
a certain subset of nodes always reports the IB flow way too low.
-- Peter
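A related sanity check can be run on the numbers from the original post. It is not the comparison described above, which would need disk and front-end figures that are not shown in this thread. The sketch assumes that on a closed back-end fabric whatever one node sends is received by another node, so the cluster-wide sum of outbound rates should be close to the sum of inbound rates. Note also that the samples came from separate commands run at slightly different times. The variable names are illustrative only.

```python
# Per-node rates for int-a (interface 0), copied from the original post
in_rate_0 = {1: 193.6, 2: 42249.8, 3: 139942.8}
out_rate_0 = {1: 559.4, 2: 1135796.8, 3: 949409.4}

total_in = sum(in_rate_0.values())
total_out = sum(out_rate_0.values())

# Under the stated assumption, consistent counters would give a ratio near 1.
ratio = total_out / total_in

print(f"sum in:  {total_in:.1f}")
print(f"sum out: {total_out:.1f}")
print(f"out/in:  {ratio:.1f}")
```

With these samples the outbound total on int-a is more than ten times the inbound total, which is hard to explain by sampling skew alone and is in line with the observation that the reported IB figures do not add up.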