Avamar: Slow NDMP backup and low avtar CPU use due to TCP window scaling
Summary: TCP window scaling
This article is not tied to any specific product.
Not all product versions are identified in this article.
Symptoms
Avamar backups of a NAS device over NDMP are running slower than expected.
The VNX/Celerra network interfaces are set to 10Gb/sec.
The Avamar NDMP Accelerator node network interface is set to 1Gb/sec.
The backup logs show that NDMP Accelerator node CPU usage is low during backup.
avtar Info <8688>: Status 2014-10-20 07:09:19, 83,476 files, 9,244 directories, 72.26 GB (83,476 files, 7.312 MB, 44.34% new) 592MB 7% CPU
avtar Info <8688>: Status 2014-10-20 07:24:19, 126,201 files, 13,423 directories, 80.16 GB (126,201 files, 10.14 MB, 44.95% new) 592MB 10% CPU
avtar Info <8688>: Status 2014-10-20 07:54:20, 187,013 files, 19,327 directories, 94.54 GB (187,013 files, 14.23 MB, 45.52% new) 600MB 8% CPU
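To follow CPU usage across a whole backup instead of reading individual lines, the status lines can be filtered with standard tools. The following is a minimal sketch, assuming every status line ends with "<n>% CPU" as in the excerpt above. The function names and the backup.log file name are hypothetical and are not part of Avamar.

```shell
#!/bin/sh
# Print the CPU percentage from each avtar "Status" line read on stdin.
# Assumption: the line ends with "<n>% CPU", as in the log excerpt above.
avtar_cpu_percent() {
    sed -n 's/.*avtar Info <[0-9]*>: Status .* \([0-9][0-9]*\)% CPU *$/\1/p'
}

# Print the average of the numbers read on stdin (one per line).
average() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}

# Example (backup.log is a placeholder for the real backup log):
#   avtar_cpu_percent < backup.log | average
```

A consistently low average over the backup window points to the accelerator waiting for data rather than being busy rechunking it.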
The NDMP protocol sends only changed files to the NDMP accelerator. The accelerator is then expected to do a significant amount of CPU work to rechunk the modified files.
If CPU usage is low, this indicates that data is being sent to the NDMP accelerator more slowly than is optimal.
A network trace taken between the two devices shows many TCP retransmissions.
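As a lighter first check before taking a full network trace, the kernel's TCP counters on the accelerator node can be sampled before and after a backup. This is only a sketch: it assumes a Linux node that exposes /proc/net/snmp, the function name is hypothetical, and the counters are system-wide rather than specific to the NDMP connection.

```shell
#!/bin/sh
# Print "OutSegs RetransSegs" from a file in /proc/net/snmp format.
# The first "Tcp:" line holds the column names, the second holds the values.
tcp_retrans_counters() {
    awk '
        $1 == "Tcp:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        $1 == "Tcp:" && have_header {
            print $(col["OutSegs"]), $(col["RetransSegs"])
            exit
        }
    ' "${1:-/proc/net/snmp}"
}

# Example: run once before the backup and once after, then compare.
#   tcp_retrans_counters
```

If the second number grows quickly relative to the first during the backup, a packet capture is worth the effort to confirm which connection is retransmitting.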
Cause
One side tries to force the other to an inappropriate speed.
Part of the normal TCP/IP negotiation between two devices is to find a mutually acceptable transmit and receive rate. Sometimes one side tries to optimize the connection speed but does so inappropriately.
If the switch runs at 10Gb/sec and the server at only 1Gb/sec, the switch can get into a state where it keeps trying to force the server NIC to communicate at 10Gb/sec.
The resulting renegotiation requests can cause slow performance.
It is common for VNX/Data Domain to be configured to use 10Gb/sec NICs.
It is also common for the Avamar NDMP Accelerator to be configured to use 1Gb/sec NICs.
This condition is harder to detect and may or may not show up as retransmission of data.
Often the only symptom is slow performance.
The Resolution section of this article shows how to turn off TCP window scaling so that remote attempts to increase the interface speed are ignored.
You can turn off TCP window scaling temporarily and test the result before making the change permanent.
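A temporary test can be sketched as follows: save the current value, disable window scaling, run a test backup, and restore the saved value if nothing improves. The function and variable names are hypothetical. The sketch assumes a Linux node and root privileges, and a change made this way does not survive a reboot.

```shell
#!/bin/sh
# Path of the kernel setting. It can be overridden so the functions can be
# tried against an ordinary file first.
WS_PATH="${WS_PATH:-/proc/sys/net/ipv4/tcp_window_scaling}"

# Save the current value, then disable window scaling (needs root).
ws_disable_temporarily() {
    WS_SAVED=$(cat "$WS_PATH") || return 1
    echo 0 > "$WS_PATH"
}

# Restore the value saved by ws_disable_temporarily.
ws_restore() {
    [ -n "$WS_SAVED" ] || return 1
    echo "$WS_SAVED" > "$WS_PATH"
}

# Example:
#   ws_disable_temporarily   # then start a new NDMP backup and compare speed
#   ws_restore               # only if the change did not help
```

Window scaling is negotiated when a TCP connection is opened, so only backups started after the change are affected.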
The auto-negotiated speed may be lower than the maximum possible speed.
Another situation that can occur is one side, for example the Avamar NDMP accelerator NIC, advertising 1Gb/sec while the link to the switch is negotiated at only 100Mb/sec, which limits performance by a factor of 10.
Below is an example of this issue:
As the root user, run:
# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes
Here 1000baseT/Full is supported and advertised, but the negotiated speed is only 100Mb/s.
For an unknown reason, the negotiation settled on a lower speed than the server can handle.
Rebooting the server typically resets this, after which ethtool reports the expected speed:
# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes
If a reboot does not restore the expected speed, there might be network issues limiting the speed or a problem with the switch.
Have the customer's network team review the situation.
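The comparison made by eye in the ethtool example can also be scripted. The sketch below reads ethtool output on stdin and compares the negotiated speed with the fastest advertised link mode. The function name and the output wording are hypothetical, and the sketch assumes the usual multi-line ethtool layout.

```shell
#!/bin/sh
# Read `ethtool <iface>` output on stdin and compare the negotiated speed
# with the fastest advertised link mode. A speed such as "Unknown!" is
# treated as 0 and is therefore reported as degraded.
check_link_speed() {
    awk '
        /Advertised link modes:/       { in_adv = 1 }
        /Advertised auto-negotiation:/ { in_adv = 0 }
        in_adv {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+base/) {
                    m = $i + 0          # numeric prefix, e.g. 1000 from 1000baseT/Full
                    if (m > max) max = m
                }
        }
        $1 == "Speed:" { speed = $2 + 0 }
        END {
            if (speed < max)
                print "DEGRADED: negotiated " speed "Mb/s, advertised up to " max "Mb/s"
            else
                print "OK: negotiated " speed "Mb/s"
        }
    '
}

# Example (run as root; eth0 is the interface used in this article):
#   ethtool eth0 | check_link_speed
```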
Resolution
If ethtool shows a speed lower than the maximum the NIC supports, reboot the server.
If the link speed is already correct and the switch runs at 10Gb/sec while the server runs at 1Gb/sec, disable TCP window scaling as described below.
With window scaling disabled, TCP flow control is activated before the network can become oversaturated.
To disable window scaling:
1) As the root user, run the following command to change the setting on the running system:
echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
2) To keep the setting after a reboot, add the following line to /etc/sysctl.conf:
net.ipv4.tcp_window_scaling = 0
3) Start a new NDMP backup.
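Step 2 can be made repeatable so that running it twice does not leave duplicate or conflicting entries. The sketch below is one way to do that. The function name is hypothetical, and the sketch assumes the setting lives in a single sysctl-style file, by default /etc/sysctl.conf.

```shell
#!/bin/sh
# Ensure the file named by $1 (default /etc/sysctl.conf) ends up with exactly
# one net.ipv4.tcp_window_scaling line, set to 0. Needs root for the default.
persist_ws_off() {
    conf="${1:-/etc/sysctl.conf}"
    tmp=$(mktemp) || return 1
    # Drop any existing line for this key, then append the wanted one.
    grep -v '^[[:space:]]*net\.ipv4\.tcp_window_scaling[[:space:]]*=' "$conf" > "$tmp" || true
    echo 'net.ipv4.tcp_window_scaling = 0' >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (as root):
#   persist_ws_off
#   sysctl -p                                # reload /etc/sysctl.conf
#   sysctl -n net.ipv4.tcp_window_scaling    # print the running value
```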
Affected Products
Avamar
Products
Avamar, Avamar Plug-in for NDMP
Article Properties
Article Number: 000051503
Article Type: Solution
Last Modified: 10 Feb 2025
Version: 4