Avamar: Slow NDMP backup and low avtar CPU use due to TCP window scaling

Summary: TCP window scaling


Symptoms

Avamar backups of a NAS device over NDMP are running slower than expected.

The VNX/Celerra network interfaces are set to 10Gb/sec.
The Avamar NDMP Accelerator node network interface is set to 1Gb/sec.

The backup logs show that NDMP Accelerator node CPU usage is low during backup.  

avtar Info <8688>: Status 2014-10-20 07:09:19, 83,476 files, 9,244 directories, 72.26 GB (83,476 files, 7.312 MB, 44.34% new) 592MB      7% CPU   
avtar Info <8688>: Status 2014-10-20 07:24:19, 126,201 files, 13,423 directories, 80.16 GB (126,201 files, 10.14 MB, 44.95% new) 592MB  10% CPU   
avtar Info <8688>: Status 2014-10-20 07:54:20, 187,013 files, 19,327 directories, 94.54 GB (187,013 files, 14.23 MB, 45.52% new) 600MB   8% CPU   


The NDMP protocol sends only changed files to the NDMP accelerator. The accelerator then does a significant amount of work to rechunk the modified files, so CPU usage is expected to be high during a backup.
If CPU usage is low, this indicates that data is being sent to the NDMP accelerator more slowly than is optimal.
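
The CPU figures in the status lines above can be pulled out of a backup log with a small filter. This is a minimal sketch only: the function name avtar_cpu_percent is hypothetical, and it assumes the status lines keep the "N% CPU" layout shown in the sample log.

```shell
# Print the CPU percentage from each avtar "Status" line read on stdin.
# Assumption: the line layout matches the sample log lines in this article.
avtar_cpu_percent() {
  awk '/avtar Info .*Status/ {
    for (i = 1; i < NF; i++)
      if ($(i + 1) == "CPU" && $i ~ /^[0-9]+%$/) {
        v = $i
        sub(/%/, "", v)   # strip the trailing percent sign
        print v
      }
  }'
}
```

For example, running avtar_cpu_percent with a saved backup log on standard input prints one value per status line. Consistently low values point to the accelerator waiting for data.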

If a network trace is performed between the two devices, one sees that many TCP re-transmissions are occurring.
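
Before taking a full network trace, the kernel TCP counters on the accelerator node can give a quick first indication. The sketch below assumes a Linux host that exposes /proc/net/snmp in its usual two-line header/value layout. The function name tcp_retrans_counters is hypothetical. These counters cover only segments retransmitted by this host, so a trace is still needed to see retransmissions from the NAS side.

```shell
# Read /proc/net/snmp-style text on stdin and print "OutSegs RetransSegs".
# The first "Tcp:" line holds the column names, the second holds the values.
tcp_retrans_counters() {
  awk '$1 == "Tcp:" {
    if (!have_header) {
      for (i = 2; i <= NF; i++) col[$i] = i   # map column name to position
      have_header = 1
    } else {
      print $(col["OutSegs"]), $(col["RetransSegs"])
    }
  }'
}

# Usage on the accelerator node (compare two samples taken during a backup):
#   tcp_retrans_counters < /proc/net/snmp
```

If the second number grows quickly relative to the first while a backup is running, retransmissions are likely contributing to the slowdown.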

Cause


One side tries to force the other to an inappropriate speed.
Part of the normal TCP/IP negotiation between two devices is to find a mutually acceptable transmit and receive speed. Sometimes one side or the other tries to optimize the connection speed but does so inappropriately.

If the switch runs at 10Gb/sec and the server only 1Gb/sec, the switch can get into a state where it keeps trying to force the server NIC to communicate at 10Gb/sec.
The resultant renegotiation requests can cause slow performance.
 
It is common for VNX/Data Domain to be configured to use 10Gb/sec NICs.
It is also common for the Avamar NDMP Accelerator to be configured to use 1Gb/sec NICs.

This mismatch is harder to detect and may or may not show up as retransmission of data.
Often it shows up only as slow performance.

In the Resolution section of this article we show how to turn off TCP window scaling so remote attempts to increase the interface speed are ignored.

You can try turning off TCP window scaling and testing before making the change permanent.
   
The auto-negotiated speed may be lower than the maximum possible speed.
Another situation that can occur is one side, for example the Avamar NDMP accelerator NIC, advertising a 1Gb/sec speed while the link to the switch is connected at only 100Mb/sec, limiting performance by a factor of 10.

Below is an example of this issue:

As the root user, run:
#  ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes

Here we see 1000BaseT/Full supported and advertised, but the Speed is only 100 Mb/s.

For an unknown reason, the negotiation settled on a lower speed than the server is capable of handling.
Rebooting the server typically resets this, after which ethtool reports the full speed:
# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes

If the speed is still lower than expected after the reboot, there might be network issues limiting the speed or a problem with the switch.

Have the customer network team review the situation.
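
The comparison made in the ethtool example above (negotiated speed against the fastest advertised link mode) can also be scripted. This is a rough sketch rather than a supported tool: the function name ethtool_speed_check is hypothetical, and it assumes the ethtool output follows the layout shown in this article.

```shell
# Read "ethtool <interface>" output on stdin and print:
#   DEGRADED <speed> <max>  if the negotiated speed is below the fastest advertised mode
#   OK <speed> <max>        otherwise
ethtool_speed_check() {
  awk '
    /Advertised link modes:/ { in_adv = 1 }
    !/Advertised link modes:/ && /:/ { in_adv = 0 }   # any other labelled line ends the list
    in_adv {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9]+base/) {
          n = $i
          sub(/base.*/, "", n)          # "1000baseT/Full" becomes "1000"
          if (n + 0 > max) max = n + 0
        }
    }
    /^[ \t]*Speed:/ { speed = $2 + 0 }  # "100Mb/s" becomes 100
    END {
      if (max > 0 && speed + 0 < max) print "DEGRADED", speed + 0, max + 0
      else print "OK", speed + 0, max + 0
    }'
}

# Usage as the root user:
#   ethtool eth0 | ethtool_speed_check
```

For the first ethtool output shown above, this would report the link as degraded (100 against an advertised maximum of 1000).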

Resolution

If ethtool shows a lower speed than the maximum speed the NIC supports, reboot the server.
If this is not the case, and the switch runs at 10Gb/sec while the server runs at 1Gb/sec, disable TCP window scaling as described below.

Disabling window scaling causes TCP flow control to activate before the network can become oversaturated.

To disable window scaling:

1) As the root user, run the following command:
echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

2) Add the following text to /etc/sysctl.conf so that the setting persists across reboots:
net.ipv4.tcp_window_scaling = 0

3) Start a new NDMP backup.
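
The /etc/sysctl.conf change in step 2 can be made repeatable, so that running it twice does not leave duplicate lines. The helper below is a sketch under stated assumptions: the name set_sysctl_conf is hypothetical, and it relies on GNU sed (the -i and -E options).

```shell
# Set "key = value" in a sysctl-style file: replace an existing entry, or append one.
set_sysctl_conf() {
  conf=$1 key=$2 value=$3
  # Escape dots so the key is matched literally in the regular expressions below.
  key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$conf" 2>/dev/null; then
    sed -i -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" "$conf"
  else
    printf '%s = %s\n' "$key" "$value" >> "$conf"
  fi
}

# Usage as the root user on the accelerator node:
#   set_sysctl_conf /etc/sysctl.conf net.ipv4.tcp_window_scaling 0
#   cat /proc/sys/net/ipv4/tcp_window_scaling    # should print 0 after step 1
```

Reading /proc/sys/net/ipv4/tcp_window_scaling before starting the new backup confirms that the running kernel has picked up the change.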

Affected Products

Avamar

Products

Avamar, Avamar Plug-in for NDMP
Article Properties
Article Number: 000051503
Article Type: Solution
Last Modified: 10 Feb 2025
Version: 4