ECN-APJ

July 29th, 2015 22:00

Ask The Expert “Network Diagnostics And Tracing” Summary

Introduction

This article summarizes the Chinese ATE activity "Network Diagnostics And Tracing".

Detailed Information

Q1: What is the troubleshooting process of common TCP issues?

Answer:

For TCP connectivity issues:

· If a SYN packet is answered with an RST, look for a firewall that blocks the port, or check whether the service is actually listening on it.

· Three SYNs in a row without any answer (the original plus its retransmissions) point either to an application that did not respond or to a firewall that blocks requests on that specific port.

· Always verify whether Network Address Translation (NAT), port forwarding, or other mechanisms that manipulate TCP or UDP ports are in the path. These mechanisms can interfere with the standard operation of TCP.
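
The first two checks above can be reproduced from a client without a packet capture. The following is a minimal sketch, assuming plain IPv4 and an arbitrary 3-second timeout; the function name `probe_tcp_port` is only illustrative and not part of the original discussion.

```python
import errno
import socket


def probe_tcp_port(host, port, timeout=3.0):
    """Classify one TCP connection attempt to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"        # the SYN was answered with SYN/ACK
    except ConnectionRefusedError:
        return "refused"     # the SYN was answered with RST
    except socket.timeout:
        return "no answer"   # SYNs went unanswered until the timeout
    except OSError as exc:
        # Anything else, for example "network unreachable"
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()
```

A result of "refused" corresponds to the SYN/RST case, and "no answer" to the repeated-SYN case. Neither result says which device on the path is responsible, so a capture is still needed to locate it.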

For TCP retransmission issues:

When you see retransmissions on a communication link (to the Internet, on a server, between sites, or any other link), perform the following:

· Locate the problem: is it tied to a specific IP address, a specific connection, a specific application, or something else?

· Check if the problem is because of the communication link, packet loss, or a slow server or PC. Check if the application is slow.

· If it is not due to any of the preceding reasons, check for delay variations.
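
To locate the problem, it helps to count retransmissions per connection. The sketch below assumes the capture has already been exported to a list of `(src, dst, seq, payload_len)` tuples, where `src` and `dst` are "ip:port" strings; that input format and the function name are assumptions for illustration. It treats any repeated sequence number on a data segment as a retransmission, which ignores corner cases such as keep-alives.

```python
from collections import Counter


def retransmissions_by_connection(segments):
    """Count repeated (src, dst, seq) data segments per connection.

    segments: iterable of (src, dst, seq, payload_len) tuples in capture
    order. This input format is a hypothetical export, not a tool's API.
    """
    seen = set()
    retrans = Counter()
    for src, dst, seq, payload_len in segments:
        if payload_len == 0:
            continue  # pure ACKs carry no data, so they are skipped
        key = (src, dst, seq)
        if key in seen:
            retrans[(src, dst)] += 1  # same data seen again on this flow
        else:
            seen.add(key)
    return retrans
```

If one address or one connection dominates the counts, the problem is probably local to that host or application; counts spread evenly across all connections point to the link itself.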

For TCP Dup ACK and Out-of-Order issues:

· A reasonable share of duplicate ACKs, that is, 1 or 2 percent, is probably not your problem.

· When you see a huge number of duplicate ACKs (say ten of them), you might have:

o A very busy communication line that causes variations in delays

o A non-responsive server or client (depends on who is not responding)

· A fast retransmission is a packet that is sent in response to the duplicate ACKs.
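
The duplicate ACK share can be estimated with a few lines of code. This sketch assumes a list of ACK numbers taken from the pure ACK segments of one endpoint, in capture order. The threshold of three duplicates before a fast retransmission is standard TCP behavior and is not stated in the discussion above; the function name is illustrative.

```python
def duplicate_ack_stats(acks, dup_threshold=3):
    """Return (duplicate_count, duplicate_ratio, fast_retransmit_points).

    acks: list of ACK numbers from one endpoint's pure ACK segments.
    """
    duplicates = 0
    run = 0                      # duplicates seen for the current ACK number
    fast_retransmit_points = []  # ACK numbers that reached the threshold
    previous = None
    for ack in acks:
        if ack == previous:
            duplicates += 1
            run += 1
            if run == dup_threshold:
                fast_retransmit_points.append(ack)
        else:
            run = 0
            previous = ack
    ratio = duplicates / len(acks) if acks else 0.0
    return duplicates, ratio, fast_retransmit_points
```

For example, `duplicate_ack_stats([100, 200, 200, 200, 200, 300])` reports three duplicates and one point (ACK number 200) where the sender would be expected to fast-retransmit.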

Q2: How can WeChat (a messaging and social media app) traffic be captured and analyzed? How can other network packets be filtered out in a packet capture tool?

Answer:

The following steps usually help to isolate the packets:

1. Capture the packets on a clean machine.

2. Disable or kill all other apps.

3. Capture first without running any apps, so that you know which packets are present all the time.

4. Isolate the packets with the “ping” command.
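
Step 3 yields a baseline of hosts that talk even when no app is running. One way to hide them afterwards is a display filter built from that baseline. The sketch below assumes Wireshark display filter syntax and a list of baseline IP addresses collected by hand; both are assumptions, since the discussion does not name a capture tool.

```python
def exclusion_filter(baseline_addresses):
    """Build a Wireshark-style display filter that hides baseline hosts."""
    unique = sorted(set(baseline_addresses))
    terms = ["!(ip.addr == {})".format(addr) for addr in unique]
    return " && ".join(terms)


# Hypothetical baseline addresses seen while no app was running
print(exclusion_filter(["10.0.0.5", "10.0.0.1", "10.0.0.5"]))
```

The resulting string can be pasted into the display filter bar, leaving only the traffic that appeared after the app was started.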

Q3: What is Selective Acknowledgment (SACK)?

Answer:

Multiple packet losses from a window of data can have a catastrophic effect on TCP throughput. TCP uses a cumulative acknowledgment scheme in which received segments that are not at the left edge of the receive window are not acknowledged. This forces the sender to either wait a roundtrip time to find out about each lost packet, or to unnecessarily retransmit segments which have been correctly received. With the cumulative acknowledgment scheme, multiple dropped segments generally cause TCP to lose its ACK-based clock, reducing overall throughput.

Selective Acknowledgment (SACK) is a strategy which corrects this behavior in the face of multiple dropped segments. With selective acknowledgments, the data receiver can inform the sender about all segments that have arrived successfully, so the sender need retransmit only the segments that have actually been lost.
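
The benefit is easiest to see from the sender's side. Given the cumulative ACK and the SACK blocks reported by the receiver, the sender can work out exactly which byte ranges are missing. This is a simplified sketch with half-open byte ranges and an illustrative function name; real TCP stacks keep more state than this.

```python
def sack_holes(cumulative_ack, sack_blocks):
    """Return the byte ranges [start, end) that the receiver is missing.

    cumulative_ack: next byte the receiver expects (left window edge).
    sack_blocks: list of (left, right) ranges received out of order,
                 with the right edge exclusive.
    """
    holes = []
    cursor = cumulative_ack
    for left, right in sorted(sack_blocks):
        if left > cursor:
            holes.append((cursor, left))  # gap before this SACK block
        cursor = max(cursor, right)
    return holes
```

With a cumulative ACK of 1000 and SACK blocks (1500, 2000) and (3000, 4000), only the ranges 1000-1500 and 2000-3000 need to be retransmitted. Without SACK the sender learns only about byte 1000.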

This is great over a high-bandwidth, lossy (or high-delay) link. The problem is that it can cause severe performance issues in specific circumstances. With normal TCP ACKs, the server treats a high-bandwidth, lossy connection with kid gloves (send 500 bytes, wait, send 500 bytes, wait, and so on). SACK lets it adapt to the high delay because it knows exactly how many packets were actually lost.

Here is where bad things can happen. An attacker can force your server to keep a massive retransmission queue for a long time, then process that whole queue over and over again. This can peg the CPU, eat up RAM, and consume more bandwidth than it should. In a nutshell, a lightweight system can initiate a DoS against a beefier server.

· If your server is robust and doesn't serve large files, you're pretty well insulated against this.

· If you're mostly serving an intranet or other low-latency group of users, SACK buys you nothing and can be turned off for security reasons with no performance loss.

· If you're on a low-bandwidth link (say 1Mbps or less as a completely arbitrary rule of thumb), SACK can cause problems in normal operations by saturating your connection and should be turned off.

Ultimately, it's up to you. Consider what you're serving, to whom, from what, and weigh the degree of your risk against the performance effects of SACK.
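
Before deciding, it is worth checking the current setting. The sketch below assumes a Linux server, where the setting is exposed at `/proc/sys/net/ipv4/tcp_sack`; other operating systems use different mechanisms that are not covered here.

```python
from pathlib import Path


def sack_enabled(path="/proc/sys/net/ipv4/tcp_sack"):
    """Return True or False for the SACK setting, or None if unreadable."""
    try:
        value = Path(path).read_text().strip()
    except OSError:
        return None  # not Linux, or the file is not accessible
    return value == "1"


print(sack_enabled())
```

On such a system the value is usually changed through the `net.ipv4.tcp_sack` sysctl; test the effect on a non-production machine first.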

Q4: What is TCP Delayed-Ack?

Answer:

Delayed ACK

A central precept of the TCP network protocol is that data sent through TCP must be acknowledged by the recipient. According to RFC 813, "Very simply, when data arrives at the recipient, the protocol requires that it send back an acknowledgement of this data. The protocol specifies that the bytes of data are sequentially numbered, so that the recipient can acknowledge data by naming the highest numbered byte of data it has received, which also acknowledges the previous bytes." The TCP packet that carries the acknowledgement is known as an ACK.

A host receiving a stream of TCP data segments can increase efficiency in both the network and the hosts by sending fewer than one ACK segment per data segment received. This is known as a delayed ACK. The common practice is to send an ACK for every other full-sized data segment and not to delay the ACK for a segment by more than a specified threshold. This threshold varies between 100 ms and 500 ms. ESXi/ESX uses delayed ACK because of its benefits, as do most other servers.
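
The "every other segment, or when the timer expires" rule can be modeled in a few lines. This is a simplified sketch that assumes full-sized segments, times in milliseconds, and a 200 ms threshold chosen arbitrarily from the range above; the function name is illustrative.

```python
def delayed_ack_times(arrivals_ms, max_delay_ms=200):
    """Return the times (ms) at which the receiver sends ACKs.

    arrivals_ms: sorted arrival times of full-sized data segments.
    """
    ack_times = []
    pending_since = None  # arrival time of a segment still awaiting an ACK
    for t in arrivals_ms:
        if pending_since is not None and t - pending_since >= max_delay_ms:
            # The timer expired before this segment arrived.
            ack_times.append(pending_since + max_delay_ms)
            pending_since = None
        if pending_since is None:
            pending_since = t      # first segment of a pair: start the timer
        else:
            ack_times.append(t)    # second segment: ACK immediately
            pending_since = None
    if pending_since is not None:
        ack_times.append(pending_since + max_delay_ms)  # timer fires last
    return ack_times
```

For arrivals at 0, 10, 20, and 500 ms, the receiver sends ACKs at 10 ms (second segment of a pair), 220 ms (timer expiry), and 700 ms (timer expiry for the final, unpaired segment).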

Slow start and Congestion avoidance

Congestion can occur when there is a mismatch of data processing capabilities between two elements of the network leading from the source to the destination. Congestion manifests itself as a delay, timeout, or packet loss. To avoid and recover from congestion, TCP uses two algorithms—the congestion avoidance algorithm and the slow start algorithm. Although the underlying mechanisms of these two algorithms differ, the underlying concept is that when congestion occurs, the TCP sender must slow down its transmission rate and then increase the rate as retransmitted data segments are acknowledged.

When congestion occurs, the typical recovery sequence for TCP/IP networks that use delayed ACK and slow start is:

1. The sender detects congestion because it did not receive an ACK within the retransmission timeout period.

2. The sender retransmits the first data segment and waits for the ACK before sequencing the remaining segments for retransmission.

3. The receiver receives the retransmitted data segment and starts the delayed ACK timer.

4. The receiver transmits the ACK when the delayed ACK timer times out. During this waiting period, there are no other transmissions between the sender and receiver.

5. Having received the ACK, the sender then retransmits the next two data segments back to back.

6. The receiver, upon receiving the second data segment, promptly transmits an ACK.

7. The sender, upon receiving the ACK, retransmits the next four data segments back to back.

8. This sequence continues until the congestion period passes and the network returns to a normal traffic rate.

In this recovery sequence, the longest lag time comes from the initial delayed ACK timer in step 3.
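
The cost of that initial timer can be made concrete with an idealized model of the sequence: one segment first, then bursts that double in size, with the delayed ACK timer only affecting the lone first segment. The round-trip time and timer values below are assumptions for illustration, and the model ignores further loss.

```python
def recovery_timeline(total_segments, rtt_ms, delayed_ack_ms):
    """Return (start_time_ms, segments_in_burst) per retransmission burst."""
    timeline = []
    now = 0
    burst = 1
    remaining = total_segments
    while remaining > 0:
        sent = min(burst, remaining)
        timeline.append((now, sent))
        remaining -= sent
        # Only the single first segment waits out the delayed ACK timer;
        # later bursts are modeled as being acknowledged promptly.
        extra = delayed_ack_ms if burst == 1 else 0
        now += rtt_ms + extra
        burst *= 2
    return timeline


# Hypothetical values: 7 segments, 10 ms round trip, 200 ms ACK timer
print(recovery_timeline(7, 10, 200))
```

With these values the bursts start at 0 ms, 210 ms, and 220 ms, so 200 of the 220 ms are spent waiting for the first delayed ACK.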

Author: Roger
