How does TCP deal with timeouts with cwnd? - tcp

I've been researching TCP congestion control recently, however one question plagues me...
If I understand everything correctly, TCP will not send NEW data unless allowed by the cwnd (congestion window) and rwnd (the receiving side's window). In other words:
if(flightSize < MIN(cwnd, rwnd))
{
// Send some new data (if possible)
// Taking into account other details that we don't need
// to get into such as Nagle's algorithm, etc.
}
Where flightSize is the amount of data that has been sent but not yet acknowledged.
Let us assume that TCP is going along, sending data, and increasing cwnd as appropriate. Let's say cwnd = [10 full packets], and the flightSize == cwnd. Then packet loss occurs in the network, and the sender's retransmission timer goes off. How/When does New Reno retransmit the unacknowledged data?
Here's my current understanding/misunderstanding:
When the timer goes off, the cwnd will be reset to [1 full packet], the oldest sent but unacknowledged packet will be resent, the rto will be doubled, and the retransmission timer will be reset. So if we say the rto was 1 second when the timer went off, it will get updated to 2 seconds, and the retransmission timer will get started again with a wait time of 2 seconds.
Here is why I'm confused:
In the above situation, TCP will resend only a single packet. Even if that packet gets ACKed right away, TCP cannot send any NEW data because cwnd is still less than the flightSize. So what does it do? Sit around and wait until the 2 second retransmission timer goes off again before it resends another packet? Does it force a resend of the old data since it can't send new data? Does it reset the flightSize, and reconsider all previously sent data to be unsent?
I've read all the RFC's I could find, and all kinds of guides and explanations of TCP. I must have missed something somewhere...
Clarification:
I was considering multiple losses, where TCP is not using SACK.
If duplicate acks are received, TCP will resend the oldest ack on the 3rd duplicate ack (fast retrasmit) and will send new data on and after the 4th duplicate ack (fast recovery). My question concerns what happens if the TCP sender gets less than 3 dup acks?

I found the answer in the book "TCP/IP Illustrated, Volume 2", section 25.11, pages 842-844:
[On a retransmission timeout] the next
send sequence number (snd_nxt) is set
to the oldest unacknowledged sequence
number (snd_una). ... By moving
snd_nxt back, [TCP can begin to
retransmit all unacknowledged data].
In other words, the flightSize will get reset, so data can continue to be sent (in slow start mode). It's just that some of this data may be data that has already been sent before. A cumulative ack might come along that prevents all data from being resent though.

Request for clarification: are you considering a single packet loss? Or multiple losses within a window?
In a single loss case, there will be duplicate acknowledgements received because of packets received after the lost one. I believe New Reno will transmit subsequent packets ("NEW data") in response to the duplicate acks. This then resets the timeout timer.

Related

Alternating bit protocol delayed packet

In the alternating bit protocol, how does the receiver know the difference between a delayed packet and a correct one. For example if the sender sends a packet with seq#0, it gets severely delayed on the way there, so much so that the sender and receiver have completed 2 packets in between and the receiver is expecting the next packet with seq#0, and instead receives the delayed one. Should the receiver have a temporary storage of the last few packets to compare if it's just delayed or are there other ways to check?

How do delayed acknowledgements affect TCP's congestion avoidance phase?

From what I studied, congestion avoidance phase sets CWND = CWND + MSS * (MSS/CWND) every time a new acknowledgment is received. This is assuming we don't encounter duplicate ACKS or timeouts. But what happens if there are delayed acknowledgements ?
Here's what I think from research on delayed acks (no idea if this is correct):
Basically Delayed ACK is the destination retaining the ACK segment for a period of time expecting one of two things.
Either there will be more ACKS will be required to be sent before the timer is up because of new packets recieved by the receiver. OR the receiver will need to send some data back to the sender in which case it can piggy back the message on that packet.
How does this affect the congestion avoidance phase ?
This would be bad for congestion avoidance phase of TCP which depends on new Acks to increase CWND. This would cause delays in CWND window size change thus causing delay in the sending of packets. This means by the time that TCP could be sending packets to the receiver, it is actually not because acknowledgments are being delayed.
This affects the congestion avoidance phase the same way it affects the other phase (SS) : it will slow down the traffic. However, keep in mind there are two different network uses, the interactive one (such as telnet), and the bulk one. Delayed Acks are likely to be used with interactive protocols sending very small amounts of data, but this can bring new problems if Nagle's algorithm is used by the other side. When unsure, just disable delayed Acks.
That is a really good question. Since they keep the bottleneck buffer full, delayed Ack is not a big problem for traditional congestion control algorithms such as Reno and CUBIC.
For TCP Vegas which tries to keep the queue small, it is still not a problem because if it faces delayed Ack, it will reduce cwnd very slowly (one unit every RTT). Therefore a Vegas connection will not suffer from under-utilization unless delayed ack lasts for a very long time.

TCP keep-alive gets involved after TCP zero-window and closes the connection erroneously

We're seeing this pattern happen a lot between two RHEL 6 boxes that are transferring data via a TCP connection. The client issues a TCP Window Full, 0.2s later the client sends TCP Keep-Alives, to which the server responds with what look like correctly shaped responses. The client is unsatisfied by this however and continues sending TCP Keep-Alives until it finally closes the connection with an RST nearly 9s later.
This is despite the RHEL boxes having the default TCP Keep-Alive configuration:
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75
...which declares that this should only occur until 2hrs of silence. Am I reading my PCAP wrong (relevant packets available on request)?
Below is Wireshark screenshot of the pattern, with my own packet notes in the middle.
Actually, these "keep-alive" packets are not used for TCP keep-alive! They are used for window size updates detection.
Wireshark treats them as keep-alive packets just because these packets look like keep-alive packet.
A TCP keep-alive packet is simply an ACK with the sequence number set to one less than the current sequence number for the connection.
(We assume that ip 10.120.67.113 refers to host A, 10.120.67.132 refers to host B.) In packet No.249511, A acks seq 24507484. In next packet(No.249512), B send seq 24507483(24507484-1).
Why there are so many "keep-alive" packets, what are they used for?
A sends data to B, and B replies zero-window size to tell A that he temporarily can't receive data anymore. In order to assure that A knows when B can receive data again, A send "keep-alive" packet to B again and again with persistence timer, B replies to A with his window size info (In our case, B's window size has always been zero).
And the normal TCP exponential backoff is used when calculating the persist timer. So we can see that A send its first "keep-alive" packet after 0.2s, send its second packet after 0.4s, the third is sent after 0.8, the fouth is sent after 1.6s...
This phenomenon is related to TCP flow control.
The source and destination IP addresses in the packets originating from client do not match the destination and source IP addresses in the response packets, which indicates that there is some device in between the boxes doing NAT. It is also important to understand where the packets have been captured. Probably a packet capture on the client itself will help understand the issue.
Please note that the client can generate TCP keepalive if it does not receive a data packet for two hours or more. As per RFC 1122, the client retries keepalive if it does not receive a keepalive response from the peer. It eventually disconnects after continuous retry failure.
The NAT devices typically implement connection caches to maintain the state of ongoing connections. If the size of the connection reaches limit, the NAT devices drops old connections in order to service the new connections. This could also lead to such a scenario.
The given packet capture indicates that there is a high probability that packets are not reaching the client, so it will be helpful to capture packets on client machine.
I read the trace slightly differently:
Sender sends more data than receiver can handle and gets zerowindow response
Sender sends window probes (not keepalives it is way to soon for that) and the application gives up after 10 seconds with no progress and closes the connection, the reset indicates there is data pending in the TCP sendbuffer.
If the application uses a large blocksize writing to the socket it may have seen no progress for more than the 10 seconds seen in the tcpdump.
If this is a straight connection (no proxies etc.) the most likely reason is that the receiving up stop receiving (or is slower than the sender & data transmission)
It looks to me like packet number 249522 provoked the application on 10.120.67.113 to abort the connection. All the window probes get a zero window response from .132 (with no payload) and then .132 sends (unsolicited) packet 249522 with 63 bytes (and still showing 0 window). The PSH flag suggests that this 63 bytes is the entire data written by the app on .132. Then .113 in the same millisecond responds with an RST. I can't think of any reason why the TCP stack would send a RST immediately after receiving data (sequence numbers are correct). In my view it is almost certain that the app on .113 decided to give up based on the 63 byte message sent by .132.

Can transmition of a packet finish before the first bit has reached the reciever?

I should find a combination of the lenght of a packet, bandwidth and the link lenght and then find out if there is a combination for which transmitting time of a packet finishes before the first bit of the packet has reached the receiver. Is this even possible?
TCP or UDP?
TCP will require to receive a response from the destination before it actually starts sending the packet thus it won't be possible here.
UDP has no concept of knowing whether or not the packet got received, which means that as soon as the packet has left the sender there is no further communication between the two.
Your question is worded ambiguously though: how can you talk about 'transmission time' (which implies the time between sending and receiving) while comparing that to the 'receiving time' (which is already part of the transmission time).?

How can a file transfer go so quickly if the size of a packet cannot exceed 1500 bytes?

When downloading a file from a Web Site, speeds of many megabytes per second can be acheived. If TCP needs to break up and individually send packets over 1500 bytes, then how are these speeds possible? Doesn't the client have to wait for every 1500 byte fragment which should take a while?
Thanks
Doesn't the client have to wait for every 1500 byte fragment which
should take a while
No. That's the magic of TCP, you don't have to ACK every segment, you can ACK once in a while. The server can push lots of segments before the client positively must acknowledge at least some.
TCP uses a concept called "windows". A sender can push data into a window, causing it to shrink. The receiver acknowledges data, causing the window to expand. If the receiver doesn't acknowledge data, the transfer grinds to a halt.
In modern TCP knowing when to acknowledge data is the gist of the protocol. Doing it too often or not often enough has enormous impacts on performance.

Resources