I am trying to learn how TCP Flow Control works when I came across the concept of receive window.
My question is, why is the TCP receive window scale-able? Are there any advantages from implementing a small receive window size?
Because as I understand it, the larger the receive window size, the higher the throughput. While the smaller the receive window, the lower the throughput, since TCP will always wait until the allocated buffer is not full before sending more data. So doesn't it make sense to have the receive window at the maximum at all times to have maximum transfer rate?
My question is, why is the TCP receive window scale-able?
There are two questions there. Window scaling is the ability to multiply the scale by a power of 2 so you can have window sizes > 64k. However the rest of your question indicates that you are really asking why it is resizeable, to which the answer is 'so the application can choose its own receive window size'.
Are there any advantages from implementing a small receive window size?
Not really.
Because as I understand it, the larger the receive window size, the higher the throughput.
Correct, up to the bandwidth-delay product. Beyond that, increasing it has no effect.
While the smaller the receive window, the lower the throughput, since TCP will always wait until the allocated buffer is not full before sending more data. So doesn't it make sense to have the receive window at the maximum at all times to have maximum transfer rate?
Yes, up to the bandwidth-delay product (see above).
A small receive window ensures that when a packet loss is detected (which happens frequently on high collision network),
No it doesn't. Simulations show that if packet loss gets above a few %, TCP becomes unusable.
the sender will not need to resend a lot of packets.
It doesn't happen like that. There aren't any advantages to small window sizes except lower memory occupancy.
After much reading around, I think I might just have found an answer.
Throughput is not just a function of receive window. Both small and large receive windows have their own benefits and harms.
A small receive window ensures that when a packet loss is detected (which happens frequently on high collision network), the sender will not need to resend a lot of packets.
A large receive window ensures that the sender will not be idle a most of the time as it waits for the receiver to acknowledge that a packet has been received.
The receive window needs to be adjustable to get the optimal throughput for any given network.
Related
I was writing a TCP implementation, did all the fancy slow and fast retransmission stuff, and it all worked so I thought I was done. But then I reviewed my packet receive function (almost half of the 400 lines total code), and realized that my understanding of basic flow control is incomplete...
Suppose we have a TCP connection with a "sender" and "receiver". Suppose that the "sender" is not sending anything, and the receiver is stalling and then unstalling.
Since the "sender" is not sending anything, the "receiver" sees no ack_no delta. So the two window updates from the "receiver" look like:
ack_no = X, window = 0
ack_no = X, window = 8K
since both packets have the same ack_no, and they could be reordered in transit, how does the sender know which came first?
If the sender doesn't know which came first, then, after receiving both packets, how does it know whether it's allowed to send?
One guess is that maybe the window's upper endpoint is never allowed to decrease? Once the receiver has allocated a receive buffer and advertised it, it can never un-advertise it? In that case the window update could be reliably handled via the following code (assume no window scale, for simplicity):
// window update (https://stackoverflow.com/questions/63931135/)
int ack_delta = pkt_ack_no - c->tx_sn_ack;
c->tx_window = MAX(BE16(PKT.l4.window), c->tx_window - ack_delta);
if (c->tx_window)
Net_Notify(); // wake up transmission
But this is terrible from a receiver standpoint: it vastly increases the memory you'd need to support 10K connections reliably. Surely the protocol is smarter than that?
There is an assumption that the receive buffer never shrinks, which is intentionally undocumented to create an elite "skin in the game" club in order to limit the number of TCP implementations.
The original standard says that shrinking the window is "discouraged" but doesn't point out that it can't work reliably:
The mechanisms provided allow a TCP to advertise a large window and
to subsequently advertise a much smaller window without having
accepted that much data. This, so called "shrinking the window," is
strongly discouraged.
Even worse, the standard is actually missing the MAX operation proposed in the question, and just sets the window from the most recent packet if the acknowledgement number isn't increasing:
If SND.UNA < SEG.ACK =< SND.NXT, the send window should be
updated. If (SND.WL1 < SEG.SEQ or (SND.WL1 = SEG.SEQ and
SND.WL2 =< SEG.ACK)), set SND.WND <- SEG.WND, set
SND.WL1 <- SEG.SEQ, and set SND.WL2 <- SEG.ACK.
Note that SND.WND is an offset from SND.UNA, that SND.WL1
records the sequence number of the last segment used to update
SND.WND, and that SND.WL2 records the acknowledgment number of
the last segment used to update SND.WND. The check here
prevents using old segments to update the window.
so it will fail to grow the window if packets having the same ack number are reordered.
Bottom line: implement something that actually works robustly, not what's in the standard.
I'm sending 1k data using TCP/IP (using FreeRTOS + LwiP). From documents I understood that TCP/IP protocol has its flow control inside its stack itself, but this flow control is dependent on the Network buffers. I'm not sure how this can be handled in my scenario which is described below.
Receive data of 1k size using TCP/IP from wifi (this data rate will be in 20Mb/s)
The received Wifi data is put into a queue of 10k size10 block, each block having a size of 1K
From the queue, each block is taken and send to another interface at lower rate 1Mb/s
So in this scenario, do I have to implement flow control manually between data from wifi <-> queue? How can I achieve this?
No you do not have to implement flow control yourself, the TCP algorithm takes care of it internally.
Basically what happens is that when a TCP segment is received from your sender LwIP will send back an ACK that includes the available space remaining in its buffers (the window size). Since the data is arriving faster than you can process it the stack will eventually send back an ACK with a window size of zero. This tells the sender's stack to back off and try again later, which it will do automatically. When you get around to extracting more data from the network buffers the stack should re-ACK the last segment it received, only this time it opens up the window to say that it can receive more data.
What you want to avoid is something called silly window syndrome because it can have a drastic effect on your network utilisation and performance. Try to read data off the network in big chunks if you can. Avoid tight loops that fill a buffer 1-byte at a time.
It seems the maximum TCP receive window size is 1GB (when scaling is used). So then the largest RTT that would still make it possible to fill a 100Gb pipe with one connection is 40ms (because 2 * 40E-3 * 100E9 / 8 = 1GB). That would limit that sort of communication speed to a distance IRO 10000 kilometres.
Another scaling problem seems to be that 32-bit sequence numbers don't offer protection against duplicated packets delayed by more than about 400ms (because they wrap around in that amount of time). They also limit the window size to 2GB (because they need to be split between the sender and receiver window).
Three questions:
I am aware of TCP timestamps that can help solve the problem of sequence numbers, but I would like to know if that is a feature that just happens to help but was really designed for some other purpose. Also, I don't understand what it is that timestamps achieve that could not be done simply by increasing the number of bits used for sequence numbers.
I don't understand why the maximum receive window is just 1GB as opposed to 2GB that would presumably be trivially possible with the current headers.
Finally, I would like to know if TCP already scales well enough to be used over the sort of links that are supposedly coming soon.
Many thanks.
The TCP features you're talking about were specified in RFC 1323 in the early 1990s. The limitations you're encountering are justified by discussion text in the RFC:
The sequence number appears in the middle of the TCP segment header and could not have been lengthened without an incompatible change.
Using timestamps allows for the protocol to simultaneously measure round-trip time and protect against wrapped sequence numbers. Making the sequence number bigger would not provide any information about round-trip time.
You need the timestamps in order to measure round-trip time accurately. Measuring round-trip time without timestamps is a sampling problem, and the sampling becomes unsolvable due to aliasing if you get more than 1 error per window.
A 1 GB receive window is the largest that can be kept in sync across the connection. The RFC explains it about as well as can be done:
TCP determines if a data segment is "old" or "new" by testing
whether its sequence number is within 2**31 bytes of the left edge
of the window, and if it is not, discarding the data as "old". To
insure that new data is never mistakenly considered old and vice-
versa, the left edge of the sender's window has to be at most
2**31 away from the right edge of the receiver's window.
Similarly with the sender's right edge and receiver's left edge.
Since the right and left edges of either the sender's or
receiver's window differ by the window size, and since the sender
and receiver windows can be out of phase by at most the window
size, the above constraints imply that 2 * the max window size
must be less than 2**31, or
max window < 2**30
As Jonathon mentioned earlier, these limitations are per-TCP connection. It's tough to think of a scenario where a single application could reach the limits of a single TCP connection, and tougher to think of one where the application couldn't open additional connection(s) if needed.
I have a peripheral over USB that is sending data samples at a rate of 183 MBit/s. I would like to send this data over ethernet, which is limited to < 100 Mbit/s. Is it possible to send this data without overflow (i.e missing data) by increasing the TCP socket buffer?
It also depends on the receiver window size. Even if there is 100mbits, sender will push data depending on the window size available on the receiver. TCP window size without scaling enabled can go only upto 64kb. In your case, this size is not sufficient as it needs at least (100-183Mbits)10MB buffer. In Windows 7 & newer Linux OS, TCP by default enables window scaling which can extend the size upto 1GB. After enabling the TCP window scaleing option, you can increase the socket buffer to a bigger size say 50MB which should provide the required buffering.
The short answer is, it depends.
Increasing buffers (at transmitter) can help if the data is bursty. If the average rate is <100MBit (actually less, you need to allow for network contention and overhead), then buffering can help. You can do this by increasing the size of the buffers internally to the TCP stack, or by buffering internally to your application.
If the data isn't bursty, or the average is still too high, you might need to compress the data before transmission. Dependant on the nature of the data, you may be able to achieve significant compression.
Assume we talking about the situation of many senders sending packets to a receiver.
Often senders would be the one that control congestion by using sliding window that limits sending rate.
We have:
snd_cwnd = min(cwnd,rwnd)
Using explicit or implicit feedback information from network (router,switch), sender would control cwnd to control sending rate.
Normally, rwnd is always big enough that sender only care about cwnd. But if we consider rwnd, using it to limit snd_cwnd, it would make congestion control more efficiently.
rwnd is the number of packets (or bytes) that receiver be able to receive. What I'm concerned about is capability of senders.
Questions:
1. So how do receiver know how many flows sending packets to it?
2. Is there anyway that receiver know the snd_cwnd of sender?
This is all very confused.
The number of flows into a receiver isn't relevant to the rwnd of any specific flow. The rwnd is simply the amount of space left in the receive buffer for that flow.
The receiver has no need to know the sender's cwnd. That's the sender's problem.
Your statement that 'normally rwnd is always big enough that sender only cares about cwnd' is simply untrue. The receive window changes with every receive; it is re-advertised with every ACK; and it frequently drops to zero.
Your following statement 'if we consider rwnd, using it to limit cwnd ...' is simply a description of what already happens, as per 'snd_cwnd = min(cwnd, rwnd)'.
Or else it may constitute a completely unexplained proposal to needlessly modify TCP's flow control which has been working for 25 years, and which didn't work for several years before that: I remember several Arpanet freezes in the middle 1980s.