TCP congestion control - Fast Recovery in graph - networking
I've been reading the book "Computer Networking: A Top Down Approach" and encountered a question I don't seem to understand.
As I read, TCP Congestion Control has three states: Slow Start, Congestion Avoidance and Fast Recovery. I understand Slow Start and Congestion Avoidance well, but Fast Recovery is pretty vague. The book claims that TCP behaves this way (cwnd = congestion window):
Let's look at the following graph:
As we can see, at round 16 the sender sends 42 segments, and because the congestion window size was then halved (+3), we can infer that there were 3 duplicate ACKs. The answer to this question claims that the rounds between 16 and 22 are in the Congestion Avoidance state. But why not Fast Recovery? I mean, after three duplicate ACKs TCP enters Fast Recovery, and every further duplicate ACK should increase the congestion window. Why does the graph show no sign of that? The only reasonable explanation I could think of is that in this graph there were only three duplicate ACKs, and the ACKs received after that were not duplicates. Even if that's the case, what would the graph have looked like if there had been more than 3 duplicate ACKs?
Is there any representation of Fast Recovery in the graph above? Why or why not?
I've been struggling to answer that question for a long time. I'll be very glad for any reply. Thank you!
UPDATE here's the image. I think a round is defined as when all segments in the window are being ACKed. In the photo, a round is shown with a circle.
Why does cwnd grow exponentially when in Fast Recovery state? (in the image I accidentally wrote expediently instead of exponentially)
UPDATE: My original answer agreed with the solution, but after careful thought, I think the solution is wrong. This answer was rewritten from scratch; please read it carefully. I show why Fast recovery is entered at time T=16 and why the protocol remains there until T=22. The data in the graph backs my theory, so I'm pretty much positive that the solution is plain wrong.
Let's start by setting something straight: Slow Start grows exponentially; congestion avoidance grows linearly, and fast recovery grows linearly even though it uses the same formula as slow start to update the value of cwnd.
Allow me to clarify.
Why do we say that Slow Start grows cwnd exponentially?
Note that cwnd is increased by MSS bytes for each ACK received.
Let's see an example. Suppose cwnd is initialized to 1 MSS (the value of MSS is typically 1460 bytes, so in practice this means cwnd is initialized to 1460). At this point, because the congestion window size can only hold 1 packet, TCP will not send new data until this packet is acknowledged. Assuming that ACKs aren't being lost, this implies that approximately one new packet is transferred every RTT seconds (recall that RTT is the round-trip time), since we need (1/2)*RTT to send the packet, and (1/2)*RTT for the ACK to arrive.
So, this results in a send rate of roughly MSS/RTT bytes per second. Now, remember that for each ACK, cwnd is incremented by MSS. Thus, once the first ACK arrives, cwnd becomes 2*MSS, so now we can send 2 packets. When these two packets are acknowledged, we increment cwnd twice, so now cwnd is 4*MSS. Great! We can send 4 packets. These 4 packets are acknowledged, so we get to increment cwnd 4 times! So we have cwnd = 8*MSS. And then we get cwnd = 16*MSS. We are essentially doubling cwnd every RTT seconds (this also explains why cwnd = cwnd+MSS*(MSS/cwnd) in Congestion Avoidance leads to linear growth).
Yes, it's tricky, the formula cwnd = cwnd+MSS easily leads us to believe it's linear - a common misconception, because people often forget this is applied for each acknowledged packet.
Note that in the real world, transmitting 4 packets doesn't necessarily generate 4 ACKs. It may generate 1 ACK only, but since TCP uses cumulative ACKs, that single ACK is still acknowledging 4 packets.
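The doubling described above can be checked with a minimal sketch. It assumes the idealized model used in this answer (every segment sent in a round is acknowledged one round later, no losses); the function name is purely illustrative.

```python
def slow_start_rounds(rounds, initial_cwnd=1):
    """Return cwnd (in MSS) at the start of each round during slow start."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        acked = cwnd              # every segment sent this round is ACKed
        for _ in range(acked):
            cwnd += 1             # +1 MSS per acknowledged segment
    return history


print(slow_start_rounds(6))  # [1, 2, 4, 8, 16, 32]
```

The per-ACK update is linear, but it is applied cwnd times per round, which is what makes the per-round growth exponential.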
Why is Fast Recovery linear?
The cwnd = cwnd+MSS formula is applied in both Slow Start and Fast Recovery. One would think that this causes both states to induce exponential growth. However, Fast Recovery applies that formula in a different context: when a duplicate ACK is received. Herein lies the difference: in slow start, one RTT acknowledges a whole bunch of segments, and each acknowledged segment contributes +1 MSS to the new value of cwnd, whereas in fast recovery, a duplicate ACK is wasting an RTT to signal the loss of a single segment. So instead of updating cwnd N times every RTT seconds (where N is the number of segments transmitted), we are updating cwnd once for a segment that was LOST. We "wasted" one round trip with just one segment, so we only increment cwnd by 1.
About congestion avoidance - this one I'll explain below when analysing the graph.
Analysing the graph
Ok, so let's see exactly what happens in that graph, round by round. Your picture is correct up to some degree. Let me clear some things first:
When we say that Slow Start grows exponentially, it means cwnd grows exponentially round by round, as you show in your picture. So, that is correct. You correctly identified the rounds with blue circles: notice how the values of cwnd grow exponentially from one circle to the next - 1, 2, 4, 8, 16, ...
Your picture seems to say that after Slow Start, the protocol enters Fast Recovery. This is not what happens. If it went to Fast Recovery from Slow Start, we would see cwnd being halved. That's not what the graph shows: the value of cwnd does not decrease to half from T=6 to T=7.
Ok, so now let's see exactly what happens on each round. Note that the time unit in the graph is a round. So, if at time T=X we transmit N segments, then it is assumed that at time T=X+1 these N segments have been ACKed (assuming they weren't lost, of course).
Also note how we can tell the value of ssthresh just by looking at the graph. At T=6, cwnd stops growing exponentially and starts growing linearly, and its value does not decrease. The only possible transition from slow start to another state that doesn't involve decreasing cwnd is the transition to congestion avoidance, which happens when the congestion window size is equal to ssthresh. We can see in the graph that this happens when cwnd is 32. So, we immediately know that ssthresh is initialized to 32 MSS. The book shows a very similar graph on page 276 (Figure 3.53), where the authors draw a similar conclusion:
Under normal conditions, this is what happens - when TCP switches for the first time from an exponential growth to a linear growth without decreasing the size of the window, it's always because it hit the threshold and switched to congestion avoidance.
Finally, assume that MSS is at least 1460 bytes (it is commonly 1460 bytes because Ethernet has MTU = 1500 bytes and we need to account for the size of the TCP + IP headers, which together need 40 bytes). This is important to see when cwnd exceeds ssthresh, since cwnd's unit is MSS and ssthresh is expressed in bytes.
So here we go:
T = 1:
cwnd = 1 MSS; ssthresh = 32 kB
Transmit 1 segment
T = 2
1 segment acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 2
Transmit 2 segments
T = 3
2 segments acknowledged
cwnd += 2; ssthresh = 32 kB
New value of cwnd: 4
Transmit 4 segments
T = 4
4 segments acknowledged
cwnd += 4; ssthresh = 32 kB
New value of cwnd: 8
Transmit 8 segments
T = 5
8 segments acknowledged
cwnd += 8; ssthresh = 32 kB
New value of cwnd: 16
Transmit 16 segments
T = 6
16 segments acknowledged
cwnd += 16; ssthresh = 32 kB
New value of cwnd: 32
Transmit 32 segments
Ok, let's see what happens now. cwnd reached ssthresh (32*1460 = 46720 bytes, which is greater than 32000). It's time to switch to congestion avoidance. Note how the values of cwnd grow exponentially across rounds, because each acknowledged packet contributes 1 MSS to the new value of cwnd, and every packet sent is acknowledged in the next round.
The switch to congestion avoidance
Now, cwnd will not increase exponentially, because each ACK won't contribute with 1 MSS anymore. Instead, each ACK contributes with MSS*(MSS/cwnd). So, for example, if MSS is 1460 bytes and cwnd is 14600 bytes (so at the beginning of each round we are sending 10 segments), then each ACK (assuming one ACK per segment) will increase cwnd by 1/10 MSS (146 bytes). Since we send 10 segments, and at the end of the round we assume that every segment was acknowledged, then at the end of the round we have increased cwnd by 10 * 1/10 = 1. In other words, each segment contributes a small fraction to cwnd such that we just increment cwnd by 1 MSS each round. So now each round increments cwnd by 1 rather than by the number of segments that were transferred / acknowledged.
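As a rough check of the per-ACK arithmetic, here is a small sketch that applies cwnd += MSS*(MSS/cwnd) once per ACK over one round. It assumes MSS = 1460 bytes and one ACK per segment, as in the example above; the function name is illustrative.

```python
MSS = 1460  # bytes, the value assumed in the text


def congestion_avoidance_round(cwnd_bytes, mss=MSS):
    """Apply cwnd += MSS * (MSS / cwnd) once per ACK for one full round."""
    segments = int(cwnd_bytes // mss)   # segments in flight this round
    for _ in range(segments):
        cwnd_bytes += mss * mss / cwnd_bytes
    return cwnd_bytes


start = 10 * MSS                        # 14600 bytes, 10 segments per round
end = congestion_avoidance_round(start)
print((end - start) / MSS)              # close to 1 MSS of growth per round
```

Because cwnd grows a little with each ACK inside the round, the total comes out slightly under 1 MSS; the "exactly +1 per round" used in the trace is the usual simplification.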
We will remain in congestion avoidance until some loss is detected (either 3 duplicate ACKs or a timeout).
Now, let the clocks resume...
T = 7
32 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 33
Transmit 33 segments
Note how cwnd went from 32 to 33 even though 32 segments were acknowledged (each ACK therefore contributes 1/32). If we were in slow start, as in T=6, we would have cwnd += 32. This new value of cwnd is also consistent with what we see in the graph at time T = 7.
T = 8
33 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 34
Transmit 34 segments
T = 9
34 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 35
Transmit 35 segments
Notice that this is consistent with the graph: at T=9, we have cwnd = 35. This keeps happening up to T = 16...
T = 10
35 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 36
Transmit 36 segments
T = 11
36 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 37
Transmit 37 segments
T = 12
37 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 38
Transmit 38 segments
T = 13
38 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 39
Transmit 39 segments
T = 14
39 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 40
Transmit 40 segments
T = 15
40 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 41
Transmit 41 segments
T = 16
41 segments acknowledged
cwnd += 1; ssthresh = 32 kB
New value of cwnd: 42
Transmit 42 segments
PAUSE
What happens now? The graph shows that the congestion window size decreases to approximately half of its size, and then it grows linearly across rounds again. The only possibility is that there were 3 duplicate ACKs and the protocol switches to Fast recovery. The graph shows that it does NOT switch to slow start because that would bring cwnd down to 1. So the only possible transition is to fast recovery.
By entering fast recovery, we get ssthresh = cwnd/2. Remember that cwnd is measured in MSS and ssthresh in bytes, so we have to be careful with units. Thus, the new value is ssthresh = cwnd*MSS/2 = 42*1460/2 = 30660.
Again, this lines up with the graph; notice that ssthresh will be hit in the near future when cwnd is slightly less than 30 (recall that with MSS = 1460, the ratio is not exactly 1:1, that's why we hit the threshold even though the congestion window size is slightly below 30).
The switch to fast recovery also causes the new value of cwnd to be ssthresh+3MSS = 21+3 = 24 (remember to be careful with units, here I converted ssthresh into MSS again because our values of cwnd are counted in MSS).
As of now, we are in fast recovery, with T=17, ssthresh = 30660 bytes and cwnd = 24.
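The unit juggling on three duplicate ACKs can be sketched as follows, assuming MSS = 1460 bytes as elsewhere in this answer. The function name is hypothetical; it simply mirrors the arithmetic in the text.

```python
MSS = 1460  # bytes


def on_triple_duplicate_ack(cwnd_mss, mss=MSS):
    """Return (new ssthresh in bytes, new cwnd in MSS) after 3 duplicate ACKs."""
    ssthresh_bytes = cwnd_mss * mss // 2     # ssthresh = cwnd / 2, in bytes
    ssthresh_mss = ssthresh_bytes // mss     # convert back to MSS units
    new_cwnd_mss = ssthresh_mss + 3          # cwnd = ssthresh + 3 MSS
    return ssthresh_bytes, new_cwnd_mss


print(on_triple_duplicate_ack(42))  # (30660, 24)
```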
Upon entering T=18, two things can happen: either we receive a duplicate ACK, or we don't. If we don't (so it's a new ACK), we would transition to congestion avoidance. But this would bring cwnd down to the value of ssthresh, which is 21. That wouldn't match the graph - the graph shows that cwnd keeps increasing linearly. Also, it doesn't switch to slow start because that would bring cwnd down to 1. This implies that fast recovery isn't left and we are getting duplicate ACKs. This happens up to time T=22:
T = 18
Duplicate ACK arrived
cwnd += 1; ssthresh = 30660 bytes
New value of cwnd: 25
T = 19
Duplicate ACK arrived
cwnd += 1; ssthresh = 30660 bytes
New value of cwnd: 26
T = 20
Duplicate ACK arrived
cwnd += 1; ssthresh = 30660 bytes
New value of cwnd: 27
T = 21
Duplicate ACK arrived
cwnd += 1; ssthresh = 30660 bytes
New value of cwnd: 28
T = 22
Duplicate ACK arrived
cwnd += 1; ssthresh = 30660 bytes
New value of cwnd: 29
PAUSE
We are still in Fast recovery, and now, suddenly cwnd goes down to 1. This shows that it enters slow start again. The new value of ssthresh will be 29*1460/2 = 21170, and cwnd = 1. It also means that despite our efforts to retransmit the segment, there was a timeout.
T = 23
cwnd = 1; ssthresh = 21170 bytes
Transmit 1 segment
T = 24
1 segment acknowledged
cwnd += 1; ssthresh = 21170 bytes
New value of cwnd: 2
Transmit 2 segments
T = 25
2 segments acknowledged
cwnd += 2; ssthresh = 21170 bytes
New value of cwnd: 4
Transmit 4 segments
T = 26
4 segments acknowledged
cwnd += 4; ssthresh = 21170 bytes
New value of cwnd: 8
Transmit 8 segments
...
I hope that makes it clear.
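For reference, the whole round-by-round trace above can be reproduced with a small state machine. This is only a sketch of the reading given in this answer, not a faithful TCP Reno implementation: cwnd and ssthresh are kept in whole MSS (so 21170 bytes is rounded down to 14 MSS), one event is processed per round, and the event names are made up for illustration.

```python
def reno_trace(events, ssthresh=32):
    """Return cwnd (in MSS) per round, starting at T=1, given one event per later round."""
    cwnd = 1
    state = "slow_start"
    trace = [cwnd]
    for event in events:
        if event == "timeout":
            ssthresh = cwnd // 2
            cwnd = 1
            state = "slow_start"
        elif event == "3dup":                 # three duplicate ACKs
            ssthresh = cwnd // 2
            cwnd = ssthresh + 3
            state = "fast_recovery"
        elif event == "dup":                  # one more duplicate ACK
            if state == "fast_recovery":
                cwnd += 1
        else:                                 # "ack": previous round fully ACKed
            if state == "fast_recovery":
                cwnd = ssthresh
                state = "congestion_avoidance"
            elif state == "slow_start":
                cwnd *= 2
                if cwnd >= ssthresh:
                    state = "congestion_avoidance"
            else:
                cwnd += 1
        trace.append(cwnd)
    return trace


# T=2..6 slow start, T=7..16 congestion avoidance, T=17 three duplicate ACKs,
# T=18..22 further duplicate ACKs, T=23 timeout, T=24..26 slow start again.
events = ["ack"] * 15 + ["3dup"] + ["dup"] * 5 + ["timeout"] + ["ack"] * 3
print(reno_trace(events))
```

With these events the trace is 1, 2, 4, 8, 16, 32, then 33 up to 42, then 24 up to 29, then 1, 2, 4, 8, which is the sequence of values listed in the walkthrough.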
In TCP Reno (the version of TCP that includes Fast Recovery), a cwnd (congestion window) graph should look like this:
Fast Recovery lasts only one RTT, between Slow Start and Congestion Avoidance.
If, like the graph in the book "Computer Networking: A Top Down Approach", we just use a straight line at T=16 to represent the Fast Recovery process, then cwnd at T=17 should be 21 MSS instead of (21+3) MSS, because on the transition from Fast Recovery to Congestion Avoidance, cwnd drops to the value of ssthresh. So the graph in the book is wrong, and @Filipe Gonçalves's answer is wrong, too.
There is another graph, a timeline trace of the sender and receiver, which may also help you understand the Fast Recovery process.
References:
1. http://www.ijcse.com/docs/INDJCSE17-08-03-113.pdf
2. https://www.isi.edu/nsnam/DIRECTED_RESEARCH/DR_WANIDA/DR/JavisInActionFastRecoveryFrame.html
Related
Serial point to point protocol but with 8 bytes instead of 16
I was looking at answers in Simple serial point-to-point communication protocol and it doesn't help me enough with my issue. I am also trying to communicate data between a computer and an 8-bit microcontroller at first, then eventually I want the one microcontroller to communicate with about 40 others via wireless radio modules. Basically one is designated as a master and the rest are slaves.
Speed is an issue
The issue at hand is speed, because every packet needs to be exchanged at least 4x a second back and forth between the master and each slave. Let's assume the baud rate for data is 9600 bps. That's 960 bytes a second.
If I used 16-byte packets then: 40 (slaves) times 16 (bytes) times 2 (ways) = 1280 bytes. Divide that by 960 and one full cycle takes about 1.33 seconds. Not good.
If I used 8-byte packets then: 40 (slaves) times 8 (bytes) times 2 (ways) = 640 bytes. Divide that by 960 and one full cycle takes about 2/3 of a second. It's so-so.
But the thing is I need to watch my baud rate, because too high a baud rate might mean missed data at larger distances, but you can see the speed difference between an 8 and a 16 byte packet.
Packet format idea
In my design, I may need to transmit a number in the low millions, so that will use 24 bits, which fits my idea. But here's my initial idea:
Byte 1: Recipient address 0-255
Byte 2: Sender address 0-255
Byte 3: Command
Byte 4-6: Data
Byte 7-8: 16-bit Fletcher checksum of the above data
I don't mind if the above format is adjusted, just as long as I have at least 6 bits to identify the sender and receiver (since I'll only deal with 40 units), and the data with command included is at least 4 bytes total.
How should I modify my data packet idea so that even a device that just turned on in the middle of reception can be in sync with the next set of data? Is there a way without stripping a bit from each data byte?
Rely on the checksum! My packet would consist of:
Recipient's address (0..40) XORed with 0x55
Sender's address (0..40) XORed with 0xAA
Command byte
Data byte 0
Data byte 1
Data byte 2
CRC8 sum, as suggested by Vroomfondel
Every receiver should have a sliding window of the last seven received bytes. When a byte is shifted in, that window should be checked for validity:
Are the two addresses in the valid range?
Is it a valid command?
Is the CRC correct?
Especially the last check should safely reject packets that the receiver hopped onto out of sync.
If you have fewer than 32 command codes, you may go down to six bytes per packet: 40 [senders] times 40 [receivers] times 32 [commands] evaluates to 51200, which would fit into 16 bits instead of 24.
Don't forget to turn off the parity bit!
Update 2017-12-09: Here is a receiving function:

    #include <stdint.h>

    typedef uint8_t U8;

    void ByteReceived(U8 Byte)
    {
        static U8 Buf[7];       //Bytes received so far
        static U8 BufBC = 0;    //Number of bytes currently in Buf[]

        Buf[BufBC++] = Byte;
        if (BufBC < 7) return;  //Msg incomplete

        /*** Seven Byte Message received ***/
        //Check Addresses
        U8 Adr;
        Adr = Buf[0] ^ 0x55; if (Adr >= 40) goto Fail;
        Adr = Buf[1] ^ 0xAA; if (Adr >= 40) goto Fail;
        if (Buf[2] > ???) goto Fail;                 //Check Cmd
        if (CalcCRC8(Buf, 6) != Buf[6]) goto Fail;   //Check CRC over first six bytes
        Evaluate(...);
        BufBC = 0;              //empty Buf[]
        return;

    Fail:
        //Seven Byte Msg invalid -> chop off first byte, could use memmove()
        Buf[0] = Buf[1];
        Buf[1] = Buf[2];
        Buf[2] = Buf[3];
        Buf[3] = Buf[4];
        Buf[4] = Buf[5];
        Buf[5] = Buf[6];
        BufBC = 6;
    }
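To see the resynchronisation behaviour on a desk rather than on the microcontroller, the same sliding-window idea can be modelled in a few lines of Python. Everything beyond the answer's description is an assumption made for this sketch: the CRC-8 polynomial 0x07 with a zero initial value, the command limit MAX_CMD = 31, and the names crc8, make_packet and Receiver.

```python
MAX_ADDR = 40   # addresses below 40 are accepted, as in the `Adr >= 40` check
MAX_CMD = 31    # hypothetical upper bound for valid command codes


def crc8(data, poly=0x07):
    """Bitwise CRC-8 (assumed polynomial 0x07, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_packet(recipient, sender, cmd, data):
    """Build a 7-byte packet: masked addresses, command, 3 data bytes, CRC."""
    body = bytes([recipient ^ 0x55, sender ^ 0xAA, cmd]) + bytes(data)
    return body + bytes([crc8(body)])


class Receiver:
    def __init__(self):
        self.window = bytearray()   # last (up to) seven received bytes
        self.packets = []           # decoded (recipient, sender, cmd, data)

    def feed(self, byte):
        self.window.append(byte)
        if len(self.window) < 7:
            return                  # message incomplete
        w = self.window
        valid = ((w[0] ^ 0x55) < MAX_ADDR and (w[1] ^ 0xAA) < MAX_ADDR
                 and w[2] <= MAX_CMD and crc8(w[:6]) == w[6])
        if valid:
            self.packets.append((w[0] ^ 0x55, w[1] ^ 0xAA, w[2], bytes(w[3:6])))
            self.window.clear()     # start collecting the next packet
        else:
            del self.window[0]      # drop the oldest byte and keep sliding


pkt = make_packet(3, 1, 2, [0x01, 0x02, 0x03])
rx = Receiver()
for b in pkt[4:] + pkt:             # receiver switched on mid-packet
    rx.feed(b)
print(rx.packets)                   # [(3, 1, 2, b'\x01\x02\x03')]
```

In this run the three misaligned windows are already rejected by the address checks, so the receiver locks onto the first complete packet.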
Sequence Number Calculation TCP
Two hosts A and B are communicating with each other using TCP. Assume that the sequence number field starts at 0 and the receiver employs cumulative ACKs. A has successfully sent 465 bytes of data, which were also ACKed by B. Suppose A now sends 3 segments of sizes 110, 40 and 60 bytes. What sequence number will the third segment carry?
This is very simple to work out, and it sounds a lot like a homework problem. I usually won't answer these, but... Remember that the initial SYN consumes 1 byte in the connection. This means that the initial SYN with sequence number zero is ACKed as 1. We now transfer 465 bytes. This means that the last sequence number ACKed will be 466, and 466 will now appear as the sequence number from A to B. We now send 110 bytes. The sequence number in the packet will be 466 with a data payload of 110. The ACK will be for 576. Following this, 40 more bytes are sent. This will have a sequence number of 576 in the packet with 40 bytes of payload and the ACK will be for 616. That brings us to the last segment. The sequence number in the segment should be 616, as long as I've done the maths correctly in my head, and this is the sequence number in the packet that you are asking about. The ACK for that will be for 676.
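The running sum in this answer can be written as a tiny helper. It is a sketch under the answer's assumption that the SYN consumes sequence number 0; the parameter syn_consumed is introduced here only so the other convention can be tried.

```python
def segment_sequence_numbers(bytes_already_acked, segment_sizes, syn_consumed=1):
    """Return the sequence number carried by each new segment."""
    seq = syn_consumed + bytes_already_acked   # first byte not yet sent
    result = []
    for size in segment_sizes:
        result.append(seq)
        seq += size                            # next segment starts after this payload
    return result


print(segment_sequence_numbers(465, [110, 40, 60]))  # [466, 576, 616]
```

If an exercise instead numbers the first data byte as 0, passing syn_consumed=0 shifts every value down by one.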
Sliding window protocol, calculation of sequence number bits
I am preparing for my exams and was solving problems regarding the Sliding Window Protocol when I came across these questions.
A 1000 km long cable operates at 1 MBps. Propagation delay is 10 microsec/km. If the frame size is 1 kB, how many bits are required for the sequence number? A) 3 B) 4 C) 5 D) 6
I got option C as the answer, as follows:
The propagation time is 10 microsec/km, so for 1000 km it is 10*1000 microsec, i.e. 10 millisec, and the RTT will be 20 millisec.
In 10^3 millisec, 8*10^6 bits are sent, so in 20 millisec, X bits: X = 20*(8*10^6)/10^3 = 160*10^3 bits.
Now, 1 frame is of size 1 kB, i.e. 8000 bits, so the total number of frames will be 20. This will be the window size. Hence, to represent 20 frames uniquely we need 5 bits.
The answer was correct as per the answer key. Then I came across this one:
Frames of 1000 bits are sent over a 10^6 bps duplex link between two hosts. The propagation time is 25 ms. Frames are to be transmitted into this link to maximally pack them in transit (within the link). What is the minimum number of bits (l) that will be required to represent the sequence numbers distinctly? Assume that no time gap needs to be given between transmission of two frames. (A) l=2 (B) l=3 (C) l=4 (D) l=5
Following the earlier approach, I solved this one as follows:
The propagation time is 25 ms, so the RTT will be 50 ms.
In 10^3 ms, 10^6 bits are sent, so in 50 ms, X bits: X = 50*(10^6)/10^3 = 50*10^3 bits.
Now, 1 frame is of size 1000 bits, so the total number of frames will be 50. This will be the window size. Hence, to represent 50 frames uniquely we need 6 bits.
And 6 is not even among the options. The answer key uses the same solution but takes the propagation time, not the RTT, for the calculation, and its answer is 5 bits. I am totally confused. Which one is correct?
I don't see what RTT has to do with it. The frames are only being sent in one direction.
Round-trip time means that you have to take into account the ACK (acknowledgement message) that tells you the frames you are sending are being received on the other side of the link. This 'time' window is the period in which you get to send the remaining frames that the window allows before you anticipate an ACK. Ideally you want to be able to transmit continuously, i.e. not having to stop at the window frame limit to wait for an ACK (which essentially turns into a stop-and-wait situation). The solution to this question is the minimum number of frames that will be transmitted from the moment the first frame is transmitted to the moment you get an ACK (also known as the size of a large window). Your calculations look correct in both cases, and it would be safe to assume the answer choices for the second question are wrong.
Here it's a duplex channel, so your RTT = Tp; hence they have used Tp. Now you get X = 25*10^3 bits, i.e. 25 frames, so 5 bits are needed for the sequence number.
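The two readings discussed in this thread differ only in which delay is plugged in, which a short sketch makes easy to compare. It follows the asker's method (window = frames that fit in the chosen delay, bits = enough to number that many frames); the function name is illustrative, and the delay is given in integer microseconds to avoid floating-point rounding.

```python
def sequence_bits(bandwidth_bps, delay_us, frame_bits):
    """Return (frames in flight during the delay, bits needed to number them)."""
    total = bandwidth_bps * delay_us
    per_frame = frame_bits * 1_000_000
    frames = -(-total // per_frame)            # integer ceiling division
    bits = max(1, (frames - 1).bit_length())   # smallest b with 2**b >= frames
    return frames, bits


print(sequence_bits(8_000_000, 20_000, 8000))  # first question, RTT: (20, 5)
print(sequence_bits(1_000_000, 50_000, 1000))  # second question, RTT: (50, 6)
print(sequence_bits(1_000_000, 25_000, 1000))  # second question, Tp only: (25, 5)
```

The answer key's 5 bits for the second question corresponds to the last call, where only the one-way propagation time is used.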
TCP Slow start window size
I am revising for a networks exam and I am not sure of the answer to the following:
Consider the effect of using slow-start on a link with a 10 ms round-trip time and no congestion. The receive window is 24KB and the maximum segment size is 2KB. How long does it take before the first full window can be sent in one transmission round?
After every ACK, does the max segment size grow by 1 or does it double? If it doubles, would the answer be 50 ms, because 2KB^5 = 32KB, so after 5 trips the MSS will equal 32KB, and with a 10 ms round-trip time that is 10x5 = 50 ms?
I found a solution to your question here (Problem 4): http://web.eecs.utk.edu/~qi/teaching/ece453f06/hw/hw7_sol.htm You are more or less correct, except you disregard the last RTT (since the window size goes beyond 24KB). So your answer is: 4 x 10 = 40ms.
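The 40 ms figure can be reproduced by counting round trips until the congestion window first covers the receive window. This is a sketch under the exercise's assumptions (window starts at one segment and doubles every RTT, no losses); the function name is made up for illustration.

```python
def slow_start_time_ms(receive_window_kb, mss_kb, rtt_ms):
    """Time until a full receive window can be sent in one round."""
    cwnd = mss_kb        # the window starts at one segment
    elapsed = 0
    while cwnd < receive_window_kb:
        cwnd *= 2        # slow start doubles the window each round trip
        elapsed += rtt_ms
    return elapsed


print(slow_start_time_ms(24, 2, 10))  # 40
```

The window goes 2, 4, 8, 16, 32 KB; after the fourth round trip it exceeds 24 KB, so the full window can be sent at 40 ms.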
Why is window size less than or equal to half the sequence number in SR protocol?
In selective repeat protocol, the window size must be less than or equal to half the size of the sequence number space for the SR protocol. Why is this so, and how?
This is to avoid packets being recognized incorrectly. If the window size is greater than half the sequence number space, then if an ACK is lost, the sender may send new packets that the receiver believes are retransmissions. For example, if our sequence number range is 0-3 and the window size is 3, this situation can occur:
[initially] (B's window = [0,1,2])
A -> 0 -> B (B's window = [1,2,3])
A -> 1 -> B (B's window = [2,3,0])
A -> 2 -> B (B's window = [3,0,1])
[lost] ACK0
[lost] ACK1
A <- ACK2 <- B
A -> 3 -> B
A -> 0 -> B [retransmission]
A -> 1 -> B [retransmission]
After the lost packet, B now expects the next packets to have sequence numbers 3, 0, and 1. But the 0 and 1 that A is sending are actually retransmissions, so B receives them out of order. By limiting the window size to 2 in this example, we avoid this problem because B will be expecting 2 and 3, and only 0 and 1 can be retransmissions.
The sequence space wraps to zero after max number is reached. Consider the corner case where all ACKs are lost - sender does not move its window, but receiver does (since it's unaware the sender is not getting the ACKs). If we don't limit the window size to half the sequence space, we end up with overlapping sender "sent but not acknowledged" and receiver "valid new" sequence spaces. This would result in retransmissions being interpreted as new packets.
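The overlap in that corner case can be checked mechanically. The sketch below models only the worst case described here (the receiver has accepted a full window, every ACK was lost, so the sender still has the old window outstanding); the function name is illustrative.

```python
def selective_repeat_is_ambiguous(seq_space, window):
    """True if a retransmission can be mistaken for a new packet."""
    # Sender never saw an ACK: numbers 0..window-1 are still outstanding.
    sender_outstanding = {i % seq_space for i in range(window)}
    # Receiver got everything and advanced by one full window.
    receiver_expected = {i % seq_space for i in range(window, 2 * window)}
    return bool(sender_outstanding & receiver_expected)


print(selective_repeat_is_ambiguous(4, 3))  # True
print(selective_repeat_is_ambiguous(4, 2))  # False
```

The two sets are disjoint exactly when twice the window fits into the sequence space, which is the "at most half" rule.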
Because the receiver will fail to distinguish between an old packet and a new packet. The receiver identifies packets based on sequence numbers, and there is a finite number of unique numbers for each connection. You can't have an infinite buffer.
Let's look at an obvious fail scenario: the window size is greater than the sequence number space. Let's say we have sequence numbers 0, 1, 2, and our window size is 4. This means that the window has two occurrences of 0:
0,1,2,0 <- modulo wrap
When we get a packet with a seq of 0, is it the first packet or the fourth? No clue.
Now, this problem will occur whenever the window size is greater than half of the sequence number space. Why? Because there's always the possibility that the receiver is looking at a sequence number that MAY be contained in a packet coming from the sender that is NEW or OLD. Does it always happen? No. But when it does, here's what happens:
Case 1: Receiver window after properly receiving packets 0,1,2.
0,1,2,[3,0,1],2
But what if the ACKs sent are lost? Well, the sender will resend 0,1,2. But are 0,1 OLD or NEW? The receiver can't tell.
Case 2: Same window on the receiving end. The three packets are received.
0,1,2,[3,0,1],2
Now, the sender receives ALL the ACKs but ONE correctly. Let's pick the 2nd one (1). Now, it's going to resend 1. But the receiver is looking at 1! So is this the new one as it expects (nope), or the old one?
Therefore, to ensure that the window is never expecting sequence numbers that could possibly be used by potential outstanding packets (either coming from a normal transmission or a retransmission caused by a missing ACK), we have to either decrease the window size or increase the sequence number space. Look what happens when we increase the sequence number space to, say, 6: 0,1,2,3,4,5. No matter how we position the window, it's never at risk of receiving a packet with an old sequence number.
0,1,2,[3,4,5]0,1...
By the time the window wraps around, we are positive that we've received the previous ones in order.
This link has an animation that walks through each of the steps of the protocol to explain why the window size matters: http://webmuseum.mi.fh-offenburg.de/index.php?view=exh&src=73 Basically, if the window size is too high, then corruption in transmission can cause incorrect assumptions and lead to data corruption in the final result.