Differences between TCP and Go Back N - networking

I was reading Computer Networking by Kurose, and in the TCP chapter's discussion of the differences between TCP and Go Back N I found something that I don't fully understand. The book says the following about some of the differences between the two protocols:
"Many TCP implementations buffer correctly received but out-of-order segments rather than discarding them.
Also, suppose a sequence of segments 1, 2, …, N is received correctly and in order, ACK(n),
n < N, gets lost, and the remaining N-1 ACKs arrive at the sender before their respective timeouts.
TCP retransmits at most one segment, i.e., segment n, instead of packets n, n+1, …, N.
TCP wouldn't even retransmit segment n if ACK(n+1) arrived before the timeout for segment n."
I understand the buffering of out-of-order segments, but I don't understand the other behavior, and I think it is because I don't fully understand Go Back N. Following that example, if ACK(n+t) arrives before the Go Back N timeout, the protocol would continue as if segment n had in fact been received (which it was), because ACKs are cumulative. So Go Back N wouldn't retransmit that segment either. Or am I missing something?

I was looking at this question's answers, and even though the question is old I thought this might help someone, so I copied a fragment from Kurose-Ross, Computer Networking: A Top-Down Approach:
Is TCP a GBN or an SR protocol? Recall that TCP acknowledgments are
cumulative and correctly received but out-of-order segments are not individually ACKed by the receiver. Consequently, the TCP sender need only maintain the smallest sequence number of a transmitted but unacknowledged byte (SendBase) and the sequence number of the next byte to be sent (NextSeqNum). In this sense, TCP looks a lot like a GBN-style protocol. But
there are some striking differences between TCP and Go-Back-N. Many TCP implementations
will buffer correctly received but out-of-order segments [Stevens 1994].
Consider also what happens when the sender sends a sequence of segments 1, 2, . . . ,
N, and all of the segments arrive in order without error at the receiver. Further suppose
that the acknowledgment for packet n < N gets lost, but the remaining N – 1 acknowledgments
arrive at the sender before their respective timeouts. In this example, GBN
would retransmit not only packet n, but also all of the subsequent packets n + 1, n + 2,
. . . , N. TCP, on the other hand, would retransmit at most one segment, namely, segment
n. Moreover, TCP would not even retransmit segment n if the acknowledgment
for segment n + 1 arrived before the timeout for segment n.
My conclusion: in practice TCP is a mixture between both GBN and SR.

See these links for an easy-to-follow explanation of GBN and SR:
Go Back N protocol (GBN):
enter link description here
Selective Repeat protocol (SR):
https://www.youtube.com/watch?v=Cs8tR8A9jm8
In the GBN and SR protocols, the receiver has to send an ACK message for every segment it receives within the sliding window.
In TCP, the receiver does not send an ACK message for every segment it receives within the sliding window. It only sends an ACK naming the next segment it expects, so fewer ACK messages are sent to the sender, which helps reduce network congestion.
In abnormal cases, where some segments are lost (through network congestion or bit errors), TCP transmission takes longer than GBN and SR because the receiver cannot send two ACK messages at the same time.
In my opinion, segment loss rarely happens, so TCP optimizes for the normal case instead of the abnormal one. In the normal case, TCP is better than GBN and SR.

The quote says that ACK(n) got lost, not that the nth segment got lost. In that case, nothing needs to be retransmitted, because ACK(n + x) means that everything up to n + x was successfully received.

I was confused by the statement from the book too, but I think I have found the answer:
Consider also what happens when the sender sends a sequence of segments 1, 2, . . . , N, and all of the segments arrive in order without error at the receiver. Further suppose that the acknowledgment for packet n < N gets lost, but the remaining N – 1 acknowledgments arrive at the sender before their respective timeouts. In this example, GBN would retransmit not only packet n, but also all of the subsequent packets n + 1, n + 2, . . . , N. TCP, on the other hand, would retransmit at most one segment, namely, segment n. Moreover, TCP would not even retransmit segment n if the acknowledgment for segment n + 1 arrived before the timeout for segment n.
Actually, in the above example, even though the ACK for packet n+1 arrives at the sender before its own timeout, the timer for packet n could have expired before that arrival. Because packet n timed out and GBN had not yet seen ACK(n+1), ACK(n+2), and so on, it retransmits packet n and all packets after it.
TCP, however, would retransmit only packet n at that moment.
P.S. This question is very old, but hopefully this helps someone.

ACK(n) acknowledges arrival of the entire stream up to n. So ACK(n+1) says that everything up to n+1 has arrived, including n.
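The effect of a lost ACK under cumulative acknowledgment can be sketched in a few lines. This is only an illustration, assuming packet-numbered segments as in the book's example and assuming every surviving ACK reaches the sender before any timer fires. The function name `process_acks` is made up for this sketch.

```python
def process_acks(num_segments, lost_acks):
    """Return the segments still unacknowledged at the sender after all
    surviving cumulative ACKs have arrived.

    Assumption: segments are numbered 1..num_segments, all of them reached
    the receiver, and ACK(k) confirms every segment up to and including k.
    """
    highest_acked = 0
    for ack in range(1, num_segments + 1):
        if ack in lost_acks:
            continue  # this ACK never reaches the sender
        # A later cumulative ACK covers every earlier segment as well.
        highest_acked = max(highest_acked, ack)
    return set(range(highest_acked + 1, num_segments + 1))


# ACK(3) is lost, but ACK(4) and ACK(5) cover segment 3.
print(process_acks(5, {3}))  # set()
# Only a lost ACK with no later ACK behind it leaves data unacknowledged.
print(process_acks(5, {5}))  # {5}
```

Timing is deliberately left out: if the timer for segment n fires before ACK(n+1) arrives, a retransmission still happens, which is the case the book's comparison is about.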

Related

Stumbling on a Reliable UDP implementation

I received an assignment from college where I have to implement reliable transfer over UDP, a.k.a. TCP over UDP (I know, it reinvents the wheel since TCP already does this), in order to learn in depth how TCP works. Some of the requirements are: 3-Way Handshake, Congestion Control (TCP Tahoe, in particular) and Waved Hands. I am thinking about doing this in Java or Python.
Some more specific requirements are:
After each ACK is received:
(Slow start) If CWND < SS-THRESH: CWND += 512
(Congestion Avoidance) If CWND >= SS-THRESH: CWND += (512 * 512) / CWND
After timeout, set SS-THRESH -> CWND / 2, CWND -> 512, and retransmit data after the last acknowledged byte.
I couldn't find more specific information about the TCP Tahoe implementation. But from what I understand, TCP Tahoe is based on Go-Back-N, so I found the following pseudo algorithm for sender and receiver:
My question is the Slow Start and Congestion Avoidance phase should happen right after if sendbase == nextseqnum? That is, right after confirming the receipt of an expected ACK?
My other question is about the Window Size, Go-Back-N uses a fixed window whereas TCP Tahoe uses a dynamic window. How can I calculate window size based on cwnd?
Note: your pictures are unreadable; please provide higher-resolution images.
I don't think that algorithm is correct. A timer should be associated with each packet and stopped when the ACK for that packet is received. Congestion control is triggered when the timer for any of the packets fires.
A TCP receiver is not exactly a Go-Back-N receiver: in TCP the receiver has a buffer too. This does not require any changes to the Go-Back-N sender. However, TCP is also supposed to implement flow control, in which the receiver tells the sender how much space remains in its buffer, and the sender adjusts its window accordingly.
Note that Go-Back-N sequence numbers count packets, while TCP sequence numbers count bytes, so you have to change your algorithm accordingly.
I would advise getting somewhat familiar with RFC 793. It does not cover congestion control, but it specifies how the other TCP mechanics are supposed to work. Also, this link has a nice illustration of the TCP window and all the variables associated with it.
My question is the Slow Start and Congestion Avoidance phase should happen right after if sendbase == nextseqnum? That is, right after confirming the receipt of an expected ACK?
Your algorithm only does something when it receives the ACK for the last packet. As I said, this is incorrect.
Regardless, every ACK that acknowledges a new packet should trigger a window increase. You can check this by testing whether send_base increased as a result of the ACK.
I don't know if every Tahoe implementation does this, but you may need it as well: after three consecutive duplicate ACKs, i.e., ACKs that do not increase send_base, you trigger the congestion response.
My other question is about the Window Size, Go-Back-N uses a fixed window whereas TCP Tahoe uses a dynamic window. How can I calculate window size based on cwnd?
You make N a variable instead of a constant, and assign the congestion window to it.
In a real TCP with flow control you use N = min(cwnd, receiver_window).
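The window rules from the assignment, combined with N = min(cwnd, receiver_window), could look roughly like the sketch below. It is not a full Tahoe implementation: the class name `TahoeWindow`, the initial ssthresh value, and the use of integer division are assumptions made for illustration, and duplicate-ACK handling is omitted.

```python
MSS = 512  # bytes, the constant used in the assignment's formulas


class TahoeWindow:
    """Congestion window bookkeeping following the assignment's rules."""

    def __init__(self, ssthresh=8 * MSS):  # initial threshold is an assumed value
        self.cwnd = MSS
        self.ssthresh = ssthresh

    def on_new_ack(self):
        """Call only for ACKs that advance send_base (not for duplicates)."""
        if self.cwnd < self.ssthresh:
            self.cwnd += MSS                       # slow start
        else:
            self.cwnd += (MSS * MSS) // self.cwnd  # congestion avoidance

    def on_timeout(self):
        """Assignment rule: halve the threshold and restart from one MSS."""
        self.ssthresh = self.cwnd // 2
        self.cwnd = MSS

    def send_window(self, receiver_window):
        """The Go-Back-N window size N, in bytes."""
        return min(self.cwnd, receiver_window)


w = TahoeWindow(ssthresh=1024)
w.on_new_ack()               # slow start: 512 -> 1024
w.on_new_ack()               # congestion avoidance: 1024 -> 1280
print(w.send_window(1000))   # 1000, limited by the receiver's window
w.on_timeout()
print(w.cwnd, w.ssthresh)    # 512 640
```

The sender would then allow new data only while nextseqnum - send_base stays below `send_window(...)`, with both counters measured in bytes.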

Understanding TCP Slow Start

I was trying to get my head around TCP congestion control and came across what is called the slow start phase, where TCP starts by sending just 1 MSS and then keeps adding 1 MSS to the congestion window on receipt of each ACK. This much is clear. But after this, almost all the books and articles I refer to go on to say that this results in doubling the cwnd every RTT, showing an image something like the one below, which is where I got confused.
The first segment is clear: TCP sends it, receives the ACK after one RTT, and then doubles the cwnd, which is now 2. Now it transmits two segments, and the ACK for the first one comes after an RTT, making the cwnd 3. But the ACK for the second segment comes after this, making the cwnd 4 (i.e. doubling it). So I am not able to understand how the cwnd doubles every RTT, since as per my understanding, in this example, cwnd doubled on the first RTT, got incremented by one on the second RTT, and doubled again at some other time (RTT + transmission time of the first segment, I believe). Is this understanding correct? Please explain.
After the ACKs for the two segments had been received by the sender, the CWND had increased by 2, not 1. Note that the second ACK in round 2 arrived right after the first ACK in round 2; that's why they are considered part of the same round, which cost 1 RTT.
I couldn't agree with you more. I think there is no doubling mechanism as such, and people misread what Slow Start means.
When the sender receives an ACK, cwnd increases by 1 MSS. So if you send n segments and receive all their ACKs, cwnd becomes (n + n) MSS.
During Slow Start, all of those segments can complete within 1 RTT, because congestion is unlikely at that point. So it looks as if cwnd doubles every RTT, but the real mechanism is addition, not multiplication.
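A small simulation makes the "add per ACK, double per round" relationship concrete. It is a simplified sketch: it assumes no loss, that every segment sent in a round is ACKed within that round, and that cwnd is counted in whole MSS units. The function name `slow_start_rounds` is invented here.

```python
def slow_start_rounds(rounds, mss=1):
    """Return cwnd at the start of each round, adding one MSS per ACK."""
    cwnd = mss
    history = [cwnd]
    for _ in range(rounds):
        segments_in_flight = cwnd // mss  # everything the window allows
        for _ in range(segments_in_flight):
            cwnd += mss                   # one ACK arrives: cwnd grows by 1 MSS
        history.append(cwnd)
    return history


print(slow_start_rounds(4))  # [1, 2, 4, 8, 16]
```

Each individual step is an addition, but because a window of n segments produces n ACKs in roughly one RTT, the window goes from n to 2n over that round.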

TCP -- acknowledgement

The 32-bit acknowledgement field on the TCP header, say x, tells the other host: "I received all the bytes up to and including x-1, and I am now expecting the bytes from x onward". In this case, the receiver may have received some further bytes, say x+100 through x+180, but it hasn't yet received the x-th byte.
Is there a case where the receiver hasn't received bytes x through x+100 but has received, say, bytes x+100 through x+180, and acknowledges that it received x+180?
One resource I read indicates the acknowledgement of bytes received despite a gap in the earlier bytes. However, every other source says that "an acknowledgement of x tells that all bytes up to x-1 have been received".
Are there any exceptional cases? I'm looking to verify this.
TIA.
This can be achieved with the TCP option called SACK.
Here, the client can say through a duplicate ACK that it has received in order only up to a particular packet, say packet 2, and append a SACK (Selective Acknowledgement) option for the range of contiguous packets received beyond that, such as packets 4 to 5. This in turn enables the server to retransmit only the packet that was not received by the client (packet 3).
Below is an extract from RFC 2018, TCP Selective Acknowledgment Options:
The SACK option is to be sent by a data receiver to inform the data
sender of non-contiguous blocks of data that have been received and
queued. The data receiver awaits the receipt of data (perhaps by
means of retransmissions) to fill the gaps in sequence space between
received blocks. When missing segments are received, the data
receiver acknowledges the data normally by advancing the left window
edge in the Acknowledgement Number Field of the TCP header. The SACK
option does not change the meaning of the Acknowledgement Number
field.
From the TCP RFC at https://www.rfc-editor.org/rfc/rfc793.txt:
3.3. Sequence Numbers
A fundamental notion in the design is that every octet of data sent
over a TCP connection has a sequence number. Since every octet is
sequenced, each of them can be acknowledged. The acknowledgment
mechanism employed is cumulative so that an acknowledgment of sequence
number X indicates that all octets up to but not including X have been
received.
That seems pretty clear to me: the acknowledgment number stops at the first missing data.
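To illustrate how the two pieces fit together, here is a rough sketch of how a receiver might derive the cumulative acknowledgment number and the SACK blocks from the byte ranges it holds. It is not taken from any real stack: the function name `ack_and_sack` is hypothetical, ranges are half-open `(start, end)` pairs, and 32-bit sequence wraparound is ignored.

```python
def ack_and_sack(next_expected, received_ranges):
    """Return (ack_number, sack_blocks) for the given received byte ranges.

    Each range is (start, end) with end exclusive, matching the SACK
    convention that a block's right edge is the sequence number just past
    its last byte.
    """
    ack = next_expected
    sack_blocks = []
    for start, end in sorted(received_ranges):
        if start <= ack:
            # Contiguous with the in-order data: the cumulative ACK advances.
            ack = max(ack, end)
        elif sack_blocks and start <= sack_blocks[-1][1]:
            # Touches or overlaps the previous out-of-order block: merge.
            left, right = sack_blocks[-1]
            sack_blocks[-1] = (left, max(right, end))
        else:
            # Data beyond a gap: reported via SACK, never via the ACK number.
            sack_blocks.append((start, end))
    return ack, sack_blocks


x = 1000
# Bytes x+100 through x+180 arrived, but byte x is still missing.
print(ack_and_sack(x, [(x + 100, x + 181)]))
# (1000, [(1100, 1181)])

# Once bytes x through x+99 arrive, the ACK number jumps past the old gap.
print(ack_and_sack(x, [(x + 100, x + 181), (x, x + 100)]))
# (1181, [])
```

In the first call the acknowledgment number stays at x even though later bytes are buffered, which matches both RFC extracts above.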

retransmission mechanism in TCP protocol

Can somebody simply describe the retransmission mechanism in TCP?
I want to know how it deals with this situation.
A sends a packet to B:
A sends a packet.
B receives it and sends an ACK, but this ACK is lost.
A times out and resends.
In this situation B will receive the same packet twice. How can B avoid processing the same packet again?
Thanks.
Each packet has a sequence number associated with it. As data is sent, the sequence number is incremented by the amount of original data in the packet. You can think of the sequence number as the offset of the first byte in the packet from the beginning of the data stream although it may not, likely will not, start at zero. When A sends the retry, it will use the same sequence number it used the first time. B tracks the sequence numbers as it receives data and can know that it has seen the retry's sequence number before. If it has already made that data available to the (upper layer) client, then it knows that it should not do so again.
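The duplicate check described above can be sketched as follows. This is a minimal illustration, not how any particular TCP stack is written: the `Receiver` class is invented for the example, it drops out-of-order segments instead of buffering them, and it ignores sequence number wraparound.

```python
class Receiver:
    """Tracks the next expected byte and delivers each byte at most once."""

    def __init__(self, initial_seq):
        self.next_expected = initial_seq
        self.delivered = bytearray()  # data handed to the upper layer

    def on_segment(self, seq, payload):
        """Process one segment and return the acknowledgment number to send."""
        end = seq + len(payload)
        if end <= self.next_expected:
            pass  # pure duplicate: everything here was already delivered
        elif seq <= self.next_expected:
            # Deliver only the part that has not been seen before.
            self.delivered += payload[self.next_expected - seq:]
            self.next_expected = end
        # else: a gap precedes this segment; this sketch simply drops it.
        return self.next_expected  # re-ACK so the sender can stop retrying


b = Receiver(initial_seq=100)
print(b.on_segment(100, b"hello"))  # 105
print(b.on_segment(100, b"hello"))  # 105 again: the retry is recognised
print(bytes(b.delivered))           # b'hello' (delivered once)
```

Note that B still answers the duplicate with an ACK; otherwise A would keep timing out and resending.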

Rationale behind ACKs and SEQs?

I am not sure if people find this obvious but I have two questions:
During the 3-way handshake, why is ACK = SEQ + 1 i.e. why am I ACKing for the next byte that I am expecting from the sender?
After the handshake, my ACK = SEQ + len. Why is this different from the handshake? Why not just ACK for the next byte I am expecting as well (the same as during the handshake)?
I know I must've missed out a basic point somewhere. Can someone clarify this?
This is because the first byte of sequence number space corresponds to the SYN flag, not to a data byte. (The FIN flag at the end also consumes a byte of sequence number space itself.)
During the handshake you're synchronizing. The sequence number is the known data. Once you've synced, the data length is the known data as well as a useful pseudo-random verifier. Sender knows how much he sent and if you reply, he assumes you got it. This is easier than reply with, say a checksum or hash of the data, and is usually sufficient.
Both the SYN and FIN flags cause the sequence number of the stream to increment by one. Thus
SYN (seq x) -------------->
<--- SYNACK (ack x+1, seq y)
ACK (seq x+1, ack y+1) --->
Is your three way handshake. It's done that way because SYNs and FINs require acknowledgement of receipt. That way everyone can be on the same page during the lifetime of the connection.
Theoretically any packet in the TWHS could carry a payload, but if either of the packets with the SYN flag set carries a payload, the opposite side needs to acknowledge both the data AND the flag.
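Both cases from the question follow from one rule: the acknowledgment number is the segment's sequence number plus the amount of sequence space the segment consumes. A small sketch, with a made-up helper name `ack_for` and arbitrary example values for x and y:

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def ack_for(seq, payload_len=0, syn=False, fin=False):
    """Acknowledgment number that covers a segment starting at seq.

    SYN and FIN each consume one sequence number; every payload byte
    consumes one as well.
    """
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_SPACE


x, y = 1000, 5000                        # arbitrary initial sequence numbers
print(ack_for(x, syn=True))              # 1001: the SYN-ACK carries ack x+1
print(ack_for(y, syn=True))              # 5001: the final ACK carries ack y+1
print(ack_for(x + 1, payload_len=20))    # 1021: after the handshake, SEQ + len
print(ack_for(x + 21, fin=True))         # 1022: the FIN consumes one number too
```

So the handshake is not a special case: a bare SYN behaves like a segment of length 1, which is why ACK = SEQ + 1 there and ACK = SEQ + len afterwards.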

Resources