Selective Repeat Dilemma - networking

I'm not understanding what the dilemma in this figure is. I only understood that if the window size is too large relative to the sequence number space, it is going to lead to some problems. This picture is addressing one of these problems:

When the receiver receives pkt0, it doesn't know whether this packet is:
a retransmission of the previous pkt0 (in a case where ACK0 has been lost),
or
a new packet with seqnum 0.
Solution:
Maximum allowable window size = half the sequence number space.
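A minimal Python sketch (my own, not part of the original question) makes the rule concrete. The worst case is that the receiver has accepted the whole window but every ACK was lost, so the sender may retransmit any of those sequence numbers while the receiver is already expecting the next window; the two windows overlap exactly when the window size exceeds half the sequence number space.

# Worst case for Selective Repeat: all ACKs for the previous window were lost.
def windows_overlap(seq_space, window):
    may_be_retransmitted = {i % seq_space for i in range(window)}           # sender's old window
    treated_as_new = {i % seq_space for i in range(window, 2 * window)}     # receiver's new window
    return bool(may_be_retransmitted & treated_as_new)

for w in (4, 3, 2):                    # sequence numbers 0..3
    print(w, windows_overlap(4, w))    # True, True, False -> only w = 2 (half of 4) is safe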

Why is the largest window scale size 14 in TCP?

I am reading RFC1323, and I don't understand the following sentences,
TCP determines if a data segment is "old" or "new" by testing
whether its sequence number is within 2**31 bytes of the left edge
of the window, and if it is not, discarding the data as "old". To
insure that new data is never mistakenly considered old and vice-
versa, the left edge of the sender's window has to be at most
2**31 away from the right edge of the receiver's window.
What do "old" and "new" mean in these sentences?
I know that the sequence number space in the current design has size 2^32, so I understand that the scale should be smaller than 32 - 16 = 16, but I don't understand why it should also be smaller than 15.
TCP uses sequence numbers to track data. Higher sequence numbers are "newer", meaning the data was originally sent at a later time. For example, if "hello" was sent, the "e" is newer than the "h". Since the sequence number range is finite, there needs to be a way to prevent "wrapping" from causing confusion. In this example, if the sequence range is 0 to 3, then the "h" and the "o" might both have sequence number 0. So, if data is lost during transit and needs to be resent, the receiver might end up getting "hellh".
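As a rough illustration (my own sketch, not code from RFC 1323), the "old" vs "new" test can be written as a comparison modulo 2**32: a segment counts as new only if its sequence number is less than 2**31 bytes ahead of the left edge of the window.

# "Old" vs "new" test on 32-bit sequence numbers, computed modulo 2**32.
MOD = 2 ** 32

def is_new(seq, left_edge):
    return (seq - left_edge) % MOD < 2 ** 31   # within 2**31 ahead of the left edge

print(is_new(seq=1000, left_edge=100))                   # True: slightly ahead -> "new"
print(is_new(seq=50, left_edge=100))                     # False: behind the window -> "old"
print(is_new(seq=(100 + 2 ** 31) % MOD, left_edge=100))  # False: too far ahead to be unambiguous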

Computer networking error detection: cyclic redundancy check

Could anyone tell me whether the method given below is a right way to calculate the CRC?
I did this because the dataword length is large.
Is it right?
The bottom set of divisions are what you want for a CRC, and arrive at the correct remainders. Your quotients each need one more bit, but they are not used anyway.
The top division was not completed, but it is not relevant, since you need to append the four zeros first as you did in the bottom left, or the four-bit CRC as you did in the bottom right.
Ultimately, you are doing the same thing a division does. See https://www.wikihow.com/Divide-Binary-Numbers for more on binary division. However, the data word to be sent to the receiver should not be altered.
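For reference, here is a small Python sketch of the same computation done as one long division (the generator 10011 is just an example, not necessarily the one from the figure): append r zero bits to the dataword, XOR the generator in wherever the leading bit is 1, and keep the last r bits as the CRC.

def crc_remainder(dataword, generator):
    r = len(generator) - 1
    bits = list(dataword + "0" * r)            # dataword with r appended zeros
    for i in range(len(dataword)):             # modulo-2 long division
        if bits[i] == "1":
            for j, g in enumerate(generator):
                bits[i + j] = str(int(bits[i + j]) ^ int(g))
    return "".join(bits[-r:])                  # remainder = the CRC

data = "1101011011"
crc = crc_remainder(data, "10011")
print(crc)                                     # 1110
print(crc_remainder(data + crc, "10011"))      # 0000: the codeword divides evenly

The codeword actually transmitted is the unaltered dataword followed by the CRC, which matches the point above that the data word itself is not changed.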

The relationship between window size and sequence number

The question is:
We have a transport protocol that uses pipelining and an 8-bit sequence number (0 to 255).
What is the maximum window size the sender can use? (How many packets can the sender send out on the net before it must wait for an ACK?)
For Go-Back-N the maximum window size is: w = 2^m - 1 = 255.
For Selective Repeat the maximum window size is: w = (2^m)/2 = 128.
I do not know which is correct and which formula I should use.
Thanks for the help.
Those two are different protocols having different issues.
In the case of Go-Back-N, you are correct. The window size can be up to 255. (2^8 - 1 = 255 is the largest sequence number when numbering starts from 0, and it is also the maximum window size possible for the Go-Back-N protocol.)
However, the Selective Repeat protocol limits the window size to half the sequence number space, because the receiver cannot distinguish a retransmitted packet from a new one when it carries a sequence number the receiver already ACKed but whose ACK was lost and never reached the sender in the previous window. Hence, the window must span at most half the sequence number range so that two consecutive windows can never share a sequence number.
Go-Back-N doesn't have this issue because the sender pushes up to n packets (the window size, which is at most 2^m - 1) and never slides the window until it gets cumulative ACKs for them. And those two protocols have different maximum window sizes.
Note: For Go-Back-N, the maximum window size is the number of unique sequence numbers minus 1. If the window were equal to the number of unique sequence numbers and all the acknowledgements were lost, the receiver would accept all the retransmitted messages as a separate set of messages and deliver them to its application a second time. To avoid this inconsistency, maximum window size = number of unique sequence numbers - 1. This answer has been updated according to the fact provided in the comment by #noamgot.
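A small Python sketch (my own, following the note above) of the Go-Back-N corner case where every acknowledgement is lost: the sender retransmits the whole window, and the receiver's next expected number must not collide with any retransmitted number.

def gbn_ambiguous(seq_space, window):
    retransmitted = [i % seq_space for i in range(window)]   # whole window resent
    next_expected = window % seq_space                       # receiver has already slid past the window
    return next_expected in retransmitted

S = 2 ** 8                       # 8-bit sequence numbers, 0..255
print(gbn_ambiguous(S, 256))     # True:  w = 2^m, the retransmitted pkt 0 looks like new data
print(gbn_ambiguous(S, 255))     # False: w = 2^m - 1 is safe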

Why is window size less than or equal to half the sequence number in SR protocol?

In selective repeat protocol, the window size must be less than or equal to half the size of the sequence number space for the SR protocol. Why is this so, and how?
This is to avoid packets being recognized incorrectly.
If the window size is greater than half the sequence number space, then if an ACK is lost, the sender may retransmit packets that the receiver believes are new.
For example, if our sequence number range is 0-3 and the window size is 3, this situation can occur.
[initially] (B's window = [0,1,2])
A -> 0 -> B (B's window = [1,2,3])
A -> 1 -> B (B's window = [2,3,0])
A -> 2 -> B (B's window = [3,0,1])
[lost] ACK0
[lost] ACK1
A <- ACK2 <- B
A -> 3 -> B
A -> 0 -> B [retransmission]
A -> 1 -> B [retransmission]
After the lost ACKs, B now expects the next packets to have sequence numbers 3, 0, and 1.
But the 0 and 1 that A is sending are actually retransmissions, which B wrongly accepts as new data.
By limiting the window size to 2 in this example, we avoid this problem because B will be expecting 2 and 3, and only 0 and 1 can be retransmissions.
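A short Python replay of this trace (my own sketch): with sequence numbers 0-3 and window size 3, the receiver slides to window [3, 0, 1] after delivering 0, 1, 2, so the retransmitted 0 and 1 fall inside it and are wrongly accepted as new.

SEQ_SPACE, WINDOW = 4, 3
recv_base = 0                                            # left edge of the receiver's window

def receiver_window(base):
    return [(base + i) % SEQ_SPACE for i in range(WINDOW)]

for pkt in (0, 1, 2):                                    # packets 0, 1, 2 delivered in order
    assert pkt in receiver_window(recv_base)
    recv_base = (recv_base + 1) % SEQ_SPACE

print(receiver_window(recv_base))                        # [3, 0, 1]
for pkt in (0, 1):                                       # retransmissions after the lost ACKs
    print(pkt, "accepted as new?", pkt in receiver_window(recv_base))   # True, True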
The sequence space wraps to zero after max number is reached. Consider the corner case where all ACKs are lost - sender does not move its window, but receiver does (since it's unaware the sender is not getting the ACKs). If we don't limit the window size to half the sequence space, we end up with overlapping sender "sent but not acknowledged" and receiver "valid new" sequence spaces. This would result in retransmissions being interpreted as new packets.
Because the receiver will fail to distinguish between an old packet and a new packet. The receiver identifies packets based on sequence numbers, and there is a finite number of unique numbers for each connection. You can't have an infinite buffer.
Let's look at an obvious fail scenario:
The window size is greater than the sequence number space. Let's say we have sequence numbers 0, 1, 2 and our window size is 4. This means that the window has two occurrences of 0.
0,1,2,0 <- modulo wrap. When we get a packet with a seq of 0, is it the first packet or the fourth? No clue. Now, this problem will occur whenever the window size is greater than half of the sequence number space. Why? Because there is always the possibility that the receiver is expecting a sequence number that MAY be carried by a packet from the sender that is NEW or OLD. Does it always happen? No. But when it does, here's what happens:
Case 1:
Receiver window after properly receiving packets 0,1,2.
0,1,2,[3,0,1],2
But what if the ACKs sent are lost? Well, the sender will resend 0,1,2. But are 0,1 OLD or NEW? The receiver can't tell.
Case 2:
Same window on receiving end. The three packets are received.
0,1,2,[3,0,1],2
Now, the sender receives ALL the ACKs but ONE correctly. Let's pick the 2nd one (ACK 1). The sender is going to resend packet 1. But the receiver's window contains 1! So is this the new 1 it expects (nope), or the old one?
Therefore, to ensure that the window is never expecting sequence numbers that could possibly be carried by outstanding packets (either from a normal transmission or from a retransmission caused by a missing ACK), we have to either decrease the window size or enlarge the sequence number space.
Look what happens when we increase the sequence number space to, say 6.
0,1,2,3,4,5.
No matter how we position the window, it's never at risk of receiving a packet with an old sequence number.
0,1,2,[3,4,5]0,1...
By the time the window wraps around, we are positive that we've received the previous ones in order.
This link has an animation that walks through each of the steps of the protocol to explain why the window size matters:
http://webmuseum.mi.fh-offenburg.de/index.php?view=exh&src=73
Basically, if the window size is too high, then corruption in transmission can cause incorrect assumptions and lead to data corruption in the final result.

Detecting and fixing overflows

We have a particle detector hard-wired to use 16-bit and 8-bit buffers. Every now and then, there are certain [predicted] peaks of particle flux passing through it; that's okay. What is not okay is that these fluxes usually reach magnitudes above the capacity of the buffers to store them; thus, overflows occur. On a chart, they look like the flux suddenly drops and begins growing again. Can you propose a [mostly] accurate method of detecting points of data suffering from an overflow?
P.S. The detector is physically inaccessible, so fixing it the 'right way' by replacing the buffers doesn't seem to be an option.
Update: Some clarifications as requested. We use python at the data processing facility; the technology used in the detector itself is pretty obscure (treat it as if it was developed by a completely unrelated third party), but it is definitely unsophisticated, i.e. not running a 'real' OS, just some low-level stuff to record the detector readings and to respond to remote commands like power cycle. Memory corruption and other problems are not an issue right now. The overflows occur simply because the designer of the detector used 16-bit buffers for counting the particle flux, and sometimes the flux exceeds 65535 particles per second.
Update 2: As several readers have pointed out, the intended solution would have something to do with analyzing the flux profile to detect sharp declines (e.g. by an order of magnitude) in an attempt to separate them from normal fluctuations. Another problem arises: can restorations (points where the original flux drops back below the overflow level) be detected by simply running the correction program against the flux profile reversed along the x axis?
// Unwrap a sequence of 16-bit counter readings into 32-bit values (Java).
static int[] unwrap(short[] x)
{
    int[] y = new int[x.length];
    y[0] = x[0] & 0xFFFF;   // interpret the raw reading as an unsigned count
    for (int i = 1; i < x.length; i++)
    {
        // The cast to short makes the difference wrap modulo 2^16 before it is
        // sign-extended back to int. This works fine as long as the "real" values
        // of x[i] and x[i-1] differ by less than 1/2 of the span of allowable
        // values of x's storage type (= 32768 in the case of 16-bit readings).
        // Otherwise there is ambiguity.
        y[i] = y[i - 1] + signExtend((short) (x[i] - x[i - 1]));
    }
    return y;
}

static int signExtend(short x)
{
    return x; // widening a short to an int sign-extends properly in Java and in most C compilers
}

// Exercise for the reader: write similar code to unwrap 8-bit arrays
// to a 16-bit or 32-bit array (cast the difference to byte instead of short).
Of course, ideally you'd fix the detector software to max out at 65535 to prevent wraparound of the sort that is causing your grief. I understand that this isn't always possible, or at least isn't always possible to do quickly.
When the particle flux exceeds 65535, does it do so quickly, or does the flux gradually increase and then gradually decrease? This makes a difference in what algorithm you might use to detect this. For example, if the flux goes up slowly enough:
true flux measurement
5000 5000
10000 10000
30000 30000
50000 50000
70000 4465
90000 24465
60000 60000
30000 30000
10000 10000
then you'll tend to have a large negative drop at times when you have overflowed. A much larger negative drop than you'll have at any other time. This can serve as a signal that you've overflowed. To find the end of the overflow time period, you could look for a large jump to a value not too far from 65535.
All of this depends on the maximum true flux that is possible and on how rapidly the flux rises and falls. For example, is it possible to get more than 128k counts in one measurement period? Is it possible for one measurement to be 5000 and the next measurement to be 50000? If the data is not well-behaved enough, you may be able to make only statistical judgment about when you have overflowed.
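To make that idea concrete, here is a rough Python sketch; the threshold of 30000 is an assumption that would need tuning to the real flux profile. It flags the start of an overflow on a very large drop and its end on a comparably large jump back up.

DROP = 30000   # assumed: the true flux never falls this fast between samples
JUMP = 30000   # assumed: the true flux never rises this fast between samples

def find_overflow_intervals(measurements):
    intervals, start = [], None
    for i in range(1, len(measurements)):
        delta = measurements[i] - measurements[i - 1]
        if start is None and delta < -DROP:
            start = i                      # huge drop: the counter wrapped past 65535
        elif start is not None and delta > JUMP:
            intervals.append((start, i))   # huge rise: the true flux fell back below 65535
            start = None
    return intervals

readings = [5000, 10000, 30000, 50000, 4465, 24465, 60000, 30000, 10000]
print(find_overflow_intervals(readings))   # [(4, 6)] -> samples 4 and 5 overflowed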
Your question needs to provide more information about your implementation - what language/framework are you using?
Data overflows in software (which is what I think you're talking about) are bad practice and should be avoided. What you are seeing (strange data output) is only one possible side effect of data overflows, and it is merely the tip of the iceberg of the sorts of issues you can run into.
You could quite easily experience more serious issues like memory corruption, which can cause programs to crash loudly, or worse, obscurely.
Is there any validation you can do to prevent the overflows from occurring in the first place?
I really don't think you can fix it without fixing the underlying buffers. How are you supposed to tell the difference between the sequences of values (0, 1, 2, 1, 0) and (0, 1, 65538, 1, 0)? You can't.
How about using an HMM where the hidden state is whether you are in an overflow and the emissions are observed particle flux?
The tricky part would be coming up with the probability models for the transitions (which will basically encode the time-scale of peaks) and for the emissions (which you can build if you know how the flux behaves and how overflow affects measurement). These are domain-specific questions, so there probably aren't ready-made solutions out there.
But once you have the model, everything else (fitting your data, quantifying uncertainty, simulation, etc.) is routine.
You can only do this if the actual jumps between successive values are much smaller than 65536. Otherwise, an overflow-induced valley artifact is indistinguishable from a real valley; you can only guess. You can try to match overflows to corresponding restorations by simultaneously analysing the signal from the right and the left (assuming that there is a recognizable baseline).
Other than that, all you can do is to adjust your experiment by repeating it with different original particle flows, so that real valleys will not move, but artifact ones move to the point of overflow.
