Why is the TCP/UDP checksum finally complemented?

In TCP/UDP, the sender adds the 16-bit words using one's complement arithmetic, and the final result is complemented to get the checksum. This is done so that the receiver can recompute the sum over the data together with the checksum, and if the result is all ones it can be certain (well, almost!) that there is no error. My question is: why do we need the final complement at the sender? We might as well send the sum as-is, so that when the receiver recomputes it, it would check for all zeros instead of all ones.

Because 0 has a special meaning. In UDP, a transmitted checksum of 0 indicates that no checksum was computed and that verification should be skipped; RFC 768 therefore specifies that a computed checksum of zero is transmitted as all ones.

So that the receiver can just do a one's complement sum of all the data (including the checksum field) and see if it is -0 (0xffff).
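A minimal sketch (Python) of both sides, ignoring the pseudo-header (addresses, protocol, length) that the real TCP/UDP computation also covers:

    import struct

    def ones_complement_sum(data: bytes) -> int:
        """One's complement sum of 16-bit big-endian words."""
        if len(data) % 2:
            data += b"\x00"                              # pad odd-length data
        total = 0
        for (word,) in struct.iter_unpack("!H", data):
            total += word
            total = (total & 0xFFFF) + (total >> 16)     # wrap the carry around
        return total

    def checksum(data: bytes) -> int:
        """Sender: complement of the one's complement sum."""
        return ~ones_complement_sum(data) & 0xFFFF

    def verify(data_with_checksum: bytes) -> bool:
        """Receiver: the sum over everything, including the checksum
        field, must come out to -0 (0xffff)."""
        return ones_complement_sum(data_with_checksum) == 0xFFFF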

Related

What value goes into the receiving end's checksum header using TCP in Wireshark?

So I was wondering: what value goes into the checksum header on the receiving end?
For example, if I am sniffing HTTP data and I receive a packet, how is the value in the checksum header calculated? I am pretty sure I know how to calculate the checksum, but I don't understand why the value is what it is.
Basically, you split the data into 16-bit words, sum them, take the one's complement of that sum, and that's your checksum. To verify, the receiving end calculates the sum itself and adds the checksum to it; if the result is all ones, the packet arrived with no errors, and otherwise it didn't. But if that's the case, shouldn't I see an "ff ff" checksum value? Why does it look like "34 ef" instead?
I apologize if this is a stupid question, but I just couldn't find the answer as much as I tried looking. Thanks!
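As the answer above explains, the header field carries the complement of the sum over everything else; 0xffff only appears when the receiver folds that field back into its own sum. A tiny sketch (Python, with a made-up two-word payload) of the arithmetic:

    words = [0x1234, 0x5678]                  # hypothetical payload words
    s = sum(words)
    s = (s & 0xFFFF) + (s >> 16)              # fold any carry back in
    field = ~s & 0xFFFF                       # what the header carries: 0x9753
    check = field + s                         # receiver folds the field back in
    check = (check & 0xFFFF) + (check >> 16)
    print(hex(field), hex(check))             # 0x9753 0xffff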

Are CRC generator bits the same for all?

In CRC, when a sender has data to send, it divides the data bits by g(x), which gives some remainder bits. Those remainder bits are appended to the data bits. When this codeword reaches the receiver, the receiver divides the codeword by the same g(x), giving some remainder. If this remainder is zero, the data is assumed correct.
Now, if all systems can communicate with each other, does that mean every system in the world has the same g(x)? Because the sender and receiver must share a common g(x).
(Please answer only if you have correct knowledge, with some valid proof.)
It depends on the protocol. CRC itself works with different polynomials; the protocols that use it define the g(x) polynomial to use.
There is a list of examples at https://en.wikipedia.org/wiki/Cyclic_redundancy_check#Standards_and_common_use
This is not an issue, since the sending and receiving ends obviously cannot communicate using different protocols. Potentially, a protocol could even use a variable polynomial, somehow negotiated at the start of the communication, but I can't see why that would be useful.
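A toy sketch (Python) of the divide/append/re-divide mechanics described in the question, using g(x) = x^3 + x + 1 purely as an example polynomial:

    def crc_remainder(bits: str, gx: str) -> str:
        """Modulo-2 (XOR) long division; returns the len(gx)-1 remainder bits."""
        n = len(gx) - 1
        work = list(bits + "0" * n)            # append n zero bits, then divide
        for i in range(len(bits)):
            if work[i] == "1":                 # XOR g(x) in wherever a 1 leads
                for j, g in enumerate(gx):
                    work[i + j] = "0" if work[i + j] == g else "1"
        return "".join(work[-n:])

    gx = "1011"                                # g(x) = x^3 + x + 1
    data = "11010011101100"
    codeword = data + crc_remainder(data, gx)  # sender appends the remainder
    print(crc_remainder(codeword, gx))         # receiver: 000 means no error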
That's a big no. Furthermore, there are several other variations besides just g(x).
If someone tells you to compute a CRC, you have many questions to ask. You need to ask: what is g(x)? What is the initial value of the CRC? Is it exclusive-or'ed with a constant at the end? In what order are bits fed into the CRC, least-significant or most-significant first? In what order are the CRC bits put into the message? In what order are the CRC bytes put into the message?
There is a catalog of CRCs with (at the time of writing) 107 entries, and it still does not cover all of the CRCs in use. There are many lengths of CRCs (the degree of g(x)), many polynomials (g(x)), and, among those, many choices for the bit orderings, initial value, and final exclusive-or.
The person telling you to compute the CRC might not even know! "Isn't there just one?" they might naively ask (as you have). You then have to either find a definition of the CRC for the protocol you are using, and be able to interpret that, or find examples of correct CRC calculations for your message, and attempt to deduce the parameters.
By the way, not all CRCs will give zero when computing the CRC of a message with the CRC appended. Depending on the initial value and final exclusive-or, you will get a constant for all correct messages, but that constant is not necessarily zero. Even then, you will get a constant only if you compute the bits from the CRC in the proper order. It's actually easier and faster to ignore that property and compute the CRC on just the received message and then simply compare that to the CRC received with the message.
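To make the parameter soup concrete, here is a sketch (Python) of one common convention, the bit-reflected (LSB-first) CRC-16. The ARC and X-25 variants below differ only in their parameters, and only ARC happens to leave a zero constant when you re-run the CRC over the message plus its appended CRC:

    def crc16(data: bytes, poly: int, init: int, xorout: int) -> int:
        """Bit-reflected (LSB-first) CRC-16 -- one of many conventions."""
        crc = init
        for byte in data:
            crc ^= byte
            for _ in range(8):
                crc = (crc >> 1) ^ (poly if crc & 1 else 0)
        return crc ^ xorout

    msg = b"123456789"
    # CRC-16/ARC:  poly 0x8005 reflected -> 0xA001, init 0x0000, xorout 0x0000
    # CRC-16/X-25: poly 0x1021 reflected -> 0x8408, init 0xFFFF, xorout 0xFFFF
    print(hex(crc16(msg, 0xA001, 0x0000, 0x0000)))   # 0xbb3d
    print(hex(crc16(msg, 0x8408, 0xFFFF, 0xFFFF)))   # 0x906e

    # Re-running over message + appended CRC gives a constant: zero for
    # ARC, but a nonzero value for X-25 because of its init/xorout.
    for poly, init, xorout in [(0xA001, 0, 0), (0x8408, 0xFFFF, 0xFFFF)]:
        c = crc16(msg, poly, init, xorout)
        print(hex(crc16(msg + c.to_bytes(2, "little"), poly, init, xorout)))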

If TCP runs out of its sequence numbers, what will happen? If it wraps to 0 again, will that byte not be considered a duplicate?

If TCP runs out of its sequence number, what will happen?
If it wraps around to 0 again for the sequence number of the next byte, won't that byte be considered a "duplicate" by the receiver?
If yes, then it has to ignore that byte.
If not, why?
I think I found the answer.
The answer to this query lies in one of the TCP option fields, known as the "timestamp". It is carried in every TCP segment (both data and ACK segments).
Therefore, to identify a unique TCP segment, we look at the combination of "timestamp" and "sequence number".
The basic idea is that a segment can be discarded as an old duplicate if it arrives with a timestamp less than some timestamp recently received on this connection.
Example:
Two segments 400:12001 and 700:12001 (timestamp:sequence number) definitely belong to two different incarnations, even though they carry the same sequence number.
This mechanism is known as PAWS (Protect Against Wrapped Sequence numbers).
Reference: https://www.rfc-editor.org/rfc/rfc1323#page-17
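A simplified sketch (Python) of the acceptance test; real TCP keeps ts_recent per connection and compares 32-bit timestamps with modular ("serial number") arithmetic, since the timestamp itself may also wrap:

    def ts_newer_or_equal(ts: int, recent: int) -> bool:
        """Serial-number comparison of 32-bit timestamps: ts >= recent mod 2^32."""
        return ((ts - recent) & 0xFFFFFFFF) < 0x80000000

    def paws_accept(segment_tsval: int, ts_recent: int) -> bool:
        """Discard the segment as an old duplicate if its timestamp is older
        than the most recently accepted timestamp on this connection."""
        return ts_newer_or_equal(segment_tsval, ts_recent)

    # The two segments from the example above, both with sequence number 12001:
    print(paws_accept(700, 400))   # True: newer timestamp, accept
    print(paws_accept(400, 700))   # False: older timestamp, old duplicate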

Computer Networking - Bit stuffing

In bit stuffing, why do we always add a non-information bit after five consecutive 1 bits? Is there any reason behind that?
Here is some information from tutorialspoint:
Bit stuffing: a pattern of bits of arbitrary length is stuffed in the message to differentiate it from the delimiter.
The flag field is some fixed sequence of binary values, such as 01111110. The payload can also contain this pattern, in which case a machine on the network could get confused and misinterpret that payload data as the flag field (indicating end of frame). To avoid this, the sender stuffs a 0 bit into the payload after every run of five consecutive 1s. Since the flag contains six consecutive 1s and the stuffed payload can never contain more than five, the flag pattern can no longer appear inside the payload; that is why the stuffing happens after exactly five bits (see the sketch below).
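A sketch (Python, over bit strings for readability) of HDLC-style stuffing with the 01111110 flag:

    def stuff(bits: str) -> str:
        """Insert a 0 after every run of five consecutive 1s."""
        out, run = [], 0
        for b in bits:
            out.append(b)
            run = run + 1 if b == "1" else 0
            if run == 5:
                out.append("0")     # stuffed bit: six 1s can now only be a flag
                run = 0
        return "".join(out)

    def unstuff(bits: str) -> str:
        """Drop the 0 that follows every run of five consecutive 1s."""
        out, run, skip = [], 0, False
        for b in bits:
            if skip:                # this is the stuffed 0; discard it
                skip, run = False, 0
                continue
            out.append(b)
            run = run + 1 if b == "1" else 0
            if run == 5:
                skip = True
        return "".join(out)

    payload = "011111101111101"     # happens to contain the flag pattern
    sent = stuff(payload)
    assert "01111110" not in sent   # stuffed payload can never mimic the flag
    assert unstuff(sent) == payload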

Calculating the Checksum in the receiver

I'm reading the book Data Communications and Networking, 4th edition, by Behrouz Forouzan. I have a question about an exercise that asks the following: the receiver of a message uses the checksum technique for 8-bit characters and receives the following information:
100101000011010100101000
How can I know whether the data sent is correct or not, and why?
I learned how to calculate the checksum with hexadecimal values, but I do not understand how to determine from a binary string whether the information is correct.
The sender calculates a checksum over the data and sends it with the data in the same message.
The receiver calculates the checksum again over the received data and checks whether the result matches the received checksum.
There is still a chance that both the data and the checksum were modified in transit so that they still match, but the likelihood of that happening because of random noise is extremely low.
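A worked sketch (Python), assuming Forouzan's convention that the 24 bits are three 8-bit words with the last one being the checksum, and that the receiver sums everything with one's complement (end-around carry) arithmetic and complements the result, where zero means accept:

    words = [0b10010100, 0b00110101, 0b00101000]   # the received 24 bits

    total = 0
    for w in words:
        total += w
        total = (total & 0xFF) + (total >> 8)      # wrap the carry around

    result = ~total & 0xFF
    print(bin(total), bin(result))                 # 0b11110001 0b1110

    # The complement is nonzero, so under this reading the receiver
    # would conclude the data was corrupted in transit.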
