why there is separate checksum in TCP and IP headers? - networking

What is the need for having checksum at various layers ? For eg, there is a checksum in TCP layer and again in IP layer and also Ethernet layer has it.
Is not it sufficient to have checksum at one layer ?

All three layers are needed, for multiple reasons:
IP does not always run over ethernet (imagine IP over RS-232 serial, something every Cisco and Unix box can do)
IP does not checksum the data
TCP packets can be reassembled incorrectly from IP packets and fragments that each have perfect checksums
Even if reassembled correctly, software or other errors could be introduced in the layers between IP and TCP
Even if all software functions correctly, and TCP/IP is over ethernet, the limited size of the checksums can be accidently correct (and will be at some point, given enough packets) in the face of persistent errors, so having more than one checksum is helpful.
Every time a new header is introduced there is more to checksum, and the new layer can't see the header bits of the layer below.

Ethernet checksum is a hop to hop checksum - meaning that it is recomputed everytime the Ethernet header fields change. TCP/UDP checksum is a end-to-end checksum meaning it is computed by the sender and verified by the receiver. TCP/UDP checksums cover the entire segment. IP checksum covers only the header. Ethernet CRC covers the entire frame.

The designers of IPv6 decided it's not necessary at all those layers and removed it in favor of checksums at other layers (such as those you mentioned).

Related

UDP numbered segments?

My firewall textbook says: "UDP breaks a message into numbered segments so that it can be transmitted."
My understanding was UDP had no sequence or other numbering scheme? That data was broken into packets and sent out with no ordered reconstruction on the other end, at least on this level. Am I missing something?
The book is just wrong here. The relevant section says:
User Datagram Protocol (UDP)—This protocol is similar to TCP in that it handles the addressing of a message. UDP breaks a message into numbered segments so that it can be transmitted. It then reassembles the message when it reaches the destination computer.
UDP does not include any mechanism to segment or reassemble messages; each message is sent as a single UDP datagram. If you look at the UDP "packet" (technically datagram) structure on page 108, there's no segment number or anything like that.
Mind you, segmentation can happen at other layers, either above or below UDP:
IP packets can be fragmented if they're too big for a network link's MTU (maximum transfer unit). This can happen to IP packets that contain UDP, TCP, or whatever. This is actually relevant for firewalls because creative fragmentation can sometimes be used to bypass packet filtering rules.
Some protocols that run on top of UDP also use something like numbered segments. For example, TFTP (trivial file transfer protocol) breaks files into "blocks", and transmits a block number in the header for each block. (And the receiver responds acknowledging the block number it's received -- it's like a drastically simplified version of TCP.) But this is part of the TFTP protocol, not part of UDP.
QUIC is another example of a protocol that runs over UDP and supports segmentation (and multiple connections, and...), and each packet contains a packet number. But again it's part of the QUIC protocol, not UDP.

Will IP layer split TCP segments?

I know that data from the application layer is split into segments by the Transport Layer (like TCP). Also, the Data Link layer might split the datagrams into multiple frames.
What about the Internet layer? Will IP layer simply encapsulate the segment or will it further split it?
Thanks,
Pavan.
The IP layer can't split a single TCP packet into multiple TCP packets, because it doesn't know what TCP is. However, routers along the network path may choose to fragment the IP packet itself into multiple pieces. Each of those pieces contains only a fraction of the TCP packet, so they all need to be received before the TCP layer can go to work. (For that matter, the sending machine can send out the packet pre-fragmented if it likes, though one generally tries to size the TCP packet so it doesn't have to.)
All that's the theory. In practice, IP fragmentation is uncommon, and avoided as much as possible. Additionally, IPv6 doesn't support fragmentation at all.

Packet switching vs fragmentation at ip layer

packet switching is a protocol, where a message received from tcp layer is divided into packets only at sender machine ip layer and each packet is sent individually on different routes with an identification field set in ip header to help use re-assemble at destination machine.
Where as
fragmentation at ip layer is done on sender machine or any of the, on the way layer 3 device ip layer and fragmentation field is set in ip header to help use re-assemble at destination machine only.
My question:
is my understanding correct?
In packet switching, If message could not be re-assembled due to missing packet at destination, based on identification field, that message is discarded at ip layer of destination machine and tcp layer of sender machine will take care of retransmission of that message, am i correct?
packet switching is a protocol
No. Packet switching is an alternative to circuit switching at the physical layer.
where a message received from tcp layer is divided into packets only at sender machine ip layer and each packet is sent individually on different routes with an identification field set in ip header to help use re-assemble at destination machine.
None of this is correct as a description of packet switching. Packet switching implies the existence of packets, period. It doesn't impose any of these constraints.
Whereas fragmentation at ip layer is done on sender machine or any of the
... intermediate nodes
on the way layer 3 device ip layer
Layer 2
and fragmentation field is set in ip header to help use re-assemble at destination machine only.
My question:
is my understanding correct?
No. You seem to think that packet switching and fragmentation are in some kind of opposition. They aren't. Fragmentation is an extension of packet switching if anything, not an alternative to it.
In packet switching, If message could not be re-assembled due to missing packet at destination, based on identification field, that message is discarded at ip layer of destination machine and tcp layer of sender machine will take care of retransmission of that message, am i correct?
No. Again you're confused about what packet switching is. Your remark applies pretty well to TCP, but that's because of the semantics of TCP, not because of packet switching.

How is the Protocol Attribute set for IP Fragments?

I am testing a network device driver's ability to cope with corrupted packets. The specific case I want to test is a when a large TCP packet is fragmented along the path because of smaller MTU in the way.
What most interests me about the IP Fragmentation of the large TCP packet is, is the protocol attribute of the IP Fragment packet set to TCP for each packet, or just the first fragment?
The protocol field will be set to TCP (6) for each fragment.
From RFC 791 - Internet Protocol
To fragment a long internet datagram,
an internet protocol module (for
example, in a gateway), creates two
new internet datagrams and copies the
contents of the internet header fields
from the long datagram into both new
internet headers. ... This procedure
can be generalized for an n-way split,
rather than the two-way split
described.
Protocol is part of the header and will consequently be copied into each of the fragments.
IP Fragmentation is a layer-3 activity, while the packet will be marked TCP, the intermediate fragments will not be usable by TCP. The TCP layer will have to wait for a re-assembly of the actual IP packet (unfragmented) before it can process it.
Wikipedia IP Fragmentation reference.
Path MTU-Discovery will usually update the source MTU and TCP packets (actually segments) will be sent with sizes limited to not cause fragmentation on the way

What is the Significance of Pseudo Header used in UDP/TCP

Why is the Pseudo header prepended to the UDP datagram for the computation of the UDP checksum? What's the rational behind this?
The nearest you will get to an answer "straight from the horse's mouth", is from David P. Reed at the following link.
http://www.postel.org/pipermail/end2end-interest/2005-February/004616.html
The short version of the answer is, "the pseudo header exists for historical reasons".
Originally, TCP/IP was a single monolithic protocol (called just TCP). When they decided to split it up into TCP and IP (and others), they didn't separate the two all that cleanly: the IP addresses were still thought of as part of TCP, but they were just "inherited" from the IP layer rather than repeated in the TCP header. The reason why the TCP checksum operates over parts of the IP header (including the IP addresses) is because they intended to use cryptography to encrypt and authenticate the TCP payload, and they wanted the IP addresses and other TCP parameters in the pseudo header to be protected by the authentication code. That would make it infeasible for a man in the middle to tamper with the IP source and destination addresses: intermediate routers wouldn't notice the tampering, but the TCP end-point would when it attempted to verify the signature.
For various reasons, none of that grand cryptographic plan came to pass, but the TCP checksum which took its place still operates over the pseudo header as though it were a useful thing to do. Yes, it gives you a teensy bit of extra protection against random errors, but that's not why it exists. Frankly, we'd be better off without it: the coupling between TCP and IP means that you have to redefine TCP when you change IP. Thus, the definition of IPv6 includes a new definition for the TCP and UDP pseudo header (see RFC 2460, s8.1). Why the IPv6 designers chose to perpetuate this coupling rather than take the chance to abolish it is beyond me.
From the TCP or UDP point of view, the packet does not contain IP addresses. (IP being the layer beneath them.)
Thus, to do a proper checksum, a "pseudo header" is included. It's "pseudo", because it is not actaully part of the UDP datagram. It contains the most important parts of the IP header, that is, source and destination address, protocol number and data length.
This is to ensure that the UDP checksum takes into account these fields.
When these protocols were being designed, a serious concern of theirs was a host receiving a packet thinking it was theirs when it was not. If a few bits were flipped in the IP header during transit and a packet changed course (but the IP checksum was still correct), the TCP/UDP stack of the redirected receiver can still know to reject the packet.
Though the pseudo-header broke the separation of layers idiom, it was deemed acceptable for the increased reliability.
"The purpose of using a pseudo-header is to verify that the UDP
datagram has reached its correct destination. The key to
understanding the pseudo-header lies in realizing that the correct
destination consists of a specific machine and a specific protocol
port within that machine. The UDP header itself specifies only the
protocol port number. Thus, to verify the destination, UDP on the
sending machine computes a checksum that covers the destination IP
address as well as the UDP datagram. The pseudo-header is not
transmitted with the UDP datagram, nor is it included in the length."
E. Comer - Internetworking with TCP/IP 4th edition.
Pseudo IP header contains the source IP, destination IP, protocol and Total length fields. Now, by including these fields in TCP checksum, we are verifying the checksum for these fields both at Network layer and Transport layer, thus doing a double check to ensure that the data is delivered to the correct host.

Resources