From what I have understood so far, unlike non-transparent IP fragmentation, where packets are fragmented at the source and reassembled only at the destination, in transparent IP fragmentation intermediate network systems reassemble and re-fragment the IP packet while it is in transit. Consider the example below, where two end systems (A) and (B) communicate with each other, being part of subnet1 and subnet3 respectively. Would router1 reassemble the IP packet fragments sent by A and then send the whole, non-fragmented packet to router2, which would fragment it again before sending it to B? Is this how transparent IP fragmentation works?
(A) subnet1 ----- router1 ------ subnet2 ------- router2 ------- subnet3 (B)
I think you should read this post, which gives you an idea of how it works.
As I understand it, the packet is reassembled by router1, sent on to router2, and then arrives at B.
In all cases, B receives the whole packet and not the fragments made by A.
In normal IPv4 fragmentation, a router will fragment a packet if the MTU of the next interface is smaller than the packet size. Fragmentation occurs on the router(s), but reassembly of the packet fragments is the responsibility of the end-system.
The use of fragmentation and reassembly between two intermediate systems in the path is not directly covered by the IPv4 RFC, so there is no real standard for how this happens. Basically, the end of one link will fragment a packet to fit the MTU of the link, and the other end of the link will reassemble the fragments into the original packet. This is not common because it places a large burden on the routers; in normal fragmentation, the burden of reassembly is placed on the end-system.
From RFC 791, Internet Protocol:
The basic internet service is datagram oriented and provides for the
fragmentation of datagrams at gateways, with reassembly taking place
at the destination internet protocol module in the destination host.
Of course, fragmentation and reassembly of datagrams within a network
or by private agreement between the gateways of a network is also
allowed since this is transparent to the internet protocols and the
higher-level protocols. This transparent type of fragmentation and
reassembly is termed "network-dependent" (or intranet) fragmentation
and is not discussed further here.
The full description of fragmentation in the RFC:
Fragmentation
Fragmentation of an internet datagram is necessary when it originates
in a local net that allows a large packet size and must traverse a
local net that limits packets to a smaller size to reach its
destination.
An internet datagram can be marked "don't fragment." Any internet
datagram so marked is not to be internet fragmented under any
circumstances. If internet datagram marked don't fragment cannot be
delivered to its destination without fragmenting it, it is to be
discarded instead.
Fragmentation, transmission and reassembly across a local network
which is invisible to the internet protocol module is called intranet
fragmentation and may be used [6].
The internet fragmentation and reassembly procedure needs to be able
to break a datagram into an almost arbitrary number of pieces that can
be later reassembled. The receiver of the fragments uses the
identification field to ensure that fragments of different datagrams
are not mixed. The fragment offset field tells the receiver the
position of a fragment in the original datagram. The fragment offset
and length determine the portion of the original datagram covered by
this fragment. The more-fragments flag indicates (by being reset) the
last fragment. These fields provide sufficient information to
reassemble datagrams.
The identification field is used to distinguish the fragments of one
datagram from those of another. The originating protocol module of an
internet datagram sets the identification field to a value that must
be unique for that source-destination pair and protocol for the time
the datagram will be active in the internet system. The originating
protocol module of a complete datagram sets the more-fragments flag to
zero and the fragment offset to zero.
To fragment a long internet datagram, an internet protocol module (for
example, in a gateway), creates two new internet datagrams and copies
the contents of the internet header fields from the long datagram into
both new internet headers. The data of the long datagram is divided
into two portions on a 8 octet (64 bit) boundary (the second portion
might not be an integral multiple of 8 octets, but the first must be).
Call the number of 8 octet blocks in the first portion NFB (for Number
of Fragment Blocks). The first portion of the data is placed in the
first new internet datagram, and the total length field is set to the
length of the first datagram. The more-fragments flag is set to one.
The second portion of the data is placed in the second new internet
datagram, and the total length field is set to the length of the
second datagram. The more-fragments flag carries the same value as
the long datagram. The fragment offset field of the second new
internet datagram is set to the value of that field in the long
datagram plus NFB.
This procedure can be generalized for an n-way split, rather than the
two-way split described.
To assemble the fragments of an internet datagram, an internet
protocol module (for example at a destination host) combines internet
datagrams that all have the same value for the four fields:
identification, source, destination, and protocol. The combination is
done by placing the data portion of each fragment in the relative
position indicated by the fragment offset in that fragment's internet
header. The first fragment will have the fragment offset zero, and
the last fragment will have the more-fragments flag reset to zero.
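To make the procedure above concrete, here is a minimal Python sketch of the same two operations. It is not the RFC's reference code, and the function and field names are my own; it assumes 20-byte headers with no options, leaves the header checksum at zero, and assumes every fragment of a single datagram arrives intact.

    import struct

    IP_MF = 0x2000  # more-fragments flag within the flags/fragment-offset field

    def fragment(fields, payload, mtu):
        """Split one datagram's payload into fragments that fit the MTU.
        fields is a dict with ident, ttl, proto, src, dst (addresses as 4 bytes)."""
        max_data = ((mtu - 20) // 8) * 8      # data per fragment, 8-octet aligned
        fragments, pos = [], 0
        while True:
            chunk = payload[pos:pos + max_data]
            last = pos + len(chunk) >= len(payload)
            flags_off = (0 if last else IP_MF) | (pos // 8)   # offset in 8-octet blocks
            header = struct.pack(
                "!BBHHHBBH4s4s",
                0x45, 0,                      # version 4, IHL 5; type of service 0
                20 + len(chunk),              # total length of this fragment
                fields["ident"],              # identification, copied to every fragment
                flags_off,
                fields["ttl"],
                fields["proto"],              # protocol, copied to every fragment
                0,                            # header checksum omitted in this sketch
                fields["src"], fields["dst"])
            fragments.append(header + chunk)
            pos += len(chunk)
            if last:
                return fragments

    def reassemble(fragments):
        """Combine fragments sharing (identification, source, destination, protocol)."""
        pieces, total = {}, None
        for frag in fragments:
            _, _, tot_len, ident, flags_off, _, proto, _, src, dst = \
                struct.unpack("!BBHHHBBH4s4s", frag[:20])
            offset = (flags_off & 0x1FFF) * 8
            pieces.setdefault((ident, src, dst, proto), {})[offset] = frag[20:]
            if not (flags_off & IP_MF):       # last fragment tells us the full length
                total = offset + tot_len - 20
        (key, parts), = pieces.items()        # sketch assumes a single datagram
        data = bytearray(total)
        for off, chunk in parts.items():
            data[off:off + len(chunk)] = chunk
        return bytes(data)

    # Example: a 3000-byte payload over a 1500-byte MTU yields three fragments.
    frags = fragment({"ident": 0x1234, "ttl": 64, "proto": 6,
                      "src": bytes([10, 0, 0, 1]), "dst": bytes([10, 0, 1, 2])},
                     b"x" * 3000, mtu=1500)
    assert reassemble(frags) == b"x" * 3000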
I send mixtures of large UDP packets back-to-back with small UDP packets. The large packets get fragmented to my MTU.
On RHEL6 (CentOS6), the small UDP packets always arrive at the receivers in the correct order with respect to the final fragment of any previous large packet.
On RHEL7, this is no longer the case. The small packet can get transmitted in between the fragments of the larger packet, causing the receiver to see the small packet BEFORE the reassembled large packet.
As near as I can tell with ethtool, the configuration of the NIC is the same on both machines (it's actually the same machine; I just swap hard drives).
So, my question is: what controls this behavior in RHEL7+? It's not udp-fragmentation-offload (that's set the same in both configurations). I'd like to find out how to force the fragments to be transmitted as a complete group, with no interfering packets, in RHEL7+.
Thanks,
XL600
My firewall textbook says: "UDP breaks a message into numbered segments so that it can be transmitted."
My understanding was that UDP has no sequence numbers or any other numbering scheme, and that data is broken into packets and sent out with no ordered reconstruction on the other end, at least at this level. Am I missing something?
The book is just wrong here. The relevant section says:
User Datagram Protocol (UDP)—This protocol is similar to TCP in that it handles the addressing of a message. UDP breaks a message into numbered segments so that it can be transmitted. It then reassembles the message when it reaches the destination computer.
UDP does not include any mechanism to segment or reassemble messages; each message is sent as a single UDP datagram. If you look at the UDP "packet" (technically datagram) structure on page 108, there's no segment number or anything like that.
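As a quick illustration (a sketch, not any library's API), the entire UDP header is only four 16-bit fields, so there is simply nowhere to put a segment number:

    import struct

    def build_udp_datagram(src_port, dst_port, payload, checksum=0):
        """The complete UDP header: source port, destination port, length, checksum.
        Note there is no sequence/segment number field anywhere in it."""
        length = 8 + len(payload)             # 8-byte header plus data
        return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

    datagram = build_udp_datagram(5000, 5001, b"hello")
    print(len(datagram))                      # 13: an 8-byte header and 5 bytes of data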
Mind you, segmentation can happen at other layers, either above or below UDP:
IP packets can be fragmented if they're too big for a network link's MTU (maximum transmission unit). This can happen to IP packets that contain UDP, TCP, or whatever. This is actually relevant for firewalls because creative fragmentation can sometimes be used to bypass packet filtering rules.
Some protocols that run on top of UDP also use something like numbered segments. For example, TFTP (trivial file transfer protocol) breaks files into "blocks", and transmits a block number in the header for each block. (And the receiver responds acknowledging the block number it's received -- it's like a drastically simplified version of TCP.) But this is part of the TFTP protocol, not part of UDP.
QUIC is another example of a protocol that runs over UDP and supports segmentation (and multiple connections, and...), and each packet contains a packet number. But again it's part of the QUIC protocol, not UDP.
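To make the TFTP example above concrete: the block numbering lives in the application payload carried inside each UDP datagram, not in UDP itself. A rough sketch of TFTP-style DATA packets (2-byte opcode, 2-byte block number, up to 512 bytes of data, loosely following RFC 1350; the function name is mine):

    import struct

    TFTP_DATA = 3           # DATA opcode in RFC 1350
    BLOCK_SIZE = 512        # classic TFTP block size

    def tftp_data_packets(file_bytes):
        """Split a file into numbered TFTP-style DATA payloads to send over UDP."""
        packets, block_no = [], 1
        for start in range(0, len(file_bytes), BLOCK_SIZE):
            chunk = file_bytes[start:start + BLOCK_SIZE]
            packets.append(struct.pack("!HH", TFTP_DATA, block_no) + chunk)
            block_no += 1
        return packets

    # The receiver uses the block number in each payload (not anything in UDP)
    # to reorder the data and to acknowledge each block; real TFTP signals the
    # end of the file with a final block shorter than 512 bytes.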
Packet switching is a protocol where a message received from the TCP layer is divided into packets only at the sender machine's IP layer, and each packet is sent individually on different routes, with an identification field set in the IP header to help us re-assemble the message at the destination machine.
Whereas fragmentation at the IP layer is done on the sender machine or on any of the on-the-way layer-3 devices' IP layers, and the fragmentation fields are set in the IP header to help us re-assemble the message at the destination machine only.
My question:
Is my understanding correct?
In packet switching, if a message could not be re-assembled due to a missing packet at the destination, then based on the identification field that message is discarded at the IP layer of the destination machine, and the TCP layer of the sender machine will take care of retransmitting that message. Am I correct?
packet switching is a protocol
No. Packet switching is an alternative to circuit switching at the physical layer.
where a message received from the TCP layer is divided into packets only at the sender machine's IP layer, and each packet is sent individually on different routes, with an identification field set in the IP header to help us re-assemble the message at the destination machine.
None of this is correct as a description of packet switching. Packet switching implies the existence of packets, period. It doesn't impose any of these constraints.
Whereas fragmentation at the IP layer is done on the sender machine or on any of the
... intermediate nodes
on-the-way layer-3 devices' IP layers
Layer 2
and the fragmentation fields are set in the IP header to help us re-assemble the message at the destination machine only.
My question:
Is my understanding correct?
No. You seem to think that packet switching and fragmentation are in some kind of opposition. They aren't. Fragmentation is, if anything, an extension of packet switching, not an alternative to it.
In packet switching, if a message could not be re-assembled due to a missing packet at the destination, then based on the identification field that message is discarded at the IP layer of the destination machine, and the TCP layer of the sender machine will take care of retransmitting that message. Am I correct?
No. Again you're confused about what packet switching is. Your remark applies pretty well to TCP, but that's because of the semantics of TCP, not because of packet switching.
I am testing a network device driver's ability to cope with corrupted packets. The specific case I want to test is when a large TCP packet is fragmented along the path because of a smaller MTU in the way.
What most interests me about the IP fragmentation of the large TCP packet is this: is the protocol field of the IP fragment set to TCP for each fragment, or just for the first one?
The protocol field will be set to TCP (6) for each fragment.
From RFC 791 - Internet Protocol
To fragment a long internet datagram,
an internet protocol module (for
example, in a gateway), creates two
new internet datagrams and copies the
contents of the internet header fields
from the long datagram into both new
internet headers. ... This procedure
can be generalized for an n-way split,
rather than the two-way split
described.
Protocol is part of the header and will consequently be copied into each of the fragments.
IP fragmentation is a layer-3 activity; while each fragment will be marked TCP, the individual fragments are not usable by TCP. The TCP layer has to wait for reassembly of the actual (unfragmented) IP packet before it can process it.
Wikipedia IP Fragmentation reference.
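For the driver test, checking this is easy: the protocol field is byte 9 of every fragment's IPv4 header. A small sketch (the function name is mine; it assumes you already have the raw IPv4 packets in hand):

    import struct

    def describe_fragment(ip_packet):
        """Report protocol and fragment position for one raw IPv4 packet."""
        _, _, _, ident, flags_off = struct.unpack("!BBHHH", ip_packet[:8])
        proto = ip_packet[9]                  # protocol field: 6 (TCP) in every fragment
        return {"id": ident,
                "protocol": proto,
                "offset": (flags_off & 0x1FFF) * 8,   # where this fragment's data goes
                "more_fragments": bool(flags_off & 0x2000)}

    # Only the fragment with offset == 0 actually contains the TCP header,
    # even though every fragment carries protocol == 6.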
Path MTU Discovery will usually update the source's path MTU, and TCP packets (actually segments) will be sent with sizes limited so as not to cause fragmentation along the way.
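On Linux you can see this per socket: with path MTU discovery forced on, the kernel sets DF and refuses to fragment locally. A hedged sketch (the numeric constants are the usual Linux values, used as fallbacks in case the Python build in use does not export them):

    import socket

    # socket.IP_MTU_DISCOVER / socket.IP_PMTUDISC_DO may or may not be exported,
    # depending on the platform and Python build, so fall back to the Linux values.
    IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
    IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    # Sending a datagram larger than the path MTU now fails with EMSGSIZE instead
    # of being fragmented; TCP sockets on Linux do the equivalent by default and
    # size their segments to the discovered path MTU.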
Why is the pseudo-header prepended to the UDP datagram for the computation of the UDP checksum? What is the rationale behind this?
The nearest you will get to an answer "straight from the horse's mouth" is from David P. Reed at the following link.
http://www.postel.org/pipermail/end2end-interest/2005-February/004616.html
The short version of the answer is, "the pseudo header exists for historical reasons".
Originally, TCP/IP was a single monolithic protocol (called just TCP). When they decided to split it up into TCP and IP (and others), they didn't separate the two all that cleanly: the IP addresses were still thought of as part of TCP, but they were just "inherited" from the IP layer rather than repeated in the TCP header. The reason why the TCP checksum operates over parts of the IP header (including the IP addresses) is because they intended to use cryptography to encrypt and authenticate the TCP payload, and they wanted the IP addresses and other TCP parameters in the pseudo header to be protected by the authentication code. That would make it infeasible for a man in the middle to tamper with the IP source and destination addresses: intermediate routers wouldn't notice the tampering, but the TCP end-point would when it attempted to verify the signature.
For various reasons, none of that grand cryptographic plan came to pass, but the TCP checksum which took its place still operates over the pseudo header as though it were a useful thing to do. Yes, it gives you a teensy bit of extra protection against random errors, but that's not why it exists. Frankly, we'd be better off without it: the coupling between TCP and IP means that you have to redefine TCP when you change IP. Thus, the definition of IPv6 includes a new definition for the TCP and UDP pseudo header (see RFC 2460, s8.1). Why the IPv6 designers chose to perpetuate this coupling rather than take the chance to abolish it is beyond me.
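For reference, the IPv6 incarnation of that coupling is easy to write down: per RFC 2460 section 8.1, the pseudo-header covered by the TCP and UDP checksums is the 16-byte source and destination addresses, a 32-bit upper-layer packet length, three zero bytes, and the next-header value. A small sketch (the function name is mine):

    import struct

    def ipv6_pseudo_header(src16, dst16, upper_len, next_header):
        """RFC 2460 s8.1 pseudo-header: 16-byte src and dst addresses,
        32-bit upper-layer length, 3 zero bytes, next-header value."""
        return src16 + dst16 + struct.pack("!I3xB", upper_len, next_header)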
From the TCP or UDP point of view, the packet does not contain IP addresses. (IP being the layer beneath them.)
Thus, to do a proper checksum, a "pseudo header" is included. It's "pseudo" because it is not actually part of the UDP datagram. It contains the most important parts of the IP header, that is, the source and destination addresses, the protocol number, and the data length.
This is to ensure that the UDP checksum takes into account these fields.
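As a concrete illustration (a minimal sketch, IPv4 only, with the UDP checksum field zeroed before computing), the checksum is the ones'-complement sum taken over exactly that pseudo-header followed by the UDP header and data:

    import struct

    def ones_complement_sum(data):
        """16-bit ones'-complement sum used by the IP-family checksums."""
        if len(data) % 2:
            data += b"\x00"                   # pad odd-length input with a zero byte
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        while total >> 16:                    # fold carries back into the low 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return total

    def udp_checksum(src_ip, dst_ip, udp_header_and_data):
        """Checksum over the pseudo-header (src, dst, zero, protocol 17, UDP length)
        plus the UDP datagram itself. RFC 768: a result of zero is sent as 0xFFFF."""
        pseudo = struct.pack("!4s4sBBH", src_ip, dst_ip, 0, 17,
                             len(udp_header_and_data))
        return (~ones_complement_sum(pseudo + udp_header_and_data)) & 0xFFFF

Flip a single bit in src_ip or dst_ip and the verification at the receiver fails, which is exactly the "reached its correct destination" check the pseudo-header is meant to provide.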
When these protocols were being designed, a serious concern was a host receiving a packet and believing it was addressed to it when it was not. If a few bits were flipped in the IP header during transit and the packet changed course (but the IP checksum was still correct), the TCP/UDP stack of the unintended receiver could still detect the error and reject the packet.
Though the pseudo-header broke the separation of layers idiom, it was deemed acceptable for the increased reliability.
"The purpose of using a pseudo-header is to verify that the UDP
datagram has reached its correct destination. The key to
understanding the pseudo-header lies in realizing that the correct
destination consists of a specific machine and a specific protocol
port within that machine. The UDP header itself specifies only the
protocol port number. Thus, to verify the destination, UDP on the
sending machine computes a checksum that covers the destination IP
address as well as the UDP datagram. The pseudo-header is not
transmitted with the UDP datagram, nor is it included in the length."
E. Comer - Internetworking with TCP/IP 4th edition.
The pseudo IP header contains the source IP address, destination IP address, protocol, and upper-layer (TCP/UDP) length fields. By including these fields in the TCP checksum, we verify them at both the network layer and the transport layer, a double check to ensure that the data is delivered to the correct host.