Sequence number and acknowledgement number do not match - networking

I have been learning Wireshark lately. While inspecting TCP segments, I saw a situation that was strange, at least to me: the SEQ and ACK numbers did not match up. Then I realized that the difference between two of the ACKs was one and a half times the packet size. However, as far as I know, ACKs only grow by the full packet size. So what happened here?
SEQ     ACK
1       1
2897    8689
5793    13033  <--
8689    14481
11585
14481

Well, there's not much to go by here, but I'm going to guess that you are capturing packets on the machine sending the large packets, i.e., the packets with 2896 bytes of TCP payload. As such, you are seeing the packets before they are IP-fragmented and actually transmitted out on the wire.
But you can't just send 2896 bytes of data onto the wire; Ethernet links typically impose a 1500 byte MTU (Maximum Transmission Unit), and when you account for IP and TCP header overhead, you usually end up with 1460 bytes of available payload or MSS (Maximum Segment Size). In your case it looks like you're only getting 1448 bytes of available MSS, most likely due to the addition of one or more IP and/or TCP header options.
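As a quick sanity check, here's that MSS arithmetic in Python. Attributing the missing 12 bytes to the TCP timestamps option (10 bytes plus padding) is my assumption, since the capture doesn't show which options are actually in use:

```python
# Hypothetical MSS arithmetic for a standard Ethernet link.
MTU = 1500           # Ethernet payload limit
IP_HEADER = 20       # IPv4 header without options
TCP_HEADER = 20      # TCP header without options
TCP_TIMESTAMPS = 12  # TCP timestamps option (10 bytes, padded to 12) -- assumed

print(MTU - IP_HEADER - TCP_HEADER)                   # 1460: the usual MSS
print(MTU - IP_HEADER - TCP_HEADER - TCP_TIMESTAMPS)  # 1448: the MSS seen here
```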
In any case, the 2896 bytes of payload are going to be fragmented over 2 IP fragments, each containing 1448 bytes of TCP payload. I'm pretty sure that what you're seeing is an ACK from the receiver after having received 1 full segment plus 1 IP fragment from the next segment.
The previous ACK number was 8689, and 8689 + 2896 = 11585. Now add 1/2 of the data segment (2896 / 2 = 1448) and you get 11585 + 1448 = 13033. That's the ACK number you're seeing. Now add the other 1/2 and you get 13033 + 1448 = 14481, which is the ACK number of the next packet.
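The same arithmetic as a few lines of Python, using only the numbers from the capture:

```python
segment = 2896           # TCP payload of each large captured segment
fragment = segment // 2  # 1448 bytes of payload per IP fragment on the wire

prev_ack = 8689
after_full_segment = prev_ack + segment               # 11585
after_one_fragment = after_full_segment + fragment    # 13033, the "odd" ACK
after_both_fragments = after_one_fragment + fragment  # 14481, the next ACK

print(after_full_segment, after_one_fragment, after_both_fragments)
```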
I hope that makes sense?
For an in-depth look at the drawbacks of local packet captures, I direct you to a well-written blog post by Jasper Bongertz titled "The drawbacks of local packet captures".

Since I do not have enough reputation to comment on Christopher Maynard's answer, I will comment here.
As Christopher correctly stated, the MTU (Maximum Transmission Unit) in standard Ethernet networks is 1500 bytes. However, crafting 1460-byte TCP segments (1500 bytes - 20-byte IP header - 20-byte TCP header, ignoring possible TCP options) imposes a high load on the network stack, since every segment must be handled separately.
As an example, if I want to transmit 1 MB (1024 * 1024 bytes = 1,048,576 bytes) of data, it results in 1 MB / 1460 bytes = 719 segments, in contrast to 1 MB / 65525 bytes (the theoretical maximum TCP segment size) = 16 segments. Since the overhead in the kernel is mostly independent of the segment/packet size, small segments require much more processing.
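A short sketch of that comparison, using the figures from the paragraph above:

```python
data = 1024 * 1024                    # 1 MB, as defined above

for seg_size in (1460, 65525):        # per-packet payload vs. max segment size
    print(seg_size, data / seg_size)  # -> ~718.2 and ~16.0
# i.e. roughly 719 small segments versus 16 maximum-size ones
```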
To counteract this, TCP Segmentation Offloading (TSO) was developed. TSO allows the kernel to craft TCP segments of maximum size; the NIC (Network Interface Card) then uses hardware acceleration to split them into separate 1460-byte TCP segments, so no IP fragmentation is required.

Does the internet really work at 1500 bytes?

MTU (Maximum transmission unit) is the maximum frame size that can be transported.
When we talk about MTU, it's generally a cap at the hardware level, applying to the lower layers - data link and physical.
Now, considering the OSI layers, it does not matter how efficient the upper layers are or what kind of magic sauce they apply: the data-link layer will always construct frames of size <= 1500 bytes (or whatever the MTU is), and anything on the "internet" will always be transmitted at that frame size.
Is the internet's transmission really capped at 1500 bytes? Nowadays we see speeds of 10-100 Mbps and even Gbps. I wonder, at such speeds, are frames still transmitted at 1500 bytes? That would mean lots and lots of fragmentation and re-assembly at the receiver. At this scale, how do the upper layers achieve efficiency?!
[EDIT]
Based on the comments below, I am re-framing my question:
If the data-link layer transmits 1500-byte frames, I want to know how the upper layers at the receiver are able to handle such a huge volume of incoming frames.
For example: if the internet speed is 100 Mbps, the upper layers will have to process 104,857,600 bytes/second, or 104857600 / 1500 = 69905 frames/second. The network layer also needs to re-assemble these frames. How is the network layer able to handle such a scale?
"If the data-link layer transmits 1500-byte frames, I want to know how the upper layers at the receiver are able to handle such a huge volume of incoming frames."
1500 octets is a reasonable MTU (Maximum Transmission Unit), which is the size of the data-link protocol payload. Remember that not all frames are that size, it is just the maximum size of the frame payload. There are many, many things with much smaller payloads. For example, VoIP has very small payloads, often smaller than the overhead of the various protocols.
Frames and packets get lost or dropped all the time, often on purpose (see RED, Random Early Detection). The larger the data unit, the more data that is lost when a frame or packet is lost, and with reliable protocols, such as TCP, the more data must be resent.
Also, having a reasonable limit on frame or packet size keeps one host from monopolizing the network. Hosts must take turns.
"For example: if the internet speed is 100 Mbps, the upper layers will have to process 104,857,600 bytes/second, or 104857600 / 1500 = 69905 frames/second. The network layer also needs to re-assemble these frames. How is the network layer able to handle such a scale?"
Your statement has several problems.
First, 100 Mbps is 12,500,000 bytes per second. To calculate the number of frames per second, you must take the data-link overhead into account. For Ethernet, you have a 7-octet preamble, a 1-octet start-of-frame delimiter, a 14-octet frame header, the payload (46 to 1500 octets), a 4-octet CRC, and then a 12-octet inter-packet gap. The Ethernet overhead is therefore 38 octets, not counting the payload. To know how many frames per second, you would need to know the payload size of each frame; you seem to wrongly assume every frame payload is the maximum 1500 octets, and that is not true. For maximum-size frames, you get just over 8,000 frames per second.
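As a sanity check, here is that arithmetic as a short Python sketch (it assumes, purely for illustration, that every frame carries the maximum 1500-octet payload):

```python
BANDWIDTH_BPS = 100_000_000  # 100 Mbps link

# Per-frame Ethernet overhead in octets (preamble, SFD, header, CRC, gap):
OVERHEAD = 7 + 1 + 14 + 4 + 12  # = 38 octets, not counting the payload

payload = 1500                         # assume maximum-size frames
frame_bits = (payload + OVERHEAD) * 8  # 1538 octets = 12304 bits on the wire
print(BANDWIDTH_BPS / frame_bits)      # ~8127 frames/second, "just over 8,000"
```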
Next, the network layer does not reassemble frame payloads. The payload of a frame is one network-layer packet. The payload of the network packet is the transport-layer data unit (TCP segment, UDP datagram, etc.). The payload of the transport protocol is presented to the application process, and it may be application data or an application-layer protocol, e.g. HTTP (remember that the OSI model is just a model; OSes do not implement separate session and presentation layers, only the application layer).
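If it helps to see that nesting, here is a sketch using scapy (a third-party Python packet-crafting library; using it here is my illustration, not something this answer depends on):

```python
# pip install scapy  (third-party library, assumed for this illustration)
from scapy.all import Ether, IP, TCP, Raw

# Each "/" places the right-hand layer in the left-hand layer's payload:
# frame payload = IP packet, IP payload = TCP segment, TCP payload = app data.
frame = Ether() / IP() / TCP() / Raw(load=b"GET / HTTP/1.1\r\n\r\n")
frame.show()  # prints the Ether -> IP -> TCP -> Raw nesting, field by field
```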
The bandwidth, 100 Mbps in your example, is how fast a host can serialize the bits onto the wire. That is a function of the NIC hardware and the physical/data-link protocol it uses.
"That would mean lots and lots of fragmentation and re-assembly at the receiver."
Packet fragmentation is basically obsolete. It is still part of IPv4, but fragmentation in the path has been eliminated in IPv6, and smart businesses do not allow IPv4 packet fragments, due to fragmentation attacks. An IPv4 packet may be fragmented if the DF bit is not set in its header and the MTU somewhere in the path shrinks below the original MTU; for example, a tunnel will have a smaller MTU because of the tunnel overhead. If the DF bit is set and a packet is too large for the MTU of the next link, the packet is dropped. Packet fragmentation is very resource-intensive on a router; there is a whole set of steps that must be performed to fragment a packet.
You may be confusing IPv4 packet fragmentation and reassembly with TCP segmentation, which is something completely different.

Does TCP buffer segments even if a previous segment is lost?

Let's say we have a host A and a host B.
I send 5 segments from A to B, with sequence numbers starting from 100, and each segment is 20 bytes long.
If no packets were lost, I should expect ACK = 200 from B.
But a packet was lost: B got every segment except the 2nd one.
I should get 4 ACKs of 120 from B, indicating the loss of the 2nd segment.
After I resend the 2nd segment, what will the ACK from B be: is it gonna be 140 or 200?
If it is 140, then it means B didn't buffer the 3rd, 4th and 5th segments.
If it is 200, then it means B only needed the 2nd segment.
Which one is the true answer?
The buffering of out-of-order packets is not part of the TCP protocol itself, so it depends on the receiver's TCP implementation. Either way, the TCP sender will handle the situation correctly.
The technique used to deal with out-of-order packets in Linux TCP is well described here:
Johannessen, Mads. Investigate reordering in Linux TCP. MS thesis, University of Oslo, 2015. https://www.duo.uio.no/bitstream/handle/10852/47651/1/thesis-madsjoh.pdf
According to this research, Linux TCP will probably buffer the 3rd, 4th and 5th segments (if there is enough space for them) and reply to the retransmission with ACK = 200.
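As a rough illustration of why the answer is 200 when the receiver buffers, here is a toy model of a cumulative-ACK receiver (a sketch of the general mechanism, not of Linux's actual implementation):

```python
class ToyReceiver:
    """Cumulative-ACK receiver that buffers out-of-order segments."""

    def __init__(self, next_expected):
        self.next_expected = next_expected  # next byte we want to see
        self.buffer = {}                    # seq -> length, out-of-order data

    def receive(self, seq, length):
        if seq == self.next_expected:
            self.next_expected += length
            # Drain any buffered segments that are now contiguous.
            while self.next_expected in self.buffer:
                self.next_expected += self.buffer.pop(self.next_expected)
        elif seq > self.next_expected:
            self.buffer[seq] = length       # hold on to out-of-order data
        return self.next_expected           # the cumulative ACK we send

rx = ToyReceiver(next_expected=100)
for seq in (100, 140, 160, 180):  # the 2nd segment (seq=120) was lost
    print(rx.receive(seq, 20))    # 120, 120, 120, 120: duplicate ACKs
print(rx.receive(120, 20))        # 200: all buffered data is ACKed at once
```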

difference between MTU and link bandwidth

What is the difference between a link's MTU (maximum transmission unit) and a link's bandwidth? MTU refers to the maximum size of a packet that a link can send. Link bandwidth refers to the maximum number of bits that a link can send per second. Aren't they the same thing?
MTU = maximum size of 1 packet. For example, Ethernet's is 1500 bytes.
Bandwidth = bits you can send through a link, for example, 1 gigabit per second.
So to send 1 megabyte of data over a line, you first need to cut it into packets that fit the protocol: you will create about 700 packets of 1500 bytes each to carry one megabyte. Those will go over your line at the specified bandwidth.
But there is overhead: in TCP/IP, every packet needs its own IP and TCP headers. So the smaller the MTU, the more packets you need, and the more significant the header overhead becomes, meaning you can actually use less of the total bandwidth.
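The header-overhead point is easy to quantify. A small sketch, assuming plain 20-byte IPv4 and TCP headers and ignoring link-layer framing:

```python
HEADERS = 20 + 20             # IPv4 + TCP headers, no options
DATA = 1_000_000              # one megabyte to transfer

for mtu in (576, 1500, 9000):          # small, standard, and jumbo MTUs
    payload = mtu - HEADERS            # usable bytes per packet
    packets = -(-DATA // payload)      # ceiling division
    print(f"MTU {mtu}: {packets} packets, {payload / mtu:.1%} of each is data")
# MTU 1500 gives ~685 packets, in line with the "about 700" above
```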

Calculating network throughput

Suppose I have a 4 Mbit network and I want to calculate the data throughput, that is, the maximum transfer rate minus the overhead from Ethernet/IP/TCP headers.
Reading on the web, I found out that the MSS (maximum segment size) of a TCP segment is 576 - 20 - 20, the last two being the TCP and IP header overhead, resulting in 93% data, meaning I will only be using 93% of my 4 Mbit link to transfer data. Now where's the link-layer overhead? Shouldn't it be added as well? If I'm not wrong, an Ethernet header is around 46 bytes, so the final sum would be 576 - 20 - 20 - 46 = 490, resulting in 85% data throughput. Am I doing something wrong?
Just work bottom-up. A regular Ethernet frame (no jumbo frames, no VLAN tagging) occupies 1538 bytes in total on the wire (8-byte preamble/SFD + 14-byte header + 1500-byte payload + 4-byte FCS + 12-byte inter-packet gap) and can carry a payload of 1500 bytes. An IPv4 header without options is 20 bytes, and a TCP header without options is also 20 bytes. So you end up with 1460 bytes of possible payload in a 1538-byte link-layer frame. Your efficiency is therefore 1460/1538 ≈ 0.9493, resulting in a maximum throughput of about 3.80 Mbps.
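Here is the same bottom-up calculation as a few lines of Python, so it can be repeated for other link rates:

```python
WIRE_OVERHEAD = 14 + 4 + 7 + 1 + 12  # header + FCS + preamble + SFD + IPG = 38
MTU, IP_H, TCP_H = 1500, 20, 20

payload = MTU - IP_H - TCP_H         # 1460 bytes of TCP payload per frame
on_wire = MTU + WIRE_OVERHEAD        # 1538 bytes on the wire per frame
efficiency = payload / on_wire       # ~0.9493

link_mbps = 4                        # the 4 Mbit link from the question
print(f"{efficiency:.2%} -> {efficiency * link_mbps:.2f} Mbps of goodput")
# 94.93% -> 3.80 Mbps
```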
Note, however, that actual throughput will usually be lower. This is the theoretical maximum rate for a continuous stream on a full-duplex link, after the TCP session is established and when you're the only user of that link. Also note that as soon as you send at a slightly higher rate for some time, your link will get congested, you will see drops, and your actual TCP throughput may fall significantly because of slow-start.
If the link is wireless (802.11), the calculation becomes a lot more complex because of RTS/CTS mechanisms, but throughput is roughly halved even for a single active user, and that is without accounting for loss, which is unrealistic to ignore.
In general, the protocol can impact network throughput much more than simple packet overhead does. You mention that you want to measure throughput on an Ethernet/IP/TCP network, but the packet overhead of those protocols is NOT the only thing to consider. TCP is a connection-oriented protocol and uses ACKs to signal whether a packet has been received. user1777914 missed the mark about ACKs but was on to something: they do not take up any more SPACE, but they can DELAY the transmission of packets. As latency increases, overall network throughput can decrease, depending on how often the application or host OS expects a response.
W. Richard Stevens has written an AMAZING book on TCP/IP. Here is an excerpt that explains theoretical TCP performance, what impacts it, and how it is calculated.
There is also the Nagle algorithm, which helps with latency but, if disabled, can slow down throughput.

Size of empty UDP and TCP packet?

What is the size of an empty UDP datagram? And that of an empty TCP packet?
I can only find info about the MTU, but I want to know what is the "base" size of these, in order to estimate bandwidth consumption for protocols on top of them.
TCP:
Size of Ethernet frame - 24 bytes
Size of IPv4 header (without any options) - 20 bytes
Size of TCP header (without any options) - 20 bytes
Total size of an Ethernet frame carrying an IP packet with an empty TCP segment - 24 + 20 + 20 = 64 bytes
UDP:
Size of Ethernet frame - 24 bytes
Size of IPv4 header (without any options) - 20 bytes
Size of UDP header - 8 bytes
Total size of an Ethernet frame carrying an IP packet with an empty UDP datagram - 24 + 20 + 8 = 52 bytes
Himanshu's answer is perfectly correct.
What might be misleading when looking at the structure of an Ethernet frame [see further reading] is that, without payload, the minimum size of an Ethernet frame would be 18 bytes: Dst MAC (6) + Src MAC (6) + Length (2) + FCS (4). Adding the minimum size of IPv4 (20) and TCP (20) gives us a total of 58 bytes.
What has not been mentioned yet is that the minimum payload of an Ethernet frame is 46 bytes, so the 20 + 20 bytes from IPv4 and TCP are not enough payload! This means that 6 bytes of padding have to be added; that's where the total of 64 bytes comes from.
18 (min. Ethernet "header" fields) + 6 (padding) + 20 (IPv4) + 20 (TCP) = 64 bytes
Hope this clears things up a little.
Further Reading:
Ethernet_frame
Ethernet
See User Datagram Protocol: the UDP header is 8 bytes (64 bits) long.
The minimum size of the bare TCP header is 5 32-bit words (20 bytes), while the maximum size is 15 words (60 bytes).
If you intend to calculate the bandwidth consumption and relate them to the maximum rate of your network (like 1Gb/s or 10Gb/s), it is necessary, as pointed out by Useless, to add the Ethernet framing overhead at layer 1 to the numbers calculated by Felix and others, namely
7 bytes preamble
1 byte start-of-frame delimiter
12 bytes interpacket gap
i.e. a total of 20 more bytes consumed per packet.
If you're looking for a software perspective (after all, Stack Overflow is for software questions) then the frame does not include the FCS, padding, and framing symbol overhead, and the answer is 54:
14 bytes L2 header
20 bytes L3 header
20 bytes L4 header
This occurs in the case of a TCP ACK packet, as ACK packets have no L4 options.
As for the FCS, padding, framing symbols, tunneling, etc. that hardware and intermediate routers generally hide from host software: software really only cares about these additional overheads because of their effect on throughput. As the other answers note, the FCS adds 4 bytes to the frame, making the frame 58 bytes. Therefore 6 bytes of padding are required to reach the minimum frame size of 64 bytes. The Ethernet controller adds a further 20 bytes of framing symbols, meaning that a packet takes a minimum of 84 byte times (672 bit times) on the wire. So, for example, a 1 Gbps link can send one packet every 672 ns, corresponding to a maximum rate of roughly 1.5 million packets per second. Also, intermediate routers can add various tags and tunnel headers that further increase the minimum TCP packet size at points inside the network (especially in public network backbones).
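A sketch of that minimum-size accounting, using only the numbers from the paragraph above:

```python
L2_HEADER, L3_HEADER, L4_HEADER = 14, 20, 20
software_frame = L2_HEADER + L3_HEADER + L4_HEADER  # 54 bytes from software
with_fcs = software_frame + 4                       # 58 bytes with FCS
min_frame = max(with_fcs, 64)                       # padded up to 64 bytes
on_wire = min_frame + 7 + 1 + 12                    # + preamble, SFD, gap = 84

bit_times = on_wire * 8                             # 672 bit times
link_bps = 1_000_000_000                            # 1 Gbps
print(bit_times / link_bps * 1e9, "ns per packet")  # 672.0 ns
print(link_bps / bit_times / 1e6, "Mpps")           # ~1.49 million packets/s
```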
However, given that software is probably sharing bandwidth with other software, it is reasonable to assume that this question is not asking about total wire bandwidth, but rather about how many bytes the host software needs to generate. The host relies on the Ethernet controller (for example, this 10 Mbps one or this 100 Gbps one) to add the FCS, padding, and framing symbols, and relies on routers to add tags and tunnels. (A virtualization-offload controller, such as in the second link, has an integrated tunnel engine; older controllers rely on a separate router box.) Therefore the minimum TCP packet generated by the host software is 54 bytes.
