Calculating network throughput - TCP

Suppose I have a 4 Mbit/s network and I want to calculate the data throughput, that is, the maximum transfer rate minus the overhead from Ethernet/IP/TCP headers.
Reading on the web I found that the MSS (maximum segment size) of a TCP segment is 576 - 20 - 20, the last two numbers being the TCP and IP header overhead, which leaves about 93% for data, meaning I will only be using 93% of my 4 Mbit/s link to transfer data. But where is the link-layer overhead? Shouldn't it be added as well? If I'm not wrong, an Ethernet header is around 46 bytes, so the final sum would be 576 - 20 - 20 - 46 = 490, resulting in about 85% data throughput. Am I doing something wrong?

Just work bottom-up. Regular Ethernet frames (no jumbo frames, no VLAN tagging) occupy 1538 bytes on the wire in total, including preamble, frame header, FCS and inter-frame gap (1542 with an 802.1Q VLAN tag), and can carry a payload of 1500 bytes. An IPv4 header without options is 20 bytes and a TCP header without options is also 20 bytes, so you end up with 1460 bytes of possible payload in a 1538-byte link-layer transmission. Your efficiency is therefore 1460/1538 ≈ 0.949, giving a maximum throughput of about 3.80 Mbit/s on your 4 Mbit/s link.
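A minimal Python sketch of that arithmetic, assuming the standard untagged Ethernet sizes used above:

```python
# Back-of-the-envelope TCP goodput over standard Ethernet (1500-byte MTU,
# no VLAN tag, no IP/TCP options).
LINK_RATE_MBPS = 4.0      # nominal link rate from the question

mtu           = 1500      # Ethernet payload
wire_overhead = 38        # preamble+SFD 8, MAC header 14, FCS 4, inter-frame gap 12
ip_header     = 20
tcp_header    = 20

wire_bytes  = mtu + wire_overhead            # 1538 bytes per frame on the wire
tcp_payload = mtu - ip_header - tcp_header   # 1460 bytes of TCP payload (the MSS)

efficiency = tcp_payload / wire_bytes
print(f"efficiency: {efficiency:.3f}")                           # ~0.949
print(f"max goodput: {LINK_RATE_MBPS * efficiency:.2f} Mbit/s")  # ~3.80 Mbit/s
```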
Note, however, that real-world throughput will usually be lower. This is the theoretical maximum for a continuous stream on a full-duplex link, after the TCP session has been established and while you are the only user of the link. Also note that as soon as you send at a slightly higher rate for some time, the link gets congested, you will see drops, and your actual TCP throughput may fall significantly because of slow start.
If the link is wireless (802.11), the calculation becomes a lot more complex because of the RTS/CTS mechanism, but roughly halve the figure for a single active user - and that is without accounting for loss, which is unrealistic.

In general, the protocol can impact network throughput much more than simple packet overhead. You mention that you want to measure throughput on an Ethernet/IP/TCP network, but the packet overhead of those protocols is NOT the only thing to consider. TCP is a connection-oriented protocol and uses ACKs to signal whether a packet has been received. user1777914 missed the mark about ACKs but was on to something: they do not take up any more SPACE, but they can DELAY the transmission of packets. As latency increases, overall throughput can decrease, depending on how often the application or host OS expects a response.
W. Richard Stevens has written an AMAZING book on TCP/IP. Here is an excerpt that explains theoretical TCP performance, what impacts it, and how it is calculated.
Nagle's algorithm is also a factor: it trades a little latency for efficiency by coalescing small writes, and disabling it can reduce throughput.
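As a rough illustration of the latency point - this is the generic window/RTT bound, not Stevens' exact derivation - a sender can have at most one window of data outstanding per round trip, so throughput is capped at window / RTT:

```python
# TCP throughput ceiling imposed by window size and round-trip time,
# independent of how fast the link itself is: throughput <= window / RTT.

def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on TCP throughput for a given window and RTT."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

print(max_tcp_throughput_mbps(65535, rtt_ms=10))   # ~52 Mbit/s with the classic 64 KiB window
print(max_tcp_throughput_mbps(65535, rtt_ms=100))  # ~5.2 Mbit/s: same window, 10x the latency
```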

Related

Would a device physically closer to me transfer a file quicker (P2P)?

Hello all, I am new to networking and a question arose in my head: would a device that is physically closer to another device transfer a file quicker than a device across the globe, if a P2P connection were used?
Thanks!
No, not generally.
The maximum throughput between any two nodes is limited by the slowest interconnect in their path. When acknowledgments are used (e.g. with TCP), throughput is also limited by congestion, the available send/acknowledgment window size, the round-trip time (RTT) - you cannot transfer more than one full window per RTT period - and packet loss.
Distance basically doesn't matter. However, over a long distance a large number of interconnects is likely involved, increasing the chance of a weak link, congestion, or packet loss. The RTT also inevitably increases, requiring a large send window (TCP window scale option).
Whether the links are copper, fiber, or wireless doesn't matter - though the latter does carry some risk of additional packet loss. P2P versus classic client-server doesn't matter either.
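To see why a long RTT calls for the window scale option, here is a hypothetical bandwidth-delay-product calculation (the 100 Mbit/s rate and the RTT figures are purely illustrative):

```python
# Bandwidth-delay product: the amount of data that must be "in flight"
# to keep a path fully utilized.
def required_window_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
    return bandwidth_mbps * 1e6 / 8 * (rtt_ms / 1000)

# 100 Mbit/s across the globe (RTT ~200 ms) needs ~2.5 MB in flight,
# far beyond the 64 KiB maximum of the original 16-bit TCP window field.
print(required_window_bytes(100, rtt_ms=200))   # 2,500,000 bytes
print(required_window_bytes(100, rtt_ms=2))     # 25,000 bytes on a nearby path
```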

Does the internet really work at 1500 bytes?

MTU (Maximum transmission unit) is the maximum frame size that can be transported.
When we talk about the MTU, it is generally a cap at the hardware level and applies to the lower layers - the data-link and physical layers.
Now, considering the OSI model, it does not matter how efficient the upper layers are or what kind of magic sauce they apply: the data-link layer will always construct frames no larger than 1500 bytes (or whatever the MTU is), and anything on the "internet" will always be transmitted at that frame size.
Is the internet's transmission rate really capped at 1500 bytes? Nowadays we see speeds of 10-100 Mbps and even Gbps. I wonder whether, at such speeds, frames still get transmitted at 1500 bytes, which would mean lots and lots of fragmentation and reassembly at the receiver. At this scale, how do the upper layers achieve efficiency?!
[EDIT]
Based on the comments below, I re-frame my question:
If the data-link layer transmits 1500-byte frames, I want to know how the upper layers at the receiver are able to handle such a flood of incoming frames.
For example: if the internet speed is 100 Mbps, the upper layers will have to process 104857600 bytes/second, or 104857600/1500 = 69905 frames/second. The network layer also needs to reassemble these frames. How can the network layer handle that scale?
> If the data-link layer transmits 1500-byte frames, I want to know how the upper layers at the receiver are able to handle such a flood of incoming frames.
1500 octets is a reasonable MTU (Maximum Transmission Unit), which is the maximum size of the data-link protocol payload. Remember that not all frames are that size; it is just the maximum size of the frame payload. Many, many things use much smaller payloads. For example, VoIP has very small payloads, often smaller than the overhead of the various protocols.
Frames and packets get lost or dropped all the time, often on purpose (see RED, Random Early Detection). The larger the data unit, the more data is lost when a frame or packet is dropped, and with reliable protocols such as TCP, the more data must be resent.
Also, having a reasonable limit on frame or packet size keeps one host from monopolizing the network. Hosts must take turns.
> For example: if the internet speed is 100 Mbps, the upper layers will have to process 104857600 bytes/second, or 104857600/1500 = 69905 frames/second. The network layer also needs to reassemble these frames. How can the network layer handle that scale?
Your statement has several problems.
First, 100 Mbps is 12,500,000 bytes per second. To calculate the number of frames per second, you must take the data-link overhead into account. For Ethernet, you have a 7-octet preamble, a 1-octet start-of-frame delimiter, a 14-octet frame header, the payload (46 to 1500 octets), a 4-octet CRC, and a 12-octet inter-packet gap. The Ethernet overhead is 38 octets, not counting the payload. To know how many frames per second, you would need to know the payload size of each frame; you seem to wrongly assume every frame carries the maximum 1500-octet payload, and that is not true. For maximum-size frames you get just over 8,000 frames per second.
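A quick Python sketch of that frame-rate arithmetic, using the same Ethernet overhead figures:

```python
# Frames per second on 100 Mbit/s Ethernet, as a function of payload size.
LINK_BPS     = 100_000_000
ETH_OVERHEAD = 38            # preamble 7 + SFD 1 + header 14 + CRC 4 + inter-packet gap 12

def frames_per_second(payload_octets: int) -> float:
    wire_octets = payload_octets + ETH_OVERHEAD
    return LINK_BPS / 8 / wire_octets

print(frames_per_second(1500))   # ~8,127 fps for maximum-size frames
print(frames_per_second(46))     # ~148,810 fps for minimum-size frames
```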
Next, the network layer does not reassemble frame payloads. The payload of a frame is one network-layer packet. The payload of the network packet is a transport-layer data unit (TCP segment, UDP datagram, etc.). The payload of the transport protocol is application data (remember that the OSI model is just a model, and OSes do not implement separate session and presentation layers, only the application layer), and it is presented to the application process, where it may be raw application data or an application-layer protocol, e.g. HTTP.
The bandwidth, 100 Mbps in your example, is how fast a host can serialize the bits onto the wire. That is a function of the NIC hardware and the physical/data-link protocol it uses.
> which would mean lots and lots of fragmentation and reassembly at the receiver.
Packet fragmentation is basically obsolete. It is still part of IPv4, but in-path fragmentation has been eliminated in IPv6, and careful operators do not allow IPv4 packet fragments because of fragmentation attacks. An IPv4 packet may be fragmented if the DF bit is not set in its header and the MTU somewhere in the path is smaller than the original MTU; for example, a tunnel has a smaller MTU because of the tunnel overhead. If the DF bit is set and the packet is too large for the MTU of the next link, the packet is dropped. Fragmentation is very resource-intensive on a router, and there is a whole set of steps that must be performed to fragment a packet.
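For illustration, here is a hypothetical sketch of the arithmetic a router performs when it does fragment an IPv4 packet; fragment offsets must be multiples of 8 octets, and the 1400-byte tunnel MTU is just an assumed example:

```python
# Hypothetical sketch: how an IPv4 packet is split when the next-hop MTU shrinks.
IP_HEADER = 20  # assuming no IP options

def fragment_sizes(total_length: int, mtu: int) -> list[int]:
    """Return the IP payload size carried by each fragment."""
    payload = total_length - IP_HEADER
    per_frag = (mtu - IP_HEADER) // 8 * 8   # every fragment but the last: multiple of 8
    sizes = []
    while payload > 0:
        sizes.append(min(per_frag, payload))
        payload -= sizes[-1]
    return sizes

# A 1500-byte packet entering a tunnel with a 1400-byte MTU:
print(fragment_sizes(1500, 1400))   # [1376, 104]
```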
You may be confusing IPv4 packet fragmentation and reassembly with TCP segmentation, which is something completely different.

Sequence number and acknowledgement number do not match

I have been learning Wireshark lately. While inspecting TCP segments, I saw a situation that was strange, at least to me: mismatching SEQ and ACK numbers. Then I realized that the difference between two ACKs was one and a half times the packet size. However, as far as I know, ACKs only grow by the full packet size. So what happened here?
SEQ      ACK
1        1
2897     8689
5793     13033  <--
8689     14481
11585
14481
Well, there's not much to go by here, but I'm going to guess that you are capturing packets on the machine sending the large packets, i.e., the packets with 2896 bytes of TCP payload. As such, you are seeing the packets before they are IP-fragmented and actually transmitted out on the wire.
But you can't just send 2896 bytes of data onto the wire; Ethernet links typically impose a 1500-byte MTU (Maximum Transmission Unit), and once you account for IP and TCP header overhead, you usually end up with 1460 bytes of available payload, or MSS (Maximum Segment Size). In your case it looks like you're only getting 1448 bytes of MSS, most likely due to one or more IP and/or TCP header options (the TCP timestamp option, for example, accounts for 12 bytes).
In any case, the 2896 bytes of payload are going to be fragmented over two IP fragments, each containing 1448 bytes of TCP payload. I'm pretty sure that what you're seeing is an ACK from the receiver after it has received one full segment plus one IP fragment of the next segment.
The previous ACK number was 8689, and 8689 + 2896 = 11585. Now add 1/2 of the data segment (2896 / 2 = 1448) and you get 11585 + 1448 = 13033. That's the ACK number you're seeing. Now add the other 1/2 and you get 13033 + 1448 = 14481, which is the ACK number of the next packet.
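That arithmetic as a small Python sketch; the 1448-byte MSS and 2896-byte writes come from the capture above:

```python
# Reconstructing the ACK numbers seen in the capture above.
MSS       = 1448          # negotiated MSS (1500 - 20 IP - 20 TCP - 12 bytes of options)
SEND_SIZE = 2 * MSS       # the sender writes 2896-byte chunks

prev_ack = 8689
print(prev_ack + SEND_SIZE)            # 11585: ACK covering the whole previous chunk
print(prev_ack + SEND_SIZE + MSS)      # 13033: plus half of the next chunk  <--
print(prev_ack + SEND_SIZE + 2 * MSS)  # 14481: plus the other half
```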
I hope that makes sense?
For an in depth look at the drawbacks of local packet captures, I direct you to a well-written blog by Jasper Bongertz titled, "The drawbacks of local packet captures".
Since I do not have the reputation to comment on Christopher Maynard's answer, I will comment here.
As Christopher correctly stated, the MTU (Maximum Transmission Unit) on standard Ethernet networks is 1500 bytes. However, crafting 1460-byte TCP segments (1500 bytes - 20-byte IP header - 20-byte TCP header, ignoring possible TCP options) imposes a high load on the network stack, since every segment must be handled separately.
As an example, transmitting 1 MB of data (1024 * 1024 bytes = 1048576 bytes) results in 1048576 / 1460 ≈ 719 segments, in contrast to roughly 17 segments at the theoretical maximum TCP segment size of about 64 KB. Since the kernel overhead is largely independent of the segment/packet size, small segments require much more processing.
To counteract this, TCP Segmentation Offload (TSO) was developed. TSO lets the kernel craft maximum-size TCP segments, and the NIC (Network Interface Card) uses hardware acceleration to split them into separate 1460-byte TCP segments. Therefore, no IP fragmentation is required.
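That segment-count comparison as a small sketch (the 64 KB figure is the theoretical IPv4 maximum of 65535 bytes, minus IP and TCP headers):

```python
import math

DATA = 1024 * 1024                 # 1 MB of application data

mss_on_wire  = 1500 - 20 - 20      # 1460 bytes: what actually fits in one Ethernet frame
mss_with_tso = 65535 - 20 - 20     # theoretical maximum TCP payload per segment handed to the NIC

print(math.ceil(DATA / mss_on_wire))    # 719 segments for the stack to handle without TSO
print(math.ceil(DATA / mss_with_tso))   # 17 large segments with TSO; the NIC splits them in hardware
```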

Many small UDP datagrams vs fewer, larger ones

I have a system that sends "many" (hundreds of) UDP datagrams in bursts every once in a while (say, 10 times a minute). According to nload, this averages about 222 kbit/s. The content of these datagrams is JSON. I've considered altering the system so that it waits some time (500 ms?) and combines many of the JSON objects into one datagram before sending, but I'm not sure it's worth the effort (considering the bandwidth, the protocol, and the frequency of sending). Would the new approach provide any real benefit over the current one?
The short answer is that it's up to you to decide that.
The long version is that it depends on your use case. Since we don't know what you're building, it's hard to say what's more important - latency? Throughput? Reliability? Something else? Let's analyze some pros and cons. Here's what I came up with:
Pros to sending larger packets:
Fewer messages means fewer system calls and less I/O against the network. That means fewer blocked/waiting threads and less time spent on interrupts.
Fewer, larger packets mean less overhead per packet (things like the IP/UDP headers that are sent with each packet). A higher data rate is therefore (theoretically) achievable, although keep in mind that all of these headers (L2 + IP + UDP) typically add up to no more than 60-70 bytes per packet, since the UDP header itself is only 8 bytes long (see the sketch after this list).
Since UDP doesn't guarantee ordering, larger packets with more time between them will reduce any existing reordering.
Cons to sending larger packets:
Re-writing existing code, and making it (slightly) more complicated.
UDP is unreliable, so a loss of a single (large) packet would be more significant compared to the loss of a small packet.
Latency - some data will have to wait 500ms to be sent. That means that a delay is added between the sender and the receiver.
Fragmentation - if one of the packets you create crosses the MTU boundary (typically 1450-1500 bytes, including the IP + UDP headers, which are normally 28 bytes), the IP layer will need to fragment the packet into several smaller ones. IP fragmentation is considered bad for a multitude of reasons.
Processing of larger packets might take longer.
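To put the overhead point in perspective, here is a hypothetical back-of-the-envelope comparison for a workload like this one; the 100-byte JSON size and the burst count are assumptions for illustration, and Ethernet framing would add a bit more per packet:

```python
# Header overhead: many small UDP datagrams vs. one combined datagram.
IP_UDP_HEADERS = 28      # 20-byte IPv4 header + 8-byte UDP header
JSON_SIZE      = 100     # assumed size of one JSON object, for illustration
N_OBJECTS      = 200     # a "burst" of a couple hundred datagrams

separate = N_OBJECTS * (JSON_SIZE + IP_UDP_HEADERS)
combined = N_OBJECTS * JSON_SIZE + IP_UDP_HEADERS   # ignoring the fragmentation caveat above

print(separate)                 # 25,600 bytes of IP traffic (plus L2 framing)
print(combined)                 # 20,028 bytes
print(1 - combined / separate)  # ~22% saved - noticeable, but hardly dramatic at 222 kbit/s
```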

Generic case UDP packet loss?

I'm designing a networking protocol for a game, with acked events and non-acked real-time data. Just for design purposes: what kind of packet loss can I expect if I keep my packet sizes under 512 bytes and the total throughput under 128 kbit/s?
I'm looking for realistic numbers on the percentage of packet loss, and on the typical run of packets lost together (just one, or often 4 in a row?).
