UDP and TCP/IP packet size in Toit

UDP and TCP/IP packet size in Toit - tcp

While experimenting with a UDP server that runs on esp32, I found that the size of the received packet is limited to 1500 bytes: 20 (IP header) + 8 (UDP header) + 1472 (data), (although in theory UDP as if can support packets data up to 64K). This means that in order to transfer a larger amount of data, the client must split it into several chunks and send them one after the other, and on the server side, this data will need to be restored. I think that the overhead of such a solution will be quite high. I also know that Toit provides TCP/IP connection. Naturally, the packet size is also limited in the case of TCP/IP. This is 64K (65535 bytes). Does Toit have any additional restrictions on the TCP/IP connection, or 64K value is fact also for Toit?

As described in this question/answer, it's a matter of avoiding packet fragmentation. Sending packages above this size will force the system to split them up into multiple fragments of size MTU, with each of them being individually unreliable. As memory is already very limited on embedded systems, sending large (> MTU) packages where all fragments has to arrive before it can be processed, can be very unfortunate for the overall application behavior as it can time out or go out-of-memory.
Instead the application should look at a streaming pipeline (perhaps even TCP to handle the unreliable aspects as well).
As TCP/IP is a streaming protocol, any sized "packages" can be sent, as they are automatically split into fragments of size MTU. Note that the data is received in "random"-sized packages, though the order of the bytes is fully preserved.

Related

Proper way to calculate Link Throughput

I have read some articles online and I got a pretty good idea about the TCP and UDP in general. However, I still have some doubts which I am sure not completely clear to me.
What is the proper way to calculate throughput ?
(Can't we just divide Total number of bytes received by total time taken ?)
What is that key feature in TCP that makes it have much much higher
throughput than UDP ?
UPDATE:
I understood that TCP uses windows which is nothing but that much segments can be sent before actually waiting for Acknowledgements. But my doubt is that in UDP segments are continuously sent without even bothering about Acknowledgements. So there is no extra overheads in UDP. Then, why the throughput of TCP is much much higher than that of UDP ?
Lastly,
Is this true ?
TCP throughput = (TCP Window Size / RTT) = BDP / RTT = (Link Speed in Bytes/sec * RTT)/RTT = Link Speed in Bytes/sec
If so then TCP throughput is always equals to the Know Link speed. And since the RTTs cancels out each other, the TCP throughput does not even depends on RTT.
I have seen in some network analysis tools like iperf, passmark performance test etc. that the TCP/UDP Throughput changes with Block size.
How is throughput dependent on Block size ?
Is Block size equals TCP window or UDP datagram size ?

What is the proper way to calculate throughput?
There are multiple ways, depending on what exactly you want to measure. They all boil down to dividing some number of bits (or bytes) to some duration, as you mention; what varies is which bits you are counting or (more rarely) which moments of time you are considering for measuring the duration.
The factors you need to take into account are:
At which layer in the network stack are you measuring throughput?
If you measure at the application layer, all that matters is what useful data you transmit to the other endpoint. For example, if you are transferring a file of 6 kB, the amount of data you count when measuring throughput is 6 kB (that is 6,000 bytes, not bits, and note the multiplier of 1000, not 1024; these conventions are common in networking).
This is usually called goodput and it may be different from what is actually sent at the transport layer (as in TCP or UDP), for two reasons:
1. Overhead due to headers
Each layer in the network adds a header to the data that introduces some overhead due to its transmission time. Moreover, the transport layer breaks your data into segments; this is because the network layer (as in IPv4 or IPv6) has a maximum packet size called MTU, typically 1,500 B in Ethernet networks. This value includes the network layer header size (e.g. the IPv4 header, which is variable in length but usually 20 B long) and the transport layer header (for TCP, it is also variable in length but usually 40 B long). This leads to a maximum segment size MSS (number of data bytes, without headers, in one segment) of 1500 - 40 - 20 = 1440 bytes.
Thus if we want to send 6 kB of application-layer data, we must break it into 6 segments, 4 of 1440 bytes each and one of 240 bytes. However at the network layer we end up sending 6 packets, 4 of 1500 bytes each and one of 300 bytes, for a total of 6.3 kB.
Here I have not considered the fact that the link layer (as in Ethernet) adds its own header and possibly also a suffix, which increases the overhead further. For Ethernet this is 14 bytes for the Ethernet header, optionally 4 bytes for VLAN tag, then a CRC of 4 bytes and a gap of 12 bytes, for a total of 36 bytes per packet.
If you consider a fixed-rate link, say of 10 Mb/s, depending on what you measure you will get a different throughput. Normally you want one of these:
The goodput, i.e. application layer throughput, if what you want to measure is application performance. For this example, you divide 6 kB by the transfer duration.
The link-layer throughput, if what you want to measure is network performance. For this example, you divide 6 kB + TCP overhead + IP overhead + Ethernet overhead = 6.3 kB + 5 * 36 B = 6516 B by the transfer duration.
Retransmission overheads
The Internet is a best-effort network, meaning that the packets will be delivered if possible, but may also be dropped. Packet drops are corrected by the transport layer, in case of TCP; for UDP, there is no such mechanism, which means that either the application does not care if some parts of the data do not get delivered, or the application implements retransmission itself on top of UDP.
Retransmission reduce goodput for two reasons:
a. Some data needs to be sent again, which takes time. This introduces a delay which is inversely proportional to the rate of the slowest link in the network between the sender and the receiver (a.k.a the bottleneck link).
b. Detecting that some data was not delivered needs feedback from the receiver to the sender. Due to propagation delays (sometimes called latency; caused by the finite speed of light in the cable), feedback can only be received by the sender with some latency, which slows down the transmission even more. In most practical cases, this is the most significant contribution to the extra delay caused by the retransmission.
Clearly, if you use UDP instead of TCP and you do not care about packet loss, you will of course get better performance. But for many applications, data loss cannot be tolerated, so such a measurement is meaningless.
There are some applications that do use UDP for transferring data. One is BitTorrent, which may use either TCP or a protocol they designed called uTP, which emulates TCP on top of UDP, but aims at being more efficient with many parallel connections. Another transport protocol implemented over UDP is QUIC, which also emulates TCP and offers multiplexing multiple parallel transfers over a single connection, and forward error correction to reduce retransmissions.
I will discuss forward error correction a little since it is related to your question about throughput. A naive way of implementing it is by sending every packet twice; in case one gets lost, the other still has a chance of being received. This reduces the amount of retransmissions to half, but also halves your goodput since you send redundant data (note that the network or link layer throughput remains the same!). In some cases this is fine; especially if the latency is very large, such as on intercontinental or satellite links. Moreover, some mathematical methods exist where you don't have to send a full copy of the data; for instance for every n packets you send, you send another reduntant one which is the XOR (or some other arithmetic operation) of them; if the redundant one gets lost, it doesn't matter; if one of the n packets gets lost, you can reconstruct it based on the redundant one and the other n-1. You can thus configure the overhead introduced by forward error correction to whatever amount of bandwidth you can spare.
How you are measuring the transfer time
Is the transfer completed when the sender finished sending the last bit over the wire, or does it also include the time it takes for the last bit to travel to the receiver? Additionally, does it include the time it takes to get a confirmation from the receiver, stating that all data has been received successfully and no retransmission is neede?
It really depends on what you want to measure. Note that for large transfers, one extra round-trip-time is insignificant in most cases (unless you are communicating, for instance, with a probe on Mars).
What is that key feature in TCP that makes it have much much higher throughput than UDP?
This is not true, although a common misconception.
In addition to retransmitting data when needed, TCP will also adjust its sending rate so that it will not cause packet drops by congesting the network. The adjustment algorithm has been perfected over decades, and usually converges quickly to the maximum rate supported by the network (actually, the bottleneck link). For this reason it is usually difficult to beat TCP in throughput.
With UDP, there is no rate limiting at the sender. UDP lets the application send as much as it wants. But if you try to send more than the network can handle, some of the data will be dropped, lowering your throughput, and also making the admin of the network you are congesting very angry. This means that sending UDP traffic at high rates is impractical (unless the goal is to DoS a network).
Some media applications are using UDP but rate-limiting the transfer at the sender at a very small rate. This is typically used in VoIP applications or Internet Radio, where you require very little throughput but low latency. I suppose this is one of the reasons for the misconception that UDP is slower than TCP; that is not the case, UDP can be as fast as the network allows.
As I said before, there are protocols such as uTP or QUIC, implemented over UDP, which achieve performance similar to TCP.
Is this true ?
TCP throughput = (TCP Window Size / RTT)
Without packet loss (and retransmissions), this is correct.
TCP throughput = BDP / RTT = (Link Speed in Bytes/sec * RTT)/RTT = Link Speed in Bytes/sec
This is correct only if the window size is configured to the optimal value. BDP/RTT is the optimal (maximum possible) transfer rate in the network. Most modern operating systems should be able to auto-configure it optimally.
How is throughput dependent on Block size ? Is Block size equals TCP window or UDP datagram size?
I don't see any block size in the iperf documentation.
If you refer to the TCP window size, if it is smaller than BDP, then your throughput will be suboptimal (because you waste time waiting for ACKs instead of sending more data; if needed I can explain further). If it is equal or higher to the BDP, then you achieve optimal throughput.

It depends on how you define "Throughput". It usually can be one of the followings.
Number of bytes (or bits) sent in a fixed period of time;
Number of bytes (or bits) sent and received on the receiver end in a fixed period of time;
You can apply these definition to every layer when people talking about throughput. In application layer, 2nd definition means the bytes have really been received by the receiver end of the application. Some people refer to it as "goodput". In Transport layer, say TCP, 2nd definition means the corresponding TCP ACKs are received. To me, most of people should be only interested in the bytes are really received by the receiver end. So, 2nd definition is usually what people mean by "Throughput".
Now, once we have a clear definition of throughput (2nd definition). We can discuss how to measure the throughput correctly.
Usually, people either use TCP or UDP to measure the network throughput.
TCP: People usually measure TCP throughput only on the sender end. As for packets successfully received by the receiver end, ACKs will be sending back. So, sender itself will know how many bytes are sent and received on the receiver end. Divided this number by the measuring time, we will know the throughput.
But, there are two things need to be noticed during TCP throughput measurement:
Is sender side always full buffer during the measurement? i.e. During the measurement period, sender should always has packets to send. It is important for correct throughput measurement. e.g. if I set my measuring time to be 60 seconds, but my file has been finished transmission in 40 seconds. Then there are 20 seconds the network is actually idle. I will under-estimate the throughput.
TCP rate is regulated by its congestion window size, slow-start duration, sender window (and receiver window) size. Sub-optimal configuration of these parameters will result in under-estimated TCP throughput. Although most of the modern TCP implementation should have a quite good configuration of all of these, it is hard for a tester to 100% sure all these configurations are optimal.
Due to these limitations/risks of TCP in network throughput estimation, quite a number of researchers will use UDP for measuring network throughput.
UDP: As UDP has no ACK sending back once the packets are successfully received, people has to measure the throughput in the receiver end. Or, if the receiver end is not easily accessed, people can compare the logs on both sender and receiver sides to determine the throughput. But, this inconvenience is mitigated by some throughput measuring tools. For example, iperf has embedded sequence numbers in its customized payload, so that it can detect any loss. Also, a receiver's report will be sent to the sender to show the throughput.
As UDP by nature is just sending whatever it has to the network and not waiting for the feedback. Its throughput (remember the 2nd definition) once measured will be the actual capacity (or bandwidth) of the network.
So, usually, the throughput measured by UDP should be higher than that from TCP although the difference should be small (~5%-10%).
One biggest drawback of UDP throughput measuring is that, when using UDP one should also make sure that sender-side buffer must be full. (Otherwise, it results in under-estimated throughput as TCP). This step will be little tricky. In iperf, one can specify the sending rate by -b option. Increase -b value in different rounds of testing will converge the throughput measured. For example, in my gigabit ethernet, I first use -b 100k in the test. The throughput measured is 100Kbps. Then I perform the following iterations to converge the maximum throughput which is the capacity of my ethernet.
-b 1m --> throughput: 1Mbps
-b 10m --> throughput: 10Mbps
-b 100m --> throughput: 100Mbps
-b 200m --> throughput: 170Mbps
-b 180m --> throughput: 175Mbps (this should be quite close to the actual capacity)

What is the actual bit transfer rate over a network transferring 100kb/s?

When transferring, say, 1GB worth of data over the internet this data is split into packets, each packet containing a small piece of data and aach of these packets are part of a frame.
Eg. Windows reports that you are transferring the file in 100kb/s over a TCP connection, but this appears to be the amount of data from the file being transferred per second, and does not seem to include the ip or tcp header, or ethernet frame.
What is the actual amount of traffic on the network needed to transfer at this speed? Or is that data actually already included in the transfer speed, but just small enough that it makes no significant difference?
Also, IP supports up to 1500 byte / packet (I think?), but what is the common size of data packets when loading, say, an HD image on reddit?
Sorry for the rather basic questions I probably should have figured out myself by now...

It depends on where you look at the transfer rate:
Task Manager will report all the transferred bytes (i.e. the sum of all the packets including their headers).
A file transfer program will report the transmitted payload.
Task Manager
If you look at Task Manger / Network, you can see the transmitted bytes together with the number of transmitted packets (unicast or non-unicast).
That data comes from the network driver (or at least something close to it), so it makes sense to report the total amount of data here (otherwise each packet would need to be inspected to calculate the payload).
There is also a graph showing the transfer rate. Those numbers could easily be compared with the reported numbers in file transfer software.
File transfer program
A file transfer program on the other hand, does not know the details about the packets being created in the lower layers (those could be any size). So the only option here is to report the amount of transmitted payload data / part of the file, which also makes more sense to the user.
Network packets
On normal networks (there could also be jumbo frames), a TCP-packet (full ethernet frame) is around 1500 bytes when fully loaded (on my system (IPv4) the packets are 1514 bytes with a total header size of 54 bytes -- 14 for the Ethernet header, 20 for the IP header and 20 for the TCP header). Those could be split in smaller packets along the way in the network, but in most cases they won't.
Transfer rate
When transferring a file (or other large datastream), on average 2 full packets (1514 bytes) will be sent each time, and 1 small packet (54 bytes) is received (the [ACK] packet). In this optimal case we have 2 x 1460 payload, and 2 x 54 bytes of overhead on the sending side + 54 bytes on the receiving side. When comparing to the maximum transfer rate of the internet connection, we also have to consider some latency.
Not all transmissions are optimal:
There could be packets that never arrived or where the checksum was wrong, so a retransmit would be needed.
In some cases data could be sent in smaller parts, causing a higher overhead/payload ratio (but with small chunks Nagle's algorithm could take care of that).
Certain software could be reading the file contents into small buffers (say 4096 bytes). Those could then be split in 2 x 1460 and 1 x 1176, introducing some extra overhead.
Conclusion
It's hard to tell or calculate the exact ratio transferred_bytes/payload. It depends on the quality of the internet connection (lost packets, retransmits), the software or API calls used to transfer the data, and even the underlying network (small frames vs jumbo frames for example).

A typical full-size TCP/IPv4 packet on the Internet is of size 1500B (maximum transmission unit (MTU)), of which (minimum) 20B are of TCP header and (minimum) 20B are of IPv4. This MTU was chosen to be compatible with Ethernet. Furthermore, there are application headers (e.g., HTTP for web, SIP/RTP/RTCP for voice call, etc.) included in this packet. The minimum MTU is 576B for IPv4 and 1280B for IPv6. One can see MTU on Linux with ifconfig command.
The best way to figure these values is by using a pcap tool/network analyzer such as Wireshark. Also refer to wiki pages or a good networking book for headers and fields of the protocols.

I'm pretty sure that the reported transer rate doen't include all the headers and overhead of the different layers in the protocal stack, since the reported thruput usually comes from some user-space application which would only get the actual data from the network stream object. It would need to do additional work to find out about all the headers and frames and other overhead that occurred in the different layers and affected the actual physical transmission.

UDP - Optional Checksum

From what I have read about UDP, it has no error handling, no checking for things like sequence of data sent/recieved, no checking for duplicate packets, no checking for corrupt packets and obviously no guarantee that the packets sent are even received...
So with that in mind, why an earth is there actually an option to use checksums in UDP?? Because surely if you want to make sure the data being sent is received in the correct order (and not corrupt and so on) then you would use TCP...

UDP packets include a field for a 16 bit CRC checksum which the receiving operating system will use to check for packet corruption. If the checksum is present and fails, then the packet will be silently discarded. It is up to the application to notice that the packet disappeared and take corrective action.
UDP checksums are enabled by default on all modern operating systems. It is possible to disable UDP checksums in IPv4, either at the socket or OS level. Doing so would reduce the CPU overhead of processing each packet at both the sender and receiver. This might be desirable if, for example, the application were calculating its own checksum separately. Without any checksum, there would be no guarantee that the bytes received are the same as the bytes sent.

The task of UDP is to transport datagrams, which are "network data packets". For UDP, every data packet is a transmission of its own. If you send 3 packets, those are three independent transmissions for UDP. Whether the content of these 3 packets somehow belongs together or if these are three individual requests (think of DNS requests, where every request is sent as an own UDP packet), UDP doesn't know and doesn't care. All that UDP guarantees is that a packet is either transmitted as a whole or not at all; either the entire packet arrives or the entire packet is lost, you will never see "half of a packet" arriving. So if you just want to send a bunch of data packets, you use UDP.
The task of TCP, on the other hand, is to transport a stream of data. It's not about packets. It's about a stream of bytes somehow making it from one host to another. How this happens, e.g. how TCP is breaking the data stream into chunks and sending these chunks over the network and ensuring that no data is lost and all data is in order, is up to TCP. All that TCP guarantees is that the bytes will arrive correctly and in order at the other side, unless the TCP connection is lost, in which case the stream ends abruptly somewhere in the middle but all data, that arrived up to that point, did arrive correctly and in correct order. So despite TCP also working with packets, the transmission behaves like a stream that has no internal "data units". When sending 80 bytes over TCP, there may be one packet with 80 bytes or 10 packets with each 8 bytes or anything in between, you cannot know and you don't have to.
But just because you use UDP doesn't mean you don't care for data corruption in UDP packets. Keep in mind that corruption may not just affect your data, it may also affect the UDP header itself. If only a single bit swaps, the UDP packets may have an incorrect destination port. So they added a checksum which ensures that neither the UDP header nor the data payload has been corrupted but made it optional, so it's up to you whether you want to use it or not. If used, corrupt packets are dropped and thus behave like lost packets. If your code takes care of lost packets, it will automatically take care of corrupt packets, too.
With IPv6 though, the checksum was dropped from the IP header, which means that IP header corruptions are no longer detected. But this was seen as a small problem, as most layer 2 protocols have their own mechanism to detect corrupt data (e.g. Ethernet and WiFi already guarantee that data is not corrupted on its way through the network) and the checksums of UDP/TCP also cover some of the IP header fields, so even without layer 2 error checking, the recipient would notice if the IP addresses in the header have been corrupted along the way and drop the packet. As a consequence, the UDP checksum is no longer optional with IPv6.

UDP Packet size and packet losses

I've been writing a program that uses a stop and wait protocol on top of UDP to send packets over LAN and also over WAN. I've recently been testing my program and have noticed that the packet loss rate is higher for larger packets (approaching 64k bytes). Intuitively this makes sense but what are the actual reasons for this?

UDP packets greater than the MTU size of the network that carries them will be automatically split up into multiple packets, and then reassembled by the recipient. If any of those multiple sub-packets gets dropped, then the receiver will drop the rest of them as well.
So for example if you send a 63k UDP packet, and it goes over Ethernet, it will get broken up into 47+ smaller "fragment" packets (because Ethernet's MTU is 1500 bytes, but some of those are used for UDP headers, etc, so the amount of user-data-space available in a UDP packet is smaller than that). The receiver will only "see" that UDP packet if all 47+ of those fragment-packets make it through okay. If just one of those fragment-packets gets dropped, the whole operation fails.

Well, data networks are far from reliable; packets get dropped all the time. Overloaded routers, full buffers and corrupt packets are some of the reasons. Since UDP has no flow control capabilities, it can't slow down if for example the receiving end is overloaded.
As Jeremy explained, the bigger the payload, the more packets it is going to be split into, and therefore a bigger chance of losing some of them.
UDP is used in cases where a dropped packet here in there won't affect anything or cases that you need something to get there in time or not at all. (VOIP, streaming video etc)

Its all about IP fragmentation and defragmentation. Packet more than MTU would be fragmented and has to be defragmented at the final host, there are also chances the fragments gets fragmented again on the path and which again can add the delay. sometimes if some N/W element is configured for layer 4 filtering then it defragments(not the final host) apply rules and then again frgaments and forward. Thats the reason the applicaiton which need performance always try to send data with size <= (MTU-ETHHDR-IPHDR)

Maximum packet size for a TCP connection

What is the maximum packet size for a TCP connection or how can I get the maximum packet size?

The absolute limitation on TCP packet size is 64K (65535 bytes), but in practicality this is far larger than the size of any packet you will see, because the lower layers (e.g. ethernet) have lower packet sizes.
The MTU (Maximum Transmission Unit) for Ethernet, for instance, is 1500 bytes. Some types of networks (like Token Ring) have larger MTUs, and some types have smaller MTUs, but the values are fixed for each physical technology.

This is an excellent question and I run in to this a lot at work actually. There are a lot of "technically correct" answers such as 65k and 1500. I've done a lot of work writing network interfaces and using 65k is silly, and 1500 can also get you in to big trouble. My work goes on a lot of different hardware / platforms / routers, and to be honest the place I start is 1400 bytes. If you NEED more than 1400 you can start to inch your way up, you can probably go to 1450 and sometimes to 1480'ish? If you need more than that then of course you need to split in to 2 packets, of which there are several obvious ways of doing..
The problem is that you're talking about creating a data packet and writing it out via TCP, but of course there's header data tacked on and so forth, so you have "baggage" that puts you to 1500 or beyond.. and also a lot of hardware has lower limits.
If you "push it" you can get some really weird things going on. Truncated data, obviously, or dropped data I've seen rarely. Corrupted data also rarely but certainly does happen.

At the application level, the application uses TCP as a stream oriented protocol. TCP in turn has segments and abstracts away the details of working with unreliable IP packets.
TCP deals with segments instead of packets. Each TCP segment has a sequence number which is contained inside a TCP header.
The actual data sent in a TCP segment is variable.
There is a value for getsockopt that is supported on some OS that you can use called TCP_MAXSEG which retrieves the maximum TCP segment size (MSS). It is not supported on all OS though.
I'm not sure exactly what you're trying to do but if you want to reduce the buffer size that's used you could also look into: SO_SNDBUF and SO_RCVBUF.

There're no packets in TCP API.
There're packets in underlying protocols often, like when TCP is done over IP, which you have no interest in, because they have nothing to do with the user except for very delicate performance optimizations which you are probably not interested in (according to the question's formulation).
If you ask what is a maximum number of bytes you can send() in one API call, then this is implementation and settings dependent. You would usually call send() for chunks of up to several kilobytes, and be always ready for the system to refuse to accept it totally or partially, in which case you will have to manually manage splitting into smaller chunks to feed your data into the TCP send() API.

According to http://en.wikipedia.org/wiki/Maximum_segment_size, the default largest size for a IPV4 packet on a network is 536 octets (bytes of size 8 bits). See RFC 879

Generally, this will be dependent on the interface the connection is using. You can probably use an ioctl() to get the MTU, and if it is ethernet, you can usually get the maximum packet size by subtracting the size of the hardware header from that, which is 14 for ethernet with no VLAN.
This is only the case if the MTU is at least that large across the network. TCP may use path MTU discovery to reduce your effective MTU.
The question is, why do you care?

If you are with Linux machines, "ifconfig eth0 mtu 9000 up" is the command to set the MTU for an interface. However, I have to say, big MTU has some downsides if the network transmission is not so stable, and it may use more kernel space memories.

One solution can be to set socket option TCP_MAXSEG (http://linux.die.net/man/7/tcp) to a value that is "safe" with underlying network (e.g. set to 1400 to be safe on ethernet) and then use a large buffer in send system call.
This way there can be less system calls which are expensive.
Kernel will split the data to match MSS.
This way you can avoid truncated data and your application doesn't have to worry about small buffers.

It seems most web sites out on the internet use 1460 bytes for the value of MTU. Sometimes it's 1452 and if you are on a VPN it will drop even more for the IPSec headers.
The default window size varies quite a bit up to a max of 65535 bytes. I use http://tcpcheck.com to look at my own source IP values and to check what other Internet vendors are using.

The packet size for a TCP setting in IP protocol(Ip4). For this field(TL), 16 bits are allocated, accordingly the max size of packet is 65535 bytes: IP protocol details

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex