How can a file transfer go so quickly if the size of a packet cannot exceed 1500 bytes? - tcp

When downloading a file from a website, speeds of many megabytes per second can be achieved. If TCP has to break data up into packets no larger than 1500 bytes and send each one individually, how are these speeds possible? Doesn't the client have to wait for every 1500-byte fragment, which should take a while?
Thanks

Doesn't the client have to wait for every 1500-byte fragment, which should take a while?
No. That's the magic of TCP: you don't have to ACK every segment; you can ACK once in a while. The server can push many segments before the client absolutely must acknowledge at least some of them.
TCP uses a concept called "windows". A sender can push data into a window, causing it to shrink. The receiver acknowledges data, causing the window to expand. If the receiver doesn't acknowledge data, the transfer grinds to a halt.
In modern TCP, knowing when to acknowledge data is at the heart of the protocol. Acknowledging too often or not often enough has an enormous impact on performance.
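As a back-of-the-envelope illustration (not real TCP code; the segment and window sizes below are assumed example values), here is how many segments can be in flight before the sender has to stop and wait for an ACK:

```python
# Rough sketch only: how many segments can be "in flight" before the
# sender must stop and wait for an ACK. Sizes are assumed example values.

SEGMENT_SIZE = 1460                 # typical TCP payload per Ethernet frame (MSS)
RECEIVE_WINDOW = 4 * 1024 * 1024    # a 4 MB advertised window (with window scaling)

in_flight = RECEIVE_WINDOW // SEGMENT_SIZE
print(f"{in_flight} segments can be sent before any ACK is required")
# -> 2872 segments, so on a fast, low-loss path the sender almost never sits idle
```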

Related

TCP congestion control-fast retransmit and fast recovery

QUESTION
Suppose you are using TCP to transfer a 4 MB file over a network. The receiver advertises a receive window of 4 MB. Assume that the retransmission timers expire after 5 RTTs.
(a) If TCP sends 4KB segments, how many RTTs does it take to send the file, assuming that there are no segment losses?
(b) Describe what happens if the first segment sent after the send window reaches 1MB is lost. Assume that the version of TCP you are using does not implement fast retransmit and fast recovery. How many RTTs does it take to send the file?
(c) Now assume that the version of TCP you are using implements fast retransmit and fast recovery. How many RTTs are saved by the fact that your TCP implements fast retransmit and fast recovery?
I have recently started studying congestion control, but I am a little confused about how fast retransmit/fast recovery differs from ordinary retransmission. Please explain.
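For intuition only, here is a simplified sketch of the difference (not a worked answer to the numbers above, and the ACK values are made up): an ordinary retransmission waits for the retransmission timer to expire (5 RTTs in this question), while fast retransmit resends as soon as three duplicate ACKs arrive.

```python
# Simplified sketch of the fast-retransmit idea (not a full TCP implementation).
# The receiver keeps ACKing the last in-order byte; three duplicate ACKs tell
# the sender a segment was lost long before the retransmission timer fires.

DUP_ACK_THRESHOLD = 3   # standard fast-retransmit trigger

def handle_ack(ack_number, state):
    """state is a dict with 'last_ack' and 'dup_count' (illustrative only)."""
    if ack_number == state["last_ack"]:
        state["dup_count"] += 1
        if state["dup_count"] == DUP_ACK_THRESHOLD:
            print(f"fast retransmit: resend segment starting at {ack_number}")
    else:
        state["last_ack"] = ack_number
        state["dup_count"] = 0

state = {"last_ack": 0, "dup_count": 0}
for ack in [4096, 8192, 8192, 8192, 8192]:   # the segment starting at 8192 was lost
    handle_ack(ack, state)
```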

udp vs tcp packet dropping

If I send two packets over the Internet, one UDP and one TCP, which packet is more likely to reach its destination? I have been told that the TCP protocol is safer, but that is because of its "fail-safe" mechanism. Does that also mean that UDP packets are more likely to be lost along the way?
I think it's related to the specific router implementation: on one hand, if a UDP packet disappears, both sides probably know it might happen and can afford to lose a packet or two; on the other hand, if a TCP packet disappears, the "fail-safe" mechanism will send another and the problem is solved, though a TCP packet is also heavier.
I would like a more solid answer to this question because I find the subject quite interesting.
If you are making a decision on which protocol to use for your application, you really need to look into both in more detail. Below is just an overview.
TCP is a stream protocol that provides several mechanisms: guaranteed delivery of data, in order. It controls the rate at which data is sent (it starts transmitting slowly, then increases the speed until it reaches a rate that is sustainable by the peer). It resends any data that was not received on the other side. For all of that you pay a price (for example the slow start, the need to acknowledge all received data, etc.).
UDP, on the other hand, is a "data chunk" (datagram) protocol and provides none of those integrity/rate/order checks. It "compensates" by being (potentially) faster: you pump out data as fast as you can and the other side receives whatever it is able to catch, at full network speed in the extreme case. There is no guarantee of delivery or of the order in which data arrives at the other side; the receiver either gets a whole datagram or nothing.
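For concreteness, here is a minimal sketch of the two socket types in Python (the host and ports are placeholders):

```python
import socket

# TCP: a connected byte stream; the OS handles ordering, ACKs and retransmission.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.com", 80))          # placeholder host/port
tcp.sendall(b"some bytes")                # delivered in order, or the connection errors out
tcp.close()

# UDP: individual datagrams; each sendto() is "fire and forget".
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"a datagram", ("example.com", 9999))   # may arrive, arrive out of order, or vanish
udp.close()
```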
The decision usually has less to do with whether data can be lost than with how critical losing any of it would be. Video streaming is often done via UDP, since missing the occasional datagram is less critical than keeping the image smooth. File transfer cannot afford any data loss or reordering of data chunks, so TCP is the natural choice.
Apart from that question, remember that the network protocol is only half your problem. The other half is coming up with your application protocol to interpret the bytes you are receiving...

UDP Packet size and packet losses

I've been writing a program that uses a stop-and-wait protocol on top of UDP to send packets over a LAN and also over a WAN. I've recently been testing my program and have noticed that the packet loss rate is higher for larger packets (approaching 64k bytes). Intuitively this makes sense, but what are the actual reasons for it?
UDP packets greater than the MTU size of the network that carries them will be automatically split up into multiple packets, and then reassembled by the recipient. If any of those multiple sub-packets gets dropped, then the receiver will drop the rest of them as well.
So for example if you send a 63 KB UDP packet and it goes over Ethernet, it will get broken up into roughly 44 smaller "fragment" packets (Ethernet's MTU is 1500 bytes, but each fragment needs its own IP header, and the first one also carries the UDP header, so the amount of user data per fragment is smaller than that). The receiver will only "see" that UDP packet if all of those fragment packets make it through okay. If just one of them gets dropped, the whole operation fails.
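A quick back-of-the-envelope calculation of that (assuming the usual 1500-byte Ethernet MTU, a 20-byte IPv4 header and an 8-byte UDP header, with no options):

```python
# Rough fragmentation math for a large UDP datagram over Ethernet.
# Assumes the common header sizes; IP options would make the headers larger.
import math

MTU     = 1500   # Ethernet payload size in bytes
IP_HDR  = 20     # IPv4 header without options
UDP_HDR = 8      # UDP header

payload = 63 * 1024                 # a 63 KB UDP payload
per_fragment = MTU - IP_HDR         # each IP fragment carries up to 1480 data bytes
fragments = math.ceil((payload + UDP_HDR) / per_fragment)
print(f"{fragments} fragments; losing any one of them loses the whole datagram")

# To avoid fragmentation entirely, keep the payload at or below:
print(f"max unfragmented payload: {MTU - IP_HDR - UDP_HDR} bytes")   # 1472
```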
Well, data networks are far from reliable; packets get dropped all the time. Overloaded routers, full buffers and corrupt packets are some of the reasons. Since UDP has no flow control capabilities, it can't slow down if for example the receiving end is overloaded.
As Jeremy explained, the bigger the payload, the more packets it is going to be split into, and therefore a bigger chance of losing some of them.
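To put a made-up number on that: if each individual packet on the path has, say, a 1% chance of being dropped, the chance of the whole reassembled datagram surviving falls quickly with the fragment count:

```python
# Illustrative only: the 1% per-packet loss rate is an assumed figure.
loss_per_packet = 0.01

for fragments in (1, 10, 44):
    datagram_survives = (1 - loss_per_packet) ** fragments
    print(f"{fragments:2d} fragments -> {datagram_survives:.1%} chance the datagram arrives")
# 1 fragment   -> 99.0%
# 10 fragments -> ~90.4%
# 44 fragments -> ~64.3%
```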
UDP is used in cases where a dropped packet here and there won't affect anything, or in cases where you need the data to arrive in time or not at all (VoIP, streaming video, etc.).
It's all about IP fragmentation and reassembly. A packet larger than the MTU will be fragmented and has to be reassembled at the final host; the fragments can also be fragmented again along the path, which adds further delay. Sometimes, if a network element is configured for layer-4 filtering, it reassembles the packet (even though it is not the final host), applies its rules, then fragments it again and forwards it. That's the reason applications that need performance always try to send data with size <= (MTU - IP header - UDP header), i.e. 1472 bytes on a standard 1500-byte Ethernet MTU.

Can UDP retransmit lost data?

I know the protocol doesn't support this but is it common for clients that require some level of reliability to build into their application a method for requesting retransmission of a packet if found to be corrupt?
It is common for clients to implement reliability on top of UDP if they need reliability (or sometimes just some reliability) but none of the other things that TCP offers (for example strict in-order delivery), and if they want low latency at the same time (or multicast, where it works).
In general, you will only want to use reliable UDP if there are urgent reasons (very low latency and high speed needed, e.g. for a twitch game). In every "normal" case, simply using TCP will serve you equally well or better.
Also note that it is easy to implement your own stack on top of UDP that performs worse than TCP.
See ENet for an example of a library that implements reliability (and some other features) on top of UDP (RakNet or HawkNL would be other examples).
You might want to look at the answers to this question: What do you use when you need reliable UDP?
Of course. You can build a reliable protocol (like TCP) on top of UDP.
Example
Imagine you are building a fileserver:
* read the file using blocks of 1024 bytes
* construct a UDP packet with payload: 4 bytes for the "position" in the file, 4 bytes for the "length" of the data, followed by the data itself.
The receiver now receives UDP packets. If it gets the following packets:
* 0-1024: DATA
* 1024-2048: DATA
* 3072-4096: DATA
it realises a packet went missing and asks the sending application to resend the part between 2048 and 3072.
This is a very basic example to show that your application code needs to deal with packet construction and payload interpretation. Don't use it as-is; it does not take edge cases (the last packet, checksums for corrupted packets, ...) into account.
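A rough sketch of that framing in Python (this just mirrors the position/length scheme described above; real code would add checksums, retransmission timers, etc.):

```python
import struct

BLOCK_SIZE = 1024
HEADER = struct.Struct("!II")   # 4 bytes position, 4 bytes length (network byte order)

def make_packet(position, data):
    """Prefix a file block with its offset in the file and its length."""
    return HEADER.pack(position, len(data)) + data

def parse_packet(packet):
    """Return (position, data); gaps in the positions reveal missing packets."""
    position, length = HEADER.unpack_from(packet)
    return position, packet[HEADER.size:HEADER.size + length]

# Sender side (per block read from the file):
pkt = make_packet(2048, b"x" * BLOCK_SIZE)

# Receiver side: if it has seen 0-1024 and 1024-2048 but the next packet
# starts at 3072, it knows 2048-3072 is missing and can request a resend.
pos, data = parse_packet(pkt)
print(pos, len(data))   # 2048 1024
```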
Short answer: No.
Long answer: UDP doesn't care about packet loss. If a UDP packet arrives with a bad checksum, it is simply dropped; neither the sender nor the recipient is informed. Also, UDP packets are not numbered, so if you send four packets and only three arrive, the recipient cannot know that there used to be four and one is missing. Last but not least, packets are not acknowledged, so the sender never knows whether a packet it sent ever made it to the recipient.
Contrary to this, TCP breaks data into segments, and each segment is numbered, so the recipient can tell when a segment is missing. All segments must also be acknowledged to the sender. If the sender receives no acknowledgment for a segment after a certain period of time, it assumes the segment was lost and sends it again.
Of course, you can add your own frame header to every data fragment you send over UDP; that way your application code can number the fragments and implement an acknowledgement-and-resend strategy. But the question is: will this really be better than what TCP is already offering for free? Even if it were equally good, save yourself the time and just use TCP. A lot of people have thought they could do better than TCP, and usually they realize in the end that they cannot. TCP has its weaknesses, but in general it is pretty good at what it does.

What does LAN/traffic congestion mean?

While talking about UDP I saw/heard congestion come up a few times. What does that mean?
Congestion is when you are trying to send too much data over a link of limited bandwidth; the link cannot forward data as fast as it arrives, so the excess packets are dropped.
When congestion occurs, you can see these effects:
* Delay, because the queue at one end of the connection is too big, so it takes time for your packet to be transmitted.
* Packet loss, when new packets are simply dropped, forcing retransmissions (and often causing more congestion).
* Lower quality of service: protocols like TCP cut back their transmission rate, so your throughput is lowered.
* Blocking: certain networks have protocol priorities, so your UDP packets may be dropped in favor of letting TCP traffic through.
It's like a traffic jam: imagine right after a sports game, when a parking lot full of cars is trying to empty out onto a small side street.
It means that network-connected devices are attempting to send more data across the network than it can handle, e.g. 20 Mbps of data across a 10 Mbps link.
In the context of UDP, it's your main source of lost datagrams under ordinary circumstances.
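A small worked example of why that causes drops (all numbers are assumed for illustration): if 20 Mbps arrives at a router whose outgoing link is 10 Mbps, the excess has to queue, and the queue is finite:

```python
# Illustrative arithmetic only; link speeds and buffer size are assumed values.
arrival_rate   = 20e6 / 8          # 20 Mbps arriving, in bytes/second
departure_rate = 10e6 / 8          # 10 Mbps leaving, in bytes/second
buffer_size    = 1 * 1024 * 1024   # a 1 MB router buffer

backlog_per_second = arrival_rate - departure_rate        # 1.25 MB/s of excess
seconds_until_full = buffer_size / backlog_per_second
print(f"buffer full after {seconds_until_full:.2f} s; everything after that is dropped")
# -> about 0.84 s, after which packets are dropped until the senders slow down
```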
Most LANs use some sort of collision detection/avoidance system. Congestion typically means that the amount of data being transmitted on the medium is causing enough collisions to degrade the quality of service defined for that medium.
You may want to read up on CSMA/CD at Wikipedia.
As UDP packets are often broadcast, congestion can occur more often.
For instance, classic (shared-medium) Ethernet is a broadcast protocol. Once a message is sent, every node receives it but ignores it if the packet is not addressed to it. What happens when two nodes send a packet at the same time? It causes a collision and data loss.
So both nodes have to resend the message. To avoid further collisions, nodes are designed to wait a random amount of time before retrying. Otherwise they would keep sending simultaneously and the packets would collide forever.
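The classic scheme for choosing that random wait is binary exponential backoff. Here's a rough sketch (the slot time and retry limit follow the usual Ethernet convention, but treat the details as illustrative):

```python
import random

SLOT_TIME_US = 51.2     # classic 10 Mbps Ethernet slot time, in microseconds
MAX_ATTEMPTS = 16       # give up after 16 collisions

def backoff_delay(collisions):
    """Binary exponential backoff: pick a random slot in [0, 2^min(n,10) - 1]."""
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions, frame dropped")
    slots = random.randint(0, 2 ** min(collisions, 10) - 1)
    return slots * SLOT_TIME_US

# After each successive collision the possible wait doubles, so two colliding
# nodes quickly pick different delays and stop colliding.
for n in range(1, 6):
    print(f"collision #{n}: wait {backoff_delay(n):.1f} microseconds")
```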

Resources