Why is UDP usually used for DNS requests instead of TCP?
I know that we could use TCP, but why UDP is the default protocol? Are there any reasons for that, or it is just for design purposes?
UDP is default protocol because in most cases, and when DNS was designed, an exchange is a single question/response, each part fitting into a small 512 bytes packet, so there is no need to establish a long running connection, where TCP needs first a 3-way handshake before exchanging any data.
Hence in most cases UDP gives better performances and DNS is time sensitive.
But then of course UDP is easier to spoof than TCP and bigger packets can be a problem.
First of all, it is important to note that TCP can also be used for DNS. In practice, most DNS servers support both UDP and TCP, though TCP is rarely used for simple DNS queries and is reserved mainly for operations like zone transfers.
The biggest advantage to using UDP is the performance boost. There are several reasons why TCP DNS queries are slower:
TCP requires a connection to be established before each request, then subsequently torn down. So if it takes 20ms for a message to travel from your computer to the server and back (a time known as RTT - Round Trip Time), then a TCP query would require 3xRTT (60ms) to be fully processed - 20ms for opening the connection, 20 more ms for the query, and another 20ms to tear it down. UDP would only require one RTT, so 20ms.
Due to TCP's connection-oriented nature, more resource are needed per-connection to store and manage TCP's state. TCP requires both the client and server to have a separate socket for each and every connection.
UDP makes it easy to deploy anycast DNS servers. In anycast, several servers (possibly around the world) share a single IP address - e.g. 1.1.1.1. When you send a query to 1.1.1.1, one of these servers (probably one of the closest ones geographically) gets it. Since TCP involves multiple packets sent back and forth, reliable anycast is harder to achieve since you need to make sure that the packets always reach the same exact server. Otherwise, they might end up reaching different servers which won't know what to do with them.
Lower data overhead - a UDP header is tiny compared to the header TCP sends for every segment. Using UDP means sending fewer bytes.
Simplicity - UDP is a lot simpler than TCP. TCP is optimized for long data transfers and has a bunch of complex mechanisms such as flow control and congestion control for optimizing the rate of data flow. DNS doesn't need any of these mechanisms for simple queries since the typical amount of sent data is tiny.
Related
When a server has only 1 UDP socket, and many clients are sending UDP packets to it, what would be the best approach to handle all of the incoming packets?
I think this can also be a problem with TCP packets, since there's a limited thread count, which cannot cover all client TCP socket receive events.
But things are better in this situation because there's 1 TCP socket per client, and even if the network buffer is full, packet receiving is blocked until the queue has space (let me know if I'm wrong).
UDP packets, however, are discarded when the buffer is full, and there's only 1 socket, so the chances of that happening are higher.
How can I solve this problem? I've searched for a while, but I couldn't get a clear answer. Should I implement my own queueing system? Or just maximize the network buffer size?
There is no way to guarantee you won't drop UDP messages. No matter what you do, if the rate of packets being sent is too large, you will drop some, either on the receiving host or somewhere in the network.
Some things that can help include:
Implementing an internal queue for messages in your Java app, and handing them over to a thread pool to process.
Increasing the kernel's message buffering.
But neither of these can deal with the case where the average message arrival rate is higher that the receiver's ability to process them or the network capacity. This will inevitably lead to lost messages (requests).
I've searched for a while, but I couldn't get a clear answer.
That is because there isn't one! Some problems are fundamentally unsolvable. For others, the best answer depends on factors that are too hard to measure or predict.
(If you want certainty ... don't use networking!)
In the TCP case, what you should do is use a (long-term) socket for each client. Depending on the number of sockets you need to support, you could either:
Dedicate a server-side thread to each socket (and client).
Use java.nio.channels.Selector and a thread pool.
You will still get problems if the rate of requests exceeds your server's ability to process them. However, the TCP connections will ensure that requests are not lost, and that the clients get some "back pressure".
What purpose is UDP for..if it delivers packets without any order (and given the fact that packets may get lost on the way or sent to other network).
UDP as many very usefull use cases.
Just a few off the top of my head:
1/ Your payloads are small (will hold in a single "packet") and you want to go fast. That's why DNS uses UDP when the data size does exceeds 512 bytes (99% of the cases?):
https://en.wikipedia.org/wiki/Domain_Name_System#Protocol_transport
And you do hundreds of DNS requests every day. How many TCP 3-way handshakes and connection tear-down saved by this? How may petabytes or network load saved on "the internet"? I'd say that's quite useful!
2/ You do not know who you are talking too, or even if someone is listening or wishing to reply. In other words, you cannot or do not want for sure to establish an actual connection, like TCP would do. There may not be a TCP service listening for you. For example, the SSDP protocol from UPnP uses UDP to discover devices/services:
https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol
With UDP tough, you can send your data "in the wild" even if nobody is listening to you. Which leads me to point 3...
3/ You want to talk to multiple hosts, or even "everyone". That's multicasting and broadcasting, and it's very easy to do in UDP. The SSDP mentioned above is an example of such case. On the other hand, if you want to do multicast or broadcast on TCP, that becomes very tricky from the start. You'll have to subscribe to multicast group and blablabla. A multicast daemon may help (ex: https://github.com/troglobit/smcroute), but it's really way more difficult in TCP than with UDP.
4/ Your data is realtime, if the target is missing it there's no point for it to ask for a "please send it again, I did not get it and/or not in the correct order". That's too late, sorry. The receiver better forget it, go on and try to catch-up. A typical use case here can be live audio/video (telephony conversations, real time video streaming). There's no point for the receiver to try to get old, expired data again and again in case of TCP missed segment(s). You can only accumulate network data debt doing this. Better forget it and move on to the new, real-time data that keep coming in. You cannot "pause" real-time incoming data. If you want actual real-time, not pseudo real-time like you get in your web browser.
And I'm sure other posters will find many use-cases for UDP.
So UDP is very, VERY useful. You use it daily without noticing it. The networking world would be a pitiful place without it. You would really miss it. The "TCP/IP" should really be renamed "TCP-UDP/IP".
This was my advocacy for the unfairly despised but Oh-so-useful UDP. :-)
Typically, use UDP in applications where speed is more critical than reliability. For example, it may be better to use UDP in an application sending data from a fast acquisition where it is acceptable to lose some data points. You can also use UDP to broadcast/multicast to any/many machine(s) listening to the server.
Applications may want finer control over the performance characteristics or the reliability of their communications. For this, the operating system exposes IP’s “best-effort datagrams” to the application for it to do whatever it wants with them.
To do this, the OS provides UDP — the “user” datagram protocol. It’s just like IP, in that the service is best-effort datagrams, but instead of delivering those datagrams “to a computer,” there’s an added layer of addressing that says which application is interested in them (and like TCP, UDP does this with a port number).
Applications can run whatever they want on top of UDP — anything that runs on best-effort datagrams. There are lots of protocols you can run on top of that abstraction.
In general:
TCP is for high-reliability data transmissions
UDP is for low-overhead transmissions
Is TCP not responsible for making sure that a stream is sent intact over the wire by doing whatever may become necessary as losses etc. occur during a transfer?
Does it not do a proper job of it?
Why do higher application-layer protocols and their applications still perform checksums?
While TCP does contain its own checksum, it is only a 16-bit checksum and it is certainly possible for a multi-bit transmission error to slip by the TCP checksum mechanism. This is quite rare, but it is still possible and I have in fact seen it happen (once or twice in a couple of decades).
A robust protocol will want to use a higher-level hash function to assure integrity of transmitted data. Having said that, not many applications that transmit a small amount of data go to this trouble. Bulk transfer applications (such as a package manager or auto-update mechanism) will usually use a cryptographic hash function to increase the assurance of data integrity.
TCP ensures that TCP packets are delivered reliably, using checksums to trap errors introduced during transmission, and retransmitting lost or damaged packets as required. When a packet is transmitted it is retained in a retransmission queue until the peer host acknowledges receipt; if no acknowledgement is received within a certain timeout period then the packet is retransmitted. But the host won't keep retransmitting a packet forever - if a packet repeatedly fails then TCP eventually gives up and closes the connection.
Higher-level protocols assume that TCP works reliably (a fair assumption) and use their own checksums or whatever to check that the higher-level data stream arrived safely. I've written lots of buggy sockets applications that screwed up their own higher-level buffers and mangled the application data stream!
In any production-grade TCP/IP stack with a robust application I think you can be confident that the problem is that your connection is dropping out. Or you might have a buggy application, but I doubt that your fetch/wget is buggy.
I am using Winsock, and I have a need to issue a TCP connect repeatedly to a third-party server. These applications will stay up potentially for days at a time. I am the only client connecting to the server. The time between connects is on the order of seconds, and the connection stays up only long enough to send a single message of a few bytes. I am currently seeing that the connects start to fail (WSAECONNREFUSED) after a few hours. Is there anything I must do (e.g. socket options, etc.) to ensure these frequent repeated connects will succeed for an indefinite amount of time? Thanks!
When doing a lot of transaction based connections and having issues with TCP's TIME_WAIT state duration (which last 2MSL = 120 seconds) leading to no more connections available for a client host toward a special server host, you should consider UDP and managing yourself the re-sending of lost requests.
I know that sounds odd. But standard services like DNS are required to use UDP to handle a ton of transactions (request then a single answer in one UDP segment) in order to avoid issues you are experimenting yourself. Web browsers send a request using UDP to the DNS. Re-request is done using UDP after a short time, no longer than a few milliseconds I guess. Sometimes the resolved name is too long and does not fit in the UDP paquet. As a consequence the DNS server send a UDP reply with a dedicated flag raised, in order to ask the client to use TCP this time.
Moreover you may consider also the T/TCP extension (Transactional TCP) of TCP, if available on your Windows platform. It provides TCP reliability with shorter TIME_WAIT state, as nearly no costs in the modifications of your client code. As far as I know it may work even though the server does not handle that extension. As a side note it is currently not used on the internet as it is know to have some flaw...
When I'm learning about various technologies, I often try to think of how applications I use regularly implement such things. I've played a few MMOs along with some FPSs. I did some looking around and happened upon this thread:
http://www.gamedev.net/topic/319003-mmorpg-and-the-ol-udp-vs-tcp
I have been seeing that UDP shines when some loss of packets is permissible. There's less overhead involved and updates are done more quickly. After looking around for a bit and reading various articles and threads, I've come to see that character positioning will often be done with UDP. Games like FPSs will often be done with UDP because of all the rapid changes that are occuring.
I've seen multiple times now where someone pointed out issues that can occur when using UDP and TCP simultaneously. What might some of these problems be? Are these issues that would mostly be encountered by novice programmers? It seems to me that it would be ideal to use a combination of UDP and TCP, gaining the advantages of each. However, if using the two together adds a significant amount of complexity to the code to deal with problems caused, it may not be worth it in certain situations.
Many games use UDP and TCP together. Since it is mandatory for every game to deliver the actions of a player to everyone, it has to be done in one way or the other. It now depends on what kind of game you want to make. In a RTS, surely TCP would be much wiser, since you cannot lose information about your opponents movement. In an RPG, it is not that incredibly important to keep track of everything every second of the game.
Bottom line is, if data has to arrive at the client, in any case, (position updates, upgrades aso.), you have to send it via TCP, or, you implement your own reliable protocol based on UDP. I have constructed quite a few network stacks for games, and I have to say, what you use depends on the usecase and what you are trying to accomplish. I mostly did a heartbeat over UDP, so the remote server/client knows that I am still there. The problem with UDP is, that packets get lost and not resent. If a packet drops, it is lost for ever. You have to take that into account. If you send some information over UDP, it has to be information that is allowed to be lost. Everything else goes via TCP.
And so the madness began. If you want to get most out of both, you will have to adapt them. TCP can be slow sometimes, and you have to wait, if packets get fragmented or do not arrive in order, until the OS has reassembled them. In some cases it could be advisable to build your own reliable protocol on top of UDP. That would allow you complete control over your traffic. Most firewalls do not drop UDP or anything else, but as with TCP, any traffic that is not declared to be safe (Opening Ports, packet redirects, aso.), gets dropped. I would suggest you read up the TCP and UDP and UDP-Lite article up at wikipedia and then decide which ones you want to use for what. AFAIK Battle.net uses a combination of the two.
Many services use udp and tcp together, but doing so without adding congestion control to your udp implementation can cause major problems for tcp. Long story short, udp can and often will clog the routers at each endpoint making tcp's congestion control to go haywire and significantly limit the throughout of the tcp connection. The udp based congestion can also cause significant increase in packet loss for tcp, limiting tcp's throughput even more as it will need to have these packets retransmitted. Using them together isn't a bad idea and is even becoming somewhat common, but you'll want to keep this in mind.
The first possible issue I can think of is that, because UDP doesn't have the overhead inherent in the "transmission control" that TCP does, UDP has higher data bandwidth and lower latency. So, it is possible for a UDP datagram that was sent after a TCP message to be available on the remote computer's input buffers before the TCP message is received in full.
This may cause problems if you use TCP to control or monitor UDP transmission; for instance, you might tell the server via TCP that you'll be sending some datagrams via UDP on port X. If the remote computer isn't already listening on port X, it may not receive some of these datagrams, because they arrived before it was told to listen; if it is listening, but not expecting traffic from you, it may discard them because they showed up before it was told to expect them. This may have an adverse effect on your program's flow or your user's experience.
I think that if you like to transfer game data TCP will be the only solution. Imagine that you send a command (e.g gained x item :P) at the server and this packet never reach its destination. (udp has no guaranties).
Also imagine the scenario that two or more UDP packets reach their destination in wrong order.
But if with your game you integrate any VoIP or Video Call capabilities you can use at these UDP.