UDP functionality for communication between 2 ports/sockets? - networking

What purpose is UDP for..if it delivers packets without any order (and given the fact that packets may get lost on the way or sent to other network).

UDP as many very usefull use cases.
Just a few off the top of my head:
1/ Your payloads are small (will hold in a single "packet") and you want to go fast. That's why DNS uses UDP when the data size does exceeds 512 bytes (99% of the cases?):
https://en.wikipedia.org/wiki/Domain_Name_System#Protocol_transport
And you do hundreds of DNS requests every day. How many TCP 3-way handshakes and connection tear-down saved by this? How may petabytes or network load saved on "the internet"? I'd say that's quite useful!
2/ You do not know who you are talking too, or even if someone is listening or wishing to reply. In other words, you cannot or do not want for sure to establish an actual connection, like TCP would do. There may not be a TCP service listening for you. For example, the SSDP protocol from UPnP uses UDP to discover devices/services:
https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol
With UDP tough, you can send your data "in the wild" even if nobody is listening to you. Which leads me to point 3...
3/ You want to talk to multiple hosts, or even "everyone". That's multicasting and broadcasting, and it's very easy to do in UDP. The SSDP mentioned above is an example of such case. On the other hand, if you want to do multicast or broadcast on TCP, that becomes very tricky from the start. You'll have to subscribe to multicast group and blablabla. A multicast daemon may help (ex: https://github.com/troglobit/smcroute), but it's really way more difficult in TCP than with UDP.
4/ Your data is realtime, if the target is missing it there's no point for it to ask for a "please send it again, I did not get it and/or not in the correct order". That's too late, sorry. The receiver better forget it, go on and try to catch-up. A typical use case here can be live audio/video (telephony conversations, real time video streaming). There's no point for the receiver to try to get old, expired data again and again in case of TCP missed segment(s). You can only accumulate network data debt doing this. Better forget it and move on to the new, real-time data that keep coming in. You cannot "pause" real-time incoming data. If you want actual real-time, not pseudo real-time like you get in your web browser.
And I'm sure other posters will find many use-cases for UDP.
So UDP is very, VERY useful. You use it daily without noticing it. The networking world would be a pitiful place without it. You would really miss it. The "TCP/IP" should really be renamed "TCP-UDP/IP".
This was my advocacy for the unfairly despised but Oh-so-useful UDP. :-)

Typically, use UDP in applications where speed is more critical than reliability. For example, it may be better to use UDP in an application sending data from a fast acquisition where it is acceptable to lose some data points. You can also use UDP to broadcast/multicast to any/many machine(s) listening to the server.
Applications may want finer control over the performance characteristics or the reliability of their communications. For this, the operating system exposes IP’s “best-effort datagrams” to the application for it to do whatever it wants with them.
To do this, the OS provides UDP — the “user” datagram protocol. It’s just like IP, in that the service is best-effort datagrams, but instead of delivering those datagrams “to a computer,” there’s an added layer of addressing that says which application is interested in them (and like TCP, UDP does this with a port number).
Applications can run whatever they want on top of UDP — anything that runs on best-effort datagrams. There are lots of protocols you can run on top of that abstraction.
In general:
TCP is for high-reliability data transmissions
UDP is for low-overhead transmissions

Related

TCP vs UDP - Issues that arise from using both

When I'm learning about various technologies, I often try to think of how applications I use regularly implement such things. I've played a few MMOs along with some FPSs. I did some looking around and happened upon this thread:
http://www.gamedev.net/topic/319003-mmorpg-and-the-ol-udp-vs-tcp
I have been seeing that UDP shines when some loss of packets is permissible. There's less overhead involved and updates are done more quickly. After looking around for a bit and reading various articles and threads, I've come to see that character positioning will often be done with UDP. Games like FPSs will often be done with UDP because of all the rapid changes that are occuring.
I've seen multiple times now where someone pointed out issues that can occur when using UDP and TCP simultaneously. What might some of these problems be? Are these issues that would mostly be encountered by novice programmers? It seems to me that it would be ideal to use a combination of UDP and TCP, gaining the advantages of each. However, if using the two together adds a significant amount of complexity to the code to deal with problems caused, it may not be worth it in certain situations.
Many games use UDP and TCP together. Since it is mandatory for every game to deliver the actions of a player to everyone, it has to be done in one way or the other. It now depends on what kind of game you want to make. In a RTS, surely TCP would be much wiser, since you cannot lose information about your opponents movement. In an RPG, it is not that incredibly important to keep track of everything every second of the game.
Bottom line is, if data has to arrive at the client, in any case, (position updates, upgrades aso.), you have to send it via TCP, or, you implement your own reliable protocol based on UDP. I have constructed quite a few network stacks for games, and I have to say, what you use depends on the usecase and what you are trying to accomplish. I mostly did a heartbeat over UDP, so the remote server/client knows that I am still there. The problem with UDP is, that packets get lost and not resent. If a packet drops, it is lost for ever. You have to take that into account. If you send some information over UDP, it has to be information that is allowed to be lost. Everything else goes via TCP.
And so the madness began. If you want to get most out of both, you will have to adapt them. TCP can be slow sometimes, and you have to wait, if packets get fragmented or do not arrive in order, until the OS has reassembled them. In some cases it could be advisable to build your own reliable protocol on top of UDP. That would allow you complete control over your traffic. Most firewalls do not drop UDP or anything else, but as with TCP, any traffic that is not declared to be safe (Opening Ports, packet redirects, aso.), gets dropped. I would suggest you read up the TCP and UDP and UDP-Lite article up at wikipedia and then decide which ones you want to use for what. AFAIK Battle.net uses a combination of the two.
Many services use udp and tcp together, but doing so without adding congestion control to your udp implementation can cause major problems for tcp. Long story short, udp can and often will clog the routers at each endpoint making tcp's congestion control to go haywire and significantly limit the throughout of the tcp connection. The udp based congestion can also cause significant increase in packet loss for tcp, limiting tcp's throughput even more as it will need to have these packets retransmitted. Using them together isn't a bad idea and is even becoming somewhat common, but you'll want to keep this in mind.
The first possible issue I can think of is that, because UDP doesn't have the overhead inherent in the "transmission control" that TCP does, UDP has higher data bandwidth and lower latency. So, it is possible for a UDP datagram that was sent after a TCP message to be available on the remote computer's input buffers before the TCP message is received in full.
This may cause problems if you use TCP to control or monitor UDP transmission; for instance, you might tell the server via TCP that you'll be sending some datagrams via UDP on port X. If the remote computer isn't already listening on port X, it may not receive some of these datagrams, because they arrived before it was told to listen; if it is listening, but not expecting traffic from you, it may discard them because they showed up before it was told to expect them. This may have an adverse effect on your program's flow or your user's experience.
I think that if you like to transfer game data TCP will be the only solution. Imagine that you send a command (e.g gained x item :P) at the server and this packet never reach its destination. (udp has no guaranties).
Also imagine the scenario that two or more UDP packets reach their destination in wrong order.
But if with your game you integrate any VoIP or Video Call capabilities you can use at these UDP.

How to understand network protocols?

I work in web development, but I don't have a great understanding of network protocols. I recall hearing an analogy that TCP, HTTP, and SSL can be thought of as a series of nested envelopes around the actual request content.
I also have a fuzzy idea TCP consists of packets, which are verified on the other end. But I'm sort of picturing the HTTP request being chopped into packets, too...
So basically, I don't understand this stuff well at all. Can anybody give a good overview of this? Also, is there a beginner-friendly book or other resource that you'd recommend?
Since I asked this question, I've learned more about this topic, so I'll take a crack at answering it myself.
The easiest way to picture the protocol stack is as a letter, wrapped in a series of envelopes. Each envelope has a different role in getting the letter to its recipient, and envelopes are added and removed as needed along the journey.
The Application Layer
The letter itself is an application-layer request. For example, you've typed "StackOverflow.com" in your browser and pressed enter. Your browser needs to ask the StackOverflow server for its home page. So it writes a letter saying, "Dear StackOverflow, would you please send me your home page?"
If the writer of the letter is your browser, the recipient of the letter is the web server program running on StackOverflow. The browser wants the web server to "write back" with a response in the form of a web page. Both the browser and server are applications - programs running on specific computers.
Because browsers speak HTTP, that's what it uses to make the request: the letter says something like "GET http://stackoverflow.com". The browser also writes down any cookie information it got from StackOverflow last time ("remember me? You told me my login ID was X") and adds some miscellaneous labeled information called "headers" (things like "I'm Firefox" and "I can accept HTML or text" and "it's OK with me if you compress the content with gzip"). All that information will help the server know how to personalize or customize its response.
At that point, the browser is basically done. It hands this letter to the operating system and says, "would you please send this for me?" The OS says, "Sure." It then does some work to connect to StackOverflow (more on that in a minute), then tells the browser, "I'm working on it. By the way, here's a little mail bin I made for you, called a socket. When I hear back from StackOverflow, I'll put its letter in there and you can read it just like a file." The browser then happily awaits the response.
The IP layer
To send the request from the browser to StackOverflow, the operating system has to do several things.
First, it has to look up the address for StackOverflow.com - specifically, the IP address. It does this using DNS (which I won't go into here). Once it knows the IP address, it will know how to wrap the request in one of the "envelopes" called the IP layer.
Why do we need the IP layer? Well, once upon a time, we didn't.
Why we need IP
Have you ever seen an old movie where someone makes a phone call by asking the operator to connect them? The operator would physically connect the wire from Person #1's house to the wire for Person #2's house. Before the protocol stack was invented, connecting computers was a lot like that phone call: you needed a dedicated wire from point to point.
So, for example, if the computer scientists at Stanford wanted to exchange data with the ones at Harvard, they'd pay a bunch of money to rent a dedicated wire between the two places (a "leased line"). Any data that went into one end came out reliably at the other end. However, this was very expensive: imagine paying for a separate line for every place you want to connect to!
People realized that this wouldn't scale up. We needed a way to have a network that was shared by all users, like a giant spiderweb of wires spread out all over the map. That way, each user would only need one connection to the network and could reach any other user through it.
But that presented a problem. If everyone's communications went on the same lines, how would the data get to the right place? Imagine a bunch of letters dumped on a conveyor belt. Obviously, every letter needs to be addressed to someone, or else they can't be delivered.
That's the basic idea of IP: every machine needs to have an IP address that uniquely identifies it. Messages are placed in IP packets, which are like envelopes with addresses and return addresses.
So, once the OS has looked up the IP address for Stackoverflow.com, it puts the HTTP request in an IP envelope. If it's a "long letter", too big for one envelope, the OS cuts it into pieces and puts it in several IP envelopes. Each envelope says something like "FROM: (your IP address); TO: (The Server's IP address." Like the HTTP request, the IP packet has some other miscellaneous header information, which we won't go into here, but the basic idea is just "to" and "from."
So, at this point, the letter is ready to go, right?
The messiness of IP
Not quite. This letter could easily get lost! See, with IP, we no longer have a dedicated line from place to place. If we did, we'd be sure that our letters were getting delivered: as long as the line wasn't broken, everything would go through.
But with IP, everyone's packets get dumped onto conveyor belts and carried along. The belts lead to little sorting stations, called "routers". If you imagine the routers like physical mail centers, you could picture one in, say, New York City.
"Here's a letter headed for Mexico City. I don't know exactly how to get there, but the station in Houston should be able to get it closer, so I'll send it there. Ah, here's a letter that's going to Atlanta. I'll send it to Charlotte; they should be able to forward it a step closer."
Generally this system works OK, but it's not as reliable as having your own dedicated line. Nearly anything could happen en route: a conveyor belt could break or catch fire, and everything on it could be lost. Or one could get bogged down for a while, so that its packets are delivered very late.
Besides that, because these conveyor belts and stations are used by everyone, nobody's letters get treated specially. So what happens if a router gets more letters than it can possibly handle? For a while, it can stack them in a corner (maybe in RAM), but eventually, it runs out of space.
What it does then may seem shocking: it starts throwing them away.
Yep. That's it. You might think that it would at least be kind enough to send back a note to you, saying, "sorry, we couldn't deliver your letter." But it doesn't. If you think about it, if the router is overwhelmed, it's probably because there's too much traffic on the lines already. Adding apology notes would only make the problem worse. So it throws away your packet and doesn't bother telling anyone.
Obviously, this is a problem for our HTTP request. We need it to get there, and we need the response to get back reliably, too.
To make sure it gets there, we want some kind of "delivery confirmation" service. For that, we'll wrap another envelope around our HTTP request before putting into IP packets. That layer is called TCP.
TCP
TCP stands for "transfer control protocol." It exists to control what would otherwise be a messy, error-prone delivery process.
As implied before, TCP lets us add some "delivery confirmation" to this messy delivery system. Before we wrap our HTTP request in IP packets, we first put it into TCP packets. Each one gets a number: packet 1 of 5, 2 of 5, etc. (The numbering scheme is actually more complicated and counts bytes rather than packets, but let's ignore that for now.)
The basic idea of TCP is this:
First, the client and server - in this case, your operating system and the StackOverflow server's operating system - do a "handshake" to establish a "connection". Both words needs quotes because the "handshake" is actually a few messages back and forth, proving that packets can get successfully there and back, and the "connection" is really nothing more than each side deciding that they'll keep track of the packets flowing between them.
Next, they send packets back and forth; the client maybe requesting a web page, and the server maybe sending it back (in as many packets as that takes).
As one side receives packets, it sends back confirmation messages, saying "so far I've received your packets up to packet 100" and so forth. If one party sends packets and doesn't hear a confirmation for a while, it will assume they were lost and re-send them.
(Getting confirmations when things arrive at the other end is better than getting error reports when a router drops things along the way for a couple of reasons. One is that confirmations go back over a working connection, whereas errors would further clog a non-working connection. Another is that we don't have to trust the intermediary routers to do the right thing; the client and server are the ones who care most about this particular conversation, so they're the ones who take charge of being sure that it works.)
Besides making sure that all the data gets to the other end, TCP also makes sure that the received data gets put back into the right order before handing it up the stack, in case earlier packets got resent and arrived later, or packets in the middle took a longer route, or whatever.
That's basically it - having this kind of delivery confirmation makes the unreliable IP network reliable.
Why wasn't it built straight into IP?
UDP
Well, confirmation has a drawback: it makes things slower. If something is missed, it must be repeated. In some cases, that would be a waste of time, because what you really want is a real-time connection. For example, if you're having a phone conversation over IP, or you're playing a real-time game over the internet, you want to know what's happening right now, even if it means you miss a bit of what happened a second ago. If you stop to repeat things, you'll fall out of sync with everyone else. In cases like that, you can use a cousin of TCP called UDP, which doesn't re-send lost packets. UDP stands for "user datagram protocol", but many people think of it as "unreliable data protocol". That's not an insult; sometimes reliability is less important than staying current.
Since both of these are valid use cases, it makes sense that the IP protocol stayed neutral on the issue of reliability; those who use it can choose whether to add reliability or not.
Both TCP and UDP add one other important piece of information to the request: a port number.
Port numbers
Remember, our original request is comes from a browser and is going to a web server program. But the IP protocol only has addresses that specify computers, not the applications running on them. The machine with StackOverflow's web server may also have other server programs that are listening for requests: a database server, an FTP server, etc. When that machine gets the request, how will it know which program should handle it?
It will know because the TCP request has a port number on it. This is just a number, nothing fancy, but by convention, certain numbers are interpreted to mean certain things. For example, using a port number of 80 is a conventional way of saying "this is a request for a web server." Then the server machine's operating system will know to hand that request to the web server program and not, say, the FTP server program.
When the TCP packets start streaming back to your computer, they will also have a port number, to let your machine know which program to give the response to. That number will vary based on the socket that your machine created initially.
Wait, what's a socket?
Sockets
Remember earlier when the browser asked the OS to send the request? The OS said it would set up a "mail bin" for any response it got back. That bin is called a socket.
You can think of a socket sort of like a file. A file is an interface that the OS provides. It says, "you can read and write data here, and I will take care of figuring out how to actually store it on the hard drive or USB key or whatever." The thing that uniquely identifies a file is the combination of path and filename. In other words, you can only have one file located in the same folder with the same name.
Similarly, a socket is an interface the OS provides. It says, "you can write requests here and read responses." The thing that uniquely identifies a socket is the combination of four things:
Destination IP
Destination Port
Source IP
Source Port
So, you can only have one socket on a system with the same combination of all of those. Notice that you could easily have several sockets open to the same destination IP and port - say, StackOverflow's web server - as long as they all have different source ports. The OS will guarantee that they do by choosing an arbitrary source port for each request, which is why you can have several tabs or several browsers all requesting the same web site simultaneously without anything getting confused; the packets coming back all say which port on your computer they're headed for, which lets the OS know "ah, this packet is for tab 3 in Firefox" or whatever.
Summing up so far
We've been thinking of the protocols as a series of envelops wrapped around the letter. In our example, the letter was an HTTP request, which got wrapped in TCP, then in IP. The IP packets get sent to the right destination computer. That computer removes the IP "envelope" and finds a TCP packet inside. The TCP packet has a port number, which lets the operating system know which port to collect its information in. It replies saying that it got that packet, and it puts its contents (the HTTP request) into the correct socket for the appropriate program to read from. When that program writes a reponse to the socket, the OS sends it back to the requester.
So our "stack" is:
An HTTP request (a "letter"). This is the application layer.
Wrapped in TCP packets ("envelopes"). This is the transport layer.
Wrapped in IP packets ("envelopes"). This is the IP layer.
It's important to understand that this stack is totally customizable. All of these "protocols" are just standard ways of doing things. You can put anything you want inside of an IP packet if you think the receiving computer will know what to do with it, and you can put anything you want inside a TCP or UDP packet if you think the receiving application will know what to do with it.
You could even put something else inside your HTTP request. You could say that some JSON data in there is the "phone number exchange protocol," and as long as both ends know what to do with it, that's fine, and you've just added a higher-level protocol.
Of course, there's a limit to how "high" you can go in the stack - that is, you can put a smaller envelope inside HTTP, and a smaller one inside that, etc, but eventually you won't have any room to go smaller; you won't have any bits for actual content.
But you can easily go "lower" in the stack; you can wrap more "envelopes" around the existing ones.
Other protocol layers
Once common "envelope" to wrap around IP is Ethernet. For example, when your computer decides to send IP packets to Google, it wraps them up as we've described so far, but to send them, it gives them to your network card. The network card may then wrap the IP packets in Ethernet packets (or token ring packets, if you've got an antique setup), addressing them to your router and sending them there. Your router removes those Ethernet "envelopes", checks the IP address, decides who the next closest router is, wraps another Ethernet envelope addressed to that router, and sends the packet along.
Other protocols could be wrapped as well. Maybe two devices are only connected wirelessly, so they wrap their Ethernet packets in a Wi-Fi or Bluetooth or 4G protocol. Maybe your packets need to cross a village with no electricity, so someone physically prints the packets on paper with numbered pages, rides them across town on a bicycle, and scans them into another computer in the order of the page numbers. Voila! A print-to-OCR protocol. Or maybe, I don't know, TCP over carrier pigeon would be better.
Conclusion
The protocol stack is a beautiful invention, and it works so well that we generally take it for granted.
It is a great example of abstracting functionality: each layer has its own work to do and can rely on others to deal with the rest.
The application layer is only concerned with applications talking to each other: "Firefox wants to talk to the web server at StackOverflow.com."
The transport layer is only concerned with getting a stream of packets delivered correctly from one app to another: "all the packets from port 123 on machine 1 need to get to port 80 on machine 2".
The IP layer is only concerned with routing individual packets: "this packet needs to get to the following IP address."
The link layer is only concerned with getting packets from one waypoint to the next: "this ethernet packet needs to get from the network card to the router."
The physical layer is only concerned with signal transmission: "these pulses need to be sent over this wire."
(Although these layer terms are borrowed from OSI, OSI was actually a competing standard to TCP/IP, and included things like the "session layer" and "presentation layer" that TCP/IP doesn't use. OSI was intended to be a more sane and standardized alternative to the scrappy hacked-together TCP/IP stack, but while it was still being discussed, TCP/IP was already working and was widely adopted.)
Because the layers can be mixed and matched as needed, the stack is flexible enough to accommodate nearly any use we can think of, so it's probably going to be around for a long time. And hopefully now you can appreciate it a bit more.
For the throughout description of TCP/IP networking (without physical layer, e.g., Ethernet), pick TCP/IP Illustrated by Stevens. If you going to do some low-level network programming, Unix network programming by the same author is the best.
There's a reason you'll often hear of TCP/IP implementations called a "stack". Part of the concept is that you have a low-level protocol (Ethernet, PPP, what-have-you), slightly higher-level protocols built on top of it (IP), and so on. It's quite similar to the OSI model, and can be described in terms of that model, though TCP/IP breaks up the layers just a bit differently. Anyway, programs generally send data using one of the upper-level protocols, and let the TCP/IP stack handle the details of getting the data from point A to point B.
TCP sits on top of IP and lets you think of the data flowing in and out as a pair of streams (one in, one out) rather than getting raw IP packets and having to figure out what to do with them. (Big BIG benefit: it simplifies multiplexing. Without TCP or UDP or the like, IP would be near useless -- only one program could normally communicate with the network at a given time.)
SSL sits on top of TCP, and lets you send data over the stream that TCP provides without having to get involved in the ugly details of encrypting and decrypting data, verifying certificates, etc.
HTTP sits on top of TCP (or SSL, in the case of HTTPS), and provides a way for a client and server to pass entire requests and responses, along with metadata describing them.
Network protocols are formal standards and policies comprised of rules, procedures and formats that define communication between two or more devices over a network. Network protocols govern the end-to-end processes of timely, secure and managed data or network communication.
There are several broad types of networking protocols, including:
• Network communication protocols: Basic data communication protocols, such as TCP/IP and HTTP.
• Network security protocols: Implement security over network communications and include HTTPS, SSL and SFTP.
• Network management protocols: Provide network governance and maintenance and include SNMP and ICMP.
The different layers of the Open Systems Interconnection (OSI) reference model are:
Application layer: This is the upper most layer in the OSI reference model. The application layer provides the means by which application processes can access network services, and is therefore associated with services that include direct support for applications.
Presentation layer: This layer in the OSI reference model deals with specifying the format which should be utilized to enable network data to be communicated between computers in the network. The presentation layer adds formatting, encryption, and data compression to the packet.
Session layer: This layer enables applications that reside on different computers to create and close network sessions. It also manages open network connections, or sessions that are open.
Transport layer: The transport layer is responsible for ensuring that data is delivered in sequence, error-free, and efficiently over the network. The transport layer also identifies duplicated packets, and drops them. Transport layer protocols include Transmission Control Protocol (TCP) and Sequenced Packet Exchange (SPX). These protocols open packets at the receiving computer, and reassemble the original messages as well.
Network layer: This layer of the OSI reference model provides addressing for messages for all networks. It translates your logical addresses and names to physical addresses, and then identifies the preferred route from the source computer to the destination computer.
Data Link layer: The Data Link layer prepares data for the physical connection by defining the means by which software drivers can access the physical medium. The Data Link layer transmits frames from the Network layer to the Physical layer.
Physical layer: This layer places the data on the physical medium which is carrying the data. It is responsible for the actual physical connection between two computers on the network that are exchanging data.
The function of protocols at the sending computer is summarized below:
• Segment data into smaller more manageable chunks or packets.
• Append addressing to the packets.
• Ensure that data is ready for sending via the network interface card (NIC) to the network cable
The function of protocols at the receiving computer is summarized below:
• Remove packets from the network cable, and move the packets through the NIC to the computer.
• Remove all information that relate to the sending of the packet. This is information added to the packet by the sending computer.
• Move the packets to the buffer for the reassembly process.
• Convey the data to the particular application.
Internet Protocol :
Internet protocol suite is the set of communication protocols that implement the protocol stack on which the internet runs. The Internet protocol suite is sometimes called the TCP/IP protocol suite, after TCP\IP, which refers to the important protocols in it, the Transmission Control Protocol(TCP) and the Internet Protocol(IP). The Internet protocol suite can be described by the analogy with the OSI model, but there are some differences. Also not all of the layers correspond well.
Protocol Stack:
A protocol stack is the complete set of protocol layers that work together to provide networking capabilities.
Transmission Control Protocol (TCP):
The Transmission Control Protocol is the core protocol of the internet protocol suite. It originated in the network implementation in which it complemented the Internet Protocol. Therefore the entire suite is commonly referred to as TCP/IP. TCP provides reliable delivery of a stream of octets over an IP network. Ordering and error-checking are main characteristics of the TCP. All major Internet applications such as World Wide Web, email and file transfer rely on TCP.
Internet Protocol(IP):
The Internet Protocol is the principal protocol in the Internet protocol suite for relaying data across networks. Its routing function essentially establishes the internet. Historically it was the connectionless datagram service in the original Transmission Control Program; the other being the connection oriented protocol(TCP). Therefore, the Internet protocol suite is referred as TCP/IP.
Is there a beginner-friendly book or other resource that you'd
recommend?
Data Communications and Networking by Behrouz Forouzan:
This contains introductory material and the explanation is beginner friendly. At the same time, it is not dumbed down and the material gets a bit more challenging as you go on. There are very good diagrams explaining concepts too. The typesetting is awesome and you'll have lots of interesting tips surrounding the content. The chapters are ordered according to the OSI stack as mentioned in other answers here. But a lot of the math and derivations for formulas for protocol efficiencies aren't explained.
Computer Networks by Andrew S. Tanenbaum
Everything found in Behrouz Forouzan + lots of equations.
My recommendation is to read the first book first and if you are particularly curious about the math, go to the second one.
We had computer networking on school and we had to buy this book it really helps. It explains every layer of the OSI model. (From the internetcabel and routers up to the tcp udp protecol layers up to the application layer). If you want to have more basic knowledge of how it all works this is a must read.

Why is UDP + a software reliable ordering system faster than TCP?

Some games today use a network system that transmits messages over UDP, and ensures that the messages are reliable and ordered.
For example, RakNet is a popular game network engine. It uses only UDP for its connections, and has a whole system to ensure that packets can be reliable and ordered if you so choose.
My basic question is, what's up with that? Isn't TCP the same thing as ordered, reliable UDP? What makes it so much slower that people have to basically reinvent the wheel?
General/Specialisation
TCP is a general purpose reliable system
UDP +whatever is a special purpose reliable system.
Specialized things are usually better than general purpose things for the thing they are specialized.
Stream / Message
TCP is stream-based
UDP is message-based
Sending discrete gaming information maps usually better to a message-based paradigm. Sending it through a stream is possible but horribly ineffective. If you want to reliably send a huge amount of data (File transfer), TCP is quite effective. That's why Bit-torrent use UDP for control messages and TCP for data sending.
We switched from reliable to unreliable in "league of legends" about a year ago because of several advantages which have since proven to be true:
1) Old information becomes irrelevant. If I send a health packet and it doesn't arrive... I don't want to have to wait for that same health packet to resend when I know its changed.
2) Order is sometimes not necessary. If I'm sending different messages to different systems it may not be necessary to get those messages in order. I don't force the client to wait for in-order messages.
3) Unreliable doesn't get backed up with messages... ie waiting for acknowledgements which means you can resolve loss spikes much more quickly.
4) You can control resends when necessarily more efficiently. Such as repacking something that didn't send into another packet. (TCP does repack but you can do it more efficiently with knowledge about how your program works.)
5) Flow control of message such as throwing away messages that are less relevant when the network suddenly spikes. The network system can choose not to resend less relevant messages when you have a loss spike. With TCP you'd still have a queue of messages that are trying to resend which may be lower priority.
6) Smaller header packet... don't really need to say much about that.
There's much more of a difference between UDP and TCP than just reliability and sequencing:
At the heart of the matter is the fact that UDP is connectionless while TCP is connected. This simple difference leads to a host of other differences that I'm not going to be able to reasonbly summarize here. You can read the analysis below for much more detail.
TCP - UDP Comparative Analysis
The answer in in my opinion the two words: "Congestion control".
TCP goes to great lengths to manage the bandwidth of the path - to use the most of it, but to ensure that there is space for other applications. This is a very hard task, and inherently it is not possible to use 100% of the bandwidth 100% of the time.
With UDP, on the other hand, one can make their own protocol to send the packets onto the wire as fast as they want - this makes the protocol very unfriendly to other applications, but can gain more "performance" short-term. On the other hand, it is with high probability that if the conditions are appropriate, this kind of protocols might contribute to congestion collapse.
TCP is a stream-oriented protocol, whereas UDP is a message-oriented protocol. Hence TCP does more than just reliability and ordering. See this post for more details. Basically, the RakNet developers added the reliability and ordering while still keeping it as a message-oriented protocol, and so the result was more lightweight than TCP (which has to do more).
This little article is old, but it's still pretty true when it comes to games. It explains the two protocols, and the havoc these folks went trying to develop a multiplayer internet game. "X-Wing vs Tie Fighter"
Lessons Learned (The Internet Sucks)
There is one caveat to this though, I run/develop a multiplayer game, and I've used both. UDP was much better for my app, but alot of people couldn't play with UDP. Routers and such blocked the connections. So I changed to the "reliable" TCP. Well... Reliable? I don't think so. You send a packet, no errors, you send another and it crashes (exception) in the middle of the packet. Now which packets made it? So you end up writing a reliable protocol ON TOP OF tcp, to simulate UDP - but continuously establish a new connection when it crashes. Take about inefficient.
UDP + Stop and wait ARW = good
UDP + Sliding Window Protocol = better
TCP + Sliding Window Protocol with reconnection? = Worthless bulkware. (IMHO)
The other side effect is multi-threaded applications. TCP works well for a chat room type thing, since each room can be it's own thread. A room can hold 60-100 people and it runs fine, as the Room thread contains the Sockets for each participant.
UDP on the other hand is best served (IMO) by one thread, but when you get the packet, you have to parse it to figure out who it came from (via info sent or RemoteEndPoint), then pass that data to the chatroom thread in a threadsafe manner.
Actually, you have to do the same with TCP, but only on connect.
Last point. Remember that TCP will just error out and kill the connection at anytime, but you can reconnect in about .5 seconds and send the same information. Most bizzare thing I've ever worked with.
UDP have a lower reliability give it more reliability by making it send a message and wait for respond if no respond came it resend the message.

Streams and UDP

Why does the UDP protocol not support streams like TCP protocol does?
Others have given perfectly good answers, but I think an analogy may help.
Imagine that UDP is a bit like a mailman - putting letters through people's doors, but never checking that they actually see the letter, or that the person even exists. This is cheap, and works well for junk mail send to many people (read: broadcast packets).
TCP is more like a messenger who will knock on your door and deliver a message in person. If you're not in, he'll come back and try a bit later. Eventually, you'll get the message - or the messenger will know that he hasn't delivered it. You can also send a message back to the sender via the messenger.
Don't try to read too much into my analogy - it's not like there's actually a single "messenger" in TCP - but it might help in terms of thinking about things. In particular, imagine you were sending a whole series of letters to someone, which they had to read in order - that can't work (reliably) with a mailman, as you might come down to find 10 letters sitting on your doormat, with no idea what order to read them in. The mailman might have dropped a few on the way, as well - and no-one will know. That's not a suitable basis for a reliable stream.
(On the other hand, it's a fine model for newspaper distribution. If you happen to miss a few, that's not a problem - you'll still get the later ones, which will be more interesting to you by then. That's why some streaming media solutions use UDP, despite it not giving a reliable actual stream.)
Well ...
UDP is a datagram protocol. It is intended to let applications send invidual datagrams, without a high-level protocol controlling what goes on.
A "stream" is a high-level concept, and thus not possible for UDP to support.
Many applications end up re-implementing parts of TCP on UDP, to get the particular mixture of latency, order independence, error handling, and bandwidth that they need.
The reason is simple: It doesn't, because it doesn't, because that wasn't one of the design goals for the protocol.
Making UDP work as a stream would mean adding almost every feature that TCP has. You'd end up with two identical protocols. What'd be the point?
UDP was meant to be the minimal lightweight protocol. It was meant to complement TCP.
If you define a 'stream' as a 5-tuple of IP(& L4) header, UDP does support streams. I guess what you mean to convey is whether UDP supports connections (where in each stream, aka, 5-tuple, is treated as a logical connection) and reliability of delivery of packets belonging to that stream is more or less guaranteed. Whereas, in case of UDP, althrough the packets are part of a stream (5-tuple), there's no effort to ensure reliability, instead relying on the underlying network layer to make proper delivery. If it fails, so be it.
Jon Skeet's analogy makes perfect sense.

Why Does RTP use UDP instead of TCP?

I wanted to know why UDP is used in RTP rather than TCP ?. Major VoIP Tools used only UDP as i hacked some of the VoIP OSS.
As DJ pointed out, TCP is about getting a reliable data stream, and will slow down transmission, and re-transmit corrupted packets, in order to achieve that.
UDP does not care about reliability of the communication, and will not slow down or re-transmit data.
If your application needs a reliable data stream, for example, to retrieve a file from a webserver, you choose TCP.
If your application doesn't care about corrupted or lost packets, and you don't need to incur the additional overhead to provide the additional reliability, you can choose UDP instead.
VOIP is not significantly improved by reliable packet transmission, and in fact, in some cases things in TCP like retransmission and exponential backoff can actually hurt VOIP quality. Therefore, UDP was a better choice.
A lot of good answers have been given, but I'd like to point one thing out explicitly:
Basically a complete data stream is a nice thing to have for real-time audio/video, but its not strictly necessary (as others have pointed out):
The important fact is that some data that arrives too late is worthless. What good is the missing data for a frame that should have been displayed a second ago?
If you were to use TCP (which also guarantees the correct order of all data), then you wouldn't be able to get to the more up-to-date data until the old one is transmitted correctly. This is doubly bad: you have to wait for the re-transmission of the old data and the new data (which is now delayed) will probably be just as worthless.
So RTP does some kind of best-effort transmission in that it tries to transfer all available data in time, but doesn't attempt to re-transmit data that was lost/corrupted during the transfer (*). It just goes on with life and hopes that the more important current data gets there correctly.
(*) actually I don't know the specifics of RTP. Maybe it does try to re-transmit, but if it does then it won't be as aggressive as TCP is (which won't ever accept any lost data).
The others are correct, however the don't really tell you the REAL reason why. Saua kind of hints at it, but here's a more complete answer.
Audio and Video is real-time. If you are listening to a radio, or watching TV, and the signal is interrupted, it doesn't pick up where you left off.. you're just "observing" the signal as it streams, and if you can't observe it at any given time, you lose it.
The reason, is simple. Delay. VOIP tries very hard to minimize the amount of delay from the time someone speaks into one end and you get it on your end, and your response back. Otherwise, as errors occured, the amount of delay between when the person spoke and when the signal was received would continuously grow until it became useless.
Remember, each delay from a retransmission has to be replayed, and that causes further data to be delayed, then another error causes an even greater delay. The only workable solution is to simply drop any data that can't be displayed in real-time.
A 1 second delay from retransmission would mean it would now be 1 second from the time I said something until you heard it. A second 1 second delay now means it's 2 seconds from the time i say something until you hear it. This is cumulative because data is played back at the same rate at which it is spoken, and so on...
RTP could be connection oriented, but then it would have to drop (or skip) data to keep up with retransmission errors anyways, so why bother with the extra overhead?
Technically RTP packets can be interleaved over a TCP connection. There are lots of great answers given here. Two additional minor points:
RFC 4588 describes how one could use retransmission with RTP data. Most clients that receive RTP streams employ a buffer to account for jitter in the network that is typically 1-5 seconds long and which means there is time available for a retransmit to receive the desired data.
RTP traffic can be interleaved over a TCP connection. In practice when this is done, the difference between Interleaved RTP (i.e. over TCP) and RTP sent over UDP is how these two perform over a lossy network with insufficient bandwidth available for the user. The Interleaved TCP stream will end up being jerky as the player continually waits in a buffering state for packets to arrive. Depending on the player it may jump ahead to catch up. With an RTP connection you will get artifacts (smearing/tearing) in the video.
UDP is often used for various types of realtime traffic that doesn't need strict ordering to be useful. This is because TCP enforces an ordering before passing data to an application (by default, you can get around this by setting the URG pointer, but no one seems to ever do this) and that can be highly undesirable in an environment where you'd rather get current realtime data than get old data reliably.
RTP is fairly insensitive to packet loss, so it doesn't require the reliability of TCP.
UDP has less overhead for headers so that one packet can carry more data, so the network bandwidth is utilized more efficiently.
UDP provides fast data transmission also.
So UDP is the obvious choice in cases such as this.
Besides all the others nice and correct answers this article gives a good understanding about the differences between TCP and UDP.
The Real-time Transport Protocol is a network protocol used to deliver streaming audio and video media over the internet, thereby enabling the Voice Over Internet Protocol (VoIP).
RTP is generally used with a signaling protocol, such as SIP, which sets up connections across the network. RTP applications can use the Transmission Control Protocol (TCP), but most use the User Datagram protocol (UDP) instead because UDP allows for faster delivery of data.
UDP is used wherever data is send, that does not need to be exactly received on the target, or where no stable connection is needed.
TCP is used if data needs to be exactly received, bit for bit, no loss of bits.
For Video and Sound streaming, some bits that are lost on the way do not affect the result in a way, that is mentionable, some pixels failing in a picture of a stream, nothing that affects a user, on DVDs the lost bit rate is higher.
just a remark:
Each packet sent in an RTP stream is given a number one higher than its predecessor.This allows thr destination to determine if any packets are missing.
If a packet is mising, the best action for the destination to take is to approximate the missing vaue by interpolation.
Retranmission is not a proctical option since the retransmitted packet would be too late to be useful.
I'd like to add quickly to what Matt H said in response to Stobor's answer. Matt H mentioned that RTP over UDP packets can be checksum'ed so that if they are corrupted, they will get resent. This is actually an optional feature on most PBXs. In Asterisk, for example, you can enable / disable checksums on your RTP over UDP traffic in the rtp.conf configuration file with the following line:
rtpchecksums=yes ; or no if you prefer
Cheers!

Resources