destination port in UDP protocol - networking

My question is that how the destination port address in UDP is chosen/given?
I mean what matters to set a destination port in a UDP packet?
Because when we send a packet, just the destination address(ip) is important and we want to send data to our destination.
It has nothing to do with the port!
Do we assign a random port?

Typically, whatever documentation tells you what to put in the UDP datagram you're sending should also tell you what port to send it to.
For example, if you're trying to talk to an NTP server, RFC5905 tells you what to put in the UDP datagrams you send. It also tells you, on page 16, to send it to port 123.
If you're writing a DNS resolver, RFC1035 is one place you might look for the information needed to know what to put in your UDP datagrams. It also tells you, in section 4.2, to send the datagrams to port 53.
So however you're figuring out what to put in the UDP datagrams you're going to send, that's typically what tells you either what port to send them to or, in some cases, how to determine what port to send them to.
For example, a media streaming protocol might start with the information about the stream being delivered by a web server. In that case, the information delivered by the web server to the client might include the destination port to send datagrams to.
Generally, there's either a well-known port that at least one side listens for datagrams on or there's some external method using a different protocol that tells whichever end sends the first datagram what port to send it to. The other end then just replies, sending its response datagrams to whatever port that first datagram was sent from.

Generally, the sending port is chose randomly for the ephemeral ports available.
The destination port is the port to which the destination application is listening. To facilitate this, IANA maintains the Service Name and Transport Protocol Port Number Registry for standard applications and protocols.
If you create your own application or protocol, there is a range for you to use, but you should always check the registry to make sure you will not step on some other application or protocol.
When you design your listening application or protocol, you choose a port on which it listens, and the sending application will need to send to that port.

Related

UDP hole-punch explanation

I'm trying to understand UDP hole punching and I just don't quite get it.
In concept it seems simple but when I put it into practice I can't pull it off.
From what I understand there's a public server we call the hole-punch server. A client makes a request to hole-punch server (this is public). The hole-punch server spits out a public ip and port of the client that just made the request. So long as that port is open then essentially any random client can make a request to that client using that specific port and ip ?
The issue I guess I'm having is, the client is able to make a request to the server. The server is able to send data back to the client on that public port and ip however when another client tries to send a request to that client using that same port and ip it just doesn't go through and that's what's confusing me. If the server can make the request why can't another random client make that request?
The thing to know about UDP hole-punching is that many consumer-grade Internet routers/NAT-firewalls have a policy along the lines of "block any incoming UDP packets, except for UDP packets coming from an IP address that the user's local computer has recently sent a UDP packet to"; the idea being that if the local user is sending packets to a particular IP address, then the packets coming back from that same IP address are probably legitimate/desirable.
So in order to get UDP packets flowing between two firewalled/NAT'd computers, you have to get each of the two computers to first send a UDP packet to the other one; which is a bit of a chicken-and-egg problem since they can't know where to send the UDP packet without being able to communicate; the public server is what solves that problem. Since that server is public, both clients can communicate with the server (via UDP or TCP or HTTP or whatever), and that server can tell each client the IP address and port to send its UDP packets to. Once each client has sent some initial packets to the other, it should also (in most cases) then be able to receive UDP packets from the other client as well, at which point the server is no longer necessary as a go-between.

Multicast Broadcasting to self clarification

Setup:
The user has two applications - one sender one receiver - running on the same host/server. The user sets it up such that the sender sends messages to its own IP address not 127.0.0.1. Lets say its IP and port is x:y for simplicity. The user then sets up the receiver to receiver messages on x:y. Again this is on the same host/server.
Questions:
From my understanding this is not possible since the port will already be reserved. Therefore I cannot use the same port to try and send packets out to myself. Can I have a port used for a sender and receiver on the same node?
Is this resolved if I use SO_REUSEADDR or does this only resolve the IP conflict and not the port reuse?
If the program is not setup with IP_MULTICAST_LOOP the host will not multicast the message to itself, correct?
With IP_MULITCAST_LOOP set, if I only wanted to send the message to myself can I use 127.0.0.1 or must I use another address? Additionally, how do the ports get resolved?
If I am not seeing messages on the same node, would the first best guess be that IP_MULITCAST_LOOP is not set?
Let's take it step by step:
The sending port does not matter at all. So you can choose an arbitrary port for the sender, and use the specific port number for your service just for the receiver.
No, SO_REUSEADDR/PORT does not solve this problem. Even if you manage to achieve it: Do not start multiple listeners on the same port. This will cause strange effects. The main purpose of SO_REUSEADDR/PORT is to allow servers to create a TCP (not UDP) socket when the previous server process just died, without waiting for a timeout of the TCP state machine of the stale socket.
Corrects, assuming you mean multicast rather than broadcast,
Yes and no: If you only want to send messages to yourself you can send the packets to 127.0.0.1, and then you message will be a normal unicast packet and no longer a multicast packet, and IP_MULTICAST_LOOP does not matter at all. Multicast packets are normal UDP packets which have a destination address in the multicast address range (i.e. 224.0.0.0-239.255.255.255). The receiving socket cannot easily tell whether a packet was sent via unicast or multicast.
IP routing on the same host between interfaces is far from trivial. There are a lot of mechanisms and routing rules involved which are not shown in the normal routing table, which is just for outgoing traffic. It also depends on by which means you try to observe the messages. There is not a single point where you can see all messages going through a node (unfortunately). This is usually all attached to interfaces, and there also to an ingress and egress side, and the latter is usually not documented and not configurable. Monitoring local traffic can be tricky and may require virtual network interfaces. Really messy.
In summary: You are trying to send messages from one process to another process on the same host. Use unicast UDP for this and you are done. No multicast involved.

How does TCP identify the application level protocol?

IP protocol datagram header contains a Protocol field to define the protocol used in the data portion of the IP datagram.
How does a TCP packet identify the its application level protocols? I don't see similar fields in the TCP header format. So it all depends on the port number?
If so, does it mean I can silently switch the application protocol on the same port, just like what happens when WebSocket uses a handshake request in the format of HTTP to tell the server to switch from HTTP to WebSocket protocol?
TCP itself does not care about the application layer protocol used. The closest thing is the port number. Port numbers are used to distinguish different connections on the same host. When a packet is received, the operating system uses the port number to determine which program it belongs to. Although many protocols have standard port numbers, you are not required to use them.
So yes, you can switch protocols on the same port.

Where is the source and destination address fields in TCP header?

From what I've read, TCP sits on the layer between the application and IP, and handles setting up the packets, checking for errors, ordering etc so the application itself doesn't have to do it.
However, when I looked at the TCP header I became confused. From the way I understand it, some data is handed to TCP from the application, and is given a destination address to which to send the data. The TCP layer packages it up, and sends it on to the IP layer, who in turn hands it off, all the way on down to the physical layer.
But looking at the TCP header on Wikipedia, there is no mention of a destination address! There is only a destination port number which I am pretty sure is not an address.
So my question is, how does TCP get the addresses? And/or, how does IP get the address if TCP isn't passing them to it?
It's the Application that's running on top of Transport Layer that chooses everything.
If the Application is designed with reliability in mind, it chooses the connection oriented protocol like TCP.
The same applications tells TCP what the Source and Destination port should be, TCP alone cannot decide this.
Example: If you're accessing a website, your Application would be the browser, since accessing websites normally happens over HTTP/HTTPS and HTTP/HTTPS is designed to be reliable, it chooses TCP. Port 80(HTTP) or 443(HTTPS) are the standard ports used for accessing websites, so either of these ports are used in the Destination Port field while the Source Port can be any random higher number port.
This combination is used to identify something called Transport Layer VC(Virtual Circuit).
Coming to IP, the same application tells what the Destination IP address is, while the Source IP is the machine from where you are running the browser.
IP in Network Layer and TCP in Transport Layer cannot choose anything, it's the Application that tells them what to choose, considering they are the chosen ones.

How are different TCP connections in HTTP requests identified?

From what I understand, each HTTP request uses its own TCP connection (please correct me if i'm wrong). So, let's say that there are two current connections to the same server. For example, client side javascript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to each separate callback function.
Now here's the thing that I don't understand: The TCP packet includes source and destination ip and port, but won't both of these connections have the same src and dest ip addresses, and port 80? How can the packets be differentiated and routed to appropriately? Does it have anything to do with the packet sequence number which is different for each connection?
When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
They're differentiated by the source port.
The main reason for each HTTP request to not generate a separate TCP connection is called keepalives, incidentally.
A socket, in packet network communications, is considered to be the combination of 4 elements: server IP, server port, client IP, client port. The second one is usually fixed in a protocol, e.g. http usually listen in port 80, but the client port is a random number usually in the range 1024-65535. This is because the operating system could use those ports for known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). The same network device can not use the same client port to open two different connections even to different servers and if two different clients use the same port, the server can tell them apart by their IP addresses. If a port is being used in a system either to listen for connection or to establish a connection, it can not be used for anything else. That's how the operating system can dispatch packets to the correct process once received by the network card.

Resources