Related
Let's assume that I start a server at one of the computers in my private network (192.168.10.10:9900).
Now when making a request from some other computer in the same network, how does the client computer (OS?) knows which protocol to use / which protocol the server follows ? [TCP or UDP]
EDIT: As mentioned in the answers, I was basically looking for a default protocol which will be used by the client in the absence of any transport protocol information.
TCP / UDP protocols work at the transport layer level (TCP / IP MODEL) and its main difference is that TCP has a method to ensure the arrival of messages while UDP is lighter because of its virtue is to be faster in Information delivery. The use of one protocol or another is always defined by the application that will use it.
So the reference you put on the private server with ip: port 192.168.10.10:9900 is very vague to be more precise we could say that we have an Apache web server running on the ip: port 192.168.10.10:9900 (the port for default is 80 when installing the server, but it can be changed in the configuration).
Now the web servers (apache, IIS, etc.) work using the TCP protocol because when a client (computer, cell phone, etc.) consults a page through a browser (Chrome, Firefox, etc.), the ideal thing is that all the website and not just some pieces. This is why this type of servers chose and use this protocol in the first instance since they seek that in the end the result is that the user obtains the complete page regardless of whether a few more milliseconds are sacrificed in the validations involved in using TPC.
Now going to the client side. The user when visiting a web page from any browser (Chrome, Firefox, etc.) will use TCP since this protocol is already configured in the browser to send the query messages and subsequently receive the messages with the same form Website information.
Now this behavior is going to be repeated for any client / server application. For example, to change the type of application on the UDP side, we can observe the operation of DHCP services which are used to receive an IP when connecting any device to a Wi-Fi network. In this case, this service seeks to be as fast as possible (instead of the most reliable) since you want the device to connect as quickly as possible to the network, so use the UDP protocol and in this case any equipment when connecting To a WIFI network you will send your messages using this protocol.
Finally, if you want to know promptly about the type of TCP / UDP protocol used by a specific application, you can search on the Wireshark application which allows you to scan the messages that leave the device or show the protocol used in the different layers of the application.
There is no reason any client would make a request to your server, so why would it care what protocol it follows? Clients don't just randomly connect to things to see if there's a server there. So it doesn't make any difference to any client.
Normally, the client computer will use the TCP protocol as default. If you start the server using UDP protocol mode, then when you use curl -XGET 192.168.10.10:9900/test-page, it will give you back an curl: (7) Failed to connect to 192.168.10.10 port 9900: Connection refused error. You can try it, use the nc -lvp 9900 -u, it will give you that result.
The answers here are pointing to some default protocol. Its' not that, Whenever you start an application let say HTTP server, the server's internal has code to open a socket(which can be TCP or UDP), since HTTP:80 is a TCP protocol the code creates a TCP socket. Similarly for other network application it depends on their requirement what kind of transport layer protocol to use (TCP Or UDP). Like a DNS client will create a UDP socket to connect to DNS server, since DNS:53 is mostly over UDP. Both TCP and UDP have different use cases, advantages and disadvantages. Depending on there uses cases / advantages / disadvantages of UDP/TCP decision is taken to implement server using either of them.
This question already has answers here:
Does the port change when a server accepts a TCP connection?
(3 answers)
Closed 4 years ago.
I understand the basics of how ports work. However, what I don't get is how multiple clients can simultaneously connect to say port 80. I know each client has a unique (for their machine) port. Does the server reply back from an available port to the client, and simply state the reply came from 80? How does this work?
First off, a "port" is just a number. All a "connection to a port" really represents is a packet which has that number specified in its "destination port" header field.
Now, there are two answers to your question, one for stateful protocols and one for stateless protocols.
For a stateless protocol (ie UDP), there is no problem because "connections" don't exist - multiple people can send packets to the same port, and their packets will arrive in whatever sequence. Nobody is ever in the "connected" state.
For a stateful protocol (like TCP), a connection is identified by a 4-tuple consisting of source and destination ports and source and destination IP addresses. So, if two different machines connect to the same port on a third machine, there are two distinct connections because the source IPs differ. If the same machine (or two behind NAT or otherwise sharing the same IP address) connects twice to a single remote end, the connections are differentiated by source port (which is generally a random high-numbered port).
Simply, if I connect to the same web server twice from my client, the two connections will have different source ports from my perspective and destination ports from the web server's. So there is no ambiguity, even though both connections have the same source and destination IP addresses.
Ports are a way to multiplex IP addresses so that different applications can listen on the same IP address/protocol pair. Unless an application defines its own higher-level protocol, there is no way to multiplex a port. If two connections using the same protocol simultaneously have identical source and destination IPs and identical source and destination ports, they must be the same connection.
Important:
I'm sorry to say that the response from "Borealid" is imprecise and somewhat incorrect - firstly there is no relation to statefulness or statelessness to answer this question, and most importantly the definition of the tuple for a socket is incorrect.
First remember below two rules:
Primary key of a socket: A socket is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} not by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT} - Protocol is an important part of a socket's definition.
OS Process & Socket mapping: A process can be associated with (can open/can listen to) multiple sockets which might be obvious to many readers.
Example 1: Two clients connecting to same server port means: socket1 {SRC-A, 100, DEST-X,80, TCP} and socket2{SRC-B, 100, DEST-X,80, TCP}. This means host A connects to server X's port 80 and another host B also connects to the same server X to the same port 80. Now, how the server handles these two sockets depends on if the server is single-threaded or multiple-threaded (I'll explain this later). What is important is that one server can listen to multiple sockets simultaneously.
To answer the original question of the post:
Irrespective of stateful or stateless protocols, two clients can connect to the same server port because for each client we can assign a different socket (as the client IP will definitely differ). The same client can also have two sockets connecting to the same server port - since such sockets differ by SRC-PORT. With all fairness, "Borealid" essentially mentioned the same correct answer but the reference to state-less/full was kind of unnecessary/confusing.
To answer the second part of the question on how a server knows which socket to answer. First understand that for a single server process that is listening to the same port, there could be more than one socket (maybe from the same client or from different clients). Now as long as a server knows which request is associated with which socket, it can always respond to the appropriate client using the same socket. Thus a server never needs to open another port in its own node than the original one on which the client initially tried to connect. If any server allocates different server ports after a socket is bound, then in my opinion the server is wasting its resource and it must be needing the client to connect again to the new port assigned.
A bit more for completeness:
Example 2: It's a very interesting question: "can two different processes on a server listen to the same port". If you do not consider protocol as one of the parameters defining sockets then the answer is no. This is so because we can say that in such a case, a single client trying to connect to a server port will not have any mechanism to mention which of the two listening processes the client intends to connect to. This is the same theme asserted by rule (2). However, this is the WRONG answer because 'protocol' is also a part of the socket definition. Thus two processes in the same node can listen to the same port only if they are using different protocols. For example, two unrelated clients (say one is using TCP and another is using UDP) can connect and communicate to the same server node and to the same port but they must be served by two different server processes.
Server Types - single & multiple:
When a server processes listening to a port that means multiple sockets can simultaneously connect and communicate with the same server process. If a server uses only a single child process to serve all the sockets then the server is called single-process/threaded and if the server uses many sub-processes to serve each socket by one sub-process then the server is called a multi-process/threaded server. Note that irrespective of the server's type a server can/should always use the same initial socket to respond back (no need to allocate another server port).
Suggested Books and the rest of the two volumes if you can.
A Note on Parent/Child Process (in response to query/comment of 'Ioan Alexandru Cucu')
Wherever I mentioned any concept in relation to two processes say A and B, consider that they are not related by the parent-child relationship. OS's (especially UNIX) by design allows a child process to inherit all File-descriptors (FD) from parents. Thus all the sockets (in UNIX like OS are also part of FD) that process A listening to can be listened to by many more processes A1, A2, .. as long as they are related by parent-child relation to A. But an independent process B (i.e. having no parent-child relation to A) cannot listen to the same socket. In addition, also note that this rule of disallowing two independent processes to listen to the same socket lies on an OS (or its network libraries), and by far it's obeyed by most OS's. However, one can create own OS which can very well violate this restriction.
TCP / HTTP Listening On Ports: How Can Many Users Share the Same Port
So, what happens when a server listen for incoming connections on a TCP port? For example, let's say you have a web-server on port 80. Let's assume that your computer has the public IP address of 24.14.181.229 and the person that tries to connect to you has IP address 10.1.2.3. This person can connect to you by opening a TCP socket to 24.14.181.229:80. Simple enough.
Intuitively (and wrongly), most people assume that it looks something like this:
Local Computer | Remote Computer
--------------------------------
<local_ip>:80 | <foreign_ip>:80
^^ not actually what happens, but this is the conceptual model a lot of people have in mind.
This is intuitive, because from the standpoint of the client, he has an IP address, and connects to a server at IP:PORT. Since the client connects to port 80, then his port must be 80 too? This is a sensible thing to think, but actually not what happens. If that were to be correct, we could only serve one user per foreign IP address. Once a remote computer connects, then he would hog the port 80 to port 80 connection, and no one else could connect.
Three things must be understood:
1.) On a server, a process is listening on a port. Once it gets a connection, it hands it off to another thread. The communication never hogs the listening port.
2.) Connections are uniquely identified by the OS by the following 5-tuple: (local-IP, local-port, remote-IP, remote-port, protocol). If any element in the tuple is different, then this is a completely independent connection.
3.) When a client connects to a server, it picks a random, unused high-order source port. This way, a single client can have up to ~64k connections to the server for the same destination port.
So, this is really what gets created when a client connects to a server:
Local Computer | Remote Computer | Role
-----------------------------------------------------------
0.0.0.0:80 | <none> | LISTENING
127.0.0.1:80 | 10.1.2.3:<random_port> | ESTABLISHED
Looking at What Actually Happens
First, let's use netstat to see what is happening on this computer. We will use port 500 instead of 80 (because a whole bunch of stuff is happening on port 80 as it is a common port, but functionally it does not make a difference).
netstat -atnp | grep -i ":500 "
As expected, the output is blank. Now let's start a web server:
sudo python3 -m http.server 500
Now, here is the output of running netstat again:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
So now there is one process that is actively listening (State: LISTEN) on port 500. The local address is 0.0.0.0, which is code for "listening for all". An easy mistake to make is to listen on address 127.0.0.1, which will only accept connections from the current computer. So this is not a connection, this just means that a process requested to bind() to port IP, and that process is responsible for handling all connections to that port. This hints to the limitation that there can only be one process per computer listening on a port (there are ways to get around that using multiplexing, but this is a much more complicated topic). If a web-server is listening on port 80, it cannot share that port with other web-servers.
So now, let's connect a user to our machine:
quicknet -m tcp -t localhost:500 -p Test payload.
This is a simple script (https://github.com/grokit/dcore/tree/master/apps/quicknet) that opens a TCP socket, sends the payload ("Test payload." in this case), waits a few seconds and disconnects. Doing netstat again while this is happening displays the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:54240 ESTABLISHED -
If you connect with another client and do netstat again, you will see the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:26813 ESTABLISHED -
... that is, the client used another random port for the connection. So there is never confusion between the IP addresses.
Normally, for every connecting client the server forks a child process that communicates with the client (TCP). The parent server hands off to the child process an established socket that communicates back to the client.
When you send the data to a socket from your child server, the TCP stack in the OS creates a packet going back to the client and sets the "from port" to 80.
Multiple clients can connect to the same port (say 80) on the server because on the server side, after creating a socket and binding (setting local IP and port) listen is called on the socket which tells the OS to accept incoming connections.
When a client tries to connect to server on port 80, the accept call is invoked on the server socket. This creates a new socket for the client trying to connect and similarly new sockets will be created for subsequent clients using same port 80.
Words in italics are system calls.
Ref
http://www.scs.stanford.edu/07wi-cs244b/refs/net2.pdf
If I understand right, applications sometimes use HTTP to send messages, since using other ports is liable to cause firewall problems. But how does that work without conflicting with other applications such as web-browsers? In fact how do multiple browsers running at once not conflict? Do they all monitor the port and get notified... can you share a port in this way?
I have a feeling this is a dumb question, but not something I ever thought of before, and in other cases I've seen problems when 2 apps are configured to use the same port.
There are 2 ports: a source port (browser) and a destination port (server). The browser asks the OS for an available source port (let's say it receives 33123) then makes a socket connection to the destination port (usually 80/HTTP, 443/HTTPS).
When the web server receives the answer, it sends a response that has 80 as source port and 33123 as destination port.
So if you have 2 browsers concurrently accessing stackoverflow.com, you'd have something like this:
Firefox (localhost:33123) <-----------> stackoverflow.com (69.59.196.211:80)
Chrome (localhost:33124) <-----------> stackoverflow.com (69.59.196.211:80)
Outgoing HTTP requests don't happen on port 80. When an application requests a socket, it usually receives one at random. This is the Source port.
Port 80 is for serving HTTP content (by the server, not the client). This is the Destination port.
Each browser uses a different Source to generate requests. That way, the packets make it back to the correct application.
It is the 5-tuple of (IP protocol, local IP address, local port, remote IP address, remote port) that identifies a connection. Multiple browsers (or in fact a single browser loading multiple pages simultaneously) will each use destination port 80, but the local port (which is allocated by the O/S) is distinct in each case. Therefore there is no conflict.
Clients usually pick a port between 1024 and 65535.
It depends on the operating system how to handle this. I think Windows Clients increment the value for each new connection, Unix Clients pick a random port no.
Some services rely on a static client port like NTP (123 UDP)
A browser is a client application that you use in order to see content on a web server which is usually on a different machine.
The web server is the one listening on port 80, not the browser on the client.
You need to be careful in making the distinction between "listening on port 80" and "connecting to port 80".
When you say "applications sometimes use HTTP to send messages, since using other ports is liable to cause firewall problems", you actually mean "applications sometimes send messages to port 80".
The server is listening on port 80, and can accept multiple connections on that port.
Port 80 you're talking about here is the remote port on the server, locally browser opens high port for each connection established.
Each connection has port numbers on both ends, one is called local port, other remote port.
Firewall will allow traffic to high port for browser, because it knows that connection has been established from you computer.
Can someone tell me what is the difference between active and passive FTP?
Which one is preferable?
Active and passive are the two modes that FTP can run in.
For background, FTP actually uses two channels between client and server, the command and data channels, which are actually separate TCP connections.
The command channel is for commands and responses while the data channel is for actually transferring files.
This separation of command information and data into separate channels a nifty way of being able to send commands to the server without having to wait for the current data transfer to finish. As per the RFC, this is only mandated for a subset of commands, such as quitting, aborting the current transfer, and getting the status.
In active mode, the client establishes the command channel but the server is responsible for establishing the data channel. This can actually be a problem if, for example, the client machine is protected by firewalls and will not allow unauthorised session requests from external parties.
In passive mode, the client establishes both channels. We already know it establishes the command channel in active mode and it does the same here.
However, it then requests the server (on the command channel) to start listening on a port (at the servers discretion) rather than trying to establish a connection back to the client.
As part of this, the server also returns to the client the port number it has selected to listen on, so that the client knows how to connect to it.
Once the client knows that, it can then successfully create the data channel and continue.
More details are available in the RFC: https://www.ietf.org/rfc/rfc959.txt
I recently run into this question in my work place so I think I should say something more here. I will use image to explain how the FTP works as an additional source for previous answer.
Active mode:
Passive mode:
In an active mode configuration, the server will attempt to connect to a random client-side port. So chances are, that port wouldn't be one of those predefined ports. As a result, an attempt to connect to it will be blocked by the firewall and no connection will be established.
A passive configuration will not have this problem since the client will be the one initiating the connection. Of course, it's possible for the server side to have a firewall too. However, since the server is expected to receive a greater number of connection requests compared to a client, then it would be but logical for the server admin to adapt to the situation and open up a selection of ports to satisfy passive mode configurations.
So it would be best for you to configure server to support passive mode FTP. However, passive mode would make your system vulnerable to attacks because clients are supposed to connect to random server ports. Thus, to support this mode, not only should your server have to have multiple ports available, your firewall should also allow connections to all those ports to pass through!
To mitigate the risks, a good solution would be to specify a range of ports on your server and then to allow only that range of ports on your firewall.
For more information, please read the official document.
Redacted version of my article FTP Connection Modes (Active vs. Passive):
FTP connection mode (active or passive), determines how a data connection is established. In both cases, a client creates a TCP control connection to an FTP server command port 21. This is a standard outgoing connection, as with any other file transfer protocol (SFTP, SCP, WebDAV) or any other TCP client application (e.g. web browser). So, usually there are no problems when opening the control connection.
Where FTP protocol is more complicated comparing to the other file transfer protocols are file transfers. While the other protocols use the same connection for both session control and file (data) transfers, the FTP protocol uses a separate connection for the file transfers and directory listings.
In the active mode, the client starts listening on a random port for incoming data connections from the server (the client sends the FTP command PORT to inform the server on which port it is listening). Nowadays, it is typical that the client is behind a firewall (e.g. built-in Windows firewall) or NAT router (e.g. ADSL modem), unable to accept incoming TCP connections.
For this reason the passive mode was introduced and is mostly used nowadays. Using the passive mode is preferable because most of the complex configuration is done only once on the server side, by experienced administrator, rather than individually on a client side, by (possibly) inexperienced users.
In the passive mode, the client uses the control connection to send a PASV command to the server and then receives a server IP address and server port number from the server, which the client then uses to open a data connection to the server IP address and server port number received.
Network Configuration for Passive Mode
With the passive mode, most of the configuration burden is on the server side. The server administrator should setup the server as described below.
The firewall and NAT on the FTP server side have to be configured not only to allow/route the incoming connections on FTP port 21 but also a range of ports for the incoming data connections. Typically, the FTP server software has a configuration option to setup a range of the ports, the server will use. And the same range has to be opened/routed on the firewall/NAT.
When the FTP server is behind a NAT, it needs to know it's external IP address, so it can provide it to the client in a response to PASV command.
Network Configuration for Active Mode
With the active mode, most of the configuration burden is on the client side.
The firewall (e.g. Windows firewall) and NAT (e.g. ADSL modem routing rules) on the client side have to be configured to allow/route a range of ports for the incoming data connections. To open the ports in Windows, go to Control Panel > System and Security > Windows Firewall > Advanced Settings > Inbound Rules > New Rule. For routing the ports on the NAT (if any), refer to its documentation.
When there's NAT in your network, the FTP client needs to know its external IP address that the WinSCP needs to provide to the FTP server using PORT command. So that the server can correctly connect back to the client to open the data connection. Some FTP clients are capable of autodetecting the external IP address, some have to be manually configured.
Smart Firewalls/NATs
Some firewalls/NATs try to automatically open/close data ports by inspecting FTP control connection and/or translate the data connection IP addresses in control connection traffic.
With such a firewall/NAT, the above configuration is not necessary for a plain unencrypted FTP. But this cannot work with FTPS, as the control connection traffic is encrypted and the firewall/NAT cannot inspect nor modify it.
Active mode:
-server initiates the connection.
Passive mode:
-client initiates the connection.
Active Mode—The client issues a PORT command to the server signaling that it will “actively” provide an IP and port number to open the Data Connection back to the client.
Passive Mode—The client issues a PASV command to indicate that it will wait “passively” for the server to supply an IP and port number, after which the client will create a Data Connection to the server.
There are lots of good answers above, but this blog post includes some helpful graphics and gives a pretty solid explanation: https://titanftp.com/2018/08/23/what-is-the-difference-between-active-and-passive-ftp/
From what I understand, each HTTP request uses its own TCP connection (please correct me if i'm wrong). So, let's say that there are two current connections to the same server. For example, client side javascript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to each separate callback function.
Now here's the thing that I don't understand: The TCP packet includes source and destination ip and port, but won't both of these connections have the same src and dest ip addresses, and port 80? How can the packets be differentiated and routed to appropriately? Does it have anything to do with the packet sequence number which is different for each connection?
When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
They're differentiated by the source port.
The main reason for each HTTP request to not generate a separate TCP connection is called keepalives, incidentally.
A socket, in packet network communications, is considered to be the combination of 4 elements: server IP, server port, client IP, client port. The second one is usually fixed in a protocol, e.g. http usually listen in port 80, but the client port is a random number usually in the range 1024-65535. This is because the operating system could use those ports for known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). The same network device can not use the same client port to open two different connections even to different servers and if two different clients use the same port, the server can tell them apart by their IP addresses. If a port is being used in a system either to listen for connection or to establish a connection, it can not be used for anything else. That's how the operating system can dispatch packets to the correct process once received by the network card.