As far as I know, a (single generic) webserver uses ports (like any other TCP/UDP application) to identify users/processes. A port is a 16-bit unsigned integer, so it ranges from 0 to 65535. How does the server act when it reaches that limit?
High level sample
server1 answers on port 8080. client1 connects to server1 (now they are connected by random but unique ports: server1:5123 <--> client1:6123)
Another client, client2, connects to server1 (server1:5124 <--> client2:7123)
So the question is: is the server limited to 65535 connections (in practice fewer than that) for a given instance?
In the simplest case a webserver consumes only one TCP port (conventionally port 80) on the server system. All of the connections to the webserver are handled through that single port. The other 65534 ports remain available for other uses.
This works because a TCP connection is identified not just by the port number on the server, but by the combination of (server IP, server TCP port, client IP, client TCP port). So the server can have a huge number of concurrent TCP connections all on its port 80, using the other three items to identify which connection the traffic belongs to. If the server only has a single IP address, and therefore the (server IP, server port) portions are identical on all connections to the webserver, the individual connections are still distinguishable by the (client IP, client port) portions of the combination.
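To make the tuple idea concrete, here is a minimal Python sketch (not taken from the answer itself) in which one listening port accepts several loopback connections. The function name accept_n_clients and the use of 127.0.0.1 with an OS-chosen port are assumptions made for illustration only.

```python
import socket

def accept_n_clients(n=3):
    """Accept n loopback connections on ONE listening port.

    Returns (listening_port, [(local_addr, peer_addr), ...]) as the server sees them.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0 asks the OS for any free port
    listener.listen(n)
    server_port = listener.getsockname()[1]

    clients, accepted, seen = [], [], []
    try:
        for _ in range(n):
            clients.append(socket.create_connection(("127.0.0.1", server_port)))
            conn, peer = listener.accept()   # one new socket per connection
            accepted.append(conn)
            seen.append((conn.getsockname(), peer))
    finally:
        for s in clients + accepted + [listener]:
            s.close()
    return server_port, seen

if __name__ == "__main__":
    port, seen = accept_n_clients()
    for local, peer in seen:
        # The local port is the same every time; only the peer port varies.
        print(f"server {local[0]}:{local[1]} <--> client {peer[0]}:{peer[1]}")
```

Every accepted connection reports the same server port, so the server never consumes a second port of its own, however many clients connect.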
If you run the netstat -a command on a busy Unix webserver you'll see this in action. That command will show a bunch of connections on the server's port 80, but all with different client IPs and/or ports. It'll also show that the system is still listening for new connections on port 80, at the same time as it is handling all of the existing connections on that port.
The total number of connections to the webserver might be limited by some other constraint (perhaps memory usage, perhaps some arbitrary limit in the webserver itself or in the OS kernel) or by some external constraint (perhaps connection table size in external firewalls or gateways) but it's not limited by the 16-bit TCP port range.
Also note that TCP ports are completely separate from UDP ports, so using TCP port 80 for a webserver does not prevent UDP port 80 from being used for some other purpose. And vice versa.
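As a rough illustration of that separation, the sketch below binds a TCP socket and a UDP socket to the same port number at the same time. The helper name bind_tcp_and_udp_same_port and the retry loop are assumptions for this example; the retry only guards against some unrelated program already holding the chosen UDP port.

```python
import socket

def bind_tcp_and_udp_same_port(attempts=10):
    """Return (tcp_socket, udp_socket) bound to the same port number on loopback."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        tcp.bind(("127.0.0.1", 0))          # let the OS pick a free TCP port
        port = tcp.getsockname()[1]
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            udp.bind(("127.0.0.1", port))   # same number, different protocol
        except OSError:
            # That UDP port happens to be taken by something else; try another.
            tcp.close()
            udp.close()
            continue
        return tcp, udp
    raise RuntimeError("could not find a port free for both TCP and UDP")

if __name__ == "__main__":
    tcp, udp = bind_tcp_and_udp_same_port()
    print("TCP bound to", tcp.getsockname(), "and UDP bound to", udp.getsockname())
    tcp.close()
    udp.close()
```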
I know that multiple TCP clients can connect to the same remote endpoint (e.g. my server runs on 127.0.0.1:8080).
I know that multiple TCP clients can connect from the same IP address. But when I test this (in my case using .NET's TcpClient class), they seem to be automatically given unique ports. Is that enforced?
Can my TCP listener/server have multiple concurrent connections from the same IP/port combination? If not, how can I uniquely distinguish my connections?
One edge case I considered is if two clients have the same IPv4 address... or on any given network is IPv4 uniqueness also enforced?
When speaking about TCP, a connection between two entities is identified by the quadruple
<clientIP, clientPort, serverIP, serverPort>
This is why the same service (running on serverIP, serverPort) can accept
multiple connections from different clients (entities with different IPs)
multiple connections from the same client (same clientIP BUT different clientPort)
I'm not sure about .NET's implementation, although I think it would work anyway, but when establishing a connection on the client side you can also specify the port to bind your client-side socket to; this way, on the server side, you could identify incoming connections from the same host by their clientPort value.
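The question is about .NET, but the same idea can be sketched in Python, used here purely as a stand-in. The helper name connect_from_port is hypothetical; socket.create_connection and its source_address parameter are part of the standard library.

```python
import socket

def connect_from_port(server_addr, source_port):
    """Connect to server_addr using a caller-chosen client (source) port.

    Passing source_port=0 would let the OS choose, which is the usual case.
    """
    return socket.create_connection(
        server_addr,
        timeout=2,
        source_address=("127.0.0.1", source_port),  # bind before connecting
    )
```

On the server side, the peer address returned by accept() would then carry that chosen port. Fixing the client port is rarely necessary, and it fails if the port is already in use.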
Given the TCP/IP stack definition, no, you cannot enforce which port a client listens on from the server side. When a client hits a server it sends a port number that it expects the server to respond to (the client will listen on this port) called a Source Port. The server responds to that unique port for that particular client.
https://packetlife.net/blog/2010/jun/7/understanding-tcp-sequence-acknowledgment-numbers/
Looking at this capture you can see the client sent a packet with the request to (destination) port 80 on the server from (source) port 54841. This 54841 should be unique to that client from the server's perspective, especially when combined with the client IP address.
In more detail, the client sends a packet like this:
IP: Src: 192.168.1.23 (client), Dst: 192.168.1.11 (server)
TCP: Src Port: 54841, Dst Port: 80
The server responds with a packet that looks like this:
IP: Src: 192.168.1.11 (server), Dst: 192.168.1.23 (client)
TCP: Src Port: 80, Dst Port: 54841
Hopefully you can see that the server is able to uniquely track each request by the client IP/Port combination. A new request from the same client would have the same IP but a different Port.
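The port swap in the two packets above can be sketched with Python's struct module. A TCP header starts with the 16-bit source port followed by the 16-bit destination port; the helper names tcp_ports and reply_ports are made up for this example, and only those first four bytes are modelled.

```python
import struct

def tcp_ports(segment: bytes):
    """Return (source_port, destination_port) from the start of a TCP header."""
    return struct.unpack("!HH", segment[:4])   # two big-endian 16-bit integers

def reply_ports(source_port, destination_port):
    """A reply simply swaps the two ports of the packet it answers."""
    return destination_port, source_port

if __name__ == "__main__":
    request = struct.pack("!HH", 54841, 80)    # client -> server
    src, dst = tcp_ports(request)
    print("request:", src, "->", dst)
    print("reply:  ", *reply_ports(src, dst))
```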
IPv4 address conflicts are a problem for most networks. Most networks will not tolerate an IP address conflict. Uniqueness is usually enforced by a competent network administrator or by a DHCP server handing out the IP addresses. Someone can manually force an IP address on a client, and this will cause problems with the other host that has the same IP address. But this seems to be a different question from your original.
According to ISO_13400_2_June_2012
TCP uses a pair of port numbers (one sending, called remote port, and
one receiving, called local port) to identify a connection. The
sending port on one host will be the receiving port on the other and
vice versa. The ports listed in Table 6 are the receiving ports on the
DoIP entities that shall be used for TCP connections between external
test equipment and DoIP entities.
My question is: does that mean the tester should also use 13400 as its port number, or can it use any other port?
The passive, listening socket that is accepting connections (the "server" socket) is listening on a specified and well-known port number.
What port number the connecting application (the "client") uses is irrelevant, and for most applications will be assigned automatically by the operating system.
I know that port numbers are used for identifying different processes running on a server, so that multiple processes can use the same networking resources. But how does it work internally?
For example, if a request to a website http://www.my-awesome-website.com:80 reaches a server, how does the server know that there is a web server running on port 80? I mean, what does the request pipeline look like between getting the request to finding out that a web server is running on port 80 and forwarding the request to the web server?
Port numbers are merely addresses for some transport-layer protocols, such as TCP and UDP, in the same way that IP addresses are for layer-3 protocols, and MAC addresses are for layer-2 protocols. Not all transport-layer protocols use ports, and each transport-layer protocol independently maintains its ports so that TCP port 80 is not the same as UDP port 80, and each can be used simultaneously by different applications.
Layer-2 addresses are only relevant to the LAN links, layer-3 addresses are only relevant host-to-host over the layer-3 network, and layer-4 addresses are relevant application-to-application.
IANA registers ports and maintains the official registry list at Service Name and Transport Protocol Port Number Registry.
From RFC 793, TRANSMISSION CONTROL PROTOCOL:
Multiplexing:
To allow for many processes within a single Host to use TCP
communication facilities simultaneously, the TCP provides a set of
addresses or ports within each host. Concatenated with the network
and host addresses from the internet communication layer, this forms
a socket. A pair of sockets uniquely identifies each connection.
That is, a socket may be simultaneously used in multiple
connections.
The binding of ports to processes is handled independently by each
Host. However, it proves useful to attach frequently used processes
(e.g., a "logger" or timesharing service) to fixed sockets which are
made known to the public. These services can then be accessed
through the known addresses. Establishing and learning the port
addresses of other processes may involve more dynamic mechanisms.
Connections:
The reliability and flow control mechanisms described above require
that TCPs initialize and maintain certain status information for
each data stream. The combination of this information, including
sockets, sequence numbers, and window sizes, is called a connection.
Each connection is uniquely specified by a pair of sockets
identifying its two sides.
When two processes wish to communicate, their TCP's must first
establish a connection (initialize the status information on each
side). When their communication is complete, the connection is
terminated or closed to free the resources for other uses.
Since connections must be established between unreliable hosts and
over the unreliable internet communication system, a handshake
mechanism with clock-based sequence numbers is used to avoid
erroneous initialization of connections.
After opening a socket (which is like an open file but used for network communications), the user of the socket may use it directly with an ephemeral port (selected by the OS), which is typical if the application is a client application.
What server processes do is to call the bind() socket API call to set a port for the socket, and then call listen() in case of a TCP socket to start listening for incoming connection requests.
Because of the bind() call the OS will know that this particular socket is the one receiving the data sent to the particular port number.
The packets sent over the network contain the source and destination IP addresses as well as the source and destination ports:
http://www.techrepublic.com/article/exploring-the-anatomy-of-a-data-packet/
So the OS has a data structure with open sockets listed by their port numbers and it will pass the received data to the correct socket's input buffer. Sent data will be marked by the port number of the sending socket.
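The lookup described above can be modelled with a toy table. This is only a sketch of the idea, not how any particular kernel is implemented; the class ToyDemux and its method names are invented for the example.

```python
class ToyDemux:
    """Toy model of how incoming TCP segments are matched to sockets."""

    def __init__(self):
        self.established = {}   # (local_ip, local_port, remote_ip, remote_port) -> owner
        self.listeners = {}     # (local_ip, local_port) -> owner

    def listen(self, local_ip, local_port, owner):
        self.listeners[(local_ip, local_port)] = owner

    def establish(self, local_ip, local_port, remote_ip, remote_port, owner):
        self.established[(local_ip, local_port, remote_ip, remote_port)] = owner

    def deliver(self, dst_ip, dst_port, src_ip, src_port):
        """Return the owner that should receive a segment, or None."""
        key = (dst_ip, dst_port, src_ip, src_port)
        if key in self.established:          # an existing connection wins
            return self.established[key]
        # Otherwise fall back to a listener on that address or on the wildcard.
        for listener_key in ((dst_ip, dst_port), ("0.0.0.0", dst_port)):
            if listener_key in self.listeners:
                return self.listeners[listener_key]
        return None                          # nothing bound to that port

if __name__ == "__main__":
    table = ToyDemux()
    table.listen("0.0.0.0", 80, "web server (listening)")
    table.establish("192.168.1.11", 80, "192.168.1.23", 54841, "connection A")
    print(table.deliver("192.168.1.11", 80, "192.168.1.23", 54841))
    print(table.deliver("192.168.1.11", 80, "192.168.1.99", 40000))
```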
I understand the basics of how ports work. However, what I don't get is how multiple clients can simultaneously connect to say port 80. I know each client has a unique (for their machine) port. Does the server reply back from an available port to the client, and simply state the reply came from 80? How does this work?
First off, a "port" is just a number. All a "connection to a port" really represents is a packet which has that number specified in its "destination port" header field.
Now, there are two answers to your question, one for stateful protocols and one for stateless protocols.
For a stateless protocol (ie UDP), there is no problem because "connections" don't exist - multiple people can send packets to the same port, and their packets will arrive in whatever sequence. Nobody is ever in the "connected" state.
For a stateful protocol (like TCP), a connection is identified by a 4-tuple consisting of source and destination ports and source and destination IP addresses. So, if two different machines connect to the same port on a third machine, there are two distinct connections because the source IPs differ. If the same machine (or two behind NAT or otherwise sharing the same IP address) connects twice to a single remote end, the connections are differentiated by source port (which is generally a random high-numbered port).
Simply, if I connect to the same web server twice from my client, the two connections will have different source ports from my perspective and destination ports from the web server's. So there is no ambiguity, even though both connections have the same source and destination IP addresses.
Ports are a way to multiplex IP addresses so that different applications can listen on the same IP address/protocol pair. Unless an application defines its own higher-level protocol, there is no way to multiplex a port. If two connections using the same protocol simultaneously have identical source and destination IPs and identical source and destination ports, they must be the same connection.
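For the UDP case mentioned above, a small sketch shows several senders hitting one port with no connection state at all: the receiver just reads datagrams and is told where each one came from. The function name udp_senders_seen and the loopback setup are assumptions for illustration.

```python
import socket

def udp_senders_seen(n=2):
    """Send one datagram from each of n client sockets to a single UDP port.

    Returns (server_addr, [(payload, sender_addr), ...]) as seen by the receiver.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)                  # avoid hanging if a datagram is lost
    server_addr = server.getsockname()
    clients = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(n)]
    seen = []
    try:
        for i, client in enumerate(clients):
            client.sendto(f"hello {i}".encode(), server_addr)
        for _ in range(n):
            # recvfrom reports the sender; there is no accept() step in UDP.
            seen.append(server.recvfrom(1024))
    finally:
        for s in clients + [server]:
            s.close()
    return server_addr, seen

if __name__ == "__main__":
    addr, seen = udp_senders_seen()
    for payload, sender in seen:
        print(f"{sender} -> {addr}: {payload!r}")
```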
Important:
I'm sorry to say that the response from "Borealid" is imprecise and somewhat incorrect - firstly there is no relation to statefulness or statelessness to answer this question, and most importantly the definition of the tuple for a socket is incorrect.
First remember below two rules:
Primary key of a socket: A socket is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} not by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT} - Protocol is an important part of a socket's definition.
OS Process & Socket mapping: A process can be associated with (can open/can listen to) multiple sockets which might be obvious to many readers.
Example 1: Two clients connecting to same server port means: socket1 {SRC-A, 100, DEST-X,80, TCP} and socket2{SRC-B, 100, DEST-X,80, TCP}. This means host A connects to server X's port 80 and another host B also connects to the same server X to the same port 80. Now, how the server handles these two sockets depends on if the server is single-threaded or multiple-threaded (I'll explain this later). What is important is that one server can listen to multiple sockets simultaneously.
To answer the original question of the post:
Irrespective of stateful or stateless protocols, two clients can connect to the same server port because for each client we can assign a different socket (as the client IP will definitely differ). The same client can also have two sockets connecting to the same server port - since such sockets differ by SRC-PORT. With all fairness, "Borealid" essentially mentioned the same correct answer but the reference to state-less/full was kind of unnecessary/confusing.
To answer the second part of the question on how a server knows which socket to answer. First understand that for a single server process that is listening to the same port, there could be more than one socket (maybe from the same client or from different clients). Now as long as a server knows which request is associated with which socket, it can always respond to the appropriate client using the same socket. Thus a server never needs to open another port in its own node than the original one on which the client initially tried to connect. If any server allocates different server ports after a socket is bound, then in my opinion the server is wasting its resource and it must be needing the client to connect again to the new port assigned.
A bit more for completeness:
Example 2: It's a very interesting question: "can two different processes on a server listen to the same port". If you do not consider protocol as one of the parameters defining sockets then the answer is no. This is so because we can say that in such a case, a single client trying to connect to a server port will not have any mechanism to mention which of the two listening processes the client intends to connect to. This is the same theme asserted by rule (2). However, this is the WRONG answer because 'protocol' is also a part of the socket definition. Thus two processes in the same node can listen to the same port only if they are using different protocols. For example, two unrelated clients (say one is using TCP and another is using UDP) can connect and communicate to the same server node and to the same port but they must be served by two different server processes.
Server Types - single & multiple:
When a server process is listening on a port, multiple sockets can simultaneously connect to and communicate with that same server process. If a server uses only a single child process to serve all the sockets, it is called a single-process/single-threaded server; if it uses many sub-processes, each serving one socket, it is called a multi-process/multi-threaded server. Note that irrespective of the server's type, a server can and should always use the same initial socket to respond (there is no need to allocate another server port).
Suggested Books and the rest of the two volumes if you can.
A Note on Parent/Child Process (in response to query/comment of 'Ioan Alexandru Cucu')
Wherever I mentioned a concept in relation to two processes, say A and B, assume that they are not related as parent and child. Operating systems (especially UNIX) by design allow a child process to inherit all file descriptors (FDs) from its parent. Thus all the sockets (which in UNIX-like OSes are also FDs) that process A is listening on can also be listened on by processes A1, A2, ... as long as they are children of A. But an independent process B (i.e. one with no parent-child relation to A) cannot listen on the same socket. Also note that this rule, which disallows two independent processes from listening on the same socket, is imposed by the OS (or its network libraries), and most OSes obey it. However, one could write an OS that violates this restriction.
TCP / HTTP Listening On Ports: How Can Many Users Share the Same Port
So, what happens when a server listens for incoming connections on a TCP port? For example, let's say you have a web-server on port 80. Let's assume that your computer has the public IP address of 24.14.181.229 and the person that tries to connect to you has IP address 10.1.2.3. This person can connect to you by opening a TCP socket to 24.14.181.229:80. Simple enough.
Intuitively (and wrongly), most people assume that it looks something like this:
Local Computer | Remote Computer
--------------------------------
<local_ip>:80 | <foreign_ip>:80
^^ not actually what happens, but this is the conceptual model a lot of people have in mind.
This is intuitive, because from the standpoint of the client, he has an IP address, and connects to a server at IP:PORT. Since the client connects to port 80, then his port must be 80 too? This is a sensible thing to think, but actually not what happens. If that were to be correct, we could only serve one user per foreign IP address. Once a remote computer connects, then he would hog the port 80 to port 80 connection, and no one else could connect.
Three things must be understood:
1.) On a server, a process is listening on a port. Once it gets a connection, it hands it off to another thread. The communication never hogs the listening port.
2.) Connections are uniquely identified by the OS by the following 5-tuple: (local-IP, local-port, remote-IP, remote-port, protocol). If any element in the tuple is different, then this is a completely independent connection.
3.) When a client connects to a server, it picks a random, unused high-order source port. This way, a single client can have up to ~64k connections to the server for the same destination port.
So, this is really what gets created when a client connects to a server:
Local Computer | Remote Computer | Role
-----------------------------------------------------------
0.0.0.0:80 | <none> | LISTENING
24.14.181.229:80 | 10.1.2.3:<random_port> | ESTABLISHED
Looking at What Actually Happens
First, let's use netstat to see what is happening on this computer. We will use port 500 instead of 80 (because a whole bunch of stuff is happening on port 80 as it is a common port, but functionally it does not make a difference).
netstat -atnp | grep -i ":500 "
As expected, the output is blank. Now let's start a web server:
sudo python3 -m http.server 500
Now, here is the output of running netstat again:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
So now there is one process that is actively listening (State: LISTEN) on port 500. The local address is 0.0.0.0, which is code for "listening on all local addresses". An easy mistake to make is to listen on address 127.0.0.1, which will only accept connections from the current computer. So this is not a connection; it just means that a process requested to bind() to that IP address and port, and that process is responsible for handling all connections to that port. This hints at the limitation that there can only be one process per computer listening on a port (there are ways to get around that using multiplexing, but this is a much more complicated topic). If a web-server is listening on port 80, it cannot share that port with other web-servers.
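The one-listener-per-port limitation is easy to observe. In the sketch below, a second, independent socket tries to bind to an address that is already being listened on and gets an "address already in use" error. The helper name second_listener_error is hypothetical, and the sketch deliberately sets no socket options (options such as SO_REUSEPORT on some systems change this behaviour).

```python
import errno
import socket

def second_listener_error(port_holder):
    """Try to bind a second TCP socket to the address port_holder listens on.

    Returns the errno of the failure, or None if the bind unexpectedly worked.
    """
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second.bind(port_holder.getsockname())
        return None
    except OSError as exc:
        return exc.errno
    finally:
        second.close()

if __name__ == "__main__":
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first.bind(("127.0.0.1", 0))
    first.listen(1)
    code = second_listener_error(first)
    print("second bind failed with:", errno.errorcode.get(code, code))
    first.close()
```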
So now, let's connect a user to our machine:
quicknet -m tcp -t localhost:500 -p Test payload.
This is a simple script (https://github.com/grokit/dcore/tree/master/apps/quicknet) that opens a TCP socket, sends the payload ("Test payload." in this case), waits a few seconds and disconnects. Doing netstat again while this is happening displays the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:54240 ESTABLISHED -
If you connect with another client and do netstat again, you will see the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:26813 ESTABLISHED -
... that is, the client used another random port for the connection. So there is never any confusion between the connections.
Normally, for every connecting client the server forks a child process that communicates with the client (TCP). The parent server hands off to the child process an established socket that communicates back to the client.
When you send the data to a socket from your child server, the TCP stack in the OS creates a packet going back to the client and sets the "from port" to 80.
Multiple clients can connect to the same port (say 80) on the server because, on the server side, after creating a socket and calling bind() (which sets the local IP and port), listen() is called on the socket, which tells the OS to accept incoming connections.
When a client tries to connect to the server on port 80, the server's accept() call on the listening socket returns a new socket for that client, and new sockets are created in the same way for subsequent clients using the same port 80.
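The hand-off after accept() can be sketched with Python's standard socketserver module. This example uses one thread per connection rather than a forked child process, which is a substitution made here for brevity; the names EchoHandler and start_echo_server are invented for the example.

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # self.request is the per-client socket that accept() returned.
        data = self.request.recv(1024)
        self.request.sendall(data)

def start_echo_server():
    """Start a threaded echo server on an OS-chosen loopback port."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
    server.daemon_threads = True   # handler threads will not block interpreter exit
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

if __name__ == "__main__":
    server = start_echo_server()
    port = server.server_address[1]
    # Two clients are connected at the same time to the same server port.
    a = socket.create_connection(("127.0.0.1", port), timeout=2)
    b = socket.create_connection(("127.0.0.1", port), timeout=2)
    b.sendall(b"two")
    a.sendall(b"one")
    print(a.recv(1024), b.recv(1024))
    a.close()
    b.close()
    server.shutdown()
    server.server_close()
```

The listening socket stays free to accept further clients while each handler works on its own connected socket.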
Ref
http://www.scs.stanford.edu/07wi-cs244b/refs/net2.pdf
If I understand right, applications sometimes use HTTP to send messages, since using other ports is liable to cause firewall problems. But how does that work without conflicting with other applications such as web-browsers? In fact how do multiple browsers running at once not conflict? Do they all monitor the port and get notified... can you share a port in this way?
I have a feeling this is a dumb question, but not something I ever thought of before, and in other cases I've seen problems when 2 apps are configured to use the same port.
There are 2 ports: a source port (browser) and a destination port (server). The browser asks the OS for an available source port (let's say it receives 33123) then makes a socket connection to the destination port (usually 80/HTTP, 443/HTTPS).
When the web server receives the request, it sends a response that has 80 as source port and 33123 as destination port.
So if you have 2 browsers concurrently accessing stackoverflow.com, you'd have something like this:
Firefox (localhost:33123) <-----------> stackoverflow.com (69.59.196.211:80)
Chrome (localhost:33124) <-----------> stackoverflow.com (69.59.196.211:80)
Outgoing HTTP requests don't happen on port 80. When an application requests a socket, it usually receives one at random. This is the Source port.
Port 80 is for serving HTTP content (by the server, not the client). This is the Destination port.
Each browser uses a different Source to generate requests. That way, the packets make it back to the correct application.
It is the 5-tuple of (IP protocol, local IP address, local port, remote IP address, remote port) that identifies a connection. Multiple browsers (or in fact a single browser loading multiple pages simultaneously) will each use destination port 80, but the local port (which is allocated by the O/S) is distinct in each case. Therefore there is no conflict.
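Seen from the client side, this can be checked by opening two connections to the same destination and comparing the local ports the OS handed out. A minimal sketch; the function name local_ports_for_two_connections is made up, and server_addr is whatever listening address you test against.

```python
import socket

def local_ports_for_two_connections(server_addr):
    """Open two connections to the same destination.

    Returns (local_port_a, local_port_b, peer_a, peer_b).
    """
    a = socket.create_connection(server_addr, timeout=2)
    b = socket.create_connection(server_addr, timeout=2)
    try:
        # Same remote endpoint twice, but the OS assigns distinct local ports.
        return a.getsockname()[1], b.getsockname()[1], a.getpeername(), b.getpeername()
    finally:
        a.close()
        b.close()
```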
Clients usually pick a port between 1024 and 65535.
How this is handled depends on the operating system. I think Windows clients increment the value for each new connection, while Unix clients pick a random port number.
Some services rely on a static client port like NTP (123 UDP)
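On Linux specifically, the range the kernel draws client ports from can be read from /proc. The sketch below is Linux-only by assumption and returns None anywhere that file is missing or unreadable; the function name linux_ephemeral_port_range is hypothetical.

```python
from pathlib import Path

def linux_ephemeral_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) of the local port range on Linux, or None elsewhere."""
    try:
        low, high = Path(path).read_text().split()
    except (OSError, ValueError):
        return None          # not Linux, not readable, or unexpected format
    return int(low), int(high)

if __name__ == "__main__":
    print("ephemeral port range:", linux_ephemeral_port_range())
```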
A browser is a client application that you use in order to see content on a web server which is usually on a different machine.
The web server is the one listening on port 80, not the browser on the client.
You need to be careful in making the distinction between "listening on port 80" and "connecting to port 80".
When you say "applications sometimes use HTTP to send messages, since using other ports is liable to cause firewall problems", you actually mean "applications sometimes send messages to port 80".
The server is listening on port 80, and can accept multiple connections on that port.
The port 80 you're talking about here is the remote port on the server; locally, the browser opens a high port for each connection it establishes.
Each connection has a port number on both ends: one is called the local port, the other the remote port.
The firewall will allow traffic to the browser's high port because it knows that the connection was established from your computer.