How does a browser establish connection with a web server on 80 port? Details? - networking

(This question is inspired by a response to this thread: How WebSocket server handles multiple incoming connection requests?)
My understanding is this way:
Assume client IP = 1.1.1.1, server IP = 9.9.9.9
Browser choose a random local available port, say 5555, and initiate a connection to server's port 80. So on client, the socketfd_client should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:80, TCP).
Server calls accept() on its port 80 and identified the connection request from client. Then server picks a random local available port, say 8888, to fulfill that connection request. So on server, the socketfd_server should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:8888, TCP).
My question is:
If my above understanding is correct, socektfd_client and socketfd_server have different server port. Client has 80 while server has 8888. How could the communication be carried out? I think client should change to use the server port 8888 as well, but when and how?

Browser choose a random local available port, say 5555
No. The operating system does that: specifically, the TCP part of the network stack.
and initiate a connection to server's port 80. So on client, the socketfd_client should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:80, TCP).
Correct.
Server calls accept() on its port 80 and identified the connection request from client.
Correct.
Then server picks a random local available port, say 8888
No.
to fulfill that connection request.
No.
So on server, the socketfd_server should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:8888, TCP).
No. The connection at both ends is represented by {1.1.1.1:5555, 9.9.9.9:80}. There is no new port at the server end.
My question is:
If my above understanding is correct
It isn't.
socektfd_client and socketfd_server have different server port.
No.
Client has 80 while server has 8888. How could the communication be carried out? I think client should change to use the server port 8888 as well, but when and how?
Never.

Related

Can a TCP listener receive multiple connections from the same IP/port combination?

I know that multiple TCP clients can connect to the same remote endpoint (e.g. my server runs on 127.0.0.1:8080).
I know that multiple TCP clients can connect from the same IP address. But when I test this (in my case using .Net's TcpClient class, they seem to be automatically given unique ports, but is that enforced?
Can my TCP listener/server have multiple concurrent connections from the same IP/port combination? If not, how can I uniquely distinguish my connections?
One edge case I considered is if two clients have the same IP4 address... or on any given network is IP4 uniqueness also enforced?
When speaking about TCP, a connection between two entities is identified by the quadruple
<clientIP, clientPort, serverIP, serverPort>
This is why the same service (running on serverIP, serverPort) can accept
multiple connections from different clients (entities with different IPs)
multiple connections from the same client (same clientIP BUT different clientPort)
I'm not sure about .NET's implementation, altough I think it would work anyway, but when estabilishing a connection on the client side, you can also specify the port to which bind your client side socket; this way, on the server sidem you could identify incoming connections connections from the same host by their clientPort value.
Given the TCP/IP stack definition, no, you cannot enforce which port a client listens on from the server side. When a client hits a server it sends a port number that it expects the server to respond to (the client will listen on this port) called a Source Port. The server responds to that unique port for that particular client.
https://packetlife.net/blog/2010/jun/7/understanding-tcp-sequence-acknowledgment-numbers/
Looking at this capture you can see the client sent a packet with the request to (Destination) port 80 on the server with a (source) port 54841. This 54841 should be unique to that client from the servers perspective especially when combined with the Client IP address.
In more detail, the client sends a packet like this:
IP: Src: 192.168.1.23 (client), Dst: 192.168.1.11 (server)
TCP: Src Port: 54841, Dst Port: 80
The server responds with a packet that looks like this:
IP: Src: 192.168.1.11 (server), Dst: 192.168.1.23 (client)
TCP: Src Port: 80, Dst Port: 54841
Hopefully you can see that the server is able to uniquely track each request by the client IP/Port combination. A new request from the same client would have the same IP but a different Port.
IPv4 address conflicts are a problem for most networks. Most networks will not tolerate an IP address conflict. It is usually enforce by a competent network administrator or a DHCP server handling the IP addresses. Someone can manually force an IP address on a client and this will cause problems with the other host that has the same IP address. But this seems to be a different question from your original.

Ephemeral port numbers: Same server port after establishment?

If have a webserver running at port 80 and someone connects from a client using randomly assigned port x, then the server knows which port to reply to. However, at that time on, does the communication to the server continue on port 80 from then on (assigned a file descriptor to socket-pair ip:x), or does the server also delegate further communication onto another randomly assigned port of itself; y?
So what I am really asking is: When the server replies -does it reply with a source port of 80 back for further communication?
If have a webserver running at port 80 and someone connects from a client using randomly assigned port x
At the client end.
then the server knows on what port to reply.
The server replies via the same connection it received the request on. What happens below that is up to TCP. It isn't 'knowledge' of the server application.
However, at that time on, does the communication to the server continue on port 80 from then on
Yes.
(assigned a file descriptor to socket-pair ip:x)
To the socket quad {local-IP, local-port, remote-IP, remote port}.
or does the server also delegate further communication onto another randomly assigned port of itself;
No.
So what I am really asking is: When the server replies -does it reply with a source port of 80 back for further communication?
Yes.

How do browsers detect which HTTP response is theirs?

Given that you have multiple web browsers running, all which obviously listen on port 80, how would a browser figure if an incoming HTTP response was originated by itself? And whether or not catch the response and show it?
As part of the connection process a TCP/IP connection is assigned a client port. Browsers do not "listen on port 80"; rather a browser/clients initiate a request to port 80 on the server and waits for a reply on the client port from the server's IP.
After the client port is assigned (locally), each client [TCP/IP] connection is uniquely identified by (server IP, server port, client IP, client port) and the connection (and response sent over such) can be "connected back" to the correct browser. This same connection-identifying tuple is how a server doesn't confuse multiple requests coming from the same client/IP1
HTTP sits on top of the TCP/IP layer and doesn't have to concern itself with mixing up connection streams. (HTTP/2 introduces multiplexing, but that is a different beast and only affects connection from the same browser.)
See The Ephemeral Port Range for an overview:
A TCP/IPv4 connection consists of two endpoints, and each endpoint consists of an IP address and a port number. Therefore, when a client user connects to a server computer, an established connection can be thought of as the 4-tuple of (server IP, server port, client IP, client port). Usually three of the four are readily known -- client machine uses its own IP address and when connecting to a remote service, the server machine's IP address and service port number are required [leaving only the client port unknown and to be automatically assigned].
What is not immediately evident is that when a connection is established that the client side of the connection uses a port number. Unless a client program explicitly requests a specific port number, the port number used is an ephemeral port number. Ephemeral ports are temporary ports assigned by a machine's IP stack, and are assigned from a designated range of ports for this purpose. When the connection terminates, the ephemeral port is available for reuse, although most IP stacks won't reuse that port number until the entire pool of ephemeral ports have been used. So, if the client program reconnects, it will be assigned a different ephemeral port number for its side of the new connection.
See TCP/IP Client (Ephemeral) Ports and Client/Server Application Port Use for an additional gentle explanation:
To know where to send the reply, the server must know the port number the client is using. This [client port] is supplied by the client as the Source Port in the request, and then used by the server as the destination port to send the reply. Client processes don't use well-known or registered ports. Instead, each client process is assigned a temporary port number for its use. This is commonly called an ephemeral port number.
1 If there are multiple client computers (ie. different TCP/IP stacks each assigning possibly-duplicate ephemeral ports) using the same external IP then something like Network Address Translation must be used so the server still has a unique tuple per connection:
Network address translation (NAT) is a methodology of modifying network address information in Internet Protocol (IP) datagram packet headers while they are in transit across a traffic routing device for the purpose of remapping one IP address space into another.
thank you all for answers.
the hole listening thing over port 80 was my bad,I must have been dizzy last night :D
anyway,as I have read HTTP is connectionless.
browser initiates an HTTP request and after a request is made, the client disconnects from >the server and waits for a response. The server process the request and re-establish the >connection with the client to send response back.
therefor the browser does not maintain connection waiting for a response.so the answer is not that easy to just send the response back to the open socket.
here's the source
Pay attention browesers aren't listening on specific port to receive HTTP response. Web server listening on specific ports (usually 80 or 443). Browser open connection to web server, and send HTTP request to web server. Browser don't close connection before receive HTTP response. Web server writes HTTP response on opened connection.
Given that you have multiple web browsers running, all which obviously listen on port 80
Not obvious: just wrong. The HTTP server listens on port 80. The browsers connect to port 80.
how would a browser figure if an incoming HTTP response was originated by itself?
Because it comes back on the same connection and socket that was used to send the request.
And whether or not catch the response and show it?
Anything that comes back on the connected socket belongs to the guy who connected the socket.
And in any case all this is the function of TCP, not the browser.

After requesting a TCP connection request to a server, can client receive the reply on another port generated by server at its end

When TCP client requests conn'n on server's listening port, server will accept it and create a new port meant for this conn'n with this client. Hence forth the client will communicate with server on this new port.
if the above statement is true and possible, how server conveys the newly generated port to client. In reply to the conn'n request the packet from server to client will have what port as source port (Server's listening port OR New port generated by server for client).
Will Client accept this port and take into use or it will give error ? I need this to implement an architecture having 2 clients and one server in an embedded system using lwip stack.
regards,
ED
The server doesn't create a new port. It creates a new TCP connection and it sends its reply packets to the IP and port the client sent its connection request from. (A TCP connection has an IP address and port on each side.)
When you connect to a server, you get a port number yourself, which is assigned to you by the system (unless you bind the socket before connecting). When the network stack of the server replies to your connection request, the "source" port is the new port number of the server, and the "destination" port of the message is your port. That's how the network stack on the client side knows what port the server has.
The new port number on the server used for your connection can not be set or changed by the actual server program, it's the network stack on the server machine that just grabs an available port number.
Edit: You might also want to read up a little on how connections are established, a.k.a. the three-way handshake.

How the clients (client sockets) are identified?

To my understanding by serverSocket = new ServerSocket(portNumber) we create an object which potentially can "listen" to the indicated port. By clientSocket = serverSocket.accept() we force the server socket to "listen" to its port and to "accept" a connection from any client which tries to connect to the server through the port associated with the server. When I say "client tries to connect to the server" I mean that client program executes "nameSocket = new Socket(serverIP,serverPort)".
If client is trying to connect to the server, the server "accepts" this client (i.e. creates a "client socket" associated with this client).
If a new client tries to connect to the server, the server creates another client socket (associated with the new client). But how the server knows if it is a "new" client or an "old" one which has already its socket? Or, in other words, how the clients are identified? By their IP? By their IP and port? By some "signatures"?
What happens if an "old" client tries to use Socket(serverIP,serverIP) again? Will server create the second socket associated with this client?
The server listens on an address and port. For example, your server's IP address is 10.0.0.1, and it is listening on port 8000.
Your client IP address is 10.0.0.2, and the client "connects" to the server at 10.0.0.1 port 8000. In the TCP connect, you are giving the port of the server that you want to connect to. Your client will actually get its own port number, but you don't control this, and it will be different on each connection. The client chooses the server port that it wants to connect to and not the client port that it is connecting from.
For example, on the first connection, your client may get client-side port 12345. It is connecting from 10.0.0.2 port 12345 to the server 10.0.0.1 port 8000. Your server can see what port the client is connecting from by calling getpeername on its side of the connection.
When the client connects a second time, the port number is going to be different, say port 12377. The server can see this by calling getpeername on the second connection -- it will see a different port number on the client side. (getpeername also shows the client's IP address.)
Also, each time you call accept on the server, you are getting a new socket. You still have the original socket listening, and on each accept you get a new socket. Call getpeername on the accepted socket to see which client port the connection is coming from. If two clients connect to your server, you now have three sockets -- the original listening socket, and the sockets of each of the two clients.
You can have many clients connected to the same server port 8000 at the same time. And, many clients can be connected from the same client port (e.g. port 12345), only not from the same IP address. From the same client IP address, e.g. 10.0.0.2, each client connection to the server port 8000 will be from a unique client port, e.g. 12345, 12377, etc. You can tell the clients apart by their combination of IP address and port.
The same client can also have multiple connections to the server at the same time, e.g. one connection from client port 12345 and another from 12377 at the same time. By client I mean the originating IP address, and not a particular software object. You'll just see two active connections having the same client IP address.
Also, eventually over time, the combination of client-address and client-port can be reused. That is, eventually, you may see a new client come in from 10.0.0.2 port 12345, long after the first client at 10.0.0.2 port 12345 has disconnected.
Every TCP connection has as identifier the quadruple (src port, src address, dest port, dest address).
Whenever your server accepts a new client, a new Socket is created and it's indipendent from every other socket created so far. The identification of clients is not implictly handled somehow..
You don't have to think sockets as associated to "clients", they are associated with an ip and a port, but there is not direct correlation between these two.
If the same client tries to open another socket by creating a new one you'll have two unrelated sockets (because ports will be different for sure). This because the client cannot use the same port to open the new connection so the quadruple will be different, same client ip, same server ip, same server port but different client port.
EDIT for your questions:
clients don't specify a port because it's randomly choosen from the free ones (> 1024 if I'm not wrong) from the underlying operating system
a connection cannot be opened from a client using the same port, the operating system won't let you do that (actually you don't specify any port at all) and in any case it would tell you that port is already bound to a socket so this issue cannot happen.
whenever the server receives a new connection request it's is considered new, because also if ip is the same port will be different for sure (in case of old packet resend or similar caveats I think that the request will be discarded)
By the way all these situations are clearly explained in TCP RFC here.
I think the question here is why do you care if the client is new or old. What is new and old?
For example, a web browser could connect to a web server to request a web page. This will create a connection so serverSocket.accept() will return a new Socket. Then the connection is closed by the web browser.
Afer a couple of minutes, the end used click on a link in the web page and the browser request a new page to the server. This will create a connection so serverSocket.accept() will return a new Socket.
Now, the web server do not care if this is a new or old client. It just need to server the requested page. If the server do care if the "client" already requested a page in the past, it should do so using some information in the protocol used on the socket. Check out http://en.wikipedia.org/wiki/OSI_model
In this case, the ServerSocket and Socket ack on the transport level. The question "does this client already requested a page on the server" should be answered by information on the session or even application layer.
In the web browser/server example, the http protocol (which is an application) protocol hold information about who is this browser in the parameters of the request (the browser transmit cookie informations with every request). The http server can then set/read cookie information to known if the browser connected before and eventually maintain a server side session for that browser.
So back to your question: why do you care if it's a new or old client?
A socket is identified by:
(Local IP,Local Port, Remote IP,
Remote Port,IP Protocol(UDP/TCP/SCTP/etc.)
And that's the information the OS uses to map the packets/data to the right handle/file descriptor of your program. For some kinds of sockets,(e.g. an non-connected UDP socket)the remote port/remote IP might be wildcards.
By definition, this is not a Java related question, but about networking in general, since Sockets and SeverSockets apply to any networking-enabled programming language.
A Socket is bounded to a local-port. The client will open a connection to the server (by the Operating System/drivers/adapters/hardware/line/.../line/hardware/adapters/drivers/Server OS). This "connection" is done by a protocol, called the IP (Internet Protocol) when you are connected to the Internet. When you use "Sockets", it will use another protocol, which is the TCP/IP-protocol.
The Internet Protocol will identify nodes on a network by two things: their IP-address and their port. The TCP/IP-protocol will send messages using the IP, and making sure messages are correctly received.
Now; to answer your question: it all depends! It depends on your drivers, your adapters, your hardware, your line. When you connect to your localhost machine, you will not get further than the adapter. The hardware isn't necessairy, since no data is actually sent over the line. (Though often you need hardware before you can have an adapter.)
By definition, the Internet Protocol defines a connection as pair of nodes (thus four things: two IP-adresses and two ports). Also, the Internet Protocol defines that one node can only use one port at a time to initiate a connection with another node (note: this only applies for the client, not the server).
To answer your second question: if there are two Sockets: the "new" and the "old". Since, by the Internet Protocol, a connection is a pair of nodes, and nodes can only use one port at a time for a connection, the ports of "new" and "old" must be different. And because this is different, the "new" client can be discriminated from the "old", since the port-number is differently.

Resources