How do browsers detect which HTTP response is theirs?


Given that you have multiple web browsers running, all of which obviously listen on port 80, how would a browser figure out if an incoming HTTP response was originated by itself? And whether or not to catch the response and show it?

As part of the connection process, a TCP/IP connection is assigned a client port. Browsers do not "listen on port 80"; rather, a browser (the client) initiates a request to port 80 on the server and waits for a reply on the client port from the server's IP.
After the client port is assigned (locally), each client [TCP/IP] connection is uniquely identified by (server IP, server port, client IP, client port), and the connection (and any response sent over it) can be "connected back" to the correct browser. This same connection-identifying tuple is how a server doesn't confuse multiple requests coming from the same client/IP. [1]
HTTP sits on top of the TCP/IP layer and doesn't have to concern itself with mixing up connection streams. (HTTP/2 introduces multiplexing, but that is a different beast and only affects connections from the same browser.)
See The Ephemeral Port Range for an overview:
A TCP/IPv4 connection consists of two endpoints, and each endpoint consists of an IP address and a port number. Therefore, when a client user connects to a server computer, an established connection can be thought of as the 4-tuple of (server IP, server port, client IP, client port). Usually three of the four are readily known -- client machine uses its own IP address and when connecting to a remote service, the server machine's IP address and service port number are required [leaving only the client port unknown and to be automatically assigned].
What is not immediately evident is that when a connection is established that the client side of the connection uses a port number. Unless a client program explicitly requests a specific port number, the port number used is an ephemeral port number. Ephemeral ports are temporary ports assigned by a machine's IP stack, and are assigned from a designated range of ports for this purpose. When the connection terminates, the ephemeral port is available for reuse, although most IP stacks won't reuse that port number until the entire pool of ephemeral ports have been used. So, if the client program reconnects, it will be assigned a different ephemeral port number for its side of the new connection.
See TCP/IP Client (Ephemeral) Ports and Client/Server Application Port Use for an additional gentle explanation:
To know where to send the reply, the server must know the port number the client is using. This [client port] is supplied by the client as the Source Port in the request, and then used by the server as the destination port to send the reply. Client processes don't use well-known or registered ports. Instead, each client process is assigned a temporary port number for its use. This is commonly called an ephemeral port number.
[1] If there are multiple client computers (i.e. different TCP/IP stacks, each possibly assigning duplicate ephemeral ports) using the same external IP, then something like Network Address Translation must be used so the server still has a unique tuple per connection:
Network address translation (NAT) is a methodology of modifying network address information in Internet Protocol (IP) datagram packet headers while they are in transit across a traffic routing device for the purpose of remapping one IP address space into another.
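The 4-tuple described in this answer can be observed directly. The following is a minimal sketch, assuming a throwaway listener on the loopback address 127.0.0.1 stands in for the web server; the function name open_two_connections is made up for illustration.

```python
import socket

def open_two_connections():
    """Connect twice to one local listener and return each connection's 4-tuple."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(2)
    server_addr = listener.getsockname()

    clients, accepted, tuples = [], [], []
    for _ in range(2):
        client = socket.create_connection(server_addr)
        conn, _peer = listener.accept()
        clients.append(client)
        accepted.append(conn)
        # (server IP, server port, client IP, client port) as seen by the client
        tuples.append(client.getpeername() + client.getsockname())

    for sock in clients + accepted + [listener]:
        sock.close()
    return tuples

first, second = open_two_connections()
print(first)
print(second)
```

Both tuples share the server IP, server port and client IP; only the OS-assigned client port differs, and that difference is what keeps the two connections apart.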

Thank you all for the answers.
The whole "listening on port 80" thing was my bad, I must have been dizzy last night :D
Anyway, as I have read, HTTP is connectionless:
browser initiates an HTTP request and after a request is made, the client disconnects from the server and waits for a response. The server process the request and re-establish the connection with the client to send response back.
Therefore the browser does not maintain a connection while waiting for a response, so the answer is not as easy as just sending the response back over the open socket.
Here's the source

Note that browsers aren't listening on a specific port to receive HTTP responses. The web server listens on specific ports (usually 80 or 443). The browser opens a connection to the web server and sends the HTTP request over it. The browser doesn't close the connection before it receives the HTTP response; the web server writes the HTTP response to that open connection.

Given that you have multiple web browsers running, all of which obviously listen on port 80
Not obvious: just wrong. The HTTP server listens on port 80. The browsers connect to port 80.
how would a browser figure out if an incoming HTTP response was originated by itself?
Because it comes back on the same connection and socket that was used to send the request.
And whether or not to catch the response and show it?
Anything that comes back on the connected socket belongs to the guy who connected the socket.
And in any case all this is the function of TCP, not the browser.
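To see the "same connection and socket" point in action, here is a minimal sketch in which two plain sockets play the role of two browsers talking to a local test server. It assumes Python's standard http.server on 127.0.0.1; the handler EchoPath, the helper read_all and the paths /from-a and /from-b are made up for illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class EchoPath(BaseHTTPRequestHandler):
    """Replies with the path that was requested on this connection."""
    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def read_all(sock):
    """Read until the server closes the connection."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)

def two_browsers():
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPath)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    addr = server.server_address
    a = socket.create_connection(addr)
    b = socket.create_connection(addr)
    # Both requests are in flight before either reply is read.
    a.sendall(b"GET /from-a HTTP/1.0\r\n\r\n")
    b.sendall(b"GET /from-b HTTP/1.0\r\n\r\n")
    reply_b = read_all(b)
    reply_a = read_all(a)
    a.close()
    b.close()
    server.shutdown()
    server.server_close()
    return reply_a, reply_b

reply_a, reply_b = two_browsers()
print(reply_a.decode().splitlines()[-1])
print(reply_b.decode().splitlines()[-1])
```

Neither client does any matching of responses to requests: each simply reads from the socket it opened, and TCP delivers only that connection's bytes there.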

Related

How does a browser establish connection with a web server on 80 port? Details?

(This question is inspired by a response to this thread: How WebSocket server handles multiple incoming connection requests?)
My understanding is this way:
Assume client IP = 1.1.1.1, server IP = 9.9.9.9
The browser chooses a random local available port, say 5555, and initiates a connection to the server's port 80. So on the client, socketfd_client should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:80, TCP).
The server calls accept() on its port 80 and identifies the connection request from the client. Then the server picks a random local available port, say 8888, to fulfill that connection request. So on the server, socketfd_server should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:8888, TCP).
My question is:
If my above understanding is correct, socketfd_client and socketfd_server have different server ports. The client has 80 while the server has 8888. How could the communication be carried out? I think the client should change to use the server port 8888 as well, but when and how?

The browser chooses a random local available port, say 5555
No. The operating system does that: specifically, the TCP part of the network stack.
and initiates a connection to the server's port 80. So on the client, socketfd_client should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:80, TCP).
Correct.
The server calls accept() on its port 80 and identifies the connection request from the client.
Correct.
Then the server picks a random local available port, say 8888
No.
to fulfill that connection request.
No.
So on the server, socketfd_server should represent an IP connection like (1.1.1.1:5555, 9.9.9.9:8888, TCP).
No. The connection at both ends is represented by {1.1.1.1:5555, 9.9.9.9:80}. There is no new port at the server end.
My question is:
If my above understanding is correct
It isn't.
socketfd_client and socketfd_server have different server ports.
No.
The client has 80 while the server has 8888. How could the communication be carried out? I think the client should change to use the server port 8888 as well, but when and how?
Never.
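The claim that there is no new port at the server end is easy to check. A minimal sketch, assuming a loopback listener on an OS-chosen port instead of a real server on port 80; the function name ports_seen_by_both_ends and the dictionary keys are made up for illustration.

```python
import socket

def ports_seen_by_both_ends():
    """Compare the ports reported by the listening, accepted and client sockets."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # stand-in for the server's port 80
    listener.listen(1)
    listening_port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", listening_port))
    conn, peer = listener.accept()       # new socket, but not a new port

    result = {
        "listening_port": listening_port,
        "accepted_local_port": conn.getsockname()[1],
        "client_remote_port": client.getpeername()[1],
        "client_local_port": client.getsockname()[1],
        "server_view_of_client_port": peer[1],
    }
    for sock in (client, conn, listener):
        sock.close()
    return result

for name, port in ports_seen_by_both_ends().items():
    print(f"{name}: {port}")
```

The socket returned by accept() reports the same local port as the listening socket, and both ends agree on the client's port, so the two descriptors describe one and the same pair of endpoints.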

How does firewall handle incoming http traffic to a browser?

When a browser sends a request to a web server, the web server has to send a response.
From what I have understood from reading so far, the server then dispatches the packets of response data with the dest-port/dest-ip parts being the client browser's.
1) If the above is right, then doesn't it mean that the browser has to always be listening to a port for incoming traffic from the server?
2) And if the client is listening for incoming connections on a port, isn't that a security concern?
3) If 2 is right, then how are most corporate firewalls for employees configured? (seeing as they probably need to browse the net) - a quick overview, details unnecessary.

doesn't it mean that the browser has to always be listening to a port for incoming traffic from the server?
No. Layman's explanation: a browser initiates a TCP connection to the web server. This connection is recognized by source ip and port, dest ip and port and protocol by all intermediate level 3 machines (e.g. routers, firewalls).
In a TCP connection, one party listens (the web server) while the other party connects (the browser). Traffic can flow over this connection in both directions, until either party (or intermediate machine) closes the connection.
Corporate firewalls allow outbound connections over port 80 (and 443), so their employees can browse the web over HTTP(S). The data the server returns is sent over the connection initiated by the client.
Of course if an outside attacker knows of a connection, they can send packets with a spoofed IP, so they can send data pretending to be the server. Those packets will be dropped if anything is wrong, like the sequence number, so they won't end up in the user's browser.
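The difference between a listening port and a client port that merely belongs to an established connection can be probed on the loopback interface. This is only a sketch: no firewall is involved, the names accepts_inbound and probe_both_ends are made up, and the exact error the OS reports for a rejected attempt may vary by platform, which is why any OSError is treated as "not accepted".

```python
import socket

def accepts_inbound(port, timeout=2.0):
    """Return True if a new TCP connection to 127.0.0.1:port succeeds."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.settimeout(timeout)
    try:
        probe.bind(("127.0.0.1", 0))     # give the probe its own local port first
        probe.connect(("127.0.0.1", port))
        return True
    except OSError:                      # refused or timed out
        return False
    finally:
        probe.close()

def probe_both_ends():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    server_port = listener.getsockname()[1]

    browser = socket.create_connection(("127.0.0.1", server_port))
    conn, _ = listener.accept()
    browser_port = browser.getsockname()[1]

    result = (accepts_inbound(server_port), accepts_inbound(browser_port))
    for sock in (browser, conn, listener):
        sock.close()
    return result

server_open, browser_open = probe_both_ends()
print("server port accepts new connections:", server_open)
print("browser's client port accepts new connections:", browser_open)
```

The client port carries traffic for the one connection the browser opened, but nothing is listening on it, so a new connection attempt aimed at that port is rejected.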

After requesting a TCP connection request to a server, can client receive the reply on another port generated by server at its end

When a TCP client requests a connection on the server's listening port, the server will accept it and create a new port meant for this connection with this client. Henceforth the client will communicate with the server on this new port.
If the above statement is true and possible, how does the server convey the newly generated port to the client? In the reply to the connection request, which port will the packet from server to client have as its source port (the server's listening port, or the new port generated by the server for the client)?
Will the client accept this port and use it, or will it give an error? I need this to implement an architecture with 2 clients and one server in an embedded system using the lwIP stack.
regards,
ED

The server doesn't create a new port. It creates a new TCP connection and it sends its reply packets to the IP and port the client sent its connection request from. (A TCP connection has an IP address and port on each side.)

When you connect to a server, you get a port number yourself, which is assigned to you by the system (unless you bind the socket before connecting). When the network stack of the server replies to your connection request, the "source" port of the reply is the server's listening port (the one you connected to), and the "destination" port of the message is your port. That's how the network stack on the client side matches the reply to your connection.
The server program does not get a new port number for your connection. accept() returns a new socket, but that socket shares the listening socket's local port; the connection is told apart from others by the client's IP address and port.
Edit: You might also want to read up a little on how connections are established, a.k.a. the three-way handshake.

How the clients (client sockets) are identified?

To my understanding by serverSocket = new ServerSocket(portNumber) we create an object which potentially can "listen" to the indicated port. By clientSocket = serverSocket.accept() we force the server socket to "listen" to its port and to "accept" a connection from any client which tries to connect to the server through the port associated with the server. When I say "client tries to connect to the server" I mean that client program executes "nameSocket = new Socket(serverIP,serverPort)".
If client is trying to connect to the server, the server "accepts" this client (i.e. creates a "client socket" associated with this client).
If a new client tries to connect to the server, the server creates another client socket (associated with the new client). But how does the server know if it is a "new" client or an "old" one which already has its socket? Or, in other words, how are the clients identified? By their IP? By their IP and port? By some "signatures"?
What happens if an "old" client tries to use Socket(serverIP,serverPort) again? Will the server create a second socket associated with this client?

The server listens on an address and port. For example, your server's IP address is 10.0.0.1, and it is listening on port 8000.
Your client IP address is 10.0.0.2, and the client "connects" to the server at 10.0.0.1 port 8000. In the TCP connect, you are giving the port of the server that you want to connect to. Your client will actually get its own port number, but you don't control this, and it will be different on each connection. The client chooses the server port that it wants to connect to and not the client port that it is connecting from.
For example, on the first connection, your client may get client-side port 12345. It is connecting from 10.0.0.2 port 12345 to the server 10.0.0.1 port 8000. Your server can see what port the client is connecting from by calling getpeername on its side of the connection.
When the client connects a second time, the port number is going to be different, say port 12377. The server can see this by calling getpeername on the second connection -- it will see a different port number on the client side. (getpeername also shows the client's IP address.)
Also, each time you call accept on the server, you are getting a new socket. You still have the original socket listening, and on each accept you get a new socket. Call getpeername on the accepted socket to see which client port the connection is coming from. If two clients connect to your server, you now have three sockets -- the original listening socket, and the sockets of each of the two clients.
You can have many clients connected to the same server port 8000 at the same time. And, many clients can be connected from the same client port (e.g. port 12345), only not from the same IP address. From the same client IP address, e.g. 10.0.0.2, each client connection to the server port 8000 will be from a unique client port, e.g. 12345, 12377, etc. You can tell the clients apart by their combination of IP address and port.
The same client can also have multiple connections to the server at the same time, e.g. one connection from client port 12345 and another from 12377 at the same time. By client I mean the originating IP address, and not a particular software object. You'll just see two active connections having the same client IP address.
Also, eventually over time, the combination of client-address and client-port can be reused. That is, eventually, you may see a new client come in from 10.0.0.2 port 12345, long after the first client at 10.0.0.2 port 12345 has disconnected.
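The bookkeeping described above (one listening socket plus one accepted socket per connection, told apart by client address and port) might look like the following sketch. It runs entirely on 127.0.0.1, uses the peer address returned by accept() (the same value getpeername would report on the accepted socket), and the function name serve_two_clients is made up for illustration.

```python
import socket

def serve_two_clients():
    """Accept two connections from the same IP and answer each by its (IP, port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(2)
    addr = listener.getsockname()

    client_a = socket.create_connection(addr)
    client_b = socket.create_connection(addr)

    # Server side: one accepted socket per connection, keyed by (client IP, client port).
    connections = {}
    for _ in range(2):
        conn, peer = listener.accept()
        connections[peer] = conn

    for (ip, port), conn in connections.items():
        conn.sendall(f"hello {ip}:{port}".encode())
        conn.close()

    # Client side: each client reads whatever arrives on its own socket.
    replies = {}
    for client in (client_a, client_b):
        local = client.getsockname()
        with client.makefile("rb") as stream:
            replies[local] = stream.read().decode()
        client.close()
    listener.close()
    return replies

for local, text in serve_two_clients().items():
    print(local, "received:", text)
```

At the point where both connections are open the server holds three sockets: the listener and the two accepted ones, exactly as in the explanation above.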

Every TCP connection has as identifier the quadruple (src port, src address, dest port, dest address).
Whenever your server accepts a new client, a new Socket is created, and it's independent of every other socket created so far. The identification of clients is not implicitly handled in any way.
You don't have to think of sockets as associated with "clients"; they are associated with an IP and a port, but there is no direct correlation between the two.
If the same client opens another connection by creating a new socket, you'll have two unrelated sockets (because the ports will be different for sure). This is because the client cannot use the same port to open the new connection, so the quadruple will be different: same client IP, same server IP, same server port, but different client port.
EDIT for your questions:
Clients don't specify a port because it is chosen by the underlying operating system from the free ones (> 1024 if I'm not wrong).
A connection cannot be opened from a client using the same port; the operating system won't let you do that (actually you don't specify any port at all), and in any case it would tell you that the port is already bound to a socket, so this issue cannot happen.
Whenever the server receives a new connection request, it is considered new, because even if the IP is the same, the port will be different for sure (in the case of a resent old packet or similar caveats, I think the request will be discarded).
By the way all these situations are clearly explained in TCP RFC here.

I think the question here is why do you care if the client is new or old. What is new and old?
For example, a web browser could connect to a web server to request a web page. This will create a connection, so serverSocket.accept() will return a new Socket. Then the connection is closed by the web browser.
After a couple of minutes, the end user clicks on a link in the web page and the browser requests a new page from the server. This will create a connection, so serverSocket.accept() will return a new Socket.
Now, the web server does not care if this is a new or old client. It just needs to serve the requested page. If the server does care whether the "client" already requested a page in the past, it should find out using some information in the protocol used on the socket. Check out http://en.wikipedia.org/wiki/OSI_model
In this case, the ServerSocket and Socket act on the transport level. The question "did this client already request a page from the server" should be answered by information on the session or even application layer.
In the web browser/server example, the HTTP protocol (which is an application protocol) holds information about who this browser is in the parameters of the request (the browser transmits cookie information with every request). The HTTP server can then set/read cookie information to know if the browser connected before and eventually maintain a server-side session for that browser.
So back to your question: why do you care if it's a new or old client?
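To illustrate recognising a returning browser at the application layer rather than by its socket, here is a minimal sketch using Python's standard http.server and http.client on 127.0.0.1. The cookie name sid, the in-memory SESSIONS dictionary and the helper names are all assumptions made for this example, not part of any real server.

```python
import http.client
import threading
import uuid
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

SESSIONS = {}  # session id -> number of visits (hypothetical in-memory store)

class CookieHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        cookie = SimpleCookie(self.headers.get("Cookie", ""))
        sid = cookie["sid"].value if "sid" in cookie else None
        if sid in SESSIONS:
            SESSIONS[sid] += 1
            body = b"returning"
        else:
            sid = uuid.uuid4().hex
            SESSIONS[sid] = 1
            body = b"new"
        self.send_response(200)
        self.send_header("Set-Cookie", f"sid={sid}")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def fetch(addr, cookie=None):
    """One request on a brand-new TCP connection, optionally sending a cookie."""
    conn = http.client.HTTPConnection(*addr)
    conn.request("GET", "/", headers={"Cookie": cookie} if cookie else {})
    response = conn.getresponse()
    body = response.read().decode()
    set_cookie = response.getheader("Set-Cookie")
    conn.close()
    return body, set_cookie

def visit_three_times():
    server = ThreadingHTTPServer(("127.0.0.1", 0), CookieHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    addr = server.server_address
    first, cookie = fetch(addr)          # no cookie yet
    second, _ = fetch(addr, cookie)      # new connection, same cookie
    third, _ = fetch(addr)               # new connection, no cookie
    server.shutdown()
    server.server_close()
    return first, second, third

print(visit_three_times())
```

All three requests arrive on separate TCP connections with their own client ports; only the cookie lets the server tell that the second one comes from a browser it has seen before.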

A socket is identified by:
(Local IP, Local Port, Remote IP, Remote Port, IP Protocol (UDP/TCP/SCTP/etc.))
And that's the information the OS uses to map the packets/data to the right handle/file descriptor of your program. For some kinds of sockets (e.g. a non-connected UDP socket), the remote port/remote IP might be wildcards.
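As a toy model of that mapping (not how any particular kernel implements it), the lookup can be pictured as a table keyed by the 5-tuple, with a wildcard entry as the fallback for sockets that have no fixed remote end. The names Key and lookup, the addresses and the socket labels are all made up for illustration.

```python
from typing import NamedTuple, Optional

class Key(NamedTuple):
    proto: str
    local_ip: str
    local_port: int
    remote_ip: Optional[str]      # None acts as a wildcard
    remote_port: Optional[int]    # None acts as a wildcard

def lookup(table, proto, local_ip, local_port, remote_ip, remote_port):
    """Prefer an exact 5-tuple match, then fall back to the wildcard entry."""
    exact = Key(proto, local_ip, local_port, remote_ip, remote_port)
    if exact in table:
        return table[exact]
    wildcard = Key(proto, local_ip, local_port, None, None)
    return table.get(wildcard)

# Server-side view: one listening socket and two established connections.
table = {
    Key("tcp", "9.9.9.9", 80, None, None): "listening socket",
    Key("tcp", "9.9.9.9", 80, "1.1.1.1", 5555): "connection A",
    Key("tcp", "9.9.9.9", 80, "1.1.1.1", 5556): "connection B",
}

print(lookup(table, "tcp", "9.9.9.9", 80, "1.1.1.1", 5555))
print(lookup(table, "tcp", "9.9.9.9", 80, "2.2.2.2", 40000))
```

A packet from a known remote endpoint lands on that connection's socket, while a packet from an unknown one (such as a new connection request) falls through to the listening socket.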

Strictly speaking, this is not a Java-related question but one about networking in general, since Sockets and ServerSockets have equivalents in any networking-enabled programming language.
A Socket is bound to a local port. The client opens a connection to the server (through the Operating System/drivers/adapters/hardware/line/.../line/hardware/adapters/drivers/Server OS). The packets are carried by the Internet Protocol (IP). When you use Sockets, another protocol runs on top of IP: TCP.
IP identifies nodes on a network by their IP address. TCP adds port numbers on top of that, sends its messages using IP, and makes sure they are correctly received.
Now, to answer your question: it all depends! It depends on your drivers, your adapters, your hardware, your line. When you connect to your localhost machine, you will not get further than the adapter. The hardware isn't necessary, since no data is actually sent over the line. (Though often you need hardware before you can have an adapter.)
A TCP connection is defined by a pair of endpoints (thus four things: two IP addresses and two ports). Also, a client cannot use the same local port for two simultaneous connections to the same server address and port (note: this restriction applies to the client side; the server uses its one listening port for every connection).
To answer your second question: suppose there are two Sockets, the "new" and the "old". Since a connection is a pair of endpoints, and both connections share the same server IP, server port and client IP, the client ports of "new" and "old" must be different. Because the ports differ, the "new" client can be distinguished from the "old".

How are different TCP connections in HTTP requests identified?

From what I understand, each HTTP request uses its own TCP connection (please correct me if i'm wrong). So, let's say that there are two current connections to the same server. For example, client side javascript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to each separate callback function.
Now here's the thing that I don't understand: The TCP packet includes source and destination ip and port, but won't both of these connections have the same src and dest ip addresses, and port 80? How can the packets be differentiated and routed to appropriately? Does it have anything to do with the packet sequence number which is different for each connection?

When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
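The port rewriting that a NAT router performs can be sketched as a pair of lookup tables. This is a deliberately simplified model under stated assumptions: the class Nat, its method names, the starting port 40000 and the example addresses are made up, and real NAT devices also track the protocol, timeouts and more.

```python
import itertools

class Nat:
    """Toy port-translating NAT: many private (IP, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}    # (private IP, private port) -> public port
        self._back = {}   # public port -> (private IP, private port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of an outgoing packet and remember the mapping."""
        key = (src_ip, src_port)
        if key not in self._out:
            public_port = next(self._next_port)
            self._out[key] = public_port
            self._back[public_port] = key
        return (self.public_ip, self._out[key], dst_ip, dst_port)

    def inbound(self, dst_port):
        """Find which private host a reply to this public port belongs to."""
        return self._back.get(dst_port)

nat = Nat("203.0.113.7")
# Two computers behind the router happen to pick the same client port.
first = nat.outbound("192.168.0.2", 5555, "9.9.9.9", 80)
second = nat.outbound("192.168.0.3", 5555, "9.9.9.9", 80)
print(first, "->", nat.inbound(first[1]))
print(second, "->", nat.inbound(second[1]))
```

From the server's point of view both connections come from the same IP address but from different ports, so its 4-tuples stay unique, and the router uses the destination port of each reply to find the right computer.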

They're differentiated by the source port.
Incidentally, HTTP requests do not each have to open a separate TCP connection: keep-alive (persistent connections) lets several requests reuse the same one.
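Connection reuse can be observed by having a test server report the client port it sees for each request. A minimal sketch, assuming Python's standard http.server and http.client on 127.0.0.1; the handler ReportPort and the function client_ports_seen are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class ReportPort(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # allow persistent connections

    def do_GET(self):
        # Reply with the client port this request arrived from.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def client_ports_seen():
    server = ThreadingHTTPServer(("127.0.0.1", 0), ReportPort)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address

    # Two requests sent over one persistent connection.
    kept = http.client.HTTPConnection(host, port)
    reused = []
    for _ in range(2):
        kept.request("GET", "/")
        reused.append(int(kept.getresponse().read()))

    # A third request on a second connection, opened while the first is still up.
    other = http.client.HTTPConnection(host, port)
    other.request("GET", "/")
    fresh = int(other.getresponse().read())

    kept.close()
    other.close()
    server.shutdown()
    server.server_close()
    return reused, fresh

reused, fresh = client_ports_seen()
print("same connection:", reused, "second connection:", fresh)
```

The two requests on the kept-alive connection arrive from the same client port, while the request on the second connection arrives from a different one.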

A connection, in packet network communications, is identified by the combination of 4 elements: server IP, server port, client IP, client port. The second one is usually fixed by the protocol, e.g. HTTP servers usually listen on port 80, but the client port is a random number, usually in the range 1024-65535. This is because the ports below 1024 are reserved for well-known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). The same machine does not use the same client port for two simultaneous connections to the same server, and if two different clients use the same port, the server can tell them apart by their IP addresses. Normally, a port that one process is listening on cannot be bound by another process at the same time. That's how the operating system can dispatch packets to the correct process once they are received by the network card.
