How does a server find a browser without a public IP in a LAN via WebSocket?

A browser can send a request to a web server and get a response. That is easy to understand: every domain resolves to one or more public IP addresses, so the browser can find the web server through its public IP.
Some clients have a public IP too (for example, over PPPoE), so when I establish a WebSocket connection between browser and server, I can see how the server sends data to a browser on a device that has a public IP. But not every client has a public IP.
My question is: how does the server find a browser that has no public IP, inside a LAN, via WebSocket?

Part of the magic is Network Address Translation (NAT), which is performed by the routers between the server and the web browser's computer.

The simple answer is the server never has to find the client because once a browser sends a request to the server and a TCP connection is established, that connection can be maintained for as long as necessary.
TCP has an optional keepalive mechanism: every so often, one peer sends the other a probe packet with no data and expects an ACK packet in response. This way the connection stays alive despite network inactivity, and it can be terminated when the peer does not reply.
The WebSocket Protocol, a TCP-based protocol, defines a similar concept, Ping/Pong, in which either peer can send a Ping frame at any time once a connection is established. The peer must respond to a Ping frame with a Pong frame as soon as is practical. This checks for dead peers, in which case the connection is considered dead.
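To make the "server never has to find the client" point concrete, here is a minimal Python sketch that is not part of the original answer. It runs entirely on loopback (so no NAT is involved), and the function name demo_server_push is made up for illustration. The client only makes an outbound connection and never listens on a port; the server then writes to the accepted socket without waiting for any request. SO_KEEPALIVE is switched on just to show where TCP keepalive would be enabled; the probe timing is left at the OS defaults.

```python
import socket
import threading


def demo_server_push():
    """Client connects out; server pushes data over that same connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server_addr = listener.getsockname()

    def serve():
        conn, _peer = listener.accept()
        # Ask the OS to probe the peer when the connection sits idle.
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        with conn:
            # The server speaks first, using the connection the client opened.
            conn.sendall(b"push from server")

    worker = threading.Thread(target=serve)
    worker.start()

    # The client side: an outbound connect only, no bind() and no listen().
    client = socket.create_connection(server_addr)
    data = b""
    with client:
        while True:
            chunk = client.recv(1024)
            if not chunk:  # server closed its side
                break
            data += chunk

    worker.join()
    listener.close()
    return data


if __name__ == "__main__":
    print(demo_server_push())
```

A WebSocket connection behaves the same way at this level: it is a long-lived TCP connection that the browser opened, and the server simply writes frames to it.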

Related

UDP hole-punch explanation

I'm trying to understand UDP hole punching and I just don't quite get it.
In concept it seems simple but when I put it into practice I can't pull it off.
From what I understand, there's a public server we call the hole-punch server. A client makes a request to the hole-punch server (which is public). The hole-punch server spits out the public IP and port of the client that just made the request. So long as that port is open, can essentially any random client make a request to that client using that specific port and IP?
The issue I guess I'm having is this: the client is able to make a request to the server, and the server is able to send data back to the client on that public port and IP. However, when another client tries to send a request to that client using that same port and IP, it just doesn't go through, and that's what's confusing me. If the server can make the request, why can't another random client make that request?
The thing to know about UDP hole-punching is that many consumer-grade Internet routers/NAT-firewalls have a policy along the lines of "block any incoming UDP packets, except for UDP packets coming from an IP address that the user's local computer has recently sent a UDP packet to"; the idea being that if the local user is sending packets to a particular IP address, then the packets coming back from that same IP address are probably legitimate/desirable.
So in order to get UDP packets flowing between two firewalled/NAT'd computers, you have to get each of the two computers to first send a UDP packet to the other one; which is a bit of a chicken-and-egg problem since they can't know where to send the UDP packet without being able to communicate; the public server is what solves that problem. Since that server is public, both clients can communicate with the server (via UDP or TCP or HTTP or whatever), and that server can tell each client the IP address and port to send its UDP packets to. Once each client has sent some initial packets to the other, it should also (in most cases) then be able to receive UDP packets from the other client as well, at which point the server is no longer necessary as a go-between.
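The message flow described above can be sketched in Python roughly as follows. This is only an illustration under loud assumptions: everything runs on loopback, so there is no real NAT or firewall to punch through, and the function name demo_rendezvous, the "register" message and the "host:port" reply format are all invented for the example. What it does show is the order of events: each client contacts the public server, the server reports the address it saw for the other client, and then both clients send to each other directly.

```python
import socket


def demo_rendezvous():
    """Simulate the hole-punch message flow on loopback (no real NAT here)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server_addr = server.getsockname()

    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for sock in (server, a, b):
        sock.settimeout(2.0)  # fail instead of hanging if a datagram is lost

    # 1. Each client sends a datagram to the public server; the server learns
    #    the (public) address and port it came from.
    a.sendto(b"register", server_addr)
    _, addr_a = server.recvfrom(1024)
    b.sendto(b"register", server_addr)
    _, addr_b = server.recvfrom(1024)

    # 2. The server tells each client where the other one can be reached.
    server.sendto(f"{addr_b[0]}:{addr_b[1]}".encode(), addr_a)
    server.sendto(f"{addr_a[0]}:{addr_a[1]}".encode(), addr_b)

    def parse(message):
        host, port = message.decode().rsplit(":", 1)
        return host, int(port)

    peer_of_a = parse(a.recvfrom(1024)[0])
    peer_of_b = parse(b.recvfrom(1024)[0])

    # 3. Both clients send first. Behind a real NAT, this outbound packet is
    #    what makes the router accept the other side's incoming packets.
    a.sendto(b"hello from A", peer_of_a)
    b.sendto(b"hello from B", peer_of_b)

    got_a, _ = a.recvfrom(1024)
    got_b, _ = b.recvfrom(1024)

    for sock in (server, a, b):
        sock.close()
    return got_a, got_b


if __name__ == "__main__":
    print(demo_rendezvous())
```

Note that after step 2 the server plays no further part, which matches the answer: it is only needed to solve the chicken-and-egg problem of learning each other's address.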

How does firewall handle incoming http traffic to a browser?

When a browser sends a request to a web server, the web server has to send a response.
From what I have understood from reading so far, the server then dispatches the packets of response data with the dest-port/dest-ip fields set to the client browser's.
1) If the above is right, then doesn't it mean that the browser has to always be listening on a port for incoming traffic from the server?
2) And if the client is listening for incoming connections on a port, isn't that a security concern?
3) If 2 is right, then how are most corporate firewalls for employees configured? (seeing as they probably need to browse the net) - a quick overview, details unnecessary.
Doesn't it mean that the browser has to always be listening on a port for incoming traffic from the server?
No. Layman's explanation: a browser initiates a TCP connection to the web server. This connection is recognized by source ip and port, dest ip and port and protocol by all intermediate level 3 machines (e.g. routers, firewalls).
In a TCP connection, one party listens (the web server) while the other party connects (the browser). Traffic can flow over this connection in both directions, until either party (or intermediate machine) closes the connection.
Corporate firewalls allow outbound connections over port 80 (and 443), so their employees can browse the web over HTTP(S). The data the server returns is sent over the connection initiated by the client.
Of course if an outside attacker knows of a connection, they can send packets with a spoofed IP, so they can send data pretending to be the server. Those packets will be dropped if anything is wrong, like the sequence number, so they won't end up in the user's browser.

How do browsers detect which HTTP response is theirs?

Given that you have multiple web browsers running, all which obviously listen on port 80, how would a browser figure if an incoming HTTP response was originated by itself? And whether or not catch the response and show it?
As part of the connection process, a TCP/IP connection is assigned a client port. Browsers do not "listen on port 80"; rather, a browser (the client) initiates a request to port 80 on the server and waits for a reply on the client port from the server's IP.
After the client port is assigned (locally), each client [TCP/IP] connection is uniquely identified by (server IP, server port, client IP, client port), and the connection (and any response sent over it) can be "connected back" to the correct browser. This same connection-identifying tuple is how a server doesn't confuse multiple requests coming from the same client/IP. [1]
HTTP sits on top of the TCP/IP layer and doesn't have to concern itself with mixing up connection streams. (HTTP/2 introduces multiplexing, but that is a different beast and only affects connections from the same browser.)
See The Ephemeral Port Range for an overview:
A TCP/IPv4 connection consists of two endpoints, and each endpoint consists of an IP address and a port number. Therefore, when a client user connects to a server computer, an established connection can be thought of as the 4-tuple of (server IP, server port, client IP, client port). Usually three of the four are readily known -- client machine uses its own IP address and when connecting to a remote service, the server machine's IP address and service port number are required [leaving only the client port unknown and to be automatically assigned].
What is not immediately evident is that when a connection is established that the client side of the connection uses a port number. Unless a client program explicitly requests a specific port number, the port number used is an ephemeral port number. Ephemeral ports are temporary ports assigned by a machine's IP stack, and are assigned from a designated range of ports for this purpose. When the connection terminates, the ephemeral port is available for reuse, although most IP stacks won't reuse that port number until the entire pool of ephemeral ports have been used. So, if the client program reconnects, it will be assigned a different ephemeral port number for its side of the new connection.
See TCP/IP Client (Ephemeral) Ports and Client/Server Application Port Use for an additional gentle explanation:
To know where to send the reply, the server must know the port number the client is using. This [client port] is supplied by the client as the Source Port in the request, and then used by the server as the destination port to send the reply. Client processes don't use well-known or registered ports. Instead, each client process is assigned a temporary port number for its use. This is commonly called an ephemeral port number.
[1] If there are multiple client computers (i.e. different TCP/IP stacks, each assigning possibly-duplicate ephemeral ports) using the same external IP, then something like Network Address Translation must be used so the server still has a unique tuple per connection:
Network address translation (NAT) is a methodology of modifying network address information in Internet Protocol (IP) datagram packet headers while they are in transit across a traffic routing device for the purpose of remapping one IP address space into another.
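As a rough illustration of how that remapping keeps the tuple unique, here is a toy translation table in Python. It is a deliberately simplified model, not how any particular router is implemented: the class name ToyNat, the starting port 40000 and the example addresses are all made up, and a real NAT also tracks the remote endpoint, the protocol and timeouts.

```python
class ToyNat:
    """Toy model of a NAT translation table (hypothetical, heavily simplified)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet; create a mapping if needed."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key])

    def inbound(self, public_port):
        """Find the private endpoint for a reply; None means the packet is dropped."""
        return self.back.get(public_port)


if __name__ == "__main__":
    nat = ToyNat("203.0.113.5")
    # Two LAN machines happen to pick the same ephemeral port.
    print(nat.outbound("192.168.1.10", 50000))
    print(nat.outbound("192.168.1.11", 50000))
    # A reply addressed to the second public port is routed back into the LAN.
    print(nat.inbound(40001))
```

The server only ever sees the public IP and the translated port, so its tuple stays unique even though both machines chose the same ephemeral port. The mapping exists only because the LAN machine sent something out first, which is also why an unsolicited inbound packet has nowhere to go.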
Thank you all for the answers.
The whole listening-on-port-80 thing was my bad; I must have been dizzy last night :D
Anyway, as I have read, HTTP is connectionless:
The browser initiates an HTTP request and after a request is made, the client disconnects from the server and waits for a response. The server processes the request and re-establishes the connection with the client to send the response back.
Therefore the browser does not maintain a connection while waiting for a response, so the answer is not as easy as just sending the response back over the open socket.
Here's the source
Note that browsers don't listen on a specific port to receive HTTP responses. The web server listens on specific ports (usually 80 or 443). The browser opens a connection to the web server and sends an HTTP request over it. The browser doesn't close the connection before it receives the HTTP response. The web server writes the HTTP response to that same open connection.
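A small Python sketch of that sequence, using raw sockets on loopback so that nothing is hidden by an HTTP library. The function name demo_http_roundtrip and the tiny hand-written response are assumptions made for the example; a real server would of course parse the request properly.

```python
import socket
import threading


def demo_http_roundtrip():
    """Send an HTTP request and read the response on the same connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # stand-in for port 80
    listener.listen(1)
    server_addr = listener.getsockname()

    def serve_one():
        conn, _peer = listener.accept()
        with conn:
            request = b""
            while b"\r\n\r\n" not in request:  # read until end of headers
                chunk = conn.recv(4096)
                if not chunk:
                    break
                request += chunk
            body = b"hello"
            # The response is written to the connection the client opened.
            conn.sendall(
                b"HTTP/1.1 200 OK\r\n"
                b"Content-Length: 5\r\n"
                b"Connection: close\r\n\r\n" + body
            )

    worker = threading.Thread(target=serve_one)
    worker.start()

    client = socket.create_connection(server_addr)
    local_port = client.getsockname()[1]  # ephemeral port picked by the OS
    response = b""
    with client:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            response += chunk

    worker.join()
    listener.close()
    return response, local_port


if __name__ == "__main__":
    print(demo_http_roundtrip())
```

The client never disconnects between sending the request and reading the reply, and the server never opens a connection of its own.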
Given that you have multiple web browsers running, all which obviously listen on port 80
Not obvious: just wrong. The HTTP server listens on port 80. The browsers connect to port 80.
how would a browser figure if an incoming HTTP response was originated by itself?
Because it comes back on the same connection and socket that was used to send the request.
And whether or not catch the response and show it?
Anything that comes back on the connected socket belongs to the guy who connected the socket.
And in any case all this is the function of TCP, not the browser.

Http 1.1 connection and client port

Does the client's remote port change during an HTTP 1.1 connection exchange?
I am trying to figure out whether I can programmatically and uniquely identify a connection on the server using the request's remote port and remote IP address.
This is not as much an HTTP question, as it's a TCP one. And no, the port doesn't change: the ephemeral port stays the same for the duration of the connection.
However, as soon as a new connection is made, the client can (and most probably will) use a different port. This totally depends on the implementation of the client OS and the Network Address Translation of intermediary routers.
Anyway, it is not something you can depend on to build something like a session, because the next request from the same client may very well arrive from a different port (let alone that HTTP does not have to run on top of TCP).
Just use a session-ID which you store in a cookie.
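For example, a session lookup keyed on a cookie rather than on the remote address could look something like this Python sketch. The names handle_request, SESSIONS and the cookie name session_id are invented for illustration, and the in-memory dict stands in for whatever session store you actually use.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session id -> per-session data (in-memory stand-in)


def handle_request(cookie_header, remote_addr):
    """Return (session_id, set_cookie_value_or_None).

    remote_addr is accepted only to show that it is deliberately NOT used
    to identify the client: the port may differ on every connection.
    """
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("session_id")
    if morsel is not None and morsel.value in SESSIONS:
        SESSIONS[morsel.value]["hits"] += 1
        return morsel.value, None  # known session, nothing to set

    session_id = secrets.token_hex(16)  # unguessable random identifier
    SESSIONS[session_id] = {"hits": 1}
    out = SimpleCookie()
    out["session_id"] = session_id
    out["session_id"]["httponly"] = True
    return session_id, out["session_id"].OutputString()


if __name__ == "__main__":
    sid, set_cookie = handle_request(None, ("10.0.0.2", 12345))
    print(set_cookie)
    # Same browser, new TCP connection, different client port.
    print(handle_request(f"session_id={sid}", ("10.0.0.2", 12377)))
```

The second request arrives from a different port but is still matched to the same session, which is exactly what the remote IP and port cannot give you.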

How the clients (client sockets) are identified?

To my understanding by serverSocket = new ServerSocket(portNumber) we create an object which potentially can "listen" to the indicated port. By clientSocket = serverSocket.accept() we force the server socket to "listen" to its port and to "accept" a connection from any client which tries to connect to the server through the port associated with the server. When I say "client tries to connect to the server" I mean that client program executes "nameSocket = new Socket(serverIP,serverPort)".
If client is trying to connect to the server, the server "accepts" this client (i.e. creates a "client socket" associated with this client).
If a new client tries to connect to the server, the server creates another client socket (associated with the new client). But how does the server know whether it is a "new" client or an "old" one which already has its socket? Or, in other words, how are the clients identified? By their IP? By their IP and port? By some "signatures"?
What happens if an "old" client tries to use Socket(serverIP,serverPort) again? Will the server create a second socket associated with this client?
The server listens on an address and port. For example, your server's IP address is 10.0.0.1, and it is listening on port 8000.
Your client IP address is 10.0.0.2, and the client "connects" to the server at 10.0.0.1 port 8000. In the TCP connect, you are giving the port of the server that you want to connect to. Your client will actually get its own port number, but you don't control this, and it will be different on each connection. The client chooses the server port that it wants to connect to and not the client port that it is connecting from.
For example, on the first connection, your client may get client-side port 12345. It is connecting from 10.0.0.2 port 12345 to the server 10.0.0.1 port 8000. Your server can see what port the client is connecting from by calling getpeername on its side of the connection.
When the client connects a second time, the port number is going to be different, say port 12377. The server can see this by calling getpeername on the second connection -- it will see a different port number on the client side. (getpeername also shows the client's IP address.)
Also, each time you call accept on the server, you are getting a new socket. You still have the original socket listening, and on each accept you get a new socket. Call getpeername on the accepted socket to see which client port the connection is coming from. If two clients connect to your server, you now have three sockets -- the original listening socket, and the sockets of each of the two clients.
You can have many clients connected to the same server port 8000 at the same time. And, many clients can be connected from the same client port (e.g. port 12345), only not from the same IP address. From the same client IP address, e.g. 10.0.0.2, each client connection to the server port 8000 will be from a unique client port, e.g. 12345, 12377, etc. You can tell the clients apart by their combination of IP address and port.
The same client can also have multiple connections to the server at the same time, e.g. one connection from client port 12345 and another from 12377 at the same time. By client I mean the originating IP address, and not a particular software object. You'll just see two active connections having the same client IP address.
Also, eventually over time, the combination of client-address and client-port can be reused. That is, eventually, you may see a new client come in from 10.0.0.2 port 12345, long after the first client at 10.0.0.2 port 12345 has disconnected.
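You can watch this happen with a few lines of Python on loopback. This is only a sketch: the function name demo_two_connections is made up, and the port numbers you get will differ from the 12345 and 12377 used in the text above.

```python
import socket


def demo_two_connections():
    """Open two connections from one client and compare what the server sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # stand-in for port 8000
    listener.listen(2)
    server_addr = listener.getsockname()

    # The same client (same IP) connects twice; the OS picks each client port.
    c1 = socket.create_connection(server_addr)
    c2 = socket.create_connection(server_addr)

    # Each accept() returns a new socket; the listening socket stays open.
    s1, _ = listener.accept()
    s2, _ = listener.accept()

    peer1 = s1.getpeername()  # (client IP, client port) as seen by the server
    peer2 = s2.getpeername()
    local1 = c1.getsockname()  # the same endpoints as seen by the client
    local2 = c2.getsockname()

    for sock in (c1, c2, s1, s2, listener):
        sock.close()
    return peer1, peer2, local1, local2


if __name__ == "__main__":
    print(demo_two_connections())
```

Both peers report the same IP address but different ports, so the server ends up with three sockets in total: the listener plus one per accepted connection.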
Every TCP connection has as identifier the quadruple (src port, src address, dest port, dest address).
Whenever your server accepts a new client, a new Socket is created, and it is independent of every other socket created so far. The identification of clients is not implicitly handled for you.
You don't have to think of sockets as associated with "clients"; they are associated with an IP and a port, but there is no direct correlation between these two.
If the same client tries to open another socket by creating a new one, you'll have two unrelated sockets (because the ports will be different for sure). This is because the client cannot use the same port to open the new connection, so the quadruple will be different: same client IP, same server IP, same server port, but a different client port.
EDIT for your questions:
Clients don't specify a port because it is randomly chosen by the underlying operating system from the free ones (above 1024, if I'm not wrong).
A connection cannot be opened from a client using the same port; the operating system won't let you do that (actually you don't specify any port at all), and in any case it would tell you that the port is already bound to a socket, so this issue cannot happen.
Whenever the server receives a new connection request, it is considered new, because even if the IP is the same, the port will be different for sure (in the case of an old packet being resent or similar caveats, I think the request will be discarded).
By the way all these situations are clearly explained in TCP RFC here.
I think the question here is why do you care if the client is new or old. What is new and old?
For example, a web browser could connect to a web server to request a web page. This will create a connection so serverSocket.accept() will return a new Socket. Then the connection is closed by the web browser.
After a couple of minutes, the end user clicks on a link in the web page and the browser requests a new page from the server. This will create a connection, so serverSocket.accept() will return a new Socket.
Now, the web server does not care whether this is a new or old client. It just needs to serve the requested page. If the server does care whether the "client" already requested a page in the past, it should find out using some information in the protocol used on the socket. Check out http://en.wikipedia.org/wiki/OSI_model
In this case, the ServerSocket and Socket act on the transport level. The question "did this client already request a page from the server?" should be answered by information on the session or even application layer.
In the web browser/server example, the HTTP protocol (which is an application protocol) holds information about who this browser is in the parameters of the request (the browser transmits cookie information with every request). The HTTP server can then set/read cookie information to know whether the browser connected before, and eventually maintain a server-side session for that browser.
So back to your question: why do you care if it's a new or old client?
A socket is identified by:
(Local IP, Local Port, Remote IP, Remote Port, IP Protocol (UDP/TCP/SCTP/etc.))
And that's the information the OS uses to map the packets/data to the right handle/file descriptor of your program. For some kinds of sockets (e.g. a non-connected UDP socket), the remote port/remote IP might be wildcards.
By definition, this is not a Java-related question but one about networking in general, since Sockets and ServerSockets apply to any networking-enabled programming language.
A Socket is bound to a local port. The client will open a connection to the server (through the operating system/drivers/adapters/hardware/line/.../line/hardware/adapters/drivers/server OS). The data is carried by the Internet Protocol (IP) when you are connected to the Internet. When you use "Sockets", another protocol runs on top of it, which is TCP (hence TCP/IP).
The Internet Protocol identifies nodes on a network by their IP address; TCP adds the port numbers. TCP sends its messages using IP and makes sure they are correctly received.
Now, to answer your question: it all depends! It depends on your drivers, your adapters, your hardware, your line. When you connect to your localhost machine, you will not get further than the adapter. The hardware isn't necessary, since no data is actually sent over the line. (Though often you need hardware before you can have an adapter.)
By definition, a TCP connection is a pair of endpoints (thus four things: two IP addresses and two ports). Also, a given local port can be used for only one connection at a time to the same remote endpoint (note: this restriction matters for the client, not the server, whose single listening port serves many connections).
To answer your second question: suppose there are two Sockets, the "new" and the "old". Since a connection is a pair of endpoints, and a client port can only be used for one connection to the same server endpoint at a time, the ports of "new" and "old" must be different. And because they are different, the "new" client can be distinguished from the "old" one by its port number.

Resources