https://serverfault.com/questions/296603/understanding-ports-how-do-multiple-browser-tabs-communicate-at-the-same-time
how can an application use port 80/HTTP without conflicting with browsers?
How do multiple clients connect simultaneously to one port, say 80, on a server?
I have read the above questions but it seems the answers are inconsistent.
I want to know what exactly defines a socket connection, is it:
(sockid, source ip, source port, dest ip, dest port)
or only:
(source ip, source port, dest ip, dest port)
Can two different processes (e.g, two different browsers) communicate with a web server on the same source port? (the dest port would be the same by default)
What will happen in the case of different tabs in the same browser?
Also, as mentioned in one of the answers, a single web page can connect to multiple servers (e.g., ad servers) simultaneously. When connecting to multiple servers simultaneously, does the web browser (e.g., Chrome, Firefox) connect to each server using the same port, or does it use a different port for each server?
I know this is late, but since the thread is still on the internet, and because it's a common question with very few authoritative answers on the web, it deserves a more complete and concise explanation for those who may stumble into it, even at this late date.
You open a browser to pull up a web site, let's say google.com. In the process of specifying the web site your computer will arbitrarily pick a port number to use as its "source port." The number will be above 49152 which is the beginning of the "Dynamic, Private, or ephemeral ports" but below 65535 which is the highest port number available. The chosen port number is associated with that browser "instance."
Just for fun, you open a new tab on the browser and also type "google.com" on the url line. Your goal is to have two instances of google.com running because you're looking for different things. Your computer then chooses a second port number for that session, different from the "source" port it used for the first session. You could, in fact, do this many, many times, and each session will have a unique "source" port associated with each instance.
Your packet goes out to google.com, and the destination port number in every instance will be port 80. Web servers "listen" on port 80 for incoming connection requests. For each session with google.com you've opened, the destination port, from the perspective of your computer, will always be port 80, but for each instance of connection to google.com on your browser or browsers, the source port will uniquely identify one specific tab on one browser instance, uniquely. This is how they are differentiated on your computer.
google.com gets your first request. Its reply will specify port 80 as its source port. Here's where it gets interesting. Each query google.com received from you will be treated as a "socket," which is a combination of your IP address, and the specific port number associated with the process that contacted google.com. So, for instance we'll say your IP address is 165.40.30.12, and the source port number your computer used as its source port for four instances of communications with google.com (let's say four different tabs all opening google.com) were 61235, 62421, 58392, and 53925. These four "sockets" would then be 165.40.30.12:61235, 165.40.30.12:62421, 165.40.30.12:58392, and 165.40.30.12:53925. Each combination of "IP address:source port number" is unique, and google.com will treat each instance as unique. google.com responds to, and communicates with the "socket," that is, the combination of IP address and port.
Your computer gets its responses from google.com, and sorts them out by what it receives as its destination "socket" which it parses out by port numbers, assigning the response from google.com to the appropriate browser window or tab, as appropriate. No problem from your end of things as you see the correct response in the correct window every time.
"But," you're thinking, "what if someone else accidentally uses the same port number - let's say 61235 - as I've used for my source number? Doesn't this confuse google.com?" Not at all, because google.com is tracking "sockets" which are a combination of IP address and port number. Someone else, naturally, and we sincerely hope, is using a different IP address from the one you're using, let's say 152.126.11.27, and the combination of IP address and port number are unique - 152.126.11.27:61235 - differentiated by their different IP addresses, even though the port numbers are the same.
It doesn't matter if google.com gets 1000 queries from 1000 users, all using port 80 as their destination port number (the port google.com is listening to for incoming communications) because each of those 1000 users will all have unique IP addresses. google.com tracks its clients by their unique - and they always have to be unique, don't they? - socket numbers consisting of their IP address and "source" port number. Even if every one of those 1000 clients somehow managed to use the same "source" port number (unlikely to the max), they would still have differing IP addresses, making their source "socket" unique among all of the others.
This is all rather simple when you see it explained this way. It makes it clear that a web server can always listen on one port (80) and still differentiate from the various clients. It doesn't simply look at the IP address it received in the query - you'd think they should all be unique at first blush until you realize you could have more than one web page opened on that web server - but at the source port number, which in combination, make every query unique. I hope this is pretty clear. It's an elegant system when you think about it, yet simple once you understand it.
Taking your questions in turn:
A connection is defined by:
{ protocol, local IP, local port, remote IP, remote port }
(It's better to say local and remote rather than source and destination, because the local port is the source when you send, but the destination when you receive.)
The sockid is just a descriptor in the user process that maps to the connection in the kernel, just as a file descriptor maps to a file on disk that has been opened.
Two different processes cannot bind to the same local port. However, it's possible for two processes to use the same connection -- a socket descriptor can be inherited from a parent process to a child process, or the descriptor may be passed between processes using interprocess communications. The two processes would be using the same ports because they're actually sharing the same connection.
While the protocol allows use of the same local port when connecting to different remote servers or ports, most TCP stacks will not allow this. The API for binding a local port is the same whether you're using it for an outgoing connection or listening for an incoming connection, and the intention isn't specified until AFTER the port is bound. Since only one socket can listen on a particular port, the API simply refuses to allow multiple sockets to bind to a port (there's a special exception to this, but it's not relevant to this discussion). As a result, all outgoing connections will use different local ports. So when the browser opens multiple connections (either to the same or different web servers) they will have different local ports.
I have read the above question on respective forum but I guess people
there also do not agree with each other.
I see no disagreement in any of those concerning the matters you mention.
I want to know what exactly defines a socket Connection
( sockid , source ip , source port , dext ip , dest port )
or only
( source ip , source port , dext ip , dest port )
The latter. The former is a figment of imagination. It isn't mentioned in any of the threads you cite.
What I want to ask is that can two different processes like two
different Browsers communicate with a web server on same source
port.(Dest port would be same by default).
Not if they are at the same source IP. It would violate the identity definition stated above.
What in case of Different tabs in same browser.
Yes, because of connection pooling. If you are talking about separate connections, the answer is still no.
Also as mentioned in one of the answers a single web page tries to
connect to many different servers like ad servers etc. So does Chrome
or firefox for that matter connect with the same port to different
servers or use a single port.
You are going to have to explain this. What is the difference between 'the same port' and 'a single port'? Not a real question.
Related
Well-known services usually use a pre-defined port number on the server-side.
However, I realized that that is not always the case. Some services and games for example seem to pick a random port from a pre-defined range.
when you connect to a pre-defined port number, you send a request first, so the client's port can be determined, but if the service's port is not predetermined, how does the client know to which port to send the request? Also, what is the reason for always using a different port and how does this happen?
how does the client know to which port to send the request?
This depends on the specific protocol. For example with protocols like SIP, H.323 or FTP there are predefined port numbers for the signaling channel. The actual data transfer though is done by new connections on dynamic ports. These ports are advertised within the signaling channel.
In other cases there is no such signaling channel on a predefined port number. This is typically the case for servers which have no IANA assigned port number. It also happens when multiple instances (with different configurations) of the server should run on the same system and these simply cannot use the same port number. In this case the relevant IP and port might be advertised for example through DNS SRV records. And of course there might be other ways, like publishing the information on some web site or similar.
Also, what is the reason for always using a different port ...
Again it depends on the specific protocol. With SIP, H.323 or FTP for example the data connection is specific to the client and it will simply use a port which is free on the system for this. And there can be multiple connections at the same time from the same or from different clients which all use different ports. Any restrictions regarding the range of the port are usually only done to work better with firewalls, so that these don't need to open a huge port range but can allow a smaller port range and thus lower the attack surface.
... and how does this happen?
Just let the system pick a random port by not giving a specific value. Or if a port should be used from a range then it will simply figure out which port is available by trying to bind to the port and continue with the next if the binding failed.
if the service's port is not predetermined, how does the client know to which port to send the request?
The port has to be known ahead of time, entered by the user, or advertised somewhere the client can find it.
what is the reason for always using a different port
Many reasons: security, network/firewall restrictions, etc.
I need to find the port number of a server, I have the host name and the IP address.
Is this possible?
I need this as when I try to connect to this server through putty its throwing a Network error:Connection refused error, which may be because of the wrong port number
So you are looking for the port number the ssh server on that system listens on. Usually that is port 22 (well known ssh port), but you are right, this can be changed in the ssh server configuration. If so there are two possibilities yo have:
ask the administrator of the ssh server for the port number
make a network scan of the server which shows up all open ports. Note however that this can be regarded as offensive behavior and may be blocked in mid way.
But most likely you are facing another problem: some firewall blocking your requests or the ssh server not listening to request from outside at all.
And a side note: a server is a service, often listening on a port, you can interact with it typically by "speaking" a specific protocol. A system might refer to a computer running software, typically reachable via network these days. Many servers can be operated on a system. A system can be identified by its ip address. Many people confuse this and speak of a "server" when referring to such a "system" which is simply wrong and creates confusion from a technical point of view.
I'm trying to develop an applicaton for p2p communication between two android devices. In order to punch a hole through my NAT(s), I'd need to know my external ip address and port.
To that end, I've developed a java server on GAE to report my "remote" ip address and port. The problem is that on GAE I can get my ip address, but not my port. Without it, I'm unable to successfully punch the hole.
So, my question is what's the best, free method to find out my external IP address and port?
That's a question that has no answer with TCP.
Here's the problem: your "port" is not a fixed value. You don't have "an" external port. You typically get one dynamically assigned for each outbound connection.
As answers you should see from the test sites posted in another answer clearly indicate, it's a moving target (though it may stay stationary for a short time due to the browser using HTTP/1.1 keepalives and actually reusing the same connection, not just the same port)... but if you hit the site repeatedly, you'll see it either drift around randomly, or increment. Trying it from two different web browsers on the same machine, you'd never see the same port number -- the port corresponds to the specific source connection, not the machine sourcing the connection.
Sometimes, you may find that it's the same port number as the port your machine's stack opened for the outbound connection, but even when it is, it doesn't matter, because no traffic should be able to return to your machine on that port unless it is from the IP address and port of the machine to which you made the outbound connection. Any decent network address translating device would never accept traffic from another source IP address and/or port, other than the one you addressed in the outbound connection.
There is no standard, simple, predictable, reliable, or consistent way to punch a hole in TCP NAT and then exploit that hole for a peer-to-per connection. To the extent that such things are possible in a given NAT implementation, that is an implementation that is shoddy, broken, defective, and insecure.
See also: https://www.rfc-editor.org/rfc/rfc5128
Sounds like your app could use a STUN server to get its external address.
Based on my understanding, port numbers are just like telephone extensions. Just as a business telephone switchboard can use a main phone number and assign each employee an extension number (like x100, x101, etc.), so a computer has a main address and a set of port numbers to handle incoming and outgoing connections.
But the question is:
On what basis is a port number assigned? A process or an application?
Based on my experience with firewall, I usually open a port for a specific application. So port number should be assigned on an application's basis. But what if there're multiple instances of the same application running on a single machine. Each of the instances uses the same port number. So if a message is arrived at that port number, how could the system tell which instance should the message go?
And another question also related to port.
If a web server is setup to listen on port 80, client browser should always contact the 80 port. I am not sure if the following illustration of the communication between a web browser and the web server is correct.
Client Browser sent request to Server, the message should contain info like this:
To: < ServerAddress:80 >
From: < ClientAddress:XXX >
Server sent reponse to Client Browser like this:
To: < ClientAddress:XXX >
From: < ServerAddress:80 >
So the question is, will the server pick other port numbers for sending messages to client? Because I think a single 80 port doesn't look enough.
Add - 1 - 21:16 2010/12/19
In my above post, the word "application" represents a static program file that the system knows. Multiple instnaces of this application could be launched, which are multiple "processes"
Each client connection will be represented by a socket on the server. Sockets are uniquely represented by the combination of the following 4 pieces of information:
Peer IP address
Peer port
Local IP address
Local port
The client chooses a random port, so if there are multiple connections from one client to the same server/port, the connections will still differ by the client's port.
If there are multiple web server applications running on the same server, they will have to listen on different ports or the server will need to have multiple IP addresses.
On a computer, only one process can be listening on a specific port number. For example, if an Apache process is listening on port 80, no other application can also listen on port 80.
Apache usually pre-forks several processes, only one of those is listening on port 80. The job of that process is to hand over the processing for any connection to one of the pool of other Apache processes as quickly and efficiently as it can.
Each of many concurrent connections to port 80 is distinguished by it's source IP-address and by the source TCP port number (which the client computer chooses randomly from the set not in use).
(Edit)
I was pretty sure that webservers have one process (or thread) listening which accepts incoming connections and passes corresponding filehandles to the worker processes (or threads). EJP advises that this is not so.
Apache seems to have several different multi-processing modules that affect how it spreads the load of responding to multiple concurrent requests. For example: MPM Prefork and MPM Worker
Jeff Pozkaner wrote an overview of HTTP server design that I found interesting:
The basic operation of a web server is to accept a request and send back a response. The first web servers were probably written to do exactly that. Their users no doubt noticed very quickly that while the server was sending a response to someone else, they couldn't get their own requests serviced. There would have been long annoying pauses.
The second generation of web servers addressed this problem by forking off a child process for each request. …
A slight variant of this type of server uses "lightweight processes" or "threads" instead of full-blown Unix processes. …
The third generation of servers is called "pre-forking". Instead of starting a new subprocess for each request, they have a pool of subprocesses that they keep around and re-use. …
The fourth generation. One process only. No non-portable threads/LWPs. Sends multiple files concurrently using non-blocking I/O, calling select()/poll()/kqueue() to tell which ones are ready for more data. …
Network stack distinguishes TCP connections by triple <source IP,source port,destination port>, so knowing client address and port is enough to work correctly.
What is the application, if it is not a process? In firewalls you open ports for executables. It may be considered as an application, and it is a process when it is running.
Multiple listeners cannot listen to the same port. The same process can listen to multiple ports.
Ports are assigned to the listeners. Depending on the firewall (and its configuration) you can allow the process (executable) to listen several ports, or to create several exceptions for the same process listening to multiple ports.
I'm not sure what you mean by the difference between a "process" and an "application". Everything is just code executing on your box.
Anyway, a process/application will listen/bind to whatever port number the authors of the application have configured. By convention, many port numbers are reserved for particular types of application - that is applications which communicate using a particular protocol. So for example web servers which use HTTP typically run on port 80. SMTP servers run on port 22. HTTPS is 443 and so on.
Of course you can configure your web server (e.g apache httpd) to run on whatever port you like - but your client needs to know else it will assume port 80.
Two processes/applications may not bind to the same port. If you try to start another process/application on a port already in use you'll get an error: cannot bind to port or something to that effect.
will the server pick other port
numbers for sending messages to
client?
No. All the accepted sockets use the same server-side port number as the original listening socket. The identifying tuple mentioned above disambiguates this so as to make each connection unique.
From what I understand, each HTTP request uses its own TCP connection (please correct me if i'm wrong). So, let's say that there are two current connections to the same server. For example, client side javascript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to each separate callback function.
Now here's the thing that I don't understand: The TCP packet includes source and destination ip and port, but won't both of these connections have the same src and dest ip addresses, and port 80? How can the packets be differentiated and routed to appropriately? Does it have anything to do with the packet sequence number which is different for each connection?
When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
They're differentiated by the source port.
The main reason for each HTTP request to not generate a separate TCP connection is called keepalives, incidentally.
A socket, in packet network communications, is considered to be the combination of 4 elements: server IP, server port, client IP, client port. The second one is usually fixed in a protocol, e.g. http usually listen in port 80, but the client port is a random number usually in the range 1024-65535. This is because the operating system could use those ports for known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). The same network device can not use the same client port to open two different connections even to different servers and if two different clients use the same port, the server can tell them apart by their IP addresses. If a port is being used in a system either to listen for connection or to establish a connection, it can not be used for anything else. That's how the operating system can dispatch packets to the correct process once received by the network card.