Problem
Following LuaSocket Introduction I managed to get the server running. I also managed to connect from the client side. But to get this connection the client must know the server port number. In the example code the server port is 0, which means:
If port is 0, the system automatically chooses an ephemeral port.
I guess this approach has it's advantages, but how does the poor client is supposed to know which port to connect to?
Question
How to communicate an ephemeral port number from server to client? I assume there should be no human action in the process.
Code
server (from LuaSocket Introduction)
-- load namespace
local socket = require("socket")
-- create a TCP socket and bind it to the local host, at any port
local server = assert(socket.bind("*", 0))
-- find out which port the OS chose for us
local ip, port = server:getsockname()
-- print a message informing what's up
print("Please telnet to localhost on port " .. port)
print("After connecting, you have 10s to enter a line to be echoed")
-- loop forever waiting for clients
while 1 do
-- wait for a connection from any client
local client = server:accept()
-- make sure we don't block waiting for this client's line
client:settimeout(10)
-- receive the line
local line, err = client:receive()
-- if there was no error, send it back to the client
if not err then client:send(line .. "\n") end
-- done with client, close the object
client:close()
end
client (follows this answer)
local host, port = "127.0.0.1", 100
local socket = require("socket")
local tcp = assert(socket.tcp())
tcp:connect(host, port);
--note the newline below
tcp:send("hello world\n");
while true do
local s, status, partial = tcp:receive()
print(s or partial)
if status == "closed" then break end
end
tcp:close()
How to communicate an ephemeral port number from server to client? I assume there should be no human action in the process.I assume there should be no human action in the process.
You are right. The server example is quite odd. The servers generally should bind to specific port. It shouldn't be ephemeral. So that when the server restarts the client connects to the same port as earlier. Otherwise, the clients would be at loss if server ports keeps changing.
Clients, can indeed bind to ephemeral ports ( they generally do ). Servers should be bound to specific ones.
In TCP/IP connections the server always BIND to a known port. A server is not allowed to bind to port 0. It has to be available on a known port so clients can connect to it. You should change the example to bind the server to a fixed port and then use that fixed port in the client connect function.
Related
I'm using a socketConnection which takes the TCP port number port as a mandatory parameter. Now, how can I ensure that I specify an unused port for port (my process is the server)?
Edits:
I know that you can specify port = 0 so that the system allocates the port for you, but I have no means of subsequently finding out which port I was given.
Moreover, socketConnection(..., server = T) is blocking until the client process connects. But the client process doesn't know which port to connect to because the server process is being blocked and isn't able to determine the port number it got assigned. Catch 22!
It depends on whether you're creating a client socket or a server socket.
If you're creating a client socket, the port parameter species the target port. So it must be a port that is in use, as a server, otherwise you will get a connection refusal.
If you're creating a server socket, normally you will want a fixed port number, and if it's already in use it means your server is already running. If you want a random system-allocated port number for a server, and you have some other means of telling the clients what the port number is, just specify zero.
When TCP client requests conn'n on server's listening port, server will accept it and create a new port meant for this conn'n with this client. Hence forth the client will communicate with server on this new port.
if the above statement is true and possible, how server conveys the newly generated port to client. In reply to the conn'n request the packet from server to client will have what port as source port (Server's listening port OR New port generated by server for client).
Will Client accept this port and take into use or it will give error ? I need this to implement an architecture having 2 clients and one server in an embedded system using lwip stack.
regards,
ED
The server doesn't create a new port. It creates a new TCP connection and it sends its reply packets to the IP and port the client sent its connection request from. (A TCP connection has an IP address and port on each side.)
When you connect to a server, you get a port number yourself, which is assigned to you by the system (unless you bind the socket before connecting). When the network stack of the server replies to your connection request, the "source" port is the new port number of the server, and the "destination" port of the message is your port. That's how the network stack on the client side knows what port the server has.
The new port number on the server used for your connection can not be set or changed by the actual server program, it's the network stack on the server machine that just grabs an available port number.
Edit: You might also want to read up a little on how connections are established, a.k.a. the three-way handshake.
This question already has answers here:
Does the port change when a server accepts a TCP connection?
(3 answers)
Closed 4 years ago.
I understand the basics of how ports work. However, what I don't get is how multiple clients can simultaneously connect to say port 80. I know each client has a unique (for their machine) port. Does the server reply back from an available port to the client, and simply state the reply came from 80? How does this work?
First off, a "port" is just a number. All a "connection to a port" really represents is a packet which has that number specified in its "destination port" header field.
Now, there are two answers to your question, one for stateful protocols and one for stateless protocols.
For a stateless protocol (ie UDP), there is no problem because "connections" don't exist - multiple people can send packets to the same port, and their packets will arrive in whatever sequence. Nobody is ever in the "connected" state.
For a stateful protocol (like TCP), a connection is identified by a 4-tuple consisting of source and destination ports and source and destination IP addresses. So, if two different machines connect to the same port on a third machine, there are two distinct connections because the source IPs differ. If the same machine (or two behind NAT or otherwise sharing the same IP address) connects twice to a single remote end, the connections are differentiated by source port (which is generally a random high-numbered port).
Simply, if I connect to the same web server twice from my client, the two connections will have different source ports from my perspective and destination ports from the web server's. So there is no ambiguity, even though both connections have the same source and destination IP addresses.
Ports are a way to multiplex IP addresses so that different applications can listen on the same IP address/protocol pair. Unless an application defines its own higher-level protocol, there is no way to multiplex a port. If two connections using the same protocol simultaneously have identical source and destination IPs and identical source and destination ports, they must be the same connection.
Important:
I'm sorry to say that the response from "Borealid" is imprecise and somewhat incorrect - firstly there is no relation to statefulness or statelessness to answer this question, and most importantly the definition of the tuple for a socket is incorrect.
First remember below two rules:
Primary key of a socket: A socket is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} not by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT} - Protocol is an important part of a socket's definition.
OS Process & Socket mapping: A process can be associated with (can open/can listen to) multiple sockets which might be obvious to many readers.
Example 1: Two clients connecting to same server port means: socket1 {SRC-A, 100, DEST-X,80, TCP} and socket2{SRC-B, 100, DEST-X,80, TCP}. This means host A connects to server X's port 80 and another host B also connects to the same server X to the same port 80. Now, how the server handles these two sockets depends on if the server is single-threaded or multiple-threaded (I'll explain this later). What is important is that one server can listen to multiple sockets simultaneously.
To answer the original question of the post:
Irrespective of stateful or stateless protocols, two clients can connect to the same server port because for each client we can assign a different socket (as the client IP will definitely differ). The same client can also have two sockets connecting to the same server port - since such sockets differ by SRC-PORT. With all fairness, "Borealid" essentially mentioned the same correct answer but the reference to state-less/full was kind of unnecessary/confusing.
To answer the second part of the question on how a server knows which socket to answer. First understand that for a single server process that is listening to the same port, there could be more than one socket (maybe from the same client or from different clients). Now as long as a server knows which request is associated with which socket, it can always respond to the appropriate client using the same socket. Thus a server never needs to open another port in its own node than the original one on which the client initially tried to connect. If any server allocates different server ports after a socket is bound, then in my opinion the server is wasting its resource and it must be needing the client to connect again to the new port assigned.
A bit more for completeness:
Example 2: It's a very interesting question: "can two different processes on a server listen to the same port". If you do not consider protocol as one of the parameters defining sockets then the answer is no. This is so because we can say that in such a case, a single client trying to connect to a server port will not have any mechanism to mention which of the two listening processes the client intends to connect to. This is the same theme asserted by rule (2). However, this is the WRONG answer because 'protocol' is also a part of the socket definition. Thus two processes in the same node can listen to the same port only if they are using different protocols. For example, two unrelated clients (say one is using TCP and another is using UDP) can connect and communicate to the same server node and to the same port but they must be served by two different server processes.
Server Types - single & multiple:
When a server processes listening to a port that means multiple sockets can simultaneously connect and communicate with the same server process. If a server uses only a single child process to serve all the sockets then the server is called single-process/threaded and if the server uses many sub-processes to serve each socket by one sub-process then the server is called a multi-process/threaded server. Note that irrespective of the server's type a server can/should always use the same initial socket to respond back (no need to allocate another server port).
Suggested Books and the rest of the two volumes if you can.
A Note on Parent/Child Process (in response to query/comment of 'Ioan Alexandru Cucu')
Wherever I mentioned any concept in relation to two processes say A and B, consider that they are not related by the parent-child relationship. OS's (especially UNIX) by design allows a child process to inherit all File-descriptors (FD) from parents. Thus all the sockets (in UNIX like OS are also part of FD) that process A listening to can be listened to by many more processes A1, A2, .. as long as they are related by parent-child relation to A. But an independent process B (i.e. having no parent-child relation to A) cannot listen to the same socket. In addition, also note that this rule of disallowing two independent processes to listen to the same socket lies on an OS (or its network libraries), and by far it's obeyed by most OS's. However, one can create own OS which can very well violate this restriction.
TCP / HTTP Listening On Ports: How Can Many Users Share the Same Port
So, what happens when a server listen for incoming connections on a TCP port? For example, let's say you have a web-server on port 80. Let's assume that your computer has the public IP address of 24.14.181.229 and the person that tries to connect to you has IP address 10.1.2.3. This person can connect to you by opening a TCP socket to 24.14.181.229:80. Simple enough.
Intuitively (and wrongly), most people assume that it looks something like this:
Local Computer | Remote Computer
--------------------------------
<local_ip>:80 | <foreign_ip>:80
^^ not actually what happens, but this is the conceptual model a lot of people have in mind.
This is intuitive, because from the standpoint of the client, he has an IP address, and connects to a server at IP:PORT. Since the client connects to port 80, then his port must be 80 too? This is a sensible thing to think, but actually not what happens. If that were to be correct, we could only serve one user per foreign IP address. Once a remote computer connects, then he would hog the port 80 to port 80 connection, and no one else could connect.
Three things must be understood:
1.) On a server, a process is listening on a port. Once it gets a connection, it hands it off to another thread. The communication never hogs the listening port.
2.) Connections are uniquely identified by the OS by the following 5-tuple: (local-IP, local-port, remote-IP, remote-port, protocol). If any element in the tuple is different, then this is a completely independent connection.
3.) When a client connects to a server, it picks a random, unused high-order source port. This way, a single client can have up to ~64k connections to the server for the same destination port.
So, this is really what gets created when a client connects to a server:
Local Computer | Remote Computer | Role
-----------------------------------------------------------
0.0.0.0:80 | <none> | LISTENING
127.0.0.1:80 | 10.1.2.3:<random_port> | ESTABLISHED
Looking at What Actually Happens
First, let's use netstat to see what is happening on this computer. We will use port 500 instead of 80 (because a whole bunch of stuff is happening on port 80 as it is a common port, but functionally it does not make a difference).
netstat -atnp | grep -i ":500 "
As expected, the output is blank. Now let's start a web server:
sudo python3 -m http.server 500
Now, here is the output of running netstat again:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
So now there is one process that is actively listening (State: LISTEN) on port 500. The local address is 0.0.0.0, which is code for "listening for all". An easy mistake to make is to listen on address 127.0.0.1, which will only accept connections from the current computer. So this is not a connection, this just means that a process requested to bind() to port IP, and that process is responsible for handling all connections to that port. This hints to the limitation that there can only be one process per computer listening on a port (there are ways to get around that using multiplexing, but this is a much more complicated topic). If a web-server is listening on port 80, it cannot share that port with other web-servers.
So now, let's connect a user to our machine:
quicknet -m tcp -t localhost:500 -p Test payload.
This is a simple script (https://github.com/grokit/dcore/tree/master/apps/quicknet) that opens a TCP socket, sends the payload ("Test payload." in this case), waits a few seconds and disconnects. Doing netstat again while this is happening displays the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:54240 ESTABLISHED -
If you connect with another client and do netstat again, you will see the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:26813 ESTABLISHED -
... that is, the client used another random port for the connection. So there is never confusion between the IP addresses.
Normally, for every connecting client the server forks a child process that communicates with the client (TCP). The parent server hands off to the child process an established socket that communicates back to the client.
When you send the data to a socket from your child server, the TCP stack in the OS creates a packet going back to the client and sets the "from port" to 80.
Multiple clients can connect to the same port (say 80) on the server because on the server side, after creating a socket and binding (setting local IP and port) listen is called on the socket which tells the OS to accept incoming connections.
When a client tries to connect to server on port 80, the accept call is invoked on the server socket. This creates a new socket for the client trying to connect and similarly new sockets will be created for subsequent clients using same port 80.
Words in italics are system calls.
Ref
http://www.scs.stanford.edu/07wi-cs244b/refs/net2.pdf
To my understanding by serverSocket = new ServerSocket(portNumber) we create an object which potentially can "listen" to the indicated port. By clientSocket = serverSocket.accept() we force the server socket to "listen" to its port and to "accept" a connection from any client which tries to connect to the server through the port associated with the server. When I say "client tries to connect to the server" I mean that client program executes "nameSocket = new Socket(serverIP,serverPort)".
If client is trying to connect to the server, the server "accepts" this client (i.e. creates a "client socket" associated with this client).
If a new client tries to connect to the server, the server creates another client socket (associated with the new client). But how the server knows if it is a "new" client or an "old" one which has already its socket? Or, in other words, how the clients are identified? By their IP? By their IP and port? By some "signatures"?
What happens if an "old" client tries to use Socket(serverIP,serverIP) again? Will server create the second socket associated with this client?
The server listens on an address and port. For example, your server's IP address is 10.0.0.1, and it is listening on port 8000.
Your client IP address is 10.0.0.2, and the client "connects" to the server at 10.0.0.1 port 8000. In the TCP connect, you are giving the port of the server that you want to connect to. Your client will actually get its own port number, but you don't control this, and it will be different on each connection. The client chooses the server port that it wants to connect to and not the client port that it is connecting from.
For example, on the first connection, your client may get client-side port 12345. It is connecting from 10.0.0.2 port 12345 to the server 10.0.0.1 port 8000. Your server can see what port the client is connecting from by calling getpeername on its side of the connection.
When the client connects a second time, the port number is going to be different, say port 12377. The server can see this by calling getpeername on the second connection -- it will see a different port number on the client side. (getpeername also shows the client's IP address.)
Also, each time you call accept on the server, you are getting a new socket. You still have the original socket listening, and on each accept you get a new socket. Call getpeername on the accepted socket to see which client port the connection is coming from. If two clients connect to your server, you now have three sockets -- the original listening socket, and the sockets of each of the two clients.
You can have many clients connected to the same server port 8000 at the same time. And, many clients can be connected from the same client port (e.g. port 12345), only not from the same IP address. From the same client IP address, e.g. 10.0.0.2, each client connection to the server port 8000 will be from a unique client port, e.g. 12345, 12377, etc. You can tell the clients apart by their combination of IP address and port.
The same client can also have multiple connections to the server at the same time, e.g. one connection from client port 12345 and another from 12377 at the same time. By client I mean the originating IP address, and not a particular software object. You'll just see two active connections having the same client IP address.
Also, eventually over time, the combination of client-address and client-port can be reused. That is, eventually, you may see a new client come in from 10.0.0.2 port 12345, long after the first client at 10.0.0.2 port 12345 has disconnected.
Every TCP connection has as identifier the quadruple (src port, src address, dest port, dest address).
Whenever your server accepts a new client, a new Socket is created and it's indipendent from every other socket created so far. The identification of clients is not implictly handled somehow..
You don't have to think sockets as associated to "clients", they are associated with an ip and a port, but there is not direct correlation between these two.
If the same client tries to open another socket by creating a new one you'll have two unrelated sockets (because ports will be different for sure). This because the client cannot use the same port to open the new connection so the quadruple will be different, same client ip, same server ip, same server port but different client port.
EDIT for your questions:
clients don't specify a port because it's randomly choosen from the free ones (> 1024 if I'm not wrong) from the underlying operating system
a connection cannot be opened from a client using the same port, the operating system won't let you do that (actually you don't specify any port at all) and in any case it would tell you that port is already bound to a socket so this issue cannot happen.
whenever the server receives a new connection request it's is considered new, because also if ip is the same port will be different for sure (in case of old packet resend or similar caveats I think that the request will be discarded)
By the way all these situations are clearly explained in TCP RFC here.
I think the question here is why do you care if the client is new or old. What is new and old?
For example, a web browser could connect to a web server to request a web page. This will create a connection so serverSocket.accept() will return a new Socket. Then the connection is closed by the web browser.
Afer a couple of minutes, the end used click on a link in the web page and the browser request a new page to the server. This will create a connection so serverSocket.accept() will return a new Socket.
Now, the web server do not care if this is a new or old client. It just need to server the requested page. If the server do care if the "client" already requested a page in the past, it should do so using some information in the protocol used on the socket. Check out http://en.wikipedia.org/wiki/OSI_model
In this case, the ServerSocket and Socket ack on the transport level. The question "does this client already requested a page on the server" should be answered by information on the session or even application layer.
In the web browser/server example, the http protocol (which is an application) protocol hold information about who is this browser in the parameters of the request (the browser transmit cookie informations with every request). The http server can then set/read cookie information to known if the browser connected before and eventually maintain a server side session for that browser.
So back to your question: why do you care if it's a new or old client?
A socket is identified by:
(Local IP,Local Port, Remote IP,
Remote Port,IP Protocol(UDP/TCP/SCTP/etc.)
And that's the information the OS uses to map the packets/data to the right handle/file descriptor of your program. For some kinds of sockets,(e.g. an non-connected UDP socket)the remote port/remote IP might be wildcards.
By definition, this is not a Java related question, but about networking in general, since Sockets and SeverSockets apply to any networking-enabled programming language.
A Socket is bounded to a local-port. The client will open a connection to the server (by the Operating System/drivers/adapters/hardware/line/.../line/hardware/adapters/drivers/Server OS). This "connection" is done by a protocol, called the IP (Internet Protocol) when you are connected to the Internet. When you use "Sockets", it will use another protocol, which is the TCP/IP-protocol.
The Internet Protocol will identify nodes on a network by two things: their IP-address and their port. The TCP/IP-protocol will send messages using the IP, and making sure messages are correctly received.
Now; to answer your question: it all depends! It depends on your drivers, your adapters, your hardware, your line. When you connect to your localhost machine, you will not get further than the adapter. The hardware isn't necessairy, since no data is actually sent over the line. (Though often you need hardware before you can have an adapter.)
By definition, the Internet Protocol defines a connection as pair of nodes (thus four things: two IP-adresses and two ports). Also, the Internet Protocol defines that one node can only use one port at a time to initiate a connection with another node (note: this only applies for the client, not the server).
To answer your second question: if there are two Sockets: the "new" and the "old". Since, by the Internet Protocol, a connection is a pair of nodes, and nodes can only use one port at a time for a connection, the ports of "new" and "old" must be different. And because this is different, the "new" client can be discriminated from the "old", since the port-number is differently.
Some things look strange to me:
What is the distinction between 0.0.0.0, 127.0.0.1, and [::]?
How should each part of the foreign address be read (part1:part2)?
What does a state Time_Wait, Close_Wait mean?
etc.
Could someone give a quick overview of how to interpret these results?
0.0.0.0 usually refers to stuff listening on all interfaces.
127.0.0.1 = localhost (only your local interface)
I'm not sure about [::]
TIME_WAIT means both sides have agreed to close and TCP
must now wait a prescribed time before taking the connection
down.
CLOSE_WAIT means the remote system has finished sending
and your system has yet to say it's finished.
I understand the answer has been accepted but here is some additional information:
If it says 0.0.0.0 on the Local Address column, it means that port is listening on all 'network interfaces' (i.e. your computer, your modem(s) and your network card(s)).
If it says 127.0.0.1 on the Local Address column, it means that port is ONLY listening for connections from your PC itself, not from the Internet or network. No danger there.
If it displays your online IP on the Local Address column, it means that port is ONLY listening for connections from the Internet.
If it displays your local network IP on the Local Address column, it means that port is ONLY listening for connections from the local network.
Foreign Address - The IP address and port number of the remote computer to which the socket is connected. The names that corresponds to the IP address and the port are shown unless the -n parameter is specified. If the port is not yet established, the port number is shown as an asterisk (*). (from wikipedia)
What is the distinction between 0.0.0.0, 127.0.0.1, and [::]?
0.0.0.0 indicates something that is listening on all interfaces on the machine.
127.0.0.1 indicates your own machine.
[::] is the IPv6 version of 0.0.0.0
My machine also shows *:\* for UDP which shows that UDP connections don't really have a foreign address - they receive packets from any where. That is the nature of UDP.
How should each part of the foreign address be read (part1:part2)?
part1 is the hostname or IP addresspart2 is the port
127.0.0.1 is your loopback address also known as 'localhost' if set in your HOSTS file. See here for more info: http://en.wikipedia.org/wiki/Localhost
0.0.0.0 means that an app has bound to all ip addresses using a specific port. MS info here: http://support.microsoft.com/default.aspx?scid=kb;en-us;175952
'::' is ipv6 shorthand for ipv4 0.0.0.0.
Send-Q is the amount of data sent by the application, but not yet acknowledged by the other side of the socket.
Recv-Q is the amount of data received from the NIC, but not yet consumed by the application.
Both of these queues reside in kernel memory.
There are guides to help you tweak these kernel buffers, if you are so inclined. Although, you may find the default params do quite well.
This link has helped me a lot to interpret netstat -a
A copy from there -
TCP Connection States
Following is a brief explanation of this handshake. In this context the "client" is the peer requesting a connection and the "server" is the peer accepting a connection. Note that this notation does not reflect Client/Server relationships as an architectural principal.
Connection Establishment
The client sends a SYN message which contains the server's port and the client's Initial Sequence Number (ISN) to the server (active open).
The server sends back its own SYN and ACK (which consists of the client's ISN + 1).
The Client sends an ACK (which consists of the server's ISN + 1).
Connection Tear-down (modified three way handshake).
The client sends a FIN (active close). This is a now a half-closed connection. The client no longer sends data, but is still able to receive data from the server. Upon receiving this FIN, the server enters a passive close state.
The server sends an ACK (which is the clients FIN sequence + 1)
The server sends its own FIN.
The client sends an ACK (which is server's FIN sequence + 1). Upon receiving this ACK, the server closes the connection.
A half-closed connection can be used to terminate sending data while sill receiving data. Socket applications can call shutdown with the second argument set to 1 to enter this state.
State explanations as shown in Netstat:
State Explanation
SYN_SEND Indicates active open.
SYN_RECEIVED Server just received SYN from the client.
ESTABLISHED Client received server's SYN and session is established.
LISTEN Server is ready to accept connection.
NOTE: See documentation for listen() socket call. TCP sockets in listening state are not shown - this is a limitation of NETSTAT. For additional information, please see the following article in the Microsoft Knowledge Base:
134404 NETSTAT.EXE Does Not Show TCP Listen Sockets
FIN_WAIT_1 Indicates active close.
TIMED_WAIT Client enters this state after active close.
CLOSE_WAIT Indicates passive close. Server just received first FIN from a client.
FIN_WAIT_2 Client just received acknowledgment of its first FIN from the server.
LAST_ACK Server is in this state when it sends its own FIN.
CLOSED Server received ACK from client and connection is closed.
For those seeing [::] in their netstat output, I'm betting your machine is running IPv6; that would be equivalent to 0.0.0.0, i.e. listen on any IPv6 address.