Discovering free ports - tcp

I wrote an server application in erlang and a client in C#. They communicate through 3 TCP ports. Port numbers are hardcoded. Now I'd like to do this dynamically. This is my first time doing network programming, so please pardon my inability to use proper terminology :-D
What I would like to do is make a supervisor which would accept a TCP connection from a client on a previously known port (say, 10000, or whatever), then find 3 free ports, start a server application on those 3 ports and tell the client those port numbers so client can connect to the server.
My particular problem is: how do I find 3 ports which are not in use? (clarification: which module:fun() to use to find a free port?)
My general problem is: I'm sure this kind of stuff (one server allocating ports and redirecting clients) is quite common network programming problem and there should be a bunch of (erlang-specific or general) resources about this, but I just don't have the terminology to google it out.

According to the Erlang documentation here, if the Port argument to the gen_tcp:listen/2 function is 0, then the OS will assign any available port to the socket. The latter can then be retreived using inet:port/1 .
You can therefore do something like this :
{ok, Listen} = gen_tcp:listen(0, [Options]),
Port = inet:port(Listen).

just in case you didn't know that - you dont have to allocate new ports for each client, it's perfectly fine to have all clients to connect to same ports
UPDATE:
if there is a reason to allocate new ports for incoming clients then it's far beyond your first "introduction to network programming" program.
separate ports could mean you want to completely isolate environments of different groups of clients. it's comparable to providing completely different IP addresses to connect to. if you want to write a simple ping-pong program - you don't need it. and i honestly believe you will never need to use such solution in your whole life - that's how incredibly rarely it is.
regarding cpu/ports overhead - allocating ports and starting a server that listens to that port is already far bigger overhead than accepting clients on same port.

You need to avoid commonly known ports, ftp, http, smtp etc, But I don't think there is any master list of which ports other software uses that you should avoid. I think your best bet is to come up with a range of ports you want to use. Check at runtime if anybody else answers ( is using the ports ) on the numbers you choose dynamically, if not issue it to the client.

Related

HTTP through a domain socket

I'm writing a bit of desktop software which has two components. Component B queries component A. Creating a web service seems like an ideal way to do IPC in principle. The data model fits, there are ready-made client and server libraries, a well known way to encode and decode parameters etc.
But setting up an HTTP server on a network socket doesn't seem right for a local application. For example what port do I choose? I don't really want people to be able to scan and talk to the app from outside etc.
So I was thinking that I might be able to do HTTP over a domain socket. Does that make any sense? Is there any precedence for it? Is there an equivalent protocol that I could use for IPC which has the same properties as HTTP (requests for specified resources (URIs), encoded parameters, response)?
Looking for C libraries (and possibly Go and ObjC for bonus points).
Binding to the loopback interface only (127.0.0.1) solves your "external visibility" problem, only processes on the local machine will be able to connect.
It does not solve your port allocation problem though, the port number you choose might be taken by the time your app starts. Then your server can't bind and your client connects to the other process bound to your port.
Old, less hip, but CORBA implementations tend to have the problems you have not thought of yet figured out already.

Creating a networking application that can work over internet connections

I have a somewhat basic understanding of network programming (and networking concepts in general) from taking a networking course in university a few years ago.
I remember being able to create a simple chat application, where the chat server is used as a central directory aware of which clients are currently online, but once a client knows another client it wants to chat with, the actual messages between them don't need to go through the server. I remember we could only test this over a bunch of LAN machines.
This C# chat program also has several comments mentioning that the program does not work over the internet: http://www.geekpedia.com/tutorial239_Csharp-Chat-Part-1---Building-the-Chat-Client.html
My question is why do these applications not work over the internet when "commercial" chat applications can. Surely, there is some way to make my computer accessible to the outer network even if its IP address is not valid outside the network of the ISP.
I see no problem with the linked-to code. The server doesn't even bind to a local address, which means it will listen for connections on all ip-addresses on the computer. There is however a comment for in the server article where the user changed the TcpListener object creation to bind to a specific address, which means clients only can connect to that specific address.
In the original server design, with using TcpListenet with only a port number, there should be nothing preventing its use on an Internet connected computer, unless there is a firewall blocking access.
Were you aware of networkComms.net and in particular the short chat example demonstrating the functionality here (It's less than 15 lines of code)? This was written specifically for people writing server-client apps in c# and given most of the problems you might come across will already have been solved and it might save you some time. This library is completely plug & play and has no issues working over the internet (as long as you can setup the necessary port forwarding where necessary).
Generally if both of your targets are behind NAT (so no true external ip addresses) and you are unable to configure port forwarding you need to look at 'TCP / UDP hole punching', quite an advanced technique.

How to detect router?

I am trying to write a program that scan an ip range and detect if an ip is address of a router or not.
Currently i used traceroute from my computer to all host in the network. However, i believe there must be some way to directly "ask" a host at an ip if it is a router or not?
by the way, do you know any program/ opensource already does this?
Routers are supposed to talk couple of protocols (actually a neat bunch) that regular IP nodes do not, and then there are some which are more common (i.e. even non-router nodes do).
Router-only protocols:
VRRP
IGRP / EIGRP
OSPF
BGP
RIP
You could do active-probing on those, i.e. send a packet (behaving as if you are another router, or an end-node) and checking to see what kind of response the router (if at all) sends.
Alternatively you could do passive-probing, like 'sniffing', i.e. watching out for the kind of IP packets being sent out by various nodes. There are some which are usually sent out by Routers only (again, mostly from the above list).
Common protocol, but that can actually tell you a lot:
SNMP (esply the unsecure one's like v1/v2, are easy to deal with, without having to establish a secure session)
Other ways:
Portscanning (actually can tell you a real lot), for example all routers have some management ports (although, often they are locked down due to security concerns)
What you want to do is often what many 'Network Management' software do, to "discover" capabilities / functionality of other nodes in the network. And, there isn't a single size-fits all solution. They use bunch of different methods, heuristics to finally figure out what the other node is.
Any node which is hopped to and not just an endpoint is a router. However, this doesn't allow you to detect routers with no reachable devices hooked up. (Any input as to whether my answer has merit would be great!)

How to retain one million simultaneous TCP connections?

I am to design a server that needs to serve millions of clients that are simultaneously connected with the server via TCP.
The data traffic between the server and the clients will be sparse, so bandwidth issues can be ignored.
One important requirement is that whenever the server needs to send data to any client it should use the existing TCP connection instead of opening a new connection toward the client (because the client may be behind a firewall).
Does anybody know how to do this, and what hardware/software is needed (at the least cost)?
What operating systems are you considering for this?
If using a Windows OS and using something later than Vista then you shouldn't have a problem with many thousands of connections on a single machine. I've run tests (here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html) with a low spec Windows Server 2003 machine and easily achieved more than 70,000 active TCP connections. Some of the resource limits that affect the number of connections possible have been lifted considerably on Vista (see here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html) and so you could probably achieve your goal with a small cluster of machines. I don't know what you'd need in front of those to route the connections.
Windows provides a facility called I/O Completion Ports (see: http://msdn.microsoft.com/en-us/magazine/cc302334.aspx) which allow you to service many thousands of concurrent connections with very few threads (I was running tests yesterday with 5000 connections saturating a link to a server with 2 threads to process the I/O...). Thus the basic architecture is very scalable.
If you want to run some tests then I have some freely available tools on my blog that allow you to thrash a simple echo server using many thousands of connections (1) and (2) and some free code which you could use to get you started (3)
The second part of your question, from your comments, is more tricky. If the client's IP address keeps changing and there's nothing between you and them that is providing NAT to give you a consistent IP address then their connections will, no doubt, be terminated and need to be re-established. If the clients detect this connection tear down when their IP address changes then they can reconnect to the server, if they can't then I would suggest that the clients need to poll the server every so often so that they can detect the connection loss and reconnect. There's nothing the server can do here as it can't predict the new IP address and it will discover that the old connection has failed when it tries to send data.
And remember, your problems are only just beginning once you get your system to scale to this level...
This problem is related to the so-called C10K problem. The C10K page lists a large number of good resources for addressing the problems you will encounter when you try to allow thousands of clients to connect to the same server.
I've come across the APE Project
a while back. It seems like a dream come true. They can support up to 100k concurrent clients on a single node. Spread them across 10 or 20 nodes, and you can serve millions. Perfect for RESTful applications. Might want to look deeper for any shared namespace. One drawback is that this is a standalone server, as in supplementary to a web server. This server is of course Open Source, so any cost is hardware/ISP related.
You cannot use UDP. If the client sends a request and you don't reply immediately, a router is going to forget the reverse route in 30 seconds or less, so your server will never be able to reply to the client.
TCP is the only option, and it, too, will give you headaches. Most routers are going to forget the route and/or drop the connection after a few minutes, so your client/server code is going to have to send "keep alives" fairly often.
I recommend setting up a "sniffer", to see how the phone companies are staying in touch with your smartphone for their "push" technology. Copy whatever they're doing, because that stuff works!
As Greg mentioned, the problem you are describing is C10K (or rather "C1M" in your case )
I recently made a simple TCP echo server on linux that scales very well with the number of sessions (only tested up to 200.000 though), by using the epoll queue. On BSD, you have something similar called kqueue.
You can check out the code if you want to. Hope this helps and good luck!
EDIT: As noted in the comments below, my original assertion that there is a 64K limit based on the number of ports is incorrect, however there is a 32K limit on the number of socket handles, so my suggested design is valid.
With a typical TCP/IP server design, you're limited in the number of simultaneous open connections you can have. The server has one listening port, and when a client connects to it the server makes an accept call, and that creates a new socket on a random port for the rest of the connection.
To handle more than 64K simultaneous connections I think you need to use UDP instead. You only need one port for the server to listen on, and you need to manage the connections using a 32-bit client ID in the packet data instead of having a separate port for each client. The 32-bit client ID could be the client's IP address, and the client can listen on a known UDP port for messages coming back from the server. That port would be the only one that needs to be open on the firewall.
With this approach, your only limitation is how quickly you can handle and respond to UDP messages. With millions of clients, even sparse traffic could give you large spikes, and if you don't read the packets fast enough your input queue will fill up and you'll start dropping packets. The C10K page Greg points to will give you strategies for that.

P2P network games/apps: Good choice for a "battle.net"-like matching server

I'm making a network game (1v1) where in-game its p2p - no need for a game server.
However, for players to be able to "find each other", without the need to coordinate in another medium and enter IP addresses (similar to the modem days of network games), I need to have a coordination/matching server.
I can't use regular web hosting because:
The clients will communicate in UDP.
Therefore I'll need to do UDP Hole Punching to be able to go through the NAT
That would require the server to talk in UDP and know the client's IP and port
afaik with regular web hosting (php/etc) I can only get the client's IP address and can only communicate in TCP (HTTP).
Options I am currently considering:
Use a hosting solution where my program can accept UDP connection. (any recommendations?)
UDPonNAT seems to do this but uses GTalk and requires each client to have a GTalk account for this (which probably makes it an unsuitable solution)
Any ideas? Thanks :)
First, let me say that this is well out of my realm of expertise, but I found myself very interested, so I've been doing some searching and reading.
It seems that the most commonly prescribed solution for UDP NAT traversal is to use a STUN server. I did some quick searches to see if there are any companies that will just straight-up provide you with a STUN hosting solution, but if there even were any, they were buried in piles of ads for simple web hosting.
Fortunately, it seems there are several STUN servers that are already up and running and free for public use. There is a list of public STUN servers at voip-info.org.
In addition, there is plenty more information to be had if you explore SO questions tagged "nat".
I don't see any other choice than to have a dedicated server running your code. The other solutions you propose are, shall we say, less than optimal.
If you start small, virtual hosting will be fine. Costs are pretty minimal.
Rather than a full-blown dedicated server, you could just get a cheap shared hosting service and have the application interface with a PHP page, which in turn interfaces with a MySQL database backend.
For example, Lunarpages has a $3/month starter package that includes 5gb of space and 50gb of bandwidth. For something this simple, that's all you should need.
Then you just have your application poll the web page for the list of games, and submit a POST request in order to add their own game to the list.
Of course, this method requires learning PHP and MySQL if you don't already know them. And if you do it right, you can have the PHP page enter a sort of infinite loop to keep the connection open and just feed updates to the client, rather than polling the page every few seconds and wasting a lot of bandwidth. That's way outside the scope of this answer though.
Oh, and if you're looking for something absolutely free, search for a free PHP host. Those exist too! Even with an ad-supported host, your app could just grab the page and ignore the ads when you parse the list of games. I know that T35 used to be one of my favorites because their free plan doesn't track space or bandwidth (it limits the per-file size, to eliminate their service being used as a media share, but it shouldn't be a problem for PHP files). But of course, I think in the long run you'll be better off going with a paid host.
Edit: T35 also says "Free hosting allows 1 domain to be hosted, while paid offers unlimited domain hosting." So you can even just pay for a domain name and link it to them! I think in the short term, that's your best (cheapest) bet. Of course, this is all assuming you either know or are willing to learn PHP in order to make this happen. :)
There's nothing that every net connection will support. STUN is probably good, UPnP can work for this.
However, it's rumored that most firewalls can be enticed to pass almost anything through UDP port 53 (DNS). You might have to argue with the OS about your access to that port though.
Also, check out SIP, it's another protocol designed for this sort of thing. With the popularity of VOIP, there may be decent built-in support for this in more firewalls.
If you're really committed to UDP, you might also consider tunneling it over HTTP.
how about you break the problem into two parts - make a game matcher client (that is distinct from the game), which can communicate via http to your cheap/shared webhost. All gamers who wants to use the game matching function use this. THe game matcher client then launches the actual game with the correct parameters (IP, etc etc) after obtaining the info from your server.
The game will then use the standard way to UDP punch thru NAT, etc etc, as per your network code. The game dont actually need to know anything about the matcher client or matcher server - in the true sense of p2p (like torrents, once you can obtain your peer's IPs, you can even disconnect from the tracker).
That way, your problems become smaller.
An intermediate solution between hosting your own dedicated server and a strictly P2P networking environment is the gnutella model. In that model, there are superpeers that act like local servers, having known IP addresses and being connected to (and thus having knowledge of) more clients than a typical peer. This still requires you to run at least one superpeer yourself, but it gives you the option to let other people run their own superpeers.

Resources