Dear StackOverflow members,
I was wondering, let's say with WhatsApp... you're continuously connected to their servers (using TCP).
And assuming there's a max of 65535 connections/port, how do they avoid that limit?
Seeing as that would mean that once a server hits 65535 connections it would stay at that limit forever and never go back down, since everyone's phone simply stays connected.
I'm not sure if you guys understand my question, but if you have any questions feel free to ask.
Kind regards,
Rene Roosen
Any large website wouldn't rely on one server. They'd usually put a load-balancing proxy in front (commercial, or open-source ones like ATS or HAProxy) and run several servers behind it. Those proxies have mechanisms to scale to far higher connection counts.
As long as the 4-tuple (source IP, source port, destination IP, destination port) is unique, a proxy can handle the connection, provided other resources (memory, CPU, etc.) are available. Traffic isn't restricted to 64K connections per port.
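To make the 4-tuple point concrete, here is a minimal sketch (the host and port are placeholders for a service you control): every connection below targets the same destination IP and port, and they coexist happily because each one gets a different ephemeral source port.

```python
import socket

# Open several client connections to the *same* server port. The destination
# (IP, port) is identical each time; only the ephemeral source port differs,
# so each connection has a unique 4-tuple. SERVER_HOST/SERVER_PORT are
# placeholders for a TCP service you control.
SERVER_HOST, SERVER_PORT = "127.0.0.1", 8080

conns = []
for _ in range(5):
    s = socket.create_connection((SERVER_HOST, SERVER_PORT))
    conns.append(s)
    print("local", s.getsockname(), "-> remote", s.getpeername())

for s in conns:
    s.close()
```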
We published the game on a Russian server, and about 1% of players couldn't connect to the server on a 46xx port over raw TCP, even though they could load its HTML page (over HTTP). Most of these people live in Germany, Israel....
Why is that? What policy decisions lie behind it? We discovered that, for those users, these ports (which are not reserved by IANA) are closed. Does that mean such people can't run Steam (and therefore play any game bought through it), play WoW, or play the many other modern games that use TCP over 4xxx ports?
Thank you.
ISPs have been known to filter certain ports for various reasons. Users should complain loudly to them (or switch providers) in order to send a signal that this is not to be tolerated. You can encourage them to do so, but of course that doesn't solve your problem (or really answer your question).
Common reasons are:
- trying to block bittorrent traffic
- limit bandwidth usage (largely related to previous reason)
- security (mistaken)
- control (companies often don't want employees goofing off)
The easiest thing for you to do is run your game over port 443 (perhaps as an alternate). That's HTTPS, so it will not generally be blocked. And because HTTPS is encrypted, there's no way to inspect the stream to tell whether it's web traffic or something else, so you can run any data stream (encrypted or not) that you wish over it.
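As a rough sketch of that fallback idea (the port numbers are made up, and binding 443 normally needs root/administrator rights), the same accept loop can simply listen on both the game port and 443:

```python
import socket
import threading

# Hypothetical sketch: accept game connections on the usual 46xx port and on
# 443 as a fallback for players whose ISPs filter the high ports. Both port
# numbers and the handler are placeholders.
GAME_PORT, FALLBACK_PORT = 4600, 443

def serve(port):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", port))
    srv.listen()
    while True:
        conn, addr = srv.accept()
        # ... hand conn off to the normal game protocol handler ...
        conn.close()

threads = [threading.Thread(target=serve, args=(p,)) for p in (GAME_PORT, FALLBACK_PORT)]
for t in threads:
    t.start()
```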
That's precisely right. In fact, every public-facing site will by default block all ports except the ones on which it expects to run the traffic it actually wants.
This is the reason many applications try to tunnel their traffic over port 80, which can't be blocked as long as anyone wants HTTP traffic to get through.
They simply don't want any application they haven't approved running through their network. If you had a sensitive server exposed to the public, you surely wouldn't want anyone using your machine for apps you don't allow. A common target is applications that eat up bandwidth, such as BitTorrent, eDonkey, and Gnutella, as well as streaming, VoIP, and other high-bandwidth apps.
Sorry for the banal question, but I would like to know: if I launch many, many HTTP requests at a site, will the site go down, or will I just be banned after some request limit (according to the server settings)?
As is almost always the case, the answer is: it depends :)
Now, seriously, it depends on a few things, but if you're just hitting "a" server from a single client computer, you most likely won't be able to make it go down, as servers usually have more bandwidth (networking and processing power) available to them than most client machines. Plus, when you make requests to a website you might actually be hitting a load-balanced network of servers, which definitely means a single client won't be able to outpace the site. In either case your IP will probably get blacklisted, or maybe just ignored.
The other possibility is to make a lot of requests with a number of different clients at the same time - using different IPs. This is what is usually called a coordinated or distributed denial of service attack. In that case you might be able to make the web server(s) go down for a while.
This is just a simple answer but I hope you get the point.
I am writing an application where the client side will be uploading data to the server through a wireless link.
The connection should be very reliable. The link is expected to break many times, and there will be many clients connected to the server.
I am confused whether to use TCP or reliable UDP.
Please share your thoughts.
Thanks.
RUDP is not, of course, a formal standard, and there's no telling whether you will find existing implementations you can use. Given a choice between rolling this from scratch and just re-making TCP connections, I'd choose TCP.
To be safe, I would go with TCP just because it's a reliable, standard protocol. RUDP has the disadvantage of not being an established standard (although it's been mentioned in several IETF discussions).
Good luck with your project!
It's likely that both your TCP and RUDP links would be broken by your environment, so the fact that you're using RUDP is unlikely to help there; there will likely be times when no datagrams can get through...
What you actually need to make sure of is that a) you can handle the number of connected clients, b) your application protocol can detect reasonably quickly when you've lost connectivity with a client (or server) and c) you can handle the required reconnection and maintenance of cross connection session state for clients.
As long as you deal with b) and c) it doesn't really matter if the connection keeps being broken. Make sure you design your application protocol so that you can get things done in short batches; if you're uploading files, send small blocks and make sure the application protocol can resume a transfer that was broken halfway through; you don't want to get 99% of the way through a 2GB transfer, lose the connection, and have to start again.
For this to work your server needs some kind of client session state cache where you can keep the logical state of a client's connection beyond the life of the connection itself. Design from the start to expect a given session to include multiple separate connections. The session state should probably have some kind of timeout so that if the client goes away for a long time it doesn't continue to consume resources on the server but, to be honest, it may simply be a case of saving the state off to disk after a while.
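A minimal sketch of that session cache (the field names, helpers, and timeout are purely illustrative, not any particular protocol):

```python
import time

# Illustrative server-side session cache: the logical upload state outlives
# any single TCP connection, so a client can reconnect and carry on where it
# left off.
SESSION_TIMEOUT = 15 * 60  # seconds; example value only

sessions = {}  # session_id -> {"offset": ..., "total": ..., "last_seen": ...}

def start_upload(session_id, total_bytes):
    sessions[session_id] = {"offset": 0, "total": total_bytes, "last_seen": time.time()}

def handle_block(session_id, data):
    """Called for each small block received; returns the next expected offset."""
    state = sessions[session_id]
    state["offset"] += len(data)        # ... append the data to storage here ...
    state["last_seen"] = time.time()
    return state["offset"]

def resume_offset(session_id):
    """On reconnect the client asks where to resume, instead of starting over."""
    return sessions[session_id]["offset"]

def expire_idle_sessions():
    now = time.time()
    for sid in [s for s, st in sessions.items() if now - st["last_seen"] > SESSION_TIMEOUT]:
        del sessions[sid]               # or persist the state to disk instead
```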
In summary, I don't think the choice of transport matters and I'd go with TCP at least to start with. What will really matter is being able to manage your client's session state on the server and deal with the fact that clients will connect and disconnect regularly.
If you aren't sure, odds are that you should use TCP. For one thing, it's certain to be part of the network stack for anything supporting IP. "Reliable UDP" is rarely supported out of the box, so you'll have some extra support work for your clients.
I need to design a server that serves millions of clients that are simultaneously connected to it via TCP.
The data traffic between the server and the clients will be sparse, so bandwidth issues can be ignored.
One important requirement is that whenever the server needs to send data to any client it should use the existing TCP connection instead of opening a new connection toward the client (because the client may be behind a firewall).
Does anybody know how to do this, and what hardware/software is needed (at the least cost)?
What operating systems are you considering for this?
If you're using a Windows OS later than Vista then you shouldn't have a problem with many thousands of connections on a single machine. I've run tests (here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html) with a low-spec Windows Server 2003 machine and easily achieved more than 70,000 active TCP connections. Some of the resource limits that affect the number of connections possible have been lifted considerably on Vista (see here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html), so you could probably achieve your goal with a small cluster of machines. I don't know what you'd need in front of those to route the connections.
Windows provides a facility called I/O Completion Ports (see: http://msdn.microsoft.com/en-us/magazine/cc302334.aspx) which allows you to service many thousands of concurrent connections with very few threads (I was running tests yesterday with 5,000 connections saturating a link to a server using 2 threads to process the I/O...). Thus the basic architecture is very scalable.
If you want to run some tests then I have some freely available tools on my blog that allow you to thrash a simple echo server using many thousands of connections (1) and (2), and some free code which you could use to get started (3).
The second part of your question, from your comments, is more tricky. If the client's IP address keeps changing and there's nothing between you and them providing NAT to give you a consistent IP address, then their connections will, no doubt, be terminated and need to be re-established. If the clients can detect this connection teardown when their IP address changes then they can reconnect to the server; if they can't, then I would suggest that the clients need to poll the server every so often so that they can detect the connection loss and reconnect. There's nothing the server can do here, as it can't predict the new IP address and it will only discover that the old connection has failed when it tries to send data.
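A hedged sketch of that client-side polling (the server address, heartbeat message, and interval are all made up):

```python
import socket
import time

# Hypothetical client loop: poll the server periodically so a connection that
# has silently died (e.g. after the client's IP address changed) is noticed
# and re-established.
SERVER = ("server.example.com", 9000)
POLL_INTERVAL = 30  # seconds

def run_client():
    sock = None
    while True:
        try:
            if sock is None:
                sock = socket.create_connection(SERVER, timeout=10)
            sock.sendall(b"PING\n")          # application-level heartbeat
            if not sock.recv(16):            # empty read means the peer closed
                raise ConnectionError("server closed the connection")
        except OSError:
            if sock is not None:
                sock.close()
            sock = None                      # reconnect on the next pass
        time.sleep(POLL_INTERVAL)
```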
And remember, your problems are only just beginning once you get your system to scale to this level...
This problem is related to the so-called C10K problem. The C10K page lists a large number of good resources for addressing the problems you will encounter when you try to allow thousands of clients to connect to the same server.
I came across the APE Project a while back. It seems like a dream come true. They can support up to 100k concurrent clients on a single node. Spread them across 10 or 20 nodes and you can serve millions. Perfect for RESTful applications. You might want to look deeper regarding any shared namespace. One drawback is that this is a standalone server, as in supplementary to a web server. The server is, of course, open source, so any cost is hardware/ISP related.
You cannot use UDP. If the client sends a request and you don't reply immediately, a router is going to forget the reverse route in 30 seconds or less, so your server will never be able to reply to the client.
TCP is the only option, and it, too, will give you headaches. Most routers are going to forget the route and/or drop the connection after a few minutes, so your client/server code is going to have to send "keep alives" fairly often.
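If you go the keep-alive route, a sketch of turning it on for a connected socket looks like this (the TCP_KEEP* options are Linux-specific and the timings are only examples; they need to stay below your routers' NAT timeout):

```python
import socket

def enable_keepalive(sock, idle=60, interval=30, count=4):
    # Enable TCP keep-alives so idle connections keep some traffic flowing
    # through NAT boxes instead of being silently dropped.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)       # seconds idle before the first probe
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)  # seconds between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)       # failed probes before the connection is dropped
```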
I recommend setting up a "sniffer", to see how the phone companies are staying in touch with your smartphone for their "push" technology. Copy whatever they're doing, because that stuff works!
As Greg mentioned, the problem you are describing is C10K (or rather "C1M" in your case).
I recently made a simple TCP echo server on Linux that scales very well with the number of sessions (only tested up to 200,000 though), by using the epoll queue. On BSD, you have something similar called kqueue.
You can check out the code if you want to. Hope this helps and good luck!
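That code isn't reproduced here, but a stripped-down sketch of the same pattern, written with Python's selectors module purely for illustration (it picks epoll on Linux and kqueue on BSD), looks roughly like this:

```python
import selectors
import socket

# Single-threaded echo server: one readiness queue, one non-blocking socket
# per session. The port number is illustrative.
sel = selectors.DefaultSelector()

def accept(listener):
    conn, _ = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)
    if data:
        conn.send(data)       # echo back (a real server would handle partial sends)
    else:
        sel.unregister(conn)
        conn.close()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 9000))
listener.listen(1024)
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, accept)

while True:
    for key, _ in sel.select():
        key.data(key.fileobj)   # dispatch to accept() or handle()
```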
EDIT: As noted in the comments below, my original assertion that there is a 64K limit based on the number of ports is incorrect; however, there is a 32K limit on the number of socket handles, so my suggested design is valid.
With a typical TCP/IP server design, you're limited in the number of simultaneous open connections you can have. The server has one listening port, and when a client connects to it the server makes an accept call, and that creates a new socket on a random port for the rest of the connection.
To handle more than 64K simultaneous connections I think you need to use UDP instead. You only need one port for the server to listen on, and you need to manage the connections using a 32-bit client ID in the packet data instead of having a separate port for each client. The 32-bit client ID could be the client's IP address, and the client can listen on a known UDP port for messages coming back from the server. That port would be the only one that needs to be open on the firewall.
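As a rough illustration of that scheme (the port number and packet layout are invented for the example):

```python
import socket
import struct

# Single UDP port: every datagram starts with a 32-bit client ID, and the
# server remembers each client's last source address so it can push data
# back later.
SERVER_PORT = 5000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", SERVER_PORT))

clients = {}  # client_id -> (ip, port) last seen

while True:
    packet, addr = sock.recvfrom(2048)
    client_id, = struct.unpack("!I", packet[:4])  # first 4 bytes: the client ID
    payload = packet[4:]
    clients[client_id] = addr                     # remember where to reply
    # ... process payload, then push data back whenever the server needs to:
    sock.sendto(struct.pack("!I", client_id) + b"ack", addr)
```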
With this approach, your only limitation is how quickly you can handle and respond to UDP messages. With millions of clients, even sparse traffic could give you large spikes, and if you don't read the packets fast enough your input queue will fill up and you'll start dropping packets. The C10K page Greg points to will give you strategies for that.
When writing a custom server, what are the best practices or techniques to determine maximum number of users that can connect to the server at any given time?
I would assume that the capabilities of the computer hardware, network capacity, and server protocol would all be important factors.
Also, do you think it is a good practice to limit the number of network connections to a certain maximum number of users? Or should the server not limit the number of network connections and let performance degrade until the response time is extremely high?
Dan Kegel put together a summary of techniques for handling large amounts of network connections from a single server, here: http://www.kegel.com/c10k.html
In general modern servers can handle very large numbers of concurrent connections. I've worked on systems having over 8,000 concurrently open TCP/IP sockets.
You will need a high-quality servicing interface to handle that kind of load; check out libevent or libev.
That is a good question and it definitely is situational. What is your computer? Do you have a 4-socket machine filled with quad-core Xeons, 128 GB of RAM, and Fibre Channel connectivity (like the pair of Dell R900s we just bought)? Or are you running on a P3 550 with 256 MB of RAM and a 56K modem? How much load does each connection place on your server? What kind of response time is acceptable?
These are the questions you need to answer. I guess the best way to find the answer is through load testing. Create a unit test of the expected (and maybe some unexpected) paths that your code will perform against your server. Find a load testing framework that will allow you to simulate 10, 100, 1000, 10000 users performing those tasks at the same time.
That will tell you how many connections your computer can support.
The great thing about the load/unit test scenario is that you can put response-time expectations in your unit tests and increase the load until you fall outside your response-time requirement. If you have a requirement of supporting X users with a Y-second response time, you will be able to demonstrate it with your load tests.
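A toy version of that ramp-up might look like the sketch below; the URL, user counts, and 2-second threshold are all placeholders, and a real load-testing framework (JMeter, Locust, etc.) will do this far better.

```python
import concurrent.futures
import time
import urllib.request

URL = "http://localhost:8080/"      # placeholder endpoint
MAX_RESPONSE_SECONDS = 2.0          # example requirement

def one_request(_):
    start = time.time()
    urllib.request.urlopen(URL, timeout=10).read()
    return time.time() - start

# Increase the number of simulated concurrent users until the average
# response time falls outside the requirement.
for users in (10, 100, 1000):
    with concurrent.futures.ThreadPoolExecutor(max_workers=users) as pool:
        times = list(pool.map(one_request, range(users)))
    avg = sum(times) / len(times)
    print(f"{users} users: average response {avg:.2f}s")
    if avg > MAX_RESPONSE_SECONDS:
        print("requirement violated at this load")
        break
```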
One of the biggest setbacks with high-concurrency connections is actually the routers involved. Home-user-oriented routers usually have a small NAT table, which prevents the router from actually delivering all of those connections to the server.
Be sure to research your router/network infrastructure setup just as thoroughly.
I think you shouldn't limit the number of connections your server will allow - just catch and properly handle any exceptions that might occur when accepting and closing connections and you should be fine. You should leave that kind of lower-level programming to the underlying OS layers - that way you can port your server more easily, etc.
This really depends on your operating system.
Different Unix flavors will support an "unlimited" number of file handles/sockets; others have high limits like 32768.
A typical user limit is 8192 but it can usually be set higher.
I think Windows is more limiting, but the server version may have higher limits.
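On the Unix side, a quick sketch of checking and raising the per-process descriptor limit from Python (the 32768 target is just an example; the hard limit itself is set by the administrator):

```python
import resource

# Unix only: the soft limit can be raised by the process, but never above the
# hard limit set by the OS/administrator.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)

target = 32768  # example value
if hard != resource.RLIM_INFINITY:
    target = min(target, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
```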