How many socket connections can a web server handle? - http

Say if I was to get shared, virtual or dedicated hosting, I read somewhere a server/machine can only handle 64,000 TCP connections at one time, is this true? How many could any type of hosting handle regardless of bandwidth? I'm assuming HTTP works over TCP.
Would this mean only 64,000 users could connect to the website, and if I wanted to serve more I'd have to move to a web farm?

In short:
You should be able to achieve in the order of millions of simultaneous active TCP connections and by extension HTTP request(s). This tells you the maximum performance you can expect with the right platform with the right configuration.
Today, I was worried whether IIS with ASP.NET would support in the order of 100 concurrent connections (look at my update, expect ~10k responses per second on older ASP.Net Mono versions). When I saw this question/answers, I couldn't resist answering myself, many answers to the question here are completely incorrect.
Best Case
The answer to this question must only concern itself with the simplest server configuration to decouple from the countless variables and configurations possible downstream.
So consider the following scenario for my answer:
No traffic on the TCP sessions, except for keep-alive packets (otherwise you would obviously need a corresponding amount of network bandwidth and other computer resources)
Software designed to use asynchronous sockets and programming, rather than a hardware thread per request from a pool. (ie. IIS, Node.js, Nginx... webserver [but not Apache] with async designed application software)
Good performance/dollar CPU / Ram. Today, arbitrarily, let's say i7 (4 core) with 8GB of RAM.
A good firewall/router to match.
No virtual limit/governor - ie. Linux somaxconn, IIS web.config...
No dependency on other slower hardware - no reading from harddisk, because it would be the lowest common denominator and bottleneck, not network IO.
Detailed Answer
Synchronous thread-bound designs tend to be the worst performing relative to Asynchronous IO implementations.
WhatsApp can handle a million WITH traffic on a single Unix flavoured OS machine - https://blog.whatsapp.com/index.php/2012/01/1-million-is-so-2011/.
And finally, this one, http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html, goes into a lot of detail, exploring how even 10 million could be achieved. Servers often have hardware TCP offload engines, ASICs designed for this specific role more efficiently than a general purpose CPU.
Good software design choices
Asynchronous IO design will differ across Operating Systems and Programming platforms. Node.js was designed with asynchronous in mind. You should use Promises at least, and when ECMAScript 7 comes along, async/await. C#/.Net already has full asynchronous support like node.js. Whatever the OS and platform, asynchronous should be expected to perform very well. And whatever language you choose, look for the keyword "asynchronous", most modern languages will have some support, even if it's an add-on of some sort.
To WebFarm?
Whatever the limit is for your particular situation, yes a web-farm is one good solution to scaling. There are many architectures for achieving this. One is using a load balancer (hosting providers can offer these, but even these have a limit, along with bandwidth ceiling), but I don't favour this option. For Single Page Applications with long-running connections, I prefer to instead have an open list of servers which the client application will choose from randomly at startup and reuse over the lifetime of the application. This removes the single point of failure (load balancer) and enables scaling through multiple data centres and therefore much more bandwidth.
Busting a myth - 64K ports
To address the question component regarding "64,000", this is a misconception. A server can connect to many more than 65535 clients. See https://networkengineering.stackexchange.com/questions/48283/is-a-tcp-server-limited-to-65535-clients/48284
By the way, Http.sys on Windows permits multiple applications to share the same server port under the HTTP URL schema. They each register a separate domain binding, but there is ultimately a single server application proxying the requests to the correct applications.
Update 2019-05-30
Here is an up to date comparison of the fastest HTTP libraries - https://www.techempower.com/benchmarks/#section=data-r16&hw=ph&test=plaintext
Test date: 2018-06-06
Hardware used: Dell R440 Xeon Gold + 10 GbE
The leader has ~7M plaintext reponses per second (responses not connections)
The second one Fasthttp for golang advertises 1.5M concurrent connections - see https://github.com/valyala/fasthttp
The leading languages are Rust, Go, C++, Java, C, and even C# ranks at 11 (6.9M per second). Scala and Clojure rank further down. Python ranks at 29th at 2.7M per second.
At the bottom of the list, I note laravel and cakephp, rails, aspnet-mono-ngx, symfony, zend. All below 10k per second. Note, most of these frameworks are build for dynamic pages and quite old, there may be newer variants that feature higher up in the list.
Remember this is HTTP plaintext, not for the Websocket specialty: many people coming here will likely be interested in concurrent connections for websocket.

This question is a fairly difficult one. There is no real software limitation on the number of active connections a machine can have, though some OS's are more limited than others. The problem becomes one of resources. For example, let's say a single machine wants to support 64,000 simultaneous connections. If the server uses 1MB of RAM per connection, it would need 64GB of RAM. If each client needs to read a file, the disk or storage array access load becomes much larger than those devices can handle. If a server needs to fork one process per connection then the OS will spend the majority of its time context switching or starving processes for CPU time.
The C10K problem page has a very good discussion of this issue.

To add my two cents to the conversation a process can have simultaneously open a number of sockets connected equal to this number (in Linux type sytems) /proc/sys/net/core/somaxconn
cat /proc/sys/net/core/somaxconn
This number can be modified on the fly (only by root user of course)
echo 1024 > /proc/sys/net/core/somaxconn
But entirely depends on the server process, the hardware of the machine and the network, the real number of sockets that can be connected before crashing the system

It looks like the answer is at least 12 million if you have a beefy server, your server software is optimized for it, you have enough clients. If you test from one client to one server, the number of port numbers on the client will be one of the obvious resource limits (Each TCP connection is defined by the unique combination of IP and port number at the source and destination).
(You need to run multiple clients as otherwise you hit the 64K limit on port numbers first)
When it comes down to it, this is a classic example of the witticism that "the difference between theory and practise is much larger in practise than in theory" - in practise achieving the higher numbers seems to be a cycle of a. propose specific configuration/architecture/code changes, b. test it till you hit a limit, c. Have I finished? If not then d. work out what was the limiting factor, e. go back to step a (rinse and repeat).
Here is an example with 2 million TCP connections onto a beefy box (128GB RAM and 40 cores) running Phoenix http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections - they ended up needing 50 or so reasonably significant servers just to provide the client load (their initial smaller clients maxed out to early, eg "maxed our 4core/15gb box # 450k clients").
Here is another reference for go this time at 10 million: http://goroutines.com/10m.
This appears to be java based and 12 million connections: https://mrotaru.wordpress.com/2013/06/20/12-million-concurrent-connections-with-migratorydata-websocket-server/

Note that HTTP doesn't typically keep TCP connections open for any longer than it takes to transmit the page to the client; and it usually takes much more time for the user to read a web page than it takes to download the page... while the user is viewing the page, he adds no load to the server at all.
So the number of people that can be simultaneously viewing your web site is much larger than the number of TCP connections that it can simultaneously serve.

in case of the IPv4 protocol, the server with one IP address that listens on one port only can handle 2^32 IP addresses x 2^16 ports so 2^48 unique sockets. If you speak about a server as a physical machine, and you are able to utilize all 2^16 ports, then there could be maximum of 2^48 x 2^16 = 2^64 unique TCP/IP sockets for one IP address. Please note that some ports are reserved for the OS, so this number will be lower. To sum up:
1 IP and 1 port --> 2^48 sockets
1 IP and all ports --> 2^64 sockets
all unique IPv4 sockets in the universe --> 2^96 sockets

There are two different discussions here: One is how many people can connect to your server. This one has been answered adequately by others, so I won't go into that.
Other is how many ports yours server can listen on? I believe this is where the 64K number came from. Actually, TCP protocol uses a 16-bit identifier for a port, which translates to 65536 (a bit more than 64K). This means that you can have that many different "listeners" on the server per IP Address.

I think that the number of concurrent socket connections one web server can handle largely depends on the amount of resources each connection consumes and the amount of total resource available on the server barring any other web server resource limiting configuration.
To illustrate, if every socket connection consumed 1MB of server resource and the server has 16GB of RAM available (theoretically) this would mean it would only be able to handle (16GB / 1MB) concurrent connections. I think it's as simple as that... REALLY!
So regardless of how the web server handles connections, every connection will ultimately consume some resource.

Related

What strategies I can use to overcome networking limitations?

I maintain a service that basically pings sites to check whether they're online or not. The service per se is really simple, it relies only on the HTTP status code returned by the requested URL. For instance, I ignore the response body completely.
The service works fine for a small list of domains. However, networking becomes an issue as the number of sites to ping grows. I tried a couple of different languages and libraries. My latest implementation uses NodeJS and node-fetch. But I already had versions of it wrote in Python, PHP, Java, Golang. From that experience, I now know the language is not what determines the request/response speed. There are differences between languages and lib, for sure, but the bottleneck is not there.
Today, I think the only way to make the service scales is with multiple clusters in different networks (e.g. VPC if we're talking AWS). I can't think of a way to deal with networking restrictions in a single or just a few instances.
So, I'm asking this really broad question: what strategies I can use to overcome networking limitations? I'm looking for both dev and ops answers, but mostly focusing on keep the structure as light as possible.
One robust way to ping a website (or any TCP service in general) is to send TCP SYN packet to port 443 (or 80 for insecure HTTP) and measure the time till SYN+ACK response. Tools like hping3 and MTR utilize this method.
This method is one of the best because ICMP may be blocked, take a different path, be prioritized differently on routers in the path, or be responded to by a totally different host. Whereas TCP SYN is the actual scenario the users of the website exercise. The network load is minimal as no data is sent in SYN/SYN+ACK packets, only protocol headers (TCP, IP, and lower level protocol headers).
The answer of #Maxim Egorushkin is great, TCP SYN scanning is the most efficient way I can think of. There are other tools like Masscan, use pcap to send SYN packet in userspace, reduce TCP connection management overhead in kernel. This approach may do the job with a single instance.
If you wanna use HTTP protocol to make sure application layer works fine, use HTTP HEAD request. It responses with a header and status code as GET, but without the body.
Another potential optimization is DNS, you can host a DNS server locally and manage to update domains beforehand, or use a script to update host file before pinging those sites. This can save several milliseconds and bandwith
during pinging sites.
At development level, you could impletement a library just parse status code in HTTP response, so saving some CPU time on parsing headers.
It is helpful to address the actual bottleneck first, it that bandwith limit? memory limit? file descriptor limit? etc.

Can servers limit the number of incoming TLS connections?

I wrote a Java application that communicates with a 3rd-party server that is hosted on AWS. According to what I've read, there are no connection limits on their REST API, yet regardless of how many threads I try to hit them with I am seeing a limit of 64 outgoing connections.
My client is written in Java, running under Windows 10. This number smells like an intentional limit, but I have been unable to find any documented limits specific to the aforementioned environment.
I tried pointing my application to https://test.com/ and got a limit of 128 outgoing connections which leads me to believe the limit is server-side.
How can one determine (from the client's side) whether outgoing connections are being restricted by the server-side? (I tried netstat -an but I only saw ESTABLISHED connections)
Seeing as my application is not receiving any connection failures from the server, could they be intentionally withholding SYN-ACK responses until older connections are closed?
I'm pretty sure there's no reliable way to tell from a client's side what the outgoing connections are being restricted to (and that would seem like a security issue to me if you could).
As for what the limits are, I think that also could be a difficult thing to determine from the server itself (unless it was configured at some lower bound) as there would probably be a lot of variables such as pretty much everything, from the amount of memory, ports, CPU on the machine you are connected to, from and every single one in between.
Also 64 and 128 don't seem like human set limits to me, but instead limits on something internal as it's a power of 2.

Load Testing Thousands of SLOW Connections

I would like to test an upload service with hundreds, if not thousands,
of slow HTTPS connections simultaneously.
I would like to have lots of, say, 3G-quality connections,
each throttled with low bandwidth and high latency,
each sending a few megabytes of data up to the server,
resulting in lots of concurrent, long-lived requests being handled by the server.
There are many load generation tools that can generate thousands of simultaneous requests.
(I'm currently using Locust, mostly so that I can take
advantage of my existing client library written in Python.)
Such tools typically run each concurrent request as fast as possible
over the shared network link.
There are various ways to adjust the apparent bandwidth and latency of TCP connections,
such as Linux's TC
and handy wrappers like Comcast.
As far as I can tell, TC and the like control the shared link
but they cannot throttle the individual requests.
If you want to throttle a single request, TC works well.
In theory, with many clients sharing the same throttled network link,
each request could be run serially,
subject to the constrained bandwidth,
rather than having lots of requests executing concurrently,
a few packets at a time.
The former would result in much fewer active requests executing
concurrently on the server.
I suspect that the tool I want has to actively manage each individual client's sending
and receiving to throttle them fairly.
Is there such a tool?
You can take a look at Apache JMeter, it can "throttle" connections to the throughput configurable via the following properties:
httpclient.socket.http.cps=0
httpclient.socket.https.cps=0
The properties can be defined either in user.properties file or passed to JMeter via -J command-line argument
cps stands for character per second so you can "slow down" JMeter threads (virtual users) to the given throughput rate, the formula for cps calculation is:
cps = (target bandwidth in kbps * 1024) / 8
Check out How to Simulate Different Network Speeds in Your JMeter Load Test for more information.
Yes, these are network simulators. A very primitive one is in the form of WanEM. It is not going to cover your testing needs. You will need something akin to Shunra Storm, a hardware device which can manage individual connections and impairment with models derived from Ookla (think speedtest.com) related to 3,4,5g connections from the wild. Well, perhaps I should say, "could manage," as this product has been absent since the HP acquisition of Shunra.
There are some other market competitors on the network front from companies such as Ixia, Agilent, PacketStorm, Spirent and the like. None of them are inexpensive, but I see your need. Slow, and particularly dirty connections likes cell phones, have a disproportionate impact on the stack and can result in the server running out of resources with fewer mobile connections than desktop ones.
On a side note, be sure you are including a representative model for think time in your test code. If you collapse the client-server model with no or extremely limited think time & impair the network only bad things can happen. This will play particular havoc with both predictability and repeatability on your tests. You may also wind up chasing dozens of engineering ghosts related to load in your code that will not occur in production because of the natural delays and the release of resources which should occur during those windows of activity between client requests.

How to retain one million simultaneous TCP connections?

I am to design a server that needs to serve millions of clients that are simultaneously connected with the server via TCP.
The data traffic between the server and the clients will be sparse, so bandwidth issues can be ignored.
One important requirement is that whenever the server needs to send data to any client it should use the existing TCP connection instead of opening a new connection toward the client (because the client may be behind a firewall).
Does anybody know how to do this, and what hardware/software is needed (at the least cost)?
What operating systems are you considering for this?
If using a Windows OS and using something later than Vista then you shouldn't have a problem with many thousands of connections on a single machine. I've run tests (here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html) with a low spec Windows Server 2003 machine and easily achieved more than 70,000 active TCP connections. Some of the resource limits that affect the number of connections possible have been lifted considerably on Vista (see here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html) and so you could probably achieve your goal with a small cluster of machines. I don't know what you'd need in front of those to route the connections.
Windows provides a facility called I/O Completion Ports (see: http://msdn.microsoft.com/en-us/magazine/cc302334.aspx) which allow you to service many thousands of concurrent connections with very few threads (I was running tests yesterday with 5000 connections saturating a link to a server with 2 threads to process the I/O...). Thus the basic architecture is very scalable.
If you want to run some tests then I have some freely available tools on my blog that allow you to thrash a simple echo server using many thousands of connections (1) and (2) and some free code which you could use to get you started (3)
The second part of your question, from your comments, is more tricky. If the client's IP address keeps changing and there's nothing between you and them that is providing NAT to give you a consistent IP address then their connections will, no doubt, be terminated and need to be re-established. If the clients detect this connection tear down when their IP address changes then they can reconnect to the server, if they can't then I would suggest that the clients need to poll the server every so often so that they can detect the connection loss and reconnect. There's nothing the server can do here as it can't predict the new IP address and it will discover that the old connection has failed when it tries to send data.
And remember, your problems are only just beginning once you get your system to scale to this level...
This problem is related to the so-called C10K problem. The C10K page lists a large number of good resources for addressing the problems you will encounter when you try to allow thousands of clients to connect to the same server.
I've come across the APE Project
a while back. It seems like a dream come true. They can support up to 100k concurrent clients on a single node. Spread them across 10 or 20 nodes, and you can serve millions. Perfect for RESTful applications. Might want to look deeper for any shared namespace. One drawback is that this is a standalone server, as in supplementary to a web server. This server is of course Open Source, so any cost is hardware/ISP related.
You cannot use UDP. If the client sends a request and you don't reply immediately, a router is going to forget the reverse route in 30 seconds or less, so your server will never be able to reply to the client.
TCP is the only option, and it, too, will give you headaches. Most routers are going to forget the route and/or drop the connection after a few minutes, so your client/server code is going to have to send "keep alives" fairly often.
I recommend setting up a "sniffer", to see how the phone companies are staying in touch with your smartphone for their "push" technology. Copy whatever they're doing, because that stuff works!
As Greg mentioned, the problem you are describing is C10K (or rather "C1M" in your case )
I recently made a simple TCP echo server on linux that scales very well with the number of sessions (only tested up to 200.000 though), by using the epoll queue. On BSD, you have something similar called kqueue.
You can check out the code if you want to. Hope this helps and good luck!
EDIT: As noted in the comments below, my original assertion that there is a 64K limit based on the number of ports is incorrect, however there is a 32K limit on the number of socket handles, so my suggested design is valid.
With a typical TCP/IP server design, you're limited in the number of simultaneous open connections you can have. The server has one listening port, and when a client connects to it the server makes an accept call, and that creates a new socket on a random port for the rest of the connection.
To handle more than 64K simultaneous connections I think you need to use UDP instead. You only need one port for the server to listen on, and you need to manage the connections using a 32-bit client ID in the packet data instead of having a separate port for each client. The 32-bit client ID could be the client's IP address, and the client can listen on a known UDP port for messages coming back from the server. That port would be the only one that needs to be open on the firewall.
With this approach, your only limitation is how quickly you can handle and respond to UDP messages. With millions of clients, even sparse traffic could give you large spikes, and if you don't read the packets fast enough your input queue will fill up and you'll start dropping packets. The C10K page Greg points to will give you strategies for that.

How Many Network Connections Can a Computer Support?

When writing a custom server, what are the best practices or techniques to determine maximum number of users that can connect to the server at any given time?
I would assume that the capabilities of the computer hardware, network capacity, and server protocol would all be important factors.
Also, do you think it is a good practice to limit the number of network connections to a certain maximum number of users? Or should the server not limit the number of network connections and let performance degrade until the response time is extremely high?
Dan Kegel put together a summary of techniques for handling large amounts of network connections from a single server, here: http://www.kegel.com/c10k.html
In general modern servers can handle very large numbers of concurrent connections. I've worked on systems having over 8,000 concurrently open TCP/IP sockets.
You will need a high quality servicing interface to handle that kind of load, check out libevent or libev.
That is a good question and it definitely is situational. What is your computer? Do you have a 4 socket machine filled with Quad Core Xeons, 128 GB of RAM, and Fiber Channel Connectivity (like the pair of Dell R900s we just bought)? Or are you running on a p3 550 with 256 MB of RAM, and 56K modem? How much load does each connection place on your server? What kind of response is acceptible?
These are the questions you need to answer. I guess the best way to find the answer is through load testing. Create a unit test of the expected (and maybe some unexpected) paths that your code will perform against your server. Find a load testing framework that will allow you to simulate 10, 100, 1000, 10000 users performing those tasks at the same time.
That will tell you how many connections your computer can support.
The great thing about the load/unit test scenario is that you can put in response time expectations in your unit tests and increase the load until you fall outside of your response time. If you have a requirement of supporting X number of Users with Y second response, you will be able to demonstrate it with your load tests.
One of the biggest setbacks in high concurrency connections is actually the routers involved. Home user oriented routers usually have a small NAT table, preventing the router from actually servicing the server the connections.
Be sure to research your router/ network infrastructure setup just as well.
I think you shouldn't limit the number of connections your server will allow - just catch and handle properly any exceptions that might occur when accepting and closing connections and you should be fine. You should leave that kind of lower level programming to the underlying OS layers - that way you can port your server easier etc.
This really depends on your operating system.
Different Unix flavors will support "unlimited" number of file handles / sockets others have high values like 32768.
A typical user limit is 8192 but it can usually be set higher.
I think windows is more limiting but the server version may have higher limits.

Resources