lua-resty-redis set_keepalive recommended settings - nginx

I'm using
red:set_keepalive(max_idle_timeout, pool_size)
(From here: https://github.com/openresty/lua-resty-redis#set_keepalive)
with Nginx and trying to determine the best values to use for max_idle_timeout and pool_size.
If my worker_connections is set to 1024, does it make sense to have a pool_size of 1024?
For max_idle_timeout, is 60000 (1 minute) too "aggressive"? Is it safer to go with a smaller value?
Thanks,
Matt

I think the Check List for Issues section of the official documentation has a good guideline for sizing your connection pool:
Basically, if your NGINX handles n concurrent requests and has m workers, then the connection pool size should be configured as n/m. For example, if your NGINX usually handles 1000 concurrent requests and you have 10 NGINX workers, then the connection pool size should be 100.
So, if you expect 1024 concurrent requests that actually connect to Redis, then a good size for your pool is 1024/worker_processes. Maybe a few more to account for uneven request distribution among workers.
Your keepalive should be long enough to account for the way traffic arrives. If your traffic is constant, you can lower your timeout. Or stay with 60 seconds; in most cases a longer timeout won't make any noticeable difference.
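For illustration, here is a minimal sketch of how that sizing could look inside an OpenResty handler (the 4 workers, the 127.0.0.1:6379 address and the resulting pool size of 256 are assumed figures, not recommendations):

local redis = require "resty.redis"
local red = redis:new()
red:set_timeout(1000)  -- 1 second connect/send/read timeout

local ok, err = red:connect("127.0.0.1", 6379)  -- placeholder address/port
if not ok then
    ngx.log(ngx.ERR, "failed to connect to redis: ", err)
    return
end

-- ... run your Redis commands here ...

-- assuming ~1024 concurrent Redis-using requests and 4 workers: 1024 / 4 = 256 per-worker pool,
-- with idle connections kept for up to 60 seconds (60000 ms)
local ok, err = red:set_keepalive(60000, 256)
if not ok then
    ngx.log(ngx.ERR, "failed to set keepalive: ", err)
end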

Related

NiFi. There are too many outstanding HTTP requests with a total 100 outstanding requests

I'm getting a "There are too many outstanding HTTP requests with a total 100 outstanding requests" error each time I try to log in to Apache NiFi, and sometimes while I'm working in the interface. Memory and CPU are available on the cluster machines, and the NiFi flows work normally even when I can't log in. Is there a reason I keep getting this, and what can I try to prevent it?
You can change nifi.cluster.node.max.concurrent.requests to a greater number. As stated here, the number of requests is limited to 100 by default, so raising this property increases the number of allowed concurrent requests.
It might not fix whatever is making your cluster experience so many requests, but it could at least let you log in to your cluster.
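For example, a sketch of the relevant line in nifi.properties on each node (the value 200 is purely illustrative; pick a number that fits your cluster and restart the nodes for it to take effect):

# nifi.properties (excerpt)
nifi.cluster.node.max.concurrent.requests=200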

Why does HTTP/1.1 recommend to be conservative with opening connections?

With HTTP/1.0, there used to be a recommended limit of 2 connections per domain. More recent HTTP RFCs have relaxed this limitation but still warn to be conservative when opening multiple connections:
According to RFC 7230, section 6.4, "a client ought to limit the number of simultaneous open connections that it maintains to a given server".
More specifically, HTTP/2 aside, browsers these days impose a per-domain limit of 6-8 connections when using HTTP/1.1. From what I'm reading, these guidelines are intended to improve HTTP response times and avoid congestion.
Can someone help me understand what would happen to congestion and response times if many connections were opened per domain? It doesn't sound like an HTTP server problem, since the number of connections a server can handle seems like an implementation detail. The explanation above seems to say it's about TCP performance? I can't find any more precise explanation of why HTTP clients limit the number of connections per domain.
The primary reasoning for this is resources on the server side.
Imagine that you have a server running Apache with the default of 256 worker threads. Imagine that this server is hosting an index page that has 20 images on it. Now imagine that 20 clients simultaneously connect and download the index page; each of these clients closes these connections after obtaining the page.
Since each of them will now establish connections to download the images, you can see that the number of connections grows multiplicatively. Consider what happens if every client is configured to open up to ten simultaneous connections in parallel to speed up displaying the page with its images. That quickly takes us to 200 simultaneous connections, approaching the number of worker processes that Apache has available (again, by default, with pre-fork).
For the server, resources must be balanced to be able to serve the most likely load, but the clients help with this tremendously by throttling connections. If every client felt free to establish 100+ connections to a server in parallel, we would very quickly DoS lots of hosts. :)
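To put rough numbers on that last point, reusing the assumed figures from the example above (20 clients, Apache's default of 256 workers):
20 clients x 10 connections each = 200 simultaneous connections, already close to the 256 default workers
20 clients x 100 connections each = 2000 simultaneous connections, roughly eight times what the default configuration can serve at once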

If I set a very large value for the nginx backlog, what consequences will it lead to?

I don't know what consequences will follow if I set a very large value for the nginx backlog.
Who can tell me?
A listen backlog of 65535 is too large for php-fpm. It is really NOT a good idea to clog the accept queue, especially when the client or nginx has a timeout for the connection.
Assume the php-fpm QPS is 5000. It will take 13s to completely consume 65535 backlogged connections, and by then a connection may already have been closed because of a timeout in nginx or the client. So when we accept the 65535th socket, we get a broken pipe.
Even worse, if hundreds of php-fpm processes get an already-closed connection, they just waste time and resources running a heavy task and finally get an error when writing to the closed connection (error: Broken pipe).
The real maximum accept queue size will be backlog + 1 (i.e., 512 for a backlog of 511). We take 511, which is the same default as nginx and redis.
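For reference, a sketch of where the backlog is actually configured on each side (511 mirrors the default mentioned above; the port and socket path are placeholders):

# nginx: backlog is a parameter of the listen directive
listen 80 backlog=511;

; php-fpm pool configuration (e.g. www.conf)
listen = /run/php-fpm.sock
listen.backlog = 511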

Optimal value for Nginx worker_connections

Nginx worker_connections sets the maximum number of simultaneous connections that can be opened by a worker process. This number includes all connections (e.g. connections with proxied servers, among others), not only connections with clients. Another consideration is that the actual number of simultaneous connections cannot exceed the current limit on the maximum number of open files. I have a few queries around this:
What should be the optimal or recommended value for this?
What are the downsides of using a high number of worker connections?
Setting lower limits may be useful when you are resource-constrained. Some connections, for example keep-alive connections, are effectively wasting your resources (even if nginx is very efficient, which it is), and aren't required for correct operation of a general-purpose server.
Having a lower resource limit will indicate to nginx that you are low on physical resources, and those available should be allocated to new connections, rather than to serve the idling keep-alive connections.
What is the recommended value? It's the default.
The defaults are all documented within the documentation:
Default: worker_connections 512;
And it can be confirmed in the source code at event/ngx_event.c, too:
#define DEFAULT_CONNECTIONS 512
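As a point of reference, here is a minimal sketch of where these settings live in nginx.conf (the numbers are illustrative, not recommendations; worker_rlimit_nofile matters only if you raise worker_connections towards the open-file limit mentioned above):

worker_processes auto;
worker_rlimit_nofile 2048;   # per-worker open-file limit; must be at least worker_connections

events {
    worker_connections 512;  # the documented default
}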

About Nginx status

This is my nginx status below:
Active connections: 6119
server accepts handled requests
418584709 418584709 455575794
Reading: 439 Writing: 104 Waiting: 5576
The value of Waiting is much higher than Reading and Writing; is that normal?
Is it because keep-alive is enabled?
But if I send a large number of requests to the server, the values of Reading and Writing don't increase, so I think there must be a bottleneck in nginx or somewhere else.
Waiting is Active - (Reading + Writing), i.e. connections that are still open, waiting for either a new request or the keepalive expiration.
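Plugging in the numbers from the status output above: 6119 - (439 + 104) = 5576, which matches the Waiting figure exactly, so the counters are consistent.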
You could change the keepalive default (which is 75 seconds)
keepalive_timeout 20s;
or tell the browser when it should close the connection by adding an optional second timeout in the header sent to the browser
keepalive_timeout 20s 20s;
but on this nginx page about keepalive you can see that some browsers do not care about the header (anyway, your site wouldn't gain much from this optional parameter).
Keepalive is a way to reduce the overhead of creating connections, as most of the time a user will navigate through several pages of the site (plus the multiple requests from a single page to download CSS, JavaScript, images, etc.).
It depends on your site: you could reduce the keepalive, but keep in mind that establishing connections is expensive. This is a trade-off you have to tune based on your site's statistics. You could also decrease the timeout little by little (75s -> 50s, then a week later 30s...) and see how the server behaves.
You don't really want to fix it, as "waiting" means keep-alive connections. They consume almost no resources (a socket + about 2.5 MB of memory per 10000 connections in nginx).
Are the requests short-lived? It's possible they're reading/writing and then closing in a short amount of time.
If you're genuinely interested in checking whether nginx is the bottleneck, you could set keep-alive to 0 in your nginx config:
keepalive_timeout 0;
