I am trying to rate limit the number of gRPC connections based on a token included in the Authorization header. I tried the following settings in the Nginx ConfigMap and Ingress annotations, but Nginx rate limiting is not working.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-ingress-controller
  namespace: default
data:
  http-snippet: |
    limit_req_zone $http_authorization zone=zone-1:20m rate=10r/m;
    limit_req_zone $http_token zone=zone-2:20m rate=10r/m;
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/configuration-snippet: |
      limit_req zone=zone-1;
      limit_req_log_level notice;
      limit_req_status 429;
I am trying to get the Nginx Ingress Controller to rate limit the gRPC/HTTP2 stream connections based on the value of the $http_authorization variable. I have modified the Nginx log_format to log $http_authorization and can see that Nginx receives the value. The problem I am facing is that, for some reason, the rate limiting rule never gets triggered.
Is this the correct approach?
Any help and feedback would be much appreciated!
Thanks
Hello Bobby_H and welcome to Stack Overflow!
When using Nginx Ingress on Kubernetes you can set up your rate limits with these annotations:
nginx.ingress.kubernetes.io/limit-connections: number of concurrent connections allowed from a single IP address. A 503 error is returned when this limit is exceeded.
nginx.ingress.kubernetes.io/limit-rps: number of requests accepted from a given IP each second. The burst limit is set to this limit multiplied by the burst multiplier (default multiplier: 5). When clients exceed this limit, the limit-req-status-code (default: 503) is returned.
nginx.ingress.kubernetes.io/limit-rpm: number of requests accepted from a given IP each minute. The burst limit is set to this limit multiplied by the burst multiplier (default multiplier: 5). When clients exceed this limit, the limit-req-status-code (default: 503) is returned.
nginx.ingress.kubernetes.io/limit-burst-multiplier: multiplier of the limit rate for the burst size. The default burst multiplier is 5; this annotation overrides the default. When clients exceed the resulting limit, the limit-req-status-code (default: 503) is returned.
nginx.ingress.kubernetes.io/limit-rate-after: initial number of kilobytes after which the further transmission of a response to a given connection is rate limited. This feature must be used with proxy-buffering enabled.
nginx.ingress.kubernetes.io/limit-rate: number of kilobytes per second allowed to be sent to a given connection. A value of zero disables rate limiting. This feature must be used with proxy-buffering enabled.
nginx.ingress.kubernetes.io/limit-whitelist: client IP source ranges to be excluded from rate limiting. The value is a comma-separated list of CIDRs.
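For example, a minimal Ingress using a few of these annotations might look like this (the resource name, host, service, and limit values below are placeholder assumptions, not taken from your setup):
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    # assumed values for illustration: 5 requests per second per client IP,
    # burst of 3x the rate, and an internal range excluded from limiting
    nginx.ingress.kubernetes.io/limit-rps: "5"
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
    nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8"
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: example-service
          servicePort: 80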
Nginx implements the leaky bucket algorithm: incoming requests are buffered in a FIFO queue and then consumed at a limited rate. The burst value defines the size of that queue and allows requests exceeding the base rate to be queued and served. When the queue becomes full, subsequent requests are rejected and an error code is returned.
Here you will find all important parameters to configure your rate limiting.
The number of expected successful requests can be calculated like this:
successful requests = (period * rate + burst) * number of nginx replicas
so it is important to note that the number of nginx replicas also multiplies the number of successful requests. Also notice that the Nginx ingress controller sets the burst value to 5 times the limit by default. You can check those parameters in nginx.conf after setting up your desired annotations. For example:
limit_req_zone $limit_cmRfaW5ncmVzcy1yZC1oZWxsby1sZWdhY3k zone=ingress-hello-world_rps:5m rate=5r/s;
limit_req zone=ingress-hello-world_rps burst=25 nodelay;
limit_req_zone $limit_cmRfaW5ncmVzcy1yZC1oZWxsby1sZWdhY3k zone=ingress-hello-world_rpm:5m rate=300r/m;
limit_req zone=ingress-hello-world_rpm burst=1500 nodelay;
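To put the formula into numbers using the rps zone above (rate=5r/s, burst=25), and assuming a 10-second test window against 2 controller replicas (both values are assumptions for the sake of the example):
successful requests = (10 * 5 + 25) * 2 = 150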
There are two limitations that I would also like to underline:
Requests are counted by client IP, which might not be accurate or might not fit your business needs, such as rate-limiting by user identity.
Options like burst and delay are not configurable.
I strongly recommend going through the sources below for a more in-depth explanation of this topic:
NGINX rate-limiting in a nutshell
Rate Limiting with NGINX and NGINX Plus
Related
Description:
The k8s nginx-ingress-controllers are exposed as a LoadBalancer-type service (implemented by MetalLB) with IP address 192.168.1.254. Another nginx cluster sits in front of the k8s cluster and has only one upstream, which is 192.168.1.254 (the LB IP address). The request flow is: client -> nginx cluster -> nginx-ingress-controllers -> services.
Question:
Sometimes the nginx cluster reports a small number of "upstream (192.168.1.254) timed out" errors, and the client ends up getting a 504 timeout from nginx.
But when I drop the nginx cluster and switch the request flow to client -> nginx-ingress-controllers -> services, everything works and the client no longer gets 504 timeouts. I am sure the network between the nginx cluster and the nginx ingress controller works well.
Most requests are handled by the nginx cluster and return status 200. I have no idea why a few requests report "upstream timed out" and return status 504.
(Screenshots: system architecture, nginx cluster timeout logs, tcpdump packet trace.)
That's most likely caused by slow file uploads (the requests you've shown are all POSTs) that can't finish within the timeout.
You can set a greater timeout value for the application paths where uploads are possible. If you are using an ingress controller, it's better to create a separate Ingress object for that. You can manage timeouts with these annotations, for example:
annotations:
  nginx.ingress.kubernetes.io/proxy-send-timeout: 300s
  nginx.ingress.kubernetes.io/proxy-read-timeout: 300s
These two annotations set the maximum upload time to 5 minutes.
If you are configuring nginx manually, you can set limits with proxy_read_timeout and proxy_send_timeout.
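A minimal sketch of the manual nginx equivalent (the location path and upstream address here are assumptions for illustration only):
location /upload {
    # assumed upstream; adjust to your backend
    proxy_pass http://backend-service:8080;
    # allow up to 5 minutes for slow uploads and responses
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;
}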
We are running on GKE using a public-facing Nginx Ingress Controller exposed under a TCP Load Balancer which is automatically configured by Kubernetes.
The problem is that 0.05% of our requests get status code 499 (an Nginx-specific status code which means the client cancelled the request). Our P99 latency is consistently below 100ms.
Error code 499 indicates that the client's browser closed the connection before a response was sent from the backend.
As per DerSkythe's answer.
My problem is solved by adding the following in the config map.
apiVersion: v1
kind: ConfigMap
data:
  http-snippet: |
    proxy_ignore_client_abort on;
See http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_client_abort
After turning this on, I have almost zero 499 errors!
I highly recommend trying this configuration if you are encountering the same problem.
I am wondering if it is possible to specify the burst value inside the ingress config or ingress controller configmap.
limit_req zone=one burst=5 nodelay
Br,
Tim
If the aim is to limit the request processing rate for requests from a particular IP then I think what you can do is use the near-equivalent nginx.ingress.kubernetes.io/limit-rps. It seems to have a limitation on controlling the burst (which is five times the limit) but should do the job. There's an example of using this in https://carlos.mendible.com/2018/03/20/secure-your-kubernetes-services-with-nginx-ingress-controller-tls-and-more/
I'm not sure if it gives quite as much flexibility as the form of limit_req zone=one burst=5 nodelay but presumably it would work for your purposes?
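As a sketch of what that might look like on your Ingress (the annotation value is a placeholder assumption):
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    # accept at most 5 requests per second from each client IP;
    # nginx-ingress sets the burst to 5 times this value by default
    nginx.ingress.kubernetes.io/limit-rps: "5"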
I'd like to throttle incoming requests into an nginx route.
The current config is similar to this:
upstream up0 {
    server x.x.x.x:1111;
    keepalive 1024;
}

server {
    location /auc {
        limit_req zone=one burst=2100;
        proxy_pass http://up0/auc;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
I'd like to control the number of requests I see at the upstream server. For all other requests I'd like nginx to respond with a 204 response.
Controlling by percentage of incoming requests would also work.
Thanks.
Nginx is very effective at limiting requests using limit_req_zone and limit_req.
First, create a zone with defined limits. For a global limit, the key of the zone can be static; it's also possible to use variables such as the source IP address as the key, which is useful for limiting specific IPs or just the slower pages on your site. The rate can be defined in requests per second or per minute.
limit_req_zone key zone=name:size rate=rate;
Next, create a rule to apply that zone to incoming requests. The rule can be placed inside a location block to apply only to specific requests, or it can be server-wide. The burst option queues a specified number of requests that exceed the rate limit and is useful for throttling short bursts of traffic rather than returning errors.
limit_req zone=name [burst=number] [nodelay];
The default response code for traffic that exceeds the rate limit and is not held in the burst queue is 503 (Service Unavailable). An alternate code such as 204 (No Content) can be set.
limit_req_status code;
Putting that all together, a valid config that limits all requests in the location block to 10 per second, buffers up to 50 queued requests before returning errors, and returns the specified 204 response would look like:
http {
    ....
    limit_req_zone $hostname zone=limit:20m rate=10r/s;
    limit_req_status 204;

    server {
        ...
        location / {
            ...
            limit_req zone=limit burst=50;
        }
    }
}
In practice it's likely the server block will be in a different file included from within the http block. I've just condensed them for clarity.
To test, either use a flood tool or set the request rate to rate=10r/m (10 per minute) and use a browser. It's useful to check the logs and monitor the number of rejected requests so that you are aware of any impact on your users.
Multiple limit_req_zone rules can be combined to specify loose global limits and then stricter per-source-IP limits. This makes it possible to target the most persistent few users before affecting the wider user base.
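A hedged sketch of combining a loose global zone with a stricter per-IP zone (the zone names, rates, and burst sizes are illustrative assumptions):
http {
    # loose global limit keyed on $hostname, so all clients share one bucket
    limit_req_zone $hostname zone=global_limit:1m rate=100r/s;
    # stricter per-client limit keyed on the source IP
    limit_req_zone $binary_remote_addr zone=per_ip_limit:10m rate=10r/s;

    server {
        location / {
            # both zones apply; whichever is exceeded first rejects the request
            limit_req zone=global_limit burst=200;
            limit_req zone=per_ip_limit burst=20 nodelay;
        }
    }
}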
I am experiencing unusual behavior with rate limiting in NGinX. I have been tasked with supporting 10 requests per second and not to use the burst option. I am using the nodelay option to reject any requests over my set rate.
My config is:
..
http
{
    ..
    limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;
    ..
    server
    {
        ..
        location /
        {
            limit_req zone=one nodelay;
            limit_req_status 503;
            ..
        }
    }
}
The behavior I am seeing is that if a request is sent before the response to a previous request has been received, NGinX returns a 503 error. I see this behavior with as few as 2 requests in a second.
Is there something missing from my configuration which is causing this behavior?
Is the burst option needed to service multiple requests at once?
Burst works like a queue. nodelay means queued requests will not be delayed into the next interval. If you don't specify a burst queue, you are not allowing any other simultaneous requests to come in from that IP: with rate=10r/s and no burst, nginx accepts at most one request every 100 ms from each client and rejects anything that arrives sooner. The zone is applied per IP because your key is $binary_remote_addr.
You need a burst.
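A minimal sketch of the same config with a burst queue added (the burst size of 10 is an assumption; tune it to how many near-simultaneous requests you want to absorb):
limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;

server {
    location / {
        # absorb up to 10 near-simultaneous requests without delaying them;
        # requests beyond the queue are rejected with 503
        limit_req zone=one burst=10 nodelay;
        limit_req_status 503;
    }
}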