I am looking to implement global rate limiting for a production deployment on Azure, to ensure that my application does not become unstable under an uncontrollable volume of traffic (I am not talking about DDoS, but a large volume of legitimate traffic). Azure Web Application Firewall supports only IP-based rate limiting.
I've looked for alternatives that do this without increasing the hop count in the system. The only solution I've found is the limit_req_zone directive in NGINX. This does not give a true global rate limit, but it can be used to impose a global rate limit per pod. The following ConfigMap is mounted into the Kubernetes NGINX ingress controller to achieve this.
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-ingress-ingress-nginx-controller
  namespace: ingress-basic
data:
  http-snippet: |
    limit_req_zone test zone=static_string_rps:5m rate=10r/m;
  location-snippet: |
    limit_req zone=static_string_rps burst=20 nodelay;
    limit_req_status 429;
Because the key passed to limit_req_zone is a constant string (here the literal test), all requests are counted under a single key in the static_string_rps zone, which provides a global rate limit per pod.
This seems like a hacky way to achieve global rate limiting. Is there a better alternative, and does the Kubernetes NGINX ingress controller officially support this approach? (Their documentation says they support mounting ConfigMaps for advanced configurations, but there is no mention of using this approach without an additional memcached pod for syncing counters between pods.)
https://www.nginx.com/blog/rate-limiting-nginx/#:~:text=One%20of%20the%20most%20useful,on%20a%20log%E2%80%91in%20form.
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#global-rate-limiting
According to the Kubernetes Slack community, anything that requires global coordination for rate limiting has a potentially severe performance bottleneck and creates a single point of failure. Therefore even an external solution for this would cause bottlenecks, and hence it is not recommended. (However, this is not mentioned in the docs.)
According to them, using limit_req_zone is a valid approach that is officially supported by the Kubernetes NGINX Ingress controller community, which means it is production ready.
I suggest you use this module if you want to apply global rate limiting (although it is not exact global rate limiting). If you have multiple ingresses in your cluster, you can use the following approach to apply global rate limits per ingress.
Deploy the following ConfigMap in the namespace in which your Kubernetes NGINX Ingress controller is present. This will create two counters, backed by the zones static_string_ingress1 and static_string_ingress2.
NGINX Config Map
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-ingress-ingress-nginx-controller
  namespace: ingress-basic
data:
  http-snippet: |
    limit_req_zone test zone=static_string_ingress1:5m rate=10r/m;
    limit_req_zone test zone=static_string_ingress2:5m rate=30r/m;
Ingress Resource 1
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress-1
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/affinity: cookie
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/configuration-snippet: |
      limit_req zone=static_string_ingress1 burst=5 nodelay;
      limit_req_status 429;
spec:
  tls:
    - hosts:
        - test.com
  rules:
    - host: test.com
      http:
        paths:
          - path: /
            backend:
              serviceName: test-service
              servicePort: 9443
Similarly, you can add a separate limit to ingress resource 2 by adding the following configuration snippet to its annotations.
Ingress resource 2
annotations:
  nginx.ingress.kubernetes.io/configuration-snippet: |
    limit_req zone=static_string_ingress2 burst=20 nodelay;
    limit_req_status 429;
Note that the zones static_string_ingress1 and static_string_ingress2 are keyed on a static string, so all requests passing through the relevant ingress are counted under a single key, which creates the global rate-limiting effect.
However, these counts are maintained separately by each NGINX Ingress controller pod, so the effective rate limit is the configured limit multiplied by the number of NGINX pods.
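As a worked example (assuming the first ConfigMap above), a zone rate of 10r/m with 3 controller replicas gives an effective cluster-wide ceiling of roughly 10 × 3 = 30 requests per minute, plus each pod's burst allowance.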
Further, I have monitored pod memory and CPU usage while using limit_req_zone counters, and it does not cause a considerable increase in resource usage.
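If you want to verify this yourself, a quick check (assuming the metrics server is installed and the controller pods carry the standard ingress-nginx labels) is:
kubectl top pods -n ingress-basic -l app.kubernetes.io/name=ingress-nginx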
More information on this topic is available in a blog post I wrote: https://faun.pub/global-rate-limiting-with-kubernetes-nginx-ingress-controller-fb0453447d65
Please note that this explanation applies to the Kubernetes NGINX Ingress Controller (https://github.com/kubernetes/ingress-nginx), not to be confused with the NGINX Ingress Controller for Kubernetes (https://github.com/nginxinc/kubernetes-ingress).
What I want to accomplish
I'm trying to connect an external HTTPS (L7) load balancer with an NGINX Ingress exposed as a zonal Network Endpoint Group (NEG). My Kubernetes cluster (in GKE) contains a couple of web application deployments that I've exposed as a ClusterIP service.
I know that the NGINX Ingress object can be directly exposed as a TCP load balancer, but this is not what I want. Instead, in my architecture, I want to load balance the HTTPS requests with an external HTTPS load balancer. I want this external load balancer to provide SSL/TLS termination and forward HTTP requests to my Ingress resource.
The ideal architecture would look like this:
HTTPS requests --> external HTTPS load balancer --> HTTP request --> NGINX Ingress zonal NEG --> appropriate web application
I'd like to add the zonal NEGs from the NGINX Ingress as the backends for the HTTPS load balancer. This is where things fall apart.
What I've done
NGINX Ingress config
I'm using the default NGINX Ingress config from the official kubernetes/ingress-nginx project. Specifically, this YAML file https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/provider/cloud/deploy.yaml.
Note that I've changed the NGINX-controller Service section as follows:
Added NEG annotation
Changed the Service type from LoadBalancer to ClusterIP.
# Source: ingress-nginx/templates/controller-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    # added NEG annotation
    cloud.google.com/neg: '{"exposed_ports": {"80":{"name": "NGINX_NEG"}}}'
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
    - name: https
      port: 443
      protocol: TCP
      targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
---
NGINX Ingress routing
I've tested the path-based routing rules from the NGINX Ingress to my web applications independently; this works when the NGINX Ingress is exposed through a TCP load balancer. I've set up my application Deployment and Service configs the usual way; a sketch of the kind of routing rule involved is below.
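For orientation only, a minimal path-based rule of the kind assumed here (the host, path, and service name are placeholders, not my actual values) would look roughly like:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app-ingress            # placeholder name
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: example.com            # placeholder host
      http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: web-app      # placeholder ClusterIP service
                port:
                  number: 80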
External HTTPS Load Balancer
I created an external HTTPS load balancer with the following settings:
Backend: added the zonal NEGs named NGINX_NEG as the backends. The backend is configured to accept HTTP requests on port 80. I configured the health check on the serving port via the TCP protocol. I added the firewall rules to allow incoming traffic from 130.211.0.0/22 and 35.191.0.0/16 as mentioned here https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg#traffic_does_not_reach_the_endpoints
What's not working
Soon after the external load balancer is set up, I can see that GCP creates a new endpoint under one of the zonal NEGs. But this shows as "Unhealthy". Requests to the external HTTPS load balancer return a 502 error.
I'm not sure where to start debugging this configuration in GCP logging. I've enabled logging for the health check but nothing shows up in the logs.
I configured the health check on the /healthz path of the NGINX Ingress controller. That didn't seem to work either.
Any tips on how to get this to work will be much appreciated. Thanks!
Edit 1: As requested, I ran kubectl get svcneg -o yaml --namespace=<namespace>; here's the output:
apiVersion: networking.gke.io/v1beta1
kind: ServiceNetworkEndpointGroup
metadata:
  creationTimestamp: "2021-05-07T19:04:01Z"
  finalizers:
    - networking.gke.io/neg-finalizer
  generation: 418
  labels:
    networking.gke.io/managed-by: neg-controller
    networking.gke.io/service-name: ingress-nginx-controller
    networking.gke.io/service-port: "80"
  name: NGINX_NEG
  namespace: ingress-nginx
  ownerReferences:
    - apiVersion: v1
      blockOwnerDeletion: false
      controller: true
      kind: Service
      name: ingress-nginx-controller
      uid: <unique ID>
  resourceVersion: "2922506"
  selfLink: /apis/networking.gke.io/v1beta1/namespaces/ingress-nginx/servicenetworkendpointgroups/NGINX_NEG
  uid: <unique ID>
spec: {}
status:
  conditions:
    - lastTransitionTime: "2021-05-07T19:04:08Z"
      message: ""
      reason: NegInitializationSuccessful
      status: "True"
      type: Initialized
    - lastTransitionTime: "2021-05-07T19:04:10Z"
      message: ""
      reason: NegSyncSuccessful
      status: "True"
      type: Synced
  lastSyncTime: "2021-05-10T15:02:06Z"
  networkEndpointGroups:
    - id: <id1>
      networkEndpointType: GCE_VM_IP_PORT
      selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-a/networkEndpointGroups/NGINX_NEG
    - id: <id2>
      networkEndpointType: GCE_VM_IP_PORT
      selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-b/networkEndpointGroups/NGINX_NEG
    - id: <id3>
      networkEndpointType: GCE_VM_IP_PORT
      selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-f/networkEndpointGroups/NGINX_NEG
As per my understanding, your issue is: "when an external load balancer is set up, GCP creates a new endpoint under one of the zonal NEGs, it shows as Unhealthy, and requests to the external HTTPS load balancer return a 502 error".
Essentially, the Service's annotation, cloud.google.com/neg: '{"ingress": true}', enables container-native load balancing. After creating the Ingress, an HTTP(S) load balancer is created in the project, and NEGs are created in each zone in which the cluster runs. The endpoints in the NEG and the endpoints of the Service are kept in sync.
Refer to the link [1].
New endpoints generally become reachable after attaching them to the load balancer, provided that they respond to health checks. You might encounter 502 errors or rejected connections if traffic cannot reach the endpoints.
One of your endpoints in the zonal NEG is showing as unhealthy, so please confirm the status of the other endpoints and how many endpoints are spread across the zones in the backend.
If all backends are unhealthy, then your firewall, Ingress, or service might be misconfigured.
You can run the following command to check whether your endpoints are healthy, and refer to link [2] for the same:
gcloud compute network-endpoint-groups list-network-endpoints NAME --zone=ZONE
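If the NEG is already attached to the load balancer, you can also ask the backend service itself for the health it sees (the backend service name below is a placeholder for whatever you named it when creating the load balancer; external HTTPS load balancers use a global backend service):
gcloud compute backend-services get-health BACKEND_SERVICE_NAME --global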
To troubleshoot traffic that is not reaching the endpoints, verify that health check firewall rules allow incoming TCP traffic to your endpoints from the 130.211.0.0/22 and 35.191.0.0/16 ranges. But as you mentioned, you have already configured this rule. Please refer to link [3] for health check configuration.
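For reference, such a rule can be created roughly as follows (the rule name and network are placeholders, and the port should match your NEG serving port):
gcloud compute firewall-rules create allow-lb-health-checks \
    --network=NETWORK_NAME \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16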
Run a curl command against your LB IP to check for responses:
curl [LB IP]
[1] https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-xlb
[2] https://cloud.google.com/load-balancing/docs/negs/zonal-neg-concepts#troubleshooting
[3] https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#health_checks
We're running an NGINX Ingress Controller as the 'front door' to our EKS cluster.
Our upstream apps need the client IP to be preserved, so I've had my ingress configmap configured to use the proxy protocol:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  custom-http-errors: 404,503,502,500
  ssl-redirect: "true"
  ssl-protocols: "TLSv1.2"
  ssl-ciphers: "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES256-GCM-SHA384:!ECDHE-RSA-AES256-SHA384:!ECDHE-RSA-AES128-SHA256"
  use-proxy-protocol: "true"
  proxy-real-ip-cidr: "0.0.0.0/0"
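For completeness: proxy protocol only works end to end if the load balancer in front of the controller actually sends the PROXY header. In our case it evidently does (otherwise client IPs would never have arrived); for anyone reproducing this, and assuming an ELB/NLB managed by the in-tree AWS cloud provider, that is typically switched on with an annotation like the following on the controller's Service (excerpt only):
metadata:
  annotations:
    # assumption: ELB/NLB managed by the in-tree AWS cloud provider
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"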
This ConfigMap setup sends the X-Forwarded-For header with the client IP to the upstream pods. It seemed to be working well, but once our apps started to receive heavier traffic, our monitors would occasionally report connection timeouts when connecting to the apps on the cluster.
I was able to reproduce the issue in a test environment using JMeter. Once I set use-proxy-protocol to false, the connection timeouts no longer occurred, so I started to look into the use-proxy-protocol setting.
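As a side note, one way to sanity-check proxy protocol behaviour is to hit the controller directly (bypassing the load balancer) and compare a plain request with one where curl sends the PROXY header itself, which curl 7.60+ can do; the URL below is a placeholder:
# plain request: expected to stall when use-proxy-protocol is "true" and nothing sends the PROXY header
curl -v https://my-app.example.com/
# same request, but curl prepends the HAProxy PROXY protocol header itself
curl -v --haproxy-protocol https://my-app.example.com/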
The ingress docs describe the use-proxy-protocol setting here
However, the docs also mention the settings "enable-real-ip" and "forwarded-for-header".
At the link provided in the description for enable-real-ip, it says that I can set forwarded-for-header to the value proxy_protocol.
Based on this, I've updated my Ingress configmap to:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx-test
data:
  custom-http-errors: 404,503,502,500
  ssl-redirect: "true"
  ssl-protocols: "TLSv1.2"
  ssl-ciphers: "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES256-GCM-SHA384:!ECDHE-RSA-AES256-SHA384:!ECDHE-RSA-AES128-SHA256"
  enable-real-ip: "true"
  forwarded-for-header: "proxy_protocol"
  proxy-real-ip-cidr: "0.0.0.0/0"
This configuration also properly sends the X-Forwarded-For header with the client IP to the upstream pods. However, it also seems to eliminate the connection timeout issues I was seeing; with this setup, performance does not degrade anywhere near as badly as I ramp up the thread count in JMeter.
I would like to better understand the difference between these two configurations. I'd also like to know the most widely adopted, best-practice method of achieving this among Kubernetes shops, since this is likely a common use case.
I have a GET request URL into a service on my Kubernetes cluster that's ~9k characters long, and it seems like the request is getting stuck in the Kubernetes ingress. When I call the URL from within the Docker container, or from another container in the cluster, it works fine. However, when I go through the domain name I get the following response header:
I think the parameter you must modify is Client Body Buffer Size
Sets buffer size for reading client request body per location. In case the request body is larger than the buffer, the whole body or only its part is written to a temporary file. By default, buffer size is equal to two memory pages. This is 8K on x86, other 32-bit platforms, and x86-64. It is usually 16K on other 64-bit platforms. This annotation is applied to each location provided in the ingress rule.
nginx.ingress.kubernetes.io/client-body-buffer-size: "1000" # 1000 bytes
nginx.ingress.kubernetes.io/client-body-buffer-size: 1k # 1 kilobyte
nginx.ingress.kubernetes.io/client-body-buffer-size: 1K # 1 kilobyte
nginx.ingress.kubernetes.io/client-body-buffer-size: 1m # 1 megabyte
So you must add one of these annotations to your NGINX ingress resource.
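A minimal sketch of where the annotation goes (the 16k value is only an example; size it to your requests):
metadata:
  annotations:
    nginx.ingress.kubernetes.io/client-body-buffer-size: "16k"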
In my case, I had to set http2_max_header_size and http2_max_field_size in my ingress server-snippet annotation. For example:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      http2_max_header_size 16k;
      http2_max_field_size 16k;
I was getting ERR_CONNECTION_CLOSED and ERR_FAILED in Google Chrome and "empty response" using curl, but the backend would work if accessed directly from the cluster network.
Assigning client-header-buffer-size or large-client-header-buffers in the ingress controller ConfigMap didn't seem to work for me either, but I realized that curl would succeed when forced to HTTP/1.1 (curl --http1.1).
Find the ConfigMap name in the NGINX ingress controller pod description:
kubectl -n utility describe pods/test-nginx-ingress-controller-584dd58494-d8fqr |grep configmap
--configmap=test-namespace/test-nginx-ingress-controller
Note: In my case, the namespace is "test-namespace" and the configmap name is "test-nginx-ingress-controller"
Create a configmap yaml
cat << EOF > test-nginx-ingress-controller-configmap.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: test-nginx-ingress-controller
  namespace: test-namespace
data:
  large-client-header-buffers: "4 16k"
EOF
Note: Please replace the namespace and ConfigMap name with the values found in step 1.
Deploy the configmap yaml
kubectl apply -f test-nginx-ingress-controller-configmap.yaml
Then you will see the change applied to the NGINX controller pod after a few minutes, e.g.:
kubectl -n test-namespace exec -it test-nginx-ingress-controller-584dd58494-d8fqr -- cat /etc/nginx/nginx.conf|grep large
large_client_header_buffers 4 16k;
Thanks to NeverEndingQueue's answer in "How to use ConfigMap configuration with Helm NginX Ingress controller - Kubernetes".
I'm trying to integrate socket.io into an application deployed on Google Kubernetes Engine. Developing locally, everything works great, but once deployed I continuously get the dreaded 400 response when my sockets try to connect. I've been searching on SO and other sites for a few days now and I haven't found anything that fixes my issue.
Unfortunately this architecture was set up by a developer who is no longer at our company, and I'm certainly not a Kubernetes or GKE expert, so I'm definitely not sure I've got everything set up correctly.
Here's our setup:
we have 5 app pods that serve our application distributed across 5 cloud nodes (GCE vm instances)
we are using the nginx ingress controller (https://github.com/kubernetes/ingress-nginx) to create a load balancer to distribute traffic between our nodes
Here's what I've tried so far:
adding the following annotations to the ingress:
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-hash: "sha1"
nginx.ingress.kubernetes.io/session-cookie-name: "route"
adding sessionAffinity: ClientIP to the backend service referenced by the ingress
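For reference, the sessionAffinity change mentioned above looks roughly like this on the backend Service (names and ports are placeholders, not our actual values):
apiVersion: v1
kind: Service
metadata:
  name: app-service          # placeholder for the service referenced by the ingress
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  selector:
    app: app                 # placeholder selector
  ports:
    - port: 80
      targetPort: 8080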
These measures don't seem to have made any difference; I'm still getting a 400 response. If anyone has handled a similar situation or has any advice to point me in the right direction, I'd be very, very appreciative!
I just set up NGINX ingress with the same config, where we are using socket.io.
Here is my ingress config:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: core-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.org/websocket-services: "app-test"
    nginx.ingress.kubernetes.io/rewrite-target: /
    certmanager.k8s.io/cluster-issuer: core-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/secure-backends: "true"
    nginx.ingress.kubernetes.io/websocket-services: "socket-service"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "1800"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "1800"
spec:
  tls:
    - hosts:
        - <domain>
      secretName: core-prod
  rules:
    - host: <domain>
      http:
        paths:
          - backend:
              serviceName: service-name
              servicePort: 80
I was also facing the same issue, so I added proxy-send-timeout and proxy-read-timeout.
I'm guessing you have probably found the answer by now, but you have to add an annotation to your ingress to specify which service will provide websocket upgrades. It looks something like this:
# web socket support
nginx.org/websocket-services: "(your-websocket-service)"
I have a web app hosted in the Google Cloud platform that sits behind a load balancer, which itself sits behind an ingress. The ingress is set up with an SSL certificate and accepts HTTPS connections as expected, with one problem: I cannot get it to redirect non-HTTPS connections to HTTPS. For example, if I connect to it with the URL http://foo.com or foo.com, it just goes to foo.com, instead of https://foo.com as I would expect. Connecting to https://foo.com explicitly produces the desired HTTPS connection.
I have tried every annotation and config imaginable, but it stubbornly refuses, although it shouldn't even be necessary since docs imply that the redirect is automatic if TLS is specified. Am I fundamentally misunderstanding how ingress resources work?
Update: Is it necessary to manually install nginx ingress on GCP? Now that I think about it, I've been taking its availability on the platform for granted, but after coming across information on how to install nginx ingress on the Google Container Engine, I realized the answer may be a lot simpler than I thought. Will investigate further.
Kubernetes version: 1.8.5-gke.0
Ingress YAML file:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: https-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    ingress.kubernetes.io/ssl-redirect: "true"
    ingress.kubernetes.io/secure-backends: "true"
    ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - foo.com
      secretName: tls-secret
  rules:
    - host: foo.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: foo-prod
              servicePort: 80
kubectl describe ing https-ingress output
Name:             https-ingress
Namespace:        default
Address:
Default backend:  default-http-backend:80 (10.56.0.3:8080)
TLS:
  tls-secret terminates foo.com
Rules:
  Host     Path  Backends
  ----     ----  --------
  foo.com
           /*    foo-prod:80 (<none>)
Annotations:
  force-ssl-redirect:  true
  secure-backends:     true
  ssl-redirect:        true
Events:  <none>
The problem was indeed the fact that the Nginx Ingress is not standard on the Google Cloud Platform, and needs to be installed manually - doh!
However, I found installing it to be much more difficult than anticipated (especially because my needs pertained specifically to GCP), so I'm going to outline every step I took from start to finish in hopes of helping anyone else who uses that specific cloud and has that specific need, and finds generic guides to not quite fit the bill.
Get Cluster Credentials
This is a GCP specific step that tripped me up for a while - you're dealing with it if you get weird errors like
kubectl unable to connect to server: x509: certificate signed by unknown authority
when trying to run kubectl commands. Run this to set up your console:
gcloud container clusters get-credentials YOUR-K8S-CLUSTER-NAME --zone YOUR-K8S-CLUSTER-ZONE
Install Helm
Helm by itself is not hard to install, and the directions can be found on GCP's own docs; what they neglect to mention, however, is that on new versions of K8s, RBAC configuration is required to allow Tiller to install things. Run the following after helm init:
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
Install Nginx Ingress through Helm
Here's another step that tripped me up - rbac.create=true is necessary for the aforementioned RBAC factor.
helm install --name nginx-ingress-release stable/nginx-ingress --set rbac.create=true
Create your Ingress resource
This step is the simplest, and there are plenty of sample nginx ingress configs to tweak - see #JahongirRahmonov's example above. What you MUST keep in mind is that this step takes anywhere from half an hour to over an hour to set up - if you change the config and check again immediately, it won't be set up yet, but don't take that as an implication that you messed something up! Wait a while and check again first.
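As a minimal sketch (the host, secret, and service names are placeholders; adjust the annotation prefix to whatever your controller version expects - newer releases use the nginx.ingress.kubernetes.io/ prefix):
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: https-redirect-ingress          # placeholder name
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - foo.com
      secretName: tls-secret
  rules:
    - host: foo.com
      http:
        paths:
          - path: /
            backend:
              serviceName: foo-prod
              servicePort: 80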
It's hard to believe this is how much it takes just to redirect HTTP to HTTPS with Kubernetes right now, but I hope this guide helps anyone else stuck on such a seemingly simple and yet so critical need.
GCP has a default ingress controller which at the time of this writing cannot force https.
You need to explicitly manage an NGINX Ingress Controller.
See this article on how to do that on GCP.
Then add this annotation to your ingress:
kubernetes.io/ingress.allow-http: "false"
Hope it helps.