We're running an NGINX Ingress Controller as the 'front door' to our EKS cluster.
Our upstream apps need the client IP to be preserved, so I configured the controller's ConfigMap to use the PROXY protocol:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  custom-http-errors: 404,503,502,500
  ssl-redirect: "true"
  ssl-protocols: "TLSv1.2"
  ssl-ciphers: "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES256-GCM-SHA384:!ECDHE-RSA-AES256-SHA384:!ECDHE-RSA-AES128-SHA256"
  use-proxy-protocol: "true"
  proxy-real-ip-cidr: "0.0.0.0/0"
This sends the X-Forwarded-For header with the client IP to the upstream pods. This seemed like it was working well, but once our apps started to receive heavier traffic, our monitors would occasionally report connection timeouts when connecting to the apps on the cluster.
I was able to reproduce the issue in a test environment using JMeter. Once I set use-proxy-protocol to false, the connection timeouts no longer occurred, so I started looking into the use-proxy-protocol setting.
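For context, use-proxy-protocol only works if the load balancer in front of the controller actually sends the PROXY protocol header; on EKS that is typically enabled with an annotation on the controller's Service. A rough sketch of such a setup (this is an assumption about a typical install, not my exact manifest, and the exact annotation depends on the load balancer type and controller in use):
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    # asks the AWS cloud provider to enable PROXY protocol on the load balancer
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https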
The ingress-nginx docs describe the use-proxy-protocol setting here.
However, the docs also mention the settings "enable-real-ip" and "forwarded-for-header".
At the link provided in the description for enable-real-ip, it says that I can set forwarded-for-header to the value proxy_protocol.
Based on this, I've updated my Ingress configmap to:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx-test
data:
  custom-http-errors: 404,503,502,500
  ssl-redirect: "true"
  ssl-protocols: "TLSv1.2"
  ssl-ciphers: "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES256-GCM-SHA384:!ECDHE-RSA-AES256-SHA384:!ECDHE-RSA-AES128-SHA256"
  enable-real-ip: "true"
  forwarded-for-header: "proxy_protocol"
  proxy-real-ip-cidr: "0.0.0.0/0"
This configuration also properly sends the X-Forwarded-For header with the client IP to the upstream pods, and it seems to eliminate the connection timeout issues I was seeing. With this setup, performance does not degrade anywhere near as badly as I ramp up the thread count in JMeter.
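For reference, this is roughly how I've been checking which client IP the upstream actually sees; the echo image and hostname are placeholders, not my real app:
# deploy a small echo server behind the ingress and inspect the forwarded headers
kubectl create deployment echo --image=registry.k8s.io/echoserver:1.4
kubectl expose deployment echo --port=80 --target-port=8080
# after adding an Ingress rule for echo.example.com:
curl -sk https://echo.example.com/ | grep -i x-forwarded-for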
I would like to better understand the difference between these two configurations. I'd also like to know the most widely adopted, best-practice method of achieving this among Kubernetes shops, since this is likely a common use case.
Related
I'm using GKE version 1.21.12-gke.1700 and I'm trying to set externalTrafficPolicy to "Local" on my nginx external load balancer (not the ingress). After the change, nothing happens, and I still see the source as an internal IP from the Kubernetes IP range instead of the client's IP.
This is my service's YAML:
apiVersion: v1
kind: Service
metadata:
  name: nginx-ext
  namespace: my-namespace
spec:
  externalTrafficPolicy: Local
  healthCheckNodePort: xxxxx
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  loadBalancerSourceRanges:
  - x.x.x.x/32
  ports:
  - name: dashboard
    port: 443
    protocol: TCP
    targetPort: 443
  selector:
    app: nginx
  sessionAffinity: None
  type: LoadBalancer
And the nginx logs:
*2 access forbidden by rule, client: 10.X.X.X
My goal is to make an endpoint-based restriction (deny all and allow only specific clients).
You can use curl to query the IP from the load balancer, for example: curl 202.0.113.120. Please note that setting service.spec.externalTrafficPolicy to Local in GKE removes nodes without service endpoints from the list of nodes eligible for load-balanced traffic, so if you apply the Local value you must have at least one service endpoint. Because of this, it is important that service.spec.healthCheckNodePort is set. This port needs to be allowed in an ingress firewall rule; you can get the health check node port from your YAML with this command:
kubectl get svc loadbalancer -o yaml | grep -i healthCheckNodePort
You can follow this guide if you need more information about how the LoadBalancer Service type works in GKE. Finally, you can restrict traffic from outside at your external load balancer by setting loadBalancerSourceRanges. In the following link, you can find more information on how to protect your applications from outside traffic.
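For example, a firewall rule that lets the health check probes reach that node port could look roughly like this; the rule name, network, port, and source ranges are placeholders, and you should use the health check source ranges documented by Google for your load balancer type:
gcloud compute firewall-rules create allow-lb-health-checks \
    --network=default \
    --allow=tcp:HEALTH_CHECK_NODE_PORT \
    --source-ranges=GOOGLE_HEALTH_CHECK_RANGES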
I am looking to implement global rate limiting for a production deployment on Azure, to ensure that my application does not become unstable due to an uncontrolled volume of traffic (I am not talking about DDoS, but a large volume of legitimate traffic). Azure Web Application Firewall supports only IP-based rate limiting.
I've looked for alternatives that do this without increasing the hop count in the system. The only solution I've found is the limit_req_zone directive in NGINX. This does not give a true global rate limit, but it can be used to impose a global rate limit per pod. The following ConfigMap is mounted to the Kubernetes NGINX ingress controller to achieve this:
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-ingress-ingress-nginx-controller
  namespace: ingress-basic
data:
  http-snippet: |
    limit_req_zone test zone=static_string_rps:5m rate=10r/m;
  location-snippet: |
    limit_req zone=static_string_rps burst=20 nodelay;
    limit_req_status 429;
Because the limit_req_zone key is a constant string ("test") rather than a per-client variable such as $binary_remote_addr, all requests are counted against the single shared zone static_string_rps, which provides a global rate limit per pod.
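A quick way to see the limit in action is to send a burst of requests and watch for 429s; the hostname below is just a placeholder:
# with rate=10r/m and burst=20 nodelay, roughly the first 21 rapid requests per
# controller pod are accepted and the rest return 429 until the window refills
for i in $(seq 1 50); do
  curl -s -o /dev/null -w '%{http_code}\n' https://myapp.example.com/
done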
This seems like a hacky way to achieve global rate limiting. Is there a better alternative, and does the Kubernetes NGINX ingress controller officially support this approach? (Their documentation says they support mounting ConfigMaps for advanced configuration, but there is no mention of achieving this without an additional memcached pod for syncing counters between pods.)
https://www.nginx.com/blog/rate-limiting-nginx/#:~:text=One%20of%20the%20most%20useful,on%20a%20log%E2%80%91in%20form.
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#global-rate-limiting
According to the Kubernetes Slack community, anything that requires global coordination for rate limiting is a potentially severe performance bottleneck and creates a single point of failure. Therefore, even using an external solution for this would cause bottlenecks, and it is not recommended. (However, this is not mentioned in the docs.)
According to them, using limit_req_zone is a valid approach that is officially supported by the Kubernetes NGINX Ingress controller community, which means it is production ready.
I suggest you use this module if you want to apply global rate limiting (although it is not an exact global rate limit). If you have multiple Ingresses in your cluster, you can use the following approach to apply global rate limits per Ingress.
Deploy the following ConfigMap in the namespace in which your Kubernetes NGINX Ingress controller is present. This will create two counter zones, static_string_ingress1 and static_string_ingress2.
NGINX Config Map
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-ingress-ingress-nginx-controller
  namespace: ingress-basic
data:
  http-snippet: |
    limit_req_zone test zone=static_string_ingress1:5m rate=10r/m;
    limit_req_zone test zone=static_string_ingress2:5m rate=30r/m;
Ingress Resource 1
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress-1
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/affinity: cookie
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/configuration-snippet: |
      limit_req zone=static_string_ingress1 burst=5 nodelay;
      limit_req_status 429;
spec:
  tls:
  - hosts:
    - test.com
  rules:
  - host: test.com
    http:
      paths:
      - path: /
        backend:
          serviceName: test-service
          servicePort: 9443
Similarly, you can add a separate limit to Ingress resource 2 by adding the following configuration snippet to its annotations.
Ingress resource 2
annotations:
  nginx.ingress.kubernetes.io/configuration-snippet: |
    limit_req zone=static_string_ingress2 burst=20 nodelay;
    limit_req_status 429;
Note that both zones are keyed on a constant string, so all requests passing through the relevant Ingress are counted in a single shared counter (static_string_ingress1 or static_string_ingress2), which creates the global rate-limiting effect.
However, these counts are maintained separately by each NGINX Ingress controller pod, so the effective limit is the configured rate multiplied by the number of controller pods (for example, rate=10r/m with three controller replicas allows roughly 30 requests per minute cluster-wide).
Further, I have monitored pod memory and CPU usage with the limit_req_zone counters in place, and they do not cause a considerable increase in resource usage.
More information on this topic is available on this blog post I wrote: https://faun.pub/global-rate-limiting-with-kubernetes-nginx-ingress-controller-fb0453447d65
Please note that this explanation applies to the Kubernetes NGINX Ingress Controller (https://github.com/kubernetes/ingress-nginx), not to be confused with the NGINX Ingress Controller from NGINX, Inc. (https://github.com/nginxinc/kubernetes-ingress).
What I want to accomplish
I'm trying to connect an external HTTPS (L7) load balancer with an NGINX Ingress exposed as a zonal Network Endpoint Group (NEG). My Kubernetes cluster (in GKE) contains a couple of web application deployments that I've exposed as a ClusterIP service.
I know that the NGINX Ingress object can be directly exposed as a TCP load balancer. But, this is not what I want. Instead in my architecture, I want to load balance the HTTPS requests with an external HTTPS load balancer. I want this external load balancer to provide SSL/TLS termination and forward HTTP requests to my Ingress resource.
The ideal architecture would look like this:
HTTPS requests --> external HTTPS load balancer --> HTTP request --> NGINX Ingress zonal NEG --> appropriate web application
I'd like to add the zonal NEGs from the NGINX Ingress as the backends for the HTTPS load balancer. This is where things fall apart.
What I've done
NGINX Ingress config
I'm using the default NGINX Ingress config from the official kubernetes/ingress-nginx project. Specifically, this YAML file https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/provider/cloud/deploy.yaml.
Note that I've changed the NGINX controller Service section as follows:
Added NEG annotation
Changed the Service type from LoadBalancer to ClusterIP.
# Source: ingress-nginx/templates/controller-service.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    # added NEG annotation
    cloud.google.com/neg: '{"exposed_ports": {"80":{"name": "NGINX_NEG"}}}'
  labels:
    helm.sh/chart: ingress-nginx-3.30.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.46.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
---
NGINX Ingress routing
I've tested the path based routing rules for the NGINX Ingress to my web applications independently. This works when the NGINX Ingress is configured with a TCP Load Balancer. I've set up my application Deployment and Service configs the usual way.
External HTTPS Load Balancer
I created an external HTTPS load balancer with the following settings:
Backend: added the zonal NEGs named NGINX_NEG as the backends. The backend is configured to accept HTTP requests on port 80. I configured the health check on the serving port via the TCP protocol. I added the firewall rules to allow incoming traffic from 130.211.0.0/22 and 35.191.0.0/16 as mentioned here https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg#traffic_does_not_reach_the_endpoints
What's not working
Soon after the external load balancer is set up, I can see that GCP creates a new endpoint under one of the zonal NEGs. But this shows as "Unhealthy". Requests to the external HTTPS load balancer return a 502 error.
I'm not sure where to start debugging this configuration in GCP logging. I've enabled logging for the health check but nothing shows up in the logs.
I configured the health check on the /healthz path of the NGINX Ingress controller. That didn't seem to work either.
Any tips on how to get this to work will be much appreciated. Thanks!
Edit 1: As requested, I ran kubectl get svcneg -o yaml --namespace=<namespace>; here's the output:
apiVersion: networking.gke.io/v1beta1
kind: ServiceNetworkEndpointGroup
metadata:
  creationTimestamp: "2021-05-07T19:04:01Z"
  finalizers:
  - networking.gke.io/neg-finalizer
  generation: 418
  labels:
    networking.gke.io/managed-by: neg-controller
    networking.gke.io/service-name: ingress-nginx-controller
    networking.gke.io/service-port: "80"
  name: NGINX_NEG
  namespace: ingress-nginx
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: false
    controller: true
    kind: Service
    name: ingress-nginx-controller
    uid: <unique ID>
  resourceVersion: "2922506"
  selfLink: /apis/networking.gke.io/v1beta1/namespaces/ingress-nginx/servicenetworkendpointgroups/NGINX_NEG
  uid: <unique ID>
spec: {}
status:
  conditions:
  - lastTransitionTime: "2021-05-07T19:04:08Z"
    message: ""
    reason: NegInitializationSuccessful
    status: "True"
    type: Initialized
  - lastTransitionTime: "2021-05-07T19:04:10Z"
    message: ""
    reason: NegSyncSuccessful
    status: "True"
    type: Synced
  lastSyncTime: "2021-05-10T15:02:06Z"
  networkEndpointGroups:
  - id: <id1>
    networkEndpointType: GCE_VM_IP_PORT
    selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-a/networkEndpointGroups/NGINX_NEG
  - id: <id2>
    networkEndpointType: GCE_VM_IP_PORT
    selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-b/networkEndpointGroups/NGINX_NEG
  - id: <id3>
    networkEndpointType: GCE_VM_IP_PORT
    selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-f/networkEndpointGroups/NGINX_NEG
As per my understanding, your issue is: when the external load balancer is set up, GCP creates a new endpoint under one of the zonal NEGs, it shows as "Unhealthy", and requests to the external HTTPS load balancer return a 502 error.
Essentially, the Service's annotation, cloud.google.com/neg: '{"ingress": true}', enables container-native load balancing. After creating the Ingress, an HTTP(S) load balancer is created in the project, and NEGs are created in each zone in which the cluster runs. The endpoints in the NEG and the endpoints of the Service are kept in sync.
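For reference, the two annotation forms differ roughly as follows (a sketch; the port and NEG name simply mirror your config):
# container-native load balancing managed through a GKE Ingress:
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
# standalone zonal NEGs that you attach to a load balancer yourself, as in your Service:
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80": {"name": "NGINX_NEG"}}}'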
Refer to the link [1].
New endpoints generally become reachable after attaching them to the load balancer, provided that they respond to health checks. You might encounter 502 errors or rejected connections if traffic cannot reach the endpoints.
One of your endpoints in a zonal NEG is showing as unhealthy, so please confirm the status of the other endpoints and how many endpoints are spread across the zones in the backend.
If all backends are unhealthy, then your firewall, Ingress, or service might be misconfigured.
You can run the following command to check whether your endpoints are healthy, and refer to link [2] for the same:
gcloud compute network-endpoint-groups list-network-endpoints NAME --zone=ZONE
To troubleshoot traffic that is not reaching the endpoints, verify that health check firewall rules allow incoming TCP traffic to your endpoints from the 130.211.0.0/22 and 35.191.0.0/16 ranges. But as you mentioned, you have already configured this rule correctly. Please refer to link [3] for health check configuration.
Run curl against your LB IP to check for responses:
curl [LB IP]
[1] https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-xlb
[2] https://cloud.google.com/load-balancing/docs/negs/zonal-neg-concepts#troubleshooting
[3] https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#health_checks
The end goal: be able to sftp into the server using domain.com:42150 using routing through Kubernetes.
The reason: This behavior is currently handled by an HAProxy config that we are moving away from, but we still need to support this behavior in our Kubernetes set up.
I came across this and could not figure out how to make it work.
I have the IP of the sftp server and the port.
So, basically, if a request comes in at domain.com:42150, it should connect to external-ip:22.
I have created a config-map like the one in the linked article:
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: nginx-ingress
data:
  42150: "nginx-ingress/external-sftp:80"
Which, by my understanding, should route requests on port 42150 to this service:
apiVersion: v1
kind: Service
metadata:
  name: external-sftp
  namespace: nginx-ingress
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 22
    protocol: TCP
And although it's not listed in that article, I know from connecting to other outside services that I need to create an Endpoints object to go with it:
apiVersion: v1
kind: Endpoints
metadata:
  name: external-sftp
  namespace: nginx-ingress
subsets:
- addresses:
  - ip: 12.345.67.89
  ports:
  - port: 22
    protocol: TCP
Obviously this isn't working. I never ask questions here. Usually my answers are easy to find, but this one I cannot find an answer for. I'm just stuck.
Is there something I'm missing? I'm thinking this way of doing it is not possible. Is there a better way to go about doing this?
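One thing I haven't ruled out: from what I can tell in the ingress-nginx docs, the controller also needs to be started with --tcp-services-configmap pointing at this ConfigMap, and the controller's own Service has to expose the extra port. Roughly something like this (a sketch of a stock install, not my actual manifests):
# excerpt: controller container args
args:
- /nginx-ingress-controller
- --tcp-services-configmap=nginx-ingress/tcp-services
# excerpt: controller Service, exposing the extra TCP port
ports:
- name: sftp
  port: 42150
  targetPort: 42150
  protocol: TCP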
I have a web app hosted in the Google Cloud platform that sits behind a load balancer, which itself sits behind an ingress. The ingress is set up with an SSL certificate and accepts HTTPS connections as expected, with one problem: I cannot get it to redirect non-HTTPS connections to HTTPS. For example, if I connect with the URL http://foo.com or foo.com, it just stays on the plain HTTP version of foo.com instead of redirecting to https://foo.com as I would expect. Connecting to https://foo.com explicitly produces the desired HTTPS connection.
I have tried every annotation and config imaginable, but it stubbornly refuses to redirect, although this shouldn't even be necessary, since the docs imply that the redirect is automatic when TLS is specified. Am I fundamentally misunderstanding how ingress resources work?
Update: Is it necessary to manually install nginx ingress on GCP? Now that I think about it, I've been taking its availability on the platform for granted, but after coming across information on how to install nginx ingress on the Google Container Engine, I realized the answer may be a lot simpler than I thought. Will investigate further.
Kubernetes version: 1.8.5-gke.0
Ingress YAML file:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: https-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    ingress.kubernetes.io/ssl-redirect: "true"
    ingress.kubernetes.io/secure-backends: "true"
    ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - foo.com
    secretName: tls-secret
  rules:
  - host: foo.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: foo-prod
          servicePort: 80
kubectl describe ing https-ingress output
Name:             https-ingress
Namespace:        default
Address:
Default backend:  default-http-backend:80 (10.56.0.3:8080)
TLS:
  tls-secret terminates foo.com
Rules:
  Host     Path  Backends
  ----     ----  --------
  foo.com
           /*    foo-prod:80 (<none>)
Annotations:
  force-ssl-redirect:  true
  secure-backends:     true
  ssl-redirect:        true
Events:  <none>
The problem was indeed the fact that the Nginx Ingress is not standard on the Google Cloud Platform, and needs to be installed manually - doh!
However, I found installing it to be much more difficult than anticipated (especially because my needs pertained specifically to GCP), so I'm going to outline every step I took from start to finish in hopes of helping anyone else who uses that specific cloud and has that specific need, and finds generic guides to not quite fit the bill.
Get Cluster Credentials
This is a GCP specific step that tripped me up for a while - you're dealing with it if you get weird errors like
kubectl unable to connect to server: x509: certificate signed by unknown authority
when trying to run kubectl commands. Run this to set up your console:
gcloud container clusters get-credentials YOUR-K8S-CLUSTER-NAME --zone YOUR-K8S-CLUSTER-ZONE
Install Helm
Helm by itself is not hard to install, and the directions can be found on GCP's own docs; what they neglect to mention, however, is that on new versions of K8s, RBAC configuration is required to allow Tiller to install things. Run the following after helm init:
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
Install Nginx Ingress through Helm
Here's another step that tripped me up - rbac.create=true is necessary because of the aforementioned RBAC requirement.
helm install --name nginx-ingress-release stable/nginx-ingress --set rbac.create=true
Create your Ingress resource
This step is the simplest, and there are plenty of sample nginx ingress configs to tweak - see #JahongirRahmonov's example above, or the minimal sketch below. What you MUST keep in mind is that this step can take anywhere from half an hour to over an hour to take effect - if you change the config and check again immediately, it won't be set up yet, but don't take that as an indication that you messed something up! Wait for a while and check again first.
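For example, a minimal version might look like the following; the hostname, secret, and service names are placeholders, and note that newer controller versions expect the nginx.ingress.kubernetes.io/ annotation prefix rather than the bare ingress.kubernetes.io/ one:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: https-redirect-example
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - foo.com
    secretName: tls-secret
  rules:
  - host: foo.com
    http:
      paths:
      - path: /
        backend:
          serviceName: foo-prod
          servicePort: 80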
It's hard to believe this is how much it takes just to redirect HTTP to HTTPS with Kubernetes right now, but I hope this guide helps anyone else stuck on such a seemingly simple and yet so critical need.
GCP has a default ingress controller which at the time of this writing cannot force https.
You need to explicitly manage an NGINX Ingress Controller.
See this article on how to do that on GCP.
Then add this annotation to your ingress:
kubernetes.io/ingress.allow-http: "false"
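In context, it goes under the Ingress metadata, roughly like this (a sketch reusing the names from your question):
metadata:
  name: https-ingress
  annotations:
    kubernetes.io/ingress.allow-http: "false"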
Hope it helps.