Connect external HTTPS load balancer with NGINX Ingress exposed as zonal NEG - nginx

What I wanna accomplish
I'm trying to connect an external HTTPS (L7) load balancer with an NGINX Ingress exposed as a zonal Network Endpoint Group (NEG). My Kubernetes cluster (in GKE) contains a couple of web application deployments that I've exposed as a ClusterIP service.
I know that the NGINX Ingress object can be directly exposed as a TCP load balancer. But, this is not what I want. Instead in my architecture, I want to load balance the HTTPS requests with an external HTTPS load balancer. I want this external load balancer to provide SSL/TLS termination and forward HTTP requests to my Ingress resource.
The ideal architecture would look like this:
HTTPS requests --> external HTTPS load balancer --> HTTP request --> NGINX Ingress zonal NEG --> appropriate web application
I'd like to add the zonal NEGs from the NGINX Ingress as the backends for the HTTPS load balancer. This is where things fall apart.
What I've done
NGINX Ingress config
I'm using the default NGINX Ingress config from the official kubernetes/ingress-nginx project. Specifically, this YAML file https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/provider/cloud/deploy.yaml.
Note that, I've changed the NGINX-controller Service section as follows:
Added NEG annotation
Changed the Service type from LoadBalancer to ClusterIP.
# Source: ingress-nginx/templates/controller-service.yaml
apiVersion: v1
kind: Service
metadata:
annotations:
# added NEG annotation
cloud.google.com/neg: '{"exposed_ports": {"80":{"name": "NGINX_NEG"}}}'
labels:
helm.sh/chart: ingress-nginx-3.30.0
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/version: 0.46.0
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/component: controller
name: ingress-nginx-controller
namespace: ingress-nginx
spec:
type: ClusterIP
ports:
- name: http
port: 80
protocol: TCP
targetPort: http
- name: https
port: 443
protocol: TCP
targetPort: https
selector:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/component: controller
---
NGINX Ingress routing
I've tested the path based routing rules for the NGINX Ingress to my web applications independently. This works when the NGINX Ingress is configured with a TCP Load Balancer. I've set up my application Deployment and Service configs the usual way.
External HTTPS Load Balancer
I created an external HTTPS load balancer with the following settings:
Backend: added the zonal NEGs named NGINX_NEG as the backends. The backend is configured to accept HTTP requests on port 80. I configured the health check on the serving port via the TCP protocol. I added the firewall rules to allow incoming traffic from 130.211.0.0/22 and 35.191.0.0/16 as mentioned here https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg#traffic_does_not_reach_the_endpoints
What's not working
Soon after the external load balancer is set up, I can see that GCP creates a new endpoint under one of the zonal NEGs. But this shows as "Unhealthy". Requests to the external HTTPS load balancer return a 502 error.
I'm not sure where to start debugging this configuration in GCP logging. I've enabled logging for the health check but nothing shows up in the logs.
I configured the health check on the /healthz path of the NGINX Ingress controller. That didn't seem to work either.
Any tips on how to get this to work will be much appreciated. Thanks!
Edit 1: As requested, I ran the kubectl get svcneg -o yaml --namespace=<namespace>, here's the output
apiVersion: networking.gke.io/v1beta1
kind: ServiceNetworkEndpointGroup
metadata:
creationTimestamp: "2021-05-07T19:04:01Z"
finalizers:
- networking.gke.io/neg-finalizer
generation: 418
labels:
networking.gke.io/managed-by: neg-controller
networking.gke.io/service-name: ingress-nginx-controller
networking.gke.io/service-port: "80"
name: NGINX_NEG
namespace: ingress-nginx
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: false
controller: true
kind: Service
name: ingress-nginx-controller
uid: <unique ID>
resourceVersion: "2922506"
selfLink: /apis/networking.gke.io/v1beta1/namespaces/ingress-nginx/servicenetworkendpointgroups/NGINX_NEG
uid: <unique ID>
spec: {}
status:
conditions:
- lastTransitionTime: "2021-05-07T19:04:08Z"
message: ""
reason: NegInitializationSuccessful
status: "True"
type: Initialized
- lastTransitionTime: "2021-05-07T19:04:10Z"
message: ""
reason: NegSyncSuccessful
status: "True"
type: Synced
lastSyncTime: "2021-05-10T15:02:06Z"
networkEndpointGroups:
- id: <id1>
networkEndpointType: GCE_VM_IP_PORT
selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-a/networkEndpointGroups/NGINX_NEG
- id: <id2>
networkEndpointType: GCE_VM_IP_PORT
selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-b/networkEndpointGroups/NGINX_NEG
- id: <id3>
networkEndpointType: GCE_VM_IP_PORT
selfLink: https://www.googleapis.com/compute/v1/projects/<project>/zones/us-central1-f/networkEndpointGroups/NGINX_NEG

As per my understanding, your issue is - “when an external load balancer is set up, GCP creates a new endpoint under one of the zonal NEGs and it shows "Unhealthy" and requests to the external HTTPS load balancer which return a 502 error”.
Essentially, the Service's annotation, cloud.google.com/neg: '{"ingress": true}', enables container-native load balancing. After creating the Ingress, an HTTP(S) load balancer is created in the project, and NEGs are created in each zone in which the cluster runs. The endpoints in the NEG and the endpoints of the Service are kept in sync.
Refer to the link [1].
New endpoints generally become reachable after attaching them to the load balancer, provided that they respond to health checks. You might encounter 502 errors or rejected connections if traffic cannot reach the endpoints.
One of your endpoints in zonal NEG is showing unhealthy so please confirm the status of other endpoints and how many endpoints are spread across the zones in the backend.
If all backends are unhealthy, then your firewall, Ingress, or service might be misconfigured.
You can run following command to check if your endpoints are healthy or not and refer link [2] for the same -
gcloud compute network-endpoint-groups list-network-endpoints NAME \ --zone=ZONE
To troubleshoot traffic that is not reaching the endpoints, verify that health check firewall rules allow incoming TCP traffic to your endpoints in the 130.211.0.0/22 and 35.191.0.0/16 ranges. But as you mentioned you have configured this rule correctly. Please refer link [3] for health check Configuration.
Run the Curl command against your LB IP to check for responses -
Curl [LB IP]
[1] https://cloud.google.com/kubernetes-engine/docs/concepts/ingress-xlb
[2] https://cloud.google.com/load-balancing/docs/negs/zonal-neg-concepts#troubleshooting
[3] https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#health_checks

Related

externalTrafficPolicy Local on GKE service not working

I'm using GKE version 1.21.12-gke.1700 and I'm trying to configure externalTrafficPolicy to "local" on my nginx external load balancer (not ingress). After the change, nothing happens, and I still see the source as the internal IP for the kubernetes IP range instead of the client's IP.
This is my service's YAML:
apiVersion: v1
kind: Service
metadata:
name: nginx-ext
namespace: my-namespace
spec:
externalTrafficPolicy: Local
healthCheckNodePort: xxxxx
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
loadBalancerSourceRanges:
- x.x.x.x/32
ports:
- name: dashboard
port: 443
protocol: TCP
targetPort: 443
selector:
app: nginx
sessionAffinity: None
type: LoadBalancer
And the nginx logs:
*2 access forbidden by rule, client: 10.X.X.X
My goal is to make a restriction endpoint based (to deny all and allow only specific clients)
You can use curl to query the ip from the load balance, this is an example curl 202.0.113.120 . Please notice that the service.spec.externalTrafficPolicy set to Local in GKE will force to remove the nodes without service endpoints from the list of nodes eligible for load balanced traffic; so if you are applying the Local value to your external traffic policy, you will have at least one Service Endpoint. So based on this, it is important to deploy the service.spec.healthCheckNodePort . This port needs to be allowed in the ingress firewall rule, you can get the health check node port from your yaml file with this command:
kubectl get svc loadbalancer -o yaml | grep -i healthCheckNodePort
You can follow this guide if you need more information about how the service load balancer type works in GKE and finally you can limit the traffic from outside at your external load balancer deploying loadBalancerSourceRanges. In the following link, you can find more information related on how to protect your applications from outside traffic.

Deploy Spring MVC application using GPC: Cloud SQL, Kubernetes (Service and Ingress) and HTTP(S) Load Balancer with Google Managed Certificate

Let me explain what the deployment consists of. First of all I created a Cloud SQL db by importing some data. To connect the db to the application I used cloud-sql-proxy and so far everything works.
I created a kubernetes cluster in which there is a pod containing the Docker container of the application that I want to depoly and so far everything works ... To reach the application in https then I followed several online guides (https://cloud.google.com/load-balancing/docs/ssl-certificates/google-managed-certs#console , https://cloud.google.com/load-balancing/docs/ssl-certificates/google-managed-certs#console , etc.), all converge on using a service and an ingress kubernetes. The first one maps the 8080 of spring to the 80 while the second one creates a load balacer that exposes a frontend in https. I configured a health-check, I created a certificate (google managed) associated to a domain which maps the static ip assigned to the ingress.
Apparently everything works but as soon as you try to reach from the browser the address https://example.org/ you are correctly redirected to the login page ( http://example.org/login ) but as you can see it switches to the HTTP protocol and obviously a 404 is returned by google since http is disabled. Forcing https on the address to which it redirects you then ( https://example.org/login ) for some absurd reason adds "www" in front of the domain name ( https://www.example.org/login ). If you try not to use the domain by switching to the static IP the www problem disappears... However, every time you make a request in HTTPS it keeps changing to HTTP.
P.S. the general goal would be to have http up to the load balancer (google's internal network) and then have https between the load balancer and the client.
Can anyone help me? If it helps I post the yaml file of the deployment. Thank you very much!
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
run: my-app # Label for the Deployment
name: my-app # Name of Deployment
spec:
minReadySeconds: 60 # Number of seconds to wait after a Pod is created and its status is Ready
selector:
matchLabels:
run: my-app
template: # Pod template
metadata:
labels:
run: my-app # Labels Pods from this Deployment
spec: # Pod specification; each Pod created by this Deployment has this specification
containers:
- image: eu.gcr.io/my-app/my-app-production:latest # Application to run in Deployment's Pods
name: my-app-production # Container name
# Note: The following line is necessary only on clusters running GKE v1.11 and lower.
# For details, see https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing#align_rollouts
ports:
- containerPort: 8080
protocol: TCP
- image: gcr.io/cloudsql-docker/gce-proxy:1.17
name: cloud-sql-proxy
command:
- "/cloud_sql_proxy"
- "-instances=my-app:europe-west6:my-app-cloud-sql-instance=tcp:3306"
- "-credential_file=/secrets/service_account.json"
securityContext:
runAsNonRoot: true
volumeMounts:
- name: my-app-service-account-secret-volume
mountPath: /secrets/
readOnly: true
volumes:
- name: my-app-service-account-secret-volume
secret:
secretName: my-app-service-account-secret
terminationGracePeriodSeconds: 60 # Number of seconds to wait for connections to terminate before shutting down Pods
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
name: my-app-health-check
spec:
healthCheck:
checkIntervalSec: 60
port: 8080
type: HTTP
requestPath: /health/check
---
apiVersion: v1
kind: Service
metadata:
name: my-app-svc # Name of Service
annotations:
cloud.google.com/neg: '{"ingress": true}' # Creates a NEG after an Ingress is created
cloud.google.com/backend-config: '{"default": "my-app-health-check"}'
spec: # Service's specification
type: ClusterIP
selector:
run: my-app # Selects Pods labelled run: neg-demo-app
ports:
- port: 80 # Service's port
protocol: TCP
targetPort: 8080
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
name: my-app-ing
annotations:
kubernetes.io/ingress.global-static-ip-name: "my-static-ip"
ingress.gcp.kubernetes.io/pre-shared-cert: "example-org"
kubernetes.io/ingress.allow-http: "false"
spec:
backend:
serviceName: my-app-svc
servicePort: 80
tls:
- secretName: example-org
hosts:
- example.org
---
As I mention in the comment section, you can redirect HTTP to HTTPS.
Google Cloud have quite good documentation and you can find there step by step guides, including firewall configurations or tests. You can find this guide here.
I would also suggest you to read also docs like:
Traffic management overview for external HTTP(S) load balancers
Setting up traffic management for external HTTP(S) load balancers
Routing and traffic management
As alternative you could check Nginx Ingress with proper annotation (force-ssl-redirect). Some examples can be found here.

Using self-signed certificates in nginx Ingress

I'm migrating services into a kubernetes cluster on minikube, these services require a self-signed certificate on load, accessing the service via NodePort works perfectly and demands the certificate in the browser (picture below), but accessing via the ingress host (the domain is modified locally in /etc/hosts) provides me with a Kubernetes Ingress Controller Fake Certificate by Acme and skips my self-signed cert without any message.
The SSLs should be decrypted inside the app and not in the Ingress, and the tls-acme: "false" flag does not work and still gives me the fake cert
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ingress
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
# decryption of tls occurs in the backend service
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
nginx.ingress.kubernetes.io/tls-acme: "false"
spec:
rules:
- host: admin.domain.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: admin-service
port:
number: 443
when signing in it should show the following before loading:
minikube version: v1.15.1
kubectl version: 1.19
using ingress-nginx 3.18.0
The problem turned out to be a bug on Minikube, and also having to enable ssl passthrough in the nginx controller (in addition to the annotation) with the flag --enable-ssl-passthrough=true.
I was doing all my cluster testing on a Minikube cluster version v1.15.1 with kubernetes v1.19.4 where ssl passthrough failed, and after following the guidance in the ingress-nginx GitHub issue, I discovered that the issue didn't replicate in kind, so I tried deploying my app on a new AWS cluster (k8 version 1.18) and everything worked great.

Gitlab kubernetes integration

I have a custom kubernetes cluster on a serve with public IP and DNS pointing to it (also wildcard).
Gitlab was configured with the cluster following this guide: https://gitlab.touch4it.com/help/user/project/clusters/index#add-existing-kubernetes-cluster
However, after installing Ingress, the ingress endpoint is never detected:
I tried patching the object in k8s, like so
externalIPs: (was empty)
- 1.2.3.4
externalTrafficPolicy: local (was cluster)
I suspect that the problem is empty ingress (scroll to the end) object then calling:
# kubectl get service ingress-nginx-ingress-controller -n gitlab-managed-apps -o yaml
apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2019-11-20T08:57:18Z"
labels:
app: nginx-ingress
chart: nginx-ingress-1.22.1
component: controller
heritage: Tiller
release: ingress
name: ingress-nginx-ingress-controller
namespace: gitlab-managed-apps
resourceVersion: "3940"
selfLink: /api/v1/namespaces/gitlab-managed-apps/services/ingress-nginx-ingress-controller
uid: c175afcc-0b73-11ea-91ec-5254008dd01b
spec:
clusterIP: 10.107.35.248
externalIPs:
- 1.2.3.4 # (public IP)
externalTrafficPolicy: Local
healthCheckNodePort: 30737
ports:
- name: http
nodePort: 31972
port: 80
protocol: TCP
targetPort: http
- name: https
nodePort: 31746
port: 443
protocol: TCP
targetPort: https
selector:
app: nginx-ingress
component: controller
release: ingress
sessionAffinity: None
type: LoadBalancer
status:
loadBalancer: {}
But Gitlab still cant find the ingress endpoint. I tried restarting cluster and Gitlab.
The network inspection in Gitlab always shows this response:
...
name ingress
status installed
status_reason null
version 1.22.1
external_ip null
external_hostname null
update_available false
can_uninstall false
...
Any ideas how to have a working Ingress Endpoint?
GitLab: 12.4.3 (4d477238500) k8s: 1.16.3-00
I had the exact same issue as you, and I finally figured out how to solve it.
The first to understand, is that on bare metal, you can't make it working without using MetalLB, because it calls the required Kubernetes APIs making it accepting the IP address you give to the Service of LoadBalancer type.
So first step is to deploy MetalLB to your cluster.
Then you need to have another machine, running a service like NGiNX or HAproxy or whatever can do some load balancing.
Last but not least, you have to give the Load Balancer machine IP address to MetalLB so that it can assign it to the Service.
Usually MetalLB requires a range of IP addresses, but you can also give one IP address like I did:
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: staging-public-ips
protocol: layer2
addresses:
- 1.2.3.4/32
This way, MetalLB will assign the IP address to the Service with type LoadBalancer and Gitlab will finally find the IP address.
WARNING: MetalLB will assign only once an IP address. If you need many Service with type LoadBalancer, you will need many machines running NGiNX/HAproxy and so on and add its IP address in the MetalLB addresses pool.
For your information, I've posted all the technical details to my Gitlab issue here.

Kubernetes loadbalancer stops serving traffic if using local traffic policy

Currently I am having an issue with one of my services set to be a load balancer. I am trying to get the source ip preservation like its stated in the docs. However when I set the externalTrafficPolicy to local I lose all traffic to the service. Is there something I'm missing that is causing this to fail like this?
Load Balancer Service:
apiVersion: v1
kind: Service
metadata:
labels:
app: loadbalancer
role: loadbalancer-service
name: lb-test
namespace: default
spec:
clusterIP: 10.3.249.57
externalTrafficPolicy: Local
ports:
- name: example service
nodePort: 30581
port: 8000
protocol: TCP
targetPort: 8000
selector:
app: loadbalancer-example
role: example
type: LoadBalancer
status:
loadBalancer:
ingress:
- ip: *example.ip*
Could be several things. A couple of suggestions:
Your service is getting an external IP and doesn't know how to reply back based on the local IP address of the pod.
Try running a sniffer on your pod see if you are getting packets from the external source.
Try checking at logs of your application.
Healthcheck in your load balancer is failing. Check the load balancer for your service on GCP console.
Check the instance port is listening. (probably not if your health check is failing)
Hope it helps.

Resources