Error 502 when accessing backend inside same cluster in Kubernetes - nginx

Backend: python (Django)
Frontend: angular6
I just deployed my backend and frontend to the same cluster on Google Kubernetes Engine. They are two individual services inside the same cluster. The pods on the cluster look like:
NAME READY STATUS RESTARTS AGE
backend-f4f5df588-nbc9p 1/1 Running 0 1h
frontend-85885799d9-92z5f 1/1 Running 0 1h
And the service looks like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
backend LoadBalancer 10.3.249.148 35.232.61.116 8000:32291/TCP 26m
frontend LoadBalancer 10.3.248.72 35.224.112.111 8081:31444/TCP 3m
kubernetes ClusterIP 10.3.240.1 <none> 443/TCP 1h
My backend just runs on the Django development server, started with the python manage.py runserver command, and everything works fine. I built the frontend and deployed it on an Nginx server. So there are two Docker images, one for Django and one for Nginx, running as two pods in the cluster.
Then there are two ingresses, one for each of them, exposing port 80 for the frontend and 8000 for the backend, both behind the Nginx load-balancer controller. After assigning a domain, I can visit https://abc/project as the frontend. But when the frontend makes API requests, a 502 error appears. The error message in Nginx is:
38590 connect() failed (111: Connection refused) while connecting to upstream, client: 163.185.148.245, server: _, request: "GET /project/api HTTP/1.1", upstream: "http://10.0.0.30:8000/dataproject/api", host: "abc"
The upstream in the error message is the correct IP for the backend service, but I still get a 502 error. I can curl from the Nginx server to the frontend, but I cannot curl to the backend. Any help?
PS. Everything worked fine before deployment.
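One quick way to narrow this down is to repeat the curl from inside a pod against the backend's cluster IP (a sketch, using the pod and service addresses from the listings above):

kubectl exec -it frontend-85885799d9-92z5f -- curl -v http://10.3.249.148:8000/

If the service IP refuses the connection while the pod shows Running, the container process is likely only listening on 127.0.0.1, which is exactly what the fix below addresses.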

Fixed. The Django runserver command needs to bind to 0.0.0.0 so it doesn't refuse outside connections:
python manage.py runserver 0.0.0.0:8000
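In the backend image that might look like the following (a minimal sketch; the Dockerfile layout is illustrative, the 0.0.0.0 bind is the point):

# sketch of the backend image's start command: bind the dev server
# to all interfaces so other pods and services can reach it
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]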

Related

Haproxy in docker cannot access device's localhost via "host.docker.internal"

I need to configure Haproxy for local testing.
The goal is to have 3 services:
1. haproxy in a Docker container
2. an HTTP app listening on the device's (macOS) localhost
3. a client app sending requests through haproxy to the listening app (number 2)
docker-compose.yml configuration for the proxy is
proxy_server:
  image: haproxy:2.7.0-alpine
  container_name: proxy_server
  user: root # I used this to install curl in the container
  ports:
    - '3128:80' # haproxy itself
    - '20005:20005' # configured proxy in haproxy.cfg
  restart: always
  volumes:
    - ./test/proxy_server/config:/usr/local/etc/haproxy # this maps the haproxy.cfg file into the container
  extra_hosts:
    - 'host.docker.internal:host-gateway' # allows the proxy to access device's localhost on linux
haproxy.cfg is
defaults
    timeout client 5s
    timeout connect 5s
    timeout server 5s
    timeout http-request 5s

listen reverse-proxy
    bind *:20005
    mode http
    option httplog
    log stdout format raw local0 debug
When the listening app (2) listens on localhost:56454, I can shell into the haproxy container, install curl, and connect to the listening app via the host.docker.internal hostname.
/ # curl -v -I http://host.docker.internal:56454
* Trying 192.168.65.2:56454...
* Connected to host.docker.internal (192.168.65.2) port 56454 (#0)
> HEAD / HTTP/1.1
> Host: host.docker.internal:56454
> User-Agent: curl/7.86.0
> Accept: */*
This is correct.
The problem is that I am not able to send a request through the proxy to the same URL (http://host.docker.internal:56454), because the proxy logs
172.22.0.1:56636 [10/Dec/2022:13:50:22.601] reverse-proxy reverse-proxy/<NOSRV> 0/-1/-1/-1/0 503 217 - - SC-- 1/1/0/0/0 0/0 "POST http://host.docker.internal:56454/confirmation HTTP/1.1"
and the client gets the following response:
HTTP Status 503: <html><body><h1>503 Service Unavailable</h1>\nNo server is available to handle this request.\n</body></html>
Also, the request passes correctly when I use the following docker-compose.yml configuration with the ubuntu/squid image instead of the haproxy one:
proxy_server:
  image: ubuntu/squid
  container_name: proxy_server
  ports:
    - '3128:3128'
  restart: always
  extra_hosts:
    - 'host.docker.internal:host-gateway' # allows the proxy to access device's localhost on linux
So, I guess the problem is that haproxy somehow does not see the service on http://host.docker.internal:56454 even though the service is accessible from the container.
I've also tried ubuntu and debian versions of the haproxy image and it still does not work correctly.
Any idea how to fix it?
Edit:
Investigating <NOSRV>... No clue how to fix it yet.
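For what it's worth, the <NOSRV> flag in that log line means haproxy accepted the request but had no backend server to hand it to, and the listen section above declares none; unlike squid, haproxy is a reverse proxy and will not resolve the request URL on its own. A minimal sketch of the missing piece, assuming the app stays on port 56454 (the server name local_app is illustrative):

listen reverse-proxy
    bind *:20005
    mode http
    option httplog
    log stdout format raw local0 debug
    # without a server line every request ends in <NOSRV>/503;
    # point haproxy at the app on the device's localhost
    server local_app host.docker.internal:56454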

externalTrafficPolicy Local on GKE service not working

I'm using GKE version 1.21.12-gke.1700 and I'm trying to set externalTrafficPolicy to "Local" on my nginx external load balancer (a plain service, not an ingress). After the change, nothing happens, and I still see the source as an internal IP from the Kubernetes IP range instead of the client's IP.
This is my service's YAML:
apiVersion: v1
kind: Service
metadata:
  name: nginx-ext
  namespace: my-namespace
spec:
  externalTrafficPolicy: Local
  healthCheckNodePort: xxxxx
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  loadBalancerSourceRanges:
    - x.x.x.x/32
  ports:
    - name: dashboard
      port: 443
      protocol: TCP
      targetPort: 443
  selector:
    app: nginx
  sessionAffinity: None
  type: LoadBalancer
And the nginx logs:
*2 access forbidden by rule, client: 10.X.X.X
My goal is to make an endpoint-based restriction (deny all and allow only specific clients).
You can use curl to query the IP of the load balancer, for example curl 203.0.113.120. Note that setting service.spec.externalTrafficPolicy to Local in GKE removes nodes without service endpoints from the set of nodes eligible for load-balanced traffic, so if you apply the Local value you must have at least one service endpoint. Because of this it is important that service.spec.healthCheckNodePort is deployed; this port needs to be allowed in the ingress firewall rule. You can get the health check node port from your YAML with this command:
kubectl get svc loadbalancer -o yaml | grep -i healthCheckNodePort
You can follow this guide if you need more information about how the LoadBalancer service type works in GKE. Finally, you can limit outside traffic at your external load balancer by deploying loadBalancerSourceRanges; in the following link you can find more information on how to protect your applications from outside traffic.
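As a sketch of that firewall rule (the rule name, network, and port 32000 are illustrative; 130.211.0.0/22 and 35.191.0.0/16 are Google's documented health-check source ranges):

# allow GCP health checks to reach the healthCheckNodePort on the nodes
gcloud compute firewall-rules create allow-lb-health-check \
    --network=default \
    --allow=tcp:32000 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16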

oauth2-proxy authentication calls slow on kubernetes cluster with auth annotations for nginx ingress

We have secured some of our services on the K8S cluster using the approach described on this page. Concretely, we have:
nginx.ingress.kubernetes.io/auth-url: "https://oauth2.${var.hosted_zone}/oauth2/auth"
nginx.ingress.kubernetes.io/auth-signin: "https://oauth2.${var.hosted_zone}/oauth2/start?rd=/redirect/$http_host$escaped_request_uri"
set on the service to be secured and we have followed this tutorial to only have one deployment of oauth2_proxy per cluster. We have 2 proxies set up, both with affinity to be placed on the same node as the nginx ingress.
$ kubectl get pods -o wide -A | egrep "nginx|oauth"
infra-system wer-exp-nginx-ingress-exp-controller-696f5fbd8c-bm5ld 1/1 Running 0 3h24m 10.76.11.65 ip-10-76-9-52.eu-central-1.compute.internal <none> <none>
infra-system wer-exp-nginx-ingress-exp-controller-696f5fbd8c-ldwb8 1/1 Running 0 3h24m 10.76.14.42 ip-10-76-15-164.eu-central-1.compute.internal <none> <none>
infra-system wer-exp-nginx-ingress-exp-default-backend-7d69cc6868-wttss 1/1 Running 0 3h24m 10.76.15.52 ip-10-76-15-164.eu-central-1.compute.internal <none> <none>
infra-system wer-exp-nginx-ingress-exp-default-backend-7d69cc6868-z998v 1/1 Running 0 3h24m 10.76.11.213 ip-10-76-9-52.eu-central-1.compute.internal <none> <none>
infra-system oauth2-proxy-68bf786866-vcdns 2/2 Running 0 14s 10.76.10.106 ip-10-76-9-52.eu-central-1.compute.internal <none> <none>
infra-system oauth2-proxy-68bf786866-wx62c 2/2 Running 0 14s 10.76.12.107 ip-10-76-15-164.eu-central-1.compute.internal <none> <none>
However, a simple website load usually takes around 10 seconds, compared to 2-3 seconds without the proxy annotations present on the secured service.
We added a proxy_cache to the auth.domain.com service which hosts our proxy by adding
"nginx.ingress.kubernetes.io/server-snippet": <<EOF
proxy_cache auth_cache;
proxy_cache_lock on;
proxy_ignore_headers Cache-Control;
proxy_cache_valid any 30m;
add_header X-Cache-Status $upstream_cache_status;
EOF
but this didn't improve the latency either. We still see all HTTP requests triggering a log line in our proxy. Oddly, only some of the requests take 5 seconds.
We are unsure if:
- the proxy forwards each request to the oauth provider (github) or
- caches the authentications
We use cookie authentication, therefore, in theory, the oauth2_proxy should just decrypt the cookie and then return a 200 to the nginx ingress. Since they are both on the same node it should be fast. But it's not. Any ideas?
Edit 1
I have analyzed the situation further. Visiting my auth server at https://oauth2.domain.com/auth in the browser and copying the request as a curl command, I found that:
running 10,000 queries against my oauth server from my local machine (via curl) is very fast
running 100 requests on the nginx ingress with the same curl is slow
replacing the host name in the curl with the cluster IP of the auth service makes the performance increase drastically
setting the annotation to nginx.ingress.kubernetes.io/auth-url: http://172.20.95.17/oauth2/auth (e.g. setting the host == cluster IP) makes the GUI load as expected (fast)
it doesn't matter if the curl is run on the nginx-ingress or on any other pod (e.g. a test debian), the result is the same
Given that it's unlikely that someone will come up with the why this happens, I'll post my workaround as the answer.
The fix I found was to set the annotations to the following:
nginx.ingress.kubernetes.io/auth-url: "http://oauth2.infra-system.svc.cluster.local/oauth2/auth"
nginx.ingress.kubernetes.io/auth-signin: "https://oauth2.domain.com/oauth2/start?rd=/redirect/$http_host$escaped_request_uri"
The auth-url is what the ingress queries with the user's cookie. The cluster-local DNS name of the oauth2 service reaches the same service as the external DNS name, but without the SSL round trip, and since it's a DNS name it's permanent (while the cluster IP is not).
In my opinion you observe the increased latency in response time with the
nginx.ingress.kubernetes.io/auth-url: "https://oauth2.${var.hosted_zone}/oauth2/auth"
setting because the auth server URL resolves to an external address (in this case the VIP of the load balancer sitting in front of the ingress controller).
In practice this means the traffic goes out of the cluster (so-called hairpin mode) and comes back via the external IP of the ingress, which then routes to the internal ClusterIP service (adding extra hops), instead of going directly via the ClusterIP/Service DNS name (staying within the Kubernetes cluster):
nginx.ingress.kubernetes.io/auth-url: "http://oauth2.infra-system.svc.cluster.local/oauth2/auth"
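To sanity-check that the cluster-local name resolves before switching the annotation, a throwaway pod works (a sketch; the pod name and busybox image are illustrative):

kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- \
    nslookup oauth2.infra-system.svc.cluster.local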

Nginx Ingress for Kubernetes "Connection refused"

Was there a recent change to Nginx Ingress? Out of the blue I'm now getting "Connection refused" errors. At first I thought it was my own configuration, which had worked on a previous cluster.
So I decided to follow this tutorial (GKE NGINX INGRESS) instead, and I'm getting the same result.
$ kubectl get deployments --all-namespaces
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
default hello-app 1 1 1 1 13m
default nginx-ingress-controller 1 1 1 1 12m
default nginx-ingress-default-backend 1 1 1 0 12m
I see the default-backend isn't running but I don't know enough about Kubernetes to know if that's what's preventing everything from working properly.
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-app ClusterIP 10.31.255.90 <none> 8080/TCP 14m
kubernetes ClusterIP 10.31.240.1 <none> 443/TCP 19m
nginx-ingress-controller LoadBalancer 10.31.251.198 35.227.50.24 80:31285/TCP,443:30966/TCP 14m
nginx-ingress-default-backend ClusterIP 10.31.242.167 <none> 80/TCP 14m
Finally:
$ kubectl get ing
NAME HOSTS ADDRESS PORTS AGE
ingress-resource * 35.237.184.85 80 10m
According to the tutorial I should just be able to go here to receive a 200 and here to get a 404.
I've left the links live so you all can see them.
$ curl -I http://35.237.184.85/hello
curl: (7) Failed to connect to 35.237.184.85 port 80: Connection refused
I swear everything worked before, and the only thing I can think of is that something from the Tiller install of nginx-ingress changed.
Please, any help is appreciated! Thank you in advance!
That's because you are making the request against the IP address of the Ingress resource. Your entrypoint is the IP created by the LoadBalancer-type service.
Try curl -I http://35.227.50.24/hello. That's where you will get the 200.
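To pull that entrypoint IP straight from the service instead of eyeballing the table (a sketch using the service name from the output above):

kubectl get svc nginx-ingress-controller \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'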
OK, so 6k views and this is not right. Let's fix this:
I see the default-backend isn't running but I don't know enough about Kubernetes to know if that's what's preventing everything from working properly.
The default-backend is what serves those hello and healthz pages. Not running = no pages.
Is the LoadBalancer service IP where an A record should point for a domain? I thought traffic was supposed to hit the ingress IP, especially if you want SSL.
Yes, you should always point to the ingress controller's IP. That's the point of ingress: it handles name-based HTTP(S) requests, routes them to the proper services, and (in the case of HTTPS) handles the TLS/SSL side. Make sure you have a certificate authority integration like cert-manager configured on your cluster if you plan on doing this.
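As a rough sketch of what that looks like end to end (the host example.com, issuer name, and secret name are illustrative, not from the tutorial):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-resource
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt   # hypothetical issuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - example.com
      secretName: hello-app-tls                   # hypothetical TLS secret
  rules:
    - host: example.com                           # A record points here, at the LB IP
      http:
        paths:
          - path: /hello
            pathType: Prefix
            backend:
              service:
                name: hello-app
                port:
                  number: 8080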

gce nginx-ingress type NodePort and port:80 connection refused

In my GCE kube-cluster, I'm using the nginx ingress controller instead of the Google load balancer, installing "nginx-ingress" with type NodePort instead of type LoadBalancer, as below:
helm install --name my-lb stable/nginx-ingress --set controller.service.type=NodePort
Since the nginx controller is deployed with "controller.service.type=NodePort", the node ports were opened/assigned (kubectl get svc), and the node also has the external IP 104.196.xxx.xxx.
At this point the nginx-ingress-controller is running in the kube-cluster, and I confirmed in the console under "networking/load balancing" that no cloud load balancer was created.
kubectl get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-lb-nginx-ingress-controller 10.39.249.242 <nodes> 80:31181/TCP,443:31462/TCP 15h
my-lb-nginx-ingress-default-backend 10.39.246.94 <none> 80/TCP 15h
After this, I created a new firewall rule in the console under "networking/firewall" to allow the node ports "tcp:31181;tcp:31462".
Now using a browser or curl to reach "http://104.196.xxx.xxx:31181" or "https://104.196.xxx.xxx:31462" gets a response from the nginx controller. Works well.
But access through port 80 is not working. When I curl "http://104.196.xxx.xxx:80", I get back connection refused, as below:
* connect to 104.196.xxx.xxx port 80 failed: Connection refused
Note,
firewall rules have "default-allow-http" for "tcp:80"
nginx-ingress version = nginx-ingress-0.8.5
kube-server-version = Major:"1", Minor:"7", GitVersion:"v1.7.5"
helm ls
NAME REVISION UPDATED STATUS CHART NAMESPACE
my-lb 1 Fri Sep 22 23:05:30 2017 DEPLOYED nginx-ingress-0.8.5 default
kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T08:56:23Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Any idea why "http://104.196.xxx.xxx:80" gets "port 80: Connection refused" while "https://104.196.xxx.xxx:31462" works fine?
Thx.
When using a NodePort, as described in the NodePort documentation, Kubernetes translates the Service port number to a (more or less) random port in the 30000+ range, which that Service will use on the Node itself.
Think of it this way: if Service alpha wants to listen on port 80 and Service beta also wants to listen on port 80, then without that translation mechanism alpha and beta could not exist in the cluster at the same time. Those two ports (31181 for 80, 31462 for 443) are assigned to the Service; nothing else in the cluster will listen on those ports for as long as that Service is declared.
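To see which high port was assigned to a given service port (a sketch using the service name from the question):

kubectl get svc my-lb-nginx-ingress-controller \
    -o jsonpath='{.spec.ports[?(@.port==80)].nodePort}'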
