What is the difference between kube-nginx (I am not talking about the nginx ingress controller here) and kube-proxy?
I've seen a recent deployment in which every node in the cluster runs one kube-proxy (which is used for accessing services running on the nodes, according to https://kubernetes.io/docs/concepts/cluster-administration/proxies/) and one kube-nginx pod, so they must be used for different purposes.
As mentioned by the community above and in the Kubespray documentation:
K8s components require a loadbalancer to access the apiservers via a reverse proxy. Kubespray includes support for an nginx-based proxy that resides on each non-master Kubernetes node. This is referred to as localhost loadbalancing. It is less efficient than a dedicated load balancer because it creates extra health checks on the Kubernetes apiserver, but is more practical for scenarios where an external LB or virtual IP management is inconvenient. This option is configured by the variable loadbalancer_apiserver_localhost (defaults to True, or False if there is an external loadbalancer_apiserver defined). You may also define the port the local internal loadbalancer uses by changing loadbalancer_apiserver_port. This defaults to the value of kube_apiserver_port. It is also important to note that Kubespray will only configure kubelet and kube-proxy on non-master nodes to use the local internal loadbalancer.
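To make the defaulting rules in that quote easier to follow, here is a minimal Python sketch of them. It is only a model of the quoted description, not Kubespray's own code: the function name and the shortened parameter names are hypothetical, and the port 6443 in the usage lines is an assumed example value.

```python
from typing import Optional


def resolve_apiserver_lb(
    kube_apiserver_port: int,
    external_lb: Optional[str] = None,
    localhost_lb: Optional[bool] = None,
    lb_port: Optional[int] = None,
) -> dict:
    """Model the defaults described in the quoted documentation.

    Hypothetical stand-ins (not Kubespray code):
      external_lb  ~ loadbalancer_apiserver
      localhost_lb ~ loadbalancer_apiserver_localhost
      lb_port      ~ loadbalancer_apiserver_port
    """
    # The per-node nginx proxy is on by default, unless an external LB is defined.
    if localhost_lb is None:
        localhost_lb = external_lb is None
    # The local proxy port defaults to the apiserver port.
    if lb_port is None:
        lb_port = kube_apiserver_port
    return {"localhost_lb": localhost_lb, "lb_port": lb_port}


# No external LB: every non-master node gets the local nginx proxy.
print(resolve_apiserver_lb(6443))
# External LB defined: the local proxy is off by default.
print(resolve_apiserver_lb(6443, external_lb="lb.example.com"))
```

So kube-nginx here is the local proxy towards the apiservers, while kube-proxy handles traffic to Services.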
Related
We are moving from a standalone Docker container architecture to a K3s architecture. The current architecture uses an Nginx container to expose multiple uwsgi and websocket services (for Django) that run in different containers. I'm reading conflicting opinions on the internet about which approach should be used.
The options are:
1. Nginx service of type LoadBalancer (most conf from the existing architecture can be reused)
2. Nginx-ingress (all conf from the existing architecture will have to be converted to ingress annotations and a ConfigMap)
3. Nginx-ingress + nginx service of type ClusterIP (most conf from the existing architecture can be reused; traffic coming into the ingress will simply be routed to the nginx service)
In a very similar situation, we used option 3.
It might be seen as sub-optimal in terms of networking, but it gave us a much smoother transition path. It also gives you time to see what could be handled by the Ingress afterwards.
Support for your various nginx configurations varies with the Ingress implementation and is specific to it (a generic Ingress only handles HTTP routing based on host or path). So I would not advise option 2 unless you're already sure your Ingress can handle it (and you won't want to switch to another Ingress).
Regarding option 1 (LoadBalancer, or even NodePort), it would probably work too, but an Ingress is a much better fit when using HTTP(S).
My opinion about the 3 options is:
1. You can keep the existing config, but you need to assign one IP from your network to each service that you want to expose. On bare metal you also need an additional component such as MetalLB.
2. It could be an option too, but it's not flexible if you want to roll back to your previous configuration; you are adapting your solution to the Kubernetes architecture.
3. I think this is the best option: you keep your nginx+wsgi setup to talk to your Django apps, and use the Nginx ingress to centralize the exposure of your services, apply SSL, domain names, etc.
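For what it's worth, option 3 can be sketched as two manifests: a ClusterIP Service in front of the existing nginx container, and an Ingress that terminates TLS and forwards everything to it. The Python below just builds them as dicts and prints JSON (kubectl accepts JSON as well as YAML). The names `legacy-nginx`, `app.example.com` and `app-tls`, the `app` label, and the `nginx` ingress class are all assumptions for illustration, so adjust them to your cluster.

```python
import json


def nginx_service(name: str = "legacy-nginx", port: int = 80) -> dict:
    """ClusterIP Service in front of the existing nginx pods (assumed label app=<name>)."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def ingress_to(service: dict, host: str, tls_secret: str) -> dict:
    """Ingress that sends all traffic for `host` to the given Service."""
    svc_name = service["metadata"]["name"]
    svc_port = service["spec"]["ports"][0]["port"]
    backend = {"service": {"name": svc_name, "port": {"number": svc_port}}}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": f"{svc_name}-ingress"},
        "spec": {
            "ingressClassName": "nginx",  # assumption: an nginx ingress class exists
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [{
                "host": host,
                "http": {"paths": [
                    {"path": "/", "pathType": "Prefix", "backend": backend},
                ]},
            }],
        },
    }


svc = nginx_service()
ing = ingress_to(svc, host="app.example.com", tls_secret="app-tls")
print(json.dumps(svc, indent=2))
print(json.dumps(ing, indent=2))
```

The existing nginx configuration stays inside the nginx container; the Ingress only needs to know the host name and the Service.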
I've read everywhere that to set up HTTPS access to a Kubernetes cluster you need an Ingress, and not simply a LoadBalancer service, which also exposes the cluster to the outside.
My question is pretty theoretical: if an Ingress is composed of (and it is) a LoadBalancer service, a Controller (a deployment/pod of an nginx image, for example) and a set of Rules (in order to correctly proxy the incoming requests inside the cluster), why can't we set up HTTPS in front of a LoadBalancer instead of an Ingress?
As an exercise, I've built the three components separately by myself (a LoadBalancer, a Controller/API Gateway with some Rules): these three together already receive the incoming requests and proxy them inside the cluster according to specific rules, so I can say I have built an Ingress by myself. Can't I add HTTPS to this structure, or do I need to put a redundant part (a k8s Ingress) in front of the cluster?
Not sure if I fully understood your question.
In Kubernetes you expose your cluster/application using a Service, which is well described here. A good comparison of all the Service types can be found in this article.
When you create a Service of type LoadBalancer, it creates an L4 load balancer. L4 is aware of information like source IP:port and destination IP:port, but doesn't have any information about the application layer (Layer 7). HTTP/HTTPS load balancers operate at Layer 7, so they are aware of the application. More information about load balancing can be found here.
Layer 4-based load balancing to direct traffic based on data from network and transport layer protocols, such as IP address and TCP or UDP port
Layer 7-based load balancing to add content-based routing decisions based on attributes, such as the HTTP header and the uniform resource identifier
An Ingress is something like a LoadBalancer with L7 support.
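To make the L4/L7 distinction concrete, here is a small Python sketch. It is an illustration only: the backend addresses, host names and route table are made up, and real load balancers use their own selection algorithms. The L4 function sees only addresses and ports; the L7 function can look at the HTTP host and path.

```python
import hashlib
from typing import List, Tuple

# Hypothetical backends and routes, for illustration only.
BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080"]
ROUTES = [
    ("shop.example.com", "/", "frontend"),
    ("shop.example.com", "/api", "api"),
]


def l4_pick(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
            backends: List[str]) -> str:
    """Layer 4: choose a backend from the connection tuple only."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


def l7_pick(host: str, path: str, routes: List[Tuple[str, str, str]],
            default: str) -> str:
    """Layer 7: choose a service from the HTTP host and path.

    Simplified string-prefix matching; the longest matching prefix wins.
    """
    matches = [r for r in routes if r[0] == host and path.startswith(r[1])]
    if not matches:
        return default
    return max(matches, key=lambda r: len(r[1]))[2]


print(l4_pick("203.0.113.9", 51000, "198.51.100.1", 443, BACKENDS))
print(l7_pick("shop.example.com", "/api/cart", ROUTES, "default-backend"))
```

The L4 picker cannot tell `/api` from `/`, which is why content-based routing needs something at Layer 7.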
The Ingress is a Kubernetes resource that lets you configure an HTTP load balancer for applications running on Kubernetes, represented by one or more Services. Such a load balancer is necessary to deliver those applications to clients outside of the Kubernetes cluster.
Ingress also provides many advantages. For example, if you have many services in your cluster, you can create one LoadBalancer and an Ingress that routes traffic to the proper service, which cuts the cost of creating several LoadBalancers.
In order for the Ingress resource to work, the cluster must have an ingress controller running.
The Ingress controller is an application that runs in a cluster and configures an HTTP load balancer according to Ingress resources. The load balancer can be a software load balancer running in the cluster or a hardware or cloud load balancer running externally. Different load balancers require different Ingress controller implementations.
In the case of NGINX, the Ingress controller is deployed in a pod along with the load balancer.
There are many Ingress controllers, but the most popular is the Nginx Ingress Controller.
So my answer regarding:
why can't we set Https in front of a LoadBalancer instead of an Ingress?
It's not only about securing your cluster with HTTPS, but also about the many capabilities and features that an Ingress provides.
Very good documentation on HTTP(S) load balancing can be found in the GKE docs.
Kubernetes installed on premise,
nginx-ingress
a service with multiple pods on multiple nodes
All these nodes work as an nginx ingress.
The problem is that a request coming from the load balancer can jump to another worker that has a pod, which causes unnecessary traffic inside the workers' network. When a request comes from outside to the ingress, I want the ingress to always choose pods on the same node, and to forward to other nodes only if there are no local pods.
More or less, this image represents my case.
I have the problem in the blue case; what I expect is the red case.
I saw that "externalTrafficPolicy: Local" exists, but this only works for serviceType NodePort/LoadBalancer; the nginx ingress connects using the "clusterIP", so it skips this functionality.
Is there a way to get this feature working for clusterIP, or something similar? I started to read about Istio and Linkerd; they seem very powerful, but I don't see any parameter to configure this workflow.
You have to deploy an Ingress Controller using a nodeSelector so that it runs on specific nodes, labelled ingress or whatever you want. You can then create an LB on these node IPs using simple health checks on ports 80 and 443 (just to update the zone in case of node failure) or, even better, with a custom health-check endpoint.
As you said, externalTrafficPolicy=Local works only for LoadBalancer services: dealing with on-prem clusters is tough :)
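For reference, the routing preference described in the question (local pods first, other nodes only as a fallback) boils down to the Python sketch below. This is only a model of the desired behaviour, not an existing Kubernetes or nginx-ingress option; the node names and pod IPs are made up.

```python
import random
from typing import Dict, List, Optional


def pick_endpoint(
    local_node: str,
    endpoints: Dict[str, List[str]],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Prefer a pod on the node that received the request.

    `endpoints` maps node name -> pod IPs of the service on that node.
    Falls back to pods on other nodes only when there is no local pod.
    """
    rng = rng or random.Random()
    local = endpoints.get(local_node, [])
    if local:
        return rng.choice(local)
    remote = [
        ip
        for node, ips in endpoints.items()
        if node != local_node
        for ip in ips
    ]
    return rng.choice(remote) if remote else None


# Hypothetical cluster state: one pod on node-a, one on node-b.
pods = {"node-a": ["10.1.0.5"], "node-b": ["10.1.1.7"]}
print(pick_endpoint("node-a", pods))  # local pod is chosen
print(pick_endpoint("node-c", pods))  # no local pod: falls back to another node
```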
I deploy an nginx ingress controller to my cluster. This provisions a load balancer in my cloud provider (assume AWS or GCE). However, all traffic inside the cluster is routed by the controller based on my ingress rules and annotations.
What is then the purpose of having a load balancer in the cloud sit in front of this controller? It seems like the controller is doing the actual load balancing anyway?
I would like to understand how to have it so that the cloud load balancer is actually routing traffic towards machines inside the cluster while still following all my nginx configurations/annotations or even if that is possible/makes sense.
You may have a High Availability (HA) cluster with multiple masters. A load balancer is an easy and practical way to "enter" your Kubernetes cluster, as your applications are supposed to be usable by your users (who are on a different network from your cluster).
So you need to have an entry point to your K8S cluster.
An LB is an easily configurable entry point.
Take a look at this picture as an example:
Your API servers are load balanced. A call from outside your cluster will pass through the LB and will be managed by an API server. Only one master (the elected one) will be responsible for persisting the status of the cluster in the etcd database.
When you have an ingress controller and ingress rules, in my opinion it's easier to configure and manage them inside K8S than to write them in the LB configuration file (and reload the configuration on each modification).
I suggest you take a look at https://github.com/kelseyhightower/kubernetes-the-hard-way and do the exercises in it. It's a good way to understand the flow.
An ingress controller is a "controller", which in Kubernetes terms is a software loop that listens for changes to declarative configuration (the Ingress resource) and to the relevant "state of the cluster", compares the two, and then "reconciles" the state of the cluster to the declarative configuration.
In this case, the "state of the cluster" is a combination of:
an nginx configuration file generated by the controller, in use by the pods in the cluster running nginx
the configuration at the cloud provider that instructs the provider's edge traffic infrastructure to deliver certain externally-delivered traffic meeting certain criteria to the nginx pods in your cluster.
So when you update an Ingress resource, the controller notices the change, creates a new nginx configuration file, and updates and restarts the pods running nginx.
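The compare-and-reconcile loop described above can be sketched in a few lines of Python. This is a toy model, not the real controller: the rule tuples, the `state` dict, and the one-line "config" format are invented for illustration and are not real nginx syntax.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str, str]  # (host, path, backend service); hypothetical shape


def render_config(rules: List[Rule]) -> str:
    """Render the desired rules into a toy, non-nginx config text."""
    return "\n".join(
        f"host={host} path={path} backend={backend}"
        for host, path, backend in sorted(rules)
    )


def reconcile(desired_rules: List[Rule], state: Dict[str, object]) -> bool:
    """Bring the running state in line with the declared rules.

    Returns True when the config changed and a reload was triggered.
    """
    wanted = render_config(desired_rules)
    if state.get("config") == wanted:
        return False  # already in sync, nothing to do
    state["config"] = wanted
    state["reloads"] = int(state.get("reloads", 0)) + 1  # stand-in for a reload
    return True


state: Dict[str, object] = {}
rules = [("app.example.com", "/", "web")]
print(reconcile(rules, state))  # True: first sync writes the config
print(reconcile(rules, state))  # False: no change, no reload
rules.append(("app.example.com", "/api", "api"))
print(reconcile(rules, state))  # True: the Ingress changed, config regenerated
```

The point is that you only ever edit the declared rules; the loop decides whether anything has to be regenerated.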
In terms of the physical architecture, it is not really accurate to visualize things as "inside" your cluster vs "outside", or "your cluster" and "the cloud." Everything customer visible is an abstraction.
In GCP, there are several layers of packet and traffic management underneath the customer-visible VMs, load balancers and managed clusters.
External traffic destined for ingress passes through several logical control points in Google's infrastructure, the details of which you have no visibility into, before it gets to "your" cluster.
I would like to be able to create a pod in Kubernetes and expose a port and be able to reach the exposed port using a domain name (myservice.example.com)
I saw that this is possible using a Load Balancer, but in that case every network communication has to go through the Load Balancer, and it seems to be a network bottleneck. Is it possible, using Kubernetes, to access the node directly using a domain name (dynamically created for each pod)?
Thanks.
Maybe you should try a NodePort service.
If accessing the service through a high port (default range: 30000-32767) is not an issue, you can set up your service to use type NodePort and access it through myservice.example.com:30080.
If that is not acceptable, your other option is to set up an Ingress controller and route to different services based on domain name. You can then scale out the Ingress as needed.
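As a rough sketch, a NodePort Service for that setup could be generated like this. The Python builds the manifest as a dict and prints JSON, which kubectl accepts. The service name `myservice`, the `app` label, and the container port 8080 are assumptions; 30080 is the port from the example above.

```python
import json

# Default NodePort range mentioned above (30000-32767 inclusive).
NODE_PORT_RANGE = range(30000, 32768)


def node_port_service(name: str, port: int, node_port: int) -> dict:
    """Build a NodePort Service manifest; rejects ports outside the default range."""
    if node_port not in NODE_PORT_RANGE:
        raise ValueError(
            f"nodePort {node_port} is outside the default range 30000-32767"
        )
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",
            "selector": {"app": name},  # assumed pod label
            "ports": [
                {"port": port, "targetPort": port, "nodePort": node_port}
            ],
        },
    }


print(json.dumps(node_port_service("myservice", 8080, 30080), indent=2))
```

With a DNS record for myservice.example.com pointing at one or more node IPs, the service is then reachable on port 30080 of those nodes.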
Having dynamic exposed domain names to each pod doesn't make much sense because ideally you want to expose services, not individual pods (which have unpredictable lifetimes).