Do I need a service for exposing every app running in a pod? - nginx

I'm planning to build a website to host static files. Users will upload their files and I deploy bunch of deployments with nginx images on those to a Kubernetes node. My main goal is for some point, users will deploy their apps to a subdomain like my-blog-app.mysite.com. After some time users can use custom domains.
I understand that when I deploy an nginx image on a pod, I have to create a service to expose port 80 (or 443) to the internet via load balancer.
I also read about Ingress, looks like what I need but I don't think I understand that concept.
My question is, for example if I have 500 nginx pods running (each is a different website), do I need a service for every pod in that node (in this case 500 services)?

You are looking for https://kubernetes.io/docs/concepts/services-networking/ingress/#name-based-virtual-hosting.
With this type of Ingress, you route the traffic to the different nginx instances, based on the Host header, which perfectly matches your use-case.
In any case, yes, assuming your current architecture you need to have a service for each pod. Haven't you considered a different approach? Like having a general listener (nginx instances) and get the correct content based on authorization or something?

Related

ECS Nginx network setup

I have 3 containers on ECS: web, api and nginx. Basically nginx is proxying traffic to web and api containers:
upstream web {
server web-container:3000;
}
upstream api {
server api-container:3001;
}
But every time I redeploy web or api they change their IPs so I need to redeploy nginx afterwards in order to make it to "pick up" new IPs.
Is there a way to avoid this so I could just update let's say api service and nginx service would automatically proxy to correct IP address?
I assume these containers belong to 3 different task definitions and ultimately 3 different tasks (or better 3 different services).
If that is the setup then you want to use service discovery for this. This only works with ECS services and the idea is that you create 3 distinct services each with 1+ tasks in it. You give the service a name (e.g. nginx, web, api) and each container in them is going to be able to resolve the other containers by pointing to the fqdn (e.g. api.local). When your container in the nginx service tries to connect to api.local service discovery will resolve that name to the IP of one of the tasks in the ECS service api.
If you want to see an example re how this is setup you can look at this demo app and particularly at this CloudFormation template

Best practise for a website hosted on Kubernetes (DigitalOcean)

I followed this guide: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes on how to setup an Nginx Ingress with Cert Manager with Kubernetes having DigitalOcean as a cloud provider.
The tutorial worked fine, I was able to setup everything according to what it was written. Though, (as it is stated) following the tutorial one ends up with three pods of which only one is in "Running 1/1", while the other two are "Down". Also when checking the comments section, it seems that it is quite a problem. Since if all the traffic gets routed to only 1 pods, it is not really scalable. Or am I missing something? Quoting from their tutorial:
Note: By default the Nginx Ingress LoadBalancer Service has
service.spec.externalTrafficPolicy set to the value Local, which
routes all load balancer traffic to nodes running Nginx Ingress Pods.
The other nodes will deliberately fail load balancer health checks so
that Ingress traffic does not get routed to them.
Mainly my question is: Is there a best practice that I am missing in order to have Kubernetes hosting my website? It seems I have to choose either scalability (having all the pods healthy and running) or getting IP of the client visitor.
And for whoever will ever find himself/herself in my situation, this is the reply I got from the DigitaOcean Support:
Unfortunately with that Kubernetes setup it would show those other
nodes as down without additional traffic configuration. It is possible
to skip the nginx ingress part and just use a DigitalOcean load
balancer but this again does require a good deal of setup and can be
more difficult then easy.
The suggestion to have a website with analytics (IP) and scalable was to setup a droplet with Nginx and setup a LoadBalancer to it. More specifically:
As for using a droplet this would be a normal website configuration
with Nginx as your webserver configured to serve content to your app.
You would have full access to your application and the Nginx logs on
the droplet itself. Putting a load balancer in front of this would
require additional configuration as load balancers do not pass the
x-forward header so the IP addresses of clients would not show up in
the logs by default. You would need to configured proxy protocol on
the load balancer and in your nginx configuration to be able to obtain
those IPs.
https://www.digitalocean.com/blog/load-balancers-now-support-proxy-protocol/
This is also a bit more complex unfortunately.
Hope it might save some time to someone

Traefik instance loadbalance to Kubernetes NodePort services

Intro:
On AWS, Loadbalancers are expensive ($20/month + usage), so I'm looking for a way to achieve flexible load-balancing between the k8s nodes, without having to pay that expense. The load is not that big, so I don't need the scalability of the AWS load balancer any time soon. I just need services to be HA. I can get a small EC2 instance for $3.5/month that can easily handle the current traffic, so I'm chasing that option now.
Current setup
Currently, I've set up a regular standalone Nginx instance (outside of k8s) that does load balancing between the nodes in my cluster, on which all services are set up to expose through NodePorts. This works really well, but whenever my cluster topology changes during restarts, adding, restarting or removing nodes, I have to manually update the upstream config on the Nginx instance, which is far from optimal, given that cluster nodes cannot be expected to stay around forever.
So the question is:
Can Trækfik be set up outside of Kubernetes to do simple load-balancing between the Kubernetes nodes, just like my Nginx setup, but keep the upstream/backend servers of the traefik config in sync with Kubernetes list of nodes, such that my Kubernetes services are still HA when I make changes to my node setup? All I really need is for Træfik to listen to the Kubernetes API and change the backend servers whenever the cluster changes.
Sounds simple, right? ;-)
When looking at the Træfik documentation, it seems to want an ingress resource to send its trafik to, and an ingress resource requires an ingress controller, which I guess, requires a load balancer to become accessible? Doesn't that defeat the purpose, or is there something I'm missing?
Here is something what would be useful in your case https://github.com/unibet/ext_nginx but I'm note sure if project is still in development and configuration is probably hard as you need to allow external ingress to access internal k8s network.
Maybe you can try to do that on AWS level? You can add cron job on Nginx EC2 instance where you query AWS using CLI for all EC2 instances tagged as "k8s" and make update in nginx configuration if something changed.

Nginx to load balance deployment inside a service kubernetes

I want to use Nginx to load balance a kubernetes deployment.
The deployment is part of a service. It contains pod which can be scaled. I want NGINX to be part of the service without being scaled.
I know that I can use NGINX as an external load balancer by configuring it with external dns resolver. With that it can get the IP of the pods scaled and apply its own load balanced rules.
Is it possible to make NGINX part of the service? Then how to do the DNS resolution to the pods? In that case, which pods the service name is refered to?
I would like to avoid the declaration of two services to keep a single definition of the setup which represent a microservice.
More generally, how can I declare in a same service:
a unit which is scaled
a backend, not scaled
a database, not scaled
Thanks all
You can't have NGINX as part of the service. Service doesn't contain any pods, deployment does. It sounds like you want an ingress service, that would be a load balancer any and all services on the cluster
EDIT:
An ingress controller in essence is a deployment of NGINX exposed publicly as a service acting as a load balancer/fan out. The deployment scans your cluster for ingress resources and reconfigures NGINX to forward requests to appropriate services.
Typically people deploy a single controller that acts as the load balancer for all of your microservices. You can fan out based on DNS, URI, other headers and so on. You can also do TLS termination, add basic auth to specific services, it's even possible to splice in NGINX config snippets directly into the ingress resources.

Nginx Reverse Proxy With Alternating Live Backend Services

I have different versions of a backend service, and would like nginx to be like a "traffic cop", sending users ONLY to the currently online live backend service. Is there a simple way to do this without changing the nginx config each time I want to redirect users to a different backend service?
In this example, I want to shut down the live backend service and direct users to the test backend service. Then, vice-versa. I'm calling it a logical "traffic cop" which knows which backend service to direct users to.
I don't think adding all backend services to the proxy_pass using upstream load balancing will work. I think load balancing would not give me what I'm looking for.
I also do not want user root to update the /etc/hosts file on the machine, because of security and collision concerns with multiple programs editing /etc/hosts simultaneously.
I'm thinking of doing proxy_pass http://live-backend.localhost in nginx and using a local DNS server to manage the internal IP for live-backend-localhost which I can change (re-point to another backend IP) at any time. However, would nginx actually query the DNS server on every request, or does it resolve once then cache the IP forever?
Am I over-thinking this? Is there an easy way to do this within nginx?
You can use the backup parameter to the server directive so that the test server will only be used when the live one is down.
NGINX queries DNS on startup and caches it, so you'd still have to reload it to update.

Resources