We are running a Next.js server as a service on a Kubernetes cluster, with a minimum of two replicas. So in a normal situation, we have these:
our-nextjs-server-prod-bd7c6dc4c-2dlqg 1/1 Running 0 18h
our-nextjs-server-prod-bd7c6dc4c-7dkbp 1/1 Running 0 18h
When the first server is hit for a page it hasn't cached yet, it will generate it, store it on the host node's volume, and serve it from there on subsequent calls. Now, if the second server is hit for the same page but is hosted on a different node, it will have to, as I understand it, re-generate the page, since it doesn't exist on its own node's volume.
Is there a way to have multiple Next.js pods on different nodes use a common resource to cache pages? A shared volume, or an external resource like Redis, perhaps? Is there a best practice around this requirement?
For a moment, let's disregard the CDN in front of the Next.js service that caches the results for a certain TTL. We need those Next.js pods to be hit frequently so that they can ping the application server for changed properties that will trigger a rebuild of the page.
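One possible direction, sketched purely for illustration: mount a ReadWriteMany volume at the cache path in every replica, so pods on different nodes read and write the same cached pages (a custom cache handler backed by an external store such as Redis is the other common route). The StorageClass name nfs-rwx, the image, and the /app/.next/cache mount path below are assumptions, not details from this setup, so treat every name and value as a placeholder.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nextjs-shared-cache
spec:
  accessModes:
    - ReadWriteMany            # so pods on different nodes can mount it at the same time
  storageClassName: nfs-rwx    # placeholder for any RWX-capable class (NFS, CephFS, EFS, Filestore, ...)
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: our-nextjs-server-prod
spec:
  replicas: 2
  selector:
    matchLabels:
      app: our-nextjs-server-prod
  template:
    metadata:
      labels:
        app: our-nextjs-server-prod
    spec:
      containers:
        - name: nextjs
          image: registry.example.com/our-nextjs-server:latest   # placeholder image
          volumeMounts:
            - name: shared-cache
              mountPath: /app/.next/cache   # adjust to wherever your Next.js build writes its cache
      volumes:
        - name: shared-cache
          persistentVolumeClaim:
            claimName: nextjs-shared-cache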
Related
I'm planning to build a website to host static files. Users will upload their files, and I will create a bunch of Deployments with nginx images for them on a Kubernetes node. My main goal is that, at some point, users will deploy their apps to a subdomain like my-blog-app.mysite.com. Later on, users will be able to use custom domains.
I understand that when I deploy an nginx image in a pod, I have to create a Service to expose port 80 (or 443) to the internet via a load balancer.
I also read about Ingress; it looks like what I need, but I don't think I fully understand the concept.
My question is: if, for example, I have 500 nginx pods running (each one a different website), do I need a Service for every pod on that node (in this case, 500 Services)?
You are looking for https://kubernetes.io/docs/concepts/services-networking/ingress/#name-based-virtual-hosting.
With this type of Ingress, you route traffic to the different nginx instances based on the Host header, which matches your use case perfectly.
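A minimal sketch of such an Ingress, assuming two of the sites are already fronted by ClusterIP Services named blog-a-svc and blog-b-svc on port 80 (the hostnames and Service names here are hypothetical):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: user-sites
spec:
  rules:
    - host: my-blog-app.mysite.com        # matched against the Host header
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: blog-a-svc           # Service in front of that site's nginx pods
                port:
                  number: 80
    - host: another-app.mysite.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: blog-b-svc
                port:
                  number: 80

With this pattern, only the Ingress controller needs to be exposed externally; the per-site Services can stay as plain ClusterIP, so 500 sites do not mean 500 load balancers.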
In any case, yes: with your current architecture you need a Service for each pod. Have you considered a different approach, though? For example, a general listener (a pool of nginx instances) that serves the correct content based on authorization or something similar.
I host my Angular 6 and Laravel applications on an AWS EC2 instance. The Angular container is running on (or mapped to) ports 80 and 443, served by Nginx, while the Laravel application is running in another container on (or mapped to) port 8000, also served by Nginx.
I configured the Angular app to run at https://example.com and the Laravel app at https://api.example.com.
To be clear, the containers are tasks in separate services in the same ECS cluster (set up via CloudFormation), and there is no load balancer.
The setup works perfectly for about 97% of customers, but the remaining customers cannot get content on the site. I worked with one of those customers and realized that the Angular app (at https://example.com) loads successfully, but https://api.example.com:8000 cannot be reached.
What on earth can cause this?
Is there a way (maybe a tool) I can use to simulate different kinds of network requests, so that I can reproduce the problematic network conditions of the customers who cannot access the site and trace and debug the issue? Right now I am not experiencing the problem myself, which makes it very tricky to solve.
Consider the following cluster running on Google Container Engine:
tier: g1-small
cpu cores: 2
memory: 1,7GB per cpu core (3,4GB total in this case)
autoscaling: enabled, min=2, max=5
On this cluster, I have the following Deployments running via Kubernetes:
Load Balancer using NGINX
Web App using Node.js (example.com, communicating with WordPress via REST calls)
CMS using WordPress on Apache (wp.example.com)
For clarity, every request goes through the NGINX Load Balancer first, then, depending on the subdomain, to either the CMS or the Web App.
I'd like to have more control over how many resources each Deployment consumes, and to use resources more efficiently, by applying Kubernetes Limit Ranges to my Pods'/containers' resources.
Limits can be set on CPU and Memory. These are well explained in the docs. So far, so good.
The problem I'm having is to figure out what limits to apply per Deployment.
Example
The WordPress Deployment contains two containers in its deployment.yaml file: one for the WordPress image itself, and one for the Cloud SQL Proxy container that WordPress needs to connect to my Cloud SQL database. How would I know what each container needs in terms of CPU/memory resources?
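Purely to illustrate where those knobs live, the deployment.yaml for that pod could carry per-container requests and limits like the sketch below; the image tags and numbers are placeholders to be tuned against real measurements, not recommendations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: wordpress:6-apache                 # placeholder tag
          resources:
            requests:          # what the scheduler reserves for the container
              cpu: 250m
              memory: 256Mi
            limits:            # hard ceiling: CPU is throttled, memory overuse is OOM-killed
              cpu: 500m
              memory: 512Mi
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.8.0   # placeholder tag
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi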
Furthermore, considering that all HTTP/HTTPS traffic hits my NGINX load balancer first, an educated guess would be to apply relatively more resources to the NGINX Deployment than to my CMS and Web App Deployments, right?
So is there a way to better estimate how many resources each Deployment would need?
Any help is greatly appreciated!
Kubernetes' default for pods is a 100m CPU request with no CPU limit, and no memory request/limit. If you don't set limits, pods/containers will consume as much as they need, which is pretty convenient since you usually don't have to specify limits one by one.
Nginx as a load balancer is pretty lightweight, so it's hard to say up front which Deployment needs more resources. I would start with the defaults and then use kubectl top pod to check CPU/memory pressure as a reference for tuning.
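For reference, the place to change those namespace-wide defaults (once kubectl top pod gives you real numbers) is a LimitRange; the values below are placeholders, not recommendations:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources
  namespace: default           # whichever namespace the Deployments live in
spec:
  limits:
    - type: Container
      defaultRequest:          # applied when a container sets no request
        cpu: 100m
        memory: 128Mi
      default:                 # applied when a container sets no limit
        cpu: 500m
        memory: 512Mi

Requests and limits set explicitly on a container still override these defaults.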
I have a Graphite relay and webapp installed on one server, which is supposed to be communicating with 4 carbon caches (and their respective webapps) on 4 other servers. I've validated that the relay is working by observing that different whisper files are being updated on the different carbon-cache servers.
However, the webapp is only showing metrics that are stored on the first carbon-cache server in the list, and I'm not sure what else to look at.
The webapps on the carbon-cache servers are set up to listen on port 81, and I have the following in local_settings.py on the relay server (the one I'm pointing my browser at):
CLUSTER_SERVERS = ["graphite-storage1.mydomain.com:81", "graphite-storage2.mydomain.com:81", "graphite-storage3.mydomain.com:81", "graphite-storage4.mydomain.com:81", ]
However, at one point I did have all metrics on all servers: I migrated from a single instance to this federated cluster, and I've since removed the whisper files that weren't active on each carbon-cache server. I've restarted all carbon caches, the carbon relay, and the webapp server several times. Is the metrics-to-carbon-cache mapping getting cached somewhere? Have I missed a setting?
I've found out that Instagram shares its technology implementation with other developers through its blog. They have some great solutions for the problems they run into. One of those solutions is an Elastic Load Balancer on Amazon with 3 nginx instances behind it. What is the task of those nginx servers? And what is the task of the Elastic Load Balancer, and what is the relation between them?
Disclaimer: I am no expert on this in any way and am in the process of learning about the AWS ecosystem myself.
The ELB (Elastic Load Balancer) has no functionality on its own except receiving requests and routing them to the right server. The servers can run Nginx, IIS, Apache, lighttpd, you name it.
I will give you a real use case.
I had one Nginx server running a single WordPress blog. This server was, like I said, powered by Nginx serving static content and "upstreaming" .php requests to php-fpm running on the same server. Everything was going fine until one day the blog was featured on a TV show. I had a ton of users, and the server could not keep up with that much traffic.
My first reaction was to just use the AMI (Amazon Machine Image) to spin up a copy of my server on a more powerful instance type. The problem was that I knew traffic would keep increasing over the next couple of days, so soon I would have to spin up an even more powerful machine, which would mean more downtime and trouble.
Instead, I launched an ELB (Elastic Load Balancer) and updated my DNS to point website traffic to the ELB instead of directly to the server. The user doesn't know the server's IP or anything; they only see the ELB, and everything else goes on inside Amazon's cloud.
The ELB decides which server the traffic goes to. You can have the ELB with only one server behind it (if your traffic is low at the moment), or hundreds. Servers can be created and added to the server group at any time, or you can configure Auto Scaling to spawn new servers and add them to the ELB server group automatically.
Amazon CloudWatch (another product and an important part of the AWS ecosystem) is always watching your servers' health. It knows when the servers are becoming too loaded, and it is the agent that gives the order (through Auto Scaling) to spawn another server using your AMI. When the servers are no longer under heavy load, they are automatically terminated (or stopped, I don't recall).
This way I was able to serve all users at all times. When the load was light, I would have the ELB and only one Nginx server; when the load was high, I would let it decide how many servers I needed (according to server load). Minimal downtime. Of course, you can set limits on how many servers you can afford at the same time, and so on, so you don't get billed more than you can pay.
You see, the Instagram guys said the following: "we used to run 2 Nginx machines and DNS Round-Robin between them". This is inefficient, IMO, compared to an ELB. DNS round robin is DNS routing each request to a different server: the first goes to server one, the second to server two, and so on.
The ELB actually watches the servers' health (CPU usage, network usage) and decides which server the traffic goes to based on that. Do you see the difference?
And they say: "The downside of this approach is the time it takes for DNS to update in case one of the machines needs to get decommissioned."
DNS round robin is a form of load balancing, but if one server goes kaput and you need to update DNS to remove it from the server group, you will have downtime (DNS takes time to propagate to the whole world), and some users will still get routed to the bad server. With an ELB this is automatic: if a server is in bad health, it does not receive any more traffic, unless of course the whole group of servers is in bad health and you do not have any kind of auto-scaling set up.
And now the guys at Instagram: "Recently, we moved to using Amazon’s Elastic Load Balancer, with 3 NGINX instances behind it that can be swapped in and out (and are automatically taken out of rotation if they fail a health check)."
The scenario I illustrated is fictional. Reality is actually more complex than that, but nothing that cannot be solved. For instance, if users upload pictures to your application, how can you keep consistency between all the machines in the server group? You would need to store the images on an external service like Amazon S3. In another post on Instagram engineering they say: "The photos themselves go straight to Amazon S3, which currently stores several terabytes of photo data for us." If they have 3 Nginx servers behind the load balancer and all servers serve HTML pages whose image links point to S3, there is no problem. If the images were stored locally on each instance, there would be no way to do it.
All servers behind the ELB would also need an external database. For that, Amazon has RDS: all machines can point to the same database, and data consistency is guaranteed.
You can also add an RDS "read replica"; that is RDS's way of load balancing. I don't know much about that at this time, sorry.
Try and read this: http://awsadvent.tumblr.com/post/38043683444/using-elb-and-auto-scaling
Can you please point out the blog entry?
Load balancers balance load. They monitor the web servers' health (response time, etc.) and distribute the load between the web servers. In more complex implementations, it is possible to have new servers spawn automatically if there is a traffic spike. Of course, you need to make sure there is consistency between the servers; they can share the same database, for instance.
So I believe the load balancer gets hit first and decides which server it will route the traffic to, according to server health.
.
Nginx is a web server that is extremely good at serving a lot of static content to simultaneous users.
Requests for dynamic pages can be offloaded to a different server using CGI, or the same servers that run Nginx can also run php-fpm.
.
There are a lot of possibilities. I am on my cell phone right now; tomorrow I can write a little more.
Best regards.
I am aware that I am late to the party, but I think the use of NGINX instances behind the ELB in the Instagram blog post is to provide a highly available load balancer, as described here.
The NGINX instances do not seem to be used as web servers in the blog post.
For that role they mention:
Next up comes the application servers that handle our requests. We run Django on Amazon High-CPU Extra-Large machines
So the ELB is used simply as a replacement for their older solution of DNS round-robin between NGINX instances, which did not provide high availability.