KubernetesPodOperator - crashing pods when scaling down - airflow

I ran into this issue the other day and I'm not sure if this is the correct cause. Essentially, I am spinning up 10 KubernetesPodOperators in parallel in Airflow. When I request the 10 pods, the nodes autoscale to meet the resource requirements of those 10 pods. However, once, say, 8 of the 10 pods have completed their task, the autoscaler scales the nodes back down, which seems to crash my 2 remaining running pods (I assume because they are being placed onto a new node). When I set autoscaling to "off" in Kubernetes and predefine the correct number of nodes, my 10 pods run fine. Does this logic make sense? Has anyone faced a similar issue, and if so, is there any way around it? We are running Airflow on an Azure AKS instance.
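One workaround I'm considering (untested) is telling the cluster autoscaler not to evict these pods while they are running. KubernetesPodOperator accepts an annotations dict, so roughly something like the sketch below; the task name and image are just placeholders, and the import path varies by Airflow/provider version:

```python
# Import path depends on your Airflow / cncf.kubernetes provider version.
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

# Sketch only: the annotation asks the cluster autoscaler not to evict this pod
# when it scales a node down, so long-running tasks are not killed mid-run.
task = KubernetesPodOperator(
    task_id="my_long_running_task",                 # placeholder task id
    name="my-long-running-task",
    namespace="default",
    image="myregistry.azurecr.io/my-job:latest",    # placeholder image
    annotations={
        "cluster-autoscaler.kubernetes.io/safe-to-evict": "false",
    },
    get_logs=True,
)
```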
Thanks,

Related

ECS Cluster died, no logs, no alarms

We're running a platform made up of 5 clusters. One of the clusters died. We're using Kibana because it's cheaper than CloudWatch (log router with Fluent Bit). The 14-hour span during which the cluster was dead shows 0 logs in Kibana, and we have no idea what happened to the cluster. A simple restart of the cluster fixed our issue. So, to make sure it doesn't die again while we're away, we need to set it up so it automatically restarts. Dev did not implement a cluster health check. We're using Kibana, so I can't use CloudWatch to implement metrics, alarms and actions. What do I do here? How do I make the cluster restart itself when Kibana detects no incoming logs from it? Thank you.

Airflow EKS and Fargate

Have some basic questions about setting up Airflow on EKS using Fargate. What I have understood so far is that the control plane will still be managed by AWS, while the workers will run on Fargate.
Question: what I am unclear about is, when setting up the webserver/scheduler etc. on Fargate, do I need to specify the amount of vCPU and memory anywhere?
More importantly, do any changes need to be made to how DAGs are written so that they can execute on the individual pods? Also, do the tasks in the DAGs specify how much vCPU and memory each task will use?
Sorry, just entering the Fargate/EKS/Airflow world.
Thanks
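For concreteness, this is the kind of per-task sizing I have in mind; a rough sketch assuming the Kubernetes executor's pod_override (the numbers and the task name are made up, and I believe Fargate derives the pod size from these requests, rounded up to the nearest supported configuration):

```python
from airflow.decorators import task
from kubernetes.client import models as k8s

# Sketch: with the Kubernetes executor, a task can request CPU/memory via
# executor_config; the scheduler then launches the task pod with these resources.
@task(
    executor_config={
        "pod_override": k8s.V1Pod(
            spec=k8s.V1PodSpec(
                containers=[
                    k8s.V1Container(
                        name="base",  # "base" overrides the task's main container
                        resources=k8s.V1ResourceRequirements(
                            requests={"cpu": "500m", "memory": "1Gi"},
                            limits={"cpu": "1", "memory": "2Gi"},
                        ),
                    )
                ]
            )
        )
    }
)
def heavy_transform():  # hypothetical task name
    ...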

How to set up an Airflow > 2.0 high availability cluster on CentOS 7 or above

I want to set up HA for Airflow (2.3.1) on CentOS 7. Message queue: RabbitMQ; metadata DB: Postgres. Does anybody know how to set this up?
Your question is very broad, because high availability has multiple levels and definitions:
Airflow availability: multiple schedulers, multiple workers, autoscaling to avoid pressure, highly available storage volumes, ...
The databases: an HA cluster for RabbitMQ and an HA cluster for Postgres
Even if you have the first two levels, how many nodes do you want to use? You cannot put everything on the same node; you need to run one service replica per node.
Suppose you did that, and you now have 3 different nodes running in the same data center: what if there is a fire in the data center? So you need to use nodes in different regions.
After doing all of the above, is there still a risk of network problems? Of course there is.
If you just want to run Airflow in HA mode, you have multiple options to do that on any OS:
Docker Compose: usually used for development, but you can use it for production too; you can create multiple scheduler instances with multiple workers, which helps improve the availability of your service.
Docker Swarm: similar to Docker Compose with additional features (scaling, multiple nodes, ...); you will not find many resources on installing Airflow this way, but you can reuse the Compose files with a few changes.
Kubernetes: the best solution; K8s helps ensure the availability of your services, and the install is easy with Helm.
Or just running the different services on your host: not recommended, because of the manual work involved, and applying HA is complicated.
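Whichever option you pick, it helps to verify that at least one scheduler is actually alive. The Airflow webserver exposes a /health endpoint that reports the metadata database status and the latest scheduler heartbeat; a minimal check (the hostname is hypothetical):

```python
import requests

# Sketch: poll the Airflow webserver health endpoint to confirm the metadata
# database is reachable and a scheduler has recently heartbeated.
resp = requests.get("http://airflow-webserver.example.com:8080/health", timeout=10)
health = resp.json()

print(health["metadatabase"]["status"])             # e.g. "healthy"
print(health["scheduler"]["status"])                # e.g. "healthy"
print(health["scheduler"]["latest_scheduler_heartbeat"])
```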

Kubernetes Nginx many small pods vs one pod per node

I am running nginx on a Kubernetes cluster with 3 nodes.
I am wondering if there is any benefit to having, for example, 4 pods and limiting their CPU/memory to roughly 1/4 of a node's capacity, versus running 1 pod per node with CPU/memory limits that allow the pod to use the resources of the whole node (for the sake of simplicity, we leave Kubernetes services out of the equation).
My feeling is that fewer pods mean less overhead, so going with 1 pod per node should give the best performance?
Thanks in advance
With more than 1 pod, you have a certain degree of high availability. Your pod will die at some point, and if it is behind a controller (which it must be), it will be re-created, but you will have a small downtime.
Now, take into consideration that if you deploy more than one replica of your app, even though you give each one 1/n of the resources, there is a base image and there are dependencies that are going to be replicated.
As an example, let's imagine an app that runs on Ubuntu, and has 5 dependencies:
If you run 1 replica of this app, you are deploying 1 Ubuntu + 5 dependencies + the app itself.
If you run 4 replicas of this app, you are running 4 Ubuntus + 4*5 dependencies + 4 times the app.
My point is, if your base image were big and you needed heavy dependencies, the increase in resources would not be linear.
Performance-wise, I don't think there is much difference. One of your nodes will be heavily loaded, as all your requests will end up there, but if your nodes can handle it, there should be no problem.
What you are referring to is the difference between horizontal and vertical scaling. With vertical scaling, you would increase the resources of your application as you see fit. Otherwise, you would scale horizontally by increasing the number of replicas of your application.
Doing one or the other depends on features that your application may or may not have. In the case of nginx, scaling horizontally would split traffic per pod and also per node, which would most likely result in better throughput for your reverse proxy.
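To make the trade-off concrete, here is a rough sketch of the 4-small-pods variant using the Kubernetes Python client; the quarter-node sizes are illustrative only (assuming nodes with roughly 4 CPUs and 4 GiB of memory):

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

# Sketch: 4 nginx replicas, each limited to roughly a quarter of a node,
# spread by the scheduler across the 3 nodes (sizes are illustrative).
container = client.V1Container(
    name="nginx",
    image="nginx:1.25",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "900m", "memory": "900Mi"},
        limits={"cpu": "1", "memory": "1Gi"},
    ),
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="nginx"),
    spec=client.V1DeploymentSpec(
        replicas=4,
        selector=client.V1LabelSelector(match_labels={"app": "nginx"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nginx"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```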

minimise disruption on weave network upgrade on kubernetes

I would like to upgrade my Weave network from version 2.5.0 to 2.5.2. I understand that it's "as simple" as updating the weave DaemonSet... however, I was wondering if there is a way this can be done with minimal disruption to running pods on the system.
A simple example in my mind would be to:
cordon node1
drain node1 of all pods
update weave on node1
uncordon node1
... then rinse and repeat for each k8s node until all done.
Based on the Weave Net documentation:
Upgrading the Daemon Sets
The DaemonSet definition specifies Rolling Updates, so when you apply a new version Kubernetes will automatically restart the Weave Net pods one by one.
With RollingUpdate update strategy, after you update a DaemonSet template, old DaemonSet pods will be killed, and new DaemonSet pods will be created automatically, in a controlled fashion.
As I read in another Stack Overflow answer:
It is possible to perform rolling updates with no downtime using a DaemonSet as of today! What you need is to have at least 2 nodes running in your cluster and to set maxUnavailable to 1 in your DaemonSet configuration.
Assuming the previous configuration, when an update is pushed, the first node will start updating. The second will wait until the first completes. Upon success, the second does the same.
The major drawback is that you need to keep 2 nodes running continuously, or take action to spawn/kill a node before/after an update.
So I think the best option for upgrading your CNI plugin is to use the DaemonSet rolling update and set maxUnavailable to 1 in your DaemonSet configuration.
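If you want to make sure that strategy is in place before applying the new Weave version, a small sketch with the Kubernetes Python client (assuming the usual weave-net DaemonSet in kube-system):

```python
from kubernetes import client, config

config.load_kube_config()

# Sketch: force the DaemonSet to roll one node at a time, so only a single
# node's weave pod is being replaced at any moment during the upgrade.
patch = {
    "spec": {
        "updateStrategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxUnavailable": 1},
        }
    }
}

client.AppsV1Api().patch_namespaced_daemon_set(
    name="weave-net", namespace="kube-system", body=patch
)
```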
