Using top to check the CPU and memory status of the K8s cluster host, I found that the CPU usage of nginx-ingress-controller is always high. If I set the CPU limit to 2, the CPU time is almost 200%. Without a CPU limit, it is almost 400% on a 4-core VM. The nginx ingress controller version is 0.8.3.
Is that normal?
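For reference, a rough sketch of how such a limit can be checked and applied (the namespace and deployment names below are assumptions; adjust to your setup):
kubectl -n ingress-nginx top pod    # per-pod CPU/memory, if heapster/metrics-server is running
kubectl -n ingress-nginx set resources deployment nginx-ingress-controller --limits=cpu=2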
Thanks
Liu Peng
Related
I used the Confluent for Kubernetes solution to run a Kafka cluster on my bare-metal server. Monitoring all the pods, I noticed that the controlcenter-0 pod gradually takes more and more RAM. Why this behavior?
Overnight it reached almost 6 GB.
What are the minimum hardware requirements for setting up an Apache Airflow cluster?
E.g. RAM, CPU, disk, etc. for the different types of nodes in the cluster.
I have had no issues using very small instances in pseudo-distributed mode (32 parallel workers; Postgres backend):
RAM: 4096 MB
CPU: 1000 MHz
vCPUs: 2
Disk: 40 GB
If you want distributed mode, you should be more than fine with that if you keep it homogeneous. Airflow shouldn't really do heavy lifting anyway; push the workload out to other things (Spark, EMR, BigQuery, etc.).
You will also have to run some kind of message queue, like RabbitMQ. I think Redis is supported too. However, this doesn't dramatically impact how you size.
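As a rough sketch, pointing Airflow at the Celery broker and backend can be done in airflow.cfg or via the equivalent environment variables (host names and credentials below are placeholders, and exact key names vary slightly between Airflow versions):
export AIRFLOW__CORE__EXECUTOR=CeleryExecutor
export AIRFLOW__CELERY__BROKER_URL=amqp://guest:guest@rabbitmq-host:5672//    # or redis://redis-host:6379/0
export AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@pg-host:5432/airflow
export AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@pg-host:5432/airflow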
We are running Airflow in AWS with the below config:
t2.small --> airflow scheduler and webserver
db.t2.small --> postgres for metastore
The parallelism parameter in airflow.cfg is set to 10, and there are around 10 users who access the Airflow UI.
All we do from Airflow is SSH to other instances and run the code from there.
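For reference, that parallelism setting lives under [core] in airflow.cfg and can also be overridden with an environment variable:
export AIRFLOW__CORE__PARALLELISM=10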
We are using OpenShift Platform 3.3.1 and, in order to reduce the number of sockets stuck in TIME_WAIT, are planning to set the following unsafe sysctl parameters (as classified by OSE) for the NGINX pod:
net.ipv4.tcp_fin_timeout=20 (default value is 60 - Decrease TIME_WAIT seconds)
net.ipv4.tcp_tw_recycle=1 (Enable fast recycling TIME-WAIT sockets)
Will setting these parameters in PROD have any impact on OpenShift? Or is there an alternative solution (we have also increased the number of NGINX pods)?
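For context, a rough sketch of how the upstream Kubernetes alpha sysctl mechanism of that era exposes these settings (the kubelet flag and annotation below come from upstream Kubernetes and may not be available in OpenShift 3.3.1, so verify against your release before touching PROD):
# on each node, whitelist the sysctls on the kubelet:
#   --experimental-allowed-unsafe-sysctls='net.ipv4.tcp_fin_timeout,net.ipv4.tcp_tw_recycle'
# then request them on the NGINX pod at creation time:
cat <<'EOF' | oc create -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  annotations:
    security.alpha.kubernetes.io/unsafe-sysctls: net.ipv4.tcp_fin_timeout=20,net.ipv4.tcp_tw_recycle=1
spec:
  containers:
  - name: nginx
    image: nginx
EOF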
I have OpenStack deployed across multiple servers. Each server has 2 CPUs, with 8 cores and 16 threads each. If I turn hyper-threading on, what is the maximum number of vCPUs I can use in my OpenStack deployment without overcommitting vCPUs for any VM?
Hyperthreading
I recommend against turning on hyperthreading when working with KVM in general; however, I am biased. When hyperthreading and KVM were both young, many issues cropped up around vCPUs and hyperthreading.
For clarity, hyperthreading simply creates a soft logical processor in the Linux kernel in an effort to make more efficient use of the CPU's processing queue.
Overcommitting, vCPUs and logical CPUs
A vCPU is a virtual CPU allocated to a virtual machine.
A logical CPU is a CPU logically allocated to your host system's Linux kernel.
As seen with hyperthreading, sometimes the logical CPUs outnumber the physical CPUs or cores on the host.
You are technically overcommitting the moment you have more vCPU cores than physical cores. Note how I said PHYSICAL cores, not logical CPUs. What Linux shows you in /proc/cpuinfo may not be an accurate reflection of the available physical cores, in part thanks to hyperthreading.
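Applied to the hardware in the question (2 sockets x 8 cores = 16 physical cores per server, 32 logical CPUs with hyperthreading on), that means staying at or below 16 vCPUs per server to avoid overcommitting by this definition. lscpu makes the distinction visible:
lscpu | grep -E '^(CPU\(s\)|Thread|Core|Socket)'
# "CPU(s)" counts logical CPUs; physical cores = Socket(s) x Core(s) per socket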
As KVM allocates vCPUs, they are not set with any sort of CPU affinity by default. What this means is that the vCPUs go to whichever logical processor in Linux seems most available at the time. If someone kicks off a make 'MAKE=make -j64' World sort of job, you might see some pretty significant utilization spin up and begin to fire-hose around whatever logical CPUs are most available at any given moment.
Now if you have an 8-physical-core box hosting 4 virtual machines with 2 vCPUs apiece, this is fine. But think about what happens with hyperthreading enabled... now you have 16 logical CPUs, but only 8 cores. What happens when you bring up 4 more virtual machines? You run the risk of having virtual machines directly impacting resource availability for their neighbors. This is technically overcommitting.
Don't overcommit if you don't have to.
Also consider the needs of the host. You might want to set CPU affinity on the host system when you perform CPU-intensive actions, treat that physical core as dedicated to the HOST, and subtract it from the maximum vCPU count available to VMs.
Learn how to set affinities with taskset:
ref: http://manpages.ubuntu.com/manpages/hardy/man1/taskset.1.html
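A minimal example (the script name and PID are made up):
taskset -c 0 ./host_backup.sh    # start a host task pinned to physical core 0
taskset -cp 0 12345              # or re-pin an already-running PID onto core 0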
Max vCPU per VM
As for CPU quotas, this is basically a function of your hypervisor, not of OpenStack. You'll want to handle that with a CFM and some careful planning.
For instance, Red Hat tunes their own KVM packages:
The maximum amount of virtual CPUs that is supported per guest varies
depending on which minor version of Red Hat Enterprise Linux 6 you are
using as a host machine. The release of 6.0 introduced a maximum of
64, while 6.3 introduced a maximum of 160. Currently with the release
of 6.7, a maximum of 240 virtual CPUs per guest is supported.
ref: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-Virtualization_Restrictions.html
Here is some info on tuning per-VM CPU / resource allocation:
ref: http://libvirt.org/formatdomain.html#elementsCPUAllocation
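As a sketch, the same per-VM knobs are reachable from virsh (the domain name guest1 is made up):
virsh setvcpus guest1 4 --config    # cap the guest at 4 vCPUs
virsh vcpupin guest1 0 2            # pin vCPU 0 to host logical CPU 2
virsh vcpupin guest1 1 3            # pin vCPU 1 to host logical CPU 3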
Should I take into consideration CPU utilization, network traffic, or HTTP response time checks? I've run some tests with Apache ab (from the same server, e.g. ab -k -n 500000 -c 100 http://192.XXX.XXX.XXX/) and monitored the load average. Even when the load was between 1.0 and 1.50 (one-core server), "time per request" (mean) was pretty solid at 140 ms for a simple dynamic page with one set/get Redis operation. Anyway, I'm confused, as the general advice is to launch a new instance once you pass the 70% CPU utilization threshold.
70% CPU utilization is a good rule of thumb for CPU-bound applications like nginx. CPU time is kind of like body temperature: it actually hides a lot of different things, but it is a good general indicator of health. Load average is a separate measure of how many processes are waiting to be scheduled. The reason the rule is 70% (or 80%) utilization is that, past this point, CPU-bound applications tend to suffer contention-induced latency and non-linear performance degradation.
You can test this yourself by plotting throughput and latency (median and 90th percentile) against CPU utilization on your setup. Finding the inflection point for your particular system is important for capacity planning.
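One rough way to collect those data points, reusing ab from the question plus sar from sysstat (adjust request counts and concurrency steps to your traffic):
for c in 10 50 100 200 400; do
  sar -u 1 30 > cpu_c${c}.log &                            # sample CPU while the run is in flight
  ab -k -n 50000 -c ${c} http://192.XXX.XXX.XXX/ > ab_c${c}.log
  wait
done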
A very good writeup of this phenomenon is given in Facebook's original paper on Dyno, their system for measuring throughput of PHP under load.
https://www.facebook.com/note.php?note_id=203367363919