Why does the Apache NiFi process consume about 30% CPU?

I have only one processor, a "GetFile", running in my NiFi flow, with no messages being handled; there are also no errors in nifi-app.log.
However, the NiFi Java process consumes about 30% of the system CPU. Does anyone know why, and how to tune the performance?

The default settings for that processor are overly aggressive and lead to high CPU usage: GetFile's Polling Interval defaults to 0 sec, so the processor re-scans the input directory as fast as it can, even when there is nothing to pick up. Set Polling Interval to, say, '1 sec' and that should help.
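To see why a zero interval is expensive, here is an illustrative Java sketch (not NiFi's actual code; the directory path is made up): polling in a tight loop pegs a core even when the directory is empty, while sleeping between scans drops usage to near zero.

```java
import java.io.File;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only -- not NiFi's implementation.
public class PollingSketch {
    public static void main(String[] args) throws InterruptedException {
        File inputDir = new File("/tmp/input"); // hypothetical input directory
        long pollingIntervalMs = 1_000;         // "1 sec", as suggested above

        while (true) {
            File[] files = inputDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    System.out.println("would ingest: " + f.getName());
                }
            }
            // With an interval of 0 this loop spins at full speed and
            // burns a core; a 1 s pause makes the poll essentially free.
            TimeUnit.MILLISECONDS.sleep(pollingIntervalMs);
        }
    }
}
```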

Related

Limit CPU & Memory for *nix Process

Is it possible to limit CPU and memory for a *nix process?
The CPU limit might look like "use no more than 10% of one core".
The memory limit might look like "use no more than 100 MB"; the OS may enforce the limit or kill the process if it tries to exceed it, both ways are fine.
Any *nix that can do this would be fine.
It seems possible to implement this with virtual machines, but that is not acceptable because the overhead is too high.
If you happen to use Solaris, the ability to limit resource usage is a native feature.
Memory (RAM) usage can be capped per process with the rcap.max-rss setting (enforced by the rcapd daemon), while CPU usage can be limited per project with the project.cpu-cap resource control; for example, something like `projmod -s -K "rcap.max-rss=100M" myproject` and `projmod -s -K "project.cpu-cap=(privileged,10,deny)" myproject`, where a cpu-cap of 10 means 10% of one CPU.
Note that Solaris also offers OS-level virtualization (a.k.a. zones), which has no significant overhead, if any, compared to a bare-metal OS instance.
Resource capping is part of Solaris zones configuration.
Try cpulimit
cpulimit is a simple program which attempts to limit the CPU usage of a process (expressed as a percentage, not in CPU time). This is useful to control batch jobs when you don't want them to eat too much CPU. It does not act on the nice value or other scheduling priority settings, but on the real CPU usage. It is also able to adapt itself to the overall system load, dynamically and quickly. For example, `cpulimit -l 10 -p 1234` holds the process with PID 1234 to roughly 10% of one core; note that cpulimit handles only CPU, not memory.

Is there any way to limit resources (CPU, RAM usage) for virtual directories on IIS or Apache?

We are using ASP.NET, but sometimes our applications use a great deal of CPU or RAM. I want to restrict resources for every virtual directory/web site and get an alert when they reach the alert level.
I don't know whether I can do this for IIS, but I wonder: is it possible for other web servers like Apache?
In Java, you cannot have the Java Virtual Machine enforce CPU time limits for particular threads: the best you can do is set the thread priority for any threads that you create. If there is no other work to be done, a thread will burn as many CPU cycles as it can get unless it is blocking on something (I/O, a synchronization monitor, etc.).
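A minimal sketch of that one lever; the busy-work loop is just an illustration. Note that priority is a hint to the OS scheduler, not a CPU-time cap:

```java
public class LowPriorityWorker {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            long x = 0;
            for (long i = 0; i < 1_000_000_000L; i++) x += i; // busy work
            System.out.println(x);
        });
        // Lowest hint we can give: other runnable threads are scheduled first,
        // but if the CPU is otherwise idle this thread still uses a full core.
        worker.setPriority(Thread.MIN_PRIORITY);
        worker.start();
    }
}
```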
Using JMX, you can get some information about threads to possibly detect some runaway-thread situation, but you can't directly control the amount of CPU Time allowed for the thread.
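A sketch of that JMX approach using the standard ThreadMXBean; the 10-second threshold and the "runaway" heuristic are assumptions for illustration, not a prescribed recipe:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Report threads by accumulated CPU time; detection only, no enforcement.
public class RunawayThreadMonitor {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            System.err.println("thread CPU time not supported on this JVM");
            return;
        }
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if unavailable
            if (cpuNanos > 10_000_000_000L) { // > 10 s of CPU: suspicious here
                System.out.printf("thread %d has used %.1f s of CPU%n",
                        id, cpuNanos / 1e9);
            }
        }
    }
}
```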
If you are willing to re-architect your dynamic content and other services, you could write them in such a way that they support a "unit of work" that is somehow less than a full request's worth of work, and then have your request-processing thread execute one unit of work and sleep an appropriate amount of time, as sketched below. You can lessen your CPU usage by doing this, but you will also certainly slow down response times measurably.
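A minimal sketch of that unit-of-work throttling idea; the interface, the slice count, and the 90 ms pause are all illustrative assumptions:

```java
import java.util.concurrent.TimeUnit;

// Split a long request into small slices and sleep between slices to
// cap the thread's effective CPU share, at the cost of added latency.
public class ThrottledWorker {
    interface UnitOfWork {
        boolean doNextUnit(); // returns false once the request is finished
    }

    static void runThrottled(UnitOfWork work, long pauseMs)
            throws InterruptedException {
        while (work.doNextUnit()) {
            // e.g. ~10 ms of work plus a 90 ms pause is roughly 10% of a core
            TimeUnit.MILLISECONDS.sleep(pauseMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] slicesLeft = {10}; // pretend the request takes 10 units of work
        runThrottled(() -> --slicesLeft[0] > 0, 90);
        System.out.println("request finished, throttled");
    }
}
```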
Is there anything wrong with a fully-utilized CPU if your users are happy? Perhaps the real solution to your problem is more or bigger hardware.

Does high CPU usage indicate that a software module is designed wrong?

We have a process that is computationally intensive. When it runs, it typically uses 99% of the available CPU. It is configured to take advantage of all available processors, so I believe this is OK. However, one of our customers is complaining because alarms go off on the server on which this process runs, due to the high CPU utilization. I think that there is nothing wrong with high CPU utilization per se. The CPU drops back to normal when the process stops running, and the process does run to completion (no infinite loops, etc.). I'm just wondering if I am on solid ground when I say that high CPU usage is not a problem per se.
Thank you,
Elliott
if I am on solid ground when I say that high CPU usage is not a problem per se
You are on solid ground.
We have a process that is computationally intensive
Then I'd expect high CPU usage.
The CPU drops back to normal when the process stops running
Sounds good so far.
Chances are that the systems your client is using are configured to raise an alert when CPU usage goes over a certain threshold, since this is sometimes indicative of a problem (and sustained high usage can cause overheating and associated problems).
If this is expected behavior, your client needs to adjust their monitoring - but you also need to verify that the behavior is as expected on their systems and is not likely to cause problems (i.e., that the high CPU usage is not sustained indefinitely).
An alarm by itself is not evidence of poor design. The real concern would be if the process chokes other tasks on the system. Modern OSes usually take care of this by lowering the dynamic priority of a CPU-hungry process so that less demanding processes get higher priority. You may tell the customer to "nice" the process to start with (e.g. `nice -n 10 myprocess` on *nix), since you probably don't care whether it runs in 10 minutes or 12. Just my 2 cents :)

"System Idle Process" eats CPU on a high threading application

I have a multi-threaded web application with about 1000~2000 threads in the production environment.
I expected the CPU usage to show up under w3wp.exe, but the System Idle Process is eating the CPU instead. Why?
The Idle process isn't actually a real process; it doesn't "eat" your CPU time. The %CPU you see next to it is actually unused CPU (more or less).
The reason for the poor performance of your application is most likely your 2000 threads. Windows (or indeed any operating system) was never meant to run that many threads at once. You're wasting most of the time just context-switching between them, with each thread getting a couple of milliseconds of processing time every ~30 seconds (a 15 ms time slice × 2000 threads = 30 seconds!).
Rethink your application.
The idle process simply holds processor time until a program needs it; it's not actually eating any cycles at all. You can think of system idle time as 'available CPU'.
System Idle Process is not a real process; it represents unused processor time.
This means that your app doesn't utilize the processor completely - it may be memory-bound or I/O-bound; possibly the threads are waiting for each other, or for external resources. Context-switching overhead could also be a culprit - unless you have 2000 cores, the threads are not actually all running at the same time but are assigned time slices by the task scheduler, which also takes time.
You have not provided a lot of details, so I can only speculate at this point. I would say it is likely that most of those threads are doing nothing. The ones that are doing something are probably I/O-bound, meaning that they spend most of their time waiting for an external resource to respond.
Now let's talk about the "1000~2000 threads". There are very few cases (maybe none) where having that many threads is a good idea, and I think your current issue is a perfect example of why. Most of those threads are (apparently, anyway) doing nothing but wasting resources. If you want to process multiple tasks in parallel, especially if they are I/O-bound, it is better to take advantage of pooled resources like the ThreadPool or the Task Parallel Library - see the sketch below.
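A sketch of that pooled approach, in Java for consistency with the rest of this page (on .NET, the ThreadPool or the Task Parallel Library plays the same role); the pool size and task count are illustrative assumptions:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Pooled alternative to thousands of dedicated threads: a small, fixed
// pool sized near the core count processes a large queue of tasks,
// avoiding the context-switching overhead described above.
public class PooledTasks {
    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        for (int i = 0; i < 2_000; i++) {      // 2,000 tasks, not 2,000 threads
            final int taskId = i;
            pool.submit(() -> System.out.println("task " + taskId));
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```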

Should non-blocking code push the CPU to 100%?

I've been thinking today about NodeJS and its attitude towards blocking, and it got me wondering: if a block of code is purely non-blocking - say, it computes some really long algorithm with all its variables already on the stack - should it push a single non-hyperthreaded core to 100% (as Windows Task Manager defines it) as it aims to complete the task as quickly as possible? Say this is a calculation that can take minutes.
Yes, it should. The algorithm should run as fast as it can. It's the operating system's job to schedule time to other processes if necessary.
If your non-blocking computation intensive code doesn't use 100% of the CPU then you are wasting cycles in the idle task. It always irritates me to see the idle task using 99% of the CPU.
As long as the CPU is "given" to other processes when some need it for their own calculations, I suppose it's OK: why not use the CPU if it's available and there is work to do?
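To make this concrete, a minimal Java sketch (the loop bound is arbitrary): a pure computation with nothing to block on keeps one core pegged until it finishes, which is exactly the expected behaviour the answers above describe.

```java
// No IO, no locks, everything in registers and on the stack:
// expect ~100% of one core until the loop completes.
public class BusyCore {
    public static void main(String[] args) {
        long sum = 0;
        for (long i = 0; i < 5_000_000_000L; i++) {
            sum += i * 31; // nothing here can block
        }
        System.out.println(sum); // watch one core sit at 100% meanwhile
    }
}
```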
One caveat: since RAM can be paged out to disk, all applications are potentially blocking. This happens if the algorithm uses more RAM than is available on the system; the process then stalls on page faults and, as a result, won't hit 100%.
